
Interpreting compound nouns with kernel methods


This paper presents a classification-based approach to noun–noun compound interpretation within the statistical learning framework of kernel methods. In this framework, the primary modelling task is to define measures of similarity between data items, formalised as kernel functions. We consider the different sources of information that are useful for understanding compounds and proceed to define kernels that compute similarity between compounds in terms of these sources. In particular, these kernels implement intuitive notions of lexical and relational similarity and can be computed using distributional information extracted from text corpora. We report performance on classification experiments with three semantic relation inventories at different levels of granularity, demonstrating in each case that combining lexical and relational information sources is beneficial and gives better performance than either source taken alone. The data used in our experiments are taken from general English text, but our methods are also applicable to other domains and potentially to other languages where noun–noun compounding is frequent and productive.
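As an illustration of the idea of combining lexical and relational kernels, here is a minimal sketch in Python. The abstract does not say which kernel functions are used or how they are combined, so everything concrete here is an assumption: a plain linear kernel over normalised co-occurrence counts, an unweighted sum as the combination, invented toy counts, and hypothetical function names. It relies only on the standard fact that a sum of valid kernels is itself a valid kernel.

```python
def normalise(counts):
    """Turn raw co-occurrence counts into a probability distribution."""
    total = sum(counts.values())
    return {feature: c / total for feature, c in counts.items()}


def linear_kernel(p, q):
    """Dot product of two sparse vectors stored as dicts."""
    if len(p) > len(q):
        p, q = q, p
    return sum(value * q.get(feature, 0.0) for feature, value in p.items())


def lexical_kernel(c1, c2, word_vectors):
    """Lexical similarity: compare modifier with modifier, head with head."""
    (m1, h1), (m2, h2) = c1, c2
    return (linear_kernel(word_vectors[m1], word_vectors[m2])
            + linear_kernel(word_vectors[h1], word_vectors[h2]))


def relational_kernel(c1, c2, pair_vectors):
    """Relational similarity: compare contexts where both nouns co-occur."""
    return linear_kernel(pair_vectors[c1], pair_vectors[c2])


def combined_kernel(c1, c2, word_vectors, pair_vectors):
    """Assumed combination: an unweighted sum of the two kernels."""
    return (lexical_kernel(c1, c2, word_vectors)
            + relational_kernel(c1, c2, pair_vectors))


# Invented toy counts, for illustration only (not data from the paper).
word_vectors = {w: normalise(c) for w, c in {
    "kitchen": {"cook": 4, "room": 3, "house": 3},
    "bedroom": {"sleep": 5, "room": 3, "house": 2},
    "knife": {"cut": 6, "sharp": 3, "cook": 1},
    "lamp": {"light": 7, "switch": 2, "room": 1},
}.items()}

pair_vectors = {p: normalise(c) for p, c in {
    ("kitchen", "knife"): {"N2 used in N1": 3, "N2 from N1": 1},
    ("bedroom", "lamp"): {"N2 used in N1": 2, "N2 in N1": 2},
}.items()}

a = ("kitchen", "knife")
b = ("bedroom", "lamp")
print(lexical_kernel(a, b, word_vectors))
print(relational_kernel(a, b, pair_vectors))
print(combined_kernel(a, b, word_vectors, pair_vectors))
```

In a classification setting, a function like `combined_kernel` would be evaluated on every pair of training compounds to fill a kernel matrix for a kernel-based classifier such as a support vector machine. A weighted sum would be an equally valid choice of combination.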

Natural Language Engineering
  • ISSN: 1351-3249
  • EISSN: 1469-8110
  • URL: /core/journals/natural-language-engineering