Within-concept similarities in a taxonomy: a corpus linguistic approach

  • STIJN STORMS (a1), DIRK SPEELMAN (a1), DIRK GEERAERTS (a1) and GERT STORMS (a2)
Abstract

This paper examines a hitherto unexplored aspect of taxonomically organized concepts: the distribution of the corresponding words in corpora of actual language use. In parallel to the psychological informativeness claim of the differentiation explanation, the question addressed is whether concepts are internally more similar than their higher-ranked taxonomical relatives. Internal similarity is measured with token-based vector space models: for each occurrence of a concept in the corpus, a context vector is calculated, and these vectors serve as input for the internal similarity measure. Experiments are conducted on taxonomies taken from the Dutch counterparts of the English semantic domains animal and means of transportation. The results do not fully support a strict taxonomical order, but they give rise to a new behavioural measure of the basic level.
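
As an illustration only, the token-based idea can be sketched as follows. The abstract does not specify how context vectors are built or how internal similarity is computed, so everything here is an assumption: the bag-of-words window, the cosine measure, the averaging over all pairs of occurrences, and the hypothetical function names `context_vectors`, `cosine` and `internal_similarity`.

```python
from collections import Counter
from itertools import combinations
import math


def context_vectors(tokens, target, window=2):
    """Build one bag-of-words context vector per occurrence of `target`.

    Assumption: a symmetric window of `window` tokens on each side;
    the paper's actual context definition may differ.
    """
    vectors = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            vectors.append(Counter(left + right))
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def internal_similarity(vectors):
    """Mean pairwise cosine similarity over all token vectors.

    Assumption: averaging over pairs is one plausible way to summarize
    how similar the occurrences of a concept are to one another.
    """
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return float("nan")  # undefined for fewer than two occurrences
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)
```

Under these assumptions, a word whose occurrences always appear in the same contexts scores close to 1, while a word whose occurrences never share a context word scores 0. Comparing such scores across the levels of a taxonomy is the kind of comparison the abstract describes.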

Corresponding author
*Addresses for correspondence: Stijn Storms: stijn.storms@arts.kuleuven.be; Dirk Speelman: dirk.speelman@arts.kuleuven.be; Dirk Geeraerts: dirk.geeraerts@arts.kuleuven.be; Gert Storms: stijn.storms@telenet.be.

Language and Cognition
  • ISSN: 1866-9808
  • EISSN: 1866-9859
  • URL: /core/journals/language-and-cognition