Automatic discovery of word semantic relations using paraphrase alignment and distributional lexical semantics analysis

Published online by Cambridge University Press: 11 October 2010

GAËL DIAS
Affiliation:
Centre for HLT and Bioinformatics, Department of Computer Science, University of Beira Interior, 6201-001 - Covilhã, Portugal emails: ddg@di.ubi.pt, rumen@penhas.di.ubi.pt, jpaulo@di.ubi.pt
RUMEN MORALIYSKI
Affiliation:
Centre for HLT and Bioinformatics, Department of Computer Science, University of Beira Interior, 6201-001 - Covilhã, Portugal emails: ddg@di.ubi.pt, rumen@penhas.di.ubi.pt, jpaulo@di.ubi.pt
JOÃO CORDEIRO
Affiliation:
Centre for HLT and Bioinformatics, Department of Computer Science, University of Beira Interior, 6201-001 - Covilhã, Portugal emails: ddg@di.ubi.pt, rumen@penhas.di.ubi.pt, jpaulo@di.ubi.pt
ANTOINE DOUCET
Affiliation:
Campus Côte de Nacre, Boulevard du Maréchal Juin, University of Caen, BP 5186 - 14032 - Caen CEDEX, France email: doucet@info.unicaen.fr
HELENA AHONEN-MYKA
Affiliation:
Department of Computer Science, University of Helsinki, P.O. Box 68 (Gustaf Hällströmin katu 2b), FI-00014, Helsinki, Finland email: helena.ahonen-myka@cs.helsinki.fi

Abstract

Thesauri, which list the most salient semantic relations between words, have mostly been compiled manually. The inclusion of an entry therefore depends on the subjective decision of the lexicographer, and as a consequence these resources are usually incomplete. In this paper, we propose an unsupervised methodology to automatically discover pairs of semantically related words by highlighting their local environment and evaluating their semantic similarity in local and global semantic spaces. This proposal differs from the research presented so far in that it tries to take the best of two different methodologies, i.e. semantic space models and information extraction models. In particular, it can be applied to extract close semantic relations, it limits the search space to a few highly probable options, and it is unsupervised.
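As a rough illustration of the distributional idea mentioned in the abstract (words that occur in similar contexts tend to be semantically related), the following minimal Python sketch builds context-count vectors from a toy corpus and compares words by cosine similarity. It is not the authors' method: the corpus, the window size, and the function names `context_vectors` and `cosine` are assumptions made for this example only, and the paraphrase alignment step described in the title is not modelled.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]  # neighbours, excluding the word itself
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus, invented for illustration only (hypothetical data).
corpus = [
    "the company bought the firm last year",
    "the company acquired the firm last year",
    "the weather was cold last year",
]

vectors = context_vectors(corpus)
sim_related = cosine(vectors["bought"], vectors["acquired"])  # identical contexts
sim_unrelated = cosine(vectors["bought"], vectors["cold"])    # no shared context words
```

In this toy setting, "bought" and "acquired" occupy the same slot in two near-paraphrase sentences and so share all their context words, whereas "bought" and "cold" share none. A realistic system would need a much larger corpus and a weighting scheme rather than raw counts.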

Type: Papers
Copyright © Cambridge University Press 2010
