Automatic discovery of word semantic relations using paraphrase alignment and distributional lexical semantics analysis

GAËL DIAS; RUMEN MORALIYSKI; JOÃO CORDEIRO; ANTOINE DOUCET; HELENA AHONEN-MYKA

doi:10.1017/S135132491000015X

Published online by Cambridge University Press: 11 October 2010

GAËL DIAS, RUMEN MORALIYSKI and JOÃO CORDEIRO: Affiliation:
Centre for HLT and Bioinformatics, Department of Computer Science, University of Beira Interior, 6201-001 - Covilhã, Portugal emails: ddg@di.ubi.pt, rumen@penhas.di.ubi.pt, jpaulo@di.ubi.pt
ANTOINE DOUCET: Affiliation:
Campus Côte de Nacre, Boulevard du Maréchal Juin, University of Caen, BP 5186 - 14032 - Caen CEDEX, France email: doucet@info.unicaen.fr
HELENA AHONEN-MYKA: Affiliation:
Department of Computer Science, University of Helsinki, P.O. Box 68 (Gustaf Hällströmin katu 2b), FI-00014, Helsinki, Finland email: helena.ahonen-myka@cs.helsinki.fi

Abstract

Thesauri, which list the most salient semantic relations between words, have mostly been compiled manually. The inclusion of an entry therefore depends on the subjective decision of the lexicographer, and as a consequence those resources are usually incomplete. In this paper, we propose an unsupervised methodology to automatically discover pairs of semantically related words by highlighting their local environment and evaluating their semantic similarity in local and global semantic spaces. This proposal differs from all other research presented so far in that it tries to take the best of two different methodologies, i.e. semantic space models and information extraction models. In particular, it can be applied to extract close semantic relations, it limits the search space to a few highly probable options, and it is unsupervised.

Type: Papers
Information: Natural Language Engineering, Volume 16, Special Issue 4: Distributional Lexical Semantics, October 2010, pp. 439–467

DOI: https://doi.org/10.1017/S135132491000015X
Copyright: © Cambridge University Press 2010
