
CO-graph: A new graph-based technique for cross-lingual word sense disambiguation

  • ANDRES DUQUE, LOURDES ARAUJO and JUAN MARTINEZ-ROMO
Abstract

In this paper, we present a new method based on co-occurrence graphs for performing Cross-Lingual Word Sense Disambiguation (CLWSD). The proposed approach comprises the automatic generation of bilingual dictionaries and a new technique for constructing a co-occurrence graph, which is used to select the most suitable translations from the dictionary. Different algorithms that combine the dictionary and the co-occurrence graph are then used to select the final translations: techniques based on sub-graphs (communities) containing clusters of words with related meanings, techniques based on distances between the nodes that represent words, and techniques based on the relative importance of each node in the whole graph. The initial output of the system is enhanced with translation probabilities provided by a statistical bilingual dictionary. The system is evaluated using datasets from two competitions: task 3 of SemEval 2010 and task 10 of SemEval 2013. The results obtained by the different disambiguation techniques are analysed and compared with those obtained by the systems participating in the competitions. Our system offers the best results among unsupervised systems in most of the experiments, and even outperforms supervised systems in some cases.
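
To make the distance-based selection idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that two words are linked whenever they share a document, that distance is the number of edges on the shortest path, and that the candidate translation closest on average to the already-translated context words is chosen. The function names (build_cooccurrence_graph, hop_distance, pick_translation) and the toy Spanish corpus are hypothetical and serve only as illustration; the abstract does not specify how the graph is actually built or weighted.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_cooccurrence_graph(documents):
    """Link two words when they appear together in at least one document.

    Assumption: plain document-level co-occurrence, unweighted edges.
    """
    graph = defaultdict(set)
    for doc in documents:
        for a, b in combinations(sorted(set(doc)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def hop_distance(graph, source, target):
    """Breadth-first search; returns the number of edges, or None if unreachable."""
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph[node]:
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def pick_translation(graph, candidates, context_words):
    """Return the candidate with the smallest mean hop distance to the context.

    context_words must be non-empty. Unreachable pairs receive a penalty
    larger than any real path in the graph.
    """
    unreachable = len(graph) + 1

    def score(candidate):
        dists = [hop_distance(graph, candidate, w) for w in context_words]
        return sum(unreachable if d is None else d for d in dists) / len(dists)

    return min(candidates, key=score)


# Hypothetical target-language (Spanish) corpus, one word list per document.
documents = [
    ["banco", "dinero", "cuenta", "préstamo"],
    ["orilla", "río", "agua", "pesca"],
    ["dinero", "interés", "préstamo"],
    ["río", "agua", "barco"],
]
graph = build_cooccurrence_graph(documents)

# Candidate translations of English "bank", as a bilingual dictionary might list them.
candidates = ["banco", "orilla"]
print(pick_translation(graph, candidates, ["dinero", "interés"]))  # banco
print(pick_translation(graph, candidates, ["agua", "barco"]))      # orilla
```

In this toy setting the financial and river vocabularies form two disconnected components, so the context words alone decide the translation. A community-based or node-importance-based variant would replace only the scoring function and keep the same graph.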



Natural Language Engineering
  • ISSN: 1351-3249
  • EISSN: 1469-8110
  • URL: /core/journals/natural-language-engineering
