A syntactic approach for opinion mining on Spanish reviews

  • DAVID VILARES, MIGUEL A. ALONSO and CARLOS GÓMEZ-RODRÍGUEZ
Abstract

We describe an opinion mining system which classifies the polarity of Spanish texts. We propose an NLP approach that pre-processes, tokenises and POS-tags texts, and then obtains the syntactic structure of sentences by means of a dependency parser. This structure is used to address three of the most significant linguistic constructions for polarity classification: intensification, subordinate adversative clauses and negation. We also propose a semi-automatic domain adaptation method to improve the accuracy of our system in specific application domains: semantic dictionaries are enriched using machine learning methods so that the semantic orientation of their words is adapted to a particular field. Experimental results are promising in both general and specific domains.
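
As an illustration of how a dependency structure can drive the treatment of intensification, negation and adversative clauses, the following is a minimal sketch in Python. It is not the system described in the abstract: the toy lexicons, the `Node` class, the shift-based negation rule and the clause weights are all hypothetical choices made for the example, and a real pipeline would obtain the tree from a dependency parser rather than building it by hand.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical toy lexicons; a real system would load a semantic dictionary.
SEMANTIC_ORIENTATION = {"bueno": 2.0, "malo": -2.0, "caro": -1.0}
INTENSIFIERS = {"muy": 0.25, "poco": -0.5}  # relative boost or dampening
NEGATORS = {"no", "nunca"}
NEGATION_SHIFT = 4.0  # assumed: negation shifts the score towards the opposite sign
# Assumed weights: (weight of the main clause, weight of the adversative clause).
ADVERSATIVES = {"pero": (0.5, 1.5)}


@dataclass
class Node:
    """A word in a dependency tree together with its dependents."""
    form: str
    children: List["Node"] = field(default_factory=list)


def score(node: Node) -> float:
    """Compute a semantic orientation for the subtree rooted at `node`."""
    own = SEMANTIC_ORIENTATION.get(node.form, 0.0)
    boost, negated, rest = 0.0, False, 0.0
    adversative_clauses = []
    for child in node.children:
        if child.form in INTENSIFIERS:
            boost += INTENSIFIERS[child.form]
        elif child.form in NEGATORS:
            negated = True
        elif child.form in ADVERSATIVES:
            clause = sum(score(c) for c in child.children)
            adversative_clauses.append((child.form, clause))
        else:
            rest += score(child)
    # Intensifiers scale the head word; other dependents are added.
    total = own * (1.0 + boost) + rest
    # Negation shifts the accumulated score instead of flipping its sign.
    if negated and total != 0.0:
        total += -NEGATION_SHIFT if total > 0 else NEGATION_SHIFT
    # An adversative clause outweighs the clause it is attached to.
    for conjunction, clause in adversative_clauses:
        main_weight, sub_weight = ADVERSATIVES[conjunction]
        total = main_weight * total + sub_weight * clause
    return total


def classify(node: Node) -> str:
    """Map a subtree score to a polarity label."""
    value = score(node)
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "neutral"


# "no (es) muy bueno": negation and intensification on the same head.
not_very_good = Node("bueno", [Node("no"), Node("muy")])
# "caro pero muy bueno": the adversative clause dominates the main clause.
pricey_but_good = Node("caro", [Node("pero", [Node("bueno", [Node("muy")])])])

print(classify(not_very_good))    # negative
print(classify(pricey_but_good))  # positive
```

Attaching intensifiers and negators to their syntactic head, rather than to the nearest word in the surface string, is what the tree buys in this sketch: the scope of each modifier is the subtree it depends on.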


Natural Language Engineering
  • ISSN: 1351-3249
  • EISSN: 1469-8110
  • URL: /core/journals/natural-language-engineering