
Improving sentiment analysis with multi-task learning of negation

Published online by Cambridge University Press: 11 November 2020

Jeremy Barnes*, Erik Velldal and Lilja Øvrelid
Affiliation: Language Technology Group, University of Oslo, Oslo, Norway

*Corresponding author. E-mail: jeremycb@ifi.uio.no

Abstract

Sentiment analysis is directly affected by compositional phenomena in language that act on the prior polarity of the words and phrases found in the text. Negation is the most prevalent of these phenomena, and in order to correctly predict sentiment, a classifier must be able to identify negation and disentangle the effect that its scope has on the final polarity of a text. This paper proposes a multi-task approach for explicitly incorporating information about negation in sentiment analysis, which we show outperforms learning negation implicitly in an end-to-end manner. We describe our approach, a cascading and hierarchical neural architecture with selective sharing of Long Short-Term Memory (LSTM) layers, and show that explicitly training the model with negation as an auxiliary task improves the main task of sentiment analysis. The effect is demonstrated across several standard English-language data sets for both tasks, and we analyze several aspects of the system's performance by varying the type and amount of input data and the multi-task setup.
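To make the cascading architecture described in the abstract more concrete, the following is a minimal PyTorch sketch of a hierarchical multi-task model in the same spirit: a lower bi-directional LSTM layer is supervised with the auxiliary negation-scope tagging task, and a higher LSTM layer feeds the main sentiment classifier. All class and variable names, layer sizes, the number of negation tags, and the max-pooling choice are illustrative assumptions, not the authors' exact implementation.

    # Hedged sketch of a cascaded multi-task model: negation scope tagging is
    # supervised at a lower BiLSTM layer, sentiment classification at a higher one.
    # Layer sizes, names, and the pooling choice are assumptions for illustration.
    import torch
    import torch.nn as nn

    class CascadedMTLModel(nn.Module):
        def __init__(self, vocab_size, emb_dim=100, hidden_dim=100,
                     n_negation_tags=3, n_sentiment_classes=2):
            super().__init__()
            self.embedding = nn.Embedding(vocab_size, emb_dim)
            # Lower BiLSTM: shared, supervised with the auxiliary negation task
            self.lstm_low = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                                    bidirectional=True)
            self.negation_head = nn.Linear(2 * hidden_dim, n_negation_tags)
            # Higher BiLSTM: used only by the main sentiment task
            self.lstm_high = nn.LSTM(2 * hidden_dim, hidden_dim, batch_first=True,
                                     bidirectional=True)
            self.sentiment_head = nn.Linear(2 * hidden_dim, n_sentiment_classes)

        def forward(self, token_ids):
            emb = self.embedding(token_ids)                 # (batch, seq, emb_dim)
            low_out, _ = self.lstm_low(emb)                 # (batch, seq, 2*hidden)
            negation_logits = self.negation_head(low_out)   # per-token tag scores
            high_out, _ = self.lstm_high(low_out)           # (batch, seq, 2*hidden)
            pooled, _ = high_out.max(dim=1)                 # max-pool over time
            sentiment_logits = self.sentiment_head(pooled)  # sentence-level scores
            return sentiment_logits, negation_logits

    # Example forward pass with random token ids (vocabulary size assumed)
    model = CascadedMTLModel(vocab_size=10000)
    tokens = torch.randint(0, 10000, (2, 12))   # 2 sentences, 12 tokens each
    sent_logits, neg_logits = model(tokens)
    print(sent_logits.shape, neg_logits.shape)  # (2, 2) and (2, 12, 3)

In a multi-task setup of this kind, batches from the sentiment and negation data sets would typically be interleaved during training, with each batch updating only its own output head together with the shared lower layers.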

Type: Article
Copyright: © The Author(s), 2020. Published by Cambridge University Press

