
A comparative study of pivot selection strategies for unsupervised cross-domain sentiment classification

  • Xia Cui, Noor Al-Bazzaz, Danushka Bollegala and Frans Coenen

Selecting pivot features that connect a source domain to a target domain is an important first step in unsupervised domain adaptation (UDA). Although prior work in domain adaptation (DA) has proposed different strategies for selecting pivots, such as the frequency of a feature in a domain and mutual (or pointwise mutual) information, it remains unknown (a) how the pivots selected using existing strategies differ, and (b) how the pivot selection strategy affects the performance of a target DA task. In this paper, we perform a comparative study covering different strategies that use both labelled (available for the source domain only) and unlabelled (available for both the source and target domains) data for selecting pivots for UDA. Our experiments show that in most cases pivot selection strategies that use labelled data outperform their unlabelled counterparts, emphasising the importance of the source domain labelled data for UDA. Moreover, pointwise mutual information and frequency-based pivot selection strategies obtain the best performance in two state-of-the-art UDA methods.
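
To make the idea of a pivot selection strategy concrete, the following is a minimal sketch of two of the strategy families named above: an unlabelled frequency score and a labelled pointwise mutual information (PMI) score. The exact scoring formulas, the add-one smoothing, and all function names (`doc_freq`, `freq_pivots`, `pmi_pivots`) are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def doc_freq(docs):
    """Count, for each feature, the number of documents it occurs in."""
    counts = Counter()
    for doc in docs:
        counts.update(set(doc))
    return counts


def freq_pivots(source_docs, target_docs, k):
    """Unlabelled frequency strategy (assumed form): score a feature by the
    smaller of its document frequencies in the two domains, so that only
    features common in BOTH domains rank highly."""
    src, tgt = doc_freq(source_docs), doc_freq(target_docs)
    scores = {x: min(src[x], tgt[x]) for x in src.keys() & tgt.keys()}
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda x: (-scores[x], x))[:k]


def pmi_pivots(pos_docs, neg_docs, k, eps=1.0):
    """Labelled PMI strategy (assumed form): score a feature by the absolute
    difference between its PMI with the positive and the negative class,
    using source-domain labelled documents only. `eps` is an illustrative
    additive smoothing constant that avoids log(0)."""
    pos, neg = doc_freq(pos_docs), doc_freq(neg_docs)
    n_pos, n_neg = len(pos_docs), len(neg_docs)
    total = n_pos + n_neg
    scores = {}
    for x in pos.keys() | neg.keys():
        count_x = pos[x] + neg[x] + 2 * eps
        # PMI(x, y) = log( p(x, y) / (p(x) p(y)) ), estimated from counts.
        pmi_pos = math.log((pos[x] + eps) * total / (count_x * n_pos))
        pmi_neg = math.log((neg[x] + eps) * total / (count_x * n_neg))
        scores[x] = abs(pmi_pos - pmi_neg)
    return sorted(scores, key=lambda x: (-scores[x], x))[:k]


if __name__ == "__main__":
    # Toy tokenised reviews; purely hypothetical data.
    source_pos = [["excellent", "book", "plot"], ["excellent", "read"]]
    source_neg = [["boring", "book"], ["boring", "plot", "waste"]]
    target = [["excellent", "blender"], ["boring", "blender", "waste"],
              ["excellent", "motor"]]
    print(freq_pivots(source_pos + source_neg, target, k=2))
    print(pmi_pivots(source_pos, source_neg, k=2))
```

In this sketch the frequency strategy needs no labels but can only prefer features shared by both domains, whereas the PMI strategy uses the source labels to prefer features that discriminate between sentiment classes.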


The Knowledge Engineering Review
  • ISSN: 0269-8889
  • EISSN: 1469-8005
  • URL: /core/journals/knowledge-engineering-review