Subjectivity detection in spoken and written conversations

  • GABRIEL MURRAY and GIUSEPPE CARENINI
Abstract

In this work we investigate four subjectivity and polarity tasks on spoken and written conversations. We implement and compare several pattern-based subjectivity detection approaches, including a novel technique wherein subjective patterns are learned from both labeled and unlabeled data, using n-gram word sequences with varying levels of lexical instantiation. We compare the use of these learned patterns with an alternative approach of using a very large set of raw pattern features. We also investigate how these pattern-based approaches can be supplemented and improved with features relating to conversation structure. Experimenting with meeting speech and email threads, we find that our novel systems incorporating varying instantiation patterns and conversation features outperform state-of-the-art systems despite having no recourse to domain-specific features such as prosodic cues and email headers. In some cases, such as when working with noisy speech recognizer output, a small set of well-motivated conversation features performs as well as a very large set of raw patterns.
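
As a rough illustration of what n-gram patterns with varying levels of lexical instantiation might look like, the sketch below treats each slot of an n-gram as either the literal word or a more abstract label. The use of part-of-speech tags as the abstract label is an assumption made here for illustration; the abstract does not say which abstraction is used. The function name `varying_instantiation_patterns` and the hand-tagged example utterance are likewise hypothetical.

```python
from itertools import product


def varying_instantiation_patterns(tagged_tokens, n=3):
    """Return all n-gram patterns in which each slot is either the word or its tag.

    tagged_tokens is a list of (word, tag) pairs. The tag stands in for the
    less instantiated form of a slot (an assumption of this sketch).
    """
    patterns = set()
    for start in range(len(tagged_tokens) - n + 1):
        window = tagged_tokens[start:start + n]
        # Each slot independently picks index 0 (the word) or index 1 (the tag),
        # so one window of length n yields up to 2**n patterns.
        for choice in product((0, 1), repeat=n):
            patterns.add(tuple(token[c] for token, c in zip(window, choice)))
    return patterns


# Hypothetical hand-tagged utterance; the tags are illustrative only.
tagged = [("i", "PRP"), ("really", "RB"), ("like", "VBP"), ("it", "PRP")]
trigrams = varying_instantiation_patterns(tagged, n=3)
```

Here a fully lexical pattern such as `("i", "really", "like")` and a fully abstract one such as `("PRP", "RB", "VBP")` sit at opposite ends of the instantiation range, with mixed patterns such as `("i", "RB", "like")` in between. A learning step would then have to decide which of these patterns are indicative of subjectivity; that step is not shown.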

Natural Language Engineering
  • ISSN: 1351-3249
  • EISSN: 1469-8110
  • URL: /core/journals/natural-language-engineering