
Estimating the number of segments for improving dialogue act labelling


In dialogue systems, it is important to label each dialogue turn with its dialogue-related meaning. A turn is usually divided into segments, and each segment is labelled with one dialogue act (DA), which represents the functional role of the segment in the ongoing discourse. The dialogue manager uses the sequence of DAs assigned to a turn to interpret that turn. Probabilistic DA labelling models can be applied to segmented or unsegmented turns. Unsegmented turns are the more likely input in a practical dialogue system, but labelling them gives poorer results. In that case, a hypothesis for the number of segments can be provided to improve the results. We propose several methods to estimate the probability of the number of segments from the transcription of the turn, and we include this estimate in a new labelling model. We tested this approach on two dialogue corpora, SwitchBoard and Dihana. The results show that including this estimate significantly improves labelling accuracy.
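
The idea of weighting labelling hypotheses by the probability of the number of segments can be illustrated with a minimal sketch. It is not the estimator or the labelling model of the paper: conditioning on the number of words in the turn, the add-alpha smoothing, the function names `train_segment_count_model` and `rescore`, and the DA labels are all assumptions made for illustration. The sketch relies only on the fact that each segment carries one DA, so the length of a DA sequence equals its number of segments.

```python
import math
from collections import Counter, defaultdict


def train_segment_count_model(turns, max_segments, alpha=1.0):
    """Estimate P(r | number of words) from (words, num_segments) pairs.

    Assumption: the turn is summarised by its word count only, and
    probabilities are smoothed with add-alpha over r = 1..max_segments.
    """
    counts = defaultdict(Counter)
    for words, num_segments in turns:
        counts[len(words)][num_segments] += 1

    def prob(r, words):
        if not 1 <= r <= max_segments:
            return 0.0
        by_length = counts.get(len(words), Counter())
        total = sum(by_length.values())
        return (by_length[r] + alpha) / (total + alpha * max_segments)

    return prob


def rescore(hypotheses, words, prob):
    """Pick the DA sequence maximising log-score + log P(r | words).

    hypotheses: list of (da_sequence, log_score) pairs produced by some
    labelling model (hypothetical here); r is len(da_sequence).
    """
    best = None
    for da_sequence, log_score in hypotheses:
        p = prob(len(da_sequence), words)
        if p == 0.0:
            continue  # segment count outside the modelled range
        total = log_score + math.log(p)
        if best is None or total > best[1]:
            best = (da_sequence, total)
    return best


# Toy training data: (transcribed words, number of segments in the turn).
training = [
    ("yes".split(), 1),
    ("yes".split(), 1),
    ("yes i want a ticket".split(), 2),
]
prob = train_segment_count_model(training, max_segments=3)

# Two hypothetical labelling hypotheses for the one-word turn "yes".
hypotheses = [(["Affirm", "Question"], -1.0), (["Affirm"], -1.5)]
best_sequence, best_score = rescore(hypotheses, ["yes"], prob)
```

In this toy case the two-segment hypothesis has the better labelling score, but the one-segment hypothesis wins after rescoring, because one-word turns in the training data were always a single segment.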

Natural Language Engineering
  • ISSN: 1351-3249
  • EISSN: 1469-8110