
3 - Pursuing and demonstrating understanding in dialogue

from Part I - Joint construction

Published online by Cambridge University Press:  05 July 2014

David DeVault
Affiliation:
University of Southern
Matthew Stone
Affiliation:
Rutgers, The State University of New Jersey
Amanda Stent
Affiliation:
AT&T Research, Florham Park, New Jersey
Srinivas Bangalore
Affiliation:
AT&T Research, Florham Park, New Jersey

Summary

Introduction

The appeal of natural language dialogue as an interface modality is its ability to support open-ended mixed-initiative interaction. Many systems offer rich and extensive capabilities, but must support novice or infrequent users. It is unreasonable to expect untrained users to know the actions they need in advance, or to be able to specify their goals using a regimented scheme of commands or menu options. Dialogue allows the user to talk through their needs with the system and arrive collaboratively at a feasible solution. Dialogue, in short, becomes more useful to users as the interaction becomes more potentially problematic.

However, the flexibility of dialogue comes at a cost in system engineering. We cannot expect the user's model of the task and domain to align with the system's. Consequently, the system cannot count on a fixed schema to enable it to understand the user. It must be prepared for incorrect or incomplete analyses of users' utterances, and must be able to piece together users' needs across extended interactions. Conversely, the system must be prepared for users who misunderstand it, or fail to understand it.
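One way to make this concrete is a minimal sketch, not drawn from the chapter itself: the system keeps a ranked set of candidate interpretations for each utterance and commits only when one reading is clearly best, otherwise requesting clarification. The function name, thresholds, and intent labels below are all illustrative assumptions.

```python
# Sketch: track multiple candidate interpretations of a user utterance
# and ask for clarification when the system cannot commit to one.
# Thresholds and intent names are illustrative, not from the chapter.

def choose_action(candidates, commit_threshold=0.7, margin=0.2):
    """candidates: list of (interpretation, score) pairs, scores in [0, 1].

    Returns ("commit", interpretation) when one reading is clearly best,
    ("clarify", top_two) when the top readings are too close to call,
    and ("reject", None) when there are no candidates at all.
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    if not ranked:
        return ("reject", None)
    best = ranked[0]
    second = ranked[1] if len(ranked) > 1 else (None, 0.0)
    if best[1] >= commit_threshold and best[1] - second[1] >= margin:
        return ("commit", best[0])
    return ("clarify", [r[0] for r in ranked[:2]])

print(choose_action([("book_flight", 0.9), ("book_hotel", 0.3)]))
# prints ('commit', 'book_flight')
print(choose_action([("book_flight", 0.55), ("book_hotel", 0.5)]))
# prints ('clarify', ['book_flight', 'book_hotel'])
```

The clarification branch is what lets the system pose a question ("Did you mean a flight or a hotel?") rather than silently guessing, which is one simple way of pursuing understanding across turns.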

This chapter provides an overview of the concepts, models, and research challenges involved in this process of pursuing and demonstrating understanding in dialogue. We start in Section 3.2 from analyses of human–human conversation. People are no different from systems: they, too, face potentially problematic interactions involving misunderstandings. In response, they avail themselves of a wide range of discourse moves and interactive strategies, suggesting that they approach communication itself as a collaborative process wherein all parties establish agreement, to their mutual satisfaction, on the distinctions that matter for their discussion and on the expressions through which to identify those distinctions. In the literature, this process is often described as grounding communication, or identifying contributions well enough so that they become part of the common ground of the conversation (Clark and Marshall, 1981; Clark and Schaefer, 1989; Clark and Wilkes-Gibbs, 1990; Clark, 1996).
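The two-phase structure of grounding described by Clark and Schaefer (1989) can be illustrated with a deliberately minimal sketch: a presented contribution enters the common ground only after the addressee gives evidence of understanding. The class and method names here are assumptions for illustration, not the chapter's own model.

```python
# Minimal sketch of two-phase grounding: a contribution is first
# presented, then accepted into the common ground once the addressee
# gives evidence of understanding (e.g. an acknowledgement or a
# relevant next turn). Data structures are illustrative only.

class GroundingState:
    def __init__(self):
        self.pending = []        # presented but not yet grounded
        self.common_ground = []  # mutually accepted contributions

    def present(self, content):
        """Speaker puts a contribution on the table."""
        self.pending.append(content)

    def acknowledge(self, content):
        """Addressee signals understanding; the contribution is grounded."""
        if content in self.pending:
            self.pending.remove(content)
            self.common_ground.append(content)

state = GroundingState()
state.present("meet at noon")
state.acknowledge("meet at noon")  # e.g. the addressee says "okay"
print(state.common_ground)
# prints ['meet at noon']
```

Real models are of course richer: evidence of understanding is graded, acceptance can itself be negotiated, and contributions can be revised, but the asymmetry between presenting and grounding is the core idea.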

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2014


References

Allen, J. F., Blaylock, N., and Ferguson, G. (2002). A problem solving model for collaborative agents. In Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 774-781, Bologna, Italy. International Foundation for Autonomous Agents and Multiagent Systems.
Asher, N. and Lascarides, A. (2003). Logics of Conversation. Cambridge University Press, Cambridge, UK.
Brennan, S. E. (1990). Seeking and Providing Evidence for Mutual Understanding. PhD thesis, Department of Psychology, Stanford University.
Brennan, S. E. and Clark, H. H. (1996). Conceptual pacts and lexical choice in conversation. Journal of Experimental Psychology, 22(6):1482-1493.
Brennan, S. E. and Williams, M. (1995). The feeling of another's knowing: Prosody and filled pauses as cues to listeners about the metacognitive states of speakers. Journal of Memory and Language, 34(3):383-398.
Bunt, H. (1994). Context and dialogue control. THINK Quarterly, 3:19-31.
Bunt, H. (1996). Interaction management functions and context representation requirements. In Proceedings of the Twente Workshop on Language Technology, pages 187-198, University of Twente. University of Twente.
Bunt, H. (2000). Dialogue pragmatics and context specification. In Bunt, H. and Black, W., editors, Abduction, Belief and Context in Dialogue. Studies in Computational Pragmatics, pages 81-150. John Benjamins, Amsterdam, The Netherlands.
Carberry, S. and Lambert, L. (1999). A process model for recognizing communicative acts and modeling negotiation subdialogues. Computational Linguistics, 25(1):1-53.
Cassell, J. (2000). Embodied conversational interface agents. Communications of the ACM, 43(4):70-78.
Cassell, J., Stone, M., and Yan, H. (2000). Coordination and context-dependence in the generation of embodied conversation. In Proceedings of the International Conference on Natural Language Generation (INLG), pages 171-178, Mitzpe Ramon, Israel. Association for Computational Linguistics.
Clark, H. (1996). Using Language. Cambridge University Press, Cambridge, UK.
Clark, H. H. (1993). Arenas of Language Use. University of Chicago Press, Chicago, IL.
Clark, H. H. and Krych, M. (2004). Speaking while monitoring addressees for understanding. Journal of Memory and Language, 50(1):62-81.
Clark, H. H. and Marshall, C. R. (1981). Definite reference and mutual knowledge. In Joshi, A., Webber, B., and Sag, I., editors, Elements of Discourse Understanding, pages 10-63. Cambridge University Press, Cambridge, UK.
Clark, H. H. and Schaefer, E. F. (1989). Contributing to discourse. Cognitive Science, 13(2):259-294.
Clark, H. H. and Wilkes-Gibbs, D. (1990). Referring as a collaborative process. In Cohen, P. R., Morgan, J., and Pollack, M. E., editors, Intentions in Communication, pages 463-493. MIT Press, Cambridge, MA.
Cohen, P. R. (1997). Dialogue modeling. In Cole, R., Mariani, J., Uszkoreit, H., Varile, G. B., Zaenen, A., and Zampolli, A., editors, Survey of the State of the Art in Human Language Technology (Studies in Natural Language Processing), pages 204-210. Cambridge University Press, Cambridge, UK.
Core, M. G. and Allen, J. F. (1997). Coding dialogues with the DAMSL annotation scheme. In Working Notes of the AAAI Fall Symposium on Communicative Action in Humans and Machines, Boston, MA. AAAI Press.
DeVault, D. (2008). Contribution Tracking: Participating in Task-Oriented Dialogue under Uncertainty. PhD thesis, Department of Computer Science, Rutgers, The State University of New Jersey, New Brunswick, NJ.
DeVault, D., Kariaeva, N., Kothari, A., Oved, I., and Stone, M. (2005). An information-state approach to collaborative reference. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pages 1-4, Ann Arbor, MI. Association for Computational Linguistics.
DeVault, D. and Stone, M. (2006). Scorekeeping in an uncertain language game. In Proceedings of the Workshop on the Semantics and Pragmatics of Dialogue (brandial), pages 139-146, Potsdam, Germany. SemDial.
DeVault, D. and Stone, M. (2007). Managing ambiguities across utterances in dialogue. In Proceedings of the Workshop on the Semantics and Pragmatics of Dialogue (DECALOG), pages 49-56, Rovereto, Italy. SemDial.
DeVault, D. and Stone, M. (2009). Learning to interpret utterances using dialogue history. In Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics (EACL), pages 184-192, Athens, Greece. Association for Computational Linguistics.
Di Eugenio, B., Jordan, P. W., Thomason, R. H., and Moore, J. D. (2000). The agreement process: An empirical investigation of human-human computer-mediated collaborative dialogue. International Journal of Human-Computer Studies, 53(6):1017-1076.
Furnas, G. W., Landauer, T. K., Gomez, L. M., and Dumais, S. T. (1987). The vocabulary problem in human-system communications. Communications of the ACM, 30(11):964-971.
Ginzburg, J. and Cooper, R. (2004). Clarification, ellipsis and the nature of contextual updates in dialogue. Linguistics and Philosophy, 27(3):297-365.
Goldman, A. (1970). A Theory of Human Action. Prentice Hall, Upper Saddle River, NJ.
Gregoromichelaki, E., Kempson, R., Purver, M., Mills, G. J., Cann, R., Meyer-Viol, W., and Healey, P. G. (2011). Incrementality and intention-recognition in utterance processing. Dialogue and Discourse, 2(1):199-233.
Grice, H. P. (1975). Logic and conversation. In Cole, P. and Morgan, J. L., editors, Syntax and Semantics III: Speech Acts, pages 41-58. Academic Press, New York, NY.
Grosz, B. J. and Sidner, C. L. (1986). Attention, intentions, and the structure of discourse. Computational Linguistics, 12(3):175-204.
Heeman, P. A. and Hirst, G. (1995). Collaborating on referring expressions. Computational Linguistics, 21(3):351-383.
Henderson, J., Lemon, O., and Georgila, K. (2008). Hybrid reinforcement/supervised learning of dialogue policies from fixed datasets. Computational Linguistics, 34(4):487-513.
Hobbs, J. R., Stickel, M., Appelt, D., and Martin, P. (1993). Interpretation as abduction. Artificial Intelligence, 63(1-2):69-142.
Horvitz, E. and Paek, T. (2001). Harnessing models of users' goals to mediate clarification dialog in spoken language systems. In Proceedings of the International Conference on User Modeling, pages 3-13, Sonthofen, Germany. Springer.
Kehler, A. (2001). Coherence, Reference and the Theory of Grammar. CSLI Publications, Stanford, CA.
Kopp, S., Tepper, P., and Cassell, J. (2004). Towards integrated microplanning of language and iconic gesture for multimodal output. In Proceedings of the International Conference on Multimodal Interfaces (ICMI), pages 97-104, State College, PA. Association for Computing Machinery.
Larsson, S. and Traum, D. (2000). Information state and dialogue management in the TRINDI dialogue move engine toolkit. Natural Language Engineering, 6(3-4):323-340.
Lascarides, A. and Asher, N. (2009). Agreement, disputes and commitments in dialogue. Journal of Semantics, 26(2):109-158.
Lascarides, A. and Stone, M. (2009). Discourse coherence and gesture interpretation. Gesture, 9(2):147-180.
Lemon, O. (2011). Learning what to say and how to say it: Joint optimisation of spoken dialogue management and natural language generation. Computer Speech & Language, 25(2):210-221.
Levin, E. and Pieraccini, R. (1997). A stochastic model of computer-human interaction for learning dialogue strategies. In Proceedings of the European Conference on Speech Communication and Technology (EUROSPEECH), pages 1883-1886, Rhodes, Greece. International Speech Communication Association.
Levin, E., Pieraccini, R., and Eckert, W. (1998). Using Markov decision process for learning dialogue strategies. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), volume 1, pages 201-204, Seattle, WA. Institute of Electrical and Electronics Engineers.
Matheson, C., Poesio, M., and Traum, D. (2000). Modelling grounding and discourse obligations using update rules. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), pages 1-8, Seattle, WA. Association for Computational Linguistics.
Nakano, Y. I., Reinstein, G., Stocky, T., and Cassell, J. (2003). Towards a model of face-to-face grounding. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pages 553-561, Sapporo, Japan. Association for Computational Linguistics.
Newell, A. (1982). The knowledge level. Artificial Intelligence, 18:87-127.
Pollack, M. (1986). A model of plan inference that distinguishes between the beliefs of actors and observers. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pages 207-214, New York, NY. Association for Computational Linguistics.
Purver, M. (2004). The Theory and Use of Clarification Requests in Dialogue. PhD thesis, Department of Computer Science, King's College, University of London.
Rich, C., Sidner, C. L., and Lesh, N. (2001). COLLAGEN: Applying collaborative discourse theory to human-computer interaction. Artificial Intelligence Magazine, 22(4):15-25.
Rieser, V. and Lemon, O. (2011). Learning and evaluation of dialogue strategies for new applications: Empirical methods for optimization from small data sets. Computational Linguistics, 37(1):153-196.
Roy, N., Pineau, J., and Thrun, S. (2000). Spoken dialog management for robots. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pages 93-100, Hong Kong. Association for Computational Linguistics.
Russell, S. and Norvig, P. (1995). Artificial Intelligence: A Modern Approach. Prentice Hall, Upper Saddle River, NJ.
Sengers, P. (1999). Designing comprehensible agents. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), pages 1227-1232, Stockholm, Sweden. International Joint Conference on Artificial Intelligence.
Sidner, C. L. (1994). Negotiation in collaborative activity: A discourse analysis. Knowledge-Based Systems, 7(4):265-267.
Stalnaker, R. (1974). Pragmatic presuppositions. In Munitz, M. K. and Unger, P. K., editors, Semantics and Philosophy, pages 197-213. New York University Press, New York, NY.
Stalnaker, R. (1978). Assertion. In Cole, P., editor, Syntax and Semantics, volume 9, pages 315-332. Academic Press, New York, NY.
Stent, A. J. (2002). A conversation acts model for generating spoken dialogue contributions. Computer Speech and Language, 16(3-4):313-352.
Stone, M. (2004). Communicative intentions and conversational processes in human-human and human-computer dialogue. In Trueswell, J. C. and Tanenhaus, M. K., editors, Approaches to Studying World-Situated Language Use: Bridging the Language-as-Product and Language-as-Action Traditions, pages 39-70. MIT Press, Cambridge, MA.
Stone, M., Doran, C., Webber, B., Bleam, T., and Palmer, M. (2003). Microplanning with communicative intentions: The SPUD system. Computational Intelligence, 19(4):314-381.
Stone, M. and Lascarides, A. (2010). Coherence and rationality in dialogue. In Proceedings of the Workshop on the Semantics and Pragmatics of Dialogue (SEMDIAL), pages 51-58, Poznan, Poland. SemDial.
Stone, M. and Oh, I. (2008). Modeling facial expression of uncertainty in conversational animation. In Wachsmuth, I. and Knoblich, G., editors, Modeling Communication with Robots and Virtual Humans, pages 57-76. Springer, Heidelberg, Germany.
Swartout, W., Gratch, J., Hill, R. W., Hovy, E., Marsella, S., Rickel, J., and Traum, D. (2006). Toward virtual humans. AI Magazine, 27(2):96-108.
Swerts, M. and Krahmer, E. (2005). Audiovisual prosody and feeling of knowing. Journal of Memory and Language, 53(1):81-94.
Tetreault, J. and Litman, D. (2006). Using reinforcement learning to build a better model of dialogue state. In Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics (EACL), pages 289-296, Trento, Italy. Association for Computational Linguistics.
Thomason, R. H., Stone, M., and DeVault, D. (2006). Enlightened update: A computational architecture for presupposition and other pragmatic phenomena. For the Ohio State Pragmatics Initiative, 2006, available at http://www.research.rutgers.edu/~ddevault/. Accessed on 11/24/2013.
Traum, D. and Allen, J. F. (1994). Discourse obligations in dialogue processing. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pages 1-8, Las Cruces, NM. Association for Computational Linguistics.
Traum, D. and Hinkelman, E. (1992). Conversation acts in task-oriented spoken dialogue. Computational Intelligence, 8(3):575-599.
Traum, D. R. (1994). A Computational Theory of Grounding in Natural Language Conversation. PhD thesis, Department of Computer Science, University of Rochester.
Wahlster, W., Reithinger, N., and Blocher, A. (2001). SmartKom: Multimodal communication with a life-like character. In Proceedings of the European Conference on Speech Communication and Technology (EUROSPEECH), pages 1547-1550, Aalborg, Denmark. International Speech Communication Association.
Walker, M. A. (2000). An application of reinforcement learning to dialogue strategy selection in a spoken dialogue system for email. Journal of Artificial Intelligence Research, 12:387-416.
Williams, J. and Young, S. (2006). Scaling POMDPs for dialog management with composite summary point-based value iteration (CSPBVI). In Proceedings of the AAAI Workshop on Statistical and Empirical Approaches for Spoken Dialogue Systems. AAAI Press.
Williams, J. D. (2008). Demonstration of a POMDP voice dialer. In Proceedings of the Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT), pages 1-4, Columbus, OH. Association for Computational Linguistics.
Williams, J. D. and Young, S. (2007). Partially observable Markov decision processes for spoken dialog systems. Computer Speech and Language, 21(2):393-422.
