
Discourse structure and language technology

  • B. WEBBER, M. EGG and V. KORDONI
Abstract

An increasing number of researchers and practitioners in Natural Language Engineering face the prospect of having to work with entire texts, rather than individual sentences. While it is clear that text must have useful structure, the nature of that structure may be less clear, making it more difficult to exploit in applications. This survey of work on discourse structure thus provides a primer on the bases on which discourse is structured, along with some of their formal properties. It then lays out the current state of the art with respect to algorithms for recognizing these different structures, and how these algorithms are currently being used in Language Technology applications. After identifying resources that should prove useful in improving algorithm performance across a range of languages, we conclude by speculating on future discourse structure-enabled technology.

Natural Language Engineering
  • ISSN: 1351-3249
  • EISSN: 1469-8110
  • URL: /core/journals/natural-language-engineering