Dependency-based n-gram models for general purpose sentence realisation


This paper presents a general-purpose, wide-coverage, probabilistic sentence generator based on dependency n-gram models. This is particularly interesting as many semantic or abstract syntactic input specifications for sentence realisation can be represented as labelled bi-lexical dependencies or typed predicate-argument structures. Our generation method captures the mapping between semantic representations and surface forms by linearising a set of dependencies directly, rather than via the application of grammar rules as in more traditional chart-style or unification-based generators. In contrast to conventional n-gram language models over surface word forms, we exploit structural information and various linguistic features inherent in the dependency representations to constrain the generation space and improve the generation quality. A series of experiments shows that dependency-based n-gram models generalise well to different languages (English and Chinese) and representations (LFG and CoNLL). Compared with state-of-the-art generation systems, our general-purpose sentence realiser is highly competitive with the added advantages of being simple, fast, robust and accurate.
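To make the idea of linearising dependencies with an n-gram model concrete, here is a minimal toy sketch. It is not the paper's model: the bigram order, the add-one smoothing, the exhaustive search over permutations, the label names (`SUBJ`, `OBJ`, `DET`, `ADJ`) and the tiny training set are all assumptions made for illustration. Each local tree (a head plus its dependents) is ordered by scoring every permutation of its dependency labels, and the subtrees are linearised recursively.

```python
import math
from collections import Counter
from itertools import permutations

# Hypothetical symbols: sequence boundaries and a placeholder for the head word.
BOS, EOS, HEAD = "<s>", "</s>", "HEAD"


class DependencyBigramModel:
    """Add-one smoothed bigram model over dependency-label sequences (toy)."""

    def __init__(self):
        self.bigrams = Counter()
        self.contexts = Counter()
        self.vocab = {EOS}

    def train(self, label_sequences):
        # Each sequence is the observed left-to-right order of one local tree.
        for seq in label_sequences:
            padded = [BOS, *seq, EOS]
            self.vocab.update(seq)
            for prev, cur in zip(padded, padded[1:]):
                self.bigrams[prev, cur] += 1
                self.contexts[prev] += 1

    def log_prob(self, seq):
        padded = [BOS, *seq, EOS]
        v = len(self.vocab)
        return sum(
            math.log((self.bigrams[p, c] + 1) / (self.contexts[p] + v))
            for p, c in zip(padded, padded[1:])
        )


def linearise(node, model):
    """Order a tree given as (word, [(label, child_node), ...])."""
    word, deps = node
    items = [(HEAD, None)] + list(deps)
    # Exhaustive search: only feasible because local trees are small.
    best = max(
        permutations(items),
        key=lambda order: model.log_prob([label for label, _ in order]),
    )
    words = []
    for label, child in best:
        if label == HEAD:
            words.append(word)
        else:
            words.extend(linearise(child, model))
    return words


# Invented training orders, standing in for orders read off a treebank.
model = DependencyBigramModel()
model.train([
    ["SUBJ", "HEAD", "OBJ"],
    ["SUBJ", "HEAD"],
    ["DET", "HEAD"],
    ["DET", "ADJ", "HEAD"],
    ["HEAD"],
])

# Unordered input: dependents are deliberately listed out of surface order.
tree = ("chased", [
    ("OBJ", ("cat", [("DET", ("the", []))])),
    ("SUBJ", ("dog", [("ADJ", ("big", [])), ("DET", ("a", []))])),
])

print(" ".join(linearise(tree, model)))  # prints: a big dog chased the cat
```

The sketch conditions only on dependency labels. The abstract mentions further linguistic features in the dependency representations, which a fuller model would add to the n-gram context.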


Natural Language Engineering
  • ISSN: 1351-3249
  • EISSN: 1469-8110
  • URL: /core/journals/natural-language-engineering