
Generating basic skills reports for low-skilled readers*



We describe SkillSum, a Natural Language Generation (NLG) system that generates a personalised feedback report for someone who has just completed a screening assessment of their basic literacy and numeracy skills. Because many SkillSum users have limited literacy, the generated reports must be easy to comprehend for people with limited reading skills; this is the most novel aspect of SkillSum, and the focus of this paper. We used two approaches to maximise readability. First, for determining content and structure (document planning), we did not model readability explicitly, but instead took a pragmatic approach, repeatedly revising content and structure following pilot experiments and interviews with domain experts. Second, for choosing linguistic expressions (microplanning), we attempted to formulate explicitly the choices that enhanced readability, using a constraint-based approach and preference rules; our constraints were based on corpus analysis and our preference rules were based on psycholinguistic findings. Evaluation of the SkillSum system was twofold: it compared the usefulness of NLG technology to that of canned text output, and it assessed the effectiveness of the readability model. Results showed that NLG was more effective than canned text at enhancing users' knowledge of their skills, and also suggested that the empirical ‘revise based on experiments and interviews’ approach contributed substantially to readability, as did our explicit, psycholinguistically inspired models of readability choices.



