
Computer-assisted assessment of free-text answers

  • Diana Pérez-Marín (a1), Ismael Pascual-Nieto (a2) and Pilar Rodríguez (a2)

The automatic assessment of students’ free-text answers has recently received much attention, owing to the need to explore and exploit new and more complex computer-based assessment methods. This paper reviews the state of the art in the field, focusing on the techniques that underpin these systems and on their evaluation metrics. Although the ideal system is still a long way off, the fact that existing systems are already used commercially, and as a second opinion in exams such as the GMAT, demonstrates the uptake of the field.


The Knowledge Engineering Review
  • ISSN: 0269-8889
  • EISSN: 1469-8005
  • URL: /core/journals/knowledge-engineering-review