Automatic summarisation: 25 years On

  • Constantin Orăsan (a1)

Abstract

Automatic text summarisation has received attention from the research community since the early days of computational linguistics, but it really took off around 25 years ago. This article presents the main developments of the last 25 years. It starts by defining what a summary is and how that definition has changed over time as a result of the interest in processing new types of documents. It continues with a brief history of the field and highlights the main challenges posed by the evaluation of summaries. It finishes with some thoughts about the future of the field.

Corresponding author

*Corresponding author. E-mail: C.Orasan@wlv.ac.uk
