Keyword extraction: Issues and methods

Published online by Cambridge University Press: 11 November 2019

Nazanin Firoozeh*
Affiliation:
Pixalione SAS, 75015 Paris, France; Northern Paris Computer Science Laboratory (LIPN), Paris 13 University – Sorbonne Paris Cité & CNRS, 93430 Villetaneuse, France
Adeline Nazarenko
Affiliation:
Northern Paris Computer Science Laboratory (LIPN), Paris 13 University – Sorbonne Paris Cité & CNRS, 93430 Villetaneuse, France
Fabrice Alizon
Affiliation:
Pixalione SAS, 75015 Paris, France
Béatrice Daille
Affiliation:
Laboratory of Digital Sciences of Nantes (LS2N), University of Nantes, 44322 Nantes Cedex 3, France
*Corresponding author. Email: nazanin.firoozeh@pixalione.com

Abstract

Due to the considerable growth in the volume of text documents on the Internet and in digital libraries, manual analysis of these documents is no longer feasible. Efficient approaches to keyword extraction, which retrieve the ‘key’ elements of the studied documents, are now a necessity. Keyword extraction has been an active research field for many years, covering various applications in Text Mining, Information Retrieval, and Natural Language Processing, and meeting different requirements. However, it is not a unified domain of research. Despite the many approaches in the field, no single approach effectively extracts keywords from different data sources. This shows the importance of a comprehensive review that discusses the complexity of the task and categorizes the main approaches of the field based on the features and extraction methods they use. This paper presents a general introduction to the field of keyword/keyphrase extraction. Unlike existing surveys, it discusses different aspects of the problem along with the main challenges in the field. These mainly include the unclear definition of ‘keyness’, the complexity of targeting proper features for capturing desired keyness properties and of selecting efficient extraction methods, and the evaluation issues. By classifying a broad range of state-of-the-art approaches and analysing the benefits and drawbacks of different features and methods, we provide a clearer picture of them. This review is intended to help readers find their way around the work related to keyword extraction and guide them in choosing or designing a method appropriate for their target application.

Information

Type
Survey Paper
Copyright
© Cambridge University Press 2019
