
Mining of high utility itemsets from incremental datasets: a survey

Published online by Cambridge University Press:  28 April 2025

Rajiv Kumar
Affiliation:
School of Computer Science Engineering and Technology, Bennett University, Greater Noida, India
Kuldeep Singh*
Affiliation:
Department of Computer Science, University of Delhi, Delhi 110–007, India
*Corresponding author: Kuldeep Singh; Email: ksingh@cs.du.ac.in

Abstract

Traditional frequent itemset mining (FIM) is limited mainly by its failure to account for item quantity and significance, such as price and profit. High utility itemset mining (HUIM) was introduced to address these limitations. Traditional HUIM algorithms are designed to operate solely on static transactional datasets. In practical applications such as market basket analysis and business decision-making, however, datasets are dynamic and are updated incrementally as new data are frequently added. Traditional HUIM approaches require a full dataset scan each time the dataset is updated, whereas incremental HUIM (iHUIM) approaches mine the high utility itemsets (HUIs) from incremental datasets without rescanning the whole dataset. Consequently, iHUIM approaches reduce the computational cost of identifying HUIs whenever new records are added. This survey provides a novel taxonomy that includes two-phase-based, pattern-growth-based, projection-based, utility-list-based, and pre-large-based algorithms. It delivers an in-depth analysis of the features and characteristics of the existing state-of-the-art algorithms, together with a detailed comparative overview of their advantages and disadvantages. The survey provides both a categorized analysis and a consolidated summary of all current state-of-the-art iHUIM algorithms, and offers a more in-depth comparative analysis than the currently available surveys. Finally, it highlights several research opportunities and future directions for iHUIM.
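To make the contrast between a full rescan and an incremental update concrete, the following minimal Python sketch computes transaction utility and transaction-weighted utilization (TWU) and then folds in a batch of new transactions by reading only that batch. The item names, profits, and quantities are hypothetical and are not taken from the survey's tables; the `IncrementalTWU` class and the helper functions are illustrative names, not an algorithm from the surveyed literature. TWU is only an upper bound used for pruning, so actual iHUIM algorithms maintain richer structures (trees, utility lists, pre-large sets) on top of this kind of bookkeeping.

```python
from collections import defaultdict

# Hypothetical external utilities (unit profits); NOT the paper's Table 2.
PROFIT = {"a": 5, "b": 2, "c": 1}

# Hypothetical transactions: item -> purchased quantity (internal utility).
ORIGINAL = [
    {"a": 1, "b": 2},
    {"b": 1, "c": 3},
    {"a": 2, "c": 1},
]
NEW = [
    {"a": 1, "b": 1, "c": 1},
]


def transaction_utility(tx):
    """Sum of quantity * unit profit over all items in one transaction."""
    return sum(qty * PROFIT[item] for item, qty in tx.items())


def itemset_utility(itemset, transactions):
    """Utility of an itemset, summed over transactions containing all its items."""
    total = 0
    for tx in transactions:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * PROFIT[item] for item in itemset)
    return total


class IncrementalTWU:
    """Running per-item TWU and total utility; an update reads only the new batch."""

    def __init__(self):
        self.twu = defaultdict(int)
        self.total_utility = 0

    def add(self, transactions):
        for tx in transactions:
            tu = transaction_utility(tx)
            self.total_utility += tu
            for item in tx:
                # TWU of an item: sum of utilities of transactions containing it.
                self.twu[item] += tu

    def promising_items(self, delta):
        """Items whose TWU reaches delta * total utility (pruning candidates only)."""
        threshold = delta * self.total_utility
        return sorted(i for i, v in self.twu.items() if v >= threshold)


state = IncrementalTWU()
state.add(ORIGINAL)   # one pass over the original dataset
state.add(NEW)        # update touches only the inserted transactions

# A from-scratch pass over the whole dataset yields the same TWU values.
rescan = IncrementalTWU()
rescan.add(ORIGINAL + NEW)
```

The incremental state and the from-scratch rescan agree on every TWU value, but the incremental path never revisits the original transactions. Note that the minimum utility threshold is relative to the total utility, which grows with each insertion, so an item's status can change even when its own TWU does not.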

Information

Type
Review
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press
Table 1. A transactional dataset

Table 2. External utility value

Table 3. Transaction utility in the original dataset

Table 4. High utility itemsets in original dataset for $\delta$ = 20%

Table 5. Transaction-weighted utilization and utility values of 1-itemsets in the original dataset

Table 6. New transactions

Table 7. Transaction utility of the newly added transactions

Table 8. TWU and utility values of 1-itemsets in whole dataset

Table 9. HUIs of the whole dataset for $\delta$ = 20%

Figure 1. A taxonomy of incremental high utility itemsets mining approaches

Table 10. Characteristics and theoretical aspects of the Two-phase-based approaches

Table 11. Pros and cons of the Two-phase-based approaches

Figure 2. Incremental maintenance process of IIUT

Table 12. Characteristics and theoretical aspects of the pattern-growth-based approaches

Table 13. Pros and cons of the pattern-growth-based approaches

Figure 3. Construction process of mIHAUI-Tree

Table 14. Characteristics and theoretical aspects of the projection-based approaches

Table 15. Pros and cons of the projection-based approaches

Figure 4. Construction process of HUI-trie structure

Table 16. Characteristics and theoretical aspects of the utility-list-based approaches

Table 17. Pros and cons of the utility-list-based approaches

Figure 5. Process of the proposed algorithm for transaction insertion

Table 18. Characteristics and theoretical aspects of the pre-large-based approaches

Table 19. Pros and cons of the pre-large-based approaches