The search space for frequent itemsets is usually very large, and it grows exponentially with the number of items. In particular, a low minimum support value may result in an intractable number of frequent itemsets. An alternative approach, studied in this chapter, is to determine condensed representations of the frequent itemsets that summarize their essential characteristics. Condensed representations can not only reduce the computational and storage demands, but also make it easier to analyze the mined patterns. In this chapter we discuss three of these representations: closed, maximal, and nonderivable itemsets.
MAXIMAL AND CLOSED FREQUENT ITEMSETS
Given a binary database D ⊆ T × I, over the tids T and items I, let F denote the set of all frequent itemsets, that is,
F = {X | X ⊆ I and sup(X) ≥ minsup}
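To make the definition of F concrete, the following minimal sketch enumerates every candidate itemset by brute force and keeps those whose support meets minsup. The six-transaction database DB, the threshold minsup = 3, and the helper names support and frequent_itemsets are assumptions introduced only for illustration. Exhaustive enumeration is exponential in the number of items, so this is a direct reading of the definition, not a practical mining algorithm.

```python
from itertools import combinations

def support(itemset, db):
    """Number of transactions in db that contain every item of itemset."""
    return sum(1 for items in db.values() if itemset <= items)

def frequent_itemsets(db, minsup):
    """Return {X: sup(X)} for every nonempty X with sup(X) >= minsup.

    Brute force over all 2^|I| - 1 nonempty subsets of the items.
    """
    items = sorted(set().union(*db.values()))
    frequent = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            x = frozenset(combo)
            s = support(x, db)
            if s >= minsup:
                frequent[x] = s
    return frequent

# Hypothetical toy database: tid -> set of items (assumed for illustration).
DB = {
    1: frozenset("ABDE"),
    2: frozenset("BCE"),
    3: frozenset("ABDE"),
    4: frozenset("ABCE"),
    5: frozenset("ABCDE"),
    6: frozenset("BCD"),
}

F = frequent_itemsets(DB, minsup=3)
for x, s in sorted(F.items(), key=lambda p: (len(p[0]), sorted(p[0]))):
    print("".join(sorted(x)), s)
```

On this toy database, only 19 of the 31 nonempty itemsets are frequent at minsup = 3. Lowering minsup enlarges F, which is the growth that condensed representations aim to tame.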
Maximal Frequent Itemsets
A frequent itemset X ∈ F is called maximal if it has no frequent supersets. Let M be the set of all maximal frequent itemsets, given as
M = {X | X ∈ F and ∄Y ⊃ X, such that Y ∈ F}
The set M is a condensed representation of the set of all frequent itemsets F, because we can determine whether any itemset X is frequent or not using M.
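The sketch below illustrates why M suffices for this test. Because every subset of a frequent itemset is itself frequent, an itemset X is frequent exactly when it is contained in some maximal itemset. The collection F used here is a hypothetical set of 19 frequent itemsets, and the helper names maximal_itemsets and is_frequent are likewise assumptions made for illustration.

```python
def maximal_itemsets(frequent):
    """Keep each X in frequent that has no proper superset in frequent."""
    return {x for x in frequent if not any(x < y for y in frequent)}

def is_frequent(x, maximal):
    """X is frequent iff it is a subset of some maximal frequent itemset."""
    return any(x <= z for z in maximal)

# Hypothetical set F of frequent itemsets (assumed for illustration).
F = {frozenset(s) for s in [
    "A", "B", "C", "D", "E",
    "AB", "AD", "AE", "BC", "BD", "BE", "CE", "DE",
    "ABD", "ABE", "ADE", "BCE", "BDE",
    "ABDE",
]}

M = maximal_itemsets(F)
print(sorted("".join(sorted(z)) for z in M))

print(is_frequent(frozenset("BD"), M))  # BD is a subset of ABDE
print(is_frequent(frozenset("CD"), M))  # CD is in no maximal itemset
```

Here two maximal itemsets stand in for all 19 frequent ones. The quadratic superset check in maximal_itemsets is chosen for clarity rather than efficiency.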