A review of associative classification mining

Published online by Cambridge University Press:  01 March 2007

FADI THABTAH
Affiliation:
Department of Computing and Engineering, University of Huddersfield, HD1 3DH, UK; e-mail: f.thabtah@hud.ac.uk

Abstract

Associative classification mining is a promising approach in data mining that uses association rule discovery techniques to construct classification systems, also known as associative classifiers. In the last few years, a number of associative classification algorithms have been proposed, e.g. CPAR, CMAR, MCAR and MMAC. These algorithms employ different methods for rule discovery, rule ranking, rule pruning, rule prediction and rule evaluation. This paper surveys and compares the state-of-the-art associative classification techniques with regard to these criteria. Finally, future directions in associative classification, such as incremental learning and mining low-quality data sets, are highlighted.
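
To make the pipeline named in the abstract concrete (rule discovery, rule ranking, rule prediction), here is a minimal Python sketch. It is not CPAR, CMAR, MCAR or MMAC. It enumerates candidate rules by brute force rather than with an efficient frequent-itemset miner, and the function names, thresholds, toy data and ranking order (confidence, then support, then shorter antecedent) are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def mine_class_association_rules(rows, labels, min_support=0.3,
                                 min_confidence=0.6, max_len=2):
    """Return rules (antecedent, label, support, confidence) that pass both thresholds.

    Hypothetical helper: brute-force enumeration, suitable only for tiny data.
    """
    n = len(rows)
    antecedent_counts = Counter()  # occurrences of each antecedent itemset
    rule_counts = Counter()        # occurrences of antecedent together with a class
    for items, label in zip(rows, labels):
        ordered = sorted(items)
        for k in range(1, max_len + 1):
            for antecedent in combinations(ordered, k):
                antecedent_counts[antecedent] += 1
                rule_counts[(antecedent, label)] += 1

    rules = []
    for (antecedent, label), count in rule_counts.items():
        support = count / n
        confidence = count / antecedent_counts[antecedent]
        if support >= min_support and confidence >= min_confidence:
            rules.append((antecedent, label, support, confidence))

    # Rule ranking (assumed heuristic): higher confidence first, then higher
    # support, then shorter antecedent; remaining ties broken alphabetically.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0]), r[0], r[1]))
    return rules


def predict(rules, items, default):
    """Rule prediction: return the class of the highest-ranked matching rule."""
    items = set(items)
    for antecedent, label, _support, _confidence in rules:
        if items.issuperset(antecedent):
            return label
    return default


# Toy training data (invented for illustration only).
rows = [
    {"outlook=sunny", "windy=no"},
    {"outlook=sunny", "windy=yes"},
    {"outlook=rainy", "windy=yes"},
    {"outlook=rainy", "windy=no"},
    {"outlook=sunny", "windy=no"},
]
labels = ["play", "play", "stay", "stay", "play"]

rules = mine_class_association_rules(rows, labels)
for antecedent, label, support, confidence in rules:
    print(antecedent, "->", label, f"support={support:.2f}", f"confidence={confidence:.2f}")

print(predict(rules, {"outlook=rainy", "windy=no"}, default="play"))
```

A case matching no rule falls back to the supplied default class. This sketch omits rule pruning and rule evaluation, which the abstract also lists as steps in these algorithms.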

Type
Original Article
Copyright
Copyright © Cambridge University Press 2007

References

Agrawal, R. & Srikant, R. 1994 Fast algorithms for mining association rules. In Proceedings of the 20th International Conference on Very Large Data Bases, Morgan Kaufmann, Santiago, Chile, pp. 487–499.
Agrawal, R., Imielinski, T. & Swami, A. 1993 Mining association rules between sets of items in large databases. In Buneman, P. & Jajodia, S. (eds.), Proceedings of the ACM SIGMOD International Conference on Management of Data, Washington, DC, pp. 207–216.
Ali, K., Manganaris, S. & Srikant, R. 1997 Partial classification using association rules. In Heckerman, D., Mannila, H., Pregibon, D. & Uthurusamy, R. (eds.), Proceedings of the 3rd International Conference on Knowledge Discovery and Data Mining, Newport Beach, CA, pp. 115–118.
Antonie, M. & Zaïane, O. 2004 An associative classifier based on positive and negative rules. In Proceedings of the 9th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery. Paris, France: ACM Press, pp. 64–69.
Antonie, M., Zaïane, O. & Coman, A. 2003 Associative classifiers for medical images. Mining Multimedia and Complex Data (Lecture Notes in Artificial Intelligence, 2797). Berlin: Springer, pp. 68–83.
Baralis, E. & Torino, P. 2002 A lazy approach to pruning classification rules. Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM’02), Maebashi City, Japan, p. 35.
Baralis, E., Chiusano, S. & Garza, P. 2004 On support thresholds in associative classification. In Proceedings of the 2004 ACM Symposium on Applied Computing. Nicosia, Cyprus: ACM Press, pp. 553–558.
Boutell, M., Shen, X., Luo, J. & Brown, C. 2003 Multi-label semantic scene classification. Technical Report 813, Department of Computer Science, University of Rochester, NY and Electronic Imaging Products R & D, Eastman Kodak Company.
Cheung, D. W., Ng, V. T. & Tam, B. W. 1996 Maintenance of discovered knowledge: A case in multi-level association rules. In Proceedings of the International Conference on Knowledge Discovery and Data Mining, Portland, OR: AAAI Press, pp. 307–310.
Clare, A. & King, R. 2001 Knowledge discovery in multi-label phenotype data. In De Raedt, L. & Siebes, A. (eds.), Proceedings of the 5th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD’01) (Lecture Notes in Artificial Intelligence, 2168). Berlin: Springer, pp. 42–53.
Clark, P. & Boswell, R. 1991 Rule induction with CN2: Some recent improvements. In Proceedings of the 5th European Working Session on Learning. Berlin, Germany: Springer Verlag, pp. 151–163.
Cohen, W. 1995 Fast effective rule induction. In Proceedings of the 12th International Conference on Machine Learning, Morgan Kaufmann, CA, pp. 115–123.
Dong, G., Zhang, X., Wong, L. & Li, J. 1999 CAEP: Classification by aggregating emerging patterns. In Proceedings of the 2nd International Conference on Discovery Science. Tokyo, Japan: Springer Verlag, pp. 30–42.
Duda, R. & Hart, P. 1973 Pattern Classification and Scene Analysis. New York: Wiley.
Fayyad, U., Piatetsky-Shapiro, G., Smyth, P. & Uthurusamy, R. 1998 Advances in Knowledge Discovery and Data Mining. Menlo Park, CA: AAAI Press.
Freitas, A. 2000 Understanding the crucial difference between classification and association rule discovery. ACM SIGKDD Explorations Newsletter 2, 65–69.
Fürnkranz, J. & Widmer, G. 1994 Incremental reduced error pruning. In Proceedings of the 11th International Machine Learning Conference, New Brunswick, NJ, pp. 70–75.
Gehrke, J., Ramakrishnan, R. & Ganti, V. 1998 RainForest: A framework for fast decision tree construction of large datasets. In Proceedings of the International Conference on Very Large Data Bases, New York, NY, pp. 416–427.
Han, J., Pei, J. & Yin, Y. 2000 Mining frequent patterns without candidate generation. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data. Dallas, TX: ACM Press, pp. 1–12.
Hu, H. & Li, J. 2005 Using association rules to make rule-based classifiers robust. In Proceedings of the 16th Australasian Database Conference, Newcastle, Australia, pp. 47–54.
Li, W. 2001 Classification based on multiple association rules. MSc thesis, Simon Fraser University, BC, Canada, April 2001.
Li, W., Han, J. & Pei, J. 2001 CMAR: Accurate and efficient classification based on multiple-class association rule. In Proceedings of the International Conference on Data Mining (ICDM’01), San Jose, CA, pp. 369–376.
Lim, T., Loh, W. & Shih, Y. 2000 A comparison of prediction accuracy, complexity and training time of thirty-three old and new classification algorithms. Machine Learning 40, 203–228.
Liu, B., Hsu, W. & Ma, Y. 1998 Integrating classification and association rule mining. In Proceedings of the International Conference on Knowledge Discovery and Data Mining. New York, NY: AAAI Press, pp. 80–86.
Liu, B., Hsu, W. & Ma, Y. 1999 Mining association rules with multiple minimum supports. In Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Diego, CA: ACM Press, pp. 337–341.
Liu, B., Ma, Y. & Wong, C.-K. 2000 Improving an association rule based classifier. In Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery, Lyon, France, pp. 504–509.
Liu, B., Ma, Y. & Wong, C.-K. 2001 Classification using association rules: Weakness and enhancements. In Vipin Kumar, et al. (eds.), Data Mining for Scientific Applications, 2001.
Meretakis, D. & Wüthrich, B. 1999 Extending naïve Bayes classifiers using long itemsets. In Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Diego, CA: ACM Press, pp. 165–174.
Merz, C. & Murphy, P. 1996 UCI repository of machine learning databases. Irvine, CA: University of California, Department of Information and Computer Science.
Provost, F., Fawcett, T. & Kohavi, R. 1997 The case against accuracy estimation for comparing induction algorithms. In Proceedings of the 15th International Conference on Machine Learning, Madison, WI, pp. 445–453.
Quinlan, J. 1987 Simplifying decision trees. International Journal of Man–Machine Studies 27, 221–248.
Quinlan, J. 1993 C4.5: Programs for Machine Learning. San Mateo, CA: Morgan Kaufmann.
Quinlan, J. 1998 Data mining tools See5 and C5.0. Technical Report, RuleQuest Research.
Quinlan, J. & Cameron-Jones, R. 1993 FOIL: A midterm report. In Proceedings of the European Conference on Machine Learning. Vienna, Austria: Springer Verlag, pp. 3–20.
Savasere, A., Omiecinski, E. & Navathe, S. 1995 An efficient algorithm for mining association rules in large databases. In Proceedings of the 21st Conference on Very Large Databases (VLDB’95), Zurich, Switzerland, pp. 432–444.
Schapire, R. & Singer, Y. 2000 BoosTexter: A boosting-based system for text categorization. Machine Learning 39(2/3), 135–168.
Snedecor, W. & Cochran, W. 1989 Statistical Methods, 8th edn. Iowa City, IA: Iowa State University Press.
Thabtah, F. 2006 Pruning techniques in associative classification: Survey and comparison. Journal of Digital Information Management 4, 202–205.
Thabtah, F., Cowling, P. & Peng, Y. 2004 MMAC: A new multi-class, multi-label associative classification approach. In Proceedings of the 4th IEEE International Conference on Data Mining (ICDM’04), Brighton, UK, pp. 217–224.
Thabtah, F., Cowling, P. & Peng, Y. 2005 MCAR: Multi-class classification based on association rule approach. In Proceedings of the 3rd IEEE International Conference on Computer Systems and Applications, Cairo, Egypt, pp. 1–7.
Topor, R. & Shen, H. 2001 Construct robust rule sets for classification. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Edmonton, Alberta, Canada: ACM Press, pp. 564–569.
Tsai, P., Lee, C. & Chen, A. 1999 An efficient approach for incremental association rule mining. In Proceedings of the 3rd Pacific–Asia Conference on Methodologies for Knowledge Discovery and Data Mining. London, UK: Springer Verlag, pp. 74–83.
Valtchev, P., Missaoui, R., Godin, R. & Meridji, M. 2002 A framework for incremental generation of frequent closed itemsets using Galois (concept) lattice theory. Journal of Experimental and Theoretical Artificial Intelligence (JETAI), Special Issue on Concept Lattice based Theory, Methods and Tools for Knowledge Discovery in Databases, 14, 115–142.
Van Rijsbergen, C. 1979 Information Retrieval, 2nd edn. London: Butterworths.
Wang, K., Zhou, S. & He, Y. 2000 Growing decision tree on support-less association rules. In Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Boston, MA: ACM Press, pp. 265–269.
Wang, K., He, Y. & Cheung, D. 2001 Mining confidence rules without support requirements. In Proceedings of the 10th International Conference on Information and Knowledge Management. Atlanta, GA: ACM Press, pp. 89–96.
Weka, 2000 Data mining software in Java. www.cs.waikato.ac.nz/ml/weka.
Witten, I. & Frank, E. 2000 Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. San Francisco, CA: Morgan Kaufmann.
Xu, X., Han, G. & Min, H. 2004 A novel algorithm for associative classification of image blocks. In Proceedings of the 4th IEEE International Conference on Computer and Information Technology, Lian, Shiguo, China, pp. 46–51.
Yang, Y., Slattery, S. & Ghani, R. 2002 A study of approaches to hypertext categorization. Journal of Intelligent Information Systems 18, 149–241.
Yin, X. & Han, J. 2003 CPAR: Classification based on predictive association rule. In Proceedings of the SIAM International Conference on Data Mining. San Francisco, CA: SIAM Press, pp. 369–376.
Zaïane, O. & Antonie, A. 2002 Classifying text documents by associating terms with text categories. In Proceedings of the 13th Australasian Database Conference (ADC’02), Melbourne, Australia, pp. 215–222.
Zaki, M. & Gouda, K. 2003 Fast vertical mining using diffsets. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Washington, DC: ACM Press, pp. 326–335.
Zaki, M., Parthasarathy, S., Ogihara, M. & Li, W. 1997 New algorithms for fast discovery of association rules. In Proceedings of the 3rd International Conference on Knowledge Discovery and Data Mining. Menlo Park, CA: AAAI Press, pp. 283–286.
Zhou, Z. & Ezeife, C. 2001 A low-scan incremental association rule maintenance method based on the Apriori property. In Proceedings of the 14th Biennial Conference of the Canadian Society on Computational Studies of Intelligence: Advances in Artificial Intelligence. London, UK: Springer-Verlag, pp. 26–35.