Data Mining and Analysis: Fundamental Concepts and Algorithms

Mohammed J. Zaki; Wagner Meira, Jr

doi:10.1017/CBO9780511810114

Chapter 17: Clustering Validation

pp. 425-464

Mohammed J. Zaki, Rensselaer Polytechnic Institute, New York

Wagner Meira, Jr, Universidade Federal de Minas Gerais, Brazil


Summary

Many different clustering methods exist, depending on the type of clusters sought and on the inherent characteristics of the data. Given the diversity of clustering algorithms and their parameters, it is important to develop objective approaches to assess clustering results. Cluster validation and assessment encompasses three main tasks: clustering evaluation assesses the goodness or quality of the clustering; clustering stability examines the sensitivity of the clustering result to various algorithmic parameters, for example, the number of clusters; and clustering tendency assesses whether clustering is suitable in the first place, that is, whether the data has any inherent grouping structure. A number of validity measures and statistics have been proposed for each of these tasks, and they can be divided into three main types:

External: External validation measures employ criteria that are not inherent to the dataset. These can take the form of prior or expert-specified knowledge about the clusters, for example, class labels for each point.

Internal: Internal validation measures employ criteria that are derived from the data itself. For instance, we can use intracluster and intercluster distances to obtain measures of cluster compactness (e.g., how similar are the points in the same cluster?) and separation (e.g., how far apart are the points in different clusters?).
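To make the distinction between the two types concrete, here is a minimal Python sketch. It rests on assumptions that go beyond the summary: purity is chosen as one illustrative external measure, and the mean pairwise Euclidean distance within and between clusters stands in for compactness and separation. The function names `purity` and `mean_intra_inter` and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import dist


def purity(labels_true, labels_pred):
    """External measure (illustrative choice): fraction of points that
    belong to the majority class of their assigned cluster."""
    clusters = {}
    for t, p in zip(labels_true, labels_pred):
        clusters.setdefault(p, []).append(t)
    majority = sum(Counter(m).most_common(1)[0][1] for m in clusters.values())
    return majority / len(labels_true)


def mean_intra_inter(points, labels_pred):
    """Internal measure (illustrative choice): mean pairwise distance
    within clusters (compactness) and between clusters (separation)."""
    intra, inter = [], []
    for (i, p), (j, q) in combinations(enumerate(points), 2):
        (intra if labels_pred[i] == labels_pred[j] else inter).append(dist(p, q))

    def mean(xs):
        return sum(xs) / len(xs) if xs else float("nan")

    return mean(intra), mean(inter)


# Toy data (assumed for illustration): two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels_pred = [0, 0, 1, 1]          # cluster assignments to validate
labels_true = ["a", "a", "b", "b"]  # external knowledge: class labels

print(purity(labels_true, labels_pred))
print(mean_intra_inter(points, labels_pred))
```

The external measure needs `labels_true`, which comes from outside the dataset, whereas the internal measure uses only the points and the cluster assignments. A small intracluster mean relative to the intercluster mean suggests compact, well-separated clusters.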

About the book

Chapter DOI https://doi.org/10.1017/CBO9780511810114.018
Book DOI https://doi.org/10.1017/CBO9780511810114
Subjects Computer Science, Data Science, Databases, Data Mining, and Information Retrieval, Machine Learning and Pattern Recognition
Format: Digital
Publication date: 28 May 2018
ISBN: 9780511810114