Skip to main content Accessibility help
Internet Explorer 11 is being discontinued by Microsoft in August 2021. If you have difficulties viewing the site on Internet Explorer 11 we recommend using a different browser such as Microsoft Edge, Google Chrome, Apple Safari or Mozilla Firefox.

Chapter 17: Clustering Validation

Chapter 17: Clustering Validation

pp. 425-464

Authors

, Rensselaer Polytechnic Institute, New York, , Universidade Federal de Minas Gerais, Brazil
Resources available Unlock the full potential of this textbook with additional resources. There are free resources and Instructor restricted resources available for this textbook. Explore resources
  • Add bookmark
  • Cite
  • Share

Summary

There exist many different clustering methods, depending on the type of clusters sought and on the inherent data characteristics. Given the diversity of clustering algorithms and their parameters it is important to develop objective approaches to assess clustering results. Cluster validation and assessment encompasses three main tasks: clustering evaluation seeks to assess the goodness or quality of the clustering, clustering stability seeks to understand the sensitivity of the clustering result to various algorithmic parameters, for example, the number of clusters, and clustering tendency assesses the suitability of applying clustering in the first place, that is, whether the data has any inherent grouping structure. There are a number of validity measures and statistics that have been proposed for each of the aforementioned tasks, which can be divided into three main types:

External: External validation measures employ criteria that are not inherent to the dataset. This can be in form of prior or expert-specified knowledge about the clusters, for example, class labels for each point.

Internal: Internal validation measures employ criteria that are derived from the data itself. For instance, we can use intracluster and intercluster distances to obtain measures of cluster compactness (e.g., how similar are the points in the same cluster?) and separation (e.g., how far apart are the points in different clusters?).

Access options

Review the options below to login to check your access.

Purchase options

There are no purchase options available for this title.

Have an access code?

To redeem an access code, please log in with your personal login.

If you believe you should have access to this content, please contact your institutional librarian or consult our FAQ page for further information about accessing our content.

Also available to purchase from these educational ebook suppliers