
An Algorithm for Generating Artificial Test Clusters

Published online by Cambridge University Press:  01 January 2025

Glenn W. Milligan*
Affiliation: Faculty of Management Sciences, The Ohio State University
* Requests for reprints and program listings should be sent to Glenn W. Milligan, Faculty of Management Sciences, 301 Hagerty Hall, The Ohio State University, Columbus, OH 43210.

Abstract

An algorithm for generating artificial data sets that contain distinct, nonoverlapping clusters is presented. The algorithm is useful for generating test data sets for Monte Carlo validation research conducted on clustering methods or statistics. The algorithm generates data sets that contain 1, 2, 3, 4, or 5 clusters. By default, the data are embedded in a 4-, 6-, or 8-dimensional space. Three different patterns for assigning the points to the clusters are provided. One pattern assigns the points equally to the clusters, while the remaining two patterns produce clusters of unequal sizes. Finally, a number of methods for introducing error into the data have been incorporated in the algorithm.
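The abstract does not give the details of the procedure, so the following is only a minimal Python sketch of the general idea, not the published algorithm. It generates a chosen number of clusters in a chosen number of dimensions and keeps them nonoverlapping by giving each cluster a disjoint interval on the first dimension. Everything named here is an assumption made for illustration: the function names `cluster_sizes` and `generate_clusters`, the pattern labels `"equal"`, `"one_small"`, and `"one_large"`, the specific unequal shares, the uniform sampling, and the spacing constants. The error-introduction methods mentioned in the abstract are not modeled.

```python
import random


def cluster_sizes(n_points, n_clusters, pattern="equal"):
    """Split n_points among n_clusters.

    The unequal shares below are hypothetical; the abstract only says that
    two of the three patterns produce clusters of unequal sizes.
    """
    if pattern == "equal":
        weights = [1.0] * n_clusters
    elif pattern == "one_small":
        weights = [0.5] + [1.0] * (n_clusters - 1)
    elif pattern == "one_large":
        weights = [3.0] + [1.0] * (n_clusters - 1)
    else:
        raise ValueError(f"unknown pattern: {pattern}")

    total = sum(weights)
    sizes = [int(n_points * w / total) for w in weights]
    # Hand out the points lost to rounding, one per cluster, in order.
    for i in range(n_points - sum(sizes)):
        sizes[i % n_clusters] += 1
    return sizes


def generate_clusters(n_points=50, n_clusters=3, n_dims=4,
                      pattern="equal", seed=None):
    """Return (points, labels) for clusters that do not overlap.

    Assumption: separation is enforced on dimension 0 only, by placing each
    cluster in its own interval; the other coordinates are unconstrained.
    """
    rng = random.Random(seed)
    sizes = cluster_sizes(n_points, n_clusters, pattern)
    half_width = 1.0   # each cluster spans center +/- half_width per dimension
    spacing = 4.0      # greater than 2 * half_width, so intervals are disjoint

    points, labels = [], []
    for k, size in enumerate(sizes):
        center = [k * spacing] + [rng.uniform(0.0, 10.0)
                                  for _ in range(n_dims - 1)]
        for _ in range(size):
            points.append([c + rng.uniform(-half_width, half_width)
                           for c in center])
            labels.append(k)
    return points, labels
```

A call such as `generate_clusters(n_points=50, n_clusters=3, n_dims=4, pattern="one_large", seed=1)` returns 50 four-dimensional points together with their true cluster labels, which is the kind of ground truth a Monte Carlo study of a clustering method would compare its recovered partition against.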

Type: Computational Psychometrics
Copyright: Copyright © 1985 The Psychometric Society
