The previous chapter introduced techniques for analyzing data with one or two vectors. The remaining chapters of this book discuss ways of dealing with data sets with more than two vectors. Data sets with many vectors are typically brought together in matrices. These matrices list the observations in the rows, with the columns (the vectors, or variables) specifying the different properties of the observations. Data sets like this are referred to as multivariate data.
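To make the row-and-column layout concrete, here is a minimal sketch in plain Python. The words, variable names, and values are invented purely for illustration, and the helper `column` is hypothetical; none of them come from the book's data sets.

```python
# Invented toy data for illustration only: one row per observation (a word),
# one column per variable (a property of that word).
column_names = ["length", "log_frequency", "syllables"]
row_names = ["good", "sharp", "goodness", "sharpness"]
data = [
    [4, 5.1, 1],
    [5, 3.9, 1],
    [8, 2.7, 2],
    [9, 1.2, 2],
]

def column(matrix, names, name):
    """Return one variable (column) as a vector across all observations."""
    j = names.index(name)
    return [row[j] for row in matrix]

n_observations = len(data)     # number of rows
n_variables = len(data[0])     # number of columns
lengths = column(data, column_names, "length")

print(n_observations, n_variables)
print(lengths)
```

With more than two such columns, the data set is multivariate in the sense used here.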
We discuss two approaches to discovering the structure in multivariate data sets in this chapter. In one approach, we seek to find structure in the data in terms of groupings of observations. These techniques are unsupervised in the sense that we do not prescribe what groupings should be there. We discuss these techniques under the heading of clustering. In the other approach, we know what groups there are in theory, and the question is whether the data support these groups. This second group of techniques can be described as supervised, because the techniques work with a grouping that the analyst imposes on the data. We will refer to these techniques as methods for classification.
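The contrast between the two approaches can be sketched on a small invented data set. The example below uses a bare-bones k-means procedure for the unsupervised case and a nearest-centroid rule for the supervised case. Both are stand-ins chosen for brevity, not necessarily the techniques the chapter goes on to present. The points, the labels "A" and "B", and all function names are assumptions made for illustration.

```python
from math import dist

# Invented 2-D toy data: two visibly separated groups of observations.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]

def centroid(pts):
    """Mean position of a non-empty list of points."""
    return tuple(sum(coords) / len(pts) for coords in zip(*pts))

def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda k: dist(point, centers[k]))

def kmeans(pts, centers, iterations=10):
    """Unsupervised: group points around centers; no labels are supplied."""
    centers = list(centers)
    for _ in range(iterations):
        assignment = [nearest(p, centers) for p in pts]
        for k in range(len(centers)):
            members = [p for p, a in zip(pts, assignment) if a == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = centroid(members)
    return [nearest(p, centers) for p in pts]

# Clustering: the groupings emerge from the data (hand-picked starting centers).
clusters = kmeans(points, centers=[points[0], points[-1]])

# Classification: the analyst imposes the groups in advance (hypothetical labels).
labels = ["A", "A", "A", "B", "B", "B"]
group_names = sorted(set(labels))
group_centers = [centroid([p for p, lab in zip(points, labels) if lab == g])
                 for g in group_names]

def classify(point):
    """Assign a point to the imposed group with the nearest centroid."""
    return group_names[nearest(point, group_centers)]

# Do the data support the imposed groups? Compare predictions with the labels.
predicted = [classify(p) for p in points]
accuracy = sum(p == lab for p, lab in zip(predicted, labels)) / len(labels)

print(clusters)
print(predicted, accuracy)
```

In the clustering half, no labels enter the computation. In the classification half, the labels define the groups and the data are used only to check how well they fit. Accuracy computed on the same points that defined the centroids is optimistic, so a real analysis would evaluate on held-out observations.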
Clustering
Tables with measurements: principal components analysis
Words such as goodness and sharpness can be analyzed as consisting of a stem (good, sharp) and an affix, the suffix -ness. Some affixes are used in many words; -ness is an example.