Searching for patterns in biological data has a long history, perhaps best typified by the taxonomists who arranged species into groups based on their similarities and differences. As computer resources improved there was a growth in the scope and availability of new computational methods. Some of the first biologists to exploit these methods worked in disciplines such as vegetation ecology. Vegetation ecologists often collect species data from many quadrats and they require methods that will organise the data so that relevant structures are revealed. The increase in computer power, and the parallel growth in software, allowed analyses that had previously been impractical for all but small data sets.
However, the greatest data analysis challenges are undoubtedly more recent. Advances in experimental methods have generated, and continue to generate, enormous volumes of genetic information that present significant storage, retrieval and analysis challenges (see Slonim (2002) for a useful review). Simultaneously there has been a growth in non-biological commercial databases and in the belief that they contain information that can be used to improve company profits. One consequence of these challenges is that there is now a wide, and increasing, variety of analysis tools that have the potential to extract important information from biological data.
The analysis tools that extract information from data can be placed into two broad and overlapping categories: cluster and classification methods. This book examines a wide range of techniques from both categories to illustrate their potential for biological research. For example, they can be used to:
find patterns in gene expression profiles or biodiversity survey data;