Model-Based Clustering and Classification for Data Science
With Applications in R
Part of Cambridge Series in Statistical and Probabilistic Mathematics
- Authors:
- Charles Bouveyron, Université Côte d’Azur
- Gilles Celeux, Inria Saclay Île-de-France
- T. Brendan Murphy, University College Dublin
- Adrian E. Raftery, University of Washington
- Date Published: August 2019
- availability: In stock
- format: Hardback
- isbn: 9781108494205
Other available formats:
eBook
Cluster analysis finds groups in data automatically. Most methods have been heuristic and leave open such central questions as: how many clusters are there? Which method should I use? How should I handle outliers? Classification assigns new observations to groups given previously classified observations, and also has open questions about parameter tuning, robustness and uncertainty assessment. This book frames cluster analysis and classification in terms of statistical models, thus yielding principled estimation, testing and prediction methods, and sound answers to the central questions. It builds the basic ideas in an accessible but rigorous way, with extensive data examples and R code; describes modern approaches to high-dimensional data and networks; and explains such recent advances as Bayesian regularization, non-Gaussian model-based clustering, cluster merging, variable selection, semi-supervised and robust classification, clustering of functional data, text and images, and co-clustering. Written for advanced undergraduates in data science, as well as researchers and practitioners, it assumes basic knowledge of multivariate calculus, linear algebra, probability and statistics.
- Extensive use of real-world examples - with data, code and color graphics - builds intuition and understanding
- R package MBCbook available on CRAN allows replication of analyses
- This up-to-date account by four leading researchers gives access to powerful, state-of-the-art methods
Reviews & endorsements
'Bouveyron, Celeux, Murphy, and Raftery pioneered the theory, computation, and application of modern model-based clustering and discriminant analysis. Here they have produced an exhaustive yet accessible text, covering both the field's state of the art as well as its intellectual development. The authors develop a unified vision of cluster analysis, rooted in the theory and computation of mixture models. Embedded R code points the way for applied readers, while graphical displays develop intuition about both model construction and the critical but often-neglected estimation process. Building on a series of running examples, the authors gradually and methodically extend their core insights into a variety of exciting data structures, including networks and functional data. This text will serve as a backbone for graduate study as well as an important reference for applied data scientists interested in working with cutting-edge tools in semi- and unsupervised machine learning.' John S. Ahlquist, University of California, San Diego
'This book, written by authoritative experts in the field, gives a comprehensive and thorough introduction to model-based clustering and classification. The authors not only explain the statistical theory and methods, but also provide hands-on applications illustrating their use with the open-source statistical software R. The book also covers recent advances made for specific data structures (e.g. network data) or modeling strategies (e.g. variable selection techniques), making it a fantastic resource as an overview of the state of the field today.' Bettina Grün, Johannes Kepler Universität Linz, Austria
'Four authors with diverse strengths nicely integrate their specialties to illustrate how clustering and classification methods are implemented in a wide selection of real-world applications. Their inclusion of how to use available software is an added benefit for students. The book covers foundations, challenging aspects, and some essential details of applications of clustering and classification. It is a fun and informative read!' Naisyin Wang, University of Michigan
'This is a beautifully written book on a topic of fundamental importance in modern statistical science, by some of the leading researchers in the field. It is particularly effective in being an applied presentation - the reader will learn how to work with real data and at the same time clearly presenting the underlying statistical thinking. Fundamental statistical issues like model and variable selection are clearly covered as well as crucial issues in applied work such as outliers and ordinal data. The R code and graphics are particularly effective. The R code is there so you know how to do things, but it is presented in a way that does not disrupt the underlying narrative. This is not easy to do. The graphics are 'sophisticatedly simple' in that they convey complex messages without being too complex. For me, this is a 'must have' book.' Rob McCulloch, Arizona State University
'This advanced text explains the underlying concepts clearly and is strong on theory … I congratulate the authors on the theoretical aspects of their book, it's a fine achievement.' Antony Unwin, International Statistical Review
'In my opinion, the overall quality of this impactful and intriguing book can be expressed by concluding that it is a perfect fit to the Cambridge Series in Statistical and Probabilistic Mathematics, characterized as a series of high-quality upper-division textbooks and expository monographs containing applications and discussions of new techniques while emphasizing rigorous treatment of theoretical methods.' Zdenek Hlavka, MathSciNet
'… this book not only gives the big picture of the analysis of clustering and classification but also explains recent methodological advances. Extensive real-world data examples and R code for many methods are also well summarized. This book is highly recommended to students in data science, as well as researchers and data analysts.' Li-Pang Chen, Biometrical Journal
'Model-Based Clustering and Classification for Data Science: With Applications in R, written by leading statisticians in the field, provides academics and practitioners with a solid theoretical and practical foundation on the use of model-based clustering methods … this book will serve as an excellent resource for quantitative practitioners and theoreticians seeking to learn the current state of the field.' C. M. Foley, Quarterly Review of Biology
'This book frames cluster analysis and classification in terms of statistical models, thus yielding principled estimation, testing and prediction methods, and sound answers to the central questions … Written for advanced undergraduates in data science, as well as researchers and practitioners, it assumes basic knowledge of multivariate calculus, linear algebra, probability and statistics.' Hans-Jürgen Schmidt, zbMATH
Product details
- Date Published: August 2019
- format: Hardback
- isbn: 9781108494205
- length: 446 pages
- dimensions: 260 x 185 x 25 mm
- weight: 1.1kg
- contains: 40 b/w illus., 171 colour illus., 48 tables
- availability: In stock
Table of Contents
1. Introduction
2. Model-based clustering: basic ideas
3. Dealing with difficulties
4. Model-based classification
5. Semi-supervised clustering and classification
6. Discrete data clustering
7. Variable selection
8. High-dimensional data
9. Non-Gaussian model-based clustering
10. Network data
11. Model-based clustering with covariates
12. Other topics
List of R packages
Bibliography
Index