If you are a biologist and want to get the best out of the powerful methods of modern computational statistics, this is your book. You can visualize and analyze your own data, apply unsupervised and supervised learning, integrate datasets, apply hypothesis testing, and make publication-quality figures using the power of R/Bioconductor and ggplot2. This book will teach you 'cooking from scratch', from raw data to beautiful, illuminating output, as you learn to write your own scripts in the R language and to use advanced statistics packages from CRAN and Bioconductor. It covers a broad range of basic and advanced topics important in the analysis of high-throughput biological data, including principal component analysis and multidimensional scaling, clustering, multiple testing, unsupervised and supervised learning, resampling, the pitfalls of experimental design, and power simulations using Monte Carlo, and it even reaches networks, trees, spatial statistics, image data, and microbial ecology. Using a minimum of mathematical notation, it builds understanding from well-chosen examples, simulation, visualization, and above all hands-on interaction with data and code.
- Introduces methods on a 'need to know' basis, so students tackle biological questions immediately and understand motivation for the methods
- Contains real-life examples done from scratch, guiding students through realistic complexities and building practical intuition
- Includes a wrap-up chapter that explains the complete workflow from design of experiments to analysis of results, identifying common pitfalls with big data
Reviews & endorsements
'This is a gorgeous book, both visually and intellectually, superbly suited for anyone who wants to learn the nuts and bolts of modern computational biology. It can also be a practical, hands-on starting point for life scientists and students who want to break out of "canned packages" into the more versatile world of R coding. Much richer than the typical statistics textbook, it covers a wide range of topics in machine learning and image processing. The chapter on making high-quality graphics is alone worth the price of the book.' William H. Press, University of Texas, Austin
'The book is a timely, comprehensive and practical reference for anyone working with modern quantitative biotechnologies. It can be read at multiple levels. For scientists with a statistics background, it is a thorough review of key methods for design and analysis of high-throughput experiments. For life scientists with a limited exposure to statistics, it offers a series of examples with relevant data and R code. Avoiding buzzwords and hype, the book advocates appropriate statistical practice for reproducible research. I expect it to be as influential for the life sciences community as Modern Applied Statistics with S, by Venables and Ripley or Introduction to Statistical Learning, by James, Witten, Hastie and Tibshirani are for applied statistics.' Olga Vitek, Northeastern University, Boston
'Navigating rich data to arrive at sensible insight requires confidence in our biological understanding, informatic ability, statistical sophistication, and skills at effective communication. Fortunately the wisdom and effort of the worldwide research community has been distilled into accessible and rich collections of R and Bioconductor software packages. Holmes and Huber provide a comprehensive guide to navigating modern statistical methods for working with complex, large, and nuanced biological data. The presentation provides a firm conceptual foundation coupled with worked practical examples, extended analysis, and refined discussion of practical and theoretical challenges facing the modern practitioner. This book provides us with the confidence and tools necessary for the analysis and comprehension of modern biological data using modern statistical methods.' Martin Morgan, Roswell Park Comprehensive Cancer Center, leader of the Bioconductor project
'Holmes and Huber take an integrated approach to presenting the key statistical concepts and methods needed for the analysis of biological data. Specifically, they do a wonderful job of building these foundations in the context of modern computational tools, genuine scientific questions, and real-world datasets. The code showcases many of the newest features of R and its dynamic package ecosystem, such as using ggplot2 for visualization and dplyr for data manipulation.' Jenny Bryan, RStudio and University of British Columbia
'... the book is extremely readable and engaging, it explains complicated concepts in simple terms, and uses illuminating graphics and examples. Any researcher who wants to learn or teach up-to-date statistics to biologists will find this an essential volume for modern teaching of modern statistics to modern biologists.' Noa Pinter-Wollman, The Quarterly Review of Biology
- Date Published: February 2019
- Format: Paperback
- ISBN: 9781108705295
- Length: 402 pages
- Dimensions: 279 x 217 x 16 mm
- Weight: 1.14 kg
- Availability: In stock
Table of Contents
1. Generative models for discrete data
2. Statistical modeling
3. High-quality graphics in R
4. Mixture models
5. Clustering
6. Testing
7. Multivariate analysis
8. High-throughput count data
9. Multivariate methods for heterogeneous data
10. Networks and trees
11. Image data
12. Supervised learning
13. Design of high-throughput experiments and their analyses