An Introduction with Applications in Data Science
- Author: Roman Vershynin, University of California, Irvine
- Date Published: September 2018
- availability: Available
- format: Hardback
- isbn: 9781108415194
High-dimensional probability offers insight into the behavior of random vectors, random matrices, random subspaces, and objects used to quantify uncertainty in high dimensions. Drawing on ideas from probability, analysis, and geometry, it lends itself to applications in mathematics, statistics, theoretical computer science, signal processing, optimization, and more. This book is the first to integrate theory, key tools, and modern applications of high-dimensional probability. Concentration inequalities form the core, and it covers both classical results such as Hoeffding's and Chernoff's inequalities and modern developments such as the matrix Bernstein inequality. It then introduces powerful methods based on stochastic processes, including such tools as Slepian's, Sudakov's, and Dudley's inequalities, as well as generic chaining and bounds based on VC dimension. A broad range of illustrations is embedded throughout, including classical and modern results for covariance estimation, clustering, networks, semidefinite programming, coding, dimension reduction, matrix completion, machine learning, compressed sensing, and sparse regression.
- Closes the gap between the standard probability curriculum and what mathematical data scientists need to know
- Selects the core ideas and methods and presents them systematically with modern motivating applications to bring readers quickly up to speed
- Features integrated exercises that invite readers to sharpen their skills and build practical intuition
- Winner, 2019 PROSE Award for Mathematics
Reviews & endorsements
'This is an excellent and very timely text, presenting the modern tools of high-dimensional geometry and probability in a very accessible and applications-oriented manner, with plenty of informative exercises. The book is infused with the author's insights and intuition in this field, and has extensive references to the latest developments in the area. This book will be an extremely useful resource both for newcomers to this subject and for expert researchers.' Terence Tao, University of California, Los Angeles
'Methods of high-dimensional probability have become indispensable in numerous problems of probability theory and its applications in mathematics, statistics, computer science, and electrical engineering. Roman Vershynin's wonderful text fills a major gap in the literature by providing a highly accessible introduction to this area. Starting with no prerequisites beyond a first course in probability and linear algebra, Vershynin takes the reader on a guided tour through the subject and consistently illustrates the utility of the material through modern data science applications. This book should be essential reading for students and researchers in probability theory, data science, and related fields.' Ramon van Handel, Princeton University, New Jersey
'This very welcome contribution to the literature gives a concise introduction to several topics in ‘high-dimensional probability’ that are of key relevance in contemporary statistical science and machine learning. The author achieves a fine balance between presenting deep theory and maintaining readability for a non-specialist audience - this book is thus highly recommended for graduate students and researchers alike who wish to learn more about this by now indispensable field of modern mathematics.' Richard Nickl, University of Cambridge
'Vershynin is one of the world's leading experts in the area of high-dimensional probability, and his textbook provides a gentle yet thorough treatment of many of the key tools in the area and their applications to the field of data science. The topics covered here are a must-know for anyone looking to do mathematical work in the field, covering subjects important in machine learning, algorithms and theoretical computer science, signal processing, and applied mathematics.' Jelani Nelson, Harvard University, Massachusetts
'High-Dimensional Probability is an excellent treatment of modern methods in probability and data analysis. Vershynin's perspective is unique and insightful, informed by his expertise as both a probabilist and a functional analyst. His treatment of the subject is gentle, thorough and inviting, providing a great resource for both newcomers and those familiar with the subject. I believe, as the author does, that the topics covered in this book are indeed essential ingredients of the developing foundations of data science.' Santosh Vempala, Georgia Institute of Technology
'Renowned for his deep contributions to high-dimensional probability, Roman Vershynin is to be commended for the clarity of his progressive exposition of the important concepts, tools and techniques of the field. Advanced students and practitioners interested in the mathematical foundations of data science will enjoy the many relevant worked examples and lively use of exercises. This book is the reference I had been waiting for.' Rémi Gribonval, IEEE and EURASIP Fellow, Directeur de Recherche, Inria, France
'High-dimensional probability is a fascinating mathematical theory that has rapidly grown in recent years. It is fundamental to high-dimensional statistics, machine learning and data science. In this book, Roman Vershynin, who is a leading researcher in high-dimensional probability and a master of exposition, provides the basic tools and some of the main results and applications of high-dimensional probability. This book is an excellent textbook for a graduate course that will be appreciated by mathematics, statistics, computer science, and engineering students. It will also serve as an excellent reference book for researchers working in high-dimensional probability and statistics.' Elchanan Mossel, Massachusetts Institute of Technology
'This book on the theory and application of high-dimensional probability is a work of exceptional clarity that will be valuable to students and researchers interested in the foundations of data science. A working knowledge of high dimensional probability is essential for researchers at the intersection of applied mathematics, statistics and computer science. The widely accessible presentation will make this book a classic that everyone in foundational data science will want to have on their bookshelf.' Alfred Hero, University of Michigan
'Vershynin's book is a brilliant introduction to the mathematics which is at the core of modern signal processing and data science. The focus is on concentration of measure and its applications to random matrices, random graphs, dimensionality reduction, and suprema of random processes. The treatment is remarkably clean, and the reader will learn beautiful and deep mathematics without unnecessary formalism.' Andrea Montanari, Stanford University, California
'The ideas presented here have emerged as the essential core of a modern mathematical education, essential not only for probabilists but also for any researcher interested in high-dimensional statistics, the theory of algorithms, information theory, statistical physics and dynamical systems. Moreover, as Vershynin ably demonstrates, mastering these ideas will provide insight into the essential unity underlying these disciplines.' Michael Jordan, University of California, Berkeley
'The current monograph is a welcome text providing a clear and concise introduction to many recent (and less recent) developments, in which the author played an important role.' Sasha Sodin, MathSciNet
'A good textbook is as much about learning as about learning something specific. Vershynin’s High-Dimensional Probability is a good textbook. When developing a topic, it starts from the simplest idea, it examines its weaknesses and builds up to a better idea; this is superbly done when bounding the tail probabilities of binomial distributions in Chapter 2. It always prioritises high-level narrative to technical details; the reader never loses sight of the main theme, arguments are kept to their essence, side results are given as exercises and important special cases are given priority over the most general statements. Intuition is at least as important as the techniques; this is usually the hardest to communicate in a book, compared for example to a classroom presentation, but it comes across beautifully in this book …' Omiros Papaspiliopoulos, Newsletter of the Bachelier Finance Society
18th Feb 2019 by Jakeknigge
Vershynin's book covers a set of topics that is likely to become central in the education for modern mathematicians, statisticians, physicists, and (electrical) engineers. He discusses ideas, techniques, and tools that arise across fields, and he conceptually unifies them under the brand name of high-dimensional probability. His choice of topics (e.g., concentration/deviation inequalities, random vectors/matrices, stochastic processes, etc.) and applications (e.g., sparse recovery, dimension-reduction, covariance estimation, optimization bounds, etc.) delivers a necessary (and timely) addition to the growing body of data-science-related literature (more on this below). Vershynin writes in a conversational, reader-friendly manner. He weaves theorems, lemmas, corollaries, and proofs into his dialogue with the reader without getting caught in an endless theorem-proof loop. In addition, the book's integrated exercises and its prompts to check! or think about why? are strong components of the book. My copy of the book is already full of notes to myself where I’m “checking” something or explaining “why” something is true/false. (Also, as an aside, I love that coffee cups are used to signal the difficulty of a problem: good style.) I want to highlight a few examples where Vershynin’s choice of topics and his prose shine brightly. In section 4.4.1, he guides us through an example that clearly illustrates the usefulness of ε-nets for bounding matrix norms. I’d seen ε-nets and covering numbers before, but never had good intuition for why they showed up in a proof. Similarly, I’d struggled to gain intuition about why/how Gaussian widths and Vapnik–Chervonenkis dimension capture/measure the complexity of a set. After reading sections 7.5 and 8.3 and working through some exercises, the two concepts are much clearer. Moreover, Vershynin connects these ideas back to covering numbers, which helped me better my understanding of all three concepts.
Finally, I found the discussions on chaining and generic chaining in chapter 8 to be excellent. Following them up with Talagrand’s comparison inequality, which becomes the hammer of choice for the matrix deviation inequality (in chapter 9), rounds out a long, but very valuable/useful chapter—and one that I’ll certainly re-study and reference. I would recommend this book for those interested in (high-dimensional) statistics, randomized numerical linear algebra, and electrical engineering (particularly, signal processing). As I'm coming to realize, the concentration of measure and “deviation inequality” toolbox is essential to these areas. Lastly, I believe that this book makes a great companion to “Concentration Inequalities” by Boucheron, Lugosi, Massart.
- length: 296 pages
- dimensions: 260 x 183 x 22 mm
- weight: 0.71kg
Table of Contents
Appetizer: using probability to cover a geometric set
1. Preliminaries on random variables
2. Concentration of sums of independent random variables
3. Random vectors in high dimensions
4. Random matrices
5. Concentration without independence
6. Quadratic forms, symmetrization and contraction
7. Random processes
8. Chaining
9. Deviations of random matrices and geometric consequences
10. Sparse recovery
11. Dvoretzky-Milman's theorem