Published online by Cambridge University Press: 05 October 2013
This chapter is about extracting small representative samples from large data sets. In the process we develop a complete computational theory of geometric sampling, with an eye toward the derandomization applications that will be discussed in later chapters. It is difficult to overstate the impact that this theory had on computational geometry in the 1990s.
The combinatorial discrepancy of a set system indicates how well, relative to its constituent subsets, we can sample the ground set by selecting about half of it. It is natural to ask what happens for different sample sizes. At one extreme, we might wonder how well we can sample a set if we are allowed to pick only a constant number of elements. For example, given a finite collection of points in the plane, is it possible to choose a subset of constant size, such that any disk that encloses at least one percent of the points also includes at least one sample point? Surprisingly, the answer is yes.
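The disk question above can be explored empirically. The following is a minimal sketch, not the chapter's construction: it draws a uniform random sample (the chapter's interest is in computing such samples, ultimately without randomness) and checks it only against a finite batch of random test disks rather than all disks. The point distribution, the sample size of 1500, and the helper names `in_disk` and `missed_heavy_disks` are assumptions chosen for illustration.

```python
import random

def in_disk(p, center, radius):
    """True if point p lies in the closed disk with the given center and radius."""
    return (p[0] - center[0]) ** 2 + (p[1] - center[1]) ** 2 <= radius * radius

def missed_heavy_disks(points, sample, disks, eps):
    """Return the disks that hold at least eps * len(points) points
    but contain no point of the sample."""
    threshold = eps * len(points)
    missed = []
    for center, radius in disks:
        weight = sum(in_disk(p, center, radius) for p in points)
        if weight >= threshold and not any(in_disk(s, center, radius) for s in sample):
            missed.append((center, radius))
    return missed

# Assumed setup: 10,000 points uniform in the unit square, fixed seed.
rng = random.Random(1)
points = [(rng.random(), rng.random()) for _ in range(10_000)]

# Assumed sample size; it does not depend on len(points).
sample = rng.sample(points, 1500)

# A finite batch of random test disks (only a spot check, not a proof).
disks = [((rng.random(), rng.random()), rng.uniform(0.02, 0.5)) for _ in range(100)]

missed = missed_heavy_disks(points, sample, disks, eps=0.01)
print(len(missed), "heavy disks contain no sample point")
```

A disk holding one percent of the points is missed by 1500 independent random draws with probability roughly 0.99^1500, which is why such a sample is expected to hit every heavy test disk here. The stronger statement in the text is that a suitable sample hits all such disks simultaneously.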
In fact, something even stronger and stranger is true. Suppose that we want to estimate how many people live within 10 miles of a hospital in a given country. We can do this by sampling the population carefully, answering the question for the sample, and then scaling up appropriately. What is amazing is that, for a given relative error, the same sample size works just as well whether the country is Switzerland or China! Furthermore, we can change metrics and even lift the problem into higher-dimensional space, and this remains true.
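The sample-then-scale procedure can be sketched as follows. This is an illustration under stated assumptions, not the chapter's method: it uses plain uniform random sampling in place of the careful sampling the text refers to, a synthetic population of points in the unit square, a single hypothetical hospital at the center, and an error measured as a fraction of the whole population. The names `estimate_count` and `error_fraction` are made up for the example.

```python
import random

def within(p, center, radius):
    """True if point p is at Euclidean distance at most radius from center."""
    dx, dy = p[0] - center[0], p[1] - center[1]
    return dx * dx + dy * dy <= radius * radius

def estimate_count(population, center, radius, sample_size, rng):
    """Count the sample points within range, then scale up to the population."""
    sample = rng.sample(population, sample_size)
    hits = sum(within(p, center, radius) for p in sample)
    return hits * len(population) / sample_size

def error_fraction(n, sample_size, seed=0):
    """Absolute estimation error divided by the population size n."""
    rng = random.Random(seed)
    # Assumed population: n points uniform in the unit square.
    population = [(rng.random(), rng.random()) for _ in range(n)]
    center, radius = (0.5, 0.5), 0.3  # hypothetical hospital and range
    true_count = sum(within(p, center, radius) for p in population)
    estimate = estimate_count(population, center, radius, sample_size, rng)
    return abs(estimate - true_count) / n

# The same sample size is used for a small and a large population.
for n in (10_000, 200_000):
    print(n, error_fraction(n, sample_size=2000))
```

With the sample size held at 2000, the error as a fraction of the population stays at a similar scale for both population sizes, which is the size-independence the paragraph describes. The sketch checks one query only; the theory concerns samples that are accurate for every query of the given shape at once.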