from Section I - Thinking Like a Data Scientist
Published online by Cambridge University Press: 05 December 2015
The best-laid schemes o’ mice an’ men
Gang aft agley,
An’ lea'e us nought but grief an’ pain,
For promis'd joy!
Robert Burns, 1785
In Chapter 3 we learned how being guided by Rubin's Model for Causal Inference helps us design experiments to measure the effects of possible causes. I illustrated this with a hypothetical experiment on how to unravel a causal puzzle of happiness. Is it really this easy? The short answer is, unfortunately, no. But in the practical world, more complicated than the one evoked in my proposed happiness study, Rubin's Model is even more useful. In this chapter we go deeper into the dimly lit practical world, where participants in our causal experiment drop out for reasons outside our control. I show how statistical thinking in general, and Rubin's Model in particular, can illuminate it. But let us go slowly and allow time for our eyes to acclimate to the darkness.
Controlled experimental studies are typically regarded as the gold standard for which all investigators should strive, and observational studies as their polar opposite, pejoratively described as “some data we found lying on the street.” In practice they are closer to each other than we are often willing to admit. The distinguished statistician Paul Holland, expanding on Robert Burns, observed that
All experimental studies are observational studies waiting to happen.
This is an important and useful warning to all who are wise enough to heed it. Let us begin with a more careful description of both kinds of studies:
The key to an experimental study is control. In an experiment, those running it control:
What is the treatment condition,
What is the alternative condition,
Who gets the treatment,
Who gets the alternative, and
What are the outcome (dependent) variables.
In an observational study the experimenter's control is not as complete. Consider an experiment to measure the causal effect of smoking on life expectancy. Were we to do an experiment, the treatment might be a pack of cigarettes a day for one's entire life. The alternative condition might be no smoking. Then we would randomly assign people to smoke or not smoke, and the dependent variable would be their age at death.
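The random assignment step in such an experiment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the participant labels, the helper name `randomly_assign`, and the equal split between groups are hypothetical and do not come from the text. The point it shows is that chance alone, not the participants or the experimenter, decides who gets the treatment and who gets the alternative.

```python
import random


def randomly_assign(participants, seed=None):
    """Split participants into two groups by chance alone.

    Hypothetical helper for illustration; an equal split is assumed.
    """
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "treatment": shuffled[:half],    # e.g. assigned to smoke
        "alternative": shuffled[half:],  # e.g. assigned not to smoke
    }


# Hypothetical participant labels, not data from any real study.
participants = [f"person_{i}" for i in range(10)]
groups = randomly_assign(participants, seed=0)
print(groups["treatment"])
print(groups["alternative"])
```

Every participant lands in exactly one group, and nothing about the participant influences which one. That is the control an observational study lacks.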