
Insights and Pitfalls: Selection Bias in Qualitative Research

  • David Collier and James Mahoney


Qualitative analysts have received stern warnings that the validity of their studies may be undermined by selection bias. This article provides an overview of this problem for qualitative researchers in the field of international and comparative studies, focusing on selection bias that may result from the deliberate selection of cases by the investigator. Examples are drawn from studies of revolution, international deterrence, the politics of inflation, international terms of trade, economic growth, and industrial competitiveness. The article first explores how insights about selection bias developed in quantitative research can most productively be applied in qualitative studies. The discussion considers why qualitative researchers need to be concerned about selection bias, even if they do not care about the generality of their findings, and it considers distinctive implications of this form of bias for qualitative research, as in the problem of what is labeled “complexification based on extreme cases.” The article then considers pitfalls in recent discussions of selection bias in qualitative studies. These discussions at times get bogged down in disagreements and misunderstandings over how the dependent variable is conceptualized and what the appropriate frame of comparison should be, issues that are crucial to the assessment of bias within a given study. At certain points it becomes clear that the real issue is not just selection bias, but a larger set of trade-offs among alternative analytic goals.




1 King, Gary, Keohane, Robert O., and Verba, Sidney, Designing Social Inquiry: Scientific Inference in Qualitative Research (Princeton: Princeton University Press, 1994), 116; Geddes, Barbara, “How the Cases You Choose Affect the Answers You Get: Selection Bias in Comparative Politics,” in Stimson, James A., ed., Political Analysis, vol. 2 (Ann Arbor: University of Michigan Press, 1990), 131, n. 1; and Achen, Christopher H. and Snidal, Duncan, “Rational Deterrence Theory and Comparative Case Studies,” World Politics 41 (January 1989), 160, 161. The most important general statement by a political scientist on selection bias is Achen, Christopher H., The Statistical Analysis of Quasi-Experiments (Berkeley: University of California Press, 1986). See also King, Gary, Unifying Political Methodology: The Likelihood Theory of Statistical Inference (Cambridge: Cambridge University Press, 1989), chap. 9.

2 Heckman, James J., “The Common Structure of Statistical Models of Truncation, Sample Selection and Limited Dependent Variables and a Simple Estimator for Such Models,” Annals of Economic and Social Measurement 5 (Fall 1976); idem, “Sample Selection Bias as a Specification Error,” Econometrica 47 (January 1979); idem, “Varieties of Selection Bias,” American Economic Association Papers and Proceedings 80 (May 1990); Maddala, G. S., Limited-Dependent and Qualitative Variables in Econometrics (Cambridge: Cambridge University Press, 1983); Campbell, Donald T. and Erlebacher, Albert, “How Regression Artifacts in Quasi-Experimental Evaluations Can Mistakenly Make Compensatory Education Look Harmful,” in Struening, Elmer L. and Guttentag, Marcia, eds., Handbook of Evaluation Research, vol. 1 (Beverly Hills, Calif.: Sage Publications, 1975); and Cain, G. G., “Regression and Selection Models to Improve Nonexperimental Comparisons,” in Bennett, C. A. and Lumsdaine, A. A., eds., Evaluation and Experiment: Some Critical Issues in Assessing Social Programs (New York: Academic Press, 1975).

3 King, Keohane, and Verba (fn. 1), 125–26.

4 “Review Symposium: The Qualitative-Quantitative Disputation: Gary King, Robert O. Keohane, and Sidney Verba's Designing Social Inquiry: Scientific Inference in Qualitative Research,” American Political Science Review 89 (June 1995).

5 Collier, David, “Translating Quantitative Methods for Qualitative Researchers: The Case of Selection Bias,” American Political Science Review 89 (June 1995).

6 Rogowski, Ronald, “The Role of Theory and Anomaly in Social-Scientific Inference,” American Political Science Review 89 (June 1995), 468–70. For a cautionary treatment of selection bias within the field of quantitative sociology, see Stolzenberg, Ross M. and Relles, Daniel A., “Theory Testing in a World of Constrained Research Design: The Significance of Heckman's Censored Sampling Bias Correction for Nonexperimental Research,” Sociological Methods and Research 18 (May 1990).

7 See Kendall, Maurice G. and Buckland, William R., A Dictionary of Statistical Terms, 4th ed. (London: Longman, 1982), 18, 66; and Vogt, W. Paul, Dictionary of Statistics and Methodology (Newbury Park, Calif.: Sage Publications, 1993), 21, 82.

8 Achen (fn. 1).

9 Przeworski, Adam and Limongi, Fernando, “Political Regimes and Economic Growth,” Journal of Economic Perspectives 7 (Summer 1993), 62–64; and Adam Przeworski, contribution to “The Role of Theory in Comparative Politics: A Symposium,” World Politics 48 (October 1995). This specific problem is also referred to as “endogeneity.” It merits emphasis that even if scholars resolve the concerns about investigator-induced selection bias that are the focus of the present paper, they will still be faced with the selection issues raised by Przeworski.

10 Moses, Lincoln E., “Truncation and Censorship,” in Sills, David L., ed., International Encyclopedia of the Social Sciences, vol. 15 (New York: Macmillan and Free Press, 1968), 196. Moses refers to this as truncation “on the left” and “on the right.” We are not concerned with other forms of truncation, which he refers to as “inner” truncation (omitting cases within a given range of values, but including cases above and below that range) and “outer” truncation (omitting cases above and below a given range). In the discussion below, when we refer to truncation, we mean left and right truncation.

11 Heckman (fn. 2, 1976), 478–79.

12 It is important to emphasize that this does not involve the situation of causal heterogeneity discussed below, in which unit changes in the explanatory variables have different effects on the dependent variable. Rather, a different combination of extreme scores on the explanatory variables produces the high scores.

13 Putnam, Robert D., Making Democracy Work: Civic Traditions in Modern Italy (Princeton: Princeton University Press, 1993), chaps. 3–4, and esp. 91–99. His term is actually “civic-ness.”

14 King, Keohane, and Verba (fn. 1), 130. See also Heckman (fn. 2, 1976), 478, n. 4; and Winship, Christopher and Mare, Robert D., “Models for Sample Selection Bias,” Annual Review of Sociology 18 (1992), 330.

15 Discussions of these methods of inference are found in Frendreis, John P., “Explanation of Variation and Detection of Covariation: The Purpose and Logic of Comparative Analysis,” Comparative Political Studies 16 (July 1983); DeFelice, E. Gene, “Causal Inference and Comparative Methods,” Comparative Political Studies 19 (October 1986); George, Alexander L. and McKeown, Timothy J., “Case Studies and Theories of Organizational Decision Making,” in Advances in Information Processing in Organizations, vol. 2 (Santa Barbara, Calif.: JAI Press, 1985), 29–41; Ragin, Charles C., The Comparative Method: Moving beyond Qualitative and Quantitative Strategies (Berkeley: University of California Press, 1987), esp. chaps. 6–8; and Collier, David, “The Comparative Method,” in Finifter, Ada W., ed., Political Science: The State of the Discipline II (Washington, D.C.: American Political Science Association, 1993).

16 Garfinkel, Alan, Forms of Explanation: Rethinking the Questions in Social Theory (New Haven: Yale University Press, 1981), 22–24.

17 Bartels, Larry M., “Pooling Disparate Observations,” American Journal of Political Science 40 (August 1996), 906; emphasis in original.

18 Bartels offers an excellent example of such a model. See ibid.

19 Przeworski, Adam and Teune, Henry, The Logic of Comparative Social Inquiry (New York: Wiley, 1970), 20–23. “Causality” is achieved when the causal model is correctly specified. Although greater generality may at times be achieved at the cost of causality, discussions of selection bias point to the alternative view that greater generality may sometimes improve causal assessment.

20 Sartori, Giovanni, “Concept Misformation in Comparative Politics,” American Political Science Review 64 (December 1970); and Collier, David and Mahon, James E. Jr., “Conceptual ‘Stretching’ Revisited: Adapting Categories in Comparative Analysis,” American Political Science Review 87 (December 1993).

21 On discerning, see Komarovsky, Mirra, The Unemployed Man and His Family: The Effect of Unemployment upon the Status of the Man in Fifty-nine Families (New York: Dryden Press, 1940), esp. 135–46; on process analysis, see Barton, Allen H. and Lazarsfeld, Paul, “Some Functions of Qualitative Analysis in Social Research,” in McCall, G. J. and Simmons, J. L., eds., Issues in Participant Observation (Reading, Mass.: Addison-Wesley, 1969); on pattern matching, see Campbell, Donald T., “‘Degrees of Freedom’ and the Case Study,” Comparative Political Studies 8 (July 1975), 181–82; on process tracing, see George and McKeown (fn. 15); on causal narrative, see Sewell, William H. Jr., “Three Temporalities: Toward an Eventful Sociology,” in McDonald, Terrence J., ed., The Historic Turn in the Human Sciences (Ann Arbor: University of Michigan Press, forthcoming).

22 Campbell (fn. 21).

23 Putnam (fn. 13), 85, 118–19.

24 For a particularly interesting statement on the tendency of case studies to overturn prior understandings, see again Campbell (fn. 21), 182. On the use of case studies to discover new explanations and conceptualizations, see also Piore, Michael J., “Qualitative Research Techniques in Economics,” Administrative Science Quarterly 24 (December 1979); Lijphart, Arend, “Comparative Politics and Comparative Method,” American Political Science Review 65 (September 1971), 691–92; and Eckstein, Harry, “Case Study and Theory in Political Science,” in Greenstein, Fred I. and Polsby, Nelson W., eds., Handbook of Political Science, vol. 7 (Reading, Mass.: Addison-Wesley, 1975), 104–8. Some of these themes are incisively summarized in George, Alexander L., “Case Studies and Theory Development: The Method of Structured, Focused Comparison,” in Lauren, Paul Gordon, ed., Diplomacy: New Approaches in History, Theory, and Policy (New York: Free Press, 1979), 51–52.

25 In this latter case, scholars may actually look at a range of variation at the high or low extreme of the variable, yet they treat this range of variation as a single outcome, for example, as “high” or “low” growth.

26 King, Keohane, and Verba (fn. 1), 129; Geddes (fn. 1), 132–33.

27 King, Keohane, and Verba (fn. 1), 129.

28 Ibid., 129, 130. We might add that notwithstanding this emphatic advice, these authors state their position more cautiously at a later point (p. 134). They suggest that this type of design may be a useful first step in addressing a research question and can be used to develop interesting hypotheses.

29 Collier (fn. 5), 464. On counterfactual analysis, see Fearon, James D., “Counterfactuals and Hypothesis Testing in Political Science,” World Politics 43 (January 1991), 179–80; and Tetlock, Philip E. and Belkin, Aaron, eds., Counterfactual Thought Experiments in World Politics (Princeton: Princeton University Press, 1996). See also Mill, John Stuart, “Of the Four Methods of Experimental Inquiry,” in A System ofLogic (1843; Toronto: University of Toronto Press, 1974).

30 King, Keohane, and Verba (fn. 1), 146, underscore this point.

31 Rogowski (fn. 6), 468–70; King, Gary, Keohane, Robert O., and Verba, Sidney, “The Importance of Research Design in Political Science,” American Political Science Review 89 (June 1995), 478–79; Katzenstein, Peter, Small States in World Markets (Ithaca, N.Y.: Cornell University Press, 1985); Bates, Robert H., Markets and States in Tropical Africa: The Political Basis of Agricultural Policies (Berkeley: University of California Press, 1981).

32 Porter, Michael E., The Competitive Advantage of Nations (New York: Free Press, 1990).

33 King, Keohane, and Verba (fn. 1), 134.

34 Porter (fn. 32), 6–10, 28–29, 33, 69, 577, 735.

35 Ibid., 683. See pp. 21–22 for Porter's discussion of his criteria for case selection.

36 Ibid., 675–80.

37 “The Rational Deterrence Debate: A Symposium,” World Politics 41 (January 1989).

38 Achen and Snidal (fn. 1), 160, 162.

39 Achen and Snidal (fn. 1), 161; George, Alexander L. and Smoke, Richard, Deterrence in American Foreign Policy: Theory and Practice (New York: Columbia University Press, 1974).

40 George and Smoke (fn. 39), 513–15, 519. See also George, Alexander L. and Smoke, Richard, “Deterrence and Foreign Policy,” World Politics 41 (January 1989), 173.

41 George and Smoke (fn. 39), 534, 522–36. See more generally chap. 18.

42 Even the cases not classified as following one of their patterns are still treated as instances of deterrence failure. See George and Smoke (fn. 39), 547–48.

43 George and Smoke's (fn. 40) subsequent discussion of these issues appears to underscore the idea of thinking of this variability in terms of gradations (p. 172).

44 George and Smoke (fn. 39), 503.

45 Ibid., 2. Similar statements are found on pp. 503 and 589.

46 This is an adaptation of Tilly's term “variation finding.” See Tilly, Charles, Big Structures, Large Processes, Huge Comparisons (New York: Russell Sage Foundation, 1984), 82, 116–24.

47 Skocpol, Theda, States and Social Revolutions: A Comparative Analysis of France, Russia, and China (Cambridge: Cambridge University Press, 1979).

48 Geddes (fn. 1), 142, 145.

49 Skocpol (fn. 47), 33–42, 287–90.

50 Geddes (fn. 1), 134.

51 Ibid., 138.

52 Geddes (fn. 1), 135, introduces additional domain restrictions that seem highly appropriate, as in the exclusion of oil-exporting states.

53 See Geddes (fn. 1), 135–140, and esp. Figures 4, 5, 6.

54 This point is made by Haggard, one of the authors whom Geddes cites. See Haggard, Stephan, “The Newly Industrializing Countries in the International System,” World Politics 38 (January 1986), 343, n. 1.

55 See Bollen, Kenneth A. and Jackman, Robert W., “Regression Diagnostics: An Expository Treatment of Outliers and Influential Cases,” Sociological Methods and Research 13 (May 1985).

56 Geddes (fn. 1), 146–47.

57 Ibid., 145.

58 Prebisch, Raúl, The Economic Development of Latin America and Its Principal Problems (New York: United Nations, 1950).

59 Geddes (fn. 1), 146.

60 Ibid., 145–47.

61 Prebisch (fn. 58), 9.

62 Hirschman, Albert O., Journeys toward Progress: Studies of Economic Policy-Making in Latin America (New York: W. W. Norton, 1973), originally published by the Twentieth Century Fund in 1963.

63 Geddes (fn. 1), 147, 148.

64 Ibid., 147.

65 Ibid.

66 Hirschman (fn. 62), 223.

67 Campbell, Donald T. and Stanley, Julian C., Experimental and Quasi-Experimental Designs for Research (Chicago: Rand McNally, 1963), 37–43, esp. Figure 3; Campbell, Donald T. and Ross, H. Laurence, “The Connecticut Crackdown on Speeding: Time-Series Data in Quasi-Experimental Analysis,” Law and Society Review 3 (August 1968); Hoole, Francis W., Evaluation Research and Development Activities (Beverly Hills, Calif.: Sage Publications, 1978); Cook, Thomas D. and Campbell, Donald T., Quasi-Experimentation: Design and Analysis Issues for Field Settings (Boston: Houghton Mifflin, 1979), chap. 2.

68 For two perspectives on the role of probabilistic causation in small-N analysis, see Lieberson, Stanley, “Small N's and Big Conclusions: An Examination of the Reasoning in Comparative Studies Based on a Small Number of Cases,” Social Forces 70 (December 1991), 309–12; and Collier, Ruth Berins and Collier, David, Shaping the Political Arena: Critical Junctures, the Labor Movement, and Regime Dynamics in Latin America (Princeton: Princeton University Press, 1991), 20.

69 Ragin (fn. 15).

* We acknowledge helpful comments from the following colleagues (but without thereby implying their agreement with the argument we develop): Christopher Achen, Larry Bartels, Andrew Bennett, Henry Brady, Barbara Geddes, Alexander George, David Freedman, Lynn Gayle, Stephan Haggard, Marcus Kurtz, Steven Levitsky, Carol Medlin, Lincoln Moses, Adam Przeworski, Philip Schrodt, Michael Sinatra, Laura Stoker, and Steven Weber. Certain of the arguments developed here were addressed in a preliminary form in Collier, David, “Translating Quantitative Methods for Qualitative Researchers: The Case of Selection Bias,” American Political Science Review 89 (June 1995). David Collier's work on this analysis at the Center for Advanced Study in the Behavioral Sciences was supported by National Science Foundation Grant No. SBR-9022192.


World Politics
  • ISSN: 0043-8871
  • EISSN: 1086-3338
  • URL: /core/journals/world-politics