EDA for HLM: Visualization when Probabilistic Inference Fails

  • Jake Bowers and Katherine W. Drake

Nearly all hierarchical linear models presented to political science audiences are estimated using maximum likelihood under a repeated sampling interpretation of the results of hypothesis tests. Maximum likelihood estimators have excellent asymptotic properties but less than ideal small sample properties. Multilevel models common in political science have relatively large samples of units like individuals nested within relatively small samples of units like countries. Often these level-2 samples will be so small as to make inference about level-2 effects uninterpretable in the likelihood framework from which they were estimated. When analysts do not have enough data to make a compelling argument for repeated sampling based probabilistic inference, we show how visualization can be a useful way of allowing scientific progress to continue despite lack of fit between research design and asymptotic properties of maximum likelihood estimators.

Somewhere along the line in the teaching of statistics in the social sciences, the importance of good judgment got lost amid the minutiae of null hypothesis testing. It is all right, indeed essential, to argue flexibly and in detail for a particular case when you use statistics. Data analysis should not be pointlessly formal. It should make an interesting claim; it should tell a story that an informed audience will care about, and it should do so by intelligent interpretation of appropriate evidence from empirical measurements or observations.

—Abelson, 1995, p. 2

With neither prior mathematical theory nor intensive prior investigation of the data, throwing half a dozen or more exogenous variables into a regression, probit, or novel maximum-likelihood estimator is pointless. No one knows how they are interrelated, and the high-dimensional parameter space will generate a shimmering pseudo-fit like a bright coat of paint on a boat's rotting hull.

—Achen, 1999, p. 26


Authors' note: We owe many thanks to James Bowers, Andrew Gelman, Orit Kedar, Burt Monroe, Kevin Quinn, Phil Shively, Laura Stoker, Cara Wong, and the anonymous reviewers of the first version of our manuscript for their many helpful comments.

Abelson, Robert. 1995. Statistics as Principled Argument. New York: Lawrence Erlbaum.
Achen, Christopher H. 1999. “Warren Miller and the Future of Political Data Analysis.” Political Analysis 8: 142–146.
Achen, Christopher H., and Shively, W. P. 1995. Cross-Level Inference. Chicago: University of Chicago Press.
Becker, R. A., Cleveland, W. S., and Shyu, M. J. 1996. “The Visual Design and Control of Trellis Display.” Journal of Computational and Graphical Statistics 5: 123–155. (Available from
Bowers, Jake, and Ensley, Michael. 2003. “Issues in Analyzing Data from the Dual-Mode 2000 American National Election Study.” Technical Report. Ann Arbor, MI: National Election Studies.
Brady, Henry, and Seawright, Jason. 2004. “Framing Social Inquiry: From Models of Causation to Statistically Based Causal Inference.” Working paper.
Buja, Andreas, and Cook, Dianne. 1999. “Inference for Data Visualization.” Presented at the Joint Statistics Meetings, August 1999. Baltimore, MD. (Available from
Burns, Nancy, Schlozman, Kay L., and Verba, Sidney. 2001. The Private Roots of Public Action: Gender, Equality, and Political Participation. Cambridge, MA: Harvard University Press.
Cleveland, William S. 1993. Visualizing Data. Summit, NJ: Hobart.
Davidson, Russell, and MacKinnon, James G. 1993. Estimation and Inference in Econometrics. New York: Oxford University Press.
Fox, John. 1997. Applied Regression Analysis, Linear Models, and Related Methods. Thousand Oaks, CA: Sage.
Gelman, Andrew. 2003. “A Bayesian Formulation of Exploratory Data Analysis and Goodness-of-Fit Testing.” International Statistical Review 71: 369–382.
Gelman, Andrew. 2004. “Exploratory Data Analysis for Complex Models (with Discussion by Andreas Buja and Rejoinder).” Journal of Computational and Graphical Statistics 13: 755–787.
Gelman, Andrew, Carlin, John B., Stern, Hal S., and Rubin, Donald B. 2004. Bayesian Data Analysis, 2nd ed. Boca Raton, FL: Chapman and Hall/CRC.
Gelman, Andrew, Pasarica, Cristian, and Dodhia, Rahul. 2002. “Let's Practice What We Preach: Turning Tables into Graphs.” Statistical Computing and Graphics 56: 121–130.
Gill, Jeff. 2002. Bayesian Methods: A Social and Behavioral Sciences Approach. Boca Raton, FL: Chapman and Hall/CRC.
Goldstein, H. 1999. Multilevel Statistical Models. London: Edward Arnold.
Greene, William H. 2002. Econometric Analysis, 5th ed. Upper Saddle River, NJ: Prentice Hall.
Holland, Paul W. 1986. “Statistics and Causal Inference.” Journal of the American Statistical Association 81: 945–960.
Hox, J. J., and Maas, C. J. M. 2002. Sample Sizes for Multilevel Modeling. In Social Science Methodology in the New Millennium. Proceedings of the Fifth International Conference on Logic and Methodology, eds. Blasius, J., Hox, J., de Leeuw, E., and Schmidt, P. Opladen, Germany: Leske + Budrich Verlag.
Huckfeldt, R. R. 1979. “Political Participation and the Neighborhood Social Context.” American Journal of Political Science 23: 579–592.
Jackman, Simon. 2004. “Bayesian Analysis for Political Research.” Annual Review of Political Science 7: 483–505.
King, Gary. 1989. Unifying Political Methodology: The Likelihood Theory of Statistical Inference. New York: Cambridge University Press.
Kreft, Ita. 1996. “Are Multilevel Techniques Necessary? An Overview, Including Simulation Studies.” Unpublished manuscript.
Kreft, I., and de Leeuw, J. 1998. Introducing Multilevel Modeling. London: Sage.
Langford, Ian H., and Lewis, Toby. 1998. “Outliers in Multilevel Data.” Journal of the Royal Statistical Society A 161: 121–160.
Leisch, Friedrich. 2002. Dynamic Generation of Statistical Reports Using Literate Data Analysis. In Compstat 2002: Proceedings in Computational Statistics, eds. Haerdle, W. and Roenz, B. Heidelberg, Germany: Physica Verlag, pp. 575–580.
Leisch, Friedrich. 2005. “Sweave User Manual.” (Available from
Lewis, Jeffrey B., and Linzer, Drew A. 2005. “Estimating Regression Models in Which the Dependent Variable Is Based on Estimates.” Political Analysis. doi:10.1093/pan/mpi026.
Longford, N. T. 1993. Random Coefficient Models. Oxford: Clarendon.
Maas, Cora J.M., and Hox, Joop J. 2002. “Robustness of Multilevel Parameter Estimates against Small Sample Sizes.” In Social Science Methodology in the New Millennium, eds. Blasius, J., Hox, J., de Leeuw, E., and Schmidt, P. Opladen, Germany: Leske + Budrich. (Available from
Maas, Cora J.M., and Hox, Joop J. 2004. “Robustness Issues in Multilevel Regression Analysis.” Statistica Neerlandica 58: 127–137.
McCulloch, Charles E., and Searle, Shayle R. 2001. Generalized, Linear, and Mixed Models. New York: John Wiley and Sons.
Mundlak, Yair. 1978. “On the Pooling of Time Series and Cross Section Data.” Econometrica 46: 69–85.
Nie, Norman, Junn, Jane, and Barry, Kenneth S. 1996. Education and Democratic Citizenship in America. Chicago: University of Chicago Press.
Pinheiro, José C., and Bates, Douglas M. 2000. Mixed-Effects Models in S and S-PLUS. New York: Springer-Verlag.
R Development Core Team. 2005. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. (Available from
Raudenbush, Stephen W., and Bryk, Anthony S. 2002. Hierarchical Linear Models: Applications and Data Analysis Methods, 2nd ed. Thousand Oaks, CA: Sage.
Rosenbaum, Paul R. 2002. Observational Studies. New York: Springer.
Rosenstone, Steven, and Hansen, John M. 1993. Mobilization, Participation and Democracy in America. New York: MacMillan.
Rubin, Donald B. 1991. “Practical Implications of Modes of Statistical Inference for Causal Effects and the Critical Role of the Assignment Mechanism.” Biometrics 47: 1213–1234.
Sarkar, Deepayan. 2005. Lattice: Lattice Graphics. R Foundation for Statistical Computing [producer and distributor]. (Available from
Singer, Judith D., and Willett, John B. 2003. Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence. New York: Oxford University Press.
Snijders, T., and Bosker, R. 1999. Multilevel Modeling: An Introduction to Basic and Advanced Multilevel Modeling. London: Sage.
Steenbergen, Marco R., and Jones, Bradford S. 2002. “Modeling Multilevel Data Structures.” American Journal of Political Science 46: 218–237.
Stoker, Laura, and Bowers, Jake. 2002a. “Designing Multi-level Studies: Sampling Voters and Electoral Contexts.” Electoral Studies 21: 235–267.
Stoker, Laura, and Bowers, Jake. 2002b. “Erratum to ‘Designing Multi-level Studies: Sampling Voters and Electoral Contexts’.” Electoral Studies 21: 535–536.
Tenn, Stephen. 2005. “An Alternative Measure of Relative Education to Explain Voter Turnout.” Journal of Politics 67: 271–282.
Tufte, Edward. 1983. The Visual Display of Quantitative Information. Cheshire, CT: Graphics.
Tufte, Edward. 1990. Envisioning Information. Cheshire, CT: Graphics.
Tufte, Edward. 2003. The Cognitive Style of Powerpoint. Cheshire, CT: Graphics.
Tufte, Edward R. 1997. Visual Explanations: Images and Quantities, Evidence and Narrative. Cheshire, CT: Graphics.
Tukey, John W. 1977. Exploratory Data Analysis. Reading, MA: Addison-Wesley.
Venables, W. N., and Ripley, B. D. 2002. Modern Applied Statistics with S-PLUS, 4th ed. New York: Springer.
Verba, Sidney, Schlozman, Kay L., and Brady, Henry. 1995. Voice and Equality: Civic Voluntarism in American Politics. Cambridge, MA: Harvard University Press.
Wooldridge, Jeffrey M. 2002. Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: MIT Press.
Yohai, V., Stahel, W. A., and Zamar, R. H. 1991. A Procedure for Robust Estimation and Inference in Linear Regression. In Directions in Robust Statistics and Diagnostics, Part II, eds. Stahel, W. A. and Weisberg, S. W. New York: Springer-Verlag.
Political Analysis
  • ISSN: 1047-1987
  • EISSN: 1476-4989
  • URL: /core/journals/political-analysis