Cluster–Robust Variance Estimation for Dyadic Data

Peter M. Aronow; Cyrus Samii; Valentina A. Assenova

doi:10.1093/pan/mpv018

Published online by Cambridge University Press: 04 January 2017

Peter M. Aronow: Department of Political Science, Yale University, 77 Prospect Street, New Haven, CT 06520, e-mail: peter.aronow@yale.edu
Cyrus Samii (corresponding author): Department of Politics, New York University, 19 West 4th Street, New York, NY 10012, e-mail: cds2083@nyu.edu
Valentina A. Assenova: School of Management, Yale University, 165 Whitney Avenue, New Haven, CT 06520, e-mail: valentina.assenova@yale.edu

Abstract

Dyadic data are common in the social sciences, although inference for such settings involves accounting for a complex clustering structure. Many analyses in the social sciences fail to account for the fact that multiple dyads share a member, and that errors are thus likely correlated across these dyads. We propose a non-parametric, sandwich-type robust variance estimator for linear regression to account for such clustering in dyadic data. We enumerate conditions for estimator consistency. We also extend our results to repeated and weighted observations, including directed dyads and longitudinal data, and provide an implementation for generalized linear models such as logistic regression. We examine empirical performance with simulations and an application to interstate disputes.
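
As an illustration of the kind of estimator the abstract describes, the following Python sketch computes a sandwich-type variance for OLS in which the residual cross-products of any two dyads that share a member enter the middle term. It is a minimal sketch based on one reading of the abstract, not the authors' implementation; the function name `dyadic_robust_vcov`, its arguments, and the use of NumPy are assumptions made for illustration, and it covers only unweighted linear regression.

```python
import numpy as np


def dyadic_robust_vcov(X, y, unit_a, unit_b):
    """Return OLS coefficients and a dyadic cluster-robust covariance matrix.

    X       : (n_dyads, k) design matrix (include a constant column if wanted)
    y       : (n_dyads,) outcome vector
    unit_a,
    unit_b  : (n_dyads,) ids of the two members of each dyad (hypothetical names)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    unit_a = np.asarray(unit_a)
    unit_b = np.asarray(unit_b)

    # Ordinary least squares fit and residuals.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta

    # share[d, d2] is True when dyads d and d2 have at least one member in
    # common (every dyad shares members with itself).
    share = (
        (unit_a[:, None] == unit_a[None, :])
        | (unit_a[:, None] == unit_b[None, :])
        | (unit_b[:, None] == unit_a[None, :])
        | (unit_b[:, None] == unit_b[None, :])
    )

    # Per-dyad score contributions x_d * e_d.
    scores = X * resid[:, None]

    # "Meat": sum of score cross-products over all pairs of dyads that share
    # a member. "Bread": inverse of X'X.
    meat = scores.T @ share.astype(float) @ scores
    bread = np.linalg.inv(X.T @ X)

    return beta, bread @ meat @ bread
```

The pairwise `share` matrix takes memory quadratic in the number of dyads, so this form suits only modest sample sizes. When no two dyads share a member, the result reduces to the usual heteroskedasticity-robust (HC0) estimate. As with multi-way clustering estimators, the matrix is not guaranteed to be positive semidefinite in finite samples, so its diagonal should be checked before taking square roots to obtain standard errors.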

Information

Type: Letter
Information: Political Analysis, Volume 23, Issue 4, Autumn 2015, pp. 564–577

DOI: https://doi.org/10.1093/pan/mpv018
Copyright: Copyright © The Author 2015. Published by Oxford University Press on behalf of the Society for Political Methodology

Footnotes

Authors' note: The authors thank Neal Beck, Allison Carnegie, Dean Eckles, Donald Lee, Winston Lin, Kelly Rader, Olav Sorenson, the Political Analysis editors, and two reviewers for helpful comments. They thank Jonathan Baron and Lauren Pinson for research assistance. Supplementary materials for this article are available on the Political Analysis Web site. Replication materials are available on the Political Analysis Dataverse (https://dataverse.harvard.edu/dataverse/pan).
