Estimating Regression Models in Which the Dependent Variable Is Based on Estimates

  • Jeffrey B. Lewis and Drew A. Linzer
Abstract

Researchers often use as dependent variables quantities estimated from auxiliary data sets. Estimated dependent variable (EDV) models arise, for example, in studies where counties or states are the units of analysis and the dependent variable is an estimated mean, proportion, or regression coefficient. Scholars fitting EDV models have generally recognized that variation in the sampling variance of the observations on the dependent variable will induce heteroscedasticity. We show that the most common approach to this problem, weighted least squares, will usually lead to inefficient estimates and underestimated standard errors. In many cases, OLS with White's or Efron's heteroscedasticity-consistent standard errors yields better results. We also suggest two simple alternative FGLS approaches that are more efficient and yield consistent standard error estimates. Finally, we apply the various alternative estimators to a replication of Cohen's (2004) cross-national study of presidential approval.
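To make the comparison in the abstract concrete, the following is a minimal Python sketch, not the authors' code. It assumes the regression error is the sum of a homoscedastic component with unknown variance sigma2 and a sampling-error component with known variances omega2; that decomposition, the function names, and the simulated data are all assumptions introduced here for illustration. The first function computes OLS with White (HC0) standard errors. The second is one possible FGLS estimator under the stated assumption: it recovers sigma2 by a method-of-moments step from the OLS residuals and then reweights by 1 / (sigma2 + omega2). It should not be read as either of the two FGLS approaches the article proposes.

```python
import numpy as np


def ols_white(X, y):
    """OLS coefficients with White (HC0) heteroscedasticity-consistent SEs."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # "Meat" of the sandwich: X' diag(resid^2) X
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))


def fgls_known_sampling_var(X, y, omega2):
    """Illustrative FGLS, assuming Var(error_i) = sigma2 + omega2[i].

    omega2 holds the (assumed known) sampling variances of the estimated
    dependent variable; sigma2 is estimated by matching the expected OLS
    residual sum of squares to its observed value.
    """
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    resid = y - X @ (XtX_inv @ X.T @ y)
    # trace((X'X)^-1 X' G X) with G = diag(omega2)
    correction = np.trace(XtX_inv @ (X.T @ (X * omega2[:, None])))
    sigma2 = max(0.0, (resid @ resid - omega2.sum() + correction) / (n - k))
    w = 1.0 / (sigma2 + omega2)  # inverse total variance weights
    XtWX_inv = np.linalg.inv(X.T @ (X * w[:, None]))
    beta = XtWX_inv @ (X.T @ (w * y))
    return beta, np.sqrt(np.diag(XtWX_inv)), sigma2


# Simulated example (hypothetical data, true coefficients 1.0 and 2.0).
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
omega2 = rng.uniform(0.1, 2.0, size=n)          # known sampling variances
y = (1.0 + 2.0 * x
     + rng.normal(scale=1.0, size=n)            # homoscedastic component
     + rng.normal(scale=np.sqrt(omega2)))       # sampling error

beta_ols, se_ols = ols_white(X, y)
beta_fgls, se_fgls, sigma2_hat = fgls_known_sampling_var(X, y, omega2)
print("OLS + White:", beta_ols, se_ols)
print("FGLS sketch:", beta_fgls, se_fgls, "sigma2_hat =", sigma2_hat)
```

The contrast with weighting by 1 / omega2 alone is that this weight ignores the homoscedastic component of the error, which is consistent with the abstract's point that such weights are generally not the efficient ones.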

Corresponding author
e-mail: jblewis@ucla.edu (corresponding author)
Footnotes

Authors' note: We thank Chris Achen, Barry Burden, Michael Herron, Gary King, Eduardo Leoni, and Lynn Vavreck for comments on earlier drafts of this article. Any remaining errors are ours alone.


Political Analysis
  • ISSN: 1047-1987
  • EISSN: 1476-4989
  • URL: /core/journals/political-analysis