Assessing a New Measure of State Policy Mood: Response to Lagodny, Jones, Koch, and Enns

Abstract This article presents a short summary of the conclusions we report in a longer manuscript (available in our Supplementary Material) subjecting Lagodny et al.’s new measure of state policy mood to the same set of face validity and construct validity tests we applied earlier to Enns and Koch’s measure. We encourage readers to read this longer manuscript, which contains not only the conclusions herein, but also the evidence justifying these conclusions, before accepting or rejecting any claims we make. Our results show that the characteristics of Enns and Koch’s measure that led us to doubt its validity are also present in Lagodny et al.’s new measure – leaving us just as doubtful that Lagodny et al.’s measure is valid. Moreover, the low correlation between Lagodny et al.’s measure and Enns and Koch’s measure, combined with evidence from replications of seven published studies that the two measures frequently yield quite different inferences about the impact of policy mood on public policy, indicates that Lagodny et al.’s claim that both their measure and Enns and Koch’s measure are valid is wrong; either neither measure is valid, or one is valid and the other is not. Finally, extending the replications to include not only Lagodny et al.’s and Enns and Koch’s measures, but also Berry et al.’s measure and Caughey and Warshaw’s measure of mass economic liberalism, shows that each of the four measures yields a substantive conclusion about the effect of policy mood that is dramatically different from that of each of the other three measures. This suggests that the goal of developing a measure of state policy mood that would be widely accepted as valid remains elusive.


Introduction
Lagodny, Jones, Koch, and Enns (2023; hereafter LJKE) reassert Enns and Koch's (2015; hereafter E&K's) claim that E&K's (2013) measure of state policy mood is valid but Berry et al.'s (1998) measure (hereafter BRFH's measure) is not. LJKE (2023, 360) also introduce a new measure of state policy mood that they claim has "even better properties than the Enns and Koch measure." Our Supplementary Material contains a detailed response to LJKE. However, because of SPPQ length restrictions, we limit ourselves here to a summary of the conclusions presented in the Supplementary Material. We encourage readers to read this Supplementary Material, which contains not only the conclusions herein, but also the evidence justifying these conclusions, before accepting or rejecting any claims we make.

LJKE's measure raises similar face validity concerns as does E&K's measure
Berry et al. (2015) identify three respects in which E&K's measure lacks face validity. Our analysis of LJKE's measure leaves us with the same concerns:
• Berry et al. (2015) contend that scores for E&K's measure are at odds with conventional wisdom about policy mood in the South in the sense that the measure indicates that the South is a liberal region. We find evidence that LJKE's measure paints the South as even more liberal than does E&K's measure.
• Berry et al. (2015) claim that E&K's measure shows less variation across regions than conventional wisdom dictates should be the case. We find evidence that LJKE's measure exhibits a larger number of significant pairwise regional differences than does E&K's measure. However, we believe that many of these regional differences lack face validity.
• Berry et al. (2015) assert that E&K's measure lacks face validity in the sense that the time trend in E&K scores is more similar across states than seems plausible. When observing the 50 states for the period 1960 to 2010, the mean longitudinal correlation between LJKE's score in one state and LJKE's score in another state is 0.51. This is substantially lower than the comparable correlation for E&K's measure. However, additional analysis presented below leads us to believe that this mean correlation of 0.51 is high enough to raise validity concerns.
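The mean longitudinal correlation between states described above can be illustrated with a minimal sketch. It assumes each state has one mood score per year and averages the Pearson correlation over all pairs of states. The data are synthetic and the function names (`pearson`, `mean_pairwise_correlation`) are hypothetical; this is not the replication code from the Supplementary Material.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_pairwise_correlation(series_by_state):
    """Average over-time correlation across all distinct pairs of states."""
    pairs = list(combinations(sorted(series_by_state), 2))
    total = sum(pearson(series_by_state[a], series_by_state[b]) for a, b in pairs)
    return total / len(pairs)


# Synthetic illustration only: five hypothetical states share a common
# "national" trend plus independent state-level noise, 1960 to 2010.
rng = random.Random(0)
national = [rng.gauss(0, 1) for _ in range(1960, 2011)]
toy = {f"state_{i}": [n + rng.gauss(0, 1) for n in national] for i in range(5)}
toy_mean = mean_pairwise_correlation(toy)
print(f"mean pairwise over-time correlation (toy data): {toy_mean:.2f}")
```

In this toy setup, a larger shared national component relative to state-level noise pushes the mean pairwise correlation toward 1, which is the pattern the face validity concern is about.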

LJKE's measure raises similar construct validity concerns as does E&K's measure
Berry et al. (2015) compute the cross-sectional correlation of each of BRFH's and E&K's measures of state policy mood with 25 indicators of state policy in a year between 1980 and 2009, and report that E&K's measure yields seven correlations with a sign contrary to conventional wisdom. We find evidence that LJKE's measure produces 12 correlations with this characteristic.

The correlation between LJKE's measure and E&K's measure implies that LJKE's claim that both measures are valid is wrong
LJKE (2023, 369) claim that E&K's measure is valid and that LJKE's measure "performs even better." Yet LJKE report that the mean (across the 50 states and DC) of the over-time correlation (for the years between 1956 and 2010) between the two measures is "just above" 0.50. Squaring a correlation of 0.50 yields a coefficient of determination (r²) of 0.25, indicating that each measure explains just 25% of the variation in the other. When one is considering the relationship between two variables that measure distinct concepts (e.g., a dependent variable, Y, and an independent variable, X, expected to affect Y), it is often reasonable to interpret an r² of 0.25 as "strong." However, when one is considering two variables each of which is hypothesized to be a valid measure of the same concept, we believe an r² of 0.25 between them is evidence that one's hypothesis is wrong: the two variables are not measuring the same concept, and therefore at most one of them can be valid.
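As a worked version of the arithmetic in this paragraph, taking the reported mean correlation as exactly 0.50 for illustration (LJKE describe it only as "just above" 0.50):

```latex
r = 0.50
\quad\Longrightarrow\quad
r^{2} = (0.50)^{2} = 0.25,
\qquad
1 - r^{2} = 0.75 .
```

Under this simplification, roughly three quarters of the over-time variation in either measure is not linearly accounted for by the other.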

Results from replications of published studies suggest that LJKE's claim that both their measure and E&K's measure are valid is wrong
Berry, Fording, and Crofoot (2023) report results when each of BRFH's measure and E&K's measure is substituted for the measure of state policy mood in models from seven published studies that estimate the effect of state policy mood on public policy. We extend this analysis to include LJKE's measure as well as Caughey and Warshaw's (2018; hereafter, C&W's) measure of mass economic liberalism. In the majority of the seven studies (the four by Boehmke and Shipan (2015), Boehmke, Osborn, and Schilling (2015), Hawes and McCrea (2018), and Ojeda et al. (2019)), E&K's measure and LJKE's measure yield starkly different conclusions about the effect of state policy mood. Indeed, in the model replicated for each of these studies, one of the two measures (E&K's or LJKE's) has a coefficient statistically significant at the 0.05 level, and the coefficient for the other measure has the opposite sign and is not statistically significant. This finding is based on a small sample, but when considered alongside the low correlation between E&K's measure and LJKE's measure, we believe it justifies a conclusion that LJKE's claim that both LJKE's measure and E&K's measure are valid cannot be sustained.

A simulation to place observed relationships in perspective
Using a simulation, we find that a nonsensical "shuffled" LJKE measure, constructed by replacing each state's LJKE mood scores with those of a randomly chosen state, would be a slightly better proxy for LJKE's measure than E&K's measure, despite the fact that LJKE deem E&K's measure valid. Moreover, a nonsensical "shuffled" E&K measure, constructed by replacing each state's E&K mood scores with those of a randomly chosen state, would be a substantially better proxy for E&K's measure than the assumed-to-be-valid LJKE measure. We contend that these simulation results are consistent with our claim that E&K's and LJKE's measures of state policy mood lack face validity. Simply put, in our view it strains credulity to believe that both measures are valid when, whichever measure we prefer, if it were not available we would be better off using a nonsensical measure created by randomly shuffling its scores across states than using the other measure, even though this other measure is also presumed valid.
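The logic of the shuffling comparison can be sketched as follows. This is a minimal illustration on synthetic data, not the simulation reported in the Supplementary Material. All names (`mean_proxy_correlation`, `shuffle_across_states`, `measure_a`, `measure_b`) are hypothetical, and two details are assumptions: the randomly chosen donor state is always a different state, and proxy quality is summarized by the mean within-state over-time correlation.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_proxy_correlation(target, proxy):
    """Mean over states of the over-time correlation between target and proxy."""
    return sum(pearson(target[s], proxy[s]) for s in target) / len(target)


def shuffle_across_states(measure, rng):
    """Give each state the series of a randomly chosen *other* state (assumption)."""
    states = list(measure)
    return {s: measure[rng.choice([t for t in states if t != s])] for s in states}


# Synthetic data only: measure_a is dominated by one common trend, while
# measure_b mixes that trend with a second, unrelated one.
rng = random.Random(42)
n_years, n_states = 51, 10
trend_a = [rng.gauss(0, 1) for _ in range(n_years)]
trend_b = [rng.gauss(0, 1) for _ in range(n_years)]
measure_a = {f"state_{i}": [t + rng.gauss(0, 0.5) for t in trend_a]
             for i in range(n_states)}
measure_b = {f"state_{i}": [0.5 * ta + tb + rng.gauss(0, 0.5)
                            for ta, tb in zip(trend_a, trend_b)]
             for i in range(n_states)}

# Average the shuffled-proxy result over many random reassignments.
n_draws = 200
shuffled_mean = sum(
    mean_proxy_correlation(measure_a, shuffle_across_states(measure_a, rng))
    for _ in range(n_draws)
) / n_draws
other_mean = mean_proxy_correlation(measure_a, measure_b)
print(f"shuffled measure_a as proxy for measure_a: {shuffled_mean:.2f}")
print(f"measure_b as proxy for measure_a:          {other_mean:.2f}")
```

Because the toy data are built so that states share a strong common trend, the output says nothing about the real measures; it only shows how a shuffled version of a measure can be compared with a rival measure as a proxy.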

Concluding observations: where do we stand?
Our extended replication analysis involves three indicators of state policy mood (those of E&K, LJKE, and BRFH), as well as C&W's measure of mass economic liberalism, all of which measure each state over a long period of years. This analysis yields the discouraging finding that none of the four measures yields a similar inference to any other measure in a majority of the seven studies: for any pair of measures, a similar inference occurs in at most three of the seven studies. This suggests that if one believes that any of the four measures is valid, one should be very skeptical of a claim that any of the other three measures is valid. This, in turn, leads us to believe that the goal of developing a measure of state policy mood that would be widely accepted as valid remains elusive.
Although LJKE (2023) report some analysis of C&W's measure, we chose not to consider this measure prior to this concluding section because doing so would take us beyond the scope of a reply to LJKE. However, when subjecting LJKE's measure to the battery of face validity tests Berry et al. (2015) use to evaluate E&K's measure, we applied the same tests to C&W's measure of mass economic liberalism. Two findings are notable: (i) like E&K's and LJKE's measures, C&W's measure identifies the South as a liberal region, and (ii) the average over-time correlation between C&W's measure in one state and C&W's measure in another state is nearly as high (0.77) as the average correlation of 0.84 between E&K's measure in one state and E&K's measure in another state.
We encourage scholars interested in state policy mood to reflect on these features of E&K's, LJKE's, and C&W's measures. For example, does the fact that the three measures produce scores that deviate from the conventional wisdom that the South is conservative imply that (i) the measures are capturing a true feature of policy mood in the South, and the conventional wisdom about southern mood is wrong, or (ii) there is some shared element of the data on which the measures rely and/or some commonality in the methods underlying the measures that leads to a similar systematic overestimation of liberalness in the South? Similarly, is an average correlation of about 0.80 between a measure of mood in one state and the measure in another state (i) a signal of systematic measurement error, or (ii) an accurate reflection of the true nature of state policy mood because each state's mood is largely driven by national forces?

Supplementary material. The supplementary material for this article can be found at https://doi.org/10.1017/spq.2023.14.