Expert Bias and Democratic Erosion: Assessing Expert Perceptions of Contemporary American Democracy

ABSTRACT In an important contribution to scholarship on measuring democratic performance, Little and Meng suggest that bias among expert coders accounts for erosion in ratings of democratic quality and performance observed in recent years. Drawing on 19 waves of survey data on US democracy from academic experts and from the public collected by Bright Line Watch (BLW), this study looks for but does not find manifestations of the type of expert bias that Little and Meng posit. Although we are unable to provide a direct test of Little and Meng’s hypothesis, several analyses provide reassurance that expert samples are an informative source to measure democratic performance. We find that respondents who have participated more frequently in BLW surveys, who have coded for V-Dem, and who are vocal about the state of American democracy on Twitter are no more pessimistic than other participants.


Little and Meng challenge the thesis that democracies are eroding globally. Marshaling salient metrics of democratic performance and comparing them with democracy indicators built largely from expert assessments drawn from V-Dem, Little and Meng detect divergence in recent years between what they characterize as "objective" versus "subjective" measures.
They then consider two accounts for how such divergence could arise: (1) would-be autocrats have grown increasingly subtle, channeling their transgressions into actions that "fly under the radar" of objective metrics but nevertheless represent threats to democracy (and, presumably, could eventually manifest as objective erosion); and (2) media have increasingly focused on the prospect of democratic erosion and, correspondingly, the coders who provide assessments for democracy indicators have grown more sensitive to transgressions against democratic norms.
Little and Meng (2023) note that these explanations are not mutually exclusive and that they are open to both. However, they lean toward the latter, noting that "we argue that coder bias likely explains at least some of the discrepancy." We (i.e., most of the coauthors) are part of Bright Line Watch (BLW), an organization that was formed in 2017 specifically to focus attention and energy on the question of whether US democracy faces existential threats. In that sense, we embody exactly the heightened attentiveness to erosion that, by Little and Meng's account, could be driving bias in expert assessments of democracy, at least for the United States. Briefly, if there is a problem here, we might be a part of it.
BLW regularly conducts parallel surveys of two distinct respondent pools. Our "expert" respondents are drawn from political science faculty at all US universities. We also poll a representative sample of the American public assembled by the survey firm YouGov. 1

Olivier Bergeron-Boutin is a research associate in quantitative social science at Dartmouth College. He can be reached at olivier.bergeron.boutin@dartmouth.edu. John M. Carey is the John Wentworth Professor in the Social Sciences at Dartmouth College. He can be reached at john.m.carey@dartmouth.edu. Gretchen Helmke is the Thomas H. Jackson Distinguished University Professor at the University of Rochester. She can be reached at gretchen.helmke@rochester.edu. Eli Rau is a postdoctoral fellow at Vanderbilt University's Latin American Public Opinion Project and a research affiliate at the Chicago Center on Democracy. He can be reached at eli.g.rau@vanderbilt.edu.

This article probes Little and Meng's concerns about how expert bias might operate by leveraging two types of comparisons within our data. The first considers comparisons across our expert pool; the second compares the attitudes of the expert respondents with those from the public sample. Although it is impossible to directly refute the proposition that experts are alarmist relative to some ground truth about the actual state of American democracy, our data allow us to counter several implications of the Little and Meng argument. Specifically, we show that:

• Comparing those experts who regularly self-select into BLW surveys to those experts who only rarely participate, there is little evidence that the former are more alarmist about democracy than the latter.
• Likewise, comparing those experts who are more active on "democracy Twitter" to those who are less engaged or immersed, there is scant support for the implication that the former are more pessimistic about democracy.
• We do not find that experts in BLW's survey sample who also participate in coding for V-Dem (i.e., a principal research consortium whose democracy ratings have pointed to a "democratic recession" in recent years) are more despairing about democracy than BLW experts who do not or would not engage with V-Dem.
• Contrary to the broader implications of Little and Meng's argument, comparisons with the public sample reveal that BLW experts are consistently more optimistic about US democracy overall. We confirm that our expert assessments correlate more closely with Democratic partisans among the public sample than with Republicans and that, although the Democrat-expert alignment has not changed during the past six years, Republican and expert assessments have diverged dramatically.

ARE HIGHLY ENGAGED EXPERTS ALSO MORE PESSIMISTIC?
Unlike existing expert surveys on democracy across countries (e.g., V-Dem, Freedom House, and Polity), our expert pool is drawn entirely from American universities. Our mailing list includes approximately 10,000 unique email addresses, and typically 500 to 1,000 respondents complete the surveys. In any given wave, about 5% to 10% of the discipline participates in our surveys; thus, BLW's expert surveys are completed by a much larger pool of experts than any of the existing global measures of democracy.2 Given how BLW recruits expert respondents, self-selection is a possible source of bias. If experts who participate in our surveys are more concerned about the state of US democracy than those who do not participate, our measures could be systematically overstating the extent to which the discipline as a whole perceives threats to American democracy. Moreover, if political scientists indeed are selecting in and out of participation on the basis of their level of concern, then respondents who participate sporadically, or even only once, should be less pessimistic than those who participate regularly, who constitute a disproportionate share of responses in any given survey.
To test this, we exploited the fact that each BLW expert respondent was assigned a unique participant ID that is stable across survey waves.3 For each survey wave, we computed the mean evaluation of US democracy on a 100-point scale among respondents who participated in only that one survey wave, respondents who participated in a total of two waves, respondents who participated in three waves, and so forth. In figure 1, each facet shows the mean rating on the 0-100 scale item among respondents who participated in 1, 2, 3…14 of the 18 survey waves overall.4 Contrary to the self-selection hypothesis, figure 1 shows no pattern by which frequent responders are more or less sanguine about US democracy than infrequent responders.5
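As a rough illustration of the computation behind figure 1, the grouping can be sketched with pandas; the column names and toy ratings below are hypothetical and stand in for BLW's actual data:

```python
import pandas as pd

# Hypothetical long-format survey data: one row per respondent per wave.
df = pd.DataFrame({
    "participant_id": [1, 1, 1, 2, 2, 3],
    "wave":           [1, 2, 3, 1, 3, 2],
    "rating":         [60, 58, 55, 70, 72, 65],
})

# Count how many waves each respondent participated in...
df["n_waves"] = df.groupby("participant_id")["wave"].transform("nunique")

# ...then average ratings within each participation-frequency group.
mean_by_frequency = df.groupby("n_waves")["rating"].mean()
print(mean_by_frequency)
```

Under the self-selection hypothesis, mean ratings would decline as `n_waves` rises; figure 1 shows no such gradient.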

MEDIA IMMERSION
Little and Meng suggest that immersion in a media environment that is pessimistic about democracy might generate expert-coder bias. Media sources increasingly indicate that democracy is under threat. Democracy experts bathe in this discourse and react to it. The same experts also generate the subjective assessments used to measure erosion. If all of this is the case, then we might expect to see that experts who are more heavily marinated in democracy-alarmist media are particularly pessimistic about democracy. We call this the consumption hypothesis. We agree with Little and Meng that the media-consumption narrative is plausible, although we struggle to imagine a research design that provides a rigorous causal test. Moreover, we note that much of the media coverage about threats to democracy is rooted in the work of academics themselves. As such, media coverage should not be taken as an exogenous force to which political scientists and other experts merely happen to be exposed. Rather, at least part of the concern expressed by the academic world is causally prior to media coverage.
A second possibility is that experts have balanced information diets that neither overstate nor understate the threat to democracy but that the bundle of information that experts produce for public consumption (e.g., academic articles, op-eds, and social media posts) is disproportionately of the "threat-to-democracy" genre. We call this the production hypothesis.6 Data from the 17th wave of BLW surveys, fielded in October 2022, touch on this theme of skewed scholarly production. In an attempt to detect any selection effects that may skew the public face of scholarship on the state of democracy, we asked our expert sample whether and how often they use Twitter. We also asked how often respondents tweeted about issues related to the state of American democracy. Twitter is a relevant platform for this test because of its role as a digital "town square" for academic communities, including the political science community (Bisbee, Larson, and Munger 2022). It is plausible that excessive pessimism regarding the state of American democracy may attract a substantial audience, given the propensity to consume negative news (Robertson et al. 2023; Sacerdote, Sehgal, and Cook 2020).
In our October 2022 survey of 682 political scientists, 40% of the 626 who answered the question reported that they did not use Twitter at all (i.e., non-users), 49% used the application but did not tweet regularly about American democracy (i.e., Twitter consumers), and 11% tweeted about American democracy once a week or more (i.e., democracy tweeters). We asked all of the experts to rate the then-current performance of US democracy on a 100-point scale and also to make projections five and 10 years into the future. If the production hypothesis is correct, we expected to see a group of highly concerned experts who select into frequent public discussions of the state of US democracy and who skew public-facing scholarship in the direction of alarmism. That is, we expected our democracy-tweeter experts to assess American democracy more negatively than those who are "less online." Democratic pessimism among the Twitter-consumer experts would be consistent with the consumption hypothesis. Figure 2 illustrates the democracy ratings as of October 2022 as well as the future assessments of each group. On current assessments of democratic performance, the mean rating among non-users was 65; among Twitter consumers it was 68; and among democracy tweeters it was 66. The difference between Twitter consumers and non-users reached conventional statistical significance (p=0.04), with consumers more optimistic than non-users, contrary to the consumption hypothesis. Projecting into the future, all three groups anticipated democratic erosion. At five and 10 years out, Twitter consumers were the most optimistic, followed by non-users, with democracy tweeters the most pessimistic. Differences between non-users and either of the Twitter-engaged groups never reach statistical significance. The differences between Twitter consumers and democracy tweeters were significant at p=0.04 and p=0.03, respectively. However, we cannot imagine any theoretical account by which a limited amount of Twitter immersion should cause democratic optimism whereas further increasing Twitter immersion should cause pessimism. In summary, the differences that we observed in both current assessments and future projections did not map onto an account by which more Twitter immersion produces more pessimism.
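The group comparisons above amount to simple difference-of-means tests. A minimal sketch using SciPy's Welch t-test, with made-up ratings standing in for the BLW responses:

```python
from scipy import stats

# Hypothetical 0-100 democracy ratings for two respondent groups.
non_users = [60, 62, 58, 65, 70, 55, 63, 61]
consumers = [68, 72, 66, 70, 74, 69, 71, 67]

# Welch's t-test (unequal variances): does mean rating differ between groups?
result = stats.ttest_ind(consumers, non_users, equal_var=False)
diff = sum(consumers) / len(consumers) - sum(non_users) / len(non_users)
print(f"mean difference = {diff:.1f}")
print(f"p-value = {result.pvalue:.4f}")
```

Repeating such pairwise tests across the three groups yields the p-values reported in the text; the data here are invented for illustration only.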
It is important to note that the data presented so far are drawn from survey questions not purposely designed to test Little and Meng's proposition about expert bias. Rather, Little and Meng's intervention opened an important discussion and prompted us to reexamine data collected for other purposes, seeking leverage on this new debate. There could be a pessimism effect big enough to cause the observed decline in certain countries' V-Dem polyarchy indices but one limited enough to evade our imperfect searchlight. We note that the more dire claims about the state of democracy in recent V-Dem reports (e.g., Boese-Schlosser et al. 2022) rely on shifting the unit of analysis from the country to the individual citizen. Thus, recent declines in V-Dem's polyarchy scores for a few large countries (in particular, India, where 17.7% of the world's population lives) overshadow democratic improvements in smaller countries. As Little and Meng observe, the average V-Dem polyarchy score across countries has remained fairly constant since 1990; in fact, the trend line is remarkably similar to that of Little and Meng's proposed alternative measures. Nonetheless, we still may be concerned that a pessimism bias is driving or exaggerating the appearance of backsliding in the countries where V-Dem has registered declining polyarchy scores.
In our most recent BLW survey, conducted in June-July 2023, we sought to determine more directly whether the specific political scientists who generate the expert assessments on which contemporary democracy scholarship is based are systematically biased toward democratic pessimism.7 Specifically, at the end of a BLW expert survey, we included questions, designed in collaboration with Little and Meng, that asked respondents whether they had ever been invited to serve as a coder for V-Dem. We also asked whether they had served (if invited) and about their willingness to serve (if not invited). Of the 544 expert respondents who completed this section of our survey, 484 (89%) had never been invited to serve as V-Dem coders;8 16 (3%) had been invited but did not participate; and 44 (8%) had been invited and served as coders.
Prior to being asked about V-Dem participation, our expert respondents had rated on a 100-point scale the quality of democracy in the United States as well as in a random subset of six other countries drawn from the following list: Brazil, Hungary, India, Italy, Israel, Kenya, Mexico, Peru, the Philippines, Poland, Turkey, and the United Kingdom.
Expert-coder bias could operate through either V-Dem invitations (i.e., experts who are invited are more pessimistic than those who are not invited) or self-selection (i.e., experts who choose to participate are more pessimistic than those who decline). Figure 3 shows the average ratings for each country. The left panel compares ratings from expert respondents in our sample who had and had not been invited by V-Dem. The right panel compares ratings from those willing versus those unwilling to participate with V-Dem.
We find no evidence of bias toward pessimism at either stage of selection into the V-Dem coder pool, invitation or participation. The average democracy ratings among those who were invited to code for V-Dem were higher than the average ratings among uninvited experts for 12 of 13 countries (Turkey being the only exception). The difference reached statistical significance for Brazil and Peru. With regard to participation, the average ratings were higher among those willing to code for V-Dem than among those who were unwilling for all 13 countries, with statistically significant differences for the United States, Brazil, Mexico, and Poland. Overall, those political scientists whom V-Dem targets and those inclined to participate if asked appeared more, not less, sanguine about democracy around the world than experts outside of that V-Dem coder pool.
There are three important caveats to our analytical strategy. First, those experts who code do so for the specific countries for which they have the greatest expertise. Unfortunately, our survey sample does not provide a sufficient number of "direct hits" (i.e., V-Dem coders who also rated their specific country of expertise in our survey) to allow comparison with ratings from the broader set of political scientists. Second, our sample of political scientists may be vulnerable to self-selection bias; it is possible that experts who declined to take part in our survey hold systematically different attitudes from those who did.9 Third, it also is possible that political science as a whole is unduly pessimistic about democracy across the world, in which case the baseline against which we are comparing V-Dem coders does not reflect a ground truth about democracy around the world. If such a bias were new or had increased in recent years, it could cause a universal shift in coder standards, in Little and Meng's formulation, that our approach might not detect.

EXPERT VERSUS PUBLIC ASSESSMENTS OF DEMOCRACY
In addition to our expert surveys, BLW also routinely polls the general public about the state of democracy in the United States and, occasionally, other countries. Comparing recent rankings of democracy in 13 countries, expert assessments of democratic performance exhibit far less compression than those of the public and offer more precise estimates (with lower variance as a group), as we would expect and hope if the experts are actually better informed.
Figure 4 shows mean democracy ratings on the 100-point scale, with 95% confidence intervals, for 13 countries plus the United States as of October 2022. The rank ordering of countries by experts and the public is almost identical, with North Korea at the bottom and Canada at the top. However, mean expert assessments range from 2 to 84, whereas the public's assessments range from 18 to 59. The expert assessments are far more precise, with country-level standard deviations ranging from 6.7 (North Korea) to 20.7 (Israel), compared with 21.8 (Great Britain) to 26.6 (Israel) for the public.
Returning our focus to the United States, figure 5 shows the time series from 2017 through 2022 of mean responses, with 95% confidence intervals, on the same 100-point scale from our expert respondents (shown in green) and the public (shown in purple).10 If Little and Meng's thesis about expert bias is correct, we might expect experts to be more pessimistic than the public, or at least that the relative optimism of experts, compared to the public, would have declined. In fact, relative to the public, our experts consistently rate American democracy approximately 10 points higher. Neither do we observe any evidence that experts are increasingly pessimistic, relative to the public as a benchmark, over time. Indeed, BLW's most recent survey, conducted in June-July 2023, shows the largest optimism gap yet between experts and the public.
Next, we consider how expert ratings compare to those of the public across a range of democratic principles. BLW regularly surveys both groups on 30 principles related to elections and voting; citizen rights and protections; and accountability, institutions, and norms. A full description of each principle is included in the Appendix. Figure 6 illustrates the proportion of experts (green circles) and the public (purple squares) who, in June-July 2023, rated the United States as fully or mostly meeting each standard.
The pattern of expert discernment and precision relative to the general public is similar to what is observed in figure 4. Expert assessments range from 9% (i.e., districts not biased) to 90% (i.e., government statistics not politically influenced). The range in these responses highlights the value of expert-coded measures of specific, concrete variables. Expert ratings of American democracy overall tend to cluster between 65 and 70 on the 100-point scale. However, when the experts are asked about specific components, their assessments range from widespread confidence (e.g., in election integrity and many civil liberties) to near consensus on principles not met (e.g., unbiased districts or adherence to norms of mutual respect and cooperation). By contrast, the public's assessments across the 30 standards of performance are relatively clustered. Reviewing the percentage of respondents who agree that the United States fully or partially meets each standard, the range is from 15% to 56%, with 20 of the 30 items falling between 25% and 50%.11

One interpretation of figure 6 is that many respondents in our public sample have a general sense of the state of democracy and work backward from that overall rating to assess the individual indicators of performance on which we query them. There are notable exceptions, particularly regarding issues for which there have been salient elite cues, such as fraud-free elections and equal voting rights. Nonetheless, the broad pattern appears to be reasoning from the general to the specific. By contrast, the experts, informed and opinionated, address each standard independently.

EXPERT ASSESSMENTS AND PARTISANSHIP
A final possible source of bias in expert assessments (albeit one that was not directly raised by Little and Meng) is that the BLW expert sample may be systematically more aligned with one partisan group than with another. BLW does not ask our expert respondents about their partisanship, but we do ask respondents in our public sample, which allows us to correlate expert and partisan assessments across our 30 democratic principles, wave by wave. Among the shifts in these measures, we note a recovery in confidence among experts that politicians concede defeat (from a low of 34% to 57%). The correlation between partisan Independents and our experts, which held steady into 2020, eroded more gradually during the next two years, although never as sharply as among Republicans, and partially recovered in late 2022 and 2023.
There are at least two competing interpretations of these patterns. The more obvious is that our sample of experts likely skews Democratic. This is unsurprising because American university faculty members are well known to skew overwhelmingly Democratic (Langbert and Stevens 2021). This pattern predates the beginning of our time series and persists throughout the study. Another possibility, albeit one that is difficult to establish systematically, is that the partisan groups differ in the soundness of their assessments of performance on our 30 democratic principles, with Democrats being more accurate than Republicans and Independents. We note that the Democrat-expert correlation remains steady throughout, whereas high-salience political events since 2019 appear to have breached any common ground that our experts shared with Republican partisans on democratic performance.
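The wave-by-wave expert-partisan correlations described above can be sketched as follows; the proportions are invented for illustration (the real analysis uses all 30 indicators per wave, for each partisan group):

```python
import numpy as np

# Hypothetical proportions rating each of five (of the 30) standards as met,
# for experts and for one partisan group, in a single survey wave.
expert_share   = np.array([0.90, 0.09, 0.65, 0.40, 0.75])
partisan_share = np.array([0.55, 0.20, 0.45, 0.30, 0.50])

# Pearson correlation across indicators for this wave; repeating this per
# wave and per partisan group yields the time series plotted by BLW.
r = np.corrcoef(expert_share, partisan_share)[0, 1]
print(f"wave-level correlation: r = {r:.2f}")
```

A high r indicates that the partisan group ranks the indicators similarly to the experts, even if its overall level of optimism differs.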

CONCLUSION
Our data provide a unique opportunity to delve into Little and Meng's concerns about expert bias; however, it is important to highlight the limitations of our conclusions. BLW began conducting surveys only in 2017, after the period that Little and Meng identify as the potential onset of the bias. Moreover, BLW data focus mostly on the United States, whereas Little and Meng focus on cross-national democracy indices. Finally, nothing presented here directly refutes the Little and Meng proposition that our expert assessments are increasingly biased toward pessimism relative to a ground truth. However, relative to the discipline of political science more generally and relative to the public from 2017 to 2022, we find no evidence of a particular pessimism among our experts. The most highly engaged experts, whether with BLW surveys, Twitter, or V-Dem, evaluate American democracy about the same as less-engaged experts.
The experts are more optimistic than the public as a whole and expert assessments-of both specific elements of American democracy and overall democratic performance in other countries-display properties of discernment and precision that are reassuring for a highly informed sample.

Figure 2: Ratings of U.S. democracy by experts on a 0-100 scale. Figure shows mean values by Twitter usage. Vertical error bars are 95% confidence intervals. Source: @BrightLineWatch, October 2022.

Figure 4: Figure shows mean rating by experts and the public. Data from Wave 17 (Oct. 2022). Source: @BrightLineWatch, October 2023.

Figure 5: Ratings of U.S. democracy by the public and experts on a 0-100 scale. Figure shows mean values across 19 survey waves. Vertical error bars are 95% confidence intervals. Source: @BrightLineWatch, July 2023.

Figure shows the correlation coefficient between the proportion of experts and the proportion of different partisan groups rating the performance of US democracy as positive on 30 different indicators. Each point is the correlation coefficient for a given wave, across those 30 indicators.

Comment and Controversy: Special Issue on Democratic Backsliding