Encouraged to Cheat? Federal Incentives and Career Concerns at the Sub-national Level as Determinants of Under-Reporting of COVID-19 Mortality in Russia

Abstract This article investigates the determinants and consequences of manipulating COVID-19 statistics in an authoritarian federation using the Russian case. It abandons the interpretation of the authoritarian regime as a unitary actor and acknowledges the need to account for a complex interaction of various bureaucratic and political players to understand the spread and the logic of manipulation. Our estimation strategy takes advantage of a natural experiment where the onset of the pandemic adjourned the national referendum enabling new presidential terms for Putin. To implement the rescheduled referendum, Putin needed sub-national elites to manufacture favourable COVID-19 statistics to convince the public that the pandemic was under control. While virtually all regions engaged in data manipulation, there was a substantial variation in the degree of misreporting. A third of this variation can be explained by an asynchronous schedule of regional governors’ elections, winning which depends almost exclusively on support from the federal authorities.

effect of political regimes on pandemic management. 2 While some studies claim the existence of a so-called 'autocratic advantage' in dealing with COVID-19 (Cepaluni, Dorsch and Branyiczki 2021; Karabulut et al. 2021), others argue that this statistical 'advantage' is a product of the deliberate under-reporting of official pandemic statistics by autocratic governments (Adiguzel, Cansunar and Corekcioglu 2020; Annaka 2021; Badman et al. 2021; Kapoor et al. 2020; Kennedy and Yam 2020; Knutsen and Kolvani 2022; Neumayer and Plümper 2022).
This under-reporting is frequently identified by comparing reported COVID-19 deaths with overall excess mortality, the latter being a substantially more reliable measure (COVID-19 Excess Mortality Collaborators 2022; Karlinsky and Kobak 2021; Whittaker et al. 2021). This simple method of detecting COVID-19 mortality manipulation gives us a unique opportunity to look deeper into the mechanisms that explain the extent of data manipulation in authoritarian regimes.
Autocracies are not unitary actors. Rather, their policies are the outcome of a complex interaction of numerous players with partly contradictory interests. Data published in such regimes are produced within their bureaucracies, which are composed of agencies pursuing their own goals (Herrera and Kapur 2007). 3 Thus, to understand the patterns of data manipulation, we need to study this interaction and the interests of the actors involved. Previous literature has acknowledged the importance of bureaucratic incentives; yet, the primary focus has been on how bureaucracies misinform their principals and which tools authoritarian regimes establish to address this problem (see, for example, Wallace 2016; Zhou and Zeng 2018). The settings where autocrats create incentives encouraging data manipulation, and the consequences of these incentives for reported information, have received much less scholarly attention (for a recent exception, see Tang, Wang and Yi 2022). This article aims to close this gap.
We do so by studying the mechanisms leading to the manipulation of COVID-19 data by regional authorities in Russia in the early months of the pandemic in 2020. Russia is among the countries heavily affected by the COVID-19 crisis and against which accusations of data manipulation have been made relatively often. 4 At the same time, it has a large bureaucracy plagued by severe principal-agent problems (Libman and Rochlitz 2019) and a history of responding to incentives set by the central government through massive data manipulation (Kalinin 2018). We show that the set of incentives generated by the federal government triggered less accurate reporting of COVID-19 mortality in the Russian regions. To identify the effect of incentives, we rely on cross-regional variation in COVID-19 mortality reporting. Bureaucracies in different regions respond to the central government's incentives with different intensity. Regions where governors believe they face larger political risks and could therefore lose their office should be more willing to appease the central government and, thus, to under-report COVID-19 mortality (Egorov and Sonin 2011).
We use a specific feature of the Russian political system, asynchronous election cycles in individual regions, as a source of exogenous variation, which leads to the natural randomization of political risk for incumbent governors across regions (Akhmedov and Zhuravskaya 2004). 5 The proximity to the next elections poses a risk to the survival of the governor in their office but is orthogonal to characteristics determining the spread of the pandemic in the region; hence, this setting enables us to establish a causal relationship between political risk (and, hence, susceptibility to the federal government's incentives) and under-reporting of COVID-19 mortality.
2 See, for example, Ang (2020), Bayerlein et al. (2021), Cassan and Van Steenvoort (2021), Frey, Chen and Presidente (2020), Greer et al. (2020), Nelson (2021), San, Bastug and Basli (2021) and Stasavage (2020).
3 In China, for example, local bureaucracies attempted to hide the spread of COVID-19 at the beginning of the pandemic (Ding and Lin 2021; Gu and Li 2020).
The article is structured as follows. The second section discusses the theoretical contribution and describes the Russian setting during the early phase of the COVID-19 pandemic. The third section presents the data and main variables. The fourth section reports the main findings on the causal effect of career concerns on the under-reporting of COVID-19 mortality. Finally, the fifth section presents some tentative results on the potential consequences of exposed under-reporting for trust in governmental statistics and self-isolation.

Data Manipulation in Authoritarian Regimes
Authoritarian regimes systematically manipulate information and, in particular, statistical data. Some regimes, such as the Soviet Union or North Korea, suppress a substantial amount of data or even publish deliberately wrong information (Eberstadt 2007; Jasny 1950). Others are more open but still inclined to manipulation. For example, Magee and Doces (2015) and Martinez (2022) provide evidence of the manipulation of economic growth statistics by autocracies. Economic news, election outcomes, media publications and even social network posts are all subject to manipulation (see, for example, Bader and van Ham 2015; Hale 2018; Harvey 2020; King, Pan and Roberts 2013; King, Pan and Roberts 2017; Moser and White 2017; Myagkov, Ordeshook and Shakin 2009; Pearce and Kendzior 2012; Rozenas and Stukal 2019). This manipulation serves several goals. First, when directed at domestic audiences, it reduces the likelihood of public protests by preventing coordination among the opposition and possibly misleading the public into perceiving the regime as competent and benevolent (Chen and Xu 2017b; Hollyer, Rosendorff and Vreeland 2015). Secondly, manipulating statistics matters for the international posture of the regime, for example, by making other countries overestimate the regime's stability and power, or raising its attractiveness to foreign business (Aragao and Linsi 2022).
Which political mechanisms, however, lead to the production of fake data? In some cases, authoritarian regimes explicitly order their subjects to manipulate information. 6 In many cases, however, the mechanism is more complex: the authoritarian leadership creates incentives for data fabrication but leaves the details on the actual means and the extent of this manipulation to be decided by individual bureaucracies. Under these conditions, information manipulation will vary across individual agencies and branches of bureaucracy: while some of them 'underperform', thus remaining relatively 'honest', others overindulge in information manipulation to an extent surpassing the central government's desire. 7 While this mechanism appears to be plausible, there is hardly any research on how misinformation emerges from the regime nudging its bureaucrats towards it through a specific formal or, even more importantly, informal incentive structure.
The role of bureaucracies is acknowledged in a different strand of the literature on data manipulation in autocracies, one that looks at regimes at the 'receiving end' of misinformation, failing to obtain accurate data on the real political, economic and social developments in the country. Fundamentally, there is a large literature showing the severe problems that authoritarian regimes face in gathering information from their subjects (Kuran 1997; Wintrobe 1998) and studying the various tools autocracies use to improve information acquisition (Anderson et al. 2019; Chen, Pan and Xu 2016; Chen and Xu 2017a; Dimitrov 2014; Egorov, Guriev and Sonin 2009; Huang, Boranbay-Akan and Huang 2019; Jiang and Wallace 2017; Lorentzen 2014; Tan 2014). Bureaucracies of authoritarian regimes that pursue their own strategic goals, whether for promotion or the avoidance of punishment, are eager to embellish their achievements or hide their failures. Bureaucratic hierarchies are, generally speaking, a natural environment for bottom-up information manipulation due to widespread principal-agent problems (Gailmard and Patty 2012), but in authoritarian regimes, information distortions tend to be especially severe due to the lack of free media or public accountability as alternative sources of information. The research on bureaucratic information manipulation in autocracies so far has focused primarily on the case of China (Fisman and Wang 2017; Merli and Raftery 2000; Wallace 2016; Zhou and Zeng 2018), where governmental incentives for local bureaucrats not only played an essential role in ensuring the high economic performance of the regime (Xu 2011), but also made data manipulation very attractive (Chen et al. 2019). 8
We are interested, however, in settings where the bureaucracies' objective is not to hide information from their political principals, but to manipulate the publicly available information in line with the goals of the principals, or at least what bureaucrats perceive to be the goals of the principals. The fundamental challenge we face is, then: (1) to find evidence of data manipulation; and (2) to check whether bureaucrats indeed respond to specific political incentives. As already mentioned in the introduction, the COVID-19 pandemic provides us with an instrument to solve the first problem (comparison of official COVID-19 mortality and overall excess mortality). To deal with the second problem, we leverage the heterogeneous response of individual bureaucracies to central incentives as the key element of our identification strategy and look for systematic patterns between data manipulation and the sensitivity of regional bureaucracies to incentives. We treat proximity to local elections in individual regions as the main trigger of this heterogeneity.

Proximity to Sub-national Elections and Bureaucratic Response
Approaching elections are well known to influence the behaviour of both politicians and bureaucrats. Since Nordhaus (1975), the voluminous political business-cycle literature has focused on how proximate elections lead incumbents to adjust fiscal and macroeconomic policies to create visible short-term economic results and sway voters in their favour (for a review, see Drazen 2000; Dubois 2016; Philips 2016). Its argument is straightforward: incumbents introduce policies attractive to voters, such as increasing public spending, prior to elections at the expense of policies implemented in the immediate aftermath of the elections. While the original political business-cycle literature was developed for fiscal policies, similar arguments were later used to explain the temporal dynamics of a wide variety of other policies in democracies (Ahuja 1994; Berdejó and Yuchtman 2013; Bracco 2018; Canes-Wrone and Shotts 2004; Marinov, Nomikos and Robbins 2015; Nanes 2017; Potrafke 2019; Shmuel 2021; Vadlamannati 2015).
It is, however, less clear if this logic applies in an authoritarian context (Pepinsky 2007; Shmuel 2020). Elections in autocracies are generally less important than in democracies and thus provide smaller incentives for leaders to adjust their policies. Term limits for the tenure of bureaucrats and politicians can, however, produce effects similar to elections. For instance, in the case of China, recent literature documents the existence of a sort of political business cycle tied to the term limits of local officials (Cao, Kostka and Xu 2019; Chen and Zhang 2021; Guo 2009). The core difference between this bureaucratic political cycle and the political business cycle in a democracy is that officials have incentives to 'please' their superiors (or the authoritarian leaders), rather than their electorate.
In Russia, formally, regional governors hold their position for a five-year period, after which they have to stand for re-election. Despite the de jure direct election of governors, in practice, the federal government almost always determines whether a governor stays in power or is replaced (Golosov 2018). Therefore, governors are likely to behave more like appointed bureaucrats than elected politicians. From the point of view of the bureaucratic political-cycle argument, they should focus on pleasing the federal centre towards the end of their tenure and be less concerned about it if the end of their tenure is farther away. 9 Sidorkin and Vorobyev (2018, 2020) show that proximity to the end of the term increases the levels of predation among Russian governors, who potentially expect to lose their position, and makes them more willing to engage in the acquisition of votes for pro-Kremlin candidates at federal elections. We expect a similar logic to apply to the behaviour of governors during the pandemic.
Proximity to the end of the term provides an exogenous variation in exposure to political risk for individual governors. The election cycles in Russian regions mostly originate from the historical precedent of the 1990s, when individual regions had substantial freedom in determining their political systems, including the timing of regional elections (Gel'man 1999; Hale 2003; Sharafutdinova 2006). The governors' elections were introduced in different regions at different points of time between 1991 and 1996; the terms of governors were further influenced by region-specific political changes and occasional early resignations. This historical variation is likely to be orthogonal to any characteristics of the regional governors occupying these positions in 2020 and thus to other factors influencing governor-specific responses to the pandemic.
Perception of imminent political risk can trigger numerous types of political responses. In Russia, however, data manipulation is, as the next section shows, a particularly likely reaction on the side of bureaucrats.

Russian Bureaucracy and Data Manipulation
The Russian case presents us with an excellent opportunity to study bureaucratic data manipulation. On the one hand, Russia is a large country with extreme heterogeneity in regional economic, political and cultural conditions. On the other hand, under Vladimir Putin, Russia has developed into a consolidated authoritarian regime, where media and civil society are heavily constrained in their ability to report on local conditions openly (Gel'man 2015), offering bureaucrats a free hand in faking data.
Two features of the political organization of the Russian state gave rise to what one can refer to as a real culture of data manipulation. First, Russia is an electoral autocracy: elections at the regional and federal levels are conducted regularly and are important for the legitimation of the Russian regime. One of the crucial tasks of regional governors is to ensure electoral success: both their own and that of federal pro-regime candidates and parties. The share of votes of pro-Kremlin candidates serves as a key criterion for evaluating regional governors from the perspective of the federal administration (Gorokhov 2017; Reuter and Robertson 2012; Rochlitz 2020). In Russia, the task of ensuring favourable election outcomes is frequently achieved through electoral fraud (Enikolopov et al. 2013; Harvey 2016; Myagkov, Ordeshook and Shakin 2005; Skovoroda and Lankina 2017). Manipulating elections requires the active participation of bureaucrats at all levels, and electoral fraud constitutes a casual routine for numerous Russian state officials (Forrat 2018; Frye, Reuter and Szakonyi 2019). It stands to reason that bureaucrats who are accustomed to electoral fraud will not hesitate to manipulate data in other settings as well.
Secondly, the Russian central government heavily relies on quantitative indicators to monitor its bureaucracy. The lion's share of the salaries of Russian bureaucrats is constituted by a performance-based bonus, which is paid depending on over-fulfilling a number of quantitative indicators set by the higher-level bureaucracies. Over time, the quantitative indicators in virtually all branches of bureaucracy have become more numerous and complex (Schultz, Kozlov and Libman 2014). In addition, Russian bureaucrats are subject to regular checks by numerous controlling agencies, which again concentrate on formal regulations and quantitative data produced by bureaucracies. State officials in many agencies consider the ability to carefully fulfil all requirements for the paperwork to be the essential characteristic of performance, more important than the actual tasks of the bureaucracy (Paneyakh 2014). As a result, Russian bureaucrats systematically fake data to fulfil formal requirements, as well as to avoid inspections and punishments (Kalgin 2016). Data manipulation is also widespread in the healthcare sector (Chernov and Sornette 2016).
Thus, for the Russian bureaucracy, data manipulation appears to be a routine, rather than an exceptional, practice. From this point of view, there are no reasons to expect the COVID-19 pandemic to have been met with a different set of tools than any other challenge facing the Russian bureaucracy. However, the direction and the scope of manipulation depend on the particular structure of incentives that Russian regional bureaucrats face in a specific situation.

COVID-19 in Russia, the Referendum and the Career Incentives of Russian Bureaucrats
The start of the pandemic in Russia presented Putin's regime with a challenge. For 2020, Putin had scheduled a major constitutional reform. While the amendments to the constitution were numerous, probably the most important one was that Putin would receive the right to run for presidential office after the expiration of his current term in 2024, which would otherwise be impossible due to constitutional term limits. Although the amendments could have been adopted by a simple parliamentary decision, Putin decided to turn the change of constitution into a major showcase of public loyalty to his regime, announcing a national referendum. Initially, the referendum was scheduled for April. However, the spread of the SARS-CoV-2 virus made the feasibility of the referendum questionable. Putin was forced to postpone the referendum and decided to implement it over a seven-day period from 25 June until 1 July (Pomeranz and Smyth 2021; Teague 2020).
The feasibility of the new referendum date depended upon the development of the pandemic. Organizing the referendum at the peak of the spread of the new virus would not only reduce the ability of the referendum to boost legitimacy but could even result in public disapproval of the carelessness of the government. The high perceived risk of contracting the virus at a mass public event would also severely reduce the turnout. Thus, reducing the contagion rates, or at least convincing the population that the pandemic was under control, became the key task of the regime (Blackburn and Petersson 2021).
During the pandemic, Putin refrained from personally introducing unpopular measures (like lockdowns); instead, he transferred the authority to deal with the pandemic to regional governors, making them de facto responsible for containing the virus (Åslund 2020; Hartwell, Otrachshenko and Popova 2021). Given the arguments outlined earlier, it appears plausible that governors responded to the COVID-19 challenge with systematic data manipulation, as they do with other informal tasks set by the federal centre, such as ensuring favourable election outcomes (see Busygina and Filippov 2021).
The literature has already provided the first evidence of the 'culture of silence' (Shok and Beliakova 2020) and manipulation of COVID-19 mortality and contagion data in Russia (Belianin and Shivarov 2020; Kobak 2021). We conjecture that a major share of data manipulation is likely to originate from the regional level and result from the informal incentives faced by Russian governors. Data manipulation is determined by two conditions: the importance of suppressing the COVID-19 data for the federal centre and the individual political situation of the governors. This leads to the following two expectations:
- First, some governors should be more inclined to care about the informal federal objectives than others. In particular, in line with the reasoning of the previous section, governors who perceive their situation as risky (that is, face proximate elections) should be more likely to attempt to please the central government by excessively manipulating the data; for governors with a stronger political position (that is, more distant elections), excessive manipulation is less critical.
- Second, data manipulation should be stronger in the months preceding the referendum, when it was essential for the regime to show that the pandemic was on the decline. After the referendum, acknowledging the spread of the pandemic became less of a problem for the Russian regime. Therefore, regional governors should also be less inclined to manipulate data. Since the national referendum took place in all regions at the same time, its timing is orthogonal to region-specific characteristics.
These arguments guide the remaining part of this article.

Data

Explanatory Variable: Measuring Political Risk
In what follows, we present the key variables of our study. 10 We start with the proxy of political risk, that is, the key factor in the susceptibility of governors to federal incentives. As already mentioned, we measure the individual exposure of a governor to political risk by looking at the proximity to the upcoming governor's election. Since governors' elections in Russia follow an asynchronous electoral schedule, the arrival of an exogenous shock, such as the COVID-19 pandemic, automatically splits all the regions into two categories: regions with a governor in the first half of their term; and regions with a governor in the second half of their term. Governors in the second half of their term are expected to be more concerned with their political future because their performance is under closer scrutiny from the federal centre, which eventually decides the governor's fate as the election time arrives.
For April 2020, we identify forty-three out of a total of eighty-five regions where governors were in the second half of their term. We construct our main explanatory variable, Elections approaching, as a dummy that equals 1 when the election of the regional governor is scheduled for the years 2020-22 and 0 for governors with elections in the years 2023-24 (forty-two regions). Additionally, similar to Pulejo and Querubín (2021), we use a continuous measure of the proximity to the upcoming elections measured in full years. 11 Since the governor's term is for five years, our alternative measure takes values from 0 for governors with elections in 2020 to 4 for those who have to stand for re-election in 2024. Figure 1 presents the distribution of regional governors in Russia according to the period remaining until re-election, and Map 1 shows how these regions are spread across the Russian territory.
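The two proximity measures described above can be illustrated with a minimal sketch. The function names and the region labels are hypothetical and serve only as an illustration; the cut-offs (2020-22 versus 2023-24) and the five-year term are taken from the text.

```python
TERM_YEARS = 5        # governors serve five-year terms (from the text)
PANDEMIC_YEAR = 2020  # reference year: onset of the pandemic


def years_to_election(election_year: int, reference_year: int = PANDEMIC_YEAR) -> int:
    """Continuous measure: full years until the next scheduled election (0 to 4)."""
    years = election_year - reference_year
    if not 0 <= years < TERM_YEARS:
        raise ValueError(f"election year {election_year} is outside the 2020-24 cycle")
    return years


def elections_approaching(election_year: int) -> int:
    """Dummy: 1 for elections scheduled in 2020-22, 0 for elections in 2023-24."""
    return 1 if years_to_election(election_year) <= 2 else 0


# Hypothetical regions and election years, for illustration only.
schedule = {"Region A": 2020, "Region B": 2022, "Region C": 2023, "Region D": 2024}
for region, year in schedule.items():
    print(region, years_to_election(year), elections_approaching(year))
```

Because all elections in a given year fall on the same day, the year of the next election is sufficient to construct both measures in this sketch.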
Supplementary Appendix (SA) C provides balancing tests to show that both our variables for the proximity to upcoming elections are not correlated with regional characteristics, such as income, Gini index, the share of professional education, urbanization rate, population size, life expectancy, vote share for Putin in 2018 (see Table C1 in SA C) and regional political institutions (see Table C2 in SA C).

Dependent Variable: Measuring the Under-Reporting of COVID-19 Mortality
As already discussed in the introduction, we capture the degree of under-reporting of the COVID-19 pandemic by comparing the official data on COVID-19 mortality and the data on overall excess mortality (published at a later point in time). Our analysis looks exclusively at mortality from COVID-19 and not infection rates for two reasons. First, the cross-country evidence suggests that the deliberate misreporting of mortality data from COVID-19 was much more common than the misreporting of infection rates (see, for example, Balashov, Yan and Zhu 2021). 12 In Russia, this also seemed to be the case because the official mortality rates were extremely low in international comparison but, at the same time, Russia was ranked third in terms of the number of infections worldwide, according to the official data. 13 This striking discrepancy between high infections and low mortality received great attention in Russian public discourse and was even labelled 'a Russian miracle' by the official media representatives of the Russian Coronavirus Information Center. 14
The second reason to focus on mortality statistics follows from our method of identifying data manipulation, which uses excess mortality as a reliable measure of the true toll of the pandemic; no such non-manipulable benchmark exists for infection rates. Excess mortality has become widely recognized as a substantially less biased measure that can account for undetected and unreported COVID-19 cases (Beaney et al. 2020; Vestergaard and Mølbak 2020). In the Russian case, overall mortality has been traditionally more reliable than the often manipulated disaggregated mortality by causes (Danilova et al. 2016; Lysova and Shchitov 2015) because registering an act of death has important legal implications. Russian bureaucrats would face enormous difficulties if they tried to conceal the fact of death (and it would be immediately discovered by the citizens). The cause of death, on the other hand, can be easily manipulated since misspecifying a cause of death on a death certificate has no particular implications, with only a few exceptional circumstances, such as an investigation of a medical error. The surviving members of the family often pay little attention to it (for details, see SA A3). The key variables of our analysis are constructed in the following way.
10 Summary statistics of the variables are presented in Table B1 in SA B.
11 Proximity is measured in full years because all the governors' elections in each year take place on the edinyi den golosovaniya ('Single Elections Day'), that is, the second Sunday of September.
12 We suppose that the main reason for this was the general public being substantially more sensitive to death numbers as compared to the number of infections, especially if the official media were trying to equate the new virus to seasonal diseases, such as flu.

Official COVID-19 mortality
We employ the data published at stopcoronavirus.rf, the government-operated website established in the first weeks of the pandemic, reporting real-time data on infections and mortality. The website was widely advertised on national television and the internet, including the leading social platforms. Importantly, stopcoronavirus.rf was recognized by the authorities as the only legitimate source of statistical information on the COVID-19 pandemic in Russia. Such a status implied that publishing any alternative estimates would be classified as 'false information of public interest, shared under the guise of fake news' (Sherstoboeva 2020), and penalized with up to a five-year sentence or a heavy fine (up to 300,000 roubles or $4,200) under a newly introduced amendment to the defamation law. 15 SA A2 provides information on the process of COVID-19 mortality statistics collection in Russia.
We construct the main variable for official COVID-19 mortality as the ratio of officially reported deaths from COVID-19 to the average all-cause mortality in the respective months over the previous three years:

$$\text{Official COVID-19 mortality}_{it} = 100 \times \frac{\text{Reported deaths from COVID19}_{it}}{\text{Past deaths from all causes (average)}_{it}}, \tag{1}$$

where $i$ and $t$ indicate the region and time period. The period is either the three pandemic months before the referendum (April-June) or the three months after it (July-September); the latter are used for testing the effect of political risk on under-reporting in the absence of federal incentives. 16 Data for $\text{Reported deaths from COVID19}_{it}$ are collected from the official website, stopcoronavirus.rf. Data on average deaths from all causes in the previous three years come from the Federal State Statistics Service (Rosstat). 17 Map 2 presents the spatial allocation of official COVID-19 mortality across Russia's territory before the referendum (April-June), indicating substantial regional heterogeneity in the intensity of the virus outbreaks.
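As a minimal sketch of the ratio defined above, the following function computes official COVID-19 mortality for one region and period. The function name and the input numbers are hypothetical, and the scaling by 100 is an assumption made here so that the measure is on the same scale as Equations 2 and 3.

```python
def official_covid_mortality(reported_covid_deaths: float,
                             past_deaths_by_year: list[float]) -> float:
    """Reported COVID-19 deaths per 100 past all-cause deaths in the same months.

    past_deaths_by_year holds all-cause deaths in the respective months for
    each of the previous three years; their mean is the baseline.
    The factor 100 is an assumption mirroring Equations 2 and 3.
    """
    if not past_deaths_by_year:
        raise ValueError("need at least one baseline year")
    past_average = sum(past_deaths_by_year) / len(past_deaths_by_year)
    return 100 * reported_covid_deaths / past_average


# Hypothetical region: 30 reported COVID-19 deaths in April-June 2020,
# against all-cause deaths in April-June of the three preceding years.
print(official_covid_mortality(30, [2900, 3000, 3100]))
```

Dividing by past all-cause deaths rather than by population makes the measure directly comparable with excess mortality, which uses the same denominator.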

Excess mortality
First, we compute the number of excess deaths as the difference between the number of current deaths from all causes and the average number of deaths from all causes in the respective period over the last three years. 18 We are interested only in the positive number of excess deaths, as it manifests the actual death toll of the pandemic. We construct the variable for Excess mortality as denoted in Equation 2:

$$\text{Excess mortality}_{it} = \begin{cases} 100 \times \dfrac{\text{Excess deaths}_{it}}{\text{Past deaths from all causes}_{it}}, & \text{Excess deaths}_{it} > 0 \\ 0, & \text{Excess deaths}_{it} \le 0 \end{cases} \tag{2}$$

where $i$ and $t$ again indicate the region and time period, respectively, and $\text{Past deaths from all causes}_{it}$ is the average number of deaths from all causes in the respective period over the last three years, as in Equation 1.
15 The law was introduced at the very start of the pandemic in Russia (31 March 2020). For the full text (in Russian), see the official Russian legal-informational portal, available at: http://publication.pravo.gov.ru/Document/View/0001202004010073
Juxtaposing the two mortality measures, we observe that excess mortality in the pandemic months before the referendum was significantly higher than official COVID-19 deaths in most of the regions (eighty-one regions out of eighty-five). The spatial distribution for the excess mortality is also different, as illustrated by Map 3.
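The construction of excess mortality and its comparison with the official figure can be sketched as follows. The function name and all numbers are hypothetical; the truncation at zero follows the statement that only positive excess deaths are of interest.

```python
def excess_mortality(current_deaths: float, past_deaths_by_year: list[float]) -> float:
    """Positive excess deaths per 100 past all-cause deaths, else 0."""
    if not past_deaths_by_year:
        raise ValueError("need at least one baseline year")
    past_average = sum(past_deaths_by_year) / len(past_deaths_by_year)
    excess_deaths = current_deaths - past_average
    if excess_deaths <= 0:
        return 0.0  # negative or zero excess is not attributed to the pandemic
    return 100 * excess_deaths / past_average


# Hypothetical region: all-cause deaths in April-June 2020 versus the
# same months in the three preceding years.
baseline = [2900, 3000, 3100]
excess = excess_mortality(3300, baseline)
official = 1.0  # hypothetical official COVID-19 mortality on the same scale

print(excess, official, excess > official)
```

A region where the excess measure exceeds the official one, as in this toy example, is the kind of case counted among the eighty-one regions mentioned above.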
Importantly, the publication of all-cause mortality for May, the first month when the death toll of the pandemic was high enough to make the discrepancy between the two mortality statistics noticeable, was postponed until after the referendum was completed, that is, the data were published when the federal government no longer needed to convince the public that the COVID-19 crisis was under control.This is yet another reason to treat these data as more reliable.
Note: Crimea and Sevastopol, while omitted on the maps, are included in the sample of the study.
18 For robustness, we have also used the excess mortality data from Kobak (2021), and in every case, we obtained results almost identical to those reported in the following.
Did the regions with governors in the second half of their term exhibit different rates of official COVID-19 mortality and excess mortality than regions with governors in the first half? We plot the monthly trend of both mortality measures by the two categories of regions in Figure 2. While official COVID-19 mortality was substantially lower in regions with approaching elections (see Panel A), the excess mortality rate was statistically indistinguishable between the two groups (see Panel B). The gap in official COVID-19 mortality was particularly noticeable for the months before the referendum; it became smaller (though it did not disappear entirely) after the referendum.
Note: There were forty-three regions with governor elections in 2020-22 and forty-two regions with elections in 2023-24.

Medic mortality
We also use an additional variable to corroborate our results: we look at COVID-19 mortality among medical staff as an alternative estimate of the actual size of the virus outbreak in the region. The data come from a non-governmental website, Memorial List, established in the first days of the pandemic and based on colleagues and relatives reporting the deaths of medical personnel. 19 Thus, they are unaffected by governmental manipulations; the irregular governmental reports provide much lower COVID-19 mortality rates among healthcare personnel than the data from Memorial List. 20 In the first months of the pandemic, medical staff were particularly vulnerable (Domínguez-Varela 2021; Gross, Mohren and Erren 2021; Iyengar et al. 2020; Manzoni and Milillo 2020), and higher mortality in this group is thus likely to indicate a more substantial spread of the SARS-CoV-2 virus and the severity of the pandemic in Russian regions. Thus, medic mortality can be used as a robustness test.
Medic mortality is constructed as a ratio of the deaths of medical staff 21 from COVID-19 to the average deaths from all causes in the last three years, as follows:

$$\text{Medic mortality}_{it} = 100 \times \frac{\text{Deaths of medical personnel}_{it}}{\text{Past deaths from all causes (average)}_{it}}. \tag{3}$$

Alternative explanation: anti-COVID-19 policies
Our analysis focuses on sensitivity to incentives (the extent of political risk) influencing the under-reporting of COVID-19 mortality. However, there is an important alternative explanation: political risk could trigger governors to implement actual measures to reduce COVID-19 mortality, rather than to fake data. This would lead to an upward bias in our estimations. 22 To check this explanation, we investigate the effect of election proximity on the actual anti-COVID-19 measures introduced by regional authorities. We use the CoronaNet Research Project, an international database on government responses to COVID-19. CoronaNet (Cheng et al. 2020) contains over 110,000 entries of individual anti-COVID-19 measures for about 200 countries in the world. For Russia, as well as for a number of other countries, data are available at the sub-national level (Schenk and Ganga 2022). We compile a variable, AntiCOVID policies it , as the number of regional policies established before the constitutional referendum. Such policies commonly included social distancing, health testing, travel restrictions, quarantine and mass-gathering regulations, lockdowns, curfews, and business and public restrictions, and were very widespread, with an average region introducing about 124 measures, starting as early as February.

21
Including not only physicians, but also technical staff in the healthcare system (e.g., aid persons, paramedics and medical emergency drivers).

22
Another prediction can be derived from the arguments of Pulejo and Querubín (2021), who found in a cross-country sample that for countries with presidential systems, proximity to elections was associated with less stringent public health measures to prevent the spread of COVID-19. In this case, politicians prefer to avoid measures displeasing the public in the face of elections. In Russia, however, the public is much less important for the political fate of governors than is the Kremlin.
from data on the individual behaviour of people, particularly their effort to self-isolate. We employ the index of self-isolation of the Russian regional population compiled by Yandex, a major Russian search-engine company. The self-isolation index measures population mobility in all urban areas based on smartphone data. 23 Its value ranges from 0, which is equivalent to the highest mobility during the pre-pandemic rush hour, to 5, which is the lowest level of mobility, for example, that which can be observed at night. Again, we check whether proximity to the elections of regional governors affects the self-isolation index. 24

Analysis
Cross-Sectional Results for the First Three Months of the Pandemic
This section presents our main results. First, we look at the cross-sectional specification to test our main hypothesis about the positive relationship between the time left until the election and the reporting of official COVID-19 mortality across the eighty-five regions of Russia before the referendum. In theory, official COVID-19 mortality would be perfectly predicted by a more reliable estimate of overall excess mortality even under a sizeable manipulation of data as long as the manipulation effort is uniform across all regions (that is, all regional authorities report the same proportion of actual cases). However, in reality, the correlation coefficient between the two estimates aggregated over April-June equals about 0.7. This means that in some regions, manipulation was stronger than in others. This is precisely the variation that we are interested in for our analysis.
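The reasoning about uniform versus uneven manipulation can be illustrated with a small simulation. This is a sketch on synthetic numbers only: the reporting shares, the range of excess mortality and the variable names are hypothetical assumptions, not estimates from the study.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

rng = np.random.default_rng(0)

# Hypothetical excess mortality for eighty-five regions (per cent).
excess = rng.uniform(1.0, 30.0, size=85)

# Case 1: every region reports the same share of actual deaths.
uniform_share = 0.2
official_uniform = uniform_share * excess

# Case 2: the reported share differs from region to region.
varying_share = rng.uniform(0.05, 0.6, size=85)
official_varying = varying_share * excess

corr_uniform = pearson(excess, official_uniform)
corr_varying = pearson(excess, official_varying)
```

Under a uniform reporting share, official mortality is an exact linear function of excess mortality, so the correlation equals one regardless of how small the share is. Once the share varies across regions, the correlation falls below one, which is the kind of variation the analysis exploits.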
We estimate a cross-sectional regression of official COVID-19 mortality on election proximity, controlling for excess mortality (Equation 4), where: i = 1, …, 85 indicates the region; Official COVID19 mortality i is the officially reported COVID-19 mortality over the first three pandemic months before the referendum; Election proximity i is one of the two variables for the distance to the next governor elections in region i; and Excess mortality i is the excess mortality. Additionally, we test the effect of election proximity on Official COVID19 mortality i during the three months after the referendum, when the federal incentives weakened. To corroborate the robustness of our main results, we run the same regression as in Equation 4 for other estimates of regional mortality from the virus (excess and medic mortality) and the variables for the state response to the pandemic (the number of anti-COVID-19 policies and the self-isolation index).
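A regression of this kind can be sketched as follows. The example runs ordinary least squares with heteroscedasticity-robust standard errors on synthetic data. The HC1 variant of the robust estimator, the function names and all simulated coefficients are assumptions made for illustration; the source states only that the standard errors are robust to heteroscedasticity.

```python
import numpy as np

def ols_hc1(y, X):
    """OLS coefficients with HC1 robust standard errors.

    X must already include a column of ones for the intercept.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ (X.T @ y)
    resid = y - X @ beta
    # Sandwich estimator: (X'X)^-1 X' diag(e^2) X (X'X)^-1, scaled by n/(n-k).
    meat = X.T @ (X * (resid ** 2)[:, None])
    cov = (n / (n - k)) * xtx_inv @ meat @ xtx_inv
    return beta, np.sqrt(np.diag(cov))

def simulate_and_fit(seed=0, n_regions=85):
    """Fit the cross-sectional model on hypothetical data."""
    rng = np.random.default_rng(seed)
    approaching = rng.integers(0, 2, n_regions).astype(float)  # election dummy
    excess = rng.uniform(0.0, 30.0, n_regions)                 # control variable
    noise = rng.normal(0.0, 0.1, n_regions)
    # Hypothetical data-generating process, not the study's estimates.
    official = 1.0 - 0.5 * approaching + 0.2 * excess + noise
    X = np.column_stack([np.ones(n_regions), approaching, excess])
    return ols_hc1(official, X)
```

Calling `simulate_and_fit()` returns the coefficient vector (intercept, election dummy, excess mortality) and the corresponding robust standard errors. The second coefficient plays the role of the election proximity effect discussed in the text.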
The results are presented in Figure 3. The significant negative coefficient of the approaching election variable and the significant positive coefficient of election proximity in years (see Panel A) indicate significantly fewer reported deaths from the virus in regions with relatively sooner elections. The average magnitude of the effect is large: having an election in 2020-22 decreases the reporting of COVID-19 mortality by 61 per cent of its average value, and having the election one year earlier, all else equal, reduces reporting by almost 23 per cent of its average value. This effect, however, becomes statistically insignificant in the three months after the referendum (see Panel B), suggesting that career incentives drive the manipulation of data only when there is a demand from the federal authorities (though the coefficients keep their signs). In short, political risk perception is correlated with under-reporting COVID-19 mortality prior to the referendum but not after the referendum.
The correlation we report in Panel A cannot be explained by the actual severity of the pandemic in regions with proximate elections because election proximity measures are uncorrelated with excess mortality or medic mortality, as reported in Panels C and D. Finally, we show in Panels E and F that approaching elections are not associated with an effort by local authorities to impose additional pandemic regulations and any consequent decrease in self-isolation. Thus, election proximity is likely to influence only the COVID-19 under-reporting and is unrelated to actual measures implemented by the government.

23
For more information, see the official page of the index, available at: https://yandex.ru/company/researches/2020/podomam

24
We acknowledge that the value of the index can be influenced not only by governmental policies, but also by the behaviour of the regional population; this will matter for our analysis in the fifth section.

Panel Data Results
The cross-sectional approach can be subject to criticism that it fails to capture a multitude of unobserved region-specific factors potentially correlated with proximity to elections and with COVID-19 mortality. While, as mentioned, we treat the variation in proximity to elections as exogenous, as an additional check, we still estimate a region-month panel data model to eliminate region-specific heterogeneity. 25 We run the standard region fixed-effects model; additionally, we employ a generalized method of moments (GMM) estimator to account for the dynamic nature of the pandemic data. We are interested in the heterogeneous effect of election proximity conditional on the local severity of the COVID-19 outbreak for the whole period and the months before the referendum to test whether political risk amplifies the extent to which regions marked down COVID-19 mortality only before the referendum.
We estimate the following equation:

$$\text{Official COVID19 mortality}_{it} = \alpha + \eta\,\text{Excess mortality}_{it} + \omega\,(\text{Election proximity}_{i} \times \text{Excess mortality}_{it}) + \lambda\,(\text{Election proximity}_{i} \times \text{Excess mortality}_{it} \times \text{Before referendum}_{t}) + \rho_{i} + \mu_{t} + \varepsilon_{it}, \tag{5}$$

where: i = 1, …, 85 indicates the region; t = 1, …, 6 indexes months (April-September); Before referendum t is a dummy variable that equals 1 in the months before the referendum (April-June); ρ i and μ t represent the region and month fixed effects; and ε it is the error term. We are interested in the coefficient of the triple interaction, λ, which should capture the proportion of actual deaths not being reported in the official statistics in the months before the referendum. The linear term of the proximity to election is time invariant and, thus, absorbed by the region fixed effects.
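The fixed-effects part of this specification can be sketched with a dummy-variable least-squares estimator. The code below is an illustration on synthetic data under stated assumptions: the function names, the simulated coefficients and the use of explicit region and month dummies are hypothetical choices, and the system GMM estimator mentioned in the text is not implemented here.

```python
import numpy as np

def fit_triple_interaction(y, excess, proximity, before, region, month):
    """Two-way fixed-effects estimate of the interaction coefficients.

    All inputs are 1-D arrays with one entry per region-month observation.
    """
    y = np.asarray(y, dtype=float)
    excess = np.asarray(excess, dtype=float)
    proximity = np.asarray(proximity, dtype=float)
    before = np.asarray(before, dtype=float)
    region = np.asarray(region)
    month = np.asarray(month)
    region_ids = np.unique(region)
    month_ids = np.unique(month)
    # Full set of region dummies; the first month is dropped to avoid collinearity.
    region_dummies = (region[:, None] == region_ids[None, :]).astype(float)
    month_dummies = (month[:, None] == month_ids[None, 1:]).astype(float)
    core = np.column_stack([
        excess,                       # excess mortality
        proximity * excess,           # double interaction
        proximity * excess * before,  # triple interaction
    ])
    design = np.column_stack([core, region_dummies, month_dummies])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return {"eta": coefs[0], "omega": coefs[1], "lambda": coefs[2]}

def simulate_panel(seed=0, n_regions=85, n_months=6):
    """Generate a hypothetical region-month panel with known coefficients."""
    rng = np.random.default_rng(seed)
    region = np.repeat(np.arange(n_regions), n_months)
    month = np.tile(np.arange(1, n_months + 1), n_regions)
    proximity = np.repeat(rng.integers(0, 2, n_regions), n_months).astype(float)
    before = (month <= 3).astype(float)  # first three months precede the referendum
    excess = rng.uniform(0.0, 30.0, region.size)
    region_fe = np.repeat(rng.normal(0.0, 1.0, n_regions), n_months)
    month_fe = np.tile(rng.normal(0.0, 1.0, n_months), n_regions)
    noise = rng.normal(0.0, 0.1, region.size)
    y = (region_fe + month_fe + 0.4 * excess
         - 0.05 * proximity * excess
         - 0.15 * proximity * excess * before + noise)
    return y, excess, proximity, before, region, month
```

Running `fit_triple_interaction(*simulate_panel())` recovers values close to the simulated coefficients. Because proximity is constant within a region, its linear term would be collinear with the region dummies, which is why only its interactions enter the design matrix.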
The results for the regressions are reported in Table 1. We start with the fixed-effects estimation of the interaction term in Equation 5, which shows a strong and significant effect of the approaching elections on the official reporting of COVID-19 mortality (see Column 1) conditional on the actual COVID-19 mortality. We observe this effect only before the referendum; after the referendum, the effect disappears. Similarly, the more years left before the election, the larger the share of the excess mortality reported as official COVID-19 mortality (see Column 3) before the referendum but not after it. Having elections in the next two years reduces the share of excess deaths reported as COVID-19 deaths almost twofold.
The fixed-effects model, however, does not account for the dynamic nature of the infectious spread; therefore, we also estimate a system GMM model that includes a one-period lag of the dependent variable, as well as excess mortality. The GMM estimations do not alter the results: the overall magnitude of the effect remains unchanged. To illustrate our central findings, we plot the conditional marginal effects of the election proximity variables for the periods before and after the referendum from Columns 2 and 4 in Figure 4. Again, the results are consistent with the findings reported in the previous section and confirm our main intuition: proximity to elections triggers more intensive COVID-19 mortality manipulation prior to the referendum.

Social Costs of Data Manipulation
Besides lessening individual and collective protective behaviour, under-reporting of COVID-19 mortality may become revealed to the public and, consequently, damage the trust in state-provided statistics, making individuals reluctant to react to any future warning signs in official data. In this section, we test this assumption by using an opportunity to study how the exposed under-reporting of COVID-19 mortality affected the trust and self-isolation behaviour of the Russian public.
On 10 July, ten days after the end of the national referendum, Rosstat made all-cause mortality data at the aggregate and regional level for the month of May accessible to a broader public. This was the first month when the pandemic death toll caused excess mortality to be noticeable for a large number of regions, also revealing a significant gap between excess deaths and officially reported deaths. 26 This discrepancy between the two mortality measures instantly received press coverage in numerous online media outlets, amateur blogs and social media, thus exposing the under-reporting in the data for May. 27 This was also the first time the Russian public learned about excess mortality as an alternative way to estimate the true death toll of the pandemic and its application in spotting the under-reporting of COVID-19 mortality in both Russia and their region.
We are interested, in particular, in two possible effects of the exposed under-reporting for the month of May: an effect on trust in governmental COVID-19 statistics; and an effect on willingness to reduce social contacts to avoid contagion. We measure the extent to which COVID-19 reporting in May was exposed to the public by using an indicator of the ratio of excess deaths to the number of deaths reported by the governmental website. Naturally, because some regions had not yet had positive excess mortality in May, we cannot identify any under-reporting for these regions, but we include a dummy variable to control for this. It is noteworthy that only two regions with positive excess mortality have an under-reporting coefficient below the value of 1, meaning an over-reporting of official deaths compared to excess mortality. The rest demonstrate an under-reporting ratio ranging from 1.3 to 82.6. 28
Note: Conditional marginal effects with 95 per cent confidence intervals.

26
While the independent press had voiced several concerns about the potential under-reporting of COVID-19 mortality already in April, the all-cause mortality data showed predominantly no positive excess deaths for this month (e.g., Novaya Gazeta failed to report mortality under-reporting for the month of April [see: https://novayagazeta.ru/articles/2020/06/19/85909-voskreshennye-rosstatom]). This is also the reason why our analysis focuses on data from May: in the preceding months, lack of excess all-cause mortality in almost all regions precludes us from computing our proxy of revealed under-reporting after the publication of Rosstat data.

27
For a detailed discussion of the regional distribution of excess mortality and a link to under-reporting, see, for example, a publication of Novii Izvestiya (available at: https://newizv.ru/news/society/11-07-2020/smertnost-za-may-v-rossii-ischezlonaselenie-nebolshogo-goroda). Even official press mentioned the newly available data on excess deaths, though with a much milder focus on exposing under-reporting (see, e.g., news by Interfax, available at: https://www.interfax.ru/russia/716896). In fact, suspicions about the quality of the data emerged already during the previous months of the pandemic. In SA D, we use a case of a reputable independent news portal, Mediazona, and its inquiry concerning COVID-19 mortality in regional governments, to nuance evidence on the logic of data manipulation in Russia's regions.

28
We use the full sample for our analysis; however, excluding the regions with an under-reporting ratio over 10 does not alter our results.
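The exposed under-reporting indicator described in this section can be sketched as a simple ratio with an accompanying dummy for regions without positive excess mortality. The function name, the return convention and the handling of zero official deaths are hypothetical assumptions; the source does not say how such cases were treated.

```python
def underreporting_ratio(excess_deaths, official_deaths):
    """Return (ratio, no_excess_dummy) for one region in May.

    ratio is excess deaths divided by officially reported COVID-19 deaths;
    it is None when the region has no positive excess deaths, in which case
    the dummy equals 1 so that these regions can be controlled for.
    """
    if excess_deaths <= 0:
        return None, 1
    if official_deaths <= 0:
        # Assumption: the ratio is undefined without reported deaths.
        raise ValueError("official_deaths must be positive to form a ratio")
    return excess_deaths / official_deaths, 0
```

A ratio above 1 indicates that excess deaths exceeded officially reported COVID-19 deaths, while a ratio below 1 corresponds to the over-reporting case mentioned in the text.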

Trust
Trust in official statistics is essential for an adequate public response to the pandemic, in particular, for the adherence to safety regulations (Bargain and Aminjonov 2020; Pak, McBryde and Adegboye 2021), but it can be substantially damaged if the public learns about deliberate data manipulation. For measuring trust in official statistics, we employ a telephone survey (N = 1,617) carried out at the end of July by a highly reputable independent Russian pollster, Levada Center. 29 The survey provides us with self-reported trust in COVID-19 statistics as of 24 July, two weeks after the data on all-cause mortality for May were published, allowing the population to infer the extent to which the regional government under-reported COVID-19 mortality, and over three weeks after the referendum was completed. Trust is assessed via the question, 'Do you trust the official information about the coronavirus situation in Russia?', with possible answers being: 'fully yes', 'mostly yes', 'only somewhat' and 'fully no'. The survey is representative nationally, and over 90 per cent of the respondents are from seventy-eight out of eighty-three Russian regions. It allows us to look at the relationship between individual trust in COVID-19 statistics and the degree of exposure of under-reporting in the respondent's location.
We start our analysis with simple descriptive statistics. For this purpose, we group respondents by the three categories of regions: regions with no excess mortality in May; half of the regions with excess mortality but with relatively accurate reporting of COVID-19 mortality; and the other half where regional governments hid COVID-19 mortality to a larger extent. Figure 5 presents the overlay of the three histograms for respondents in every region category. Respondents in regions without excess mortality and consequently without the under-reporting of COVID-19 mortality in May are consistently the most trusting of official statistics. However, this category does not allow us to disentangle the actual determinant of relatively higher trust because it may be driven both by the absence of the COVID-19 outbreak and by the lack of data manipulation. This is different for the regions with positive excess mortality because the intensity of under-reporting is not correlated with excess mortality, meaning that any difference in trust should be attributed to the difference in under-reporting. 30 Here, we observe that respondents from regions with a larger deviation of official COVID-19 data from actual mortality report consistently lower trust levels than respondents from regions where regional governments provided more accurate information on COVID-19 mortality, despite having the same level of severity of the COVID-19 pandemic on average.
However, inferring the under-reporting from all-cause mortality data is not a straightforward task; thus, we expect this relationship to hold mostly for respondents with better analytical skills. To test this hypothesis, we split all respondents into two subgroups by education: those with and without a university degree. Figure 6 replicates Figure 5 for both subgroups. We notice that the previous pattern is more prominent in the subgroup with better education and overall mistrust of official statistics is also higher for this group.
The regression results also confirm that the under-reporting variable decreases trust only conditional on the respondent holding a university degree. The estimation results are available in Table E1 in SA E.

Self-isolation
The deliberate under-reporting negates the advantages of informed self-regulation and affects the level of self-isolation by creating a false perception of safety. Once the under-reporting is exposed and the public trust in official statistics is lost, as we showed in the previous section, the population will no longer adjust their behaviour to the official information. We test these hypotheses using the self-isolation index of the Russian regional population, as described earlier.

29
The data have been obtained directly from the Levada Center, but they can also be accessed via the Sophist HSE database upon registration (see: http://sophist.hse.ru/).

Our analysis is based on the following assumptions. We hypothesize that monthly official COVID-19 mortality is positively associated with self-isolation in the regions; indeed, the higher the official mortality, the more fearful of the possible contagion people in the region become. However, the extent to which these two indicators are correlated could be affected by how likely people are to trust governmental statistics. We hypothesize that the correlation should be the highest for the months prior to the publication of the all-cause mortality that exposed pre-referendum under-reporting. Furthermore, the correlation should be further suppressed by the extent of revealed under-reporting at the regional level.
Thus, we regress the self-isolation index on: (1) interaction terms between months dummies and COVID-19 mortality; and (2) triple interaction terms between COVID-19 mortality, months dummies and coefficient of exposed under-reporting in May (the same variable as in the previous subsection). The regression results are presented in Figure 7. As we expected, self-isolation and official COVID-19 mortality are strongly correlated for the first month of the pandemic, and the correlation becomes statistically insignificant after July, the month after the end of the referendum when the all-cause mortality was finally published.
However, when we account for the regional differences in exposed under-reporting starting from June, we observe that responsiveness to official statistics dwindled proportionately to the exposed under-reporting of official COVID-19 mortality.This finding allows us to conclude that the general public may have discounted official information proportionally to the level of under-reporting in May in this region.
Our findings have two important implications. First, since self-isolation was relatively higher in regions with higher reported COVID-19 mortality, deliberate under-reporting potentially led to suboptimal levels of social mobility, thus increasing the risk of contagion. Secondly, the publication of information exposing the data manipulations was likely to have decreased public responsiveness to official statistics further.

Conclusion
The global pandemic caused by COVID-19 has posed an unprecedented challenge for governments around the world; yet, many autocratic regimes have responded to the pandemic in the manner they are used to: by manipulating official statistics to create an image of success instead of actually fighting the virus. However, without accurate pandemic information, it is hardly possible to assess the effectiveness of governmental policies, estimate the virus spread or make decisions on opening up borders for international travel. This article shows that in large authoritarian states, COVID-19 data manipulation could be driven by the actions of sub-national politicians reacting to the (informal) incentives set by the central government. Furthermore, we provide evidence that this data manipulation leads to declining public trust in official COVID-19 information and induces lower compliance with safety measures. Thus, under-reporting comes at a cost to the ability of society to contain the spread of the virus.
The case of the Russian Federation studied in this article suggests the following mechanism explaining under-reporting. To achieve political goals associated with the need to implement the referendum on constitutional amendments, the federal government provided informal incentives to governors to paint a 'rosy picture' of the COVID-19 pandemic in their regions, either by actually managing the pandemic or by doctoring the data. Since manipulating data is an everyday routine for most Russian officials, many Russian governors opted for under-reporting COVID-19 mortality to achieve the goal set by the federal centre. Governors who perceived themselves as facing larger political risks and needing support from the federal centre provided more biased reporting of COVID-19 mortality.
Our analysis finds evidence of the correlation between perceived political risk and under-reporting only for the period preceding the national referendum, that is, when political incentives from the centre were particularly strong. After the referendum, we find no consistent evidence of a significant link between political risk and under-reporting. This may be explained by the fact that data manipulation, in the eyes of regional bureaucrats, is not an effortless activity, and they are more likely to engage in it only if they face respective incentives from the centre.
Our study acknowledges several limitations. First, our measure of under-reporting is based on the assumption that excess mortality is a more accurate proxy than officially reported data on COVID-19 deaths, and while this assumption has become widely accepted by scholars of the current pandemic, this approach might still not be ideal. Secondly, while the fundamental logic of the political mechanism behind under-reporting in the case of Russia is externally valid in the context of other autocratic regimes and is causal based on the identification strategy, the findings regarding the consequences of exposed under-reporting are rather more illustrative and might not apply in other circumstances.
Still, the relevance of our findings goes beyond empirical observations on how Russia handled the COVID-19 pandemic, as they provide evidence of an important mechanism of data manipulation in authoritarian regimes that has so far remained unexplored in the scholarly literature. While existing studies acknowledge the importance of data manipulation as a legitimation strategy by autocracies, they often fail to uncover the specific mechanisms of how data manipulation emerges through the interaction of multiple agents in an authoritarian political system. Our study shows that it is essential to understand not only the motives of the authoritarian regime to manipulate data, but also the specific incentives it sets for its bureaucracies engaged in data fabrication and the factors triggering more or less intensive responses on the side of bureaucrats.

Supplementary Material. Online appendices are available at: https://doi.org/10.1017/S0007123422000527

Raksha, Anton Shirikov and participants of the Wisconsin Russia Project Young Scholars Workshop and the Workshop on Russian Politics of the Freie Universität Berlin. We appreciate the research assistance of Guram Kvaratskhelia. We are also grateful to Caress Schenk for drawing our attention to the CoronaNet Database. All mistakes remain our own.

Financial Support. None.

Figure 1. Distribution of regions by the time distance to the governor's election.

Map 3. Excess mortality from all causes during April-June 2020 reported ex post after the referendum. Note: Crimea and Sevastopol, while omitted on the maps, are included in the sample of the study.

Figure 2. Monthly dynamics of mortality rates in regions by the proximity to elections: official COVID-19 mortality (Panel A) and excess mortality (Panel B).

Figure 3. The effect of election proximity on COVID-19 mortality and the response to the pandemic: Panel A shows official COVID-19 mortality percentages before the referendum, April-June 2020; Panel B shows official COVID-19 mortality percentages after the referendum, July-September 2020; Panel C shows excess mortality percentages before the referendum, April-June 2020; Panel D shows medic mortality percentages before the referendum, April-June 2020; Panel E shows the number of anti-COVID-19 policies before the referendum, February-June 2020; and Panel F shows the self-isolation index before the referendum, April-June 2020. Notes: Ordinary least squares regression; 95 per cent confidence interval; standard errors robust to heteroscedasticity; N = 85. All estimations include excess mortality for April-June (for July-September in Panel B) as a control variable except Panel C, where the dependent variable is excess mortality itself.

Figure 4. Marginal effects of election proximity on official reporting of COVID-19 mortality conditional on the excess mortality and the timing of the referendum: before (April-June) and after the referendum (July-September).

Figure 5. Trust in official COVID-19 statistics by regions with different rates of under-reporting and with zero mortality.

Figure 6. Trust in official COVID-19 statistics by regions with different rates of under-reporting and with zero mortality, grouped by respondents' education level.

Figure 7. The association between month, COVID-19 mortality and the self-isolation index, and the moderation effect of the exposed under-reporting after the referendum. Note: Full regressions are reported in SA E.

Table 1. Election proximity and reporting of official COVID-19 mortality, panel estimation