Studying Democracy in Europe: Conceptualization, Measurement and Indices

Abstract Given the academic and media salience of democracy and its measurement, in this contribution we take a closer look at the various existing datasets. For this purpose, in the first two sections we look at democratic conceptualization and measurement, and then focus on the most used datasets on democracy and assess them against the conceptual criteria illustrated in the first section. The third section focuses on the notion of quality of democracy and how it has advanced the understanding of contemporary democracies. The subsequent section illustrates changes in democratic scoring in European countries over the past 15 years. Our results show that democracy has not become more robust in European countries: on the contrary, several countries witnessed significant democratic deterioration. Furthermore, we show that – with the exception of Polity – the indexes analysed are highly correlated and therefore could be equally useful for an ongoing analysis of European democracies.

Towards the end of March 2021, a search of the Google Scholar database for 'democracy' resulted in almost 4 million titles. 'Political institutions' counted over 4.5 million entries, and 'political parties' around 3 million. Since Tocqueville's work on 'Democracy in America', democracy has gained growing attention and has been studied from a number of perspectives. Moreover, in the past decades, especially in political science, conceptualizations and empirical studies of democracy have become a quite popular object of study. Currently, although case studies are still very relevant and often conducive to conceptual refinements and empirical specifications, most of the work is done on a comparative basis (for example, the V-Dem reports; see following sections). Such advancements have been made possible by the growing availability of comprehensive datasets, allowing researchers from all over the world to conduct increasingly sophisticated empirical exercises. Finally, unlike other topics in political science, democracy, democratization, democratic breakdown or de-democratization are among the most covered topics in the media. Rankings of and debates on democracies have gone well beyond academic circles.
Given the academic and media salience of the issue, we believe it is important to take a closer look at the various existing datasets to better understand how conceptualizations of democracy have been translated into indicators and measures and which dimensions of democracy are in the spotlight. We think that this exercise is particularly relevant not only for academic purposes but also because it could set the scene for a broader debate that could eventually help us better understand if and how changes in democratic values (connected to empirical dimensions of democracy) may reflect growing democratic difficulties in delivering results.
This article is structured as follows: the first two sections analyse the conceptual background and discuss the indices used for measurement. We then focus on the most used datasets on democracy and try to underline their links with the broad conceptual reading provided in the first section. The third section focuses on the notion of quality of democracy and how it has advanced the understanding of contemporary democracies. The subsequent section illustrates changes in democratic scoring in EU countries over the past 15 years. The conclusion summarizes the key findings and sketches some ideas for future research on how the measurement of democratic attributes might be improved. Our results show that democracy has not become more robust in European countries. On the contrary, a number of countries witnessed large negative changes (such as the Czech Republic, Hungary, Poland and Bulgaria) whereas in a number of other countries (including Germany, the United Kingdom and Spain, among others), smaller negative changes have occurred. Furthermore, our results show that, with the exception of Polity, the various indexes analysed (Polity, Freedom House, Economist Intelligence Unit, V-Dem, European Quality of Government, Democracy Dataset) are highly correlated and therefore could be equally useful for an ongoing analysis of the trajectories of European democracies.

Concept building
Since World War II, the study of (representative) democracy has been a central object of inquiry for political science. Two main stages have characterized studies of democracy: conceptualization and measurement. To be sure, the stages should not be understood as following a linear temporal development since in several cases they have substantially overlapped. However, the stages are analytically useful in unveiling the learning logic related to the studies of democracy, or polyarchy in the words of Robert Dahl (one of the most distinguished scholars for the conceptual and historical analysis of democracy).
Due to space limitations, we shall focus briefly on the main features of the first stage and concentrate more on the second stage, which has been at the heart of a number of contributions in recent years and has given birth to numerous (and important) debates regarding the different added values of measurements and comparative empirical analyses. A final caveat: since our main concern is to analyse mostly contemporary readings of democracy, we will mainly focus on the post-World War II contributions.
Preface to Democratic Theory (1956) by Dahl and Democratic Theory (1962) by Sartori are the implicit points of departure for any serious analysis of how contemporary democracy needs to be defined. Dahl's contribution is often seen primarily as empirical. However, it has also been noted that, 'despite Dahl's contributions to empirical political science, it would be a mistake to describe him as nothing but an empiricist. He also recognized – and practiced – normative inquiry and conceptual analysis as important components of modern political analysis' (Baldwin and Haugaard 2015: 159).
Dahl's conceptual contribution emerges very clearly in the proposed definition of 'requirements for a democracy among a large number of people' by distinguishing among three dimensions of 'unimpaired opportunities': preference formulation, preference signification and having 'preferences weighted equally in conduct of government' (Dahl 1971: 2-3). Following Joseph Schumpeter (Ricci 1970), Dahl conceptualized democracy as a combination of public contestation and right(s) to participate, which is constituted by eight constitutional guarantees: (1) freedom to form and join organizations; (2) freedom of expression; (3) the right to vote; (4) eligibility for public office; (5) the right of political leaders to compete for support; (6) alternative sources of information; (7) free and fair elections; and (8) institutions for making government policies depend on votes and other expressions of preference (Dahl 1971: 3-5).
To a certain extent, we may read Sartori's work as an unrivalled supplement of and enrichment to Dahl's contribution(s) since it provides further conceptual ground for the empirical (and comparative) analysis of democracy. In the words of David Collier and John Gerring,
Sartori stands at the forefront among scholars who have tackled problems of conceptual confusion. Its arresting title, 'Democrazia e definizioni' (Democracy and Definitions) (1957), signals his recurring juxtaposition of basic methodological concerns and his substantive focus on democracy and political parties. … He has sought to provide a rigorous approach to methodology – a rigor grounded in the careful use of language, rather than in mathematics. He viewed qualitative work with concepts as essential to achieving such rigor in both qualitative and quantitative research. (Collier and Gerring 2009: 3)
Paying careful attention to concept (mis)formation (Sartori 1970) is a broader legacy of Sartori, particularly fruitful for historically informed accounts of democratic instauration and consolidation (although not explicitly cited, see Levitsky and Ziblatt 2018). For example, Sartori underlines the differences between 'democracy' and 'democratic'. The noun 'democracy' allows (and requires) a definition of what democracy is, whereas the adjective 'democratic' is more connected to the quantum or 'how much' dimension of democracy, which is what is often studied by quantitative scholars. Sartori reminds us that 'what is democracy' and 'how much democracy' are very legitimate questions that can be answered both from a qualitative and a quantitative point of view, as long as they are logically treated in a proper manner. These considerations are particularly relevant if we want to explore not only the differences between democracies and non-democracies but also the different types of democracies (and their solidity).
In conclusion, both Dahl and Sartori focus on the key attributes of democracy. Furthermore, they also look into the historical sequences that led to democracy (Dahl) and its possible differential intensity (Sartori). However, they only partially address the question of how and on what dimensions democracies may differ.

Patterns of democracy
One of the first authors to measure democracy in a comparative fashion is Arend Lijphart, who, during the 1970s, first expanded the focus of democratic analyses to less studied countries (such as the Netherlands) and then adopted a perspective that can be considered as the point of reference for all comparative scholars of democracy. To be sure, Lijphart does not measure democracy per se but rather, adopting Freedom House classifications, considers those countries seen as 'free' with reference to civil and political rights, and for a sufficient number of years. Therefore, Lijphart focuses on consolidated democracies. His comparative perspective allows him to rank states according to shared criteria, which are met to a certain degree, making them instances of types of regimes located on a continuum.
Lijphart inaugurates a new stage in the understanding of democracy, which has been labelled 'democracy with adjectives' (Collier and Levitsky 1997). In his 1984 contribution entitled Democracies: Patterns of Majoritarian and Consensus Government in Twenty-One Countries, and in a number of subsequent contributions (especially in the 1999 new edition of the book), Lijphart built on Dahl's and Sartori's analyses by introducing specific measures of attributes of democracies and providing the first fully fledged comparative assessment. As is well known, the point of departure for Lijphart's comparative framework was the (electoral) logic according to which power is granted to a majority. Certainly, Lijphart does not provide a measurement of the intensity of democracy but rather of 'patterns of democracy' that constitute the empirical manifestations of contemporary polyarchies.
The 36 democracies analysed in his 1999 contribution constitute the first democratic academic 'database' designed along two major dimensions: executive-parties and federal-unitary (state). 1 Within each dimension, Lijphart identifies five criteria that provide two ideal types of democracies: majoritarian and consensus. Lijphart adopted a comparative framework, which has been criticized (Giuliani 2016) but also enriched (for example, see Vatter 2009), and paved the way for contemporary comparative analyses of democracies. However, in terms of the measurement of democratic attributes, for a number of years scholars relied on Freedom House data, published yearly in the Freedom in the World report.

Measurement: illustrating and assessing existing indices
In this section, we focus on the most used and still updated datasets (for more comprehensive assessments, see Högström 2013; Munck and Verkuilen 2002) to verify if attributes of the notion of democracy are adequately considered. 2 Therefore, our main concern is to consider how the concepts are translated into indicators or variables. More specifically, we will focus on the capacity of the measurements to capture the main features of democracy as defined by the above-mentioned authors. The section concludes with a comparative critical assessment.

Freedom House
In 1973, when the first Freedom in the World report was issued, the 'free countries' numbered 44 (29.73%), the partially free countries were 36 (24.32%), whereas there were 68 (45.95%) non-free countries. Freedom House was the first comprehensive dataset trying to measure freedom in the world (used by many as a proxy of democracy, for example by Lijphart). How does it work?
Freedom in the World is produced each year by a team of in-house and external analysts and expert advisers from the academic, think tank, and human rights communities. The 2021 edition involved over 125 analysts, and nearly 40 advisers. The analysts, who prepare the draft reports and scores, use a broad range of sources, including news articles, academic analyses, reports from nongovernmental organizations, individual professional contacts, and on-the-ground research. 3
As highlighted by the website, the data are produced in a very heterogeneous way since over 165 people are involved and a variety of sources are used to prepare the ranking. Furthermore, the scoring procedure is quite complex:
The analysts' proposed scores are discussed and defended at a series of review meetings, organized by region and attended by Freedom House staff and a panel of expert advisers. The product represents the consensus of the analysts, outside advisers, and Freedom House staff, who are responsible for any final decisions. Although an element of subjectivity is unavoidable in such an enterprise, the ratings process emphasizes methodological consistency, intellectual rigor, and balanced and unbiased judgments. 4
The scoring process follows a two-step procedure that involves questions asked and tables used to convert scores to status. The dimensions investigated through the questions, which cover political rights and civil liberties, and the scoring are very much in line with the eight constitutional guarantees identified by Robert Dahl (1971, see above). For each dimension a series of questions is asked and a 0-4 scale is used to score the answers to questions linked to 10 indicators connected to political rights and 15 indicators connected to civil liberties. The indicators are summarized in Tables 1 and 2.
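The arithmetic of the aggregation just described can be sketched as follows. This is a minimal illustration, not Freedom House's own tooling: the indicator counts (10 and 15) and the 0-4 range come from the text, whereas the function name and the sample scores are hypothetical, and the additional discretionary indicator and the score-to-status conversion tables are left out.

```python
# Illustrative sketch only. Indicator counts and the 0-4 range come from the
# text; the function name and the sample scores below are hypothetical.
PR_INDICATORS = 10       # political rights indicators
CL_INDICATORS = 15       # civil liberties indicators
MAX_PER_INDICATOR = 4    # each indicator is scored on a 0-4 scale


def aggregate(scores, expected_count):
    """Sum 0-4 indicator scores after checking their count and range."""
    if len(scores) != expected_count:
        raise ValueError(f"expected {expected_count} scores, got {len(scores)}")
    for score in scores:
        if not 0 <= score <= MAX_PER_INDICATOR:
            raise ValueError(f"score {score} is outside the 0-4 range")
    return sum(scores)


# Hypothetical country in which every indicator is scored 3.
pr_total = aggregate([3] * PR_INDICATORS, PR_INDICATORS)
cl_total = aggregate([3] * CL_INDICATORS, CL_INDICATORS)
print(pr_total, "out of", PR_INDICATORS * MAX_PER_INDICATOR)
print(cl_total, "out of", CL_INDICATORS * MAX_PER_INDICATOR)
```

With ten and fifteen indicators scored 0-4, the attainable maxima are 40 for political rights and 60 for civil liberties.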
For each indicator a number of questions are asked. For example, with regard to the 'free and fair elections of the head of government' indicator, one of the questions is 'Was the vote count transparent and timely, and were the official results reported honestly to the public?' Or for the 'government openness and transparency' indicator, one of the questions is 'Does the state ensure transparency and effective competition in the awarding of government contracts?' An additional discretionary political rights indicator is the alteration of the ethnic composition of a country or territory, based on questions such as, 'Is the government providing economic or other incentives to certain people in order to change the ethnic composition of a region or regions?' The second dimension regards civil liberties, for which a number of questions are asked. For example, in the case of property rights, the following question is asked: 'Are the people legally allowed to purchase and sell land and other property and can they do so in practice without undue interference from the government or nonstate actors?'; or, with reference to trade unions' freedom, 'Are unions able to bargain collectively with employers and negotiate agreements that are honoured in practice?' As the questions show, answers may be highly subjective since no specific anchor to data is required and the questions' framing often merges both attitudinal and behavioural elements. As the methodological section of the website illustrates:
The highest overall score that can be awarded for political rights is 40 (or a score of 4 for each of the 10 questions). The highest overall score that can be awarded for civil liberties is 60 (or a score of 4 for each of the 15 questions). The scores from the previous edition are used as a benchmark for the current year under review. A score is typically changed only if there has been a real-world development during the year that warrants a decline or improvement (e.g.
a crackdown on the media, the country's first free and fair elections), though gradual changes in conditions – in the absence of a signal event – are occasionally registered in the scores. 5
Figure 1 shows the shares of regimes (free, partially free and not free) over time according to Freedom House. Since the first report, the share of liberal democracies (i.e. 'free' countries) increased from 29.7% to 46.6% (in 2007), and then 'the expansion of freedom and democracy in the world came to a prolonged halt' (Diamond 2015: 142). In fact, in the following years the share of free countries decreased slightly, to 42.1% in 2020. In contrast, the share of non-free countries decreased substantially over the period analysed: from 45.9% in 1972 to 27.7% in 2020. Yet, the share of 'partially free' countries has slightly increased over time. Thus, if we consider the share of 'partially free' and 'not free' countries, overall, as of 2020, democracies are a minority. Indeed, as the latest Freedom in the World report states (Freedom House 2021), we are witnessing an 'antidemocratic turn'. Freedom House scoring has been widely used and cited by both the media and academics. Although apparently complex in its design, the dataset is relatively easy to build and to communicate. Furthermore, not only is it related to the eight constitutional features identified by Dahl, but it also pays tribute to the 'democratic arenas' described by Juan José Linz and Alfred Stepan (1996): political society, economic society, civil society, rule of law and bureaucratic apparatus (or a 'useable bureaucracy'). Put differently, in principle the Freedom House ranking is coherent and convincing, also because it relies on conceptualizations of (liberal) democracy that are now considered as common and shared points of reference in the academic literature. However, what emerges quite clearly from the questions asked and the scoring is that the ranking may be highly subjective and heavily path-dependent.

Government and Opposition
Furthermore, what seems completely missing is the 'social rights' dimension, which, following T.H. Marshall (1950), would need to be fully incorporated in any definition of democracy. In sum, subjectivity and incompleteness seem to be the weakest points of Freedom House's operationalization and measurement of democracy.

Polity
The fundamental 1974 contribution by Ted Gurr established the framework for Polity, one of the most widely used democracy datasets. Currently (2020) in its V version, Polity is based on specific authority dimensions, such as: (1) openness of executive recruitment; (2) decision constraints on the chief executive; (3) extent of political participation; (4) directiveness (scope of governmental control); and (5) complexity of governmental structures. In Gurr's first contribution, specific scores were assigned to classify the political regimes under scrutiny. Two dimensions are seen as 'self-evidently scalable': 'decision constraints' (which 'refer directly to different degrees of limitation on the decision-making powers of chief executives by the legislatures') and 'governmental directiveness'. For the decision constraints, a four-point scale is adopted, going from substantial (4) to unlimited (1); whereas for directiveness, a five-point scale was used, going from minimal (5) to totalitarian (1). As for the openness dimension, where 'recruitment to any position or set of positions is "open" to the extent that all lower-ranked individuals have equal opportunity to attain it' (Gurr 1974: 1486), five possible types of openness have been identified, along a five-point scale from competitive (5, full openness) to ascription (1, no openness). As for political participation (conceptualized differently from Dahl's notion of participation; see Munck and Verkuilen 2002; for a reply, see Marshall et al. 2002), the framework at the heart of the Polity dataset assumes that 'the volume of participation is greatest when there are relatively stable and enduring political groups (not necessarily parties) which regularly compete for national political influence' (Gurr 1974: 1486). Therefore, the maximum score possible (5) is assigned to an 'institutionalized' pattern of participation, whereas if participation has been suppressed or is non-existent, the score is the lowest possible (1). Finally, the overall complexity of the governmental structures is seen in connection with the democratic or non-democratic nature of a political regime: more specifically, 'conditions which increase complexity include (a) decision making by collectives rather than individuals; (b) the presence of several overlapping decision-structures at the same level; and (c) the vertical differentiation of a unit (here the polity) into distinct sub-units on the same level' (Gurr 1974: 1486).
Based on this overall framework and a rather sophisticated although not uncontroversial scoring, Polity classifies regime authority using a 21-point scale ranging from −10 ('hereditary monarchy') to +10 (consolidated democracy). The scale is divided into autocracies (−10 to −6), anocracies (−5 to +5) and democracies (+6 to +10). The most recent Polity version is based on six component variables that are connected to three main dimensions: chief executive recruitment (regulation, competitiveness, openness), independence of executive authority (executive constraints) and political competition and opposition (regulation and competitiveness of participation).
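The three-way division of the scale can be expressed as a simple lookup. The sketch below assumes integer scores and uses only the cut-offs reported above; the function name is hypothetical and is not part of any Polity software.

```python
def polity_category(score):
    """Map an integer Polity score (-10..+10) to the band named in the text."""
    if not -10 <= score <= 10:
        raise ValueError("Polity scores range from -10 to +10")
    if score <= -6:
        return "autocracy"   # -10 to -6
    if score <= 5:
        return "anocracy"    # -5 to +5
    return "democracy"       # +6 to +10


# The scale has 21 points in total.
all_scores = list(range(-10, 11))
bands = {s: polity_category(s) for s in all_scores}
print(len(all_scores), bands[-6], bands[0], bands[6])
```

Counting the bands shows that the anocracy range covers 11 of the 21 points, more than the two other categories taken together.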
The Polity dataset is impressive in terms of data collection but is problematic in several ways. First, the codebook (Marshall and Gurr 2018) is not entirely clear about how the actual coding works: it has substantially improved over the years and convincingly addressed issues of intercoder reliability (Marshall and Gurr 2018: 5-6), but transparency regarding the coding process could be further improved. Second, following the criticism by Gerardo Munck and Jay Verkuilen (2002), it could be argued that the notion of 'participation' is more formal than substantive since it relates primarily to the regulation of participation and not to effective participation. This is an issue we shall come back to when discussing the contributions made by the 'quality of democracy' scholars. Finally, especially when compared to other indices, the limited number of variables used may not fully capture the nuances of democratic development over time.

The Polyarchy dataset
From a more scientific perspective, in a pioneering article from 1990, Michael Coppedge and Wolfgang Reinicke applied Dahl's polyarchy conceptualization to develop a Polyarchy scale. The main aim of the authors was to define a scale that would correspond directly to Dahl's eight institutional requirements, so as to be analytically grounded and easily replicable. The authors coded 'one variable for the extent of the suffrage, one for freedom of expression, one for freedom of organization, and one for the existence of alternative sources of information. Three of the remaining four institutional requirements were easily combined into a single variable measuring free and fair elections' (Coppedge and Reinicke 1990: 53-55).
The variable measuring free and fair elections contains three categories: elections without significant or routine fraud or coercion; elections with some fraud or coercion; no meaningful elections (i.e. elections without choice of candidates or parties, or no election at all). The freedom of organization variable comprises four categories: some trade unions or interest groups may be harassed or banned but there are no restrictions on purely political organization; some political parties are banned and trade unions or interest groups are harassed or banned, but membership in some alternatives to official organizations is permitted; the only relatively independent organizations that are allowed to exist are non-political; no independent organizations are allowed (all organizations are banned or controlled by the government or the party).
Freedom of expression is covered by three categories: citizens express their views on all topics without fear of punishment; dissent is discouraged, whether by informal pressure or by systematic censorship, but control is incomplete; all open dissent is forbidden and effectively suppressed, though a few citizens may express dissent publicly in covert ways.
In the intermediate category, the extent of control may range from selective punishment of dissidents on a limited number of issues to a situation in which only determined critics manage to make themselves heard, and there is some freedom of private discussion. In the most restrictive category, where all dissent is forbidden and effectively suppressed, citizens are wary of criticizing the government, even privately.
Availability of alternative sources of information is measured by virtue of four categories: (1) alternative sources of information exist and are protected by law; (2) if there is significant government ownership of the media, they are effectively controlled by truly independent or multiparty bodies, where alternative sources of information are widely available, but government versions are presented in preferential fashion; (3) the government or ruling party dominates the diffusion of information to such a degree that alternative sources exist only for non-political issues, for short periods of time or for small segments of the population; and (4) the media are either mostly controlled directly by the government or party or restricted by routine prior censorship, near-certain punishment of dissident reporters, publishers and broadcasters, or pervasive self-censorship.
The indicator of the right to vote is represented by universal suffrage broken down into four categories: universal adult suffrage; suffrage with partial restrictions; suffrage denied to large segments of the population; no suffrage. As for the coding of the suffrage indicator, the legal provisions of the countries analysed were used. Finally, the authors, to maintain the approach suggested by Dahl, use an inclusiveness measure regarding suffrage and constructed a public contestation measure by using a Guttman scale that ranged from systems with full contestation to systems allowing no contestation at all.
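For readers unfamiliar with Guttman scaling, the underlying idea of a cumulative scale can be sketched in its textbook form. The example below is a generic illustration with dichotomous (0/1) items ordered from least to most demanding; it is not the procedure followed by the authors, whose variables have three or four ordered categories each, and the function names are hypothetical.

```python
def is_guttman_pattern(responses):
    """Return True if a 0/1 pattern is cumulative.

    Items are assumed to be ordered from least to most demanding, so a
    pattern fits a perfect Guttman scale when no 1 follows the first 0.
    """
    seen_zero = False
    for r in responses:
        if r not in (0, 1):
            raise ValueError("responses must be 0 or 1")
        if r == 0:
            seen_zero = True
        elif seen_zero:
            return False
    return True


def scale_score(responses):
    """Scale score of a cumulative pattern: the number of items met."""
    if not is_guttman_pattern(responses):
        raise ValueError("pattern is not cumulative")
    return sum(responses)


print(is_guttman_pattern([1, 1, 0, 0]))  # cumulative pattern
print(is_guttman_pattern([1, 0, 1, 0]))  # violates the cumulative ordering
```

The appeal of such a scale is that a single score identifies the full pattern of answers: a case with a score of 2 on four items has, by construction, met the two least demanding items and none of the others.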
The contribution by Coppedge and Reinicke was applied to one year (1985). It could be argued that the limited year coverage is the only weak point of the analytical exercise, since the theoretical underpinnings of the Polyarchy scale can be seen as particularly robust. More recently, Coppedge (after playing a key role in the development of the V-Dem dataset) and others have further refined the Polyarchy scale (reducing the institutional guarantees from eight to five), using the V-Dem dataset (see below) to produce a full measurement of the development of democracy in 182 countries, covering 1900-2017 (Teorell et al. 2019).
We will discuss the innovativeness of this dataset further in the comparative assessment, but it seems that the sophisticated construction of the dataset reduces the margins for subjectivity compared to the Freedom House index. As argued by the authors, the new Polyarchy V-Dem methodology manages to capture fully the various dimensions of Dahl's concept of polyarchy, it provides 'disaggregated data allowing for analyses of dimensionality and inquiries into what lower-level changes account for the shifts in higher-level indices', it covers longitudinally a wide range of countries over a long time period, and, most importantly, it uses transparent data-generating 'processes and aggregation rules' and provides estimates of measurement uncertainty (Teorell et al. 2019: 76).

The Economist Intelligence Unit (EIU) Democracy Index
Since 2006, The Economist, via its Intelligence Unit, has entered into the 'democracy index' business and now produces a yearly report which, in the last edition (2020), covered '165 independent states and 2 territories'. By adopting a 'thick' definition of democracy, unlike other indices such as Freedom House (which adopts a 'thin' approach), the democracy index refers, similarly to the other more academic indices discussed previously, to Dahl and his conceptualization of democracy. Therefore, the five categories used are electoral process and pluralism, civil liberties, functioning of government, political participation and political culture. Together with more conventional measures connected to elections, political and individual rights' protection, The Economist considers three dimensions that are scarcely taken into account by other indices/datasets. This is innovative but at the same time it may create additional concerns, as we argue below.
First, the functioning of government is not linked to representation or executive constraints but rather to implementation (or institutional) capacities. In the questionnaire or the 'model' (as the EIU calls it) we find questions such as 'How pervasive is corruption?' or 'Is the civil service willing to and capable of implementing government policy?' which are aimed at verifying democracy's capacity to deliver. Questions regarding trust are also part of the 'model', something that we have not found in other indices. Second, political participation is particularly valued, both in electoral and non-electoral forms. In the relevant sections of the questionnaire, together with questions aimed at capturing the intensity of electoral participation and the gender dimension, there are questions aimed at establishing the '[e]xtent of political participation. Membership of political parties and political non-governmental organizations' with multiple choice answers (over 7%, between 4% and 7% and below 4%), or capturing '[t]he preparedness of population to take part in lawful demonstrations' (multiple choice answers: high, moderate, low) or even a question about adult literacy, which can at best be seen as an indirect proxy for political participation. Important issues are tackled but, in some cases, reliable information may not be easily found and in others the questions are not directly connected to the overall category.
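The three-option coding of the membership question can be illustrated with a small lookup. The sketch uses only the answer options quoted above; the function name is hypothetical, and the treatment of the boundary values (exactly 4% and exactly 7% are placed in the middle option) is an assumption, since the wording quoted here does not settle it. The points attached to each option are not reproduced.

```python
def membership_category(share_percent):
    """Place a membership share (in percent) in one of the three quoted options.

    Assumption: exactly 4% and exactly 7% fall in the middle option.
    """
    if not 0 <= share_percent <= 100:
        raise ValueError("share must be a percentage between 0 and 100")
    if share_percent > 7:
        return "over 7%"
    if share_percent >= 4:
        return "between 4% and 7%"
    return "below 4%"


# Hypothetical membership shares, for illustration only.
for share in (2.5, 5.0, 9.0):
    print(share, "->", membership_category(share))
```

The example also shows why such items are demanding in practice: the coder needs a reliable membership figure for every country before any option can be chosen.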
Finally, the (democratic) political culture dimension is investigated through questions aimed at understanding if there is 'a sufficient degree of societal consensus and cohesion to underpin a stable, functioning democracy' (Yes; Yes, but some serious doubts and risks; No). Put differently, democracy is analysed from a substantive perspective and not from a merely procedural one.
For media-related reasons, similarly to the Freedom House index, the EIU's Democracy Index is widely cited and referred to in non-academic circles and is constantly used in public debates. However, like Freedom House, the EIU Democracy Index has substantial weaknesses. Not only are some questions difficult to answer, but also the sources are anonymous. To be sure, one methodological innovation is using survey data where possible. Nevertheless, from a methodological standpoint, combining different sources of information for the same questions constitutes a well-known potential trap. Furthermore, although one of the most recent reports tries to deal with the issue of reliability (Economist Intelligence Unit 2021: 58), the solution is not fully convincing since, as the authors of the report acknowledge, '[t]wo- and three-point systems do not guarantee reliability', although they reduce the reliability issues which could emerge with 1-5 or 1-7 point scales (unless specific intercoder reliability measures are foreseen).
In sum, some of the dimensions introduced by the EIU are useful to expand ways of monitoring democracies, and the index has the advantage of clarity and parsimony. However, the methodological shortcomings mean the index is not ideal for academic usage.

Varieties of Democracy (V-Dem) Liberal Democracy Index
In 2011, an article published in Perspectives on Politics engaged with the shortcomings of the available indices and suggested new avenues for the creation of democracy databases. Six key critical issues are identified: definition, precision, coverage and sources, coding, aggregation, and validity and reliability tests (Coppedge et al. 2011: 248). From a definitional point of view, the authors underline the fact that very often a descriptive and a normative dimension are intertwined in the existing indices and therefore they suggest taking full account of this risk in a new index. Furthermore, the authors underline that 'the precision or reliability of all indices is too low to justify confidence that a country with a score a few points higher is actually more democratic' (Coppedge et al. 2011: 249). This is a very important remark since it underlines the quality dimension that has been missing from some indices and that is included in others (such as the Democracy Barometer discussed below). With reference to coverage and sources, Coppedge et al. signal some substantial problems with existing indices, that is, the limited reach of sources, be they surveys or newspaper sources. These limitations clearly reduce the capacity of the indices to be fully longitudinally reliable and comparable. Coding, which is an issue also raised by other critiques, is seen as highly problematic. Coppedge et al. underline the structural limitations derived from the vagueness of the coding criteria (for example, of the EIU; see above). Aggregation is also seen as a weak point, since 'although most indices have fairly explicit aggregation rules, they are sometimes difficult to comprehend and consequently to apply' (Coppedge et al. 2011: 250). More specifically, clear guidelines for aggregation are often missing and this lacuna reduces the validity of the overall assessment of democracy. Finally, in line with other critiques, validity and reliability issues are seen as problematic since 'inter-coder reliability tests are not common practice among democracy indices' (Coppedge et al. 2011: 251). Understandably, these tests are seen as fundamental to make the analyses robust. Some simple correlation tests between two of the most used indices (Polity and Freedom House Political Rights) confirm that there may be some issues that are not adequately considered. Therefore, the authors suggest a new approach for the creation of datasets/indices which builds on six conceptions of democracy (electoral, liberal, majoritarian, participatory, deliberative and egalitarian) and on 33 indicators ranging from sovereignty to inclusive citizenship.
In recent years, the V-Dem project has been further developed: the six conceptions of democracy became five 'high-level indicators' (excluding the redundant 'majoritarian' and keeping all the others) and the dataset has been extended. Currently, it covers 202 countries from 1789 to 2021, translating 'the abstract theoretical principles of democracy … into more than 400 detailed questions with well-defined response categories or measurement scales [and] data stems from almost 200 indicators collected from country experts, mostly academics from each country in question' (Coppedge et al. 2015: 581). It distinguishes between liberal democracies, electoral democracies, electoral autocracies and closed autocracies. The most recent report shows, in line with other findings by Freedom House and the EIU, how liberal democracies are declining and 68% of the world population lives in electoral and closed autocracies (Boese et al. 2022).
V-Dem stands out as a research project that was then translated into a new index of democracy which, by virtue of its easily disaggregated usage, could be seen as 'best practice' for (conceptualizing and) measuring democracy. The only limitation of the dataset could be the mix between subjective and objective information. However, substantial efforts are made to increase the intercoder reliability by using, among other things, Item-Response Theory (IRT) models, and therefore the limitations become residual.

Conceptualizing and measuring the quality of democracy
The large coverage of years and variables of V-Dem amounts to more than simply 'thick' conceptualizations of democracy. The relevance of capturing the quality of democracy is one of the aims of the project, since in one of the first articles presenting the dataset, a direct reference to the work by 'quality of democracy' scholars is made (Coppedge et al. 2015: 581). In fact, the multidimensional approach to democracy and the identification of different components, which go far beyond an electoral and liberal conception of democracy, set up a dialogue with other strands of the literature that have primarily focused on the quality of democracy.
The quality of democracy is connected to the notion of democratic consolidation (Linz and Stepan 1996) and has been explored conceptually since the second half of the 1990s. David Altman and Aníbal Pérez-Liñán (2002) were among the first to analyse empirically the quality of democracy (in Latin America). Larry Diamond and Leonardo Morlino (2004) provide a way of conceptualizing it:
… a good democracy accords its citizens ample freedom, political equality, and control over public policies and policy makers through the legitimate and lawful functioning of stable institutions. Such a regime will satisfy citizen expectations regarding governance (quality of results); it will allow citizens, associations, and communities to enjoy extensive liberty and political equality (quality of content); and it will provide a context in which the whole citizenry can judge the government's performance through mechanisms such as elections, while governmental institutions and officials hold one another legally and constitutionally accountable as well (procedural quality). (Diamond and Morlino 2004: 22, emphasis in original)
Their eight dimensions of democratic quality are: rule of law, participation, competition, vertical and horizontal accountability (mainly procedural), respect for civil and political freedoms, progressive implementation of greater political, social and economic equality (mainly substantive) and responsiveness (which connects procedure and substance).

Democracy Barometer Quality of Democracy Index
The first dataset specifically dedicated to measuring the quality of democracy is the Democracy Barometer (DB), initiated in 2005 as a joint project between the Berlin Social Science Centre (WZB) and the Centre for Democracy Studies Aarau (ZDA). It emerged from a larger project of the Swiss National Centre of Competence in Research (NCCR), 'Challenges to Democracy in the 21st Century', which involved a number of European researchers, especially from Switzerland and Germany. The database stems from the consideration that existing conceptualizations were poor at the time, and it is built 'on a middle-range concept of democracy, embracing liberal as well as participatory ideas of democracy' (Bühlmann et al. 2012: 519). The authors conceptualize democracy using three key principles: freedom, equality and control. These principles are then translated into nine functions: individual liberties, rule of law and public sphere (freedom); transparency, participation and representation (equality); competition, mutual constraints and governmental capability (control). The measurement is then obtained using over 100 indicators that are derived from secondary data and, in its most recent version (V7, 2020), 53 countries are covered for the 1990-2017 period.
What is particularly laudable in this contribution is the measurement: experts' assessments are not considered, as data are obtained from official statistical sources and via representative surveys. This is a strength since it avoids intercoder reliability problems. However, less convincing is the conceptualization of the 'quality of democracy', especially with reference to the equality dimension. Unlike Diamond and Morlino's (2004) conceptualization (and unlike, less explicitly, the conceptualization of the V-Dem project), economic equality is not considered as key. Christian Houle argues very convincingly that 'democracies with sufficiently low levels of inequality are nearly immune from breakdowns' (Houle 2009: 615), which means that consolidation (i.e. quality of democracy) requires low inequality. Following the development of social rights as key features of healthy democracies, any index that claims to cover the quality of democracy must at least engage with a discussion regarding the relevance (or irrelevance) of economic equality. Therefore, any conceptualization and measurement of democratic quality (which also means capacity to last over time) should include at least one indicator of economic inequality, as in, for example, the V-Dem database, where the equal distribution of resources index (covering, among other indicators, educational equality and health equality) is calculated.

Quality of Government
Together with the quality of democracy, in recent years, the effectiveness of government (which could be seen as one specific component of democratic quality, i.e. the 'governmental capability' function in the Democracy Barometer or, broadly speaking, responsiveness in Diamond and Morlino's conceptualization) has been scrutinized by virtue of another project and database hosted by the Quality of Government Institute at the University of Gothenburg, created in 2004 by Bo Rothstein and Sören Holmberg. In a 2008 article, Bo Rothstein and Jan Teorell, with respect to the quality of government, 'argue that democracy, which concerns the access to government power, is a necessary but insufficient criterion' (Rothstein and Teorell 2008: 166). Therefore, the main focus of the quality of government should be 'impartiality', which constitutes its main qualification, and implies that 'government officials shall not take into consideration anything about the citizen/case that is not beforehand stipulated in the policy or the law' (Rothstein and Teorell 2008: 170). Impartiality, however, needs to be considered in a broad sense, that is, with reference to a number of policy domains. Some 194 countries are included plus 17 historical countries not existing in 2014. Unlike other datasets, the quality of government does not truly have a 'global' index but only a European Quality of Government Index (EQI), which is available at the regional level. In the 'standard' dataset, over 150 variables are used to capture governmental quality.
Since the focus of the dataset exceeds the main topic (democracy datasets) of our contribution, we shall not discuss it in detail. What we shall note, however, is that the conceptualization of the quality of government is rather thin. This is, most likely, intentional, since the researchers wanted to collect disaggregated data that could then be used by researchers, following their own research interests and focuses. However, the added value of such an impressive dataset could increase if a more nuanced conceptualization of quality and a manageable index were produced. To a certain extent, the 'redundancy' critique made by Munck and Verkuilen (2002) could also apply to the dataset constructed under the 'quality of government' label. To be sure, this is not to say that the indicators provided are irrelevant for specific inquiries. It underlines, however, that some aggregation could be useful to those scholars (and media professionals) who would like to monitor the evolution of governmental quality.

The challenge of monitoring democracy in Europe
As indicated above, researchers and media professionals currently have an abundance of useful datasets and indices that may be used to monitor democracy in the world. From a normative perspective, concerns are growing about what has been labelled 'democratic recession' (Diamond 2015) or 'democratic backsliding' (Bermeo 2016). The indices presented here underline the risks for democracies, also in the European Union. In fact, especially in Central-Eastern Europe, the notion of 'democratic backsliding' (Hanley and Vachudova 2018; Vachudova 2020) has been used to portray democratic troubles. Nevertheless, depending on which measure we consider, the changes in democracy scores or levels may vary considerably. This calls for a reflection on which of the indices are better suited to track changes in democracy and to make an assessment in an area such as Europe, where democracy is established and consolidated.
How have European countries performed over the past years? In this section, we use the various sources presented in the previous sections to establish the extent of democratic backsliding in European countries. Moreover, we assess how measures of democracy compare with one another and see whether they provide different indications about the status of democracy in Europe. Indeed, the scores rely on quite diverse concepts of democracy, which of course affect their measurement. Some indices, being more complex and multidimensional, might be more able to capture subtle changes in a context in which democracies are long-standing, or at least consolidated. In contrast, indices relying on fewer indicators, which translate into more minimalist concepts of democracy, might deal well with worldwide comparisons of the level of democracy, yet less so within Europe. Thus, such a comparison could provide an indication about how these indices might assess democracy in a specific context.
Table 3 shows the changes in the score of democracy produced by Freedom House over the period 2005-2020. According to this measure, there has been a slight democratic contraction as almost all the countries under scrutiny have declined to some degree on the scale. Hungary stands out as the country that lost most points between 2005 and 2020 (−24), followed by Poland (−10) and Bulgaria (−9). Romania sees the largest increase (+8 points) followed by Slovenia (+3) and Croatia (+1). These are the only countries that are upgraded, whereas Finland, Sweden and Norway remain stable at the top. Many countries show limited changes (such as Denmark, Ireland, the Netherlands and Portugal), indicating that either there has not been democratic backsliding or such a measure of democracy is not well suited to capture changes in democracy in long-standing or consolidated democracies. Indeed, the index produced by Freedom House is much more useful if we look at a much larger picture than that provided by European countries only. Moreover, we can notice that the variability of the Freedom House index is limited within Europe. In 2005, for instance, 15 countries score between 96 and 100 and 10 between 91 and 95, which does not allow country specificities to be fully captured.
Table 4 reports the Polity score in 2005 and 2018 (the latest year available) and its change. This measure seems to be affected by problems similar to those of the Freedom House measure. The first thing to be noticed is that there is very little variation across countries in both years. In 2005, 23 countries out of 29 score the highest on the autocracy-democracy scale (10). In 2018 there were 21 countries with the highest score. This is because Belgium and the United Kingdom drop two points on the scale and the Czech Republic one point, while Slovakia improves by one point (from 9 to 10). Thus, according to the Polity index, there has been little change in democracy in European countries. We could question this result and wonder whether this is what is actually happening in this context. Indeed, there have been various accounts that highlight that some countries, such as Poland or Hungary, have experienced a democratic recession in recent years (e.g. Bernhard 2021). In contrast, the Polity index indicates that these two countries have the same scores as traditionally deep-rooted and consolidated democracies such as Sweden or Germany. This evidence might cast some doubts on the ability of the Polity index to gauge democratic change in Europe. It is certainly a very useful measure as it provides an instrument to track long-term trends in regime changes, starting from the 19th century, or cross-national differences, as it comprises a large number of countries. However, as with the Freedom House index, this measure might be more suited to evaluate differences at a worldwide level (see Treier and Jackman 2008) than in the relatively homogeneous context of Europe.
Data from the Economist Intelligence Unit's Democracy Index cover the period from 2006 to 2021 (Table 5). In most countries, the index drops. On average it falls from 8.29 in 2006 to 7.97 in 2020, a decline of −0.32. The largest decreases are in Hungary, Greece, Malta, the Netherlands and Romania. Only in five countries does the index increase (Italy, Ireland, Estonia, Norway and the United Kingdom). Yet, for some of them (Italy and Ireland) such improvement appears to be minimal. Compared to the previous two measures, the EIU index seems to be less skewed towards high scores. For instance, in 2006 only nine countries score higher than 9 (four between 9.1 and 9.5, five between 9.6 and 10), which might indicate that this index could be considered as a 'stricter' measure of democracy as it expands the dimensions under scrutiny.
Figure 2 reports the trends in levels of democracy using the V-Dem Liberal Democracy Index. If we look at the fine-grained picture portrayed by the V-Dem Index, it does not seem that democracy retreats among European countries, at least not for most. We can easily notice that the trend lines are, for the majority of countries, flat. In particular, Austria, Belgium, Cyprus, Denmark, Finland, France, Germany, Iceland, Ireland, Italy, Lithuania, Luxembourg, Netherlands, Norway, Sweden, Switzerland and the United Kingdom have trends that hardly move. Other countries, such as Estonia, Finland, Greece, Latvia, Portugal, Slovakia or Spain, present very minimal changes. Romania is the only country that shows a positive, non-negligible change, with an increase of 0.088. In particular, we can see that in this country democracy improves up to 2016, then it decreases until 2019 and grows back in 2020.
In contrast, democratic backsliding can be clearly detected in only a few countries, most notably Hungary and Poland. From 2005 to 2020, Hungary drops 0.402 points, which is a very dramatic change given that the scale ranges from 0 to 1. The decrease appears to be constant since 2010. A not-too-different picture can be seen for Poland. Over the same period this country drops 0.322 points, in particular since 2016. For these two countries the levels of democracy have, basically, halved from 2005 to 2020. In Bulgaria and the Czech Republic the index also decreases, yet to a more limited extent: in the former it drops 0.128 points, in the latter 0.119. Slovenia drops 0.093 points between 2005 and 2020. The decrease is mostly found in the later years observed.
Finally, it should be considered that this index comes with uncertainty. The aggregation method used to build 'lower-level' indices, which in the end make up the final measure of democracy discussed here, allows evaluators' disagreements to be taken into account. Therefore, the measurement of components or dimensions of democracy comes with an error that 'provides vital information about the degree to which one can be certain that a change in scores reflects an actual change in the level of the concept being measured' (Coppedge et al. 2021: 26). This uncertainty is reported in Figure 2 as 95% confidence intervals. Because the intervals at the start and at the end of the period overlap for most countries, most changes are not significant from a statistical point of view. Democratic backsliding is statistically significant only in Poland and Hungary.
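The overlap criterion described above can be illustrated with a minimal sketch. The country labels, point estimates and interval bounds below are hypothetical and are not taken from the V-Dem dataset; the function names are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """Point estimate of an index together with its 95% interval bounds."""
    value: float
    low: float
    high: float


def intervals_overlap(a: Estimate, b: Estimate) -> bool:
    """Return True when the two intervals share at least one point."""
    return a.low <= b.high and b.low <= a.high


def change_is_distinguishable(start: Estimate, end: Estimate) -> bool:
    """Treat a change as significant only if the two intervals are disjoint."""
    return not intervals_overlap(start, end)


# Hypothetical values for illustration only (not V-Dem data).
stable_2005 = Estimate(value=0.80, low=0.75, high=0.85)
stable_2020 = Estimate(value=0.78, low=0.73, high=0.83)
declining_2005 = Estimate(value=0.78, low=0.73, high=0.83)
declining_2020 = Estimate(value=0.40, low=0.35, high=0.45)

print(change_is_distinguishable(stable_2005, stable_2020))        # False
print(change_is_distinguishable(declining_2005, declining_2020))  # True
```

Non-overlap is a conservative rule of thumb: two intervals may overlap slightly even when a formal test of the difference would indicate a significant change.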
Coming to the measures capturing the quality of democracy, Figure A1 in the Online Appendix reports the Quality of Democracy Index by the Democracy Barometer. The underlying concept of this index is that democracies should represent a balance between the principles of freedom and equality, and that this balance is achieved by control. These dimensions are broken down into multiple components, which are then measured by multiple indicators. This index has no theoretically defined minimum or maximum. Hungary is the country with the largest decrease in the score (−0.304), followed by Denmark (−0.294). The decline for Denmark and Finland (−0.209) is not consistent with some of the data presented above. Various countries show minimal changes, such as Estonia and Spain, but so do Poland, Ireland and the United Kingdom. The countries with the largest positive changes are the Czech Republic (+0.204), the Netherlands (+0.158), France (+0.131) and Germany (+0.127) (see Figure A1 in the Online Appendix).
Next, we assess changes in democracy using the European Quality of Government Index by the Quality of Government Institute. This index, of course, does not relate directly to the concept of 'quality of democracy' but it is worth looking at it to compare how democracies in Europe perform over time, taking a slightly different point of view. This index is based on multiple individual-level surveys through which respondents evaluate dimensions (or 'pillars') of the quality of government (corruption as perception and experience of this practice, impartiality in treatment by authorities and institutions, and quality of services provided by the state), as well as data from the World Bank's World Governance Indicators. The data are originally available for European subnational regions, which were aggregated at the country level for the purpose of this article. Looking at Table 6 we can see that Malta (−0.815), Cyprus (−0.625), Hungary (−0.446) and Ireland (−0.331) are the countries that experience the largest drop in the index. On the other hand, Lithuania (+0.792), the Netherlands (+0.531), Latvia (+0.521), Estonia (+0.479) and Bulgaria (+0.380) are the countries that improved the most over the period observed.
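The text does not specify how regional scores were aggregated at the country level. One plausible approach, shown here purely as an assumption, is a population-weighted mean of the regional scores. The region data, country labels and function name are hypothetical.

```python
from collections import defaultdict


def aggregate_to_country(regions):
    """Population-weighted mean of regional scores for each country.

    `regions` is an iterable of (country, score, population) tuples.
    Weighting by population is an assumption, not the article's stated method.
    """
    weighted_sum = defaultdict(float)
    total_population = defaultdict(float)
    for country, score, population in regions:
        weighted_sum[country] += score * population
        total_population[country] += population
    return {
        country: weighted_sum[country] / total_population[country]
        for country in weighted_sum
    }


# Hypothetical regional scores and populations (not EQI data).
regions = [
    ("A", 1.0, 2_000_000),
    ("A", 0.0, 2_000_000),
    ("B", -0.5, 1_000_000),
    ("B", 0.5, 3_000_000),
]

country_scores = aggregate_to_country(regions)
print(country_scores)  # {'A': 0.5, 'B': 0.25}
```

An unweighted mean of regions would be a simpler alternative, but it gives small and large regions the same influence on the country score.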
After this overview of possible indices useful to assess the state of democracy in Europe, we look for relationships between them. Figure 3 shows the distributions, associations and correlation coefficients between the indices discussed above at three points in time (2010, 2013 and 2017). In the diagonal of the figure the distributions of the indices are shown. We can see that the Freedom House, Polity and V-Dem indices tend to be concentrated towards the right-hand end of the scale, meaning that countries mostly score high. In contrast, the EIU, the Democracy Barometer and European Quality of Government indices have distributions that tend to be located more to the left. This might indicate that these indices are more 'demanding' in their assessment of democracy.
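The kind of pairwise comparison summarized in Figure 3 can be sketched as follows. The sketch assumes Pearson correlation coefficients, which the text does not state explicitly, and uses hypothetical index names and scores for five countries. Because Pearson's r is unaffected by linear rescaling, indices measured on different scales can be compared directly.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(var_x * var_y)


# Hypothetical scores for the same five countries (not real index data).
indices = {
    "index_a": [95, 90, 70, 85, 60],       # 0-100 scale
    "index_b": [9.1, 8.7, 6.9, 8.0, 6.2],  # 0-10 scale
    "index_c": [10, 10, 10, 9, 10],        # almost no variation
}

# Report the correlation for every pair of indices.
for left, right in combinations(indices, 2):
    r = pearson(indices[left], indices[right])
    print(f"{left} vs {right}: r = {r:.2f}")
```

In this toy example the near-constant index correlates only weakly with the others, which illustrates why a measure with very little variation across countries tends to show low correlations.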
The lower and upper panels of the plot illustrate and report, respectively, the associations and the correlation coefficients between these indices. Starting from the Freedom House index, we can see that it correlates fairly highly with the EIU, the V-Dem and the EQI indices, less with the DB index, and weakly with the Polity index. There are some differences if we consider the years separately, yet correlations are mostly similar. The Polity index correlates weakly with all the other indices, especially the DB index. This might highlight the two different concepts underlying these measures and, consequently, their measurement. Indeed, the Polity index gauges whether essential elements of democracy are present in countries, such as competitiveness, the openness of executive recruitment, constraints on the executive and the competitiveness of political participation. Moreover, the Polity score does 'not include coded data on civil liberties' (Marshall and Gurr 2018: 14), which might prevent it from giving a full picture of what democracy is. Other indices are built relying on much more complex concepts of democracy. For instance, the DB index encompasses nine functions of democracy and uses more than 100 indicators to capture them. The V-Dem index, similarly, uses a more complex conceptualization and, thus, operationalization of democracy as it considers the liberal components (the limits on government, such as civil liberties, rule of law, independence of the judiciary, checks and balances) and the electoral components of democracy (responsiveness to citizens, electoral competition, free political and civil society organizations, clean elections, freedom of expression).
The EIU index shows a very strong correlation with the EQI index, showing that, despite the criticism above, the EIU index is quite consistent with the one produced by as important an academic institution as the Quality of Government Institute. Overall, most indices (considering 2010, 2013, 2017 and all years) present correlations that are higher than 0.7, which indicates a decent degree of overlap between these measures. The only index that is weakly associated with the other measures is the Polity index, which might suggest that, at least for the context of Europe, it could be a less precise measure.

Conclusion: monitoring democratic progress in Europe and beyond
The aim of this article is to discuss some existing conceptualizations and measurements of democracy and to provide a succinct overview of the main democratic trends in recent years in European countries. This exploratory exercise has shown how indices, adopting either a 'thin' or a 'thick' conceptualization of democracy, may vary when applied to European countries. However, especially in the more nuanced datasets (the new Freedom House scoring and V-Dem), the deterioration of democratic health in European countries becomes quite visible.
To be sure, the purpose of this preliminary comparison is not simply to underline that different indices (and especially different nuances of the same indices) may produce different results. As a matter of fact, our analysis shows that the results often overlap, with the exception of the Polity dataset. Our contribution enables us to make the argument that, when we are looking for a regional focus, such indices are merely a point of departure for broader, in-depth comparisons that could enrich our comparative understanding of how specific democracies work (evolve or backslide). Furthermore, from a normative perspective, the indices shed light on the areas on which governments, NGOs and international organizations need to focus in order to reduce democratic backsliding and increase democratic consolidation. A constant monitoring of democracy within the European Union, which could be pursued using the highly correlated indices and drawing on empirically solid comparative or case studies, could be the next step. This exercise, if done regularly, would offer detailed and updated knowledge of the state of democracy in the European Union (and beyond) and possibly identify tools to reinvigorate it and to reverse democratic backsliding.

Figure 1. The Share (in Percentage) of Not Free, Partially Free and Free Political Regimes between 1972 and 2020 according to Freedom House. Source: Freedom in the World reports, 1973-2020.

Figure 2. The Trends in the V-Dem Liberal Democracy Index (LDI) in Europe, 2005-2020. Source: V-Dem Dataset Version 11.1. Note: European Union countries plus Iceland, Norway, Switzerland and the United Kingdom. Grey bands represent the 95% confidence intervals.

Figure 3. Distributions, Associations and Correlations of the Indices in 2010, 2013, 2017 and all Years. Source: see text. Note: *** p < 0.001; ** p < 0.01; * p < 0.05; p < 0.10. The panels on the diagonal show the distribution of the scores; the panels in the lower triangle show the data points and the fit lines; the panels in the upper triangle show the correlation coefficients between pairs of scores; black, light grey and grey lines/coefficients represent, respectively, the years 2010, 2013 and 2017.

Table 1. Freedom House: Political Rights Indicators
• Strong and effective safeguards against official corruption
• The government operates with openness and transparency
Source: https://freedomhouse.org/reports/freedom-world/freedom-world-research-methodology (2021).

Table 2. Freedom House: Civil Liberties Indicators
• Due process prevalence
• Protection from the illegitimate use of physical force
• Guarantees of equal treatment
Personal autonomy and individual rights
• Freedom of movement
• Property rights
• Personal social freedom
• Individual equality of opportunity
Source: https://freedomhouse.org/reports/freedom-world/freedom-world-research-methodology (2021).

Table 4. Democracy in Europe according to the Polity Index (2005 and 2018). Note: European Union countries plus Iceland, Norway, Switzerland and the United Kingdom.

Table 5. The Change in the Economist Intelligence Unit Democracy Index in Europe (2006 and 2021). Source: Economist Intelligence Unit Democracy Index (2006 and 2021). Note: European Union countries plus Iceland, Norway, Switzerland and the United Kingdom.

Table 6. The Change in the European Quality of Government Index in Europe, 2010-2021. Source: European Quality of Government Index 2021. Note: European Union countries plus Iceland, Norway, Switzerland and the United Kingdom.