The evaluation of the role of face masks in preventing respiratory infections is a paradigm case in synthesising complex evidence (i.e. extensive, diverse, technically specialised, and with multilevel chains of causality). Primary studies have assessed different mask types, diseases, populations, and settings using different research designs. Numerous review teams have attempted to synthesise this literature, in which observational (case–control, cohort, cross-sectional) and ecological studies predominate. Their findings and conclusions vary widely.
This article critically examines how 66 systematic reviews dealt with mask efficacy studies. Risk-of-bias tools produced unreliable assessments when—as was often the case—review teams lacked methodological expertise or topic-specific understanding. This was especially true when datasets were large and heterogeneous, with multiple biases playing out in different ways and requiring nuanced adjustments. In such circumstances, tools were sometimes used crudely and reductively rather than to support close reading of primary studies and guide expert judgments. Various moves by reviewers—excluding observational evidence altogether, assessing risk but not direction of biases, omitting distinguishing details of primary studies, and producing meta-analyses that combined studies of different designs or included studies at critical risk of bias—served to obscure important aspects of heterogeneity, resulting in bland and unhelpful summary statements.
We draw on philosophy to question the formulaic use of generic risk-of-bias tools, especially when the primary evidence demands expert understanding and tailoring of study quality questions to the topic. We call for more rigorous training and oversight of reviewers of complex evidence and for new review methods designed specifically for such evidence.
Over the past two decades, there has been growing interest in analyzing the effects of educational programs on outcomes using process data from computer-based testing and learning environments. However, most analyses focus on final outcomes at the end of a test or session, overlooking their functional nature over time and neglecting causal mechanisms. To address this gap, this article proposes a novel causal mediation framework for identifying and estimating functional natural direct effects, functional natural indirect effects, and functional total effects, along with their subgroup effects. We define these effects using potential outcomes and provide nonparametric identification strategies depending on whether post-treatment covariates are present or not. We then develop estimation methods using generalized additive models, a flexible and robust tool for analyzing functional data. Through a simulation study, we assess the finite-sample performance of the proposed approach by comparing it to parametric regression methods. We also demonstrate our approach by examining the effects of extended time accommodations on two functional outcomes using process data from the National Assessment of Educational Progress. Our mediation approach with functional outcomes effectively captures dynamic causal mechanisms underlying the program’s effects and pinpoints when and for whom each effect manifests throughout the testing period.
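For reference, the following is a schematic statement of the functional effects named in this abstract, following the standard mediation decomposition and written in our own notation (treatment A with levels 0/1, mediator M, potential functional outcome Y_i(a, m)(t) indexed by testing time t); the article's exact definitions and identification conditions may differ.

```latex
% Schematic definitions in our notation (assumed, not necessarily the article's):
% A = treatment, M_i(a) = potential mediator, Y_i(a, m)(t) = potential functional outcome at time t.
\begin{align*}
\mathrm{FTE}(t)  &= \mathbb{E}\bigl[\,Y_i\bigl(1, M_i(1)\bigr)(t) - Y_i\bigl(0, M_i(0)\bigr)(t)\,\bigr] \\
\mathrm{FNDE}(t) &= \mathbb{E}\bigl[\,Y_i\bigl(1, M_i(0)\bigr)(t) - Y_i\bigl(0, M_i(0)\bigr)(t)\,\bigr] \\
\mathrm{FNIE}(t) &= \mathbb{E}\bigl[\,Y_i\bigl(1, M_i(1)\bigr)(t) - Y_i\bigl(1, M_i(0)\bigr)(t)\,\bigr] \\
\mathrm{FTE}(t)  &= \mathrm{FNDE}(t) + \mathrm{FNIE}(t) \quad \text{for every } t .
\end{align*}
```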
Researchers frequently deliver treatments through messages, as in many audit and get-out-the-vote (GOTV) experiments. These message-based experiments often hinge on intermediary variables—actions subjects must take to actually receive the treatment or control embedded in a message. Whether subjects open the message is a crucial intermediary step, which can serve as a condition for estimating downstream treatment effects or as an outcome of interest in its own right. Yet opens are often measured with error, most notably when some openers are misclassified as non-openers in email-based studies. We characterize the resulting bias, derive interpretable bounds on effects for well-defined subgroups, and provide sensitivity analyses for mismeasurement, thereby offering practical guidance for message-based experiments conducted through email and other communication technologies.
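The bounds and sensitivity analyses are the paper's contribution; purely to illustrate why one-sided misclassification of openers matters, here is a minimal Wald-style sensitivity sweep. The numbers (itt_estimate, observed_open_rate) and the scaling logic are our assumptions for illustration, not the authors' estimator.

```python
import numpy as np

# Hypothetical illustration (not the paper's estimator or data): a Wald-style
# sensitivity sweep for one-sided misclassification, where some true openers
# are recorded as non-openers with probability fn_rate.

itt_estimate = 0.02        # assumed intent-to-treat effect on the outcome
observed_open_rate = 0.25  # assumed share of subjects recorded as openers

for fn_rate in np.linspace(0.0, 0.4, 5):
    # If openers are missed at rate fn_rate, the true open rate is larger
    # than the observed one: true = observed / (1 - fn_rate).
    true_open_rate = observed_open_rate / (1.0 - fn_rate)
    # Scaling the ITT by the corrected open rate shrinks the implied effect
    # among (true) openers relative to the naive calculation.
    effect_among_openers = itt_estimate / true_open_rate
    print(f"fn_rate={fn_rate:.2f}  implied effect among openers={effect_among_openers:.4f}")
```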
The question of whether and how federalism influences a country's welfare state has been a longstanding concern of political scientists. However, no agreement exists on exactly how, and under what conditions, federal structures impact the welfare state. This article examines this controversy. It concludes theoretically that the specific constellation of federal structures and the distribution of powers need to be considered when theorising the effects of federalism on the welfare state. Using the case of Belgium and applying the synthetic control method, the article shows that without the federalism reform of 1993, the country would have seen further decreases in social spending rather than a consolidation of this spending in the years after 1993. In the case of Belgium, the combination of increased subnational spending autonomy within a still national financing system provided ideal conditions for a positive federalism effect on social spending to occur.
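For readers unfamiliar with the synthetic control step, a minimal sketch follows; it uses placeholder data rather than the Belgian spending series, and the optimisation setup (uniform starting weights, SLSQP) is an assumption, not the article's implementation.

```python
import numpy as np
from scipy.optimize import minimize

# Minimal synthetic control sketch (placeholder data, not the article's): find
# non-negative donor weights summing to one that reproduce the treated unit's
# pre-treatment path, then compare post-treatment outcomes to the weighted
# donor average.

rng = np.random.default_rng(0)
T0, T1, J = 15, 10, 8                      # pre/post periods, donor pool size
donors_pre = rng.normal(size=(T0, J))      # placeholder pre-treatment outcomes
treated_pre = donors_pre[:, :3].mean(axis=1) + rng.normal(scale=0.1, size=T0)

def pre_fit_loss(w):
    # squared distance between treated unit and weighted donors before treatment
    return np.sum((treated_pre - donors_pre @ w) ** 2)

w0 = np.full(J, 1.0 / J)
res = minimize(pre_fit_loss, w0, method="SLSQP",
               bounds=[(0.0, 1.0)] * J,
               constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}])
weights = res.x

donors_post = rng.normal(size=(T1, J))     # placeholder post-treatment outcomes
treated_post = donors_post[:, :3].mean(axis=1) + 0.5   # built-in "effect"
gap = treated_post - donors_post @ weights  # estimated post-treatment effect path
print(np.round(weights, 3), np.round(gap.mean(), 3))
```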
Although process tracing has been practised in the Social Sciences for decades, it has only in recent years gained prominence in methodological debates in political science. In spite of its popularity, however, there has been little success in formalising its methodology, defining its standards, and identifying its range of applicability. This symposium aims to further our understanding of the methodology by discussing four essential aspects: the underlying notion of causality, the role of theory, the problem of measurement in qualitative research, and the methodology's relationship with other forms of qualitative inquiry. It brings together methodological and substantive articles by young European scholars and summarises a round-table discussion with Peter A. Hall held at a workshop at the University of Oldenburg, Germany, in November 2010.
Age is often found to be associated with a plenitude of socioeconomic, politico-administrative, biological and thanatological variables. Much less attention has been paid by scholars, however, to explaining ‘age’. In this paper we address this unfortunate scientific lacuna by developing a model of ‘age’ as a function of several factors suggested by (post)rational choice and social constructionist theories. Using state-of-the-art multilevel statistical techniques, our analysis allows the determinants of age to vary with the institutional characteristics of European countries. Our findings convincingly show that generalized trust in strangers, support for incumbent extremist political parties in provincial elections held in the month of January, and the percentage of overqualified women in the cafeterias of national parliaments are all statistically significant explanations of ‘age’. Our findings have obvious implications for conspiracy theorists, organizational advisors, spin doctors and ordinary charlatans.
Building on a Lakatosian approach that sees Social Science as an endeavour that confronts rival theories with systematic empirical observations, this article responds to probing questions that have been raised about the appropriate ways in which to conduct systematic process analysis and comparative enquiry. It explores varieties of process tracing, the role of interpretation in case studies, and the relationship between process tracing and comparative historical analysis.
A fit between theory and method is essential in theory-guided empirical research. Achieving such a fit in process tracing is less straightforward than it may seem at first glance. There are two different types of processes that one can theorise and, consequently, two varieties of process tracing. The two varieties are introduced by empirical examples and distinguished with respect to four characteristics. Failure to determine the form of process tracing at hand may lead to invalid causal inferences.
Despite the recent methodological advancements in causal panel data analysis, concerns remain about unobserved unit-specific time-varying confounders that cannot be addressed by unit or time fixed effects or their interactions. We develop a Bayesian sensitivity analysis (BSA) method to address the concern. Our proposed method is built upon a general framework combining Rubin’s Bayesian framework for model-based causal inference (Rubin [1978], The Annals of Statistics 6(1), 34–58) with parametric BSA (McCandless, Gustafson, and Levy [2007], Statistics in Medicine 26(11), 2331–2347). We assess the sensitivity of the causal effect estimate from a linear factor model to the possible existence of unobserved unit-specific time-varying confounding, using the coefficients of the treatment variable and observed confounders in the model for the unobserved confounding as sensitivity parameters. We utilize priors on these coefficients to constrain the hypothetical severity of unobserved confounding. Our proposed approach allows researchers to benchmark the assumed strength of confounding on observed confounders more systematically than conventional frequentist sensitivity analysis techniques. Moreover, to cope with convergence issues typically encountered in nonidentified Bayesian models, we develop an efficient Markov chain Monte Carlo algorithm exploiting transparent parameterization (Gustafson [2005], Statistical Science 20(2), 111–140). We illustrate our proposed method in a Monte Carlo simulation study as well as an empirical example on the effect of war on inheritance tax rates.
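The following is one plausible schematic of such a sensitivity setup, written in our own notation; it is not the paper's exact specification, and the functional forms and symbols (τ, γ, φ, δ, λ_i, f_t) are assumptions made for illustration.

```latex
% Schematic only (our notation, not necessarily the paper's exact model):
\begin{align*}
Y_{it} &= \tau D_{it} + X_{it}^{\top}\beta + \lambda_i^{\top} f_t + \gamma\, U_{it} + \varepsilon_{it}
  && \text{(outcome: linear factor model plus unobserved confounder } U_{it}\text{)} \\
U_{it} &= \phi\, D_{it} + X_{it}^{\top}\delta + \nu_{it}
  && \text{(model for the unobserved confounding; } \phi, \delta \text{ are sensitivity parameters)} \\
\phi &\sim \pi_{\phi}, \qquad \delta \sim \pi_{\delta}
  && \text{(priors constraining the hypothetical severity of confounding)}
\end{align*}
```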
This chapter reviews recent advances in addressing the identification of the effects of sanctions in cross-country and country-level studies. It argues that, given the difficulties in assessing causal relationships in cross-national data, country-level case studies can serve as a useful and informative complement to cross-national regression studies. However, case studies pose a set of additional potential empirical pitfalls that can also obfuscate rather than clarify the identification of causal mechanisms at work, so they should be treated as a complement to, rather than a substitute for, cross-national research. As an example, the chapter discusses the impact of sanctions on Venezuela and shows how they contributed to the country’s economic collapse through their impact on oil production, public sector revenues, and imports of essential goods. These findings are consistent with those of the broader cross-national literature, which identifies constraints on an economy’s trade and financial links as a key channel for the impact of sanctions.
Bessel van der Kolk’s book The Body Keeps the Score has maintained exceptional cultural and clinical influence since its publication in 2014, remaining a best-seller and shaping public discourse on trauma. Its central claims – that trauma causes lasting neurobiological damage and that body-based treatments are uniquely effective – have been widely embraced but seldom subjected to systematic critical evaluation in peer-reviewed literature. This commentary synthesises the evidentiary basis for these claims as a counterweight to an influential narrative. It situates these findings within broader discussions of neuroscience framing, cultural appeal and evidence-based communication, underscoring the need for rigorous, balanced engagement with widely disseminated mental health narratives.
In recent years, many efforts have been made to bring quantitative and qualitative methods into dialogue. This article also moves in that direction. However, in contrast to most works, the present attempt does not concern the large-N/small-N issue but focuses solely on the single-case-study framework. Within this framework, two counterfactual methods have gained great importance without ever meeting: the historical counterfactual method on the qualitative side and the synthetic control method on the quantitative side. This paper aims to advance mixed-methods research by bridging the gap between these two approaches. More precisely, it assesses whether the two methods can be used together to understand what would have happened in a single case Z in the absence of an event X. The case study of the impact of Thatcher’s election on the UK pension system is then presented as an example of their joint use.
A large empirical literature examines how judges’ traits affect how cases get resolved. This literature has led many to conclude that judges matter for case outcomes. But how much do they matter? Existing empirical findings understate the true extent of judicial influence over case outcomes since standard estimation techniques hide some disagreement among judges. We devise a machine learning method to reveal additional sources of disagreement. Applying this method to the Ninth Circuit, we estimate that at least 38% of cases could be decided differently based solely on the panel they were assigned to.
Chapter 1 starts by embedding the methods presented in this monograph into the rich landscape of statistical methods for causality research. Specifically, it contrasts methods of causal inference with methods of causal structure learning (also known as causal discovery). While the former class of statistical methods can be considered well established across the developmental, psychological, and social sciences, the latter class has only recently received attention. The methods of direction of dependence presented here can be characterized as a confirmatory approach to probe hypothesized causal structures of variable relations. To introduce the reader to the line of thinking involved in using methods of direction of dependence, the chapter presents prototypical research questions that can be answered with these statistical tools and outlines application areas that can benefit from taking a direction of dependence perspective in the analysis of research data. The methods of direction of dependence rely on higher moments of variables to discern causal structures from observational data. Thus, the chapter closes with an introductory discussion of moments of variables.
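As a toy illustration of why higher moments can carry directional information, here is a short sketch using our own simulated example (not the monograph's code): with a skewed cause and normal errors, residuals from the causally correct regression stay roughly symmetric, while residuals from the reversed regression inherit the skewness of the cause.

```python
import numpy as np
from scipy.stats import skew

# Illustrative simulation only: compare residual skewness in the two
# candidate causal directions of a simple linear relation.

rng = np.random.default_rng(1)
x = rng.exponential(size=20_000)           # skewed "cause"
y = 0.8 * x + rng.normal(size=x.size)      # linear effect with normal error

def ols_residuals(pred, resp):
    slope, intercept = np.polyfit(pred, resp, 1)
    return resp - (slope * pred + intercept)

print("skew of residuals, x -> y :", round(skew(ols_residuals(x, y)), 3))  # near 0
print("skew of residuals, y -> x :", round(skew(ols_residuals(y, x)), 3))  # clearly non-zero
```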
Existing approaches to conducting inference about the Local Average Treatment Effect (LATE) require assumptions that are considered tenuous in many applied settings. In particular, Instrumental Variable techniques require monotonicity and the exclusion restriction while principal score methods rest on some form of the principal ignorability assumption. This paper provides new results showing that an estimator within the class of principal score methods allows conservative inference about the LATE without invoking such assumptions. I term this estimator the Compliance Probability Weighting estimator and show that, under very mild assumptions, it provides an asymptotically conservative estimator for the LATE. I apply this estimator to a recent survey experiment and provide evidence of a stronger effect for the subset of compliers than the original authors had uncovered.
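The following is a generic principal-score-weighting sketch under one-sided noncompliance, with simulated data; the variable names, the logistic model for the compliance probability, and the weighting step are our assumptions for illustration, not necessarily the paper's exact Compliance Probability Weighting estimator.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Illustrative principal-score weighting under one-sided noncompliance:
# z = random assignment, d = treatment received, y = outcome, x1/x2 = covariates.

rng = np.random.default_rng(2)
n = 5_000
x1, x2 = rng.normal(size=n), rng.normal(size=n)
z = rng.integers(0, 2, size=n)
compliance_prob = 1 / (1 + np.exp(-(0.5 * x1 - 0.3 * x2)))
complier = rng.random(n) < compliance_prob
d = z * complier                                 # only assigned units can take treatment
y = 1.0 * d + 0.4 * x1 + rng.normal(size=n)      # true complier effect = 1.0
df = pd.DataFrame(dict(z=z, d=d, y=y, x1=x1, x2=x2))

# 1. Estimate the compliance probability from covariates among assigned-to-treatment units.
treated = df[df.z == 1]
ps_model = LogisticRegression().fit(treated[["x1", "x2"]], treated["d"])
df["p_comply"] = ps_model.predict_proba(df[["x1", "x2"]])[:, 1]

# 2. Compare treated compliers with compliance-probability-weighted controls.
mean_treated_compliers = df.loc[(df.z == 1) & (df.d == 1), "y"].mean()
controls = df[df.z == 0]
mean_weighted_controls = np.average(controls["y"], weights=controls["p_comply"])
print("weighted LATE-type estimate:", round(mean_treated_compliers - mean_weighted_controls, 3))
```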
Many philosophers think that doing philosophy cultivates valuable intellectual abilities and dispositions. Indeed this is a premise in a venerable argument for philosophy’s value. Unfortunately, empirical support for this premise has heretofore been lacking. We provide evidence that philosophical study has such effects. Using a large dataset (including records from over half a million undergraduates at hundreds of institutions across the United States), we investigate philosophy students’ performance on verbal and logical reasoning tests, as well as measures of valuable intellectual dispositions. Results indicate that students with stronger verbal abilities, and who are more curious, open-minded, and intellectually rigorous, are more likely to study philosophy. Nonetheless, after accounting for such baseline differences, philosophy majors outperform all other majors on tests of verbal and logical reasoning and on a measure of valuable habits of mind. This offers the strongest evidence to date that studying philosophy does indeed make people better thinkers.
This chapter focuses on correlation, a key metric in data science that quantifies the extent to which two quantities are linearly related. We begin by defining correlation between normalized and centered random variables. Then, we generalize the definition to all random variables and introduce the concept of covariance, which measures the average joint variation of two random variables. Next, we explain how to estimate correlation from data and analyze the correlation between the height of NBA players and different basketball stats. In addition, we study the connection between correlation and simple linear regression. We then discuss the differences between uncorrelation and independence. In order to gain better intuition about the properties of correlation, we provide a geometric interpretation of correlation, where the covariance is an inner product between random variables. Finally, we show that correlation does not imply causation, as illustrated by the spurious correlation between temperature and unemployment in Spain.
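A minimal sketch of the estimation step described above, with made-up heights and a hypothetical stat rather than the chapter's NBA data: the sample correlation is the average product of the standardized variables, matching np.corrcoef.

```python
import numpy as np

# Illustrative data only: correlation as the average product of standardized variables.
rng = np.random.default_rng(3)
height = rng.normal(200, 10, size=500)                 # hypothetical heights (cm)
rebounds = 0.15 * height + rng.normal(0, 2, size=500)  # hypothetical basketball stat

def correlation(a, b):
    a_std = (a - a.mean()) / a.std()
    b_std = (b - b.mean()) / b.std()
    return np.mean(a_std * b_std)       # inner-product view of covariance

print(round(correlation(height, rebounds), 3))
print(round(np.corrcoef(height, rebounds)[0, 1], 3))   # same value
```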
This chapter describes how to model multiple discrete quantities as discrete random variables within the same probability space and manipulate them using their joint pmf. We explain how to estimate the joint pmf from data, and use it to model precipitation in Oregon. Then, we introduce marginal distributions, which describe the individual behavior of each variable in a model, and conditional distributions, which describe the behavior of a variable when other variables are fixed. Next, we generalize the concepts of independence and conditional independence to random variables. In addition, we discuss the problem of causal inference, which seeks to identify causal relationships between variables. We then turn our attention to a fundamental challenge: It is impossible to completely characterize the dependence between all variables in a model, unless they are very few. This phenomenon, known as the curse of dimensionality, is the reason why independence assumptions are needed to make probabilistic models tractable. We conclude the chapter by describing two popular models based on such assumptions: Naive Bayes and Markov chains.
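A minimal sketch of the joint-pmf workflow described above, using made-up weather data rather than the chapter's Oregon dataset: estimate the joint pmf from counts, then derive the marginal and conditional pmfs.

```python
import numpy as np
import pandas as pd

# Illustrative data only: estimate a joint pmf and derive marginals and conditionals.
rng = np.random.default_rng(4)
n = 10_000
season = rng.choice(["winter", "summer"], size=n)
rain_prob = np.where(season == "winter", 0.6, 0.1)
rain = rng.random(n) < rain_prob

joint_pmf = pd.crosstab(season, rain, normalize="all")      # P(season, rain)
marginal_season = joint_pmf.sum(axis=1)                      # P(season)
conditional_rain = joint_pmf.div(marginal_season, axis=0)    # P(rain | season)

print(joint_pmf.round(3))
print(conditional_rain.round(3))
```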
This chapter begins by defining an averaging procedure for random variables, known as the mean. We show that the mean is linear, and also that the mean of the product of independent variables equals the product of their means. Then, we derive the mean of popular parametric distributions. Next, we caution that the mean can be severely distorted by extreme values, as illustrated by an analysis of NBA salaries. In addition, we define the mean square, which is the average squared value of a random variable, and the variance, which is the mean square deviation from the mean. We explain how to estimate the variance from data and use it to describe temperature variability at different geographic locations. Then, we define the conditional mean, a quantity that represents the average of a variable when other variables are fixed. We prove that the conditional mean is an optimal solution to the problem of regression, where the goal is to estimate a quantity of interest as a function of other variables. We end the chapter by studying how to estimate average causal effects.
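A minimal sketch of the regression-optimality claim above, with simulated data rather than the chapter's examples: predicting with the true conditional mean yields a lower mean squared error than predicting with another function of the same variable.

```python
import numpy as np

# Illustrative simulation: the conditional mean E[Y | X] minimizes mean squared
# error among functions of X; here it is compared with one mis-scaled alternative.
rng = np.random.default_rng(5)
x = rng.uniform(-2, 2, size=100_000)
y = x**2 + rng.normal(scale=0.5, size=x.size)      # E[Y | X = x] = x**2

mse_conditional_mean = np.mean((y - x**2) ** 2)
mse_alternative = np.mean((y - 1.2 * x**2) ** 2)   # a different function of x

print(round(mse_conditional_mean, 3))  # close to the noise variance (0.25)
print(round(mse_alternative, 3))       # strictly larger
```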