Recent years have been times of turmoil for psychological science. Depending on whom you ask, the field underwent a “replication crisis” (Shrout and Rodgers 2018) or a “credibility revolution” (Vazire 2018) that might even climax in “psychology’s renaissance” (Nelson, Simmons, and Simonsohn 2018). This article asks what social scientists can learn from this story. Our take-home message is that although differences in research practices make it difficult to prescribe cures across disciplines, much still can be learned from interdisciplinary exchange. We provide nine lessons but first summarize psychology’s experience and what sets it apart from neighboring disciplines.
As a sociologist and a psychologist, we are outsiders to political science. What unites us is an interest in meta-scientific questions that has made us wonder how disciplines beyond psychology can benefit from increased transparency. Whereas we aim to address social scientists in general, our perspective is that of quantitative research. We focus on the practices of open data, open materials, and preregistration. These often are thought of as means to improve the credibility of research—for example, through increasing reproducibility (i.e., ensuring that a reanalysis of the same data results in the same conclusions) and/or replicability (i.e., ensuring that an empirical replication of a study leads to the same conclusions). Of course, open science also encompasses other practices such as open access publication and open educational resources, with a broad range of underlying goals, including increased accessibility and reduced inequalities.
THE VIEW FROM PSYCHOLOGY
Psychology’s current reform movement began with the insight that certain research practices were both problematic (Simmons, Nelson, and Simonsohn 2011) and widespread (John, Loewenstein, and Prelec 2012). Low power, misuse of significance testing, researcher degrees of freedom, and post hoc hypothesizing had created a cycle in which flashy but spurious results spread with little attempt at falsification. This was exposed through a series of high-profile replication failures (e.g., Open Science Collaboration 2015) that made the problems visible and created momentum but also caused backlash (Baumeister 2016; Gilbert et al. 2016).
The next phase was marked by attempts to solve the underlying issues through increased transparency. Journals such as the Association for Psychological Science’s flagship Psychological Science adopted “badges” for contributions that adhered to open standards (Lindsay 2017). By late 2018, more than 22,000 preregistrations had been filed on the Open Science Framework. More than a dozen job advertisements have asked applicants to add an open science statement to demonstrate how they have contributed to replicable, reproducible, and transparent research (see https://osf.io/7jbnt). Ostensibly, openness has become mainstream.
However, empirical follow-ups often have been sobering. Even with open data and open materials, analyses may be reproduced only with considerable effort or help, if at all (e.g., Hardwicke et al. 2018). Preregistrations often are too vague to keep researcher degrees of freedom at bay (Veldkamp et al. 2018), and undisclosed deviations from the preregistered plan seem to be common (Claesen et al. 2019).
We now seem to have entered a phase in which the movement’s initial success has invited a broader range of proposals not always linked to openness as such, including calls for better measurement (Flake and Fried 2019), theoretical rigor (Muthukrishna and Henrich 2019), stricter significance thresholds (Benjamin et al. 2018), and multi-model analysis (Orben and Przybylski 2019). There also is growing interest in causal inference (Rohrer 2018) and transparency in analyzing preexisting data (Weston et al. 2019)—issues long known to the political science community.
IS PSYCHOLOGY’S EXPERIENCE GENERALIZABLE?
It may be tempting to apply some of the tools and insights from psychology to other social sciences. However, recent developments in the field have been shaped by its particularities. For example, the subfields that were hit hardest by the replication crisis stand out for their emphasis on counterintuitive results carefully teased out in small-scale experiments. Hence, the prior probability of a tested hypothesis might be low and statistical evidence may be weak, but empirical replication studies are comparatively inexpensive—which, all else being equal, makes it easier to discover the problem.
Other social sciences place less emphasis on novelty and more on cumulative refinement of observational estimates with large-scale, representative data. Here, hypotheses may be more plausibly true to begin with and statistical evidence may be stronger, but replication on new data can be difficult or impossible. This does not mean that these other fields are infallible but rather that problems and solutions may differ. The statistical flukes or variance false-positives that psychology has grappled with might be overshadowed by bias false-positives from flawed sampling, measurement, or design, which can be quite replicable if follow-up studies suffer from the same flaws.
Where does political science stand in all of this? Increasingly, it is a discipline that takes pride in causal inference (Clark and Golder 2015). Ironically, by moving closer to an experimental ideal, statistical flukes—that is, variance false-positives—become a greater concern (Young 2019). Moreover, whereas taste for novelty is arguably less of an issue than in psychology, political desirability can have similar influence (Zigerell 2017). Furthermore, certain problems that have been identified in psychology also have been pointed out in political science, including low computational reproducibility (Stockemer, Koehler, and Lentz 2018; cf. Jacoby, Lafferty-Hess, and Christian 2017) and sanitized research narratives that do not capture the actual complexity of the process (Yom 2018).
Hence, there are both commonalities and differences in the problems that affect different social sciences. With their focus on increased transparency, open science practices might be able to attenuate some of them. How these practices can be implemented, however, will depend on the methods and approaches used by researchers—which vary between and within different social sciences. Indeed, political science covers a wide range of methods and approaches. Thus, the lessons we suggest are broader points on a metalevel rather than specific prescriptions.
LESSONS FOR IMPROVING SOCIAL SCIENCE
We draw the following lessons from psychology’s experience.
One Size Does Not Fit All
Reform attempts in psychology have had an impact precisely because they struck at some of the field’s central shortcomings. Our first lesson, therefore, is that attempts to improve the empirical status of a discipline must be localized to that discipline. This work could begin by asking a set of basic questions: Which criteria are used to judge scientific progress, and how are scientific claims evaluated (e.g., Elman, Kapiszewski, and Lupia 2018)? Which problems are the biggest threat to inference? What are current norms and what keeps researchers from abandoning those that are counterproductive? Once these issues have been settled and proposals are being evaluated, we must consider costs and benefits, division of labor, incentive design, and so on.
Harness Tacit Knowledge
Where to begin, then? To some extent, the prevalence of specific (mal-)practices can be surveyed empirically (John, Loewenstein, and Prelec 2012) and their impact can be gauged formally (Smaldino and McElreath 2016), as can the potential effects of proposed solutions (Smaldino, Turner, and Kallens 2019). However, the first step should be an open and critical dialogue among researchers in the field. In our experience, knowledge of bad practices can be widespread without leading to action. It is, for example, telling that experiments with prediction markets have found that scholars seem quite capable of identifying the replications least likely to succeed (Dreber et al. 2015). Such tacit knowledge, once made explicit, is an important resource for improving science.
Assess the Benefits of Open Science…
Why would we want transparency in the first place? For some, the ability to reproduce an analysis is the only way to fully understand and evaluate it (King 1995, 444). However, the benefits of transparency extend beyond critical evaluation. Sharing of data and other materials reduces duplicate work and increases the yield from a given dataset, enables pooling of evidence, imposes greater self-scrutiny, and allows others to adapt and build on existing efforts. These benefits serve credibility as well as other goals including efficiency and equality. Especially for early-career researchers, the entry barrier will be lowered as they become less dependent on access to prominent mentors and run a lower risk of wasting time on a topic known to be “doomed” by insiders.
…As Well as the Costs, and Ways to Reduce Them
The costs of open science are real. Considering the social costs, much of the recent backlash has been driven by targets of scrutiny who felt unfairly treated. This is an issue of culture, as Janz and Freese (2020) discuss in this symposium. Considering the practical costs, transparency requires work. An obligation to share materials can shift incentives away from original data collection or lead informants to withhold sensitive information (Connors, Krupnikov, and Ryan 2019). Some of these drawbacks have technical solutions. To preserve confidentiality, there has been experimentation with “synthetic data” that preserve joint distributions without exposing individuals (Nowok, Raab, and Dibben 2016). As for the workload of preregistrations, standard operating procedures can shorten the process (Lin and Green 2016). Moreover, push-button replications (Khatua 2018) can be used to verify analysis pipelines and markup-based tools (Hardwicke 2018) to detect undisclosed deviations between preregistration and manuscripts.
Beware of Tokenism
Psychology’s mixed experience with badges highlights how—when changed rules and norms bring new incentives—there is always a temptation to cut corners. As with any target, open practices risk turning into another metric that researchers game for their own gain. To some extent, clearer standards—more transparency about transparency—might clarify what is expected. Does open data merely indicate that some data have been made available, or should it also be the right data to reproduce the numbers? Does preregistered mean “there is a document that you can compare to the published article” or that the analyses reported were conducted as prespecified unless declared otherwise? These are only partial solutions, and we also must consider the division of labor and incentive structure.
Mind the Division of Labor
A crucial question is: Who should check whether materials allow for reproducing findings? Right now, the answer seems to be “anybody who feels like it.” Occasionally, researchers are called out for doing openness wrong—for example, claiming that a study was preregistered despite substantial deviations in the final publication. This is far from a fair solution in which the same checks are consistently applied to everyone. However, such a fair solution seems to be necessary if open practices are to become established. There are various ways to assign that burden—ranging from editorial boards and reviewers, to universities and institutes, to students as part of the curriculum (King 2006). A real commitment to openness may require a new professional role dedicated to verification. This does not seem outlandish given the growing cadre of administrators tasked with facilitating research.
Reward Public-Good Contributions
That a finding is reproducible using the same data and analysis is, admittedly, a low bar. Other forms of replication involve applying different methods to the same data or the same method to different data (Freese and Peterson 2017). Authors have few incentives to support this type of generative work because there is no good system for adequately crediting materials that help others. Someone who spent hundreds of hours gathering, cleaning, and analyzing a dataset will be reluctant to share the fruits of that labor without reward. Fair recognition of public-good contributions might counteract some of the shortcomings of gameable “checkbox” policies, such as badges and mandatory code sharing. For example, hiring committees could explicitly consider “secondary research value,” such as new insights generated on the basis of data openly shared by applicants, regardless of whether they coauthored the respective manuscript.
Make Open Science Inclusive
In our view, one of the main benefits of open science is its inclusionary aspect. By widening access to information and lowering entry barriers, it promises to be both more democratic and more efficient than the status quo. However, open science also can create barriers. Power struggles are inherent in institutional change and, in science, traceable at least to the dawn of the experiment (Shapin and Schaffer 1985). From the inside, the open science movement looks generous and inspiring, but it can appear different to those who feel left out. For example, higher standards for computational reproducibility require skills that not everyone has had an opportunity to acquire. Creating accessible resources therefore should be a central part of promoting open science. There also is a risk that open science is perceived as a cliquish movement pushed by zealots—one to be actively resisted. As in any group effort, cohesion must be balanced with inclusiveness.
Open Science Is Just Science
If open science has any unifying core, it is the shared understanding that increased transparency and accessibility can improve the quality of research and keep scientists’ biases in check. We noticed that—more often than not—the desire for such improvement stems from a wish to answer meaningful research questions with real-world implications rather than an interest in transparency as an end in itself. Seen this way, the recent push toward openness is neither a fad nor an innovation but simply a recognition of our shared interest as a scientific community. This leads to an uplifting conclusion: the aims of open science are largely those of the scientific method itself—that is, open science is really just science.
Per Engzell acknowledges funding from the Swedish Research Council for Health, Working Life, and Welfare (FORTE), Grant No. 2016-07099, and support from Nuffield College and the Leverhulme Centre for Demographic Science, the Leverhulme Trust. Both authors contributed equally and are listed in alphabetical order.