Every era faces a unique set of challenges and dilemmas, but ours can credibly lay claim to some of the most complex and vexing that humankind may have ever confronted. From climate change to growing inequality to a rising tide of refugees, we face an intricate mesh of overlapping and interdependent difficulties, one that is pushing the limits of our existing policy and governance capabilities (Data for Policy, 2015; Meyer et al., 2017). What we require today are not so much (or not only) new solutions, but new ways of arriving at solutions (Susha et al., 2017). We need a twenty-first century paradigm of governance and policy making.
Data, it is increasingly clear, will be central to this paradigm (Pentland, 2013; Kirkpatrick, 2012). Along with ever-increasing computer storage and analytics capabilities, massive amounts of data generated from citizens, devices, and sensors give decision makers the opportunity, when the data are used responsibly, to monitor and manage public infrastructure in real time and to predict future patterns (Engin and Treleaven, 2019; Janssen and Helbig, 2018). Data have the potential to transform every part of the policy-making life cycle: agenda setting and needs identification; the search for solutions; prototyping and implementation of solutions; enforcement; and evaluation (Janssen and Helbig, 2018). These are all critical, interlinked steps in addressing our societal challenges, and each of them needs a radical rethink.
The idea that data could be a key differentiator is, of course, not a new one. Its potential has been evident for some time now (Wang et al., 2018), especially in the business world (Henke et al., 2016), but also in the policy community, where efforts to harness the power of information have yielded positive results in areas as disparate as gender equality (Fatehkia et al., 2018), urban traffic flows (Zhao et al., 2018), and regulatory compliance (Heat Seek, n.d.; Credit Suisse, n.d.). Successful data initiatives have been deployed by governments around the world in both developing and developed countries (Verhulst and Young, 2017a). Such initiatives have led to a growing recognition that data are and should increasingly be part of any effective governance toolkit.
Despite such encouraging results, the policy world has generally lagged behind business in its use of data and data methods (Hou et al., 2011). Policy–data interactions or governance initiatives that use data have been the exception rather than the norm, isolated prototypes and trials rather than an indication of real, systemic change. There are various reasons for the generally slow uptake of data in policy making, and several factors will have to change if the situation is to improve. In particular, advocates of more data (and we include ourselves among this number) will need to overcome the following obstacles and limitations:
Despite the number of successful prototypes and small-scale initiatives, policy makers’ understanding of data’s potential and its value proposition generally remains limited (Lutes, 2015). There is also limited appreciation of the advances data science has made in the last few years. This is a major limiting factor; we cannot expect policy makers to use data if they do not recognize what data and data science can do.
The recent (and justifiable) backlash against how certain private companies handle consumer data has had something of a reverse halo effect: there is a growing lack of trust in the way data are collected, analyzed, and used, and this often leads to a certain reluctance (or simply risk-aversion) on the part of officials and others (Engin, 2018).
Despite several high-profile open data projects around the world, much (probably the majority) of data that could be helpful in governance remains either privately held or otherwise hidden in silos (Verhulst and Young, 2017b). There remains a shortage not only of data but, more specifically, of high-quality and relevant data.
With few exceptions, the technical capacities of officials remain limited, and this has obvious negative ramifications for the potential use of data in governance (Giest, 2017).
It’s not just a question of limited technical capacities. There is often a vast conceptual and values gap between the policy and technical communities (Thompson et al., 2015; Uzochukwu et al., 2016); sometimes it seems as if they speak different languages. Compounding this difference in world views is the fact that the two communities rarely interact.
Yet, data about the use and evidence of the impact of data remain sparse. The impetus to use more data in policy making is stymied by limited scholarship and a weak evidential basis to show that data can be helpful and how. Without such evidence, data advocates are limited in their ability to make the case for more data initiatives in governance.
Data are not only changing the way policy is developed; they have also reopened the debate around theory- versus data-driven methods in generating scientific knowledge (Lee, 1973; Kitchin, 2014; Chivers, 2018; Dreyfuss, 2017), and thus directly call into question the evidence base for the use and implementation of data within policy making. A number of associated challenges are being discussed, such as: (i) traceability and reproducibility of research outcomes (due to “black box processing”); (ii) the use of correlation instead of causation as the basis of analysis, together with biases and uncertainties present in large historical datasets that cause replication and, in some cases, amplification of human cognitive biases and imperfections; and (iii) the incorporation of existing human knowledge and domain expertise into the scientific knowledge generation processes, among many other topics (Castelvecchi, 2016; Miller and Goodchild, 2015; Obermeyer and Emanuel, 2016; Provost and Fawcett, 2013).
Finally, we believe that there should be a sound underpinning in the form of a new theory of what we call Policy–Data Interactions. To date, in reaction to the proliferation of data in the commercial world, theories of data management, privacy, and fairness have emerged. From the Human–Computer Interaction world, a manifesto of principles of Human–Data Interaction (Mortier et al., 2014) has found traction; it aims to reduce the asymmetry of power present in current design considerations of systems of data about people. However, we need a consistent, symmetric approach to the consideration of systems of policy and data and of how they interact with one another.
All these challenges are real, and they are sticky. We are under no illusions that they will be overcome easily or quickly.
They were the impetus behind the formation of the international Data for Policy conferences (dataforpolicy.org), launched in 2015. We were interested in initiating an interdisciplinary and cross-sector debate to bridge the gap between large-scale data-processing technologies and existing expert knowledge in major policy domains, so as to make policy development processes more citizen-focused, taking into account public needs and preferences supported with actual experiences of public services. Since then, we have engaged in several parallel debates on the ethical and privacy concerns associated with such developments and the usability of technologies addressing the needs of diverse stakeholders.
During the past four conferences, we have hosted an incredibly diverse range of dialogues and examinations by key global thought leaders, opinion leaders, practitioners, and the scientific community (Data for Policy, 2015, 2016, 2017, 2019). What became increasingly obvious was the need for a dedicated venue to deepen and sustain the conversations and deliberations beyond the limitations of an annual conference. This leads us to today and the launch of Data & Policy, which aims to confront and mitigate the barriers to greater use of data in policy making and governance.
Data & Policy is a venue for peer-reviewed research and discussion about the potential for and impact of data science on policy. Our aim is to provide a nuanced and multistranded assessment of the potential and challenges involved in using data for policy and to bridge the “two cultures” of science and humanism, as CP Snow famously described them in his lecture on “Two Cultures and the Scientific Revolution” (Snow, 1959). By doing so, we also seek to bridge two other dichotomies that limit an examination of datafication and its interaction with policy from various angles: the divide between practice and scholarship, and the divide between private and public.
Importantly, our intention is not simply to advocate for greater—and blind—use of data; while we recognize the very real possibilities, we also know that there are risks, and we believe that the ultimate goal is not simply data for the sake of data, but to arrive at a better understanding of how data can be used in an efficient and responsible manner to confront the challenges of our era. Therefore, while our pages will no doubt contain a fair number of authors who advocate the use of data in governance, readers can also expect more nuanced and even skeptical perspectives.
We also see the potential with Data & Policy to extend beyond the idea of a conventional academic journal. The movements towards more open, transparent, and collaborative research—including the sharing of materials not typically published in academic journals—are highly relevant to our project of linking technical, policy, and other expertise and for building trust. Articles published in Data & Policy will be open access: freely available under licensing that allows unimpeded reading, sharing, and reuse, helping us to reach readers and potential authors in academic institutions, government agencies, international, nonprofit, and commercial organizations, and the general public. Beyond this we will encourage the open availability of data, code, and other materials to promote transparency and reuse, albeit recognizing that there are circumstances where this is not possible or responsible. Authors submitting to Data & Policy will be asked to provide a data availability statement that either links to the data underlying the results and other relevant materials or that explains the reasons why these cannot be shared. We are conscious of the need to support authors in this process so we provide information about the different resources that can be used. We encourage authors to think beyond traditional outputs to also share proposals, posters, presentations, and policy-related problems that require investigation.
To try to address the terminology and conceptual gaps that exist between different communities, we will also seek to innovate with the formats published in Data & Policy and the features within them. Articles will be published with a short but prominent policy significance statement to summarize their relevance to policy makers in language that is understandable to the wider public. We are actively soliciting ideas from the Data for Policy and the Data & Policy audiences about the types of content that can help us bridge the communities we are appealing to.
It is essential to say one more thing. Data & Policy is about policy making; it is not about politics. Throughout our enquiries, we will strive to remain ideologically neutral and avoid the political schisms that define so much of public life and discourse nowadays. This does not mean that we are unaware of the social and political contexts within which our papers are written (and will be received), but it does mean that our aspiration is to remain pragmatic and results-oriented. We seek to discover what works and how to replicate successful data initiatives at a larger scale or in different geographies.
So these are our principles: scholarly, pragmatic, open-minded, interdisciplinary, focused on actionable intelligence, and, most of all, innovative in how we share insight and push at the boundaries of what we already know and what already exists. We are excited to launch Data & Policy with the support of Cambridge University Press and University College London, and we’re looking for partners to help us build it as a resource for the community. If you’re reading this manifesto, it means you have at least a passing interest in the subject; we hope you will be part of the conversation.