Book contents
- Frontmatter
- Dedication
- Contents
- Contributors
- Editors’ Introduction
- Part I Conceptual Framework
- 1 Monitoring, Datafication, and Consent: Legal Approaches to Privacy in the Big Data Context
- 2 Big Data’s End Run around Anonymity and Consent
- 3 The Economics and Behavioral Economics of Privacy
- 4 Changing the Rules: General Principles for Data Use and Analysis
- 5 Enabling Reproducibility in Big Data Research: Balancing Confidentiality and Scientific Transparency
- Part II Practical Framework
- Part III Statistical Framework
- References
5 - Enabling Reproducibility in Big Data Research: Balancing Confidentiality and Scientific Transparency
Published online by Cambridge University Press: 05 July 2014
Summary
Introduction
The 21st century will be known as the century of data. Our society is making massive investments in data collection and storage, from sensors mounted on satellites down to detailed records of our most mundane supermarket purchases. Just as importantly, our reasoning about these data is recorded in software, in the scripts and code that analyze this digitally recorded world. The result is a deep digitization of scientific discovery and knowledge, and with the parallel development of the Internet as a pervasive digital communication mechanism we have powerful new ways of accessing and sharing this knowledge. The term data even has a new meaning. Gone are the days when scientific experiments were carefully planned prior to data collection. Now the abundance of readily available data creates an observational world that itself suggests hypotheses and experiments to be carried out after the data have already been collected, curated, and stored. We have departed from our old paradigm of collecting data to resolve research questions – nowadays, we collect data simply because we can.
In this chapter I outline what this digitization means for the independent verification of scientific findings derived from these data, and how the current legal and regulatory structure both helps and hinders the creation and communication of reliable scientific knowledge. Federal mandates and laws regarding data disclosure, privacy, confidentiality, and ownership all influence the ability of researchers to produce openly available and reproducible research. Two guiding principles are suggested to accelerate research in the era of big data and bring the regulatory infrastructure in line with scientific norms: the Principle of Scientific Licensing and the Principle of Scientific Data and Code Sharing. These principles are then applied to show how intellectual property and privacy tort laws could better enable the generation of verifiable knowledge, facilitate research collaboration with industry and other proprietary interests through standardized research dissemination agreements, and give rise to dual licensing structures that distinguish between software patenting and licensing for industry use on the one hand and open availability for research on the other.
- Type: Chapter
- Book: Privacy, Big Data, and the Public Good: Frameworks for Engagement, pp. 112–132
- Publisher: Cambridge University Press
- Print publication year: 2014