This chapter focuses on the challenges in setting clear and proper objectives for a data science application, a necessity when the goals are to predict, optimize, or recommend. The main challenges include the clarity of the objectives, the balance of benefits across affected parties, fairness (a specific topic within the topic of balance), and the impact of the objectives on an individual: manipulation, filter bubbles (though these affect society as well), privacy, and being human.
This chapter discusses the sensitivity of data science applications to failures, which may occur because data science problems often have no unambiguously correct answers and because solutions are often only probabilistically correct. This chapter also discusses how to characterize uncertainty, how to minimize risks while balancing them against rewards, and how to assess liability for any residual harms that may occur despite the best efforts to minimize them.
This chapter recommends the careful application of the Analysis Rubric to increase the likelihood that the myriad aforementioned challenges are considered and met. It also makes several simple proposals to encourage the application of ethical principles.
This chapter explores data science’s broad legal issues and some previously undiscussed societal (primarily economic) implications of data science. Importantly, this chapter continues the ethics thread begun in Chapters 3 and 7 with a pragmatic discussion of the challenges of internalizing ethical considerations in organizations that apply data science.
This chapter explains how data science has been able to achieve its theoretical, methodological, and practical results by combining the approaches of different disciplines to create a new field. It also surveys the breadth of data science’s likely impact and explains that its continued success is due not only to its own core advances, but to coalitions with many other disciplines.
This chapter pivots towards taking the view of a team building new data science applications. Their work begins when someone creates a concept for a worthwhile and plausibly achievable technique, product, or service. Goals may range from scientific pursuit to commercial gain. They may be motivated by the need to solve an existing problem or by a novel way of extracting information out of an existing data source. The chapter’s material is presented by example: the application of the Analysis Rubric to 26 different uses of data science.
This first chapter of Part III begins a series of detailed discussions on each of the challenges implied by the rubric elements, beginning with generating, collecting, processing, storing, and managing data.
This chapter illustrates how the Belmont principles, based on respect for persons, beneficence, and justice, can be applied in the context of data science. This principlist approach to ethics attempts to provide a shared analytic framework and vocabulary to help communities and teams resolve difficult questions. Principles are most useful when broad enough to be comprehensive and to capture, rather than ignore, the tensions that make questions of “right” and “wrong” so difficult.
The quantity of data that is collected, processed, and employed has exploded in this millennium. Many organizations now collect more data in a month than the total stored in the Library of Congress. With the goal of gaining insight and drawing conclusions from this vast sea of information, data science has fueled many of the vast benefits brought by the Internet and provided the business models that pay many of its costs.
This chapter introduces ethical principles that help us better use data science to achieve beneficial societal goals. As with all developing technologies, data science can give rise to unanticipated negative consequences, and it may affect our professional, personal, and political realities. These challenge our norms for how we use technology in ways consistent with our values. Many scholars, educators, and technology companies refer to these as ethical challenges, building on the applied ethics tradition from basic sciences.
This chapter presents examples of what data science can do. For the technology-, healthcare-, and science-related examples, the authors define the problem and then sketch how to collect data, build a model, and use it to solve the problem. They start with spelling correction, followed by speech recognition. Other examples include recommendation systems and protein folding. Also discussed is the promise of using large quantities of individualized health data to learn about and improve human health. Finally, the authors provide a cautionary example by discussing mortality predictions during the COVID-19 pandemic. The examples illustrate the diversity of considerations data scientists must address as they solve new problems.
This chapter uses the lessons from Chapter 4’s examples to create the Analysis Rubric, which consists of seven major considerations for determining data science’s applicability to a proposed application. While these considerations may not be fully understood at a project’s inception, there needs to be a belief that answers will be forthcoming prior to completion. Three of the considerations address requirements-oriented aspects (“For What or Why”) of data science applications, and three address implementation-oriented aspects (“How To”). The seventh addresses legal, societal, and ethical implications (ELSI). Collectively, these considerations, or Analysis Rubric elements, cover the complex trade-offs needed to achieve practical, valuable, legal, and ethical results. The chapter concludes by applying the Analysis Rubric to the previous chapter’s examples.
This chapter summarizes the motivation of this book and its central messages. It also includes individual essays from each author presenting their point of view on the core issues and challenges.
This chapter’s first three sections drill down into the Analysis Rubric’s “understandability” element, focusing on the challenges that occur when applications must explain how a conclusion was reached, show causal relationships, or provide results that can be reproduced by others. Given the immense influence of data science findings, the chapter concludes with a discussion on the great care that is needed to communicate data science findings without being misleading.