The goal of this chapter is to present complete examples of the design and implementation of machine learning methods in large-scale data analytics. In particular, we choose three distinct topics: semi-supervised learning, ensemble learning, and the deployment of deep learning models at scale. Each topic is introduced by motivating why parallelization is needed to deal with big data, identifying the main bottlenecks, designing and coding Spark-based solutions, and discussing further work required to improve the code. In semi-supervised learning, we focus on the simplest self-labeling approach, self-training, and a global solution for it. Likewise, in ensemble learning, we design a global approach for bagging and boosting. Lastly, we show an example with deep learning: rather than parallelizing the training of a model, which is typically easier on GPUs, we deploy the inference step for a case study in semantic image segmentation.
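To fix intuition for the self-training idea before the full chapter treatment, the following PySpark fragment is a minimal sketch: it repeatedly trains a model on the labeled set and promotes high-confidence predictions on the unlabeled set. The column names, the choice of classifier, the file paths, and the 0.95 threshold are illustrative assumptions, not the chapter's actual code.

```python
# Minimal self-training sketch in PySpark. Column names ('features', 'label'),
# the classifier, the input paths, and the threshold are assumptions.
from pyspark.sql import SparkSession, functions as F
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.functions import vector_to_array

spark = SparkSession.builder.appName("self-training-sketch").getOrCreate()
labeled = spark.read.parquet("labeled.parquet")      # 'features' and 'label'
unlabeled = spark.read.parquet("unlabeled.parquet")  # 'features' only

THRESHOLD = 0.95          # confidence required to self-label an example
for _ in range(5):        # a fixed number of self-labeling rounds
    model = LogisticRegression().fit(labeled)
    preds = model.transform(unlabeled)
    # Keep only predictions whose top class probability clears the threshold.
    confidence = F.array_max(vector_to_array("probability"))
    confident = (preds.filter(confidence >= THRESHOLD)
                      .select("features", F.col("prediction").alias("label")))
    if confident.count() == 0:
        break
    labeled = labeled.unionByName(confident)
    # A complete implementation would also remove the newly labeled rows
    # from 'unlabeled' before the next round.
```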
Hadoop is an open-source framework, written in Java, for big data processing and storage that is based on the MapReduce programming model. This chapter starts off with a brief introduction to Hadoop and how it has evolved into a solid base platform for most big data frameworks. We show how to implement the classical Word Count using Hadoop MapReduce, highlighting the difficulties in doing so. After that, we provide essential information about how Hadoop's resource negotiator, YARN, and its distributed file system, HDFS, work. We describe step by step how a MapReduce process is executed on YARN, introducing the concepts of resource and node managers, the application master, and containers, as well as the different execution modes (standalone, pseudo-distributed, and fully distributed). Likewise, we cover HDFS, including the basic design of this filesystem and what it implies in terms of functionality and efficiency. We also discuss recent advances such as erasure coding, HDFS federation, and high availability. Finally, we discuss the main limitations of Hadoop and how they have sparked the rise of many new big data frameworks, which now coexist within the Hadoop ecosystem.
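The canonical Word Count is written in Java against the MapReduce API; as a language-neutral illustration of the same map and reduce roles, here is a hedged sketch using Hadoop Streaming with two small Python scripts (the file names and submission command are assumptions that vary by distribution):

```python
#!/usr/bin/env python3
# mapper.py -- the map role: emit one (word, 1) pair per word on stdin.
import sys

for line in sys.stdin:
    for word in line.split():
        print(f"{word}\t1")
```

```python
#!/usr/bin/env python3
# reducer.py -- the reduce role: sum counts per word. Hadoop Streaming
# sorts by key, so all lines for a given word arrive consecutively.
import sys

current_word, current_count = None, 0
for line in sys.stdin:
    word, count = line.rstrip("\n").split("\t", 1)
    if word == current_word:
        current_count += int(count)
    else:
        if current_word is not None:
            print(f"{current_word}\t{current_count}")
        current_word, current_count = word, int(count)
if current_word is not None:
    print(f"{current_word}\t{current_count}")
```

A typical submission looks like `hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar -files mapper.py,reducer.py -mapper mapper.py -reducer reducer.py -input in -output out` (the jar path varies by distribution); even this toy example hints at the boilerplate the chapter highlights.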
This chapter puts forward new guidelines for designing and implementing distributed machine learning algorithms for big data. First, we present two different alternatives, which we call local and global approaches. To show how these two strategies work, we focus on the classical decision tree algorithm, reviewing how it works and the details that need modification to deal with large datasets. We implement a local-based solution for decision trees, comparing its behavior and efficiency against a sequential model and the MLlib version. We also discuss the nitty-gritty of the MLlib implementation of decision trees as a great example of a global solution. That allows us to formally define these two concepts, discussing their key (expected) advantages and disadvantages. The second part is devoted to measuring the scalability of a big data solution. We present three classical metrics, speed-up, size-up, and scale-up, that help determine whether a distributed solution is scalable. Using these, we test our local-based approach and compare it against its global counterpart. This experiment allows us to give some tips for calculating these metrics correctly on a Spark cluster.
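As a back-of-the-envelope companion to the metrics discussion, the sketch below applies the standard textbook definitions of the three metrics; the chapter's exact formulations may differ, and the timings are hypothetical.

```python
# Hedged sketch of the three classical scalability metrics.

def speed_up(t_one_node, t_m_nodes):
    # Speed-up: how much faster m nodes solve the SAME problem.
    # Ideal value is m (linear speed-up).
    return t_one_node / t_m_nodes

def size_up(t_base_data, t_k_times_data):
    # Size-up: how running time grows when the data grows k-fold
    # on a FIXED cluster. Ideal value is k (linear size-up).
    return t_k_times_data / t_base_data

def scale_up(t_base, t_scaled):
    # Scale-up: data and nodes grow by the same factor k.
    # Ideal value is 1 (running time stays constant).
    return t_base / t_scaled

# Hypothetical timings: 1 node on 1x data takes 600 s; 4 nodes on 1x data
# take 180 s; 4 nodes on 4x data take 700 s.
print(speed_up(600, 180))   # ~3.33, below the ideal of 4
print(scale_up(600, 700))   # ~0.86, below the ideal of 1
```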
This chapter deals with the topic of inequality in the healthcare sector. The chapter begins by looking at some summary data and the inequality in various measures of health that is obvious from simple differences. Then the chapter discusses some basic social philosophy on how to judge fairness in an abstract society. More detailed data on health outcomes is then explored, highlighting inequality along different demographic lines. There follows a discussion of theories that fit the facts presented: why these inequalities exist, and what we learn by unpacking them, in terms of both social and clinical implications. Finally, inequalities in the labor market for healthcare workers are discussed.
This chapter covers the medical malpractice system: how it works, what its goals are, and how it influences provider behavior. The chapter begins by defining key terminology in tort law and explaining the process by which a medical malpractice case is brought and resolved, as well as the goals that this system is trying to achieve. Then discussion turns to how this system creates incentives for actions that run counter to its goals, the problems that are likely to arise, and some empirical evidence of the existence of said problems.
This chapter introduces the basics of the economic approach to understanding decision-making, using examples drawn from consumer decision-making in the context of healthcare. Topics include how to think about preferences, different types of costs, optimization, and the importance of perceptions. The end-of-chapter supplement discusses how to use price indexes.
Chapter 14 provides a brief overview of major health insurance provided by the government in the United States. The majority of the chapter goes through the major programmatic details of the Medicare and Medicaid programs: who is covered, what services are covered and how generously, and how providers are paid by the programs. The discussion of Medicaid also explains the nature of the federal-state partnership and explores the ways in which states use their flexibility to expand upon the program. The remainder of the chapter briefly describes other public insurance (or insurance-related) programs: the VA and CHAMPVA, TRICARE, CHIP, COBRA, and the IHS.
This chapter introduces Spark, a data processing engine that mitigates some of the limitations of Hadoop MapReduce to perform data analytics efficiently. We begin with the motivation for Spark, introducing RDDs as an in-memory distributed data structure that allows for faster processing while maintaining the attractive properties of Hadoop, such as fault tolerance. We then cover, hands-on, how to create and operate on RDDs, distinguishing between transformations and actions. Furthermore, we discuss how to work with key–value RDDs (which more closely resemble the MapReduce model), how to use caching to perform iterative queries/operations, and how RDD lineage works to ensure fault tolerance. We provide a wide range of examples with transformations such as map vs. flatMap and groupByKey vs. reduceByKey, discussing their behavior, their adequacy (depending on what we want to achieve), and their performance. More advanced concepts, such as shared variables (broadcast variables and accumulators) and working with partitions, are presented towards the end. Finally, we talk about the anatomy of a Spark application, as well as the different types of dependencies (narrow vs. wide) and the limitations they impose on optimizing processing.
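For readers who want a taste before diving in, the fragment below sketches a few of the RDD operations mentioned above in PySpark (illustrative data, not the chapter's exact examples):

```python
# Minimal sketch of map vs. flatMap and groupByKey vs. reduceByKey.
from pyspark import SparkContext

sc = SparkContext(appName="rdd-basics-sketch")
lines = sc.parallelize(["to be or", "not to be"])

# map produces exactly one output per input (here, a list per line);
# flatMap flattens those lists into a single RDD of words.
print(lines.map(lambda l: l.split()).collect())  # [['to','be','or'], ['not','to','be']]
words = lines.flatMap(lambda l: l.split())

pairs = words.map(lambda w: (w, 1))              # a key-value RDD

# groupByKey ships every value across the network before counting...
grouped = pairs.groupByKey().mapValues(lambda vs: sum(vs))
# ...while reduceByKey combines values locally on each partition first
# (a map-side combine), which usually makes it the better choice here.
reduced = pairs.reduceByKey(lambda a, b: a + b)
print(sorted(reduced.collect()))                 # [('be', 2), ('not', 1), ('or', 1), ('to', 2)]
```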
Chapter 5 moves from theory into evidence and discusses how empirical economists think about causality. First, the chapter covers common issues that make it difficult to have confidence in causal claims based on associational evidence alone. Then, experimental evidence is discussed: how to run an experiment, common pitfalls that can undermine confidence in experimental evidence, and what can be done to avoid them. Next, major experimental studies on the impact of health insurance are described. Finally, the chapter discusses the concept of quasi-experimental evidence and how it fits into economics. The end-of-chapter supplement discusses ethics in research with human subjects and the role of institutional review boards.
Chapter 3 develops the fundamentals of demand analysis through the lens of demand for medical care. The chapter discusses how to read and use demand curves and why demand for health is not commonly used (there is no direct market for health). Then various demand elasticities are covered: price elasticity of demand, income elasticity of demand, and cross-price elasticity of demand. Each elasticity is developed along with examples specific to the market for medical care and health decision-making. The chapter also develops important tools for using demand curves: how to think through demand shifters, Engel curves, how to calculate consumer surplus, and how to aggregate from individual to market demand. The end-of-chapter supplement walks through how to calculate elasticities.
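As a small taste of the supplement's mechanics, the midpoint (arc) formula for the price elasticity of demand can be computed in a few lines; the numbers below are hypothetical.

```python
# Hedged sketch of the midpoint (arc) elasticity calculation.

def arc_elasticity(q1, q2, p1, p2):
    # Percentage change in quantity divided by percentage change in price,
    # each computed relative to the midpoint of the two observations.
    pct_q = (q2 - q1) / ((q1 + q2) / 2)
    pct_p = (p2 - p1) / ((p1 + p2) / 2)
    return pct_q / pct_p

# A price increase from $10 to $12 lowers visits from 100 to 95:
print(arc_elasticity(100, 95, 10, 12))  # ~ -0.28, i.e., inelastic demand
```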