This chapter introduces probability. We begin with an informal definition which enables us to build intuition about the properties of probability. Then, we present a more rigorous definition, based on the mathematical framework of probability spaces. Next, we describe conditional probability, a concept that makes it possible to update probabilities when additional information is revealed. In our first encounter with statistics, we explain how to estimate probabilities and conditional probabilities from data, as illustrated by an analysis of votes in the United States Congress. Building upon the concept of conditional probability, we define independence and conditional independence, which are critical concepts in probabilistic modeling. The chapter ends with a surprising twist: In practice, probabilities are often impossible to compute analytically! Fortunately, the Monte Carlo method provides a pragmatic solution to this challenge, allowing us to approximate probabilities very accurately using computer simulations. We apply this method to the 3 × 3 basketball tournament from the 2020 Tokyo Olympics.
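The Monte Carlo idea the chapter closes with fits in a few lines. The sketch below is not the chapter's own code (the dice event is a hypothetical stand-in); it approximates a probability as the fraction of simulated trials in which the event occurs:

```python
import random

def monte_carlo_probability(event, n_trials=100_000):
    """Approximate P(event) as the fraction of simulated trials where it occurs."""
    hits = sum(event() for _ in range(n_trials))
    return hits / n_trials

# Hypothetical example: probability that two fair dice sum to at least 10.
# Exact answer: 6/36 = 1/6 ≈ 0.1667, so the estimate should land nearby.
def dice_sum_at_least_10():
    return random.randint(1, 6) + random.randint(1, 6) >= 10

print(monte_carlo_probability(dice_sum_at_least_10))  # ≈ 0.167
```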
This chapter provides a comprehensive overview of the foundational concepts essential for scalable Bayesian learning and Monte Carlo methods. It introduces Monte Carlo integration and its relevance to Bayesian statistics, focusing on techniques such as importance sampling and control variates. The chapter outlines key applications, including logistic regression, Bayesian matrix factorization, and Bayesian neural networks, which serve as illustrative examples throughout the book. It also offers a primer on Markov chains and stochastic differential equations, which are critical for understanding the advanced methods discussed in later chapters. Additionally, the chapter introduces kernel methods in preparation for their application in scalable Markov Chain Monte Carlo (MCMC) diagnostics.
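As a hedged illustration of importance sampling, one of the techniques this chapter introduces (all parameters below are illustrative, not from the book), the sketch estimates a rare-event probability under a standard normal by sampling from a shifted proposal and reweighting by the density ratio:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Estimate P(X > 4) for X ~ N(0, 1). Plain Monte Carlo almost never
# samples the tail; a proposal centered near the tail does, and the
# importance weights p(x)/q(x) correct for the mismatch.
n = 100_000
proposal_mean = 4.5
x = rng.normal(proposal_mean, 1.0, size=n)               # draws from proposal q
weights = norm.pdf(x) / norm.pdf(x, loc=proposal_mean)   # p(x) / q(x)
estimate = np.mean((x > 4) * weights)

print(estimate)           # ≈ 3.17e-5
print(1 - norm.cdf(4))    # exact tail probability, for comparison
```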
This chapter explores the pivotal role of modeling as a conduit between diverse data representations and applications in real, complex systems. The emphasis is on portraying modeling in terms of multivariate probabilities, laying the foundation for the probabilistic data-driven modeling framework.
This chapter provides an introduction to the main themes of the book and why this is a book about the misuse of language, just as much as the misuse of numbers. Statistics are never just numbers, for the numbers have to be labelled. Because politicians are so distrusted at present, people expect politicians to manipulate statistics. This opening chapter introduces readers to a number of excellent recent books about statistics, most of which have been addressed to non-specialist readers. The topic of statistics is a broad one and can sustain a variety of books with different slants. Unlike other books on statistics, this one looks directly at manipulation and how it occurs. A recurring theme of the book is that the political manipulation of statistics is not typically a single act: politicians will often manipulate their statisticians into manipulating the official statistics on their behalf. This opening chapter also comments on the book’s style of writing. The authors aim to write in a clear and non-technical way, and to give special emphasis to the ways that politicians manipulate language when manipulating numbers.
This chapter delves into the complexities and challenges of data science, emphasizing the potential pitfalls and ethical considerations inherent in decision-making based on data. It explores the intricate nature of data, which can be multifaceted, noisy, temporally and spatially disjointed, and often a result of the interplay among numerous interconnected components. This complexity poses significant difficulties in drawing causal inferences and making informed decisions.
A central theme of the chapter is the compromise of privacy that individuals may face in the quest for data-driven insights, which raises ethical concerns regarding the use of personal data. The discussion extends to the concept of algorithmic fairness, particularly in the context of racial bias, shedding light on the need for mitigating biases in data-driven decision-making processes.
Through a series of examples, the chapter illustrates the challenges and potential pitfalls associated with data science, underscoring the importance of robust methodologies and ethical considerations. It concludes with a thought-provoking examination of income inequality as a controversial example of data science in practice. The example highlights the nuanced interplay between data, decisions, and societal impacts.
In this chapter we start by reviewing the different types of inference procedures: frequentist, Bayesian, parametric and non-parametric. We introduce notation by providing a list of the probability distributions that will be used later on, together with their first two moments. We review some results on conditional moments and work through several examples. We review definitions of stochastic processes, stationary processes and Markov processes, and finish by introducing the most common discrete-time stochastic processes that show dependence in time and space.
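As a minimal illustration of a discrete-time process with temporal dependence, the sketch below (parameters chosen purely for illustration) simulates a stationary AR(1) process and checks its first two moments against the theoretical values:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate x_t = phi * x_{t-1} + e_t and compare sample moments with
# theory: E[x_t] = 0 and Var(x_t) = sigma^2 / (1 - phi^2).
phi, sigma, T = 0.8, 1.0, 200_000
x = np.zeros(T)
for t in range(1, T):
    x[t] = phi * x[t - 1] + rng.normal(0.0, sigma)

print(x.mean())                           # ≈ 0
print(x.var(), sigma**2 / (1 - phi**2))   # both ≈ 2.78
```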
Limit theory is developed for least squares regression estimation of a model involving time trend polynomials and a moving average error process with a unit root. Models with these features can arise from data manipulation such as overdifferencing and model features such as the presence of multicointegration. The impact of such features on the asymptotic equivalence of least squares and generalized least squares is considered. Problems of rank deficiency that are induced asymptotically by the presence of time polynomials in the regression are also studied, focusing on the impact that singularities have on hypothesis testing using Wald statistics and matrix normalization. The chapter is largely pedagogical but contains new results, notational innovations, and procedures for dealing with rank deficiency that are useful in cases of wider applicability.
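The overdifferencing setup can be made concrete with a small simulation. In this hedged sketch (coefficients and sample size are hypothetical), least squares is fit to a linear time trend when the error is an overdifferenced MA(1) with a unit root:

```python
import numpy as np

rng = np.random.default_rng(1)

# Error u_t = e_t - e_{t-1} arises from overdifferencing white noise:
# an MA(1) with coefficient -1, i.e. a moving average unit root.
T = 1_000
t = np.arange(1, T + 1)
e = rng.normal(size=T + 1)
u = e[1:] - e[:-1]
y = 2.0 + 0.5 * t + u

X = np.column_stack([np.ones(T), t])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)   # ≈ [2.0, 0.5]
```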
This chapter formally defines a financial market and associated constructs, and lays the foundations for arbitrage pricing and dynamic replication (or hedging) through trading strategies.
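A one-period binomial model gives the simplest picture of dynamic replication and arbitrage pricing. The sketch below (all numbers illustrative, not taken from the chapter) replicates a call payoff with a stock position and a bank account, so that the cost of the replicating portfolio is the arbitrage-free price:

```python
# One-period binomial market: stock moves up or down, cash earns rate r.
s0, up, down, r, strike = 100.0, 1.2, 0.8, 0.05, 100.0

payoff_up = max(s0 * up - strike, 0.0)      # option payoff if stock rises
payoff_down = max(s0 * down - strike, 0.0)  # option payoff if stock falls

# Choose stock holding (delta) and cash so the portfolio matches the
# payoff in both states; its cost today is the arbitrage-free price.
delta = (payoff_up - payoff_down) / (s0 * (up - down))
cash = (payoff_up - delta * s0 * up) / (1 + r)
price = delta * s0 + cash

print(delta, cash, price)   # 0.5, ≈ -38.10, ≈ 11.90
```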
Observed choices are random in psychological experiments on perception and in economics experiments on choice. I discuss a number of possible explanations and introduce the random utility model.
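A minimal sketch of the random utility model follows (utilities and the noise distribution are chosen for illustration): each option's utility is a deterministic part plus i.i.d. Gumbel noise, the chooser picks the maximum, and the resulting choice frequencies match the logit (softmax) formula:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random utility: U_i = v_i + e_i with e_i i.i.d. Gumbel, choose argmax.
v = np.array([1.0, 0.5, 0.0])            # deterministic utilities
n = 200_000
noise = rng.gumbel(size=(n, v.size))
choices = np.argmax(v + noise, axis=1)

simulated = np.bincount(choices) / n
logit = np.exp(v) / np.exp(v).sum()
print(simulated, logit)   # both ≈ [0.51, 0.31, 0.19]
```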
Section 1.1 calls attention to the prevalent research practice that studies planning with incredible certitude. Section 1.2 contrasts the conceptions of uncertainty in consequentialist and axiomatic decision theory. Section 1.3 presents the formal structure of consequentialist theory, which is used throughout the book. Section 1.4 explains the prevalent econometric characterization of uncertainty, which distinguishes identification problems and statistical imprecision. Section 1.5 discusses the distinct perspectives on social welfare expressed in various strands of research on planning.
This chapter provides an overview on the use and validity of student samples in the behavioral and social sciences. In some instances, data collected from students can be of limited value or even inappropriate; however, in other cases, this approach provides useful data. I offer three general ways to evaluate the use of student samples. First, consider the research design. Descriptive studies that rely on students to draw inferences about the overall population are likely problematic. Second, statistical controls such as multivariate analyses that adjust for other factors may reduce some of the biases that may be introduced through sampling. Third, consider the theorized mechanism – a clear theoretical mechanism that does not vary based on the demographics of the sample allows us to put more faith in constrained samples. Despite these approaches, and regardless of our methods, statistics, and theoretical mechanism, we should be cautious with generalizability claims.
An empirical social networks study is concerned with what a well-defined social network is like, and whether and how it matters in some context of interest. Designing a successful one requires serious thinking on the front end about what the network is and what it does in theory. This book aims to help researchers do just that. To begin, this chapter motivates this research area with examples from political science, explains why the topic is unique enough to warrant a whole book, and offers guidance on how to know if your research should incorporate networks.
Chapter 1 discusses the motivation for the book and the rationale for its organization into four parts: preliminary considerations, evaluation for classification, evaluation in other settings, and evaluation from a practical perspective. In more detail, the first part provides the statistical tools necessary for evaluation and reviews the main machine learning principles as well as frequently used evaluation practices. The second part discusses the most common setting in which machine learning evaluation has been applied: classification. The third part extends the discussion to other paradigms such as multi-label classification, regression analysis, data stream mining, and unsupervised learning. The fourth part broadens the conversation by moving it from the laboratory setting to the practical setting, specifically discussing issues of robustness and responsible deployment.
This chapter provides an overview of the purpose of the book, namely to help the user of public opinion data develop a systematic analytical approach for understanding, predicting, and engaging public opinion. This includes helping the reader understand how public opinion can be employed as a decision-making input, meaning a factor, or variable, to assess, predict, or influence an outcome. The chapter outlines how information from different disciplines, including cognitive psychology, behavioral economics, and political science, comes together to inform the pollster’s work.
Network science has exploded in popularity since the late 1990s. But it flows from a long and rich tradition of mathematical and scientific understanding of complex systems. We can no longer imagine the world without evoking networks. And network data is at the heart of it. In this chapter, we set the stage by highlighting network science's ancestry and the exciting scientific approaches that networks have enabled, followed by a tour of the basic concepts and properties of networks.
We begin by illustrating the interplay between questions of scientific interest and the use of data in seeking answers. Graphs provide a window through which meaning can often be extracted from data. Numeric summary statistics and probability distributions provide a form of quantitative scaffolding for models of random as well as nonrandom variation. Simple regression models foreshadow the issues that arise in the more complex models considered later in the book. Frequentist and Bayesian approaches to statistical inference are contrasted, the latter primarily using the Bayes Factor to complement the limited perspective that p-values offer. Akaike Information Criterion (AIC) and related "information" statistics provide a further perspective. Resampling methods, where the one available dataset is used to provide an empirical substitute for a theoretical distribution, are introduced. Remaining topics are of a more general nature. RStudio is one of several tools that can help in organizing and managing work. The checks provided by independent replication at another time and place are an indispensable complement to statistical analysis. Questions of data quality, of relevance to the questions asked, of the processes that generated the data, and of generalization, remain just as important for machine learning and other new analysis approaches as for more classical methods.
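The resampling idea mentioned above can be shown in a few lines. In this hedged sketch (the dataset is simulated here as a stand-in for real observations), the bootstrap resamples the one available dataset with replacement to approximate the sampling distribution of a statistic, here the mean:

```python
import numpy as np

rng = np.random.default_rng(7)

# One observed dataset stands in for the unknown theoretical distribution;
# resampling it with replacement yields an empirical sampling distribution.
data = rng.exponential(scale=2.0, size=50)
boot_means = np.array([
    rng.choice(data, size=data.size, replace=True).mean()
    for _ in range(10_000)
])

print(data.mean(), boot_means.std())   # point estimate and its bootstrap SE
```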