
11 - Classification
James Burridge, University of Portsmouth, Nick Tosh, University of Galway
Book:

Inference in Statistical Modelling and Machine Learning

Published online:

22 May 2026

Print publication:

23 July 2026, pp 192-221
- Chapter
Summary

This chapter introduces probabilistic models for supervised learning tasks where the prediction target is categorical. In binary classification, the target takes two values; models output the conditional probability of one of these, given the predictors. Logistic regression expresses the log odds as a linear function of the predictors and is fitted by minimising (regularised) cross-entropy loss. Minimising unregularised cross-entropy is equivalent to maximising likelihood, but in linearly separable cases, a maximum likelihood solution may not exist. Regularisation ensures the problem is well posed and helps control overfitting. In multiclass classification, the target can take K > 2 values, and models output a K-dimensional probability vector. Multinomial logistic regression expresses a K-dimensional score vector as a linear function of the predictors and applies the softmax function to convert scores into probabilities. k-nearest neighbours (k-NN) is a non-parametric method that estimates class probabilities from nearby training points. In high-dimensional predictor spaces, parametric models like logistic regression often outperform non-parametric ones like k-NN.
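The link between linear scores, softmax probabilities and log odds described in this summary can be sketched in a few lines of Python. This is a minimal illustration, not code from the chapter: the weights `W`, offsets `b` and input `x` are made-up values, and the helper names are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function: maps a log-odds value z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Convert K real-valued scores into a K-dimensional probability vector."""
    m = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear_scores(W, b, x):
    """One score per class: s_k = w_k . x + b_k."""
    return [sum(wj * xj for wj, xj in zip(w, x)) + bk for w, bk in zip(W, b)]

# Hypothetical parameters for K = 3 classes and 2 predictors.
W = [[1.0, -0.5], [0.2, 0.3], [-1.0, 0.8]]
b = [0.0, 0.1, -0.2]
x = [0.5, 1.5]

probs = softmax(linear_scores(W, b, x))  # entries are positive and sum to 1

# With K = 2, softmax reduces to the logistic function of the score difference,
# which is the log odds of the first class.
two_class = softmax([2.0, 0.5])
```

In the two-class case, `two_class[0]` agrees with `sigmoid(2.0 - 0.5)`, which is why binary logistic regression can be read as a special case of the multinomial model.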

5 - Logistic and Poisson Regression
Isabella Verdinelli, Carnegie Mellon University, Pennsylvania, Larry Wasserman, Carnegie Mellon University, Pennsylvania
Book:

All of Regression

Published online:

08 May 2026

Print publication:

04 June 2026, pp 66-74
- Chapter
Summary

When the outcome Y is binary or an integer, we need to modify our methods. In this chapter, we introduce logistic regression for binary data and Poisson regression for count data. These are special cases of a class of regression models called generalized linear models. Logistic regression is a special case of a more general suite of methods called classification, which are discussed in Chapter 9.

Operational uncertainty in machine learning based debris block detection in urban waterways
Christopher Rowlatt, Andrew Paul Barnes, Simon Dooley, Thomas Rodding Kjeldsen
Journal:

Cambridge Prisms: Water / Volume 4 / 2026

Published online by Cambridge University Press:

02 March 2026, e10
- Article
- Open access
This study investigates the use of machine learning based image classification techniques to detect debris blocking of urban waterways. Using a dataset comprising 1089 labelled CCTV images of a trash screen located in Cardiff, UK and a comprehensive re-sampling approach, we investigate not only the ability of selected machine learning algorithms to correctly identify images, but also to evaluate the uncertainty of these algorithms conditional on the datasets presented to them. For each candidate model, we considered two datasets: an imbalanced dataset and an under-sampled dataset. The results demonstrate that the performance of a simple logistic regression model was broadly comparable to that of more advanced machine learning models such as vision transformers. The best performing models (vision transformers and logistic regression) achieved an accuracy of more than 80%, while the ResNet50 model achieved an accuracy in the low 70% range. This is an important result that opens the possibility for implementing these techniques as part of an operational real-time flood warning system utilising already existing cameras.

7 - Structural Equation Models with Categorical Indicators and Outcomes
Kenneth A. Bollen, University of North Carolina, Chapel Hill
Book:

Elements of Structural Equation Models (SEMs)

Published online:

05 February 2026

Print publication:

19 February 2026, pp 500-589
- Chapter
Summary

Chapter 7 covers models with categorical endogenous variables. It examines the consequences of treating such variables as continuous and how to modify SEMs to take account of categorical variables. It begins with single equation regression-like models for binary, ordinal, and count variables and builds to multiequation models. It includes a polychoric correlation approach, models with exogenous observed variables, the treatment of missing values, and alternative modeling approaches for categorical variables.

8 - Supervised Learning
from Part III - Machine Learning for Data Science
Chirag Shah, University of Washington
Book:

A Hands-On Introduction to Data Science with R

Published online:

07 February 2026

Print publication:

22 January 2026, pp 203-256
- Chapter
Summary

This chapter provides a comprehensive introduction to supervised learning techniques for classification problems. It begins with logistic regression for binary classification, explaining the sigmoid function and gradient ascent optimization. The chapter then covers softmax regression for multi-class problems, followed by k-nearest neighbors (kNN) as an intuitive distance-based classifier.
Decision trees are explored in detail, including entropy, information gain, and the ID3 algorithm, along with derived decision rules and association rules. Random forests are presented as an ensemble method that addresses overfitting by combining multiple decision trees.
The chapter covers Naive Bayes classification, which applies Bayes’ theorem under a "naive" independence assumption. Finally, Support Vector Machines (SVMs) are introduced for both linear and non-linear classification using maximum margin hyperplanes.
Each technique includes hands-on R programming examples with real datasets, practical applications, and exercises to reinforce learning concepts.

8 - Supervised Learning
from Part III - Machine Learning for Data Science
Chirag Shah, University of Washington
Book:

A Hands-On Introduction to Data Science with Python

Published online:

07 February 2026

Print publication:

22 January 2026, pp 210-269
- Chapter
Summary

This chapter explores supervised learning techniques where algorithms learn from labeled training data to make predictions. It begins with logistic regression for binary classification problems, using the sigmoid function to output probabilities between 0 and 1. Softmax regression extends this to multi-class problems. The chapter covers k-nearest neighbors (kNN), which classifies data points based on their similarity to training examples. Decision trees use entropy and information gain to create interpretable classification rules, while random forests combine multiple decision trees to reduce overfitting through ensemble methods. Naive Bayes applies Bayes’ theorem with independence assumptions for probabilistic classification, particularly effective for text classification. Finally, support vector machines (SVM) find optimal decision boundaries by maximizing margins between classes. Each technique is demonstrated through hands-on Python examples using real datasets, showing practical applications in various domains from healthcare to finance.
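As an illustration of the k-nearest neighbours idea mentioned in this summary, the sketch below estimates class probabilities as the fraction of each class among the k closest training points. It is a minimal standard-library version under assumed toy data; the function name `knn_predict_proba` and the points are hypothetical and not taken from the book.

```python
import math
from collections import Counter

def knn_predict_proba(train_X, train_y, query, k=3):
    """Estimate class probabilities from the k training points nearest to query."""
    # Pair each training point's Euclidean distance to the query with its label.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    # Count labels among the k closest points and convert counts to fractions.
    counts = Counter(label for _, label in dists[:k])
    return {label: counts[label] / k for label in sorted(set(train_y))}

# Made-up two-dimensional training set with two classes.
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (6.5, 7.0), (7.0, 6.0)]
train_y = ["a", "a", "a", "b", "b", "b"]

near_a = knn_predict_proba(train_X, train_y, (2.0, 2.0), k=3)
mixed = knn_predict_proba(train_X, train_y, (5.0, 5.0), k=5)
```

A larger `k` smooths the estimate: the first query sees only class "a" neighbours, while the second, with `k=5`, draws neighbours from both classes.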

Which-Hunting and the Standard English Relative Clause
Lars Hinrichs, Benedikt Szmrecsanyi, Axel Bohmann
Journal:

Language / Volume 91 / Issue 4 / December 2015

Published online by Cambridge University Press:

01 January 2026, pp. 806-836
- Article
Alternation among restrictive relativizers in written Standard English is undergoing a massive shift from which to that. In corpora of written-edited-published British and American English covering the period from 1961-1992, American English spearheads this change. We study 16,868 restrictive relative clauses with inanimate antecedents from the Brown quartet of corpora. Predictors include additional areas of variation regulated by prescriptivism. We show that: (i) relativizer deletion follows different constraints from the selection of either that or which; (ii) this change is a case of institutionally backed colloquialization-cum-Americanization; and (iii) uptake of the precept correlates with avoidance of the passive voice at the text level but not with other prescriptive rules.

7 - Logistic and Softmax Regression
from Part II - Regression
Ruye Wang, Harvey Mudd College, California
Book:

Introduction to Machine Learning

Published online:

05 February 2026

Print publication:

18 December 2025, pp 166-182
- Chapter
Summary

Regression and classification are closely related, as shown in this chapter, which discusses methods used to map a linear regression function into a probability function using either the logistic function (for binary classification) or the softmax function (for multi-class classification). According to this probability function, an unlabeled sample can be assigned to one of the classes. The optimal model parameters in this method can be obtained from the training set so that either the likelihood or the posterior probability of these parameters is maximized.

3 - Optimization Theory and Algorithms
Sébastien Roch, University of Wisconsin, Madison
Book:

Mathematical Methods in Data Science

Published online:

04 November 2025

Print publication:

30 October 2025, pp 128-199
- Chapter
Summary

This chapter focuses on the core concepts of optimization theory and its application in data science and AI. It begins with a review of differentiable functions of several variables, including the gradient and Hessian matrices, and key results like the Chain Rule and the Mean Value Theorem. The chapter then introduces optimality conditions for unconstrained optimization, explaining first-order and second-order conditions, and the role of convexity in ensuring global optimality. A detailed discussion of the gradient descent algorithm is provided, including its convergence analysis under different assumptions. The chapter concludes with an application to logistic regression, demonstrating how gradient descent is used to optimize the cross-entropy loss function in a supervised learning context. Practical Python examples are integrated throughout to illustrate the theoretical concepts.
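The application mentioned at the end of this summary, gradient descent on the cross-entropy loss of a logistic regression model, can be sketched as follows. This is a minimal standard-library illustration under stated assumptions, not the chapter's own code: the toy data, the step size `lr=0.1` and the function names are all hypothetical choices.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, b, x):
    """Predicted probability that the label is 1."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def cross_entropy(w, b, X, y):
    """Mean cross-entropy loss over the data set."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for xi, yi in zip(X, y):
        p = predict(w, b, xi)
        total -= yi * math.log(p + eps) + (1 - yi) * math.log(1 - p + eps)
    return total / len(X)

def gradient_descent(X, y, lr=0.1, steps=500):
    """Plain batch gradient descent, starting from zero weights."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    losses = [cross_entropy(w, b, X, y)]
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi  # derivative of the loss w.r.t. the score
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
        losses.append(cross_entropy(w, b, X, y))
    return w, b, losses

# Made-up one-dimensional data that is not linearly separable.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 1, 0, 1, 1]
w, b, losses = gradient_descent(X, y)
```

Because the cross-entropy loss is convex in the parameters, a sufficiently small step size gives a loss sequence that does not increase from one iteration to the next; the data here is deliberately non-separable so that a finite minimiser exists.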

12 - Regression and Classification
Carlos Fernandez-Granda, New York University
Book:

Probability and Statistics for Data Science

Published online:

19 June 2025

Print publication:

03 July 2025, pp 495-598
- Chapter
Summary

This chapter covers regression and classification, where the goal is to estimate a quantity of interest (the response) from observed features. In regression, the response is a numerical variable. In classification, it belongs to a finite set of predetermined classes. We begin with a comprehensive description of linear regression and discuss how to leverage it to perform causal inference. Then, we explain under what conditions linear models tend to overfit or to generalize robustly to held-out data. Motivated by the threat of overfitting, we introduce regularization and ridge regression, and discuss sparse regression, where the goal is to fit a linear model that only depends on a small subset of the available features. Then, we introduce two popular linear models for binary and multiclass classification: Logistic and softmax regression. At this point, we turn our attention to nonlinear models. First, we present regression and classification trees and explain how to combine them via bagging, random forests, and boosting. Second, we explain how to train neural networks to perform regression and classification. Finally, we discuss how to evaluate classification models.

10 - Mixed Effects Modelling
Sali A. Tagliamonte, University of Toronto
Book:

Analysing Sociolinguistic Variation

Published online:

19 June 2025

Print publication:

03 July 2025, pp 225-249
- Chapter
Summary

How do I conduct a mixed effects logistic regression of a linguistic variable? This chapter will illustrate the procedures for performing statistical modelling using mixed effects logistic regression with the lme4 package in R. It will review the steps for conducting analyses, for finding the best model for the feature under study, and what to do with it when you find it.

10 - Excursion: Problems in Machine Learning
Yisong Yang, New York University
Book:

Advanced Linear Algebra

Published online:

27 May 2025

Print publication:

12 June 2025, pp 272-302
- Chapter
Summary

A rich and important area for the applications of linear algebra is machine learning. In machine learning, one aims to achieve optimized or learned understanding of various kinds of real-world phenomena from data collected or observed, without real comprehension of the functioning mechanisms of such phenomena. These functioning mechanisms are often impossible or impractical to grasp anyway. In this chapter, we present several introductory and fundamental problems in supervised machine learning, including linear regression, data classification, and logistic regression, along with the associated mathematical and computational methods.

REDUCING TYPE 1 CHILDHOOD DIABETES IN SAUDI ARABIA BY IDENTIFYING AND MODELLING ITS KEY PERFORMANCE INDICATORS
Part of
- Linear inference, regression
AHOOD ALAZWARI
Journal:

Bulletin of the Australian Mathematical Society / Volume 112 / Issue 3 / December 2025

Published online by Cambridge University Press:

09 June 2025, pp. 574-576

Print publication:

December 2025
- Article

4 - Exploring the Suitability of Environmental Health Information for Parental Education Using Machine Learning Models
Meng Ji, University of Sydney, Michael Oakes, University of Birmingham
Book:

Multilingual Environmental Communications

Published online:

16 May 2025

Print publication:

05 June 2025, pp 75-103
- Chapter
Summary

In this chapter, new computational models will focus on whether environmental health texts are suitable for parents rather than the general public. Logistic regression models will identify linguistic features that are important contributors to the prediction of the suitability of environmental health materials for parents and caregivers of young children, who are more likely to be affected by environmental health risks such as water pollution, excessive sun exposure, and radiation in natural and indoor environments.

2 - Statistics and Machine Learning for Textual Readability Studies
Meng Ji, University of Sydney, Michael Oakes, University of Birmingham
Book:

Multilingual Environmental Communications

Published online:

16 May 2025

Print publication:

05 June 2025, pp 23-52
- Chapter
Summary

This chapter describes how to characterize data and the distribution of data. We will also describe how the shape of the normal distribution enables hypothesis testing. In the section on regression, we look at how two variables or ways of measuring data are related to each other. We will use simple linear regression as an introduction to multiple regression, the technique used in the development of a number of traditional readability measures. A more sophisticated form of regression, called logistic regression, is also discussed; it will be applied in the case studies of Chapters 4 to 6.

Online Calibration Via Variable Length Computerized Adaptive Testing
Yuan-chin Ivan Chang, Hung-Yi Lu
Journal:

Psychometrika / Volume 75 / Issue 1 / March 2010

Published online by Cambridge University Press:

01 January 2025, pp. 140-157
- Article
Item calibration is an essential issue in modern item response theory based psychological or educational testing. Due to the popularity of computerized adaptive testing, methods to efficiently calibrate new items have become more important than they were when paper-and-pencil test administration was the norm. Many calibration processes have been proposed and discussed from both theoretical and practical perspectives. Among them, online calibration may be one of the most cost-effective processes. In this paper, under a variable length computerized adaptive testing scenario, we integrate the methods of adaptive design, sequential estimation, and measurement error models to solve online item calibration problems. The proposed sequential estimate of item parameters is shown to be strongly consistent and asymptotically normally distributed with a prechosen accuracy. Numerical results show that the proposed method is very promising in terms of both estimation accuracy and efficiency. The results of using calibrated items to estimate the latent trait levels are also reported.

A Generalized Rasch Model for Manifest Predictors
Aeilko H. Zwinderman
Journal:

Psychometrika / Volume 56 / Issue 4 / December 1991

Published online by Cambridge University Press:

01 January 2025, pp. 589-600
- Article
A logistic regression model is suggested for estimating the relation between a set of manifest predictors and a latent trait assumed to be measured by a set of k dichotomous items. Usually the estimated subject parameters of latent trait models are biased, especially for short tests. Therefore, the relation between a latent trait and a set of predictors should not be estimated with a regression model in which the estimated subject parameters are used as a dependent variable. Direct estimation of the relation between the latent trait and one or more independent variables is suggested instead. Estimation methods and test statistics for the Rasch model are discussed and the model is illustrated with simulated and empirical data.

Robust Inference with Binary Data
Maria-Pia Victoria-Feser
Journal:

Psychometrika / Volume 67 / Issue 1 / March 2002

Published online by Cambridge University Press:

01 January 2025, pp. 21-32
- Article
In this paper, robustness properties of the maximum likelihood estimator (MLE) and several robust estimators for the logistic regression model with binary responses are analysed. It is found that the MLE and the classical Rao's score test can be misleading in the presence of model misspecification, which in the context of logistic regression means either misclassification errors in the responses or extreme data points in the design space. A general framework for robust estimation and testing is presented, together with a robust estimator and a robust testing procedure. It is shown that they are less influenced by model misspecifications than their classical counterparts. They are finally applied to the analysis of binary data from a study on breastfeeding.

A Latent Transition Model With Logistic Regression
Hwan Chung, Theodore A. Walls, Yousung Park
Journal:

Psychometrika / Volume 72 / Issue 3 / September 2007

Published online by Cambridge University Press:

01 January 2025, pp. 413-435
- Article
Latent transition models increasingly include covariates that predict prevalence of latent classes at a given time or transition rates among classes over time. In many situations, the covariate of interest may be latent. This paper describes an approach for handling both manifest and latent covariates in a latent transition model. A Bayesian approach via Markov chain Monte Carlo (MCMC) is employed in order to achieve more robust estimates. A case example illustrating the model is provided using data on academic beliefs and achievement in a low-income sample of adolescents in the United States.

A web-based dynamic nomogram for estimating talaromycosis risk in hospitalized HIV-positive patients
Xu Li, Zhongsheng Jiang, Shenglin Mo, Xiaohong Huang, Tao Chen, Peng Zhang, Linghua Li, Bin Huang, Yanqiu Lu, Ying Wu, Jiaguang Hu
Journal:

Epidemiology & Infection / Volume 152 / 2024

Published online by Cambridge University Press:

05 December 2024, e153
- Article
- Open access
Our study aimed to develop and validate a nomogram to assess talaromycosis risk in hospitalized HIV-positive patients. Prediction models were built using data from a multicentre retrospective cohort study in China. On the basis of the inclusion and exclusion criteria, we collected data from 1564 hospitalized HIV-positive patients in four hospitals from 2010 to 2019. Inpatients were randomly assigned to the training or validation group at a 7:3 ratio. To identify the potential risk factors for talaromycosis in HIV-infected patients, univariate and multivariate logistic regression analyses were conducted. Through multivariate logistic regression, we determined ten variables that were independent risk factors for talaromycosis in HIV-infected individuals. A nomogram was developed following the findings of the multivariate logistic regression analysis. For user convenience, a web-based nomogram calculator was also created. The nomogram demonstrated excellent discrimination in both the training and validation groups [area under the ROC curve (AUC) = 0.883 vs. 0.889] and good calibration. The results of the clinical impact curve (CIC) analysis and decision curve analysis (DCA) confirmed the clinical utility of the model. Clinicians will benefit from this simple, practical, and quantitative strategy to predict talaromycosis risk in HIV-infected patients and can implement appropriate interventions accordingly.
