This chapter introduces unsupervised learning, where algorithms analyze data without predefined labels or target outcomes. It covers two main clustering approaches: agglomerative clustering (a bottom-up approach that merges similar data points) and divisive clustering (a top-down approach, exemplified by the k-means algorithm, which partitions data into k groups by minimizing distances to centroids).
The chapter explains the Expectation Maximization (EM) algorithm for handling incomplete data and finding maximum likelihood parameters in statistical models. It also includes a section on reinforcement learning, where agents learn optimal actions through trial-and-error interactions with environments to maximize rewards.
Key topics include distance matrices, dendrograms, cluster evaluation metrics (AIC, BIC), and practical applications. The chapter emphasizes the artistic nature of unsupervised learning, requiring careful design decisions about thresholds, cluster numbers, and technique selection. Hands-on R examples demonstrate each method using real datasets.
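The k-means procedure summarized above — partitioning data into k groups by minimizing distances to centroids — can be sketched in a few lines of Python (an illustrative toy, not the chapter's own R code; the data, initialisation, and iteration count are assumptions for this sketch):

```python
def kmeans(points, k, iters=20):
    """Minimal k-means: assign each point to its nearest centroid,
    move each centroid to the mean of its cluster, and repeat.
    (Real implementations use smarter initialisation, e.g. k-means++.)"""
    # Naive initialisation: pick k points spread across the input list.
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assignment step: nearest centroid by squared Euclidean distance.
            nearest = min(range(k),
                          key=lambda j: sum((a - b) ** 2
                                            for a, b in zip(p, centroids[j])))
            clusters[nearest].append(p)
        # Update step: each centroid becomes the mean of its cluster.
        centroids = [tuple(sum(xs) / len(xs) for xs in zip(*c)) if c
                     else centroids[j]
                     for j, c in enumerate(clusters)]
    return centroids, clusters

# Two well-separated groups in the plane (toy data).
pts = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
       (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids, clusters = kmeans(pts, k=2)
```

On this toy data the algorithm recovers the two groups, with centroids near (0.1, 0.1) and (5.0, 5.0).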
This chapter introduces machine learning as a subset of artificial intelligence that enables computers to learn from data without explicit programming. It defines machine learning using Tom Mitchell’s formal framework and explores practical applications like self-driving cars, optical character recognition, and recommendation systems. The chapter focuses on regression as a fundamental machine learning technique, explaining linear regression for modeling relationships between variables. A key section covers gradient descent, an optimization algorithm that iteratively finds the best model parameters by minimizing error functions. Through hands-on Python examples, students learn to implement both linear regression and gradient descent algorithms, visualizing how models improve over iterations. The chapter emphasizes practical considerations for choosing appropriate algorithms, including accuracy, training time, linearity assumptions, and the number of parameters, preparing students for more advanced supervised and unsupervised learning techniques.
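The gradient-descent idea described above — iteratively adjusting parameters to minimize an error function — can be sketched for linear regression as follows (a minimal Python illustration; the learning rate, step count, and toy data are assumptions, not the chapter's own example):

```python
def fit_line(xs, ys, lr=0.01, steps=5000):
    """Fit y ≈ w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2) w.r.t. w and b.
        grad_w = (2 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # Step downhill, scaled by the learning rate.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]          # generated exactly by y = 2x + 1
w, b = fit_line(xs, ys)
```

With enough steps the parameters converge close to the true slope 2 and intercept 1; plotting the error after each step shows the iterative improvement the chapter describes.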
This chapter covers unsupervised learning, where algorithms analyze data without known true labels or outcomes. Unlike supervised learning, the goal is to discover hidden patterns and structures in data.
The chapter explores three main techniques: Agglomerative clustering works bottom-up, starting with individual data points and merging similar ones into larger clusters. Divisive clustering (including k-means) takes a top-down approach, splitting data into smaller groups. Both methods use distance matrices and dendrograms to visualize cluster relationships.
Expectation Maximization (EM) handles incomplete data by iteratively estimating missing parameters using maximum likelihood estimation. Model quality is assessed using AIC and BIC criteria.
The chapter also introduces reinforcement learning, where agents learn optimal actions through trial-and-error interactions with environments, receiving rewards or penalties. Applications include robotics, gaming, and autonomous systems. Throughout, the chapter emphasizes the creative, interpretive nature of unsupervised learning compared to more structured supervised approaches.
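The bottom-up merging that agglomerative clustering performs can be sketched as follows (an illustrative Python toy using single-linkage distance; the data and the linkage choice are assumptions for this sketch):

```python
def agglomerative(points, k):
    """Bottom-up clustering: start with every point in its own cluster,
    then repeatedly merge the two closest clusters until only k remain.
    The sequence of merges is what a dendrogram visualizes."""
    clusters = [[p] for p in points]

    def dist(a, b):
        # Single linkage: squared distance between the closest pair of members.
        return min(sum((x - y) ** 2 for x, y in zip(p, q))
                   for p in a for q in b)

    while len(clusters) > k:
        # Find the closest pair of clusters and merge them.
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] += clusters.pop(j)
    return clusters

# Toy 1-D data: a tight group of three and a tight group of two.
pts = [(0.0,), (0.1,), (0.2,), (10.0,), (10.1,)]
groups = agglomerative(pts, k=2)
```

Stopping at k = 2 here recovers the two obvious groups; in practice the stopping threshold is one of the design decisions the chapter highlights.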
This chapter educates the reader on the main ideas that have enabled advances in Artificial Intelligence (AI) and Machine Learning (ML). Taking the reader on a journey through history, it showcases how the main ideas developed by the pioneers of AI and ML are used in our modern era to make the world a better place, and it communicates that our lives are surrounded by algorithms that work based on a few main ideas. It also discusses recent advances in Generative AI, including the main ideas that led to the creation of Large Language Models (LLMs) such as ChatGPT. The chapter then turns to societal considerations in AI and ML and ends with technological advancements that could further improve our ability to use the main ideas.
An introduction to AI, including an overview of essential technologies such as machine learning and deep learning, and a discussion of generative AI and its potential limitations. The chapter includes an exploration of AI's history, including its relationship to cybernetics, its role in codebreaking, periods of optimism and “AI winters,” and today's global development of generative AI. Chapter 1 also includes an analysis of AI's role in the international and national context, focusing on potential conflicts of goals and threats that can arise from the technology.
From the early use of TF-IDF to the high-dimensional outputs of deep learning, vector space embeddings of text, at a scale ranging from token to document, are at the heart of all machine analysis and generation of text. In this article, we present the first large-scale comparison of a sampling of such techniques on a range of classification tasks on a large corpus of current literature drawn from the well-known Books3 data set. Specifically, we compare TF-IDF, Doc2vec and several Transformer-based embeddings on a variety of text-specific tasks. Using industry-standard BISAC codes as a proxy for genre, we compare embeddings in their ability to preserve information about genre. We further compare these embeddings in their ability to encode inter- and intra-book similarity. All of these comparisons take place at the book “chunk” (1,024 tokens) level. We find Transformer-based (“neural”) embeddings to be best, in the sense of their ability to respect genre and authorship, although almost all embedding techniques produce sensible constructions of a “literary landscape” as embodied by the Books3 corpus. These experiments suggest the possibility of using deep learning embeddings not only for advances in generative AI, but also as a potential tool for book discovery and as an aid to various forms of more traditional comparative textual analysis.
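As a minimal illustration of the TF-IDF baseline the article starts from, documents can be embedded as weighted term vectors and compared by cosine similarity (a toy Python sketch; the documents, whitespace tokenisation, and exact weighting are assumptions for this sketch, not the paper's pipeline):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Toy TF-IDF: term frequency weighted by inverse document frequency."""
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({t for toks in tokenised for t in toks})
    n = len(docs)
    # Document frequency and IDF per term.
    df = {t: sum(t in toks for toks in tokenised) for t in vocab}
    idf = {t: math.log(n / df[t]) for t in vocab}
    vectors = []
    for toks in tokenised:
        tf = Counter(toks)
        vectors.append([tf[t] / len(toks) * idf[t] for t in vocab])
    return vectors

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

docs = ["the dragon guarded the gold",
        "the knight fought the dragon",
        "quarterly revenue rose sharply"]
vecs = tfidf_vectors(docs)
```

Even this crude embedding places the two fantasy snippets closer to each other than to the finance snippet; the article's experiments measure how much better Doc2vec and Transformer embeddings do at chunk scale.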
Generative artificial intelligence has a long history but surged into global prominence with the introduction in 2017 of the transformer architecture for large language models. Based on deep learning with artificial neural networks, transformers revolutionised the field of generative AI for production of natural language outputs. Today’s large language models, and other forms of generative artificial intelligence, now have unprecedented capability and versatility. The emergence of these highly capable forms of generative AI poses many legal issues and questions, including consequences for intellectual property, contracts and licences, liability, data protection, use in specific sectors, potential harms, and of course ethics, policy, and regulation of the technology. To support the discussion of these topics in this Handbook, this chapter gives a relatively non-technical introduction to the technology of modern artificial intelligence and generative AI.
In this chapter, we introduce the reader to basic concepts in machine learning. We start by defining artificial intelligence, machine learning, and deep learning. We give a historical viewpoint on the field, also from the perspective of statistical physics. Then, we give a very basic introduction to the different tasks that are amenable to machine learning, such as regression and classification, and explain various types of learning. We end the chapter by explaining how to read the book and how the chapters depend on each other.
Focused on empirical methods and their applications to corporate finance, this innovative text equips students with the knowledge to analyse and critically evaluate quantitative research methods in corporate finance, and conduct computer-aided statistical analyses on various types of datasets. Chapters demonstrate the application of basic econometric models in corporate finance (as opposed to derivations or theorems), backed up by relevant research. Alongside practical examples and mini case studies, computer lab exercises enable students to apply the theories of corporate finance and make stronger connections between theory and practice, while developing their programming skills. All of the Stata code is provided (with corresponding Python and R code available online), so students of all programming abilities can focus on understanding and interpreting the analyses.
ML methods are increasingly being used in (corporate) finance studies, with impressive applications. ML methods can be applied with the aim of reducing prediction error in the models, but can also be used to extend existing traditional econometric methods. The performance of ML models depends on the quality of the input data and the choice of model. There are many ML models, each with its own specific assumptions and details, so it is essential to select the appropriate model(s) for the analysis. This chapter briefly reviews some broad types of ML methods. It covers supervised learning, which tends to achieve superior prediction performance by using more flexible functional forms than OLS in the prediction model. It explains unsupervised learning methods that derive and learn structural information from conventional data. Finally, the chapter also discusses some limitations and drawbacks of ML, as well as potential remedies.
Despite the importance of quantifying how the spatial patterns of heavy precipitation will change with warming, we lack tools to objectively analyze the storm-scale outputs of modern climate models. To address this gap, we develop an unsupervised, spatial machine-learning framework to quantify how storm dynamics affect changes in heavy precipitation. We find that changes in heavy precipitation (above the 80th percentile) are predominantly explained by changes in the frequency of these events, rather than by changes in how these storm regimes produce precipitation. Our study shows how unsupervised machine learning, paired with domain knowledge, may allow us to better understand the physics of the atmosphere and anticipate the changes associated with a warming world.
Dive into the foundations of intelligent systems, machine learning, and control with this hands-on, project-based introductory textbook. Precise, clear introductions to core topics in fuzzy logic, neural networks, optimization, deep learning, and machine learning, avoid the use of complex mathematical proofs, and are supported by over 70 examples. Modular chapters built around a consistent learning framework enable tailored course offerings to suit different learning paths. Over 180 open-ended review questions support self-review and class discussion, over 120 end-of-chapter problems cement student understanding, and over 20 hands-on Arduino assignments connect theory to practice, supported by downloadable Matlab and Simulink code. Comprehensive appendices review the fundamentals of modern control, and contain practical information on implementing hands-on assignments using Matlab, Simulink, and Arduino. Accompanied by solutions for instructors, this is the ideal guide for senior undergraduate and graduate engineering students, and professional engineers, looking for an engaging and practical introduction to the field.
Starting with the perceptron, in Chapter 6 we discuss the functioning, the training, and the use of neural networks. For the different neural network structures, the corresponding Matlab script is provided, and the limitations of the different neural network architectures are discussed. A detailed discussion of the Backpropagation learning algorithm and its underlying mathematical concept is accompanied by simple examples as well as sophisticated implementations using Matlab. Chapter 6 also includes considerations on quality measures of trained neural networks, such as the accuracy, recall, specificity, precision, and prevalence, and some of the derived quantities such as the F-score and the receiver operating characteristic plot. We also look at the overfitting problem and how to handle it during the neural network training process.
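The quality measures listed above all follow directly from the four confusion-matrix counts of a binary classifier. A minimal Python sketch (the counts below are illustrative, not from the chapter's Matlab code):

```python
def quality_measures(tp, fp, fn, tn):
    """Common quality measures for a binary classifier, computed from
    the confusion-matrix counts (true/false positives and negatives)."""
    total = tp + fp + fn + tn
    accuracy    = (tp + tn) / total
    recall      = tp / (tp + fn)        # sensitivity, true-positive rate
    specificity = tn / (tn + fp)        # true-negative rate
    precision   = tp / (tp + fp)        # positive predictive value
    prevalence  = (tp + fn) / total     # fraction of actual positives
    # F-score: harmonic mean of precision and recall.
    f_score = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "recall": recall,
            "specificity": specificity, "precision": precision,
            "prevalence": prevalence, "f_score": f_score}

# Illustrative confusion matrix: 40 TP, 10 FP, 20 FN, 30 TN.
m = quality_measures(tp=40, fp=10, fn=20, tn=30)
```

Sweeping the decision threshold and plotting recall against (1 − specificity) at each setting yields the receiver operating characteristic plot the chapter mentions.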
This chapter provides an overview of the common machine learning algorithms used in psychological measurement (to measure human attributes). They include algorithms used to measure personality from interview videos; job satisfaction from open-ended text responses; and group-level emotions from social media posts and internet search trends. These algorithms enable effective and scalable measures of human psychology and behavior, driving technological advancements in measurement. The chapter consists of three parts. We first discuss machine learning and its unique contribution to measurement. We then provide an overview of the common machine learning algorithms used in measurement and their example applications. Finally, we provide recommendations and resources for using machine learning algorithms in measurement.
Manufacturing process (MP) selection systems require a large amount of labelled data, typically not provided as design outputs. This issue is made more severe by the continuous development of Additive Manufacturing systems, which can increasingly be used as substitutes for traditional manufacturing technologies. The objective of this paper is to investigate the application of image processing for classifying MPs in an unsupervised approach. To this end, k-means and hierarchical clustering algorithms are applied to an unlabelled image dataset. The input dataset is constructed from freely accessible web databases and consists of twenty randomly selected CAD models and corresponding images of machine elements: 35% additively manufactured parts and 65% manufactured with traditional manufacturing technologies. The input images are pre-processed to have the same colour and size. The k-means and hierarchical clustering algorithms achieved 65% and 60% accuracy, respectively. The algorithms show comparable performance; however, the k-means algorithm failed to predict the correct subdivisions. The research shows promising potential for MP classification and image processing applications.
We live in a world of text. Yet the sheer magnitude of social media data, coupled with a need to measure complex psychological constructs, has made this important source of data difficult to use. Researchers often engage in costly hand coding of thousands of texts using supervised techniques or rely on unsupervised techniques where the measurement of predefined constructs is difficult. We propose a novel approach that we call ‘super-unsupervised’ learning and demonstrate its usefulness by measuring the psychologically complex construct of online political hostility based on a large corpus of tweets. This approach accomplishes the feat by combining the best features of supervised and unsupervised learning techniques: measurements of complex psychological constructs without a single labelled data source. We first outline the approach before conducting a diverse series of tests that include: (i) face validity, (ii) convergent and discriminant validity, (iii) criterion validity, (iv) external validity, and (v) ecological validity.
Under unsupervised learning, clustering or cluster analysis is first studied. Clustering methods are grouped into non-hierarchical (including K-means clustering) and hierarchical clustering. Self-organizing maps can be used as a clustering method or as a discrete non-linear principal component analysis method. Autoencoders are neural network models that can be used for non-linear principal component analysis. Non-linear canonical correlation analysis can also be performed using neural network models.
The historical development of statistics and artificial intelligence (AI) is outlined, with machine learning (ML) emerging as the dominant branch of AI. Data science is viewed as being composed of a yin part (ML) and a yang part (statistics), and environmental data science is the intersection between data science and environmental science. Supervised learning and unsupervised learning are compared. Basic concepts of underfitting/overfitting and the curse of dimensionality are introduced.