Edited by
Monika Zalnieriute, Law Institute of the Lithuanian Centre for Social Sciences; Agne Limante, Law Institute of the Lithuanian Centre for Social Sciences
International human rights courts and treaty bodies are increasingly turning to automated decision-making (ADM) technologies to expedite and improve their review of individual complaints. These tribunals have yet to consider many of the legal, normative, and practical issues raised by the use of different types of automation technologies for these purposes. This chapter offers an initial assessment of the benefits and challenges of introducing ADM into international human rights adjudication. We weigh up the benefits of introducing these tools to improve international human rights adjudication – which include greater speed and efficiency in processing and sorting cases, identifying patterns in jurisprudence, and enabling judges and staff to focus on more complex responsibilities – against two types of cognitive biases – biases inherent in the datasets on which ADM is trained, and biases arising from interactions between humans and machines. We also introduce a framework for enhancing the accountability of ADM tools that mitigates the potential harms caused by automation technologies in this context.
Edited by
Daniel Naurin, University of Oslo; Urška Šadl, European University Institute, Florence; Jan Zglinski, London School of Economics and Political Science
This chapter explores the application of large language models (LLMs) in empirical legal studies, with a focus on their potential to advance research on EU law at scale. The chapter provides a non-technical introduction to LLMs and the role they can play in legal information retrieval, including the classification of case characteristics and outcomes, which constitutes one of the most common research tasks in legal scholarship. The chapter stresses the importance of validation – researchers cannot treat the output of LLMs as automatically correct and instead must demonstrate the relevance and reliability of measures and results obtained through the use of LLMs in the context of their research topic. While LLMs are capable of significantly reducing the cost of doing legal research, their use will place growing demands on scholars to ensure the integrity of their findings. The chapter also reflects on the distinction between closed- and open-source models and how ethical and replicability imperatives might influence model choices in an increasingly crowded field.
Edited by
Daniel Naurin, University of Oslo; Urška Šadl, European University Institute, Florence; Jan Zglinski, London School of Economics and Political Science
European Migration Law (EML) presents a challenge for legal research. The law is formally unitary, yet in practice highly fragmented, and we lack a clear understanding of how its different elements legally interact and shape decision-making. This chapter introduces computational methods to overcome traditional mono-disciplinary constraints in cognising how EML operates in overlapping legal frameworks. Section 19.1 introduces computational legal method as a growing field of research in EU law and outlines some of the principal applications of case law analysis. Section 19.2 profiles a new agenda for researching legal normative interactions in EML through case-citation network analysis. Section 19.3 investigates what is to be gained from using machine-learning methods to explore outcome variance on migration decisions in EU member states. Section 19.4 concludes by reflecting on some of the limitations of our computational legal research and underscores the need to maintain an ethical approach when dealing with normative subjects.
Accurate prediction of the hydrodynamic coefficients of non-spherical particles in wall-confined flows is crucial for understanding particle–fluid interactions and reliable modelling of particle motion. Under strong wall confinement, the hydrodynamic coefficients exhibit a highly nonlinear dependence on the Reynolds number, wall distance and particle orientation – posing significant modelling challenges. In this study, we propose a multi-stage physics-informed machine-learning (MSPIML) framework for modelling the drag, lift and pitching torque coefficients of a wall-bounded prolate spheroid over the explored parameter space. In the first stage, a physics-informed mixture-of-experts (PIMoE) model predicts the drag coefficient by intelligently blending empirical correlations with a data-driven statistical expert. The resulting high-fidelity drag coefficient is then injected as an auxiliary input to a second-stage model, either a deep neural network (DNN) or an additional MoE, that predicts lift and pitching torque coefficients, thereby leveraging the strong physical coupling among the three coefficients. Trained on a comprehensive dataset of 720 direct numerical simulations covering wide ranges of Reynolds number, wall distance and particle orientation, the optimal PIMoE–DNN and PIMoE–MoE configurations achieve relative errors below 2.2 % for drag, 11.4 % for lift and 7.0 % for pitching torque while maintaining excellent generalisation across the entire parameter space. Moreover, the Shapley additive explanations analysis confirms that the MSPIML framework correctly captures the physical dependencies: dominant influence of Reynolds number and strong pitching torque dependence on the drag coefficient. The MSPIML framework provides an interpretable and efficient approach to the prediction of hydrodynamic coefficients and offers substantial potential for dynamic modelling of non-spherical particles in multiphase flows.
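The gating idea behind such a physics-informed mixture of experts can be illustrated with a minimal NumPy sketch (not the authors' model): a Schiller–Naumann-style drag correlation stands in for the empirical expert, a toy polynomial in log Re for the data-driven expert, and a logistic gate blends the two. All coefficients below are illustrative placeholders, not fitted values from the study.

```python
import numpy as np

def empirical_expert(Re):
    # Schiller-Naumann-style drag correlation (the physics "expert");
    # used here purely as an illustrative prior
    return 24.0 / Re * (1.0 + 0.15 * Re**0.687)

def data_expert(Re, w):
    # toy data-driven expert: polynomial in log(Re) with fitted weights w
    z = np.log(Re)
    return w[0] + w[1] * z + w[2] * z**2

def gate(Re, theta):
    # logistic gate: mixing weight given to the empirical expert,
    # shifting smoothly with log(Re)
    return 1.0 / (1.0 + np.exp(-(theta[0] + theta[1] * np.log(Re))))

def moe_drag(Re, w, theta):
    g = gate(Re, theta)
    return g * empirical_expert(Re) + (1.0 - g) * data_expert(Re, w)

Re = np.array([10.0, 100.0, 500.0])
cd = moe_drag(Re, w=np.array([1.0, -0.5, 0.05]), theta=np.array([2.0, -0.5]))
```

In the actual framework the gate and the data-driven expert are learned jointly from the simulation dataset, and additional inputs (wall distance, orientation) enter both experts.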
The FrameNet project is a large-scale frame-semantic database with a seemingly usage-based core: It draws on 200,000 annotated sentences from representative corpora and offers the most comprehensive description of semantic valency patterns in English to date. Nevertheless, its empirical validity is weakened by the lack of statistical information on the distribution of lexical units, frames and frame elements. Similarly, the characterisation of frame elements as core, core-unexpressed, peripheral or extra-thematic – intended to indicate their essentiality to a frame – is primarily motivated on theoretical grounds. This raises the question of whether these labels are consistent with actual language use. After exhaustively extracting frequency data from Python’s NLTK FrameNet Corpus for all attested combinations of verbs, frames and frame elements, hierarchical gradient boosting models were trained on information-theoretic measures and word embeddings to predict the coreness of frame elements. The models provide strong usage-based evidence for a general core versus non-core distinction but cast doubt on further subdivisions such as core versus core-unexpressed or peripheral versus extra-thematic. While further validation is necessary, this contribution offers the first statistical perspective on the current state of FrameNet and its compatibility with usage-based approaches.
Traditional Reynolds-averaged Navier–Stokes (RANS) closures, based on the Boussinesq eddy-viscosity hypothesis and calibrated on canonical flows, often yield inaccurate predictions of both mean flow and turbulence statistics. Here, we consider flow past a circular cylinder over a range of Reynolds numbers ($3900$–$100\,000$) and Mach numbers ($0$–$0.3$), encompassing incompressible and weakly compressible regimes, with the goal of improving predictions of mean velocity and Reynolds forces. To this end, we assemble a cross-validated dataset comprising hydrodynamic particle image velocimetry (PIV) in a towing tank, aerodynamic PIV in a wind tunnel and high-fidelity spectral element direct numerical simulation and large eddy simulation. Analysis of these data reveals a universal distribution of Reynolds stresses across the parameter space, which provides the foundation for a data-driven closure. We employ physics-informed neural networks (PINNs), trained with the unclosed RANS equations, to infer the velocity field and Reynolds-stress forcing from boundary information alone. The resulting closure, embedded in a forward PINN solver and the numerical solver OpenFOAM, significantly improves RANS predictions of both mean flow and turbulence statistics relative to conventional models.
Neural network observers (NNOs) are proposed for online estimation of fluid flows, addressing a key challenge in flow control: obtaining flow states online from a limited set of sparse and noisy sensor data. For this task, we propose a generalisation of the classical Luenberger observer. In the present framework, the estimation loop is composed of subsystems modelled as neural networks (NNs). By combining flow information from selected probes and a neural network surrogate model (NNSM) of the flow system, we train NNOs capable of fusing information to provide the best estimation of the states, that can in turn be fed back to a neural network controller (NNC). The NNO capabilities are demonstrated for three nonlinear dynamical systems. First, a variation of the Kuramoto–Sivashinsky (KS) equation with control inputs is studied, where variables are sparsely probed. We show that the NNO is able to track states even when probes are contaminated with random noise or with sensors at insufficient sample rates to match the control time step. Then, a confined cylinder flow is investigated, where velocity signals along the cylinder wake are estimated by using a small set of wall pressure sensors. In both the KS and cylinder problems, we show that the estimated states can be used to enable closed-loop control, taking advantage of stabilising NNCs. Finally, we present a legacy dataset of a turbulent boundary layer experiment, where convolutional NNs are employed to implement the models required for the estimation loop. We show that, by combining low-resolution noise-corrupted sensor data with an imperfect NNSM, it is possible to produce more accurate and robust estimates. Our approach presents better robustness to noise when compared with direct reconstructions via super-resolution NNs and predictions from graph NNs and Fourier neural operators.
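The classical Luenberger observer that the NNO generalises can be sketched for a toy discrete-time linear system: the observer propagates its own state estimate through the model and corrects it with the innovation (measurement minus predicted measurement) scaled by a gain. The system matrices and gain below are hand-picked for illustration only.

```python
import numpy as np

# toy discrete-time linear system x_{k+1} = A x_k, measurement y_k = C x_k
A = np.array([[0.99, 0.10],
              [-0.10, 0.99]])   # lightly damped oscillator
C = np.array([[1.0, 0.0]])      # only the first state is measured
L = np.array([[0.5], [0.3]])    # hand-tuned observer gain (A - LC stable)

x = np.array([1.0, 0.0])        # true state
xhat = np.zeros(2)              # observer starts with no knowledge
rng = np.random.default_rng(0)

for _ in range(200):
    y = C @ x + rng.normal(0, 0.01)        # noisy scalar measurement
    # Luenberger update: model prediction + innovation correction
    xhat = A @ xhat + (L @ (y - C @ xhat)).ravel()
    x = A @ x

err = np.linalg.norm(x - xhat)
```

In the NNO framework, the model prediction and the correction term are replaced by neural networks (the NNSM and a learned fusion map), which is what allows the approach to handle nonlinear dynamics and imperfect, noisy sensing.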
The term ‘schizo-obsessive comorbidity’ (SOC) is used to describe the presence of obsessive-compulsive symptoms or obsessive-compulsive disorder (OCD) in patients with schizophrenia (SCZ). Recent studies have found overlapping executive dysfunctions in SCZ and OCD, implicating shared pathophysiology. However, the specific deficits in the components of executive function (EF) in patients with SOC remain unclear.
Methods
We recruited 37 patients with SOC, 68 patients with SCZ, 70 patients with OCD, and 59 healthy controls (HCs). All participants completed a battery of measures for EF components, namely initiation, sustained attention, online updating, switching, disinhibition, and planning. Apart from traditional group-mean analysis, we applied machine learning approaches to identify the unique patterns of EF among different clinical groups.
Results
The results showed that the three clinical groups could be distinguished from HCs. Feature importance analysis showed that, in classifying clinical groups against HCs, online updating was the core feature for SCZ patients, whereas disinhibition and online updating jointly determined the classification between OCD patients and HCs. In differentiating SOC from HCs, online updating, planning, and disinhibition collectively served as key features. Machine learning algorithms classified SOC versus OCD with acceptable performance but SOC versus SCZ with lower performance.
Conclusions
Deficits in EF are shared features among patients with SOC, SCZ, and OCD. However, the specific components of executive dysfunction in these clinical groups appear distinct.
Anxiety disorders are highly prevalent yet lack objective biomarkers. Whereas threat-related attentional biases are well documented, less is known about broader eye movement alterations that may characterise anxiety.
Aims
To characterise multi-paradigm eye movement profiles in anxiety disorders and evaluate their potential as behavioural markers for disorder differentiation.
Method
Eye movements were recorded in 91 patients with anxiety disorders, 118 patients with depressive disorders and 98 healthy controls during free viewing of neutral stimuli and during smooth-pursuit and fixation-stability tasks. Principal component analysis was applied to derive latent eye movement dimensions, which were then tested for group differences, associations with symptom severity and classification performance.
Results
Compared with both patients with depression and healthy controls, patients with anxiety disorders exhibited hyper-scanning during free viewing, characterised by increased saccade frequency and path length, and hyper-pursuit during smooth pursuit, reflected in increased velocity gain, fewer intrusive saccades and more catch-up saccades. Principal component analysis identified six latent components, among which active visual exploration, pupillary arousal and smooth-pursuit control demonstrated robust group differences. Machine learning models trained on the six components yielded areas under the receiver operating characteristic curve of 0.82 for anxiety versus healthy controls, 0.83 for depression versus healthy controls and 0.61 for anxiety versus depression.
Conclusions
Hyper-scanning and hyper-pursuit emerge as defining eye movement signatures of anxiety, linking core mechanisms of vigilance and prediction with measurable behavioural markers. These insights position eye-tracking as a promising behavioural modality for mechanism-informed differentiation across affective disorders.
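The reduce-then-classify pipeline described above can be sketched with scikit-learn on synthetic data (the feature names, group shift and sample sizes below are illustrative, not the study's data): PCA derives latent components from correlated eye-movement features, and a classifier on component scores is evaluated by AUC.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n_per_group = 100
# synthetic stand-ins for eye-movement features: saccade frequency, scan-path
# length, pursuit velocity gain, catch-up saccade rate, pupil size, fixation drift
controls = rng.normal(0.0, 1.0, size=(n_per_group, 6))
patients = rng.normal(0.0, 1.0, size=(n_per_group, 6))
patients[:, :4] += 0.8            # toy "hyper-scanning"/"hyper-pursuit" shift
X = np.vstack([controls, patients])
y = np.repeat([0, 1], n_per_group)

# reduce to latent components, then classify on the component scores
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3,
                                      random_state=0, stratify=y)
pca = PCA(n_components=3).fit(Xtr)
clf = LogisticRegression().fit(pca.transform(Xtr), ytr)
auc = roc_auc_score(yte, clf.predict_proba(pca.transform(Xte))[:, 1])
```

Fitting the PCA on the training split only, as here, avoids leaking test-set structure into the derived components.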
Online counselling services have seen increased use in recent years, providing critical emergency mental health support. These interactions are typically long, complex, and varied in the dialogue between help seekers and counsellors. The lack of domain-specific models, especially in low-resource languages, poses a significant challenge for the automatic detection of suicide risk in online chat services for mental health support. To address this challenge, our approach adapts a general-purpose large language model (LLM) to the suicide risk prediction task, employing a two-stage classification architecture to deal with sparse and imbalanced data. It extends the state of the art by: (1) incorporating psychological theory into model training and (2) capturing key aspects of conversation structure in counselling sessions. We evaluate the performance of the proposed LLM against state-of-the-art LLMs for suicide detection on thousands of conversations in the Hebrew language from a leading national online counselling service in Israel. Results show that the proposed LLM outperforms existing state-of-the-art approaches in detecting suicide risk, as measured by metrics standard in the relevant literature. Moreover, the LLM outperforms other approaches even in the early stages of a conversation, which is crucial for real-time detection in practice. We also discuss the ethical implications of integrating LLMs into counselling services. The contributions of this work are (1) extending existing LLM architectures to incorporate domain-specific information; (2) evaluating LLM technologies in the context of socially relevant problems; and (3) introducing novel LLM tools for resource-constrained languages.
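The general shape of a two-stage classifier for rare, imbalanced outcomes can be sketched with scikit-learn (a toy illustration of the architecture pattern, not the chapter's Hebrew-language LLM system): a class-weighted, low-threshold first stage screens in candidates at high recall, and a second stage refines only the screened cases.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 8))
# rare positive class (~5%), signal concentrated in two features
logits = 1.5 * X[:, 0] + 1.2 * X[:, 1] - 3.2
y = (logits + rng.normal(0, 0.5, n) > 0).astype(int)

Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3,
                                      random_state=0, stratify=y)

# stage 1: high-recall screen -- class weighting plus a liberal threshold
stage1 = LogisticRegression(class_weight="balanced").fit(Xtr, ytr)
screen = stage1.predict_proba(Xte)[:, 1] > 0.2

# stage 2: refine only the screened-in cases with a second classifier
stage2 = LogisticRegression().fit(Xtr, ytr)
final = np.zeros(len(yte), dtype=int)
final[screen] = (stage2.predict_proba(Xte[screen])[:, 1] > 0.5).astype(int)

recall = final[yte == 1].mean()
```

The design choice is that the first stage trades precision for recall so that true cases are rarely lost before the more discriminating second stage sees them.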
Humanoid robots, with their capacity for human-like interaction and autonomous decision-making, present novel legal challenges in accident scenarios. This Article argues that existing Chinese accident law cannot fully accommodate accidents involving humanoid robots because hybrid human-algorithm control, adaptive machine learning, and embodied human-like interaction together destabilize traditional assumptions about fault, causation, proof, and remedy. To address these shortcomings, the Article proposes a reconstructed liability system that (1) establishes robust evidentiary processes for determining fault, (2) implements “reasonable person” technical standards for humanoid robot behavior, (3) designates manufacturers as the “least cost avoiders,” and (4) considers behavioral correction and retribution for robots to address victims’ psychological needs. This approach aims to foster an “accidental utopia” where innovation and safety are harmonized.
Velocity measurement techniques, such as particle image velocimetry (PIV), face a trade-off between field of view, spatial resolution and sampling rate, so that small-scale vortices, shear layers and high-frequency turbulent motions are often under-resolved. Most physics-informed reconstructions use a velocity–pressure formulation, even though pressure is not measured in typical PIV experiments, so the Navier–Stokes constraints are only weakly enforced. We address this issue by formulating a vorticity–velocity physics-informed network (VVPINN), in which pressure is eliminated and incompressibility is enforced together with a vorticity transport equation, thereby directly constraining the velocity field and its derivatives. We then compare this formulation with a conventional velocity–pressure PINN (VPPINN) for spatio-temporal super-resolution of planar PIV data in three cases: a laminar multi-cylinder wake, a two-dimensional Taylor–Green vortex and an experimental two-cylinder wake. In the Taylor–Green vortex case, with identical architectures and training strategies, the VVPINN yields smaller velocity errors, reduces the $L_2$ errors in vorticity and shear by approximately $10\,\%$, and the pressure gradient errors by up to approximately $30\,\%$ at moderate super-resolution factors, and produces instantaneous fields with more physically plausible vorticity, shear and fine-scale pressure gradient patterns. Spectral analysis shows that the temporal energy spectrum is recovered accurately, whereas the wavenumber spectra, particularly beyond the Nyquist wavenumber, remain more difficult to match because the training data strongly constrain the time histories at sampled locations, but only indirectly inform the smallest spatial scales. Overall, the results indicate that vorticity-based constraints provide a more effective route to physics-consistent super-resolution of sub-sampled PIV data than the conventional velocity–pressure formulation.
This paper evaluates the performance of baseline and domain-augmented ChatGPT models for literature-based knowledge support in flood susceptibility mapping (FSM) using machine learning approaches. To assess this, we designed five key questions related to FSM, with benchmark responses derived from our comprehensive review article (Pourzangbar et al., Journal of Flood Risk Management, 18, e70042), which analyzed 100 studies on ML applications in FSM. The same questions were posed (i) to standard ChatGPT-4 and ChatGPT-4o models without additional contextual material, and (ii) to a domain-augmented GPT-4 configuration (Chat-FSM) equipped with retrieval access to the 100 reviewed articles. The comparison highlights that GPT-based models can reasonably reproduce frequently reported machine learning models and conditioning factors from the reviewed literature, but show weaker consistency in feature selection methods, often suggesting less relevant techniques. Among the models, ChatGPT-4o demonstrated the weakest alignment with benchmark data, while Chat-FSM demonstrated the highest agreement with the benchmark dataset across most evaluated questions. In terms of application-level efficiency, GPT models required substantially less time and computational effort compared to manual literature synthesis under the defined experimental setup. While ChatGPT-based systems can support literature-informed exploration in FSM, human expertise remains essential for critical reasoning, methodological design, and application to novel or context-specific scenarios.
Neuropsychological (NP) tests are multi-domain in execution. Reliance on a single score representing specific domains obscures the detection of subtle cognitive changes and increases the risk of inaccurate assessment. Rooted in the Boston Process Approach (BPA), the Framingham Heart Study (FHS) captures multi-dimensional errors and process features within and across NP tests. We examined these BPA variables in community-dwelling older adults.
Methods:
We analyzed data from 2363 dementia-free participants aged 60 and above. Exploratory and confirmatory factor analyses used Kemeny covariance structures. Measurement invariance was estimated across age, sex, and education groups. We assessed the impact of demographics on latent factors, and the ability of these factors to predict future conversion to all-cause dementia. We trained machine learning (ML) models to compare NP and BPA data.
Results:
Participants were older adults (mean age 71.5 ± 8.7 years), primarily female (54.2%), and non-Hispanic White (96.5%). The bifactor model was the only model with adequate fit (CFI = 0.96, RMSEA = 0.03). General and specific factors captured ability for accurate and strategic responses, test-specific variance, and nuanced executive and semantic processes distributed across tests. Higher general ability and stronger verbatim story recall were associated with a reduced likelihood of developing all-cause dementia (general: OR = 0.15, 95% CI [0.12–0.86]; recall: OR = 0.24, 95% CI [0.23–0.90]) over a median of 5.2 years. With NP/BPA data, ML models identified >99% of 222 converters.
Conclusions:
This study highlights the strengths of NP/BPA data. Multidimensional cognitive features may enhance sensitivity to early changes predictive of incipient dementia.
We propose a novel machine-learning-based turbulence closure framework in which a tensor basis neural network (TBNN) is directly embedded into a Reynolds-averaged Navier–Stokes (RANS) formulation, eliminating reliance on traditional baseline turbulence models. The TBNN is trained to predict the Reynolds stress anisotropic tensor from local invariant inputs and geometry-informed features, including stream function and velocity potential. Its output is processed by a regression model that generates an optimised eddy viscosity field, which is then integrated into the RANS equations as a zero-equation turbulence closure. The framework is evaluated on three turbulent flows over complex geometries: a wavy-bottom channel, a smoothed step and a backward-facing step. Incorporating geometry-informed features significantly enhances model robustness, yielding numerically stable and convergent solutions across all cases. The predicted velocity fields and turbulence distributions closely match large eddy simulation (LES) data, confirming the accuracy of the proposed approach and demonstrating its ability to operate independently of conventional turbulence closures.
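The tensor-basis representation underlying a TBNN (Pope's expansion of the Reynolds stress anisotropy, b = Σₙ gₙ Tₙ) can be sketched in NumPy; the velocity gradient and the coefficients gₙ below are toy placeholders for what the network would predict from flow invariants.

```python
import numpy as np

# toy mean velocity gradient (2-D shear-dominated)
grad_u = np.array([[0.0, 1.0, 0.0],
                   [0.2, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])
S = 0.5 * (grad_u + grad_u.T)      # strain-rate tensor
W = 0.5 * (grad_u - grad_u.T)      # rotation-rate tensor

# first few of Pope's integrity-basis tensors for the anisotropy b = sum g_n T_n
T1 = S
T2 = S @ W - W @ S
T3 = S @ S - np.trace(S @ S) / 3.0 * np.eye(3)

# toy coefficients standing in for the TBNN outputs g_n(invariants)
g = np.array([-0.1, 0.02, 0.01])
b = g[0] * T1 + g[1] * T2 + g[2] * T3
```

Because every basis tensor is symmetric and trace-free, any coefficients the network emits yield a physically admissible (symmetric, deviatoric) anisotropy tensor, which is the structural advantage of the tensor-basis formulation.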
Parkinson’s disease (PD) is a neurodegenerative disorder characterized by neuron loss and abnormal protein trafficking. Dysregulation of vesicle-mediated transport contributes to pathogenesis, but its diagnostic value and immune associations are unclear.
Methods:
Transcriptomic data from GEO datasets (GSE20141, GSE20163, GSE7621) were analyzed. Differentially expressed vesicle-mediated transport-related genes were identified. Machine learning algorithms (least absolute shrinkage and selection operator, random forest, extreme gradient boosting) were integrated to select robust diagnostic biomarkers. The diagnostic model was validated across independent datasets. Immune infiltration was evaluated, and non-negative matrix factorization (NMF) identified molecular subtypes.
Results:
Machine learning revealed TRAPPC13 and COPS5 as robust diagnostic biomarkers with high predictive accuracy. The diagnostic model demonstrated strong accuracy across multiple datasets and showed excellent calibration and clinical applicability. Immune analysis highlighted differences in CD8+ T-cell fraction and MHC class I signaling between PD and controls. NMF clustering identified two transcriptionally distinct PD subtypes with distinct pathways and immune signatures.
Conclusion:
This analysis identified TRAPPC13 and COPS5 as novel vesicle transport-related diagnostic biomarkers for PD. These genes show strong diagnostic potential, and the two identified molecular subtypes offer new insights into PD pathogenesis and may guide personalized therapeutic strategies.
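The intersect-multiple-selectors idea can be sketched with scikit-learn on synthetic expression data (an illustration of the pattern only: it uses two selectors rather than the study's three, and invented "genes"): features retained by both an L1-penalised model and a random-forest importance ranking are treated as robust markers.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.normal(size=(n, p))
# two "true" marker genes drive the toy case/control status
y = (1.5 * X[:, 3] - 1.5 * X[:, 17] + rng.normal(0, 1.0, n) > 0).astype(int)
genes = np.array([f"gene_{i}" for i in range(p)])

# L1-penalised (LASSO-style) selection: non-zero coefficients survive
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
sel_lasso = set(genes[np.abs(lasso.coef_[0]) > 1e-6])

# random-forest importance selection: keep the top 5 features
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
sel_rf = set(genes[np.argsort(rf.feature_importances_)[-5:]])

robust = sel_lasso & sel_rf   # markers supported by both methods
```

Requiring agreement across selectors with different inductive biases is what guards against any single algorithm's false positives.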
Variational data assimilation and machine-learning based super-resolution are two alternative approaches to state estimation in turbulent flows. The former is an optimisation problem featuring a time series of coarse observations, the latter usually requires a library of high-resolution ‘ground truth’ data. We show that the classic ‘4DVar’ data assimilation algorithm can be used to train neural networks for super-resolution in three-dimensional isotropic turbulence without the need for high-resolution reference data. To do this, we adapt a pseudo-spectral version of the fully differentiable JAX-CFD solver (Kochkov et al., Proc. Natl Acad. Sci. USA, vol. 118, issue 21, 2021, e2101784118) to three-dimensional flows and combine it with a convolutional neural network for super-resolution. As a result, we are able to include entire trajectories in our loss function, which is minimised with gradient-based optimisation to define the neural network weights. We show that the resulting neural networks outperform 4DVar for state estimation at initial time over a wide variety of metrics, though 4DVar leads to more robust predictions towards the end of its assimilation window. We also present a hybrid approach in which the trained neural network output is used to initialise 4DVar. The resulting performance is more than twice as accurate as other state estimation strategies for all times and performs well even beyond known limiting length scales, all without requiring access to high-resolution measurements at any point.
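The strong-constraint 4DVar idea at the heart of this comparison can be sketched on a toy linear model in NumPy (not JAX-CFD, and with the adjoint derived by hand rather than by automatic differentiation): minimise the misfit between a trajectory launched from an initial state and a time series of coarse observations, with the gradient obtained from a backward adjoint recursion.

```python
import numpy as np

# toy linear dynamics x_{k+1} = A x_k with coarse observations y_k = H x_k
rng = np.random.default_rng(0)
d = 8
A = np.linalg.qr(rng.normal(size=(d, d)))[0] * 0.98   # stable rotation-like map
H = np.zeros((2, d)); H[0, 0] = H[1, 4] = 1.0         # observe 2 of 8 components

x_true = rng.normal(size=d)
T = 20
# generate the observation time series from the true trajectory
xs, obs = x_true.copy(), []
for _ in range(T):
    obs.append(H @ xs + rng.normal(0, 0.01, 2))
    xs = A @ xs

def cost_grad(x0):
    """4DVar window cost and its gradient via the adjoint recursion."""
    states, x = [], x0.copy()
    for k in range(T):                 # forward pass: store states
        states.append(x)
        x = A @ x
    J, lam = 0.0, np.zeros(d)
    for k in reversed(range(T)):       # backward (adjoint) pass
        r = H @ states[k] - obs[k]
        J += 0.5 * r @ r
        lam = A.T @ lam + H.T @ r
    return J, lam

x0 = np.zeros(d)
for _ in range(1000):                  # plain gradient descent on the window cost
    J, g = cost_grad(x0)
    x0 -= 0.05 * g
```

The article's contribution is, in effect, to replace the per-window optimisation over `x0` with a neural network trained through a differentiable solver, so that a single forward pass performs the estimation that 4DVar obtains iteratively.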
This study assesses whether a hybrid prediction–optimisation workflow can be used as an exploratory exercise for Brazilian federal budget allocation under severe data constraints. Using executed expenditure by budgetary function (2000–2023; N = 24), a multi-output XGBoost model is estimated to link spending profiles to GDP growth, inflation, and the Gini index; Bayesian optimisation (Tree-structured Parzen Estimator/Optuna) is then applied to search, within explicit bounds and penalties, for allocation vectors that maximise a stated objective function favouring higher growth and lower inflation and inequality. To mitigate data scarcity, the short series is augmented with 1048 synthetic observations generated through controlled noise injection, bootstrapped resampling and variational autoencoder reconstruction. Under randomised K-fold cross-validation on the augmented dataset, the model achieves mean R2 = 0.97 and mean MSE = 0.04, while diagnostics indicate larger errors at extreme values and a persistent training–validation gap. A secondary robustness check uses an anti-leakage design by applying cross-validation to the 24 real observations and generating synthetic data only within each training fold. This yields markedly weaker generalisation for GDP growth and inflation (overall mean MSE = 1.03; overall mean R2 = −0.45), with positive performance remaining only for the Gini index (R2 = 0.60). Under these conditions, the optimisation step identifies a scenario that satisfies the objective function on standardised outputs (GDP growth = 1.15; inflation = −0.04; Gini = −0.17). The results support the use of the workflow to compare scenarios under explicit assumptions, rather than to produce prescriptive budget guidance.
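The prediction–optimisation loop can be sketched in miniature with scikit-learn (illustrative only: a gradient-boosted surrogate and plain random search over the budget simplex stand in for the paper's XGBoost model and TPE-based Bayesian optimisation, and the data are synthetic).

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
# toy data: 5 budget shares (each row sums to 1) -> one synthetic outcome
n, k = 300, 5
shares = rng.dirichlet(np.ones(k), size=n)
outcome = shares @ np.array([2.0, -1.0, 0.5, 0.0, 1.0]) + rng.normal(0, 0.1, n)

# prediction step: fit a surrogate linking allocations to the outcome
model = GradientBoostingRegressor(random_state=0).fit(shares, outcome)

# optimisation step: search feasible allocations for the best predicted outcome
candidates = rng.dirichlet(np.ones(k), size=5000)
preds = model.predict(candidates)
best = candidates[np.argmax(preds)]
```

Sampling candidates from the simplex enforces the budget constraint by construction; in the paper's workflow this role is played by explicit bounds and penalties inside the objective.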
Altered stress responses are closely linked to mental disorders, but the role of brain structure in acute cortisol responses to psychosocial stress remains underexplored, particularly in healthy individuals. Previous studies, with predominantly small samples, primarily focused on selected limbic regions and functional measures. Thus, this study investigates associations between brain structure and cortisol responses to psychosocial stress, exploring if hypothalamic–pituitary–adrenal axis reactivity can be predicted from brain morphology.
Methods
Our study included 291 subjects (157 females, 18–62 years) and consisted of two parts. First, a confirmatory analysis examined associations between specific cortical surface area, thickness, and subcortical volume with stress-induced cortisol increases using Permutation Analysis of Linear Models (PALM). Second, we conducted an exploratory whole-brain vertex-wise analysis, followed by out-of-sample prediction of cortisol increases from structural measures.
Results
We found consistent negative associations between cingulate cortex (CC) sub-structures and acute cortisol increases. In PALM- and whole-brain analysis, a smaller surface area of the left rostral and caudal anterior cingulate cortex (cACC), posterior cingulate cortex, and right cACC were associated with higher cortisol stress responses, particularly in males. The left cACC surface area emerged as the most promising predictor in machine learning analyses. Additionally, other fronto-limbic structures were also associated with or predictive of acute cortisol reactivity.
Conclusions
Our findings demonstrate that cortical and subcortical structural measures, particularly smaller surface areas of the CC, predict acute hormonal stress responses. Notably, the left cACC emerged as the most consistent predictor, emphasizing its important role in stress reactivity.
This article examines the use of neural networks in electromechanical sound art and music, where sound is materially enacted through physical means such as motors, solenoids, and physical resonators. It begins with a survey of documented works, outlining a range of current strategies and discussing how technical, material, and performative factors influence their design. Identifying natural language processing as underexplored in this domain, a practice-based case study, Seven Studies for Electric Motors, develops one such language-based approach. The project embeds a small language model for real-time sentence generation, extracts syntax structures, and translates these into patterns of motor-driven sound. Taken together, the survey and case study offer a picture of how machine learning has been integrated into electromechanical practices over the past decade and point to possible directions for further work.