Recent advances in healthcare and rising life expectancy intensify longevity risk, motivating a deeper understanding of how cause-of-death (COD) rates interact. Using male COD data from 1978 to 2018 in the United States, we develop a copula-based hierarchical framework for seven major causes: cancer, diabetes, external causes, influenza, mental disorders, nephritis, and vascular disease. The framework integrates reconciliation, hierarchical dependence, and long-run equilibrium using a Lee–Carter (LC) setting. More specifically, the LC period indices are estimated under reconciliation penalties and are modeled through a sparse vector error correction model, with dependence captured by a hierarchical Archimedean copula. Two applications illustrate the value of our approach. In out-of-sample forecasting, the framework outperforms the standard LC model by improving the accuracy of aggregate mortality rates. In structural analysis, fitted connectedness reveals that diabetes and vascular disease act as net transmitters of mortality shocks, while cancer and external causes are net receivers. These insights help actuaries, demographers, clinicians, and policymakers enhance mortality forecasting to assess whether prioritizing government interventions for high-transmission causes could potentially maximize overall mortality improvements for society.
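For orientation, the standard Lee–Carter specification that such a framework builds on can be written as below. This is a sketch of the textbook model only: the cause superscript $(c)$ is an assumption added here for illustration, and the reconciliation penalties, sparse vector error correction dynamics, and copula layer described in the abstract are not shown.

```latex
\log m_{x,t}^{(c)} = a_x^{(c)} + b_x^{(c)}\,k_t^{(c)} + \varepsilon_{x,t}^{(c)},
\qquad \sum_x b_x^{(c)} = 1, \quad \sum_t k_t^{(c)} = 0 .
```

Here $m_{x,t}^{(c)}$ is the central death rate at age $x$ in year $t$ for cause $c$, $a_x^{(c)}$ is the average age profile, $k_t^{(c)}$ is the period index, and $b_x^{(c)}$ is the age-specific sensitivity to that index. The two sum constraints are the usual identification conditions.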
We study a nonlinear branching diffusion process in the sense of McKean, i.e. where particles are subjected to a mean-field interaction. We consider first a strong formulation of the problem and we provide an existence and uniqueness result by using contraction arguments. Then we consider the notion of weak solution and its equivalent martingale problem formulation. In this setting, we provide a general weak existence result, as well as a propagation of chaos property, i.e. the McKean–Vlasov branching diffusion is the limit of a large-population branching diffusion process with mean-field interaction.
The fast-moving field of data science is increasingly permeating into the health and care actuarial sciences. Given this context, the Institute and Faculty of Actuaries set out to form a “techniques in data science in health and care” working party. This working party was tasked with creating a framework for those actuaries working within the health and care domain that would assist them in determining which techniques are appropriate for a project. The framework presented here was developed through a combination of literature review and synthesis of expert opinion from experienced practitioners from diverse backgrounds. The framework offers a structured, itemised approach, serving as a checklist to ensure that all relevant analytics and decisions are considered and documented. Each itemised topic is covered by a summary providing guidance and relevant references for further reading. The checklist follows the natural workflow of a data analytics project, guiding users through each step to prevent omissions and maintain rigour in analysis, reporting and peer review. The framework blends relevant analytics elements from actuarial science, data science and epidemiology. We hope the framework will enhance transparency, reproducibility, and comprehensiveness in the reporting and peer review of health and care data analytics projects.
In this paper, we investigate how policyholder information and broader economic conditions jointly influence the duration of credit life insurance contracts in the French market. Employing a proportional-intensities regression framework built on inhomogeneous phase-type distributions, we capture the way covariates shape the distribution of policy lifetime until a lapse occurs. The model is estimated via a specialized expectation-maximization algorithm, adapted to handle censored data, covariates, and feature selection through shrinkage. Our analysis of real-world data shows that different policyholder attributes and economic factors can significantly alter lapse behavior, with effects varying across insurance products, individuals, and economic cycles. These findings highlight the importance of integrating both individual-level and macroeconomic indicators in lapse risk assessment, ultimately informing more accurate pricing and allowing for improved risk management strategies.
We present an efficient neural-based approach to estimate the instantaneous flow field around an airfoil from limited surface pressure measurements. The model, denoted SNN-POD, relies on two independent shallow neural networks to predict the instantaneous flow over a wide range of angles of attack, $[10^{\circ}, 20^{\circ}]$. At all angles the global model correctly recovers the average characteristics of the flow from single-time sensor data, thus allowing combination with local, angle-dependent models. The method is applied to 2D URANS simulations of a thick airfoil at a Reynolds number of $\mathrm{Re} = 4.5 \times 10^{6}$. The training set consists of snapshots obtained from a coarse sampling ($1$–$2^{\circ}$) of the angle of attack range. A variance-based criterion is used to determine the number and positions of sensors. Tests are carried out for unseen snapshots at angles of attack within the set (sampled angles) as well as outside the set (interpolated angles). The maximum MSE for sampled and interpolated angles of attack is $2.9\%$ and $6.6\%$, respectively. This makes it possible to develop adaptive strategies to improve the estimation if necessary.
Insurance risk arising from natural catastrophes such as earthquakes is a key component of the minimum capital test for federally regulated property and casualty insurance companies. This paper proposes an integrated, open-source, simulation-based actuarial framework for the assessment of earthquake insurance risk and solvency capital requirements. The framework combines spatio-temporal earthquake occurrence modeling, physics-informed ground-shaking estimation based on Canadian seismic hazard maps, building exposure and vulnerability modeling, and detailed insurance loss and claim calculations within a unified pipeline. Spatial heterogeneity in seismic risk is captured through kernel-based spatio-temporal point process modeling, while Voronoi-based deviance residuals are employed as localized diagnostic tools to validate model adequacy. Simulated insured losses are used to estimate regional and country-wide probable maximum losses (PMLs), and a new capital aggregation formula is proposed that explicitly incorporates cross-provincial dependence in earthquake losses, in contrast to the current region-based regulatory aggregation. The proposed framework enables spatially resolved loss and capital assessment at a fine geographic scale and is implemented in a fully reproducible open-source environment. An interactive web application is also provided to allow users to simulate earthquake damage and the resulting financial losses and insurance claims at user-specified epicenter locations.
Accurate prediction of bridge crack evolution is essential for infrastructure safety assurance and maintenance optimization. This study develops an interpretable machine learning framework to predict the expansion of cracks on the main beam in small- and medium-span highway beam bridges and identify the underlying mechanisms of structural deterioration. A comprehensive database was constructed from inspection and monitoring records of over 100 bridges, featuring critical degradation indicators, including crack density (CD) and maximum crack width (MCW). Following data preprocessing and feature selection through correlation analysis, three machine learning algorithms, that is, support vector regression (SVR), random forest (RF), and extreme gradient boosting (XGBoost), were implemented and evaluated using statistical metrics ($R^2$, RMSE, and MAE). The XGBoost model demonstrated superior predictive performance with $R^2$ values of 0.9433 and 0.9413 for MCW and CD, respectively, reducing RMSE by up to 66.8% and MAE by up to 72% compared to alternative models. SHAP (SHapley Additive exPlanations) analysis revealed that four factors, namely, vehicle load (VL), annual average daily truck traffic (ADTT), bridge age (BA), and annual average daily traffic (ADT), collectively contributed 61.45 ± 2.35% to crack development, with VL (19.7%) being the most influential factor. These findings identify excessive traffic loading and aging as the dominant drivers of crack propagation in beam bridges, providing valuable insights for targeted maintenance strategies and bridge management.
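The three evaluation metrics named in this abstract have standard definitions, which the following minimal sketch computes in plain Python. The function name `regression_metrics` and the sample crack-width values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE and MAE for paired observations and predictions."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)          # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
    }


# Hypothetical maximum crack widths in mm (illustrative values only).
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.37]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

RMSE penalises large errors more heavily than MAE, which is one reason the two are commonly reported together.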
A graph is called Rank-Ramsey if (i) its clique number is small, and (ii) the adjacency matrix of its complement has small rank. We initiate a systematic study of such graphs. Our main motivation is that their constructions, as well as proofs of their non-existence, are intimately related to the famous log-rank conjecture from the field of communication complexity. These investigations also open interesting new avenues in Ramsey theory. We construct two families of Rank-Ramsey graphs exhibiting polynomial separation between order and complement rank. Graphs in the first family have bounded clique number (as low as $41$). These are subgraphs of certain strong products, whose building blocks are derived from triangle-free strongly-regular graphs. Graphs in the second family are obtained by applying Boolean functions to Erdős–Rényi graphs. Their clique number is logarithmic, but their complement rank is far smaller than in the first family, about $\mathcal{O}(n^{2/3})$. A key component of this construction is our matrix-theoretic view of lifts. We also consider lower bounds on the Rank-Ramsey numbers, and determine them in the range where the complement rank is $5$ or less. We consider connections between said numbers and other graph parameters, and find that the two best known explicit constructions of triangle-free Ramsey graphs turn out to be far from Rank-Ramsey.
This study investigates a hybrid variable annuity (VA) contract that combines guaranteed minimum accumulation benefit (GMAB) and guaranteed minimum death benefit (GMDB) riders, with the added flexibility for policyholders to surrender prior to maturity. The contract guarantees the return of premiums or a greater rolled-up value at either maturity or death. We propose a novel two-account structure: an investment account tied to the underlying fund, from which management fees are deducted, and a separate cash account for the deduction of insurance fees for funding the GMAB and GMDB riders. This design generalizes the conventional single-account model by decoupling fee sources. From the policyholder’s perspective, we derive actuarially fair insurance charges and show that the two-account framework delivers substantially lower guarantee fees than the classical design and improves contract characteristics by reducing the effective moneyness of embedded guarantees, thereby discouraging early surrenders and mitigating mortality risk mispricing. Furthermore, we show that bundling survival and death benefits results in higher fair fees than the sum of standalone riders, thereby enhancing the product’s appeal to insurers. The analysis also incorporates taxation considerations up to a predefined preservation age, reflecting regulatory and practical product design constraints.
Fluid queues governed by birth–death processes have been used to analyze the buffer dynamics and stability behavior of fluid flow systems. However, most existing studies focus on classical single-ended queues, often ignore double-ended queue flow dynamics, or rely heavily on simulation-based approaches. Specifically, the study of fluid flow systems modulated by double-ended queues subject to catastrophic failures and subsequent repair processes is challenging and has not yet received attention in the literature. Even when such systems are considered, explicit closed-form analytic expressions for equilibrium buffer content distributions and related performance measures are rarely available. To overcome these limitations, this article investigates a fluid flow system regulated by a double-ended queue and exposed to catastrophic failures with subsequent repair processes. Such a driven queue can be equivalently represented as a one-dimensional bilateral birth–death process, namely a continuous-time randomized random walk on the integers with catastrophic failures and repairs. The stability condition for the fluid occupancy in the credit buffer is rigorously established, and explicit closed-form analytical expressions for both the probability density function and the cumulative distribution function of the buffer content in the equilibrium regime are determined. These analytic results provide deeper insight into the steady-state behavior of the system and enable the derivation of several vital performance measures of practical interest. Furthermore, graphical illustrations are presented to highlight the influence of the system parameters on the performance descriptors of the fluid content, thereby enhancing the interpretability and applicability of the proposed fluid queueing system.
Cost-of-capital valuation is a well-established approach to the valuation of liabilities and is one of the cornerstones of current regulatory frameworks for the insurance industry. Standard cost-of-capital considerations typically rely on the assumption that the required buffer capital is held in risk-less one-year bonds. The aim of this work is to analyze the effects of allowing investments of the buffer capital in risky assets, for example, in a combination of stocks and bonds. In particular, we make precise how the decomposition of the buffer capital into contributions from policyholders and investors varies as the degree of riskiness of the investment increases and highlight the role of limited liability in the case of heavy-tailed insurance risks. With a focus on nonlife insurance, we present a combination of general theoretical results, explicit results for certain stochastic models, and numerical results that emphasize the key findings.
This comprehensive modern look at regression covers a wide range of topics and relevant contemporary applications, going well beyond the topics covered in most introductory books. With concision and clarity, the authors present linear regression, nonparametric regression, classification, logistic and Poisson regression, high-dimensional regression, quantile regression, conformal prediction and causal inference. There are also brief introductions to neural nets, deep learning, random effects, survival analysis, graphical models and time series. Suitable for advanced undergraduate and beginning graduate students, the book will also serve as a useful reference for researchers and practitioners in data science, machine learning, and artificial intelligence who want to understand modern methods for data analysis.
In this paper, we consider the semi-parametric errors-in-variables model whose errors are asymptotically negatively associated (or ρ−, for short) random variables. Using wavelet smoothing and the least squares method, we obtain the estimators $\widehat{\beta}_{n}$, $\widehat{g}_{n}(t)$, and $\widehat{\sigma}_{n}^{2}$ of the parametric component, the nonparametric component, and the error variance, respectively. Under some general assumptions, we also establish some results on the strong consistency of the estimators. Furthermore, simulations are conducted to assess the finite sample behavior of the estimators and confirm the validity of the theoretical results.
Thin-walled truncated conical shells subjected to axial compression are extremely susceptible to buckling, with experimentally observed buckling loads often falling well below classical theoretical predictions. The ratio of the experimentally measured critical load to its theoretical counterpart is defined as the Knockdown Factor (KDF). Although design guidelines proposed by agencies such as NASA provide conservative estimates of KDFs to ensure safety, recent research has highlighted the need to revisit and refine these provisions due to their excessive conservatism. In this context, the present study compares robust machine learning (ML) models for predicting buckling loads, or equivalently KDFs, of truncated conical shells using Artificial Neural Network (ANN), Support Vector Regression (SVR), Random Forest Regression (RFR) and Histogram Gradient Boosting (HGB). These models are able to capture the strongly nonlinear and complex feature interactions that are inherent in buckling phenomena. A comprehensive database compiled from existing literature and complemented with a set of simulated data is employed for model training and testing. To open a new direction in data-driven KDF prediction, a novel hybrid ML framework integrating Gaussian Process Regression (GPR) with Extreme Gradient Boosting (XGB), referred to as (GPR + XGB), is proposed. Additionally, a sensitivity analysis is performed to identify the most influential features governing the KDF predictions of truncated conical shells. The proposed hybrid framework, which leverages experimental as well as simulated data to predict buckling KDFs of truncated conical shells, achieves significantly improved accuracy over existing ML models and conservative design guidelines.
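The KDF definition given in this abstract is a simple ratio, sketched below in Python. The function name `knockdown_factor` and the load values are hypothetical and serve only to illustrate the arithmetic.

```python
def knockdown_factor(p_experimental: float, p_theoretical: float) -> float:
    """Ratio of the measured critical load to the classical theoretical load."""
    if p_theoretical <= 0:
        raise ValueError("theoretical buckling load must be positive")
    return p_experimental / p_theoretical


# Hypothetical loads in kN (illustrative values, not from the study).
measured_load = 45.0
classical_load = 100.0
kdf = knockdown_factor(measured_load, classical_load)
print(f"KDF = {kdf:.2f}")
```

A KDF well below 1 indicates that the shell buckled at a fraction of the classical prediction, which is why a predicted KDF is applied as a multiplier to the theoretical load in design.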
Pension fund populations often have mortality experiences that are substantially different from the national benchmark. In a motivating case study of Brazilian corporate pension funds, pensioners are observed to have mortality that is 40–55% below the national average, due to the underlying socioeconomic disparities. Direct analysis of a pension fund population is challenging due to very sparse data, with age-specific annual death counts often in low single digits. We design and study a collection of stochastic subpopulation frameworks that coherently capture and project pensioner mortality rates via deflator factors relative to a reference population. Superseding parametric approaches, we propose Gaussian process (GP)-based models that flexibly estimate age- and/or year-specific deflators. We demonstrate that the GP models achieve better goodness of fit and uncertainty quantification. Our models are illustrated on two Brazilian pension funds in the context of exogenous national mortality tables. The GP models are implemented in R Stan using a fully Bayesian approach and take into account over-dispersion relative to the Poisson likelihood.
Survivors are not a random sample of patients with disease; they are a biased sample. How they differ from non-survivors must be understood before survival can be attributed to a disease process or therapeutic intervention. Live-birth bias is a particular example: many conceptions fail before term birth, and this influences the live-birth population. The importance of collider bias is reviewed. Workers are generally healthier than those not working, which introduces bias into occupational health studies.
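Collider bias can be illustrated with a small simulation. The sketch below is a toy setup, not taken from the chapter: two independent standard-normal traits both raise the chance of being selected (for example, of surviving to be studied), and selection is modelled by the hypothetical rule `a + b > threshold`. Conditioning on selection induces a negative correlation that is absent in the full population.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_collider(n=50_000, threshold=1.0, seed=0):
    """Return (correlation in everyone, correlation among the selected)."""
    rng = random.Random(seed)
    # Two traits drawn independently, so their true correlation is zero.
    pairs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    # Selection depends on both traits: this makes selection a collider.
    selected = [(a, b) for a, b in pairs if a + b > threshold]
    r_all = pearson(*zip(*pairs))
    r_selected = pearson(*zip(*selected))
    return r_all, r_selected


r_all, r_selected = simulate_collider()
print(f"whole population: r = {r_all:+.3f}")
print(f"selected only:    r = {r_selected:+.3f}")
```

Among the selected, a low value of one trait must be offset by a high value of the other, so the traits appear inversely related even though neither influences the other.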
Confirmation bias, the propensity to interpret data according to prior beliefs, is one of the most insidious forms of bias in research; old and modern examples are offered. Misinterpretation of study results is commonplace in the courtroom, often described under the rubric of “junk science.” The association of a rare exposure with a rare outcome is increasingly the focus of biomedical research, which increases the opportunity for bias to influence study results. Absolute rather than relative risks are an important means of interpreting rare study data. Reverse causality is a profound source of error: is the disease responsible for increasing exposure to the putative risk factor? Various biases are linked with time: in the context of public health screening, there are lead time bias and length time bias, and for survival studies, immortal time bias. Stein’s paradox offers a caution that the results of a larger sample may actually be more predictive of the subgroup experience within that sample than the study result observed for that subgroup.
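The contrast between absolute and relative risk for rare outcomes can be made concrete with a short calculation. The sketch below uses hypothetical counts (4 versus 2 cases per 100,000 people) and a hypothetical function name `risk_summary`; neither comes from the chapter.

```python
def risk_summary(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Return relative risk and absolute risk difference for a 2x2 cohort table."""
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    if risk_unexposed == 0:
        raise ValueError("relative risk is undefined when the unexposed risk is zero")
    return {
        "relative_risk": risk_exposed / risk_unexposed,
        "absolute_risk_difference": risk_exposed - risk_unexposed,
    }


# Hypothetical rare outcome: 4 cases per 100,000 exposed, 2 per 100,000 unexposed.
summary = risk_summary(4, 100_000, 2, 100_000)
extra_per_100k = summary["absolute_risk_difference"] * 100_000
print(f"relative risk: {summary['relative_risk']:.1f}")
print(f"extra cases per 100,000 exposed: {extra_per_100k:.1f}")
```

With these counts the risk doubles in relative terms, yet the absolute excess is only two additional cases per 100,000 people exposed, which is the perspective the relative figure alone conceals.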