Search results for Statistics and Probability

Analysis of influencing factors of concurrent primary liver cancer in hepatitis B patients and construction of column chart prediction model
Qunmei Cao, Yilin Zhou, Changlong Wen, Qinglan Li
Journal: Epidemiology & Infection / Volume 153 / 2025
Published online by Cambridge University Press: 16 September 2025, e111
- Article
- Open access
A predictive column chart was developed to assess the risk of primary liver cancer (PLC) in hepatitis B patients. Data from 107 PLC patients and 107 controls were used as the training set, with 92 patients as the validation set. An additional 446 patients from other hospitals, including 15 with PLC, formed the external validation group. Multivariate logistic regression identified gender, BMI, alcohol consumption, diabetes, family history of liver cancer, cirrhosis, and HBV DNA load as independent risk factors. The model showed strong discrimination with AUCs of 0.882 and 0.859 in the training and validation sets, respectively, and good calibration (Hosmer–Lemeshow χ² = 2.648, P = 0.954; χ² = 4.117, P = 0.846). Decision curve analysis (DCA) confirmed clinical benefit within a risk threshold of 0.07–0.95. In the external validation group, the model maintained discrimination (AUC = 0.863) and calibration (Hosmer–Lemeshow χ² = 7.999, P = 0.434), with DCA showing net benefit across 0.14–0.95. These results indicate the column chart is a reliable tool for PLC risk prediction in hepatitis B patients.

Modeling discrete common-shock risks through matrix distributions
Martin Bladt, Eric C. K. Cheung, Oscar Peralta, Jae-Kyung Woo
Journal: ASTIN Bulletin: The Journal of the IAA, First View
Published online by Cambridge University Press: 16 September 2025, pp. 1-26
- Article
We introduce a novel class of bivariate common-shock discrete phase-type (CDPH) distributions to describe dependencies in loss modeling, with an emphasis on those induced by common shocks. By constructing two jointly evolving terminating Markov chains that share a common evolution up to a random time corresponding to the common shock component, and then proceed independently, we capture the essential features of risk events influenced by shared and individual-specific factors. We derive explicit expressions for the joint distribution of the termination times and prove various class and distributional properties, facilitating tractable analysis of the risks. Extending this framework, we model random sums where aggregate claims are sums of continuous phase-type random variables with counts determined by these termination times and show that their joint distribution belongs to the multivariate phase-type or matrix-exponential class. We develop estimation procedures for the CDPH distributions using the expectation-maximization algorithm and demonstrate the applicability of our models through simulation studies and an application to bivariate insurance claim frequency data. In particular, the distribution of the latent common shock component present in correlated count data can be estimated as well.

Building Information Graphs (BIGs): remodeling building information for learning and applications
Zijian Wang, Rafael Sacks
Journal: Data-Centric Engineering / Volume 6 / 2025
Published online by Cambridge University Press: 15 September 2025, e44
- Article
- Open access
Despite significant advances in Building Information Modeling (BIM) and increased adoption, numerous challenges remain. Discipline-specific BIM software tools with file storage have unresolved interoperability issues and do not capture or express interdisciplinary design intent. This hobbles machines’ ability to process design information. The lack of suitable data representation hinders the application of machine learning and other data-centric applications in building design. We propose Building Information Graphs (BIGs) as an alternative modeling method. In BIGs, discipline-specific design models are compiled as subgraphs in which nodes and edges model objects and their relationships. Additional nodes and edges in a meta-graph link the building objects across subgraphs. Capturing both intradisciplinary and interdisciplinary relationships, BIGs provide a dimension of contextual data for capturing design intent and constraints. BIGs are designed for computation and applications. The explicit relationships enable advanced graph functionalities, such as across-domain change propagation and object-level version control. BIGs preserve multimodal design data (geometry, attributes, and topology) in a graph structure that can be embedded into high-dimensional vectors, in which learning algorithms can detect statistical patterns and support a wide range of downstream tasks, such as link prediction and graph generation. In this position article, we highlight three key challenges: encapsulating and formalizing object relationships, particularly design intent and constraints; designing graph learning techniques; and developing innovative domain applications that leverage graph structures and learning. BIGs represent a paradigm shift in design technologies that bridge artificial intelligence and building design to enable intelligent and generative design tools for architects, engineers, and contractors.

Uncovering policy priorities for disability inclusion: NLP and LLM approaches to analyzing CRPD state reports
Derrick Cogburn, Theodore Ochieng, Keiko Shikako, Juliana Woods, Mina Aydin
Journal: Data & Policy / Volume 7 / 2025
Published online by Cambridge University Press: 15 September 2025, e61
- Article
- Open access
Over 193 countries have signed at least one of more than 500 multilateral treaties addressing critical global issues, such as human rights, environmental protection, and trade. Ratifying a treaty obligates a country, as a “State Party,” to report to the United Nations on its progress toward implementing the treaty’s provisions. These reports and their associated review processes generate a wealth of textual data. Effectively monitoring, reviewing, and assessing national, regional, and global progress toward these treaty commitments is crucial for ensuring compliance and realizing the benefits of international cooperation. The UN Convention on the Rights of Persons with Disabilities (CRPD), which has been ratified by 191 countries, exemplifies this challenge. With over 1.3 billion people worldwide living with disabilities, the CRPD aims to promote a shift from a charity-based “medical model” that views disability as an individual deficiency, to a rights-based “social justice model” that emphasizes societal barriers and inclusivity. Each State Party submits periodic reports to the Committee on the Rights of Persons with Disabilities detailing its implementation efforts. This study analyzed all available CRPD State Reports (N = 170) using text mining, Natural Language Processing, and generative AI tools to assess global progress, identify regional variations, and explore the factors influencing successful implementation. The findings reveal evidence of widespread CRPD implementation, growing support for social justice and economic inclusion, and the importance of civil society engagement. The hybrid data analysis approach of this study offers a promising framework for harnessing the power of textual data to advance the realization of treaty commitments worldwide.

Using graph neural networks and frequency domain data for automated operational modal analysis of populations of structures
Xudong Jian, Yutong Xia, Gregory Duthé, Kiran Bacsa, Wei Liu, Eleni Chatzi
Journal: Data-Centric Engineering / Volume 6 / 2025
Published online by Cambridge University Press: 15 September 2025, e45
- Article
- Open access
The population-based structural health monitoring (PBSHM) paradigm has recently emerged as a promising approach to enhance data-driven assessment of engineering structures by facilitating transfer learning between structures with some degree of similarity. In this work, we apply this concept to the automated modal identification of structural systems. We introduce a graph neural network (GNN)-based deep learning scheme to identify modal properties, including natural frequencies, damping ratios, and mode shapes of engineering structures based on the power spectral density of spatially sparse vibration measurements. Systematic numerical experiments are conducted to evaluate the proposed model, employing two distinct truss populations that possess similar topological characteristics but varying geometric (size and shape) and material (stiffness) properties. The results demonstrate that, once trained, the proposed GNN-based model can identify modal properties of unseen structures within the same structural population with good efficiency and acceptable accuracy, even in the presence of measurement noise and sparse measurement locations. The GNN-based model exhibits advantages over the classic frequency domain decomposition method in terms of identification speed, as well as against an alternate multilayer perceptron architecture in terms of identification accuracy, rendering this a promising tool for PBSHM purposes.

The structure of identity facilitation and interference
Part of
- Network Approaches to Attitudes and Beliefs
Maria C. Ramos
Journal: Network Science / Volume 13 / 2025
Published online by Cambridge University Press: 15 September 2025, e12
- Article
- Open access
This article investigates global patterns of facilitation and interference among identities—socially recognizable categories that shape individuals’ sense of who they are and carry cultural expectations (e.g., mother, worker). While identity theory suggests that identities interact in structured ways, existing research often examines identities in isolation or conventional roles, limiting the ability to observe broader patterns. This study adopts a relational approach to explore how identities facilitate or interfere with each other. By drawing on sociological identity theory, I formulate hypotheses about these interactions. Using original survey data, I construct identity networks where nodes represent identities and ties indicate the prevalence of facilitation or interference. Blockmodeling techniques are then employed to characterize the global structure of these networks. The findings reveal distinct positions within the network, largely aligning with theoretical expectations.

Industrial mobile robot-based manufacturing system modeling potential
Miklós Boleraczki, István Gábor Gyurika
Journal: Data-Centric Engineering / Volume 6 / 2025
Published online by Cambridge University Press: 15 September 2025, e46
- Article
- Open access
Industrial mobile robots as service units will be increasingly used in the future in factories with Industry 4.0 production cells in an island-like manner. The differences between the mobile robots available on the market make it necessary to help the optimal selection and use of these robots. In this article, we present a concept that focuses on the mobile robot as a way to investigate the manufacturing system. This approach will help to find the optimal solution when selecting robots. With the parameters that can be included, the robot can be characterized in the manufacturing system environment, making it much easier to express and compute capacity, performance, and efficiency characteristics compared to previous models. In this article, we also present a case study based on the outlined method, which investigates the robot utilization as a function of battery capacity and the number of packages to be transported.

Dynamics on Graphs

2nd edition
Rick Durrett
Coming soon
Expected online publication date: September 2025
- Book
This extensive revision of the 2007 book 'Random Graph Dynamics,' covering the current state of mathematical research in the field, is ideal for researchers and graduate students. It considers a small number of types of graphs, primarily the configuration model and inhomogeneous random graphs. However, it investigates a wide variety of dynamics. The author describes results for the convergence to equilibrium for random walks on random graphs as well as topics that have emerged as mature research areas since the publication of the first edition, such as epidemics, the contact process, voter models, and coalescing random walk. Chapter 8 discusses a new challenging and largely uncharted direction: systems in which the graph and the states of their vertices coevolve.

An explicit economical additive basis
Part of
- Sequences and sets
Vishesh Jain, Huy Tuan Pham, Mehtaab Sawhney, Dmitrii Zakharov
Journal: Combinatorics, Probability and Computing, First View
Published online by Cambridge University Press: 12 September 2025, pp. 1-6
- Article
- Open access
We present an explicit subset $A\subseteq \mathbb{N} = \{0,1,\ldots \}$ such that $A + A = \mathbb{N}$ and for all $\varepsilon > 0$,
\begin{equation*}\lim _{N\to \infty }\frac {\big |\big \{(n_1,n_2): n_1 + n_2 = N, (n_1,n_2)\in A^2\big \}\big |}{N^{\varepsilon }} = 0.\end{equation*}
This answers a question of Erdős.

Response to letter to editor from Rattanapitoon, N. & Rattanapitoon, S
Sara R. Healy, Martha Betson, Joaquin M. Prada, Eric R. Morgan
Journal: Epidemiology & Infection / Volume 153 / 2025
Published online by Cambridge University Press: 12 September 2025, e100
- Article
- Open access

Cyber breach risk modeling for insurance: capturing temporal and cross-group dependence
Yijia Li, Xuanhe Wang, Peng Zhao, Taizhong Hu
Journal: Annals of Actuarial Science, First View
Published online by Cambridge University Press: 12 September 2025, pp. 1-25
- Article
- Open access
Cyber breaches pose a significant threat to both enterprises and society. Analyzing cyber breach data is essential for improving cyber risk management and developing effective cyber insurance policies. However, modeling cyber risk is challenging due to its inherent characteristics, including sparsity, heterogeneity, heavy tails, and dependence. This work introduces a cluster-based dependence model that captures both temporal and cross-group dependencies, providing a more accurate representation of multivariate cyber breach risks. The proposed framework employs a cluster-based kernel approach to model breach severity, effectively handling heterogeneity and extreme values, while a copula-based method is used to capture multivariate dependence. Our findings, validated through both empirical and synthetic studies, demonstrate that the proposed model effectively captures the statistical characteristics of multivariate cyber breach risks and outperforms commonly used models in predictive accuracy. Furthermore, we show that our approach can enhance cyber insurance pricing by generating more profitable insurance contracts.

Letter to the Editor: “Modelling the risk of foodborne transmission of Toxocara spp. to humans” by Healy et al. (2025)
Nathkapach Kaewpitoon Rattanapitoon, Schawanya Kaewpitoon Rattanapitoon
Journal: Epidemiology & Infection / Volume 153 / 2025
Published online by Cambridge University Press: 12 September 2025, e101
- Article
- Open access

Modeling the influences of climate conditions on measles transmission in China
Peihua Wang, Jianjiu Chen, Wenyi Zhang, Yong Wang, Wan Yang
Journal: Epidemiology & Infection / Volume 153 / 2025
Published online by Cambridge University Press: 11 September 2025, e110
- Article
- Open access
Climate conditions are known to modulate infectious disease transmission, yet their impact on measles transmission remains underexplored. In this study, we investigate the extent to which climate conditions modulate measles transmission, utilizing measles incidence data during 2005–2008 from China. Three climate-forced models were employed: a sinusoidal function, an absolute humidity (AH)-forced model, and an AH and temperature (AH/T)-forced model. These models were integrated into an inference framework consisting of a susceptible–exposed–infectious–recovered (SEIR) model and an iterated filter (IF2) to estimate epidemiological characteristics and assess climate influences on measles transmission. During the study period, measles epidemics peaked in spring in northern China and were more diverse in the south. Our analyses showed that the AH/T model better captured measles epidemic dynamics in northern China, suggesting a combined impact of humidity and temperature on measles transmission. Furthermore, we preliminarily examined the impact of other factors and found that population susceptibility and incidence rate were both positively correlated with migrant worker influx, suggesting that higher susceptibility among migrant workers may sustain measles transmission. Taken together, our study supports a role of humidity and temperature in modulating measles transmission and identifies additional factors in shaping measles epidemic dynamics in China.

Conditioning Bienaymé–Galton–Watson trees to have large sub-populations
Part of
- Probability theory on algebraic and topological structures
- Markov processes
Romain Abraham, Hongwei Bi, Jean-François Delmas
Journal: Advances in Applied Probability, First View
Published online by Cambridge University Press: 10 September 2025, pp. 1-42
- Article
We study the local limit in distribution of Bienaymé–Galton–Watson trees conditioned on having large sub-populations. Assuming a generic and aperiodic condition on the offspring distribution, we prove the existence of a limit given by a Kesten’s tree associated with a certain critical offspring distribution.

Quasi-stationary distributions for subcritical branching Markov chains
Part of
- Markov processes
- Limit theorems
Wenming Hong, Dan Yao
Journal: Advances in Applied Probability, First View
Published online by Cambridge University Press: 10 September 2025, pp. 1-47
- Article
Consider a subcritical branching Markov chain. Let $Z_n$ denote the counting measure of particles of generation $n$. Under some conditions, we give a probabilistic proof for the existence of the Yaglom limit of $(Z_n)_{n\in\mathbb{N}}$ by the moment method, based on the spinal decomposition and the many-to-few formula. As a result, we give explicit integral representations of all quasi-stationary distributions of $(Z_n)_{n\in\mathbb{N}}$, whose proofs are direct and probabilistic, and do not rely on Martin boundary theory.

Large and moderate deviations in Poisson navigations
Partha Pratim Ghosh, Benedikt Jahnel, Sanjoy Kumar Jhawar
Journal: Advances in Applied Probability, First View
Published online by Cambridge University Press: 10 September 2025, pp. 1-38
- Article
- Open access
We derive large- and moderate-deviation results in random networks given as planar directed navigations on homogeneous Poisson point processes. In this non-Markovian routing scheme, starting from the origin, at each consecutive step a Poisson point is joined by an edge to its nearest Poisson point to the right within a cone. We establish precise exponential rates of decay for the probability that the vertical displacement of the random path is unexpectedly large. The proofs rest on controlling the dependencies of the individual steps and the randomness in the horizontal displacement as well as renewal-process arguments.

Well-posedness and averaging principle for non-Gaussian McKean–Vlasov stochastic differential equations with locally Lipschitz coefficients
Ying Chao, Jinqiao Duan, Ting Gao, Pingyuan Wei
Journal: Advances in Applied Probability, First View
Published online by Cambridge University Press: 09 September 2025, pp. 1-44
- Article
In this paper, we investigate a class of McKean–Vlasov stochastic differential equations (SDEs) with Lévy-type perturbations. We first establish the existence and uniqueness theorem for the solutions of the McKean–Vlasov SDEs by utilizing an Euler-like approximation. Then, under suitable conditions, we demonstrate that the solutions of the McKean–Vlasov SDEs can be approximated by the solutions of the associated averaged McKean–Vlasov SDEs in the sense of mean square convergence. In contrast to existing work, a novel feature of this study is the use of a much weaker condition, locally Lipschitz continuity in the state variables, allowing for possibly superlinearly growing drift, while maintaining linearly growing diffusion and jump coefficients. Therefore, our results apply to a broader class of McKean–Vlasov SDEs.

Asymptotic mixed normality of maximum-likelihood estimator for Ewens–Pitman partition
Takuya Koriyama, Takeru Matsuda, Fumiyasu Komaki
Journal: Advances in Applied Probability, First View
Published online by Cambridge University Press: 09 September 2025, pp. 1-21
- Article
- Open access
This paper investigates the asymptotic properties of parameter estimation for the Ewens–Pitman partition with parameters $0<\alpha<1$ and $\theta>-\alpha$. Specifically, we show that the maximum-likelihood estimator (MLE) of $\alpha$ is $n^{\alpha/2}$-consistent and converges to a variance mixture of normal distributions, where the variance is governed by the Mittag-Leffler distribution. Moreover, we show that a proper normalization involving a random statistic eliminates the randomness in the variance. Building on this result, we construct an approximate confidence interval for $\alpha$. Our proof relies on a stable martingale central limit theorem, which is of independent interest.

The distance on the slightly supercritical random series–parallel graph
Xinxing Chen, Bernard Derrida, Thomas Duquesne, Zhan Shi
Journal: Advances in Applied Probability, First View
Published online by Cambridge University Press: 09 September 2025, pp. 1-42
- Article
- Open access
We consider the random series–parallel graph introduced by Hambly and Jordan (2004 Adv. Appl. Probab. 36, 824–838), which is a hierarchical graph with a parameter $p\in [0, \, 1]$. The graph is built recursively: at each step, every edge in the graph is either replaced with probability p by a series of two edges, or with probability $1-p$ by two parallel edges, and the replacements are independent of each other and of everything up to then. At the nth step of the recursive procedure, the distance between the extremal points on the graph is denoted by $D_n (p)$. It is known that $D_n(p)$ possesses a phase transition at $p=p_c \;:\!=\;\frac{1}{2}$; more precisely, $\frac{1}{n}\log {{\mathbb{E}}}[D_n(p)] \to \alpha(p)$ when $n \to \infty$, with $\alpha(p) >0$ for $p>p_c$ and $\alpha(p)=0$ for $p\le p_c$. We study the exponent $\alpha(p)$ in the slightly supercritical regime $p=p_c+\varepsilon$. Our main result says that as $\varepsilon\to 0^+$, $\alpha(p_c+\varepsilon)$ behaves like $\sqrt{\zeta(2) \, \varepsilon}$, where $\zeta(2) \;:\!=\; \frac{\pi^2}{6}$.
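The recursive construction in this abstract can be explored numerically. The sketch below is not taken from the article; it assumes the standard distributional reading of the recursion, namely that $D_0 = 1$ and that $D_{n+1}$ is the sum of two independent copies of $D_n$ with probability $p$ (series) or their minimum with probability $1-p$ (parallel). The function names `sample_distance` and `estimate_alpha` are hypothetical, and the Monte Carlo estimate of $\frac{1}{n}\log \mathbb{E}[D_n(p)]$ is only a rough finite-$n$ proxy for $\alpha(p)$.

```python
import math
import random


def sample_distance(n: int, p: float, rng: random.Random) -> int:
    """Draw one sample of D_n(p) under the assumed recursion.

    Assumption (not stated as code in the source): D_0 = 1, and D_{n+1} is
    the sum of two independent copies of D_n with probability p (series)
    or their minimum with probability 1 - p (parallel).
    The cost is about 2**n recursive calls, so keep n small.
    """
    if n == 0:
        return 1  # a single edge joins the two extremal points
    left = sample_distance(n - 1, p, rng)
    right = sample_distance(n - 1, p, rng)
    if rng.random() < p:
        return left + right  # series: the two distances add up
    return min(left, right)  # parallel: the shorter branch wins


def estimate_alpha(n: int, p: float, samples: int, seed: int = 0) -> float:
    """Monte Carlo estimate of (1/n) * log E[D_n(p)] for a fixed n >= 1."""
    rng = random.Random(seed)
    mean = sum(sample_distance(n, p, rng) for _ in range(samples)) / samples
    return math.log(mean) / n
```

For example, `estimate_alpha(10, 0.6, 200)` gives a finite-$n$ estimate in the supercritical regime. The extreme cases are deterministic under this recursion: $p = 1$ yields $D_n = 2^n$ and $p = 0$ yields $D_n = 1$.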

On orderings of vectors of order statistics and sample ranges from heterogeneous bivariate Pareto variables
Mostafa Sattari, Narayanaswamy Balakrishnan
Journal: Probability in the Engineering and Informational Sciences, First View
Published online by Cambridge University Press: 09 September 2025, pp. 1-16
- Article
- Open access
In this paper, we study ordering properties of vectors of order statistics and sample ranges arising from bivariate Pareto random variables. Assume that $(X_1,X_2)\sim\mathcal{BP}(\alpha,\lambda_1,\lambda_2)$ and $(Y_1,Y_2)\sim\mathcal{BP}(\alpha,\mu_1,\mu_2).$ We then show that $(\lambda_1,\lambda_2)\stackrel{m}{\succ}(\mu_1,\mu_2)$ implies $(X_{1:2},X_{2:2})\ge_{st}(Y_{1:2},Y_{2:2}).$ Under bivariate Pareto distributions, we prove that the reciprocal majorization order between the two vectors of parameters is equivalent to the hazard rate and usual stochastic orders between sample ranges. We also show that the weak majorization order between two vectors of parameters is equivalent to the likelihood ratio and reversed hazard rate orders between sample ranges.

52341 results in Statistics and Probability