The Latent Position Model (LPM) is a popular approach for the statistical analysis of network data. A central aspect of this model is that it assigns nodes to random positions in a latent space, such that the probability of an interaction between each pair of individuals or nodes is determined by their distance in this latent space. A key feature of this model is that it allows one to visualize nuanced structures via the latent space representation. The LPM can be further extended to the Latent Position Cluster Model (LPCM), to accommodate the clustering of nodes by assuming that the latent positions are distributed following a finite mixture distribution. In this paper, we extend the LPCM to accommodate missing network data and apply this to non-negative discrete weighted social networks. By treating missing data as “unusual” zero interactions, we propose a combination of the LPCM with the zero-inflated Poisson distribution. Statistical inference is based on a novel partially collapsed Markov chain Monte Carlo algorithm, where a Mixture-of-Finite-Mixtures (MFM) model is adopted to automatically determine the number of clusters and optimal group partitioning. Our algorithm features a truncated absorb-eject move, which is a novel adaptation of an idea commonly used in collapsed samplers, within the context of MFMs. Another aspect of our work is that we illustrate our results on 3-dimensional latent spaces, maintaining clear visualizations while achieving more flexibility than 2-dimensional models. The performance of this approach is illustrated via three carefully designed simulation studies, as well as four different publicly available real networks, where some interesting new perspectives are uncovered.
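The zero-inflated Poisson edge mechanism described above can be made concrete with a short simulation; this is only an illustrative sketch, not the authors' MCMC implementation, and the mixture centres, intercept `beta`, and zero-inflation probability `p_zero` are assumed values for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d = 30, 3            # nodes, latent dimension (3-D, as in the paper)
p_zero = 0.3            # probability of an "unusual" zero (e.g. missing data)
beta = 2.0              # intercept on the log-rate scale (assumed)

# Latent positions from a simple two-component Gaussian mixture
labels = rng.integers(0, 2, size=n)
centers = np.array([[-1.5, 0.0, 0.0], [1.5, 0.0, 0.0]])
z = centers[labels] + 0.5 * rng.standard_normal((n, d))

# Zero-inflated Poisson edge weights: the Poisson rate decays with latent distance
w = np.zeros((n, n), dtype=int)
for i in range(n):
    for j in range(i + 1, n):
        dist = np.linalg.norm(z[i] - z[j])
        lam = np.exp(beta - dist)        # rate determined by latent distance
        if rng.random() < p_zero:
            w[i, j] = 0                  # structural zero from the inflation component
        else:
            w[i, j] = rng.poisson(lam)   # ordinary Poisson count
        w[j, i] = w[i, j]
```

The resulting `w` is a symmetric matrix of non-negative integer weights whose excess zeros play the role of missing or "unusual" zero interactions.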
The generalized Gompertz distribution—an extension of the standard Gompertz distribution as well as the exponential distribution and the generalized exponential distribution—offers more flexibility in modeling survival or failure times as it introduces an additional parameter, which can account for different shapes of hazard functions. This enhances its applicability in various fields such as actuarial science, reliability engineering and survival analysis, where more complex survival models are needed to accurately capture the underlying processes. The effect of heterogeneity has generated increased interest in recent times. In this article, multivariate chain majorization methods are exploited to develop stochastic ordering results for extreme-order statistics arising from independent heterogeneous generalized Gompertz random variables with increased degree of heterogeneity.
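For reference, a commonly used parameterization of the generalized Gompertz distribution (the article's exact parameterization may differ) has cumulative distribution function

```latex
F(x;\lambda,c,\theta) \;=\; \Bigl(1-\exp\Bigl[-\tfrac{\lambda}{c}\bigl(e^{cx}-1\bigr)\Bigr]\Bigr)^{\theta},
\qquad x>0,\quad \lambda,c,\theta>0,
```

so that $\theta=1$ recovers the standard Gompertz distribution, while letting $c\to 0$ yields the generalized exponential distribution $\bigl(1-e^{-\lambda x}\bigr)^{\theta}$, and both together give the exponential distribution; this is the nesting structure the abstract refers to.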
The dynamics of information diffusion on social media platforms vary significantly between individual communities and the broader population. This study explores and compares the differences between community-based interventions and population-wide approaches in adjusting the spread of information. We first examine the temporal dynamics of social media groups, assessing their behavior through metrics such as time-dependent posts and retweets. Using functional data analysis, we investigate Twitter activities related to incidents such as the Skripal/Novichok case. We present three ways to quantify disparities between communities and uncover the strategies used by each group to promote specific narratives. We then compare the impact of targeted, community-based interventions with that of broader, population-wide responses in shaping the diffusion of information. Through this analysis, we identify key differences in how communities engage with and amplify information, revealing distinct patterns in the diffusion process. Our findings provide a comparative framework for understanding the relative consequences of different intervention strategies, offering insights into how targeted and broad approaches influence public discourse across social media platforms.
As data are becoming increasingly important resources for municipal administrations in the context of urban development, formalization of urban data governance (DG) is considered a prerequisite to systematic municipal data practice for the common good. Unlike for larger cities, it is unclear how common such formalized DG is in rural districts and small towns. We therefore mapped the status quo in small municipalities in Germany as a case exemplifying the broader phenomenon. We systematically searched online for policy documents on DG in all metropolitan regions, all rural districts, and a quota sample of nearly a sixth of all German small towns. We then performed content analysis of the identified documents along predefined categories of urban development. Results show that hardly any small towns possess relevant policy documents. Rural districts are somewhat more active in formally defining DG. The identified policy documents tend to address mostly economic activities, social infrastructure, and demography, whereas ‘housing’ and ‘urban design and public space’ are among the least mentioned categories of urban development.
Regular inspections of civil structures and infrastructure, performed by professional inspectors, are costly and demanding in terms of time and safety requirements. Additionally, the outcome of inspections can be subjective and inaccurate as they rely on the inspector’s expertise. To address these challenges, autonomous inspection systems offer a promising alternative. However, existing robotic inspection systems often lack adaptive positioning capabilities and integrated crack labelling, limiting detection accuracy and their contribution to long-term dataset improvement. This study introduces a fully autonomous framework that combines real-time crack detection with adaptive pose adjustment, automated recording and labelling of defects, and integration of RGB-D and LiDAR sensing for precise navigation. Damage detection is performed using YOLOv5, a widely used detection model, which analyzes the RGB image stream to detect cracks and generates labels for dataset creation. The robot autonomously adjusts its position based on confidence feedback from the detection algorithm, optimizing its vantage point for improved detection accuracy. Experimental inspections showed an average confidence gain of 18% (exceeding 20% for certain crack types), a reduction in size estimation error from 23.31% to 10.09%, and a decrease in the detection failure rate from 20% to 6.66%. While quantitative validation during field testing proved challenging due to dynamic environmental conditions, qualitative observations aligned with these trends, suggesting its potential to reduce manual intervention in inspections. Moreover, the system enables automated recording and labelling of detected cracks, contributing to the continuous improvement of machine learning models for structural health monitoring.
International travel is thought to be a major risk factor for developing gastrointestinal illness in England. Transmission is considered more likely in countries with lower food hygiene standards, poorer sanitation, and a lack of access to clean water. However, many studies are conducted within travel clinic settings, which may bias findings. Here, we present a case–control study of returning English travellers in the community, conducted with cases of gastrointestinal illness notified to UKHSA.
All cryptosporidiosis, giardiasis, non-typhoidal salmonellosis, and shigellosis cases notified to the UK Health Security Agency (UKHSA) between 01 July 2023 and 15 October 2023 were asked to complete an anonymous electronic questionnaire if they had travelled during their incubation period. Asymptomatic travellers were recruited as controls via a market research panel and asked to complete the same questionnaire. A destination water, hygiene, and sanitation score was derived from the WHO ‘Attributable fraction of diarrhoea to inadequate WASH’ dataset. Demographics, travel details, and exposures while travelling were compared by Pearson’s chi-squared test, and pathogen- and destination-specific multivariable analyses were performed using a forward stepwise approach.
A total of 653 cases and 483 controls were included. The odds of being a case were significantly higher when travelling to countries outside of the EU (OR 4.6, 95% CI 3.5–6.0; p < 0.001) and to countries with a high-risk WASH score (OR 6.6, 95% CI 4.9–9.1; p < 0.001), particularly Egypt, Mexico, Tunisia, and Turkey. For those travelling to a low-risk destination, eating undercooked meat or fish and swallowing water from environmental water sources were significantly associated with higher odds of illness by multivariable analysis (p < 0.05). At high-risk destinations, eating food on excursions, swallowing water from environmental sources, and eating food from hotel buffets were significantly associated with higher odds of being a case.
Travel to popular tourist destinations is a potentially under-recognized risk factor for acquiring gastrointestinal infections. Exposures at low-risk destinations were broadly similar to risk factors in the UK. Exposures in high-risk destinations highlighted potential risks associated with catered hotels and tourist excursions which should be explored further.
A well-known theorem of Nikiforov asserts that any graph with a positive $K_{r}$-density contains a logarithmic blowup of $K_r$. In this paper, we explore variants of Nikiforov’s result in the following form. Given $r,t\in \mathbb{N}$, when does a positive $K_{r}$-density imply the existence of a significantly larger (of almost linear size) blowup of $K_t$? Our results include:
• For an $n$-vertex ordered graph $G$ with no induced monotone path $P_{6}$, if its complement $\overline {G}$ has positive triangle density, then $\overline {G}$ contains a biclique of size $\Omega ({n \over {\log n}})$. This strengthens a recent result of Pach and Tomon. For general $k$, let $g(k)$ be the minimum $r\in \mathbb{N}$ such that for any $n$-vertex ordered graph $G$ with no induced monotone $P_{2k}$, if $\overline {G}$ has positive $K_r$-density, then $\overline {G}$ contains a biclique of size $\Omega ({n \over {\log n}})$. Using concentration of measure and the isodiametric inequality on high dimensional spheres, we provide constructions showing that, surprisingly, $g(k)$ grows quadratically. On the other hand, we relate the problem of upper bounding $g(k)$ to a certain Ramsey problem and determine $g(k)$ up to a factor of 2.
• Any incomparability graph with positive $K_{r}$-density contains a blowup of $K_r$ of size $\Omega ({n \over {\log n}}).$ This confirms a conjecture of Tomon in a stronger form. In doing so, we obtain a strong regularity type lemma for incomparability graphs with no large blowups of a clique, which is of independent interest. We also prove that any $r$-comparability graph with positive $K_{(2h-2)^{r}+1}$-density contains a blowup of $K_h$ of size $\Omega (n)$, where the constant $(2h-2)^{r}+1$ is optimal.
The ${n \over {\log n}}$ size of the blowups in all our results is optimal up to a constant factor.
For Markov chains and Markov processes exhibiting a form of stochastic monotonicity (higher states have higher transition probabilities in terms of stochastic dominance), stability and ergodicity results can be obtained with the use of order-theoretic mixing conditions. We complement these results by providing quantitative bounds on deviations between distributions. We also show that well-known total variation bounds can be recovered as a special case.
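The stochastic monotonicity condition above (each higher state's transition row stochastically dominates the rows of lower states) is straightforward to verify on a finite state space; the 3-state kernel below is a hypothetical illustration, not taken from the paper.

```python
import numpy as np

def is_stochastically_monotone(P):
    """Check that row i+1 stochastically dominates row i for every i,
    i.e. the tail sums satisfy P[i+1, k:] >= P[i, k:] for all k."""
    tails = np.cumsum(P[:, ::-1], axis=1)[:, ::-1]   # tail sums per row
    return bool(np.all(tails[1:] >= tails[:-1] - 1e-12))

# A monotone birth-death-style kernel on {0, 1, 2}
P = np.array([[0.6, 0.3, 0.1],
              [0.3, 0.4, 0.3],
              [0.1, 0.3, 0.6]])
print(is_stochastically_monotone(P))  # True: rows increase in stochastic order
```

Reversing the row order destroys the ordering, so `is_stochastically_monotone(P[::-1])` returns `False`.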
Social networks influence health outcomes, yet declining health can also reshape social ties. While prior research has focused on constrained settings, the impact of health on social networks in fully voluntary contexts remains underexplored. This study examines the reciprocal relationship between health and social networks in voluntary settings, assessing whether previously observed patterns persist. We analyzed three-wave longitudinal whole network data from two voluntary clubs (N = 102, mean age = 54 years) in North-Rhine Westphalia, Germany, using Stochastic Actor-Oriented Models to distinguish between selection and influence effects across self-rated, mental, and physical health measures. Our analyses suggest patterns diverging from those observed in more constrained settings. We found no evidence of peer influence on health across any measures. While self-rated health showed some evidence of selection effects, social avoidance was limited to individuals with poor physical health. Notably, we found no evidence of withdrawal; instead, individuals with poorer health were more likely to nominate others in the network, suggesting they actively sought social connections as a compensatory strategy. These findings challenge existing assumptions about health-based network dynamics, emphasizing the need to reconsider how social networks function in voluntary contexts. Future research should explore how the degree of setting constraints shapes health-related network dynamics.
In this paper, we solve an exit probability game between two players, each of whom controls a linear diffusion process. One player controls its process to minimize the probability that the difference of the processes reaches a low level before it reaches a high level, while the other player aims to maximize the probability. By solving the Bellman–Isaacs equations, we find the sub-value and sup-value functions of the game in explicit forms, which are twice continuously differentiable. The optimal plays associated with the sub-value and sup-value are also found explicitly.
West Nile virus (WNV) is a zoonotic mosquito-borne Flavivirus, with bird populations as reservoirs. Although often asymptomatic, infection in humans can cause febrile symptoms and, more rarely, severe neurological symptoms. Previous studies assessed environmental drivers of WNV infections, but most overlooked areas with potential WNV circulation despite no reported human cases, and mixed mechanisms affecting hosts with those affecting vectors. Our objective was to generate a WNV Bird Risk Index (BRI) mapping the potential of WNV circulation in bird communities across Europe. We first used a bird traits-based model to estimate WNV seroprevalence in European wild bird species and identify eco-ethological characteristics associated with it. This allowed us to build a map of the WNV BRI that showed a strong spatial heterogeneity across Europe. To validate this metric, using a Besag-York-Mollie 2 spatial model in a Bayesian framework, we showed a positive association between the BRI and the number of years with notified WNV human cases between 2016 and 2023, at the NUTS administrative region scale. To conclude, we provide a map quantifying the suitability for WNV to circulate in the bird reservoir. This makes it possible to target surveillance efforts in areas at risk for WNV zoonotic infections in the future.
We prove new results about comparing the efficiency of general state space Markov chain Monte Carlo algorithms that randomly select a possibly different reversible method at each step (previously known only for finite state spaces). We also provide new, simpler, more accessible proofs of key results, and analyse numerous examples. We provide a full proof of the formula for the asymptotic variance for real-valued functionals on $\varphi$-irreducible reversible Markov chains, first introduced by Kipnis and Varadhan (1986, Commun. Math. Phys. 104, 1–19). Given two Markov kernels P and Q with stationary measure $\pi$, we say that the Markov kernel P efficiency-dominates the Markov kernel Q if the asymptotic variance with respect to P is at most the asymptotic variance with respect to Q for every real-valued functional $f\in L^2(\pi)$. Assuming only a basic background in functional analysis, we prove that for two reversible Markov kernels P and Q, P efficiency-dominates Q if and only if the operator $\mathcal{Q}-\mathcal{P}$, where $\mathcal{P}$ is the operator on $L^2(\pi)$ that maps $f\mapsto\int f(y)P(\cdot,\mathrm{d}y)$ and similarly for $\mathcal{Q}$, is positive on $L^2(\pi)$, i.e. $\langle f,\left(\mathcal{Q}-\mathcal{P}\right)f\rangle\geq0$ for every $f\in L^2(\pi)$ (previous proofs for general state spaces use technical results from monotone operator function theory). We use this result to show that under mild conditions, sandwich variants of data augmentation algorithms efficiency-dominate the original algorithm. We also provide other easy-to-check sufficient conditions for efficiency dominance, some of which are generalized from the finite state space case. We also provide a proof based on that of Tierney (1998, Ann. Appl. Prob. 8, 1–9) that Peskun dominance is a sufficient condition for efficiency dominance for reversible kernels.
Using these results, we show that Markov kernels formed by random selection of other ‘component’ Markov kernels will always efficiency-dominate another Markov kernel formed in this way, as long as the component kernels of the former efficiency-dominate those of the latter. These results on the efficiency dominance of combining component kernels generalize the results on the efficiency dominance of combined chains introduced by Neal and Rosenthal (2024, J. Appl. Prob. 62, 188–208) from finite state spaces to general state spaces.
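On a finite state space, the positivity criterion $\langle f,(\mathcal{Q}-\mathcal{P})f\rangle\geq0$ reduces to a matrix positive semidefiniteness check, since detailed balance makes $\mathrm{diag}(\pi)(Q-P)$ symmetric. The sketch below illustrates this with two hypothetical $\pi$-reversible toy kernels; it is an illustration of the criterion, not code from the paper.

```python
import numpy as np

def efficiency_dominates(P, Q, pi, tol=1e-10):
    """For pi-reversible kernels P, Q: P efficiency-dominates Q iff
    Q - P is a positive operator on L^2(pi), i.e. diag(pi) @ (Q - P)
    (symmetric by detailed balance) is positive semidefinite."""
    M = np.diag(pi) @ (Q - P)
    M = 0.5 * (M + M.T)                  # clean up rounding asymmetry
    return bool(np.linalg.eigvalsh(M).min() >= -tol)

pi = np.array([0.5, 0.5])
P = np.array([[0.5, 0.5], [0.5, 0.5]])   # i.i.d. sampling from pi
Q = np.array([[0.9, 0.1], [0.1, 0.9]])   # "sticky" reversible kernel
print(efficiency_dominates(P, Q, pi))    # True: the i.i.d. kernel dominates
print(efficiency_dominates(Q, P, pi))    # False
```

As expected, the i.i.d. sampler (the best possible reversible kernel for $\pi$) efficiency-dominates the sticky kernel, but not conversely.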
This article studies the principal component analysis (PCA) estimation of weak factor models with sparse loadings. We uncover an intrinsic near-sparsity preservation property for the PCA estimators of loadings, which comes from the approximately (block) upper triangular structure of the rotation matrix. It suggests an asymmetric relationship among factors: the sparsity of the rotated loadings for a stronger factor can be contaminated by the loadings from weaker ones, but the sparsity of the rotated loadings of a weaker factor is almost unaffected by the loadings of stronger ones. Then, we propose a simple alternative to the existing penalized approaches to sparsify the loading estimators by screening out the small PCA loading estimators directly, and construct consistent estimators for factor strengths. The proposed estimators perform well in finite samples, as shown by a set of Monte Carlo simulations.
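The screening idea (zeroing out small PCA loading estimates rather than penalizing them) can be sketched on simulated data. The dimensions, factor strengths, PCA normalization, and the threshold $\sqrt{\log N/T}$ below are all illustrative assumptions, not the article's settings.

```python
import numpy as np

rng = np.random.default_rng(1)
T, N, r = 200, 100, 2

# Sparse loadings: factor 1 loads on the first 60 series (stronger),
# factor 2 on the first 20 only (weaker)
Lam = np.zeros((N, r))
Lam[:60, 0] = rng.normal(1.0, 0.2, 60)
Lam[:20, 1] = rng.normal(1.0, 0.2, 20)

F = rng.standard_normal((T, r))
X = F @ Lam.T + 0.5 * rng.standard_normal((T, N))

# PCA loadings: top-r eigenvectors of X'X / T, one common normalization
eigval, eigvec = np.linalg.eigh(X.T @ X / T)      # ascending order
Lam_hat = eigvec[:, ::-1][:, :r] * np.sqrt(eigval[::-1][:r])

# Screening: zero out small loading estimates instead of penalizing them
threshold = np.sqrt(np.log(N) / T)
Lam_sparse = np.where(np.abs(Lam_hat) > threshold, Lam_hat, 0.0)

# Estimated factor strength as the share of nonzero loadings per factor
strength = (Lam_sparse != 0).mean(axis=0)
```

The share of surviving loadings per column then serves as a simple proxy for the relative strength of each factor.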
Play of Chance and Purpose emphasizes learning probability, statistics, and stochasticity by developing intuition and fostering imagination as a pedagogical approach. This book is meant for undergraduate and graduate students of basic sciences, applied sciences, engineering, and social sciences as an introduction to fundamental as well as advanced topics. The text has evolved out of the author's experience of teaching courses on probability, statistics, and stochastic processes at both undergraduate and graduate levels in India and the United States. Readers will get an opportunity to work on several examples from real-life applications and pursue projects and case-study analyses as capstone exercises in each chapter. Many projects involve the development of visual simulations of complex stochastic processes. This will augment the learners' comprehension of the subject and consequently train them to apply their learnings to solve hitherto unseen problems in science and engineering.
In load-sharing reliability structures, system components typically have marginal lifetimes that are stochastically dependent. This study deals with load-sharing parallel systems of two components. We prove that the two marginal lifetimes are positively quadrant dependent when the component lifetimes have continuous probability distributions, and that this stochastic dependence is upgraded to total positivity of order 2 when the component lifetimes have an exponential distribution. In addition, we discuss how these findings shed light on related results for the load-sharing Ross model, the conditional residual lifetime, and the conditional inactivity time.
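For reference, positive quadrant dependence (PQD) of the marginal lifetimes $(T_1,T_2)$ means

```latex
\Pr(T_1\le t_1,\;T_2\le t_2)\;\ge\;\Pr(T_1\le t_1)\,\Pr(T_2\le t_2)
\qquad\text{for all }t_1,t_2\ge 0,
```

while total positivity of order 2 (TP2) of the joint density $f$ requires $f(x_1,y_1)\,f(x_2,y_2)\ge f(x_1,y_2)\,f(x_2,y_1)$ whenever $x_1\le x_2$ and $y_1\le y_2$. TP2 is the stronger notion and implies PQD, which is the sense in which the dependence is "upgraded" in the exponential case.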
In this paper, we study the joint distribution of the forward and backward recurrence times in a delayed renewal process, as well as their marginal distributions. We obtain several exact results and bounds for these quantities. Some of these bounds are “general,” in the sense that the bounds are valid for any arbitrary distributions of the interarrival times, and some are based on aging properties of the distributions of the interarrival times of the renewals. Finally, several numerical examples are presented to illustrate the results.
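The two quantities studied above can be simulated directly to make the definitions concrete; the sketch below uses hypothetical interarrival distributions (a mean-2 exponential first interarrival to make the process delayed, Exp(1) thereafter) and is not drawn from the paper.

```python
import numpy as np

def recurrence_times(t, draw_first, draw, rng):
    """Backward and forward recurrence times at time t of a (possibly
    delayed) renewal process with S_0 = 0 and renewal epochs
    S_k = X_1 + ... + X_k: returns (t - S_{N(t)}, S_{N(t)+1} - t)."""
    s = 0.0                     # last renewal epoch S_{N(t)} seen so far
    nxt = draw_first(rng)       # first interarrival (the "delay")
    while s + nxt <= t:
        s += nxt
        nxt = draw(rng)
    return t - s, s + nxt - t   # (backward, forward)

rng = np.random.default_rng(0)
first = lambda r: r.exponential(2.0)   # delayed: mean-2 first interarrival
later = lambda r: r.exponential(1.0)   # subsequent interarrivals: Exp(1)
back, fwd = recurrence_times(50.0, first, later, rng)
```

The backward time is the age of the current interarrival interval (at most $t$), and the forward time is its residual life (strictly positive).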
In a recent paper, Juodis and Reese (2022, Journal of Business & Economic Statistics, 40, 1191–1203) (JR) show that the application of the CD test proposed by Pesaran (2004, General diagnostic tests for cross-sectional dependence in panels, CWPE 0435, Cambridge) to residuals from panels with latent factors results in over-rejection. They propose a randomized test statistic to correct for over-rejection, and add a screening component to achieve power. This article considers the same problem but from a different perspective and shows that the standard CD test remains valid if the latent factors are weak. A bias-corrected version, CD$^{\ast}$, is proposed and shown to be asymptotically standard normal under the null of error cross-sectional independence, while having power against network-type alternatives. This result is shown to hold for pure latent factor models as well as for panel regression models with latent factors. The case where the errors are serially correlated is also considered. Small sample properties of the CD$^{\ast}$ test are investigated by Monte Carlo experiments and shown to be satisfactory. In an empirical application, using the CD$^{\ast}$ test, it is shown that there remains spatial error dependence in a panel data model for real house price changes across 377 Metropolitan Statistical Areas in the United States, even after the effects of latent factors are filtered out.
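For context, the standard CD statistic referenced above aggregates pairwise residual correlations; a minimal sketch follows (the bias correction that defines CD$^{\ast}$ is not reproduced here, and the simulated residuals are an assumed illustration of the null of cross-sectional independence).

```python
import numpy as np

def cd_statistic(U):
    """Pesaran's CD statistic from a T x N matrix of residuals U:
    CD = sqrt(2T / (N(N-1))) * sum_{i<j} rho_hat_{ij}."""
    T, N = U.shape
    R = np.corrcoef(U, rowvar=False)        # N x N pairwise correlation matrix
    upper = R[np.triu_indices(N, k=1)]      # the N(N-1)/2 distinct correlations
    return np.sqrt(2.0 * T / (N * (N - 1))) * upper.sum()

rng = np.random.default_rng(0)
U = rng.standard_normal((200, 30))          # cross-sectionally independent errors
cd = cd_statistic(U)                        # approximately standard normal under the null
```

Under cross-sectional independence of the errors, `cd` is approximately standard normal for large T and N; the over-rejection JR document arises when `U` contains unfiltered latent factor structure.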
We analysed weekly influenza A intensive care unit (ICU) or high dependency unit (HDU) admissions reported by age group and subtype by NHS trusts in England through mandatory surveillance during the 2023–2024 influenza season. We investigated whether subtype reporting varied with patient age group, NHS trust type and region. We estimated the subtype ratio and explored whether this estimate varied among subsets of trusts grouped by the regularity of subtype reporting. Our aim was to explore factors relating to subtype reporting and investigate how these affect subtype ratio estimates. A total of 112 NHS trusts reported data, with 86 trusts reporting influenza A cases and 28 trusts reporting subtyped influenza A cases. The proportion of subtype reporting trusts varied with region and trust type, but not patient age group. The estimated ratio of influenza A(H1N1)pdm09 to influenza A(H3N2) was 3.13 (95% CI: 2.17, 4.51), indicating that influenza A(H1N1)pdm09 was dominant; this was approximately similar across levels of regularity of trust subtype reporting. The accuracy of subtype ratio estimates depends on the availability of influenza A subtype information and data representativeness. We identified low levels of subtype reporting, which likely limits early recognition of new influenza strains and the informed prescription of antivirals in influenza outbreaks.
Following the pivotal work of Sevastyanov (1957), who considered branching processes with homogeneous Poisson immigration, much has been done to understand the behaviour of such processes under different types of branching and immigration mechanisms. Recently, the case where the times of immigration are generated by a non-homogeneous Poisson process has been considered in depth. In this work, we demonstrate how we can use the framework of point processes in order to go beyond the Poisson process. As an illustration, we show how to transfer techniques from the case of Poisson immigration to the case where it is spanned by a determinantal point process.
We employ an appropriate change of measure technique to offer a general result connecting a general form of the Gerber–Shiu function with the distribution of the deficit at ruin under the new (exponentially tilted) measure. Exploiting this result, we extract closed-form formulae for special forms of the Gerber–Shiu function assuming two cases of bivariate distributions that describe the dependence structure between claim sizes and inter-claim times. More specifically, we first employ the Downton–Moran bivariate exponential distribution and offer explicit formulae for cases of the Gerber–Shiu function that include the time and the number of claims until ruin. In addition, we derive a closed formula for the defective discounted joint density of the number of claims until ruin, the deficit at ruin, and the time until ruin. The same is achieved for the joint density of the number of claims and the deficit at ruin. We further generalize these results by assuming that the inter-claim times and the claim sizes follow a Kibble–Moran bivariate Erlang distribution. Finally, we offer numerical examples in order to illustrate our main results.