The increasing popularity of large language models has not only led to widespread use but has also brought various risks, including the potential for systematically spreading fake news. Consequently, the development of classification systems such as DetectGPT has become vital. These detectors are vulnerable to evasion techniques, as demonstrated in an experimental series: systematically varying the generative model's temperature proved shallow-learning detectors to be the least reliable (Experiment 1). Fine-tuning the generative model via reinforcement learning circumvented BERT-based detectors (Experiment 2). Finally, rephrasing led to a >90% evasion of zero-shot detectors like DetectGPT, although the texts stayed highly similar to the originals (Experiment 3). A comparison with existing work highlights the better performance of the presented methods. Possible implications for society and further research are discussed.
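Experiment 1 hinges on sweeping the sampling temperature of the generative model. A minimal sketch of such a sweep, assuming the Hugging Face transformers library and the public gpt2 checkpoint (the DetectGPT scoring step is not shown), might look like this:

```python
# Minimal sketch: generate text at several sampling temperatures.
# Assumes the Hugging Face `transformers` library and the public gpt2 checkpoint;
# the detector (e.g. DetectGPT) that would score each sample is not included here.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Breaking news:"
inputs = tokenizer(prompt, return_tensors="pt")

for temperature in [0.3, 0.7, 1.0, 1.5]:
    output = model.generate(
        **inputs,
        do_sample=True,           # stochastic decoding so temperature matters
        temperature=temperature,  # higher values flatten the token distribution
        max_new_tokens=80,
        pad_token_id=tokenizer.eos_token_id,
    )
    text = tokenizer.decode(output[0], skip_special_tokens=True)
    print(f"--- temperature={temperature} ---\n{text}\n")
    # each generated `text` would then be passed to the detector under evaluation
```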
We consider the moments and the distribution of hitting times on the lollipop graph, which is the graph exhibiting the maximum expected hitting time among all graphs with the same number of nodes. We obtain recurrence relations for the moments of all orders, and we use these relations to analyze the asymptotic behavior of the hitting time distribution when the number of nodes tends to infinity.
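As a point of reference for the quantities being studied, hitting-time moments on a lollipop graph can be approximated by plain Monte Carlo; the sketch below assumes, purely for illustration, a clique of 10 vertices with a path of 15 vertices attached, a walk started in the clique, and the far end of the path as the target.

```python
# Minimal sketch: estimate hitting-time moments on a lollipop graph by simulation.
# The graph size, start vertex, and target vertex are illustrative choices.
import random

def lollipop_adjacency(m, n):
    """Clique on vertices 0..m-1, with a path m-1, m, ..., m+n-1 attached at vertex m-1."""
    adj = {v: [u for u in range(m) if u != v] for v in range(m)}
    for v in range(m - 1, m + n - 1):
        adj.setdefault(v, []).append(v + 1)
        adj.setdefault(v + 1, []).append(v)
    return adj

def hitting_time(adj, start, target, rng):
    v, steps = start, 0
    while v != target:
        v = rng.choice(adj[v])
        steps += 1
    return steps

rng = random.Random(0)
adj = lollipop_adjacency(m=10, n=15)
samples = [hitting_time(adj, start=0, target=24, rng=rng) for _ in range(2000)]
mean = sum(samples) / len(samples)
second_moment = sum(t * t for t in samples) / len(samples)
print(f"estimated E[T] = {mean:.1f}, E[T^2] = {second_moment:.1f}")
```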
We show that for $\lambda\in[0,{m_1}/({1+\sqrt{1-{1}/{m_1}}})]$, the biased random walk’s speed on a Galton–Watson tree without leaves is strictly decreasing, where $m_1\geq 2$. Our result extends the monotonic interval of the speed on a Galton–Watson tree.
This article proposes Bayesian adaptive trials (BATs) as both an efficient method to conduct trials and a unifying framework for the evaluation of social policy interventions, addressing the limitations inherent in traditional methods, such as randomized controlled trials. Recognizing the crucial need for evidence-based approaches in public policy, the proposed approach aims to lower barriers to the adoption of evidence-based methods and to align evaluation processes more closely with the dynamic nature of policy cycles. BATs, grounded in decision theory, offer a dynamic, “learning as we go” approach, enabling the integration of diverse information types and facilitating a continuous, iterative process of policy evaluation. BATs’ adaptive nature is particularly advantageous in policy settings, allowing for more timely and context-sensitive decisions. Moreover, BATs’ ability to value potential future information sources positions them as an optimal strategy for sequential data acquisition during policy implementation. While acknowledging the assumptions and models intrinsic to BATs, such as prior distributions and likelihood functions, this article argues that these are advantageous for decision-makers in social policy, effectively merging the best features of various methodologies.
The tonuity, proposed by Chen et al. ((2019) ASTIN Bulletin: The Journal of the IAA, 49(1), 5–30), is a combination of an immediate tontine and a deferred annuity. However, its switching time from tontine to annuity is fixed at the moment the contract is concluded, possibly becoming sub-optimal if mortality changes over time. This article introduces an alternative tonuity product, wherein a dynamic switching condition is pivotal, relying on the observable mortality trends within a reference population. The switching from tontine to annuity then occurs automatically once the condition is satisfied. Using data from the Human Mortality Database and UK Continuous Mortality Investigation, we demonstrate that, in a changing environment, where an unforeseen mortality or longevity shock leads to an unexpected increase or decrease in mortality rates, the proposed dynamic tonuity contract can be preferable to the regular tonuity contract.
The Wiener–Hopf factors of a Lévy process are the maximum and the displacement from the maximum at an independent exponential time. The majority of explicit solutions assume the upward jumps to be either phase-type or to have a rational Laplace transform, in which case the traditional expressions are lengthy expansions in terms of roots located by means of Rouché’s theorem. As an alternative, compact matrix formulas are derived, with parameters computable by iteration schemes.
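As a concrete illustration of the two factors themselves (not of the matrix formulas derived in the paper), they can be approximated by Monte Carlo for a simple Lévy process; the sketch below assumes a compound Poisson process with negative drift and exponential upward jumps, observed on a fine time grid, with all parameters chosen purely for illustration.

```python
# Minimal sketch: Monte Carlo illustration of the two Wiener-Hopf factors of a Lévy
# process, i.e. the running maximum and the displacement from the maximum, both taken
# at an independent exponential time. The process (compound Poisson with negative drift
# and exponential upward jumps) and all parameters are illustrative assumptions; the
# paper's matrix formulas and iteration schemes are not used here.
import numpy as np

rng = np.random.default_rng(0)
drift, jump_rate, jump_mean = -0.5, 1.0, 1.0
q = 0.2                     # rate of the independent exponential (killing) time
dt, n_paths = 1e-3, 5000    # with dt this small, at most one jump per step in practice

maxima = np.empty(n_paths)
displacements = np.empty(n_paths)
for i in range(n_paths):
    horizon = rng.exponential(1.0 / q)
    n_steps = int(horizon / dt)
    if n_steps == 0:
        maxima[i] = displacements[i] = 0.0
        continue
    jumps = rng.poisson(jump_rate * dt, n_steps) * rng.exponential(jump_mean, n_steps)
    path = np.cumsum(drift * dt + jumps)
    running_max = max(path.max(), 0.0)
    maxima[i] = running_max
    displacements[i] = running_max - path[-1]

print("E[ sup X ] at exp(q) time       ~", round(maxima.mean(), 3))
print("E[ sup X - X ] at exp(q) time   ~", round(displacements.mean(), 3))
```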
The H3N2 canine influenza virus (CIV) emerged from an avian reservoir in Asia to circulate entirely among dogs for the last 20 years. The virus was first seen circulating outside Asian dog populations in 2015, in North America. Utilizing viral genomic data in addition to clinical reports and diagnostic testing data, we provide an updated analysis of the evolution and epidemiology of the virus in its canine host. CIV in dogs in North America is marked by a complex life history – including local outbreaks, regional lineage die-outs, and repeated reintroductions of the virus (with diverse genotypes) from different regions of Asia. Phylogenetic and Bayesian analyses reveal multiple CIV clades, and viruses from China have seeded recent North American outbreaks, with 2 or 3 introductions in the past 3 years. Genomic epidemiology confirms that within North America the virus spreads very rapidly among dogs in kennels and shelters in different regions – but then dies out locally. The overall epidemic therefore requires longer-distance dispersal of virus to maintain outbreaks over the long term. With a constant evolutionary rate over 20 years, CIV still appears best adapted to transmission in dense populations and has not gained properties for prolonged circulation among dogs.
This paper characterizes irreducible phase-type representations for exponential distributions. Bean and Green (2000) gave a set of necessary and sufficient conditions for a phase-type distribution with an irreducible generator matrix to be exponential. We extend these conditions to irreducible representations, and we thus give a characterization of all irreducible phase-type representations for exponential distributions. We consider the results in relation to time-reversal of phase-type distributions, PH-simplicity, and the algebraic degree of a phase-type distribution, and we give applications of the results. In particular we give the conditions under which a Coxian distribution becomes exponential, and we construct bivariate exponential distributions. Finally, we translate the main findings to the discrete case of geometric distributions.
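The phenomenon in question can be checked numerically for any candidate representation: a PH distribution with representation $(\alpha, T)$ has density $f(t)=\alpha e^{Tt}t_0$ with exit vector $t_0=-T\mathbf{1}$, and this density can be compared with the exponential density of the same mean. The two-phase Coxian below is an illustrative choice, not one taken from the paper.

```python
# Minimal sketch: numerically test whether a phase-type representation (alpha, T)
# yields an exponential distribution, by comparing its density
#   f(t) = alpha @ expm(T t) @ t0,  with  t0 = -T @ 1,
# against the exponential density with the same mean.
# The Coxian representation below is an illustrative assumption.
import numpy as np
from scipy.linalg import expm

alpha = np.array([1.0, 0.0])
T = np.array([[-3.0, 2.0],
              [0.0, -1.0]])             # upper-bidiagonal generator: a Coxian representation
t0 = -T @ np.ones(2)                    # exit rate vector

mean = alpha @ np.linalg.inv(-T) @ np.ones(2)    # E[X] = alpha (-T)^{-1} 1
rate = 1.0 / mean

grid = np.linspace(0.01, 10.0, 200)
ph_density = np.array([alpha @ expm(T * t) @ t0 for t in grid])
exp_density = rate * np.exp(-rate * grid)

print("max |f_PH - f_Exp| on grid:", np.abs(ph_density - exp_density).max())
```

For this particular representation the printed difference is numerically zero: the two-phase Coxian above is a redundant representation of the unit-rate exponential, which is exactly the kind of situation the characterization addresses.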
This paper demonstrates how learning the structure of a Bayesian network, often used to predict and represent causal pathways, can be used to inform policy decision-making.
We show that Bayesian networks are a rigorous and interpretable representation of interconnected factors that affect the complex environment in which policy decisions are made. Furthermore, Bayesian structure learning differentiates between proximal or immediate factors and upstream or root causes, offering a comprehensive set of potential causal pathways leading to specific outcomes.
We show how these causal pathways can provide critical insights into the impact of a policy intervention on an outcome. Central to our approach is the integration of causal discovery within a Bayesian framework, which considers the relative likelihood of possible causal pathways rather than only the most probable pathway.
We argue this is an essential part of causal discovery in policy making because the complexity of the decision landscape inevitably means that there are many near equally probable causal pathways. While this methodology is broadly applicable across various policy domains, we demonstrate its value within the context of educational policy in Australia. Here, we identify pathways influencing educational outcomes, such as student attendance, and examine the effects of social disadvantage on these pathways. We demonstrate the methodology’s performance using synthetic data and its usefulness by applying it to real-world data. Our findings in the real example highlight the usefulness of Bayesian networks as a policy decision tool and show how data science techniques can be used for practical policy development.
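The core mechanism (score-based structure learning that retains the relative weight of every candidate structure, rather than only the single best one) can be sketched on a toy problem. The three binary variables, the synthetic data-generating process, and the BIC-based weighting below are illustrative assumptions, not the article's model or the Australian data.

```python
# Minimal sketch: exhaustive score-based Bayesian-network structure learning over
# three binary variables, keeping the *relative* weight of every DAG rather than
# only the single best structure. Variables, data, and the BIC-based weighting
# are illustrative assumptions.
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: disadvantage -> attendance -> outcome (all binary).
n = 2000
disadvantage = rng.binomial(1, 0.3, n)
attendance = rng.binomial(1, np.where(disadvantage == 1, 0.6, 0.9))
outcome = rng.binomial(1, np.where(attendance == 1, 0.8, 0.4))
data = np.column_stack([disadvantage, attendance, outcome])
names = ["disadvantage", "attendance", "outcome"]

def node_bic(child, parents):
    """BIC contribution of one node given its parent set (Bernoulli MLE per configuration)."""
    cols = list(parents)
    loglik, n_params = 0.0, 0
    for config in itertools.product([0, 1], repeat=len(cols)):
        if cols:
            mask = np.all(data[:, cols] == np.array(config), axis=1)
        else:
            mask = np.ones(n, dtype=bool)
        m = int(mask.sum())
        if m == 0:
            continue
        k1 = int(data[mask, child].sum())
        for k in (k1, m - k1):
            if k > 0:
                loglik += k * np.log(k / m)
        n_params += 1                 # one Bernoulli parameter per observed parent configuration
    return loglik - 0.5 * n_params * np.log(n)

def is_acyclic(edges):
    children = {i: [j for (a, j) in edges if a == i] for i in range(3)}
    def has_cycle(v, stack):
        if v in stack:
            return True
        return any(has_cycle(c, stack | {v}) for c in children[v])
    return not any(has_cycle(v, frozenset()) for v in range(3))

possible_edges = [(i, j) for i in range(3) for j in range(3) if i != j]
scores = {}
for r in range(len(possible_edges) + 1):
    for edges in itertools.combinations(possible_edges, r):
        if not is_acyclic(edges):
            continue
        bic = sum(node_bic(child, tuple(p for (p, c) in edges if c == child))
                  for child in range(3))
        scores[edges] = bic

# Relative weights of candidate structures (BIC approximates the log marginal likelihood).
max_bic = max(scores.values())
weights = {e: np.exp(s - max_bic) for e, s in scores.items()}
total = sum(weights.values())
for edges, w in sorted(weights.items(), key=lambda kv: -kv[1])[:5]:
    desc = ", ".join(f"{names[a]}->{names[b]}" for a, b in edges) or "(empty graph)"
    print(f"weight {w / total:.3f}  {desc}")
```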
The epidemiology of Clostridioides difficile has evolved over the past decades, and the pathogen is now recognized as an important cause of disease in the community setting. Even so, reports of community-associated C. difficile infection (CA-CDI) have been heterogeneous. Therefore, the aim of this study was to assess the epidemiologic profile of CA-CDI.
This systematic review and meta-analysis was conducted according to the PRISMA checklist and Cochrane guidelines (CRD42023451134). The literature search was performed by an experienced librarian from inception to April 2023 across MEDLINE, Scopus, Web of Science, EMBASE, CCRCC, CDSR, and ClinicalTrials. Observational studies reporting the prevalence or incidence of CA-CDI, or indicators from which they could be calculated, were included. Pooled analysis was performed using a binomial-normal model via a generalized linear mixed model. Subgroup analyses and publication bias were also explored. A total of 49 articles were included, yielding a prevalence of 5% (95% CI 3–8) and an incidence of 7.53 patients (95% CI 4.45–12.74) per 100,000 person-years.
In conclusion, this meta-analysis underscores that among the included studies, the prevalence of CA-CDI stands at 5%, with an incidence rate of 7.53 cases per 100,000 person-years. Noteworthy risk factors identified include prior antibiotic exposure and age.
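For readers unfamiliar with pooled prevalence estimates, the sketch below runs a simplified DerSimonian–Laird random-effects pooling on the logit scale; it is a stand-in for the binomial-normal GLMM actually used in the study, and the study counts are invented.

```python
# Minimal sketch: random-effects pooling of study-level prevalences on the logit scale
# (DerSimonian-Laird), as a simplified stand-in for the binomial-normal GLMM used in
# the meta-analysis. The event counts and sample sizes below are invented.
import numpy as np

events = np.array([12, 30, 8, 55, 20])
totals = np.array([250, 480, 200, 900, 400])

p = events / totals
logit = np.log(p / (1 - p))
var = 1 / events + 1 / (totals - events)          # approximate variance of the logit

# DerSimonian-Laird between-study variance tau^2
w_fixed = 1 / var
pooled_fixed = np.sum(w_fixed * logit) / np.sum(w_fixed)
Q = np.sum(w_fixed * (logit - pooled_fixed) ** 2)
c = np.sum(w_fixed) - np.sum(w_fixed ** 2) / np.sum(w_fixed)
tau2 = max(0.0, (Q - (len(events) - 1)) / c)

w = 1 / (var + tau2)
pooled = np.sum(w * logit) / np.sum(w)
se = np.sqrt(1 / np.sum(w))
low, high = pooled - 1.96 * se, pooled + 1.96 * se

def to_prob(x):
    return 1 / (1 + np.exp(-x))

print(f"pooled prevalence: {to_prob(pooled):.3f} "
      f"(95% CI {to_prob(low):.3f} to {to_prob(high):.3f})")
```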
Bayesian model updating (BMU) is frequently used in Structural Health Monitoring to investigate a structure's dynamic behavior under various operational and environmental loadings for decision-making, e.g., to determine whether maintenance is required. Data collected by sensors are used to update the prior of some physics-based model's latent parameters to yield the posterior. The choice of prior may significantly affect posterior predictions and subsequent decision-making, especially in the typical engineering setting where informative data are scarce. Therefore, understanding how the choice of prior affects the posterior prediction is of great interest. In this article, a robust Bayesian inference technique evaluates the optimal and worst-case priors in the vicinity of a chosen nominal prior, together with their corresponding posteriors. The technique derives an interacting Wasserstein gradient flow that minimizes the KL divergence between the posterior and its approximation with respect to the approximation, while maximizing or minimizing it with respect to the prior. Two numerical case studies are used to showcase the proposed algorithm: a double-banana posterior and a double-beam structure. Optimal and worst-case priors are modeled by specifying an ambiguity set containing any distribution within a given statistical distance (the radius) of the nominal prior. The resulting posteriors may be used to yield lower and upper bounds on subsequent calculations of an engineering metric (e.g., failure probability) used for decision-making. If the metric used for decision-making is not sensitive to the resulting posteriors, it may be assumed that decisions taken are robust to prior uncertainty.
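The basic idea of bounding a decision metric over a prior ambiguity set can be illustrated with a deliberately simple conjugate example: a Gaussian prior whose mean is allowed to move within a fixed radius of the nominal mean. This is a drastic simplification of the Wasserstein-gradient-flow machinery described above, and every number below is an assumption.

```python
# Minimal sketch: bounding a failure probability over an ambiguity set of priors.
# Here the ambiguity set is just Gaussian priors whose mean lies within `radius`
# of the nominal prior mean (variance fixed) -- a drastic simplification of the
# paper's Wasserstein-based approach. All numbers are illustrative assumptions.
import numpy as np
from scipy.stats import norm

data = np.array([1.8, 2.1, 1.9, 2.3])   # observations of the latent parameter, with noise
sigma_noise = 0.5
prior_std = 1.0
nominal_mean, radius = 1.0, 0.5
threshold = 2.5                          # "failure" if the latent parameter exceeds this

def posterior_failure_prob(prior_mean):
    """Conjugate Gaussian update, then P(theta > threshold | data)."""
    prec = 1 / prior_std**2 + len(data) / sigma_noise**2
    post_var = 1 / prec
    post_mean = post_var * (prior_mean / prior_std**2 + data.sum() / sigma_noise**2)
    return norm.sf(threshold, loc=post_mean, scale=np.sqrt(post_var))

candidate_means = np.linspace(nominal_mean - radius, nominal_mean + radius, 101)
probs = np.array([posterior_failure_prob(m) for m in candidate_means])
print(f"nominal prior            : P(fail) = {posterior_failure_prob(nominal_mean):.4f}")
print(f"bounds over ambiguity set: [{probs.min():.4f}, {probs.max():.4f}]")
```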
Robotic manipulation inherently involves contact with objects for task accomplishment. Traditional motion planning techniques, while having shown their success in collision-free scenarios, may not handle manipulation tasks effectively because they typically avoid contact. Although geometric constraints have been introduced into classical motion planners for tasks that involve interactions, they still lack the capability to fully incorporate contact. In addition, these planning methods generally do not operate on objects that cannot be directly controlled. In this work, building on a recently proposed framework for energy-based quasi-static manipulation, we propose an approach to manipulation planning by adapting a numerical continuation algorithm to compute the equilibrium manifold (EM), which is implicitly derived from physical laws. By defining a manipulation potential energy function that captures interaction and natural potentials, the numerical continuation approach is integrated with adaptive ordinary differential equations that converge to the EM. This allows discretizing the implicit manifold as a graph with a finite set of equilibria as nodes interconnected by weighted edges defined via a haptic metric. The proposed framework is evaluated on an inverted pendulum task, where exploring a branch of the manifold demonstrates the effectiveness of the approach.
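Numerical continuation itself can be shown on a toy equilibrium problem. The sketch below traces pendulum equilibria satisfying $u = mgl\sin\theta$ as the applied torque $u$ varies; it is a generic natural-parameter continuation with Newton corrections, not the energy-based equilibrium-manifold construction of the paper, and all constants are assumptions.

```python
# Minimal sketch: natural-parameter continuation of inverted-pendulum equilibria.
# Equilibria satisfy u - m*g*l*sin(theta) = 0; the torque u is stepped along a grid and
# theta is corrected by Newton's method, warm-started from the previous equilibrium.
# This is generic continuation, not the paper's energy-based equilibrium-manifold
# algorithm; all constants are illustrative assumptions.
import numpy as np

m, g, l = 1.0, 9.81, 1.0

def residual(theta, u):
    return u - m * g * l * np.sin(theta)

def d_residual(theta):
    return -m * g * l * np.cos(theta)

theta, branch = 0.0, []
for u in np.linspace(0.0, 0.95 * m * g * l, 50):   # stay below the fold at u = m*g*l
    for _ in range(20):                            # Newton corrector
        theta -= residual(theta, u) / d_residual(theta)
        if abs(residual(theta, u)) < 1e-12:
            break
    branch.append((u, theta))

for u, th in branch[::10]:
    print(f"u = {u:6.3f}  ->  equilibrium angle theta = {th:.4f} rad")
```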
We propose a family of weighted statistics based on the CUSUM process of the WLS residuals for the online detection of changepoints in a Random Coefficient Autoregressive model, using both the standard CUSUM and the Page-CUSUM process. We derive the asymptotics under the null of no changepoint for all possible weighting schemes, including the case of the standardized CUSUM, for which we derive a Darling–Erdös-type limit theorem; our results guarantee procedure-wise size control under both open-ended and closed-ended monitoring. In addition to considering the standard RCA model with no covariates, we also extend our results to the case of exogenous regressors. Our results can be applied irrespective of (and with no prior knowledge required as to) whether the observations are stationary or not, and irrespective of whether they change into a stationary or nonstationary regime. Hence, our methodology is particularly suited to detect the onset, or the collapse, of a bubble or an epidemic. Our simulations show that our procedures, especially when standardizing the CUSUM process, can ensure very good size control and short detection delays. We complement our theory by studying the online detection of breaks in epidemiological and housing price series.
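A stripped-down version of the monitoring recursion is sketched below: an ordinary CUSUM of out-of-sample residuals from a training sample, compared against a square-root boundary. The simulated AR(1) data, the boundary constant, and the use of OLS instead of the WLS residuals of the paper are all illustrative assumptions.

```python
# Minimal sketch: online CUSUM monitoring of regression residuals.
# An AR(1) series changes regime mid-monitoring; residuals from a training sample
# are accumulated and compared with a square-root boundary. OLS (not the paper's WLS),
# the boundary constant, and the simulated data are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
m = 300                                  # training (historical) sample size
phi_null, phi_alt = 0.3, 0.95            # autoregressive coefficient before/after the break

# simulate AR(1) data: stable during training, near-unit-root after observation m+150
x = np.zeros(m + 400)
for t in range(1, len(x)):
    phi = phi_null if t < m + 150 else phi_alt
    x[t] = phi * x[t - 1] + rng.standard_normal()

# fit AR(1) by OLS on the training sample
y, lag = x[1:m], x[:m - 1]
phi_hat = (lag @ y) / (lag @ lag)
sigma_hat = np.std(y - phi_hat * lag, ddof=1)

# online monitoring: cumulative sum of out-of-sample residuals vs. an illustrative boundary
detector, threshold_hit = 0.0, None
for k, t in enumerate(range(m, len(x) - 1), start=1):
    resid = x[t + 1] - phi_hat * x[t]
    detector += resid
    boundary = 3.0 * sigma_hat * np.sqrt(m) * (1 + k / m)
    if abs(detector) > boundary and threshold_hit is None:
        threshold_hit = k

if threshold_hit is None:
    print("no change flagged during monitoring")
else:
    print(f"change flagged after {threshold_hit} monitoring observations")
```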
For a continuous-time phase-type (PH) distribution, starting with its Laplace–Stieltjes transform, we obtain a necessary and sufficient condition for its minimal PH representation to have the same order as its algebraic degree. To facilitate finding this minimal representation, we transform this condition equivalently into a non-convex optimization problem, which can be effectively addressed using an alternating minimization algorithm. The algorithm convergence is also proved. Moreover, the method we develop for the continuous-time PH distributions can be used directly for the discrete-time PH distributions after establishing an equivalence between the minimal representation problems for continuous-time and discrete-time PH distributions.
In the present technological age, where cyber-risk ranks alongside natural and man-made disasters and catastrophes – in terms of global economic loss – businesses and insurers alike are grappling with fundamental risk management issues concerning the quantification of cyber-risk, and the dilemma as to how best to mitigate this risk. To this end, the present research deals with data, analysis, and models with the aim of quantifying and understanding cyber-risk – often described as “holy grail” territory in the realm of cyber-insurance and IT security. Nonparametric severity models associated with cyber-related loss data – identified from several competing sources – and accompanying parametric large-loss components, are determined, and examined. Ultimately, in the context of analogous cyber-coverage, cyber-risk is quantified through various types and levels of risk adjustment for (pure-risk) increased limit factors, based on applications of actuarially founded aggregate loss models in the presence of various forms of correlation. By doing so, insight is gained into the nature and distribution of volatile severity risk, correlated aggregate loss, and associated pure-risk limit factors.
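To make the notion of (pure-risk) increased limit factors concrete, the sketch below computes ILFs for a single simulated severity distribution. The lognormal severity, the basic limit, and the policy limits are invented, and the correlation and risk-adjustment layers discussed above are omitted.

```python
# Minimal sketch: pure-risk increased limit factors (ILFs) for a single severity model.
#   ILF(L) = E[min(X, L)] / E[min(X, basic_limit)]
# The lognormal severity, the basic limit, and the limits are illustrative assumptions;
# correlation and risk adjustments discussed in the paper are not modeled here.
import numpy as np

rng = np.random.default_rng(42)
severities = rng.lognormal(mean=12.0, sigma=2.0, size=1_000_000)   # simulated cyber losses

basic_limit = 1e6
limits = [1e6, 2e6, 5e6, 10e6, 25e6]

base = np.minimum(severities, basic_limit).mean()     # limited expected value at the basic limit
for L in limits:
    lev = np.minimum(severities, L).mean()
    print(f"limit {L/1e6:5.0f}M : ILF = {lev / base:.3f}")
```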
Bartonella is a widely distributed genus of Gram-negative bacteria that includes species capable of causing illness in humans. Rodents represent one of the main reservoirs of zoonotic pathogens, and monitoring their populations can provide valuable insights into human health. We conducted a surveillance study of rodents from two north-western states of Mexico (Baja California and Chihuahua) to investigate the prevalence and genetic diversity of Bartonella by polymerase chain reaction (PCR) amplification and sequencing of the citrate synthase (gltA) gene. A total of 586 rodents belonging to 28 species were captured, and 408 were tested for Bartonella spp. The overall Bartonella spp. prevalence was 39.71%. The prevalence found in Chihuahua was higher (42.80%) than in Baja California (32.52%), and rodents such as Neotoma albigula, Neotoma mexicana, Peromyscus boylii, and Chaetodipus baileyi had the highest prevalence. The gltA sequences revealed seven genetic variants, some of which were obtained from Peromyscus and Dipodomys rodents and were associated with Bartonella species of human health concern, such as B. grahamii and B. vinsonii subsp. arupensis. In addition, a sequence obtained from a Peromyscus maniculatus clustered with Candidatus Bartonella rudakovii, a previously unreported association. This study provides valuable data and new insight into Bartonella–host interactions in rodent species in north-western Mexico.
Data for Policy (dataforpolicy.org), a trans-disciplinary community of research and practice, has emerged around the application and evaluation of data technologies and analytics for policy and governance. Research in this area has involved cross-sector collaborations, but the areas of emphasis have previously been unclear. Within the Data for Policy framework of six focus areas, this report offers a landscape review of Focus Area 2: Technologies and Analytics. Taking stock of recent advancements and challenges can help shape research priorities for this community. We highlight four commonly used technologies for prediction and inference that leverage datasets from the digital environment: machine learning (ML) and artificial intelligence systems, the internet-of-things, digital twins, and distributed ledger systems. We review innovations in research evaluation and discuss future directions for policy decision-making.
Stochastic generators are essential to produce synthetic realizations that preserve target statistical properties. We propose GenFormer, a stochastic generator for spatio-temporal multivariate stochastic processes. It is constructed using a Transformer-based deep learning model that learns a mapping between a Markov state sequence and time series values. The synthetic data generated by the GenFormer model preserve the target marginal distributions and approximately capture other desired statistical properties even in challenging applications involving a large number of spatial locations and a long simulation horizon. The GenFormer model is applied to simulate synthetic wind speed data at various stations in Florida to calculate exceedance probabilities for risk management.
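The two-stage idea (a Markov model over discretized states, plus a learned mapping from state sequences back to values) can be sketched as follows. For brevity the Transformer mapping is replaced by per-state empirical resampling and the training series is synthetic, so this only illustrates the pipeline, not the GenFormer architecture itself.

```python
# Minimal sketch of a two-stage generator: (1) fit a Markov chain on discretized states,
# (2) map a simulated state sequence back to values. The paper's Transformer mapping is
# replaced here by per-state empirical resampling, and the training series is synthetic,
# so this only illustrates the pipeline, not the GenFormer model itself.
import numpy as np

rng = np.random.default_rng(0)

# synthetic "wind speed"-like training series (AR(1) passed through an exponential)
z = np.zeros(5000)
for t in range(1, len(z)):
    z[t] = 0.9 * z[t - 1] + 0.3 * rng.standard_normal()
series = np.exp(z)

# stage 1: discretize into Markov states and estimate the transition matrix
n_states = 8
edges = np.quantile(series, np.linspace(0, 1, n_states + 1))
states = np.clip(np.digitize(series, edges[1:-1]), 0, n_states - 1)
P = np.zeros((n_states, n_states))
for a, b in zip(states[:-1], states[1:]):
    P[a, b] += 1
P /= P.sum(axis=1, keepdims=True)

# stage 2: simulate a state path, then map states to values by resampling within each state
horizon = 2000
path = np.zeros(horizon, dtype=int)
for t in range(1, horizon):
    path[t] = rng.choice(n_states, p=P[path[t - 1]])
values_by_state = [series[states == s] for s in range(n_states)]
synthetic = np.array([rng.choice(values_by_state[s]) for s in path])

print("target mean/std   :", series.mean().round(3), series.std().round(3))
print("synthetic mean/std:", synthetic.mean().round(3), synthetic.std().round(3))
```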
A new explicit solution representation is provided for ARMA recursions with drift and either deterministically or stochastically varying coefficients. It is expressed in terms of the determinants of banded Hessenberg matrices and, as such, is an explicit function of the coefficients. In addition to computational efficiency, the proposed solution provides a more explicit analysis of the fundamental properties of such processes, including their Wold–Cramér decomposition, their covariance structure, and their asymptotic stability and efficiency. Explicit formulae for optimal linear forecasts based either on finite or infinite sequences of past observations are provided. The practical significance of the theoretical results in this work is illustrated with an application to U.S. inflation data. The main finding is that inflation persistence increased after 1976, whereas from 1986 onward it declined and stabilized at even lower levels than in the pre-1976 period.
Modern data analysis depends increasingly on estimating models via flexible high-dimensional or nonparametric machine learning methods, where the identification of structural parameters is often challenging and untestable. In linear settings, this identification hinges on the completeness condition, which requires the nonsingularity of a high-dimensional matrix or operator and may fail for finite samples or even at the population level. Regularized estimators provide a solution by enabling consistent estimation of structural or average structural functions, sometimes even under identification failure. We show that the asymptotic distribution in these cases can be nonstandard. We develop a comprehensive theory of regularized estimators, which include methods such as high-dimensional ridge regularization, gradient descent, and principal component analysis (PCA). The results are illustrated for high-dimensional and nonparametric instrumental variable regressions and are supported through simulation experiments.
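A minimal sketch of one such regularized estimator is given below: ridge-penalized two-stage least squares with many (near-collinear) instruments. The data-generating process and the fixed penalty values are illustrative assumptions and do not reproduce the paper's estimators or tuning rules.

```python
# Minimal sketch: ridge-regularized two-stage least squares with many instruments.
# The simulated design, the near-collinear instruments, and the fixed penalties are
# illustrative assumptions; the paper's estimators and tuning rules are not reproduced.
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 80                                    # many (near-collinear) instruments

Z = rng.standard_normal((n, p))
Z[:, 1:] = 0.95 * Z[:, [0]] + 0.05 * Z[:, 1:]     # nearly singular instrument matrix
u = rng.standard_normal(n)
x = Z @ np.full(p, 0.05) + 0.8 * u + rng.standard_normal(n)   # endogenous regressor
y = 1.5 * x + u                                                # structural parameter = 1.5

def ridge_2sls(y, x, Z, lam):
    # first stage: ridge projection of x on the instruments
    first = np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ x)
    x_hat = Z @ first
    # second stage: regress y on the fitted endogenous regressor
    return (x_hat @ y) / (x_hat @ x)

for lam in [0.0, 1.0, 100.0]:
    print(f"lambda = {lam:7.1f}  ->  beta_hat = {ridge_2sls(y, x, Z, lam):.3f}")
```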