This paper introduces a dynamic knowledge-graph approach for digital twins and illustrates how this approach is, by design, naturally suited to realizing the vision of a Universal Digital Twin. The dynamic knowledge graph is implemented using technologies from the Semantic Web. It is composed of concepts and instances that are defined using ontologies, and of computational agents that operate on both the concepts and instances to update the dynamic knowledge graph. By construction, it is distributed, supports cross-domain interoperability, and ensures that data are connected, portable, discoverable, and queryable via a uniform interface. The knowledge graph includes the notions of a “base world” that describes the real world and that is maintained by agents that incorporate real-time data, and of “parallel worlds” that support the intelligent exploration of alternative designs without affecting the base world. Use cases are presented that demonstrate the ability of the dynamic knowledge graph to host geospatial and chemical data, control chemistry experiments, perform cross-domain simulations, and perform scenario analysis. The questions of how to make intelligent suggestions for alternative scenarios, and how to ensure that the scenarios explored by the knowledge graph remain aligned with the goals of society, are also discussed. Ongoing work to extend the dynamic knowledge graph into a digital twin of the UK, in support of the decarbonization of the energy system, is described. Important directions for future research are highlighted.
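To make the base-world/parallel-world idea concrete, here is a minimal sketch using rdflib. The namespace, the PowerPlant concept, and the capacity_agent function are all illustrative assumptions, not the paper's actual ontologies or agents.

```python
from rdflib import Graph, Namespace, Literal, URIRef
from rdflib.namespace import RDF

# Hypothetical namespace; the actual ontologies in the paper differ.
EX = Namespace("http://example.org/kg/")

base_world = Graph()   # describes the real world, maintained by agents
base_world.add((EX.plant1, RDF.type, EX.PowerPlant))
base_world.add((EX.plant1, EX.hasCapacityMW, Literal(400)))

def capacity_agent(world: Graph, plant: URIRef, new_mw: float) -> None:
    """Toy computational agent: updates a plant's capacity instance data."""
    world.remove((plant, EX.hasCapacityMW, None))
    world.add((plant, EX.hasCapacityMW, Literal(new_mw)))

# A parallel world starts as a copy of the base world and may diverge
# freely without affecting the base world.
parallel_world = Graph()
for triple in base_world:
    parallel_world.add(triple)
capacity_agent(parallel_world, EX.plant1, 500)  # explore an alternative design

# Uniform query interface: the same SPARQL runs against either world.
q = ("SELECT ?mw WHERE { ?p a <http://example.org/kg/PowerPlant> ; "
     "<http://example.org/kg/hasCapacityMW> ?mw }")
print([row.mw for row in base_world.query(q)])      # capacity 400
print([row.mw for row in parallel_world.query(q)])  # capacity 500
```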
We propose using fully Bayesian Gaussian process emulation (GPE) as a surrogate for expensive computer experiments of transport infrastructure cut slopes in high-plasticity clay soils, which are associated with an increased risk of failure. Our deterioration experiments simulate the dissipation of excess pore water pressure and seasonal pore water pressure cycles to determine slope failure time. It is impractical to perform the number of computer simulations that would be sufficient to make slope stability predictions over a meaningful range of geometries and strength parameters. Therefore, a GPE is used as an interpolator over a set of optimally spaced simulator runs, modeling the time to slope failure as a function of geometry, strength, and permeability. Bayesian inference and Markov chain Monte Carlo simulation are used to obtain posterior estimates of the GPE parameters. For the experiments that do not reach failure within the model time of 184 years, the time to failure is stochastically imputed by the Bayesian model. The trained GPE has the potential to inform infrastructure slope design, management, and maintenance. The reduction in computational cost compared with the original simulator makes it a highly attractive tool that can be applied to the different spatio-temporal scales of transport networks.
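The core mechanic of a GP emulator is the posterior mean and variance conditioned on the simulator runs. The sketch below fixes the kernel hyperparameters for brevity (in the fully Bayesian version they would be sampled by MCMC, and censored failure times imputed); the design points and outputs are invented for illustration.

```python
import numpy as np

def rbf_kernel(A, B, ell=10.0, sf2=1.0):
    """Squared-exponential kernel; ell and sf2 are fixed here but would
    carry posterior uncertainty in the fully Bayesian treatment."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return sf2 * np.exp(-0.5 * d2 / ell**2)

def gp_predict(X, y, Xstar, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP emulator."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(X, Xstar)
    Kss = rbf_kernel(Xstar, Xstar)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    v = np.linalg.solve(L, Ks)
    return Ks.T @ alpha, np.diag(Kss) - (v ** 2).sum(0)

# Toy design: (slope angle, strength) -> log time-to-failure; values invented.
X = np.array([[20.0, 5.0], [25.0, 7.0], [30.0, 6.0], [35.0, 8.0]])
y = np.log(np.array([150.0, 90.0, 40.0, 25.0]))
mean, var = gp_predict(X, y, np.array([[28.0, 6.5]]))
print(np.exp(mean), var)  # emulated failure time and predictive variance
```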
We developed a novel method to align two data sources (TB notifications and the Demographic and Health Survey, DHS) captured at different geographic scales. We used this method to identify sociodemographic indicators – specifically population density – that were ecologically correlated with elevated TB notification rates across wards (~100 000 people) in Dhaka, Bangladesh. We found population density was the variable most closely correlated with ward-level TB notification rates (Spearman's rank correlation 0.45). Our approach can be useful because publicly available data (e.g. DHS data) could help identify factors that are ecologically associated with disease burden when more granular data (e.g. ward-level TB notifications) are not available. Use of this approach might help in designing spatially targeted interventions for TB and other diseases in settings with weak existing data on disease burden at the subdistrict level.
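A schematic sketch of the two steps involved, with entirely invented numbers: DHS cluster values are aggregated up to wards (the actual spatial join of cluster coordinates to ward polygons would need a GIS library such as geopandas), and the ward-level indicator is then rank-correlated with ward-level TB notification rates.

```python
import numpy as np
from scipy.stats import spearmanr

# Step 1 (alignment, schematic): each DHS cluster carries an indicator value
# and a ward id from a spatial join; aggregate clusters to the ward scale.
cluster_ward = np.array([0, 0, 1, 1, 2, 2, 2])
cluster_density = np.array([14000, 16000, 8000, 10000, 30000, 33000, 33000.0])
ward_density = np.array(
    [cluster_density[cluster_ward == w].mean() for w in range(3)]
)

# Step 2: ecological correlation at the ward scale.
tb_rate = np.array([150.0, 90.0, 210.0])  # per 100 000, illustrative only
rho, _ = spearmanr(ward_density, tb_rate)
print(f"Spearman rank correlation: {rho:.2f}")
```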
We investigated whether household-to-clinic distance was a risk factor for death on tuberculosis (TB) treatment in Malawi. Using enhanced TB surveillance data, we recorded all TB treatment initiations and outcomes between 2015 and 2018. Household locations were geolocated, and distances were measured as a straight line or along the shortest road network. We constructed Bayesian multi-level logistic regression models to investigate associations between distance and case fatality. A total of 479/4397 (10.9%) TB patients died. Greater distance was associated with higher odds of death (odds ratio (OR) 1.07 per kilometre (km) increase, 95% credible interval (CI) 0.99–1.16) among TB patients registered at the referral hospital, but not among TB patients registered at primary clinics (OR 0.98 per km increase, 95% CI 0.92–1.03). Older age (OR 1.02 per year increase, 95% CI 1.01–1.02) and HIV-positive status (OR 2.21, 95% CI 1.73–2.85) were also associated with higher odds of death. Model estimates were similar for both distance measures. Distance was a risk factor for death among patients at the main referral hospital, likely due to delayed diagnosis and suboptimal healthcare access. To reduce mortality, targeted community screening interventions for TB disease and HIV, and expansion of novel sensitive diagnostic tests, are required.
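A simplified sketch of the regression underlying these odds ratios, on simulated data (not the Malawi records). It fits a single-level logistic model with statsmodels; the paper's Bayesian multi-level structure (e.g. clinic-level effects) is deliberately omitted, and the data-generating coefficients below merely echo the reported ORs for illustration.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

# Simulated patient records (illustrative only):
n = 1000
distance_km = rng.gamma(2.0, 3.0, n)    # household-to-clinic distance
age = rng.integers(15, 80, n)
hiv_pos = rng.binomial(1, 0.4, n)

# Assumed log-odds, loosely matching the reported ORs:
logit = (-3.0 + np.log(1.07) * distance_km
         + np.log(1.02) * age + np.log(2.21) * hiv_pos)
died = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = sm.add_constant(np.column_stack([distance_km, age, hiv_pos]))
fit = sm.Logit(died, X).fit(disp=False)
print(np.exp(fit.params[1:]))  # ORs per km, per year, and for HIV+ status
```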
Given a hereditary property of graphs $\mathcal{H}$ and a $p\in [0,1]$, the edit distance function $\textrm{ed}_{\mathcal{H}}(p)$ is asymptotically the maximum proportion of edge additions plus edge deletions applied to a graph of edge density $p$ sufficient to ensure that the resulting graph satisfies $\mathcal{H}$. The edit distance function is directly related to other well-studied quantities such as the speed function for $\mathcal{H}$ and the $\mathcal{H}$-chromatic number of a random graph.
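For concreteness, one standard way to formalize this definition is the following (the paper's precise normalization may differ): for an $n$-vertex graph $G$,
\[
\mathrm{dist}(G,\mathcal{H})=\min\left\{\frac{|E(G)\,\triangle\,E(G')|}{\binom{n}{2}} \;:\; G'\in\mathcal{H},\ V(G')=V(G)\right\},
\qquad
\mathrm{ed}_{\mathcal{H}}(p)=\lim_{n\to\infty}\max\left\{\mathrm{dist}(G,\mathcal{H}) \;:\; |V(G)|=n,\ |E(G)|=\left\lfloor p\binom{n}{2}\right\rfloor\right\}.
\]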
Let $\mathcal{H}$ be the property of forbidding an Erdős–Rényi random graph $F\sim \mathbb{G}(n_0,p_0)$, and let $\varphi$ represent the golden ratio. In this paper, we show that if $p_0\in [1-1/\varphi,1/\varphi]$, then a.a.s. as $n_0\to\infty$,
Moreover, this holds for $p\in [1/3,2/3]$ for any $p_0\in (0,1)$.
A primary tool in the proof is the categorization of $p$-core coloured regularity graphs in the range $p\in[1-1/\varphi,1/\varphi]$. Such coloured regularity graphs must have the property that the non-grey edges form vertex-disjoint cliques.
Consumption of pork and pork products can be associated with outbreaks of human salmonellosis. Salmonella infection is usually subclinical in pigs, and farm-based control measures are challenging to implement. To obtain data on Salmonella prevalence, samples can be collected from pigs during the slaughter process. Here we report the results of a Great Britain (GB) based abattoir survey conducted by sampling caecal contents from pigs in nine British pig abattoirs during 2019. Samples were collected according to a randomised stratified scheme, and pigs originating from 286 GB farms were included in this survey. Salmonella was isolated from 112 pig caecal samples, a prevalence of 32.2% [95% confidence interval (CI) 27.4–37.4]. This did not differ significantly from the overall prevalence estimate (30.5%, 95% CI 26.5–34.6) obtained in the last abattoir survey conducted in the UK (2013). Twelve different Salmonella serovars were isolated, the most common being S. 4,[5],12:i:-, a monophasic variant of Salmonella Typhimurium (36.6% of Salmonella-positive samples), followed by S. Derby (25.9% of Salmonella-positive samples). Abattoir-based control measures are often effective in reducing Salmonella contamination of carcasses entering the food chain; in this study, however, the effect of abattoir hygiene practices on the prevalence of Salmonella on carcasses was not assessed. Continuing Salmonella surveillance at slaughter is recommended to assess the effects of farm-based and abattoir-based interventions and to monitor the potential public health risk associated with consumption of Salmonella-contaminated pork products.
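The prevalence interval can be roughly reproduced from the reported counts. The sketch below assumes a total sample size of about 348 (inferred from 112/0.322; the survey does not state this figure here) and a Wilson interval, whereas the survey's exact CI method may differ.

```python
from statsmodels.stats.proportion import proportion_confint

positive, total = 112, 348  # total inferred, not reported: an assumption
low, high = proportion_confint(positive, total, alpha=0.05, method="wilson")
print(f"prevalence = {positive/total:.1%}, 95% CI {low:.1%}-{high:.1%}")
```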
In this paper we consider the one-dimensional, biased, randomly trapped random walk with infinite-variance trapping times. We prove sufficient conditions for the suitably scaled walk to converge to a transformation of a stable Lévy process. As our main motivation, we apply subsequential versions of our results to biased walks on subcritical Galton–Watson trees conditioned to survive. This confirms the correct order of the fluctuations of the walk around its speed for values of the bias that yield a non-Gaussian regime.
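A minimal simulation of the model being studied, with invented parameters: a biased nearest-neighbour walk on the integers where each site visit incurs a heavy-tailed holding time with tail exponent $\alpha<1$ (infinite mean, hence infinite variance), the regime in which stable-Lévy scaling limits arise.

```python
import numpy as np

rng = np.random.default_rng(0)

def trapped_walk(n_steps, bias=0.6, alpha=0.5):
    """Biased randomly trapped random walk: each visited site holds the
    walker for a Pareto(alpha) trapping time before the next jump."""
    pos, t, path = 0, 0.0, []
    for _ in range(n_steps):
        t += rng.pareto(alpha) + 1.0              # trapping time at this site
        pos += 1 if rng.random() < bias else -1   # biased step
        path.append((t, pos))
    return np.array(path)

path = trapped_walk(10_000)
print("final time and position:", path[-1])
```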
Motivated by the problem of variance allocation for the sum of dependent random variables, Colini-Baldeschi, Scarsini and Vaccari (2018) recently introduced Shapley values for variance and standard deviation games. These Shapley values satisfy desirable properties that make them a natural criterion for allocating the variance and the standard deviation of the sum of dependent random variables. However, since Shapley values are in general computationally demanding, Colini-Baldeschi, Scarsini and Vaccari also formulated a conjecture about the relation between the Shapley values of the two games, which they proved for the case of two dependent random variables. In this work we prove that their conjecture holds true for an arbitrary number of independent random variables but, at the same time, we provide counterexamples to the conjecture for the case of three dependent random variables.
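To illustrate the object in question, here is an exact brute-force computation of the Shapley values of the variance game $v(S)=\mathrm{Var}(\sum_{i\in S}X_i)$ for a small covariance matrix (the matrix is invented). Full enumeration is exponential in $n$, which is precisely why Shapley values are computationally demanding in general.

```python
import numpy as np
from itertools import combinations
from math import factorial

def shapley_variance(cov):
    """Exact Shapley values of the variance game by enumerating coalitions."""
    n = cov.shape[0]
    def v(S):
        idx = list(S)
        return cov[np.ix_(idx, idx)].sum() if idx else 0.0
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            w = factorial(k) * factorial(n - k - 1) / factorial(n)
            for S in combinations(others, k):
                phi[i] += w * (v(S + (i,)) - v(S))
    return phi

cov = np.array([[1.0, 0.3, 0.0],
                [0.3, 2.0, -0.4],
                [0.0, -0.4, 1.5]])
phi = shapley_variance(cov)
print(phi, phi.sum(), cov.sum())  # efficiency: allocations sum to Var(total)
```

For the variance game the enumeration reproduces the closed form $\phi_i=\mathrm{Cov}(X_i,\sum_j X_j)$, i.e. the row sums of the covariance matrix; the standard deviation game has no such simple form, which is what makes the conjectured relation between the two games interesting.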
This paper considers logarithmic asymptotics of tails of randomly stopped sums. The stopping is assumed to be independent of the underlying random walk. First, finiteness of ordinary moments is revisited. Then the study is expanded to more general asymptotic analysis. Results are applicable to a large class of heavy-tailed random variables. The main result enables one to identify if the asymptotic behaviour of a stopped sum is dominated by its increments or the stopping variable. As a consequence, new sufficient conditions for the moment determinacy of compounded sums are obtained.
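A rough numerical illustration of the dominance dichotomy, under assumed parameters: with Pareto increments of tail exponent $\alpha=1.5$ and an independent geometric (hence light-tailed) stopping time, the increments dominate, and the tail of the stopped sum is approximately $\mathbb{E}[N]\,\mathbb{P}(X_1>x)$ for large $x$.

```python
import numpy as np

rng = np.random.default_rng(2)

alpha, p, n_samples, x = 1.5, 0.05, 50_000, 200.0
N = rng.geometric(p, n_samples)                  # light-tailed stopping time
S = np.array([rng.pareto(alpha, n).sum() for n in N])

empirical = (S > x).mean()
predicted = (1 / p) * (1 + x) ** (-alpha)        # E[N] * Lomax(alpha) tail
print(empirical, predicted)                      # same order of magnitude
```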
It is well known that stationary geometrically ergodic Markov chains are $\beta$-mixing (absolutely regular) with geometrically decaying mixing coefficients. Furthermore, for initial distributions other than the stationary one, geometric ergodicity implies $\beta$-mixing under suitable moment assumptions. In this note we show that similar results hold also for subgeometrically ergodic Markov chains. In particular, for both stationary and other initial distributions, subgeometric ergodicity implies $\beta$-mixing with subgeometrically decaying mixing coefficients. Although this result is simple, it should prove very useful in obtaining rates of mixing in situations where geometric ergodicity cannot be established. To illustrate our results we derive new subgeometric ergodicity and $\beta$-mixing results for the self-exciting threshold autoregressive model.
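For reference, a minimal simulation of the kind of self-exciting threshold autoregressive (SETAR) model to which the new subgeometric ergodicity and $\beta$-mixing results apply; the two-regime AR(1) specification and coefficients below are an illustrative choice, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(3)

def setar(n, a_low=0.9, a_high=0.3, threshold=0.0, sigma=1.0):
    """Two-regime SETAR(1): X_t = a_low*X_{t-1} + eps_t if X_{t-1} <= threshold,
    else a_high*X_{t-1} + eps_t. The regime depends on the process itself,
    hence 'self-exciting'."""
    x = np.zeros(n)
    for t in range(1, n):
        a = a_low if x[t - 1] <= threshold else a_high
        x[t] = a * x[t - 1] + sigma * rng.standard_normal()
    return x

x = setar(10_000)
print(x.mean(), x.std())
```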
Queueing networks are stochastic systems formed by interconnected resources routing and serving jobs. They induce jump processes with distinctive properties, and find widespread use in inferential tasks. Here, service rates for jobs and potential bottlenecks in the routing mechanism must be estimated from a reduced set of observations. However, this calls for the derivation of complex conditional density representations, over both the stochastic network trajectories and the rates, which is considered an intractable problem. Numerical simulation procedures designed for this purpose do not scale, because of high computational costs; furthermore, variational approaches relying on approximating measures and full independence assumptions are unsuitable. In this paper, we offer a probabilistic interpretation of variational methods applied to inference tasks with queueing networks, and show that approximating measure choices routinely used with jump processes yield ill-defined optimization problems. Yet we demonstrate that it is still possible to enable a variational inferential task, by considering a novel space expansion treatment over an analogous counting process for job transitions. We present and compare exemplary use cases with practical queueing networks, showing that our framework offers an efficient and improved alternative where existing variational or numerically intensive solutions fail.
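To fix ideas about the jump processes involved, here is a Gillespie-style simulation of a two-station tandem queue (invented rates, probability-1 routing). In the inference problem the paper addresses, only a reduced set of observations of such trajectories would be available.

```python
import numpy as np

rng = np.random.default_rng(4)

def simulate_tandem(T, lam=1.0, mu=(1.5, 1.2)):
    """Jump-process simulation of a two-station tandem queue: Poisson(lam)
    arrivals, exponential services at rates mu[0] and mu[1], jobs routed
    from station 1 to station 2."""
    t, n1, n2, events = 0.0, 0, 0, []
    while t < T:
        rates = np.array([lam, mu[0] * (n1 > 0), mu[1] * (n2 > 0)])
        total = rates.sum()
        t += rng.exponential(1 / total)
        which = rng.choice(3, p=rates / total)
        if which == 0:
            n1 += 1                  # external arrival
        elif which == 1:
            n1 -= 1; n2 += 1         # service at station 1, route to 2
        else:
            n2 -= 1                  # departure from station 2
        events.append((t, n1, n2))
    return events

events = simulate_tandem(100.0)
print(len(events), events[-1])
```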
The logistic birth and death process is perhaps the simplest stochastic population model that has both density-dependent reproduction and a phase transition, and a lot can be learned about the process by studying its extinction time, $\tau_n$, as a function of system size n. A number of existing results describe the scaling of $\tau_n$ as $n\to\infty$ for various choices of reproductive rate $r_n$ and initial population $X_n(0)$ as a function of n. We collect and complete this picture, obtaining a complete classification of all sequences $(r_n)$ and $(X_n(0))$ for which there exist rescaling parameters $(s_n)$ and $(t_n)$ such that $(\tau_n-t_n)/s_n$ converges in distribution as $n\to\infty$, and identifying the limits in each case.
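A minimal Gillespie simulation of the extinction time under one common parametrization of the logistic birth-death chain (birth rate $rX(1-X/n)$, death rate $X$; the paper's parametrization may differ). The subcritical rate $r<1$ below keeps extinction fast; in the supercritical regime $\tau_n$ grows exponentially in $n$.

```python
import numpy as np

rng = np.random.default_rng(5)

def extinction_time(n, r=0.8, x0=None):
    """Simulate the logistic birth-death chain until absorption at 0
    and return the extinction time tau_n."""
    x = n if x0 is None else x0
    t = 0.0
    while x > 0:
        birth = r * x * max(1.0 - x / n, 0.0)
        death = float(x)
        total = birth + death
        t += rng.exponential(1 / total)
        x += 1 if rng.random() < birth / total else -1
    return t

times = [extinction_time(50) for _ in range(20)]
print(np.mean(times))
```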
We reduce the upper bound for the bond percolation threshold of the cubic lattice from 0.447 792 to 0.347 297. The bound is obtained by a growth process approach which views the open cluster of a bond percolation model as a dynamic process. A three-dimensional dynamic process on the cubic lattice is constructed and then projected onto a carefully chosen plane to obtain a two-dimensional dynamic process on a triangular lattice. We compare the bond percolation models on the cubic lattice and their projections, and demonstrate that the bond percolation threshold of the cubic lattice is no greater than that of the triangular lattice. Applying the approach to the body-centered cubic lattice yields an upper bound of 0.292 893 for its bond percolation threshold.
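The "growth process" view can be illustrated by a finite-size Monte Carlo sketch (this is only a heuristic crossing check on a box, not the paper's rigorous projection argument): grow the open cluster of the origin on the cubic lattice, opening each examined bond with probability $p$, and record whether the cluster escapes the box.

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(6)

def cluster_reaches_boundary(p, L=20):
    """BFS growth of the origin's open cluster on the L^3 box; each bond to
    an unvisited site is opened independently with probability p."""
    start = (L // 2, L // 2, L // 2)
    seen, queue = {start}, deque([start])
    steps = [(1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)]
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in steps:
            nb = (x + dx, y + dy, z + dz)
            if min(nb) < 0 or max(nb) >= L:
                return True            # cluster escaped the box
            if nb not in seen and rng.random() < p:
                seen.add(nb)
                queue.append(nb)
    return False

for p in (0.2, 0.25, 0.3, 0.35):
    hits = sum(cluster_reaches_boundary(p) for _ in range(200))
    print(p, hits / 200)   # crossing frequency rises sharply near threshold
```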
In this paper we analyse the limiting conditional distribution (Yaglom limit) for stochastic fluid models (SFMs), a key class of models in the theory of matrix-analytic methods. So far, only transient and stationary analyses of SFMs have been considered in the literature. The limiting conditional distribution gives useful insights into what happens when the process has been evolving for a long time, given that its busy period has not yet ended. We derive expressions for the Yaglom limit in terms of the singularity $s^*$ such that the key matrix of the SFM, ${\boldsymbol{\Psi}}(s)$, is finite (exists) for all $s\geq s^*$ and infinite for $s<s^*$. We show the uniqueness of the Yaglom limit and illustrate the application of the theory with simple examples.
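For readers unfamiliar with SFMs, a minimal simulation of the model class (with an invented two-phase generator and rates): a continuous-time Markov phase process modulates the linear drift of a fluid level that is reflected at zero; the busy period is the time until the level returns to zero.

```python
import numpy as np

rng = np.random.default_rng(7)

def simulate_sfm(T, Q, r, phase0=0, x0=0.0):
    """Stochastic fluid model: CTMC phases with generator Q drive the
    fluid level at rate r[phase]; the level is floored at 0."""
    t, phase, x, path = 0.0, phase0, x0, [(0.0, x0)]
    while t < T:
        hold = rng.exponential(-1 / Q[phase, phase])
        x = max(x + r[phase] * hold, 0.0)   # constant drift within a phase
        t += hold
        probs = Q[phase].copy(); probs[phase] = 0.0
        phase = rng.choice(len(r), p=probs / probs.sum())
        path.append((t, x))
    return path

Q = np.array([[-1.0, 1.0], [2.0, -2.0]])   # two-phase generator
r = np.array([1.0, -3.0])                  # fill in phase 0, drain in phase 1
print(simulate_sfm(10.0, Q, r)[-1])
```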
In this note, we consider dynamic assortment optimization with incomplete information under the capacitated multinomial logit choice model. Recently, it has been shown that the regret (the cumulative expected revenue loss caused by offering suboptimal assortments) that any decision policy endures is bounded from below by a constant times $\sqrt{NT}$, where $N$ denotes the number of products and $T$ denotes the time horizon. This result is shown under the assumption that the product revenues are constant, and thus leaves open the question of whether a lower regret rate can be achieved for nonconstant revenue parameters. In this note, we show that this is not the case: for any vector of product revenues there is a positive constant such that the regret of any policy is bounded from below by this constant times $\sqrt{NT}$. Our result implies that policies that achieve $\mathcal{O}(\sqrt{NT})$ regret are asymptotically optimal for all product revenue parameters.
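The choice model underlying the regret bound is standard multinomial logit: a customer offered assortment $S$ buys product $i\in S$ with probability $v_i/(1+\sum_{j\in S}v_j)$ and buys nothing with the remaining probability. A small sketch with invented preference weights and revenues:

```python
import numpy as np

def mnl_choice_probs(v, S):
    """MNL purchase probabilities for assortment S (indices into the
    product set) and the no-purchase probability."""
    w = np.asarray([v[i] for i in S])
    denom = 1.0 + w.sum()
    return w / denom, 1.0 / denom

def expected_revenue(v, r, S):
    probs, _ = mnl_choice_probs(v, S)
    return sum(p * r[i] for p, i in zip(probs, S))

v = np.array([1.0, 0.8, 0.5, 0.3])   # preference weights (unknown in practice)
r = np.array([1.0, 1.2, 2.0, 2.5])   # product revenues
print(expected_revenue(v, r, [0, 2]), expected_revenue(v, r, [2, 3]))
```

The regret at each period is the gap between the expected revenue of the optimal assortment (under the true, unknown $v$) and of the assortment actually offered; the note's result is that this cumulative gap grows at least like $\sqrt{NT}$ whatever the revenue vector $r$.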
We study propagation of avalanches in a certain excitable network. The model is a particular case of the one introduced by Larremore et al. (Phys. Rev. E, 2012) and is mathematically equivalent to an endemic variation of the Reed–Frost epidemic model introduced by Longini (Math. Biosci., 1980). Two types of heuristic approximation are frequently used for models of this type in applications: a branching process for avalanches of a small size at the beginning of the process and a deterministic dynamical system once the avalanche spreads to a significant fraction of a large network. In this paper we prove several results concerning the exact relation between the avalanche model and these limits, including rates of convergence and rigorous bounds for common characteristics of the model.
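A minimal simulation of Reed–Frost-type avalanche dynamics with invented parameters, together with the branching-process check: in the subcritical regime with mean offspring $k<1$, the expected avalanche size should be close to $1/(1-k)$.

```python
import numpy as np

rng = np.random.default_rng(8)

def avalanche_size(n, k=0.8, seed_nodes=1):
    """One avalanche on an n-node network in which each active node
    activates each currently inactive node independently with probability
    p = k/n (Reed-Frost dynamics); nodes fire at most once."""
    p = k / n
    susceptible, active, size = n - seed_nodes, seed_nodes, seed_nodes
    while active > 0 and susceptible > 0:
        # each susceptible node escapes all current activation attempts
        newly = rng.binomial(susceptible, 1 - (1 - p) ** active)
        susceptible -= newly
        size += newly
        active = newly
    return size

sizes = np.array([avalanche_size(10_000, k=0.8) for _ in range(2000)])
print(sizes.mean())   # branching approximation predicts 1/(1-0.8) = 5
```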
Oscillatory systems of interacting Hawkes processes with Erlang memory kernels were introduced by Ditlevsen and Löcherbach (Stoch. Process. Appl., 2017). They are piecewise deterministic Markov processes (PDMPs) and can be approximated by a stochastic diffusion. In this paper, first, a strong error bound between the PDMP and the diffusion is proved. Second, moment bounds for the resulting diffusion are derived. Third, approximation schemes for the diffusion, based on the numerical splitting approach, are proposed. These schemes are proved to converge with mean-square order 1 and to preserve the properties of the diffusion, in particular the hypoellipticity, the ergodicity, and the moment bounds. Finally, the PDMP and the diffusion are compared through numerical experiments, where the PDMP is simulated with an adapted thinning procedure.
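For illustration, here is the basic thinning (Ogata-style) simulation of a single Hawkes process with an exponential kernel, the simplest Erlang kernel (shape 1); the paper's adapted procedure handles interacting populations with general Erlang kernels, and the parameters below are invented.

```python
import numpy as np

rng = np.random.default_rng(9)

def hawkes_thinning(T, mu=1.0, a=0.5, b=1.0):
    """Thinning simulation of a Hawkes process with intensity
    lambda(t) = mu + sum_{t_i < t} a * exp(-b (t - t_i)). Because the
    kernel decays, the intensity at the current time bounds the intensity
    until the next event, so it serves as the thinning envelope."""
    t, events = 0.0, []
    while t < T:
        lam_bar = mu + sum(a * np.exp(-b * (t - s)) for s in events)
        t += rng.exponential(1 / lam_bar)        # candidate event time
        lam_t = mu + sum(a * np.exp(-b * (t - s)) for s in events)
        if rng.random() <= lam_t / lam_bar:      # accept w.p. lam(t)/lam_bar
            events.append(t)
    return events

ev = hawkes_thinning(50.0)
print(len(ev), "events; stationarity requires branching ratio a/b < 1")
```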