Deep neural networks and other modern machine learning models are often susceptible to adversarial attacks. Indeed, an adversary may often be able to change a model’s prediction through a small, directed perturbation of the model’s input – an issue in safety-critical applications. Adversarially robust machine learning is usually based on a minmax optimisation problem that minimises the machine learning loss under maximisation-based adversarial attacks. In this work, we study adversaries that determine their attack using a Bayesian statistical approach rather than maximisation. The resulting Bayesian adversarial robustness problem is a relaxation of the usual minmax problem. To solve this problem, we propose Abram – a continuous-time particle system that approximates the gradient flow corresponding to the underlying learning problem. We show that Abram approximates a McKean–Vlasov process and justify the use of Abram by giving assumptions under which the McKean–Vlasov process finds the minimiser of the Bayesian adversarial robustness problem. We discuss two ways to discretise Abram and show its suitability in benchmark adversarial deep learning experiments.
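The paper defines Abram precisely; purely as a rough illustration of the idea, the following numpy sketch represents the Bayesian adversary by Langevin particles sampling a Gibbs measure over input perturbations, while the learner descends the particle-averaged loss. The toy linear model, the temperature `eta`, the step sizes, and the norm clipping are illustrative assumptions, not the authors' construction.

```python
# A minimal sketch (not the authors' Abram implementation): the Bayesian
# adversary is a cloud of Langevin particles targeting exp(eta * loss) over
# perturbations; the learner takes gradient steps on the particle-averaged
# loss. Model, step sizes, temperature, and clipping are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def adv_train_step(theta, x, y, deltas, eta=5.0, h=0.01, lr=0.05, eps=0.1):
    for i, d in enumerate(deltas):
        # Langevin ascent on the loss of the perturbed input x + d
        g = ((x + d) @ theta - y) * theta   # d/d(delta) of 0.5*((x+d)@theta - y)^2
        d = d + h * eta * g + np.sqrt(2 * h) * rng.standard_normal(d.shape)
        n = np.linalg.norm(d)
        deltas[i] = d if n <= eps else d * (eps / n)   # keep the attack small
    # learner: descend the loss averaged over the adversary's particles
    g_theta = np.mean([((x + d) @ theta - y) * (x + d) for d in deltas], axis=0)
    return theta - lr * g_theta, deltas

x, y = np.array([1.0, -2.0]), 0.5                      # one toy data point
theta = np.zeros(2)
deltas = [0.01 * rng.standard_normal(2) for _ in range(20)]
for _ in range(200):
    theta, deltas = adv_train_step(theta, x, y, deltas)
```

Replacing the inner maximisation by particle averaging mirrors the relaxation described above: the attack is a posterior-style sample rather than a worst case.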
Motivated by recent developments of quasi-stationary Monte Carlo methods, we investigate the stability of quasi-stationary distributions of killed Markov processes under perturbations of the generator. We first consider a general bounded self-adjoint perturbation operator, and then study a particular unbounded perturbation corresponding to truncation of the killing rate. In both scenarios, we quantify the difference between the eigenfunctions corresponding to the smallest eigenvalue of the perturbed and unperturbed generators in a Hilbert space norm. As a consequence, $\mathcal{L}^1$-norm estimates of the difference of the resulting quasi-stationary distributions in terms of the perturbation are provided.
We are concerned with the micro-macro Parareal algorithm for the simulation of initial-value problems. In this algorithm, a coarse (fast) solver is applied sequentially over the time domain and a fine (time-consuming) solver is applied as a corrector in parallel over smaller chunks of the time interval. Moreover, the coarse solver acts on a reduced state variable, which is coupled with the fine state variable through appropriate coupling operators. We first provide a contribution to the convergence analysis of the micro-macro Parareal method for multiscale linear ordinary differential equations. Then, we extend a variant of the micro-macro Parareal algorithm for scalar stochastic differential equations (SDEs) to higher-dimensional SDEs.
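As a point of reference for the algorithm being analysed, here is a hedged sketch of the plain Parareal correction for an ODE; the micro-macro variant additionally couples a reduced coarse state to the full fine state through restriction and lifting operators, which are omitted here. The test equation, both Euler solvers, and all step counts are illustrative assumptions.

```python
# Plain Parareal sketch: a cheap sequential coarse sweep is corrected using
# expensive fine solves that can run in parallel:
#   U_{n+1} <- G(U_n) + F(U_n^old) - G(U_n^old).
import numpy as np

def f(u):                                   # toy linear ODE u' = -5u
    return -5.0 * u

def coarse(u, t0, t1):                      # one explicit Euler step
    return u + (t1 - t0) * f(u)

def fine(u, t0, t1, m=100):                 # many small Euler steps
    dt = (t1 - t0) / m
    for _ in range(m):
        u = u + dt * f(u)
    return u

def parareal(u0, T=1.0, n=10, iters=5):
    ts = np.linspace(0.0, T, n + 1)
    U = [u0]
    for k in range(n):                      # initial sequential coarse sweep
        U.append(coarse(U[k], ts[k], ts[k + 1]))
    for _ in range(iters):
        F = [fine(U[k], ts[k], ts[k + 1]) for k in range(n)]   # parallelisable
        V = [u0]
        for k in range(n):                  # sequential coarse correction
            V.append(coarse(V[k], ts[k], ts[k + 1])
                     + F[k] - coarse(U[k], ts[k], ts[k + 1]))
        U = V
    return U

print(parareal(1.0)[-1])                    # reference: exp(-5) ≈ 0.006738
```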
We study the Markov chain Monte Carlo estimator for numerical integration of functions that need not be square integrable with respect to the invariant distribution. For chains with a spectral gap we show that the absolute mean error for $L^p$ functions, with $p \in (1,2)$, decreases like $n^{(1/p)-1}$, which is known to be the optimal rate. This improves currently known results in which an additional parameter $\delta > 0$ appears and the convergence is of order $n^{((1+\delta)/p)-1}$.
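In the notation this suggests (with $\pi$ the invariant distribution and $S_n$ the ergodic average, both assumed here), the stated improvement reads
\[
  S_n(f) = \frac{1}{n}\sum_{i=1}^{n} f(X_i), \qquad
  \mathbb{E}\bigl|S_n(f) - \pi(f)\bigr| \;\le\; C\,\|f\|_{L^p(\pi)}\, n^{(1/p)-1},
  \qquad p \in (1,2),
\]
whereas the previously known bounds carried the extra $\delta$ in the exponent, $n^{((1+\delta)/p)-1}$.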
When implementing Markov chain Monte Carlo (MCMC) algorithms, perturbation caused by numerical errors is sometimes inevitable. This paper studies how the perturbation of MCMC affects the convergence speed and approximation accuracy. Our results show that when the original Markov chain converges to stationarity fast enough and the perturbed transition kernel is a good approximation to the original transition kernel, the corresponding perturbed sampler also has fast convergence speed and high approximation accuracy. Our convergence analysis is conducted under either the Wasserstein metric or the $\chi^2$ metric, both of which are widely used in the literature. The results can be extended to obtain non-asymptotic error bounds for MCMC estimators. We demonstrate how to apply our convergence and approximation results to the analysis of specific sampling algorithms, including the random walk Metropolis algorithm, the Metropolis-adjusted Langevin algorithm with perturbed target densities, and parallel tempering Monte Carlo with perturbed densities. Finally, we present some simple numerical examples to verify our theoretical claims.
In this work, we study early warning signs for stochastic partial differential equations (SPDEs) whose linearisation around a steady state is characterised by continuous spectrum. The studied warning sign takes the form of qualitative changes in the variance as a deterministic bifurcation threshold is approached via parameter variation. Specifically, we focus on the scaling law of the variance near the transition. Since we are dealing here, in contrast to previous studies, with the case of continuous spectrum and quantitative scaling laws, it is natural to start with linearisations of the drift operator that are multiplication operators defined by analytic functions. For a one-dimensional spatial domain, we obtain precise rates of divergence. In the case of two- and three-dimensional domains, an upper bound on the rate of the early warning sign is proven. These results are cross-validated by numerical simulations. Our theory can be generically useful for several applications in which stochastic and spatial aspects are important in combination with continuous-spectrum bifurcations.
A numerical method is proposed for a class of one-dimensional stochastic control problems with unbounded state space. This method solves an infinite-dimensional linear program, equivalent to the original formulation based on a stochastic differential equation, using a finite element approximation. The discretization scheme itself and the necessary assumptions are discussed, and a convergence argument for the method is presented. Its performance is illustrated by examples featuring long-term average and infinite horizon discounted costs, and additional optimization constraints.
We present and study a novel algorithm for the computation of 2-Wasserstein population barycenters of absolutely continuous probability measures on Euclidean space. The proposed method can be seen as a stochastic gradient descent procedure in the 2-Wasserstein space, as well as a manifestation of a law of large numbers therein. The algorithm aims to find a Karcher mean or critical point in this setting, and can be implemented ‘online’, sequentially using independent and identically distributed random measures sampled from the population law. We provide natural sufficient conditions for this algorithm to almost surely converge in the Wasserstein space towards the population barycenter, and we introduce a novel, general condition which ensures uniqueness of Karcher means and, moreover, allows us to obtain explicit, parametric convergence rates for the expected optimality gap. We also study the mini-batch version of this algorithm, and discuss examples of families of population laws to which our method and results can be applied. This work expands and deepens ideas and results introduced in an early version of Backhoff-Veraguas et al. (2022), in which a statistical application (and numerical implementation) of this method is developed in the context of Bayesian learning.
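In one dimension the algorithm is particularly transparent, because optimal transport maps are compositions of quantile functions and the update $\mu_{k+1} = \bigl((1-\gamma_k)\,\mathrm{id} + \gamma_k T_{\mu_k}^{\nu_k}\bigr)_{\#}\mu_k$ becomes a convex combination of quantile functions. The following sketch is a hedged illustration under that assumption; the Gaussian population law, grid, and step sizes are ours, not the paper's experimental setup.

```python
# 1-D sketch of SGD in Wasserstein space: measures are stored via their
# quantile functions on a grid, so the geodesic step towards a freshly
# sampled measure is a convex combination of quantile functions.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
qs = np.linspace(0.005, 0.995, 200)            # quantile grid

def sample_measure():                          # i.i.d. draw from the population law
    m, s = rng.normal(0.0, 1.0), np.exp(rng.normal(0.0, 0.3))
    return norm.ppf(qs, loc=m, scale=s)        # quantile function of N(m, s^2)

mu = sample_measure()                          # initial guess
for k in range(1, 501):
    nu = sample_measure()
    gamma = 1.0 / k                            # Robbins-Monro step size
    mu = (1.0 - gamma) * mu + gamma * nu       # W2 geodesic step towards nu
```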
In this manuscript, we address open questions raised by Dieker and Yakir (2014), who proposed a novel method of estimating (discrete) Pickands constants $\mathcal{H}^\delta_\alpha$ using a family of estimators $\xi^\delta_\alpha(T)$, $T>0$, where $\alpha\in(0,2]$ is the Hurst parameter and $\delta\geq0$ is the step size of the regular discretization grid. We derive an upper bound for the discretization error $\mathcal{H}_\alpha^0 - \mathcal{H}_\alpha^\delta$, whose rate of convergence agrees with Conjecture 1 of Dieker and Yakir (2014) in the case $\alpha\in(0,1]$ and agrees up to logarithmic terms for $\alpha\in(1,2)$. Moreover, we show that all moments of $\xi_\alpha^\delta(T)$ are uniformly bounded and that the bias of the estimator decays no slower than $\exp\{-\mathcal{C}T^{\alpha}\}$ as $T$ becomes large.
Continuous-time Markov chains are frequently used to model the stochastic dynamics of (bio)chemical reaction networks. However, except in very special cases, they cannot be analyzed exactly. Additionally, simulation can be computationally intensive. An approach to address these challenges is to consider a more tractable diffusion approximation. Leite and Williams (Ann. Appl. Prob. 29, 2019) proposed a reflected diffusion as an approximation for (bio)chemical reaction networks, which they called the constrained Langevin approximation (CLA), as it extends the usual Langevin approximation beyond the first time some chemical species becomes zero in number. Further explanation and examples of the CLA can be found in Anderson et al. (SIAM Multiscale Modeling Simul. 17, 2019).
In this paper, we extend the approximation of Leite and Williams to (nearly) density-dependent Markov chains, as a first step towards obtaining error estimates for the CLA when the diffusion state space is one-dimensional, and we provide a bound for the error in a strong approximation. We discuss some applications to chemical reaction networks and epidemic models, and illustrate these with examples. Our method of proof is designed to generalize to higher dimensions, provided there is a Lipschitz Skorokhod map defining the reflected diffusion process. The existence of such a Lipschitz map is an open problem in dimensions greater than one.
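As a hedged one-dimensional illustration of the kind of object being approximated (not the paper's construction), the following Euler scheme simulates a Langevin-type diffusion for a toy birth-death network and reflects the state at zero, a standard 1-D surrogate for the Skorokhod reflection; the rates are illustrative.

```python
# 1-D reflected Euler sketch of a constrained-Langevin-type path for a
# birth-death network with rates birth and death*x; reflection at {0}
# keeps the approximate species count nonnegative.
import numpy as np

rng = np.random.default_rng(2)

def cla_path(x0=1.0, T=10.0, dt=1e-3, birth=1.0, death=1.2):
    n = int(T / dt)
    x = np.empty(n + 1); x[0] = x0
    for k in range(n):
        drift = birth - death * x[k]                     # mean reaction flux
        sigma = np.sqrt(max(birth + death * x[k], 0.0))  # fluctuation size
        step = x[k] + drift * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        x[k + 1] = abs(step)                             # reflect at the boundary {0}
    return x
```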
It is known that the simple slice sampler has robust convergence properties; however, the class of problems for which it can be implemented is limited. In contrast, we consider hybrid slice samplers, which are easily implementable and in which another Markov chain approximately samples the uniform distribution on each slice. Under appropriate assumptions on the Markov chain on the slice, we give lower and upper bounds on the spectral gap of the hybrid slice sampler in terms of the spectral gap of the simple slice sampler. An immediate consequence is that the spectral gap and geometric ergodicity of the hybrid slice sampler can be concluded from the spectral gap and geometric ergodicity of the simple version, which is very well understood. These results indicate that the robustness properties of the simple slice sampler are inherited by (appropriately designed) easily implementable hybrid versions. We apply the developed theory and analyze a number of specific algorithms, such as stepping-out and shrinkage slice sampling, hit-and-run slice sampling on a class of multivariate targets, and an easily implementable combination of both procedures on multidimensional bimodal densities.
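For concreteness, here is a hedged sketch of the simple slice sampler combined with the stepping-out and shrinkage procedures analysed in the paper (following Neal's 2003 formulation); the target and tuning constants are illustrative.

```python
# Slice sampling with stepping-out and shrinkage: draw a level under the
# density, grow an interval to cover the slice, then shrink it towards
# the current point until a point on the slice is found.
import numpy as np

rng = np.random.default_rng(3)

def slice_step(x, logpdf, w=1.0, max_steps=50):
    logy = logpdf(x) + np.log(rng.uniform())   # slice level: log u + log pi(x)
    l = x - w * rng.uniform(); r = l + w       # randomly placed initial interval
    for _ in range(max_steps):                 # stepping out to the left
        if logpdf(l) <= logy: break
        l -= w
    for _ in range(max_steps):                 # stepping out to the right
        if logpdf(r) <= logy: break
        r += w
    while True:                                # shrinkage
        x1 = rng.uniform(l, r)
        if logpdf(x1) > logy:
            return x1
        if x1 < x: l = x1
        else:      r = x1

logpdf = lambda x: -0.5 * x * x                # standard normal target
xs = [0.0]
for _ in range(5000):
    xs.append(slice_step(xs[-1], logpdf))
```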
Qu, Dassios, and Zhao (2021) suggested an exact simulation method for tempered stable Ornstein–Uhlenbeck processes, but their algorithms contain some errors. This short note aims to correct their algorithms and conduct some numerical experiments.
In this paper we consider the filtering of partially observed multidimensional diffusion processes that are observed regularly at discrete times. This is a challenging problem which requires the use of advanced numerical schemes based upon time-discretization of the diffusion process and then the application of particle filters. Perhaps the state-of-the-art method for moderate-dimensional problems is the multilevel particle filter of Jasra et al. (SIAM J. Numer. Anal. 55 (2017), 3068–3096). This is a method that combines multilevel Monte Carlo and particle filters. The approach in that article is based intrinsically upon an Euler discretization method. We develop a new particle filter based upon the antithetic truncated Milstein scheme of Giles and Szpruch (Ann. Appl. Prob. 24 (2014), 1585–1620). We show empirically for a class of diffusion problems that, for a given $\epsilon>0$, the cost to produce a mean squared error (MSE) of $\mathcal{O}(\epsilon^2)$ in the estimation of the filter is $\mathcal{O}(\epsilon^{-2}\log(\epsilon)^2)$. In the case of multidimensional diffusions with non-constant diffusion coefficient, the method of Jasra et al. (2017) requires a cost of $\mathcal{O}(\epsilon^{-2.5})$ to achieve the same MSE.
There has been substantial interest in developing Markov chain Monte Carlo algorithms based on piecewise deterministic Markov processes. However, existing algorithms can only be used if the target distribution of interest is differentiable everywhere. The key to adapting these algorithms so that they can sample from densities with discontinuities is to define appropriate dynamics for the process when it hits a discontinuity. We present a simple condition for the transition of the process at a discontinuity which can be used to extend any existing sampler for smooth densities, and give specific choices for this transition which work with popular algorithms such as the bouncy particle sampler, the coordinate sampler, and the zigzag process. Our theoretical results extend and make rigorous arguments that have been presented previously, for instance constructing samplers for continuous densities restricted to a bounded domain, and we present a version of the zigzag process that can work in such a scenario. Our novel approach to deriving the invariant distribution of a piecewise deterministic Markov process with boundaries may be of independent interest.
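As a hedged illustration of the boundary behaviour discussed above, the sketch below runs a one-dimensional zigzag process for a standard normal target restricted to $[0,\infty)$: the velocity flips at rate $\max(0,\theta x)$ and is reflected whenever the boundary at zero is hit. The reflection rule is the natural choice suggested by the abstract, not necessarily the paper's exact transition.

```python
# 1-D zigzag on [0, inf) for the restricted standard normal: between events
# the state moves at unit speed; with theta = +1 the flip rate x + s is
# integrated and inverted exactly; with theta = -1 the rate vanishes until
# the boundary, where the velocity is reflected.
import numpy as np

rng = np.random.default_rng(4)

def zigzag_halfline(x=1.0, n_events=100_000):
    theta, t = 1.0, 0.0
    ts, xs = [0.0], [x]
    for _ in range(n_events):
        if theta > 0:
            E = rng.exponential()                  # solve x*tau + tau^2/2 = E
            tau = -x + np.sqrt(x * x + 2.0 * E)
            x, theta = x + tau, -1.0               # flip event
        else:
            tau = x                                # drift to the boundary ...
            x, theta = 0.0, 1.0                    # ... and reflect there
        t += tau
        ts.append(t); xs.append(x)
    return np.array(ts), np.array(xs)              # piecewise-linear trajectory
```

Time averages along the piecewise-linear trajectory (not just the event positions) then estimate expectations under the restricted target.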
We study three-stage service policy computation based on a two-stage game-theoretic problem, together with convolutional neural network (CNN) based algorithm design and simulation, for a blockchained buffering system with federated learning. More precisely, based on the game-theoretic problem consisting of both “win-lose” and “win-win” two-stage competitions, we derive a three-stage dynamic service policy via a saddle point of a zero-sum game problem and a Nash equilibrium point of a non-zero-sum game problem. The policy concerns user selection, dynamic pricing, and online rate-resource allocation via stable digital currency for the system. The main focus is on the design and analysis of the joint three-stage service policy for given queue/environment-state-dependent pricing and utility functions. The asymptotic optimality and fairness of this dynamic service policy are justified via diffusion modeling with approximation theory. A general CNN-based policy computing algorithm flow chart along the lines of the so-called big model framework is presented. Simulation case studies are conducted for a system with three users, where only two of the three users can be selected into service at a time by a zero-sum dual-cost game competition policy. The selected two users then enter service and share the system’s rate-service resource through a non-zero-sum dual-cost game competition policy. Applications of our policy in the future blockchain-based Internet (e.g., the metaverse and Web 3.0) and in supply chain finance are also briefly illustrated.
We develop a novel Monte Carlo algorithm for the vector consisting of the supremum, the time at which the supremum is attained, and the position at a given (constant) time of an exponentially tempered Lévy process. The algorithm, based on the increments of the process without tempering, converges geometrically fast (as a function of the computational cost) for discontinuous and locally Lipschitz functions of the vector. We prove that the corresponding multilevel Monte Carlo estimator has optimal computational complexity (i.e. of order $\varepsilon^{-2}$ if the mean squared error is at most $\varepsilon^2$) and provide its central limit theorem (CLT). Using the CLT we construct confidence intervals for barrier option prices and various risk measures based on drawdown under the tempered stable (CGMY) model calibrated/estimated on real-world data. We provide non-asymptotic and asymptotic comparisons of our algorithm with existing approximations, leading to rule-of-thumb principles guiding users to the best method for a given set of parameters. We illustrate the performance of the algorithm with numerical examples.
We propose a discrete-time discrete-space Markov chain approximation with a Brownian bridge correction for computing curvilinear boundary crossing probabilities of a general diffusion process on a finite time interval. For broad classes of curvilinear boundaries and diffusion processes, we prove the convergence of the constructed approximations, in the form of products of the respective substochastic matrices, to the boundary crossing probabilities for the process as the time grid used to construct the Markov chains is refined. Numerical results indicate that the convergence rate for the proposed approximation with the Brownian bridge correction is $O(n^{-2})$ in the case of $C^2$ boundaries and a uniform time grid with $n$ steps.
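A hedged sketch of the construction for the simplest case, standard Brownian motion and a constant upper boundary, is given below: transition probabilities on a space grid are multiplied by the Brownian bridge non-crossing probability $1-\exp(-2(g-x)(g-y)/\Delta t)$, and the crossing probability is read off the product of substochastic matrices. The grids and the example boundary are our assumptions.

```python
# Markov chain approximation with Brownian bridge correction for
# P(sup_{[0,T]} W_t crosses the boundary g): propagate a distribution with
# substochastic matrices; the mass lost is the crossing probability.
import numpy as np
from scipy.stats import norm

def crossing_prob(g, T=1.0, n=200, xmax=4.0, m=400):
    dt = T / n
    xs = np.linspace(-xmax, xmax, m)               # space grid
    dx = xs[1] - xs[0]
    p = np.zeros(m); p[np.argmin(np.abs(xs))] = 1.0  # point mass at 0
    for k in range(n):
        t0, t1 = k * dt, (k + 1) * dt
        # Gaussian transition probabilities x -> y over one time step
        P = norm.pdf((xs[None, :] - xs[:, None]) / np.sqrt(dt)) * dx / np.sqrt(dt)
        # bridge correction: stay below the (locally constant) boundary level
        gbar = min(g(t0), g(t1))
        corr = 1.0 - np.exp(-2.0 * np.maximum(gbar - xs[:, None], 0.0)
                            * np.maximum(gbar - xs[None, :], 0.0) / dt)
        P *= corr
        P[:, xs > g(t1)] = 0.0                     # kill mass above the boundary
        p = p @ P                                  # substochastic step
    return 1.0 - p.sum()                           # lost mass = crossing probability

print(crossing_prob(lambda t: 2.0))                # P(sup W > 2 on [0,1]) ≈ 0.0455
```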
We construct a class of non-reversible Metropolis kernels as a multivariate extension of the guided-walk kernel proposed by Gustafson (Statist. Comput. 8, 1998). The main idea of our method is to introduce a projection that maps the state space to a totally ordered group. By using the Haar measure, we construct a novel Markov kernel, termed the Haar mixture kernel, which is of interest in its own right. This is achieved by inducing a topological structure on the totally ordered group. Our proposed method, the $\Delta$-guided Metropolis–Haar kernel, is constructed by using the Haar mixture kernel as a proposal kernel. In terms of effective sample size per second, the proposed non-reversible kernel performs at least 10 times better than the random-walk Metropolis kernel and the Hamiltonian Monte Carlo kernel for logistic regression and for a discretely observed stochastic process.
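For orientation, here is a hedged sketch of Gustafson's original one-dimensional guided walk, which the paper generalises: a direction variable is carried along, proposals always move in the current direction, and the direction flips on rejection. The target and proposal scale are illustrative.

```python
# Gustafson's guided walk: non-reversible Metropolis in which a direction
# variable s persists across accepted moves and flips on rejection.
import numpy as np

rng = np.random.default_rng(5)

def guided_walk(logpdf, x0=0.0, n=10_000, scale=0.5):
    x, s = x0, 1.0                                 # state and direction
    out = np.empty(n)
    for i in range(n):
        y = x + s * abs(rng.normal(0.0, scale))    # always step in direction s
        if np.log(rng.uniform()) < logpdf(y) - logpdf(x):
            x = y                                  # accept, keep direction
        else:
            s = -s                                 # reject, flip direction
        out[i] = x
    return out

xs = guided_walk(lambda x: -0.5 * x * x)           # standard normal target
```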
The principle of maximum entropy is a well-known approach to producing a model for data-generating distributions. In this approach, if partial knowledge about the distribution is available in the form of a set of information constraints, then the model that maximizes entropy under these constraints is used for the inference. In this paper, we propose a new three-parameter lifetime distribution using the maximum entropy principle under constraints on the mean and a general index. We then present some statistical properties of the new distribution, including the hazard rate function, quantile function, moments, characterization, and stochastic ordering. We use the maximum likelihood estimation technique to estimate the model parameters. A Monte Carlo study is carried out to evaluate the performance of the estimation method. To illustrate the usefulness of the proposed model, we fit it to three real data sets and compare its relative performance with that of the beta generalized Weibull family.
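The generic principle behind the construction (with notation assumed here) is that maximising the differential entropy subject to moment constraints,
\[
  \max_{f}\; -\!\int f(x)\log f(x)\,dx
  \quad\text{subject to}\quad \int T_i(x)\,f(x)\,dx = c_i,\quad i=1,\dots,k,
\]
yields an exponential-family density
\[
  f(x) \;\propto\; \exp\Bigl(-\sum_{i=1}^{k}\lambda_i\,T_i(x)\Bigr),
\]
with the Lagrange multipliers $\lambda_i$ fixed by the constraints; the proposed lifetime distribution arises from constraints on the mean and a general index.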