In recent years, there has been significant interest in the effect of different types of adversarial perturbations in data classification problems. Many of these models incorporate the adversarial power, an important parameter with an associated trade-off between accuracy and robustness. This work considers a general framework for adversarially perturbed classification problems in a large-data or population-level limit. In such a regime, we demonstrate that, as the adversarial strength goes to zero, optimal classifiers converge to the Bayes classifier in the Hausdorff distance. This significantly strengthens previous results, which generally focus on $L^1$-type convergence. The main argument relies upon direct geometric comparisons and is inspired by techniques from geometric measure theory.
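For reference, the Hausdorff distance appearing here measures the worst-case deviation between two sets (notation ours, not the paper's): for nonempty compact $A, B \subset \mathbb{R}^d$,

$$d_H(A,B) \;=\; \max\Big\{ \sup_{a\in A}\inf_{b\in B}\|a-b\|,\ \sup_{b\in B}\inf_{a\in A}\|a-b\| \Big\},$$

so convergence of the optimal decision regions in $d_H$ controls every point of the boundary, whereas $L^1$-type convergence only controls the measure of the symmetric difference.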
We show that the Potts model on a graph can be approximated by a sequence of independent and identically distributed spins in terms of Wasserstein distance at high temperatures. We prove a similar result for the Curie–Weiss–Potts model on the complete graph, conditioned on being close enough to any of its equilibrium macrostates, in the low-temperature regime. Our proof technique is based on Stein’s method for comparing the stationary distributions of two Glauber dynamics with similar updates, one of which is rapidly mixing and contracting on a subset of the state space. Along the way, we prove a new upper bound on the mixing time of the Glauber dynamics for the conditional measure of the Curie–Weiss–Potts model near an equilibrium macrostate.
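For orientation (notation ours), the Potts model with $q$ colours on a graph $G=(V,E)$ at inverse temperature $\beta$ is the Gibbs measure on spin configurations $\sigma \colon V \to \{1,\dots,q\}$ given by

$$\pi_\beta(\sigma) \;=\; \frac{1}{Z_\beta}\,\exp\Big(\beta \sum_{\{i,j\}\in E} \mathbf{1}\{\sigma_i=\sigma_j\}\Big),$$

where $Z_\beta$ is the normalizing constant; Glauber dynamics resamples one vertex's spin at a time from its conditional distribution under $\pi_\beta$.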
We study an abstract group of reversible Turing machines. In our model, each machine is interpreted as a homeomorphism over a space which represents a tape filled with symbols and a head carrying a state. These homeomorphisms can only modify the tape at a bounded distance around the head, change the state, and move the head in a bounded way. We study three natural subgroups arising in this model: the group of finite-state automata, which generalizes the topological full groups studied in topological dynamics and the theory of orbit-equivalence; the group of oblivious Turing machines, whose movement is independent of the tape contents, which generalizes lamplighter groups and has connections to the study of universal reversible logical gates; and the group of elementary Turing machines, which are obtained by composing finite-state automata and oblivious Turing machines.
We show that both the group of oblivious Turing machines and that of elementary Turing machines are finitely generated, while the group of finite-state automata and the group of reversible Turing machines are not. We show that the group of elementary Turing machines has an undecidable torsion problem. From this, we also obtain that the group of cellular automata (more generally, the automorphism group of any uncountable one-dimensional sofic subshift) contains a finitely generated subgroup with an undecidable torsion problem. We also show that the torsion problem is undecidable for the topological full group of a full $\mathbb {Z}^d$-shift on a nontrivial alphabet if and only if $d \geq 2$.
We prove that determining the weak saturation number of a host graph $F$ with respect to a pattern graph $H$ is computationally hard, even when $H$ is the triangle. Our main tool establishes a connection between weak saturation and the shellability of simplicial complexes.
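For context, our paraphrase of the standard definition (the paper may state it with slightly different conventions):

$$\operatorname{wsat}(F,H) \;=\; \min\bigl\{\,|E(G)| : G \subseteq F,\ E(F)\setminus E(G) \text{ admits an ordering } e_1,\dots,e_m \text{ in which adding each } e_i \text{ creates a new copy of } H \,\bigr\}.$$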
We consider the problem of predicting the next bit in an infinite binary sequence sampled from the Cantor space with an unknown computable measure. We propose a new theoretical framework to investigate the properties of good computable predictions, focusing on such predictions’ convergence rate.
Since no computable prediction can be the best, we first define one prediction to be better than another if it dominates the other as a measure. We then prove that this is equivalent to the condition that the sum of the KL-divergence errors of its predictions is smaller than that of the other prediction for more computable measures. We say that such a computable prediction is more general than the other.
We further show that, for any sufficiently general prediction, the sum of its errors is a finite left-c.e. Martin-Löf random real. This means the errors converge to zero more slowly than any computable function.
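One plausible formalization of this cumulative error (our notation; the paper's definition may differ in details) for a computable prediction $P$ against the true measure $\mu$ is

$$\sum_{n=0}^{\infty} \mathbb{E}_{x\sim\mu}\Big[ D_{\mathrm{KL}}\big( \mu(\,\cdot \mid x_{1:n}) \,\big\|\, P(\,\cdot \mid x_{1:n}) \big) \Big],$$

where $x_{1:n}$ denotes the first $n$ bits and the KL divergence compares the two next-bit distributions; the result above says that, for sufficiently general $P$, this sum is finite yet Martin-Löf random, so the tail vanishes slowly.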
Given a finite abelian group $G$ and $t\in \mathbb{N}$, there are two natural types of subsets of the Cartesian power $G^t$; namely, Cartesian powers $S^t$ where $S$ is a subset of $G$ and (cosets of) subgroups $H$ of $G^t$. A basic question is whether two such sets intersect. In this paper, we show that this decision problem is NP-complete. Furthermore, for fixed $G$ and $S$, we give a complete classification: we determine conditions for when the problem is NP-complete and show that in all other cases the problem is solvable in polynomial time. These theorems play a key role in the classification of algebraic decision problems in finitely generated rings developed in later work of the author.
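To make the decision problem concrete, here is a minimal brute-force sketch (function and variable names are ours, purely for illustration); it is exponential in $t$, consistent with the NP-completeness result.

```python
from itertools import product

def powers_meet_subgroup(S, H, t):
    """Brute-force check whether the Cartesian power S^t intersects H,
    where H is given explicitly as an iterable of t-tuples over G.

    Exponential in t in general, in line with the NP-completeness
    of the decision problem.
    """
    S_t = set(product(S, repeat=t))  # all |S|^t tuples
    return any(h in S_t for h in H)

# Example over G = Z_4 (elements 0..3): does {1, 3}^2 meet the
# diagonal subgroup H = {(g, g) : g in Z_4}?
S = {1, 3}
H = [(g, g) for g in range(4)]
print(powers_meet_subgroup(S, H, t=2))  # True: (1, 1) and (3, 3) lie in both
```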
Schubert vanishing is the problem of deciding whether Schubert coefficients are zero. Until this work it was open whether this problem lies in the polynomial hierarchy ${{\mathsf {PH}}}$. We prove that this problem is in ${{\mathsf {AM}}} \cap {{\mathsf {coAM}}}$ assuming the Generalized Riemann Hypothesis ($\mathrm{GRH}$), that is, relatively low in ${{\mathsf {PH}}}$. Our approach uses Purbhoo’s criterion [57] to construct explicit polynomial systems for the problem. The result follows from a reduction to Parametric Hilbert’s Nullstellensatz, recently analyzed in [2]. We extend our results to all classical types.
A popular method to perform adversarial attacks on neural networks is the so-called fast gradient sign method and its iterative variant. In this paper, we interpret this method as an explicit Euler discretization of a differential inclusion, and we show convergence of the discretization to the associated gradient flow. To do so, we consider the concept of $p$-curves of maximal slope in the case $p=\infty$. We prove existence of $\infty$-curves of maximal slope and derive an alternative characterization via differential inclusions. Furthermore, we also consider Wasserstein gradient flows for potential energies, where we show that curves in the Wasserstein space can be characterized by a representing measure on the space of curves in the underlying Banach space which fulfil the differential inclusion. The application of our theory to the finite-dimensional setting is twofold: On the one hand, we show that a whole class of normalized gradient descent methods (in particular, signed gradient descent) converges, up to subsequences, to the flow as the step size is sent to zero. On the other hand, in the distributional setting, we show that the inner optimization task of the adversarial training objective can be characterized via $\infty$-curves of maximal slope on an appropriate optimal transport space.
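For concreteness, the iterative fast gradient sign method takes explicit Euler steps along the sign of the gradient; a minimal NumPy sketch (our notation, with a generic loss gradient) reads as follows, and the interpretation above concerns the limit as the step size $h$ tends to zero.

```python
import numpy as np

def iterative_fgsm(x0, grad, h=0.01, steps=100):
    """Iterative fast gradient sign method: explicit Euler steps along
    sign(grad), i.e. a discretization of the differential inclusion
    x'(t) in sign(grad(x(t))).

    x0    : initial point (numpy array)
    grad  : function returning the gradient of the loss at x
    h     : step size (sending h -> 0 recovers the continuous flow)
    steps : number of Euler steps
    """
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        x = x + h * np.sign(grad(x))  # ascent step to *increase* the loss
    return x

# Toy example: pushing a point away from the minimum of f(x) = ||x||^2 / 2.
x_adv = iterative_fgsm(np.array([0.5, -0.2]), grad=lambda x: x, h=0.05, steps=20)
print(x_adv)  # each coordinate moved by steps*h in the direction of its sign
```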
We consider the problem of sequential matching in a stochastic block model with several classes of nodes and generic compatibility constraints. When the connection probabilities do not scale with the size of the graph, we show that, under the Ncond condition, a simple max-weight-type policy attains an asymptotically perfect matching, while no sequential algorithm attains a perfect matching otherwise. The proof relies on a specific Markovian representation of the dynamics together with Lyapunov techniques.
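As an illustration only (class-level compatibilities and the data structures are hypothetical simplifications; the paper's policy and model are more refined), a max-weight-type rule matches each arriving node to the compatible class holding the longest queue of unmatched nodes:

```python
from collections import defaultdict

def max_weight_match(arrival_class, queues, compatible):
    """Match an arriving node to the compatible class with the largest
    queue of unmatched nodes (a max-weight-type policy); if every
    compatible queue is empty, the node waits in its own queue.

    queues     : dict class -> number of unmatched waiting nodes
    compatible : dict class -> iterable of compatible classes
    """
    candidates = [c for c in compatible[arrival_class] if queues[c] > 0]
    if not candidates:
        queues[arrival_class] += 1   # no match available: node waits
        return None
    best = max(candidates, key=lambda c: queues[c])
    queues[best] -= 1                # match with a waiting node of class `best`
    return best

# Tiny example with two mutually compatible classes.
queues = defaultdict(int, {"A": 0, "B": 2})
compatible = {"A": ["B"], "B": ["A"]}
print(max_weight_match("A", queues, compatible))  # -> "B"
```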
We are concerned with the micro-macro Parareal algorithm for the simulation of initial-value problems. In this algorithm, a coarse (fast) solver is applied sequentially over the time domain and a fine (time-consuming) solver is applied as a corrector in parallel over smaller chunks of the time interval. Moreover, the coarse solver acts on a reduced state variable, which is coupled with the fine state variable through appropriate coupling operators. We first provide a contribution to the convergence analysis of the micro-macro Parareal method for multiscale linear ordinary differential equations. Then, we extend a variant of the micro-macro Parareal algorithm for scalar stochastic differential equations (SDEs) to higher-dimensional SDEs.
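For orientation, the classical Parareal correction (of which the micro-macro method is a variant, with restriction and lifting operators coupling the reduced and fine states) iterates, in our notation,

$$U_{n+1}^{k+1} \;=\; \mathcal{C}\big(U_n^{k+1}\big) + \mathcal{F}\big(U_n^{k}\big) - \mathcal{C}\big(U_n^{k}\big),$$

where $\mathcal{C}$ and $\mathcal{F}$ propagate a state across one time chunk with the coarse and fine solvers, $n$ indexes the chunks, and $k$ the iteration; the $\mathcal{F}$ evaluations at iteration $k$ are independent across chunks and can run in parallel.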
We study the computational problem of rigorously describing the asymptotic behavior of topological dynamical systems up to a finite but arbitrarily small pre-specified error. More precisely, we consider the limit set of a typical orbit, both as a spatial object (attractor set) and as a statistical distribution (physical measure), and we prove upper bounds on the computational resources needed to compute descriptions of these objects with arbitrary accuracy. We also study how these bounds are affected by different dynamical constraints and provide several examples showing that our bounds are sharp in general. In particular, we exhibit a computable interval map having a unique transitive attractor with Cantor set structure supporting a unique physical measure such that both the attractor and the measure are non-computable.
There are known characterisations of several fragments of hybrid logic by means of invariance under bisimulations of some kind. The fragments include $\{\mathord {\downarrow }, \mathord {@}\}$ with or without nominals (Areces, Blackburn, Marx), $\mathord {@}$ with or without nominals (ten Cate), and $\mathord {\downarrow }$ without nominals (Hodkinson, Tahiri). Some pairs of these characterisations, however, are incompatible with one another. For other fragments of hybrid logic no such characterisations were known so far. We prove a generic bisimulation characterisation theorem for all standard fragments of hybrid logic, in particular for the case with $\mathord {\downarrow }$ and nominals, left open by Hodkinson and Tahiri. Our characterisation is built on a common base and for each feature extension adds a specific condition, so it is modular in an engineering sense.
We investigate natural variations of behaviourally correct learning and explanatory learning—two learning paradigms studied in algorithmic learning theory—that allow us to “learn” equivalence relations on Polish spaces. We show that the behaviourally correct and explanatory learnable equivalence relations coincide, in both the uniform and non-uniform versions of learnability, and we give a characterization of the learnable equivalence relations in terms of their Borel complexity. We also show that the set of uniformly learnable equivalence relations is $\boldsymbol {\Pi }^1_1$-complete in the codes and, as a case study, examine the learnability of several equivalence relations arising naturally in logic.
This paper studies the conjecture of Hirschfeldt, Miller, and Podzorov in [13] on the complexity of order-computable sets, where a set $A$ is order-computable if there is a computable copy of the structure $(\mathbb {N}, <, A)$ in the language of linear orders together with a unary predicate. The class of order-computable sets forms a subclass of the $\Delta ^{0}_{2}$ sets. Firstly, we study the complexity of computably enumerable (c.e.) order-computable sets and prove that the index set of c.e. order-computable sets is $\Sigma ^{0}_{4}$-complete. Secondly, as a corollary of the main result on c.e. order-computable sets, we obtain that the index set of general order-computable sets is $\Sigma ^{0}_{4}$-complete within the index set of $\Delta ^{0}_{2}$ sets. Finally, we continue to study the complexity of more general $\Delta ^{0}_{2}$ sets and prove that the index set of $\Delta ^{0}_{2}$ sets is $\Pi ^{0}_{3}$-complete.
Edge AI is the fusion of edge computing and artificial intelligence (AI). It promises responsiveness, privacy preservation, and fault tolerance by moving parts of the AI workflow from centralized cloud data centers to geographically dispersed edge servers located at the source of the data. The scale of edge AI can vary from simple data preprocessing tasks to the whole machine learning stack. However, most edge AI implementations so far are limited to urban areas, where the infrastructure is highly dependable. This work instead focuses on a class of applications involved in environmental monitoring in remote, rural areas such as forests and rivers. Such applications face additional challenges, including proneness to failure and limited access to the electricity grid and communication networks. We propose neuromorphic computing as a promising solution to the energy, communication, and computation constraints in such scenarios and identify directions for future research in neuromorphic edge AI for rural environmental monitoring. The proposed directions are distributed model synchronization, edge-only learning, aerial networks, spiking neural networks, and sensor integration.
We consider the performance of Glauber dynamics for the random cluster model with real parameter $q>1$ and temperature $\beta>0$. Recent work by Helmuth, Jenssen, and Perkins detailed the ordered/disordered transition of the model on random $\Delta$-regular graphs for all sufficiently large $q$ and obtained an efficient sampling algorithm for all temperatures $\beta$ using cluster expansion methods. Despite this major progress, the performance of natural Markov chains, including Glauber dynamics, is not yet well understood on the random regular graph, partly because of the non-local nature of the model (especially at low temperatures) and partly because of severe bottleneck phenomena that emerge in a window around the ordered/disordered transition. Nevertheless, it is widely conjectured that the bottleneck phenomena that impede mixing from worst-case starting configurations can be avoided by initialising the chain more judiciously. Our main result establishes this conjecture for all sufficiently large $q$ (with respect to $\Delta$). Specifically, we consider the mixing time of Glauber dynamics initialised from the two extreme configurations, the all-in and all-out, and obtain a pair of fast mixing bounds which cover all temperatures $\beta$, including in particular the bottleneck window. Our result is inspired by the recent approach of Gheissari and Sinclair for the Ising model, who obtained a similarly flavoured mixing-time bound on the random regular graph for sufficiently low temperatures. To cover all temperatures in the RC model, we appropriately refine the structural results of Helmuth, Jenssen, and Perkins about the ordered/disordered transition and show spatial mixing properties ‘within the phase’, which are then related to the evolution of the chain.
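Concretely, the heat-bath Glauber dynamics for the random cluster model resamples a uniformly chosen edge $e$ from its conditional law: writing $p = 1-e^{-\beta}$ (a standard parametrization; the paper's conventions may differ),

$$\Pr\big[\,e \text{ open} \mid \omega_{E\setminus\{e\}}\,\big] \;=\; \begin{cases} p, & \text{if the endpoints of } e \text{ are joined by an open path in } \omega \setminus \{e\},\\ \dfrac{p}{p+q(1-p)}, & \text{otherwise.} \end{cases}$$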
Describing the equality conditions of the Alexandrov–Fenchel inequality [Ale37] has been a major open problem for decades. We prove that in the case of convex polytopes, this description is not in the polynomial hierarchy unless the polynomial hierarchy collapses to a finite level. This is the first hardness result for the problem and is a complexity counterpart of the recent result by Shenfeld and van Handel [SvH23], which gave a geometric characterization of the equality conditions. The proof involves Stanley’s [Sta81] order polytopes and employs poset-theoretic technology.
We study the parameterized complexity of the problem of deciding whether a given natural number $n$ satisfies a given $\Delta _0$-formula $\varphi (x)$; the parameter is the size of $\varphi $. This parameterization focusses attention on instances where $n$ is large compared to the size of $\varphi $. We show unconditionally that this problem does not belong to the parameterized analogue of $\mathsf {AC}^0$. From this we derive that certain natural upper bounds on the complexity of our parameterized problem imply certain separations of classical complexity classes. This connection is obtained via an analysis of a parameterized halting problem. Some of these upper bounds follow assuming that $I\Delta _0$ proves the MRDP theorem in a certain weak sense.
Novel prediction methods should always be compared to a baseline to determine their performance. Without this frame of reference, the performance score of a model is basically meaningless. What does it mean when a model achieves an $F_1$ of 0.8 on a test set? A proper baseline is, therefore, required to evaluate the ‘goodness’ of a performance score. Comparing results with the latest state-of-the-art model is usually insightful. However, the state of the art is a moving target, as newer models are continuously developed. At the other extreme, it is also possible to use a simple dummy classifier. However, such a model could be beaten too easily, making the comparison less valuable. Furthermore, most existing baselines are stochastic and need to be computed repeatedly to get a reliable expected performance, which could be computationally expensive. We present a universal baseline method for all binary classification models, named the Dutch Draw (DD). This approach weighs simple classifiers and determines the best classifier to use as a baseline. Theoretically, we derive the DD baseline for many commonly used evaluation measures and show that in most situations it reduces to (almost) always predicting either zero or one. In summary, the DD baseline is general, as it is applicable to any binary classification problem; simple, as it can be quickly determined without training or parameter tuning; and informative, as insightful conclusions can be drawn from the results. The DD baseline serves two purposes. First, it is a robust and universal baseline that enables comparisons across research papers. Second, it provides a sanity check during the prediction model’s development process. When a model does not outperform the DD baseline, it is a major warning sign.
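As a simplified illustration (not the full DD derivation; the function name is ours), one can compare the two constant classifiers analytically for the $F_1$ score: always predicting one yields $F_1 = 2P/(2P+N)$ for $P$ positives and $N$ negatives, while always predicting zero yields 0, so the baseline here is the all-one classifier.

```python
def dd_style_baseline_f1(n_pos, n_neg):
    """Simplified Dutch-Draw-style baseline for the F1 score: evaluate the
    constant classifiers 'always 1' and 'always 0' analytically and return
    the better one. (The full DD method optimizes over a larger family of
    simple classifiers.)
    """
    # Always predict 1: TP = n_pos, FP = n_neg, FN = 0.
    f1_all_one = 2 * n_pos / (2 * n_pos + n_neg) if n_pos > 0 else 0.0
    # Always predict 0: no positive predictions, so F1 = 0 by convention.
    f1_all_zero = 0.0
    return max(("always_1", f1_all_one), ("always_0", f1_all_zero),
               key=lambda t: t[1])

print(dd_style_baseline_f1(30, 70))  # ('always_1', 0.4615...)
```

On this hypothetical test set of 30 positives and 70 negatives, a model reporting $F_1 = 0.5$ would only barely beat the trivial baseline.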
We study the community detection problem on a Gaussian mixture model, in which vertices are divided into $k\geq 2$ distinct communities. The major difference in our model is that the intensities of the Gaussian perturbations may differ across entries of the observation matrix, and we do not assume that every community has the same number of vertices. We explicitly find the necessary and sufficient conditions for exact recovery by maximum likelihood estimation, which yield a sharp phase transition for exact recovery even though the Gaussian perturbations are not identically distributed; see Section 7. Applications include community detection on hypergraphs.