The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting, that is, accurate predictions despite overfitting training data. In this article, we survey recent progress in statistical learning theory that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behaviour of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favourable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.
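To make the implicit-regularization claim concrete, here is a minimal sketch (my own illustration, not taken from the article): for an overparametrized linear model, gradient descent on the squared loss started from zero interpolates the training data and converges to the minimum-norm interpolating solution, even though nothing in the loss explicitly penalizes the norm.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 200                        # overparametrized: more parameters than samples
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Gradient descent on the squared loss 0.5*||Xw - y||^2, initialized at zero.
w = np.zeros(d)
step = 1.0 / np.linalg.norm(X, 2) ** 2          # 1 / (largest singular value)^2
for _ in range(20000):
    w -= step * X.T @ (X @ w - y)

w_minnorm = np.linalg.pinv(X) @ y               # minimum-norm interpolating solution

print("training error:", np.linalg.norm(X @ w - y))                      # ~0: interpolation
print("distance to min-norm solution:", np.linalg.norm(w - w_minnorm))   # ~0
```

Starting from zero keeps the iterates in the row space of the data matrix, which is why the limit is the minimum-norm interpolant rather than an arbitrary one.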
Neural networks (NNs) are the method of choice for building learning algorithms. They are now being investigated for other numerical tasks such as solving high-dimensional partial differential equations. Their popularity stems from their empirical success on several challenging learning problems (computer chess/Go, autonomous navigation, face recognition). However, most scholars agree that a convincing theoretical explanation for this success is still lacking. Since these applications revolve around approximating an unknown function from data observations, part of the answer must involve the ability of NNs to produce accurate approximations.
This article surveys the known approximation properties of the outputs of NNs with the aim of uncovering the properties that are not present in the more traditional methods of approximation used in numerical analysis, such as approximations using polynomials, wavelets, rational functions and splines. Comparisons are made with traditional approximation methods from the viewpoint of rate distortion, i.e. error versus the number of parameters used to create the approximant. Another major component in the analysis of numerical approximation is the computational time needed to construct the approximation, and this in turn is intimately connected with the stability of the approximation algorithm. So the stability of numerical approximation using NNs is a large part of the analysis put forward.
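As a toy illustration of the rate-distortion viewpoint (not drawn from the paper), the sketch below measures error versus the number of parameters for a classical method, piecewise-linear interpolation of a smooth function on uniform breakpoints, whose sup-norm error decays roughly like n^{-2}.

```python
import numpy as np

f = np.sin                              # target function on [0, pi]
xs = np.linspace(0.0, np.pi, 10001)     # fine grid for measuring the error

for n in (4, 8, 16, 32, 64):            # n interior breakpoints ~ number of parameters
    knots = np.linspace(0.0, np.pi, n + 2)
    approx = np.interp(xs, knots, f(knots))      # piecewise-linear interpolant
    err = np.max(np.abs(f(xs) - approx))
    print(f"n = {n:3d}   sup-error = {err:.2e}")  # decays roughly like n**-2
```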
The survey, for the most part, is concerned with NNs using the popular ReLU activation function. In this case the outputs of the NNs are piecewise linear functions on rather complicated partitions of the domain of the target function f into cells that are convex polytopes. When the architecture of the NN is fixed and the parameters are allowed to vary, the set of output functions of the NN is a parametrized nonlinear manifold. It is shown that this manifold has certain space-filling properties leading to an increased ability to approximate (better rate distortion), but at the expense of numerical stability. The space-filling property also creates a challenge for the numerical method: that of finding best, or at least good, parameter choices when trying to approximate.
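The piecewise-linear structure is easiest to see in one dimension. The following sketch (an illustration of mine, not code from the survey) builds a random two-layer ReLU network with scalar input and verifies that its output has a constant slope between consecutive breakpoints, i.e. between the points where some hidden unit switches on or off.

```python
import numpy as np

rng = np.random.default_rng(1)
width = 5
W1, b1 = rng.standard_normal(width), rng.standard_normal(width)   # hidden layer
W2, b2 = rng.standard_normal(width), rng.standard_normal()        # output layer

def relu_net(x):
    """Output of a two-layer ReLU network with scalar input x."""
    hidden = np.maximum(W1 * x + b1, 0.0)        # ReLU activations
    return W2 @ hidden + b2

# The output is piecewise linear; its slope can only change where a hidden
# unit switches on/off, i.e. at the breakpoints x = -b1/W1.
breakpoints = np.sort(-b1 / W1)
print("breakpoints:", breakpoints)

# Within each cell between consecutive breakpoints the slope is constant.
for a, b in zip(breakpoints[:-1], breakpoints[1:]):
    xs = np.linspace(a, b, 5)[1:-1]              # points strictly inside the cell
    slopes = np.diff([relu_net(x) for x in xs]) / np.diff(xs)
    print(f"cell ({a:+.2f}, {b:+.2f}): slopes {np.round(slopes, 6)}")
```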
Numerical homogenization is a methodology for the computational solution of multiscale partial differential equations. It aims at reducing complex large-scale problems to simplified numerical models valid on some target scale of interest, thereby accounting for the impact of features on smaller scales that are otherwise not resolved. While constructive approaches in the mathematical theory of homogenization are restricted to problems with a clear scale separation, modern numerical homogenization methods can accurately handle problems with a continuum of scales. This paper reviews such approaches embedded in a historical context and provides a unified variational framework for their design and numerical analysis. Apart from prototypical elliptic model problems, the class of partial differential equations covered here includes wave scattering in heterogeneous media and serves as a template for more general multi-physics problems.
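A minimal one-dimensional sketch (my own illustration, not taken from the paper) shows the basic effect that numerical homogenization exploits: for -(a(x/eps) u')' = f with a rapidly oscillating coefficient, the solution is well approximated by that of a constant-coefficient problem, where in one dimension the effective coefficient is the harmonic mean of a over a period.

```python
import numpy as np

def solve_fd(a_mid, f, h):
    """Finite-difference solve of -(a u')' = f on (0,1) with u(0)=u(1)=0.
    a_mid holds the coefficient at the cell midpoints x_{i+1/2}."""
    m = len(f)                                   # number of interior nodes
    A = np.zeros((m, m))
    for i in range(m):
        A[i, i] = (a_mid[i] + a_mid[i + 1]) / h**2
        if i > 0:
            A[i, i - 1] = -a_mid[i] / h**2
        if i < m - 1:
            A[i, i + 1] = -a_mid[i + 1] / h**2
    return np.linalg.solve(A, f)

eps = 1e-2
a = lambda x: 2.0 + np.cos(2 * np.pi * x / eps)   # rapidly oscillating coefficient

m = 2000                                          # fine mesh resolving the eps-scale
h = 1.0 / (m + 1)
x_mid = (np.arange(m + 1) + 0.5) * h              # cell midpoints
f = np.ones(m)

u_fine = solve_fd(a(x_mid), f, h)

# In 1D the homogenized coefficient is the harmonic mean of a.
a_hom = 1.0 / np.mean(1.0 / a(x_mid))
u_hom = solve_fd(np.full(m + 1, a_hom), f, h)

# Small: the classical homogenization estimate gives a difference of order eps.
print("max difference, fine vs homogenized:", np.max(np.abs(u_fine - u_hom)))
```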
We prove completeness of preferential conditional logic with respect to convexity over finite sets of points in the Euclidean plane. A conditional is defined to be true in a finite set of points if all extreme points of the set interpreting the antecedent satisfy the consequent. Equivalently, a conditional is true if the antecedent is contained in the convex hull of the points that satisfy both the antecedent and consequent. Our result is then that every consistent formula without nested conditionals is satisfiable in a model based on a finite set of points in the plane. The proof relies on a result by Richter and Rogers showing that every finite abstract convex geometry can be represented by convex polygons in the plane.
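As an informal computational illustration (mine, not from the paper) of the second formulation: convex-hull membership can be tested with a linear-programming feasibility check, and a conditional then holds iff every point satisfying the antecedent lies in the convex hull of the points satisfying both antecedent and consequent.

```python
import numpy as np
from scipy.optimize import linprog

def in_hull(p, points):
    """Is p in the convex hull of the rows of `points`? (LP feasibility test)"""
    if len(points) == 0:
        return False
    n, d = points.shape
    # Find lambda >= 0 with sum(lambda) = 1 and points.T @ lambda = p.
    A_eq = np.vstack([points.T, np.ones(n)])
    b_eq = np.append(p, 1.0)
    res = linprog(np.zeros(n), A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n)
    return res.success

def conditional_true(points, antecedent, consequent):
    """A > B holds iff every point satisfying A lies in the convex hull
    of the points satisfying both A and B."""
    ant = points[antecedent]
    both = points[antecedent & consequent]
    return all(in_hull(p, both) for p in ant)

# Four points in the plane; A holds everywhere, B only at the three "corners".
pts = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0], [1.0, 1.0]])
A = np.array([True, True, True, True])
B = np.array([True, True, True, False])
print(conditional_true(pts, A, B))   # True: the interior point lies in the hull
```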
This paper introduces three model-theoretic constructions for generalized Epstein semantics: reducts, ultramodels and $\textsf {S}$-sets. We apply these notions to obtain metatheoretical results. We prove connective inexpressibility by means of a reduct, compactness by means of an ultramodel, and a definability theorem which states that a set of generalized Epstein models is definable iff it is closed under ultramodels and $\textsf {S}$-sets. Furthermore, a corollary concerning the definability of a set of models by a single formula is given on the basis of the main theorem and the compactness theorem. We also provide an example of a natural set of generalized Epstein models which is undefinable. Its undefinability is proven by means of an $\textsf {S}$-set.
There has been a recent interest in hierarchical generalizations of classic incompleteness results. This paper provides evidence that such generalizations are readily obtainable from suitably formulated hierarchical versions of the principles used in the original proofs. By collecting such principles, we prove hierarchical versions of Mostowski’s theorem on independent formulae, Kripke’s theorem on flexible formulae, Woodin’s theorem on the universal algorithm, and a few related results. As a corollary, we obtain the expected result that the formula expressing “$\mathrm {T}$ is $\Sigma _n$-ill” is a canonical example of a $\Sigma _{n+1}$ formula that is $\Pi _{n+1}$-conservative over $\mathrm {T}$.
We prove neighbourhood canonicity and strong completeness for the logics $\mathbf {EK}$ and $\mathbf {ECK}$, obtained by adding axiom (K), resp. both (K) and (C), to the minimal modal logic $\textbf {E}$. In contrast to an earlier proof in [10], ours is constructive. More precisely, we construct minimal characteristic models for both logics and do not rely on compactness of first-order logic. The proof involves a specific circumscription technique and a number of set-theoretic maneuvers to establish that the models satisfy the appropriate frame conditions. After giving both proofs, we briefly spell out how they generalize to four stronger logics and to the extensions of the resulting six logics with a global modality.
In Baghdad in the mid-twelfth century, Abū al-Barakāt proposes a radical new procedure for finding the conclusions of premise-pairs in syllogistic logic, and for identifying those premise-pairs that have no conclusions. The procedure makes no use of features of the standard Aristotelian apparatus, such as conversions or syllogistic figures. In place of these al-Barakāt writes out pages of diagrams consisting of labelled horizontal lines. He gives no instructions and no proof that the procedure will yield correct results. So the reader has to work out what his procedure is and whether it is correct. The procedure turns out to be insightful and entirely correct, but this paper may be the first study to give a full description of the procedure and a rigorous proof of its correctness.
Strong negation is a well-known alternative to the standard negation in intuitionistic logic. It is defined, in essence, by giving a falsity condition for each of the connectives. Among these, the falsity condition for implication appears to deviate unnecessarily from that of the standard negation. In this paper, we introduce a slight modification to strong negation, and observe its comparative advantages over the original notion. In addition, we consider the paraconsistent variants of our modification, and study their relationship with non-constructive principles and connexivity.
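For orientation (a standard presentation, not taken from the paper), the usual Nelson-style falsity clauses read as follows; the implication clause on the third line is the one whose modification is at issue.

```latex
% Usual Nelson-style falsity clauses for strong negation (standard presentation):
\begin{align*}
  w \Vdash^{-} A \land B &\iff w \Vdash^{-} A \text{ or } w \Vdash^{-} B\\
  w \Vdash^{-} A \lor B  &\iff w \Vdash^{-} A \text{ and } w \Vdash^{-} B\\
  w \Vdash^{-} A \to B   &\iff w \Vdash^{+} A \text{ and } w \Vdash^{-} B\\
  w \Vdash^{-} {\sim}A   &\iff w \Vdash^{+} A
\end{align*}
```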
We present a new manifestation of Gödel’s second incompleteness theorem and discuss its foundational significance, in particular with respect to Hilbert’s program. Specifically, we consider a proper extension of Peano arithmetic ($\mathbf {PA}$) by a mathematically meaningful axiom scheme that consists of $\Sigma ^0_2$-sentences. These sentences assert that each computably enumerable ($\Sigma ^0_1$-definable without parameters) property of finite binary trees has a finite basis. Since this fact entails the existence of polynomial time algorithms, it is relevant for computer science. On a technical level, our axiom scheme is a variant of an independence result due to Harvey Friedman. At the same time, the meta-mathematical properties of our axiom scheme distinguish it from most known independence results: Due to its logical complexity, our axiom scheme does not add computational strength. The only known method to establish its independence relies on Gödel’s second incompleteness theorem. In contrast, Gödel’s theorem is not needed for typical examples of $\Pi ^0_2$-independence (such as the Paris–Harrington principle), since computational strength provides an extensional invariant on the level of $\Pi ^0_2$-sentences.
The paper provides a proof-theoretic characterization of the Russellian theory of definite descriptions (RDD) as formulated by Kalish, Montague and Mar (KMM). To this effect three sequent calculi are introduced: LKID0, LKID1 and LKID2. LKID0 is an auxiliary system which is easily shown to be equivalent to KMM. The main research is devoted to LKID1 and LKID2. The former is simpler in the sense of having a smaller number of rules and, after a small change, satisfies cut elimination, but fails to satisfy the subformula property. In LKID2 an additional analysis of different kinds of identities leads to a proliferation of rules but yields the subformula property. This refined proof-theoretic analysis, leading to a fully analytic calculus with a constructive proof of cut elimination, is the main contribution of the paper.
Inferentialism is a theory in the philosophy of language which claims that the meanings of expressions are constituted by inferential roles or relations. Rather than a traditional model-theoretic semantics, it naturally lends itself to a proof-theoretic semantics, where meaning is understood in terms of inference rules within a proof system. Most work in proof-theoretic semantics has focused on logical constants, with comparatively little work on the semantics of non-logical vocabulary. Drawing on Robert Brandom’s notion of material inference and Greg Restall’s bilateralist interpretation of the multiple-conclusion sequent calculus, I present a proof-theoretic semantics for atomic sentences and their constituent names and predicates. The resulting system has several interesting features: (1) the rules are harmonious and stable; (2) the rules create a structure analogous to familiar model-theoretic semantics; and (3) the semantics is compositional, in that the rules for atomic sentences are determined by those for their constituent names and predicates.
Jean Nicod (1893–1924) was a French philosopher and logician who worked with Russell during the First World War. His PhD thesis, with a preface by Russell, was published under the title La géométrie dans le monde sensible in 1924, the year of his untimely death. The book did not have the impact it deserved. In this paper, I discuss the methodological aspect of Nicod’s approach. My aim is twofold. I would first like to show that Nicod’s definition of various notions of equivalence between theories anticipates, in many respects, the (syntactic and semantic) model-theoretic notion of interpretation of a theory into another. I would secondly like to present the philosophical agenda that led Nicod to elaborate his logical framework: the defense of rationalism against Bergson’s attacks.
It is customary to expect of a logical system that it be algebraizable, in the sense that an algebraic companion of the deductive machinery can always be found. Since the inception of da Costa’s paraconsistent calculi, algebraic equivalents for such systems have been sought. It is known, however, that these systems are not self-extensional (i.e., they do not satisfy the replacement property). More than this, they are not algebraizable in the sense of Blok–Pigozzi. The same negative results hold for several systems of the hierarchy of paraconsistent logics known as Logics of Formal Inconsistency (LFIs). Because of this, several systems belonging to this class of logics are only characterizable by semantics of a non-deterministic nature. This paper offers a solution to two open problems in the domain of paraconsistency, in particular connected to the algebraization of LFIs, by extending with rules several LFIs weaker than $C_1$, thus obtaining the replacement property (that is, such LFIs turn out to be self-extensional). Moreover, these logics become algebraizable in the standard Lindenbaum–Tarski sense by a suitable variety of Boolean algebras extended with additional operations. The weakest LFI satisfying replacement presented here is called RmbC, which is obtained from the basic LFI called mbC. Some axiomatic extensions of RmbC are also studied. In addition, a neighborhood semantics is defined for such systems. It is shown that RmbC can be defined within the minimal bimodal non-normal logic $\mathbf {E} {\oplus } \mathbf {E}$, the fusion of the non-normal modal logic $\mathbf {E}$ with itself. Finally, the framework is extended to first-order languages. RQmbC, the quantified extension of RmbC, is shown to be sound and complete w.r.t. the proposed algebraic semantics.
We develop an untyped framework for the multiverse of set theory. $\mathsf {ZF}$ is extended with semantically motivated axioms utilizing the new symbols $\mathsf {Uni}(\mathcal {U})$ and $\mathsf {Mod}(\mathcal {U, \sigma })$, expressing that $\mathcal {U}$ is a universe and that $\sigma $ is true in the universe $\mathcal {U}$, respectively. Here $\sigma $ ranges over the augmented language, leading to liar-style phenomena that are analyzed. The framework is both compatible with a broad range of multiverse conceptions and suggestive of its own philosophically and semantically motivated multiverse principles. In particular, the framework is closely linked with a deductive rule of Necessitation expressing that the multiverse theory can only prove statements that it also proves to hold in all universes. We argue that this may be philosophically thought of as a Copernican principle that the background theory does not hold a privileged position over the theories of its internal universes. Our main mathematical result is a lemma encapsulating a technique for locally interpreting a wide variety of extensions of our basic framework in more familiar theories. We apply this to show, for a range of such semantically motivated extensions, that their consistency strength is at most slightly above that of the base theory $\mathsf {ZF}$, and thus not seriously limiting to the diversity of the set-theoretic multiverse. We end with case studies applying the framework to two multiverse conceptions of set theory: arithmetic absoluteness and Joel D. Hamkins’ multiverse theory.
Here, I combine the semantics of Mares and Goldblatt [20] and Seki [29, 30] to develop a semantics for quantified modal relevant logics extending ${\bf B}$. The combination requires demonstrating that the Mares–Goldblatt approach is apt for quantified extensions of ${\bf B}$ and other relevant logics, but no significant bridging principles are needed. The result is a single semantic approach for quantified modal relevant logics. Within this framework, I discuss the requirements a quantified modal relevant logic must satisfy to be “sufficiently classical” in its modal fragment, where frame conditions are given that work for positive fragments of logics. The roles of the Barcan formula and its converse are also investigated.