To save content items to your account,
please confirm that you agree to abide by our usage policies.
If this is the first time you use this feature, you will be asked to authorise Cambridge Core to connect with your account.
Find out more about saving content to .
To save content items to your Kindle, first ensure no-reply@cambridge.org
is added to your Approved Personal Document E-mail List under your Personal Document Settings
on the Manage Your Content and Devices page of your Amazon account. Then enter the ‘name’ part
of your Kindle email address below.
Find out more about saving to your Kindle.
Note you can select to save to either the @free.kindle.com or @kindle.com variations.
‘@free.kindle.com’ emails are free but can only be saved to your device when it is connected to wi-fi.
‘@kindle.com’ emails can be delivered even when you are not connected to wi-fi, but note that service fees apply.
This appendix summarizes the main things a reader needs to know about the programming environment we use. I hope it will provide a useful quick overview, but this appendix is no substitute for a textbook on functional programming like Cousineau and Mauny (1998) or Paulson (1991). There are numerous other texts on OCaml and CAML Light available online, e.g. a fairly comprehensive OCaml book and some old lecture notes on CAML Light by the present author.
Functional programming
OCaml supports several styles of programming, but its roots lie in functional programming, and almost all of our code is written in a purely functional style. In brief, the idea of functional programming is that a program is simply an expression, and execution means evaluation of the expression. Although this point of view may seem outlandish to those with experience of more traditional imperative programming, supported by common languages like C and Java, an expression-centric view is already familiar from other contexts such as spreadsheet programming.
The centrepiece of imperative programming is the successive modification, via assignment statements x = e or x := e, of a number of program variables, known collectively as the state. These assignment statements are invoked in a particular order using sequential execution (sometimes indicated by ‘;’) and built into more complex constructs using if tests, while loops and so on.
Functional programming represents a radical departure from this model.
This book is about computer programs that can perform automated reasoning. I interpret ‘reasoning’ quite narrowly: the emphasis is on formal deductive inference rather than, for example, poker playing or medical diagnosis. On the other hand I interpret ‘automated’ broadly, to include interactive arrangements where a human being and machine reason together, and I'm always conscious of the applications of deductive reasoning to real world problems. Indeed, as well as being inherently fascinating, the subject is deriving increasing importance from its industrial applications.
This book is intended as a first introduction to the field, and also to logical reasoning itself. No previous knowledge of mathematical logic is assumed, although readers will inevitably find some prior experience of mathematics and of computer programming (especially in a functional language like OCaml, F#, Standard ML, Haskell or LISP) invaluable. In contrast to the many specialist texts on the subject, this book aims at a broad and balanced general introduction, and has two special characteristics.
Pure logic and automated theorem proving are explained in a closely intertwined manner. Results in logic are developed with an eye to their role in automated theorem proving, and wherever possible are developed in an explicitly computational way.
Automated theorem proving methods are explained with reference to actual concrete implementations, which readers can experiment with if they have convenient access to a computer. All code is written in the high-level functional language OCaml.
Although this organization is open to question, I adopted it after careful consideration, and extensive experimentation with alternatives. A more detailed self-justification follows, but most readers will want to skip straight to the main content, starting with ‘How to read this book’ on page xvi.
Our efforts so far have been aimed at making the computer prove theorems completely automatically. But the scope of fully automatic methods, subject to any remotely realistic limitations on computing power, covers only a very small part of present-day mathematics. Here we develop an alternative: an interactive proof assistant that can help to precisely state and formalize a proof, while still dealing with some boring details automatically. Moreover, to ensure its reliability, we design the proof assistant based on a very simple logical kernel.
Human-oriented methods
We've devoted quite a lot of energy to making computers prove statements completely automatically. The methods we've implemented are fairly powerful and can do some kinds of proofs better than (most) people. Still, the enormously complicated chains of logical reasoning in many fields of mathematics are seldom likely to be discovered in a reasonable amount of time by systematic algorithms like those we've presented. In practice, human mathematicians find these chains of reasoning using a mixture of intuition, experimentation with specific instances, analogy with or extrapolation from related results, dramatic generalization of the context (e.g. the use of complexanalytic methods in number theory) and of course pure luck – see Lakatos (1976), Polya (1954) and Schoenfeld (1985) for varied attempts to subject the process of mathematical discovery to methodological analysis. It's probably true to say that very few human mathematicians approach the task of proving theorems with methods like those we have developed.
One natural reaction to the limitations of systematic algorithmic methods is to try to design computer programs that reason in a more human-like style.
Most of this book is about positive results: certain logical problems can in principle be automated. Here we consider the limits of automation, showing that algorithms in the usual sense cannot exist for certain logical problems. In particular we show that pure first-order logic is not decidable, and that the theory of natural numbers with addition and multiplication is, in a precise sense, nowhere near decidable. In making our way to these results, we prove Gödel's famous first incompleteness theorem.
Hilbert's programme
The idea of mechanizing reasoning fascinated people long before computers. Specific questions about the scope and limits of mechanization were investigated systematically in the early part of the twentieth century, largely due to the influence of Hilbert's programme to place mathematics on firm foundations. To appreciate the full cultural significance of the results that follow, it's worth examining the intellectual ferment over the foundations of mathematics that made these questions so significant at the time.
At various points in history, mathematicians have become concerned over apparent problems in the accepted foundations of their subject. For example, the Pythagoreans tried to base mathematics just on the rational numbers, and so were disturbed by the discovery that √2 must be irrational. Subsequently, the apparently self-contradictory treatment of infinitesimals in Newton and Leibniz's calculus disturbed many (Berkeley 1734), as later did the use of complex numbers and the discovery of non-Euclidean geometries. Later still, when the theory of infinite sets began to be pursued for its own sake and generalized, mainly by Cantor, renewed foundational worries appeared.
We study propositional logic in detail, defining its formal syntax in OCaml together with parsing and printing support. We discuss some of the key propositional algorithms and prove the compactness theorem, as well as indicating the surprisingly rich applications of propositional theorem proving.
The syntax of propositional logic
Propositional logic is a modern version of Boole's algebra of propositions as presented in Section 1.4. It involves expressions called formulas that are intended to represent propositions, i.e. assertions that may be considered true or false. These formulas can be built from constants ‘true’ and ‘false’ and some basic atomic propositions (atoms) using various logical connectives (‘not’, ‘and’, ‘or’, etc.). The atomic propositions are like variables in ordinary algebra, and we sometimes refer to them as propositional variables or Boolean variables. As the word ‘atomic’ suggests, we do not analyze their internal structure; that will be considered when we treat first-order logic in the next chapter.
Representation in OCaml
We represent propositional formulas using an OCaml datatype by analogy with the type of expressions in Section 1.6. We allow the ‘constant’ propositions False and True and atomic formulas Atom p, and can build up formulas from them using the unary operator Not and the binary connectives And, Or, Imp (‘implies’) and Iff (‘if and only if’). We defer a discussion of the exact meanings of these connectives, and deal first with immediate practicalities.
The underlying set of atomic propositions is largely arbitrary, although for some purposes it's important that it be infinite, to avoid a limit on the complexity of formulas we can consider. In abstract treatments it's common just to index the primitive propositions by number.
In this chapter we introduce logical reasoning and the idea of mechanizing it, touching briefly on important historical developments. We lay the groundwork for what follows by discussing some of the most fundamental ideas in logic as well as illustrating how symbolic methods can be implemented on a computer.
What is logical reasoning?
There are many reasons for believing that something is true. It may seem obvious or at least immediately plausible, we may have been told it by our parents, or it may be strikingly consistent with the outcome of relevant scientific experiments. Though often reliable, such methods of judgement are not infallible, having been used, respectively, to persuade people that the Earth is flat, that Santa Claus exists, and that atoms cannot be subdivided into smaller particles.
What distinguishes logical reasoning is that it attempts to avoid any unjustified assumptions and confine itself to inferences that are infallible and beyond reasonable dispute. To avoid making any unwarranted assumptions, logical reasoning cannot rely on any special properties of the objects or concepts being reasoned about. This means that logical reasoning must abstract away from all such special features and be equally valid when applied in other domains. Arguments are accepted as logical based on their conformance to a general form rather than because of the specific content they treat. For instance, compare this traditional example:
All men are mortal
Socrates is a man
Therefore Socrates is mortal
with the following reasoning drawn from mathematics:
All positive integers are the sum of four integer squares
15 is a positive integer
Therefore 15 is the sum of four integer squares
These two arguments are both correct, and both share a common pattern:
We've considered various algorithms (tableaux, resolution, etc.) for verifying that a first-order formula is logically valid, if indeed it is. But these will not in general tell us when a formula is not valid. We'll see in Chapter 7 that there is no systematic procedure for doing so. However, there are procedures that work for certain special classes of formulas, or for validity in certain special (classes of) models, and we discuss some of the more important ones in this chapter. Often these naturally generalize common decision problems in mathematics and universal algebra such as equation-solving or the ‘word problem’.
The decision problem
There are three natural and closely connected problems for first-order logic for which we might want an algorithmic solution. By negating the formula, we can according to taste present them in terms of validity or unsatisfiability.
Confirm that a logically valid (or unsatisfiable) formula is indeed valid (resp. unsatisfiable), and never confirm an invalid (satisfiable) one.
Confirm that a logically invalid (or satisfiable) formula is indeed invalid (resp. satisfiable), and never confirm a valid (unsatisfiable) one.
Test whether a formula is valid or invalid (or whether it is satisfiable or unsatisfiable).
Evidently (3) encompasses both (1) and (2). Conversely, solutions to both (1) and (2) could be used together to solve (3): just run the verification procedures for validity and invalidity (or satisfiability and unsatisfiability) in parallel. Now, we have presented explicit solutions to (1), such as tableaux or resolution. But these do not solve (3). Given a satisfiable formula, these algorithms, while at least not incorrectly claiming they are unsatisfiable, will not always terminate.
In this appendix we collect together some useful mathematical background. Readers may prefer to read the main text and refer to this appendix only if they get stuck. We do not give much in the way of proofs and the style is terse and rather dull, so this is not a substitute for standard texts. For example, Forster (2003) discusses in detail almost all the topics here, as well as much relevant material in logic and computability and some more advanced topics in set theory.
Mathematical notation and terminology
We use ‘iff’ as a shorthand for ‘if and only if’ and ‘w.r.t.’ for ‘with respect to’. We write x | y, read ‘x divides y’, to mean that y is an integer multiple of x, e.g. 3 | 6, 1 | x and x | 0. We use the usual arithmetic operations (‘+’ etc.) on numbers; we generally write xy for the product of x and y, but sometimes write x y to emphasize that there is an operation involved and make the syntax more regular. An operation such as addition for which the order of the two arguments is irrelevant (x + y = y + x) is called commutative, and an operation where the association does not matter (x + (y + z) = (x + y) + z) is said to be associative. We also use the conventional equality and inequality relations (‘=’, ‘≤’ etc.) on numbers, and sometimes emphasize that an equation is the definition of a concept by decorating the equality sign with def, e.g. tan(x) =def sin(x)/cos(x).
We now move from propositional logic to richer first-order logic, where propositions can involve non-propositional variables that may be universally or existentially quantified. We show how proof in first-order logic can be mechanized naively via Herbrand's theorem. We then introduce various refinements, notably unification, that help make automated proof more efficient.
First-order logic and its implementation
Propositional logic only allows us to build formulas from primitive propositions that may independently be true or false. However, this is too restrictive to capture patterns of reasoning where the truth or falsity of propositions depends on the values of non-propositional variables. For example, a typical proposition about numbers is ‘m < n’, and its truth depends on the values of m and n. If we simply introduce a distinct propositional variable for each such proposition, we lose the ability to interrelate different instances according to the variables they contain, e.g. to assert that ¬(m < nΛn < m). Firstorder (predicate) logic extends propositional logic in two ways to accommodate this need:
• the atomic propositions can be built up from non-propositional variables and constants using functions and predicates;
• the non-propositional variables can be bound with quantifiers.
We make a syntactic distinction between formulas, which are intuitively intended to be true or false, and terms, which are intended to denote ‘objects’ in the domain being reasoned about (numbers, people, sets or whatever). Terms are built up from (object-denoting) variables using functions. In discussions we use f(s, t, u) for a term built from subterms s, t and u using the function f, or sometimes infix notation like s + t rather than +(s, t) where it seems more natural or familiar.