Turing's deep 1937 paper made it clear that Gödel's astonishing earlier results on arithmetic undecidability related in a very natural way to a class of computing automata, nonexistent at the time of Turing's paper, but destined to appear only a few years later, subsequently to proliferate as the ubiquitous stored-program computer of today. The appearance of computers, and the involvement of a large scientific community in elucidation of their properties and limitations, greatly enriched the line of thought opened by Turing. Turing's distinction between computational problems was rawly binary: some were solvable by algorithms, others not. Later work, of which an attractive part is elegantly developed in the present volume, refined this into a multiplicity of scales of computational difficulty, which is still developing as a fundamental theory of information and computation that plays much the same role in computer science that classical thermodynamics plays in physics: by defining the outer limits of the possible, it prevents designers of algorithms from trying to create computational structures which provably do not exist. It is not surprising that such a thermodynamics of information should be as rich in philosophical consequence as thermodynamics itself.
This quantitative theory of description and computation, or Computational Complexity Theory as it has come to be known, studies the various kinds of resources required to describe and execute a computational process. Its most striking conclusion is that there exist computations and classes of computations having innocent-seeming definitions but nevertheless requiring inordinate quantities of some computational resource.
Resources for which results of this kind have been established include:
(a) The mass of text required to describe an object;
(b) The volume of intermediate data which a computational process would need to generate;
(c) The time for which such a process will need to execute, either on a standard "serial" computer or on computational structures unrestricted in the degree of parallelism which they can employ.
Having done the bulk of the work necessary to encode the halting probability Ω as an exponential diophantine equation, we now turn to theory. In Chapter 5 we trace the evolution of the concepts of program-size complexity. In Chapter 6 we define these concepts formally and develop their basic properties. In Chapter 7 we study the notion of a random real and show that Ω is a random real. And in Chapter 8 we develop incompleteness theorems for random reals.
To understand domain theory, first you should become acquainted with the λ-calculus. Its syntax could hardly be simpler, but its semantics could hardly be more complex. The typed λ-calculus has a simple semantics: each type denotes a set; each lambda expression denotes a function.
Through a careful analysis of nontermination, Dana Scott developed the typed λ-calculus into a Logic for Computable Functions (LCF). In the semantics of this logic, each data type denotes a set whose values are partially ordered with respect to termination properties. A computable function between partially ordered sets is monotonic and continuous. A recursive definition is understood as a fixed point of a continuous function. For reasoning about recursive definitions, Scott introduced the principle of fixed point induction. The resulting logic is ideal for reasoning about recursion schemes and other aspects of computable functions.
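To make the fixed-point reading of recursion concrete, here is a minimal sketch in OCaml (a descendant of the ML discussed below) rather than in LCF's PPλ; the names fix and fact_body are illustrative only. In Scott's semantics the least fixed point of a continuous functional is the limit of the chain of approximations ⊥, f(⊥), f(f(⊥)), …, and fixed point induction is the principle for reasoning about that limit.

    (* A recursive definition read as the fixed point of a functional.
       'fix' ties the knot; 'fact_body' is the functional whose least
       fixed point is the factorial function. *)
    let rec fix f x = f (fix f) x

    let fact_body self n = if n = 0 then 1 else n * self (n - 1)

    let fact = fix fact_body

    let () = assert (fact 5 = 120)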
The lambda calculus
The λ-calculus has a long and many-faceted relationship with computer science. Back in the 1930s, Alonzo Church formulated a notion of ‘effectively computable function’ in terms of functions expressible in the λ-calculus. His notion was shown to give the same set of functions as other models of computation such as the general recursive functions and Turing machines. Church's thesis states that these functions and no others are effectively computable. The λ-calculus influenced the design of the programming language Lisp and is closely reflected in the language ML.
ML, the meta language of LCF, allows computation on expressions of PPλ, the object language. An ML program may use quotations, an automatic mechanism for constructing well-typed PPλ expressions. At a lower level there are functions for building and taking apart PPλ expressions: for example, a function that maps A ∧ B ∧ C to the list of formulae A, B, C. The most complicated functions perform substitution or pattern matching.
If you intend to use LCF, now is the time to start. You will find the examples easier to understand if you work them on the computer. Section 5.7 describes how to start and finish a session.
The syntax of PPλ
PPλ expressions include types, terms, formulae, goals, and theorems. The corresponding ML data types are type, term, form, goal, and thm. A function for building formulae might have the type form × form → form.
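The concrete representation of PPλ expressions is LCF's own affair; purely as an illustration, a hypothetical ML-style datatype and the kind of building and taking-apart functions mentioned above might look as follows in OCaml (the names Atomic, Conj, mk_conj, and conjuncts are invented for this sketch, not LCF's actual interface).

    (* A toy formula type with atomic formulae and conjunction. *)
    type form =
      | Atomic of string        (* an atomic formula, named by a string *)
      | Conj of form * form     (* conjunction: A ∧ B *)

    (* Building: a function of type form * form -> form. *)
    let mk_conj (a, b) : form = Conj (a, b)

    (* Taking apart: map A ∧ B ∧ C to the list of formulae [A; B; C]. *)
    let rec conjuncts (f : form) : form list =
      match f with
      | Conj (a, b) -> conjuncts a @ conjuncts b
      | _ -> [f]

    (* Example: conjuncts (mk_conj (Atomic "A", mk_conj (Atomic "B", Atomic "C")))
       yields [Atomic "A"; Atomic "B"; Atomic "C"]. *)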
Syntactic conventions
The abstract syntax of a PPλ expression is not a string of symbols but a tree structure. A tree is not only the most convenient representation conceptually, but gives the most efficient implementation in the computer.
Syntax is specified by BNF equations. Alternative syntax phrases are separated by a vertical bar: |. A phrase that can be repeated zero or more times is enclosed in braces: {, }. Syntactic classes like type-variable appear in italics. Language symbols appear, underlined, in typewriter font to emphasize that they are typed and displayed on computer terminals.
The purpose of this chapter is to introduce the notion of program-size complexity. We do this by giving a smoothed-over story of the evolution of this concept, giving proof sketches instead of formal proofs, starting with program size in LISP. In Chapter 6 we will start over, and give formal definitions and proofs.
Complexity via LISP Expressions
Having gone to the trouble of defining a particularly clean and elegant version of LISP, one in which the definition of LISP in LISP really is equivalent to running the interpreter, let's start using it to prove theorems! The usual approach to program-size complexity is rather abstract, in that no particular programming language is directly visible. Eventually, we shall have to go a little bit in this direction. But we can start with a very straightforward concrete approach, namely to consider the size of a LISP expression measured by the number of characters it has. This will help to build our intuition before we are forced to use a more abstract approach to get stronger theorems. The path we shall follow is similar to that in my first paper [CHAITIN (1966,1969a)], except that there I used Turing machines instead of LISP.
So we shall now study, for any given LISP object, its program-size complexity, which is the size of the smallest program (i.e., S-expression) for calculating it. As for notation, we shall use H_LISP (“information content measured using LISP”), usually abbreviated in this chapter by omitting the subscript for LISP. And we write |S| for the size in characters of an S-expression S.
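In symbols, the definition just given can be restated as

    H_LISP(x)  =  min { |p| : p is an S-expression whose value is x }.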
In Part I of this monograph, we do the bulk of the preparatory work that enables us in Part II to exhibit an exponential diophantine equation that encodes the successive bits of the halting probability Ω.
In Chapter 2 we present a method for compiling register machine programs into exponential diophantine equations. In Chapter 3 we present a stripped-down version of pure LISP. And in Chapter 4 we present a register machine interpreter for this LISP, and then compile it into a diophantine equation. The resulting equation, which unfortunately is too large to exhibit here in its entirety, has a solution, and only one, if the binary representation of a LISP expression that halts, i.e., that has a value, is substituted for a distinguished variable in it. It has no solution if the number substituted is the binary representation of a LISP expression without a value.
Having dealt with programming issues, we can then proceed in Part II to theoretical matters.
In this chapter we present a new definition of program-size complexity. H(A, B/C, D) is defined to be the size in bits of the shortest self-delimiting program for calculating strings A and B if one is given a minimal-size self-delimiting program for calculating strings C and D. As is the case in LISP, programs are required to be self-delimiting, but instead of achieving this with balanced parentheses, we merely stipulate that no meaningful program be a prefix of another. Moreover, instead of being given C and D directly, one is given a program for calculating them that is minimal in size. Unlike previous definitions, this one has precisely the formal properties of the entropy concept of information theory.
What train of thought led us to this definition? Following [CHAITIN (1970a)], think of a computer as decoding equipment at the receiving end of a noiseless binary communications channel. Think of its programs as code words, and of the result of the computation as the decoded message. Then it is natural to require that the programs/code words form what is called a “prefix-free set,” so that successive messages sent across the channel (e.g. subroutines) can be separated. Prefix-free sets are well understood; they are governed by the Kraft inequality, which therefore plays an important role in this chapter.
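For reference, the Kraft inequality states that any prefix-free set S of binary code words satisfies

    Σ_{p∈S} 2^(−|p|)  ≤  1,

and, conversely, that any assignment of code word lengths obeying this inequality can be realized by some prefix-free set.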
One is thus led to define the relative complexity H(A, B/C, D) of A and B with respect to C and D to be the size of the shortest self-delimiting program for producing A and B from C and D. However, this is still not quite right.
In this chapter we present the beautiful work of JONES and MATIJASEVIC (1984), which is the culmination of a half century of development starting with GÖDEL (1931), and in which the paper of DAVIS, PUTNAM, and ROBINSON (1961) on Hilbert's tenth problem was such a notable milestone. The aim of this work is to encode computations arithmetically. As Gödel showed with his technique of Gödel numbering and primitive recursive functions, the metamathematical assertion that a particular proposition follows by certain rules of inference from a particular set of axioms, can be encoded as an arithmetical or number theoretic proposition. This shows that number theory well deserves its reputation as one of the hardest branches of mathematics, for any formalized mathematical assertion can be encoded as a statement about positive integers. And the work of Davis, Putnam, Robinson, and Matijasevic has shown that any computation can be encoded as a polynomial. The proof of this assertion, which shows that Hilbert's tenth problem is unsolvable, has been simplified over the years, but it is still fairly intricate and involves a certain amount of number theory; for a review see DAVIS, MATIJASEVIC, and ROBINSON (1976).
Formulas for primes: An illustration of the power and importance of these ideas is the fact that a trivial corollary of this work has been the construction of polynomials which generate or represent the set of primes; JONES et al. (1976) have performed the extra work to actually exhibit manageable polynomials having this property.
The aim of this book is to present the strongest possible version of Gödel's incompleteness theorem, using an information-theoretic approach based on the size of computer programs.
One half of the book is concerned with studying Ω, the halting probability of a universal computer if its program is chosen by tossing a coin. The other half of the book is concerned with encoding Ω as an algebraic equation in integers, a so-called exponential diophantine equation.
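Written out in the notation made precise later in the book, with U the universal computer in question and its programs self-delimiting bit strings, this halting probability is

    Ω  =  Σ_{U(p) halts} 2^(−|p|),

so that Ω is literally the probability that U halts when the successive bits of its program p are chosen by tossing a fair coin.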
Gödel's original proof of his incompleteness theorem is essentially the assertion that one cannot always prove that a program will fail to halt, i.e., that it will never produce any output. He then converts this into an arithmetical assertion. Over the years this has been improved; it follows from the work on Hilbert's tenth problem that Gödel's theorem is equivalent to the assertion that one cannot always prove that a diophantine equation has no solutions if this is the case.
In our approach to incompleteness, we shall ask whether or not a program produces an infinite amount of output rather than asking whether it produces any; this is equivalent to asking whether or not a diophantine equation has infinitely many solutions instead of asking whether or not it is solvable.
If one asks whether or not a diophantine equation has a solution for N different values of a parameter, the N different answers to this question are not independent; in fact, they are only log₂ N bits of information. (To see why, note that it suffices to know how many of the N equations have a solution, a number between 0 and N; given that count, one can in principle recover each individual answer by searching for solutions in parallel until that many have been found.)
Having developed the necessary information-theoretic formalism in Chapter 6, and having studied the notion of a random real in Chapter 7, we can now begin to derive incompleteness theorems.
The setup is as follows. The axioms of a formal theory are considered to be encoded as a single finite bit string, the rules of inference are considered to be an algorithm for enumerating the theorems given the axioms, and in general we shall fix the rules of inference and vary the axioms. More formally, the rules of inference F may be considered to be an r.e. set of propositions of the form
“Axioms ⊢_F Theorem”.
The r.e. set of theorems deduced from the axiom A is determined by selecting from the set F the theorems in those propositions which have the axiom A as an antecedent. In general we'll consider the rules of inference F to be fixed and study what happens as we vary the axioms A. By an n-bit theory we shall mean the set of theorems deduced from an n-bit axiom.
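In this notation (writing T_F(A) for the set of theorems, purely for clarity here),

    T_F(A)  =  { t : “A ⊢_F t” is in F },

and an n-bit theory is T_F(A) for an axiom A that is an n-bit string.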
Incompleteness Theorems for Lower Bounds on Information Content
Let's start by rederiving within our current formalism an old and very basic result, which states that even though most strings are random, one can never prove that a specific string has this property.
As we saw when we studied randomness, if one produces a bit string s by tossing a coin n times, 99.9% of the time it will be the case that H(s) ≈ n + H(n). In fact, if one lets n go to infinity, with probability one H(s) > n for all but finitely many n (Theorem R5).
In this chapter we convert the definition of LISP in LISP given in Section 3.6 into a register machine program. Then we compile this register machine program into an exponential diophantine equation.
Register Machine Pseudo–Instructions
The first step to program an interpreter for our version of pure LISP is to write subroutines for breaking S-expressions apart (SPLIT) and for putting them back together again (JOIN). The next step is to use SPLIT and JOIN to write routines that push and pop the interpreter stack. Then we can raise the level of discourse by defining register machine pseudo-instructions which are expanded by the assembler into calls to these routines; i.e., we extend register machine language with pseudo-machine instructions which expand into several real machine instructions. Thus we have four “microcode” subroutines: SPLIT, JOIN, PUSH, and POP. SPLIT and JOIN are leaf routines, and PUSH and POP call SPLIT and JOIN.
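As a rough sketch of what these four routines accomplish (not Chaitin's actual register machine code, which manipulates encoded S-expressions held in registers), one can model them in OCaml as follows, with the interpreter stack itself represented as an S-expression.

    (* Toy model: S-expressions as a binary-tree type. *)
    type sexp = Atom of string | Cons of sexp * sexp

    (* SPLIT: break a composite S-expression into its two parts. *)
    let split = function
      | Cons (head, tail) -> (head, tail)
      | Atom _ as a -> (a, a)     (* arbitrary convention for atoms in this sketch *)

    (* JOIN: put the two parts back together. *)
    let join head tail = Cons (head, tail)

    (* PUSH and POP are built on top of JOIN and SPLIT: the stack is a list
       represented as an S-expression. *)
    let push stack x = join x stack        (* returns the new stack *)
    let pop stack = split stack            (* returns (top, rest of stack) *)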
Figure 9 is a table giving the twelve register machine pseudo-instructions.
Now a few words about register usage; there are only 19 registers! First of all, the S-expression to be evaluated is input in EXPRESSION, and the value of this S-expression is output in VALUE. There are three large permanent data structures used by the interpreter:
the association list ALIST which contains all variable bindings,
the interpreter STACK used for saving and restoring information when the interpreter calls itself, and
the current remaining DEPTH limit on evaluations.
All other registers are either temporary scratch registers used by the interpreter (FUNCTION, ARGUMENTS, VARIABLES, X, and Y), or hidden registers used by the microcode rather than directly by the interpreter.
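A sketch of this register file as a record might look like the following, with the sexp type of the SPLIT/JOIN sketch repeated so that it stands alone, the hidden microcode registers that account for the rest of the 19 omitted, and FUNCTION renamed fn because function is an OCaml keyword.

    type sexp = Atom of string | Cons of sexp * sexp

    type registers = {
      mutable expression : sexp;   (* input: the S-expression to evaluate    *)
      mutable value      : sexp;   (* output: the value of that S-expression *)
      mutable alist      : sexp;   (* association list of variable bindings  *)
      mutable stack      : sexp;   (* interpreter stack                      *)
      mutable depth      : sexp;   (* remaining depth limit on evaluations   *)
      mutable fn         : sexp;   (* scratch: FUNCTION                      *)
      mutable arguments  : sexp;   (* scratch: ARGUMENTS                     *)
      mutable variables  : sexp;   (* scratch: VARIABLES                     *)
      mutable x          : sexp;   (* scratch: X                             *)
      mutable y          : sexp;   (* scratch: Y                             *)
    }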