TWO-SORTED FREGE ARITHMETIC IS NOT CONSERVATIVE

Abstract. Neo-Fregean logicists claim that Hume’s Principle (HP) may be taken as an implicit definition of cardinal number, true simply by fiat. A long-standing problem for neo-Fregean logicism is that HP is not deductively conservative over pure axiomatic second-order logic. This seems to preclude HP from being true by fiat. In this paper, we study Richard Kimberly Heck’s Two-Sorted Frege Arithmetic (2FA), a variation on HP which has been thought to be deductively conservative over second-order logic. We show that it isn’t. In fact, 2FA is not conservative over n-th order logic, for all $n \geq 2$. It follows that in the usual one-sorted setting, HP is not deductively Field-conservative over second- or higher-order logic.

§1. Introduction. Frege [10-12] sought to derive the theorems of arithmetic from nothing but basic logical laws and definitions. Such a derivation, called a logicist derivation of arithmetic, would provide the ultimate foundation for our arithmetical knowledge. It would justify the theorems of arithmetic once and for all by deriving them from principles that needed no justification: principles that were either self-evident ('basic logical laws') or true simply by stipulation ('definitions').
By his own lights, Frege did not manage to give a logicist derivation of arithmetic. But he did show how to derive a very powerful system of arithmetic from a single, natural principle, known as Hume's Principle (HP). Informally, HP says, 'The number of Fs is equal to the number of Gs iff there is a one-one correspondence between the Fs and the Gs.' In second-order logic, HP is expressible as the universal closure of
$$\#F = \#G \leftrightarrow \exists R\,(F \approx_R G),$$
where # is an operator that combines with monadic second-order variables F, G to form terms of object type, and $F \approx_R G$ abbreviates the statement that R is a one-one correspondence between the Fs and the Gs. Then we have the following beautiful result:
Theorem 1.1 (Frege's Theorem). The theorems of second-order arithmetic are derivable in second-order logic from HP together with eliminative definitions of natural number, zero, and successor.
Neo-Fregean logicists, preeminently Hale and Wright [14], argue that Frege's Theorem already yields a logicist derivation of arithmetic. They claim that HP may be taken as an implicit definition of the operator # ('the number of') in purely logical terms. Hale and Wright's notion of implicit definition is deeply controversial. For our purposes, the main point is that Hale and Wright conceive of implicit definitions as true simply by stipulation [14, p. 117]. Such definitions need no justification. They are true by fiat. So, if Hale and Wright are correct, Frege's Theorem does indeed yield a logicist derivation of arithmetic.
Not just anything can be stipulated to be true. We cannot establish any new 'substantive' truths by fiat. No one could have established by fiat that the Morning Star is the Evening Star. Accordingly, it is natural to think that any legitimate stipulative definition must meet the following requirement, known as conservativeness:
Definition 1.2. Let T be a theory in a formal language L. Let Δ be a definition of one new sign, and let $L^+$ be the language obtained by adding that new sign to L. Assume that deductive systems for L and $L^+$ have been specified. Then Δ is conservative over T iff any L-formula that is derivable in the system for $L^+$ from T + Δ is already derivable in the system for L from T.
Intuitively, a definition is conservative over our theory T just in case adding it to our theory does not yield any new theorems expressible entirely in old vocabulary. The definition does not settle any open questions that we already knew how to ask.
But HP is not conservative. More precisely, HP is not conservative over pure axiomatic second-order logic, which presumably ought to be the starting theory for aspiring logicists. For HP proves a sentence DI in the language of pure second-order logic which says that the universe is Dedekind-infinite ('there is a one-one mapping from the universe into itself that is not onto'). But DI is not a theorem of pure second-order logic. So, it seems that HP cannot be a legitimate stipulative definition. Call this the conservativeness problem for neo-Fregean logicism.
The conservativeness problem is robust. Definitions that are conservative over pure second-order logic seem to be mathematically very weak, and hence unable to provide a foundation for arithmetic. Furthermore, adding more basic logical laws won't help unless those laws suffice to prove DI. But it seems like a tall order to prove the existence of infinitely many objects from basic logical laws alone.
Hale and Wright respond to the conservativeness problem by denying that stipulative definitions must be conservative in the sense of Definition 1.2. Roughly speaking, they hold that stipulative definitions need only satisfy a modified conservativeness requirement, known as Field-conservativeness. We set out to explore a different approach. Is it possible to find a variant of HP that is conservative in the standard deductive sense, that is, the sense of Definition 1.2?
A promising direction is suggested by Heck's work on the Julius Caesar problem [15, 16]. Heck reconstrues Hume's Principle as introducing a new sort of singular term into the language. Call the reconstrued principle two-sorted Hume's Principle (2HP), and the theory that results from supplementing 2HP with logical axioms for the expanded language, two-sorted Frege Arithmetic (2FA). The theory 2FA interprets second-order arithmetic in the numerical sort. In particular, 2FA proves that the numerical universe is Dedekind-infinite. But there is no obvious witness to non-conservativeness, because the numerical sort is not part of the base language. Indeed, it has been claimed that 2FA is conservative over pure second-order logic [3, p. 237, n. 7].
In this paper we prove that 2FA is not conservative over pure second-order logic. In fact, we prove something stronger. Our strategy is based on the following little fact:
Lemma 1.3. Let T be a theory in a formal language L, and let A be any L-sentence. Suppose that a sentence Δ is not conservative over T + A. Then Δ is not conservative over T.
Proof. Let ϕ be an L-sentence such that $T + A + \Delta \vdash \phi$, but $T + A \nvdash \phi$. By the Deduction Theorem, we have $T + \Delta \vdash A \to \phi$, but $T \nvdash A \to \phi$. So the L-sentence $A \to \phi$ witnesses that Δ is not conservative over T.
In Section 7, we consider a theory w2FA that is much weaker than 2FA. We show that w2FA is non-conservative over pure second-order logic together with an axiom saying that the base universe is infinite. In other words, even if we already know that there are infinitely many objects, w2FA tells us something new about them! Then from Lemma 1.3, it follows that w2FA, and hence 2FA, is non-conservative over pure second-order logic.
In Section 8, we show that for the weaker theory w2FA, the non-conservativeness vanishes if we strengthen the base theory in either of two natural ways. First, w2FA is conservative over third- or higher-order logic. Second, w2FA is conservative over second-order logic plus 'the base universe is finite'.
In Section 9, we present a different proof that 2FA is not conservative over pure second-order logic. This proof shows that 2FA remains non-conservative over the stronger base theories discussed in the previous section. Specifically, we show that 2FA is not conservative over second-order logic plus 'the base universe is finite', and the proof of this fact generalizes to third- and higher-order logic.
In order to state and prove these results, we will need some preliminaries. In Section 2, we explain the logical setting for the paper: many-sorted axiomatic second-order logic. In Section 3, we explain how to construe Hume's Principle in a many-sorted setting, and we define the theories w2FA and 2FA. In Section 4, we present some background material on first- and second-order arithmetic. In Section 5, we show how to formalize some facts about well-orderings and finiteness in second-order logic. In Section 6, we discuss the Fraenkel model, which is the minimal infinite model of pure second-order logic.
In Sections 7-9, we prove the main results. Lastly, in Section 10, we connect our work to the literature on Field-conservativeness and related notions. Our main result implies that in a one-sorted setting, HP is neither deductively Field-conservative nor deductively Caesar-neutral conservative over second- or higher-order logic. This answers some open problems raised by Shapiro and Weir [27, p. 298], Fine [9, p. 192, n. 1], and Studd [30, p. 597]. We conclude by mentioning some open problems of our own.
§2. Many-sorted second-order logic. We work in axiomatic second-order logic with many sorts of singular terms and first-order variables. In this section we explain the logical framework in considerable generality.
In Section 2.1, we define 'sort'. In Sections 2.2 and 2.3, we define second-order languages $L_J[K]$ for any nonempty set of object sorts J and any set of constant symbols K. We present deductive systems and general semantics for these languages. In Section 2.4, we define the two many-sorted second-order languages that will be central to the rest of the paper, called the base language $L := L_{\{0\}}[\emptyset]$ and the expanded language $L^+ := L_{\{0,n\}}[\{\#_0, \#_n\}]$.

Sorts.
Let J be any nonempty set of symbols. These symbols are called first-order sorts or object sorts.
Let $\mathrm{Sorts}_2(J)$ be the set of all tuples $j_1, \dots, j_n$ with n ≥ 1 and $j_1, \dots, j_n \in J$. These tuples are second-order relation sorts formed from J.
In the languages $L_J[K]$, there will be no function variables and no third-order variables. We only allow variables of sorts $\sigma \in J \cup \mathrm{Sorts}_2(J)$. However, there may be constant symbols of any sort $\sigma \in \mathrm{Sorts}(J)$.

Languages without constant symbols.
For any set of object sorts J, we define the second-order language $L_J$ as follows:
(i) The alphabet of $L_J$ contains variables $x^j, y^j, z^j, \dots$ for each object sort $j \in J$, and relation variables $X^\sigma, Y^\sigma, Z^\sigma, \dots$ for each second-order sort $\sigma \in \mathrm{Sorts}_2(J)$. There are no nonlogical constant symbols. The logical constants are ¬, →, ∀, =.
(ii) The terms of sort σ are the variables of sort σ, for each $\sigma \in J \cup \mathrm{Sorts}_2(J)$.
(iii) In atomic formulas, we require that the sorts match. More precisely, the atomic formulas are strings of the form $t_1^{j} = t_2^{j}$ and $T^{j_1,\dots,j_n} t_1^{j_1}, \dots, t_n^{j_n}$, where each $t^j$ is a term of sort $j \in J$, and $T^{j_1,\dots,j_n}$ is a term of sort $j_1, \dots, j_n$.
(iv) If ϕ, ψ are formulas and $x^j$, $X^\sigma$ are variables, then ¬ϕ, ϕ → ψ, $\forall x^j \phi$, $\forall X^\sigma \phi$ are also formulas.
[Table 1: axioms and rule of inference of the deductive system for $L_J$. Recoverable entries: the side conditions 'x not free in ϕ' and 'X not free in ϕ', and the rule of inference 'from ϕ and ϕ → ψ, infer ψ'.]
In Table 1, let ϕ, ψ be any formulas of $L_J$. Let x, y, X be variables, and t, T be terms. (Note that x, y, t must all be of the same sort. Likewise, X and T must be of the same sort.) Let ϕ(t) be the result of substituting t for all free occurrences of x in ϕ. In (∗), let α be any atomic formula of $L_J$, and let α′ be any formula obtained from α by replacing zero or more occurrences of x with y. In Comprehension, we write $X\bar{x}$ to abbreviate $X^{j_1,\dots,j_n} x_1^{j_1}, \dots, x_n^{j_n}$.
The deductive system for $L_J$ is essentially equivalent to Shapiro's D2 minus the axiom schema of choice [26, pp. 66-67]. Compare [8, pp. 112-113]. Its axioms are all closed universal generalizations of the formulas depicted in Table 1. For legibility, we suppress sorts. But note that x, y, and t must all be of the same sort, and X and T must be of the same sort. This requirement is induced by the formation rules of the language.
An $L_J$-prestructure M is a collection of nonempty sets $\{M_\sigma : \sigma \in J \cup \mathrm{Sorts}_2(J)\}$ such that $M_{j_1,\dots,j_n} \subseteq \mathcal{P}(M_{j_1} \times \cdots \times M_{j_n})$ for all $j_1, \dots, j_n \in J$. Satisfaction and truth in M are defined inductively, taking variables of sort σ to range over domain $M_\sigma$.
A general $L_J$-structure is an $L_J$-prestructure in which the second-order comprehension axioms are satisfied. Our deductive system is sound and complete with respect to general $L_J$-structures.
A standard $L_J$-structure M is a general $L_J$-structure in which $M_{j_1,\dots,j_n} = \mathcal{P}(M_{j_1} \times \cdots \times M_{j_n})$ for all $j_1, \dots, j_n \in J$. So, a standard $L_J$-structure is fully specified by its object domains $\{M_j : j \in J\}$. Our deductive system is sound but not complete with respect to standard structures.

Languages with constant symbols.
We will now sketch how to add constant symbols to the languages $L_J$.
For each $\sigma \in \mathrm{Sorts}(J)$, let $K_\sigma$ be a set of new symbols, called constant symbols. Each constant symbol is assigned to a particular sort σ, and is classified as an object, relation, or function constant accordingly. Assume that the $K_\sigma$'s are pairwise disjoint, or use superscripts to keep track of sorts. Let $K = \bigcup_{\sigma \in \mathrm{Sorts}(J)} K_\sigma$. Define the language $L_J[K]$ as follows:
(i) The alphabet of $L_J[K]$ is the alphabet of $L_J$ expanded by K.
(ii) If $\sigma \in J \cup \mathrm{Sorts}_2(J)$, the atomic terms of sort σ are the variables $x^\sigma$ and the constants in $K_\sigma$. If $\sigma \in \mathrm{Sorts}_3(J)$, the atomic terms of sort σ are the constants in $K_\sigma$. If $\sigma = \sigma_1, \dots, \sigma_n; \sigma_{n+1} \in \mathrm{FnSorts}(J)$, and $f \in K_\sigma$, and $t_1^{\sigma_1}, \dots, t_n^{\sigma_n}$ are terms of the indicated sorts, then $f t_1^{\sigma_1} \cdots t_n^{\sigma_n}$ is a term of sort $\sigma_{n+1}$.
(iii) The atomic formulas are defined as in $L_J$, except that we also allow atomic formulas of the form $T^\sigma t_1^{\sigma_1} \cdots t_n^{\sigma_n}$ with $\sigma = \sigma_1, \dots, \sigma_n \in \mathrm{Sorts}_3(J)$.
(iv) The inductive clauses generating the set of all formulas are unchanged.
The deductive system for $L_J[K]$ is obtained from the deductive system for $L_J$ by allowing ϕ, ψ to range over $L_J[K]$-formulas, α to range over atomic $L_J[K]$-formulas, and adding axioms of Extensionality analogous to the axioms of Identity.
An $L_J[K]$-prestructure M = (S, I) consists of an $L_J$-prestructure S together with an interpretation I of the constant symbols that meets the following three conditions:
General and standard $L_J[K]$-structures are defined analogously to $L_J$-structures.

The languages L and $L^+$.
We now define the two languages that will be at the center of the rest of the paper.
Definition 2.5. The base language is $L := L_{\{0\}}$.
Definition 2.6. The expanded language is $L^+ := L_{\{0,n\}}[\{\#_0, \#_n\}]$, where $\#_0$ and $\#_n$ are function constants of sorts 0; n and n; n, respectively.
The logical axioms for L and $L^+$ will be denoted by $\mathrm{Ax}_L$ and $\mathrm{Ax}_{L^+}$, respectively. Some notational conventions:
§3. Heck's theory 2FA. Think of the base language L as our starting language, and $\mathrm{Ax}_L$ as our starting theory. Heck [15], [16, pp. 150-151] reconstrues Hume's Principle as introducing a new, numerical sort of object (sort n), together with a host of new second-order relation sorts. The operator # ('the number of') may be applied to a concept variable of either sort, yielding a singular term of the numerical sort.
Definition 3.7. Weak two-sorted Hume's Principle (w2HP) is the universal closure of:
$$\#_0 F^0 = \#_0 G^0 \leftrightarrow \exists R^{00}\,(F^0 \approx_{R^{00}} G^0).$$
Here, $F^0 \approx_{R^{00}} G^0$ abbreviates the statement that $R^{00}$ is a one-one correspondence between $F^0$ and $G^0$.
Intuitively, w2HP gives the criterion of identity for numbers belonging to base-sort concepts. It tells us how to count base-sort objects. But w2HP does not tell us how to count numbers. Since we do in fact count numbers, we are motivated to consider a stronger principle.
Definition 3.8. Two-sorted Hume's Principle (2HP) is the conjunction of the universal closures of the following three $L^+$-formulas:
$$\#_0 F^0 = \#_0 G^0 \leftrightarrow \exists R^{00}\,(F^0 \approx_{R^{00}} G^0),$$
$$\#_n F^n = \#_n G^n \leftrightarrow \exists R^{nn}\,(F^n \approx_{R^{nn}} G^n),$$
$$\#_0 F^0 = \#_n G^n \leftrightarrow \exists R^{0n}\,(F^0 \approx_{R^{0n}} G^n).$$
The first line is w2HP. The second line gives the criterion of identity for numbers belonging to numerical concepts. The third line gives the mixed criterion of identity, which tells us (e.g.) whether the number of Julio-Claudian emperors equals the number of prime numbers less than 12.
Using our superscript-dropping conventions, we may write 2HP as follows:
Definition 3.9. Weak two-sorted Frege Arithmetic (w2FA) is the theory whose logical axioms are $\mathrm{Ax}_{L^+}$ and whose sole nonlogical axiom is w2HP. In other words, $\mathrm{w2FA} := \mathrm{Ax}_{L^+} + \mathrm{w2HP}$.
Definition 3.10. Two-sorted Frege Arithmetic (2FA) is the theory whose logical axioms are $\mathrm{Ax}_{L^+}$ and whose sole nonlogical axiom is 2HP. In other words, $\mathrm{2FA} := \mathrm{Ax}_{L^+} + \mathrm{2HP}$.
Notice that the logical axioms of 2FA include full second-order comprehension for the expanded language. So, by Frege's Theorem, 2FA interprets second-order arithmetic in the numerical sort. It follows that 2FA proves a sentence which says that the numerical universe is Dedekind-infinite. But this is not a witness to non-conservativeness, because the numerical sort is not part of the base language. Prima facie, it seems quite plausible that 2FA should be a conservative extension of $\mathrm{Ax}_L$.
§4. Arithmetic. We will study 2FA by comparing it with other, better-known systems of arithmetic. In Section 4.1, we describe the usual systems of first- and second-order arithmetic. In Section 4.2, we describe systems of arithmetic with no function symbols.
Definition 4.11. The language of Peano arithmetic, $L_{PA}$, is a classical first-order language with identity whose nonlogical vocabulary is (0, S, ≤, +, ·). Here, 0 is a constant symbol, S is a unary function symbol, ≤ is a binary relation symbol, and +, · are binary function symbols.
Definition 4.12. Robinson arithmetic, Q, is the theory in $L_{PA}$ with the following eight axioms:
Definition 4.13. Peano arithmetic, PA, is the result of adding to Q the following axiom schema of induction:
$$(\phi(0) \wedge \forall x\,(\phi(x) \to \phi(Sx))) \to \forall x\,\phi(x),$$
where ϕ(x) is any formula of $L_{PA}$.
An $L_{PA}$-formula is called bounded, or $\Sigma_0$, if all quantifiers occurring in it are bounded. An $L_{PA}$-formula is called $\Sigma_n$ (n ≥ 0) if it consists of a string of n alternating unbounded quantifiers, the first of which is existential, followed by a bounded formula. That is, a $\Sigma_n$ formula has the form ∃x∀y∃z∀w ⋯ ψ, where ψ is bounded.
Definition 4.14. The theory $I\Sigma_n$ (n ≥ 0) is the result of adding to Q the axiom schema of induction above, restricted to $\Sigma_n$ formulas.
Definition 4.15. The language of second-order arithmetic, $L_2$, is a two-sorted language consisting of all the vocabulary of $L_{PA}$, together with denumerably many monadic second-order variables X, Y, Z, ... and a second-order quantifier ∀X. The atomic formulas of $L_2$ include all strings of the form Xt, where t is a first-order term and X is a second-order variable.
The second-order variables of $L_2$ are usually called set variables, and the atomic formulas Xt are sometimes written t ∈ X. For our purposes, there is no difference between set variables and concept variables, and the predication relation ∈ may be left implicit. Hence, $L_2$ may be regarded as an expansion of the monadic fragment of L.
Definition 4.16. Second-order arithmetic, $Z_2$, is the theory in $L_2$ whose axioms are those of Q, together with the second-order induction axiom
$$\forall X\,((X0 \wedge \forall x\,(Xx \to X\,Sx)) \to \forall x\,Xx)$$
and the second-order comprehension scheme
$$\exists X\,\forall x\,(Xx \leftrightarrow \phi(x))$$
for each formula ϕ of $L_2$ not containing X free. As usual, ϕ may contain parameters, i.e., free first- or second-order variables other than x.

First-and second-order arithmetic with no function symbols.
In this section, we introduce an arithmetical language L in which successor, addition, and multiplication are rendered as relations (which may be only partially defined) instead of functions. This allows us to define BA, a weak system of arithmetic that does not assume the existence of infinitely many natural numbers. The main point of the section is to state Lemma 4.23 and prove Lemmas 4.25 and 4.28. We will use these lemmas in Section 9 only, so feel free to skip this section and return to it later.
Definition 4.17. Let L be the classical first-order language with identity whose nonlogical vocabulary is (0, S, ≤, A, M). Here, 0 is a constant symbol, S and ≤ are binary relation symbols, and A and M are ternary relation symbols.
An L-formula is called bounded, or $\Sigma_0$, if it contains only bounded quantifiers.
Definition 4.18. BA is the theory in L with the following axioms:
1. ≤ is a discrete linear order with least element 0,
2. Sxy iff y is the upper neighbor of x with respect to ≤,
3. Definitions of A and M:
4. Commutativity and associativity of A and M, distributivity, monotonicity of addition, monotonicity of multiplication by a positive number, and x ≤ y ↔ (∃u ≤ y)Axuy,
5. Induction scheme for $\Sigma_0$ formulas:
Definition 4.19. $I\Sigma_0$ is the result of adding to BA axioms saying that S, A, M define total functions, namely ∀x∃ySxy, etc.
An L-formula is called $\Sigma_n$ (n ≥ 0) if it consists of a string of n alternating unbounded quantifiers, the first of which is existential, followed by a bounded formula.
Definition 4.20. The theory $I\Sigma_n$ (n ≥ 0) is the result of adding to $I\Sigma_0$ the axiom schema of induction above, extended to $\Sigma_n$ formulas.
We now state some useful facts about BA and its relatives.
For each n ∈ N, let $x \doteq n$ abbreviate the L-formula
For proof, see [13, p. 233].
Lemma 4.25. Let $\phi(x_1, \dots, x_k)$ be a bounded formula, and let $a_1, \dots, a_k \in N$ be such that $N \models \phi(a_1, \dots, a_k)$. Then
Proof. Let ψ be the $L_{PA}$-formula obtained from ϕ by replacing Sxy, Axyz, Mxyz with Sx = y, x + y = z, x · y = z respectively. Observe that $\psi(S^{a_1}0, \dots, S^{a_k}0)$ is a true bounded sentence of $L_{PA}$. Now we argue as follows:
$$I\Sigma_0 \vdash \psi(S^{a_1}0, \dots, S^{a_k}0),$$
The first line holds because $I\Sigma_0$ proves all true bounded sentences. The second line follows by Lemma 4.22. Regarding the third line, it is easy to check that for each n ∈ N,
The fourth line follows by propositional logic, because ϕ and ψ differ only by applications of the equivalences in D. The fifth line follows because $I\Sigma_0 + D$ is conservative over $I\Sigma_0$ for L-formulas. The sixth line follows by Lemma 4.24.
Lastly, we describe a system of second-order arithmetic without function symbols.
Proof. We argue that Z 2 + D Z 2 . The other direction is easy. Observe that (Z 2 + D) (I Σ 0 + D) I Σ 0 Q. Furthermore, the two ways of formulating the second-order induction axiom are equivalent in the presence of Sx = y ↔ Sxy.
It remains to show that $Z_2 + D$ proves the second-order comprehension scheme for $L_2$. Take any $L_2$-formula ϕ. Let ψ be the formula obtained from ϕ by replacing each atomic predication Xt with ∃z(Xz ∧ z = t), where z is a new variable. Then every non-atomic term in ψ occurs in an equation $t_1 = t_2$. These equations are $L_{PA}$-formulas. By Lemma 4.23, $Z_2 + D$ proves each $L_{PA}$-formula to be equivalent to an L-formula. So, there is an $L_2$-formula ϕ′ such that $Z_2 + D \vdash \phi \leftrightarrow \phi'$. Now apply second-order comprehension to ϕ′, and we are done.
§5. Well-orderings and finiteness. In this section, we define 'well-ordering' in L, and we note that $\mathrm{Ax}_L$ proves that all well-orderings are comparable (Lemma 5.29). Then we define the notion of Stäckel-finiteness and prove the important lemma of induction on finite concepts (Lemma 5.32). We will use these lemmas throughout the paper.
For simplicity, we work in L. However, these notions can easily be extended to $L^+$. Let ∅ denote the empty concept. Let V denote the universal concept.
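Let '(X, R) is a linear order' abbreviate the formula
$$\forall x\,\forall y\,\forall z\,[(Xx \wedge Xy \wedge Xz) \to ((Rxy \wedge Ryx \to x = y) \wedge (Rxy \wedge Ryz \to Rxz) \wedge (Rxy \vee Ryx))].$$
In other words, (X, R) is a linear order just in case R is an antisymmetric, transitive, total relation on X.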
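Let '(X, R) is well-founded' abbreviate
$$\forall Y\,[(\forall x\,(Yx \to Xx) \wedge \exists x\,Yx) \to \exists y\,(Yy \wedge \forall z\,(Yz \to Ryz))].$$
Say that (X, R) is a well-ordering if (X, R) is a well-founded linear order. We say that two well-orderings $(X, \le_X)$ and $(Y, \le_Y)$ are order-isomorphic if there is a bijection f : X → Y such that $x \le_X x' \leftrightarrow f(x) \le_Y f(x')$ for all x, x′ in X. Strictly speaking, we should represent f as a relation, but we will go on using functional notation informally.
If (X, R) is a well-ordering, let $X_a$ be the initial segment of (X, R) up to a, defined by $X_a x \leftrightarrow (Xx \wedge Rxa)$. We also regard ∅ as an initial segment of (X, R). An initial segment of (X, R) is proper if it is not equal to X.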
Let $(X, \le_X) <_o (Y, \le_Y)$ abbreviate the statement that $(X, \le_X)$ is order-isomorphic with a proper initial segment of $(Y, \le_Y)$.
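Lemma 5.29 (Comparability of well-orderings). It is provable from $\mathrm{Ax}_L$ that any two well-orderings $(X, \le_X)$ and $(Y, \le_Y)$ are comparable, in the sense that exactly one of the following holds: (i) $(X, \le_X) <_o (Y, \le_Y)$; (ii) $(X, \le_X)$ and $(Y, \le_Y)$ are order-isomorphic; (iii) $(Y, \le_Y) <_o (X, \le_X)$.
Proof. Copy the usual set-theoretic proof [18, pp. 18-19].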
We now define the notion of Stäckel-finiteness. If R is a binary relation, let $R^{-1}$ be the converse of R, defined by $R^{-1}xy \leftrightarrow Ryx$.
Definition 5.30. Say that (X, R) is a double well-ordering if (X, R) and $(X, R^{-1})$ are both well-orderings.
Say that X is Stäckel-finite, abbreviated Fin(X), if X admits a double well-ordering. That is, $\mathrm{Fin}(X) \iff_{df} \exists R\,((X, R) \text{ is a double well-ordering})$.
Stäckel-finiteness is strictly stronger than Dedekind-finiteness, in the sense that
$$\mathrm{Ax}_L \vdash \mathrm{Fin}(X) \to \mathrm{DFin}(X) \quad \text{but} \quad \mathrm{Ax}_L \nvdash \mathrm{DFin}(X) \to \mathrm{Fin}(X),$$
where of course DFin(X) abbreviates that X is Dedekind-finite. Indeed, Fin(X) → DFin(X) is a version of the pigeonhole principle. It is provable from $\mathrm{Ax}_L$ by induction on finite concepts (Lemma 5.32). On the other hand, the Fraenkel model (defined in Section 6) is a model of DFin(V) + ¬Fin(V), witnessing that $\mathrm{Ax}_L \nvdash \mathrm{DFin}(X) \to \mathrm{Fin}(X)$.
Lastly, we show that $\mathrm{Ax}_L$ proves a principle of induction on Stäckel-finite concepts.
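Let X ∪ {a} be the concept defined by $(X \cup \{a\})x \leftrightarrow (Xx \vee x = a)$.
Lemma 5.32 (Induction on finite concepts). Let ϕ(X) be any formula of L. Then $\mathrm{Ax}_L$ proves the universal closure of
$$[\phi(\emptyset) \wedge \forall X\,\forall a\,((\mathrm{Fin}(X) \wedge \phi(X)) \to \phi(X \cup \{a\}))] \to \forall X\,(\mathrm{Fin}(X) \to \phi(X)).$$
Proof. Assume the antecedent. Take any X such that Fin(X). Fix a double well-ordering (X, R), and let Y be defined by $Yx \leftrightarrow (Xx \wedge \phi(X_x))$. It suffices to show that Y = X.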
Suppose not. Since (X, R) is a well-ordering, there is an R-least y such that Xy ∧ ¬Yy. It is easy to see that y cannot be the R-least element of X. Since $(X, R^{-1})$ is a well-ordering, y has a unique (X, R)-predecessor, call it z. By the minimality of y, we have Yz, and hence $\phi(X_z)$. Also, it is easy to see that $\mathrm{Fin}(X_z)$. It follows that $\phi(X_z \cup \{y\})$, which is to say $\phi(X_y)$. But this contradicts our choice of y.
§6. The Fraenkel model. In this section, we define the Fraenkel model and show that it is a model of $\mathrm{Ax}_L + \neg\mathrm{Fin}(V)$ (Lemmas 6.38 and 6.39). Then we show that the relations occurring in the Fraenkel model are exactly the sets definable by Boolean combinations of equalities with object parameters (Lemma 6.40). We will make good use of these facts in Section 7.
We remark that Lemma 6.40 implies that the Fraenkel model is the minimal infinite model of $\mathrm{Ax}_L$, i.e., it is a submodel of any infinite model of $\mathrm{Ax}_L$.
Definition 6.33. Let $A \subseteq N^n$ and E ⊆ N. We say that E is a support of A if every permutation π : N → N that fixes E pointwise fixes A setwise:
$$(\forall e \in E)(\pi(e) = e) \implies \forall x_1, \dots, x_n\,((x_1, \dots, x_n) \in A \leftrightarrow (\pi(x_1), \dots, \pi(x_n)) \in A).$$
Using the notation $\pi(A) = \{(\pi(x_1), \dots, \pi(x_n)) \in N^n : (x_1, \dots, x_n) \in A\}$, we can restate this property as follows: for every permutation π : N → N, if π fixes E pointwise, then π(A) = A. Say that A is symmetric if A has a finite support.
Definition 6.35. The Fraenkel model is the L-prestructure M whose object universe is N, and whose n-ary relations are the symmetric subsets of $N^n$. That is, writing $M_n$ for $M_{0,\dots,0}$ (n zeroes), $M_n = \{A \subseteq N^n : A \text{ is symmetric}\}$.
It is well known that M is a model of $\mathrm{Ax}_L$ (i.e., it is a general L-structure) [32]. However, we are not aware of any English-language source that gives the proof. For the reader's convenience, we present the proof from [1] in the next two lemmas.
Lemma 6.36. If $A \subseteq N^n$ is symmetric, and π : N → N is any permutation, then $\pi(A) \subseteq N^n$ is also symmetric.
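Proof. Let E be a support for A. We show that π(E) is a support for π(A). Indeed, take any permutation σ : N → N that fixes π(E) pointwise. Then the permutation $\pi^{-1}\sigma\pi$ : N → N fixes E pointwise. So, $(\pi^{-1}\sigma\pi)(A) = A$, and hence σ(π(A)) = π(A).
Lemma 6.38. The Fraenkel model is a model of $\mathrm{Ax}_L$.
Proof. Let M be the prestructure defined above. We show that M satisfies Comprehension. Take any formula $\phi(\bar x, \bar b, \bar B)$ of L, with free variables $\bar x = (x_1, \dots, x_n)$ and parameters $\bar b = (b_1, \dots, b_j)$ and $\bar B = (B_1, \dots, B_k)$ drawn from M. Say that
$$A = \{\bar a \in N^n : M \models \phi(\bar a, \bar b, \bar B)\}.$$
Since the relation parameters $\bar B$ are drawn from M, each set $B_i$ has a finite support $E_i$. Let $E = \{b_1, \dots, b_j\} \cup E_1 \cup \cdots \cup E_k$. We show that E is a support of A. Take any permutation π : N → N that fixes E pointwise, and take any $\bar a = (a_1, \dots, a_n) \in N^n$. We check that $\bar a \in A \iff \pi(\bar a) = (\pi(a_1), \dots, \pi(a_n)) \in A$. Indeed,
$$\bar a \in A \iff M \models \phi(\bar a, \bar b, \bar B) \iff M \models \phi(\pi(\bar a), \pi(\bar b), \pi(\bar B)) \iff M \models \phi(\pi(\bar a), \bar b, \bar B) \iff \pi(\bar a) \in A.$$
(Notation: $\pi(\bar b) = (\pi(b_1), \dots, \pi(b_j))$ and $\pi(\bar B) = (\pi(B_1), \dots, \pi(B_k))$. By Lemma 6.36, each $\pi(B_i)$ is a parameter from M.) The second step works because permuting everything uniformly doesn't change any truth-values relative to any variable-assignment. This is easily proved by induction on formulas. The third step works because π fixes E pointwise, hence fixes all the parameters.
Lemma 6.39. The Fraenkel model is a model of ¬Fin(V).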
Proof.In fact, we will prove something stronger: the Fraenkel model does not contain any linear ordering of the universe.
Consider any relation $R \subseteq N^2$ with finite support E ⊆ N. Suppose for the sake of contradiction that R is a linear ordering of the universe. Since R is total, we may choose distinct a, b ∈ N \ E such that Rab. Let π be any permutation fixing E such that π(a) = b and π(b) = a. Since E is a support of R, it follows that Rba. But this contradicts the assumption that R is antisymmetric.
So, M contains no linear ordering of the universe. It follows that M contains no double well-ordering of the universe, i.e., $M \models \neg\mathrm{Fin}(V)$.
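The support condition of Definition 6.33 and the swapping argument in the proof above can be tried out by brute force on a finite truncation of the universe. The Python sketch below is only an illustration under stated assumptions: the function name `is_support` and the truncated universe `range(5)` are ours, not part of the paper. In a finite universe every relation trivially has a support (for instance the whole universe), so the sketch shows the mechanics of the argument rather than the lemma itself.

```python
from itertools import permutations

def is_support(A, E, universe):
    """Check by brute force whether E is a support of A.

    A is a set of tuples over `universe`; E is a subset of `universe`.
    E supports A if every permutation of `universe` that fixes E
    pointwise maps A onto itself.
    """
    A = set(A)
    movable = [u for u in universe if u not in E]
    for image in permutations(movable):
        pi = {u: u for u in universe}      # start from the identity
        pi.update(zip(movable, image))     # permute the points outside E
        if {tuple(pi[x] for x in t) for t in A} != A:
            return False
    return True

universe = range(5)  # hypothetical finite truncation of N

# A cofinite unary relation: everything except 2. It has support {2}.
cofinite = {(x,) for x in universe if x != 2}

# The usual order on the truncated universe.
order = {(a, b) for a in universe for b in universe if a <= b}

print(is_support(cofinite, {2}, universe))
print(is_support(order, {0, 1}, universe))
```

The first call succeeds because any permutation fixing 2 maps the cofinite set onto itself. The second fails for the reason given in the proof: swapping two points outside the candidate support, such as 2 and 3, sends the pair (2, 3) to (3, 2), which is not in the order.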
We close this section by giving a simple characterization of symmetric sets.

Lemma 6.40. Let E ⊆ N be a finite set. A set $A \subseteq N^n$ is symmetric with support E iff A is definable by Boolean combinations of equalities with parameters from E.
Proof. Define an equivalence relation $\sim_E$ on $N^n$, as follows:
$$\bar a \sim_E \bar b \iff \text{for all } i, j \le n \text{ and } e \in E:\ (a_i = a_j \leftrightarrow b_i = b_j) \text{ and } (a_i = e \leftrightarrow b_i = e).$$
In words: $\bar a \sim_E \bar b$ iff $\bar a$ and $\bar b$ are n-tuples with the same pattern of identity and distinctness which agree on members of E. It is easy to see that $\sim_E$ really is an equivalence relation.
(⟹). Suppose $A \subseteq N^n$ is symmetric with support E. Observe that A is a union of equivalence classes of $\sim_E$. Indeed, if $\bar a \sim_E \bar b$, then there is a permutation π : N → N fixing E such that $\pi(\bar a) = \bar b$. Now, each equivalence class of $\sim_E$ is definable by a Boolean combination of equalities with parameters from E, of the following form:
$$\bigwedge_{i < j \le n} (\neg)(x_i = x_j) \ \wedge \bigwedge_{i \le n,\ e \in E} (\neg)(x_i = e).$$
(The parenthesized negations may or may not be present in each conjunct.) Furthermore, $\sim_E$ has only finitely many equivalence classes, because there are only finitely many possible patterns of identity and distinctness among $x_1, \dots, x_n$ and the members of E. Hence, A is definable by a disjunction of formulas like the one above.
(⟸). Suppose A is definable by a Boolean combination of equalities with parameters from E. We show that A is symmetric with support E.
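Take any permutation π : N → N fixing E pointwise. That is, for all $x_i, x_j \in N$ and e ∈ E, we have $x_i = x_j \leftrightarrow \pi(x_i) = \pi(x_j)$ and $x_i = e \leftrightarrow \pi(x_i) = e$. By induction on formulas, it is easy to see that $N \models \phi(\bar x, \bar e) \leftrightarrow \phi(\pi(\bar x), \bar e)$ for any Boolean combination of equalities $\phi(\bar x, \bar e)$. Since A is defined by some such Boolean combination, it follows that π(A) = A.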
Since π was arbitrary, we conclude that A is symmetric with support E.
§7. The non-conservativeness of w2FA. In this section, we prove Theorem 7.47, which says that w2FA is not conservative over $\mathrm{Ax}_L + \neg\mathrm{Fin}(V)$.
Here is the main idea of the proof. We have seen that $\mathrm{Ax}_L + \neg\mathrm{Fin}(V)$ has a model whose relations are easy to describe in finitary terms (Section 6). Hence, $\mathrm{Ax}_L + \neg\mathrm{Fin}(V)$ is a fairly weak theory; in fact it is mutually interpretable with first-order Peano arithmetic. (To show that $\mathrm{Ax}_L + \neg\mathrm{Fin}(V)$ interprets PA, the trick is to code arithmetical statements as statements about finite concepts.) On the other hand, adding w2FA to $\mathrm{Ax}_L + \neg\mathrm{Fin}(V)$ results in a much stronger theory, one which proves that the numerical sort is Dedekind-infinite and hence interprets second-order arithmetic. Second-order arithmetic is not conservative over Peano arithmetic. By means of a carefully chosen interpretation, this non-conservativeness can be transferred to the theories of interest to us. For example, w2FA + ¬Fin(V) proves the interpretation of a consistency statement for Peano arithmetic, while $\mathrm{Ax}_L + \neg\mathrm{Fin}(V)$ does not.
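The idea of coding arithmetical statements as statements about finite concepts can be made concrete with a small Python sketch, in which finite sets stand in for Stäckel-finite concepts and numbers are identified only up to equinumerosity. The predicates below are stand-ins chosen for illustration (one extra element for successor, disjoint union for addition, Cartesian product for multiplication); their names and definitions are assumptions and should not be read as the official definitions of Succ, Add, and Mult.

```python
from itertools import product

def equinumerous(X, Y):
    """For finite sets, a bijection exists iff the sizes agree."""
    return len(X) == len(Y)

def succ(X, Y):
    """Y codes the successor of X: Y has exactly one more element than X."""
    return len(Y) == len(X) + 1

def add(X, Y, Z):
    """Z codes the sum: Z is equinumerous with a disjoint union of X and Y."""
    # Tagging elements with 0 or 1 keeps the two copies disjoint.
    disjoint_union = {(0, x) for x in X} | {(1, y) for y in Y}
    return equinumerous(Z, disjoint_union)

def mult(X, Y, Z):
    """Z codes the product: Z is equinumerous with the Cartesian product."""
    return equinumerous(Z, set(product(X, Y)))

two = {"a", "b"}
three = {10, 20, 30}
print(add(two, three, {1, 2, 3, 4, 5}))
print(mult(two, three, set(range(6))))
print(succ(two, three))
```

Which particular objects belong to a set is irrelevant here; only the equinumerosity class matters, which is why arithmetic coded this way is well defined only up to ≈.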
Let X ≈ Y abbreviate that there is a bijection between X and Y, in which case we say that X and Y are equinumerous concepts.
If Ryz is a binary relation, let $R_y$ be the concept defined by $R_y z \leftrightarrow Ryz$. (This is terrible notation, but we only use it in the following definition.)
The α-translations of the axioms of Q can be expressed as follows (after eliminating definite descriptions in a convenient way). For any Stäckel-finite concepts X, Y, Z, Y′, Z′, ¬Succ(X, ∅), (We drop the third axiom of Q, since it is redundant in PA.) It is tedious but straightforward to check that all of these claims are provable from $\mathrm{Ax}_L + \neg\mathrm{Fin}(V)$.
The previous step essentially provides us with recursive definitions of Add and Mult. Using these recursive definitions, it is then easy to prove that Add and Mult define total functions (up to ≈). For Add, we must show that for any Stäckel-finite concepts X, Y, Z, W,
Both of these claims are provable by induction on the finite concept Y (Lemma 5.32), using the recursive definition of Add. The proof for Mult is similar.
Lastly, the α-translation of the induction scheme of PA follows from induction on finite concepts (Lemma 5.32 again).
Proof. By Lemma 7.43, we already know that the α-translation is an interpretation of PA in $\mathrm{Ax}_L + \neg\mathrm{Fin}(V)$, and hence in w2FA + $\neg\mathrm{Fin}(V)$. It remains to check that w2FA + $\neg\mathrm{Fin}(V)$ proves the α-translations of the second-order induction and comprehension axioms.
The translation of the second-order induction axiom is equivalent to This is easily proved by induction on finite concepts, generalized to $L^+$-formulas. The generalization is proved in the same way as Lemma 5.32.
The comprehension scheme translates as follows: To prove this in w2FA + $\neg\mathrm{Fin}(V)$, apply comprehension (in $L^+$) to the formula Then use w2FA and the fact that ≈ is a congruence with respect to $\phi^{\alpha}(Y)$.
We will now define a translation from L to $L_{PA}$, inspired by the Fraenkel model, and show that it is an interpretation of $\mathrm{Ax}_L + \neg\mathrm{Fin}(V)$ in PA.
Fix primitive recursive encodings of finite sets and sequences as natural numbers. For finite sequences, this amounts to specifying the following functions in $L_{PA}$: (i) for each $n \in \mathbb{N}$, a primitive recursive function $\langle x_1, \dots, x_n \rangle$, which codes this tuple as a single number, (ii) primitive recursive functions length(s) and $(s)_i$, which return the length and the i-th element of the finite sequence coded by s.
We identify finite sets and sequences with their codes. We use the letter E for finite sets, and the letter s for finite sequences. Fix a primitive recursive Gödel numbering of $L_{PA}$-formulas. We identify formulas with their Gödel numbers. For each formula ϕ, let $\ulcorner \phi \urcorner$ be a formal numeral that denotes (the Gödel number of) ϕ.
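The paper only requires that some primitive recursive sequence coding be fixed. As an illustration, here is a minimal sketch of one possible coding, based on the Cantor pairing function. The choice of pairing function, the 1-based indexing, and the names `pair`, `unpair`, `encode`, `length`, and `elem` are assumptions made for this example and are not taken from the paper.

```python
from math import isqrt


def pair(a: int, b: int) -> int:
    """Cantor pairing: a bijection from pairs of naturals to naturals."""
    return (a + b) * (a + b + 1) // 2 + a


def unpair(c: int) -> tuple[int, int]:
    """Inverse of pair."""
    w = (isqrt(8 * c + 1) - 1) // 2   # w = a + b
    a = c - w * (w + 1) // 2
    return a, w - a


def encode(seq) -> int:
    """Code a finite sequence as pair(length, body); body is nested pairs."""
    body = 0
    for x in reversed(seq):
        body = pair(x, body)
    return pair(len(seq), body)


def length(s: int) -> int:
    """Length of the sequence coded by s."""
    return unpair(s)[0]


def elem(s: int, i: int) -> int:
    """The i-th element (1-indexed, an assumption) of the sequence coded by s."""
    n, body = unpair(s)
    if not 1 <= i <= n:
        raise IndexError("index out of range")
    for _ in range(i - 1):
        _, body = unpair(body)   # skip the first i-1 elements
    return unpair(body)[0]
```

Every function here is computed by bounded search and bounded iteration, which is why codings of this kind are primitive recursive; any other standard coding (for instance via prime powers) would serve equally well.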
Next, we describe $L_{PA}$-formulas BoolEq, BoolSat, $\mathrm{pad}_n$ representing certain primitive recursive relations and functions.
Let BoolEq(x, y, E) just in case: x is a Boolean combination of $L_{PA}$-equalities with exactly y free variables and with constant symbols drawn from $\{S^e 0 : e \in E\}$.
Let BoolSat(x, s) just in case: x is a Boolean combination of $L_{PA}$-equalities that is satisfied when the i-th variable of $L_{PA}$ is assigned the value $(s)_i$, for all $i \leq \mathrm{length}(s)$. This is primitive recursive, because truth and satisfaction for bounded ($\Sigma_0$) formulas are primitive recursive notions.
For each $n \in \mathbb{N}$, let $\mathrm{pad}_n(x_1, \dots, x_n, y_1, \dots, y_n) = s$ just in case: s is the shortest finite sequence whose $x_i$-th element is $y_i$ (for all $1 \leq i \leq n$) and whose other elements are all zero.
Definition 7.45. Define the translation from L to $L_{PA}$ as follows.
Let the variables of $L_{PA}$ and the object variables of L be enumerated by $v_1, v_2, v_3, \dots$. Translate each object variable $v_i$ of L by the even-numbered variable $v_{2i}$. Translate each relation variable X of L by a distinct odd-numbered variable $v_X \in \{v_1, v_3, v_5, \dots\}$. In the last clause, E is a fresh variable and n is the arity of X.
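The padding function $\mathrm{pad}_n$ introduced before Definition 7.45 can be pictured concretely. The sketch below works with ordinary Python lists rather than coded sequences, and it assumes 1-based positions; the name `pad` and the error handling are choices made for this example only.

```python
def pad(xs, ys):
    """Shortest sequence whose xs[k]-th entry (1-indexed) is ys[k], zero elsewhere."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    assignment = {}
    for x, y in zip(xs, ys):
        if x < 1:
            raise ValueError("positions are 1-indexed")
        if assignment.setdefault(x, y) != y:
            raise ValueError("conflicting values for the same position")
    # The sequence must reach the largest named position and no further.
    s = [0] * max(assignment, default=0)
    for x, y in assignment.items():
        s[x - 1] = y
    return s


print(pad([2, 5], [7, 9]))  # -> [0, 7, 0, 0, 9]
```

Read this way, $\mathrm{pad}_n$ turns the values of n named variables into a full variable assignment of the kind that BoolSat expects, with every unnamed variable sent to zero.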
Proof. It is easy to check that the translation of any non-comprehension axiom is a theorem of first-order logic, and hence is provable in PA. 12 It remains to show that PA proves the translation of each comprehension axiom, and also that PA proves the translation of $\neg\mathrm{Fin}(V)$.
The idea is to formalize the proofs of Lemmas 6.38, 6.39, and 6.40 in PA. The main obstacle is that we defined symmetric sets $A \subseteq \mathbb{N}^n$ in terms of arbitrary permutations of $\mathbb{N}$, and it is not obvious how to formalize those in PA. But in fact we do not need arbitrary permutations. Say that a permutation of $\mathbb{N}$ is essentially finite if it fixes all but finitely many $a \in \mathbb{N}$. If we go through Section 6, replacing 'permutation' with 'essentially finite permutation' everywhere, we get exactly the same model, and all the proofs still work.
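The reason essentially finite permutations are easy to handle in arithmetic is that each one is determined by a finite amount of data. The following sketch illustrates this by representing such a permutation as a finite dictionary of moved points. The representation, the function names, and the sample set `member` are assumptions made for this example; in particular, `preserves` only tests a finite window of tuples and is not a proof of symmetry.

```python
from itertools import product


def apply(perm, a):
    """Apply a permutation given by its finitely many moved points."""
    return perm.get(a, a)


def is_essentially_finite_permutation(perm):
    """A finite dict is a bijection of N (identity elsewhere) iff keys == values."""
    return set(perm.keys()) == set(perm.values())


def fixes_pointwise(perm, support):
    """Does perm fix every element of the finite set `support`?"""
    return all(apply(perm, e) == e for e in support)


def preserves(perm, member, arity, window):
    """Check, on tuples drawn from range(window), that perm maps the set to itself."""
    for t in product(range(window), repeat=arity):
        image = tuple(apply(perm, a) for a in t)
        if member(t) != member(image):
            return False
    return True


def member(t):
    """Sample binary relation: a Boolean combination of equalities with constant 3."""
    return t[0] == 3 or t[0] == t[1]


swap_1_2 = {1: 2, 2: 1}   # fixes 3, so it should preserve the sample relation
swap_3_4 = {3: 4, 4: 3}   # moves 3, so it need not preserve it
```

On this sample, the transposition of 1 and 2 fixes the candidate support {3} and preserves the relation, while the transposition of 3 and 4 does not, in line with the idea that a relation defined by equalities with parameters from E is symmetric with support E.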
We formalize Lemma 6.40 as follows. Say that an $L_{PA}$-formula $\phi(\bar{x})$ is symmetric with support E just in case, for every essentially finite permutation, Then we prove a theorem scheme in PA which says: 'An $L_{PA}$-formula is symmetric iff there is a Boolean combination of equalities coextensive with it.' More precisely, let $\phi(v_{i_1}, \dots, v_{i_n})$ be any $L_{PA}$-formula with exactly the free variables displayed. Then PA proves the following: $\phi(v_{i_1}, \dots, v_{i_n})$ is symmetric with support E iff there exists y such that $\mathrm{BoolEq}(y, S^n 0, E) \wedge \forall \bar{x}\,(\mathrm{BoolSat}(y, \mathrm{pad}_n(S^{i_1} 0, \dots, S^{i_n} 0, \bar{x})) \leftrightarrow \phi(\bar{x}))$.
For the (⇐) direction, copy the rest of the proof of Lemma 6.40.
Next, we formalize Lemma 6.38. We replace $M \models \phi$ ('M satisfies ϕ') with the translation of ϕ throughout. For each L-formula $\phi(\bar{x}, \bar{y}, \bar{Y})$ not containing X free, we wish to show that PA proves This basically says: 'There is a Boolean combination of equalities coextensive with the translation of $\phi(\bar{x}, \bar{y}, \bar{Y})$.' By the formalized version of Lemma 6.40, it suffices to prove in PA that the translation of $\phi(\bar{x}, \bar{y}, \bar{Y})$ is a symmetric $L_{PA}$-formula. To do this, use induction on L-formulas $\phi(\bar{x}, \bar{X})$ to prove the following theorem scheme in PA: (This corresponds to our earlier observation that permuting everything uniformly doesn't change any truth-values in M relative to any variable-assignment.) Then copy the rest of the proof of Lemma 6.38.
In the same way, it is easy to formalize Lemma 6.39 in PA.
We are now ready to prove the first main theorem of the paper.
For proof, see Lemma 1.3.
Corollary 7.49. 2FA is not conservative over $\mathrm{Ax}_L$.
§8. w2FA is conservative over stronger base theories. It is surprising that w2FA is not conservative over $\mathrm{Ax}_L$. However, the next two theorems establish some limits to the non-conservativeness of w2FA.
Theorem 8.50. w2FA is conservative over third-order logic.
Proof. Let $L_3$ be the third-order analog of the base language L. Let $\mathrm{Ax}_{L_3}$ denote the axioms of the deductive system for $L_3$, including full third-order comprehension in the base sort. Note that w2FA still only includes second-order comprehension for the numerical sort.
Take any $L_3$-formula ϕ, and suppose that w2FA + $\mathrm{Ax}_{L_3} \vdash \phi$. We show that $\mathrm{Ax}_{L_3} \vdash \phi$. Our strategy is to define an interpretation of w2FA in $\mathrm{Ax}_{L_3}$ that leaves $L_3$-sentences fixed (up to renaming of bound variables). Under such an interpretation, any derivation of ϕ from w2FA + $\mathrm{Ax}_{L_3}$ is transformed into a derivation of ϕ from $\mathrm{Ax}_{L_3}$. The idea is to interpret each cardinality #X as the concept X from which it came, with numerical-sort equality being interpreted as equinumerosity.
Hájek and Pudlák generally assume that equality is interpreted as equality (p. 149, II.1.5(2)). However, it is easy to adapt the proof of III.4.7-8 so as to dispense with this assumption. See also [20, p. 76, theorem 1] for more details.
Set up the pre-translation so that distinct variables of $L_3 \cup L^+$ are translated as distinct variables of $L_3$. For example, let the base-sort concept variables be enumerated by $X_0, X_1, X_2, \dots$, and the numerical-sort object variables by $v_0, v_1, v_2, \dots$. Then let the pre-translations be Similarly for other sorts.
We now define the translation * : In the first and last lines, let $\sigma = \langle \sigma_1, \dots, \sigma_k \rangle$ be any second- or third-order sort. In the last line, $\mathrm{Cong}_{\approx}((X^{\sigma})^*)$ is a metalinguistic abbreviation of the statement: '≈ is a congruence for the relevant argument-places of $(X^{\sigma})^*$', where the sort σ determines which argument-places are relevant.
It is easy to check that the *-translation of each axiom of w2FA is provable from $\mathrm{Ax}_{L_3}$. So, the translation works.
To prove the next theorem, we need another little fact about conservativeness.
Lemma 8.51. Let T be a theory in a formal language L, and let A be any L-sentence. Suppose that a sentence Δ is conservative over T + A and is also conservative over T + ¬A. Then Δ is conservative over T.
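Proof. Take any ϕ ∈ L, and suppose that $T + \Delta \vdash \phi$. We show that $T \vdash \phi$. Indeed, $T + A + \Delta \vdash \phi$, so $T + A \vdash \phi$ by the conservativeness of Δ over T + A, and hence $T \vdash A \to \phi$ by the deduction theorem. By the same reasoning, we also have $T \vdash \neg A \to \phi$. Hence, $T \vdash \phi$.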
Theorem 8.52. w2FA is conservative over $\mathrm{Ax}_L + \mathrm{Fin}(V)$.
Now it is easy to check that the †-translation of each axiom of w2FA is provable from $\mathrm{Ax}_L + \mathrm{Fin}(V)$ + |V| = 1. So, the interpretation works.
Proof. Observe that $\mathrm{Ax}_L$ + |V| = 1 is a categorical theory, and hence it is a complete theory. So, the only way that w2FA could be non-conservative over $\mathrm{Ax}_L$ + |V| = 1 is if the combined theory w2FA + |V| = 1 were inconsistent. But w2FA + |V| = 1 is consistent: it has a model M with object domains $M_0 = \{a\}$ and $M_n = \{0, 1\}$ and with I(#) being the function mapping each base-sort concept to its cardinality.
§9. The non-conservativeness of 2FA. In the previous section, we established some limits to the non-conservativeness of w2FA. In this section, we will show that 2FA is more deeply non-conservative than w2FA. The main result is Theorem 9.67, which says that 2FA is non-conservative over $\mathrm{Ax}_L + \mathrm{Fin}(V)$. Our proof of this result can be generalized to show that 2FA is non-conservative over pure axiomatic n-th order logic for any n ≥ 2, or even over simple type theory.
Roughly, the idea is to construct a Gödel sentence for $\mathrm{Ax}_L + \mathrm{Fin}(V)$. By a variation on Gödel's first incompleteness theorem, $\mathrm{Ax}_L + \mathrm{Fin}(V)$ does not prove its own Gödel sentence. On the other hand, 2FA + Fin(V) does prove the Gödel sentence, because it is a powerful theory: it interprets second-order arithmetic in the new sort (and it is smart enough to relate that arithmetic to the Gödel sentence expressed in L).
But $\mathrm{Ax}_L + \mathrm{Fin}(V)$ says that the universe is finite, so it cannot interpret Q. How, then, is it possible to pull off the Gödel argument? The trick is that $\mathrm{Ax}_L + \mathrm{Fin}(V)$ has arbitrarily large models. If $\mathrm{Ax}_L + \mathrm{Fin}(V)$ proved its own Gödel sentence, then any sufficiently large model would contain a witness to the paradoxical derivation, yielding a contradiction.
To implement this argument, it will be convenient to work with a definitional extension T = Ax L∪L + Fin(V) + Δ, which we now describe. We identify variables of L with object variables of L. Thus,
• 0 is a base object constant,
• S and ≤ are constants of sort $\langle 0, 0 \rangle$,
• A and M are constants of sort $\langle 0, 0, 0 \rangle$.
Let Ax L∪L be the axioms of the deductive system for L ∪ L .
Definition 9.56. Let Δ be the conjunction of the following (L ∪ L )-formulas:
1. (V, ≤) is a double well-ordering with least element 0,
2. Sxy iff y is the upper neighbor of x with respect to ≤,
3. Definitions of A and M:
Proof. It is obvious that T proves the universal closures of the first three axioms of BA . Furthermore, since (V, ≤) is a well-ordering, we have induction for all (L ∪ L )-formulas. Using induction, it is easy to prove the universal closures of the remaining axioms of BA .
We will now describe the construction of the Gödel sentence of T. Fix a Gödel numbering of L ∪ L . We describe $L_{PA}$-formulas $\mathrm{Der}_T$, diag representing certain primitive recursive notions.
Let $\mathrm{Der}_T(x, y)$ just in case: x is the Gödel number of a T-derivation of a formula with Gödel number y.
Let diag(x) = y be a function with the following property: if n is the Gödel number of an (L ∪ L )-formula ψ(y) with exactly the free variable y, then $\mathrm{diag}(S^n 0) = \mathrm{diag}(\ulcorner \psi(y) \urcorner) = \ulcorner \forall y\,(y \doteq n \to \psi(y)) \urcorner$.
(The notation $y \doteq n$ is from Definition 4.21.) Note that diag is modeled on the Gödel diagonal function: in essence, it substitutes into a formula its own Gödel number.
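The behaviour of diag can be illustrated with a toy Gödel numbering. The sketch below is only a simplified analogue: it numbers formulas by reading their UTF-8 bytes as an integer, and it writes the parameter n as a decimal numeral rather than via the notation of Definition 4.21. The names `godel_number`, `decode`, and `diag`, and the numbering itself, are assumptions of this example.

```python
def godel_number(formula: str) -> int:
    """Toy Goedel numbering (an assumption): UTF-8 bytes read as one integer."""
    return int.from_bytes(formula.encode("utf-8"), "big")


def decode(n: int) -> str:
    """Inverse of godel_number for nonempty strings not starting with a NUL byte."""
    return n.to_bytes((n.bit_length() + 7) // 8, "big").decode("utf-8")


def diag(n: int) -> int:
    """Map the code n of a formula psi(y) to the code of 'for all y (y = n -> psi(y))'."""
    psi = decode(n)
    return godel_number(f"∀y(y ≐ {n} → {psi})")


psi = "∀x¬ϕ(x, y)"
p = godel_number(psi)
sentence = decode(diag(p))   # a sentence that mentions the code p of psi itself
```

The resulting string has the shape of the self-referential sentence built in the text: it asserts ψ of the very number that codes ψ.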
It is well known that recursive relations are $\Delta_1$-definable in PA [13, p. 18, theorem 0.45]. So, we may choose $\mathrm{Der}_T$ and diag so that $\mathrm{Der}_T(x, \mathrm{diag}(y))$ is a $\Sigma_1$ formula. By Lemma 4.23, there is an equivalent $\Sigma_1$ formula ϕ(x, y) of L such that, for any parameters $a, b \in \mathbb{N}$, $\mathbb{N} \models \phi(a, b) \iff \mathbb{N} \models \mathrm{Der}_T(S^a 0, \mathrm{diag}(S^b 0))$.
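Let p be the Gödel number of ∀x¬ϕ(x, y). Then $\mathrm{diag}(S^p 0) = \mathrm{diag}(\ulcorner \forall x\,\neg\phi(x, y) \urcorner) = \ulcorner G \urcorner$, where G is the following sentence: $G := \forall y\,(y \doteq p \to \forall x\,\neg\phi(x, y))$. We say that G is the Gödel sentence of the theory T.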
Proof. Suppose for the sake of contradiction that $T \vdash G$. Let d be the Gödel number of a derivation of G. Then we have $\mathbb{N} \models \mathrm{Der}_T(S^d 0, \mathrm{diag}(S^p 0))$, and hence $\mathbb{N} \models \phi(d, p)$.
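Write ϕ(x, y) as ∃z θ(x, y, z), where θ is bounded. Fix $r \in \mathbb{N}$ such that $\mathbb{N} \models \theta(d, p, r)$. By Lemma 4.25 and the Generalization Theorem, BA $\vdash x \doteq d \wedge y \doteq p \wedge z \doteq r \to \theta(x, y, z)$. By Lemma 9.58, $T \vdash x \doteq d \wedge y \doteq p \wedge z \doteq r \to \theta(x, y, z)$. We assumed that $T \vdash G$. Hence, $T \vdash \neg\exists x \exists y \exists z\,(x \doteq d \wedge y \doteq p \wedge z \doteq r)$. But T has arbitrarily large finite models. In particular, $\mathbb{N}_{\max\{d, p, r\}}$ is a model of T that satisfies $\exists x \exists y \exists z\,(x \doteq d \wedge y \doteq p \wedge z \doteq r)$. This is a contradiction.
Let us now turn our attention to what is provable in the stronger theory 2FA + Fin(V).
Lemma 9.60. 2FA interprets Z 2 , and hence Z 2 .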
The proof is an easy variation on Frege's Theorem. It will be convenient to fix a particular interpretation of Z 2 and Z 2 in the numerical sort of 2FA.
Definition 9.61. Fix a translation from L 2 to $L^+$ which interprets Z 2 in the numerical sort of 2FA. The interpretants of the nonlogical vocabulary items of L 2 will be denoted by 0, S, ≤, A, M. The universe of the interpretation is defined by the following formula N(x): Object quantifiers are relativized to N(x). Set quantifiers are relativized to ∀x(Xx → N(x)).
The interpretation of Z 2 in 2FA is obtained by extending this translation so as to interpret Z 2 + D, where D consists of the definitions of S, +, · in terms of S, A, M (Definition 4.21).
The next two lemmas show that 2FA + Fin(V) is smart enough to relate the arithmetic in its base sort (BA ) with the arithmetic in its numerical sort (Z 2 ).
We can rule out the latter two options, because they imply that the converse of (N, ≤) is a well-ordering, which it isn't. Hence, $(V, \leq) <_o (N, \leq)$. This is what we wanted.
For the next definition, fix 0, S, ≤, A, M, and suppose Δ. Also fix a as in the statement of Lemma 9.62.
Let ϕ(x, y), θ(x, y, z), and G be the L -formulas from Lemma 9.59. Let p be a term in the numerical sort of $L^+$ that denotes the Gödel number of ∀x¬ϕ(x, y). In other words, $p = \ulcorner \forall x\,\neg\phi(x, y) \urcorner$.
Let G be the following formula in the numerical sort of $L^+$: G := ∀x¬ϕ (x, p).
Observe that 2FA ⊢ G ↔ G . (This is because we chose the interpretations of Z
We are finally ready to prove the second main theorem of the paper.
Theorem 9.67. 2FA is not conservative over $\mathrm{Ax}_L + \mathrm{Fin}(V)$.
Proof. We establish the following witness to non-conservativeness: 2FA + Fin(V) ⊢ ∀(0, S, ≤, A, M
If L is a second- or higher-order language, then ⊨ denotes the consequence relation with respect to standard (full) semantics.
There are two differences between Field-conservativeness and standard deductive conservativeness.Firstly, Field-conservativeness involves relativizing some of the quantifiers to 'non-abstracts'.Secondly, Field-conservativeness is formulated semantically rather than deductively.
Hale and Wright's suggestion, then, is that abstraction principles need only be Field-conservative in order to be acceptable. Much of the neo-Fregean literature has followed Hale and Wright on this point, if only because there seemed to be no other way for the neo-Fregean project to get off the ground. 16
Following [33, pp. 21-22], we may distinguish some notions closely related to Field-conservativeness. See [6, 33] for motivation and further discussion.
Definition 10.71. Let L, $L^+$, T, Δ be as in Definition 10.70. Assume that deductive systems for L and $L^+$ have been specified. Let P (for 'previously recognized ontology') be a new unary predicate symbol. Then:
1. Δ is deductively Field-conservative over T iff for every L-formula ϕ, $T^{\neg\exists F(x = @F)} + \Delta \vdash \phi^{\neg\exists F(x = @F)} \implies T \vdash \phi$.
Weir [33, p. 24, theorem 4.1] proved that HP is both Field-conservative and Caesar-neutral conservative over pure second-order logic. It has remained an open question whether HP satisfies the deductive analogue of either of these conditions. Our results imply that it does not. 17
Theorem 10.72. HP is not deductively Caesar-neutral conservative over pure axiomatic second-order logic.
Proof. We proved that 2FA is not deductively conservative over pure axiomatic second-order logic $\mathrm{Ax}_L$ (Corollary 7.49). Let ψ be an L-sentence such that 2FA ⊢ ψ but $\mathrm{Ax}_L \nvdash \psi$. Let P be a new unary predicate symbol. It suffices to show that $\mathrm{Ax}_{L[\{\#, P\}]} + \mathrm{HP} \vdash \psi^P$.

Definition 4.21. Let D be the conjunction of the following three ($L_{PA}$ ∪ L )-formulas:

Lemmas 4.22 and 4.23 tell us that the theories I Σ n and I Σ n are in a strong sense equivalent.
Lemma 4.22. Let n ≥ 0. Then I Σ n + D ⊢ I Σ n , and conversely I Σ n + D ⊢ I Σ n .
Lemma 4.23. Let ϕ be a Σ n formula with n ≥ 1. Then there is a Σ n formula ϕ with the same free variables as ϕ such that I Σ n + D ⊢ ϕ ↔ ϕ . For proof, see [13, pp. 88-89]. 10

Definition 4.26. The language L 2 is just like L 2 , but with the vocabulary of L replacing the vocabulary of $L_{PA}$.
Definition 4.27. Let Z 2 be the theory in L 2 whose axioms are those of I Σ 0 , plus the second-order induction axiom X0 ∧ ∀x∀y(Xx ∧ Sxy → Xy) → ∀xXx and the second-order comprehension scheme for L 2 .
Lemma 4.28. Z 2 and Z 2 are mutually interpretable. Indeed, Z 2 + D ⊢ Z 2 , and conversely Z 2 + D ⊢ Z 2 .

Corollary 6.37. Each relation domain $M_n$ of the Fraenkel model is closed under the action (on $\mathbb{N}^n$) of permutations of $\mathbb{N}$.
Lemma 6.38. The Fraenkel model is a model of $\mathrm{Ax}_L$.

2 and Z 2 to be compatible with one another. See Definition 9.61.) Intuitively, G says: 'The Gödel sentence for T is not derivable in T.' In other words, G formalizes the statement of Lemma 9.59.
It is well known that Z 2 formalizes Tarskian definitions of truth and satisfaction for $L_{PA}$ [31, pp. 183-187]. In the same way, 2FA formalizes Tarskian definitions of truth and satisfaction for L with respect to the standard model $\mathbb{N}$. Denote the truth predicate by $\mathrm{Tr}_{\mathbb{N}}(x)$ and the satisfaction predicate by $\mathrm{Sat}_{\mathbb{N}}(x, y)$.
Lemma 9.65. Let ψ be an L -formula whose free variables are among the first k free variables of L . Then 2FA proves $\forall x_1 \cdots \forall x_k\,(\mathrm{Sat}_{\mathbb{N}}(\ulcorner \psi \urcorner, \langle x_1, \dots, x_k \rangle) \leftrightarrow \psi(x_1, \dots, x_k))$. For proof, compare [31, pp. 186-187, proposition 18.12].
Lemma 9.66. 2FA ⊢ G.
Proof (sketch). The idea is to formalize the proof of Lemma 9.59 in 2FA. We reason in 2FA. Suppose ¬ G. Then there exists d such that ϕ (d, p). By Lemma 9.65, we have $\mathrm{Sat}_{\mathbb{N}}(\ulcorner \phi \urcorner, \langle d, p \rangle)$. Write ϕ(x, y) = ∃z θ(x, y, z). Unpacking the definition of $\mathrm{Sat}_{\mathbb{N}}$, there exists r such that $\mathrm{Sat}_{\mathbb{N}}(\ulcorner \theta \urcorner, \langle d, p, r \rangle)$. Formalize Lemma 4.25 to obtain $\exists x\,\mathrm{Der}_{BA}(x, \ulcorner x \doteq d \wedge y \doteq p \wedge z \doteq r \to \theta(x, y, z) \urcorner)$, and so on, until we reach $\exists x\,\mathrm{Der}_T(x, \ulcorner \neg\exists x \exists y \exists z\,(x \doteq d \wedge y \doteq p \wedge z \doteq r) \urcorner)$. Let m = max{d, p, r}. Argue that $\mathrm{Der}_T$ is sound with respect to the semantics $\mathrm{Tr}_{\mathbb{N}_m}$, in the sense that $\forall y\,(\exists x\,\mathrm{Der}_T(x, y) \to \mathrm{Tr}_{\mathbb{N}_m}(y))$. Finally, check that $\neg\mathrm{Tr}_{\mathbb{N}_m}(\ulcorner \neg\exists x \exists y \exists z\,(x \doteq d \wedge y \doteq p \wedge z \doteq r) \urcorner)$. Contradiction.

Table 1. Deductive system for L J .