TOWARDS THE INEVITABILITY OF NON-CLASSICAL PROBABILITY

Abstract. This paper generalises an argument for probabilism due to Lindley [9]. I extend the argument to a number of non-classical logical settings whose truth-values, seen here as ideal aims for belief, lie in the set [0, 1], and where logical consequence ⊨ is given the "no-drop" characterization. First I show that, in each of these settings, an agent's credence can avoid accuracy-domination only if its canonical transform is a (possibly non-classical) probability function. In other words, if an agent values accuracy as the fundamental epistemic virtue, it is a necessary requirement of rationality that her credence have some probabilistic structure. Then I show that, for a certain class of reasonable measures of inaccuracy, having such a probabilistic structure is sufficient to avoid accuracy-domination in these non-classical settings.


Overview
It is a common assumption in formal epistemology that an agent's beliefs can be represented by (or even identified with) a credence function cr, which assigns to each proposition A a number cr(A) ∈ R, interpreted as the agent's degree of belief in that proposition. On this foundation is built the position known as probabilism: the claim that, in order to be rational, an agent's credence function must be a probability distribution. In other words, supporters of probabilism see the probability axioms as epistemic norms that all rational agents should respect.
One way to argue for probabilism is to show that probabilistic credences are in some way more epistemically valuable than non-probabilistic ones. Many arguments to this effect assume that the fundamental value of credences is their accuracy, which intuitively reflects how closely they align with the truth. The concept of accuracy is made precise by introducing inaccuracy measures, functions I(cr, w) which assign a penalty to the credence cr for each state of the world w.1 These measures are then used to show that a rational agent who values accuracy ought to have probabilistic credences.

1 It remains up for debate to what extent human agents can hope to be rational in this sense. Here I avoid the problem by discussing probabilism as a theory of purely ideal rationality. For an in-depth treatment of this issue, see Staffel (2020).
Clearly, a great deal of an accuracy argument's strength rests on what we take to be a reasonable accuracy measure. This will be a main theme in this essay, which aims to generalize an accuracy argument due to Lindley. Lindley's argument makes remarkably weak assumptions about what should count as a reasonable accuracy measure; because of this, it leads to a weaker set of rationality norms than probabilism proper, one that has a number of interesting philosophical ramifications.
The generalization I pursue involves the logical setting of the argument. Like most accuracy arguments in the literature, Lindley's assumes that sentences are either true or false, and that logical consequence is defined in the classical way.
Although Williams (2012b) has generalized an argument due to Joyce (1998) to a broad class of non-classical settings, this generalization relies on Joyce's specific assumptions about what counts as an accuracy measure, and thus cannot be applied to Lindley's argument. I will proceed in a fundamentally different way to extend Lindley's argument to some of the non-classical settings discussed by Williams.
I will start in Section 2 by introducing probabilism as an epistemological position, and accuracy arguments as a way to justify it. Section 3 is an overview of Lindley's accuracy argument, with particular attention devoted to spelling out its assumptions and philosophical consequences. The remainder of the paper contains my generalisation of Lindley's argument. Section 4 prepares the ground by making precise the non-classical settings I will be working with, and the non-classical probability axioms I will be justifying as rational norms for credences. In Section 5 I prove the main result: rational agents are required to have credences whose transforms obey the non-classical probability axioms, if they want to avoid accuracy-domination. Section 6 discusses the problem of a converse result, and shows that, for a class of reasonable inaccuracy measures, having a probabilistic transform is sufficient to avoid accuracy-domination. Some open problems are briefly outlined in Section 7, and Section 8 concludes the essay with a summary of its main results.

Probabilism and accuracy
Let's start by introducing some notation. We consider an agent who has beliefs towards a set of sentences F in a finite propositional language L, which includes the standard connectives ∧, ∨, ¬. We assume F to be closed under ∧, ∨, ¬. For the moment, we restrict ourselves to the classical case, and assume that each sentence in L must be either true or false (this assumption will be abandoned in later sections). We denote this by taking S = {true, false} as our set of truth-statuses. We will consider a finite set W of functions w : F → S satisfying the classical truth conditions (e.g. w(A) = true iff w(¬A) = false). Each w ∈ W is a classically possible world. The agent's beliefs are modeled by credence functions cr : F → R, with cr(A) being interpreted as the agent's degree of belief in A. We denote by Cred the set of all credence functions defined over F.
A popular way to argue for probabilism starts from the idea that the fundamental epistemic virtue of a credence is its accuracy. Accuracy is taken to be a gradational concept: one is more accurate at a world w the higher one's degree of belief in the sentences that are true at w, and the lower one's degree of belief in the sentences that are false at w. It is normally assumed that credence 0 represents complete lack of belief, and that credence 1 represents the maximum degree of belief the agent can have; thus the function •^w defined by:

•^w(A) = 1 if w(A) = true, and •^w(A) = 0 if w(A) = false (1)

will be the most accurate credence at world w.2 We can think of •^w as providing an aim for belief at world w, in the sense that it is the credence of an ideal agent who knows the truth or falsity of every proposition, and thus will assign maximum belief to all truths, and minimum belief to all falsities (Williams, 2012b). I refer to the value A^w := •^w(A) as the truth-value of A at world w.
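The ideal credence •^w is easy to render concretely. The following sketch is my own illustration (the sentence names 'A' and 'notA' are hypothetical), computing the aim for belief at a toy world:

```python
# Ideal credence at a world w: assigns 1 to sentences true at w, 0 to the rest.
def ideal_credence(world, sentence):
    """world maps sentence names to 'true'/'false'; returns A^w in {0, 1}."""
    return 1.0 if world[sentence] == 'true' else 0.0

w = {'A': 'true', 'notA': 'false'}
assert ideal_credence(w, 'A') == 1.0      # maximum belief in truths
assert ideal_credence(w, 'notA') == 0.0   # minimum belief in falsities
```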
Most accuracy arguments for probabilism follow the same three-step structure.
1. A function (or class of functions) I : Cred × W → R is defined, such that I(cr, w) is a reasonable measure of how accurate the credence function cr is at world w (i.e. how close it is to the ideal credence •^w). Whether I assigns higher values to more accurate credences and lower values to less accurate ones, or vice versa, is just a matter of convention. Throughout this paper we will assume that the higher the value I(cr, w), the more inaccurate the credence cr is at world w. This way, the function I will act as a kind of distance between the agent's beliefs and the ideal credence. I will write I_G to denote the inaccuracy of a credence cr on a subset G of F.
2. As a second step, the accuracy measure I is used to define one or more rationality requirements. For example, we may think that an agent with credence function cr is not rational if there is another credence function cr′ that is more accurate than cr in every possible world. This is known as the Non-Dominance requirement.
3. Finally, a theorem proves that in order for an agent's credence to be rational according to the specified measure and requirement, it must be probabilistic.
A classical example of an accuracy argument is due to Joyce (1998), who takes advantage of an earlier theorem proven by De Finetti (1974). Following the above schema, Joyce begins by laying down some conditions to establish what functions can be considered appropriate measures of accuracy. Then, the Non-Dominance criterion is introduced as a way of discriminating between rational and irrational credences. Finally, Joyce proves the following result:

Accuracy Theorem: When evaluating credences with an acceptable accuracy measure I, every non-probabilistic credence cr is accuracy-dominated by a probabilistic cr′, meaning that cr′ is more accurate than cr in each world. More formally: I(cr′, w) < I(cr, w) for every possible world w ∈ W.

2 As explained in greater detail below, the choice of values 0 and 1 is arbitrary, and any other pair of values could be used to the same effect.
Putting the pieces together, we deduce that the credence function cr of a rational agent must be probabilistic, for otherwise there would be some other credence function cr ′ which is more accurate than cr no matter what, and this we regard as a failure of rationality on the part of the agent.
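To make accuracy-domination concrete, here is a small numerical sketch (mine, not Joyce's), using the Brier score as the inaccuracy measure: a credence assigning 0.7 to both A and ¬A is beaten at every world by the probabilistic credence assigning 0.5 to each.

```python
# Brier inaccuracy: sum of squared distances from the ideal credence at w.
def brier(cr, world):
    return sum((cr[s] - world[s]) ** 2 for s in cr)

# Two possible worlds over {A, ¬A}, with truth-values written as 0/1.
worlds = [{'A': 1, 'notA': 0}, {'A': 0, 'notA': 1}]
cr  = {'A': 0.7, 'notA': 0.7}   # non-probabilistic: credences sum to 1.4
crp = {'A': 0.5, 'notA': 0.5}   # a probabilistic rival

# crp is strictly more accurate than cr at every world, so cr is dominated.
assert all(brier(crp, w) < brier(cr, w) for w in worlds)
```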
Most of the criticism of Joyce's argument is directed towards its assumptions, rather than towards the theorem that contains the argument's deductive step.
In particular, the following have been questioned: (i) The assumption that the rationality of an agent's beliefs depends on how accurate her credence is with regards to the actual world.
(ii) The assumption that a rational credence function should not be accuracy-dominated (Non-Dominance).
(iii) Assumptions on what counts as an acceptable accuracy measure.
Critics of (i) argue that accuracy is not the only criterion required to define rationality. Other properties of an agent's beliefs, such as the degree to which they are supported by evidence, or their behavioural implications, might be just as (if not more) important for the agent's epistemic profile. Furthermore, these other virtues might trade off with accuracy, so that pursuing one requires giving up the other (Carr, 2017). A serious answer to this objection goes beyond the scope of this essay, and has already been discussed at length by others.3 However, I hope that even a sceptical reader will agree that there are at least some contexts in which credal accuracy is the most important epistemic desideratum: think, for example, of a meteorologist making weather predictions, or a computer program making economic forecasts. The sceptic can read the present discussion as limited in scope to those contexts. After settling on (i), (ii) is fairly uncontroversial: if all we care about is accuracy, and cr′ is more accurate than cr no matter what, then there's no reason why we should hold the latter instead of the former. On the other hand, Joyce's assumptions on (iii), concerning what counts as an acceptable accuracy measure, are not trivial, and they have sparked considerable debate (Maher, 2002).4

3 The standard references are Pettigrew (2011, 2016), in which it is argued that accuracy is the fundamental epistemic virtue on the basis that all others can be derived from it. This position is known as veritism.

Given the above discussion, it is natural to ask ourselves whether it's possible to justify probabilism with a different set of conditions on what an appropriate measure of accuracy should be like. Many justifications of probabilism rely on a specific class of measures of inaccuracy, which are called strictly proper (Joyce, 2009; Predd et al., 2009; Pettigrew, 2016).
Definition 2.1 ((Strictly) proper inaccuracy measure). Let W be a finite set of worlds mapping sentences in F into {true, false}. Then I is proper iff for every probability function p and every finite subset G ⊆ F, the expected score Σ_{w∈W} p(w) I_G(cr, w), taken as a function of cr for fixed p, is minimized when cr = p. Furthermore, if cr = p uniquely minimises this function (on all finite G's), we say that I is strictly proper.5

Intuitively, the above definition requires that the inaccuracy measure I make all probabilistic credences immodest, in the sense that an agent whose beliefs are represented by such a credence would expect her own beliefs to be more accurate than any other. By arguing that these rules provide reasonable measures of epistemic accuracy, and assuming Non-Dominance to be a rationality requirement, it's possible to show that all rational credences respect the probability axioms.

Lindley (1982) also evaluates an agent's beliefs in terms of their accuracy, and takes Non-Dominance to be a requirement for rational credences. However, he considers a class of reasonable accuracy measures other than the class of proper ones. This difference in his assumptions leads him to a weaker conclusion. Avoiding accuracy-domination does not require credences to be probabilistic; instead, Lindley argues, it merely requires that they can be transformed into probabilistic functions by a canonical transform. In other words, although Lindley's undemanding assumptions are not sufficient to justify full-blown probabilism, the rational credences they characterise all share some probabilistic structure.
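Strict propriety in the sense of Definition 2.1 can be checked numerically for a concrete score. The sketch below is my own illustration, using the Brier score over a single sentence: the p-expected inaccuracy, as a function of the credence x, is minimized exactly at x = p.

```python
# Expected Brier inaccuracy of credence x in a sentence A, when A has
# probability p of being true: p*(x-1)^2 + (1-p)*x^2.
def expected_brier(x, p):
    return p * (x - 1) ** 2 + (1 - p) * x ** 2

p = 0.3
grid = [i / 100 for i in range(101)]
best = min(grid, key=lambda x: expected_brier(x, p))
# The unique minimiser is x = p: the Brier score makes probabilistic
# credences immodest, i.e. it is strictly proper.
assert abs(best - p) < 1e-9
```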
The details and philosophical significance of this result will be discussed in more detail in the next section.

4 Joyce himself has adjusted and extended his argument over the years to address some of this criticism (Joyce, 2009). However, his assumptions are still the subject of debate among epistemologists (Titelbaum, 2015).
5 The requirement that G be finite guarantees that, defining I as in Section 3, the inaccuracy I_G(cr, w) is finite. If we were measuring accuracy over propositions instead of sentences, then the requirement that W be finite would suffice.

Lindley's argument
I will now go over Lindley's accuracy argument. I begin with an overview of the formal result, and then discuss its philosophical consequences. The result is presented with the notation introduced in the previous section, in order to simplify the extension to non-classical settings in the following sections.
Like Joyce, Lindley begins with assumptions that establish what kind of measure of inaccuracy should be considered reasonable, and what it means for a credence function to be rational according to such a measure.
(a) Score assumption: if cr is a credence function defined over F, f is a score function, and G is a subset of F, then the total inaccuracy of cr at world w over G is given by the sum of the scores f(cr(A), A^w) for A ∈ G. I will abuse the notation and use the symbol f to denote both the local score function and the global inaccuracy measure defined by that score function. So the score assumption can be written as:

I_G(cr, w) = Σ_{A∈G} f(cr(A), A^w).

Note that the inaccuracy of cr over an infinite G may be infinite, so the range of I_G is the extended real numbers.
(b) Admissibility assumption: We say an agent's credence cr : F → R is accuracy-dominated on a finite subset G ⊂ F (according to the inaccuracy measure f) iff there is some other credence function cr′ such that:

I_G(cr′, w) ≤ I_G(cr, w)

for all possible worlds w ∈ W, with some worlds in which the inequality is strict. This means that cr is never more accurate than some other cr′ on G, no matter what world is the case, and in some worlds cr′ is more accurate than cr. We say cr is accuracy-dominated (according to f) if it is accuracy-dominated on some finite G ⊂ F. We then introduce the following rationality criterion: a credence cr is rationally admissible according to an inaccuracy measure f only if it is not accuracy-dominated according to that f (Non-Dominance).
(c) Origin and Scale assumption: There are two distinct values x_F, x_T ∈ R with x_F < x_T, such that: x_F is the only rationally admissible value for cr(A) if A is false in all possible worlds w ∈ W, and x_T is the only rationally admissible value for cr(A) if A is true in all possible worlds w ∈ W.
In Lindley's argument, the credence values x_F, x_T represent the agent's certainty in the falsity/truth of a proposition, respectively.
(d) Regularity assumptions: The credence cr can assume all values in a closed, bounded interval J ⊂ R. There exists the derivative f′(x, y) of f(x, y) with respect to x ∈ J. This derivative is continuous in x for each y and, for both y = 0 and y = 1, is zero at no more than one point. Also, x_F and x_T are interior points of J.
Lindley's assumptions are too weak to imply that all rational credences be probabilistic. Instead they imply that a rational credence's canonical transform must respect the probability axioms. For each inaccuracy measure f, this transform is obtained by composing the agent's credence with the function P_f : R → R defined as:

P_f(x) = f′(x, 0) / (f′(x, 0) − f′(x, 1)) (6)

Lindley's proof is then developed via three main lemmas, each showing that P_f ∘ cr respects one of the probability axioms. The axioms are taken to be:

(A1) 0 ≤ P(A) ≤ 1;
(A2) P(A) + P(¬A) = 1;
(A3) P(A ∧ B) = P(A | B) P(B).

After proving the lemmas, the following theorem can be derived straightforwardly:

Theorem 1 (Lindley (1982)). Under the assumptions (a-d) listed above, if cr : F → R is admissible according to a reasonable inaccuracy measure f (i.e. cr is not accuracy-dominated under f), and if P_f is the canonical transform defined as in (6), then the composite function (P_f ∘ cr) : F → R obeys the probability axioms (A1)-(A3).
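A quick numerical sketch of the transform (my own, assuming (6) has the form P_f(x) = f′(x, 0)/(f′(x, 0) − f′(x, 1))): for the proper Brier score the transform is the identity, while for a smooth but improper quartic score it is not.

```python
# Canonical transform P_f(x) = f'(x,0) / (f'(x,0) - f'(x,1)), given the
# two partial derivatives d0 = f'(., 0) and d1 = f'(., 1).
def P(x, d0, d1):
    return d0(x) / (d0(x) - d1(x))

# Brier score: f(x,1) = (1-x)^2, f(x,0) = x^2 (proper).
brier = (lambda x: 2 * x, lambda x: -2 * (1 - x))
assert abs(P(0.3, *brier) - 0.3) < 1e-12       # identity: P_f(x) = x

# Quartic score: f(x,1) = (1-x)^4, f(x,0) = x^4 (smooth, not proper).
quartic = (lambda x: 4 * x ** 3, lambda x: -4 * (1 - x) ** 3)
assert abs(P(0.5, *quartic) - 0.5) < 1e-12     # fixed at the midpoint
assert P(0.3, *quartic) != 0.3                 # but not the identity
```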
To better understand the nature of Lindley's conclusions, it will be useful to discuss the assumptions he makes about the inaccuracy measures. In particular, we will compare the kind of measures of accuracy he considers reasonable to the (strictly) proper measures which are commonly used in accuracy arguments for probabilism. Assumption (a) demands that the total inaccuracy of a credence cr over G is simply the sum of the scores f(cr(A), A^w) on each A ∈ G. This is also commonly assumed for proper inaccuracy measures, which are usually defined as sums of proper scoring rules. Assumption (c) reflects the fact that the rational credence values cr(⊤) = 1 for a tautology, and cr(⊥) = 0 for a contradiction, are conventional. This level of generality can also be achieved by proper inaccuracy measures. In his reply to Howson, Joyce (2015) shows that we can adapt these measures so that any two values may be used as the endpoints of a rational credal scale. By doing so we derive a different form of probabilism, where the usual probability axioms are substituted by scale-invariant formulations. Assumption (d) mainly requires the score functions to be smooth. This guarantees that if the credence cr(A) is very close to cr′(A), the respective scores will be close as well. It also adds some further technical conditions, some of which can be weakened (see Lindley (1982)).
If we restrict ourselves to smooth inaccuracy measures in the sense of (d), then the measures Lindley finds reasonable are a strictly larger class than the proper ones. Indeed, Lindley points out that, among the inaccuracy measures he considers, the proper ones are those that "lead[] directly to a probability" (Lindley, 1982, p. 7). By this he means that the transform P_f defined from a proper inaccuracy measure f is simply the identity function, so P_f ∘ cr = cr for each cr ∈ Cred. As a consequence of this, for a credence cr to be non-dominated under a proper measure of inaccuracy, cr must itself be probabilistic, regardless of whether we define probability by means of the standard or scale-invariant axioms. Under Lindley's more general assumptions, on the other hand, it might be that a non-probabilistic cr is non-dominated under f (when f is not proper). In this case, the transform P_f ∘ cr produces a different credence cr′, which Theorem 1 ensures is probabilistic, again with either standard or scale-invariant formulations of the axioms.6

Titelbaum (2015) suggests that Lindley's conclusions can be interpreted in different ways depending on one's position with regards to the numerical representation of beliefs. On one hand, if we think all that matters to an epistemologist are qualitative statements like "Sally believes A more than B", then quantifying the agent's beliefs with real numbers is a useful modeling technique, but nothing more than that. Since different credence functions corresponding to the same probability distribution are ordinally equivalent, according to this view they are just different ways to represent the same beliefs. Every rational credence is then either probabilistic, or, by Lindley's theorem, epistemically equivalent to a probabilistic one.
On the other hand, we may think that numeric representations capture some deeper facts about belief, facts that cannot always be expressed by mere qualitative judgements. In this case, there can be a real difference between ordinally equivalent credence functions, and in particular between a credence function and its probabilistic transform. So the result takes a more negative light, showing just how demanding our assumptions have to be in order to induce full-blown probabilism. This double relevance of Lindley's argument makes it an interesting target for generalization, which will be the subject of the next section.

6 Further discussion of this point can be found in (Joyce, 2009, 10).

Non-classical settings
This setup closely follows the one used by Williams in his generalization of Joyce's accuracy argument. To prove his accuracy theorem, Joyce (1998) uses the fact that probability functions are convex combinations of the ideal credences induced by each (classically) possible world, which I denoted earlier by •^w. In his generalization, Williams (2012b,a) takes advantage of the fact that the concept of convex combination can be easily extended to the non-classical case. More precisely: if the truth-status of a sentence A at world w is w(A), Williams interprets the truth-value A^w as the ideal belief of an agent towards A, if w is the case. Then non-classical probabilities end up being convex combinations of the functions •^w induced by each non-classical possible world.
Unlike Joyce, however, Lindley does not rely on a concept like that of convex combinations, which can be so readily generalized. In order to transfer Lindley's argument to the non-classical case, I will need to adjust some of his assumptions and proofs in Section 5. My conclusion will also be different; whereas Williams vindicates full-blown non-classical probabilism, I aim to justify its weaker version, analogous to that justified by Lindley: in a number of non-classical settings, all rational credence functions can be canonically transformed into functions that respect Paris's generalised probability axioms.
We are interested in generalising Lindley's argument to non-classical logics with truth-values in [0, 1]. We do not impose any restrictions on the set S of truth-statuses of these logics, but we do require that the truth-value assignment functions •^w of each possible world w ∈ W satisfy the following conditions. Since we interpret truth-values as aims for belief, these conditions concern the ways an ideal agent's degree of belief in a composite proposition is constrained by her degree of belief in its components. The classical setting with its usual truth-value assignment clearly satisfies both conditions, and so we include it as a particular case of the more general non-classical pattern. But many non-classical settings also fall within this family. Here I list some of them, again following Williams (2012b) in their definition:7

Classical:
S := {true, false}
The truth-value mapping •^w is defined by: A^w = 1 if w(A) = true, and A^w = 0 if w(A) = false. The connectives ∧, ¬ follow their usual classical truth tables.

LP Gluts:
S := {true, both, false}
The truth-value mapping •^w is defined by: A^w = 1 if w(A) ∈ {true, both}, and A^w = 0 if w(A) = false. The connectives ∧, ¬ follow the LP rules: negation swaps true and false and fixes both, while conjunction returns the minimum of its arguments under the ordering false < both < true.

Kleene gaps: (see Appendix)
Intuitionism: (see Appendix)
Fuzzy Gaps (finite or infinite): (see Appendix)

7 For reasons of space, only the classical and LP cases are written out here. The definitions of the other logics can be found in the Appendix.

Lindley's proof does not apply in general to these logical settings. To show this, I take as an example Lindley's proof of (A2), showing how it breaks in the case of LP Gluts. This is also a nice way to introduce Lindley's proof strategy, which I will adapt later in this section.
Proposition 1 (Lindley's Lemma 2 (Lindley, 1982)). Under the assumptions (a-d) listed in Section 3, if the credence cr : F → R is not accuracy-dominated according to a reasonable inaccuracy measure f, then for all A ∈ F:

P_f(cr(A)) + P_f(cr(¬A)) = 1.

Proof. Classical case: Assume cr is not accuracy-dominated for some inaccuracy measure f. There can be only two distinct (classically) possible worlds w1, w2 with: w1(A) = true, w1(¬A) = false and w2(A) = false, w2(¬A) = true. Let x := cr(A) and y := cr(¬A). Then from the Score assumption (a) we know that the total inaccuracy over {A, ¬A} in the two possible cases is:

f(x, 1) + f(y, 0) (world w1)
f(x, 0) + f(y, 1) (world w2)

Now we move from cr to a new credence distribution cr′ such that:

cr′(A) = x + h, cr′(¬A) = y + k, cr′(B) = cr(B) for all other B ∈ F,

with h, k ∈ R. You can think of this as nudging our credence in A by some small quantity h, nudging our credence in ¬A by some small quantity k, and leaving unchanged our credence in all other sentences in F. The idea is that, since cr is not accuracy-dominated, it should not be possible to decrease our total inaccuracy in every world by means of this sort of nudging.

By the Score assumption (a), the total inaccuracy of an agent is the sum of the scores she obtains on each sentence, and since cr and cr′ agree on all sentences except for A and ¬A, the difference in total inaccuracy between these two credences will amount to the difference in their scores on A and ¬A. Thinking of f(x, 1) and f(x, 0) as functions of a single variable x, for small h, k we can rewrite this shift in inaccuracy in terms of the derivative of f:

f′(x, 1)h + f′(y, 0)k (world w1)
f′(x, 0)h + f′(y, 1)k (world w2)

If we equate these two expressions to small, selected negative values, we get a system of two linear equations in unknowns h, k, one for each distinct possible world. If this system had a solution, then defining cr′ as above would make it more accurate than cr over the set {A, ¬A} in all possible worlds, but this would contradict the assumption that cr is not accuracy-dominated. So the system must not have a solution, that is, its determinant must be equal to zero.8 This happens when:

f′(x, 1) f′(y, 1) = f′(x, 0) f′(y, 0). (18)

We now expand the sum P_f(x) + P_f(y) using the transform's definition in (6):

P_f(x) + P_f(y) = f′(x, 0)/(f′(x, 0) − f′(x, 1)) + f′(y, 0)/(f′(y, 0) − f′(y, 1)).

So by (18) we have P_f(x) + P_f(y) = 1, that is, P_f(cr(A)) + P_f(cr(¬A)) = 1, as needed.9

8 Denote the small negative values on the right-hand side of the first and second equations by ε1 and ε2, respectively. By regularity assumption (d), at least one of f′(x, 1) or f′(x, 0) is nonzero. Let's say, without loss of generality, that f′(x, 1) ≠ 0. Then we can divide both sides of the first equation by this quantity to express h as a function of k. Substituting this expression in the second equation, we obtain an equation in k alone, which gives us a solution if we can divide both sides by the factor multiplying k. So unless this factor, which is the determinant of the system, is equal to 0, we have that cr is inadmissible.

LP Case: Here there are three distinct possible worlds w1, w2, w3 with: w1(A) = true, w1(¬A) = false; w2(A) = false, w2(¬A) = true; w3(A) = both, w3(¬A) = both. So by replicating the procedure above, in our move from cr to cr′ we get the following variation in inaccuracy in the three possible cases:

f′(x, 1)h + f′(y, 0)k (world w1)
f′(x, 0)h + f′(y, 1)k (world w2)
f′(x, 1)h + f′(y, 1)k (world w3)

since A^w = 1 when w(A) = both in LP Gluts. At this point, if we attempt to equate these expressions to some selected negative values as above, we obtain a system of three linear equations in the two unknowns h, k, which does not have a solution in general.
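A small numerical check (my own, taking (18) in the reconstructed form f′(x, 1)f′(y, 1) = f′(x, 0)f′(y, 0)): for the Brier score this admissibility condition reduces to x + y = 1, i.e. the credences in A and ¬A must sum to one, exactly as (A2) demands of the (identity) transform.

```python
# Brier derivatives: f(x,0) = x^2 and f(x,1) = (1-x)^2.
d0 = lambda x: 2 * x
d1 = lambda x: -2 * (1 - x)

def determinant_vanishes(x, y, tol=1e-12):
    # Condition (18): f'(x,1) f'(y,1) = f'(x,0) f'(y,0)
    return abs(d1(x) * d1(y) - d0(x) * d0(y)) < tol

assert determinant_vanishes(0.3, 0.7)       # x + y = 1: no dominating nudge
assert not determinant_vanishes(0.3, 0.6)   # x + y != 1: cr is dominated
```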
The reason why Lindley's proof does not immediately extend to non-classical settings is that the very probability axioms he is trying to justify contain implicit references to classical logical notions. For example, the definition of tautology is dependent on the notion of logical consequence: A is a tautology if and only if ⊨ A, where the double turnstile denotes classical logical consequence. The axioms entail that (A ∨ ¬A), which is a classical tautology, ought to be believed with credence 1. However, there are logical settings in which it might be reasonable to be less than certain about the truth of (A ∨ ¬A). In a treatment of scientific confirmation, for instance, it might be appropriate to have low belief in both A and ¬A, and to not be certain of their disjunction, if neither has received any supporting evidence (Weatherson, 2003).
If we want to extend Lindley's argument, we must first understand what probabilism would look like in a non-classical setting. To this purpose we introduce Paris's axioms (Paris, 2001):

(P1) if ⊨ A then P(A) = 1, and if A ⊨ then P(A) = 0;
(P2) if A ⊨ B then P(A) ≤ P(B);
(P3) P(A ∨ B) + P(A ∧ B) = P(A) + P(B).

Note that this axiomatisation of probability makes explicit reference to a notion of logical consequence. When we restrict ourselves to classical logic this notion is uniquely defined, but many definitions are possible in non-classical contexts. The conclusion of my generalised accuracy argument will be that Paris's axioms provide epistemic norms for a number of non-classical settings when logical consequence is given a no-drop characterization. This is defined as follows: A ⊨ B if and only if A^w ≤ B^w for every possible world w ∈ W, so that the truth-value never drops as we move from premise to conclusion.

9 For boundary cases, see the original proof in Lindley (1982).

This point highlights another perspective from which to consider the generalisation of Lindley's accuracy argument presented in this essay. As noted by Williams (2012b), the ability to support (some form of) probabilism might offer a reason to prefer the no-drop characterization of logical consequence over its alternatives, opening up an interesting connection between our epistemology and the underlying semantic theory.
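The no-drop clause is easy to operationalise. Here is a minimal sketch (my own, with A and B given as per-world truth-value assignments over a hypothetical three-world model):

```python
# No-drop consequence (single-premise form): A |= B iff the truth-value
# of B is at least that of A at every possible world.
def no_drop_entails(val_A, val_B, worlds):
    return all(val_A(w) <= val_B(w) for w in worlds)

# Hypothetical model: three worlds with truth-values for A and B.
worlds = [{'A': 1, 'B': 1}, {'A': 0, 'B': 1}, {'A': 0, 'B': 0}]
assert no_drop_entails(lambda w: w['A'], lambda w: w['B'], worlds)      # A |= B
assert not no_drop_entails(lambda w: w['B'], lambda w: w['A'], worlds)  # B does not entail A
```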
Before we move on, it's worth pointing out an important difference between Lindley's result and what we are trying to prove. Lindley's original argument considers conditional credences as the fundamental expression of the uncertainty of an agent. This reflects a stance in the philosophy of probability that sees conditional probability as the fundamental notion, and derives unconditional probability from it. In contrast to this view, many textbooks on probability theory, but also many philosophical treatments, take unconditional judgements as primitive and interpret axiom (A3) as a definition of conditional probability (see Hájek (2003) and Easwaran (2011) for some discussion of these two positions). While I will not go over the details of this debate here, I want to point out that, unlike Lindley's, my argument will be limited to agents expressing unconditional beliefs. This is not because I consider unconditional belief to be more fundamental in any way, but because the relationship between conditional and unconditional probabilities in a non-classical setting is not at all straightforward. A number of different approaches are available for specifying it (Williams, 2016), and they can lead to fairly different results. Thus I will restrict myself to the simpler unconditional case. Indeed, Paris's axioms (P1)-(P3), which I try to justify here, do not make any reference to conditional probability. Thus my goal is not to extend Lindley's result itself, but rather to generalize his argument strategy in order to prove a weaker result in a number of non-classical settings.

Non-classical generalisation: the necessary condition
It's time to start working on our generalization of Theorem 1. As mentioned in the previous section, in order to transfer the argument to a non-classical setting, we need to adapt some of its assumptions. The obvious place to start is the origin and scale assumption (c): unlike in the classical case, we no longer have a perfect correspondence between truth-statuses and truth-values, so we must further specify the role of truth-values as ideal aims for belief.
(c*) Origin and Scale assumption: There are two distinct values x_0, x_1 ∈ R with x_0 < x_1, such that: x_0 uniquely minimises f(x, 0), and x_1 uniquely minimises f(x, 1).
This new formulation differs from the original in two ways. First, we use the names x_0, x_1 for our admissible values instead of x_F, x_T. This is because they do not really correspond to a sentence's truth-status, but rather to the ideal belief the agent should have in it. Secondly, the way these values are defined is different. Notice that if we defined x_0 as in (c), it would have to be that f(x, 0) is uniquely minimised at x_0: if that were not the case, then having credence cr(A) = x_0 for a sentence A with A^w = 0 in every possible world would make one accuracy-dominated, and thus x_0 would be inadmissible, contradicting our assumption (similarly for x_1).
But in the non-classical case, we might be working in a logic for which there is no A ∈ F such that A^w = 0 for all possible worlds, in which case the previous formulation of the assumption would hold vacuously. Thus we must directly assume that x_0 uniquely minimises f(x, 0) (and similarly that x_1 uniquely minimises f(x, 1)).
We begin our generalisation by proving an analogue of Lemma 1 in Lindley (1982).
Lemma 1. Under the assumptions (a), (b), (d) listed in Section 3 and the assumption (c*) above, if the credence function cr : F → R is not accuracy-dominated under a reasonable inaccuracy measure f, then:

1. cr(A) ∈ [x_0, x_1] for every A ∈ F;
2. the function P_f : R → R, defined as in (6), takes values in [0, 1] for x ∈ [x_0, x_1];
3. P_f is continuous on [x_0, x_1], with P_f(x_0) = 0 and P_f(x_1) = 1.
Proof. The proof follows the original one (Lindley, 1982). Assume cr is not accuracy-dominated according to the inaccuracy measure f. Let A ∈ F, and let x := cr(A). By the regularity assumptions we know that f'(x, 0) > 0 for x > x_0 and f'(x, 1) < 0 for x < x_1. If we had x > x_1 then both derivatives f'(x, 0), f'(x, 1) would be positive, and so moving to a credence cr' with cr'(A) = cr(A) − h = x − h for some small positive h would guarantee a reduction of inaccuracy. But this contradicts our assumption that cr is not accuracy-dominated. Likewise for x < x_0. So we have proven the first point.
Consider now the case where x ∈ [x_0, x_1]. Decreasing the value of x will decrease f(x, 0) (i.e. we get closer to the ideal belief when A has truth-value 0) but increase f(x, 1) (i.e. we get further away from the ideal belief when A has truth-value 1), and vice versa when the value of x increases. We can now see from P_f's definition that P_f(x) ∈ [0, 1], so the second point holds. Also, from the continuity of f'(x, y), we have that P_f is continuous for all non-dominated degrees of belief x ∈ [x_0, x_1], and:

P_f(x_0) = f'(x_0, 0) / (f'(x_0, 0) − f'(x_0, 1)) = 0,

since f'(x_0, 0) = 0, and, similarly, P_f(x_1) = 1. This proves the third point.
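As a sanity check on Lemma 1, the following sketch computes the transform P_f(x) = f'(x, 0)/(f'(x, 0) − f'(x, 1)) of definition (6) for the Brier score; the variable names are mine. For this (proper) score P_f works out to be the identity, so the three claims of the lemma are easy to verify on a grid.

```python
import numpy as np

# Canonical transform, as in (6): P_f(x) = f'(x,0) / (f'(x,0) - f'(x,1)),
# instantiated for the Brier score f(x, y) = (x - y)^2, for which
# f'(x, 0) = 2x and f'(x, 1) = 2(x - 1).
def P_f(x):
    d0 = 2 * x          # f'(x, 0)
    d1 = 2 * (x - 1)    # f'(x, 1)
    return d0 / (d0 - d1)

# Check Lemma 1's claims on a grid of admissible credences in [x0, x1]:
xs = np.linspace(0, 1, 101)
vals = P_f(xs)
print(vals.min(), vals.max(), P_f(0.0), P_f(1.0))
```

Here d0 − d1 = 2 for every x, so P_f(x) = x: the transform takes values in [0, 1], is continuous, and sends x_0 = 0 to 0 and x_1 = 1 to 1, exactly as the lemma requires.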
From Lemma 1 we obtain that P_f ∘ cr respects (P1). We now prove that (P3) is also satisfied, employing a strategy similar to that used by Lindley for his Lemma 2. In the proof we will use the LP Gluts case as an example, and explain how analogous proofs can be constructed for the other settings under consideration.
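Before stating the lemma, it is worth seeing concretely why an additivity condition like (P3) is available in the LP Gluts setting: once the status both is mapped to truth-value 1, the truth-values at every LP world satisfy v(A) + v(B) = v(A ∧ B) + v(A ∨ B). A small sketch checks this exhaustively (the Python encoding of statuses is mine, purely for illustration):

```python
from itertools import product

# LP (Logic of Paradox) truth-statuses and the truth-values they are
# mapped to in the accuracy argument: note that "both" gets value 1.
VALUE = {"false": 0, "true": 1, "both": 1}
ORDER = {"false": 0, "both": 1, "true": 2}   # false < both < true

def conj(a, b):  # A ∧ B takes the lower of the two statuses
    return a if ORDER[a] <= ORDER[b] else b

def disj(a, b):  # A ∨ B takes the higher of the two statuses
    return a if ORDER[a] >= ORDER[b] else b

# The 9 possible worlds for two atoms A, B; at each of them the
# truth-values satisfy v(A) + v(B) = v(A∧B) + v(A∨B), which is the
# fact underlying the additivity axiom (P3) in the LP Gluts setting.
for a, b in product(VALUE, VALUE):
    lhs = VALUE[a] + VALUE[b]
    rhs = VALUE[conj(a, b)] + VALUE[disj(a, b)]
    assert lhs == rhs, (a, b)
print("v(A) + v(B) = v(A∧B) + v(A∨B) holds at all 9 worlds")
```

Since every truth-value distribution satisfies this identity, so does every convex combination of such distributions, which is what (P3) demands of the transform.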
Lemma 2. Under the assumptions (a), (b), (d) listed in Section 3 and the assumption (c*) above, if the credence function cr : F → R is not accuracy-dominated under a reasonable inaccuracy measure f, then for any A, B ∈ F:

P_f(cr(A)) + P_f(cr(B)) = P_f(cr(A ∧ B)) + P_f(cr(A ∨ B)).

Proof. Assume the credence function cr is not accuracy-dominated under the inaccuracy measure f. Let A, B ∈ F and let x := cr(A), y := cr(B), p := cr(A ∧ B), s := cr(A ∨ B). We want to prove that

P_f(x) + P_f(y) = P_f(p) + P_f(s). (36)

Applying the definition (6) of P_f and moving all the terms to the left-hand side, this becomes:

f'(x,0)/(f'(x,0) − f'(x,1)) + f'(y,0)/(f'(y,0) − f'(y,1)) − f'(p,0)/(f'(p,0) − f'(p,1)) − f'(s,0)/(f'(s,0) − f'(s,1)) = 0.

The expression on the left-hand side can be simplified to have a common denominator. Let φ be the numerator of the resulting fraction. Ultimately, then, we need to prove that φ = 0.

Now in the case of LP Gluts, we have at most 9 distinct possible worlds, with A ∧ B taking the lower and A ∨ B the higher of the two statuses (under false < both < true). These are:

        A      B      A ∧ B   A ∨ B
w_1   true   true   true    true
w_2   true   both   both    true
w_3   both   true   both    true
w_4   both   both   both    both
w_5   true   false  false   true
w_6   both   false  false   both
w_7   false  true   false   true
w_8   false  both   false   both
w_9   false  false  false   false

which lead to the following inaccuracy for each possible case (note that both = 1):

w_1-w_4:  f(x, 1) + f(y, 1) + f(p, 1) + f(s, 1)
w_5, w_6: f(x, 1) + f(y, 0) + f(p, 0) + f(s, 1)
w_7, w_8: f(x, 0) + f(y, 1) + f(p, 0) + f(s, 1)
w_9:      f(x, 0) + f(y, 0) + f(p, 0) + f(s, 0)

Consider the change from credence cr to a new credence cr' such that cr'(A) = x + d_1, cr'(B) = y + d_2, cr'(A ∧ B) = p + d_3, cr'(A ∨ B) = s + d_4, for small d_1, ..., d_4. This leads to a corresponding change in the accuracy for each possible case, and requiring each of these changes to be negative yields a system of linear inequalities in d_1, ..., d_4. Since cr is not accuracy-dominated, this system can have no solution, and working out the condition for its unsolvability yields exactly φ = 0, which gives us (36).

We finally prove (P2), with a third lemma.

Lemma 3. Under the same assumptions, if the credence function cr : F → R is not accuracy-dominated under a reasonable inaccuracy measure f, then for any A, B ∈ F with A |= B:

P_f(cr(A)) ≤ P_f(cr(B)).

Proof. Let x := cr(A) and y := cr(B), and recall that A |= B. Since cr is admissible, Lemma 1 tells us that x, y ∈ [x_0, x_1]. First consider the case where x = x_1. This value is admissible only when A_w = 1 in all possible worlds w. Furthermore, from Lemma 1 we have that P_f(x) = P_f(x_1) = 1. We know from the no-drop characterization of |= that A |= B iff A_w ≤ B_w for all w ∈ W. Therefore it must be that B_w = 1 for all possible w in the current non-classical interpretation. But then the only admissible value for y is x_1, and P_f(y) = P_f(x_1) = 1. So P_f(x) = P_f(y), as needed.
In the case where x = x_0, we have from Lemma 1 that P_f(x) = 0, and since in the same lemma we proved that P_f(y) ∈ [0, 1] for all admissible y, it clearly holds that P_f(x) ≤ P_f(y).^10 Consider now the case where x, y ∈ (x_0, x_1). We want to prove that P_f(x) ≤ P_f(y), which by definition of P_f means:

f'(x,0)/(f'(x,0) − f'(x,1)) ≤ f'(y,0)/(f'(y,0) − f'(y,1)).

Since the denominators are both strictly positive, we can simplify to:

f'(x,1) f'(y,0) ≤ f'(x,0) f'(y,1);

this inequality is what we are going to prove.

10 Similar cases can be constructed for y = x_1 and y = x_0.
We know that A |= B, so in the case of LP Gluts we have at most seven possible worlds:

        A      B      A_w   B_w
w_1   true   true    1     1
w_2   true   both    1     1
w_3   both   true    1     1
w_4   both   both    1     1
w_5   false  true    0     1
w_6   false  both    0     1
w_7   false  false   0     0

For i = 1, ..., 7 the total inaccuracy at world w_i is given by:

f(x, A_{w_i}) + f(y, B_{w_i}).

As in the previous proof, consider the change in the total inaccuracy when we move from cr to cr' by changing the credences in A, B by small quantities h, k respectively. By imposing that this inaccuracy change be less than zero we obtain a system of seven inequalities. Once again, we see that some of these
inequalities are duplicates. Regardless of which of the non-classical settings under consideration we are working in, we are forced to assign truth-values in {0, 1} to A and B, and because of the no-drop characterization of logical consequence, a world where A_w = 1 and B_w = 0 is not possible. So we always obtain a system of at most three inequalities:

f'(x,0) h + f'(y,0) k < 0   (the world with A_w = B_w = 0),
f'(x,0) h + f'(y,1) k < 0   (the worlds with A_w = 0, B_w = 1),
f'(x,1) h + f'(y,1) k < 0   (the worlds with A_w = B_w = 1).

Since by the regularity assumptions f'(y,0) > 0 for all y ∈ (x_0, x_1), the first inequality becomes:

k < −(f'(x,0)/f'(y,0)) h,

and the angular coefficient of the corresponding equation is negative, because f'(x,0) > 0. Similarly, the second inequality becomes:

k > −(f'(x,0)/f'(y,1)) h,

where f'(y,1) < 0 and so the corresponding angular coefficient is positive. The third inequality becomes:

k > −(f'(x,1)/f'(y,1)) h,

and the corresponding equation has negative angular coefficient. The space of the solutions of the system is the one highlighted in Figure 2 of the Appendix.
If this system had a solution, then changing from cr to cr' would guarantee us greater accuracy no matter what, which is absurd because cr is assumed to not be accuracy-dominated. So the space of solutions must be empty, which happens when:

−(f'(x,1)/f'(y,1)) ≤ −(f'(x,0)/f'(y,0)),

that is, when:

f'(x,1) f'(y,0) ≤ f'(x,0) f'(y,1).

This is exactly the inequality we needed to prove.
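The geometry of this argument can be checked numerically. The sketch below assumes the Brier score (as in the Appendix figures), so that f'(x,0) = 2x and f'(x,1) = 2(x − 1), and searches a grid of perturbations (h, k) for a simultaneous solution of the three inequalities; the function name and grid are mine, for illustration only. For the Brier score P_f is the identity, so (P2) demands x ≤ y, and indeed no dominating direction exists in that case, while one appears as soon as x > y (the configuration plotted in Figures 1 and 2).

```python
import numpy as np

# Brier-score derivatives: f'(x,0) = 2x > 0 and f'(x,1) = 2(x-1) < 0
# for x in (0, 1).  An illustrative reasonable inaccuracy measure.
def d0(x): return 2 * x
def d1(x): return 2 * (x - 1)

def dominating_direction(x, y, grid=np.linspace(-1, 1, 201)):
    """Search for perturbations (h, k) of cr(A), cr(B) that strictly
    decrease inaccuracy at every kind of world left open by A |= B."""
    for h in grid:
        for k in grid:
            if (d0(x) * h + d0(y) * k < 0 and    # world with A_w = B_w = 0
                d0(x) * h + d1(y) * k < 0 and    # worlds with A_w = 0, B_w = 1
                d1(x) * h + d1(y) * k < 0):      # worlds with A_w = B_w = 1
                return (h, k)
    return None

print(dominating_direction(0.4, 0.6))  # None: the system is empty when x <= y
print(dominating_direction(0.6, 0.4))  # a dominating direction exists when x > y
```

The second call reproduces the situation of the Appendix figures (x = 0.6, y = 0.4): the solution region is non-empty, so such a credence is accuracy-dominated, which is exactly why the lemma forces P_f(cr(A)) ≤ P_f(cr(B)).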
Combining the three lemmas above, we get a result analogous to Lindley's (1982) Theorem 1:

Theorem 2. Under the assumptions (a), (b), (d) listed in Section 3 and the assumption (c*) above, if cr : F → R is not accuracy-dominated according to a reasonable inaccuracy measure f, then the transform (P_f ∘ cr) : F → R, where P_f is defined as in (6), obeys Paris's probability axioms (P1)-(P3).
This result, together with our assumption (Non-dominance) that avoiding accuracy-domination is a necessary requirement for rationality, shows that all rational credences must have a probabilistic transform in the logical settings under discussion.

6 Sufficient conditions
Let us take a moment to look back at the main result of the previous section, Theorem 2. It has the typical form of the accuracy theorems discussed in Section 2: starting from some assumptions on what counts as a reasonable measure of inaccuracy, we have shown that, in order to avoid accuracy-domination, a credence's transform must respect Paris's probability axioms (P1)-(P3). If we take Non-dominance as a rationality requirement for credences, we can then conclude that having a probabilistic transform is a necessary condition for rationality.
It is natural to ask at this point whether the probability axioms also provide a sufficient condition for rationality. If we pose the question in this form, however, the answer will have to be no. There are plenty of cases where holding a certain credence whose canonical transform is probabilistic makes one irrational. We don't need to move to non-classical logics to find examples of this. Imagine you have an urn in front of you, which you know contains 5 red balls and 5 black balls. You extract a ball from the urn after it has been shaken. We would like to say that, if you are rational, then P_f should transform your credence into a probability that assigns value 1/2 to the sentence 'the next ball extracted will be red'. Any other credence, whether it has a probabilistic transform or not, seems irrational in this case, as it would go against your evidence.
So we must reformulate our question. What we want to ask is: is having a credence whose transform respects the probability axioms a sufficient condition for avoiding the kind of irrationality that follows from a violation of Non-dominance? In other words, is every cr with a probabilistic transform safe from accuracy-domination?
We can think of the question above as a sanity check on the assumptions we used to prove our accuracy theorem. If it turns out that some intuitively reasonable credences with probabilistic transforms are accuracy-dominated, we should probably conclude that our measures of inaccuracy are inadequate, or that the Non-dominance requirement is too demanding. Joyce (2009) seems to view the question in this way when he discusses it in the context of his own accuracy argument for probabilism, in the classical setting. Indeed, he responds by arguing that no inaccuracy measure can be reasonable if it makes some probabilistic credence accuracy-dominated, and defines his measures of inaccuracy accordingly. But this answer only pushes the fundamental problem back. A new question arises: why should we require our measures of inaccuracy to protect probabilities from domination, and not some other class of credences?
Although Joyce (2009) does attempt to answer this second question, I will not spend more time on his reply here. Instead, I want to look at how this question can be answered in the non-classical settings under consideration, for the class of single-valued inaccuracy measures.

Theorem 3. Let f be a single-valued reasonable inaccuracy measure, and let cr : F → R be a credence function whose transform P_f ∘ cr respects (P1)-(P3). Then cr is not accuracy-dominated according to f.

Proof. Let G = {A_1, ..., A_m} be a finite subset of F, with possible worlds w_1, ..., w_n. Since P_f ∘ cr respects (P1)-(P3), its restriction to G is a closed convex combination Π of the truth-value distributions determined by w_1, ..., w_n, and since f is single-valued, cr(A) uniquely minimizes the expected score according to Π on each A ∈ G.

Now assume by way of contradiction that cr ↾ G is accuracy-dominated on G by some other function cr'. So the inequality

f(cr, w_i) ≥ f(cr', w_i)

holds on each w_i, and is strict for at least one i. But then the inequality must also hold for Π's expectations of these scores; that is, thinking of f(cr, w) and f(cr', w) as random variables over {w_1, ..., w_n}:

Σ_i Π(w_i) Σ_j f(cr(A_j), (A_j)_{w_i}) ≥ Σ_i Π(w_i) Σ_j f(cr'(A_j), (A_j)_{w_i}),

where G = {A_1, ..., A_m}. We can swap the order of the sums to obtain:

Σ_j Σ_i Π(w_i) f(cr(A_j), (A_j)_{w_i}) ≥ Σ_j Σ_i Π(w_i) f(cr'(A_j), (A_j)_{w_i}).

We can remove from the above inequality all terms associated to the A_j's for which cr(A_j) = cr'(A_j), since they will be equal on both sides. Then for the inequality to hold, it must be that for at least one of the remaining A_j we have:

Σ_i Π(w_i) f(cr(A_j), (A_j)_{w_i}) ≥ Σ_i Π(w_i) f(cr'(A_j), (A_j)_{w_i}),

but this is a contradiction, given that cr(A) uniquely minimizes expected score according to Π on each A ∈ G.
This theorem shows that, in the non-classical settings we have been working with, single-valued inaccuracy scores make all credences whose canonical transform respects (P1)-(P3) safe from domination. It's easy to see that all proper inaccuracy measures which respect our regularity assumptions are single-valued, given that when f is proper in the sense of Definition 2.1, P_f is just the identity function. Although we cannot dispel the worry that, under some reasonable inaccuracy measures, some credences whose transform is probabilistic are accuracy-dominated, we have at least shown that for a large class of these measures (which includes all those that are proper) this does not happen.
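The role of propriety in this result can be illustrated numerically: for a proper score such as the Brier score, the credence x = p uniquely minimises the p-expected score, which is exactly the unique-minimisation property the proof of the theorem exploits. A minimal sketch (the grid and names are mine):

```python
import numpy as np

# A score f is proper when, for every probability p, the credence x = p
# uniquely minimises the p-expected score  p*f(x,1) + (1-p)*f(x,0).
# Here f is the Brier score, a proper measure for which P_f = identity.
def expected_brier(x, p):
    return p * (x - 1) ** 2 + (1 - p) * x ** 2

xs = np.linspace(0, 1, 1001)
for p in (0.1, 0.5, 0.73):
    best = xs[np.argmin(expected_brier(xs, p))]
    print(p, best)  # the minimiser sits at the grid point nearest p
```

Because the Π-expected score of each sentence is uniquely minimised at the credence the agent already holds, no rival credence function can beat it in expectation, and hence none can beat it at every world.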
7 Some open questions

A number of non-classical logics have not been discussed in this paper. This is because the generalization of Lindley's argument, if it is possible, is not as straightforward in these settings. I will not provide such a generalization in this paper. But I want to use this section to discuss the kind of additional adjustments that such a task might require, and highlight the difficulties it presents.
An example of a logical setting whose truth-values are {0, 1} but where the conditions (TV1)-(TV2) are not satisfied is the Gap Supervaluation one (see Appendix for a definition). Here the three axioms (P1), (P2), (P3) do not characterize closed convex combinations of truth-value distributions. Paris (2001) shows that in this case different axioms are needed, involving conditions on sentences A, B, A_1, ..., A_m ∈ F in which S ranges over the non-empty subsets of {1, 2, ..., m}.
And even this result holds only for specific languages, as Williams (2012b) notes.
In supervaluation settings, more sophisticated axioms seem necessary in order to account for variation in the expressive power of the language.
Other logical settings, such as the Finite and Infinite Fuzzy ones (see Appendix for their definition), violate our assumptions by having truth-values outside of {0, 1}.
The origin and scale assumption (c) then needs additional credence values to represent certainty of a proposition's being in each of the different truth-statuses.
For example, in a Finite Fuzzy setting with truth-statuses S = {0, 1/n, 2/n, ..., (n−1)/n, 1} we need values x_0, x_{1/n}, ..., x_{(n−1)/n}, x_1 so that x_i is the only admissible value for cr(p) when p_w = i for all w ∈ W. New questions arise concerning these additional values and how they should be distributed: should there be a requirement that x_{1/2} be equidistant from x_0 and x_1, or can it be placed anywhere between them, as long as it is greater than x_{1/4} and lower than x_{3/4}? The fact that x_0, x_1 should maintain a special status is suggested by their role in the first axiom.
The regularity assumptions (d) and, more importantly, the definition (6) of P_f, should also be adapted to take the new truth-values into account.
In fact, the value of the original transform depends only on the derivatives f'(x, 0), f'(x, 1), which does not seem right in this context. Whether it's possible to generalize the definition of P_f to include more than two truth-values, and what this generalization would look like, remain open questions.

Conclusion
Philosophers have argued for probabilism in many ways. Among them, accuracy-based arguments provide some of the most interesting justifications, but their assumptions are not at all uncontroversial. Thus it is interesting to explore how much we can relax these assumptions while still deriving from them some meaningful epistemic norms. Lindley's argument answers this question differently depending on one's position regarding the nature of credence. Those who see credences as a mere convention for discussing human beliefs, and who do not distinguish between structurally similar credence functions, will find that Lindley's assumptions are sufficient to derive probabilism as they intend it. Those who maintain that, on the contrary, numeric representations capture some key features of belief will have to face just how strong a foundation is needed to support full-blown probabilism.
In this paper I have adapted Lindley's argument for probabilism as a necessary condition for rationality to a number of unconditional, non-classical settings that were excluded from the original result. I have also specified a class of inaccuracy measures for which Lindley's version of probabilism is sufficient to avoid accuracy-domination. These generalisations are relevant for three main reasons. First, they allow us to justify probabilism as a theory of rationality for all fields of research where it might be appropriate to step outside the boundaries of classical logic. Secondly, they might give philosophers in these fields some reason to prefer a no-drop characterisation of non-classical logical consequence, since this is fruitfully connected with probabilistic epistemology. Lastly, they highlight a strong connection between our measures of inaccuracy and the underlying logic. This connection is visible in the way Lindley's origin and scale assumption was adapted to the {0, 1}-valued non-classical case, and in the issues caused by the multiplicity of truth-values in the supervaluational and fuzzy settings. It remains to be seen whether Lindley's argument can be generalised to multi-valued settings of this kind, and whether it can support (some form of) conditional probability in the non-classical case. This paper may be considered a starting point for future work on this topic.

A Appendix
A.1 Figures

A.2 Proof of Lemma 2 - Boundary cases
The proof above doesn't work for boundary cases, for example when A_w = 1 in all possible worlds. This is because cr(A) = x = x_1 in this case, so by defining cr'(A) = x + d_1 we might go outside the interval [x_0, x_1] of admissible values (first point of Lemma 1). Let's once again take the LP Gluts setting as an example for our proof.
And not only that: the fact that A ∨ B_w = 1 in all these worlds means that, for cr to be admissible, it must be that cr(A ∨ B) = s = x_1. Thus any admissible cr' must keep cr'(A) = x_1 and cr'(A ∨ B) = x_1, and can only perturb the credences in B and A ∧ B.

4 Non-classical generalisation: the set-up

This section lays the groundwork for the generalization of Lindley's accuracy argument in Section 5. First of all, I specify the family of non-classical logics to which my generalised result applies, and then list some of them. I show that Lindley's proof of Theorem 1 does not extend to these settings in general, and explain why this has to do with the axiomatisation of probability he is working with. Secondly, I clarify what my generalised result establishes by presenting an alternative version of the probability axioms, due to Paris (2001), which characterises closed convex combinations of truth-values in the non-classical settings under consideration.

Definition 4.1 (No-drop logical consequence). A |= B iff A_w ≤ B_w for every world w ∈ W.
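Definition 4.1 can be turned into a mechanical check once a finite set of worlds is fixed. The toy encoding below is mine and purely illustrative: each world is represented as a dict from sentences to their truth-values.

```python
# No-drop logical consequence over finitely many worlds: A |= B iff
# the truth-value never drops from A to B at any world.
def no_drop_entails(A, B, worlds):
    return all(w[A] <= w[B] for w in worlds)

# Toy LP-style worlds for atoms "A", "B" (the truth-value of the
# status 'both' is 1, so values are drawn from {0, 1}):
worlds = [
    {"A": 1, "B": 1},
    {"A": 0, "B": 1},
    {"A": 0, "B": 0},
]
print(no_drop_entails("A", "B", worlds))  # True: the value never drops
print(no_drop_entails("B", "A", worlds))  # False: drops at the second world
```

The same check works unchanged for any of the {0, 1}-valued settings discussed in the paper, since only the assignment of truth-values to worlds differs between them.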
Figure 1 in the Appendix shows the space of the solutions of these two inequalities.


Figure 1: Solutions of the first two inequalities. In this example x = 0.6, y = 0.4, and f is the Brier score (MATLAB figure).

Figure 2: Solutions of the whole system. In this example x = 0.6, y = 0.4, and f is the Brier score (MATLAB figure).