It is surprising, and perhaps a reflection of a certain provincialism in philosophy, that the problem of induction is so seldom linked to learning. On the face of it, an animal in a changing environment faces problems no different in general principle from those that we as ordinary humans or as specialized scientists face in trying to make predictions about the future.
Patrick Suppes, Learning and Projectibility
This chapter applies the ideas developed in the preceding chapter to a class of bounded resource learning procedures known as payoff-based models. Payoff-based models are alternatives to classical Bayesian models that reduce the complexity of a learning situation by disregarding information about states of the world. I am going to focus on one particular payoff-based model, the “basic model of reinforcement learning,” which captures in a precise and mathematically elegant way the idea that acts which are deemed more successful (according to some specific criterion) are more likely to be chosen.
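The idea that more successful acts become more likely to be chosen can be illustrated with a minimal sketch. It assumes one common formulation: each act carries a positive propensity, acts are chosen with probability proportional to their propensities, and the payoff received is added to the chosen act's propensity. The function names, the initial propensities of 1, and the fixed payoffs are illustrative assumptions, not taken from the chapter.

```python
import random


def choice_probabilities(propensities):
    """Choice probabilities proportional to (assumed positive) propensities."""
    total = sum(propensities)
    return [q / total for q in propensities]


def reinforce(propensities, act, payoff):
    """Return new propensities after `act` earned a nonnegative `payoff`.

    Assumption: the update is additive and touches only the chosen act.
    """
    updated = list(propensities)
    updated[act] += payoff
    return updated


def simulate(payoffs, rounds=1000, seed=0):
    """Run the learner against fixed, hypothetical payoffs per act.

    The learner never sees states of the world, only its own acts
    and the payoffs they produce.
    """
    rng = random.Random(seed)
    propensities = [1.0] * len(payoffs)  # assumed initial propensities
    acts = range(len(payoffs))
    for _ in range(rounds):
        probs = choice_probabilities(propensities)
        act = rng.choices(acts, weights=probs)[0]
        propensities = reinforce(propensities, act, payoffs[act])
    return choice_probabilities(propensities)
```

Calling `simulate([1.0, 2.0])` returns the learner's final choice probabilities for two acts with hypothetical payoffs 1 and 2.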
What we are going to see is that the basic model can be derived from certain symmetry principles, analogous to the derivation of Carnap's family of inductive methods. Studying the symmetries involved in this derivation leads into a corner of decision theory that is relatively unknown in philosophy. Duncan Luce, in the late 1950s, introduced a thoroughly probabilistic theory of individual choice behavior in which preferences are replaced by choice probabilities. A basic constraint on choice probabilities, known as “Luce's choice axiom,” together with the theory of commutative learning operators, provides us with the fundamental principles governing the basic model of reinforcement learning.
Our exploration of the basic model does not, of course, exhaust the study of payoff-based and other learning models. I indicate some other possible models throughout the chapter and in the appendices. The main conclusion is that learning procedures that stay within a broadly probabilistic framework often arise from symmetry principles in a way that is analogous to Bayesian models.
As I have already indicated, we may think of a system of inductive logic as a design for a “learning machine”: that is to say, a design for a computing machine that can extrapolate certain kinds of empirical regularities from the data with which it is supplied. Then the criticism of the so-far-constructed “c-functions” is that they correspond to “learning machines” of very low power. They can extrapolate the simplest possible empirical generalizations, for example: “approximately nine-tenths of the balls are red,” but they cannot extrapolate so simple a regularity as “every other ball is red.”
Hilary Putnam, Probability and Confirmation
To approach the type of reflection that seems to characterize inductive reasoning as encountered in practical circumstances, we must widen the scheme and also consider partial exchangeability.
Bruno de Finetti, Probability, Statistics and Induction
One of the main criticisms of Carnap's inductive logic that Hilary Putnam has raised – alluded to in the epigraph – is that it fails in situations where inductive inference ought to go beyond relative frequencies. It is a little ironic that Carnap and his collaborators could have immediately countered this criticism if they had been more familiar with the work of Bruno de Finetti, who had introduced a formal framework that could be used for solving Putnam's problem already in the late 1930s. De Finetti's central innovation was to use symmetries that generalize exchangeability to various notions of partial exchangeability for inductive inference of patterns.
The goal of this chapter is to show how generalized symmetries can be used to overcome the inherent limitations of order invariant learning models, such as the Johnson–Carnap continuum of inductive methods or the basic model of reinforcement learning. What we shall see is that learning procedures can be modified so as to be able to recognize in principle any finite pattern.
Taking Turns
Order invariant learning rules collapse when confronted with the problem of learning how to take turns. Taking turns is important whenever a learning environment is periodic.
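A small sketch shows why an order invariant rule cannot pick up a period-two pattern. It assumes the symmetric form of the Johnson–Carnap predictive rule, in which the probability of a category is (count + alpha) / (n + k * alpha) for k categories. The function name, the parameter `alpha`, and the ball colors are illustrative choices.

```python
from collections import Counter


def predictive_probability(history, category, categories, alpha=1.0):
    """Symmetric Johnson-Carnap predictive probability (assumed form).

    Depends on `history` only through category counts, so any
    reordering of the same observations yields the same prediction.
    """
    counts = Counter(history)
    return (counts[category] + alpha) / (len(history) + alpha * len(categories))


categories = ["red", "blue"]

# A strictly alternating environment: the next ball "should" be red.
alternating = ["red", "blue"] * 50
# The same observations with the order destroyed.
reordered = sorted(alternating)

p_alternating = predictive_probability(alternating, "red", categories)
p_reordered = predictive_probability(reordered, "red", categories)
```

Both calls give the same value, 51/102 = 0.5, although in the alternating sequence the next ball is red with certainty. The rule registers relative frequencies and nothing else.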
This appendix shows how to apply the work of A. A. J. Marley on commutative learning operators to reinforcement learning. Marley works with a slightly more general set of axioms than the one I use here, which is tailored toward the basic model of reinforcement learning.
Abstract Families
Let's consider the sequences of propensities one for each act. They arise from sequences of choice probabilities that at every stage satisfy Luce's choice axiom. Let be the range of values the random variables, can take on, and let be In many applications, will just be the set of nonnegative real numbers. Let be the set of outcomes, which can often be identified with a subset of the reals.
A learning operator L maps pairs of propensities and outcomes in to. If x is an alternative's present propensity, then L(x, a) is its new propensity if choosing that alternative has led to outcome a. This gives rise to a family of learning operators; for each, can be viewed as an operator from to. We assume that there is a unit element with
The triple is called an abstract family. An abstract family is quasi-additive if there exists a function and a function such that for each x, y in
We say that the process given by the sequences of propensities and the sequence of choices of acts is a reinforcement learning process if there exists an abstract family such that for all n whenever
where e is the unit element of. If such a family is quasi-additive, then
Marley's Theorem
Some of Marley's principles tell us when an abstract family fits the learning process we are interested in. Let's start by introducing the relevant concepts.
Definition 1. An abstract family is strictly monotonic if for all x, y in and each a in
Strict monotonicity says that learning is stable; an outcome has the same effect on how propensities are ordered across all propensity levels.
The criteria incorporated in the personalistic view do not guarantee agreement on all questions among all honest and freely communicating people, even in principle.
Leonard J. Savage, The Foundations of Statistics
Learning models usually feature an individual agent who responds in some way to new information. They are monological, in the sense that they ignore that learning often takes place in a social context. The learning models discussed in previous chapters are no exception. In the last two chapters of this book I wish to pursue two issues that are relevant for extending my approach to social epistemology.
Both topics have to do with epistemic disagreement. Disagreement is ubiquitous in many areas of our lives. It is common in cutting-edge science, economics, business, politics, religion, and philosophy, not to mention the many ordinary disagreements we all have to deal with day to day. Many of our disagreements have epistemic aspects. For instance, divergent economic policies may be based on different assessments of data and models. What is the relationship between epistemic disagreements and rational learning? To what extent are divergent opinions compatible with all agents being epistemically rational? Finding answers to these questions is of crucial importance.
This chapter focuses on learning from others. The opinions of other agents who might disagree with you are taken as evidence that can cause you to change your beliefs. We will see that this learning situation requires no substantially new solution. Radical probabilism provides us with all the resources to model updating on the opinions of other agents. The most important part of this chapter is a “rational reconstruction” of a rule for merging the opinions of agents which has been much discussed in the epistemological literature on peer disagreement. This rule, which is known as straight averaging, combines the opinions of a group of agents additively by assigning each of them equal weights. I will show that something close to straight averaging emerges from combining a principle of Carnapian inductive logic with the theory of higher-order probabilities.
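Straight averaging, as described above, is easy to state in code. The sketch below treats each opinion as a probability for one proposition and writes the rule as a weighted sum with equal weights by default. The function name and the optional `weights` argument are illustrative assumptions; the chapter's own reconstruction via inductive logic and higher-order probabilities is not reproduced here.

```python
def linear_pool(opinions, weights=None):
    """Combine probabilities for one proposition additively.

    With `weights=None` every agent gets weight 1/n, which is
    straight averaging. Explicit weights must sum to 1.
    """
    n = len(opinions)
    if n == 0:
        raise ValueError("need at least one opinion")
    if weights is None:
        weights = [1.0 / n] * n
    if len(weights) != n or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must match opinions and sum to 1")
    return sum(w * p for w, p in zip(weights, opinions))


# Three agents with hypothetical probabilities for the same proposition.
merged = linear_pool([0.2, 0.4, 0.9])
```

Here `merged` is approximately 0.5, the arithmetic mean of the three opinions.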
Let us imagine to ourselves the case of a person just brought forth into this world, and left to collect from his observation of the order and course of events what powers and causes take place in it. The Sun would, probably, be the first object that would engage his attention: but after losing sight of it the first night he would be entirely ignorant whether he would ever see it again. He would therefore be in the condition of a person making a first experiment entirely unknown to him.
Richard Price, Appendix to Bayes’ Essay
So far, we have considered learning models that operate within fixed conceptual frameworks. Fictitious play assumes that states, acts, and outcomes are known; reinforcement learning assumes the same for acts and outcomes. In many situations this is implausible, as Bayes’ friend and curator Richard Price has observed in his reflection on “a person just brought forth into this world.” One may not know the basic constituents of a new environment; accordingly, there may be no fixed conceptual framework for learning.
While this observation is not a knockout criticism of the learning models studied in previous chapters, it does put the spotlight on one of their inherent limitations. In this chapter, we will investigate some ways to overcome that limitation. To set the stage, I will give some context to the question of how much knowledge a learning model presupposes by discussing Savage's distinction between small worlds and large worlds in the context of learning. In a small world, the structure of an epistemic situation is fully known; in a large world, one does not know or anticipate every aspect of the epistemic situation that might be relevant. The distinction between large and small worlds will add yet another layer to our running discussion of bounded rationality.
The second section of this chapter looks at models based on flexible conceptual frameworks that take the epistemic incompleteness of a learning situation into account. They do so by keeping the learning procedure open to conceptual changes.
Radical Probabilism doesn't insist that probabilities be based on certainties; it can be probabilities all the way down, to the roots.
Richard Jeffrey, Radical Probabilism
In Chapter 1, we introduced the two main aspects of Bayesian rational learning: dynamic consistency and symmetry. The preceding chapters focused on symmetries. In particular, we have seen that learning models other than Bayesian conditioning agree that updating on new information should be consistent with one's overall inductive assumptions about the learning situation. In this chapter we return to the issue of dynamic consistency. Recall that Bayesian conditioning is the only dynamically consistent rule for updating probabilities in a special, and particularly important, class of learning situations in which an agent learns the truth of a factual proposition. The basic rationale for dynamic consistency is that an agent's probabilities are best estimates prior and posterior to the learning experience only if they cohere with one another.
This chapter explores dynamic consistency in the context of learning models of bounded rationality. Since these models depart rather sharply from Bayesian conditioning, it is not immediately clear how, or even whether, they can be dynamically consistent. The relevant insights come from Richard Jeffrey's epistemological program of radical probabilism, which holds that Bayesian conditioning is just one among many legitimate forms of learning. After introducing Jeffrey's main ideas, we will see that his epistemology provides a large enough umbrella to include the probabilistic models of learning we have encountered in the preceding chapters, and many more. Two principles of radical probabilism, in particular, will assume a decisive role: Bas van Fraassen's reflection principle and its generalization, the martingale principle, which is due to Brian Skyrms. The two principles extend dynamic consistency to generalized learning processes and thereby allow us to say when such a process updates consistently on new information, even if the content of the information cannot be expressed as an observational proposition.
From the theoretical, mathematical point of view, even the fact that the evaluation of probability expresses somebody's opinion is then irrelevant. It is purely a question of studying it and saying whether it is coherent or not; i.e., whether it is free of, or affected by, intrinsic contradictions. In the same way, in the logic of certainty one ascertains the correctness of the deductions but not the accuracy of the factual data assumed as premises.
Bruno de Finetti, Theory of Probability I
Symmetry arguments are tools of great power; therein lies not only their utility and attraction, but also their potential treachery. When they are invoked one may find, as did the sorcerer's apprentice, that the results somewhat exceed one's expectations.
Sandy Zabell, Symmetry and Its Discontents
This chapter is a short introduction to the philosophy of inductive inference. After motivating the issues at stake, I'm going to focus on the two ideas that will be developed in this book: consistency and symmetry.
Consistency is a minimal requirement for rational beliefs. It comes in two forms: static consistency guarantees that one's degrees of beliefs are not self-contradictory, and dynamic consistency requires that new information is incorporated consistently into one's system of beliefs. I am not going to present consistency arguments in full detail; my goal is, rather, to give a concise account of the ideas that underlie the standard theory of probabilistic learning, known as Bayesian conditioning or conditionalization, in order to set the stage for generalizing these ideas in subsequent chapters. Bayesian conditioning provides the basic framework for rational learning from factual propositions, but it does not always give rise to tractable models of inductive inference. In practice, nontrivial inductive inference requires degrees of beliefs to exhibit some kind of symmetry. Symmetries are useful because they simplify a domain of inquiry by distinguishing some of its features as invariant. In this chapter, we examine the most famous probabilistic symmetry, which is known as exchangeability and was studied extensively by Bruno de Finetti in his work on inductive inference.
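A toy example may help fix the two notions. The sketch below conditions a prior over two coin flips on the proposition that the first flip landed heads, and checks exchangeability by testing whether sequences with the same number of heads have the same probability. The prior values and all function names are made up for illustration.

```python
import math


def condition(prior, evidence):
    """Bayesian conditioning on a finite space.

    `prior` maps worlds to probabilities; `evidence` is the set of
    worlds in which the learned proposition is true.
    """
    p_evidence = sum(p for world, p in prior.items() if world in evidence)
    if p_evidence == 0:
        raise ValueError("cannot condition on a probability-zero proposition")
    return {
        world: (p / p_evidence if world in evidence else 0.0)
        for world, p in prior.items()
    }


def is_exchangeable(dist):
    """True if sequences with equally many heads are equally probable."""
    by_heads = {}
    for world, p in dist.items():
        by_heads.setdefault(world.count("H"), []).append(p)
    return all(
        math.isclose(p, group[0]) for group in by_heads.values() for p in group
    )


# Hypothetical prior over two flips; HT and TH are equally probable.
prior = {"HH": 0.4, "HT": 0.1, "TH": 0.1, "TT": 0.4}

# Learn the factual proposition "the first flip landed heads".
posterior = condition(prior, {"HH", "HT"})
```

After conditioning, the probability of heads on the second flip rises from 0.5 to about 0.8. The exchangeable prior lets the first observation bear on the second, which an independent fair-coin prior would not.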
Different minds may set out with the most antagonistic views, but the progress of investigation carries them by a force outside of themselves to one and the same conclusion.
Charles S. Peirce, How to Make Our Ideas Clear
The kind of probabilism I advocate in this book is very undogmatic in that it typically allows for a range of admissible beliefs in response to the same information. The charge of excessive subjectivity is often brought against such a view. This charge is particularly pressing when the rationality of science is at stake: after all, the emergence of a consensus seems to be one consequence of what it means to respond rationally to scientific evidence. Does radical probabilism have anything to offer that would account for this feature of scientific rationality?
The previous chapter explored learning from other agents. As we have seen, this leads to a consensus only under special conditions, which express an initial agreement among the agents about the structure of the epistemic situation. That agreement is required at some levels in order to get agreement at others is a theme we are going to encounter again in this chapter, in which we set aside learning from others and focus instead on situations where agents revise their opinions based on the same information.
In the epigraph to this chapter, Peirce expresses an optimistic view about this process: learning from the same evidence shall overcome any initial disagreement. There is more than a grain of truth in this statement, as is shown by a number of theorems that I will discuss in this chapter. These theorems go a long way toward showing that probabilistic learning can lead to a rational consensus. However, there are also limits to this. The question of when Jeffrey conditioning leads to a consensus is especially interesting in this regard; we will see that considering Jeffrey conditioning helps reveal a dependence between long-run agreement and whether evidence is solid or soft.
Learning is something we are all very familiar with. As children we learn to recognize faces, to walk, to speak, to climb trees and ride bikes, and so many other things that it would be a hopeless task to continue the list. Later we learn how to read and write; we learn arithmetic, calculus, and foreign languages; we learn how to cook spaghetti, how to drive a car, or what's the best response to telemarketing calls. Even as adults, when many of our beliefs have become entrenched and our behaviors often are habitual, there are new alternatives to explore if we wish to do so; and sometimes we even may revise long-held beliefs or change our conduct based on something we have learned.
So learning is a very important part of our lives. But it is not restricted to humans, assuming we understand it sufficiently broadly. Animals learn when they adjust their behavior to external stimuli. Even plants and very simple forms of life like bacteria can be said to “learn” in the sense of responding to information from their environment, as do some of the machines and computer programs created by us; search engines learn a lot about you from your search history (leading to the funky marketing idea that the underlying algorithms know more about you than you do yourself).
Thus, learning covers a wide variety of phenomena that share a particular pattern: some old state of an individual (what you believe, how you act, etc.) is altered in response to new information. This general description encompasses many distinct ways of learning, but it is too broad to characterize learning events. There are all kinds of epistemically irrelevant or even harmful factors that can have an influence on how an individual's state is altered. In order to better understand learning events and what sets them apart from other kinds of events, this book uses abstract models of learning, that is, precise mathematical representations of learning protocols. Abstract models of learning are studied in many fields, such as decision and game theory, mathematical psychology, and computer science. I will explore some learning models that I take to be especially interesting.
The work presented here develops a comprehensive probabilistic approach to learning from experience. The central question I try to answer is: “What is a correct response to some new piece of information?” This question calls for an evaluative analysis of learning which tells us whether, or when, a learning procedure is rational. At its core, this book embraces a Bayesian approach to rational learning, which is prominent in economics, philosophy of science, statistics, and epistemology. Bayesian rational learning rests on two pillars: consistency and symmetry. Consistency requires that beliefs are probabilities and that new information is incorporated consistently into one's old beliefs. Symmetry leads to tractable models of how to update probabilities. I will endorse this approach to rational learning, but my main objective is to extend it to models of learning that seem to fall outside the Bayesian purview – in particular, to models of so-called “bounded rationality.” While these models may often not be reconciled with Bayesian decision theory (maximization of expected utility), I hope to show that they are governed by consistency and symmetry; as it turns out, many bounded learning models can be derived from first principles in the same way as Bayesian learning models.
This project is a continuation of Richard Jeffrey's epistemological program of radical probabilism. Radical probabilism holds that a proper Bayesian epistemology should be broad enough to encompass many different forms of learning from experience besides conditioning on factual evidence, the standard form of Bayesian updating. The fact that boundedly rational learning can be treated in a Bayesian manner, by using consistency and symmetry, allows us to bring it under the umbrella of radical probabilism; in a sense, a broadly conceived Bayesian approach provides us with “the one ring to rule them all” (copyright Jeff Barrett). As a consequence, the difference between high rationality models and bounded rationality models of learning is not as large as it is sometimes thought to be; rather than residing in the core principles of rational learning, it originates in the type of information used for updating.