This is a book about predicting the future. It describes my attempt to master a small enough corner of the universe to glimpse the events of tomorrow, today. The degree to which one can do this in my tiny toy domain tells us something about our potential to foresee larger and more interesting futures.
Considered less prosaically, this is the story of my 25-year obsession with predicting the results of jai alai matches in order to bet on them successfully. As obsessions go, it probably does not rank with yearning for the love of one you will never have or questing for the freedom of an oppressed and downtrodden people. But it is my obsession – one that has led me down paths that were unimaginable at the beginning of the journey.
This book marks the successful completion of my long quest and gives me a chance to share what I have learned and experienced. I think the attentive reader will come to understand the worlds of mathematics, computers, gambling, and sports quite differently after reading this book.
My interest in jai alai began during my parents' annual escape from the cold of a New Jersey winter to the promised land of Florida. They stuffed the kids into a Ford station wagon and drove a thousand miles in 2 days each way. Florida held many attractions for a kid: the sun and the beach, Disney World, Grampa, Aunt Fanny, and Uncle Sam. But the biggest draw came to be the one night each trip when we went to a fronton, or jai alai stadium, and watched them play.
Mom was the biggest jai alai fan in the family and the real motivation behind our excursions. We loaded up the station wagon and drove to the Dania Jai-Alai fronton located midway between Miami and Fort Lauderdale. In the interests of preserving capital for later investment, my father carefully avoided the valet parking in favor of the do-it-yourself lot. We followed a trail of palm trees past the cashiers' windows into the fronton.
Walking into the fronton was an exciting experience. The playing court sat in a vast open space, three stories tall, surrounded by several tiers of stadium seating. To my eyes, at least, this was big-league, big-time sport. Particularly “cool” was the sign saying that no minors would be admitted without a parent. This was a very big deal when I was only 12 years old.
We followed the usher who led us to our seats. The first game had already started.
This book is the outgrowth of an effort to provide a course covering the general topic of uncertain inference. Philosophy students have long lacked an acceptable treatment of inductive logic; in fact, many professional philosophers would deny that there is any such thing and would replace it with a study of probability. Yet, to many there seems to be something more traditional than the shifting sands of subjective probabilities that is worth studying. Students of computer science may encounter a wide variety of ways of treating uncertainty and uncertain inference, ranging from nonmonotonic logic to probability to belief functions to fuzzy logic. All of these approaches are discussed in their own terms, but it is rare for their relations and interconnections to be explored. Cognitive science students learn early that the processes by which people make inferences are not quite like the formal logic processes they study in philosophy, but they often have little exposure to the variety of ideas developed in philosophy and computer science. Much of the uncertain inference of science is statistical inference, yet statistics rarely enters directly into the treatment of uncertainty to which any of these three groups of students is exposed.
At what level should such a course be taught? Because a broad and interdisciplinary understanding of uncertainty seemed to be just as lacking among graduate students as among undergraduates, and because without assuming some formal background all that could be accomplished would be rather superficial, the course was developed for upper-level undergraduates and beginning graduate students in these three disciplines. The original goal was to develop a course that would serve all of these groups.
In Chapter 3, we discussed the axioms of the probability calculus and derived some of its theorems. We never said, however, what “probability” meant. From a formal or mathematical point of view, there was no need to: we could state and prove facts about the relations among probabilities without knowing what a probability is, just as we can state and prove theorems about points and lines without knowing what they are. (As Bertrand Russell said [Russell, 1901, p. 83], “Mathematics may be defined as the subject where we never know what we are talking about, nor whether what we are saying is true.”)
Nevertheless, because our goal is to make use of the notion of probability in understanding uncertain inference and induction, we must be explicit about its interpretation. There are several reasons for this. In the first place, if we are hoping to follow the injunction to believe what is probable, we have to know what is probable. There is no hope of assigning values to probabilities unless we have some idea of what probability means. What determines those values? Second, we need to know what the import of probability is for us. How is it supposed to bear on our epistemic states or our decisions? Third, what is the domain of the probability function? In the last chapter we took the domain to be a field, but that merely assigns structure to the domain: it doesn't tell us what the domain objects are.
There is no generally accepted interpretation of probability.
We have abandoned many of the goals of the early writers on induction. Probability has told us nothing about how to find interesting generalizations and theories, and, although Carnap and others had hoped otherwise, it has told us nothing about how to measure the support for generalizations other than approximate statistical hypotheses. Much of uncertain inference has yet to be characterized in the terms we have used for statistical inference. Let us take a look at where we have arrived so far.
Objectivity
Our overriding concern has been with objectivity. We have looked on logic as a standard of rational argument: Given evidence (premises), the validity (degree of entailment) of a conclusion should be determined on logical grounds alone. Given that the Hawks will win or the Tigers will win, and that the Tigers will not win, it follows that the Hawks will win. Given that 10% of a large sample of trout from Lake Seneca have shown traces of mercury, and that we have no grounds for impugning the fairness of the sample, it follows with a high degree of validity that between 8% and 12% of the trout in the lake contain traces of mercury.
The parallel is stretched only at the point where we include among the premises “no grounds for impugning …”. It is this that is unpacked into a claim about our whole body of knowledge, and embodied in the constraints discussed in the last three chapters under the heading of “sharpening.”
The system described in this book retrieves and analyzes data each night and employs a substantial amount of computational sophistication to determine the most profitable bets to make. It isn't something you are going to try at home, kiddies.
However, in this section I'll provide some hints on how you can make your trip to the fronton as profitable as possible. By combining the results of our Monte Carlo simulations and expected payoff model, I've constructed tables giving the expected payoff for each bet, under the assumption that all players are equally skillful. This is very useful information to have if you are not equipped to make your own judgments as to who is the best player, although we also provide tips as to how to assess player skills. By following my advice, you will avoid criminally stupid bets like the 6–8–7 trifecta.
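The flavor of such a Monte Carlo simulation can be conveyed in a few lines. The sketch below assumes simplified "Spectacular Seven"-style scoring (eight post positions, winner stays on court, loser goes to the back of the queue, first to seven points wins); the exact doubling rule is an assumption of this sketch (here, points count double from the eighth point played). With all players equally skilled, each point is a fair coin flip, yet the win frequencies that come out are far from uniform across post positions, which is exactly what makes the equal-skill payoff tables informative.

```python
import random
from collections import Counter

def play_game(rng):
    """One game under simplified 'Spectacular Seven' rules, with all
    eight players assumed equally skilled (each point is a coin flip).
    NOTE: the doubling rule used here (points count double from the
    eighth point on) is an assumption of this sketch, not gospel."""
    queue = list(range(1, 9))          # post positions 1..8
    scores = dict.fromkeys(queue, 0)
    a = queue.pop(0)                   # positions 1 and 2 start on court
    b = queue.pop(0)
    point_number = 0
    while True:
        point_number += 1
        winner, loser = (a, b) if rng.random() < 0.5 else (b, a)
        scores[winner] += 1 if point_number <= 7 else 2
        if scores[winner] >= 7:
            return winner
        queue.append(loser)            # loser rejoins the back of the queue
        a, b = winner, queue.pop(0)    # winner stays, next player comes on

rng = random.Random(7)
N = 20000
wins = Counter(play_game(rng) for _ in range(N))
for post in sorted(wins):
    print(post, wins[post] / N)
```

Tabulating these frequencies against the track's payoffs for each bet type is, in miniature, how the equal-skill expected-payoff tables are built.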
But first a word of caution is in order. There are three primary types of gamblers:
Those who gamble to make money – If you are in this category, you are likely a sick individual and need help. My recommendation instead would be that you take your money and invest in a good mutual fund. In particular, the Vanguard Primecap fund has done right well for me over the past few years.
One theme running through this book is how hard we had to work in order to make even a small profit. As the saying goes, “gambling is a hard way to make easy money.”
We are now in a position to reap the benefits of the formal work of the preceding two chapters. The key to uncertain inference lies, as we have suspected all along, in probability. In Chapter 9, we examined a certain formal interpretation of probability, dubbed evidential probability, as embodying a notion of partial proof. Probability, on this view, is an interval-valued function. Its domain is a combination of elementary evidence and general background knowledge paired with a statement of our language whose probability concerns us, and its range is the set of subintervals of [0, 1]. It is objective. What this means is that if two agents share the same evidence and the same background knowledge, they will assign the same (interval) probabilities to the statements of their language. If they share an acceptance level 1 – α for practical certainty, they will accept the same practical certainties.
It may be that no two people share the same background knowledge and the same evidence. But in many situations we come close. As scientists, we tend to share each other's data. Cooked data is sufficient to cause expulsion from the ranks of scientists. (This is not the same as data containing mistakes; one of the virtues of the system developed here is that no data need be regarded as sacrosanct.) With regard to background knowledge, if we disagree, we can examine the evidence at a higher level: is the item in question highly probable, given that evidence and our common background knowledge at that level?
There are a number of epistemological questions raised by this approach, and some of them will be dealt with in Chapter 12.
Mathematical modeling is a subject best appreciated by doing. The trick is finding an interesting type of prediction to make or question to study, and then identifying sufficient data to build a reasonable model upon. Even if you are not a computer programmer, spread-sheet programs such as Microsoft Excel can provide an excellent environment in which to experiment with mathematical models.
In this section, I pose several interesting questions to which the modeling techniques presented in this book may be applicable. To provide starting points, I include links to existing studies and data sets on the WWW. Web links are extremely perishable, so treat these only as an introduction. Any good search engine like www.google.com should help you find better sources after a few minutes' toil. Happy modeling!
Gambling
Lottery numbers – How random are lottery numbers? Do certain numbers in certain states come up more often than would be expected by chance? Can you predict which lotto combinations are typically underbet, meaning that they minimize the likelihood that you must share the pot with someone else if you win? How large must the pool grow in a given progressive lottery to yield a positive expected value for each ticket bought?
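The last question yields to a back-of-the-envelope model. The sketch below makes simplifying assumptions not in the original text: a jackpot-only k-of-n lotto, and rival winners distributed Poisson with mean equal to tickets sold times the hit probability, so the expected share of the pot is E[1/(1+K)] = (1 − e^−λ)/λ.

```python
import math

def ticket_ev(jackpot, tickets_sold, n=49, k=6, price=1.0):
    """Expected value of one ticket in a jackpot-only k-of-n lotto.
    Assumptions (mine, for illustration): other winners are Poisson
    with mean lam = tickets_sold * p, so the expected share of the
    pot for a winning ticket is E[1/(1+K)] = (1 - exp(-lam)) / lam."""
    p = 1.0 / math.comb(n, k)          # chance this ticket hits the jackpot
    lam = tickets_sold * p             # expected number of rival winners
    share = (1.0 - math.exp(-lam)) / lam if lam > 0 else 1.0
    return jackpot * p * share - price

# A $10M jackpot with a million tickets sold is a losing proposition;
# the pool must roll over well past the ~$14M break-even point first.
print(ticket_ev(10_000_000, 1_000_000))
print(ticket_ev(50_000_000, 1_000_000))
```

Plugging in real jackpot histories from the sources above lets you check how often a given state lottery ever crosses the break-even line.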
Plenty of lottery records are available on the WWW if you look hard enough. Log on to http://www.lottonet.com/ for several years' historical data from several state lotteries. Minnesota does a particularly good job, making its historical numbers available at http://www.lottery.state.mn.us.
Horse racing – Many of the ideas employed in our jai alai system are directly applicable to horse racing. […]
As Carnap points out [Carnap, 1950], some of the controversy concerning the support of empirical hypotheses by data is a result of the conflation of two distinct notions. One is the total support given a hypothesis by a body of evidence. Carnap's initial measure for this is his c*; this is intended as an explication of one sense of the ordinary language word “probability.” This is the sense involved when we say, “Relative to the evidence we have, the probability is high that rabies is caused by a virus.” The other notion is that of “support” in the active sense, in which we say that a certain piece of evidence supports a hypothesis, as in “The detectable presence of antibodies supports the viral hypothesis.” This does not mean that that single piece of evidence makes the hypothesis “highly probable” (much less “acceptable”), but that it makes the hypothesis more probable than it was. Thus, the presence of water on Mars supports the hypothesis that there was once life on Mars, but it does not make that hypothesis highly probable, or even more probable than not.
Whereas c*(h, e) is (for Carnap, in 1950) the correct measure of the degree of support of the hypothesis h by the evidence e, the increase of the support of h due to e given background knowledge b is the amount by which e increases the probability of h: c*(h, b ∧ e) – c*(h, b). We would say that e supports h relative to background b if this quantity is positive, and undermines h relative to b if this quantity is negative.
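The distinction can be made concrete with a toy calculation. Carnap's c* is not an ordinary conditional probability, but for illustration we can stand it in with Bayes' rule on made-up numbers: the evidence raises the probability of the hypothesis (so it "supports" it in the active sense) while leaving that probability far from high.

```python
def posterior(prior, like_h, like_not_h):
    # Bayes' rule: P(h | e) from P(h), P(e | h), and P(e | not-h).
    return prior * like_h / (prior * like_h + (1 - prior) * like_not_h)

# Illustrative (made-up) numbers: h = the viral hypothesis,
# e = detectable presence of antibodies.
p_h_given_b  = 0.10                        # stands in for c*(h, b)
p_h_given_be = posterior(0.10, 0.9, 0.3)   # stands in for c*(h, b ∧ e)
increase = p_h_given_be - p_h_given_b      # > 0: e supports h relative to b
print(p_h_given_be, increase)              # 0.25, 0.15
```

Here e triples the odds-relevant likelihood yet the hypothesis ends up at only 0.25: supported, but still less probable than not.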
Traditionally, logic has been regarded as the science of correct thinking or of making valid inferences. The former characterization of logic has strong psychological overtones—thinking is a psychological phenomenon—and few writers today think that logic can be a discipline that can successfully teach its students how to think, let alone how to think correctly. Furthermore, it is not obvious what “correct” thinking is. One can think “politically correct” thoughts without engaging in logic at all. We shall, at least for the moment, be well advised to leave psychology to one side, and focus on the latter characterization of logic: the science of making valid inferences.
To make an inference is to perform an act: It is to do something. But logic is not a compendium of exhortations: From “All men are mortal” and “Socrates is a man” do thou infer that Socrates is mortal! To see that this cannot be the case, note that “All men are mortal” has the implication that if Charles is a man, he is mortal, if John is a man, he is mortal, and so on, through the whole list of men, past and present, if not future. Furthermore, it is an implication of “All men are mortal” that if Fido (my dog) is a man, Fido is mortal; if Tabby is a man, Tabby is mortal, etc. And how about inferring “If Jane is a man, Jane is mortal”? As we ordinarily construe the premise, this, too, is a valid inference. We cannot follow the exhortation to perform all valid inferences: There are too many, they are too boring, and that, surely, is not what logic is about.
We form beliefs about the world from evidence and from inferences made from that evidence. Belief, as opposed to knowledge, consists of defeasible information. Belief is what we think is true, and it may or may not be true in the world. Knowledge, on the other hand, is what we are aware of as true, and it is always true in the world.
We make decisions and act according to our beliefs, yet they are not infallible. The inferences we base our beliefs on can be deductive or uncertain, employing any number of inference mechanisms to arrive at our conclusions, for instance, statistical, nonmonotonic, or analogical. We constantly have to modify our set of beliefs as we encounter new information. A new piece of evidence may complement our current beliefs, in which case we can hold on to our original beliefs in addition to this new evidence. However, because some of our beliefs can be derived from uncertain inference mechanisms, it is inevitable that we will at some point encounter some evidence that contradicts what we currently believe. We need a systematic way of reorganizing our beliefs, to deal with the dynamics of maintaining a reasonable belief set in the face of such changes.
The state of our beliefs can be modeled by a logical theory K, a deductively closed set of formulas. If a formula φ is considered accepted in a belief set, it is included in the corresponding theory K; if it is rejected, its negation ¬φ is in K. In general the theory is incomplete.
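The dynamics described above can be sketched in a toy model. The sketch below is mine, not the book's formalism: beliefs are bare propositional literals rather than a deductively closed theory, and "revision" only removes the direct contradiction, whereas a full AGM-style revision must also retract beliefs that jointly imply the negation of the incoming formula.

```python
def negate(phi):
    # Toy negation for propositional literals written "p" / "~p".
    return phi[1:] if phi.startswith("~") else "~" + phi

def expand(K, phi):
    # Expansion K + phi: add the new formula without restoring consistency.
    return K | {phi}

def revise(K, phi):
    # Naive revision: discard the direct contradiction, then add phi.
    # (A real belief set K is deductively closed, and revision must also
    # handle beliefs that only *jointly* contradict phi.)
    return (K - {negate(phi)}) | {phi}

K = {"p", "q"}
K = revise(K, "~p")    # new evidence contradicts p; q survives
print(K)
```

Even this crude version shows the key asymmetry: expansion may leave the set inconsistent, while revision sacrifices an old belief to keep the set usable.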
Every morning at 2 A.M., as professors sleep and graduate students arrive to pull all-nighters, my computer diligently makes the rounds of the websites of all major frontons, downloading the latest schedules and results and then running these files through Dario's parsing programs. After a few months of retrieval we had built a large enough collection of jai alai data to justify some serious analysis. Our goal was to use all this data to measure the relative abilities of jai alai players and incorporate this information into our Monte Carlo simulation to make customized predictions for each match.
To get this job done, I had to bring another student on to the project, Meena Nagarajan. Meena was a different type of student than Dario. As a married woman with a young child, she realized that there are other things to life besides computers. She was returning to school to get her master's degree with the express goal of getting a lucrative job with a financial services company associated with Wall Street, as indeed she ultimately did. She realized that building a program-trading system for jai alai was a great way to learn how to build one for trading stocks, and she therefore signed on to work on the project.
Her undergraduate degree back in India was in applied mathematics; thus, she brought to the table an understanding of the meaning and limitations of statistics.
The idea behind evidential probability is a simple one. It consists of two parts: that probabilities should reflect empirical frequencies in the world, and that the probabilities that interest us—the probabilities of specific events—should be determined by everything we know about those events.
The first suggestions along these lines were made by Reichenbach [Reichenbach, 1949]. With regard to probability, Reichenbach was a strict limiting-frequentist: he took probability statements to be statements about the world, and to be statements about the frequency of one kind of event in a sequence of other events. But recognizing that what concerns us in real life is often decisions that bear on specific events—the next roll of the die, the occurrence of a storm tomorrow, the frequency of rain next month—he devised another concept that applied to particular events, that of weight. “We write P(a) = p thus admitting individual propositions inside the probability functor. The number p measures the weight of the individual proposition a. It is understood that the weight of the proposition was determined by means of a suitable reference class, …” [Reichenbach, 1949, p. 409]. Reichenbach appreciated the problem of the reference class: “… we may have reliable statistics concerning a reference class A and likewise reliable statistics for a reference class C, whereas we have insufficient statistics for the reference class A·C. The calculus of probability cannot help in such a case because the probabilities P(A, B) and P(C, B) do not determine the probability P(A · C, B)” [Reichenbach, 1949, p. 375]. The best the logician can do is to recommend gathering more data.
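Reichenbach's recommendation, choosing the most specific reference class for which reliable statistics exist, can be rendered as a small procedure. The representation below (property sets mapped to frequencies) is my own illustration, not Reichenbach's notation, and the arbitrary tie-breaking is the reference-class problem in miniature.

```python
def assign_weight(stats, properties):
    """Weight of an individual proposition: use the most specific
    reference class (the largest set of the individual's known
    properties) for which we actually have statistics.
    stats: {frozenset_of_properties: frequency}; returns None when
    no reference class applies -- gather more data, says the logician."""
    applicable = [(len(cls), freq) for cls, freq in stats.items()
                  if cls <= properties]
    if not applicable:
        return None
    best = max(applicable)     # ties between equally specific classes
    return best[1]             # are broken arbitrarily here

# Made-up frequencies echoing the Lake Seneca example:
stats = {frozenset({"trout"}): 0.10,
         frozenset({"trout", "lake_seneca"}): 0.12}
w = assign_weight(stats, {"trout", "lake_seneca", "large"})
print(w)   # 0.12: the narrower class A·C trumps the broader class A
```

When we have statistics for A and for C but not for A·C, no such maximum is distinguished, which is precisely the case Reichenbach says the calculus of probability cannot resolve.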