Using cognitive models to combine probability estimates

Published online by Cambridge University Press:  01 January 2023

Michael D. Lee*
Affiliation:
Department of Cognitive Sciences, 3151 SSPA Zot 5100, University of California Irvine, Irvine CA, USA, 92697
Irina Danileiko*
Affiliation:
Department of Cognitive Sciences, 3151 SSPA Zot 5100, University of California Irvine, Irvine CA, USA, 92697

Abstract

We demonstrate the usefulness of cognitive models for combining human estimates of probabilities in two experiments. The first experiment involves people’s estimates of probabilities for general knowledge questions such as “What percentage of the world’s population speaks English as a first language?” The second experiment involves people’s estimates of probabilities in football (soccer) games, such as “What is the probability a team leading 1–0 at half time will win the game?”, with ground truths based on analysis of a large corpus of games played in the past decade. In both experiments, we collect people’s probability estimates and develop a cognitive model of the estimation process, including assumptions about the calibration of probabilities and individual differences. We show that the cognitive model approach outperforms standard statistical aggregation methods like the mean and the median for both experiments and, unlike most previous related work, is able to make good predictions in a fully unsupervised setting. We also show that the parameters inferred as part of the cognitive modeling, involving calibration and expertise, provide useful measures of the cognitive characteristics of individuals. We argue that the cognitive approach has the advantage of aggregating over latent human knowledge rather than observed estimates, and emphasize that it can be applied in predictive settings where answers are not yet available.

Information

Type
Research Article
Creative Commons
The authors license this article under the terms of the Creative Commons Attribution 3.0 License.
Copyright
Copyright © The Authors [2014]. This is an Open Access article, distributed under the terms of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Table 1: The 40 general knowledge questions and their answers.

Figure 1: Analyses of the football game environment, based on 6072 first-division games played in domestic leagues between 2001 and 2011.

Table 2: The 40 football estimation questions and their empirical answers.

Figure 2: Histograms of stick people showing the distribution of performance, measured as the mean absolute difference between estimates and true probabilities, for all participants in both the general knowledge (upper) and football (lower) experiments. The inset panels show, for each experiment, the relationship between the estimates and the answers for the best- and worst-performing participants.

Figure 3: The theoretical framework for our cognitive model of probability estimation. The ith probability is assumed to have a latent truth πi that is subjected to calibration and expertise processes in producing an observed estimate. Calibration operates according to a non-linear function that maps true to perceived probabilities, such that small probabilities are over-estimated and large probabilities are under-estimated. Expertise controls how precisely a perceived probability is reported, through the standard deviation of the Gaussian distribution from which the behavioral estimate is sampled. Both the calibration and expertise processes are controlled by participant-specific parameters that allow for individual differences.

Figure 4: Graphical model for behavioral estimates of probabilities made by a number of participants for a number of questions. The latent true probability πi for the ith question is calibrated according to a parameter δj for the jth participant to become the value ψij. This calibrated value then produces an observed estimate pij according to the expertise σj of the participant.
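
The generative process described in this caption can be illustrated with a short simulation. This is only a sketch under stated assumptions, not the authors' implementation: the caption does not give the form of the calibration function, so the code assumes a linear-in-log-odds mapping in which δj scales the log-odds of πi, and it assumes estimates are clipped to the interval [0, 1]. The function names `calibrate` and `simulate_estimates` are hypothetical.

```python
import numpy as np


def calibrate(pi, delta):
    """Map true probabilities to perceived ones.

    ASSUMPTION: a linear-in-log-odds form, psi = logistic(delta * logit(pi)).
    With 0 < delta < 1, small probabilities are over-estimated and large
    probabilities are under-estimated; delta = 1 leaves them unchanged.
    """
    log_odds = np.log(pi / (1.0 - pi))
    return 1.0 / (1.0 + np.exp(-delta * log_odds))


def simulate_estimates(pi, delta, sigma, rng):
    """Simulate estimates p[i, j] for question i and participant j.

    pi    : latent true probabilities, shape (n_questions,)
    delta : per-participant calibration parameters, shape (n_participants,)
    sigma : per-participant expertise (Gaussian std), shape (n_participants,)
    """
    pi = np.asarray(pi, dtype=float)
    delta = np.asarray(delta, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Calibrated value psi[i, j] for every question/participant pair.
    psi = calibrate(pi[:, None], delta[None, :])
    # Gaussian reporting noise whose spread is set by each participant's sigma.
    noise = rng.normal(0.0, 1.0, size=psi.shape) * sigma[None, :]
    # ASSUMPTION: reported estimates are clipped to valid probabilities.
    return np.clip(psi + noise, 0.0, 1.0)


rng = np.random.default_rng(0)
estimates = simulate_estimates(
    pi=[0.1, 0.5, 0.9], delta=[0.5, 1.0], sigma=[0.05, 0.2], rng=rng
)
```

In this sketch a smaller σj corresponds to greater expertise, since the participant's reports stay closer to their calibrated value ψij.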

Figure 5: The performance of three cognitive models and two statistical methods in estimating probabilities for the general knowledge questions, and the relationship of their levels of performance to individual participants. The cognitive models assume calibration and expertise (“Calibrate+Expertise”), just calibration (“Calibrate”) or just expertise (“Expertise”). The statistical methods are the median and the mean of individual responses for each question. The top panels show the relationship between true and estimated answers for all 40 questions for each method. The bottom panel shows the distribution of individual performance as stick figures and the levels of model performance as broken lines. The performance of the models and individuals is measured as mean absolute difference from true answers.
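
The two statistical baselines and the performance measure named in this caption can be sketched as follows. The numbers are made up purely for illustration and are not data from the experiments; the function names `aggregate` and `mean_absolute_difference` are hypothetical.

```python
import numpy as np


def aggregate(estimates, method="median"):
    """Combine estimates across participants for each question.

    estimates : array of shape (n_questions, n_participants)
    method    : "mean" or "median"
    """
    estimates = np.asarray(estimates, dtype=float)
    if method == "mean":
        return estimates.mean(axis=1)
    if method == "median":
        return np.median(estimates, axis=1)
    raise ValueError(f"unknown method: {method!r}")


def mean_absolute_difference(predicted, truth):
    """Mean absolute difference between predicted and true probabilities."""
    predicted = np.asarray(predicted, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.mean(np.abs(predicted - truth)))


# Illustrative (made-up) data: 2 questions, 3 participants.
truth = [0.2, 0.8]
estimates = [[0.1, 0.3, 0.8],
             [0.7, 0.9, 0.2]]

mean_error = mean_absolute_difference(aggregate(estimates, "mean"), truth)
median_error = mean_absolute_difference(aggregate(estimates, "median"), truth)
```

The same `mean_absolute_difference` function applies to a single participant's column of estimates, which is how individuals and aggregation methods can be placed on one scale.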

Figure 6: The performance of three cognitive models and two statistical methods in estimating probabilities for the football questions, and the relationship of their levels of performance to individual participants. The information is presented in the same format as for the general knowledge questions in Figure 5.

Figure 7: The expected posterior expertise σj and calibration δj parameters for each participant in the general knowledge (left) and football (right) experiments. For the general knowledge experiment, four participants are highlighted, and scatter plots of their estimates relative to the answers are shown in inset panels.

Figure 8: The relationship between model-based inferences of individual expertise, self-reported measures of expertise, and actual performance in estimating probabilities across individuals. The top two panels relate to the general knowledge questions, and show how the model-based expertise and self-reported expertise correlate with performance in estimating probabilities. The bottom three panels relate to the football questions, and show how the model-based expertise, self-reported expertise, and trivia question performance relate to performance in estimating probabilities. For each scatter plot, the Pearson correlation coefficient is also shown.

Supplementary material

Lee and Danileiko supplementary material 1 (File, 20.6 KB)
Lee and Danileiko supplementary material 2 (File, 20.8 KB)