
A model-based approach for the analysis of the calibration of probability judgments

Published online by Cambridge University Press: 01 January 2023

David V. Budescu*
Affiliation:
Department of Psychology, Fordham University, Dealy Hall, 411 East Fordham Road, Bronx, NY, 10458, USA
Timothy R. Johnson
Affiliation:
Department of Statistics, University of Idaho

Abstract

The calibration of probability or confidence judgments concerns the association between the judgments and some estimate of the correct probabilities of events. Researchers rely on estimates using relative frequencies computed by aggregating data over observations. We show that this approach creates conceptual problems, and may result in the confounding of explanatory variables or unstable estimates. To circumvent these problems we propose using probability estimates obtained from statistical models (specifically, mixed models for binary data) in the analysis of calibration. We illustrate this methodology by re-analyzing data from a published study and comparing the results from this approach to those based on relative frequencies. The model-based estimates avoided problems with confounding variables and provided more precise estimates, resulting in better inferences.
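As a rough illustration of the relative-frequency approach that the abstract criticizes, the sketch below groups trials by stated confidence, computes the observed hit rate in each group, and summarizes miscalibration with a weighted mean squared gap (one common definition of a calibration index, not necessarily the one used in the article). The data, function names, and variable names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def relative_frequency_calibration(confidences, outcomes):
    """Group trials by stated confidence; return {confidence: (n, hit_rate)}."""
    cells = defaultdict(list)
    for c, y in zip(confidences, outcomes):
        cells[c].append(y)
    return {c: (len(ys), sum(ys) / len(ys)) for c, ys in sorted(cells.items())}


def calibration_index(table):
    """Weighted mean squared gap between stated confidence and hit rate."""
    total = sum(n for n, _ in table.values())
    return sum(n * (c - p) ** 2 for c, (n, p) in table.items()) / total


# Hypothetical data: stated confidence and whether the event occurred (1) or not (0).
confidences = [0.5, 0.5, 0.5, 0.5, 0.7, 0.7, 0.7, 0.9, 0.9, 1.0]
outcomes = [1, 0, 1, 0, 1, 1, 0, 1, 1, 0]

table = relative_frequency_calibration(confidences, outcomes)
ci = calibration_index(table)

for c, (n, p) in table.items():
    print(f"confidence={c:.1f}  n={n}  relative frequency={p:.2f}")
print(f"calibration index={ci:.4f}")
```

In this made-up sample the cell at confidence 1.0 holds a single miss, so its relative frequency is 0 and it accounts for almost all of the index. Sparse cells of this kind are one way relative frequencies can yield unstable estimates; a model-based estimate would instead pool information across observations.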

Information

Type
Research Article
Creative Commons
The authors license this article under the terms of the Creative Commons Attribution 3.0 License.
Copyright
Copyright © The Authors [2011]. This is an Open Access article, distributed under the terms of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.

Figure 1: Estimated calibration curves from (a) model-based probability estimates and (b) relative frequencies for the first basketball study. Smooth curves in (a) are mean calibration curves, averaged over subjects. Open points are mean estimated conditional probabilities for each distance and confidence value. Closed points are mean estimated probabilities for each confidence value, averaged over distance.

Figure 2: Estimated calibration curves from (a) model-based probability estimates and (b) relative frequencies for the second basketball study. Smooth curves in (a) are mean calibration curves, averaged over subjects. Open points are mean estimated conditional probabilities for each distance, group, and confidence value. Closed points are mean estimated probabilities for each group and confidence value, averaged over distance.

Figure 3: Estimated subject-specific calibration curves for the first basketball study. The light grey curves represent the estimated curves for each of the 45 subjects, at each distance. The black curves are the mean calibration curve at each distance.

Figure 4: Mean and distribution of calibration indices (log scale) from model-based probability estimates for each group and distance for the second basketball study.

Figure 5: Mean confidence judgments for each side, distance, and estimated marginal probability for the first basketball study. The marginal probabilities have been rounded to the nearest tenth. The error bars represent 95% confidence intervals.

Figure 6: Mean confidence judgments for each group, distance, and estimated marginal probability for the second basketball study. The marginal probabilities have been rounded to the nearest tenth. The error bars represent 95% confidence intervals.

Figure 7: Mean and distribution of calibration measures from (a) the model-based probability estimates and (b) the raw data, for each side and distance, for the first basketball study.

Figure 8: Mean and distribution of calibration measures from (a) the model-based probability estimates and (b) the raw data, for each group and distance, for the second basketball study.