
The Bit Scale: A Metric Score Scale for Unidimensional Item Response Theory Models

Published online by Cambridge University Press:  10 December 2025

Joakim Wallmark*
Affiliation:
Department of Statistics, USBE, Umeå University, Sweden
Marie Wiberg
Affiliation:
Department of Statistics, USBE, Umeå University, Sweden
*
Corresponding author: Joakim Wallmark; Email: joakim.wallmark@umu.se

Abstract

In item response theory (IRT), the conventional latent trait scale ($\theta $) is inherently arbitrary, lacking a fixed unit or origin and often tied to specific population distributional assumptions (e.g., standard normal). This limits the direct comparability and interpretability of scores across different tests, populations, or model estimation methods. This article introduces the “bit scale,” a novel metric transformation for unidimensional IRT scores derived from fundamental principles of information theory, specifically surprisal and entropy. Bit scores are anchored to the properties of the test items rather than the test-taker population. This item-based anchoring ensures the scale’s invariance to population assumptions and provides a consistent metric for comparing latent trait levels. We illustrate the utility of the bit scale through empirical examples: demonstrating consistent scoring when fitting models with different $\theta $ scale assumptions, and using anchor items to directly link scores from different test administrations. A simulation study confirms the desirable statistical properties (low bias and accurate standard errors) of maximum likelihood (ML)-estimated bit scores and their robustness to extreme scores. The bit scale offers a theoretically grounded, interpretable, and comparable metric for reporting and analyzing IRT-based assessment results. Software implementations in R (bitscale) and Python (IRTorch) are available, and practical implications are discussed.

Information

Type
Theory and Methods
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Psychometric Society

Figure 1 Item response functions, $\theta $ distributions, and test information curves before and after sigmoid transformation of the $\theta $ scale. Note: In the upper plots, a and b refer to the slope and difficulty parameters of the 2PL model, as later described in Equation (1).
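The 2PL item response function underlying Figure 1 can be sketched as follows. This is a minimal illustration of the standard 2PL form and a sigmoid rescaling of the $\theta $ axis; the parameter values (a = 1.2, b = 0.5) and the grid are illustrative choices, not values from the article.

```python
import numpy as np

def irf_2pl(theta, a, b):
    """2PL item response function: P(correct response | theta)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Evaluate the IRF on a theta grid (illustrative parameters)
theta = np.linspace(-4.0, 4.0, 201)
p = irf_2pl(theta, a=1.2, b=0.5)

# Sigmoid transformation of the theta scale itself, mapping
# (-inf, inf) onto (0, 1), as shown in the lower panels of Figure 1
theta_star = 1.0 / (1.0 + np.exp(-theta))
```

At $\theta = b$ the response probability is exactly 0.5, which is the usual interpretation of the difficulty parameter in the 2PL model.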


Figure 2 Example IRF (left), its corresponding entropy curve (middle), and the derivative of the entropy curve (right). The vertical arrows in the entropy plot show the distances summed to compute the item bit score for a test taker with an estimated $\theta $ of 1.5.
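The computation depicted in Figure 2 can be sketched numerically. This is a rough illustration, assuming the item bit score accumulates the absolute changes in item entropy along a $\theta $ grid up to the estimated $\theta $; the grid bounds, lower reference point, and item parameters are assumptions for the sketch, not the article's formal definitions.

```python
import numpy as np

def irf_2pl(theta, a, b):
    """2PL item response function: P(correct response | theta)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def binary_entropy(p):
    """Entropy of a Bernoulli response in bits."""
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))

# Theta grid from an (assumed) lower reference point up to the
# estimated theta of 1.5 used in the Figure 2 example
grid = np.linspace(-6.0, 1.5, 2001)
p = irf_2pl(grid, a=1.0, b=0.0)   # illustrative item parameters
H = binary_entropy(p)

# Sum the absolute entropy differences between adjacent grid points,
# mirroring the vertical distances added up in the middle panel
item_bit_score = np.abs(np.diff(H)).sum()
```

Because entropy peaks where the response probability is 0.5 (at $\theta = b$) and declines on either side, summing absolute differences rather than signed differences keeps the accumulated distance increasing past the item's difficulty.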


Table 1 Summary statistics for each test form


Figure 3 $\theta $ and bit score distributions for GPC models fitted with and without constraints on the $\theta $ scale.


Figure 4 Fisher information for GPC models fitted with and without constraints on the $\theta $ scale. Information is represented on both the $\theta $ and the bit scales.


Figure 5 Latent score distributions.


Figure 6 $\theta $ score transformation using bit scores computed from anchor items, contrasted with the mean–mean linking method.


Figure 7 Bias, RMSE, and SE across the bit and $\theta $ scales. Note that the SE and RMSE curves nearly overlap because the bias is small in most settings.


Figure 8 Simulation evaluation of theoretical SEs for ML-estimated bit and $\theta $ scores.