Variable-Length Stopping Rules for Multidimensional Computerized Adaptive Testing

Published online by Cambridge University Press: 01 January 2025

Chun Wang*, University of Washington
David J. Weiss, University of Minnesota
Zhuoran Shang, University of Minnesota

* Correspondence should be made to Chun Wang, Measurement and Statistics, College of Education, University of Washington, 312E Miller Hall, Box 353600, Seattle, WA 98195-3600, USA. Email: wang4066@uw.edu

Abstract

In computerized adaptive testing (CAT), a variable-length stopping rule ends item administration once a pre-specified measurement precision standard has been satisfied. The goal is to provide equal measurement precision for all examinees regardless of their true latent trait level. Several stopping rules have been proposed for unidimensional CAT, such as the minimum information rule and the maximum standard error rule. These rules have also been extended to multidimensional CAT and cognitive diagnostic CAT, and they all share the same idea of monitoring measurement error. Recently, Babcock and Weiss (J Comput Adapt Test 2012. https://doi.org/10.7333/1212-0101001) proposed an “absolute change in theta” (CT) rule, which is useful when an item bank has been exhausted of good items for one or more ranges of the trait continuum. Choi, Grady and Dodd (Educ Psychol Meas 70:1–17, 2010) also argued that a CAT should stop when the standard error no longer changes, which implies that the item bank is likely exhausted. Although these stopping rules have been evaluated and compared in different simulation studies, the relationships among them remain unclear, and clear guidelines on when to use which rule are therefore lacking. This paper presents analytic results that show the connections among various stopping rules within both unidimensional and multidimensional CAT. In particular, it is argued that the CT-rule alone can be unstable and can end the test prematurely. However, the CT-rule can be a useful secondary rule for monitoring the point of diminishing returns. To provide further empirical evidence, three simulation studies are reported using both the 2PL model and the multidimensional graded response model.
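
As a rough illustration of how the stopping rules named in the abstract relate, the following is a minimal sketch for the unidimensional 2PL case. It is not the authors' implementation: the function names, the thresholds (`se_max`, `ct_min`), and the `min_items` guard are all hypothetical choices made for illustration. The sketch checks a standard error rule first and uses the change-in-theta (CT) rule only as a secondary check, in line with the abstract's argument that the CT-rule alone can end a test prematurely.

```python
import math

def item_information_2pl(theta, a, b):
    """Fisher information of a 2PL item (discrimination a, difficulty b) at theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def standard_error(theta, items):
    """Approximate SE of theta as 1 / sqrt(test information) over administered items."""
    info = sum(item_information_2pl(theta, a, b) for a, b in items)
    return float("inf") if info == 0 else 1.0 / math.sqrt(info)

def should_stop(theta_history, items, se_max=0.3, ct_min=0.02, min_items=5):
    """Return the name of the rule that fires ("SE" or "CT"), or None to continue.

    theta_history: provisional theta estimates, one per administered item.
    items: list of (a, b) parameters of the administered items.
    se_max, ct_min, min_items: illustrative values, not taken from the paper.
    """
    theta = theta_history[-1]
    # Primary rule: stop once the precision standard is met.
    if standard_error(theta, items) <= se_max:
        return "SE"
    # Secondary rule: stop when the estimate has stopped moving, but only
    # after a minimum number of items (an assumed guard against stopping early).
    if len(items) >= min_items and len(theta_history) >= 2:
        if abs(theta_history[-1] - theta_history[-2]) < ct_min:
            return "CT"
    return None
```

For example, with twelve items of discrimination 2.0 located at the current estimate, the test information is 12 and the SE falls below 0.3, so the SE rule fires. With six weakly informative items and an estimate that moved by only 0.01, the SE rule does not fire but the CT rule does, which is the "diminishing returns" situation the abstract describes.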

Information

Type
Original Paper
Copyright
Copyright © 2018 The Psychometric Society

Supplementary material

Wang et al. supplementary material 1 (File, 9 KB)
Wang et al. supplementary material 2 (File, 26.6 KB)