
Varying the Valuating Function and the Presentable Bank in Computerized Adaptive Testing

Published online by Cambridge University Press:  10 January 2013

Juan Ramón Barrada* (Universidad Autónoma de Barcelona, Spain)
Francisco José Abad (Universidad Autónoma de Madrid, Spain)
Julio Olea (Universidad Autónoma de Barcelona, Spain)

*Correspondence concerning this article should be addressed to Juan Ramón Barrada, Facultad de Psicología, Universidad Autónoma de Barcelona, 08193 Bellaterra, Barcelona (Spain). Phone: +34-935813263. E-mail: juanramon.barrada@uab.es

Abstract

In computerized adaptive testing, the most commonly used valuating function is the Fisher information function. When the goal is to keep item bank security at a maximum, the valuating function that seems most convenient is the matching criterion, which values items by the distance between the estimated trait level and the point where an item's information function reaches its maximum. Recently, it has been proposed not to keep the same valuating function constant for all the items in the test. In this study we expand the idea of combining the matching criterion with the Fisher information function. We also manipulate the number of strata into which the bank is divided. We find that varying the number of items administered with each function makes it possible to move from the pole of high accuracy and low security to the opposite pole. Selecting several items with the matching criterion greatly improves item bank security with only small losses in accuracy. In general, it seems more appropriate not to stratify the bank.

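The two valuating functions contrasted in the abstract can be made concrete with a short sketch. The code below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a three-parameter logistic (3PL) item bank held in NumPy arrays a, b and c, and the function names (prob_3pl, fisher_information, matching_criterion, select_next_item), the simulated bank, and the administration loop are hypothetical. The Fisher information rule picks the item most informative at the current trait estimate; the matching criterion picks the item whose point of maximum information lies closest to that estimate; the combined rule administers the first k items with the matching criterion and the remainder with Fisher information.

# Minimal sketch (assumed names, not the authors' code) of the two valuating
# functions for item selection in a CAT with a 3PL item bank.
import numpy as np

def prob_3pl(theta, a, b, c):
    # 3PL probability of a correct response at trait level theta.
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

def fisher_information(theta, a, b, c):
    # Fisher information of every item at theta (standard 3PL expression).
    p = prob_3pl(theta, a, b, c)
    return a ** 2 * ((p - c) ** 2 / (1.0 - c) ** 2) * ((1.0 - p) / p)

def matching_criterion(theta, a, b, c):
    # Distance from theta to the point where each item's information peaks;
    # for the 3PL that point is b + ln((1 + sqrt(1 + 8c)) / 2) / a.
    theta_peak = b + np.log((1.0 + np.sqrt(1.0 + 8.0 * c)) / 2.0) / a
    return np.abs(theta - theta_peak)

def select_next_item(theta_hat, a, b, c, administered, use_matching):
    # Pick the best not-yet-administered item under the chosen criterion.
    if use_matching:
        value = -matching_criterion(theta_hat, a, b, c)  # smaller distance is better
    else:
        value = fisher_information(theta_hat, a, b, c)   # larger information is better
    value = np.where(administered, -np.inf, value)       # mask already-used items
    return int(np.argmax(value))

# Combined rule: the first k items with the matching criterion, the rest with
# Fisher information (bank parameters here are simulated for illustration).
rng = np.random.default_rng(0)
a = rng.uniform(0.5, 2.0, 100)
b = rng.normal(0.0, 1.0, 100)
c = rng.uniform(0.0, 0.25, 100)
administered = np.zeros(100, dtype=bool)
theta_hat, k, test_length = 0.0, 5, 20
for position in range(test_length):
    item = select_next_item(theta_hat, a, b, c, administered, use_matching=position < k)
    administered[item] = True
    # ... administer the item and re-estimate theta_hat here ...

In this sketch, increasing k shifts the selection rule away from accuracy and toward item bank security, which is the trade-off manipulated in the study.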

Type: Research Article
Copyright: © Cambridge University Press 2011

