
4 - Test Validity in Cognitive Assessment

Published online by Cambridge University Press:  23 November 2009

Denny Borsboom
Assistant Professor of Psychology, University of Amsterdam
Gideon J. Mellenbergh
Professor of Psychology, University of Amsterdam
Jacqueline Leighton
University of Alberta
Mark Gierl
University of Alberta
Scientific theories can be viewed as attempts to explain phenomena by showing how they would arise if certain assumptions concerning the structure of the world were true. Such theories invariably refer to theoretical entities and attributes. Theoretical attributes include such things as electrical charge and distance in physics, inclusive fitness and selective pressure in biology, brain activity and anatomic structure in neuroscience, and intelligence and developmental stages in psychology. These attributes cannot be observed directly; instead, the researcher infers the positions of objects on the attribute from a set of observations.

To make such inferences, one needs to have an idea of how different observations map onto different positions on the attribute (which, after all, is not itself observable). This requires a measurement model. A measurement model explicates how the structure of theoretical attributes relates to the structure of observations. For instance, a measurement model for temperature may stipulate how the level of mercury in a thermometer is systematically related to temperature, and a measurement model for intelligence may specify how IQ scores are related to general intelligence.
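The thermometer example can be made concrete with a small sketch. It assumes, purely for illustration, that the observation (mercury height) is a linear function of the attribute (temperature); the class name, parameter names, and numeric values are hypothetical and do not come from the chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearMeasurementModel:
    """Hypothetical model: observation = intercept + slope * attribute."""

    intercept: float  # assumed observation when the attribute is zero
    slope: float      # assumed change in observation per unit of attribute

    def expected_observation(self, attribute: float) -> float:
        # Forward direction: position on the attribute -> observation.
        return self.intercept + self.slope * attribute

    def infer_attribute(self, observation: float) -> float:
        # Inverse direction: observation -> inferred position on the attribute.
        if self.slope == 0:
            raise ValueError("slope must be nonzero to invert the model")
        return (observation - self.intercept) / self.slope


# Illustrative numbers only: mercury height in mm, temperature in degrees C.
thermometer = LinearMeasurementModel(intercept=20.0, slope=1.5)
height_mm = thermometer.expected_observation(30.0)
inferred_temp = thermometer.infer_attribute(height_mm)
print(height_mm, inferred_temp)
```

The inference from an observed height back to a temperature is only as good as the assumed relation between the two, which is the role the measurement model plays in the argument.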

The reliance on a process of measurement and the associated measurement model usually involves a degree of uncertainty; the researcher assumes, but cannot know for sure, that a measurement procedure is appropriate in a given situation.

Cognitive Diagnostic Assessment for Education: Theory and Applications, pp. 85-116
Publisher: Cambridge University Press
Print publication year: 2007


