Let $F \colon \mathscr{C} \to \mathscr{E}$ be a functor from a category $\mathscr{C}$ to a homological (Borceux–Bourn) or semi-abelian (Janelidze–Márki–Tholen) category $\mathscr{E}$. We investigate conditions under which the homology of an object $X$ in $\mathscr{C}$ with coefficients in the functor $F$, defined via projective resolutions in $\mathscr{C}$, remains independent of the chosen resolution. Consequently, the left derived functors of $F$ can be constructed analogously to the classical abelian case.
Our approach extends the concept of chain homotopy to a non-additive setting using the technique of imaginary morphisms. Specifically, we utilize the approximate subtractions of Bourn–Janelidze, originally introduced in the context of subtractive categories. This method is applicable when $\mathscr {C}$ is a pointed regular category with finite coproducts and enough projectives, provided the class of projectives is closed under protosplit subobjects, a new condition introduced in this article and naturally satisfied in the abelian context. We further assume that the functor $F$ meets certain exactness conditions: for instance, it may be protoadditive and preserve proper morphisms and binary coproducts—conditions that amount to additivity when $\mathscr {C}$ and $\mathscr {E}$ are abelian categories.
Within this framework, we develop a basic theory of derived functors, compare it with the simplicial approach, and provide several examples.
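For orientation, the classical construction being generalized can be recalled (standard abelian-category material, not quoted from the article): the $n$-th left derived functor of $F$ at $X$ is
$$L_nF(X) \;=\; H_n\bigl(F(P_\bullet)\bigr),$$
where $\cdots \to P_1 \to P_0 \to X \to 0$ is any projective resolution of $X$ in $\mathscr{C}$; independence of the chosen resolution is exactly what the conditions in the article secure outside the abelian setting.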
Dichotomous IRT models can be viewed as families of stochastically ordered distributions of responses to test items. This paper explores several properties of such distributions. In particular, it is examined under what conditions stochastic order in families of conditional distributions is transferred to their inverse distributions, from two families of related distributions to a third family, or from multivariate conditional distributions to a marginal distribution. The main results are formulated as a series of theorems and corollaries which apply to dichotomous IRT models. One part of the results holds for unidimensional models with fixed item parameters. The other part holds for models with random item parameters as used, for example, in adaptive testing or for tests with multidimensional abilities.
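For reference, the stochastic order in question is the standard first-order one (stated here for concreteness): a response distribution $F$ is stochastically smaller than $G$ when
$$F(x) \;\ge\; G(x) \quad \text{for all } x,$$
so that $G$ puts more probability on large values; the theorems address when this ordering survives inversion of the distributions, transfer between related families, and marginalization.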
The case of adaptive testing under a multidimensional response model with large numbers of constraints on the content of the test is addressed. The items in the test are selected using a shadow test approach. The 0–1 linear programming model that assembles the shadow tests maximizes posterior expected Kullback-Leibler information in the test. The procedure is illustrated for five different cases of multidimensionality. These cases differ in (a) the numbers of ability dimensions that are intentional or should be considered as “nuisance dimensions” and (b) whether the test should or should not display a simple structure with respect to the intentional ability dimensions.
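As an illustration of the kind of 0–1 linear programming model described, here is a minimal sketch in Python using the open-source PuLP package; the item pool, Kullback-Leibler information values, and content bounds are invented for the example and do not come from the paper.

    # Minimal shadow-test sketch: maximize total posterior expected KL information
    # subject to content constraints (all data below are hypothetical).
    import pulp

    n_items = 100
    kl = [0.5 + 0.01 * i for i in range(n_items)]     # hypothetical KL information per item
    content = [i % 4 for i in range(n_items)]         # hypothetical content category per item
    test_length = 30
    min_per_category = {0: 5, 1: 5, 2: 5, 3: 5}       # hypothetical content bounds

    x = pulp.LpVariable.dicts("x", range(n_items), cat="Binary")
    model = pulp.LpProblem("shadow_test", pulp.LpMaximize)
    model += pulp.lpSum(kl[i] * x[i] for i in range(n_items))          # objective
    model += pulp.lpSum(x[i] for i in range(n_items)) == test_length   # fixed length
    for c, lb in min_per_category.items():
        model += pulp.lpSum(x[i] for i in range(n_items) if content[i] == c) >= lb
    model.solve()

    shadow_test = [i for i in range(n_items) if x[i].value() == 1]
    # The item administered next would be the not-yet-administered item in the
    # shadow test with maximum KL information; the model is re-solved after
    # each response with updated posterior information.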
An application of a hierarchical IRT model for families of items generated by different combinations of design rules is discussed. Within the families, the items are assumed to differ only in surface features. The parameters of the model are estimated in a Bayesian framework, using a data-augmented Gibbs sampler. An obvious application of the model is computerized algorithmic item generation. Such algorithms have the potential to increase the cost-effectiveness of item generation as well as the flexibility of item administration. The model is applied to data from a non-verbal intelligence test created using design rules. In addition, results from a simulation study conducted to evaluate parameter recovery are presented.
Three plausible assumptions of conditional independence in a hierarchical model for responses and response times on test items are identified. For each of the assumptions, a Lagrange multiplier test of the null hypothesis of conditional independence against a parametric alternative is derived. The tests have closed-form statistics that are easy to calculate from the standard estimates of the person parameters in the model. In addition, simple closed-form estimators of the parameters under the alternatives of conditional dependence are presented, which can be used to explore model modification. The tests were applied to a data set from a large-scale computerized exam and showed excellent power to detect even minor violations of conditional independence.
Current modeling of response times on test items has been strongly influenced by the paradigm of experimental reaction-time research in psychology. For instance, some of the models have a parameter structure chosen to represent a speed-accuracy tradeoff, while others equate speed directly with response time. In addition, several response-time models are unclear about the level of parametrization they represent. A hierarchical framework for modeling speed and accuracy on test items is presented as an alternative to these models. The framework allows a “plug-and-play approach” with alternative choices of models for the response and response-time distributions as well as the distributions of their parameters. A Bayesian treatment of the framework with Markov chain Monte Carlo (MCMC) computation facilitates the approach. Use of the framework is illustrated for the choice of a normal-ogive response model, a lognormal model for the response times, and multivariate normal models for their parameters, with Gibbs sampling from the joint posterior distribution.
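For concreteness, one common parametrization of the two first-level models mentioned (stated here as an illustration, not quoted from the paper) is
$$\Pr(U_{ij}=1\mid\theta_j) \;=\; \Phi(a_i\theta_j - b_i), \qquad \ln T_{ij} \;\sim\; N\bigl(\beta_i - \tau_j,\ \alpha_i^{-2}\bigr),$$
with $\theta_j$ and $\tau_j$ the ability and speed of test taker $j$, $(a_i, b_i)$ the discrimination and difficulty of item $i$, and $(\alpha_i, \beta_i)$ its time discrimination and time intensity; second-level multivariate normal distributions then tie the person parameters $(\theta_j, \tau_j)$ and the item parameters together.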
A lognormal model for response times is used to check response times for aberrances in examinee behavior on computerized adaptive tests. Both classical procedures and Bayesian posterior predictive checks are presented. For a fixed examinee, responses and response times are independent; checks based on response times thus offer information independent of the results of checks on response patterns. Empirical examples of the use of classical and Bayesian checks for detecting two different types of aberrances in response times are presented. The Bayesian checks had higher detection rates than the classical checks, but at the cost of higher false-alarm rates. A guideline for the choice between the two types of checks is offered.
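A minimal sketch of a classical check of this kind, assuming the lognormal response-time model in the parametrization above with hypothetical parameter values:

    # Standardized residual of a log response time under the lognormal model;
    # large |z| flags a time that is aberrantly fast or slow (all values hypothetical).
    import math

    def log_time_residual(t, alpha_i, beta_i, tau_hat):
        """Residual of ln t given item parameters and the examinee's estimated speed."""
        return alpha_i * (math.log(t) - (beta_i - tau_hat))

    z = log_time_residual(t=4.2, alpha_i=1.8, beta_i=1.6, tau_hat=0.2)
    print(f"z = {z:.2f}", "flagged" if abs(z) > 1.96 else "not flagged")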
A linear utility model is introduced for optimal selection when several subpopulations of applicants are to be distinguished. Using this model, procedures are described for obtaining optimal cutting scores in subpopulations in quota-free as well as quota-restricted selection situations. The cutting scores are optimal in the sense that they maximize the overall expected utility of the selection process. The procedures are demonstrated with empirical data.
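A sketch of the idea under simple invented assumptions: a bivariate normal test score $X$ and criterion $Y$, a utility linear in $Y$ for accepted applicants, and a grid search for the quota-free cutting score that maximizes expected utility (none of these specifics are taken from the paper).

    # Illustrative sketch: choose the cutting score on X that maximizes expected
    # utility, with a hypothetical linear utility b*y + a for accepted applicants
    # and a constant utility for rejected ones.
    import numpy as np

    rng = np.random.default_rng(1)
    rho = 0.6
    cov = [[1.0, rho], [rho, 1.0]]
    x, y = rng.multivariate_normal([0.0, 0.0], cov, size=100_000).T  # score X, criterion Y

    b, a = 1.0, -0.3          # hypothetical utility slope and intercept for acceptance
    u_reject = 0.0            # hypothetical constant utility of rejection
    cutoffs = np.linspace(-2, 2, 81)
    expected_u = [np.where(x >= c, b * y + a, u_reject).mean() for c in cutoffs]
    best = cutoffs[int(np.argmax(expected_u))]
    print(f"optimal cutting score ≈ {best:.2f}")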
Parameter recovery and item utilization were investigated for different designs for online test item calibration. The design was adaptive in a double sense: it assumed both adaptive testing of examinees from an operational pool of previously calibrated items and adaptive assignment of field-test items to the examinees. Four criteria of optimality for the assignment of the field-test items were used, each of them based on the information in the posterior distributions of the examinee’s ability parameter during adaptive testing as well as the sequentially updated posterior distributions of the field-test item parameters. In addition, different stopping rules based on target values for the posterior standard deviations of the field-test parameters and the size of the calibration sample were used. The impact of each of the criteria and stopping rules on the statistical efficiency of the estimates of the field-test parameters and on the time spent by the items in the calibration procedure was investigated. Recommendations as to the practical use of the designs are given.
Several criteria from the optimal design literature are examined for use with item selection in multidimensional adaptive testing. In particular, it is examined which criteria are appropriate for adaptive testing in which all abilities are intentional, some should be considered nuisance dimensions, or the interest is in a composite of the abilities. Both the theoretical analyses and the studies of simulated data in this paper suggest that the criteria of A-optimality and D-optimality lead to the most accurate estimates when all abilities are intentional, with the former slightly outperforming the latter. The criterion of E-optimality showed occasional erratic behavior for this case of adaptive testing, and its use is not recommended. If some of the abilities are nuisances, application of the criterion of As-optimality (or Ds-optimality), which focuses on the subset of intentional abilities, is recommended. For the measurement of a linear combination of abilities, the criterion of c-optimality yielded the best results. The preferences of each of these criteria for items with specific patterns of parameter values were also assessed. It was found that the criteria differed mainly in their preferences for items with different patterns of values for their discrimination parameters.
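The criteria compared here have standard definitions in terms of the (accumulated) Fisher information matrix $M$; a short sketch with a hypothetical matrix:

    # Standard optimal-design criteria evaluated on a hypothetical information
    # matrix M for a p-dimensional ability; in item selection, the item whose
    # addition to M optimizes the chosen criterion is administered next.
    import numpy as np

    def d_optimality(M):            # maximize det(M): joint precision of all abilities
        return np.linalg.det(M)

    def a_optimality(M):            # minimize trace(M^-1): sum of asymptotic variances
        return np.trace(np.linalg.inv(M))

    def e_optimality(M):            # maximize the smallest eigenvalue of M
        return np.linalg.eigvalsh(M)[0]

    def c_optimality(M, c):         # minimize the variance of the composite c'theta
        return c @ np.linalg.inv(M) @ c

    M = np.array([[2.0, 0.3], [0.3, 1.5]])   # hypothetical accumulated information
    c = np.array([0.7, 0.3])                 # hypothetical composite weights
    print(d_optimality(M), a_optimality(M), e_optimality(M), c_optimality(M, c))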
Observed-score equating using the marginal distributions of two tests is not necessarily the universally best approach it has been claimed to be. On the other hand, equating using the conditional distributions given the ability level of the examinee is theoretically ideal. Possible ways of dealing with the requirement of known ability are discussed, including such methods as conditional observed-score equating at point estimates or posterior expected conditional equating. The methods are generalized to the problem of observed-score equating with a multivariate ability structure underlying the scores.
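The conditional equating transformation at issue is the equipercentile function applied to the conditional observed-score distributions (standard notation assumed here):
$$e_Y(x;\theta) \;=\; F_{Y\mid\theta}^{-1}\bigl(F_{X\mid\theta}(x)\bigr),$$
with $F_{X\mid\theta}$ and $F_{Y\mid\theta}$ the distributions of the observed scores on the two forms given ability $\theta$; conditional observed-score equating at point estimates evaluates this at $\theta = \hat\theta$, while posterior expected conditional equating averages it over the posterior distribution of $\theta$.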
A set of linear conditions on item response functions is derived that guarantees identical observed-score distributions on two test forms. The conditions can be added as constraints to a linear programming model for test assembly that assembles a new test form whose observed-score distribution is optimally equated to the distribution on an old form. For a well-designed item pool and items fitting the IRT model, use of the model results in observed-score pre-equating and removes the need for post hoc equating by a conventional observed-score equating method. An empirical example illustrates the use of the model for an item pool from the Law School Admission Test.
In order to identify aberrant response-time patterns on educational and psychological tests, it is important to be able to separate the speed at which the test taker operates from the time the items require. A lognormal model for response times with this feature was used to derive a Bayesian procedure for detecting aberrant response times. Further, a combination of the response-time model with a regular response model in a hierarchical framework was used in an alternative procedure for the detection of aberrant response times, in which collateral information on the test takers’ speed is derived from their response vectors. The procedures are illustrated using a data set for the Graduate Management Admission Test® (GMAT®). In addition, a power study was conducted using simulated cheating behavior on an adaptive test.
Posterior odds of cheating on achievement tests are presented as an alternative to $p$ values reported for statistical hypothesis testing for several of the probabilistic models in the literature on the detection of cheating. It is shown how to calculate their combinatorial expressions with the help of a reformulation of the simple recursive algorithm for the calculation of number-correct score distributions used throughout the testing industry. Using the odds avoids the arbitrary choice between statistical tests of answer copying that do and do not condition on the responses the test taker is suspected to have copied, and allows the testing agency to account for existing circumstantial evidence of cheating through the specification of prior odds.
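The recursive algorithm alluded to is commonly known as the Lord–Wingersky recursion; a minimal sketch with hypothetical response probabilities:

    # Lord-Wingersky recursion: distribution of the number-correct score from the
    # item success probabilities p[i] at a given ability level.
    def number_correct_distribution(p):
        """Return P(score = s) for s = 0..n given success probabilities p[0..n-1]."""
        dist = [1.0]                           # after 0 items: P(score = 0) = 1
        for p_i in p:
            new = [0.0] * (len(dist) + 1)
            for s, prob in enumerate(dist):
                new[s] += prob * (1.0 - p_i)   # item answered incorrectly
                new[s + 1] += prob * p_i       # item answered correctly
            dist = new
        return dist

    print(number_correct_distribution([0.8, 0.6, 0.3]))  # hypothetical probabilities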
An optimal adaptive design for test-item calibration based on Bayesian optimality criteria is presented. The design adapts the choice of field-test items to the examinees taking an operational adaptive test using both the information in the posterior distributions of their ability parameters and the current posterior distributions of the field-test parameters. Different criteria of optimality based on the two types of posterior distributions are possible. The design can be implemented using an MCMC scheme with alternating stages of sampling from the posterior distributions of the test takers’ ability parameters and the parameters of the field-test items while reusing samples from earlier posterior distributions of the other parameters. Results from a simulation study demonstrated the feasibility of the proposed MCMC implementation for operational item calibration. A comparison of performances for different optimality criteria showed faster calibration of substantial numbers of items for the criterion of D-optimality relative to A-optimality, a special case of c-optimality, and random assignment of items to the test takers.
Response times on test items are easily collected in modern computerized testing. When both (binary) responses and (continuous) response times on test items are collected, it is possible to measure the accuracy and speed of test takers. To study the relationships between these two constructs, the hierarchical model for speed and accuracy is extended with a multivariate multilevel regression structure that allows the incorporation of covariates to explain the variance in speed and accuracy between individuals and groups of test takers. A Bayesian approach with Markov chain Monte Carlo (MCMC) computation enables straightforward estimation of all model parameters. Model-specific implementations of a Bayes factor (BF) and the deviance information criterion (DIC) for model selection are proposed; both are easily calculated as byproducts of the MCMC computation. Results from simulation studies and real-data examples are given to illustrate several novel analyses made possible by this modeling framework.
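For the DIC, the generic computation as an MCMC byproduct is the standard one (the model-specific deviance is abstracted away in this sketch):

    # DIC = Dbar + pD with pD = Dbar - D(posterior mean); deviance_draws would be
    # -2 * log-likelihood evaluated at each retained MCMC draw (numbers hypothetical).
    import numpy as np

    def dic(deviance_draws, deviance_at_posterior_mean):
        d_bar = np.mean(deviance_draws)              # posterior mean deviance
        p_d = d_bar - deviance_at_posterior_mean     # effective number of parameters
        return d_bar + p_d

    draws = np.array([210.3, 212.1, 208.7, 211.5])   # hypothetical deviance draws
    print(dic(draws, 205.9))                         # hypothetical deviance at posterior mean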
A maximin model for IRT-based test design is proposed. In the model, only the relative shape of the target test information function is specified. It serves as a constraint subject to which a linear programming algorithm maximizes the information in the test. In the practice of test construction, several demands can be formulated as linear constraints in the model. A worked example of a test construction problem with practical constraints is presented. The paper concludes with a discussion of some alternative models of test construction.
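A minimal sketch of the maximin formulation in Python with PuLP, with invented item information values and target shape: a common multiplier $y$ of the relative target is maximized subject to the test information reaching $r_k\,y$ at each ability point $\theta_k$.

    # Maximin test design: only the relative shape r[k] of the target information
    # function is fixed; the common multiplier y is maximized (all data hypothetical).
    import pulp

    n_items, thetas = 50, [-1.0, 0.0, 1.0]
    info = {(i, k): 0.1 + 0.005 * ((i + k) % 7)      # hypothetical I_i(theta_k)
            for i in range(n_items) for k in range(len(thetas))}
    r = [1.0, 2.0, 1.0]                              # relative target shape
    test_length = 20

    x = pulp.LpVariable.dicts("x", range(n_items), cat="Binary")
    y = pulp.LpVariable("y", lowBound=0)
    model = pulp.LpProblem("maximin_design", pulp.LpMaximize)
    model += y                                       # objective: common multiplier
    for k in range(len(thetas)):
        model += pulp.lpSum(info[i, k] * x[i] for i in range(n_items)) >= r[k] * y
    model += pulp.lpSum(x[i] for i in range(n_items)) == test_length
    model.solve()
    print([i for i in range(n_items) if x[i].value() == 1])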