
DIF Analysis with Unknown Groups and Anchor Items

Published online by Cambridge University Press: 01 January 2025

Gabriel Wallin
Affiliation:
Department of Mathematics and Statistics, Lancaster University
Yunxiao Chen*
Affiliation:
Department of Statistics, London School of Economics and Political Science
Irini Moustaki
Affiliation:
Department of Statistics, London School of Economics and Political Science
*Correspondence should be made to Yunxiao Chen, Department of Statistics, London School of Economics and Political Science, Columbia House, Room 5.16, Houghton Street, London WC2A 2AE, UK. Email: y.chen186@lse.ac.uk

Abstract

Ensuring fairness in instruments like survey questionnaires or educational tests is crucial. One way to address this is by a Differential Item Functioning (DIF) analysis, which examines if different subgroups respond differently to a particular item, controlling for their overall latent construct level. DIF analysis is typically conducted to assess measurement invariance at the item level. Traditional DIF analysis methods require knowing the comparison groups (reference and focal groups) and anchor items (a subset of DIF-free items). Such prior knowledge may not always be available, and psychometric methods have been proposed for DIF analysis when one piece of information is unknown. More specifically, when the comparison groups are unknown while anchor items are known, latent DIF analysis methods have been proposed that estimate the unknown groups by latent classes. When anchor items are unknown while comparison groups are known, methods have also been proposed, typically under a sparsity assumption – the number of DIF items is not too large. However, DIF analysis when both pieces of information are unknown has not received much attention. This paper proposes a general statistical framework under this setting. In the proposed framework, we model the unknown groups by latent classes and introduce item-specific DIF parameters to capture the DIF effects. Assuming the number of DIF items is relatively small, an $L_1$-regularised estimator is proposed to simultaneously identify the latent classes and the DIF items. A computationally efficient Expectation-Maximisation (EM) algorithm is developed to solve the non-smooth optimisation problem for the regularised estimator. The performance of the proposed method is evaluated by simulation studies and an application to item response data from a real-world educational test.
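
To illustrate why an $L_1$ penalty can single out a small set of DIF items, the following minimal Python sketch shows the soft-thresholding operator (the proximal operator of the $L_1$ norm) and one proximal gradient update. It is a generic illustration, not the authors' implementation: the function names, the step size, the penalty weight `lam`, and the numeric DIF-effect values are all assumptions made for the example.

```python
import numpy as np


def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1: shrink each entry toward zero by t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def proximal_gradient_step(delta, grad, step, lam):
    """One proximal gradient update for (smooth loss) + lam * ||delta||_1."""
    return soft_threshold(delta - step * grad, step * lam)


# Hypothetical DIF-effect values for five items (illustrative numbers only).
delta = np.array([0.9, -0.05, 0.02, -1.2, 0.0])

# Assumption for the example: the gradient of the smooth part is zero,
# so the update reduces to pure shrinkage.
grad = np.zeros_like(delta)

updated = proximal_gradient_step(delta, grad, step=1.0, lam=0.1)

# Items whose DIF effect remains non-zero after shrinkage.
flagged = np.flatnonzero(updated)
```

Small effects are set exactly to zero while large ones are only shrunk, which is how a sparsity-inducing penalty separates candidate DIF items from DIF-free ones in this kind of estimator.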

Information

Type
Theory & Methods
Creative Commons
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Copyright
Copyright © The Author(s) 2024
Figure 1. Path diagram of the proposed model, where the dashed lines indicate the DIF effects.

Algorithm 1 Regularised estimation and model selection.

Algorithm 2 An EM algorithm for solving (3).

Table 1 Respondent and item classification accuracy under different simulation scenarios for the two-group case.

Table 2 Respondent and item classification accuracy under different simulation scenarios for the three-group case.

Figure 2. RMSE for $J=25$ under the 2-group setting.

Figure 3. RMSE for $J=50$ under the 2-group setting.

Figure 4. RMSEs under the 3-group setting.

Table 3 Average absolute bias and RMSE over all items by estimated parameter type, when $J=25$, with $N=1000$ and $N=5000$, under the 2-group setting.

Table 4 Average absolute bias and RMSE over all items by estimated parameter type, when $J=50$, with $N=1000$ and $N=5000$, under the 2-group setting.

Table 5 Average absolute bias and RMSE over all items by estimated parameter type, with $N=1000$ and $N=5000$, under the 3-group setting.

Table 6 Estimated item easiness and DIF effects for the detected DIF items.

Algorithm 3 Line Search Algorithm

Algorithm 4 Soft Threshold Function

Algorithm 5 Proximal Gradient Function

Supplementary material: File

Wallin et al. supplementary material
