Journal of Psychological Science (心理科学), 2022, Vol. 45, Issue (3): 710-717.

• Statistics, Measurement, and Methods •

Information Matrix-Based Multiple-Group DIF Detection in Cognitive Diagnostic Tests

孙小坚1,2,刘彦楼3,王诗梦1,辛涛4,宋乃庆1,周蔓3   

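  1. Southwest University
  2. Collaborative Innovation Center of Assessment toward Basic Education Quality
  3. Qufu Normal University
  4. Beijing Normal University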
  • Received: 2020-05-15 Revised: 2020-11-24 Online: 2022-05-20 Published: 2022-05-22
  • Corresponding author: 辛涛

Using Information Matrix-based Method to Detect Differential Item Functioning with Multiple Groups in Cognitive Diagnostic Test

  • Received:2020-05-15 Revised:2020-11-24 Online:2022-05-20 Published:2022-05-22

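Summary: Building on improved Wald statistics, this study extends a differential item functioning (DIF) detection method designed for two groups to DIF detection with multiple groups. The improved Wald statistics are computed from the observed information matrix (Obs) and from the empirical cross-product information matrix (XPD), respectively. A simulation study compared these two statistics with the traditional computation method for DIF detection across multiple groups. The results show that (1) the Type I error rates of Obs and XPD are clearly lower than those of the traditional method, and under DINA model estimation the Type I error rates of Obs and XPD are close to the nominal level; (2) when the sample size and DIF size are large, Obs and XPD have roughly the same statistical power as the traditional Wald statistic.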
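As a rough illustration of the quantities named above, the following Python sketch shows the textbook form of an empirical cross-product information matrix and of a multiple-group Wald statistic, W = (Rb)' (R V R')^{-1} (Rb). It is not the authors' implementation: the function names, the stacking order of the parameters, and the choice of the first group as the comparison group are all assumptions made for this example, and the covariance matrix in the demo is invented.

```python
import numpy as np

def xpd_information(scores):
    """Empirical cross-product (XPD) information: sum over examinees of s_i s_i^T.

    scores: (N, p) array; row i is the gradient of examinee i's log-likelihood
    with respect to the p model parameters (hypothetical input for this sketch).
    """
    scores = np.asarray(scores, dtype=float)
    return scores.T @ scores

def contrast_matrix(n_groups, n_params):
    """Restriction matrix R comparing each later group with the first group.

    Assumes the item parameters are stacked as [group 1, group 2, ..., group G],
    each block of length n_params, so R @ beta == 0 means "no DIF".
    """
    eye = np.eye(n_params)
    rows = []
    for g in range(1, n_groups):
        blocks = [np.zeros((n_params, n_params)) for _ in range(n_groups)]
        blocks[0] = eye
        blocks[g] = -eye
        rows.append(np.hstack(blocks))
    return np.vstack(rows)

def wald_statistic(beta, cov, R):
    """Return (W, df), where W = (R b)^T (R V R^T)^{-1} (R b) and df = rows of R."""
    beta = np.asarray(beta, dtype=float)
    diff = R @ beta
    middle = R @ np.asarray(cov, dtype=float) @ R.T
    w = float(diff @ np.linalg.solve(middle, diff))
    return w, R.shape[0]

# Demo with made-up numbers: 3 groups, 2 item parameters per group.
R = contrast_matrix(n_groups=3, n_params=2)
beta_hat = [0.2, 0.1, 0.3, 0.1, 0.2, 0.1]   # only group 2's first parameter differs
cov_hat = 0.01 * np.eye(6)                  # illustrative covariance, not estimated
w, df = wald_statistic(beta_hat, cov_hat, R)
print(w, df)
```

Under the null hypothesis of no DIF, W would be referred to a chi-square distribution with `df` degrees of freedom. In an actual analysis the covariance matrix would come from inverting an information matrix such as the one returned by `xpd_information`, rather than from the fixed matrix used in the demo.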

Keywords: cognitive diagnostic test, differential item functioning, multiple groups, improved Wald statistic

Abstract: Cognitive diagnostic assessment (CDA) has recently received much attention in educational and psychological measurement. Within CDA, the cognitive diagnostic test is the key component for determining, from an individual's item responses, whether he or she has mastered the attributes the test is intended to measure. Many factors can affect the quality of a cognitive diagnostic test, and differential item functioning (DIF) is among the most important. Researchers have developed many methods for detecting DIF items, such as the Mantel–Haenszel (MH) procedure, the simultaneous item bias test, logistic regression (LR), and the Wald statistic. These methods were designed mainly to compare two groups, the reference group and the focal group. In practice, however, there are often more than two groups, such as different classrooms within a school or different schools within a district. A few studies have extended DIF detection methods to multiple groups; for example, Li and Wang (2015) used the Wald statistic to detect DIF items across three groups. However, that Wald statistic is computed from an itemwise information matrix, which yields inflated Type I error rates. The MH and LR methods can also be used to detect DIF items across multiple groups by using the total score as the matching variable, but their results are worse than those of the itemwise Wald statistic in most conditions; this study therefore focuses on the Wald statistic. We extend the improved Wald statistics to more than two groups in order to control the Type I error rate and improve power. A simulation study is conducted to investigate the performance of the two improved Wald statistics for DIF detection with more than two groups in CDA. Six factors are manipulated: DIF type, DIF size, sample size, test length, proportion of DIF items, and DIF detection method.
In addition, five factors are fixed: the number of groups, the number of attributes, the correlation among attributes, the model used to generate response patterns, and the distribution of item parameters. Type I error rate and statistical power are used to evaluate the performance of the three DIF detection methods, with the nominal Type I error rate set at .05. To reduce sampling error, 50 replications are used for each condition. The results show that (1) in all conditions, the itemwise-based (IW-based) Wald statistic produces larger, inflated Type I error rates than the two improved Wald statistics, namely the cross-product information-based (XPD-based) Wald statistic and the observed information-based (Obs-based) Wald statistic; (2) when the DINA model is used to estimate the item parameters, the Type I error rates of the two improved Wald statistics are close to the nominal level in most conditions; and (3) the IW-based Wald statistic yields the highest power in all conditions, while the Obs-based and XPD-based Wald statistics produce similar power in most conditions. The differences among the three Wald statistics diminish when the sample size and DIF size are relatively large.
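The two evaluation criteria described above can be sketched as simple rejection rates over replications. The snippet below is a minimal illustration, assuming each replication yields one p-value per item and that the simulated DIF status of every item is known; the function name, the input layout, and the p-values in the demo are hypothetical and do not come from the study.

```python
def rejection_rates(p_values, is_dif_item, alpha=0.05):
    """Empirical Type I error rate and power across replications.

    p_values: list of replications, each a list of per-item p-values.
    is_dif_item: list of bools, True where DIF was simulated for that item.
    Returns (type_i_error, power): the share of non-DIF items flagged and
    the share of DIF items flagged, pooled over replications.
    """
    false_pos = n_null = true_pos = n_dif = 0
    for rep in p_values:
        for p, dif in zip(rep, is_dif_item):
            rejected = p < alpha
            if dif:
                n_dif += 1
                true_pos += rejected
            else:
                n_null += 1
                false_pos += rejected
    type_i = false_pos / n_null if n_null else float("nan")
    power = true_pos / n_dif if n_dif else float("nan")
    return type_i, power

# Made-up p-values for 2 replications of a 4-item test (items 1-2 carry DIF).
demo_p = [[0.01, 0.20, 0.03, 0.60],
          [0.04, 0.50, 0.30, 0.70]]
demo_dif = [True, True, False, False]
print(rejection_rates(demo_p, demo_dif))
```

A Type I error rate near `alpha` indicates a well-calibrated test, while a markedly larger value corresponds to the inflation reported for the itemwise statistic.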

Key words: Cognitive diagnostic test, differential item functioning, multiple groups, improved Wald statistics