›› 2020, Vol. ›› Issue (1): 206-214.


The Exploration and Application of Multidimensional Testlets Response Model in DIF Detecting

  

  • Received:2018-08-04 Revised:2019-07-27 Online:2020-01-15 Published:2020-01-20

An Exploration of Differential Item Functioning Detection Based on the Multidimensional Testlet Response Model

魏丹1,张丹慧2,刘红云3   

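  1. Faculty of Psychology, Beijing Normal University
    2. Collaborative Innovation Center of Assessment toward Basic Education Quality, Beijing Normal University
    3. Beijing Normal University
  • Corresponding author: 刘红云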

Abstract: Differential item functioning (DIF) detection is an important technique for ensuring that assessments are fair to different groups, such as boys and girls. Testlets are now widely used in educational assessment, and ignoring testlet effects when detecting DIF has been shown to produce missed and biased detections. Accordingly, several types of testlet models have been proposed to partial out the influence of testlet factors from the estimation of latent proficiency, and they have been applied to DIF detection in testlet-based tests. However, previous research has modeled only a single ability dimension or unidimensional testlets. DIF detection when multiple latent traits exist has not yet been clearly addressed. This study explores the application of the multidimensional testlets response model (MTRM) to DIF detection. A Monte Carlo simulation compared accuracy across conditions in terms of the detection rate for DIF items and the false detection rate for items without DIF, manipulating four factors: sample size (1000/2000/5000/10000), the difference in group mean ability (0/0.5), the magnitude of the testlet effect (0.25/0.75), and the magnitude of DIF (0.3/0.6). In addition, an empirical study used math assessment data to examine the practicality of the MTRM in DIF detection. In both the simulation study and the real data analysis, the MTRM was compared with a DIF detection method based on the multidimensional random coefficients multinomial logistic model (MRCMLM), which ignores testlet effects. The simulation results indicate that sample size and DIF magnitude are the factors with the greatest influence on the results.
Specifically, as sample size increases, the detection rate rises, the false detection rate falls, and the results become more stable across conditions. A conservative recommendation from this study is to use 5000 or more examinees to obtain accurate results. Both the MTRM and the MRCMLM give acceptable accuracy for DIF of magnitude 0.6, but not for DIF of magnitude 0.3. The testlet effect is also a non-ignorable factor: although the MTRM and the MRCMLM give similar results in many cases, the MTRM is more accurate and more stable across conditions than the MRCMLM, which ignores the testlet effect, especially when the testlet effect is large. Furthermore, the real data analysis shows that the MTRM has better model fit indices, and we can generally conclude that the MTRM is better than the MRCMLM for DIF detection. Overall, the present study provides evidence for the application of the MTRM to DIF detection, and it extends existing techniques by taking both within-item multidimensional testlets and multiple abilities into account. A promising attribute of this model is that its parameters can be estimated easily with the software ConQuest. This study provides rigorous theoretical support, as well as practical confirmation, for applying the MTRM to DIF detection, which has both practical value and theoretical significance.
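
The simulation design and the two accuracy measures described in the abstract can be illustrated with a minimal Python sketch. The factor levels are taken from the abstract; the variable names, the helper `detection_rates`, and the assumption that the four factors are fully crossed are illustrative only and are not confirmed by the source. The sketch does not fit the MTRM or the MRCMLM; it only shows how the conditions could be enumerated and how the two rates could be computed from per-item flags.

```python
from itertools import product

# Factor levels as listed in the abstract.
sample_sizes = [1000, 2000, 5000, 10000]
mean_ability_diffs = [0.0, 0.5]
testlet_effects = [0.25, 0.75]
dif_magnitudes = [0.3, 0.6]

# Assumption: the four factors are fully crossed (4 x 2 x 2 x 2 cells).
conditions = list(product(sample_sizes, mean_ability_diffs,
                          testlet_effects, dif_magnitudes))


def detection_rates(flagged, has_dif):
    """Return (detection rate, false detection rate) for one replication.

    flagged[i] -- True if item i was flagged as showing DIF
    has_dif[i] -- True if item i was simulated with DIF
    """
    n_dif = sum(has_dif)
    n_clean = len(has_dif) - n_dif
    hits = sum(f and d for f, d in zip(flagged, has_dif))
    false_alarms = sum(f and not d for f, d in zip(flagged, has_dif))
    # Guard against a replication with no DIF items or no DIF-free items.
    detection = hits / n_dif if n_dif else float("nan")
    false_detection = false_alarms / n_clean if n_clean else float("nan")
    return detection, false_detection
```

In a study of this kind, the two rates would typically be averaged over many replications within each condition before the conditions are compared.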

Key words: multidimensional testlets response model (MTRM), multidimensional random coefficients multinomial logistic model (MRCMLM), differential item functioning (DIF), detection rate, false detection rate

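Abstract (translated from the Chinese): This paper applies the multidimensional testlets response model (MTRM) to differential item functioning (DIF) detection in multidimensional testlet-based tests. Through a simulation study and an applied study, it examines the accuracy, effectiveness, and influencing factors of the MTRM in DIF detection, and compares it with the multidimensional random coefficients multinomial logistic model (MRCMLM), which ignores testlet effects. The results show that: (1) as sample size increases, the MTRM's detection rate for true DIF rises, its error rate falls, and its results are more stable across conditions; (2) compared with the MRCMLM, the MTRM-based DIF detection model has a higher detection rate and is less affected by other factors; (3) when the testlet effect in a test is small, the results of the MTRM and the MRCMLM differ little, but the MTRM shows better model fit.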

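Key words (translated from the Chinese): multidimensional testlets response model, MRCMLM, differential item functioning, detection rate, error rate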
