A Study of Item Selection Strategies for Polytomously Scored CD-CAT

Journal of Psychological Science, 2023, Vol. 46, Issue 2: 461-469.


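Gao Xuliang, Wang Fang, Zhao Pengjuan

School of Psychology, Guizhou Normal University; Mental Health Education and Counseling Center, Guizhou Normal University, Guiyang, 550025

Received: 2020-09-29; Revised: 2022-04-23; Online: 2023-03-20; Published: 2023-03-20
Corresponding author: Gao Xuliang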

Research on item selection algorithm of polytomous scored cognitive diagnosis adaptive test

Gao Xuliang, Wang Fang, Zhao Pengjuan

School of Psychology, Mental Health Education and Counseling Center, Guizhou Normal University, Guiyang, 550025

Received: 2020-09-29; Revised: 2022-04-23; Online: 2023-03-20; Published: 2023-03-20

Abstract

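Abstract (translated from the Chinese): Cognitive diagnosis computerized adaptive testing (CD-CAT) offers a new perspective for the development of psychological and educational assessment. Research on CD-CAT to date has relied mainly on dichotomously scored models, yet polytomously scored data are common in applied fields such as educational assessment and psychometrics. Because an efficient item selection method is central to a successful CD-CAT program, this study proposes two new item selection methods for polytomous CD-CAT (PCD-CAT): expected posterior variance (EPV) and maximum expected distance (MED). Simulation experiments under the general polytomous diagnosis model (GPDM) compared the performance of EPV and MED in PCD-CAT. The results show that EPV and MED achieve higher measurement accuracy and test efficiency than traditional item selection methods.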

Keywords: cognitive diagnosis, computerized adaptive testing, polytomous scoring, GPDM model

Abstract: Cognitive diagnostic computerized adaptive testing (CD-CAT) provides a new way to obtain information about students' mastery or nonmastery of a set of skills in a knowledge domain, and it has received growing attention in educational and psychological assessment in recent years. An effective item selection algorithm is key to the success of a CD-CAT system. To date, various CD-CAT item selection methods have been proposed for dichotomous cognitive diagnosis models (CDMs), but few studies have addressed item selection for polytomous CD-CAT (PCD-CAT), even though polytomously scored data are common in educational assessment, psychological evaluation, and many other disciplines. In this article, the authors explored CD-CAT item selection based on the general polytomous diagnosis model (GPDM) and proposed two PCD-CAT item selection algorithms: maximum expected posterior distribution variance (EPV) and maximum expected distance (MED). The proposed algorithms were evaluated in two simulation studies and compared with the KL, PWKL, and HKL algorithms in fixed-length and variable-length PCD-CAT. In the simulations, the item bank contained 350 items, the maximum score of each item was fixed at 4, and the number of attributes was fixed at K = 7. Study 1 manipulated three factors: test length (5, 10, 15, and 20 items), item bank quality (high vs. low), and item selection algorithm (KL, PWKL, HKL, EPV, and MED). EPV and MED consistently yielded the highest attribute pattern recovery in all simulation conditions, with a pattern correct classification rate (PMR) significantly higher than that of KL, PWKL, and HKL.
EPV and MED had similar PMRs under most conditions, but when the test was short (for example, 5 items), the PMR of EPV was higher than that of MED regardless of item bank quality. Under all conditions, KL had the lowest PMR, while the difference in PMR between PWKL and HKL was almost negligible. Study 2 investigated the two proposed algorithms in variable-length PCD-CAT, in which the test terminates once the maximum posterior probability of the attribute vector reaches a prespecified value. Study 2 manipulated three factors: the prespecified termination value (0.6, 0.7, 0.8, and 0.9), item bank quality (high vs. low), and the five item selection algorithms. The average test lengths of EPV and MED were similar to each other and significantly shorter than those of KL, PWKL, and HKL. Although the results are encouraging, several directions deserve further study, such as (a) how to use both confirmatory and exploratory CDMs in diagnostic evaluation to better analyze data; (b) item calibration techniques for CD-CAT; and (c) computer-based automatic scoring of polytomously scored items.
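The variable-length termination rule and the PMR criterion described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `category_probs` array (the probability of each score given each attribute pattern, which a fitted polytomous model such as the GPDM would supply), and the toy numbers in the usage note are all assumptions made for illustration. The item selection step itself (KL, PWKL, HKL, EPV, or MED) is deliberately left out, since the abstract does not give those formulas.

```python
import itertools

import numpy as np


def all_patterns(k):
    """Return all 2**k binary attribute patterns, one pattern per row."""
    return np.array(list(itertools.product([0, 1], repeat=k)))


def update_posterior(posterior, category_probs, score):
    """Bayes update of the pattern posterior after one scored response.

    category_probs[c, x] is P(score = x | pattern c). This array is a
    hypothetical input; in practice it would come from the fitted model.
    """
    unnormalized = posterior * category_probs[:, score]
    return unnormalized / unnormalized.sum()


def should_stop(posterior, threshold):
    """Variable-length rule: stop once the largest posterior probability
    of any attribute pattern reaches the prespecified threshold."""
    return posterior.max() >= threshold


def pattern_match_rate(true_patterns, estimated_patterns):
    """Proportion of examinees whose entire attribute pattern is recovered."""
    true_patterns = np.asarray(true_patterns)
    estimated_patterns = np.asarray(estimated_patterns)
    return np.mean(np.all(true_patterns == estimated_patterns, axis=1))
```

As a usage example under the same assumptions, with K = 2 attributes one would start from a uniform prior `np.full(4, 0.25)` over the rows of `all_patterns(2)`, call `update_posterior` after each administered item, and end the test as soon as `should_stop(posterior, 0.8)` returns true. With the K = 7 attributes used in the simulations, the posterior would have 2**7 = 128 entries.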

Key words: cognitive diagnosis, computerized adaptive testing, polytomously-scored items, GPDM

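Gao Xuliang, Wang Fang, Zhao Pengjuan. A Study of Item Selection Strategies for Polytomously Scored CD-CAT[J]. Journal of Psychological Science, 2023, 46(2): 461-469.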

Gao Xuliang, Wang Fang, Zhao Pengjuan. Research on item selection algorithm of polytomous scored cognitive diagnosis adaptive test[J]. Journal of Psychological Science, 2023, 46(2): 461-469.
