Design of Adaptive Multi-group Cognitive Diagnostic Testing and Its Item Selection Strategies

心理科学 (Journal of Psychological Science), 2018, Issue 3: 720-726.


Luo Fen (罗芬), Wang Xiaoqing (王晓庆), Ding Shuliang (丁树良), Xiong Jianhua (熊建华)

Jiangxi Normal University

Received: 2016-10-10  Revised: 2017-09-12  Published: 2018-05-20  Released online: 2018-05-20
Corresponding author: Ding Shuliang (丁树良)

The Design and Selection Strategies of Adaptive Multi-group Testing for Cognitive Diagnosis

Received: 2016-10-10  Revised: 2017-09-12  Online: 2018-05-20  Published: 2018-05-20
Contact: Shu-Liang DING

Abstract

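Summary: Drawing on the online assembly mode of on-the-fly multistage adaptive testing (OMST), this paper proposes adaptive multi-group testing for cognitive diagnosis (CD-AMGT). Because the prerequisite relation among knowledge states is a partial order and, moreover, forms a lattice, the supremum and infimum in the lattice of the current knowledge state estimates are used to bound the possible range of the examinee's true knowledge state, and the next group is assembled from that range. Within each group, items are selected with the PWKL or SHE strategy so as to balance diagnostic accuracy, efficiency, and security. Simulation experiments show that, compared with PWKL and SHE, when the item bank contains a rich variety of item types, CD-AMGT offers a large advantage in both the uniformity of item bank usage and computation time, at the cost of a slight decrease in classification accuracy.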

Keywords: adaptive multi-group testing, item selection strategy, item bank security, testing time
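
The bounding step described in the summary can be illustrated with a minimal Python sketch. It assumes mutually independent attributes, so that knowledge states are 0/1 attribute patterns ordered componentwise and form a Boolean lattice in which the supremum is a componentwise OR and the infimum a componentwise AND. The function names (`supremum`, `infimum`, `states_between`) and the choice of which interim estimates to pass in are hypothetical and are not taken from the paper.

```python
from itertools import product


def supremum(states):
    """Least upper bound of 0/1 attribute patterns: componentwise OR."""
    return tuple(max(column) for column in zip(*states))


def infimum(states):
    """Greatest lower bound of 0/1 attribute patterns: componentwise AND."""
    return tuple(min(column) for column in zip(*states))


def states_between(lower, upper):
    """All patterns s with lower <= s <= upper componentwise.

    Assumes lower <= upper, which always holds when lower is the infimum
    and upper the supremum of the same set of patterns.
    """
    # An attribute on which the bounds agree is fixed; otherwise it is free.
    choices = [(lo,) if lo == up else (0, 1) for lo, up in zip(lower, upper)]
    return list(product(*choices))


# Hypothetical interim estimates, e.g. the most probable states so far
# (5 attributes, as in the simulation described in the abstract).
interim_estimates = [(1, 1, 0, 0, 0), (1, 0, 1, 0, 0)]
lower = infimum(interim_estimates)
upper = supremum(interim_estimates)
candidate_states = states_between(lower, upper)
```

Items whose attribute requirements discriminate among `candidate_states` could then form the pool for the next group; how the paper actually maps this range to items is not specified in the abstract.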

Abstract: Constructing an item bank requires a substantial investment of time and effort. The item selection strategy, the most important component of cognitive diagnostic computerized adaptive testing (CD-CAT), should therefore respond quickly and make balanced use of the item bank. The two most widely used item selection methods in CD-CAT are Shannon entropy (SHE) and posterior-weighted Kullback-Leibler (PWKL). Both achieve high classification accuracy, but they use the item bank unevenly. Borrowing the idea of on-the-fly multistage adaptive testing (OMST), we propose a new test design, adaptive multi-group testing for cognitive diagnosis (AMGT). An AMGT consists of several groups, each containing multiple items, and each group is assembled from the interim knowledge state estimate and its upper and lower bounds. Combining AMGT with SHE gives AMGT-SHE, and combining it with PWKL gives AMGT-PWKL; these two are the optimal designs for AMGT.

Based on the deterministic inputs, noisy "and" gate (DINA) model, a simulation study compared the efficiency of the AMGT-SHE and AMGT-PWKL methods with that of the SHE, PWKL, and random selection methods on four item pools with different structures. Pattern correct rate, test length, average exposure rate, and time consumed per examinee were calculated. The attributes were assumed to be mutually independent, and the number of attributes was 5. Test length was fixed at 25 and the item pool size at 300. The variable-length test stopped when the largest posterior probability of a knowledge state was no smaller than 0.9 and the second largest was no larger than 0.1.

The Monte Carlo simulation results showed the following. (1) The AMGT-SHE method outperformed the AMGT-PWKL method in pattern correct rate and time consumed, whereas the opposite held for average exposure rate. For both methods, the simpler the item types in the item bank, the higher the pattern correct rate. (2) When the item bank contains a rich variety of item types, the AMGT-PWKL and AMGT-SHE methods are far better than the PWKL and SHE methods in average exposure rate and time consumed; in particular, the time consumed by the former is one-ninth that of the latter, although test length increases. (3) Drawing the initial-stage items from the columns of the reachability matrix R, rather than selecting them at random, helps improve the pattern correct rate.

AMGT builds its shadow pool using lattice theory and is combined with PWKL or SHE. Compared with the PWKL and SHE methods, when item types are rich it greatly improves the uniformity of item bank usage, at the price of a longer test and a slightly lower pattern correct rate, which makes it well suited to high-stakes tests. When test security is the primary concern, the AMGT-PWKL method performs better; when both item bank security and test accuracy matter, the AMGT-SHE method should be adopted. Because the AMGT-SHE and AMGT-PWKL methods seek locally rather than globally optimal solutions, they can meet the fast response requirement of CAT.

Keywords: adaptive multi-group testing for cognitive diagnosis, item selection method, test security, time consumed per examinee
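
As an illustration of the two selection indices named in the abstract, the following Python sketch computes DINA response probabilities, the posterior over knowledge states, and the SHE and PWKL indices for a candidate item, using the standard textbook definitions of these quantities. It is a sketch under stated assumptions, not the authors' implementation: the item format `(q, slip, guess)`, the function names, and the toy parameter values are hypothetical, and slip and guess must lie strictly between 0 and 1.

```python
import math
from itertools import product


def dina_prob(alpha, q, slip, guess):
    """P(correct) under DINA: 1 - slip if alpha has every attribute q requires, else guess."""
    masters_all = all(a >= r for a, r in zip(alpha, q))
    return 1.0 - slip if masters_all else guess


def update_posterior(prior, states, item, x):
    """Bayes update of the distribution over knowledge states after response x (1 or 0)."""
    q, slip, guess = item
    weights = []
    for w, alpha in zip(prior, states):
        p = dina_prob(alpha, q, slip, guess)
        weights.append(w * (p if x == 1 else 1.0 - p))
    total = sum(weights)
    return [w / total for w in weights]


def entropy(dist):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)


def she_index(post, states, item):
    """Expected posterior entropy after administering the item (smaller is better)."""
    q, slip, guess = item
    p1 = sum(w * dina_prob(a, q, slip, guess) for w, a in zip(post, states))
    expected = 0.0
    for x, px in ((1, p1), (0, 1.0 - p1)):
        if px > 0:
            expected += px * entropy(update_posterior(post, states, item, x))
    return expected


def pwkl_index(post, states, item, alpha_hat):
    """Posterior-weighted KL divergence between alpha_hat and every state (larger is better)."""
    q, slip, guess = item
    p_hat = dina_prob(alpha_hat, q, slip, guess)
    total = 0.0
    for w, alpha in zip(post, states):
        p = dina_prob(alpha, q, slip, guess)
        kl = (p_hat * math.log(p_hat / p)
              + (1.0 - p_hat) * math.log((1.0 - p_hat) / (1.0 - p)))
        total += w * kl
    return total


# Toy example with 2 attributes and hypothetical item parameters.
states = list(product((0, 1), repeat=2))
posterior = [1.0 / len(states)] * len(states)
items = [((1, 0), 0.1, 0.2), ((1, 1), 0.1, 0.2)]
alpha_hat = states[posterior.index(max(posterior))]  # current point estimate
best_she = min(items, key=lambda it: she_index(posterior, states, it))
best_pwkl = max(items, key=lambda it: pwkl_index(posterior, states, it, alpha_hat))
```

Under these definitions, SHE selects the item with the smallest expected posterior entropy, while PWKL selects the item with the largest index. In an AMGT-style design, either index would be evaluated only over the items in the current group rather than over the whole bank.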

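Luo Fen, Wang Xiaoqing, Ding Shuliang, Xiong Jianhua. Design of adaptive multi-group cognitive diagnostic testing and its item selection strategies [J]. 心理科学 (Journal of Psychological Science), 2018, (3): 720-726.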
