Quality Monitoring Technique Combining Bayesian Method with Sequential Procedure in CAT System

Journal of Psychological Science (心理科学) ›› 2018, Vol. 41 ›› Issue (1): 189-195.

Guo Lei (郭磊)¹, Liu Wei (刘伟)²

1. Southwest University
2. Southwestern University of Finance and Economics

Received: 2016-12-06 Revised: 2017-05-07 Online: 2018-01-20 Published: 2018-01-20
Corresponding author: Guo Lei (郭磊)

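Abstract

Summary: Zhang (2013) proposed the sequential monitoring procedure (SMP) to detect whether items in a CAT have been compromised while in use. However, the method produces false alarms, and it does not consider how item compromise affects the accuracy of ability estimation. Building on SMP, this study introduces a person fit index and proposes the SMP_PFI method, which verifies at a given confidence level whether items flagged by SMP are truly compromised, and examines how SMP_PFI affects the relationship between ability estimation accuracy and the number of frozen items. The experimental results show that the new method effectively reduces the Type I error of SMP run on its own, and that controlling the CPFI value balances ability estimation accuracy against the number of frozen items.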


Abstract: Computerized adaptive testing (CAT) was proposed to measure examinees' trait levels with greater precision than conventional tests by building an individualized test for each examinee. Test items are selected sequentially according to the examinee's current performance, so the test is tailored to each examinee's trait level and item difficulty is matched to the examinee being measured. In CAT research, the security of the item pool is essential both for test fairness and for protecting the pool itself. Many studies have proposed item exposure control techniques that limit the maximum exposure rate of each item, such as the a-stratified method, the Sympson-Hetter (SH) strategy, the restricted maximum information strategy, and the shadow test approach. However, exposure control alone is far from sufficient, because it cannot prevent people from stealing items from the exam room. In practice, once a CAT item pool has been in use for a while, some items may be compromised through overexposure or for other, deliberate reasons. As the number of compromised items grows, the reliability and validity of the continuous testing system and the accuracy of examinees' ability estimates are damaged. To maintain the security of the CAT system, the sequential monitoring procedure (SMP; Zhang, 2013) was developed to detect whether the performance or statistical characteristics of individual items have changed significantly during their lifetime in an item pool. Nevertheless, Type I errors (false alarms) can occur when the SMP is used for monitoring. Moreover, Zhang's study did not address how compromised items affect the accuracy of ability estimation.
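To make the idea of monitoring an item over its lifetime concrete, the following is a minimal sketch of a moving-window check on one item's proportion correct. It is not Zhang's (2013) exact procedure, which the abstract does not specify. The function name `flag_item`, the window size, the known baseline proportion, and the one-sided z-test are all assumptions made for illustration.

```python
import math


def flag_item(responses, baseline_p, window=50, z_crit=2.326):
    """Return the index of the response at which the item is first flagged, or None.

    responses  : list of 0/1 scores on one item, in administration order
    baseline_p : reference proportion correct for the item (assumed known here)
    window     : moving-window size (hypothetical default)
    z_crit     : one-sided critical value (2.326 is roughly alpha = 0.01)
    """
    if not 0.0 < baseline_p < 1.0:
        raise ValueError("baseline_p must lie strictly between 0 and 1")
    # Standard error of a proportion under the baseline, for one window.
    se = math.sqrt(baseline_p * (1.0 - baseline_p) / window)
    for end in range(window, len(responses) + 1):
        # Proportion correct in the most recent `window` responses.
        p_hat = sum(responses[end - window:end]) / window
        z = (p_hat - baseline_p) / se
        if z > z_crit:  # the item looks easier than its baseline
            return end - 1
    return None


# Usage: an item answered at its baseline rate, then answered correctly by everyone.
clean = [0, 1] * 50
leaked = clean + [1] * 50
print(flag_item(clean, baseline_p=0.5))   # no flag expected for the clean sequence
print(flag_item(leaked, baseline_p=0.5))  # flagged some time after response 100
```

Because the test is repeated on every new window, an uncompromised item can still cross the threshold by chance. That is the Type I error the abstract refers to, and it is why a second verification step is attractive.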
Thus, this paper combines a person fit index, calculated with a Bayesian method, with the sequential monitoring procedure and proposes a new approach, the SMP_PFI method, to reduce the Type I error rate. The rationale and operating steps of the new method are illustrated. The new method can not only examine whether items already flagged by SMP are truly compromised, but also balance ability estimation accuracy against the number of items frozen in the item pool. Simulation studies were conducted to investigate the performance of the SMP_PFI method. Notably, in Zhang's study the correct-response probability of examinees with preknowledge of the items they received was fixed artificially at 0.8 or 0.9, which ignored the interaction between examinee characteristics and item properties. We therefore adopted the modified 3PL model to simulate responses when an examinee received compromised items. The experimental results showed that the new method effectively reduces the Type I error rate, and that users can set the CPFI criterion at any reasonable level to balance ability estimation accuracy against the number of items frozen in the item pool.
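The contrast between a fixed success probability and a model-based one can be sketched as follows. The standard 3PL function is the usual textbook form. The abstract does not give the modified 3PL model, so the preknowledge adjustment below (lowering the item's effective difficulty by a hypothetical shift `delta`) is only an illustrative assumption, as are all function names.

```python
import math
import random


def p_3pl(theta, a, b, c):
    """Standard 3PL probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def p_with_preknowledge(theta, a, b, c, delta=1.5):
    """ASSUMPTION: preknowledge is modeled as an easier item (b - delta).

    This is not the paper's modified 3PL model; it only shows a success
    probability that still depends on both the examinee and the item.
    """
    return p_3pl(theta, a, b - delta, c)


def simulate_response(theta, a, b, c, compromised, rng, delta=1.5):
    """Draw a 0/1 response for one examinee on one item."""
    if compromised:
        p = p_with_preknowledge(theta, a, b, c, delta)
    else:
        p = p_3pl(theta, a, b, c)
    return 1 if rng.random() < p else 0


# Usage: two examinees with preknowledge of the same hard item.
rng = random.Random(0)
for theta in (-1.0, 1.0):
    p = p_with_preknowledge(theta, a=1.2, b=1.0, c=0.2)
    print(theta, round(p, 3), simulate_response(theta, 1.2, 1.0, 0.2, True, rng))
```

Unlike a constant 0.8 or 0.9, the probability here rises with ability and falls with item difficulty, so a low-ability examinee gains less from preknowledge of a hard item than a high-ability examinee does.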

Key words: sequential monitoring procedure, person fit index, computerized adaptive testing, test security, ability estimation accuracy

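Guo Lei, Liu Wei. Quality Monitoring Technique Combining Bayesian Method with Sequential Procedure in CAT System[J]. Journal of Psychological Science (心理科学), 2018, 41(1): 189-195.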
