心理科学 (Journal of Psychological Science), 2014, Vol. 37, Issue 1: 212-216.

• Statistics and Measurement •

Item Selection Methods for Balancing Test Efficiency with Item Bank Usage Efficiency in CD-CAT

Wang Wenyi (汪文义), Ding Shuliang (丁树良), Song Lihong (宋丽红)

  1. Jiangxi Normal University
  • Received: 2012-10-22 Revised: 2013-10-08 Online: 2014-01-20 Published: 2014-01-20
  • Corresponding author: Ding Shuliang

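Summary: Existing item selection strategies in CD-CAT emphasize test efficiency and pay too little attention to item bank usage. To address this problem, two new item selection strategies based on the DINA model, KLED and RHA, are introduced, and HA is also examined in a simulation study. The results show that PWKL and KLED have an advantage only in test efficiency; that stratifying KLED by attribute vector improves item bank usage somewhat, and that KLED generalizes more readily than ED to other diagnostic models that have an explicit expression; and that HA, RHA, and RP-PWKL balance test efficiency and item bank usage well, although RP-PWKL requires a threshold for the maximum item exposure rate. Both new item selection methods have some practical value for fixed-length and variable-length CD-CAT.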

Keywords: cognitive diagnostic computerized adaptive testing, item selection strategy, item bank usage, halving algorithm

Abstract: Cognitive diagnostic computerized adaptive testing (CD-CAT) is a popular mode of online testing for cognitive diagnostic assessment (CDA). The key to a CD-CAT program is its item selection method. Three of the most popular methods are based on Kullback-Leibler (KL) information, Shannon entropy (SHE), and the expected discrimination (ED) method. These methods achieve high test efficiency, but they often lead to unbalanced item usage within a pool. A diagnostic test is usually not a high-stakes test, so item overexposure may not be a major concern. Item underexposure, however, wastes the time and money invested in developing each item, and a high test overlap rate produces the effects of intensive practice. Although the restrictive progressive method (RP-PWKL) and the restrictive threshold method (RT-PWKL) have been proposed to balance item exposure control with measurement accuracy, both suppress overexposure by adding a restriction that keeps the maximum exposure rate under a predetermined value, and the rationale for choosing that value needs further consideration. For these reasons, this article proposes two item selection methods for CD-CAT based on the Deterministic Input, Noisy "And" Gate (DINA) model. First, using KL information as the discrimination function of ED, KLED is proposed so that the method can handle cognitive diagnostic models other than the DINA model. Second, following the idea of randomization strategies, in which the item is always selected at random from among the most informative items, the randomization halving algorithm (RHA) is proposed. In RHA, all items within a specified range are available for selection, rather than an arbitrary number of items or only one.
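The halving idea and its randomized variant described above can be sketched in a few lines. The abstract does not give the exact indices, so this is only an illustration under stated assumptions: the halving index is taken to be p(1 - p), where p is the posterior mass of knowledge states whose DINA ideal response to the item is 1, and the randomized rule draws uniformly among items within a tolerance `tol` of the best index. The function names, `tol`, and the item parameters are hypothetical and are not taken from the article.

```python
import itertools
import random

def ideal_response(alpha, q):
    """DINA ideal response: 1 if alpha has every attribute required by q, else 0."""
    return int(all(a >= r for a, r in zip(alpha, q)))

def dina_prob(alpha, q, slip, guess):
    """P(correct | alpha) under DINA: 1 - slip for masters of q, guess otherwise."""
    return 1.0 - slip if ideal_response(alpha, q) else guess

def halving_index(posterior, states, q):
    """Assumed HA index p * (1 - p), where p is the posterior mass of states
    whose ideal response to q is 1. It peaks when the item halves the posterior."""
    p = sum(w for w, alpha in zip(posterior, states) if ideal_response(alpha, q))
    return p * (1.0 - p)

def select_ha(posterior, states, items, used=()):
    """Deterministic halving: the unused item with the largest index."""
    candidates = [j for j in range(len(items)) if j not in used]
    return max(candidates, key=lambda j: halving_index(posterior, states, items[j][0]))

def select_rha(posterior, states, items, used=(), tol=0.0, rng=random):
    """Randomized halving (hypothetical rule): draw uniformly among unused items
    whose index is within `tol` of the best one."""
    candidates = [j for j in range(len(items)) if j not in used]
    scores = {j: halving_index(posterior, states, items[j][0]) for j in candidates}
    best = max(scores.values())
    return rng.choice([j for j in candidates if best - scores[j] <= tol])

def update_posterior(posterior, states, item, response):
    """Bayes update of the knowledge-state posterior after one scored response."""
    q, slip, guess = item
    weights = []
    for w, alpha in zip(posterior, states):
        p = dina_prob(alpha, q, slip, guess)
        weights.append(w * (p if response == 1 else 1.0 - p))
    total = sum(weights)
    return [w / total for w in weights]

# Two attributes, uniform prior over the four knowledge states.
states = list(itertools.product([0, 1], repeat=2))
posterior = [0.25] * 4
# Each item is (q-vector, slip, guess); the parameter values are made up.
items = [((1, 1), 0.1, 0.2), ((1, 0), 0.1, 0.2), ((0, 1), 0.1, 0.2)]

first = select_ha(posterior, states, items)
posterior_after = update_posterior(posterior, states, items[first], response=1)
```

In this toy bank the two single-attribute items split the uniform prior exactly in half, so the deterministic rule always administers the first of them, while the randomized rule can pick either one. That difference is what spreads exposure across items with equally informative attribute vectors.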
Moreover, we show the connection among the KL-based KLED, HA, and RHA: KLED can be regarded as a weighted HA method, weighted by the corresponding item parameters, and HA can be regarded as RHA without the random component among the different item attribute vectors in the Q matrix of the item pool. Two simulation studies are then carried out, one using a simulated item bank and the other based on items calibrated from real data. Eight item selection strategies are considered in these studies: random, posterior-weighted KL (PWKL), RP-PWKL, RT-PWKL, ED, the halving algorithm (HA), KLED, and RHA. In addition, VRP-PWKL and VRT-PWKL are proposed for variable-length CD-CAT as extended versions of RP-PWKL and RT-PWKL. Simulation studies for fixed-length and variable-length CD-CAT are conducted with the eight methods, and the results are compared in terms of pattern and attribute correct classification rates, misclassification rate, item exposure rate, and test overlap rate. The simulation results show that RHA, HA, RP-PWKL, VRP-PWKL, and VRT-PWKL use the item bank in a more balanced way at the cost of a slight decrease in the correct classification rate of knowledge states, and that RHA, HA, VRP-PWKL, and VRT-PWKL can be used for variable-length CD-CAT. Although the simulation results are encouraging, further studies of CD-CAT, for example with different cognitive diagnostic models, are proposed for future investigation.
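The two bank-usage criteria named above, item exposure rate and test overlap rate, can be computed directly from the lists of administered items. The sketch below uses common definitions that are assumed here, not quoted from the article: an item's exposure rate is the share of examinees who received it, and the overlap rate is the mean number of items shared by a pair of examinees divided by the mean test length. The function names and the toy data are hypothetical.

```python
from itertools import combinations

def item_exposure_rates(administered, bank_size):
    """Share of examinees who saw each item in a bank of `bank_size` items."""
    counts = [0] * bank_size
    for items_seen in administered:
        for j in set(items_seen):
            counts[j] += 1
    return [c / len(administered) for c in counts]

def overlap_rate(administered):
    """Mean number of items shared by a pair of examinees, divided by the
    mean test length (assumed definition; equals shared / L for fixed length L)."""
    pairs = list(combinations(administered, 2))
    mean_shared = sum(len(set(a) & set(b)) for a, b in pairs) / len(pairs)
    mean_length = sum(len(t) for t in administered) / len(administered)
    return mean_shared / mean_length

# Three examinees, two items each, drawn from a five-item bank (toy data).
administered = [[0, 1], [0, 2], [0, 3]]
rates = item_exposure_rates(administered, bank_size=5)
overlap = overlap_rate(administered)
```

Here item 0 is seen by every examinee and item 4 by none, which is the kind of unbalanced usage the exposure-control methods are meant to reduce.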

Key words: CD-CAT, item selection methods, item bank usage, halving algorithm