Development of Item Selection Strategies for Multidimensional CAT with Polytomously Scored Items

Journal of Psychological Science (心理科学), 2018, Issue (6): 1500-1507.


Han Yuting¹,², Gao Xuliang³, Wang Daxun², Cai Yan², Tu Dongbo²

1. Beijing Normal University
2. Jiangxi Normal University
3. School of Psychology, Jiangxi Normal University

Received: 2017-06-12 Revised: 2018-02-22 Published: 2018-11-20 Released online: 2018-11-20
Corresponding author: Tu Dongbo

Item Selection Methods in Multidimensional Polytomous Computerized Adaptive Testing

Received:2017-06-12 Revised:2018-02-22 Online:2018-11-20 Published:2018-11-20
Contact: Tu Dong-Bo

Abstract

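Summary: This study developed two new item selection strategies for multidimensional computerized adaptive testing with polytomously scored items (PMCAT): revised continuous entropy (RCEM) and modified posterior expected KL information (MKB). Both were compared with existing PMCAT item selection strategies. Monte Carlo simulation results showed that the two new strategies yielded higher estimation accuracy than the original methods, and that RCEM had the lowest exposure rate of all the strategies. The new strategies achieve satisfactory estimation accuracy and exposure control, and provide new methodological support for applying PMCAT in practice.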

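Keywords: multidimensional item response theory, multidimensional computerized adaptive testing, polytomously scored items, polytomously scored MCAT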

Abstract: Multidimensional computerized adaptive testing (MCAT) combines computerized adaptive testing with multidimensional item response theory (MIRT). It shows great potential for obtaining diagnostic information from examinees' item responses accurately and efficiently, with the goal of extracting as much information as possible about the multiple abilities being measured. Most current MCAT research focuses on dichotomous data, yet polytomous data are common in educational and psychological testing practice. Polytomously scored items provide more information than dichotomously scored items: they can measure concepts and skills in greater depth and can shorten the test while achieving the same precision, particularly in the CAT context. This paper proposes two new item selection methods for MCAT with polytomously scored items (PMCAT). The first, RCEM, replaces the fixed prior distribution in the calculation of the posterior distribution with the continually updated current posterior distribution (Equations 3-5). By using this updated prior, RCEM can extract more information from the responses. The second, MKB, weights the KL information by the entire posterior distribution. Compared with the KI method, the numerator in Equation (8) considers all possible ability vectors and weights them accordingly. Because MKB takes a Bayesian approach and does not require a point estimate of the ability vector, which may be inaccurate in the early stages of testing, it may be more informative than the KI method.

To compare the new RCEM and MKB methods with the D-optimality, KI, and MUI methods proposed by Lin (2012), a Monte Carlo simulation study was conducted under the same simulation conditions as those of Lin (2012). The item pool consisted of 400 items following the MGPCM, with the item discrimination parameters on the two dimensions and the item threshold parameters randomly drawn from specified distributions. As in previous MCAT studies, examinee responses were simulated with true abilities on a two-dimensional grid spanning a square region of the ability space. Crossing 13 discrete points on each of the two dimensions generated a grid of 169 ability vectors. To reduce random error, 500 replications were run at each point. The MCAT was fixed-length, with the test length set to 25 items. The first item was selected at random. Expected a posteriori (EAP) estimation was used to estimate the latent traits during the test, with the standard bivariate normal distribution as the prior. The Gauss–Hermite numerical integration formulas from Glas (1992) were used, and the integration was taken over a fixed range of ability values.

The simulation results showed that the estimation accuracy and test security of the proposed methods under PMCAT were acceptable. RCEM and MKB were more accurate than CEM (MUI) and KI, respectively (see Table 2). Taking item pool security into account, the proposed RCEM performed best overall: it had the lowest item exposure rate together with relatively high accuracy (see Table 3). Consistent with the previous study, the D-optimality, MUI, and proposed RCEM and MKB methods showed similar estimation and item selection patterns, whereas KI differed from them (see Table 4 and Figure 2). Polytomously scored items are widely used in Likert-type scales in psychological inventories and are also common in achievement testing. Given the advantages the two new methods showed in PMCAT in this study, they have good prospects for application.
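
The sequential posterior updating that EAP estimation relies on, and that RCEM builds on by using the current posterior in place of a fixed prior, can be illustrated with a small sketch. The Python below is only an illustration under stated assumptions, not the authors' implementation. The MGPCM parameterization (category logit equal to k times a'θ minus the cumulative thresholds), the evenly spaced 13 × 13 grid on [-3, 3] used here in place of Gauss–Hermite nodes, and the item parameters and responses are all hypothetical choices made for the example.

```python
import math


def mgpcm_probs(theta, a, b):
    """Category probabilities for one item under an assumed two-dimensional GPCM.

    theta: (theta1, theta2) ability vector
    a:     (a1, a2) discrimination parameters
    b:     step thresholds b_1..b_K (category 0 has no threshold)
    """
    lin = sum(aj * tj for aj, tj in zip(a, theta))
    logits = [0.0]
    cum_b = 0.0
    for k, bk in enumerate(b, start=1):
        cum_b += bk
        logits.append(k * lin - cum_b)
    top = max(logits)  # subtract the maximum for numerical stability
    expd = [math.exp(z - top) for z in logits]
    total = sum(expd)
    return [e / total for e in expd]


def make_grid(lo=-3.0, hi=3.0, n=13):
    """Evenly spaced n x n grid of ability vectors (range is an assumption)."""
    step = (hi - lo) / (n - 1)
    pts = [lo + i * step for i in range(n)]
    return [(x, y) for x in pts for y in pts]


def standard_bivariate_normal_prior(grid):
    """Normalized standard bivariate normal weights on the grid."""
    w = [math.exp(-0.5 * (x * x + y * y)) for x, y in grid]
    s = sum(w)
    return [wi / s for wi in w]


def update_posterior(weights, grid, a, b, response):
    """Multiply current weights by the likelihood of the response, then renormalize."""
    new = [w * mgpcm_probs(t, a, b)[response] for w, t in zip(weights, grid)]
    s = sum(new)
    return [v / s for v in new]


def eap(weights, grid):
    """Posterior mean of each ability dimension."""
    return (sum(w * t[0] for w, t in zip(weights, grid)),
            sum(w * t[1] for w, t in zip(weights, grid)))


grid = make_grid()
prior = standard_bivariate_normal_prior(grid)
posterior = prior

# Hypothetical three-category items: (discriminations, thresholds)
items = [((1.2, 0.3), (-0.5, 0.5)), ((0.2, 1.1), (-0.4, 0.6))]
responses = [2, 0]  # highest category on item 1, lowest on item 2

# After each response, the current posterior serves as the prior for the next step.
for (a, b), x in zip(items, responses):
    posterior = update_posterior(posterior, grid, a, b, x)

theta_hat = eap(posterior, grid)
print(theta_hat)
```

In this toy run, the first item loads mainly on the first dimension and receives the highest score, while the second loads mainly on the second dimension and receives the lowest score, so the estimate moves up on the first dimension and down on the second.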

Key words: multidimensional item response theory, multidimensional computerized adaptive test, polytomously-scored items, polytomously-scored MCAT

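Han Yuting, Gao Xuliang, Wang Daxun, Cai Yan, Tu Dongbo. Item Selection Methods in Multidimensional Polytomous Computerized Adaptive Testing [J]. Journal of Psychological Science (心理科学), 2018, (6): 1500-1507.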