Psychological Science ›› 2018, Vol. ›› Issue (6): 1492-1499.

Three Interval Estimation Methods for Attribute-Level Classification Consistency in Cognitive Diagnostic Assessment

Wen-Yi WANG1,Xiao-Ming FANG2, 3   

    1. Jiangxi Normal University
    2. Jiangxi Normal University
    3.
  • Received:2017-11-28 Revised:2018-06-22 Online:2018-11-20 Published:2018-11-20

Three Methods for Interval Estimation of Attribute Classification Consistency Reliability in Cognitive Diagnosis

汪文义,方小婷,叶宝娟   

  1. Jiangxi Normal University
  • Corresponding author: 叶宝娟

Abstract: A point estimate carries limited information about a population parameter and does not indicate how far the estimate may lie from that parameter. A confidence interval conveys more about the precision of the estimate. In evaluating test quality, the confidence interval of composite reliability has received increasing attention in recent years. However, no research has addressed interval estimation of reliability in cognitive diagnostic assessment: researchers typically report only a point estimate of reliability and omit the confidence interval. A point estimate alone is not sufficient; the confidence interval should also be reported, because it indicates the range of estimation error. This study compares three interval estimation methods for the reliability of cognitive diagnostic assessment, based on the DINA model and attribute-level classification consistency (Wang, et al., 2015): the Bootstrap method, the parallel test method, and the parallel test pairing method. Each method yields a standard error, a mean, and a confidence interval for attribute-level classification consistency reliability. The simulation design considered five factors: (a) the number of attributes (k = 5 and 7); (b) the number of test items (t = 15 and 31 for 5 attributes; t = 21 and 42 for 7 attributes); (c) item quality [U(0.05, 0.25) and U(0.25, 0.45)]; (d) sample size (n = 500, 1000, and 1500); and (e) the method for calculating the confidence interval of attribute-level classification consistency reliability (Bootstrap method, parallel test method, and parallel test pairing method).
In total, 72 treatment conditions were generated from the 5-factor simulation design above (72 = 2×3×2×2×3). The simulation results indicated the following. (1) Whether the test measured 5 or 7 attributes under an independent attribute hierarchy, the standard errors and confidence intervals obtained by the three interval estimation methods (Bootstrap method, parallel test method, and parallel test pairing method) were all affected by item quality, sample size, and test length. As the number of items and the number of examinees increased, the standard error and the width of the confidence interval tended to decrease; as item quality decreased, both increased. (2) Whether the test measured 5 or 7 attributes under an independent attribute hierarchy, the mean reliability estimated by the three methods was affected by item quality, sample size, and test length. The mean reliability increased markedly as test length increased, changed little as the number of examinees increased, and declined markedly as item quality declined. (3) The parallel test method and the Bootstrap method yielded similar standard errors and confidence intervals when the test was short and the sample was small. As the number of examinees increased, however, the estimation accuracy of the Bootstrap method improved rapidly, and with many items and a large sample its results were close to those of the parallel test pairing method. The Bootstrap method required the least time, whereas the parallel test pairing method is difficult to implement in practice. The Bootstrap method is therefore recommended for estimating the confidence interval of attribute-level classification consistency reliability in cognitive diagnostic assessment.
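
The count of 72 conditions can be checked by enumerating the design directly. The following is a minimal sketch, not the authors' simulation code; the dictionary keys and variable names are illustrative assumptions, and the only point it makes is that test length is nested within the number of attributes while the other factors are fully crossed.

```python
from itertools import product

# Test lengths are nested within the number of attributes (values from the abstract).
test_lengths = {5: (15, 31), 7: (21, 42)}
# Bounds of the uniform distributions that define item quality.
item_quality = ((0.05, 0.25), (0.25, 0.45))
sample_sizes = (500, 1000, 1500)
methods = ("Bootstrap", "parallel test", "parallel test pairing")

# One dictionary per treatment condition; the key names are hypothetical.
conditions = [
    {"attributes": k, "items": t, "quality": q, "n": n, "method": m}
    for k, lengths in test_lengths.items()
    for t, q, n, m in product(lengths, item_quality, sample_sizes, methods)
]

print(len(conditions))  # 72
```

Each attribute count contributes 2 × 2 × 3 × 3 = 36 conditions, which gives the 72 reported above.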

Key words: attribute-level classification consistency, interval estimation, bootstrap method, parallel test method, parallel test pairing method

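Abstract (Chinese version): Three methods are introduced for estimating the confidence interval of attribute classification consistency reliability in cognitive diagnosis: the Bootstrap method, the parallel test method, and the parallel test pairing method. A simulation study verified and compared the performance of the three methods. The results showed that the parallel test method and the Bootstrap method gave similar standard errors and confidence intervals when the number of examinees and the number of items were both small. As the number of examinees increased, however, the estimation accuracy of the Bootstrap method improved quickly, and with a large sample and many items its results were essentially close to those of the parallel test pairing method. The Bootstrap method required the least time, whereas the parallel test pairing method was computationally complex and time-consuming. The Bootstrap method is recommended for estimating the confidence interval of attribute classification consistency reliability in cognitive diagnosis.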
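
A percentile bootstrap of the kind recommended here might be sketched as follows for a single attribute. This is an illustration under stated assumptions, not the authors' procedure: it assumes the consistency index is the mean over examinees of p² + (1 − p)², where p is the examinee's posterior probability of mastering the attribute, and it resamples those posterior probabilities directly rather than re-estimating the DINA model in each replication. The function names and the toy data are hypothetical.

```python
import random
import statistics

def attribute_consistency(post_probs):
    """Assumed index: mean of p^2 + (1 - p)^2 over examinees."""
    return sum(p * p + (1 - p) ** 2 for p in post_probs) / len(post_probs)

def bootstrap_interval(post_probs, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap over examinees for one attribute."""
    rng = random.Random(seed)
    n = len(post_probs)
    # Resample examinees with replacement and recompute the index each time.
    estimates = sorted(
        attribute_consistency(rng.choices(post_probs, k=n))
        for _ in range(n_boot)
    )
    tail = (1 - level) / 2
    lower = estimates[round(tail * n_boot)]
    upper = estimates[round((1 - tail) * n_boot) - 1]
    return statistics.mean(estimates), statistics.stdev(estimates), lower, upper

# Toy posterior probabilities (made up for illustration, not data from the study).
toy_rng = random.Random(1)
toy_post = [toy_rng.betavariate(0.5, 0.5) for _ in range(500)]
mean, se, lower, upper = bootstrap_interval(toy_post)
print(f"mean={mean:.3f}, SE={se:.3f}, 95% CI=[{lower:.3f}, {upper:.3f}]")
```

The function returns the same three quantities the abstract reports for each method: the mean, the standard error, and the confidence interval.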

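Key words (Chinese version): attribute classification consistency reliability, interval estimation, Bootstrap method, parallel test method, parallel test pairing method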