Psychological Science ›› 2012, Vol. 35 ›› Issue (2): 446-451.

Anchor Parameter Drift Test in Equating a Large-scale Examination in China

Tie-Chuan LIU1, Hai-Qi DAI2, Yu ZHAO3,4

  1. Jiangxi Normal University
  2. School of Psychology, Jiangxi Normal University, 99 Ziyang Avenue, Nanchang, Jiangxi
  3.
  4. Gannan Medical University
  • Corresponding author: Hai-Qi DAI
  • Received: 2010-11-15 Revised: 2011-05-13 Online: 2012-03-20 Published: 2012-03-20

Abstract: In large-scale examinations, common items (anchors) are frequently embedded in different test forms for equating. The Non-Equivalent groups with Anchor Test (NEAT) design requires that the anchors not only be representative of the test content but also function equivalently across forms. Owing to the influence of construct-irrelevant factors, some anchors' parameters may change substantially across administrations; Goldstein (1983) named this phenomenon item parameter drift (IPD). Drifted anchors may introduce systematic error into equating (Hu, Rogers, & Vukmirovic, 2008), yet few studies have addressed this issue in China. The present paper first outlines several approaches to detecting drifted items and minimizing their effect on equating. Two popular methods for detecting Differential Item Functioning (DIF), the Mantel-Haenszel (MH) test and logistic regression, were then used to examine the anchors linking two forms of a large-scale examination in China. The MH test was performed with DIFAS 4.0 and logistic regression with R. To control Type I error, ETS's classification criteria and pseudo-R-squared effect sizes were also considered when applying the MH test and logistic regression. After deletion of the drifted anchors, the two test forms were fitted and equated under the Three-Parameter Logistic Model (3PLM), with item parameters estimated in BILOG-MG; factor analysis suggested that the 3PLM was appropriate for both forms. The results showed that: (1) twenty-two anchors were flagged for parameter drift, and both the difficulty and the discrimination parameters could drift across forms; (2) twenty-one of the drifted anchors fit the 3PLM well in the old form, but sixteen of them misfit in the new form; (3) equating results obtained with the Mean/Sigma method before and after deletion of the drifted anchors differed considerably. Therefore, including drifted anchors in equating may introduce systematic error.
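The MH screening step described above can be sketched as follows. This is a minimal illustration, not the DIFAS 4.0 implementation: it assumes dichotomously scored anchor responses stratified into matched total-score groups, and the function names (`mh_delta`, `ets_category`) are hypothetical.

```python
import numpy as np

def mh_delta(tables):
    """Mantel-Haenszel common odds ratio and ETS delta for one anchor.

    tables -- one 2x2 count table per matched total-score stratum:
        [[reference_correct, reference_incorrect],
         [focal_correct,     focal_incorrect]]
    Returns (alpha_MH, delta_MH), with delta_MH = -2.35 * ln(alpha_MH).
    """
    num = den = 0.0
    for table in tables:
        (a, b), (c, d) = np.asarray(table, dtype=float)
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    alpha = num / den
    return alpha, -2.35 * np.log(alpha)

def ets_category(delta):
    """Simplified ETS A/B/C rule based on |delta| alone (the full
    criterion also requires the MH chi-square to be significant)."""
    magnitude = abs(delta)
    if magnitude < 1.0:
        return "A"  # negligible DIF
    if magnitude < 1.5:
        return "B"  # moderate DIF
    return "C"      # large DIF: candidate drifted anchor
```

An anchor whose odds of success are identical in both administrations at every score level yields alpha close to 1 and delta close to 0 (category "A"); a large |delta| flags the anchor as a drift candidate.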
Testing anchors for parameter drift should therefore be treated as a necessary step in equating different test forms. Such methods could also be applied in longitudinal psychological research to maintain comparability over time. Limitations of the present study and suggestions for further research on anchor parameter drift are given at the end of the paper; for example, future work may compare detection methods and examine the social and cultural causes of anchor parameter drift.
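The Mean/Sigma linking mentioned above can be sketched as follows. This is a hypothetical illustration under the usual textbook formulation (the function names are ours; the paper's actual estimation was done in BILOG-MG): the slope and intercept are computed from the anchor difficulty estimates on the two forms, after drifted anchors are removed, and then applied to place the new form's 3PLM parameters on the old form's scale.

```python
import numpy as np

def mean_sigma(b_old, b_new):
    """Mean/Sigma transformation constants from the anchor
    difficulty (b) estimates on the old and new forms."""
    b_old, b_new = np.asarray(b_old, float), np.asarray(b_new, float)
    A = b_old.std(ddof=1) / b_new.std(ddof=1)  # slope: ratio of SDs
    B = b_old.mean() - A * b_new.mean()        # intercept
    return A, B

def rescale_3pl(a, b, c, A, B):
    """Place new-form 3PLM item parameters on the old form's scale."""
    return a / A, A * b + B, c  # guessing parameter c is scale-invariant
```

For example, when the anchor difficulties have the same spread on both forms, the slope A is 1 and the transformation reduces to a mean shift; a drifted anchor left in `b_old`/`b_new` distorts both constants, which is why the deletion step matters.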

Key words: anchor parameter drift, equating error, differential item functioning, Mean/Sigma method, characteristic curve method
