Journal of Psychological Science (心理科学) ›› 2019, Vol. ›› Issue (1): 163-169.

• Statistics, Measurement, and Methods •

Robust Estimation of Examinee Ability under the Four-Parameter Weighted Logistic Model

Yun Mei (梅云)1, Xiao-Zhu Jian (简小珠)1,2, Jian-Ping Liu (刘建平)3

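    1. Jiangxi Normal University
    2. Jinggangshan University
    3. School of Psychology, Jiangxi Normal University
  • Received: 2017-11-17 Revised: 2018-09-07 Published: 2019-01-20 Released: 2019-01-20
  • Corresponding author: Jian-Ping Liu (刘建平)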

The Ability Overestimation and Ability Underestimation of the Examinee under the Weighted-Score Logistic Model

Yun MEI1, Xiao-Zhu Jian2,3

  • Received: 2017-11-17 Revised: 2018-09-07 Online: 2019-01-20 Published: 2019-01-20

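Abstract (摘要): A test scenario with known item parameters and known examinee scores was designed, and ability was estimated under the two-, three-, and four-parameter weighted Logistic models. The ability steps between adjacent score levels were found to be evenly spaced, so examinee scores reflect the score-weighting effect of polytomous scoring well. Under the two-parameter weighted Logistic model, examinee parameter estimates are disturbed: guessing leads to ability overestimation, and careless errors lead to ability underestimation. Under the three-parameter weighted Logistic model with the c parameter, ability overestimation does not appear or is not pronounced. Under the three-parameter weighted Logistic model with the γ parameter, ability underestimation does not appear or is not pronounced. Under the four-parameter weighted Logistic model, neither overestimation nor underestimation appears or is pronounced, which makes the four-parameter weighted Logistic model a good method for robust ability estimation.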

Keywords (关键词): weighted Logistic model, guessing, careless errors, ability overestimation, ability underestimation
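
For orientation, the dichotomous four-parameter Logistic curve that the c and γ parameters refer to is commonly written as below. The notation is an assumption and is not taken from the paper: θ is ability, a_j discrimination, b_j difficulty, c_j the lower asymptote (guessing), γ_j the upper asymptote (careless errors), and D a scaling constant. The paper's weighted-score extension adds score weights to the dichotomous model, and that extension is not reproduced here.

```latex
P_j(\theta) = c_j + (\gamma_j - c_j)\,
              \frac{1}{1 + \exp\!\left[-D\,a_j\,(\theta - b_j)\right]}
```

Setting γ_j = 1 gives the three-parameter form with c, setting c_j = 0 gives the three-parameter form with γ, and setting both gives the two-parameter form.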

Abstract: This study is carried out under the weighted-score logistic model (WSLM) proposed by Jian, Dai, & Dai (2016). On the basis of the item emphases of polytomously scored items, the WSLM adds weighted-score parameters to the dichotomous logistic model. Because the dichotomous model has at least five forms, the weighted-score logistic model likewise has five forms: the one-parameter WSLM, the two-parameter WSLM, the three-parameter WSLM with the c parameter, the three-parameter WSLM with the γ parameter, and the four-parameter WSLM.

Educational tests are subject to response disturbances such as random guessing, carelessness, and transcription errors. Previous research has shown that, in paper-and-pencil tests and in computerized adaptive testing, aberrant responses such as careless errors and lucky guesses can cause substantial bias in ability estimates. Mislevy & Bock (1982) proposed the biweight estimator and compared it with the maximum likelihood estimator; their results showed that the biweight estimator typically reduces bias and thereby dampens measurement disturbances. The three-parameter and four-parameter logistic IRT models, Huber robust estimation, and other methods have also been proposed to address response disturbances such as random guessing and carelessness.

This paper compares four models for robustifying ability estimates using an example test: the two-parameter WSLM, the three-parameter WSLM with the c parameter, the three-parameter WSLM with the γ parameter, and the four-parameter WSLM. Second, three simulation studies, one for each of three test cases, are presented to compare four approaches: 2PM-MLE, biweight estimation, Huber estimation, and 4PM-Robust estimation.

The hypothetical test instrument contains 34 items, with difficulty thresholds b ~ N(0,1) and log(a) ~ N(0,1). A 35th item has a difficulty threshold that ranges from -4.0 to 4.0. The ability of a middle-ability examinee is estimated from the responses to the 34 items of the basic test under the two-parameter logistic model, and this estimate serves as the reference value for the other three models.

Under the two-parameter WSLM, examinee ability is overestimated when guessing occurs on difficult items and underestimated when slipping occurs on easy items. Under the three-parameter WSLM with the c parameter, the overestimation is rectified, but the underestimation persists when examinees miss easy items. Under the three-parameter WSLM with the γ parameter, the underestimation is rectified well when examinees miss easy items, but the overestimation persists when examinees answer difficult items correctly. Under the four-parameter WSLM, which contains both the c and γ parameters, the underestimation is rectified well when examinees miss easy items, and the overestimation is also rectified well when low-ability examinees answer difficult items correctly by luck. The four-parameter WSLM therefore yields robust ability estimates when response disturbances such as random guessing and careless errors are present in a test.

Key words: weighted Logistic model, guessing phenomena, random error phenomena, ability overestimation, ability underestimation
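
The test design described in the abstract (34 base items with b ~ N(0,1) and log(a) ~ N(0,1), plus a 35th item whose difficulty varies) can be illustrated with a minimal sketch. It uses the ordinary dichotomous 2PL and 4PL curves, not the authors' weighted-score model, and a plain grid search rather than their estimator. The response pattern (correct exactly when b < 0), the values c = 0.2 and γ = 0.95, the 35th-item discrimination `a35 = 1.0`, the random seed, and the helper names `p4pl`, `mle_theta`, and `shift` are all assumptions made for illustration.

```python
import numpy as np

GRID = np.linspace(-4.0, 4.0, 801)  # candidate ability values for the grid search

def p4pl(theta, a, b, c=0.0, gamma=1.0):
    """Dichotomous 4PL curve: c is the lower asymptote, gamma the upper one."""
    return c + (gamma - c) / (1.0 + np.exp(-a * (theta - b)))

def mle_theta(u, a, b, c=0.0, gamma=1.0):
    """Grid-search maximum-likelihood ability estimate for 0/1 responses u."""
    p = p4pl(GRID[:, None], a, b, c, gamma)      # shape: (grid points, items)
    p = np.clip(p, 1e-12, 1.0 - 1e-12)           # keep both logarithms finite
    loglik = (u * np.log(p) + (1 - u) * np.log(1 - p)).sum(axis=1)
    return float(GRID[np.argmax(loglik)])

# Hypothetical 34-item base test drawn from the stated distributions.
rng = np.random.default_rng(0)
b = rng.normal(0.0, 1.0, 34)
a = np.exp(rng.normal(0.0, 1.0, 34))             # log(a) ~ N(0, 1)
u = (b < 0).astype(float)                        # assumed middle-ability pattern

def shift(b35, u35, c=0.0, gamma=1.0, a35=1.0):
    """Change in the ability estimate when a 35th response is appended."""
    ref = mle_theta(u, a, b, c, gamma)
    new = mle_theta(np.append(u, u35), np.append(a, a35),
                    np.append(b, b35), c, gamma)
    return new - ref

for b35 in (-4.0, -2.0, 2.0, 4.0):
    u35 = 1.0 if b35 > 0 else 0.0                # lucky hit on hard, miss on easy
    print(f"b35={b35:+.1f} u35={int(u35)} "
          f"2PL shift={shift(b35, u35):+.2f} "
          f"4PL shift={shift(b35, u35, c=0.2, gamma=0.95):+.2f}")
```

Printing the two shifts side by side makes it possible to check, for this toy setup, how much an unexpected response on the 35th item moves the estimate under each model. The numbers depend on the assumed parameters and are not a reproduction of the paper's results.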