Unifying GT and IRT under the GLMM Framework (在GLMM框架下统一GT与IRT)

Journal of Psychological Science (心理科学), 2021, (2): 449-456.


薛明锋¹,陈平¹,刘拓²,甄锋泉³

1. 北京师范大学
2. 天津师范大学
3. 华南师范大学

Received: 2019-05-09 Revised: 2020-02-22 Online: 2021-03-20 Published: 2021-03-20
Corresponding author: 陈平

Using GLMM to Unify GT and IRT

Ming-Feng XUE¹, Ping CHEN¹, Tour LIU², Feng-Quan ZHEN³

1. Beijing Normal University
2. Tianjin Normal University
3. South China Normal University

Received: 2019-05-09 Revised: 2020-02-22 Online: 2021-03-20 Published: 2021-03-20
Contact: Ping Chen

Abstract

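Summary: This paper is the first to propose using the generalized linear mixed model (GLMM) to unify generalizability theory (GT) and item response theory (IRT), so that the estimates required by both GT and IRT are obtained in a single analysis. The simulation study shows that, compared with expected mean squares (EMS), the traditional method for estimating GT variance components, GLMM yields more accurate variance components, G coefficients, and Φ coefficients, and that the item difficulty estimates from GLMM are more precise than those from the traditional Rasch model. The empirical study demonstrates the application of GLMM to real psychometric data.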

Keywords: GLMM, generalizability theory, item response theory, psychological measurement

Abstract: The quality of psychometric instruments (e.g., ability tests and personality scales) should be examined before they are used. Because of the drawbacks of classical test theory, generalizability theory (GT) and item response theory (IRT) are becoming popular. Although some efforts have been made to combine GT and IRT, most studies still apply them separately, because previous models such as GRIM (Generalizability in Item Response Theory Modeling) and HRM (Hierarchical Rater Model) are somewhat complicated and lack software to fit them. This paper therefore proposes the generalized linear mixed model (GLMM) to unify GT and IRT. GLMM is an extension of the linear mixed model. By employing various link functions, GLMM no longer restricts the response variable to continuous data, so it is suitable for discrete data such as dichotomous responses. Unifying GT and IRT under the GLMM framework has several advantages. First, GLMM provides, at the same time, the variance components that are central to GT and the difficulty parameters that IRT requires. Second, GLMM is simpler than previous models. Third, GLMM can be fitted in many programs, such as the lme4 package in R and HLM. Finally, compared with expected mean squares (EMS), the traditional method for estimating variance components in GT, GLMM avoids violating the interval-scale assumption, which improves the reliability of the analysis. To illustrate the feasibility and strengths of GLMM, a simulation study and an empirical study were conducted. In the simulation study, σ_p^2 = 2 × π^2/3 and σ_i^2 = 1 × π^2/3. These values were set as multiples of π^2/3 because the default residual variance of a binomial GLMM with a logit link function is π^2/3, so the true variance components stand in a simple proportional relationship.
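The proportional relationship among the simulated variance components can be worked out explicitly. The block below is a sketch under assumptions that the abstract does not state: a single-facet crossed person × item design, the logit-scale residual π^2/3 playing the role of σ_(pi,e)^2, and n_i denoting the number of items (a symbol introduced here for illustration). Under those assumptions, the standard GT formulas for the G and Φ coefficients reduce to simple functions of n_i.

```latex
% Assumed: crossed p x i design; n_i = number of items (illustrative symbol).
\begin{align*}
\sigma_p^2 : \sigma_i^2 : \sigma_{pi,e}^2
  &= \tfrac{2\pi^2}{3} : \tfrac{\pi^2}{3} : \tfrac{\pi^2}{3}
   = 2 : 1 : 1
  \quad (50\%,\ 25\%,\ 25\%) \\[4pt]
G &= \frac{\sigma_p^2}{\sigma_p^2 + \sigma_{pi,e}^2 / n_i}
   = \frac{2 n_i}{2 n_i + 1} \\[4pt]
\Phi &= \frac{\sigma_p^2}{\sigma_p^2 + (\sigma_i^2 + \sigma_{pi,e}^2) / n_i}
   = \frac{n_i}{n_i + 1}
\end{align*}
```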
Person effects and item effects were randomly drawn from normal distributions with variances σ_p^2 and σ_i^2, respectively, and the item effect was treated as an easiness parameter. The sum of the person effect and the item effect was transformed by the inverse logit function into the probability of a correct response, and each binary response was then drawn from a Bernoulli distribution with that probability. The data were analyzed with GLMM. For comparison, EMS and the rasch function in the ltm package were also used. The results showed that GLMM provided more precise estimates of the variance components, the G coefficient, and the Φ coefficient than EMS did, and that the difficulty parameters estimated with GLMM were more precise than their counterparts from the ltm package. The empirical data were the LSAT dataset from the ltm package, in which 1000 subjects answered 5 dichotomous questions. The percentages of σ_p^2 from GLMM and EMS were close, but the percentages of σ_i^2 and σ_(pi,e)^2 differed considerably between the methods. In addition, the difficulty parameters estimated with GLMM and with the traditional Rasch model were close. Compared with traditional GT and IRT, GLMM produces reliable and precise results and, in particular, no longer relies on the interval-scale assumption that EMS requires. It is therefore appropriate to combine GT and IRT through GLMM when analyzing psychometric instruments.
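The data-generating steps and the EMS baseline described above can be sketched in Python with NumPy. This is an illustration, not the authors' code: the sample sizes (500 persons, 20 items), the random seed, and all function names are assumptions chosen for the example, and the design is assumed to be a crossed person × item layout with one response per cell. The EMS estimates are computed on the observed 0/1 scale, so they are not directly comparable to the logit-scale variances used to generate the data.

```python
import numpy as np


def simulate_responses(n_persons, n_items, var_person, var_item, rng):
    """Draw binary responses from a Rasch-type model on the logit scale."""
    person = rng.normal(0.0, np.sqrt(var_person), size=n_persons)
    item = rng.normal(0.0, np.sqrt(var_item), size=n_items)  # easiness
    logit = person[:, None] + item[None, :]
    prob = 1.0 / (1.0 + np.exp(-logit))  # inverse logit
    return rng.binomial(1, prob)  # Bernoulli draw for every person-item cell


def ems_variance_components(y):
    """EMS (ANOVA) variance components for a crossed p x i design."""
    y = np.asarray(y, dtype=float)
    n_p, n_i = y.shape
    grand = y.mean()
    person_means = y.mean(axis=1)
    item_means = y.mean(axis=0)
    ms_p = n_i * np.sum((person_means - grand) ** 2) / (n_p - 1)
    ms_i = n_p * np.sum((item_means - grand) ** 2) / (n_i - 1)
    resid = y - person_means[:, None] - item_means[None, :] + grand
    ms_res = np.sum(resid ** 2) / ((n_p - 1) * (n_i - 1))
    return {
        "p": (ms_p - ms_res) / n_i,
        "i": (ms_i - ms_res) / n_p,
        "pi,e": ms_res,
    }


def g_and_phi(vc, n_items):
    """G (relative) and Phi (absolute) coefficients for n_items items."""
    g = vc["p"] / (vc["p"] + vc["pi,e"] / n_items)
    phi = vc["p"] / (vc["p"] + (vc["i"] + vc["pi,e"]) / n_items)
    return g, phi


# Hypothetical settings: 500 persons, 20 items, seed 2021.
rng = np.random.default_rng(2021)
unit = np.pi ** 2 / 3  # residual variance of a logit-link binomial GLMM
responses = simulate_responses(500, 20, 2 * unit, 1 * unit, rng)
vc = ems_variance_components(responses)
g, phi = g_and_phi(vc, n_items=20)
print(vc, g, phi)
```

Fitting the GLMM itself is not shown here; the abstract names the lme4 package in R and HLM as programs that can do so.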

Keywords: GLMM, generalizability theory, item response theory, psychometrics

薛明锋, 陈平, 刘拓, 甄锋泉. 在GLMM框架下统一GT与IRT[J]. 心理科学, 2021, (2): 449-456.

Ming-Feng XUE, Ping CHEN, Tour LIU, Feng-Quan ZHEN. Using GLMM to Unify GT and IRT[J]. Journal of Psychological Science, 2021, (2): 449-456.
