Psychological Science ›› 2018, Vol. ›› Issue (4): 989-995.

Exploring Cognitive Diagnosis Retrofitting and Further Analyses of Language Proficiency Testing: The Case of the Guangzhou English Achievement Examination

Yan-Ting LIN1, Huilin Chen2, 3

  1. Sun Yat-sen University
  2. Shanghai International Studies University
  3. Dept. of Psychology
  • Received: 2017-08-21 Revised: 2018-04-16 Online: 2018-07-20 Published: 2018-07-20

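Exploring the Cognitive Diagnostic Retrofitting and In-Depth Analysis of Language Proficiency Tests: The Case of the Guangzhou English Achievement Examination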

林燕婷1,陈慧麟2,陈劲松3   

  1. Sun Yat-sen University
  2. Shanghai International Studies University
  3. Sun Yat-sen University
  • Corresponding author: 陈劲松

Abstract: Cognitive diagnostic models (CDMs) can provide meaningful diagnostic information about individuals’ knowledge states. Retrofitting CDMs to language tests has recently become increasingly popular. However, existing studies on the topic have suffered from several problems, chiefly incomplete validation procedures, missing item-level fit measures, and superficial analyses. This paper therefore set out to accomplish three tasks. First, it revised the validation procedure and strategy based on previous research and verified the validity of the proposed procedure. Second, it retrofitted an achievement examination with a CDM and conducted in-depth analyses based on the revised validation procedure and strategy. Third, it investigated the language characteristics of English learning among middle school students. The test materials come from the 2015 Guangzhou Middle-School English Achievement Examination. They include sentence completion and reading comprehension, with about 40 items in total. Data from 2718 students who took this examination were analyzed. This research compared two Q-matrices, one constructed from the examination syllabus and the other by an expert panel, and found that the former was less appropriate for cognitive diagnosis. With the revised validation procedures and item-level fit measures, we found that the ability attribute definitions based on the examination syllabus were excessively broad and should be more specific. In comparison, the attribute set and Q-matrix based on the expert panel could be appropriately retrofitted and validated with the procedures and fit measures. Further analyses of the retrofitted test found that: a) proficiency classifications based on the attribute distribution and on the total score differed in determining whether a student passed; whether this was a special case can be a topic for further study; b) the attribute mastery probabilities showed that student mastery was good in general. The mastery probability of attribute AR3 was the lowest, and the hierarchy of attribute AR3 indicated that students need to pay more attention to learning it; c) there was no significant gender difference in mastering attribute AR4, but there were significant gender differences in the mastery probabilities of the other attributes (ts(1, 2716) > -2.51, ps < .012), with girls’ level of mastery significantly better than boys’, suggesting that boys need to strengthen their English study; d) the distribution of attribute patterns for Language Knowledge and Application showed that the attribute profile “11111111” accounted for the largest proportion (29%), and the attribute profile “1111” for Reading Comprehension accounted for 23%. This indicates that the language attributes are interrelated and provides further evidence of the suitability of the G-DINA model for diagnosing language tests or language skills; e) to test the external validity of the results, the students' listening and writing performance was used as an external criterion. The correlations with most attribute probabilities were statistically and substantively significant, suggesting good external validity. In general, this study can lay a foundation for further developing language proficiency tests for cognitive diagnostic purposes.
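As a rough illustration of the ingredients named in the abstract, the sketch below computes one item's success probability under the identity-link form of the G-DINA model, given a single Q-matrix row and a student's attribute profile. The function name `gdina_prob`, the four-attribute Q-matrix row, and every delta value are hypothetical choices made for illustration; none of them are taken from the study's data or estimates.

```python
from itertools import combinations


def gdina_prob(alpha, q_row, delta):
    """Identity-link G-DINA success probability for one item.

    alpha: tuple of 0/1 mastery indicators over all attributes.
    q_row: tuple of 0/1 flags marking the attributes the item requires
           (one row of the Q-matrix).
    delta: dict mapping a tuple of required-attribute indices to its effect;
           the empty tuple () holds the intercept (baseline probability).
    """
    # Only attributes flagged in the Q-matrix row can affect this item.
    required = [k for k, q in enumerate(q_row) if q == 1]
    mastered = [k for k in required if alpha[k] == 1]

    prob = 0.0
    # Add the intercept, then every main effect and interaction whose
    # attributes are all mastered by this student.
    for size in range(len(mastered) + 1):
        for subset in combinations(mastered, size):
            prob += delta.get(subset, 0.0)
    return prob


# Hypothetical item requiring the first and third of four attributes.
q_row = (1, 0, 1, 0)
# Hypothetical effects: intercept, two main effects, one interaction.
delta = {(): 0.2, (0,): 0.2, (2,): 0.1, (0, 2): 0.4}

for profile in [(0, 0, 0, 0), (1, 1, 0, 1), (1, 1, 1, 1)]:
    print(profile, gdina_prob(profile, q_row, delta))
```

In this toy setup the second and fourth attributes never change the probability because the Q-matrix row does not flag them, which is why a Q-matrix with overly broad attribute definitions can distort the diagnosis.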

Key words: Cognitive Diagnosis Model, English Achievement Examination, Q-matrix, G-DINA, Fit Indices

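Abstract: Building on general cognitive diagnostic models and the associated testing methods, this study explores the diagnostic retrofitting and analysis of an existing language proficiency test in three steps: 1) exploring different approaches to constructing the attributes and Q-matrix for a language proficiency test; 2) exploring general-model-based modeling and validation of the test; 3) exploring in-depth analyses that follow the modeling. The study found that the student proficiency levels classified by the attribute distribution and by the total-score distribution were highly consistent; that students' attribute mastery differed by gender and the attributes differed in their difficulty hierarchy; and that the attribute pattern distribution further confirmed the high degree of association among language attributes and the applicability of general cognitive diagnostic models and the associated testing methods to language tests. The three-step modeling analysis can serve as a reference for the cognitive diagnostic retrofitting of language proficiency tests.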

Key words: Cognitive Diagnosis Model, English Achievement Test, Q-matrix, G-DINA, Fit Indices