Journal of Psychological Science (心理科学) ›› 2023, Vol. 46 ›› Issue (1): 212-220.

• Statistics, Measurement and Methods •

A Machine Learning Diagnostic Method That Can Incorporate Additional Information

Kang Chunhua (康春花), Zhu Shihao (朱仕浩), Gong Haoming (宫皓明), Zeng Pingfei (曾平飞)

  1. College of Teacher Education, Zhejiang Normal University
  • Received: 2021-02-24 Revised: 2022-09-30 Online: 2023-01-20 Published: 2023-02-17
  • Corresponding author: Zeng Pingfei (曾平飞)

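Summary: This study combines the PNN with the Manhattan distance and Bayes' theorem to propose MB-PNN, a relatively concise cognitive diagnostic method that can incorporate additional information, and examines its effectiveness and suitability through simulation and empirical studies. The conclusions are as follows: (1) the accuracy rate of the M-PNN is higher than that of the PNN, indicating that replacing the Euclidean distance (ED) in the PNN with the Manhattan distance (MD) is appropriate; (2) the accuracy rate of the MB-PNN is higher than those of the M-PNN and the PNN, indicating that classification based on multiple kinds of information is more accurate than classification based on a single kind; (3) the MB-PNN retains the original non-parametric advantages of the PNN and is largely unaffected by the knowledge-state distribution and the sample size; (4) the MB-PNN best distinguishes different types of students and is more suitable for cognitive diagnostic assessment practice.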

Keywords: additional information, Bayes' theorem, machine learning diagnostic method, accuracy rate

Abstract: In the era of big data, cognitive diagnostic assessment (CDA) can no longer be limited to mining information from a single test; it should gather a wide range of information about students through various methods so that their knowledge states can be identified comprehensively. Existing CDA research that incorporates process information consists of preliminary studies based on parametric models, which require additional parameters to be added to the original model. As a result, the model becomes more complex and is not necessarily universal; moreover, it must be restructured to handle different types of data. How to incorporate process information in a simpler way is therefore a topic worthy of extensive research. Among machine-learning-based diagnostic methods, the probabilistic neural network (PNN) diagnostic method combines the advantages of neural networks and non-parametric methods. The existing PNN algorithm uses Euclidean distance discrimination (EDD); however, Kang and Yang (2019) found that Manhattan distance discrimination (MDD) has a higher accuracy rate than EDD. On this basis, the study has two objectives: (1) to replace the Euclidean distance in the PNN algorithm with the Manhattan distance in order to improve the accuracy of the PNN diagnostic method; and (2) to add Bayes' theorem to the second layer of the PNN algorithm so that additional information can be fused into the model and students' knowledge states can be identified comprehensively. Accordingly, the study proposes a relatively concise cognitive diagnostic method, termed MB-PNN, that can incorporate additional information, and examines its effectiveness and suitability through simulation and empirical research.
The results show the following: (1) Under the same conditions, the accuracy rate of the M-PNN (the PNN with the Manhattan distance) is higher than that of the PNN, indicating that replacing the Euclidean distance in the PNN with the Manhattan distance is an effective approach. (2) Under the same conditions, the accuracy rate of the MB-PNN is higher than those of the M-PNN and the PNN, indicating that diagnosis based on multiple sources of information is more accurate than diagnosis based on a single source. (3) The MB-PNN retains the original non-parametric advantages of the PNN and is largely unaffected by the distribution of knowledge states and the sample size. (4) The MB-PNN distinguishes different types of students best and is better suited to CDA practice. The study offers possible research directions for cognitive diagnosis and evaluation based on multimodal educational big data.

Key words: additional information, Bayes' theorem, machine learning diagnosis method, accuracy rate
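
The two modifications named in the abstract (a Manhattan distance in place of the Euclidean distance, and a Bayes' theorem step that fuses additional information) can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything specific here is an assumption made for illustration and not the authors' implementation: the kernel form exp(-d / sigma), the function names, the toy ideal response patterns for two attributes and three items, and the prior values standing in for "additional information".

```python
import math

def manhattan(u, v):
    """Manhattan (city-block) distance between two response vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

def euclidean(u, v):
    """Euclidean distance, the metric the abstract says the existing PNN uses."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def pnn_scores(response, ideal_patterns, distance=manhattan, sigma=1.0):
    """Kernel score per knowledge state.

    Assumed kernel: exp(-distance / sigma), averaged over the patterns
    stored for each state. This form is an illustrative assumption.
    """
    scores = {}
    for state, patterns in ideal_patterns.items():
        activations = [math.exp(-distance(response, p) / sigma) for p in patterns]
        scores[state] = sum(activations) / len(activations)
    return scores

def fuse_with_prior(scores, prior):
    """Bayes' theorem: posterior is proportional to score times prior.

    The prior stands in for additional information about the student
    (hypothetical values in the example below).
    """
    unnormalized = {s: scores[s] * prior[s] for s in scores}
    total = sum(unnormalized.values())
    return {s: v / total for s, v in unnormalized.items()}

def classify(posterior):
    """Return the knowledge state with the largest posterior probability."""
    return max(posterior, key=posterior.get)

# Toy setup (assumed): item 1 needs attribute 1, item 2 needs attribute 2,
# item 3 needs both. Keys are knowledge states, values are ideal responses.
IDEAL_PATTERNS = {
    "00": [[0, 0, 0]],
    "10": [[1, 0, 0]],
    "01": [[0, 1, 0]],
    "11": [[1, 1, 1]],
}
# Hypothetical prior derived from additional information.
PRIOR = {"00": 0.1, "10": 0.2, "01": 0.2, "11": 0.5}

observed = [1, 1, 0]
scores = pnn_scores(observed, IDEAL_PATTERNS)
posterior = fuse_with_prior(scores, PRIOR)
print(classify(posterior))
```

In this toy case the observed response is at Manhattan distance 1 from three of the four ideal patterns, so the distance-based scores alone cannot separate those states; multiplying by the prior is what selects a single state. Passing `distance=euclidean` to `pnn_scores` gives the Euclidean variant for comparison.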