Psychological Science ›› 2016, Vol. 39 ›› Issue (1): 90-96.

Generalizability Analysis of Teaching Level Evaluation for College Teachers


  • Received:2015-04-27 Revised:2015-09-14 Online:2016-01-20 Published:2016-01-20
  • Contact: Guangming LI



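  1. South China Normal University (华南师范大学)
  2. School of Psychology, South China Normal University (华南师范大学心理学院)
  • Corresponding author: Guangming Li (黎光明)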

Abstract: To improve teaching quality, colleges in China evaluate the teaching level of their teachers every semester. Typically, students evaluate the teacher of each course they take. Because each teacher teaches several classes, each teacher receives several sets of results, and no single consistent conclusion can be drawn. Moreover, many factors influence the results. For instance, too few participating students lowers the reliability of the result; students in different kinds of majors attend to different indexes; different kinds of courses receive disproportionate grades; and, because students focus on different issues in different periods, the time point of the evaluation also affects the results. Based on generalizability theory, this study offers a method for addressing these problems and discusses the factors that affect the results of teaching level evaluation for college teachers. Data were collected with the scale of Teachers’ Teaching Level Evaluation from 19 courses, 7 of which are liberal arts courses and the other 11 science courses. In total, 558 responses were collected at the beginning of the semester in March and 566 at the end of the semester in December, covering 5 liberal arts majors, 10 science majors and 4 engineering majors. All data were saved in txt format and analyzed with mGENOVA. According to generalizability theory, an evaluation completed by a specific number of students is reliable enough to measure the teaching level of one teacher. Generalizability theory uses the index of dependability (Φ), instead of the validity used in classical test theory, to judge the reliability of results. Regarding the number of evaluators needed, the D study shows that reliability rises as the number of evaluators increases and that it is appropriate to have 20 students evaluate each teacher.
The study also finds that students majoring in engineering, who pay more attention to practical issues, give more reliable ratings on the five indexes of the scale than students majoring in liberal arts or science. In addition, evaluations of teachers of science courses are more reliable than evaluations of teachers of liberal arts courses. Finally, the evaluation taken at the beginning of the semester is more reliable than the evaluation taken at the end of the semester. The conclusions are as follows: (1) Compared with the result obtained at the end of a semester, the result obtained at the beginning of the next semester has higher reliability. (2) Students in different kinds of majors attend differently to the five indexes, which affects the reliability of the evaluation. (3) Evaluation reliability is higher for science courses than for liberal arts courses. (4) To ensure the reliability of the evaluation, 20 students are needed to evaluate each teacher.
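
The D study logic described above, in which dependability rises as more students rate each teacher, can be illustrated with a minimal sketch. It assumes a simple one-facet design with students nested within teachers, where Φ = σ²(teacher) / (σ²(teacher) + σ²(student within teacher) / n). This is not the design or the mGENOVA analysis used in the study, and the variance components below are hypothetical values chosen only for illustration; the function names are likewise assumptions.

```python
"""D-study sketch for a one-facet nested design (students within teachers).

All variance components are hypothetical; they are NOT taken from the study.
"""


def dependability(var_teacher, var_student_within, n_students):
    """Index of dependability (Phi) for the mean rating over n_students raters."""
    if n_students < 1:
        raise ValueError("n_students must be at least 1")
    # Absolute error variance shrinks as ratings are averaged over more students.
    absolute_error = var_student_within / n_students
    return var_teacher / (var_teacher + absolute_error)


def min_raters(var_teacher, var_student_within, threshold, max_n=1000):
    """Smallest number of students whose Phi reaches the threshold, or None."""
    for n in range(1, max_n + 1):
        if dependability(var_teacher, var_student_within, n) >= threshold:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical variance components (assumed for illustration only).
    VAR_TEACHER = 0.20
    VAR_STUDENT_WITHIN = 1.00

    for n in (5, 10, 20, 40):
        phi = dependability(VAR_TEACHER, VAR_STUDENT_WITHIN, n)
        print(f"n = {n:2d}  Phi = {phi:.3f}")
```

With real data, the two variance components would come from a G study of the collected ratings, and the resulting curve of Φ against n would indicate where adding more students yields little further gain.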

Key words: College Teachers, Teaching Level Evaluation, Generalizability Theory, Index of Dependability

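Abstract (translated from the Chinese version): Based on generalizability theory, this study explores the factors that influence the results of teaching level evaluation for college teachers. Evaluation data were collected with the Teaching Level Evaluation Scale for College Teachers (Student Version) (《高校教师教学水平评价量表(学生用)》) and analyzed with mGENOVA. The results show that: (1) compared with an evaluation conducted at the end of the first semester, an evaluation conducted at the beginning of the second semester yields more reliable results; (2) sampling only 20 students is sufficient to ensure a reliable evaluation of each teacher's teaching level; (3) students in different kinds of majors emphasize different evaluation indexes, which in turn affects the reliability of the evaluation results; (4) students' evaluations of science courses are more reliable, and their evaluations of liberal arts courses less reliable.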

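Key words (translated from the Chinese version): College Teachers, Teaching Level Evaluation, Generalizability Theory, Index of Dependability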