Psychological Science ›› 2015, Vol. 38 ›› Issue (4): 807-812.

Replicability of Psychological Experiment

Zhong Xiaobo

  • Received:2014-11-14 Revised:2015-04-12 Online:2015-07-20 Published:2015-07-20
  • Contact: Zhong Xiaobo

心理学实验的可重复性 (Replicability of Psychological Experiment)

仲晓波 (Zhong Xiaobo)

  1. Jiaying University
  • Corresponding author: Zhong Xiaobo

Abstract: Dissatisfaction with the conventional null hypothesis significance testing (NHST) procedure has led researchers to consider alternatives. In his article “An Alternative to Null-Hypothesis Significance Tests,” Killeen urged psychological researchers to abandon routine NHST and to quantify the signal-to-noise characteristics of experimental outcomes with replication probabilities. He described the coefficient he devised, Prep (probability of replicability), as the probability of obtaining “an effect of the same sign as that found in an original experiment” (Killeen, 2005). In response, the journal Psychological Science soon encouraged researchers to report Prep rather than the P value of NHST. Shortly afterwards, however, Killeen’s computational formula for Prep was shown to be wrong, and his proposal was eventually rejected (Maraun & Gabriel, 2010). Nevertheless, Killeen is still considered correct in holding that replicability should play a central role in researchers’ assessment of empirical results. The problems to be solved are therefore how to express the replicability of psychological experiments and how to enhance it.

In our view, not only is Killeen’s computational formula for Prep incorrect, but his definition of replication is also inappropriate. He defined replication as obtaining “an effect of the same sign as that found in an original experiment”. Under the same-sign constraint, however, the sample effects of the two experiments may still differ greatly (for example, 0.1 and 0.9). Conversely, two sample effects with different signs may differ only slightly (for example, 0.1 and -0.1). Our proposals on replicability, and the reasons for them, are as follows:

  1. When the replication and the original experiment are homogeneous, the difference between their sample effects need not be considered qualitatively; it need only be considered quantitatively. In this situation, replicability can appropriately be expressed by the width of the confidence interval (CI): the narrower the interval, the closer the two sample effects are statistically, and the better the replicability.
  2. In this homogeneous situation, the factors that determine the replicability of a psychological experiment are the random extraneous variables that influence the dependent variable. As many of these random extraneous variables as possible should be included as covariates in the experimental design and data analysis. Doing so narrows the CI and therefore increases the replicability of the experiment.
  3. Besides indicating replicability, the CI also serves for statistical inference and effect estimation. As a tool of statistical inference, it can replace the two-sided test and divide the sample space into three parts (Harris, 1997). As a tool of effect estimation, it provides both a point estimate of the effect (its midpoint) and an estimate of the precision of that point estimate (its width). Given these merits across all three functions, we argue that the CI is the most appropriate of the devices proposed so far for reporting the results of psychological experiments.
  4. When the value of a control variable in the replication differs from its value in the original experiment, and the independent variable interacts with that control variable, the interaction makes the effects of the two experiments differ. In this heterogeneous situation, the difference between the sample effects of the two experiments should be treated qualitatively.
  5. The fact that psychological experiments are less replicable than experiments in other empirical sciences, in both the homogeneous and the heterogeneous situation, can be attributed to the large number of extraneous variables involved.

Prompted by Killeen’s article, the replicability of psychological experiments has become a focus of controversy. Our criticism of his proposal is directed not only at his computational formula for Prep but also at his definition of replication. We suggest that the CI be used to indicate the replicability of an experiment. Replicability can be enhanced by measuring the random extraneous variables and including them as covariates in the experimental design and data analysis.
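
The claim that including a random extraneous variable as a covariate narrows the CI can be illustrated with a small simulation. The sketch below is not taken from the article: the two-group design, the function names, the effect sizes, and the normal-approximation 95% interval (z = 1.96) are all assumptions chosen for illustration, and the covariate adjustment is a simplified stand-in for a full analysis of covariance (it ignores the uncertainty in the estimated slope).

```python
import math
import random
import statistics


def simulate(n_per_group=200, effect=0.5, beta=2.0, seed=0):
    """Simulate a two-group experiment (all parameter values are illustrative).

    The outcome depends on the group, on one measured random extraneous
    variable (the covariate), and on unmeasured noise.
    """
    rng = random.Random(seed)
    group, cov, y = [], [], []
    for g in (0, 1):
        for _ in range(n_per_group):
            x = rng.gauss(0.0, 1.0)      # random extraneous variable
            noise = rng.gauss(0.0, 1.0)  # unmeasured noise
            group.append(g)
            cov.append(x)
            y.append(effect * g + beta * x + noise)
    return group, cov, y


def mean_diff_ci(group, y, z=1.96):
    """Approximate 95% CI for the difference in group means (normal approximation)."""
    y0 = [v for g, v in zip(group, y) if g == 0]
    y1 = [v for g, v in zip(group, y) if g == 1]
    diff = statistics.fmean(y1) - statistics.fmean(y0)
    se = math.sqrt(statistics.variance(y0) / len(y0)
                   + statistics.variance(y1) / len(y1))
    return diff - z * se, diff + z * se


def adjust_for_covariate(group, cov, y):
    """Subtract the pooled within-group linear effect of the covariate from y."""
    sxy = sxx = 0.0
    for g in (0, 1):
        xs = [x for gg, x in zip(group, cov) if gg == g]
        ys = [v for gg, v in zip(group, y) if gg == g]
        mx, my = statistics.fmean(xs), statistics.fmean(ys)
        sxy += sum((x - mx) * (v - my) for x, v in zip(xs, ys))
        sxx += sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx                     # pooled within-group slope
    grand_mx = statistics.fmean(cov)
    return [v - slope * (x - grand_mx) for x, v in zip(cov, y)]


def ci_width(ci):
    """Width of an interval given as (lower, upper)."""
    return ci[1] - ci[0]


group, cov, y = simulate()
raw_ci = mean_diff_ci(group, y)
adj_ci = mean_diff_ci(group, adjust_for_covariate(group, cov, y))
print("raw CI width:     ", round(ci_width(raw_ci), 3))
print("adjusted CI width:", round(ci_width(adj_ci), 3))
```

Under these assumptions the adjusted interval is narrower than the raw one, because the variance contributed by the measured extraneous variable has been removed from the error term. How much the interval narrows depends on how strongly the covariate influences the dependent variable.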

Key words: psychological experiment, replicability, extraneous variable, confidence interval

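Abstract (translated from the Chinese): Replicability in the strict sense refers to the replicability of an experiment's results when the experimental control conditions remain unchanged. The confidence interval is an appropriate way to express this kind of replicability, which can be improved by introducing the random extraneous variables that affect the dependent variable as covariates in the experimental design and data analysis. Replicability in another sense refers to the transferability of experimental results; it concerns the change in results that arises, when the control conditions change, from the interaction between the control variables and the independent variable. In both senses, the relatively low replicability of psychological experiments stems from the number and complexity of their extraneous variables.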
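
The heterogeneous situation described in the abstract, where a control variable takes a different value in the replication and interacts with the independent variable, can be made concrete with a toy linear model. This is a hypothetical sketch: the model form, the function name `effect_at`, and the coefficient values are assumptions for illustration and do not come from the article.

```python
def effect_at(control_value, main_effect=0.5, interaction=-1.0):
    """Effect of the independent variable at a given control-variable value.

    Assumes the linear model
        y = b0 + main_effect * x + b2 * c + interaction * x * c,
    so the effect of x at control value c is main_effect + interaction * c.
    """
    return main_effect + interaction * control_value


original = effect_at(0.0)     # control variable held at 0 in the original study
replication = effect_at(1.0)  # control variable held at 1 in the replication
print(original, replication)
```

With these illustrative coefficients the two experiments have true effects of opposite sign even though neither is flawed, which is why such a difference calls for a qualitative rather than a purely quantitative treatment.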

Key words (translated from the Chinese): psychological experiment, replicability, extraneous variable, confidence interval