Preliminary Work for Modeling Questionnaire Data (问卷数据建模的前传)

Journal of Psychological Science (心理科学), 2018, Vol. 41, Issue 1: 204-210.

Zhong-Lin WEN (温忠粦)¹, Bin-Bin HUANG (黄彬彬)², Dan-Dan TANG (汤丹丹)¹

1. South China Normal University
2. Beijing Normal University

Received: 2017-11-20 Revised: 2017-12-08 Online: 2018-01-20 Published: 2018-01-20
Corresponding author: Zhong-Lin WEN

Abstract

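Abstract (translated from the Chinese): The questionnaire method is a common empirical research method. The preliminary work for modeling questionnaire data is like laying the foundation of a building: how solid the foundation is affects the quality of everything built on it. This paper discusses what should be done before a statistical model is built, with an emphasis on scale evaluation. The topics are: handling missing values; evaluating the construct validity of a scale and the appropriateness of item deletion; testing homogeneity and computing composite reliability when a multidimensional scale needs a total score; testing for common method bias and evaluating the discriminant validity of variables; item parceling; and testing multicollinearity among predictors. The paper closes with the rationale for model building and the control of extraneous variables.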


Abstract: Questionnaire data are frequently used in empirical studies in psychology, as well as in many other behavioral and social science disciplines. This paper discusses the preliminary work for modeling questionnaire data, including the data processing that might affect the analysis results. First, the initial processing of raw data is introduced, including data checking, missing value imputation, and normality testing. We then focus on evaluating the questionnaire (scales and items) with a measurement model using confirmatory factor analysis (CFA). The construct validity of a scale is acceptable if the measurement model reflecting the hypothetical construct proposed by theory fits the data with acceptable fit indexes (say, CFI and TLI > 0.9; RMSEA and SRMR < 0.08). When item-factor relationships are examined, items with low loadings (e.g., less than 0.4 in the completely standardized solution) are often deleted. It is then necessary to consider and explain that the remaining items of the scale are still a representative item sample for measuring the latent variable. For a general test, the measurement errors of items can reasonably be taken as uncorrelated. If Cronbach's coefficient α is high enough to be accepted, then test reliability is also acceptable. If the total score of the test is meaningful and is used, it is better to report the composite reliability with a confidence interval. For a multidimensional test, the total score should be used only when homogeneity reliability is not lower than 0.5. For research with several latent variables, discriminant validity can be examined with a series of CFA models. The one-factor model is the worst fitting, whereas the separated-factor model, in which each latent variable corresponds to one factor, is the best fitting. Discriminant validity is supported if the separated-factor model fits clearly better than every competing model in the series.
Then a method factor is added to the separated-factor model as a global factor to form a bifactor model; common method bias is not a problem if the bifactor model does not fit clearly better than the separated-factor model. Structural equation models are frequently applied to questionnaire data. It is suggested that the sample size be more than 10 times the number of indicators, or 5 times the number of freely estimated parameters. When the sample size is not large enough, item parceling is a technique for improving the quality of indicators and model fit. The prerequisites for parceling are unidimensionality and homogeneity, and parceling is applicable to the analysis of structural models rather than measurement models. If the scale is multidimensional, the internal-consistency approach is recommended, so that the items of the same dimension are parceled into one or three indicators for structural equation modeling. When a multiple regression model is involved, multicollinearity can be detected with tolerance or the variance inflation factor (VIF = 1/tolerance). Each predictor has its own VIF; a VIF of 5 or above indicates a multicollinearity problem, and a VIF of 10 or above a serious one. A VIF > 5 (or 10) means that more than 80% (or 90%) of the variance of the predictor is explained by all the other predictors, that is, the coefficient of determination of the regression of the predictor on all the other predictors is larger than 0.8 (or 0.9). In a cross-sectional study, it may be unclear how to justify a hypothesis that one variable is a cause of another. The issue can be addressed with domain theory, the literature, or common sense. If X is more essential (or more stable, more objective, more long-standing, etc.) than Y, X is much more likely than Y to act as the cause.
Variable control is necessary for causal inference, in order to eliminate the spurious effect when X and Y share a common cause, or to remove the unanalyzed effect when there is a covariate together with which X affects Y. The replication crisis has recently attracted attention and discussion. For a questionnaire data set, different ways of processing the data before modeling and analysis may lead to different results. Appropriate data processing can help obtain reasonable results and improve their replicability.
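As an illustration of the reliability check mentioned in the abstract, here is a minimal sketch of Cronbach's coefficient α, computed with the usual formula α = k/(k−1) · (1 − Σ item variances / variance of the total score). The function name `cronbach_alpha` and the response data are hypothetical and are not taken from the paper; complete cases (no missing values) are assumed.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete-case data.

    rows: one list of item scores per respondent (hypothetical input layout).
    """
    n_items = len(rows[0])
    # Sample variance of each item across respondents.
    item_variances = [variance(item) for item in zip(*rows)]
    # Sample variance of the total (sum) score.
    total_variance = variance([sum(row) for row in rows])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of four people to a two-item scale.
responses = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(round(cronbach_alpha(responses), 3))  # about 0.75
```

Whether a given α is "high enough to be accepted" is a judgment the paper leaves to conventional criteria; the sketch only computes the coefficient.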

Key words: questionnaire data, scale, measurement model, reliability, validity
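The relation among VIF, tolerance, and R² described in the abstract can be sketched as follows. This is a sketch under stated assumptions, not the authors' procedure: it relies on the standard identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictors' correlation matrix, and all function names and data are hypothetical.

```python
from math import sqrt


def correlation_matrix(columns):
    """Pearson correlations between predictor columns of equal length."""
    centered = []
    for col in columns:
        m = sum(col) / len(col)
        centered.append([x - m for x in col])
    norms = [sqrt(sum(x * x for x in col)) for col in centered]
    p = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
             for j in range(p)] for i in range(p)]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """VIF_j = 1 / (1 - R_j^2), read off the diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


def r_squared_from_vif(vif):
    """R^2 of a predictor regressed on the others: tolerance = 1 - R^2 = 1 / VIF."""
    return 1.0 - 1.0 / vif


def flag(vif):
    """Cutoffs quoted in the abstract: 5 (problem) and 10 (serious problem)."""
    if vif >= 10:
        return "serious"
    return "problem" if vif >= 5 else "ok"


# Hypothetical predictors that correlate 0.8, so R^2 = 0.64 for each.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
for name, vif in zip(["x1", "x2"], variance_inflation_factors([x1, x2])):
    print(name, round(vif, 2), flag(vif))
```

The helper `r_squared_from_vif` makes the equivalence in the abstract explicit: a VIF of 5 corresponds to R² = 0.8 and a VIF of 10 to R² = 0.9.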

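Cite this article: Zhong-Lin WEN, Bin-Bin HUANG, Dan-Dan TANG. Preliminary Work for Modeling Questionnaire Data (问卷数据建模的前传)[J]. Journal of Psychological Science (心理科学), 2018, 41(1): 204-210.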
