
Abstract:
Recent advances in a category of analytic methods collectively referred to as cognitive diagnostic models (CDM) show great promise. A large number of CDM have been proposed. The deterministic inputs, noisy "and" gate (DINA) model, an example of a conjunctive model, assigns the highest probability of answering correctly to examinees who possess all of the required attributes. Disjunctive models, by contrast, assume that lacking a particular attribute can be offset by possessing another. For example, the deterministic inputs, noisy "or" gate (DINO) model assigns the highest probability of answering correctly to examinees with at least one of the required attributes. Other specific, interpretable CDM include the reduced reparametrized unified model (RRUM; Hartz, 2002) and the additive CDM (ACDM). Apart from these specific CDM, general or saturated CDM that subsume many widely used specific CDM have also been developed, including the generalized DINA (GDINA) model, the general diagnostic model (GDM), and the log-linear CDM (LCDM). Although general CDM provide better model-data fit, reduced CDM have more straightforward interpretations, are more stable, and can provide more accurate classifications when used correctly.
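The conjunctive and disjunctive rules described above can be illustrated with a minimal Python sketch. It assumes the standard guessing and slip parameterization of the DINA and DINO models; the function names, the 0/1 attribute vectors, and the parameter values are hypothetical and chosen for illustration only.

```python
from typing import Sequence


def dina_prob(alpha: Sequence[int], q: Sequence[int], guess: float, slip: float) -> float:
    """P(correct) under DINA: every attribute required by q must be mastered."""
    # Conjunctive rule: success is likely only if all required attributes are present.
    has_all = all(a == 1 for a, req in zip(alpha, q) if req == 1)
    return 1.0 - slip if has_all else guess


def dino_prob(alpha: Sequence[int], q: Sequence[int], guess: float, slip: float) -> float:
    """P(correct) under DINO: at least one attribute required by q must be mastered."""
    # Disjunctive rule: any single required attribute is enough.
    has_any = any(a == 1 for a, req in zip(alpha, q) if req == 1)
    return 1.0 - slip if has_any else guess


# Hypothetical item that requires attributes 1 and 2 out of three.
q_row = [1, 1, 0]
# Hypothetical examinee who masters attribute 1 but not attribute 2.
examinee = [1, 0, 1]

p_dina = dina_prob(examinee, q_row, guess=0.2, slip=0.1)  # missing one attribute: guessing level
p_dino = dino_prob(examinee, q_row, guess=0.2, slip=0.1)  # one attribute suffices: 1 - slip
```

For the same examinee and item, the two models give different success probabilities, which is the conjunctive versus disjunctive distinction in its simplest form.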
Although a multitude of CDM are available, it is not clear how the most appropriate model for a specific test can be identified, because the cognitive processes involved in answering items may be complicated. An important decision for researchers is whether to choose a CDM that allows for compensatory relationships among skills or one that allows for noncompensatory relationships among skills. With a compensatory model, a high level of competence on one skill can compensate for a low level of competence on another skill in performing a task. One way to inform this choice is to test a general model (i.e., the GDINA model) statistically against the specific CDM it subsumes using the Wald test. The Wald test was originally proposed by de la Torre (2011) for comparing general and specific models at the item level (i.e., one item at a time), which makes it possible to use multiple CDM within the same test so that each item has its own appropriate CDM (a Mixed CDM). To compare the performance of the Mixed CDM with that of other models in paper-and-pencil tests, we conducted a simulation study investigating parameter recovery, classification accuracy, and the performance of item-fit statistics for correctly specified and misspecified diagnostic classification models within the GDINA framework. The manipulated test design factors were the number of respondents, item quality, generating model, fitted model, and Q-matrix. The sample sizes were N = 500, 1,000, and 2,000; item quality was high, medium, or low; the generating and fitted models were GDINA, Mixed, DINA, DINO, ACDM, and RRUM; and the Q-matrix was either simple or complex. Overall, the Mixed CDM performed best across the experimental conditions.
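To give a sense of the size of this design, the sketch below enumerates the conditions implied by the factors listed above. It assumes the factors are fully crossed, which the abstract does not state; the variable and key names are illustrative.

```python
from itertools import product

sample_sizes = [500, 1000, 2000]
item_qualities = ["high", "medium", "low"]
models = ["GDINA", "Mixed", "DINA", "DINO", "ACDM", "RRUM"]
q_matrices = ["simple", "complex"]

# Assumption: every level of every factor is combined with every other level.
conditions = [
    {
        "n": n,
        "item_quality": quality,
        "generating_model": generating,
        "fitted_model": fitted,
        "q_matrix": q_matrix,
    }
    for n, quality, generating, fitted, q_matrix in product(
        sample_sizes, item_qualities, models, models, q_matrices
    )
]

# Conditions in which the fitted model is the same as the generating model.
# Note that a general fitted model such as GDINA is not strictly misspecified
# for data generated from a model it subsumes, so this is only a rough split.
same_model = [c for c in conditions if c["generating_model"] == c["fitted_model"]]

print(len(conditions), len(same_model))
```

Under the fully crossed assumption this yields 3 x 3 x 6 x 6 x 2 = 648 conditions, of which 108 fit the same model that generated the data.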
Considering classification accuracy alone, the advantage of the Mixed CDM was most pronounced in tests with low item quality; when item quality was high, the Mixed CDM and the GDINA model performed almost identically. Under all experimental conditions, however, the Mixed CDM outperformed the GDINA model on the information-based fit index AIC and on item parameter recovery.
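AIC trades off fit against the number of estimated parameters, so a mixed specification with fewer item parameters can obtain a lower AIC than the GDINA model even when the two log-likelihoods are close. The sketch below uses the standard definition AIC = -2 log L + 2p and the usual item-parameter counts (2 for DINA and DINO, K_j + 1 for ACDM and RRUM, and 2^K_j for GDINA, where K_j is the number of attributes item j requires). The four-item test, the per-item model assignment, and the log-likelihood value are hypothetical, and the parameters of the attribute distribution are left out because they would be the same for both specifications.

```python
def n_item_params(model: str, k_required: int) -> int:
    """Number of item parameters for one item that requires k_required attributes."""
    if model in ("DINA", "DINO"):
        return 2                      # guessing and slip
    if model in ("ACDM", "RRUM"):
        return k_required + 1         # one baseline term plus one term per attribute
    if model == "GDINA":
        return 2 ** k_required        # one parameter per reduced attribute pattern
    raise ValueError(f"unknown model: {model}")


def aic(log_lik: float, n_params: int) -> float:
    """Akaike information criterion; lower values indicate a better trade-off."""
    return -2.0 * log_lik + 2.0 * n_params


# Hypothetical test: number of attributes required by each of four items.
k_per_item = [1, 2, 3, 3]
# Made-up per-item model selection for the mixed specification.
mixed_assignment = ["GDINA", "DINA", "ACDM", "DINO"]

p_gdina = sum(n_item_params("GDINA", k) for k in k_per_item)
p_mixed = sum(n_item_params(m, k) for m, k in zip(mixed_assignment, k_per_item))

# Made-up log-likelihood, assumed equal for both specifications for simplicity.
log_lik = -1500.0
aic_gdina = aic(log_lik, p_gdina)
aic_mixed = aic(log_lik, p_mixed)
```

In this toy setting the saturated specification estimates 22 item parameters and the mixed one 10, so the mixed specification has the lower AIC at equal log-likelihood. In real data the log-likelihoods would differ, and the comparison depends on whether the loss in fit outweighs the saving in parameters.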