Cognitively Diagnostic Analysis Using the G-DINA Model in R

Abstract: Cognitive diagnosis models (CDMs) have increasingly been applied in education and other fields. This article provides an overview of a widely used CDM, namely, the G-DINA model, and demonstrates a hands-on example of using multiple R packages for a series of CDM analyses. This overview involves a step-by-step illustration and explanation of performing Q-matrix evaluation, CDM calibration, model fit evaluation, item diagnosticity investigation, classification reliability examination, and result presentation and visualization. Some limitations of conducting CDM analysis in R are also discussed.


Introduction
Cognitive diagnosis models (CDMs), or diagnostic classification models (DCMs), are psychometric models for classifying individuals into latent classes with unique profiles of attributes. CDMs have increasingly attracted attention in education as they have shown the potential to identify students' strengths and weaknesses and thus aid classroom instruction and learning. In addition to the applications in education [1][2][3][4], CDMs have also been applied to other areas recently, such as industrial and organizational psychology [5] and psychiatry [6,7].
Despite the usefulness of CDMs in many fields, software programs for CDM analysis are still lacking. Programs such as Mplus [8,9], JAGS [10,11], and Stan [12,13] have been used for CDM analysis, but they are not without limitations. For example, CDM estimation using these programs often requires advanced coding skills, which may pose a formidable obstacle to CDMs' practical application. Also, these general programs typically lack many essential functions, such as those for refining the Q-matrix and assessing classification reliability. Recently, several R packages have been developed particularly for CDM analysis. Notably, George et al. [14] introduced the CDM package, and Ma and de la Torre [15] presented the GDINA package and how to apply it for a series of CDM analyses. However, different R packages have different functionality and features, and it remains unclear how these packages can be used in an integrated way for complete CDM analysis. This paper aims to fill this gap by illustrating a comprehensive CDM analysis with a real dataset, with a particular emphasis on the use of multiple R packages under a widely used general CDM, the generalized deterministic input, noisy "and" gate (G-DINA) model [16]. As the first tutorial intended to introduce the state-of-the-art techniques for CDM analyses in the R environment via multiple R packages, this paper will help researchers gain better insight into these packages and conduct CDM analyses in a more principled way.
The G-DINA Model

Let $\boldsymbol{\alpha}_{lj}^{*}$ denote the reduced attribute profile consisting of the $K_{j}^{*}$ attributes required by item $j$. The item response function (IRF) of the G-DINA model can be written as

$$g[P(X_{j}=1 \mid \boldsymbol{\alpha}_{lj}^{*})] = \delta_{j0} + \sum_{k=1}^{K_{j}^{*}} \delta_{jk}\alpha_{lk} + \sum_{k'=k+1}^{K_{j}^{*}} \sum_{k=1}^{K_{j}^{*}-1} \delta_{jkk'}\alpha_{lk}\alpha_{lk'} + \cdots + \delta_{j12\ldots K_{j}^{*}} \prod_{k=1}^{K_{j}^{*}} \alpha_{lk}, \quad (1)$$

where $g[\cdot]$ represents an identity, logit, or log link function, $\delta_{j0}$ is the intercept of item $j$, $\delta_{jk}$ is the main effect of attribute $k$, $\delta_{jkk'}$ is the two-way interaction effect of attributes $k$ and $k'$, and $\delta_{j12\ldots K_{j}^{*}}$ is the $K_{j}^{*}$-way interaction effect of attributes 1 to $K_{j}^{*}$.

The G-DINA model is an unrestricted, saturated model that can be reduced to many other restricted models by imposing appropriate constraints. In particular, to obtain the deterministic input, noisy "and" gate (DINA) model [20][21][22], all terms in Equation (1) except $\delta_{j0}$ and $\delta_{j12\ldots K_{j}^{*}}$ are constrained to be 0. In this way, the IRF of the DINA model is expressed by

$$P(X_{j}=1 \mid \boldsymbol{\alpha}_{lj}^{*}) = \delta_{j0} + \delta_{j12\ldots K_{j}^{*}} \prod_{k=1}^{K_{j}^{*}} \alpha_{lk}.$$

To obtain the deterministic input, noisy "or" gate (DINO) model [6], only the intercept and the main effect of attribute $k$ are kept in the link function of Equation (1), and the IRF of DINO is expressed by

$$P(X_{j}=1 \mid \boldsymbol{\alpha}_{lj}^{*}) = \delta_{j0} + \delta_{jk}\alpha_{lk},$$

where $\delta_{jk} = -\delta_{jk'k''} = \cdots = (-1)^{K_{j}^{*}+1}\delta_{j12\ldots K_{j}^{*}}$, for $k = 1, \ldots, K_{j}^{*}$, $k' = 1, \ldots, K_{j}^{*}-1$, and $k'' > k', \ldots, K_{j}^{*}$ [16]. In this way, the number of item parameters for both the DINA and DINO models is reduced to just two, regardless of the number of attributes measured.

To obtain the additive CDM (A-CDM) [16], only the intercept and the main effects in the identity link function of Equation (1) are kept, so that the IRF of the A-CDM can be expressed by

$$P(X_{j}=1 \mid \boldsymbol{\alpha}_{lj}^{*}) = \delta_{j0} + \sum_{k=1}^{K_{j}^{*}} \delta_{jk}\alpha_{lk}.$$

The linear logistic model (LLM) [23] is the logit-link counterpart of the A-CDM, and its IRF can be expressed by

$$\mathrm{logit}[P(X_{j}=1 \mid \boldsymbol{\alpha}_{lj}^{*})] = \delta_{j0} + \sum_{k=1}^{K_{j}^{*}} \delta_{jk}\alpha_{lk},$$

and the reduced reparameterized unified model (R-RUM) [24] is the log-link counterpart of the A-CDM, with IRF

$$\log[P(X_{j}=1 \mid \boldsymbol{\alpha}_{lj}^{*})] = \delta_{j0} + \sum_{k=1}^{K_{j}^{*}} \delta_{jk}\alpha_{lk}.$$

The reduced models presented here can be understood as particular cases of the G-DINA model that accommodate conjunctive or noncompensatory processes (DINA; mastery of all attributes is necessary to have a high probability of success), disjunctive processes (DINO; mastery of one attribute can compensate for the lack of the rest), or additive processes (A-CDM, LLM, and R-RUM; each attribute implies an independent increase in a function of the probability of success).
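To make these reductions concrete, the following minimal R sketch (with illustrative delta values, not estimates from the ECPE data) computes the success probability of a hypothetical two-attribute item under the identity-link G-DINA model and shows how constraining terms yields the DINA and A-CDM predictions:

```r
# Hypothetical delta parameters for an item measuring K*_j = 2 attributes
delta0  <- 0.10  # intercept: P(success) when neither attribute is mastered
delta1  <- 0.25  # main effect of attribute 1
delta2  <- 0.20  # main effect of attribute 2
delta12 <- 0.30  # two-way interaction effect

# Identity-link G-DINA IRF over the four reduced attribute profiles
gdina_p <- function(a1, a2) delta0 + delta1 * a1 + delta2 * a2 + delta12 * a1 * a2
gdina_p(0, 0)  # 0.10
gdina_p(1, 1)  # 0.85

# DINA: only the intercept and the highest-order interaction are retained,
# so mastering a proper subset of the attributes does not raise P(success)
dina_p <- function(a1, a2) delta0 + delta12 * a1 * a2

# A-CDM: only the intercept and main effects are retained (additive)
acdm_p <- function(a1, a2) delta0 + delta1 * a1 + delta2 * a2
```

Each reduced model simply zeroes out or ties together terms of Equation (1), which is why the G-DINA model subsumes all of them.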

Overview of the CDM Analyses
This section will discuss the steps involved in cognitive diagnosis modeling using the G-DINA model, as shown in Figure 1. The development of diagnostic tests and specifications of Q-matrices will not be discussed here; detailed discussions of those can be found in Leighton and Gierl [25], Nichols et al. [26], and Tjoe and de la Torre [19], to name a few.
When the Q-matrix may not be entirely correct, the first step of CDM analysis should be empirical Q-matrix evaluation, which involves validating the number of attributes and detecting misidentified elements. To validate the number of attributes, Nájera and colleagues [27] adapted procedures for assessing dimensionality that were initially developed for exploratory analysis, often without a provisional Q-matrix. Once the number of attributes has been validated, a host of methods is available for identifying misspecified elements [28][29][30][31]. De la Torre and Minchen [32] recommended employing a saturated CDM when conducting Q-matrix validation to avoid conflating Q-matrix misspecifications with model misspecifications. Also, although statistical procedures can provide valuable insights into the Q-matrix, the appropriateness of their recommendations should be carefully assessed by domain experts. In other words, Q-matrix validation procedures should be used as a tool to facilitate domain experts in developing the Q-matrix.
The second step of CDM analysis often involves model specification. The goal is to determine the measurement model-the model estimating the association between attributes and the observed data-for each item and specify the structural model-the model estimating the association among attributes-for joint attribute distribution. The measurement model can be specified on a priori grounds or determined by statistical procedures. For example, the Wald test and likelihood ratio test have been used to select the measurement model for each item [16,[33][34][35]. Regularized CDMs have also been used to determine each item's most appropriate measurement model [36,37]. It should be noted that monotonicity constraints may need to be imposed because they are often theoretically reasonable and can stabilize the parameter estimation, especially when the sample size is small [38].
Similarly, the structural model can also be specified based on theories or statistical approaches. For example, when it is believed that attributes have a hierarchical relationship or are related to a common higher-order factor, the structural model should reflect such a belief. The likelihood ratio test can also be performed to compare the saturated structural model with a structural model it subsumes.

The next step of CDM analysis requires assessing model-data fit. Model-data fit can be gauged in an absolute sense, at either the test or item level. The test-level absolute fit evaluation provides information about the extent to which the model fits the data well for the whole test, whereas the item-level absolute fit assesses whether, or to what extent, the model fits the data well for each item. Examples of absolute fit measures include full-information statistics, such as the Pearson χ2, and limited-information statistics, such as the M2 statistic and RMSEA2 [39,40]. Models can also be compared using relative fit measures at either the test or item level.
Examples of measures for relative fit evaluation include information criteria, such as AIC and BIC, and other inferential statistics, such as the Wald test and LR test [41].
When the goodness of fit is adequate, one can interpret model calibration results, including item diagnosticity and person classification reliability. In particular, the item characteristic graph showing the probability of success for different latent groups can be displayed in a bar chart. The item discrimination index can also be calculated. Items with poor psychometric properties may need to be removed. In addition to item diagnosticity, test reliability should be investigated. The focus of CDM analysis is often classification; thus, classification accuracy and consistency should be assessed. With satisfactory classification reliability, the final step of CDM analysis is to report person classifications, which could be at the individual level or at an aggregated level for a group of students. It should be noted that CDM analysis may not necessarily be conducted sequentially. For example, the model or the Q-matrix may need to be revised, and some items may need to be removed if the model cannot fit the data well.

An Illustration
This section will use a set of data to illustrate how to use different R packages for CDM analysis.

Data and Q-Matrix Preparation
We selected the grammar section of the Examination for the Certificate of Proficiency in English (ECPE), which has been used in several previous studies [9,36,42], to illustrate a CDM analysis. The dataset contains dichotomous responses of 2922 students to 28 items, reflecting their mastery of three grammar rules (attributes): morphosyntactic rules (A1), cohesive rules (A2), and lexical rules (A3). The Q-matrix is given in Table 1. As shown in Table 1, the Q-matrix specifies the attributes measured by each item. A cell with the value of 1 indicates that the corresponding item measures the corresponding attribute, and a cell with the value of 0 indicates the opposite. For example, the attribute vector, or q-vector, of item 1 is [1 1 0], indicating that it measures attributes 1 and 2. One can find the ECPE data and the corresponding Q-matrix in the CDM [14], GDINA [15], and edmdata [43] R packages.
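One possible way to load the responses and the Q-matrix is via the edmdata package; the object names below (items_ecpe, qmatrix_ecpe) follow that package, and binding them to dat and Q matches the objects used in the code throughout this paper:

```r
library(edmdata)

data("items_ecpe")    # 2922 x 28 binary response matrix
data("qmatrix_ecpe")  # 28 x 3 Q-matrix over attributes A1-A3

dat <- items_ecpe
Q   <- qmatrix_ecpe
```

Loading the equivalent objects shipped with the CDM or GDINA packages would work just as well.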

Empirical Q-Matrix Evaluation
Empirical Q-matrix evaluation involves validating the number of attributes (i.e., dimensionality evaluation) and detecting misspecified elements in the provisional Q-matrix. Although it usually occurs during the Q-matrix development phase, dimensionality evaluation may provide valuable insight into the structure of the provisional Q-matrix. Dimensionality evaluation can be conducted with the cdmTools [44] package, using the cdmTools::paK() and cdmTools::modelcompK() functions. The cdmTools::paK() function implements parallel analysis, comparing the eigenvalues of randomly resampled correlation matrices with those of the sample correlation matrices, based on principal components, Pearson correlations, and the mean criterion [27,45]. The argument cor specifies the type of correlations to be used; its default value is "both", implying that both Pearson and tetrachoric/polychoric correlations are used. In our code, we define cor = "cor", indicating that Pearson correlations are employed. The number of suggested attributes is extracted with $sug.K. As presented in the output below, the suggested number of attributes is 3, which equals that of our provisional Q-matrix.
R> res.paK <- cdmTools::paK(dat, cor = "cor")
R> res.paK$sug.K

The cdmTools::modelcompK() function compares several model fit indices of the CDMs fitted with different Q-matrices of a specified number of attributes, which are developed through the discrete factor loading (DFL) method [46] and the Hull method [47]. Nájera and colleagues [27] suggested preferring the AIC over the other indices. In the modelcompK() function, exploreK = 1:5 indicates that Q-matrices with one to five attributes are evaluated. The AIC and BIC values can be plotted across the number of attributes using the plot() function to obtain a direct view of the comparison result:

R> plot(res.modelcompK$fit$AIC, type = "b")
R> plot(res.modelcompK$fit$BIC, type = "b")

The plots in Figure 2 demonstrate that the tendency change of the AIC values and the minimum BIC value both occur at K = 3.
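The comparison itself can be run with a call along the following lines; this is a minimal sketch in which exploreK is taken from the text and all other settings are left at their defaults:

```r
# Fit CDMs under DFL/Hull-generated Q-matrices with 1 to 5 attributes
# and compare their fit indices; `dat` holds the ECPE responses
res.modelcompK <- cdmTools::modelcompK(dat, exploreK = 1:5)

res.modelcompK$fit$AIC  # AIC for each candidate number of attributes
res.modelcompK$fit$BIC  # BIC for each candidate number of attributes
```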
After the number of attributes has been assessed, whether the Q-matrix contains misspecified elements needs to be examined. Many R packages provide functions for this purpose. For example, the CDM package implements de la Torre's method [48], and the NPCD package [49] refines the Q-matrix based on Chiu's [50] nonparametric approach. Both methods may be used for mixed DINA and DINO models. The GDINA and cdmTools packages have functions for Q-matrix validation under saturated CDMs. In the cdmTools package, Q-matrix validation can be performed with the cdmTools::valQ() function, which implements the Hull method with the PVAF or McFadden's pseudo R-squared [47] and various iteration algorithms [31].

Following the suggestion of de la Torre and Minchen [32], the G-DINA model was employed when conducting Q-matrix validation to avoid conflating Q-matrix misspecifications with model misspecifications. Specifically, the G-DINA model was fitted to the data with the GDINA::GDINA() function. The argument mono.constraint is set to TRUE to impose monotonicity constraints on the model, ensuring that the probability of a correct response to an item will not decrease as a student masters more of the required attributes. The argument control = list(conv.crit = 0.000001) indicates that the convergence criterion was set to 0.000001 instead of the default value of 0.0001. In this paper, the stepwise Wald test [29] was used by specifying method = "wald" in the GDINA::Qval() function. Alternatively, the PVAF (i.e., the proportion of variance accounted for) method with fixed or predicted cutoffs can be applied [28] when using this function.
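The calibration step just described can be sketched as follows; this is a reconstruction based on the arguments named in the text, not the authors' original code:

```r
# Fit the saturated G-DINA model with monotonicity constraints and a
# tighter convergence criterion; `dat` and `Q` are the ECPE responses
# and the provisional Q-matrix
est <- GDINA::GDINA(dat = dat, Q = Q, model = "GDINA",
                    mono.constraint = TRUE,
                    control = list(conv.crit = 0.000001))
```

The resulting est object is then passed to GDINA::Qval() for Q-matrix validation.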
R> qv <- GDINA::Qval(GDINA.obj = est, method = "wald")

The suggested Q-matrix based on the stepwise Wald test is presented below. The cells marked with an asterisk are modified according to the validation results. In our case, the q-vectors of items 9 and 13 were suggested to be modified.
Additionally, mesa plots [51] were drawn to visualize the PVAF of the q-vectors of items 9 and 13, as shown in Figure 3, using the code below:

R> plot(qv, c(9, 13), eps = 0.95, data.label = TRUE)

In mesa plots, the x-axis represents the q-vectors and the y-axis their corresponding PVAF values. The default cutoff value for the PVAF (eps) is 0.95, out of the range from 0 to 1; the cutoff can be adjusted according to the researcher's judgment. De la Torre and Ma (2016) recommended that the q-vector on the edge of the "mesa" be considered the correct q-vector for the item. In these mesa plots, the red dots are the original q-vectors. The plots indicate that attributes 3 and 1 each contributed most of the variance of the item success probabilities of items 9 and 13, respectively, whereas the remaining attributes did not contribute much. According to these plots, the original q-vectors [001] and [100] are suggested to be the correct ones, instead of the q-vectors [101] in the modified Q-matrix.

Please note that, in this section, only Q-matrix validation methods applicable when a provisional Q-matrix is available were discussed; exploratory methods [52][53][54][55] have also been developed to estimate the Q-matrix from the response data without the need for a provisional Q-matrix. Additionally, different Q-matrix validation methods may produce different recommendations because their performance may be affected by many factors. For example, it has been shown that the stepwise Wald method may have difficulty converging when the number of attributes is large. As a result, although those recommendations can be valuable for refining the Q-matrix, whether to adopt the suggestions should be contingent on domain experts' judgment and interpretation.

CDM Calibration
After the Q-matrix is finalized, CDMs can be fitted to the data. Both the CDM and GDINA packages can fit the G-DINA model, but they have different default settings. This section will show how the model calibration from one package can be reproduced in the other. First, the data are calibrated using the GDINA package. Based on the estimates from the GDINA package, the following code allows fixing the parameters at their estimated values and obtaining a "gdina" object from the CDM package directly. In particular, the item parameter estimates from the GDINA::GDINA() function of the GDINA package were extracted with GDINA.est$delta.parm, and the prior probabilities of each latent class from the last E-step of the EM cycle were obtained via GDINA::extract(GDINA.est, what = "att.prior"). Using the CDM package, the arguments of the CDM::gdina() function can be defined with the elements extracted from GDINA.est. In particular, the arguments delta.fixed and attr.prob.fixed make it possible to fix the delta parameters and attribute probabilities, respectively. The argument reduced.skillspace is FALSE, indicating that the attribute patterns were not reduced and all possible attribute patterns were included in the estimation [56]. It should be noted that the attribute space needs to be specified using the argument skillclasses because the CDM::gdina() and GDINA::GDINA() functions use different attribute spaces by default. In the code below, the attribute space was defined as att.pattern and then specified in the CDM::gdina() function. Because of these settings, GDINA.est and CDM.est contain equivalent estimation results.
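The initial GDINA calibration can be sketched as follows; this is a reconstruction consistent with the settings described earlier, as the authors' exact call is not shown in the extracted text:

```r
# Calibrate the G-DINA model on the ECPE data with the finalized Q-matrix,
# mirroring the settings used during Q-matrix validation
GDINA.est <- GDINA::GDINA(dat = dat, Q = Q, model = "GDINA",
                          mono.constraint = TRUE,
                          control = list(conv.crit = 0.000001))
```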
R> delta.param <- GDINA::extract(GDINA.est, "delta.parm")
R> mixing.proportions <- GDINA::extract(GDINA.est, what = "att.prior")
R> K <- ncol(Q)
R> att.pattern <- GDINA::extract(GDINA.est, "attributepattern")
R> CDM.est <- CDM::gdina(dat, Q, skillclasses = att.pattern,
+                        delta.fixed = delta.param,
+                        attr.prob.fixed = mixing.proportions,
+                        reduced.skillspace = FALSE)

The data can also be fitted first using the CDM package prior to fixing the parameter estimates in the GDINA package. When fitting the data with the CDM::gdina() function, monotonicity constraints are imposed using the argument mono.constr, and the criteria for convergence are set using the arguments conv.crit and dev.crit. It should be noted that when the monotonicity constraints are imposed, a logit-link G-DINA model is adopted by default, which is mathematically equivalent to the identity-link G-DINA model. The GDINA::GDINA() function does not allow fixing the delta parameters directly; instead, the item success probabilities, extracted for each reduced attribute profile on each item, can be fixed. GDINA::GDINA() is then called with several arguments: the logit-link G-DINA model is specified via the arguments model and linkfunc; the attribute space used in the CDM package is extracted via cdm.fit$attribute.patt.splitted and specified using att.str; the initial item success probabilities are specified via the argument catprob.parm; and the initial distribution of latent classes is specified using att.prior. By specifying maxitr = 0 in the argument control, the EM cycle is disabled, and the initial item success probabilities and distribution of latent classes are used for the final E-step calculation.

After determining the CDMs and the Q-matrix, an assessment is performed to determine whether the parameters of the model can be identified.
The function cdmTools::is.Qid() from the cdmTools package checks model identifiability according to the criteria from Chen and colleagues [53] and Xu and Shang [57]. The Q-matrix needs to be provided via the Q argument, and the estimated model via the model argument; the available inputs for model are "DINA", "DINO", or "others". Here, "others" is indicated because the G-DINA model is used. As shown below, all parameters of the G-DINA model can be identified in this example.

R> cdmTools::is.Qid(Q, model = "others")

So far, the discussion has focused on how to estimate the G-DINA model using both packages, obtain equivalent objects between the two packages, and assess the identifiability of the G-DINA model globally. In practice, researchers may want to simplify the G-DINA model empirically, because it has been shown that reduced models, when used appropriately, can provide better classification results than the G-DINA model [34]. The GDINA package offers the function GDINA::modelcomp(), which implements the Wald test and the likelihood ratio test for assessing whether the G-DINA model can be reduced to five commonly used reduced models, namely, the DINA model, the DINO model, the A-CDM, the LLM, and the R-RUM. A caveat to the Wald and LR tests for model comparison is that a trivial discrepancy between two models may be detected when the sample size is large. In addition, the CDM package allows researchers to fit the regularized G-DINA model using a variety of penalty terms; this is a flexible approach to simplifying the G-DINA model, and interested readers may refer to Robitzsch [37] and Robitzsch and George [36] for more information. One should be aware that the logit link must be used when the regularized G-DINA model is specified in the CDM package.
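A call to this item-level model-comparison routine might look as follows; this is a sketch with default settings rather than the authors' original code:

```r
# Wald tests of whether each multi-attribute item can be reduced to the
# DINA, DINO, A-CDM, LLM, or R-RUM model
mc <- GDINA::modelcomp(GDINA.est)
mc  # print the test statistics and p-values for each item
```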

Model Fit Evaluation
Both CDM and GDINA packages offer functions for assessing model-data fit. Table 2 shows the functions and the statistics calculated in each package. It is evident that both packages calculate various statistics for assessing both absolute and relative fit at test and item levels. This paper will not enumerate the outputs of all those statistics; instead, it will focus on absolute fit statistics, as only the G-DINA model was used, and present some results as an example.

Test-Level Fit Evaluation
Most test-level absolute fit measures gauge the discrepancy between observed quantities and their model-implied counterparts. For example, the M2 statistic [38,58,59] compares the univariate and bivariate distributions of the observations and the model predictions. Because it conforms to a χ2 distribution, hypothesis tests can be conducted to assess whether the model fits the data. However, it is well known that a hypothesis test is affected by sample size, and a large sample may capture trivial discrepancies between the model and the data. To address this issue, the root mean square error of approximation (RMSEA2) [39,60] and the standardized root mean square residual (SRMSR) [39,61] can be used as effect-size measures. For both RMSEA2 and SRMSR, a smaller value indicates a better absolute model-data fit [62]. Simulation studies suggest that RMSEA2 < 0.03 indicates excellent fit, 0.03 < RMSEA2 < 0.045 good fit, and RMSEA2 > 0.045 poor fit; SRMSR < 0.05 indicates good model fit [39,63]. It should be noted that when the number of parameters is large, the M2 statistic, as well as RMSEA2, may not be calculable.

Aggregated item-level or item-pair-level absolute fit measures have also been used to assess test-level fit. Examples include the mean absolute difference between the observed and expected correlations (MADcor) [64,65], the maximum absolute difference between observed and predicted Fisher-transformed correlations (MaxAD.r) [64], the maximum absolute difference between observed and predicted log odds ratios (MaxAD.LOR) [64], the mean of absolute deviations of residual covariances (MADRESIDCOV) [66], and the maximum χ2 value of all item pairs (max(X2)) [67]. The χ2 statistic quantifies the deviance between the observed and predicted item-pair distributions, using the individual posterior distributions of the specified model.
For MaxAD.r, MaxAD.LOR, and max(X2), one can report the adjusted p-value to assess whether the model and the data fit well for the worst pair of items. For the other measures, a small value indicates a good fit. Since the value of MADRESIDCOV is often small, 100*MADRESIDCOV is usually reported [68].
In the CDM package, the function CDM::IRT.modelfit() can be used to calculate MADcor, MaxAD.r (labelled as abs(fcor)), MADRESIDCOV, max(X2), and SRMSR. The CDM::IRT.modelfit() function also calculates information criteria, such as Akaike's Information Criterion (AIC) [69] and the Bayesian Information Criterion (BIC) [70]. Both criteria are based on the maximum likelihood statistic, and the BIC is additionally affected by sample size. Both the AIC and BIC serve as measures for comparing model fit, and a smaller value indicates better fit. Nevertheless, please note that when parameters are fixed in model calibration, the calculation of the information criteria is incorrect and must be manually corrected.
In the GDINA package, the GDINA::modelfit() function calculates the M2 statistic, RMSEA2, and SRMSR for absolute fit evaluation, and the log-likelihood, AIC, BIC, CAIC, and SABIC for relative fit evaluation. The GDINA::itemfit() function calculates MaxAD.r and MaxAD.LOR. These statistics can be obtained directly from the fitted objects in both packages.
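Using the object names defined earlier, the calls might look like this (a sketch; the printed output is omitted):

```r
# Absolute and relative fit statistics from the GDINA package
GDINA::modelfit(GDINA.est)   # M2, RMSEA2, SRMSR, AIC, BIC, CAIC, SABIC
GDINA::itemfit(GDINA.est)    # item- and item-pair-level statistics

# Test-level fit statistics from the CDM package
CDM::IRT.modelfit(CDM.est)   # MADcor, abs(fcor), MADRESIDCOV, max(X2), SRMSR
```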

Item-Level and Item-Pair Level Fit Evaluation
Item-level absolute fit can be assessed using the S-χ2 item fit statistic [41,71], which can be calculated with the CDM::itemfit.sx2() function. The S-χ2 statistic compares the observed and expected proportions for each item and each latent class and forms a chi-square-distributed statistic. As a result, items with p-values greater than 0.05 indicate good item fit at the 0.05 nominal level. The output, for instance, indicates that item 13 has a significant misfit.
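With the CDM-package object defined earlier, the statistic can be requested as follows (a sketch, not the authors' original code):

```r
# S-χ2 item fit statistic for each item of the calibrated model;
# items with p < 0.05 indicate misfit at the 0.05 nominal level
sx2 <- CDM::itemfit.sx2(CDM.est)
sx2  # print the statistic and p-value for each item
```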
Unlike the S-χ2 and RMSD statistics, which focus on the extent to which the model fits the data for each item, the GDINA package reports the absolute difference between observed and predicted Fisher-transformed correlations and the absolute difference between observed and predicted log odds ratios for all item pairs [64]. Both measures focus on the extent to which the model can explain the association between each pair of items. A heatmap of the adjusted p-values of the transformed correlations between item pairs can be requested using plot(), as demonstrated in Figure 4. In the heatmap, items are presented on both the x- and y-axes; the first item on the x-axis and the last on the y-axis are dropped for pairing items. The adjusted p-values of all item pairs are plotted in the lower-right shaded area, where those of adequately fitting item pairs are in grey (p > 0.05) and those of inadequately fitting item pairs are in different tones of red (p < 0.05), depending on the p-value [42]. In our case, some item pairs (e.g., items 9 and 10, and items 13 and 22) demonstrated significant misfit and thus warrant further exploration by domain experts.

Item Diagnosticity Investigation
To assess item diagnosticity, the distribution of the probability of success across all latent groups for each item can be drawn using plot(), as in the code below:

R> plot(GDINA.est, item = 1:28)

The plots should show distinctions between the bars representing each latent group. A good example is item 20, presented in Figure 5, where an increase is observed in the probability of success as a student masters more of the attributes measured by this item. A poor example is item 17, where the success probability of all four latent groups is over 0.75 and little difference is observed between the bars. This indicates that a student has a more than 75% chance of answering this item correctly whether they master none of the attributes required by this item, only one attribute, or both attributes. In this regard, item 17 does not have the ability to distinguish students in different latent groups. Similar plots can be drawn with the plot() function in the CDM package as well.

Another way to check item diagnosticity is to investigate the item discrimination indices. In the GDINA package, item discrimination is measured by two indices: P(1)-P(0) and the G-DINA discrimination index (GDI). P(1)-P(0) measures the difference in success probabilities between those who master all required attributes and those who master none of them. The GDI measures the variance of the item success probabilities based on the reduced attribute profile [28,75]. An item with a higher value of P(1)-P(0) or GDI has higher discrimination power.
Currently, there is no agreement in the field on what constitutes good discrimination power. Although a higher value of P(1)-P(0) or GDI is desirable, it could be an indicator of overspecified q-vectors [28]. These two item discrimination indices can be requested using GDINA::extract() as in:
>R extract(GDINA.est, what = "discrim")
In the CDM package, P(1)-P(0) is referred to as the item discrimination index, or IDI [48,76]. The CDM package also calculates the discrimination index (DI) at the item-attribute level, based on the mastery probabilities when the measured attribute is included versus excluded for a specific item, and the DI at the test level, by averaging the marginalized probabilities of the item-attribute-level DIs for each item. Using CDM::discrim.index(), the test-, item-, and item-attribute-level DIs can be requested. Note that the item-level IDI is the same as the P(1)-P(0) values requested by GDINA::extract(). Although not presented here, the CDM package also calculates the cognitive diagnostic index (CDI) based on the Kullback-Leibler information (KLI) [76], which can be requested using CDM::cdi.kli().
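Both indices are straightforward to compute by hand, which can help in verifying what the packages report. The base-R sketch below uses hypothetical success probabilities and posterior weights for a two-attribute item: P(1)-P(0) is the gap between the full-mastery and no-mastery groups, and the GDI is the posterior-weighted variance of the success probabilities.

```r
# Hypothetical values for a two-attribute item: latent groups 00, 10, 01, 11
p_success <- c(0.20, 0.50, 0.55, 0.90)   # P(X = 1 | latent group)
weights   <- c(0.25, 0.25, 0.25, 0.25)   # posterior weights of the latent groups

# P(1) - P(0): success probability gap between masters of all vs. none
idi <- p_success[4] - p_success[1]

# GDI: weighted variance of the item success probabilities
p_bar <- sum(weights * p_success)
gdi   <- sum(weights * (p_success - p_bar)^2)

c(IDI = idi, GDI = gdi)
```

With these made-up numbers the item would be judged highly discriminating, since the success probability rises sharply from the no-mastery to the full-mastery group.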

CDM Result Presentation
The primary goal of CDM analysis is to classify students into different latent classes or, equivalently, to estimate students' attribute profiles. Both the CDM and GDINA packages can estimate person parameters using the expected a posteriori (EAP), maximum a posteriori (MAP), or maximum likelihood estimation (MLE) methods [83]. In the GDINA package, the GDINA::personparm() function can be used, whereas in the CDM package, the CDM::IRT.factor.scores() function can be applied. Below are the estimated attribute profiles of the first six students using GDINA::personparm().
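The logic behind the EAP and MAP estimators can be illustrated with a minimal base-R sketch, assuming a hypothetical two-attribute test and a made-up posterior distribution over the four latent classes for one student.

```r
# Latent classes for two attributes, one row per class: 00, 10, 01, 11
classes   <- rbind(c(0, 0), c(1, 0), c(0, 1), c(1, 1))
# Hypothetical posterior probabilities of the classes for one student
posterior <- c(0.10, 0.20, 0.30, 0.40)

# EAP: marginal mastery probability of each attribute
# (sum of the posteriors of all classes in which the attribute is mastered)
eap <- colSums(classes * posterior)

# MAP: the single most probable attribute profile
map <- classes[which.max(posterior), ]

eap  # mastery probabilities, which can be dichotomized (e.g., at 0.5)
map
```

Here the EAP estimates are 0.6 and 0.7 for the two attributes, and the MAP classification is the profile (1, 1), since that class carries the largest posterior probability.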
>R head(personparm(GDINA.est))
Using the mpRadar() function created from the fmsb::radarchart() function of the fmsb R package [84], a radar chart of the mastery probabilities can be plotted for a single student or for several students at the same time. As presented in Figure 6, the student has a nearly 100% chance of mastering the lexical rules, a nearly 0% chance of mastering the morphosyntactic rules, and about a 45% chance of mastering the cohesive rules. In addition to person classifications, the proportion of students who master or do not master each attribute, referred to as attribute prevalence, and the proportion of students in each latent class, referred to as the latent class proportions, can also be computed. They can be requested by calling the GDINA::extract() function and specifying what = "prevalence" and what = "posterior.prob", respectively. In the CDM package, they can be obtained from the list named "Skill Pattern Probabilities" by calling the summary() function. Figure 7 presents a bar plot of the attribute prevalence, and Figure 8 presents a pie chart and a doughnut chart of the latent class proportions. Figure 9 presents a network plot showing both the tetrachoric correlations among attributes and the attribute prevalence. In particular, the tetrachoric correlations are displayed on the arrows between corresponding attributes, and the attribute prevalence is represented using pie charts for each attribute. The attribute prevalence and latent class proportions can be plotted together as presented in Figure 10, similar to Bradshaw et al. [2]. The code for creating the plots in Figures 7-10 was written by the authors and can be requested from the first author of the article.
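Both summaries can also be reproduced directly from a matrix of estimated attribute profiles, as the base-R sketch below shows with a small made-up set of profiles.

```r
# Hypothetical estimated attribute profiles (rows = students, columns = attributes)
profiles <- rbind(c(1, 0, 1),
                  c(1, 1, 0),
                  c(0, 0, 1),
                  c(1, 1, 1))

# Attribute prevalence: proportion of students mastering each attribute
prevalence <- colMeans(profiles)

# Latent class proportions: share of students holding each observed profile
pattern    <- apply(profiles, 1, paste, collapse = "")
class_prop <- table(pattern) / nrow(profiles)

prevalence
class_prop
```

In practice the packages compute these quantities from the posterior distribution rather than from dichotomized profiles, but the counting logic is the same.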

Discussion
The purpose of this study is to provide a hands-on example of conducting CDM analysis in the G-DINA framework using R packages and to illustrate how different R packages can be used in an integrated manner, providing richer information for cognitive diagnosis. Using an exemplary dataset, the study demonstrated a workflow of CDM analyses, from Q-matrix validation to classification visualization. Such an illustration will be helpful to researchers who plan to conduct CDM analysis in R. However, only a limited number of relevant procedures were discussed, owing to what is currently available in existing R packages. Other procedures that are equally if not more critical can be found in the literature.
Despite their potential usefulness, the procedures discussed in this paper may not always work well. For example, the M2 statistic and RMSEA2 from the GDINA package may not be calculable if the number of parameters is too large. The S-χ2 item fit statistic from the CDM package also cannot be calculated if there are missing data. In addition, although it was shown that the CDM and GDINA packages can complement each other in various aspects, researchers need to proceed with caution when using them together. Separate data calibrations may produce different parameter estimates because (1) the EM algorithm may reach local maxima or (2) the packages use different default settings. Therefore, this paper showed how to obtain equivalent calibration results by fixing the parameter estimates obtained from one package in the other. Doing so, however, may lead to an incorrect count of the number of free parameters and consequently affect the calculation of other statistics, such as information criteria.
Finally, it should be emphasized that this paper only focuses on the CDM analysis of dichotomous response data using the G-DINA model. However, researchers can do more than that in R. For example, the CDM package can also handle the general diagnostic model and the regularized latent class model, while the GDINA package can handle several CDMs for multiple strategies. Both can also fit CDMs for polytomous attributes and polytomous responses. Also, the NPCD and ACTCD packages [49,85] can conduct nonparametric cognitive diagnostic analysis.
