# Regularized Latent Class Analysis for Polytomous Item Responses: An Application to SPM-LS Data


## Abstract


## 1. Introduction

## 2. Latent Class Analysis

### 2.1. Exploratory Latent Class Analysis for Dichotomous Item Responses

### 2.2. Exploratory Latent Class Analysis for Polytomous Item Responses

^{I} − 1 free parameters, while in the LCM $I \cdot K \cdot C + C - 1$ parameters are estimated. It should be emphasized that the LCM for polytomous items has more free parameters than LCMs for dichotomous items and than unidimensional or multidimensional item response models for polytomous data.
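
The parameter count of the LCM can be checked with a few lines of Python. This is a minimal sketch, not code from the article: the function name is hypothetical, and it assumes that $K$ counts the non-reference categories of an item (so $K = 1$ for dichotomous items). Under that assumption, $I = 12$ dichotomous items give 25, 38, 51 and 64 parameters for $C = 2, \dots, 5$, which agrees with the #np column reported in Table 2 for those class counts.

```python
def lcm_num_parameters(n_items, n_free_categories, n_classes):
    """Count I * K * C + C - 1 free parameters of an exploratory LCM.

    n_free_categories is K; it is assumed here to be the number of
    non-reference categories per item (K = 1 for dichotomous items).
    """
    item_parameters = n_items * n_free_categories * n_classes
    class_parameters = n_classes - 1  # class probabilities sum to one
    return item_parameters + class_parameters


# Twelve dichotomous items, two to five latent classes
for n_classes in range(2, 6):
    print(n_classes, lcm_num_parameters(12, 1, n_classes))
```

The count grows linearly in $K$, which is why polytomous items make the exploratory LCM heavily parameterized.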

## 3. Regularized Latent Class Analysis

### 3.1. Regularized Latent Class Analysis for Dichotomous Item Responses

#### 3.1.1. Fused Regularization among Latent Classes

#### 3.1.2. Hierarchies in Latent Class Models

### 3.2. Regularized Latent Class Analysis for Polytomous Item Responses

#### 3.2.1. Fused Regularization among Latent Classes

#### 3.2.2. Fused Regularization among Categories

#### 3.2.3. Fused Regularization among Latent Classes and Categories

#### 3.2.4. Fused Group Regularization among Categories

#### 3.2.5. Fused Group Regularization among Classes

### 3.3. Estimation

$p_{ic}(x; \boldsymbol{\gamma}_i) = P(X_i = x \mid U = c)$ and $p_c(\boldsymbol{\delta}) = P(U = c)$.

`regpolca()` in the R package `sirt` (Robitzsch 2020). The function is currently under development to improve its computational efficiency.
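
To make the notation concrete, the following Python sketch evaluates the probability of one response pattern from item response probabilities $P(X_i = x \mid U = c)$ and class probabilities $P(U = c)$. It assumes the usual local independence of items within a class, so that $P(\mathbf{X} = \mathbf{x}) = \sum_c P(U = c) \prod_i P(X_i = x_i \mid U = c)$. The function names and the toy probabilities are hypothetical, and the sketch does not reproduce the `regpolca()` implementation.

```python
def class_joint_probabilities(pattern, item_probs, class_probs):
    """Return p_c * prod_i p_ic(x_i) for every class c.

    item_probs[i][c][x] is P(X_i = x | U = c); class_probs[c] is P(U = c).
    Local independence within classes is assumed.
    """
    joint = []
    for c, p_c in enumerate(class_probs):
        value = p_c
        for i, x_i in enumerate(pattern):
            value *= item_probs[i][c][x_i]
        joint.append(value)
    return joint


def pattern_probability(pattern, item_probs, class_probs):
    """Marginal probability of a response pattern, summed over classes."""
    return sum(class_joint_probabilities(pattern, item_probs, class_probs))


def class_posterior(pattern, item_probs, class_probs):
    """Posterior class membership probabilities P(U = c | X = x)."""
    joint = class_joint_probabilities(pattern, item_probs, class_probs)
    total = sum(joint)
    return [value / total for value in joint]


# Hypothetical toy model: 2 items, 2 classes, 3 response categories
toy_item_probs = [
    [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]],  # item 1: class 1, class 2
    [[0.6, 0.3, 0.1], [0.2, 0.2, 0.6]],  # item 2: class 1, class 2
]
toy_class_probs = [0.4, 0.6]

print(pattern_probability((0, 2), toy_item_probs, toy_class_probs))
print(class_posterior((0, 2), toy_item_probs, toy_class_probs))
```

The posterior probabilities are the quantities an EM algorithm would compute in its E-step; the regularized M-step is not sketched here.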

## 4. Simulated Data Illustration

### 4.1. Dichotomous Item Responses

#### 4.1.1. Data Generation

#### 4.1.2. Results

### 4.2. Polytomous Item Responses

#### 4.2.1. Data Generation

#### 4.2.2. Results

## 5. Application of the SPM-LS Data

### 5.1. Method

### 5.2. Results

#### 5.2.1. Results for Dichotomous Item Responses

#### 5.2.2. Results for Polytomous Item Responses

## 6. Discussion

## Funding

## Acknowledgments

## Conflicts of Interest

## Abbreviations

Abbreviation | Meaning |
---|---|
AIC | Akaike information criterion |
BIC | Bayesian information criterion |
EM | expectation maximization |
LASSO | least absolute shrinkage and selection operator |
LCA | latent class analysis |
LCM | latent class model |
NLM | nested logit model |
RLCA | regularized latent class analysis |
RLCM | restricted latent class model |
SPM-LS | last series of Raven’s standard progressive matrices |

## Appendix A. Additional Results for Simulated Data Illustration with Polytomous Item Responses

**Table A1.** Data illustration polytomous data: estimated item response probabilities in the exploratory LCM with $C=4$ classes.

Item | Cat | Class 1 | Class 2 | Class 3 | Class 4 | Item | Cat | Class 1 | Class 2 | Class 3 | Class 4 | Item | Cat | Class 1 | Class 2 | Class 3 | Class 4 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0 | 0.08 | 0.78 | 0.80 | 0.82 | 5 | 0 | 0.10 | 0.11 | 0.45 | 0.92 | 9 | 0 | 0.14 | 0.82 | 0.14 | 0.81 |
1 | 1 | 0.33 | 0.08 | 0.07 | 0.07 | 5 | 1 | 0.34 | 0.27 | 0.23 | 0.02 | 9 | 1 | 0.27 | 0.08 | 0.26 | 0.08 |
1 | 2 | 0.31 | 0.05 | 0.05 | 0.06 | 5 | 2 | 0.32 | 0.31 | 0.17 | 0.02 | 9 | 2 | 0.32 | 0.05 | 0.33 | 0.05 |
1 | 3 | 0.29 | 0.08 | 0.08 | 0.06 | 5 | 3 | 0.24 | 0.31 | 0.15 | 0.04 | 9 | 3 | 0.27 | 0.05 | 0.27 | 0.05 |
2 | 0 | 0.20 | 0.84 | 0.89 | 0.91 | 6 | 0 | 0.22 | 0.24 | 0.31 | 0.77 | 10 | 0 | 0.19 | 0.89 | 0.40 | 0.83 |
2 | 1 | 0.30 | 0.06 | 0.00 | 0.06 | 6 | 1 | 0.23 | 0.17 | 0.16 | 0.07 | 10 | 1 | 0.42 | 0.05 | 0.28 | 0.05 |
2 | 2 | 0.28 | 0.06 | 0.01 | 0.03 | 6 | 2 | 0.23 | 0.21 | 0.06 | 0.03 | 10 | 2 | 0.19 | 0.01 | 0.12 | 0.03 |
2 | 3 | 0.23 | 0.03 | 0.10 | 0.00 | 6 | 3 | 0.31 | 0.39 | 0.47 | 0.13 | 10 | 3 | 0.20 | 0.05 | 0.20 | 0.10 |
3 | 0 | 0.15 | 0.76 | 0.20 | 0.80 | 7 | 0 | 0.07 | 0.79 | 0.83 | 0.84 | 11 | 0 | 0.10 | 0.13 | 0.55 | 0.90 |
3 | 1 | 0.25 | 0.15 | 0.34 | 0.10 | 7 | 1 | 0.34 | 0.10 | 0.06 | 0.05 | 11 | 1 | 0.32 | 0.28 | 0.10 | 0.03 |
3 | 2 | 0.36 | 0.06 | 0.18 | 0.05 | 7 | 2 | 0.31 | 0.07 | 0.06 | 0.06 | 11 | 2 | 0.26 | 0.34 | 0.17 | 0.03 |
3 | 3 | 0.24 | 0.03 | 0.28 | 0.06 | 7 | 3 | 0.28 | 0.03 | 0.06 | 0.05 | 11 | 3 | 0.32 | 0.25 | 0.18 | 0.04 |
4 | 0 | 0.27 | 0.89 | 0.30 | 0.86 | 8 | 0 | 0.24 | 0.86 | 0.91 | 0.88 | 12 | 0 | 0.25 | 0.19 | 0.21 | 0.77 |
4 | 1 | 0.35 | 0.04 | 0.28 | 0.04 | 8 | 1 | 0.23 | 0.06 | 0.05 | 0.06 | 12 | 1 | 0.23 | 0.24 | 0.27 | 0.08 |
4 | 2 | 0.19 | 0.02 | 0.21 | 0.01 | 8 | 2 | 0.24 | 0.06 | 0.02 | 0.06 | 12 | 2 | 0.19 | 0.20 | 0.10 | 0.04 |
4 | 3 | 0.19 | 0.05 | 0.22 | 0.09 | 8 | 3 | 0.28 | 0.02 | 0.03 | 0.00 | 12 | 3 | 0.34 | 0.38 | 0.42 | 0.12 |

## Appendix B. Additional Results for SPM-LS Dataset with Polytomous Item Responses

**Table A2.** SPM-LS polytomous data: estimated item response probabilities and latent class probabilities for best-fitting RLCM with $C=3$ latent classes.

Item | Cat | C1 | C2 | C3 | Item | Cat | C1 | C2 | C3 | Item | Cat | C1 | C2 | C3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SPM1 | 0 | 0.73 | 0.15 | 0.82 | SPM5 | 0 | 0.72 | 0.04 | 0.98 | SPM9 | 0 | 0.25 | 0.22 | 0.73 |
SPM1 | 1 | 0.11 | 0.48 | 0.12 | SPM5 | 1 | 0.05 | 0.48 | 0.00 | SPM9 | 1 | 0.14 | 0.03 | 0.12 |
SPM1 | 2 | 0.06 | 0.01 | 0.02 | SPM5 | 2 | 0.03 | 0.32 | 0.01 | SPM9 | 2 | 0.17 | 0.00 | 0.07 |
SPM1 | 3 | 0.07 | 0.04 | 0.01 | SPM5 | 3 | 0.08 | 0.00 | 0.00 | SPM9 | 3 | 0.12 | 0.32 | 0.03 |
SPM1 | 4 | 0.00 | 0.24 | 0.01 | SPM5 | 4 | 0.06 | 0.00 | 0.00 | SPM9 | 4 | 0.19 | 0.15 | 0.01 |
SPM1 | 5 | 0.01 | 0.08 | 0.02 | SPM5 | 5 | 0.02 | 0.08 | 0.01 | SPM9 | 5 | 0.06 | 0.00 | 0.03 |
SPM1 | 6 | 0.02 | 0.00 | 0.00 | SPM5 | 6 | 0.02 | 0.08 | 0.00 | SPM9 | 6 | 0.06 | 0.16 | 0.01 |
SPM1 | 7 | 0.00 | 0.00 | 0.00 | SPM5 | 7 | 0.02 | 0.00 | 0.00 | SPM9 | 7 | 0.01 | 0.12 | 0.00 |
SPM2 | 0 | 0.87 | 0.32 | 0.97 | SPM6 | 0 | 0.51 | 0.08 | 0.92 | SPM10 | 0 | 0.08 | 0.00 | 0.56 |
SPM2 | 1 | 0.02 | 0.12 | 0.03 | SPM6 | 1 | 0.10 | 0.36 | 0.04 | SPM10 | 1 | 0.14 | 0.12 | 0.19 |
SPM2 | 2 | 0.02 | 0.36 | 0.00 | SPM6 | 2 | 0.11 | 0.00 | 0.03 | SPM10 | 2 | 0.26 | 0.40 | 0.03 |
SPM2 | 3 | 0.07 | 0.00 | 0.00 | SPM6 | 3 | 0.09 | 0.00 | 0.01 | SPM10 | 3 | 0.10 | 0.08 | 0.07 |
SPM2 | 4 | 0.01 | 0.12 | 0.00 | SPM6 | 4 | 0.06 | 0.24 | 0.00 | SPM10 | 4 | 0.13 | 0.00 | 0.06 |
SPM2 | 5 | 0.00 | 0.08 | 0.00 | SPM6 | 5 | 0.06 | 0.20 | 0.00 | SPM10 | 5 | 0.10 | 0.08 | 0.06 |
SPM2 | 6 | 0.01 | 0.00 | 0.00 | SPM6 | 6 | 0.05 | 0.08 | 0.00 | SPM10 | 6 | 0.10 | 0.32 | 0.03 |
SPM2 | 7 | 0.00 | 0.00 | 0.00 | SPM6 | 7 | 0.02 | 0.04 | 0.00 | SPM10 | 7 | 0.09 | 0.00 | 0.00 |
SPM3 | 0 | 0.67 | 0.04 | 0.92 | SPM7 | 0 | 0.38 | 0.20 | 0.88 | SPM11 | 0 | 0.11 | 0.26 | 0.48 |
SPM3 | 1 | 0.17 | 0.00 | 0.05 | SPM7 | 1 | 0.06 | 0.12 | 0.06 | SPM11 | 1 | 0.18 | 0.43 | 0.10 |
SPM3 | 2 | 0.08 | 0.40 | 0.00 | SPM7 | 2 | 0.13 | 0.28 | 0.01 | SPM11 | 2 | 0.21 | 0.00 | 0.12 |
SPM3 | 3 | 0.01 | 0.32 | 0.00 | SPM7 | 3 | 0.12 | 0.28 | 0.01 | SPM11 | 3 | 0.15 | 0.12 | 0.07 |
SPM3 | 4 | 0.01 | 0.04 | 0.02 | SPM7 | 4 | 0.11 | 0.00 | 0.02 | SPM11 | 4 | 0.14 | 0.00 | 0.08 |
SPM3 | 5 | 0.00 | 0.12 | 0.01 | SPM7 | 5 | 0.07 | 0.00 | 0.02 | SPM11 | 5 | 0.08 | 0.19 | 0.07 |
SPM3 | 6 | 0.04 | 0.04 | 0.00 | SPM7 | 6 | 0.09 | 0.00 | 0.00 | SPM11 | 6 | 0.07 | 0.00 | 0.07 |
SPM3 | 7 | 0.02 | 0.04 | 0.00 | SPM7 | 7 | 0.04 | 0.12 | 0.00 | SPM11 | 7 | 0.06 | 0.00 | 0.01 |
SPM4 | 0 | 0.62 | 0.00 | 0.98 | SPM8 | 0 | 0.18 | 0.00 | 0.81 | SPM12 | 0 | 0.03 | 0.24 | 0.45 |
SPM4 | 1 | 0.11 | 0.40 | 0.01 | SPM8 | 1 | 0.24 | 0.04 | 0.01 | SPM12 | 1 | 0.14 | 0.09 | 0.17 |
SPM4 | 2 | 0.04 | 0.36 | 0.00 | SPM8 | 2 | 0.08 | 0.00 | 0.07 | SPM12 | 2 | 0.19 | 0.44 | 0.10 |
SPM4 | 3 | 0.07 | 0.08 | 0.00 | SPM8 | 3 | 0.14 | 0.08 | 0.03 | SPM12 | 3 | 0.23 | 0.03 | 0.06 |
SPM4 | 4 | 0.05 | 0.00 | 0.01 | SPM8 | 4 | 0.11 | 0.20 | 0.03 | SPM12 | 4 | 0.12 | 0.00 | 0.07 |
SPM4 | 5 | 0.04 | 0.12 | 0.00 | SPM8 | 5 | 0.07 | 0.28 | 0.04 | SPM12 | 5 | 0.14 | 0.00 | 0.06 |
SPM4 | 6 | 0.04 | 0.00 | 0.00 | SPM8 | 6 | 0.11 | 0.40 | 0.01 | SPM12 | 6 | 0.07 | 0.20 | 0.07 |
SPM4 | 7 | 0.03 | 0.04 | 0.00 | SPM8 | 7 | 0.07 | 0.00 | 0.00 | SPM12 | 7 | 0.08 | 0.00 | 0.02 |

## References

- Agresti, Alan, and Maria Kateri. 2014. Some remarks on latent variable models in categorical data analysis. Communications in Statistics - Theory and Methods 43: 801–14.
- Battauz, Michela. 2019. Regularized estimation of the nominal response model. Multivariate Behavioral Research.
- Bhattacharya, Sakyajit, and Paul D. McNicholas. 2014. A LASSO-penalized BIC for mixture model selection. Advances in Data Analysis and Classification 8: 45–61.
- Borsboom, Denny, Mijke Rhemtulla, Angelique O. J. Cramer, Han L. J. van der Maas, Marten Scheffer, and Conor V. Dolan. 2016. Kinds versus continua: A review of psychometric approaches to uncover the structure of psychiatric constructs. Psychological Medicine 46: 1567–79.
- Cao, Peng, Xiaoli Liu, Hezi Liu, Jinzhu Yang, Dazhe Zhao, Min Huang, and Osmar Zaiane. 2018. Generalized fused group lasso regularized multi-task feature learning for predicting cognitive outcomes in Alzheimer’s disease. Computer Methods and Programs in Biomedicine 162: 19–45.
- Chen, Yunxiao, Jingchen Liu, Gongjun Xu, and Zhiliang Ying. 2015. Statistical analysis of Q-matrix based diagnostic classification models. Journal of the American Statistical Association 110: 850–66.
- Chen, Yunxiao, Xiaoou Li, Jingchen Liu, and Zhiliang Ying. 2017. Regularized latent class analysis with application in cognitive diagnosis. Psychometrika 82: 660–92.
- Chen, Yunxiao, Xiaoou Li, Jingchen Liu, and Zhiliang Ying. 2018. Robust measurement via a fused latent and graphical item response theory model. Psychometrika 83: 538–62.
- Collins, Linda M., and Stephanie T. Lanza. 2009. Latent Class and Latent Transition Analysis: With Applications in the Social, Behavioral, and Health Sciences. New York: Wiley.
- DeSantis, Stacia M., E. Andrés Houseman, Brent A. Coull, Catherine L. Nutt, and Rebecca A. Betensky. 2012. Supervised Bayesian latent class models for high-dimensional data. Statistics in Medicine 31: 1342–60.
- DeSantis, Stacia M., E. Andrés Houseman, Brent A. Coull, Anat Stemmer-Rachamimov, and Rebecca A. Betensky. 2008. A penalized latent class model for ordinal data. Biostatistics 9: 249–62.
- Fan, Jianqing, and Runze Li. 2001. Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association 96: 1348–60.
- Finch, W. Holmes, and Kendall C. Bronk. 2011. Conducting confirmatory latent class analysis using Mplus. Structural Equation Modeling 18: 132–51.
- Fop, Michael, and Thomas B. Murphy. 2018. Variable selection methods for model-based clustering. Statistics Surveys 12: 18–65.
- Formann, Anton K. 1982. Linear logistic latent class analysis. Biometrical Journal 24: 171–90.
- Formann, Anton K. 1992. Linear logistic latent class analysis for polytomous data. Journal of the American Statistical Association 87: 476–86.
- Formann, Anton K. 2007. (Almost) equivalence between conditional and mixture maximum likelihood estimates for some models of the Rasch type. In Multivariate and Mixture Distribution Rasch Models. Edited by Matthias von Davier and Claus H. Carstensen. New York: Springer, pp. 177–89.
- Formann, Anton K., and Thomas Kohlmann. 1998. Structural latent class models. Sociological Methods & Research 26: 530–65.
- George, Ann C., Alexander Robitzsch, Thomas Kiefer, Jürgen Groß, and Ali Ünlü. 2016. The R package CDM for cognitive diagnosis models. Journal of Statistical Software 74: 1–24.
- Gu, Yuqi, and Gongjun Xu. 2018. Partial identifiability of restricted latent class models. arXiv:1803.04353.
- Gu, Yuqi, and Gongjun Xu. 2019. Learning attribute patterns in high-dimensional structured latent attribute models. Journal of Machine Learning Research 20: 115.
- Hastie, Trevor, Robert Tibshirani, and Martin Wainwright. 2015. Statistical Learning with Sparsity: The Lasso and Generalizations. Boca Raton: CRC Press.
- Houseman, E. Andrés, Brent A. Coull, and Rebecca A. Betensky. 2006. Feature-specific penalized latent class analysis for genomic data. Biometrics 62: 1062–70.
- Huang, Jian, Patrick Breheny, and Shuangge Ma. 2012. A selective review of group selection in high-dimensional models. Statistical Science 27: 481–99.
- Huang, Po-Hsien, Hung Chen, and Li-Jen Weng. 2017. A penalized likelihood method for structural equation modeling. Psychometrika 82: 329–54.
- Jacobucci, Ross, Kevin J. Grimm, and John J. McArdle. 2016. Regularized structural equation modeling. Structural Equation Modeling 23: 555–66.
- Janssen, Anne B., and Christian Geiser. 2010. On the relationship between solution strategies in two mental rotation tasks. Learning and Individual Differences 20: 473–78.
- Kang, Hyeon-Ah, Jingchen Liu, and Zhiliang Ying. 2017. A graphical diagnostic classification model. arXiv:1707.06318.
- Keribin, Christine. 2000. Consistent estimation of the order of mixture models. Sankhyā: The Indian Journal of Statistics, Series A 62: 49–66.
- Langeheine, Rolf, and Jürgen Rost. 1988. Latent Trait and Latent Class Models. New York: Plenum Press.
- Lazarsfeld, Paul F., and Neil W. Henry. 1968. Latent Structure Analysis. Boston: Houghton Mifflin.
- Leoutsakos, Jeannie-Marie S., Karen Bandeen-Roche, Elizabeth Garrett-Mayer, and Peter P. Zandi. 2011. Incorporating scientific knowledge into phenotype development: Penalized latent class regression. Statistics in Medicine 30: 784–98.
- Liu, Jingchen, and Hyeon-Ah Kang. 2019. Q-matrix learning via latent variable selection and identifiability. In Handbook of Diagnostic Classification Models. Edited by Matthias von Davier and Young-Sun Lee. Cham: Springer, pp. 247–63.
- Liu, Xiaoli, Peng Cao, Jianzhong Wang, Jun Kong, and Dazhe Zhao. 2019. Fused group lasso regularized multi-task feature learning and its application to the cognitive performance prediction of Alzheimer’s disease. Neuroinformatics 17: 271–94.
- Myszkowski, Nils, and Martin Storme. 2018. A snapshot of g. Binary and polytomous item-response theory investigations of the last series of the standard progressive matrices (SPM-LS). Intelligence 68: 109–16.
- Nussbeck, Fritjof W., and Michael Eid. 2015. Multimethod latent class analysis. Frontiers in Psychology 6: 1332.
- Oberski, Daniel L., Jacques A. P. Hagenaars, and Willem E. Saris. 2015. The latent class multitrait-multimethod model. Psychological Methods 20: 422–43.
- Oelker, Margret-Ruth, and Gerhard Tutz. 2017. A uniform framework for the combination of penalties in generalized structured models. Advances in Data Analysis and Classification 11: 97–120.
- Robitzsch, Alexander. 2020. sirt: Supplementary Item Response Theory Models. R Package Version 3.9-4. Available online: https://CRAN.R-project.org/package=sirt (accessed on 17 February 2020).
- Robitzsch, Alexander, and Ann C. George. 2019. The R package CDM for diagnostic modeling. In Handbook of Diagnostic Classification Models. Edited by Matthias von Davier and Young-Sun Lee. Cham: Springer, pp. 549–72.
- Ruan, Lingyan, Ming Yuan, and Hui Zou. 2011. Regularized parameter estimation in high-dimensional Gaussian mixture models. Neural Computation 23: 1605–22.
- San Martín, Ernesto. 2018. Identifiability of structural characteristics: How relevant is it for the Bayesian approach? Brazilian Journal of Probability and Statistics 32: 346–73.
- Scharf, Florian, and Steffen Nestler. 2019. Should regularization replace simple structure rotation in exploratory factor analysis? Structural Equation Modeling 26: 576–90.
- Schmiege, Sarah J., Katherine E. Masyn, and Angela D. Bryan. 2018. Confirmatory latent class analysis: Illustrations of empirically driven and theoretically driven model constraints. Organizational Research Methods 21: 983–1001.
- Storme, Martin, Nils Myszkowski, Simon Baron, and David Bernard. 2019. Same test, better scores: Boosting the reliability of short online intelligence recruitment tests with nested logit item response theory models. Journal of Intelligence 7: 17.
- Sun, Jianan, Yunxiao Chen, Jingchen Liu, Zhiliang Ying, and Tao Xin. 2016. Latent variable selection for multidimensional item response theory models via $L_1$ regularization. Psychometrika 81: 921–39.
- Sun, Jiehuan, Jose D. Herazo-Maya, Philip L. Molyneaux, Toby M. Maher, Naftali Kaminski, and Hongyu Zhao. 2019. Regularized latent class model for joint analysis of high-dimensional longitudinal biomarkers and a time-to-event outcome. Biometrics 75: 69–77.
- Tamhane, Ajit C., Dingxi Qiu, and Bruce E. Ankenman. 2010. A parametric mixture model for clustering multivariate binary data. Statistical Analysis and Data Mining 3: 3–19.
- Tibshirani, Robert, Michael Saunders, Saharon Rosset, Ji Zhu, and Keith Knight. 2005. Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society. Series B: Statistical Methodology 67: 91–108.
- Tutz, Gerhard, and Jan Gertheiss. 2016. Regularized regression for categorical data. Statistical Modelling 16: 161–200.
- Tutz, Gerhard, and Gunther Schauberger. 2015. A penalty approach to differential item functioning in Rasch models. Psychometrika 80: 21–43.
- van Erp, Sara, Daniel L. Oberski, and Joris Mulder. 2019. Shrinkage priors for Bayesian penalized regression. Journal of Mathematical Psychology 89: 31–50.
- von Davier, Matthias. 2008. A general diagnostic model applied to language testing data. British Journal of Mathematical and Statistical Psychology 61: 287–307.
- von Davier, Matthias. 2010. Hierarchical mixtures of diagnostic models. Psychological Test and Assessment Modeling 52: 8–28.
- von Davier, Matthias, and Young-Sun Lee, eds. 2019. Handbook of Diagnostic Classification Models. Cham: Springer.
- von Davier, Matthias, Bobby Naemi, and Richard D. Roberts. 2012. Factorial versus typological models: A comparison of methods for personality data. Measurement: Interdisciplinary Research and Perspectives 10: 185–208.
- Wang, Chun, and Jing Lu. 2020. Learning attribute hierarchies from data: Two exploratory approaches. Journal of Educational and Behavioral Statistics.
- Wu, Baolin. 2013. Sparse cluster analysis of large-scale discrete variables with application to single nucleotide polymorphism data. Journal of Applied Statistics 40: 358–67.
- Wu, Zhenke, Livia Casciola-Rosen, Antony Rosen, and Scott L. Zeger. 2018. A Bayesian approach to restricted latent class models for scientifically-structured clustering of multivariate binary outcomes. arXiv:1808.08326.
- Xu, Gongjun. 2017. Identifiability of restricted latent class models with binary responses. Annals of Statistics 45: 675–707.
- Xu, Gongjun, and Zhuoran Shang. 2018. Identifying latent structures in restricted latent class models. Journal of the American Statistical Association 113: 1284–95.
- Yamamoto, Michio, and Kenichi Hayashi. 2015. Clustering of multivariate binary data with dimension reduction via $L_1$-regularized likelihood maximization. Pattern Recognition 48: 3959–68.
- Zhang, Cun-Hui. 2010. Nearly unbiased variable selection under minimax concave penalty. Annals of Statistics 38: 894–942.

**Figure 1.** Different penalty functions used in regularization with regularization parameter $\lambda =0.25$ (**left panel**) and $\lambda =0.125$ (**right panel**).
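
The caption of Figure 1 does not say which penalty functions are plotted. As an illustration only, the sketch below implements three penalties that match the reference list: the LASSO, the SCAD penalty (Fan and Li 2001) and the MCP (Zhang 2010). The function names and the tuning constants $a = 3.7$ (SCAD) and $a = 3$ (MCP) are assumptions of this sketch and are not taken from the article.

```python
def lasso_penalty(x, lam):
    """LASSO penalty: lam * |x|."""
    return lam * abs(x)


def scad_penalty(x, lam, a=3.7):
    """SCAD penalty; a = 3.7 is an assumed conventional default."""
    ax = abs(x)
    if ax <= lam:
        return lam * ax
    if ax <= a * lam:
        return -(ax ** 2 - 2 * a * lam * ax + lam ** 2) / (2 * (a - 1))
    return (a + 1) * lam ** 2 / 2


def mcp_penalty(x, lam, a=3.0):
    """Minimax concave penalty; a = 3 is an assumed default."""
    ax = abs(x)
    if ax <= a * lam:
        return lam * ax - ax ** 2 / (2 * a)
    return a * lam ** 2 / 2


# Evaluate the penalties for the two regularization parameters of Figure 1
for lam in (0.25, 0.125):
    for x in (0.1, 0.5, 2.0):
        print(lam, x, lasso_penalty(x, lam), scad_penalty(x, lam), mcp_penalty(x, lam))
```

Unlike the LASSO, the SCAD and MCP penalties level off for large arguments, so large parameter differences are not shrunk further.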

**Figure 3.** Data illustration dichotomous data: Regularization path for estimated item response probabilities for Item 1 for Classes 2, 3, 4 for the four-class solution.

Item | Class 1 | Class 2 | Class 3 | Class 4 |
---|---|---|---|---|
1, 7 | 0.10 | 0.82 | 0.82 | 0.82 |
2, 8 | 0.22 | 0.88 | 0.88 | 0.88 |
3, 9 | 0.16 | 0.79 | 0.16 | 0.79 |
4, 10 | 0.25 | 0.85 | 0.25 | 0.85 |
5, 11 | 0.10 | 0.10 | 0.46 | 0.91 |
6, 12 | 0.22 | 0.22 | 0.22 | 0.79 |

**Table 2.** Data illustration dichotomous data: model comparison for exploratory latent class models (LCMs).

C | #np | AIC | BIC |
---|---|---|---|
2 | 25 | 13,636 | 13,759 |
3 | 38 | 13,169 | 13,356 |
4 | 51 | 12,979 | 13,229 |
5 | 64 | 12,981 | 13,295 |
6 | 76 | 12,976 | 13,349 |
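
Choosing the number of classes from Table 2 amounts to picking the row with the smallest information criterion. The short sketch below, with hypothetical variable names, copies the AIC and BIC values from the table and reports the minimizing number of classes for each criterion.

```python
# Number of classes C mapped to (AIC, BIC), copied from Table 2
criteria = {
    2: (13636, 13759),
    3: (13169, 13356),
    4: (12979, 13229),
    5: (12981, 13295),
    6: (12976, 13349),
}

# Select the number of classes that minimizes each criterion
best_aic = min(criteria, key=lambda c: criteria[c][0])
best_bic = min(criteria, key=lambda c: criteria[c][1])

print("AIC selects C =", best_aic)
print("BIC selects C =", best_bic)
```

With these values the two criteria disagree: the AIC is smallest for six classes, whereas the BIC, which penalizes the number of parameters more strongly, is smallest for four classes.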

**Table 3.** Data illustration dichotomous data: estimated item response probabilities in exploratory LCM with $C=4$ classes.

Item | Class 1 | Class 2 | Class 3 | Class 4 |
---|---|---|---|---|
1 | 0.08 | 0.79 | 0.79 | 0.82 |
2 | 0.20 | 0.84 | 0.89 | 0.91 |
3 | 0.15 | 0.76 | 0.19 | 0.81 |
4 | 0.27 | 0.90 | 0.29 | 0.86 |
5 | 0.10 | 0.09 | 0.44 | 0.92 |
6 | 0.23 | 0.23 | 0.30 | 0.77 |
7 | 0.07 | 0.79 | 0.82 | 0.85 |
8 | 0.24 | 0.87 | 0.91 | 0.87 |
9 | 0.14 | 0.82 | 0.18 | 0.81 |
10 | 0.19 | 0.90 | 0.42 | 0.83 |
11 | 0.10 | 0.13 | 0.54 | 0.89 |
12 | 0.25 | 0.19 | 0.19 | 0.77 |

**Table 4.** Data illustration dichotomous data: estimated item response probabilities in the regularized latent class model (RLCM) with $C=4$ classes based on the minimal Bayesian information criterion (BIC) ($\lambda =0.21$).

Item | Class 1 | Class 2 | Class 3 | Class 4 |
---|---|---|---|---|
1 | 0.08 | 0.80 | 0.80 | 0.80 |
2 | 0.20 | 0.88 | 0.88 | 0.88 |
3 | 0.15 | 0.79 | 0.15 | 0.79 |
4 | 0.27 | 0.87 | 0.27 | 0.87 |
5 | 0.09 | 0.09 | 0.44 | 0.92 |
6 | 0.23 | 0.23 | 0.23 | 0.76 |
7 | 0.07 | 0.82 | 0.82 | 0.82 |
8 | 0.24 | 0.87 | 0.87 | 0.87 |
9 | 0.14 | 0.81 | 0.14 | 0.81 |
10 | 0.19 | 0.85 | 0.40 | 0.85 |
11 | 0.11 | 0.11 | 0.55 | 0.89 |
12 | 0.22 | 0.22 | 0.22 | 0.76 |

Item | Cat | Class 1 | Class 2 | Class 3 | Class 4 | Item | Cat | Class 1 | Class 2 | Class 3 | Class 4 |
---|---|---|---|---|---|---|---|---|---|---|---|
1, 7 | 0 | 0.10 | 0.82 | 0.82 | 0.82 | 4, 10 | 0 | 0.25 | 0.85 | 0.25 | 0.85 |
1, 7 | 1 | 0.30 | 0.06 | 0.06 | 0.06 | 4, 10 | 1 | 0.35 | 0.03 | 0.35 | 0.03 |
1, 7 | 2 | 0.30 | 0.06 | 0.06 | 0.06 | 4, 10 | 2 | 0.20 | 0.03 | 0.20 | 0.03 |
1, 7 | 3 | 0.30 | 0.06 | 0.06 | 0.06 | 4, 10 | 3 | 0.20 | 0.09 | 0.20 | 0.09 |
2, 8 | 0 | 0.22 | 0.88 | 0.88 | 0.88 | 5, 11 | 0 | 0.10 | 0.10 | 0.46 | 0.91 |
2, 8 | 1 | 0.26 | 0.05 | 0.04 | 0.06 | 5, 11 | 1 | 0.30 | 0.30 | 0.18 | 0.03 |
2, 8 | 2 | 0.26 | 0.05 | 0.04 | 0.06 | 5, 11 | 2 | 0.30 | 0.30 | 0.18 | 0.03 |
2, 8 | 3 | 0.26 | 0.02 | 0.04 | 0.00 | 5, 11 | 3 | 0.30 | 0.30 | 0.18 | 0.03 |
3, 9 | 0 | 0.16 | 0.79 | 0.16 | 0.79 | 6, 12 | 0 | 0.22 | 0.22 | 0.22 | 0.79 |
3, 9 | 1 | 0.28 | 0.11 | 0.28 | 0.11 | 6, 12 | 1 | 0.24 | 0.23 | 0.22 | 0.06 |
3, 9 | 2 | 0.33 | 0.05 | 0.33 | 0.05 | 6, 12 | 2 | 0.20 | 0.17 | 0.12 | 0.04 |
3, 9 | 3 | 0.23 | 0.05 | 0.23 | 0.05 | 6, 12 | 3 | 0.34 | 0.38 | 0.44 | 0.11 |

C | #np | AIC | BIC |
---|---|---|---|
2 | 72 | 25,082 | 25,440 |
3 | 107 | 24,616 | 25,151 |
4 | 143 | 24,431 | 25,148 |
5 | 179 | 24,439 | 25,337 |
6 | 215 | 24,444 | 25,524 |

Appr. | Fused | Equation | C | $\lambda_1$ | $\lambda_2$ | #np | #nreg | BIC |
---|---|---|---|---|---|---|---|---|
R1 | Class | (10) | 4 | 0.31 | - | 84 | 63 | 24,982 |
R2 | Cat | (11) | 4 | - | 0.18 | 68 | 79 | 24,689 |
R3 | Cat and Class | (12) | 4 | 0.40 | 0.15 | 44 | 103 | 24,836 |
R4 | Grouped Cat | (13) | 4 | 0.45 | - | 82 | 65 | 24,777 |
R5 | Grouped Class | (14) | 4 | 0.65 | - | 79 | 67 | 24,776 |

**Table 8.** Data illustration polytomous data: estimated item response probabilities in the RLCM with $C=4$ classes and fused regularization among classes based on the minimal BIC.

Item | Cat | Class 1 | Class 2 | Class 3 | Class 4 | Item | Cat | Class 1 | Class 2 | Class 3 | Class 4 | Item | Cat | Class 1 | Class 2 | Class 3 | Class 4 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0 | 0.07 | 0.79 | 0.82 | 0.82 | 5 | 0 | 0.10 | 0.13 | 0.46 | 0.92 | 9 | 0 | 0.13 | 0.82 | 0.10 | 0.82 |
1 | 1 | 0.31 | 0.07 | 0.06 | 0.06 | 5 | 1 | 0.30 | 0.29 | 0.18 | 0.02 | 9 | 1 | 0.29 | 0.06 | 0.30 | 0.06 |
1 | 2 | 0.31 | 0.07 | 0.06 | 0.06 | 5 | 2 | 0.30 | 0.29 | 0.18 | 0.02 | 9 | 2 | 0.29 | 0.06 | 0.30 | 0.06 |
1 | 3 | 0.31 | 0.07 | 0.06 | 0.06 | 5 | 3 | 0.30 | 0.29 | 0.18 | 0.04 | 9 | 3 | 0.29 | 0.06 | 0.30 | 0.06 |
2 | 0 | 0.22 | 0.85 | 0.87 | 0.91 | 6 | 0 | 0.22 | 0.24 | 0.30 | 0.76 | 10 | 0 | 0.20 | 0.89 | 0.37 | 0.82 |
2 | 1 | 0.26 | 0.05 | 0.01 | 0.06 | 6 | 1 | 0.26 | 0.18 | 0.32 | 0.07 | 10 | 1 | 0.42 | 0.05 | 0.25 | 0.05 |
2 | 2 | 0.26 | 0.05 | 0.01 | 0.03 | 6 | 2 | 0.26 | 0.18 | 0.06 | 0.03 | 10 | 2 | 0.19 | 0.01 | 0.13 | 0.03 |
2 | 3 | 0.26 | 0.05 | 0.11 | 0.00 | 6 | 3 | 0.26 | 0.40 | 0.32 | 0.14 | 10 | 3 | 0.19 | 0.05 | 0.25 | 0.10 |
3 | 0 | 0.16 | 0.74 | 0.19 | 0.80 | 7 | 0 | 0.07 | 0.79 | 0.85 | 0.85 | 11 | 0 | 0.10 | 0.16 | 0.55 | 0.91 |
3 | 1 | 0.28 | 0.16 | 0.27 | 0.10 | 7 | 1 | 0.31 | 0.09 | 0.05 | 0.05 | 11 | 1 | 0.30 | 0.28 | 0.15 | 0.03 |
3 | 2 | 0.28 | 0.05 | 0.27 | 0.05 | 7 | 2 | 0.31 | 0.09 | 0.05 | 0.05 | 11 | 2 | 0.30 | 0.28 | 0.15 | 0.03 |
3 | 3 | 0.28 | 0.05 | 0.27 | 0.05 | 7 | 3 | 0.31 | 0.03 | 0.05 | 0.05 | 11 | 3 | 0.30 | 0.28 | 0.15 | 0.03 |
4 | 0 | 0.26 | 0.88 | 0.28 | 0.85 | 8 | 0 | 0.25 | 0.86 | 0.91 | 0.88 | 12 | 0 | 0.24 | 0.20 | 0.20 | 0.76 |
4 | 1 | 0.36 | 0.05 | 0.24 | 0.04 | 8 | 1 | 0.25 | 0.06 | 0.03 | 0.06 | 12 | 1 | 0.21 | 0.21 | 0.35 | 0.10 |
4 | 2 | 0.19 | 0.02 | 0.24 | 0.02 | 8 | 2 | 0.25 | 0.06 | 0.03 | 0.06 | 12 | 2 | 0.21 | 0.21 | 0.10 | 0.04 |
4 | 3 | 0.19 | 0.05 | 0.24 | 0.09 | 8 | 3 | 0.25 | 0.02 | 0.03 | 0.00 | 12 | 3 | 0.34 | 0.38 | 0.35 | 0.10 |

**Table 9.** The last series of Raven’s standard progressive matrices (SPM-LS) polytomous data: percentage frequencies and recoding table.

Item | Cat0 | Cat1 | Cat2 | Cat3 | Cat4 | Cat5 | Cat6 | Cat7 |
---|---|---|---|---|---|---|---|---|
SPM1 | 76.0 (7) | 13.6 (3) | 3.0 (1) | 2.4 (4) | 2.2 (6) | 2.0 (2) | 0.8 (5) | - |
SPM2 | 91.0 (6) | 3.0 (3) | 2.4 (4) | 2.2 (1) | 0.8 (5) | 0.4 (7) | 0.2 (2) | - |
SPM3 | 80.4 (8) | 8.0 (2) | 4.2 (6) | 2.0 (4) | 1.8 (3) | 1.6 (5) | 1.2 (7) | 0.8 (1) |
SPM4 | 82.4 (2) | 5.6 (3) | 3.2 (5) | 2.6 (1) | 2.2 (8) | 1.8 (6) | 1.2 (7) | 1.0 (4) |
SPM5 | 85.6 (1) | 3.8 (2) | 3.0 (3) | 2.6 (7) | 1.8 (6) | 1.6 (5) | 1.0 (4) | 0.6 (8) |
SPM6 | 76.4 (5) | 7.0 (4) | 5.2 (6) | 3.0 (3) | 2.8 (7) | 2.6 (8) | 2.0 (2) | 1.0 (1) |
SPM7 | 70.1 (1) | 6.6 (4) | 5.8 (5) | 5.4 (3) | 4.4 (8) | 3.4 (6) | 2.4 (7) | 1.8 (2) |
SPM8 | 58.1 (6) | 7.6 (1) | 7.0 (3) | 6.6 (8) | 6.4 (2) | 6.2 (5) | 5.8 (7) | 2.2 (4) |
SPM9 | 57.3 (3) | 12.0 (5) | 9.0 (1) | 7.2 (4) | 6.6 (8) | 4.0 (7) | 3.0 (2) | 0.8 (6) |
SPM10 | 39.5 (2) | 17.2 (6) | 11.2 (7) | 8.0 (3) | 7.8 (8) | 7.4 (5) | 6.0 (4) | 2.8 (1) |
SPM11 | 35.7 (4) | 14.0 (1) | 13.8 (7) | 9.8 (5) | 9.4 (6) | 8.0 (3) | 6.6 (2) | 2.6 (8) |
SPM12 | 32.5 (5) | 15.4 (2) | 14.2 (3) | 10.4 (1) | 8.2 (4) | 8.2 (7) | 7.4 (6) | 3.6 (8) |

Model | C | $\lambda$ | #np | #nreg | BIC |
---|---|---|---|---|---|
LCM | 2 | 0 | 25 | 0 | 5973 |
LCM | 3 | 0 | 38 | 0 | 5721 |
LCM | 4 | 0 | 51 | 0 | 5680 |
LCM | 5 | 0 | 64 | 0 | 5696 |
LCM | 6 | 0 | 77 | 0 | 5694 |
RLCM | 2 | 0.01 | 25 | 0 | 5973 |
RLCM | 3 | 0.33 | 35 | 3 | 5715 |
RLCM | 4 | 0.38 | 39 | 12 | 5643 |
RLCM | 5 | 0.29 | 45 | 19 | 5621 |
RLCM | 6 | 0.53 | 45 | 32 | 5620 |
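
A quick arithmetic check relates the #np and #nreg columns of the table above. One reading, which the table itself does not confirm, is that #nreg counts the parameters removed by regularization, so that #np + #nreg for an RLCM equals #np of the unregularized LCM with the same number of classes. The sketch below uses hypothetical variable names and values copied from the table.

```python
# (C, #np, #nreg) for the RLCM rows of the table above
rlcm_rows = [(2, 25, 0), (3, 35, 3), (4, 39, 12), (5, 45, 19), (6, 45, 32)]

# #np of the unregularized LCM with the same number of classes C
lcm_num_parameters = {2: 25, 3: 38, 4: 51, 5: 64, 6: 77}

# Compare #np + #nreg of each RLCM with #np of the matching LCM
sums_match = []
for n_classes, n_par, n_reg in rlcm_rows:
    total = n_par + n_reg
    sums_match.append(total == lcm_num_parameters[n_classes])
    print(n_classes, total, lcm_num_parameters[n_classes])
```

For all five class counts the two numbers coincide, which is consistent with the reading stated above.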

**Table 11.** SPM-LS dichotomous data: estimated item probabilities and latent class probabilities for best-fitting RLCM with $C=5$ latent classes.

Item | C1 | C2 | C3 | C4 | C5 |
---|---|---|---|---|---|
$p_c$ | 0.12 | 0.04 | 0.40 | 0.07 | 0.37 |
SPM1 | 0.39 | 0.39 | 0.83 | 0.83 | 0.83 |
SPM2 | 0.57 | 0.57 | 0.99 | 0.86 | 0.99 |
SPM3 | 0.33 | 0.00 | 0.86 | 0.96 | 0.96 |
SPM4 | 0.05 | 1.00 | 0.91 | 0.60 | 1.00 |
SPM5 | 0.08 | 1.00 | 0.96 | 0.77 | 1.00 |
SPM6 | 0.07 | 0.07 | 0.85 | 0.85 | 0.97 |
SPM7 | 0.20 | 0.83 | 0.58 | 0.83 | 0.95 |
SPM8 | 0.06 | 0.69 | 0.36 | 1.00 | 0.90 |
SPM9 | 0.16 | 0.34 | 0.34 | 1.00 | 0.90 |
SPM10 | 0.00 | 0.23 | 0.23 | 0.00 | 0.79 |
SPM11 | 0.14 | 0.00 | 0.14 | 0.00 | 0.77 |
SPM12 | 0.11 | 0.62 | 0.11 | 0.11 | 0.62 |
$\overline{p}_{\bullet c}$ | 0.18 | 0.48 | 0.60 | 0.65 | 0.89 |
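
The last row of Table 11 is numerically consistent with the average of the twelve item probabilities within each class. The following sketch, with hypothetical variable names, recomputes that row from the values in the table.

```python
# Item probabilities for SPM1..SPM12 (rows) in classes C1..C5 (columns),
# copied from Table 11
item_probs = [
    [0.39, 0.39, 0.83, 0.83, 0.83],
    [0.57, 0.57, 0.99, 0.86, 0.99],
    [0.33, 0.00, 0.86, 0.96, 0.96],
    [0.05, 1.00, 0.91, 0.60, 1.00],
    [0.08, 1.00, 0.96, 0.77, 1.00],
    [0.07, 0.07, 0.85, 0.85, 0.97],
    [0.20, 0.83, 0.58, 0.83, 0.95],
    [0.06, 0.69, 0.36, 1.00, 0.90],
    [0.16, 0.34, 0.34, 1.00, 0.90],
    [0.00, 0.23, 0.23, 0.00, 0.79],
    [0.14, 0.00, 0.14, 0.00, 0.77],
    [0.11, 0.62, 0.11, 0.11, 0.62],
]

# Average over items within each class, rounded to two decimals
class_means = [round(sum(column) / len(column), 2) for column in zip(*item_probs)]
print(class_means)
```

The recomputed means order the classes from C1 (lowest average probability) to C5 (highest).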

**Table 12.** SPM-LS polytomous data: estimated item response probabilities and latent class probabilities for best-fitting RLCM with $C=3$ latent classes for items SPM10, SPM11 and SPM12.

Item | Cat | C1 | C2 | C3 | Item | Cat | C1 | C2 | C3 | Item | Cat | C1 | C2 | C3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SPM10 | 0 | 0.08 | 0.00 | 0.56 | SPM11 | 0 | 0.11 | 0.26 | 0.48 | SPM12 | 0 | 0.03 | 0.24 | 0.45 |
SPM10 | 1 | 0.14 | 0.12 | 0.19 | SPM11 | 1 | 0.18 | 0.43 | 0.10 | SPM12 | 1 | 0.14 | 0.09 | 0.17 |
SPM10 | 2 | 0.26 | 0.40 | 0.03 | SPM11 | 2 | 0.21 | 0.00 | 0.12 | SPM12 | 2 | 0.19 | 0.44 | 0.10 |
SPM10 | 3 | 0.10 | 0.08 | 0.07 | SPM11 | 3 | 0.15 | 0.12 | 0.07 | SPM12 | 3 | 0.23 | 0.03 | 0.06 |
SPM10 | 4 | 0.13 | 0.00 | 0.06 | SPM11 | 4 | 0.14 | 0.00 | 0.08 | SPM12 | 4 | 0.12 | 0.00 | 0.07 |
SPM10 | 5 | 0.10 | 0.08 | 0.06 | SPM11 | 5 | 0.08 | 0.19 | 0.07 | SPM12 | 5 | 0.14 | 0.00 | 0.06 |
SPM10 | 6 | 0.10 | 0.32 | 0.03 | SPM11 | 6 | 0.07 | 0.00 | 0.07 | SPM12 | 6 | 0.07 | 0.20 | 0.07 |
SPM10 | 7 | 0.09 | 0.00 | 0.00 | SPM11 | 7 | 0.06 | 0.00 | 0.01 | SPM12 | 7 | 0.08 | 0.00 | 0.02 |

© 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Robitzsch, A.
Regularized Latent Class Analysis for Polytomous Item Responses: An Application to SPM-LS Data. *J. Intell.* **2020**, *8*, 30.
https://doi.org/10.3390/jintelligence8030030
