Applying GenAI to Optimize Q-Matrix Construction for Cognitive Diagnostic Assessment in EFL Reading
Abstract
1. Introduction
2. Literature Review
2.1. Q-Matrix Specification
2.2. Existing Approaches of Q-Matrix Construction
2.3. Q-Matrix Modification Extensions
- To what extent can GenAI-informed Q-matrices achieve comparable or superior model-data fit to manually constructed Q-matrices?
- To what extent can GenAI-informed Q-matrices outperform manually constructed Q-matrices in terms of classification accuracy and attribute correlation?
3. Method
3.1. Research Design and Hypothesis
- (1) The three purely GenAI-generated Q-matrices (Qmat-DS, Qmat-K, Qmat-DB) will demonstrate comparable or improved psychometric performance compared with the two manually constructed Q-matrices (Qmat-E and Qmat-S).
- (2) The human–AI collaborative Q-matrix (Qmat-DS-H) will outperform all other Q-matrices across most psychometric indicators.
3.2. Instruments and Data Description
3.3. Q-Matrices Construction Procedure
- (1) If all three generations agreed on the coding of an attribute (all 0 or all 1), that value was retained.
- (2) If two of the three generations coded an attribute as 1 and one coded it as 0, the attribute was coded as 1 (indicating the attribute was required).
- (3) If two of the three generations coded an attribute as 0 and one coded it as 1, the attribute was coded as 0 (indicating the attribute was not required).
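
The three rules above amount to a cell-wise majority vote over the three generations. A minimal sketch of that rule follows; the function name `majority_vote` and the small 2-item, 3-attribute codings are hypothetical and only serve to illustrate the logic, not to reproduce the study's actual prompts or outputs.

```python
def majority_vote(generations):
    """Combine binary Q-matrices from repeated generations, cell by cell.

    `generations` is a list of Q-matrices (lists of rows of 0/1 codes),
    all with the same shape. An odd number of generations is required
    so that every cell has a strict majority.
    """
    n = len(generations)
    if n % 2 == 0:
        raise ValueError("use an odd number of generations to avoid ties")
    n_items = len(generations[0])
    n_attrs = len(generations[0][0])
    consensus = []
    for j in range(n_items):
        row = []
        for k in range(n_attrs):
            votes = sum(g[j][k] for g in generations)  # how many runs coded 1
            row.append(1 if 2 * votes > n else 0)
        consensus.append(row)
    return consensus


# Hypothetical codings of 2 items x 3 attributes from three generations.
gen1 = [[1, 0, 1], [0, 1, 0]]
gen2 = [[1, 0, 0], [0, 1, 1]]
gen3 = [[1, 1, 0], [0, 1, 0]]

consensus = majority_vote([gen1, gen2, gen3])
print(consensus)  # [[1, 0, 0], [0, 1, 0]]
```

Unanimous cells (rule 1) and 2-to-1 cells (rules 2 and 3) are all handled by the same comparison, so no special cases are needed.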
3.4. Q-Matrices Validation Procedure
4. Results
4.1. RQ1 Model–Data Fit Results Comparison
4.1.1. Test-Level Model Fit Statistics Across Six Q-Matrices
4.1.2. Item-Level Fit Results Across Six Q-Matrices
4.1.3. Item Parameters Stability Across Six Q-Matrices
4.2. RQ2 Classification Accuracy and Attribute Correlation Comparison
4.2.1. Classification Accuracy Across Six Q-Matrices
4.2.2. Summary of Attribute Correlation Across Six Q-Matrices
5. Discussion
5.1. Advantages of GenAI-Informed Q-Matrix Construction
5.2. Necessity of Human–AI Collaboration in Q-Matrix Construction
5.3. Implications for Development of Diagnostic Assessments
6. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
Tetrachoric Correlations Among Attributes Across Six Q-Matrices
| | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 |
|---|---|---|---|---|---|---|---|---|
| A1 | 1 | | | | | | | |
| A2 | 0.06 | 1 | | | | | | |
| A3 | 0.16 | 0.26 | 1 | | | | | |
| A4 | 0.28 | −0.02 | 0.1 | 1 | | | | |
| A5 | 0.49 | −0.11 | 0.3 | 0.6 | 1 | | | |
| A6 | 0.22 | 0.26 | 0.43 | 0.25 | 0.2 | 1 | | |
| A7 | −0.23 | 0.07 | 0.25 | 0.08 | −0.13 | 0.3 | 1 | |
| A8 | 0.02 | 0.28 | −0.09 | 0.02 | −0.28 | 0.19 | 0.5 | 1 |

| | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 |
|---|---|---|---|---|---|---|---|---|
| A1 | 1 | | | | | | | |
| A2 | 0.1 | 1 | | | | | | |
| A3 | 0.16 | 0.24 | 1 | | | | | |
| A4 | 0.09 | 0.07 | 0.14 | 1 | | | | |
| A5 | 0.13 | −0.04 | 0.15 | 0.22 | 1 | | | |
| A6 | 0.01 | 0.18 | 0.25 | 0.01 | −0.05 | 1 | | |
| A7 | 0.03 | 0.07 | −0.19 | 0.17 | 0.53 | 0.11 | 1 | |
| A8 | 0.24 | 0.21 | 0.26 | 0.26 | −0.25 | 0.18 | 0.17 | 1 |

| | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 |
|---|---|---|---|---|---|---|---|---|
| A1 | 1 | | | | | | | |
| A2 | 0.05 | 1 | | | | | | |
| A3 | −0.01 | 0.28 | 1 | | | | | |
| A4 | −0.02 | −0.12 | 0.07 | 1 | | | | |
| A5 | −0.11 | 0.14 | 0.26 | 0.34 | 1 | | | |
| A6 | 0.37 | 0.03 | 0.16 | 0.32 | 0.31 | 1 | | |
| A7 | 0.14 | 0.08 | 0.12 | 0.14 | −0.24 | 0.32 | 1 | |
| A8 | 0.09 | 0.5 | 0.3 | 0.07 | 0.42 | −0.08 | 0.17 | 1 |

| | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 |
|---|---|---|---|---|---|---|---|---|
| A1 | 1 | | | | | | | |
| A2 | 0.35 | 1 | | | | | | |
| A3 | 0.08 | 0.39 | 1 | | | | | |
| A4 | 0.57 | 0.48 | 0.2 | 1 | | | | |
| A5 | 0.23 | 0.25 | 0.16 | 0.39 | 1 | | | |
| A6 | 0.03 | 0.23 | 0.22 | 0.21 | 0.21 | 1 | | |
| A7 | −0.22 | 0.15 | 0.41 | 0.42 | 0.24 | 0.5 | 1 | |
| A8 | 0.44 | 0.22 | 0.07 | 0.24 | 0.14 | 0.18 | 0.17 | 1 |

| | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 |
|---|---|---|---|---|---|---|---|---|
| A1 | 1 | | | | | | | |
| A2 | 0.22 | 1 | | | | | | |
| A3 | 0.08 | 0.12 | 1 | | | | | |
| A4 | 0.08 | −0.04 | 0.15 | 1 | | | | |
| A5 | 0.12 | −0.22 | 0.01 | −0.08 | 1 | | | |
| A6 | 0.12 | 0.11 | −0.04 | 0.14 | 0.21 | 1 | | |
| A7 | 0.34 | 0.53 | 0.09 | 0.1 | −0.09 | 0.08 | 1 | |
| A8 | 0.1 | 0.06 | −0.06 | 0.26 | 0.11 | 0.28 | −0.15 | 1 |

| | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 |
|---|---|---|---|---|---|---|---|---|
| A1 | 1 | | | | | | | |
| A2 | 0.28 | 1 | | | | | | |
| A3 | 0.36 | 0.49 | 1 | | | | | |
| A4 | 0.69 | 0.33 | 0.45 | 1 | | | | |
| A5 | 0.25 | 0.24 | 0.33 | 0.21 | 1 | | | |
| A6 | 0.16 | 0.21 | 0.34 | 0.31 | 0.04 | 1 | | |
| A7 | 0.36 | −0.20 | 0.38 | 0.61 | 0.29 | 0.17 | 1 | |
| A8 | 0.29 | 0.49 | 0.24 | 0.23 | 0.58 | 0.27 | 0.34 | 1 |

| Item | A1 | A2 | A3 |
|---|---|---|---|
| Item1 | 0 | 1 | 0 |
| Item2 | 1 | 1 | 0 |
| Item3 | 1 | 1 | 1 |
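
To make the role of the example Q-matrix above concrete, the sketch below reads each row as the set of attributes an item requires and computes an examinee's ideal (error-free) responses. It assumes a conjunctive, DINA-type rule purely for illustration; that rule, the attribute profiles, and the helper name `ideal_response` are assumptions introduced here and are not taken from the study, which fits other models to its data.

```python
# Example Q-matrix from the table above: rows are items, columns are A1-A3.
Q = {
    "Item1": (0, 1, 0),
    "Item2": (1, 1, 0),
    "Item3": (1, 1, 1),
}


def ideal_response(q_row, alpha):
    """Return 1 if the profile `alpha` masters every attribute the item requires.

    Conjunctive (DINA-type) rule, assumed here for illustration only.
    """
    return int(all(a >= q for q, a in zip(q_row, alpha)))


# Hypothetical examinee who has mastered only A2.
alpha_a2_only = (0, 1, 0)
ideal_a2_only = {item: ideal_response(q, alpha_a2_only) for item, q in Q.items()}
print(ideal_a2_only)  # {'Item1': 1, 'Item2': 0, 'Item3': 0}

# Hypothetical examinee who has mastered A1 and A2.
alpha_a1_a2 = (1, 1, 0)
ideal_a1_a2 = {item: ideal_response(q, alpha_a1_a2) for item, q in Q.items()}
print(ideal_a1_a2)  # {'Item1': 1, 'Item2': 1, 'Item3': 0}
```

Changing a single cell of the Q-matrix changes which profiles are expected to answer an item correctly, which is why misspecified entries can propagate into fit and classification results.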
| Item | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 | Item | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Item 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | Item 11 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| Item 2 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | Item 12 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| Item 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | Item 13 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| Item 4 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | Item 14 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| Item 5 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | Item 15 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| Item 6 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | Item 16 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| Item 7 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | Item 17 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 |
| Item 8 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | Item 18 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| Item 9 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | Item 19 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| Item 10 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | Item 20 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| Item | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 | Item | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Item 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | Item 11 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 2 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | Item 12 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 3 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | Item 13 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 4 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | Item 14 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 5 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | Item 15 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 6 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | Item 16 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
| Item 7 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | Item 17 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
| Item 8 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | Item 18 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
| Item 9 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | Item 19 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| Item 10 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | Item 20 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| Item | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 | Item | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Item 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | Item 11 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
| Item 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | Item 12 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
| Item 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | Item 13 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
| Item 4 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | Item 14 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 5 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | Item 15 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
| Item 6 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | Item 16 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 7 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | Item 17 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
| Item 8 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | Item 18 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
| Item 9 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | Item 19 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 |
| Item 10 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | Item 20 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 |
| Item | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 | Item | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Item 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | Item 11 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 2 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | Item 12 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 3 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | Item 13 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 4 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | Item 14 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 5 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | Item 15 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 6 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | Item 16 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| Item 7 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | Item 17 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 |
| Item 8 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | Item 18 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 |
| Item 9 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | Item 19 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 |
| Item 10 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | Item 20 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 1 |
| Item | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 | Item | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Item 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | Item 11 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 2 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | Item 12 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 3 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | Item 13 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 4 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | Item 14 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 5 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | Item 15 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 6 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | Item 16 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 |
| Item 7 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | Item 17 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 |
| Item 8 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | Item 18 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 |
| Item 9 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | Item 19 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| Item 10 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | Item 20 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| Item | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 | Item | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Item 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | Item 11 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | Item 12 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | Item 13 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 4 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | Item 14 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 5 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | Item 15 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Item 6 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | Item 16 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 |
| Item 7 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | Item 17 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 |
| Item 8 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | Item 18 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
| Item 9 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | Item 19 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 |
| Item 10 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | Item 20 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 |
| Q-Matrices | Dataset | Model | Max zr (adj-p) | Max zl (adj-p) | AIC | BIC | −2LL |
|---|---|---|---|---|---|---|---|
| Qmat-E | Empirical dataset (N = 1083) | G-DINA | 4.87(0.00) | 5.38(0.00) | 25,379.73 | 27,070.17 | 24,701.72 |
| Qmat-S | Empirical dataset (N = 1083) | G-DINA | 4.62(0.00) | 4.49(0.00) | 25,333.07 | 27,143.19 | 24,607.08 |
| Qmat-DS | Empirical dataset (N = 1083) | G-DINA | 3.40(0.13) | 3.27(0.20) | 25,407.38 | 27,895.68 | 24,409.38 |
| Qmat-K | Empirical dataset (N = 1083) | G-DINA | 4.77(0.00) | 4.64(0.00) | 25,360.40 | 27,250.31 | 24,602.40 |
| Qmat-DB | Empirical dataset (N = 1083) | G-DINA | 4.22(0.00) | 4.09(0.01) | 25,305.49 | 27,135.56 | 24,571.50 |
| Qmat-DS-H | Empirical dataset (N = 1083) | G-DINA | 3.03(0.47) | 2.92(0.66) | 25,406.04 | 27,435.57 | 24,592.04 |
| Qmat-E | Empirical dataset (N = 1083) | ACDM | 4.51(0.00) | 4.56(0.00) | 25,498.00 | 27,073.76 | 24,866.00 |
| Qmat-S | Empirical dataset (N = 1083) | ACDM | 5.67(0.00) | 5.02(0.00) | 25,512.47 | 27,118.14 | 24,868.46 |
| Qmat-DS | Empirical dataset (N = 1083) | ACDM | 5.24(0.00) | 5.11(0.00) | 25,476.82 | 27,157.29 | 24,802.82 |
| Qmat-K | Empirical dataset (N = 1083) | ACDM | 5.05(0.00) | 4.92(0.00) | 25,508.08 | 27,128.72 | 24,858.16 |
| Qmat-DB | Empirical dataset (N = 1083) | ACDM | 5.76(0.00) | 5.62(0.00) | 25,526.39 | 27,137.05 | 24,880.40 |
| Qmat-DS-H | Empirical dataset (N = 1083) | ACDM | 4.50(0.00) | 4.46(0.00) | 25,466.52 | 27,142.09 | 24,816.52 |
| Qmat-E | Empirical dataset (N = 1083) | RRUM | 5.04(0.00) | 5.52(0.00) | 25,484.14 | 27,059.89 | 24,852.14 |
| Qmat-S | Empirical dataset (N = 1083) | RRUM | 5.78(0.00) | 5.64(0.00) | 25,486.80 | 27,092.48 | 24,842.80 |
| Qmat-DS | Empirical dataset (N = 1083) | RRUM | 6.17(0.00) | 6.08(0.00) | 25,597.67 | 27,278.14 | 24,923.68 |
| Qmat-K | Empirical dataset (N = 1083) | RRUM | 5.35(0.00) | 5.87(0.00) | 25,469.00 | 27,089.63 | 24,819.00 |
| Qmat-DB | Empirical dataset (N = 1083) | RRUM | 5.47(0.00) | 5.34(0.00) | 25,476.66 | 27,087.32 | 24,830.66 |
| Qmat-DS-H | Empirical dataset (N = 1083) | RRUM | 5.29(0.00) | 5.46(0.00) | 25,477.20 | 27,122.77 | 24,817.20 |
| Qmat-E | Simulated dataset (N = 1000) | G-DINA | 3.75(0.03) | 3.76(0.03) | 23,592.35 | 25,255.74 | 22,914.36 |
| Qmat-S | Simulated dataset (N = 1000) | G-DINA | 2.80(0.96) | 2.72(1.00) | 23,532.17 | 25,313.32 | 22,806.16 |
| Qmat-DS | Simulated dataset (N = 1000) | G-DINA | 4.08(0.00) | 4.09(0.01) | 23,604.22 | 26,052.69 | 22,606.22 |
| Qmat-K | Simulated dataset (N = 1000) | G-DINA | 2.52(1.00) | 2.46(1.00) | 23,503.60 | 25,363.26 | 22,745.60 |
| Qmat-DB | Simulated dataset (N = 1000) | G-DINA | 2.56(1.00) | 2.68(1.00) | 23,511.64 | 25,312.42 | 22,777.64 |
| Qmat-DS-H | Simulated dataset (N = 1000) | G-DINA | 3.60(0.06) | 3.61(0.06) | 23,520.47 | 25,517.52 | 22,706.48 |
| Qmat-E | Simulated dataset (N = 1000) | ACDM | 4.52(0.00) | 4.52(0.00) | 23,678.30 | 25,228.83 | 23,046.30 |
| Qmat-S | Simulated dataset (N = 1000) | ACDM | 2.60(1.00) | 2.71(1.00) | 23,583.27 | 25,163.25 | 22,939.28 |
| Qmat-DS | Simulated dataset (N = 1000) | ACDM | 6.05(0.00) | 5.96(0.00) | 23,761.85 | 25,415.42 | 23,087.84 |
| Qmat-K | Simulated dataset (N = 1000) | ACDM | 5.27(0.00) | 5.29(0.00) | 23,670.06 | 25,264.76 | 23,020.06 |
| Qmat-DB | Simulated dataset (N = 1000) | ACDM | 6.08(0.00) | 5.97(0.00) | 23,768.12 | 25,353.00 | 23,122.12 |
| Qmat-DS-H | Simulated dataset (N = 1000) | ACDM | 2.71(1.00) | 2.69(1.00) | 23,604.97 | 25,224.20 | 22,944.98 |
| Qmat-E | Simulated dataset (N = 1000) | RRUM | 4.59(0.00) | 4.59(0.00) | 23,698.59 | 25,249.12 | 23,066.58 |
| Qmat-S | Simulated dataset (N = 1000) | RRUM | 2.59(1.00) | 2.69(1.00) | 23,588.70 | 25,168.67 | 22,944.70 |
| Qmat-DS | Simulated dataset (N = 1000) | RRUM | 6.28(0.00) | 6.21(0.00) | 23,727.61 | 25,381.19 | 23,053.62 |
| Qmat-K | Simulated dataset (N = 1000) | RRUM | 5.24(0.00) | 5.27(0.00) | 23,666.44 | 25,261.13 | 23,016.44 |
| Qmat-DB | Simulated dataset (N = 1000) | RRUM | 6.14(0.00) | 6.05(0.00) | 23,738.35 | 25,323.23 | 23,092.28 |
| Qmat-DS-H | Simulated dataset (N = 1000) | RRUM | 3.28(0.19) | 3.22(0.23) | 23,586.07 | 25,205.29 | 22,926.06 |
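
One way to read the AIC, BIC, and −2LL columns together is to recover how many parameters each fitted model estimates. The sketch below relies on the standard definition AIC = −2LL + 2k, so that k = (AIC − (−2LL)) / 2; it uses the G-DINA rows for the empirical dataset from the table above. The helper name `implied_parameters` is hypothetical, and the rounding step assumes the small decimal differences come from rounding in the reported values.

```python
# (AIC, -2LL) pairs for the G-DINA model on the empirical dataset,
# copied from the fit table above.
gdina_empirical = {
    "Qmat-E": (25379.73, 24701.72),
    "Qmat-S": (25333.07, 24607.08),
    "Qmat-DS": (25407.38, 24409.38),
    "Qmat-K": (25360.40, 24602.40),
    "Qmat-DB": (25305.49, 24571.50),
    "Qmat-DS-H": (25406.04, 24592.04),
}


def implied_parameters(aic, neg2ll):
    """Number of estimated parameters implied by AIC = -2LL + 2k."""
    return round((aic - neg2ll) / 2)


params = {
    name: implied_parameters(aic, neg2ll)
    for name, (aic, neg2ll) in gdina_empirical.items()
}
for name, k in params.items():
    print(f"{name}: k = {k}")

# Q-matrix whose G-DINA fit estimates the most parameters.
most_complex = max(params, key=params.get)
print(most_complex)  # Qmat-DS
```

Under this reading, Qmat-DS has the lowest −2LL in these rows but also the largest implied parameter count, which is consistent with it showing the highest BIC in the same rows, since BIC penalizes each additional parameter more heavily than AIC at this sample size.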
| Item | Qmat-E g | Qmat-E s | Qmat-S g | Qmat-S s | Qmat-DS g | Qmat-DS s | Qmat-K g | Qmat-K s | Qmat-DB g | Qmat-DB s | Qmat-DS-H g | Qmat-DS-H s |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Item 1 | 0.458 | 0.133 | 0.538 | 0.113 | 0.526 | 0.120 | 0.495 | 0.131 | 0.585 | 0.104 | 0.323 | 0.087 |
| Item 2 | 0.000 | 0.062 | 0.000 | 0.000 | 0.371 | 0.204 | 0.000 | 0.157 | 0.000 | 0.000 | 0.268 | 0.190 |
| Item 3 | 0.494 | 0.137 | 0.391 | 0.148 | 0.458 | 0.170 | 0.455 | 0.144 | 0.539 | 0.148 | 0.379 | 0.115 |
| Item 4 | 0.441 | 0.045 | 0.348 | 0.206 | 0.651 | 0.145 | 0.529 | 0.075 | 0.515 | 0.193 | 0.283 | 0.093 |
| Item 5 | 0.357 | 0.281 | 0.387 | 0.336 | 0.335 | 0.232 | 0.375 | 0.238 | 0.433 | 0.297 | 0.181 | 0.000 |
| Item 6 | 0.344 | 0.049 | 0.000 | 0.269 | 0.000 | 0.237 | 0.449 | 0.092 | 0.000 | 0.225 | 0.235 | 0.150 |
| Item 7 | 0.000 | 0.371 | 0.183 | 0.319 | 0.000 | 0.218 | 0.181 | 0.361 | 0.054 | 0.426 | 0.231 | 0.191 |
| Item 8 | 0.000 | 0.000 | 0.270 | 0.000 | 0.234 | 0.256 | 0.201 | 0.190 | 0.301 | 0.181 | 0.252 | 0.233 |
| Item 9 | 0.000 | 0.247 | 0.197 | 0.196 | 0.226 | 0.000 | 0.184 | 0.000 | 0.110 | 0.000 | 0.212 | 0.163 |
| Item 10 | 0.445 | 0.073 | 0.000 | 0.088 | 0.552 | 0.102 | 0.000 | 0.103 | 0.239 | 0.121 | 0.000 | 0.120 |
| Item 11 | 0.285 | 0.314 | 0.189 | 0.173 | 0.191 | 0.160 | 0.161 | 0.198 | 0.167 | 0.000 | 0.233 | 0.186 |
| Item 12 | 0.365 | 0.061 | 0.000 | 0.037 | 0.242 | 0.038 | 0.000 | 0.031 | 0.000 | 0.011 | 0.216 | 0.034 |
| Item 13 | 0.221 | 0.040 | 0.000 | 0.049 | 0.097 | 0.021 | 0.274 | 0.000 | 0.269 | 0.039 | 0.080 | 0.037 |
| Item 14 | 0.204 | 0.007 | 0.137 | 0.012 | 0.227 | 0.006 | 0.214 | 0.010 | 0.183 | 0.020 | 0.000 | 0.005 |
| Item 15 | 0.215 | 0.042 | 0.161 | 0.000 | 0.312 | 0.058 | 0.182 | 0.068 | 0.152 | 0.053 | 0.241 | 0.000 |
| Item 16 | 0.000 | 0.213 | 0.000 | 0.212 | 0.415 | 0.153 | 0.275 | 0.155 | 0.000 | 0.193 | 0.000 | 0.242 |
| Item 17 | 0.015 | 0.273 | 0.067 | 0.253 | 0.000 | 0.273 | 0.000 | 0.230 | 0.047 | 0.324 | 0.073 | 0.155 |
| Item 18 | 0.090 | 0.393 | 0.126 | 0.380 | 0.000 | 0.000 | 0.128 | 0.348 | 0.113 | 0.000 | 0.081 | 0.191 |
| Item 19 | 0.113 | 0.128 | 0.000 | 0.105 | 0.000 | 0.000 | 0.000 | 0.132 | 0.381 | 0.000 | 0.000 | 0.083 |
| Item 20 | 0.231 | 0.000 | 0.282 | 0.000 | 0.000 | 0.156 | 0.167 | 0.183 | 0.150 | 0.157 | 0.000 | 0.119 |
| Mean | 0.214 | 0.143 | 0.164 | 0.145 | 0.242 | 0.127 | 0.213 | 0.142 | 0.212 | 0.125 | 0.164 | 0.120 |
| Mean SE | 0.040 | 0.029 | 0.036 | 0.028 | 0.047 | 0.021 | 0.039 | 0.023 | 0.043 | 0.028 | 0.027 | 0.017 |
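
A compact way to compare the six columns of means above is to combine g and s into a single separation value, 1 − g − s. The sketch below assumes that g and s denote guessing and slip parameters, which is the usual reading in cognitive diagnosis but is not spelled out in this table, and it treats 1 − g − s as a rough discrimination summary rather than a statistic reported by the authors. Because the expression is linear, applying it to the column means gives the same number as averaging it over items.

```python
# Mean g and s per Q-matrix, copied from the "Mean" row of the table above.
mean_g_s = {
    "Qmat-E": (0.214, 0.143),
    "Qmat-S": (0.164, 0.145),
    "Qmat-DS": (0.242, 0.127),
    "Qmat-K": (0.213, 0.142),
    "Qmat-DB": (0.212, 0.125),
    "Qmat-DS-H": (0.164, 0.120),
}


def separation(g, s):
    """1 - g - s: gap between masters' and non-masters' success probability.

    Assumes g is a guessing parameter and s a slip parameter.
    """
    return 1.0 - g - s


mean_separation = {
    name: round(separation(g, s), 3) for name, (g, s) in mean_g_s.items()
}
for name, value in mean_separation.items():
    print(f"{name}: {value:.3f}")

best = max(mean_separation, key=mean_separation.get)
print(best)  # Qmat-DS-H
```

Larger values indicate that, on average, items separate masters from non-masters more sharply under that Q-matrix.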
| Q-Matrices | Test-level P(a) | P(a) A1 | P(a) A2 | P(a) A3 | P(a) A4 | P(a) A5 | P(a) A6 | P(a) A7 | P(a) A8 |
|---|---|---|---|---|---|---|---|---|---|
| Qmat-E | 0.570 | 0.855 | 0.940 | 0.830 | 0.890 | 0.838 | 0.962 | 0.874 | 0.884 |
| Qmat-S | 0.696 | 0.951 | 0.931 | 0.868 | 0.936 | 0.899 | 0.953 | 0.886 | 0.901 |
| Qmat-DS | 0.692 | 0.861 | 0.928 | 0.972 | 0.920 | 0.870 | 0.892 | 0.953 | 0.899 |
| Qmat-K | 0.674 | 0.877 | 0.925 | 0.881 | 0.863 | 0.937 | 0.978 | 0.947 | 0.893 |
| Qmat-DB | 0.709 | 0.999 | 0.948 | 0.947 | 0.904 | 0.891 | 0.950 | 0.900 | 0.878 |
| Qmat-DS-H | 0.712 | 0.880 | 0.930 | 0.951 | 0.940 | 0.917 | 0.981 | 0.928 | 0.898 |
| Q-Matrices | High (>0.70) | Intermediate (0.20–0.70) | Low (<0.20) |
|---|---|---|---|
| Qmat-E | 0 | 14 | 14 |
| Qmat-S | 0 | 8 | 20 |
| Qmat-DS | 0 | 10 | 18 |
| Qmat-K | 0 | 19 | 9 |
| Qmat-DB | 0 | 6 | 22 |
| Qmat-DS-H | 0 | 24 | 4 |
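
The band counts in the table above can be reproduced from a lower-triangular correlation matrix with a few lines of code. The sketch below uses the 28 off-diagonal values of the first matrix in Appendix A. Two conventions are assumptions made here, not statements from the article: signed values are compared with the cut-offs (so negative correlations fall in the low band), and a value of exactly 0.20 counts as intermediate. Which Q-matrix that first appendix matrix belongs to is likewise not labeled in the appendix itself.

```python
from collections import Counter

# Off-diagonal entries (rows A2..A8) of the first matrix in Appendix A.
lower_triangle = [
    [0.06],
    [0.16, 0.26],
    [0.28, -0.02, 0.10],
    [0.49, -0.11, 0.30, 0.60],
    [0.22, 0.26, 0.43, 0.25, 0.20],
    [-0.23, 0.07, 0.25, 0.08, -0.13, 0.30],
    [0.02, 0.28, -0.09, 0.02, -0.28, 0.19, 0.50],
]


def band(r):
    """Classify a correlation using the table's cut-offs (signed values)."""
    if r > 0.70:
        return "high"
    if r >= 0.20:
        return "intermediate"
    return "low"


counts = Counter(band(r) for row in lower_triangle for r in row)
print(counts["high"], counts["intermediate"], counts["low"])  # 0 14 14
```

Under these conventions the result coincides with the Qmat-E row of the table (0, 14, 14); using absolute values instead would move the two correlations below −0.20 into the intermediate band.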
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Du, W.; Shen, J.; Ma, X. Applying GenAI to Optimize Q-Matrix Construction for Cognitive Diagnostic Assessment in EFL Reading. J. Intell. 2026, 14, 79. https://doi.org/10.3390/jintelligence14050079

