Article

A Cognitive Diagnosis Model Using Convolutional Neural Networks to Predict Student Scores

School of Information and Communication, Guilin University of Electronic Science and Technology, Guilin 541004, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(6), 2875; https://doi.org/10.3390/app15062875
Submission received: 19 February 2025 / Revised: 4 March 2025 / Accepted: 5 March 2025 / Published: 7 March 2025

Abstract

Smart education is an important direction of future educational development, aiming to improve the intelligence level of the existing digital education system and achieve the deep integration of information technology and mainstream education. Most existing cognitive diagnosis models are trained and tested on the answers to known questions, and this dependence on the trained questions leads to poor predictions for unknown questions. To solve this problem, this paper divides the questions of each dataset into usual quizzes (known questions) and final exam questions (unknown questions), which are used for training and testing, respectively, and proposes a cognitive diagnosis model based on a convolutional neural network (CNNCD). Firstly, an attention mechanism is used to mine the intra-layer relationships among students, questions and knowledge points, which alleviates the problem of insufficient information mining among them. Secondly, two multi-layer (general and special) one-dimensional convolutional neural networks are combined to model the cognitive diagnosis process from students' usual grades to their mastery of knowledge points. Finally, the same combination of networks is used to predict students' final exam scores. Experiments on three public datasets and the self-made BOIT dataset show that the proposed model is superior to the comparison models on three evaluation indexes, indicating its feasibility and effectiveness.

1. Introduction

In recent years, with the rapid development of information technology such as the Internet and computers, the modern education system has undergone fundamental changes, and the traditional education mode has gradually transformed into a new intelligent and personalized one [1]. Cognitive diagnosis is an important part of modern intelligent education [2], and cognitive diagnosis models are an important research direction in the field of educational evaluation. They evaluate and diagnose students' mastery of specific knowledge points or cognitive skills through students' response data. These models not only pay attention to students' overall ability level but also deeply analyze their specific performance on various cognitive attributes (such as knowledge points, skills, etc.), providing teachers with more accurate teaching feedback and students with personalized educational resource recommendations [1].
At present, there are many kinds of cognitive diagnosis models, which can be roughly divided into two categories: simple models and complex models. Item Response Theory (IRT [3]) and the Deterministic Inputs, Noisy And gate model (DINA [4]) are representative simple models. IRT is a modern psychometric theory for analyzing and evaluating test scores, questionnaire data, etc. It models students' latent ability in combination with question parameters such as difficulty and discrimination; it is widely used but has some limitations. DINA is also a widely used cognitive diagnosis model, which describes the answering process through slipping and guessing parameters together with the question-knowledge point matrix, and then evaluates students' knowledge mastery. However, the model is too simple, its parameters are coarse, it overfits easily and its prediction accuracy is low. Complex models include the Fuzzy Cognitive Diagnosis Framework (FuzzyCDF [5]), the Neural Cognitive Diagnosis Model (NCDM [6]) and the Cognitive and Response Model (C&RM [7]). FuzzyCDF is a four-tier (i.e., latent trait, skill proficiency, problem mastery and problem score) generative model designed to capture the relationship between a student's knowledge state and their performance on objective and subjective questions. Although it allows multiple knowledge points to be examined jointly, it places insufficient importance on individual knowledge points. NCDM combines neural networks to learn the complex interactions between students and questions, while drawing on the monotonicity assumption from pedagogical theory to ensure the interpretability of the learned vectors, but its evaluation results are not convincing and it is prone to overfitting.
The Cognitive and Response Model (C&RM) combines two high-order feature parameters to accurately model students’ knowledge levels by setting the mutual compensation mechanism of the ability feature and effort feature. At the same time, the characteristic parameters of weak knowledge points are constructed to comprehensively consider the influence of students’ knowledge level and different knowledge points on the answered questions, so as to further improve the interpretation and prediction accuracy of the model. However, the reliability of the student effort characteristic parameters is not high, and the initial parameters need to be optimized.
The previous cognitive diagnosis process is shown in Figure 1. In the selection of questions, the same proportion of questions is selected from each student's record for training, and the rest are used for testing (for example, student 1 uses questions e1, e2 and e3 for cognitive diagnosis training and e4 and e5 for testing; student 2 uses questions e1, e4 and e5 for training and e2 and e3 for testing). This means the training questions selected for each student are inconsistent, resulting in the model's dependence on the trained questions; that is, the model predicts well on trained questions (known questions) and poorly on untrained questions (unknown questions). In reality, the results of some questions (such as a student's usual quizzes) are often known, while the results of other questions (such as the student's final exam questions) must be predicted. In this situation, the performance of previous cognitive diagnosis models is reduced because they are not trained on the unknown questions.
To solve the above problems, this paper proposes a cognitive diagnosis model based on convolutional neural networks. First of all, in the selection of questions, inspired by Hirose H [8], the questions are divided into usual quizzes E_1 and final exam questions E_2, whose correlation matrices with the knowledge points are denoted by q_1 and q_2, respectively. Students obtain their usual grades R_usual by answering the usual quizzes. The model combines two multi-layer (general and special) one-dimensional convolutional neural networks to model the process from students' usual grades R_usual, to students' mastery of knowledge points α, to students' answers to the final exam questions R. The process is shown in Figure 2. At the same time, an attention mechanism further improves the prediction accuracy of the model. The main work is as follows:
  • Two multi-layer (general and special) one-dimensional convolutional neural networks are combined, that is, the convolutional kernel of the special one-dimensional convolutional neural network is generated by the ordinary one-dimensional convolutional neural network, so as to model the cognitive diagnosis process of students and the prediction process of final exam results.
  • The attention mechanism is used to further improve the prediction accuracy of the model by considering the influence of the intra-level relationships among students, questions and knowledge points on students’ answer results.
  • On the public datasets FrcSub [5], Math1 [5] and Math2 [5] and the self-built BOIT dataset, the proposed CNNCD model and some classical cognitive diagnosis models are compared through experiments and evaluated on three indexes: Accuracy [9], Root Mean Square Error (RMSE [10]) and Mean Absolute Error (MAE [10]). The experimental results show that the proposed model is superior to the comparison models, which indicates its feasibility and effectiveness.

2. Related Work

2.1. Attention Mechanism

As a kind of neural network technology, the core of an attention mechanism is to let the model focus on the most relevant parts of the input data, assigning weights to reflect the importance of different input elements. In recent years, models based on attention mechanisms have made remarkable progress in many fields such as natural language processing and computer vision. By introducing an attention mechanism, these models enhance their ability to capture and understand key information, which improves their performance. However, the application of attention mechanisms in the field of pedagogy is still at an exploratory stage. Zhao T. et al. [11] added an attention mechanism to AI-assisted education to help teachers estimate students' attitudes through data analysis, saving teachers' energy, improving teaching efficiency and helping improve teaching methods. The KSCD [12] model integrates the vectors of students and knowledge points and obtains students' cognitive states through an attention mechanism. The SSKT [13] model integrates appropriate Query, Key and Value objects into the attention mechanism to effectively simulate how students extract, integrate and apply information from their existing knowledge. The BRAYOLOv7 [14] model adopts a double-layer attention mechanism and integrates an improved convolutional block attention module into the backbone structure; the improved model achieves an average accuracy of 99%, a significant improvement. Dey A et al. [15] propose a novel attention-based deep learning model specifically designed for student action recognition in online learning environments, enabling machines to understand and respond to student behavior to enhance education, personalize learning and support student academic success and well-being. Du L et al. [16] used an attention mechanism to capture students' attention to information and improved the average accuracy of personalized search. Miao W [17] proposed a network model with an attention mechanism, which alleviated the problem of fuzzy data and features and improved the generalization ability of the model. The HAN [18] model uses a hierarchical attention mechanism to weigh different emotional features and feeds the original information into the attention computation to prevent information loss, thus improving the accuracy of text emotion recognition. Yan C et al. [19] used a multi-head self-attention mechanism to extract students' behavioral features and improve the prediction accuracy of a student performance model. The LCANet [20] model integrates an attention mechanism and a joint loss function to identify students' emotions in real classroom scenes, and it integrates an improved channel-space attention module to extract more local feature information, providing an important reference for improving teaching in intelligent teaching scenarios. The above studies show that different types of attention mechanisms bring significant performance gains in different pedagogical tasks. Therefore, it is of great value to explore the application of attention mechanisms in the field of pedagogy.

2.2. Cognitive Diagnosis Based on Neural Networks

In recent years, with the rapid development of neural networks, their application in cognitive diagnosis models has also made remarkable progress. The NCDM [6] is one of the earliest cognitive diagnosis models based on neural networks. It uses multiple neural layers to model the relationship between learners and exercises, and it applies the monotonicity assumption to ensure the interpretability of the model; it can capture students' mastery of different knowledge points and thus provide accurate results for cognitive diagnosis. The DIRT [21] model uses deep learning techniques to enhance the cognitive diagnostic capabilities of Item Response Theory (IRT): it uses neural networks to mine the semantic representation of question text and combines this with the IRT model to improve diagnostic accuracy. The Deep-IRT [22] model is a synthesis of IRT and a knowledge tracing model. It is based on the deep neural network architecture of the Dynamic Key-Value Memory Network (DKVMN), which makes deep learning-based knowledge tracing explainable. The QNN [2] model proposes a neural network constrained by a Q matrix, which determines the connections between neurons and the width and depth of the network, providing a realistic and implementable reference solution for classroom teaching evaluation and for cold starts of personalized, adaptive evaluation systems. The RCD [23] model represents students, exercises and knowledge points as nodes in three local graphs, and it constructs a multi-layer attention network to aggregate the relationships between nodes within and across the graphs. This representation captures the interaction information between learners and exercises more comprehensively and improves the accuracy of cognitive diagnosis. The FK-CD [24] model introduces the influence of forgetting and of knowledge point importance into the cognitive diagnosis model. From the perspective of knowledge points, it uses a neural network to obtain the degree to which students forget knowledge points, and it makes predictions according to the close connection between test questions and knowledge points. The C&RM [7] model expands the one-dimensional high-order ability feature into independent ability and effort features and sets up a joint compensation mechanism between them; in addition, it considers the influence of both students' knowledge level and the importance of knowledge points on answering questions, and it constructs feature parameters for weak knowledge points. The KSCD model uses students' problem-solving records to learn the relationships between knowledge concepts and obtains embedded representations of knowledge point vectors with neural networks. The above studies effectively use neural networks to improve the accuracy of cognitive diagnosis. Therefore, this paper proposes a cognitive diagnosis model based on convolutional neural networks, which combines two multi-layer (general and special) one-dimensional convolutional neural networks to model the cognitive diagnosis process and the prediction of final exam scores. At the same time, an attention mechanism is used to mine the intra-layer relationships among students, questions and knowledge points to improve the performance of the model.

3. Materials and Methods

3.1. Dataset

In this paper, four datasets, FrcSub [5], Math1 [5], Math2 [5] and Basis of Information Theory (BOIT), are selected for the comparative experiments. FrcSub, Math1 and Math2 are commonly used public datasets that include students' answer results and the question-knowledge point correlation matrix.
The BOIT dataset is derived from the Basis of Information Theory course taken by the 2019 undergraduate cohort of Guilin University of Electronic Science and Technology. The data contain students' answer results and the question-knowledge point correlation matrix. Details of the dataset are shown in Table 1. There are 38 questions in the BOIT dataset, of which 28 are usual quizzes answered by students through a WeChat official account and 10 are final exam questions. The correlation matrix between questions and knowledge points was revised and refined by the information theory teacher of the School of Information and Communication of Guilin University of Electronic Science and Technology.

3.2. CNNCD Model

The CNNCD model proposed in this paper includes an attention module, knowledge evaluation module and test item prediction module, as shown in Figure 3.
In Figure 3, R_usual (N × M_1) represents the students' scores on the usual quizzes. q_1 (M_1 × K) is the correlation matrix between the usual quizzes and the knowledge points. q_2 (M_2 × K) is the correlation matrix between the final exam questions and the knowledge points. R_usual′ (N × M_1) is the usual-quiz answering factor, which integrates the relationships between students. q_1′ (M_1 × K) is the usual quiz-knowledge point correlation matrix that integrates the relationships between knowledge points. q_2′ (M_2 × K) is the final exam question-knowledge point correlation matrix that integrates the relationships between the usual quizzes and the final exam questions. α (N × K) is the students' mastery of the knowledge points. R (N × M_2) contains the predicted scores of the students on the final exam questions. N, M_1, M_2 and K represent the numbers of students, usual quizzes, final exam questions and knowledge points, respectively. godCNN stands for a general (ordinary) one-dimensional convolutional neural network, and sodCNN stands for a special one-dimensional convolutional neural network. The biggest difference between the two is that the convolution kernels in godCNN are inherent parameters of the model, while the convolution kernels in sodCNN are generated from data.
In view of past cognitive diagnosis models' dependence at test time on known training questions, this study divides the questions of each dataset into usual quizzes (for training) and final exam questions (for testing). The correlation matrices of the usual quizzes and the final exam questions with the knowledge points are q_1 and q_2, respectively, and the students' answer results on the usual quizzes and the final exam questions are R_usual and R_real, respectively. (1) Firstly, the students' usual quiz results R_usual, the usual quiz-knowledge point correlation matrix q_1 and the final exam question-knowledge point correlation matrix q_2 are passed through the attention module to construct the intra-layer relationships among students, questions and knowledge points. (2) Then, the usual-quiz answering factor R_usual′, which integrates the relationships between students, and the correlation matrix q_1′, which integrates the relationships between knowledge points, are input into the knowledge evaluation module, and the students' mastery of knowledge points α is obtained by combining two multi-layer one-dimensional convolutional neural networks (godCNN and sodCNN). (3) At last, the students' mastery of knowledge points α and the correlation matrix q_2′, which integrates the relationships between the usual quizzes and the final exam questions, are passed through the test item prediction module to obtain the students' predicted answer results R on the final exam questions.
In the knowledge evaluation module and the test item prediction module, two multi-layer one-dimensional convolutional neural networks (godCNN and sodCNN) are combined. In the knowledge evaluation module, the usual quiz-knowledge point correlation matrix q_1′, which integrates the relationships between knowledge points, is passed through the multi-layer ordinary one-dimensional convolutional neural network (godCNN) to obtain knowledge point feature vectors at various levels, which are then flattened, fully connected and reshaped into convolution kernels carrying knowledge point features. With these kernels, the multi-layer special one-dimensional convolutional neural network (sodCNN) transforms the answering factor R_usual′, which integrates the relationships between students, into the students' mastery of knowledge points α.

3.2.1. Attention Module

The good effect of the previous model often depends on having sufficient data, such as the students’ answers and the relationship matrix between the questions and the knowledge points. However, in an actual situation, these data may be insufficient, and the relationship between students, test questions and knowledge points will also affect students’ problem-solving. In this paper, the attention mechanism is used to explore the interrelationship among students, questions and knowledge points.
In exploring the intra-layer relationships between students, linear transformations of the students' usual grades R_usual are computed first.
Q_1 = R_usual · W_1^Q,
K_1 = R_usual · W_1^K,
V_1 = R_usual · W_1^V,
where W_1^Q, W_1^K and W_1^V are the weight matrices of the linear transformations of the students' usual quiz scores R_usual.
The relationship between students is
C(S, S) = softmax(Q_1 · K_1^T / √d_K1),
where d_K1 is a scaling factor and also the dimension of K_1.
After the relationship between students is obtained, the usual quizzes answering factor integrating the relationship between students is expressed as
R_usual′ = C(S, S) · V_1.
According to the above formulas, the relationships between students can be calculated, yielding the usual-quiz answering factor R_usual′, which integrates the relationships between students. In the same way, the relationship C(K, K) between knowledge points can be calculated from the correlation matrix q_1 between the usual quizzes and the knowledge points, yielding the correlation matrix q_1′, which integrates the relationships between knowledge points.
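As a concrete illustration, the scaled dot-product self-attention among students can be sketched in NumPy. This is a minimal toy sketch, not the paper's implementation: the sizes and the randomly initialized weight matrices (W1_Q, W1_K, W1_V) are stand-ins for learned parameters, and the value projection is given the quiz dimension so that the answering factor keeps the shape N × M_1.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
N, M1, d = 4, 6, 8                                   # students, usual quizzes, attention dim (toy)
R_usual = rng.integers(0, 2, (N, M1)).astype(float)  # 0/1 usual-quiz scores

# Randomly initialized stand-ins for the learned projection matrices
W1_Q = rng.standard_normal((M1, d))
W1_K = rng.standard_normal((M1, d))
W1_V = rng.standard_normal((M1, M1))                 # value keeps the quiz dimension

Q1, K1, V1 = R_usual @ W1_Q, R_usual @ W1_K, R_usual @ W1_V
C_SS = softmax(Q1 @ K1.T / np.sqrt(d))               # C(S, S): N x N student-student weights
R_usual_prime = C_SS @ V1                            # answering factor integrating student relations

print(C_SS.shape, R_usual_prime.shape)               # (4, 4) (4, 6)
```

Each row of C_SS sums to 1, so R_usual′ mixes every student's record with those of related students.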
Since the final goal of the model is to predict the scores on the final exam questions, it is also necessary to calculate the relationship C(E_1, E_2) between the usual quizzes and the final exam questions, so as to obtain the final exam question-knowledge point correlation matrix q_2′, which integrates the relationships between the usual quizzes and the final exam questions. The calculation process is as follows.
First, q_1 and q_2 are linearly transformed:
Q_2 = q_2 · W_2^Q,
K_2 = q_1 · W_2^K,
V_2 = q_1 · W_2^V,
where W_2^Q is the weight matrix of the linear transformation of the final exam question-knowledge point correlation matrix q_2, and W_2^K and W_2^V are the weight matrices of the linear transformations of the usual quiz-knowledge point correlation matrix q_1.
The relationship between the usual quizzes and the final exam questions is
C(E_1, E_2) = softmax(Q_2 · K_2^T / √d_K2),
where d_K2 is a scaling factor and also the dimension of K_2.
The final exam question-knowledge point correlation matrix integrating the relationships between the usual quizzes and the final exam questions is
q_2′ = C(E_1, E_2) · V_2.
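The quiz-exam step works like cross-attention: queries come from the final exam matrix, while keys and values come from the usual quiz matrix. A minimal NumPy sketch under the same toy assumptions (illustrative sizes, random stand-in weights):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(1)
M1, M2, K, d = 6, 3, 5, 8                            # quizzes, exam questions, knowledge points, dim
q1 = rng.integers(0, 2, (M1, K)).astype(float)       # usual quiz / knowledge point matrix
q2 = rng.integers(0, 2, (M2, K)).astype(float)       # exam question / knowledge point matrix

W2_Q = rng.standard_normal((K, d))                   # projects q2 (queries)
W2_K = rng.standard_normal((K, d))                   # projects q1 (keys)
W2_V = rng.standard_normal((K, K))                   # projects q1 (values), keeps knowledge dim

Q2, K2, V2 = q2 @ W2_Q, q1 @ W2_K, q1 @ W2_V
C_E1E2 = softmax(Q2 @ K2.T / np.sqrt(d))             # C(E1, E2): M2 x M1 quiz-exam weights
q2_prime = C_E1E2 @ V2                               # exam matrix integrating quiz-exam relations

print(C_E1E2.shape, q2_prime.shape)                  # (3, 6) (3, 5)
```

Each exam question's row in q2_prime is a weighted mixture of the usual quizzes that attend to it most strongly.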

3.2.2. Knowledge Evaluation Module

In the real world, students master each knowledge point to different degrees. The higher a student's mastery of a certain knowledge point, the greater the probability that the student answers questions related to that knowledge point correctly. Therefore, students' mastery of knowledge points affects their answer results. In the model, the students' mastery of knowledge points α depends on the usual-quiz answering factor R_usual′, which integrates the relationships between students, and the correlation matrix q_1′, which integrates the relationships between knowledge points. The knowledge evaluation module combines two multi-layer one-dimensional convolutional neural networks (godCNN and sodCNN), and its specific structure is shown in Figure 4.
Firstly, the correlation matrix q_1′ = [e_1, e_2, …, e_M1] (M_1 rows and K columns), which integrates the relationships between knowledge points, is transposed to obtain the features of each knowledge point:
h_0^g = q_1′^T = [e_1, e_2, …, e_M1]^T.
Then, q_1′^T is passed through the multi-layer ordinary one-dimensional convolutional neural network (godCNN) as follows.
The old features of each knowledge point are transformed into new features by one-dimensional convolution, activation and pooling. As the network depth increases, the model may suffer from problems such as vanishing or exploding gradients, so in order to enhance the representation ability of the network and speed up training convergence, the model uses a residual structure. The mathematical expression is as follows:
h_i^g = pool[relu(h_(i−1)^g ⊛ W_i^g)] + h_(i−1)^g,
where pool represents the pooling operation (maximum pooling is used here), relu is the activation function and ⊛ represents the convolution operation. W^g = {W_1^g, …, W_i^g, …, W_Z^g} (1 ≤ i ≤ Z), where W_i^g is the convolution kernel of the ordinary one-dimensional convolutional neural network at layer i. h^g = {h_1^g, …, h_i^g, …, h_Z^g} (1 ≤ i ≤ Z), where h_i^g represents the features obtained after the knowledge point features are convolved with kernel W_i^g and pooled, i is the index of the convolution layer and Z is the maximum number of convolution layers.
Then, the resulting features are flattened and reshaped through several fully connected layers (View, FC, Reshape, VFR) into a convolution kernel with knowledge point features.
f_i0^g = view(h_i^g),
f_i1^g = φ(ω_i1^g · f_i0^g + b_i1^g),
f_i2^g = φ(ω_i2^g · f_i1^g + b_i2^g),
f_i3^g = φ(ω_i3^g · f_i2^g + b_i3^g),
W_i^s = reshape(f_i3^g),
where view(·) represents feature flattening, ω_i1^g, ω_i2^g and ω_i3^g are the parameter matrices of the fully connected layers, b_i1^g, b_i2^g and b_i3^g are the biases, φ is the sigmoid activation function, reshape(·) represents feature reshaping and · represents matrix multiplication. W^s = {W_1^s, …, W_i^s, …, W_Z^s} (1 ≤ i ≤ Z), where W_i^s is the convolution kernel carrying knowledge point features obtained from layer i of the ordinary one-dimensional convolutional neural network, i is the index of the convolution layer and Z is the maximum number of convolution layers.
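A stripped-down sketch of the kernel-generation idea: one ordinary convolution layer over the transposed quiz-knowledge matrix, followed by a single flatten plus fully connected plus sigmoid step that emits a convolution kernel. The real module uses pooling, residual connections and three fully connected layers per layer; the sizes, the single FC layer and the random weights below are illustrative assumptions.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv1d_valid(x, w):
    """Valid 1-D convolution (cross-correlation) of signal x with kernel w."""
    F = len(w)
    return np.array([np.dot(x[i:i + F], w) for i in range(len(x) - F + 1)])

rng = np.random.default_rng(2)
M1, K, F = 6, 5, 3                                   # quizzes, knowledge points, kernel size (toy)
q1_prime = rng.integers(0, 2, (M1, K)).astype(float)
h0 = q1_prime.T                                      # one row of quiz features per knowledge point

# godCNN layer: the kernel W_g is an ordinary, learned model parameter
W_g = rng.standard_normal(F)
feat = relu(np.stack([conv1d_valid(row, W_g) for row in h0]))  # K x (M1 - F + 1)

# View -> FC -> Reshape: turn the feature map into a data-dependent kernel W_s
flat = feat.reshape(-1)
W_fc = rng.standard_normal((flat.size, F))
W_s = sigmoid(flat @ W_fc)                           # convolution kernel carrying knowledge features

print(W_s.shape)                                     # (3,)
```

The point of the design is that W_s is computed from q_1′ itself, so the kernel applied later in sodCNN changes whenever the knowledge point structure changes.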
A multi-layer special one-dimensional convolutional neural network (sodCNN) is used to combine the usual-quiz answering factor R_usual′, which integrates the relationships between students, with the convolution kernels W^s carrying knowledge point features. The specific process is as follows.
Through special one-dimensional convolution, full connection layer, activation, pooling and residual connection operations, the usual quizzes features are transformed into new features. The mathematical expression is as follows:
h_0^s = R_usual′,
f_i^s = ω_i^s · (h_(i−1)^s ⊛ W_i^s) + b_i^s,
h_i^s = pool[relu(f_i^s)] + h_(i−1)^s,
where pool represents the pooling operation (maximum pooling is used here), relu is the activation function, ω_i^s is the parameter matrix of the fully connected layer, b_i^s is the bias, · represents matrix multiplication and ⊛ represents the convolution operation. h^s = {h_1^s, …, h_i^s, …, h_Z^s} (1 ≤ i ≤ Z), where h_i^s represents the features obtained after the usual-quiz features are convolved with kernel W_i^s and pooled, i is the index of the convolution layer and Z is the maximum number of convolution layers.
Then, the final features h_Z^s are flattened, and the students' mastery of knowledge points α is obtained through several fully connected layers.
f_0^ke = view(h_Z^s),
f_1^ke = φ(ω_1^ke · f_0^ke + b_1^ke),
f_2^ke = φ(ω_2^ke · f_1^ke + b_2^ke),
α = φ(ω_3^ke · f_2^ke + b_3^ke),
where ω_1^ke, ω_2^ke and ω_3^ke are the parameter matrices of the fully connected layers, b_1^ke, b_2^ke and b_3^ke are the biases, φ is the activation function and · represents matrix multiplication.
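Correspondingly, a minimal sketch of the special convolution: the kernel applied to each student's answering factor is taken from data (here a random stand-in for the kernel produced by godCNN) rather than from the model's parameters, and one fully connected sigmoid layer maps the result to mastery values in (0, 1). Sizes, weights and the single FC layer are toy assumptions, not the paper's configuration.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv1d_valid(x, w):
    """Valid 1-D convolution of signal x with kernel w."""
    F = len(w)
    return np.array([np.dot(x[i:i + F], w) for i in range(len(x) - F + 1)])

rng = np.random.default_rng(3)
N, M1, K, F = 4, 6, 5, 3                             # students, quizzes, knowledge points, kernel size
R_usual_prime = rng.integers(0, 2, (N, M1)).astype(float)  # answering factor (toy stand-in)
W_s = rng.random(F)                                  # kernel produced by godCNN (stand-in)

# sodCNN layer: the kernel is not a model parameter but comes from godCNN's output
H = relu(np.stack([conv1d_valid(r, W_s) for r in R_usual_prime]))  # N x (M1 - F + 1)

# View -> FC: map the convolved features to mastery alpha in (0, 1)
W_ke = rng.standard_normal((H.shape[1], K))
alpha = sigmoid(H @ W_ke)                            # N x K mastery of knowledge points

print(alpha.shape)                                   # (4, 5)
```

Because the sigmoid output lies in (0, 1), each entry of α can be read directly as a degree of mastery of one knowledge point.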

3.2.3. Test Item Prediction Module

In the model, the structure of the test item prediction module is similar to that of the knowledge evaluation module. The predicted results on the final exam questions R depend on the students' mastery of knowledge points α and the correlation matrix q_2′, which integrates the relationships between the usual quizzes and the final exam questions.
This process is again completed by the two multi-layer one-dimensional convolutional neural networks (godCNN and sodCNN). The correlation matrix q_2′ = [e_(M1+1), e_(M1+2), …, e_(M1+M2)], which integrates the relationships between the usual quizzes and the final exam questions, is passed through the multi-layer ordinary one-dimensional convolutional neural network (godCNN) to generate multiple convolution kernels carrying the features of the final exam questions. Using these convolution kernels, the special multi-layer one-dimensional convolutional neural network (sodCNN) maps the students' mastery of knowledge points α to the students' predicted scores on the final exam questions R.

3.2.4. Loss Function and Network Optimization

In this paper, SmoothL1Loss [25] is used as the loss function. It combines L2Loss and L1Loss and shares some advantages of both. Its mathematical expression is as follows:
L = l(R, R_real) = {l_1, l_2, …, l_n}^T,
where R represents the predicted results of the students' final exam questions and R_real represents the real results of the students' final exam questions.
l_n = (R_n − R_real,n)² / (2β), if |R_n − R_real,n| < β; otherwise, l_n = |R_n − R_real,n| − β/2,
where n is the number of samples and β is a threshold between 0 and 1.
If the absolute difference between a sample's predicted result and real result is lower than β, the squared-term loss L2Loss is used; otherwise, the absolute loss L1Loss is used, which in some cases prevents gradient explosion.
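The piecewise loss can be written out directly in a few lines; the β = 0.5 value and the three-sample input below are illustrative, not taken from the paper.

```python
import numpy as np

def smooth_l1(pred, real, beta=0.5):
    """SmoothL1: quadratic (L2-like) below beta, linear (L1-like) above it."""
    d = np.abs(pred - real)
    return np.where(d < beta, 0.5 * d ** 2 / beta, d - 0.5 * beta)

pred = np.array([0.9, 0.2, 0.8])   # predicted probabilities of correct answers
real = np.array([1.0, 0.0, 0.0])   # true 0/1 results
losses = smooth_l1(pred, real)     # approx. [0.01, 0.04, 0.55]
print(losses.mean())               # approx. 0.2
```

The third sample's error (0.8) exceeds β, so it contributes a linear penalty and its gradient stays bounded, which is exactly the property that suppresses gradient explosion on outliers.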
This paper also adopts the SGD [26] optimization method to train the model, and it uses CosineAnnealingLR [27] learning rate scheduling to improve how the model converges in the training process, improve the performance of the model and prevent overfitting.
In order to better train the model, the usual quizzes are also used as final exam questions during training, that is, q_1 in the model serves as q_2 for training. After training, the students' mastery of knowledge points α is the result of cognitive diagnosis, and the final exam questions are used for testing.
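The cosine-annealing schedule mentioned above follows a simple closed form. The sketch below mirrors the standard CosineAnnealingLR formula; the η_max, η_min and T_max values are illustrative, not the paper's hyperparameters.

```python
import math

def cosine_annealing_lr(t, T_max, eta_max=0.1, eta_min=0.0):
    """CosineAnnealingLR: decay the learning rate from eta_max to eta_min over T_max steps."""
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t / T_max))

lrs = [cosine_annealing_lr(t, T_max=100) for t in range(101)]
print(lrs[0], lrs[50], lrs[100])   # 0.1 at the start, ~0.05 halfway, ~0.0 at the end
```

The slow decay early on lets SGD make large updates while the loss is high, and the flattening tail near T_max stabilizes convergence, which is the behavior the paper relies on to prevent overfitting.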

3.3. Performance Index

In this paper, the Accuracy [9], Root Mean Square Error (RMSE [10]) and Mean Absolute Error (MAE [10]) are used as evaluation indexes, and their mathematical expressions are shown as follows.
The calculation formula of Accuracy is as follows:
Accuracy = (TP + TN) / (TP + FP + FN + TN),
where TP represents the number of samples with positive real sample labels and positive prediction labels; FP represents the number of samples with negative real sample labels and positive prediction labels; FN represents the number of samples with positive real sample labels and negative prediction labels; and TN represents the number of samples with negative real sample labels and negative prediction labels.
The Root Mean Square Error (RMSE) is calculated as follows:
RMSE = \sqrt{ \dfrac{1}{n} \sum_{j=1}^{n} (y_{pred,j} - y_{true,j})^2 } ,
where n is the number of samples, y_{pred,j} is the predicted value of the j-th sample and y_{true,j} is the true value of the j-th sample.
The Mean Absolute Error (MAE) is calculated as follows:
MAE = \dfrac{1}{n} \sum_{j=1}^{n} | y_{pred,j} - y_{true,j} | ,
where |y_{pred,j} - y_{true,j}| represents the absolute deviation between the predicted value and the true value.
Accuracy measures how correctly the task is completed. Because the model's predicted result is the probability that a student answers a final exam question correctly, questions with a predicted probability greater than 0.5 are regarded as answered correctly, and the remaining questions as answered incorrectly. The Root Mean Square Error and the Mean Absolute Error are commonly used error measures, widely applied in fields such as data analysis, modeling and deep learning. The higher the Accuracy, and the smaller the RMSE and MAE, the better the prediction ability of the model.
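A minimal sketch of this evaluation protocol (the function name and toy inputs are ours; only the 0.5 threshold and the three formulas come from the text):

```python
import math

def evaluate(probs, labels, threshold=0.5):
    """Accuracy, RMSE and MAE for predicted correctness probabilities.
    A probability above the threshold counts as a predicted correct answer."""
    n = len(probs)
    accuracy = sum(int((p > threshold) == (y == 1)) for p, y in zip(probs, labels)) / n
    rmse = math.sqrt(sum((p - y) ** 2 for p, y in zip(probs, labels)) / n)
    mae = sum(abs(p - y) for p, y in zip(probs, labels)) / n
    return accuracy, rmse, mae
```

Accuracy is computed on the thresholded predictions, while RMSE and MAE compare the raw probabilities with the 0/1 labels, so the two kinds of index can rank models differently.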

4. Experiments and Analysis of Results

4.1. Comparative Experiment

In this paper, students’ answers are divided into usual quizzes (training set) and final exam questions (test set). Based on the data in the training set, students’ knowledge mastery level is evaluated, and then the questions in the test set are predicted according to the evaluation results. FuzzyCDF [5], DINA [4], IRT [3], MIRT [28], NCDM [6] and MCD [29] were selected as the comparison models of the cognitive diagnosis model in this paper.
Table 2, Table 3, Table 4 and Table 5 list the comparative experimental results.
In Table 2, Table 3 and Table 4, the Training Set Ratio (TSR) refers to the proportion of training set samples to the dataset samples, that is, the proportion of usual quizzes in all questions.
TSR = \dfrac{E_1}{E_1 + E_2} ,

where E_1 is the number of usual quizzes and E_2 is the number of final exam questions.
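The split can be sketched as follows (a hypothetical helper, not the paper's preprocessing code; E_1 and E_2 denote the usual quizzes and final exam questions):

```python
def split_by_tsr(questions, tsr):
    """Split a question list into usual quizzes E1 (training) and
    final exam questions E2 (testing) at a given Training Set Ratio."""
    k = round(len(questions) * tsr)
    return questions[:k], questions[k:]

# e.g. FrcSub's 20 questions at TSR = 0.7 -> 14 quizzes, 6 exam questions
usual, final = split_by_tsr(list(range(20)), 0.7)
```

Because the test set contains whole questions never seen in training, a model with question-specific parameters cannot fit them, which is exactly the situation the comparison below probes.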
As the tables show, on the public datasets FrcSub, Math1 and Math2, the comparison models perform poorly on the Accuracy index, because they are not trained on the unknown question types and contain trainable parameters tied to the test questions. The CNNCD model proposed in this paper has no question-specific parameters, so its Accuracy and RMSE indexes are better than those of the other models, especially on the datasets Math1 and Math2, although its performance on the MAE index is weaker.
Since the dataset BOIT already separates the usual quizzes from the final exam questions, no comparison across different proportions of usual quizzes is made for it. As can be seen from Table 5, the proposed CNNCD model is superior to the other models in terms of Accuracy and RMSE, and performs well in terms of MAE, second only to NCDM.

4.2. Ablation Experiment

The BOIT dataset is closest to FrcSub in data sample size, FrcSub is the most commonly used classical cognitive diagnosis dataset, and BOIT is a self-made dataset, so these two datasets are representative. They were therefore selected to test the influence of the different modules on the prediction performance of the model in this paper. The experimental results are shown in Figure 5.
In Figure 5, CDM represents the removal of the attention module and knowledge evaluation module of model CNNCD; CDM + odCNN means removing the attention module of model CNNCD; CDM + att means removing the knowledge evaluation module of model CNNCD; and CDM + odCNN + att represents the complete CNNCD model. After the knowledge evaluation module is removed, matrix multiplication is used to connect the results of students’ usual quizzes and the relevant matrix of the usual quizzes and knowledge points.
As can be seen from Figure 5a,b, when the model CNNCD removes the attention module and knowledge evaluation module, the prediction index is the lowest; when the attention module and knowledge evaluation module are added, each index has a different degree of improvement, indicating the effectiveness of the attention module and knowledge evaluation module.
To verify the influence of network depth on the cognitive diagnosis results, one-dimensional convolutional neural networks of different depths were used to model the knowledge mastery module. The experimental results are shown in Figure 6.
In Figure 6a–c, the horizontal coordinate is the number of convolutional layers in the knowledge mastery module, and the vertical coordinates are the Accuracy, RMSE and MAE, respectively. For dataset FrcSub, the Accuracy and MAE perform best with 3 convolutional layers, and the RMSE performs best with 4. For dataset BOIT, the Accuracy and MAE are best with 4 convolutional layers, and the RMSE with 2. These results show that the two datasets differ somewhat in the cognitive diagnosis process.

5. Conclusions

To overcome the dependence of existing cognitive diagnosis models on trained question types, this paper proposes a cognitive diagnosis model based on a convolutional neural network—CNNCD. The model divides the question types of the dataset into the usual quizzes and the final exam questions. It uses usual quizzes and their correlation matrix with knowledge points to model students’ mastery of knowledge points, and then uses the final exam questions to predict. In addition, an attention mechanism is used to dig deeper into the intra-layer relationship among students, questions and knowledge points. This alleviates the problem of insufficient information mining among students, questions and knowledge points. Based on the three evaluation results of the Accuracy, Root Mean Square Error (RMSE) and Mean Absolute Error (MAE), the results of the prediction are better than those of other models, which fully demonstrates the superiority of the CNNCD model. In addition, a large number of ablation experiments were conducted to verify the effectiveness of the attention module and the knowledge evaluation module.
When the CNNCD model is applied to the teaching of college courses, on the one hand, it can evaluate the students’ mastery of knowledge points, so that students can check their deficiencies and make up for their omissions; on the other hand, it can help teachers find deficiencies in their teaching and optimize their teaching plans. In future studies, we will focus on adopting different types of attention mechanisms to optimize the attention module, thereby improving the predictive accuracy of the model. At the same time, we will build on the existing model to further investigate the correlation between different student grades in different semesters. This will help teachers to assess students’ cognitive states, the difficulty of questions and knowledge points.

Author Contributions

Conceptualization, J.M. and L.H.; methodology, L.H.; software, L.H.; validation, L.H., H.Y. and Z.S.; formal analysis, H.Y.; investigation, H.Y.; resources, J.M.; data curation, L.H.; writing—original draft preparation, L.H.; writing—review and editing, J.M.; visualization, L.H.; supervision, J.M.; project administration, J.M.; funding acquisition, J.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (62177012, 62267003), the Guangxi Natural Science Foundation (2024GXNSFDA010048) and a Project of the Guangxi Wireless Broadband Communication and Signal Processing Key Laboratory (GXKL06240107).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The FrcSub, Math1 and Math2 datasets are available at the following link: http://staff.ustc.edu.cn/%7Eqiliuql/data/math2015.rar (accessed on 13 October 2024). The BOIT dataset will be made available upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Cheng, X. Study on Learning Diagnosis and Personalized Test Item Recommendation Based on Graph Embedding. Master’s Thesis, Hubei University, Wuhan, China, 2023. [Google Scholar]
  2. Tao, J.; Zhao, W.; Zhang, Y.; Guo, Q.; Min, B.; Xu, X.; Liu, F. Cognitive diagnostic assessment: A Q-matrix constraint-based neural network method. Behav. Res. Methods 2024, 56, 6981–7004. [Google Scholar] [CrossRef]
  3. Embretson, S.; Reise, S. Item Response Theory for Psychologists; Psychology Press: New York, NY, USA, 2000; Volume 4. [Google Scholar] [CrossRef]
  4. de la Torre, J. DINA Model and Parameter Estimation: A Didactic. J. Educ. Behav. Stat. 2009, 34, 115–130. [Google Scholar]
  5. Liu, Q.; Wu, R.; Chen, E.; Xu, G.; Su, Y.; Chen, Z.; Hu, G. Fuzzy Cognitive Diagnosis for Modelling Examinee Performance. ACM Trans. Intell. Syst. Technol. (TIST) 2018, 9, 1–26. [Google Scholar] [CrossRef]
  6. Wang, F.; Liu, Q.; Chen, E.; Huang, Z.; Chen, Y.; Yin, Y.; Huang, Z.; Wang, S. Neural Cognitive Diagnosis for Intelligent Education Systems. Proc. AAAI Conf. Artif. Intell. 2020, 34, 6153–6161. [Google Scholar] [CrossRef]
  7. Wang, L.; Luo, Z.; Liu, C. Cognitive and Response Model for Evaluation of MOOC Learners. Chin. J. Electron. 2023, 51, 18–25. [Google Scholar]
  8. Hirose, H. Prediction of Success or Failure for Final Examination using Nearest Neighbor Method to the Trend of Weekly Online Testing. arXiv 2018, arXiv:1901.02056. [Google Scholar]
  9. Yang, H. Feature mining method of equipment support data based on attribute classification. Ordnance Mater. Sci. Eng. 2020, 43, 124–128. [Google Scholar] [CrossRef]
  10. Hodson, T.O. Root-mean-square error (RMSE) or mean absolute error (MAE): When to use them or not. Geosci. Model Dev. 2022, 15, 5481–5487. [Google Scholar] [CrossRef]
  11. Zhao, T.; Song, T. Establishing a Fusion Model of Attention Mechanism and Generative Adversarial Network to Estimate Students’ Attitudes in English Classes. Teh. Vjesn.—Tech. Gaz. 2022, 29, 1464–1471. [Google Scholar]
  12. Ma, H.; Li, M.; Wu, L.; Zhang, H.; Cao, Y.; Zhang, X.; Zhao, X. Knowledge-Sensed Cognitive Diagnosis for Intelligent Education Platforms. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management, Atlanta, GA, USA, 17–21 October 2022; pp. 1451–1460. [Google Scholar]
  13. Qian, L.; Zheng, K.; Wang, L.; Li, S. Student State-aware knowledge tracing based on attention mechanism: A cognitive theory view. Pattern Recognit. Lett. 2024, 184, 190–196. [Google Scholar] [CrossRef]
  14. Wu, J.; Zhang, Y.; Fu, L.; Luo, Y.; Xie, H.; Hua, R. BRAYOLOv7: An improved model based on attention mechanism and raspberry pi implementation for online education. Int. J. Sen. Netw. 2024, 46, 45–59. [Google Scholar] [CrossRef]
  15. Dey, A.; Anand, A.; Samanta, S.; Sah, B.K.; Biswas, S. Attention-Based AdaptSepCX Network for Effective Student Action Recognition in Online Learning. Procedia Comput. Sci. 2024, 233, 164–174. [Google Scholar] [CrossRef]
  16. Du, L.; Xu, Y. Development strategy of online English teaching based on attention mechanism and recurrent neural network recommendation method. Int. J. Data Min. Bioinform. 2024, 28, 140–155. [Google Scholar] [CrossRef]
  17. Miao, W. A Study on the Teaching Design of a Hybrid Civics Course Based on the Improved Attention Mechanism. Appl. Sci. 2022, 12, 1243. [Google Scholar] [CrossRef]
  18. Su, B.; Peng, J. Sentiment Analysis of Comment Texts on Online Courses Based on Hierarchical Attention Mechanism. Appl. Sci. 2023, 13, 4204. [Google Scholar] [CrossRef]
  19. Yan, C.; Ganglin, W.; Jiaxin, L.; Yunwei, C.; Qinghua, Z.; Feng, T.; Haiping, Z.; Qianying, W.; Yaqiang, W. A prediction model of student performance based on self-attention mechanism. Knowl. Inf. Syst. 2022, 65, 733–758. [Google Scholar]
  20. Hu, P.; Tang, X.; Yang, L.; Kong, C.; Xia, D. LCANet: A model for analysis of students real-time sentiment by integrating attention mechanism and joint loss function. Complex Intell. Syst. 2024, 11, 27. [Google Scholar] [CrossRef]
  21. Cheng, S.; Liu, Q.; Chen, E.; Huang, Z.; Huang, Z.; Chen, Y.; Ma, H.; Hu, G. DIRT: Deep Learning Enhanced Item Response Theory for Cognitive Diagnosis. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 2397–2400. [Google Scholar]
  22. Yeung, C.-K. Deep-IRT: Make Deep Learning Based Knowledge Tracing Explainable Using Item Response Theory. arXiv 2019, arXiv:1904.11738. [Google Scholar]
  23. Gao, W.; Liu, Q.; Huang, Z.; Yu, Y.; Bi, H.; Wang, M.-C.; Ma, J.; Wang, S.; Su, Y. RCD: Relation Map Driven Cognitive Diagnosis for Intelligent Education Systems; Association for Computing Machinery: New York, NY, USA, 2021. [Google Scholar] [CrossRef]
  24. Liu, Y.; Zhang, L. Cognitive diagnosis model integrating forgetting and knowledge importance. J. South China Univ. Technol. (Nat. Sci. Ed.) 2023, 51, 54–62. [Google Scholar]
  25. Girshick, R. Fast R-CNN. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1440–1448. [Google Scholar]
  26. Tian, Y.; Zhang, Y.; Zhang, H. Recent Advances in Stochastic Gradient Descent in Deep Learning. Mathematics 2023, 11, 682. [Google Scholar] [CrossRef]
  27. Smith, L.N.; Topin, N. Super-convergence: Very fast training of neural networks using large learning rates. In Proceedings of the Defense + Commercial Sensing, Baltimore, MD, USA, 15–16 April 2019. [Google Scholar]
  28. Reckase, M.D. 18 Multidimensional Item Response Theory. In Handbook of Statistics; Rao, C.R., Sinharay, S., Eds.; Elsevier: Amsterdam, The Netherlands, 2006; Volume 26, pp. 607–642. [Google Scholar]
  29. Luo, Z.; Li, Y.; Yu, X.; Gao, C.; Peng, Y. A Simple Cognitive Diagnosis Method Based on Q-Matrix Theory. Acta Psychol. Sin. 2015, 47, 264–272. [Google Scholar] [CrossRef]
Figure 1. The previous cognitive diagnosis process diagram.
Figure 2. Cognitive diagnosis process diagram in this paper.
Figure 3. CNNCD model diagram.
Figure 4. Structure of multi-layer one-dimensional convolutional neural networks (godCNN and sodCNN) in the knowledge evaluation module.
Figure 5. Influence of modules on the prediction effect of the CNNCD model. Experimental results of the (a) FrcSub and (b) BOIT datasets.
Figure 6. The effect of the number of convolutional layers on the Accuracy (a), RMSE (b) and MAE (c) of CNNCD models in data sets FrcSub and BOIT.
Table 1. Dataset summary.
Dataset    Number of Students   Number of Questions   Number of Knowledge Points
FrcSub     536                  20                    8
Math1      4209                 15                    11
Math2      3911                 16                    16
BOIT       198                  38                    11
Table 2. Experimental comparison results of FrcSub dataset.
Model     |        Accuracy (TSR)         |          RMSE (TSR)           |           MAE (TSR)
          | 0.3   0.4   0.5   0.6   0.7   | 0.3   0.4   0.5   0.6   0.7   | 0.3   0.4   0.5   0.6   0.7
FuzzyCDF  | 0.722 0.749 0.774 0.793 0.814 | 0.436 0.418 0.399 0.391 0.374 | 0.378 0.357 0.332 0.327 0.309
DINA      | 0.489 0.500 0.503 0.529 0.542 | 0.715 0.707 0.705 0.686 0.677 | 0.511 0.500 0.497 0.471 0.458
IRT       | 0.679 0.673 0.670 0.671 0.655 | 0.499 0.494 0.492 0.495 0.504 | 0.360 0.370 0.365 0.361 0.388
MIRT      | 0.704 0.726 0.754 0.749 0.795 | 0.495 0.475 0.446 0.457 0.410 | 0.308 0.290 0.263 0.264 0.222
NCDM      | 0.632 0.695 0.772 0.786 0.795 | 0.595 0.525 0.464 0.457 0.446 | 0.365 0.303 0.230 0.224 0.214
MCD       | 0.720 0.743 0.750 0.770 0.795 | 0.469 0.437 0.423 0.406 0.384 | 0.317 0.299 0.298 0.286 0.268
CNNCD     | 0.743 0.762 0.787 0.815 0.828 | 0.427 0.412 0.390 0.374 0.366 | 0.307 0.314 0.256 0.257 0.206
Table 3. Experimental comparison results of Math1 dataset.
Model     |        Accuracy (TSR)         |          RMSE (TSR)           |           MAE (TSR)
          | 0.3   0.4   0.5   0.6   0.7   | 0.3   0.4   0.5   0.6   0.7   | 0.3   0.4   0.5   0.6   0.7
FuzzyCDF  | 0.517 0.543 0.541 0.563 0.590 | 0.531 0.521 0.520 0.512 0.501 | 0.492 0.480 0.481 0.469 0.454
DINA      | 0.540 0.549 0.590 0.590 0.621 | 0.581 0.580 0.529 0.540 0.546 | 0.496 0.462 0.439 0.434 0.412
IRT       | 0.552 0.500 0.489 0.553 0.426 | 0.563 0.619 0.528 0.553 0.750 | 0.466 0.491 0.465 0.469 0.578
MIRT      | 0.554 0.543 0.533 0.554 0.525 | 0.612 0.610 0.630 0.599 0.626 | 0.450 0.457 0.470 0.450 0.477
NCDM      | 0.507 0.500 0.539 0.561 0.547 | 0.702 0.707 0.674 0.659 0.666 | 0.493 0.500 0.463 0.440 0.452
MCD       | 0.544 0.573 0.570 0.545 0.568 | 0.556 0.524 0.526 0.524 0.497 | 0.463 0.452 0.458 0.475 0.476
CNNCD     | 0.611 0.658 0.657 0.665 0.708 | 0.490 0.483 0.475 0.479 0.457 | 0.485 0.475 0.457 0.419 0.398
Table 4. Experimental comparison results of Math2 dataset.
Model     |        Accuracy (TSR)         |          RMSE (TSR)           |           MAE (TSR)
          | 0.3   0.4   0.5   0.6   0.7   | 0.3   0.4   0.5   0.6   0.7   | 0.3   0.4   0.5   0.6   0.7
FuzzyCDF  | 0.552 0.560 0.574 0.591 0.613 | 0.516 0.513 0.509 0.497 0.491 | 0.473 0.468 0.464 0.452 0.438
DINA      | 0.552 0.577 0.561 0.605 0.672 | 0.562 0.551 0.557 0.514 0.490 | 0.482 0.460 0.455 0.435 0.386
IRT       | 0.558 0.518 0.547 0.445 0.564 | 0.596 0.643 0.544 0.681 0.616 | 0.462 0.497 0.459 0.581 0.498
MIRT      | 0.528 0.545 0.520 0.550 0.549 | 0.642 0.613 0.634 0.611 0.608 | 0.473 0.459 0.482 0.450 0.457
NCDM      | 0.582 0.593 0.573 0.651 0.710 | 0.646 0.638 0.653 0.591 0.507 | 0.418 0.407 0.427 0.349 0.345
MCD       | 0.529 0.542 0.578 0.566 0.684 | 0.565 0.555 0.529 0.528 0.465 | 0.475 0.473 0.449 0.463 0.448
CNNCD     | 0.589 0.610 0.604 0.687 0.778 | 0.490 0.496 0.492 0.464 0.407 | 0.477 0.496 0.467 0.439 0.321
Table 5. Experimental comparison results of the dataset BOIT.
Model      Accuracy   RMSE    MAE
FuzzyCDF   0.585      0.512   0.444
DINA       0.590      0.538   0.445
IRT        0.685      0.494   0.372
MIRT       0.619      0.547   0.386
NCDM       0.695      0.498   0.321
MCD        0.703      0.453   0.368
CNNCD      0.734      0.448   0.348

Share and Cite

MDPI and ACS Style

Mo, J.; Hao, L.; Yuan, H.; Shou, Z. A Cognitive Diagnosis Model Using Convolutional Neural Networks to Predict Student Scores. Appl. Sci. 2025, 15, 2875. https://doi.org/10.3390/app15062875
