Communication
Peer-Review Record

DKT-LCIRT: A Deep Knowledge Tracking Model Integrating Learning Capability and Item Response Theory

Electronics 2022, 11(20), 3364; https://doi.org/10.3390/electronics11203364
by Guangquan Li 1,2, Junkai Shuai 1, Yuqing Hu 1, Yonghong Zhang 3,*, Yinglong Wang 1, Tonghua Yang 2,* and Naixue Xiong 4
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 23 August 2022 / Revised: 28 September 2022 / Accepted: 17 October 2022 / Published: 18 October 2022
(This article belongs to the Section Artificial Intelligence)

Round 1

Reviewer 1 Report

The authors present an interesting work that proposes a deep knowledge tracking model using item response theory. A few issues need to be addressed before the paper can be accepted.

1.       The paper is hard to read and understand. In the abstract, the authors claim that their work outperforms other classical models. Please quantify at this stage how much better their results are than those of the other methods. Similarly, include a list of the paper's contributions to the literature at the end of the introduction section. Moreover, the authors use overly long sentences that make the paper hard to read. For example, on Page 3, Lines 115 to 119 form a paragraph consisting of a single sentence. Avoid one-sentence paragraphs and rephrase the sentences. Furthermore, avoid spelling mistakes. For example, in Figure 3, sigmoid is written as “sigmod”.

2.       A lot of abbreviations and variables are used in the paper. It is suggested to add a table of abbreviations in the paper.

3.       Add a space before using citation in the text throughout the paper.

4.       Notations should be consistent within the formulas. For example, in Equation 5, in input stu_i is used on left side of equation while it is just used as variable i on right side of equation. Similarly for seg_z

5.       In Equation 5, learning capability d is for some student i. It should vary. Either the equation should hold for all students or for only one student. Currently, variable i is fixed because it is taken as input on the left side of the equation.

6.       Variable “capital K” is not defined for Equation 5.

7.       In Figure 2, the time intervals are not of the same length. How is the variability of the time intervals addressed in the work? Moreover, if a student has missed some tests, how is that handled in the system?

8.       Presenting K-mean algorithm as Algorithm 1 is unnecessary

9.       In Section 3.2.1, the exercise is represented as q_t. However, the variable t in the subscript is not defined? Is it a time?

10.   Different vectors are described in Section 3.2.1. It is suggested to also describe the dimensions of the vectors

11.   In Equation 1, k_j is the knowledge point, but what does the variable j represent? What is the domain or range of k_j?

12.   In Line 213, N_jt indicates the amount of time in total student i represents. Where is variable i in this variable? How should we differentiate this variable for different students?

13.   In Line 216, Groups are C_z. Is it the last time interval based groups?

14.   In Equation 1, k_j is the knowledge point and in Equation 6, k_t is embedding vector. Don’t use similar notations for two different things.

15.   In Equation 6, what does variable i represent?

16.   In Equation 7, value matrix is subscripted with variable t but the key matrix in Equation 6 does not? Why?

17.   In Line 252, v_t-1 is used but not defined

18.   In Equation 8, pd is constant, pd_t is difficulty of exercise what is pd(p_j)? Don’t use similar notations for different things. It makes the readability of the paper very difficult

19.   In Line 256, N_j represents a group of students, but what does the variable j represent?

20.   In Line 256, p_j represents exercise and p_ij represents answer. Don’t use similar notations for different things

21.   In Line 263, W_h represents a weight matrix. What does the variable h represent here?

22.   In Line 256, exercise is represented as p_j but on line 277 it is represented with variable j only.

23.   In Algorithm 2, the input sequence has variable t while variable t is initialized on Line 1. Are they different or they are related?

24.   In Algorithm 2, the input sequence is for different students but there is no variable to identify individual users among them.

25.   In Algorithm 2, on Line 2, embedding vectors are cascaded with the learning capability grouping. However, the grouping is not taken as an input of the algorithm, nor is Algorithm 1 called within Algorithm 2.

26.   In Algorithm 2, read vector r is cascaded with v and pd but both of those variables are undefined in the Algorithm

27.   In Line 299, r_t represent true values whereas same r_t is read vector in Equation 7.

28.   The algorithm is time-dependent, and the authors also assume in Line 113 that knowledge tracking changes all the time. However, for evaluation they use k-fold cross validation, which is used when data is not time-dependent. Does a high k-fold validation score show that knowledge tracking does not change over time? The authors are advised to change the validation methodology.

Comments for author File: Comments.pdf

Author Response

Response to Reviewer 1 Comments

Point 1: The paper is hard to read and understand. In the abstract, the authors claim that their work outperforms other classical models. Please quantify at this stage how much better their results are than those of the other methods. Similarly, include a list of the paper's contributions to the literature at the end of the introduction section. Moreover, the authors use overly long sentences that make the paper hard to read. For example, on Page 3, Lines 115 to 119 form a paragraph consisting of a single sentence. Avoid one-sentence paragraphs and rephrase the sentences. Furthermore, avoid spelling mistakes. For example, in Figure 3, sigmoid is written as “sigmod”.

Response 1: Thanks for your comment! We apologize for making this paper difficult to read. We have adjusted it to the best of our ability and had it revised by professionals. Following your suggestion, we now quantify in line 25 how much better our results are than those of the other methods. We have also added a list of this study's contributions to the literature in line 93. We have read the paper carefully, fixed spelling errors, and adjusted the sentence structure to make the paper easier to read.

Point 2: A lot of abbreviations and variables are used in the paper. It is suggested to add a table of abbreviations in the paper.

Response 2: Thanks to the reviewer for this suggestion. We have added a glossary of terminology abbreviations to the supplementary material in line 446.

Point 3: Add a space before using citation in the text throughout the paper.

Response 3: Thank you for pointing out this issue! We have added a space before using citation in the text throughout the paper.

Point 4: Notations should be consistent within the formulas. For example, in Equation 5, in input stu_i is used on left side of equation while it is just used as variable i on right side of equation. Similarly for seg_z.

Response 4: Thank you for pointing out the inadequacies of our manuscript! We have carefully proofread the formulas and made changes to ensure that the notations in each formula remain consistent. We have changed equation 5 to  in line 234.

Point 5: In Equation 5, learning capability d is for some student i. It should vary. Either the equation should hold for all students or for only one student. Currently, variable i is fixed because it is taken as input on the left side of the equation.

Response 5: Thanks for your comment! Equation 5 represents a K-means clustering process. At time interval z, each student i is assigned to a group of students with similar learning capability, based on that student's own learning capability.
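The grouping step described in Response 5 can be illustrated with a minimal sketch. It assumes each student is represented at one time interval by a small vector of capability scores; the vectors, the student names, and the initial centroids below are hypothetical, and the paper's actual Equations 1 to 5 are not reproduced in this record.

```python
import math

def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's k-means on tuples; returns (labels, centroids)."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each student goes to the nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids

# Hypothetical capability vectors for four students at one time interval z.
capability = {
    "stu_1": (0.90, 0.80),
    "stu_2": (0.85, 0.90),
    "stu_3": (0.20, 0.10),
    "stu_4": (0.10, 0.30),
}
students = list(capability)
labels, centroids = kmeans(
    [capability[s] for s in students],
    centroids=[(1.0, 1.0), (0.0, 0.0)],  # assumed starting centroids, k = 2
)
groups = dict(zip(students, labels))  # student -> group index for this interval
```

Running the same routine once per time interval would give each student a possibly different group at each interval, which matches the reviewer's expectation that the assignment varies with both i and z.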

Point 6: Variable “capital K” is not defined for Equation 5.

Response 6: Thank you for pointing out the inadequacies of our manuscript! The variable "capital K" means that there are K learning capability groupings. We have added the definition below Equation 5.

Point 7: In Figure 2, the time intervals are not of the same length. How is the variability of the time intervals addressed in the work? Moreover, if a student has missed some tests, how is that handled in the system?

Response 7: Thanks again for your comment! When fixed time intervals are divided, their lengths vary because the lengths of the interaction sequences vary. The learning capability for a given time interval depends only on the student's previous performance, so this variation does not affect the result. If a student misses some tests, the system by default treats the student as having answered those exercises incorrectly. Not doing an exercise is treated as equivalent to not knowing how to do it, that is, the student has not mastered the knowledge contained in the exercise.
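The default described in Response 7 for missed tests amounts to a simple preprocessing rule. The sketch below assumes an interaction is stored as an (exercise_id, answer) pair with None marking a missed exercise; that encoding and the function name are illustrative assumptions, not the authors' data format.

```python
def fill_missed(responses):
    """Replace missing answers (None) with 0, i.e. treat them as incorrect."""
    return [(exercise, 0 if answer is None else answer)
            for exercise, answer in responses]

# (exercise_id, answer): 1 = correct, 0 = incorrect, None = exercise was missed.
raw = [(12, 1), (7, None), (3, 0), (9, None)]
filled = fill_missed(raw)
```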

Point 8: Presenting K-mean algorithm as Algorithm 1 is unnecessary.

Response 8: Thanks to the reviewer for this suggestion. The k-means clustering algorithm is well understood by all. We have removed the specific process of the k-means clustering algorithm.

Point 9: In Section 3.2.1, the exercise is represented as q_t. However, the variable t in the subscript is not defined? Is it a time?

Response 9: Thank you for pointing out the inadequacies of our manuscript! q_t denotes the exercise done by the student at moment t. We have made additions in line 252.

Point 10: Different vectors are described in Section 3.2.1. It is suggested to also describe the dimensions of the vectors.

Response 10: Thanks to the reviewer for this suggestion. Suppose that Q exercises contain N knowledge points. The correlation weight vector . The embedding vector . The exercise Embedding matrix . The key memory matrix . We have added the dimensions of these vectors in Section 3.2.1.

Point 11: In Equation 1, k_j is the knowledge point, but what does the variable j represent? What is the domain or range of k_j?

Response 11: Thank you for pointing out this issue! To ensure that different things do not use similar notation, we have modified the Equation 1 to .  is the knowledge point. The variable  represents a certain knowledge point. The range of  is the number of knowledge points contained in all exercises.

Point 12: In Line 213, N_jt indicates the amount of time in total student i represents. Where is variable i in this variable? How should we differentiate this variable for different students?

Response 12: Thank you for pointing out the inadequacies of our manuscript! We have made changes in line 225.  indicates the total number of times student i answered the knowledge point . We use the variable i to distinguish between different students.

Point 13: In Line 216, Groups are C_z. Is it the last time interval based groups?

Response 13: Thank you for pointing out this issue!  is the group to which students are assigned with similar learning capability at time interval z. It is based on all historical performance prior to time interval z. We have made additions in line 230.

Point 14: In Equation 1, k_j is the knowledge point and in Equation 6, k_t is embedding vector. Don’t use similar notations for two different things.

Response 14: Thank you for pointing out the inadequacies of our manuscript! We carefully proofread the formulas, modified equations 1 to 4. The  represent knowledge points. We make sure that different things do not use similar notations.

Point 15: In Equation 6, what does variable i represent?

Response 15: Thanks for your comment! In Equation 6, i represents  is the i-th row-vector of . We have made additions below Equation 6.

Point 16: In Equation 7, value matrix is subscripted with variable t but the key matrix in Equation 6 does not? Why?

Response 16: Thanks again for your comment! In Equation 7,  is the value memory matrix, which stores students' mastery of knowledge points. It is always changing over time. In Equation 6,    is the key memory matrix, which stores the potential knowledge points of the exercises. It does not change over time.

Point 17: In Line 252, v_t-1 is used but not defined.

Response 17: Thank you for pointing out the inadequacies of our manuscript!  is the previous response vectors. We have added the definition in line 267.

Point 18: In Equation 8, pd is constant, pd_t is difficulty of exercise what is pd(p_j)? Don’t use similar notations for different things. It makes the readability of the paper very difficult.

Response 18: Thank you for pointing out the inadequacies of our manuscript! To ensure that different notations are used for different things, we have changed Equation 8 to .  denotes the difficulty of exercise .  is a function that maps the error rate of exercise  onto (10) difficulty levels.  represents the difficulty level of the exercises that we wish to keep. We have modified the interpretation of the formula in line 274.
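Response 18 describes a function that maps an exercise's error rate onto 10 difficulty levels. The revised Equation 8 is not reproduced in this record, so the following is only a rough sketch: the function name, the 0-to-9 level numbering, and the equal-width binning are assumptions made for illustration.

```python
def difficulty_level(answers, levels=10):
    """Map an exercise's error rate onto `levels` discrete levels (0 = easiest).

    `answers` holds every student's answer to one exercise: 1 = correct, 0 = incorrect.
    """
    if not answers:
        raise ValueError("need at least one answer to estimate difficulty")
    wrong = answers.count(0)
    # Integer arithmetic avoids floating-point rounding at bin boundaries;
    # an error rate of exactly 1.0 is clamped into the top level.
    return min(wrong * levels // len(answers), levels - 1)

easy = difficulty_level([1, 1, 1, 1])    # error rate 0.00
medium = difficulty_level([1, 1, 0, 1])  # error rate 0.25
hard = difficulty_level([0, 0, 0])       # error rate 1.00
```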

Point 19: In Line 256, N_j represents a group of students, but what does the variable j represent?

Response 19: Thanks for your comment!  represents the group of students who answered the exercises . The variable  represents exercise . We have made changes in line 272.

Point 20: In Line 256, p_j represents exercise and p_ij represents answer. Don’t use similar notations for different things.

Response 20: Thank you for pointing out the inadequacies of our manuscript! We have used the variable  for the exercises and the variable  for the answers in line 273. We made sure to use different notation for different things.

Point 21: In Line 263, W_h represents a weight matrix. What does the variable h represent here?

Response 21: Thank you for pointing out this issue! We have changed Equation 10 to .  indicates the input weight matrix.  indicates the recurrent weight matrix. We have made additions below Equation 10.

Point 22: In Line 256, exercise is represented as p_j but on line 277 it is represented with variable j only.

Response 22: Thank you for pointing out the inadequacies of our manuscript! We have changed the exercise  to variable  in line 272. We have carefully proofread the entire text to ensure variable uniformity.

Point 23: In Algorithm 2, the input sequence has variable t while variable t is initialized on Line 1. Are they different or they are related?

Response 23: Thanks for your comment! The variable t in the input sequence is the same variable t as on Line 1. The variable t represents the moment, and Line 1 indicates that the operations that follow are repeated at each moment.

Point 24: In Algorithm 2, the input sequence is for different students but there is no variable to identify individual users among them.

Response 24: Thank you for pointing out this issue! The algorithm is applicable to all students. We have added the variable i to the input sequence to identify the different students.

Point 25: In Algorithm 2, on Line 2, embedding vectors are cascaded with the learning capability grouping. However, the grouping is not taken as an input of the algorithm, nor is Algorithm 1 called within Algorithm 2.

Response 25: Thank you for pointing out the inadequacies of our manuscript! The learning capability grouping is calculated by Equations (1) to (5). We have added this step in line 3 of the algorithm.

Point 26: In Algorithm 2, read vector r is cascaded with v and pd but both of those variables are undefined in the Algorithm.

Response 26: Thank you for pointing out the inadequacies of our manuscript! We have initialized the variables  and  in line 1 of the algorithm.

Point 27: In Line 299, r_t represent true values whereas same r_t is read vector in Equation 7.

Response 27: Thank you for pointing out this issue! We have changed  to  in Equation 18 in line 317.

Point 28: The algorithm is time-dependent, and the authors also assume in Line 113 that knowledge tracking changes all the time. However, for evaluation they use k-fold cross validation, which is used when data is not time-dependent. Does a high k-fold validation score show that knowledge tracking does not change over time? The authors are advised to change the validation methodology.

Response 28: Thanks to the reviewer for this suggestion. We believe k-fold cross validation is applicable in the field of knowledge tracking. It randomly assigns the students' interaction sequences to folds, but the order of answers inside each interaction sequence remains the same. Many knowledge tracking papers have also used k-fold cross validation, such as reference [39].
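The distinction drawn in Response 28 can be sketched as follows: folds are formed over whole student sequences, so the answers inside any one sequence keep their original order. The data layout, the function name, and the fixed seed are assumptions for illustration; the authors' actual splitting code is not part of this record.

```python
import random

def student_level_folds(sequences, k, seed=0):
    """Split students (not individual answers) into k folds."""
    student_ids = sorted(sequences)
    random.Random(seed).shuffle(student_ids)  # shuffle students only
    return [student_ids[i::k] for i in range(k)]

# Hypothetical data: student -> time-ordered list of (exercise_id, answer).
sequences = {
    "stu_1": [(12, 1), (7, 0), (3, 1)],
    "stu_2": [(5, 0), (5, 1)],
    "stu_3": [(9, 1), (2, 1), (2, 0), (4, 1)],
    "stu_4": [(7, 0)],
    "stu_5": [(3, 1), (12, 0)],
    "stu_6": [(4, 1), (9, 0), (5, 1)],
}

folds = student_level_folds(sequences, k=3)
splits = []
for test_ids in folds:
    # Each sequence is copied whole, so its internal order is untouched.
    train = {s: seq for s, seq in sequences.items() if s not in test_ids}
    test = {s: sequences[s] for s in test_ids}
    splits.append((train, test))
```

Note that this keeps the order within a sequence but does not address the reviewer's concern about dependence between training and test data over time.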

Thank you for your kind comments, and thank you so much for your professional and careful opinions!

Author Response File: Author Response.pdf

Reviewer 2 Report

The topic addressed in the paper sounds interesting as the use of online education systems can help in reaching students who cannot physically attend lessons. The paper is well structured and organised. There are only a few minor points that could improve the overall presentation:

1)     Could the proposed solution be involved in offline learning to help in personalising the learning path?

2)     Could the Authors specify what is meant by “interactions”?

3)     Datasets: are the involved datasets intended to have ecological validity?

4)     What is measured by the correct rate (table 1)?

5)     Figure 4: pay attention to the quality of the image, especially the legend.

Comments for author File: Comments.pdf

Author Response

Response to Reviewer 2 Comments

Point 1: Could the proposed solution be involved in offline learning to help in personalising the learning path? 

Response 1: Thank you for pointing out this issue! Our proposed model should be applicable to offline learning, and it can achieve good prediction performance in an offline environment as well. However, when the external environment changes, offline learning requires retraining the whole model. How to bring offline learning into the real world to help recommend personalized learning paths is one of our future research directions. We have added this research direction to the conclusion section.

Point 2: Could the Authors specify what is meant by “interactions”?

Response 2: Thanks for your comment! The interaction means the response sequence between the student and the online education system. The system receives the result of the student's answer, changes the student's knowledge state according to the result, and then recommends an appropriate exercise. The student gives feedback to the system after answering the exercise, and so on and so forth. We have made additions in line 48.

Point 3: Datasets: are the involved datasets intended to have ecological validity?

Response 3: Thanks again for your comment! The four datasets we used all come from educational platforms and are widely used in the knowledge tracking domain. The datasets have a very large data volume and a very diverse data distribution, which meets the requirements of the experiments well. The experimental results are representative and applicable, and can be extended to real-life scenarios. The datasets have good ecological validity. We have made additions in line 322.

Point 4: What is measured by the correct rate (table 1)?

Response 4: Thank you for pointing out the inadequacies of our manuscript! The correct rate measures the percentage of correct answers to the exercises contained in all interactions in the dataset. We have made additions in line 330.
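As a small worked example of the definition in Response 4, the correct rate is the number of correctly answered interactions divided by the total number of interactions in the dataset. The interaction format and the sample values below are hypothetical.

```python
def correct_rate(interactions):
    """Fraction of interactions answered correctly (answer: 1 = correct, 0 = incorrect)."""
    if not interactions:
        raise ValueError("dataset contains no interactions")
    return sum(answer for _, answer in interactions) / len(interactions)

# Hypothetical dataset of (exercise_id, answer) pairs pooled over all students.
dataset = [(12, 1), (7, 0), (3, 1), (9, 1)]
rate = correct_rate(dataset)  # 3 correct out of 4 interactions
```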

Point 5: Figure 4: pay attention to the quality of the image, especially the legend.

Response 5: Thank you for pointing out the inadequacies of our manuscript! We've replaced Figure 4 with a higher quality image.

Thank you for your kind comments, and thank you so much for your professional and careful opinions!

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The authors have updated the paper, but there are still issues with the manuscript.

1.       Authors were advised to use methodology other than K-fold cross validation method as their technique is dependent on time steps. However, authors have refused to update their methodology. Authors are advised to use better cross validation methodologies such as [1].

2.       Algorithm 1’s input sequence uses the variable t whereas the same variable is being initialized in line 2 of algorithm

3.       Page 6, Line 232, number of clusters is represented with "small k" whereas in Equation 5, it is represented with "Capital K", use only one notation for one variable. Also check the consistency of all the variables. Due to lots of variables, paper is still hard to read. 

[1] Racine, J., 2000. Consistent cross-validatory model-selection for dependent data: hv-block cross-validation. Journal of Econometrics, 99(1), pp. 39-61.

Author Response

Response to Reviewer 1 Comments

Point 1: Authors were advised to use methodology other than K-fold cross validation method as their technique is dependent on time steps. However, authors have refused to update their methodology. Authors are advised to use better cross validation methodologies such as [1].

Response 1: First of all, we sincerely apologize. Due to a misunderstanding on our part, we did not change the validation method in the last revision. Thank you for your suggestion and for the method. The time series in this paper is stationary, so the hv-block cross-validation method is well suited to this case. The hv-block cross-validation method is consistent for general stationary observations: as the total number of observations approaches infinity, the probability of selecting the model with the best predictive ability converges to 1. The parameter h controls the dependence between the test and training sets and is set to ensure that these sets are nearly independent. The parameter v controls the relationship between the training set, the test set, and the sample size. We reran the experiments with the new validation method. The experimental results are shown in Table 2 and Table 3.
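For readers unfamiliar with the method, the index logic of hv-block cross-validation can be sketched as below for a single time-ordered series of n observations. This is a generic illustration of the scheme in reference [1], not the authors' implementation; how they applied it to student interaction sequences, and their values of h and v, are not stated in this record.

```python
def hv_block_splits(n, h, v):
    """Yield (train_idx, val_idx) pairs for hv-block cross-validation.

    For each centre i, the validation block is the 2v + 1 points around i.
    A further h points on each side of that block are dropped, and the
    remaining observations form the training set.
    """
    for i in range(v, n - v):
        val = list(range(i - v, i + v + 1))
        lo, hi = i - v - h, i + v + h  # bounds of the removed region
        train = [t for t in range(n) if t < lo or t > hi]
        yield train, val

splits = list(hv_block_splits(n=10, h=1, v=1))
first_train, first_val = splits[0]  # centre i = 1, at the start of the series
mid_train, mid_val = splits[4]      # centre i = 5, away from the edges
```

Away from the edges of the series each training set has n - 2v - 2h - 1 observations, which is 5 in this example.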

Table 2. Average AUC and ACC values of the three models on all datasets

Models         DKT             DKVMN           DKT-LCIRT
               AUC     ACC     AUC     ACC     AUC     ACC
ASSIST2009     0.823   0.768   0.825   0.771   0.852   0.785
ASSIST2015     0.725   0.735   0.730   0.736   0.764   0.749
Synthetic      0.804   0.752   0.799   0.754   0.825   0.775
Statics2011    0.794   0.751   0.797   0.754   0.819   0.773

Table 3. Average AUC and ACC values of the three models on all datasets

Models         DKT-LC          DKT-IRT         DKT-LCIRT
               AUC     ACC     AUC     ACC     AUC     ACC
ASSIST2009     0.850   0.784   0.826   0.773   0.852   0.785
ASSIST2015     0.765   0.750   0.732   0.737   0.764   0.749
Synthetic      0.827   0.776   0.798   0.754   0.825   0.775
Statics2011    0.816   0.771   0.795   0.753   0.819   0.773

 

Point 2: Algorithm 1’s input sequence uses the variable t whereas the same variable is being initialized in line 2 of algorithm

Response 2: Thanks for your comment! We have changed t to n in line 2. The input sequence  represents the interaction of the student with the system from moment 1 to moment t. Line 2 of the algorithm indicates that it keeps looping the algorithm as the moments change.

Point 3: Page 6, Line 232, number of clusters is represented with "small k" whereas in Equation 5, it is represented with "Capital K", use only one notation for one variable. Also check the consistency of all the variables. Due to lots of variables, paper is still hard to read. 

Response 3: Thank you for pointing out the inadequacies of our manuscript! We have changed the "Capital K" to "small k" in Equation 5. We checked all the variables to ensure consistency. The paper has lots of variables. To make the paper easier to read, we have added a variable description table in the supplementary materials section (Table S2: Meaning of the main variables).

Thank you for your kind comments, and thank you so much for your professional and careful opinions!

Author Response File: Author Response.pdf
