A Study in the Early Prediction of ICT Literacy Ratings Using Sustainability in Data Mining Techniques

Abstract: It would be very beneficial to determine in advance whether a student is likely to succeed or fail within a particular learning area, and it is hypothesized that this can be accomplished by examining student patterns based on the data generated before the learning process begins. Therefore, this article examines the sustainability of data-mining techniques used to predict learning outcomes. Data regarding students’ educational backgrounds and learning processes are analyzed by examining their learning patterns. When such achievement-level patterns are identified, teachers can provide the students with proactive feedback and guidance to help prevent failure. As a practical application, this study investigates students’ perceptions of computer and internet use and predicts their levels of information and communication technology literacy in advance via sustainability-in-data-mining techniques. The technique employed herein applies OneR, J48, bagging, random forest, multilayer perceptron, and sequential minimal optimization (SMO) algorithms. The highest early prediction result of approximately 69% accuracy was yielded for the SMO algorithm when using 47 attributes. Overall, via data-mining techniques, these results will aid the identification of students facing risks early on during the learning process, as well as the creation of customized learning and educational strategies for each of these students.


Introduction
In learning scenarios, it is important for teachers to be able to identify students potentially at risk of faring poorly within a learning area and provide educational intervention proactively. Educational institutions are becoming increasingly concerned with achieving such interventions early on in the learning process [1] because estimating the ratio of positive-to-negative learning outcomes (i.e., succeeding or failing to learn) is critical to strategic planning. Through analysis of the variables from a student's background, it is possible to identify whether or not the student will be likely to succeed prior to immersion in the learning experience. Subsequently, appropriate actions can be taken to facilitate successful outcomes [2]. The capacity to analyze and predict academic performance represents an important milestone in the educational domain, and it is an important factor in building a student's future [3,4]. Therefore, these predictive variables can be used to identify students' learning characteristics to create adaptable methods for providing high-quality education to improve learning outcomes [5,6].
Existing studies have shown that students leverage relevant personal variables and attributes for their academic progress during instruction. These studies focused on the possibility of predicting academic achievement by utilizing student background factors as determined by surveys conducted prior to a class. Such background factors are necessary to analyze students' perceptive capabilities. If automated methods could be employed for

Application of Data-Mining Techniques in Education
Data-mining techniques are used to discover new information hidden within large databases [1,6]. Owing to advances in computing technology, these techniques are increasingly being used to solve problems and make discoveries in various fields of science, medicine, finance, and business [9,10]. In particular, data mining is being used in the field of education to diagnose students' learning factors and provide them with a variety of educational services [11,12].
The education industry leverages data-mining techniques to predict academic performance in advance. The mined data relate to elements of the entire learning course (e.g., midterm, quiz, and activity content). Online and offline introductory programming courses applied similar metrics using neural-network (NN), decision-tree (DT), support vector machine (SVM), and naive Bayes classifier (NBC) methods. The results showed that the SVM algorithms were the most efficient after 50% completion of a course. Failure rates were predicted with 92% efficiency in online classes and 83% efficiency in offline classes [13]. For early prediction among majors in information-technology (IT)-related areas, seven algorithms (including DT, rule induction, artificial NN, k-nearest neighbors (KNN), NBC, and random forest) were used. In that study, year-2007 student data were used for training, and the prediction rate was evaluated using similar data from 2008. The results showed that NBC had a prediction rate of 83.7% [14]. University informatics courses used REPTree, J48, and M5P data-mining techniques to predict student performance. The attributes used to create the models included exam conditions, exam points, activities points, and more. The predictive model showed an average of 65% positive results and could reasonably predict a student's academic achievement [1].
However, for the early prediction of overall academic performance, graduation credits, or final-grade ratings, directly relevant attributes (e.g., exam and quiz scores) are commonly used. These related attributes are highly correlated with collected and predicted data, and assessments can be used to early-predict a student's achievement, but only after the learning process begins.
Sustainability 2021, 13, 2141
There are two ways to assess success or failure likelihood after the learning process begins. First, the research must ensure early prediction of the overall performance or required credits. Second, the student can express an early prediction rate based on the responses to a personal questionnaire provided prior to the learning process. For high-school students, a DT algorithm was used to predict student achievement, which was divided into five rating categories: "Unsatisfactory" (6%), "Basic" (40%), "Moderate" (38%), "Good" (14%), and "Excellent" (3%). The data used included measures of self-esteem, self-concept, habits, motivation, cognitive skills, study strategies, and emotional variables representing personal factors related to academic performance. The prediction accuracy in that study was the highest in the "Basic" category with 40% of the student distribution. The remaining categories were in the range of 34-83% [15]. For college students, three algorithms (i.e., DT, NN, and SVM) were used to predict academic performance. The data included measures of online time, frequency of internet connection, amount of internet traffic, and usage behaviors online, which are linked to academic performance. The results showed that the SVM algorithm was the most accurate when predicting passing and failing grades (69-73%), followed by NN (68-71%) and DT (60-62%) [16]. The data used for college students included measures of age, gender, personality, motivation, and learning strategy, and data mining was used to predict the learning outcomes.
The results of that study indicated that SVM (73.3%) was the highest among the six algorithms, followed by KNN (69.4%), NN (69.0%), NBC (69.0%), DT (65.9%), and logistic regression (60.0%). Finally, for college students, the results were more accurate for freshmen than for seniors [17]. In the current paper, early prediction based on data from the ongoing learning process is ruled out, and the attributes directly relevant to predicting final grades are excluded. Instead, the perceptions of IT-related students, which constitute non-grade data, are used to predict final performance with six sustainability-in-data-mining algorithms.

ICT Literacy
ICT literacy has been emphasized as an ability to be acquired by all to keep pace with IT development. Such literacy includes the ability to use digital technologies to solve problems, analyze and generate information based on data, and communicate with others [18,19]. The interactions generated by learning produce big data, which are managed and analyzed via data mining to facilitate teacher decision-making [20]. Since 2007, ICT literacy tests have been employed, and IT-related perceptions have been surveyed among elementary- and middle-school students in Korea [21,22].
The ICT literacy-test questionnaire comprises 36 questions concerning the internet, computer literacy, and IT curricula for daily life. The test results are divided into four levels (i.e., excellent, average, basic, and poor) according to student achievement. The criteria for each level are determined via expert consultation and consideration of the student's ability. The surveys of IT-related students measure the perceptions of their ability to use computers, smart devices, internet tools, and software. Details are shown in Table 1.

Research Method
The proposed method predicted ICT literacy levels using sustainability-in-data-mining techniques based on students' IT-related perceptions. The ICT literacy rating prediction used six data-mining algorithms. A dataset from 2011 was used as the training set, and attribute sets of 47, 24, and 17 attributes were selected for elementary-school students according to the information gain ranking and an empirical method. Similarly, sets of 47, 22, and 14 attributes were selected for middle-school students.
Six algorithms referenced in the preceding studies were selected: a rule-based learner (OneR), a decision tree (J48), two ensemble learners (bagging and random forest), a neural network (MLP), and a support vector machine (SVM) trained with sequential minimal optimization (SMO) [23]. This study used 10-fold cross-validation to evaluate model performance [24].
The proposed model applied a dataset from 2012 as the test set, and the model's predictive accuracy was evaluated by measuring accuracy, precision, recall, and F1 score (i.e., F-measure). The flowchart of the ICT literacy evaluation prediction is shown in Figure 1.

Research Subject
The subjects of the study were selected by surveying students corresponding to 1% of the number of elementary- and middle-school students in Korea using stratified random sampling. For the 2011 dataset, 12,373 elementary-school students and 15,556 middle-school students were selected. Similarly, 12,905 elementary- and 18,072 middle-school students were selected for 2012. In total, 25,296 elementary- and 33,628 middle-school students participated in this study over the two years.
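The sampling step described above can be illustrated with a minimal sketch. It assumes that each student record carries a stratum label and that the same fraction is drawn from every stratum; the strata used in the actual survey are not stated here, so the `region` field, the function name, and the toy population below are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(records, stratum_of, fraction, seed=0):
    """Draw roughly `fraction` of the records from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)

    sample = []
    for key in sorted(strata):
        members = strata[key]
        # Keep at least one record so that small strata stay represented.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population of (student_id, region) pairs; the regions are made up.
population = [(i, "urban" if i % 4 else "rural") for i in range(10_000)]
sample = stratified_sample(population, stratum_of=lambda r: r[1], fraction=0.01)
```

Sampling within each stratum keeps the proportions of the strata in the sample close to those in the population, which a simple random draw of the same size does not guarantee.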

Preprocessing Data
ICT literacy results and IT questionnaire data of the elementary- and middle-school students were collected for the years 2011 and 2012. The data corresponding to 2011 were used as the training set, and those corresponding to 2012 were used as the test set. Data cleaning involved identifying missing values and excluding outliers that would have resulted in unstable or distorted data.
In this study, attribute selection was performed to improve the efficiency of data prediction [5]. The data attributes used in the analysis of data prediction were extracted from the 2011 training dataset using information gain and average merit. Information gain can determine the importance of a given attribute when deciding which attributes in the training dataset are most useful for distinguishing the classes to be learned, including the order of attributes. Using the heuristic method, attributes related to the research issues were included, and unnecessary items were deleted (e.g., user ID, student name, and registration number) to finally select the appropriate attributes. The details regarding this procedure are shown in Table 2.
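As a rough illustration of information gain ranking, the sketch below scores each categorical attribute by how much it reduces the entropy of the class labels and sorts the attributes by that score. It is not the tool used in the study; the attribute names, the class labels in the toy data, and the function names are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy of the labels minus the entropy remaining after splitting on one attribute."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_attributes(rows, labels):
    """rows: list of dicts (attribute -> value). Returns (attribute, gain) pairs, best first."""
    gains = {
        name: information_gain([row[name] for row in rows], labels)
        for name in rows[0]
    }
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical questionnaire answers and literacy levels.
rows = [
    {"uses_email": "yes", "owns_tablet": "yes"},
    {"uses_email": "yes", "owns_tablet": "no"},
    {"uses_email": "no", "owns_tablet": "yes"},
    {"uses_email": "no", "owns_tablet": "no"},
]
labels = ["average", "average", "basic", "basic"]
ranking = rank_attributes(rows, labels)
```

In this toy data, `uses_email` separates the two classes perfectly and therefore ranks first, whereas `owns_tablet` carries no information about the class. Keeping only the top-ranked attributes yields reduced attribute sets of the kind described above.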

Parameter Setting and Final Model Confirmation
The proposed ICT literacy rating prediction model was tuned to improve the results obtained with the six data-mining algorithms. Using 10-fold cross-validation, the data attributes and the basic option parameters were varied and adjusted to obtain the highest prediction rate.
The proposed model used data mining to compare actual and predicted data results, and models with higher accuracy were considered better.
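The following sketch shows one common way to carry out k-fold cross-validation: the training data are split into k folds, each fold serves once as the held-out part, and the accuracies are averaged. The majority-class learner is only a placeholder for the six algorithms, and all function names are hypothetical.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, folds[i]


def cross_validated_accuracy(fit, predict, X, y, k=10, seed=0):
    """Average held-out accuracy of a learner given as fit/predict callables."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k, seed):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)


# Placeholder learner: always predicts the most frequent training class.
def fit_majority(X, y):
    return Counter(y).most_common(1)[0][0]


def predict_majority(model, x):
    return model
```

Under this scheme, candidate attribute sets or parameter settings can be compared by their cross-validated accuracy on the training year before the chosen model is applied to the separate test year.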

Data-Mining Techniques Used
Regarding the data-mining techniques, six algorithms were selected by comparing and analyzing their performance accuracies and capabilities based on previous studies. The OneR algorithm is a simple classification rule learner that tests one attribute of a dataset at a time. It is a simple yet accurate classification algorithm that creates one rule for each predictor and selects the rule having the smallest number of errors [25]. The J48 algorithm determines classification criteria based on normalized entropy difference and uses the concept of information entropy to create a DT from the training data [26]. Bagging is an ensemble meta-algorithm used for statistical classification and regression, and it is designed to improve stability and accuracy. It can reduce the variance of unstable procedures, such as regression trees, while greatly improving predictive accuracy [27]. Random forest is an ensemble learning method for classification and regression that constructs multiple decision trees during training. The benefit of random forest is that, rather than always selecting the single optimal solution, it selects randomly from the k best options, thereby improving the decision trees [28]. The MLP is a kind of feed-forward artificial NN comprising at least three layers of nodes in which each node, except the input nodes, is a neuron that uses a nonlinear activation function [6]. The SMO algorithm, which is used to train SVMs, is sensitive to fine-tuning, but manual fine-tuning is not desirable because it does not guarantee the efficiency of results [13].
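To make the OneR description concrete, the sketch below builds one rule per attribute (each attribute value maps to its most frequent class) and keeps the attribute whose rule makes the fewest training errors. It assumes categorical attributes only and omits the discretization of numeric attributes; the attribute names and data are hypothetical.

```python
from collections import Counter, defaultdict


def train_one_r(rows, labels):
    """Return (attribute, rule, default_class) for the single best attribute."""
    default = Counter(labels).most_common(1)[0][0]
    best = None
    for attr in rows[0]:
        # Count the class labels observed for every value of this attribute.
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[attr]][label] += 1
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Errors are the instances that do not belong to the majority class of their value.
        errors = sum(sum(c.values()) - max(c.values()) for c in counts.values())
        if best is None or errors < best[0]:
            best = (errors, attr, rule)
    _, attr, rule = best
    return attr, rule, default


def predict_one_r(model, row):
    attr, rule, default = model
    # Attribute values never seen in training fall back to the overall majority class.
    return rule.get(row[attr], default)


# Hypothetical perception answers and literacy levels.
rows = [
    {"internet_use": "daily", "owns_pc": "yes"},
    {"internet_use": "daily", "owns_pc": "no"},
    {"internet_use": "rarely", "owns_pc": "yes"},
    {"internet_use": "rarely", "owns_pc": "no"},
    {"internet_use": "daily", "owns_pc": "yes"},
]
labels = ["excellent", "excellent", "basic", "basic", "excellent"]
model = train_one_r(rows, labels)
```

Here the rule on `internet_use` makes no training errors while the rule on `owns_pc` makes two, so `internet_use` is selected. A single-attribute rule like this is a useful baseline against which the more complex algorithms can be compared.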

Evaluation Criteria
In this study, accuracy, precision, recall (sensitivity), and F1 score were used as criteria for evaluating the six data-mining algorithms [29,30]. Accuracy is the percentage of all instances for which the value predicted by the algorithm matches the actual value (1). Precision is the ratio of correct predictions of the positive class (true positives, TP) to all predictions of the positive class made by the proposed model (TP + false positives, FP); in other words, it is the proportion of the answers predicted to be correct by the algorithm that are actually correct (2). Recall (sensitivity) is the ratio of accurately predicted correct answers (TP) to all actual correct answers (TP + false negatives, FN) (3). Because precision and recall can each be biased if there are many positives or negatives in the data, the F1 score, which is the harmonic mean of precision and recall, is used for the performance evaluation of the model (4). With TN denoting true negatives, the four criteria are as follows:
Accuracy = (TP + TN) / (TP + TN + FP + FN) (1)
Precision = TP / (TP + FP) (2)
Recall = TP / (TP + FN) (3)
F1 = 2 × Precision × Recall / (Precision + Recall) (4)
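The four criteria described above can be computed from actual and predicted labels as in the sketch below. It treats one class as positive and all other classes as negative; how the study aggregated the scores over the four literacy levels is not stated here, so this one-vs-rest view is an assumption, and the toy labels are hypothetical.

```python
def confusion_counts(actual, predicted, positive):
    """Count TP, FP, FN, and TN, treating `positive` as the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = len(pairs) - tp - fp - fn
    return tp, fp, fn, tn


def evaluate(actual, predicted, positive):
    tp, fp, fn, tn = confusion_counts(actual, predicted, positive)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Guard against division by zero when a class is never predicted or never occurs.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical actual and predicted literacy levels for eight students.
actual = ["excellent", "excellent", "excellent", "basic",
          "average", "poor", "basic", "excellent"]
predicted = ["excellent", "excellent", "average", "basic",
             "average", "excellent", "basic", "excellent"]
scores = evaluate(actual, predicted, positive="excellent")
```

In this toy example there are three true positives, one false positive, one false negative, and three true negatives, so all four criteria equal 0.75.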

Research Results
The proposed method predicted student ICT literacy levels using sustainability-in-data-mining techniques based on the perceptions of those IT-related learners. Information gain can be used to determine attribute importance and to distinguish classes. Therefore, the attributes for the elementary-school dataset were grouped into sets of 47, 24, and 17 based on the average merit value of the information gain ranking. The attributes for the middle-school dataset were grouped into sets of 47, 22, and 14.
The early prediction results for elementary- and middle-school ICT literacy depended on the characteristics of the algorithm used and changed with the number of attributes selected.

ICT Literacy-Level Prediction Results for Elementary Students
The results of the ICT literacy-level predictions for the 2012 elementary-school dataset showed that the accuracy varied with the number of selected attributes. The lowest accuracy was 62.8%, and the highest was 67.3%. The highest early prediction result of all six algorithms was provided by SMO (67.3%), which used 47 attributes. The lowest prediction result was provided by OneR (62.8%), which used 17 attributes. The details regarding these results are shown in Figure 2. The F1 score is the harmonic mean of precision and recall and indicates how well the predictions match the actual values. In this study, the SMO algorithm scored the highest (0.499) when 47 attributes were used. The lowest F1 score was returned by OneR (0.388) using 17 attributes. The details regarding these results are shown in Table 4.

Prediction Results for Middle-School Students
The results of the ICT literacy grade predictions using the 2012 IT-related middle-school dataset showed varying accuracies according to the number of selected attributes. For this dataset, the accuracy ranged from 63.9% to 68.7%. As noted, the highest prediction score was provided by SMO, which used 47 attributes (68.7%). This was also the highest score achieved when comparing those of the other algorithms. The lowest prediction score was provided by MLP, which used 14 attributes (63.9%). The details regarding these results are shown in Figure 3. The F1 score is an indicator of how well a prediction matches reality, and it uses a harmonic mean of precision and recall. As a result, when 47 attributes were used, SMO (0.541) exhibited the highest score. The lowest prediction score was provided by OneR (0.504) using 14 attributes. The details regarding these results are shown in Table 5.

Conclusions
This paper presented a model for the early prediction of academic performance based on learning perception using sustainability-in-data-mining techniques. Specifically, ICT literacy levels were predicted using six algorithms based on the students' perception of IT and ICT factors.
The highest SMO algorithm prediction results were 67.3% when using 47 attributes, and the lowest SMO algorithm prediction results were 65.0% when using 24 attributes for the 2012 elementary-school dataset. Therefore, the difference between the two cases was 2.3%. For the 2012 middle-school dataset, the highest and lowest prediction results for the SMO algorithms differed by approximately 4.5%, with accuracy scores of 68.7% for 47 attributes and 64.2% for 14 attributes.
The differences between early prediction results for the elementary- and middle-school datasets using the six-algorithm data-mining technique were 2.3% and 4.5%, respectively. By arranging the attributes affecting these results, similar scores can be achieved without significant changes in early prediction accuracy, even when a small number of features is selected. In particular, the accuracy results of the elementary- and middle-school students were more favorable when 24 and 17 attributes, respectively, were used than when all 47 were used. This was true for the MLP, bagging, and J48 algorithms. Therefore, the accuracy is dependent on both the characteristics of the algorithm and the number of attributes.
These results fully answer the three research questions presented in the introduction. The results of these data-mining techniques can, therefore, be used to inform teachers, institutions, and students in advance of potential learning successes or failures. Moreover, this innovation has the potential to avoid or mitigate negative learning outcomes while providing students with important insights into improved educational approaches. In summary, it is possible to sufficiently predict early academic performance using sustainability-in-data-mining techniques based on student perceptions of IT competency and ICT literacy. Moreover, in predicting early achievement from perceptions, the SMO and random forest (RF) algorithms were shown to be most effective. Finally, it was determined that the early prediction accuracy remained close to the highest observed ratio without significant changes when the number of attributes was reduced.
We examined the top five attributes in the information gain ranking among the analyzed attributes. For elementary-school students, these were the abilities to attach data, download searched documents, communicate on SNS, use music and videos, and search for articles using the internet. For middle-school students, they were computer virus prevention, the ability to use the internet, the ability to use the operating system, and the ability to resolve errors. Middle-school students thus used computers in more specialized ways than elementary-school students. We also examined the attributes at the bottom of the information gain ranking. In general, these were attributes not related to ICT, such as whether the student keeps a smart device or when the student first used a computer. An analysis of the attributes indicated by information gain suggests that they are related to the direction of learners' learning and educational strategies.
The significance of this study is its development of a new model for the early prediction of academic performance. This can help identify students facing risks early in the learning process via the application of data-mining techniques and the creation of customized learning and educational strategies for each student. Future research will require improvements to these study results via the extension and integration of the analysis of more diverse data to improve prediction accuracy.
This study has certain limitations. In particular, although the use of sustainability-in-data-mining techniques to predict achievement using student perception is interesting, it poses some risks. First, more data are required because, in this study, ICT literacy was only analyzed for 1% of Korea's student population via stratified random sampling. This is insufficient to represent larger populations. Second, only six representative algorithms (i.e., SMO, RF, MLP, bagging, J48, and OneR) were selected and studied. The addition of deep-learning algorithms, wherein sustainability-in-data-mining techniques are rapidly evolving, represents an important consideration for future work.