Toward Predicting Student’s Academic Performance Using Artificial Neural Networks (ANNs)

Student performance is related to complex and correlated factors. The adoption of new technologies in education has unlimited potential. One of these advances is the use of analytics and data mining to predict student academic accomplishment and performance. Given the existing literature, machine learning (ML) approaches such as Artificial Neural Networks (ANNs) can continuously be improved. This work examines and surveys the current literature regarding the ANN methods used in predicting students’ academic performance. This study also attempts to capture a pattern of the most used ANN techniques and algorithms. Of note, the articles reviewed mainly focused on higher education. The results indicated that ANN is always used in combination with data analysis and data mining methodologies, allowing studies to assess the effectiveness of their findings in evaluating academic achievement. No pattern was detected regarding the selection of input variables, as they are mainly based on the context of the study and the availability of data. Moreover, very few tangible findings addressed the use of these techniques in the actual context and with the target objective of improving student outcomes, performance, and achievement. An important recommendation of this work is to close the identified gap: ANNs have so far been applied mostly in theory, with limited use in real-life situations, to help achieve educational goals.


Introduction
Educational institutions at different levels are established to provide high-quality education capable of changing people's levels of awareness, knowledge, and mental capacity. Teachers and educators are always looking to enhance student achievement and monitor their performance to determine the efficiency of the teaching process. New advances in technology enable educators to use analytics and data mining methodologies to search large datasets for patterns that reflect their students' behavior and learning [1]. Although student performance is critical to the learning process, it is a complex phenomenon influenced by various factors such as the teaching environment and individual study habits. Various studies [2,3] have used a variety of indicators/variables to develop models that can predict students' academic performance at different levels of education, including high school and university education in various disciplines, in particular engineering and medicine.
The factors that contribute to a student's performance are correlated and complex. This limits the potential of prediction models that rely on simple, explicit assumptions about the data, such as regression models. The difficulty lies in selecting a function capable of capturing all forms of association in the data and of automatically adjusting the output when additional information becomes available. A more common approach to this type of problem is to use an Artificial Neural Network, which simulates the way the human brain solves a problem. As a result, adaptive systems such as Artificial Neural Networks are being developed to predict a student's performance based on the effects of these factors. Artificial Neural Networks (ANNs) are valuable tools for data analysis used to identify and represent functional correlations between variables [4]. ANNs are currently being used to address forecasting and classification challenges in diverse fields. For example, in the academic domain, ANNs are applied to analyze academic performance [5]. Indeed, a variety of methodological approaches have been used to predict students' academic performance. For example, traditional statistical methods, such as discriminant analysis and multiple linear regression, are the first and most common approach found in the educational literature, as in the Malaysian study performed by [6].
Various studies [7,8] have used structural equation modelling (SEM) to compare theoretical models to data sets or to test different models of academic performance. In comparison to artificial intelligence computing methods, traditional approaches failed to consistently show the capacity to reach accurate predictions or classifications [9]. As a result, a third approach to predicting student achievement and academic performance indicated in recent literature uses machine learning techniques, such as Artificial Neural Network (ANN) methods and other techniques including Decision Trees [10], Support Vector Machines [2], Bayesian algorithms [11], and Ensemble Learning [12]. This approach has been successfully applied in various fields, including business, engineering, meteorology, and economics, without significant differences in the obtained results or the level of accuracy.
ANN is one of the machine learning and data mining algorithms used across the research literature and is claimed to provide superior and accurate results in predicting student performance. ANN is widely regarded as an effective pattern recognizer and an important method for classifying potential outcomes. Given the extant literature, there is always room for improvement in machine learning approaches such as ANN (the main focus of this paper) and the prediction model.
The review of literature is an important step that enables the researcher to identify the knowledge gap, understand recent technological trends, and examine methods that yield varying degrees of accuracy. It thereby helps identify the factors (input variables) and methodologies that yield the highest accuracy in predicting student academic achievement and performance.
A literature assessment also positions the researcher to contribute to knowledge on the use of ANNs in predicting student achievement and performance. Hence, this paper aims to examine the current literature using a systematic review approach. This review also attempts to identify a pattern in the most used ANN techniques and in the nature of the data commonly used to predict student academic performance, in terms of the level of education, sample sizes, and the attributes or inputs used in various models. Additionally, this review tries to establish any link to the factors that contribute to the model's ability to predict student performance accurately.
The remainder of this paper is organized as follows: Section 2 presents the related studies and significance of this review. Section 3 introduces the research methodology. Section 4 presents the results, whereas Section 5 discusses the findings and Section 6 discusses the future work. Finally, the conclusion is presented in Section 7.

Related Works
A preliminary search for surveys and systematic literature reviews (SLRs) specifically concerning the prediction of students' academic performance using Artificial Neural Networks (ANNs) revealed that no secondary studies have been carried out on the specific objective of this review; only one related SLR was conducted, by the authors of [1]. In ref. [1], the authors analyzed and classified 62 papers on predicting student performance using data mining and analytical techniques. However, the use of ANNs in predicting student performance was not given a pivotal focus and consideration. Accordingly, only a handful of available studies on using ANN for predicting students' academic performance were included in their review.
Appl. Sci. 2022, 12, 1289
To support the conclusion that no systematic literature review had previously been conducted on the topic of this review, a preliminary search was conducted on the same digital databases used in this review (e.g., Google Scholar, IEEEXplore, ACM Library, Web of Science and ScienceDirect). The results of the preliminary search, which list and summarize the systematic literature reviews most closely related to ours, are presented in Table 1. The analysis of data extracted from these systematic reviews shows that they are beyond the scope and objectives of this systematic review.

Table 1. Existing systematic literature reviews.

| Reference | Main Objective | Source | Year of Publication |
|---|---|---|---|
| [13] | To analyze existing studies on the intelligent approaches and techniques used to predict student learning outcomes. | Applied Sciences (Journal) | 2021 |
| [14] | To explore the current machine learning methods and attributes used in predicting the student's performance. | IEEE Xplore (Conference) | 2021 |
| [15] | To synthesize research literature on educational data mining (EDM) and learning analytics. | Information Discovery and Delivery (Journal) | 2020 |
| [16] | To assess the current state of student academic performance prediction research. | ACM Conference on Innovation and Technology in Computer Science Education | 2018 |
| [17] | To determine the differences between various data mining prediction techniques used in education. | International Journal of Education and Management Engineering | 2017 |

Method

Search Strategy
For a comprehensive review of the current literature on the research topic and to address our research goals, we devised a clear plan with specific steps. These steps include defining and prioritising the problems and queries, which involves establishing the research aims and objectives, developing a structure and strategy, and setting the inclusion and exclusion criteria. Further steps include scanning the literature using relevant databases, manual searching, selecting articles based on the inclusion and exclusion criteria, critical assessment, and data extraction from the selected studies. The final step involves gathering data and reporting the findings with illustrations that visualize the results. Google Scholar was used as the main search database. The rationale is that Google Scholar has a major advantage over other search engines in identifying articles in a wide variety of scientific journals. It also matches the selected keywords precisely. Additionally, for a more comprehensive literature search, where possible, other databases searched included IEEEXplore, ACM Library, Web of Science and ScienceDirect. The literature search in the databases was conducted using the following keywords: "predicting student outcomes" and "artificial neural networks." Using the search queries, we sought to retrieve all relevant academic literature published in the context of ANN. Furthermore, the search was restricted to the period 2013 to 2021.

Inclusion and Exclusion Criteria
The study's inclusion and exclusion criteria were developed based on [18]; we included studies that were (i) published in English, (ii) from either conference proceedings or peer-reviewed journals, (iii) predicting students' success at any level, and (iv) directly related to ANN with extensive details on the methodology employed. Studies that were not written in English, did not include reliable data on the characteristics used, did not describe the algorithms in sufficient detail, or were published before 2013 were excluded.
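The criteria above can be read as a simple predicate over candidate records. The following is a minimal sketch under stated assumptions: the `Record` fields and the sample records are hypothetical, since the review does not publish its screening schema, and criterion (iii) is omitted because it admits every educational level.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # All field names are hypothetical, chosen only for this illustration.
    title: str
    language: str
    year: int
    venue_type: str         # "journal" or "conference"
    uses_ann: bool          # directly related to ANN
    describes_method: bool  # gives enough detail on data and algorithms


def is_included(r: Record) -> bool:
    """Apply the stated inclusion and exclusion criteria to one record."""
    return (
        r.language == "English"
        and r.venue_type in {"journal", "conference"}
        and r.uses_ann
        and r.describes_method
        and 2013 <= r.year <= 2021  # search period stated in the review
    )


# Made-up records, not studies from the review.
records = [
    Record("ANN for GPA prediction", "English", 2019, "journal", True, True),
    Record("Decision trees in EDM", "English", 2018, "journal", False, True),
    Record("Early ANN study", "English", 2010, "conference", True, True),
]
selected = [r.title for r in records if is_included(r)]
print(selected)
```

Encoding the criteria as one function makes the screening decision reproducible for every record, which is the property a systematic review needs.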

Results
The literature search was conducted in August 2021 and resulted in 853 studies in the first phase. Records were selected if they were considered relevant based on our inclusion criteria. This process was used to filter the articles' titles and abstracts. On further examination, 416 studies were duplicates and 369 were rejected. The full texts of the remaining 68 articles were retrieved for further examination. After thoroughly examining the full texts and applying the inclusion and exclusion criteria (Figure 1), a total of 21 articles met the criteria and were considered for the full assessment. The selected 21 articles were categorized based on the title, author, publication year, study location, objectives, the years in which the data were collected, educational level, and sample size. Other essential criteria include the methodology, type of ANN adopted in the studies, the input (independent) and output (dependent) variables, ANN's performance evaluation criteria, the most significant predictor, the model's accuracy level, and study findings. The literature scanning process is shown in the flow chart in Figure 1 [19].
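The screening counts reported above can be checked with a few lines of arithmetic. This sketch uses only the figures stated in the text. The number of articles excluded at the full-text stage is derived here and is not reported in the source.

```python
# Counts taken from the text.
identified = 853
duplicates = 416
rejected_on_title_abstract = 369
included = 21

# Articles whose full text was retrieved.
full_text_assessed = identified - duplicates - rejected_on_title_abstract

# Derived value (not stated in the source): articles dropped after full-text reading.
excluded_at_full_text = full_text_assessed - included

print(full_text_assessed, excluded_at_full_text)  # 68 47
```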

Studies Characteristics
Remarkably, the research interest in using ANN to predict student performance has been growing over the last few years, with a substantial increase in the number of studies between 2018 and 2019, as Figure 2 depicts. However, the decline in publishing output (2019-2021) cannot be considered a trend because the database search was completed before the end of 2021. The low number of publications during this timeframe (2013-2021) can be attributed to the research topic's low maturity due to its novelty. As illustrated in Figure 2a, the review period was from 2013 to 2021, with the literature sourced from peer-reviewed journals and conference proceedings through a database search. In detail, 85.7% of the literature were journal papers, and 14.3% were conference proceedings (Figure 2b).
More than half of the studies (57.1%, n = 12/21) were undertaken in Asia (see Figure 3), 14.3% (n = 3/21) in Europe, 14.3% (n = 3/21) in South America, 4.8% (n = 1/21) in North America, and 4.8% (n = 1/21) in Africa. Furthermore, one study was undertaken by a Turkey-based researcher, but it was unclear where the analysis was conducted. A total of 295,354 participants were involved in the selected studies. The sample sizes ranged from 150 to 162,030 participants, with a mean of 156.015, indicating that ANNs were used on large datasets.
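As a quick consistency check, the regional percentages can be recomputed from the study counts given in the text. The "Unclear" label for the study whose analysis location could not be determined is an assumption made for this sketch.

```python
# Study counts per region as stated in the text.
counts = {
    "Asia": 12,
    "Europe": 3,
    "South America": 3,
    "North America": 1,
    "Africa": 1,
    "Unclear": 1,  # hypothetical label for the study with no clear location
}

total = sum(counts.values())
shares = {region: round(100 * n / total, 1) for region, n in counts.items()}

for region, share in shares.items():
    print(f"{region}: {share}%")
```

The counts sum to the 21 reviewed studies, and the rounded shares match the percentages reported above.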

Studies Synopses
To obtain an overview of the studies, the authors extracted critical data in Table 2. The tabulated information covers each study's objectives, level of education, and information about the data, such as the years in which the data were collected and the sample size. The results in this table show that most of the studies focus on students' academic performance as the outcome variable; most studies also evaluate bachelor's degree students because a large amount of data is available at this level. Notably, the sample sizes are small in all but a few studies.

Predictors and Outcome Variables
Specific patterns were detected in the collected data; the results in Table 3 describe the independent and dependent variables used in the selected studies. The predictors differ from one study to another. In addition to the results of the subjects, some researchers include the students' activities such as total length of internet time, book-borrowing numbers, and traffic to school. The socioeconomic variables were seen in some studies, and some involve family details.

Table 3. Predictors and outcomes used in the studies.

| Ref. | Predictors | Outcome |
|---|---|---|
| [20] | Number of messages on the LMS both viewed and posted, content creation, files viewed, quiz efforts | Course outcome |
| [21] | GPA for the first three years of study | Fifth-year and final Cumulative Grade Point Average (CGPA) |
| [22] | GP scored in some subjects | CGPA |
| [23] | The first year of university score, high school score, subject result score of Math I and II, Electronics I, Electrical Circuit I, the number of credits that the student passed during the first year of college, demographic variables, type of high school (private or public), location of the school (inside or outside of Palestine), and the student gender | CGPA of the first year in engineering university |
| [24] | First three years, residency training records, gender | Success on pre-board exams |
| [25] | Gender, training, forum, chat, discussion, upload assistance, the message, quiz training, and total login | E-learning success |
| [28] | 21 attributes, passed the course and the student's affiliation | Predict the mean grades and credits of the students |
| [29] | 39 attributes, students' gender, region, educational level, age range, neighborhood crime rate (IMD), number of times they have previously participated in the course, enrolled credits, disability, and the final exam result (passed/failed). In addition, the number of times the student has interacted with any of the online course contents was counted throughout the courses | Student success |
| [30] | Gender, content score, time spent, number of entries to content, homework score, attendance, archived courses | Student performance |
| [31] | University entrance examination score, the average overall score of high school graduation, examination, the elapsed time between graduating from high school and obtaining university admission, location of student's high school, type of high school attended, gender | Students' academic performance |
| [32] | 12 input variables, classified into academic, parent, person, managerial and social | Student pass or fail |
| [33] | Exam results and other factors, such as the location of the student's high school and the student's gender | Student performance |
| [34] | Socioeconomic variables, school type variables, student's previous achievement variables, tutor's expertise variables | Student performance |
| [35] | Students' internet accessing details including the total length of internet time, active periods, traffic, college entrance examination scores represent the students' initial knowledge level and learning ability, book-borrowing numbers, and birth dates, first midterm examination scores | Student grade in subjects |
| [36] | 116 features for the production phase (product data) and 84 for the learning phase | Team performance |
| [12] | 23 factors including academic demographic, social, and behavioral factors with prior semester performance | Student at risk |
| [37] | 11 variables include socio-economic background, university entrance examination results, and CGPA | CGPA |
| [38] | 123 variables, including prior academic achievement, tuition fees, students' socioeconomic status, students' home characteristics, students' household status, students' background information, high school characteristics, working status, university background, and academic performance in higher education | Performance level (low or high) |
| [39] | Application score, vulnerability index, gender, population segment, application priority, application instance, school type, regime, province, ethnicity, disability | Level pass |

Methods and Performance
The prediction accuracy of students' performance varies due to many factors such as data size, evaluated variables, and the type of methods used. The methods used in the assessed studies are introduced in Table 4, including the type of ANN or algorithm used, model accuracy, and remarkable findings.

Research Domains
In recent years, numerous studies have concentrated on the field of student success evaluation in EDM. A significant number of articles, primarily on higher education, have been reported [8]. Early identification is unquestionably significant since it can be used to support educational organizations in implementing actions and policies. A wide range of data mining methods and explanatory features were used to try to forecast academic results, including demographic and academic behavioral patterns and online activity, among other things [40]. This article's key aim is to analyze findings published between 2013 and 2021.
There is a need to understand student success, the factors that contribute to it, and overall performance at all levels. All the studies predict students' success or failure, predict student performance, or investigate the factors that improve student success [20,23,26,30,32,34]. Some studies compared ANN with other techniques that pursue the same goal of predicting student performance, including Linear Regression (LR) [22]. In other studies, the comparison is made between two ANN methodologies such as Cuckoo search and gravitational search algorithms [31,33].
Regarding the level of education, all the reviewed studies evaluated student performance and applied the ANN at the university level or higher. In two studies, secondary school results were used in conjunction with university student results [22,26]; the same methodology was applied for master's students [27,28]. Although some studies investigated student performance in the medical discipline [19], most of the studies focused on engineering students including those in the architectural, mechanical, electrical, civil, information technologies, and software disciplines [34]. Other studies used multiple disciplines, including business and marketing, social communication, law, medicine, engineering, and psychology [27]. Some studies focused on E-learning or Virtual Learning Environments (VLEs) [20,29,30]. Various sample sizes were used based on the data availability.

Artificial Neural Networks
A neural network is a mathematical structure made up of several interlinked computational elements called neurons, named after the central element of the human nervous system. These neurons, also referred to as perceptrons or as nodes in the network topology, perform a simple operation on their inputs and send the output to the nodes they connect to [23,30]. Neural networks are polymorphic in structure and parallel in computation, and they can be described as a densely interconnected system of processing elements capable of parallel computation. An ANN contains an input layer (which can be thought of as the independent or predictor variables), one or more hidden layers, and an output layer that plays the role of a categorical dependent variable. All ANNs use multiple processing entities that can be trained. Training is a process in which the network learns and adapts to patterns in the inputs by developing a mathematical relationship that matches the input variables to the outcome for each case. The difference between traditional analytical methodologies and ANNs is that the traditional methods assume a specific type of connection between the input variables and outcome variables and then use suitable procedures to adjust the values of the model's parameters, whereas neural networks create a mathematical relationship by "learning" the patterns of all inputs from the individual cases used in training the network [23,30].
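The layered structure described above can be illustrated with a forward pass through a very small network. This is a minimal sketch, not the architecture of any reviewed study: the three inputs, the two hidden neurons, the sigmoid activation, and all weight values are assumptions chosen for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One fully connected layer: each neuron computes a weighted sum
    of its inputs plus a bias and applies the sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(x, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> one hidden layer -> single output neuron."""
    hidden = layer(x, hidden_w, hidden_b)
    return layer(hidden, out_w, out_b)[0]


# Illustrative values only (hypothetical scaled predictors and weights).
x = [0.8, 0.6, 0.9]  # e.g. prior GPA, attendance, entrance score
hidden_w = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
hidden_b = [0.0, 0.1]
out_w = [[1.2, -0.7]]
out_b = [0.05]

p_pass = forward(x, hidden_w, hidden_b, out_w, out_b)
print(f"estimated probability of passing: {p_pass:.3f}")
```

The output can be thresholded (for example at 0.5) to obtain a pass or fail class, which corresponds to the categorical output layer described in the text.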
ANNs can generate a predicted outcome for each case entered throughout the training phase. In addition, one of the significant benefits of machine learning that makes it increasingly popular over traditional statistical methods is its behavior in the case of an incorrect prediction: the network can adjust and recompute the weights, held in its hidden layers, that relate the input variables (the predictors) to the expected outcome. The outcome (dependent) variable is the predicted output, which is usually a categorical variable in the case of predicting student performance. As such, the outcome is usually success or failure, or pass or fail; this result is linked with a unique value for each input case (or subject). The input variables carry information on the likelihood of a case belonging to each of the categorical classes of the output variable used in the ANN's development. During the training process, the network improves its accuracy in replicating the known outcomes of the training cases. The neural network keeps improving its predictions until one or more predetermined stopping criteria are met. A minimum level of accuracy, learning rate, persistence, number of iterations, amount of time, and other criteria can be used as stopping criteria [23].
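The weight adjustment and stopping criteria described above can be sketched with a training loop. For brevity, this example trains a single sigmoid neuron by gradient descent rather than a full multi-layer network. The toy data, the learning rate, and the three stopping criteria (target accuracy, persistence, iteration limit) are assumptions made for illustration.

```python
import math


def predict(weights, bias, x):
    """Output of one sigmoid neuron for input vector x."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train(data, lr=0.5, max_iters=5000, target_acc=1.0, patience=200):
    """Adjust weights after every case; stop on the first criterion met."""
    weights, bias = [0.0] * len(data[0][0]), 0.0
    best_acc, stale, acc = 0.0, 0, 0.0
    for epoch in range(1, max_iters + 1):
        for x, y in data:
            err = predict(weights, bias, x) - y  # prediction error for this case
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
        acc = sum(
            (predict(weights, bias, x) >= 0.5) == bool(y) for x, y in data
        ) / len(data)
        if acc >= target_acc:
            return weights, bias, acc, epoch, "target accuracy reached"
        if acc > best_acc:
            best_acc, stale = acc, 0
        else:
            stale += 1
            if stale >= patience:  # persistence: no recent improvement
                return weights, bias, acc, epoch, "no improvement"
    return weights, bias, acc, max_iters, "iteration limit"


# Toy cases: (scaled prior score, scaled attendance) -> 1 = pass, 0 = fail.
data = [
    ([0.9, 0.8], 1), ([0.8, 0.9], 1), ([0.7, 0.8], 1),
    ([0.3, 0.4], 0), ([0.2, 0.1], 0), ([0.4, 0.2], 0),
]
weights, bias, acc, epochs, reason = train(data)
print(f"stopped after {epochs} epochs ({reason}), training accuracy {acc:.2f}")
```

Returning the reason for stopping makes it explicit which criterion ended training, which is useful when comparing runs with different settings.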
In the reviewed articles, the authors usually divided the data set into a training portion, typically 60-70% of the obtained data, and used the remaining 30-40% to test the ANN model. In some cases, as little as 10% of the data was used to determine the accuracy of the proposed ANN model. One of the major advantages of machine learning, including ANN methods, is that after the model has been trained, its predictive capability is tested with the remaining cases in the dataset. This process is important for validating the model's predictive capacity. During validation, the weights in the model are fixed at the values obtained during the training phase. This step is vital before the developed network is used to predict classes of outcomes in a new set of data. The network can then be used to predict future outcomes in cases where the outcome is still unknown [20,24,29,32].
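A hold-out split of this kind can be sketched in a few lines. The 70/30 ratio, the synthetic cases, and the threshold rule standing in for a trained network are all assumptions for illustration. The point is only that accuracy is measured on cases the model did not see during training.

```python
import random


def split_dataset(cases, train_frac=0.7, seed=0):
    """Shuffle the cases reproducibly and split them into training and test parts."""
    rng = random.Random(seed)
    shuffled = cases[:]
    rng.shuffle(shuffled)
    cut = int(round(train_frac * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def accuracy(model, cases):
    """Share of cases whose predicted class equals the known outcome."""
    return sum(model(x) == y for x, y in cases) / len(cases)


# 20 synthetic cases: exam score -> 1 (pass) if the score is at least 50.
cases = [(score, int(score >= 50)) for score in range(0, 100, 5)]
train_set, test_set = split_dataset(cases, train_frac=0.7)


def threshold_model(score):
    # Stand-in for a trained network; its parameters stay fixed during testing.
    return int(score >= 50)


print(len(train_set), len(test_set), accuracy(threshold_model, test_set))
```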
The review of the literature revealed that although ANNs provide highly accurate predictions, they are used to a lesser extent because their application is limited to sophisticated software such as SPSS and MATLAB [9,20,21,22,27]. This software requires expert knowledge of statistics, algorithms, and data mining, which most teachers and educators do not have, limiting its real-life application and use. Additionally, other data mining algorithms such as Decision Trees are used more frequently due to the availability of a more user-friendly application developed by the University of Waikato, New Zealand. That software is freely licensed and combines data mining and predictive analysis with visualization tools and graphical user interfaces. These features attract more users to it than to ANN software.
The search inclusion criteria focused on ANNs, although in some studies the technique was not used alone. Ref. [21] used Probabilistic Neural Networks (PNNs) based on Dynamic Decay Adjustment (DDA) and compared them with the Tree Ensemble Predictor, the Random Forest Predictor, the Naïve Bayes Predictor, the Decision Tree Predictor, and the Logistic Regression Predictor using the Konstanz Information Miner (KNIME). Ref. [22] compared NNs with a conventional algorithm, logistic regression. Similarly, [25] used ANNs, multiple regression, and SVM algorithms. Ref. [21] used MANFIS-S (Multi Adaptive Neuro-Fuzzy Inference System with Representative Sets). Ref. [32] used a technique known as factor selection, based on principal component analysis, to reduce the number of survey variables to a smaller number of constructs (dimension reduction). The new constructs were then used as input for the ANN.
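The dimension-reduction step mentioned for [32] can be illustrated with a small principal component sketch. This is not the procedure of that study: it extracts only the first principal component, uses power iteration on the covariance matrix instead of a full eigendecomposition, and runs on made-up survey responses.

```python
import math


def first_principal_component(rows, iters=200):
    """Return the leading principal direction and the score of each row on it."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0] * d  # starting vector for power iteration
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Hypothetical responses of five students to three correlated survey items.
survey = [[4, 5, 2], [3, 4, 1], [5, 5, 3], [2, 2, 1], [1, 1, 0]]
direction, construct = first_principal_component(survey)
print([round(s, 2) for s in construct])
```

Each student's three item responses are replaced by one score, and a list of such scores could serve as a single ANN input in place of the original items.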
Other studies used ANN to validate the results obtained by other methods [28,29]. Multi-layer perceptrons (MLPs) are the most popular type of ANN algorithm employed by the majority of the studies [9,20,21,24,27,29,32,34].
A few studies used different types of MLP, including feedforward networks [23,28]; other studies employed the Backpropagation NN (BPNN) [23,27]. Other studies employed different techniques such as the Elman neural network and BPNN [29], the Probabilistic Neural Network (PNN) in [16], Particle Swarm Optimization in [23], and Cuckoo Search (CS) and the Cuckoo Optimization Algorithm (COA) in [31,33]. There is no apparent association between the type of ANN used and the level of accuracy, nor with the sample size used. The only link that is proposed or seen to enhance the accuracy of the prediction model is the training of the network, as presented in [24]. That study attributes the high accuracy of the prediction to the combined use of the ANN with other methods such as cluster analysis and Kohonen networks, and not to the type of variables or attributes selected. A similar observation was made by [25].
The reviewed studies generally indicated superior accuracy of the ANN over traditional prediction algorithms and other data mining techniques. Only in one study was it found that PNN showed the least accuracy compared with the Tree Ensemble Predictor, the Random Forest Predictor, the Naïve Bayes Predictor, the Decision Tree Predictor, and the Logistic Regression Predictor using the Konstanz Information Miner (KNIME) [21].

Input Variables Used
The input variables used in the reviewed studies did not seem to follow a specific pattern except for the use of student scores and GPA from previous study years. Depending on the context of each study, the variables and the selection of inputs varied greatly from one study to another. As an example, when [17] attempted to predict which students would perform better while studying at the Faculty of Engineering and Information Technology, they chose input variables that were believed to be related to the student's understanding of engineering. The variables included Math I and II, Electronics I, and Electrical Circuit I in high school. These inputs were combined with first-year university scores.
Not all the studies used academic scores; [27] used cognitive variables to determine student performance. The study applied variables such as working memory (intelligence and the ability to acquire new knowledge), attention, and learning strategies. It used several computerized tasks, namely the Attention Network Test (ANT) adapted from [41], the Automated Operation Span adapted from [42], and the Learning Strategies Questionnaire adapted from [43–45]. From these instruments, the study extracted the input variables General Reaction Time, Attentional Networks, Working Memory Capacity, Reaction Time Operation, Cognitive Resources/Cognitive Processing, Anxiety Management, Alerting Attention, and Orienting Attention.
The same study also included demographic variables related to gender and to parents' educational level and occupation. These factors appear in many studies [22,25,27,32,34], as this information is usually easy to obtain. For similar reasons, the majority of studies used GPA or individual subject scores [21,24,26,31,34]: this information is readily available, especially in a retrospective setting, to predict future scores or the success and failure of the same students.
Studies that examined the virtual learning environment [20] chose corresponding input variables, such as the number of messages viewed and posted on the LMS, content creation, files viewed, and quiz efforts. These data were easily obtained and recorded in the LMS. Relatively similar variables were used by [25] in predicting e-learning students' graduation levels. The input variables included Gender, Total Login, Quiz Training, Message, Upload Assistance, Forum, Training, Chat, and Discussion. The two studies reported high ANN accuracy, achieving 98.30% and 97.9%, respectively.
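Before LMS activity counts and categorical attributes such as gender can be fed to an ANN, they are usually converted to a numeric vector on a common scale. The following is a minimal sketch of one common way to do this (min-max scaling plus one-hot encoding). The field names, the sample records, and the helper functions are hypothetical; the reviewed studies do not report their preprocessing in this form.

```python
def min_max_scale(values):
    """Rescale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(value, categories):
    """Return a 0/1 indicator vector with one position per category."""
    return [1.0 if value == c else 0.0 for c in categories]


def encode_students(records, numeric_fields, categorical_fields):
    """Turn a list of student records (dicts) into numeric input vectors."""
    scaled = {f: min_max_scale([r[f] for r in records]) for f in numeric_fields}
    levels = {f: sorted({r[f] for r in records}) for f in categorical_fields}
    vectors = []
    for idx, record in enumerate(records):
        vec = [scaled[f][idx] for f in numeric_fields]
        for f in categorical_fields:
            vec.extend(one_hot(record[f], levels[f]))
        vectors.append(vec)
    return vectors


# Hypothetical LMS records; the field names are assumptions for illustration.
records = [
    {"gender": "F", "total_logins": 120, "forum_posts": 14, "quiz_attempts": 9},
    {"gender": "M", "total_logins": 40, "forum_posts": 2, "quiz_attempts": 3},
    {"gender": "F", "total_logins": 80, "forum_posts": 8, "quiz_attempts": 6},
]
vectors = encode_students(
    records,
    numeric_fields=["total_logins", "forum_posts", "quiz_attempts"],
    categorical_fields=["gender"],
)
```

Each resulting vector holds the three scaled activity counts followed by the one-hot gender indicator, so every input lies in the range [0, 1] regardless of its original unit.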
The virtual learning environment is becoming increasingly popular as a data source; a recent study [29] used it to predict student success at the master's level. The study extracted information from the Moodle platform in order to monitor input variables including access, number of visits, clicks per hour, time of each access, clicks in resources, and messages in the forum. The results indicated that student performance improved as each of these variables increased.
Students' level of distraction from internet use and the time spent on the internet were among the variables applied in examining the grades of honor students in [35]. This study employed a combination of interesting inputs that were believed to affect student performance. In addition to the two variables just mentioned, the inputs included the number of books borrowed from the library, college entrance examination scores, and the first midterm score; the model demonstrated a satisfactory accuracy of up to 84%.
Similar observations were reported elsewhere in the literature [46], where in-term or end-of-term scores were also used as input factors in predicting student success. One review found that overall grade averages and internal assessments (quizzes, lab studies, and attendance) are commonly used in predicting student results. External assessments (final exams) and demographic characteristics (gender, age, family status, and impairment status) were also used to forecast student progress in some research. Extracurricular activities, high school background, and social interaction variables, as well as psychometric influences (student-related, learning conduct, attendance period, and family support), were also considered in assessing student success. Ref. [47] used students' general weighted grade point average, letter grades from specific classes, and midterm and final exam scores to forecast academic success. An earlier study [48] employed Artificial Neural Networks, Decision Trees, and linear regression analysis to forecast student success based on cumulative grade point averages. Although many studies include demographic and socioeconomic variables as predictors of student performance and success, [49] reported that student success can be forecast from grades alone, without any socioeconomic details.

Future Work
The use of ANNs in the educational context needs practice-based evidence that the methodology or approach has actually been used and implemented in educational policy to reach the ultimate objective, which is the improvement of the learning process. Such practical application of research findings was indicated in only a few cases. Very few tangible findings concerned the use of these techniques in their actual context and for their target objective, namely improving student outcomes, performance, and achievement through feedback and predictions derived from the proposed methodological and algorithmic approach. Overall, this review found a diverse range of ANN analysis methods and a strong emphasis on higher education.
An important recommendation of this review is to close the identified gap: the application of ANNs to the actual enhancement of students' academic achievement remains largely theoretical and limited. Future studies need to develop research that contributes to actual application in daily teaching practice and supports education policy decision-making, as an alternative to purely theoretical work. A cycle of internal feedback, in which successful examples of various algorithms and techniques accumulate over time without practical application, can cause the scientific field to wither. The practical application of results, on the other hand, will broaden the scope of research and benefit both the research community and the wider educational community.
Finally, and most importantly, the review of the current literature shows a need to expand the field beyond higher education and to focus primarily on early educational levels. This is the most prominent future direction for the field: it would serve a larger number of students in more diverse classes and have a greater impact on their lives and on society as a whole. It is a path that can lead to new research opportunities and, more importantly, to outcomes that have a greater impact on education and society.

Conclusions
The systematic literature review adopted in this paper leads to specific observations and conclusions concerning the use of Artificial Neural Networks in predicting student achievement and performance. The first observation is that the reviewed articles focus mainly on higher education, including university education in various fields. This can be attributed to the considerable availability of data in the higher education setting and to researchers' greater familiarity with university institutions, notably in the engineering domain. Secondary and primary education, on the other hand, where a larger population and larger sample sizes can be obtained, do not appear to have received the same attention from educational data mining researchers. Although there is much room for research owing to the age of the students, the variety of subjects, and the different learning levels of students, among other factors, the field has not evolved as much as expected. In addition, reduced data accessibility, the procedures required for research approval by educational authorities, and the increased attention paid to students' personal data are major obstacles to further research development.
The results indicated that ANNs are always used in combination with data analysis and data mining methodologies and algorithms, allowing studies to assess the effectiveness of their findings in evaluating academic achievement. ANNs showed high accuracy in predicting academic achievement, although similar results were obtained with other data mining approaches. The degree of accuracy obtained does not seem to be influenced by factors such as sample size, educational level, field of education, or study context. It was further observed that some data mining techniques were used more often than others. Additionally, using a variety of data mining techniques did not significantly increase the accuracy of the predictions.
No pattern was detected in the selection of input variables. Some studies used cognitive attributes, whereas others were limited to conventional student scores, either by subject or by year, with no significant difference in the accuracy of the prediction model. Owing to their ease of access and availability, student scores such as cumulative GPA were the most frequently used input variables, in addition to some demographic variables such as gender. Grades were frequently combined with demographic and academic data, but no combination resulted in greater accuracy. Most studies merely offered suggestions for how the results could be used, without anything specific. Accordingly, the goal appears to be distorted in practice.

Data Availability Statement: Data are available from the corresponding author for researchers who meet the criteria for accessing the data.