A Predictive Model for Student Achievement Using Spiking Neural Networks Based on Educational Data

Abstract: Student achievement prediction is one of the most important research directions in educational data mining. Student achievement directly reflects students' course mastery and lecturers' teaching level. For college students in particular, achievement prediction not only provides early warning and timely correction for students and teachers, but also gives university decision-makers a way to evaluate course quality. Building on existing research and experimental results, this paper proposes a student achievement prediction model based on an evolutionary spiking neural network. After analyzing the relationship between course attributes and student attributes, a student achievement prediction model based on a spiking neural network is established. An evolutionary membrane algorithm is introduced to learn the hyperparameters of the model and thereby improve its accuracy in predicting student achievement. Finally, the proposed model is used to predict student achievement on two benchmark student datasets, and its performance is analyzed by comparison with other experimental algorithms. The experimental results show that the model based on the spiking neural network can effectively improve the prediction accuracy of student achievement.


Introduction
Educational institutions such as universities and training centers accumulate a large amount of educational data. Mining the knowledge hidden in these data helps decision makers within the higher education system to improve the quality of education in the process of student development [1]. With the development of society and the popularization of higher education, the number of college students is increasing. It is difficult for teachers to track the learning situation of each student, which affects the quality of teaching and learning to a certain extent. As a result, every year a certain number of students in colleges and universities fail examinations, repeat grades or even drop out, which seriously affects their future development. The quality of student training has gradually become a new focus in the field of higher education. Student achievement is one of the key factors that most directly reflects the quality of student training in higher education [2]. Therefore, studying and constructing an efficient method for predicting student achievement has important application value and practical significance.
With the vigorous development of artificial intelligence and big data technology, it is possible to model and analyze the massive data accumulated by colleges and universities for many years. Big data technologies represented by deep learning and data mining can discover some data patterns, extract valuable information and knowledge, and provide services for solving problems in various fields, which has become the consensus of today's industry and academia. At present, technologies such as big data have been widely used in many fields such as finance, medical care, e-commerce, energy and manufacturing, and transportation. Especially in the field of education, more and more big
• Based on the analysis of educational data, an educational data mining model is discussed.
• Spiking neural network is used for the first time to predict student achievement.
• Evolutionary spiking neural network is designed and implemented on the basis of the student datasets.
• Simulation results verify the effectiveness of the proposed model in predicting student achievement.
• The research results of the proposed model can provide more targeted reference for scientific research and education management workers.
The remainder of this paper is organized as follows. Section 2 discusses the current state of research in educational data mining and spiking neural networks. Section 3 describes the proposed model in detail, especially how to design educational data mining models based on evolutionary spiking neural networks. In Section 4, the proposed model is compared with state-of-the-art models, and the simulation results are evaluated and discussed on the xAPI-Edu-Data and student-grade-prediction datasets, respectively. Finally, Section 5 summarizes the conclusions of this paper.

Research Status of Educational Data Mining
Educational data mining is a multidisciplinary research field that spans education, computer science, statistics, and psychology, and it has rich research content, diverse methods, and diverse research perspectives. The analysis of educational data is of great significance for scientifically and effectively improving the quality of education and the level of teaching management [16]. Researchers who have long been engaged in educational data analysis have done a lot of in-depth work. This work mainly focuses on student achievement prediction, student modeling, learning recommendation, and analysis and visualization [19]. These research directions are discussed below.
Student achievement prediction uses relevant information about students to predict their future academic performance [20]. A prediction task can be either a classification task or a regression task, such as predicting the probability that a student fails, predicting a student's grade rank, or predicting a student's achievement in a course.
Student modeling reveals the learning characteristics of students by building models of student behavior, learning strategies and cognitive abilities [21]. Examples include identifying students' behavior and finding the relationship between students' learning behavior patterns and their personality traits.
A learning recommendation system recommends courses, learning materials, learning methods or professional directions to students based on their personality characteristics and academic performance [22]. For example, it can recommend suitable majors to students based on their academic performance in their first year, or recommend the order of course content based on their log records and personal information in the learning system.
Analysis and visualization refers to the use of visualization techniques such as histograms, line charts, heat maps, and word clouds to present the knowledge or information contained in educational data, so that people can obtain information faster and understand it more conveniently and intuitively.
Among the above research directions, student achievement prediction is one of the important branches of educational data mining. Scholars have carried out a lot of fruitful research, but research that introduces artificial intelligence or big data technology into student achievement prediction is still in the exploratory stage, and traditional statistical methods struggle to predict student achievement effectively. Therefore, this paper chooses student achievement prediction as its main research content, in order to contribute to in-depth research in this direction.

Spiking Neural Network
Spiking neural networks are known as the third generation of artificial neural networks and are the discrete bionic network models closest to biological neural systems. Traditional artificial neural networks are still limited to the von Neumann architecture for information processing and learning [23]. In the von Neumann architecture, the memory and the processor are separated from each other, so a large amount of energy is consumed when performing massive data operations [24]. In contrast, a spiking neural network encodes information into spike sequences for processing; that is, the input information is processed and transmitted through action potentials at synapses. It adopts mechanisms such as plastic synapses and spike time coding to simulate the spatiotemporal properties of neural networks, and its structure is closer to that of biological neurons. Therefore, it has higher biological plausibility and more computational power than traditional neural networks [25].
The theory of spiking neural networks has a corresponding biological basis, which gives these networks good biological interpretability and lets them exploit information in the time dimension. Compared with traditional neural networks, spiking neural networks compute more efficiently and are relatively easy to implement in hardware, because their event-driven nature reduces the power consumption of operations [26]. The excellent performance and advantages of spiking neural networks have attracted a large number of internationally renowned teams to long-term in-depth research [27,28]. The main research directions of spiking neural networks are discussed as follows.

• Neurons. The functions of biological neurons are generally simulated through neuron models. The neuron model is the basis for building a spiking neural network. Different types of neurons are connected to each other to form various types of neural network models. Many learning algorithms are designed and implemented on the basis of these neuron models or their variants.
• Network topology. The topology of a neural network includes the number of network layers, the number of neurons in each layer, and the way the neurons are connected to each other. The topology of an artificial neural network is often divided into an input layer, hidden layers and an output layer, and the layers are connected in sequence. The neurons of the input layer receive input information from the outside world and pass it to the neurons of the hidden layer. The hidden layer is responsible for information processing and transformation within the network, and it is designed as one or more layers according to the needs of the transformation. As with traditional artificial neural networks, the structures of spiking neural networks mainly include feedforward, recurrent and hybrid spiking neural networks.
• Spike sequence encoding. For the encoding of input information, researchers have proposed a variety of spike sequence encoding methods for spiking neural networks by studying the information encoding mechanism of biological neurons. Examples include the first spike-triggered time coding method, the delayed phase coding method, and the population coding method.
• Learning algorithm. A spiking neural network contains hyperparameters such as network topology, number of neurons, and weights. During network training, these hyperparameters are determined by a learning algorithm. The learning algorithm directly determines the output accuracy of the spiking neural network. Therefore, scholars have carried out a lot of research on learning algorithms for spiking neural networks, and the research directions mainly focus on unsupervised learning, supervised learning, semi-supervised learning and reinforcement learning.
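To make the notion of a neuron model concrete, the following minimal sketch simulates a leaky integrate-and-fire (LIF) neuron, one commonly used spiking neuron model. The text does not state which neuron model the proposed network uses, so the choice of LIF, the function name `simulate_lif` and all parameter values are assumptions made purely for illustration.

```python
def simulate_lif(input_current, dt=1.0, tau=10.0, v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Simulate a leaky integrate-and-fire neuron with forward Euler steps.

    Returns the list of times at which the neuron emits a spike.
    All parameter values are illustrative assumptions.
    """
    v = v_rest
    spike_times = []
    for step, current in enumerate(input_current):
        # The membrane potential leaks toward rest and integrates the input.
        v += dt * (-(v - v_rest) + current) / tau
        if v >= v_thresh:
            # Crossing the threshold emits a spike and resets the potential.
            spike_times.append(step * dt)
            v = v_reset
    return spike_times


# A strong constant input produces a regular spike train;
# a weak input never reaches the threshold.
strong = simulate_lif([2.0] * 100)
weak = simulate_lif([0.5] * 100)
```

The output of such a neuron is a sequence of spike times rather than a real-valued activation, which is what distinguishes it from the units of a traditional artificial neural network.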
As the third generation of neural networks, spiking neural networks have great computational potential. Due to the complex structure of neurons and the large number of hyperparameters, there are few application scenarios. By reviewing the literature, it is found that the application of spiking neural network is rare in educational data mining. Therefore, this paper will try to use spiking neural network to predict student achievement. The research results of this paper in educational data mining can expand the application scope of spiking neural network.

Proposed Method
In order to better obtain and analyze educational data, we first give the design scheme of student achievement prediction model. On this basis, we propose a student achievement prediction model based on spiking neural network for educational data mining.

Scheme of Student Achievement Prediction
The design scheme of the student achievement prediction model is shown in Figure 1 and discussed in detail below. This scheme is a general framework for student performance prediction. It is not limited to the model proposed in this paper; other models can also be used. As can be seen in Figure 1, the scheme consists of five parts: datasets, data preprocessing, data extraction, data modeling and application. Each part in Figure 1 is described below.
• Datasets consist of raw data from databases, documents, or websites. Research on student achievement prediction has focused on education and psychology, using data mostly from questionnaires or student self-reports [29]. To acquire this kind of data, one should first understand the structure and meaning of the original student achievement data involved in the task, and determine the required data items and data extraction principles. The relevant student data are then extracted using appropriate means and strict operating specifications. This process requires considerable domain knowledge, and the opinions of experts and users can be combined to obtain variables that are highly correlated with student performance. If data are extracted from multiple sources, the software and hardware platforms may differ, so attention must be paid to connecting the heterogeneous databases and converting the data formats. If confidential student data are involved, more attention should be paid to how these data are handled during processing, and remarks should be made on the relevant data for reference. The study found that poor quality of the data source is a possible reason why the prediction accuracy of a model cannot be improved. When acquiring raw data, it is therefore particularly important to minimize errors at the source, especially human errors. Currently, the main sources of datasets for student achievement prediction are education management systems, offline datasets of educational history, and standardized test datasets.
• Data preprocessing covers data cleaning, data integration, data transformation and other operations. In the whole data mining process, data preprocessing takes about 60% of the time, and the subsequent mining work only accounts for about 10% of the total workload. Preprocessing not only saves a lot of space and time, but also helps the predictive model to make better decisions and predictions. Because educational data come from different sources, the attributes and feature dimensions of student data are inconsistent. To obtain modeling data of better quality, data cleaning, integration and transformation must be performed. Among them, data cleaning is the most time-consuming and tedious step, but it is also the most important step in data preparation, because it effectively reduces the conflicts that may arise during the learning process. Raw data that are noisy, erroneous, missing or redundant can be processed as follows.
- Noise data. Data smoothing techniques are the most widely used methods for dealing with such noisy data [30].
- Error data. For some wrong data tuples, we change, delete or ignore these wrong data by analyzing the datasets.
- Missing data. We use global constants or mean values of attributes to fill nulls, and use regression methods, derivation-based Bayesian methods or decision trees to fix certain attributes of the data [31].
- Redundant data. We remove redundant parts of the data to improve the processing speed of the prediction model.
Data integration is a data storage technology and process that combines data from different data sources such as databases, networks, or public files. Since data from different disciplines rest on different theoretical foundations and rules, data integration is a difficult point in data preprocessing. Naming rules and requirements may be inconsistent across data sources. To extract data from multiple data sources into one database, all data formats must be unified in order to ensure the accuracy of the experimental results. Generally, each data source is first modified according to a unified standard, and then the data of the different sources are extracted into the same database.
Data transformation uses linear or nonlinear mathematical transformations to compress multi-dimensional data into fewer dimensions and to eliminate differences in characteristics such as space, attributes, time and precision. While these methods are usually lossy with respect to the original data, the results tend to have greater utility. To a certain extent, data transformation gives the prediction model better prediction accuracy and execution efficiency.
• Data extraction divides the data into a training dataset and a test dataset. The training dataset is the set of learning samples to which the model parameters are fitted. On the training dataset, the learning method is used to determine the hyperparameters of the model; that is, the model builds its prediction method from the training dataset. After training, the test dataset is used to evaluate the discriminative ability and generalization ability of the model.
• Data modeling addresses two main types of forecasting problems: classification and numerical prediction. Both use data to make predictions about future outcomes. Classification predicts discrete categories of data objects; the attribute values to be predicted are discrete and unordered. Numerical prediction predicts continuous values of data objects; the attribute values to be predicted are continuous and ordered. A classification model captures the characteristics that similar things have in common and the characteristics that distinguish different things. It is built through supervised learning and then used to classify samples of unknown class. A numerical prediction model is similar to a classification model and can be viewed as a map or function y = f(x), where x is the input tuple and the output y is a continuous or ordered value.
• Application refers to applying the above process to classification or prediction in order to solve practical problems. In this paper, the data model is applied to educational data mining, and more specifically to the problem of student achievement prediction.
Figure 2 shows a student achievement prediction model based on an evolutionary spiking neural network. First, the processed modeling data are divided into a training dataset and a test dataset according to a certain proportion. Next, the evolutionary spiking neural network model is built for student achievement prediction. Then, the proposed model is trained on the training dataset, and the evolutionary membrane algorithm is used to optimize the hyperparameters of the proposed model to obtain the best output performance. Finally, the test dataset is used to evaluate the performance of the proposed model, and its prediction effect is analyzed using several evaluation indicators.
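The cleaning and extraction steps described above can be sketched in a few lines. The example below is only an illustration under stated assumptions: the record layout, the field name `absences`, and the helper names `fill_missing_with_mean`, `drop_duplicates` and `split_train_test` are hypothetical, and mean imputation is just one of the missing-data strategies mentioned in the text.

```python
import random


def fill_missing_with_mean(rows, key):
    """Replace None in column `key` with the mean of the observed values."""
    observed = [r[key] for r in rows if r[key] is not None]
    mean = sum(observed) / len(observed)
    return [{**r, key: mean if r[key] is None else r[key]} for r in rows]


def drop_duplicates(rows):
    """Remove exact duplicate records while preserving their order."""
    seen, unique = set(), []
    for r in rows:
        fingerprint = tuple(sorted(r.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(r)
    return unique


def split_train_test(rows, test_ratio=0.2, seed=0):
    """Shuffle the records and split them into training and test subsets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical toy records: one missing value and one redundant copy.
records = [
    {"id": 1, "absences": 2},
    {"id": 2, "absences": None},
    {"id": 3, "absences": 6},
    {"id": 3, "absences": 6},
    {"id": 4, "absences": 4},
]
clean = fill_missing_with_mean(drop_duplicates(records), "absences")
train, test = split_train_test(clean, test_ratio=0.25)
```

Fixing the random seed keeps the split reproducible, which matters when several models are compared on the same test dataset.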

[Figure 2. Student achievement prediction model based on an evolutionary spiking neural network. Diagram labels: output of the 1st neuron, output of the n-th neuron, pulse time sequence of the j-th neuron.]
As can be seen in Figure 2, the working mode of the spiking neural network proposed in this paper can be roughly divided into three parts: the encoding of the input data, the training and learning of the spiking neural network, and the decoding and output of the prediction results. Each part of the proposed model is discussed as follows.
• We take the student dataset as the input of the proposed model, and encode these data as the input spike sequence using the first spike encoding [32].
• The input spike sequence is passed to the neurons for transmission and processing, and then the learning rate and synaptic time delay are optimized using the evolutionary membrane algorithm to adaptively tune the hyperparameters of the proposed model.
• The processing result of the neurons is passed to the output layer. The output layer outputs the predicted spike sequence, and the mean squared error between the actual spike sequence and the expected output spike sequence is calculated.
• The model adjusts the learning rate and synaptic time delay by repeatedly calling the evolutionary membrane algorithm to reduce its prediction error. Once the mean squared error is smaller than a given limit or the number of iterations reaches the stopping requirement, the proposed model outputs a spike sequence and decodes this sequence into a prediction result.
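The encoding and error steps listed above can be illustrated with a short sketch. The text names first spike encoding but does not give its formula, so the linear latency mapping below (larger values fire earlier, within an assumed window `t_max`) is an assumption, as are the function names `first_spike_encode` and `spike_mse`.

```python
def first_spike_encode(values, v_min, v_max, t_max=10.0):
    """Map each feature value to a single spike time.

    Assumed linear latency code: the largest value fires at time 0 and
    the smallest value fires at time t_max.
    """
    span = v_max - v_min
    return [t_max * (1.0 - (v - v_min) / span) for v in values]


def spike_mse(actual_times, expected_times):
    """Mean squared error between actual and expected output spike times."""
    squared = [(a - e) ** 2 for a, e in zip(actual_times, expected_times)]
    return sum(squared) / len(squared)


# Three scores on a 0-100 scale become three first-spike times.
times = first_spike_encode([0, 50, 100], v_min=0, v_max=100)

# Error between a hypothetical actual and expected output spike sequence.
error = spike_mse([2.0, 4.0], [1.0, 6.0])
```

Decoding would invert the same mapping, turning an output spike time back into a predicted value or class.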
For a deeper understanding of the proposed model, a flowchart of the student achievement prediction model based on the evolutionary spiking neural network is shown in Figure 3.
Figure 3 shows the working process of the entire model. The model first determines the feature data that affect student performance as the input of the spiking neural network, and uses the first spike encoding method to convert these data into input spike sequences and expected spike sequences. On this basis, the model passes the input spike sequence to the neurons of the spiking neural network, where the data are learned and trained. Then, the result learned by the neurons is passed to the output layer, and the mean squared error between the actual spike sequences and the expected spike sequences is calculated. Next, the error value is compared with the error threshold, and the number of iterations is checked against the stopping requirement. If neither condition is met, the evolutionary membrane algorithm adjusts the learning rate and synaptic time delay of the neurons, and this process is executed cyclically until the error value meets the requirement or the number of iterations satisfies the stopping condition.
The evolutionary membrane algorithm used in this paper consists of objects, membrane structures, and reaction rules. The objects represent the hyperparameters of the spiking neural network. The membrane structure is a two-layer structure composed of a skin membrane and several inner membranes. The chemical reaction optimization algorithm serves as the reaction rules that evolve the objects in each membrane. The evolutionary membrane algorithm then evaluates each object in the membrane and compares the fitness values of these objects. Chemical reaction optimization acts as an operator that generates offspring objects, which ensures the diversity of candidate objects and, to a certain extent, reduces the probability of premature convergence during evolution. This process of evolving the next generation of spiking neural networks is repeated until the stopping condition is met. When the process terminates, the best spiking neural network model is selected to predict student achievement. Finally, when the output error falls within the required range or the number of iterations satisfies the stopping condition, the adjustment of the learning rate and synaptic time delay is complete. The actual output spike sequence is obtained and transformed into the prediction results.
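The iterate-evaluate-select loop described above can be sketched with a generic evolutionary search. This is not the authors' evolutionary membrane algorithm and does not implement membranes or chemical reaction optimization; it substitutes plain Gaussian mutation with elitist selection to show how a learning rate and a synaptic delay could be tuned against an error measure. The function `surrogate_error`, the parameter bounds and all numeric settings are hypothetical stand-ins, since the real fitness would be the network's mean squared error.

```python
import random


def evolve(fitness, bounds, pop_size=20, generations=50, sigma=0.1, seed=0):
    """Minimise `fitness` over box-bounded parameters.

    Each generation mutates every parent with Gaussian noise, then keeps
    the fittest individuals from parents and offspring (elitist selection).
    """
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        for parent in population:
            child = [
                min(hi, max(lo, gene + rng.gauss(0.0, sigma * (hi - lo))))
                for gene, (lo, hi) in zip(parent, bounds)
            ]
            offspring.append(child)
        population = sorted(population + offspring, key=fitness)[:pop_size]
    return population[0]


def surrogate_error(params):
    """Hypothetical stand-in for the network's mean squared error."""
    learning_rate, delay = params
    return (learning_rate - 0.05) ** 2 + (delay - 3.0) ** 2


# Assumed search ranges for the learning rate and the synaptic delay.
bounds = [(0.001, 0.5), (1.0, 10.0)]
best = evolve(surrogate_error, bounds)
```

Because the best individuals always survive, the best error never increases from one generation to the next, which mirrors the role of fitness comparison in the procedure above.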

Experimental Studies
To analyze the performance of the proposed prediction model based on the evolutionary spiking neural network, several state-of-the-art experimental algorithms were selected for comparison on the student datasets: logistic regression, decision tree, XGBoost, AdaBoost, neural network, and support vector machine (SVM). First, we describe the benchmark datasets and the evaluation metrics used in our experiments. Second, we provide the comparison results between the proposed algorithm and the experimental algorithms, and further analyze the performance of the proposed algorithm. Finally, the comparison results of the experimental algorithms on the benchmark datasets are discussed.

Benchmark Datasets
Two benchmark datasets are used here to verify the performance of the proposed algorithm. Details of these datasets including xAPI-Edu-Data and UCI datasets are discussed below.
The first classification dataset is xAPI-Edu-Data from Alibaba Cloud Tianchi. It contains 480 student records collected over two semesters from students of different countries and genders. Each record has 17 attributes related to student grades, including current education level, class, selected courses, how often the student raises a hand in class, attendance, and information about the student's parents. Student grades are divided into three categories ['L', 'M', 'H'], which serve as the class labels. 'L' (0-59) means failing, 'M' (60-89) means medium, and 'H' (90-100) means high score. We analyzed the proportion of students in the three grade categories, as shown in Figure 4. As can be seen in Figure 4, most of the students are in the medium category, and high-scoring students account for 29.58% of the total. The dataset is divided into a training dataset and a test dataset. The training dataset consists of 384 records, accounting for 80% of the samples, and the test dataset consists of the remaining 96 records, accounting for 20%.
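The labelling rule stated above can be written as a small helper. The sketch below uses the score intervals exactly as given in the text as default thresholds; the function names `grade_to_class` and `class_shares` and the example scores are illustrative assumptions, not part of the dataset.

```python
from collections import Counter


def grade_to_class(score, pass_mark=60, high_mark=90):
    """Map a 0-100 score to 'L', 'M' or 'H' using the intervals in the text."""
    if not 0 <= score <= 100:
        raise ValueError("score must lie in [0, 100]")
    if score < pass_mark:
        return "L"
    if score < high_mark:
        return "M"
    return "H"


def class_shares(labels):
    """Return the fraction of samples that fall into each grade category."""
    counts = Counter(labels)
    return {c: counts[c] / len(labels) for c in ("L", "M", "H")}


# Hypothetical scores chosen to hit the category boundaries.
labels = [grade_to_class(s) for s in (42, 60, 89, 90)]
shares = class_shares(labels)
```

Computing the shares this way on the real labels is how a class distribution such as the one in Figure 4 would be obtained.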
The second classification dataset is the student performance dataset from the UCI Machine Learning Repository, which consists of 395 student records in mathematics and contains 33 variables related to student achievement. The dataset was collected from two Portuguese schools in the 2005-2006 school year, using school reports and questionnaires. The factors that affect final student performance in the dataset include demographic metrics (e.g., mother's education, household income), social/emotional associations (e.g., alcohol consumption), and school-related expectations (e.g., number of exam failures). The dataset is used for the binary classification, five-level classification and regression tasks listed in Table 1. The raw values in Table 1 are the true values in the dataset. The classifications in Table 1 refer to the 5 levels of the five-level classification system on a 20-point scale. The codes in Table 1 represent these 5 levels in a form that is convenient for programming.
We analyzed the proportion of students with different scores in the student performance dataset. As can be seen from Figure 5, students with excellent grades account for 6.09% of the total, while 46.95% of the students fail. The dataset is composed of 395 student records, which are divided into a training dataset and a test dataset. The training dataset consists of 316 records, accounting for 80% of the samples, and the test dataset consists of 79 records, accounting for 20%.
To find the best prediction model for student achievement, the training dataset is used to optimize the hyperparameters in the model. The test dataset is used to provide the best solution as the final output of the proposed model, and demonstrate the performance of the proposed model.

Evaluation Indicators
Three evaluation metrics including Precision, Recall, and Accuracy are used to evaluate the performance of all experimental algorithms. In multi-label classification, these metrics are used as the proportion of predicted results that exactly match the corresponding groundtruth results. The detailed definitions of these indicators are described below.
Consider a binary classification problem in which the samples belong to two categories, positive and negative. There are then 4 combinations of model prediction results, namely True Positive (TP), False Positive (FP), False Negative (FN) and True Negative (TN), as shown in Table 2. TP indicates that a sample is of the positive class and the predicted class is also positive. If the sample is of the negative class and the predicted class is positive, it is called FP. Correspondingly, if the sample is of the positive class and the predicted class is negative, it is called FN. A sample of the negative class that is predicted as negative is called TN. Precision is the proportion of samples predicted as positive by the model that are truly positive, as shown in Equation (1).
Precision = TP / (TP + FP)    (1)
Recall indicates how many samples with positive labels are correctly predicted by the model, as shown in Equation (2). Equations (1) and (2) differ only in the denominator.
Recall = TP / (TP + FN)    (2)
Accuracy is the ratio of correctly predicted samples to the total number of samples, as shown in Equation (3). The denominator in Equation (3) is always the total number of samples, and the numerator is the number of samples whose predicted value equals the true value. The definition extends easily to multi-class cases, such as 10 classes, where the numerator is the sum over all classes of the samples whose predicted value equals the actual value.
Accuracy = (TP + TN) / (TP + TN + FP + FN)    (3)
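The three indicators can be computed directly from the confusion-matrix counts defined above. The following sketch is a plain implementation of the standard definitions; the function names and the example counts are illustrative assumptions and do not come from the experiments.

```python
def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """Share of actual positives that the model predicts as positive."""
    return tp / (tp + fn) if tp + fn else 0.0


def accuracy(tp, tn, fp, fn):
    """Share of all samples whose predicted class equals the true class."""
    return (tp + tn) / (tp + tn + fp + fn)


def multiclass_accuracy(y_true, y_pred):
    """Multi-class accuracy: matching predictions divided by all samples."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical counts for a binary problem with 100 samples.
p = precision(tp=40, fp=10)
r = recall(tp=40, fn=5)
a = accuracy(tp=40, tn=45, fp=10, fn=5)

# Hypothetical three-class predictions using the 'L', 'M', 'H' labels.
m = multiclass_accuracy(["L", "M", "H", "M"], ["L", "M", "M", "M"])
```

For a multi-class task, precision and recall are usually reported per class by treating that class as positive and all other classes as negative.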

Experimental Conditions
All experimental algorithms are compared with the proposed algorithm, including logistic regression, decision tree, XGBoost, AdaBoost, neural network, SVM. The solution performance of all experimental algorithms is calculated according to the above indicators.
Simulations for all experiments were run on a Windows 10 Pro host with dual Intel Xeon Platinum 8160 processors (33 MB cache, 2.10 GHz) and 160 GB of physical RAM. Together, the two Xeon CPUs provide 48 cores and 96 threads. All experimental algorithms are implemented in Python 3.10 using PyCharm Community Edition.

Comparing the Results of All Experimental Algorithms
The proposed model is compared with other experimental models on the above two datasets. To further illustrate their differences, three evaluation metrics are used to evaluate these algorithms. The experimental results on the two datasets are discussed in detail below.

Comparing Results with All Experimental Algorithms on the xAPI-Edu-Data Datasets
xAPI-Edu-Data is chosen to test the performance of all experimental models. To ensure that all experimental models run successfully, the xAPI-Edu-Data dataset is analyzed first. The data content of xAPI-Edu-Data is visualized so that the characteristics of the dataset can be understood intuitively. Figure 6 shows the relationship between student performance and gender for this dataset. It can be clearly seen that far fewer female students than male students fail. Almost the same number of female students achieved 'M' and 'H', and the numbers of male and female students in 'H' do not differ much. In 'M', however, there are more male students than female students.
A heatmap is a very popular way of displaying data. It uses blocks of different colors to divide all attributes of the dataset into different hierarchical intervals, so that the data can be displayed visually by partition. A heatmap of the student attributes of xAPI-Edu-Data is shown in Figure 7. Figure 7 visually shows the degree of correlation between student attributes. It is not difficult to find that the attributes with the most noticeable correlations include 'Relation', 'raisedhands', 'VisITedResources', 'AnnouncementsView', 'Discussion', 'ParentAnsweringSurvey', 'ParentschoolSatisfaction', and 'StudentAbsenceDays'.
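A correlation heatmap of this kind is typically drawn from a pairwise correlation matrix. The sketch below computes Pearson correlations between numeric columns; it assumes that categorical attributes have already been encoded as numbers, and the column values are hypothetical toy data rather than values from xAPI-Edu-Data. The text does not state which correlation measure underlies Figure 7, so the use of Pearson correlation is also an assumption.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def correlation_matrix(columns):
    """Return a nested dict holding the correlation of every column pair."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical toy values for three numeric attributes.
toy = {
    "raisedhands": [10, 40, 70, 90],
    "Discussion": [20, 35, 60, 80],
    "absence_days": [9, 7, 3, 1],
}
matrix = correlation_matrix(toy)
```

Each cell of the heatmap then corresponds to one entry of `matrix`, with the color scale running from -1 (strong negative correlation) to 1 (strong positive correlation).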
Next, to further analyze the key features of the dataset, feature discovery is performed on the training dataset using the proposed model. The bar graphs in Figure 8 show how important these features are. From the experimental results in Figures 6-8, it is concluded that the following attributes are strongly correlated with student performance: 'AnnouncementsView', 'Discussion', 'raisedhands', 'VisITedResources', 'Topic', 'PlaceofBirth', 'StudentAbsenceDays', 'Gender' and 'ParentAnsweringSurvey'. In addition, the data show that students who are absent for more than 7 days rarely get high marks, and those who are absent for fewer than 7 days rarely fail.
Following this analysis of xAPI-Edu-Data, the proposed model is compared with the other experimental models on three evaluation indicators: Precision, Recall and F1-score. Table 3 shows the simulation results of all experimental models on the xAPI-Edu-Data dataset. The test accuracy of every experimental algorithm is above 71%. On Precision, Logistic Regression obtains the best result on 'L', the proposed algorithm achieves the best result of 0.85 on 'M', and AdaBoost is superior to the other experimental algorithms on 'H'. On Recall, Logistic Regression again obtains the best result on 'L', the proposed algorithm outperforms all other experimental algorithms on 'M', and the result of AdaBoost is as high as 0.91 on 'H'. On F1-score, Logistic Regression is again best on 'L', the proposed algorithm is again best on 'M', and XGBoost and AdaBoost achieve the same result on 'H' and outperform the other experimental algorithms.
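For reference, the per-class indicators used in this comparison follow the standard definitions: Precision = TP / (TP + FP), Recall = TP / (TP + FN), and F1 is the harmonic mean of the two. The sketch below computes them, together with overall accuracy, for a three-class problem; the labels and predictions are hypothetical toy values, not results from Table 3.

```python
def per_class_metrics(y_true, y_pred, labels):
    """Compute precision, recall and F1 for each class label."""
    metrics = {}
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical labels for eight students (illustration only).
y_true = ["L", "L", "M", "M", "M", "H", "H", "H"]
y_pred = ["L", "M", "M", "M", "H", "H", "H", "M"]

scores = per_class_metrics(y_true, y_pred, labels=["L", "M", "H"])
for label, values in scores.items():
    print(label, {k: round(v, 3) for k, v in values.items()})
print("accuracy", accuracy(y_true, y_pred))
```

Because the indicators are computed per class, one algorithm can lead on 'L' while another leads on 'M' or 'H', as observed in Table 3.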
In summary, Logistic Regression gives the best classification results on 'L'. The proposed algorithm outperforms some of the experimental algorithms on the individual levels, reaches an accuracy of 0.84375, and gives the best classification results on 'M' in particular. These results are attributed to the membrane structure and reaction rules that the proposed algorithm uses to optimize the spiking neural network: these mechanisms balance exploration and exploitation well, and they help the proposed algorithm escape local optima and approach the global optimal solution.
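The balance between exploration and exploitation can be illustrated with a generic evolutionary search over hyperparameters. The sketch below is NOT the evolutionary membrane algorithm of this paper: it is a plain (mu + lambda) evolutionary loop with a shrinking mutation step, and the objective function, the hyperparameter names `threshold` and `learning_rate`, and their bounds are all hypothetical stand-ins for "train the spiking neural network and measure its validation error".

```python
import random

# Assumed search ranges for two hypothetical hyperparameters.
BOUNDS = {"threshold": (0.1, 2.0), "learning_rate": (0.001, 1.0)}


def toy_validation_error(params):
    """Stand-in objective with its minimum at threshold=1.0, learning_rate=0.1."""
    return (params["threshold"] - 1.0) ** 2 + (params["learning_rate"] - 0.1) ** 2


def random_candidate(rng):
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in BOUNDS.items()}


def mutate(parent, rng, scale):
    """Add Gaussian noise to every hyperparameter and clip to the bounds."""
    child = {}
    for key, (lo, hi) in BOUNDS.items():
        value = parent[key] + rng.gauss(0.0, scale * (hi - lo))
        child[key] = min(hi, max(lo, value))
    return child


def evolve(objective, generations=40, mu=5, lam=20, seed=0):
    rng = random.Random(seed)
    population = [random_candidate(rng) for _ in range(mu)]
    for g in range(generations):
        # Large steps early (exploration), small steps late (exploitation).
        scale = 0.3 * (1 - g / generations) + 0.01
        children = [mutate(rng.choice(population), rng, scale) for _ in range(lam)]
        # Elitist selection: keep the mu best of parents and children.
        population = sorted(population + children, key=objective)[:mu]
    return population[0]


best = evolve(toy_validation_error)
print(best, toy_validation_error(best))
```

In a real setting each objective evaluation would involve training and validating a model, so the population size and number of generations would have to be chosen with the training cost in mind.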

Comparing Results with All Experimental Algorithms on the Student Performance Datasets
In this section, the student performance dataset from the UCI machine learning repository is selected as the experimental data to further validate the advantages of the proposed model in comparison with other experimental models.
First, the relationship between student achievement and gender is again analyzed. As Figure 9 shows, the grades of male and female students are very similar, although more female than male students fail. This suggests that gender has little effect on final grades in the student performance dataset. Next, a heatmap is used to further analyze the interaction between the different student attributes, as shown in Figure 10. Figure 10 shows that the target attribute 'G3' has a strong correlation with the attributes 'G1' and 'G2'. This is because 'G3' is the final year grade (issued in the third semester), while 'G1' and 'G2' correspond to the first and second semester grades. The feature importance curve is shown in Figure 11. According to Figure 11, it is difficult to predict 'G3' without 'G1' and 'G2'. However, it may be possible to predict 'G3' successfully in the absence of 'G1' and 'G2' if more important data features can be found from other datasets. From the experimental results in Figures 9-11, it can be concluded that there is a strong correlation between 'G1', 'G2' and students' final grades. Student achievement is also related to attributes such as 'school', 'age', 'famsize', 'Medu', 'Mjob', 'reason', 'traveltime', 'failures', 'famsup', 'activities', 'higher', 'romantic', 'freetime', 'Dalc', 'health' and 'absences'.
On this basis, the proposed and comparative models are run on the dataset and evaluated using three metrics. Table 4 reports the comparison with all experimental algorithms on the student performance dataset. 'A', 'B', 'C', 'D' and 'E' in Table 1 stand for excellent/very good, good, satisfactory, pass and fail, respectively.
To verify the advantages of the proposed algorithm, it is run on the student performance dataset and its quantitative results are compared with those of the other experimental algorithms in Table 4.
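Turning the numeric final grade 'G3' into the five levels 'A' to 'E' requires a binning rule. The cut-offs are not stated in this section, so the sketch below assumes a commonly used five-level binning of the 0-20 scale (16-20, 14-15, 12-13, 10-11, 0-9); both the thresholds and the function name `grade_level` are assumptions for illustration.

```python
def grade_level(g3):
    """Map a final grade on the 0-20 scale to a level from 'A' to 'E'.

    The thresholds are ASSUMED for illustration; the paper's own
    cut-offs may differ.
    """
    if not 0 <= g3 <= 20:
        raise ValueError("G3 is expected to lie between 0 and 20")
    if g3 >= 16:
        return "A"  # excellent/very good
    if g3 >= 14:
        return "B"  # good
    if g3 >= 12:
        return "C"  # satisfactory
    if g3 >= 10:
        return "D"  # pass
    return "E"      # fail


# Hypothetical final grades for six students.
final_grades = [18, 15, 12, 10, 9, 0]
print([grade_level(g) for g in final_grades])
```

With such a rule the regression target becomes a five-class classification target, which is what the per-class Precision, Recall and F1-score in Table 4 are computed on.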
In terms of Precision, XGBoost, SVM and the proposed algorithm share the best result on 'A'; SVM and the proposed algorithm achieve the best result of 0.92 on 'B'; AdaBoost obtains the best result on 'C'; and the proposed algorithm is better than the other experimental algorithms on 'D' and 'E'. For Recall, SVM obtains the best result on 'A'; XGBoost, AdaBoost, SVM and the proposed algorithm share the best result on 'B'; Decision Trees, AdaBoost and the proposed algorithm achieve the best results on 'C'; AdaBoost has the best result on 'D'; and XGBoost has the best result on 'E'. For F1-score, SVM scores 1 on 'A', the best result among all experimental algorithms; SVM and the proposed algorithm obtain the same result on 'B', which is better than that of the other experimental algorithms; AdaBoost has the best result on 'C'; and the proposed algorithm has the best result on 'D' and 'E'. Finally, the accuracy achieved by the proposed algorithm is 0.814, which is better than that of all the other experimental algorithms.
Six experimental algorithms were selected for comparison on the xAPI-Edu-Data dataset and the student performance dataset, and the comparison verifies the effectiveness of the proposed algorithm. Tables 3 and 4 show that the proposed algorithm has better classification accuracy, especially on the xAPI-Edu-Data dataset. However, when classifying attributes with few samples, the prediction accuracy of the proposed algorithm is not significantly better than that of the comparison algorithms. In other words, the advantages of the proposed algorithm are more obvious on attributes with more samples.
These experimental results indicate that the proposed algorithm makes effective use of the evolutionary membrane algorithm, including its objects, reaction rules and membrane structures, and that it is effective for training the hyperparameters of spiking neural networks. The proposed model achieves good performance, but it has some shortcomings. First, although the evolutionary spiking neural network method proposed in this paper can automatically learn effective representations of samples and achieve good model performance, its interpretability is limited. Second, this paper uses only standard student achievement datasets, and only the basic attribute information and grades of students; the content of course knowledge points and compulsory courses is not explored or used. Training models on a comprehensive set of multivariate data related to student performance would facilitate in-depth research on evolutionary spiking neural networks.
To sum up, this paper presents an in-depth and systematic study of student achievement prediction. The focus of the work is to improve the predictability and accuracy of the proposed model, which predicts student achievement with an evolutionary spiking neural network.

Conclusions
With the advancement of educational data analysis, colleges and universities have gradually accumulated massive educational data resources. How to mine valuable information from these data and use it to provide better service and support for education and teaching has become an urgent problem. In this context, the research direction of educational data mining emerged. As one of its important branches, student achievement prediction has received extensive attention, and many scholars have carried out fruitful work. However, there is still much room for improvement in the predictability and accuracy of existing work on student achievement prediction. Therefore, this paper applies data mining based on the proposed evolutionary spiking neural network to the field of education, taking student achievement as the research object, studying the attribute data of students, and seeking a method to predict student achievement. The effectiveness of the proposed algorithm is verified by simulations on two benchmark datasets. The main work of this paper includes:
• Student information is analyzed and preprocessed. A comprehensive understanding of student attribute data is necessary. On the one hand, this means understanding the structure of the data in the student dataset and transforming the raw data into a form that all experimental algorithms can use. On the other hand, it enables better feature extraction to find the key elements in student data that affect final student achievement.
• On the basis of a comprehensive understanding of all experimental algorithms, a student achievement prediction model based on an evolutionary spiking neural network is established. Six different data mining algorithms for student achievement are compared with the proposed prediction algorithm, and the effect of each prediction model is analyzed.
• A specific application case of a student achievement prediction model is presented, which lays a foundation for the wider application of student achievement prediction in the future and aims to provide new perspectives and ideas for the application of data mining in the field of education.
• The proposed model realizes the prediction of student achievement. It can provide effective technical support for teaching management work, such as teaching students in accordance with their aptitude in the early stage of a course and academic early warning, thus providing a theoretical basis and technical support for the management of students in colleges and universities.
As academic difficulty increases, the requirements placed on college students will become higher and higher. Colleges and universities will focus on improving the quality of undergraduate teaching, deepening education reform and strengthening the management of the teaching process, with the ultimate goal of effectively improving students' learning outcomes and the quality of talent training. As an important technical means of managing students' academic progress, the proposed student achievement model can promote timely feedback on students' academic problems and enable their early detection, early treatment and early resolution. The student achievement prediction model proposed in this paper uses student behavior data to produce a predicted grade and to give an early warning ahead of the final exam, so that problems can be prevented before they occur. The research results of this paper use students' behavior data to judge their current learning status, to judge changes in test scores and whether an early warning is needed, and to decide whether to communicate with a student according to the student's learning situation and the severity predicted by the model. Applying the work of this paper to the academic management of colleges and universities can not only identify students with academic problems, but also detect in time students who tend to fail, so that these students can be prevented from failing their subjects. In short, the application of data mining technology in the field of education is still in the exploratory stage, and it is also a popular research direction in current data mining applications. Although this research still has many shortcomings, much information worth mining remains to be discovered in student behavior data.
With the further development of future research, work in related fields will inevitably make greater breakthroughs and achieve greater results for the application of data mining in the field of education.