Article

A Differential Evolutionary-Based XGBoost for Solving Classification of Physical Fitness Test Data of College Students

1 College of Electronics and Information Engineering, Beibu Gulf University, Qinzhou 530053, China
2 College of Physical Education, Beibu Gulf University, Qinzhou 535000, China
3 Beibu Gulf Ocean Development Research Center, Beibu Gulf University, Qinzhou 535011, China
* Author to whom correspondence should be addressed.
Mathematics 2025, 13(9), 1405; https://doi.org/10.3390/math13091405
Submission received: 8 March 2025 / Revised: 7 April 2025 / Accepted: 18 April 2025 / Published: 25 April 2025

Abstract

The physical health of college students is an important basis for societal development, which directly impacts the competitiveness of future talents and the overall vitality of the nation. To accurately and timely identify the physical health status of college students, a hybrid model of DE-XGBoost is proposed in this study: a discrete coding strategy is designed to solve the XGBoost hyperparameter optimization problem, and differential evolution (DE) is used to achieve global parameter optimization. Based on 20,452 physical test records of a university in 2022, the empirical comparison shows that the accuracy rate, recall rate, and F1 value of the model are improved by 3.5–7.9% compared with support vector machine (SVM), gradient boosting machine (GBM), and multi-layer perceptron (MLP), showing significant performance advantages. This research provides a novel and efficient framework for physical fitness classification, with potential applications in educational curriculum design.

1. Introduction

The physical health of college students not only concerns their personal growth and development but also directly affects the vitality and progress of society [1]. However, there is a general decline in physical fitness, with insufficient exercise, and mental health problems among college students, which pose potential threats to both individuals and society [2]. Therefore, accurately and quickly identifying the physical health status of college students has emerged as a critical issue that necessitates attention within the realm of higher education. By focusing on and improving the physical health of college students, we can not only enhance their overall quality of life but also lay a solid foundation for the sustainable development of society [3].
The data of physical fitness tests for college students encompasses crucial characteristic information, such as height, weight, and lung capacity [4]. Generally speaking, college students’ physiques can be divided into excellent, good, pass, and fail. Hence, the determination of physical fitness pertains to the issue of multi-classification of data. Common classification algorithms include SVM, GBM, and MLP. SVM classification is appropriate for handling small sample data in high-dimensional patterns; however, it is significantly influenced by parameters and is sensitive to noise and outliers [5]. GBM, by gradually constructing and integrating multiple weak classifiers, can effectively handle non-linear problems, yet its training duration is relatively long and it is highly sensitive to the selection of hyperparameters [6]. MLP, as a classic neural network model, is applicable for handling high-dimensional data and non-linear classification tasks, but it is prone to overfitting and demands a substantial amount of data and computing resources for training [7]. In addition, XGBoost is also a relatively popular machine learning method [8]. XGBoost is a machine learning algorithm with superior parallel computing capabilities and high computational accuracy [9]. Based on gradient boosting decision trees, XGBoost improves model performance by optimizing the objective function and introducing regularization terms. It is widely used in classification, regression, ranking, and other tasks, and is often used in bioinformatics [10], financial risk control [11,12], recommendation systems [13], and fault diagnosis [14]. However, XGBoost may be inefficient when dealing with high-dimensional sparse data, and parameter adjustments are complicated, requiring different parameter adjustments for different datasets [15].
In recent years, the integration of evolutionary algorithms and machine learning methods has become a new research trend. For example, genetic algorithms (GA) evaluate and iteratively optimize peptide sequences through machine learning, thereby reducing the cost of experimental validation [16]. Evolution strategies (ES) use a data-driven approach to optimize model parameters for the preprocessing phase [17]. Particle swarm optimization (PSO) relies on swarm intelligence to search for the best solution in the solution space but is susceptible to local optima [18]. Ant colony optimization (ACO) can improve the training speed of support vector machines (SVM) [19]. The simulated annealing algorithm (SA) avoids local optima by accepting suboptimal solutions probabilistically, but its convergence is relatively slow [20]. Although these algorithms perform well in specific scenarios and are suitable for solving different types of problems, their application to physical test data classification problems is still limited. For example, the parameters in physical fitness test data are mostly continuous values, and global optimization in high-dimensional space needs to be handled efficiently. GA and PSO are sensitive to parameter changes and tend to fall into local optima, whereas DE requires only a small number of control parameters and enhances population diversity through a differential mutation mechanism, showing stronger robustness in complex solution spaces. At the same time, some studies have shown that DE converges faster when dealing with continuous parameter optimization problems and is not sensitive to initial values [21], so it is more suitable for XGBoost hyperparameter tuning.
During extensive research, we have discovered some literature that applies the fusion of DE and XGBoost in fields such as network anomaly detection [22] and short-term wind power prediction [23]. These successful cases have verified the unique advantages of the DE algorithm in handling high-dimensional parameter spaces and non-linear data patterns, especially the intrinsic compatibility of its floating-point number-based mutation mechanism with the gradient boosting architecture of XGBoost. However, there is still a significant gap in the application of DE and XGBoost in the specific scenario of college students’ physical health assessment. Physical health data possess complex characteristics, such as multimodal coupling (e.g., non-linear associations between physiological indicators and exercise performance) and large individual differences (distribution shifts caused by age, gender, and training levels), making it difficult for traditional optimization algorithms to precisely locate the optimal solution in the parameter space. This paper innovatively introduces the DE-XGBoost model into this field for the first time. By establishing a 10-dimensional hyperparameter space-mapping mechanism, it effectively addresses the bottlenecks of traditional methods, such as being prone to local optima and having high parameter sensitivity in the high-dimensional continuous feature space of physical fitness data, providing a new paradigm for physical health assessment.
The remainder of this paper is organized as follows: Section 1 is the introduction, which summarizes the importance of college students’ physical health and the existing problems, and briefly introduces the commonly used classification algorithms together with their advantages and disadvantages. Section 2 introduces the classification of physical test data, the basic principles and characteristics of the DE and XGBoost algorithms, and gives the motivation and background of this paper. Section 3 describes the design of the DE-XGBoost classification and recognition model in detail. Section 4 is the experimental analysis, applying the empirical data to evaluate the effectiveness and benefit of the proposed method. The last section concludes the paper, summarizing the research results and pointing out possible directions for future research.

2. Related Work

2.1. Classification of Physical Fitness Test Data

The physical health status of college students is directly associated with the growth of individuals and the vitality of society as a whole, and physical test data provide an important basis for the effective evaluation of the physical health of college students [24]. The aim of this paper is to accurately predict and judge the physical health of college students by analyzing their physical test data. The physical test data used included 11 variables, such as gender, height, weight, lung capacity, 50 m run, standing long jump, sitting forward bend, 800 m run (female), 1000 m run (male), one-minute sit-ups (female), and pull-ups (male). Based on these test results, the paper divides the students’ physical health into four grades: excellent, good, pass, and fail. Each grade is based on a comprehensive score of the test results, reflecting the student’s overall physical health level. The detailed data are presented as follows (Table 1):

2.2. Differential Evolution

DE is an efficient swarm intelligence optimization algorithm proposed by Rainer Storn and Kenneth Price in 1997 [25]. It draws on the mutation, crossover, and selection mechanisms of natural evolution to search for optimal solutions in complex solution spaces [26].
In the initial stage of DE, the algorithm first randomly generates an initial population in the search space. The population consists of NP individuals, each of which is a D-dimensional vector.
In the mutation stage, the algorithm randomly selects two distinct individuals from the population and adds the weighted difference vector to the third randomly selected individual to generate a mutation variant. This process can introduce new solutions, thereby increasing the diversity of the population. Common variation strategies are as follows [27,28]:
DE/rand/1:
V_i = x_{r_1} + F \times (x_{r_2} - x_{r_3})  (1)
DE/best/1:
V_i = x_{best} + F \times (x_{r_1} - x_{r_2})  (2)
DE/rand/2:
V_i = x_{r_1} + F \times (x_{r_2} - x_{r_3}) + F \times (x_{r_4} - x_{r_5})  (3)
DE/best/2:
V_i = x_{best} + F \times (x_{r_1} - x_{r_2}) + F \times (x_{r_3} - x_{r_4})  (4)
where the scaling factor F is a constant that controls the scaling degree of the difference vector; r_1, r_2, r_3, r_4, r_5 are the indices of five distinct individuals randomly selected from the population; and x_{best} is the best individual in the current population.
The crossover phase aims to combine the mutant individual with the original individual to produce a trial vector. In this process, each dimension inherits its value from the variant individual with a certain probability (cross probability CR), otherwise, it retains the value of the original individual. In this way, the trial vector combines information from the original individual and the mutant individual.
U_{i,j} = \begin{cases} V_{i,j}, & \text{if } \mathrm{rand}(0,1) \le CR \text{ or } j = j_{rand} \\ X_{i,j}, & \text{otherwise} \end{cases}  (5)
where CR is the crossover rate and j_{rand} is a randomly selected dimension index, which guarantees that at least one dimension of U_i is inherited from V_i.
In the selection phase, the algorithm evaluates the fitness values of both the trial vector and the original individual. If the fitness value of the trial vector is superior, it is used to replace the original individual. Otherwise, the original individual remains unchanged. This greedy selection mechanism guarantees monotonic improvement in population quality [29].
X_i = \begin{cases} U_i, & \text{if } f(U_i) \ge f(X_i) \\ X_i, & \text{otherwise} \end{cases}  (6)
The differential mutation mechanism of DE makes it exhibit unique adaptability when dealing with continuous high-dimensional parameter optimization problems. The parameters in physical test data (such as height, weight, lung capacity, etc.) are all continuous variables, and their optimization space has the characteristics of being smooth and globally correlated. DE generates mutant individuals through the linear combination of vector differences (as shown in Equations (1)–(4)), which can effectively utilize the linear relationships among parameters and thus achieve efficient exploration within the continuous solution space. Additionally, DE has a relatively low dependence on hyperparameters (such as the scaling factor F and crossover probability CR). This feature makes DE more suitable for the complex hyperparameter combination optimization of XGBoost (with a total of 10 parameters), avoiding the problem of algorithm instability caused by parameter coupling.
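To make the DE operators above concrete, the following is a minimal NumPy sketch of one DE generation using the DE/rand/1 mutation of Equation (1), the binomial crossover of Equation (5), and the greedy selection of Equation (6) for a maximization problem. The bounds, toy objective, and parameter values are illustrative assumptions, not the exact settings used in this paper.

```python
import numpy as np

def de_generation(pop, fitness, objf, bounds, F=0.5, CR=0.5, rng=None):
    """One DE/rand/1 generation: mutation, binomial crossover, greedy selection."""
    if rng is None:
        rng = np.random.default_rng()
    NP, D = pop.shape
    low, high = bounds
    new_pop, new_fit = pop.copy(), fitness.copy()
    for i in range(NP):
        # Mutation (Eq. 1): three distinct individuals different from the target i
        r1, r2, r3 = rng.choice([k for k in range(NP) if k != i], size=3, replace=False)
        v = np.clip(pop[r1] + F * (pop[r2] - pop[r3]), low, high)
        # Binomial crossover (Eq. 5): at least one dimension comes from the mutant
        j_rand = rng.integers(D)
        mask = rng.random(D) <= CR
        mask[j_rand] = True
        u = np.where(mask, v, pop[i])
        # Greedy selection (Eq. 6): keep the fitter of trial and target (maximization)
        fu = objf(u)
        if fu >= fitness[i]:
            new_pop[i], new_fit[i] = u, fu
    return new_pop, new_fit

# Toy usage on an illustrative objective (maximize the negative sphere function)
rng = np.random.default_rng(0)
bounds = (-5.0, 5.0)
pop = rng.uniform(*bounds, size=(15, 10))
objf = lambda x: -np.sum(x ** 2)
fitness = np.array([objf(ind) for ind in pop])
for _ in range(25):
    pop, fitness = de_generation(pop, fitness, objf, bounds, rng=rng)
print("best fitness:", fitness.max())
```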

2.3. XGBoost Algorithm

The XGBoost algorithm is an optimized gradient-boosting decision tree algorithm [30]. It is an efficient machine learning algorithm based on a gradient-boosting decision tree (GBDT) developed by Chen T.Q. [31], which optimizes the performance and operation efficiency of GBDT. The core of the algorithm is to construct the objective function and optimize the function iteratively to obtain the best model. The objective function of the XGBoost algorithm comprises two components: the loss function and the regularization term. The loss function serves to quantify the discrepancy between the model’s predicted values and the actual outcomes, whereas the regularization term is employed to regulate model complexity and mitigate overfitting [32]. The formulation of the objective function can be expressed as follows:
obj = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)  (7)
where y_i is the classification label of sample x_i, \hat{y}_i is the model's predicted value, \Omega is the regularization term, f_k is the model of the k-th tree, and l is the loss function. The regularization term is typically expressed as a function of the number and weights of the leaf nodes and is used to penalize complex models; it expands to the following:
\Omega(f_k) = \gamma M + \frac{1}{2} \lambda \sum_{j=1}^{M} w_j^2  (8)
where M represents the total number of leaf nodes in the classification regression tree model, w_j is the weight assigned to the j-th leaf node, and \gamma and \lambda are penalty coefficient constants. At the t-th boosting iteration, the objective function can be expressed as follows:
obj^{(t)} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t)  (9)
The Taylor expansion formula is used for the quadratic approximation of the above expression, the regularization term is integrated into it, and the final expression of the objective function is obtained.
obj^{(t)} \approx \sum_{i=1}^{n} \left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \gamma M + \frac{1}{2} \lambda \sum_{j=1}^{M} w_j^2  (10)
where g_i and h_i denote the first- and second-order derivatives of the loss function l(y_i, \hat{y}_i^{(t-1)}) with respect to \hat{y}_i^{(t-1)}, respectively.
During boosting iteration, the XGBoost algorithm adopts a greedy strategy, trying to add a new decision tree to the existing model in each iteration to minimize the objective function. To identify the optimal structure of the decision tree, the algorithm utilizes a second-order Taylor expansion of the loss function to obtain a more accurate approximation. In this way, at each iteration, the algorithm chooses a decision tree structure that minimizes the value of the objective function.
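As a brief illustration of how the regularization terms in Equations (7), (8), and (10) are exposed in practice, the sketch below fits an XGBClassifier whose gamma and reg_lambda arguments correspond to the leaf-count penalty γ and the leaf-weight penalty λ. The synthetic data and the specific parameter values are assumptions for demonstration only, not the settings used in this study.

```python
# Minimal sketch (assumed demo data): gamma ~ γ penalizes extra leaves,
# reg_lambda ~ λ penalizes large leaf weights in the objective of Eq. (10).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=2000, n_features=11, n_classes=4,
                           n_informative=6, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

model = XGBClassifier(
    n_estimators=100,        # number of boosted trees K
    max_depth=6,             # tree complexity
    learning_rate=0.3,       # shrinkage applied to each f_t
    gamma=0.1,               # γ: minimum loss reduction, penalizing extra leaves
    reg_lambda=1.0,          # λ: L2 penalty on leaf weights w_j
    objective="multi:softprob",
    eval_metric="mlogloss",
)
model.fit(X_tr, y_tr)
print("test accuracy:", model.score(X_te, y_te))
```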

2.4. Motivation

When XGBoost methods solve classification problems, different parameter settings may lead to significant differences in model accuracy [33]. In order to ensure the effectiveness and robustness of the machine learning algorithm, this paper proposes to use DE to optimize the main parameters of XGBoost to solve the classification problem of college students’ physical test data, so as to improve the accuracy and performance of the algorithm. As a global optimization method based on population difference, DE shows high efficiency and robustness in complex search space, and can effectively avoid falling into local optimal solutions. In addition, in current studies on physical test data, few combine DE and XGBoost for parameter optimization and model construction. Therefore, the research in this paper may not only provide more accurate tools for the physical health assessment of college students but also seeks to introduce innovative ideas and methodologies for the application of machine learning in related fields.

3. Proposed Method

DE-XGBoost is an optimization method that combines DE and XGBoost to automatically search for the best combination of XGBoost hyperparameters through DE. In DE-XGBoost, each individual is encoded as a vector of length 10 corresponding to the 10 key hyperparameters of XGBoost, including max_depth, n_estimators, learning_rate, and so on. The whole implementation process of DE includes four main steps: population initialization, mutation, crossover, and selection. First, the initial population is randomly generated, with each individual representing a set of hyperparameters. Then, new candidate solutions are generated by the mutation operation; the mutation strategy used in this paper is DE/best/1. Next, the mutated candidate solution is combined with the original individual through the crossover operation to generate a trial individual. Finally, the selection operation compares the fitness of the trial individual and the original individual, that is, the accuracy of the corresponding XGBoost model, and the individual with higher fitness is retained. After many iterations, DE gradually optimizes the hyperparameter combination and finally finds the hyperparameter configuration that makes the XGBoost model perform best.

3.1. Individual Coding

The design of individual coding is one of the cores of the DE-XGBoost framework, which enables DE to perform efficient global search by mapping hyperparameter combinations to real number vectors. In the DE-XGBoost framework, the individual encoding of DE takes the form of real number vectors. The expression of individual coding is as follows (Figure 1):
Each individual represents a set of 10 key hyperparameter configurations for the XGBoost model: max_depth (x_1), n_estimators (x_2), learning_rate (x_3), gamma (x_4), min_child_weight (x_5), subsample (x_6), colsample_bytree (x_7), colsample_bylevel (x_8), reg_alpha (x_9), and reg_lambda (x_10). These hyperparameters respectively control the tree structure, the learning process, the regularization intensity, and the sampling strategy for features and samples in the XGBoost model, and they significantly influence the model’s performance and generalization capability [34]. The specific ranges are described as follows (Table 2):
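The following is a minimal sketch of how such a 10-dimensional real-valued individual could be decoded into an XGBoost parameter dictionary using the ranges in Table 2. The helper name, the rounding of integer-valued parameters, and the handling of degenerate values are illustrative assumptions rather than the authors' exact implementation.

```python
# Hypothetical helper: map a DE individual x (raw values within the Table 2 ranges)
# to XGBoost keyword arguments. Integer parameters are rounded; in practice a small
# positive lower bound for rates and sampling ratios avoids degenerate settings.
def decode_individual(x):
    return {
        "max_depth":         int(round(x[0])),   # 1–15, integer
        "n_estimators":      int(round(x[1])),   # 100–200, integer
        "learning_rate":     float(x[2]),        # 0–1
        "gamma":             float(x[3]),        # 0–1
        "min_child_weight":  float(x[4]),        # 1–15
        "subsample":         float(x[5]),        # 0–1
        "colsample_bytree":  float(x[6]),        # 0–1
        "colsample_bylevel": float(x[7]),        # 0–1
        "reg_alpha":         float(x[8]),        # 0–1
        "reg_lambda":        float(x[9]),        # 0–1
    }
```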

3.2. Algorithm Flow

In DE-XGBoost’s algorithmic flow, the data file is first read and preprocessed, including filling in missing values and separating features from labels. During model construction, to enhance both training and testing efficacy, 70% of the sample data points are designated as the training set for model development, while the remaining 30% are allocated as the test set to evaluate the classification accuracy of the model. Then, the key parameters of DE are set: the population size (NP) is 15, the dimension (Dim) is 10, the scaling factor (F) is 0.5, the crossover probability (CR) is 0.5, and the maximum number of iterations (Total_Iterations) is 25; the initial population is then generated randomly, with each individual representing a set of hyperparameter combinations for the XGBoost model.
Next, the objective function objf is defined: an individual is evaluated by training the XGBoost model it encodes and using the resulting accuracy on the test set as its fitness value (a minimal illustrative sketch of objf is given below). In the iterative process, DE avoids local optima by introducing stochastic differential-vector perturbation and a population-diversity maintenance mechanism. In the mutation stage, the perturbation vector is generated from randomly selected individual differences (such as pop[r2]-pop[r3]), and the scaling factor F enlarges the exploration range; this random perturbation can break out of the attraction region of the current optimal solution. In the crossover stage, population diversity is maintained by retaining, with probability Cr, some of the newly generated parameter values. In the selection stage, a greedy strategy preserves the better solution, so the algorithm dynamically balances exploiting the current optimum and exploring unknown regions, thereby effectively escaping local optima. This process continues until the preset termination condition is met. The XGBoost model is then trained with the optimized hyperparameters, and its performance is assessed by random upsampling, cross-validation, confusion matrix analysis, ROC curve plotting, AUC value calculation, and other evaluation methods. The overall algorithm framework is shown in Figure 2. The individual update process in DE-XGBoost is shown in Algorithm 1.
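A minimal sketch of what the fitness function objf could look like under these settings is given below: it decodes an individual (using the hypothetical decode_individual helper sketched in Section 3.1), trains an XGBoost classifier on the 70% training split, and returns the accuracy on the 30% test split. Variable names such as X_train and y_test are assumptions introduced for illustration.

```python
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier

def objf(individual, X_train, y_train, X_test, y_test):
    """Fitness of one DE individual = test-set accuracy of the induced XGBoost model."""
    params = decode_individual(individual)          # map real vector -> hyperparameters
    model = XGBClassifier(**params, objective="multi:softprob",
                          eval_metric="mlogloss", n_jobs=-1)
    model.fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test))
```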
At the same time, according to Algorithm 1, within each outer loop the mutation and crossover operations are carried out on each of the Dim dimensions, and each elementary operation costs O(1), so one individual costs O(Dim); with NP individuals, one generation costs O(NP × Dim). Since this process is repeated for Total_Iterations generations, the total time complexity of the DE part of DE-XGBoost is O(Total_Iterations × NP × Dim), excluding the cost of the XGBoost training run performed for each fitness evaluation.
Algorithm 1: Pseudocode for differential evolution hyperparameter optimization
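Since Algorithm 1 is reproduced only as an image in the original, the following is a hedged reconstruction of the individual update loop it describes, using the DE/best/1 mutation of Equation (2) with NP = 15, Dim = 10, F = 0.5, Cr = 0.5, and 25 iterations. The bound handling and variable names are assumptions, and objf and decode_individual refer to the sketches given above.

```python
import numpy as np

rng = np.random.default_rng(42)
NP, DIM, F, CR, TOTAL_ITERATIONS = 15, 10, 0.5, 0.5, 25
# Assumed per-dimension bounds loosely following Table 2 (lower, upper);
# small positive lower bounds avoid degenerate learning/sampling rates.
LOW  = np.array([1, 100, 0.01, 0.0, 1, 0.1, 0.1, 0.1, 0.0, 0.0])
HIGH = np.array([15, 200, 1.0, 1.0, 15, 1.0, 1.0, 1.0, 1.0, 1.0])

pop = rng.uniform(LOW, HIGH, size=(NP, DIM))
fitness = np.array([objf(ind, X_train, y_train, X_test, y_test) for ind in pop])

for _ in range(TOTAL_ITERATIONS):
    best = pop[np.argmax(fitness)]
    for i in range(NP):
        # DE/best/1 mutation (Eq. 2)
        r1, r2 = rng.choice([k for k in range(NP) if k != i], size=2, replace=False)
        v = np.clip(best + F * (pop[r1] - pop[r2]), LOW, HIGH)
        # Binomial crossover (Eq. 5)
        mask = rng.random(DIM) <= CR
        mask[rng.integers(DIM)] = True
        u = np.where(mask, v, pop[i])
        # Greedy selection (Eq. 6) on classification accuracy
        fu = objf(u, X_train, y_train, X_test, y_test)
        if fu >= fitness[i]:
            pop[i], fitness[i] = u, fu

best_params = decode_individual(pop[np.argmax(fitness)])
```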

4. Experimental Analysis

4.1. Data Selection and Processing

The data in this paper come from the actual sports test values of a university in 2022, and a total of 20,452 records were collected, covering the performance of students in physical fitness, skills, and comprehensive quality. To ensure both the accuracy and generalizability of the model, we systematically divided the dataset. A total of 70% of the data, that is, about 14,316 records, were allocated as the model training set. The remaining 30%, amounting to about 6136 records, were designated as the test set to validate and assess the performance and reliability of the trained model.
The existence of singular (outlier) samples in the data will, to some extent, affect the accuracy of the classification results [35]. In order to mitigate this influence and enhance the precision of the classification prediction outcomes, this paper adopts the max-min normalization method to normalize the linear variables before establishing the model [36]. The calculation formula is shown in Equation (11):
Y = \frac{x - x_{min}}{x_{max} - x_{min}}  (11)
where x represents the original data value, x_{min} is the minimum value of the feature in the dataset, x_{max} is the maximum value, and Y is the normalized data value ranging from 0 to 1.
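As a short illustration, the max-min normalization of Equation (11) can be applied per feature as in the sketch below; the column names are hypothetical placeholders for the physical test variables.

```python
import pandas as pd

def min_max_normalize(df: pd.DataFrame, columns) -> pd.DataFrame:
    """Scale each listed column to [0, 1] via Y = (x - x_min) / (x_max - x_min)."""
    out = df.copy()
    for col in columns:
        x_min, x_max = out[col].min(), out[col].max()
        out[col] = (out[col] - x_min) / (x_max - x_min)
    return out

# Hypothetical usage on a few of the physical test variables:
# df = min_max_normalize(df, ["height", "weight", "lung_capacity", "sprint_50m"])
```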

4.2. Evaluation Index

In order to further compare the training results of the model with its performance in actual application, this paper utilizes model evaluation indexes such as the accuracy rate, recall rate, precision rate, F1-score, ROC curve, and AUC value. These metrics facilitate a clearer visualization of the model’s predictive performance and generalization capability [37].
Accuracy rate is defined as the proportion of correctly predicted samples relative to the total number of samples. The higher the value, the better. Its calculation formula is shown in Equation (12):
A = \frac{TP + TN}{TP + TN + FP + FN}  (12)
where TP is the number of positive samples correctly predicted as positive, TN is the number of negative samples correctly predicted as negative, FP is the number of negative samples incorrectly predicted as positive, and FN is the number of positive samples incorrectly predicted as negative.
Recall rate is defined as the proportion of predicted positive samples among those that are actually positive. A higher value indicates better performance. Its calculation formula is shown in Equation (13):
R = \frac{TP}{TP + FN}  (13)
Precision rate refers to the proportion of samples predicted as positive that are actually positive. A higher value indicates better performance. Its calculation formula is shown in Equation (14):
P = \frac{TP}{TP + FP}  (14)
The F1-score serves as a metric for evaluating classification problems. It represents the harmonic mean of precision and recall, with values ranging from 0 to 1. The calculation formula is shown in Equation (15):
F1 = \frac{2 \times P \times R}{P + R}  (15)
In the binary classification problem, P denotes the positive class and N the negative class; T (true) means the prediction is correct and F (false) means it is not. TP denotes the count of samples that are genuinely positive and have been predicted as positive, and FN the count of samples that are genuinely positive but have been predicted as negative. For example, in the context of physical fitness testing, TP means that the actual physical fitness test result is a pass and the predicted result is also a pass; FN indicates that the actual result is a pass but the predicted result is a fail; FP indicates that the actual result is a fail but the predicted result is a pass; and TN indicates that the actual result is a fail and the predicted result is also a fail. The multi-classification problem can be transformed into the confusion matrix of the binary classification problem, as shown in Table 3.
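Under a one-vs-rest treatment of the four grade classes, these indicators can be computed as sketched below with scikit-learn; the macro-averaging choice and the function name are assumptions introduced for illustration rather than settings reported by the authors.

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)

def classification_report_4class(y_true, y_pred):
    """Accuracy (Eq. 12), recall (Eq. 13), precision (Eq. 14), F1 (Eq. 15) for 4 classes."""
    return {
        "accuracy":  accuracy_score(y_true, y_pred),
        "recall":    recall_score(y_true, y_pred, average="macro"),
        "precision": precision_score(y_true, y_pred, average="macro"),
        "f1":        f1_score(y_true, y_pred, average="macro"),
        "confusion_matrix": confusion_matrix(y_true, y_pred),
    }
```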
The ROC curve, i.e., the receiver operating characteristic curve [38], is a comprehensive indicator of the sensitivity and specificity of the model over continuous thresholds. The ordinate, TPR (true positive rate), is the proportion of actual positive samples predicted as positive, and the abscissa, FPR (false positive rate), is the proportion of actual negative samples predicted as positive. The point (0, 0) means that all samples are classified as negative, the point (1, 1) means that all samples are classified as positive, and the point (0, 1) corresponds to perfect classification. The AUC value is the area enclosed under the ROC curve and lies in [0, 1]; the closer it is to 1, the better the classification effect.
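For the multi-class setting, one common way to obtain per-class ROC curves and an overall AUC is to binarize the labels and score the predicted class probabilities, as in the brief sketch below; this is an illustrative recipe, not necessarily the exact procedure used to produce Figure 4.

```python
from sklearn.metrics import roc_curve, auc, roc_auc_score
from sklearn.preprocessing import label_binarize

def multiclass_roc(y_true, y_score, classes=(0, 1, 2, 3)):
    """Per-class (one-vs-rest) ROC curves and the macro AUC from predicted probabilities."""
    y_bin = label_binarize(y_true, classes=list(classes))
    curves = {}
    for k, label in enumerate(classes):
        fpr, tpr, _ = roc_curve(y_bin[:, k], y_score[:, k])
        curves[label] = (fpr, tpr, auc(fpr, tpr))
    macro_auc = roc_auc_score(y_true, y_score, multi_class="ovr", average="macro")
    return curves, macro_auc

# y_score would come from model.predict_proba(X_test)
```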

4.3. Algorithm Comparison

In order to verify the performance of the algorithm, the DE-XGBoost model is compared with four models, XGBoost, SVM, GBM, and MLP, for a more comprehensive evaluation. Table 4 shows the performance of the DE-XGBoost algorithm on each index under inversely paired F/Cr combinations (F + Cr = 1), and Table 5 shows the impact of the Cr value on the performance of the DE-XGBoost model when F is fixed at 0.5. Table 6 compares the performance of the XGBoost model under different optimization algorithms. The parameter settings of XGBoost, SVM, and the other models are shown in Table 7, and the evaluation index data are shown in Table 8. Table 9 presents the Friedman test results, giving each model’s performance ranking. The feature importance diagram of DE-XGBoost is shown in Figure 3. The ROC curve and AUC values of DE-XGBoost are shown in Figure 4. The confusion matrix for DE-XGBoost is shown in Figure 5. The confusion matrix normalization diagram of DE-XGBoost is shown in Figure 6. The convergence diagram of DE-XGBoost is shown in Figure 7.
Table 4 shows the impact of different parameter combinations of F and Cr on the performance of the DE-XGBoost model. The data indicate that the overall performance is optimal when F = 0.5 and Cr = 0.5 (accuracy 0.987868, F1-score 0.987829), while the performance is weakest when F = 0.1 and Cr = 0.9 (accuracy 0.984444, F1-score 0.984311). Although parameter adjustments cause fluctuations in the indicators, the maximum difference in accuracy is only 0.34%, and the overall impact is limited, especially since the suboptimal parameters (such as F = 0.3 and Cr = 0.7) differ from the optimal combination by less than 0.05%. Meanwhile, Table 5 shows that when F is fixed at 0.5, the parameter combination of F = 0.5 and Cr = 0.5 still performs optimally, while F = 0.5 and Cr = 0.9 performs the weakest (accuracy 0.986357). Notably, when Cr increases from 0.1 to 0.9, the fluctuation range of each indicator is relatively small, with the maximum difference in accuracy being about 0.0016, indicating that the adjustment of the Cr parameter also has a limited impact on model performance under the current experimental settings. Therefore, adopting the optimal parameter combination (F = 0.5, Cr = 0.5) as determined in this paper can ensure performance while avoiding the complexity and resource consumption brought by adaptive algorithms.
According to the performance index comparison results presented in Table 6, the DE-XGBoost model demonstrates significant advantages in all four core evaluation dimensions: its prediction accuracy reaches as high as 0.987868, achieving a 0.2% and 1.7% improvement in accuracy compared to the improved versions based on Particle Swarm Optimization (PSO) and Simulated Annealing (SA), respectively; the recall rate and precision rate indicators both remain consistently above 0.98, reaching excellent performance of 0.987889 and 0.988106, respectively; and the F1 harmonic coefficient even reaches the optimally balanced value of 0.987829. This set of data fully validates the outstanding adaptability of the differential evolution algorithm in the hyperparameter optimization process of XGBoost. Through its efficient swarm intelligence search mechanism, it not only achieves breakthroughs in individual indicators but also significantly outperforms the traditional PSO and SA optimization schemes in the overall improvement of the model’s comprehensive predictive ability.
As shown in Table 7, DE-XGBoost performs joint optimization of 10 hyperparameters through differential evolution (NP = 15, F = 0.5, Cr = 0.5). Compared with the fixed parameters of traditional XGBoost (100 trees, learning rate 0.3, depth 6), the accuracy of DE-XGBoost is improved by about 3.6% (see Table 8), which verifies the advantages of evolutionary algorithms in the intelligent exploration of the parameter space. Among the other models, SVM’s high regularization (C = 1.0) and RBF kernel are suitable for small samples but incur high computational costs; the shallow structure of MLP (a single hidden layer with 50 nodes) and its fixed learning rate (0.001) lead to weak generalization ability; and GBM balances model complexity with convergence efficiency and is suitable for medium-scale data. DE-XGBoost’s auto-tuning advantage stands out when high precision is needed and parallel resources are available.
As evidenced by the evaluation index data presented in Table 8, there are certain differences in the performance of each model on the test set. The DE-XGBoost model performed best on all metrics, with accuracy, recall, precision, and F1-scores all close to 0.988. This shows that DE-XGBoost is very effective in handling this task, with high predictive power and stability. The XGBoost model performed second best, with accuracy, recall, precision, and F1-scores all around 0.952, slightly lower than DE-XGBoost but still showing strong performance. The indicators of the SVM model are relatively low, all around 0.908, which represents a significant gap compared with DE-XGBoost. The GBM model was positioned between DE-XGBoost and SVM, with accuracy, recall, and F1-scores around 0.945, showing that GBM is a relatively reliable model, but not as good as DE-XGBoost. The performance of the MLP is similar to that of GBM, but slightly lower, with all indicators around 0.938.
Table 9 presents the performance ranking of five algorithms, where DE-XGBoost takes the top spot with an absolute advantage, followed by XGBoost (2nd place) and GBM (3rd place), while MLP (4th place) and SVM (5th place) rank lower. This ranking indicates that DE-XGBoost performs the best in this evaluation. DE-XGBoost, with its optimized ensemble learning framework and feature-processing capabilities, leads to improved prediction accuracy and stability, demonstrating a significant advantage over the traditional tree models XGBoost and GBM in capturing non-linear relationships. In contrast, MLP and SVM, due to their sensitivity to data distribution and generalization limitations, rank at the bottom in this assessment, further highlighting the adaptability advantages of ensemble algorithms in complex tasks.
Figure 3 shows the feature importance scores derived from DE-XGBoost for the 11 features: gender (f1), height (f2), weight (f3), lung capacity (f4), 50 m run (f5), standing long jump (f6), sitting forward bend (f7), 800 m run (f8), 1000 m run (f9), one-minute sit-ups (f10), and pull-ups (f11). The X-axis represents the importance score of each feature, and the Y-axis represents the feature names. It can be intuitively seen from the figure that lung capacity (f4) and body weight (f3) are the most influential factors for classification, with importance scores of 11,367.0 and 10,665.0, respectively. This is consistent with physiological research showing that respiratory efficiency and body composition directly affect health levels [39]. In contrast, gender (f1) showed the smallest effect, with a score of 168.0, suggesting that fitness classifications are less gender-biased in this dataset.
Figure 4 shows the ROC curve and AUC value of the DE-XGBoost model to reflect the characteristics of the model. Curve 1 represents excellent physical fitness, curve 2 indicates good physical fitness test performance, curve 3 indicates passing physical fitness test performance, and curve 4 indicates failing physical fitness test performance. The AUC value (area under the curve) is 0.99, close to 1.0, indicating that the DE-XGBoost model has extremely high accuracy and stability in distinguishing between different classes. The smoothness of the ROC curve and the trend towards the top-left corner further verify the excellent performance of the model in various categories.
Figure 5 is the confusion matrix diagram of DE-XGBoost, showing the classification results of the model on the test set. As depicted in the figure, the classification performance of the model is excellent in most categories, especially categories 0 and 3, where the predicted results exhibit a high degree of consistency with the actual labels. The classification performance for category 1 is slightly weaker, with some misclassification, but the overall performance is still good. For category 2, only a very small number of samples were misclassified. Figure 6 shows the normalized confusion matrix and the per-class proportions: 100% of the samples of category 0 are correctly predicted, and the accuracy of category 1 is 95%, although 3.4% of its samples are misjudged as category 2. Meanwhile, the distribution ratios of categories 1 and 2 in the test set are close, indicating that the high accuracy of the model does not depend on a sample-size advantage but reflects stable classification ability.
Figure 7 shows the convergence characteristics of the DE-XGBoost algorithm in the first run. The horizontal axis is the number of iterations (0–25), and the vertical axis is the best fitness (based on the classification accuracy index). The convergence curve presents a typical three-stage pattern: in the initial stage (iterations 0–5), the accuracy rapidly rises from 0.985 to 0.990, reflecting the efficient exploration ability of the differential evolution mechanism in the parameter space; in the middle stage (iterations 5–10), the upward slope is gentle and accompanied by small fluctuations, reflecting the balance between exploitation and exploration in the locally optimal region; and in the convergence stage (after 10 iterations), the accuracy stabilizes in the range of 0.990–0.992 with very small fluctuations, indicating that the hyperparameter combination of XGBoost has reached an approximately optimal configuration after about 10 iterations. This convergence pattern verifies the optimization efficiency of the DE-XGBoost framework: parameter optimization is essentially completed within about 10 iterations while preserving model performance, significantly reducing the computational cost.

4.4. Sensitivity Analysis

To assess the influence of different data partitions on algorithm performance, multiple (training proportion, test proportion) combinations are employed to evaluate the effectiveness of the algorithm: (0.9, 0.1), (0.8, 0.2), (0.7, 0.3), (0.6, 0.4), (0.5, 0.5), (0.4, 0.6), (0.3, 0.7), (0.2, 0.8), and (0.1, 0.9). The specific evaluation index results are shown in Table 10. The results show that, as the partition changes, the performance of the algorithm fluctuates to a degree and overall shows a downward trend. When the parameter combination is (0.9, 0.1), the accuracy rate, recall rate, precision rate, and F1-score of the algorithm all reach their maximum values. As the training proportion decreases and the test proportion increases, each index gradually decreases, most markedly at (0.1, 0.9), where every index drops significantly.
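The split sensitivity experiment can be reproduced with a simple loop over train/test proportions, as sketched below under the assumption that X and y are the preprocessed feature matrix and labels, best_params holds the DE-optimized hyperparameters, and classification_report_4class is the metrics helper sketched in Section 4.2; these names are illustrative assumptions.

```python
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

splits = [(0.9, 0.1), (0.8, 0.2), (0.7, 0.3), (0.6, 0.4), (0.5, 0.5),
          (0.4, 0.6), (0.3, 0.7), (0.2, 0.8), (0.1, 0.9)]

for train_ratio, test_ratio in splits:
    # Re-split the data for each (train, test) proportion pair
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=train_ratio, test_size=test_ratio, random_state=42)
    model = XGBClassifier(**best_params).fit(X_tr, y_tr)
    y_pred = model.predict(X_te)
    print(train_ratio, test_ratio, classification_report_4class(y_te, y_pred))
```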

5. Conclusions

This paper proposes an innovative DE-XGBoost method to solve the problem of physical fitness test classification for college students. The proposed method optimizes the parameters of the XGBoost model by DE, which significantly boosts the algorithm’s performance in classifying physical test data. To validate the effectiveness of the DE-XGBoost method, we utilized a dataset comprising 20,452 physical test data collected in 2022 from a university as our experimental dataset. The results of our experiments demonstrate that, compared with other traditional and advanced classification algorithms, the DE-XGBoost method achieves significant improvement in classification accuracy, which not only validates the effectiveness of the DE-XGBoost method but also provides a new idea and technical means for the accurate classification of college students’ physical test data.
Although this paper has made some preliminary achievements in the classification of college students’ physical fitness test data, it still has shortcomings that leave room for extension. This paper primarily concentrates on the optimization of XGBoost model parameters using DE but does not discuss other factors that may affect classification performance, such as data preprocessing and feature selection. Secondly, the data used in the experiment are relatively simple and do not involve more complex or diverse physical test data, which limits the full verification of the generalization capability of the DE-XGBoost method to a certain extent. Therefore, in future studies, we will work on more challenging classification problems, such as multi-class classification or unbalanced data processing, and explore more complex feature engineering methods.

Author Contributions

Conceptualization, B.L. and W.Q.; methodology, B.L. and Z.L.; software, B.L.; validation, W.Q. and Z.L.; formal analysis, W.Q.; investigation, Z.L.; resources, W.Q.; data curation, W.Q.; writing—original draft preparation, B.L.; writing—review and editing, W.Q. and Z.L.; visualization, B.L. and W.Q.; supervision, W.Q. and Z.L.; project administration, Z.L.; funding acquisition, Z.L. and W.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (No. 62362001).

Data Availability Statement

The data are not publicly available due to privacy.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Wang, Q.Y.; Wang, H. Analysis of factors influencing physical education teaching on students’ physical health. Chin. J. Educ. 2019, S1, 207–209. [Google Scholar]
  2. Lu, J. Analysis on Construction and Improvement of University Health Management System. Mod. Commer. Trade Ind. 2021, 42, 80–81. [Google Scholar]
  3. Zhang, B.; Yu, X. Investigation and research on college students’ physical health status. Sci. Technol. Innov. Guide 2015, 12, 51. [Google Scholar]
  4. Liu, L.; Zhou, J.S.; Cheng, H.; Zhu, J.M. Comprehensive evaluation of college students’ physical health based on multivariate statistical analysis. J. Guiyang Univ. (Nat. Sci.) 2014, 9, 16–20+23. [Google Scholar]
  5. Hearst, M.A.; Dumais, S.T.; Osuna, E.; Platt, J.; Scholkopf, B. Support vector machines. IEEE Intell. Syst. 1998, 13, 18–28. [Google Scholar] [CrossRef]
  6. Natekin, A.; Knoll, A. Gradient Boosting Machines, a Tutorial. Front. Neurorobot. 2013, 7, 21. [Google Scholar] [CrossRef]
  7. LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef]
  8. Zhang, L.G. Feature Selection Method Based on XGBoost and Ant Colony Optimization. Comput. Sci. Appl. 2023, 13, 883–889. [Google Scholar]
  9. Kang, M.C.; Yoo, D.Y.; Gupta, R. Machine learning-based prediction for compressive and flexural strengths of steel fiber-reinforced concrete. Constr. Build. Mater. 2021, 266 Pt B, 121117. [Google Scholar] [CrossRef]
  10. Wang, M.; Cheng, Z.H.; Hu, M. Establishment and Evaluation of an Early Prediction Model for Severe Disease Risk of COVID-19 Patients Based on XGBoost. J. Army Med. Univ. 2022, 44, 195–202. [Google Scholar]
  11. Wu, C.; Ma, D. Financial Customer Investment Behavior Feature Selection Method Based on Improved XGBoost. J. Comput. Appl. 2024, 44, 330–336. [Google Scholar]
  12. Al Ali, A.; Khedr Ahmed, M.; El-Bannany, M. Fraud Prediction Model for Financial Statements Based on Optimized XGBoost Integrated Learning Technology. Appl. Sci. 2023, 13, 2272. [Google Scholar] [CrossRef]
  13. Hu, D. Research on Intelligent Product Recommendation System Based on Machine Learning. Wirel. Internet Technol. 2023, 20, 18–21. [Google Scholar]
  14. Ruan, Y.; Zhang, H.T.; Sun, J.; Liu, X.; Xia, L.L.; Sun, A.H.; Fang, Y.J. NRBO-XGBoost Transformer Fault Diagnosis Method Based on DGA. J. Chaohu Univ. 2024, 26, 87–93, 128. [Google Scholar]
  15. Gharagoz, M.M.; Noureldin, M.; Kim, J. Explainable Machine Learning (XML) Framework for Seismic Assessment of Structures Using Extreme Gradient Boosting (XGBoost). Eng. Struct. 2025, 327, 119621. [Google Scholar] [CrossRef]
  16. Zhang, H.; Wang, Y.; Zhu, Y.; Huang, P.; Gao, Q.; Li, X.; Chen, Z.; Liu, Y.; Jiang, J.; Gao, Y.; et al. Machine Learning and Genetic Algorithm-Guided Directed Evolution for the Development of Antimicrobial Peptides. J. Adv. Res. 2025, 68, 415–428. [Google Scholar] [CrossRef]
  17. Lermer, M.; Reich, C.; Abdeslam, D.O. An Evolutionary Strategy Based Training Optimization of Supervised Machine Learning Algorithms (EStoTimeSMLAs). In Proceedings of the 5th Asia Conference on Machine Learning and Computing (ACMLC), Bangkok, Thailand, 28–30 December 2022; pp. 11–15. [Google Scholar]
  18. Li, X.; Guo, Y.; Li, Y. Particle Swarm Optimization-Based SVM for Classification of Cable Surface Defects of the Cable-Stayed Bridges. IEEE Access 2020, 8, 44485–44492. [Google Scholar] [CrossRef]
  19. Akinyelu, A.A.; Ezugwu, A.E.; Adewumi, A.O. Ant Colony Optimization Edge Selection for Support Vector Machine Speed Optimization. Neural Comput. Appl. 2020, 32, 11385–11417. [Google Scholar] [CrossRef]
  20. Rasdi Rere, L.M.; Fanany, M.I.; Arymurthy, A.M. Simulated Annealing Algorithm for Deep Learning. Procedia Comput. Sci. 2015, 72, 137–144. [Google Scholar] [CrossRef]
  21. Du, Z.; Chen, W.; Zhou, G.; Zhao, S. Research on Optimization of Boiler Combustion System Based on Intelligent Control Algorithm. In Proceedings of the 2023 IEEE 7th Information Technology and Mechatronics Engineering Conference (ITOEC), Chongqing, China, 15–17 September 2023; pp. 1388–1391. [Google Scholar]
  22. Bajpai, S.; Sharma, K.; Chaurasia, B.K. Anomaly Detection in IoT Networks Using Differential Evolution and XGBoost. In Proceedings of International Conference on Recent Innovations in Computing; Illés, Z., Verma, C., Gonçalves, P.J.S., Singh, P.K., Eds.; Lecture Notes in Electrical Engineering (LNEE, Volume 1195); Springer: Singapore, 2023. [Google Scholar]
  23. Zhang, J.; Tian, H. Short-term wind power prediction based on DE-XGBoost. Inf. Technol. 2024, 48, 136–142. [Google Scholar]
  24. Department of Physical Health and Arts Education, Ministry of Education. The Results of the Eighth National Survey on Students’ Physique and Health Were Released. Chin. Sch. Health 2021, 42, 1281–1282. [Google Scholar]
  25. Storn, R.; Price, K. Differential Evolution—A Simple and Efficient Heuristic for Global Optimization over Continuous Spaces. J. Glob. Optim. 1997, 11, 341–359. [Google Scholar] [CrossRef]
  26. Ding, Q.F.; Yin, X.Y. A Review of Differential Evolution Algorithms. J. Intell. Syst. 2017, 12, 431–442. [Google Scholar]
  27. Das, S.; Suganthan, P.N. Differential Evolution: A Survey of the State-of-the-Art. IEEE Trans. Evol. Comput. 2011, 15, 4–31. [Google Scholar] [CrossRef]
  28. Neri, F.; Tirronen, V. Recent Advances in Differential Evolution: A Survey and Experimental Analysis. Artif. Intell. Rev. 2010, 33, 61–106. [Google Scholar] [CrossRef]
  29. Song, X.Y.; Li, M.; Zhao, M. Multi-population multi-strategy differential evolution algorithm with triple selection mechanism. Appl. Res. Comput. 2025, 42, 795–803. [Google Scholar]
  30. Lei, Y.; Jiang, W.; Jiang, A.; Zhu, Y.; Niu, H.; Zhang, S. Fault Diagnosis Method for Hydraulic Directional Valves Integrating PCA and XGBoost. Processes 2019, 7, 589. [Google Scholar] [CrossRef]
  31. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’16), San Francisco, CA, USA, 13–17 August 2016; ACM: New York, NY, USA, 2016; pp. 785–794. [Google Scholar]
  32. Trizoglou, P.; Liu, X.; Lin, Z. Fault Detection by an Ensemble Framework of Extreme Gradient Boosting (XGBoost) in the Operation of Offshore Wind Turbines. Renew. Energy 2021, 179, 945–962. [Google Scholar] [CrossRef]
  33. Sun, Y.S.; Huang, Y.; Liang, T.; Ji, H.C.; Xiang, P.F.; Xu, X.R. Identification of Complex Carbonate Lithology Logging Based on XGBoost Algorithm. Lithol. Reserv. 2019, 32, 98–106. [Google Scholar]
  34. Wu, C.; Dong, A.; Li, Z.; Wang, F. Photovoltaic Power Prediction Based on Graph Similarity Day and PSO-XGBoost. High Volt. Eng. 2022, 48, 3250–3259. [Google Scholar]
  35. Han, R.Y.; Wang, Z.W.; Wang, W.H.; Xu, F.H.; Qi, X.H.; Cui, Y.T. Lithology identification of igneous rocks based on XGBoost and conventional logging curves, a case study of the eastern depression of Liaohe Basin. J. Appl. Geophys. 2021, 195, 104480. [Google Scholar]
  36. Yang, H.Y.; Zhao, X.Y.; Wang, L. Review of Data Normalization Methods. Comput. Eng. Appl. 2023, 59, 13. [Google Scholar] [CrossRef]
  37. Catriona, M.; Theo, P.; Denis, M.; Justin, M. A Review of Model Evaluation Metrics for Machine Learning in Genetics and Genomics. Front. Bioinform. 2024, 4, 1457619. [Google Scholar]
  38. Li, B.; Gatsonis, C.; Dahabreh, I.J.; Steingrimsson, J.A. Estimating the Area Under the ROC Curve When Transporting a Prediction Model to a Target Population. Biometrics 2022, 79, 2382–2393. [Google Scholar] [CrossRef]
  39. Ergezen, G.; Menek, M.; Demir, R. Respiratory muscle strengths and its association with body composition and functional exercise capacity in non-obese young adults. Fam. Med. Prim. Care Rev. 2023, 25, 146–149. [Google Scholar] [CrossRef]
Figure 1. Individual coding.
Figure 2. Algorithm flow chart.
Figure 3. Feature importance map.
Figure 4. ROC curve and AUC values.
Figure 5. Confusion matrix diagram.
Figure 6. Normalized confusion matrix diagram.
Figure 7. DE-XGBoost convergence graph.
Table 1. Variables of the physical test data.
Attributes | Descriptions | Scopes
Gender | - | 1 (man)∼2 (women)
Height (cm) | - | 137∼200
Weight (kg) | - | 32.8∼131.1
Lung Capacity (mL) | Vital capacity is the maximum volume of air in a breath | 982∼8504
50 m Sprint (s) | A short distance race with a distance of 50 m | 5.6∼13.5
Standing Long Jump (cm) | The furthest distance you can reach after jumping from a stationary position | 100∼300
Sit and Reach (cm) | A test to assess human flexibility | 0∼40
800 m Run (min) | A middle distance race with a distance of 800 m | 3′0∼8′48
1000 m Run (min) | A middle distance race with a distance of 1000 m | 3′0∼8′48
One-minute Sit-ups (reps/min) | Sit-ups performed within one minute | 3∼70
Pull-ups (reps/min) | Measure human upper limb overhang pull strength and muscle endurance level | 0∼36
Table 2. Individual encoding of DE algorithm.
Attributes | Descriptions | Scopes
max_depth | The maximum depth of the tree controls the complexity of the model | 1∼15
n_estimators | The number of trees affects model performance | 100∼200
learning_rate | The learning rate regulates the influence of each individual tree | 0∼1
gamma | Regularization parameter to control node splitting | 0∼1
min_child_weight | Minimum child node weight to prevent overfitting | 1∼15
subsample | Sample proportion for training each tree | 0∼1
colsample_bytree | The proportion of feature subsets per tree | 0∼1
colsample_bylevel | The proportion of feature subsets for each layer | 0∼1
reg_alpha | L1 regularization parameter to increase sparsity | 0∼1
reg_lambda | L2 regularization parameter to limit the weight size | 0∼1
Table 3. Taking the physical fitness test as an example, a multi-classification problem transformed into a binary classification confusion matrix.
Actual Result | Forecast Result (Pass) | Forecast Result (Flunk)
Pass | TP | FN
Flunk | FP | TN
Table 4. Multi-index performance comparison under the F/Cr reverse combination.
Index | F = 0.1, Cr = 0.9 | F = 0.3, Cr = 0.7 | F = 0.5, Cr = 0.5 | F = 0.7, Cr = 0.3 | F = 0.9, Cr = 0.1
Accuracy rate | 0.984444 | 0.9874442 | 0.987868 | 0.9875313 | 0.984791
Recall rate | 0.984388 | 0.9874544 | 0.987889 | 0.9874556 | 0.984852
Precision rate | 0.984714 | 0.9877286 | 0.988106 | 0.9878019 | 0.985062
F1 | 0.984311 | 0.9874137 | 0.987829 | 0.9874501 | 0.984750
Table 5. Performance metrics with F = 0.5 under varying Cr values.
Index | F = 0.5, Cr = 0.1 | F = 0.5, Cr = 0.3 | F = 0.5, Cr = 0.5 | F = 0.5, Cr = 0.7 | F = 0.5, Cr = 0.9
Accuracy rate | 0.986313 | 0.987770 | 0.987868 | 0.986955 | 0.986357
Recall rate | 0.986385 | 0.987664 | 0.987889 | 0.987073 | 0.986615
Precision rate | 0.986618 | 0.988071 | 0.988106 | 0.987163 | 0.986578
F1 | 0.986311 | 0.987656 | 0.987829 | 0.986948 | 0.986346
Table 6. Comparison of performance advantages of XGBoost model based on optimization algorithm.
Model | Accuracy Rate | Recall Rate | Precision Rate | F1
DE-XGBoost | 0.987868 | 0.987889 | 0.988106 | 0.987829
PSO-XGBoost | 0.985814 | 0.985899 | 0.986151 | 0.985813
SA-XGBoost | 0.971356 | 0.971365 | 0.971876 | 0.970983
Table 7. Parameter setting of each model.
Algorithm | Argument
DE-XGBoost | NP = 15, Dim = 10, F = 0.5, Cr = 0.5
XGBoost | n_estimators = 100, learning_rate = 0.3, max_depth = 6, subsample = 1
SVM | C = 100, tol = 1 × 10−3, max_iter = −1, random_state = 42
GBM | n_estimators = 100, learning_rate = 0.1, max_depth = 3, random_state = 42
MLP | hidden_layer_sizes = 100, max_iter = 200, random_state = 42
Table 8. The results of physical testing and identification of each model.
Model | Accuracy Rate | Recall Rate | Precision Rate | F1
DE-XGBoost | 0.987868 | 0.987889 | 0.988106 | 0.987829
XGBoost | 0.952190 | 0.952138 | 0.951984 | 0.951564
SVM | 0.908370 | 0.908370 | 0.908292 | 0.907966
GBM | 0.945889 | 0.945889 | 0.946570 | 0.945136
MLP | 0.938674 | 0.938674 | 0.938355 | 0.938212
Table 9. Friedman test.
Algorithm | Ranking
DE-XGBoost | 1
XGBoost | 2
SVM | 5
GBM | 3
MLP | 4
Table 10. Sensitivity test.
Parameter (Train, Test) | Accuracy Rate | Recall Rate | Precision Rate | F1
(0.9, 0.1) | 0.991151 | 0.991148 | 0.991288 | 0.991108
(0.8, 0.2) | 0.990052 | 0.990075 | 0.990249 | 0.990032
(0.7, 0.3) | 0.987868 | 0.987889 | 0.988106 | 0.987829
(0.6, 0.4) | 0.986215 | 0.986220 | 0.986465 | 0.986137
(0.5, 0.5) | 0.983680 | 0.983684 | 0.983967 | 0.983590
(0.4, 0.6) | 0.985777 | 0.985768 | 0.986053 | 0.985696
(0.3, 0.7) | 0.976352 | 0.976326 | 0.976674 | 0.976156
(0.2, 0.8) | 0.985710 | 0.985668 | 0.986016 | 0.985611
(0.1, 0.9) | 0.957455 | 0.957459 | 0.957447 | 0.957053
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
