1. Introduction
With the continuous expansion of power system scale and the steady growth of loads, high-current switchgears are increasingly used in high-voltage power supply and distribution systems, undertaking important functions such as equipment isolation, fault removal, and system protection. However, due to factors such as complex operating environments, severe load fluctuations, and component aging, high-current switchgears are prone to problems such as poor contact, abnormal heating, and even insulation breakdown [1,2]. Among these, abnormal temperature rise is a key cause of faults, which may lead to contact oxidation, reduced conductivity, and insulation aging, posing significant hazards [3]. Some traditional switchgear prediction methods suffer from inaccurate feature extraction, weak temporal modeling capability, and a low degree of multi-model fusion, making it difficult for them to support high-quality risk assessment [4]. In contrast, machine learning-based data-driven prediction methods can mine fault patterns more efficiently, enabling refined and dynamic risk identification and early warning [5]. Therefore, there is an urgent need for switchgear operation risk assessment that integrates machine learning and data-driven methods to improve the accuracy of risk identification and the timeliness of trend prediction.
Random forest (RF) possesses both strong capability in handling nonlinear relationships and the advantage of automatic feature selection. In [6], Xuan et al. proposed an RF-based switchgear fault prediction model, which addressed the unstable predictions of single models and enabled efficient early fault identification. In [7], Sotnikov et al. proposed a random forest-based prediction model, which was trained using 2-D finite element method simulation data, enabling effective prediction of the quench behavior of such tapes. In [8], Shaikh et al. built an RF-gray relational assessment model. It leveraged RF’s nonlinear strength, quantified risk factor correlation, and achieved accurate risk classification. In [9], Wassan et al. proposed an RF-based multi-source fusion method. However, RF captures subtle features such as local high-temperature hotspots poorly. Its predictions lag under dynamic conditions with rapid temperature changes, failing to meet the need for real-time response to sudden risks.
Long short-term memory (LSTM), a special recurrent neural network, captures both long- and short-term dependencies in time series. Its gating mechanism alleviates gradient vanishing and allows temporal features to be memorized. In [10], Yu et al. proposed an LSTM-based dynamic switchgear temperature rise prediction model to predict trends under load fluctuations and track real-time temperature changes. In [11], Wang et al. proposed an LSTM risk assessment model with an optimized structure to improve local overload risk accuracy. In [12], Tan et al. built an LSTM framework with environmental parameters to reduce prediction deviations and enhance adaptability. In [13], Panda et al. proposed an LSTM-based dynamic switchgear risk early warning method through real-time updated input sequences and dynamic risk updates. However, LSTM captures short-term sudden temperature rises poorly, distorts predictions under drastic operation changes, and needs massive historical data, which limits its use in data-scarce scenarios.
Adaptive back propagation neural network (ABPNN), as an improved traditional BP network, adaptively adjusts the learning rate to speed up convergence and reduce local optima. It dynamically optimizes parameter updates during error backpropagation. In [14], Chen et al. proposed an ABPNN-based switchgear temperature rise prediction model, addressing traditional BP networks’ slow convergence and local optima. In [15], Dai et al. introduced an ABPNN-feature screening risk assessment method, using ABPNN’s adaptive weights to fix unreasonable feature weights and improve accuracy. In [16], Gu et al. developed an ABPNN switchgear multi-state assessment model, simulating neural connections to solve temperature–current nonlinear coupling for comprehensive health evaluation. In [17], Li et al. proposed an ABPNN switchgear risk classification method, combining expert knowledge and optimization to reduce traditional subjectivity for objective outcomes. However, ABPNN is insensitive to sudden switchgear states such as local high temperatures, has poor generalization in small-sample scenarios, and is overly complex, which hinders rapid deployment in real-time monitoring.
In switchgear risk assessment and temperature rise prediction, ensemble learning is gaining attention for fusing the advantages of multiple models. Traditional methods like RF and SVM work well on small samples but are sensitive to noise on large-scale, nonlinear, time-dependent data. In [18], Alsumaidaee et al. proposed an improved RF feature screening method, optimizing feature evaluation to reduce redundancy and boost accuracy and generalization. In [19], Miao et al. introduced an LSTM-attention framework, emphasizing critical time steps to capture equipment dynamics and handle the complex temporal correlation in temperature rise data. In [20], Wang et al. developed a multi-stage system with integrated neural networks, using multi-subnet parallel processing to improve data fusion efficiency and identify nonlinear features. In [21], Wang et al. proposed an adaptive weighted model, fusing RF and LSTM outputs to overcome single-model limits and enhance composite fault identification. In [22], Hussain et al. focused on the risk quantification of insulation defects, modeled fault probability and consequences based on partial discharge signals, and verified the insulation defect identification effect through risk level classification. In [23], Zhou et al. addressed the cost-efficiency shortcomings of traditional maintenance strategies by combining failure mode analysis and power grid risk assessment. Although the above methods have achieved good results in specific application scenarios, they still have problems such as insufficiently accurate feature selection, insufficient sequence dependency modeling capability, and weak coupling of the integration architecture [24].
Although ensemble learning has made certain progress in switchgear operation risk assessment, it still faces the following challenges. Firstly, different features of switchgear operation data contribute differently to risk assessment. Although the RF algorithm can realize feature selection by evaluating feature importance through decision trees, it may lead to high redundancy among the selected important features. Secondly, although LSTM can capture the relationships among time series features to realize intelligent assessment of switchgear operation risk, it may encounter gradient vanishing or gradient explosion when processing multi-dimensional switchgear operation data. Finally, although a BP neural network can correct risk assessment results to improve accuracy, the complex switchgear operation environment and the large number of model parameters cause problems such as the curse of dimensionality and local optima in BP network training, reducing the accuracy of the correction results.
To address the above challenges, preprocessing and dimensionality reduction of high-current switchgear operation data are first proposed. Preprocessing is realized through data cleaning, missing value imputation, and data normalization, while data quality and consistency are further improved through data dimensionality reduction. Secondly, a high-current switchgear operation risk assessment method based on ensemble learning is proposed. In the first layer, feature extraction and screening of switchgear operation data are conducted based on improved RF, providing high-quality input for risk assessment. In the second layer, key information of feature fluctuations is captured based on an attention LSTM to accurately assess the risk of high-current switchgears. In the third layer, risk assessment correction is performed based on ABPNN, which dynamically adjusts BP network parameters through an adaptive genetic algorithm to correct the risk assessment results of the second layer, improving the accuracy and stability of the assessment. Finally, the performance of the proposed method in actual risk assessment is comprehensively verified through simulation experiments. The proposed work is compared with state-of-the-art works in terms of the precise extraction of important features, the capture of fluctuating time series features, and accurate and stable risk assessment based on multi-model fusion, as summarized in Table 1. Based on the literature comparisons, the major contributions of this paper are elaborated below.
Ensemble learning framework for high-current switchgear risk assessment. The hierarchical architecture combines the robustness of RF for feature selection with the temporal modeling capability of attention-LSTM and assessment accuracy correction of adaptive genetic algorithm-driven ABPNN, mitigating single-model biases and achieving higher precision.
Important feature extraction based on improved RF. We integrate the Spearman correlation coefficient with traditional RF to assess feature contributions via Gini reduction, reducing feature redundancy while maintaining predictive accuracy.
Short-term fluctuation feature capture and initial assessment based on attention LSTM. We optimize LSTM gating by adjusting weight matrices and bias vectors to suppress gradient explosion. The dynamic attention mechanism prioritizes critical features, enhancing short-term fluctuation feature capture.
Risk assessment correction based on ABPNN with heuristic network parameter adjustment. A BP neural network with genetic algorithm-driven parameter updates is leveraged for result correction. Adaptive crossover/mutation prevents local optima and over-fitting, ensuring robust generalization.
  3. Risk Assessment Method for High-Current Switchgear Based on Ensemble Learning
Machine learning can automatically predict and assess the operational risk level of high-current switchgear by analyzing historical operational data. Its adaptive learning capability allows the dynamic adjustment of the risk assessment model to accommodate changes in operating conditions, reducing reliance on human expertise. However, existing machine learning-based risk assessment methods have the following two limitations. On one hand, existing methods lack an accurate feature selection mechanism for switchgear operational data, leading to the inclusion of redundant features in the assessment model. This makes it difficult for the model to focus on the most influential features for risk evaluation, resulting in slow model convergence. On the other hand, the operating environment of switchgear is complex, and the risk assessment results are easily influenced by various nonlinear and disruptive factors. Existing methods lack effective risk assessment correction and optimization mechanisms, leading to low accuracy in the evaluation results.
On this basis, this paper proposes a risk assessment method for high-current switchgear based on ensemble learning. Compared to traditional learning methods, ensemble learning enhances the robustness and accuracy of risk assessment by leveraging the collaboration between multiple models, effectively overcoming the limitations of individual models in feature extraction, temporal modeling, and learning correction. The principle of the proposed method is shown in Figure 2. Specifically, in the first layer, the proposed algorithm uses an improved RF to perform feature extraction and selection on switchgear operational data. In the second layer, the proposed algorithm combines a gating optimization mechanism and an attention mechanism to construct a high-current switchgear risk assessment model based on an attention LSTM. The high-importance features selected in the first layer are used as input, capturing the key information of feature fluctuations and outputting the risk assessment results for the high-current switchgear. In the third layer, the ABPNN model is embedded to correct and optimize the risk assessment results output from the second layer. Based on an adaptive genetic algorithm, the ABPNN parameters are dynamically adjusted to improve the accuracy and stability of the assessment.
  3.1. Feature Extraction Method Based on Improved RF
As an embedded method, RF evaluates the importance of switchgear operational data features and performs feature selection by constructing multiple decision tree models. However, traditional RF overlooks the correlations among different features during the importance quantification process, resulting in a high redundancy among the extracted important features. This redundancy adversely affects the accuracy of subsequent temperature rise prediction and risk level assessment. Therefore, this section proposes an improved RF method that employs the Spearman correlation coefficient to measure the relationships between operational data features, enabling the identification and removal of highly correlated features to reduce redundancy.
The mean decrease in impurity (MDI) method is used to quantify feature importance [32]. Specifically, this method evaluates the contribution of each feature to the switchgear risk assessment by calculating the average decrease in impurity resulting from splits on that feature across all decision tree nodes. The Gini impurity is employed as the impurity measure; the resulting MDI score is also known as Gini importance. The Gini impurity at each decision tree node $n$ is calculated as follows:
$$G(n) = 1 - \sum_{k=1}^{K} p_{k,n}^{2}$$
where $K$ represents the number of feature categories, and $p_{k,n}$ represents the proportion of feature samples belonging to category $k$ in node $n$. Suppose node $n$ is split into left child node $n_L$ and right child node $n_R$; their weighted average Gini impurity is calculated as follows:
$$G_{\mathrm{split}}(n) = \frac{N_L}{N} G(n_L) + \frac{N_R}{N} G(n_R)$$
where $N$, $N_L$, and $N_R$ represent the number of feature samples in the parent node, left child node, and right child node, respectively. Therefore, the decrease in impurity is calculated as follows:
$$\Delta G(n) = G(n) - G_{\mathrm{split}}(n)$$
where $G(n)$ represents the Gini impurity of the parent node. The importance of feature $j$ is obtained from the sum $\sum_{n \in S_j} \Delta G(n)$ over all nodes where splits occur based on feature $j$, where $S_j$ represents the set of these nodes in the tree. In the improved RF, this sum is reduced according to the Spearman correlation coefficient $\rho_{jl}$ between features $j$ and $l$. A larger value of $\rho_{jl}$ indicates a higher redundancy level of feature $j$, which leads to a certain degree of reduction in its importance.
An importance threshold is set, and all switchgear operational data features whose importance exceeds this threshold are retained to form the extracted feature set.
In the feature extraction layer, the importance threshold for the improved RF method is crucial for selecting a concise yet informative feature set. This threshold was determined empirically by ranking features based on their importance scores and utilizing cross-validation to identify the subset of features that yields the best performance for the subsequent prediction task.
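The selection procedure of this section can be illustrated with a minimal Python sketch. It computes the Gini impurity decrease of a single split and then filters features by importance and Spearman correlation. The greedy removal rule, the cutoff `rho_max`, and all function and feature names are assumptions made for illustration; the exact way the paper folds the Spearman coefficient into the importance score is not reproduced here.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a node: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - weighted


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(a, b):
    """Spearman coefficient = Pearson correlation of the ranks.

    Assumes neither column is constant (otherwise the variance is zero).
    """
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    return cov / (va * vb) ** 0.5


def select_features(importance, columns, importance_min, rho_max):
    """Keep features above importance_min, skipping any feature that is
    strongly correlated (|rho| > rho_max) with a more important kept one.
    This greedy rule is an assumed stand-in for the paper's penalty term."""
    kept = []
    for name in sorted(importance, key=importance.get, reverse=True):
        if importance[name] <= importance_min:
            continue
        if all(abs(spearman(columns[name], columns[k])) <= rho_max for k in kept):
            kept.append(name)
    return kept
```

For example, if a hypothetical busbar temperature column is perfectly rank-correlated with a more important contact temperature column, the sketch drops the busbar feature and keeps the remaining features whose importance exceeds the threshold.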
  3.2. Risk Assessment Method for High-Current Switchgear Based on Attention LSTM
The feature values extracted by the random forest are input into the LSTM network for preliminary risk assessment. LSTM is adept at capturing long-term dependencies in time-series features, with a structure designed to adaptively remember or forget historical information, making it highly effective in intelligent risk assessment. However, traditional LSTM can encounter issues such as vanishing or exploding gradients when handling complex and diverse switchgear operational data, which can negatively affect the model’s training effectiveness and prediction accuracy. Therefore, this section optimizes the gating mechanism based on traditional LSTM and combines it with an attention mechanism, enabling the model to better focus on key information in the time series and improve its ability to capture short-term fluctuation features.
  3.2.1. Optimizing the Gating Mechanism
Traditional LSTM contains three gates: the forget gate $f_t$, the input gate $i_t$, and the output gate $o_t$. The forget gate controls the retention of historical switchgear operational states and information, the input gate manages the input of time-series data, and the output gate controls the output of data. The gating mechanism can be represented as follows:
$$f_t = \sigma\left(W_f [h_{t-1}, x_t] + b_f\right)$$
$$i_t = \sigma\left(W_i [h_{t-1}, x_t] + b_i\right)$$
$$o_t = \sigma\left(W_o [h_{t-1}, x_t] + b_o\right)$$
where $h_{t-1}$ is the hidden state from the previous time period, $x_t$ is the input vector at the current time period, $W_f$, $W_i$, and $W_o$ are the weight matrices, $b_f$, $b_i$, and $b_o$ are the bias vectors, and $\sigma$ is the sigmoid activation function.
Furthermore, the gating mechanism is optimized. By adjusting the weight matrices and bias vectors, the error accumulation during backpropagation is effectively reduced, thereby mitigating gradient explosion and enhancing the model’s ability to handle complex time-series data. The optimized gating mechanism is represented as follows:
$$\tilde{c}_t = \tanh\left(W_c [h_{t-1}, x_t] + b_c\right)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$
$$h_t = o_t \odot \tanh(c_t)$$
where $c_t$ is the cell state, $\tilde{c}_t$ is the candidate cell state, $W_c$ is the weight matrix, $b_c$ is the bias vector, $\odot$ denotes element-wise multiplication, and $\tanh$ is the hyperbolic tangent function.
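As an illustration of how the gates and the cell state interact, the following is a minimal pure-Python sketch of one time step of a standard LSTM cell. It shows only the textbook forward update; the paper's specific adjustment of the weight matrices and bias vectors is not reproduced, and the function names and the `params` layout are assumptions made for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W @ v + b for a matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step.

    params maps "f", "i", "o", "c" to (W, b) pairs, where each W has
    len(h_prev) rows and len(h_prev) + len(x_t) columns.
    """
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    W_f, b_f = params["f"]
    W_i, b_i = params["i"]
    W_o, b_o = params["o"]
    W_c, b_c = params["c"]
    f_t = [sigmoid(a) for a in affine(W_f, z, b_f)]  # forget gate
    i_t = [sigmoid(a) for a in affine(W_i, z, b_i)]  # input gate
    o_t = [sigmoid(a) for a in affine(W_o, z, b_o)]  # output gate
    c_tilde = [math.tanh(a) for a in affine(W_c, z, b_c)]  # candidate state
    # New cell state: keep part of the old state, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    # New hidden state: gated, squashed cell state.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate state to 0, so the cell state is simply halved at each step, which makes the role of the forget gate easy to see.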
  3.2.2. Introducing the Attention Mechanism
In switchgear operation risk assessment, the impact of operational data on the risk evaluation varies across time periods. For the LSTM network, recognizing and distinguishing input features and assigning higher weights to the important ones typically enhances the model’s discriminative ability. Based on this, the attention mechanism is introduced to further optimize the LSTM. The attention mechanism calculates the corresponding probabilities of the input feature vectors and updates the weight matrices and biases at each iteration, gradually optimizing the weight combination of the input feature vectors to achieve the best prediction results. For each time period, an attention probability is derived from the input vector at that time period; it determines the weight of that input vector, i.e., the influence of the input feature vector at that time period on the output. The attention layer then produces its output from these weights.
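The weighting idea can be sketched in a few lines of Python. The sketch assumes a common form of attention: a tanh score per time period, a softmax that turns scores into weights summing to one, and a weighted sum as the layer output. The scoring function, `score_weights`, and `score_bias` are assumptions for illustration and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax: weights are positive and sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, score_weights, score_bias=0.0):
    """Score each time period, normalise the scores, and return both the
    attention weights and the weighted sum of the input vectors."""
    scores = [
        math.tanh(sum(w * v for w, v in zip(score_weights, vec)) + score_bias)
        for vec in vectors
    ]
    alphas = softmax(scores)
    dim = len(vectors[0])
    output = [
        sum(a * vec[d] for a, vec in zip(alphas, vectors)) for d in range(dim)
    ]
    return alphas, output
```

When all scores are equal, the weights are uniform and the output reduces to the plain average over time periods; a time period with a larger score receives a larger share of the output.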
  3.2.3. Switchgear Operation Risk Assessment
The operation risk assessment of switchgear involves multiple factors, including temperature rise, load, and equipment operating conditions. This section constructs a comprehensive switchgear operation risk assessment model based on attention LSTM. First, a temperature rise prediction model and a load prediction model are established. Based on these, a risk assessment model is constructed by incorporating equipment operating conditions to achieve accurate risk prediction for the switchgear. The temperature rise prediction model can be expressed as follows:
$$\hat{T}_t = W_T \tilde{h}_t + b_T$$
where $\hat{T}_t$ is the temperature rise prediction value at the $t$-th time period, $W_T$ is the weight matrix for temperature rise prediction, $b_T$ is the bias vector for temperature rise prediction, and $\tilde{h}_t$ is the weighted hidden state. Similarly, the load prediction model can be expressed as follows:
$$\hat{L}_t = W_L \tilde{h}_t + b_L$$
where $\hat{L}_t$ is the load prediction value at the $t$-th time period, $W_L$ is the weight matrix for load prediction, and $b_L$ is the bias vector for load prediction.
After obtaining the temperature rise and load prediction values, the risk assessment model is constructed by incorporating equipment operational status data. The model calculates the contribution of temperature rise, load, and equipment operating status to the operational risk, and the risk values of the switchgear at different time periods can be derived. The risk assessment model is expressed as follows:
$$R_t = \lambda_1 \hat{T}_t + \lambda_2 \hat{L}_t + \lambda_3 S_t$$
where $R_t$ is the switchgear operational risk value at the $t$-th time period, $S_t$ is the equipment operating status score, and $\lambda_1$, $\lambda_2$, and $\lambda_3$ are the weight coefficients that represent the contributions of temperature rise, load, and equipment operating status to the operational risk, respectively.
In the risk assessment model, the weight coefficients $\lambda_1$, $\lambda_2$, and $\lambda_3$ in Equation (29) represent the relative importance of temperature rise, load, and equipment status. In this study, these weights are assigned based on operational experience and historical fault statistics, reflecting the domain knowledge that temperature rise is a primary indicator of imminent risk. We acknowledge that a more objective determination of these weights is a valuable direction for future research. For instance, multi-physics simulation could be employed to establish a more rigorous quantitative relationship between these factors and the operational risk.
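A small worked example may help to show how the three contributions combine. The sketch below assumes that the risk value is a weighted sum of the predicted temperature rise, the predicted load, and the equipment status score, which is how the weight coefficients are described above. The weight values (0.5, 0.3, 0.2) and the assumption that all inputs are already scaled to a common range are hypothetical and are not taken from the paper.

```python
def risk_value(temp_rise, load, status, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of three already normalised contributions.

    The default weights are illustrative only; they give temperature rise
    the largest share, in line with the domain knowledge cited in the text.
    """
    w_temp, w_load, w_status = weights
    return w_temp * temp_rise + w_load * load + w_status * status


# Hypothetical normalised inputs for one time period.
example = risk_value(temp_rise=0.8, load=0.6, status=0.5)
```

With these hypothetical numbers the temperature rise term contributes 0.40, the load term 0.18, and the status term 0.10, so temperature rise dominates the resulting risk value.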
  3.3. Risk Assessment Correction Method Based on ABPNN
Switchgear operational risk is influenced by the interaction of various factors, making the assessment prone to deviations. BP neural networks, due to their strong generalization and self-organizing capabilities, can calibrate and optimize the assessment results. However, traditional BP neural networks are susceptible to the curse of dimensionality and to training getting trapped in local optima, which leads to reduced calibration accuracy. Therefore, this section proposes the ABPNN, which introduces an adaptive genetic algorithm on top of the traditional neural network. By dynamically adjusting network parameters, it improves training efficiency and prediction accuracy while effectively avoiding issues like local optima and overfitting, thus optimizing the performance of BP neural networks. Furthermore, the genetic algorithm adaptively adjusts the crossover and mutation probabilities to ensure population diversity and enhance the convergence of the model.
  3.3.1. Network Architecture of ABPNN
The ABPNN switchgear operation risk assessment correction model adopts a three-layer network architecture, including an input layer, a hidden layer, and an output layer. The input layer consists of $n$ neurons $x_i$ ($i = 1, \dots, n$), representing the $n$ influencing factors of operational risk. The hidden layer contains $m$ neurons $h_j$ ($j = 1, \dots, m$), which are responsible for processing the input data and generating intermediate results that are passed to the subsequent layer. The output layer consists of $l$ neurons $y_k$ ($k = 1, \dots, l$), and the output values represent the specific operational risk levels. The input $net_j$ and output $h_j$ of the $j$-th neuron in the ABPNN hidden layer are represented as follows:
$$net_j = \sum_{i=1}^{n} w_{ij} x_i, \qquad h_j = f(net_j)$$
where $w_{ij}$ represents the connection weight from the input layer to the hidden layer.
The input $net_k$ and output $y_k$ of the $k$-th neuron in the output layer are represented as follows:
$$net_k = \sum_{j=1}^{m} v_{jk} h_j, \qquad y_k = f(net_k)$$
where $v_{jk}$ represents the connection weights from the hidden layer to the output layer. $f$ is the activation function between the hidden layer and the output layer, typically chosen as the sigmoid function. The weights $w_{ij}$ and $v_{jk}$ can be adjusted by calculating the error of the output values.
The input layer of the ABPNN switchgear operation risk assessment correction model consists of 10 neurons, i.e., operational lifespan ($x_1$), historical fault condition ($x_2$), load current ($x_3$), busbar temperature ($x_4$), contact temperature ($x_5$), ambient temperature ($x_6$), ambient humidity ($x_7$), conductor contact resistance ($x_8$), contact area ($x_9$), and fan status ($x_{10}$). The output layer of the model outputs the corrected switchgear operational risk levels. Therefore, the number of neurons in the output layer is set to 3, i.e., $y_1$, $y_2$, and $y_3$, with each neuron outputting values of 0 or 1. The output values of $y_1$, $y_2$, and $y_3$ and their corresponding switchgear risk levels are shown in Table 2.
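The three-layer structure described in this subsection can be sketched as a forward pass in pure Python. The sketch assumes sigmoid activations in both layers, subtracts a threshold from each neuron's weighted input, and rounds each output at 0.5 to obtain a 0/1 code. The threshold handling, the 0.5 cut, and all names are assumptions for illustration; the mapping from output codes to risk levels is given by Table 2 and is not reproduced here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, thresholds):
    """weights[j][i] connects input i to neuron j; each neuron applies a
    sigmoid to its weighted input minus its threshold."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) - th)
        for row, th in zip(weights, thresholds)
    ]


def forward(x, w_hidden, th_hidden, w_out, th_out):
    """Forward pass: input layer -> hidden layer -> output layer.

    Returns the raw outputs and the 0/1 code obtained by cutting at 0.5.
    """
    hidden = layer(x, w_hidden, th_hidden)
    outputs = layer(hidden, w_out, th_out)
    code = [1 if y >= 0.5 else 0 for y in outputs]
    return outputs, code
```

For the configuration in the text, `x` would hold the 10 normalised influencing factors and `w_out` would have 3 rows, one per output neuron.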
  3.3.2. Risk Assessment Correction Method Based on ABPNN
During risk assessment correction, the influencing factors and risk levels of switchgear operational risk are first determined, and the input data are mapped to a normalized range to obtain the input layer neurons and output layer data for the ABPNN. For real-valued influencing factors, such as operational lifespan ($x_1$), load current ($x_3$), busbar temperature ($x_4$), contact temperature ($x_5$), ambient temperature ($x_6$), ambient humidity ($x_7$), conductor contact resistance ($x_8$), and contact area ($x_9$), normalization has already been performed during data preprocessing, allowing them to be directly input into the ABPNN. For categorical influencing factors, such as historical fault condition ($x_2$) and fan status ($x_{10}$), the independent variable assignment method shown in Table 3 can be used.
Then, the candidate network structure is selected, which involves determining the number of neurons in the hidden layer. Several integers within a preset range are chosen as candidate numbers of hidden layer neurons. The network is trained using a small sample dataset, and the corresponding convergence errors are recorded. The number of hidden layer neurons with the smallest convergence error during training is selected as the optimal network structure. After determining the network structure, the ABPNN parameters are initialized, including the connection weights from the input layer to the hidden layer, the connection weights from the hidden layer to the output layer, the threshold for the hidden layer, and the threshold for the output layer. In addition, the initial learning rate for model training and the desired minimum training error threshold are set. Finally, the ABPNN takes the influence factor data representing the switchgear status output by the LSTM network as input and outputs the predicted risk level encoding. Through supervised learning, the network weights and thresholds are continuously adjusted using the backpropagation algorithm to minimize the error between the predicted output and the true risk level. By deeply integrating and reassessing the predicted values from the LSTM network and the switchgear status features, the prediction bias is corrected, ultimately outputting the optimized operational risk level assessment.
  3.3.3. Adaptive Genetic Algorithm for Optimizing Network Parameters
To improve the training efficiency and prediction accuracy of the ABPNN model, avoid issues such as local optima and overfitting during training, and further enhance the correction performance, this paper introduces an adaptive genetic algorithm to optimize the core parameters of ABPNN. The steps are as follows.
Step 1: The initial network parameters of ABPNN are encoded to generate multiple distinct individuals, with a preset population size. These individuals form the initial population and are then input into the genetic algorithm.
Step 2: The fitness of the population is used to measure an individual’s ability to survive in the environment. A higher fitness value indicates better genes, which corresponds to a greater likelihood of survival and reproduction. In this model, each individual’s output is calculated using sample data and the initialized ABPNN model, and the result is compared with the desired output. The resulting error value is used to evaluate the quality of the individual’s genes, which determines its fitness. The fitness value of an individual is computed over all training samples from the predicted output value and the actual output value of each sample. By repeating the process, the fitness values of all individuals in each generation of the population can be derived.
Step 3: Initialize the maximum number of iterations for the genetic algorithm, as well as the crossover probability and the mutation probability, each within its preset range.
Step 4: Use the roulette wheel selection method to select individuals from the population. Based on the fitness values of the individuals in the population, the probability $p_i$ of each individual being inherited and the cumulative probability $q_i$ are calculated as follows:
$$p_i = \frac{F_i}{\sum_{j=1}^{M} F_j}, \qquad q_i = \sum_{j=1}^{i} p_j$$
where $F_i$ is the fitness value of the $i$-th individual and $M$ is the population size.
A random number $r$ between 0 and 1 is generated. If the cumulative probability of an individual is greater than $r$, then this individual is selected to stay; otherwise, another individual $k$ is chosen such that $q_{k-1} < r \le q_k$. Step 4 is repeated until a full population has been selected.
Step 5: Based on the changes in the fitness values of the population individuals, the crossover and mutation probabilities are dynamically adjusted to form a new population. Starting from their initial values, the crossover probability and the mutation probability are given by expressions in the fitness values that involve four fixed coefficients.
The new population is then obtained, and its fitness values are calculated. If the fitness values are widely dispersed, the crossover and mutation probabilities can be reduced to retain the better-performing individuals in the population. If an individual’s fitness value indicates that it has become trapped in a local optimum, the probabilities are increased to help it escape the local optimum, ensuring individual diversity.
Step 6: Repeat Steps 4 and 5 until the maximum number of iterations is reached. Output and decode the optimal individual to obtain the optimal parameters for the ABPNN, and update the ABPNN with these parameters.
The algorithm is summarized in Table 4.
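The selection and adaptation steps above can be sketched in Python as follows. Three assumptions are made, none of them taken from the paper: the fitness is taken as the reciprocal of one plus the mean squared error, the adaptive rule follows a commonly used form in which above-average individuals receive a probability that shrinks as their fitness approaches the population maximum, and the coefficient names `k_high` and `k_low` are hypothetical.

```python
import random


def fitness(predicted, actual):
    """Assumed fitness: 1 / (1 + MSE), so a smaller error gives a higher value."""
    mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    return 1.0 / (1.0 + mse)


def roulette_select(fitnesses, rng):
    """Return the index k whose cumulative probability q_k first reaches r."""
    total = sum(fitnesses)
    r = rng.random()
    cumulative = 0.0
    for k, f in enumerate(fitnesses):
        cumulative += f / total
        if r <= cumulative:
            return k
    return len(fitnesses) - 1  # guard against floating-point round-off


def adaptive_probability(f, f_avg, f_max, k_high, k_low):
    """Crossover or mutation probability for an individual with fitness f.

    Above-average individuals get a probability that falls to zero at
    f == f_max, which protects the best solutions. Below-average
    individuals, and a fully converged population (f_max == f_avg),
    get the fixed value k_low, which keeps the search moving.
    """
    if f >= f_avg and f_max > f_avg:
        return k_high * (f_max - f) / (f_max - f_avg)
    return k_low
```

A typical use would call `roulette_select` once per slot of the new population and then call `adaptive_probability` twice per individual, once with the crossover coefficients and once with the mutation coefficients.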
  4. Simulation Analysis
To systematically verify the effectiveness and superiority of the method for operation risk assessment of high-current switchgear based on ensemble learning, a case study verification under actual operating conditions is conducted. The software used for simulation analysis is MATLAB R2024a (24.1.0.2537033), 64-bit (win64), released on 21 February 2024. Simulation parameters are shown in Table 5. The experimental data are derived from the actual operation record dataset of multiple high-current switchgears in a regional distribution network, which mainly includes 960 sets of equipment monitoring data for high-current switchgears and historical load current data recorded by the SCADA system. The equipment monitoring data cover the name of the monitoring device, specific location information, sampling time, load current, and the temperatures of the upper and lower contacts at the A/B/C three-phase moving contacts. The SCADA system records basic information such as load current and contact temperature, as well as multi-dimensional information including knife switch temperature, ambient temperature, and cabinet fan status. All data are preprocessed in accordance with the method described in Section 2. Data cleaning is performed using the MAD method, missing values are handled by Lagrange interpolation and similar-day imputation, dimensional differences are eliminated through Z-Score standardization, and KPCA is applied for dimensionality reduction of high-dimensional features to retain key information while improving model training efficiency. Of the 960 preprocessed sets, 720 are used for training and the remaining 240 for testing.
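Two of the preprocessing steps named above, MAD-based cleaning and Z-Score standardization, can be sketched with the Python standard library. The modified z-score form with the constant 0.6745 and the cutoff of 3.5 is a common convention and is an assumption here, since the paper's exact MAD settings are described in Section 2; the function names and sample readings are hypothetical.

```python
import statistics


def mad_outliers(values, cutoff=3.5):
    """Flag points whose modified z-score 0.6745 * |x - median| / MAD
    exceeds the cutoff. Returns one boolean per input value."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return [False] * len(values)  # no spread, nothing can be flagged
    return [0.6745 * abs(v - med) / mad > cutoff for v in values]


def z_score(values):
    """Standardise to zero mean and unit (population) standard deviation.

    Assumes the values are not all identical.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Hypothetical contact temperature readings with one spurious spike.
readings = [40.0, 41.0, 39.0, 40.0, 42.0, 95.0]
flags = mad_outliers(readings)
cleaned = [v for v, bad in zip(readings, flags) if not bad]
standardised = z_score(cleaned)
```

Because the median and the MAD are barely affected by the spike, the spurious reading is flagged while the ordinary readings are kept, which is the main reason for preferring MAD over a mean-based rule for cleaning.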
The overall simulation process is illustrated in Figure 3.
To comprehensively evaluate the performance of the proposed algorithm, three representative methods were selected as baselines. Baseline 1 [36]: It adopts an improved RF model for feature selection and preliminary evaluation, and uses the traditional LSTM to predict the load evolution trend and temperature rise process. However, it has limitations in predicting dynamic change trends. Baseline 2 [37]: It employs an improved LSTM network with an attention mechanism for evaluation. Baseline 2 can effectively capture the temporal features in the data, but lacks an effective correction mechanism for systematic prediction biases that may be caused by complex nonlinear disturbance factors. Baseline 3 [38]: It utilizes the traditional LSTM and ABPNN for risk assessment. Based on its nonlinear fitting ability, a direct mapping model from multi-dimensional operating parameters to risk levels is established. Nevertheless, it is difficult to capture the long-term dependencies of load fluctuations and temperature rise evolution, and thus cannot effectively identify key temporal features.
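Taking temperature rise prediction as an example, the algorithm’s performance is verified. The following performance metrics are considered: the coefficient of determination (R2), mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE). The formulas are as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2}$$
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| \hat{y}_i - y_i \right|$$
$$\mathrm{MAPE} = \frac{100\%}{N} \sum_{i=1}^{N} \left| \frac{\hat{y}_i - y_i}{y_i} \right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( \hat{y}_i - y_i \right)^2}$$
where $N$ is the total number of samples, $\hat{y}_i$ represents the predicted value of the $i$-th sample, $y_i$ represents the true value of the $i$-th sample, and $\bar{y}$ is the mean of the true values. Taken together, these metrics describe both the reliability and the prediction accuracy of the model.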
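As a worked illustration, the four metrics can be computed directly from paired true and predicted values. The sketch below uses the standard textbook definitions; the function name and the three sample values are hypothetical and are not taken from the experimental data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MAE, MAPE (in percent), and RMSE for paired samples."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": sum(abs(p - t) for t, p in zip(y_true, y_pred)) / n,
        # MAPE is undefined when a true value is zero; temperature rises are assumed positive.
        "MAPE": 100.0 * sum(abs((p - t) / t) for t, p in zip(y_true, y_pred)) / n,
        "RMSE": math.sqrt(ss_res / n),
    }


if __name__ == "__main__":
    # Hypothetical temperature rises (K): true values and model predictions.
    true_rise = [40.0, 50.0, 60.0]
    predicted_rise = [42.0, 49.0, 57.0]
    for name, value in regression_metrics(true_rise, predicted_rise).items():
        print(f"{name}: {value:.4f}")
```

For these three samples the errors are 2, -1, and -3, which gives MAE = 2.0, MAPE = 4.0%, RMSE = sqrt(14/3), and R2 = 1 - 14/200 = 0.93.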
Figure 4 presents a comparison of temperature rise prediction results for 30 samples by various algorithms when the load current is 1 kA. 
Figure 5 shows the corresponding comparison of relative errors in the prediction results. 
Table 6 displays the results of the various indicators for each algorithm. The data in Figure 4, Figure 5, and Table 6 indicate that the R2 of the proposed algorithm reaches 0.981, which is 15.4%, 4.9%, and 24.8% higher than that of Baseline 1, Baseline 2, and Baseline 3, respectively, demonstrating a significant enhancement in fitting ability. Compared with Baseline 1, Baseline 2, and Baseline 3, the MAE of the proposed algorithm is reduced by 64.7%, 48.7%, and 72.8%, respectively; the MAPE is reduced by 65.1%, 49.5%, and 73.5%, respectively; and the RMSE is reduced by 63.9%, 45.3%, and 69.8%, respectively, showing lower prediction errors and higher accuracy. The reason is that, in contrast to the three baselines, the proposed algorithm makes full use of a multi-level ensemble learning framework. The first layer adopts an improved RF for feature extraction; by optimizing the feature selection mechanism and tree structure parameters, it improves the model’s ability to screen key operational features and the accuracy of the preliminary evaluation. The second layer introduces an attention mechanism into the LSTM, effectively capturing long-term dependencies in the time series and achieving accurate prediction of load evolution trends and temperature rise processes. In addition, the third layer constructs a prediction correction model based on ABPNN, which effectively corrects the risk assessment results output by the LSTM.
Figure 6 shows the prediction results of switchgear contact temperature rise under different load currents for the various algorithms. The prediction curve of the proposed algorithm closely matches the real temperature rise curve over the tested load-current interval. The explanation is the same as for Figure 4: the first layer of the ensemble learning framework extracts key features through an improved RF, the second layer uses an attention-based LSTM to capture the temporal dependencies in the temperature rise process, and the third layer uses the ABPNN model for final correction of the prediction results, with its nonlinear fitting ability correcting the residual bias left by the previous layers. This allows the complex physical relationship between temperature rise and load current to be reproduced accurately, so the model maintains high accuracy in temperature rise prediction under complex operating conditions.
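The idea of a final correction stage can be sketched as residual learning: a second model is fitted to the error left by the upstream predictor, and its output is added back to the upstream prediction. The sketch below deliberately swaps in a least-squares line for the ABPNN, purely to keep the example short; the function names and the numbers are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y ~ slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return slope, mean_y - slope * mean_x


def fit_residual_corrector(load_current, base_pred, y_true):
    """Learn the systematic bias of an upstream predictor as a function of load.

    A straight line stands in for the ABPNN correction model (an assumption
    made for brevity, not the model used in the paper).
    """
    residuals = [t - p for t, p in zip(y_true, base_pred)]
    slope, intercept = fit_line(load_current, residuals)

    def correct(current, prediction):
        # Corrected output = upstream prediction + predicted residual.
        return prediction + slope * current + intercept

    return correct


if __name__ == "__main__":
    # Hypothetical data: the upstream model underestimates by 2% at every load level.
    load_ka = [0.6, 0.8, 1.0, 1.2]
    true_rise = [30.0, 40.0, 50.0, 60.0]
    upstream = [29.4, 39.2, 49.0, 58.8]
    correct = fit_residual_corrector(load_ka, upstream, true_rise)
    print([round(correct(i, p), 3) for i, p in zip(load_ka, upstream)])
```

A nonlinear corrector follows the same pattern; only the model fitted to the residuals changes.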
Figure 7 shows the RMSE convergence curves of the different algorithms during training. Compared with the convergence values of Baseline 1, Baseline 2, and Baseline 3, the RMSE of the proposed algorithm is reduced by 64.25%, 45.34%, and 69.83%, and the convergence speed is improved by 44.90%, 22.86%, and 6.90%, respectively. The faster convergence is attributed to the improved RF model used for feature extraction and selection in the first layer. By combining the information gain ratio and feature stability for joint selection, and introducing the Spearman correlation coefficient to identify and remove mutually correlated redundant features, it ensures that the feature set passed to the subsequent deep learning models consists of key influencing factors that are strongly related to the prediction target. This makes the learning process more efficient and significantly accelerates convergence. In the second layer of the ensemble learning framework, an improved LSTM network with an attention mechanism is constructed. The switchgear operational data exhibit complex temporal dependencies, and the contribution of features at different time periods to risk assessment varies. The temporal attention module enables the LSTM to dynamically assign weights to the input features at different time steps, focusing on the information that is most decisive for the current prediction. This greatly enhances the model’s ability to capture complex temporal features in the data and makes it particularly sensitive to anomalous data. At the same time, the optimized gating mechanism alleviates the gradient problems that arise in long sequences, ensuring stable training. The attention mechanism also allows the model to identify and use key information more effectively, which limits overfitting to less important information. As a result, the model converges more quickly to a good solution during training and achieves lower prediction errors.
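The redundancy-removal step can be sketched as a greedy filter: features are visited in order of importance, and a feature is kept only if its Spearman correlation with every feature already kept stays below a cut-off. The sketch assumes the importance ranking (from the gain ratio and stability criteria) is already available, and the cut-off of 0.9, the function names, and the sample values are all hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx and sy else 0.0


def drop_redundant(features, ranked_names, threshold=0.9):
    """Keep features in importance order, skipping any that duplicate a kept one."""
    kept = []
    for name in ranked_names:
        if all(abs(spearman(features[name], features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


if __name__ == "__main__":
    # Hypothetical monitoring columns; the ranking is assumed to come from the RF stage.
    features = {
        "contact_temp": [45, 52, 61, 73, 88],
        "load_current": [0.6, 0.8, 1.0, 1.2, 1.4],
        "ambient_temp": [22, 25, 21, 24, 23],
    }
    ranking = ["contact_temp", "load_current", "ambient_temp"]
    print(drop_redundant(features, ranking))
```

In this toy data the load current rises monotonically with the contact temperature, so its rank correlation with the higher-ranked feature is 1 and it is dropped, while the ambient temperature is retained.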
Table 7 shows the prediction accuracy of the different algorithms for samples at various risk levels. Based on historical fault data, switchgear temperature, and potential safety hazards, the switchgear risk level is classified into five levels, as shown in Table 8. The proposed algorithm demonstrates a significant accuracy advantage, particularly in identifying high-risk (Level V) cases. Compared with Baseline 1, Baseline 2, and Baseline 3, its overall prediction accuracy is higher by 22.7%, 11.3%, and 5.1%, and its high-risk recognition accuracy is higher by 29.63%, 12.90%, and 9.37%, respectively. The fundamental reason lies in the collaborative effect of the models at each layer of the ensemble learning framework. The attention-based LSTM dynamically focuses on and weights the key information in the input temporal data. It assigns higher weights to features associated with high risk, allowing early and accurate warning and classification of potential Level V fault risks. When the second layer provides a preliminary risk level, especially one close to Level V, ABPNN performs secondary classification and optimization of these boundary or ambiguous samples through its nonlinear mapping ability and adaptive learning mechanism. This minimizes misclassification at the final output layer, gives the model higher confidence and reliability, and significantly reduces the probability of underestimating high-risk events.
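The weighting of time steps by an attention mechanism can be sketched as follows. The example assumes a simple dot-product scoring between each hidden state and a query vector, followed by a softmax; the exact scoring function of the proposed model is not reproduced here, and the function names and numbers are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(hidden_states, query):
    """Weight each time step by its relevance and return (weights, context vector).

    hidden_states: list of per-time-step vectors (e.g. LSTM outputs).
    query: vector against which each time step is scored (assumed dot-product scoring).
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, context


if __name__ == "__main__":
    # Hypothetical hidden states for three time steps; the last one mimics an abnormal rise.
    states = [[0.1, 0.2], [0.3, 0.1], [0.9, 0.8]]
    weights, context = temporal_attention(states, query=[1.0, 1.0])
    print([round(w, 3) for w in weights])
    print([round(c, 3) for c in context])
```

The time step with the largest score receives the largest weight, so the context vector passed downstream is dominated by the abnormal reading.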
 The confusion matrix shown in 
Figure 8 demonstrates the classification performance of the proposed algorithm on risk samples. The results show that the proposed algorithm achieves recognition accuracies of 94.2%, 94.8%, 94.0%, 93.2%, and 97.2% for risk levels I, II, III, IV, and V, respectively, maintaining high recognition accuracy across all levels. The reason is that, in the ensemble learning framework, ABPNN serves as the final correction layer, performing deep fusion and optimization of the preliminary risk assessment results from the first two layers. Traditional BP neural networks are prone to becoming stuck in local optima or suffering from the curse of dimensionality, but the adaptive genetic algorithm introduced in this paper effectively overcomes these shortcomings. It dynamically optimizes the network structure and weights of ABPNN to strengthen its global search capability, and it improves the learning of nonlinear features by adaptively adjusting the learning rate and the error feedback mechanism. Even if the first two layers exhibit slight deviations under certain complex conditions, ABPNN can correct these deviations, significantly improving the accuracy of the final risk level classification and the robustness of the system.
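Per-level recognition accuracy can be read from a confusion matrix as the diagonal entry divided by the row total, assuming rows hold the true levels and columns the predicted levels. The sketch below uses a made-up three-level matrix for brevity; the counts are hypothetical and do not reproduce Figure 8.

```python
def per_class_accuracy(confusion):
    """Fraction of each true class that was predicted correctly (row-wise recall)."""
    rates = []
    for i, row in enumerate(confusion):
        total = sum(row)
        rates.append(row[i] / total if total else 0.0)
    return rates


def overall_accuracy(confusion):
    """Fraction of all samples on the diagonal."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


if __name__ == "__main__":
    # Hypothetical counts: rows are true levels, columns are predicted levels.
    confusion = [
        [47, 3, 0],
        [2, 45, 3],
        [0, 1, 49],
    ]
    print([round(r, 3) for r in per_class_accuracy(confusion)])
    print(round(overall_accuracy(confusion), 3))
```

Here the per-level accuracies are 47/50, 45/50, and 49/50, and the overall accuracy is 141/150 = 0.94. Off-diagonal entries in the last row are the ones that matter most for safety, since they count high-risk samples assigned to a lower level.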
  5. Conclusions
To address the problem of accurately assessing the operational risks of switchgears under complex operating environments and severe load fluctuations, this paper has proposed a method for operation risk assessment of high-current switchgears based on ensemble learning. First, we have preprocessed and reduced the dimensionality of multi-source operational data to improve data quality. Then, via a multi-layer integrated learning framework, we have used improved RF to extract key features in the first layer, combined an attention mechanism with LSTM to capture temporal features and assess risks in the second layer, and corrected results with ABPNN fused with an adaptive genetic algorithm to enhance assessment performance in the third layer. Simulation results show that in temperature rise prediction, the proposed algorithm significantly improves the goodness-of-fit indicator R2, with increases of 15.4%, 4.9%, and 24.8% compared to Baseline 1, Baseline 2, and Baseline 3, respectively. Particularly, the accuracy of high-risk level identification is more prominently improved, with increases of 29.63%, 12.90%, and 9.37%, respectively. These results fully demonstrate that the proposed multi-layer integrated learning framework, through the collaborative operation of each process, effectively enhances the accuracy of temperature rise prediction and the reliability of operation risk assessment for high-current switchgears. Moreover, it exhibits excellent performance in identifying high-risk levels under complex operating conditions, providing strong support for the safe operation of switchgears.
Future research will optimize the algorithm structure through lightweight technologies such as model compression and knowledge distillation to significantly improve real-time processing capabilities. Meanwhile, by combining transfer learning methods to transfer knowledge from pre-trained models to new domains, the generalization performance of the algorithm in different application scenarios will be enhanced. In subsequent work, further exploration will be conducted on the setting of key model parameters, and targeted simulations will be carried out to obtain the basis for parameter optimization, thereby improving the adaptability and reliability of the risk assessment model to the complex operating conditions of high-current switchgear.