Online Prediction Method of Transmission Line Icing Based on Robust Seasonal Decomposition of Time Series and Bilinear Temporal–Spectral Fusion and Improved Beluga Whale Optimization Algorithm–Least Squares Support Vector Regression

Abstract: Due to the prevalent challenges of inadequate accuracy, unstandardized parameters, and suboptimal efficiency in icing prediction, this study introduces an innovative online method for icing prediction based on Robust STL-BTSF and IBWO-LSSVR. Firstly, this study adopts the Robust Seasonal Decomposition of Time Series and Bilinear Temporal-Spectral Fusion (Robust STL-BTSF) approach, which is demonstrably effective for preprocessing short-term, limited-sample data. Subsequently, a multi-faceted enhancement is applied to the Beluga Whale Optimization algorithm (BWO), integrating a nonlinear balancing factor, a population optimization strategy, an improved whale fall mechanism, and an elite learning scheme. Then, the Improved BWO (IBWO) is used to optimize the key hyperparameters of Least Squares Support Vector Regression (LSSVR), and a superior offline predictive part is constructed on this basis. In addition, an Incremental Online Learning algorithm (IOL) is imported. Integrating the two parts, an advanced online icing prediction model for transmission lines is built. Finally, simulations based on actual icing data demonstrate that the proposed method markedly enhances both the accuracy and speed of prediction, thereby presenting a sophisticated solution for icing prediction on transmission lines.


Introduction
Transmission lines encased in ice present multifaceted threats such as insulator flashover, conductor galloping, circuit tripping, power outages, and disruption of communication systems [1,2]. These phenomena critically undermine productivity and livelihoods while accruing considerable economic losses [3], thereby posing a substantial obstacle to the secure and consistent operation of power systems [4]. Consequently, the investigation of icing prediction models is paramount for strategic line planning, efficient allocation of de-icing operations, and guaranteeing a safe and dependable electricity supply within the power grid [5]. In recent years, substantial research efforts have been undertaken by scholars, yielding significant advancements in the field of icing prediction for transmission lines. Present studies suggest that prediction models for icing on transmission lines can be broadly categorized into three distinct types, as delineated in Table 1 [6,7].
In recent years, the profound integration of machine learning into fault diagnosis and power load forecasting has spurred its extension into the realm of transmission line icing prediction, where it has demonstrated notable success [8]. Chen et al. advanced the field by proposing an online transmission line icing prediction model underpinned by data-driven methodologies. The findings reveal that this model surpasses traditional prediction models in both predictive accuracy and generalizability [9]. Manninen et al. introduced a novel machine-learning-based approach for predicting the health index of overhead transmission lines, specifically focusing on high-voltage overhead lines. The proposed method's elevated accuracy and applicability were corroborated through a comparative analysis of the model's predicted values against empirical data [10]. Nevertheless, most of the aforementioned studies neglected the optimization of hyperparameters in their machine-learning models, leading to unstable prediction accuracies and significant random fluctuations. Consequently, enhancing a model's predictive efficacy necessitates the integration of advanced artificial intelligence optimization techniques with machine-learning models for hyperparameter optimization, thereby constructing a sophisticated model for icing prediction [11]. Tang et al. employed classical Particle Swarm Optimization (PSO) as an intelligent optimization tool to iteratively refine the primary parameters of the SVR. The extensive training of the model with a substantial dataset substantiated the proposed model's remarkable enhancements in both effectiveness and accuracy. Nevertheless, this model exhibits limited efficacy in extracting feature dimensions from chaotic time series, and its complexity escalates substantially with increased embedding dimensions [12]. Sun et al. innovated with a hybrid icing prediction model, amalgamating the wavelet transform (WT) and the bat-algorithm-enhanced extreme learning machine (BA-ELM). In this model, WT is utilized for denoising meteorological data, while the BA optimizes input weights and bias thresholds, culminating in a robust icing prediction framework. However, this offline model does not enhance the standard ELM, resulting in constrained generalizability [13].
Table 1. Transmission line icing prediction models.

Physical Model
A mathematical model based on the physical process of water transitioning from liquid to solid.
Closer to the actual state of ice cover formation, offering high theoretical reliability.
Challenges in parameter acquisition; limited generalizability and practical utility.

Statistical Model
Processing historical data through mathematical and statistical techniques to identify key factors and model the thickness of icing.
Exhibits strong, reproducible fit to historical data; high prediction accuracy.
Highly data-dependent; computationally demanding.

Machine Learning Model
Integrates machine learning for predictive analysis, bypassing the physical processes of icing.
Minimal data requirements; rapid predictive capabilities; high accuracy.
The selection of model parameters can be subjective, impacting the prediction outcomes.
The icing prediction for transmission lines is inherently uncertain and nonlinear, attributed to its vulnerability to climatic variables such as temperature, humidity, and wind speed. Owing to their simplicity and adaptability, meta-heuristic algorithms, renowned for their efficacy in addressing nonlinear and multimodal challenges, have gained traction among scholars for parameter optimization in machine learning models. Nonetheless, meta-heuristic algorithms are commonly critiqued for their slow convergence rates and propensity to fall into local optima, necessitating further refinement. Enhancing and optimizing meta-heuristic algorithms is imperative to augment the predictive accuracy of conventional machine learning models. Additionally, it is essential to refine and optimize the standard machine-learning model to bolster its predictive precision [14]. Consequently, to further elevate predictive performance, both the algorithms and the models necessitate enhancement. Ma et al. amalgamated the traditional fireworks algorithm with a quantum optimization algorithm, significantly enhancing the efficacy of the optimization search. This methodology was employed in conjunction with a Support Vector Machine (SVM) to construct an advanced prediction model. Analytical verification has confirmed that this method markedly enhances both accuracy and solution speed [15]. Xia et al. introduced a novel similarity-based weighted SVR model. Parameters were optimized based on varying sample weights using a hybrid swarm intelligence optimization algorithm combining PSO and the ant colony algorithm, thereby enhancing the model's generalization capacity, as evidenced by experimental results [16]. The incorporation of Bayesian optimization into SVR facilitates automatic parameter adjustment, taking into full account the interrelated parameters [17,18]. In the practical deployment of prediction models hinged on icing mechanisms, the accuracy is invariably contingent upon the judicious selection of parameters. Nevertheless, the icing process is characterizable as a chaotic time series, exhibiting considerable complexity and non-linearity alongside inherent inertia. Consequently, employing LSSVR to delineate the relationship between icing thickness and environmental factors not only maximizes the extraction of pertinent information from these factors but also minimizes data redundancy concerning ice cover thickness, thereby enhancing both the accuracy and efficiency of the prediction model.
In the realm of meteorological data preprocessing, Zhou et al. proffered an innovative method designed for handling extended temporal sequences. This approach stratifies the dataset into two distinct categories, the smoothly evolving series and their oscillatory counterparts, serving as a predictive tool and imbuing the dataset with heightened resilience and interpretative clarity [19]. In pursuit of ascertaining the optimal configuration for a hybrid wave wind farm, Haces and co-authors introduced a pioneering wave wake preprocessing modality. This paradigm aligns every geographic coordinate with a preconceived wave wake through a judicious amalgamation of recursive optimization and genetic algorithms, subsequently culminating in the comprehensive imputation of any missing data facets [20]. In 2023, Altunkaynak et al. first introduced the Additive Seasonal Algorithm as an alternative data preprocessing algorithm for datasets with different meteorological characteristics, which decomposes the raw data into trend, seasonality, and error components to facilitate subsequent operations [21]. While the aforementioned meteorological data preprocessing techniques have yielded commendable outcomes, their methodologies predominantly cater to tumultuous, protracted temporal datasets. The meteorological data under scrutiny in this study, focusing on icing phenomena, is inherently punctuated by pronounced periodicity and abbreviated temporal extents. Hence, it is incumbent upon us to judiciously adopt data preprocessing strategies attuned to the distinctive attributes of this dataset.
In summary, this study introduces a novel online icing prediction methodology for transmission lines. The contributions are delineated as follows:
(1) Robust STL-BTSF, an advanced data preprocessing method, is adopted. This method orchestrates the data across temporal and spectral dimensions, extracting salient information efficiently, thereby enhancing the alignment and efficacy of feature fusion.
(2) A multi-faceted strategy is employed to address the inherent limitations of conventional BWO.This enhancement markedly accelerates the convergence speed and gains a substantial boost in optimization precision.
(3) An LSSVR is established, and IBWO is employed to optimize its parameters, setting up the offline predictive part. The IOL is then injected, and the online icing prediction model for transmission lines is built.
The chapters of this paper are organized as follows: In Section 2, the preprocessing method for meteorological data is introduced. Section 3 constructs the topology of the online prediction model. In Section 4, the workflow of the prediction model is analyzed, and the corresponding evaluation indexes are introduced. In Section 5, the experimental evaluation is carried out through example simulations. Finally, the paper is summarized in Section 6.

Data Preprocessing Method Based on Robust STL-BTSF
For the icing meteorological time series Yt, each observation can be decomposed into a trend component Tt, a periodic component St, and a residual component Rt, as shown in Equation (1):
Yt = Tt + St + Rt, t = 1, 2, ..., N (1)
Robust STL decomposes the sequence through two processes, an inner loop and an outer loop, as shown in Figure 1.

The inner loop fits the trend component and computes the periodic component of the time series, while the outer loop fine-tunes the robustness weights. These two iterative layers interact to smooth the ice-related meteorological data, which is characterized by seasonal cycles. After the Robust STL is applied, an iterative BTSF mechanism is introduced to explicitly capture the interdependencies among a profusion of time-frequency pairs. Representations are refined iteratively in a fuse-and-squeeze fashion, facilitated by Spectrum-to-Time (S2T) and Time-to-Spectrum (T2S) aggregation modules, as illustrated in Figure 2.
Specifically, each augmented time series xt is first transformed to the spectral domain by a Fast Fourier Transform (FFT), obtaining the spectral signal xs. Then xt and xs are delivered to two encoding networks for feature extraction, as shown in Equation (2):
Ft = EncoderA(xt; θt), Fs = EncoderB(xs; θs) (2)
where Ft and Fs are the temporal and spectral features, and θt and θs are the parameters of the encoding networks EncoderA and EncoderB, respectively. The approach then builds an iterative bilinear fusion with channel interaction between the features F(i, j) of the two domains and integrates them using BTSF, as shown in Equation (3). It then encodes cross-domain affinities to adaptively refine the temporal and spectral features through an iterative procedure, as in Equation (4).
S2T, T2S, and bilinear fusion jointly form a loop block in a fuse-and-squeeze manner. After several loops of Equations (3) and (4), the final bilinear feature Fbilinear is obtained.
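A minimal numpy sketch may help fix ideas. It replaces the learned encoders with random linear projections and reduces the S2T/T2S squeeze to simple channel averaging, so every name and shape here is an illustrative assumption rather than the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(1)
x_t = rng.normal(size=64)               # augmented time-domain series
x_s = np.abs(np.fft.rfft(x_t))          # spectral signal obtained via FFT

# Stand-ins for Encoder_A / Encoder_B: random linear maps to d feature channels
d = 8
W_t = rng.normal(size=(d, x_t.size))
W_s = rng.normal(size=(d, x_s.size))
F_t = np.tanh(W_t @ x_t)                # temporal features F_t
F_s = np.tanh(W_s @ x_s)                # spectral features F_s

# Bilinear fusion: pair every temporal channel with every spectral channel
F_bilinear = np.outer(F_t, F_s)         # F(i, j), shape (d, d)

# Crude "squeeze" back to per-domain refinements by channel-wise aggregation
F_t_refined = F_bilinear.mean(axis=1)   # spectrum-to-time (S2T) style
F_s_refined = F_bilinear.mean(axis=0)   # time-to-spectrum (T2S) style
```

In the actual mechanism, the fuse-and-squeeze loop of Equations (3) and (4) iterates this interaction several times before emitting Fbilinear.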


LSSVR
To address the challenge of hyperplane parameter selection, which often results in an unwarranted expansion of the solution space within SVR, the Least Squares Method is harnessed for optimization. The optimization criterion centers on the quadratic term of the error factor, thereby transforming inequality constraints into an equivalent set of equality constraints. Notably, the resolution of the optimization problem is reformulated as the solution of a system of linear equations, a transformation facilitated by the Karush-Kuhn-Tucker (KKT) conditions. This adaptation renders the method particularly suitable for the precise fitting of limited sample data. The optimization objective function is expressed in Equation (5), where xi is the sample vector, c is the penalty factor, ξ is the slack variable, ω is the normal vector, and b is the bias vector. By introducing the Lagrange multiplier method, the transformed optimization problem is shown in Equation (6), where α is the Lagrange multiplier. The optimal solution is obtained through the KKT conditions, and the regression function of LSSVR can be obtained by least squares solution, as shown in Equation (7), where k(x, xi) is the kernel function. In this paper, we choose the radial basis function (RBF) kernel, which has good generalization ability and a significant advantage when dealing with nonlinear problems, as shown in Equation (8), where σ² is the width parameter of the RBF. The selection of σ and c directly affects the performance of LSSVR. Therefore, to improve performance, an optimal-solution search for the values of σ and c is required. Meta-heuristic algorithms are widely used for such parameter optimization due to their superior ability to find optimal solutions.
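The KKT linear system described above can be written down directly. The sketch below solves the standard LSSVR dual system [[0, 1ᵀ], [1, K + I/c]] [b; α] = [0; y] with an RBF kernel; the toy data and the values of c and σ² are assumptions for illustration:

```python
import numpy as np

def rbf(X1, X2, sigma2):
    # k(x, x') = exp(-||x - x'||^2 / (2 * sigma^2)), sigma2 the width parameter
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma2))

def lssvr_fit(X, y, c=100.0, sigma2=0.5):
    n = len(y)
    K = rbf(X, X, sigma2)
    # KKT system: [[0, 1^T], [1, K + I/c]] [b; alpha] = [0; y]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / c
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]              # bias b, Lagrange multipliers alpha

def lssvr_predict(X_train, b, alpha, X_new, sigma2=0.5):
    # f(x) = sum_i alpha_i k(x, x_i) + b
    return rbf(X_new, X_train, sigma2) @ alpha + b

# Toy regression: fit a noisy sine
rng = np.random.default_rng(2)
X = np.linspace(0, 2 * np.pi, 60)[:, None]
y = np.sin(X[:, 0]) + rng.normal(0, 0.05, 60)
b, alpha = lssvr_fit(X, y)
pred = lssvr_predict(X, b, alpha, X)
assert np.mean((pred - y) ** 2) < 0.05
```

Because the constraints are equalities, training reduces to one linear solve; no quadratic program is needed, which is what makes the subsequent incremental update in the IOL section tractable.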

BWO
BWO is a new meta-heuristic algorithm proposed by Zhong in 2022 [22], inspired by the behaviors of beluga whales. The balance factor Bf determines the transition from the exploration phase to the exploitation phase, as expressed in Equation (9):
Bf = B0 (1 − T / (2 Tmax)) (9)
where T is the current iteration, Tmax is the maximum iteration, and B0 changes randomly between (0, 1). The processes are explained below.

Exploration Phase
The exploration phase considers pairs of swimming beluga whales. Their positions are updated according to Equation (10),
where X_{i,j}^{T+1} is the new position of the ith beluga whale on the jth dimension, pj is a random dimension index selected from the d dimensions (j = 1, 2, ..., d), X_{i,pj}^T and X_{r,p1}^T are the positions of the ith and rth beluga whales, and r1 to r7 are random numbers between (0, 1).

Exploitation Phase
The preying behavior inspires the exploitation phase, as expressed in Equation (11).
where X_i^T and X_r^T are the positions of the ith beluga whale and a random beluga whale, X_i^{T+1} is the new position, and X_best^T is the best position. C1 is the random jump strength and LF is the Levy flight function, as expressed in Equations (12) and (13), where σ = [Γ(1 + β) sin(πβ/2) / (Γ((1 + β)/2) · β · 2^((β−1)/2))]^(1/β), and u and v are normally distributed random numbers.
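The Levy flight term can be sampled with Mantegna's algorithm, which is the usual construction behind Equations (12) and (13); the shape parameter β = 1.5 below is a common choice, assumed here since the extracted text does not reproduce the equation bodies:

```python
import numpy as np
from math import gamma, sin, pi

def levy_flight(size, beta=1.5, rng=None):
    # Mantegna's algorithm: sigma scales the numerator normal distribution
    rng = rng or np.random.default_rng(3)
    sigma = (gamma(1 + beta) * sin(pi * beta / 2)
             / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, size)    # u ~ N(0, sigma^2)
    v = rng.normal(0.0, 1.0, size)      # v ~ N(0, 1)
    return u / np.abs(v) ** (1 / beta)

steps = levy_flight(1000)
assert steps.shape == (1000,) and np.isfinite(steps).all()
```

The heavy-tailed steps give the exploitation phase occasional long jumps, which is what lets a beluga escape a locally good but globally poor prey position.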

Whale Fall
During migration and foraging, a small number of beluga whales do not survive and fall to the deep seabed. Xstep is the step size of the whale fall model, expressed in Equations (14) and (15),
where ub and lb are the upper and lower boundaries of the variable, C2 is the step factor, C2 = 2Wf × n, and Wf is the probability of whale fall, Wf = 0.1 − 0.05T/Tmax.

Improving Methods of BWO
BWO, as a new type of meta-heuristic algorithm, offers the advantage of high optimization accuracy. However, it still exhibits shortcomings in terms of convergence speed and stability. Through research on beluga whales and the refinement of fundamental theories, a multi-strategy approach is proposed, leading to the development of IBWO. This multi-strategy consists of four parts, as follows:

Nonlinear Balancing Factor with Sigmoid Function
To improve the convergence speed, a nonlinear attenuation using an improved Sigmoid function is proposed to balance the exploration and exploitation of beluga whales. The improvement is shown in Equation (16), and the curve is shown in Figure 3. As shown in Figure 3, the balancing factor Bf decays nonlinearly, where t is the current iteration and T is the maximum iteration (set T = 100). This improvement effectively exerts the global optimization ability of beluga whales and further improves the exploitation capacity and optimization efficiency.
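Since the body of Equation (16) is not reproduced in the text above, the sketch below uses a generic logistic decay with an assumed steepness k merely to illustrate the intended behavior: Bf starts near 1 (exploration), crosses 0.5 around mid-run, and decays nonlinearly toward 0 (exploitation):

```python
import numpy as np

def balance_factor_sigmoid(t, T=100, k=10.0):
    # Assumed logistic decay (the paper's exact Eq. (16) is not reproduced):
    # near 1 early in the run, past 0.5 around t = T/2, near 0 at the end
    return 1.0 / (1.0 + np.exp(k * (t / T - 0.5)))

bf = balance_factor_sigmoid(np.arange(101))
assert bf[0] > 0.9 and bf[-1] < 0.1     # exploration early, exploitation late
assert np.all(np.diff(bf) < 0)          # strictly decreasing, nonlinear decay
```

Compared with the linear factor of Equation (9), the sigmoid shape holds Bf high for longer at the start and low for longer at the end, concentrating the phase transition in the middle of the run.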


Population Optimization Strategy
Currently, the beluga whale algorithm lacks a step for comparing the latest generation of search agents with the previous one. As a result, the selected individuals may not necessarily have the best fitness value. Therefore, in this paper, we propose a selection strategy based on the differential evolutionary algorithm to reselect the updated beluga whales, as shown in Equation (17), where fit(Xi) is the fitness value of Xi. By comparing fitness values to select the optimal position, we ensure that each updated beluga whale benefits the entire population. This approach makes full use of each iteration, thereby improving the optimization ability of the beluga whale population.
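The reselection of Equation (17) amounts to keeping, for each whale, whichever of the old and updated positions has the better fitness. A sketch, using a sphere objective as a stand-in fitness function:

```python
import numpy as np

def greedy_select(X_old, X_new, fitness):
    # Keep, per whale, the position with the lower (better) fitness, so an
    # update can never degrade an individual (Eq. (17)-style selection)
    f_old = np.apply_along_axis(fitness, 1, X_old)
    f_new = np.apply_along_axis(fitness, 1, X_new)
    keep_new = f_new < f_old
    return np.where(keep_new[:, None], X_new, X_old)

sphere = lambda x: np.sum(x ** 2)       # stand-in minimization objective
rng = np.random.default_rng(4)
X_old = rng.uniform(-5, 5, (6, 3))
X_new = X_old + rng.normal(0, 1, (6, 3))
X_sel = greedy_select(X_old, X_new, sphere)

f_sel = np.apply_along_axis(sphere, 1, X_sel)
f_old = np.apply_along_axis(sphere, 1, X_old)
assert np.all(f_sel <= f_old)           # never worse than the previous generation
```

This is the same elitist survivor selection used in differential evolution, which is why the text frames it as a DE-based strategy.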

Improvement Whale Fall Mechanism
When a beluga whale falls and leaves the population, the next generation of juvenile beluga whales is included in the optimization team to keep the population size constant. The specific formula is shown in Equation (18). The improved whale fall formula perturbs individual beluga whales in a manner that prevents them from becoming trapped in local optima during the mid-to-late stages. Additionally, it addresses the issue of poor population diversity in the later stages.

Elite Learning Strategy
After each iteration, a few beluga individuals with the lowest-ranked fitness values in the population are selected for learning. The objective of this learning process is to learn from the optimal beluga leader within the population. The specific formula for this improvement is shown in Equation (19), where Npop is the number of beluga whales in the population, X(index(Npop)) is the position of the beluga whale ranked Npopth in fitness value, and r1, r2, r3, r4 are random numbers between (0, 1). The positions of the above individuals after iteration float randomly within the range of (0.8, 1.1). At the conclusion of each generation's iteration, the four individuals with the poorest positions undergo elite learning to enhance the optimization performance of the population and to address issues such as slow convergence and low average fitness.
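A sketch of this elite learning step: the worst-ranked individuals are repositioned near the best leader, scaled by random factors in (0.8, 1.1) as stated above. The exact form of Equation (19) with r1 to r4 is not reproduced, so the update below is an illustrative simplification:

```python
import numpy as np

def elite_learning(X, fitness_vals, x_best, n_worst=4, rng=None):
    # Reposition the n_worst lowest-ranked whales near the best leader, each
    # coordinate scaled by a random factor in (0.8, 1.1); illustrative form,
    # not the exact Eq. (19)
    rng = rng or np.random.default_rng(5)
    worst = np.argsort(fitness_vals)[-n_worst:]   # largest fitness = worst rank
    for i in worst:
        X[i] = x_best * rng.uniform(0.8, 1.1, X.shape[1])
    return X

sphere = lambda x: np.sum(x ** 2)
rng = np.random.default_rng(5)
X = rng.uniform(-5, 5, (10, 3))
f = np.apply_along_axis(sphere, 1, X)
x_best = X[np.argmin(f)].copy()
X2 = elite_learning(X.copy(), f, x_best)
f2 = np.apply_along_axis(sphere, 1, X2)

# Every relearned individual lands within 1.21x of the leader's fitness,
# since each coordinate factor is below 1.1
assert np.all(f2[np.argsort(f)[-4:]] <= 1.21 * f[np.argmin(f)] + 1e-12)
```

Pulling only the four worst individuals keeps the rest of the population free to explore, while still raising the population's average fitness each generation.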

IBWO
This section focuses on the optimization process of IBWO; the pseudo-code is given in Algorithm 1, and the specific flow is shown in Figure 4.
Step 1: Initiate the parameterization of the IBWO. This involves the population size n, the maximum number of iterations Tmax, and the nonlinear balancing factor Bf. Initial positions are randomly generated in the search space, and the fitness value is obtained based on the objective function.
Step 2: Exploration phase. Based on Bf, each beluga whale enters either the exploration phase or the exploitation phase. If a beluga's Bf > 0.5, the update mechanism enters the exploration phase, and the beluga's position is updated by Equation (10).
Step 3: Exploitation phase. If Bf < 0.5, the beluga position is updated using Equation (11). Then, the fitness value of the new position is calculated and ranked based on the population optimization strategy of Equation (17), and the results are compared with those of the previous generation to retain the best.
Step 4: Whale fall phase. Some beluga whales die and descend into the deep sea; the probability of whale fall is Wf. The location is updated by the improved whale fall strategy of Equation (18).
Step 5: Elite learning phase. The post-foraging beluga population is updated again using the elite learning strategy of Equation (19), and elite learning is performed on the N individual beluga whales with the worst positions after the update.
Step 6: Termination condition check. If the current number of iterations is greater than the maximum number of iterations, the IBWO algorithm stops. Otherwise, repeat Step 2.
Algorithm 1: Pseudo-code of IBWO
1: Initialize the population and evaluate the fitness values to obtain the best solution
2: While T ≤ Tmax Do
3: Calculate the whale fall probability Wf; obtain the nonlinear balancing factor Bf based on the Sigmoid function of Equation (16)
4: For each beluga whale Xi Do
5: If Bf(i) > 0.5
6: // Exploration phase
7: Generate pj randomly from dimension (j = 1, 2, ..., d)
8: Randomly select a beluga whale Xr
9: Use Equation (10) to update the new position of the ith beluga whale
10: Else
11: // Exploitation phase
12: Update the position of the ith beluga whale by Equation (11)
13: End If
14: End For
15: Rank the fitness values and apply the population optimization strategy of Equation (17)
16: Update fallen whales with probability Wf by the improved whale fall strategy of Equation (18)
17: Apply the elite learning strategy of Equation (19) to the worst-ranked individuals
18: End While
19: Return the best solution
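Putting Steps 1 to 6 together, the skeleton below runs the whole IBWO loop on a sphere test function. The exploration and exploitation moves are schematic placeholders (the bodies of Equations (10) and (11) are not reproduced in the text), while the balance factor, whale fall probability, greedy selection, and elite learning follow the descriptions above:

```python
import numpy as np

def ibwo_sketch(obj, dim=5, n=20, t_max=60, lb=-5.0, ub=5.0, seed=6):
    # Simplified IBWO loop following Steps 1-6; the per-whale moves are
    # schematic stand-ins, not the paper's exact Eqs. (10)-(11)
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, (n, dim))
    fit = np.apply_along_axis(obj, 1, X)
    best, f_best = X[np.argmin(fit)].copy(), fit.min()
    for t in range(t_max):
        bf = 1.0 / (1.0 + np.exp(10.0 * (t / t_max - 0.5)))  # nonlinear balance factor
        wf = 0.1 - 0.05 * t / t_max                          # whale fall probability
        for i in range(n):
            xr = X[rng.integers(n)]
            if bf > 0.5:   # exploration: move relative to a random whale
                cand = X[i] + (xr - X[i]) * rng.uniform(-1.0, 1.0, dim)
            else:          # exploitation: contract toward the best position
                cand = best + rng.uniform(0.0, 1.0) * (best - X[i])
            cand = np.clip(cand, lb, ub)
            c_fit = obj(cand)
            if c_fit < fit[i]:                  # greedy population optimization
                X[i], fit[i] = cand, c_fit
        fall = rng.random(n) < wf               # improved whale fall: respawn
        if fall.any():
            X[fall] = rng.uniform(lb, ub, (fall.sum(), dim))
            fit[fall] = np.apply_along_axis(obj, 1, X[fall])
        worst = np.argsort(fit)[-4:]            # elite learning on the 4 worst
        X[worst] = best * rng.uniform(0.8, 1.1, (4, dim))
        fit[worst] = np.apply_along_axis(obj, 1, X[worst])
        if fit.min() < f_best:
            best, f_best = X[np.argmin(fit)].copy(), fit.min()
    return best, f_best

best, f_best = ibwo_sketch(lambda x: float(np.sum(x ** 2)))
assert f_best < 5.0
```

The greedy selection guarantees the best-so-far fitness is monotonically non-increasing, so the termination check in Step 6 can simply return the tracked best solution.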


IOL Model
The traditional offline batch learning method can obtain a faster learning rate but cannot dynamically update the regression model as new samples are added. The incremental learning algorithm adds a new sample during each iteration, which can fully utilize the results of the previous iteration and improve the prediction accuracy of the model.
The kernel function matrix is Q = K(x, xi), where Qt is a t × t square matrix, as shown in Equation (20). Let H(t) = Qt + C⁻¹I; then the above can be rewritten as Equation (21). At time t + 1, a new sample (xt+1, yt+1) is added to the sample set. H(t + 1) can be obtained from the KKT condition with the kernel function matrix Qt+1, as shown in Equation (22). The chunked matrix can be written as Equation (23), where L(t + 1) = [K(x1, xt+1), ..., K(xt, xt+1)]ᵀ and n(t + 1) = K(xt+1, xt+1) + C⁻¹.
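The point of the chunked form in Equation (23) is that H(t + 1)⁻¹ can be grown from H(t)⁻¹ with a Schur complement update instead of a full re-inversion. A numpy sketch, checked against the direct inverse (the kernel width and C below are assumed values):

```python
import numpy as np

def rbf_k(x1, x2, sigma2=0.5):
    return np.exp(-np.sum((x1 - x2) ** 2) / (2 * sigma2))

def grow_H_inverse(H_inv, L, n_new):
    # Blockwise inverse of H(t+1) = [[H(t), L], [L^T, n]] via the Schur
    # complement s = n - L^T H(t)^{-1} L, reusing the previous inverse
    v = H_inv @ L
    s = n_new - L @ v                   # scalar Schur complement
    top_left = H_inv + np.outer(v, v) / s
    top_right = -v[:, None] / s
    return np.block([[top_left, top_right],
                     [top_right.T, np.array([[1.0 / s]])]])

# Check against a direct inverse on a small RBF kernel matrix
rng = np.random.default_rng(7)
X = rng.normal(size=(6, 2))
C = 10.0
K = np.array([[rbf_k(a, b) for b in X] for a in X])
H = K + np.eye(6) / C                   # H(t+1) for t = 5

H_inv = np.linalg.inv(H[:5, :5])        # previous inverse H(t)^{-1}
L = H[:5, 5]                            # L(t + 1)
H_inv_new = grow_H_inverse(H_inv, L, H[5, 5])
assert np.allclose(H_inv_new, np.linalg.inv(H))
```

Each incoming sample therefore costs O(t²) instead of the O(t³) of refitting from scratch, which is what makes the online model viable as icing observations stream in.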

Forecasting Process
The foundational concept underpinning the modeling approach involves projecting meteorological datasets, encompassing variables such as ambient temperature, relative humidity, and wind speed, into an expansive, high-dimensional feature space. This is achieved through the establishment of a nonlinear mapping. Subsequently, linear regression analyses are conducted on the icing phenomena within this feature space. Features are extracted from the time series data to obtain feature vectors. Then, using kernel functions, these feature vectors are mapped to a higher-dimensional feature space, making the data easier to fit with linear models. Finally, the LSSVR model is used to train on and predict the mapped data. The procedural methodology of the icing prediction model, based on the integration of Robust STL-BTSF and IBWO-LSSVR, is delineated in Figure 5. This figure provides a visual representation of the sequential steps and analytical processes involved in the model's formulation.
Accordingly, the online icing prediction model can be divided into the following six parts:
Step 1: Meteorological data preprocessing based on Robust STL-BTSF.
Step 2: Establish the LSSVR model. Choose the kernel function, the regularization constant c, and the kernel hyperparameter g.
Step 3: Optimize the model parameters. A multi-strategy method is proposed to improve the BWO, and the IBWO is used to find the optimal values of the parameters mentioned in Step 2.
Step 4: Construct an IOL online model. An incremental learning algorithm is used to construct an online prediction model for ice cover.
Step 5: Establish and train the IBWO-LSSVR online prediction model. The processed data is fed into the model, building the icing prediction model for transmission lines.
Step 6: Test and evaluate the model. Test the performance and choose evaluation indexes to evaluate the model and verify its effectiveness.


Evaluation Index
To verify the superiority of the proposed model, four evaluation indexes, namely the Mean Square Error (MSE), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE), are used for comparative analysis. The formula for each evaluation index is shown in Equations (24)-(27), where MSE measures the degree of difference between the predicted values and the true values; RMSE is the square root of the mean of the squared deviations of the predicted values from the true values over the n observations; MAE is the mean of the absolute deviations of the predicted values from the true values; and MAPE is the mean of the absolute difference between the predicted and true values divided by the true value. The smaller the values of the above assessment indicators, the better the model.
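The four indexes follow their standard definitions; a short Python helper matching the descriptions above (the equation-number comments map onto Equations (24)-(27)):

```python
import numpy as np

def evaluate(y_true, y_pred):
    err = y_pred - y_true
    mse = np.mean(err ** 2)                          # Eq. (24): mean squared error
    rmse = np.sqrt(mse)                              # Eq. (25): root of MSE
    mae = np.mean(np.abs(err))                       # Eq. (26): mean absolute error
    mape = np.mean(np.abs(err / y_true)) * 100.0     # Eq. (27): in percent
    return mse, rmse, mae, mape

# Toy icing-thickness values (mm), illustrative only
y_true = np.array([10.0, 12.0, 8.0, 11.0])
y_pred = np.array([9.5, 12.5, 8.0, 10.0])
mse, rmse, mae, mape = evaluate(y_true, y_pred)
assert abs(mae - 0.5) < 1e-12
```

Note that MAPE is undefined when a true value is zero, so in practice near-zero icing thicknesses should be excluded or floored before computing it.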

Data Sources
Icing data of transmission lines were collected, and 150 sets of icing thicknesses along with related meteorological data were recorded over a period of 2 months during recent winters, as depicted in Figure 6. The prediction analysis and calculations for icing on the lines were carried out on the MATLAB R2018a platform. The processing unit is an Intel Xeon Gold 6136 × 2, the display computer is equipped with NVIDIA TITAN Xp × 4 graphics cards, and the memory consists of Samsung server DDR4 16 G × 12.

To validate the model proposed in this paper, the sequence in Figure 6 is divided into a training set and a testing set. For the offline model, the data are split into training and testing sets in a 7:3 ratio. For the online model, the training and testing sets are allocated in a 9:1 ratio.
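The sequential splits above can be sketched as follows; this is a minimal illustration, and the helper function and placeholder series are assumptions, not the paper's code.

```python
def time_series_split(data, train_ratio):
    """Split a series sequentially, preserving temporal order (no shuffling)."""
    cut = round(len(data) * train_ratio)
    return data[:cut], data[cut:]

series = list(range(150))                               # placeholder for the 150 icing samples
train_off, test_off = time_series_split(series, 0.7)    # offline model, 7:3
train_on, test_on = time_series_split(series, 0.9)      # online model, 9:1
```

Keeping the split sequential matters for time series: the test set must consist of samples that come after the training window.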

Optimization Performance Tests of IBWO
To compare the optimization performance of meta-heuristic algorithms, we use CEC benchmark functions. In this paper, three single-peak, three multi-peak, and three fixed-dimensional multi-peak benchmark test functions are selected to test and comparatively analyze PSO, the Grey Wolf Optimization algorithm (GWO), Improved GWO (IGWO), BWO, and the proposed IBWO. The parameter set is shown in Table 2, the benchmark test functions are shown in Table 3, the convergence curves are shown in Figures 7 and 8, and the results of the function simulation test and the Wilcoxon rank-sum test are shown in Tables 4 and 5, respectively.


Table 3 lists, for each benchmark function, its expression, dimension (Dim), range, and minimum (fmin); for example, the Rastrigin function, f(x) = Σ [x_i^2 − 10cos(2πx_i) + 10], with Dim = 30, range [−5.12, 5.12], and fmin = 0. As illustrated in Table 4, IBWO achieves the smallest optimal value on all nine test functions, F1 to F9, in comparison with the other algorithms. Regarding the average and standard deviation (STD), IBWO consistently identifies the minimum value of each function, convincingly demonstrating the stable optimization search performance of the proposed algorithm on the benchmark functions.
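One of the benchmark functions in Table 3 is the Rastrigin function; a minimal implementation for reference, assuming its standard form (the function name is an assumption):

```python
import math

def rastrigin(x):
    """Rastrigin benchmark: f(x) = sum(x_i^2 - 10*cos(2*pi*x_i) + 10).
    Highly multimodal; global minimum f(0, ..., 0) = 0 on [-5.12, 5.12]^dim."""
    return sum(xi ** 2 - 10 * math.cos(2 * math.pi * xi) + 10 for xi in x)

# the known global optimum in the 30-dimensional case used in the paper
best = rastrigin([0.0] * 30)
```

The many regularly spaced local minima of this function are what make it a useful stress test for an optimizer's ability to escape local optima.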
Meanwhile, as shown in Figure 8 and Table 4, IBWO not only finds the optimal fitness value but also exhibits a faster convergence speed than the remaining four algorithms. Specifically, the convergence curves of F1, F3, F4, etc., demonstrate the efficiency and effectiveness of the two improvements of the nonlinear balancing factor and the elite learning strategy. Moreover, in the results for F4, F5, F6, etc., the optimal value is already found at an early stage of the iterations: the convergence speed is fast, while the optimization accuracy is high. Combining the above analysis, compared with PSO, GWO, IGWO, and BWO, the IBWO algorithm proposed in this paper has superior optimization performance and high stability.
In addition, Table 5 presents the Wilcoxon rank-sum test results for the nine test functions mentioned above, aimed at assessing whether IBWO is significantly different from the other four algorithms; the p-value is calculated for this purpose. Under the standard significance level of p = 0.05, if the p-value is greater than 0.05, there is no significant difference between an algorithm and IBWO; otherwise, the difference from IBWO is significant. According to Table 5, the p-values calculated for the four algorithms are far less than 0.05, indicating that the optimization advantage of IBWO on the benchmark test functions is significant.
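The significance decision described above can be illustrated with a self-contained rank-sum test using the normal approximation. The fitness samples below are hypothetical stand-ins, not the experimental values behind Table 5.

```python
import math

def rank_sum_p(a, b):
    """Two-sided Wilcoxon rank-sum test via the normal approximation
    (adequate for samples of this size; assumes no tied values)."""
    combined = sorted(a + b)
    ranks = {v: i + 1 for i, v in enumerate(combined)}  # 1-based pooled ranks
    w = sum(ranks[v] for v in a)                        # rank sum of sample a
    n1, n2 = len(a), len(b)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    # two-sided p-value from the standard normal distribution
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# hypothetical final-fitness values from repeated optimizer runs
ibwo_runs = [0.010, 0.012, 0.011, 0.015, 0.013, 0.016, 0.014, 0.018]
bwo_runs = [0.090, 0.110, 0.100, 0.120, 0.095, 0.105, 0.115, 0.098]
p = rank_sum_p(ibwo_runs, bwo_runs)
significant = p < 0.05       # the significance criterion used in the paper
```

With the two samples fully separated, as here, the p-value falls far below 0.05, mirroring the kind of result the paper reports.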


Example 1: Data Preprocessing Method
To verify the effectiveness of the data preprocessing method, a comparison experimental group is set up: 20 sets of continuous winter meteorological data are input, and Robust STL-BTSF and BTSF are tested for alignment. The distance distribution results are shown in Figure 9. In addition, after verifying the superiority of IBWO above, the icing prediction results obtained by the two methods based on the IBWO-LSSVR model are shown in Figure 10, and the comparison of data preprocessing methods is shown in Table 6. As shown in Figure 9, compared with BTSF, Robust STL-BTSF has the higher average orthogonal feature distance, which preserves the maximum information of the data, indicating that Robust STL-BTSF achieves the better alignment of the data features. Analyzing Figure 10 and Table 6, Robust STL-BTSF improves the running time by 20.5%. In terms of MSE, BTSF processing suffers from problems such as insufficient feature extraction and low pre-prediction accuracy. In summary, having verified the superiority of Robust STL-BTSF in processing small-sample, short time-series meteorological data, all the following sections use this method to analyze and preprocess meteorological data.

Example 2: Model Experiment
To verify the superiority of the prediction model, Robust STL-BTSF and IBWO-LSSVR are simulated and compared with a variety of models, and the iterative adaptation curves are shown in Figure 11. Analysis of Figure 11 reveals that the proposed model has a strong optimization ability in the pre-iteration period, with lower adaptation values than the other four models at the same number of iterations. Furthermore, the optimization ability of this model significantly improves during the middle of the iteration process. A clear superiority can be observed compared to the other models, with optimum performance reached at 30 iterations; in contrast, the other models have not yet achieved their best fitness values at this point.

Example 3: Simulation Analysis of Online Icing Prediction of Transmission Lines
Model testing and evaluation were carried out, and the Robust STL-BTSF and IBWO-LSSVR model was used for icing prediction and model evaluation. The comparison of evaluation index results is shown in Table 7, the residual plot is shown in Figure 12, and the comparison of prediction results is shown in Figure 13. Analyzing Table 7, Robust STL-BTSF and IBWO-LSSVR has the lowest error values on all four evaluation indexes, which validates the prediction accuracy of the model. Furthermore, as seen in Figures 12 and 13, compared with other methods, the residual values of the proposed model are more evenly distributed on both sides of the central axis, which shows that its prediction effect is better than that of the other methods. In addition, the model proposed in this paper tracks the real ice cover thickness most closely on the prediction curves, which further verifies the superiority of the proposed method.

Conclusions
To improve the speed and accuracy of the online icing prediction model and to guarantee the safety and reliability of the power supply to the grid, this paper proposes an online icing prediction method for transmission lines, drawing the following conclusions: (1) Robust STL-BTSF handles the time-frequency domain decomposition and fusion of small-sample, short time-series meteorological data, effectively retaining the useful information and improving the alignment of the feature quantities.
(2) BWO is improved by introducing a multi-strategy method comprising a Sigmoid-based nonlinear balancing factor, a population preference strategy, an improved whale fall strategy, and an elite learning strategy. The superiority of IBWO in convergence speed and optimization-seeking accuracy is verified through nine typical test functions.
(3) Using IBWO to optimize LSSVR and introducing IOL, an online icing prediction model is constructed, realizing online updating of the regression function and the prediction model. The prediction speed and accuracy are greatly improved.
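As a loose illustration of the online-updating idea in conclusion (3), the sketch below shows the general shape of incremental learning over a sliding window of recent samples. A trivial windowed-mean predictor stands in for the actual IBWO-LSSVR with IOL; the class, window size, and sample values are assumptions for illustration only.

```python
from collections import deque

class SlidingWindowPredictor:
    """Minimal sketch of online updating: each newly observed icing sample
    joins the training window and the model is refreshed. The paper's IOL
    updates the LSSVR regression function; a windowed mean stands in here."""

    def __init__(self, window=50):
        self.window = deque(maxlen=window)   # retains only recent observations

    def update(self, new_sample):
        self.window.append(new_sample)       # absorb the new measurement

    def predict(self):
        # stand-in regressor: mean of the current window
        return sum(self.window) / len(self.window)

model = SlidingWindowPredictor(window=3)
for thickness in [5.0, 6.0, 7.0, 8.0]:       # hypothetical icing thicknesses
    model.update(thickness)                  # oldest sample drops out at capacity
```

The point of the sketch is structural: prediction quality tracks recent conditions because stale samples age out of the window as new ones arrive.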
In the future, we will focus on areas with longer ice cover cycles within the year. Furthermore, research will be conducted on prediction methods for transmission line icing scenarios with long-term cycles and large-scale data to further improve prediction accuracy. In addition, we plan to conduct research based on the physical model of the icing mechanism and to construct a complete, reasonable, and efficient icing prediction and evaluation index model.

Figure 2.
Specifically, each augmented time-series sample x_t is first transformed to the spectral domain by a Fast Fourier Transform (FFT), obtaining the spectral signal x_s. Then, x_t and x_s are delivered to two encoding networks for feature extraction. The process is shown in Equation (2):
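The FFT step described above can be sketched as follows; the signal is synthetic, and NumPy stands in for whatever implementation the authors used.

```python
import numpy as np

# Transform an augmented time-series sample x_t to its spectral counterpart
# x_s, the first step of BTSF described above. Synthetic data, not icing data.
t = np.arange(64)
x_t = np.sin(2 * np.pi * 5 * t / 64)      # a 5-cycle sinusoid over 64 points
x_s = np.fft.fft(x_t)                     # spectral-domain signal x_s

magnitude = np.abs(x_s)
dominant = int(np.argmax(magnitude[:32]))  # strongest positive-frequency bin
```

For a pure 5-cycle sinusoid, the spectral magnitude concentrates in bin 5, which is the kind of frequency-localized feature the spectral encoder can exploit alongside the raw temporal features.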

Figure 10. Comparison of data preprocessing methods.


Figure 12. Residual map of test results.

Figure 13. Comparison of test results.


Table 2. Parameters of algorithms.

Table 4. Simulation results of the benchmark test function.


Table 6. Comparison of preprocessing methods.


Table 7. Comparison table of evaluation indicators.
