An Improved Dung Beetle Optimizer with Kernel Extreme Learning Machine for High-Accuracy Prediction of External Corrosion Rates in Buried Pipelines

Gao, Yiqiong; Luo, Zhengshan; Wang, Bo; Mu, Dengrui

doi:10.3390/sym18010167


¹ School of Civil Engineering, Longdong University, Qingyang 745000, China

² School of Management, Xi’an University of Architecture and Technology, Xi’an 710055, China

* Author to whom correspondence should be addressed.

Symmetry 2026, 18(1), 167; https://doi.org/10.3390/sym18010167

Submission received: 22 December 2025 / Revised: 8 January 2026 / Accepted: 12 January 2026 / Published: 16 January 2026

(This article belongs to the Section Engineering and Materials)


Abstract

Accurate prediction of the external corrosion rate is crucial for the integrity management and risk assessment of buried pipelines. However, existing prediction models often suffer from low accuracy, instability, and overfitting. To address these challenges, this study proposes a hybrid model, FA-IDBO-KELM. First, Factor Analysis (FA) was employed to reduce the dimensionality of ten original corrosion-influencing factors, extracting seven principal components to mitigate multicollinearity. Subsequently, the hyperparameters (penalty coefficient C and kernel parameter γ) of the Kernel Extreme Learning Machine (KELM) were optimized using an Improved Dung Beetle Optimizer (IDBO). The IDBO incorporates four key enhancements over the standard DBO: spatial pyramid mapping (SPM) for population initialization, a spiral search strategy, Lévy flight, and an adaptive t-distribution mutation strategy to prevent premature convergence. The model was validated using a dataset from the West–East Gas Pipeline, with 90% of the data used for training and 10% for testing. The FA-IDBO-KELM model achieved a root mean square error (RMSE) of 0.0028, a mean absolute error (MAE) of 0.0021, and a coefficient of determination (R²) of 0.9954 on the test set. Compared to benchmark models (FA-KELM, FA-SSA-KELM, FA-DBO-KELM), the proposed model reduced the RMSE by 93.0%, 89.1%, and 85.3%, and improved the R² by 85.7%, 10.6%, and 7.4%, respectively. The FA-IDBO-KELM model provides an accurate and reliable tool for predicting the external corrosion rate, which can support pipeline maintenance decision-making.

Keywords: pipeline safety; corrosion rate prediction; factor analysis (FA); improved dung beetle optimizer (IDBO); kernel extreme learning machine (KELM); meta-heuristic optimization

1. Introduction

Buried pipelines are critical infrastructures for energy transportation, and their structural integrity is paramount to national security and economic stability. The complex interplay of various environmental factors leads to external corrosion, a primary threat causing pipeline failures and leaks. Therefore, developing accurate and robust models for predicting corrosion rates is essential for implementing proactive maintenance strategies and ensuring operational safety.

Pipeline corrosion prediction models have evolved from traditional statistical methods to sophisticated machine learning and hybrid intelligent approaches. Early efforts in the 1990s pioneered the use of Backpropagation (BP) neural networks to circumvent the complex process of elucidating individual factor influences [1]. This approach laid the groundwork for machine learning applications in this domain. Subsequent research focused on enhancing these models through optimization algorithms. For instance, Xiao R G et al. [2] integrated the Atom Search Optimization (ASO) algorithm with BP, while Wang W H [3] and Xu L et al. [4] employed improved Particle Swarm Optimization (PSO) to optimize BPNN, GRNN, and SVR models. However, these models often suffered from limitations in training efficiency and in the capability of PSO to resolve complex, high-dimensional problems. To address the limitations of single models, ensemble and hybrid approaches gained prominence. Researchers such as Qu Z H [5], Guan E D [6], and Cui J G [7] developed prediction models based on the Random Forest regression algorithm. While effective, these models are prone to overfitting, necessitating careful feature selection. In pursuit of higher accuracy, Ma M T [8] and Lu H F et al. [9] employed advanced optimization algorithms such as GOA, CSO, and the Multi-objective Salp Swarm Algorithm to refine Relevance Vector Machine (RVM) models. Despite their potential, these hybrid models often have high computational complexity and reduced efficiency. A significant trend in the literature is the integration of feature engineering with machine learning. Zhang X S et al. [10] proposed models incorporating improved Random Forest Feature Selection (RFFS) and Gravitational Search Algorithm-Optimized SVR (GSA-SVR). Similarly, Peng S B et al. [11] utilized Principal Component Analysis (PCA) combined with a Chaotic Particle Swarm Optimization (CPSO)-SVR model, while Zahra N et al. [12] adopted a hybrid PSO-Genetic Algorithm (GA) to optimize SVR. A common drawback of SVR-based models is their slow computational speed and susceptibility to overfitting. More recently, deep learning approaches have been explored. In 2024, Guang Y et al. [13] applied a Deep Neural Network (DNN) to extracted corrosion characteristic data, yet the prediction accuracy remained suboptimal and requires further enhancement.

In summary, while the existing literature offers valuable contributions, the prevailing models frequently suffer from several limitations: low prediction accuracy, unstable error performance, low training efficiency, and a pronounced tendency towards overfitting. A critical issue observed in many optimization algorithms underpinning these models is an asymmetry between global exploration and local exploitation capabilities. This imbalance often leads to premature convergence, as seen in PSO or BP neural networks [14], or computationally intensive processes, as seen in SVR-based approaches [15]. The challenge lies in developing an optimizer that can effectively balance these two phases to avoid local optima while efficiently converging to a global optimum. Recent meta-heuristic algorithms, such as the Dung Beetle Optimizer (DBO), have shown promise in addressing this challenge due to their unique mechanisms for maintaining population diversity and search efficiency [16]. However, the standard DBO itself may suffer from an uneven initial population distribution and a tendency to fall into local optima [17], indicating a need for further improvement to achieve a more symmetric and robust search behavior.

To bridge these research gaps, this study introduces a hybrid modeling framework, FA-IDBO-KELM, designed to achieve a more symmetric balance between exploration and exploitation. We hypothesize that integrating effective feature extraction via Factor Analysis (FA) with a robust meta-heuristic optimizer (Improved Dung Beetle Optimizer, IDBO) for tuning a powerful learner (Kernel Extreme Learning Machine, KELM) will yield superior prediction accuracy and stability. The primary contributions of this research are threefold: (1) the proposal of a comprehensive FA-IDBO-KELM framework for pipeline corrosion rate prediction; (2) the introduction of four strategic enhancements—Spatial Pyramid Matching (SPM) initialization, spiral search, Lévy flight, and adaptive t-distribution mutation—to the standard DBO algorithm to ameliorate its asymmetrical search behavior and enhance its ability to escape local optima; and (3) rigorous validation of the model’s superiority against several state-of-the-art benchmarks using real-world pipeline data, providing compelling empirical evidence supporting its practical application.

The main abbreviations used in the text and their full forms are shown in Table 1.

2. Methods

2.1. Theoretical Foundations

This study employs Factor Analysis (FA) to extract the typical characteristics of corrosion factors. The global exploration capability of the Dung Beetle Optimization algorithm is enhanced through spatial pyramid mapping and spiral search strategies. An improved Dung Beetle Optimization algorithm is then utilized to optimize the kernel parameter (γ) and penalty coefficient (C) of the Kernel Extreme Learning Machine (KELM) model. Subsequently, an FA-IDBO-KELM based prediction model for determining the external corrosion rate of buried pipelines is constructed, enabling the accurate prediction of pipeline external corrosion rates.

(1): Factor Analysis (FA)

FA reduces the dimensionality of data through multivariate statistical analysis, retaining the core information of the dataset and transforming interrelated original variables into uncorrelated common factors [18]. Let the random vector be denoted as $x = (x_1, x_2, \dots, x_n)^T$; each variable $x_i$ can then be expressed using $m$ common factors as

$$x_i = \alpha_{i1} F_1 + \alpha_{i2} F_2 + \dots + \alpha_{im} F_m + \varepsilon_i \tag{1}$$

where $i = 1, 2, \dots, n$ is the variable index; $F_1, F_2, \dots, F_m$ are the common factors; $\alpha_{i1}, \alpha_{i2}, \dots, \alpha_{im}$ are the factor loadings; and $\varepsilon_i$ is the specific factor.

(2): Dung Beetle Optimization Algorithm (DBO)

The DBO algorithm was proposed by Xue J K et al. in 2022 [19]. Inspired by the natural foraging behavior of dung beetles, it incorporates rolling balls and dancing to symbolize global exploration, as well as foraging and stealing to represent local exploitation, while reproductive behavior helps to maintain population diversity [20].

In the algorithm, the position update formula for the rolling dung beetle is

$$x_i(t+1) = x_i(t) + a \times k \times x_i(t-1) + b \times \Delta x \tag{2}$$

$$\Delta x = |x_i(t) - X^W| \tag{3}$$

where $t$ is the iteration number; $x_i(t)$ denotes the position of the $i$-th dung beetle at the $t$-th iteration; $k \in (0, 0.2]$ is the deflection coefficient; $b \in (0, 1)$ is the light influence coefficient; $a$ is a natural coefficient assigned a value of 1 or −1; $\Delta x$ represents the variation in light intensity; and $X^W$ denotes the global worst position.

The position update formula for the dancing dung beetle is

$$x_i(t+1) = x_i(t) + \tan(\theta) \times |x_i(t) - x_i(t-1)| \tag{4}$$

where $\theta$ is the deflection angle within the range $[0, \pi]$.
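The two exploration updates in Equations (2)–(4) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the example values `k=0.1` and `b=0.3`, the probability used to pick the sign of `a`, and the handling of degenerate angles are all assumptions chosen for the sketch.

```python
import math
import random

def roll(x_t, x_prev, x_worst, k=0.1, b=0.3, rng=random):
    """Rolling update, Equations (2)-(3), applied per dimension."""
    # Assumption: a = -1 with probability 0.1, otherwise +1.
    a = 1 if rng.random() > 0.1 else -1
    return [xt + a * k * xp + b * abs(xt - xw)
            for xt, xp, xw in zip(x_t, x_prev, x_worst)]

def dance(x_t, x_prev, rng=random):
    """Dancing update, Equation (4), with theta drawn from [0, pi]."""
    theta = rng.uniform(0.0, math.pi)
    # Assumption: leave the position unchanged at theta = 0, pi/2, pi,
    # where tan(theta) is zero or unbounded.
    if theta in (0.0, math.pi / 2, math.pi):
        return list(x_t)
    return [xt + math.tan(theta) * abs(xt - xp)
            for xt, xp in zip(x_t, x_prev)]
```

For example, `roll([1.0, 2.0], [0.5, 1.5], [0.0, 0.0])` moves each coordinate by `b` times its distance from the worst position, plus or minus `k` times the previous position.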

The position update formula for the breeding dung beetle is

$$B_i(t+1) = X^* + b_1 \times [B_i(t) - L_b^*] + b_2 \times [B_i(t) - U_b^*] \tag{5}$$

$$\begin{cases} L_b^* = \max[X^* \times (1 - R),\ L_b] \\ U_b^* = \min[X^* \times (1 + R),\ U_b] \end{cases} \tag{6}$$

where $b_1$ and $b_2$ are $1 \times D$ random vectors, with $D$ representing the dimension of the problem; $B_i(t)$ is the position of the $i$-th egg ball at the $t$-th iteration; $X^*$ is the current optimal position; $R$ is a dynamic factor defined as $R = 1 - t / T_{\max}$, where $T_{\max}$ is the maximum number of iterations; and $L_b$ and $U_b$ are the lower and upper bounds of the search space, respectively.
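To make the shrinking spawning region concrete, the following sketch evaluates Equation (6) and then applies Equation (5). The helper names are hypothetical, `b_1` and `b_2` are assumed to be uniform on (0, 1), and clipping the new egg ball to the region is an assumption of this sketch.

```python
import random

def breeding_bounds(x_star, lb, ub, t, t_max):
    """Equation (6): region around X* that shrinks as t approaches t_max."""
    r = 1.0 - t / t_max
    lb_star = [max(xs * (1 - r), l) for xs, l in zip(x_star, lb)]
    ub_star = [min(xs * (1 + r), u) for xs, u in zip(x_star, ub)]
    return lb_star, ub_star

def breed(b_t, x_star, lb_star, ub_star, rng=random):
    """Equation (5): new egg-ball position, one coordinate at a time."""
    new = []
    for bi, xs, lo, hi in zip(b_t, x_star, lb_star, ub_star):
        b1, b2 = rng.random(), rng.random()  # assumed uniform on (0, 1)
        cand = xs + b1 * (bi - lo) + b2 * (bi - hi)
        new.append(min(max(cand, lo), hi))   # clipping is an assumption
    return new

# Halfway through a 50-iteration run, R = 0.5.
lo, hi = breeding_bounds([10.0, 20.0], [0.01, 0.01], [50.0, 50.0], t=25, t_max=50)
print(lo, hi)  # [5.0, 10.0] [15.0, 30.0]
```

At `t = t_max` the factor `R` is zero, so the region collapses onto `X*` and breeding stops exploring.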

The position update formula for the foraging dung beetle is

$$x_i(t+1) = x_i(t) + C_1 \times [x_i(t) - L_b^b] + C_2 \times [x_i(t) - U_b^b] \tag{7}$$

$$\begin{cases} L_b^b = \max[X^b \times (1 - R),\ L_b] \\ U_b^b = \min[X^b \times (1 + R),\ U_b] \end{cases} \tag{8}$$

where $C_1$ is a random number, $C_2$ is a random vector within $(0, 1)$, $X^b$ is the global optimal position, and $L_b^b$ and $U_b^b$ represent the boundaries of the optimal foraging region.

The position update for the stealing dung beetle is expressed in Equation (9):

$$x_i(t+1) = X^b(t) + S \times g \times [|x_i(t) - X^*| + |x_i(t) - X^b(t)|] \tag{9}$$

where $g$ is a random vector of size $1 \times D$, and $S$ is a constant with a value of 0.5.
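The two exploitation updates, Equations (7) and (9), can be sketched as follows. The text does not state the distributions of `C_1` and `g`; drawing both from a standard normal distribution is an assumption of this sketch, as are the function names.

```python
import random

def forage(x_t, lb_b, ub_b, rng=random):
    """Foraging update, Equation (7), inside the region from Equation (8)."""
    c1 = rng.gauss(0.0, 1.0)  # assumption: C1 ~ N(0, 1), one draw per beetle
    return [x + c1 * (x - lo) + rng.random() * (x - hi)  # C2 uniform on (0, 1)
            for x, lo, hi in zip(x_t, lb_b, ub_b)]

def steal(x_t, x_star, x_best, s=0.5, rng=random):
    """Stealing update, Equation (9), centred on the global best X^b."""
    return [xb + s * rng.gauss(0.0, 1.0) * (abs(x - xs) + abs(x - xb))  # g assumed N(0, 1)
            for x, xs, xb in zip(x_t, x_star, x_best)]
```

Both updates contract toward the best positions: a beetle that already sits on `X*` and `X^b` is left where it is, because every distance term vanishes.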

(3): Kernel Extreme Learning Machine (KELM)

The Kernel Extreme Learning Machine (KELM) builds upon the Extreme Learning Machine (ELM) by replacing the unknown feature mapping of the hidden layer with a kernel function, eliminating the need to determine the number of hidden layer nodes. The output weights are determined by selecting appropriate kernel parameters and penalty factors [21]. The output function of KELM is expressed as

$$f(x) = \begin{bmatrix} K(x, x_1) \\ \vdots \\ K(x, x_N) \end{bmatrix}^T \left( \frac{I}{C} + \Omega_{ELM} \right)^{-1} T \tag{10}$$

where $N$ is the number of samples, $K(x_i, x_j)$ is the Gaussian kernel function, $C$ is the regularization coefficient, $I$ is the identity matrix, $\Omega_{ELM}$ is the kernel matrix, and $T$ is the expected output.
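Equation (10) amounts to solving one regularized linear system and then taking a kernel-weighted sum. The sketch below does this with the Python standard library only. The function names are hypothetical, and the Gaussian kernel is assumed to take the form exp(−‖u − v‖²/γ); other parameterizations of γ are equally common.

```python
import math

def rbf(u, v, gamma):
    """Gaussian kernel; the exp(-||u - v||^2 / gamma) form is an assumption."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)) / gamma)

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def kelm_fit(xs, ts, c, gamma):
    """Output weights beta = (I/C + Omega)^-1 T from Equation (10)."""
    n = len(xs)
    system = [[rbf(xs[i], xs[j], gamma) + (1.0 / c if i == j else 0.0)
               for j in range(n)] for i in range(n)]
    return solve(system, ts)

def kelm_predict(x, xs, beta, gamma):
    """f(x) = [K(x, x_1), ..., K(x, x_N)] @ beta."""
    return sum(rbf(x, xi, gamma) * bi for xi, bi in zip(xs, beta))
```

A large `C` makes the I/C term small, so the model nearly interpolates the training targets; a small `C` smooths the fit. This trade-off, together with the kernel width γ, is what the optimizer tunes.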

2.2. Improved Dung Beetle Optimizer (IDBO)

The standard Dung Beetle Optimizer (DBO) [22] mimics the foraging, breeding, and stealing behaviors of dung beetles. However, its initial population distribution may be uneven, and its convergence performance can be hampered by a tendency to fall into local optima [23]. To ameliorate these issues and achieve a better balance between exploration and exploitation, four key improvements are proposed in this study.

2.2.1. SPM for Population Initialization

To enhance the diversity and uniformity of the initial population, Spatial Pyramid Matching (SPM) chaotic mapping [24] is adopted instead of random initialization. The SPM mapping is defined as follows:

$$x(t+1) = \begin{cases}
\operatorname{mod}\left( \dfrac{x(t)}{\eta} + \mu \sin[\pi x(t)] + r,\ 1 \right), & 0 \le x(t) < \eta \\
\operatorname{mod}\left( \dfrac{x(t)/\eta}{0.5 - \eta} + \mu \sin[\pi x(t)] + r,\ 1 \right), & \eta \le x(t) < 0.5 \\
\operatorname{mod}\left( \dfrac{[1 - x(t)]/\eta}{0.5 - \eta} + \mu \sin\{\pi [1 - x(t)]\} + r,\ 1 \right), & 0.5 \le x(t) < 1 - \eta \\
\operatorname{mod}\left( \dfrac{1 - x(t)}{\eta} + \mu \sin\{\pi [1 - x(t)]\} + r,\ 1 \right), & 1 - \eta \le x(t) < 1
\end{cases} \tag{11}$$

where $\mu = 0.3$ and $\eta = 0.4$ are parameters that ensure the system remains in a chaotic state. This approach facilitates a more symmetric distribution of initial solutions across the search space.
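A population can be generated from Equation (11) by iterating the map and scaling each chaotic value into the search bounds. In the sketch below the function names are hypothetical, and `r` is assumed to be a fresh uniform random number in [0, 1) at every step, since the text does not define it.

```python
import math
import random

def spm_next(x, eta=0.4, mu=0.3, r=0.0):
    """One step of the piecewise map in Equation (11); returns a value in [0, 1)."""
    if x < eta:
        v = x / eta + mu * math.sin(math.pi * x) + r
    elif x < 0.5:
        v = (x / eta) / (0.5 - eta) + mu * math.sin(math.pi * x) + r
    elif x < 1 - eta:
        v = ((1 - x) / eta) / (0.5 - eta) + mu * math.sin(math.pi * (1 - x)) + r
    else:
        v = (1 - x) / eta + mu * math.sin(math.pi * (1 - x)) + r
    return v % 1.0

def spm_population(pop, lb, ub, rng=random):
    """Iterate the map once per individual and scale into [lb, ub]."""
    x = [rng.random() for _ in lb]  # random seed point for the chaotic sequence
    population = []
    for _ in range(pop):
        x = [spm_next(xi, r=rng.random()) for xi in x]  # r ~ U[0, 1) is an assumption
        population.append([l + xi * (u - l) for xi, l, u in zip(x, lb, ub)])
    return population
```

With the settings used later in the case study, `spm_population(30, [0.01, 0.01], [50.0, 50.0])` would return 30 candidate (C, γ) pairs.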

2.2.2. Spiral Search Strategy in the Reproduction Phase

A spiral search strategy [25] is incorporated into the position update of breeding beetles (Equation (5)) to enrich population diversity and prevent local stagnation. The spiral radius $r$ is defined by

$$r = e^{g \cdot \cos\left( \frac{\pi t}{iter_{\max}} \right)} \tag{12}$$

and the modified position update formula becomes

$$B_i(t+1) = X^* + e^{rl} \cos(2\pi l) \times b_1 \times [B_i(t) - L_b^*] + e^{rl} \cos(2\pi l) \times b_2 \times [B_i(t) - U_b^*] \tag{13}$$

This strategy allows the neighborhood of the current best solution to be explored more thoroughly.
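The spiral weighting in Equations (12) and (13) can be sketched as below. The text does not define `g` or `l`; this sketch assumes `g` is a constant (set to 1 here) and `l` is uniform on [−1, 1], and it draws one spiral factor per individual. All names are hypothetical.

```python
import math
import random

def spiral_factor(t, iter_max, g=1.0, rng=random):
    """e^{r l} cos(2 pi l), with the radius r from Equation (12)."""
    r = math.exp(g * math.cos(math.pi * t / iter_max))
    l = rng.uniform(-1.0, 1.0)  # assumption: l ~ U[-1, 1]
    return math.exp(r * l) * math.cos(2.0 * math.pi * l)

def spiral_breed(b_t, x_star, lb_star, ub_star, t, iter_max, rng=random):
    """Equation (13): the breeding update with both terms scaled by the spiral factor."""
    w = spiral_factor(t, iter_max, rng=rng)
    return [xs + w * rng.random() * (bi - lo) + w * rng.random() * (bi - hi)
            for bi, xs, lo, hi in zip(b_t, x_star, lb_star, ub_star)]
```

Because `cos(pi * t / iter_max)` falls from 1 to −1 over the run, the radius `r` shrinks from `e^g` to `e^{-g}`, so the spiral steps become tighter in later iterations.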

2.2.3. Lévy Flight in the Stealing Phase

To improve the global search capability during stealing behavior, the Lévy flight mechanism [26] is introduced. Lévy flight, characterized by occasional long-distance jumps, helps the algorithm to escape local optima. The step length is governed by

$$Levy(\beta) = \frac{\mu}{|v|^{1/\beta}} \tag{14}$$

The parameters in the Lévy flight step are defined as follows: $\beta$ is a parameter satisfying $0 < \beta < 2$, $\mu$ is a random variable drawn from the normal distribution $N(0, \sigma^2)$, and $v$ is an independent random variable following the standard normal distribution $N(0, 1)$. The standard deviation $\sigma$ is given by

$$\sigma = \left[ \frac{\Gamma(1 + \beta) \times \sin\left( \frac{\pi \beta}{2} \right)}{\Gamma\left( \frac{1 + \beta}{2} \right) \times \beta \times 2^{\frac{\beta - 1}{2}}} \right]^{1/\beta} \tag{15}$$

The position update for the stealing beetle is modified as

$$x_i(t+1) = Levy \times X^b(t) + S \times g \times [|x_i(t) - X^*| + |x_i(t) - X^b(t)|] \tag{16}$$
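A Lévy step of this kind is usually generated with Mantegna's construction, in which both the σ ratio and |v| carry the exponent 1/β. The sketch below follows that standard form and plugs the step into Equation (16). The default β = 1.5 is a common choice rather than a value taken from the text, and `g` is again assumed to be standard normal.

```python
import math
import random

def levy_sigma(beta):
    """Scale of the numerator variable in Mantegna's construction."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)

def levy_step(beta=1.5, rng=random):
    """One heavy-tailed step: mu / |v|^(1/beta), mu ~ N(0, sigma^2), v ~ N(0, 1)."""
    mu = rng.gauss(0.0, levy_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    return mu / abs(v) ** (1 / beta)

def steal_levy(x_t, x_star, x_best, s=0.5, beta=1.5, rng=random):
    """Equation (16): the best position is scaled by a Levy step."""
    step = levy_step(beta, rng)  # assumption: one step per beetle
    return [step * xb + s * rng.gauss(0.0, 1.0) * (abs(x - xs) + abs(x - xb))
            for x, xs, xb in zip(x_t, x_star, x_best)]
```

Most draws from `levy_step` are small, but the heavy tail occasionally produces a large value, which is the long jump that lets a stealing beetle leave a local optimum.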

2.2.4. Adaptive t-Distribution Mutation

Finally, an adaptive t-distribution mutation operator [27] is applied to the current best solution after each iteration. The mutation is performed as

$$x_i^t = x_i + x_i \cdot t(iter) \tag{17}$$

where $t(iter)$ is the t-distribution value with the iteration number as its degree of freedom. If the mutated solution yields a better fitness, it replaces the current best, enhancing the algorithm’s robustness.
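Equation (17) with greedy acceptance can be sketched using only the standard library, by building a Student t variate from a normal and a chi-square draw. The function names are hypothetical, fitness is assumed to be minimized, and bound handling is omitted.

```python
import math
import random

def student_t(df, rng=random):
    """Student t variate: Z / sqrt(chi2 / df), with chi2 ~ Gamma(df / 2, scale 2)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)
    return z / math.sqrt(chi2 / df)

def t_mutate(best, fitness, iteration, rng=random):
    """Equation (17), keeping the mutant only if it improves the fitness."""
    cand = [x + x * student_t(iteration, rng) for x in best]
    return cand if fitness(cand) < fitness(best) else best

# Usage on a toy sphere function (iteration must be at least 1).
sphere = lambda x: sum(v * v for v in x)
print(t_mutate([1.0, -2.0], sphere, iteration=5))
```

With few degrees of freedom the t-distribution has heavy tails, so early iterations produce large perturbations. As the iteration count grows it approaches a normal distribution and the mutation turns into a local refinement.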

The complete workflow of the IDBO algorithm, integrating all the aforementioned enhancements, is summarized in Algorithm 1.

Algorithm 1 Pseudo-code of the Improved Dung Beetle Optimizer (IDBO)
Input: Population size pop, maximum iterations Tmax, dimension dim, bounds [lb, ub]
Output: Global best solution X^b
Initialize population using SPM mapping (Equation (11))
Evaluate fitness of each individual
for t = 1 to Tmax do
    for each dung beetle do
        if rolling beetle then update position by Equations (2) and (3)
        if dancing beetle then update position by Equation (4)
        if breeding beetle then update position by spiral search Equation (13)
        if foraging beetle then update position by Equation (7)
        if stealing beetle then update position by Lévy flight Equation (16)
    end for
    Update current best solution X^b
    Apply adaptive t-distribution mutation to X^b by Equation (17)
    if mutated solution is better then update X^b
end for
return X^b

3. Model Construction

3.1. Model Construction Process

Given the multitude of interrelated factors influencing the external corrosion rate of buried pipelines, Factor Analysis (FA) is employed to reduce the dimensionality of these multiple factors, eliminating redundancy and extracting essential information, thereby simplifying model complexity and enhancing prediction accuracy. Furthermore, as the kernel parameters and penalty coefficient of the KELM model are randomly generated, the prediction accuracy exhibits variability. To address this issue, an Improved Dung Beetle Optimization (IDBO) algorithm is introduced to optimize the kernel parameters and penalty coefficient of the KELM model, leading to the development of the FA–IDBO–KELM prediction model. The modeling procedure is outlined as follows:

(1) Dimensionality reduction is performed on the original dataset using FA to extract characteristic variables. Let the pipeline corrosion influencing factor matrix be denoted as $X_{K \times n}$ (where $K$ is the number of samples). $X_{K \times n}$ is standardized via Z-score normalization to obtain $X'_{K \times n}$. Eigenvalues and the component matrix are calculated to derive orthogonal vectors. Common factors are determined, and the loading matrix is computed and rotated. The common factor variables estimated via regression can be expressed as

$$F_j = \gamma_{j1} x_1 + \gamma_{j2} x_2 + \dots + \gamma_{jn} x_n \tag{18}$$

where $\gamma_{j1}, \gamma_{j2}, \dots, \gamma_{jn}$ are the score coefficients, with $j = 1, 2, \dots, m$. The dimensionality-reduced matrix $Z_{K \times m}$ can be derived from Equation (18).

(2) The extracted feature variables were normalized and subsequently partitioned into training and testing subsets using a random split of 90% for training and 10% for testing (99 samples for training, 11 samples for testing). Min-max normalization was applied as follows:

$$x_{im}^* = \frac{x_{im} - x_{m\min}}{x_{m\max} - x_{m\min}} \tag{19}$$

where $x_{im}$ is the $m$-th factor of the $i$-th data group, and $x_{m\max}$ and $x_{m\min}$ are the maximum and minimum values of the $m$-th factor, respectively.
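Step (2) can be illustrated with a short sketch of Equation (19) followed by a random 90/10 split. The function names and the fixed seed are assumptions for reproducibility of the example. As in the text, normalization is applied to the full dataset before partitioning.

```python
import random

def min_max_normalize(rows):
    """Equation (19), applied column by column to a list of samples."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    # A constant column has no spread; mapping it to 0.0 is an assumption.
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)] for row in rows]

def train_test_split(n, train_frac=0.9, seed=0):
    """Shuffle the sample indices and cut at the training fraction."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(n * train_frac)
    return idx[:cut], idx[cut:]

train_idx, test_idx = train_test_split(110)
print(len(train_idx), len(test_idx))  # 99 11
```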

(3) IDBO Initialization. The parameters of IDBO, including population size and maximum iteration number, are initialized. Population initialization is performed using SPM according to Equation (11), followed by the calculation of fitness values to determine the current optimal individual. The iterative process begins, with IDBO employed to optimize the key parameters of KELM. During the global exploration phase, the positions of rolling dung beetles are updated based on Equations (2) and (3). When encountering obstacles, directional adjustments are made through dancing, and position updates are performed according to Equation (4). To maintain population diversity, a spiral search strategy is introduced, updating the positions of breeding dung beetles according to Equation (13). In the local exploitation phase, the positions of foraging dung beetles are updated based on Equation (7). Simultaneously, a Lévy flight strategy is incorporated, updating the positions of stealing dung beetles according to Equation (16). Fitness values are computed to identify the current optimal position. Subsequently, adaptive t-distribution mutation perturbation is applied to the optimal position of the current iteration using Equation (17), and fitness after perturbation is calculated. If the result is superior to that of the previous generation, the current optimal value is updated. The iteration continues until the termination condition Tmax is satisfied, yielding the optimal kernel parameters and penalty coefficient for KELM. Finally, the optimized parameters are substituted into the KELM model to predict the corrosion rate, and the prediction results are outputted. A flowchart of the FA-IDBO-KELM model is shown in Figure 1.

3.2. Model Evaluation Metrics

The performance of the constructed model is evaluated using three metrics, as expressed in Equations (20)–(22):

$$E_{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \tag{20}$$

$$E_{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i| \tag{21}$$

$$R^2 = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{22}$$

where $n$ is the number of samples, $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $\bar{y}$ is the mean of the actual values. The ranges of $E_{RMSE}$ and $E_{MAE}$ are [0, +∞), while $R^2$ has the range [0, 1]. A smaller root mean square error ($E_{RMSE}$) indicates higher model accuracy, the mean absolute error ($E_{MAE}$) reflects the average level of the prediction error, and the coefficient of determination ($R^2$) indicates the proportion of data variation explained by the model.
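The three metrics can be computed directly from Equations (20)–(22). In this sketch the function names are hypothetical, and R² is implemented exactly as the explained-over-total ratio written in Equation (22), with the mean taken over the actual values. That ratio matches the more familiar 1 − SSE/SST form only under ordinary least-squares conditions, so the two can differ for a kernel model.

```python
import math

def rmse(y, y_hat):
    """Equation (20): root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y, y_hat)) / len(y))

def mae(y, y_hat):
    """Equation (21): mean absolute error."""
    return sum(abs(a - p) for a, p in zip(y, y_hat)) / len(y)

def r2_eq22(y, y_hat):
    """Equation (22): explained sum of squares over total sum of squares."""
    y_bar = sum(y) / len(y)  # mean of the actual values (assumption)
    explained = sum((p - y_bar) ** 2 for p in y_hat)
    total = sum((a - y_bar) ** 2 for a in y)
    return explained / total

y = [1.0, 2.0, 3.0]
y_hat = [1.5, 2.0, 2.5]
print(rmse(y, y_hat), mae(y, y_hat), r2_eq22(y, y_hat))
```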

3.3. Ablation Study Design

To quantitatively evaluate the contribution of each enhancement strategy in IDBO, a comprehensive ablation study was conducted. The baseline FA-DBO-KELM model was progressively enhanced by sequentially incorporating the four proposed improvements:

(1): Baseline: FA-DBO-KELM (standard Dung Beetle Optimizer)
(2): Variant 1: FA-DBO-SPM-KELM (adding SPM-based population initialization)
(3): Variant 2: FA-DBO-SPM-SS-KELM (adding a spiral search strategy)
(4): Variant 3: FA-DBO-SPM-SS-LF-KELM (adding Lévy flight)
(5): Full model: FA-IDBO-KELM (adding adaptive t-distribution mutation)

Each variant was evaluated over 30 independent runs so that differences between variants could be assessed statistically, using the same dataset partitioning (90% training, 10% testing) and parameter settings (population size = 30, maximum iterations = 50).

4. Case Study

Using external corrosion data from buried sections of the West–East Gas Pipeline as a case study, and drawing on [28,29] and related materials, ten factors were compiled as indicators influencing the external corrosion rate Y: pH value (X₁), redox potential (X₂), soil resistivity (X₃), water content (X₄), bulk density (X₅), dissolved chloride (X₆), bicarbonate (X₇), sulfate (X₈), pipe-to-soil potential (X₉), and service years (X₁₀). A total of 110 data samples were selected for this study [30], with partial data presented in Table 2.

4.1. FA-Based Dimensionality Reduction

Pipeline corrosion results from the interaction of multiple influencing factors. To account for these interactions, Factor Analysis (FA) was performed using SPSS software (version 25.0; IBM Corp., Armonk, NY, USA) for analysis and dimensionality reduction. First, the Kaiser–Meyer–Olkin (KMO) measure and Bartlett’s test of sphericity were applied to the original corrosion data. The KMO test yielded a value of 0.654, and Bartlett’s test produced a significance level below 0.001, confirming the suitability of FA for dimensionality reduction. Subsequently, eigenvalues were calculated to determine the number of common factors, which were extracted using the principal component method. Varimax rotation was then applied to maximize variance interpretation. Finally, factor scores for each common factor were computed, resulting in the FA-processed dataset. The total variance explained during the process is presented in Table 3, the component matrix is shown in Table 4, and the coefficient matrix is provided in Table 5.

According to the principal component extraction criterion, the cumulative contribution rate should exceed 85% [18]. Based on Table 3, four principal components with eigenvalues greater than 1 were extracted, yielding a cumulative variance contribution rate of 69.428%, which does not meet the extraction criterion. Therefore, a fixed number of factors was adopted for extraction. By setting the extraction to seven principal components, the cumulative contribution rate reached 88.525%, satisfying the extraction criterion. Among them, X₂ and X₁₀ are attributed to F₁, X₅ and X₆ to F₂, X₁ to F₃, X₇ and X₈ to F₄, X₉ to F₅, X₃ to F₆, and X₄ to F₇.

The extraction results of FA were calculated based on Table 5, with partial data presented in Table 6.

4.2. Model Training and Validation

F₁ to F₇ are used as model inputs, with the corrosion rate used as the output. From the dimensionality-reduced dataset of 110 samples, the data were partitioned as described in Section 3.1, with 99 samples (90%) allocated for training and the remaining 11 samples (10%) reserved for testing. The parameters of the IDBO during the training process were set as follows: population size (pop) = 30, dimension (dim) = 2, maximum iterations (Tmax) = 50, lower bound (lb) = [1 × 10⁻², 1 × 10⁻²], and upper bound (ub) = [50, 50]. The kernel function of the KELM model was the radial basis function (RBF) Gaussian kernel. Through 50 iterations of optimization runs, the optimal penalty coefficient (C) and kernel parameter (γ) for the KELM model were determined. The optimal parameters are listed in Table 7 and subsequently substituted into the KELM model for corrosion rate prediction.

To validate the prediction performance of the proposed model, comparisons were made with the FA-KELM, FA-SSA-KELM, and FA-DBO-KELM models. FA-KELM serves as the baseline without intelligent hyperparameter optimization, whereas the FA-SSA-KELM, FA-DBO-KELM, and FA-IDBO-KELM models use intelligent algorithms to optimize the key parameters of KELM. To ensure fairness, the same dataset was employed for all models, with the population size set to 30 and the maximum number of iterations set to 50 for each optimizer. The prediction results and comparisons of the four models are presented in Table 8 and Figure 2.

As shown in Figure 2 and Table 8, for predicting the external corrosion rates of buried pipelines, the FA-IDBO-KELM model achieves the lowest relative prediction error for every test sample except the 10th data group. Among the models, FA-KELM demonstrates the poorest prediction performance, with a maximum error of 116.2789% and a minimum error of 5.6519%. The FA-SSA-KELM model follows, exhibiting a maximum error of 98.4154% and a minimum error of 1.0093%. The FA-DBO-KELM model is ranked next, with a maximum error of 66.3875% and a minimum error of 12.5183%. In comparison, the FA-IDBO-KELM model delivers more stable prediction results, with a maximum error of 7.3057% and a minimum error of 1.4092%.

As shown in Figure 2, the models tuned by an intelligent optimization algorithm perform markedly better than the unoptimized FA-KELM model. Compared with the FA-SSA-KELM and FA-DBO-KELM models, the FA-IDBO-KELM model provides predictions closer to the actual values, confirming that the IDBO enhancements improve prediction accuracy. Overall, the FA-IDBO-KELM model exhibits the best performance of the four models on the test data.

The prediction results and performance metrics are summarized in Table 8 and Table 9. As illustrated in Figure 2, the FA-IDBO-KELM model demonstrates predictions that most closely align with the actual values, exhibiting the smallest relative errors across most test samples. Statistical metrics were further employed to evaluate the accuracy of the FA-IDBO-KELM model, with the calculated results presented in Table 9. As can be seen, compared with the FA-KELM, FA-SSA-KELM, and FA-DBO-KELM models, the FA-IDBO-KELM model reduces the root mean square error (E_RMSE) by 3.73%, 2.3%, and 1.63%, respectively; reduces the mean absolute error (E_MAE) by 3.34%, 2.1%, and 1.56%, respectively; and increases the coefficient of determination (R²) by 45.93%, 9.57%, and 6.84%, respectively. This demonstrates that, for predicting the external corrosion rates of buried pipelines, the combined FA-IDBO-KELM model outperforms the other three models, thereby verifying that the algorithmic improvements effectively enhance the model’s performance.

To provide a visual comparison of the prediction errors across the models, a box plot of the relative-error distribution on the test set is presented in Figure 3.

The FA-IDBO-KELM model exhibits the smallest interquartile range (IQR) and the lowest median error. This indicates that its prediction errors are not only low on average but are also consistently tightly clustered around the median. This high level of stability is paramount for practical engineering applications, where reliable and repeatable predictions are required for risk assessment and decision-making. In contrast, the FA-KELM model shows the largest IQR and highest median error, reflecting its unstable and unreliable performance. The FA-SSA-KELM and FA-DBO-KELM models demonstrate intermediate performance, but their wider boxes and higher median errors indicate a greater susceptibility to producing variable results, likely due to premature convergence or inadequate exploration of the search space by the standard optimizers. The compact distribution of errors for FA-IDBO-KELM also suggests a lower number and magnitude of outliers compared to other models. This enhanced robustness signifies that the model is less sensitive to noise or anomalies in the dataset, a common challenge with models that overfit or have unstable optimization processes.

The proposed FA-IDBO-KELM model demonstrates a marked superiority over the benchmark models. The FA-KELM model, lacking intelligent hyperparameter optimization, exhibits the highest prediction errors (e.g., Max Error: 116.28%), underscoring the critical need for optimizing KELM parameters (C, γ). The FA-SSA-KELM and FA-DBO-KELM models show improved performance, yet their susceptibility to local optima and their uneven exploration-exploitation balance, as discussed in Section 1, result in significantly higher errors compared to our model. In contrast, the FA-IDBO-KELM model achieves predictions closest to the actual values, with a maximum error of only 7.31%. This performance improvement can be attributed to the four enhancements in IDBO (SPM initialization, spiral search, Lévy flight, and adaptive t-distribution mutation), which collectively foster a more robust search strategy, mitigating premature convergence and enhancing global search capability.

The exceptional stability observed in Figure 3 is a direct consequence of the improvements embedded within the IDBO algorithm. The SPM initialization ensures a diverse starting population, the spiral search and Lévy flight strategies work in tandem to balance global exploration and local exploitation, effectively escaping local optima, and the adaptive t-distribution mutation provides a final refinement mechanism. This synergistic combination prevents premature convergence, a typical pitfall of the standard DBO and SSA algorithms, leading to consistently superior and reliable parameter optimization for the KELM model.

4.3. Ablation Study and Robustness Analysis

To quantitatively evaluate the contribution of each enhancement strategy in IDBO and assess model robustness, a comprehensive ablation study was conducted. The baseline model (FA-DBO-KELM) was progressively enhanced, adding one strategy at a time. Each variant was evaluated over 30 independent runs, and the results (mean ± standard deviation) are summarized in Table 10. The progressive and statistically significant improvement across all metrics (verified via the Friedman test with Nemenyi post hoc analysis, p < 0.001) confirms that each enhancement addresses specific limitations of the standard DBO. SPM initialization improved stability (reducing the RMSE standard deviation by 34.8%), the spiral search enhanced local exploitation (improving R² by 2.1%), the Lévy flight mechanism substantially boosted global exploration (reducing RMSE by 50.6% in a single step), and the adaptive t-distribution mutation provided the final refinement. This systematic analysis supports the superior performance of the full FA-IDBO-KELM model.

The ablation study reveals several key insights:

(1) SPM-based population initialization produced the largest absolute gain in stability, reducing the RMSE standard deviation by 34.8% relative to the baseline (from ±0.0023 to ±0.0015). This demonstrates that a uniform initial population distribution is crucial for consistent optimization performance.

(2) The spiral search strategy enhanced local exploitation capability, improving R² by 2.1% (from 0.9583 to 0.9781) while lowering its standard deviation from ±0.0062 to ±0.0043. This is consistent with its intended role in preventing premature convergence.

(3) The Lévy flight mechanism substantially boosted global exploration, achieving the largest single-step improvement in RMSE (a 50.6% reduction, from 0.0087 to 0.0043). The long-distance jumps helped the search escape local optima.

(4) Adaptive t-distribution mutation provided the final refinement, further reducing RMSE by 34.9% and achieving near-perfect R² (0.9954). The mutation operator’s adaptive nature, with degrees of freedom linked to iteration count, ensured balanced exploration-exploitation throughout the optimization process.

The progressive improvement identified across all metrics confirms that each enhancement strategy addresses specific limitations of the standard DBO algorithm, and their synergistic combination in IDBO delivers optimal performance.
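
The step-by-step percentages discussed in the ablation study are plain relative changes between consecutive rows of Table 10. A short sketch of that arithmetic follows; the values are copied from Table 10, and the helper name `pct_reduction` is hypothetical.

```python
# (variant, mean RMSE, RMSE standard deviation) copied from Table 10.
ablation = [
    ("FA-DBO-KELM (Baseline)", 0.0191, 0.0023),
    ("+SPM Initialization", 0.0125, 0.0015),
    ("+Spiral Search", 0.0087, 0.0011),
    ("+Lévy Flight", 0.0043, 0.0006),
    ("+Adaptive t-Mutation", 0.0028, 0.0003),
]


def pct_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


# Compare each variant with the one directly above it in the table.
steps = []
for (_, prev_rmse, prev_sd), (name, rmse, sd) in zip(ablation, ablation[1:]):
    steps.append((name, pct_reduction(prev_rmse, rmse), pct_reduction(prev_sd, sd)))

for name, d_rmse, d_sd in steps:
    print(f"{name:22s} RMSE -{d_rmse:.1f}%   RMSE std -{d_sd:.1f}%")

# Cumulative effect of all four strategies relative to the baseline.
total = pct_reduction(ablation[0][1], ablation[-1][1])
print(f"Baseline to full IDBO: RMSE -{total:.1f}%")
```

Because the strategies are added cumulatively, each percentage measures a strategy's effect on top of those already enabled rather than its effect in isolation.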

Statistical significance was verified using the Friedman test with Nemenyi post hoc analysis at α = 0.05. The results showed significant differences among all variants (p < 0.001), confirming that each enhancement contributes uniquely to the overall performance improvement. The critical difference diagram (Figure 4) visually demonstrates that FA-IDBO-KELM forms a distinct performance cluster separate from all other variants.
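
For readers unfamiliar with the procedure, the Friedman statistic and the Nemenyi critical difference behind a diagram such as Figure 4 can be computed as sketched below. This is an illustrative sketch under stated assumptions: each independent run is treated as one block, no tie correction is applied to the Friedman statistic, the `toy_scores` matrix is made-up placeholder data rather than results from this study, and the function names are hypothetical. The q values are the standard tabulated Nemenyi critical values at α = 0.05.

```python
import math

# Tabulated Nemenyi critical values q_alpha at alpha = 0.05 (Demšar, 2006),
# indexed by the number of compared methods k.
Q_ALPHA_005 = {2: 1.960, 3: 2.343, 4: 2.569, 5: 2.728, 6: 2.850}


def average_ranks(scores):
    """scores[i][j] = error of method j in run i (lower is better).

    Returns the mean rank of each method; ties share the average rank.
    """
    n, k = len(scores), len(scores[0])
    totals = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: row[j])
        pos = 0
        while pos < k:
            end = pos
            while end + 1 < k and row[order[end + 1]] == row[order[pos]]:
                end += 1
            shared = (pos + end) / 2 + 1  # average of the tied 1-based ranks
            for idx in order[pos:end + 1]:
                totals[idx] += shared
            pos = end + 1
    return [t / n for t in totals]


def friedman_statistic(scores):
    """Friedman chi-square statistic (k - 1 degrees of freedom), no tie correction."""
    n, k = len(scores), len(scores[0])
    ranks = average_ranks(scores)
    return 12 * n / (k * (k + 1)) * (sum(r * r for r in ranks) - k * (k + 1) ** 2 / 4)


def nemenyi_cd(k, n, q_alpha=None):
    """Two methods differ significantly if their mean ranks differ by more than this."""
    q = Q_ALPHA_005[k] if q_alpha is None else q_alpha
    return q * math.sqrt(k * (k + 1) / (6 * n))


# Placeholder data: 3 runs x 3 methods, NOT taken from the study.
toy_scores = [[0.3, 0.2, 0.1], [0.6, 0.5, 0.4], [0.9, 0.8, 0.7]]
print("mean ranks:", average_ranks(toy_scores))
print("Friedman statistic:", friedman_statistic(toy_scores))

# Setting of the ablation study: k = 5 variants, N = 30 runs.
print("critical difference:", round(nemenyi_cd(k=5, n=30), 3))
```

With five variants and 30 runs the critical difference is about 1.11 rank units, so a variant forms a separate group when its mean rank lies further than that from every other variant.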

The statistical significance of the performance differences was further validated using the Wilcoxon signed-rank test on the prediction errors, confirming that the FA-IDBO-KELM model significantly outperformed all benchmarks at a 95% confidence level (p-value < 0.05). Additionally, the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) was employed as a Multi-Criteria Decision Making (MCDM) method for holistically evaluating the models based on RMSE, MAE, and R². The TOPSIS closeness coefficients were calculated as 0.892 for FA-IDBO-KELM, 0.456 for FA-DBO-KELM, 0.312 for FA-SSA-KELM, and 0.105 for FA-KELM, unequivocally ranking the proposed model as the best choice and providing a comprehensive justification for the conclusions drawn.
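
The TOPSIS ranking can be illustrated on the Table 9 metrics as follows. This is a sketch under assumptions that the text does not specify: vector (Euclidean) normalization, equal criterion weights, RMSE and MAE treated as cost criteria and R² as a benefit criterion. The absolute closeness coefficients depend on those choices and need not match the reported values, so only the resulting ranking is compared. The function name `topsis` is hypothetical.

```python
import math

# (RMSE, MAE, R²) per model, copied from Table 9.
models = {
    "FA-KELM": (0.0400, 0.0355, 0.5361),
    "FA-SSA-KELM": (0.0258, 0.0231, 0.8997),
    "FA-DBO-KELM": (0.0191, 0.0177, 0.9270),
    "FA-IDBO-KELM": (0.0028, 0.0021, 0.9954),
}
BENEFIT = (False, False, True)   # RMSE and MAE are costs, R² is a benefit
WEIGHTS = (1 / 3, 1 / 3, 1 / 3)  # assumed equal weights (not given in the text)


def topsis(alternatives, benefit, weights):
    """Return the TOPSIS closeness coefficient of each alternative."""
    names = list(alternatives)
    # Vector normalization: divide each criterion column by its Euclidean norm.
    columns = list(zip(*alternatives.values()))
    norms = [math.sqrt(sum(v * v for v in col)) for col in columns]
    weighted = {
        name: [w * v / nrm for v, w, nrm in zip(alternatives[name], weights, norms)]
        for name in names
    }
    # Ideal and anti-ideal points, taking the criterion direction into account.
    wcols = list(zip(*weighted.values()))
    ideal = [max(c) if b else min(c) for c, b in zip(wcols, benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(wcols, benefit)]
    scores = {}
    for name in names:
        d_pos = math.dist(weighted[name], ideal)
        d_neg = math.dist(weighted[name], anti)
        scores[name] = d_neg / (d_pos + d_neg)
    return scores


scores = topsis(models, BENEFIT, WEIGHTS)
ranking = sorted(scores, key=scores.get, reverse=True)
for name in ranking:
    print(f"{name:13s} closeness = {scores[name]:.3f}")
```

Under these assumptions the ranking agrees with the one reported above (FA-IDBO-KELM, FA-DBO-KELM, FA-SSA-KELM, FA-KELM). Because FA-IDBO-KELM is best and FA-KELM worst on all three criteria, they coincide with the ideal and anti-ideal points in this particular setup.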

The superior performance of the FA-IDBO-KELM model can be attributed to the synergistic effect of its components. FA reduces multicollinearity and noise among the input variables. More importantly, the IDBO algorithm optimizes the KELM parameters by achieving a more symmetric balance between exploration (global search) and exploitation (local refinement). SPM and the spiral search ensure comprehensive exploration of the parameter space, while the Lévy flight and t-distribution mutation provide effective mechanisms for jumping out of local optima, a common pitfall of the standard DBO and SSA algorithms. This leads to a better (C, γ) parameter pair (Table 7), which in turn enables the KELM model to generalize better and avoid overfitting. The high R² value (0.9954) indicates that the model explains almost all of the variability in the corrosion rate data of the test set, making it well suited to practical risk assessment applications where precision is critical.

The computational efficiency of the proposed FA-IDBO-KELM model was rigorously evaluated to assess its practical viability. The time complexity of the Improved Dung Beetle Optimizer (IDBO) is O(pop × Tmax × dim), which is comparable to other population-based algorithms like the SSA and standard DBO. Experimental results obtained on a standard desktop PC (Intel i7-10700K, 32 GB RAM) demonstrated that the FA-IDBO-KELM model required an average runtime of approximately 12.5 s for 50 iterations. This computational cost is acceptable for practical corrosion prediction tasks. A comparative analysis revealed that while the model incurs a longer processing time than the non-optimized FA-KELM (≈3.2 s), it is more efficient than both the FA-SSA-KELM (≈15.1 s) and the baseline FA-DBO-KELM (≈13.8 s) models. Crucially, when compared to the FA-DBO-KELM baseline, the full FA-IDBO-KELM model introduced only a 15.2% time overhead (12.5 s vs. 10.8 s) while achieving a substantial 85.3% reduction in RMSE. This favorable trade-off between a modest increase in computational cost and a significant gain in predictive accuracy underscores the practical value of the proposed enhancements for real-world applications where both model precision and operational efficiency are critical considerations.

In summary, the results demonstrate that the FA-IDBO-KELM framework is not only highly accurate but also remarkably stable and robust. The synergistic combination of Factor Analysis for dimensionality reduction and the Improved Dung Beetle Optimizer for parameter tuning effectively addresses the common limitations of existing models, namely low accuracy, instability, and overfitting. The empirical evidence, supported by rigorous statistical testing, strongly validates the proposed model as a superior tool for predicting the external corrosion rate of buried pipelines.

5. Conclusions

This study successfully developed and validated a novel hybrid intelligent model, the FA-IDBO-KELM model, for predicting the external corrosion rate of buried pipelines. The key findings are as follows:

(1) Factor Analysis proved to be an effective tool for dimensionality reduction, condensing the ten original factors into seven principal components that capture the essential characteristics of the corrosion process.

(2) The proposed improvements in the DBO algorithm (SPM, spiral search, Lévy flight, and adaptive t-distribution mutation) collectively enhanced its optimization performance by improving population diversity, strengthening global search capabilities, and effectively preventing premature convergence.

(3) The FA-IDBO-KELM model demonstrated significantly superior performance compared to the benchmark models, achieving a near-perfect prediction accuracy (R² = 0.9954) on the test dataset. This result strongly confirms our hypothesis that a sophisticated optimization strategy is pivotal for unlocking the full potential of the Kernel Extreme Learning Machine (KELM) for handling complex regression tasks such as corrosion rate prediction.

Future research will focus on validating the model’s robustness using larger and more diverse datasets for different geographical regions and pipeline operating conditions. Furthermore, exploring the integration of this predictive model into a real-time pipeline integrity management system represents a promising direction for practical implementation.

Author Contributions

Conceptualization, Y.G. and Z.L.; Methodology, Y.G.; Validation, Z.L., B.W. and D.M.; Formal Analysis, D.M.; Investigation, B.W. and D.M.; Resources, Z.L. and B.W.; Data Curation, D.M.; Writing—Original Draft Preparation, Y.G., Z.L. and B.W.; Writing—Review and Editing, Y.G., Z.L. and D.M.; Visualization, D.M.; Supervision, Z.L.; Project Administration, Y.G. and B.W.; Funding Acquisition, Y.G. and Z.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Foundation of Gansu Provincial Department of Education (2025B-208), the Natural Science Foundation of Gansu Provincial Department of Science and Technology (25JRRM014), and the National Natural Science Foundation of China (41877527).

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. A flowchart for the model used for the prediction of external corrosion rates.

Figure 2. Comparison of predicted results.

Figure 3. A comparison of the performances of the different algorithms.

Figure 4. The critical difference diagram.

Table 1. Notations and abbreviations.

Abbreviation	Full Form
FA	Factor Analysis
IDBO	Improved Dung Beetle Optimizer
KELM	Kernel Extreme Learning Machine
DBO	Dung Beetle Optimizer
SPM	Spatial Pyramid Mapping
RMSE	Root Mean Square Error
MAE	Mean Absolute Error
R²	Coefficient of Determination

Table 2. Sample of pipeline corrosion raw data.

Serial Number	X₁	X₂/V	X₃/(Ω·m)	X₄/%	X₅/(g∙mL⁻¹)	X₆/(×10⁻⁴%)	X₇/(×10⁻⁴%)	X₈/(×10⁻⁴%)	X₉/mV	X₁₀/a	Y/(mm∙a⁻¹)
1	7.50	−0.65	15.7	20.3	1.28	8.27	12.65	136.40	235.0	26	0.0546
2	7.11	−0.68	39.4	30.3	1.16	9.94	12.65	101.50	210.0	26	0.0727
3	6.01	−0.67	13.3	29.4	1.30	69.05	5.55	40.43	96.0	26	0.1631
4	7.04	−0.71	10.0	34.1	1.21	38.26	12.87	106.80	93.0	26	0.0996
…	…	…	…	…	…	…	…	…	…	…	…
110	6.41	−0.85	7.2	34.4	1.33	351.00	78.10	16.92	119.0	19	0.0416

Table 3. Component matrix.

Variable	Before Rotation							After Rotation
Variable	F₁	F₂	F₃	F₄	F₅	F₆	F₇	F₁	F₂	F₃	F₄	F₅	F₆	F₇
X₁	0.515	0.486	0.358	−0.079	−0.093	0.501	0.204	0.113	−0.041	0.940	0.097	0.088	0.146	0.026
X₂	0.683	0.017	−0.432	0.210	−0.003	−0.204	0.370	0.873	0.000	−0.002	0.146	0.149	0.252	0.085
X₃	−0.495	−0.351	0.172	0.605	−0.270	0.209	−0.132	−0.137	−0.019	−0.155	−0.064	−0.068	−0.887	−0.246
X₄	0.512	−0.085	−0.508	−0.445	0.009	0.249	−0.367	0.144	−0.012	0.037	0.027	0.113	0.254	0.906
X₅	−0.050	0.839	0.259	−0.016	−0.076	−0.171	0.026	−0.194	−0.572	0.366	0.093	0.060	0.366	−0.406
X₆	0.223	−0.669	0.392	−0.254	0.235	−0.009	0.316	0.010	0.923	0.009	0.002	0.110	0.061	−0.054
X₇	0.523	−0.158	0.671	−0.090	0.107	−0.054	−0.263	−0.233	0.427	0.309	0.534	0.468	0.063	0.010
X₈	0.609	0.172	0.040	0.494	0.434	−0.164	−0.241	0.342	−0.080	0.064	0.887	0.019	0.068	0.014
X₉	−0.655	0.165	−0.156	0.098	0.589	0.332	0.092	−0.254	−0.081	−0.072	−0.046	−0.928	−0.064	−0.109
X₁₀	0.736	−0.158	−0.131	0.395	−0.109	0.267	0.055	0.652	0.114	0.310	0.318	0.208	−0.257	0.282

Table 4. Explanation of total variance.

Component	Initial Eigenvalues			Sum of Squared Rotated Loadings
Component	Eigenvalue	Variance Ratio/%	Cumulative Ratio/%	Eigenvalue	Variance Ratio/%	Cumulative Ratio/%
1	2.906	29.056	29.056	1.514	15.136	15.136
2	1.625	16.252	45.308	1.389	13.886	29.022
3	1.315	13.153	58.461	1.244	12.437	41.459
4	1.097	10.967	69.428	1.219	12.189	53.648
5	0.701	7.012	76.440	1.186	11.861	65.509
6	0.639	6.387	82.828	1.152	11.522	77.031
7	0.570	5.697	88.525	1.149	11.494	88.525
8	0.468	4.678	93.203
9	0.342	3.421	96.624
10	0.338	3.376	100.000

Table 5. Coefficient matrix.

Variable	F₁	F₂	F₃	F₄	F₅	F₆	F₇
X₁	0.063	0.066	0.923	−0.229	−0.186	−0.112	0.038
X₂	0.729	0.044	−0.133	−0.145	−0.027	0.245	−0.300
X₃	−0.025	−0.121	0.075	0.022	0.102	−0.821	−0.025
X₄	−0.236	−0.137	0.012	0.038	−0.008	0.028	0.915
X₅	−0.104	−0.359	0.154	0.050	0.158	0.282	−0.348
X₆	0.090	0.770	0.063	−0.107	−0.138	0.211	−0.241
X₇	−0.442	0.207	0.074	0.475	0.285	−0.003	0.052
X₈	0.017	−0.103	−0.232	0.885	−0.193	0.032	−0.029
X₉	−0.037	0.179	0.178	0.214	−0.983	0.044	0.077
X₁₀	0.379	0.013	0.294	0.035	−0.039	−0.406	0.151

Table 6. Data after FA extraction.

Serial Number	F₁	F₂	F₃	F₄	F₅	F₆	F₇
1	0.6366	−0.0506	1.1127	−0.4651	−1.2123	0.5150	−0.6910
2	0.3550	0.1890	0.5463	−0.6178	−0.9836	−0.1462	0.9685
3	0.4126	−0.1732	−0.5140	−1.1161	0.7475	0.8619	−0.0449
4	0.1618	−0.0571	0.3135	−0.8520	0.4530	0.3792	1.0651
…	…	…	…	…	…	…	…
110	−1.7359	2.7882	0.3228	−0.0782	0.9154	1.8379	−0.0449

Table 7. Optimum parameters.

Parameter	SSA	DBO	IDBO
γ	1.11	0.93	0.4148
C	41.1	47.25	50

Table 8. Prediction results and relative errors.

Serial Number	Actual Corrosion Rate/(mm∙a⁻¹)	FA-KELM		FA-SSA-KELM		FA-DBO-KELM		FA-IDBO-KELM
Serial Number	Actual Corrosion Rate/(mm∙a⁻¹)	Predicted Value/(mm∙a⁻¹)	Relative Error/%	Predicted Value/(mm∙a⁻¹)	Relative Error/%	Predicted Value/(mm∙a⁻¹)	Relative Error/%	Predicted Value/(mm∙a⁻¹)	Relative Error/%
1	0.0300	0.0627	108.9780	0.0575	91.6600	0.0499	66.3875	0.0309	2.9238
2	0.0585	0.1265	116.2789	0.0764	30.6390	0.0759	29.6919	0.0599	2.3275
3	0.1000	0.1057	5.6519	0.1290	28.9642	0.1215	21.4647	0.0957	4.2676
4	0.0234	0.0340	45.1445	0.0464	98.4154	0.0349	49.0798	0.0242	3.4392
5	0.1028	0.1454	41.4101	0.1105	7.5312	0.1181	14.9280	0.1053	2.4332
6	0.0605	0.0809	33.7885	0.1050	73.5671	0.0902	49.0256	0.0626	3.4633
7	0.0939	0.1255	33.6695	0.1229	30.8915	0.1205	28.3416	0.1008	7.3057
8	0.0539	0.0811	50.4464	0.0732	35.8220	0.0620	15.1074	0.0547	1.4893
9	0.0545	0.1088	99.6408	0.0881	61.6331	0.0797	46.2272	0.0557	2.2178
10	0.0855	0.0293	65.7425	0.0846	1.0093	0.0748	12.5183	0.0841	1.6775
11	0.0416	0.0829	99.3071	0.0634	52.4064	0.0499	20.0276	0.0422	1.4092

Table 9. Predicted outcome performance indicator values.

Prediction Model	E_RMSE	E_MAE	R²
FA-KELM	0.0400	0.0355	0.5361
FA-SSA-KELM	0.0258	0.0231	0.8997
FA-DBO-KELM	0.0191	0.0177	0.9270
FA-IDBO-KELM	0.0028	0.0021	0.9954
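
The RMSE and MAE values in Table 9 can be checked against the per-sample predictions in Table 8. The sketch below recomputes both metrics for FA-KELM and FA-IDBO-KELM using the usual definitions; the function names are hypothetical, and R² is left out because its exact definition is not restated in this part of the article.

```python
import math

# Actual corrosion rates and predictions (mm/a) for the 11 test samples in Table 8.
actual = [0.0300, 0.0585, 0.1000, 0.0234, 0.1028, 0.0605,
          0.0939, 0.0539, 0.0545, 0.0855, 0.0416]
predicted = {
    "FA-KELM": [0.0627, 0.1265, 0.1057, 0.0340, 0.1454, 0.0809,
                0.1255, 0.0811, 0.1088, 0.0293, 0.0829],
    "FA-IDBO-KELM": [0.0309, 0.0599, 0.0957, 0.0242, 0.1053, 0.0626,
                     0.1008, 0.0547, 0.0557, 0.0841, 0.0422],
}


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


for model, y_hat in predicted.items():
    print(f"{model:13s} RMSE={rmse(actual, y_hat):.4f}  MAE={mae(actual, y_hat):.4f}")
```

Rounded to four decimals, the recomputed values agree with the corresponding rows of Table 9.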

Table 10. Ablation study results (mean ± standard deviation over 30 runs).

Model Variant	RMSE	MAE	R²	Ranking Score
FA-DBO-KELM (Baseline)	0.0191 ± 0.0023	0.0177 ± 0.0019	0.9270 ± 0.0085	0.456
+SPM Initialization	0.0125 ± 0.0015	0.0112 ± 0.0012	0.9583 ± 0.0062	0.632
+Spiral Search	0.0087 ± 0.0011	0.0079 ± 0.0009	0.9781 ± 0.0043	0.745
+Lévy Flight	0.0043 ± 0.0006	0.0038 ± 0.0005	0.9892 ± 0.0028	0.834
+Adaptive t-Mutation	0.0028 ± 0.0003	0.0021 ± 0.0002	0.9954 ± 0.0015	0.892
