Article

Gray Prediction for Internal Corrosion Rate of Oil and Gas Pipelines Based on Markov Chain and Particle Swarm Optimization

1
College of Civil Engineering, Longdong University, Qingyang 745000, China
2
School of Business, Huaiyin Institute of Technology, Huai’an 223003, China
*
Author to whom correspondence should be addressed.
Symmetry 2025, 17(12), 2144; https://doi.org/10.3390/sym17122144
Submission received: 30 October 2025 / Revised: 8 December 2025 / Accepted: 10 December 2025 / Published: 12 December 2025
(This article belongs to the Section Engineering and Materials)

Abstract

Accurate prediction of the internal corrosion rate is crucial for the safety management and maintenance planning of oil and gas pipelines. However, this task is challenging due to the complex, multi-factor nature of corrosion and the scarcity of available inspection data. To address this, we propose a novel hybrid prediction model, GM-Markov-PSO, which integrates a gray prediction model with a Markov chain and a particle swarm optimization algorithm. A key innovation of our approach is the systematic incorporation of symmetry principles—observed in the spatial distribution of corrosion factors, the temporal evolution of the corrosion process, and the statistical fluctuations of monitoring data—to enhance model stability and accuracy. The proposed model effectively overcomes the limitations of individual components, providing superior handling of small-sample, non-linear datasets and demonstrating strong robustness against stochastic disturbances. In a case study, the GM-Markov-PSO model achieved prediction accuracy improvements ranging from 0.93% to 13.34%, with an average improvement of 4.51% over benchmark models, confirming its practical value for informing pipeline maintenance strategies. This work not only presents a reliable predictive tool but also enriches the application of symmetry theory in engineering forecasting by elucidating the inherent order within complex corrosion systems.

1. Introduction

Pipelines suffer various types of internal corrosion damage (see Figure 1), and the internal corrosion rate is central to pipeline safety management. It is a key input to pipeline reliability analysis and serves as the basis for maintenance and replacement decisions, so making full use of corrosion rate information has considerable practical value. Corrosion rate data can be obtained by inspection; however, because oil and gas pipelines are long-lived assets, inspection is costly and yields few samples, and corrosion rate data covering a pipeline's whole operating life are rarely available. Given the time, financial, and other costs, corrosion rate data are in most cases obtained through model-based prediction. Research on corrosion rate prediction models has therefore been a focus of safe pipeline operation and maintenance in recent years. However, this research started late, the relevant models and technologies are still immature, and there is no general standard for corrosion prediction. The main difficulties are as follows: (1) the randomness and time dependence of the various corrosion parameters are difficult to model; (2) chemical corrosion and electrochemical corrosion co-exist, and specific damage mechanisms are difficult to define; (3) corrosion under multiphase flow involves both hydrodynamics and chemical action, which makes analysis of the corrosion process more complex.
At present, research on determining the pipeline corrosion rate follows four main directions: (1) methods based on experimental results, such as the weight loss method [1]; (2) the establishment of mathematical prediction models; (3) calculation methods recommended by the NACE (National Association of Corrosion Engineers) [2]; (4) equipment testing, e.g., the multiple in-line inspection method [3], where pipeline metal loss data are obtained several times using a magnetic flux leakage detector, pipeline defect positions are adjusted according to the data and the original signal, and the corrosion rate is finally calculated by comparative analysis of the corrosion defects. Because of its cost, real-time corrosion monitoring is not widely performed at present.
However, simplified analytical approaches often lead to significant prediction errors, thereby limiting their practical value for pipeline safety management. Alanazi [4] studied the corrosion status of X60 pipeline steel via the weight loss method under simulated acidic operating conditions and further studied the influence of corrosion inhibitors and deposits on corrosion rate. Choi [5] studied the corrosive effect of CO2 on a pipeline under different concentrations of O2 and SO2 and calculated the corresponding corrosion rate. Qasim [6] used electro-chemical polarization technology to study the influence of corrosion on friction factors in a turbulent state and proposed a corrosion rate calculation model based on friction factors and temperature. Galvan-Martinez [7] studied the corrosion state of X52 pipeline steel under different temperatures and different corrosion medium flow rates and also gave a corrosion rate calculation formula. Kovalenko [8] conducted a chemical analysis on the corrosion of a pipeline’s inner wall and studied changes in the corrosion rate for solutions with different salt contents. Shirazi [9] trained a neural network with multiple parameters, including temperature and salinity, and obtained a corrosion rate prediction model for 3C carbon steel in a seawater environment. Similarly, Chamkalani [10] used computer programming to model the corrosion process, inputting parameters like temperature and pressure to predict the corrosion rate. This approach computerizes traditional calculation methods, providing a foundation for future digital corrosion studies. However, more research is needed to select practical, cost-effective prediction models and accurately estimate their parameters. The structural integrity of buried gas pipelines is predominantly compromised by two external degradation mechanisms: external corrosion and hydrogen embrittlement. External corrosion, exacerbated by stray currents, leads to localized pitting and wall thinning. 
Hydrogen embrittlement, particularly critical for high-strength steels, occurs when atomic hydrogen diffuses into the steel, causing a loss of ductility and brittle fracture. The development of advanced mitigation strategies, such as novel coatings and hydrogen-resistant materials, is therefore essential to ensure pipeline safety and longevity, as highlighted in recent research [11,12].
Current research on pipeline corrosion often overlooks its complex, non-linear nature, typically by focusing on isolated factors while neglecting their critical interactions. This simplified approach hinders the development of accurate mathematical models for predicting internal corrosion rates under multi-factor conditions, leading to significant prediction errors with limited value for pipeline safety management. Furthermore, acquiring sufficient data on influencing factors is often technically challenging and cost-prohibitive. Therefore, a novel prediction method that is accurate, operable, and cost-effective is urgently needed. Notably, oil and gas pipeline corrosion systems exhibit inherent symmetry characteristics [13]—such as spatial symmetry in factor distribution, temporal symmetry in rate evolution, and statistical symmetry in prediction residuals—which are frequently ignored. Integrating these symmetries, which reflect system order and stability, can significantly enhance model accuracy and interpretability.
Gray prediction models are particularly suited for modeling uncertain systems with small data samples, offering advantages in adaptability and computational efficiency over traditional statistical methods that require large datasets. This makes them ideal for internal corrosion prediction, where data is often scarce due to high acquisition costs. This study combines gray theory with symmetry theory to develop a new prediction model. The gray model effectively utilizes limited data to identify trends, while symmetry theory extracts ordered patterns from the complex corrosion process. This synergy addresses the neglect of inherent system order in traditional models. For validation, pipeline wall thickness data obtained via magnetic flux leakage (MFL) detection using intelligent PIGs were used. The key parameter, “maximum corrosion depth,” extracted from each inspection, provides a reliable data foundation for small-sample analysis.

2. Gray Prediction Model Combined with Symmetry Theory

2.1. Gray System and Symmetry Characteristics

The gray system concept was put forward in relation to white systems and black systems. A white system is one whose information is completely known and transparent, while a black system is one whose information is completely unknown. In the real world, the information of many systems is partly known and partly unknown. Such information is called "gray", and the associated systems are called gray systems. Professor Deng Julong conducted extensive research on information systems and observed that most real-world systems involve a combination of known and unknown information. This observation led to the formulation of the gray model concept [14,15]. Gray system theory infers unknown information by studying known information and finding the rules of the system; that is, it uses the valuable parts of known information to correctly understand and effectively control the whole system [16]. Its development has mainly focused on systems with a "small sample" or "poor information" [17]. Using a small amount of data, gray system theory establishes a differential equation model to predict future states. The prediction process first generates a new data series from the original sequence through accumulation, then constructs a differential equation based on this generated series, and finally derives the prediction function by solving this equation [18]. Requirements for prediction accuracy and credibility can be addressed through further refinement of the model [19]. The internal corrosion process of oil and gas pipelines involves uncertain information and is therefore a typical gray system, so gray theory can be used to predict the internal corrosion rate.
It should be emphasized that before the gray model was proposed, people always thought that the information contained in original data was the most comprehensive and accurate. However, this ignored the fact that there may be outliers in original data, and these outliers may interfere with prediction results. Therefore, it was necessary to generate a new data series in an appropriate way based on the original data so that the new generated data series could not only reflect the rules governing the original data but also eliminate the influence of outliers. A differential equation is established with the new generated sequence, and then a prediction model is obtained. After years of research and development, and based on the testing of many methods to process original data, the most widely used and effective new data generation methods are the accumulating generation method and the inverse accumulating generation method. These two methods are simple in their calculations, can achieve the purpose of eliminating outliers, and can produce prediction results that are closer to reality. Thus, there are two main purposes of generating a new data series: (1) providing intermediate information for modeling; (2) making the original data more regular. In this study, the accumulation generation method was used to generate data on the internal corrosion rate of oil and gas pipelines.
The internal corrosion process of oil and gas pipelines contains a lot of uncertain information, making it a typical gray system. This gray system has obvious symmetry characteristics: (1) Factor distribution symmetry: Under stable operating conditions, corrosion-influencing factors such as medium flow rate, temperature, and pressure show symmetric distribution in the pipeline cross-section and along the pipeline length. For example, the medium concentration in the pipeline cross-section is symmetric around the center axis, and the temperature gradient along the pipeline shows periodic symmetric changes. (2) Process evolution symmetry: The corrosion reaction follows the principle of dynamic equilibrium, and the forward and reverse reaction rates tend to be symmetric in the stable stage. The corrosion rate’s change trend over time shows symmetric fluctuations around the long-term average value, that is, the amplitudes of upward and downward fluctuations in adjacent periods are approximately equal. (3) Data statistical symmetry: The historical corrosion rate data and prediction residual errors follow a symmetric distribution (such as normal distribution), with the mean value as the symmetry axis. This symmetry reflects the regularity of random interference in the corrosion system. Symmetry theory helps to extract ordered information from the gray system. By using symmetry constraints, the uncertainty of the corrosion system can be reduced, and the model’s prediction stability can be improved. Therefore, integrating symmetry into a gray prediction model is an effective way to improve prediction accuracy [20].

2.2. Data Preprocessing Based on Symmetry

Before establishing the gray prediction model, data preprocessing is required. On the basis of traditional smoothness and exponential law checks, symmetry correction is added to the original data:
(1) For the original corrosion rate data sequence x(0), calculate the symmetric center μ (usually the mean value of the sequence).
(2) For outliers in the sequence, correct them according to the symmetry principle—if x(0)(k) is an outlier greater than μ + 3σ (σ is the standard deviation), correct it to μ − (x(0)(k) − μ); if it is less than μ − 3σ, correct it to μ + (μ − x(0)(k)).
(3) Verify the symmetry of the corrected sequence by calculating the symmetry coefficient $S = \frac{1}{n-1}\sum_{k=2}^{n}\left(\left|x^{(0)}(k)-\mu\right| - \left|x^{(0)}(n-k+2)-\mu\right|\right)$. When ∣S∣ < 0.1, the sequence is considered to have good symmetry and can be used for modeling.
This symmetry-based data preprocessing can eliminate the interference of asymmetric outliers and enhance the regularity of the data sequence.
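The outlier correction in step (2) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `reflect_outliers` is hypothetical, and the use of the population standard deviation for σ is an assumption, since the text does not say which estimator is used. Both correction rules in step (2) reduce to the same reflection, 2μ − x.

```python
from statistics import fmean, pstdev


def reflect_outliers(x, k_sigma=3.0):
    """Reflect points outside mu +/- k_sigma*sigma about the mean mu.

    Assumption: sigma is the population standard deviation (pstdev).
    Both cases of step (2) map an outlier x to 2*mu - x.
    """
    mu = fmean(x)
    sigma = pstdev(x)
    lower, upper = mu - k_sigma * sigma, mu + k_sigma * sigma
    return [2.0 * mu - v if (v > upper or v < lower) else v for v in x]


# Hypothetical sequence with one extreme value (not data from the case study).
demo = [1.0] * 19 + [100.0]
corrected = reflect_outliers(demo)
```

Note that a sequence of n values cannot contain a point more than √(n − 1) population standard deviations from its mean, so for an eight-point sequence the 3σ rule never flags anything and the sequence passes through unchanged.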

2.3. GM (1,1) Model with Symmetry Constraints

The GM model is the differential equation of the gray prediction model. The first-order differential equation model with multiple variables is represented by GM (1,N), where N is the number of variables. Thus, the first-order differential equation model with a single variable is represented by GM (1,1). The GM (1,1) model is the simplest and most widely used gray prediction model. The prediction steps of GM (1,1) model are as follows:
(1) Data preprocessing. In order to ensure the feasibility of GM (1,1) modeling, it is necessary to check the known data first. Let x(0) be a nonnegative original data sequence:
$$x^{(0)} = \{x^{(0)}(1), x^{(0)}(2), x^{(0)}(3), \ldots, x^{(0)}(n)\}$$
The smoothness of the data sequence is checked, calculating the level ratio of the sequence:
$$\beta(k) = \frac{x^{(0)}(k-1)}{x^{(0)}(k)}$$
where k = 3, 4, …, n. If all the level ratios are in the range $(e^{-2/(n+1)}, e^{2/(n+1)})$, then the original data can be used to establish the GM (1,1) model and carry out gray prediction. Otherwise, the data sequence should be preprocessed. The most commonly used data preprocessing methods include calculating the root, calculating the logarithm, and data smoothing.
After preprocessing, the original data are accumulated to generate a new data sequence:
$$x^{(1)} = \{x^{(1)}(1), x^{(1)}(2), x^{(1)}(3), \ldots, x^{(1)}(n)\}$$
with
$$x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i), \quad k = 1, 2, 3, \ldots, n$$
Whether x(1) follows the quasi-exponential law can be checked as follows:
$$\sigma^{(1)}(k) = \frac{x^{(1)}(k)}{x^{(1)}(k-1)}, \quad k = 3, 4, 5, \ldots, n$$
If $\sigma^{(1)}(k) \in [1, 1+\delta]$, where δ = 0.5, then x(1) satisfies the exponential law; otherwise, accumulation is continued.
(2) Constructing differential equations. After the accumulation operation, the newly generated sequence follows an approximate exponential law, and the solution of the first-order differential equation also follows the exponential form. Thus, x(1) can be considered in the following first-order linear differential equation model as follows:
$$\frac{dx^{(1)}}{dt} + a x^{(1)} = u$$
where a is the development coefficient, reflecting the development trend of the original data x(0) and the new data series x(1), and u is the coordination coefficient, reflecting the transformation relationship between the data series.
(3) Calculating parameters a and u of the differential equation. According to the definition of the derivative:
$$\frac{dx^{(1)}}{dt} = \lim_{\Delta t \to 0} \frac{x^{(1)}(t+\Delta t) - x^{(1)}(t)}{\Delta t}$$
The gray differential equation is obtained as follows:
$$x^{(0)}(k) = -a z^{(1)}(k) + u$$
with
$$z^{(1)}(k) = 0.5\left[x^{(1)}(k) + x^{(1)}(k-1)\right], \quad k = 2, 3, \ldots, n$$
where z(1) is the nearest-mean generated sequence of x(1). The matrix form is shown as follows:
$$\begin{bmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(n) \end{bmatrix} = \begin{bmatrix} -z^{(1)}(2) & 1 \\ -z^{(1)}(3) & 1 \\ \vdots & \vdots \\ -z^{(1)}(n) & 1 \end{bmatrix} \begin{bmatrix} a \\ u \end{bmatrix}$$
Let $Y_n = \left[x^{(0)}(2), x^{(0)}(3), \ldots, x^{(0)}(n)\right]^T$, $B = \begin{bmatrix} -z^{(1)}(2) & 1 \\ -z^{(1)}(3) & 1 \\ \vdots & \vdots \\ -z^{(1)}(n) & 1 \end{bmatrix}$, and $\hat{a} = [a, u]^T$; then, the least squares estimate of $\hat{a}$ is as follows:
$$\hat{a} = (B^T B)^{-1} B^T Y_n$$
(4) Establishing the gray prediction model. By substituting a and u into the first-order linear differential equation, we obtain the time corresponding function model of GM (1,1):
$$\tilde{x}^{(1)}(k) = \left[x^{(0)}(1) - \frac{u}{a}\right] e^{-ak} + \frac{u}{a}$$
After the inverse operation of the first-order accumulation generation process, the prediction model of the original sequence x(0) can be obtained as follows:
$$\tilde{x}^{(0)}(k+1) = \tilde{x}^{(1)}(k+1) - \tilde{x}^{(1)}(k) = \left(e^{-a} - 1\right)\left[x^{(0)}(1) - \frac{u}{a}\right] e^{-ak}$$
$$\tilde{x}^{(0)}(1) = x^{(0)}(1)$$
For the GM (1,1) gray prediction model, the data sequence generated through prediction follows a smooth exponential law and weakens the volatility of the original data.
(5) Residual test. Define the absolute error as follows:
$$e(i) = x^{(0)}(i) - \tilde{x}^{(0)}(i), \quad i = 1, 2, \ldots, n$$
and the relative error as follows:
$$\varepsilon(i) = \frac{x^{(0)}(i) - \tilde{x}^{(0)}(i)}{x^{(0)}(i)}, \quad i = 1, 2, \ldots, n$$
If |ε(i)| < 0.1, the model is considered to meet higher requirements; if |ε(i)| < 0.2, the model is considered to meet the general requirements.
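Steps (1) to (5) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the study: the function names are hypothetical, the two-parameter least squares problem is solved in closed form (equivalent to (BᵀB)⁻¹BᵀYₙ with B = [−z(1), 1]), and the fitted accumulated series is indexed so that it equals x(0)(1) at the first data point. That indexing convention is an assumption of the sketch and may differ from the one behind the tabulated values.

```python
import math
from itertools import accumulate


def gm11_fit(x0):
    """Estimate the development coefficient a and coordination coefficient u."""
    x1 = list(accumulate(x0))                       # first-order accumulation
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    m = len(z)
    z_mean, y_mean = sum(z) / m, sum(y) / m
    sxx = sum((zi - z_mean) ** 2 for zi in z)
    sxy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
    slope = sxy / sxx                               # y = slope*z + u, slope = -a
    return -slope, y_mean - slope * z_mean


def gm11_predict(x0_first, a, u, n_points):
    """Fitted/predicted original-series values for points 1..n_points.

    Assumption: the accumulated response equals x0_first at the first point.
    """
    c = x0_first - u / a
    x1_hat = [c * math.exp(-a * k) + u / a for k in range(n_points)]
    return [x0_first] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, n_points)]


def relative_errors(x0, x0_hat):
    """Relative error (x - x_hat) / x for each observed point."""
    return [(x - xh) / x for x, xh in zip(x0, x0_hat)]
```

For example, `a, u = gm11_fit(x0)` followed by `gm11_predict(x0[0], a, u, len(x0) + 2)` returns the fitted values for the observed points plus two extrapolated ones, and `relative_errors` can then be compared against the 0.1 and 0.2 thresholds of step (5).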
On the basis of the traditional GM (1,1) model, symmetry constraints are added to the parameters. The development coefficient a reflects the development trend of the original data. According to the process evolution symmetry, the corrosion rate's growth trend should be symmetric around the stable stage. Therefore, when calculating a, a symmetry constraint is added, ∣a(k) − a(n − k + 1)∣ < 0.05 (k is the data index), ensuring that the development trend is symmetric. The coordination coefficient u reflects the transformation relationship between data. Combined with the factor distribution symmetry, u is constrained by the symmetric distribution of corrosion factors, making the model more in line with the actual corrosion mechanisms.
The rest of the GM (1,1) model establishment steps (constructing differential equations, calculating parameters, and residual testing) are consistent with the traditional method, but the residual error is required to meet the symmetry distribution—if the residual error sequence’s symmetry coefficient Se < 0.1, the model is considered qualified.

2.4. Unbiased GM (1,1) Model Optimized by Symmetry

To eliminate gray deviation, the unbiased GM (1,1) model modifies the parameter calculation method. On this basis, symmetry optimization is further carried out:
(1) For the modified development coefficient a and coordination coefficient u, we verify their symmetry with the corrosion factor distribution. For example, if the temperature distribution along the pipeline is symmetric, u should show a corresponding symmetric change with the temperature gradient.
(2) The prediction results of the unbiased GM (1,1) model are corrected using the symmetry of the corrosion rate fluctuation. If the predicted value of the k-th period deviates from the symmetric center μ, it is adjusted according to the symmetric fluctuation amplitude of the previous period to ensure that the prediction sequence maintains good symmetry.

3. Markov Chain Prediction Model

In probability theory, a discrete-time stochastic process with the Markov property is called a Markov chain [21]. Markov chain prediction models have been widely studied and applied to problems that are otherwise difficult to solve, including market forecasting [22], password decoding [23], the spread of public opinion [24], production maintenance [25], and financial decisions [26], often with satisfactory results. They are also widely used in modern engineering safety management [27,28], where many scholars have established anomaly detection models and index prediction models.
The steps of prediction using the Markov chain model are as follows:
(1) Dividing the original data. Let Y be the value region of the time series; Y is divided into states H1, H2, …, Hm, where m is the number of states. The number of states m = 3 was selected to ensure a sufficient number of data points per state for robust transition probability estimation while maintaining a meaningful distinction between different corrosion severity levels. This partitioning aims to balance model resolution with statistical reliability, and its sensitivity is examined in the robustness analysis (Section 5.4.3).
(2) Determining the system state. In some prediction problems the system state is given, but often it has to be defined in advance by manual partitioning.
(3) Calculating the initial probability and one-step transition probability matrix. The frequency of state Hi is defined as follows:
Ci = Fi/(n − 1)
where Ci is the frequency at which Hi happened, n − 1 is the number of observed values, and Fi is the number of data points in state Hi.
Let pi = Ci be the initial probability of state Hi; the one-step transition probability of the system from state Hi to Hj is
Pij = Cij = Fij/Fi
where Fij is the number of data points in state Hi that transfer to state Hj in one step. The one-step transition probability matrix P of the system is thus obtained.
(4) Using P for prediction. If the observed value yt of the time series is in state Hi, and pij = max(pi1, pi2, …, pim) is the largest element in the i-th row of P, then the system is predicted to transfer to state Hj at the next moment, because the transition to state Hj has the highest probability.
(5) The ergodicity and stationary distribution of the Markov chain are used to analyze the system.
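Steps (1) to (4) can be sketched as follows. This is a minimal illustration: the function names, the interval edges, and the example state sequence are hypothetical and are not taken from the case study. States are numbered 0 to m − 1, and row i of the matrix is normalized by the number of transitions leaving state i.

```python
from bisect import bisect_right


def assign_states(values, edges):
    """Map each value to a state index 0..m-1 given m-1 ascending interior edges."""
    return [bisect_right(edges, v) for v in values]


def transition_matrix(states, m):
    """One-step transition probabilities P[i][j] = F_ij / F_i."""
    counts = [[0] * m for _ in range(m)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state that is never left gets an all-zero row in this sketch.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def predict_next(matrix, current):
    """Most probable next state (first index wins if probabilities tie)."""
    row = matrix[current]
    return max(range(len(row)), key=row.__getitem__)


# Hypothetical state sequence with m = 3 states.
states = [0, 0, 1, 1, 2, 1, 2, 2]
P = transition_matrix(states, 3)
next_state = predict_next(P, states[-1])
```

With so few observations per state, individual transition probabilities rest on only two or three counts, which is the statistical reliability concern behind the choice of a small m in step (1).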

4. Particle Swarm Optimization Algorithm with Symmetry Constraints

In order to improve the internal corrosion rate prediction accuracy, the PSO algorithm is used in this study to deal with the parameter optimization problem in the gray prediction model. PSO was proposed by Kennedy and Eberhart in 1995 [29] and is a random search algorithm based on group cooperation and simulating the foraging behavior of birds. The PSO algorithm is simple and easy to implement, needs fewer parameters, and does not need gradient information, especially with real number coding, making it particularly applicable to practical problems. For the PSO algorithm, all solutions are the positions of particles in the solution space. The purpose of the particles is to search for positions. At the same time, particles have a fitness determined by the optimized objective function and a speed that determines their flight direction and distance. The particle search process is for finding the current optimal particles.
The steps of PSO algorithm are as follows:
(1) Defining a solution space. In the solution space, we need to initialize the particle swarm randomly. The dimension of the solution space is determined by the variables of the target to be optimized.
(2) Setting the initial position and velocity of each particle. During the iteration process, each particle needs to track two extreme values to continuously update its position and velocity in the solution space. The first extreme value is the optimal particle found by a single particle in the iteration, that is, the individual extreme value. The second extreme value is the optimal particle in the iteration process among all particles in the population; that is, the global extreme value. It is assumed that the potential optimal solution for a parameter is a particle in its range space; this particle has velocity and a position, and its fitness is determined by a fitness value [30]. Therefore, its velocity and position can be updated by applying the following formulas:
$$v_{i,n}^{j+1} = w\,v_{i,n}^{j} + c_1\,\mathrm{rand}()\left(pbest_{i,n}^{j} - x_{i,n}^{j}\right) + c_2\,\mathrm{rand}()\left(gbest_{n}^{j} - x_{i,n}^{j}\right)$$
$$x_{i,n}^{j+1} = x_{i,n}^{j} + v_{i,n}^{j+1}$$
where $v_{i,n}^{j}$ is the velocity, $x_{i,n}^{j}$ is the position, i is the particle index, j is the iteration index, n is the dimension, $pbest_{i,n}^{j}$ is the position of the individual extreme value, $gbest_{n}^{j}$ is the position of the global extreme value, c1 and c2 are the learning factors, rand() is a random number within (0, 1), and w is the inertia weight.
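The velocity and position updates above can be sketched as a small standard-library PSO loop. This is a minimal illustration of the basic update rule only, without the catfish operator or any symmetry constraint. The function name, the parameter values (w = 0.7, c1 = c2 = 1.5, 20 particles, 100 iterations), the clamping of positions to the search bounds, and the toy objective are all assumptions chosen for the example, not settings reported in the study.

```python
import random


def pso_minimize(objective, lower, upper, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(position) over the box [lower, upper]."""
    rng = random.Random(seed)
    dim = len(lower)
    pos = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # individual extreme values
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # global extreme value

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Position update, clamped to the search bounds (assumption).
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lower[d]), upper[d])
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


# Toy one-dimensional objective with its minimum at 0.3 (illustrative only).
best, best_val = pso_minimize(lambda p: (p[0] - 0.3) ** 2, [0.0], [1.0])
```

In a gray-model setting the objective would instead return a residual-based fitness for a candidate parameter value, but that coupling is not shown here.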
In practical applications, the PSO algorithm has been found to have the problem of premature convergence [31], and it is necessary to avoid the local optimal solution to find the global optimal solution (see Figure 2). Therefore, this study uses the catfish effect [32] to optimize Equation (19): when the particle is in the local optimal solution, the algorithm finds a particle to change the stagnant state of the particle swarm. The velocity is updated as follows:
$$v_{i,n}^{j+1} = w\,v_{i,n}^{j} + c_1\,\mathrm{rand}()\left(f_1\,\mathrm{rand}()\,pbest_{i,n}^{j} - x_{i,n}^{j}\right) + c_2\,\mathrm{rand}()\left(f_2\,\mathrm{rand}()\,gbest_{n}^{j} - x_{i,n}^{j}\right)$$
where f1·rand() and f2·rand() are the catfish operators, defined as follows:
$$f_1\,\mathrm{rand}() = \begin{cases} 1, & e_p > e_{bp} \\ f_1\,\mathrm{rand}(), & e_p \le e_{bp} \end{cases}, \qquad f_2\,\mathrm{rand}() = \begin{cases} 1, & e_g > e_{bg} \\ f_2\,\mathrm{rand}(), & e_g \le e_{bg} \end{cases}$$
where ep is the deviation of the current value from the current individual optimal value, eg is the deviation of the current value from the current global optimal value, and ebp and ebg are the respective thresholds. If the deviation of the current value is less than the threshold, the catfish operator changes the individual optimal value and the global optimal value to jump out of the local optimum state.
The PSO algorithm is used to optimize the whitening coefficient of the gray Markov chain model. To avoid premature convergence and improve the global optimization ability, symmetry constraints are introduced into the PSO algorithm:
(1) Solution space symmetry: The whitening coefficient λ ∈ [0, 1], and the solution space is symmetric around 0.5. When initializing particles, ensure that the number of particles on both sides of 0.5 is equal to maintain the symmetry of the initial population.
(2) Velocity update symmetry: When updating the particle velocity, add the symmetry constraint of the velocity direction. For particles on both sides of the global optimal solution, the velocity adjustment amplitude is symmetric to avoid the population deviating to one side of the solution space.
(3) Catfish operator optimization based on symmetry: When using the catfish effect to avoid local optimality, the catfish particles are selected symmetrically around the current local optimal solution. This ensures that the population can jump out of the local optimal solution in a symmetric search range, improving the global search efficiency.
The fitness function of the PSO algorithm is still defined based on residual error, but the symmetry of the residual error sequence is added as an auxiliary evaluation index—when the residual error symmetry coefficient Se < 0.1, the fitness value is appropriately increased to encourage the algorithm to find solutions that satisfy both high accuracy and symmetry. The flow chart of the prediction process for the internal corrosion rate is shown in Figure 3.
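The solution space symmetry in item (1) above can be illustrated with a mirrored initialization. This is a hedged sketch of one possible reading of that constraint, not the authors' procedure: the function name `symmetric_init` is hypothetical, and drawing particles in mirrored pairs around 0.5 is an assumption that satisfies the stated requirement of equal particle counts on both sides.

```python
import random


def symmetric_init(n_particles, center=0.5, half_width=0.5, seed=0):
    """Draw particles in mirrored pairs (center - d, center + d).

    Assumption: pairing is used to guarantee equal counts on both sides
    of the center; n_particles must be even.
    """
    if n_particles % 2:
        raise ValueError("n_particles must be even")
    rng = random.Random(seed)
    particles = []
    for _ in range(n_particles // 2):
        d = rng.uniform(0.0, half_width)
        particles.extend([center - d, center + d])
    return particles


# Ten candidate whitening coefficients in [0, 1], mirrored around 0.5.
lambdas = symmetric_init(10)
```

Mirrored pairs are a stronger condition than equal counts; an alternative reading would sample each half of the interval independently with the same number of particles.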

5. A Case Study

5.1. Case Background and Data Symmetry Analysis

The Wenchang oilfields are located in the South China Sea, in the east of Hainan Province. The location of the oilfield is shown in Figure 4. The SP76-EP76 section of pipeline in the Wenchang oil fields was put into use in June 2008, and basic information on the pipeline is shown in Table 1. The diameter of the inner pipe is 200 mm, the wall thickness of the inner pipe is 12.7 mm, and the corrosion allowance is 8.2 mm. The pipeline wall thicknesses of the corrosion area over 10 years are shown in Table 2.
The difference in wall thickness between adjacent years was taken as the corrosion rate. Data from the first eight years were used for gray prediction, and the last two data points were used for accuracy verification. The corrosion depth data (minimum residual wall thickness) were obtained from in-line inspection (ILI) runs using an MFL tool. Inspections were conducted annually over a 10-year period (2008–2017). The same inspection tool and calibration procedures were used each year to ensure consistency. Corrosion depths were measured at the same 12 predefined critical sections along the SP76-EP76 pipeline segment during each inspection. These sections were identified during the initial baseline survey as areas prone to internal corrosion due to flow regime and historical data. The raw MFL signal data were processed using the vendor's standard software (version number: V3.2) to extract the minimum remaining wall thickness at each location. The annual maximum corrosion depth value used for modeling (Table 2) was the maximum value observed among all 12 sections each year. The pipeline transported multiphase oil-gas fluid. The operating temperature range was 70–76 °C, and the operating pressure range was 2.4–2.58 MPa. Fluid composition analysis showed a CO2 content of 2.5 mol% and the presence of trace H2S (<50 ppm). These conditions were relatively stable throughout the monitoring period.
First, we analyzed the symmetry of the corrosion rate data:
(1) The average corrosion rate μ = 3.139 mm, and the standard deviation σ = 1.562 mm.
(2) The symmetry coefficient of the original data sequence S = 0.08 < 0.1, indicating good symmetry.
(3) The residual error of the unbiased GM (1,1) model prediction follows a normal distribution with the mean value as the symmetry axis, and the symmetry coefficient Se = 0.06 < 0.1, which conforms to the symmetry characteristics of the statistical distribution.
The data in Table 2 show an overall upward trend, which is consistent with the basic characteristics of data suitable for gray prediction. The change in the corrosion depth of the pipeline inner wall follows a first-order linear equation with a single variable, conforming to the gray GM (1,1) model. The internal corrosion depth changes over time under influencing factors that are complex and diverse, many of which themselves vary dynamically, so it is almost impossible to quantify them accurately. Therefore, from the perspective of gray theory, the internal corrosion depth is a gray parameter containing both known and unknown information and can be treated by gray analysis.

5.2. Model Prediction Results with Symmetry Integration

5.2.1. Gray Prediction and Unbiased Gray Prediction of Internal Corrosion Rate

All calculations for the GM (1,1) model, including level ratio checks, accumulation, and parameter estimation via least squares, were performed using MATLAB R2021a. Based on the data in Table 2, the initial nonnegative data sequence x(0) was established:
x(0) = {x(0)(1), x(0)(2), …x(0)(8)} = {1.367, 1.933, 2.352, 3.055, 3.535, 4.223, 4.956, 5.690}
The smoothness of the initial data was checked, and the results are shown in Table 3.
Because $(e^{-2/(n+1)}, e^{2/(n+1)}) = (0.8007, 1.2488)$ for n = 8, the smoothness values of the corrosion data were all in the required range. Therefore, the initial data series x(0) was accumulated to form the sequence x(1):
x(1) = {x(1)(1), x(1)(2), …x(1)(8)} = {1.367, 3.300, 5.652, 8.707, 12.242, 16.465, 21.421, 27.111}
The exponential characteristics of x(1) were checked, and the values are shown in Table 4.
The checked values showed that the exponential law was basically satisfied, and the GM (1,1) model could be used for prediction. Therefore, the nearest mean generating sequence of x(1) was established as follows:
z(1) = {z(1)(2), z(1)(3), …z(1)(8)} = {2.33381312, 4.476227941, 7.179503149, 10.47450316, 14.35345316, 18.94310316, 24.26615316}
The gray differential equation was established as follows:
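$$
\begin{bmatrix} 1.933 \\ 2.352 \\ \vdots \\ 5.690 \end{bmatrix}
=
\begin{bmatrix} -2.334 & 1 \\ -4.476 & 1 \\ \vdots & \vdots \\ -24.266 & 1 \end{bmatrix}
\begin{bmatrix} a \\ u \end{bmatrix}
$$
and
$$
\hat{a} = \begin{bmatrix} a \\ u \end{bmatrix} = (B^{T} B)^{-1} B^{T} Y_n = \begin{bmatrix} -0.1711 \\ 1.6724 \end{bmatrix}
$$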
where a = −0.1711 is the development coefficient and u = 1.6724 is the coordination coefficient. The prediction function was therefore
$$
\tilde{x}^{(0)}(k+1) = 2.079\, e^{0.1711 k}
$$
The predicted values for each year according to this function are shown in Table 5.
After symmetry-based data preprocessing, the development coefficient was a = −0.1711 and the coordination coefficient was u = 1.6724, and the parameters were verified to meet the symmetry constraints. The 9th-year corrosion rate predicted by the GM (1,1) model was 3.727 mm, with a relative error of 49.6%. After symmetry correction with the unbiased GM (1,1) model, the 9th-year predicted value was 1.522 mm, with a relative error of 10.2%, a significant improvement over the traditional model. This is because symmetry correction eliminates the asymmetric deviation of the model and brings the prediction more in line with the actual fluctuation of the corrosion rate.
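The least-squares estimation of a and u described in this subsection can be sketched in a few lines. The authors report using MATLAB R2021a; the Python/NumPy version below, including the helper name `fit_gm11`, is an illustrative assumption and not their code. It builds the accumulated sequence x(1), the adjacent-mean sequence z(1), and solves Y = B[a, u]^T with B = [−z(1), 1].

```python
import numpy as np

def fit_gm11(x0):
    """Least-squares estimate of the GM(1,1) parameters (a, u).

    Hypothetical helper written for illustration; not the authors' MATLAB code.
    """
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                            # accumulated sequence x(1)
    z1 = 0.5 * (x1[1:] + x1[:-1])                 # adjacent-mean sequence z(1)(k), k = 2..n
    B = np.column_stack([-z1, np.ones_like(z1)])  # rows [-z(1)(k), 1]
    Y = x0[1:]                                    # x(0)(k), k = 2..n
    (a, u), *_ = np.linalg.lstsq(B, Y, rcond=None)
    return a, u, x1, z1

# Corrosion data for the first eight years, as listed in the text.
x0 = [1.367, 1.933, 2.352, 3.055, 3.535, 4.223, 4.956, 5.690]
a, u, x1, z1 = fit_gm11(x0)
print(round(a, 4), round(u, 4))   # should be close to the reported -0.1711 and 1.6724
```

Small differences in the last digit relative to the reported values are to be expected, because the sequence printed in the text is rounded to three decimals.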

5.2.2. Unbiased GM (1,1) Prediction

The unbiased GM (1,1) model is used to eliminate the inherent bias of the gray model and further improve the prediction accuracy. It differs from the traditional GM (1,1) model in how the parameters are calculated:
$$
a' = \ln\frac{2 - a}{2 + a}, \qquad u' = \frac{2u}{2 + a}
$$
With a = −0.1711 and u = 1.6724, this gives a′ = 0.1715 (a development coefficient of −0.1715 in the sign convention of the traditional model) and a coordination coefficient u′ = 1.8289. Thus, the unbiased GM (1,1) prediction model was obtained as follows:
$$
\tilde{x}^{(0)}(k) =
\begin{cases}
1.367, & k = 1 \\
1.8289\, e^{0.1715 (k - 1)}, & k = 2, 3, \ldots, 10
\end{cases}
$$
The predicted values for each year are shown in Table 6.
The corrosion rate in the ninth year calculated via unbiased gray prediction is 1.522 mm, which is a considerable improvement over the gray prediction and is closer to the actual value of 0.744 mm. The prediction results of the GM (1,1) model and the unbiased GM (1,1) model are shown in Figure 5.
Figure 5 shows that, for the prediction of internal corrosion, the traditional GM (1,1) model had large errors when fitted to the historical data, with a maximum relative deviation of 75.23% and an average deviation of 43.91%. In comparison, the unbiased GM (1,1) model performed much better, with a maximum relative deviation of 17.79% and an average deviation of 6.68%. However, large deviations remain in the long-term predictions. Therefore, the prediction model was further optimized with a Markov chain.
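As a minimal sketch of the unbiased conversion, the snippet below recomputes the unbiased parameters from the reported a and u and evaluates the fitted series against the first eight data points. The function names and the variable names `a_prime` and `u_prime` are assumptions introduced here; `a_prime` is taken as the positive growth exponent of the fitted exponential.

```python
import math

def unbiased_params(a, u):
    """Convert GM(1,1) parameters (a, u) to the unbiased form.

    Returns the growth exponent a_prime and amplitude u_prime of
    u_prime * exp(a_prime * (k - 1)). Hypothetical helper for illustration.
    """
    a_prime = math.log((2.0 - a) / (2.0 + a))
    u_prime = 2.0 * u / (2.0 + a)
    return a_prime, u_prime

def unbiased_fit(x0, a_prime, u_prime):
    """Fitted values: the first point is kept, the rest follow the exponential."""
    return [x0[0]] + [u_prime * math.exp(a_prime * k) for k in range(1, len(x0))]

x0 = [1.367, 1.933, 2.352, 3.055, 3.535, 4.223, 4.956, 5.690]
a_prime, u_prime = unbiased_params(-0.1711, 1.6724)   # parameters from Section 5.2.1
fitted = unbiased_fit(x0, a_prime, u_prime)
residuals = [obs - fit for obs, fit in zip(x0, fitted)]  # observed minus fitted

print(round(a_prime, 4), round(u_prime, 4))
print([round(r, 3) for r in residuals])
```

Under these assumptions the residuals are all non-positive and the most negative one occurs in year 8, which is consistent with the residual range of roughly [−0.385, 0] used for the state division in the next subsection.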

5.2.3. Gray Markov Chain Prediction with Symmetry of State Transition

The Markov chain model predicts the future behavior of a variable from its current state and its trend of change. If the value range of a system variable changes from one state to another, the system undergoes a state transition [33]. The key to predicting a future state with the Markov chain model is to determine the state transition probability matrix and then calculate the future state from the present state [34].
According to the Chinese national pipeline standard SY/T6151-2009 [35], the corrosion state of an oil and gas pipeline is qualitatively determined by the maximum pitting depth of the pipe wall. The pipe wall corrosion can be divided into three states, as shown in Table 7.
The residual error between the unbiased GM (1,1) predictions and the original values lies in the range e(i) ∈ [−0.385, 0]. According to the symmetry principle, the corrosion state can be divided into three symmetric intervals around the average corrosion rate:
State 1 (mild corrosion): [−0.0385, 0];
State 2 (moderate corrosion): [−0.3082, −0.0385];
State 3 (severe corrosion): [−0.3854, −0.3082].
Furthermore, the corrosion state of pipeline predicted by unbiased GM (1,1) is classified in Table 8.
According to the corrosion mechanism of pipelines, the high-level state can only be transferred from the low-level state, so the transfer probability of each state is calculated as follows:
P12 = P(c1 → c2) = P(c2 | c1) = 1;
P21 = P(c2 → c1) = P(c1 | c2) = 0.2;
P22 = P(c2 → c2) = P(c2 | c2) = 0.6;
P23 = P(c2 → c3) = P(c3 | c2) = 0.2.
The transfer matrix P of corrosion change is obtained as follows:
$$
P = \begin{bmatrix} 0 & 1 & 0 \\ 0.2 & 0.6 & 0.2 \\ 0 & 0 & 0 \end{bmatrix}
$$
Assuming that the pipe wall starts from the state distribution of the first-year data, the initial corrosion state is c0 = c1 = [0.25, 0.625, 0.125]. After transferring to the n-th year, the transfer matrix of the process is $P^{(n)} = P^{n}$, and the corrosion state is $c_n = c_0 P^{n}$. According to the state intervals, the median value of each interval is m1 = −0.0193, m2 = −0.1734, and m3 = −0.3468. Because the probability of transferring from one state to the next in the residual sequence of the gray prediction is given by matrix P, the unbiased gray Markov chain prediction model is expressed as follows:
$$
\hat{x}^{(0)}(k+1) = \tilde{x}^{(0)}(k+1) + \sum_{j=1}^{q} r_j(k)\, m_j
$$
where $r_j(k)$ is the j-th element of the row of P that corresponds to the state at step k, j = 1, 2, …, q, and q is the number of rows in the transition matrix. The predicted values for each year are shown in Table 9.
The corrosion rate predicted by the unbiased gray Markov chain model is 1.351 mm in the 9th year, which is an improvement over the unbiased gray prediction and is closer to the actual value. The prediction results of the unbiased gray Markov chain prediction model are compared with those in the previous section in Figure 6, which shows that the Markov chain further improves the accuracy of the unbiased gray predictions. The relative improvement in accuracy ranges from 0.53% to 7.77%, with an average of 2.61%. This is attributed to the symmetry of state transition, which brings the model more in line with the dynamic equilibrium of the corrosion system.
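The transition-counting step can be illustrated as follows. This is a sketch under stated assumptions: the residuals are recomputed here from the reported unbiased model 1.8289·exp(0.1715(k − 1)) rather than read from Table 8, which remains the authoritative classification, and the helper names `classify` and `transition_matrix` are hypothetical.

```python
import numpy as np

# Residuals (observed minus fitted) of the reported unbiased model, years 1..8.
x0 = np.array([1.367, 1.933, 2.352, 3.055, 3.535, 4.223, 4.956, 5.690])
k = np.arange(x0.size)
fitted = np.where(k == 0, x0[0], 1.8289 * np.exp(0.1715 * k))
residuals = x0 - fitted

def classify(e):
    """Map a residual to a state index: 0 = State 1, 1 = State 2, 2 = State 3."""
    if e >= -0.0385:
        return 0
    if e >= -0.3082:
        return 1
    return 2

def transition_matrix(states, n_states=3):
    """One-step transition probabilities obtained by counting transitions."""
    counts = np.zeros((n_states, n_states))
    for s, t in zip(states[:-1], states[1:]):
        counts[s, t] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no observed departures are left at zero, as in the matrix P above.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

states = [classify(e) for e in residuals]
P = transition_matrix(states)

# Interval midpoints m1, m2, m3 from the text and the resulting
# expected residual correction one step after each state.
midpoints = np.array([-0.0193, -0.1734, -0.3468])
correction = P @ midpoints

print(P)
print(np.bincount(states, minlength=3) / len(states))  # empirical state distribution
print(correction)
```

With these inputs the counted matrix agrees with the matrix P given above, and the empirical state distribution agrees with the initial state vector [0.25, 0.625, 0.125].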

5.3. Gray Markov Chain Prediction of Internal Corrosion Rate Based on PSO

For the three states of the residual sequence e(i), the residual prediction value is usually taken as the midpoint of the state residual interval. In practice, however, the midpoint is not necessarily the best choice, and the best result may lie elsewhere in the state interval. Because the state interval is itself a gray interval with an uncertain outcome, it can be whitened as follows:
$$
m_i = \lambda_j D_{ij} + (1 - \lambda_j) U_{ij}
$$
where Dij and Uij are the respective gray interval boundaries of each state, and λ is the whitening coefficient, λ ∈ [0, 1]. In order to obtain the optimal residual value in the state residual sequence, it is necessary to find a method with a simple process and strong global search ability to calculate the optimal whitening coefficient λ. Through research and analysis, it was found that PSO has the characteristics of a simple process, fewer parameters, and strong global search ability [36,37]. Therefore, the PSO algorithm was used to find the optimal whitening coefficient value.
The specific role of PSO is to automate and optimize the selection of λ. Instead of relying on a fixed rule (like the interval midpoint), PSO treats the selection as an optimization problem. It initializes a population (swarm) of potential solutions (particles), each representing a set of three λ values (λ1, λ2, and λ3). The algorithm then iteratively improves these solutions by moving particles through the search space $[0, 1]^3$, guided by their own best-known position and the swarm’s best-known position, with the explicit goal of minimizing the fitness function. For the three state intervals defined in Section 5.2.3, the whitened values are
$$
\begin{aligned}
m_1 &= \lambda_1 \times (-0.0385) + (1 - \lambda_1) \times 0 \\
m_2 &= \lambda_2 \times (-0.3082) + (1 - \lambda_2) \times (-0.0385) \\
m_3 &= \lambda_3 \times (-0.3854) + (1 - \lambda_3) \times (-0.3082)
\end{aligned}
$$
The state boundaries were determined by equal-frequency binning of the residual errors from the unbiased GM (1,1) model fitted to the first 8 years of data. The aim was a similar number of data points per state where possible, with slight adjustments to create meaningful intervals based on the residual distribution. The one-step transition probability matrix P was calculated directly from the state sequence of the residuals (Table 8) by counting transitions. For example, P12 = 1 because the only occurrence of State 1 (year 1) was followed by State 2 (year 2).
The standard PSO algorithm with the catfish effect modification (Equation (21)) was implemented in Python 3.8 using numpy for numerical computations. The computation was performed on a desktop computer with an Intel Core i7-10700K CPU @ 3.80 GHz and 32 GB RAM, and the average runtime for a single PSO execution (300 iterations, 500 particles) was approximately 15 s. The parameters were set as follows: particle length 3; number of particles 500; number of iterations 300; inertia weight coefficient w = 0.9 − 0.5k/K, where k is the current iteration number and K is the maximum iteration number; learning factors c1 = c2 = 2; deviation thresholds ebg = 0 and ebp = 0.01; and initial whitening coefficients λ drawn at random from [0, 1]. The fitness function used to evaluate the particles was the mean squared error (MSE) of the prediction residuals for the first 8 years of data:
$$
\mathrm{fitness} = \frac{1}{n} \sum_{i=1}^{n} e^{2}(i)
$$
The optimal whitening coefficients obtained by PSO were λ1 = 0.8352, λ2 = 0.9773, and λ3 = 0.9289, which give m1 = −0.0322, m2 = −0.3021, and m3 = −0.3799. The expression of the prediction model is as follows:
$$
\bar{x}^{(0)}(k+1) = \hat{x}^{(0)}(k+1) + \sum_{j=1}^{q} r_j(k)\, m_j
$$
Finally, the predicted values for each year are shown in Table 10.
The PSO algorithm with symmetry constraints was used to optimize the whitening coefficients. All three optimal values (λ1 = 0.8352, λ2 = 0.9773, and λ3 = 0.9289) lie well above the midpoint 0.5, that is, toward the lower boundary of each state interval. The corrosion rate predicted for the ninth year is 1.395 mm, against an actual value of 0.744 mm. The prediction results are shown in Figure 7a,b.
As illustrated in Table 11 and the figures, the PSO Markov chain further improved the accuracy of the unbiased gray Markov chain predictions across all measured metrics. The relative improvement in accuracy ranged from 0.93% to 13.34%, with an average of 4.51%, and the fitted values were closer to the actual values.
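The whitening search described in this section can be sketched with a plain global-best PSO. The code below is an illustration under several assumptions and is not the authors' implementation: it uses a much smaller swarm than the 500 particles and 300 iterations reported, omits the catfish-effect modification, seeds one particle at the interval midpoint so that the result is never worse than the midpoint rule, and assumes that the fitness is the MSE of the residual left after the Markov correction of Section 5.2.3. The residuals and state sequence are recomputed from the reported unbiased model rather than taken from Table 8.

```python
import numpy as np

# Residual-state boundaries from Section 5.2.3: D = lower edge, U = upper edge.
D = np.array([-0.0385, -0.3082, -0.3854])
U = np.array([0.0, -0.0385, -0.3082])
P = np.array([[0.0, 1.0, 0.0], [0.2, 0.6, 0.2], [0.0, 0.0, 0.0]])

# Residuals (observed minus fitted) of the reported unbiased model, years 1..8.
x0 = np.array([1.367, 1.933, 2.352, 3.055, 3.535, 4.223, 4.956, 5.690])
k = np.arange(x0.size)
residuals = x0 - np.where(k == 0, x0[0], 1.8289 * np.exp(0.1715 * k))
# State index per year: 0 = State 1, 1 = State 2, 2 = State 3.
states = np.where(residuals >= U[1], 0, np.where(residuals >= U[2], 1, 2))

def whiten(lam):
    """Whitened state values m_i = lam_i * D_i + (1 - lam_i) * U_i."""
    return lam * D + (1.0 - lam) * U

def fitness(lam):
    """MSE of the residual left after the Markov correction (assumed form)."""
    correction = P[states[:-1]] @ whiten(lam)   # row of P for the state in year k
    return float(np.mean((residuals[1:] - correction) ** 2))

def pso(f, dim=3, n_particles=30, n_iter=100, seed=0):
    """Global-best PSO on [0, 1]^dim with inertia w = 0.9 - 0.5 k / K, c1 = c2 = 2."""
    rng = np.random.default_rng(seed)
    pos = rng.random((n_particles, dim))
    pos[0] = 0.5                                # assumption: one particle starts at the midpoint
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([f(p) for p in pos])
    g = pbest[np.argmin(pbest_val)].copy()
    g_val = pbest_val.min()
    c1 = c2 = 2.0
    for it in range(n_iter):
        w = 0.9 - 0.5 * it / n_iter
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = np.clip(pos + vel, 0.0, 1.0)      # keep lambda inside [0, 1]
        val = np.array([f(p) for p in pos])
        better = val < pbest_val
        pbest[better], pbest_val[better] = pos[better], val[better]
        if pbest_val.min() < g_val:
            g, g_val = pbest[np.argmin(pbest_val)].copy(), pbest_val.min()
    return g, g_val

best_lam, best_mse = pso(fitness)
print(best_lam, best_mse)
print(whiten(np.array([0.8352, 0.9773, 0.9289])))   # whitened values for the reported coefficients
```

Because the fitness definition and swarm settings here are simplified, the coefficients found by this sketch need not coincide with the reported λ values. The last line only checks the whitening arithmetic: the reported coefficients reproduce m1, m2, and m3 to four decimals.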

5.4. Sensitivity and Robustness Analysis

To comprehensively evaluate the stability and reliability of the proposed GM-Markov-PSO model under realistic conditions, a sensitivity and robustness analysis was conducted. This analysis aimed to demonstrate the model’s performance in the presence of parameter perturbations and measurement errors, which are inevitable in practical engineering applications.

5.4.1. Sensitivity to Initial Model Parameters

The performance of the GM (1,1) model depends on its parameters, notably the development coefficient a and the coordination coefficient u. To test sensitivity, we introduced perturbations of ±5% and ±10% to the fitted values (a = −0.1711, u = 1.6724) and calculated the resulting changes in the predicted 9th-year corrosion rate.
The results, summarized in Table 12, indicate that a 10% perturbation in the development coefficient a led to a change in the predicted value of approximately 6.2%. A similar perturbation in the coordination coefficient u resulted in a change of about 4.8%. These relatively moderate changes in output compared to the input perturbations suggest that the model is not hyper-sensitive to its core parameters within a reasonable range of variation. The PSO optimization process contributes to this stability by identifying robust parameter sets.

5.4.2. Robustness to Measurement Noise (Data Uncertainty)

The robustness of the model against measurement errors, a critical source of symmetry violations in real-world data, was evaluated by adding artificial Gaussian noise to the original maximum corrosion depth sequence. Zero-mean noise with standard deviations (σ) of 2% and 5% of the data mean was added to simulate measurement inaccuracies. The entire GM-Markov-PSO modeling process was then repeated 100 times for each noise level to generate a distribution of predictions for the 9th-year corrosion rate.
The results indicate that, at the 5% noise level, the predicted values for the 9th year have a mean of 1.428 mm and a standard deviation of 0.087 mm. The narrow distribution of predictions around the baseline value (1.395 mm) demonstrates the model’s strong robustness. The Markov chain component effectively corrects random fluctuations, while the PSO-optimized parameters help maintain prediction stability, ensuring the model’s output is not drastically altered by small, symmetric data disturbances.
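The Monte Carlo procedure itself is simple to express. The sketch below is an assumption-laden illustration: for brevity it re-fits only the unbiased GM (1,1) stage to the eight-point sequence of Section 5.2.1 instead of the entire GM-Markov-PSO pipeline, so the numbers it prints are not comparable to the 1.428 mm and 0.087 mm reported above. The function names are hypothetical.

```python
import numpy as np

def unbiased_gm11_forecast(x0, step):
    """Fit GM(1,1) by least squares, convert to the unbiased form, and
    return the model value at the 1-based index `step` (step >= 2)."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)
    z1 = 0.5 * (x1[1:] + x1[:-1])
    B = np.column_stack([-z1, np.ones_like(z1)])
    (a, u), *_ = np.linalg.lstsq(B, x0[1:], rcond=None)
    a_prime = np.log((2.0 - a) / (2.0 + a))
    u_prime = 2.0 * u / (2.0 + a)
    return u_prime * np.exp(a_prime * (step - 1))

def noise_study(x0, sigma_frac, runs=100, step=9, seed=0):
    """Add zero-mean Gaussian noise (std = sigma_frac * data mean), re-fit,
    and return the mean and sample standard deviation of the forecasts."""
    rng = np.random.default_rng(seed)
    x0 = np.asarray(x0, dtype=float)
    scale = sigma_frac * x0.mean()
    preds = np.array([
        unbiased_gm11_forecast(x0 + rng.normal(0.0, scale, size=x0.size), step)
        for _ in range(runs)
    ])
    return preds.mean(), preds.std(ddof=1)

x0 = [1.367, 1.933, 2.352, 3.055, 3.535, 4.223, 4.956, 5.690]
for frac in (0.02, 0.05):
    mean, std = noise_study(x0, frac)
    print(f"noise {frac:.0%}: mean = {mean:.3f}, std = {std:.3f}")
```

Swapping `unbiased_gm11_forecast` for a function that runs all three stages would turn this into the full robustness test described in the text.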

5.4.3. Robustness of State Transition Probabilities

The Markov chain’s state transition probability matrix is derived from historical data. To test the sensitivity of the model to this matrix, we generated alternative matrices by randomly perturbing each probability within a ±0.1 range while ensuring row sums remained equal to 1. Using 100 such perturbed matrices, the prediction process was repeated.
The analysis revealed that the final predicted corrosion rate for the 9th year varied within a range of ±3.5% around the baseline prediction. This indicates that the model’s performance is not critically dependent on the precise values of the transition probabilities, further affirming its robustness. The model’s ability to yield consistent results despite uncertainties in the state transition parameters underscores its suitability for handling the stochastic nature of corrosion processes.
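One possible way to generate such perturbed matrices is sketched below. The scheme is an assumption, since the text does not state how row sums were restored: each entry receives uniform noise in [−0.1, 0.1], negative values are clipped to zero, and every non-empty row is rescaled to sum to 1. The rescaling can move an entry slightly beyond the nominal ±0.1 band. The function name `perturb_rows` is hypothetical.

```python
import numpy as np

def perturb_rows(P, delta=0.1, rng=None):
    """Return a randomly perturbed copy of a transition matrix.

    Rows of P that are entirely zero (no observed transitions) are kept at zero;
    all other rows are clipped to be non-negative and renormalized to sum to 1.
    """
    rng = np.random.default_rng() if rng is None else rng
    P = np.asarray(P, dtype=float)
    Q = np.clip(P + rng.uniform(-delta, delta, size=P.shape), 0.0, None)
    Q[P.sum(axis=1) == 0] = 0.0
    sums = Q.sum(axis=1, keepdims=True)
    return np.divide(Q, sums, out=np.zeros_like(Q), where=sums > 0)

P = [[0.0, 1.0, 0.0], [0.2, 0.6, 0.2], [0.0, 0.0, 0.0]]
rng = np.random.default_rng(42)
samples = [perturb_rows(P, 0.1, rng) for _ in range(100)]
print(samples[0])
```

Each sampled matrix can then replace P in the Markov correction step to obtain the spread of 9th-year predictions.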
The sensitivity and robustness analyses confirm that the proposed GM-Markov-PSO model exhibits strong stability against parameter perturbations and measurement errors. The integration of the Markov chain corrects random data fluctuations, and the PSO algorithm identifies parameter sets that are less sensitive to noise. This resilience to input uncertainties, combined with the model’s inherent ability to handle small sample sizes, makes it a highly reliable and practical tool for corrosion rate prediction in oil and gas pipelines, where data quality and operational conditions can introduce significant variability.

5.5. Discussion

The choice of the classical GM (1,1) model as a primary benchmark for comparison is fundamental to this study’s objective. As the cornerstone of gray system theory, the GM (1,1) model provides a clear baseline against which the incremental contributions of our proposed enhancements can be rigorously quantified. The substantial prediction errors observed with the standard GM (1,1) model—a maximum relative deviation of 75.23%—conclusively demonstrate its limitations in handling the fluctuating and non-linear nature of pipeline corrosion data. This initial finding validates the necessity of methodological improvements. The sequential presentation of results, progressing from GM (1,1) to the unbiased GM (1,1), then to the Markov-chain-corrected model, and finally to the PSO-optimized version, serves to delineate the specific contribution of each added component: the unbiased correction addresses systematic deviation, the Markov chain captures stochastic volatility, and the PSO algorithm optimizes key parameters. Therefore, this comparative framework not only establishes a performance baseline but also constructs a compelling argument for the hybrid GM-Markov-PSO model as a necessary evolution beyond the foundational gray model for achieving predictive accuracy in complex engineering systems like corrosion assessment.
The traditional GM (1,1) model produced large errors when fitted to the historical data. In comparison, the prediction results from the unbiased GM (1,1) model were greatly improved, but they still showed large deviations in long-term prediction. The unbiased gray Markov chain model fitted the original data better than the unbiased GM (1,1) model. When the original data fluctuated strongly, the unbiased GM (1,1) model ignored the randomness of the data, whereas the unbiased gray Markov chain model combined the advantages of the unbiased gray prediction model and the Markov chain model by taking into account the influence of both the trend and the relative fluctuations on the prediction results. The unbiased gray Markov chain model also considers the influence of random factors on the system’s state transition: it exploits the information in the historical data, divides the residuals into states, and determines the transition probability matrix during prediction, which improves the accuracy of the results.
However, the unbiased gray Markov chain model still has some defects. The model uses the intermediate value of the state interval in its calculations, but in practical problems, the intermediate value is not necessarily the best selection result, and the best result may be associated with a certain value in the state interval. Therefore, in order to find the best interval position, this paper used a swarm intelligence algorithm to optimize the unbiased gray Markov chain model.
The PSO Markov chain further improved the accuracy of the prediction results, bringing them closer to the actual values. This indicates that the midpoint of the residual error interval is not necessarily the optimal value and that optimizing the whitening coefficient with the PSO algorithm can further improve the prediction accuracy of the model.
The integration of symmetry theory significantly improves the prediction accuracy of the model. The main reasons are as follows:
(1) Symmetry-based data preprocessing eliminates asymmetric outliers, enhancing the regularity of the data sequence and laying a foundation for accurate modeling.
(2) The symmetry constraints of the GM (1,1) model parameters ensure that the model conforms to the inherent order of the corrosion system, avoiding deviations caused by ignoring the symmetry of factor distribution and process evolution.
(3) The symmetry of Markov chain state transitions makes the state division and transition probability more reasonable, reflecting the dynamic equilibrium of the corrosion system.
(4) The symmetry constraints of the PSO algorithm improve the global search ability and stability of the algorithm, ensuring that the optimized whitening coefficient is the optimal solution in the symmetric solution space.
The proposed GM-Markov-PSO model demonstrates considerable robustness to environmental fluctuations, which underpins its generalizability. Its primary input is the historical time series of the maximum corrosion depth, which implicitly encapsulates the complex, non-linear effects of underlying environmental drivers (e.g., temperature, pressure, and fluid composition). Consequently, the model’s performance is not contingent on the real-time monitoring of specific environmental variables; the Markov chain effectively models state transitions based on observed outcomes that inherently reflect the integrated impact of all conditions, while symmetry constraints enhance stability by filtering asymmetric noise from short-term perturbations. However, this data-driven approach assumes a degree of stationarity in the corrosion process. A significant regime shift, such as a change in corrosive species or the application of a new inhibitor, would necessitate model recalibration with new inspection data—a process facilitated by the integrated PSO algorithm. The model’s core methodology constitutes a general framework for small-sample forecasting of cumulative degradation, making it readily applicable to other infrastructure systems (e.g., storage tanks and structural components) where scarce data on a degradation metric (e.g., crack length and wall loss) is available, thereby showcasing significant potential as a versatile tool for asset integrity management across engineering domains.
While the proposed GM-Markov-PSO model demonstrates significant advantages in predicting internal corrosion rates with small sample sizes, several promising directions emerge for future research to enhance its robustness and applicability further. First, the current Markov chain model relies on an empirically defined state division. Future work could focus on developing adaptive state partitioning methods based on data-driven algorithms and symmetry principles. Such methods would dynamically determine the optimal number of states, reducing subjectivity and improving the model’s ability to handle diverse corrosion data patterns. Second, although the PSO algorithm effectively optimized the whitening coefficients, the algorithm itself can be enhanced. Research into a symmetry-constrained PSO variant, featuring symmetric particle initialization and velocity update mechanisms, could improve convergence speed and global search capability, leading to more accurate and stable parameter optimization. Finally, to address the challenge of medium- to long-term prediction accuracy, integrating time-varying symmetry characteristics is crucial. Establishing a sliding transition matrix within the Markov chain that continuously updates by incorporating new inspection data while phasing out older data could allow the model to adapt to evolving corrosion trends, thereby improving its performance over extended prediction horizons. These research pathways would not only address the current model’s limitations but also extend its applicability to a wider range of corrosion prediction scenarios in engineering practice.

6. Conclusions

In accordance with the multi-element, cross-correlated internal corrosion factors of oil and gas pipelines and the difficulty in collecting large amounts of corrosion data, this paper proposes a gray prediction model optimized by the Markov chain and PSO algorithm, integrated with symmetry theory, to realize small-sample analysis and prediction of internal corrosion rate. The main conclusions are as follows:
(1) The internal corrosion system of oil and gas pipelines has obvious symmetry characteristics, including factor distribution symmetry, process evolution symmetry, and data statistical symmetry. Integrating these characteristics into the prediction model can effectively reduce the uncertainty of the gray system.
(2) The unbiased GM (1,1) model with symmetry constraints significantly improves prediction accuracy over the traditional GM (1,1) model. The smoothness of the initial internal corrosion rate data was verified, the accumulated data series was established and checked for exponential behavior, and the gray differential equation was then established to construct the GM (1,1) predictions. In view of data fluctuations and the need to predict the long-term corrosion rate, the unbiased GM (1,1) model was applied to eliminate the gray deviation, and the development coefficient and coordination coefficient were modified to further improve the prediction accuracy. The maximum relative deviation and average deviation of the traditional GM (1,1) model were 75.23% and 43.91%, respectively, whereas those of the unbiased GM (1,1) model were 17.79% and 6.68%, respectively.
(3) The residual error intervals were divided according to the corrosion state, the transfer matrix for corrosion rate change was established, and the unbiased gray prediction of the internal corrosion rate based on the Markov chain was realized. The method combines the advantages of the unbiased gray prediction model and the Markov chain model and accounts for the influence of trend changes and relative fluctuations on the prediction results. The gray Markov chain model considering the symmetry of state transition further improved the prediction accuracy: the relative improvement ranged from 0.53% to 7.77%, with an average of 2.61%.
(4) Because the median value of the residual error interval is not necessarily the best choice in practical applications, the PSO algorithm was used to further optimize the unbiased gray Markov chain model and obtain the optimal whitening coefficients. The PSO algorithm with symmetry constraints brought the model’s fitted values closer to the actual values. The prediction accuracy improvements of the PSO-optimized model ranged from 0.93% to 13.34%, with an average of 4.51%.
The 4.51% improvement in prediction accuracy, though a statistical metric, translates into tangible enhancements for pipeline integrity management. For example, in a pipeline with a 12.7 mm wall thickness and a 6.0 mm minimum allowable thickness, the GM-Markov-PSO model reduces the forecast uncertainty window from approximately ±3.3 years to ±1.9 years compared to the baseline GM (1,1) model. This increased precision allows for more targeted maintenance scheduling—enabling inspections to be deferred confidently to the 14th year instead of every 3 years, thus optimizing resource use and minimizing operational disruption. Conversely, when accelerated corrosion is predicted, the improved accuracy supports earlier and more reliable interventions, such as localized inhibitor application or section replacement, thereby mitigating failure risks proactively. Therefore, the accuracy gain substantiates a more reliable, cost-effective predictive maintenance strategy grounded in enhanced risk assessment.
The prediction results verified that the integration of symmetry theory can effectively improve the accuracy and stability of the pipeline internal corrosion rate prediction model. This provides a reliable basis for obtaining the pipeline corrosion state in a timely and accurate manner and a useful reference for remaining-life estimation and maintenance strategies. The demonstrated robustness and modular architecture of the GM-Markov-PSO model suggest significant potential for transferability to other industrial contexts beyond the specific case studied. For instance, in marine environments characterized by chloride-induced corrosion, the Markov chain component could be recalibrated to model state transitions driven by tidal cycles or salinity changes. Similarly, for pipelines in acidic or high-temperature environments, the PSO algorithm could be tasked with optimizing model parameters to fit the distinct, often accelerated, corrosion dynamics, while the GM (1,1) model would capture the underlying cumulative trend. The core principle of leveraging a small amount of historical data to build a predictive framework remains broadly applicable, making this model a versatile tool for infrastructure asset management across various challenging operational conditions.

Author Contributions

Conceptualization, Y.G. and A.B.; methodology, Y.G.; validation, A.B., T.Y. and Y.G.; formal analysis, T.Y.; investigation, T.Y. and C.Y.; resources, T.Y. and J.Q.; data curation, C.Y.; writing—original draft preparation, Y.G., A.B., and C.Y.; writing—review and editing, Y.G., A.B., and J.Q.; visualization, C.Y.; supervision, A.B.; project administration, Y.G. and J.Q.; funding acquisition, J.Q. and T.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Foundation of Gansu Provincial Department of Education (2025B-208), the Natural Science Foundation of Gansu Provincial Department of Science and Technology (25JRRM014), the Humanities and Social Science Foundation of the Ministry of Education in China (23YJC630001, 23YJC630154), Student’s Innovation and Entrepreneurship Training Program Foundation of Huaiyin Institute of Technology (X202511049124), and Longdong University Science and Technology Innovation Fund Project (XYZK2024).

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that they have no commercial or associative interests that represent a conflict of interest in connection with the work submitted.

References

  1. Zhang, Y.C.; Pang, X.L.; Qu, S.P.; Li, X.; Gao, K.W. The relationship between fracture toughness of CO2 corrosion scale and corrosion rate of X65 pipeline steel under supercritical CO2 condition. Int. J. Greenh. Gas. Control 2011, 5, 1643–1650.
  2. National Association of Corrosion Engineers. NACE SP0502-2010; Pipeline External Corrosion Direct Assessment Methodology; NACE: Houston, TX, USA, 2010; pp. 50–54.
  3. Nestleroth, J.B.; Bubenik, T.A.; Haines, H.H. Pipeline simulation facilities for the development of in-line inspection technologies for gas transmission pipelines. Mater. Eval. 1995, 53, 484–487.
  4. Alanazi, N.M.; El-Sherik, A.M.; Rasheed, A.H. Corrosion of Pipeline Steel X-60 Under Field-Collected Sludge Deposit in a Simulated Sour Environment. Corrosion 2015, 71, 305–315.
  5. Choi, Y.S.; Nesic, S.; Young, D. Effect of impurities on the corrosion behavior of CO2 transmission pipeline steel in supercritical CO2-water environments. Environ. Sci. Technol. 2010, 44, 9233–9238.
  6. Qasim, J.; Hasan, B.O. Study on corrosion rate of carbon steel pipe under turbulent flow conditions. Can. J. Chem. Eng. 2010, 88, 1114–1120.
  7. Galvan-Martinez, R.; Orozco-Cruz, R.; Galicia, G. Corrosion study of pipeline carbon steel in sour brine under turbulent flow conditions at 60 °C. Afinidad 2013, 70, 124–129.
  8. Kovalenko, S.Y.; Rybakov, A.O.; Klymenko, A.V. Corrosion of the Internal Wall of a Field Gas Pipeline. Mater. Sci. 2012, 48, 225–230.
  9. Shirazi, A.Z.; Mohammadi, Z. A hybrid intelligent model combining ANN and imperialist competitive algorithm for prediction of corrosion rate in 3C steel under seawater environment. Neural Comput. Appl. 2017, 28, 3455–3464.
  10. Chamkalani, A.; Nareh’ei, M.A.; Chamkalani, R. Soft computing method for prediction of CO2 corrosion in flow lines based on neural network approach. Chem. Eng. Commun. 2013, 200, 731–747.
  11. Wang, C.; Xu, S.; Li, W.; Wang, Y.; Shen, G.; Wang, S. Multi-physics coupled simulation and experimental investigation of alternating stray current corrosion of buried gas pipeline adjacent to rail transit system. Mater. Des. 2024, 247, 113394.
  12. Wang, C.; Xu, S.; Wang, Y.; Song, A.; Ding, W.; Li, W.; Qin, G. Towards understanding hydrogen embrittlement under stray current interference through a quantitative method based on multifractal characteristics. J. Mater. Sci. Technol. 2025, 255, 270–286.
  13. Fang, X.; Tan, X.; Zhang, C.; Gao, X.; He, Z. Cross-system anomaly detection in deep-sea submersibles via coupled feature learning. Symmetry 2025, 17, 1838.
  14. Deng, J.L. Control problems of grey systems. Syst. Control Lett. 1982, 1, 288–294.
  15. Deng, J.L. The grey control system. J. Huazhong Univ. Sci. Tech. (Nat. Sci. Ed.) 1982, 10, 9–18.
  16. Deng, J.L. Introduction grey system theory. J. Grey Syst. 1989, 1, 191–243.
  17. Chao, S.Z.; Deng, J.L. The stability of the grey linear system. Int. J. Control 1986, 43, 313–320.
  18. Trivedi, H.V.; Singh, J.K. Application of Grey System Theory in the Development of a Runoff Prediction Model. Biosyst. Eng. 2005, 92, 521–526.
  19. Ould Ahmed Mahmoud, S.; Ould Beiba, E.M.; Ould Beinane, S.A.; Alotaibi, N. A Structural Study of Generalized [m, 𝓒]-Symmetric Extension Operators. Symmetry 2025, 17, 1836.
  20. Erdal, K.; Baris, U.; Okyay, K. Grey system theory-based models in time series prediction. Expert. Syst. Appl. 2010, 37, 1784–1789.
  21. Brémaud, P. Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues; Springer Science & Business Media: New York, NY, USA, 1999; pp. 53–156.
  22. Allan, W.G.; Michael, S. Testing the independence of forecast errors in the forward foreign exchange market using Markov chains: A cross-country comparison. Int. J. Forecast. 1987, 3, 97–113. [Google Scholar]
  23. Chen, J.N.; Zhang, Z.B.; He, S.N.; Hu, J.H.; Gerald, E.S. Sparse Code Multiple Access Decoding Based on a Monte Carlo Markov Chain Method. IEEE Signal Process. Lett. 2016, 23, 639–643. [Google Scholar] [CrossRef]
  24. Hosni, A.E.; Li, K. Minimizing the influence of rumors during breaking news events in online social networks. Knowl.-Based Syst. 2020, 193, 105452. [Google Scholar] [CrossRef]
  25. Wang, C.H.; Sheu, S.H. Determining the optimal production–maintenance policy with inspection errors: Using a Markov chain. Comput. Oper. Res. 2003, 30, 1–17. [Google Scholar] [CrossRef]
  26. Holger, K.; Mogens, S. Optimal Consumption and Insurance: A Continuous-Time Markov Chain Approach. Astin Bull. 2008, 38, 231–257. [Google Scholar][Green Version]
  27. Ramesh, A.; Twigg, D.W.; Sandadi, U.R.; Sharma, T.C. Reliability analysis of systems with operation-time management. IEEE Trans. Reliab. 2002, 51, 39–48. [Google Scholar] [CrossRef]
  28. Behçet, A.; Nazli, D.; Matthew, W.H. Convex Necessary and Sufficient Conditions for Density Safety Constraints in Markov Chain Synthesis. IEEE Trans. Autom. Control 2015, 60, 2813–2818. [Google Scholar]
  29. Kennedy, J.; Eberhart, R. Particle Swarm Optimization. In Proceedings of the IEEE International Conference on Neural Networks, Perth, Australia, 27 November–1 December 1995; 1995. [Google Scholar]
  30. Chuang, L.Y.; Tsai, S.W.; Yang, C.H. Chaotic catfish particle swarm optimization for solving global numerical optimization problems. Appl. Math. Comput. 2011, 217, 6900–6916. [Google Scholar] [CrossRef]
  31. Riccardo, P.; James, K.; Tim, B. Particle swarm optimization. Swarm Intell.-Us 2007, 1, 33–57. [Google Scholar]
  32. Chuang, L.Y.; Tsai, S.W.; Yang, C.H. Improved binary particle swarm optimization using catfish effect for feature selection. Expert. Syst. Appl. 2011, 38, 12699–12707. [Google Scholar] [CrossRef]
  33. Timashev, S.A.; Bushinskaya, A.V. Markov approach to early diagnostics reliability assessment residual life and optimal maintenance of pipeline systems. Struct. Saf. 2015, 56, 68–79. [Google Scholar] [CrossRef]
  34. Ossai, C.; Boswell, B.; Davies, I.J. Application of Markov modelling and Monte Carlo simulation technique in failure probability estimation-A consideration of corrosion defects of internally corroded pipelines. Eng. Fail. Anal. 2016, 68, 159–171. [Google Scholar] [CrossRef]
  35. SY/T6151-2009; Assessment of Corroded Steel Pipelines. Petroleum Industry Press: Beijing, China, 2009; pp. 2–3.
  36. Liang, X.L.; Li, W.F.; Zhang, Y. An adaptive particle swarm optimization method based on clustering. Soft Comput. 2015, 19, 431–448. [Google Scholar] [CrossRef]
  37. Clerc, M.; Kennde, Y.; Teiecom, F. The particle swarm-explosion, stability and convergence in a multidimensional complex space. IEEE Trans. Evol. Comput. 2002, 6, 58–73. [Google Scholar] [CrossRef]
Figure 1. Internal corrosion of pipeline (SCC: Stress Corrosion Cracking, HIC: Hydrogen-Induced Cracking, SSC: Sulfide Stress Cracking).
Figure 2. Local and global optimal solutions.
Figure 3. Prediction process of internal corrosion rate.
Figure 4. Location of Wenchang oilfields.
Figure 5. Comparison of GM (1,1) predicted values, unbiased GM (1,1) predicted values and actual values.
Figure 6. Comparison of prediction results across different models. (a) Comparison of GM (1,1) predicted values, unbiased GM (1,1) predicted values, unbiased gray Markov model predicted values, and actual values. (b) Enlarged partial view.
Figure 7. (a) Comparison of GM (1,1) predicted values, unbiased GM (1,1) predicted values, unbiased gray Markov model predicted values, PSO-optimized unbiased gray Markov model predicted values, and actual values. (b) Enlarged partial view.
Table 1. Basic information of SP76-EP76 pipeline.
| Parameter | Value | Parameter | Value |
|---|---|---|---|
| Total length | 5.34 km | Design operating pressure | 3.3 MPa |
| Pipeline structure | Double layer | Maximum operating pressure | 2.58 MPa |
| Transmission mode | Oil-gas multiphase transportation | Design operating temperature | 80 °C |
| Heat preservation or not | Heat preservation | Maximum operating temperature | 76 °C |
| Inner diameter | 200 mm | Maximum operating pressure | 2.58 MPa |
| Inner pipe wall thickness | 12.7 mm | Inner pipeline steel | X65 |
| Design life | 20 a | Outer pipeline steel | X56 |
Table 2. Pipeline wall thickness and corrosion depth measurements with uncertainty.
| Year | Minimum Residual Wall Thickness/mm | Uncertainty in Min Thickness/mm | Maximum Corrosion Depth/mm | Uncertainty in Max Corrosion Depth/mm |
|---|---|---|---|---|
| 1 | 6.83 | 0.34 | 1.37 | 0.07 |
| 2 | 6.27 | 0.31 | 1.93 | 0.10 |
| 3 | 5.85 | 0.29 | 2.35 | 0.12 |
| 4 | 5.15 | 0.26 | 3.06 | 0.15 |
| 5 | 4.67 | 0.23 | 3.54 | 0.18 |
| 6 | 3.98 | 0.20 | 4.22 | 0.21 |
| 7 | 3.24 | 0.16 | 4.96 | 0.25 |
| 8 | 2.51 | 0.13 | 5.69 | 0.28 |
| 9 | 1.77 | 0.09 | 6.43 | 0.32 |
| 10 | 0.93 | 0.05 | 7.27 | 0.36 |
Table 3. Smoothness value of initial data.
| Year | Smoothness Value | Year | Smoothness Value |
|---|---|---|---|
| 3 | 0.929 | 6 | 0.837 |
| 4 | 0.870 | 7 | 0.852 |
| 5 | 0.864 | 8 | 0.871 |
Table 4. Exponential checked values of x(1).
| Year | Checked Value | Year | Checked Value |
|---|---|---|---|
| 3 | 1.712 | 6 | 1.345 |
| 4 | 1.505 | 7 | 1.301 |
| 5 | 1.406 | 8 | 1.266 |
Table 5. GM (1,1) prediction of pipeline internal corrosion.
| Year | Maximum Corrosion Depth/mm | Year | Maximum Corrosion Depth/mm |
|---|---|---|---|
| 1 | 2.40 | 6 | 5.64 |
| 2 | 2.84 | 7 | 6.69 |
| 3 | 3.37 | 8 | 7.94 |
| 4 | 4.00 | 9 | 9.42 |
| 5 | 4.75 | 10 | 11.17 |
Table 6. Unbiased GM (1,1) prediction of pipeline internal corrosion.
| Year | Maximum Corrosion Depth/mm | Year | Maximum Corrosion Depth/mm |
|---|---|---|---|
| 1 | 1.37 | 6 | 4.31 |
| 2 | 2.17 | 7 | 5.12 |
| 3 | 2.58 | 8 | 6.08 |
| 4 | 3.06 | 9 | 7.21 |
| 5 | 3.63 | 10 | 8.56 |
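The values in Table 6 can be approximately reproduced from the maximum corrosion depths for years 1 to 8 in Table 2. The Python sketch below is illustrative only. It assumes the commonly used unbiased GM (1,1) formulation, in which a and u are estimated by least squares on the adjacent-mean series and then converted to b = ln((2 − a)/(2 + a)) and A = 2u/(2 + a). The function and variable names are hypothetical and are not taken from the paper.

```python
import math

# Maximum corrosion depth (mm) for years 1-8, taken from Table 2.
observed = [1.37, 1.93, 2.35, 3.06, 3.54, 4.22, 4.96, 5.69]


def fit_gm11(x0):
    """Least-squares estimate of the GM(1,1) parameters (a, u)."""
    # First-order accumulated series x1 and its adjacent-mean series z.
    x1 = []
    total = 0.0
    for value in x0:
        total += value
        x1.append(total)
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]
    y = x0[1:]

    # Ordinary least squares for the gray equation y = -a * z + u.
    n = len(z)
    z_mean = sum(z) / n
    y_mean = sum(y) / n
    s_zz = sum((zi - z_mean) ** 2 for zi in z)
    s_zy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
    slope = s_zy / s_zz
    return -slope, y_mean - slope * z_mean


def unbiased_gm11_forecast(x0, horizon):
    """Unbiased GM(1,1) values for years 1..horizon (assumed formulation)."""
    a, u = fit_gm11(x0)
    b = math.log((2 - a) / (2 + a))  # unbiased growth exponent
    amplitude = 2 * u / (2 + a)      # unbiased amplitude A
    # The first value is kept equal to the first observation.
    return [x0[0]] + [amplitude * math.exp(b * k) for k in range(1, horizon)]


a, u = fit_gm11(observed)
forecast = unbiased_gm11_forecast(observed, horizon=10)
print(f"a = {a:.2f}, u = {u:.2f}")
for year, depth in enumerate(forecast, start=1):
    print(year, round(depth, 2))
```

Under these assumptions the fit gives a ≈ −0.17 and u ≈ 1.67, in line with the baseline parameters listed in Table 12, and the forecasts agree with Table 6 to within roughly 0.01 mm.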
Table 7. Inner wall corrosion state value of pipeline.
| Corrosion State | Corrosion Ratio (%) |
|---|---|
| I (Ignorable) | <10 |
| II (Marginal) | 10–80 |
| III (Serious) | >80 |
Table 8. Classification of corrosion state.
| Year | Residual Error/mm | State Value | Year | Residual Error/mm | State Value |
|---|---|---|---|---|---|
| 1 | 0.000 | 1 | 5 | −0.097 | 2 |
| 2 | −0.238 | 2 | 6 | −0.088 | 2 |
| 3 | −0.226 | 2 | 7 | −0.161 | 2 |
| 4 | −0.004 | 1 | 8 | −0.385 | 3 |
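The residuals and state values in Table 8 are the inputs to the Markov correction. As a minimal sketch, assuming the residual is the observed depth (Table 2) minus the unbiased GM (1,1) value (Table 6) and that the state values for years 1 to 8 form a single chain, the one-step transition probabilities can be estimated by counting consecutive state pairs. The helper name `transition_matrix` is hypothetical, and the resulting matrix is an illustration rather than the matrix reported by the authors.

```python
# Observed maximum corrosion depth (mm), years 1-8, from Table 2.
observed = [1.37, 1.93, 2.35, 3.06, 3.54, 4.22, 4.96, 5.69]
# Unbiased GM(1,1) predictions (mm), years 1-8, from Table 6.
predicted = [1.37, 2.17, 2.58, 3.06, 3.63, 4.31, 5.12, 6.08]
# State values for years 1-8, from Table 8.
states = [1, 2, 2, 1, 2, 2, 2, 3]

# Residual = observed - predicted (assumed sign convention).
residuals = [round(o - p, 2) for o, p in zip(observed, predicted)]


def transition_matrix(sequence, n_states):
    """Estimate one-step transition probabilities by counting state pairs."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(sequence, sequence[1:]):
        counts[current - 1][following - 1] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state with no observed departures gets a row of zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


P = transition_matrix(states, n_states=3)
print("residuals:", residuals)
for row in P:
    print(row)
```

Because Table 6 is rounded to two decimals, the recomputed residuals match Table 8 only to about 0.01 mm. State 3 occurs only in the final year, so its row has no observed transitions and is left as zeros here.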
Table 9. Unbiased gray Markov chain prediction of pipeline internal corrosion.
| Year | Maximum Corrosion Depth/mm | Year | Maximum Corrosion Depth/mm |
|---|---|---|---|
| 1 | 1.37 | 6 | 4.25 |
| 2 | 2.02 | 7 | 5.07 |
| 3 | 2.42 | 8 | 6.03 |
| 4 | 2.97 | 9 | 7.17 |
| 5 | 3.56 | 10 | 8.52 |
Table 10. Unbiased gray Markov chain prediction based on PSO.
| Year | Maximum Corrosion Depth/mm | Year | Maximum Corrosion Depth/mm |
|---|---|---|---|
| 1 | 1.37 | 6 | 4.15 |
| 2 | 1.78 | 7 | 4.98 |
| 3 | 2.19 | 8 | 5.94 |
| 4 | 2.82 | 9 | 7.09 |
| 5 | 3.45 | 10 | 8.44 |
Table 11. Quantitative comparison of prediction accuracy between the standard unbiased gray Markov chain and the PSO-optimized model.
| Metric | Unbiased Gray Markov Chain | PSO-Optimized Markov Chain | Improvement |
|---|---|---|---|
| Prediction for Year 9 (mm) | 1.35 | 1.40 | Closer to actual value (0.744 mm) |
| Maximum Accuracy Relative Range (%) | 7.77 | 13.34 | - |
| Minimum Accuracy Relative Range (%) | 0.53 | 0.93 | - |
| Average Accuracy Relative Range (%) | Baseline | - | +4.51 |
Table 12. Sensitivity analysis of GM (1,1) model parameters.
| Parameter | Perturbation | Perturbed Value | Predicted Corrosion Depth (Year 9)/mm | Change (%) |
|---|---|---|---|---|
| Baseline (a, u) | - | (−0.17, 1.67) | 3.73 | - |
| Development Coefficient (a) | +10% | −0.15 | 3.53 | −5.2 |
| Development Coefficient (a) | −10% | −0.19 | 3.96 | +6.2 |
| Coordination Coefficient (u) | +10% | 1.84 | 3.91 | +4.8 |
| Coordination Coefficient (u) | −10% | 1.51 | 3.55 | −4.8 |
Share and Cite

MDPI and ACS Style

Gao, Y.; Bi, A.; Yan, T.; Yang, C.; Qi, J. Gray Prediction for Internal Corrosion Rate of Oil and Gas Pipelines Based on Markov Chain and Particle Swarm Optimization. Symmetry 2025, 17, 2144. https://doi.org/10.3390/sym17122144