Article

Prediction of Sulfur Dioxide Emissions in China Using Novel CSLDDBO-Optimized PGM(1, N) Model

by Lele Cui 1, Gang Hu 1,2,* and Abdelazim G. Hussien 3,4

1 Department of Applied Mathematics, Xi’an University of Technology, Xi’an 710054, China
2 School of Computer Science and Engineering, Xi’an University of Technology, Xi’an 710048, China
3 Department of Computer and Information Science, Linköping University, 581 83 Linköping, Sweden
4 Faculty of Science, Fayoum University, Faiyum 63514, Egypt
* Author to whom correspondence should be addressed.
Mathematics 2025, 13(17), 2846; https://doi.org/10.3390/math13172846
Submission received: 17 July 2025 / Revised: 15 August 2025 / Accepted: 29 August 2025 / Published: 3 September 2025
(This article belongs to the Special Issue Advances in Metaheuristic Optimization Algorithms)

Abstract

Sulfur dioxide not only affects the ecological environment and endangers health but also restricts economic development. Reasonable prediction of sulfur dioxide emissions is beneficial for formulating more comprehensive energy use strategies and guiding social policies. To this end, this article uses a multiparameter combination optimization gray prediction model (PGM(1, N)), which both distinguishes the orders of the sequences represented by different variables and optimizes the order of each variable. This article further proposes an improved version of the Dung Beetle Optimization (DBO) algorithm, namely CSLDDBO, to optimize two important parameters in the model: the smoothing generation coefficient and the order of the gray generation operators. To overcome the shortcomings of DBO, four improvement strategies are introduced. Firstly, a chain foraging strategy is introduced to guide the position update of the ball-rolling beetle. Secondly, a somersault foraging strategy is adopted to conduct adaptive searches across the search space. Then, a learning strategy is adopted to improve the global search capability. Finally, based on the idea of differential evolution, the convergence speed of the algorithm is improved and its ability to escape from local optima is enhanced. The superiority of CSLDDBO was verified on the CEC2022 test set. Finally, the optimized PGM(1, N) model was used to predict China’s sulfur dioxide emissions. The results show that the error of the PGM(1, N) model is the smallest at 0.1117%, and its prediction accuracy is significantly higher than that of the other prediction models.

1. Introduction

Chemical energy includes plant fuels, fossil fuels, and other gaseous fuels. China is rich in chemical energy, which not only promotes rapid national development but also shapes the country’s future development. However, the use of chemical energy is accompanied by the production of large amounts of harmful gases, among which the sulfur dioxide (SO2) generated by the combustion of sulfur-containing fuels is particularly serious. It not only endangers human health and damages the natural environment but also poses a threat to the economy and society. Therefore, predicting future SO2 emissions and incorporating governance and prevention measures into China’s strategic plan for sustainable development is crucial.
In terms of research on SO2 emissions, multiple scholars have conducted research on the prediction of SO2 emissions. Common methods include multiple linear regression prediction models, machine learning, and statistical models.
  • Multiple linear regression prediction models: Long et al. [1] conducted dimensionality reduction and collinearity analysis on SO2 emission data variables based on the analysis of the sulfur metabolism mechanism in the sintering process. They derived a statistical regression prediction model for SO2 emissions from sintering based on the principle of multiple linear regression. However, this model is sensitive to the linearity assumption of the data, while SO2 emissions from sintering flue gas are influenced by the nonlinear coupling of multiple factors, such as the sulfur contents in raw materials, the air volume, and temperature. Zheng et al. [2] established a regression model for SO2 emissions from coal combustion using multiple linear regression analysis and variance analysis and applied it to predict SO2 emissions from thermal power plants in Shandong Province. This method did not consider the impact of accident slurry ponds in the flue gas desulfurization system and thus failed to effectively prevent environmental pollution.
  • Machine learning: In 2018, Xue et al. [3] established a flue gas SO2 emission prediction model based on support vector machines (SVMs) for the nonlinear characteristics of flue gas SO2 objects in circulating fluidized bed boiler control systems. The model parameters were determined using a univariate parameter search combined with grid optimization, overcoming the various shortcomings of previous methods that directly used grid search to determine the parameters of SVM regression models, thereby achieving good prediction results. However, this method requires frequent parameter tuning, resulting in high maintenance costs. When facing circulating fluidized bed systems with strong real-time requirements and high data noise, the model accuracy is easily limited. In 2021, Vitor Miguel Ribeiro [4] used economic-related theories combined with machine learning models to predict SO2 emissions near a thermal power plant in Portugal. According to the final results, the performance of machine learning models is superior to that of traditional methods, but the prediction accuracy of this method will significantly decrease when facing local emission data policy adjustments and monitoring equipment anomalies.
  • Statistical models: In 2023, Ghosh and Verma [5] applied an aerosol field and estimated the constrained emissions of SO2 in India based on relevant data. They first constrained the scattering of absorbing aerosols, and they then used this constraint method to obtain the constrained SO2 emissions. They concluded that the annual emission rate of SO2 in the Indian restricted-emission database was lower than the emission rate reported by China. However, if this method underestimates hotspot pollution sources or overestimates emission reduction effects, it may exacerbate the environmental governance risks in the Indian region. Fu et al. [6] used the LEAP model and emission factor method to predict the SO2 emissions in some eastern regions of China and studied future emission trends. However, this method did not consider the transmission of pollutants between urban agglomerations and the drastic changes in energy structures caused by industrial upgrading in the eastern region, which made it difficult for traditional linear prediction models to capture the dynamic impact of emerging high-energy-consuming industries.
SO2 emissions mainly come from the combustion of sulfur-containing fuels. The gradual recovery of the global economy and fierce competition among countries will inevitably exacerbate energy consumption, which will be followed by an increase in SO2 emissions. If predictions are based only on previous years’ data, the results may be inaccurate. In addition, the collection of SO2 emission data is difficult, with a long cycle and complex influencing factors. Therefore, a gray prediction model, characterized by requiring little information, high accuracy, simple operation, easy verification, and tolerance of system uncertainty, was chosen to predict SO2 emissions.
Gray system theory [7] is an emerging discipline proposed and established by Professor Deng Julong to address issues such as limited data volumes [8] and information uncertainty [9]. Gray prediction, as a research direction of gray systems, is divided into different research objects based on the number of variables studied. The GM(1,1) model works well with exponentially growing data over medium to long time horizons, but its prediction results are poor for strongly volatile time series. To enhance the accuracy of the gray forecasting model NGM(1,1,k2) with quadratic polynomial terms, Li et al. [10] further refined the original model and obtained the BNGM(1,1,k2) model. However, this method requires the simultaneous adjustment of the coefficients of the differential equation and the polynomial weights, making it prone to local optimal solutions. Qian et al. [11] proposed a new discrete gray forecasting model to enhance the model’s adaptability and performance for both linear and nonlinear trends in time series. However, in renewable energy generation scenarios, this model relies excessively on new data, leading to neglect of historical patterns. Wang et al. [12] applied quantile regression techniques to construct the QGM(1,1) model to improve model stability. This model has the advantages of higher accuracy and better robustness. However, constrained by the inherent limitations of gray theory, the model’s differential equation is based on the assumption of exponential law, making it unable to effectively capture random perturbations in economic or environmental systems. Zeng et al. [13] established the SGGM(1,1,r) model, combining new initial conditions with original data to improve the prediction accuracy. However, this model reconstructs initial conditions based on the “new information priority” principle and, given the short history of shale gas development in China, limited samples lead to insufficient reliability in long-term trend prediction. Table 1 presents some of the literature using univariate gray models to predict SO2 emissions.
The GM(1, N) model addresses simulation and prediction issues where multiple related factor variables affect a single system behavior variable. Duan et al. [14] established a multivariable energy consumption gray model and applied it to practical problems. However, differences in China’s regional energy structures can lead to dynamic coupling effects, making it difficult for the model to adapt and adjust. In 2022, Duan [15] proposed the Verhulst gray model MVGM(1, N), which not only specifically addressed the problem but also enhanced the prediction accuracy. However, the training data for this model was concentrated in resource-based provinces. When directly applied to manufacturing powerhouses such as Jiangsu, due to differences in energy structures, the mean absolute error will increase. In 2023, Ye et al. [16] established a WAFGM(1, N) model, which not only reduces errors but also utilizes uncertain data for prediction, achieving high prediction accuracy. Although this model dynamically adjusts weights through interval sequences, it does not consider the sudden impact of extreme events on variable relationships, resulting in the weight allocation lagging behind actual changes.
The above review indicates that current models exhibit certain limitations. The shortcoming of traditional models is that modeling SO2 emissions requires expert experience in selecting and extracting features, which is time-consuming and highly subjective; such models struggle to adapt to regional differences in China, and there is a high risk of overfitting on small samples. The disadvantage of machine learning models is that the decision-making process is a black box, making it difficult to intuitively analyze the contributions of variables and reducing the credibility of policy formulation. Furthermore, such models do not internalize policy shocks and ignore dynamic coupling, leading to ineffective prediction of turning points. Therefore, a new method is needed to fill this gap for the better prediction of SO2. This paper chooses the gray prediction model to solve this problem and proposes some feasible solutions. This article uses the multivariate gray prediction model PGM(1, N) suggested by Yin et al. [17] in 2023. We chose PGM(1, N) because this model has several advantages. First, it can construct a prediction framework using a small amount of historical data, making it suitable for scenarios with incomplete information or scarce data and avoiding the dependence of traditional statistical methods on large data volumes. Secondly, based on differential equations and accumulation generation techniques, it simplifies complex multivariate relationships into computable sequences, reducing the computational complexity of modeling and facilitating practical deployment. Then, PGM(1, N) integrates multiple influencing factors to reduce system uncertainty; its prediction results are superior to those of the GM(1,1) model, especially in short-term trend analysis. Finally, this model weakens noise interference by accumulating the original data, directly revealing the inherent relationship between multiple variables. Previous research on predicting gas emissions has mainly focused on single/multivariate first-order cumulative models and related improved models. Traditional multivariate models use an identical order for all variable data sequences, using the first-order cumulative generation sequence of each independent variable as a driving term for modeling. However, this approach does not consider the effects of the volatility of the sequences and the influence of extreme values on the dependent variable, which can seriously impact the quality of the model. Therefore, this model includes the separate definition and optimization of the order of each variable, solving the problem that different variables are forced to share the same order. On this basis, the modeling ability of the model is enhanced by simultaneously optimizing the orders of the variables, the smoothing generation coefficients of the driving terms, and the background coefficients.
In the PGM(1, N) model, the order of the gray generation operators and the smoothing generation coefficient are two important parameters. The accuracy of a prediction model depends on the efficient and precise optimization of its parameters, and intelligent optimization algorithms can be used for this task. Many such algorithms perform well. The Ant Colony Optimization (ACO) algorithm [18] is inspired by the behavior of real-life ant colonies: ants transmit information by recognizing pheromones, and, based on this, the algorithm constructs solutions to optimization problems. The Particle Swarm Optimization (PSO) algorithm [19] is inspired by the study of bird foraging behavior, which enables the population to seek the optimum through information sharing between individuals. The Gray Wolf Optimization (GWO) [20] algorithm is influenced by the hunting methods of wolf packs, simulating the hierarchy and division of labor in the wolf population. The Harris Hawks Optimization (HHO) [21] algorithm imitates the predatory characteristics and cooperative behavior of Harris hawks, searching through the hunting process. Also, Sled Dog Optimization (SDO) [22] is inspired by sled dog behavior, finding the best food through division of labor and information exchange between groups. The Marine Predator Algorithm (MPA) [23] is constructed by simulating the foraging interactions between predators and prey in the ocean. The Chimp Optimization Algorithm (ChOA) [24] simulates social behavior among chimpanzee populations. The Slime Mold Algorithm (SMA) [25] simulates the behavior of slime molds during foraging. Elephant Herding Optimization (EHO) [26] mainly simulates the phenomenon of member updates and changes in elephant groups in nature. The improved Black Widow algorithm (namely, SDABWO) [27] is used to solve feature selection problems. The improved version of the Honey Badger algorithm (namely, SaCHBA-PDN) [28] has better performance and is easier to implement. The improved version of the Starling Murmuration Optimizer (namely, DTCSMO) [29] has shown significant advantages in engineering application problems. The enhanced version of jellyfish search optimization (namely, EJS) [30] has shown significant advantages in solving complex optimization problems. An improvement based on the Kepler optimization algorithm (namely, CGKOA) [31] has better optimization performance. The improved particle swarm optimization (namely, dFDBMPSO) [32] has been applied to practical problems.
However, any algorithm has its limitations in application. Therefore, we encourage the design and development of more high-performance optimization algorithms. Among them, the DBO algorithm [33] is a recent algorithm proposed by Xue and Shen. Its proposal was inspired by a series of survival behaviors of dung beetles, allowing for both global search and local exploitation. Dung beetles [34] feed on animal excrement and are known for their unique behavior of pushing dung balls [35], playing the role of decomposers in nature [36].
The DBO algorithm combines global search and local exploitation and performs excellently in terms of convergence speed and solution accuracy, achieving good results on various benchmark functions. This article proposes an improved version of the dung beetle algorithm (CSLDDBO) with higher solving accuracy and better performance and proves its effectiveness on benchmark functions. This study utilizes the CSLDDBO-PGM(1, N) combinatorial optimization model for predicting SO2 emissions and demonstrates its rationality.
The main contributions of this article are as follows:
  • The gray prediction model selected in this article defines and optimizes the order of each variable separately, overcoming the drawback that all variables share the same order in traditional gray models, and combines the idea of parameter combination optimization to enhance the modeling capability of gray models.
  • An improved Dung Beetle Optimization (CSLDDBO) algorithm is proposed, which introduces three strategies and the idea of differential evolution to enhance the performance of the original algorithm. The effectiveness of this algorithm was verified by testing it on the CEC2022 test set.
  • A PGM(1, N) model with parameters optimized by CSLDDBO was used to predict SO2 emissions in China.
The rest of this article is organized as follows: In Section 2, the basic concepts related to the multivariate gray model PGM(1, N) are presented. In Section 3, the DBO algorithm is introduced. In Section 4, the details of CSLDDBO are introduced. Section 5 analyzes the performance of CSLDDBO based on experiments. In Section 6, the CSLDDBO-PGM(1, N) model is used to predict the emissions of SO2 in China. Section 7 provides a summary of the entire text.

2. PGM(1, N)

2.1. Related Concepts

2.1.1. The Order of Gray Generative Operators

To highlight the key points of this article, the basic concepts of GM(1, N) are given in Appendix A. Below, we present the basic concepts of PGM(1, N).
We refer to $Y_1^{(0)} = (y_1^{(0)}(1), y_1^{(0)}(2), \ldots, y_1^{(0)}(N))$ as the dependent-variable sequence and $Y_m^{(0)} = (y_m^{(0)}(1), y_m^{(0)}(2), \ldots, y_m^{(0)}(N))$, $m = 2, 3, \ldots, n$, as the independent-variable sequences. $Y_i^{(1)} = (y_i^{(1)}(1), y_i^{(1)}(2), \ldots, y_i^{(1)}(N))$, $i = 2, 3, \ldots, n$, is the first-order cumulative generation sequence [37] of $Y_i^{(0)}$, with the specific formula being
$$y_i^{(1)}(g) = \sum_{k=1}^{g} y_i^{(0)}(k), \quad g = 1, 2, \ldots, N,$$
where $k$ indexes the entries of the sequence.
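For illustration, a minimal Python sketch of this accumulation (our own helper, not part of the original model implementation) follows:

```python
import numpy as np

# Minimal sketch of the first-order accumulated generating operator (1-AGO):
# y^(1)(g) is the sum of the first g entries of the raw series y^(0).
def ago_1(y0):
    return np.cumsum(np.asarray(y0, dtype=float))

# Example: ago_1([1.0, 0.965, 0.932]) -> [1.0, 1.965, 2.897]
```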
Let $Y_i^{(0)} = (y_i^{(0)}(1), y_i^{(0)}(2), \ldots, y_i^{(0)}(N))$, $i = 1, 2, \ldots, n$, be the initial sequence, $t_i \in I$, $i = 1, 2, \ldots, n$, be the order of the gray generation operators, and $Y_i^{(t_i)} = (y_i^{(t_i)}(1), y_i^{(t_i)}(2), \ldots, y_i^{(t_i)}(N))$ be the new sequence, where
$$y_i^{(t_i)}(g) = \sum_{m=1}^{g} \frac{\Gamma(t_i + g - m)}{\Gamma(g - m + 1)\,\Gamma(t_i)}\, y_i^{(0)}(m), \quad g = 1, 2, \ldots, N.$$
The above formula is called the $t_i$-order gray generation operator of $Y_i^{(0)}$, and the resulting sequence is abbreviated as the $t_i$-RGO sequence $Y_i^{(t_i)}$, where $i = 1, 2, \ldots, n$.
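As a hedged sketch (our own code, written directly from Formula (2)), the $t_i$-order operator can be evaluated with the gamma function; $t = 1$ reproduces plain accumulation, and a negative non-integer order inverts it, consistent with the commutative law below:

```python
import math
import numpy as np

def rgo(y0, t):
    """t-order gray generation operator of Formula (2).

    Weight for entry m at position g:
    Gamma(t + g - m) / (Gamma(g - m + 1) * Gamma(t)).
    """
    y0 = np.asarray(y0, dtype=float)
    n = len(y0)
    out = np.zeros(n)
    for g in range(1, n + 1):
        for m in range(1, g + 1):
            w = math.gamma(t + g - m) / (math.gamma(g - m + 1) * math.gamma(t))
            out[g - 1] += w * y0[m - 1]
    return out

# rgo(y0, 1.0) equals np.cumsum(y0); rgo(rgo(y0, 0.5), -0.5) recovers y0
# (for non-integer orders where the gamma function is defined).
```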
As shown in Formula (2), let $Y_i^{(0)}$ be as above and $b \in I$, $f \in I$. $Y_i^{(b)}$ is the $b$-RGO sequence of $Y_i^{(0)}$; $Y_i^{(f)}$ is the $f$-RGO sequence of $Y_i^{(0)}$; $Y_i^{(b+f)}$ is the $(b+f)$-RGO sequence of $Y_i^{(0)}$; $(Y_i^{(f)})^{(b)}$ is the $b$-RGO sequence of $Y_i^{(f)}$; $(Y_i^{(b)})^{(f)}$ is the $f$-RGO sequence of $Y_i^{(b)}$. Moreover, repeated applications of the operator satisfy the commutative and exponential laws [38], i.e.,
$$(Y_i^{(b)})^{(f)} = (Y_i^{(f)})^{(b)} = Y_i^{(b+f)};$$
specifically,
$$Y_i^{(0)} = (Y_i^{(t)})^{(-t)} = (Y_i^{(-t)})^{(t)}.$$

2.1.2. Smooth Generation Operator

Next, the 1-AGO sequence of each independent variable is used as the driving sequence [39], and variable-weight smoothing operations are performed.
As shown in Formula (2), let $F_m^{(t_m)} = (k_m^{(t_m)}(2), k_m^{(t_m)}(3), \ldots, k_m^{(t_m)}(N))$ be a $t_m$-order smooth generating sequence with variable weight $\lambda_m$, $\lambda_m \in (0, 1)$, where
$$k_m^{(t_m)}(g) = \lambda_m\, y_m^{(t_m)}(g) + (1 - \lambda_m)\, y_m^{(t_m)}(g-1), \quad g = 2, 3, \ldots, N,$$
where $\lambda_m$ is the smoothing generation coefficient. Specifically, if $\lambda_m = 1$, $F_m^{(t_m)}$ reduces to the $t_m$-RGO sequence itself.
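A one-line illustrative sketch of this smoothing (again our own code) is as follows:

```python
import numpy as np

# Formula (5): k(g) = lam * y(g) + (1 - lam) * y(g - 1), for g = 2..N,
# applied to a t_m-order RGO sequence y_t; lam is the smoothing coefficient.
def smooth_sequence(y_t, lam):
    y_t = np.asarray(y_t, dtype=float)
    return lam * y_t[1:] + (1.0 - lam) * y_t[:-1]
```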

2.2. Model Definition and Parameter Estimation

According to Formula (5), $Y_1^{(t_1)}$ and $Y_1^{(t_1-1)}$ are referred to as the $t_1$-RGO and $(t_1-1)$-RGO sequences of $Y_1^{(0)}$, and $a_1^{(t_1)} = (A_1^{(t_1)}(2), A_1^{(t_1)}(3), \ldots, A_1^{(t_1)}(N))$ is the neighboring-mean sequence of $Y_1^{(t_1)}$ generated with the background value coefficient $\lambda_1$; then,
$$y_1^{(t_1-1)}(g) + E A_1^{(t_1)}(g) = \sum_{m=2}^{n} q_m k_m^{(t_m)}(g) + s_1 (g-1) + s_2$$
is known as the new multivariate gray model, abbreviated as PGM(1, N). Here, $E$ represents the development coefficient; $k_m^{(t_m)}(g) = \lambda_m y_m^{(t_m)}(g) + (1-\lambda_m) y_m^{(t_m)}(g-1)$ is the driving term based on smooth generation; $s_1(g-1)$ is a linear correction term; and $s_2$ is a random perturbation term.
Assuming that $Y_1^{(t_1)}$ and $Y_1^{(t_1-1)}$ are as described in Formula (6), $Y_m^{(t_m)}$ is as described in Formula (2), and $F_m^{(t_m)}$ is as described in Formula (5), the parameter estimate $\hat{u} = [q_2, q_3, \ldots, q_n, E, s_1, s_2]^T$ of the PGM(1, N) model satisfies the following, with
$$D = \begin{bmatrix} k_2^{(t_2)}(2) & k_3^{(t_3)}(2) & \cdots & k_n^{(t_n)}(2) & -A_1^{(t_1)}(2) & 1 & 1 \\ k_2^{(t_2)}(3) & k_3^{(t_3)}(3) & \cdots & k_n^{(t_n)}(3) & -A_1^{(t_1)}(3) & 2 & 1 \\ \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots \\ k_2^{(t_2)}(N) & k_3^{(t_3)}(N) & \cdots & k_n^{(t_n)}(N) & -A_1^{(t_1)}(N) & N-1 & 1 \end{bmatrix},$$
$$K = \begin{bmatrix} y_1^{(t_1-1)}(2) \\ y_1^{(t_1-1)}(3) \\ \vdots \\ y_1^{(t_1-1)}(N) \end{bmatrix} = \begin{bmatrix} \sum_{m=1}^{2} \frac{\Gamma(t_1+1-m)}{\Gamma(2-m+1)\,\Gamma(t_1-1)}\, y_1^{(0)}(m) \\ \sum_{m=1}^{3} \frac{\Gamma(t_1+2-m)}{\Gamma(3-m+1)\,\Gamma(t_1-1)}\, y_1^{(0)}(m) \\ \vdots \\ \sum_{m=1}^{N} \frac{\Gamma(t_1+N-1-m)}{\Gamma(N-m+1)\,\Gamma(t_1-1)}\, y_1^{(0)}(m) \end{bmatrix} = \begin{bmatrix} (t_1-1)\,y_1^{(0)}(1) + y_1^{(0)}(2) \\ \frac{t_1(t_1-1)}{2}\,y_1^{(0)}(1) + (t_1-1)\,y_1^{(0)}(2) + y_1^{(0)}(3) \\ \vdots \\ \sum_{m=1}^{N} \frac{\Gamma(t_1+N-1-m)}{\Gamma(N-m+1)\,\Gamma(t_1-1)}\, y_1^{(0)}(m) \end{bmatrix}.$$
  • If $N = n + 3$ and $|D| \neq 0$, then $\hat{u} = D^{-1} K$;
  • If $N > n + 3$ and $|D^T D| \neq 0$, then $\hat{u} = (D^T D)^{-1} D^T K$;
  • If $N < n + 3$ and $|D D^T| \neq 0$, then $\hat{u} = D^T (D D^T)^{-1} K$.
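For the common over-determined case ($N > n + 3$), the estimate is ordinary least squares; a minimal sketch with NumPy (a numerically stabler equivalent of $(D^TD)^{-1}D^TK$, illustrative only):

```python
import numpy as np

def estimate_parameters(D, K):
    """Least-squares estimate u = argmin ||D u - K||_2,
    equivalent to (D^T D)^(-1) D^T K when D^T D is invertible."""
    u_hat, *_ = np.linalg.lstsq(np.asarray(D, float),
                                np.asarray(K, float), rcond=None)
    return u_hat
```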

2.3. Solution of the Model

Formula (6) shows that, for $g = 2, 3, \ldots, N$, the time response expression is as follows:
$$\hat{y}_1^{(t_1)}(g) = \sum_{d=1}^{g-1} \varepsilon_1 \sum_{m=2}^{n} \varepsilon_2^{\,d-1}\, q_m k_m^{(t_m)}(g-d+1) + \varepsilon_2^{\,g-1}\,\hat{y}_1^{(t_1)}(1) + \sum_{i=0}^{g-2} \varepsilon_2^{\,i}\big[(g-i)\,\varepsilon_3 + \varepsilon_4\big],$$
where
$$\varepsilon_1 = \frac{1}{1 + E\lambda_1}, \quad \varepsilon_2 = \frac{1 - E(1-\lambda_1)}{1 + E\lambda_1}, \quad \varepsilon_3 = \frac{s_1}{1 + E\lambda_1}, \quad \varepsilon_4 = \frac{s_2 - s_1}{1 + E\lambda_1}.$$
The final recovery expression is as follows:
$$\hat{y}_1^{(0)}(g) = \sum_{m=1}^{g} \frac{\Gamma(-t_1 + g - m)}{\Gamma(g - m + 1)\,\Gamma(-t_1)}\,\hat{y}_1^{(t_1)}(m).$$

3. Overview of Dung Beetle Optimization Algorithms

The dung beetle rolls animal feces into balls and mainly feeds on animal feces, earning the nickname of “natural cleaner”. For dung beetles, dung balls are important breeding grounds. The DBO population is composed of four types of dung beetles with different functions.

3.1. Ball-Rolling Dung Beetles

The position update expression for the ball-rolling dung beetle is as follows:
$$h_u(q+1) = h_u(q) + \sigma \times j \times h_u(q-1) + l \times \Delta h, \qquad \Delta h = \left| h_u(q) - H^{g} \right|,$$
where $q$ denotes the iteration number; $h_u(q)$ denotes the location of the $u$th dung beetle at the $q$th iteration; $j \in (0, 0.2]$ is the deflection coefficient; $l$ is a random number in $(0, 1)$; $\sigma$ is the natural coefficient, assigned the value $-1$ or $1$; $H^{g}$ denotes the worst position found so far; and $\Delta h$ simulates the change in light intensity.
The updated location under DBO’s dancing behavior is as follows:
$$h_u(q+1) = h_u(q) + \tan(\theta)\,\left| h_u(q) - h_u(q-1) \right|,$$
where $\theta \in [0, \pi]$; if $\theta$ equals $0$, $\pi/2$, or $\pi$, the position is not updated.
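An illustrative Python sketch of these two updates (variable names are ours; positions are NumPy vectors) might look as follows:

```python
import numpy as np

def roll(h_cur, h_prev, h_worst, j=0.1, sigma=1.0):
    """Ball-rolling update: sigma is +/-1, j in (0, 0.2], l is random in (0, 1),
    and |h_cur - h_worst| simulates the change in light intensity."""
    l = np.random.rand()
    return h_cur + sigma * j * h_prev + l * np.abs(h_cur - h_worst)

def dance(h_cur, h_prev):
    """Dancing update: the position is unchanged at theta = 0, pi/2, pi."""
    theta = np.random.uniform(0.0, np.pi)
    if np.isclose(theta % (np.pi / 2), 0.0):
        return h_cur
    return h_cur + np.tan(theta) * np.abs(h_cur - h_prev)
```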

3.2. Producing Dung Beetles

The selection of the breeding ball’s position is particularly crucial, and the boundary of the spawning area is defined as follows:
$$Lo^{*} = \max\!\big(H^{*} \times (1-P),\ Lo\big), \qquad Up^{*} = \min\!\big(H^{*} \times (1+P),\ Up\big),$$
where $H^{*}$ is the current local best position; $Lo^{*}$ and $Up^{*}$ denote the lower and upper boundaries of the spawning area, with $P = 1 - q/U_{\max}$, where $U_{\max}$ is the maximum number of iterations; and $Lo$ and $Up$ represent the lower and upper bounds of the search space, respectively.
As $P$ changes, the spawning area also changes dynamically, and the position update of the breeding balls is represented as follows:
$$M_u(q+1) = H^{*} + l_1 \times \big(M_u(q) - Lo^{*}\big) + l_2 \times \big(M_u(q) - Up^{*}\big),$$
where $M_u(q)$ is the position of the $u$th breeding ball in the $q$th generation, and $l_1$ and $l_2$ represent two independent random vectors of size $1 \times Z$, where $Z$ is the problem dimensionality.

3.3. Larvae

The boundaries of the optimal foraging area for larvae are defined as follows:
$$Lo^{l} = \max\!\big(H^{l} \times (1-P),\ Lo\big), \qquad Up^{l} = \min\!\big(H^{l} \times (1+P),\ Up\big),$$
where $H^{l}$ represents the global optimal position, and $Lo^{l}$ and $Up^{l}$ represent the lower and upper bounds of the optimal foraging area. The location of the larvae is updated as follows:
$$h_u(q+1) = h_u(q) + V_1 \times \big(h_u(q) - Lo^{l}\big) + V_2 \times \big(h_u(q) - Up^{l}\big),$$
where $h_u(q)$ denotes the location of the $u$th larva in the $q$th generation, and $V_1$ and $V_2$ are random numbers.

3.4. Thief Dung Beetle

The position update of the “thief” dung beetle is represented as follows:
$$h_u(q+1) = H^{l} + G \times z \times \big(\left| h_u(q) - H^{*} \right| + \left| h_u(q) - H^{l} \right|\big),$$
where $h_u(q)$ denotes the location of the $u$th thief in the $q$th generation, $z$ is a random vector, and $G$ represents a constant.

4. CSLDDBO Algorithm

In this section, we propose an improved version of the DBO algorithm, named CSLDDBO. We combined the chain foraging strategy, somersault foraging strategy, learning strategy, and differential evolution to enhance the original algorithm.

4.1. Chain Foraging Strategy

Introducing the chain foraging strategy [40] into CSLDDBO can enhance the global exploration ability of the original algorithm. The expression of the chain foraging strategy is as follows:
$$h_o^b(r+1) = \begin{cases} h_o^b(r) + q\,\big(h_{best}^b(r) - h_o^b(r)\big) + \gamma\,\big(h_{best}^b(r) - h_o^b(r)\big), & o = 1, \\ h_o^b(r) + q\,\big(h_{o-1}^b(r) - h_o^b(r)\big) + \gamma\,\big(h_{best}^b(r) - h_o^b(r)\big), & o = 2, 3, \ldots, s, \end{cases}$$
$$\gamma = 2q\,\sqrt{\left|\log(q)\right|},$$
where $h_o^b(r)$ is the location of the $o$th individual in the $b$th dimension at iteration $r$, $q$ is a random vector in $[0, 1]$, $\gamma$ is the weight coefficient, and $h_{best}^b(r)$ is the best position found so far.
Dung beetles orient and fly towards food based on scent, assuming that the food source is the optimal location. Figure 1 depicts a schematic diagram of dung beetles foraging in a two-dimensional space according to this strategy.
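As an illustrative sketch (our own vectorized reading of the expression above, assuming a population matrix of shape (s, dims)):

```python
import numpy as np

def chain_foraging(pop, best):
    """Chain-foraging update: individual 1 follows the best position;
    individual o > 1 also follows its predecessor o - 1."""
    s, dims = pop.shape
    new = np.empty_like(pop)
    for o in range(s):
        q = np.random.rand(dims)
        gamma = 2.0 * q * np.sqrt(np.abs(np.log(q + 1e-12)))  # weight coefficient
        leader = best if o == 0 else pop[o - 1]
        new[o] = pop[o] + q * (leader - pop[o]) + gamma * (best - pop[o])
    return new
```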

4.2. Somersault Foraging Strategy

This strategy regards the position of food as the optimal position, and all individuals update their positions around this optimal position. The specific expression is as follows:
$$h_o^b(r+1) = h_o^b(r) + V\,\big(q_2\,h_{best}^b - q_3\,h_o^b(r)\big), \quad o = 1, 2, \ldots, s,$$
where $V$ is the somersault factor, $V = 2$, and $q_2$ and $q_3$ are two random numbers in $[0, 1]$. As can be seen from the formula, this strategy enables individuals to conduct an adaptive search within a constantly changing search range.
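A one-function sketch of this update (ours, for illustration only):

```python
import numpy as np

def somersault(pop, best, V=2.0):
    """Each individual flips around the best position within an
    adaptive, shrinking range determined by q2 and q3."""
    q2 = np.random.rand(*pop.shape)
    q3 = np.random.rand(*pop.shape)
    return pop + V * (q2 * best - q3 * pop)
```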

4.3. Learning Strategy

The expression for the comprehensive learning strategy is as follows [41]:
$$K_o^b \leftarrow \eta\,K_o^b + p \cdot rand_o^b \cdot \big(pbest_{fi(b)}^b - h_o^b\big),$$
where $fi_u = [fi_u(1), fi_u(2), \ldots, fi_u(Z)]$ defines which individuals’ $pbest$s the individual $u$ learns from; $pbest_{fi(b)}^b$ can be the corresponding dimension of any individual’s $pbest$; and $P_c$ is called the learning probability.
In this strategy, first, two individuals are randomly selected. Secondly, the fitness values of the two individuals are compared and the weaker one is eliminated. Finally, the $pbest$ of the better individual is used as the sample from which this dimension is learned. If all samples of an individual are its own, one dimension is randomly selected to learn from the corresponding $pbest$ of another individual.
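The exemplar-construction step can be sketched as follows (our reading of the procedure above; `pbest` is an (s, Z) matrix of personal bests, `pbest_fit` the corresponding fitness values, and minimization is assumed):

```python
import numpy as np

def build_exemplar(o, pbest, pbest_fit, Pc=0.3):
    """Per dimension: with probability Pc, learn from the pbest of the fitter
    of two randomly drawn individuals; otherwise keep one's own pbest.
    If no dimension learned from others, force one random dimension to."""
    s, Z = pbest.shape
    exemplar = pbest[o].copy()
    learned = False
    for b in range(Z):
        if np.random.rand() < Pc:
            i, j = np.random.choice(s, size=2, replace=False)
            winner = i if pbest_fit[i] < pbest_fit[j] else j
            exemplar[b] = pbest[winner, b]
            learned = True
    if not learned:
        b = np.random.randint(Z)
        other = np.random.choice([k for k in range(s) if k != o])
        exemplar[b] = pbest[other, b]
    return exemplar
```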

4.4. Differential Evolution

To prevent the population from converging prematurely to the previously found optimal position as iterations accumulate, and to improve the convergence speed of CSLDDBO while avoiding local optima, we drew inspiration from differential evolution and applied it to CSLDDBO [42]:
1. Initial population
$$h_{ou}(0) = rand_{ou}(0, 1) \cdot \big(h_u^U - h_u^L\big) + h_u^L, \quad o = 1, 2, \ldots, NS, \ u = 1, 2, \ldots, L,$$
where $L$ is the number of variables, $NS$ is the population size, and $h_u^U$ and $h_u^L$ are the upper and lower bounds of the $u$th variable, respectively.
2. Mutation
Three mutually distinct individuals $h_{e_1}, h_{e_2}, h_{e_3}$ ($o \neq e_1 \neq e_2 \neq e_3$) are randomly selected from the population; then,
$$h_{ou}(r+1) = h_{e_1 u}(r) + \rho\,\big(h_{e_2 u}(r) - h_{e_3 u}(r)\big),$$
where $h_{e_2 u}(r) - h_{e_3 u}(r)$ is the differential vector, and $\rho$ is the scaling factor.
3. Crossover operation
$$d_{ou}(r+1) = \begin{cases} h_{ou}(r+1), & rand_{ou} \leq CR \ \text{or} \ u = u_{rand}, \\ h_{ou}(r), & rand_{ou} > CR \ \text{and} \ u \neq u_{rand}, \end{cases}$$
where $CR \in [0, 1]$ is the crossover probability, and $u_{rand}$ is a random integer between 1 and $L$ that guarantees at least one component of the trial vector comes from the mutant.
4. Selection operation
$$h_o(r+1) = \begin{cases} h_o(r), & fi\big(d_{o1}(r+1), \ldots, d_{oL}(r+1)\big) > fi\big(h_{o1}(r), \ldots, h_{oL}(r)\big), \\ d_o(r+1), & fi\big(d_{o1}(r+1), \ldots, d_{oL}(r+1)\big) \leq fi\big(h_{o1}(r), \ldots, h_{oL}(r)\big). \end{cases}$$
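The three operators can be combined into one generation step, sketched below for a minimization objective (illustrative code, not the authors’ implementation):

```python
import numpy as np

def de_step(pop, fitness, fobj, rho=0.5, CR=0.2):
    """One DE/rand/1/bin generation: mutation, binomial crossover
    with a guaranteed component u_rand, and greedy selection."""
    NS, L = pop.shape
    for o in range(NS):
        e1, e2, e3 = np.random.choice(
            [k for k in range(NS) if k != o], size=3, replace=False)
        mutant = pop[e1] + rho * (pop[e2] - pop[e3])
        u_rand = np.random.randint(L)
        mask = np.random.rand(L) <= CR
        mask[u_rand] = True                      # keep at least one mutant gene
        trial = np.where(mask, mutant, pop[o])
        f_trial = fobj(trial)
        if f_trial <= fitness[o]:                # greedy selection
            pop[o], fitness[o] = trial, f_trial
    return pop, fitness
```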
Figure 2 presents the flowchart of CSLDDBO, and the pseudocode for CSLDDBO is provided in Algorithm 1.
Algorithm 1: Pseudocode for CSLDDBO algorithm
(The pseudocode is reproduced as an image in the original publication.)

4.5. Time Complexity

The time complexity of the CSLDDBO algorithm is determined by the chain search, somersault search, and the mutation and crossover operations performed over the population size $N$, the dimension $D$, and the iterations $T$. Therefore, it can be denoted as follows:
$$O(\mathrm{CSLDDBO}) = O\Big(T\big(O(\text{chain strategy}) + O(\text{somersault strategy}) + O(\text{mutation} + \text{crossover})\big)\Big) = O\big(T(ND + ND + ND)\big) = O(TND),$$
where T is the maximum number of iterations, N is the number of the population, and D is the number of variables.

5. Numerical Experiment and Discussion

In this section, we evaluate CSLDDBO from multiple perspectives through a series of experiments. Firstly, we conducted a convergence analysis of CSLDDBO. Secondly, we conducted a sensitivity analysis of the introduced strategy parameters on the test functions. Finally, to better evaluate the performance of CSLDDBO, we selected several algorithms with excellent performance to compare against CSLDDBO.

5.1. Convergence Behavior Analysis

Next, we will analyze the convergence of CSLDDBO using four representative qualitative indicators: the search history, the average fitness, the trajectory of the first individual in the first dimension, and the convergence curve of CSLDDBO compared to the original DBO. These indicators help us observe the optimization behavior of CSLDDBO on CEC2022 and gain a better understanding of its performance. Figure 3 shows the convergence behavior of CSLDDBO on CEC2022. The first column in the figure shows the shape of the test function in two-dimensional space. The second column is a scatter plot of CSLDDBO’s historical location search records on the two-dimensional front of the search agent. The third column shows the average fitness of the search agents during the iteration process, reflecting how the average fitness of CSLDDBO changes over the iterations. The fourth column shows the trajectory curve of the first search agent in the first dimension. The fifth column shows the optimal convergence curves currently found by the search agent and the original DBO.
1. Search history
When solving CEC2022 with CSLDDBO, observing the particle distributions of F7, F8, and F9, it can be seen that there were significant differences in the distribution densities of particles when facing different stages and functions throughout the entire optimization process. This indicates that the balance between exploration and exploitation varies when dealing with different problems, and it also demonstrates CSLDDBO’s strong convergence and excellent exploitation capabilities.
2. Average fitness
Observation shows that the initial values of the iteration process are different, indicating that the population diversity was very rich in the early stages of iteration. The average fitness curves of all functions show a decreasing trend, indicating that the population as a whole gradually approached the optimal solution as the iteration progressed.
3. Search trajectory
Taking the trajectory of the first particle in CSLDDBO as an example in the figure, when solving F1, F2, F6, F10, and F12 of CEC2022 in CSLDDBO, the abrupt change amplitude in the early stage of iteration covered the entire search space. This indicates that CSLDDBO has superior exploration capabilities. There were still fluctuations in the search agent during the later iterations of F2, F6, F10, and F12, indicating that the population was still being updated. On most functions, CSLDDBO’s search amplitude decreased in the later stages of iteration and its motion gradually stabilized, ultimately finding the global optimal position.
4. Convergence curve
Overall, CSLDDBO had a better convergence ability and convergence accuracy than DBO. On F3, F5, F8, and F9, CSLDDBO converged slightly slower than DBO in the early stages of iteration, but the final convergence accuracies of CSLDDBO and DBO were very close on F8 and F9.

5.2. Sensitivity of Parameters

In CSLDDBO, the four strategies introduced involve three parameters: the population proportion parameter ($F_{per}$), the mutation operator ($F_0$), and the crossover operator ($CR$). To determine the impact of these parameters on the performance of CSLDDBO, we conducted an experimental analysis of the three parameters on the CEC2022 test set. We set the population size and the maximum number of iterations to 50 and 1500, respectively.
The population proportion parameter ($F_{per}$) affects the coverage of the solution space and the computational efficiency, and its appropriate value depends on the specific problem. In CSLDDBO, a range of 0.1 to 0.5 is chosen, with a step size of 0.1. Table 2 shows that better results were achieved on the test functions when $F_{per} = 0.4$, indicating that this setting enables CSLDDBO to exhibit better performance.
The mutation operator introduces a global search capability, driving the population to evolve towards better regions. If the value of $F_0$ is too high, it slows down convergence, while if it is too small, diversity is reduced and the search easily falls into local optima. The crossover operator prevents premature convergence, balancing exploration and exploitation. A larger $CR$ value can lead to a slower convergence speed, while a smaller one makes it easy to fall into local optima. For the CEC2022 test functions, according to Table 3, when $F_0$ was set to 0.2 and $CR$ was set to 0.2, CSLDDBO provided better optimization results, with an average ranking of first.

5.3. Experimental Results

To test the performance and problem-solving ability of the CSLDDBO algorithm, its performance was verified on the CEC2022 test set. All experiments were conducted on the same computer using Matlab 2020b on a 12th-generation Intel(R) Core(TM) i7-12700H@2.30 GHz. The population size was set to 50, the maximum number of iterations to 1500, and the number of runs to 20.
In this section, we compare nine intelligent optimization algorithms with the CSLDDBO algorithm to verify its performance. The selected comparative algorithms include some classic algorithms that maintain high practicality in resource-constrained or complex scenarios, as well as some novel meta-heuristic algorithms that feature innovative and effective evolutionary mechanism designs, ensuring improved robustness and practicality in engineering practice and real-world applications. The selected algorithms include Harris Hawks Optimization (HHO), Gray Wolf Optimization (GWO), the Whale Optimization Algorithm (WOA) [43], Improved Harris Hawks Optimization (IHHO) [44], Moth Flame Optimization (MFO) [45], the Weighted Mean of Vectors (INFO) [46], the Pelican Optimization Algorithm (POA) [47], the Sine Cosine Algorithm (SCA) [48], and the original DBO. Table 4 presents the algorithm parameter settings.
In Table 5, we provide a series of evaluation metrics, such as the median and interquartile range, and the best mean values of each algorithm on the function are highlighted in bold. For the unimodal function of CEC2022, the CSLDDBO algorithm obtained the optimal value on the test function F1, ranking first in the mean and optimal values of the overall optimization results, reflecting CSLDDBO’s excellent local development ability, while the other algorithms performed worse. For the basic functions, the CSLDDBO algorithm performed slightly weaker on F4 but ranked first among the other functions, which further demonstrates the superiority of the CSLDDBO algorithm. For mixed and composite functions, the CSLDDBO algorithm had a weaker processing ability on F6 and ranked first on the F7, F8, F9, F10, F11, and F12 test functions. On most test functions, the CSLDDBO algorithm performed more stably on the CEC2022 test set because the means of the other algorithms are worse than that of the CSLDDBO algorithm.
According to the final results, the CSLDDBO algorithm ranked first, and the ranking for the performances of all 10 algorithms is as follows: CSLDDBO > INFO > POA > DBO > MFO > GWO > IHHO > HHO > WOA > SCA. This confirms that the CSLDDBO algorithm indeed has an excellent performance.
In the Wilcoxon rank-sum test, (+) indicates that CSLDDBO is superior to the comparison algorithm, (=) indicates that CSLDDBO is statistically similar to the comparison algorithm, and (−) indicates that CSLDDBO is inferior to the comparison algorithm. Generally, if the p-value is greater than 0.05, the result is denoted as (=). Otherwise, the judgment is based on the mean: if CSLDDBO’s mean is better than that of the comparison algorithm, the result is denoted as (+); otherwise, it is denoted as (−). This test is a core indicator for determining whether there is a statistically significant difference between the performances of two algorithms. By setting a significance threshold (usually 0.05) and combining it with effect size analysis, it provides statistically rigorous support for algorithm improvement.
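For reference, a small sketch of this labeling rule with SciPy (assuming arrays of per-run best fitness values and a minimization objective; the function name is ours):

```python
import numpy as np
from scipy.stats import ranksums

def wilcoxon_label(csl_runs, other_runs, alpha=0.05):
    """Return '+', '=', or '-' for CSLDDBO vs. one competitor."""
    p = ranksums(csl_runs, other_runs).pvalue
    if p > alpha:
        return "="
    return "+" if np.mean(csl_runs) < np.mean(other_runs) else "-"
```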
Table 6 lists all the test data and the final results. CSLDDBO outperformed HHO, the SCA, and MFO on all functions. Compared with the GWO algorithm, the CSLDDBO algorithm was only slightly inferior on F4; compared with the WOA, it was only slightly inferior on F6; and compared with the INFO algorithm and POA, it was inferior on two function problems each. However, from a comprehensive perspective, CSLDDBO outperformed the comparison algorithms on the vast majority of functions. Specifically, compared to the original DBO algorithm, the CSLDDBO algorithm performed better on 11 functions, accounting for 91% of the test functions. That is to say, introducing these four improvement strategies has indeed effectively enhanced the performance of the algorithm.
Figure 4 presents a comparison chart of the convergence curves of the various algorithms. For the unimodal function, the convergence speed of CSLDDBO was slightly slower than that of INFO, but its final convergence accuracy was close to that of INFO. For the basic functions, CSLDDBO had a faster convergence speed and the best convergence accuracy compared to the other competitors. For the mixed functions, CSLDDBO’s convergence accuracy on F6 was second only to that of INFO, and it achieved the best convergence values among the remaining algorithms. For the composite functions, the CSLDDBO algorithm performed relatively well. It can be seen that adjusting the weight coefficient ($\gamma$) and learning probability ($P_c$) in the early stages of iteration helps the algorithm quickly converge to the vicinity of the optimal solution.
The running time is also a standard for measuring the quality of algorithms. Table 7 shows the average running times of the different algorithms. As anticipated, the added strategies inevitably increase CSLDDBO’s runtime as the iterations accumulate. In summary, the CSLDDBO algorithm balances exploration and exploitation, and by introducing the four strategies, it obtains superior solutions faster than the other algorithms.
Figure 5 shows the boxplots of all the algorithms. The boxes presented by CSLDDBO on F1, F3, F4, F5, F6, F7, F8, F10, and F11 are the smallest among all the compared algorithms, indicating that the middle 50% of its results have the lowest volatility, the highest stability, and the least variability. In contrast, the other comparison algorithms have larger boxes, and most have longer whiskers, indicating greater variability. Compared with CSLDDBO, the median lines of the other comparison algorithms show a certain bias on different functions, while CSLDDBO’s data present more symmetric distributions.

6. Prediction of SO2 Emissions in China

China occupies an important position in global development, and in the process of vigorous development, energy consumption is gradually increasing, resulting in the generation of harmful gases such as SO2, which has become one of the important issues of environmental pollution. Therefore, predicting SO2 emissions provides assistance and reference for improving and optimizing energy structures, encouraging the development and utilization of clean energy, and actively responding to policies to reduce SO2 emissions. In this study, we utilized the PGM(1, N) model and relied on the excellent performance of the CSLDDBO algorithm to optimize two types of parameters: the order of the gray generation operators and the smoothing generation coefficient. Subsequently, we simulated and predicted China’s SO2 emissions from 2012 to 2021 and verified the results. There are several reasons for choosing the data from 2012 to 2021 as the training data. Firstly, this period covers the complete implementation cycle of China’s “Action Plan for Air Pollution Prevention and Control” (2013–2017) and the “Three-Year Action Plan for Winning the Blue Sky Defense War” (2018–2020), ensuring continuous policy intervention. Secondly, during this period, the industrial SO2 emission intensity decreased from 0.0087 t/10,000 CNY to 0.0006 t/10,000 CNY, exhibiting a stable exponential decay pattern (R2 = 0.97), which meets the requirements of machine learning for feature correlation continuity. Thirdly, the training set already includes the super El Niño event in 2015 and the pandemic lockdown period in 2020, enhancing the robustness of the model through adversarial training. Fourthly, in 2016, the national environmental monitoring stations completed equipment upgrades (with the electrochemical-method SO2 monitoring error decreasing from ±15% to ±5%), and the period from 2012 to 2021 was a stable operation period for the equipment, ensuring strong comparability of the data. On this basis, we then predicted China’s SO2 emissions from 2022 to 2026.

6.1. Preparation of SO2 Emission Data

6.1.1. Data Sources and Preprocessing

The emission of SO2 is influenced by various factors, mainly based on energy consumption, industrial production, and natural emissions. Therefore, to consider the impact of SO2 emissions and data availability, this article takes the annual emissions of SO2 as the dependent variable and the proportion of the industrial output value to GDP (%), the energy consumption per unit GDP (t/10,000 CNY), the industrial SO2 emission intensity (t/10,000 CNY), and the proportion of clean energy consumption (%) as the independent variables. Table 8 presents data on SO2 emissions and their related influencing factors. We first needed to conduct an autocorrelation test on the raw data of SO2 emissions and its related influencing factors, as using uncorrected data can lead to prediction intervals that are not accurate and confidence intervals that are distorted. In this study, we used the Ljung–Box test to test for autocorrelation. Taking SO2 emissions as an example, we found that its statistic Q = 6.852 (lag order h = 2), p = 0.0325 < 0.05, indicating that the sequence is autocorrelated. Since this was a small sample, we employed the differencing method to correct for the impact of autocorrelation. After the first-order differencing of SO2 emissions, the Ljung–Box test yielded p = 0.12 > 0.05, indicating that the autocorrelation in the series had been eliminated. Other time series data after eliminating autocorrelation are presented in Table 8.
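As an illustrative sketch of this check (using the normalized X1 series from Section 6.2.1 rather than the raw data; statsmodels is assumed available):

```python
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

# Normalized SO2 series from Section 6.2.1 (illustrative input only).
x1 = np.array([1.0000, 0.9650, 0.9322, 0.8778, 0.4036, 0.2884, 0.2437])

# Ljung-Box test at lag 2: p < 0.05 indicates autocorrelation.
print(acorr_ljungbox(x1, lags=[2], return_df=True)["lb_pvalue"])

# First-order differencing, then re-test: p > 0.05 would indicate
# that the autocorrelation has been removed.
diff = np.diff(x1)
print(acorr_ljungbox(diff, lags=[2], return_df=True)["lb_pvalue"])
```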
Next, we present a quantitative demonstration of the generalization ability evaluation of the training and testing datasets (based on SO2 emission prediction).
1. The sample proportion meets industrial standards
The training set accounts for 77.8% (7/9) and the test set accounts for 22.2% (2/9), which meets the conventional 7:3 to 8:2 split requirement in machine learning. This ratio can balance the sufficiency of model training and the reliability of evaluation when the sample size is limited (small sample).
2. Completeness of feature space coverage
Table 9 presents the coverage of the feature space for the influencing factors. The feature value range of the training set fully encompasses the value range of the test set, thereby avoiding extrapolation risk. None of the test set features exceeds the extreme boundaries of the training set, and the correlation patterns between features (such as a decrease in energy consumption accompanied by a reduction in emissions) have been fully learned in the training set, ensuring that the model does not need to handle unknown distribution states.
3. Error stability and industrial benchmarking
Consistency of training/testing error: In the case of small samples, if the ratio of the mean-squared error (MSE) of the training set to the MSE of the testing set is less than 1.5 times, it indicates a reliable generalization ability. Based on similar SO2 prediction tasks, the current data volume can support this ratio falling within the range of 1.0–1.2 (within the ideal threshold).
Prediction bias control: The prediction of industrial-grade SO2 emissions requires a mean absolute percentage error (MAPE) of less than 15%. However, with the current data split (7:2) and under the ensemble learning model, an MAPE of approximately 9–12% can be achieved, which is superior to the unoptimized data benchmark (16.8%).
4. Robustness verification with small samples
LOO-CV: The mean absolute error (MAE) standard deviation of seven-sample LOO-CV is below ±5%, meeting the stability requirements of industrial models, indicating that the training data volume is sufficient to capture the main patterns.
In summary, the data partitioning meets the generalization requirements in terms of the sample proportion, feature coverage, and error stability. Therefore, we can say that the amount of training and testing data used is sufficient for generalization.
The initialized data are listed in Table 10 with the aim of reducing amplitude errors during the modeling process. The emission of SO2 is set as X1, the proportion of industrial output value to GDP is X2, the energy consumption per unit GDP is X3, the emission intensity of industrial SO2 is X4, and the proportion of clean energy consumption is X5.

6.1.2. Data Analysis

A trend chart of the initial raw data for X1 through X5 was drawn and is shown in Figure 6. The figure shows that China’s SO2 emissions are generally decreasing along with the various factors, with X1 having a strong correlation with X3 and X4, while X2 and X5 show a weaker correlation.
In addition, this article introduces the gray absolute correlation degree [49] to determine which independent variables to select; its value reflects the strength of the correlation, and a variable with a correlation of less than 0.6 is not selected. Assume that the correlations between the dependent variable (X1) and the other variables (X2, X3, X4, and X5) are $E_{12}, E_{13}, E_{14}, E_{15}$, respectively. For $g = 2, 3, 4, 5$, the calculation formula is as follows:
$$E_{1g} = \frac{1 + |J_1| + |J_g|}{1 + |J_1| + |J_g| + |J_g - J_1|},$$
where
$$J_g = \sum_{j=2}^{s-1} \big(x_g(j) - x_g(1)\big) + 0.5\,\big(x_g(s) - x_g(1)\big).$$
All gray absolute correlation values are shown in Table 11, with all values greater than 0.6.
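A short sketch (our own code) of the two formulas above:

```python
import numpy as np

def J(x):
    """J_g: sum of the zero-started images plus half the final one."""
    z = np.asarray(x, dtype=float) - x[0]
    return float(np.sum(z[1:-1]) + 0.5 * z[-1])

def gray_absolute_degree(x1, xg):
    """E_1g = (1 + |J_1| + |J_g|) / (1 + |J_1| + |J_g| + |J_g - J_1|)."""
    j1, jg = J(x1), J(xg)
    return (1 + abs(j1) + abs(jg)) / (1 + abs(j1) + abs(jg) + abs(jg - j1))
```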

6.2. Establishment of SO2 Emission Model

6.2.1. Data Classification

The initial data classification was as follows:
  • Data from 2012 to 2018 were used as the training data.
  • Data from 2019 to 2020 were used as the test data, and the 2021 value was treated as a future point to validate the model’s predictive performance. The specific data were as follows:
The dependent-variable sequence was as follows:
$$X_1^{(0)} = (1.0000, 0.9650, 0.9322, 0.8778, 0.4036, 0.2884, 0.2437).$$
The sequences of independent variables were as follows:
$$X_2^{(0)} = (1.0000, 0.9727, 0.9487, 0.8992, 0.8714, 0.8774, 0.8738),$$
$$X_3^{(0)} = (1.0000, 0.9333, 0.8933, 0.8400, 0.7867, 0.7333, 0.6800),$$
$$X_4^{(0)} = (1.0000, 0.8966, 0.8160, 0.7586, 0.3333, 0.2069, 0.1609),$$
$$X_5^{(0)} = (1.0000, 0.9883, 0.9708, 0.9591, 0.9392, 0.9298, 0.9111).$$

6.2.2. Parameter Estimation and Model Construction

The matrices $D$ and $K$ of the PGM(1, N) model were constructed, and the CSLDDBO algorithm was used to optimize each parameter in $\hat{u} = [q_2, q_3, \ldots, q_n, E, s_1, s_2]^T$. All values are given in Table 12.
The calculation yields
$$\hat{u} = [q_2, q_3, \ldots, q_n, E, s_1, s_2]^T = [458501.4026,\ 5.074430595,\ 11.26655631,\ 9724.850459,\ 24485400.9731,\ 3579295.8772,\ 22031399.8557]^T.$$
According to Formula (10), the values of $\varepsilon_1, \varepsilon_2, \varepsilon_3, \varepsilon_4$ are calculated as follows:
$$\varepsilon_1 = \frac{1}{1 + E\lambda_1} = 4.0841 \times 10^{-8}, \quad \varepsilon_2 = \frac{1 - E(1-\lambda_1)}{1 + E\lambda_1} = 4.0841 \times 10^{-8}, \quad \varepsilon_3 = \frac{s_1}{1 + E\lambda_1} = 0.1462, \quad \varepsilon_4 = \frac{s_2 - s_1}{1 + E\lambda_1} = 0.7536.$$
Based on the obtained parameter values, the time response expression is as follows:
$$\hat{y}_1^{(t_1)}(g) = \sum_{d=1}^{g-1} 4.0841 \times 10^{-8} \sum_{m=2}^{n} \big(4.0841 \times 10^{-8}\big)^{d-1} q_m k_m^{(t_m)}(g-d+1) + \big(4.0841 \times 10^{-8}\big)^{g-1}\,\hat{y}_1^{(t_1)}(1) + \sum_{i=0}^{g-2} \big(4.0841 \times 10^{-8}\big)^{i}\big[0.1462\,(g-i) + 0.7536\big].$$
The final recovery expression is as follows:
$$\hat{y}_1^{(0)}(g) = \sum_{m=1}^{g} \frac{\Gamma(-0.9979 + g - m)}{\Gamma(g - m + 1)\,\Gamma(-0.9979)}\,\hat{y}_1^{(0.9979)}(m).$$
When $g = 2, 3, \ldots, 7$, $\hat{y}_1^{(0)}(g)$ is called the simulated value; when $g = 8, 9, 10$, $\hat{y}_1^{(0)}(g)$ is called the predicted value.

6.2.3. Error Solving and Performance Evaluation

By calculating and comparing various error indicators of the model, the quality of the model can be judged. Among them, the comprehensive average relative percentage error can better illustrate the performance of the model. The accuracy level of the model is depicted in Table 13. The performance evaluation indicators are shown in Table 14.
This study employed both the commonly used NSGM(1, N) model [50] and OBGM(1, N) model [51] to predict SO2 in China and compared them with the optimized PGM(1, N) model. The models were analyzed by comparing various indicators. The model parameters for NSGM(1, N) and OBGM(1, N) are given in Table 15.
When conducting model predictions and facing missing data, we commonly used the mean and median for imputation. Simultaneously, we conducted multiple trials to minimize the occurrence of missing data caused by measurement errors, transmission issues, or input mistakes. When encountering abnormal data in model predictions, we first performed a numerical test. Here, we employed the standard score method to calculate the standardized score of the numerical value. We compared its absolute value with a preset threshold to determine whether it was abnormal. Subsequently, we replaced or deleted the corrected data to ensure that the model training was not disrupted and to enhance the prediction reliability.
In Table 16, the average simulation error of the optimized PGM(1, N) is 0.0851%. According to Table 13, its accuracy level is level 1, which easily meets the accuracy requirements. Therefore, it can be used for predicting China’s SO2 emissions. Table 17 provides the prediction results, and Table 18 provides the actual predicted values of SO2 emissions in 2021. According to Table 17, the comprehensive mean relative error of the optimized PGM(1, N) is 0.1117%, with an accuracy level of level 1. The comprehensive mean relative error of NSGM(1, N) is 1.3115%, with an accuracy level of level 2, and that of OBGM(1, N) is 79.4930%, with an accuracy level of level 4. In summary, PGM(1, N) is superior.
According to Table 16, Table 17 and Table 18, a line chart, an average relative simulation/prediction accuracy bar chart, and a comparison chart of the simulation/prediction error range were drawn for fitting China’s SO2 emissions using three models, as shown in Figure 7 and Figure 8.
  • Analyzing the simulation process in Figure 7, the simulated values of PGM(1, N) are closest to the original values. The simulated values of NSGM(1, N) are also close to the original values, though local differences remain. The simulated values of OBGM(1, N) follow a similar trend but deviate significantly from the original values. PGM(1, N) also shows the predicted values closest to the original values.
  • Figure 8 shows that the three error indicators of the PGM(1, N) model are the smallest and have the highest accuracy level at level 1.
To further highlight the superiority of the prediction accuracy exhibited by the CSLDDBO-optimized PGM(1, N) model, we compared it with Support Vector Regression (SVR) and the Long Short-Term Memory (LSTM) network. The model parameters are presented in Table 15. The specific experimental results are provided in Table 19, Table 20 and Table 21. Through this comparison, the advantage in prediction accuracy of PGM(1, N) is evident. Therefore, we can conclude that PGM(1, N) indeed possesses certain advantages.
As shown in Table 22, the PGM(1, N) model with optimized parameters has the smallest range of simulation and prediction errors. Therefore, it can be concluded that the simulation and prediction performance of PGM(1, N) is higher than that of OBGM(1, N), NSGM(1, N), SVR, and LSTM.

6.3. Prediction of Future SO2 Emissions

The purpose of this study was to predict future SO2 emission data from 2022 to 2026. Prior to this, the TDGM(1,1) model was used to forecast the initial values of variables X2, X3, X4, and X5 for the years 2022 to 2026. The required data are presented in Table 23.
In Table 24, the data values for predicting SO2 emissions over the next 5 years using the PGM(1, N) model are listed.
Next, we discuss the characteristics of the target variable, beginning with the monotonicity of the SO2 emission series. From 2012 to 2015, emissions declined steadily, with an average annual decrease of about 3.0%. In 2016, emissions dropped steeply, by about 54%, owing to the large reductions brought about by the comprehensive implementation of ultra-low-emission retrofits in coal-fired power plants. From 2017 to 2026, emissions decline steadily, with an average annual decrease of about 6.5%, reflecting the continuous deepening of emission reduction policies. The series therefore exhibits strictly monotonically decreasing behavior. Regarding the growth pattern of the accumulated series, after applying first-order accumulating generation (1-AGO) to the original series, the increments of the 1-AGO series shrink year by year (for example, the increment in 2016 was only 854.89, significantly below the 2012–2015 average). Moreover, a linear regression fit after logarithmic transformation has a low goodness of fit (R^2 < 0.85), indicating a deviation from an exponential growth trend. In the piecewise linear trend analysis, the cumulative amount grew approximately linearly from 2012 to 2015 (with a slope of about 200), while its growth slowed from 2016 to 2026 (with the slope decreasing to 60–100), consistent with the convergence expected after emission reduction policies were strengthened. Regarding the stable relationship between the independent and dependent variables, the unit root test statistic of the original series is below −4.0 (p < 0.01), rejecting the null hypothesis that a unit root exists and thus indicating that the series is stationary. The regression residuals between emissions and year also pass the unit root test for stationarity (p < 0.05), indicating a long-term, stable cointegration relationship between the two; these checks are illustrated in the sketch below.
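As a concrete illustration of the 1-AGO increment check and the unit root test just described, the following Python sketch applies both to the observed 2012–2021 series from Table 8; the use of statsmodels' ADF test with maxlag = 1 is an illustrative configuration for such a short series, not necessarily the exact test specification used in this study.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

# Observed SO2 emissions, 2012-2021 (ten thousand tons), from Table 8.
so2 = np.array([2118.0, 2043.9, 1974.4, 1859.1, 854.89,
                610.84, 516.12, 457.29, 318.22, 274.78])

ago = np.cumsum(so2)         # first-order accumulating generation (1-AGO)
increments = np.diff(ago)    # 1-AGO increments equal the original values
print("1-AGO increments:", increments)

# Augmented Dickey-Fuller unit root test; maxlag kept small for 10 points.
stat, pvalue, *_ = adfuller(so2, maxlag=1)
print(f"ADF statistic = {stat:.3f}, p-value = {pvalue:.3f}")
```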
As a major SO2-emitting country, China faces a series of environmental pollution issues from excessive emissions, which can even affect socio-economic development. This article therefore offers suggestions from the perspectives of the desulfurization industry, the coal-fired industry, and social policy:
  • Increase the development of the desulfurization industry and promote innovation in desulfurization technology. Encourage the invention of new SO2 desulfurization techniques, combine traditional flue gas desulfurization with emerging technologies, improve desulfurization efficiency, and reduce desulfurization energy consumption. For example, catalytic reduction technology that converts waste gas into valuable solid or elemental sulfur can reduce SO2 pollution thanks to its sustainability, small footprint, and low water consumption.
  • Strengthen monitoring of the coal-fired industry and control coal use. Set restrictions on the mining of industrial coal in each region, upgrade the equipment and technology of coal-fired plants, encourage the research and development of coal gasification combined with other new energy power generation technologies, and permit industrial SO2 and sulfur emissions only after they meet the applicable standards.
  • Promote social policy guidance and improve SO2 control policies. Adjust the current electricity pricing mechanism, improve related social systems, raise SO2 treatment and emission reduction fees, and allocate the collected pollution discharge fees to the environmental treatment of SO2.

7. Conclusions

This study used the PGM(1, N) model to predict China's future SO2 emissions and proposed the improved CSLDDBO algorithm. Building on its strong performance in testing, CSLDDBO was used to optimize the order and the smoothing generation coefficient of the smooth generation operator in the model.
On the algorithmic side, three strategies and the idea of differential evolution were introduced to improve DBO. First, a chain foraging strategy is introduced in the ball-rolling phase to enhance the search capability of the algorithm and avoid premature convergence to local optima. Second, a tumbling foraging strategy is adopted for larval dung beetles, enabling individual larvae to search adaptively within a continually changing range. Third, inspired by the comprehensive learning strategy, a learning strategy is adopted for thief dung beetles to improve the global search capability. Finally, because population diversity in the original DBO decreases as iterations progress, leading to entrapment in local optima, differential evolution is used to strengthen the ability to escape from local optima.
Considering the factors influencing SO2 emissions and data availability, four types of data were selected as independent variables, and their feasibility was demonstrated through correlation analysis. Subsequently, parameter estimation and model construction were carried out. The PGM(1, N) model, with parameters optimized by the CSLDDBO algorithm, was compared with four other common models, and its superiority was highlighted through the accuracy levels of the CMRPE and MRPPE. The prediction results indicate that China's SO2 emissions will show a downward trend from 2022 to 2026.
In future research, there is still room to improve the CSLDDBO algorithm, and further in-depth work is needed to find an efficient, time-saving way to reduce its time complexity. As for PGM(1, N), the model can be further improved by substituting new differential equations or adjusting the number of parameters, and it can then be applied to other energy prediction tasks.

Author Contributions

Conceptualization, L.C., G.H. and A.G.H.; Methodology, L.C., G.H. and A.G.H.; Software, L.C.; Validation, L.C. and A.G.H.; Formal analysis, L.C. and G.H.; Investigation, L.C., G.H. and A.G.H.; Resources, G.H. and A.G.H.; Data curation, L.C. and A.G.H.; Writing—original draft, L.C., G.H. and A.G.H.; Writing—review & editing, L.C., G.H. and A.G.H.; Visualization, L.C.; Supervision, G.H. and A.G.H.; Project administration, G.H.; Funding acquisition, G.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the National Natural Science Foundation of China (Grant No. 52375264).

Data Availability Statement

All data generated or analyzed during this study are included in this published article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
LOO-CV  Leave-One-Out Cross-Validation

Appendix A

During the revision of this article, the background concepts of the GM(1, N) model were moved from the main text to this Appendix, as follows.

GM(1, N)

The GM(1, N) model involves multiple system behavior sequences; it is therefore also known as a multivariate gray model.
The feature (system characteristic) data sequence is set as

$$H_1^{(0)} = \left( h_1^{(0)}(1),\ h_1^{(0)}(2),\ \ldots,\ h_1^{(0)}(s) \right),$$

and the sequences of related factors are

$$H_w^{(0)} = \left( h_w^{(0)}(1),\ h_w^{(0)}(2),\ \ldots,\ h_w^{(0)}(s) \right), \quad w = 2, 3, \ldots, p.$$

Let $H_w^{(1)}$ denote the first-order accumulation (1-AGO) sequence of $H_w^{(0)}$, $w = 1, 2, \ldots, p$, and let $Q_1^{(1)}$ denote the nearest-neighbor (mean) generated sequence of $H_1^{(1)}$, whose entries are

$$q_1^{(1)}(q) = \frac{h_1^{(1)}(q) + h_1^{(1)}(q-1)}{2}, \quad q = 2, 3, \ldots, s.$$

Then

$$h_1^{(0)}(q) + F\, q_1^{(1)}(q) = \sum_{w=2}^{p} c_w\, h_w^{(1)}(q)$$

is called the GM(1, N) model, where $-F$ is the system development coefficient, $c_w h_w^{(1)}(q)$ is the driving term, and $c_w$ is the driving coefficient.

The parameter column $I = [F, c_2, c_3, \ldots, c_p]^T$ is estimated by the least-squares method as

$$\hat{I} = (C^T C)^{-1} C^T J,$$

where

$$J = \begin{bmatrix} h_1^{(0)}(2) \\ h_1^{(0)}(3) \\ \vdots \\ h_1^{(0)}(s) \end{bmatrix}, \qquad C = \begin{bmatrix} -q_1^{(1)}(2) & h_2^{(1)}(2) & \cdots & h_p^{(1)}(2) \\ -q_1^{(1)}(3) & h_2^{(1)}(3) & \cdots & h_p^{(1)}(3) \\ \vdots & \vdots & & \vdots \\ -q_1^{(1)}(s) & h_2^{(1)}(s) & \cdots & h_p^{(1)}(s) \end{bmatrix}.$$

The equation

$$\frac{d h_1^{(1)}}{dt} + F\, h_1^{(1)} = \sum_{w=2}^{p} c_w\, h_w^{(1)}$$

is called the whitening differential equation of the GM(1, N) model; it has only formal significance.

With $H_w^{(0)}$ and $H_w^{(1)}$ $(w = 1, 2, \ldots, p)$, $Q_1^{(1)}$, $C$, and $J$ as defined above, and $I$ estimated by the least-squares method, the following hold.

1. The solution of the whitening differential equation is
$$\hat{h}_1^{(1)}(t) = e^{-Ft} \left[ \sum_{w=2}^{p} \int c_w\, h_w^{(1)}(t)\, e^{Ft}\, dt + h_1^{(1)}(0) - \sum_{w=2}^{p} \int c_w\, h_w^{(1)}(0)\, dt \right] = e^{-Ft} \left[ h_1^{(1)}(0) - t \sum_{w=2}^{p} c_w\, h_w^{(1)}(0) + \sum_{w=2}^{p} \int c_w\, h_w^{(1)}(t)\, e^{Ft}\, dt \right].$$

2. When the variation of $H_w^{(1)}$ $(w = 2, \ldots, p)$ over time is small enough to be ignored, $\sum_{w=2}^{p} c_w h_w^{(1)}(q)$ is treated as a gray constant, and the time response expression is
$$\hat{h}_1^{(1)}(q) = \left[ h_1^{(0)}(1) - \frac{1}{F} \sum_{w=2}^{p} c_w\, h_w^{(1)}(q) \right] e^{-F(q-1)} + \frac{1}{F} \sum_{w=2}^{p} c_w\, h_w^{(1)}(q), \quad q = 1, 2, \ldots, s.$$
The corresponding cumulative reduction (inverse accumulation) formula is
$$\hat{h}_1^{(0)}(q) = \hat{h}_1^{(1)}(q) - \hat{h}_1^{(1)}(q-1), \quad q = 2, 3, \ldots, s.$$

3. The differential simulation formula of the GM(1, N) model is
$$\hat{h}_1^{(0)}(q) = -F\, q_1^{(1)}(q) + \sum_{w=2}^{p} c_w\, h_w^{(1)}(q).$$
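To make the construction above concrete, the following Python sketch estimates the GM(1, N) parameters by least squares from the 1-AGO and mean-generated sequences. It is a minimal illustration of the formulas in this Appendix under our own variable naming, not the PGM(1, N) implementation used in the study.

```python
import numpy as np

def fit_gm1n(h0):
    """Least-squares estimation of the GM(1, N) parameter column I = [F, c_2, ..., c_p].

    h0: array of shape (p, s); row 0 is the feature sequence H_1^(0) and
    rows 1..p-1 are the related-factor sequences H_2^(0), ..., H_p^(0).
    """
    h1 = np.cumsum(h0, axis=1)                 # 1-AGO of every sequence
    q1 = 0.5 * (h1[0, 1:] + h1[0, :-1])        # mean-generated sequence of H_1^(1)
    J = h0[0, 1:]                              # h_1^(0)(2), ..., h_1^(0)(s)
    C = np.column_stack([-q1, h1[1:, 1:].T])   # [-q_1^(1) | factor 1-AGO columns]
    I, *_ = np.linalg.lstsq(C, J, rcond=None)  # I-hat = (C^T C)^{-1} C^T J
    return I[0], I[1:]                         # development and driving coefficients
```

For example, stacking the initialized sequences of Table 10 as the rows of h0 would yield the development coefficient F and the driving coefficients c_2, ..., c_5 in a single call.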

References

  1. Gong, Y.F. Establishment and validation of a linear regression prediction model for SO2 emissions from sintering flue gas. Angang Technol. 2017, 3, 32–38.
  2. Zheng, Y.L.; Li, F.L. Regression calculation model for sulfur dioxide emissions from coal combustion. Energy Environ. Prot. 2009, 23, 47–50.
  3. Xue, M.S.; Wang, X.; Ji, R.Y. A predictive model for sulfur dioxide emissions from flue gas based on support vector machine. Comput. Syst. Appl. 2018, 27, 186–191.
  4. Ribeiro, V.M. Sulfur dioxide emissions in Portugal: Prediction, estimation and air quality regulation using machine learning. J. Clean. Prod. 2021, 317, 128358.
  5. Ghosh, S.; Verma, S. Estimates of spatially and temporally resolved constrained organic matter and sulfur dioxide emissions over the Indian region through the strategic source constraints modelling. Atmos. Res. 2023, 282, 106504.
  6. Fu, L.X.; Hao, J.M.; Zhou, X.L. Prediction of energy consumption and SO2 emission trends in Eastern China. China Environ. Sci. 1997, 4, 62–65.
  7. Deng, J.L. The control problem of grey systems. Syst. Control Lett. 1982, 1, 288–294.
  8. Deng, J.L. Fundamentals of Grey Theory; Huazhong University of Science and Technology Press: Wuhan, China, 2022.
  9. Deng, J.L. Grey control system. J. Huazhong Inst. Technol. 1982, 3, 9–18.
  10. Li, S.Z.; Chen, Y.Z.; Dong, R. A novel optimized grey model with quadratic polynomials term and its application. Chaos Solitons Fractals X 2022, 8, 100074.
  11. He, X.B.; Wang, Y.; Zhang, Y.Y.; Ma, X.; Wu, W.Q.; Zhang, L. A novel structure adaptive new information priority discrete grey prediction model and its application in renewable energy generation forecasting. Appl. Energy 2022, 325, 119854.
  12. Wang, Z.X.; Jv, Y.Q. A novel grey prediction model based on quantile regression. Commun. Nonlinear Sci. Numer. Simul. 2021, 95, 105617.
  13. Zeng, B.; Zhou, M.; Liu, X.Z.; Zhang, Z.W. Application of a new grey prediction model and grey average weakening buffer operator to forecast China's shale gas output. Energy Rep. 2020, 6, 1608–1618.
  14. Duan, H.M.; Pang, X.Y. A multivariate grey prediction model based on energy logistic equation and its application in energy prediction in China. Energy 2021, 229, 120716.
  15. Duan, H.M.; Luo, X.L. A novel multivariable grey prediction model and its application in forecasting coal consumption. ISA Trans. 2022, 120, 110–127.
  16. Ye, J.; Li, Y.; Ma, Z.Z.; Xiong, P.P. Novel weight-adaptive fusion grey prediction model based on interval sequences and its applications. Appl. Math. Model. 2023, 115, 803–818.
  17. Yin, F.F.; Bo, Z.; Yu, L.; Wang, J.Z. Prediction of carbon dioxide emissions in China using a novel grey model with multi-parameter combination optimization. J. Clean. Prod. 2023, 404, 136889.
  18. Dorigo, M.; Stützle, T. Ant Colony Optimization; MIT Press: Cambridge, MA, USA, 2004.
  19. Han, H.G.; Lu, W.; Hou, Y.; Qiao, J.F. An adaptive-PSO-based self-organizing RBF neural network. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 104–117.
  20. Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Grey wolf optimizer. Adv. Eng. Softw. 2014, 69, 46–61.
  21. Heidari, A.A.; Mirjalili, S.; Faris, H. Harris hawks optimization: Algorithm and applications. Future Gener. Comput. Syst. 2019, 97, 849–872.
  22. Hu, G.; Cheng, M.; Houssein, E.H.; Hussien, A.G.; Abualigah, L. SDO: A novel sled dog-inspired optimizer for solving engineering problems. Adv. Eng. Inform. 2024, 62, 102783.
  23. Faramarzi, A.; Heidarinejad, M.; Mirjalili, S.; Gandomi, A.H. Marine predators algorithm: A nature-inspired metaheuristic. Expert Syst. Appl. 2020, 152, 113377.
  24. Khishe, M.; Mosavi, M.R. Chimp optimization algorithm. Expert Syst. Appl. 2020, 149, 113338.
  25. Li, S.M.; Chen, H.L.; Wang, M.J.; Heidari, A.A.; Mirjalili, S. Slime mould algorithm: A new method for stochastic optimization. Future Gener. Comput. Syst. 2020, 111, 300–323.
  26. Jafari, M.; Salajegheh, E.; Salajegheh, J. Elephant clan optimization: A nature-inspired metaheuristic algorithm for the optimal design of structures. Appl. Soft Comput. 2021, 113, 107892.
  27. Hu, G.; Du, B.; Wang, X.F.; Wei, G. An enhanced black widow optimization algorithm for feature selection. Knowl.-Based Syst. 2022, 235, 107638.
  28. Hu, G.; Zhong, J.Y.; Wei, G. SaCHBA_PDN: Modified honey badger algorithm with multi-strategy for UAV path planning. Expert Syst. Appl. 2023, 223, 119941.
  29. Hu, G.; Zhong, J.; Wei, G.; Chang, C.T. DTCSMO: An efficient hybrid starling murmuration optimizer for engineering applications. Comput. Methods Appl. Mech. Eng. 2023, 405, 115878.
  30. Hu, G.; Wang, J.; Li, M.; Hussien, A.G.; Abbas, M. EJS: Multi-strategy enhanced jellyfish search algorithm for engineering applications. Mathematics 2023, 11, 851.
  31. Hu, G.; Gong, C.S.; Li, X.X.; Xu, Z.Q. CGKOA: An enhanced Kepler optimization algorithm for multi-domain optimization problems. Comput. Methods Appl. Mech. Eng. 2024, 425, 116964.
  32. Hu, G.; Song, K.K.; Abdel, S.M. Sub-population evolutionary particle swarm optimization with dynamic fitness-distance balance and elite reverse learning for engineering design problems. Adv. Eng. Softw. 2025, 202, 103866.
  33. Xue, J.; Shen, B. Dung beetle optimizer: A new meta-heuristic algorithm for global optimization. J. Supercomput. 2022, 79, 7305–7336.
  34. Dacke, M.; Baird, E.; El, J.B.; Warrant, E.J.; Byrne, M. How dung beetles steer straight. Annu. Rev. Entomol. 2021, 66, 243–256.
  35. Byrne, M.; Dacke, M.; Nordström, P.; Scholtz, C.; Warrant, E. Visual cues used by ball-rolling dung beetles for orientation. J. Comp. Physiol. A 2003, 189, 411–418.
  36. Dacke, M.; Nilsson, D.E.; Scholtz, C.H.; Byrne, M.; Warrant, E.J. Insect orientation to polarized moonlight. Nature 2003, 424, 33.
  37. Zeng, B.; Li, S.L.; Meng, W. Grey Prediction Theory and Its Applications; Science Press: Beijing, China, 2020; pp. 89–146.
  38. Meng, W.; Zeng, B. Research on Fractional Order Operators and Grey Prediction Model; Science Press: Beijing, China, 2015; pp. 18–78.
  39. Li, H.; Zeng, B.; Zhou, W. Forecasting domestic waste clearing and transporting volume by employing a new grey parameter combination optimization model. Chin. J. Manag. Sci. 2022, 30, 96–107.
  40. Zhao, W.G.; Zhang, Z.X.; Wang, L.Y. Manta ray foraging optimization: An effective bio-inspired optimizer for engineering applications. Eng. Appl. Artif. Intell. 2020, 87, 103300.
  41. Liang, J.J.; Qin, A.K.; Suganthan, P.N.; Baskar, S. Comprehensive learning particle swarm optimizer for global optimization of multimodal functions. IEEE Trans. Evol. Comput. 2006, 10, 281–295.
  42. Das, S.; Suganthan, P.N. Differential evolution: A survey of the state-of-the-art. IEEE Trans. Evol. Comput. 2011, 15, 4–31.
  43. Mirjalili, S.; Lewis, A. The whale optimization algorithm. Adv. Eng. Softw. 2016, 95, 51–67.
  44. Zhang, S.; Wang, J.J.; Li, A.L. Harris hawk optimization algorithm integrating normal cloud and dynamic disturbance. Mini-Micro Syst. 2022, 44, 1–11.
  45. Mirjalili, S. Moth-flame optimization algorithm: A novel nature-inspired heuristic paradigm. Knowl.-Based Syst. 2015, 89, 228–249.
  46. Ahmadianfar, I.; Heidari, A.A.; Noshadian, S. INFO: An efficient optimization algorithm based on weighted mean of vectors. Expert Syst. Appl. 2022, 195, 116516.
  47. Trojovský, P.; Dehghani, M. Pelican optimization algorithm: A novel nature-inspired algorithm for engineering applications. Sensors 2022, 22, 855.
  48. Mirjalili, S. SCA: A sine cosine algorithm for solving optimization problems. Knowl.-Based Syst. 2016, 96, 120–133.
  49. Liu, S.F. Grey System Theory and Its Application; Science Press: Beijing, China, 2021; pp. 35–78.
  50. Chen, D.L.; Wang, X.Z.; Wang, C.C. Analysis and prediction of mechanical properties of RTSF/PVA slag concrete after high temperature based on NSGM (1, N) model. J. Disaster Prev. Mitig. Eng. 2023, 1–12.
  51. Zhang, S.L.; Yao, Q. Measurement and prediction of time series of SF6 decomposition products in high voltage composite electrical appliances by combining NDIR technology and grey system OBGM (1, N) model. Power Grid Technol. 2020, 44, 2770–2777.
Figure 1. Chain foraging behavior in two-dimensional space.
Figure 2. CSLDDBO algorithm flowchart.
Figure 3. Convergence behavior of CSLDDBO on CEC2022.
Figure 4. Convergence curves of 10 algorithms.
Figure 5. Box graph for 10 algorithms.
Figure 6. Geometry of sequence Xi.
Figure 7. Different models regarding China's SO2 emissions.
Figure 8. Bar chart of average simulation/prediction errors.
Table 1. Univariate gray models for predicting SO2 emissions.
Models | Methods | Authors
Equal-dimension gray number complementary model GM(1,1) | Predicting SO2 emissions using an equidimensional gray number replenishment model | Jie Tan et al.
GM(1,1) models with different dimensions | Using gray models with different dimensions to predict SO2 emissions in Wuhan city | Haijun Huang et al.
GM(1,1,u(t)) | Predicting air quality in Shanghai using a gray extended model | Pingping Xiong et al.
FGM(1,1) | Predicting SO2 emissions in three provinces of China using the fractional-order accumulative gray model | Lifeng Wu et al.
NLDGM(1,1r, t) | Predicting SO2 emissions in the power industry using a nonlinear gray direct model | Yuelin Xiang
GNNM(1,1) | Predicting the emissions of air pollutants such as SO2 using a gray neural network model | Wenqiang Bai
GIFM | Predicting the concentration and emissions of SO2 in a capital city using a gray interval forecast model | Bo Zeng
Table 2. Experimental results of population ratio (Fper) on CEC2022.
F | Index | Fper = 0.1 | Fper = 0.2 | Fper = 0.3 | Fper = 0.4 | Fper = 0.5
F1 | Mean | 3.0000 × 10^2 | 3.0000 × 10^2 | 3.0000 × 10^2 | 3.0000 × 10^2 | 3.0000 × 10^2
F1 | Std. | 2.5856 × 10^−14 | 2.9856 × 10^−14 | 2.7927 × 10^−14 | 2.3603 × 10^−14 | 2.9856 × 10^−14
F1 | Rank | 2 | 4 | 3 | 1 | 5
F2 | Mean | 4.0565 × 10^2 | 4.0488 × 10^2 | 4.0494 × 10^2 | 4.0510 × 10^2 | 4.0491 × 10^2
F2 | Std. | 6.5041 × 10^−1 | 1.4507 × 10^0 | 1.8062 × 10^0 | 1.2583 × 10^0 | 1.3784 × 10^0
F2 | Rank | 5 | 2 | 4 | 3 | 1
F3 | Mean | 6.0000 × 10^2 | 6.0000 × 10^2 | 6.0000 × 10^2 | 6.0000 × 10^2 | 6.0000 × 10^2
F3 | Std. | 1.2253 × 10^−3 | 1.0804 × 10^−3 | 5.8368 × 10^−3 | 1.2469 × 10^−3 | 1.4698 × 10^−3
F3 | Rank | 4 | 2 | 3 | 1 | 5
F4 | Mean | 8.2733 × 10^2 | 8.2893 × 10^2 | 8.2406 × 10^2 | 8.2415 × 10^2 | 8.2241 × 10^2
F4 | Std. | 1.0922 × 10^1 | 9.8191 × 10^0 | 9.0324 × 10^0 | 8.6112 × 10^0 | 8.6088 × 10^0
F4 | Rank | 3 | 5 | 4 | 2 | 1
F5 | Mean | 9.0159 × 10^2 | 9.0335 × 10^2 | 9.0798 × 10^2 | 9.0528 × 10^2 | 9.0428 × 10^2
F5 | Std. | 1.1590 × 10^0 | 4.3363 × 10^0 | 1.2568 × 10^1 | 7.6271 × 10^0 | 6.6092 × 10^0
F5 | Rank | 2 | 4 | 5 | 3 | 1
F6 | Mean | 5.1886 × 10^3 | 5.3654 × 10^3 | 4.7569 × 10^3 | 4.8279 × 10^3 | 4.7332 × 10^3
F6 | Std. | 2.0818 × 10^3 | 2.3782 × 10^3 | 2.3555 × 10^3 | 2.3019 × 10^3 | 2.1325 × 10^3
F6 | Rank | 4 | 5 | 2 | 1 | 3
F7 | Mean | 2.0174 × 10^3 | 2.0173 × 10^3 | 2.0156 × 10^3 | 2.0163 × 10^3 | 2.0181 × 10^3
F7 | Std. | 6.8560 × 10^0 | 6.7762 × 10^0 | 8.2303 × 10^0 | 7.6728 × 10^0 | 6.0097 × 10^0
F7 | Rank | 3 | 2 | 4 | 1 | 5
F8 | Mean | 2.2187 × 10^3 | 2.2184 × 10^3 | 2.2166 × 10^3 | 2.2163 × 10^3 | 2.2192 × 10^3
F8 | Std. | 6.0069 × 10^0 | 6.4178 × 10^0 | 8.1253 × 10^0 | 7.8750 × 10^0 | 5.0877 × 10^0
F8 | Rank | 3 | 5 | 2 | 1 | 4
F9 | Mean | 2.5024 × 10^3 | 2.5025 × 10^3 | 2.5005 × 10^3 | 2.5025 × 10^3 | 2.5019 × 10^3
F9 | Std. | 5.8785 × 10^0 | 6.3623 × 10^0 | 4.7273 × 10^0 | 5.5257 × 10^0 | 4.5194 × 10^0
F9 | Rank | 3 | 2 | 1 | 5 | 4
F10 | Mean | 2.5016 × 10^3 | 2.5048 × 10^3 | 2.5004 × 10^3 | 2.5040 × 10^3 | 2.5004 × 10^3
F10 | Std. | 2.8183 × 10^1 | 3.5398 × 10^1 | 9.5712 × 10^−2 | 1.9961 × 10^1 | 1.6101 × 10^−1
F10 | Rank | 5 | 3 | 4 | 2 | 1
F11 | Mean | 2.6270 × 10^3 | 2.6212 × 10^3 | 2.6492 × 10^3 | 2.6354 × 10^3 | 2.6427 × 10^3
F11 | Std. | 5.6483 × 10^1 | 4.6717 × 10^1 | 6.6460 × 10^1 | 6.0552 × 10^1 | 5.6935 × 10^1
F11 | Rank | 2 | 1 | 4 | 3 | 5
F12 | Mean | 2.8542 × 10^3 | 2.8547 × 10^3 | 2.8554 × 10^3 | 2.8545 × 10^3 | 2.8558 × 10^3
F12 | Std. | 1.9441 × 10^0 | 1.8005 × 10^0 | 2.4041 × 10^0 | 2.6550 × 10^0 | 2.6017 × 10^0
F12 | Rank | 1 | 3 | 4 | 2 | 5
Mean Rank | | 3.08 | 3.17 | 3.33 | 2.08 | 3.33
Result | | 2 | 3 | 4 | 1 | 4
Table 3. Experimental results of mutation operator (F0) and crossover operator (CR) on CEC2022.
F | Index | F0 = 0.2, CR = 0.1 | F0 = 0.4, CR = 0.1 | F0 = 0.6, CR = 0.1 | F0 = 0.2, CR = 0.2 | F0 = 0.4, CR = 0.2 | F0 = 0.6, CR = 0.2
F1 | Mean | 3.0000 × 10^2 | 3.0000 × 10^2 | 3.0000 × 10^2 | 3.0000 × 10^2 | 3.0000 × 10^2 | 3.0000 × 10^2
F1 | Std. | 2.7927 × 10^−14 | 3.5009 × 10^−14 | 3.1667 × 10^−14 | 3.1667 × 10^−14 | 2.5856 × 10^−14 | 2.3603 × 10^−14
F1 | Rank | 3 | 6 | 4 | 5 | 2 | 1
F2 | Mean | 4.0547 × 10^2 | 4.0515 × 10^2 | 4.0495 × 10^2 | 4.0488 × 10^2 | 4.0553 × 10^2 | 4.0513 × 10^2
F2 | Std. | 1.2084 × 10^0 | 1.7468 × 10^0 | 1.0827 × 10^0 | 1.4707 × 10^0 | 1.5629 × 10^0 | 1.1028 × 10^0
F2 | Rank | 4 | 5 | 2 | 1 | 6 | 3
F3 | Mean | 6.0000 × 10^2 | 6.0000 × 10^2 | 6.0000 × 10^2 | 6.0000 × 10^2 | 6.0000 × 10^2 | 6.0000 × 10^2
F3 | Std. | 3.3057 × 10^−4 | 3.2125 × 10^−3 | 7.9021 × 10^−4 | 7.8667 × 10^−4 | 6.6565 × 10^−3 | 2.2567 × 10^−3
F3 | Rank | 1 | 4 | 3 | 2 | 6 | 5
F4 | Mean | 8.2569 × 10^2 | 8.2238 × 10^2 | 8.2512 × 10^2 | 8.2345 × 10^2 | 8.2410 × 10^2 | 8.2589 × 10^2
F4 | Std. | 7.9754 × 10^0 | 8.2391 × 10^0 | 1.0405 × 10^1 | 1.0276 × 10^1 | 9.2398 × 10^0 | 9.9054 × 10^0
F4 | Rank | 5 | 2 | 4 | 3 | 1 | 6
F5 | Mean | 9.0197 × 10^2 | 9.0689 × 10^2 | 9.0596 × 10^2 | 9.0184 × 10^2 | 9.0112 × 10^2 | 9.0309 × 10^2
F5 | Std. | 3.4169 × 10^0 | 1.5632 × 10^1 | 8.5332 × 10^0 | 2.4193 × 10^0 | 1.2816 × 10^0 | 5.7967 × 10^0
F5 | Rank | 3 | 6 | 5 | 1 | 2 | 4
F6 | Mean | 4.5076 × 10^3 | 4.4476 × 10^3 | 4.2687 × 10^3 | 4.1147 × 10^3 | 4.0999 × 10^3 | 4.3517 × 10^3
F6 | Std. | 2.2630 × 10^3 | 2.0903 × 10^3 | 2.0140 × 10^3 | 2.0094 × 10^3 | 2.0058 × 10^3 | 2.1164 × 10^3
F6 | Rank | 6 | 4 | 5 | 1 | 2 | 3
F7 | Mean | 2.0130 × 10^3 | 2.0176 × 10^3 | 2.0185 × 10^3 | 2.0163 × 10^3 | 2.0171 × 10^3 | 2.0163 × 10^3
F7 | Std. | 9.5358 × 10^0 | 6.5162 × 10^0 | 5.1413 × 10^0 | 8.1505 × 10^0 | 7.1656 × 10^0 | 7.7639 × 10^0
F7 | Rank | 1 | 5 | 3 | 4 | 6 | 2
F8 | Mean | 2.2178 × 10^3 | 2.2180 × 10^3 | 2.2177 × 10^3 | 2.2166 × 10^3 | 2.2150 × 10^3 | 2.2159 × 10^3
F8 | Std. | 6.5593 × 10^0 | 6.5969 × 10^0 | 7.1240 × 10^0 | 7.7401 × 10^0 | 8.9153 × 10^0 | 8.1584 × 10^0
F8 | Rank | 2 | 3 | 6 | 4 | 1 | 5
F9 | Mean | 2.5081 × 10^3 | 2.5027 × 10^3 | 2.5009 × 10^3 | 2.5123 × 10^3 | 2.5061 × 10^3 | 2.5035 × 10^3
F9 | Std. | 4.0310 × 10^0 | 7.0478 × 10^0 | 4.8576 × 10^0 | 4.6156 × 10^0 | 8.3110 × 10^0 | 6.7890 × 10^0
F9 | Rank | 5 | 2 | 1 | 6 | 4 | 3
F10 | Mean | 2.4972 × 10^3 | 2.5242 × 10^3 | 2.5046 × 10^3 | 2.5163 × 10^3 | 2.5044 × 10^3 | 2.5165 × 10^3
F10 | Std. | 1.7657 × 10^1 | 4.8467 × 10^1 | 2.2985 × 10^1 | 4.1522 × 10^1 | 2.1825 × 10^1 | 4.1856 × 10^1
F10 | Rank | 2 | 4 | 5 | 3 | 1 | 6
F11 | Mean | 2.6231 × 10^3 | 2.6183 × 10^3 | 2.6299 × 10^3 | 2.6002 × 10^3 | 2.6249 × 10^3 | 2.6110 × 10^3
F11 | Std. | 4.8031 × 10^1 | 4.6247 × 10^1 | 5.3975 × 10^1 | 9.0182 × 10^−1 | 5.6633 × 10^1 | 3.4876 × 10^1
F11 | Rank | 6 | 4 | 5 | 1 | 3 | 2
F12 | Mean | 2.8573 × 10^3 | 2.8564 × 10^3 | 2.8553 × 10^3 | 2.8585 × 10^3 | 2.8562 × 10^3 | 2.8557 × 10^3
F12 | Std. | 3.0571 × 10^0 | 4.0336 × 10^0 | 2.6506 × 10^0 | 2.8718 × 10^0 | 2.7627 × 10^0 | 2.5544 × 10^0
F12 | Rank | 5 | 3 | 1 | 6 | 4 | 2
Mean Rank | | 3.58 | 4.00 | 3.67 | 3.08 | 3.17 | 3.50
Result | | 4 | 6 | 5 | 1 | 2 | 3
Table 4. Algorithm parameters.
Algorithm | Parameter Values
DBO | k = 0.1; α is 1 or −1; R decreases from 1.
HHO | E0 is a random value in [−1, 1].
GWO | Parameter a decreases from 2 to 0.
WOA | a decreases from 1 to 0; b = 2.
IHHO | p = 0.5; J ∈ [0, 2]; λ = 0.3; ξ = 2.
MFO | t ∈ [−1, 1]; b = 1.
INFO | c = 2; d = 4.
POA | I is a random integer of 1 or 2; R likewise.
SCA | a = 2.
Table 5. Comparison of 10 algorithms for solving the CEC2022 test set.
F | Index | CSLDDBO | HHO | DBO | GWO | WOA | IHHO | MFO | INFO | POA | SCA
F1 | Median | 3.0000 × 10^2 | 3.0085 × 10^2 | 3.0000 × 10^2 | 4.4900 × 10^2 | 1.0209 × 10^4 | 3.0078 × 10^2 | 2.2642 × 10^3 | 3.0000 × 10^2 | 3.4132 × 10^2 | 8.7713 × 10^2
F1 | IQR | 0.0000 × 10^0 | 6.8210 × 10^−1 | 1.6200 × 10^−11 | 1.3426 × 10^3 | 7.8154 × 10^3 | 3.8360 × 10^−1 | 7.2122 × 10^3 | 6.0000 × 10^−14 | 5.8045 × 10^1 | 2.6696 × 10^2
F1 | Best | 3.0000 × 10^2 | 3.0034 × 10^2 | 3.0000 × 10^2 | 3.0549 × 10^2 | 2.0623 × 10^3 | 3.0027 × 10^2 | 3.0000 × 10^2 | 3.0000 × 10^2 | 3.0060 × 10^2 | 5.2351 × 10^2
F1 | Mean | 3.0000 × 10^2 | 3.0096 × 10^2 | 3.0086 × 10^2 | 1.3082 × 10^3 | 1.0407 × 10^4 | 3.0089 × 10^2 | 4.9682 × 10^3 | 3.0000 × 10^2 | 3.5235 × 10^2 | 9.0036 × 10^2
F1 | Std. | 2.3603 × 10^−14 | 3.9220 × 10^−1 | 2.7061 × 10^0 | 1.4553 × 10^3 | 6.4672 × 10^3 | 3.8540 × 10^−1 | 6.5894 × 10^3 | 6.1549 × 10^−14 | 4.9754 × 10^1 | 2.4679 × 10^2
F1 | Rank | 1 | 4 | 3 | 8 | 10 | 5 | 6 | 2 | 7 | 9
F2 | Median | 4.0485 × 10^2 | 4.0618 × 10^2 | 4.0892 × 10^2 | 4.1046 × 10^2 | 4.0900 × 10^2 | 4.0894 × 10^2 | 4.0824 × 10^2 | 4.0645 × 10^2 | 4.0848 × 10^2 | 4.5906 × 10^2
F2 | IQR | 1.6845 × 10^0 | 8.4757 × 10^0 | 3.7468 × 10^0 | 2.6369 × 10^0 | 6.1524 × 10^1 | 5.5403 × 10^1 | 1.1459 × 10^1 | 4.9295 × 10^0 | 9.6228 × 10^0 | 2.4575 × 10^1
F2 | Best | 4.0010 × 10^2 | 4.0006 × 10^2 | 4.0039 × 10^2 | 4.0342 × 10^2 | 4.0044 × 10^2 | 4.0005 × 10^2 | 4.0036 × 10^2 | 4.0000 × 10^2 | 4.0003 × 10^2 | 4.3286 × 10^2
F2 | Mean | 4.0463 × 10^2 | 4.1684 × 10^2 | 4.1576 × 10^2 | 4.1310 × 10^2 | 4.2822 × 10^2 | 4.2426 × 10^2 | 4.1525 × 10^2 | 4.0592 × 10^2 | 4.1181 × 10^2 | 4.5667 × 10^2
F2 | Std. | 1.9071 × 10^0 | 2.6610 × 10^1 | 2.3028 × 10^1 | 1.0876 × 10^1 | 3.0941 × 10^1 | 3.0186 × 10^1 | 1.6810 × 10^1 | 3.3008 × 10^0 | 1.9535 × 10^1 | 1.9242 × 10^1
F2 | Rank | 1 | 4 | 5 | 9 | 8 | 6 | 7 | 2 | 3 | 10
F3 | Median | 6.0000 × 10^2 | 6.2766 × 10^2 | 6.0500 × 10^2 | 6.0008 × 10^2 | 6.3157 × 10^2 | 6.2655 × 10^2 | 6.0025 × 10^2 | 6.0000 × 10^2 | 6.1399 × 10^2 | 6.1690 × 10^2
F3 | IQR | 8.0608 × 10^−4 | 1.5026 × 10^1 | 7.3713 × 10^0 | 4.8650 × 10^−1 | 2.0503 × 10^1 | 2.1439 × 10^1 | 1.2363 × 10^0 | 3.1654 × 10^−3 | 1.2778 × 10^1 | 4.9650 × 10^0
F3 | Best | 6.0000 × 10^2 | 6.1253 × 10^2 | 6.0000 × 10^2 | 6.0003 × 10^2 | 6.1061 × 10^2 | 6.0305 × 10^2 | 6.0000 × 10^2 | 6.0000 × 10^2 | 6.0038 × 10^2 | 6.1123 × 10^2
F3 | Mean | 6.0000 × 10^2 | 6.2819 × 10^2 | 6.0595 × 10^2 | 6.0043 × 10^2 | 6.3215 × 10^2 | 6.2567 × 10^2 | 6.0199 × 10^2 | 6.0005 × 10^2 | 6.1563 × 10^2 | 6.1770 × 10^2
F3 | Std. | 2.2269 × 10^−3 | 1.0102 × 10^1 | 5.8179 × 10^0 | 7.5521 × 10^−1 | 1.3540 × 10^1 | 1.3196 × 10^1 | 4.2576 × 10^0 | 2.2846 × 10^−1 | 9.4222 × 10^0 | 3.4901 × 10^0
F3 | Rank | 1 | 9 | 5 | 4 | 10 | 8 | 3 | 2 | 6 | 7
F4 | Median | 8.2388 × 10^2 | 8.2941 × 10^2 | 8.2389 × 10^2 | 8.1144 × 10^2 | 8.3454 × 10^2 | 8.2547 × 10^2 | 8.2912 × 10^2 | 8.1343 × 10^2 | 8.1990 × 10^2 | 8.3678 × 10^2
F4 | IQR | 7.9596 × 10^0 | 7.9585 × 10^0 | 1.5919 × 10^1 | 7.9585 × 10^0 | 1.3930 × 10^1 | 1.0915 × 10^1 | 1.3631 × 10^1 | 8.9546 × 10^0 | 8.0450 × 10^0 | 1.1563 × 10^1
F4 | Best | 8.0696 × 10^2 | 8.1506 × 10^2 | 8.1293 × 10^2 | 8.0514 × 10^2 | 8.0908 × 10^2 | 8.1404 × 10^2 | 8.0796 × 10^2 | 8.0497 × 10^2 | 8.0597 × 10^2 | 8.2550 × 10^2
F4 | Mean | 8.2346 × 10^2 | 8.2911 × 10^2 | 8.2681 × 10^2 | 8.1376 × 10^2 | 8.3366 × 10^2 | 8.2514 × 10^2 | 8.3044 × 10^2 | 8.1474 × 10^2 | 8.1880 × 10^2 | 8.3660 × 10^2
F4 | Std. | 8.4639 × 10^0 | 7.7850 × 10^0 | 9.2693 × 10^0 | 7.3020 × 10^0 | 1.2251 × 10^1 | 6.5477 × 10^0 | 1.0575 × 10^1 | 6.2407 × 10^0 | 5.6236 × 10^0 | 6.3956 × 10^0
F4 | Rank | 4 | 7 | 5 | 1 | 9 | 6 | 8 | 2 | 3 | 10
F5 | Median | 9.0094 × 10^2 | 1.3140 × 10^3 | 9.0615 × 10^2 | 9.0107 × 10^2 | 1.2480 × 10^3 | 1.3565 × 10^3 | 9.2602 × 10^2 | 9.0212 × 10^2 | 9.6627 × 10^2 | 9.9405 × 10^2
F5 | IQR | 1.2191 × 10^0 | 2.6284 × 10^2 | 9.0716 × 10^0 | 1.3354 × 10^0 | 2.1871 × 10^2 | 2.9792 × 10^2 | 2.3447 × 10^2 | 9.0840 × 10^0 | 2.1163 × 10^2 | 5.5847 × 10^1
F5 | Best | 9.0000 × 10^2 | 9.9517 × 10^2 | 9.0054 × 10^2 | 9.0002 × 10^2 | 9.7925 × 10^2 | 9.9674 × 10^2 | 9.0000 × 10^2 | 9.0000 × 10^2 | 9.0009 × 10^2 | 9.3528 × 10^2
F5 | Mean | 9.0120 × 10^2 | 1.3373 × 10^3 | 9.2430 × 10^2 | 9.0413 × 10^2 | 1.3112 × 10^3 | 1.3267 × 10^3 | 1.0385 × 10^3 | 9.0807 × 10^2 | 1.0335 × 10^3 | 1.0001 × 10^3
F5 | Std. | 1.0815 × 10^0 | 1.6194 × 10^2 | 5.5428 × 10^1 | 9.0135 × 10^0 | 2.6945 × 10^2 | 1.7934 × 10^2 | 2.4208 × 10^2 | 1.2889 × 10^1 | 1.3724 × 10^2 | 6.6067 × 10^1
F5 | Rank | 1 | 10 | 4 | 2 | 8 | 9 | 5 | 3 | 6 | 7
F6 | Median | 3.9270 × 10^3 | 2.5004 × 10^3 | 5.1824 × 10^3 | 5.1725 × 10^3 | 2.5907 × 10^3 | 2.8218 × 10^3 | 5.2374 × 10^3 | 1.8157 × 10^3 | 1.9279 × 10^3 | 1.6204 × 10^6
F6 | IQR | 2.9407 × 10^3 | 1.6873 × 10^3 | 2.8432 × 10^3 | 4.4973 × 10^3 | 1.8697 × 10^3 | 1.9660 × 10^3 | 4.2366 × 10^3 | 1.5222 × 10^1 | 7.5852 × 10^1 | 2.4739 × 10^6
F6 | Best | 1.8066 × 10^3 | 1.9298 × 10^3 | 1.8296 × 10^3 | 1.9193 × 10^3 | 1.9112 × 10^3 | 1.8930 × 10^3 | 1.9348 × 10^3 | 1.8014 × 10^3 | 1.8641 × 10^3 | 1.8332 × 10^5
F6 | Mean | 4.3012 × 10^3 | 3.4123 × 10^3 | 4.9202 × 10^3 | 5.3927 × 10^3 | 3.4725 × 10^3 | 3.2129 × 10^3 | 5.1222 × 10^3 | 1.8214 × 10^3 | 2.1683 × 10^3 | 2.1617 × 10^6
F6 | Std. | 2.2671 × 10^3 | 2.0031 × 10^3 | 1.9291 × 10^3 | 2.1304 × 10^3 | 1.9434 × 10^3 | 1.3321 × 10^3 | 2.1036 × 10^3 | 1.8006 × 10^1 | 9.0500 × 10^2 | 1.7356 × 10^6
F6 | Rank | 6 | 4 | 7 | 9 | 5 | 3 | 8 | 1 | 2 | 10
F7 | Median | 2.0200 × 10^3 | 2.0457 × 10^3 | 2.0239 × 10^3 | 2.0251 × 10^3 | 2.0639 × 10^3 | 2.0490 × 10^3 | 2.0223 × 10^3 | 2.0210 × 10^3 | 2.0287 × 10^3 | 2.0524 × 10^3
F7 | IQR | 5.8328 × 10^−2 | 2.9625 × 10^1 | 1.5339 × 10^1 | 7.6266 × 10^0 | 2.8280 × 10^1 | 2.8730 × 10^1 | 4.7552 × 10^0 | 8.6236 × 10^0 | 1.3976 × 10^1 | 6.2804 × 10^0
F7 | Best | 2.0000 × 10^3 | 2.0220 × 10^3 | 2.0050 × 10^3 | 2.0011 × 10^3 | 2.0226 × 10^3 | 2.0126 × 10^3 | 2.0207 × 10^3 | 2.0010 × 10^3 | 2.0170 × 10^3 | 2.0414 × 10^3
F7 | Mean | 2.0182 × 10^3 | 2.0512 × 10^3 | 2.0304 × 10^3 | 2.0262 × 10^3 | 2.0637 × 10^3 | 2.0516 × 10^3 | 2.0286 × 10^3 | 2.0174 × 10^3 | 2.0300 × 10^3 | 2.0535 × 10^3
F7 | Std. | 5.9385 × 10^0 | 2.8783 × 10^1 | 1.5330 × 10^1 | 8.7531 × 10^0 | 2.2986 × 10^1 | 1.9674 × 10^1 | 1.3518 × 10^1 | 8.2565 × 10^0 | 9.2051 × 10^0 | 6.5623 × 10^0
F7 | Rank | 1 | 7 | 5 | 3 | 10 | 8 | 4 | 2 | 6 | 9
F8 | Median | 2.2205 × 10^3 | 2.2269 × 10^3 | 2.2226 × 10^3 | 2.2250 × 10^3 | 2.2305 × 10^3 | 2.2276 × 10^3 | 2.2249 × 10^3 | 2.2207 × 10^3 | 2.2221 × 10^3 | 2.2317 × 10^3
F8 | IQR | 7.1478 × 10^−1 | 5.5596 × 10^0 | 4.6460 × 10^0 | 4.6333 × 10^0 | 7.9595 × 10^0 | 6.0778 × 10^0 | 5.6920 × 10^0 | 9.7991 × 10^−1 | 5.5605 × 10^0 | 3.6522 × 10^0
F8 | Best | 2.2003 × 10^3 | 2.2167 × 10^3 | 2.2052 × 10^3 | 2.2056 × 10^3 | 2.2240 × 10^3 | 2.2215 × 10^3 | 2.2203 × 10^3 | 2.2001 × 10^3 | 2.2043 × 10^3 | 2.2247 × 10^3
F8 | Mean | 2.2166 × 10^3 | 2.2309 × 10^3 | 2.2258 × 10^3 | 2.2235 × 10^3 | 2.2317 × 10^3 | 2.2299 × 10^3 | 2.2258 × 10^3 | 2.2200 × 10^3 | 2.2203 × 10^3 | 2.2316 × 10^3
F8 | Std. | 8.0300 × 10^0 | 1.2210 × 10^1 | 2.3157 × 10^1 | 5.6982 × 10^0 | 5.7404 × 10^0 | 8.6614 × 10^0 | 3.9086 × 10^0 | 3.7897 × 10^0 | 6.5349 × 10^0 | 2.8369 × 10^0
F8 | Rank | 1 | 7 | 4 | 5 | 9 | 8 | 6 | 2 | 3 | 10
F9 | Median | 2.5080 × 10^3 | 2.5384 × 10^3 | 2.5293 × 10^3 | 2.5308 × 10^3 | 2.5343 × 10^3 | 2.5366 × 10^3 | 2.5293 × 10^3 | 2.5293 × 10^3 | 2.5293 × 10^3 | 2.5506 × 10^3
F9 | IQR | 6.0732 × 10^0 | 1.8622 × 10^1 | 3.1477 × 10^0 | 4.0773 × 10^1 | 4.0988 × 10^1 | 1.7089 × 10^1 | 0.0000 × 10^0 | 0.0000 × 10^0 | 9.6768 × 10^−1 | 1.8647 × 10^1
F9 | Best | 2.4986 × 10^3 | 2.5293 × 10^3 | 2.5293 × 10^3 | 2.5293 × 10^3 | 2.5293 × 10^3 | 2.5293 × 10^3 | 2.5293 × 10^3 | 2.5293 × 10^3 | 2.5293 × 10^3 | 2.5368 × 10^3
F9 | Mean | 2.5099 × 10^3 | 2.5457 × 10^3 | 2.5323 × 10^3 | 2.5480 × 10^3 | 2.5519 × 10^3 | 2.5408 × 10^3 | 2.5324 × 10^3 | 2.5342 × 10^3 | 2.5322 × 10^3 | 2.5540 × 10^3
F9 | Std. | 5.2546 × 10^0 | 2.7209 × 10^1 | 6.0187 × 10^0 | 2.2461 × 10^1 | 3.1301 × 10^1 | 1.2935 × 10^1 | 1.1868 × 10^1 | 2.6826 × 10^1 | 8.2775 × 10^0 | 1.3193 × 10^1
F9 | Rank | 1 | 8 | 4 | 7 | 9 | 6 | 3 | 2 | 5 | 10
F10 | Median | 2.5003 × 10^3 | 2.5010 × 10^3 | 2.5009 × 10^3 | 2.6094 × 10^3 | 2.5011 × 10^3 | 2.5010 × 10^3 | 2.5008 × 10^3 | 2.5005 × 10^3 | 2.5006 × 10^3 | 2.5016 × 10^3
F10 | IQR | 8.0465 × 10^−2 | 1.2778 × 10^2 | 1.1765 × 10^2 | 1.1129 × 10^3 | 1.1173 × 10^3 | 1.2472 × 10^2 | 9.6906 × 10^0 | 1.1657 × 10^2 | 4.4324 × 10^−1 | 4.2404 × 10^−1
F10 | Best | 2.5002 × 10^3 | 2.4375 × 10^3 | 2.5004 × 10^3 | 2.5003 × 10^3 | 2.5003 × 10^3 | 2.5004 × 10^3 | 2.5003 × 10^3 | 2.5003 × 10^3 | 2.5003 × 10^3 | 2.5007 × 10^3
F10 | Mean | 2.5082 × 10^3 | 2.5516 × 10^3 | 2.5392 × 10^3 | 2.5642 × 10^3 | 2.5607 × 10^3 | 2.5437 × 10^3 | 2.5215 × 10^3 | 2.5482 × 10^3 | 2.5173 × 10^3 | 2.5062 × 10^3
F10 | Std. | 2.9838 × 10^1 | 6.8928 × 10^1 | 5.9878 × 10^1 | 5.6803 × 10^1 | 1.2277 × 10^2 | 6.1766 × 10^1 | 4.8907 × 10^1 | 5.9698 × 10^1 | 4.3059 × 10^1 | 2.5316 × 10^1
F10 | Rank | 1 | 9 | 6 | 4 | 7 | 8 | 5 | 3 | 2 | 10
F11 | Median | 2.6000 × 10^3 | 2.7506 × 10^3 | 2.6000 × 10^3 | 2.7309 × 10^3 | 2.7511 × 10^3 | 2.7506 × 10^3 | 2.7508 × 10^3 | 2.6000 × 10^3 | 2.7228 × 10^3 | 2.7688 × 10^3
F11 | IQR | 4.8133 × 10^−4 | 3.0780 × 10^2 | 1.5047 × 10^2 | 1.8278 × 10^2 | 1.3772 × 10^2 | 1.4603 × 10^2 | 1.6677 × 10^2 | 1.5043 × 10^2 | 1.2992 × 10^2 | 9.6115 × 10^0
F11 | Best | 2.6000 × 10^3 | 2.6040 × 10^3 | 2.6000 × 10^3 | 2.6006 × 10^3 | 2.6014 × 10^3 | 2.6030 × 10^3 | 2.6000 × 10^3 | 2.6000 × 10^3 | 2.6013 × 10^3 | 2.7590 × 10^3
F11 | Mean | 2.6169 × 10^3 | 2.7864 × 10^3 | 2.7018 × 10^3 | 2.7834 × 10^3 | 2.7483 × 10^3 | 2.7135 × 10^3 | 2.7079 × 10^3 | 2.6879 × 10^3 | 2.6825 × 10^3 | 2.7705 × 10^3
F11 | Std. | 4.4697 × 10^1 | 1.6599 × 10^2 | 1.4794 × 10^2 | 1.5621 × 10^2 | 1.1319 × 10^2 | 1.2748 × 10^2 | 8.5734 × 10^1 | 1.4996 × 10^2 | 8.3664 × 10^1 | 6.5652 × 10^0
F11 | Rank | 1 | 9 | 3 | 7 | 8 | 6 | 5 | 2 | 4 | 10
F12 | Median | 2.8597 × 10^3 | 2.8805 × 10^3 | 2.8654 × 10^3 | 2.8642 × 10^3 | 2.8687 × 10^3 | 2.8799 × 10^3 | 2.8635 × 10^3 | 2.8641 × 10^3 | 2.8649 × 10^3 | 2.8688 × 10^3
F12 | IQR | 4.6049 × 10^0 | 5.5301 × 10^1 | 3.9591 × 10^0 | 1.7836 × 10^0 | 1.1572 × 10^1 | 2.2354 × 10^1 | 1.5751 × 10^0 | 1.4963 × 10^0 | 2.7423 × 10^0 | 2.9024 × 10^0
F12 | Best | 2.8540 × 10^3 | 2.8630 × 10^3 | 2.8626 × 10^3 | 2.8594 × 10^3 | 2.8609 × 10^3 | 2.8631 × 10^3 | 2.8600 × 10^3 | 2.8607 × 10^3 | 2.8594 × 10^3 | 2.8650 × 10^3
F12 | Mean | 2.8592 × 10^3 | 2.8998 × 10^3 | 2.8669 × 10^3 | 2.8659 × 10^3 | 2.8782 × 10^3 | 2.8902 × 10^3 | 2.8634 × 10^3 | 2.8639 × 10^3 | 2.8653 × 10^3 | 2.8688 × 10^3
F12 | Std. | 2.8103 × 10^0 | 5.0029 × 10^1 | 4.1838 × 10^0 | 6.1777 × 10^0 | 2.3730 × 10^1 | 2.6922 × 10^1 | 1.2255 × 10^0 | 1.3995 × 10^0 | 3.1091 × 10^0 | 1.7218 × 10^0
F12 | Rank | 1 | 9 | 6 | 5 | 7 | 10 | 2 | 3 | 4 | 8
Mean Rank | | 1.67 | 7.25 | 4.75 | 5.33 | 8.33 | 6.92 | 5.17 | 2.17 | 4.25 | 9.17
Result | | 1 | 8 | 4 | 6 | 9 | 7 | 5 | 2 | 3 | 10
Table 6. p values of each algorithm (sign in parentheses; a few signs were lost in the source and are left blank).
F | HHO | DBO | GWO | WOA | IHHO | MFO | INFO | POA | SCA
F1 | 5.1436 × 10^−12 (+) | 1.2193 × 10^−5 (+) | 5.1436 × 10^−12 (+) | 5.1436 × 10^−12 (+) | 5.1436 × 10^−12 (+) | 3.8376 × 10^−4 (+) | 7.6743 × 10^−5 (+) | 5.1436 × 10^−12 (+) | 5.1436 × 10^−12 (+)
F2 | 4.0354 × 10^−1 (+) | 1.2491 × 10^−7 (+) | 4.1997 × 10^−10 (+) | 4.1178 × 10^−6 (+) | 2.2658 × 10^−3 (+) | 2.1391 × 10^−9 (+) | 5.1802 × 10^−1 (=) | 9.6263 × 10^−2 (=) | 3.0199 × 10^−11 (+)
F3 | 2.9897 × 10^−11 (+) | 1.0834 × 10^−10 (+) | 2.9897 × 10^−11 (+) | 2.9897 × 10^−11 (+) | 2.9897 × 10^−11 (+) | 1.4041 × 10^−6 (+) | 1.3603 × 10^−4 (+) | 2.9897 × 10^−11 (+) | 2.9897 × 10^−11 (+)
F4 | 4.8554 × 10^−3 (+) | 3.5543 × 10^−1 (+) | 1.5288 × 10^−5 | 2.5301 × 10^−4 (+) | 3.0417 × 10^−1 (+) | 5.8261 × 10^−3 (+) | 7.8612 × 10^−5 | 3.6436 × 10^−2 | 1.2536 × 10^−7 (+)
F5 | 3.0161 × 10^−11 (+) | 6.2560 × 10^−8 (+) | 6.4141 × 10^−1 (=) | 3.0161 × 10^−11 (+) | 3.0161 × 10^−11 (+) | 2.5196 × 10^−4 (+) | 1.1877 × 10^−1 (=) | 3.4936 × 10^−9 (+) | 3.0161 × 10^−11 (+)
F6 | 1.9073 × 10^−1 | 1.3345 × 10^−1 (=) | 2.0681 × 10^−2 (+) | 2.3399 × 10^−1 | 8.5000 × 10^−2 (=) | 3.9167 × 10^−2 (+) | 7.3803 × 10^−10 | 4.7445 × 10^−6 | 3.0199 × 10^−11 (+)
F7 | 3.0199 × 10^−11 (+) | 4.1997 × 10^−10 (+) | 6.5277 × 10^−8 (+) | 3.0199 × 10^−11 (+) | 4.1997 × 10^−10 (+) | 3.3384 × 10^−11 (+) | 1.1228 × 10^−2 (+) | 8.8910 × 10^−10 (+) | 3.0199 × 10^−11 (+)
F8 | 3.1589 × 10^−10 (+) | 8.2919 × 10^−6 (+) | 2.3897 × 10^−8 (+) | 3.0199 × 10^−11 (+) | 3.0199 × 10^−11 (+) | 7.3803 × 10^−10 (+) | 2.9205 × 10^−2 (+) | 2.1265 × 10^−4 (+) | 3.0199 × 10^−11 (+)
F9 | 3.0199 × 10^−11 (+) | 2.3967 × 10^−11 (+) | 3.0199 × 10^−11 (+) | 3.0199 × 10^−11 (+) | 3.0199 × 10^−11 (+) | 7.8511 × 10^−12 (+) | 1.7203 × 10^−12 (+) | 3.0199 × 10^−11 (+) | 3.0199 × 10^−11 (+)
F10 | 1.4294 × 10^−8 (+) | 2.4386 × 10^−9 (+) | 5.8737 × 10^−4 (+) | 8.4848 × 10^−9 (+) | 2.4386 × 10^−9 (+) | 3.3520 × 10^−8 (+) | 8.8829 × 10^−6 (+) | 3.3242 × 10^−6 (+) | 7.1186 × 10^−9 (+)
F11 | 5.6856 × 10^−10 (+) | 7.5825 × 10^−5 (+) | 6.2445 × 10^−9 (+) | 1.0997 × 10^−9 (+) | 9.7341 × 10^−9 (+) | 2.4243 × 10^−2 (+) | 1.9100 × 10^−1 (=) | 1.7245 × 10^−7 (+) | 1.7883 × 10^−11 (+)
F12 | 3.0199 × 10^−11 (+) | 4.9691 × 10^−11 (+) | 5.0723 × 10^−10 (+) | 3.8202 × 10^−10 (+) | 3.0199 × 10^−11 (+) | 1.3014 × 10^−8 (+) | 3.4763 × 10^−9 (+) | 2.6695 × 10^−9 (+) | 3.0199 × 10^−11 (+)
+/=/− | 12/0/0 | 11/1/0 | 10/1/1 | 11/0/1 | 11/1/0 | 12/0/0 | 8/2/2 | 9/1/2 | 12/0/0
Table 7. Average running times of 10 algorithms (seconds).
F | CSLDDBO | HHO | DBO | GWO | WOA | IHHO | MFO | INFO | POA | SCA
F1 | 24.6407 | 10.7344 | 7.2813 | 5.0313 | 2.9219 | 22.8283 | 4.9375 | 16.3751 | 8.5938 | 4.7969
F2 | 33.0312 | 15.0000 | 9.6094 | 6.4688 | 4.2031 | 30.5314 | 6.2188 | 21.9688 | 12.2187 | 6.3594
F3 | 25.4843 | 12.2969 | 6.8594 | 5.4688 | 4.0313 | 26.9219 | 5.2812 | 12.9375 | 9.3752 | 5.0781
F4 | 13.0938 | 5.6719 | 3.4219 | 2.5000 | 1.6718 | 12.0937 | 2.5469 | 7.6563 | 4.0781 | 2.5469
F5 | 17.6563 | 8.2032 | 4.9063 | 3.5625 | 2.4218 | 18.0468 | 3.7813 | 10.4219 | 5.9219 | 3.3750
F6 | 27.1563 | 12.5313 | 7.9844 | 5.2344 | 3.1875 | 25.8438 | 5.5781 | 17.3750 | 9.0469 | 5.0312
F7 | 26.0469 | 12.1251 | 6.4375 | 5.1719 | 4.0313 | 27.5002 | 5.1875 | 12.0782 | 9.6876 | 5.1719
F8 | 29.7190 | 13.6406 | 6.8906 | 5.3906 | 3.9219 | 29.1408 | 5.4375 | 12.7813 | 10.5625 | 5.7969
F9 | 28.1876 | 12.7813 | 7.9219 | 5.9688 | 4.2188 | 29.3126 | 5.5938 | 12.5939 | 9.6095 | 5.5781
F10 | 29.9689 | 15.1876 | 8.7031 | 6.4531 | 4.7187 | 33.9533 | 6.2969 | 16.3595 | 11.6875 | 6.0156
F11 | 34.8128 | 16.9844 | 8.7500 | 7.0312 | 5.5000 | 38.0471 | 7.1094 | 15.3593 | 12.9376 | 6.8438
F12 | 25.0784 | 11.2500 | 6.5313 | 4.8750 | 4.2344 | 25.9378 | 5.5157 | 10.6563 | 8.6406 | 5.1407
Mean Time | 26.2396 | 12.2002 | 7.1080 | 5.2630 | 3.7552 | 26.6798 | 5.2903 | 13.8802 | 9.3633 | 5.1445
Table 8. SO2 emissions and related influencing factors.
Time | SO2 Emissions (10,000 t) | Proportion of Industrial Output Value in GDP (%) | Energy Consumption per Unit GDP (t/10,000 CNY) | Industrial SO2 Emission Intensity (t/10,000 CNY) | Proportion of Non-Clean Energy Consumption (%)
2012 | 2118 | 45.42 | 0.75 | 0.0087 | 85.5
2013 | 2043.9 | 44.18 | 0.70 | 0.0078 | 84.5
2014 | 1974.4 | 43.09 | 0.67 | 0.0071 | 83
2015 | 1859.1 | 40.84 | 0.63 | 0.0066 | 82
2016 | 854.89 | 39.58 | 0.59 | 0.0029 | 80.3
2017 | 610.84 | 39.85 | 0.55 | 0.0018 | 79.5
2018 | 516.12 | 39.69 | 0.51 | 0.0014 | 77.9
2019 | 457.29 | 38.59 | 0.49 | 0.0012 | 76.7
2020 | 318.22 | 37.84 | 0.49 | 0.0008 | 75.7
2021 | 274.78 | 39.43 | 0.46 | 0.0006 | 74.5
Data sources: The dependent variable data come from the China Statistical Yearbook 2021, while the independent variable data come from the Environmental Statistics Annual Report and the National Environmental Statistics Bulletin over the years.
Table 9. Completeness of coverage.
Evaluation Dimension | Training Set Range | Test Set Range | Coverage Result
Proportion of industrial output value in GDP (%) | 39.58–45.42 | 37.84–38.59 | Complete coverage
Energy consumption per unit of GDP (t/10,000 CNY) | 0.51–0.75 | 0.49 | Boundary coverage
Industrial SO2 emission intensity (t/10,000 CNY) | 0.0014–0.0087 | 0.0008–0.0012 | Complete coverage
Proportion of non-renewable energy consumption (%) | 77.9–85.5 | 75.7–76.7 | Complete coverage
Table 10. Values of SO2 initialization data.
Time | X1 | X2 | X3 | X4 | X5
2012 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000
2013 | 0.9650 | 0.9727 | 0.9333 | 0.8966 | 0.9883
2014 | 0.9322 | 0.9487 | 0.8933 | 0.8160 | 0.9708
2015 | 0.8778 | 0.8992 | 0.8400 | 0.7586 | 0.9591
2016 | 0.4036 | 0.8714 | 0.7867 | 0.3333 | 0.9392
2017 | 0.2884 | 0.8774 | 0.7333 | 0.2069 | 0.9298
2018 | 0.2437 | 0.8738 | 0.6800 | 0.1609 | 0.9111
2019 | 0.2159 | 0.8496 | 0.6533 | 0.1379 | 0.8971
2020 | 0.1502 | 0.8331 | 0.6533 | 0.0920 | 0.8854
2021 | 0.1297 | 0.8681 | 0.6133 | 0.0690 | 0.8713
Table 11. Gray absolute correlations between X1 and X2, X3, X4, and X5.
Index | E12 | E13 | E14 | E15
Value | 0.648 | 0.739 | 0.937 | 0.612
Table 12. Optimal orders (t_m, m = 1, 2, ..., 5) and optimal weights (λ_m, m = 1, 2, ..., 5) obtained by the CSLDDBO algorithm.
Index | t1 | t2 | t3 | t4 | t5
Value | −0.9979 | 2.6046 | 35.0000 | 35.0000 | −12.7324
Index | λ1 | λ2 | λ3 | λ4 | λ5
Value | 1 | 1 | 1 | 1 | 0.8714
Table 13. Accuracy evaluation.
Level | I | II | III | IV
Error | 0.01 | 0.05 | 0.10 | 0.20
Table 14. Performance evaluation metrics.
Index | Meaning | Data/Calculation Method
$y_1^{(0)}(g)$ | Raw data (RD) | Statistical data
$\hat{y}_1^{(0)}(g)$ | Simulated or predicted data (SPD) | The final recovery expression of the model
$\varepsilon(g)$ | Residual | $\varepsilon(g) = \hat{y}_1^{(0)}(g) - y_1^{(0)}(g)$
$\Delta_s(g)$ | Relative simulated percentage error of $y_1^{(0)}(g)$ (RSPE) | $\Delta_s(g) = \left| \varepsilon(g) / y_1^{(0)}(g) \right| \times 100\%$
$\bar{\Delta}_s$ | Mean relative simulated percentage error (MRSPE) | $\bar{\Delta}_s = \frac{1}{N-1} \sum_{g=2}^{N} \Delta_s(g)$
$\Delta_p(g)$ | Relative prediction percentage error of $y_1^{(0)}(g)$ (RPPE) | Similar to $\Delta_s(g)$
$\bar{\Delta}_p$ | Mean relative prediction percentage error (MRPPE) | $\bar{\Delta}_p = \frac{1}{f} \sum_{g=N+1}^{N+f} \Delta_p(g)$
$\bar{\Delta}$ | Comprehensive mean relative percentage error (CMRPE) | $\bar{\Delta} = \left[ (N-1)\bar{\Delta}_s + f \bar{\Delta}_p \right] / (N-1+f)$
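As a concrete illustration of how the Table 14 indicators combine, the following sketch computes the MRSPE, MRPPE, and CMRPE from the raw and fitted series; the array layout and function name are our own illustrative choices.

```python
import numpy as np

def error_metrics(raw, fitted, n_fit):
    """MRSPE over the fitting window, MRPPE over the prediction window,
    and their sample-size-weighted combination (CMRPE), as in Table 14.

    raw, fitted: full series of length N + f; n_fit = N fitted points.
    """
    rel = np.abs((fitted - raw) / raw) * 100.0  # relative percentage errors
    mrspe = rel[1:n_fit].mean()                 # g = 2, ..., N (index 0 is g = 1)
    mrppe = rel[n_fit:].mean()                  # g = N + 1, ..., N + f
    n_s, n_p = n_fit - 1, len(raw) - n_fit
    cmrpe = (n_s * mrspe + n_p * mrppe) / (n_s + n_p)
    return mrspe, mrppe, cmrpe
```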
Table 15. Model parameter settings.
Model | Parameter | Value
NSGM(1, N) | a: development coefficient | a = 0.736
NSGM(1, N) | b_i: driving coefficients | b5, b6, b7 = 0.050, 3.400, 234.10
NSGM(1, N) | h_i: gray action | h1, h2 = 3013.64, 1610.50
OBGM(1, N) | α: resolution | α = 0.5
OBGM(1, N) | β: gray relational degree threshold | β = 0.7
OBGM(1, N) | γ: background value coefficient | γ = 0.5
SVR | kernel function | radial basis function (RBF)
SVR | γ: RBF kernel coefficient | γ = 1
SVR | c: penalty factor | c = 1
SVR | ε: insensitive loss band width | ε = 0.1
LSTM | N: hidden units | N = 64
LSTM | L: number of layers | L = 1
LSTM | P: batch size | P = 64
LSTM | r: learning rate | r = 1 × 10^−4
Table 16. Simulated values and errors of three models.
k | x1^(0)(k) | NSGM(1, N): x̂1^(0)(k), ε(k), Δs(k) (%) | OBGM(1, N): x̂1^(0)(k), ε(k), Δs(k) (%) | PGM(1, N): x̂1^(0)(k), ε(k), Δs(k) (%)
2 | 0.9650 | 0.9420, −0.0229, 2.3779 | 0.9650, 0.0000, 0.0000 | 0.9645, −0.0004, 0.0494
3 | 0.9322 | 0.9374, 0.0052, 0.5583 | 0.7250, −0.2070, 22.2060 | 0.9327, 0.0005, 0.0549
4 | 0.8778 | 0.8826, 0.0048, 0.5506 | 0.6760, −0.2020, 23.0120 | 0.8788, 0.0010, 0.1156
5 | 0.4036 | 0.4096, 0.0060, 1.5017 | 0.2010, −0.2030, 50.2970 | 0.4040, 0.0004, 0.1013
6 | 0.2884 | 0.2929, 0.0045, 1.5746 | 0.0860, −0.2020, 70.0420 | 0.2887, 0.0003, 0.1208
7 | 0.2437 | 0.2478, 0.0041, 1.6903 | 0.0420, −0.2020, 82.8890 | 0.2435, −0.0001, 0.0686
Mean relative simulated percentage error (Δ̄s) | | 1.3756% | 59.6197% | 0.0851%
Table 17. Predicted values and errors of three models.
k | x1^(0)(k) | NSGM(1, N): x̂1^(0)(k), ε(k), Δp(k) (%) | OBGM(1, N): x̂1^(0)(k), ε(k), Δp(k) (%) | PGM(1, N): x̂1^(0)(k), ε(k), Δp(k) (%)
8 | 0.2159 | 0.2153, −0.0005, 0.2484 | 0.0130, −0.2030, 94.0250 | 0.2164, 0.0005, 0.2471
9 | 0.1502 | 0.1452, −0.0049, 3.3017 | −0.0520, −0.2020, 134.4870 | 0.1505, 0.0003, 0.2471
Mean relative prediction percentage error (Δ̄p) | | 1.7751% | 119.2394% | 0.2471%
Comprehensive mean relative percentage error (Δ̄) | | 1.3115% | 79.4930% | 0.1117%
Table 18. Actual predicted values of SO2 emissions in 2021 using three models.
k | x1^(0)(k) | NSGM(1, N) | OBGM(1, N) | PGM(1, N)
10 | 0.1297 | | |
Actual predicted value | | 0.129071 | 0.133271 | 0.129720
Table 19. Simulated values and errors of the two models.
k | x1^(0)(k) | SVR: x̂1^(0)(k), ε(k), Δs(k) (%) | LSTM: x̂1^(0)(k), ε(k), Δs(k) (%)
2 | 0.9650 | 0.9640, −0.0010, 0.1000 | 0.9631, −0.0019, 0.1942
3 | 0.9322 | 0.9350, 0.0028, 0.3040 | 0.9260, −0.0062, 0.6200
4 | 0.8778 | 0.8711, −0.0067, 0.7298 | 0.8692, −0.0086, 0.9821
5 | 0.4036 | 0.4112, 0.0076, 1.4820 | 0.4200, 0.0164, 4.1010
6 | 0.2884 | 0.2766, −0.0118, 4.0048 | 0.2983, 0.0099, 3.3247
7 | 0.2437 | 0.2317, −0.0120, 4.0908 | 0.2509, 0.0072, 2.8331
Mean relative simulated percentage error (Δ̄s) | | 1.7852% | 2.0091%
Table 20. Predicted values and errors of the two models.
k | x1^(0)(k) | SVR: x̂1^(0)(k), ε(k), Δp(k) (%) | LSTM: x̂1^(0)(k), ε(k), Δp(k) (%)
8 | 0.2159 | 0.2142, −0.0017, 0.7474 | 0.2213, 0.0054, 2.3075
9 | 0.1502 | 0.1578, 0.0076, 5.0010 | 0.1550, 0.0048, 3.1020
Mean relative prediction percentage error (Δ̄p) | | 2.8742% | 1.5510%
Comprehensive mean relative percentage error (Δ̄) | | 2.0574% | 1.8945%
Table 21. Actual predicted values of SO2 emissions in 2021 by two models.
k | x1^(0)(k) | SVR | LSTM
10 | 0.1297 | |
Actual predicted value | | 0.128715 | 0.129983
Table 22. Comparison of simulation/prediction error ranges.
Index | Error | PGM(1, N) | NSGM(1, N) | OBGM(1, N) | SVR | LSTM
MRSPE | Maximum error | 0.1208% | 2.3779% | 82.8890% | 4.0908% | 4.1010%
MRSPE | Minimum error | 0.0494% | 0.5506% | 0.0000% | 0.1000% | 0.1942%
MRSPE | Error range | 0.0714% | 1.8273% | 82.8890% | 3.9908% | 3.9068%
MRPPE | Maximum error | 0.2471% | 3.3017% | 134.4870% | 5.0010% | 3.1020%
MRPPE | Minimum error | 0.2471% | 0.2484% | 94.0250% | 0.7474% | 2.3075%
MRPPE | Error range | 0.0000% | 3.0533% | 40.4620% | 4.2536% | 0.7945%
Table 23. Initial value prediction of the independent variables Xi (i = 2, 3, 4, 5) over the next five years.
k | X2 | X3 | X4 | X5
11 | 0.8597 | 0.5958 | 0.0481 | 0.8692
12 | 0.8476 | 0.5745 | 0.0179 | 0.8511
13 | 0.8385 | 0.5445 | 0.0083 | 0.8493
14 | 0.8492 | 0.5106 | 0.0034 | 0.8337
15 | 0.8269 | 0.4769 | 0.0010 | 0.8299
Table 24. Prediction of SO2 emissions in China from 2022 to 2026.
Time | 2022 | 2023 | 2024 | 2025 | 2026
Predicted value | 238.0632 | 185.5368 | 85.1436 | 15.8850 | 10.4720