A FIG-IWOA-BiGRU Model for Bus Passenger Flow Fluctuation Trend and Spatial Prediction

Zhang, Jie; He, Qingling; Lu, Xiaojuan; Xiao, Shungen; Wang, Ning

doi:10.3390/math13193204

by Jie Zhang ¹,², Qingling He ³,*, Xiaojuan Lu ³, Shungen Xiao ¹ and Ning Wang ¹,*

¹ College of Information Engineering, Ningde Normal University, Ningde 352100, China

² School of Civil Engineering and Transportation, Northeast Forestry University, Harbin 150040, China

³ School of Traffic and Transportation, Lanzhou Jiaotong University, Lanzhou 730070, China

* Authors to whom correspondence should be addressed.

Mathematics 2025, 13(19), 3204; https://doi.org/10.3390/math13193204

Submission received: 19 August 2025 / Revised: 27 September 2025 / Accepted: 29 September 2025 / Published: 6 October 2025

Abstract

To capture bus passenger flow fluctuations and address the problems of slow convergence and high error in machine learning parameter optimization, this paper develops an improved Whale Optimization Algorithm (IWOA) integrated with a Bidirectional Gated Recurrent Unit (BiGRU). First, a Logistic–Tent chaotic mapping is introduced to generate a diverse and high-quality initial population. Second, a hybrid mechanism combining elite opposition-based learning and Cauchy mutation enhances population diversity and reduces premature convergence. Third, a cosine-based adaptive convergence factor and inertia weight strategy improve the balance between global exploration and local exploitation. Based on the correlation analysis between bus passenger flow and weather condition data in Harbin, and combined with the fluctuation characteristics of bus passenger flow, the data were divided into windows with a 7-day weekly cycle and processed by fuzzy information granulation to obtain three groups of fuzzy granulated window data, namely LOW, R, and UP, representing the fluctuation trend and spatial characteristics of bus passenger flow. The IWOA was employed to optimize and solve parameters such as the hidden layer weights and bias vectors of the BiGRU, thereby constructing a bus passenger flow fluctuation trend and spatial prediction model based on FIG-IWOA-BiGRU. Simulation experiments with 21 benchmark functions and real bus data verified its effectiveness. Results show that IWOA significantly improves optimization accuracy and convergence speed. For bus passenger flow forecasting, the average MAE, RMSE, and MAPE of LOW, R, and UP data are 2915, 3075, and 8.1%, representing improvements over existing classical models. The findings provide reliable decision support for bus scheduling and passenger travel planning.

Keywords:

urban transportation; bus passenger flow; whale optimization algorithm (WOA); hybrid improvement strategy; bidirectional gated recurrent unit (BiGRU); fluctuation spatial prediction

MSC:

65-11; 68-11

1. Introduction

Bus passenger flow serves as the basis for bus operation management departments to formulate network planning and vehicle scheduling plans. By establishing the relationship between bus passenger flow and its influencing factors, and exploring the fluctuation patterns of randomness, time variation, and imbalance in passenger flow, an accurate prediction model of passenger flow fluctuations can be constructed. This enables reasonable forecasting of fluctuation trends and spatial distribution of passenger flow, thereby providing a theoretical foundation for developing bus operation scheduling plans. Meanwhile, it also facilitates passengers in planning travel routes and times more effectively, improves operational efficiency, and ensures the service level and quality of bus services.

Mei Z et al. [1] pointed out that factors such as weather conditions, bus stop information, bus passenger flow change rate, and time and date have significant impacts on bus passenger flow. Chen D et al. [2] established a nonlinear relationship between bus passenger flow and factors such as the number of bus stops, road network density, and the number of companies, revealing the influencing mechanisms of passenger flow changes. Zhu C et al. [3] found that the land-use characteristics surrounding bus stops have a significant impact on bus passenger flow. Fang X et al. [4] examined the correlation between bus IC card passenger flow attributes and travel time, and proposed a passenger flow forecasting method based on the classification of travel attributes. Chen E et al. [5] analyzed the time-series characteristics of bus IC card passenger flow data and developed an ARIMA-based forecasting model. To further analyze the seasonal periodic patterns of bus passenger flow, Milenković M et al. [6] proposed a Seasonal Autoregressive Integrated Moving Average (SARIMA)-based bus passenger flow forecasting method. Cheng G et al. [7] developed a Short Time Series Clustering–Seasonal Autoregressive Integrated Moving Average (STSC-SARIMA) model for bus passenger flow forecasting. However, the above studies primarily fit linear regression relationships between bus passenger flow and influencing factors using historical data samples, without adequately addressing the spatiotemporal dynamics and nonlinear characteristics of bus passenger flow. This may result in the omission of certain spatiotemporal dynamic influencing factors and their relationships during parameter calibration, thereby reducing the accuracy and generalization capability of bus passenger flow forecasting results [8].

To investigate the spatiotemporal dynamic complexity and nonlinear relationships between bus passenger flow and factors such as weather conditions, commercial land-use ratio, points of interest (POI), bus stop density, and characteristics of bus and road networks under large-scale data samples, and to explore the characteristic patterns of influencing factors and the spatiotemporal distribution of bus passenger flow to improve forecasting accuracy and robustness, Yu L et al. [9] examined the impact of land-use characteristics on bus passenger flow and proposed a bus passenger flow forecasting model based on the least squares support vector machine. Xu X et al. [10] developed a combined forecasting model integrating time series analysis and support vector machines for bus passenger flow. Li Y et al. [11] proposed a bus passenger flow forecasting method based on the kernel extreme learning machine (KELM), which effectively improved the accuracy of forecasting results. Yazıcıoğlu C et al. [12] introduced an LSTM-based bus passenger flow forecasting method, and the results showed that incorporating weather conditions helps enhance the accuracy and applicability of the forecasts. Zhang Y et al. [13] compared the forecasting accuracy of existing machine learning models and found that bidirectional forecasting models, such as those based on BiLSTM and BiGRU, outperform unidirectional models in both accuracy and robustness.

To further integrate the spatiotemporal distribution features of the built environment, bus stop density, and bus and road networks, and to improve the accuracy and applicability of bus passenger flow forecasting, Cao B et al. [14] proposed a hybrid forecasting method based on convolutional neural network–long short-term memory (CNN-LSTM) by incorporating factors such as commercial land-use ratio, points of interest (POI), and bus stop density. This method effectively extracts the spatiotemporal features of bus passenger flow, reduces model computational complexity, and achieves higher forecasting accuracy compared to LSTM networks. Baghbani A et al. [15] developed a traffic-aware multi-step graph neural network (TMS-GNN)-based method for bus passenger flow forecasting, which considers bus stop connectivity and urban traffic operation characteristics to significantly improve both the accuracy and applicability of forecasts. Lv W et al. [16] constructed an extreme gradient boosting (XGBoost) model for bus passenger flow forecasting by incorporating POI information. Yu J et al. [17] proposed an interpretable bus passenger flow forecasting method based on CatBoost, which not only improves forecasting accuracy but also reveals the interaction mechanisms between influencing factors and forecast accuracy. These studies have effectively enhanced the accuracy and robustness of bus passenger flow forecasting by exploring the complex spatiotemporal dynamic relationships and distribution characteristics between bus passenger flow and influencing factors. However, they often rely on manually setting parameters for machine learning models, which can limit the precision and efficiency of parameter optimization.

To further improve the accuracy and efficiency of bus passenger flow forecasting, Nagaraj N et al. [18] examined factors such as bus type, the locations of route terminals, passenger numbers, and headways, and proposed a forecasting method based on a greedy hierarchical algorithm combined with a long short-term memory (LSTM) neural network. Sun F et al. [19] explored the spatiotemporal distribution patterns of bus passenger flow using IC card data and developed a forecasting model based on a nonlinear autoregressive network with exogenous inputs (NARX) optimized by a genetic algorithm. Li C et al. [20], considering the volatility, nonlinearity, and periodicity of bus passenger flow, applied a clustering algorithm to segment passenger flow data and constructed a forecasting model based on a particle swarm optimization–support vector machine (PSO-SVM). Bus passenger flow forecasting models based on GWO-SVM [21] and whale optimization-Attention-Bidirectional Gated Recurrent Unit (BiGRU) [22] have also demonstrated high prediction accuracy and strong generalization capability, as intelligent optimization algorithms are used to fine-tune machine learning hyperparameters.

Among intelligent optimization algorithms, the whale optimization algorithm (WOA) offers a simple structure, few parameters, and good solution accuracy and convergence speed [23], and has been widely applied in areas including path planning [24] and machine learning parameter optimization [25]. However, because the initial population is generated randomly, WOA often suffers from uneven population distribution. Moreover, the linearly decreasing convergence factor used during population position updating tends to cause an imbalance between global search and local exploitation, leading to problems such as being trapped in local optima, insufficient solution accuracy, and slow convergence speed. To address these issues, Prasad D [26] and Duan Y et al. [27] improved the population initialization of WOA by adopting Logistic and Circle chaotic mappings, thereby enhancing population diversity and quality and improving optimization speed and convergence accuracy. Similarly, Darvish Falehi A [28] and Paul C et al. [29] applied chaotic mappings to improve WOA population initialization, balancing the relationship between global search and local exploitation, enhancing optimization accuracy and speed, and strengthening WOA’s capability in solving complex engineering problems.

To address the issue that the Whale Optimization Algorithm (WOA) is prone to premature convergence to local optima, Li M [30] and Chakraborty S et al. [31] improved WOA by employing an elite opposition-based learning strategy. This approach enhanced population diversity and local exploration capability, increased the probability of feasible solutions approaching the optimal solution during the optimization process, and avoided premature convergence caused by limited exploration ability.

Regarding the problem that the linearly decreasing convergence factor in WOA cannot effectively balance the relationship between global search and local exploitation, thereby reducing solution accuracy and convergence speed, Anitha J [32] and Yue Y et al. [33] introduced cosine functions to improve convergence factors and inertia weights. This effectively balanced global and local search abilities, enhancing both solution accuracy and convergence speed. Chakraborty S et al. [34] suggested that improving the coefficient vector in the position update mechanism could significantly enhance the exploration capability in the early stage and the exploitation capability in the later stage. Wang H [35] and Yang W et al. [36] used nonlinear functions and dynamic convergence factors to improve the linear convergence factor of WOA, strengthening global and local search capabilities while reducing the risks of premature convergence and insufficient accuracy. Ju C et al. [37] integrated chaotic mapping, nonlinear convergence factors, and Cauchy mutation as a hybrid strategy to improve WOA. This increased population diversity and quality, effectively balanced global and local search abilities, and overcame the issues of insufficient accuracy, slow convergence, and susceptibility to local optima.

Existing studies mainly focus on examining the impacts of factors such as weather conditions, built environment, and the characteristics of bus and road networks on bus passenger flow, and on developing models aimed at improving the point-in-time forecasting accuracy of bus passenger flow, while neglecting its randomness, time variation, and imbalance. The use of intelligent optimization algorithms to tune machine learning parameters can avoid the limitations of manual parameter setting, which often results in insufficient accuracy and applicability of forecasting results.

Therefore, to address the problems of WOA being easily trapped in local optima, insufficient solution accuracy, and slow convergence speed, this study adopts a Logistic–Tent combined chaotic mapping to initialize the population of WOA, thereby increasing population diversity and quality; an elite opposition-based learning and Cauchy mutation hybrid mechanism is adopted to overcome premature convergence and the tendency of WOA to fall into local optima. Moreover, improved convergence factors and inertia weights are introduced to enhance the population position update mechanism, effectively balancing the relationship between global search and local exploitation, and improving both solution accuracy and convergence speed of WOA. Based on bus passenger flow data from Harbin, the study analyzes the correlations between passenger flow, weather conditions, and passenger attributes, applies fuzzy information granulation (FIG) to the passenger flow data, and employs the improved WOA (IWOA) to optimize BiGRU parameters, thereby developing a FIG-IWOA-BiGRU model for forecasting bus passenger flow fluctuation trends and spatial distribution. By considering the impacts of passenger travel randomness and uncertainty on bus passenger flow, the model extends forecasting results from point predictions to spatial predictions. Meanwhile, the use of a hybrid-strategy-improved WOA prevents the algorithm from falling into local optima and suffering from slow convergence, which could otherwise lead to poor BiGRU prediction performance and long computation times, thus addressing the limitations in bus passenger flow data forecasting applications.

The remainder of this paper is organized as follows: Section 2 analyzes the fluctuation characteristics of bus passenger flow and its influencing factors. Section 3 briefly introduces fuzzy information granulation and the whale optimization algorithm. Section 4 presents in detail the strategies of the improved whale optimization algorithm and constructs the FIG-IWOA-BiGRU model for forecasting bus passenger flow fluctuation trends and spatial distribution. Section 5 provides the results of numerical simulation experiments based on 21 benchmark test functions and discusses the advantages of the improved whale optimization algorithm compared with other metaheuristic algorithms. It then presents the results of the FIG-IWOA-BiGRU bus passenger flow fluctuation trend and spatial forecasting model, followed by a comparative analysis of the accuracy and applicability of the proposed model and existing models.

2. Bus Passenger Flow Fluctuation Characteristics and Influencing Factors Analysis

2.1. Analysis of Bus Passenger Flow Fluctuation Characteristics

The fluctuation characteristics of bus passenger flow represent the overall fluctuation trend and spatial variation range of passenger flow within a given period. As shown in the time series of bus passenger flow fluctuations in Figure 1, the fluctuation trends and spatial ranges of different IC card types vary across different periods; however, they share the following characteristics:

1. Temporal and spatial attributes—The fluctuation characteristics of bus passenger flow reflect changes in passenger travel demand within a given period, as well as the influence of land-use characteristics and the degree of development around bus routes on travel demand.

2. Object and condition attributes—These characteristics treat the passenger flow of a bus route as a whole and examine changes in passenger travel demand from a macro perspective. New fluctuation patterns in bus passenger flow emerge only when passengers’ travel purposes and modes change.

3. Predictability—From the perspective of residents’ travel demand and mode choice, the fluctuation characteristics of bus passenger flow reflect the service level of bus operations and can be analyzed and forecast based on historical data patterns.

2.2. Principles for Selecting Influencing Factors of Bus Passenger Flow

The selection of influencing factors for bus passenger flow aims to reduce the workload of data collection and processing while ensuring the accuracy of forecasting model results. The main principles include the following:

1. Scientific validity—The selected factors should objectively reflect changes in their impact on passenger travel. For example, passengers have an optimal preference for weather conditions; the degree of deviation from this optimal value is inversely proportional to travel comfort. Therefore, indicators such as clothing index, wind scale, and daily Mean temperature can better capture the seasonal variations in temperature and their effects on passenger travel.

2. Availability—The selected factors should be accessible and calculable. For instance, air quality index, relative humidity, and clothing index can be obtained and computed from meteorological websites.

3. Sensitivity—The selected factors should have a significant impact on the fluctuation characteristics and trends of bus passenger flow.

2.3. Analysis of Influencing Factors of Bus Passenger Flow

Liu J et al. [38] argued that considering weather conditions can help improve the accuracy of bus passenger flow forecasting. To further analyze the relationship between bus passenger flow and weather conditions, influencing factors were selected for the fluctuation trend and spatial prediction datasets of bus ridership. The research data mainly consist of bus IC card records and corresponding weather conditions in Harbin from 1 March 2021 to 31 October 2021, covering a total of 245 days. Because both the bus IC card data and the weather condition data contain partial temporal gaps, a time series method was used to repair and complete the missing data. The Pearson correlation coefficient, a bivariate correlation measure for interval-scaled variables, was then used to examine the degree of correlation between bus passenger flow and the weather-related influencing factors, and the results are shown in Figure 2.

Existing studies suggest that when the correlation coefficient between two influencing factors exceeds 0.95, the factors are considered highly correlated, and one of them should be removed [38]. As shown in Figure 2, the correlation coefficients among the factors selected in this study satisfy $|ρ_{X, Y}| < 0.95$, except that the wind chill index shows correlations higher than 0.95 with both daily mean temperature and clothing index. The wind chill index was therefore excluded, and daily mean temperature, air quality index, relative humidity, wind scale, and clothing index were selected as the influencing factors for bus passenger flow forecasting.
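The screening rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the helper names `pearson` and `drop_collinear`, the greedy keep-the-first ordering, and the toy values are all assumptions made for the example, and the data are not the Harbin dataset.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_collinear(factors, threshold=0.95):
    """Keep factors in insertion order, skipping any factor whose |rho|
    with an already kept factor exceeds the threshold (assumed greedy rule)."""
    kept = []
    for name, values in factors.items():
        if all(abs(pearson(values, factors[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical toy observations, for illustration only.
toy = {
    "mean_temperature": [2.0, 5.0, 9.0, 14.0, 20.0, 24.0],
    "wind_chill_index": [1.0, 4.1, 8.0, 13.2, 19.0, 23.1],  # nearly collinear with temperature
    "relative_humidity": [60.0, 45.0, 70.0, 52.0, 66.0, 48.0],
}
selected = drop_collinear(toy)
print(selected)
```

Because the rule is greedy, the outcome depends on the order in which factors are listed; which member of a highly correlated pair to drop remains a modelling decision.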

3. Fuzzy Information Granulation and Whale Optimization Algorithm

3.1. Fuzzy Information Granulation

Given the randomness and uncertainty of bus passenger flow and weather data, the fuzzy granulation method based on fuzzy set theory is applied to determine the fuzzy granulation window and perform information fuzzy granulation processing on bus passenger flow data. The number of fuzzy granulation windows is determined according to the characteristics of bus passenger flow data, dividing them into several fuzzy granulation sub-windows according to a fixed rule length. The purpose of information fuzzy granulation is to obtain reasonable fuzzy information granules for each window while retaining the valid information of the original window data, thereby simplifying the spatial dimensionality of the model. The triangular membership function and the Gaussian membership function are commonly used fuzzy information granulation methods. The membership function for fuzzy information granulation is typically determined using histogram plotting. The histogram of bus passenger flow data is shown in Figure 3. As illustrated in Figure 3, the triangular membership function provides a better fuzzy information granulation of bus passenger flow data. Moreover, compared with trapezoidal and Gaussian membership functions, the triangular membership function offers the advantages of computational simplicity, efficiency, and rapid modeling.

Thus, this study adopts the triangular membership function for bus passenger flow information granulation, which is expressed as Equation (1):

A (x, a, m, b) = \begin{cases} 0, & x < a \\ \frac{x - a}{m - a}, & a \leq x \leq m \\ \frac{b - x}{b - m}, & m < x \leq b \\ 0, & x > b \end{cases}

(1)

where $a$, $m$, and $b$ represent the minimum value, mean value, and maximum value, respectively, of the fluctuations in the original sample data after window granulation.
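To make the granulation step concrete, the sketch below evaluates the triangular membership function of Equation (1) and computes the three parameters for each 7-day window. It assumes that LOW, R, and UP correspond to $a$, $m$, and $b$ (window minimum, mean, and maximum), that windows do not overlap, and that an incomplete trailing window is discarded; the function names and the toy series are hypothetical.

```python
def triangular_membership(x, a, m, b):
    """Equation (1): membership degree of x in the triangular fuzzy set (a, m, b)."""
    if x < a or x > b:
        return 0.0
    if x <= m:
        # Guard the degenerate case a == m (only reachable when x == a == m).
        return 1.0 if m == a else (x - a) / (m - a)
    return (b - x) / (b - m)

def granulate(series, window=7):
    """Split a daily series into non-overlapping windows and return the
    per-window (LOW, R, UP) = (min, mean, max) sequences."""
    low, r, up = [], [], []
    for start in range(0, len(series) - window + 1, window):
        chunk = series[start:start + window]
        low.append(min(chunk))
        r.append(sum(chunk) / window)
        up.append(max(chunk))
    return low, r, up

# Two hypothetical weeks of daily passenger counts (in thousands).
daily = [10, 12, 9, 15, 11, 13, 14,
         20, 18, 22, 19, 21, 17, 23]
low, r, up = granulate(daily)
print(low, r, up)
print(triangular_membership(10.5, low[0], r[0], up[0]))
```

Each of the three resulting sequences can then be forecast separately, so that the predicted LOW and UP values bound the predicted R value for the coming week.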

3.2. Whale Optimization Algorithm

3.2.1. Encircling Prey

The WOA was proposed by Mirjalili et al. [39] in 2016 as an intelligent optimization algorithm consisting of mathematical models such as encircling prey, spiral bubble-net attacking, and random search for prey. Encircling prey refers to the assumption that, due to the uncertainty of the optimal position in the data samples, the model considers the position of the current sample with the highest fitness as the target position and uses it as a reference for other samples to update their own positions. The formula is as follows:

\vec{D} = |C \cdot \vec{X^{*}} (t) - \vec{X} (t)|

(2)

\vec{X} (t + 1) = \vec{X^{*}} (t) - A \cdot \vec{D}

(3)

where $t$ denotes the current iteration number, $\vec{X^{*}}$ is the position vector of the best individual, $\vec{X}$ is the position vector of the current individual, and the coefficient vectors $A$ and $C$ are calculated as follows:

A = 2 a \cdot r_{1} - a

(4)

C = 2 r_{2}

(5)

where $a$ is a coefficient that decreases linearly from 2 to 0 as the iterations progress, and $r_{1}$ and $r_{2}$ are random variables in the range [0, 1].

3.2.2. Spiral Bubble-Net Attacking

The spiral bubble-net attacking mainly simulates the process of updating the positions of other samples toward the position of the optimal sample through spiral position updating and a shrinking encircling mechanism. Since spiral position updating and the shrinking encircling mechanism occur simultaneously during the spiral bubble-net attacking process, this mathematical model assumes that each occurs with a probability of 50%. The formula is as follows:

\vec{X} (t + 1) = \vec{X^{*}} (t) - A \cdot \vec{D}, if p < 0.5

(6)

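\vec{X} (t + 1) = \vec{D'} \cdot e^{b l} \cdot \cos (2 π l) + \vec{X^{*}} (t), if p \geq 0.5

(7)

where $\vec{D'} = |\vec{X^{*}} (t) - \vec{X} (t)|$ is the distance between the current individual and the best individual, $b$ is a constant defining the shape of the spiral, $l$ is a random number in the range [−1, 1], and $p$ is a random number in the range [0, 1].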

3.2.3. Random Search for Prey

The random search for prey updates positions by performing a random search based on the locations of data samples. This mathematical model assumes that when $|A| \geq 1$, the position of a sample is updated relative to a randomly selected sample rather than the best sample, in order to enhance global search capability. The formula is as follows:

\vec{D} = |C \cdot \vec{X_{rand}} - \vec{X}|

(8)

\vec{X} (t + 1) = \vec{X_{rand}} - A \cdot \vec{D}

(9)

where $\vec{X_{rand}}$ is the position vector of an individual randomly selected from the current population.
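The three behaviours of Section 3.2 can be combined into a single position update, sketched below under stated assumptions: $A$ and $C$ are drawn as scalars once per update (implementations often draw them per dimension), $p$ is drawn from [0, 1), and the function name `woa_step` and its arguments are hypothetical.

```python
import math
import random

def woa_step(X, best, population, t, t_max, b=1.0, rng=random):
    """One position update for whale X, following Equations (2)-(9)."""
    a = 2.0 * (1.0 - t / t_max)        # convergence factor, linear from 2 to 0
    A = 2.0 * a * rng.random() - a     # Equation (4)
    C = 2.0 * rng.random()             # Equation (5)
    p = rng.random()                   # selects shrinking encirclement or spiral
    l = rng.uniform(-1.0, 1.0)         # spiral parameter
    if p < 0.5:
        # |A| < 1: encircle the best whale (Eqs. 2-3);
        # otherwise search around a randomly chosen whale (Eqs. 8-9).
        ref = best if abs(A) < 1.0 else rng.choice(population)
        return [r - A * abs(C * r - x) for r, x in zip(ref, X)]
    # Spiral bubble-net update around the best whale (Eq. 7).
    return [abs(bst - x) * math.exp(b * l) * math.cos(2.0 * math.pi * l) + bst
            for bst, x in zip(best, X)]

rng = random.Random(0)
population = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(10)]
best = min(population, key=lambda w: sum(v * v for v in w))  # sphere function as toy fitness
moved = woa_step(population[0], best, population, t=10, t_max=100, rng=rng)
print(moved)
```

A complete optimizer would apply this update to every whale in each iteration, clip positions to the search bounds, and refresh `best` after evaluating the fitness function.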

4. Improved Whale Optimization–Bidirectional Gated Recurrent Unit

4.1. Logistic–Tent Combined Mapping Initialization

A randomly and uniformly distributed initial population helps expand the search space of the algorithm, improving its convergence speed and solution accuracy [40]. In the absence of any prior information, the whale optimization algorithm generates the initial population randomly within the search space, which can easily lead to uneven distribution of individual positions and a reduced search space. Therefore, this study combines Logistic and Tent chaotic mappings to construct a Logistic–Tent combined mapping. This combined mapping integrates the strong spatial ergodicity of Logistic mapping with the fast iteration speed, low autocorrelation, and strong applicability of Tent mapping. Such a combination helps improve the uniformity and ergodicity of population spatial distribution in WOA during the optimization of bus passenger flow data, thereby overcoming the tendency to fall into local optima and enhancing the convergence speed and accuracy in solving BiGRU parameters. The formula is as follows:

z (t + 1) = \begin{cases} r z (t) (1 - z (t)) + (4 - r) z (t), & 0 < z (t) \leq 0.3 \\ r z (t) (1 - z (t)) + (4 - r) (1 - z (t)), & 0.3 < z (t) < 1 \end{cases}

(10)

where $z (t)$ is the value after the $t$-th iteration, and $r$ is a random parameter in the range (0, 4).
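A chaotic initial population can be generated from Equation (10) roughly as follows. The sketch makes several assumptions that are not stated in the text: the iterate is wrapped with a modulo-1 operation so that it stays in [0, 1), a fixed `r = 3.99` is used instead of a random draw, a short burn-in is discarded, and each chaotic value is mapped linearly onto the search bounds. The function names are hypothetical.

```python
import random

def logistic_tent(z, r):
    """One iteration of Equation (10), followed by an assumed mod-1 wrap
    that keeps the iterate inside [0, 1)."""
    if z <= 0.3:
        nxt = r * z * (1.0 - z) + (4.0 - r) * z
    else:
        nxt = r * z * (1.0 - z) + (4.0 - r) * (1.0 - z)
    return nxt % 1.0

def chaotic_population(n_whales, lower, upper, r=3.99, seed=0, burn_in=50):
    """Build n_whales positions by mapping successive chaotic values onto
    the per-dimension bounds [lower[j], upper[j]]."""
    rng = random.Random(seed)
    z = rng.uniform(0.01, 0.99)        # assumed random starting value
    for _ in range(burn_in):           # discard the transient
        z = logistic_tent(z, r)
    population = []
    for _ in range(n_whales):
        whale = []
        for lo, hi in zip(lower, upper):
            z = logistic_tent(z, r)
            whale.append(lo + z * (hi - lo))
        population.append(whale)
    return population

population = chaotic_population(5, [-1.0, 0.0], [1.0, 10.0])
for whale in population:
    print(whale)
```

Without the wrap, the map as written can leave the unit interval (for example, $r = 2$ and $z = 0.3$ give 1.02), which is why the sketch adds it.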

4.2. Elite Opposition-Based Learning and Cauchy Mutation Hybrid Mechanism

4.2.1. Elite Opposition-Based Learning

The opposition-based learning (OBL) strategy, proposed by Tizhoosh [30] in 2005, is a learning strategy that, based on the current feasible solution, evaluates its opposite solution and selects the better feasible solution from the current and opposite solutions, with the aim of increasing the probability of the optimal feasible solution approaching the global optimum. This strategy can effectively improve the diversity and quality of the algorithm’s population and prevent the algorithm from prematurely falling into a local optimum. It has been widely applied to the improvement of various intelligent optimization algorithms and has achieved favorable results.

The opposition-based learning strategy assumes that the current feasible solution of the population is $X = (x_{1}, x_{2}, \dots, x_{d})$, where $d$ is the dimension of the search space and $x_{j} \in [a_{j}, b_{j}]$. Its opposite solution is $\bar{X} = (\bar{x_{1}}, \bar{x_{2}}, \dots, \bar{x_{d}})$, where $\bar{x_{j}} = ω (a_{j} + b_{j}) - x_{j}$, and $ω$ is a generalized coefficient uniformly distributed in [0, 1].

The elite opposition-based learning strategy [31] is an improved approach that addresses a weakness of opposition-based learning: within the current search space, the generated opposite solution often has difficulty approaching the global optimum. It has been widely applied to the improvement of various algorithms and has achieved favorable results. Assuming that the local extremum point corresponding to a general individual in the current population is an elite individual $X_{i, j}^{e} = (X_{i, 1}^{e}, X_{i, 2}^{e}, \dots, X_{i, d}^{e}) (i = 1, 2, 3, \dots, d)$, its opposite solution $\bar{X_{i, j}^{e}} = (\bar{X_{i, 1}^{e}}, \bar{X_{i, 2}^{e}}, \dots, \bar{X_{i, d}^{e}})$ can be expressed as

\bar{X_{i, j}^{e}} = K (α_{j} + β_{j}) - X_{i, j}^{e}

(11)

where $K$ is a dynamic coefficient in the range (0, 1), $X_{i, j}^{e} \in [α_{j}, β_{j}]$, $α_{j} = \min (X_{i, j})$, and $β_{j} = \max (X_{i, j})$. Here $α_{j}$ and $β_{j}$ are dynamic boundaries that overcome the limitation of fixed boundaries in preserving search experience. These dynamic boundaries enable the elite opposite solution to be located within a narrow search space, thereby improving the convergence performance of the algorithm. If the operation causes $\bar{X_{i, j}^{e}}$ to cross the boundary and become an infeasible solution, it is reset by random generation as defined in Equation (12).

\bar{X_{i, j}^{e}} = r a n d (α_{j}, β_{j})

(12)
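Equations (11) and (12) translate into a short routine, sketched here under the assumptions that the dynamic bounds are the per-dimension minimum and maximum over the current population and that a single coefficient $K$ is drawn per elite individual. The name `elite_opposition` and the toy population are hypothetical.

```python
import random

def elite_opposition(elite, population, rng=random):
    """Opposite solution of an elite individual inside dynamic bounds,
    following Equations (11) and (12)."""
    k = rng.random()                   # dynamic coefficient K (drawn from [0, 1) here)
    opposite = []
    for j, x in enumerate(elite):
        alpha = min(w[j] for w in population)   # dynamic lower bound
        beta = max(w[j] for w in population)    # dynamic upper bound
        x_op = k * (alpha + beta) - x           # Equation (11)
        if not alpha <= x_op <= beta:
            x_op = rng.uniform(alpha, beta)     # Equation (12): random reset
        opposite.append(x_op)
    return opposite

rng = random.Random(42)
population = [[rng.uniform(-5.0, 5.0) for _ in range(4)] for _ in range(8)]
elite = min(population, key=lambda w: sum(v * v for v in w))  # sphere function as toy fitness
opposite = elite_opposition(elite, population, rng)
print(opposite)
```

In a full algorithm the elite and its opposite would both be evaluated, and the better of the two retained.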

4.2.2. Cauchy Mutation

Given that the Cauchy distribution function has a slow decay along the horizontal axis and a heavy-tailed property, it can be used to optimize the best position of whale individuals in the WOA. This allows rapid perturbation of the current optimal solution obtained during the optimization process, enabling fast search for feasible solutions within the bus passenger flow data space, and enhancing the algorithm’s global search capability as well as its ability to escape local optima [38]. When the WOA converges to the global optimal solution, the updated position of the optimal whale individual after Cauchy mutation perturbation is

X_{newbest}^{j} = X_{best}^{j} + X_{best}^{j} \times Cauchy (0, 1)

(13)

where $X_{newbest}^{j}$ is the optimal whale position after Cauchy perturbation, $X_{best}^{j}$ is the optimal whale position before Cauchy perturbation, and $Cauchy (0, 1)$ is a random value generated from the standard Cauchy distribution.
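A minimal sketch of the perturbation in Equation (13) follows. It assumes that an independent Cauchy value is drawn for each dimension $j$ and samples the standard Cauchy distribution by inverse transform; the greedy acceptance rule in `mutate_if_better` is a common convention added for illustration and is not stated in the text. All names are hypothetical.

```python
import math
import random

def standard_cauchy(rng=random):
    """Sample Cauchy(0, 1) by inverse transform: tan(pi * (u - 0.5))."""
    return math.tan(math.pi * (rng.random() - 0.5))

def cauchy_mutate(best, rng=random):
    """Equation (13), applied independently to each dimension."""
    return [x + x * standard_cauchy(rng) for x in best]

def mutate_if_better(best, fitness, rng=random):
    """Assumed greedy rule: keep the mutated position only if it lowers the fitness."""
    candidate = cauchy_mutate(best, rng)
    return candidate if fitness(candidate) < fitness(best) else best

def sphere(w):
    """Toy fitness function for the example."""
    return sum(v * v for v in w)

rng = random.Random(0)
mutated = cauchy_mutate([1.0, -2.0, 0.0], rng)
kept = mutate_if_better([1.0, -2.0, 0.5], sphere, rng)
print(mutated, kept)
```

Because the perturbation is multiplicative, a coordinate that is exactly zero is left unchanged, while the heavy tails occasionally produce very large jumps in the other coordinates.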

4.2.3. Hybrid Mechanism

The hybrid mechanism expands the search range of the WOA in the bus passenger flow data space through elite opposition-based learning, enabling the exploration of more potential optimal solutions to improve the convergence speed and accuracy in optimizing BiGRU parameters. Cauchy mutation is mainly used to randomly perturb the optimal whale position, enhancing its ability to escape from local optima and accelerating the solution process of BiGRU parameters for bus passenger flow forecasting. The dynamic updating method of the optimal whale position under the hybrid mechanism is as follows:

\bar{X_{newbest, i, j}^{e}} = \begin{cases} \frac{(a_{j} + b_{j})}{2} + \frac{(a_{j} + b_{j})}{2 n} - \frac{X_{best, i, j}^{e}}{n}, & λ < 0.5 \\ X_{best, i, j}^{e} + X_{best, i, j}^{e} \times Cauchy (0, 1), & λ \geq 0.5 \end{cases}

(14)

where λ is a random number in the range [0, 1].

4.3. Improved Convergence Factor and Inertia Weight

For swarm intelligence optimization algorithms, it is necessary to balance the relationship between global search and local exploitation. Failure to achieve this balance may cause the algorithm to converge prematurely to a local optimum or to converge slowly. Global search means that the population explores a wide search space with relatively large search steps and range, which can reduce the probability of falling into a local optimum. Local exploitation requires the population to perform refined searches within a local space, thereby avoiding the loss of the theoretical optimum due to excessively large search steps [23]. As a swarm intelligence optimization algorithm, WOA also depends critically on balancing global search and local exploitation. From Equations (3), (6) and (9), it can be seen that this balance largely depends on the value of $A$. From Equation (4), it follows that changes in the convergence factor $a$ determine the value of $A$.

Given that the convergence factor in WOA decreases linearly and does not adequately balance the relationship between global search and local exploitation, this study incorporates the characteristics of the cosine function to improve the convergence factor formula as follows:

a = 2 - 2 \cos (\frac{π}{2} \cdot \frac{t}{t_{\max}})

(15)
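As a numerical illustration, the sketch below evaluates Equation (15) exactly as printed and places it next to a linear schedule. The linear form a = 2 - 2t/t_max is assumed here as the original WOA convergence factor, and the function names are hypothetical.

```python
import math


def convergence_factor_cosine(t, t_max):
    """Eq. (15) as printed: a = 2 - 2 * cos((pi / 2) * (t / t_max))."""
    return 2 - 2 * math.cos((math.pi / 2) * (t / t_max))


def convergence_factor_linear(t, t_max):
    """Linear schedule assumed for the original WOA: a = 2 - 2 * t / t_max."""
    return 2 - 2 * t / t_max


if __name__ == "__main__":
    t_max = 500
    for t in (0, 125, 250, 375, 500):
        cosine = convergence_factor_cosine(t, t_max)
        linear = convergence_factor_linear(t, t_max)
        print(f"t={t:3d}  cosine={cosine:.4f}  linear={linear:.4f}")
```

Evaluated as printed, Equation (15) gives a = 0 at t = 0 and a = 2 at t = t_max, and it changes slowly in the early iterations and more quickly later on.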

Given that WOA continuously updates the leader’s position, it is prone to falling into local optima. To address this issue, this study introduces an inertia weight into the whale position updating process to increase the probability of WOA escaping from local optima. The improved whale position updating formula is as follows:

\vec{X} (t + 1) = φ \vec{X^{*}} (t) - A \cdot \vec{D}, |A| < 1, p < 0.5

(16)

\vec{X} (t + 1) = φ \vec{X_{rand}} - A \cdot \vec{D}, |A| \geq 1, p < 0.5

(17)

\vec{X} (t + 1) = \vec{D^{'}} \cdot e^{b l} \cdot \cos (2 π l) + φ \vec{X^{*}} (t), p \geq 0.5

(18)

where φ is the inertia weight, calculated as

φ (t + 1) = φ_{\max} - (φ_{\max} - φ_{\min}) \cdot \cos (\frac{t}{t_{\max}} π)

(19)
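Equations (16)–(19) can be read together as one position update per whale. The sketch below follows them as printed; the definitions A = 2a·r − a, C = 2r, D = |C·X* − X| (or |C·X_rand − X|), D' = |X* − X|, l drawn from [−1, 1], and b = 1 are taken from the standard WOA formulation and are assumptions here, as are the example values of φ_min and φ_max.

```python
import math
import random


def inertia_weight(t, t_max, phi_min, phi_max):
    """Eq. (19) as printed; at t = 0 the expression evaluates to phi_min."""
    return phi_max - (phi_max - phi_min) * math.cos(math.pi * t / t_max)


def update_position(x, x_best, x_rand, a, phi, rng, b=1.0):
    """One whale update following Eqs. (16)-(18).

    A, C, l and the distance terms use the standard WOA definitions (assumed).
    """
    A = 2 * a * rng.random() - a
    C = 2 * rng.random()
    p = rng.random()
    l = rng.uniform(-1.0, 1.0)
    new_x = []
    for xj, bj, rj in zip(x, x_best, x_rand):
        if p < 0.5 and abs(A) < 1:
            D = abs(C * bj - xj)
            new_x.append(phi * bj - A * D)  # Eq. (16): encircle the best whale
        elif p < 0.5:
            D = abs(C * rj - xj)
            new_x.append(phi * rj - A * D)  # Eq. (17): random prey search
        else:
            D_prime = abs(bj - xj)
            spiral = D_prime * math.exp(b * l) * math.cos(2 * math.pi * l)
            new_x.append(spiral + phi * bj)  # Eq. (18): spiral update
    return new_x


if __name__ == "__main__":
    rng = random.Random(1)
    phi = inertia_weight(t=10, t_max=500, phi_min=0.4, phi_max=0.9)
    print(update_position([1.0, 2.0], [0.5, 0.5], [3.0, -3.0], 1.5, phi, rng))
```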

4.4. Bidirectional Gated Recurrent Unit

As shown in Figure 4, the bidirectional gated recurrent unit (BiGRU) consists of several forward and backward GRUs that operate in opposite temporal directions and are independent of each other. Compared with GRU, BiGRU offers bidirectional modeling, accurate temporal annotation, and strong data feature extraction capabilities, while having fewer network parameters than LSTM. The calculation of the hidden layer units within BiGRU is shown in Equation (20):

\begin{cases} \vec{h_{t}} & = \mathrm{GRU}(x_{t}, \vec{h_{t-1}}) \\ \overset{\leftarrow}{h_{t}} & = \mathrm{GRU}(x_{t}, \overset{\leftarrow}{h_{t-1}}) \\ h_{t} & = α_{t} \vec{h_{t}} + β_{t} \overset{\leftarrow}{h_{t}} + b_{t} \end{cases}

(20)

where $\mathrm{GRU}(\cdot)$ is the gated recurrent unit, $x_{t}$ is the input value, $\vec{h_{t}}$ and $α_{t}$ are the forward hidden state and weight, $\overset{\leftarrow}{h_{t}}$ and $β_{t}$ are the backward hidden state and weight, and $b_{t}$ is the hidden layer bias vector.
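To make Equation (20) concrete, the following sketch runs a scalar GRU cell over a sequence in both temporal directions and combines the two hidden states. The cell uses one common GRU formulation; the weight names, the scalar hidden state, and the use of constant α, β, and b instead of time-indexed values are simplifying assumptions for illustration only.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x, h_prev, w):
    """One step of a scalar GRU cell; w holds hypothetical scalar weights."""
    z = sigmoid(w["wz"] * x + w["uz"] * h_prev + w["bz"])  # update gate
    r = sigmoid(w["wr"] * x + w["ur"] * h_prev + w["br"])  # reset gate
    h_tilde = math.tanh(w["wh"] * x + w["uh"] * (r * h_prev) + w["bh"])
    return (1 - z) * h_prev + z * h_tilde


def bigru_hidden_states(xs, w_fwd, w_bwd, alpha, beta, bias):
    """Combine forward and backward hidden states as in Eq. (20)."""
    forward, h = [], 0.0
    for x in xs:
        h = gru_step(x, h, w_fwd)
        forward.append(h)
    backward, h = [], 0.0
    for x in reversed(xs):
        h = gru_step(x, h, w_bwd)
        backward.append(h)
    backward.reverse()  # align backward states with the original time order
    return [alpha * hf + beta * hb + bias for hf, hb in zip(forward, backward)]


if __name__ == "__main__":
    keys = ("wz", "uz", "bz", "wr", "ur", "br", "wh", "uh", "bh")
    weights = {k: 0.5 for k in keys}
    print(bigru_hidden_states([0.1, 0.5, 0.3], weights, weights, 0.5, 0.5, 0.0))
```

In the proposed model, the quantities corresponding to these weights and biases are the values that the IWOA searches over.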

4.5. FIG-IWOA-BiGRU Model Prediction Process

Based on the analysis results of bus passenger flow fluctuation characteristics and influencing factors, the bus passenger flow data are divided into fuzzy granulation windows with a period of 7 days per week, and triangular fuzzy membership functions are used for fuzzy granulation processing. The fuzzy-granulated bus passenger flow data are used as input samples, and the hybrid-strategy–improved WOA is employed to optimize the hyperparameters of the BiGRU model, which is then trained to obtain the bus passenger flow fluctuation trend and spatial prediction model. The steps of the FIG-IWOA-BiGRU bus passenger flow fluctuation trend and spatial prediction model are as follows, and the process is shown in Figure 5.

1. Select the influencing factors of bus passenger flow and compile the dataset; divide the bus passenger flow data into fuzzy granulation windows with a period of 7 days per week.

2. Apply a triangular fuzzy membership function to fuzzify the bus passenger flow window time series sample data, and compute the fuzzified window time series samples after fuzzy information granulation.

3. Set the improved WOA population size to 30 and the number of iterations to 500.

4. Generate the initial population within the search space using the Logistic–Tent combined mapping.

5. Calculate the fitness values of the individuals in the population and record the optimal position $\vec{X^{*}}(t)$.

6. Apply the hybrid mechanism of elite opposition-based learning and Cauchy mutation to the optimal whale individual: generate the candidate solution in each dimension according to Equation (14), evaluate the candidates, select the solution according to the objective function’s fitness value, and update the position of the optimal whale individual.

7. Update parameter φ according to Equation (19), and update parameters a, A, C, and l accordingly.

8. Compare |A| with 1 and the random value p with 0.5, then select the corresponding position updating formula. If p < 0.5 and |A| < 1, update the current position according to Equation (16); if p < 0.5 and |A| ≥ 1, perform random prey search according to Equation (17); if p ≥ 0.5, update the current position according to Equation (18).

9. Check whether the algorithm has reached the maximum number of iterations. If so, stop the iterations and output the optimal position and fitness value; otherwise, repeat steps (4)–(8).

10. Output the global optimal solution obtained by the IWOA and use it as the optimal parameter set to construct the BiGRU.

11. Use the FIG-IWOA-BiGRU model to forecast bus passenger flow fluctuation trends and spatial ranges, and verify the prediction performance.
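The optimization part of these steps (steps 3 to 9) can be sketched as a single loop. The version below is an illustration under several assumptions: a uniform random initialization stands in for the Logistic–Tent mapping, the population is initialized once and only the evaluation and update steps are repeated, the sphere function replaces the BiGRU training loss as the objective, and n, φ_min, φ_max, and b take hypothetical values. Equations (14)–(19) are used as printed.

```python
import math
import random


def sphere(x):
    """Stand-in objective; the paper optimizes BiGRU parameters instead."""
    return sum(v * v for v in x)


def iwoa_sketch(fitness, lower, upper, pop_size=30, t_max=500, n=2.0,
                phi_min=0.4, phi_max=0.9, b=1.0, seed=0):
    rng = random.Random(seed)

    def clip(x):
        return [min(max(v, lo), hi) for v, lo, hi in zip(x, lower, upper)]

    # Step 4 (stand-in): uniform initialization instead of Logistic-Tent mapping.
    pop = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
           for _ in range(pop_size)]
    # Step 5: record the best individual.
    best = min(pop, key=fitness)[:]
    best_fit = fitness(best)
    history = [best_fit]
    for t in range(t_max):
        # Step 6: Eq. (14) on the best whale, kept only if the fitness improves.
        if rng.random() < 0.5:
            cand = [(lo + hi) / 2 + (lo + hi) / (2 * n) - v / n
                    for v, lo, hi in zip(best, lower, upper)]
        else:
            cand = [v + v * math.tan(math.pi * (rng.random() - 0.5)) for v in best]
        cand = clip(cand)
        if fitness(cand) < best_fit:
            best, best_fit = cand, fitness(cand)
        # Step 7: Eq. (19) and Eq. (15) as printed.
        phi = phi_max - (phi_max - phi_min) * math.cos(math.pi * t / t_max)
        a = 2 - 2 * math.cos((math.pi / 2) * (t / t_max))
        # Step 8: Eqs. (16)-(18) for every whale.
        new_pop = []
        for x in pop:
            A = 2 * a * rng.random() - a
            C = 2 * rng.random()
            p = rng.random()
            l = rng.uniform(-1.0, 1.0)
            ref = best if (p >= 0.5 or abs(A) < 1) else rng.choice(pop)
            if p < 0.5:
                new_x = [phi * rv - A * abs(C * rv - xv) for xv, rv in zip(x, ref)]
            else:
                new_x = [abs(rv - xv) * math.exp(b * l) * math.cos(2 * math.pi * l)
                         + phi * rv for xv, rv in zip(x, ref)]
            new_pop.append(clip(new_x))
        pop = new_pop
        gen_best = min(pop, key=fitness)
        if fitness(gen_best) < best_fit:
            best, best_fit = gen_best[:], fitness(gen_best)
        history.append(best_fit)
    # Step 9: stop after t_max iterations and return the best solution found.
    return best, best_fit, history


if __name__ == "__main__":
    best, best_fit, _ = iwoa_sketch(sphere, [-10.0] * 3, [10.0] * 3, t_max=100)
    print(best, best_fit)
```

Because the best solution is replaced only when a candidate has strictly lower fitness, the recorded best-fitness history is non-increasing by construction.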

5. Numerical Simulation and Forecasting Results Analysis

5.1. Numerical Simulation Results

To verify the effectiveness of the IWOA, 21 typical benchmark test functions were selected for numerical simulation. The results of 30 simulation runs were comprehensively compared with those of WOA, Grey Wolf Optimizer Algorithm (GWO), Particle Swarm Optimization Algorithm (PSO), and Genetic Algorithm (GA). All numerical simulations were conducted in a Windows 11 environment with 32 GB of RAM (Kingston Technology, Fountain Valley, CA, USA) and a 12th Gen Intel(R) Core(TM) i7-12700 processor (2.10 GHz) (Intel Corporation, Santa Clara, CA, USA), using MATLAB R2023b as the programming language. In the numerical simulation experiments, the maximum number of iterations was set to 500, and the population size was 30. Other parameter settings of each algorithm model are provided in Table 1. Among the 21 benchmark test functions, f₁–f₇ are high-dimensional unimodal benchmark test functions, f₈–f₁₃ are high-dimensional multimodal benchmark test functions, and f₁₄–f₂₁ are fixed-dimensional composite multimodal benchmark test functions, as detailed in Table 2. The numerical simulation results are presented in terms of statistical parameters such as mean, standard deviation, best fitness, and worst fitness. The statistical results for the different types of benchmark test functions are shown in Table 3, Table 4 and Table 5.

Table 3 presents the statistical results of best fitness, worst fitness, mean fitness, and standard deviation for unimodal benchmark test functions. As test functions with a unique global optimum, unimodal benchmark functions are primarily used to evaluate the exploitation capability of intelligent optimization algorithms. According to Table 3, the IWOA demonstrates superior exploitation ability. For functions f₁, f₂, f₄, and f₆, all statistical results of the IWOA outperform those of other intelligent optimization algorithms. For functions f₃, f₅, and f₇, IWOA ranks as the second-best algorithm. Therefore, the strategies proposed in this paper significantly enhance WOA, enabling the improved algorithm to achieve stronger exploitation ability compared with other intelligent optimization algorithms.

Compared with unimodal benchmark test functions, multimodal benchmark test functions have the characteristic of containing many local optima as the function dimension increases. This characteristic is effective for evaluating the exploration capability of intelligent optimization algorithms. According to the data in Table 4, the improved WOA demonstrates better exploration capability on multimodal benchmark test functions compared with algorithms such as WOA, GWO, PSO, and GA. This is mainly because the hybrid mechanism of elite opposition-based learning and Cauchy mutation effectively prevents the algorithm from falling into local optima and increases the probability of feasible solutions approaching the global optimum.

Composite multimodal benchmark test functions are more complex in structure and contain multiple local minima within their domains. They are commonly used to test an algorithm’s ability to balance global search and local exploitation. Only when an intelligent optimization algorithm achieves an appropriate balance between exploration and exploitation can it effectively overcome the problem of falling into local optima. According to the results in Table 5, the improved WOA exhibits superior capability in balancing global search and local exploitation on composite multimodal benchmark test functions compared with WOA, GWO, PSO, and GA. The improved balance performance of the algorithm mainly benefits from the enhancement of the convergence factor using the cosine function and the application of inertia weight in updating whale positions.

To further analyze the distribution characteristics of the IWOA’s numerical simulation results, box plots were drawn based on the results of each algorithm independently solving the benchmark test functions 30 times, as shown in Figure 6. In Figure 6, the central mark within each box represents the median of the algorithm’s solutions for the benchmark test functions, while the bottom and top edges of the box indicate the first and third quartiles. The “+” symbol denotes outliers outside the box. Comparing the medians and outliers in Figure 6 shows that the IWOA has higher solution quality, fewer outliers, and a more concentrated distribution of results, indicating good robustness.

To further analyze the robustness of the improved Whale Optimization Algorithm (IWOA), the Wilcoxon rank-sum test was conducted at a 5% significance level to evaluate whether the results of IWOA differ significantly from those of the other algorithms. The comparisons of IWOA with WOA, GWO, PSO, and GA are denoted as P1, P2, P3, and P4, respectively, and the corresponding p-values are listed in Table 6. As shown in Table 6, all p-values are less than 0.05, indicating that the differences between IWOA and the other algorithms in solving the 21 benchmark test functions are statistically significant. Together with the statistics in Table 3, Table 4 and Table 5, this shows that IWOA demonstrates superior performance compared with the other algorithms.
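For reference, a two-sided rank-sum test can be sketched in a few lines. The version below uses the large-sample normal approximation with average ranks for ties and no continuity correction; this is an assumption, since statistical packages may use an exact method for small samples, and the sample values shown are hypothetical.

```python
import math


def rank_sum_test(sample_a, sample_b):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (z, p_value). Ties receive average ranks.
    """
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        count_a = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_a += avg_rank * count_a
        i = j
    mean = n_a * (n_a + n_b + 1) / 2
    std = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (rank_sum_a - mean) / std
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical final fitness values from repeated runs of two algorithms.
    runs_a = [0.011, 0.009, 0.013, 0.010, 0.012, 0.008]
    runs_b = [0.031, 0.027, 0.040, 0.029, 0.035, 0.033]
    z, p = rank_sum_test(runs_a, runs_b)
    print(f"z = {z:.3f}, p = {p:.4f}")
```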

5.2. Convergence Curve Analysis

The convergence curves of the benchmark test functions can visually reflect the convergence speed and accuracy of each intelligent optimization algorithm, and clearly demonstrate their ability to overcome local optima. The convergence curves of the 21 benchmark test functions are shown in Figure 7, where the horizontal axis represents the number of iterations and the vertical axis represents the average fitness value after 30 independent runs of the test functions.

From the convergence curves of benchmark test functions (d)f₄, (i)f₉, (j)f₁₀, and (k)f₁₁, it can be observed that in the early iterations, the value of φ is relatively large, and the population rapidly converges toward the center of the optimal individual, resulting in a fast convergence speed. Around 300 iterations, an inflection point appears in the convergence curves. As the number of iterations increases, the probability of applying the hybrid mechanism of elite opposition-based learning and Cauchy mutation becomes higher, leading to a rapid decrease in the fitness values.

From the convergence curves of benchmark test functions (c)f₃, (e)f₅, (o)f₁₅, and (u)f₂₁, it can be seen that the IWOA achieves the solution accuracy of other comparative algorithms within about 100 iterations. Thereafter, due to the perturbation effects of the hybrid mechanism of elite opposition-based learning and Cauchy mutation, together with the inertia weight factor, the algorithm can effectively escape from local regions and continue searching for the global optimum. From the convergence curves of (f)f₆, (h)f₈, and (l)f₁₂, the IWOA shows multiple inflection points during the iterations compared with other algorithms, indicating that the cosine function-based improvement of the adaptive convergence factor in WOA better balances global search and local exploitation, thus enhancing convergence speed and accuracy.

From the convergence curves of benchmark test functions (m)f₁₃, (n)f₁₄, (p)f₁₆, (q)f₁₇, and (r)f₁₈, it is evident that the IWOA converges to the optimal accuracy before 100 iterations, proving that the improvement strategies in the exploration phase effectively increase the convergence speed of WOA. From the convergence curve of (u)f₂₁, it can be observed that the IWOA has a better fitness value than other comparative algorithms at the beginning of the iterations, which indicates that the population initialized with the Logistic–Tent combined mapping has higher quality, thereby enhancing the global search ability of the algorithm to some extent. An inflection point appears after about 30 iterations, demonstrating its faster convergence speed. Finally, from the convergence curves of (a)f₁, (b)f₂, (g)f₇, (s)f₁₉, and (t)f₂₀, it can be seen that the IWOA achieves better solution accuracy and convergence speed compared with the other algorithms.

5.3. Algorithm Time Complexity and Ranking Analysis

The time complexity of the IWOA is analyzed as follows. Assuming that the execution time for parameter initialization (with a population size of N and a search space dimension of d) is $t_{1}$, the time for generating random distributions is $t_{2}$, and the time required to compute the fitness function is $f(d)$, the time complexity of the Whale Optimization Algorithm in the initialization stage can be expressed as

O (t_{1} + N (d t_{2} + f (d))) = O (d + f (d))

(21)

Assuming that the time required for all individuals in the whale population to update their positions in each dimension is the same, denoted as $t_{3}$, and that the times for comparing fitness values and selecting the optimal position after each iteration are $t_{4}$ and $t_{5}$, respectively, the time complexity of the Whale Optimization Algorithm in the optimization stage can be expressed as

O (N (d t_{3} + f (d) + t_{4}) + t_{5}) = O (d + f (d))

(22)

Therefore, the total time complexity for the WOA to obtain the optimal feasible solution in each generation is

T (d) = O (d + f (d)) + O (d + f (d)) = O (d + f (d))

(23)

In the improved algorithm, the time complexity of the initialization phase is basically the same as that of the Whale Optimization Algorithm. Assuming that the time for updating the positions of whale individuals in the optimization phase is $e_{1}$, the time for updating the optimal whale individual position using the elite opposition-based learning and Cauchy mutation hybrid mechanism is $e_{2}$, the time for comparing and selecting the optimal whale individual is $e_{3}$, and the time for updating whale individual positions using the improved convergence factor and inertia weight is $e_{4}$, then the time complexity of IWOA in this phase is

O (N (d e_{1} + f (d) + t_{4}) + t_{5} + N (e_{2} + e_{4}) + e_{3}) = O (d + f (d))

(24)

Therefore, the total time complexity of IWOA for obtaining the optimal feasible solution in each generation is

T (d) = O (d + f (d)) + O (d + f (d)) = O (d + f (d))

(25)

In summary, the IWOA and the WOA have essentially the same time complexity. To visually compare and analyze the time complexity of the two algorithms, the 21 benchmark test functions were each run 30 times separately using IWOA and WOA on the same platform. The benchmark functions were classified according to their dimensions, and the average runtime of the 30 numerical simulation runs was calculated, as shown in Figure 8. As illustrated in Figure 8, IWOA requires slightly more time than WOA to solve multimodal functions, but is faster for unimodal and composite modal functions, and its mean runtime is slightly lower than that of WOA, indicating that the improvement strategy enhances the algorithm’s optimization efficiency.

Considering that the quantitative analysis of all algorithms is based on the mean absolute error (MAE) of 21 benchmark test functions, the performance metrics of the algorithms can be effectively validated by ranking the numerical simulation results. Table 7 presents the MAE ranking of the benchmark test functions, calculated as follows:

\mathrm{MAE} = \frac{\sum_{i = 1}^{N_{f}} |m_{i} - o_{i}|}{N_{f}}

(26)

where $m_{i}$ is the mean value of the algorithm’s solution results, $o_{i}$ is the theoretical value of each benchmark test function, and $N_{f}$ is the total number of benchmark test functions.

As shown in Table 7, the MAE ranking of IWOA is first. Compared with other algorithms, IWOA exhibits the smallest MAE value, further demonstrating the effectiveness of the hybrid improvement strategy.
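Equation (26) and the resulting ranking can be reproduced with a short sketch. The function names and the numbers in the example are hypothetical and are not taken from Table 7.

```python
def benchmark_mae(means, optima):
    """Eq. (26): mean absolute gap between mean results and theoretical optima."""
    if len(means) != len(optima) or not means:
        raise ValueError("means and optima must be non-empty and of equal length")
    return sum(abs(m - o) for m, o in zip(means, optima)) / len(means)


def rank_by_mae(results, optima):
    """Return algorithm names ordered from smallest to largest MAE."""
    return sorted(results, key=lambda name: benchmark_mae(results[name], optima))


if __name__ == "__main__":
    optima = [0.0, 0.0, -1.0]  # hypothetical theoretical optima
    results = {
        "ALG_A": [0.02, 0.10, -0.95],  # hypothetical mean results
        "ALG_B": [0.00, 0.01, -0.99],
    }
    for name in rank_by_mae(results, optima):
        print(name, round(benchmark_mae(results[name], optima), 4))
```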

To further analyze the effectiveness of the hybrid strategy for improving the whale optimization algorithm, ablation experiments were conducted by incorporating different strategies, including Logistic–Tent mapping, an elite opposition-based learning and Cauchy mutation hybrid mechanism, as well as improved convergence factor and inertia weight. The corresponding enhanced algorithms were denoted as WOA1, WOA2, WOA3, and WOA4, respectively. The performance of the proposed IWOA on 21 benchmark functions was compared with the optimization results of WOA1, WOA2, WOA3, and WOA4 over 30 independent runs using the Wilcoxon rank-sum test. Since IWOA cannot be compared with itself, the comparisons of IWOA with WOA1, WOA2, WOA3, and WOA4 were denoted as PW1, PW2, PW3, and PW4, respectively. The p-values of the Wilcoxon rank-sum test for IWOA against the other algorithms are presented in Table 8. As shown in Table 8, most of the p-values are less than 0.05, indicating that significant differences exist between IWOA and the other algorithms. On unimodal benchmark functions, a few results show no significant difference or inconclusive significance, suggesting that applying individual strategies, such as the elite opposition-based learning with Cauchy mutation mechanism or the improved convergence factor, can achieve comparable optimization results. Nevertheless, overall, IWOA exhibits significant differences compared to WOA1, WOA2, WOA3, and WOA4, demonstrating the superiority of the proposed algorithm.

5.4. Analysis of Bus Passenger Flow Forecasting Results

5.4.1. Fuzzy Information Granulation Processing

Based on the IC card data of bus passenger flow and the corresponding influencing factors in Harbin from 1 March 2021 to 31 October 2021, covering a total of 245 days, fuzzy granulation was performed with periods of 3, 5, 7, and 14 days according to the results of the passenger flow fluctuation characteristic analysis. The three sets of granulated window data, namely the minimum value (LOW), mean value (R), and maximum value (UP), were extracted to represent the fluctuation trends and spatial range indicators of bus passenger flow, and were then used as the inputs of the IWOA-BiGRU model. The corresponding prediction results are presented in Table 9.

As shown in Table 9, when applying a 7-day weekly cycle for fuzzy granulation, the 245 days of bus passenger flow data are divided into 35 fuzzy granulated windows, which achieves superior prediction accuracy. This is mainly because the division method not only accounts for the travel cycles of working days and non-working days of domestic residents but also ensures a sufficient sample size after partitioning. Using triangular fuzzy membership functions for fuzzy granulation processing, the minimum value (LOW), mean value (R), and maximum value (UP) were obtained to represent the fluctuation trends and spatial range indicators of bus passenger flow. The specific results are shown in Figure 9, and the normalized data within the interval (0,1) are shown in Figure 10.
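The windowing step can be sketched as follows. The example takes the description literally, using the minimum, mean, and maximum of each 7-day window as the three parameters of a triangular fuzzy number; other estimators of the triangle parameters exist, so this is an illustrative reading and not necessarily the authors' procedure. The input series is synthetic.

```python
def granulate(series, window=7):
    """Split a daily series into windows and return (LOW, R, UP) per window.

    LOW, R and UP are taken as the window minimum, mean and maximum.
    """
    if window <= 0 or len(series) % window != 0:
        raise ValueError("series length must be a positive multiple of window")
    granules = []
    for start in range(0, len(series), window):
        chunk = series[start:start + window]
        granules.append((min(chunk), sum(chunk) / len(chunk), max(chunk)))
    return granules


def triangular_membership(x, low, r, up):
    """Membership degree of x in the triangular fuzzy number (low, r, up)."""
    if x == r:
        return 1.0
    if low < x < r:
        return (x - low) / (r - low)
    if r < x < up:
        return (up - x) / (up - r)
    return 0.0


if __name__ == "__main__":
    daily_flow = list(range(245))  # synthetic stand-in for 245 days of data
    granules = granulate(daily_flow, window=7)
    print(len(granules), granules[0])
```

With 245 days and a 7-day window, the series divides evenly into 35 granules, matching the number of windows reported above.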

5.4.2. Forecasting Results Analysis

After dividing the bus passenger flow data into fuzzy granulation windows with a 7-day length, 35 sets of new window data were obtained. The data from weeks 1–17 were used as the training set to establish the predictive regression function of the model, while the data from weeks 18–35 were used as the testing set to verify the model’s effectiveness. The parameter settings of the IWOA were as follows: population size POP = 25, maximum number of iterations Max_iter = 40, lower bounds of variable values lb = [0.0001,10,0.00001], upper bounds of variable values ub = [0.01,100,0.005], and variable dimension dim = length(lb).

The fuzzy-granulated bus passenger flow data of LOW, R, and UP were used as the inputs of the IWOA-BiGRU. The IWOA was employed to optimize the parameters of the regression prediction model, and the fuzzy-granulated prediction values of bus passenger flow were obtained. The fitness variation of the regression function fitted to the fuzzy-granulated bus passenger flow data is shown in Figure 11. To evaluate the accuracy of the forecasting results, regression analysis was conducted to examine the linear correlation between the actual and predicted values, as shown in Figure 12. From Figure 12, it can be observed that the actual and predicted values of the three fuzzy-granulated data groups (LOW, R, and UP) exhibit strong linear correlations. The slopes are 0.8699, 0.7507, and 0.9146, and the coefficients of determination $R^{2}$ are 0.9058, 0.9119, and 0.9146, respectively. This indicates that the differences between the actual and predicted values are small, and the regression model achieves good prediction accuracy.
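The slope and R² of such a regression can be computed as in the sketch below, which fits the predicted values against the actual values by ordinary least squares. The direction of the regression and the sample numbers are assumptions for illustration.

```python
def fit_line(actual, predicted):
    """Least-squares line predicted = slope * actual + intercept, plus R^2."""
    n = len(actual)
    if n < 2 or n != len(predicted):
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(actual) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in actual)
    syy = sum((y - mean_y) ** 2 for y in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(actual, predicted))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a simple linear fit, R^2 equals the squared Pearson correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


if __name__ == "__main__":
    actual = [30000.0, 32000.0, 35000.0, 37000.0]     # hypothetical values
    predicted = [30500.0, 31800.0, 34200.0, 36900.0]  # hypothetical values
    print(fit_line(actual, predicted))
```

A slope close to 1 together with an R² close to 1 indicates that the predictions track the actual values in both scale and trend.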

5.5. Analysis of Model Parameter Impacts

The maximum number of iterations in the bus passenger flow fluctuation trend and spatial prediction model refers to the number of times that, after the optimal individual has been determined, the other individuals update their positions toward it in the IWOA’s search for the global optimum. Due to space limitations, the parameter impact analysis was performed only on the R data after fuzzy granulation of bus passenger flow. Based on the parameter settings of the bus passenger flow fluctuation trend and spatial prediction model, the maximum number of iterations was set to 5, 10, 15, …, 60. The corresponding MAPE values of the prediction results are shown in Figure 13a. As seen in Figure 13a, with other IWOA parameters fixed, when the maximum number of iterations exceeds 40, the change in MAPE per additional iteration becomes much smaller, while the computation time of the model increases notably. Excessive iteration settings may lead to redundant calculations on sample data, thereby prolonging the runtime of the prediction model.

The population size in the bus passenger flow fluctuation trend and spatial prediction model represents the exploration space of the IWOA during each run. A larger exploration space increases the probability of finding the optimal individual and the global optimum. Based on the parameter settings of the prediction model for R data after fuzzy granulation of bus passenger flow, the population size was set to 5, 10, 15, …, 60. The corresponding MAPE values of the prediction results are shown in Figure 13b. As observed in Figure 13b, when the population size exceeds 25, the change in MAPE per additional increase in population size becomes much smaller, but the model runtime increases considerably. An excessively large population size may result in an overly wide search space, leading to redundant feature information and prolonged runtime of the prediction model.

5.6. Comparative Analysis of Model Forecasting Results

To further compare the prediction results of the models, the population size of IWOA, WOA, PSO, and GA was uniformly set to 25, with a maximum iteration number of 40, and the BiGRU algorithm parameters were kept consistent. The three sets of window data (LOW, R, and UP) obtained after fuzzy granulation of bus passenger flow were used as the input for both the IWOA-BiGRU model and existing forecasting models, and the corresponding prediction results were obtained. The statistical errors of the forecasting results for each model are shown in Figure 14. As illustrated in Figure 14, the error statistics of the IWOA-BiGRU model are consistently lower than those of the existing forecasting models. It is noteworthy that the forecasting error of the bus passenger flow fuzzy granulation results increases with the granulation values. The main reason may be that as the granulation values increase, the fluctuation trends and spatial variation ranges of the fuzzy-granulated bus passenger flow data become relatively larger.

According to Table 10, the average MAE, RMSE, and MAPE of the IWOA-BiGRU forecasting results for the three groups of window data (LOW, R, and UP) after bus passenger flow fuzzy granulation are 2915, 3075, and 8.1%, respectively. Compared with WOA-BiGRU, the improvements are 484, 686, and 1.5%; compared with PSO-LSSVM, the improvements are 1141, 1291, and 3.0%; compared with GA-BP, the improvements are 1301, 1441, and 3.5%; and compared with BiGRU, the improvements are 1378, 1498, and 3.7%. Meanwhile, relative to the existing Transformer algorithm, the improvements are 737, 932, and 2.1%, respectively. It is noteworthy that the MAPE of all models is positively correlated with the granulation values, but the variation in IWOA-BiGRU is the smallest, indicating better robustness and applicability.
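For readers reproducing the comparison, the three error metrics can be computed as in the sketch below. It uses the standard definitions of MAE, RMSE, and MAPE, which are assumed to match those used in the paper; the sample values are hypothetical.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    actual = [31000.0, 36000.0, 44000.0]     # hypothetical granulated values
    predicted = [29500.0, 38000.0, 47000.0]  # hypothetical forecasts
    print(mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

MAE and RMSE are expressed in passengers, whereas MAPE is scale-free, which is why it is the more convenient metric for comparing errors across the LOW, R, and UP series.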

To further compare and analyze the performance of the IWOA-BiGRU prediction results, the mean MAPE values of the three window sets (LOW, R, and UP) predicted by the IWOA-BiGRU model were subjected to paired t-tests with the corresponding mean MAPE values predicted by WOA-BiGRU, Transformer, PSO-LSSVM, GA-BP, and BiGRU, denoted as T1, T2, T3, T4, and T5, respectively. The results of the paired t-tests for the mean MAPE values of the prediction results of each model are presented in Table 11. As shown in Table 11, the significance levels of the mean MAPE values predicted by the IWOA-BiGRU model are all less than 0.05, indicating that the IWOA-BiGRU model demonstrates significant superiority in predicting bus passenger flow fluctuations.

As shown in Table 12, the upper and lower bounds of the bus passenger flow data thresholds obtained through triangular fuzzy information granulation encompass the range of the actual passenger flow data. From Figure 11 and Figure 13, combined with the data in Table 8 and Table 9, it can be seen that the IWOA-BiGRU forecasting results for the fuzzy-granulated bus passenger flow data (LOW, R, and UP) in week 35 (25–31 October 2021) are 34,539, 39,447, and 51,224, respectively. The prediction accuracy of this model is higher compared with existing forecasting models. Relative to the fuzzy-granulated values of bus passenger flow in the previous week (31,527, 35,625, and 43,822), the model forecasting results show an overall upward trend, which is generally consistent with the actual data trend, indicating the reliability of the forecasting results.

These findings provide data support and decision-making references for the formulation of bus operation scheduling plans and for passengers to reasonably plan travel routes and times, demonstrating practical application value.

To evaluate the applicability of the IWOA-BiGRU, the average runtime of forecasting models with different fuzzy-granulated bus passenger flow windows was recorded, as shown in Figure 15.

From Figure 15, it can be seen that the average runtime of IWOA-BiGRU is 412 s, which is shorter than that of WOA-BiGRU, PSO-LSSVM, and GA-BP by 8.2 s, 18.3 s, and 44.5 s, respectively. Although its runtime is longer than that of BiGRU, the higher prediction accuracy of IWOA-BiGRU demonstrates better applicability.

6. Conclusions

The main conclusions of this study are as follows:

(1) This paper applies a fuzzy information granulation method, which reduces the spatiotemporal complexity of bus passenger flow data while preserving its valid information, and constructs a FIG-IWOA-BiGRU-based model for forecasting bus passenger flow fluctuation trends and spatial distribution. The MAPE values of the forecasting results for the fuzzy-granulated LOW, R, and UP window data are 7.8%, 8.1%, and 8.5%, respectively. Compared with existing forecasting models, these values improve by 1.3–3.5%, 1.5–3.7%, and 1.7–4.0%, demonstrating good prediction accuracy and robustness. The findings contribute to the reasonable prediction of bus passenger flow fluctuation trends and spatial patterns, providing data support and decision-making references for bus operation scheduling plans and for passengers in planning travel routes and times.

(2) This paper improves the whale optimization algorithm (WOA) with hybrid strategies, including Logistic–Tent combined chaotic mapping, an elite opposition-based learning and Cauchy mutation hybrid mechanism, and modified convergence factor and inertia weight. These strategies enhance population diversity and quality while preserving the strengths of WOA, and improve the algorithm’s capability to balance global search and local exploitation. Comparative analysis of numerical simulation results and convergence characteristics on 21 benchmark test functions shows that IWOA outperforms WOA, GWO, PSO, and GA in solution accuracy, convergence speed, and the ability to escape local optima, further validating the effectiveness of the hybrid strategy-improved WOA proposed in this study.

In addition, due to the privacy of bus data, this study did not obtain operational data from other cities, which imposes certain limitations. In future research, the following aspects will be further explored:

(1) The main research content of this study is based on Harbin city’s bus IC card and weather condition data. Future research should further collect and organize bus IC card and weather data from other cities to comparatively analyze the impact of different built environments on the spatiotemporal distribution, fluctuation characteristics, and trends of bus passenger flow.

(2) Based on the analysis of bus passenger flow data in different urban regions, by considering modifications to the whale optimization algorithm’s search mechanism and algorithm structure and integrating the advantages of other intelligent optimization algorithms, more efficient intelligent optimization algorithms with improved search and convergence speed can be proposed, thereby enhancing the accuracy and applicability of models predicting bus passenger flow fluctuation trends and spatial distribution.

Author Contributions

Conceptualization, N.W. and Q.H.; methodology, J.Z.; software, X.L.; validation, J.Z.; formal analysis, J.Z.; investigation, X.L.; resources, N.W.; data curation, S.X.; writing—original draft preparation, J.Z.; writing—review and editing, Q.H.; visualization, Q.H.; supervision, N.W.; project administration, S.X.; funding acquisition, N.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by “the Natural Science Foundation of Fujian Province, grant number 2023J011093”; “the Collaborative Innovation Center Project of Ningde Normal University, grant number 2023ZX01”; “the Digital Intelligence Ningde Fujian Province Data Annotation Base”; “the Young Scholars Science Foundation of Lanzhou Jiaotong University, grant number 2025021”; and “Research Projects of Ningde Normal University, grant number 2024Y17”.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.


Figure 1. Travel characteristics of bus passenger flow.

Figure 2. Correlation of influencing factors of bus passenger flow.

Figure 3. Histogram distribution of bus passenger flow and fuzzy membership functions.

Figure 4. Structure of BiGRU.

Figure 5. Flowchart of Bus Passenger Flow Fluctuation Trend and Spatial Prediction Based on FIG-IWOA-BiGRU.

Figure 6. Box plot statistics of the benchmark functions.

Figure 7. Average convergence curves of benchmark test functions.

Figure 8. Algorithm runtime comparison.

Figure 9. Visualization of fuzzy information granulation.

Figure 10. Normalized granulation values.

Figure 11. Fitness variation.

Figure 12. Correlation between actual and predicted values under different bus passenger flow fluctuation trends (LOW/R/UP).

Figure 13. Impacts of model parameters on forecasting results.

Figure 14. Error comparison analysis under different bus passenger flow fluctuation trends (LOW/R/UP).

Figure 15. Average runtime of different models.

Table 1. Test function Information.

Algorithm	Value
IWOA	Parameter a decreases from 2 to 0 following a cosine function, p,r₁,r₂ ∈ [0, 1], spiral constant b = 1, and l ∈ [−1, 1]
WOA	Parameter a decreases linearly from 2 to 0, p,r₁,r₂ ∈ [0,1], spiral constant b = 1, and l ∈ [−1,1]
GWO	Parameter a decreases linearly from 2 to 0, r₁,r₂ ∈ [0,1]
PSO	Learning factors C₁ = C₂ = 2, particle velocity V_max = 2, V_min = −2, and inertia weight W_max = 0.8, W_min = 0.2
GA	Crossover probability P_c = 0.8, mutation probability P_m = 0.2.
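Table 1 states that the parameter a falls from 2 to 0 linearly in WOA and along a cosine curve in IWOA. As a rough illustration, the sketch below compares the two schedules. The cosine form used here, a(t) = 2 cos(πt / 2T), is an assumption chosen only because it meets the stated endpoints; it is not necessarily the exact expression used in the paper, and the function names are hypothetical.

```python
import math

def a_linear(t, t_max):
    """Linear schedule: a goes from 2 (t = 0) to 0 (t = t_max)."""
    return 2.0 * (1.0 - t / t_max)

def a_cosine(t, t_max):
    """Assumed cosine schedule with the same endpoints (2 at t = 0, 0 at t = t_max)."""
    return 2.0 * math.cos(math.pi * t / (2.0 * t_max))

if __name__ == "__main__":
    t_max = 100
    for t in (0, 25, 50, 75, 100):
        print(t, round(a_linear(t, t_max), 4), round(a_cosine(t, t_max), 4))
```

Under this assumed form the cosine schedule stays above the linear one for most of the run, so a large a (wide exploration) is kept longer before the final contraction.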

Table 2. Relevant values of test functions.

Type	Function	Dimension	Search Range	Theoretical Optimal Value
Unimodal Benchmark test functions	$f_{1} (x) = \sum_{i = 1}^{n} x_{i}^{2}$	30	$[- 100, 100]$	0
	$f_{2} (x) = \sum_{i = 1}^{n} |x_{i}| + \prod_{i = 1}^{n} |x_{i}|$	30	$[- 10, 10]$	0
	$f_{3} (x) = \sum_{i = 1}^{n} {(\sum_{j = 1}^{i} x_{j})}^{2}$	30	$[- 100, 100]$	0
	$f_{4} (x) = \max_{i} \{|x_{i}|, 1 \leq i \leq n\}$	30	$[- 100, 100]$	0
	$f_{5} (x) = \sum_{i = 1}^{n - 1} [100 {(x_{i + 1} - x_{i}^{2})}^{2} + {(x_{i} - 1)}^{2}]$	30	$[- 30, 30]$	0
	$f_{6} (x) = \sum_{i = 1}^{n} {([x_{i} + 0.5])}^{2}$	30	$[- 100, 100]$	0
	$f_{7} (x) = \sum_{i = 1}^{n} i x_{i}^{4} + \mathrm{random} [0, 1)$	30	$[- 1.28, 1.28]$	0
Multimodal benchmark test functions	$f_{8} (x) = \sum_{i = 1}^{n} - x_{i} \sin (\sqrt{|x_{i}|})$	30	$[- 500, 500]$	$- 418.98 \times \mathrm{Dim}$
	$f_{9} (x) = \sum_{i = 1}^{n} [x_{i}^{2} - 10 \cos (2 \pi x_{i}) + 10]$	30	$[- 5.12, 5.12]$	0
	$f_{10} (x) = - 20 \exp (- 0.2 \sqrt{\frac{1}{n} \sum_{i = 1}^{n} x_{i}^{2}}) - \exp (\frac{1}{n} \sum_{i = 1}^{n} \cos (2 \pi x_{i})) + 20 + e$	30	$[- 32, 32]$	0
	$f_{11} (x) = \frac{1}{4000} \sum_{i = 1}^{n} x_{i}^{2} - \prod_{i = 1}^{n} \cos (\frac{x_{i}}{\sqrt{i}}) + 1$	30	$[- 600, 600]$	0
	$\begin{array}{l} f_{12} (x) = \frac{\pi}{n} \{10 \sin (\pi y_{1}) + \sum_{i = 1}^{n - 1} {(y_{i} - 1)}^{2} [1 + 10 \sin^{2} (\pi y_{i + 1})] + {(y_{n} - 1)}^{2}\} \\ + \sum_{i = 1}^{n} u (x_{i}, 10, 100, 4) \\ y_{i} = 1 + \frac{x_{i} + 1}{4} \\ u (x_{i}, a, k, m) = \begin{cases} k {(x_{i} - a)}^{m} & x_{i} > a \\ 0 & - a < x_{i} < a \\ k {(- x_{i} - a)}^{m} & x_{i} < - a \end{cases} \end{array}$	30	$[- 50, 50]$	0
	$\begin{array}{l} f_{13} (x) = 0.1 \{\sin^{2} (3 \pi x_{1}) + \sum_{i = 1}^{n} {(x_{i} - 1)}^{2} [1 + \sin^{2} (3 \pi x_{i} + 1)] \\ + {(x_{n} - 1)}^{2} [1 + \sin^{2} (2 \pi x_{n})]\} + \sum_{i = 1}^{n} u (x_{i}, 5, 100, 4) \end{array}$	30	$[- 50, 50]$	0
Composite multimodal benchmark test functions	$f_{14} (x) = {(\frac{1}{500} + \sum_{j = 1}^{25} \frac{1}{j + \sum_{i = 1}^{2} {(x_{i} - a_{i j})}^{6}})}^{- 1}$	2	$[- 65, 65]$	1
	$f_{15} (x) = \sum_{i = 1}^{11} {[a_{i} - \frac{x_{1} (b_{i}^{2} + b_{i} x_{2})}{b_{i}^{2} + b_{i} x_{3} + x_{4}}]}^{2}$	4	$[- 5, 5]$	0.0003
	$f_{16} (x) = {(x_{2} - \frac{5.1}{4 \pi^{2}} x_{1}^{2} + \frac{5}{\pi} x_{1} - 6)}^{2} + 10 (1 - \frac{1}{8 \pi}) \cos x_{1} + 10$	2	$[- 5, 5]$	0.398
	$\begin{array}{l} f_{17} (x) = [1 + {(x_{1} + x_{2} + 1)}^{2} (19 - 14 x_{1} + 3 x_{1}^{2} - 14 x_{2} + 6 x_{1} x_{2} + 3 x_{2}^{2})] \\ \times [30 + {(2 x_{1} - 3 x_{2})}^{2} \times (18 - 32 x_{1} + 12 x_{1}^{2} + 48 x_{2} - 36 x_{1} x_{2} + 27 x_{2}^{2})] \end{array}$	2	$[- 2, 2]$	3
	$f_{18} (x) = - \sum_{i = 1}^{4} c_{i} \exp (- \sum_{j = 1}^{6} a_{i j} {(x_{j} - p_{i j})}^{2})$	6	$[0, 1]$	−3.32
	$f_{19} (x) = - \sum_{i = 1}^{5} {[(X - a_{i}) {(X - a_{i})}^{T} + c_{i}]}^{- 1}$	4	$[0, 10]$	−10.1532
	$f_{20} (x) = - \sum_{i = 1}^{7} {[(X - a_{i}) {(X - a_{i})}^{T} + c_{i}]}^{- 1}$	4	$[0, 10]$	−10.4028
	$f_{21} (x) = - \sum_{i = 1}^{10} {[(X - a_{i}) {(X - a_{i})}^{T} + c_{i}]}^{- 1}$	4	$[0, 10]$	−10.5363
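To make the definitions in Table 2 concrete, the following minimal sketch implements three of the listed functions in plain Python: the sphere function (first entry), the Rastrigin-type function (ninth entry), and the Ackley-type function (tenth entry). The function names are hypothetical, and this is an illustration of the formulas as tabulated rather than the authors' test code.

```python
import math

def sphere(x):
    """First entry of Table 2: sum of squares, optimum 0 at x = 0."""
    return sum(v * v for v in x)

def rastrigin(x):
    """Ninth entry of Table 2: sum of x_i^2 - 10 cos(2 pi x_i) + 10, optimum 0 at x = 0."""
    return sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) + 10.0 for v in x)

def ackley(x):
    """Tenth entry of Table 2: optimum 0 at x = 0 (up to floating-point rounding)."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / n
    return -20.0 * math.exp(-0.2 * math.sqrt(mean_sq)) - math.exp(mean_cos) + 20.0 + math.e

if __name__ == "__main__":
    origin = [0.0] * 30  # dimension 30, as listed in Table 2
    print(sphere(origin), rastrigin(origin), ackley(origin))
```

Evaluating each function at the origin is a quick sanity check against the theoretical optimal value column before running any optimizer on it.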

Table 3. Unimodal Benchmark test functions.

Function	Indicator	IWOA	WOA	GWO	PSO	GA
f₁	Best	4.3 × 10⁻¹³³	1.5 × 10⁻⁸⁵	7.6 × 10⁻³⁰	2.9 × 10	2.1 × 10³
	Worst	4.4 × 10⁻¹¹⁰	2.3 × 10⁻⁷³	5.1 × 10⁻²⁷	5.5 × 10²	1.6 × 10⁴
	Mean	1.5 × 10⁻¹¹¹	1.8 × 10⁻⁷⁴	1.3 × 10⁻²⁷	3.3 × 10²	8.5 × 10³
	Standard	8.1 × 10⁻¹¹¹	5.6 × 10⁻⁷⁴	1.4 × 10⁻²⁷	1.3 × 10²	4.0 × 10³
f₂	Best	1.0 × 10⁻⁸²	1.8 × 10⁻⁵⁹	1.7 × 10⁻¹⁷	6.9	2.3 × 10
	Worst	5.5 × 10⁻⁶⁷	6.1 × 10⁻⁵⁰	7.4 × 10⁻¹⁶	4.5 × 10	6.0 × 10
	Mean	1.8 × 10⁻⁶⁸	2.8 × 10⁻⁵¹	1.3 × 10⁻¹⁶	1.7 × 10	4.0 × 10
	Standard	1.0 × 10⁻⁶⁷	1.2 × 10⁻⁵⁰	1.3 × 10⁻¹⁶	9.3	9.5
f₃	Best	7.9 × 10⁻⁹	7.1 × 10³	9.3 × 10⁻¹²	1.8 × 10³	2.2 × 10⁴
	Worst	4.0 × 10⁻³	7.2 × 10⁴	6.2 × 10⁻¹	5.0 × 10⁴	5.2 × 10⁴
	Mean	1.4 × 10⁻⁴	4.5 × 10⁴	3.3 × 10⁻²	1.1 × 10⁴	3.7 × 10⁴
	Standard	7.3 × 10⁻⁴	1.4 × 10⁴	1.2 × 10⁻¹	1.1 × 10⁴	7.9 × 10³
f₄	Best	2.2 × 10⁻³¹	2.9 × 10⁻²	1.1 × 10⁻⁷	4.5	5.1 × 10
	Worst	2.5 × 10⁻²³	8.9 × 10	5.4 × 10⁻⁶	1.4 × 10	8.6 × 10
	Mean	1.2 × 10⁻²⁴	5.5 × 10	1.1 × 10⁻⁶	8.7	6.9 × 10
	Standard	4.6 × 10⁻²⁴	2.9 × 10	1.1 × 10⁻⁶	2.6	8.3
f₅	Best	2.6 × 10	2.7 × 10	2.8 × 10	7.6 × 10²	2.1 × 10⁴
	Worst	2.9 × 10	2.9 × 10	2.9 × 10	4.9 × 10⁴	6.4 × 10⁶
	Mean	2.7 × 10	2.8 × 10	2.9 × 10	1.3 × 10⁴	1.1 × 10⁶
	Standard	6.7 × 10⁻¹	4.3 × 10⁻¹	4.9 × 10⁻²	1.3 × 10⁴	1.7 × 10⁶
f₆	Best	2.2 × 10⁻⁶	5.6 × 10⁻²	2.4 × 10⁻¹	7.4 × 10	3.4 × 10³
	Worst	3.1 × 10⁻⁴	1.0	1.8	9.5 × 10²	2.4 × 10⁴
	Mean	6.3 × 10⁻⁵	4.2 × 10⁻¹	8.3 × 10⁻¹	3.8 × 10²	1.0 × 10⁴
	Standard	7.2 × 10⁻⁵	2.4 × 10⁻¹	3.7 × 10⁻¹	2.1 × 10²	6.1 × 10³
f₇	Best	7.3 × 10⁻⁵	1.9 × 10⁻⁴	2.2 × 10⁻⁴	3.3 × 10⁻²	1.5 × 10⁻¹
	Worst	4.5 × 10⁻³	7.2 × 10⁻³	4.2 × 10⁻³	1.3 × 10	2.0
	Mean	1.1 × 10⁻³	2.3 × 10⁻³	1.8 × 10⁻³	1.2	6.8 × 10⁻¹
	Standard	1.1 × 10⁻³	2.1 × 10⁻³	9.8 × 10⁻⁴	2.8	5.1 × 10⁻¹

Table 4. Multimodal benchmark test functions.

Function	Indicator	IWOA	WOA	GWO	PSO	GA
f₈	Best	−1.3 × 10⁴	−1.3 × 10⁴	−7.3 × 10³	−9.2 × 10³	−2.8 × 10³
	Worst	−5.0 × 10³	−5.7 × 10³	−3.4 × 10³	−4.6 × 10³	−1.0 × 10³
	Average	−1.2 × 10⁴	−1.0 × 10⁴	−5.9 × 10³	−7.1 × 10³	−2.1 × 10³
	Standard	1.9 × 10³	2.1 × 10³	8.4 × 10²	1.1 × 10³	4.9 × 10²
f₉	Best	0.0	0.0	5.7 × 10⁻¹⁴	1.2 × 10²	1.6 × 10²
	Worst	0.0	1.1 × 10⁻¹³	7.0	2.4 × 10²	3.0 × 10²
	Average	0.0	3.8 × 10⁻¹⁵	1.1	1.9 × 10²	2.6 × 10²
	Standard	0.0	2.1 × 10⁻¹⁴	1.9	3.5 × 10	3.0 × 10
f₁₀	Best	4.4 × 10⁻¹⁶	4.4 × 10⁻¹⁶	7.5 × 10⁻¹⁴	3.8	1.9 × 10
	Worst	7.5 × 10⁻¹⁵	4.0 × 10⁻¹⁵	1.5 × 10⁻¹³	8.2	2.1 × 10
	Average	3.2 × 10⁻¹⁵	3.0 × 10⁻¹⁵	1.0 × 10⁻¹³	5.7	2.0 × 10
	Standard	2.7 × 10⁻¹⁵	1.6 × 10⁻¹⁵	1.7 × 10⁻¹⁴	1.0	3.7 × 10⁻¹
f₁₁	Best	0.0	0.0	0.0	1.7	2.1 × 10
	Worst	0.0	1.7 × 10⁻¹	5.9 × 10⁻²	6.6	1.7 × 10²
	Average	0.0	9.9 × 10⁻³	4.9 × 10⁻³	3.9	9.1 × 10
	Standard	0.0	3.8 × 10⁻²	1.2 × 10⁻²	1.2	4.6 × 10
f₁₂	Best	2.3 × 10⁻⁷	3.7 × 10⁻³	1.9 × 10⁻²	1.1	9.8 × 10⁻¹
	Worst	4.5 × 10⁻⁵	9.7 × 10⁻²	8.4 × 10⁻²	1.3 × 10	3.3 × 10
	Average	7.1 × 10⁻⁶	2.4 × 10⁻²	4.3 × 10⁻²	5.8	1.1 × 10
	Standard	8.8 × 10⁻⁶	1.8 × 10⁻²	1.8 × 10⁻²	2.9	6.8
f₁₃	Best	1.1 × 10⁻¹	3.2 × 10⁻¹	5.0 × 10⁻⁵	6.8	1.3 × 10
	Worst	1.3 × 10	1.2	2.6	4.4 × 10	3.8 × 10⁶
	Average	5.8 × 10⁻¹	7.4 × 10⁻¹	1.1	1.9 × 10	2.3 × 10⁵
	Standard	3.1 × 10⁻¹	2.5 × 10⁻¹	1.0	9.7	8.1 × 10⁵

Table 5. Composite multimodal benchmark test functions.

Function	Indicator	IWOA	WOA	GWO	PSO	GA
f₁₄	Best	1.0	1.0	1.0	1.0	1.0
	Worst	1.0	1.1 × 10	1.3 × 10	1.3 × 10	1.0
	Average	1.0	3.0	3.0	9.1	1.0
	Standard	1.2 × 10⁻¹⁰	2.9	3.0	5.0	2.8 × 10⁻¹⁰
f₁₅	Best	3.3 × 10⁻⁴	3.1 × 10⁻⁴	8.0 × 10⁻⁴	9.8 × 10⁻⁴	3.1 × 10⁻⁴
	Worst	2.2 × 10⁻³	2.0 × 10⁻²	2.0 × 10⁻²	2.4 × 10⁻²	2.3 × 10⁻²
	Average	6.5 × 10⁻⁴	3.7 × 10⁻³	6.3 × 10⁻³	7.5 × 10⁻³	5.5 × 10⁻³
	Standard	3.7 × 10⁻⁴	7.6 × 10⁻³	7.6 × 10⁻³	7.4 × 10⁻³	8.9 × 10⁻³
f₁₆	Best	4.0 × 10⁻¹	4.0 × 10⁻¹	4.0 × 10⁻¹	4.0 × 10⁻¹	5.9 × 10
	Worst	4.0 × 10⁻¹	4.0 × 10⁻¹	4.0 × 10⁻¹	4.0 × 10⁻¹	8.1 × 10
	Average	4.0 × 10⁻¹	4.0 × 10⁻¹	4.0 × 10⁻¹	4.0 × 10⁻¹	6.9 × 10
	Standard	8.7 × 10⁻⁶	3.3 × 10⁻⁵	7.0 × 10⁻⁵	1.6 × 10⁻⁵	6.5
f₁₇	Best	−3.9	−3.9	−3.9	−3.1	−3.7
	Worst	−3.9	−3.8	−3.9	−3.1	−2.5
	Average	−3.9	−3.9	−3.9	−3.1	−3.1
	Standard	2.8 × 10⁻³	1.5 × 10⁻²	1.7 × 10⁻³	2.7 × 10⁻⁵	3.6 × 10⁻¹
f₁₈	Best	−3.3	−3.3	−3.3	−3.2	−2.7
	Worst	−3.1	−2.4	−2.6	−1.1	−6.1 × 10⁻¹
	Average	−3.3	−3.2	−3.1	−2.6	−1.6
	Standard	7.1 × 10⁻²	1.9 × 10⁻¹	1.8 × 10⁻¹	7.9 × 10⁻¹	5.3 × 10⁻¹
f₁₉	Best	−1.0 × 10	−1.0 × 10	−1.0 × 10	−1.0 × 10	−3.8
	Worst	−5.1	−2.6	−2.6	−5.1	−9.6 × 10⁻¹
	Average	−1.0 × 10	−7.3	−8.5	−8.2	−2.1
	Standard	9.2 × 10⁻¹	2.9	2.6	2.5	7.8 × 10⁻¹
f₂₀	Best	−1.0 × 10	−1.0 × 10	−1.0 × 10	−1.0 × 10	−3.4
	Worst	−5.1	−1.8	−2.8	−3.7 × 10⁻¹	−1.0
	Average	−1.0 × 10	−7.8	−9.4	−7.4	−2.1
	Standard	9.7 × 10⁻¹	3.1	2.2	3.4	6.2 × 10⁻¹
f₂₁	Best	−1.1 × 10	−1.1 × 10	−1.1 × 10	−1.1 × 10	−3.8
	Worst	−9.9	−2.4	−2.4 × 10	−5.1	−1.2
	Average	−1.0 × 10	−7.1	−1.0 × 10	−8.4	−2.1
	Standard	1.5 × 10⁻¹	3.2	2.1	2.7	6.5 × 10⁻¹

Table 6. Wilcoxon rank-sum test p-value.

Test Function	P1	P2	P3	P4
f₁	9.5 × 10⁻³⁸	2.5 × 10⁻¹⁷	1.6 × 10⁻²⁸	8.6 × 10⁻⁸³
f₂	2.1 × 10⁻¹¹	3.2 × 10⁻¹³	6.9 × 10⁻⁵⁶	2.5 × 10⁻⁶²
f₃	1.3 × 10⁻⁸³	1.3 × 10⁻⁸³	9.3 × 10⁻⁷⁴	1.2 × 10⁻⁸³
f₄	6.4 × 10⁻²⁹	5.4 × 10⁻⁶⁹	1.3 × 10⁻¹²	4.8 × 10⁻³⁰
f₅	2.4 × 10⁻⁸²	1.3 × 10⁻⁸³	2.6 × 10⁻⁷¹	1.2 × 10⁻⁸²
f₆	1.8 × 10⁻⁵⁴	1.9 × 10⁻⁵³	1.1 × 10⁻²⁷	1.7 × 10⁻⁸²
f₇	4.7 × 10⁻⁷⁴	1.1 × 10⁻⁷⁷	1.5 × 10⁻⁸	1.3 × 10⁻⁷
f₈	5.9 × 10⁻⁵¹	1.3 × 10⁻⁸³	1.3 × 10⁻⁷⁹	1.3 × 10⁻⁸³
f₉	3.8 × 10⁻⁶²	2.1 × 10⁻³¹	3.3 × 10⁻³⁶	9.4 × 10⁻⁴⁸
f₁₀	3.5 × 10⁻⁶¹	1.8 × 10⁻¹¹	1.7 × 10⁻⁸³	4.9 × 10⁻⁸⁴
f₁₁	3.8 × 10⁻²⁵	1.8 × 10⁻²⁵	3.6 × 10⁻³⁰	5.8 × 10⁻⁸³
f₁₂	8.8 × 10⁻⁵⁶	1.9 × 10⁻⁵⁴	8.8 × 10⁻²³	1.8 × 10⁻¹⁷
f₁₃	1.9 × 10⁻¹⁹	1.3 × 10⁻⁸³	5.9 × 10⁻⁶⁷	1.7 × 10⁻⁸¹
f₁₄	1.6 × 10⁻⁸⁴	1.4 × 10⁻⁸³	1.3 × 10⁻⁸³	2.6 × 10⁻⁶¹
f₁₅	6.1 × 10⁻⁷⁰	9.3 × 10⁻⁷⁵	1.2 × 10⁻⁸¹	1.7 × 10⁻⁸²
f₁₆	1.3 × 10⁻⁸³	1.3 × 10⁻⁸³	1.3 × 10⁻⁸³	1.8 × 10⁻⁸³
f₁₇	1.2 × 10⁻⁶⁴	2.5 × 10⁻²¹	1.6 × 10⁻⁶⁷	4.7 × 10⁻⁸⁰
f₁₈	2.5 × 10⁻⁸²	1.3 × 10⁻⁸³	1.1 × 10⁻⁸³	1.5 × 10⁻⁷⁹
f₁₉	4.3 × 10⁻¹¹	4.9 × 10⁻⁹	1.3 × 10⁻⁷⁵	1.4 × 10⁻⁷¹
f₂₀	6.4 × 10⁻¹²	5.9 × 10⁻³⁶	1.6 × 10⁻⁸⁰	1.6 × 10⁻⁶³
f₂₁	3.7 × 10⁻⁸³	1.3 × 10⁻⁸³	2.3 × 10⁻⁸³	1.9 × 10⁻⁶⁸
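For readers unfamiliar with the test behind Table 6, the sketch below shows one way a two-sided Wilcoxon rank-sum p-value can be computed for two independent samples of final fitness values. It uses the large-sample normal approximation with average ranks for ties and no tie correction to the variance. The function name and the toy samples are hypothetical; the run counts and exact procedure used in the paper are not assumed here.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (z, p), where z is the standardized rank sum of sample x.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p

if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0, 5.0]   # toy fitness values, algorithm A
    b = [6.0, 7.0, 8.0, 9.0, 10.0]  # toy fitness values, algorithm B
    print(rank_sum_test(a, b))
```

A p-value below the chosen significance level (commonly 0.05) indicates that the two samples of results differ significantly.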

Table 7. Algorithm MAE ranking.

Algorithm	MAE	Rank
IWOA	2.13 × 10	1
WOA	1.78 × 10²	2
PSO	1.40 × 10³	3
GWO	4.10 × 10³	4
GA	4.85 × 10³	5

Table 8. p-values of the Wilcoxon rank-sum test in ablation experiments.

Test Function	PW1	PW2	PW3	PW4
f₁	5.2 × 10⁻¹³	Na	1.6 × 10⁻⁶⁴	1.5 × 10⁻⁶⁵
f₂	1.6 × 10⁻⁶⁴	3.3 × 10⁻⁵⁰	2.1 × 10⁻⁶⁴	6.3 × 10⁻⁶⁵
f₃	1.7 × 10⁻⁸¹	1.9 × 10⁻⁶⁴	1.5 × 10⁻⁷⁶	1.3 × 10⁻⁸¹
f₄	9.5 × 10⁻⁶⁶	Na	6.1 × 10⁻⁵³	1.7 × 10⁻⁶⁴
f₅	2.7 × 10⁻⁷⁵	3.1 × 10⁻⁷⁵	2.7 × 10⁻⁷⁵	3.3 × 10⁻⁷⁵
f₆	2.1 × 10⁻⁵²	2.7 × 10⁻⁵¹	1.1 × 10⁻²⁸	2.4 × 10⁻⁸¹
f₇	1.1 × 10⁻⁷¹	1.9 × 10⁻⁷⁵	Na	7.1 × 10⁻⁸¹
f₈	4.5 × 10⁻⁴⁹	1.7 × 10⁻⁸¹	2.2 × 10⁻⁷⁹	1.6 × 10⁻⁸¹
f₉	3.8 × 10⁻⁶⁰	2.6 × 10⁻²⁹	2.1 × 10⁻³⁷	9.3 × 10⁻⁴⁹
f₁₀	3.1 × 10⁻⁶⁰	2.8 × 10⁻⁸¹	1.7 × 10⁻⁸¹	6.5 × 10⁻⁸²
f₁₁	4.3 × 10⁻²³	2.1 × 10⁻²³	3.1 × 10⁻³¹	1.8 × 10⁻⁸¹
f₁₂	9.8 × 10⁻⁵⁴	2.2 × 10⁻⁵²	5.8 × 10⁻²⁰	4.8 × 10⁻¹⁸
f₁₃	4.2 × 10⁻²⁰	1.6 × 10⁻⁸¹	1.8 × 10⁻⁶⁹	1.6 × 10⁻⁸¹
f₁₄	1.9 × 10⁻⁸²	1.8 × 10⁻⁸¹	1.6 × 10⁻⁸¹	1.4 × 10⁻⁶³
f₁₅	3.8 × 10⁻⁷⁰	1.3 × 10⁻⁷⁷	1.6 × 10⁻⁸¹	1.8 × 10⁻⁸¹
f₁₆	1.7 × 10⁻⁸¹	1.6 × 10⁻⁸¹	1.3 × 10⁻⁸¹	1.2 × 10⁻⁸¹
f₁₇	1.3 × 10⁻⁶³	2.1 × 10⁻²¹	7.3 × 10⁻⁷⁰	1.6 × 10⁻⁸¹
f₁₈	1.6 × 10⁻⁷¹	1.8 × 10⁻⁵¹	1.3 × 10⁻⁸¹	1.5 × 10⁻⁶¹
f₁₉	4.5 × 10⁻⁸	1.5 × 10⁻⁶¹	1.1 × 10⁻⁸¹	1.6 × 10⁻⁵¹
f₂₀	3.7 × 10⁻³⁴	2.2 × 10⁻⁶¹	1.6 × 10⁻⁸¹	1.5 × 10⁻⁸¹
f₂₁	4.9 × 10⁻⁸¹	1.7 × 10⁻⁸¹	1.3 × 10⁻⁸¹	1.4 × 10⁻⁸¹

Note: “Na” indicates that the test is not applicable to the current sample.

Table 9. Prediction results of IWOA-BiGRU with fuzzy granulated input windows.

Period	LOW MAE	LOW RMSE	LOW MAPE (%)	R MAE	R RMSE	R MAPE (%)	UP MAE	UP RMSE	UP MAPE (%)
3d	2983	3360	10.0	3256	3652	11.0	4729	5045	11.6
5d	2883	3248	9.7	3068	3435	10.3	4562	4865	11.2
7d	2376	2555	7.8	2804	2941	8.1	3565	3730	8.5
14d	2654	2915	8.9	2876	3230	9.6	3926	4115	10.1
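The three error measures reported in Table 9 can be sketched as follows, assuming the standard definitions of MAE, RMSE, and MAPE (with MAPE expressed in percent). The function names and the two-point example are hypothetical and are not taken from the paper's data.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

if __name__ == "__main__":
    actual = [100.0, 200.0]     # toy values, not from the paper
    predicted = [110.0, 180.0]
    print(mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

MAE and RMSE carry the unit of the passenger counts, while MAPE is scale-free, which is why the three can rank window lengths slightly differently.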

Table 10. Comparison of bus passenger flow forecast results.

Model	LOW MAE	LOW RMSE	LOW MAPE (%)	R MAE	R RMSE	R MAPE (%)	UP MAE	UP RMSE	UP MAPE (%)
IWOA-BiGRU	2376	2555	7.8	2804	2941	8.1	3565	3730	8.5
WOA-BiGRU	2710	3003	9.1	3296	3743	9.6	4190	4537	10.2
Transformer	2917	3240	9.8	3536	3955	10.0	4502	4826	11.0
PSO-LSSVM	3123	3481	10.5	3925	4333	11.1	5119	5284	11.9
GA-BP	3229	3645	10.9	4089	4346	11.6	5330	5558	12.4
BiGRU	3349	3717	11.3	4155	4400	11.8	5375	5602	12.5
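As a small worked check on Table 10, the sketch below averages the IWOA-BiGRU errors over the LOW, R, and UP series and compares them with the plain BiGRU row. The averages come out at about 2915 (MAE), 3075 (RMSE), and 8.1% (MAPE), in line with the figures quoted in the abstract. The variable and function names are hypothetical; the numbers are copied from the table.

```python
# (MAE, RMSE, MAPE %) for LOW, R, UP, copied from Table 10.
rows = {
    "IWOA-BiGRU": [(2376, 2555, 7.8), (2804, 2941, 8.1), (3565, 3730, 8.5)],
    "BiGRU": [(3349, 3717, 11.3), (4155, 4400, 11.8), (5375, 5602, 12.5)],
}

def column_means(triples):
    """Average each metric over the three granulated series."""
    n = len(triples)
    return tuple(sum(t[k] for t in triples) / n for k in range(3))

iwoa = column_means(rows["IWOA-BiGRU"])
base = column_means(rows["BiGRU"])
# Relative error reduction of IWOA-BiGRU with respect to plain BiGRU.
reduction = tuple((b - a) / b for a, b in zip(iwoa, base))

if __name__ == "__main__":
    print("IWOA-BiGRU means:", iwoa)
    print("BiGRU means:", base)
    print("relative reduction:", reduction)
```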

Table 11. Paired t-test results of mean MAPE for each model.

Category	Mean of Paired Differences	Std. Deviation	Std. Error Mean	95% Confidence Interval Lower Bound	95% Confidence Interval Upper Bound	T	Df	Sig. (2-Tailed)
T1	−1.490	3.785	0.892	−3.372	0.392	−1.671	17	0.046
T2	−2.122	4.347	1.025	−4.283	0.040	−2.071	17	0.032
T3	−3.029	3.478	0.820	−4.758	−1.299	−3.694	17	0.002
T4	−3.496	3.405	0.803	−5.189	−1.80	−4.356	17	0
T5	−3.730	3.286	0.775	−5.365	−2.096	−4.816	17	0
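The columns of Table 11 are linked by the usual paired t-test relations: the standard error is the standard deviation of the differences divided by √n, the statistic is the mean difference divided by the standard error, and the 95% interval is the mean difference plus or minus the critical value times the standard error. The sketch below recomputes these quantities for row T1, assuming n = 18 pairs (Df + 1) and the standard two-sided 95% critical value of about 2.110 for 17 degrees of freedom. The function name is hypothetical.

```python
import math

def paired_t_summary(mean_diff, sd_diff, n, t_crit):
    """Return (std. error, t statistic, CI lower bound, CI upper bound)."""
    se = sd_diff / math.sqrt(n)
    t_stat = mean_diff / se
    half_width = t_crit * se
    return se, t_stat, mean_diff - half_width, mean_diff + half_width

if __name__ == "__main__":
    # Row T1 of Table 11: mean -1.490, std. deviation 3.785, Df = 17.
    se, t_stat, lower, upper = paired_t_summary(-1.490, 3.785, 18, 2.110)
    print(round(se, 3), round(t_stat, 3), round(lower, 3), round(upper, 3))
```

With the rounded inputs from the table, the recomputed standard error and interval bounds agree with the tabulated 0.892, −3.372, and 0.392, and the statistic agrees with −1.671 to within rounding.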

Table 12. Fluctuation trend and spatial prediction of bus passenger flow.

Category	Time Series							Fluctuation Trend and Space
Date	18	19	20	21	22	23	24	Actual variation range (fuzzy example description)
Actual Passenger Flow	38,638	40,586	42,855	32,053	31,814	36,541	32,650	[LOW, R, UP] = [31,527, 35,625, 43,822]
Date	25	26	27	28	29	30	31	Predicted variation range (fuzzy example description)
Actual Passenger Flow	34,620	39,539	46,912	48,846	50,646	36,712	41,715	[LOW, R, UP] = [34,539, 39,447, 51,224]
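One simple way to read Table 12 is to check whether the daily flows of the forecast week fall inside the predicted granule. The sketch below does this under the assumption that the second data row lists the observed flows for days 25 to 31 and that the bracketed triple next to it is the predicted [LOW, R, UP] range for that window. The function name is hypothetical, and this coverage check is an illustration rather than an evaluation step described in the paper.

```python
def coverage(values, low, up):
    """Fraction of values lying within the closed interval [low, up]."""
    inside = sum(1 for v in values if low <= v <= up)
    return inside / len(values)

# Daily flows for days 25-31 and the predicted range, copied from Table 12.
week_flows = [34620, 39539, 46912, 48846, 50646, 36712, 41715]
pred_low, pred_r, pred_up = 34539, 39447, 51224

if __name__ == "__main__":
    print("coverage:", coverage(week_flows, pred_low, pred_up))
    print("min/max flow:", min(week_flows), max(week_flows))
```

Under that reading, every daily value of the week lies between the predicted LOW and UP bounds.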

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
