1. Introduction
Following the conclusion of the COVID-19 pandemic, the food and beverage industry has markedly accelerated its adoption of the Online-to-Offline (O2O) model, characterized by “online ordering + offline delivery” [1]. This shift has catalyzed substantial global growth for major food delivery platforms such as Meituan and Ele.me. The scope of delivery services has expanded beyond meals to include pharmaceuticals, flowers, and other commodities, contributing to continuous industry expansion.
In China, for instance, nearly 600 million people utilized online food delivery services by the end of 2024, accounting for 53.4% of the total internet user population [2]. Driven by the rapid proliferation of the O2O model, the market has demonstrated robust growth momentum, engaging over one-fifth of China’s population in using these services. The adoption rate surged from 16.5% at the end of 2015 to 52.7% by the end of 2021, an increase of 36.2 percentage points over six years, underscoring the sector’s vast developmental potential [3]. This growth has been facilitated by leading delivery platforms including Meituan, Ele.me, and Baidu, alongside continuous advancements in e-commerce and mobile technology [3,4,5]. Among these, Meituan has established a dominant market position, capturing 69% of the market share in 2020, a 5-percentage-point increase from 2019, reflecting its significant competitive advantage [6].
However, market advancement necessitates further refinement of delivery services, particularly regarding the accurate communication of delivery timeframes to enhance customer satisfaction [7,8]. Existing research indicates a correlation between accurately predicted delivery times and user satisfaction [9], as customers often need to align food orders with their personal schedules. It is noteworthy that the delivery time provided by platforms is essentially an Estimated Time of Arrival (ETA), representing the predicted duration from the point of origin to the destination, a subject that has been extensively studied [10,11,12]. Nonetheless, the final delivery duration is influenced by a multitude of features, often resulting in significant variability and unexpected fluctuations [13].
To address the challenges posed by high-dimensional feature sets in predicting food delivery times, enhance service satisfaction, and leverage technology to balance the commercial dynamics between sellers and buyers, this paper examines computational methods capable of precisely identifying predictive features and introducing new momentum into high-dimensional ETA forecasting.
In recent years, prediction has become an active research topic across various fields, attracting significant attention from researchers in algorithms and machine learning. Traditional statistics has contributed methods such as Autoregressive (AR) models [14], which assume linear relationships, Moving Average (MA) methods [15], Autoregressive Moving Average (ARMA) models [16], and Autoregressive Integrated Moving Average (ARIMA) models [17], along with seasonal variants such as SARIMA [18] and exponential smoothing techniques such as Holt-Winters [19]. These approaches have provided valuable insights into regression and forecasting from a statistical perspective. Machine learning has moved beyond the linear constraints inherent in traditional models, with notable examples including Support Vector Machines (SVMs) [20], Random Forests (RFs) [21], Gradient Boosting Decision Trees (GBDTs) [22], and Multi-Layer Perceptrons (MLPs) [23]. Building upon these, deep learning techniques enable end-to-end feature learning and modeling. Representative deep learning architectures include Recurrent Neural Networks (RNNs) [24], Long Short-Term Memory networks (LSTMs) [25], Gated Recurrent Units (GRUs) [26], and the attention-based Transformer model [27]. These have significantly enhanced the ability to model long-term dependencies and high-dimensional dynamic systems.
Applying intelligent computational models to predict delivery times is crucial for enhancing user satisfaction and reducing operational costs for these platforms [28]. Addressing the core need for cloud resource scheduling in food delivery platforms, Lang et al. [29] applied an improved Artificial Neural Network (ANN) model to predict order volumes. This approach increased prediction accuracy and integrated the results into dynamic resource scheduling strategies, demonstrating the practical value of predictive models in optimizing decision-making and improving platform operational efficiency [29]. However, their model primarily focused on macro-level order volume forecasting and did not sufficiently account for micro-level dynamic factors affecting delivery timeliness, such as real-time traffic conditions and rider behaviors. Subsequently, Song et al. [30] focused on the “last-mile” delivery segment, which directly impacts user experience. They were among the first to systematically identify the problem of predicting final-leg delivery service times and explored its feasibility using big data technology, laying the groundwork for subsequent refined research in this area [30]. Nonetheless, the depth of their study and the disclosure of model details were relatively limited, and the robustness and generalization ability of their framework in complex and dynamic urban environments require further validation. Against this backdrop, de Araujo et al. [31] expanded the scope to the prediction of end-to-end total delivery time, from origin to destination, for packages. They applied deep learning models to process complex logistical sequence data, achieving notable improvements in prediction accuracy. Their research was explicitly geared towards the development of smart cities, with a clear application orientation [31]. However, while their deep learning model is powerful, its performance is highly dependent on data quality and annotations, potentially limiting its effectiveness in scenarios with sparse or noisy data. Therefore, collecting a robust set of features that effectively reflect the factors influencing prediction is paramount. To address these challenges and pursue higher accuracy, the recent work by Zhu et al. [32] proposed a hybrid model integrating fuzzy systems with a Convolutional Factorization Machine (CFM). This model leverages fuzzy systems to handle the uncertainty and semantic information prevalent in logistics processes and captures complex high-order interactions among features through the CFM. It demonstrates superior performance and enhanced robustness in delivery time prediction tasks, marking a significant advance in intelligent logistics forecasting [32]. Nevertheless, the architecture of such hybrid models is typically complex, with relatively high computational costs and training difficulties, necessitating the use of optimization methods for adaptive parameter tuning.
For the adaptive tuning of model hyperparameters, researchers widely employ heuristic algorithms due to their high efficiency and ease of use. Liu et al. [33] proposed the Earthworm Optimization Algorithm (EOA) to optimize Support Vector Regression (SVR), significantly enhancing the prediction accuracy of reservoir landslide displacement. Their contribution lies in using EOA to adaptively adjust SVR hyperparameters, overcoming the limitations of experience-dependent traditional methods [33]. Meanwhile, Che et al. [34], addressing the prediction of high-speed mechanical test data, utilized a multi-strategy improved Whale Optimization Algorithm (WOA) to optimize Long Short-Term Memory (LSTM) hyperparameters, substantially boosting the model’s robustness and generalization ability. They integrated chaotic initialization and dynamic weight strategies to avoid local optima [34]. Furthermore, Zhou et al. [35] designed an improved Sparrow Search Algorithm (ISSA) to optimize LSTM hyperparameters, achieving high-precision prediction of building heating and cooling loads. The advantage of their approach is the introduction of an adaptive perturbation mechanism to enhance global search capability [35]. In the energy sector, Qiu et al. [36] combined the Particle Swarm Optimization (PSO) algorithm with Gated Recurrent Units (GRUs) and incorporated Kolmogorov-Arnold Networks (KANs) to optimize oil well production prediction. PSO efficiently identified the key hyperparameters of the GRU-KAN model, significantly reducing prediction error [36]. In a similar vein, Cui et al. [37] proposed a hybrid Whale Optimization Algorithm (WOA) to optimize a regional heat load prediction model based on CNN-LSTM-Attention. Their contribution involves the synergistic optimization of hyperparameters across multiple modules via WOA and the use of an attention mechanism to strengthen feature extraction [37]. Correspondingly, Safavi et al. [38] combined the Coati Optimization Algorithm with CNN-XGBoost for early prediction of battery lifespan. This algorithm performs a targeted search for key hyperparameters, improving prediction timeliness while reducing reliance on data [38]. Collectively, these studies demonstrate the strong performance that models can achieve after hyperparameter optimization.
In summary, this research proposes a novel symmetric optimization prediction framework (SymOpt-CNSVR) for food delivery time prediction. The specific contributions are as follows:
Inspired by the collective intelligent behavior of Superb Fairy-wrens, three key enhancements are made to the original Superb Fairy-wren Optimization Algorithm (SFOA) to solve complex optimization problems more efficiently, yielding the Enhanced Superb Fairy-wren Optimization Algorithm (ESFOA). Firstly, a mechanism for group collaboration and information sharing is introduced, simulating how birds collectively avoid predators and search for food. This helps the algorithm escape local optima, enhancing global exploration capability and stability. Secondly, a historical memory mechanism is designed, allowing individuals to remember and revisit historically optimal positions, thereby improving convergence accuracy and search efficiency. Finally, a self-reinforcement learning strategy is integrated, requiring individuals to improve themselves based on historical experience in each iteration, so that solution quality keeps increasing. These enhancements improve the optimization performance of the algorithm, as shown by the reduction in MAE and RMSE on the test set from 3.8646 and 5.0253 (using the original SFOA) to 3.0582 and 4.1947 (using ESFOA). The algorithm provides an effective new tool for automated and precise hyperparameter tuning of deep learning models.
The core innovation of this paper lies in constructing a “symmetrical” hybrid modeling framework that skillfully combines the strengths of deep learning and statistical learning. The framework adopts a symmetrical design logic: it utilizes the powerful nonlinear feature extraction capability of Convolutional Neural Networks (CNNs) to process high-dimensional sequential data and employs the improved ESFOA for automated hyperparameter optimization. Simultaneously, to obtain more statistically meaningful prediction results, a Support Vector Regression (SVR) model is used to capture deterministic patterns in the data, with Bayesian optimization applied for probabilistic search of its key parameters. This symmetrical structure of “CNN-ESFOA” and “SVR-Bayesian optimization” achieves a balance and unification between the representational power of deep learning and the interpretability of statistical models, offering a novel solution for complex data prediction problems.
This paper applies the proposed SymOpt-CNSVR framework to the prediction of complex high-dimensional sequential data in food delivery scenarios. Through rigorous experimental comparisons, the proposed method significantly outperforms traditional baseline models and other optimized prediction methods in terms of prediction accuracy. The results demonstrate that the framework can effectively capture nonlinear spatiotemporal dependencies and statistical patterns in food delivery data, providing a higher-precision prediction model to address the critical time prediction challenges in logistics delivery, with substantial practical application value.
The remainder of this paper is structured as follows:
Section 2 describes the designed algorithm and the establishment of the SymOpt-CNSVR model based on symmetrical logic;
Section 3 introduces the experimental environment and the testing and evaluation of the algorithm;
Section 4 summarizes the innovations and effectiveness of this study and offers insights for future improvements.
3. Experimental Analysis
3.1. Data Statistics and Processing
This section describes the food delivery dataset compiled for the study. To understand the structure and distribution of the data and to identify potential issues such as missing values, outliers, and other irregularities, a detailed statistical analysis of each variable was conducted, as presented in Table 1.
As Table 1 shows, several variables contain a certain proportion of missing values. To ensure the accuracy and reliability of subsequent experimental analyses, the missing values are enumerated in detail in Table 2, which clarifies how the missing data are distributed.
Because the missing entries are textual, they are not suitable for interpolation filling; because their proportion is small, excluding the affected samples has no significant impact on the overall distribution of the data. The study therefore removes the samples with missing values, which keeps the dataset consistent and avoids any potential impact on the predictive outcomes.
For textual data, as most machine learning algorithms are unable to process raw text directly, this study employs numerical encoding to transform the text into a format that can be recognized by the model, as shown in Table 3.
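The cleaning and encoding steps described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the column names (`weather`, `traffic`, `minutes`) and the sample rows are hypothetical, and the integer codes are assigned in alphabetical order of the category labels, whereas the mapping actually used is the one given in Table 3.

```python
from typing import Dict, List, Tuple

MISSING = (None, "", "NaN")  # assumed markers for a missing textual entry


def drop_missing(rows: List[dict], text_columns: List[str]) -> List[dict]:
    """Keep only the rows in which every textual column has a value."""
    return [r for r in rows if all(r.get(c) not in MISSING for c in text_columns)]


def fit_encoding(rows: List[dict], column: str) -> Dict[str, int]:
    """Map each distinct category of a column to an integer code."""
    categories = sorted({r[column] for r in rows})
    return {category: code for code, category in enumerate(categories)}


def encode(rows: List[dict], text_columns: List[str]) -> Tuple[List[dict], dict]:
    """Replace textual values by their integer codes; other columns are untouched."""
    mappings = {c: fit_encoding(rows, c) for c in text_columns}
    encoded = [
        {k: (mappings[k][v] if k in mappings else v) for k, v in r.items()}
        for r in rows
    ]
    return encoded, mappings


# Hypothetical records; two of them have a missing textual field.
raw = [
    {"weather": "Sunny", "traffic": "Low", "minutes": 24},
    {"weather": None, "traffic": "High", "minutes": 41},
    {"weather": "Fog", "traffic": "High", "minutes": 38},
    {"weather": "Sunny", "traffic": "", "minutes": 30},
]
text_cols = ["weather", "traffic"]
clean = drop_missing(raw, text_cols)
encoded, mappings = encode(clean, text_cols)
print(mappings)
print(encoded)
```

Fitting the mapping only after the incomplete rows are removed ensures that no code is reserved for a missing-value marker.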
3.2. Experimental Environment
To guarantee the fairness and reproducibility of the experiments, all experiments were conducted on a workstation equipped with an Intel Core i7-11800H processor (Intel Corporation, Santa Clara, CA, USA), 16 GB RAM, and an NVIDIA GeForce RTX-3060 GPU with 14 GB memory (NVIDIA Corporation, Santa Clara, CA, USA).
To support the reproducibility of the results obtained from the SymOpt-CNSVR model, its initial parameter settings are presented in Table 4.
To assess the generalizability of the model, the dataset in this study was split into training and test sets in a 7:3 ratio.
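A 7:3 split of this kind can be sketched as below. The text does not state whether the split was random, stratified, or chronological, so the seeded random shuffle here is an assumption, as are the function name and the seed value.

```python
import random
from typing import List, Sequence, Tuple


def train_test_split(samples: Sequence, train_ratio: float = 0.7,
                     seed: int = 42) -> Tuple[List, List]:
    """Shuffle a copy of the samples and cut it into training and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Stand-in sample indices; in practice these would be the encoded records.
train, test = train_test_split(range(1000))
print(len(train), len(test))
```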
The study employs four evaluation metrics, namely the coefficient of determination (R²), mean absolute error (MAE), root mean square error (RMSE), and mean squared log error (MSLE), to assess the model’s stability and accuracy. The corresponding formulas are provided in Table 5.
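For reference, the four metrics can be computed as in the sketch below. It uses the standard textbook definitions; in particular, MSLE is taken here as the mean squared difference of log(1 + y), which is an assumption, and the formulas in Table 5 remain authoritative. The sample values are hypothetical delivery times in minutes.

```python
import math
from typing import Sequence


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def msle(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared log error, using log(1 + y) so that zero values are allowed."""
    return sum((math.log1p(t) - math.log1p(p)) ** 2
               for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical true and predicted delivery times (minutes).
y_true = [20.0, 30.0, 40.0]
y_pred = [22.0, 27.0, 41.0]
print(r2_score(y_true, y_pred), mae(y_true, y_pred),
      rmse(y_true, y_pred), msle(y_true, y_pred))
```

Because MSLE compares logarithms, it penalizes relative rather than absolute deviations, which is why it complements MAE and RMSE when the target values span a wide range.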
3.3. Algorithm Performance Testing
In this section, ESFOA is compared with seven widely cited algorithms, namely SFOA, Bermuda Triangle Optimizer (BTO) [44], Chinese Pangolin Optimizer (CPO) [45], Kepler Optimization Algorithm (KOA) [46], Improved Black-Winged Kite Algorithm (IBKA) [47], Enhanced Sea Horse Optimization (ESHO) [48], and Chaotic Mountain Gazelle Optimizer Improved by Multiple Oppositional-Based Learning Variants (HCQDOPP-MGO) [49], to demonstrate the superiority and rationality of the proposed algorithm. The function test set from IEEE CEC2022 [50] is used; the test functions and their search intervals are listed in Table 6.
The initial population size for all swarm intelligence algorithms is set to 50, with a maximum evaluation count of 1000. To ensure fairness, 50 independent runs are conducted for each algorithm. ESFOA is evaluated on the CEC2022 benchmark test suite at dimensions 2, 10, and 20. To focus on the average performance of each algorithm under the different test conditions, this subsection presents only the comparative results of the mean values obtained by each algorithm at each dimension.
First, the performance of ESFOA is evaluated using the CEC2022 (Dim = 2) benchmark test suite.
Table 7 shows that ESFOA ranks first in terms of mean values on the F6, F7, and F8 hybrid functions, and it also ranks first on the F10 composite function with a mean value of 2427.785, clearly outperforming the other algorithms. This demonstrates the precise search capability of ESFOA in complex function environments.
Figure 3 presents the average runtime of ESFOA and the other algorithms across the test functions. ESFOA is highlighted in orange, while the other algorithms are highlighted in yellow. ESFOA has an average runtime of 0.051739 s, ranking third. Notably, it is considerably faster than BTO (0.099727 s) and CPO (0.670289 s). This result indicates that ESFOA strikes a good balance between optimization performance and computational efficiency.
Furthermore, the performance of ESFOA is evaluated by increasing the test set dimension to 10. The mean comparison results are shown in Table 8.
Table 8 shows that ESFOA exhibits the best performance on the F7, F8, F10, and F12 functions, ranking first on both hybrid and composite functions. This indicates that ESFOA demonstrates high stability across these functions.
Figure 4 compares the convergence curves of ESFOA with those of the other algorithms. The blue line represents ESFOA, which iterates steadily in the early stages and shows a strong ability to escape local optima in the later stages, a result of its self-reinforcement learning strategy.
Figure 5 illustrates the iterative stability of ESFOA compared to the other algorithms. The blue box plot represents ESFOA, which shows a relatively small box and a very low average fitness value, suggesting that ESFOA not only converges quickly but also searches with a degree of stability.
Next, the test set dimension is further increased to 20.
As shown in Table 9, ESFOA is a comprehensive and robust optimizer. Although it does not achieve first place on every function, its overall performance is strong: it ranks first on F12, second on several challenging functions such as F2, F6, and F11, and remains competitive on the other functions. This indicates that ESFOA is not a specialized algorithm but one that can be applied to a wide variety of complex optimization problems. In particular, when handling high-dimensional, multimodal, and composite functions, its balanced exploration and exploitation capabilities support strong final performance.
As shown in Figure 6, ESFOA ranks near the top in Friedman mean rank at every dimension. It ranks first in the two-dimensional case and second on average, slightly behind CPO. However, given the much higher time cost of CPO, ESFOA offers a simpler and more efficient optimization capability, which makes it the more practical choice.
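The Friedman mean rank used in this comparison can be illustrated with a short sketch. On each test function the algorithms are ranked by their mean value (1 is best, tied values share the average rank), and the ranks are then averaged over the functions. The table of values below is hypothetical and does not reproduce Tables 7 to 9; the function names are likewise illustrative.

```python
from typing import List


def rank_row(values: List[float]) -> List[float]:
    """Rank values in ascending order (1 = best); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def friedman_mean_ranks(results: List[List[float]]) -> List[float]:
    """results[f][a] is the mean value of algorithm a on function f."""
    n_algorithms = len(results[0])
    totals = [0.0] * n_algorithms
    for row in results:
        for a, r in enumerate(rank_row(row)):
            totals[a] += r
    return [t / len(results) for t in totals]


# Hypothetical mean values: 3 functions (rows) x 3 algorithms (columns).
results = [
    [1.0, 2.0, 3.0],
    [5.0, 4.0, 4.0],
    [0.1, 0.3, 0.2],
]
mean_ranks = friedman_mean_ranks(results)
print(mean_ranks)
```

A lower mean rank indicates better overall standing, so an algorithm can lead the ranking without being first on every individual function.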
3.4. Model Prediction
To further evaluate the effectiveness of the ESFOA improvements, the algorithm is applied to optimize the SymOpt-CNSVR model developed in this study. The evaluation results on the training and test sets indicate how effectively the model predicts food delivery time.
Table 10 shows the search ranges and the results of the ESFOA and SFOA optimization, and the final training and testing evaluation results are shown in Table 11.
Overall, Table 11 shows that both SFOA and ESFOA demonstrate strong predictive capabilities on the SymOpt-CNSVR model, with R² values on both the training and test sets approaching or exceeding 0.9, indicating good model fit and strong generalization. Further comparison reveals that ESFOA outperforms the basic SFOA across all evaluation metrics: its R² increases by approximately 0.03 on both the training and test sets, while its MAE, RMSE, and MSLE decrease markedly. In particular, the MAE on the training set decreases by approximately 27%, and the MSLE on the test set decreases by approximately 24%. This demonstrates that ESFOA improves prediction accuracy, stability, and error control without significant overfitting, reflecting enhanced optimization efficiency and model robustness.
After 10 repeated runs, the Wilcoxon test was used to verify the significance of the difference between the results of the two methods. As can be seen from Table 12, the p-values of all four metrics, comparing SFOA against ESFOA as the reference, are less than 0.05, indicating that the optimization results of the two methods differ significantly and that the results in Table 11 are reliable.
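A significance check of this kind can be sketched as follows. The text does not specify which Wilcoxon variant was used, so the paired signed-rank test with an exact two-sided p-value is an assumption here, and the ten per-run error values for the two methods are made up for illustration; they are not the values behind Table 12.

```python
from itertools import product
from typing import Sequence, Tuple


def wilcoxon_signed_rank(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Returns (W, p) with W = min(W+, W-). Pairs with zero difference are dropped.
    The null distribution is enumerated, so this suits small samples only.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    # Rank the absolute differences; ties share the average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    w = min(w_plus, w_minus)
    total = sum(ranks)
    # Under the null hypothesis each rank is positive or negative with probability 1/2.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        s = sum(r for r, keep in zip(ranks, signs) if keep)
        if min(s, total - s) <= w:
            extreme += 1
    return w, extreme / 2 ** n


# Hypothetical errors from 10 paired runs of two methods.
method_a = [5.2, 5.5, 5.1, 5.8, 5.4, 5.3, 5.6, 5.0, 5.7, 5.9]
method_b = [4.1, 4.0, 4.3, 4.2, 4.4, 4.0, 4.1, 4.2, 4.3, 4.5]
w, p = wilcoxon_signed_rank(method_a, method_b)
print(w, p)
```

With ten pairs that all differ in the same direction, the smallest attainable two-sided p-value is 2/1024, which is well below the 0.05 threshold.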
Figure 7 shows that even when the initial solution is poor, ESFOA reaches a convergence value better than the target within a small number of iterations, owing to its strong search ability. This indicates that ESFOA performs well in hyperparameter optimization and is suitable for selecting combinations of multiple hyperparameters for models in high-dimensional data prediction.
3.5. Ablation Study
Furthermore, this study conducts ablation tests on each module of the proposed model to quantify the impact of each module on overall performance, enhance the interpretability of the model, and clarify the selection and importance of each module in its construction.
The results presented in Table 13 and Figure 8 demonstrate that SymOpt-CNSVR achieves the best performance among all compared models. Its R² value, 0.9269, clearly surpasses the other models (0.77–0.84), and its MAE, RMSE, and MSLE are markedly lower, demonstrating its superior predictive accuracy and reliability. Specifically, its R² improves by over 10% compared to the next-best CNN-OptSVR model, while MAE and RMSE are reduced by approximately 36% and 32%, respectively. While CNN and SVR each exhibit limited performance in isolation (R² = 0.772 and 0.5874, respectively), their integration within a unified framework yields a remarkable performance improvement. This synergy can be attributed to the complementary strengths of each module: the CNN excels at capturing complex spatial and multi-feature interactions within the data, while the SVR provides strong generalization capabilities for regression tasks based on high-level features. The significant performance gap between the individual models and the combined system underscores the importance of structural cooperation in leveraging both feature learning capacity and regression stability. Furthermore, the fact that the hybrid model (CNN-SVR and its optimized variants) does not merely average the performance of its submodules, but significantly surpasses them, suggests effective information fusion rather than error accumulation. The low MSLE value (0.1114) achieved by SymOpt-CNSVR through its symmetric optimization mechanism further confirms that the model avoids large deviant predictions, reinforcing its practical reliability in real deployment scenarios.
3.6. Comparative Experiment
In addition to ablation studies, comparative experiments are also a crucial step in evaluating model performance. Next, we will evaluate the performance of baseline models and advanced optimization models, analyzing the predictive advantages of SymOpt-CNSVR and ensuring its superiority in the complex, high-dimensional task of predicting food delivery time.
This paper compares the predictive performance of traditional linear regression (LR), K-nearest neighbor (KNN), and advanced deep learning sequence models (GRU and BiLSTM) for this task. All results are based on the test set.
As shown in Table 14 and Figure 9, SymOpt-CNSVR outperforms all compared prediction models, including traditional linear regression (LR), K-nearest neighbor (KNN), and advanced deep learning sequence models (GRU and BiLSTM). Its lead is evident across all core evaluation metrics: its coefficient of determination (R²) reaches 0.9269, about 12.6 percentage points (roughly 15.7% in relative terms) above the next-best KNN and BiLSTM (approximately 0.801), a substantial improvement in its ability to account for data variation. Its advantage in error control is even more pronounced, with MAE (3.0582) and RMSE (4.1947) well below those of the other models. Compared to the best competing values (MAE 4.713 for KNN and RMSE 6.9194 for BiLSTM), the error reductions reach 35% and 39%, respectively. Its MSLE is only 0.1114, nearly 40% lower than the best value among the other models (0.1833 for BiLSTM). This shows that the model keeps the relative error under control even when the target values span a large range, indicating strong robustness and accuracy. Overall, the results indicate that the symmetric optimization strategy adopted by SymOpt-CNSVR successfully integrates the advantages of different models. The gain is not marginal: the model exceeds the accuracy of the traditional machine learning and mainstream deep learning methods tested here. It also provides a more accurate model for predicting food delivery time, helping businesses and consumers to better anticipate the delivery process.
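The relative figures in this comparison follow directly from the reported values. The arithmetic is reproduced here as a check, using only the numbers quoted in the paragraph above; the symbol Δ for the absolute gap and the subscript "rel" for the relative change are notation introduced for this check.

```latex
\begin{align*}
\Delta R^2 &= 0.9269 - 0.801 = 0.1259
  && \text{(about 12.6 percentage points)} \\
\Delta R^2_{\mathrm{rel}} &= \frac{0.1259}{0.801} \approx 0.157
  && \text{(about 15.7\% relative gain)} \\
\Delta \mathrm{MAE}_{\mathrm{rel}} &= \frac{4.713 - 3.0582}{4.713} \approx 0.351
  && \text{(about 35\% lower than KNN)} \\
\Delta \mathrm{RMSE}_{\mathrm{rel}} &= \frac{6.9194 - 4.1947}{6.9194} \approx 0.394
  && \text{(about 39\% lower than BiLSTM)} \\
\Delta \mathrm{MSLE}_{\mathrm{rel}} &= \frac{0.1833 - 0.1114}{0.1833} \approx 0.392
  && \text{(about 39\% lower than BiLSTM)}
\end{align*}
```

The distinction between the first two lines matters: the gap in R² is 12.6 points in absolute terms and 15.7% only when expressed relative to the baseline value.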
Next, SymOpt-CNSVR is compared with existing advanced optimization models, including BO-CNN-LSTM [51], PSO-BiGRU-BiLSTM [52], ACO-SVR-GRU [53], and IRIME-BiTCN-BiGRU-MSA [54]. These are all strong optimization-based prediction models proposed in the past two years, which makes them a suitable basis for verifying the efficiency and practicality of SymOpt-CNSVR in prediction tasks.
The comparison results of advanced optimization models in Table 15 and Figure 10 show that all the competing models utilize a hybrid architecture combining complex intelligent optimization algorithms with deep learning components. However, the SymOpt-CNSVR model stands out, achieving the best performance across most key performance metrics and establishing itself as the optimal model.
SymOpt-CNSVR’s performance is reflected in a comprehensive and balanced lead. Its R² value (0.9269) is the highest among all models, slightly ahead of the similarly performing ACO-SVR-GRU model (0.9249), demonstrating its predictive accuracy and ability to account for data variation. More importantly, in the key metrics measuring prediction error, its MAE (3.0582) and RMSE (4.1947) were the lowest among all models, indicating that its predicted values had the smallest absolute deviation from the true values. Its MSLE (0.1114) also ranked first and was significantly lower than that of the other models. This demonstrates that SymOpt-CNSVR is the most robust against potentially large prediction errors, effectively avoiding extreme errors.
In summary, these comparison results strongly demonstrate that the symmetric collaborative optimization strategy employed by SymOpt-CNSVR not only rivals the performance of state-of-the-art hybrid optimization models, but also surpasses them overall. This indicates that its optimization framework is highly efficient, fully exploiting and integrating the strengths of different models, ultimately achieving the optimal combination of prediction accuracy, stability, and reliability, making it ideally suited for complex, high-dimensional food delivery time prediction tasks.
4. Conclusions
This paper proposes a new prediction framework, SymOpt-CNSVR, based on symmetric optimization, to address the problem of delivery time prediction in food delivery scenarios. This framework integrates the advantages of convolutional neural networks (CNNs) in high-dimensional feature extraction with the robust performance of support vector regression (SVR) in statistical learning, achieving complementary advantages between the two models through structural symmetry. The CNN is responsible for multi-feature importance assessment and nonlinear relationship extraction, and its hyperparameters are adaptively optimized using the proposed Enhanced Superb Fairy-wren Optimization Algorithm (ESFOA). SVR then produces the final prediction from these features, and its parameters are tuned using Bayesian optimization, which reduces the complexity and time cost of parameter search. At the algorithmic level, ESFOA improves global search efficiency and convergence stability by introducing a group collaboration mechanism, historical memory backtracking, and a self-reinforcement learning strategy. Experimental results on the standard test set CEC2022 demonstrate that ESFOA outperforms current mainstream optimization algorithms such as BTO, CPO, KOA, IBKA, ESHO, and HCQDOPP-MGO in terms of convergence accuracy and robustness. Finally, SymOpt-CNSVR achieves strong results on real-world food delivery data, with R² = 0.9269, MAE = 3.0582, RMSE = 4.1947, and MSLE = 0.1114, surpassing multiple baseline and cutting-edge optimization models. Notably, the model achieves a prediction MAE of only about 3 min, a step toward the sub-1 min error target that is critical for high-precision logistics systems. This capability allows the platform to provide customers with reliable delivery time estimates, thereby enhancing user trust and operational coordination efficiency. These results not only validate the effectiveness of the proposed framework in handling high-dimensional spatiotemporal prediction tasks but also demonstrate its practical potential in real intelligent dispatch environments.
The proposed method provides a basis for further improving prediction performance and broadening its application scope. Future work will focus on exploring more expressive Transformer architectures to capture more complex spatiotemporal dependencies and on applying ESFOA to a wider range of optimization tasks. In addition, an online learning mechanism will be developed to enhance the dynamic adaptability of the model. The ultimate goal is to integrate the framework into an actual intelligent scheduling system, closing the loop from academic innovation to industrial value.