A Novel Forecasting System with Data Preprocessing and Machine Learning for Containerized Freight Market
Abstract
1. Introduction
- (1) This research achieves high-accuracy predictions for the Shanghai Containerized Freight Index (SCFI) and the Ningbo Containerized Freight Index (NCFI), filling a gap in the existing literature on these two indices. The results on both indices demonstrate the robustness and superior performance of the proposed real-time forecasting system and provide decision-making support for shipping market managers, enhancing their capacity to withstand market risks.
- (2) In addition to traditional forecasting based on historical container freight data, this study incorporates several external factors as input variables for the forecasting framework: energy prices, commodity prices, Baidu index data, and related freight indices. The resulting multi-factor hybrid forecasting model significantly improves prediction performance.
- (3) A two-stage data preprocessing scheme is proposed. First, the Hampel filter is employed for outlier identification and correction. Second, a real-time rolling data decomposition technique based on variational mode decomposition (VMD) is applied to further refine the preprocessed data, substantially enhancing the prediction accuracy of the system.
- (4) To optimize the performance and training efficiency of the XGBoost model, this paper introduces a novel intelligent optimization algorithm, the cheetah optimization algorithm (COA). COA is used to tune five key parameters of the XGBoost model: the feature sampling ratio (colsample_bytree), the number of trees (n_estimators), the learning rate (learning_rate), the maximum tree depth (max_depth), and the proportion of rows sampled for each tree (subsample), thereby enhancing the overall predictive efficacy of the model.
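The parameter tuning described in contribution (4) can be illustrated with a minimal sketch. It is not the COA itself: plain random search stands in for the optimizer, and the bounds in `SEARCH_SPACE` are hypothetical, since the text names the five parameters but not their ranges. The `decode` helper shows how a candidate position in the unit hypercube, as a population-based optimizer would propose it, can be mapped to an XGBoost-style parameter dictionary.

```python
import random

# Hypothetical bounds (assumption): the text lists the parameters, not their ranges.
SEARCH_SPACE = {
    "colsample_bytree": (0.5, 1.0, float),
    "n_estimators": (50, 500, int),
    "learning_rate": (0.01, 0.3, float),
    "max_depth": (3, 10, int),
    "subsample": (0.5, 1.0, float),
}


def decode(position):
    """Map a point in [0, 1]^5 to a parameter dictionary."""
    params = {}
    for u, (name, (low, high, kind)) in zip(position, SEARCH_SPACE.items()):
        u = min(1.0, max(0.0, u))  # clip to the unit interval
        value = low + u * (high - low)
        params[name] = int(round(value)) if kind is int else value
    return params


def random_search(objective, n_trials=50, seed=0):
    """Placeholder optimizer (random search, NOT the cheetah optimizer).

    `objective` takes a parameter dict and returns an error to minimize,
    for example a validation RMSE.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        params = decode([rng.random() for _ in SEARCH_SPACE])
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy objective used only to exercise the loop.
best, score = random_search(
    lambda p: abs(p["max_depth"] - 6) + abs(p["learning_rate"] - 0.1)
)
```

In practice the objective would train a model with the decoded parameters (for instance by passing them as keyword arguments to `xgboost.XGBRegressor`) and return its validation error; swapping `random_search` for a COA implementation would leave `decode` unchanged.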
2. Materials and Methods
2.1. Hampel Filter
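As a rough illustration of the outlier identification and correction step named in the Introduction, the sketch below implements a conventional Hampel filter: each point is compared with the median of a sliding window, and points deviating by more than a multiple of the scaled median absolute deviation (MAD) are replaced by that median. The window half-width, the threshold of three scaled MADs, and the centered window are assumptions chosen for illustration, not settings taken from this study.

```python
from statistics import median


def hampel_filter(series, half_window=3, n_sigmas=3.0):
    """Return (cleaned_series, outlier_indices) for a list of numbers."""
    k = 1.4826  # scales the MAD to estimate a Gaussian standard deviation
    cleaned = list(series)
    outliers = []
    n = len(series)
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = series[lo:hi]
        med = median(window)
        mad = median([abs(x - med) for x in window])
        if abs(series[i] - med) > n_sigmas * k * mad:
            cleaned[i] = med  # correct the outlier with the local median
            outliers.append(i)
    return cleaned, outliers


prices = [10, 11, 10, 12, 11, 100, 10, 11, 12, 10, 11]
cleaned, flagged = hampel_filter(prices)
# flagged == [5]; the spike of 100 is replaced by the local median 11
```

A centered window looks at observations on both sides of each point; in a strictly real-time setting a trailing window would be the natural variant.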
2.2. Variational Mode Decomposition
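The real-time rolling decomposition mentioned in the Introduction can be sketched as follows. The point of the rolling scheme is that each decomposition sees only the data available up to the current time step, so no future observations leak into the features. The decomposer here is a simple moving-average trend/residual split used purely as a stand-in; it is not VMD, and the window size is an arbitrary assumption. A real VMD routine could be passed in through the `decompose` argument.

```python
def trend_residual_split(window, span=3):
    """Stand-in decomposer (NOT VMD): trailing moving-average trend plus residual."""
    trend = []
    for i in range(len(window)):
        lo = max(0, i - span + 1)
        chunk = window[lo:i + 1]
        trend.append(sum(chunk) / len(chunk))
    residual = [x - t for x, t in zip(window, trend)]
    return [trend, residual]


def rolling_decompose(series, window_size, decompose=trend_residual_split):
    """Decompose each trailing window separately.

    For every end point, only series[end - window_size:end] is decomposed,
    and the last value of each component is kept as that step's features.
    """
    rows = []
    for end in range(window_size, len(series) + 1):
        window = series[end - window_size:end]
        components = decompose(window)
        rows.append([comp[-1] for comp in components])
    return rows


series = [1, 2, 3, 4, 5, 6]
rows = rolling_decompose(series, window_size=4)
```

Because every row depends only on past and current values, appending new observations never changes rows that were already computed.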
2.3. Cheetah Optimization Algorithm
2.3.1. Search Strategy
2.3.2. Sit-and-Wait Strategy
2.3.3. Attack Strategy
2.4. COA-XGBoost
2.5. Research Framework
3. Empirical Research
3.1. Data Description
3.2. Data Normalization
3.3. Evaluation Metrics
3.4. Experimental Setup
3.5. Experiment I
3.6. Experiment II
3.7. Feature Importance Analysis
4. Discussion
4.1. Improvement Percentages
- (1) Across all evaluation metrics, the proposed Hampel-VMD-COA-XGBoost prediction model outperforms the other models. For example, on the SCFI dataset, Table 7 reports an RMSE, MAE, MAPE, and TIC of 3.3268, 2.1778, 0.0009, and 0.0007, respectively, for the proposed prediction framework, the lowest values among all compared models.
- (2) In both datasets, the predictive performance of the XGBoost model surpasses that of the SVR and RF models. This demonstrates the superior ability of the XGBoost model to leverage historical time series information for accurate predictions. It also confirms XGBoost’s advantage in extracting hidden information from time series data.
- (3) A comparative analysis between the standalone XGBoost model and XGBoost models incorporating decomposition modules and optimization algorithms reveals a significant improvement in accuracy when the COA algorithm and decomposition modules are applied. For example, in the NCFI dataset, as shown in Table 9, both CEEMDAN-COA-XGBoost and VMD-COA-XGBoost improve on the standalone XGBoost model in all evaluation metrics. This highlights the contributions of decomposition techniques and optimization algorithms to the prediction results. Furthermore, relative to CEEMDAN-COA-XGBoost, the VMD-COA-XGBoost model shows improvements of 25.42%, 30.79%, 35.82%, and 25.81% in RMSE, MAE, MAPE, and TIC, respectively. VMD is therefore considered more suitable for handling such nonlinear time series data, improving both the accuracy and stability of the predictive model.
- (4) CEEMDAN-COA-XGBoost was compared with Hampel-CEEMDAN-COA-XGBoost, and VMD-COA-XGBoost was compared with Hampel-VMD-COA-XGBoost. For the SCFI dataset, as shown in Table 9, adding Hampel outlier detection reduced the RMSE by 65.25% for the CEEMDAN-based system and by 54.30% for the VMD-based system. This further validates the effectiveness of the Hampel filtering algorithm. Additionally, compared with the Hampel-CEEMDAN-COA-XGBoost model, the proposed prediction system showed improvements of 36.96%, 44.82%, 62.50%, and 41.67% in RMSE, MAE, MAPE, and TIC, respectively, indicating that the proposed system outperforms the other comparative models.
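The improvement percentages quoted above appear to follow the usual relative error reduction, (benchmark error − comparative error) / benchmark error × 100. That definition is an assumption here, but the short sketch below shows that it reproduces the reported SCFI values from the RMSE figures in the result tables.

```python
def improvement_pct(benchmark_error, comparative_error):
    """Relative reduction in an error metric, in percent (assumed definition)."""
    return (benchmark_error - comparative_error) / benchmark_error * 100.0


# SCFI, RMSE: SVR (68.8425) vs. XGBoost (36.7491)
print(f"{improvement_pct(68.8425, 36.7491):.2f}%")  # prints 46.62%

# SCFI, RMSE: Hampel-CEEMDAN-COA-XGBoost (5.2771) vs. Hampel-VMD-COA-XGBoost (3.3268)
print(f"{improvement_pct(5.2771, 3.3268):.2f}%")  # prints 36.96%
```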
4.2. Comparison of the Proposed Model and Existing Model
4.3. Limitations of the Present Study and Future Work
5. Policy Implications and Conclusions
5.1. Policy Recommendations
5.2. Conclusions
- (1) Across both the SCFI and NCFI datasets, the proposed prediction framework consistently outperformed all comparative models, underscoring its superiority and generalizability in predicting container shipping price indices.
- (2) Integrating multiple external factors supplied the model with richer predictive information, resulting in more reliable outcomes. This highlights the importance of multi-factor analysis in enhancing the accuracy and stability of prediction models.
- (3) Effective data preprocessing mechanisms can mitigate the impact of data noise, enabling the model to better capture the underlying data characteristics, thereby improving the predictive performance of the system.
- (4) Tuning the XGBoost parameters with the cheetah optimization algorithm allowed the model to realize its full potential, leading to superior prediction results.
- (5) A hybrid prediction system for container shipping prices was established, with XGBoost, known for its strong generalization capabilities, serving as both the predictor and the integration tool. The system successfully captured deep feature information from container shipping prices and achieved nonlinear integration of the prediction results.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Yin, J.; Shi, J. Seasonality patterns in the container shipping freight rate market. Marit. Policy Manag. 2018, 45, 159–173.
- Saeed, N.; Nguyen, S.; Cullinane, K.; Gekara, V.; Chhetri, P. Forecasting container freight rates using the Prophet forecasting method. Transp. Policy 2023, 133, 86–107.
- Munim, Z.H.; Schramm, H.-J. Forecasting container shipping freight rates for the Far East—Northern Europe trade lane. Marit. Econ. Logist. 2016, 19, 106–125.
- Hao, J.; Yuan, J.; Wu, D.; Xu, W.; Li, J. A dynamic ensemble approach for multi-step price prediction: Empirical evidence from crude oil and shipping market. Expert Syst. Appl. 2023, 234, 121117.
- Wang, S.; Wei, F.; Li, H.; Wang, Z.; Wei, P. Comparison of SARIMA model and Holt-Winters model in predicting the incidence of Sjögren’s syndrome. Int. J. Rheum. Dis. 2022, 25, 1263–1269.
- Zhao, H.-M.; He, H.-D.; Lu, K.-F.; Han, X.-L.; Ding, Y.; Peng, Z.-R. Measuring the impact of an exogenous factor: An exponential smoothing model of the response of shipping to COVID-19. Transp. Policy 2022, 118, 91–100.
- Liu, Z.; Huang, S. Carbon option price forecasting based on modified fractional Brownian motion optimized by GARCH model in carbon emission trading. North Am. J. Econ. Financ. 2021, 55, 101307.
- Koyuncu, K.; Tavacioğlu, L.; Gökmen, N.; Arican, U.Ç. Forecasting COVID-19 impact on RWI/ISL container throughput index by using SARIMA models. Marit. Policy Manag. 2021, 48, 1096–1108.
- Liu, J.; Li, Z.; Sun, H.; Yu, L.; Gao, W. Volatility forecasting for the shipping market indexes: An AR-SVR-GARCH approach. Marit. Policy Manag. 2021, 49, 864–881.
- Bildirici, M.; Şahin Onat, I.; Ersin, Ö.Ö. Forecasting BDI Sea Freight Shipment Cost, VIX Investor Sentiment and MSCI Global Stock Market Indicator Indices: LSTAR-GARCH and LSTAR-APGARCH Models. Mathematics 2023, 11, 1242.
- Liao, H.; Zeng, J.; Wu, C. A model for online forum traffic prediction integrated with multiple models. Comput. Eng. 2020, 46, 62–66.
- Liu, J.; Chu, N.; Wang, P.; Zhou, L.; Chen, H. A novel hybrid model for freight volume prediction based on the Baidu search index and emergency. Neural Comput. Appl. 2023, 36, 1313–1328.
- Yin, K.; Guo, H.; Yang, W. A novel real-time multi-step forecasting system with a three-stage data preprocessing strategy for containerized freight market. Expert Syst. Appl. 2024, 246, 123141.
- Han, Q.; Yan, B.; Ning, G.; Yu, B. Forecasting Dry Bulk Freight Index with Improved SVM. Math. Probl. Eng. 2014, 2014, 460684.
- Zhang, Q.; Li, C.; Wang, X.; Hu, Y.; Yan, Y.; Jin, H.; Shang, G. Forecasting shipping index using CEEMD-PSO-BiLSTM model. PLoS ONE 2023, 18, e0280504.
- Tsioumas, V.; Papadimitriou, S.; Smirlis, Y.; Zahran, S.Z. A Novel Approach to Forecasting the Bulk Freight Market. Asian J. Shipp. Logist. 2017, 33, 33–41.
- Du, P.; Wang, J.; Yang, W.; Niu, T. Container throughput forecasting using a novel hybrid learning method with error correction strategy. Knowl.-Based Syst. 2019, 182, 104853.
- Xie, G.; Zhang, N.; Wang, S. Data characteristic analysis and model selection for container throughput forecasting within a decomposition-ensemble methodology. Transp. Res. Part E Logist. Transp. Rev. 2017, 108, 160–178.
- Schramm, H.-J.; Munim, Z.H. Container freight rate forecasting with improved accuracy by integrating soft facts from practitioners. Res. Transp. Bus. Manag. 2021, 41, 100662.
- Tu, X.; Yang, Y.; Lin, Y.; Ma, S. Analysis of influencing factors and prediction of China’s Containerized Freight Index. Front. Mar. Sci. 2023, 10, 1245542.
- Bae, S.-H.; Lee, G.; Park, K.-S. A Baltic Dry Index Prediction using Deep Learning Models. J. Korea Trade 2021, 25, 17–36.
- Ghareeb, A. Time Series Forecasting of Stock Price for Maritime Shipping Company in COVID-19 Period Using Multi-Step Long Short-Term Memory (LSTM) Networks. Proc. Int. Conf. Bus. Excell. 2023, 17, 1728–1747.
- Xiao, W.; Xu, C.; Liu, H.; Liu, X.; Kim, D.-K. A Hybrid LSTM-Based Ensemble Learning Approach for China Coastal Bulk Coal Freight Index Prediction. J. Adv. Transp. 2021, 2021, 5573650.
- Zhang, X.; Xue, T.; Eugene Stanley, H. Comparison of Econometric Models and Artificial Neural Networks Algorithms for the Prediction of Baltic Dry Index. IEEE Access 2019, 7, 1647–1657.
- Katris, C.; Kavussanos, M.G. Time series forecasting methods for the Baltic dry index. J. Forecast. 2021, 40, 1540–1565.
- Shih, Y.-C.; Lin, M.-S.; Lirn, T.-C.; Juang, J.-G. A new-type deep learning model based on Shapley regulation for containerized freight index prediction. J. Mar. Sci. Technol. 2024, 32, 8.
- Liu, S.; Huang, J.; Xu, L.; Zhao, X.; Li, X.; Cao, L.; Wen, B.; Huang, Y. Combined model for prediction of air temperature in poultry house for lion-head goose breeding based on PCA-SVR-ARMA. Trans. Chin. Soc. Agric. Eng. 2020, 36, 225–233.
- Kamal, I.M.; Bae, H.; Sunghyun, S.; Yun, H. DERN: Deep Ensemble Learning Model for Short- and Long-Term Prediction of Baltic Dry Index. Appl. Sci. 2020, 10, 1504.
- Li, Z.; Piao, W.; Wang, L.; Wang, X.; Fu, R.; Fang, Y. China Coastal Bulk (Coal) Freight Index Forecasting Based on an Integrated Model Combining ARMA, GM and BP Model Optimized by GA. Electronics 2022, 11, 2732.
- Huang, Y.; Deng, Y. A new crude oil price forecasting model based on variational mode decomposition. Knowl.-Based Syst. 2021, 213, 106669.
- Zeng, Q.; Qu, C.; Ng, A.K.Y.; Zhao, X. A new approach for Baltic Dry Index forecasting based on empirical mode decomposition and neural networks. Marit. Econ. Logist. 2015, 18, 192–210.
- Chen, Y.; Liu, B.; Wang, T. Analysing and forecasting China containerized freight index with a hybrid decomposition–ensemble method based on EMD, grey wave and ARMA. Grey Syst. Theory Appl. 2020, 11, 358–371.
- Bagherzadeh, S.A.; Sabzehparvar, M. A local and online sifting process for the empirical mode decomposition and its application in aircraft damage detection. Mech. Syst. Signal Process. 2015, 54–55, 68–83.
- Bagherzadeh, S.A.; Asadi, D. Detection of the ice assertion on aircraft using empirical mode decomposition enhanced by multi-objective optimization. Mech. Syst. Signal Process. 2017, 88, 9–24.
- Zhang, C.; Zhao, Y.; Zhao, H. A Novel Hybrid Price Prediction Model for Multimodal Carbon Emission Trading Market Based on CEEMDAN Algorithm and Window-Based XGBoost Approach. Mathematics 2022, 10, 4072.
- Chou, C.-C.; Lin, K.-S. A fuzzy neural network combined with technical indicators and its application to Baltic Dry Index forecasting. J. Mar. Eng. Technol. 2018, 18, 82–91.
- Sahin, B.; Gurgen, S.; Unver, B.; Altin, I. Forecasting the Baltic Dry Index by using an artificial neural network approach. Turk. J. Electr. Eng. Comput. Sci. 2018, 26, 1673–1684.
- Gu, B.; Liu, J. Determinants of dry bulk shipping freight rates: Considering Chinese manufacturing industry and economic policy uncertainty. Transp. Policy 2022, 129, 66–77.
- Jeon, J.-W.; Duru, O.; Munim, Z.H.; Saeed, N. System Dynamics in the Predictive Analytics of Container Freight Rates. Transp. Sci. 2021, 55, 946–967.
- Inglada-Pérez, L.; Coto-Millán, P. A Chaos Analysis of the Dry Bulk Shipping Market. Mathematics 2021, 9, 2065.
- Chou, C.C.; Lin, K.S. A Fuzzy Neural Network Model for Analysing Baltic Dry Index in the Bulk Maritime Industry. Int. J. Marit. Eng. 2017, 159, A2.
- Yang, Z.; Mehmed, E.E. Artificial neural networks in freight rate forecasting. Marit. Econ. Logist. 2019, 21, 390–414.
- Tsioumas, V.; Papadimitriou, S. The dynamic relationship between freight markets and commodity prices revealed. Marit. Econ. Logist. 2016, 20, 267–279.
- Tsouknidis, D.A. Dynamic volatility spillovers across shipping freight markets. Transp. Res. Part E Logist. Transp. Rev. 2016, 91, 90–111.
- Makridakis, S.; Merikas, A.; Merika, A.; Tsionas, M.G.; Izzeldin, M. A novel forecasting model for the Baltic dry index utilizing optimal squeezing. J. Forecast. 2019, 39, 56–68.
- Shabbir, M.; Chand, S.; Iqbal, F.; Kisi, O. Hybrid Approach for Streamflow Prediction: LASSO-Hampel Filter Integration with Support Vector Machines, Artificial Neural Networks, and Autoregressive Distributed Lag Models. Water Resour. Manag. 2024, 38, 4179–4196.
- Park, C.-H.; Chang, J.-H. WLS Localization Using Skipped Filter, Hampel Filter, Bootstrapping and Gaussian Mixture EM in LOS/NLOS Conditions. IEEE Access 2019, 7, 35919–35928.
- Yin, S.; Liu, H. Wind power prediction based on outlier correction, ensemble reinforcement learning, and residual correction. Energy 2022, 250, 123857.
- Pearson, R.K.; Neuvo, Y.; Astola, J.; Gabbouj, M. The class of generalized hampel filters. In Proceedings of the Signal Processing Conference, Brisbane, Australia, 20 April 2015.
- Dragomiretskiy, K.; Zosso, D. Variational Mode Decomposition. IEEE Trans. Signal Process. 2014, 62, 531–544.
- Li, J.; Wu, Q.; Tian, Y.; Fan, L. Monthly Henry Hub natural gas spot prices forecasting using variational mode decomposition and deep belief network. Energy 2021, 227, 120478.
- Li, Y.; Chen, L.; Sun, C.; Liu, G.; Chen, C.; Zhang, Y. Accurate Stock Price Forecasting Based on Deep Learning and Hierarchical Frequency Decomposition. IEEE Access 2024, 12, 49878–49894.
- Akbari, M.A.; Zare, M.; Azizipanah-abarghooee, R.; Mirjalili, S.; Deriche, M. The cheetah optimizer: A nature-inspired metaheuristic algorithm for large-scale optimization problems. Sci. Rep. 2022, 12, 10953.
- Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
- Duan, Y.; Zhang, J.; Wang, X.; Feng, M.; Ma, L. Forecasting carbon price using signal processing technology and extreme gradient boosting optimized by the whale optimization algorithm. Energy Sci. Eng. 2024, 12, 810–834.
- Hirata, E.; Matsuda, T. Forecasting Shanghai Container Freight Index: A Deep-Learning-Based Model Experiment. J. Mar. Sci. Eng. 2022, 10, 593.
- Gao, N.; He, Y.; Ma, X. Exponential timing strategy based on EEMD-SVR predictive modeling of low frequency component. Stat. Decis. 2022, 38, 140–145.
- Chang, M.Z.; Park, S. Predictions and analysis of flash boiling spray characteristics of gasoline direct injection injectors based on optimized machine learning algorithm. Energy 2023, 262, 125304.
Reference | Objective(s) | Data Processing | Model | Influencing Factors |
---|---|---|---|---|
Liu [12] | Freight volume | Empirical mode decomposition (EMD) | Backpropagation neural network (BP) | Baidu search index, COVID-19 index, Historical Data |
Yin [13] | CCFI | Hampel filter, Complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) | Extreme learning machine (ELM) | Historical data |
Han [14] | BDI | Wavelet transform (WT) | Improved SVM | Historical data |
Zhang [15] | FDI, BDI, CBFI, CBCFI, CFD, CTFI | Complementary ensemble empirical mode decomposition (CEEMD) | Bidirectional long short-term memory (BiLSTM) | Historical data |
Tsioumas [16] | BDI | Empirical mode decomposition (EMD) | Multivariate vector autoregressive model with exogenous variables (VARX) | Chinese steel production, Dry bulk fleet development, Dry bulk economic climate index (DBECI) |
Du [17] | Container throughputs | Variational mode decomposition (VMD) | Extreme learning machine (ELM) | Historical data |
Xie [18] | Container throughputs | Seasonal decomposition method (X-12-ARIMA) | Autoregressive integrated moving average (ARIMA), Seasonal autoregressive integrated moving average (SARIMA), Least squares support vector regression (LSSVR) | Historical data |
Schramm [19] | Container freight rate | No | Vector autoregressive (VAR), Autoregressive integrated moving average (ARIMA), Autoregressive integrated moving average with exogenous variables (ARIMAX) | Logistics confidence Index (LCI), Historical data |
Tu [20] | CCFI | No | Deep neural network (DNN), CatBoost regression model, Robust regression model | China coastal bulk freight index (CCBFI), Global: Aluminum price, Container throughput |
Bae [21] | BDI | No | Artificial neural network (ANN), Recurrent neural network (RNN), Long short-term memory (LSTM) | Brent oil price, Coal price, Iron ore export volume |
Acronym | Full Term |
---|---|
H | Hampel filter |
MAD | Median absolute deviation |
CEEMDAN | Complete ensemble empirical mode decomposition with adaptive noise |
VMD | Variational mode decomposition |
IMF | Intrinsic mode function |
COA | Cheetah optimization algorithm |
SVR | Support vector regression |
RF | Random forest |
XGBoost | Extreme gradient boosting |
CART | Classification and regression tree |
Notation | Explanation |
---|---|
* | Convolutional operation |
∂_t | Partial derivative with respect to t |
δ(t) | The Dirac delta function |
L, L(t), | The objective function |
G_j | The sum of the first-order derivatives at the j-th node |
H_j | The sum of the second-order derivatives at the j-th node |
Ω | The regularization term |
F | The function set space of all CART decision trees |
Metric | Definition | Function |
---|---|---|
RMSE | Root mean squared error | |
MAE | Mean absolute error | |
MAPE | Mean absolute percentage error | |
TIC | Theil inequality coefficient |
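The Function column of the metrics table did not survive in this copy, so the sketch below uses the conventional definitions of the four metrics; treat these as assumptions and not as the exact formulas of the study. MAPE is returned as a fraction, not multiplied by 100, which is consistent with the magnitudes reported in the result tables (for example 0.0009), and TIC is taken as the RMSE divided by the sum of the root mean squares of the actual and predicted series.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, as a fraction (actual values must be nonzero)."""
    return sum(abs((a - p) / a) for a, p in zip(y_true, y_pred)) / len(y_true)


def tic(y_true, y_pred):
    """Theil inequality coefficient: 0 for a perfect fit, at most 1."""
    n = len(y_true)
    rms_true = math.sqrt(sum(a ** 2 for a in y_true) / n)
    rms_pred = math.sqrt(sum(p ** 2 for p in y_pred) / n)
    return rmse(y_true, y_pred) / (rms_true + rms_pred)


actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
scores = {
    "RMSE": rmse(actual, predicted),
    "MAE": mae(actual, predicted),
    "MAPE": mape(actual, predicted),
    "TIC": tic(actual, predicted),
}
```

Lower values indicate better forecasts for all four metrics.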
Models | Decomposition Technique | Evolutionary Algorithm | Outlier Handling |
---|---|---|---|
SVR (model 1) | | | |
RF (model 2) | | | |
XGBoost (model 3) | | | |
CEEMDAN-COA-XGBoost (model 4) | √ | √ | |
VMD-COA-XGBoost (model 5) | √ | √ | |
Hampel-CEEMDAN-COA-XGBoost (model 6) | √ | √ | √ |
Hampel-VMD-COA-XGBoost (model 7) | √ | √ | √ |
Datasets | Models | RMSE | MAE | MAPE | TIC |
---|---|---|---|---|---|
SCFI | SVR | 68.8425 | 63.2738 | 0.0448 | 0.0152 |
SCFI | RF | 50.4306 | 40.4269 | 0.0286 | 0.0112 |
SCFI | XGBoost | 36.7491 | 13.8465 | 0.0077 | 0.0082 |
NCFI | SVR | 50.8329 | 37.6292 | 0.0352 | 0.0142 |
NCFI | RF | 41.0845 | 31.9748 | 0.0258 | 0.0115 |
NCFI | XGBoost | 31.6979 | 19.8456 | 0.0158 | 0.0089 |
Datasets | Models | RMSE | MAE | MAPE | TIC |
---|---|---|---|---|---|
SCFI | CEEMDAN-COA-XGBoost | 15.1839 | 9.6145 | 0.0053 | 0.0034 |
SCFI | VMD-COA-XGBoost | 7.2803 | 4.1971 | 0.0021 | 0.0016 |
SCFI | Hampel-CEEMDAN-COA-XGBoost | 5.2771 | 3.9468 | 0.0024 | 0.0012 |
SCFI | Hampel-VMD-COA-XGBoost | 3.3268 | 2.1778 | 0.0009 | 0.0007 |
NCFI | CEEMDAN-COA-XGBoost | 11.1953 | 8.3779 | 0.0067 | 0.0031 |
NCFI | VMD-COA-XGBoost | 8.3489 | 5.7984 | 0.0043 | 0.0023 |
NCFI | Hampel-CEEMDAN-COA-XGBoost | 10.1763 | 7.4243 | 0.0061 | 0.0028 |
NCFI | Hampel-VMD-COA-XGBoost | 7.5175 | 5.2427 | 0.0041 | 0.0021 |
SCFI Characteristic | SCFI Importance | NCFI Characteristic | NCFI Importance |
---|---|---|---|
Historical data | 0.2133 | Historical data | 0.2229 |
Aluminum price | 0.2046 | Crude price | 0.2211 |
BDI | 0.1693 | CCFI | 0.0949 |
CBCFI | 0.1035 | Container throughput | 0.0832 |
CCBFI | 0.0831 | TDI | 0.0689 |
CCFI | 0.0738 | CCBFI | 0.0624 |
BDI Baidu index | 0.0493 | BDI | 0.0528 |
Container throughput | 0.0303 | Shipping Baidu index | 0.0527 |
Coal price | 0.0252 | Coal price | 0.0475 |
Crude price | 0.0294 | Aluminum price | 0.0440 |
TDI | 0.0076 | CBCFI | 0.0395 |
Shipping Baidu index | 0.0105 | BDI Baidu index | 0.0100 |
Datasets | Benchmark Model | Comparative Model | P_RMSE | P_MAE | P_MAPE | P_TIC |
---|---|---|---|---|---|---|
SCFI | SVR | XGBoost | 46.62% | 78.12% | 82.81% | 46.05% |
SCFI | RF | XGBoost | 27.13% | 65.75% | 73.08% | 26.79% |
SCFI | XGBoost | CEEMDAN-COA-XGBoost | 58.68% | 30.56% | 31.69% | 58.54% |
SCFI | XGBoost | VMD-COA-XGBoost | 80.19% | 69.69% | 72.73% | 80.49% |
SCFI | CEEMDAN-COA-XGBoost | VMD-COA-XGBoost | 52.05% | 56.34% | 60.38% | 52.94% |
SCFI | CEEMDAN-COA-XGBoost | Hampel-CEEMDAN-COA-XGBoost | 65.25% | 58.95% | 54.72% | 64.71% |
SCFI | VMD-COA-XGBoost | Hampel-VMD-COA-XGBoost | 54.30% | 48.11% | 57.14% | 56.25% |
SCFI | Hampel-CEEMDAN-COA-XGBoost | Hampel-VMD-COA-XGBoost | 36.96% | 44.82% | 62.50% | 41.67% |
NCFI | SVR | XGBoost | 37.64% | 47.26% | 55.11% | 37.32% |
NCFI | RF | XGBoost | 22.85% | 37.93% | 38.76% | 22.61% |
NCFI | XGBoost | CEEMDAN-COA-XGBoost | 64.68% | 57.78% | 57.59% | 65.17% |
NCFI | XGBoost | VMD-COA-XGBoost | 73.66% | 70.78% | 72.78% | 74.16% |
NCFI | CEEMDAN-COA-XGBoost | VMD-COA-XGBoost | 25.42% | 30.79% | 35.82% | 25.81% |
NCFI | CEEMDAN-COA-XGBoost | Hampel-CEEMDAN-COA-XGBoost | 9.10% | 11.38% | 8.96% | 9.68% |
NCFI | VMD-COA-XGBoost | Hampel-VMD-COA-XGBoost | 9.96% | 9.58% | 4.65% | 8.70% |
NCFI | Hampel-CEEMDAN-COA-XGBoost | Hampel-VMD-COA-XGBoost | 26.13% | 29.38% | 37.79% | 25.00% |
Dataset | Model | RMSE |
---|---|---|
SCFI | LSTM | 17.62 |
SCFI | Proposed model | 3.09 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Duan, Y.; Zhang, X.; Wang, X.; Fan, Y.; Liu, K. A Novel Forecasting System with Data Preprocessing and Machine Learning for Containerized Freight Market. Mathematics 2025, 13, 1695. https://doi.org/10.3390/math13101695