Enhancing Supply Chain Management: A Comparative Study of Machine Learning Techniques with Cost–Accuracy and ESG-Based Evaluation for Forecasting and Risk Mitigation
Abstract
1. Introduction
- We develop and evaluate an XGBoost-based demand forecasting model on a high-variance retail dataset, achieving an 18% reduction in mean absolute error (MAE) and a 22% reduction in root-mean-square error (RMSE) compared to a benchmark ARIMA (1,1,1) model. This result highlights the advantage of tree-based ensembles in handling non-stationary and intermittent demand patterns.
- We design a continuous-review inventory replenishment policy that dynamically adjusts reorder points based on forecast accuracy. When the MAE falls below 12% of average demand, this approach improves service levels by 7% and reduces the total inventory cost by 10% compared to a fixed-interval policy under identical conditions.
- We introduce two composite evaluation metrics—CAE and CAE-ESG—that jointly assess model performance, implementation cost, and sustainability impact. Using these metrics, we show that although Random Forests and RNNs perform well, XGBoost achieves the best balance between cost-efficiency and ESG footprint, reducing greenhouse gas emissions by 15% compared to deep learning models.
- We apply RFM-based customer segmentation to enhance the ML model input structure. By tailoring forecasts to customer segments (e.g., “Champions,” “Loyalists,” “At-Risk”), we observe up to an 11% improvement in forecast accuracy and a 9% uplift in performance for retention-critical cohorts, demonstrating the value of behavioral segmentation in demand modeling.
2. Literature Review
2.1. Demand Forecasting in SCM: From Classical Models to ML Approaches
2.2. Forecast-Driven Inventory Optimization: Moving Beyond Static Policies
2.3. ML for Risk Mitigation: Fraud and Delay Prediction
2.4. ESG-Aware Model Evaluation: Toward Sustainable Analytics
3. Materials and Methods
3.1. Data and Preprocessing
3.1.1. Data Sources and Context
- Order records: Timestamps, item Stock Keeping Units (SKUs), unit prices, quantities.
- Shipping logs: Promised vs. actual shipping dates, carrier information, shipping modes (standard, expedited, same-day).
- Customer profiles: Geographic location, segment tags (e.g., business vs. individual).
3.1.2. Cleaning and Imputation
- Missing Values:
  - Numeric fields with <1% missing values (e.g., UnitPrice) were imputed using the median to avoid skewness. For numeric fields with ≥1% missingness, we applied a two-step strategy: if the field had business-critical value (e.g., ShippingCost), we used regression imputation based on correlated variables; fields with >5% missingness and limited analytical value were excluded from modeling to maintain data integrity and reduce noise.
  - Categorical fields (e.g., ShippingMode): Imputed to an explicit “Unknown” category, preserving these records for pattern discovery.
- Outliers and Consistency:
  - Continuous variables beyond μ ± 3σ were winsorized to the 1st or 99th percentile to limit undue influence from recording errors or extreme purchases.
  - Date fields were validated (e.g., ensuring ShipDate ≥ OrderDate); any anomalies were manually reviewed and corrected, or dropped if unverifiable. A minimal implementation sketch of these cleaning rules follows this list.
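For illustration, the sketch below applies the stated cleaning rules to a generic pandas DataFrame. Column names (UnitPrice, ShippingCost, ShippingMode, ShipDate, OrderDate) follow the text, the thresholds mirror the 1%/5% and μ ± 3σ rules, and the regression-imputation step for business-critical fields is omitted for brevity; this is an assumption-laden sketch, not the study’s exact pipeline.

```python
import numpy as np
import pandas as pd

def clean_orders(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()

    # Median-impute numeric fields with <1% missingness; drop low-value fields with >5%.
    for col in df.select_dtypes(include="number").columns:
        frac_missing = df[col].isna().mean()
        if 0 < frac_missing < 0.01:
            df[col] = df[col].fillna(df[col].median())
        elif frac_missing > 0.05:
            df = df.drop(columns=col)

    # Categorical fields: keep records by assigning an explicit "Unknown" category.
    for col in df.select_dtypes(include="object").columns:
        df[col] = df[col].fillna("Unknown")

    # Winsorize continuous values beyond mu +/- 3 sigma to the 1st/99th percentile.
    for col in df.select_dtypes(include="number").columns:
        mu, sigma = df[col].mean(), df[col].std()
        lo, hi = df[col].quantile(0.01), df[col].quantile(0.99)
        outside = (df[col] < mu - 3 * sigma) | (df[col] > mu + 3 * sigma)
        df.loc[outside, col] = df.loc[outside, col].clip(lower=lo, upper=hi)

    # Date consistency: ShipDate must not precede OrderDate (assumes datetime dtypes).
    if {"ShipDate", "OrderDate"} <= set(df.columns):
        df = df[df["ShipDate"] >= df["OrderDate"]]

    return df
```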
3.1.3. Feature Engineering
- Sales per Customer: Measures the total purchase amount per customer, a core variable used in RFM-based segmentation [32,33] and Customer Lifetime Value (CLV) modeling. High values of Si indicate customers who make significant monetary contributions, which supports prioritization in retention strategies.
- Actual Shipping Days: Shipping delay is a key operational indicator that reflects fulfillment efficiency. It has been linked to customer satisfaction and future order probability [24]. In demand prediction and inventory modeling, longer shipping delays often signal bottlenecks or risk exposures.
- Late Delivery Flag: This binary indicator flags whether an order violated its promised lead time. Such variables are crucial for ML-based risk modeling and have been used in prior work to detect supply chain disruptions and fraud patterns.
- Derived Demographics: Geographic variables such as CustomerCity and OrderCountry were one-hot encoded and evaluated for predictive utility. However, SHAP analysis revealed minimal contribution to model performance, which is consistent with the low correlation values observed during EDA. As a result, these features were excluded from the final models to avoid unnecessary dimensionality and overfitting.
3.1.4. Scaling and Encoding
- Min–Max scaling to [0, 1] was applied for algorithms sensitive to feature magnitudes [34] (linear models, neural networks): x′ = (x − xmin)/(xmax − xmin), where xmin and xmax are the feature’s minimum and maximum values in the training data. A short implementation sketch follows.
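As a brief illustration of this step, the sketch below applies scikit-learn’s MinMaxScaler to magnitude-sensitive columns; the column list is a placeholder, not the exact feature set used in the study.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

def scale_features(df: pd.DataFrame, cols: list) -> pd.DataFrame:
    """Apply Min-Max scaling x' = (x - min) / (max - min) to the given columns."""
    df = df.copy()
    scaler = MinMaxScaler(feature_range=(0, 1))
    df[cols] = scaler.fit_transform(df[cols])
    return df

# Hypothetical magnitude-sensitive columns; substitute the study's actual feature set.
# df = scale_features(df, ["Sales per customer", "Days for shipping (real)", "Order item total"])
```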
3.1.5. Descriptive Moments and Distributional Insights
3.1.6. Pairwise Correlation Structure
3.1.7. Visual Exploration
- Histograms and boxplots to check for multimodality and outliers.
- Heatmaps to visualize clusters of highly correlated predictors, guiding feature selection to reduce multicollinearity.
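A compact sketch of this visual exploration step is shown below, using pandas and seaborn defaults; the figure sizes and styling are illustrative and do not reproduce the exact plots reported in the paper.

```python
import matplotlib.pyplot as plt
import seaborn as sns

def explore(df):
    numeric = df.select_dtypes(include="number")

    # Histograms and boxplots to inspect multimodality and outliers.
    numeric.hist(bins=30, figsize=(12, 8))
    plt.tight_layout()

    fig, ax = plt.subplots(figsize=(12, 4))
    numeric.boxplot(ax=ax, rot=45)

    # Heatmap of Pearson correlations to spot clusters of collinear predictors.
    fig, ax = plt.subplots(figsize=(8, 6))
    sns.heatmap(numeric.corr(method="pearson"), annot=True, fmt=".2f", cmap="coolwarm", ax=ax)
    plt.show()
```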
3.2. Model Architecture Overview
- Tree-Based Models (Random Forest, XGBoost): Chosen for their robustness to outliers, built-in feature selection, and interpretability via feature importance metrics. XGBoost additionally provides regularization and handles missing values effectively.
- Neural Networks (Feedforward, LSTM-RNN): Selected for their ability to capture non-linear relationships and, in the case of LSTM, long-term sequential dependencies critical for time-series forecasting and temporal pattern recognition in fraud detection.
- Linear Models (Logistic Regression, Lasso): Included as interpretable baselines, with Lasso providing automatic feature selection through L1 regularization.
- Classical Methods (ARIMA, EOQ): Established benchmarks for time-series forecasting and inventory management, respectively.
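To make the model catalog concrete, the sketch below instantiates representative members of each family with placeholder hyperparameters; the tuned values, input shapes, and layer sizes are assumptions for illustration only.

```python
import tensorflow as tf
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, LogisticRegression
from xgboost import XGBRegressor

models = {
    # Tree-based ensembles: robust to outliers, built-in feature importance.
    "random_forest": RandomForestRegressor(n_estimators=300, max_depth=7, random_state=42),
    "xgboost": XGBRegressor(n_estimators=300, max_depth=5, learning_rate=0.1, random_state=42),
    # Interpretable linear baselines; Lasso adds L1-driven feature selection.
    "lasso": Lasso(alpha=0.01),
    "logistic_regression": LogisticRegression(max_iter=1000),
    # LSTM-RNN for sequential dependencies; (timesteps, features) shape is a placeholder.
    "lstm": tf.keras.Sequential([
        tf.keras.Input(shape=(30, 8)),
        tf.keras.layers.LSTM(64),
        tf.keras.layers.Dense(1),
    ]),
}
```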
3.3. Customer Segmentation and Churn Modeling
3.3.1. RFM Metric Computation
3.3.2. Quintile Scoring
3.3.3. Segment Labeling
- Champions: (sR,sF,sM ≥ 4)—Top-spending frequent buyers requiring exclusive rewards.
- Loyal Customers: (sF ≥ 4, sM ≤ 3)—High-frequency purchasers eligible for volume discounts.
- At-Risk: (sR ≤ 2, sF,sM ≥ 3)—Previously valuable customers needing reactivation campaigns.
- Lost: (sR ≤ 2, sF ≤ 2)—Lapsed customers for win-back initiatives.
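Sections 3.3.1–3.3.3 can be summarized by the sketch below: recency, frequency, and monetary totals are computed per customer, converted to quintile scores, and mapped to the segment rules listed above. Column names (CustomerID, OrderDate, Sales) and the snapshot date are placeholders for the dataset’s actual fields.

```python
import pandas as pd

def rfm_segments(orders: pd.DataFrame, snapshot_date) -> pd.DataFrame:
    # 3.3.1 RFM metric computation per customer.
    rfm = orders.groupby("CustomerID").agg(
        R=("OrderDate", lambda d: (snapshot_date - d.max()).days),  # recency in days
        F=("OrderDate", "count"),                                   # purchase frequency
        M=("Sales", "sum"),                                         # monetary value
    )

    # 3.3.2 Quintile scoring (1-5); lower recency is better, so its labels are reversed.
    rfm["sR"] = pd.qcut(rfm["R"].rank(method="first"), 5, labels=[5, 4, 3, 2, 1]).astype(int)
    rfm["sF"] = pd.qcut(rfm["F"].rank(method="first"), 5, labels=[1, 2, 3, 4, 5]).astype(int)
    rfm["sM"] = pd.qcut(rfm["M"].rank(method="first"), 5, labels=[1, 2, 3, 4, 5]).astype(int)

    # 3.3.3 Segment labeling rules from the text; remaining customers are left as "Other".
    def label(row):
        if row.sR >= 4 and row.sF >= 4 and row.sM >= 4:
            return "Champions"
        if row.sF >= 4 and row.sM <= 3:
            return "Loyal Customers"
        if row.sR <= 2 and row.sF >= 3 and row.sM >= 3:
            return "At-Risk"
        if row.sR <= 2 and row.sF <= 2:
            return "Lost"
        return "Other"

    rfm["Segment"] = rfm.apply(label, axis=1)
    return rfm
```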
3.3.4. Problem Framing and Labels
3.3.5. Feature Matrix
3.3.6. Modeling Approaches
- Logistic Regression: Trained by minimizing the regularized binary cross-entropy (log-loss) objective, L(β) = −Σi [yi log p̂i + (1 − yi) log(1 − p̂i)] [39].
- Random Forest: Ensemble of T decision trees, each split minimizing Gini impurity.
- XGBoost: Gradient-boosted trees with the regularized objective L = Σi l(yi, ŷi) + Σk Ω(fk), where Ω(f) = γT + (λ/2)‖w‖² penalizes the number of leaves T and the leaf weights w.
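A minimal sketch of this churn-modeling comparison, assuming a prepared feature matrix Xc and binary labels yc; the split ratio and hyperparameters shown are placeholders rather than the study’s tuned values, and model selection follows the F1-based criterion described in Section 3.3.7.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

def select_churn_model(Xc, yc):
    X_tr, X_te, y_tr, y_te = train_test_split(Xc, yc, test_size=0.2, stratify=yc, random_state=42)

    candidates = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=300, random_state=42),
        "xgboost": XGBClassifier(n_estimators=300, learning_rate=0.1, eval_metric="logloss"),
    }

    scores = {}
    for name, model in candidates.items():
        model.fit(X_tr, y_tr)
        scores[name] = f1_score(y_te, model.predict(X_te))  # select the best model by F1

    best = max(scores, key=scores.get)
    return candidates[best], best, scores
```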
3.3.7. Model Selection and Thresholding
3.4. Forecasting and Inventory Optimization Framework
3.4.1. Data Structuring
3.4.2. Model Catalog
- Linear/Lasso: Lasso regression was included as a regularized linear baseline for its ability to perform feature selection while minimizing overfitting. The objective is to minimize (1/2n)‖y − Xβ‖² + λΣj|βj|, where λ controls the strength of the L1 penalty.
3.4.3. Training and Validation
- Loss functions: MSE during training; MAE monitored for early stopping.
- Evaluation metrics: MAE, RMSE, and MAPE were used to compare the models.
- Statistical testing: A Wilcoxon signed-rank test (p < 0.05) was conducted to determine whether XGBoost’s forecast errors were statistically lower than those of the other models, justifying its selection for downstream simulation in inventory policy and ESG evaluation. A sketch of this paired test appears below.
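For the statistical testing step, the sketch below runs a paired, one-sided Wilcoxon signed-rank comparison of per-period absolute errors with scipy.stats.wilcoxon; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import wilcoxon

def xgb_errors_lower(y_true, pred_xgb, pred_other, alpha=0.05):
    """Test whether XGBoost's absolute forecast errors are lower than a competitor's."""
    err_xgb = np.abs(y_true - pred_xgb)
    err_other = np.abs(y_true - pred_other)
    # One-sided paired test: H1 is that XGBoost's errors are smaller.
    stat, p_value = wilcoxon(err_xgb, err_other, alternative="less")
    return p_value < alpha, p_value
```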
3.4.4. Methods Compared
- Naive Economic Order Quantity (EOQ)/Reorder Point (ROP): Classical formulae.
- Forecast-Driven: Dynamic ROPs based on next-day forecasts from XGBoost and RNN.
3.4.5. Key Equations
3.4.6. Simulation Steps
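The simulation can be sketched as the daily loop below, combining the classical EOQ formula Q* = sqrt(2DS/H) with the dynamic reorder point ROPt = D̂t+1 + zσt+1 driven by next-day forecasts (see also Algorithm 1). The cost parameters, the instantaneous-replenishment assumption, and the forecast and uncertainty callables are illustrative assumptions, not the study’s exact settings.

```python
import math
import numpy as np

def simulate_policy(demand, forecast_fn, sigma_fn, S=50.0, H=0.1, z=1.65, penalty=5.0):
    """Forecast-driven continuous-review simulation over the demand horizon."""
    D = float(np.mean(demand))                 # average daily demand rate
    Q_star = math.sqrt(2 * D * S / H)          # classical EOQ order quantity
    inv = Q_star
    costs = {"ordering": 0.0, "holding": 0.0, "stockout": 0.0}

    for t in range(len(demand) - 1):
        d_hat = forecast_fn(t + 1)             # next-day demand forecast (e.g., XGBoost/RNN)
        rop = d_hat + z * sigma_fn(t + 1)      # dynamic reorder point ROP_t
        if inv <= rop:                         # reorder when inventory reaches ROP_t
            inv += Q_star                      # instantaneous replenishment assumed
            costs["ordering"] += S
        shortfall = max(demand[t] - inv, 0.0)  # unmet demand this period
        inv = max(inv - demand[t], 0.0)
        costs["holding"] += H * inv
        costs["stockout"] += penalty * shortfall

    costs["total"] = sum(costs.values())
    return costs
```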
3.5. Risk Prediction and Classification
3.5.1. Label Definitions
- Fraud: transactions flagged by the audit team.
- Late Delivery: Li = 1.
3.5.2. Modeling and Metrics
3.5.3. Parameter Grids
- Tree Models: max depth ∈ {3, 5, 7}, learning rate ∈ {0.01, 0.1}, n_estimators ∈ {100, 300}.
- RNN: hidden units ∈ {64, 128}, dropout ∈ {0.2, 0.5}, learning rate ∈ {1 × 10⁻², 1 × 10⁻³}.
- Recall for fraud (minimize false negatives).
- F1-score for late delivery (balance precision and recall).
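The grids above translate directly into a cross-validated search. The sketch below uses scikit-learn’s GridSearchCV over the stated tree-model grid, with recall (fraud) or F1 (late delivery) as the scoring target; the classifier settings and fold count are assumptions consistent with the text.

```python
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from xgboost import XGBClassifier

param_grid = {
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1],
    "n_estimators": [100, 300],
}

def tune_tree_model(X, y, scoring="recall"):
    """Stratified CV grid search; use scoring='recall' for fraud, 'f1' for late delivery."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
    search = GridSearchCV(XGBClassifier(eval_metric="logloss"), param_grid,
                          scoring=scoring, cv=cv, n_jobs=-1)
    search.fit(X, y)
    return search.best_estimator_, search.best_params_, search.best_score_
```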
3.6. Interpretability and Model Selection
3.6.1. Validation and Generalization
- Temporal Holdout: Strict forward testing for forecasting tasks ensures real-world applicability.
- Stratified k-Fold CV: Applied across classification tasks, demonstrating stable performance (±1–2% variance).
- Paired Statistical Tests: Wilcoxon tests confirmed that the performance differences were significant (p < 0.05).
- Robustness Checks: Retraining on seasonal subsets showed consistent model behavior.
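These validation steps can be expressed compactly as below: a strict forward-in-time holdout for the forecasting task and stratified k-fold cross-validation for the classifiers. The split fraction and fold count are illustrative assumptions, and the inputs are assumed to be NumPy arrays ordered by time.

```python
import numpy as np
from sklearn.base import clone
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold

def temporal_holdout(X, y, test_frac=0.2):
    """Forward-in-time split: the most recent observations form the test set."""
    cut = int(len(X) * (1 - test_frac))
    return X[:cut], X[cut:], y[:cut], y[cut:]

def stratified_cv_scores(model, X, y, n_splits=5):
    """Stratified k-fold F1 scores to check stability (the paper reports +/-1-2% variance)."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=42)
    scores = []
    for train_idx, test_idx in cv.split(X, y):
        m = clone(model).fit(X[train_idx], y[train_idx])
        scores.append(f1_score(y[test_idx], m.predict(X[test_idx])))
    return np.mean(scores), np.std(scores)
```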
3.6.2. CAE Computation
- CAE:
- CAE with ESG Integration:
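The exact CAE and CAE-ESG expressions are given in the equations above, which did not survive extraction here. As an illustration only, the sketch below uses an assumed ratio form, CAE = (Accuracy × CostReduction)/(CompCost + OpComplexity), which approximately reproduces several of the CAE values reported in Section 4.8, and treats the ESG adjustment as a further assumption; it should not be read as the paper’s definitive formula.

```python
def cae(accuracy, cost_reduction, comp_cost, op_complexity):
    # Assumed ratio form (not the paper's verbatim equation): benefit over normalized burden.
    return (accuracy * cost_reduction) / (comp_cost + op_complexity)

def cae_esg(accuracy, cost_reduction, comp_cost, op_complexity, esg_score, esg_weight=1.0):
    # Assumed ESG-augmented variant: the composite ESG score (EEI, SRS, GRM) adds to the reward term.
    return (accuracy * cost_reduction + esg_weight * esg_score) / (comp_cost + op_complexity)

# Example with the XGBoost fraud-detection inputs reported in Section 4.8:
# cae(0.9911, 0.30, 0.10, 0.30)  # ~0.74, close to the reported 0.744
```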
3.6.3. Enhancing SCM Algorithms
Algorithm 1: End-to-end pseudocode for enhancing SCM
INPUT: Raw order and shipping data.
1. Cleaning and Feature Engineering
   - Impute missing values and winsorize outliers.
   - Compute Si, Di, Li, and demographic encodings (sales per customer, actual shipping days, late delivery flags).
   - Apply Min–Max scaling to numeric features sensitive to feature magnitudes.
2. Exploratory Analysis
   - Compute descriptive moments and Pearson’s r.
   - Visualize distributions (histograms, boxplots) and the correlation heatmap.
3. Customer Segmentation
   - Calculate R, F, and M per customer.
   - Assign quintile scores and map customers to segments.
4. Churn Prediction
   - Build the feature matrix Xc and labels yc.
   - Train LR, RF, and XGBoost; select the best model by F1-score.
5. Forecasting
   - Build time-series features Xt and target yt.
   - Train models (LR, Lasso, RF, XGBoost, NN, RNN).
   - Evaluate on the holdout set; compute CAE and CAE-ESG.
   - Select the model maximizing CAE-ESG or the desired trade-off.
6. Inventory Simulation
   - FOR t = 1…365:
     - Forecast demand D̂t+1 for the next day.
     - Compute ROPt = D̂t+1 + zσt+1.
     - IF Invt ≤ ROPt, order Q*.
     - Update inventory levels and associated costs.
7. Fraud and Late Delivery Classification
   - Prepare feature matrices and labels.
   - Train and evaluate RF, XGBoost, and RNN; tune for recall/F1; compute CAE and CAE-ESG for interpretability and efficiency.
8. Hyperparameter Tuning
   - Define grid Θ; perform stratified CV.
   - Select θ maximizing the target metric (e.g., F1-score).
9. Interpretability
   - Compute SHAP values; rank features by contribution.
10. Validation
   - Perform temporal holdout, k-fold CV, and Wilcoxon tests.
   - Compare accuracy, CAE, and CAE-ESG across models.
OUTPUT: Final models, segmentation rules, forecast scripts, simulation results, interpretability reports.
4. Results
4.1. EDA Results
4.2. Customer Segmentation (RFM)
- Recency: Time elapsed since the last purchase.
- Frequency: Total number of purchases.
- Monetary value: Total amount spent.
- Champions: High recency, frequency, and monetary scores;
- Cannot Lose Them: High frequency and recency, but lower monetary value;
- At-Risk: Low recency but moderate frequency and monetary values;
- Customers Needing Attention: Customers with moderate behavior across all metrics;
- Lost: Customers with low engagement and spending;
- Loyal Customers: High frequency but lower monetary value;
- Promising: Recent customers with potential for increased loyalty;
- Recent Customers: Newer customers who have made recent purchases.
4.3. Churn Prediction for Specific Customer Segments
- The “At-Risk” and “Lost” segments are the primary targets for churn prevention efforts.
- Targeted interventions can significantly improve CLV for high-value segments like “Champions” and “Loyal Customers”.
- The “Promising” and “Recent Customers” segments show potential for increased loyalty and future purchases.
- Low-value segments like “Lost” customers require different engagement approaches.
4.4. Forecasting and Inventory Optimization
4.5. Classification Performance
4.5.1. Fraud Detection Performance
4.5.2. Late Delivery Prediction Performance
4.6. Hyperparameter Tuning and Model Fine-Tuning
4.7. Comparative Study: Traditional ML vs. Deep Learning
4.8. CAE and CAE-ESG Performance Evaluation
4.9. Summary of Hypothesis Validation
- Hypothesis 1 was confirmed: XGBoost and RNN achieved significantly lower MAE and RMSE than traditional ARIMA models in forecasting tasks. This finding supports prior research on the superiority of non-linear ML methods for non-stationary demand series.
- Hypothesis 2 was supported: Forecast-driven continuous-review inventory policies improved fill rates by 5–7% over fixed-interval EOQ baselines, validating the operational value of adaptive policies.
- Hypothesis 3 was confirmed: RNNs outperformed classical models in fraud and late delivery classification, particularly in recall, aligning with studies highlighting their strength in sequential pattern recognition.
- Hypothesis 4 was validated: CAE and CAE-ESG revealed critical trade-offs among predictive power, cost, and sustainability. Random Forest achieved the highest CAE-ESG score, despite not being the most accurate.
4.10. Model Interpretability via SHAP Analysis
4.10.1. SHAP-Based Interpretability for Fraud Detection
4.10.2. SHAP-Based Interpretability for Late Delivery
4.10.3. Comparative Feature Importance
- Late_delivery_risk and days for shipping (real) are modestly correlated, suggesting some overlapping influence in fraud detection.
- Shipping mode, a top driver of late delivery, shows inverse correlation with Late_delivery_risk, underscoring their contrasting effects across models.
- Categorical features such as type and its derivatives exhibit very low correlation with other features, reinforcing their unique modeling value.
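A minimal sketch of the SHAP workflow underlying these observations, assuming a fitted tree-based classifier and a feature matrix X; shap.TreeExplainer is used for the tree ensembles, whereas the RNN models would require DeepExplainer or KernelExplainer.

```python
import numpy as np
import pandas as pd
import shap

def shap_feature_ranking(model, X: pd.DataFrame) -> pd.Series:
    """Rank features by mean absolute SHAP value for a fitted tree-based classifier."""
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X)
    # Some SHAP versions return a list [class 0, class 1] for binary tasks; take the positive class.
    if isinstance(shap_values, list):
        shap_values = shap_values[1]
    importance = np.abs(shap_values).mean(axis=0)
    return pd.Series(importance, index=X.columns).sort_values(ascending=False)

# shap.summary_plot(shap_values, X)  # beeswarm view used for per-feature impact discussion
```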
5. Discussion
5.1. Forecasting and Inventory Control
5.2. Risk Mitigation and Business Outcomes
5.3. ESG Metrics in Model Evaluation
5.4. Implications for Practice
5.5. Addressing Gaps in the Literature
- The fragmentation of ML applications across SCM functions is addressed by validating Hypotheses 1, 2, and 3, which demonstrate that predictive models like XGBoost and RNNs can be successfully deployed across forecasting, inventory optimization, and risk mitigation in a unified analytical framework.
- The lack of comparative analysis under consistent conditions is resolved through direct empirical benchmarking of classical ML and deep learning models across identical datasets and KPIs, as reflected in the support for Hypotheses 1–3. This approach contrasts with prior work that typically examined forecasting or classification tasks in isolation.
- The absence of sustainability-aware model selection frameworks is directly addressed through Hypothesis 4, which introduces and validates the CAE and CAE-ESG metrics. These tools extend beyond accuracy-based evaluation, incorporating cost-efficiency, ESG scores, and operational complexity into SCM model assessment.
5.6. Future Research Directions
- Building on Hypotheses 1 and 2, future work could examine how advanced ML models like XGBoost or RNNs perform in multi-echelon inventory systems, closed-loop supply chains, or under conditions of extreme demand volatility (e.g., post-sale service and returns management). These contexts would test the robustness of forecast-driven replenishment and adaptive inventory policies in more complex, high-risk environments.
- In the context of Hypothesis 3, further investigation is warranted into hybrid classification architectures (e.g., CNN-RNN or attention-based models) for more nuanced fraud detection or late delivery prediction, especially as new data modalities (like image, GPS, or IoT) become available.
- Critically, Hypothesis 4 opens an important path for research on dynamic ESG integration. Future studies could explore how CAE-ESG can adapt to shifting stakeholder priorities or regulatory requirements by weighting ESG dimensions based on geography, industry, or corporate policy. Incorporating real-time ESG feedback loops or policy-sensitive ESG scoring could further align analytics with sustainability mandates.
5.7. Implications of CAE/CAE-ESG Metrics in SCM
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
List of Abbreviations
Abbreviation | Definition |
ARIMA | Autoregressive Integrated Moving Average |
CAE | Cost-Accuracy Efficiency |
CAE-ESG | Cost-Accuracy Efficiency with Environmental, Social, and Governance Integration |
CLV | Customer Lifetime Value |
CV | Cross-Validation |
DL | Deep Learning |
EDA | Exploratory Data Analysis |
EEI | Environmental Efficiency Index |
EOQ | Economic Order Quantity |
ERP | Enterprise Resource Planning |
ESG | Environmental, Social, and Governance |
ETS | Exponential Smoothing |
FNN | Feedforward Neural Network |
FPR | False Positive Rate |
GHG | Greenhouse Gas |
GRI | Global Reporting Initiative |
GRU | Gated Recurrent Unit |
ISO | International Organization for Standardization |
LSTM | Long Short-Term Memory |
MAE | Mean Absolute Error |
MAPE | Mean Absolute Percentage Error |
ML | Machine Learning |
MSE | Mean Squared Error |
PCC (r) | Pearson’s Correlation Coefficient (r) |
RF | Random Forest |
RFM | Recency, Frequency, Monetary Value |
RMSE | Root-Mean-Square Error |
RNN | Recurrent Neural Network |
ROC AUC | Area Under the Receiver Operating Characteristic Curve |
ROP | Reorder Point |
SASB | Sustainability Accounting Standards Board |
SCM | Supply Chain Management |
SHAP | SHapley Additive exPlanations |
SKU | Stock Keeping Unit |
SRS | Social Responsibility Score |
TPR | True Positive Rate |
XGBoost | Extreme Gradient Boosting |
Glossary of Variables and Metrics
Symbol | Definition | Unit | Context |
Si | Total sales for customer i | USD | Sum of unit price × quantity for all orders by customer i |
pij | Unit price of item j for customer i | USD/item | Retrieved from order transaction logs |
qij | Quantity of item j in order by customer i | Units | Recorded in the order dataset |
Di | Actual shipping days for order i | Days | Di = ActualShipDatei−OrderDatei |
Li | Late delivery flag for order i | Binary (0 or 1) | 1 if Di > PromisedLeadTimei, else 0 |
ROPt | Reorder point at time t | Units | |
Q* | Optimal order quantity | Units | |
D | Demand rate | Units/day | Average daily demand used in EOQ calculations |
S | Ordering cost | USD/order | Fixed cost per order placed |
H | Holding cost per unit | USD/unit/day | Cost of storing one unit in inventory per day |
Zα | Safety factor | Z-score | Based on desired service level |
σt+1 | Forecast standard deviation for next period | Units | Uncertainty in forecast demand for time t + 1 |
D̂t+1 | Forecasted demand for time t + 1 | Units/day | Output from XGBoost or RNN models |
Rc | Recency for customer c | Days | Days since last purchase |
Fc | Frequency for customer c | Count | Total number of purchases |
Mc | Monetary value for customer c | USD | Total spending |
sR,sF,sM | Quintile scores for recency, frequency, and monetary value | Score (1–5) | Assigned based on ranking in the customer base |
CLV | Customer Lifetime Value | USD | Estimated future value of customer based on historical behavior |
CAE | Cost–Accuracy Efficiency | Dimensionless Index | Combines model accuracy and cost reduction |
CAE-ESG | CAE with ESG integration | Dimensionless Index | CAE plus ESG score (energy, labor, governance) |
Accuracy | Classification metric | % | Proportion of correct predictions |
Precision | Classification metric | % | True positives/(true positives + false positives) |
Recall | Classification metric | % | True positives/(true positives + false negatives) |
F1-score | Harmonic mean of precision and recall | % | Provides a balanced measure of a model’s accuracy in classification tasks |
SHAP ϕj | SHAP value of feature j | Feature Contribution | Measures contribution of feature j to model prediction |
EEI | Environmental Efficiency Index | Normalized (0–1) | Part of ESG score, reflects energy usage efficiency |
SRS | Social Responsibility Score | Normalized (0–1) | Reflects labor fairness and supply chain ethics |
GRM | Governance Risk Metric | Normalized (0–1) | Reflects regulatory and governance risk |
CompCost | Computational cost of model | Normalized (0–1) | Based on runtime, memory, hardware use |
OpComplexity | Operational complexity | Normalized (0–1) | Reflects difficulty in deploying and maintaining the model |
References
- Joel, O.S.; Oyewole, A.T.; Odunaiya, O.G.; Soyombo, O.T. Leveraging Artificial Intelligence for Enhanced Supply Chain Optimization: A Comprehensive Review of Current Practices and Future Potentials. Int. J. Manag. Entrep. Res. 2024, 6, 707–721. [Google Scholar] [CrossRef]
- Liang, Y. Detecting and Predicting Supply Chain Risks: Fraud and Late Delivery Based on Decision Tree Models. Adv. Econ. Manag. Political Sci. 2025, 153, 40–46. [Google Scholar] [CrossRef]
- Nweje, U.; Taiwo, M. Leveraging Artificial Intelligence for predictive supply chain management, focus on how AI- driven tools are revolutionizing demand forecasting and inventory optimization. Int. J. Sci. Res. Arch. 2025, 14, 230–250. [Google Scholar] [CrossRef]
- Alsolbi, I.; Shavaki, F.H.; Agarwal, R.; Bharathy, G.K.; Prakash, S.; Prasad, M. Big data optimisation and management in supply chain management: A systematic literature review. Artif. Intell. Rev. 2023, 56, 253–284. [Google Scholar] [CrossRef]
- Nzeako, G.; Akinsanya, M.O.; Popoola, O.A.; Chukwurah, E.G.; Okeke, C.D. The role of AI-Driven predictive analytics in optimizing IT industry supply chains. Int. J. Manag. Entrep. Res. 2024, 6, 1489–1497. [Google Scholar] [CrossRef]
- Deyassa, K.G. The Effectiveness of ISO 14001 And Environmental Management System—The Case of Norwegian Firms. Struct. Environ. 2019, 11, 77–89. [Google Scholar] [CrossRef]
- Bais, B.; Nassimbeni, G.; Orzes, G. Global Reporting Initiative: Literature review and research directions. J. Clean. Prod. 2024, 471, 143428. [Google Scholar] [CrossRef]
- Sahib, S.A.; Malik, D.Y.S. Sustainability Accounting Standards Historical Development/Literature Review. Int. Acad. J. Account. Financ. Manag. 2023, 10, 1–12. [Google Scholar] [CrossRef]
- Schwartz, R.; Dodge, J.; Smith, N.A.; Etzioni, O. Green AI. Commun. ACM 2020, 63, 54–63. [Google Scholar] [CrossRef]
- Zeng, H.; Li, R.Y.M.; Zeng, L. Evaluating green supply chain performance based on ESG and financial indicators. Front. Environ. Sci. 2022, 10, 982828. [Google Scholar] [CrossRef]
- Shahrabi, J.; Mousavi, S.S.; Heydar, M. Supply Chain Demand Forecasting; A Comparison of Machine Learning Techniques and Traditional Methods. J. Appl. Sci. 2009, 9, 521–527. [Google Scholar] [CrossRef]
- Aldahmani, E.; Alzubi, A.; Iyiola, K. Demand Forecasting in Supply Chain Using Uni-Regression Deep Approximate Forecasting Model. Appl. Sci. 2024, 14, 8110. [Google Scholar] [CrossRef]
- Rezki, N.; Mansouri, M. Deep learning hybrid models for effective supply chain risk management: Mitigating uncertainty while enhancing demand prediction. Acta Logist. 2024, 11, 589–604. [Google Scholar] [CrossRef]
- Irhuma, M.; Alzubi, A.; Öz, T.; Iyiola, K. Migrative armadillo optimization enabled a one-dimensional quantum convolutional neural network for supply chain demand forecasting. PLoS ONE 2025, 20, e0318851. [Google Scholar] [CrossRef]
- Adhana, K.; Smagulova, A.; Zharmukhanbetov, S.; Kalikulova, A. The Utilisation of Machine Learning Algorithm Support Vector Machine (SVM) for Improving the Adaptive Assessment. In Proceedings of the 2023 4th International Conference on Computation, Automation and Knowledge Management (ICCAKM), Dubai, United Arab Emirates, 12–13 December 2023; IEEE: Piscataway, NJ, USA, 2023; p. 1. [Google Scholar]
- Terven, J.; Cordova-Esparza, D.M.; Ramirez-Pedraza, A.; Chavez-Urbiola, E.A.; Romero-Gonzalez, J.A. Loss Functions and Metrics in Deep Learning. A Review. arXiv 2023, arXiv:2307.02694. [Google Scholar] [CrossRef]
- Chandran, J.M.; Khan, M.R.B. A Strategic Demand Forecasting: Assessing Methodologies, Market Volatility, and Operational Efficiency. Malays. J. Bus. Econ. Manag. 2024, 3, 150–167. [Google Scholar] [CrossRef]
- Zhang, X.; Li, P.; Han, X.; Yang, Y.; Cui, Y. Enhancing Time Series Product Demand Forecasting with Hybrid Attention-Based Deep Learning Models. Access 2024, 12, 190079–190091. [Google Scholar] [CrossRef]
- Suhartanto, J.F.; García-Flores, R.; Schutt, A. An Integrated Framework for Reactive Production Scheduling and Inventory Management. In Sustainable Design and Manufacturing; Springer: Singapore, 2021; Volume 262, pp. 327–338. [Google Scholar]
- Silver, E.A.; Pyke, D.F.; Thomas, D. Inventory and Production Management in Supply Chains, 4th ed.; CRC Press: Boca Raton, FL, USA; London, UK; New York, NY, USA, 2017. [Google Scholar]
- Ho, T.; Nguyen, S.; Nguyen, H.; Nguyen, N.; Man, D.; Le, T. An Extended RFM Model for Customer Behaviour and Demographic Analysis in Retail Industry. Bus. Syst. Res. 2023, 14, 26–53. [Google Scholar] [CrossRef]
- Heldt, R.; Silveira, C.S.; Luce, F.B. Predicting customer value per product: From RFM to RFM/P. J. Bus. Res. 2021, 127, 444–453. [Google Scholar] [CrossRef]
- Schober, P.; Boer, C.; Schwarte, L.A. Correlation Coefficients: Appropriate Use and Interpretation. Anesth. Analg. 2018, 126, 1763–1768. [Google Scholar] [CrossRef]
- Box, G.E.P.; Jenkins, G.M.; Reinsel, G.C.; Ljung, G.M. Time Series Analysis, 5th ed.; Wiley: Hoboken, NJ, USA, 2016. [Google Scholar]
- Li, Y.; Chen, T. ISCCO: A deep learning feature extraction-based strategy framework for dynamic minimization of supply chain transportation cost losses. PeerJ. Comput. Sci. 2024, 10, e2537. [Google Scholar] [CrossRef] [PubMed]
- Chen, T.; Guestrin, C. XGBoost; ACM: New York, NY, USA, 2016; pp. 785–794. [Google Scholar]
- Chang, R.; Liu, X.; Deng, W. Prediction of corporate default risk considering ESG performance and unbalanced samples. Appl. Soft Comput. 2025, 171, 112864. [Google Scholar] [CrossRef]
- Seyedan, M.; Mafakheri, F.; Wang, C. Order-up-to-level inventory optimization model using time-series demand forecasting with ensemble deep learning. Supply Chain Anal. 2023, 3, 100024. [Google Scholar] [CrossRef]
- Bhattacharya, N.G.; Zavar, G. Dynamic Relationship Between Stock Market Returns and Trading Volume: Evidence from Indian Stock Market. J. Glob. Econ. 2016, 12, 123–136. [Google Scholar] [CrossRef]
- Sarkar, M.; De Bruyn, A. LSTM Response Models for Direct Marketing Analytics: Replacing Feature Engineering with Deep Learning. J. Interact. Mark. 2021, 53, 80–95. [Google Scholar] [CrossRef]
- Xu, J.; Pero, M.; Fabbri, M. Unfolding the link between big data analytics and supply chain planning. Technol. Forecast. Soc. Change 2023, 196, 122805. [Google Scholar] [CrossRef]
- Khan, Y.; Su’ud, M.B.M.; Alam, M.M.; Ahmad, S.F.; Ahmad (Ayassrah), A.Y.A.B.; Khan, N. Application of Internet of Things (IoT) in Sustainable Supply Chain Management. Sustainability 2023, 15, 694. [Google Scholar] [CrossRef]
- Sieke, M. Foundations of Inventory Management. In Supply Chain Contract Management; Springer Fachmedien Wiesbaden GmbH: Cologne, Germany, 2019; pp. 9–36. [Google Scholar]
- Hyndman, R.J.; Athanasopoulos, G. Forecasting, 3rd ed.; Otexts, Online Open-Access Textbooks: Melbourne, Australia, 2021. [Google Scholar]
- Oliveira, G.X.C. Enhancing Customer Churn Prediction: Addressing Disparities and Imbalance in Machine Learning Models. 2023. Available online: https://www.proquest.com/docview/3122647352 (accessed on 26 March 2025).
- Sangeetha, G.; Harshavardhan, S.V.; Allen Joshua, L.; Anirudh, S.M. Predictive Analysis for Supply Chain Management Using Extreme Gradient Boost Classifier. Int. J. Adv. Res. Sci. Commun. Technol. 2024, 4, 531–534. [Google Scholar] [CrossRef]
- Zhang, J.; Yuyang, W.; Zidu, W. Enhancing Supply Chain Forecasting with Machine Learning: A Data-Driven Approach to Demand Prediction, Risk Management, and Demand-Supply Optimization. J. Fintech Bus. Anal. 2024, 2, 1–5. [Google Scholar] [CrossRef]
- Constante, F. DataCo Smart Supply Chain for Big Data Analysis. 2019. Available online: https://search.datacite.org/works/10.17632/8gx2fvg2k6.3 (accessed on 26 March 2025).
- Mathotaarachchi, K.V.; Hasan, R.; Mahmood, S. Advanced Machine Learning Techniques for Predictive Modeling of Property Prices. Information 2024, 15, 295. [Google Scholar] [CrossRef]
- Hariyani, D.; Hariyani, P.; Mishra, S.; Sharma, M.K. A literature review on green supply chain management for sustainable sourcing and distribution. Waste Manag. Bull. 2024, 2, 231–248. [Google Scholar] [CrossRef]
- Barbosa, M.W.; Vicente, A.d.l.C.; Ladeira, M.B.; Oliveira, M.P.V.d. Managing supply chain resources with Big Data Analytics: A systematic review. Int. J. Logist. 2018, 21, 177–200. [Google Scholar] [CrossRef]
- Wang, Z. Data-Driven Supply Chain Performance Optimization Through Predictive Analytics and Machine Learning. Appl. Comput. Eng. 2024, 118, 30–35. [Google Scholar] [CrossRef]
- Roshan, H.; Afsharinezhad, M. The new approach in market segmentation by using RFM model. J. Appl. Res. Ind. Eng. 2017, 4, 259–267. [Google Scholar] [CrossRef]
- Jia, W.; Sun, M.; Lian, J.; Hou, S. Feature dimensionality reduction: A review. Complex Intell. Syst. 2022, 8, 2663–2693. [Google Scholar] [CrossRef]
- Pasupuleti, V.; Thuraka, B.; Kodete, C.S.; Malisetty, S. Enhancing Supply Chain Agility and Sustainability through Machine Learning: Optimization Techniques for Logistics and Inventory Management. Logistics 2024, 8, 73. [Google Scholar] [CrossRef]
- Wang, H.; Sua, L.S.; Alidaee, B. Enhancing supply chain security with automated machine learning. arXiv 2024, arXiv:2406.13166. [Google Scholar] [CrossRef]
- Bassiouni, M.M.; Chakrabortty, R.K.; Sallam, K.M.; Hussain, O.K. Deep learning approaches to identify order status in a complex supply chain. Expert Syst. Appl. 2024, 250, 123947. [Google Scholar] [CrossRef]
- Rahman Mahin, M.P.; Shahriar, M.; Das, R.R.; Roy, A.; Reza, A.W. Enhancing Sustainable Supply Chain Forecasting Using Machine Learning for Sales Prediction. Procedia Comput. Sci. 2025, 252, 470–479. [Google Scholar] [CrossRef]
- Seyedan, M.; Mafakheri, F. Predictive big data analytics for supply chain demand forecasting: Methods, applications, and research opportunities. J. Big Data 2020, 7, 53. [Google Scholar] [CrossRef]
- Mulay, S.; Madgule, M.; Dhotre, K.; Bhosale, D.; Pingale, A. Prediction of economic order quantity using a modified analytical approach. Aust. J. Multi-Discip. Eng. 2025, 1–8. [Google Scholar] [CrossRef]
- Mahmood, S.; Hasan, R.; Hussain, S.; Adhikari, R. An Interpretable and Generalizable Machine Learning Model for Predicting Asthma Outcomes: Integrating AutoML and Explainable AI Techniques. World 2025, 6, 15. [Google Scholar] [CrossRef]
- Carter, C.R.; Rogers, D.S. A framework of sustainable supply chain management: Moving toward new theory. Int. J. Phys. Distrib. Logist. Manag. 2008, 38, 360–387. [Google Scholar] [CrossRef]
- Jiang, X. Predicting Corporate ESG Scores Using Machine Learning: A Comparative Study. Adv. Econ. Manag. Political Sci. 2024, 118, 141–147. [Google Scholar] [CrossRef]
Feature | Mean | Std. Dev | Skewness | Kurtosis |
---|---|---|---|---|
Sales per customer | 310.5 | 15.3 | 0.12 | 2.8 |
Days for shipping (real) | 3.5 | 1.2 | 0.05 | 3.0 |
Late delivery risk | 0.2 | 0.4 | 0.10 | 3.1 |
Order item total | 150.0 | 20.0 | 0.08 | 3.2 |
Segment | Percentage (%) |
---|---|
Recent Customers | 33.2 |
Promising | 16.9 |
Cannot Lose Them | 12.0 |
At-Risk | 11.4 |
Customers Needing Attention | 11.0 |
Loyal Customers | 10.5 |
Lost | 4.4 |
Champions | 0.6 |
Customer | Customer_Segmentation | Predicted Churn | R_Value | F_Value | M_Value | R_Score | F_Score | M_Score |
---|---|---|---|---|---|---|---|---|
1 | At-Risk | Yes | 280 | 16 | 8999.66 | 3 | 1 | 1 |
2 | At-Risk | Yes | 163 | 25 | 12,388.54 | 3 | 1 | 1 |
9 | Lost | Yes | 128 | 28 | 12,584.02 | 2 | 1 | 1 |
10 | Lost | Yes | 139 | 20 | 8196.45 | 2 | 1 | 1 |
11 | Loyal Customers | No | 316 | 3 | 2060.58 | 4 | 3 | 3 |
12 | Loyal Customers | No | 690 | 3 | 1869.59 | 4 | 3 | 3 |
15 | Recent Customers | No | 266 | 6 | 1557.02 | 3 | 3 | 3 |
16 | Recent Customers | No | 71 | 1 | 234.43 | 1 | 4 | 4 |
3 | Cannot Lose Them | No | 474 | 16 | 6312.54 | 4 | 1 | 1 |
4 | Cannot Lose Them | No | 177 | 15 | 9124.92 | 3 | 2 | 1 |
Customer_Segmentation | Pre-Avg CLV | Post-Avg CLV | Pre-Churn Rate | Post-Churn Rate | Pre-Count | Post-Count |
---|---|---|---|---|---|---|
At-Risk | 21.762133 | 17.409706 | 0.017286 | 0.006915 | 357 | 142.80 |
Cannot Lose Them | 38.923683 | 35.031315 | 0.000000 | 0.025567 | 528 | 528.00 |
Customers Needing Attention | 68.249190 | 64.836731 | 0.000000 | 0.102896 | 2125 | 2125.00 |
Lost | 28.059523 | 19.641666 | 0.654222 | 0.294400 | 13,511 | 6079.95 |
Promising | 100.334367 | 105.351085 | 0.000000 | 0.051278 | 1059 | 1059.00 |
Recent Customers | 153.917855 | 169.309640 | 0.000000 | 0.148751 | 3072 | 3072.00 |
Model | MAE (Units) | RMSE (Units) | MAPE (%) |
---|---|---|---|
Linear Regression | 0.0006 | 0.0015 | 0.02 |
Lasso | 1.5543 | 2.3331 | 4.76 |
Random Forest | 0.1941 | 2.1655 | 0.60 |
XGBoost | 0.1571 | 0.5333 | 0.48 |
Neural Network | 73.15 | 86.30 | 21.5 |
RNN | 5.52 | 7.84 | 1.62 |
Model | MAE | RMSE | R2 |
---|---|---|---|
Random Forest | 0.0246 | 1.2138 | 0.9999 |
XGBoost | 0.1151 | 0.4680 | 0.99999 |
RNN | 3.6877 | 6.8830 | 0.9973 |
Metric | RNN Optimized | Naive EOQ/ROP | XGBoost Optimized |
---|---|---|---|
Fill Rate (%) | 80.2 | N/A | 85.4 |
Stockout Events | 12 | 0 | 20 |
Total Cost (USD) | 2,045,780 | 907,820 | 2,500,000 |
Cost Reduction vs. Naive (%) | 45.2 | 0 | 0.0 |
RMSE | 126.62 | N/A | 215.49 |
MAE | 110.73 | N/A | 170.90 |
Policy | Total Cost | Holding Cost | Stockout Penalty |
---|---|---|---|
RNN | 50,612.25 | 49,239.91 | 1372.34 |
XGBoost | 81,181.80 | 74,055.03 | 7126.78 |
Naive | 14,119.00 | 4364.00 | 9755.00 |
Model | Accuracy (%) | Recall (%) | Precision (%) | F1-Score (%) |
---|---|---|---|---|
RF | 97.65 | 0.24 | 98.10 | 0.47 |
XGBoost | 99.11 | 67.06 | 91.20 | 78.03 |
RNN | 99.59 | 98.13 | 99.00 | 98.00 |
Model | Accuracy (%) | Recall (%) | Precision (%) | F1-Score (%) |
---|---|---|---|---|
RF | 97.97 | 100.00 | 95.50 | 98.18 |
XGBoost | 98.53 | 99.96 | 97.42 | 98.67 |
RNN | 98.88 | 98.10 | 97.60 | 97.85 |
Task | Model Version | Accuracy (%) | Recall (%) | F1-Score (%) |
---|---|---|---|---|
Fraud Detection | Original RNN | 99.59 | 98.13 | 98.00 |
Fine-Tuned RNN | 97.90 | 99.88 | 98.91 | |
Late Delivery Prediction | Original RNN | 98.88 | 98.10 | 97.85 |
Fine-Tuned RNN | 97.49 | 100.00 | 97.76 |
Model | Training Time (Approx.) | Memory Usage (Approx.) | Scalability | Complexity |
---|---|---|---|---|
XGBoost | 10 min | 2 GB | High | O(n_samples × n_features × n_trees × max_depth) |
Random Forest | 15 min | 2.5 GB | Moderate | O(n_samples × n_features × n_trees × max_depth) |
RNN | 1 h | 10 GB (GPU) | Low | O(n_samples × n_timesteps × n_features × n_units) |
Study | ML Method | Dataset Used | Key Performance Metrics | Superiority of Our Study |
---|---|---|---|---|
Our Study | XGBoost | DataCo’s ERP and logistics databases | Demand Forecasting: MAE = 0.1571, RMSE = 0.5333, MAPE = 0.48% | Holistic framework integration, superior demand forecasting accuracy, and tangible business outcomes. |
RNN | DataCo’s ERP and logistics databases | Late Delivery Prediction: Accuracy = 98.88%, recall = 98.10%, F1-score = 97.85% | Enhanced risk mitigation through superior recall and F1-scores in sequential data tasks. | |
[42] | TCN-1DSPCNN | Complex supply chain system | Late Delivery Prediction: 100% accuracy | Our study demonstrates comparable accuracy with additional business outcome metrics. |
[43] | Voting Regressor | Sales data | Sales Forecasting: R2 = 0.9999, RMSE = 1.54 | Our XGBoost model achieves similar accuracy with lower computational complexity. |
[44] | Ensemble DL Model | Various supply chain datasets | Demand Forecasting: R2 = 0.9999, RMSE = 1.54 | Our study provides a more comprehensive framework beyond demand forecasting. |
[32] | AutoML (XGBoost, LightGBM) | Supply chain security data | Fraud Detection: Accuracy = 99.11%, recall = 67.06% | Our RNN model for fraud detection shows higher recall and F1-score. |
Model | Accuracy | Cost Reduction | Comp. Cost | Op. Complexity | ESG Score | CAE | CAE-ESG |
---|---|---|---|---|---|---|---|
XGBoost | 0.9911 | 0.30 | 0.10 | 0.30 | 0.85 | 0.744 | 1.952 |
RNN | 0.9959 | 0.20 | 0.20 | 0.40 | 0.80 | 0.284 | 0.795 |
Random Forest | 0.9765 | 0.25 | 0.05 | 0.20 | 0.90 | 0.976 | 1.953 |
Model | Accuracy | Cost Reduction | Comp. Cost | Op. Complexity | ESG Score | CAE | CAE-ESG |
---|---|---|---|---|---|---|---|
XGBoost | 0.9853 | 0.35 | 0.10 | 0.20 | 0.87 | 1.151 | 2.510 |
RNN | 0.9888 | 0.22 | 0.20 | 0.30 | 0.82 | 0.434 | 1.100 |
Random Forest | 0.9797 | 0.28 | 0.07 | 0.20 | 0.88 | 1.100 | 2.072 |
Feature | Fraud Detection Model Impact | Late Delivery Model Impact | Correlation Notes |
---|---|---|---|
Late_delivery_risk | Strong negative SHAP (low = non-fraud) | Not in top 5 | Moderate positive correlation with days for shipping (real) (r = 0.40) |
Delivery status | Mixed SHAP impact | Not in top 5 | Weak correlation with other features |
Type | Moderate effect depending on type | Includes Type_TRANSFER, Type_PAYMENT | Uncorrelated (max r = 0.06), strongly independent |
Days for shipping (real) | Slightly increases fraud risk with long times | Not in top 5 | Moderate correlation with Late_delivery_risk (r = 0.40) |
Sales per customer | Weak-to-moderate effect | Not in top 5 | Low correlation with all other features |
Shipping mode | Not in top 5 | Strong positive SHAP (slower = late) | Negatively correlated with Late_delivery_risk (r = −0.40) |
Order_day_of_week | Not in top 5 | Moderate SHAP variation by weekday | Minimal correlation with all other variables |
Shipping_day_of_week | Not in top 5 | Moderate SHAP variation by weekday | No significant correlation |
Type_TRANSFER | Not in top 5 | Strong negative SHAP (on-time predictor) | Isolated binary category |
Type_PAYMENT | Not in top 5 | Slight positive SHAP when low | Also categorical and uncorrelated |