  • Article
  • Open Access

9 April 2025

Designing Predictive Analytics Frameworks for Supply Chain Quality Management: A Machine Learning Approach to Defect Rate Optimization

Department of Electronics Technology, Faculty of Electrical Engineering and Informatics, Budapest University of Technology and Economics, Műegyetem rkp. 3., H-1111 Budapest, Hungary
*
Author to whom correspondence should be addressed.

Abstract

Efficient supply chain management (SCM) is essential for enterprises seeking to enhance operational efficiency, reduce costs, and mitigate risks while ensuring product quality and customer satisfaction. Addressing quality concerns within the supply chain proactively helps minimize rework, recalls, and returns, leading to significant cost savings and improved profitability. This study presents a machine learning (ML)-driven predictive analytics framework designed to forecast defect rates and optimize quality control processes. The research leverages a dataset sourced from a real-world fashion and beauty startup, hosted in a public repository. The framework employs advanced ML algorithms, including extreme gradient boosting (XGBoost), support vector machines (SVMs), and random forests (RFs), to accurately predict defect rates and derive actionable insights for supply chain optimization. Results demonstrate the effectiveness of predictive analytics in improving supply chain quality management, enabling enterprises to proactively reduce defect rates, minimize costs, and optimize return on investment (ROI). The proposed framework is designed to be scalable and transferable, ensuring adaptability across various industries, including fashion, e-commerce, and manufacturing. These findings underscore the economic and operational benefits of integrating machine learning into supply chain quality control, offering a data-driven, proactive approach to achieving high-efficiency, high-quality supply chain operations.

1. Introduction

Supply chain management (SCM) is defined as the coordination and integration of the flow of goods, information, and finances across all entities involved in delivering a product or service to the end customer [1]. Our research adopts a modern approach to SCM, emphasizing the integration of advanced technologies such as machine learning and predictive analytics to optimize operations and enhance decision-making processes. This approach contrasts with the traditional supply chain by focusing on end-to-end visibility, automation, and data-driven strategies to improve efficiency, quality, and customer satisfaction [2]. In today’s competitive and dynamic business environment, effective SCM has therefore emerged as a cornerstone of operational success. Supply chain quality, in turn, is a critical determinant of operational success and customer satisfaction, and it is influenced by several interrelated factors that ensure efficiency and reliability across the supply chain network. Key quality factors include product quality, which ensures adherence to standards and specifications to meet customer expectations, and process efficiency, which focuses on optimizing workflows to minimize delays and operational costs [3]. Additionally, supplier reliability plays a pivotal role in the timely and accurate delivery of raw materials or components, while inventory management ensures an optimal balance of stock levels, avoiding shortages or excess inventory [4]. Customer satisfaction, driven by the ability to meet or exceed customer requirements, and data accuracy, which underpins reliable decision making, are also considered crucial for sustaining performance [5]. For instance, poor supplier reliability can lead to production delays, negatively affecting delivery timelines and eroding customer trust [2]. Together, these factors underscore the complexity of achieving consistent quality across the supply chain, particularly in dynamic and competitive markets.
Addressing these challenges requires advanced tools, such as machine learning (ML), to predict and proactively mitigate quality issues, forming the basis of this research.
To contextualize the research findings, we introduce a comprehensive explanation of supply chain quality metrics. These metrics serve as benchmarks for assessing and optimizing the supply chain’s performance through predictive analytics.

Key Supply Chain Metrics

  • Defect Rate (DR): the percentage of defective products or services within the supply chain output. High defect rates directly affect customer satisfaction and operational costs. Predictive analytics reduce DR by identifying and addressing root causes proactively [6].
  • Cost Efficiency (CE): the balance between total supply chain expenditure and output quality. Reducing operational inefficiencies through predictive analytics leads to significant cost savings [7].
  • Lead Time (LT): the time taken from order placement to product delivery. Shorter lead times indicate improved supply chain agility and responsiveness to demand fluctuations [8].
  • Return on Investment (ROI): the financial return generated relative to the costs invested in supply chain improvements. ROI serves as a key indicator of the economic feasibility of adopting advanced predictive models [9].
  • Customer Satisfaction (CS): a measure of how well supply chain outputs meet customer expectations. Improvements in defect rates and lead times translate directly into higher satisfaction levels [10].
Each metric is influenced by specific supply chain participants:
  • Suppliers impact lead time and defect rate through raw material quality and delivery consistency [11].
  • Manufacturers affect defect rates and cost efficiency through production processes [12].
  • Distributors and retailers influence customer satisfaction and lead time via distribution efficiency and service quality [5].
However, among the critical metrics in this field, defect rate serves as a pivotal indicator of quality. While manufacturing defects often dominate discussions of quality improvement, defects occurring within the broader supply chain can equally impact business efficiency and profitability [13].
Despite the importance of these factors, ensuring consistent quality across all levels of the supply chain remains a significant challenge [14]. Operational complexities, combined with the dynamic nature of modern supply chains, necessitate advanced tools to predict and mitigate quality issues proactively [15]. Our research addresses these challenges by leveraging machine learning (ML) techniques to optimize quality control processes, focusing on defect rate prediction and the enhancement of supply chain performance.
Reducing defect rates is essential not only for maintaining high-quality standards but also for ensuring customer satisfaction and optimizing return on investment (ROI). Predictive analytics, powered by advanced machine learning (ML) algorithms, present an opportunity to address these challenges by enabling enterprises to anticipate and mitigate defects before they materialize. Such proactive measures can result in substantial cost savings by minimizing rework, scrap, and warranty claims while improving overall operational stability [15].
The integration of machine learning techniques into SCM has garnered increasing attention as industries strive to enhance quality control processes. Traditional methods relying on manual inspection and predefined rules often fall short in addressing the complexities of modern supply chains [16]. Studies such as those by [13] and [15] have underscored the importance of quality control in SCM, with frameworks emphasizing Industry 4.0 technologies for quality enhancement. However, achieving consistent quality across the supply chain remains a formidable challenge due to operational complexities and the diverse factors influencing quality outcomes.
Controversies arise around the best methodologies to implement predictive analytics in SCM. While some studies advocate for deterministic approaches, others highlight the value of hybrid models that combine multiple ML algorithms to improve defect rate prediction accuracy [17]. This study seeks to reconcile these perspectives by proposing an ensemble machine learning framework that leverages advanced techniques like XGBoost, random forest, and support vector machine to optimize predictive accuracy.
Despite the wealth of literature on SCM optimization, existing studies often lack robust methodologies for predicting and mitigating defect rates. Key gaps include the limited application of ensemble ML models and insufficient focus on real-world implementation strategies for quality control. Addressing these gaps, this research aims to establish a comprehensive quality control system that achieves the following:
  • Leverages predictive analytics to reduce defect rates and enhance operational efficiency in SCM.
  • Optimizes resource allocation and strengthens supplier relationships through data-driven insights.
  • Quantifies the impact of quality control improvements on ROI and cost reduction.
The primary aim of this research is to develop and validate a machine learning-based predictive framework for defect rate reduction in supply chain management, contributing to improved operational efficiency. In predictive analytics and SCM, curated datasets such as those hosted on Kaggle are widely used as benchmarks for developing and testing machine learning frameworks. These datasets provide standardized and noise-free data structures, enabling researchers to validate algorithms in a controlled setting before deployment in real-world scenarios [2]. This study employs a Kaggle dataset [18] tailored to supply chain dynamics to evaluate the proposed framework, focusing on demand forecasting and inventory optimization challenges in fashion startups.
While real-world datasets often reflect operational noise and complexity, curated datasets provide a structured and noise-free environment for rigorously validating machine learning frameworks. As highlighted by [19], curated datasets ensure methodological rigor and allow researchers to benchmark frameworks before deploying them in real-world scenarios. The insights gained from this dataset establish a strong foundation for transferring the framework to operational data in modern supply chains.
The objectives are as follows:
  • To explore the use of historical data and machine learning algorithms for defect rate prediction.
  • To design and implement an ensemble model that integrates multiple ML techniques for quality control optimization.
  • To assess the effectiveness of the proposed framework in reducing associated costs and enhancing ROI in SCM operations.
Recent studies have highlighted the transformative potential of ML in SCM. For instance, Ref. [20] proposed a Bayesian-optimized LightGBM model for forecasting backorder risks, demonstrating the utility of advanced analytics in SCM quality control. Similarly, Ref. [21] presented optimization techniques to address cost, time, and risk in SCM. In [22], two methods, namely ant colony optimization and particle swarm optimization, were presented for identifying the optimal path in supply chain management at the lowest possible cost, while in [23], a systematic literature review was performed to clarify how AI contributes to SCM. Supply chain optimization aims to improve the efficiency and efficacy of supply chain operations. With the goal of finding the right balance between supply chain costs and customer service requirements, many strategies and models have been developed, including deterministic and scenario-based approaches [24]. Computing advances and large language models (LLMs) have allowed for the automation and optimization of supply chain processes. In [25], the authors suggested a decision tree regression model to optimize and classify massive volumes of data between customers and businesses, followed by the categorization of requests and purchase update details between employees and the purchasing team. Ref. [26] claimed that new research possibilities in retail survival, e-commerce, and competitiveness arise from rethinking supply chain management to deal with disruptive scenarios including pandemics, war, climate change, and biodiversity loss. The importance of ML-based predictive analysis in supply chain modules and enterprise resource planning was highlighted by the authors of the thorough review [27], which concludes that, in a competitive market climate, guaranteeing consistent supply chain quality is critical to maintaining customer satisfaction and generating business profitability.
However, reaching this aim presents several challenges, ranging from identifying the essential elements that affect quality to adopting effective control mechanisms. By addressing the challenges identified in the previous literature, this study contributes a novel approach to SCM quality control, bridging the gap between theoretical advancements and practical applications. The findings are expected to provide actionable recommendations for SCM managers, fostering continuous improvement and long-term success. This study is guided by the following research questions:
  • How does the integration of machine learning algorithms, such as random forest, support vector machine, and XGBoost, contribute to reducing defect rates in supply chain management?
  • What is the quantifiable impact of machine learning-driven optimization on key performance indicators like ROI and cost reduction within SCM?
The rest of the paper is structured as follows. Section 2 presents related work. Section 3 presents materials and methods, which outline the key components of the research methodology, including data collection, preprocessing, feature selection, model development, and quality control implementation. Section 4 presents the results and potential benefits of the proposed strategy. Section 5 discusses the implications of these findings. Finally, Section 6 concludes the paper with a summary of the research findings and recommendations for future work.

3. Materials and Methods

This study introduces a robust framework for defect rate prediction and its impact on quality control within supply chain management (SCM). The methods focus on integrating machine learning (ML) techniques such as random forest (RF), support vector machine (SVM), and ensemble learning. The process is detailed below to ensure reproducibility.

3.1. Dataset Selection and Characteristics

The dataset utilized in this study originates from a real-world fashion and beauty startup’s supply chain [18] and is hosted on Kaggle [47], a well-recognized repository for curated datasets in machine learning research. This dataset retains operational relevance by capturing essential supply chain dynamics, including supplier reliability, defect rates, inventory fluctuations, and logistics performance. These features make it highly suitable for evaluating predictive analytics frameworks in supply chain management, particularly in the areas of defect rate forecasting, quality management, and cost optimization.
Furthermore, the dataset utilized in this study includes detailed records on production processes, defect rates, supply chain performance metrics, and associated financial outcomes, involving features such as the following:
  • Production Stages: Detailed tracking of different stages in the manufacturing process.
  • Defect Types and Frequencies: Data on the occurrence of various defect categories.
  • Supply Chain Metrics: Lead times, order fulfillment rates, and delivery delays.
  • Cost Analysis: Financial records linking defect rates to associated costs and ROI.
The integration of this dataset into the proposed predictive analytics framework enables the following:
  • Utilizing historical defect occurrences to anticipate future defects, reducing rework costs and improving quality control.
  • Linking defect rates to financial outcomes, ensuring that quality management interventions yield a positive ROI.
  • Applying predictive models to supply chain disruptions, enabling data-driven risk mitigation.
By leveraging supply chain features, the proposed framework provides insights into predictive analytic-driven quality management, ensuring practical applicability in supply chain environments. Consequently, this dataset serves as a highly effective testbed for machine learning applications in supply chain analytics due to the following:
  • Capturing essential supply chain relationships between defect rates and logistics efficiency, making it relevant for other supply chain applications.
  • Facilitating predictive analysis of key performance metrics, allowing businesses to simulate and refine defect reduction and quality improvement strategies.
The dataset serves as a robust foundation for predictive analytics in supply chain management, providing relevant features for defect rate prediction, cost optimization, and operational efficiency analysis. While curated, it incorporates practical supply chain challenges, making it an effective testbed for machine learning model validation. The proposed framework demonstrates strong predictive capabilities, and future research will focus on expanding dataset diversity in order to ensure greater industry-wide impact.

3.2. Data Preprocessing

Data preprocessing ensured data integrity and suitability for ML modeling:
  • Data Cleaning: Identified and rectified missing values using imputation techniques and normalized feature scales to maintain uniformity. Python libraries such as pandas and scikit-learn were employed.
  • Feature Normalization: Min-max scaling was applied to normalize numerical features to a [0, 1] range for improved model convergence.
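The two preprocessing steps above can be sketched with pandas and scikit-learn. This is a minimal illustration, not the study's actual pipeline: the column names and values are hypothetical, and mean imputation is an assumption, since the text does not state which imputation technique was used.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy records; column names are illustrative, not the dataset's schema.
df = pd.DataFrame({
    "lead_time": [7.0, 30.0, np.nan, 12.0],
    "price": [69.8, 14.8, 11.3, np.nan],
    "stock_level": [58.0, 53.0, 1.0, 23.0],
})


def preprocess(frame: pd.DataFrame) -> pd.DataFrame:
    """Fill missing values with the column mean, then min-max scale to [0, 1]."""
    # Assumption: mean imputation; the source only says "imputation techniques".
    imputed = SimpleImputer(strategy="mean").fit_transform(frame)
    scaled = MinMaxScaler(feature_range=(0, 1)).fit_transform(imputed)
    return pd.DataFrame(scaled, columns=frame.columns, index=frame.index)


clean = preprocess(df)
print(clean.round(3))
```

In a train/test setting, the imputer and scaler would normally be fitted on the training split only and then applied to the held-out data, so that no information leaks from the test set.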

Feature Selection

Feature selection focused on identifying variables that significantly influence defect rate predictions.
  • Algorithms Compared: Random forest and linear regression were tested, as illustrated in Figure 1.
  • Performance Metric: Mean squared error (MSE) was used to evaluate feature selection effectiveness. Random forest outperformed linear regression, achieving an MSE of 2.4028 compared to 2.6781.
  • Key Features Identified: Features such as lead time, price, number of products sold, and stock levels were deemed most impactful.
Figure 1. Comparative analysis of feature significance in linear regression and random forest models.
Random forest calculates feature importance by measuring the decrease in impurity during tree splits. Features with higher importance scores were prioritized for subsequent modeling.
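The comparison described above can be sketched as follows. The data are synthetic and the feature names are hypothetical stand-ins for the variables named in the text; using absolute linear regression coefficients as a rough significance score is an assumption made for illustration, so the numbers will not match the reported MSE values.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Hypothetical feature names; "noise" is deliberately unrelated to the target.
feature_names = ["lead_time", "price", "products_sold", "stock_level", "noise"]
X = rng.uniform(0, 1, size=(300, len(feature_names)))
# Synthetic defect-rate-like target built from the first four columns only.
y = 2.0 * X[:, 0] + 1.5 * X[:, 1] ** 2 + X[:, 2] * X[:, 3] + rng.normal(0, 0.1, 300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)
lr = LinearRegression().fit(X_train, y_train)

# Held-out MSE for each candidate model.
mse_rf = mean_squared_error(y_test, rf.predict(X_test))
mse_lr = mean_squared_error(y_test, lr.predict(X_test))

# Impurity-based importances from the forest (they sum to 1).
rf_importance = dict(zip(feature_names, rf.feature_importances_))
# Rough linear counterpart: absolute coefficients (features share the same scale).
lr_importance = dict(zip(feature_names, np.abs(lr.coef_)))

# Features ordered from most to least important according to the forest.
ranked = sorted(rf_importance, key=rf_importance.get, reverse=True)
print("MSE (RF):", round(mse_rf, 4), "MSE (LR):", round(mse_lr, 4))
print("RF ranking:", ranked)
```

Impurity-based importances can favor high-cardinality features; permutation importance is a common cross-check when the ranking drives later modeling decisions.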

3.3. Prediction Models

Two prediction models were developed and evaluated:
  • Combined Model: Merged RF and SVM predictions using the simple averaging technique.
  • Ensemble Model: Integrated RF, SVM, and XGBoost predictions using the VotingRegressor technique, which calculates weighted averages of individual model outputs to enhance accuracy.

3.3.1. Algorithm Design

A novel algorithm, combined_Predictions, was introduced to merge RF and SVM outputs. The mathematical reasoning behind combining predictions from SVM and random forest using simple averaging is as follows:
SVM Prediction:
Support vector machine (SVM) is a supervised learning model used for classification and regression tasks. It finds the optimal hyperplane that separates the data points into different classes (in the case of classification) or approximates the relationship between input and output variables (in the case of regression).
  • The decision function of an SVM can be represented as follows:
$$f(x) = \operatorname{sign}(w \cdot x + b)$$
where
  • $f(x)$ is the decision function.
  • $w$ is the weight vector.
  • $x$ is the input feature vector.
  • $b$ is the bias term.
  • $\operatorname{sign}$ is the sign function that determines the class label based on the sign of the expression.
Let $y_i^{\mathrm{SVM}}(x)$ represent the prediction made by the support vector machine (SVM) model for a given input feature vector $x$.
Random Forest Prediction:
Random forest is an ensemble learning method that operates by constructing a multitude of decision trees during training and outputs the mode of the classes (for classification) or the mean prediction (for regression) of the individual trees. Mathematically, the prediction of a random forest model can be represented as follows:
$$y = \operatorname{mode}\{T_1(x), T_2(x), \ldots, T_n(x)\}$$
where
  • $y$ is the predicted class label.
  • $T_i(x)$ is the prediction of the $i$th decision tree.
  • $x$ represents the input features.
  • $\operatorname{mode}$ is the most frequent prediction among all $n$ trees.
Let $y_i^{\mathrm{RF}}(x)$ represent the prediction made by the random forest model for the same input features $x$.
For a dataset of $n$ samples, the combined prediction for each sample $i$ ($i = 1, \ldots, n$) is the simple average of the two model outputs:
$$y_i^{\mathrm{Combined}} = \frac{y_i^{\mathrm{SVM}} + y_i^{\mathrm{RF}}}{2}$$
where $y_i^{\mathrm{SVM}}$ and $y_i^{\mathrm{RF}}$ are the predictions from SVM and random forest for the $i$th sample, respectively.
Algorithm 1 illustrates the algorithmic pseudocode for the combining prediction procedure. This approach leveraged the strengths of both models, reducing prediction errors.
Algorithm 1: Pseudocode for the combined model
Description: Combine the predictions from SVM and RF
Input: SVM_predictions, RF_predictions
Output: Combined_predictions
Procedure:
    Function combine_predictions(Array SVM_predictions, Array RF_predictions)
        num_samples ← Length(SVM_predictions)
        Combined_predictions ← [ ]
        for i = 0 to num_samples − 1:
            Combined_predictions[i] ← (SVM_predictions[i] + RF_predictions[i]) / 2
        Return Combined_predictions
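A runnable counterpart to Algorithm 1 might look like the sketch below. It assumes the two prediction vectors are already available as equal-length numeric sequences; the input values are made up for illustration.

```python
import numpy as np


def combine_predictions(svm_predictions, rf_predictions):
    """Return the element-wise simple average of SVM and RF predictions."""
    svm = np.asarray(svm_predictions, dtype=float)
    rf = np.asarray(rf_predictions, dtype=float)
    if svm.shape != rf.shape:
        raise ValueError("prediction arrays must have the same shape")
    # Vectorized form of the per-sample loop in Algorithm 1.
    return (svm + rf) / 2.0


# Illustrative values only; element-wise means are 2.0, 2.0, and 4.5.
combined = combine_predictions([1.0, 2.0, 4.0], [3.0, 2.0, 5.0])
print(combined)
```

The vectorized expression replaces the explicit loop over samples without changing the result, and the shape check guards against silently averaging misaligned prediction vectors.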

3.3.2. Ensemble Learning

The VotingRegressor ensemble method combines predictions from three machine learning models: random forest (RF), support vector machine (SVM), and XGBoost (extreme gradient boosting). This ensemble approach generates a final prediction by integrating the individual predictions from these models, leveraging their combined strengths to improve overall predictive performance.
XGBoost Overview:
XGBoost is a highly efficient and scalable implementation of gradient boosting that builds an ensemble of weak learners (typically decision trees). It operates sequentially, with each tree focusing on reducing the errors made by the ensemble in previous iterations. XGBoost optimizes an objective function that incorporates both a loss function and a regularization term to achieve a balance between predictive accuracy and model complexity.
The objective function minimized by XGBoost is defined as follows:
$$\text{Objective function} = \sum_{i=1}^{n} L\left(y_i, \hat{y}_i^{(t)}\right) + \sum_{k=1}^{T} \Omega(f_k)$$
where
  • $n$ is the number of samples, and $i$ indexes each individual sample in the dataset (from 1 to $n$).
  • $y_i$ is the actual target value (ground truth) for the $i$th sample.
  • $\hat{y}_i^{(t)}$ is the prediction made by the XGBoost model at iteration $t$ for the $i$th sample.
  • $L\left(y_i, \hat{y}_i^{(t)}\right)$ is the loss function that measures the difference between the predicted value $\hat{y}_i^{(t)}$ and the actual value $y_i$.
  • $T$ is the total number of trees in the ensemble, and $k$ indexes each individual tree (from 1 to $T$).
  • $f_k$ represents the $k$th tree in the ensemble.
  • $\Omega(f_k)$ is a regularization term applied to the $k$th tree that penalizes model complexity to prevent overfitting.
The objective function represents the sum of the losses incurred for each sample across all boosting rounds, plus the regularization penalties for each tree in the ensemble. The goal of the XGBoost algorithm is to minimize this objective function by adjusting the parameters of the individual trees and the ensemble as a whole.
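To make the two parts of the objective concrete, the sketch below evaluates it for a tiny hand-made example. The text does not specify the loss or the penalty, so two assumptions are made: squared error as the loss, and the tree penalty in the form given in the XGBoost documentation (a per-leaf cost plus an L2 penalty on leaf weights). The function names, leaf weights, and hyperparameter values are hypothetical.

```python
import numpy as np


def tree_penalty(leaf_weights, gamma=1.0, lam=1.0):
    """Assumed penalty: gamma * (number of leaves) + 0.5 * lam * sum(w_j ** 2)."""
    w = np.asarray(leaf_weights, dtype=float)
    return gamma * w.size + 0.5 * lam * float(np.sum(w ** 2))


def regularized_objective(y_true, y_pred, trees, gamma=1.0, lam=1.0):
    """Sum of per-sample squared-error losses plus the penalty of every tree."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    loss = float(np.sum((y_true - y_pred) ** 2))  # assumed loss: squared error
    penalty = sum(tree_penalty(w, gamma, lam) for w in trees)
    return loss + penalty


# Three samples and two trees, each tree described only by its leaf weights.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.5, 2.0, 2.0]
trees = [[0.5, -0.5], [1.0, 0.0, -1.0]]

# Loss = 0.25 + 0 + 1 = 1.25; penalties = 2.25 + 4.0 = 6.25; total = 7.5.
objective = regularized_objective(y_true, y_pred, trees)
print(objective)
```

Raising `gamma` or `lam` increases the cost of adding leaves or large leaf weights, which is how the regularization term trades predictive accuracy against model complexity.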
Ensemble Formula:
The ensemble approach employed integrates predictions from multiple machine learning models: random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGBoost). Algorithm 2 illustrates the pseudocode for the ensemble approach. Further, through a majority voting approach, given an input feature vector $x_i$, each model predicts a value $\hat{y}_i = f_m(x_i)$, where $f_m$ is the function representing model $m$, and $m \in \{\mathrm{RF}, \mathrm{SVM}, \mathrm{XGBoost}\}$. The VotingRegressor then generates $\hat{y}(x_i)$, derived as the mode, or the most frequently occurring value, among these individual predictions:
$$\hat{y}(x_i) = \operatorname{mode}\left\{y^{\mathrm{SVM}}(x_i),\ y^{\mathrm{RF}}(x_i),\ y^{\mathrm{Xgb}}(x_i)\right\}$$
where $y^{\mathrm{SVM}}(x_i)$, $y^{\mathrm{RF}}(x_i)$, and $y^{\mathrm{Xgb}}(x_i)$ represent the individual predictions from the support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost) models, respectively.
Algorithm 2: Pseudocode for the ensemble model
Description: Pseudocode for Ensemble Prediction Using Majority Voting
Input: SVM_predictions, RF_predictions, XGBoost_predictions
Output: y′: Final prediction vector using majority voting
Procedure:
    y′ ← [ ]
    for i = 0 to num_samples − 1:
        predictions ← [SVM_predictions[i], RF_predictions[i], XGBoost_predictions[i]]
        y_final ← mode(predictions)
        y′[i] ← y_final
    Return y′
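The ensemble step can be sketched with scikit-learn's `VotingRegressor`. Two points are worth flagging. First, this class combines its members by (optionally weighted) averaging of their predictions rather than by taking a mode, so the sketch follows the averaging behavior. Second, to keep the example dependent on scikit-learn only, `GradientBoostingRegressor` is swapped in for XGBoost; it is a stand-in, not the model used in the study. The data are synthetic and all hyperparameters are illustrative.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    VotingRegressor,
)
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic regression data standing in for the supply chain features.
X, y = make_regression(n_samples=300, n_features=6, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

rf = RandomForestRegressor(n_estimators=100, random_state=0)
svm = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
gb = GradientBoostingRegressor(random_state=0)  # stand-in for XGBoost

# Default weights are uniform, so the output is the plain mean of the members.
ensemble = VotingRegressor(estimators=[("rf", rf), ("svm", svm), ("gb", gb)])
ensemble.fit(X_train, y_train)

ensemble_pred = ensemble.predict(X_test)
# Predictions of each fitted member, one column per model.
member_preds = np.column_stack(
    [est.predict(X_test) for est in ensemble.estimators_]
)
ensemble_mse = mean_squared_error(y_test, ensemble_pred)
print("Ensemble MSE:", round(ensemble_mse, 3))
```

Passing a `weights` list to `VotingRegressor` turns the plain mean into a weighted average, which is useful when one member is known to be more reliable than the others.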

3.3.3. Validation Technique

  • K-Fold Cross-Validation:
  • Process: The dataset was split into k = 10 folds. Models were trained on k − 1 folds and validated on the remaining fold.
  • Performance Metrics: MSE values were averaged across folds to evaluate generalization.
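The validation procedure can be sketched with scikit-learn's `KFold` and `cross_val_score`. The data are synthetic, a random forest stands in for whichever model is being validated, and shuffling with a fixed seed is an assumption not stated in the text.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score

# Synthetic data; any regressor from the framework could be validated this way.
X, y = make_regression(n_samples=200, n_features=5, noise=5.0, random_state=0)
model = RandomForestRegressor(n_estimators=50, random_state=0)

# k = 10 folds: train on 9 folds, validate on the remaining one, ten times.
cv = KFold(n_splits=10, shuffle=True, random_state=0)

# scikit-learn reports MSE as a negative score so that higher is always better.
neg_mse = cross_val_score(model, X, y, cv=cv, scoring="neg_mean_squared_error")
fold_mse = -neg_mse
mean_mse = fold_mse.mean()

print("Per-fold MSE:", np.round(fold_mse, 2))
print("Mean MSE across folds:", round(mean_mse, 2))
```

Reporting the spread of `fold_mse` alongside its mean gives a sense of how stable the model is across different splits.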

3.4. ROI Framework for Quality Control Optimization

  • Cost Impact Assessment: Using defect rate predictions, the financial impact of defects was quantified under scenarios with and without predictive models.
  • ROI Calculation: The return on investment (ROI) was computed as follows:
$$\mathrm{ROI} = \frac{\text{Net benefits}}{\text{Costs}} \times 100\%$$
where net benefits include reductions in defect-associated costs and operational efficiency improvements.
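The ROI formula can be expressed as a small helper. This is a sketch under stated assumptions: net benefits are taken as the difference in defect-associated costs with and without the predictive model, and the denominator is treated as the cost of the investment. The function names and all figures below are hypothetical.

```python
def net_benefit(cost_without_model: float, cost_with_model: float) -> float:
    """Reduction in defect-associated costs attributed to the predictive model."""
    return cost_without_model - cost_with_model


def roi_percent(net_benefits: float, costs: float) -> float:
    """ROI = net benefits / costs * 100, expressed in percent."""
    if costs <= 0:
        raise ValueError("costs must be positive")
    return net_benefits / costs * 100.0


# Hypothetical figures, chosen only to illustrate the arithmetic.
benefit = net_benefit(cost_without_model=80.0, cost_with_model=30.0)  # 50.0
roi = roi_percent(benefit, costs=40.0)  # 50 / 40 * 100 = 125.0
print(f"Net benefit: {benefit:.2f}, ROI: {roi:.2f}%")
```

An ROI above 100% in this toy case means the net benefit exceeds the amount invested; the result depends directly on which costs are placed in the denominator.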

4. Results

4.1. Predictive Model Performance

The evaluation of the predictive models demonstrated the ensemble model’s superiority over the combined model in both accuracy and error minimization:
  • Mean Squared Error (MSE): The ensemble model exhibited a significantly lower MSE compared to the combined model, as shown in Figure 2.
    Figure 2. Evaluating machine learning models combined vs. ensemble.
  • Accuracy: Both models achieved high accuracy; however, the ensemble model was slightly more precise, affirming its reliability for operational quality predictions.
  • Error Distribution: The histograms in Figure 3 show the distribution of prediction errors for the combined and ensemble models, allowing their error ranges to be compared.
    Figure 3. Comparative analysis of MSE in combined vs. ensemble prediction models.
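The kind of comparison summarized above can be reproduced in outline as follows. The targets and predictions are synthetic: two hypothetical models are simulated by adding noise of different spread to the true values, so the output illustrates the procedure only and says nothing about the study's actual results.

```python
import numpy as np


def mse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean squared error between targets and predictions."""
    return float(np.mean((y_true - y_pred) ** 2))


rng = np.random.default_rng(42)
y_true = rng.uniform(0, 5, size=500)

# Hypothetical predictions: model B is simulated with less noise than model A.
pred_a = y_true + rng.normal(0, 1.0, size=500)
pred_b = y_true + rng.normal(0, 0.5, size=500)

# Prediction errors (residuals) for each model.
errors_a = pred_a - y_true
errors_b = pred_b - y_true

# Shared bins make the two error histograms directly comparable.
bins = np.linspace(-4, 4, 17)
hist_a, _ = np.histogram(errors_a, bins=bins)
hist_b, _ = np.histogram(errors_b, bins=bins)

print("MSE A:", round(mse(y_true, pred_a), 3), "MSE B:", round(mse(y_true, pred_b), 3))
print("Error std A:", round(errors_a.std(), 3), "Error std B:", round(errors_b.std(), 3))
```

A narrower histogram (smaller error standard deviation) goes hand in hand with a lower MSE when the errors are centered near zero, which is the pattern a better model would be expected to show.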

4.2. Quality Control’s Impact on ROI

The comparison between the two proposed configurations, namely the combined and the ensemble models, led to the selection of the ensemble model, which achieves the lower MSE and better accuracy. These results emphasize that quality control extends beyond defect minimization to encompass broader operational improvements:
  • Cost Reduction: Historical data analysis revealed inefficiencies and bottlenecks, enabling targeted interventions. Without predictive analytics, total defect-associated costs stood at 19.55. These were reduced to 2.89 when predictive models were applied, achieving a net benefit of 16.66.
  • ROI Analysis:
    Ensemble Model ROI: Achieved 82.21%, reflecting its efficiency in reducing defect rates while optimizing operational costs.
Figure 4 visually depicts these results, showcasing significant cost reductions with predictive models, highlighting their financial benefits.
Figure 4. Enhancing ROI and cost reductions in SCM through ensemble predictive models.
Ultimately, our results demonstrate that the proposed ensemble machine learning framework integrating XGBoost, random forest, and support vector machine (SVM) achieves high accuracy in defect rate prediction, outperforming traditional methods.

4.3. Interpretation for Business Applications

These findings have direct implications for businesses, particularly those operating in industries like fashion and e-commerce:
  • Cost Savings: By accurately predicting defect rates, businesses can minimize rework, warranty claims, and returns. For instance, a fashion retailer could reduce inventory waste caused by defective or slow-moving stock.
  • Enhanced Decision Making: Predictive insights generated by the framework empower supply chain managers to allocate resources more efficiently, such as prioritizing quality checks for high-risk products or vendors.

4.4. Actionable Insights for Managers

The results provide the following actionable recommendations for SCM managers:
  • Proactive Quality Monitoring: Implementing the framework allows managers to monitor defect trends and intervene before issues escalate, ensuring better product quality and customer satisfaction.
  • Supplier Optimization: Insights from the framework help identify underperforming suppliers, enabling businesses to renegotiate contracts, improve supplier relationships, or source from alternative vendors.
  • ROI-Driven Investments: The demonstrated cost savings and defect reduction provide a clear case for investing in AI-driven quality control systems. Managers can use these insights to justify budget allocations for machine learning tools and training.

4.5. Long-Term Implications for Business Strategy

The study highlights the scalability of the framework for real-world applications, ensuring that businesses can extend its use across diverse supply chain operations. The following long-term benefits are expected:
  • Competitive Advantage: By ensuring consistent product quality and reducing defect-related costs, businesses can improve customer loyalty and enhance their market position.
  • Sustainability: Reducing defects and waste contributes to more sustainable supply chain practices, aligning with global trends toward environmental responsibility.
  • Agility: The framework equips businesses with the predictive tools needed to adapt to market fluctuations and unforeseen disruptions, such as supply chain shocks or changes in consumer demand.
The results of our study provide a robust foundation for businesses to adopt machine learning-driven quality control systems, enabling them to achieve tangible operational, financial, and strategic benefits.
Hence, predictive models not only reduce defect rates but also impact cost efficiency and ROI, which are critical in the supply chain. The ensemble model’s higher accuracy and reduced error rates were shown to optimize operations across multiple supply chain participants. These enhancements demonstrate how the proposed framework improves supply chain quality comprehensively, addressing impacts like defect minimization, cost reduction, and process optimization.
Consequently, our study highlights the transformative role of predictive analytics in supply chain quality management (SCQM), emphasizing their capacity to optimize quality control, reduce costs, and improve ROI. By leveraging historical data insights and advanced machine learning models, enterprises can achieve enhanced operational efficiency, greater customer satisfaction, and sustainable profitability. These findings contribute valuable evidence for businesses aiming to integrate SCQM practices and predictive analytics into their supply chain strategies.

5. Discussion

Our research highlights the transformative potential of predictive analytics in mitigating defects and optimizing operations within the supply chain. By employing machine learning models such as random forest, support vector machine (SVM), and XGBoost, we developed predictive frameworks capable of accurately forecasting defect rates. The results demonstrate significant reductions in defect-related costs and improvements in operational efficiency. These findings underscore the ability of advanced predictive techniques to proactively identify and address quality issues, thereby fostering more efficient and cost-effective supply chain practices.
The reliability of the proposed ML methods is demonstrated through the overall framework process, which leverages specific supply chain features that play a critical role in determining defect rates and operational efficiency. The results illustrate how these features significantly impact predictive accuracy, reinforcing the model’s effectiveness in supply chain quality management.
Our findings align with and extend previous research, illustrating the broader implications of leveraging machine learning in supply chain management (SCM). Through the integration of ensemble prediction models, we validated the hypothesis that predictive analytics significantly enhance ROI in SCM. The application of feature selection techniques, particularly via random forest, provided critical insights into the factors influencing defect rates. This enhanced our ability to design targeted interventions, further optimizing quality control measures.
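To illustrate how random forest can surface the factors that drive defect rates, the following minimal sketch ranks features by impurity-based importance using scikit-learn. It is not the study's pipeline: the data are synthetic, the column names (lead_time, manufacturing_cost, shipping_cost, noise) are hypothetical, and the relationship between features and defect rate is an assumption made only so the ranking has a known answer.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data; the column names are hypothetical, not the study's schema.
rng = np.random.default_rng(0)
feature_names = ["lead_time", "manufacturing_cost", "shipping_cost", "noise"]
X = rng.uniform(0.0, 1.0, size=(500, len(feature_names)))

# Assumed relationship: defect rate driven mostly by lead_time,
# weakly by manufacturing_cost, and not at all by the other two columns.
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 0.05, size=500)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X, y)

# Impurity-based importances are normalized to sum to 1; a higher score means
# the feature accounted for more of the variance reduction across the trees.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

On real data, a ranking of this kind would indicate which supply chain variables deserve targeted intervention first. Impurity-based scores can favor high-cardinality features, so they are best read as a screening signal rather than a causal statement.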
From a methodological standpoint, our approach demonstrated the advantages of combining individual models into ensemble frameworks to improve predictive accuracy and robustness. Random forest excelled in feature selection, while the combined model improved overall predictability by integrating the strengths of individual models. The resulting cost reductions and quality improvements emphasize the tangible benefits of incorporating predictive analytics into SCM practices.
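The idea of combining individual models into one ensemble can be sketched as follows. This section does not state how the models were combined, so simple prediction averaging (scikit-learn's VotingRegressor) is assumed here. GradientBoostingRegressor is used as a stand-in for XGBoost to keep the example dependency-free, and the data are synthetic.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    VotingRegressor,
)
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for supply chain features and a defect-rate target (hypothetical data).
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(600, 5))
y = 2.0 * X[:, 0] + np.sin(3.0 * X[:, 1]) + rng.normal(0.0, 0.1, size=600)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Base learners; GradientBoostingRegressor stands in for XGBoost (an assumption).
base_models = [
    ("rf", RandomForestRegressor(n_estimators=200, random_state=42)),
    ("svm", make_pipeline(StandardScaler(), SVR(C=1.0))),
    ("gbm", GradientBoostingRegressor(random_state=42)),
]

# VotingRegressor fits a clone of each base learner and averages their predictions.
ensemble = VotingRegressor(estimators=base_models)
ensemble.fit(X_train, y_train)

# Score each fitted base learner and the averaged ensemble on held-out data.
individual_mse = {
    name: mean_squared_error(y_test, fitted.predict(X_test))
    for name, fitted in ensemble.named_estimators_.items()
}
ensemble_mse = mean_squared_error(y_test, ensemble.predict(X_test))

for name, mse in individual_mse.items():
    print(f"{name}: MSE = {mse:.4f}")
print(f"ensemble: MSE = {ensemble_mse:.4f}")
```

Because squared error is convex, the averaged prediction can never have a higher MSE than the mean of the individual models' MSEs, which is one reason averaging tends to be robust. It is not guaranteed to beat the single best model, so comparing the printed scores remains necessary.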
The implications of our work extend beyond cost savings. By fostering proactive quality control, these models contribute to reduced waste, improved resource allocation, and enhanced customer satisfaction. Our study highlights how enterprises can adopt machine learning to achieve competitive advantages and sustainable growth. The proposed framework is highly adaptable to real-world fashion startups, where demand forecasting and inventory management are critical challenges. For example, the framework can be integrated with sales and inventory data from fashion retailers to optimize stock levels, reduce holding costs, and forecast trends. Its modular design allows for scalability, enabling startups to incorporate additional data streams such as customer preferences and seasonal demand. This adaptability ensures that the findings from the Kaggle dataset can be extended to operational datasets with minimal adjustments.
Although our study achieved promising results in defect rate prediction, data availability and quality remain critical challenges for achieving broader applicability. For supply chain practitioners, our findings underscore the importance of investing in robust data collection mechanisms and integrating predictive models into existing workflows. Future enhancements such as feature importance analysis can improve the interpretability of predictions, enabling practitioners to make more informed decisions.
Future research should explore these methodologies in broader contexts, including the utilization of external data sources. Additionally, addressing challenges such as data quality, model explainability, and scalability will be crucial for maximizing the potential of predictive analytics in SCM.

6. Conclusions

This study presents a robust framework for leveraging machine learning techniques to optimize quality management and predict defect rates in supply chain management. By utilizing advanced predictive models such as random forest, XGBoost, and SVM, we successfully reduced defect-related costs and achieved substantial ROI improvements.
Our methodology emphasizes the importance of feature selection in identifying key determinants of defect rates and demonstrates the effectiveness of ensemble approaches in improving predictive accuracy and robustness. The framework was validated on a Kaggle dataset, which provides a foundation for real-world applications. Its demonstrated performance suggests that it can scale and adapt to operational datasets, making it a valuable tool for addressing demand forecasting and inventory challenges in fashion startups. While this dataset offers valuable insights specific to the fashion industry, it represents a single domain; future work will incorporate datasets from diverse industries, such as manufacturing and retail, to evaluate the framework’s applicability across varied supply chain contexts. Expanding the dataset scope will enhance the robustness and versatility of the proposed approach.
This study provides a robust framework for defect prediction and quality management in supply chains, offering actionable insights for practitioners across multiple roles, including suppliers, manufacturers, and logistics providers. However, challenges such as data availability and model interpretability must be addressed to further enhance reliability and applicability. Future research should prioritize integrating external data sources, improving feature importance analysis, and exploring advanced machine learning techniques to develop more scalable and explainable models.
In conclusion, our research underscores the transformative role of predictive analytics in SCM, offering actionable insights for industry stakeholders, practitioners, and policymakers. By addressing existing challenges and pursuing new research directions, we aim to advance the field of supply chain quality management, enabling enterprises to achieve operational excellence and sustainable growth in an increasingly dynamic and competitive environment.

Author Contributions

Conceptualization, Z.N.J. and B.V.; methodology, Z.N.J. and B.V.; validation, Z.N.J. and B.V.; formal analysis, Z.N.J. and B.V.; investigation, Z.N.J. and B.V.; resources, Z.N.J. and B.V.; data curation, Z.N.J. and B.V.; writing—original draft preparation, Z.N.J.; writing—review and editing, Z.N.J. and B.V.; visualization, Z.N.J. and B.V.; supervision, B.V.; project administration, Z.N.J. and B.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

This research utilized a publicly available dataset, which is cited in the references section.

Conflicts of Interest

The authors declare no conflicts of interest.

Correction Statement

This article has been republished with a minor correction to the existing affiliation information. This change does not affect the scientific content of the article.

References

  1. Mosteanu, N.R.; Faccia, A.; Ansari, A.; Shamout, M.D.; Capitanio, F. Sustainability integration in supply chain management through systematic literature review. Qual.-Access Success 2020, 21, 117–123. [Google Scholar]
  2. Bhattacharya, S.; Govindan, K.; Dastidar, S.G.; Sharma, P. Applications of artificial intelligence in closed-loop supply chains: Systematic literature review and future research agenda. Transp. Res. Part E Logist. Transp. Rev. 2024, 184, 103455. [Google Scholar]
  3. Sinha, N.; Garg, A.K.; Dhall, N. Effect of TQM principles on performance of Indian SMEs: The case of automotive supply chain. TQM J. 2016, 28, 338–359. [Google Scholar]
  4. Jum’a, L.; Alkalha, Z.; Alaraj, M. Towards environmental sustainability: The nexus between green supply chain management, total quality management, and environmental management practices. Int. J. Qual. Reliab. Manag. 2024, 41, 1209–1234. [Google Scholar]
  5. Jia, F.; Zuluaga-Cardona, L.; Bailey, A.; Rueda, X. Sustainable supply chain management in developing countries: An analysis of the literature. J. Clean. Prod. 2018, 189, 263–278. [Google Scholar]
  6. Oliveira, R.; Sampaio, P.; Cubo, C.; Carvalho, M.S.; Fernandes, A.C. Defining the supply chain quality management concept. In Handbook of Research Methods for Supply Chain Management; Edward Elgar Publishing: Cheltenham, UK, 2022. [Google Scholar]
  7. Li, G. Supply Chain Efficiency and Effectiveness Management Using Decision Support Systems. Int. J. Inf. Syst. Supply Chain. Manag. 2022, 15, 1–18. [Google Scholar]
  8. Tiedemann, F.; Wikner, J.; Johansson, E. Understanding lead-time implications for financial performance: A qualitative study. J. Manuf. Technol. Manag. 2021, 32, 183–207. [Google Scholar] [CrossRef]
  9. Kouvelis, P.; Qiu, Y. Financing Inventories with an Investment Efficiency Objective: ROI-Maximizing Newsvendor, Bank Loans and Trade Credit Contracts. Soc. Sci. Res. Netw. 2021, 60, 136–161. [Google Scholar] [CrossRef]
  10. Nagy-Bota, S.; Moldovan, L.; Nagy-Bota, M.C.; Varga, I.E. Mathematical Models Used in the Optimizations of Supply Chains. Acta Marisiensis 2023, 20, 27–31. [Google Scholar]
  11. Jahin, M.A.; Shovon MS, H.; Shin, J.; Ridoy, I.A.; Mridha, M.F. Big Data—Supply Chain Management Framework for Forecasting: Data Preprocessing and Machine Learning Techniques. Arch. Comput. Methods Eng. 2024, 31, 3619–3645. [Google Scholar]
  12. Wang, J.; Zheng, R.; Wang, Z. Supply Chain Optimization Strategy Research Based on Deep Learning Algorithm. Mob. Inf. Syst. 2022, 2022, 9058490. [Google Scholar]
  13. Patil, A.; Dwivedi, A.; Moktadir, M.A. Big data-Industry 4.0 readiness factors for sustainable supply chain management: Towards circularity. Comput. Ind. Eng. 2023, 178, 109109. [Google Scholar]
  14. Wang, F.; Aviles, J. Enhancing Operational Efficiency: Integrating Machine Learning Predictive Capabilities in Business Intelligence for Informed Decision-Making. Front. Bus. Econ. Manag. 2023, 9, 282–286. [Google Scholar]
  15. Nguyen, K.; Akbari, M.; Quang, H.T.; McDonald, S.; Hoang, T.H.; Yap, T.L.; George, M. Navigating Environmental Challenges through Supply Chain Quality Management 4.0 in Circular Economy: A Comprehensive Review. Sustainability 2023, 15, 16720. [Google Scholar] [CrossRef]
  16. Medina-Elizondo, M.; Molina-Morejón, V.M.; Fernández-Contreras, L.; Rodríguez-Figueredo, S. Quality management system in the supply chain of the metal mechanical manufacturing industry. ECORFAN J. Repub. Peru 2022, 8, 24–33. [Google Scholar]
  17. Rahayu, R.; Purnomo, E.P.; Malawani, A.D. Using The “Return on Investment” Strategy to Sustain Logistic Supply Provider Toward Indonesia’s Logistic Policy. J. Gov. Civ. Soc. 2020, 4, 201–218. [Google Scholar]
  18. Singh, H. Supply Chain Analysis. 2023. Available online: https://www.kaggle.com/datasets/harshsingh2209/supply-chain-analysis (accessed on 13 March 2024).
  19. Lada, S.; Chekima, B.; Karim MR, A.; Fabeil, N.F.; Ayub, M.S.; Amirul, S.M.; Ansar, R.; Bouteraa, M.; Fook, L.M.; Zaki, H.O. Determining factors related to artificial intelligence (AI) adoption among Malaysia’s small and medium-sized businesses. J. Open Innov. Technol. Mark. Complex. 2023, 9, 100144. [Google Scholar]
  20. Sani, S.; Xia, H.; Milisavljevic-Syed, J.; Salonitis, K. Supply Chain 4.0: A Machine Learning-Based Bayesian-Optimized LightGBM Model for Predicting Supply Chain Risk. Machines 2023, 11, 888. [Google Scholar] [CrossRef]
  21. Alkahtani, M. Supply Chain Management Optimization and Prediction Model Based on Projected Stochastic Gradient. Sustainability 2022, 14, 3486. [Google Scholar]
  22. Hassouna, M.; El-henawy, I.; Haggag, R. A multi-objective optimization for supply chain management using artificial intelligence (AI). Int. J. Adv. Comput. Sci. Appl. 2022, 13, 140–149. [Google Scholar]
  23. Toorajipour, R.; Sohrabpour, V.; Nazarpour, A.; Oghazi, P.; Fischl, M. Artificial intelligence in supply chain management: A systematic literature review. J. Bus. Res. 2020, 122, 502–517. [Google Scholar]
  24. Li, B.; Mellou, K.; Zhang, B.; Pathuri, J.; Menache, I. Large Language Models for Supply Chain Optimization. arXiv 2023, arXiv:2307.03875. [Google Scholar]
  25. Kasturi, K.; Jebathangam, J. Supply Chain Management for Business Process Optimization using Decision Tree Regression Model. Int. J. Adv. Res. Sci. Commun. Technol. 2023, 3, 548–554. [Google Scholar]
  26. Sodhi, M.S.; Tang, C.S. Supply Chain Management for Extreme Conditions: Research Opportunities. SSRN Electron. J. 2021, 57, 7–16. [Google Scholar]
  27. Jawad, Z.N.; Balázs, V. Machine learning-driven optimization of enterprise resource planning (ERP) systems: A comprehensive review. Beni-Suef Univ. J. Basic Appl. Sci. 2024, 13, 4. [Google Scholar]
  28. Zhao, Y.; Jing, S.; Wang, R. Quality Control Decision Research of Two-Level Supply Chain Based on the “ERC” Fairness Preference. In Advances in Intelligent Traffic and Transportation Systems; IOS Press: Amsterdam, The Netherlands, 2023. [Google Scholar]
  29. Jumoke, A.; Anafeh, A.T.; Ossi, C.S. Breaking Down Silos: Enhancing Supply Chain Efficiency Through Erp Integration and Automation. Int. Res. J. Mod. Eng. Technol. Sci. 2024, 6, 1935. [Google Scholar]
  30. Nzeako, G.; Akinsanya, M.O.; Popoola, O.A.; Chukwurah, E.G.; Okeke, C.D. The role of AI-Driven predictive analytics in optimizing IT industry supply chains. Int. J. Manag. Entrep. Res. 2024, 6, 1489–1497. [Google Scholar]
  31. Olaleye, T.O.; Arogundade, O.T.; Misra, S.; Abayomi-Alli, A.; Kose, U. Predictive Analytics and Software Defect Severity: A Systematic Review and Future Directions. Sci. Program. 2023, 2023, 6221388. [Google Scholar]
  32. Ross, A.; Neuteboom, W. Implementation of quality management from a historical perspective: The forensic science odyssey. Aust. J. Forensic Sci. 2021, 53, 359–371. [Google Scholar]
  33. Khedr, A.M. Enhancing supply chain management with deep learning and machine learning techniques: A review. J. Open Innov. Technol. Mark. Complex. 2024, 10, 100379. [Google Scholar]
  34. Eslamipoor, R.; Sepehriar, A. Enhancing supply chain relationships in the circular economy: Strategies for a green centralized supply chain with deteriorating products. J. Environ. Manag. 2024, 367, 121738. [Google Scholar] [CrossRef] [PubMed]
  35. Akinbamini, E.; Vargas, A.; Traill, A.; Boza, A.; Cuenca, L. Critical Analysis of Technologies Enhancing Supply Chain Collaboration in the Food Industry: A Nigerian Survey. Logistics 2025, 9, 8. [Google Scholar] [CrossRef]
  36. Elgalb, A.; Gerges, M. Optimizing Supply Chain Logistics with Big Data and AI: Applications for Reducing Food Waste. J. Curr. Sci. Res. Rev. 2024, 2, 29–39. [Google Scholar]
  37. Eslamipoor, R. A fuzzy multi-objective model for supplier selection to mitigate the impact of vehicle transportation gases and delivery time. J. Data Inf. Manag. 2022, 4, 231–241. [Google Scholar] [CrossRef]
  38. Varriale, V.; Cammarano, A.; Michelino, F.; Caputo, M. Critical analysis of the impact of artificial intelligence integration with cutting-edge technologies for production systems. J. Intell. Manuf. 2025, 36, 61–93. [Google Scholar]
  39. Zahlan, A.; Ranjan, R.P.; Hayes, D. Artificial intelligence innovation in healthcare: Literature review, exploratory analysis, and future research. Technol. Soc. 2023, 74, 102321. [Google Scholar] [CrossRef]
  40. Alzubaidi, L.; Al-Sabaawi, A.; Bai, J.; Dukhan, A.; Alkenani, A.H.; Al-Asadi, A.; Alwzwazy, H.A.; Manoufali, M.; Fadhel, M.A.; Albahri, A.S.; et al. Towards Risk-Free Trustworthy Artificial Intelligence: Significance and Requirements. Int. J. Intell. Syst. 2023, 2023, 4459198. [Google Scholar]
  41. Eni, L.N.; Raparthi, M.; LakshmiH; Yennapusa, H.; Balasubramanian, S.; Vodenicharova, M.; Srinu, C. From Data to Decisions Leveraging Machine Learning in Supply-Chain Management. Tuijin Jishu/J. Propuls. Technol. 2023, 44, 4218–4225. [Google Scholar]
  42. Hrbáčková, L.; Tuček, D. An analysis of two new process approach-related terms in ISO 9001:2015: Risk-based thinking and context of the organization. Sci. Pap. Univ. Pardubic. Ser. D Fac. Econ. Adm. 2019, 45, 65–76. [Google Scholar]
  43. Woerner, S.; Wagner, S.M.; Chu, Y.; Laumanns, M. Bonus or Penalty? Designing Service-Level Agreements for a Decentralized Supply Chain: The Implication of Return on Investment. IEEE Trans. Eng. Manag. 2024, 71, 837–854. [Google Scholar]
  44. Nikam, S.; Kolhare, N. Intelligent Quality Control System for Product Manufacturers through ML. In Proceedings of the International Conference on Smart Generation Computing, Communication and Networking (SMART GENCON), Bangalore, India, 23–25 December 2022. [Google Scholar]
  45. Wisetsri, W.; Syam, E.; Alanya-Beltran, J.; Kulkarni, G.R.; Reddy, R.K.V.; Sheikh, M.F.A. Assessing and comparing the role of machine learning (ml) and supply chain management (scm) towards enhancing e-commerce. In Proceedings of the 2nd International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE), Greater Noida, India, 28–29 April 2022. [Google Scholar]
  46. Kuk, E.; Bobek, S.; Nalepa, G.J. ML-Based Proactive Control of Industrial Processes. In Proceedings of the International Conference on Conceptual Structures, Prague, Czech Republic, 3–5 July 2023. [Google Scholar]
  47. Thayyib, P.V.; Mamilla, R.; Khan, M.; Fatima, H.; Asim, M.; Anwar, I.; Shamsudheen, M.K.; Khan, M.A. State-of-the-Art of Artificial Intelligence and Big Data Analytics Reviews in Five Different Domains: A Bibliometric Summary. Sustainability 2023, 15, 4026. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
