1. Introduction
Supplier selection is a crucial step in supply chain management, through which decision makers select the best suppliers for the services or products they wish to purchase [
1]. To increase performance and sustain current connections over time, businesses usually look for the best suppliers. Furthermore, in many industries, the cost of raw materials accounts for most manufacturing expenses [
2], so choosing a supplier is essential to a company’s financial health. Raw materials and services required to make a product typically account for 70% of a manufacturer’s selling price [
3,
4,
5]. Consequently, a cost-effective supplier can drastically lower the supply chain’s expenses. Furthermore, suppliers have a direct impact on an organisation’s profitability [
6,
7].
The identification and verification of the supplier is the first step in the supplier selection process. This is followed by the contract being signed. Although choosing a supplier may appear simple, it is one of the most important supply chain steps [
8]. Therefore, to remain competitive, lucrative, and secure, it is crucial to carefully choose a supplier based on the situational requirements [
9,
10,
11]. Which supplier to choose, and which one is most likely to meet the requirements, are questions that frequently come up during the supplier selection (SS) process. The answer depends on several qualitative and quantitative parameters [
12,
13], as the supplier selection process is a multi-criteria problem [
14,
15,
16,
17]. The choice of suppliers is significantly influenced by these criteria, so selecting the best supplier requires balancing them and the trade-offs between them. In practice, however, it is not feasible to trade off every potential supplier selection criterion, and not all criteria contribute equally to the selection of suppliers. When suppliers are chosen using irrelevant criteria, the SS process becomes more complex and the chance of an incorrect decision increases. Likewise, picking a supplier based on too broad a range of factors may misguide the selection process, which could have a detrimental effect on the business’s performance and earnings. Selecting relevant criteria is therefore crucial when choosing a supplier, and ideally every significant criterion would be identified from the broad spectrum of candidate criteria. Because this is a multi-criteria problem, Multi-Criteria Decision-Making (MCDM) procedures are a feasible way to address it [
18]. MCDM offers an assessment framework that can address real-world problems by using scientific analytical techniques to help decision-makers (DMs) come up with workable answers [
19,
20].
Sorting and analysing big data is one of the most difficult tasks for large manufacturing-oriented businesses when evaluating suppliers, because the many criteria and options make up a massive data system. The question posed is: what are the best criteria for efficiently determining the benefits and drawbacks of suppliers? [
21]. Data mining is an all-inclusive method for analysing large data. Data mining and knowledge discovery are computer science subfields that can be used to mine or extract important information from massive amounts of data and then transform it into concepts and structures that decision-makers (DMs) can easily grasp [
22]. According to Saura [
23], businesses might waste a lot of effort creating, organising, and cleaning databases (which may contain information from users, suppliers, and consumers). When constructing and reviewing databases, businesses can make better use of their time by utilising pertinent metrics and performance criteria. They demonstrated that one of the key responsibilities of data mining is identifying the fundamental criteria.
Furthermore, it is worth noting that the volume of supplier performance history data grows over time. The supplier dataset records each supplier’s historical performance. Compared to traditional methods, Machine Learning (ML) techniques can handle vast amounts of data more efficiently [
24]. ML techniques can therefore be more effective than traditional multi-criteria selection methods, because the latter are occasionally unable to identify the true patterns in large datasets [
25]. As a branch of artificial intelligence (AI), ML enables the effective selection of suppliers based on their prior performance [
26,
27]. Every supplier is judged against a set of qualities or standards, and in machine learning these qualities are represented as features. A feature selection (FS) algorithm can therefore be applied to determine the crucial supplier selection criteria. Further understanding of how the identified critical criteria affect other elements is also required, as the selection of suppliers depends on them.
While ML and MCDM offer different ranges of benefits, they also have different requirements. Therefore, combining them can offer opportunities to address more needs and gain access to additional advantages, as well as lessen the disadvantages that each type of strategy has on its own. The purpose of this study is to propose such an integrated strategy by combining machine learning techniques for dimension reduction with the computation of selection criteria weights, which are subsequently applied to supplier ranking. An actual case study involving a pharmaceutical corporation validates the suggested methodology. The following is a summary of this study’s contributions:
A novel integrated supplier selection strategy that combines the TOPSIS method with a machine learning approach.
A comparative analysis of various machine learning algorithms to ascertain their applicability within the framework of the suggested methodology.
A case study that illustrates the suggested methodology in a real-world scenario.
This paper’s remaining sections are organised as follows. The pertinent background information is compiled in
Section 2, which also offers a targeted summary of important literature on integrated approaches to supplier selection. The proposed integrated method is discussed in
Section 3. In this section, the TOPSIS approach used to rank suppliers is detailed, along with the machine learning techniques used for dimension reduction and determining the weight of criteria. To validate the suggested approach, a case study involving a pharmaceutical organisation is presented in
Section 4.
Section 5 wraps up by outlining several potential lines of inquiry for further study.
2. Literature Review
Suppliers give the supply chain system the materials, parts, and technology it needs to function. Supplier procurement operations can have a substantial impact on a company’s profitability, as procurement expenditures account for between 70% and 80% of most companies’ production costs. Additionally, the company’s turnover is heavily reliant on the resources and capabilities of these suppliers [
28]. The main objectives of any supply chain management system are to effectively manage the flow of resources, data, and money to meet client demands and accomplish overarching corporate objectives [
29,
30]. As the primary operational driver, suppliers can either enhance or reduce the efficacy of the supply chain [
31]. Several difficulties nevertheless complicate the supplier selection process. One of the main problems with SS is deciding which criteria to include in the evaluation and selection process so that they are both acceptable and relevant [
32]. The selection of suppliers ought to be guided by the criteria of objectivity, specificity, and comprehensiveness. Businesses must create a thorough and precise evaluation system before choosing suppliers. The criteria obtained through a literature review include financial capabilities, equipment management, human resource development, quality control, cost control, technology development, user happiness, delivery agreements, and environmental awareness [
33]. To find suppliers for the cold supply chain (CSC), Ullah and Yousaf [
34] examined fifteen different essential criteria. They discovered that “utilisation of resources” is the most crucial.
MCDM methods consider preferences across a variety of quantitative and qualitative criteria, which are typically conflicting and difficult to reconcile, making it more difficult to come to a consensus. Behavioural decision theory, computer science, economics, and information systems are among the fields that influence the creation of MCDM methodologies [
1]. Numerous MCDM strategies have been put out in earlier research to assist businesses in choosing qualified suppliers. These include fuzzy set theory (FST) [
35], data envelopment analysis (DEA) [
36], the technique for order preference by similarity to ideal solution (TOPSIS) [
37], the analytical hierarchy process (AHP) [
38], and multi-objective programming [
39,
40]. To meet the demands of the decision-making scenarios, researchers have improved or combined these well-liked and traditional methods [
41,
42,
43,
44]. Most research, however, concentrated on supplier selection theories and methodologies, ignoring the development of criteria systems or only qualitatively evaluating them using pre-existing literature or professional judgement [
45]. The calibre of the decision-making in the earlier stages has a significant impact on the calibre of the supplier that is ultimately chosen [
46]. The complexity of supplier assessment difficulties and the unpredictability of human thought have led to a considerable increase in the number of studies on SS employing traditional approaches in the literature in recent years [
47,
48,
49,
50]. However, the SS process can be handled by machine learning [
51,
52].
ML primarily offers reliable information since it accurately forecasts the circumstances and assists in identifying the optimal course of action among the several options created during the study [
53,
54]. ML algorithms have been studied extensively. They include supervised and unsupervised techniques such as k-means, principal component analysis (PCA), random forest (RF), support vector machines (SVMs), and artificial neural networks (ANNs), among others [
55,
56,
57,
58]. In a variety of domains, such as medical imaging, image classification, speech recognition, and other industrial contexts, machine learning techniques are effectively applied with remarkable outcomes on object detection problems [
59] and dimension reduction [
60]. Non-negative Matrix Factorization (NMF) is an effective dimension reduction technique that has been reported to outperform classic linear methods and other techniques [
61,
62,
63]. One of its primary strengths is its capacity to handle non-negative constraints, making it ideal for datasets with only positive values. Using this trait, NMF may extract parts-based and additive representations, exposing underlying patterns and features in data [
64]. Furthermore, NMF’s inherent sparsity-promoting nature enables it to automatically choose relevant features, significantly lowering data dimensionality while retaining critical information. Unlike certain linear approaches, which may struggle with high-dimensional and complicated datasets, NMF is robust and scalable in such settings [
65]. NMF interpretability is important because it allows researchers to obtain relevant insights into the data’s latent structure, which facilitates data exploration and analysis [
66]. Overall, the combination of nonnegativity, sparsity, interpretability, and scalability makes NMF a versatile and compelling strategy for dimension reduction tasks, offering a viable alternative to other methods in the field [
67].
Despite ML’s ability to handle complicated problems, its application to SS has been rather limited [
68]. Moreover, Huo et al. [
69] claim that RF feature selection models offer the most accurate predictions, closely matching the observed values. For example, RF helped assess a green supplier by exposing the pairwise correlations between the criteria [
70]. Furthermore, RF aids in supplier ranking according to performance [
71]. Additionally, RF establishes the process’s flexibility and versatility and makes supplier evaluation trustworthy [
72].
A substantial body of research has combined MCDM techniques with ML techniques to offset the drawbacks of each strategy in the supplier selection process. This section reviews a selection of related research in which the supplier selection and assessment problem has been addressed by integrating ML algorithms with MCDM methodologies.
To highlight the specific research gap that the methodology suggested in
Section 3 aims to fill, a critical review is provided, with a primary focus on recently published research. Using vast amounts of historical data, Neji et al. [
73] showed how to apply data-driven MCDM to green supplier selection problems. Using Random Forests, they examined the connections between various supplier selection factors. After that, they utilised a combination of DEMATEL and ANP to determine the criteria weights. Multi-objective optimisation and ratio analysis were then used to evaluate suppliers by calculating the difference between ideal and current suppliers. Through a case study of a green supplier selection procedure used by a Taiwanese electronics company, the effectiveness of the approach was confirmed.
The integrated strategy used by Cheng et al. [
74] integrates ML models with several MCDM techniques. Specifically, DEA and TOPSIS are combined to identify suppliers. The resulting labelled dataset is subsequently used to build a Support Vector Regression model, which can categorise undesired suppliers. A case study on a manufacturer of automation and electronic systems was carried out, demonstrating the method’s accuracy and resilience. To evaluate customer satisfaction and pinpoint important supplier components, the authors of [75] investigate a combination of the Supply Chain Operations Reference (SCOR) 4.0 model and the best-worst method (BWM). They concentrate on sustainability and resilience in the pharmaceutical sector. After that, a gradient boosting machine learning model is used to categorise and rank suppliers according to their acceptability score; the outcomes demonstrate the algorithm’s efficacy. Considering supplier selection in uncertain environments, the authors of [76] investigated the integration of fuzzy Delphi and fuzzy BWM to refine and weigh criteria and to prioritise suppliers under information uncertainty, and used TOPSIS and Grey Correlation to select the best supplier and distribute orders.
A prevalent feature of hybrid MCDM/ML methodologies in the literature is that they prioritise performance over adaptability [
1]. The low adoption of these solutions by supply chain stakeholders may also be explained by the fact that they are not easily integrated into procurement processes, even though they can successfully identify suitable suppliers more efficiently than typical MCDM approaches [
77]. The main obstacle to adoption is frequently the inability to justify the choices made by an ML-based or MCDM/ML hybrid strategy. Abdulla et al. [
78] have previously investigated interpretable machine learning techniques in conjunction with AHP to carry out supplier selection in this setting. To determine the most crucial selection criteria and weights that are then utilised to rank suppliers using AHP, a decision tree method was utilised. The results showed that by concentrating just on a subset of selection criteria, the decision tree algorithm could effectively determine the most crucial criteria, hence lessening the strain of applying the AHP approach. By investigating a greater range of machine learning algorithms outside of decision trees and considering a more modern MCDM approach created to address AHP’s problems, the methodology described in this work and detailed in the following section is continuing along the same trajectory.
3. Proposed Model
This section outlines the methodology employed and provides a detailed explanation of the computational process and solution procedure, which incorporates NMF, RF, and TOPSIS. As depicted in
Figure 1, supplier performance criteria data is first processed using NMF to reduce dimensionality and identify the core criteria. Subsequently, RF is applied, based on input from decision-makers, to assign weights to the core criteria, reflecting their significance to the case company within its operational context. Lastly, the framework evaluates potential new suppliers, utilising TOPSIS to consolidate the evaluation data and rank the suppliers. The detailed calculation steps are presented in the following sections.
3.1. Identification of Criteria and Data Preprocessing
The six primary factors used by pharmaceutical firms are supplier profile, cost, quality, services, delivery, and overall staff competencies [
79].
Table 1 shows how these main criteria are broken down into sub-criteria. A questionnaire was used to collect information on all 24 criteria from industrial experts. Business managers of 34 pharmaceutical companies were consulted to rate the supplier selection criteria (c1, c2, …, c24) from 0 (the least important) to 10 (the most important).
3.2. Dimension Reduction with NMF Method
3.2.1. Matrix Construction and Notation
In this analysis, the supplier selection process is represented by a matrix $X$ of dimensions $m \times n$, where $m$ is the number of suppliers and $n$ is the number of evaluation criteria. Each column $c_j$ (e.g., $c_1$, $c_2$, etc.) represents a unique criterion used to assess suppliers, such as quality, cost efficiency, or delivery performance [80].
This matrix X forms the basis for applying Non-negative Matrix Factorization (NMF).
NMF is a dimensionality reduction technique that factors the input matrix $X$ into two non-negative matrices, $W$ and $H$, giving the following [81]:
$$X \approx WH, \qquad W \ge 0, \; H \ge 0$$
Here:
$W \in \mathbb{R}^{m \times r}$ contains the weights of each supplier in $r$ latent components, representing supplier profiles.
$H \in \mathbb{R}^{r \times n}$ contains the contribution of each criterion in these components, capturing the importance of each criterion in forming these profiles.
The rank $r$ (or number of components) controls the complexity of the model, balancing the fidelity of the approximation with interpretability and computational efficiency.
3.2.2. Optimisation Objective
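The decomposition is achieved by minimising the reconstruction error, measured by the Frobenius norm of the difference between $X$ and its approximation $WH$:
$$\min_{W \ge 0,\; H \ge 0} \; \lVert X - WH \rVert_F$$
where $\lVert \cdot \rVert_F$ denotes the Frobenius norm, defined as follows:
$$\lVert A \rVert_F = \sqrt{\sum_{i} \sum_{j} a_{ij}^{2}}$$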
This objective ensures that the factorisation captures as much of the original data’s structure as possible, which is particularly useful when negative values have no interpretive meaning, as in supplier evaluation scores [
82].
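The factorisation and its reconstruction error can be illustrated with a minimal sketch. It assumes the multiplicative update rules of Lee and Seung, which are one common way to minimise the Frobenius objective; the text does not state which solver was used. The function name `nmf`, its parameters, and the small ratings matrix are hypothetical.

```python
import numpy as np

def nmf(X, r, n_iter=500, eps=1e-9, seed=0):
    """Factor a non-negative matrix X (m x n) into W (m x r) and H (r x n)
    with multiplicative updates for the Frobenius objective (an assumed solver)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, r)) + eps
    H = rng.random((r, n)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H stay non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical ratings: 6 suppliers (rows) x 4 criteria (columns), 0-10 scale.
X = np.array([
    [9, 8, 2, 1],
    [8, 9, 1, 2],
    [2, 1, 9, 8],
    [1, 2, 8, 9],
    [9, 9, 1, 1],
    [1, 1, 9, 9],
], dtype=float)

W, H = nmf(X, r=2)
error = np.linalg.norm(X - W @ H, ord="fro")  # Frobenius reconstruction error
print(f"reconstruction error for r=2: {error:.3f}")
```

Rows of `H` can then be read as criterion loadings per latent component, and rows of `W` as supplier profiles over those components.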
3.2.3. Selecting the Optimal Rank r Using the Elbow Method and KneeLocator
Determining the optimal rank r is crucial for effective dimensionality reduction. A common approach is the “elbow method,” where the reconstruction error is plotted as a function of r. The point where the error reduction slows significantly, forming an “elbow,” indicates a rank that balances accuracy and simplicity [
83]. To automate this process, the KneeLocator algorithm is employed, which detects the “elbow” or “knee” in a curve by identifying the point of maximum curvature.
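As a rough illustration of elbow detection, the sketch below picks the rank whose point lies farthest from the straight line joining the first and last points of the error curve. This is a simplified geometric stand-in, not the algorithm that KneeLocator implements, and the error values are made up for the example.

```python
def find_elbow(ranks, errors):
    """Return the rank whose (rank, error) point is farthest from the chord
    joining the first and last points of the curve."""
    x0, y0 = ranks[0], errors[0]
    x1, y1 = ranks[-1], errors[-1]
    chord_length = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
    best_rank, best_dist = ranks[0], -1.0
    for x, y in zip(ranks, errors):
        # Perpendicular distance from (x, y) to the chord.
        dist = abs((y1 - y0) * (x - x0) - (x1 - x0) * (y - y0)) / chord_length
        if dist > best_dist:
            best_rank, best_dist = x, dist
    return best_rank

# Hypothetical reconstruction errors for r = 1..6.
ranks = [1, 2, 3, 4, 5, 6]
errors = [10.0, 6.0, 3.0, 2.5, 2.2, 2.0]
elbow = find_elbow(ranks, errors)
print(elbow)
```

Because the two axes are not rescaled here, the result depends on their units; normalising both axes to [0, 1] first is a reasonable refinement.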
3.3. Random Forest Feature Selection
The following steps outline the intuition underlying the RF feature selection [
84,
85].
Step 1: From the original training dataset, $K$ classification trees are constructed.
Step 2: For each tree $t$ in the Random Forest, consider the associated out-of-bag sample $OOB_t$. The error of tree $t$ on this sample is denoted $errOOB_t$.
Step 3: Randomly permute the values of criterion $c_j$ in $OOB_t$ to create a perturbed sample, denoted $\widetilde{OOB}_t^{\,j}$, and let $err\widetilde{OOB}_t^{\,j}$ be the error of tree $t$ on it. The feature (criterion) importance is then as follows [86]:
$$VI(c_j) = \frac{1}{K} \sum_{t=1}^{K} \left( err\widetilde{OOB}_t^{\,j} - errOOB_t \right)$$
From this equation, the importance of each feature can be measured in order to select the critical criteria.
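The permutation idea behind these steps can be sketched without a full forest. The example below is a simplification: it uses a single fixed predictor and one evaluation sample instead of per-tree samples, and averages the error increase over repeated shuffles. The toy predictor, data, and function names are hypothetical.

```python
import random

def error_rate(predict, rows, labels):
    """Fraction of rows that the model misclassifies."""
    wrong = sum(1 for row, y in zip(rows, labels) if predict(row) != y)
    return wrong / len(rows)

def permutation_importance(predict, rows, labels, n_repeats=20, seed=0):
    """Mean increase in error when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = error_rate(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        increase = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            perturbed = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            increase += error_rate(predict, perturbed, labels) - baseline
        importances.append(increase / n_repeats)
    return importances

# Toy stand-in for a fitted tree: it looks only at feature 0.
def predict(row):
    return 1 if row[0] > 5 else 0

rows = [[i % 10, (i * 7) % 10] for i in range(20)]
labels = [predict(row) for row in rows]
importances = permutation_importance(predict, rows, labels)
print(importances)
```

Shuffling a feature the model ignores leaves the error unchanged, so its importance is zero, while shuffling the feature the model relies on raises the error.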
The effectiveness of an ML model is indicated by performance measures. Although many performance measures exist, most earlier research concentrated on accuracy and the F-score to characterise model quality [87,88,89]. Therefore, in this work, RF classifier performance is measured using the F-score and accuracy, as given by Equations (5) and (6).
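Both measures can be computed directly from the confusion counts. The sketch below assumes the standard binary definitions, accuracy = (TP + TN) / (TP + TN + FP + FN) and F-score = 2PR / (P + R) with precision P and recall R; the function names and the label vectors are illustrative only.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)

def f_score(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical labels: 1 = acceptable supplier, 0 = not acceptable.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(accuracy(y_true, y_pred), f_score(y_true, y_pred))
```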
3.4. TOPSIS
Throughout the past few decades, the TOPSIS model has been widely used in a variety of research domains to help with decision-making by rating multiple options according to how close or similar they are to an ideal answer [
90,
91,
92]. The TOPSIS model is used in this study to rank the suppliers after evaluating their performances in relation to sustainability aspects. The steps below are used to formulate the TOPSIS model:
Step 1: Evaluate supplier performance in relation to sustainability considerations:
The questionnaires created specifically for this study are used for this. The respondents used the linguistic scale in
Table 2 to rate the vendors’ performance on each question.
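Step 2: Normalise the decision matrix:
The normalised decision matrix can be computed as follows:
$$D = [r_{ij}]_{m \times n}$$
where $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$, and $D$ is the normalised decision matrix with $m$ rows and $n$ columns. Here $r_{ij}$ is the normalised value of $x_{ij}$, the performance score of supplier $i$ on criterion $j$.
The sustainability criteria are divided into two categories: “benefit criteria,” for which a higher value on the scale is favourable, and “cost criteria,” for which a lower value on the scale is favourable.
Benefit criteria:
$$r_{ij} = \frac{x_{ij}}{x_j^{+}}$$
where $x_j^{+} = \max_i x_{ij}$.
Cost criteria:
$$r_{ij} = \frac{x_j^{-}}{x_{ij}}$$
where $x_j^{-} = \min_i x_{ij}$.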
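Step 3: Determine the weighted normalised decision matrix:
The weighted normalised decision matrix is calculated as follows:
$$V = [v_{ij}]_{m \times n}, \qquad v_{ij} = w_j \, r_{ij}$$
where $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$, $r_{ij}$ is the normalised score from Step 2, and $w_j$ is the weight of the $j$-th criterion applied to the $i$-th supplier alternative.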
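Step 4: Calculate the positive ideal solution (P+) and negative ideal solution (P−):
$$P^{+} = (v_1^{+}, v_2^{+}, \ldots, v_n^{+}), \qquad P^{-} = (v_1^{-}, v_2^{-}, \ldots, v_n^{-})$$
where $v_j^{+} = \max_i v_{ij}$ and $v_j^{-} = \min_i v_{ij}$ are taken over the weighted normalised scores $v_{ij}$; $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$.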
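Step 5: Determine the separation measures of each supplier alternative from the positive and negative ideal solutions:
$$d_i^{+} = \sum_{j=1}^{n} d\left(v_{ij}, v_j^{+}\right), \qquad d_i^{-} = \sum_{j=1}^{n} d\left(v_{ij}, v_j^{-}\right)$$
where $d(\cdot, \cdot)$ is the distance between two corresponding numbers on the linguistic scale.
Step 6: Compute the closeness coefficient ($A_i$) for each supplier alternative:
$$A_i = \frac{d_i^{-}}{d_i^{+} + d_i^{-}}$$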
Step 7: Prioritisation of supplier alternatives:
According to the closeness coefficient, the supplier options are ranked, with the best option being the one that is closest to the positive ideal solution and farthest from the negative ideal solution. Once these steps are finished, a ranking of all potential suppliers has been determined. If stakeholders are interested in evaluating several possibilities, this ranking can be the direct output of the approach; if just one supplier is to be selected, the output can simply be the supplier with the highest score.
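The seven steps can be sketched end to end for crisp (non-fuzzy) scores. The sketch makes several assumptions that the text does not confirm: linear max/min normalisation, the absolute difference as the distance d, and strictly positive scores. The function name `topsis`, the two criteria, and all numbers are hypothetical.

```python
def topsis(scores, weights, is_benefit):
    """Return the closeness coefficient of each alternative (rows of `scores`)."""
    n = len(weights)
    col_max = [max(row[j] for row in scores) for j in range(n)]
    col_min = [min(row[j] for row in scores) for j in range(n)]
    # Step 2: linear normalisation (assumed); cost criteria are inverted.
    normalised = [
        [row[j] / col_max[j] if is_benefit[j] else col_min[j] / row[j] for j in range(n)]
        for row in scores
    ]
    # Step 3: apply the criteria weights.
    weighted = [[weights[j] * r[j] for j in range(n)] for r in normalised]
    # Step 4: positive and negative ideal solutions.
    ideal = [max(v[j] for v in weighted) for j in range(n)]
    anti_ideal = [min(v[j] for v in weighted) for j in range(n)]
    # Steps 5-6: separation measures and closeness coefficient.
    closeness = []
    for v in weighted:
        d_pos = sum(abs(v[j] - ideal[j]) for j in range(n))
        d_neg = sum(abs(v[j] - anti_ideal[j]) for j in range(n))
        # Assumes the alternatives are not all identical (d_pos + d_neg > 0).
        closeness.append(d_neg / (d_pos + d_neg))
    return closeness

# Hypothetical data: three suppliers scored on quality (benefit) and price (cost).
scores = [[8, 200], [6, 100], [4, 400]]
weights = [0.6, 0.4]
is_benefit = [True, False]

closeness = topsis(scores, weights, is_benefit)
# Step 7: rank alternatives from best to worst.
ranking = sorted(range(len(scores)), key=lambda i: closeness[i], reverse=True)
print(closeness, ranking)
```

In this toy case the second supplier ranks first because its low price outweighs its lower quality score under the chosen weights.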
4. Case Study
A real-life case study was considered to illustrate the approach and analyse its effectiveness. A questionnaire was used to collect information about the importance of 24 criteria across thirty-four (34) pharmaceutical companies. This section provides a detailed explanation of how the proposed approach was applied to the data. The business managers of the 34 pharmaceutical companies rated the supplier selection criteria (c1, c2, …, c24) from 0 (the least important) to 10 (the most important); their responses are shown in
Table 3:
4.1. Model Establishment and Calculation of NMF
To identify the most impactful criteria for supplier selection, an overall score was computed for each criterion based on its contribution across all latent factors derived from the NMF decomposition as shown in
Table 4. Specifically, the contributions of each criterion were summed across the 7 selected factors, resulting in a cumulative score that reflects the overall importance of each criterion within the model.
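The cumulative scoring described here amounts to a column sum over the factor loading matrix followed by a sort. A minimal sketch, assuming the loadings are stored with one row per latent factor and one column per criterion; the function name and the small matrix are hypothetical and far smaller than the real 24-criterion case.

```python
def top_criteria(H, names, k):
    """Sum each criterion's loadings over all latent factors and keep the k largest."""
    totals = [sum(factor[j] for factor in H) for j in range(len(names))]
    ranked = sorted(zip(names, totals), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical loadings: 2 latent factors (rows) x 4 criteria (columns).
H = [
    [0.9, 0.1, 0.4, 0.0],
    [0.2, 0.1, 0.5, 0.3],
]
names = ["c1", "c2", "c3", "c4"]

selected = top_criteria(H, names, k=2)
print(selected)
```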
NMF is adopted for dimension reduction, identifying the core criteria for the evaluation framework.
Figure 2 shows the 8 criteria obtained from the original 24, which are the main criteria for classifying suppliers’ ratings. As a result, the original 34 × 24 data matrix was reduced to 34 × 8. To ensure reliable dimensionality reduction, the parameter settings for Non-negative Matrix Factorization were determined through an iterative evaluation of reconstruction accuracy and interpretability. The number of latent components (
k) was varied between 6 and 12, and the model’s performance was assessed based on the Frobenius norm of reconstruction error. The optimal configuration of
k = 8 was selected because it produced the lowest reconstruction error while maintaining clear interpretability of the resulting factors in the context of supplier evaluation. This significantly reduces the amount of data and reduces the noise factor to improve the accuracy of the evaluation.
Using the elbow method, the cumulative scores were analysed to find the point beyond which additional criteria provided diminishing returns in contribution. This approach allowed the top eight (8) criteria, which collectively accounted for the most significant influence on supplier selection decisions, to be selected. The graphical representation of the selected criteria using the elbow method is presented in
Figure 3:
The main critical criteria from the NMF analysis are as follows: product reliability, which evaluates the overall quality of the product offered by the supplier; record history, which considers the supplier’s documented history of performance, including reliability and compliance; purchase price, which assesses the cost-effectiveness of the supplier’s products, an essential factor in budget considerations; management quality, which reviews the supplier’s management structure and organisational efficiency, both of which affect reliability; technical competence, which examines the supplier’s technical competencies, including specialised skills and technologies; payment conditions, which looks at the flexibility and conditions of payment, influencing financial feasibility; customer relations, which evaluates the supplier’s approach to managing customer relationships; and financial strength, which considers the financial health of the supplier, critical for assessing stability.
These criteria represent a balanced view of the supplier’s capabilities, covering quality, cost, management, technical strength, financial strength, and customer relations. Selecting these top criteria based on their cumulative scores helps in constructing a robust evaluation framework that prioritises the most influential factors in supplier selection.
4.2. Using Random Forest to Obtain Criteria Weights
To determine the importance of each criterion in the supplier selection process, a machine learning classification approach was employed. This process involved constructing pipelines that combined data scaling methods with various classifiers, followed by hyperparameter tuning and feature importance extraction. Two scaling techniques, StandardScaler and MinMaxScaler, were adopted to normalise the data, ensuring that each feature contributed equally regardless of scale. Four classifiers were applied: Random Forest Classifier, Support Vector Classifier (SVC), K-Nearest Neighbours (KNN), and Logistic Regression. For each combination of scaler and classifier, a pipeline was created to streamline preprocessing and model fitting.
GridSearchCV was utilised to perform hyperparameter tuning, testing various parameter values to optimise each classifier’s performance. The hyperparameter grid was specific to each model, with parameters such as the number of estimators and maximum depth for Random Forest, and the regularisation parameter C for SVC. Cross-validation was used with accuracy as the scoring metric to identify the best-performing pipeline in each combination. To optimise the Random Forest model and ensure robust feature weighting, a 10-fold cross-validation procedure was implemented. Parameter tuning was performed using a grid search approach, varying the number of trees (from 100 to 1000) and maximum tree depth (from 5 to 15). After the best pipeline was identified, it was further evaluated by extracting feature importances, which is possible for models that provide this information, such as Random Forest. The feature importance indicates the relative weight of each criterion in the model, enabling a ranked prioritisation of criteria. The results showed that StandardScaler with the Random Forest Classifier achieved the highest accuracy. The extracted feature importances revealed that technical competence (c15), management quality (c14), and product reliability (c4) had the greatest impact, suggesting these factors are crucial for accurate supplier classification. This machine-learning-driven weighting provides an evidence-based approach for prioritising criteria, enhancing the robustness of the supplier selection framework.
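A pipeline of this kind might look like the following sketch, which uses scikit-learn. It is not the configuration used in the study: the data are synthetic, only Random Forest is tuned, and the grid and number of folds are deliberately smaller than the 100 to 1000 trees and 10 folds described above so that the example runs quickly.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Synthetic stand-in for ratings on 8 core criteria; NOT the study's data.
X, y = make_classification(n_samples=120, n_features=8, n_informative=4, random_state=0)

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", RandomForestClassifier(random_state=0)),
])

# The "scaler" entry swaps the whole preprocessing step; the grid is reduced for speed.
param_grid = {
    "scaler": [StandardScaler(), MinMaxScaler()],
    "clf__n_estimators": [25, 50],
    "clf__max_depth": [5, 10],
}

search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X, y)

# Feature importances of the best forest can serve as criteria weights (they sum to 1).
weights = search.best_estimator_.named_steps["clf"].feature_importances_
print(search.best_params_, round(search.best_score_, 3))
print(weights.round(4))
```

Other classifiers such as SVC, KNN, or logistic regression could be compared by running a separate search per classifier with its own parameter grid.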
Four different models were adopted to determine the criteria importance: random forest, SVM, logistic regression, and KNN. The accuracy of each model, as displayed in
Table 5, suggests that random forest is the most accurate model, with an accuracy score of 84.3%.
The weight distribution as shown in
Table 6 suggests that technical competence is the most important criterion, with a weight of 0.2366, followed by management quality with a weight of 0.1564 and product reliability with a weight of 0.1457. The three lowest-weighted criteria are record history, financial strength, and purchase price.
4.3. Using TOPSIS to Integrate the Performance of Suppliers and Their Priority Ranking
Finally, TOPSIS is used to integrate suppliers’ performance data to form a final performance score, which is used to determine the ratings of the suppliers. In this case, performance data for four (4) suppliers were collected as shown in
Table 7,
Table 8,
Table 9 and
Table 10. The decision-maker only had to assess the suppliers on the core criteria, which saved considerable time and investigation cost.
The results indicate that supplier
S1* is the most appropriate supplier, with a performance score of 0.7089, and
S3* is rated second, with a performance score of 0.6355. If a third supplier needs to be included, then
S2* can be selected (
Table 11).
5. Discussion
The supplier ranking results demonstrate strong stability even when moderate variations are introduced into the importance weights of evaluation criteria. This robustness indicates that the proposed hybrid model is not overly sensitive to small fluctuations in input parameters, thereby enhancing its practical reliability and making it suitable for real-world procurement decision-making under uncertain conditions. The findings confirm that supplier evaluation in contemporary supply chains extends beyond economic considerations to encompass technical, managerial, and sustainability-related factors.
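One way such a stability check could be carried out is sketched below. The procedure is an assumption for illustration, not the one used in the study: each weight is perturbed by up to ±10%, the weights are renormalised, and the share of trials that reproduce the baseline ranking is reported. The normalised scores, weights, and function names are hypothetical, and the absolute difference is assumed as the distance.

```python
import random

def rank_suppliers(normalised, weights):
    """Rank alternatives by closeness coefficient (best first)."""
    n = len(weights)
    weighted = [[weights[j] * row[j] for j in range(n)] for row in normalised]
    ideal = [max(v[j] for v in weighted) for j in range(n)]
    anti_ideal = [min(v[j] for v in weighted) for j in range(n)]
    scores = []
    for v in weighted:
        d_pos = sum(abs(v[j] - ideal[j]) for j in range(n))
        d_neg = sum(abs(v[j] - anti_ideal[j]) for j in range(n))
        scores.append(d_neg / (d_pos + d_neg))
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

def ranking_stability(normalised, weights, spread=0.10, trials=200, seed=0):
    """Share of perturbed-weight trials that reproduce the baseline ranking."""
    rng = random.Random(seed)
    baseline = rank_suppliers(normalised, weights)
    unchanged = 0
    for _ in range(trials):
        noisy = [w * (1 + rng.uniform(-spread, spread)) for w in weights]
        total = sum(noisy)
        noisy = [w / total for w in noisy]  # renormalise so the weights sum to 1
        if rank_suppliers(normalised, noisy) == baseline:
            unchanged += 1
    return unchanged / trials

# Hypothetical normalised scores: three suppliers on two criteria.
normalised = [[1.0, 0.5], [0.75, 1.0], [0.5, 0.25]]
weights = [0.6, 0.4]
stability = ranking_stability(normalised, weights)
print(stability)
```

A value close to 1 indicates that the ranking is insensitive to weight changes of the chosen magnitude.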
The analysis revealed that technical competence (c15), management quality (c14), and product reliability (c4) carry the highest importance weights as determined by the Random Forest model. These attributes are essential for ensuring long-term supplier competence, process reliability, and innovation, all critical enablers of sustainable supply chain performance. From a sustainability perspective, strong technical and managerial capacities support environmental compliance, resource efficiency, and quality assurance systems that reduce waste and improve operational resilience.
In addition to technical and managerial dimensions, the study recognises that environmental and social sustainability aspects, including adherence to environmental management standards, occupational health and safety practices, and corporate social responsibility are integral to sustainable supplier selection. By embedding these dimensions within the evaluation framework, the proposed model promotes a balanced approach that values both profitability and sustainability, aligning with global supply chain sustainability objectives.
The integration of the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) further enhances the interpretability and objectivity of supplier ranking. The hybrid data-driven approach, which combines Non-negative Matrix Factorization for dimensionality reduction and Random Forest for data-derived weighting, minimises human bias and enhances analytical transparency. This positions the model as an intelligent decision-support system capable of improving supplier evaluation accuracy, strengthening sustainability performance assessment, and supporting strategic sourcing decisions. Overall, the study underscores that sustainable supplier evaluation should integrate economic efficiency, technical capability, environmental stewardship, and social responsibility. The proposed hybrid framework provides procurement professionals with a reliable and scalable tool that advances both operational excellence and sustainability performance across the supply chain.
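For readers unfamiliar with the ranking step, the following Python sketch shows the standard TOPSIS procedure: vector normalisation, weighting, identification of the ideal and anti-ideal solutions, and computation of the closeness coefficient. The decision matrix, weights, and benefit/cost flags are hypothetical and do not reproduce the case-study data; the function is a textbook version, not necessarily the exact variant used in this study.

```python
import math


def topsis(matrix, weights, benefit):
    """Return one closeness coefficient per alternative (higher is better).

    matrix  : rows are alternatives, columns are criteria
    weights : one weight per criterion
    benefit : True if larger is better for that criterion, False for cost criteria
    Assumes no all-zero column and at least two distinct alternatives.
    """
    n_crit = len(weights)
    # Step 1: vector-normalise each criterion column.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    # Step 2: apply the criterion weights.
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # Step 3: ideal and anti-ideal values per criterion.
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    # Step 4: Euclidean distances and closeness coefficient.
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti)
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical example: 3 suppliers, criteria = (quality, reliability, cost).
matrix = [
    [9.0, 7.0, 4.0],
    [6.0, 9.0, 2.0],
    [8.0, 8.0, 3.0],
]
weights = [0.5, 0.3, 0.2]
benefit = [True, True, False]  # cost is a "smaller is better" criterion

scores = topsis(matrix, weights, benefit)
ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(scores, ranked)
```

The closeness coefficient lies between 0 and 1, which is the scale on which supplier performance scores are compared when ranking.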
5.1. Theoretical Implications
This study contributes to the literature on multi-criteria decision-making (MCDM) and supplier selection by proposing a hybrid model that combines machine learning techniques with classical MCDM tools. The novelty of this study lies in its unique combination of Random Forest (RF) for feature importance with Non-negative Matrix Factorization (NMF) for dimensionality reduction, followed by Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) for final decision-making. While previous studies have explored hybrid frameworks such as RF combined with Analytic Hierarchy Process (AHP) [
93] or RF integrated with DEMATEL [
94], these models rely on expert judgement for pairwise comparisons or influence relationships, which introduces subjectivity and limits scalability. In contrast, the proposed model offers a fully data-driven pipeline where RF automates the determination of criteria importance, NMF eliminates noise and redundancy in high-dimensional data, and TOPSIS ensures interpretability and structured ranking. This eliminates dependency on extensive stakeholder inputs and improves computational efficiency, which is particularly advantageous in industrial contexts where procurement data is abundant but expert availability is limited. Specifically:
- (i)
It bridges the gap between traditional supplier selection approaches and intelligent decision support systems, integrating machine learning to automate weight generation and dimension reduction through RF and NMF, respectively.
- (ii)
The approach introduces an interpretable and systematic method of weighting supplier evaluation criteria using RF, addressing the subjective biases associated with expert-based methods.
- (iii)
The study further validates the application of tree-based machine learning models within MCDM frameworks, enhancing the interpretability and justification of decision processes.
- (iv)
By validating the proposed model against human expert decisions and comparing it to other established MCDM techniques, the study strengthens the theoretical reliability and replicability of hybrid evaluation methods.
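The dimension-reduction step referred to above can be sketched with scikit-learn's `NMF`. The example factorises a non-negative supplier-by-criteria matrix into eight components, mirroring the reduction from 24 criteria to 8 dimensions reported for the case study. The data are random placeholders, the record count and solver settings are assumptions, and reading off the highest-loading criterion per component is only one illustrative way to interpret the factors; it is not presented as the procedure followed in this study.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)

# Hypothetical data: 60 supplier records x 24 non-negative criterion scores.
X = rng.uniform(0.0, 1.0, size=(60, 24))

# Factorise X ~ W @ H with 8 latent dimensions.
nmf = NMF(n_components=8, init="nndsvda", max_iter=1000, random_state=0)
W = nmf.fit_transform(X)  # (60, 8): each record expressed in the 8 dimensions
H = nmf.components_       # (8, 24): loading of each criterion on each dimension

# Illustrative interpretation: the criterion with the largest loading per dimension.
dominant = H.argmax(axis=1)
for k, j in enumerate(dominant):
    print(f"dimension {k}: strongest criterion index {j}")
```

Because both factors are constrained to be non-negative, each dimension can be read as an additive combination of criteria, which helps keep the reduced representation interpretable.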
5.2. Managerial Implications
For practitioners in the pharmaceutical sector, this study offers practical and actionable guidance for improving supplier selection and procurement strategy:
- (i)
Efficiency and Cost Reduction: The hybrid model significantly reduces decision-making time and cost by automating data pre-processing, feature selection, and ranking through the integration of Non-negative Matrix Factorization and Random Forest. In the pharmaceutical case study, this automation simplified the evaluation of 24 complex supplier criteria into 8 key performance dimensions, enabling procurement teams to make faster, evidence-based decisions without compromising accuracy.
- (ii)
Sustainability-Oriented Decision Support: The model enables procurement managers to prioritise suppliers using multidimensional sustainability criteria, covering technical capability, environmental compliance, and social responsibility. This is especially critical in the pharmaceutical industry, where supplier quality directly affects regulatory compliance, patient safety, and sustainable production practices.
- (iii)
System Integration and Scalability: The framework is designed to be ERP-compatible and can be embedded into existing procurement systems such as SAP Ariba, Oracle SCM, or Microsoft Dynamics through API integration or data import modules. The model’s outputs (supplier rankings and performance weights) can be periodically updated using live procurement data, allowing for continuous monitoring and dynamic supplier performance evaluation.
- (iv)
Reducing Subjectivity and Enhancing Transparency: By using objective, data-driven weighting and ranking, the model minimises cognitive bias in supplier assessment. Procurement managers can rely on transparent, repeatable evaluation logic that aligns supplier performance with strategic and sustainability goals specific to regulated industries.
- (v)
Cross-Industry Applicability: While validated using pharmaceutical supplier data, the model’s modular architecture allows for easy adaptation to other sectors such as mining, manufacturing, and energy, where complex supplier networks and sustainability compliance are equally critical. This positions the framework as a versatile tool for organisations aiming to institutionalise intelligent, data-driven supplier management practices.
6. Conclusions
This study developed a hybrid supplier ranking and selection framework that integrates data-driven machine learning techniques with a structured multi-criteria decision-making approach. The model effectively addresses the long-standing challenges of identifying relevant evaluation criteria and assigning objective weights by automating these processes through dimension reduction and feature importance analysis using historical procurement data. This automation enhances efficiency, transparency, and reliability in supplier assessment, reducing dependence on subjective stakeholder inputs and time-consuming consensus-building procedures. The proposed framework was validated through a pharmaceutical sector case study, demonstrating its ability to produce supplier rankings consistent with both expert judgements and results from established MCDM methods. Importantly, the findings reveal that sustainable supplier selection extends beyond economic factors to include technical capability, managerial effectiveness, and compliance with environmental and social responsibility standards. By embedding sustainability criteria directly into the evaluation process, the model supports organisations in aligning procurement decisions with broader corporate sustainability and regulatory objectives. Overall, this study contributes a robust, interpretable, and sustainability-oriented decision-support tool for supplier evaluation. The hybrid model not only strengthens analytical rigour but also promotes responsible sourcing practices, offering a scalable framework adaptable to diverse industrial contexts such as pharmaceuticals, mining, and manufacturing.
Research Limitations and Future Direction
While the proposed hybrid supplier selection framework demonstrates strong potential, several limitations should be acknowledged when considering its wider application. First, since the model integrates machine learning techniques, it requires access to adequate and high-quality data. Its performance may be constrained in organisations with limited digitalisation or where data availability is restricted due to confidentiality or inconsistent recording practices. Second, the model is particularly beneficial in contexts involving many evaluation criteria, where feature selection and dimensionality reduction yield clear efficiency gains. In supplier selection settings with only a few criteria, the computational advantage of applying data-driven algorithms may be marginal.
Looking ahead, future research can be structured around four prioritised directions:
- 1
Cross-sectoral Validation:
Further case studies in sectors such as food, mining, automotive, and oil and gas are recommended to validate the model’s adaptability and performance across diverse supply chain environments.
- 2
System Integration and Practical Deployment:
Future work should explore how the hybrid model can be seamlessly embedded within enterprise procurement workflows, particularly through integration with ERP systems (e.g., SAP, Oracle, or Microsoft Dynamics). Developing the model as a service-based decision-support module would enhance its accessibility and enable continuous, automated supplier evaluation.
- 3
Expansion of Decision Scope:
Extending the framework to include supplier order allocation, logistics coordination, and delivery performance could transform it into a more comprehensive business intelligence tool for strategic sourcing and supply management.
- 4
Incorporation of Natural Language and Large Language Models (LLMs):
Future studies should investigate how natural language processing (NLP) and LLMs can complement traditional data-driven criteria by analysing unstructured information such as supplier reports, audits, and sustainability disclosures. While recent studies have highlighted the promise of NLP–MCDM integrations for text-based decision support, practical challenges remain. These include data privacy, model explainability, and ensuring contextual accuracy when interpreting qualitative supplier data. Addressing these challenges will be key to ensuring responsible and transparent adoption of LLM-assisted decision frameworks.
In summary, the proposed research trajectory prioritises validation, system integration, scope expansion, and intelligent automation. Collectively, these directions aim to evolve the current model into a robust, interpretable, and context-aware decision-support ecosystem for sustainable supplier selection across multiple industries.