Machine Learning in Reverse Logistics: A Systematic Literature Review

Silva, Abner Fernandes Souza da; Moris, Virginia Aparecida da Silva; Silva, João Eduardo Azevedo Ramos da; Voltarelli, Murilo Aparecido; Sigahi, Tiago F. A. C.

doi:10.3390/a18100650

Open Access Review

by Abner Fernandes Souza da Silva ¹, Virginia Aparecida da Silva Moris ¹, João Eduardo Azevedo Ramos da Silva ¹, Murilo Aparecido Voltarelli ² and Tiago F. A. C. Sigahi ¹,³,*

¹ Department of Production Engineering, Federal University of São Carlos, Sorocaba 18052-780, SP, Brazil

² Department of Production Engineering, Federal University of São Carlos, Lagoa do Sino 18290-000, SP, Brazil

³ Department of Production Engineering, Polytechnic School, University of São Paulo, São Paulo 18052-780, SP, Brazil

* Author to whom correspondence should be addressed.

Algorithms 2025, 18(10), 650; https://doi.org/10.3390/a18100650

Submission received: 5 September 2025 / Revised: 8 October 2025 / Accepted: 10 October 2025 / Published: 16 October 2025

(This article belongs to the Special Issue Artificial Intelligence in Sustainability Research Operations, Management, and Ecosystems)

Abstract

Reverse logistics (RL) plays a crucial role in promoting circularity and sustainability in supply chains, particularly in the face of increasing waste generation and growing environmental demands. In recent years, machine learning (ML) has emerged as a strategic tool to enhance processes, decision-making, and outcomes in RL. This article presents a systematic review of ML applications in reverse logistics, highlighting trends, challenges, and research opportunities. The analysis covers 52 articles retrieved from the Scopus and Web of Science databases, following the PRISMA protocol. The results show that the most frequently employed techniques are supervised models, followed by unsupervised methods and, to a lesser extent, reinforcement learning. The main ML applications in RL focus on return and waste generation forecasting, process optimization, classification, pricing, reliability assessments, and consumer behavior analysis. The studies examined predominantly use traditional evaluation metrics, such as MAPE and F1-score, while few consider multidimensional indicators encompassing long-term social or environmental impacts. Key challenges identified include data scarcity and quality, inherent uncertainties in reverse supply chains, and the high computational cost of models. This article also points to research gaps concerning metadata standardization, the absence of public benchmarks, model explainability, and the integration of ML with simulations and digital twins, indicating pathways toward more robust, transparent, and sustainable RL.

Keywords: reverse logistics; circular economy; sustainability; sustainable development; closed-loop supply chains; artificial intelligence; machine learning; digital transformation; Industry 4.0; Logistics 4.0

1. Introduction

Solid waste management is a global concern from environmental, social, and economic perspectives. Improper waste handling contributes to soil and water pollution and poses risks to public health and biodiversity [1]. Among the various waste streams generated in modern societies, waste electrical and electronic equipment (WEEE), construction and demolition waste, and healthcare waste stand out. These waste types present specific characteristics—such as potential toxicity, increasing volume, and treatment complexity—that exacerbate the challenges of proper disposal. If not properly managed, they can not only contaminate the environment but also lead to the loss of valuable secondary resources, such as metals and plastics, which could be reintroduced into production chains [2].

In this context, reverse logistics (RL) emerges as an essential component of waste management systems, as it encompasses the planning, implementation, and control of product, material, and information flows from the point of consumption back to the point of origin, with the aim of reuse, recycling, or proper final disposal [3]. RL is, therefore, fundamental to promoting circularity and sustainability in production processes, particularly considering growing regulatory requirements and socio-environmental responsibility.

However, the efficient operationalization of RL faces multiple challenges, including the dispersion of collection points, variability and uncertainty in the quantity and quality of returned materials, lack of consumer information, and difficulties in tracking throughout the supply chain [4]. These uncertainties require tools capable of enhancing planning, decision-making, and operational performance in reverse supply chains.

Artificial intelligence (AI) has emerged as a promising ally in addressing such challenges. AI-based tools, such as data analytics, optimization algorithms, image processing, and predictive analysis, have the potential to transform RL by increasing operational efficiency, traceability, and adaptability [5]. AI encompasses a wide range of techniques, including deep learning (DL), machine learning (ML), natural language processing, computer vision, and reinforcement learning (ReL). These technologies enable AI to perform functions traditionally associated with human intelligence, such as pattern recognition, predictive analysis, classification, process automation, and real-time decision-making.

Among the various branches of AI, machine learning (ML) stands out, with its primary goal being to enable computer systems to learn from data and improve their performance on specific tasks without explicit programming for each situation [6]. ML can be structured into three main approaches: supervised learning, unsupervised learning, and reinforcement learning.

In supervised learning, the model is trained with labeled data consisting of known input–output pairs, enabling the system to make predictions or classifications from new data [6]. In unsupervised learning, the goal is to identify patterns, latent structures, or clusters within unlabeled data, making it particularly useful for tasks such as segmentation and exploratory analysis. Finally, reinforcement learning is a process in which intelligent agents learn to make optimal decisions by interacting with the environment and receiving rewards or penalties for each action taken [7].
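The contrast between the first two paradigms can be illustrated with a minimal sketch. All data below are hypothetical and the function names are illustrative, not taken from any reviewed study: a least-squares line stands in for a supervised model trained on labeled input–output pairs, and a one-dimensional two-means routine stands in for an unsupervised method that groups unlabeled values.

```python
from statistics import mean

def fit_line(xs, ys):
    """Supervised: least-squares line fitted to labeled (x, y) pairs."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def two_means(values, iterations=10):
    """Unsupervised: split unlabeled 1-D values into two clusters (k-means, k=2)."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("need at least two distinct values")
    for _ in range(iterations):
        a = [v for v in values if abs(v - lo) <= abs(v - hi)]
        b = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo, hi = mean(a), mean(b)
    return sorted(a), sorted(b)

# Hypothetical labeled data: units sold (input) and units returned (output).
units_sold = [100, 200, 300, 400]
units_returned = [12, 22, 32, 42]
slope, intercept = fit_line(units_sold, units_returned)
predicted_returns = slope * 500 + intercept  # prediction for an unseen input

# Hypothetical unlabeled data: weights (kg) of returned items.
weights = [1.0, 1.2, 0.9, 8.0, 8.5, 7.9]
light, heavy = two_means(weights)
```

The supervised routine needs the known outputs to fit its parameters, whereas the clustering routine receives only the raw values and discovers the two groups on its own.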

Against this backdrop, the use of ML techniques has grown exponentially in reverse logistics, enabling advances in return forecasting, collection route optimization, automated waste classification, fraud detection, material quality assessment, and other applications [5]. Nevertheless, challenges and research gaps persist, requiring continuous scientific and technological development.

Some studies have explored the role of AI in RL, each from distinct angles and scopes. Oluleye, Chan and Antwi-Afari [8] focused on the construction sector, presenting a critical review that discussed challenges, potentials, and integrations of AI in the field; however, by limiting their scope to construction, they did not delve into methodological aspects specific to RL in a broader sense. Bhowmik et al. [5] adopted a macro perspective by conducting a bibliometric and network analysis on AI applied to RL, examining 2929 articles and mapping trends, authors, countries, and thematic clusters. Despite its broad scope, the study remained predominantly bibliometric, devoting less attention to methodological, practical, and content-related issues. In an even broader approach, Raut et al. [9] conducted a review on AI in the circular economy, identifying thematic clusters and proposing an implementation framework, with RL addressed only as one of the topics analyzed. In the context of WEEE, Xiong et al. performed a systematic review covering the entire recycling and reuse cycle—from dismantling to logistics and management—categorizing AI tasks, discussing applied technologies, and outlining future directions, although their scope extended beyond RL in the strict sense. Bhattacharya et al. [10] examined closed-loop supply chains (CLSC) through the lens of AI, building a taxonomy of techniques and a research agenda; however, their CLSC focus diluted the specific discussion on RL.

In this regard, the present article advances the field by proposing a critical and integrative analysis that goes beyond merely identifying trends or thematic clusters. Unlike previous studies, this work structures its investigation around research questions aimed at identifying and categorizing the ML techniques most frequently applied to RL processes, mapping business objectives and real-world use cases, detailing the evaluation metrics employed in the studies, examining the challenges and limitations encountered in applying these techniques, and, above all, highlighting the gaps that represent opportunities for future research. By adopting this approach, the article not only compiles and organizes existing knowledge but also fosters a critical discussion on the maturity and limitations of proposed solutions, offering guidance for advancing the field. In doing so, this study fills a gap by providing a comprehensive, critical, and application-oriented perspective on ML in RL, contributing to both researchers and practitioners seeking to enhance the efficiency and sustainability of reverse supply chains across different contexts.

Thus, this article seeks to address the following research questions (RQ):

RQ1: Which machine learning techniques are most frequently applied in reverse logistics processes?
RQ2: What are the main objectives for using ML in reverse logistics?
RQ3: What performance and validation metrics are reported in studies on ML applied to reverse logistics?
RQ4: What are the main challenges and limitations encountered in applying ML to reverse logistics?
RQ5: What methodological and technological gaps emerge from the literature, indicating opportunities for future research?

2. Materials and Methods

2.1. General Methodology

To address the research questions, a systematic literature review method was employed. This approach enables the rigorous management of accumulated scientific knowledge by following a clear methodology that ensures the study’s reliability [11]. Accordingly, this work was conducted based on the PRISMA protocol [12], which ensures transparency in the stages of search, selection, and data extraction, explicitly detailing what was done, what was found, and why the review was conducted.

For the development of this systematic review, two databases were selected: Scopus and Web of Science. These platforms rank among the main sources for bibliographic analysis due to their multidisciplinary coverage and the availability of specialized indexes that ensure precision, transparency, and control over the retrieved information [13]. Figure 1 presents the PRISMA 2020 flow diagram, which summarizes the study selection process. A total of 159 records were identified from Scopus and Web of Science. After removing 54 duplicates using Rayyan, 105 records remained for screening. At the title and abstract screening stage, 43 records were excluded for not meeting the eligibility criteria. The remaining 62 articles were sought in full text for eligibility assessment; however, 10 of them could not be accessed despite attempts through institutional libraries and direct author contact. No additional records were excluded at this stage, resulting in 52 studies included in the final review. The PRISMA checklist can be found in Table S1 of the Supplementary Material.

To structure the search, two initial constructs were adopted: “Machine Learning” and “Reverse Logistics.” An exploratory review was then conducted [14] to gain a deeper understanding of the topic and identify variants of these terms. Subsequently, VOSviewer 1.6.20, a tool that enables iterative analysis of article metadata, was used to map keywords related to the constructs through co-occurrence analysis [15,16]. Based on these findings, the search string to be applied in the databases was defined, as described in Table 1. The search was limited to peer-reviewed journal articles, with no restrictions applied regarding publication date. All searches were conducted up to July 2025.
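The idea behind keyword co-occurrence analysis can be sketched in a few lines. This is only an illustration under stated assumptions: the keyword lists are hypothetical, and the code counts raw pair frequencies, whereas VOSviewer additionally normalizes link strengths and clusters the resulting network.

```python
from collections import Counter
from itertools import combinations

# Hypothetical author-keyword lists, one per bibliographic record.
records = [
    ["machine learning", "reverse logistics", "forecasting"],
    ["deep learning", "reverse logistics", "waste classification"],
    ["machine learning", "closed-loop supply chain", "reverse logistics"],
]

def cooccurrence(keyword_lists):
    """Count how many records contain each unordered keyword pair."""
    pairs = Counter()
    for keywords in keyword_lists:
        # Deduplicate and sort so each pair has one canonical ordering.
        unique = sorted(set(k.lower() for k in keywords))
        pairs.update(combinations(unique, 2))
    return pairs

counts = cooccurrence(records)
top_pair, top_count = counts.most_common(1)[0]
```

Pairs with high counts point to terms that tend to appear together, which is the kind of signal used to decide which variants belong in a search string.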

In the searches conducted in Scopus, the TITLE-ABS-KEY field was used, which retrieves results from article titles, abstracts, and keywords, ensuring broad coverage without losing specificity. This means that an article would only be retrieved if the search terms appeared in at least one of these three areas, thereby increasing the precision and relevance of the results. In Web of Science, the TS field (Topic Search) was used, which searches titles, abstracts, author keywords, and Keywords Plus terms.

The search string structure also employed Boolean operators such as AND (to combine the ML and RL term sets) and OR (to include terminological variations and synonyms), as well as the asterisk (*) as a truncation operator to capture suffix variations (for example, learn* retrieves “learning,” “learners,” or “learned”).

The terms selected in the first part of the search string covered the following approaches:

Machine learning*: It refers to “machine learning” in general, including all variations and derivations of the term;
Deep learning: It covers deep learning techniques, an important subfield within ML;
Neural network*: It includes both artificial and deep neural networks, which are known for their ability to model complex relationships in data;
Supervised learning: It refers to supervised learning, where models are trained with labeled data;
Unsupervised learning: It refers to unsupervised learning, which is used to identify patterns without explicit labels in the data;
Reinforcement learning: It covers reinforcement learning techniques, which are relevant for optimization and decision-making in dynamic environments;
Decision tree*: It includes decision trees and their variants, which are common and interpretable ML methods;
Random forest*: It covers random forests, which are ensemble techniques based on multiple decision trees.

The second part of the search string covered the following:

Closed-loop supply chain and closed loop supply chain: They are hyphenated and unhyphenated variations referring to “closed-loop supply chains,” a concept associated with reverse logistics and the circular economy;
Reverse logistics and reverse supply chain: They cover “reverse logistics” and its variations.

Including all these variations ensured that the search would capture studies that might use different terminologies to address similar topics, thus maximizing coverage and comprehensiveness.
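The way the two term sets combine can be sketched as follows. This is a reconstruction for illustration only: the exact string is the one given in Table 1, the term lists below are assembled from the descriptions above (the unhyphenated "closed loop" variant is an assumption), and how each database treats a wildcard inside a quoted phrase should be checked against its own search documentation.

```python
# Term sets assembled from the list above; treat them as an approximation
# of Table 1, not as the authors' exact search string.
ML_TERMS = [
    "machine learning*", "deep learning", "neural network*",
    "supervised learning", "unsupervised learning",
    "reinforcement learning", "decision tree*", "random forest*",
]
RL_TERMS = [
    "closed-loop supply chain", "closed loop supply chain",  # second variant assumed
    "reverse logistics", "reverse supply chain",
]

def or_group(terms):
    """Join quoted terms with OR to cover synonyms and variants."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

def build_body(ml_terms, rl_terms):
    """Combine the two OR groups with AND so both constructs must match."""
    return f"{or_group(ml_terms)} AND {or_group(rl_terms)}"

body = build_body(ML_TERMS, RL_TERMS)
scopus_query = f"TITLE-ABS-KEY({body})"  # Scopus: title, abstract, keywords
wos_query = f"TS=({body})"               # Web of Science: topic search

print(scopus_query)
print(wos_query)
```

Keeping the term lists separate from the field wrapper makes it straightforward to run the same logical query in both databases and to extend either list in a later update of the review.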

For the selection process, searches in both databases were filtered to include only scientific articles. This initial search yielded 159 records. The metadata from these records were extracted, and duplicates were identified and removed using the Rayyan tool [17], resulting in 105 unique records. Titles and abstracts were then screened according to the following exclusion criteria: articles that did not focus on the application of ML in at least one stage of RL, articles that did not focus on RL, and literature review articles. Inclusion criteria required that studies (i) explicitly applied at least one ML model to reverse logistics processes and (ii) were published as peer-reviewed journal articles.

The selection process was conducted by a single reviewer, who screened all titles, abstracts, and full texts using Rayyan. The results of the selection were subsequently discussed and checked by the co-authors to ensure consistency with the eligibility criteria. Data extraction was carried out by one author using a standardized spreadsheet, and the collected information was independently reviewed and validated by the other co-authors to minimize errors and ensure accuracy.

Twenty articles were excluded under the first criterion, as they showed no clear evidence of ML application in RL. Thirteen were excluded under the second criterion for addressing topics related to traditional supply chains or other subjects outside the defined scope, without directly addressing RL. Additionally, ten articles were excluded for being literature reviews. As a result, only articles explicitly applying an ML model in some stage of RL were retained.

In total, 43 articles were excluded at the screening stage, leaving 62 for full-text analysis. Of these, 10 were removed because their full texts were unavailable, leaving 52 articles for the review. These documents were examined in full through content analysis [18], considering their main theme to extract data on the state of the art and to categorize each work according to its specific focus. This process was supported by a spreadsheet used to record relevant information such as authors, year of publication, methodology, results, conclusions, limitations, suggested future research, and country of origin. After this filtering, the selected studies were compiled with the aim of answering the research questions formulated.
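The selection counts reported above can be checked with simple arithmetic. The short script below uses only figures stated in the text; the variable names are illustrative.

```python
# Record counts as reported in the selection process.
identified = 159
duplicates_removed = 54
screened = identified - duplicates_removed           # records screened

# Exclusions at title/abstract screening, by criterion.
excluded_by_criterion = {
    "no clear ML application in RL": 20,
    "not focused on RL": 13,
    "literature review": 10,
}
excluded_at_screening = sum(excluded_by_criterion.values())

sought_full_text = screened - excluded_at_screening  # full texts sought
not_retrieved = 10
included = sought_full_text - not_retrieved          # studies in the review

print(screened, excluded_at_screening, sought_full_text, included)
```

Each intermediate value matches the figures in the text: 105 records screened, 43 excluded, 62 sought in full text, and 52 included.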

For each included study, information was extracted on the type of machine learning technique applied, the reverse logistics process or application addressed, the performance and validation metrics used, and the main findings (outcomes). In addition, bibliographic and contextual data were collected, including year of publication, country, sector or industry context, journal, and reported funding sources. These items were defined to enable both a methodological and content-oriented synthesis of the literature.

The search string was designed to balance conceptual breadth with methodological precision. It combines paradigm-level terms—such as supervised learning, unsupervised learning, and reinforcement learning—with two representative algorithmic families, decision trees and random forests, which are among the most frequently applied and interpretable ML techniques in logistics-related studies. This hybrid structure ensured broad coverage of the main learning paradigms while capturing well-established algorithmic approaches without overextending the search scope. Specific models such as Support Vector Machines (SVM), Gradient Boosting, or k-nearest neighbors were not individually listed to maintain conceptual focus and avoid excessive fragmentation of results. This design aligns with PRISMA’s emphasis on conceptual inclusiveness and methodological consistency. Nonetheless, we acknowledge that this decision may have excluded some domain-specific studies, representing a limitation to be addressed in future reviews through a more algorithm-explicit search expansion.

Regarding article availability, for the ten studies whose full texts were inaccessible, retrieval was attempted through institutional libraries and direct contact with the authors. Despite these efforts, the papers remained unavailable. Their exclusion may have slightly limited the representation of niche or regional applications.

Although the search strategy was limited to the Scopus and Web of Science databases, this decision aimed to ensure methodological consistency and quality control by focusing exclusively on peer-reviewed journal articles. While this approach may have excluded relevant computer science conference papers—such as those published in NeurIPS, ICML, or KDD—it aligns with the objectives of this review, which emphasize methodological rigor and reproducibility rather than coverage breadth. Future studies may expand the search scope to include leading conference proceedings to capture cutting-edge algorithmic developments and implementation trends in reverse logistics.

2.2. Quality Assessment

To ensure methodological rigor and transparency, a quality assessment of the 52 included studies was conducted following four criteria: (i) clarity of research design, (ii) data transparency, (iii) model validation procedure, and (iv) completeness of results reporting. Each criterion was evaluated using a three-level scale (low, moderate, high). Most studies exhibited moderate to high levels of methodological clarity and model validation, while a smaller subset showed limited transparency in data description. This assessment confirmed that the selected literature met the minimum standards for inclusion, thereby increasing the reliability of the synthesized findings.

3. Results and Discussion

3.1. Bibliometric Analysis

The scientific output on the topic shows a notable geographical distribution, as illustrated in Figure 2. China stands out as the leading research hub, accounting for 16% of the published articles, followed by Iran with 12%. The United States, the United Kingdom, and India also rank among the most productive countries, although with slightly lower shares compared to the two leaders. This scenario indicates that, in certain nations, the topic has already become a significant concern in the academic agenda. Conversely, countries such as Brazil, Australia, Denmark, and others still show relatively modest publication levels, highlighting both opportunities and the need to foster research in these contexts.

Regarding the temporal distribution, there has been a significant increase in the number of publications over the years, as illustrated in Figure 3. This growth has been particularly pronounced since 2020, reflecting the emerging nature of the topic. The year 2024 recorded the highest volume of works to date, marking a peak in research activity.

The review was completed in July 2025, ensuring that the most recent publications were captured up to this date. This timing reinforces the contemporaneity of the analysis, as several studies published in 2025 were already indexed and included. Minor adjustments were made to ensure temporal consistency across the text and figures, thereby maintaining alignment between the reported period of data collection and the latest referenced works.

The reviewed studies span from 2007 to 2025, with a noticeable surge after 2020, reflecting the accelerating intersection between sustainability and artificial intelligence (AI). The predominance of research in China and Iran can be partially attributed to national initiatives promoting circular economy and AI-driven industry, such as China’s 14th Five-Year Plan emphasizing green manufacturing and Iran’s growing recycling and waste management programs. These institutional incentives appear to have fostered scientific output in machine learning for reverse logistics.

Regarding journals, as shown in Figure 4, Annals of Operations Research stands out, followed by Sustainability. Other journals also show relevant publication numbers, although to a lesser extent. Additionally, there is a considerable group of journals that contributed only one article each, indicating the dispersion of the topic across different fields of knowledge. It is also noteworthy that the most representative journals focus on both computing and sustainability, highlighting the interdisciplinary nature and the intersection between technology and environmental issues that underpin the discussion of ML applied to RL.

3.2. Content Analysis

The content analysis was organized according to the proposed research questions.

3.2.1. RQ1: Which ML Techniques Are Most Frequently Applied in RL Processes?

The machine learning techniques most frequently applied in reverse logistics vary according to the specific objective of each study. Overall, supervised learning models remain predominant (approximately 40%), followed closely by unsupervised learning methods (around 36%), while reinforcement learning accounts for roughly 15% of applications as shown in Figure 5. This confirms that supervised and unsupervised paradigms form the methodological backbone of current research, whereas reinforcement learning, though less common, is steadily gaining traction—particularly in dynamic optimization and routing contexts.

Among supervised approaches, artificial neural networks (ANN) dominate, especially in forecasting tasks where non-linear relationships are common. Decision trees and random forests are also employed for classification and decision support. Unsupervised methods, such as clustering and dimensionality reduction, are increasingly used for data segmentation and exploratory analysis. Reinforcement learning stands out as a growing area for solving adaptive logistics problems under uncertainty, supported by recent advances in simulation and digital twin environments.

Decision tree methods, in turn, are prominent in classification tasks, such as identifying waste categories or prioritizing alternatives in decision-making processes. In contexts involving uncertainty, fuzzy methods are used to address variability, contributing to greater robustness of the developed models. This landscape aligns with global trends that highlight the advancement of ANN-based techniques, especially in applications involving large volumes of data and high complexity, such as automated waste classification and forecasting of return flows [5,8]. Figure 6 presents the distribution of the different models used in the analyzed articles. It is also worth noting that many studies do not clearly specify the model employed, which undermines the transparency and replicability of results.

An important observation is that nearly one-third of the studies did not explicitly state the type of ML model employed and were therefore classified as “unspecified.” In most of these cases, the papers referred to generic “predictive models” or “machine learning algorithms” without further detail. This lack of transparency hampers replicability and comparative evaluation, highlighting the urgent need for clearer reporting standards and model documentation practices in ML-based reverse logistics research.

3.2.2. RQ2: What Are the Main Objectives for Using ML in Reverse Logistics?

From the reading and analysis of the 52 included articles, a recurring pattern in the literature on ML and RL was identified. Based on this, the articles were classified into categories according to how ML is applied within the RL context. Table 2 presents these categories along with the corresponding articles.

The dominance of forecasting applications can be attributed to the intrinsic uncertainty of reverse logistics flows, where predicting product returns or waste generation is essential for planning. ANN-based models are particularly effective due to their ability to capture non-linear patterns and temporal dependencies, explaining their prevalence in predictive studies. In contrast, reinforcement learning techniques are less frequent but offer high potential in adaptive decision-making, such as optimizing collection routes or remanufacturing schedules, where feedback from dynamic environments can be leveraged for continuous improvement. This contrast illustrates the dual evolution of ML in RL: from predictive analytics toward prescriptive and autonomous learning.

A total of six main categories were identified. The most representative is forecasting, comprising 10 articles. In this category, studies seek to leverage the predictive capabilities of ML, proposing models to forecast parameters relevant to RL. For example, Fernandes de Souza et al. [22] developed a model to predict end-of-life vehicle generation, combining ARIMA and ANN models and achieving superior performance compared to other methods. Bittencourt et al. [20] proposed an ANN model to forecast the generation of end-of-life tires, while Temur and Bolat [29] used ANN to predict WEEE generation. Thus, the studies in this category emphasize the potential of ML in predictive tasks.

The second most frequent category is optimization. Articles in this group employ ML to support the search for optimal solutions to optimization problems. For instance, Gutierrez-Franco, Mejia-Argueta and Rabelo [33] used deep reinforcement learning to train logistics policies in combination with optimization and simulation methods; Achamrah et al. [30] applied reinforcement learning for route selection; and Wang et al. [37] used reinforcement learning for inventory control in CLSCs. Overall, reinforcement learning predominates as a tool to support optimization.
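To make the reinforcement learning mechanism concrete, the following is a minimal tabular Q-learning sketch on a toy problem. It does not reproduce any of the cited studies: the four-stop line, the reward values, and the learning parameters are all hypothetical, and exhaustive sweeps over every state-action pair replace the random exploration that is typical in practice, so that the result is deterministic.

```python
N_STATES = 4        # stops 0..3 on a line; stop 3 is the depot (terminal)
ACTIONS = (0, 1)    # 0 = step back, 1 = step forward
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (assumed values)

def step(state, action):
    """Deterministic transition: returns (next_state, reward)."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    # Hypothetical rewards: +10 on reaching the depot, -1 travel cost otherwise.
    reward = 10.0 if nxt == N_STATES - 1 else -1.0
    return nxt, reward

def train(sweeps=500):
    """Apply the Q-learning update to every (state, action) pair per sweep."""
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(sweeps):
        for s in range(N_STATES - 1):       # the terminal state is never updated
            for a in ACTIONS:
                nxt, r = step(s, a)
                target = r + GAMMA * max(q[nxt])
                q[s][a] += ALPHA * (target - q[s][a])
    return q

q_table = train()
# Greedy policy: the action with the highest learned value in each state.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

The learned policy moves forward toward the depot from every stop, because the discounted reward of reaching it outweighs the per-step travel cost. Realistic routing or inventory problems have far larger state spaces, which is one reason the reviewed studies turn to deep reinforcement learning.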

The classification category includes studies that use ML to support decision-making involving multiple criteria, such as alternative selection, risk prioritization [47,49,51], or waste classification from images [48].

The reliability category covers works focused on predicting equipment failures and repair needs [39,40,42], as well as estimating component lifetimes [41].

The pricing category comprises research aimed at determining the value of second-hand products, applying ML models to price used items [43,44,45,46].

Finally, the consumer category includes studies that assess consumer behavior using data from social networks and questionnaires to understand product return patterns and attitudes toward reverse logistics.

Overall, the application of ML in RL is heavily concentrated in forecasting, followed by optimization and classification, along with applications in reliability, pricing, and consumer analysis. These findings reinforce the central role of forecasting (e.g., demand prediction, product returns, waste generation) as one of the main challenges in RL, as highlighted by Raut et al. [9] and Oluleye et al. [8]. The growing interest in optimization reflects the pursuit of more efficient routes, cost reduction, and better resource allocation—classic logistics topics that gain greater complexity in reverse supply chains due to uncertainty and variability. The use of ML for classification and consumer analysis signals a recent expansion, with advances in return segmentation, opinion analysis, and second-hand product pricing, indicating the maturation and diversification of research objectives in the field.

3.2.3. RQ3: What Performance and Validation Metrics Are Reported in Studies on ML Applied to RL?

Regarding evaluation metrics, there is a predominance of traditional ML indicators, such as MAPE, MAD, and MSE (error measures) for forecasting tasks, as well as Accuracy and F1-score for classification, in addition to metrics such as total cost and emissions for assessing operational and environmental impacts. This pattern is consistent with what has been observed in other domains of supply chain analytics, as noted by Boujarif et al. [39], Rezaei Zeynali et al. [51] and Fernandes de Souza et al. [22].
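For reference, the error and classification metrics named above can be computed as in the sketch below. The formulas are the standard textbook definitions, with MAD taken here as the mean absolute deviation between actual and forecast values (an assumption, since the reviewed studies may define it differently); the sample data are hypothetical.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

def mad(actual, forecast):
    """Mean absolute deviation between actual and forecast values."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mse(actual, forecast):
    """Mean squared error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # avoids division by zero when there are no true positives
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical forecasting example: returned units per period.
actual = [100, 200, 400]
forecast = [110, 180, 400]

# Hypothetical classification example: 1 = recyclable, 0 = not recyclable.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]

print(mape(actual, forecast), mad(actual, forecast), mse(actual, forecast))
print(accuracy(y_true, y_pred), f1_score(y_true, y_pred))
```

MAPE is scale-free and therefore convenient for comparing forecasts across waste streams of different volumes, while MAD and MSE are expressed in the units (or squared units) of the forecast variable.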

However, a notable gap exists: few studies propose or employ metrics specifically designed to assess social impacts, long-term environmental effects, or the systemic performance of reverse supply chains. Most works focus on conventional indicators, often limiting the analysis to isolated aspects such as model accuracy or immediate costs. This limitation may compromise the holistic evaluation of implemented solutions, as RL is intrinsically linked to promoting the circular economy, social responsibility, and environmental sustainability [56]. Consequently, the absence of metrics that capture social externalities and cumulative environmental impacts can lead to an incomplete understanding of the true potential of ML-based solutions.

Therefore, there is a clear need to advance the development, adoption, and standardization of multidimensional metrics that integrate economic, social, and environmental dimensions in the assessment of proposed solutions. Such metrics should be aligned with the principles of the circular economy and long-term sustainability, enabling researchers and practitioners to identify trade-offs and synergies among different objectives. Furthermore, the inclusion of systemic indicators—such as resilience, circularity, or stakeholder integration—can contribute to a more realistic evaluation, better aligned with contemporary challenges in reverse supply chains.

3.2.4. RQ4: What Are the Main Challenges and Limitations Encountered in Applying ML to RL?

Table 3 presents the main challenges identified in applying machine learning techniques to reverse logistics processes, as discussed in the recent literature. Among the most recurrent challenges, data scarcity and quality stand out. Due to the irregular, heterogeneous, and often non-traceable nature of return flows, as well as the dispersion of historical data across different systems (ERP, carriers, sorting centers, and 3PL providers), data gaps and inconsistent formats are common. These limitations negatively affect the reliability of models, which may end up learning weak patterns, and also demand significant effort in data cleaning and standardization [19,27].

Another major challenge relates to uncertainties, such as abrupt changes in operational reality, which can drastically alter the behavior of the variables under analysis. Such changes may render predictive models rapidly obsolete, with forecast errors exceeding 50% in some cases, and undermine the efficiency of routing and planning strategies based on historical data [22].
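A lightweight safeguard against this kind of obsolescence is to track forecast error over a rolling window and flag the model for retraining once the error crosses a tolerance. The sketch below is one possible arrangement, not a method taken from the reviewed studies; the class name, the window length, and the data are assumptions, and the 50% threshold simply mirrors the error level mentioned above.

```python
from collections import deque


class RollingMapeMonitor:
    """Flags drift when the MAPE over the last `window` observations exceeds a threshold."""

    def __init__(self, window=4, threshold_pct=50.0):
        self.errors = deque(maxlen=window)  # most recent absolute percentage errors
        self.threshold_pct = threshold_pct

    def rolling_mape(self):
        return sum(self.errors) / len(self.errors) if self.errors else 0.0

    def is_drifting(self):
        # Only judge once the window is full, to avoid reacting to a single outlier.
        window_full = len(self.errors) == self.errors.maxlen
        return window_full and self.rolling_mape() > self.threshold_pct

    def update(self, actual, forecast):
        """Record one observation and return the current drift flag."""
        if actual != 0:  # the percentage error is undefined for zero actuals
            self.errors.append(abs(actual - forecast) / abs(actual) * 100.0)
        return self.is_drifting()


# Hypothetical series: return volumes collapse after period 4 while a stale
# model keeps forecasting the old level of 100 units.
actuals = [100, 105, 98, 102, 40, 35, 30, 28]
forecasts = [100] * len(actuals)

monitor = RollingMapeMonitor(window=4, threshold_pct=50.0)
flags = [monitor.update(a, f) for a, f in zip(actuals, forecasts)]
print(flags)
```

The window length trades responsiveness against false alarms: a shorter window reacts sooner to a structural break but is more easily triggered by isolated outliers.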

Finally, the high computational cost associated with implementing hybrid and robust models is also noteworthy. In cases requiring substantial processing capacity, execution times may extend from minutes to several hours or even days, rendering them impractical for daily decision-making applications. Furthermore, the high energy consumption and cloud resource usage increase the operational cost of proposed solutions [51].

These challenges directly underpin the research gaps discussed in the following section. For instance, data scarcity and uncertainty motivate the development of standardized metadata and real-time data pipelines, while high computational costs highlight the need for lightweight, energy-efficient hybrid models. Bridging these issues represents a necessary step toward robust, transparent, and sustainable ML applications in reverse logistics.

3.2.5. RQ5: What Methodological and Technological Gaps Emerge from the Literature, Indicating Opportunities for Future Research?

Table 4 summarizes the main research gaps currently identified in the literature on ML applications in RL, along with suggested directions and promising lines of investigation to advance the field.

Among the most notable gaps is the explicit modeling of uncertainties. Although some approaches address uncertainty through fuzzy methods or grey systems, in-depth discussions on the impact of these uncertainties on ML models applied to RL remain rare. Progress in this area requires systematic comparison of different methods under uncertain scenarios, thereby strengthening the robustness of proposed solutions.
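Alongside fuzzy and grey approaches, Bayesian inference (listed under Gap 1 of this section) is one way to make uncertainty explicit. The sketch below is a minimal illustration under stated assumptions: a Beta-Binomial model for a product return rate, a uniform prior, and hypothetical sales and return counts. The credible interval is estimated by Monte Carlo sampling from the standard library rather than by an exact quantile function.

```python
import random
from math import sqrt


def beta_posterior(returned, sold, prior_a=1.0, prior_b=1.0):
    """Posterior Beta(a, b) parameters after observing `returned` items out of `sold`."""
    return prior_a + returned, prior_b + (sold - returned)


def credible_interval(a, b, level=0.95, draws=20000, seed=0):
    """Monte Carlo estimate of a central credible interval for Beta(a, b)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    tail = (1.0 - level) / 2.0
    lower = samples[int(tail * draws)]
    upper = samples[int((1.0 - tail) * draws) - 1]
    return lower, upper


# Hypothetical observation: 30 of 200 units sold were returned.
a, b = beta_posterior(returned=30, sold=200)
post_mean = a / (a + b)
post_std = sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
low, high = credible_interval(a, b)

print(f"posterior mean return rate: {post_mean:.3f} (std {post_std:.3f})")
print(f"95% credible interval: [{low:.3f}, {high:.3f}]")
```

Reporting the interval rather than a point estimate lets downstream planning or routing models be tested against a plausible range of return rates instead of a single value.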

Another recurring gap is the lack of real-time pipelines. Most reported prototypes and studies operate on historical data, failing to leverage the potential of real-time data collection and analysis, which limits the applicability of solutions in dynamic environments. The use of digital twins emerges as a promising line of research to address this challenge.

The topic of hybrid ML also deserves attention, as few studies combine supervised, reinforcement, and unsupervised models within a single architecture—an approach that could enhance the ability to handle multiple data types and complex tasks.

It is also recommended to adopt formal metadata standards, such as schema.org (a standardized vocabulary for describing data on the web) and JSON-LD (a lightweight encoding format that facilitates data integration and reuse), to ensure greater interoperability and transparency of RL datasets. Additionally, creating dataset cards specifically for RL could document the origin, processing, and main characteristics of datasets, promoting standardization and facilitating data sharing among researchers and practitioners.
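To make the recommendation concrete, a dataset description using the schema.org `Dataset` type can be serialized as JSON-LD with a few lines of code. The example below is a hypothetical sketch: the dataset name, organization, URL, and variables are invented for illustration, and only a small subset of the schema.org properties is shown.

```python
import json

# Hypothetical metadata for a reverse-logistics dataset (all values are placeholders).
dataset_description = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example product returns dataset",
    "description": (
        "Illustrative record of returned items, including return date, "
        "product category, and assessed condition."
    ),
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "creator": {"@type": "Organization", "name": "Example Recycling Consortium"},
    "temporalCoverage": "2023-01-01/2023-12-31",
    "variableMeasured": [
        {"@type": "PropertyValue", "name": "return_date", "description": "Date the item was received"},
        {"@type": "PropertyValue", "name": "product_category", "description": "Category of the returned item"},
        {"@type": "PropertyValue", "name": "condition_grade", "description": "Assessed condition on receipt"},
    ],
    "distribution": {
        "@type": "DataDownload",
        "encodingFormat": "text/csv",
        "contentUrl": "https://example.org/returns.csv",
    },
}

jsonld_text = json.dumps(dataset_description, indent=2, ensure_ascii=False)
print(jsonld_text)
```

A dataset card would typically complement such machine-readable metadata with free-text notes on data origin, cleaning steps, and known limitations.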

The lack of public, multi-scale benchmarks is another significant barrier to RL advancements. Unlike well-established areas of ML, where public datasets allow for standardized method comparisons, reverse logistics studies often develop proprietary datasets, hindering comparative evaluation. The creation of open repositories, similar to the E-waste Monitor, could help overcome this limitation, promoting standardization and methodological progress in the field.

Another emerging topic is explainable AI (XAI) applied to RL. Few works have explored interpretability methods designed to increase transparency and understanding of ML results, such as SHAP—which quantifies the influence of each variable on predictions—and visual analytics in operational dashboards, which leverage interactive visualizations to facilitate analysis and decision-making.
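SHAP builds on Shapley values from cooperative game theory. To illustrate the underlying idea without relying on the SHAP library, the sketch below computes exact Shapley values by enumerating feature subsets, replacing "absent" features with a baseline value. The toy prediction function, feature meanings, and numbers are hypothetical; exact enumeration is only feasible for a handful of features, whereas SHAP uses approximations and model-specific algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline) to each feature."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their observed value, the rest the baseline value.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                with_i = value(set(s) | {i})
                without_i = value(set(s))
                phi += weight * (with_i - without_i)
        contributions.append(phi)
    return contributions


def toy_return_model(z):
    """Hypothetical predictor: two additive effects plus one interaction term."""
    promo_intensity, product_age, defect_rate = z
    return 2 * promo_intensity + 3 * product_age + promo_intensity * defect_rate


x = [1.0, 2.0, 4.0]         # hypothetical observation to explain
baseline = [0.0, 0.0, 0.0]  # hypothetical reference point

phi = shapley_values(toy_return_model, x, baseline)
print(phi)
print(sum(phi), toy_return_model(x) - toy_return_model(baseline))
```

The attributions sum to the difference between the prediction and the baseline prediction, and the interaction term is split equally between the two features involved, which is the property that makes such values suitable for display in an operational dashboard.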

Future research could explore co-simulation frameworks—which enable the simultaneous and coordinated execution of different models (e.g., integrating discrete event simulation with machine learning and optimization)—as well as federated integration, which connects multiple systems or platforms in a distributed and collaborative manner while preserving autonomy and data privacy among stakeholders. Furthermore, adopting digital twins could operationalize real-time data flows and decision-making, enhancing the flexibility and intelligence of reverse supply chains.

Based on the challenges identified, several methodological and technological gaps have been consolidated and structured to facilitate targeted research advancement. Each gap is described below, accompanied by its significance and possible future directions.

Gap 1—Explicit Uncertainty Modeling: Few studies systematically address uncertainty propagation in ML models applied to RL. Future research should compare alternative approaches—such as fuzzy logic, grey systems, and Bayesian inference—to evaluate robustness under uncertain demand and return conditions.

Gap 2—Real-Time Pipelines: Most prototypes rely on historical datasets. Incorporating real-time data acquisition and analysis through digital twin architectures could enable adaptive and resilient decision-making.

Gap 3—Hybrid ML Architectures: Integration of supervised, unsupervised, and reinforcement learning in unified frameworks remains limited. Combining them could enhance both predictive accuracy and adaptability.

Gap 4—Metadata Standards and FAIR Data: Lack of dataset transparency persists. The adoption of schema.org, JSON-LD, and dataset cards for RL would improve interoperability and reproducibility.

Gap 5—Public Benchmarks: The absence of open, multi-scale datasets prevents cross-comparison. Establishing standardized repositories akin to the E-waste Monitor would advance the field.

Gap 6—Explainable AI (XAI): Interpretability is critical for managerial trust. Applying SHAP or visual analytics to operational dashboards could foster understanding of model behavior.

Gap 7—Deep ML and Simulation Integration: Current ML–simulation couplings remain sequential. Developing co-simulation and federated digital twin frameworks can bridge predictive and operational layers in real time.

4. Conclusions

This study presented a comprehensive overview of the application of ML techniques in RL, highlighting advancements, challenges, and research opportunities. The analysis revealed that the most frequent approaches focus on return and waste generation forecasting, process optimization, alternative classification, pricing, reliability, and consumer behavior analysis. Among the most commonly used techniques are ANN, reinforcement learning methods, DL, and, more recently, tree-based algorithms.

Despite the progress observed, significant challenges remain, particularly regarding the scarcity and quality of available data, the presence of uncertainties, and the high computational cost of robust models. Another critical issue is the predominance of traditional evaluation metrics, with little attention given to multidimensional indicators that encompass social and long-term environmental aspects, as well as the systemic performance of reverse supply chains. The gap analysis points to the need for advances in explicit uncertainty modeling, metadata standardization (including the adoption of schema.org, JSON-LD, and dataset cards), the creation of public benchmarks, and deeper integration between ML, simulation, and digital twin frameworks.

Promising opportunities for future research include the development of real-time pipelines, the use of co-simulation between discrete event simulations and reinforcement learning, and the strengthening of explainability through methods such as SHAP applied to optimization models and visual analytics in operational dashboards. Such advances are essential for promoting more robust, transparent solutions aligned with the principles of the circular economy and sustainability.

In summary, consolidating ML in reverse logistics requires joint efforts to overcome technical challenges, create open standards and benchmarks, and promote metrics that reflect the real impact of implemented solutions. Integrating these guidelines can drive not only scientific progress but also the practical and sustainable application of ML in reverse supply chains, contributing to more efficient, responsible, and innovative waste management within the circular economy context.

The findings of this systematic review advance both theoretical and practical understanding at the intersection of ML and RL. By compiling and analyzing the main techniques, objectives, metrics, and challenges reported in the literature, this article provides an up-to-date overview of the field’s maturity stage, explicitly identifying methodological gaps and categorizing different types of studies and their levels of maturity. Mapping predominant approaches and identifying future opportunities contribute to a more structured research agenda, guiding future investigations capable of comparing and testing methods across different RL contexts.

From a practical perspective, the results of this study offer valuable insights for practitioners, organizations, and policymakers. The identification of the most used techniques and key challenges can help managers select and implement ML solutions suited to their needs, while also highlighting operational barriers and risks that may affect project success. The emphasis on system integration, data quality, and the adoption of metrics aligned with sustainability goals supports strategic decision-making and investment in reverse supply chains. Furthermore, by pointing to technological trends such as real-time pipelines, digital twins, and XAI applications, the article signals transformations that can enhance operational efficiency, transparency, and sustainability in logistics operations, delivering economic, environmental, and social benefits.

As a limitation, this study was restricted to articles indexed in Scopus and Web of Science, which may have led to the exclusion of relevant works from other sources. In addition, the dynamic and innovative nature of the field means that developments not yet documented at the time of this analysis may have emerged since.

No formal risk of bias assessment tool was applied to the included studies, as the objective of this review was to provide a descriptive synthesis of machine learning applications in reverse logistics. However, to reduce bias during the review process itself, the screening and data extraction performed by one author were subsequently reviewed and validated by the co-authors to ensure consistency with the eligibility criteria.

Based on the synthesized findings, this review establishes a targeted research agenda derived from the observed gaps and trends. Future investigations should prioritize:

(i) Validation of hybrid ML–RL pipelines under uncertain and dynamic conditions;
(ii) Development of FAIR-compliant and open-access RL datasets with standardized metadata;
(iii) Integration of explainable AI methods to enhance interpretability and managerial trust;
(iv) Implementation of co-simulation frameworks combining ML, simulations, and digital twins to enable real-time decision-making.

These directions, rooted in the empirical gaps identified in this review, aim to strengthen the robustness, transparency, and sustainability of ML applications in reverse logistics.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/a18100650/s1. Table S1: PRISMA Checklist.

Author Contributions

Conceptualization, A.F.S.d.S., V.A.d.S.M. and J.E.A.R.d.S.; methodology, A.F.S.d.S., V.A.d.S.M. and J.E.A.R.d.S.; software, A.F.S.d.S.; validation, A.F.S.d.S., V.A.d.S.M., J.E.A.R.d.S., T.F.A.C.S. and M.A.V.; formal analysis, A.F.S.d.S., V.A.d.S.M. and J.E.A.R.d.S.; investigation, A.F.S.d.S., V.A.d.S.M. and J.E.A.R.d.S.; resources, T.F.A.C.S.; data curation, A.F.S.d.S., V.A.d.S.M., J.E.A.R.d.S., T.F.A.C.S. and M.A.V.; writing—original draft preparation, A.F.S.d.S., V.A.d.S.M., J.E.A.R.d.S., T.F.A.C.S. and M.A.V.; writing—review and editing, A.F.S.d.S., V.A.d.S.M., J.E.A.R.d.S., T.F.A.C.S. and M.A.V.; visualization, A.F.S.d.S., V.A.d.S.M., J.E.A.R.d.S. and T.F.A.C.S.; supervision, T.F.A.C.S. and M.A.V.; project administration, A.F.S.d.S., V.A.d.S.M. and J.E.A.R.d.S.; funding acquisition, T.F.A.C.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Coordination for the Improvement of Higher Education Personnel, grant number 88887.967130/2024-00.

Data Availability Statement

The dataset supporting this paper is available from the authors on request.

Acknowledgments

During the preparation of this manuscript, the authors used ChatGPT-5 for the purposes of proofreading. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. United Nations Environment Programme. UNEP. Beyond an Age of Waste: Turning Rubbish into a Resource: Global Waste Management Outlook; UNEP: Nairobi, Kenya, 2024; ISBN 978-92-807-4129-2.
2. Baldé, C.P.; Kuehr, R.; Yamamoto, T.; McDonald, R.; Althaf, S.; Bel, G.; Deubzer, O.; Fernandez-Cubillo, E.; Forti, V.; Gray, V.; et al. Global E-waste Monitor 2024; International Telecommunication Union (ITU) and United Nations Institute for Training and Research (UNITAR): Geneva, Switzerland; Bonn, Germany, 2024; pp. 1–150.
3. Rogers, D.; Tibben-Lembke, R. Going Backwards: Reverse Logistics Trends and Practices; Reverse Logistics Executive Council: Reno, NV, USA; University of Nevada: Reno, NV, USA, 1998.
4. Sonar, H.; Dey Sarkar, B.; Joshi, P.; Ghag, N.; Choubey, V.; Jagtap, S. Navigating Barriers to Reverse Logistics Adoption in Circular Economy: An Integrated Approach for Sustainable Development. Clean. Logist. Supply Chain 2024, 12, 100165.
5. Bhowmik, O.; Chowdhury, S.; Ashik, J.H.; Mahmud, G.I.; Khan, M.M.; Hossain, N.U.I. Application of Artificial Intelligence in Reverse Logistics: A Bibliometric and Network Analysis. Supply Chain Anal. 2024, 7, 100076.
6. Sahu, S.K.; Mokhade, A.; Bokde, N.D. An Overview of Machine Learning, Deep Learning, and Reinforcement Learning-Based Techniques in Quantitative Finance: Recent Progress and Challenges. Appl. Sci. 2023, 13, 1956.
7. Vidyasagar, M. A Tutorial Introduction to Reinforcement Learning. SICE J. Control Meas. Syst. Integr. 2023, 16, 172–191.
8. Oluleye, B.I.; Chan, D.W.M.; Antwi-Afari, P. Adopting Artificial Intelligence for Enhancing the Implementation of Systemic Circularity in the Construction Industry: A Critical Review. Sustain. Prod. Consum. 2023, 35, 509–524.
9. Raut, S.; Hossain, N.U.I.; Kouhizadeh, M.; Fazio, S.A. Application of Artificial Intelligence in Circular Economy: A Critical Analysis of the Current Research. Sustain. Futur. 2025, 9, 100784.
10. Bhattacharya, S.; Govindan, K.; Ghosh Dastidar, S.; Sharma, P. Applications of Artificial Intelligence in Closed-Loop Supply Chains: Systematic Literature Review and Future Research Agenda. Transp. Res. Part E Logist. Transp. Rev. 2024, 184, 103455.
11. Tranfield, D.; Denyer, D.; Smart, P. Towards a Methodology for Developing Evidence-Informed Management Knowledge by Means of Systematic Review. Br. J. Manag. 2003, 14, 207–222.
12. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 Statement: An Updated Guideline for Reporting Systematic Reviews. Syst. Rev. 2021, 10, 89.
13. Pranckutė, R. Web of Science (WoS) and Scopus: The Titans of Bibliographic Information in Today’s Academic World. Publications 2021, 9, 12.
14. Gil, A.C. Como Elaborar Projetos de Pesquisa, 6th ed.; Atlas: São Paulo, Brazil, 2008.
15. De Medeiros Filho, A.R.; Russo, S.L. Trademarks as an Indicator: Systematic Review and Bibliometric Analysis of Literature. Biblios 2018, 71, 50–67.
16. Van Eck, N.J.; Waltman, L. Software Survey: VOSviewer, a Computer Program for Bibliometric Mapping. Scientometrics 2010, 84, 523–538.
17. Ouzzani, M.; Hammady, H.; Fedorowicz, Z.; Elmagarmid, A. Rayyan-a Web and Mobile App for Systematic Reviews. Syst. Rev. 2016, 5, 210.
18. Bardin, L. Análise de Conteúdo; Edições 70: São Paulo, Brazil, 2011.
19. Ashtab, S.; Tosarkani, B.M. Scenario-Based Multi-Objective Optimisation Model Based on Supervised Machine Learning to Configure a Plastic Closed-Loop Supply Chain Network. Int. J. Bus. Perform. Supply Chain Model. 2023, 14, 106.
20. Bittencourt, E.S.; Fontes, C.H.D.O.; Moya Rodriguez, J.L.; Filho, S.Á.; Ferreira, A.M.S. Forecasting of the Unknown End-of-Life Tire Flow for Control and Decision Making in Urban Solid Waste Management: A Case Study. Waste Manag. Res. J. Sustain. Circ. Econ. 2020, 38, 193–201.
21. Dabo, A.-A.A.; Hosseinian-Far, A. An Integrated Methodology for Enhancing Reverse Logistics Flows and Networks in Industry 5.0. Logistics 2023, 7, 97.
22. Fernandes De Souza, J.A.; Silva, M.M.; Rodrigues, S.G.; Machado Santos, S. A Forecasting Model Based on ARIMA and Artificial Neural Networks for End–of–Life Vehicles. J. Environ. Manag. 2022, 318, 115616.
23. Ghosh, A.; Pathak, D.; Bhola, P.; Bhattacharjee, D.; Sivarajah, U. Analysing Product Attributes of Refurbished Laptops Based on Customer Reviews and Ratings: Machine Learning Approach to Circular Consumption. Ann. Oper. Res. 2023.
24. González Rodríguez, G.; Gonzalez-Cava, J.M.; Méndez Pérez, J.A. An Intelligent Decision Support System for Production Planning Based on Machine Learning. J. Intell. Manuf. 2020, 31, 1257–1273.
25. Monteiro, E.S.; Da Rosa Righi, R.; Barbosa, J.L.V.; Alberti, A.M. APTM: A Model for Pervasive Traceability of Agrochemicals. Appl. Sci. 2021, 11, 8149.
26. Najmi, A.; Kanapathy, K.; Aziz, A.A. A Pathway to Involve Consumers for Exchanging Electronic Waste: A Deep Learning Integration of Structural Equation Modelling and Artificial Neural Network. J. Mater. Cycles Waste Manag. 2022, 24, 410–424.
27. Pan, W.; Miao, L. Dynamics and Risk Assessment of a Remanufacturing Closed-Loop Supply Chain System Using the Internet of Things and Neural Network Approach. J. Supercomput. 2023, 79, 3878–3901.
28. Rezaei, O.; Sahraeian, R.; Hosseini, S.M.H. A Multi-Objective Optimization Framework to Design the Closed-Loop Supply Chain Network Using Machine Learning for Demand Prediction. Process Integr. Optim. Sustain. 2025, 9, 1521–1542.
29. Temur, G.T.; Bolat, B. Evaluating Efforts to Build Sustainable WEEE Reverse Logistics Network Design: Comparison of Regulatory and Non-Regulatory Approaches. Int. J. Sustain. Eng. 2017, 10, 358–383.
30. Achamrah, F.E.; Riane, F.; Sahin, E.; Limbourg, S. An Artificial-Immune-System-Based Algorithm Enhanced with Deep Reinforcement Learning for Solving Returnable Transport Item Problems. Sustainability 2022, 14, 5805.
31. Du, W.; Zhou, X.; Wang, C.; Rong, D. Research on Ecological Logistics Evaluation Model Based on BCPSGA-BP Neural Network. Multimed. Tools Appl. 2019, 78, 30271–30295.
32. Guo, J.; Chen, L.; Wang, Z. Optimization of a Closed-Loop Supply Chain System Considering Government Incentives Mechanism under Deep Learning Algorithms. Comput. Ind. Eng. 2025, 205, 111146.
33. Gutierrez-Franco, E.; Mejia-Argueta, C.; Rabelo, L. Data-Driven Methodology to Support Long-Lasting Logistics and Decision Making for Urban Last-Mile Operations. Sustainability 2021, 13, 6230.
34. Kumar Jauhar, S.; Singh, A.; Kamble, S.; Tiwari, S.; Belhadi, A. Reverse Logistics for Electric Vehicles under Uncertainty: An Intelligent Emergency Management Approach. Transp. Res. Part E Logist. Transp. Rev. 2024, 192, 103806.
35. Li, M.-Y.; Shih, F.-Y. Solving the Green Reverse Logistics Problem in E-Commerce Using a Reinforcement Learning Based Genetic Algorithm. Electron. Commer. Res. Appl. 2024, 68, 101455.
36. Pooya, A.; Mansoori, A.; Eshaghnezhad, M.; Ebrahimpour, S.M. Neural Network for a Novel Disturbance Optimal Control Model for Inventory and Production Planning in a Four-Echelon Supply Chain with Reverse Logistic. Neural Process. Lett. 2021, 53, 4549–4570.
37. Wang, J.; Zhou, S.; Li, M.; Ren, G.; Ren, X.; Xiong, X.; Zhang, Y. Multi-Echelon Inventory Optimization of Waste Electrical and Electronic Equipment Closed-Loop Supply Chain Based on Reinforcement Learning under Carbon Tax Policy. Eng. Appl. Artif. Intell. 2025, 154, 110987.
38. Zhang, H.; Li, N.; Lin, J. Modeling the Decision and Coordination Mechanism of Power Battery Closed-Loop Supply Chain Using Markov Decision Processes. Sustainability 2024, 16, 4329.
39. Boujarif, A.; Coit, D.W.; Jouini, O.; Zeng, Z.; Heidsieck, R. A Deep-Learning-Based Framework to Predict the Reliability of Multicomponent Repairable Systems in a Closed-Loop Supply Chain. IEEE Trans. Reliab. 2025, 74, 3809–3823.
40. Gayialis, S.P.; Kechagias, E.P.; Konstantakopoulos, G.D.; Papadopoulos, G.A. A Predictive Maintenance System for Reverse Supply Chain Operations. Logistics 2022, 6, 4.
41. Mazhar, M.I.; Kara, S.; Kaebernick, H. Remaining Life Estimation of Used Components in Consumer Products: Life Cycle Data Analysis by Weibull and Artificial Neural Networks. J. Oper. Manag. 2007, 25, 1184–1193.
42. Zacharaki, A.; Vafeiadis, T.; Kolokas, N.; Vaxevani, A.; Xu, Y.; Peschl, M.; Ioannidis, D.; Tzovaras, D. RECLAIM: Toward a New Era of Refurbishment and Remanufacturing of Industrial Equipment. Front. Artif. Intell. 2021, 3, 570562.
43. Boresta, M.; Pinto, D.M.; Stecca, G. Bridging Operations Research and Machine Learning for Service Cost Prediction in Logistics and Service Industries. Ann. Oper. Res. 2024, 342, 113–139.
44. Molnár, A.; János, V.; Csiszárik-Kocsir, Á. Deriving the Classical and New-Keynesian Phillips Curves Using Machine Learning Simulations. Decis. Mak. Appl. Manag. Eng. 2024, 7, 546–567.
45. Eruguz, A.S.; Karabağ, O.; Tetteroo, E.; Van Heijst, C.; Van Den Heuvel, W.; Dekker, R. Customer-to-Customer Returns Logistics: Can It Mitigate the Negative Impact of Online Returns? Omega 2024, 128, 103127.
46. Seidi, M.; Kimiagari, A.M. A hybrid Genetic Algorithm-Neural Network Approach for Pricing Cores and Remanufactured Cores. S. Afr. J. Ind. Eng. 2012, 21, 131–148.
47. Efendigil, T.; Önüt, S.; Kongar, E. A Holistic Approach for Selecting a Third-Party Reverse Logistics Provider in the Presence of Vagueness. Comput. Ind. Eng. 2008, 54, 269–287.
48. Olivares-Vera, D.A.; Ovalle-Magallanes, E.; Hernández-Vázquez, J.I.; Hernández-Vázquez, J.O.; Gutierrez-Hernandez, D.A.; Olivares-Vera, A.D.P. Performance Evaluation of YOLO Models for Damage Identification in Tertiary Packaging. Signal Image Video Process. 2025, 19, 498.
49. Panjehfouladgaran, H.; Lim, S.F.W.T. Reverse Logistics Risk Management: Identification, Clustering and Risk Mitigation Strategies. Manag. Decis. 2020, 58, 1449–1474.
50. Rakhshan, K.; Daneshkhah, A.; Morel, J.-C. Stakeholders’ Impact on the Reuse Potential of Structural Elements at the End-of-Life of a Building: A Machine Learning Approach. J. Build. Eng. 2023, 70, 106351.
51. Rezaei Zeynali, F.; Parvin, M.; ForouzeshNejad, A.A.; Jeyzanibrahimzade, E.; Ghanavati-Nejad, M.; Tajally, A. A Heuristic-Based Multi-Stage Machine Learning-Based Model to Design a Sustainable, Resilient, and Agile Reverse Corn Supply Chain by Considering Third-Party Recycling. Appl. Soft Comput. 2025, 174, 113042.
52. Shahidzadeh, M.H.; Shokouhyar, S.; Javadi, F.; Shokoohyar, S. Unscramble Social Media Power for Waste Management: A Multilayer Deep Learning Approach. J. Clean. Prod. 2022, 377, 134350.
53. Shahidzadeh, M.H.; Shokouhyar, S. Shedding Light on the Reverse Logistics’ Decision-Making: A Social-Media Analytics Study of the Electronics Industry in Developing vs Developed Countries. Int. J. Sustain. Eng. 2022, 15, 161–176.
54. Shahidzadeh, M.H.; Shokouhyar, S. Revolutionizing Reverse Supply Chain Decision-Making: Deep Social Media Analysis in Qualitative Comparative Analysis. Comput. Ind. Eng. 2025, 206, 111241.
55. Shokouhyar, S.; Shahidzadeh, M.H. Mastering Supply Chain’s Decision-Making Establishing SDG’s Goal: A Social Media Analytics Study of the Electronic Devices in Developing and Developed Countries. Ann. Oper. Res. 2024.
56. Tietz Cazeri, G.; Sigahi, T.F.A.C.; Rampasso, I.S.; Moraes, G.H.S.M.; Zanon, L.G.; Gavira, M.O.; Eustachio, J.H.P.P.; Leal Filho, W.; Anholon, R. A multicriteria approach for assessing the maturity of supply chains regarding the implementation of circular economy practices in Brazil. Int. J. Sustain. Dev. World Ecol. 2024, 31, 611–625.

Figure 1. PRISMA diagram. Source: Authors (2025).

Figure 2. Geographical analysis. Source: Authors (2025).

Figure 3. Temporal distribution. Source: Authors (2025).

Figure 4. Distribution by journal. Source: Authors (2025).

Figure 5. Distribution by type of machine learning model. Source: Authors (2025).

Figure 6. Distribution by model. Source: Authors (2025).

Table 1. Search strings for Scopus and Web of Science.

Scopus	Web of Science
TITLE-ABS-KEY (“machine learning*” OR “deep learning” OR “neural network*” OR “supervised learning” OR “unsupervised learning” OR “reinforcement learning” OR “Decision tree*” OR “random forest*”) AND TITLE-ABS-KEY (“closed loop supply chain” OR “closed-loop supply chain” OR “reverse logistics” OR “reverse supply-chain”)	TS = (“machine learning*” OR “deep learning” OR “neural network*” OR “supervised learning” OR “unsupervised learning” OR “reinforcement learning” OR “Decision tree*” OR “random forest*”) AND TS = (“closed loop supply chain” OR “closed-loop supply chain” OR “reverse logistics” OR “reverse supply-chain”)

Source: Authors (2025). The asterisk (*) is used as a truncation wildcard to retrieve variations of a term, ensuring the inclusion of both its singular and plural forms (e.g., searching for “network*” would yield results for both “network” and “networks”).

Table 2. Categorization of included articles.

Category	Articles
Forecasting	[19,20,21,22,23,24,25,26,27,28,29]
Optimization	[30,31,32,33,34,35,36,37,38]
Reliability	[39,40,41,42]
Pricing	[43,44,45,46]
Classification	[47,48,49,50,51]
Consumers	[26,52,53,54,55]

Source: Authors (2025).

Table 3. Main limitations found in the articles.

Challenge	Description	Consequences	Articles
Data scarcity/quality	Return flows are irregular, heterogeneous, and often untracked; historical data are distributed across ERP systems, carriers, sorting centers, and 3PL providers, resulting in gaps and disparate formats.	Models learn low-reliability patterns; intensive data cleaning required.	[19,27]
Uncertainties	Abrupt changes in operational reality can alter the behavior of certain variables.	Predictive models become obsolete → forecast errors above 50%; optimized routing based on “outdated” demand increases empty mileage.	[22]
Computational cost in hybrid models	Robust models can be highly demanding in computational terms.	Execution time can increase from minutes to hours/days → impractical for daily decision-making; energy and cloud consumption raise operational costs.	[51]

Source: Authors (2025).

Table 4. Identified research gaps.

Research Gap	Current Shortcoming	Inspiration/Promising Research Directions
Explicit uncertainty modeling	In some cases, uncertainties are modeled, but more in-depth discussion on this aspect is lacking.	Use and comparison of models considering uncertainties, including fuzzy, grey systems, and others.
Real-time pipelines (streaming)	Most prototypes run on historical data instead of real-time data collection.	Digital twin implementation.
Hybrid ML	Few studies combine more than one type of ML.	Use of supervised and unsupervised learning models together.
Metadata standards & FAIR data	Lack of formal descriptions: data origin, processing, licenses.	Implementation of standardized metadata (e.g., schema.org, JSON-LD) and creation of dataset cards specifically for reverse logistics, facilitating transparency and data reuse.
Public, multi-scale benchmarks	Data repository models are not standardized, and there is a lack of public databases.	Government repositories (e.g., E-waste Monitor).
Explainable AI (XAI) applied to RL	Integrating XAI to accelerate interpretation of results is promising.	Application of SHAP to MILP for post-model analysis; visual analytics in reverse transport dashboards.
Deep ML + simulation integration	Connections remain “external”: simulation generates data, ML predicts, and then optimization decides (serial pipelines).	Advanced integration: co-simulation DES–ReL, Sim2Real ReL (training in ABS and applying in operation), and federated digital twin frameworks with embedded ML, overcoming serial pipelines.

Source: Authors (2025).

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
