Article

Evaluating Circular Economy Performance in Municipal Solid Waste Management: A Hybrid Structural Equation Modeling and Explainable Machine Learning Study from Cajamarca

by Persi Vera-Zelada 1,*, Emma Verónica Ramos-Farroñán 2, Alexander Fernando Haro-Sarango 3, Luis Alberto Vera-Zelada 4, Julio Roberto Izquierdo-Espinoza 5, Kevin Litman Florez-Tolentino 6, Pamela Maidolly Torres-Moya 6, Roberto Justo Tejada-Estrada 7 and Gary Christiam Farfán-Chilicaus 8

1 Departamento de Ciencias Ambientales, Universidad Nacional Autónoma de Chota, Cajamarca 06003, Peru
2 Institute for Research in Science and Technology, Campus Piura, Universidad César Vallejo, Piura 20001, Peru
3 Ciencias Empresariales, Instituto Superior Tecnológico España, Ambato 180101, Ecuador
4 Escuela de Ingeniería Ambiental, Universidad Nacional de Cajamarca, Cajamarca 06003, Peru
5 Escuela de Administración y Marketing, Universidad Tecnológica del Perú, Lima 00051, Peru
6 Ciencias Económicas and Contabilidad y Finanzas, Universidad Nacional de Trujillo, Trujillo 13001, Peru
7 Escuela de Posgrado, Universidad César Vallejo, Callao 07001, Peru
8 Escuela de Ingeniería Industrial, Universidad Tecnológica del Perú, Lima 00051, Peru
* Author to whom correspondence should be addressed.
Environments 2026, 13(4), 201; https://doi.org/10.3390/environments13040201
Submission received: 7 March 2026 / Revised: 28 March 2026 / Accepted: 30 March 2026 / Published: 5 April 2026
(This article belongs to the Special Issue Circular Economy in Waste Management: Challenges and Opportunities)

Abstract

This study evaluates the factors associated with municipal solid waste management performance under a circular economy approach in the municipalities of Cajamarca, Peru. A hybrid analytical design was applied to 120 municipal observations, combining partial least squares structural equation modeling to estimate the measurement and structural properties of four latent constructs (legal-regulatory framework, institutional capacity, operational management, and perceived performance) and XGBoost with SHAP to explore predictive classification of participation in circular economy training. The structural results indicate that operational management plays the central articulating role in linking regulation and institutional capacity to perceived performance, whereas the predictive component showed weak out-of-sample discrimination, close to chance level (AUC-ROC = 0.519). Overall, the findings suggest that the proposed hybrid pipeline is more informative for explanatory integration and variable-importance analysis than for strong predictive discrimination under the current specification.

1. Introduction

Municipal solid waste management constitutes one of the most pressing environmental, institutional, and legal challenges of the twenty-first century, particularly in emerging economies where the gap between waste generation and adequate treatment continues to widen. According to the United Nations Environment Programme’s (UNEP) Global Waste Management Outlook 2024, global municipal solid waste generation reached 2.1 billion tonnes in 2023 and is projected to rise to 3.8 billion tonnes by 2050 if no structural measures for prevention and circularity are adopted [1]. The World Bank, through its report What a Waste 2.0, warned that at least 33% of waste generated globally is managed in an environmentally unsafe manner, through open dumping or uncontrolled burning, with direct impacts on public health, aquatic ecosystems, and greenhouse gas emissions [2].
Within this context, the circular economy has emerged as a transformative paradigm that seeks to decouple economic growth from natural resource consumption. Geissdoerfer et al. [3] conceptualized it as a new sustainability paradigm capable of redefining production and consumption systems beyond linear extraction and disposal. More recently, Kirchherr et al. [4] revisited the concept through a large-scale definitional analysis, emphasizing the centrality of resource loops, value retention, and systemic reconfiguration. The transition toward circular models of waste management requires not only technological innovation, but also enabling regulatory frameworks, strong institutional capacities, and technical operations capable of translating legal provisions into tangible outcomes. Blomsma and Brennan [5] argued that the circular economy emerged as a new framing around prolonged resource productivity, thereby highlighting the need for organizational and institutional adaptation. In turn, Ghisellini et al. [6] emphasized that circular transition depends on a balanced interaction between environmental and economic systems, which implies governance, regulation, and operational implementation capacities.
In Latin America and the Caribbean, the situation is particularly critical. The region generates approximately 231 million tonnes of municipal solid waste annually, with recycling rates below 10% and more than 40% of waste being managed inadequately, according to estimates by the Inter-American Development Bank. The Circularity Gap Report 2024 indicates that more than 60% of the total waste generated in the region lacks traceability or systematic records, severely limiting the formulation of evidence-based policies. Recent studies have documented that the predominance of an end-of-pipe approach, focused on final disposal rather than prevention and recovery, constitutes a major structural barrier to circularity in Latin American contexts. Graziani [7] identified persistent regional barriers associated with institutional fragmentation, limited investment, and weak policy coordination in Latin America and the Caribbean. Likewise, Hernández-Betancur et al. [8], in their systematic review of Latin American cities, showed that municipal solid waste management still tends to prioritize disposal over prevention, reuse, and recovery. Unlike European economies, where supranational directives have consolidated integrated regulatory frameworks with binding recycling targets and extended producer responsibility, countries in the region exhibit fragmented legal frameworks, weak intergovernmental coordination, and heterogeneous municipal capacities that hinder the effective implementation of circular strategies. Vilela-Pincay et al. [9] specifically highlighted the institutional challenges that developing countries face when attempting to align circular economy principles with municipal waste governance. Similarly, Ferronato and Torretta [10] showed that waste mismanagement in developing contexts is strongly associated with structural deficiencies in regulation, infrastructure, and institutional enforcement.
In Peru, this problem reflects the tension between a progressive regulatory framework and deficient territorial implementation. Legislative Decree No. 1278, the Law on Integrated Solid Waste Management, and its amendment through Law No. 32212 of December 2024, incorporate circular economy principles, extended producer responsibility, and material recovery as core pillars of national policy. Nevertheless, according to the 2024 Statistical Yearbook of the Environmental Sector published by the Ministry of the Environment (MINAM), the composition of household waste in 2023 was 56% organic, 21% recoverable inorganic, 14% non-recoverable, and 8% hazardous, while only 1.8% of total municipal waste generated was effectively recovered. Although 61.7% of waste is disposed of in sanitary landfills, the remaining 36.5% is sent to inadequate disposal sites, thereby revealing a structural gap in infrastructure and management [11].
In the Cajamarca region, this situation is even more severe: 81.9% of collected solid waste ends up in open dumps, only 4.7% of municipalities have operational sanitary landfills, and 57.94% of waste is disposed of inadequately [12]. Cajamarca comprises 92 municipalities that deposit waste in dumps and 64 that lack solid waste management instruments, positioning it as one of the regions with the greatest institutional and operational deficits in the country [13].
The relevance of this study lies in its contribution to a comprehensive understanding of the legal-regulatory, institutional, and operational factors that condition the performance of solid waste management under a circular economy approach in Peruvian municipalities. While the international literature has made significant progress in modeling barriers to and enablers of circularity through quantitative approaches such as structural equation modeling, empirical evidence in Latin American municipal contexts remains scarce and fragmented. Hair et al. [14] consolidated the methodological foundations of partial least squares structural equation modeling for the analysis of latent constructs in complex explanatory settings. In parallel, Sarstedt et al. [15] documented the rapid expansion and consolidation of PLS-SEM in applied research over the last decade, thereby reinforcing its relevance for studies that combine theory testing and predictive orientation.
Previous studies have employed PLS-SEM to assess determinants of environmental management in organizational settings. For example, Dangelico et al. [16] used a PLS-SEM approach to investigate the antecedents of sustainable behavioral intention, showing the usefulness of latent-variable modeling in environmentally oriented decision contexts. Likewise, Khan et al. [17] examined the relationship between Industry 4.0 and circular economy practices, demonstrating that PLS-SEM is suitable for evaluating structured relationships among sustainability-related constructs. Yet the application of hybrid models that integrate the explanatory power of structural equation modeling with the predictive capacity of machine learning remains a relevant methodological gap in the field of solid waste management. Shmueli et al. [18] advanced the discussion on predictive assessment in PLS-SEM by formalizing guidelines for out-of-sample evaluation. Building on that logic, Sharma et al. [19] proposed a hybrid SEM–machine learning approach to prediction, thereby illustrating the analytical potential of combining latent explanatory structures with algorithmic classification. The combination of PLS-SEM with supervised classification algorithms such as XGBoost, complemented by interpretability analysis using SHAP, makes it possible to overcome the limitations of purely confirmatory approaches by simultaneously offering theoretical validation of constructs and context-sensitive predictive capacity. Lundberg and Lee [20] provided the theoretical basis for SHAP as a unified approach to interpreting model predictions through additive feature contributions. Chen and Guestrin [21], in turn, introduced XGBoost as a scalable and efficient tree-boosting system with strong predictive capacity in structured-data environments.
In this sense, the proposed hybrid design is not an arbitrary combination of tools, but a sequential analytical strategy in which PLS-SEM supports theory-driven latent construct modeling, XGBoost 3.2.0 captures nonlinear predictive patterns, and SHAP provides transparent interpretation of the predictive stage, thus linking explanatory and predictive logics within a single framework.
Although technological factors such as recovery infrastructure, recycling intensity, treatment efficiency, and plant-level operating costs are relevant to circular economy implementation, the present study deliberately focuses on the governance-performance architecture of municipal solid waste management. Specifically, the analytical model was designed to examine how regulatory conditions, institutional capacity, and operational management are associated with perceived performance across municipalities. A distinct technological dimension was not included because the study aimed to prioritize comparable governance and management factors at the municipal level rather than facility-specific technical metrics, which may vary substantially in availability and measurement quality across local contexts.
The research problem is articulated around the following question: To what extent are the legal-regulatory framework, institutional capacity, and operational management structurally related to perceived performance in solid waste management under a circular economy approach in the municipalities of Cajamarca, and what is the predictive capacity of these latent constructs with respect to participation in training processes on circular economy? This question arises from the empirical observation that, despite the existence of national regulatory frameworks promoting circularity, the translation of these legal instruments into effective operational outcomes depends on institutional and technical mediations that have not yet been quantitatively modeled in the Peruvian context. The identified gap is both theoretical, due to the lack of integrated models linking regulation, institutional capacity, operations, and performance within a coherent causal architecture, and methodological, given the absence of prior applications of hybrid PLS-SEM and machine learning pipelines in the assessment of municipal circular economy practices in Peru.
The originality of this study lies in three complementary contributions. First, it provides empirical evidence from Cajamarca, a highly constrained territorial setting in Peru where circular economy implementation faces severe operational and institutional deficits. Second, it proposes an integrated analytical framework linking regulation, institutional capacity, operational management, and perceived performance within a single municipal governance architecture. Third, it combines latent-variable modeling with explainable machine learning, thereby extending the methodological toolkit available for evaluating circular economy performance in local waste management systems.
This study is directly aligned with Sustainable Development Goals 11 (sustainable cities and communities, target 11.6), 12 (responsible consumption and production, indicators 12.4.2 and 12.5.1), 13 (climate action), and 16 (peace, justice, and strong institutions), insofar as it evaluates the causal chain connecting regulation, institutional capacity, technical operations, and environmental performance in municipal waste management, thereby contributing quantitative evidence for the territorial governance of circularity.
Accordingly, the general objective of this study is to assess the relationship among the legal-regulatory framework, institutional capacity, operational management, and solid waste performance under a circular economy approach in the municipalities of Cajamarca, through a hybrid PLS-SEM and machine learning model. More specifically, the study seeks to: first, assess the measurement properties of a model comprising four latent constructs (legal framework, institutional capacity, operational management, and performance) for the assessment of circular economy in municipal solid waste management; second, determine the structural relationships among these constructs, examining whether operational management plays an articulating role between regulation, institutional capacity, and performance; third, evaluate the predictive capacity of latent scores and contextual variables (experience and district size) on participation in circular economy training through supervised classification with XGBoost; and fourth, identify the variables with the greatest contribution to prediction through SHAP interpretability analysis, thereby integrating the explanatory and predictive perspectives of the hybrid pipeline.
The remainder of this paper is organized as follows. Section 2 presents the theoretical framework supporting the proposed constructs and the hybrid analytical approach. Section 3 describes the study design, instrument, data, and analytical procedure. Section 4 reports the results of the measurement model, structural model, and predictive analysis. Section 5 discusses the findings in relation to the literature and the study limitations. Finally, Section 6 presents the main conclusions and implications for research and practice.

2. Theoretical Framework

The circular economy is a paradigm that replaces linear extractive logic with closed loops of valorization in which waste is eliminated as a concept and natural systems are regenerated [22]. In municipal solid waste management, it requires a simultaneous reconfiguration of regulatory, institutional, and operational architectures. Prieto-Sandoval et al. [23] demonstrated that the adoption of local circular practices depends on the coexistence of coherent regulations, technical capacities, and interinstitutional coordination, conditions that rarely converge in developing economies. This perspective was formalized by Velenturf and Purnell [24] through a framework that articulates legal instruments, institutional resources, and technical processes.
The legal-regulatory dimension constitutes the first construct of the model. Agovino et al. [25], through shift-and-share analysis across 28 European countries, demonstrated that regulatory alignment determines the achievement of circular objectives. Awino et al. [26], based on 151 articles covering 132 countries, showed that regulatory weakness and fragmented competencies are the most persistent barriers in the Global South. Likewise, Diaz-Barriga-Fernandez et al. [27] documented, in Latin America, a gap in which advanced legislation coexists with weak enforcement. Institutional capacity, the second construct, captures the organizational conditions required to implement circular policies. Gutberlet et al. [28], in Brazil, Colombia, and Argentina, conceptualized it in terms of qualified personnel, financial stability, and political leadership. Wilson et al. [29] specified that institutional weakness involves coordination deficits, staff turnover, and the absence of monitoring. Marshall and Farahbakhsh [30] positioned these factors as first-order determinants, and Orihuela [31] confirmed this in Peru through data envelopment analysis.
The operational dimension, the third construct, translates policy into technical processes of collection, segregation, valorization, and final disposal. Rigamonti et al. [32] integrated coverage, standardized protocols, infrastructure, and environmental education into an operational assessment framework, components that Abdel-Shafy and Mansour [33] identified as critical in developing countries. The mediating centrality of the operational dimension was demonstrated by Guerrero et al. [34]: after reviewing 117 studies, they concluded that improvements in recycling and regulatory compliance occur only when organizational capacities are translated into structured processes, a principle that underpins the hypothesis that operational management (C3) mediates between the regulatory-institutional pairing and performance.
Performance, the fourth construct, synthesizes tangible outcomes. Scheinberg et al. [35] distinguished among process, outcome, and impact indicators. Leal Filho et al. [36], across 45 countries, validated the correspondence between key stakeholders’ perceptions and objective indicators, thereby legitimizing perception-based instruments such as the one employed in this study. Di Foggia and Beccarello [37] further confirmed that circularity generates measurable positive effects only when regulation, institutional capacity, and operations are sufficiently developed, thus configuring the sequential causal chain that this study empirically tests.
From a methodological standpoint, PLS-SEM has become a benchmark technique for models with latent variables in moderate samples that combine explanation and prediction. Ringle et al. [38] documented its suitability for constructs that are not directly observable. Liengaard et al. [39], in a review of 204 studies, confirmed its advantages when using Likert scales and in specific territorial contexts. Müller et al. [40] emphasized that AVE values below 0.50 should be interpreted in light of semantic heterogeneity and reverse-coded items, a criterion applicable to the instrument used in this study. Integration with machine learning overcomes the limitations of each approach in isolation. Sarstedt et al. [41] argued that PLS-SEM excels in theoretical validation, whereas XGBoost captures nonlinear relationships undetectable by linear models. Rashad et al. [42], in Land Degradation & Development, demonstrated the feasibility of combining PLS-SEM, XGBoost, and SHAP in a pipeline in which latent scores feed the classifier and SHAP values interpret the results, an architecture identical to that adopted here. SHAP interpretability, grounded in Shapley game theory, decomposes each prediction into additive contributions by variable, thereby addressing the black-box problem [43]. Cakiroglu et al. [44] validated that this combination generates interpretable knowledge by providing causal traceability to the predictive component. Accordingly, the present study adopts this hybrid architecture as an emerging but already documented methodological strategy for contexts in which latent theoretical structures must be assessed while also exploring predictive behavior and model interpretability. 
This interpretation is also reinforced by Simões and Marques [45], who showed that regulation may influence the productivity and performance of waste utilities not only through legal stringency, but also through the institutional incentives it creates for service organization and operational efficiency.

3. Materials and Methods

The study was conducted under a quantitative approach, with a non-experimental, cross-sectional, and analytical-predictive design aimed at integrating an explanatory component (latent construct modeling) with a predictive component (supervised classification). The methodological strategy adopted a two-stage hybrid pipeline: (i) estimation of a partial least squares structural equation model (PLS-SEM) to represent and validate latent constructs measured through a survey, and (ii) use of the resulting latent scores as input variables in a machine learning model based on XGBoost for the prediction of a dichotomous target variable. This two-stage specification was selected to preserve the theoretical consistency of the latent measurement system while extending the analysis toward nonlinear classification and interpretable variable contribution assessment.

3.1. Study Context, Respondents, and Sampling

The unit of analysis in this study was the municipal respondent linked to public or environmental management functions related to solid waste management in Cajamarca, Peru. The analytical sample comprised 120 valid survey observations collected from municipal contexts in the region. The sampling strategy was non-probabilistic and oriented toward respondents with direct knowledge of municipal waste management processes, legal implementation, or operational conditions. Accordingly, the findings should be interpreted as analytically informative for the studied context rather than statistically representative of all municipalities in Peru.

3.2. Data Collection, Inclusion Criteria, and Ethics

Data were collected through a structured survey administered to eligible municipal respondents involved in public or environmental management activities. Inclusion criteria required direct familiarity with solid waste management practices, municipal procedures, or related governance functions. Incomplete or invalid responses were excluded from the final analytical dataset. Because the survey was administered in municipal contexts through non-probabilistic access to eligible respondents, a precise response rate could not be calculated. Participation was voluntary and based on informed consent, and all responses were processed anonymously. The study followed the ethical principles applicable to minimal-risk survey research; under the institutional conditions of the project, formal committee approval was not required.
The analyzed dataset consisted of 120 observations and variables derived from a structured survey. Specifically, 24 Likert-type items were used, organized into four blocks of six indicators each (C1_1 to C1_6, C2_1 to C2_6, C3_1 to C3_6, and C4_1 to C4_6), in addition to three contextual variables (D1, D2, and D3). Table 1 presents the data collection instrument. Variables D1 and D2 were treated as ordinal factors for the descriptive analysis, whereas D3 was defined as a dichotomous target variable (“No”/“Yes”) for the predictive stage.
No standalone technological construct was incorporated into the survey design. This delimitation was intentional, as the study sought to model regulatory, institutional, operational, and perceived-performance dimensions that were directly comparable across municipalities; therefore, plant-efficiency indicators, recovery technology intensity, and cost-based engineering measures were left for future research.
During the data preparation phase, the dataset was imported using readxl, and variable transformation was performed using tidyverse. The construct items (C1–C4) were converted to numeric format, and the negatively worded items C1_5, C2_5, C3_5, and C4_5 were reverse-coded so that higher values consistently represented more favorable conditions across all indicators. The contextual variables were recoded with explicit levels to ensure analytical consistency. Initial data quality checks were conducted, including the number of rows and columns, total count of missing values, and class distribution of D3, thereby confirming the integrity of the dataset prior to modeling. In addition, descriptive visualizations were generated to inspect the composition of Likert responses by construct and the class balance of the target variable, enabling a preliminary assessment of variability and distribution. Given the systematic use of reverse-worded item 5 across all four constructs, an additional sensitivity analysis was planned to compare measurement-model performance with and without the “_5” indicators, in order to assess their impact on convergent validity and loading stability.
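The reverse-coding step described above can be illustrated with a minimal sketch. It is written in Python rather than in the R workflow used by the authors, and it assumes a 1-to-5 response scale, which the text does not state; the helper names (reverse_code, prepare_row) are hypothetical.

```python
# Hypothetical sketch: reverse-code negatively worded Likert items.
# Assumption: responses lie on a 1-5 scale (the range is not given in the text).
SCALE_MIN, SCALE_MAX = 1, 5
REVERSED_ITEMS = {"C1_5", "C2_5", "C3_5", "C4_5"}  # items named in the text


def reverse_code(value, lo=SCALE_MIN, hi=SCALE_MAX):
    """Map lo -> hi and hi -> lo; the scale midpoint is unchanged."""
    if not lo <= value <= hi:
        raise ValueError(f"response {value} outside {lo}-{hi}")
    return lo + hi - value


def prepare_row(row):
    """Return a copy of one survey row with the reversed items recoded."""
    return {
        item: reverse_code(score) if item in REVERSED_ITEMS else score
        for item, score in row.items()
    }


# Illustrative row, not taken from the study data.
example = {"C1_4": 4, "C1_5": 2, "C1_6": 5}
print(prepare_row(example))
```

After recoding, a higher score on every item points toward a more favorable condition, which is what the loading comparison with and without the "_5" indicators presupposes.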
The explanatory stage was implemented with PLS-SEM using the cSEM 0.6.1 package in R. A reflective measurement model (CFA-type syntax) was specified for four latent variables (C1, C2, C3, and C4), each defined by six observed indicators. The structural model included directed relationships among constructs: C2 ~ C1; C3 ~ C1 + C2; C4 ~ C1 + C2 + C3. Estimation was carried out using the PLS-PM approach (.approach_weights = "PLS-PM"), a choice that is methodologically appropriate in contexts involving Likert scales and moderate sample sizes, where relationships among constructs must be modeled simultaneously while preserving predictive robustness. Bootstrap resampling (.resample_method = "bootstrap") with 499 replications was used both to assess parameter stability and to derive inferential statistics for path coefficients and indirect effects, including confidence intervals and p-values where applicable.
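The logic of bootstrapping an indirect effect can be sketched as follows. This is a simplified Python illustration, not the cSEM estimation used in the study: it assumes that construct scores are already available, replaces PLS-PM with ordinary least squares on a single mediation chain (C1 to C3 to C4), and uses synthetic data. All function names and numeric values are hypothetical.

```python
import random
from statistics import mean


def cov(u, v):
    """Sample covariance (cov(u, u) is the sample variance)."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def indirect_effect(x, m, y):
    """Product a*b for x -> m -> y, where b is the OLS slope of m in y ~ x + m."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    a = sxm / sxx
    b = (smy * sxx - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b


def bootstrap_ci(x, m, y, reps=499, alpha=0.05, seed=1):
    """Percentile confidence interval from case-resampling with replacement."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(indirect_effect([x[i] for i in idx],
                                     [m[i] for i in idx],
                                     [y[i] for i in idx]))
    draws.sort()
    k = int(alpha / 2 * reps)  # number of draws trimmed from each tail
    return draws[k], draws[reps - 1 - k]


# Synthetic construct scores for 120 cases (illustrative only).
rng = random.Random(0)
c1 = [rng.gauss(0, 1) for _ in range(120)]
c3 = [0.6 * v + rng.gauss(0, 0.8) for v in c1]
c4 = [0.2 * a + 0.5 * b + rng.gauss(0, 0.8) for a, b in zip(c1, c3)]
print(indirect_effect(c1, c3, c4), bootstrap_ci(c1, c3, c4))
```

An interval that excludes zero would be read as support for mediation through C3; in the actual workflow the resampling is applied to the full PLS-PM estimation rather than to a single regression chain.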
To evaluate the PLS-SEM model, indicators of measurement model quality and structural model quality were extracted. Specifically, reliability and convergent/discriminant validity metrics were calculated using the function calculateAVE() from cSEM. Factor loadings (Loading_estimates) and path coefficients (Path_estimates) were also extracted from the internal structure of the estimated object. Because the internal output structure may vary across package versions, an auxiliary function (get_nested) was used for robust component retrieval, thereby ensuring reproducibility of the analytical workflow. In parallel, a supplementary model was estimated with lavaan (MLR estimator) solely for graphical representation of the construct and path diagram using semPlot, without replacing the main estimation performed with cSEM.
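For reference, the average variance extracted of a reflective construct is the mean of its squared standardized loadings. The sketch below shows this calculation in Python under that standard definition; it is not the cSEM implementation, and the loading values are hypothetical placeholders rather than the estimates reported in the tables.

```python
def ave(loadings):
    """Average variance extracted: mean of squared standardized loadings."""
    if not loadings:
        raise ValueError("at least one loading is required")
    return sum(l * l for l in loadings) / len(loadings)


# Hypothetical loadings for one six-item block; the last value mimics a weak
# reverse-worded "_5" item. These are NOT the study's estimates.
block = {"X_1": 0.73, "X_2": 0.70, "X_3": 0.66,
         "X_4": 0.61, "X_6": 0.57, "X_5": 0.30}

ave_all = ave(list(block.values()))
ave_without_5 = ave([v for k, v in block.items() if not k.endswith("_5")])
print(round(ave_all, 3), round(ave_without_5, 3))
```

Comparing the two values mirrors the planned sensitivity analysis: a single weak indicator pulls the block's AVE down, so removing it raises the average even when the remaining loadings are unchanged.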
The transition to the predictive stage was carried out through the extraction of latent scores (construct scores) from the PLS-SEM model. Because getConstructScores() may return objects with heterogeneous structures (e.g., lists including auxiliary components such as weights and selected indicators), a robust routine was implemented to specifically identify and retrieve the Construct_scores matrix, which was subsequently converted into a tibble for further use. These latent scores (C1, C2, C3, and C4) constitute a reduced and theoretically informed representation of the measurement system and were combined with contextual variables D1 and D2 to form the predictor set for the supervised model.
For the machine learning stage, an analytical dataset (ml_df) was constructed by integrating latent scores and contextual variables, defining as the target variable a binary version (target = Yes/No) of D3 in order to standardize metric computation and ensure compatibility with xgboost. Subsequently, a manual stratified 80/20 train-test split was performed, preserving the class proportions of the target variable in both subsets. The categorical variables (D1 and D2) were transformed through one-hot encoding using model.matrix, and a column-alignment routine between the training and test matrices was incorporated to avoid inconsistencies when certain factor levels appeared only in one partition.
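The split and alignment steps can be sketched as follows. This Python illustration is an assumption-laden analogue of the R routine: the class counts are hypothetical, the helper names are invented for the example, and the encoding is a full one-hot scheme, whereas model.matrix would normally drop a reference level.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_frac=0.2, seed=42):
    """Return (train_idx, test_idx) preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, lab in enumerate(labels):
        by_class[lab].append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_frac)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def one_hot(rows, columns=None):
    """Encode dict rows; pass the training columns to align another partition."""
    if columns is None:
        columns = sorted({f"{k}={v}" for row in rows for k, v in row.items()})
    matrix = []
    for row in rows:
        present = {f"{k}={v}" for k, v in row.items()}
        # Levels unseen when the columns were fixed are ignored;
        # levels absent from this partition become all-zero columns.
        matrix.append([1 if col in present else 0 for col in columns])
    return columns, matrix


# Hypothetical class counts for D3 (the actual distribution is in Table 2).
labels = ["Yes"] * 66 + ["No"] * 54
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Fixing the column set on the training partition and reusing it for the test partition is what keeps both matrices the same width when a factor level occurs in only one of them.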
The predictive model was implemented with XGBoost (xgboost, objective binary:logistic) using xgb.DMatrix matrices. Hyperparameter selection was performed through a grid search over eta, max_depth, min_child_weight, subsample, and colsample_bytree, combined with stratified 5-fold cross-validation (xgb.cv) and early stopping (20 rounds) to prevent overfitting. To strengthen the workflow and avoid interruptions caused by package version differences or occasional errors during the validation process, a safe function (run_xgb_cv) was programmed using tryCatch in R 3.1.0, capturing errors and recording the status of each evaluated combination. In the absence of valid configurations, a methodologically reasonable fallback hyperparameter set was defined to ensure continuity of the analysis. The final model was trained using the best identified combination and the optimal number of rounds (best_iter).
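The fault-tolerant search logic can be sketched in a few lines. The example below is a Python analogue of the tryCatch wrapper, not the authors' run_xgb_cv function: the evaluate callable stands in for a cross-validated score from xgb.cv, and the grid and fallback values are hypothetical because the text does not list them.

```python
from itertools import product

# Hypothetical grid and fallback; the actual values are not given in the text.
GRID = {
    "eta": [0.05, 0.1],
    "max_depth": [2, 3],
    "min_child_weight": [1, 3],
    "subsample": [0.8],
    "colsample_bytree": [0.8],
}
FALLBACK = {"eta": 0.1, "max_depth": 2, "min_child_weight": 1,
            "subsample": 0.8, "colsample_bytree": 0.8}


def safe_eval(evaluate, params):
    """Score one configuration, recording a status instead of raising."""
    try:
        return {"params": params, "score": evaluate(params), "status": "ok"}
    except Exception as exc:  # analogous to the error handler of tryCatch
        return {"params": params, "score": None, "status": f"error: {exc}"}


def grid_search(evaluate, grid=GRID, fallback=FALLBACK):
    """Return (best_params, log); use the fallback if no configuration is valid."""
    names = list(grid)
    log = [safe_eval(evaluate, dict(zip(names, combo)))
           for combo in product(*(grid[n] for n in names))]
    valid = [r for r in log if r["status"] == "ok"]
    if valid:
        best = max(valid, key=lambda r: r["score"])["params"]
    else:
        best = dict(fallback)
    return best, log


def toy_score(params):
    """Stand-in for a cross-validated AUC; fails for one setting on purpose."""
    if params["min_child_weight"] == 3:
        raise RuntimeError("simulated validation failure")
    return 0.5 + 0.01 * params["max_depth"] - params["eta"] * 0.1


best, log = grid_search(toy_score)
print(best, sum(r["status"] == "ok" for r in log), "of", len(log), "valid")
```

Logging the status of every combination makes it possible to report how many configurations were actually evaluated, which matters when the fallback set ends up being used.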
Predictive performance was evaluated on the test set using classification and calibration metrics. Predicted probabilities were obtained, and the ROC curve was estimated with pROC in R 3.1.0, calculating the AUC-ROC as the main measure of discrimination. The optimal cutoff point was determined according to the Youden criterion, from which predicted classes were derived for the computation of accuracy, sensitivity (recall), specificity, precision, F1-score, and Brier score. In addition, model calibration was assessed through a decile-based calibration curve, comparing the mean predicted probability with the observed event rate in each group. This combination of metrics enables the simultaneous assessment of discriminative capacity, error balance, and probabilistic consistency of the model.
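The three core quantities in this evaluation (AUC-ROC, the Youden cutoff, and the Brier score) can be written out directly. The sketch below is a Python illustration of the standard definitions with made-up predictions; it is not the pROC implementation, which, for example, may place candidate thresholds between observed probabilities rather than at them.

```python
def auc(y_true, y_prob):
    """Probability that a random positive outranks a random negative (ties = 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def roc_points(y_true, y_prob):
    """(threshold, sensitivity, 1 - specificity) at each observed probability."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = []
    for t in sorted(set(y_prob), reverse=True):
        tp = sum(1 for y, p in zip(y_true, y_prob) if p >= t and y == 1)
        fp = sum(1 for y, p in zip(y_true, y_prob) if p >= t and y == 0)
        points.append((t, tp / n_pos, fp / n_neg))
    return points


def youden_cutoff(y_true, y_prob):
    """Threshold maximizing J = sensitivity + specificity - 1."""
    return max(roc_points(y_true, y_prob), key=lambda r: r[1] - r[2])[0]


def brier(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Made-up labels and probabilities, for illustration only.
y = [0, 0, 1, 1, 1]
p = [0.2, 0.6, 0.5, 0.7, 0.9]
print(auc(y, p), youden_cutoff(y, p), brier(y, p))
```

An AUC of 0.5 corresponds to ranking positives and negatives no better than chance, which is the reference point for interpreting the discrimination reported for the test set.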
The interpretability of the machine learning component was addressed through SHAP (SHapley Additive exPlanations) values. “True” SHAP contributions were extracted from xgboost using predcontrib = TRUE, removing the bias term (BIAS/intercept) and summarizing variable importance through the mean absolute contribution value (mean |SHAP|). Optionally, the shapviz package was integrated for complementary visualizations. This procedure made it possible to identify which latent scores and contextual variables most strongly explained the final classification, thereby reinforcing the interpretive traceability of the hybrid pipeline.
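The aggregation step (dropping the bias column and averaging absolute contributions) is simple enough to show in a few lines. The sketch below assumes a contribution matrix with one row per case and one column per feature plus a bias column, which is the layout returned by xgboost when predcontrib = TRUE. The numerical values and the function name mean_abs_shap are invented for illustration.

```python
def mean_abs_shap(contrib_rows, column_names, bias_name="BIAS"):
    """Average |contribution| per feature, skipping the bias column,
    and return (feature, mean |SHAP|) pairs sorted from largest to smallest."""
    totals = {name: 0.0 for name in column_names if name != bias_name}
    for row in contrib_rows:
        for name, value in zip(column_names, row):
            if name != bias_name:
                totals[name] += abs(value)
    n = len(contrib_rows)
    means = {name: total / n for name, total in totals.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)

# Hypothetical contributions for two cases; the last column is the bias term.
column_names = ["C1", "C3", "BIAS"]
contrib_rows = [
    [0.2, -0.1, 0.5],
    [-0.4, 0.1, 0.5],
]
ranking = mean_abs_shap(contrib_rows, column_names)
```

Because absolute values are averaged, the ranking reflects how strongly a feature moves predictions, not the direction in which it moves them.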

4. Results

In terms of class balance, the distribution of the target variable D3 shows a relatively balanced composition, with a moderate predominance of the affirmative category. Table 2 quantifies this balance and is visually complemented by the right-hand panel of Figure 1, where the direct comparison of frequencies between “No” and “Yes” can be observed.
The composition of Likert-item responses by construct, shown in the left-hand panel of Figure 1, reveals a response pattern with greater concentration in intermediate and high categories, suggesting sufficient variability to estimate a latent model and, at the same time, a general tendency toward non-extreme responses. This descriptive reading is integrated with Table 2, since the balance of D3 directly influences the stability of the classification indicators reported later.

4.1. Results of the PLS-SEM Measurement Model

The evaluation of the measurement model focused on convergent validity through AVE and on the inspection of outer loadings by indicator. The AVE values by construct (Table 3) fall below the conventional threshold of 0.50 in all four cases, indicating that, overall, the share of indicator variance explained by each construct remains limited under this instrument specification.
Consistent with Table 3, the loading analysis reveals heterogeneity in the contribution of items to each construct. Table 4 summarizes descriptive statistics of the loadings by construct, showing that each block contains one indicator with a loading below 0.40, whereas a relevant proportion of items falls within the mid-range (approximately between 0.57 and 0.73). This structure is consistent with the visual representation of the measurement model in Figure 2, where stronger arrows are observed for indicators with higher loadings (e.g., C4_1) and markedly weaker arrows for items with low loadings, especially the “_5” items of each construct.
The full details of the loadings (Table 5) make it possible to specify that the lowest-performing pattern systematically falls on item 5 of each construct: C1_5 = 0.281, C2_5 = 0.393, C3_5 = 0.317, and C4_5 = 0.332. Taken together, Table 5 reinforces the interpretation of Table 4 and explains the reduction in AVE reported in Table 3, since consistently low loadings in specific indicators reduce the aggregate convergent variance of the construct.

4.2. Results of the PLS-SEM Structural Model

The coefficients of the structural model are presented in Table 6 and visualized in Figure 2. The estimated relationship between C1 and C2 (β = 0.629) is of moderate-to-high magnitude, suggesting that higher C1 scores are associated with higher C2 scores within the proposed causal architecture. In addition, C1 maintains a moderate direct effect on C3 (β = 0.446), while C2 also makes a relevant contribution to C3 (β = 0.583), so that C3 is influenced by at least two explanatory sources within the proposed framework (Table 6).
A major structural finding is the high magnitude of the effect of C3 on C4 (β = 0.817), which emerges as the dominant link in the final segment of the model (Table 6) and is reflected in Figure 2 through the arrow with the greatest relative intensity. In contrast, the direct effects of C1 on C4 (β = 0.015) and of C2 on C4 (β = 0.045) are virtually null (Table 6), suggesting that, under this specification, the influence on C4 is channeled primarily through the mediated pathway via C3, a pattern that can also be visually inferred from the path structure in Figure 2.
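The mediated pathway can be made concrete by applying the standard path-tracing rule for recursive models, in which the total effect of one construct on another is the sum, over all directed routes, of the products of the coefficients along each route. The sketch below uses only the six coefficients quoted above. The resulting totals are illustrative derivations, are not reported in the source, and carry no significance test.

```python
# Standardized path coefficients as quoted in the text (Table 6).
paths = {
    ("C1", "C2"): 0.629,
    ("C1", "C3"): 0.446,
    ("C2", "C3"): 0.583,
    ("C3", "C4"): 0.817,
    ("C1", "C4"): 0.015,
    ("C2", "C4"): 0.045,
}

def total_effect(source, target, paths):
    """Sum the coefficient products over every directed route from source to target.
    Assumes the path model is acyclic, as in the C1 -> C2 -> C3 -> C4 structure."""
    if source == target:
        return 1.0
    return sum(
        beta * total_effect(mid, target, paths)
        for (start, mid), beta in paths.items()
        if start == source
    )

total_c1_c4 = total_effect("C1", "C4", paths)           # about 0.707
indirect_c1_c4 = total_c1_c4 - paths[("C1", "C4")]      # about 0.692
total_c2_c4 = total_effect("C2", "C4", paths)           # about 0.521
```

Under these coefficients, almost all of the C1 to C4 association (roughly 0.69 of 0.71) runs through routes that pass via C3, which is the arithmetic counterpart of the near-zero direct paths noted above.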

4.3. Latent Scores and Articulation Toward the Predictive Component

The extraction of latent scores (construct scores) made it possible to transform the PLS-SEM model into a set of compact and continuous variables representing the relative position of each case in C1–C4. Table 7 shows an example of the obtained scores, revealing variability with positive and negative values; this behavior is consistent with centered latent scores that were subsequently used as classification inputs. This representation is conceptually consistent with the purpose of the hybrid pipeline, since it preserves the measurement structure of the instrument (Table 1) while enabling its use in a predictive model.
Based on these scores (Table 7), the predictive dataset was integrated by incorporating the contextual variables D1 and D2 from the instrument (Table 1), resulting in a final feature space that combined the four latent scores with the one-hot encoded categories of D1 and D2. This articulation between Table 1 and Table 7 is central to interpreting the results of the XGBoost component, as it delimits which information from the instrument effectively entered the classifier.

4.4. Predictive Performance of the XGBoost Model

The XGBoost model was trained using the configuration selected through cross-validation, whose hyperparameters are presented in Table 8. The selected combination (eta = 0.05, max_depth = 2, min_child_weight = 3, subsample = 1.0, colsample_bytree = 0.8) achieved an approximate cross-validated AUC of 0.643 and an optimal number of 82 training rounds (Table 8). However, when performance was assessed on the test set, the observed AUC was 0.519, a value represented in the ROC curve in Figure 3, where the trajectory approaches the diagonal, evidencing low overall discrimination across classification thresholds and a marked drop relative to the cross-validated estimate.
In the test set, Table 9 presents the performance metrics. The model achieved an accuracy of 0.640, accompanied by high sensitivity (0.786) and low specificity (0.455). This pattern suggests that the classifier tends to identify the “Yes” class correctly more often than the “No” class, which is consistent with the distribution of the target variable reported in Table 2 and with the decision threshold selected by the Youden criterion (0.467), also included in Table 9. The F1-score of 0.710 reflects a moderate balance between precision and recall, although the AUC-ROC of 0.519 indicates that, when all possible thresholds are considered, overall discriminatory capacity remains limited (Table 9), a finding that is also visually reinforced in the ROC panel of Figure 3.
The contingency table (Table 10) and Figure 4 make the classifier’s behavior more concrete: among 25 test cases, there were 11 true positives (Yes correctly classified), 5 true negatives (No correctly classified), 6 false positives (No classified as Yes), and 3 false negatives (Yes classified as No). This pattern is consistent with the sensitivity and specificity metrics in Table 9 and explains why performance is asymmetric across classes despite an overall accuracy of 0.640.
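The threshold-dependent metrics follow directly from the four cell counts, as the short check below shows. The counts are those quoted above. The function name is an assumption, and the precision value is derived here from the counts rather than quoted from the text.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute standard metrics from the cells of a 2x2 confusion matrix."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }

# Cell counts reported for the 25 test cases.
metrics = classification_metrics(tp=11, tn=5, fp=6, fn=3)
rounded = {name: round(value, 3) for name, value in metrics.items()}
# accuracy 0.64, sensitivity 0.786, specificity 0.455, precision 0.647, f1 0.71
```

With only 11 negative cases in the test set, each additional misclassified "No" shifts specificity by about nine percentage points, so these figures should be read with that granularity in mind.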

4.5. Integrated Interpretability with SHAP and Synthesis of Findings

The variable importance estimated through SHAP (bottom-right panel of Figure 3) shows that predictive contribution is concentrated in a combination of contextual variables and latent scores. In particular, the D2 category “Medium-sized” appears as the feature with the highest mean absolute contribution, followed by the latent scores C1, C3, and C2, with C4 showing a lower contribution within the set of latent variables. Table 11 summarizes the relative order of importance observed in Figure 3, following the ranking logic derived from the contribution analysis.
When the explanatory and predictive results are integrated, the PLS-SEM model identifies a causal structure in which C3 plays an articulating role toward C4, given the high coefficient C3→C4 (Table 6) and its dominant representation in Figure 2. In parallel, the predictive component confirms that C3 is also relevant for classification, although it shares prominence with C1 and with the context variable D2 (Table 11 and Figure 3). This coupling between Table 6 and Table 11 suggests partial coherence between the internal functioning of the latent system and the determinants used by the classifier, even though the overall discrimination of the XGBoost model on the test set remains limited according to the AUC in Table 9 and the observable overlap in probability distributions in Figure 3.
Subsequently, the joint reading of Table 3 and Table 5 provides an interpretive framework for predictive performance: AVE values below 0.50 (Table 3) and the systematic presence of one low-loading indicator per construct, particularly item “_5” in C1–C4 (Table 5), describe a measurement model with partial convergence. This result is consistent with a scenario in which the latent scores (Table 7) capture useful structural signal for the C1→C2→C3→C4 chain (Table 6), but the information available to discriminate D3 in probabilistic terms remains limited in the test set, as reflected by the AUC-ROC close to 0.5 (Table 9) and the error structure of the confusion matrix (Table 10, Figure 4). Overall, the hybrid results indicate partial coherence between the explanatory structure identified by PLS-SEM and the variable-importance profile identified by SHAP, although the out-of-sample discrimination of the XGBoost model remains modest.

4.6. Additional Robustness and Complementary Analyses

Additional robustness analyses were conducted to further assess the stability of the findings. As shown in Table 12, removing the reverse-worded item _5 improved AVE across all constructs, with the largest gain observed for C2 (ΔAVE = +0.072), which was the only construct to exceed the 0.50 threshold under the reduced specification. However, the remaining constructs continued to show suboptimal convergent validity, and the comparison between square roots of AVE and inter-construct correlations indicated persistent discriminant-validity overlap, particularly involving C3.
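The direction and approximate size of the AVE gain can be anticipated from the definition of AVE as the mean of the squared standardized loadings. The sketch below is a back-of-envelope check and not a re-estimation. It assumes five indicators per construct (as Table 1 shows for C1) and holds the remaining loadings fixed, whereas an actual PLS re-estimation would change them slightly.

```python
def ave(loadings):
    """Average variance extracted: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

def ave_without_item(ave_full, n_items, dropped_loading):
    """Approximate AVE after dropping one indicator, other loadings held fixed."""
    remaining_sum_sq = ave_full * n_items - dropped_loading ** 2
    return remaining_sum_sq / (n_items - 1)

# C2: reported AVE of 0.432 and a loading of 0.393 for item C2_5.
approx_c2 = ave_without_item(ave_full=0.432, n_items=5, dropped_loading=0.393)
approx_gain = approx_c2 - 0.432  # about +0.069

# Hypothetical block: four loadings of 0.70 and one of 0.30.
example_ave = ave([0.70, 0.70, 0.70, 0.70, 0.30])  # about 0.41
```

The approximation gives an AVE of about 0.501 for C2, close to the value implied by the reported gain of +0.072, which suggests that most of the improvement comes from removing the low-loading item itself.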
In parallel, the predictive component was contrasted with simpler baseline classifiers. As reported in Table 13, XGBoost achieved the highest AUC (0.519), but the difference relative to random forest (0.481), decision tree (0.477), and logistic regression (0.474) was marginal. These results suggest that the predictive stage should be interpreted as exploratory, with limited discriminatory strength under the current specification.

5. Discussion

The most robust structural finding of the PLS-SEM model lies in the magnitude of the C3→C4 coefficient (β = 0.817), which positions operational management as the direct and dominant determinant of perceived performance, relegating the direct effects of the legal-regulatory framework (C1→C4 = 0.015) and institutional capacity (C2→C4 = 0.045) to virtually null values. This pattern is consistent with an articulating role of operational management: regulation and organizational capacities do not appear to affect management outcomes directly and uniformly, but rather through their translation into structured operational processes of collection, segregation, monitoring, and environmental education. This evidence converges with Guerrero et al. [34], who, based on 117 studies in developing countries, concluded that technical operations constitute the link that transforms institutional conditions into measurable environmental performance, and with the circular infrastructure framework proposed by Velenturf and Purnell [24], which postulates the need for deliberate alignment between regulatory instruments and technical processes in order to materialize the circular transition.
The C1→C2 (β = 0.629) and C2→C3 (β = 0.583) paths, in turn, reveal that the perception of regulatory clarity and enforceability mechanisms is associated with better organizational conditions, and that these conditions enable operational structuring, a sequence consistent with the argument advanced by Marshall and Farahbakhsh [30] regarding the primacy of institutional and governance factors as antecedents of performance. The underlying implication is that, in the municipal context of Cajamarca, the circular economy as a legal instrument generates enabling conditions, but performance is consolidated only when such conditions are converted into effective technical operations. This supports the view that regulation without operational mediation does not produce tangible results, a pattern documented by Di Foggia and Beccarello [37] in European municipal systems.
The triangulation between the explanatory and predictive components of the hybrid pipeline reveals partial convergences that enrich interpretation and, at the same time, expose limitations that should be read in methodological terms rather than as analytical failure. The SHAP analysis identifies medium district size (D2) as the variable with the highest absolute contribution to the prediction of participation in training activities (D3), followed by latent scores C1, C3, and C2, with C4 showing the lowest contribution among the latent variables (Table 11). This ranking should be interpreted cautiously and in exploratory terms, given the weak discriminatory performance of the classifier. Rather than confirming the structural hierarchy identified by the explanatory model, the SHAP profile suggests that the fitted classifier distributed relative importance across contextual and latent predictors in a way that only partially overlaps with the structural sequence. In particular, the prominence of C1 suggests that perceived regulatory coherence may be more closely associated with the disposition to engage in training processes than with performance itself. This complementarity between approaches is precisely what Sharma et al. [19] anticipated when proposing hybrid SEM-machine learning pipelines in which causal structure and algorithmic prediction illuminate different dimensions of the same phenomenon, and it is consistent with the analytical architecture validated by Rashad et al. [42] by integrating PLS-SEM, XGBoost, and SHAP to reveal relationships that no isolated approach can capture. The test-set AUC-ROC of 0.519 indicates weak discriminatory performance, which requires cautious interpretation of the predictive component. In this context, SHAP values are more appropriately understood as an exploratory interpretive aid for examining how the fitted classifier distributed relative importance across predictors, rather than as robust evidence of stable predictive dominance [20,43].
AVE values below 0.50 across the four constructs (C1 = 0.352, C2 = 0.432, C3 = 0.390, and C4 = 0.378) constitute a limitation that requires a contextualized reading rather than a mechanical rejection of the instrument. Müller et al. [40] emphasized that the assessment of convergent validity should consider the semantic heterogeneity of items and the presence of reverse-worded indicators, precisely the situation documented here: the items with the suffix “_5” in each construct, phrased in a problematizing sense (regulatory confusion, staff turnover, lack of training, absence of progress), exhibit systematically low loadings (0.281–0.393), which depress aggregate AVE without invalidating conceptual content. This tension between achievement-oriented items and barrier-oriented items, paradoxically, provides a valuable substantive interpretation: the instrument does not merely measure capacities, but also critical implementation bottlenecks, thereby capturing a semantic duality inherent to municipal management in contexts of high institutional precariousness such as that documented by Marín-Cabanillas et al. [12] for Cajamarca.
Hair et al. [14] warned that the rigid application of psychometric thresholds in exploratory research may penalize instruments that capture emerging constructs in novel contexts, a criterion that is particularly relevant when municipal circular economy is modeled for the first time in Peru. The overlap of latent score distributions between the D3 classes, visible in the diagnostic outputs, explains the classifier’s modest discrimination and is consistent with the argument of Shmueli et al. [18] regarding the distinction between explanatory power and predictive power: a model may capture robust structural relationships while generating limited predictions when the target variable depends on exogenous factors not included in the model, such as the territorial availability of training opportunities or contingent administrative decisions.
The convergence between both components of the hybrid pipeline makes it possible to derive implications that go beyond mere statistical validation. The prominence of medium district size in the SHAP ranking suggests an institutional inflection point: intermediate-scale municipalities may possess sufficient organizational structure to access training processes while simultaneously exhibiting operational gaps that motivate them to actively demand training in circular economy, a pattern consistent with the heterogeneity of municipal capacities documented by Awino et al. [26] across 132 countries and with the evidence provided by Wilson et al. [29] regarding differentiated institutional weakness in local governments of the Global South. From a regulatory perspective, the fact that C1 emerges as the second SHAP predictor while its direct effect on C4 is virtually null in the structural model confirms that the perception of regulatory coherence operates as an enabler of processes—training, institutional strengthening, and demand for circularity—without automatically translating into performance, a finding that reinforces the implementation gap identified by Agovino et al. [25] between formal regulatory frameworks and effective environmental outcomes. Taken together, the pipeline results support the thesis that municipal circular economy requires a multilevel architecture in which regulation enables, institutions organize, and operations execute [23], and that the evaluation of this chain demands methodologies combining theoretical validation, predictive capacity, and algorithmic interpretability [41,44]. For Cajamarca, where 81.9% of waste is disposed of in open dumps [12], these findings orient public policy toward operational strengthening as the primary lever, conditioned by the simultaneous improvement of institutional capacity and regulatory coherence.

6. Conclusions

The hybrid PLS-SEM, XGBoost, and SHAP pipeline applied to 120 observations from Cajamarca made it possible to examine a theoretically grounded causal architecture and to complement it with exploratory predictive assessment and algorithmic interpretability. Regarding the first objective, the measurement model provided partial empirical support for the conceptual structure of the legal-regulatory framework (C1), institutional capacity (C2), operational management (C3), and performance (C4), although AVE values below 0.50 indicate limited convergent strength under the current instrument specification. Regarding the second objective, the structural results suggest that operational management plays the central role in linking regulation and institutional capacity to performance, given the high C3→C4 coefficient (β = 0.817) and the weak direct paths from C1 and C2 to C4. Regarding the third objective, the XGBoost classifier showed modest test-set discrimination (AUC-ROC = 0.519), indicating that training participation is only partially captured by the modeled constructs. Finally, the SHAP analysis identified district size and latent scores C1, C3, and C2 as the most relevant contributors to classification, suggesting that participation in training is embedded in a phase of institutional-operational strengthening rather than being a simple consequence of already consolidated performance. Overall, the findings support a sequential interpretation in which regulation enables, institutions organize, and operations execute, while also indicating that the present hybrid pipeline is more informative for explanatory integration and variable-importance analysis than for strong out-of-sample prediction under the current specification.
This study has some limitations. First, the analytical specification did not include a dedicated technological dimension, such as waste recovery rates, treatment efficiency, infrastructure modernization, or operating-cost indicators. As a result, the model is better interpreted as an assessment of the governance and operational conditions associated with circular economy performance than as a full technical evaluation of municipal waste systems. Future research should integrate technological and engineering indicators with latent governance constructs in order to test whether municipal performance is jointly shaped by institutional organization and technological capability.
From a practical standpoint, the findings suggest that local decision makers should prioritize three lines of action. First, municipalities should strengthen operational management through updated plans, standardized procedures, monitoring systems, and continuous staff training. Second, institutional capacity should be reinforced through better interdepartmental coordination, technical staffing continuity, and stable organizational support for circular-economy programs. Third, regulatory efforts should focus not only on formal compliance, but also on improving local enforceability and implementation coherence. These actions may help reduce the persistent gap between formal circular-economy policy and actual municipal waste-management performance in Cajamarca.

Author Contributions

Conceptualization, P.V.-Z., A.F.H.-S. and G.C.F.-C.; methodology, P.V.-Z., A.F.H.-S. and G.C.F.-C.; software, A.F.H.-S.; validation, P.V.-Z., A.F.H.-S. and G.C.F.-C.; formal analysis, P.V.-Z., A.F.H.-S. and G.C.F.-C.; investigation, P.V.-Z., E.V.R.-F., A.F.H.-S., L.A.V.-Z., J.R.I.-E., K.L.F.-T., P.M.T.-M., R.J.T.-E. and G.C.F.-C.; resources, P.V.-Z., L.A.V.-Z. and G.C.F.-C.; data curation, P.V.-Z., E.V.R.-F. and A.F.H.-S.; writing—original draft preparation, P.V.-Z., A.F.H.-S. and G.C.F.-C.; writing—review and editing, E.V.R.-F., L.A.V.-Z., J.R.I.-E., K.L.F.-T., P.M.T.-M., R.J.T.-E., P.V.-Z., A.F.H.-S. and G.C.F.-C.; visualization, A.F.H.-S. and P.V.-Z.; supervision, G.C.F.-C. and P.V.-Z.; project administration, P.V.-Z. and G.C.F.-C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data supporting the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors would like to thank their respective institutions for their academic and research support during the development of this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AUC-ROC: Area Under the Receiver Operating Characteristic Curve
AVE: Average Variance Extracted
CE: Circular Economy
HTMT: Heterotrait–Monotrait Ratio
MSW: Municipal Solid Waste
MSWM: Municipal Solid Waste Management
PLS-SEM: Partial Least Squares Structural Equation Modeling
SHAP: Shapley Additive Explanations
XGBoost: Extreme Gradient Boosting
CI: Confidence Interval
CR: Composite Reliability
R²: Coefficient of Determination

References

  1. UNEP. Global Waste Management Outlook 2024. United Nations Environment Programme. 2024. Available online: https://www.unep.org/resources/global-waste-management-outlook-2024 (accessed on 16 January 2026).
  2. Kaza, S.; Yao, L.; Bhada-Tata, P.; Van Woerden, F. What a Waste 2.0: A Global Snapshot of Solid Waste Management to 2050; World Bank Publications: Washington, DC, USA, 2018.
  3. Geissdoerfer, M.; Savaget, P.; Bocken, N.M.P.; Hultink, E.J. The circular economy—A new sustainability paradigm? J. Clean. Prod. 2017, 143, 757–768.
  4. Kirchherr, J.; Yang, N.H.N.; Schulze-Spüntrup, F.; Heerink, M.J.; Hartley, K. Conceptualizing the circular economy (revisited): An analysis of 221 definitions. Resour. Conserv. Recycl. 2023, 194, 107001.
  5. Blomsma, F.; Brennan, G. The emergence of circular economy: A new framing around prolonging resource productivity. J. Ind. Ecol. 2017, 21, 603–614.
  6. Ghisellini, P.; Cialani, C.; Ulgiati, S. A review on circular economy: The expected transition to a balanced interplay of environmental and economic systems. J. Clean. Prod. 2016, 114, 11–32.
  7. Graziani, P. Circular economy in Latin America and the Caribbean: Drivers, opportunities, barriers and strategies. J. Clean. Prod. 2024, 448, 141586.
  8. Betancourt Morales, C.M.; Zartha Sossa, J.W. Circular economy in Latin America: A systematic literature review. Bus. Strategy Environ. 2020, 29, 2479–2497.
  9. Vilela-Pincay, W.; Espinosa-Fuentes, M.; Bravo-Cedeño, C. Circular economy and municipal waste management: Institutional challenges in developing countries. Sustainability 2023, 15, 9654.
  10. Ferronato, N.; Torretta, V. Waste mismanagement in developing countries: A review of global issues. Int. J. Environ. Res. Public Health 2019, 16, 1060.
  11. MINAM. Anuario Estadístico del Sector Ambiente 2024; Ministerio del Ambiente del Perú: Lima, Peru, 2024.
  12. Marín-Cabanillas, W.S.; Sánchez-Dávila, M.E.; Delgado-Bardales, J.M. Generación y disposición final de residuos sólidos municipales en la Región Cajamarca, Perú. Rev. Científica FIPCAEC 2023, 8, 871–890.
  13. INEI. Registro Nacional de Municipalidades—RENAMU; Instituto Nacional de Estadística e Informática: Lima, Peru, 2019.
  14. Hair, J.F.; Hult, G.T.M.; Ringle, C.M.; Sarstedt, M. A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM), 3rd ed.; Sage Publications: Thousand Oaks, CA, USA, 2022.
  15. Sarstedt, M.; Hair, J.F.; Pick, M.; Liengaard, B.D.; Radomir, L.; Ringle, C.M. Progress in partial least squares structural equation modeling use in marketing research in the last decade. Psychol. Mark. 2022, 39, 1035–1064.
  16. Dangelico, R.M.; Alvino, L.; Fraccascia, L. Investigating the antecedents of consumer behavioral intention for sustainable fashion: A PLS-SEM approach. J. Clean. Prod. 2023, 401, 136735.
  17. Yu, Z.; Khan, S.A.R.; Umar, M. Circular economy practices and industry 4.0 technologies: A strategic move of automobile industry. Bus. Strategy Environ. 2022, 31, 796–809.
  18. Shmueli, G.; Sarstedt, M.; Hair, J.F.; Cheah, J.H.; Ting, H.; Vaithilingam, S.; Ringle, C.M. Predictive model assessment in PLS-SEM: Guidelines for using PLSpredict. Eur. J. Mark. 2019, 53, 2322–2347.
  19. Sharma, P.; Panday, P.; Dangwal, R.C. Hybrid SEM-machine learning approach for predicting pro-environmental behavior. J. Environ. Manag. 2023, 345, 118778.
  20. Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 2017, 30, 4765–4774.
  21. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
  22. Murray, A.; Skene, K.; Haynes, K. The circular economy: An interdisciplinary exploration. J. Bus. Ethics 2017, 140, 369–380.
  23. Prieto-Sandoval, V.; Jaca, C.; Ormazabal, M. Towards a consensus on the circular economy. J. Clean. Prod. 2018, 179, 605–615.
  24. Velenturf, A.P.M.; Purnell, P. Principles for a sustainable circular economy. Sustain. Prod. Consum. 2021, 27, 1437–1457.
  25. Agovino, M.; Ferrara, M.; Garofalo, A. European waste management regulations and the transition towards circular economy. J. Environ. Manag. 2024, 354, 120345.
  26. Awino, F.B.; Apitz, S.E.; Okot-Okumu, J. Solid waste management in the context of the waste hierarchy and circular economy frameworks. Integr. Environ. Assess. Manag. 2024, 20, 9–35.
  27. Xavier, L.H.; Ottoni, M.; Lepawsky, J. Circular economy and e-waste management in the Americas: Brazilian and Canadian frameworks. J. Clean. Prod. 2021, 297, 126570.
  28. Gutberlet, J.; Besen, G.R.; Morais, L. Cooperative recycling and the circular economy in Latin American cities. World Dev. 2023, 167, 106226.
  29. Wilson, D.C.; Velis, C.A.; Rodic, L. Integrated sustainable waste management in developing countries. Proc. Inst. Civ. Eng. Waste Resour. Manag. 2013, 166, 52–68.
  30. Marshall, R.E.; Farahbakhsh, K. Systems approaches to integrated solid waste management in developing countries. Waste Manag. 2013, 33, 988–1003.
  31. Orihuela, J.C. Análisis de la Eficiencia de la Gestión Municipal de Residuos Sólidos en el Perú; INEI: Lima, Peru, 2018.
  32. Rigamonti, L.; Gross, B.; Marinello, S.; Rubino, A. Assessing the performance of municipal solid waste management systems. Waste Manag. 2024, 178, 43–55.
  33. Abdel-Shafy, H.I.; Mansour, M.S.M. Solid waste issue: Sources, composition, disposal, recycling, and valorization. Egypt. J. Pet. 2018, 27, 1275–1290.
  34. Guerrero, L.A.; Maas, G.; Hogland, W. Solid waste management challenges for cities in developing countries. Waste Manag. 2013, 33, 220–232.
  35. Wilson, D.C.; Rodic, L.; Cowing, M.J.; Velis, C.A.; Whiteman, A.D.; Scheinberg, A.; Vilches, R.; Masterson, D.; Stretz, J.; Oelz, B. ‘Wasteaware’ benchmark indicators for integrated sustainable waste management in cities. Waste Manag. 2015, 35, 329–342.
  36. Diéguez-Santana, K.; Rodríguez Rudi, G.; Acevedo Urquiaga, A.J.; Muñoz, E.; Sablón-Cossio, N. An assessment tool for the evaluation of circular economy implementation. Acad. Rev. Latinoam. Adm. 2021, 34, 316–328.
  37. Di Foggia, G.; Beccarello, M. Drivers of municipal solid waste management cost based on cost models. Waste Manag. 2023, 155, 324–333.
  38. Ringle, C.M.; Sarstedt, M.; Straub, D.W. A critical look at the use of PLS-SEM in MIS Quarterly revisited. MIS Q. 2023, 47, 1553–1578.
  39. Liengaard, B.D.; Sharma, P.N.; Hult, G.T.M.; Jensen, M.B.; Sarstedt, M.; Hair, J.F.; Ringle, C.M. Prediction: Coveted, yet forsaken? Decis. Sci. 2021, 52, 362–392.
  40. Müller, T.; Schuberth, F.; Henseler, J. How to assess discriminant validity in PLS-SEM. Ind. Manag. Data Syst. 2024, 124, 1438–1462.
  41. Sarstedt, M.; Hair, J.F.; Ringle, C.M. PLS-SEM: Indeed a silver bullet—Retrospective observations and recent advances. J. Mark. Theory Pract. 2024, 32, 261–275.
  42. Rashad, E.; Liu, Y.; Lü, D.; Refaee, A.; Pan, T. Unraveling scale-dependent relationships among ecological services using ML and PLS-SEM. Land Degrad. Dev. 2025, 36, e70326.
  43. Molnar, C. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable, 2nd ed.; Lulu Press: Raleigh, NC, USA, 2022.
  44. Cakiroglu, C.; Demir, S.; Ozdemir, M.H.; Aylak, B.L.; Sariisik, G.; Abualigah, L. Data-driven interpretable ensemble learning methods incorporating SHAP analysis. Expert Syst. Appl. 2024, 237, 121464.
  45. Simões, P.; Marques, R.C. Influence of regulation on the productivity of waste utilities: What can we learn with the Portuguese experience? Waste Manag. 2012, 32, 1266–1275.
Figure 1. Composition of Likert responses by construct (C1–C4) and class distribution of D3.
Figure 2. Diagram of the PLS-SEM model with indicator loadings and standardized structural coefficients.
Figure 3. Diagnostics of the XGBoost model: ROC curve, decile-based calibration, distribution of predicted probabilities, and SHAP importance.
Figure 4. Confusion matrix of the XGBoost model in the test set.
Table 1. Item coding by question.

| Code | Item |
| --- | --- |
| C1_1 | National waste management regulations provide clear guidelines for implementing circular economy practices in our municipality. |
| C1_2 | Our municipality has local ordinances that effectively promote the reduction, reuse, and recycling of solid waste. |
| C1_3 | Current legal instruments facilitate coordination among different levels of government for the sustainable management of waste. |
| C1_4 | Regulations on extended producer responsibility for managing product-related waste are effectively enforced within our municipal territory. |
| C1_5 | The current waste management regulatory framework is confusing and hinders the implementation of circular economy strategies. |
| C1_6 | The control and sanction mechanisms established in the regulations enable the effective enforcement of provisions on sustainable waste management. |
| C2_1 | Our municipality has sufficient and adequately trained technical staff to implement sustainable waste management programs. |
| C2_2 | We have adequate financial resources to develop infrastructure and circular economy programs for solid waste. |
| C2_3 | There is good coordination among the different municipal departments involved in waste management (cleaning, enforcement, logistics). |
| C2_4 | Municipal leadership demonstrates an effective commitment to implementing circular economy practices. |
| C2_5 | The high turnover of municipal officials limits the continuity of sustainable waste management programs. |
| C2_6 | Our municipality has the technical capacity to develop strategic partnerships with the private sector for waste valorization. |
| C3_1 | Our municipality has clear and updated operational plans for integrated solid waste management. |
| C3_2 | Waste collection, transport, and disposal procedures are carried out in accordance with established and standardized protocols. |
| C3_3 | We have adequate infrastructure (vehicles, equipment, facilities) to implement circular economy strategies. |
| C3_4 | Regular monitoring and evaluation activities are conducted to assess the performance of waste management programs. |
| C3_5 | The lack of continuous training for operational staff negatively affects the quality of waste management services. |
| C3_6 | We develop effective citizen awareness and education programs on source separation and circular economy. |
| C4_1 | Our municipality has significantly increased waste collection service coverage in recent years. |
| C4_2 | Source separation programs have shown positive results in terms of citizen participation and the amount of material separated. |
| C4_3 | We have succeeded in increasing recycling and waste valorization rates within our municipal territory. |
| C4_4 | The municipality consistently complies with the environmental goals and standards established for waste management. |
| C4_5 | Waste management performance indicators have shown little or no progress over the past two years. |
| C4_6 | Continuous improvement has been achieved in the efficiency and sustainability of the municipal solid waste management system. |
| D1 | Years of experience in public/environmental management: |
| D2 | Perceived size of your district: |
| D3 | In the past 12 months, have you participated in training activities on circular economy or sustainable waste management? |


| Component | Operational description | Variables/indicators included | Scale/categories |
| --- | --- | --- | --- |
| Construct C1 | Latent dimension measured by observed indicators | C1_1, C1_2, C1_3, C1_4, C1_5, C1_6 | Likert 1–5 |
| Construct C2 | Latent dimension measured by observed indicators | C2_1, C2_2, C2_3, C2_4, C2_5, C2_6 | Likert 1–5 |
| Construct C3 | Latent dimension measured by observed indicators | C3_1, C3_2, C3_3, C3_4, C3_5, C3_6 | Likert 1–5 |
| Construct C4 | Latent dimension measured by observed indicators | C4_1, C4_2, C4_3, C4_4, C4_5, C4_6 | Likert 1–5 |
| Contextual variable D1 | Ordinal contextual profile | D1 | Less than 2 years; 2–5 years; 6–10 years; 11–15 years; More than 15 years |
| Contextual variable D2 | Ordinal contextual profile | D2 | Very small; Small; Medium-sized; Large; Very large |
| Target variable D3 | Dichotomous classification variable | D3 | No; Yes |

Table 2. Distribution of the target variable D3 (full dataset, n = 120).

| Class (D3) | Frequency | Percentage |
| --- | --- | --- |
| No | 54 | 45.0% |
| Yes | 66 | 55.0% |
| Total | 120 | 100.0% |

Table 3. Convergent validity (AVE) by construct (PLS-SEM).

| Construct | AVE |
| --- | --- |
| C1 | 0.352 |
| C2 | 0.432 |
| C3 | 0.390 |
| C4 | 0.378 |

Table 4. Summary of external loadings by construct (PLS-SEM).
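
| Construct | No. of Items | Mean Loading | Min. Loading | Max. Loading | Items with Loading ≥ 0.70 | Items with Loading < 0.40 |
| --- | --- | --- | --- | --- | --- | --- |
| C1 | 6 | 0.573 | 0.281 | 0.746 | 1 | 1 |
| C2 | 6 | 0.645 | 0.393 | 0.777 | 3 | 1 |
| C3 | 6 | 0.609 | 0.317 | 0.730 | 2 | 1 |
| C4 | 6 | 0.597 | 0.332 | 0.816 | 2 | 1 |
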
Table 5. External loadings by indicator (PLS-SEM).
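
| Construct | Indicator | Loading |
| --- | --- | --- |
| C1 | C1_1 | 0.492 |
| C1 | C1_2 | 0.631 |
| C1 | C1_3 | 0.593 |
| C1 | C1_4 | 0.746 |
| C1 | C1_5 | 0.281 |
| C1 | C1_6 | 0.696 |
| C2 | C2_1 | 0.674 |
| C2 | C2_2 | 0.714 |
| C2 | C2_3 | 0.777 |
| C2 | C2_4 | 0.736 |
| C2 | C2_5 | 0.393 |
| C2 | C2_6 | 0.574 |
| C3 | C3_1 | 0.730 |
| C3 | C3_2 | 0.728 |
| C3 | C3_3 | 0.643 |
| C3 | C3_4 | 0.609 |
| C3 | C3_5 | 0.317 |
| C3 | C3_6 | 0.625 |
| C4 | C4_1 | 0.816 |
| C4 | C4_2 | 0.700 |
| C4 | C4_3 | 0.566 |
| C4 | C4_4 | 0.594 |
| C4 | C4_5 | 0.332 |
| C4 | C4_6 | 0.575 |
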
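The AVE values in Table 3 can be traced back to the loadings in Table 5. The sketch below assumes the standard definition for standardized indicators, AVE as the mean of the squared outer loadings, and uses the 0.50 reference value cited in the robustness assessment. The names `LOADINGS` and `average_variance_extracted` are illustrative and do not come from the authors' code.

```python
# Outer loadings transcribed from Table 5 (items _1 to _6 of each construct).
LOADINGS = {
    "C1": [0.492, 0.631, 0.593, 0.746, 0.281, 0.696],
    "C2": [0.674, 0.714, 0.777, 0.736, 0.393, 0.574],
    "C3": [0.730, 0.728, 0.643, 0.609, 0.317, 0.625],
    "C4": [0.816, 0.700, 0.566, 0.594, 0.332, 0.575],
}


def average_variance_extracted(loadings):
    """Return the AVE, assumed here to be the mean of the squared loadings."""
    return sum(value * value for value in loadings) / len(loadings)


ave = {name: average_variance_extracted(vals) for name, vals in LOADINGS.items()}

for name, value in ave.items():
    # 0.50 is the conventional convergent-validity reference value.
    status = "meets" if value >= 0.50 else "is below"
    print(f"{name}: AVE = {value:.3f} ({status} 0.50)")
```

Under this assumption the computed values round to the figures reported in Table 3, and every construct remains below 0.50 in the full model.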
Table 6. Coefficients of the structural model (PLS-SEM).

| Predictor | Outcome | Coefficient (β) |
| --- | --- | --- |
| C1 | C2 | 0.629 |
| C1 | C3 | 0.446 |
| C2 | C3 | 0.583 |
| C3 | C4 | 0.817 |
| C1 | C4 | 0.015 |
| C2 | C4 | 0.045 |

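Because the direct paths from C1 and C2 to C4 are close to zero while C3 → C4 is large, it can help to see how much of their influence would run through C3. The following is a minimal sketch that applies ordinary path tracing (summing the products of coefficients along every directed path) to the Table 6 coefficients. The resulting totals are derived for illustration only; they are not values reported by the authors and carry no significance test. `PATHS` and `total_effect` are hypothetical names.

```python
# Standardized structural coefficients transcribed from Table 6.
PATHS = {
    ("C1", "C2"): 0.629,
    ("C1", "C3"): 0.446,
    ("C2", "C3"): 0.583,
    ("C3", "C4"): 0.817,
    ("C1", "C4"): 0.015,
    ("C2", "C4"): 0.045,
}


def total_effect(source, target, paths=PATHS):
    """Sum the coefficient products over all directed paths source -> target.

    The recursion terminates because the structural model is acyclic.
    """
    if source == target:
        return 1.0
    return sum(
        beta * total_effect(mid, target, paths)
        for (start, mid), beta in paths.items()
        if start == source
    )


for construct in ("C1", "C2", "C3"):
    direct = PATHS.get((construct, "C4"), 0.0)
    total = total_effect(construct, "C4")
    print(f"{construct} -> C4: direct={direct:.3f}, "
          f"indirect={total - direct:.3f}, total={total:.3f}")
```

Under these assumptions, almost all of the effect of C1 and C2 on C4 is indirect and passes through C3.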
Table 7. Example of latent scores (first 6 observations).
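
| Case | C1 | C2 | C3 | C4 |
| --- | --- | --- | --- | --- |
| 1 | 0.889 | 1.220 | 1.290 | 1.260 |
| 2 | 0.471 | −0.230 | −1.200 | 0.842 |
| 3 | 0.184 | −1.730 | −1.100 | −1.650 |
| 4 | −0.228 | 0.313 | 0.686 | −0.157 |
| 5 | 0.881 | −0.202 | 0.550 | 0.712 |
| 6 | 0.370 | 0.149 | 0.880 | 0.924 |
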
Table 8. Selected XGBoost model configuration through cross-validation.

| Parameter | Value |
| --- | --- |
| eta | 0.05 |
| max_depth | 2 |
| min_child_weight | 3 |
| subsample | 1.0 |
| colsample_bytree | 0.8 |
| best_iter (rounds) | 82 |
| Cross-validated AUC (cv_auc) | 0.643 |

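For readers who want to reuse the Table 8 configuration, the sketch below collects it in the form of an XGBoost parameter dictionary. The parameter names in the table match XGBoost's native names. The `objective` and `eval_metric` entries are assumptions inferred from the binary target D3 and the AUC-based selection; the table does not state them. `build_params` is a hypothetical helper.

```python
# Hyperparameters transcribed from Table 8.
TABLE_8 = {
    "eta": 0.05,
    "max_depth": 2,
    "min_child_weight": 3,
    "subsample": 1.0,
    "colsample_bytree": 0.8,
}

# "best_iter (rounds)" from Table 8: the number of boosting rounds.
NUM_BOOST_ROUND = 82


def build_params(objective="binary:logistic", eval_metric="auc"):
    """Merge the Table 8 values with ASSUMED objective and metric settings."""
    params = dict(TABLE_8)
    params["objective"] = objective      # assumption: not stated in Table 8
    params["eval_metric"] = eval_metric  # assumption: not stated in Table 8
    return params


params = build_params()
print(params, NUM_BOOST_ROUND)

# With xgboost installed and a training DMatrix named dtrain (not shown here),
# the dictionary could be passed as:
#   xgboost.train(params, dtrain, num_boost_round=NUM_BOOST_ROUND)
```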
Table 9. Predictive performance of the XGBoost model on the test set.

| Metric | Value |
| --- | --- |
| AUC-ROC | 0.519 |
| Accuracy | 0.640 |
| Sensitivity (Recall) | 0.786 |
| Specificity | 0.455 |
| Precision | 0.647 |
| F1-score | 0.710 |
| Brier score | 0.269 |
| Threshold (Youden) | 0.467 |

Table 10. Confusion matrix in the test set (n = 25).

| Observed \ Predicted | No | Yes |
| --- | --- | --- |
| Yes | 3 | 11 |
| No | 5 | 6 |

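The threshold-dependent metrics in Table 9 follow directly from the counts in Table 10. The sketch below recomputes them using the usual definitions, with "Yes" treated as the positive class; that choice is an assumption, though it is the one that reproduces the reported sensitivity. The function name `classification_metrics` is illustrative.

```python
# Counts transcribed from Table 10, with "Yes" as the positive class.
TP, FN = 11, 3  # observed Yes: predicted Yes / predicted No
TN, FP = 5, 6   # observed No:  predicted No  / predicted Yes


def classification_metrics(tp, fn, tn, fp):
    """Return the threshold-dependent metrics of a binary confusion matrix."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


metrics = classification_metrics(TP, FN, TN, FP)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

AUC-ROC and the Brier score cannot be recovered this way, because they depend on the predicted probabilities rather than on the thresholded classes.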
Table 11. Relative ranking of variables according to SHAP importance (mean absolute value) observed in Figure 3.
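
| Rank | Variable (Feature) |
| --- | --- |
| 1 | D2 = Medium-sized |
| 2 | C1 |
| 3 | C3 |
| 4 | C2 |
| 5 | C4 |
| 6 | D1 = 2–5 years |
| 7 | D2 = Very small |
| 8 | D2 = Very large |
| 9 | D1 = Less than 2 years |
| 10 | D1 = More than 15 years |
| 11 | D1 = 6–10 years |
| 12 | D1 = 11–15 years |
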
Table 12. Complementary Robustness Assessment of the Measurement Model.

| Construct | AVE (Full Model) | √AVE | Highest Inter-Construct Correlation | AVE Without Item _5 | ΔAVE | Interpretation |
| --- | --- | --- | --- | --- | --- | --- |
| C1 | 0.352 | 0.593 | 0.647 | 0.413 | +0.062 | Convergent validity improved after removing item _5, but remained below 0.50; discriminant validity concerns persisted. |
| C2 | 0.432 | 0.658 | 0.709 | 0.505 | +0.072 | Convergent validity improved and exceeded 0.50 after removing item _5, although overlap with other constructs remained. |
| C3 | 0.390 | 0.624 | 0.709 | 0.440 | +0.050 | Partial improvement was observed, but convergent and discriminant validity issues remained. |
| C4 | 0.378 | 0.615 | 0.701 | 0.426 | +0.048 | The construct showed moderate sensitivity to item _5, yet validity limitations persisted. |

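The discriminant validity reading in Table 12 rests on comparing √AVE with the highest inter-construct correlation, which is how the Fornell-Larcker criterion is usually applied. The sketch below repeats that comparison on the rounded table values. Whether the authors used exactly this decision rule is an assumption, and `TABLE_12` and `fornell_larcker` are illustrative names.

```python
import math

# (AVE of the full model, highest inter-construct correlation) from Table 12.
TABLE_12 = {
    "C1": (0.352, 0.647),
    "C2": (0.432, 0.709),
    "C3": (0.390, 0.709),
    "C4": (0.378, 0.701),
}


def fornell_larcker(ave, max_correlation):
    """Return sqrt(AVE) and whether it exceeds the highest correlation."""
    root_ave = math.sqrt(ave)
    return root_ave, root_ave > max_correlation


results = {name: fornell_larcker(*values) for name, values in TABLE_12.items()}

for name, (root_ave, passes) in results.items():
    verdict = "satisfied" if passes else "not satisfied"
    print(f"{name}: sqrt(AVE) = {root_ave:.3f}, criterion {verdict}")
```

On these inputs no construct satisfies the criterion, in line with the persistent discriminant validity concerns noted in the table. Small differences from the tabulated √AVE values can arise because the inputs here are already rounded to three decimals.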
Table 13. Comparative Predictive Robustness Across Classification Models.

| Model | AUC | Accuracy | Balanced Accuracy | F1 | Brier Score | Interpretation |
| --- | --- | --- | --- | --- | --- | --- |
| XGBoost | 0.519 | 0.400 | 0.435 | 0.211 | 0.252 | Highest AUC among tested models, but predictive discrimination remained weak. |
| Random forest | 0.481 | 0.600 | 0.565 | 0.706 | 0.273 | Showed better accuracy and F1 than XGBoost, although discriminatory capacity remained limited. |
| Decision tree | 0.477 | 0.480 | 0.477 | 0.519 | 0.314 | Produced weak and unstable classification performance. |
| Logistic regression | 0.474 | 0.400 | 0.406 | 0.400 | 0.273 | Offered no meaningful predictive improvement under the current specification. |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Vera-Zelada, P.; Ramos-Farroñán, E.V.; Haro-Sarango, A.F.; Vera-Zelada, L.A.; Izquierdo-Espinoza, J.R.; Florez-Tolentino, K.L.; Torres-Moya, P.M.; Tejada-Estrada, R.J.; Farfán-Chilicaus, G.C. Evaluating Circular Economy Performance in Municipal Solid Waste Management: A Hybrid Structural Equation Modeling and Explainable Machine Learning Study from Cajamarca. Environments 2026, 13, 201. https://doi.org/10.3390/environments13040201
