1. Introduction
Nowadays, smart cities play a key role in sustainable development and improving the quality of citizens’ lives. The smart management of various aspects of the urban environment, including waste management, is an extremely important task. The growing volume of waste, which has a negative impact on the environment and resources, requires the development of effective and innovative approaches to waste management.
According to World Bank estimates [1], the global level of waste generation is approximately 1.2 billion tons per year, and this value is expected to increase to 2.2 billion tons per year by the end of 2025. In addition, the war in Ukraine [2] has led to the accumulation of significant amounts of “military” waste, as well as pharmaceutical and construction waste from destroyed infrastructure and other materials in the destroyed cities. This situation requires significant efforts to reorganize and reorient the existing waste collection and disposal mechanisms, which have been inefficient and unable to ensure an adequate level of waste management during the crisis.
To ensure the optimal use of resources and reduce the negative impact on the environment, it is necessary to improve methods and approaches to waste management. This investigation aimed to develop an intelligent method of urban waste management, including clustering of waste sources, accurate forecasting of waste volumes, and evaluation of forecast results. Additionally, the possibilities of applying clustering methods to group waste sources were investigated, which allows for the development of more targeted resource management plans. A key part of the research is the use of an ensemble learning model to accurately predict waste volumes, taking into account variations across clusters and city attributes.
This work combines a conceptual approach with a practical analysis, using a real dataset to validate and verify the results of the proposed method. The authors aimed to contribute to improving the rational use of resources and reducing the negative impact of waste on the environment by providing specific recommendations for effective waste management in urban environments.
Also, the proposed method can be applied in the concept of smart cities and IoT proposed by A. Camero, A. Kirimtat, S. Talari et al. [3,4], namely in the processing layer and business layer.
Therefore, the aim of the work was to develop a method for urban waste management using intelligent methods of classification, clustering, and forecasting. This approach contributes to more efficient waste management and supports the sustainable development of urban environments.
This article has the following structure. The Literature Review section presents an analysis of research in this area, while the Materials and Methods section describes an intelligent method for managing city waste volumes. Next, the Implementation section provides an example of the implementation of the developed method and the results of experimental studies. In the Discussion section, the results are discussed in comparison to the closest analogs, and important aspects of the accuracy of the forecasts and the approaches used are given. Finally, the Conclusions section summarizes the results of the entire study.
2. Literature Review
The problem of urban waste management is a global challenge that manifests itself in different ways in different regions in the world. A study in Benin City, Nigeria [
5], showed the low awareness level of residents about improper waste disposal and the impact on greenhouse gas emissions. The authors called for the integration of regional government services such as infrastructure, urban planning, and development into waste management policies.
Gutberlet, J. [6] focused on the field of waste in urbanized agglomerations, especially in the Global South. This work showed the role of informal waste pickers in waste collection and recycling and their unrecognized contribution to reducing the carbon footprint of cities, resource recovery, and improving environmental conditions. The study emphasized the need to recognize these initiatives and organize them into community networks in order to support sustainability and reduce the negative impact of cities on climate and environmental change.
Raab, K.; Tolotti, G.; and Wagner, R. [
7] conducted a study in Guatemala and focused on waste management behavior in suburbs. The study examined the critical events, decisions, and emotions associated with the discarding of household items by poor consumers. It was found that religion, social norms, and interpersonal relationships significantly influence consumers’ waste management behavior, playing a key role in controlling resource management in suburbs.
Gardiner, R. et al. showed in their work [
8] the relationship between economic growth and waste generation in the E.U. It revealed a two-way causal relationship between these elements, emphasizing that traditional economic policies are not enough to reduce waste production and pointing to the need to introduce environmental innovations.
Kebaili, F. K. et al. [
9] studied the impact of various factors on household waste management in Algeria. The use of GIS and principal component analysis showed that the distribution of waste depends on the geographical and socio-demographic regional characteristics.
Lee-Geiller, S. and Kütting, G. [
10] compared the approaches used in New York and Seoul for waste management, revealing the difference in the roles of different actors and the effectiveness of recycling. This study expanded the understanding of waste management by integrating the concept of environmental stewardship.
Iqbal, A. et al. [11] reviewed the use of life cycle assessment (LCA) in MSW management, highlighting best practices and approaches related to data quality and methodologies. Their work focused on the integration of different waste treatment and disposal technologies and showed the importance of sensitivity analysis for the reliability of results.
Also, Lu, D. and Iqbal, A. et al. [
12] considered the management of medical waste during the COVID-19 pandemic. This paper presented a model integrating LCA and data envelopment analysis (DEA) to evaluate the effectiveness of medical waste management, where better treatment and disposal strategies were applied.
Iqbal, A. and Zan, F. et al. [
13] in their work analyzed the integrated MSW management system of Hong Kong using LCA to assess the impact of different waste treatment scenarios on global warming and energy consumption. As a result, the importance of integrating different treatment methods to optimize MSW management was shown.
The above studies [5,6,7,8,9,10,11,12,13] point out the complexity of the problem of waste management in cities and the variety of approaches to solving this problem. The authors of these papers emphasize that intelligent approaches that include data analysis, environmental innovation, and the integration of different sectors of society can greatly improve waste management. The use of advanced technologies such as GIS and an understanding of social and economic interactions can lead to more efficient and sustainable solutions to urban waste management.
Qiang Zhang and Xujuan Zhang et al. [
14] proposed improving the accuracy of waste sorting using deep learning and intelligent classification based on computer vision/mobile terminals. Linda Andeobu [
15] focused on resource recovery (reuse, recycling, and energy recovery from waste) using artificial intelligence in different waste management sectors (generation, sorting, collection, transport routing, treatment, disposal, and planning).
Kunsen Lin et al. [
16] presented a comprehensive survey of deep learning and its application in municipal solid waste management. In this research, the authors compared technologies and their algorithms, such as ANN, CNN, RNN/LSTM, attention, and GAN, in terms of collection, transportation, final disposal, recovery, and predicting the amount and composition of waste.
Raimir Holanda and Aniello Castiglione et al. [
17,
18] focused in their research on solutions based on low-power wide-area network (LPWAN) and blockchain technologies that were developed to provide the necessary data to improve the efficiency of solid waste collection. J. Hidalgo-Crespo and Ninghui Li [
19,
20] proposed the use of a deep learning methodology that can recognize typical waste during transportation on a conveyor belt in waste collection systems based on convolutional neural networks (CNN). Soumyabrata Saha et al. [
21] proposed an Internet of Things (IoT) architecture for a smart waste management system during COVID-19.
Gue, I. H. V. and Lopez, N. S. [22] proposed a rule-based machine learning model to assess the impact of city and country characteristics on waste management. The model was built on data from 100 cities in 41 countries, and it achieved a binary classification accuracy of 89–91%.
Zhang, C. [23] proposed the use of an XGBoost machine learning method to predict the generation of municipal solid waste (MSW) in China from 2020 to 2060 under five different shared socioeconomic pathway (SSP) scenarios. The results of the study showed that the generation of MSW in China will double or even triple by 2060.
Kutty, A. et al. in their work [
24] proposed a new two-stage machine learning model to assess the sustainability and comfort of life in smart cities. The model combines multidimensional metric-distance analysis with machine learning methods. It showed that the gradient boosting machine is the best classification and predictive model for assessing the sustainability, livability, and overall performance of smart cities.
Izquierdo-Horna, L. et al. [
25] proposed a machine learning method to identify the city sectors prone to solid waste accumulation. The model was built on the 10 basic social indicators (age, education, income, etc.) and showed that the most important social indicators that help identify these sectors are monthly income, consumption patterns, age, and household population density.
El Ouadi, J. et al. [
26] showed that the SVM algorithm is the best demand-forecasting algorithm among the BR, LR, ANN, RF, KNN, and CART algorithms. It also gives the best results for data I [
27] and II [
28].
Table 1 shows the results of a generalized analysis of known methods and solutions according to the following structure: authors, description of research, and the main results obtained.
Compared to the well-known solutions and methods in the field of urban waste management presented in
Table 1, this research proposes a new intelligent approach to urban waste management. The proposed method includes waste classification and clustering, simulation of waste-volume forecasts using different models, and evaluation. That approach contributes to optimal waste management and improvement of the environmental situation in the city relative to the closest analogs [
22,
23,
25,
26]. The work brings scientific novelty by providing an integrated approach to waste management based on intelligent methods and expands knowledge about optimal waste management strategies to improve the environment and community life.
3. Proposed Method
The method uses the classification of waste by type and clustering of waste generation locations to predict future waste volumes. The steps of the method and its structure are shown in the form of an algorithm (Figure 1) and described below:
- Step 1.
Classification of waste [16,29,30] (Figure 1, Block 1): Using data [31] on the composition and quantity of waste, it is classified by type. Various parameters can be used to classify waste, such as composition, quantity, size, shape, etc. However, for a more accurate classification, machine learning methods can be used [32,33,34,35]. To classify waste, it is necessary (Figure 1, Blocks 2,3) to have sufficient data about, for example, its composition and other parameters that affect its classification. After the model is trained, it can be used to classify new kinds of waste.
- Step 2.
Clustering of waste generation locations (Figure 1, Block 4), following A.E. Ezugwu and Ahmad A. [36,37]: Using data on the place and time of waste generation, waste generation sites are clustered using algorithms such as K-Means, agglomerative clustering, DBSCAN, birch, OPTICS, and spectral clustering.
These clustering algorithms group data based on the distances between their variables X1, ..., Xn.
Description of the algorithms:
K-Means (KM) [38,39] minimizes the sum of squared distances between points ($x_i$) and their cluster centers ($\mu_j$);
Agglomerative clustering (AC) [
40] uses a hierarchical approach, combining the closest clusters by different distance metrics: single linkage, complete linkage, average linkage, etc.;
DBSCAN [40] groups points based on density (minPts) and radius (eps): if the distance between $x_i$ and $x_j$ is less than eps, and the number of points in the neighborhood of $x_i$ is greater than minPts, $x_i$ and $x_j$ belong to the same cluster;
Birch [40] is based on a CF-tree that stores cluster statistics (the number of points, their sum, and their sum of squares) for efficient clustering;
OPTICS [
41] is similar to DBSCAN but uses reachability and density order to define clusters, which allows distinguishing between clusters with different densities;
Spectral clustering [42] uses the eigenvectors of the Laplacian matrix of the adjacency graph to separate clusters that are related but not globally similar.
After clustering waste generation sites, the resulting clusters can be used to predict future waste volumes.
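To make the K-Means objective described above concrete, the following is a minimal sketch of Lloyd's algorithm in plain Python. It is an illustration under stated assumptions rather than the implementation used in this study: the two-dimensional points, the initial centers, and the helper names `kmeans` and `inertia` are all hypothetical.

```python
import math

def kmeans(points, centers, iterations=20):
    """Lloyd's algorithm: alternate an assignment step and a center update."""
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(old)  # an empty cluster keeps its old center
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return labels, centers

def inertia(points, labels, centers):
    """Sum of squared distances from each point to its assigned center."""
    return sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))

# Hypothetical features for six waste sources (not from the Singapore dataset).
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.3, 7.9), (7.9, 8.2)]
labels, centers = kmeans(points, centers=[points[0], points[3]])
print(labels, round(inertia(points, labels, centers), 3))
```

The quantity returned by `inertia` is the value K-Means tries to minimize; the other algorithms listed above replace this objective with linkage, density, or spectral criteria.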
- Step 3.
Forecasting modeling (Figure 1, Block 5) [43,44,45]: Different forecasting models, such as ARIMA, DNN, and XGBoost, are used for each cluster to predict the amount of waste in the future.
For each waste cluster, waste-volume forecasts are modeled for a certain period in the future.
For this purpose, various machine learning methods can be used, such as the autoregressive integrated moving average (ARIMA), deep neural networks (DNN), gradient boosting over decision trees (XGBoost), and others.
A separate prediction model is used for each waste cluster since each cluster may have its own unique characteristics such as the time of waste generation and its composition and quantity.
For example, waste generated in some area of the city at night may have a different forecast than waste generated in the same area during the day.
Thus, ARIMA, DNN, and XGBoost are forecasting models that can be used to create waste-volume forecasts for different clusters.
Models’ description:
3.1. ARIMA [46] (p, d, q)
An autoregressive, integrated, moving average model that uses autoregressive components (p), integration order (d), and moving average components (q) to predict time series values.
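The ARIMA model has the following form:
$$y'_t = c + \sum_{i=1}^{p} \phi_i y'_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} + \varepsilon_t$$
where $y'_t$ is the time series after differencing of order $d$, $c$ is a constant, $\phi_i$ are the autoregressive coefficients, $\theta_j$ are the moving average coefficients, and $\varepsilon_t$ is white noise (model error).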
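As a rough illustration of how the differencing and autoregressive parts of ARIMA interact, the sketch below implements the special case ARIMA(1, 1, 0): one differencing pass, one autoregressive term, and no moving average term. The coefficients are fitted by ordinary least squares, which is a simplification of what statistical packages do, and the series and function names are hypothetical.

```python
def difference(series):
    """First-order differencing (d = 1): y'_t = y_t - y_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]

def fit_ar1(diffs):
    """Least-squares fit of y'_t = c + phi * y'_{t-1}."""
    x, y = diffs[:-1], diffs[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x)
    if var_x == 0:
        return mean_y, 0.0  # constant differences: pure drift, no AR term
    phi = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / var_x
    return mean_y - phi * mean_x, phi

def forecast(series, steps):
    """Predict future differences, then integrate them back to levels."""
    diffs = difference(series)
    c, phi = fit_ar1(diffs)
    level, last_diff, out = series[-1], diffs[-1], []
    for _ in range(steps):
        last_diff = c + phi * last_diff
        level += last_diff
        out.append(level)
    return out

# Hypothetical yearly waste volumes (tons) for one cluster.
history = [100.0, 110.0, 120.0, 130.0, 140.0]
print(forecast(history, steps=2))
```

For a series with a steady yearly increase, as in this toy input, the differenced series is constant and the forecast simply continues the trend.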
3.2. DNN (Deep Neural Networks) [47]
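Deep neural networks use a structure with multiple layers of neurons to approximate nonlinear functions. The networks are trained using gradient descent algorithms to minimize the loss function. The output of layer $k$ has the following form:
$$h_k = f(W_k h_{k-1} + b_k), \quad h_0 = X$$
where $f$ is the activation function, $W_k$ is the weighting matrix of layer $k$, $X$ is the input data, $b_k$ is the shift vector of layer $k$, $h_k$ is the output of layer $k$, and $k$ is the layer number.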
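The layer-by-layer computation described above can be sketched as a forward pass in plain Python. The layer sizes 32, 16, and 1 follow the network reported later in the Results; the input size, the random initialization, and all function names are assumptions made for illustration, and training by gradient descent is not shown.

```python
import random

def relu(v):
    """ReLU activation applied element-wise."""
    return [max(0.0, a) for a in v]

def dense(x, weights, bias):
    """Affine map of one layer: W x + b, with W stored as a list of rows."""
    return [sum(w * a for w, a in zip(row, x)) + b for row, b in zip(weights, bias)]

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(x, layers):
    """Apply ReLU after every hidden layer; the output layer stays linear."""
    for i, (weights, bias) in enumerate(layers):
        x = dense(x, weights, bias)
        if i < len(layers) - 1:
            x = relu(x)
    return x

rng = random.Random(0)
n_features = 3  # hypothetical number of input features
sizes = [n_features, 32, 16, 1]
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
prediction = forward([0.2, 0.5, 0.1], layers)
print(prediction)
```

Keeping the final layer linear is a common choice for regression targets such as waste volume, since a ReLU output could not represent negative normalized values.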
3.3. XGBoost [48]
Gradient boosting over decision trees improves prediction results by combining multiple weak models (decision trees) into a single strong model using gradient descent.
The XGBoost model has the following form:
$$f_m(X) = f_{m-1}(X) + T_m(X)$$
where $f_m(X)$ is the ensemble of decision trees at the $m$th step, and $T_m(X)$ is the additional decision tree to be optimized.
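The additive structure of boosting can be illustrated with a small sketch in which every new tree is a one-split stump fitted to the current residuals. This is plain gradient boosting with squared-error loss, not XGBoost itself, which adds regularization and second-order information; the data, the learning rate, and the function names are hypothetical.

```python
def fit_stump(xs, residuals):
    """Best single-split regression tree (stump) on a one-dimensional feature."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def boost(xs, ys, rounds=50, learning_rate=0.3):
    """Build f_m = f_{m-1} + learning_rate * T_m, with T_m fitted to residuals."""
    base = sum(ys) / len(ys)
    trees = []

    def predict(x):
        return base + learning_rate * sum(tree(x) for tree in trees)

    for _ in range(rounds):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        trees.append(fit_stump(xs, residuals))
    return predict

def mean_squared_error(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

# Hypothetical data: year index against recycled volume.
xs = [1, 2, 3, 4, 5, 6]
ys = [10.0, 12.0, 15.0, 30.0, 33.0, 35.0]
model = boost(xs, ys)
baseline = mean_squared_error(ys, [sum(ys) / len(ys)] * len(ys))
boosted = mean_squared_error(ys, [model(x) for x in xs])
print(round(baseline, 2), round(boosted, 4))
```

Each round shrinks the training residuals a little, which is why the boosted error ends up far below the error of the constant baseline.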
The resulting forecasts of waste volume can be used to plan the optimal waste collection, transportation, and disposal system for the city. Also, these forecasts can help solve problems with excessive waste accumulation in some places, which can lead to environmental pollution and other problems.
- Step 4.
Evaluation of results (
Figure 1, Block 6): The forecast results for each cluster are evaluated using metrics such as mean absolute error (MAE) and mean squared error (MSE) to determine the accuracy of the model [
49].
Evaluation of the results allows us to determine how accurately the model predicts waste volumes and draw conclusions about its effectiveness and applicability for city waste management.
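A minimal sketch of the two metrics named in this step is given below; the actual and predicted values are hypothetical and serve only to show the arithmetic.

```python
def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean squared error: average of (actual - predicted)^2."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical waste volumes (tons) for three test years.
actual = [100.0, 120.0, 90.0]
predicted = [110.0, 115.0, 95.0]
print(mae(actual, predicted), mse(actual, predicted))
```

Because MSE squares each error, a single large miss influences it far more than it influences MAE, so the two metrics can rank models differently.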
- Step 5.
Using the best model (Figure 1, Block 7), we can forecast each “energy saved” value for future years separately for each of the best clusters. We evaluate the accuracy of the forecasts (Figure 1, Blocks 8–9) using MSE and compare them with the results obtained during data clustering.
This method can be used to forecast the amount of waste in the city, which will ensure effective waste management and improve the environment. The forecasting results can be used to develop waste collection and recycling strategies, optimize waste collection routes, plan production, and allocate resources for waste processing.
4. Results
In the context of this study, which focuses on waste management issues in Ukraine, our team faced limitations in collecting primary data due to the war in the country. This situation forced us to focus on open-source data analysis; therefore, we used information from Singapore that was available and met our research needs.
The approach was implemented using data from the Kaggle portal [31]. This dataset contains annual data on the amount of waste generated and recycled in Singapore from 2003 to 2020. The dataset is divided into two parts: data from 2003 to 2017 and data from 2018 to 2020. Each record includes information on the type of waste (e.g., paper, glass, metal, and plastic), the amount of waste generated, and the amount of waste recycled. This dataset was created to study the effectiveness of Singapore’s waste management program and calculate the energy savings from recycling. The data were provided by the National Environment Agency (NEA) in Singapore and Greentumble.
If we take into account the data on waste management in Singapore for the period from 2003 to 2020, the parameter “Waste Generated” was the highest in 2013 at 7.85 million tons and the lowest in 2003, when 4.75 million tons of waste were generated. The parameter “Waste Recycled” was also the highest in 2013 at 4.82 million tons and the lowest in 2003, when 2.22 million tons of waste were recycled. Both parameters decreased in the following years.
The dataset contains information on different types of waste and the number of times they appear. The most common types of waste are glass, paper/cardboard, textiles/leather, plastic, post-harvest waste, construction waste, food, wood, recycled metal, etc. There are also less-common types of waste, such as ceramic, rubber, ash, and other waste.
Table 2 shows the current values of the recycling rate by waste type. As can be seen from the results from
Table 2, non-ferrous metal (94.3%) and ferrous metal (90.1%) have the highest recycling rates, which indicates that these materials are effectively recycled. The lowest percentage of recycling is in the case of plastics (8.7%), which may be due to the recycling complexity of this material and its widespread use in various forms. The values for glass and paper are not as high as for non-ferrous metal and ferrous metal, but they are still relatively high at 16.7% and 49.8%, respectively.
The diagram (Figure 2) is a square scatter plot that displays the amount of energy saved in kWh per metric ton for each waste type [31]. Some materials save more energy per metric ton than others, so the total amount of saved energy is scattered across the entire area of the diagram. The chart shows that the total energy saved from paper and plastic recycling has decreased significantly over the past few years due to government initiatives to control waste generation. The highest value of the “total energy saved” for paper was recorded in 2011 and amounted to 3.13 billion kWh, with 765,000 tons of waste recycled, and the lowest value was recorded in 2020 and amounted to 1.77 billion kWh, with 432,000 tons of waste recycled. The highest value of the “total waste recycled” for ferrous metal was achieved in 2016 and amounted to 1,351,500 tons, with a total saved energy of 867.6 million kWh, and the lowest value was recorded in 2007 and amounted to 668,000 tons, with a total saved energy of 428.8 million kWh. Data before 2007 are not available. The lowest value of total waste recycled was recorded for glass.
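A quick consistency check on the figures quoted above is to divide the total energy saved by the tons recycled, which should give roughly the same per-ton factor for both years of a material. The sketch assumes that the energy totals are expressed in kWh, the unit used for the per-ton values in Figure 2; the dictionary layout and function name are illustrative.

```python
# Figures quoted in the text: (material, year) -> (tons recycled, total energy saved).
# Assumption: the energy totals are in kWh.
records = {
    ("paper", 2011): (765_000, 3.13e9),
    ("paper", 2020): (432_000, 1.77e9),
    ("ferrous metal", 2016): (1_351_500, 867.6e6),
    ("ferrous metal", 2007): (668_000, 428.8e6),
}

def implied_factor(tons, energy):
    """Energy saved per metric ton implied by a (tons, total energy) pair."""
    return energy / tons

factors = {key: implied_factor(*value) for key, value in records.items()}
for (material, year), factor in factors.items():
    print(f"{material} {year}: {factor:.0f} kWh per ton")
```

Under this assumption the two paper entries agree to within about 1%, as do the two ferrous metal entries, which suggests that the totals were obtained by multiplying the recycled tonnage by a fixed per-material factor.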
Next, we analyzed how much electricity was saved per year from waste recycling in 2016–2020. The data are measured in gigawatt-hours (GWh). Energy savings from waste recycling grew until 2018 but then declined in 2019 and 2020. The largest amount of energy was saved in 2018, namely 5828.76 GWh, and the smallest amount was saved in 2020, namely 3598.42 GWh. These data can be useful for identifying trends in secondary resources use and assessing the effectiveness of waste recycling programs.
The Pearson correlation value is 0.94, which indicates a strong positive linear relationship between the amount of waste generated and waste recycled. This means that as the amount of waste generated increases, the amount of waste recycled also increases.
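For reference, the Pearson coefficient can be computed as in the sketch below. The two short series are hypothetical and are not the Singapore data, so the value printed here is not the 0.94 reported above.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical yearly totals in million tons.
generated = [4.0, 5.0, 6.0, 7.0]
recycled = [2.0, 2.5, 3.5, 4.0]
r = pearson(generated, recycled)
print(round(r, 3))
```

A coefficient close to 1 indicates only that the two series move together linearly; it does not by itself show that generating more waste causes more recycling.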
To identify similar characteristics of different material groups in terms of waste and energy saving, the following clustering methods were used: KM, AC, DBSCAN, birch, OPTICS, and spectral clustering.
Using the KElbowVisualizer tool [38] (Figure 3), we found the optimal number of clusters for clustering data in the KM model. In this particular case, the number of clusters that best fits the data is four. Therefore, four clusters were used for all types of clustering to unify the results.
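The idea behind the elbow method can be sketched with a simple heuristic: given the K-Means inertia for each candidate number of clusters, pick the point where the curve bends most sharply. This is a simplified stand-in and not the knee-detection procedure of the KElbowVisualizer tool; the inertia values and the function name are hypothetical.

```python
def elbow_k(inertias, first_k=1):
    """Return the k with the largest second difference of the inertia curve."""
    best_k, best_bend = None, float("-inf")
    for i in range(1, len(inertias) - 1):
        # Drop in inertia before k minus the drop after k.
        bend = (inertias[i - 1] - inertias[i]) - (inertias[i] - inertias[i + 1])
        if bend > best_bend:
            best_k, best_bend = first_k + i, bend
    return best_k

# Hypothetical inertia values for k = 1..6.
inertias = [1000.0, 700.0, 420.0, 150.0, 130.0, 115.0]
print(elbow_k(inertias))
```

In this toy curve the inertia falls steeply up to four clusters and flattens afterwards, so the heuristic selects four.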
A detailed analysis of the clustering of waste by type and degree of recycling is given in Figure 4. The values show that the KM, AC, and OPTICS methods have a similar distribution of clusters, while the other methods (birch, DBSCAN, and spectral clustering) differ. This result may indicate that the KM, AC, and OPTICS methods can be effective for clustering waste by type and degree of recycling, as they give similar results. On the other hand, birch, DBSCAN, and spectral clustering may be less effective because they differ from the other methods. However, the solution to a particular waste-clustering problem depends on many factors, including the size and composition of the data, the clustering purpose, and the selected parameters of the clustering methods. Therefore, more research and analysis are needed to determine which method is best for waste clustering.
For effective waste management, it is necessary to understand which groups (clusters) of waste can be considered together. Therefore, the waste-quantity values for each cluster were predicted using machine learning methods such as ARIMA, DNN, and XGBoost. Evaluation of different models for all types of clustering can help in selecting the best model for predicting waste quantities and in better understanding which waste groups can be combined for more efficient waste management.
Therefore, we can predict the amount of recycled waste by year and cluster class using the ARIMA approach. To achieve this, the data were divided into training and test sets by year (train ≤ 2015, test > 2016), then we calculated unique cluster class labels, trained the ARIMA model on the training set for each cluster class label, and then performed forecasting for the testing set and compared the predicted and actual values of the amount of waste (
Figure 5).
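The data preparation described above can be sketched as follows: rows are split by year and grouped by cluster label, so that one model can be fitted per cluster. The row layout, the sample values, and the function name are hypothetical; the year thresholds follow the text.

```python
from collections import defaultdict

def split_by_year(rows, train_until=2015, test_after=2016):
    """Group (year, cluster, value) rows per cluster into train and test series."""
    train, test = defaultdict(list), defaultdict(list)
    for year, cluster, value in sorted(rows):
        if year <= train_until:
            train[cluster].append((year, value))
        elif year > test_after:
            test[cluster].append((year, value))
        # With the thresholds as stated, rows from 2016 fall in neither set.
    return dict(train), dict(test)

# Hypothetical rows: (year, cluster label, recycled tons).
rows = [
    (2014, 0, 100.0), (2015, 0, 110.0), (2016, 0, 115.0), (2017, 0, 120.0),
    (2015, 1, 40.0), (2018, 1, 55.0),
]
train, test = split_by_year(rows)
for cluster in sorted(train):
    print(cluster, train[cluster], test.get(cluster, []))
```

Each per-cluster training series would then be passed to a forecasting model, and the predictions compared with the corresponding test series.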
We also calculated the MAE and MSE metrics for each cluster class label (Table 3). Each column of the table corresponds to a separate clustering method (KM, AC, DBSCAN, birch, OPTICS, and spectral), and each row corresponds to a separate cluster. For each cluster, the MAE and MSE values for each clustering method were determined. MAE is the mean of the absolute differences between the predicted and actual values, while MSE is the mean of the squared differences between the predicted and actual values.
Thus, these metrics help to determine how accurate the predictions are for each clustering method and each individual cluster. One of the most interesting results concerns cluster 3, for which the values of MAE and MSE obtained using the AC algorithm are significantly different from the corresponding values obtained using the other algorithms. This may indicate that the AC algorithm is not optimal for this particular cluster or that the data included in cluster 3 are not sufficiently homogeneous. It can also be noted that for clusters 0 and 2, the highest metric values were obtained using the DBSCAN algorithm. This may indicate that these clusters contain areas of uncertain density, which is a good reflection of the capabilities of the DBSCAN algorithm.
Based on the results (
Figure 5), the following conclusions can be drawn: The cluster_KM and cluster_AC models had stable predictions in clusters 1 and 3 and 0 and 3, respectively, which indicates their effectiveness in these clusters. In cluster 3, the cluster_Birch model had unstable predictions, showing a reverse trend, which may indicate that this model is ineffective in this cluster. The cluster_Spectral model also had unstable forecasts with a reverse trend in cluster 0, which may indicate that this model is ineffective in this cluster. The cluster_OPTICS model also had unstable forecasts with a reverse trend in cluster 2, which may indicate that this model is ineffective in this cluster. The cluster_DBSCAN model did not produce normal results at all, which may indicate that it is not able to work well with this dataset.
Next, we built a neural network to predict (
Figure 6) waste production in each cluster using normalized data from the training and test datasets. The model has three layers: two with 32 and 16 neurons with ReLU activation [
50] and one with 1 neuron. Quality metrics were calculated (
Table 4), such as MAE and MSE, and graphs of predicted and actual values for each cluster are displayed.
Based on the results, we can conclude that the DNN method is not the most effective for this task of predicting the amount of recycled waste. The values of MAE and MSE (
Table 4) for each type of clustering and each cluster are quite large, which indicates a high level of forecasting error. Also, the diagram (
Figure 6) shows that the data were not predicted. This may be due to the fact that there were not enough data to train the model, or the DNN method is not optimal for this task.
Next, we built the XGBoost model for each cluster of each clustering type, made predictions on the test dataset, and calculated quality metrics (MAE and MSE) (Table 5). The graphs show a comparison of predictions and actual waste values by year for each cluster.
Table 5 shows that the MAE and MSE values differed significantly for different clustering methods and for different clusters within each method. The model quality metrics vary significantly between clusters and clustering types. For example, for cluster 3 of the KM clustering type, there are very large metric values:
- MAE: 708,051;
- MSE: $8.83075 \times 10^{11}$;
- RMSE: 939,720.9474.
This may indicate poor model quality for this cluster. At the same time, for cluster 0 of the OPTICS clustering type, there are very low metric values:
- MAE: 17,512.4694;
- MSE: 1,160,221,158.
This may indicate the good fit of the model for this cluster.
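Because MSE is expressed in squared units, taking its square root (RMSE) makes it directly comparable with MAE. The short check below applies this to the two MSE values quoted above; the first value is read here as 8.83075 × 10^11, since its square root matches the quoted RMSE. The variable names are illustrative.

```python
import math

# MSE values quoted in the text for two XGBoost cluster forecasts.
mse_km_cluster3 = 8.83075e11
mse_optics_cluster0 = 1_160_221_158

rmse_km_cluster3 = math.sqrt(mse_km_cluster3)
rmse_optics_cluster0 = math.sqrt(mse_optics_cluster0)
print(round(rmse_km_cluster3), round(rmse_optics_cluster0))
```

For OPTICS cluster 0 the resulting RMSE is about 34,062, roughly twice the quoted MAE of 17,512, which indicates that a few larger errors dominate the squared metric even in a well-fitted cluster.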
In general, it can be concluded that the forecasting was accurate for most clusters, as the graphs (
Figure 7) of the predicted values quite accurately repeat the graphs of the actual values of the recycled waste amount by year. However, some deviations were visible for KM cluster 3 and OPTICS cluster 3, which may indicate a lack of forecasting accuracy for these clusters. In general, the chart demonstrates the effectiveness of the XGBoost approach in predicting the amount of recycled waste by year for different cluster types.
The research shows that the accuracy of the ARIMA, DNN, and XGBoost models in predicting the amount of recycled waste depends on the specific cluster and clustering method. In particular, the K-Means method shows the best results in terms of MAE and MSE metrics for all three models on clusters 0 and 1. The diagram of the predicted values of the amount of recycled waste by year shows the effectiveness of the XGBoost approach for different types of clusters, although there were some deviations for some clusters of the KM, AC, and OPTICS clusterings. DBSCAN, birch, and spectral clustering showed good prediction accuracy. Therefore, for city waste management, it is recommended to use DBSCAN, birch, and spectral clustering for clustering and XGBoost for data prediction. Accordingly, energy savings for the next three years can be forecast by the XGBoost method for the DBSCAN, birch, and spectral clustering types (Figure 8).
From the MSE results for predicting energy-saved values for each of the three types of clusters (DBSCAN, birch, and spectral clustering), we can conclude that the prediction for the “crude oil saved” column has a significantly lower error than the “energy saved” column. In addition, clustering with the birch algorithm allows for more accurate forecasts for both columns compared to the other algorithms. Overall, the results show that the XGBoost method is effective in predicting energy-savings values for the next three years, and clustering can help improve the quality of such predictions.
The obtained results can be used for effective waste management in cities. In particular, the predicted values of the amount of waste recycled can be used to plan resource requirements such as the employees and vehicles needed for waste processing.
The results of clustering can also be used to allocate resources to different areas of the city. For example, areas that require more resources for waste processing can be prioritized for resource allocation. In general, the results can be used to optimize waste recycling processes, which will help improve the environmental situation in the city and reduce waste management costs.
Table 6 compares the results of the current research with the closest analogs in terms of the forecast accuracy and approaches used.
In addition, we applied an integrated approach to waste clustering based on DBSCAN, birch, and spectral clustering methods to improve the quality of forecasts and the efficiency of waste management. The proposed approach demonstrates the use of a variety of forecasting and clustering methods to achieve high accuracy in urban waste management and thus makes an innovative contribution to this field of research.
Therefore, it is important to emphasize that the main difference between the proposed research and previous works, such as those of El Ouadi [26], Asri, H. [39], and Tokuda, E.K. [40], is the use of the XGBoost model for waste-volume forecasting. This model provides high forecast accuracy, reaching up to 98%. In addition, we applied various clustering methods, such as DBSCAN, birch, and spectral clustering, to improve the quality of the forecasts and waste management efficiency. This approach demonstrates the value of using a variety of methods in achieving high accuracy in urban waste management, making an innovative contribution to this research area.
6. Conclusions
An intelligent method of urban waste management was developed, including waste classification, clustering of waste sources, waste-volume modeling, forecasting using various models, and the evaluation of the forecast results.
This research analyzed a dataset from the Kaggle portal, which contains annual data on the amount of waste generated and recycled in Singapore from 2003 to 2020. Using various clustering and forecasting methods, results were obtained that can be used for effective waste management in the city.
The best results in predicting the amount of recycled waste were obtained using the XGBoost model. The XGBoost model performed well in predicting the amount of recycled waste for different clustering methods such as DBSCAN, birch, and spectral clustering:
- Cluster 0: best MAE 36,854.49 (birch);
- Cluster 1: best MAE 7791.988 (agglomerative clustering);
- Cluster 2: best MAE 28,470.26 (birch);
- Cluster 3: best MAE 13,827.25 (spectral clustering).
Based on the MSE results for predicting energy saved and crude oil saved for the three types of clusters (DBSCAN, birch, and spectral clustering), we can conclude that the prediction for the “crude oil saved” column has a smaller error. Clustering using the birch algorithm allows for more accurate forecasts for both columns. The results show that the XGBoost method can be effective in predicting energy-savings values for the next three years, and clustering can help improve the quality of such predictions.
The achieved results will be useful for waste management in Ukraine in the post-war period. The proposed research has several limitations that should be recognized. One of the main limitations is the difficulty in collecting data due to the ongoing war in Ukraine. This situation forced us to use data from open sources, particularly from Singapore, to conduct the study. While this alternative provided valuable insights, it may not fully reflect the unique challenges and dynamics of waste management in different urban contexts, especially those affected by war. Therefore, the conclusions and generalizations drawn in this study should be considered with these specific contextual limitations in mind. The integration of several advanced methods, such as XGBoost and different clustering techniques, while contributing to accuracy, also increases the complexity of the model. This complexity can pose challenges in terms of computational resources and require specialized expertise for effective implementation and maintenance.
In general, the use of intelligent management methods for urban waste is in line with the smart city concept and the Internet of Things, and can significantly improve the quality of residents’ lives, the sustainability of urban evolution, and environmental sustainability in a global context.
Future research could focus on improving forecasting models, investigating the impact of various factors on waste recycling efficiency, and developing new approaches to waste management. In general, the obtained results can be used to optimize waste recycling processes, which will help improve the environmental situation in the city and reduce waste management costs.