Peer-Review Record

Comparison of Vertex AI and Convolutional Neural Networks for Automatic Waste Sorting

Sustainability 2025, 17(4), 1481; https://doi.org/10.3390/su17041481

by Jhonny Darwin Ortiz-Mata *, Xiomara Jael Oleas-Vélez, Norma Alexandra Valencia-Castillo, Mónica del Rocío Villamar-Aveiga and David Elías Dáger-López

Reviewer 1: Wei Wang

Reviewer 2: Anonymous

Reviewer 3: Oasama Labib

Submission received: 12 December 2024 / Revised: 16 January 2025 / Accepted: 17 January 2025 / Published: 11 February 2025

(This article belongs to the Special Issue Sustainable Application of Artificial Intelligence and Machine Learning)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This study primarily compares Vertex AI and Convolutional Neural Networks to address urban solid waste management issues, establishing a practical framework for an automatic urban solid waste classification system. This is a significant research effort; however, there are still some issues that need to be revised.

1. The "1" at the beginning of Table 1 regarding the introduction of the functions of Lighting System and Web camera indicates what?

2. Most of the content in the paper has completed the relevant literature citations, but there are still some parts that lack citations, such as the Adam optimizer and others.

3. In section 3.13, the optimal settings for different hyperparameters are presented, but there is a lack of experimental results to support this. For example, could you provide parameters such as speed and accuracy for clarification?

4. The font size of the equations in the paper seems to be inconsistent. It is recommended to optimize this.

5. What loss function is the loss variation mentioned in Figure 4 based on?

6. Check the numbering order of the figures in the paper.

Author Response

Response to the editors' and reviewers' comments

Revisions requested

Automatic Waste Sorting System Using Arduino, Raspberry Pi, Vertex AI and Convolutional Neural Networks

sustainability-3396450

Dear reviewers and editor,

Thank you for your useful comments and suggestions on our manuscript.

We have modified the manuscript accordingly; detailed corrections and comments are listed below, point by point.

COMMENTS FOR THE AUTHOR:

All changes and comments made by the reviewers to the wording of the manuscript were accepted.

Reviewer #1:

The "1" at the beginning of Table 1 regarding the introduction of the functions of Lighting System and Web camera indicates what?

The number “1” at the beginning of Table 1 was a typing error.

Operating Mode	Electronic component	Function
Vertex	CanaKit Raspberry	It provides the Raspberry Pi as the central processing unit, responsible for controlling and running the image capture and classification system, as well as managing interaction with other system components.
Vertex	Raspberry Pi Camera Module	It captures images in real time that are processed and classified using deep learning models run on the Raspberry Pi.
CNN	Web Camera	It captures additional images from different angles for validation processes and remote monitoring, ensuring the correct operation of the system.
Vertex & CNN	Arduino Uno	It controls sensors and actuators, such as the servo motor of the robotic arm; it manages hardware interactions and sends information to the Raspberry Pi or laptop for processing along with the captured images.
Vertex & CNN	Servo Motor S MG90S	It controls the movements of the robotic arm or other mechanical parts, allowing objects to be manipulated and classified into specific compartments according to the identification made.
Vertex & CNN	Robotic Arm	Controlled by the Arduino Uno, it performs physical actions according to the classification results, such as picking up and depositing objects in the corresponding compartments.
Vertex & CNN	Lighting System	It provides additional illumination during image capture, improving image quality, consisting of a small light bulb and a mini 2-pin switch.

Most of the content in the paper has completed the relevant literature citations, but there are still some parts that lack citations, such as the Adam optimizer and others.

Lines 289 to 294

Adam, RMSprop, and Adamax optimizers excel in deep learning for their weight adjustment efficiency [28]. Adam combines gradient descent with estimates of the first and second moments of the gradients, dynamically adjusting learning rates [29]. RMSprop adapts the learning rate based on a moving average of recent squared gradients, which is effective for noisy data and recurrent networks. Adamax, a variant of Adam based on the infinity norm, is suited to data with large values, ensuring stable convergence.
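To make the differences between the three update rules concrete, the following is a minimal pure-Python sketch of one scalar update step per optimizer, written from the standard textbook formulations. The function names, the default hyperparameters (`beta1`, `beta2`, `rho`, `eps`), and the toy objective f(w) = w² are illustrative assumptions, not the configuration used in the manuscript.

```python
import math

def adam_step(w, g, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update using bias-corrected first and second moment estimates."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * g
    state["v"] = beta2 * state["v"] + (1 - beta2) * g * g
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)

def rmsprop_step(w, g, state, lr=0.001, rho=0.9, eps=1e-8):
    """One RMSprop update: scale the step by a moving average of squared gradients."""
    state["v"] = rho * state["v"] + (1 - rho) * g * g
    return w - lr * g / (math.sqrt(state["v"]) + eps)

def adamax_step(w, g, state, lr=0.002, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adamax update: the second moment is replaced by an infinity-norm estimate."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * g
    state["u"] = max(beta2 * state["u"], abs(g))
    step_size = lr / (1 - beta1 ** state["t"])
    return w - step_size * state["m"] / (state["u"] + eps)

def minimize(step, w0=1.0, steps=50, **kwargs):
    """Run `steps` updates on the toy objective f(w) = w**2 (gradient 2*w)."""
    state = {"t": 0, "m": 0.0, "v": 0.0, "u": 0.0}
    w = w0
    for _ in range(steps):
        w = step(w, 2.0 * w, state, **kwargs)
    return w

for name, step in [("Adam", adam_step), ("RMSprop", rmsprop_step), ("Adamax", adamax_step)]:
    print(name, round(minimize(step, lr=0.01), 4))
```

All three rules move the weight toward the minimum at 0; they differ only in how the raw gradient is rescaled before the step is taken.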

In section 3.13, the optimal settings for different hyperparameters are presented, but there is a lack of experimental results to support this. For example, could you provide parameters such as speed and accuracy for clarification?

Several combinations of hyperparameters were evaluated to optimize the model configuration (see Table 3). The categorical_crossentropy loss function and a momentum of 0 were used during the tests. For the learning rate, values of 0.001, 0.005, 0.01, 0.02, and 0.04 were tested. Among these, 0.02 provided the best balance between speed and accuracy, avoiding slow or unstable learning. For batch size, 32 and 64 were evaluated, with 64 proving to be the most suitable, as it effectively balanced training speed and stability when paired with a learning rate of 0.02.

Accuracy and classification speed metrics are now presented per image, providing clearer, more quantitative support for optimal settings.

Optimizer	Learning rate	Epochs	Batch size	Test accuracy	Classification speed
Adam	0.001	10	32	97.44%	0.642 s
Adam	0.005	10	64	98.01%	0.711 s
Adam	0.01	10	64	96.84%	0.847 s
RMSprop	0.02	12	64	98.12%	0.234 s
Adamax	0.04	12	64	97.89%	0.726 s
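As an illustration of how the rows of this table can be compared, the sketch below ranks the reported configurations by test accuracy and breaks ties by classification time. The numbers are copied from the table; the ranking rule and the names `Run` and `best_run` are assumptions made for this example, and the two rows without an optimizer label are assumed to continue the Adam group.

```python
from typing import NamedTuple

class Run(NamedTuple):
    optimizer: str
    learning_rate: float
    batch_size: int
    test_accuracy: float      # percent
    seconds_per_image: float  # classification time per image

# Values transcribed from the hyperparameter table above.
RUNS = [
    Run("Adam", 0.001, 32, 97.44, 0.642),
    Run("Adam", 0.005, 64, 98.01, 0.711),
    Run("Adam", 0.01, 64, 96.84, 0.847),
    Run("RMSprop", 0.02, 64, 98.12, 0.234),
    Run("Adamax", 0.04, 64, 97.89, 0.726),
]

def best_run(runs):
    """Pick the most accurate run; prefer the faster one if accuracies tie."""
    return max(runs, key=lambda r: (r.test_accuracy, -r.seconds_per_image))

best = best_run(RUNS)
print(f"{best.optimizer} lr={best.learning_rate} batch={best.batch_size}")
```

With these values the RMSprop row is both the most accurate and the fastest, which matches the choice of a 0.02 learning rate and a batch size of 64 described in the response.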

The font size of the equations in the paper seems to be inconsistent. It is recommended to optimize this.

The mathematical expressions used are presented in equations 1 to 4:

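$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1)$$

$$\text{Precision} = \frac{TP}{TP + FP} \qquad (2)$$

$$\text{Recall} = \frac{TP}{TP + FN} \qquad (3)$$

$$\text{F1 Score} = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (4)$$

where TP, FP, TN, and FN denote the true positives, false positives, true negatives, and false negatives, respectively.

In addition, the Matthews correlation coefficient (MCC) was calculated from the confusion matrix to evaluate the statistical reliability of the model, as shown in Equation 5 [15]:

$$\text{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \qquad (5)$$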
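The five metrics named in this response (accuracy, precision, recall, F1 score, and MCC) can all be derived from the four confusion-matrix counts. The following is a small sketch using the standard definitions; the function name and the example counts are hypothetical, and for a three-class problem such as this one the counts would be taken per class in a one-vs-rest fashion, which is an assumption about how the authors applied them.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, F1 and MCC from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}

# Hypothetical counts for a single class, chosen only to exercise the formulas.
metrics = classification_metrics(tp=45, fp=5, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The guards against zero denominators return 0.0, a common convention when a class is never predicted or never present.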

What loss function is the loss variation mentioned in Figure 4 based on?

In this case, the loss function used is the categorical cross-entropy, which is suitable for multiclass classification problems.
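For reference, categorical cross-entropy for one-hot labels is the negative log of the probability assigned to the true class, averaged over the batch. The sketch below is a plain-Python illustration of that definition; the function name, the clipping constant `eps`, and the class order (metal, paper, plastic) are assumptions for the example rather than details taken from the manuscript.

```python
import math

def categorical_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean categorical cross-entropy over a batch of one-hot labels."""
    total = 0.0
    for target, probs in zip(y_true, y_pred):
        for t, p in zip(target, probs):
            p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
            total -= t * math.log(p)
    return total / len(y_true)

# One sample whose true class is the second one (assumed order: metal, paper, plastic).
loss = categorical_cross_entropy([[0, 1, 0]], [[0.1, 0.8, 0.1]])
print(round(loss, 4))
```

A confident correct prediction gives a loss near 0, while a confident wrong prediction is penalized heavily, which is why the curve in a loss plot falls as the predicted probabilities concentrate on the correct class.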

Check the numbering order of the figures in the paper.

Figure numbering has been corrected.

Thank you very much for the comments to improve the article.

Sincerely,

The authors.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The paper concerns finding an optimal solution for waste sorting by comparing different CNN classification architectures (developed in TensorFlow) with a pre-trained Vertex AI model provided by Google Cloud. The system was designed for sorting solid paper, plastic and metal waste and it’s composed of an Arduino Uno board, a Raspberry Pi, a camera and a robotic arm. The comparison is made in terms of software scalability, performance (accuracy, response time), as well as considering the cost of the two approaches.

The scientific level of the paper is good, the results are convincing. References are appropriate and well chosen, and related works are discussed and cited appropriately. However, it may be accepted after major corrections. Here are my remarks:

Please consider revising the paper for spelling and grammatical accuracy in the English language.

The title of the paper is appropriately chosen, but the abstract may need to be revised for better clarity and understanding. The English text must be revised. For example, in line 13, I suggest removing the word "or".

The abbreviations and their explanation must be entered as they appear in the text

In Chapter 3.2, Figure 1, the arrow connecting Vertex AI to OpenCV and TensorFlow does not align with the explanations provided in the text about the two operating modes of the system

In Chapter 3.3, for better clarity, Table 1 would be more effective if it included rows delineating the different operating modes of the system.

In the article's text, the presentation should avoid a tutorial-style approach. For example: "To train a model, you enable the Vertex AI API in Google Cloud Console…" (Chapter 3.11) or "The metrics used allow you to evaluate various aspects of the performance of the models…" (Chapter 4.1).

In Chapter 3.13, Table 2 is difficult to follow. A redesign is recommended.

In Chapter 4.3, row 474, referencing Figure 8, the drawing is captioned as Figure 5. The text refers to WFT, while the figure depicts DA.

Starting with Figure 9, the numbering of the figures in the text becomes inconsistent.

Regarding Figure 7, in Chapter 4.3, the text does not address the two cases: Figure 7a and Figure 7b.

The description of Vertex AI in section 4.5 should be expanded in comparison with studies on CNN algorithms presented in the paper.

Comments on the Quality of English Language

Please consider revising the paper for spelling and grammatical accuracy in the English language.

Author Response

Reviewer #2:

Please consider revising the paper for spelling and grammatical accuracy in the English language.

The grammar has been revised.

The title of the paper is appropriately chosen, but the abstract may need to be revised for better clarity and understanding. The English text must be revised. For example, in line 13, I suggest removing the word "or".

This study emphasizes the significance of optimizing municipal solid waste management through the implementation of automated waste sorting systems, comparing two advanced artificial intelligence methodologies: Vertex AI and convolutional neural network (CNN) architectures developed using TensorFlow. Automated solid waste classification is presented as an innovative technological approach that leverages advanced algorithms to accurately identify and segregate materials, addressing the inherent limitations of conventional sorting methods, such as high labor dependency, inaccuracies in material separation, and constrained scalability for processing large waste volumes. The system, designed for the classification of paper, plastic, and metal waste, integrates an Arduino Uno microcontroller, a Raspberry Pi, a high-resolution camera, and a robotic manipulator, and it is evaluated based on performance metrics including classification accuracy, response time, scalability, and implementation cost. The findings reveal that Xception achieved a flawless classification accuracy of 100% with an average processing time of 0.25 seconds, whereas Vertex AI, with an accuracy of 90% and a response time of 2 seconds, excels in cloud scalability, making it ideal for resource-constrained environments. These results underscore the superiority of Xception in high-precision applications and the adaptability of Vertex AI in scenarios requiring flexible deployment, advancing efficient and sustainable waste management solutions.

The abbreviations and their explanation must be entered as they appear in the text

The correction has been made.

In Chapter 3.2, Figure 1, the arrow connecting Vertex AI to OpenCV and TensorFlow does not align with the explanations provided in the text about the two operating modes of the system

In Chapter 3.3, for better clarity, Table 1 would be more effective if it included rows delineating the different operating modes of the system.

Operating Mode	Electronic component	Function
Vertex	CanaKit Raspberry	It provides the Raspberry Pi as the central processing unit, responsible for controlling and running the image capture and classification system, as well as managing interaction with other system components.
Vertex	Raspberry Pi Camera Module	It captures images in real time that are processed and classified using deep learning models run on the Raspberry Pi.
CNN	Web Camera	It captures additional images from different angles for validation processes and remote monitoring, ensuring the correct operation of the system.
Vertex & CNN	Arduino Uno	It controls sensors and actuators, such as the servo motor of the robotic arm; it manages hardware interactions and sends information to the Raspberry Pi or laptop for processing along with the captured images.
Vertex & CNN	Servo Motor S MG90S	It controls the movements of the robotic arm or other mechanical parts, allowing objects to be manipulated and classified into specific compartments according to the identification made.
Vertex & CNN	Robotic Arm	Controlled by the Arduino Uno, it performs physical actions according to the classification results, such as picking up and depositing objects in the corresponding compartments.
Vertex & CNN	Lighting System	It provides additional illumination during image capture, improving image quality, consisting of a small light bulb and a mini 2-pin switch.

In the article's text, the presentation should avoid a tutorial-style approach. For example: "To train a model, you enable the Vertex AI API in Google Cloud Console…" (Chapter 3.11) or "The metrics used allow you to evaluate various aspects of the performance of the models…" (Chapter 4.1).

(Chapter 3.11)

Vertex AI is a Google Cloud platform that facilitates the development and deployment of machine learning models. In this project, it was used to train a model intended for classifying images into three categories: paper, plastic, and metal. The platform automatically tunes the model’s architecture and hyperparameters, in contrast to tools like TensorFlow, which provide more fine-grained control over internal configuration [31].

Vertex AI's capabilities include image classification and object detection, making it adaptable to a variety of machine learning tasks. The trained model is integrated into external applications via an endpoint API, making it easy to deploy. However, as a paid service, there are associated costs, such as $3.46 per hour of training and $18 for model creation.

The data was organized into training, validation, and test sets, with images labeled automatically and manually. Once training was complete, the model was linked to a functional endpoint, with an approximate deployment time of 10 minutes using a processing node. Resource management in Vertex AI was carried out strategically, activating the API only when necessary, allowing efficient cost control. In addition, permissions managed through IAM and an API key in JSON format were used for model integration. It is worth mentioning that the platform requires an internet connection for its operation.
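The pricing quoted above lends itself to a quick back-of-the-envelope estimate. The sketch below assumes the two figures simply add up (a fixed $18 for model creation plus $3.46 for each training hour); the function name and the two-hour example are hypothetical, and endpoint deployment or prediction charges are not modeled because the text does not quantify them.

```python
TRAINING_RATE_USD_PER_HOUR = 3.46  # figure quoted in the text
MODEL_CREATION_USD = 18.00         # figure quoted in the text

def estimated_training_cost(training_hours):
    """Fixed model-creation fee plus hourly training charge (assumed additive)."""
    return MODEL_CREATION_USD + TRAINING_RATE_USD_PER_HOUR * training_hours

# Hypothetical two-hour training run.
print(f"${estimated_training_cost(2):.2f}")
```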

(Chapter 4.1)

The metrics used in this study evaluate various aspects of model performance. Test Accuracy measures the overall percentage of correct predictions. Precision indicates the proportion of true positive predictions among all positive predictions, while Recall reflects the model's ability to identify all actual positive cases. The F1 Score provides a balance between Precision and Recall, making it particularly useful for unbalanced datasets. Finally, the Matthews Correlation Coefficient (MCC) considers all elements of the confusion matrix, ranging from -1 (complete disagreement) to 1 (perfect prediction), with 0 indicating a random result [9], [22]. The mathematical expressions for these metrics are presented in Equations 1 to 4:

In Chapter 3.13, Table 2 is difficult to follow. A redesign is recommended.

Table 2 was redesigned for a better understanding of the data:

Optimizer	Learning rate	Epochs	Batch size	Loss function	Momentum	Test accuracy	Classification speed
Adam	0.001	10	32	categorical_crossentropy	0	97.44%	0.642 s
Adam	0.005	10	64	categorical_crossentropy	0	98.01%	0.711 s
Adam	0.01	10	64	categorical_crossentropy	0	96.84%	0.847 s
RMSprop	0.02	12	64	categorical_crossentropy	0	98.12%	0.234 s
Adamax	0.04	12	64	categorical_crossentropy	0	97.89%	0.726 s

In Chapter 4.3, row 474, referencing Figure 8, the drawing is captioned as Figure 5. The text refers to WFT, while the figure depicts DA.

The accuracy comparison between image classification models highlights the excellent performance of Xception, which achieved a perfect accuracy of 1.0 (see Figure 5), that is, correct predictions on all samples in the test set. InceptionV3, in the fine-tuning (FT), no-data-augmentation, and data augmentation (DA) configurations, also showed excellent performance, with accuracies above 0.98, suggesting that both techniques are effective in improving model generalization, although fine-tuning offers a slight advantage in this case.

Models such as ResNet50V2 and DenseNet121 also achieved high levels of accuracy in both configurations (FT and no data augmentation), although fine-tuning provided a small additional improvement. In contrast, MobileNetV2 showed lower accuracy, peaking at 0.9497. This result suggests that although MobileNetV2 is suitable for applications with computational processing constraints, it has limitations compared to more robust models such as Xception or InceptionV3 in terms of final accuracy. Figure 5 shows that Xception, InceptionV3 FT and ResNet50V2 are the architectures that achieve the highest levels of accuracy, standing out as the most effective options for image classification tasks in this context. The accuracy values also show that none of the CNN models proposed for the automatic waste classification system improves when data augmentation is applied, so data augmentation is considered unnecessary for this type of system.

Starting with Figure 9, the numbering of the figures in the text becomes inconsistent.

Figure numbering has been corrected.

Regarding Figure 7, in Chapter 4.3, the text does not address the two cases: Figure 7a and Figure 7b.

The fine-tuning technique showed a positive effect on models such as InceptionV3, DenseNet121 and ResNet50V2 (see Figure 8a). InceptionV3, for example, achieved an accuracy of 0.993 with fine-tuning, compared to 0.990 without fine-tuning (see Figure 8b).

The description of Vertex AI in section 4.5 should be expanded in comparison with studies on CNN algorithms presented in the paper.

The description of Vertex AI was expanded as follows:

The results obtained with Vertex AI show consistent accuracy and relatively stable processing times for each class. On average, the model achieved an accuracy of 94.31% in the classification of “Metal”, 82.36% in “Paper”, and 94.72% in “Plastic”, with response times ranging from 2.25 to 2.46 seconds, as detailed in Table 7. These values reflect the effectiveness of the Vertex AI model in the automatic identification of materials, demonstrating its potential to improve efficiency in solid waste management.

While the accuracy of Vertex AI is slightly lower than other models, its great advantage lies in the ease of deployment in the cloud and automatic scalability. This model adapts well to environments with infrastructure limitations, allowing automated sorting systems to be expanded and adapted with greater flexibility. It does not require a robust local infrastructure, as it runs in the cloud, making it an attractive option for areas with limited resources.
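As a quick consistency check, the three per-class accuracies reported above can be combined into a single figure. The sketch below takes an unweighted (macro) average, which assumes each class contributes equally; the manuscript excerpt does not state how many test images each class had, so this weighting is an assumption.

```python
# Per-class accuracies (percent) as reported for the Vertex AI model.
PER_CLASS_ACCURACY = {"Metal": 94.31, "Paper": 82.36, "Plastic": 94.72}

def macro_average(values):
    """Unweighted mean: every class counts equally regardless of its size."""
    values = list(values)
    return sum(values) / len(values)

mean_accuracy = macro_average(PER_CLASS_ACCURACY.values())
print(f"{mean_accuracy:.2f}%")  # prints 90.46%
```

The result of roughly 90% is in line with the overall Vertex AI accuracy quoted in the revised abstract, and it shows that the "Paper" class is what pulls the average down.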

Thank you very much for the comments to improve the article.

Sincerely, the authors.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

Dear the author

My Comments to the Author

Automatic Waste Sorting System Using Arduino, Raspberry Pi, 2 Vertex Ai, And Convolutional Neural Networks

The title:

Line 2-3: I recommend the author clarify the title so that the reader understands what is meant by "Arduino, Raspberry Pi, 2 Vertex Ai model." Is it possible to modify the title to be clearer?

Abstract:

Line 10-15, I recommend the author reconsider writing the abstract because he started with the significance of the study in municipal solid waste management by comparing two artificial intelligence methods for automatic waste sorting: Vertex AI and convolutional neural network (CNN) models. Accordingly, he must explain to the reader the automatic use of solid waste sorting and the main problem facing normal sorting. Then he explains the importance of using this model and accordingly explains the most prominent and important results.

Introduction:

Line 55-60, I recommend the author that before delving into the main objectives of the research, it is necessary to present the basic problem facing ordinary sorting and compare it with the new model so that the view is clear to the reader.

Literature Review:

Line 124-129, the author reviewed modern techniques in the automatic sorting of solid waste, but it must be clarified whether these techniques are programmed for all types of solid materials of different shapes, meaning paper of all types, plastic of all types, and metals of all types, or are there wastes that cannot be programmed for inductive and hybrid models?

System Architecture:

Line 150-154, I recommend the author be clearer in this paragraph about what factors he has identified to compare the performance between the pre-trained Vertex AI model and different convolutional neural network (CNN) models.

Prototype Hardware Schema:

Line 184-188 & table 1, Are these the elements that evaluate the performance of automatic sorting of solid waste of all kinds, and has the efficiency of the model been calculated and compared with other models?

Line 223-232, I recommend the author clarify this paragraph because the author mentioned that green innovations play a mediating role in the relationship between Green Supply Chain Management (GSCM) and Environmental Practices (EP). The question is, was he using mediating to explain the relationship or was there an influence from green innovations on the relationship because there is a difference between using mediating or moderating?

A mediator variable explains the process through which two variables are related, while a moderator variable affects the strength and direction of that relationship. So, I suggest the author to be clear in this paragraph.

Line 257-266, I recommend the author explain and clarify the role of green knowledge sharing which played an important role by business leaders in influencing knowledge sharing, which greatly supports the relationship between Green Supply Chain Management (GSCM) and Environmental Practices (EP).

Methodology:

Line 219, I recommend the author separate the methodology from the Smart Waste Management System.

In lines 219-224, the author stated that the training, validation, and evaluation of the model follows a systematic approach based on the strategic division of the data set into three subsets: 70% for training, 15% for validation, and 15% for testing. Is this model sufficient for the final evaluation of solid waste sorting, and what are the basic factors that affect this validation and evaluation?

In lines 237-243, the author mentioned in this paragraph that the solid waste sorting model deals with three categories: in the upper part, all types of metal, in the middle, all types of paper, and at the bottom, all types of plastic, bottles, and containers. The following question is: Does the efficiency of the model change with changes like the waste, whether wet or dry, or a difference in thickness and area? How can we ensure that the model works efficiently and with high quality using the artificial intelligence program?

In line 287, the author mentioned that there are important parameters. Are these parameters used in evaluating the solid waste sorting model or are they factors that control the processing of the model by a specific method? It is also preferable to put these factors in an explanatory table that compares Vertex AI model and CNN Architecture.

Line 320 & 341 through the two models (Vertex AI model and CNN Architecture), the author mentioned the Vertex AI model and the CNN architecture in terms of efficiency and use. Does the efficiency of each model differ according to the waste and the number of times used or repetition? Some factors control the evaluation of the efficiency of each machine and its performance accurately. Therefore, the author suggested that it be supported by an explanatory table that shows the reader the extent of the difference and compatibility between them in terms of performance, use, and savings.

Line 411-422, the author mentioned that the overall performance was evaluated accurately through a set of factors that contributed significantly to the accuracy of the results. The performance of the Xception and InceptionV3 models reached 98%, while the performance of the ResNet50V2 model was close to 98%. Finally, the DenseNet121 (FT) and MobileNetV2 models were the lowest in performance. Is this evaluation a trade-off between the different models according to performance only, or are there factors that control the evaluation? Is there an economic difference between them, meaning which one is more expensive and takes less time to sort? Also, are there problems that appear with each model during sorting, or is the only difference between them the sorting speed?

In line 450-451, the author mentioned in Table 4 that the Xception model and the ResNet50V2 model had similar training times of 2820 s and 2786 s respectively, but there was a difference in validation accuracy (98.78% and 96.49%) and test accuracy (98.12% and 95.79%). Is the reason for the difference in evaluation and testing an inability to adapt to sorting with high efficiency, because there are factors that hinder accuracy in evaluation and testing, or is it speed, which is a basic factor in saving time?

In line 512, the author mentioned that the models DenseNet121, InceptionV3, ResNet50V2, and Xception work with high efficiency, at rates ranging between 0.94 and 0.97 in the analysis of each category of waste, while the models MobileNetV2 and MobileNetV3 perform poorly compared to the previous models. The author did not mention the real reason, but rather guesses, so it is necessary to clarify the real reasons that led to the poor performance through specific factors that were studied on the models.

In lines 618-629, the author mentioned that the final cost of the Vertex AI model is $335, while the Xception model is $150. This is an economic factor, but are there differential factors between the models that confirm which one is better in use than the cost, or is the cost factor essential in the evaluation regardless of the disadvantages and advantages of each of the models?

Comments for author File: Comments.pdf

Comments on the Quality of English Language

I think it is better to edit and proofread the manuscript until it becomes more accepted for publication.

Author Response

Reviewer #3:

The title:

Line 2-3: I recommend the author clarify the title so that the reader understands what is meant by "Arduino, Raspberry Pi, 2 Vertex Ai model." Is it possible to modify the title to be clearer?

“Comparison of Vertex AI and Convolutional Neural Networks in Automatic Waste Sorting”

Abstract:

Introduction:

Automatic systems based on artificial intelligence, such as Vertex AI and convolutional neural network (CNN) architectures, overcome the limitations of manual methods by improving accuracy, reducing costs, and minimizing occupational and environmental risks. This study compares the performance of Vertex AI with a system based on selected CNN architectures, evaluating key metrics to determine the most efficient and cost-effective solution, providing a solid foundation for implementing sustainable waste classification solutions.

Literature Review:

In summary, recent studies emphasize the potential of architectures such as ResNet, DenseNet, and Xception, along with transfer learning techniques and the integration of IoT technologies and accessible hardware like the Raspberry Pi, to improve the accuracy and efficiency of automatic waste classification. However, it is important to clarify that while these techniques can classify various types of solid materials, such as different kinds of paper, plastics, and metals, the models must be specifically programmed and supported by datasets containing sufficiently representative images of those materials. In the present case, the dataset consisted of solid and dry waste. Furthermore, hybrid approaches and optimization techniques have enabled the classification of a broader range of waste types, significantly contributing to the sustainability and automation of waste management in urban and rural environments.

System Architecture:

The purpose of this system is to evaluate and compare the performance of the Vertex AI pretrained model with various convolutional neural network (CNN) models developed in TensorFlow. Figure 1 presents the general scheme of the automatic waste sorting system, highlighting the hardware components on the left and the software elements on the right.

Vertex AI, a commercial platform with pre-trained models, is implemented as the main engine of the system. Its capabilities include training and running machine learning models in the cloud, efficiently processing large volumes of data, and accurate real-time classification. This platform also makes it easy to manage large data sets and offers scalability options for industrial applications.

On the other hand, the open-source tools TensorFlow and OpenCV are used to develop and train custom CNN models for waste classification. The performance comparison between these approaches is based on key factors such as accuracy (ability to classify waste correctly), efficiency (processing speed and resource usage), and response times (latency in real-time applications). These metrics provide a solid framework to determine the most suitable model by analyzing the results obtained during the training and validation phases.

Prototype Hardware Schema:

The evaluation of the classification model is based on metrics such as accuracy and response time. In addition, the hardware elements required to implement each model are considered, which are shown in Table 1. The efficiency of the model was evaluated by testing with different types of waste, allowing these metrics to be compared.

Line 223-232, I recommend the author clarify this paragraph because the author mentioned that green innovations play a mediating role in the relationship between Green Supply Chain Management (GSCM) and Environmental Practices (EP). The question is, was he using mediating to explain the relationship or was there an influence from green innovations on the relationship because there is a difference between using mediating or moderating?

This comment does not refer to the article under review.

Line 257-266, I recommend the author explain and clarify the role of green knowledge sharing which played an important role by business leaders in influencing knowledge sharing, which greatly supports the relationship between Green Supply Chain Management (GSCM) and Environmental Practices (EP).

This comment does not refer to the article under review.

Methodology:

Line 219, I recommend the author separate the methodology from the Smart Waste Management System.

In lines 219-224, the author stated that the training, validation, and evaluation of the model follow a systematic approach based on the strategic division of the data set into three subsets: 70% for training, 15% for validation, and 15% for testing. Is this approach sufficient for the final evaluation of solid waste sorting, and what are the basic factors that affect this validation and evaluation?

3.6. Methodology
The flow of training, validation, and evaluation of the model follows a structured approach based on the strategic division of the dataset into three subsets: 70% for training, 15% for validation, and 15% for testing (see Figure 2). This segmentation optimizes the model's fitting and generalization capacity.
The process is carried out in two environments: Google Colab and Visual Studio Code. Google Colab facilitates the training phase, using 70% of the dataset and leveraging hardware acceleration to optimize model performance [10].
Visual Studio Code handles the validation and testing phases, using the remaining 30% of the dataset (15% for each phase) to assess the model's generalization and detect overfitting. This workflow ensures a comprehensive performance evaluation before final deployment.
The workflow continues with model training through hyperparameter tuning and data augmentation to enhance generalization. After training, the model is validated and adjusted based on the validation set results, followed by a final evaluation. In this evaluation, key metrics such as accuracy, loss, recall, F1 Score, and the Matthews Correlation Coefficient (MCC) are applied, providing a thorough analysis of performance.
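The evaluation metrics named above can all be derived from a confusion matrix. The following is a minimal, library-free sketch, not the authors' evaluation code: the function name `classification_metrics` and the example matrix are hypothetical, the averaging of recall and F1 is assumed to be macro averaging, and loss is omitted because it needs predicted probabilities rather than counts.

```python
import math


def classification_metrics(confusion):
    """Compute accuracy, macro recall, macro F1 and multiclass MCC.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n_classes))
    true_counts = [sum(row) for row in confusion]            # row sums
    pred_counts = [sum(confusion[i][k] for i in range(n_classes))
                   for k in range(n_classes)]                # column sums

    recalls, f1_scores = [], []
    for k in range(n_classes):
        tp = confusion[k][k]
        recall = tp / true_counts[k] if true_counts[k] else 0.0
        precision = tp / pred_counts[k] if pred_counts[k] else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)

    # Multiclass generalization of the Matthews Correlation Coefficient.
    numerator = correct * total - sum(
        p * t for p, t in zip(pred_counts, true_counts))
    denominator = math.sqrt(
        (total ** 2 - sum(p * p for p in pred_counts))
        * (total ** 2 - sum(t * t for t in true_counts)))

    return {
        "accuracy": correct / total if total else 0.0,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1_scores) / n_classes,
        "mcc": numerator / denominator if denominator else 0.0,
    }


# Hypothetical counts for metal, paper, plastic (not results from the paper).
example = [[4, 1, 0],
           [0, 5, 0],
           [0, 0, 5]]
metrics = classification_metrics(example)
```

Reporting MCC alongside accuracy is useful because MCC stays informative when one class dominates the test set, whereas accuracy alone can look high in that situation.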

Justification:

The choice of 70% of the data for training and 15% each for validation and testing follows established practice in machine learning. Training on 70% of the data gives the model enough examples to learn from while limiting overfitting. Although the remaining 30% is typically allocated to validation alone, in this case a separate test folder was added, so the 30% was divided into 15% for validation and 15% for testing to make the evaluation more robust. This balance helps evaluate the model's ability to generalize to unseen data. Factors influencing effectiveness include the size and diversity of the dataset, which help the model generalize and reduce the risk of overfitting and underfitting. Consistent representation of each class across the training, validation, and testing sets is a further factor, as it prevents bias toward certain classes.
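A 70/15/15 split with consistent class representation, as justified above, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the function `stratified_split`, the file names, and the count of 100 images per class are hypothetical, and the source does not say how the images were actually assigned to folders.

```python
import random


def stratified_split(paths_by_class, train_frac=0.70, val_frac=0.15, seed=0):
    """Split each class separately so every subset keeps the class balance.

    paths_by_class maps a class label to a list of image paths.
    Whatever is left after the training and validation shares goes to test.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    splits = {"train": [], "val": [], "test": []}
    for label, paths in paths_by_class.items():
        shuffled = list(paths)
        rng.shuffle(shuffled)
        n_train = round(train_frac * len(shuffled))
        n_val = round(val_frac * len(shuffled))
        splits["train"] += [(p, label) for p in shuffled[:n_train]]
        splits["val"] += [(p, label) for p in shuffled[n_train:n_train + n_val]]
        splits["test"] += [(p, label) for p in shuffled[n_train + n_val:]]
    return splits


# Hypothetical dataset: 100 placeholder file names per waste category.
dataset = {
    label: [f"{label}_{i:03d}.jpg" for i in range(100)]
    for label in ("metal", "paper", "plastic")
}
splits = stratified_split(dataset)
```

Splitting per class, instead of shuffling the whole dataset once, is what keeps metal, paper, and plastic in the same proportions across the three subsets.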

In lines 237-243, the author mentioned that the solid waste sorting model deals with three categories: in the upper part, all types of metal; in the middle, all types of paper; and at the bottom, all types of plastic, bottles, and containers. The question is: does the efficiency of the model change with the condition of the waste, such as whether it is wet or dry, or with differences in thickness and area? How can we ensure that the model works efficiently and with high quality using the artificial intelligence program?

Dataset Description

The dataset includes three main categories of waste: metal, paper, and plastic. At the top is metal waste, in the middle are various types of paper, and at the bottom are plastic bottles and containers. This set of images was collected from different datasets hosted in the Roboflow community repository, including “RecycleSorter” [25], used for paper sorting; “Bottle Defect Detection Dataset” [26], used for plastic bottle identification; and “Detect Can Dataset” [27], applied for metal can detection. These datasets were used to train artificial intelligence models with the aim of automating the waste sorting process, optimizing accuracy and efficiency in solid waste management.

Furthermore, to ensure that the model maintains a high level of accuracy and efficiency, even in the face of variations in the physical characteristics of the waste, such as its wet or dry state, thickness, and area, specific strategies were implemented during preparation and training. These strategies include the selection of representative images that cover different physical conditions of the waste, the use of data augmentation techniques to simulate variations in texture, lighting and orientation, and the validation of the models in real-life scenarios that reflect practical situations. These measures ensure that the model is robust and capable of classifying waste with high quality, regardless of the conditions in which it is found.
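The kind of data augmentation described above, simulating variations in lighting and orientation, can be illustrated with a small library-free sketch. The response does not state which transformations or tools were used, so the functions below (`flip_horizontal`, `rotate_90`, `adjust_brightness`, `augment`) and the brightness factors are assumptions chosen for illustration; a real pipeline would more likely use the image utilities of the training framework.

```python
def flip_horizontal(image):
    """Mirror a grayscale image (list of pixel rows) left to right."""
    return [list(reversed(row)) for row in image]


def rotate_90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def adjust_brightness(image, factor):
    """Scale pixel intensities and clamp them to the 0-255 range."""
    return [[min(255, max(0, round(px * factor))) for px in row]
            for row in image]


def augment(image, brightness_factors=(0.8, 1.2)):
    """Return the original image plus orientation and lighting variants."""
    variants = [image, flip_horizontal(image), rotate_90(image)]
    variants += [adjust_brightness(image, f) for f in brightness_factors]
    return variants


# Tiny hypothetical 2x2 grayscale image, values in 0-255.
sample = [[10, 200],
          [250, 40]]
augmented = augment(sample)
```

Each source image yields five training variants here, which is how augmentation widens the range of conditions the model sees without collecting new photographs.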

In line 287, the author mentioned that there are important parameters. Are these parameters used in evaluating the solid waste sorting model or are they factors that control the processing of the model by a specific method? It is also preferable to put these factors in an explanatory table that compares Vertex AI model and CNN Architecture.

The parameters mentioned above are not directly used to evaluate the solid waste classification model; instead, they play a crucial role in controlling the model's processing during training and optimization. Each of these factors is designed to influence how the model learns to classify data and improves its performance.

These parameters are factors that control the behavior of the model. While they are not evaluation metrics in themselves, they have a large impact on the final results when the model's performance is optimized during processing. They are listed in Table 2.

Lines 320 and 341: the author discusses the two models (the Vertex AI model and the CNN architecture) in terms of efficiency and use. Does the efficiency of each model differ according to the waste and the number of times it is used? Some factors control how accurately the efficiency and performance of each model can be evaluated. Therefore, it is suggested that this be supported by an explanatory table that shows the reader the extent of the difference and compatibility between the models in terms of performance, use, and savings.

Regarding model efficiency, it is important to consider that the effectiveness of Vertex AI and the CNN architecture can vary depending on the type of waste and on how often the model is used. For example, some materials such as plastics may require more training or processing cycles than others such as metals or paper. In addition, the performance of each model can be affected by factors such as the amount of data and the complexity of the classes.

Table 2 compares the characteristics of Vertex AI and CNN architectures in five aspects: training, performance, usage, long-term savings, and compatibility with waste types.

Table 2. Comparative Table between Vertex AI and CNN Architecture.

Feature	Vertex AI	CNN Architecture
Training	Cloud processing, no significant hardware dependency; easy integration and scalability.	Requires robust hardware, especially GPUs, for efficient training; requires manual configuration of the training environment.
Performance	Automatic scalability; pre-trained models optimized for efficiency.	Requires optimization after pre-training; greater control over the optimization process.
Usage	Requires Internet connection to access the cloud service; user-friendly interface for users with no ML experience.	Does not require Internet connection once the model is trained; requires technical knowledge in implementation and management.
Long-term Savings	Does not require own infrastructure; continuous costs for usage.	Requires initial investment in infrastructure; potential savings by not relying on external services.
Compatibility with Waste Types	Can adapt to different types of waste through specific configurations; no hardware changes required for each type of waste.	Requires training for each type of material; greater customization for each type of waste.

Lines 411-422: the author mentioned that the overall performance was evaluated accurately through a set of factors that contributed significantly to the accuracy of the results. The performance of the Xception and InceptionV3 models reached 98%, while the performance of the ResNet50V2 model was close to 98%. Finally, the DenseNet121 (FT) and MobileNetV2 models were the lowest in performance. Is this evaluation a trade-off between the different models according to performance only, or are there other factors that control the evaluation? Is there an economic difference between them, meaning which one is more expensive and which takes less time to sort? Also, are there problems that appear with each model during sorting, or is the only difference their sorting speed?

While training these models is free in terms of licensing, the time required for training varies significantly. Deeper architectures such as Xception and ResNet50V2 require more computational resources and longer training times compared to lighter models such as MobileNetV2, which train faster but achieve lower accuracy. Furthermore, during classification, Xception and InceptionV3 (FT) have slightly longer processing times than MobileNetV2 due to their higher complexity. These results indicate a trade-off between classification accuracy, training time, and processing speed, highlighting the importance of selecting a model based on the specific needs of the application.

In lines 450-451, the author mentioned in Table 4 that the Xception and ResNet50V2 models converged in training times of 2820 s and 2786 s, respectively, but differed in validation accuracy (98.78% vs. 96.49%) and test accuracy (98.12% vs. 95.79%). Is this difference in validation and testing due to the model's inability to adapt to sorting with high efficiency, because of factors that hinder accuracy, or is it due to speed, which is a basic factor in saving time?

The difference in accuracy between the Xception and ResNet50V2 models can be attributed to ResNet50V2’s inability to fully adapt to classification with high efficiency. This is due to factors such as the nature of the data, which may introduce variations that ResNet50V2 does not handle as well as Xception, and its architecture, which, despite having more parameters (23.6M vs. 20.8M), is not as efficient at extracting relevant features. Therefore, this difference is not due to training speed, but to the model’s ability to efficiently generalize over the evaluation and test data.

In line 512, the author mentioned that the DenseNet121, InceptionV3, ResNet50V2, and Xception models work with high efficiency, at rates ranging between 0.94 and 0.97 in the analysis of each category of waste, while the MobileNetV2 and MobileNetV3 models perform poorly by comparison. The author did not give the real reason, only guesses, so it is necessary to clarify which factors studied on the models actually led to the poor performance.

This significant drop in performance is mainly attributed to the MobileNet models' limited ability to handle the complexities of the dataset: they are designed for lightweight applications and lack the depth required to model complex patterns effectively. Furthermore, the reduced number of parameters and layers in MobileNet models, while advantageous in reducing computational costs, compromises their ability to capture and process fine-grained features, leading to decreased accuracy in tasks requiring high levels of precision, such as waste classification. These findings underscore the trade-off between computational efficiency and classification performance, and emphasize the need to align model selection with the specific requirements and complexity of the application.

In lines 618-629, the author mentioned that the final cost of the Vertex AI model is $335, while the Xception model is $150. This is an economic factor, but are there differential factors between the models that confirm which one is better in use than the cost, or is the cost factor essential in the evaluation regardless of the disadvantages and advantages of each of the models?

Cost plays a major role in evaluating models such as Vertex AI and Xception, especially in resource-constrained projects. According to the analysis, Vertex AI had a total implementation cost of $355, while Xception was considerably cheaper at $150. Vertex AI's usage cost breaks down into model creation, which took 1.5 hours at $18 per hour and came to $28.41, plus $3.46 per hour for the 3 hours of creation and testing, which added $10.38. This gives $38.79 for using the model alone.

However, cost is not the only aspect to consider. Accuracy and performance in waste classification are key: a more expensive model such as Vertex AI could offer higher accuracy and better real-time processing, which may justify the additional cost in certain contexts. Infrastructure and technology dependency also matter: while Vertex AI requires an internet connection and relies on cloud infrastructure, Xception runs without one, making it more suitable for environments with limited connectivity or for more autonomous solutions. Flexibility and control are further differentiating factors: Xception allows greater customization and control over the model, which suits teams that want to modify the system to their needs, whereas Vertex AI, as a pre-trained solution, may restrict customization but offers automatic updates and maintenance, reducing the operational burden.

Although cost is essential, the choice of model should therefore balance cost against performance, flexibility, scalability, and maintenance to determine which option best fits the needs of the project.
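The Vertex AI usage-cost figures quoted above can be recomputed in a few lines. This sketch only reuses the amounts stated in the response; the variable names are hypothetical. The implied hourly rate at the end is an inference from the quoted $28.41 over 1.5 hours and is not a figure given by the authors.

```python
from decimal import Decimal

# Amounts quoted in the response (US dollars and hours).
creation_cost = Decimal("28.41")      # model creation, 1.5 hours
creation_hours = Decimal("1.5")
testing_rate = Decimal("3.46")        # per hour of creation and testing
testing_hours = Decimal("3")

# Cost of the 3 hours of creation and testing.
testing_cost = testing_rate * testing_hours

# Total cost of using the model alone.
usage_total = creation_cost + testing_cost

# Hourly rate implied by the quoted creation cost (an inference, see lead-in).
implied_creation_rate = creation_cost / creation_hours

print(f"testing cost:          ${testing_cost}")
print(f"usage total:           ${usage_total}")
print(f"implied creation rate: ${implied_creation_rate}/h")
```

The testing cost and the total reproduce the quoted $10.38 and $38.79. The implied creation rate comes out at $18.94 per hour, which suggests the "$18 per hour" in the text is a truncated value.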

Thank you very much for the comments to improve the article.

Sincerely,

The authors.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

After the first round of revisions, most of the issues in the paper have been resolved. However, there are still two problems that need further optimization:

(1) The line representations in Tables 3 and 4 are incomplete. It is recommended to optimize and improve them further.

(2) Check whether the numbering sequence of the tables in the paper is correct.

Author Response

Response to the editors' and reviewers' comments

Revisions requested

Comparison of Vertex AI and Convolutional Neural Networks in Automatic Waste Sorting

jof-2392783

Dear reviewers and editor,

Thank you for your useful comments and suggestions on our manuscript.

We have modified the manuscript accordingly. Detailed corrections and comments are listed below, point by point:

COMMENTS FOR THE AUTHOR:

All changes and comments the reviewers made to the wording of the manuscript were accepted.

Reviewer #1

(1) The line representations in Tables 3 and 4 are incomplete. It is recommended to optimize and improve them further.

Tables 3 and 4 have been corrected.

Table 3. Training hyperparameters setting.

Optimizer	Learning rate	Epochs	Batch size	Test accuracy	Classification speed
Adam	0.001	10	32	97.44%	0.642 s
Adam	0.005	10	64	98.01%	0.711 s
Adam	0.01	10	64	96.84%	0.847 s
RMSprop	0.02	12	64	98.12%	0.234 s
Adamax	0.04	12	64	97.89%	0.726 s
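The rows of Table 3 can be compared programmatically to see which configuration wins on each criterion. This is a small sketch, not part of the authors' workflow: the `Trial` class and its field names are hypothetical, and the two rows with an empty optimizer cell are assumed to belong to Adam, which is how a merged table cell would normally read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    optimizer: str
    learning_rate: float
    epochs: int
    batch_size: int
    test_accuracy: float               # percent
    seconds_per_classification: float


# Values copied from Table 3; rows 2 and 3 are assumed to use Adam.
TRIALS = [
    Trial("Adam", 0.001, 10, 32, 97.44, 0.642),
    Trial("Adam", 0.005, 10, 64, 98.01, 0.711),
    Trial("Adam", 0.01, 10, 64, 96.84, 0.847),
    Trial("RMSprop", 0.02, 12, 64, 98.12, 0.234),
    Trial("Adamax", 0.04, 12, 64, 97.89, 0.726),
]

# Best configuration by each criterion taken on its own.
most_accurate = max(TRIALS, key=lambda t: t.test_accuracy)
fastest = min(TRIALS, key=lambda t: t.seconds_per_classification)

print(most_accurate.optimizer, most_accurate.test_accuracy)
print(fastest.optimizer, fastest.seconds_per_classification)
```

On these rows both selections return the RMSprop configuration (98.12%, 0.234 s), so no accuracy-versus-speed trade-off arises among the five settings listed.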

Table 4. Details of the pre-trained models

Model	Parameters	Layers	Size (MB)
Xception	20.8M	135	96.04
InceptionV3	21.8M	314	100.07
MobileNetV2	2.3M	157	19.15
ResNet50V2	23.6M	193	106.44
DenseNet121	7.0M	429	36.04
MobileNetV3	3.0M	191	19.52

(2) Check whether the numbering sequence of the tables in the paper is correct.

The table numbering sequence has been checked.

Thank you very much for the comments to improve the article.

Sincerely,

The authors.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

In reference to the revised version of the manuscript, I consider that the authors have effectively addressed all necessary revisions and improvements. Consequently, I find the article suitable for publication.

Remark: Please make a small editing correction (regarding the numbering). In 4.6. Xception, it refers to Table 8, not Table 7.

Author Response

Reviewer #2:

Remark: Please make a small editing correction (regarding the numbering). In 4.6. Xception, it refers to Table 8, not Table 7.

Xception

To evaluate the Xception model within the prototype, exhaustive tests were carried out using various types of materials. Table 8 presents the results, detailing the average accuracy and processing time per classification for each material.

Table 8. Average results from the 3 classes.

Material	Tests	Accuracy (%)	Time (s)
Metal	5	99.99	0.234
Paper	5	100.00	0.234
Plastic	5	100.00	0.242

Thank you very much for the comments to improve the article.

Sincerely,

The authors.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

Thank you very much for editing and progressing. The article has been reviewed carefully and I think there are two parts: the first is changing Schedule 6 to become Schedule 5, and the second is to edit and proofread the article.

Thank you

Dr. Ossama Labib

Comments for author File: Comments.pdf

Comments on the Quality of English Language

Dear Sir

I think the manuscript needs editing and proofreading to be acceptable for publication.

Thank you

Author Response

Reviewer #3:

Schedule 6 has been changed to Schedule 5 and the article has been edited and corrected.

Cost evaluation

Thank you very much for the comments to improve the article.

Sincerely,

The authors.

Author Response File: Author Response.pdf
