Smart Waste Management and Classification Systems Using Cutting Edge Approach

With a rapid increase in population, many problems arise in relation to waste dumps. These emit hazardous gases, which have negative effects on human health. The main issue is domestic solid waste collection, management, and classification. According to studies, in America, nearly 75% of waste can be recycled, but for lack of a proper real-time waste-segregating mechanism, only 30% of waste is being recycled at present. To maintain a clean and green environment, we need a smart waste management and classification system. To tackle this issue, we propose a real-time smart waste management and classification mechanism using a cutting-edge approach (SWMACM-CA). It uses the Internet of Things (IoT), deep learning (DL), and cutting-edge techniques to classify and segregate waste items in a dump area. Moreover, we propose a waste grid segmentation mechanism, which maps the pile at the waste yard into grid-like segments. A camera captures the waste yard image and sends it to an edge node to create a waste grid. The grid cell image segments act as test images for the trained deep learning model, which predicts the waste item in each segment. The deep-learning algorithm used for this project is the Visual Geometry Group network with 16 layers (VGG16). The model is trained on a cloud server and deployed at the edge node to minimize overall latency. By adopting hybrid and decentralized computing models, we can reduce the delay factor and use computational resources efficiently. The overall accuracy of the trained algorithm is over 90%, which is quite effective. Therefore, our proposed SWMACM-CA system provides more accurate results than existing state-of-the-art solutions, which is the core objective of this work.


Introduction
Waste management comprises the processes and activities required to handle waste from its inception to its final disposal. Waste comes in solid, liquid, or gaseous form, and every type of waste demands a different method of classification, disposal, and management. Waste management deals with every waste category, including household, organic, industrial, municipal, biomedical, biological, and radioactive waste. Any unnecessary substance or substance with no use is called "waste". Waste management involves the collection of the waste [1] and its transport and disposal to appropriate locations [2]. In the European Union (EU), 423 million tons, or 56%, of domestic waste was recycled in 2016. Reports reflect the need for proper household waste management for the recycling process [3]. According to [4], most of the Earth's population will migrate from rural to urban areas in the coming years. Therefore, bigger cities will require a highly sustainable infrastructure and a smart waste management system to fulfill the fundamental needs of their citizens and provide them with a good service in the future [5,6].
Traditional recycling processes segregate waste objects manually or by applying a sequence of filters. If modern technology and waste management could be bound together,

1. We combine IoT and deep learning paradigms to ensure an optimal solution for waste management.
2. We design a waste classification model (WCM) that classifies the waste into biodegradable and non-biodegradable items, such as plastic, metal, glass, etc., using the image classification technique.
3. We implement an architectural development process of a smart waste dump using the segmented grid image captured by a camera mounted on the Raspberry Pi.
4. We develop a smart way to monitor the waste dump in real time using a cutting-edge approach that decreases the overall latency and improves energy utilization.
5. We combine the cloud and edge processing mechanisms as a hybrid computing model, which improves the overall performance of the proposed system.
6. We perform a performance analysis of the proposed system results.

Literature Review
With the rapid modernization of every sector of society, nowadays, people rely on technology for everything. In this growing age, human lives have changed a lot, and modern technology has taken its place in the heart of every human being. Undoubtedly, there is no field in our surroundings where technology does not play a vital role. People prefer to live in cities with the latest facilities and technology. As a result, the population in cities is increasing daily, which has many disadvantages and advantages [18,19]. Individuals working in cities have a positive effect on the economy of the country [20,21]. Still, as societies become increasingly congested, many problems related to health, safety, and the environment arise [22][23][24][25]. These problems include medical facilities, security, privacy, and transportation [26,27].
Another huge problem in cities nowadays is waste management, which involves the collection, transportation, and classification of waste, and also helps with recycling waste items [27,28]. Intensive usage of natural resources has become unavoidable [29]. In contrast, due to increasing consumption trends, waste objects have reached levels that endanger human health and the environment in quantity and harmful content. Chemical, manufacturing, physical, and consumption properties are the considerations used to classify waste items [30,31]. Most of the population of growing cities is educated and well aware of the environmental effects of waste, but they dispose of their waste without classifying it. Everything in this universe has two aspects; one is good, and one is bad. Man feels this when it happens to him, or his belongings [3]. Several countries have placed bins with separate compartments for different waste categories. Still, the dwellers are not following the rules, which makes waste management and classification a complex task that needs a specific system to be designed to perform waste classification automatically [32,33].
According to [34][35][36][37], nearly 75% of solid domestic waste can be recycled. Therefore, it is costly to dump it rather than recycle it. If these waste items are classified, the recycling process will be boosted, which will benefit the economy of the country [38]. It will also provide a much greener environment for future generations to live in [39]. In short, failing to prioritize recycling can cause wastage of natural resources and financial loss [40,41]. Recycling is a viable solution, though it can be daunting to classify waste accurately. The efficient management of waste has a significant impact on people's quality of life, because waste disposal has a clear connection with adverse effects on the environment and thus on people's health. Therefore, there is a need for a proper plan for a waste management system for the betterment of the people who want to live in a healthy environment [42,43].
Various countries, such as America, Canada, Russia, Italy, Malaysia, the Kingdom of Saudi Arabia, and Qatar, are working to develop smart waste management systems. In previous work, the waste management technique implemented in St. Petersburg, Russia, used Wireless Sensor Networks (WSNs), Radio Frequency Identification (RFID) [44], sensors, and actuators. St. Petersburg is a city of 5 million individuals [45]. On average, 1.7 million tons of solid waste is produced in the city annually. In Canada [46], k-means and linear regression are used for waste management systems, in which multiple beats are involved to regulate the cycle.
In [47], single waste image classification was performed using SVM with SIFT and CNN. The authors manually collected a total of 2527 waste images. SVM and CNN models were trained on the collected dataset and achieved 63% and 22% accuracy, respectively. Their research classifies waste into six categories: paper, metal, cardboard, plastic, glass, and trash [48]. Municipal solid waste can be classified into either six or four categories. In six-class systems [49][50][51][52], researchers focus on recyclable waste classes: paper, metal, glass, cardboard, plastic, and trash. In four-class systems [53][54][55][56][57], the waste classes are wet waste (i.e., mostly kitchen waste), dry garbage, recyclable garbage, and hazardous garbage. A ResNet-50 and SVM-based intelligent system was proposed in [58]. The system was tested on a single waste images dataset [53] and gained 87% accuracy in classification. For automatic detection of recyclable waste in public places, a multilayered hybrid deep-learning-based system was proposed [59]. The system employed a CNN for image feature extraction and a multilayer perceptron for consolidating only the relevant features of the image. Their proposed technique achieves an accuracy of 90% for classification. A CNN-based system was developed to classify plastic wastes [60].
For Municipal Solid Waste (MSW), derived classifier models [61] were proposed based on transfer learning. Pretrained CNN classifiers (VGG16, MobileNetV2, ResNet50, and DenseNet121) were retrained on 9200 MSW images to classify the waste into four predefined groups (recyclable waste, hazardous waste, compostable waste, and general waste). In [62], the authors proposed an image classifier to identify the waste item and classify its category. In their research, four CNN classifiers (VGG16, DenseNet169, ResNet50, and AlexNet) trained on the ImageNet dataset were used for feature extraction from waste images to classify them into six categories: paper, metal, cardboard, plastic, glass, and trash. Their results reflect that ResNet50 performs better, and its performance is close to that of DenseNet169. The flaw in their proposed mechanism is that it misclassifies glass. Since the ILSVRC Competition, different image classification methods based on CNN architectures have been developed [63][64][65]. In image classification, VGG16 and VGG19 (also known as VGGNet) are two representative CNN architectures that achieved the best performance in the ILSVRC Competition. For large-scale image recognition, these models use tiny 3 × 3 convolutional filters in every layer and push the depth of the network to 16-19 layers.
Recent studies reported that deep learning (DL) models are more effective for object detection and classification than traditional techniques. Due to rapid urbanization, smart cities are being designed with smart and automated waste management using Internet of Things (IoT) technologies, which increase efficiency and flexibility, save energy and time, and keep the environment sustainable [8][66][67][68][69][70][71]. IoT-based solutions provide real-time monitoring, collection, and management of garbage. In [72], the authors developed IoT-based smart bins using deep learning (DL) and machine learning (ML) models to monitor, collect, and manage waste and to forecast air pollutants present in the surrounding environment. An IoT-based smart waste segregation and management system was developed by Shamin et al. [73], equipped with an ultrasonic sensor, a moisture sensor, a metal sensor, and a camera. Image processing and machine learning algorithms were used to identify degradable waste items and segregate them into different dustbins.
Before proposing a solution of our own, we reviewed a wide range of previously proposed models, papers, and studies. Each study was examined with respect to its domain of interest, its architecture, its pros and cons, the features it added, and the accuracy of the proposed architecture. After critically evaluating many studies on waste management and classification, we summarize the crucial information about them in Tables 1 and 2, so that readers can have an overview of the previous work carried out by researchers, practitioners, authors, and technologists on this subject. Previous studies have their benefits and limitations, and keeping their results in mind helped us propose a system that can overcome these limitations. Previous research studies solved the issue of waste classification and management to some extent, but all of them fall short in one way or another. Some have combined multiple approaches to propose a hybrid solution. The best-known accuracy has been achieved through a hybrid approach combining the deep learning algorithms Inception and ResNet; the accuracy achieved was over 88% in classifying waste items.
Similarly, many proposed systems have a hybrid solution consisting of "machine learning and deep learning". Still, the search for a more accurate and reliable system continues. Our research aims to focus on observing and understanding traditional methods for automatic waste classification systems, which can further help in the recycling process of waste items. Currently, many techniques for waste classification exist, but many require human involvement. If a fully automatic system is deployed in any society, it will be a win-win situation for the government, societies, and industrialists. The underlying purpose of this research is to provide an automated waste management system that can perform classification quickly and provide better and more accurate results at a low cost.

Proposed Methodology
The main focus of this research is to develop a system for the segregation of solid waste, mainly recyclables. Our proposed model is not a simple system for classifying waste; it is designed with careful consideration of all aspects and constraints of waste classification. All the functional and non-functional requirements for waste classification are addressed in the implementation of the system. Figure 2 shows a simplified view of the system. Firstly, the waste bins are equipped with sensors to monitor the level of waste in the bins, and waste management authorities collect waste from all commercial and domestic locations. Secondly, a camera is used to capture the image of the waste dump. A grid segmentation approach is applied to the image to divide it into grid cells. Each cell image is fed into a trained deep learning algorithm, VGG16, which performs identification. A classifier further determines the class of every waste object, and a robotic arm or gripper segregates it into the corresponding waste container. Edge computing is used along with cloud computing to decrease the overall latency. The proposed waste classification system is divided into three main modules, i.e., Edge Node Processing, Cloud Processing, and Control Unit. These are further divided into multiple substages, shown in Figure 2 and discussed below:

Edge Node Processing
The edge node is the fundamental unit of the proposed system. It minimizes the overall response time of the system, as it is close to the source of data and efficiently uses computational resources to accomplish the desired tasks. We used a Raspberry Pi 4 as the edge node to make it more effective in a real-time application. This stage is accomplished in three phases, i.e., image capturing, grid segmentation, and waste item classification.

Image Capture
The Pi Cam is mounted on the edge node to capture an image of the waste dump area. The image, taken at the center of the Pi Cam sensor focus, should be 1440 × 1080 pixels from a 5-megapixel Pi Cam. The captured image is further processed in the later phases to recognize the waste items effectively.

Grid Segmentation
Grid segmentation is used to split the whole test image into grid-like cell structures. We took the test image of a waste pile in a controlled environment; as shown in Figure 3, a single test image contains many waste objects, which makes it difficult to segregate the waste items in real time. To make the proposed waste classification mechanism more effective, we divide the test image into grid-like cell structures. This has a low computational cost and better performance overall. We apply the grid-like cell process in two phases:
• Phase 1: We map the captured image of the waste dump onto a grid-like cell structure, i.e., a 5 × 6 matrix of cells with the same resolution, and convert it into grayscale. The size of the grid and the total number of cells depend on the test image size.
• Phase 2: Initial classification is performed, and labels are applied based on the texture, color, and position features of each segment by the VGG16 algorithm, as discussed in Section 3.3.2. Each cell segment is processed separately to recognize the waste items appropriately. The segments are passed column-wise, one after another, to the VGG16 algorithm. Each cell contains one waste object at a time, which is ultimately picked by the robotic arm and placed in the respective bin. This process continues until the whole test image has been processed. Incorrect segments are put back, and the final labeling or classification consists of the union of only the correct segments.
The training dataset has the following limitation: it contains items in a segregated form, i.e., a single object in each image. So, for the time being, we conduct our experiments with a single object in each grid cell and under controlled conditions, as the training dataset was acquired with proper luminance and related parameters. However, chipping the test image makes it easier to recognize the waste items in the waste dump. It does not chip a single waste item; rather, it chips the overall test image, which contains multiple waste items, into a grid-like cell structure. The resolution of each test image cell after grid segmentation is set according to the training image size in the pre-processing stage. Finally, we obtain results with only a minute difference in accuracy, owing to the similar environment (as in the training dataset) provided during the experimental setup, discussed in Section 4.
For example, the image shown in Figure 4-a) is divided into grids, i.e., into rows and columns. Figure 4-b) represents the grid split into segments, which are forwarded to the next stage one by one to get the best result. In Figure 4-c), each segment is converted into grayscale. The segments are merged back into the original shape, shown in Figure 4-d), to make sure the test image is complete.
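The grid mapping described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the 1440 × 1080 capture size and the 5 × 6 grid mentioned in the text (read here as 5 rows by 6 columns), and the function names `grid_cells` and `to_gray` as well as the BT.601 luma weights for grayscale conversion are our own assumptions.

```python
from typing import List, Tuple

# (left, top, right, bottom) in pixels; a hypothetical box convention.
Box = Tuple[int, int, int, int]


def grid_cells(width: int, height: int, rows: int, cols: int) -> List[Box]:
    """Return equally sized cell boxes in column-wise order.

    Column-wise means: walk down the first column, then the second, and so on,
    matching the traversal order described in the text.
    """
    if width % cols or height % rows:
        raise ValueError("image size must be divisible by the grid size")
    cell_w, cell_h = width // cols, height // rows
    boxes: List[Box] = []
    for c in range(cols):
        for r in range(rows):
            left, top = c * cell_w, r * cell_h
            boxes.append((left, top, left + cell_w, top + cell_h))
    return boxes


def to_gray(r: int, g: int, b: int) -> int:
    """Grayscale value of one RGB pixel using BT.601 luma weights (assumed)."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)


cells = grid_cells(1440, 1080, rows=5, cols=6)
print(len(cells), cells[0], cells[1])
```

With these assumed dimensions each cell is 240 × 216 pixels. In a real pipeline every box would be cropped from the captured frame and resized to the classifier's input resolution before prediction.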

Waste Item Classification
Once the waste dump image is converted into the grid-like structure, each grid cell is processed to recognize the appropriate waste item, as shown in Figure 3. We use VGG16, a deep learning-based classification mechanism, to identify the waste item. The segregated waste item is managed with the help of the control unit, as discussed in the control unit module.

Control Unit
The control unit is the core part of the whole system. It generates control signals based on the input received from the edge processing module. It controls the robotic arm movement according to its degree of freedom (DOF) specification. We use a four-DOF robotic arm gripper to pick up the recognized waste item and place it into the waste bin that matches the recognized category, i.e., metal, plastic, glass, or trash, received from the edge processing module.

Cloud Processing
The cloud processing platform assists with computationally heavy processes and algorithms. It also facilitates managing and storing the large datasets on which the whole system is trained. The major components of this module are data storage, the deep learning algorithm, and the pre-trained model, as described below:

Data Storage
We use the "trashnet" waste items dataset. It comprises 2527 waste item images; details are given in Section 4.1. To process this dataset, we need to store it on a dedicated storage resource, as the edge node cannot handle such a large dataset. So, we place the dataset on cloud storage, where it is processed by the Google AI module for deep-learning model computations on the cloud.

Deep Learning Algorithm
In computer vision, the trend is to design more complicated and deeper networks to achieve higher accuracy. However, deeper networks also involve a trade-off in speed and size. Intuitively, deeper networks should not perform worse than shallower ones, but in practice they can, because deeper networks are harder to optimize. In real-time smart applications, such as our Internet of Things (IoT)-based system, the object detection and recognition tasks must be performed within limited computational cost and time. The alternatives to VGG16 are computationally costly. The VGG16 network is characterized by its simplicity and its deep architecture with small kernel sizes; thus, it is suitable for our real-time smart waste classification process, offering greater efficiency and speed. We train multiple deep-learning classification models, namely Fast R-CNN, MobileNetV2, and VGG16, on the trashnet dataset to recognize the appropriate waste item from the waste dump image. We find that VGG16 performs best for the proposed smart waste classification system. After that, the trained model file is imported into the edge node to recognize the appropriate waste item in the test waste dump image in real time. It also generates the proper input for the control unit to effectively control the robotic arm movements, as discussed in Section 4.
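To make the size of VGG16 concrete, the following sketch counts its trainable parameters from the standard layer configuration (thirteen 3 × 3 convolutional layers followed by three fully connected layers). It assumes the standard 224 × 224 × 3 input and the original 1000-class ImageNet head; the four-class variant is only our assumption about how the final layer might be resized for this task, since the text does not specify the classification head. The names `VGG16_CFG` and `vgg16_param_count` are illustrative.

```python
# Output channels of the 13 convolutional layers in the standard VGG16 layout;
# "M" marks a 2x2 max-pooling layer, which has no trainable parameters.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def vgg16_param_count(num_classes: int = 1000,
                      input_size: int = 224,
                      in_channels: int = 3) -> int:
    """Count weights and biases of VGG16 for a given number of output classes."""
    params = 0
    channels, size = in_channels, input_size
    for item in VGG16_CFG:
        if item == "M":
            size //= 2  # pooling halves the spatial resolution
        else:
            params += 3 * 3 * channels * item + item  # 3x3 kernels + biases
            channels = item
    features = channels * size * size  # flattened feature vector (7*7*512)
    for out in (4096, 4096, num_classes):  # three fully connected layers
        params += features * out + out
        features = out
    return params


print(vgg16_param_count())               # original ImageNet head
print(vgg16_param_count(num_classes=4))  # hypothetical four-class head
```

Most of the parameters sit in the first fully connected layer, which is why replacing only the output layer changes the total very little.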

Experimental Setup
In the proposed smart waste classification mechanism, we aim to classify the waste items, i.e., plastic, glass, metal, and trash, from the waste dump in real time. To accomplish this task, we develop the Edge Node Processing, Cloud Processing, and Control Unit modules discussed in Section 3, together with a power source. The edge node takes inputs from the Pi Cam and the cloud platform and controls the robotic arm through the control unit. The classification model is trained entirely on the cloud AI module and tested on the edge node in real time. The hardware components of each module of the proposed system are listed in Table 3. The components are interconnected and embedded in a single module to make it a smart waste classification system. The waste items are picked from the dump and dropped in a specified bin by a 4-DOF gripper, which operates on the inputs received from the Raspberry Pi. An Arduino controls the movement of the gripper and of all motors associated with it, and the edge node (Raspberry Pi) performs all the computation and classification based on the Pi Cam input image in a real-time environment. Figure 5 shows the visual assembly of the proposed system components. In the experimental setup, the Pi Cam is mounted with the edge node and hangs at a distance of 45 cm. The overall waste dump area is 4 × 50 cm², and the gripper hangs in the middle of the mechanical assembly at a distance of 60 cm from the surface. It can move up and down by means of the steel string wound on the movable pulley attached to the stepper motor. The waste dump lies on the surface inside the mechanical model. Items are picked after the grid segmentation and classification output, as discussed in Section 3.

Dataset Description
In this study, the TrashNet dataset is used, which is publicly available on GitHub [85]. It is a hand-collected dataset of approximately 3.5 GB. The dataset spans six classes: metal, paper, glass, cardboard, plastic, and trash. Together, these classes account for 99% of recycled material. Currently, the repository contains 2527 waste images. The dimensions of each image are 512 × 384, and the images can be resized as needed. The images were captured by placing the object on a white poster board as a background, using room lighting or sunlight. Figure 6 illustrates samples picked from each class of the dataset.
For this study, we use a subset of this dataset covering four categories of waste: plastic, metal, glass, and trash. We split it into two sets, training and testing, with ratios of 80% and 20%, respectively. However, for real-time testing of the proposed model, we trained on the whole dataset and tested on images captured in real time. Table 4 reflects the sample distribution of classes for each subset. It can be noticed that the number of samples per class is not balanced. Figure 7 represents the number of samples in each label class of the dataset.
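An 80/20 split over unbalanced classes can be sketched as follows. Splitting each class separately (stratification) is our assumption, chosen so that the small classes keep the same ratio in both subsets; the text does not say how the split was drawn. The toy file names and per-class counts are hypothetical and are not the real TrashNet distribution.

```python
import random
from collections import defaultdict
from typing import List, Sequence, Tuple

Sample = Tuple[str, str]  # (image path, class label)


def stratified_split(samples: Sequence[Sample],
                     train_ratio: float = 0.8,
                     seed: int = 0) -> Tuple[List[Sample], List[Sample]]:
    """Split samples into train/test sets, applying the ratio within each class."""
    by_class = defaultdict(list)
    for path, label in samples:
        by_class[label].append((path, label))

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train: List[Sample] = []
    test: List[Sample] = []
    for label in sorted(by_class):
        items = by_class[label]
        rng.shuffle(items)
        cut = round(len(items) * train_ratio)
        train.extend(items[:cut])
        test.extend(items[cut:])
    return train, test


# Hypothetical, deliberately unbalanced toy data.
toy_counts = {"plastic": 50, "metal": 40, "glass": 50, "trash": 10}
toy = [(f"{label}_{i}.jpg", label)
       for label, n in toy_counts.items() for i in range(n)]

train, test = stratified_split(toy)
print(len(train), len(test))
```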

Performance Metrics
Performance metrics are part of every machine-learning pipeline. Statistical validation of our proposed system and evaluation parameters used for the proposed waste classification architecture are as follows:

Accuracy
The performance of the system can be measured by calculating its accuracy. The system accuracy (AC) is the ratio of correct predictions (true positives and true negatives) to all predictions over the complete dataset. Mathematically, it is represented in Equation (1):

$$AC = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

Here, TP is true positive, TN is true negative, FN is false negative, and FP is false positive.
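As a small worked example, accuracy can be computed directly from the four counts defined above, assuming the standard definition (correct predictions divided by all predictions). The counts used here are hypothetical and are not results from this study.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of correct predictions among all predictions."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    return (tp + tn) / total


# Hypothetical counts for illustration only.
print(accuracy(tp=45, tn=45, fp=5, fn=5))
```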

Latency Overhead
Latency is a measure of delay, i.e., the time data takes to travel to its destination between the cloud and the edge node. For our system, we use the hybrid technique to minimize latency overhead. The cloud latency overhead (CLO) parameter is needed to calculate the cloud latency. Similarly, the edge latency overhead (ELO) parameter is needed to calculate the edge node processing latency. However, we use both cloud and edge node processing, for training and testing respectively, because of their computational and storage differences. Therefore, we calculate the hybrid latency overhead (HLO) parameter to capture the overall latency of the proposed system. Mathematically, these parameters are represented in Equations (2)-(4), where n is the number of attempts, Lc is the cloud latency, and Le is the edge latency.
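Equations (2)-(4) are not reproduced in this text. One plausible reading, given the symbols n, Lc, and Le, is that each overhead is an average over the n attempts and that the hybrid overhead combines the two. The form below is our assumption for illustration, not the authors' stated definition; the index i over attempts is introduced here.

```latex
\begin{align}
\mathrm{CLO} &= \frac{1}{n}\sum_{i=1}^{n} L_{c,i}, \\
\mathrm{ELO} &= \frac{1}{n}\sum_{i=1}^{n} L_{e,i}, \\
\mathrm{HLO} &= \frac{1}{n}\sum_{i=1}^{n} \left(L_{c,i} + L_{e,i}\right)
              = \mathrm{CLO} + \mathrm{ELO}.
\end{align}
```

Under this reading, the hybrid overhead is small whenever the cloud is involved only in training, because the per-image cloud latency then drops out of the real-time path.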

Resource Energy Utilization
Energy is a system's capacity to carry out work. It is the product of power and the length of time over which the power is consumed, as represented in Equation (5). So, if we know how much power (in watts) is being consumed and the time (in seconds) for which it is used, we can find the total energy consumed in watt-seconds, or joules. Power is the product of current and voltage, as represented in Equation (6). Thus, the energy consumed can be obtained from current, voltage, and time, as depicted in Equation (7). In our proposed system, this parameter is reported in Figure 13. The more computational and memory resources are used, the more energy is needed to operate the devices. In a smart system, the overall energy consumption reflects the performance of the whole system.

$$E = P \cdot t \tag{5}$$
$$P = I \cdot V \tag{6}$$
$$E = I \cdot V \cdot t \tag{7}$$

Here, E is energy in joules, P is power in watts, I is current, V is voltage, and t is time in seconds.
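The energy relation can be checked with a short calculation. The current, voltage, and duration below are hypothetical values chosen for easy arithmetic; they are not measurements from the proposed system.

```python
def energy_joules(current_a: float, voltage_v: float, seconds: float) -> float:
    """Energy in joules: power (current times voltage) multiplied by time."""
    power_w = current_a * voltage_v
    return power_w * seconds


# Hypothetical load: 3 A at 5 V for 10 s.
print(energy_joules(current_a=3.0, voltage_v=5.0, seconds=10.0))
```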

Performance Analysis
The parameters used to analyze the performance of the system are examined and illustrated in this section.

Accuracy Comparison
The accuracy trend is shown in Figure 8. In the beginning, the accuracy of Fast R-CNN is the highest, at 0.5, because of the fast learning rate of this algorithm. The algorithm we use in our proposed system learns gradually as the number of epochs increases. The learning rate of MobileNetV2 is the lowest. The reason is not that this algorithm is slow in general, but the large block size used by the previous research. Similar behavior can be seen in Fast R-CNN, where a block size of 128 is used: it starts well, but as the number of epochs increases, the curve becomes nearly flat. The other two algorithms show a gradual increase in accuracy as the number of epochs increases, but VGG16 achieves higher accuracy because the block size we use is 10. The accuracy of MobileNetV2 after 100 epochs is around 0.557. That is why the previous study used 20,000 epochs, which gives far better accuracy than what is achieved at 100 epochs in our proposed system. Moreover, a larger number of epochs also requires more computational resources, which is a critical factor in energy-constrained smart systems.

System Latency Comparison
The system's latency can be measured by the time taken to complete one round of classification. This includes the time the system takes to capture an image using the camera, segment it, perform classification, and pass the prediction results to the micro-controller, and the time the robotic arm takes to pick the item. The total time taken by the edge node is the total latency of the system when the prediction is made on the edge node. In Figure 9, we compare the latency of the edge node with the scenario in which all the work is carried out entirely on a cloud platform. As is evident in Figure 9, the latency overhead of the cloud is much higher than that of the edge node. The reason is that the cloud server is far away, so more time is needed to send every captured image to the cloud, where predictions are made, and then to get the results back from the cloud and pass them to the edge node. However, we train our model entirely on the cloud, as it has more computational and storage resources than the edge node, irrespective of its latency overhead, and test the model on the edge node to achieve better resource utilization.

Error Rate Comparison
Figure 10 shows the comparison of the total error rate incurred during the training of VGG16, which is the algorithm used in our proposed system. The total error rate at the start (after 1 epoch) is around 0.68; it gradually decreases as the number of epochs increases, falling to approximately 0.48 after the first 20 epochs. After that, it follows a similar downward trend, and the total error rate drops to 0.13 after 100 epochs. On the other hand, the total error rate of the previous state-of-the-art solution with MobileNetV2 is around 0.97 at the start, and only a minimal decrease is observed, in contrast to the prominent decrease observed with VGG16; after 100 epochs it is still at 0.84, which is far behind the algorithm (VGG16) we used in our proposed model. Furthermore, the error rate curve of the Fast R-CNN algorithm is quite close to that of VGG16, with 0.89 after the first epoch and 0.22 after 100 epochs, which is still higher than that of VGG16. Therefore, from the trends of these three algorithms, it is clear that VGG16 performs better than the others for the proposed system in terms of total error rate incurred during training. The reason is that the 16 weight layers (13 convolutional and 3 fully connected) and over 138 million parameters of VGG16 significantly reduce the error rate and increase the accuracy of the overall system.

Inference Time Comparison
The inference time is the total time a model takes to process a real-life input and deliver an actionable output. All three algorithms show optimal performance, as shown in Figure 11. Minimal difference is observed among the algorithms, i.e., Fast R-CNN, VGG16, and MobileNetV2. The inference time depends on the quality of the test picture passed to the algorithm and on the type of system on which the algorithm is tested. For instance, as we used a Raspberry Pi 4B as the edge node to test our algorithm, VGG16 performs better in some cases than MobileNetV2 and Fast R-CNN. In particular, VGG16 performs better for the glass and unknown categories, Fast R-CNN performs better at recognizing the metal category, and MobileNetV2 takes the lead in the case of plastic. However, the results vary with the quality of the test data, the number of epochs used in training, and the processor specifications.

Average Precision Comparison
Precision generally means the reproducibility of similar results by an algorithm. For example, Figure 12 illustrates the precision of the algorithm we use for classification by providing samples of each of the four categories. It achieves over 90% precision in the case of glass and metal, but the average precision for plastic and the unknown category is between 85% and 90%. The reason is the shuffling of training data or the higher accuracy in predicting items of glass and metal compared to plastic and unknown materials. Correspondingly, MobileNetV2 achieves higher precision in the case of plastic, while Fast R-CNN achieves the highest precision in the glass category compared to both MobileNetV2 and VGG16, because all the test data used in previous research were taken from the dataset on which the model was trained.
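In classification work, per-class precision is commonly computed as the share of predictions for a class that are actually correct, TP / (TP + FP). The sketch below assumes that conventional definition, which the text does not state explicitly; the labels and predictions are hypothetical and serve only to show the calculation.

```python
from collections import Counter
from typing import Dict, Sequence


def per_class_precision(y_true: Sequence[str],
                        y_pred: Sequence[str]) -> Dict[str, float]:
    """Precision per predicted class: correct predictions / all predictions."""
    true_positive = Counter()
    predicted = Counter()
    for truth, guess in zip(y_true, y_pred):
        predicted[guess] += 1
        if truth == guess:
            true_positive[guess] += 1
    return {label: true_positive[label] / predicted[label]
            for label in predicted}


# Hypothetical labels: one plastic item is misclassified as glass.
y_true = ["glass", "glass", "metal", "plastic", "plastic", "trash"]
y_pred = ["glass", "glass", "metal", "plastic", "glass", "trash"]
prec = per_class_precision(y_true, y_pred)
print(prec)
```

Here the single plastic item predicted as glass lowers the precision of the glass class while leaving the other classes untouched, which is the kind of per-category difference a figure such as Figure 12 would show.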

Resource Utilization Comparison
Figure 13 compares the resource utilization of VGG16, MobileNetV2, and Fast R-CNN. Fast R-CNN uses far more GPU resources for training than the other two models because of its large size and fast training. Because all model training is performed on the cloud and only the saved model is deployed on the Raspberry Pi (edge node), the model takes less memory on the Raspberry Pi. Still, training VGG16 on the cloud requires much more memory than training MobileNetV2, because its 16 layers and 138 million trainable parameters occupy a large amount of memory. MobileNetV2, on the other hand, uses the least GPU and memory resources because it is a lightweight model with fewer trainable parameters than VGG16 and Fast R-CNN, as shown in Figure 13. As a lightweight model, however, it is not well suited to large and complex datasets.
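
A rough back-of-the-envelope estimate shows why the parameter count dominates the memory footprint. The sketch below assumes 32-bit floating-point weights (4 bytes each), which the text does not state. The figure of about 3.5 million parameters for the lightweight model is a commonly cited size for MobileNetV2 and is likewise an assumption, not a number reported here.

```python
def weights_size_mib(num_params, bytes_per_param=4):
    """Approximate storage for the weights alone, in MiB (float32 by default)."""
    return num_params * bytes_per_param / (1024 ** 2)


# 138 million parameters, as quoted in the text for VGG16.
vgg16_mib = weights_size_mib(138_000_000)
# Assumed size of a lightweight model (roughly that of MobileNetV2).
lightweight_mib = weights_size_mib(3_500_000)

# Training needs considerably more than this, since gradients, optimizer state,
# and intermediate activations are kept in memory as well.
print(f"VGG16 weights: ~{vgg16_mib:.0f} MiB, lightweight model: ~{lightweight_mib:.0f} MiB")
```

Under these assumptions the VGG16 weights alone occupy roughly half a gibibyte, against a few tens of mebibytes at most for the lightweight model.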

Critical Analysis
To achieve better results than the state-of-the-art techniques shown in Table 5, we use the VGG16 algorithm, which we consider the most appropriate choice in terms of computation and accuracy for the object detection and classification process. VGG16 has 16 weight layers: 13 convolutional layers followed by 3 fully connected layers that operate on the flattened feature maps. The convolutional layers extract the valuable features, and the weights learned for these features feed the fully connected output layers, whose predictions are used to calculate the overall performance metrics of the proposed system. The model is trained on the dataset for up to 100 epochs, where one epoch is a complete pass over the training data. The accuracy achieved after a single epoch is 85%, as shown in Figure 14. Because our model is already trained and saved, we do not have to retrain it for the full 100 epochs; Figure 14 is shown for demonstration purposes only. Training for more epochs can increase the accuracy, but it also risks over-training (overfitting) the model, so the number of epochs should be chosen carefully. In our proposed system, we train the model with different numbers of epochs and obtain the highest accuracy, 96%, on test data in a real-time environment when the number of epochs is around 100. The variation in the accuracy of the proposed system with the epoch hyper-parameter reflects that more features are learned as the number of epochs increases.
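
One common way to choose the number of epochs while limiting over-training is early stopping on a held-out validation set. This technique is not described in this work and is shown only as an illustration. The helper name `best_epoch_with_patience` and the accuracy values are hypothetical and do not correspond to Figure 14.

```python
def best_epoch_with_patience(val_accuracy, patience=3):
    """Return (best_epoch, best_accuracy, stopped_epoch), with epochs counted from 1.

    Training is considered stopped once `patience` consecutive epochs have
    passed without the validation accuracy exceeding its best value so far.
    """
    best_acc = float("-inf")
    best_epoch = 0
    for epoch, acc in enumerate(val_accuracy, start=1):
        if acc > best_acc:
            best_acc, best_epoch = acc, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, best_acc, epoch
    return best_epoch, best_acc, len(val_accuracy)


# Made-up validation accuracies, one value per epoch.
history = [0.60, 0.72, 0.80, 0.84, 0.83, 0.84, 0.82, 0.81, 0.90]
best_epoch, best_acc, stopped_at = best_epoch_with_patience(history, patience=3)
print(best_epoch, best_acc, stopped_at)
```

With a patience of three epochs, the toy run stops at epoch 7 and keeps the weights from epoch 4. A larger patience would have reached the later improvement at the cost of more training time.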

Test Predictions
After running the test predictions in a real-time environment, the categories appear in the order (glass, metal, plastic, unknown), as shown in Figure 15. An output array of (1, 0, 0, 0) therefore means that the item is glass, (0, 1, 0, 0) means that the classified item is metal, and so on. The classification system performs actions based on these outputs. For instance, if the output is glass, it generates a specific number (say, one) on the serial monitor of the controller; if the output is metal, it generates another number (say, two); and so on. For each number, a loop is programmed that moves the gripper to pick up the classified item, as shown in Figure 16.
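
The mapping from output array to controller code can be sketched as follows. The category order and the codes one (glass) and two (metal) follow the text; the codes three and four for plastic and unknown, and the function name `decode_prediction`, are assumptions made for illustration. Sending the code over the serial link is omitted.

```python
CLASSES = ("glass", "metal", "plastic", "unknown")  # category order from the text


def decode_prediction(output):
    """Map a model output array to (category, serial_code).

    The position of the largest value selects the category, and the serial
    code is that position plus one (glass -> 1, metal -> 2, ...).
    """
    if len(output) != len(CLASSES):
        raise ValueError(f"expected {len(CLASSES)} values, got {len(output)}")
    index = max(range(len(output)), key=lambda i: output[i])
    return CLASSES[index], index + 1


label, code = decode_prediction([0.0, 1.0, 0.0, 0.0])
print(label, code)
```

Taking the position of the largest value rather than testing for an exact 1 also handles outputs that are probabilities instead of a clean one-hot array.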

Conclusions
The proposed grid-like segmentation system provides effective segmentation results. VGG16, a popular deep-learning algorithm for image classification, achieves an accuracy of 96%, compared with the 85-88% achieved by the previously used Inception and ResNet algorithms. The overall IoT communication is also effective, as we use a personal area network (PAN) to make communication between the edge node and the controller more reliable in a real-time environment. The output from the edge node is generated instantly on the controller's serial monitor, which decreases the overall latency of the proposed system. Moreover, the limited processing resources of the edge node are compensated for by training the model on a cloud system, so that only the trained model is deployed on the edge node.