Waste Management and Prediction of Air Pollutants Using IoT and Machine Learning Approach

Abstract: Increasing waste generation has become a significant issue across the globe due to the rapid increase in urbanization and industrialization. In the literature, many issues that have a direct impact on the increase of waste and the improper disposal of waste have been investigated. Most of the existing work has focused on providing a cost-efficient solution for monitoring garbage collection systems using the Internet of Things (IoT). Though an IoT-based solution provides the real-time monitoring of a garbage collection system, it is limited in controlling the spread of overspill and bad-odor blowout gases. The poor and inadequate disposal of waste produces toxic gases and radiation in the environment, which have adverse effects on human health, the greenhouse system, and global warming. Given the importance of air pollutants, it is imperative to monitor and forecast the concentration of air pollutants in addition to managing the waste. In this paper, we present an IoT-based smart bin that uses machine learning and deep learning models to manage the disposal of garbage and to forecast the air pollutants present in the environment surrounding the bin. The smart bin is connected to an IoT-based server, the Google Cloud Server (GCP), which performs the computation necessary for predicting the status of the bin and for forecasting air quality based on real-time data. We experimented with traditional models (the k-nearest neighbors (k-NN) algorithm and logistic regression) and a non-traditional (long short-term memory (LSTM) network-based deep learning) algorithm for the creation of alert messages regarding bin status and for forecasting the amount of the air pollutant carbon monoxide (CO) present in the air at a specific instance. The recalls of the logistic regression and k-NN algorithms are 79% and 83%, respectively, in a real-time testing environment for predicting the status of the bin. The accuracies of the modified LSTM and simple LSTM models are 90% and 88%, respectively.

(RFID) tags, sensors, and Long Range (LoRa) technology for the real-time collection of waste with minimum human effort. The data gathered using IoT-based technologies enable waste management authorities to track waste and air pollutants in real time using integrated systems that consist of radio-frequency identification (RFID), the Global Positioning System (GPS), General Packet Radio Service (GPRS), a geographic information system (GIS), and web cameras [15,16]. Traditional systems mostly focus either on the monitoring and tracking of waste or on the monitoring of air quality. No existing system provides all of these features together, which is a major drawback in the existing literature. A detailed analysis is given in Section 2. To keep the environment hygienic, there must be a proper mechanism for monitoring and forecasting the toxic air pollutants emitted by waste. Our system provides both the monitoring and management of waste and the monitoring and forecasting of air pollutants to avoid dangerous effects on human health.
We conducted a novel study by utilizing machine learning and an IoT-based mechanism to effectively handle smart garbage management systems, and our system showed an improved accuracy compared to traditional garbage collection systems. The proposed system also provides sufficient information for the monitoring of air quality analysis in the environment. The proposed system can provide the accurate, real-time monitoring of garbage level along with notifications from an alert mechanism to municipal waste management. It deals with polluted waste issues and management in smart cities where the garbage collection system is not optimized. It provides the real-time monitoring of different toxic gas concentrations in the environment. Air quality monitoring mechanisms let the user forecast the next level of concentration in the air to take early corrective action.
The following research questions were investigated in this paper:

• RQ-1: What are the state-of-the-art approaches and techniques used for the monitoring and management of the waste?

The rest of the paper is organized as follows. In Section 2, we review the existing work based on the collection and monitoring of waste and emphasize the research work that specifically addresses the limitation of existing air quality analysis. In Section 3, we describe the system architecture. In Section 4, we discuss our proposed methodology using traditional and non-traditional machine learning models. In Section 5, we discuss the solution and goal of this research project. In Section 6, detailed evaluations of the system are discussed. In Section 7, we present the discussion and analysis of the results. In Section 8, we present our conclusion and provide future direction.

Related Work
According to an online report [17], the production of global annual waste was 2.01 billion tons in 2016, and it is anticipated that it will be in billions of tons in the next few years. Traditional approaches utilize the manual door to door approach, the shortest path algorithm [11], and the shortest route to collect the garbage [18]. To generate a notification message to municipal corporations, Global System for Mobile Communications (GSM) technology, which automatically sends a notification when a bin is full of rubbish, was used in [19]. In [20,21], ACS and the K-means algorithm, respectively, were used to measure the distance covered, fuel consumption, and the amount of solid waste accumulated in traditional waste systems.
The survey presented in [14] described a bin equipped with a microcontroller and a wireless system that showed the current status of garbage in a dustbin on a mobile phone with an internet connection. In [22], the author used a microcontroller with CCTV cameras to identify and monitor the external environment. Furthermore, RFID tags were used to identify each bin by assigning it a unique ID. To identify the extent of waste in each bin, a wireless sensor network was employed that sent an alert message to the authorized person using an embedded board; a Zigbee module was used to establish communication between different nodes located in a specific range, and an alert message was displayed on the smart bin when it was about to be full. GSM technology has also been used to send the status of a bin to its respective municipal authority [23]. RFID tags were used as waste tags to identify the waste in [11,24].
The technologies presented above use CCTV cameras, which are expensive and still do not provide the status of the bin (empty or full). Radio-frequency identification tags are also an expensive and inefficient solution due to their limited amount of storage. Manually checking these tags for garbage information is time-consuming, which delays the collection of the garbage. Piles of garbage spread harmful gases in the environment, not only spoiling the beauty of nature but also adversely affecting human health [18].

Monitoring and Tracking of Waste Using IoT
After traditional approaches for waste collection, research has moved towards IoT-based solutions that mostly consist of sensors and actuators, with a communication infrastructure between devices established through the internet. In this architecture, all things are connected and controlled over the internet.
IoT architecture is also defined by three characteristics at the system level:
• Things can communicate over the network.
• Things can be identified using IDs.
• Things can interact with the local environment.
In the field of the IoT, researchers [22,25] deal with the management of waste by using a smart module integrated with built-in sensors. The data acquired from the hardware module can also be shown on mobile and web applications at the following two levels: (i) level detection on the bin cover and (ii) weight detection at the bottom of the bin. Weight sensors can measure up to 750 kg of waste with an accuracy of 0.02%. The trash-level sensor provides a range of 2-400 cm, depending on the depth of the bin. Information about the waste is passed to a mobile application using a GSM module.
An IoT-based module using WeMos and ultrasonic sensors was presented in [26]. The main objective of this research was to provide a solution for garbage collection and the management of waste in smart cities. Ultrasonic sensors are attached to smart bins, and the status of the bins is transferred to a municipal office through a WeMos chip. These chips have a Wi-Fi development board and are cheaper than Arduino. They send the dustbin waste information (whether it is empty or full) to a municipal office through the smart bin's IP address.
To distinguish different levels of waste, the author of [27] proposed a technique where different colors (such as black, green, purple, and red) represent different levels (such as empty, low, medium, and full) of the bin. A garbage truck collects garbage when the bin gets filled. A unique ID is allocated to each bin. When the reading is from 0 to 24, the color of the bin is black on the map, which means the bin is empty. When the reading is from 25 to 49, the color is green on the map, which means the level of garbage is low. When the reading is from 50 to 74, the color is purple on the map, which means the level is medium. When the reading is 100, the color is red, which means the bin waste needs to be disposed of with priority. The direction of the bin is also mentioned on the map. In the existing literature, a couple of different solutions have been proposed using various sensors and architectures. Detailed descriptions of these components and their extracted features are presented in Table 1.
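The color scheme described for [27] can be illustrated with a small lookup. This is only a sketch under stated assumptions: the function name `bin_color` and the table `LEVEL_COLORS` are hypothetical, the reading is assumed to be an integer fill value, and because the text gives no color for readings between 75 and 99, that range is deliberately left unmapped rather than guessed.

```python
from typing import Optional, Tuple

# (low, high, map color, level) as stated in the text for [27].
# Readings 75-99 are not specified there, so they are left unmapped.
LEVEL_COLORS = [
    (0, 24, "black", "empty"),
    (25, 49, "green", "low"),
    (50, 74, "purple", "medium"),
    (100, 100, "red", "full"),
]


def bin_color(reading: int) -> Optional[Tuple[str, str]]:
    """Return (color, level) for a fill reading, or None if the range is unspecified."""
    for low, high, color, level in LEVEL_COLORS:
        if low <= reading <= high:
            return color, level
    return None
```

Returning `None` for the unspecified range keeps the gap in the description visible to the caller instead of hiding it behind a default color.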
The aforementioned techniques identify the status of the bin, the location of the bin, and the different levels of waste by utilizing different mechanisms. However, they are limited in classifying or segregating the waste inside the bin. Different state-of-the-art techniques have been proposed for the segregation of waste into metallic, non-metallic, organic, and non-organic waste. Detailed descriptions of these segregated waste materials are presented in [28,29]. In [29], an automated teller dustbin (ATD) was introduced. This is a smart system that automatically detects organic and non-organic garbage objects. [24].
Energies 2020, 13, 3930 6 of 22

One author used a convolutional neural network (CNN)-based network to detect objects in images. Another research work in India related to the sanitization, dumping, and separation of garbage proposed using the Internet of Things and machine learning [38]. The author proposed a system that uses sensors to detect metallic, non-metallic, moisturized, and bio-hazard materials. An ultrasonic sensor is used to detect the waste, a moisture sensor detects moisture in the waste, and a metallic sensor is used to separate metallic and non-metallic things. Reusable waste is identified by using image processing techniques. The acquired data are sent to a remote server to provide the real-time status of waste. Most of the existing work in the literature has focused on providing a cost-efficient solution for the monitoring of garbage collection systems. Though IoT-based solutions provide the real-time monitoring of garbage collection systems, they are limited in monitoring and controlling the spread of overspill and bad-odor blowout gases (carbon monoxide, nitrogen oxides, sulfur dioxide, lead, etc.) in the environment.
These gases contaminate the environment and cause dangerous diseases, not only in humans but also in plants and animals. Mostly, these air pollutants are measured as particulate matter (PM), which is the term used to describe a mixture of solid particles and liquid droplets in the air. These air pollutants may occur naturally or be emitted during the combustion of solid and liquid fuels. After a detailed analysis of the nature of the gases present in a bin and in its surrounding environment, we chose CO gas in particular because it has standardized benchmark thresholds and because of its effects on human health [39]. Table 2 highlights the different concentration levels of CO in the air and their impact on human health.
Another work attempted to identify the haze level [40] and weather conditions to monitor air quality. Monitoring air quality is essential for a hygienic environment, and different concentrations have different impacts on human health. In the existing literature, air quality has been monitored by image-based and sensor-based approaches, as discussed in [40,41], respectively. In [40], air quality was monitored by the estimation of haze levels (nonHaze, lightHaze, and heavyHaze) in an image by applying different pooling and transformation functions. Different images of weather conditions (clear, cloudy, etc.) were taken to build a dataset for the estimation of the effects of the air pollutant PM2.5 on the environment. In [41], the author used CNN fine-tuning with 15 layers and extracted features using the fully connected (Fc) layer before SoftMax, e.g., Fc8 of the CNN, for the training of random forest classifiers. These features were used as inputs for other classifiers. The image was classified into three major categories (good, moderate, and severe) to identify the different concentrations in the air. The author also compared the CNN and random forest classifiers and showed that the accuracy of the CNN-based classifier was 5% higher than that of the random forest. Air quality was also monitored using sensor-based techniques. In [34], the author used MyRIO, Arduino Mega, and MQ-7 gas sensors to identify CO concentrations in the environment. The data extracted from the sensor were sent to a LABVIEW GUI interface for the real-time monitoring of CO concentration.

Limitation of Existing Literature
In developing countries like Pakistan, there is a need to change the manual collecting system into a smart monitoring and tracking system for the collection of waste, and researchers need to focus their attention on monitoring the spread of overspill and bad-odor blowout gases due to the burning and inadequate disposal of waste. In other words, the monitoring and tracking of waste alone are not sufficient. To keep the environment hygienic, there must be a proper mechanism for the monitoring and forecasting of toxic waste. In the literature, CNN-based classifiers have been used for air quality analysis, but they have two shortcomings: (i) they do not provide the real-time concentration of gases in the air, and (ii) they are limited in forecasting the next level of concentration in the air of the respective area. Currently, there are no smart systems that integrate both garbage management and environment monitoring. In this work, we propose a system that identifies the levels of garbage along with real-time air monitoring by utilizing machine learning and deep learning approaches.

Overall Design of the Research Object
• Solar panels along with a 12 V battery are used for power supply (energy efficiency).
• An ultrasonic sensor interfaced with a microcontroller is used to detect the level of garbage (working), and an LCD interface on the bin is used to show the status of the bin, i.e., whether it contains toxic gases or is filled with trash.
• An odor sensor is used for the detection of odor generated by garbage.
• A TGS2600 sensor is used to monitor the existence of gases in the surrounding environment for the purpose of air quality control.
• Each bin is fixed at a specific location, so there is no need to add a separate GPS module.
• A weight sensor measures the weight of the garbage.
• A fixed location is added in the system for every new bin.
• Smart bins are connected to Thingspeak (a cloud server) using NodeMCU. Sensor data are sent to the cloud server, which processes the data and pushes a notification to a sanitation worker near the bins.

Proposed Methodology
The proposed methodology of the system consists of different stages, as explained below. These stages are critical to developing a Smart Garbage Collection System (SGCS) module that keeps our environment neat and clean. Moreover, the main aim of our proposed system was to improve the collection of waste in everyday life in smart cities.

Data Collection
One of the major issues we faced while developing this system was the lack of a standard database containing garbage statistics. No real-world dataset was available for our application, which targets both smart air monitoring and smart garbage collection. Currently, one air quality dataset exists that contains the concentrations of different pollutants present in the air [42]. The dataset contains 9358 entries of 13 gas levels present in a polluted environment. This dataset was collected using multivariant sensors deployed in the polluted environment of an Italian city. We developed a smart bin consisting of a distance sensor, a weight sensor, an odor sensor, and an air monitoring module. The smart bins were installed at 4 different locations in the industrial city of Sialkot, Pakistan. Distance and weight sensors were used to determine the level and weight of the trash, respectively. Odor sensors were used to determine the odor generated by garbage. The software and hardware interaction is shown in Figure 2. For the detection of air pollutants and toxic gases dangerous to human health, we used a TGS2600 sensor, which can detect hydrogen, ethanol, and CO levels in the air. The TGS2600 sensor was placed at the side of the bin, as shown in Figure 3. The density of CO at standard temperature is slightly lower than that of air, so the two differ only slightly. The concentration of CO can increase due to the nature of the garbage inside the bin, but it can also increase due to external factors such as the incomplete burning of carbon-containing fuels like coal, oil, charcoal, wood, kerosene, natural gas, and propane. If a sensor is installed inside a bin, it will only provide the CO level inside the bin rather than in the environment surrounding the bin.
After considering these points, we placed the TGS2600 sensor outside of the bin so that it could detect the leakage of gas produced inside the bin and also measure the concentration of the CO level present in the environment where the bin was installed.
A higher concentration of CO results in shortness of breath and, in some cases, in the death of a person. CO can be generated by different chemical wastes present in garbage or by incomplete combustion processes. The garbage data were labeled by the researcher in real time according to the three levels shown in Table 3. Errors could still occur due to the inability of the user to classify the levels of the dustbin in real time. To resolve these issues, the dataset was labeled again using a simple classification algorithm based on the levels of trash and weight. The bin status labeled by the classification algorithm is shown in Table 4. The sensor data were stored in the Firebase database. Currently, from an experimental perspective, the system consists of four modules installed at four different locations in the city. The data for air quality, odor, level of trash, and weight of garbage were stored separately for each dustbin. True hourly averaged values of the gas concentration were stored individually for each dustbin. The dataset contained 6 months' readings from the four smart bins. This dataset was then downloaded from the Firebase server to perform different evaluations.

Traditional Machine Learning Model
The proposed system uses machine learning approaches to effectively handle the smart garbage management system and to improve the accuracy of the system compared to traditional systems. For traditional machine learning algorithms (naive Bayes, logistic regression, and k-nearest neighbors (KNN)), before any learning can occur, the raw data collected from the bins must be processed to extract a set of features that can be used for creating the model. In our experiments, we used the following features: (1) time slot, (2) mean of the trash level at a particular time, (3) standard deviation of the trash level at a particular time, (4) mean of the weight of the trash at a particular time, and (5) standard deviation of the weight of the trash at a particular time.
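The five features listed above could be computed along the following lines. This is a minimal sketch, not the authors' implementation: the function name `extract_features`, the `(time_slot, trash_level, weight)` tuple layout, and the use of the population standard deviation are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def extract_features(readings):
    """Group raw readings by time slot and compute per-slot statistics.

    readings: iterable of (time_slot, trash_level, weight) tuples (assumed layout).
    Returns {time_slot: [time_slot, mean_level, std_level, mean_weight, std_weight]}.
    """
    by_slot = defaultdict(list)
    for slot, level, weight in readings:
        by_slot[slot].append((level, weight))

    features = {}
    for slot, samples in by_slot.items():
        levels = [s[0] for s in samples]
        weights = [s[1] for s in samples]
        # Population standard deviation is an assumption; it is also
        # defined (as 0) when a slot has only one reading.
        features[slot] = [slot, mean(levels), pstdev(levels),
                          mean(weights), pstdev(weights)]
    return features
```

Each resulting row can then be paired with a bin-status label and passed to a classifier such as logistic regression or k-NN.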
The system was trained to predict the status of the bin for a particular time slot based on weight and trash levels. Naive Bayes and a logistic regression model were trained to predict the bin status as un-predicted, un-filled, half-filled, or filled. One of the limitations of the proposed model was that it was unable to handle faulty input received from a smart bin. The faulty input may have been caused by a failure of the hardware module.
To ensure the reliability of the hardware module, we also developed a probabilistic model to handle ambiguities or exceptions in the proposed system. The data were additionally checked to see whether the values were within a threshold limit, and the values were then passed to a web server and mobile application to notify the worker about the present status of the bin. If ambiguities or exceptions occurred in the proposed system, the status of the bin was determined using the probabilistic model. A prior and posterior probabilistic model was implemented to assign a particular label to the bin status based on previous bin status decisions for the respective time slot. Additionally, it indicated in which class (un-filled, half-filled, or filled) a particular bin status lay. The following expression was used to predict the expected bin status based on the prior and posterior probabilistic mechanism:

$$P(X \mid Y) = \frac{P(Y \mid X)\,P(X)}{P(Y)}$$

where X and Y are events, P(Y) should be greater than zero, and P(Y | X) defines the conditional probability of event Y given event X. This function calculated the probability of the bin status on the basis of previous data and assigned a label to it that indicated in which class the bin status lay. If the status of the bin did not lie in the defined levels, then, according to the conditional probability, the previous status of the bin was assigned to the bin due to its larger probability. The pseudo-code for our traditional machine learning model is described in Theorem 1. Whenever a dustbin was filled with garbage, an alert message was sent to the respective worker on the Android app for the collection of waste. Additionally, after collecting the garbage, the worker sent an acknowledgment message to the administrator to verify the real-time status of the bin. The worker also received an alert notification based on the odor of the garbage if it exceeded a specific threshold level.
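The fallback behavior described above, where an out-of-range status is replaced by the most probable status seen previously for the same time slot, might be sketched as follows. This is a simplified illustration rather than the paper's algorithm: the class `BinStatusPrior`, its method names, and the relative-frequency estimate of the conditional probability are assumptions.

```python
from collections import Counter, defaultdict

VALID_STATUSES = {"un-filled", "half-filled", "filled"}


class BinStatusPrior:
    """Keeps per-time-slot counts of past bin statuses (hypothetical helper)."""

    def __init__(self):
        self.counts = defaultdict(Counter)  # time_slot -> Counter of statuses

    def update(self, time_slot, status):
        """Record a trusted status decision for a time slot."""
        if status in VALID_STATUSES:
            self.counts[time_slot][status] += 1

    def probability(self, time_slot, status):
        """Relative-frequency estimate of P(status | time_slot)."""
        total = sum(self.counts[time_slot].values())
        return self.counts[time_slot][status] / total if total else 0.0

    def resolve(self, time_slot, observed):
        """Keep a valid observation; otherwise fall back to the most
        probable past status for this slot (None if there is no history)."""
        if observed in VALID_STATUSES:
            return observed
        history = self.counts[time_slot]
        if not history:
            return None
        return history.most_common(1)[0][0]
```

For example, after recording "filled", "filled", and "half-filled" for the 18:00 slot, a faulty reading in that slot would resolve to "filled", since its estimated probability (2/3) is the largest.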

Deep Learning Model
Machine learning classifiers were used to classify the levels of the trash in the dustbin, and the monitoring of the environment was performed with deep learning approaches. In machine learning, one of the major issues faced by researchers is the manual extraction of features and the provision of these features as inputs to the algorithm before the system starts its learning. The values obtained by the sensors give gas levels over a wide range of values. The gas levels obtained in a particular time slot are time-critical and time-sensitive, so the manual extraction of features increases the complexity of the task. Time-series data analysis is a complicated task that also affects the selection of target features, resulting in the degraded performance of a model [43][44][45].
Deep learning solves this problem by automatically selecting features at each layer, and these features are used for training purposes. The recurrent neural network (RNN) model provides a solution by adopting a network with a loop that maintains the information about previous events. In air quality monitoring, earlier levels of toxic gases play an important role in the decision-making process. Long short-term memory architecture is a specialized network based on the architecture of RNN. It can maintain the information in the long term to efficiently make a decision. The LSTM model has different layers that interact with each other in different manners by taking the decisions of previous blocks into account to forecast the next event. The forecasting of the air pollutant concentrations in a particular time slot is also necessary to avoid any incident caused by the increased concentration of an air pollutant. For this purpose, we first implemented a simple base model to make a future prediction at a specific time slot. Then, we utilized an LSTM model for the prediction of future levels of air pollutant concentrations present in the air.
The base model utilized a simple averaging technique to predict future concentration levels. The LSTM model consisted of an input layer, an output layer, and hidden layers. The input layer contained nodes for the gas input. These inputs were then fed to the network. The hidden layers shared information to predict the value of future instants. The output layer predicted the value of the next instant, which was verified upon receiving the next actual reading from the sensor. The difference between the actual and predicted values was observed to assess the accuracy of the system.
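A simple averaging baseline of the kind mentioned above could look like the following. This is a sketch under assumptions: the text does not say how many past readings the base model averages, so the `window` parameter (24 readings by default) and the function names are hypothetical.

```python
def average_forecast(history, window=24):
    """Predict the next reading as the mean of the last `window` readings.

    `window` is an assumed parameter; the source does not specify it.
    """
    if not history:
        raise ValueError("history must contain at least one reading")
    recent = history[-window:]
    return sum(recent) / len(recent)


def absolute_error(predicted, actual):
    """Difference between the prediction and the next actual sensor reading."""
    return abs(predicted - actual)
```

Comparing `absolute_error` for this baseline against the same measure for the LSTM gives a quick check of whether the more complex model is worth its cost.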
The concentration of air pollutants was obtained every 1 h, which means that we had 24 readings per day. We predicted a single value every hour; the model was trained on 720 instances, i.e., 1 month's readings (24 × 30), which were fed as inputs for the prediction of future values. This model utilized the following settings: batch size = 256; interval = 200; and epochs = 10. These parameters played an important role in the efficiency of the system. The number of epochs defined how many times the whole dataset was passed to the network to update the weights. As the number of epochs increased, the system went from underfitting to overfitting.
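To train a sequence model on a univariate hourly series, the readings are commonly rearranged into fixed-length input windows with the following reading as the target. The sketch below illustrates this for one month of hourly data; the function `make_windows`, the look-back length `n_lags=24`, and the synthetic series are assumptions, since the text does not state how the LSTM inputs were framed.

```python
def make_windows(series, n_lags):
    """Turn a univariate series into (inputs, targets) for one-step-ahead forecasting."""
    inputs, targets = [], []
    for i in range(len(series) - n_lags):
        inputs.append(series[i:i + n_lags])   # the previous n_lags readings
        targets.append(series[i + n_lags])    # the reading to be predicted
    return inputs, targets


# One month of hourly readings: 24 * 30 = 720 values (synthetic placeholder data).
hourly = [float(i % 24) for i in range(24 * 30)]
X, y = make_windows(hourly, n_lags=24)
```

With a 24-hour look-back, 720 readings yield 696 training pairs, because the first 24 readings have no complete history before them.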
The number of epochs was not the only significant factor: during training, the validation error and the training error are important for achieving a higher accuracy. The model is trained until its error stops decreasing. If the validation error starts increasing, it might be an indication that the neural network is overfitting. To avoid overfitting and underfitting in the neural network, experimentation was done by setting different values for the number of epochs. The neural network resulted in optimal fitting when the number of epochs was set to 10. This model was able to forecast univariate time series data specifying a particular concentration of air pollutants present in the air.

The Solution and Goal of this Research Project
The overall smart garbage collection system, consisting of both hardware and software solutions, is shown in Figure 2. The system consists of the following major modules.

Hardware Solutions
The smart bin consisted of several components, as described in Section 3. The basic model of the bin is shown in Figure 3. It contained various hardware modules for performing different types of functionality. The smart bin was powered by a solar plate mounted on top of the bin that charged a 12 V battery. The smart bin contained an HC-SR04 ultrasonic sensor interfaced at the top of the bin lid. The sonic waves emitted through the trig pin of the sensor were reflected by an item or object and received on the echo pin. After emitting the waves, the sensor shifted to receive mode. The time between emitting and receiving was proportional to the distance of the item or object from the sensor, and this time was used to calculate the level of trash present in the bin. A TGS2600 sensor was used as an air quality detector. The sensor could detect the level of hydrogen, ethanol, and carbon monoxide gases in the air surrounding the smart bin.
This sensor played an important role in the air quality feature of the smart bin. A QS-01 sensor was used to detect the odor caused by trash present in the bin. At the bottom of the bin, an HX711 load cell was attached to calculate the weight of the trash present in the bin. Internet connectivity was provided using a NodeMCU attached to a microcontroller. This module is a low-cost open-source platform for the IoT. It gave access to Wi-Fi or the internet.
All these sensors were interfaced with a single-chip Arduino microcontroller. Data obtained from these sensors were processed by the controller to show the level and weight of the bin on an LCD. The data from these sensors were then passed to the microcontroller, which sent the values to the IoT Google Firebase cloud server. The Firebase database interacted with the Wi-Fi NodeMCU and stored all these sensor values. The sensor data stored in Firebase were utilized by the GCP for predicting the status of the bin with the help of a trained model and for forecasting, with a trained LSTM model, the levels of air pollutants and toxic gases present within a specific range of the smart bin.
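The echo-time calculation described for the HC-SR04 can be illustrated offline as follows. This is a sketch under assumptions: the function names and the bin depth argument are hypothetical, the speed of sound is taken as roughly 343 m/s (air at about 20 °C), and the actual firmware would run on the microcontroller rather than in Python.

```python
SPEED_OF_SOUND_CM_PER_US = 0.0343  # about 343 m/s in air at roughly 20 degrees C


def echo_to_distance_cm(echo_time_us):
    """Convert a round-trip echo time (microseconds) to a one-way distance (cm)."""
    # The pulse travels to the trash surface and back, hence the division by 2.
    return echo_time_us * SPEED_OF_SOUND_CM_PER_US / 2.0


def fill_percent(distance_cm, bin_depth_cm):
    """Fill level as a percentage, given the lid-to-trash distance and bin depth."""
    # Clamp noisy readings to the physical range of the bin.
    distance_cm = min(max(distance_cm, 0.0), bin_depth_cm)
    return 100.0 * (bin_depth_cm - distance_cm) / bin_depth_cm
```

Because the sensor sits at the lid, a shorter measured distance corresponds to a fuller bin, which is why the distance is subtracted from the bin depth.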

Cloud Platform Architecture
Sensor data transmitted through the microcontroller were stored in the Firebase cloud server. Firebase is used as a communication server to send notifications to the Android application. Alert messages created by the system are transmitted to the Android application using Firebase cloud messaging services. The GCP was used for the training and testing of the model for every new value. The trained machine learning and LSTM models were used for the prediction of the bin status and for forecasting the concentration of air pollutants in a specific time slot. During system operation, when the concentration of a specific gas, e.g., CO, exceeds its regular range, alert messages advising the worker to take precautionary measures while collecting the garbage from the bin are transmitted to the worker. Figure 4 shows the interaction between the different modules of the SGCS.
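The alert rule for gas concentration can be reduced to a threshold check on the forecast value. The sketch below is illustrative only: the function `co_alert` and the message wording are hypothetical, the threshold is a caller-supplied parameter because the regular range is not given in this section, and delivery through Firebase cloud messaging is intentionally left out.

```python
def co_alert(forecast_ppm, threshold_ppm):
    """Return an alert message if the forecast CO level exceeds the threshold, else None."""
    if forecast_ppm > threshold_ppm:
        return (
            "CO forecast %.1f ppm exceeds %.1f ppm: take precautionary "
            "measures while collecting garbage from this bin."
            % (forecast_ppm, threshold_ppm)
        )
    return None
```

Keeping the rule as a pure function makes it easy to test separately from the messaging service that forwards the text to the worker's device.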

Web Application
The web application was developed using PHP, MySQL, and Ajax. The web portal was connected to the Firebase database to obtain real-time information regarding the status of trash and air quality in a specific area. The web application was designed specifically from the admin perspective and provides the administrator with different functionalities. During operation, an admin allocates regions to the specific sanitary worker who is responsible for collecting the garbage on time. Monthly reports are generated by the system that show the statistics of air pollution in a specific area, the timely collection of garbage, and other useful factors necessary for improving the lifestyle of the people. An admin can view the bin status, bin statistics, bin locations, and workers' locations at any moment. An admin can also see the details of workers and their assigned bins by clicking on "View Detail." Complaints made by users can be seen by an admin in a separate section. Management can track a worker and also check the efficiency of workers by generating reports based on the number of times that garbage overflow occurred in the worker's regions. Figure 5a shows the admin panel interface.

Android Application
The Android application was developed using the Java language. The app was connected to the Firebase database to promptly receive live updates of the smart bins. The Android application was explicitly designed for sanitation workers who are responsible for the collection of waste. During operation, a worker can check the status of bins by logging into the app. The worker receives pop-up notifications along with alert messages of levels 0, 1, and 2, which are represented by green, yellow, and red colors, respectively. Green represents a low level, yellow represents a medium level, and red means that the garbage needs immediate disposal. The worker can choose the optimized route between dustbins at different locations using the Google Maps Application Programming Interface (API) integrated with the app, which is a cost-efficient solution. Whenever hazardous gases are detected near a smart bin, an alert message is sent to the worker so that he/she can take precautionary measures for the disposal of toxic waste and avoid any incident. The worker can see the garbage statistics of his/her specific area. The live locations of dustbins, along with the odor, trash level, and weight of the trash, are also shown in the app. Workers can add a new dustbin to a specific region by specifying the Google Maps coordinates of the bin. The system automatically assigns a new ID to the newly added bin. Figure 5b shows the notifications received in the Android application.
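The three alert levels and their colors can be illustrated with a small mapping. This is only a sketch written in Python for brevity (the app itself is described as a Java application); the fill-fraction cut-offs `medium_at` and `high_at` are hypothetical values chosen for illustration, since the text does not state how levels are derived.

```python
from enum import IntEnum


class AlertLevel(IntEnum):
    """Alert levels 0, 1, and 2 as described for the worker notifications."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2


# Color shown to the worker for each alert level.
LEVEL_COLORS = {
    AlertLevel.LOW: "green",
    AlertLevel.MEDIUM: "yellow",
    AlertLevel.HIGH: "red",
}


def level_from_fill(fill_fraction: float, medium_at: float = 0.5, high_at: float = 0.9) -> AlertLevel:
    """Map a fill fraction in [0, 1] to an alert level (cut-offs are assumptions)."""
    if fill_fraction >= high_at:
        return AlertLevel.HIGH
    if fill_fraction >= medium_at:
        return AlertLevel.MEDIUM
    return AlertLevel.LOW
```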

Classification of Garbage Levels
The main objective of our system is to accurately detect the status of a bin, i.e., whether the bin is filled, half-filled, or unfilled. A missed bin status due to any issue is represented as a false negative (FN), and a wrong classification of the bin status is represented as a false positive (FP). The reliability of the system can be verified with the help of a machine learning model and a double check based on prior and posterior probabilities. If the sensors do not provide correct values, the machine learning model cannot predict the level of the bin and thus yields an unpredicted state.
To handle this situation, a probabilistic model determines the status of the bin as filled, half-filled, or unfilled based on previous data. When the notification for the collection of garbage is sent to the worker, if the prediction is false, it is counted as an FN. A true positive (TP) is recorded when the worker marks the prediction alert as true, i.e., when the bin actually needs disposal. In this system, a true negative (TN) is not important because it is recorded when the bin is unfilled.
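One way such a fallback could look is sketched below. The text does not specify the probabilistic model, so this example simply assumes that the status is chosen from smoothed frequencies of previously observed labels; the function name `fallback_status` and the smoothing constant `alpha` are illustrative.

```python
from collections import Counter

STATES = ("unfilled", "half-filled", "filled")


def fallback_status(history, alpha=1.0):
    """Pick the most probable bin status from past labels.

    Assumption: probabilities are Laplace-smoothed relative frequencies of the
    labels in `history`; this stands in for the unspecified model in the text.
    """
    counts = Counter(history)
    total = len(history) + alpha * len(STATES)
    probs = {state: (counts[state] + alpha) / total for state in STATES}
    # On ties, the first state in STATES wins.
    best = max(STATES, key=lambda state: probs[state])
    return best, probs
```

For example, `fallback_status(["filled", "filled", "half-filled"])` selects "filled" because it is the most frequent past label.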
The classifiers were analyzed using the following parameters:

Accuracy
Accuracy is the ratio of correctly predicted observations to the total number of observations. It is the ratio between the sum of the TP and TN values and the sum of the TP, TN, FP, and FN values. Its mathematical notation is:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (2)$$

Precision
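Precision is the ratio of correctly predicted positive observations to the total number of predicted positive observations. It is the ratio between the TP value and the sum of the TP and FP values. Its mathematical notation is:
$$\text{Precision} = \frac{TP}{TP + FP} \qquad (3)$$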

Recall
Recall is the ratio of correctly predicted positive observations to all observations in the actual class. It is the ratio between the TP value and the sum of the TP and FN values. Its mathematical notation is:
$$\text{Recall} = \frac{TP}{TP + FN} \qquad (4)$$
The results of the training and evaluation of different machine learning classifiers are shown in Figure 6. Accuracy, precision, and recall were generated using Equations (2)-(4).
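The three measures can be computed directly from the confusion counts. The sketch below follows the definitions given in the text; the function name and the counts used in the usage note are illustrative and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, and recall from confusion counts.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}
```

With made-up counts of TP = 40, TN = 45, FP = 5, and FN = 10, the accuracy is 85/100, the precision is 40/45, and the recall is 40/50.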
In this section, we first present the training and evaluation of the naive Bayes, multilayer perceptron, logistic regression, and KNN classifiers on our garbage dataset. All these classifiers predicted the status of the bin by matching the inputs with the corresponding labeled entries. The results are shown in Figure 6.
The trained model was then tested in a real-time environment by fetching data from the Firebase data server and predicting the bin status based on the trained model. The real-time streaming of trash level and weight for two smart bins is shown in Figure 7. In this way, our trained model was tested in real-time scenarios. The KNN model had a recall of 0.89, which meant that the model predicted the status of the bin well. The model also had a high precision, indicating fewer FPs, which meant that fewer wrong alerts were generated. The other three models had mostly the same results. The KNN performed well when used in real-time scenarios. The machine learning model is unable to handle faulty inputs and unknown states, so in those cases the alert message is produced by using the probabilistic model. In a real-time environment, the results showed that the KNN model made better predictions regarding the status of the bin, as shown in Figure 8.
The difference between KNN and logistic regression, as seen in Figure 8, was very small for accuracy. The results of predictive modeling with and without variable selections were measured. Both results seemed to be good; however, statistical significance is not established for the logistic regression. Statistical significance was observed only in KNN and seemed to be sensitive to the variable selection method.
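For readers unfamiliar with KNN, a minimal pure-Python sketch of the classification step is given below. It is not the model trained in this work: the two features (a normalized fill level and a normalized weight), the example rows in `TRAIN`, and the choice of `k = 3` are hypothetical, whereas the actual system used five features.

```python
import math
from collections import Counter

# Hypothetical training rows: ((normalized level, normalized weight), label).
TRAIN = [
    ((0.10, 0.05), "unfilled"),
    ((0.15, 0.10), "unfilled"),
    ((0.50, 0.45), "half-filled"),
    ((0.55, 0.50), "half-filled"),
    ((0.90, 0.85), "filled"),
    ((0.95, 0.90), "filled"),
]


def knn_predict(train, query, k=3):
    """Return the majority label among the k training rows closest to `query`."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

Because KNN relies on distances, the features are assumed to be scaled to comparable ranges before prediction.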

Monitoring of Air Pollutant
The monitoring and forecasting of air pollutants are among the major objectives of our proposed system. To achieve this objective, we utilized deep learning approaches to accurately forecast the future levels of a particular gas present in the air. Higher concentrations of CO cause severe health issues, as described in Table 2. Forecasting was done with the help of the LSTM model and compared with a univariate model, which was considered the base model. The dataset contained the hourly concentrations of the different gases. Five days of observations of gas levels were used to create a window of size 120 (24 × 5) to train the model.
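The windowing step can be sketched as follows. This is an illustrative helper, not the preprocessing code used in the study: the name `make_windows` is hypothetical, and it assumes that each training target is the single value `horizon` hours after the end of a 120-hour window.

```python
def make_windows(series, history=120, horizon=1):
    """Split an hourly series into (input window, target) pairs.

    Each input holds `history` consecutive values (120 = 24 x 5 hours);
    the target is the value `horizon` steps after the window ends.
    """
    inputs, targets = [], []
    for start in range(len(series) - history - horizon + 1):
        end = start + history
        inputs.append(series[start:end])
        targets.append(series[end + horizon - 1])
    return inputs, targets
```

Setting `horizon=1` corresponds to forecasting the next hour, and a larger value such as `horizon=12` shifts the target further into the future.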
The mean and standard deviation were calculated to perform the standardization of the dataset. Before training the model, we implemented a baseline model. For a given input point, the baseline method looked at all the history and predicted the next point to be the average of the last 20 observations. The prediction performed by the system is shown in Figure 9, where the blue line represents the previous instants, the model prediction is shown as the green circle, and the actual data received at that particular time slot are shown as the true future.
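Both steps are simple enough to sketch in a few lines. The helper names below are illustrative, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev


def standardize(series):
    """Return the z-scored series together with the mean and standard deviation used."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        # A constant series carries no spread; map every value to zero.
        return [0.0 for _ in series], mu, sigma
    return [(x - mu) / sigma for x in series], mu, sigma


def baseline_forecast(history, last_n=20):
    """Predict the next point as the average of the last `last_n` observations."""
    return mean(history[-last_n:])
```

In practice the mean and standard deviation would normally be computed on the training portion only, so that the test data do not influence the scaling.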
The baseline model prediction was not reliable because it simply utilized an averaging approach. The forecasting of future instance values was improved by utilizing a recurrent neural network (RNN). An RNN is a neural network that is suitable for time series data. It processes a time series step by step and maintains internal information in cells that is passed on to the next cells. We trained an LSTM model, a special type of RNN, to predict the concentration of the gas in the next 1 h. The results of the prediction performed by the model are shown in Figure 10. The predicted value is presented for Time Slot 0, representing the concentration of gas at the current time. It is obvious from the graph that there was only a minor difference between the actual and predicted values. The baseline model was not able to handle any change in future instances because it simply adopted the averaging approach. The LSTM model had a memory cell and an input-output gate structure. The memory cell was used to record information, and the input and output gates determined whether information was allowed to flow into or out of the memory cell. Due to these characteristics, it had a better performance in forecasting future instances compared to the baseline model. The configuration of the LSTM model can also be changed to forecast further into the future.
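The gate mechanism can be made concrete with a single step of a scalar LSTM cell. This sketch uses the standard LSTM formulation, which also includes a forget gate that the description above does not mention; the parameter layout `params[name] = (w_x, w_h, b)` is a hypothetical convention for this example and does not reflect the trained model.

```python
import math


def sigmoid(x):
    """Logistic function used by the gates."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """Run one time step of a scalar LSTM cell and return (h, c).

    `params` maps each of "input", "forget", "output", and "candidate"
    to a tuple (w_x, w_h, b) of input weight, recurrent weight, and bias.
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre_activation("input"))        # how much new information enters the cell
    f = sigmoid(pre_activation("forget"))       # how much of the old cell state is kept
    o = sigmoid(pre_activation("output"))       # how much of the cell state is exposed
    g = math.tanh(pre_activation("candidate"))  # candidate value to store
    c = f * c_prev + i * g                      # updated memory cell
    h = o * math.tanh(c)                        # hidden state passed to the next step
    return h, c
```

Processing a window then amounts to calling `lstm_step` once per hourly reading while carrying `h` and `c` forward from one step to the next.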
The LSTM model configuration was changed to train it for the prediction of gas levels in the next 12 h time slots. The model was trained on five days of hourly collected gas concentration data to predict the level of the gas in the next 12 h. The output of the model is shown in Figure 11. The model prediction was further away from instance zero, which means that it presented the future value of the gas level. The predicted future and true future overlap in the graph, which means that the accuracy of prediction was improved when the LSTM model was trained on a larger dataset.
The accuracy of the baseline, simple LSTM, and modified LSTM models was measured while testing them on the offline and real-time datasets. The accuracy of these three approaches is shown in Figure 12. It was observed that the modified LSTM model could achieve an accuracy of 0.90 while testing in real-time scenarios. The graph also shows that the simple baseline model achieved an accuracy of around 0.8 in real-time scenarios. The accuracy of the system depends on the concentration of gases and other external factors. The LSTM model could account for changes caused by these elements and achieved an improved accuracy of 0.9 in real-time scenarios. Figure 9 shows the concentration of CO using the baseline model. Figures 10 and 11 show the forecasting of CO concentration in the next 1 and 12 h, respectively.

Delay Graph
Delay is the time required for a bit of data to travel through a network from one point to another. In this project, our goal was for the system to perform its functions with the minimum possible time interval. The time required for data to move from source to destination in the proposed system is shown in Table 5. The sensor interval delay is defined as the interval after which a sensor value is sent to the cloud server. Approximately 8.25 s are required for the generation and transmission of the alert message to the worker. This delay could have been affected by network connectivity: the delay increased when an error occurred in the network connection. The proposed system delay calculation is shown in Table 5.

Discussion
The traditional garbage collection system is ineffective in terms of the monitoring and management of both waste and air quality at the same time. This research handled the aforementioned problems by utilizing a machine learning approach to determine the status of a bin and to forecast the concentration of toxic gas present in the air. This section discusses the findings and limitations of the proposed work by recapping the research questions.
RQ-1: What are the state-of-the-art approaches and techniques used for the monitoring and management of waste?
This research set out to monitor and manage waste and forecast the concentration of air pollutants (CO) in bin-surrounding environments. Studies have discussed the issues of the traditional garbage collection systems and the problems in existing systems. In existing state-of-the-art approaches, only a few attempts have been made to propose a method for the management of waste and the concurrent, real-time monitoring of waste and toxic gases. A comprehensive literature review was done to identify the limitations in the existing techniques; these are summarized in Table 1. Most researchers have utilized IoT-based models for the management and monitoring of waste. A few have utilized CNN-based approaches for waste classification. Different approaches have been utilized for the monitoring of air pollutants. Image- and sensor-based techniques have been adopted to monitor the quality of the air. From the literature review, we identified our problem statement: Currently, there is no system that utilizes machine learning approaches to provide waste management along with monitoring of the effect of toxic air pollutants present in the bin environment.
RQ-2: What are the effects and nature of toxic air pollutants present in waste?
Several studies have investigated the different types of pollutants present in waste. The types of these pollutants vary with the nature of the garbage, and they have after-effects on human health, e.g., lung cancer, lung emphysema, and neurological, cardiovascular, and respiratory diseases. In this research, an air pollutant (CO) was selected to monitor the air quality and forecast the next level of concentration in the air. The effects of CO gas on human health are discussed in Table 2, which was used to create alert messages. However, any other air pollutant, such as PM2.5 and PM10, could also be used as a benchmark to monitor air quality. The proposed work focused on monitoring and forecasting CO concentration due to its severe effects on human health. CO gas was chosen after considering the nature of the waste present in bins.
RQ-3: How can we analyze the current status of a bin to provide the real-time monitoring and collection of waste?
To analyze the current status of bins while also providing a solution for the monitoring and management of waste, an IoT-based module consisting of a level sensor, a weight sensor, an odor sensor, and an air sensor was placed on a bin. The data acquired from the bin were sent to the GCP server, where the necessary computation was performed. The status of the bin was predicted using an already-trained model that classified the bin's level as filled, half-filled, or unfilled. The analysis was performed using different machine learning classifiers. If the data acquired from the bin were not in a specific range, then the posterior and prior model was utilized to assign the bin status based on previous instances. The system accuracy was tested in both offline and online modes. In the offline mode, the KNN and logistic classifiers showed accuracies of 0.891 and 0.865, respectively. The trained model was then used to perform the prediction in real-time scenarios. The accuracy of the system was slightly reduced when the trained model was used for real-time prediction. There was a slight difference between the accuracy of the KNN and logistic classifiers. After performing the statistical test, the KNN results were found to be statistically significant because they were sensitive to dependent variables. An Android app was developed so that sanitary workers can see the live status of bins, along with alert notifications.
RQ-4: How can we forecast and monitor the concentration of air pollutants in the surroundings of a smart bin?
The monitoring of the air pollutant in the bin-surrounding environment was done by utilizing an LSTM model that forecasted the concentration of the CO present in the air. The data were acquired using a TGS2600 sensor that measured the CO concentration at the present instant. Initially, we implemented a baseline model that considered the past 20 instances and predicted the future instance value by averaging them. The efficiency of the baseline model seemed good but was not always reliable in real-time scenarios. To improve the accuracy of the system, a deep learning-based LSTM model was used. The LSTM model was modified to be trained on the last five days' readings to predict the concentration of CO in the next 12 h. The accuracy of the trained model was determined in both the offline and online modes. The system showed overall accuracies of 0.99 and 0.90 in the offline and online modes, respectively, for predicting the concentration in the next 12 h.

Conclusions
In the last few decades, we have seen increases in piles of garbage due to the rapid increase in the population. Municipal corporations often show negligence in the disposal of garbage, resulting in an increased concentration of toxic gases (CO) in bin-surrounding environments. Exposure to this gas for a long period has severe effects on human health. To improve the living standard of people, it is necessary to provide a mechanism that monitors and manages waste while forecasting air pollutants to avoid future negative incidents. A comprehensive literature review was performed to identify the pros and cons of existing solutions. The limitations of the traditional system were identified and addressed in the proposed work. An in-depth analysis of machine learning classifiers on real-time garbage datasets was performed to determine which model worked best in classifying bin status as filled, half-filled, or unfilled. The machine learning algorithms were trained by extracting five features as input. The logistic regression and KNN models showed recall values of 79% and 83%, respectively, in a real-time testing environment. An LSTM-based model was used for sensor time-series data that considered the previous entries to forecast the level of air pollutants at a particular time slot. The modified LSTM and simple LSTM models showed accuracy values of 90% and 88%, respectively, in predicting the future concentration of gases present in the air. The system provided the real-time monitoring of garbage levels along with notifications via an alert mechanism. The data from an air monitor, a distance sensor, a weight sensor, and an odor sensor were sent to the Firebase database. A GCP server extracted the various features and assigned a label to a particular bin with the already trained model. Posterior and prior probabilities were used as a double check to resolve ambiguity in the system.
The proposed work was found to provide an improved accuracy by utilizing machine learning, as compared to existing solutions based on simple approaches.
One of the next steps is to deploy our system in larger areas and then collect data for a long period of time. Currently, the machine learning model can easily classify bin status due to the fixed size of the bin. In the future, deep learning approaches can be used for classifying bin status. Currently, the system can predict a specific CO concentration level. In the future, the relationships between different air pollutants can be explored, and a mathematical model that considers the effect of a change in a single element on the different air pollutants found in the air can be developed.