Air Quality Modeling for Sustainable Clean Environment Using ANFIS and Machine Learning Approaches

Abstract: Air quality monitoring and assessment are essential for sustainable environmental protection. The monitoring process comprises data collection, evaluation, and decision-making. Several pollutants, such as SO2, CO, PM10, O3, NOx, and H2S, together with location and many other factors, have important effects on air quality. Air quality should be recorded and measured based on the total effect of pollutants, which is collectively described by a single numerical value. In Canada, the Air Quality Health Index (AQHI) is used, which is one numerical value based on the total effect of several concentrations. Because air pollutants are complex and ill-defined, evaluation is required, and several naive and novel approaches have been used to study the AQHI. In this study, three approaches, namely a hybrid data-driven ANN, nonlinear autoregressive with external (exogenous) input (NARX) with a neural network, and an adaptive neuro-fuzzy inference system (ANFIS), are used for estimating the air quality in an urban area (Jeddah city, industrial zone) for public health concerns. Over three years, 1771 data were collected for pollutants from 1 June 2016 until 30 September 2019. The Levenberg-Marquardt (LM) approach was employed as an optimization method for ANNs to solve the nonlinear least-squares problems. The NARX employed has a two-layer feed-forward ANN. On the other hand, the back-propagation multi-layer perceptron (BPMLP) algorithm was used with the steepest descent approach to reduce the root mean square error (RMSE). The RMSEs were 4.42, 0.0578, and 5.64 for ANN, NARX, and ANFIS, respectively. Essentially, all RMSEs are very small. The outcomes of the approaches were evaluated by fuzzy quality charts and compared statistically with the US-EPA air quality standards.
Due to the effectiveness and robustness of artificial intelligence techniques, early public warning becomes possible for avoiding the harmful effects of pollution inside urban areas, which may reduce respiratory and cardiovascular mortality. Consequently, the stability of the air quality models was correlated with the absolute air quality index. The findings showed notable performance of the NARX with a neural network, ANN, and ANFIS-based AQHI models for high-dimensional data assessment.


Introduction
One of the most critical factors that significantly affect climate change and human health is air pollution. Many countries have been using different systems for monitoring air pollution. Thus, this area of research is of interest and very active. Several naive modeling approaches have been presented in the literature, including hybrid approaches [1,2], a linear unbiased estimator [3], autoregressive integrated moving average (ARIMA) [4,5], bias adjustment [6], and the principal component regression approach. Similarly, non-parametric regression [7], artificial intelligence (AI) techniques, machine learning [8], neuro-fuzzy inference systems, and autoregression feedforward ANN with genetic algorithm [9] are novel air quality modeling and control approaches. Simulation and data mining are also well-known modeling tools and techniques for predicting and assessing air quality. In this context, Aggarwal et al. [10] and Bai et al. [11] have concentrated on the models used to explore and predict abnormalities in air quality. Deep learning applications (as a subset of machine learning) have recently shown considerable potential for investigating further aspects of the ecological dimensions [12-14]. A recent study by Sayeed et al. [15] proposed an AI model using deep convolutional ANNs to predict 24 h ozone concentration in Texas and compared the results for different periods of the year 2017. Munawar et al. [16] presented a case study of Lahore, Pakistan, for the prediction of an Air Quality Index (AQI) using a hybrid neuro-fuzzy inference approach. Rahman et al. [17] investigated soft computing applications in air quality modeling by reviewing and discussing neuro-fuzzy systems, fuzzy logic, deep learning, conventional and evolutionary ANNs, and many hybrid models. Hvidtfeldt et al. [18], Ansari and Ehrampoush [19], and Liu et al. [20] reported that exposure to pollutants causes different diseases such as respiratory diseases, asthma, type 2 diabetes, cancer, and allergies. Alimissis et al. [21], Cabaneros et al. [22], and Taylan [23] examined air quality models, which play crucial roles in evaluating air quality problems in the atmosphere. These models can show the health conditions in cities using domain knowledge and reliable, novel forecasting approaches. Their advantages are that they can provide early warning when effectively utilized and can substantially reduce the number of manual measurements needed for data acquisition. As a modeling approach, ANNs provide effective, flexible, and less assumption-dependent outcomes. They have adaptive properties and can be integrated with other modeling approaches to assess and control environmental systems. The integration of ANNs and fuzzy logic models, called neuro-fuzzy modeling, has received extensive attention in air quality modeling due to its adaptiveness and well-generalized performance. ANNs have been employed for modeling various air pollutants, including NOx, SOx, COx, and O3 [24], PM10 [25], daily precipitation and temperature using neuro-fuzzy networks [26], and PM2.5 [27] in different places all over the world. In this context, Grivas and Chaloulakou [28] used evolutionary computational algorithms such as ANNs in air quality modeling; similarly, they used genetic-algorithm-tuned ANN hybrid models for the hourly PM10 concentrations in Greece. In time series problems, NARX with a neural network can be used to predict the future values of a time series 'y(t)' using the past values of that time series and past values of a second time series 'x(t)'.
Similarly, as an evolutionary approach, fuzzy modeling can be used to deal with the vagueness and uncertainties of real-world problems using fuzzy 'If-Then' rules. A rule set is designed to control the possible relations between the input and output factors through a fuzzification process. Fuzzy modeling is a robust tool for solving complex engineering problems that are difficult to solve with traditional algebraic models. These modeling approaches encapsulate the vagueness of linguistic parameters and terms of qualitative factors. Jorquera et al. [29] demonstrated the usefulness of fuzzy logic modeling in predicting the maximum daily O3 concentration levels. Adaptive neuro-fuzzy and fuzzy logic approaches have been used to model concentrations of O3 and PM10. Ghoneim et al. [30] and Zhou et al. [31] employed deep learning and deep multi-output long short-term memory ANN models for determining air pollutant concentrations. Rybarczyk et al. [8] noted that only a few review articles discuss soft computing techniques in air quality modeling [32-34], and that these mostly cover a limited range of ANN and/or deep learning applications. Articles covering the whole spectrum of available soft computing techniques can rarely be found.
The state of air pollution is frequently expressed by the Air Quality Index (AQI). The AQI is extensively used for air quality assessment and management [35]. The US Environmental Protection Agency (US-EPA) and local authorities use the AQI to provide air quality information for a location and its impact on health [36]. High AQI values mean increased pollution and higher exposure of living things to health problems [37]. Sulfur dioxide (SO2, µg/m³), carbon monoxide (CO, mg/m³), particulate matter (PM10, µg/m³), ozone (O3, µg/m³), nitrogen oxide (NO, µg/m³), and hydrogen sulfide (H2S, µg/m³) are considered pollutants in the urban area. The AQI categories and their standard quality intervals are given in Table 1. These categories of AQI have been identified by fuzzy linguistic terms and their numerical intervals for air quality assessment. In this study, initially, statistical inference approaches were used to examine the underlying relationship between the pollutants and their impacts on the air quality index. Equation (1) presents the relationship between an air pollutant concentration and the AQI. The pollutant concentration in this equation is expressed as a ratio to the relevant standard, where c_i and s_i denote the pollutant concentration and the standard pollutant level, respectively.
In recent years, several studies were carried out to develop air quality prediction models for establishing ambient air quality standards. Numerous guidelines have been presented to set the level of air quality bounds on the emissions of pollutants [37]. On the other hand, determining and developing AQI limits using big data is very recent work. Attention was mainly given to soft computing techniques to obtain and evaluate the big data [38] for air quality models. Due to the size and complexity of big data in air quality systems, the need for soft computing approaches has increased considerably, particularly with the growing interest in early warning alerts and preventive actions when high concentrations of pollutants are observed [23]. Recently, several attempts have been made to investigate air quality using machine-learning and neuro-fuzzy (ANFIS) approaches and big data analytics [10,11,15,39-43]. The characteristics of modeling approaches require different types of data sets. For instance, ANNs and fuzzy systems are bidirectional and need numerical and linguistic data, which are broadly discussed in [44]. Similarly, fuzzy systems can organize, handle, and use vague, imprecise, and uncertain information to strike a balance among different and inconsistent observations, and can use subjective and qualitative information to model complex problems [45]. As seen in Table 1, linguistic terms are employed for air quality assessment together with numerical values. The numerical data show the upper and lower limits of the pollutant observations. Taylan et al. [46] used numerical data to train machine learning approaches and developed adaptive fuzzy models using symbolic qualitative and numerical data. Neuro-fuzzy systems integrate neural networks and fuzzy systems for developing models that have learning capabilities obtained through training processes. The goal of hybrid integration with big data is to form a more intelligent system for predicting and controlling air quality. However, hybrid neuro-fuzzy systems have rarely been applied in air quality prediction and control. These hybrid approaches can predict air quality, evaluate the findings, and provide online information. In case of unhealthy or hazardous conditions, local authorities can take immediate action more intelligently. In this study, the modeling method considers six major air pollutants as input parameters (SO2, CO, PM10, O3, NO, and H2S), and the output parameter is the AQI. For each parameter, 1771 data were obtained: 1065 data (60%) were used for training, 353 data (20%) were used for testing, and the remaining 353 data (20%) were employed for the validation of the model.
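The 1065/353/353 partition of the 1771 records can be illustrated with a minimal sketch. The text does not say how records were assigned to each subset, so the random shuffle, the seed, and the function name `split_indices` are assumptions made only for illustration; for a time-series model such as NARX, a chronological split may be the more appropriate choice.

```python
import random

def split_indices(n_samples, n_train, n_test, seed=0):
    """Shuffle sample indices and cut them into train/test/validation parts.

    Random assignment is an assumption; the source only reports the counts.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    validation = indices[n_train + n_test:]  # whatever remains
    return train, test, validation

# Counts reported in the text: 1771 records, 1065 training, 353 testing
train_idx, test_idx, val_idx = split_indices(1771, 1065, 353)
print(len(train_idx), len(test_idx), len(val_idx))  # 1065 353 353
```

The three index lists are disjoint by construction, so no record is used in more than one subset.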
The steps of the modeling approach are presented in detail in Section 2. The article is organized as follows: Section 2.1 describes the significant air pollution sources and their impacts on the air quality index. Section 2.2 gives the application of ANFIS in air quality modeling. The details of ANFIS modeling are presented in Section 2.3. Section 3 explains the machine learning approach for air quality estimation. Section 3.2 presents the nonlinear autoregressive with external (exogenous) input (NARX) and neural networks. The results and discussion are given in Section 4. Finally, the article ends with conclusions and references.

Major Sources of Air Pollution and Their Impacts on Air Quality
Several factors affect air pollution, such as dust storms, particulate matter, greenhouse gases, other gas emissions, urban growth, and transportation. Sulfur dioxide, nitrogen dioxide, and ozone cause declines in crop yields and affect human health [45]. Ozone, in turn, is produced by complex chemical reactions in the atmosphere [47]. The highest level of pollution occurs where pollutant concentrations are the greatest. The level of pollution allowed is given in Table 2, where air quality standards in Saudi Arabia, the Gulf countries, and the US-EPA [37] are presented. Ozone and sulfur dioxide are considered the leading causes of low crop yields because of the acidification of soils, lakes, and streams. When soils are acidified, acidity and toxic aluminum move from catchments into lakes and the sea, making them highly polluted. Excess nitrogen can acidify the soil, fertilize sensitive natural plant communities, and cause imbalances in ecosystems. Figure 1a-d illustrates the effects of sulfur dioxide (a), particulate matter (b), nitrogen oxide (c), and carbon monoxide (d) on air quality. This figure shows that an increase in sulfur dioxide and, particularly, nitrogen oxide raises the AQI, which means a high level of pollution and lower air quality. On the other hand, the effect of carbon monoxide is more complicated; this gas is a toxic air pollutant, mainly produced from vehicle emissions, and has health effects including weakness, vomiting, headaches, nausea, clouding of consciousness, and coma; at high concentrations and with long enough exposure, it may cause death. It also raises the AQI and reduces the air quality. However, this study aims to find out the cumulative effect of pollutants on air quality.
Air pollutants enter the human body mainly via the respiratory system. Ozone, NO, SO2, fine particulate matter, and dust can cause inflammation of the mucous membranes. These redden the eyes, inflame the pharynx and throat, affect lung function, and weaken the immune system, which eventually causes respiratory diseases. Several symptoms may occur as signs of extreme exposure, such as headaches, giddiness, nausea, and pounding of the heart. The US-EPA [37] standards were considered for the conversion of pollutants' data into the indexes. As shown in Table 1, when the AQI is between zero and 50, the air quality is good and poses little health concern for society. Conversely, a higher AQI means a high level of pollution, which is risky for public health.

Application of ANFIS for Air Quality Modeling
An ANFIS model designed with suitable input-output parameters can mimic a human expert's behavior in keeping the air quality within predefined limits. The model can use environmental data, produce suitable AQI outcomes, and inform authorities. An adaptive network consists of nodes connected by links, where each node executes a function on incoming signals from sensory information about pollutants to produce an output, and each link specifies the direction of signal flow between the nodes [48]. In a typical network, nodes represent mathematical functions that are modifiable through specified parameters. These parameters can affect the performance of the network and its functions. However, in this work, the mathematical functions are replaced with fuzzy rules. As shown in Figure 2, membership functions can take the place of mathematical equations and carry out their duties, which makes this approach unique and novel for air quality modeling. The complete fuzzy rule set, given below, is the backbone of the expert system. Figure 2 shows the architecture of the ANFIS model for the prediction of the air quality index. An ANFIS model consisting of fuzzy if-then rules (Rs) is a fundamental tool for assessing air quality. The input parameters are X_i = {x1: sulfur dioxide (SO2), x2: carbon monoxide (CO), x3: hydrogen sulfide (H2S), x4: ozone (O3), x5: nitrogen oxide (NOx), and x6: particulate matter (PM10)}. The output parameter is the air quality index (y_i; AQI). The rules consist of Gaussian MFs (µs) that depict the fuzzy linguistic terms (ϕs) and are presented in the rule set given below.
It is essential to mention that there are often uncontrollable and unavoidable causes of variation in air quality. Identifying these variations requires dealing with air quality characteristics using linguistic terms. Collecting numerical data about the air pollutants is essential, but such data will not be as meaningful as the linguistic terms used to identify the air quality parameters. Because crisp numbers cannot identify some parameters, fuzzy linguistic terms might be more suitable for dealing with them. For instance, air quality is a linguistic variable whose values might be linguistic terms such as 'good, healthy, unhealthy, very unhealthy, hazardous, etc.' Due to the imprecision and vagueness in these quality measures, a trend was initiated to integrate randomness and fuzziness for assessing environmental quality problems. In Figure 3a,c, the air quality index is plotted in three-dimensional (3D) graphs versus carbon monoxide and sulfur dioxide, and versus ozone and nitrogen oxide, respectively, for Jeddah. Figure 3b,d shows that a nonlinear relation appears clearly between the input parameters and the air quality index. The 3D plots are very helpful for observing the full view of the air quality index's output surface over the whole span of the input parameters. The 2D and 3D plots of the air quality index against regressors such as ozone, sulfur dioxide, carbon monoxide, and nitrogen oxide showed that the system was nonlinear and motivated the development of an intelligent approach to predict and control the air quality in a city.
The analysis of the 3D surfaces shows that many local maximum and minimum points appear in the responses of the given parameters. The maximum points reveal that a rise in the pollutant concentration increases the AQI and causes many negative effects. On the other hand, the local and global minimum points show where the AQI is low and the air quality is good and healthy. Hence, a highly nonlinear relation appears between the pollutants and the air quality index.

ANFIS Based Reasoning for Air Quality Prediction
The ANFIS model was established from six rules and the linguistic statements for air quality modeling and prediction. Fuzzy rules are used to map input parameters to the output. A fuzzy rule consists of a premise (assertion) part and a conclusion part, which include linguistic variables and their term sets. Clustering analysis was carried out, and the optimal number of clusters was found to be six, with a 99.9480% similarity level and a 0.00104 distance level between the clusters. Therefore, the number of rules was set equal to the number of clusters; each rule represents the characteristics of the data in a cluster for identifying the AQI. Due to the nonlinearity (see Figures 1 and 3), Gaussian membership functions (MFs) (see Figures 4 and 5) were employed for the fuzzy input sets and delta functions for the output spaces. In this study, center-average defuzzification and the product premise approach were employed for obtaining the AQI outcomes, as given in Equation (2).
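With Gaussian premise MFs, a product premise, delta-function consequents, and center-average defuzzification, the output of such a fuzzy system takes the standard form sketched below. This is a reconstruction consistent with the symbols used in this section rather than a verbatim copy of Equation (2); the superscript i for rules and the subscript j for inputs are notational assumptions.

```latex
\begin{equation}
f(x \mid \theta) =
\frac{\displaystyle\sum_{i=1}^{R} b_i \prod_{j=1}^{n}
      \exp\!\left(-\frac{1}{2}\left(\frac{x_j - c_j^{i}}{\sigma_j^{i}}\right)^{2}\right)}
     {\displaystyle\sum_{i=1}^{R} \prod_{j=1}^{n}
      \exp\!\left(-\frac{1}{2}\left(\frac{x_j - c_j^{i}}{\sigma_j^{i}}\right)^{2}\right)}
\tag{2}
\end{equation}
```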
In Equation (2), R represents the number of rules in the rule base and 'n' denotes the number of inputs per data tuple. θ is a vector that contains the MF parameters for the rule base, 'c_i' is the MF center, and 'σ_i' is the width of the MFs (µ_i(x)). Gaussian MFs are used for the rules' premises, and the delta function is used for the conclusion part. The coefficient b_i represents the point in the output space at which the output MF for the ith rule is a delta function, and the MF center denotes the point in the jth input universe of discourse where the MF for the ith rule achieves a maximum. It is essential to mention that the relative width of the jth input MF for the ith rule is always larger than zero. Fuzzy reasoning is the crucial factor in modeling with fuzzy set theory. For the prediction of air quality, the input membership functions, fact base, rule set, and inference engine are presented in Figure 4. The fuzzy rules, the reasoning process, and defuzzification are considered the pillars of the fuzzy inference system for obtaining the outcomes of the fuzzy model. Figure 4 shows the fuzzy reasoning procedure of the Sugeno fuzzy model [23] for predicting the air quality in Jeddah. If air pollution is considered as a space defined by the fuzzy set U, X_is are the fuzzy input parameters in this space, and Y_i is the fuzzy output parameter, then the input parameters of this work are X_is = {x1: sulfur dioxide (SO2), x2: carbon monoxide (CO), x3: hydrogen sulfide (H2S), x4: ozone (O3), x5: nitrogen oxide (NOx), and x6: particulate matter (PM10)}, and the output parameter is the air quality index (y_i; AQI), which can be used for neuro-fuzzy modeling. The fuzzy linguistic term set employed for this study is ϕs = {good, moderate, unhealthy, very unhealthy, and hazardous}. A fuzzy model is structured by a collection of fuzzy If-Then rules. The upper and lower limits of all input parameters and the output are presented in Figure 4. This figure also shows the fuzzy reasoning procedure. For instance, sulfur dioxide (SO2) ranges between 0 and 14 µg/m³, ozone (O3) between 29 and 119 µg/m³, particulate matter (PM10) between 11 and 113 µg/m³, and so on. The membership functions µ_i(x), i = 1, 2, ..., n, are always parametric functions used in the fuzzy model (see Figure 5a-d).
A big data set was used for the training, testing, and validation of the developed ANFIS model, which can cover the nonlinear functional dependency between the input and output parameters. The root-mean-square error (RMSE) was employed for error determination, in which 'o_i' and 'p_i' are the observed and predicted values of the AQI, respectively. Equation (4) gives this error measure for the ANFIS model developed for this study.
Figure 6 shows the relative error of the training and testing data determined for the developed ANFIS model. As seen in Figure 6, the relative errors were tolerable and the model checking performance was good. The average training error was found to be 4.42, and the RMSE for the training data set was calculated as 5.64. Essentially, both RMSEs were very small for the training and testing of the ANFIS model. Therefore, the developed ANFIS identified the essential components of the underlying dynamics. In the backpropagation learning algorithm, 'η' and 'µ' are used for 'speeding up' or 'slowing down' the error convergence and are set in the range between '0' and '1'. The performance of the ANFIS model is presented in Table 3. Whenever the errors exceeded the statistical standard (the 'd' value), the network was retrained with an increased number of epochs, and the process was repeated. The 'd' value is not a measure of correlation but rather a measure of the degree to which the model's predictions are free of error. It takes values between 0 and 1; a 'd' of '1' indicates perfect agreement between the observed and predicted values, whereas '0' means complete disagreement. The value of 'd' can be calculated as given in Equation (5), where ō represents the average of the observed data and 'p' is the predicted data.
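The two error measures discussed above can be sketched in a few lines. The sketch assumes that 'd' is Willmott's index of agreement, which matches the description given here (bounded by 0 and 1 and built on the observed mean); the function names and the sample values are hypothetical and are not taken from the study's data.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted AQI values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def index_of_agreement(observed, predicted):
    """Index of agreement d in [0, 1] (assumed Willmott form); 1 is perfect agreement."""
    o_mean = sum(observed) / len(observed)
    numerator = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    denominator = sum((abs(p - o_mean) + abs(o - o_mean)) ** 2
                      for o, p in zip(observed, predicted))
    if denominator == 0.0:
        # All observed and predicted values equal the observed mean
        return 1.0
    return 1.0 - numerator / denominator

# Hypothetical AQI values, for illustration only
observed = [42.0, 55.0, 61.0, 48.0, 70.0]
predicted = [40.0, 57.0, 60.0, 50.0, 68.0]
print(rmse(observed, predicted))
print(index_of_agreement(observed, predicted))
```

A retraining rule of the kind described in the text would compare the computed 'd' against a chosen threshold and continue training when the threshold is not met; the threshold itself is not stated in the source.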

Machine Learning Approach for Air Quality Estimation
ANNs are computing systems capable of deep learning and are made up of several highly interconnected elements for information processing. In this work, a backpropagation multilayer perceptron (BPMLP) algorithm was employed for estimating the air quality (y_i) level in Jeddah city. The BPMLP algorithm can perform a nonlinear mapping from a given set of input parameters, namely sulfur dioxide (SO2), carbon monoxide (CO), hydrogen sulfide (H2S), ozone (O3), nitrogen oxide (NOx), and particulate matter (PM10). The big data set was divided into suitable partitions for the training process; after fifteen iterations of training, as shown in Figure 7, and considering the distribution and allocation of weights, the minimum error was obtained using the mean square error. The nonlinear minimization problem was solved by the Levenberg-Marquardt (LM) algorithm. The steepest descent algorithm, also known as the error backpropagation (EBP) algorithm, is considered one of the most crucial parts of training the machine learning algorithm. However, its disadvantage is slow convergence, which can be significantly improved by applying the Gauss-Newton algorithm. To evaluate the curvature of the error surface, it is customary to use the second-order derivatives of the error function. The Gauss-Newton algorithm can then find suitable step sizes for each direction and reach convergence rapidly. As seen in Figure 7a, the error function seems to have a quadratic surface. In the initial iterations, the learning is weak (see Figure 7a), and the error rate is high. After some iterations (see Figure 7b), the algorithm converges quickly and directly; the learning level is then high, and the error rate is low. The LM algorithm integrates two minimization methods, the steepest descent method and the Gauss-Newton algorithm, for fitting the error curve. Combining these two algorithms reduces the variance by simultaneously updating the parameters in the steepest descent direction [49]. On the other hand, Figure 7c shows that no significant overfitting has occurred for the NARX with a neural network, as the training and validation errors decreased until the highlighted epoch. This approach showed markedly better performance than the others.
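The way LM blends steepest descent and Gauss-Newton can be illustrated with a minimal sketch on a toy curve-fitting problem. Everything below is an assumption made for illustration: the two-parameter model a·exp(b·x), the synthetic data, the factor-of-ten damping schedule, and all function names. It is not the network training procedure used in the study.

```python
import math

def residuals(params, xs, ys):
    """Errors e = model(x) - y for the toy model a * exp(b * x)."""
    a, b = params
    return [a * math.exp(b * x) - y for x, y in zip(xs, ys)]

def jacobian(params, xs):
    """Rows [de/da, de/db] of the Jacobian J, one row per sample."""
    a, b = params
    return [[math.exp(b * x), a * x * math.exp(b * x)] for x in xs]

def sse(params, xs, ys):
    """Sum of squared errors; an overflowing candidate counts as a failed step."""
    try:
        return sum(r * r for r in residuals(params, xs, ys))
    except OverflowError:
        return math.inf

def lm_step(params, xs, ys, delta):
    """One LM update: solve (J^T J + delta*I) s = J^T e and return params - s."""
    e = residuals(params, xs, ys)
    J = jacobian(params, xs)
    a11 = sum(r[0] * r[0] for r in J) + delta
    a12 = sum(r[0] * r[1] for r in J)
    a22 = sum(r[1] * r[1] for r in J) + delta
    g1 = sum(r[0] * ei for r, ei in zip(J, e))
    g2 = sum(r[1] * ei for r, ei in zip(J, e))
    det = a11 * a22 - a12 * a12  # positive because delta > 0
    s1 = (a22 * g1 - a12 * g2) / det
    s2 = (a11 * g2 - a12 * g1) / det
    return [params[0] - s1, params[1] - s2]

def fit(xs, ys, params, delta=1e-2, iterations=200):
    """Accept a step and shrink delta if the error drops; otherwise grow delta."""
    best = sse(params, xs, ys)
    for _ in range(iterations):
        candidate = lm_step(params, xs, ys, delta)
        candidate_sse = sse(candidate, xs, ys)
        if candidate_sse < best:
            params, best = candidate, candidate_sse
            delta /= 10.0                    # behave more like Gauss-Newton
        else:
            delta = min(delta * 10.0, 1e10)  # behave more like steepest descent
    return params

xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(0.8 * x) for x in xs]  # synthetic, noise-free data
a_fit, b_fit = fit(xs, ys, [1.0, 0.0])
print(a_fit, b_fit)
```

A large combination coefficient makes the step short and close to the negative gradient direction, while a small one recovers the Gauss-Newton step, which is the switching behavior the LM algorithm relies on.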

Levenberg-Marquardt (LM) Algorithm
To ensure that the approximated Hessian matrix J^T J is invertible, so that it can be used for multilayer network training, the LM algorithm introduces another approximation of the Hessian matrix (H); the terms of the Jacobian matrix are calculated as in the standard back-propagation algorithm. This approximation is presented in Equation (6).
In Equation (6), δ is the combination coefficient, which is always positive, and 'I' is the identity matrix. With this approximation, the elements on the main diagonal of the approximated Hessian matrix are greater than zero, so the matrix is always invertible. The weight update rule obtained with the approximated Hessian matrix of Equation (6) is presented in Equation (7).
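Assuming the standard LM formulation of [49], Equations (6) and (7) can be written as follows. The form is a reconstruction from the symbol definitions given in this section (δ, I, J, w_k, e_k), with k taken as the iteration index.

```latex
\begin{align}
H &\approx J^{T} J + \delta I \tag{6} \\
w_{k+1} &= w_k - \left(J_k^{T} J_k + \delta I\right)^{-1} J_k^{T} e_k \tag{7}
\end{align}
```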
As the LM algorithm integrates the steepest descent and Gauss-Newton algorithms, it switches between the two during the training process and gains the advantages of both. In Equation (7), w_k denotes the weight vector at iteration k, and e_k is the training error vector of the machine learning algorithm. 'J_k' is the Jacobian matrix, while 'J^T' is the transpose of the m × n Jacobian matrix [49]. When a very small (nearly zero) combination coefficient δ is selected, Equation (7) approaches the Gauss-Newton update, which is employed within the LM algorithm for training on the data obtained from the set of input parameters x1: sulfur dioxide (SO2), x2: carbon monoxide (CO), x3: hydrogen sulfide (H2S), x4: ozone (O3), x5: nitrogen oxide (NOx), and x6: particulate matter (PM10), with the AQI as the output parameter. As seen in Figure 8, with ANNs, two problems must be solved: the calculation of the Jacobian matrix and the organization of the training process. Consider a neuron 'n' with n_i inputs; if the neuron is in the first layer, all its inputs are connected to the network's input layer. Equation (8) was employed to calculate the air quality index given by neuron 'n' as the output of the ANN.
In Equation (8), f_n is the activation function of neuron n, and the net value 'net_n' is the weighted sum of the input nodes of neuron n, as presented in Equation (9).
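Under the definitions used in this section, the neuron output and its net value can be written in the following standard form. The symbol y_n for the neuron output and the bias weight w_{n,0} follow the usual convention and are assumptions about the notation of Equations (8) and (9).

```latex
\begin{align}
y_n &= f_n\!\left(\mathrm{net}_n\right) \tag{8} \\
\mathrm{net}_n &= \sum_{i=1}^{n_i} w_{n,i}\, y_{n,i} + w_{n,0} \tag{9}
\end{align}
```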
In Equation (9), y_{n,i} is the ith input node of neuron n, weighted by w_{n,i}, and w_{n,0} is the bias weight. When the training of the data set is completed, a high value of the correlation coefficient indicates that the data are highly correlated with the fit. It also shows that these parameters are significantly correlated, meaning that a change in one parameter will affect the other parameters. The histogram in Figure 9a depicts the difference between the data values and the curve fit for the ANN. The error histogram for the training, validation, and testing process of NARX with a neural network is presented in Figure 9b. These figures show that the curve-fit errors are normally distributed. In this study, redundant data were not used in the training process, as the ANN algorithm does not work well with redundant data. A multilayer perceptron (BPMLP) network with six inputs, eight processing units in the hidden layer, and one output parameter was considered for the training process. As seen in Figure 8, back-propagation was used for training the network together with the LM algorithm, which minimizes the divergence between the network outputs and the desired outputs. The outcomes predicted by the BPMLP algorithm were converted to air quality numerals, which are recorded in Table 4. During the training process, it was found that the solution improved as δ was decreased: the LM method approached the Gauss-Newton method, and the solution usually accelerated toward the local minimum [49]. The sum of squared errors (SSE) was employed to assess the training process. The SSE for all training patterns and network outputs was computed using Equation (10). The error rate is reasonable because redundant and noisy data were excluded during the training, testing, and validation process. For training, 60% of the data was used, 20% was used for testing, and 20% was used for validation. Excluding the outliers (the redundant data), the average absolute error was found to be 0.07147%, and the sum of the squared errors was found to be 0.0251%.
In Equation (10), w denotes the weight vector, and e_{p,m} refers to the training error of the machine learning algorithm at output m when pattern p is applied, as defined in Equation (11). Here, m represents the index of outputs, from 1 to M, where M is the number of outputs.
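A form of Equations (10) and (11) that is consistent with these definitions is sketched below. The number of patterns P and the factor 1/2 follow the usual convention of the LM literature [49] and are assumptions, not values confirmed by the text.

```latex
\begin{align}
E(w) &= \frac{1}{2} \sum_{p=1}^{P} \sum_{m=1}^{M} e_{p,m}^{2} \tag{10} \\
e_{p,m} &= d_{p,m} - o_{p,m} \tag{11}
\end{align}
```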
'd' determines the desired output vector for air quality index (AQI), the actual output vector for AQI is represented by 'o'.Considering the nodes and the links between the output node y j of a hidden neuron j and network output o m , a complex nonlinear relationship exists between the network parameters that can be defined simply by o m and f j , where o m is the mth actual output of the network representing the air quality.Figure 10a-d  Similarly, the value of R for testing is 0.97948 and 0.98103 for validation.The training process was initiated as shown in Figure 7a, and the final training was carried out after several training steps and illustrated in Figure 7b.The training, testing, and validation were converged at the three epochs with the validation performance of 92.3206.Thus, the result is acceptable since the final mean-square error and the absolute mean square errors are small, after several training steps, the error rates fell to 0.611236% and 0.080739%, respectively.It is also clear that the set errors of the training and testing have similar characteristics.For instance, no significant over-fitting has been obtained by iteration number thirteen, where the highest performance of the validation has occurred.

Nonlinear Autoregressive with External (Exogenous) Input (NARX)
In time-series problems, it is desired to predict future values of a time series 'y(t)' from past values of that time series and past values of a second time series 'x(t)'. This prediction approach is labeled NARX and can be presented as given in Equation (12):
y(t) = f(y(t − 1), ..., y(t − n), x(t − 1), ..., x(t − n)) (12)
where 't' is the time and 'n' is the number of past values used. The standard NARX network employed in this study is a two-layer feedforward ANN, with a sigmoid transfer function in the hidden layer and a linear transfer function in the output layer. 'y(t)', the output of the NARX network, is fed back to the input of the network, because 'y(t)' is a function of y(t − 1), y(t − 2), ..., y(t − n). NARX can be employed to predict future values of air quality, chemical processes, manufacturing systems, robotics, and aerospace vehicles based on several variables. It can also be used for system identification, in which models are developed to represent the dynamic behavior of systems. The outputs of the training, validation, and testing process of the ANNs are presented in Figure 12a. For the NARX with a neural network AQI prediction model, the autocorrelation function of the errors has one nonzero value, which occurs at zero lag; this value is the mean square error (MSE). In the case of AQI prediction, the correlations, except for the one at zero lag, fall approximately within the 95% confidence limits around zero, so the model seems to be adequate.
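The regressor structure behind a NARX model can be sketched as follows: each training input collects the last n values of the target series and of the exogenous series, and the target is the current value y(t). The function name `narx_regressors`, the single exogenous series, and the sample values are assumptions for illustration; in the study, x(t) would hold the six pollutant concentrations.

```python
def narx_regressors(y, x, n_lags):
    """Build (inputs, targets): each input row is y(t-1..t-n) followed by x(t-1..t-n)."""
    if len(y) != len(x):
        raise ValueError("y and x must have the same length")
    inputs, targets = [], []
    for t in range(n_lags, len(y)):
        past_y = [y[t - k] for k in range(1, n_lags + 1)]
        past_x = [x[t - k] for k in range(1, n_lags + 1)]
        inputs.append(past_y + past_x)
        targets.append(y[t])
    return inputs, targets

# Hypothetical series, for illustration only
y = [10.0, 11.0, 12.0, 13.0, 14.0]
x = [1.0, 2.0, 3.0, 4.0, 5.0]
inputs, targets = narx_regressors(y, x, n_lags=2)
print(inputs[0], targets[0])  # [11.0, 10.0, 2.0, 1.0] 12.0
```

The resulting rows can be fed to any feedforward regressor; the two-layer network with a sigmoid hidden layer and a linear output layer described above is one such choice.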
The training, testing, and validation of the ANFIS model converged at 60 epochs with a validation performance of 99.3206. The mean-square error and the absolute residual rate are small in this approach; after training, they fell to 0.611236% and 0.080739%, respectively. The training and testing errors have similar characteristics. The low errors were obtained mainly because no significant over-fitting was observed by iteration thirteen, where the best validation performance was observed. The NARX with a neural network showed much better performance for the same data set of independent regressors used for the ANN and ANFIS models. Hence, the prediction performance of the NARX with a neural network approach is higher, as seen in Figure 12a,b. The NARX model training, testing, and validation converged at 16 epochs (see Figure 7c) with a validation performance of 99. In this approach, the mean-square error and the absolute residual rate are smaller; after training, they were 0.334% and 0.0475%, respectively.

Results and Discussion
NARX with a neural network, ANFIS, and machine learning are closely related soft computing approaches to information processing and are capable of deep learning. They were applied to the big data of environmental systems using the BPMLP algorithm, a two-layer feedforward ANN, and the steepest descent approach to reduce the mean square error on the large training data set. The Levenberg-Marquardt (LM) [49] approach was employed as an optimization method for the ANNs, as a sub-technique of the machine learning approach, to handle the nonlinear relations among the pollutant parameters. The results were evaluated by fuzzy quality charts and compared statistically with the US-EPA air quality standards.
Environmental pollution, including air, water, and land pollution, is one of the most critical ecological issues. Emissions of sulfur dioxide and other pollutants are gradually rising as the number of industries grows [50]. Nitrogen oxides have been increasing in many locations. The most widespread air pollution in these areas is mainly caused by emissions from domestic industrial plants and transportation sources. Daily arithmetic averages of sulfur dioxide, carbon monoxide, hydrogen sulfide, ground-level ozone, nitrogen oxide, and particulate matter were collected from stations and used to model the air quality index.
Data accumulated over the last three years provided a large data set, which was substantial enough to train an ANFIS model. The AQI of each pollutant was calculated by Equation (1), and an air quality index was obtained for the cumulative effects of pollutants. Some gases are inert (like CO) and do not interact chemically with others; however, we consider the relations statistically and mathematically. This data set was then employed to train the NARX with a neural network, ANFIS, and ANN models to predict the pollutants' air quality index. The degree of inter-correlation between the pollutants shows that atmospheric pollution depends on various parameters; the relation of some pollutants with the AQI is given in Figure 14.
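As a rough illustration of a per-pollutant index, the sketch below uses the piecewise-linear interpolation between concentration breakpoints that the US-EPA AQI is generally computed with. Equation (1) is not shown in this section, so treating it as this interpolation is an assumption. The function name `sub_index` is hypothetical, and the two PM10 breakpoint rows are illustrative values that should be checked against the current US-EPA tables before any real use.

```python
def sub_index(conc, breakpoints):
    """Interpolate an index value for one pollutant.

    breakpoints is a list of (c_lo, c_hi, i_lo, i_hi) rows; the
    index is linear in concentration within each row.
    """
    for c_lo, c_hi, i_lo, i_hi in breakpoints:
        if c_lo <= conc <= c_hi:
            return (i_hi - i_lo) / (c_hi - c_lo) * (conc - c_lo) + i_lo
    raise ValueError("concentration is not covered by the breakpoint table")


# Illustrative rows only (assumed, not taken from the paper):
# concentration range in ug/m3 mapped to an index range.
PM10_BREAKPOINTS = [
    (0.0, 54.0, 0.0, 50.0),
    (55.0, 154.0, 51.0, 100.0),
]

print(sub_index(27.0, PM10_BREAKPOINTS))   # halfway through the first row
```

Concentrations that fall between two rows are deliberately rejected here; a production implementation would first round or truncate the measurement to the table's resolution.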
techniques on two entirely different series. Equation (13) shows the calculation of the mean absolute percentage error (MAPE), which was found to be 2.3747% for the AQI in this study.
On the other hand, the mean percentage error (MPE) was used to find the error in each period. It is computed by finding the actual residual value for each period, dividing it by the actual AQI value to obtain the percentage error, and finally averaging these percentage errors. The MPE is calculated by Equation (14) and was found to be 0.3423% for this study, which is close to zero.
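The two error measures can be sketched as follows, assuming the standard definitions (Equations (13) and (14) are not reproduced in this section). The sign convention for the MPE, actual minus predicted, is an assumption, and the sample values are made up for illustration.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(terms) / len(terms)


def mpe(actual, predicted):
    """Mean percentage error, in percent; signed, so errors can cancel."""
    terms = [(a - p) / a for a, p in zip(actual, predicted)]
    return 100.0 * sum(terms) / len(terms)


# Made-up AQI values, not taken from the study
actual = [50.0, 80.0, 100.0]
predicted = [55.0, 76.0, 100.0]

print(mape(actual, predicted))
print(mpe(actual, predicted))
```

Because the MPE keeps the sign of each residual, a value near zero indicates that over- and under-estimates roughly balance, while the MAPE measures their average size.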
Comparing the MAPE of 2.3747% with the RMSE of 5.64 suggests that the model can be used to forecast the air quality data. The small MPE of 0.3423% reveals that the technique is not biased: because the value is close to zero, the techniques do not consistently overestimate or underestimate the daily AQI. The actual AQI observations versus the outcomes of the ANN and ANFIS modeling approaches are given in Figure 15. The results clearly show that the outcomes of both models are close to the actual AQI values and that the air quality in Jeddah is good to moderate. There are some deviations during some periods, which might be caused by dust storms and particulate matter. The NARX with a neural network, ANN, and ANFIS models aim to construct an online and intelligent control strategy for air quality prediction. All methods produced robust outcomes. Table 4 presents the outcomes of the NARX, ANN, and ANFIS models for certain pollutants versus the observed air quality index. The average error was 0.00335, 0.10858, and 0.10362 for the NARX, ANFIS, and ANN models, respectively.
On the other hand, the optimal number of rules for the ANFIS model was found to be six for the available data set. Moreover, the findings showed that additional membership functions and rules did not improve the efficiency of the ANFIS model [52]. Therefore, as shown in Figure 4, six rules appear adequate to establish a rule-based ANFIS model for AQI prediction. Figure 5 depicts the fine-tuned MFs of the pollutants; bell-shaped Gaussian MFs were employed to determine the membership degrees. Gaussian MFs were chosen because the relations between the parameters are nonlinear.
Figure 6 shows the distribution of relative errors determined for training and testing of the ANFIS model developed for this study. The ANFIS model outcomes for certain degrees of pollutants are given in Table 4, which compares the AQI obtained from the ANFIS model with the observed AQI obtained from the US-EPA standard [37]. In this article, the back-propagation multilayer perceptron (BPMLP) algorithm was employed to perform nonlinear mapping of parameters. The BPMLP algorithm used the Levenberg-Marquardt (LM) approach as an optimization method for solving a nonlinear least-squares problem. Figure 7a,b show the initial and final training process, respectively. Similarly, Figure 7c shows the overfitting of the NARX with a neural network for training and validation error.
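To make the role of the LM approach concrete, the following is a minimal sketch of a Levenberg-Marquardt loop on a toy two-parameter curve fit. It is not the network training used in the study: the model a·exp(b·x), the starting point, the damping schedule, and all names are assumptions chosen only to show how the damped normal equations solve a nonlinear least-squares problem.

```python
import math


def cost(xs, ys, a, b):
    """Sum of squared residuals for the toy model a * exp(b * x)."""
    return sum((y - a * math.exp(b * x)) ** 2 for x, y in zip(xs, ys))


def levenberg_marquardt(xs, ys, a, b, lam=1e-3, max_iter=200):
    current = cost(xs, ys, a, b)
    for _ in range(max_iter):
        # Accumulate J^T J and J^T r for the two parameters.
        jtj_aa = jtj_ab = jtj_bb = g_a = g_b = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            r = y - a * e                 # residual
            ja, jb = e, a * x * e         # partial derivatives of the model
            jtj_aa += ja * ja
            jtj_ab += ja * jb
            jtj_bb += jb * jb
            g_a += ja * r
            g_b += jb * r
        # Damped normal equations: (J^T J + lam * diag(J^T J)) d = J^T r
        m_aa = jtj_aa * (1.0 + lam)
        m_bb = jtj_bb * (1.0 + lam)
        det = m_aa * m_bb - jtj_ab * jtj_ab
        if det == 0.0:
            break
        da = (m_bb * g_a - jtj_ab * g_b) / det
        db = (m_aa * g_b - jtj_ab * g_a) / det
        trial = cost(xs, ys, a + da, b + db)
        if trial < current:
            # Accept the step and move toward Gauss-Newton behavior.
            a, b, current = a + da, b + db, trial
            lam /= 10.0
        else:
            # Reject the step and move toward gradient-descent behavior.
            lam *= 10.0
        if abs(da) + abs(db) < 1e-12:
            break
    return a, b


# Noise-free toy data generated from a = 2, b = -1
xs = [0.25 * i for i in range(9)]
ys = [2.0 * math.exp(-1.0 * x) for x in xs]
a_fit, b_fit = levenberg_marquardt(xs, ys, a=1.0, b=0.0)
print(a_fit, b_fit)
```

The damping factor interpolates between steepest descent (large values) and Gauss-Newton (small values), which is why LM is a common choice for training small networks on least-squares objectives.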
The training process was carried out successfully because the mean-square error and the absolute mean square error were low, at 0.611236% and 0.080739%, respectively. Similarly, Figure 10a-d shows the correlation coefficient (R) between targets and outputs for training, validation, testing, and the whole data set. Fuzzy quality charts with linguistic terms were employed along with the US-EPA categories for the air quality index (AQI) to evaluate the air quality in Jeddah. The ANFIS and ANN are more reliable and practical approaches for observing the air quality online, and they add more flexibility than the crisp offline assessment of air quality. For an overall quality assessment, an AQI between 0 and 50 is defined as good air quality, and an AQI between 51 and 100 as moderate. If the AQI is above 100, the quality is poor and unhealthy, and sensitive groups are affected. A higher AQI (above 300) is hazardous and affects people's respiratory systems. The EPA [37] standards for air quality have been established to prevent several harmful effects of pollutants.
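The crisp category boundaries stated above can be written as a small lookup. This sketch uses only the thresholds given in the text (50, 100, and 300); the function name and the category labels are hypothetical, and the finer US-EPA sub-categories between 100 and 300 are intentionally not modeled.

```python
def aqi_category(aqi):
    """Map an AQI value to the coarse categories described in the text."""
    if aqi < 0:
        raise ValueError("AQI cannot be negative")
    if aqi <= 50:
        return "good"
    if aqi <= 100:
        return "moderate"
    if aqi <= 300:
        return "unhealthy"
    return "hazardous"


for value in (35, 72, 140, 320):
    print(value, aqi_category(value))
```

A fuzzy assessment replaces these hard cut-offs with membership degrees, so a value such as 51 belongs partly to "good" and partly to "moderate" rather than switching category abruptly.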

Conclusions
The prominent prediction techniques fall into two broad categories, namely soft computing and statistical techniques. ARIMA (also known as Box-Jenkins) and other traditional techniques are commonly regarded as the most efficient forecasting techniques in social science and are widely used for time series [53]. This study aims to predict air quality and its distribution using soft computing techniques, such as the adaptive neuro-fuzzy inference system (ANFIS), and the NARX with a neural network and ANNs as machine learning approaches. The methods proposed in this work are practical, robust, and capable of estimating the cumulative effect of pollutants inside urban areas to reduce respiratory and cardiovascular mortalities. The findings showed the remarkable performance of NARX, ANFIS, and ANN-based air quality models for high-dimensional data assessment. As a statistical approach, ARIMA is limited in forecasting time series under uncertainty because it does not assume knowledge of any underlying model or input parameters, as soft computing methods do [54]. The conventional techniques for time-series prediction, such as ARIMA, SARIMA, and many others, assume that the time series are generated from linear processes; therefore, the outcomes may be inappropriate for most nonlinear real-world problems [55]. In contrast, soft computing techniques are data-driven, self-adaptive intelligent approaches to prediction that can generalize from the results obtained from the original data. Additionally, machine learning approaches are universal approximators, as an ANN can effectively approximate a continuous function to the desired accuracy level [53]. Although the literature presents different views on the relative superiority and performance of ANN and ARIMA approaches for prediction, further studies are needed to reach a unified, coherent view of these methodologies for better applications.
When AQI values increase, people may experience several symptoms of health concern [37]. The outcomes of the air quality models were found to be meaningful for warning the public early when an unhealthy situation is encountered. Air pollution management involves capacity building and monitoring through ground-based networks and systems for appropriate strategic and operational decision-making. Implementing these strategies requires quality control and assurance, modeling approaches, and institutional capabilities. Therefore, local and global environmental policymakers can consider the presented methodologies and findings suitable, reliable, and useful for air quality assessment and management. Consequently, the stability of air quality was correlated with the absolute air quality index using soft computing techniques.

Figure 3. (a-d) The impacts of pollutants on the air quality index.

Figure 4. Fuzzy reasoning procedure for predicting air quality.

Figure 5 depicts the MFs and their term sets for sulfur dioxide (a), ozone (b), nitrogen oxide (c), and carbon monoxide (d) in Jeddah, respectively.
Rule 1: IF (SO2) is low and (CO) is low and (H2S) is low and (O3) is low and (NO) is low and (PM10) is low THEN the air quality is good.
Rule 2: IF (SO2) is low and (CO) is normal and (H2S) is normal and (O3) is low and (NO) is normal and (PM10) is normal THEN the air quality is good.
Rule 3: IF (SO2) is high and (CO) is normal and (H2S) is normal and (O3) is low and (NO) is normal and (PM10) is very high THEN the air quality is normal.
Rule 4: IF (SO2) is low and (CO) is high and (H2S) is high and (O3) is very low and (NO) is high and (PM10) is very low THEN the air quality is unhealthy.
Rule 5: IF (SO2) is normal and (CO) is high and (H2S) is high and (O3) is very high and (NO) is high and (PM10) is high THEN the air quality is unhealthy.
Rule 6: IF (SO2) is very low and (CO) is very high and (H2S) is very high and (O3) is high and (NO) is very high and (PM10) is low THEN the air quality is hazardous.

Figure 5. The MFs and their term sets for sulfur dioxide (a), ozone (b), nitrogen oxide (c), and carbon monoxide (d).
Appropriate separation of the fuzzy input and output data spaces and a correct choice of MFs are essential to obtain a useful ANFIS model for the AQI. The MFs and the fuzzy term sets of all variables are determined based on domain knowledge of the system parameters considered. The Gaussian MFs are identified by two parameters (c, σ), where 'c' denotes the center of the MF and 'σ' represents its width. Figure 5c shows the Gaussian MFs for 'nitrogen oxide', with the fuzzy term set 'very high' representing the MFs. Some other fuzzy variables and their MFs are presented in Figure 5. For example, the MF of 'nitrogen oxide' for the fuzzy term 'very high' is mathematically presented as given in Equation (3).
$$\mathrm{gaussian}(x, c, \sigma) = e^{-\frac{1}{2}\left(\frac{x - c}{\sigma}\right)^{2}} \qquad (3)$$
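A brief sketch may help connect the Gaussian MF of Equation (3) to the IF-THEN rules listed earlier. It evaluates the firing strength of Rule 1 by multiplying the membership degrees of its six antecedents. The product operator, the normalized 0 to 1 input scale, the term centers in `TERMS`, and the shared width `SIGMA` are all assumptions made for illustration; the fitted MF parameters of the study are not given in this section.

```python
import math


def gaussian(x, c, sigma):
    """Gaussian membership function with center c and width sigma."""
    return math.exp(-0.5 * ((x - c) / sigma) ** 2)


# Hypothetical term centers on a normalized 0-1 scale (assumed values)
TERMS = {"very low": 0.0, "low": 0.25, "normal": 0.5,
         "high": 0.75, "very high": 1.0}
SIGMA = 0.12  # assumed common width for every term

# Rule 1 from the rule base: every pollutant is "low" -> air quality is good
RULE_1 = {"SO2": "low", "CO": "low", "H2S": "low",
          "O3": "low", "NO": "low", "PM10": "low"}


def firing_strength(rule, reading):
    """Product of the membership degrees of all antecedents of one rule."""
    strength = 1.0
    for pollutant, term in rule.items():
        strength *= gaussian(reading[pollutant], TERMS[term], SIGMA)
    return strength


all_low = {name: 0.25 for name in RULE_1}   # matches Rule 1 exactly
dusty = dict(all_low, PM10=0.75)            # PM10 is "high" instead

print(firing_strength(RULE_1, all_low))
print(firing_strength(RULE_1, dusty))
```

When every input sits at the center of its "low" term the rule fires fully, and a single mismatched pollutant pulls the firing strength close to zero, which is how one dominant pollutant can switch the active rule.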

Figure 6. The training and checking errors determined for the ANFIS model.

Figure 8. The architecture of the artificial neural network used for air quality index estimation.

Figure 9. The curve fit for the training, validation, and testing process of the ANN (a) and the NARX with a neural network (b).
Figure 10a-d depicts the targets and outputs for training (a), validation (b), testing (c), and the whole process (d), with the correlation coefficient (R). The value of R is close to 1 for training and 0.91227 for validation of the data.

Figure 10. The targets and outputs for training (a), validation (b), testing (c), and the whole process (d), with the correlation coefficient (R).

Figure 11a,b show the histogram of the error distribution, and Figure 11c shows the residual, for the initial and final training stages of the machine learning approach, respectively.

Figure 11. (a,b) The histogram of the error distribution and (c) the residual for the initial and final training stages of the machine learning approach.
Figure 12b plots the root mean square error (RMSE) of the training, validation, and testing process of the ANNs.

Figure 12. (a) The outputs of the training, validation, and testing process of the NARX with a neural network; (b) the root mean square error (RMSE).

Figure 13 displays the error autocorrelation function. It describes how the prediction errors are related in time.
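The adequacy check based on the error autocorrelation function can be sketched as below. The sketch normalizes the autocorrelation so that the zero-lag value equals 1 (the unnormalized zero-lag value is the mean square error when the errors have zero mean) and uses the common large-sample bound ±1.96/√N for the 95% confidence limits. The bound, the function names, and the error values are assumptions for illustration; they are not taken from the study.

```python
import math


def autocorrelation(errors, max_lag):
    """Normalized sample autocorrelation of the errors for lags 0..max_lag."""
    n = len(errors)
    mean = sum(errors) / n
    centered = [e - mean for e in errors]
    var = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t + k] for t in range(n - k)) / var
        for k in range(max_lag + 1)
    ]


def confidence_bound(n):
    """Approximate 95% limit for the autocorrelation of white noise."""
    return 1.96 / math.sqrt(n)


# Made-up prediction errors (not from the study)
errors = [0.5, -0.3, 0.1, -0.4, 0.2, 0.3, -0.2, -0.1, 0.4, -0.5]
acf = autocorrelation(errors, max_lag=3)
bound = confidence_bound(len(errors))

for lag, value in enumerate(acf):
    flag = "inside" if lag == 0 or abs(value) <= bound else "outside"
    print(lag, round(value, 3), flag)
```

If the values at nonzero lags stay inside the limits, the errors behave like white noise and the model has captured the time structure of the series.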

Figure 13. The error autocorrelation function for AQI prediction.

Figure 15. Air quality index observed vs. the outcomes of the NARX, ANN, and ANFIS approaches.
Figure 10a-d shows the correlation coefficient (R) for training (a), validation (b), and testing (c); R is 1 for training, validation, and testing. The ANN has a similar capability for the same data set of independent regressors used for the ANFIS model training process. The low errors were obtained mainly because no significant overfitting was observed by iteration thirteen, where the best validation performance was observed. Figure 11a,b show the histogram of the error distribution, and Figure 11c shows the residual, for the initial and final training stages of the ANN, respectively. Convergence was observed between the three parameters; hence the training process was ended. Because the cumulative effect of quality parameters in pollution issues is not well identified, this work follows a novel trend that combines randomness and fuzziness in evaluating the environmental quality problem of air pollution. Quality assessment in fuzzy sets expresses the quality level of air by membership degrees. The scatter plot of 100 principal component outcomes of the AQI obtained for the ANN, ANFIS, and NARX models is illustrated in Figure 16a-c, respectively.

Figure 16a-c shows the fuzzy quality assessment of the AQI by the ANN (a), ANFIS (b), and NARX (c) models with numerical values, respectively. The fuzzy quality charts with linguistic terms were employed along with the US Environmental Protection Agency (US-EPA) categories for the AQI.

Table 2. Air quality standards in Saudi Arabia, Gulf countries, and the US-EPA.

Table 3. The parameters for determining the strength of the ANFIS model.

Table 4. AQI outputs of the ANN, ANFIS, and NARX models for certain parameters.