Air Pollutant Analysis and AQI Prediction Based on GRA and Improved SOA-SVR by Considering COVID-19

Since the outbreak of COVID-19 pneumonia, the Chinese government has taken a series of measures to control the spread of the epidemic, and as a result the air quality of Taiyuan in February 2020 was significantly better than during the same period in previous years. In this paper, the Grey Relational Analysis (GRA) method was first applied to evaluate and analyze the influence of six major pollutants on air quality. Then, an improved seagull optimization algorithm (ISOA) was proposed and combined with Support Vector Regression (SVR) to establish a hybrid prediction model, ISOA-SVR. Finally, the proposed ISOA-SVR was used to predict the air quality index (AQI). The experimental results on two different kinds of data showed that the proposed ISOA-SVR had better generalization ability and robustness than the other prediction models, and that it is suitable for the prediction of AQI.


Introduction
With the concentration of urban population and industrial growth, environmental pollution problems are arising. Previous research showed that air pollution has a direct impact on human health. It is estimated that 1.22 million people are killed every year by haze in China, accounting for about 15% of annual deaths [1]. According to the data of the World Health Organization in 2018, air pollution causes seven million people to die of cancer, stroke, heart disease and respiratory diseases every year. Air pollution has become the number one killer of human beings. In 2019, the World Health Organization listed air pollution as one of the top ten threats to global health [2].
Air quality prediction helps to prevent air pollution from causing damage, and it can guide government departments in taking preventive measures against serious air pollution, such as suspending production at heavily polluting enterprises and restricting traffic. At the same time, it can provide travel suggestions for people's outdoor activities. Therefore, timely and reliable air quality forecasting is necessary for the prevention and control of air pollution. The air quality index (AQI) is an important index for reflecting and evaluating air quality [3]. According to China's air quality standard, AQI is obtained from the concentrations of six main pollutants: fine particulate matter (PM2.5), inhalable particles (PM10), sulfur dioxide (SO2), carbon monoxide (CO), nitrogen dioxide (NO2) and ozone (O3) [4]. The AQI value reflects the level of air quality: a low AQI value means that the air quality is good and conducive to travel, whereas a high AQI value indicates that the air pollution is serious.
In the face of increasingly serious air pollution, air quality has attracted more and more attention. An autoregressive integrated moving average (ARIMA) model was used to predict the monthly value of the air pollution index and the daily value of AQI in [5,6]. However, if the sequence is nonlinear or irregular, ARIMA may not provide reliable prediction results. Compared with the ARIMA model, Support Vector Regression (SVR) [7,8]

Grey Relational Analysis (GRA)
Grey relational analysis (GRA) measures the degree of relation between factors according to how similar or different their development trends are [24]. If the trends of two factors are consistent, that is, their degree of synchronous change is high, the relational degree between the two factors is high; otherwise it is low. The specific steps of the grey relational method are as follows:
Step 1: Determine the analysis sequence.
The reference sequence reflecting the characteristics of system behavior and the data sequences of the factors influencing system behavior are determined.
Step 2: Make the data dimensionless.
Because the factor columns in the system may have different dimensions, they are not convenient to compare directly, and a direct comparison may give incorrect results. Therefore, to ensure the reliability of the results, dimensionless processing should be carried out first in the grey relational analysis.
Step 3: Calculate the relational coefficient.
The relational coefficient of comparison sequence x_i (i = 1, 2, ..., n) to reference sequence x_0 at time t is as follows:
$$\xi_i(t) = \frac{\min_i \min_t |x_0(t) - x_i(t)| + \rho \max_i \max_t |x_0(t) - x_i(t)|}{|x_0(t) - x_i(t)| + \rho \max_i \max_t |x_0(t) - x_i(t)|}$$
where ρ is the resolution coefficient, usually a number in (0, 1).
Step 4: Calculate the relational degree.
The relational coefficient gives the relation between the comparison sequence and the reference sequence at each time (i.e., at each point in the curve), so there is more than one value and the information is too scattered for an overall comparison. Therefore, the relational coefficients of all times are concentrated into one value; that is, their average is taken as the quantitative expression of the relational degree between the comparison sequence and the reference sequence. The relational degree is calculated as follows:
$$r_i = \frac{1}{N} \sum_{t=1}^{N} \xi_i(t)$$
where N is the number of time points in each sequence.
Step 5: Rank the relational degrees. The larger the value, the stronger the relation.
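The five steps can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the widely used form of the grey relational coefficient with the minimum and maximum taken over all factors and times, mean-value scaling for the dimensionless step, and ρ = 0.5 as an illustrative choice. The function name and the small data set are hypothetical.

```python
import numpy as np

def grey_relational_degrees(reference, factors, rho=0.5):
    """Grey relational degree of each factor column to the reference series.

    reference: sequence of length N (e.g. AQI over N days)
    factors:   array of shape (N, n_factors) (e.g. pollutant concentrations)
    rho:       resolution coefficient in (0, 1); 0.5 is an assumed default
    """
    x0 = np.asarray(reference, dtype=float)
    xs = np.asarray(factors, dtype=float)
    # Step 2: dimensionless processing by dividing each series by its mean.
    x0 = x0 / x0.mean()
    xs = xs / xs.mean(axis=0)
    # Step 3: absolute differences and relational coefficients.
    delta = np.abs(xs - x0[:, None])
    d_min, d_max = delta.min(), delta.max()
    xi = (d_min + rho * d_max) / (delta + rho * d_max)
    # Step 4: average the coefficients over time for each factor.
    return xi.mean(axis=0)

# Made-up numbers for illustration only (not data from the study).
aqi = [100.0, 120.0, 80.0, 150.0]
pollutants = np.array([
    [50.0, 30.0],
    [60.0, 28.0],
    [40.0, 35.0],
    [75.0, 25.0],
])
degrees = grey_relational_degrees(aqi, pollutants)
# Step 5: rank factors from strongest to weakest relation.
ranking = np.argsort(-degrees)
```

In this toy example the first factor is exactly proportional to the reference series, so its relational degree is 1 and it ranks first.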

Support Vector Regression (SVR)
Support Vector Regression (SVR) is a support vector algorithm for regression problems [7]. The goal of SVR is to find an optimal hyperplane such that the deviations between the training data and the hyperplane are not greater than ε. In other words, it minimizes the distance between the support vectors on both sides of the hyperplane. A schematic diagram of SVR is shown in Figure 1. By training the SVR, a regression function f(x) is obtained whose deviation from the actual value y is not more than ε for each training vector x. The details of SVR are presented in reference [8]. The regression function f(x) is expressed in terms of the input feature vector x, the bias term b, the Lagrange multipliers α_i^* and α_i, and a kernel function K(x, x_i) such as the radial basis kernel function.
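For reference, the textbook ε-SVR regression function and the radial basis kernel are usually written as below. This is offered as an assumption about the intended form: the sign convention for the multipliers varies between sources, and the symbol g for the kernel width is chosen here only to match the width parameter g that ISOA tunes later in the paper.

```latex
f(x) = \sum_{i=1}^{n} \left(\alpha_i - \alpha_i^{*}\right) K(x, x_i) + b,
\qquad
K(x, x_i) = \exp\left(-g \,\lVert x - x_i \rVert^{2}\right)
```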

Seagull Optimization Algorithm (SOA)
The seagull optimization algorithm (SOA) is a swarm intelligence optimization algorithm proposed by Dhiman and Kumar in 2019 [25]. In this algorithm, the migration of seagulls and the attack behavior in the migration process are regarded as the optimization process of the optimization problem, and the location of seagulls in the algorithm is the potential solution of the optimization problem. The migration and attack patterns of seagulls are shown in Figure 2.

Migration behavior of seagulls (global search)
In this part, the algorithm simulates how a group of seagulls moves from one position to another. During migration, the movement of the seagulls satisfies the following three conditions:
(1). Collision avoidance. An additional variable A is employed in the calculation of the new seagull position to avoid collisions between neighboring seagulls.
Here C is the new position that does not collide with other seagulls, P is the current position of the seagull, and t and T are the current iteration and the maximum number of iterations, respectively. A represents the movement behavior of seagulls in the given search space. The constant a controls the range of A; when a = 2, A decreases linearly from 2 to 0.
(2). Movement toward the global best position. After avoiding overlap with other seagulls, the seagulls move toward the global best position.
Here M is the step size for moving from the current position to the global best position and P_best is the global best position. The control parameter B balances global search and local search, and b_rand is a random number in the range (0, 1).
(3). Moving distance. Once the seagull has reached a position where it does not collide with other seagulls, it moves toward the global best position. D is the distance that the seagull moves from the current position toward the global best position.

Attacking behavior of seagulls (local search)
Seagulls use their wings and weight to maintain their flight height, and they can constantly change their attack angle and speed during migration. Seagulls move in a spiral pattern when attacking their prey. The spiral movement of seagulls in the x, y and z planes is described by Equations (11)-(14).
Here r is the radius of the spiral, θ is a random angle in the range [0, 2π], u and v are constants that define the spiral shape and e is the base of the natural logarithm. The updated position P(t) of the seagulls is obtained using Equation (15).
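The migration and attack phases can be put together in one position update. The sketch below follows the commonly published form of SOA [25] as far as it matches the symbols defined in this section (A, B, C, M, D, r, θ, u, v); the exact expressions should be treated as an assumption, and the function name, the default values u = v = 1 and the toy population are hypothetical.

```python
import numpy as np

def soa_step(P, P_best, t, T, rng, a=2.0, u=1.0, v=1.0):
    """One SOA update for a population P of shape (n_seagulls, dim)."""
    n = P.shape[0]
    # Migration (global search)
    A = a - t * (a / T)                   # decreases linearly from a to 0
    C = A * P                             # position that avoids collisions
    b_rand = rng.random((n, 1))           # random number in [0, 1) per seagull
    B = 2.0 * A**2 * b_rand               # balances global and local search
    M = B * (P_best - P)                  # movement toward the best position
    D = np.abs(C + M)                     # distance to the best position
    # Attack (local search): spiral movement
    theta = rng.uniform(0.0, 2.0 * np.pi, (n, 1))
    r = u * np.exp(theta * v)             # spiral radius
    x = r * np.cos(theta)
    y = r * np.sin(theta)
    z = r * theta
    return D * x * y * z + P_best         # updated positions

rng = np.random.default_rng(0)
P = rng.uniform(-1.0, 1.0, (5, 3))        # 5 seagulls in a 3-D search space
P_best = P[0].copy()                      # pretend the first seagull is best
mid = soa_step(P, P_best, t=1, T=10, rng=rng)
last = soa_step(P, P_best, t=10, T=10, rng=rng)
```

At the final iteration A is 0, so C, M and D vanish and every seagull lands on the global best position, which is the expected end state of the linear schedule.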

Improved Seagull Optimization Algorithm (ISOA)
In this part, to prevent the SOA algorithm from falling into a local minimum and to accelerate its convergence during optimization, the SOA algorithm is improved in three respects.
(1). A good optimization algorithm should have strong exploration ability in the early iterations and good exploitation ability in the late iterations. In addition, compared with a linear function, a nonlinear function can achieve a faster global search in the early stage and a slower local search in the later stage. Therefore, in the seagull optimization algorithm, we change the parameter B, which controls the movement of the seagull toward the optimal position, from a linear function to an exponential function.
(2). In the SOA algorithm, the new position of a seagull is mainly affected by the global optimal position, but observation of foraging seagulls shows that their flight is also affected by each individual's own optimal value. To account for this, we consider the influence of the particle itself and introduce the adaptive variable weight parameter ω.
Here, ω_max and ω_min are the maximum and minimum values of the weight ω, respectively, fit_pbest is the individual optimal value of particle fitness, fit_gbest is the global optimal value of particle fitness and avg_fit is the average fitness value of the current particle population. t and T are the current iteration and the maximum number of iterations, respectively. It can be seen from Equation (17) that when the target values of the particles tend to be consistent or locally optimal, the weight ω is larger, whereas when the target values of the particles are dispersed, the weight ω is smaller. For a particle with an objective function value better than the average target value, the corresponding weight ω is smaller, so that the particle is retained. Conversely, for a particle with an objective function value worse than the average target value, the corresponding weight ω is larger, so that the particle moves closer to the better search area.
(3). Increase the diversity of particles to prevent the algorithm from falling into a local minimum.
First, the particles are sorted by fitness from small to large. Then, the poorly ranked particles generate new positions near the global optimal position, while the other particles are retained.
To summarize, the attack position P_i of the ith seagull is given by Equation (18), where i is the new serial number of the particle after fitness sorting, popsize is the population size, λ is a constant and rp is a random value in the range (−1, 1).

Proposed ISOA-SVR Prediction Model
In this study, the ISOA algorithm was used to optimize the penalty parameter c, the width parameter g and the loss parameter p of the SVR. The mean square error (MSE) is the fitness function of the ISOA algorithm and is defined by the following equation:
$$f = \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2 \tag{19}$$
where n is the number of samples and y_i and ŷ_i are the actual and forecast AQI values, respectively. The algorithm flow of the proposed ISOA-SVR prediction model is as follows:
Step 1: Set the control parameters of the proposed ISOA algorithm.
Step 3: Map P into the c, g and p of the SVR and calculate its fitness f using Equation (19); then calculate fit_pbest, fit_gbest and avg_fit.
Step 9: Sort the fitness values f from small to large. Update the attack position P according to Equation (18).
Step 10: If the stopping criterion is met, go to Step 11. Otherwise, go to Step 3.
Step 11: Output the best position, which is mapped into the c, g and p of the SVR.
Then train and test the SVR.
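The fitness evaluation in Step 3 can be sketched as follows. This is an illustration under assumptions, not the authors' code: the search bounds for (c, g, p) are hypothetical, and the model is passed in as a callable so that the sketch runs without an SVR library. In practice `fit_predict` would train an SVR with the decoded parameters and return its predictions; the stand-in used here is a toy model that exists only to make the example executable.

```python
import numpy as np

# Hypothetical search bounds for (c, g, p); they are assumptions for this sketch.
LOWER = np.array([0.01, 0.01, 0.001])
UPPER = np.array([100.0, 100.0, 1.0])

def decode(position):
    """Clip a 3-D seagull position to the bounds and name its components."""
    c, g, p = np.clip(np.asarray(position, dtype=float), LOWER, UPPER)
    return {"c": float(c), "g": float(g), "p": float(p)}

def mse(y_true, y_pred):
    """Mean square error between actual and forecast values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def fitness(position, fit_predict, X_train, y_train):
    """Fitness of one position: training MSE of the model built from it."""
    params = decode(position)
    y_hat = fit_predict(params, X_train, y_train)
    return mse(y_train, y_hat)

def toy_fit_predict(params, X, y):
    """Stand-in for SVR training: predicts the mean of y shrunk by c / (c + 1)."""
    shrink = params["c"] / (params["c"] + 1.0)
    return np.full(len(y), shrink * np.mean(y))

X_demo = np.zeros((3, 2))
y_demo = np.array([1.0, 2.0, 3.0])
demo_fitness = fitness([1.0, 0.5, 0.1], toy_fit_predict, X_demo, y_demo)
```

Keeping the model behind a callable means the same fitness function can be reused for ISOA, SOA, PSO, SCA or WOA, since each optimizer only needs a position-to-score mapping.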

Data Description and Preprocessing
The data used in this study are the historical data for Taiyuan city published by the China air quality online monitoring and analysis platform (https://www.aqistudy.cn/historydata/, accessed on 15 July 2020). The daily and monthly data of the air quality index (AQI) and six closely related pollutants (PM2.5, PM10, SO2, CO, NO2, O3) in Taiyuan from December 2013 to June 2020 were extracted. The daily data are the averages of the real-time data released by the environmental protection station on that day, and the monthly data are the averages of all the daily data of that month.
Because the indexes involved in the calculation of the grey relational degree and in AQI prediction have different numerical dimensions, this study carried out dimensionless processing on the data in both the grey relational analysis (GRA) and the AQI prediction. When calculating the grey relational degree, we used the mean method of Equation (20) to make the data dimensionless. For AQI prediction, we used Equation (21) to normalize the data to the range [0, 1].
$$x_{new} = \frac{x}{\bar{x}} \tag{20}$$
where x_new is the dimensionless value of x and x̄ is the mean value of the sample during the experimental period.
$$x_{norm} = \frac{x - x_{min}}{x_{max} - x_{min}} \tag{21}$$
where x_norm is the normalized value and x_min and x_max are the minimum and maximum values of the experimental data in each experiment.
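Both preprocessing steps are one-liners in NumPy. The sketch below assumes the mean method divides each value by the sample mean and that the normalization is ordinary min-max scaling, which is what the variable definitions above describe; the function names are illustrative.

```python
import numpy as np

def mean_scale(x):
    """Mean method: divide each value by the sample mean (dimensionless)."""
    x = np.asarray(x, dtype=float)
    return x / x.mean()

def min_max_scale(x):
    """Min-max normalization to [0, 1]; assumes x is not constant."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

sample = [2.0, 4.0, 6.0]
scaled_by_mean = mean_scale(sample)      # values relative to the mean of 4
scaled_min_max = min_max_scale(sample)   # smallest maps to 0, largest to 1
```

When min-max scaling is used for prediction, the minimum and maximum are normally taken from the training data and reused on the test data, so that no information from the test period leaks into the model.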

Experimental Environment
The experimental environment was as follows: MATLAB 2016b (MathWorks, Natick, MA, USA) running on Windows 10 (Microsoft, Redmond, WA, USA) with a 3.20 GHz Intel Core i7-8700 CPU (Intel, Santa Clara, CA, USA) and 8 GB of RAM. All of the algorithms in the experiments were run independently 30 times with different randomly initialized populations. To verify the efficiency of the proposed ISOA, it was compared with SOA, PSO, SCA and WOA. The maximum number of iterations T was set to 200 for all algorithms and the population size N was set to 20. The other parameters of SCA were similar to those in [27], the WOA parameters were similar to those in [28] and the SOA parameters were similar to those in [25]. For PSO, the acceleration coefficients c1 and c2 were set to 1.49445.

Evaluation of Prediction Effect
In this study, ISOA-SVR, SOA-SVR, PSO-SVR, SCA-SVR and WOA-SVR were used to establish prediction models of AQI in Taiyuan. Mean square error (MSE), relative mean square error (RMSE) and mean absolute percentage error (MAPE) were the evaluation indexes of the five models. The smaller the MSE, RMSE and MAPE between the predicted value and the actual value, the better the regression model. The mathematical expressions of the evaluation indexes are given in Equations (22)-(24), where y_i is the label of the ith sample and ŷ_i is the predicted value of the ith sample.
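The three indexes can be computed as in the sketch below. MSE and MAPE follow their usual definitions. The form of the relative mean square error is an assumption (the mean of squared relative errors), because Equations (22)-(24) are not reproduced in this text; the function names and the two-sample example are illustrative.

```python
import numpy as np

def mse(y, y_hat):
    """Mean square error."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.mean((y - y_hat) ** 2))

def relative_mse(y, y_hat):
    """Relative mean square error, assumed here to be mean(((y - y_hat) / y)^2)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.mean(((y - y_hat) / y) ** 2))

def mape(y, y_hat):
    """Mean absolute percentage error, in percent."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(100.0 * np.mean(np.abs((y - y_hat) / y)))

actual = [100.0, 200.0]
predicted = [110.0, 180.0]
scores = (mse(actual, predicted),
          relative_mse(actual, predicted),
          mape(actual, predicted))
```

Both relative measures divide by the actual value, so they are only defined when no actual value is zero, which holds for AQI data.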

Experimental Results
In this section, GRA was used to evaluate and analyze the air quality of Taiyuan city before and during the outbreak of COVID-19. Then, a prediction model of AQI in Taiyuan was established using ISOA-SVR and compared with other prediction models.

Analysis of Air Pollution Assessment Results
Before applying the GRA method to analyze the influence of air pollution factors on AQI, a simple numerical statistical analysis of AQI and each air pollution factor was carried out for four time periods: the COVID-19 lockdown period (25 January 2020 to 24 February 2020), the period before the outbreak of COVID-19 (25 December 2019 to 24 January 2020), the same period in 2019 (25 January 2019 to 24 February 2019) and one other period (25 December 2018 to 24 January 2019), each of which has 31 days. In addition, according to the technical regulation on the ambient air quality index (AQI) (HJ 1130-2020) issued by the Ministry of Environmental Protection of China, AQI is divided into six levels (0~50 'good', 51~100 'moderate', 101~150 'lightly polluted', 151~200 'moderately polluted', 201~300 'heavily polluted' and >300 'severely polluted') [29]. This paper takes these levels as the standard for analyzing the air pollution in the four periods. The statistical results of AQI are shown in Figure 3 and Table 1.
In the 31 days of the COVID-19 lockdown period (25 January 2020 to 24 February 2020), there were three good days, 16 moderate days, nine lightly polluted days and three heavily polluted days, and there were no moderately polluted or severely polluted days. In the 31 days before the outbreak of COVID-19 (25 December 2019 to 24 January 2020), there were two good and seven moderate days, 11 lightly polluted, two moderately polluted and eight heavily polluted days, and one severely polluted day. In the same period of 2019 (25 January 2019 to 24 February 2019), there were one good and 13 moderate days, six lightly polluted, eight moderately polluted and three heavily polluted days, and no severely polluted days. In the fourth period (25 December 2018 to 24 January 2019), there were no good days, 11 moderate days, 10 lightly polluted days, four moderately polluted days, four heavily polluted days and two severely polluted days.
From these statistics, it can be seen that the air quality of the COVID-19 lockdown period (25 January 2020 to 24 February 2020) was significantly better than that of the other three periods. During the lockdown, the total number of days with good or moderate air quality was 19, accounting for 61.3% (19/31), and there were no severely polluted days. In the other three periods, the total numbers of good and moderate days were 9, 14 and 11, respectively, none of which reached 50%. In the lockdown period, only three days reached or exceeded the moderately polluted level, accounting for 9.7% (3/31), whereas the other three periods had 11, 11 and 10 such days, respectively, each exceeding 30%. In addition, before the outbreak of COVID-19 (25 December 2019 to 24 January 2020) and in the fourth period (25 December 2018 to 24 January 2019), there were one and two severely polluted days, respectively. All of this indicates that the suspension of factory production and the restriction of motor vehicles during the period of epidemic prevention and control played a positive role in improving air quality.
Figure 4 and Table 2 show the average values of AQI and pollutant concentrations over the four time periods. Compared with the other three periods, the average AQI and the average concentrations of all pollutants except O3, whose average concentration increased, decreased significantly during the COVID-19 lockdown period (25 January 2020 to 24 February 2020).
The average value reflects the overall level of the data. Combining these results with the statistical results in Table 1 shows that the air quality in the lockdown period was significantly better than in the other three periods. Good and moderate days accounted for 61.3% (19/31) of the lockdown period (25 January 2020 to 24 February 2020). Compared with 29.0% (9/31) before the outbreak of COVID-19, 45.2% (14/31) in the same period of 2019 and 35.5% (11/31) in the fourth period, this was 32.3 (10/31), 16.1 (5/31) and 25.8 (8/31) percentage points higher, respectively. The days reaching or exceeding the moderately polluted level accounted for 9.7% (3/31), which was 25.8 (8/31), 25.8 (8/31) and 22.6 (7/31) percentage points lower than before the outbreak of COVID-19 (35.5% (11/31)), in the same period in 2019 (35.5% (11/31)) and in the fourth period (32.3% (10/31)), respectively. There were no severely polluted days during the lockdown. In addition, the average AQI during the lockdown period was 102, which is smaller than in the other three periods (141, 122 and 141). These results show that the COVID-19 prevention and control measures greatly restricted the movement of people, transportation, engineering construction, industrial production and commercial trade activities. Industrial emissions and automobile exhaust were greatly reduced, and air quality was significantly improved.
From the literature [29], we know that AQI is calculated from the concentrations of PM2.5, PM10, SO2, CO, NO2, O3 and other pollutants. From the preceding numerical statistical analysis, we know that the AQI and the pollutant concentrations during the COVID-19 lockdown period changed greatly compared with the other three periods. So, did the impact of these air pollutants on AQI also change greatly?
Next, we used the GRA method to analyze the grey relational degree between each pollution factor and AQI in the four time periods, and made a comparative analysis of the different periods. The analysis and comparison results are shown in Figure 5 and Tables 3-6.
It can be seen from Tables 3 and 4 that during the COVID-19 lockdown period (25 January 2020 to 24 February 2020), compared with the period before the outbreak of COVID-19 (25 December 2019 to 24 January 2020), the impact of PM10 was slightly enhanced, the impacts of PM2.5, SO2 and CO were basically the same as in the earlier period, the impact of NO2 was slightly weakened and the impact of O3 was significantly enhanced. Compared with the same period in 2019 (25 January 2019 to 24 February 2019), the impacts of PM10 and O3 were significantly enhanced, the impacts of PM2.5 and CO were basically the same as before and the impacts of SO2 and NO2 were slightly weakened.
The results in Table 5 show that in the same period of 2019 (25 January 2019 to 24 February 2019), compared with its earlier period (25 December 2018 to 24 January 2019), the impacts of PM2.5, SO2 and NO2 were significantly weakened, the impact of PM10 was basically the same as in the earlier period, the impact of CO was slightly weakened and the impact of O3 was slightly enhanced. It can be seen from Table 6 that before the outbreak of COVID-19 (25 December 2019 to 24 January 2020), compared with the same period a year earlier (25 December 2018 to 24 January 2019), the impacts of PM2.5, SO2 and NO2 were significantly weakened and the impacts of PM10, CO and O3 were significantly enhanced.
Based on the results in Tables 3-6 and Figure 5, although the changes in the impact of pollutants on air quality (AQI) differed somewhat between comparison periods, they were similar in the corresponding comparison periods. PM10 and PM2.5 had a great influence on AQI. These results indicate that although the impact of the various air pollutants on AQI changed somewhat because of the outbreak of COVID-19, most of the changes were not significant. Therefore, it is feasible, although challenging, to use the data from before the outbreak of COVID-19 to predict the data during the outbreak.
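The level boundaries and the day counts reported in this section are enough to re-derive the quoted percentages. The sketch below encodes the six AQI levels and the per-level day counts stated in the text for each period; the helper names are illustrative, and percentages are rounded to one decimal place.

```python
LEVELS = ["good", "moderate", "lightly polluted",
          "moderately polluted", "heavily polluted", "severely polluted"]
UPPER_BOUNDS = [50, 100, 150, 200, 300]   # inclusive upper AQI bound per level

def aqi_level(aqi):
    """Map an AQI value to one of the six levels."""
    for name, upper in zip(LEVELS, UPPER_BOUNDS):
        if aqi <= upper:
            return name
    return LEVELS[-1]

# Day counts per level as reported in the text, in LEVELS order.
DAY_COUNTS = {
    "lockdown": [3, 16, 9, 0, 3, 0],          # 25 Jan 2020 to 24 Feb 2020
    "pre_outbreak": [2, 7, 11, 2, 8, 1],      # 25 Dec 2019 to 24 Jan 2020
    "same_period_2019": [1, 13, 6, 8, 3, 0],  # 25 Jan 2019 to 24 Feb 2019
    "fourth_period": [0, 11, 10, 4, 4, 2],    # 25 Dec 2018 to 24 Jan 2019
}

def shares(counts):
    """Percent of good-or-moderate days and of moderately-polluted-or-worse days."""
    total = sum(counts)
    good_or_moderate = 100.0 * sum(counts[:2]) / total
    moderately_or_worse = 100.0 * sum(counts[3:]) / total
    return round(good_or_moderate, 1), round(moderately_or_worse, 1)

period_shares = {name: shares(counts) for name, counts in DAY_COUNTS.items()}
```

Each list sums to 31 days, which is a quick consistency check on the counts before comparing periods.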

AQI Prediction Results and Analysis
In this part, we used five prediction models (ISOA-SVR, SOA-SVR, PSO-SVR, SCA-SVR and WOA-SVR) to predict the monthly average AQI values from September 2019 to June 2020 and the daily AQI values for 15-24 February 2020. To ensure the reliability of the experiment, each experiment was run 30 times for each model, and the prediction results were compared and analyzed.
The sampling time of Experiment 1 was from December 2013 to June 2020, giving a total of 79 groups of monthly average air quality data. The first 69 groups were used as the training set and the remaining 10 groups as the test set. The sampling time of Experiment 2 was from 1 January 2020 to 24 February 2020, giving 55 groups of daily air quality data, of which the first 45 groups were used as the training set and the remaining 10 groups as the test set. In the experiments, the AQI value of the sixth group was predicted from the AQI values of the first five groups and the pollutant values of the fifth group, and the window was then slid forward. This was done to smooth out random fluctuations in the historical series and to reduce the influence of seasonal and random factors, so as to make more accurate predictions. The fitness convergence curve on the training data and the test results on the test data (including MSE, RMSE, MAPE and run time) were obtained by running each prediction model.
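The sliding-window construction described above can be sketched as follows: each sample combines the AQI values of five consecutive groups with the six pollutant values of the fifth group, and the target is the AQI of the sixth group. The function name and the synthetic arrays are illustrative, and the way the resulting samples are divided into training and test sets is left open here.

```python
import numpy as np

def make_windows(aqi, pollutants, lag=5):
    """Build (features, target) pairs with a sliding window.

    aqi:        array of shape (n,)
    pollutants: array of shape (n, 6), one row per group
    Features of a sample: AQI of `lag` consecutive groups plus the
    pollutant values of the last group in the window.
    """
    aqi = np.asarray(aqi, dtype=float)
    pollutants = np.asarray(pollutants, dtype=float)
    X, y = [], []
    for t in range(lag, len(aqi)):
        X.append(np.concatenate([aqi[t - lag:t], pollutants[t - 1]]))
        y.append(aqi[t])
    return np.array(X), np.array(y)

# Synthetic series of 8 groups, used only to show the shapes.
demo_aqi = np.arange(8, dtype=float)
demo_pollutants = np.arange(48, dtype=float).reshape(8, 6)
X_demo, y_demo = make_windows(demo_aqi, demo_pollutants)
```

With a lag of five, a series of n groups yields n − 5 samples, each with 11 features (five AQI values and six pollutant values).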

Results and Analysis of Experiment 1
As can be seen from Figure 6, the fitness functions of ISOA, SCA and WOA converged similarly, SOA was slightly worse and PSO was the worst. The training results in Table 7 show that ISOA-SVR was the best, with MSE, RMSE and MAPE of 1.299370396, 0.010565973 and 9.064760441, respectively; the corresponding values were 1.386464339, 0.012714389 and 9.62641542 for SCA-SVR and 1.367865788, 0.012364791 and 9.463121198 for WOA-SVR. From the evaluation index values of the prediction results in Table 8, we found that the three index values of ISOA-SVR were the best, at 3.379206115, 0.03090546 and 14.03810752, respectively. WOA-SVR was slightly worse than ISOA-SVR, SOA-SVR and SCA-SVR were similar to each other and both slightly worse than WOA-SVR, and PSO-SVR was the worst. In fact, the prediction results in Table 9 show that ISOA-SVR achieved the optimal or suboptimal value for 6 of the 10 test samples, which was the best among the five algorithms.
Figure 6. Fitness convergence curves of five algorithms in Experiment 1.

Results and Analysis of Experiment 2
We can see from Figure 7 that the fitness functions of ISOA, SOA, SCA and WOA converged similarly, whereas the convergence of PSO was a little worse.
From the MSE, RMSE and MAPE of the training results in Table 10, we found that the training results of ISOA-SVR (13.37973604, 0.163687428 and 28.1740733) were slightly worse in MSE and RMSE than those of the best model, SOA-SVR (12.90917741, 0.162034198 and 28.71513087), although its MAPE was slightly lower, and they were better than those of the other algorithms. From the three evaluation index values of the prediction results in Table 11, we can see that the prediction results of ISOA-SVR were the best (6.147418336, 0.185148465 and 37.28176518), SOA-SVR was slightly worse than ISOA-SVR, SCA-SVR was slightly worse than SOA-SVR but better than WOA-SVR, and PSO-SVR was the worst. In fact, the prediction results in Table 12 show that ISOA-SVR achieved the optimal or suboptimal value for 7 of the 10 test samples, which was the best among the five algorithms.
Combining the results of Experiment 1 and Experiment 2, we can conclude that the proposed ISOA-SVR prediction model had better generalization ability and robustness than the other prediction models tested in this study. This result is attributed to three characteristics of the ISOA algorithm: the exponential balance between local and global optimization, the adaptive ability of the particles themselves and of the group, and the diversity of the particles. These help the algorithm avoid falling into local minima during optimization. In addition, the weight parameters introduced in the ISOA algorithm can balance the global exploration and local exploitation abilities of the particles and accelerate the convergence of the algorithm.

Conclusions
This study focused on air quality evaluation during the COVID-19 lockdown (25 January 2020 to 24 February 2020), before the COVID-19 outbreak (25 December 2019 to 24 January 2020), in the same period in 2019 (25 January 2019 to 24 February 2019) and in another period (25 December 2018 to 24 January 2019). First, a comparative numerical statistical analysis was made, and then the grey relational analysis (GRA) method was used to analyze the relational degree between the six major pollutants and AQI. The results reflect not only the distribution of the air quality index (AQI) and pollutant concentrations in each analysis period, but also the impact of the pollutants on AQI. The results show that during the COVID-19 lockdown period the overall air condition was good: good and moderate days accounted for a large proportion of the period, and there were no severely polluted days. The average AQI and the average concentrations of the five pollutants other than O3 decreased significantly. PM10 was still the main factor affecting AQI. This indicates that air pollution was closely related to pollutant discharge. The prevention and control measures during the lockdown period reduced the emissions of industrial waste gas, automobile exhaust and other pollutants in Taiyuan, and the city's air quality improved greatly. However, coal-fired heating in winter still contributes substantially to the PM10 concentration. In addition, the evaluation results show that the average concentration of O3 increased slightly during the period of COVID-19 prevention and control, which deserves attention.
After the evaluation of air quality, this study constructed an ISOA-SVR prediction model for predicting the AQI value. Air quality is affected by many factors, such as pollution sources and the meteorological environment, which makes accurate AQI prediction difficult. SVR is suitable for AQI prediction, but its prediction performance depends on the selection of model parameters. First, based on the SOA algorithm, this study proposed the ISOA algorithm by introducing nonlinear parameters, considering the influence of the individual optimum and the average fitness of each generation, and increasing the diversity of particles. Then, the parameters of the SVR model were optimized by the ISOA algorithm. Finally, the optimized ISOA-SVR prediction model was used to predict the monthly average AQI values from September 2019 to June 2020 and the daily AQI values for 15-24 February 2020. The results were compared with those of the SVR prediction models optimized by the SOA, PSO, SCA and WOA algorithms, and they showed that the SVR model optimized by the ISOA algorithm achieved good results in MSE, RMSE and MAPE.
To sum up, first, the air quality evaluation results for Taiyuan in this study were in line with the actual air quality situation of the city. Second, the integrated ISOA-SVR prediction model proposed in this study was suitable for predicting the AQI value in Taiyuan before and after the COVID-19 epidemic. These results can play a guiding role in the prevention and control of air pollution in Taiyuan and other places.

Data Availability Statement:
The data that support the findings of this study are openly available in the historical data query of the China air quality online monitoring and analysis platform (https://www.aqistudy.cn/historydata/, accessed on 1 March 2021).