Air Quality Modeling with the Use of Regression Neural Networks

Air quality is assessed on the basis of air monitoring data. Monitoring data are often not complete enough to carry out an air quality assessment. To fill the measurement gaps, predictive models can be used, which enable the approximation of missing data. Prediction models use historical data and relationships between measured variables, including air pollutant concentrations and meteorological factors. The known predictive air quality models are not accurate, so it is important to look for models that give a lower approximation error. The use of artificial neural networks reduces the prediction error compared to classical regression methods. In previous studies, a single regression model over the entire concentration range was used to approximate the concentrations of a selected pollutant. In this study, it was assumed that not a single model, but a group of models, could be used for the prediction. In this approach, each model from the group was dedicated to a different sub-range of the concentration of the modeled pollutant. The aim of the analysis was to check whether this approach would improve the quality of modeling. A long-term data set recorded at two air monitoring stations in Poland was used in the examination. Hourly data of basic air pollutants and meteorological parameters were used to create predictive regression models. The prediction errors for the sub-range models were compared with the corresponding errors calculated for one full-range regression model. It was found that the application of sub-range models reduced the modeling error of basic air pollutants.


Introduction
Air pollution is considered one of the main factors affecting the human population and the environment. The World Health Organization estimates that many millions of people die prematurely due to poor air quality [1]. Once pollutants are emitted into the air, it is impossible to stop them. If pollutants get into the atmosphere, they contribute to the deterioration of air quality in the vicinity of the emission, but by spreading they can have negative effects hundreds and thousands of kilometers from the point of emission. Therefore, air pollution is treated as a global threat and emission reduction strategies are implemented in many countries. It should be highlighted that, in Poland, the PM 10 (particulate matter) concentration still exceeds the permissible limits and causes premature deaths of over 40,000 people every year [2].
Air pollution causes negative changes in the human respiratory and circulatory systems, even when pollutant concentrations do not exceed permissible levels [3][4][5][6][7][8]. It may trigger various reactions of organisms, including mental health disorders [9,10]. It was also found that air pollution can negatively affect the economy [11][12][13][14]. High concentrations of air pollutants have a negative effect on plants, which is manifested in the reduction of crop yields in agriculture [15,16].
The control and reduction of anthropogenic emissions are now recognized as the keys to good global air quality. An important element of the control system is the assessment of air quality. This task is performed using air quality monitoring. Air monitoring includes continuous measurements of air pollutants. The main air pollutants include O3, SO2, NOx, CO and PM10 [17]. Concentrations of the mentioned pollutants can be measured with automatic air monitoring stations. These types of stations are often equipped with devices that also continuously measure meteorological data such as temperature, wind direction, wind speed and solar radiation. Measurement results at automatic air monitoring stations are recorded in the form of 1 h averages, e.g., hourly concentrations, hourly temperature, etc. In the EU, average hourly concentrations are the basis for calculating averages over longer periods of time that are required for air quality standards, such as 8 h, 24 h and annual concentrations [18]. The system acquires and collects measurement data of the air pollutants' concentration levels at many individual air monitoring stations, according to the standardized measurement methods [19][20][21][22][23]. The collected hourly concentrations in the air monitoring system constitute the basis for direct and indirect statistical evaluation of air quality in the zones represented with individual monitoring stations, in accordance with the procedures described in the relevant legal acts [18]. Correct assessment requires a high degree of completeness of the time series of concentrations obtained at the monitoring stations. Usually, the completeness should exceed 90% in an annual series of hourly measurements. Unfortunately, monitoring data are never 100% complete in annual terms, and often they do not even have the completeness required by the regulations. When a series of measurements has such gaps, the missing data need to be completed.
Missing data can be supplemented by introducing modeled concentrations in the measurement gaps [24][25][26][27]. The time series data obtained with air monitoring have specific characteristics. All measurements, both of concentrations and meteorological parameters, are performed simultaneously and are recorded in similar time series. Therefore, methods based on auto-regression and regression can be used for modeling monitoring concentrations. If historical data from the selected air monitoring station are available, they can be used to explore the knowledge hidden in them. Autonomous models of this type can be accurate and have a very significant advantage: the approximation of concentrations does not require external data from outside the monitoring system [26,27]. In the first models, classical statistical regression techniques were used [28]. Classical methods are increasingly being replaced by methods that use machine learning artificial intelligence, including artificial neural networks (ANNs) [29][30][31][32][33][34][35][36][37]. ANN models enable a deeper exploration of knowledge hidden in historical data and, as a result, more accurate concentration predictions.
In regression modeling, it was found that the application of one neural network to the entire range of concentrations of the predicted pollutant resulted in different prediction accuracies in the concentration sub-ranges [38,39]. It was considered advisable to replace one neural network with several networks (sub-models), each of which would be adjusted to specific concentration sub-ranges [39]. The use of several sub-range models should improve the accuracy of the prediction. This paper presents an analysis that verifies the above thesis.
The main aim of the study was to improve the accuracy of prediction of air pollutant concentrations in neural regression models, using many predictive models created for various sub-ranges of air pollutant concentrations. The analyzed data came from two air monitoring stations in the Upper Silesian Region, Poland. The quality of prediction was assessed separately for the concentrations of six main air pollutants: O3, NO, NO2, SO2, PM10 and CO. Multi-layer perceptrons with an identical architecture were used to model all pollutants. The predicted concentrations were compared with the observed ones to estimate the prediction error. The prediction errors were calculated for various sub-models, and, based on them, the prediction error was estimated over the entire concentration range. This error was compared with the approximation error obtained from a single model covering the entire concentration range.

The city of Zabrze is one of the most polluted towns in Poland and throughout the EU. The station at Zabrze was an urban background monitoring site. Złoty Potok is a rural town, located outside the Upper Silesian Agglomeration. In Złoty Potok, there is a background monitoring station for the Upper Silesian Region. The data were provided by the Voivodeship Inspectorate of Environmental Protection in Katowice. Time series data, including 1 h average values of O3, NO, NO2, SO2, PM10 and CO concentrations, as well as meteorological data for temperature, wind speed, solar radiation and relative humidity, were recorded. The analyzed data sets are not publicly available; we received them on individual request from the Voivodeship Inspectorate of Environmental Protection. Part of the data, namely the time series of concentrations, is available on the website of the Chief Inspectorate of Environmental Protection: https://powietrze.gios.gov.pl/pjp/archives (accessed on 30 April 2022). Meteorological data are not available online.
The data also include two variables describing the time: day and hour. These two variables were converted to numeric form following the procedure described in [39].
The following symbols were used to describe the time series (

Regression Models
For all air pollutants, similar perceptron models were created. The output of each model was the concentration of the chosen air pollutant (explained variable), and the inputs were the date and hour, the concentrations of the other air pollutants and the meteorological parameters (explanatory variables). Table 2 presents the predicted variables and predictors, with separate lines for individual models. For example, the following 11 input variables were used to model the ozone concentration in Złoty Potok: H, D, NO, NO2, SO2, CO, PM10, WS, T, I and RH.
For regression modeling, artificial neural networks in the form of multi-layer perceptrons were used. Each perceptron consisted of input neurons, 10 neurons in one hidden layer and one output neuron. Figure 2 shows such a perceptron used to model O3 concentrations. The Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm, a quasi-Newton method of numerical optimization [40], was used in the learning process. The algorithm was developed on the basis of solutions proposed in 1970 by the four mathematicians after whom it is named [41][42][43][44]. It solves unconstrained non-linear optimization problems iteratively.

The learning process was always limited to 300 epochs. In the network learning process, one epoch means a single learning cycle; the network repeats the cycles many times in order to minimize the learning error. The activation function of the hidden and output neurons was the logistic function. The weights were initialized randomly (Gaussian). Modeling was performed using the Statistica program, version 13.3.

Each prediction was performed 5 times. Repeating the training of the neural network, while maintaining the same parameters of the learning process, is a routine procedure. Each model's learning process is somewhat random and leads to a different network. The created networks have the same structure of neurons, but differ in terms of weights and the degree of activation of individual neurons. In general, they differ slightly in the modeling error. The most accurate of the 5 created models was selected for reporting; the other models were rejected. The sum of squares (SOS) was used as the error function. SOS is the sum of the squared differences between the predicted and the actual values.
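The training set-up described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Statistica implementation used in the study: it relies on NumPy and SciPy, treats one BFGS iteration as one epoch, assumes the target concentrations were scaled to the interval [0, 1] (needed because the output neuron is logistic), draws initial weights from a Gaussian with an arbitrary standard deviation of 0.5, and selects the best of the repeated runs by training SOS. All function names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))


def unpack(w, n_in, n_hid):
    """Split the flat weight vector into layer weights and biases."""
    i = n_in * n_hid
    W1 = w[:i].reshape(n_in, n_hid)
    b1 = w[i:i + n_hid]
    W2 = w[i + n_hid:i + 2 * n_hid]
    b2 = w[-1]
    return W1, b1, W2, b2


def sos_and_grad(w, X, t, n_hid):
    """Sum-of-squares error of the perceptron and its gradient."""
    W1, b1, W2, b2 = unpack(w, X.shape[1], n_hid)
    h = logistic(X @ W1 + b1)            # hidden activations, shape (n, n_hid)
    y = logistic(h @ W2 + b2)            # network output, shape (n,)
    err = y - t
    d_out = 2.0 * err * y * (1.0 - y)    # derivative w.r.t. output pre-activation
    d_hid = np.outer(d_out, W2) * h * (1.0 - h)
    grad = np.concatenate([(X.T @ d_hid).ravel(), d_hid.sum(0),
                           h.T @ d_out, [d_out.sum()]])
    return float(err @ err), grad


def predict(w, X, n_hid):
    W1, b1, W2, b2 = unpack(w, X.shape[1], n_hid)
    return logistic(logistic(X @ W1 + b1) @ W2 + b2)


def train_best_of(X, t, n_hid=10, restarts=5, epochs=300, seed=0):
    """Train the same architecture several times and keep the lowest-SOS run."""
    rng = np.random.default_rng(seed)
    n_w = X.shape[1] * n_hid + 2 * n_hid + 1
    best = None
    for _ in range(restarts):
        w0 = rng.normal(0.0, 0.5, n_w)   # Gaussian initialization (sd is an assumption)
        res = minimize(sos_and_grad, w0, args=(X, t, n_hid), jac=True,
                       method="BFGS", options={"maxiter": epochs})
        if best is None or res.fun < best.fun:
            best = res
    return best.x, best.fun
```

With an input matrix `X` of shape (cases, 11) and scaled targets `t`, `train_best_of(X, t)` returns the weights and training SOS of the best of five runs, and `predict(w, X, 10)` gives the scaled predictions.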

Preparation of Data for Modeling
The complete set of hourly air monitoring data from the 6-year measurement period should have included 52,608 cases (hourly observations). Prior to modeling, raw data from the air monitoring database were prepared by removing the cases where there were missing data. After removing the cases with missing data, a set of cases with complete data was obtained for further analysis. This set was called the full-range set. The full-range sets included 36,460 and 15,536 cases, for Zabrze and Złoty Potok stations, respectively. Before starting the learning process, each data set was divided into three subsets: the training subset consisted of 70% of the cases, the validation subset consisted of 15% of the cases and the test subset consisted of 15% of the cases.
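The two preparation steps above, removing incomplete cases and dividing the remainder 70/15/15, can be sketched as below. The text does not say how cases were assigned to the three subsets; random assignment is assumed here, and the function names and the use of `None` for a missing value are illustrative choices.

```python
import random


def complete_cases(rows):
    """Keep only the cases (rows) in which no value is missing (None)."""
    return [r for r in rows if all(v is not None for v in r)]


def split_70_15_15(cases, seed=0):
    """Randomly split cases into training, validation and test subsets."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)   # random assignment is an assumption
    n_train = round(0.70 * len(shuffled))
    n_val = round(0.15 * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


# Hypothetical hourly records: (temperature, concentration)
rows = [(float(i), 2.0 * i) for i in range(100)] + [(None, 5.0)]
full_range = complete_cases(rows)
train, val, test = split_70_15_15(full_range)
print(len(full_range), len(train), len(val), len(test))  # 100 70 15 15
```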
Two approaches were adopted during modeling and, therefore, two groups of models were developed for each of the stations. In the first approach, all cases in the data set were sorted according to the real value of the predicted variable, from the lowest concentration to the highest. The sorted set was then divided into subsets of the same size: first into 2, then into 4 and finally into 8 subsets. The subsets prepared in this way differed in the ranges of the real concentration values of the predicted variable. For each of these subsets, regression models called RVS (Real Values Sorting) were created. As a result, for each set/subset, a separate predictive model was obtained, marked with the same symbol as the modeled set/subset. The name of the monitoring station (ZAB or ZP) was also added to the designations of sets and subsets. The full-range model was marked as RVS-1/1. As a result of the division of the full-range set, the following subsets and sub-models were obtained:
− Two sub-models (RVS-1/2, RVS-2/2), after division into two subsets;
− Four sub-models (RVS-1/4 to RVS-4/4), after division into four subsets;
− Eight sub-models (RVS-1/8 to RVS-8/8), after division into eight subsets.

When the real concentrations of the pollutant to be modeled are not known, the RVS-type models cannot be used, because the cases cannot be sorted and classified into the real concentration sub-ranges. Therefore, RVS models reflect only the potential, not practical, opportunities to improve the quality of modeling through segmentation of the prediction process. For this situation, a different approach can be proposed, which also uses division into sub-ranges, but with separate modeling in sub-ranges that can be designated without knowing the real concentrations.
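The RVS division described above (sorting by the real concentration and cutting the sorted set into equal parts) can be illustrated with a short sketch. The helper name `equal_subsets` and the example cases are hypothetical; only the sort-then-split logic and the RVS-j/k labels follow the text.

```python
def equal_subsets(cases, key, k):
    """Sort cases by key and cut them into k contiguous subsets of (near) equal size."""
    ordered = sorted(cases, key=key)
    n = len(ordered)
    bounds = [round(j * n / k) for j in range(k + 1)]
    return [ordered[bounds[j]:bounds[j + 1]] for j in range(k)]


# Hypothetical cases: (predictor vector, real concentration)
cases = [([h], c) for h, c in
         enumerate([12.0, 3.0, 40.0, 7.5, 22.0, 1.0, 18.0, 30.0])]

for k in (1, 2, 4, 8):
    subsets = equal_subsets(cases, key=lambda case: case[1], k=k)
    for j, sub in enumerate(subsets, start=1):
        low, high = sub[0][1], sub[-1][1]
        print(f"RVS-{j}/{k}: {len(sub)} cases, real range {low}-{high}")
```

Each printed line corresponds to one subset for which a separate sub-model would be trained.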
The most important step in this approach is the initial modeling of the concentrations of the selected pollutant in the full-range set, understood as the set of all cases containing complete data for all explanatory variables (model inputs) of the modeled pollutant. This initial modeling yields a predicted concentration of the dependent variable for every case. Then, the entire set of cases is sorted according to the increasing predicted concentrations of the modeled pollutant. The next step is to divide the sorted full-range set of cases into a specified number of equal sub-ranges, and a separate sub-model is created for each of them. Models obtained in this way were called PVS (Predicted Values Sorting). As in the RVS approach, the full-range set was divided into two, four and eight subsets.

Thanks to the divisions of the full-range set, it was possible to check how the modeling accuracy changed in the concentration sub-ranges, and whether the modeling carried out in the sub-ranges would improve the modeling quality in relation to the full-range modeling.
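A minimal sketch of the PVS procedure follows. To keep it short and dependency-free, a least-squares line on a single predictor stands in for the neural network, which is a deliberate simplification. The routing step in `predict_pvs` (sending a new case to the sub-model whose sub-range contains its full-range prediction) is an assumption about how such sub-models would be applied. All names are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares line y = a + b*x; a stand-in for the neural network."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - b * mx, b


def fit_pvs(xs, ys, k):
    """Fit a full-range model, sort cases by its predictions, fit k sub-models."""
    a0, b0 = fit_line(xs, ys)
    order = sorted(range(len(xs)), key=lambda i: a0 + b0 * xs[i])
    bounds = [round(j * len(order) / k) for j in range(k + 1)]
    sub_models, upper_limits = [], []
    for j in range(k):
        idx = order[bounds[j]:bounds[j + 1]]
        sub_models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
        upper_limits.append(a0 + b0 * xs[idx[-1]])  # highest initial prediction
    return (a0, b0), sub_models, upper_limits


def predict_pvs(model, x):
    """Route a case by its full-range prediction, then apply that sub-model."""
    (a0, b0), sub_models, upper_limits = model
    first = a0 + b0 * x
    for (a, b), limit in zip(sub_models, upper_limits):
        if first <= limit:
            return a + b * x
    a, b = sub_models[-1]          # above every limit: use the top sub-range
    return a + b * x


# Hypothetical non-linear relationship between one predictor and a concentration
xs = [float(i) for i in range(16)]
ys = [x * x for x in xs]
model = fit_pvs(xs, ys, k=4)
print([round(predict_pvs(model, x), 1) for x in (1.0, 6.0, 14.0)])
```

Because the sub-ranges are defined by predicted rather than real concentrations, this scheme remains usable when the real values are missing.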

Assessment of the Approximation Error
To assess the accuracy of the obtained regression models, the MAE and RMSE values were used, which were calculated on the basis of the discrepancy between the actual and predicted values. The formulas for calculating individual errors are presented in Equations (1) and (2).
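$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - x_i\right| \qquad (1)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - x_i\right)^2} \qquad (2)$$

where MAE is the mean absolute error, RMSE is the root mean squared error, n is the number of cases, y denotes the predicted concentrations, x denotes the real concentrations and i is the case number.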
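The two error measures can be computed directly from paired predicted and real concentrations. The sketch below uses the standard definitions of MAE and RMSE with y as the predicted and x as the real values; the function names and the sample numbers are illustrative only.

```python
import math


def mae(predicted, real):
    """Mean absolute error between predicted (y) and real (x) concentrations."""
    return sum(abs(y - x) for y, x in zip(predicted, real)) / len(real)


def rmse(predicted, real):
    """Root mean squared error between predicted (y) and real (x) concentrations."""
    return math.sqrt(sum((y - x) ** 2 for y, x in zip(predicted, real)) / len(real))


real = [10.0, 20.0, 30.0, 40.0]        # hypothetical observed values
predicted = [12.0, 18.0, 33.0, 40.0]   # hypothetical model output
print(mae(predicted, real))            # 1.75
print(round(rmse(predicted, real), 4)) # 2.0616
```

RMSE weights large deviations more heavily than MAE, which is why the two measures can rank models differently.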

Results
For each pollutant, modeling errors were calculated in relation to the real pollutant concentrations (O3, NO, NO2, SO2, CO and PM10 for the Zabrze station; and O3, NO, NO2, SO2 and PM10 for the Złoty Potok station). To assess the modeling accuracy, two error measures were calculated: MAE and RMSE.

The Results of the Modeling of O3 Concentrations
Tables 3-6 show the results of O3 concentration predictions obtained with the full-range and sub-range models. The errors of the PVS models for the Zabrze monitoring station are presented in Table 3 and those for the Złoty Potok monitoring station in Table 4. Similar lists of errors for the RVS models are presented in Tables 5 and 6. The presented results show that the modeling errors differed between individual sub-ranges. Regardless of the number of sub-range models, the modeling error usually increased with increasing concentration values in the sub-ranges. A good way to assess the quality of the prediction is to compare the overall prediction error over the entire range of concentrations of the modeled pollutant, and not in individual sub-ranges. Therefore, the tables include average values for the entire ranges: "overall MAE" and "overall RMSE". Division into the sub-ranges generally improved the accuracy of prediction, especially in the case of RVS models. The exceptions were the eight-sub-range PVS models for both monitoring stations. In the case of Zabrze, the same average value of MAE was recorded in the eight-sub-range models as in the four-sub-range models (8.02 µg/m³). In the case of Złoty Potok, the overall values of MAE and RMSE even increased in the eight-sub-range models compared to the four-sub-range models. In the case of RVS models, a decrease in the overall values of MAE and RMSE was always observed with an increase in the number of sub-ranges.
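The text does not spell out how the "overall MAE" and "overall RMSE" values were aggregated from the sub-range results. One plausible reading, shown here purely as an assumption, is to pool the prediction errors of all sub-ranges and evaluate both measures over the pooled cases. The function name and the numbers are hypothetical.

```python
import math


def overall_errors(sub_ranges):
    """Pool (predicted, real) pairs from all sub-ranges and return (MAE, RMSE)."""
    pairs = [p for sub in sub_ranges for p in sub]
    n = len(pairs)
    mae = sum(abs(y - x) for y, x in pairs) / n
    rmse = math.sqrt(sum((y - x) ** 2 for y, x in pairs) / n)
    return mae, rmse


# Hypothetical (predicted, real) pairs for a low and a high sub-range
low = [(5.0, 4.0), (6.0, 8.0)]
high = [(50.0, 54.0), (70.0, 62.0)]
mae_all, rmse_all = overall_errors([low, high])
print(mae_all)             # 3.75
print(round(rmse_all, 3))  # 4.61
```

For sub-ranges of equal size, the pooled MAE equals the plain mean of the sub-range MAE values (here 1.5 and 6.0), whereas the pooled RMSE is not the plain mean of the sub-range RMSE values, so the aggregation rule matters when comparing tables.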
The comparison showed that dividing the concentration range into sub-ranges and modeling these sub-ranges separately improved the overall quality of the prediction, but that having too many sub-ranges and sub-models may be ineffective. The values of "overall MAE" and "overall RMSE" depending on the number of sub-ranges are presented in

The Results of the Modeling of NO Concentrations
Tables 7-10 show the results of NO concentration predictions obtained with the full-range and sub-range models. The errors of the PVS models for the Zabrze monitoring station are presented in Table 7 and those for the Złoty Potok monitoring station in Table 8. The corresponding errors for the RVS models are presented in Tables 9 and 10. The division into sub-ranges generally improved the accuracy of prediction, especially in the case of RVS models. In general, as the number of sub-models increased, the overall measures of modeling error decreased. The exception was the eight-sub-range PVS models for the monitoring station at Złoty Potok. At this station, the overall MAE value for the eight-sub-range sub-models (0.429 µg/m³) was comparable to the overall MAE value for the four-sub-range sub-models (0.428 µg/m³). In the case of RVS models, a decrease in the overall values of MAE and RMSE errors was always observed with an increase in the number of sub-ranges. At the Złoty Potok station, RVS sub-models could not be created for the sub-ranges of real concentrations 0.0-0.0 µg/m³ and 1.0-1.0 µg/m³, because the concentrations did not vary within these sub-ranges. Therefore, the overall error for this group of sub-models was not estimated. Figures 9-12 show the overall values of MAE and RMSE graphically.

The Results of the Modeling of NO2 Concentrations
Tables 11-14 show the results of NO2 concentration predictions obtained with the full-range and sub-range models. The errors of the PVS models for the Zabrze monitoring station are presented in Table 11 and those for the Złoty Potok monitoring station in Table 12. The corresponding errors for the RVS models are presented in Tables 13 and 14. The division into sub-ranges generally improved the accuracy of prediction, especially in the case of RVS models. In general, as the number of sub-models increased, the overall measures of modeling error decreased. The exceptions were the eight-sub-range PVS models for both monitoring stations. The overall MAE and RMSE values for the eight-sub-range sub-models were higher than the overall MAE and RMSE values for the four-sub-range sub-models. In the case of RVS models, a decrease in the overall values of MAE and RMSE errors was always observed with an increase in the number of sub-ranges. In the case of PVS models, it appeared that having too many sub-ranges and sub-models could degrade the quality of the modeling. Figures 13-16 show the overall values of MAE and RMSE graphically.

The Results of the Modeling of SO2 Concentrations
Tables 15-18 show the results of SO2 concentration predictions obtained with the full-range and sub-range models. The errors of the PVS models for the Zabrze monitoring station are presented in Table 15 and those for the Złoty Potok monitoring station in Table 16. The corresponding errors for the RVS models are presented in Tables 17 and 18. The division into sub-ranges generally improved the accuracy of prediction, especially in the case of RVS models. In general, as the number of sub-models increased, the overall measures of modeling error decreased. The exceptions were the four-sub-range and eight-sub-range PVS sub-models for the monitoring station at Zabrze. At this station, the overall MAE values (5.18 µg/m³ for the eight-sub-range sub-models, and 5.17 µg/m³ for the four-sub-range sub-models) were comparable to the overall MAE value for the two-sub-range sub-models (5.17 µg/m³). In the case of RVS models, a decrease in the overall values of MAE and RMSE was always observed with an increase in the number of sub-ranges. In the case of PVS models, it appeared that having too many sub-ranges and sub-models could degrade the quality of the modeling. Figures 17-20 show the overall values of MAE and RMSE graphically.

The Results of the Modeling of PM10 Concentrations
Tables 19-22 show the results of PM10 concentration predictions obtained with the full-range and sub-range models. The errors of the PVS models for the Zabrze monitoring station are presented in Table 19 and those for the Złoty Potok monitoring station in Table 20. The corresponding errors for the RVS models are presented in Tables 21 and 22. The division into sub-ranges generally improved the accuracy of prediction, especially in the case of RVS models. In general, as the number of sub-models increased, the overall measures of modeling error decreased. The exception was the eight-sub-range PVS models for the monitoring station at Złoty Potok. The overall MAE and RMSE values for the eight-sub-range models were higher than the overall MAE and RMSE values for the four-sub-range models and even the two-sub-range models. In the case of RVS models, a decrease in the overall values of MAE and RMSE was always observed with an increase in the number of sub-ranges. In the case of PVS models, it appeared that having too many sub-ranges and sub-models could degrade the quality of the modeling. Figures 21-24 show the overall values of MAE and RMSE graphically.

The Results of the Modeling of CO Concentrations
CO concentrations were not monitored at the Złoty Potok station, so the analysis was carried out using only the monitoring data from Zabrze. Tables 23 and 24 show the results of CO concentration predictions obtained with the full-range and sub-range models. The errors in the PVS models for the Zabrze monitoring station are presented in Table 23. The errors in the RVS models are presented in Table 24. The division into sub-ranges improved the accuracy of prediction in the case of RVS models. In general, as the number of sub-models increased, the overall measures of modeling error decreased. The PVS models showed only slight changes in accuracy. The MAE level did not change much. The overall MAE value for the eight-sub-range models was higher than the overall MAE value for the four-sub-range models and equal to the MAE value for the two-sub-range models. In the case of RVS models, a decrease in the overall values of MAE and RMSE was always observed with an increase in the number of sub-ranges. In the case of PVS models, it appeared that having too many sub-ranges and sub-models could degrade the quality of the modeling. Figures 25 and 26 show the overall values of MAE and RMSE graphically.
Table 25 shows the percentage changes in the overall values of MAE and RMSE obtained by modeling the concentrations in sub-ranges, calculated in relation to the error values of the corresponding full-range models. In the case of RVS models, each division into narrower concentration sub-ranges and the development of appropriate sub-range models resulted in a significant reduction in the overall value of the modeling error. When the number of sub-models increased, the overall measures of modeling error decreased. A significant improvement in the quality of modeling was achieved at both air monitoring stations. Modeling errors could be reduced by more than 60% using eight sub-models. However, it should be emphasized that the RVS models are not of great practical importance because their use requires knowledge of the real concentrations of the pollutants, and once the concentration values are known, there is no need to perform modeling. The importance of the RVS models was that they allowed us to assess the potential for improving the quality of the modeling.
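The percentage changes of the kind reported in Table 25 can be illustrated with a minimal sketch. The function name, the sign convention (negative means the error was reduced) and the numbers are assumptions made for illustration; they are not values from Table 25.

```python
def percent_change(sub_range_error: float, full_range_error: float) -> float:
    """Relative change (%) of an overall sub-range error versus the
    full-range error. Assumed convention: negative = error reduced."""
    if full_range_error == 0:
        raise ValueError("full-range error must be non-zero")
    return 100.0 * (sub_range_error - full_range_error) / full_range_error


# Hypothetical overall MAE values (illustrative, not taken from the paper).
full_range_mae = 10.0
eight_sub_range_mae = 4.0

change = percent_change(eight_sub_range_mae, full_range_mae)
print(f"{change:.1f}%")  # -60.0%
```

Under this convention, a value of -60% corresponds to a 60% reduction in the overall error relative to the full-range model.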

Summary and Discussion
The division into sub-models generally improved the accuracy of the PVS models; however, the decrease in modeling error was not as great as in the RVS models. Moreover, quite often, after splitting the full-range set into eight sub-ranges and running eight sub-models, the MAE and RMSE values were higher than in the sub-models created after division into only four sub-ranges. Such an effect was found for O 3, NO, NO 2 and PM 10 in Złoty Potok, and for NO 2, SO 2 and CO in Zabrze. The probable cause of the deterioration in the quality of prediction in some eight-sub-range PVS models was the error in the classification of cases into individual sub-ranges. The classification was made on the basis of the predicted concentration values obtained from the preliminary prediction. As the number of sub-ranges increased, the sub-ranges of concentrations became narrower and the number of misclassified cases increased. The number of misclassified cases became so large that it increased the mean prediction error in the sub-range.
The PVS models always showed a lower accuracy than the RVS models. This is understandable, as the PVS models required a preliminary prediction, which introduced an additional error. In conclusion, having too many sub-ranges and sub-models can degrade the quality of modeling for the PVS models.
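The misclassification effect described above can be sketched as follows. This is only an illustration under stated assumptions: the cut points, the helper names and the concentration values are hypothetical, and the preliminary predictions are invented numbers rather than output of the networks used in the study. A case counts as misclassified when its preliminary prediction falls into a different sub-range than its real concentration.

```python
from bisect import bisect_right


def sub_range_index(value, cuts):
    """Index of the sub-range containing `value`.
    `cuts` holds the interior cut points in ascending order."""
    return bisect_right(cuts, value)


def misclassified_share(real, preliminary, cuts):
    """Share of cases whose preliminary prediction lands in a different
    sub-range than the real concentration."""
    wrong = sum(
        sub_range_index(r, cuts) != sub_range_index(p, cuts)
        for r, p in zip(real, preliminary)
    )
    return wrong / len(real)


# Hypothetical real concentrations and preliminary predictions.
real = [4.0, 12.0, 18.0, 27.0, 33.0, 46.0]
preliminary = [6.0, 9.0, 21.0, 24.0, 36.0, 41.0]

# Hypothetical cut points for two, four and eight sub-ranges.
cuts_by_division = {
    2: [25.0],
    4: [10.0, 25.0, 40.0],
    8: [5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 40.0],
}

for n, cuts in cuts_by_division.items():
    share = misclassified_share(real, preliminary, cuts)
    print(f"{n} sub-ranges: {share:.2f} misclassified")
```

With the same preliminary prediction error, narrower sub-ranges send more cases to the wrong sub-model, which is consistent with the explanation given for the eight-sub-range PVS models.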
After division into sub-ranges (two, four or eight sub-ranges), the error in the models for the highest sub-range was always very large compared to the error in the models for the lower sub-ranges. There are two reasons for this effect. The first is that the highest sub-range was always wider than the lower sub-ranges. The second is the need to predict extremely high concentrations. The approximation of such unusual concentrations is always burdened with a higher error.
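How a large error in the highest sub-range feeds into an overall error measure can be sketched under one assumption: that the overall MAE and RMSE are computed over all cases pooled together. In that case, the overall MAE is the case-weighted mean of the sub-range MAEs, and the overall RMSE is the square root of the case-weighted mean of the squared sub-range RMSEs. The function names, error values and case counts below are hypothetical.

```python
import math


def pooled_mae(sub_maes, counts):
    """Overall MAE as the case-weighted mean of sub-range MAEs."""
    total = sum(counts)
    return sum(m * n for m, n in zip(sub_maes, counts)) / total


def pooled_rmse(sub_rmses, counts):
    """Overall RMSE from sub-range RMSEs: squared errors are averaged
    with case-count weights before taking the square root."""
    total = sum(counts)
    return math.sqrt(sum(r * r * n for r, n in zip(sub_rmses, counts)) / total)


# Hypothetical four-sub-range example; the highest sub-range has few
# cases but a much larger error.
counts = [500, 300, 150, 50]
sub_maes = [2.0, 3.0, 4.0, 12.0]
sub_rmses = [3.0, 4.0, 6.0, 20.0]

print(f"overall MAE:  {pooled_mae(sub_maes, counts):.2f}")
print(f"overall RMSE: {pooled_rmse(sub_rmses, counts):.2f}")
```

Because RMSE squares the errors before averaging, the sparsely populated highest sub-range influences the overall RMSE more strongly than the overall MAE in this sketch.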

Conclusions
Monitoring data are often not complete enough to carry out an air quality assessment. To fill the measurement gaps, predictive models are used. Such models often use archival measurement data from air monitoring systems. This is the best source of knowledge about the relationships between measured variables (concentrations and meteorological parameters). There is a need to model the missing concentrations as accurately as possible. The use of artificial neural networks reduces the prediction error compared to classical regression methods. In previous studies, a single regression model over the entire concentration range was used to approximate the concentrations of a selected pollutant. In this study, it was assumed that not a single model, but a group of models, could be used for the prediction. In this approach, each model from the group is dedicated to a different sub-range of the concentration of a modeled pollutant. The aim of this analysis was to check whether this approach would improve the quality of modeling.
The aim of the analysis was not to create the most up-to-date models based on possible new data. Once trained, a model using historical data (e.g., from 2011 to 2016) should also be able to predict concentrations for current data. This feature of the model's "generalization of acquired knowledge" was tested during the learning process on the cases from the test subset, and also on the cases from the validation subset after the network training was completed. The performed validation should show that the model can approximate the target value using data independent of the learning process.
Air monitoring data from the period 2011-2016 allowed us to verify the possibility of improving the accuracy of modeling by carrying out modeling in subsets. A similar analysis can be carried out using data from other monitoring stations, or data from Zabrze and Złoty Potok stations from a different period. However, the selected set of cases should cover a measurement period of several years, so that the recorded cases correspond to different meteorological situations and different ranges of concentrations of monitored pollutants.
The most important conclusions that resulted from the conducted analysis are as follows:
1. Modeling segmentation, consisting of prediction in sub-ranges of the concentrations of the modeled pollutant, allowed for a higher overall modeling accuracy.
2. For RVS models, segmentation of the modeling process guarantees a significant increase in modeling accuracy compared to a model based on the full range of concentrations.
3. In the case of PVS models, segmentation of the modeling process allows overall prediction errors to be reduced. However, the number of concentration sub-ranges cannot be too large. When predicting with eight sub-models, the modeling accuracy may be lower than when predicting with four sub-models.