Characterization of Surface Ozone Behavior at Different Regimes

Previous studies showed that the influence of meteorological variables and concentrations of other air pollutants on O3 concentrations changes at different O3 concentration levels. In this study, threshold models with artificial neural networks (ANNs) were applied to characterize the O3 behavior at an urban site (Porto, Portugal), describing the effect of environmental and meteorological variables on O3 concentrations. ANN characteristics, and the threshold variable and value, were defined by genetic algorithms (GAs). The considered predictors were hourly average concentrations of NO, NO2, and O3, and meteorological variables (temperature, relative humidity, and wind speed) measured from January 2012 to December 2013. Seven simulations were performed and the achieved models considered wind speed (at 4.9 m·s−1), temperature (at 17.5 °C), and NO2 (at 26.6 μg·m−3) as the variables that determine the change of O3 behavior. All the achieved models presented a similar fitting performance: R2 = 0.71–0.72, RMSE = 14.5–14.7 μg·m−3, and an index of agreement of the second order of 0.91. The combined effect of these variables on O3 concentration was also analyzed. This statistical model was shown to be a powerful tool for interpreting O3 behavior, which is useful for defining policy strategies for human health protection concerning this air pollutant.


Introduction
Surface ozone (O3) is considered one of the most concerning air pollutants in Europe. It is a secondary pollutant (it is not directly emitted), generated by chemical reactions in the atmosphere between primary air pollutants (nitrogen oxides, NOx, and volatile organic compounds, VOCs) in the presence of sunlight [1]. The impact of this air pollutant has been studied in different areas [2–5]. Concerning human health, O3 can cause injuries to airway epithelial cells (and lung diseases such as asthma), hyperplasia, headaches, and nausea, particularly in sensitive people, such as children and the elderly [6–10]. Regarding vegetation, O3 can damage plant leaves (decreasing both leaf photosynthesis and leaf area), reducing crop yields, which has a high negative economic impact [11–13]. Moreover, as a strong oxidant, it is responsible for the degradation of materials via corrosion [14].
As already mentioned, O3 is produced by chemical reactions between primary pollutants present in the atmosphere. It can also be transported from other locations by the wind (horizontal transport) and from the stratosphere (vertical transport) [15,16]. Thus, the atmosphere works as an open chemical reactor, in which reaction kinetics depend on the concentrations of reactants (primary air pollutants), mixing (wind speed and direction), temperature (which affects the kinetic rate constants exponentially, according to the Arrhenius equation), and solar radiation. Consequently, O3 behavior is highly dependent on the environment (urban, rural, or background) and on meteorological variables.
In urban areas, O3 concentrations are usually lower than the values observed in rural areas [17–20]. This phenomenon occurs mainly due to high NOx concentrations. Photochemical equilibrium is defined by the following equations [21–23]:

$$\mathrm{NO_2} + h\nu \rightarrow \mathrm{NO} + \mathrm{O} \qquad \text{(R1)}$$

$$\mathrm{O} + \mathrm{O_2} \rightarrow \mathrm{O_3} \qquad \text{(R2)}$$

$$\mathrm{NO} + \mathrm{O_3} \rightarrow \mathrm{NO_2} + \mathrm{O_2} \qquad \text{(R3)}$$

The photochemical decomposition of NO2 (chemical reaction R1) produces NO molecules and oxygen atoms, which combine with molecular oxygen to produce ozone (chemical reaction R2). O3 can also react with NO, forming NO2 (chemical reaction R3).

The complexity of the phenomena associated with O3 formation makes it hard to understand and predict its concentration in ambient air. Moreover, there are insufficient data (e.g., inventories of air pollutant emissions) to develop consistent phenomenological models to describe O3 behavior.
Alternatively, statistical models can characterize the relationship between variables using collected data; they involve only pattern recognition through mathematical operations. One of the most widely applied statistical models is the artificial neural network (ANN). ANNs are nonlinear models inspired by the biological neural processing system [24,25]. These models are composed of artificial neurons (grouped in layers; three layers, namely input, hidden, and output, are often applied), each of which receives an input value and converts it to an output through a selected function (the activation function). Additionally, ANNs are characterized by a high fitting performance. The rapid development of computer hardware has increased processing capabilities, which has made it possible to obtain ANN models with less computation time [26]. Therefore, these models have been used in a wide range of applications, including classification, regression, and mapping [27–29]. However, several settings need to be defined before the model parameters can be determined, including (i) the number of processing neurons in the hidden layer, and (ii) the activation function for each neuron. In recent years, genetic algorithms (GAs) have been applied to help define these settings. GAs are commonly applied to generate high-quality solutions for optimization and search problems, based on bio-inspired operators, such as mutation, crossover, and selection [30,31]. In GAs, a set of candidate solutions (called a population, i.e., a group of individuals) is iteratively modified through the mentioned genetic operators in order to find a group of better solutions for the next generation (new iteration). GAs present the following advantages: (i) continuous or discrete variables can be optimized; (ii) a derivative function is not required; (iii) multivariable problems can be optimized; (iv) extremely complex cost surfaces can be dealt with; and (v) a list of optimal solutions (and not just a single one) is provided.
In Porto, Portugal (the area selected for this study), an increasing trend of O3 concentrations (147% higher) has been observed since the 19th century due to the photochemical production of this pollutant, associated with the increase in anthropogenic emissions, mainly from traffic [32]. Additionally, in the north of Portugal, Lamas d'Olo is a rural site where the highest O3 concentrations are usually measured. Consequently, this site is often selected to evaluate O3 behavior. Russo et al. [33] mentioned that high-ozone episodes can be explained by several factors: (i) atmospheric stagnation; (ii) horizontal transport by the wind of ozone-rich air masses; (iii) high solar radiation and temperature; and (iv) the influence of local winds (sea breezes and valley winds). Carvalho et al. [19] observed a positive correlation between O3 concentrations and temperature and a negative correlation with relative humidity. Regarding the effect of the wind field, a northeast flow from Spain (Galicia and Asturias) was observed, which can be associated with the long-range transport of atmospheric pollutants to Portugal. Fernández-Guisuraga et al. [34] compared O3 trends at urban and rural sites. At the rural site, O3 concentrations were mainly influenced by the wind (transport), showing low variability with the concentrations of other pollutants. On the other hand, at the urban site, most of the variance was explained by the NO2/NOx ratio.

Several research studies can be found in which ANN models were applied to determine O3 trends and to predict its concentration (to provide early warning to the population when high O3 concentration episodes occur). Comrie [35] compared the performance of an ANN model with a multiple linear regression (MLR) model to predict daily average O3 concentrations in different cities with distinct climate and O3 regimes. ANN models presented slightly better performance than MLR. Abdul-Wahab and Al-Alawi [36] developed ANN models to predict O3 concentrations from meteorological and environmental data. The contribution of the meteorological data was estimated to be between 33% and 41%, while the remaining variation was attributed to chemical pollutants. NO, SO2, relative humidity (the highest contribution), non-methane hydrocarbons, and NO2 were the variables that most influenced the O3 concentrations. Temperature also played an important role, while solar radiation had a lower effect than expected. Pires et al. [37] compared threshold autoregressive (TAR) models, autoregressive (AR) models, and ANNs in the prediction of the next-day hourly average O3 concentrations. In the training period, the ANN presented a higher performance. However, in the test period, TAR models presented more accurate results, and the difference became greater when the evaluation was performed for the prediction of extreme values.
In recent studies, O3 concentrations have shown different behaviors regarding certain explanatory variables [25,37], which can be classified as O3 regimes. This observation can be justified by the chemical reactions associated with O3 formation/destruction, which are influenced by certain variables, such as temperature, solar radiation, and wind speed [32]. To take these regimes into account, threshold regression models were considered in this study [38]. Thus, GAs were used to define the threshold variable and value (the value of the explanatory variable corresponding to the change of regime; two regimes were selected), the number of hidden neurons, and the activation function in the hidden and output layers. In this study, hourly average O3 concentrations were modeled using threshold models with an ANN, whose structure was iteratively optimized by GAs. The achieved models enable the characterization of O3 variability with selected meteorological and environmental variables in different regimes.

Data
Air quality data were obtained from an urban background site (Sobreiras-Lordelo do Ouro, see Figure 1) of the Air Quality Monitoring Network (AQMN) of Porto, Portugal. The AQMN is managed by the Regional Commission of Coordination and Development of Northern Portugal (Comissão de Coordenação e Desenvolvimento Regional do Norte), under the responsibility of the Ministry of Environment. Hourly average concentrations of NO, NO2, and O3 from January 2012 to December 2013 (8760 hourly average values in 2012 and 7481 in 2013) were used to develop the proposed models. NO and NO2 were measured through the chemiluminescence method according to European Union (EU) Directive 1999/30/EC (European Community). According to EU Directive 2002/3/EC, O3 measurements were performed through UV-absorption photometry using a 41 M UV Photometric Ozone Analyzer (Environment S.A., Poissy, France). This monitoring equipment was subject to a rigorous maintenance program and was calibrated every 4 weeks. Measurements were continuously registered, and hourly average concentrations (in μg·m−3) were recorded.
The meteorological data were collected at a meteorological station located at Pedras Rubras, which is managed by Instituto Português do Mar e da Atmosfera (IPMA, I.P.); these values are considered representative of the entire Metropolitan Area of Porto. In this study, hourly averages of temperature (T, °C), relative humidity (RH, %), and wind speed (WS, m·s−1) were used to analyze the influence of meteorological conditions on O3 concentrations.

Statistical Model
In this study, threshold models with ANNs were defined with GAs, aiming to evaluate the effect of environmental and meteorological variables on O3 concentrations. The applied model is defined as follows:

$$
y = \begin{cases} \mathrm{net}_1(x_i), & x_d \le r \\ \mathrm{net}_2(x_i), & x_d > r \end{cases} \qquad (1)
$$

where y is the output variable, net1 and net2 are ANN models, xi are the explanatory variables, xd is the threshold variable, and r is the threshold value. The applied feedforward ANN models had three layers (input, hidden, and output) and considered eight input variables (hourly average data): NO concentration, NO2 concentration (due to the chemical reactions R1 and R3), the ratio NO2/NO (due to the equilibrium constant of the chemical reaction R3), T, RH, 1/RH (as RH usually shows a negative effect on O3 levels), WS, and 1/WS (for the same reason as 1/RH). The output variable was the hourly average O3 concentration measured at the same time as the input data, to infer the direct influence of these variables on O3 chemistry. Regarding the activation functions, the linear function was used for the output neuron, and the function for the hidden layer was selected by GAs from four options: sigmoid, hyperbolic tangent, inverse, and radial basis. The data were divided into training (75%) and validation (25%) sets, and the early stopping method (the ANN training procedure is stopped when an increase in validation error is observed) was applied to avoid overfitting. The division of the data was performed by time: 75% for training (January 2012 to 25 May 2013) and 25% for validation (25 May 2013 to 19 December 2013). In the training set, O3 concentrations ranged from 0 to 161 μg·m−3, while in the validation set they ranged from 0 to 170 μg·m−3.
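The two-regime structure of Equation (1) and the chronological 75%/25% split can be illustrated with a short Python sketch. This is not the authors' MATLAB code: the function names `threshold_predict` and `chronological_split` are hypothetical, the two networks are represented by arbitrary callables, and the assignment of the lower regime to x_d ≤ r is taken from the regimes reported in the results (e.g., WS ≤ 4.9 m·s−1).

```python
import numpy as np

def threshold_predict(X, d, r, net1, net2):
    """Evaluate a two-regime threshold model.

    Rows with X[:, d] <= r are sent to net1, the others to net2.
    net1 and net2 are any callables mapping a 2-D array to a 1-D array
    (placeholders for the trained ANNs, which are not reproduced here).
    """
    X = np.asarray(X, dtype=float)
    low = X[:, d] <= r                      # mask of the first regime
    y = np.empty(X.shape[0])
    if low.any():
        y[low] = net1(X[low])
    if (~low).any():
        y[~low] = net2(X[~low])
    return y

def chronological_split(X, y, train_fraction=0.75):
    """Split time-ordered data: first part for training, rest for validation."""
    n_train = int(len(y) * train_fraction)
    return (X[:n_train], y[:n_train]), (X[n_train:], y[n_train:])
```

Because the split is made by time rather than at random, the validation set covers a later period than the training set, which is why the two sets can have different concentration ranges.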
GAs are a search and optimization technique based on the Darwinian principles of evolution and natural genetics [30,31]. The procedure begins with a randomly generated set of individuals (the population). Each individual (also called a chromosome) is a binary code string that contains information about a set of parameters, representing a potential solution to a given problem. To evaluate the quality of each proposed solution (to rank the individuals in the population), a fitness function should be defined. To create new chromosomes for the next generation, the fittest chromosomes are submitted to the genetic operations [30]: (i) selection; (ii) crossover; (iii) mutation. These new chromosomes are then evaluated according to the fitness function, and the ones with the highest performance are selected. The repetition of this procedure generates a sequence of populations containing better solutions. The termination criteria can be (i) to stop after a previously defined maximum number of generations is reached, or (ii) to stop when a desired fitness value is achieved. In this study, GAs were used to define the threshold variable and value, the number of hidden neurons, and the activation function in the hidden layer, and to select the explanatory variables to be used in each ANN model.
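The generic procedure described above (random binary population, fitness ranking, selection, crossover, mutation, and a fixed number of generations) can be sketched in Python as follows. All settings here are illustrative assumptions, not the study's: the function name `run_ga`, the population size, the rates, binary tournament selection, one-point crossover, and elitism are choices made only for this example.

```python
import numpy as np

def run_ga(fitness, n_bits, pop_size=20, n_generations=30,
           crossover_rate=0.8, mutation_rate=0.02, seed=0):
    """Minimal binary GA that maximizes `fitness` (hypothetical settings)."""
    rng = np.random.default_rng(seed)
    pop = rng.integers(0, 2, size=(pop_size, n_bits))   # random initial population
    best, best_fit = None, -np.inf
    for _ in range(n_generations):                      # stop after a fixed number of generations
        fit = np.array([fitness(ind) for ind in pop])
        i = int(fit.argmax())
        if fit[i] > best_fit:                           # remember the best chromosome so far
            best, best_fit = pop[i].copy(), float(fit[i])
        # Selection: binary tournament between randomly drawn pairs.
        a = rng.integers(0, pop_size, size=pop_size)
        b = rng.integers(0, pop_size, size=pop_size)
        parents = pop[np.where(fit[a] >= fit[b], a, b)]
        # Crossover: one-point exchange between consecutive parents.
        children = parents.copy()
        for k in range(0, pop_size - 1, 2):
            if rng.random() < crossover_rate:
                cut = int(rng.integers(1, n_bits))
                children[k, cut:] = parents[k + 1, cut:]
                children[k + 1, cut:] = parents[k, cut:]
        # Mutation: flip each bit with a small probability.
        flips = rng.random(children.shape) < mutation_rate
        children = np.where(flips, 1 - children, children)
        children[0] = best                              # elitism: keep the best solution
        pop = children
    return best, best_fit
```

In the setting of this study, the fitness of a chromosome would be derived from the fitting performance of the threshold model it encodes; in the sketch, any function of the bit string can be passed, for example `run_ga(lambda c: c.sum(), n_bits=37)`.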

The determination of the models was coded by the authors with MATLAB® software (R2014a, MathWorks, Natick, MA, USA, 2014) using the following specifications:

Figure 2 shows an example of a chromosome (37 bits). It is divided into 8 sets of bits (SBi). SB1 (3 bits) defines the threshold variable (one of the explanatory variables, up to a maximum of 8) through the conversion from binary to decimal numbers (MATLAB function bin2dec). SB2 (8 bits) defines the threshold value. With the threshold variable already defined, the maximum (xmax) and minimum (xmin) values of this variable are determined. The threshold value is calculated based on Equation (2).

SB3 and SB6 (2 bits each) define the activation function for the hidden layer of each ANN: 00, log-sigmoid (logsig); 01, hyperbolic tangent sigmoid (tansig); 10, inverse (netinv); 11, radial basis (radbas). SB4 and SB7 (3 bits each) define the number of neurons in the hidden layer through the conversion from binary to decimal numbers (1 to 8). SB5 and SB8 (8 bits each) define the explanatory variables that are used in each ANN (1 bit for each explanatory variable): 0, not selected; 1, selected.
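The chromosome layout described above (3 + 8 + 2 + 3 + 8 + 2 + 3 + 8 = 37 bits) can be decoded as in the following Python sketch. Two points are assumptions made for illustration, because the corresponding details are not available in the text: the threshold value is mapped linearly from the 8-bit integer onto [xmin, xmax] (the exact form of Equation (2) may differ), and the 3-bit fields are shifted by one so that the number of hidden neurons falls in the stated range of 1 to 8. The names `decode`, `FIELD_WIDTHS`, and the returned dictionary keys are hypothetical.

```python
ACTIVATIONS = ["logsig", "tansig", "netinv", "radbas"]   # codes 00, 01, 10, 11
FIELD_WIDTHS = [3, 8, 2, 3, 8, 2, 3, 8]                  # SB1..SB8, 37 bits in total

def bin2dec(bits):
    """Binary-to-decimal conversion, mirroring MATLAB's bin2dec."""
    return int("".join(str(int(b)) for b in bits), 2)

def decode(chromosome, x_min, x_max):
    """Decode a 37-bit chromosome into threshold and ANN settings."""
    assert len(chromosome) == sum(FIELD_WIDTHS)
    fields, pos = [], 0
    for width in FIELD_WIDTHS:
        fields.append(list(chromosome[pos:pos + width]))
        pos += width
    sb1, sb2, sb3, sb4, sb5, sb6, sb7, sb8 = fields
    d = bin2dec(sb1)                                     # 0-based index of the threshold variable
    # ASSUMPTION: linear mapping of the 8-bit value onto [x_min, x_max].
    r = x_min[d] + bin2dec(sb2) / 255 * (x_max[d] - x_min[d])
    nets = []
    for act, neurons, mask in ((sb3, sb4, sb5), (sb6, sb7, sb8)):
        nets.append({
            "activation": ACTIVATIONS[bin2dec(act)],
            "hidden_neurons": bin2dec(neurons) + 1,      # ASSUMPTION: offset so the range is 1..8
            "inputs": [i for i, b in enumerate(mask) if int(b) == 1],
        })
    return {"threshold_variable": d, "threshold_value": r, "nets": nets}
```

For example, with the inputs ordered as NO, NO2, NO2/NO, T, RH, 1/RH, WS, 1/WS, a chromosome starting with `110` selects index 6 (WS) as the threshold variable.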

Air Quality and Meteorological Data Characterization
During the analyzed period, the hourly average O3 concentrations were between 0 and 170 μg·m−3 (exceeding neither the information threshold nor the alert threshold, 180 and 240 μg·m−3, respectively). Regarding O3 exceedances of the EU limits for the protection of human health, the 8 h average O3 concentrations were higher than 120 μg·m−3 twice in September 2012, twice in July 2013, and once in August 2013. Figure 3 shows the average daily profile of O3 concentrations. As a photochemical pollutant, its concentration increases during the daylight period, presenting a maximum between 14:00 and 15:00 and a minimum at night. The observed profile is characteristic of an urban site, as it does not present a high amplitude of concentrations (due to the presence of high NOx concentrations).
Figure 4 shows the monthly average values of NO, NO2, and O3 concentrations, as well as the analyzed meteorological variables (temperature, relative humidity, and wind speed). High O3 concentrations were observed in April 2012 (63.7 μg·m−3) and from March to July 2013 (59.9–71.3 μg·m−3). In these periods, low concentrations of NO and NO2 were also measured, and high temperatures and low relative humidity were observed. On the other hand, lower O3 concentrations were measured for periods with high NO and NO2 concentrations and lower temperatures. These observations are in agreement with other research studies in which the behavior of O3 was analyzed [18,39–41]. Pires et al. [40] compared several linear models to predict O3 concentrations at an urban site in Porto. The correlation analysis performed between O3 concentrations and the environmental and meteorological variables also showed negative correlations with NO, NO2, and RH and a positive correlation with T. In another study focusing on the same region [41], O3 concentrations were negatively correlated with NO, NO2, and RH and positively correlated with T and WS. Zhang, Wang, Park, and Deng [18] analyzed high O3 concentration episodes and related them to meteorological variables. O3 concentrations were highly correlated with maximum temperature and minimum relative humidity. The effect of minimum WS was also analyzed at urban, suburban, and rural sites. O3 concentrations were positively (negatively) correlated with minimum WS at urban (suburban and rural) sites. Shan, Yin, Zhang, Ji, and Deng [39] analyzed the effect of meteorological variables on O3 concentrations at an urban site in China. Daily average O3 concentrations were negatively correlated with pressure and RH, and positively correlated with temperature, solar radiation, sunshine duration, and wind speed.
Appl. Sci. 2017, 7, 944

Linear Correlation Analysis
Figure 5 shows the variation in the linear correlation between O3 and the explanatory variables on a monthly basis. Negative correlations were observed for NO (−0.547 to −0.296) and NO2 (−0.807 to −0.276) concentrations. The effect of these air pollutants was more significant in winter periods than in summer periods. Regarding the effect of meteorological variables, temperature was usually positively correlated with O3, as expected. The highest value (R = 0.661) was determined in September 2013 and an unusual negative correlation (R = −0.376) was determined in July 2013. RH was negatively correlated in almost all periods. The highest impact was also observed in September 2013 (R = −0.685) and an unusual positive correlation was determined in July 2013 (R = 0.419). Chen et al. [42] demonstrated that RH favors O3 decomposition, justifying the associated negative effect. WS can have two different effects on O3 concentrations. Low WS can promote the accumulation of O3 produced in the region (increasing its concentration), while high WS reduces the levels of other air pollutants (such as NOx) that influence the O3 chemistry (in the case of NOx, a decrease in its concentration leads to an increase in O3 levels). Thus, the effect of WS on O3 concentrations depends on the studied environment: urban or rural. In this study (urban environment), WS was positively correlated with O3 concentrations, with the highest value (R = 0.593) in February 2013.
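A month-by-month linear correlation analysis of this kind can be reproduced with a few lines of Python. The sketch below is only illustrative: the function name `monthly_correlation`, the use of string month labels, and the rule of skipping months with fewer than three valid pairs are assumptions, and the Pearson coefficient is assumed to be the correlation measure reported as R.

```python
import numpy as np

def monthly_correlation(months, x, y):
    """Pearson correlation between x and y, computed separately for each month label."""
    months = np.asarray(months)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    result = {}
    for m in np.unique(months):
        sel = months == m
        xs, ys = x[sel], y[sel]
        ok = ~(np.isnan(xs) | np.isnan(ys))      # drop hours with missing data
        if ok.sum() > 2:                         # need a few pairs for a meaningful R
            result[str(m)] = float(np.corrcoef(xs[ok], ys[ok])[0, 1])
    return result
```

Calling it once per explanatory variable (for example with hourly temperature as `x` and hourly O3 as `y`) gives one coefficient per month, which is the kind of series plotted in Figure 5.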

ANN Models and Interpretation
Seven simulations were performed to determine the models able to describe the relationship between O3 concentrations with NO, NO2, T, RH, and WS (measured at same time).These models are threshold models, considering two O3 regimes where the relationship between output and input variables are different.The change from one regime to another depends on the value (threshold value) of a specific input variable (threshold variable).GAs were used to optimize the ANN characteristics (the number of hidden neurons, the activation function in the hidden layer, and the input variables), and the threshold variable and value.The models were evaluated according their fitting performance in training and validation sets.Table 1 shows the best models for the seven simulations.All of them presented similar fitting performance: (i) R 2 = 0.71-0.72;(ii) RMSE between 14.5 and 14.7 μg•m −3 ; and (iii) the index of agreement of the second order of 0.91.Three explanatory variables were selected, each one with a specific threshold value: (i) WS with 4.9 m•s −1 ; (ii) T with 17.5 °C; and (iii) NO2 with 26.6 μg•m −3 .Generally, hyperbolic tangent and radial basis were the functions selected for the hidden layer, composed by 7 or 8 neurons.In almost all models, all input variables were selected as ANN inputs in both O3 regimes.
The analysis of the combined effect of input variables (two variables) was performed for the three threshold variables, considering the two regimes determined by the best models in Simulations I, II, and III. Figure 6 shows the combined effect of NO2, T, and WS on O3 concentrations for WS ≤ 4.9 m•s −1 and for WS > 4.9 m•s −1 .For WS ≤ 4.9 m•s −1 , O3 concentrations (i) decreased with NO2 except when T > 17 °C (without significant variation), (ii) increased with T except when WS > 2.8 m•s −1 (O3 presented a maximum between 17 and 20 °C), and (iii) did not change significantly with WS except when T > 26 °C (presenting a decreasing tendency).For WS > 4.9 m•s −1 , O3 concentrations (i) decreased with NO2 except when T > 17 °C (presenting a slight increase), (ii) presented high values for high T with all tested ranges of NO2 concentrations (presenting a local maximum-≈87.4μg•m −3for T ≈ 11 °C and NO2 ≈ 6 μg•m −3 ), (iii) increased with T for the tested range of WS, (iv) did not change significantly with WS, and (v) presented higher values than those where WS ≤ 4.9 m•s −1 .The combined effect of T-NO2 is in agreement with what was concluded in linear correlation analysis.The effect of NO2 is more significant in the winter period, in which temperatures are usually low and NO2 concentrations are high (see Figure 4).With high NO2 concentrations, the chemical equilibrium given by Equation (R3) limits the increase in O3 concentrations.In addition, based on a comparison of the two regimes defined by WS, the combined effect of these two variables presented similar behavior; however, O3 concentrations were higher when WS > 4.9 m•s −1 (49-104 μg•m −3 ) than they were when WS ≤ 4.9 m•s −1 (11-41 μg•m −3 ).High values of WS are associated with the dispersion of air pollutants, reducing their concentration.As NO2 concentrations decrease, O3 concentrations can achieve higher values [17,19,20,22].The combined effect of T-WS showed the highest variability of O3 with T, showing a 
positive correlation between these variables.The O3 variability with WS is almost insignificant.Regarding the combined effect NO2-WS, similar conclusions were drawn: O3 concentrations presented a decreasing tendency with NO2 concentrations, and their variability with

ANN Models and Interpretation
Seven simulations were performed to determine the models able to describe the relationship between O 3 concentrations with NO, NO 2 , T, RH, and WS (measured at same time).These models are threshold models, considering two O 3 regimes where the relationship between output and input variables are different.The change from one regime to another depends on the value (threshold value) of a specific input variable (threshold variable).GAs were used to optimize the ANN characteristics (the number of hidden neurons, the activation function in the hidden layer, and the input variables), and the threshold variable and value.The models were evaluated according their fitting performance in training and validation sets.Table 1 shows the best models for the seven simulations.All of them presented similar fitting performance: (i) R 2 = 0.71-0.72;(ii) RMSE between 14.5 and 14.7 µg•m −3 ; and (iii) the index of agreement of the second order of 0.91.Three explanatory variables were selected, each one with a specific threshold value: (i) WS with 4.9 m•s −1 ; (ii) T with 17.5 • C; and (iii) NO 2 with 26.6 µg•m −3 .Generally, hyperbolic tangent and radial basis were the functions selected for the hidden layer, composed by 7 or 8 neurons.In almost all models, all input variables were selected as ANN inputs in both O 3 regimes.
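The analysis of the combined effect of input variables (taken two at a time) was performed for the three threshold variables, considering the two regimes determined by the best models in Simulations I, II, and III. Figure 6 shows the combined effect of NO2, T, and WS on O3 concentrations for WS ≤ 4.9 m·s−1 and for WS > 4.9 m·s−1. For WS ≤ 4.9 m·s−1, O3 concentrations (i) decreased with NO2 except when T > 17 °C (without significant variation), (ii) increased with T except when WS > 2.8 m·s−1 (O3 presented a maximum between 17 and 20 °C), and (iii) did not change significantly with WS except when T > 26 °C (presenting a decreasing tendency). For WS > 4.9 m·s−1, O3 concentrations (i) decreased with NO2 except when T > 17 °C (presenting a slight increase), (ii) presented high values for high T over all tested ranges of NO2 concentrations (with a local maximum of approximately 87.4 µg·m−3 at T ≈ 11 °C and NO2 ≈ 6 µg·m−3), (iii) increased with T for the tested range of WS, (iv) did not change significantly with WS, and (v) presented higher values than those for WS ≤ 4.9 m·s−1.
The combined effect of T-NO2 is in agreement with the conclusions of the linear correlation analysis. The effect of NO2 is more significant in the winter period, in which temperatures are usually low and NO2 concentrations are high (see Figure 4). With high NO2 concentrations, the chemical equilibrium given by Equation (R3) limits the increase in O3 concentrations. In addition, a comparison of the two regimes defined by WS shows that the combined effect of these two variables was similar in both; however, O3 concentrations were higher when WS > 4.9 m·s−1 (49–104 µg·m−3) than when WS ≤ 4.9 m·s−1 (11–41 µg·m−3). High values of WS are associated with the dispersion of air pollutants, reducing their concentration. As NO2 concentrations decrease, O3 concentrations can achieve higher values [17,19,20,22]. The combined effect of T-WS showed the highest variability of O3 with T, with a positive correlation between these variables. The O3 variability with WS is almost insignificant.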
Regarding the combined effect of NO2-WS, similar conclusions were drawn: O3 concentrations presented a decreasing tendency with NO2 concentrations, and their variability with WS is almost insignificant. Figures S1 and S2 present the effect of the same combinations of variables on O3 concentrations considering T (Simulation II) and NO2 (Simulation III) as threshold variables, respectively. A similar analysis can be performed using these figures.
The application of this statistical methodology allows the influence of environmental and meteorological variables on O3 concentrations to be determined. Consequently, it is possible to develop more accurate predictive models for this secondary pollutant, which is important for the definition of policy measures for human health protection.

Conclusions
Linear correlation analysis showed a positive relationship of O3 concentrations with T and WS, while NO, NO2, and RH showed a negative effect. In the studied period, the highest O3 concentrations were observed for low NOx concentrations and high wind speed. Threshold models with ANNs, whose characteristics were defined by genetic algorithms, identified three variables that could define different O3 regimes: (i) wind speed, at 4.9 m·s−1; (ii) temperature, at 17.5 °C; and (iii) NO2 concentration, at 26.6 µg·m−3. The achieved models enabled the evaluation of the combined effect of two input variables in different O3 regimes. This information may be useful for defining policy strategies for human health protection concerning surface ozone.

• a population size of 100;
• a selection probability of 0.20 (proportion of the individuals of the new generation obtained by the selection operator);
• a selection criterion based on elitism (a small proportion of the fittest candidates is copied unchanged into the next generation);
• a crossover probability of 0.70 (proportion of the individuals of the new generation obtained by the crossover operator);
• a mutation probability of 0.1 (proportion of the individuals of the new generation obtained by the mutation operator);
• an evaluation of root mean squared error (RMSE) in training and validation sets;
• a stopping criterion based on the maximum number of generations.
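The settings listed above can be sketched as a generational loop. This is only an illustrative sketch: the objective `toy_error` is a placeholder for the RMSE of a trained model, the chromosome is assumed to be a short list of real-valued genes, single-point crossover and single-gene mutation are assumed operators, the whole selection share is treated as elitist copies, and `MAX_GENERATIONS = 50` is an assumed value (the source states only the criterion, not the number).

```python
import random

POP_SIZE = 100        # population size (from the list above)
P_SELECTION = 0.20    # share of the new generation obtained by selection
P_CROSSOVER = 0.70    # share obtained by crossover
P_MUTATION = 0.10     # share obtained by mutation
MAX_GENERATIONS = 50  # assumed value for the stopping criterion
N_GENES = 8           # assumed chromosome length


def toy_error(chromosome):
    """Placeholder objective (lower is better), standing in for model RMSE."""
    return sum((g - 0.5) ** 2 for g in chromosome)


def crossover(a, b, rng):
    """Single-point crossover of two parents."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(chromosome, rng):
    """Replace one randomly chosen gene with a new random value."""
    child = list(chromosome)
    child[rng.randrange(len(child))] = rng.random()
    return child


def run_ga(seed=0):
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(N_GENES)] for _ in range(POP_SIZE)]
    n_sel = round(P_SELECTION * POP_SIZE)
    n_cross = round(P_CROSSOVER * POP_SIZE)
    n_mut = POP_SIZE - n_sel - n_cross  # remaining share, equal to P_MUTATION here
    history = []  # best error at the start of each generation
    for _ in range(MAX_GENERATIONS):
        ranked = sorted(population, key=toy_error)
        history.append(toy_error(ranked[0]))
        survivors = ranked[:n_sel]  # fittest individuals copied unchanged
        children = [crossover(*rng.sample(survivors, 2), rng) for _ in range(n_cross)]
        mutants = [mutate(rng.choice(survivors), rng) for _ in range(n_mut)]
        population = survivors + children + mutants
    best = min(population, key=toy_error)
    return best, history


best, history = run_ga()
print(toy_error(best), history[0], history[-1])
```

Because the fittest individuals are carried over unchanged, the best error recorded in `history` cannot increase from one generation to the next.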

Figure 2. Example of a chromosome.

Figure 3. Daily average profile of O3 concentrations at the monitoring site.
Figure 5 shows the monthly variation in the linear correlation between O3 concentrations and the environmental and meteorological variables. Negative correlations were observed for NO (−0.547 to −0.296) and NO2 (−0.807 to −0.276) concentrations. The effect of these air pollutants was more significant in winter periods than in summer periods. Regarding the effect of meteorological variables, temperature was usually positively correlated with O3, as expected. The highest value (R = 0.661) was determined in September 2013, and an unusual negative correlation (R = −0.376) was determined in July 2013. RH was negatively correlated in almost all periods. The highest impact was also observed in September 2013 (R = −0.685), and an unusual positive correlation was determined in July 2013 (R = 0.419). Chen et al. [42] demonstrated that RH favors O3 decomposition, justifying the associated negative effect. WS can have two different effects on O3 concentrations. Low WS can promote the accumulation of O3 produced in the region (increasing its concentration), while high WS reduces the levels of other air pollutants (such as NOx) that influence the O3 chemistry (in the case of NOx, a decrease in its concentration leads to an increase in O3 levels). Thus, the effect of WS on O3 concentrations depends on the studied environment: urban or rural. In this study (urban environment), WS was positively correlated with O3 concentrations, with the highest value (R = 0.593) in February 2013.
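The month-by-month correlations discussed above can be computed along the following lines. This is a minimal sketch with synthetic data: the helper name `monthly_pearson`, the month labels, and all numeric values are illustrative assumptions and do not come from the monitoring records.

```python
import numpy as np


def monthly_pearson(months, x, y):
    """Pearson R between x and y, computed separately for each month label."""
    months = np.asarray(months)
    x, y = np.asarray(x, float), np.asarray(y, float)
    result = {}
    for m in np.unique(months):
        sel = months == m
        result[str(m)] = float(np.corrcoef(x[sel], y[sel])[0, 1])
    return result


# Synthetic hourly data for two 30-day months (illustrative values only).
rng = np.random.default_rng(1)
months = np.repeat(["month_A", "month_B"], 720)
temperature = rng.normal(12.0, 4.0, size=1440)
noise = rng.normal(0.0, 5.0, size=1440)
# By construction, O3 rises with temperature in month_A and falls in month_B.
slope = np.where(months == "month_A", 2.0, -2.0)
o3 = 50.0 + slope * temperature + noise

r_by_month = monthly_pearson(months, temperature, o3)
print(r_by_month)
```

Computing R per month rather than over the whole record is what allows sign changes, such as the July 2013 values reported above, to become visible.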

Figure 5. Temporal variation of linear correlation between O3 concentrations and the following: (a) NO concentrations, (b) temperature, (c) NO2 concentrations, (d) relative humidity, and (e) wind speed.

Figure S1: The combined effect of NO2 concentrations, temperature, and wind speed on O3 concentrations based on the model determined in Simulation II; Figure S2: The combined effect of NO2 concentrations, temperature, and wind speed on O3 concentrations based on the model determined in Simulation III.

Table 1. ANN models: their input variables, activation functions (AF), number of hidden neurons (HN), and performance indexes (R2, RMSE (root mean squared error), and d2) for each performed simulation (Sim).

Appl. Sci. 2017, 7, 944