Short-Term Forecasting of Ozone Concentration in Metropolitan Lima Using Hybrid Combinations of Time Series Models

: In the modern era, air pollution is one of the most harmful environmental issues on the local, regional, and global stages. Its negative impacts go far beyond ecosystems and the economy, harming human health and environmental sustainability. Given these facts, efﬁcient and accurate modeling and forecasting for the concentration of ozone are vital. Thus, this study explores an in-depth analysis of forecasting the concentration of ozone by comparing many hybrid combinations of time series models. To this end, in the ﬁrst phase, the hourly ozone time series is decomposed into three new sub-series, including the long-term trend, the seasonal trend, and the stochastic series, by applying the seasonal trend decomposition method. In the second phase, we forecast every sub-series with three popular time series models and all their combinations In the ﬁnal phase, the results of each sub-series forecast are combined to achieve the results of the ﬁnal forecast. The proposed hybrid time series forecasting models were applied to four Metropolitan Lima monitoring stations—ATE, Campo de Marte, San Borja, and Santa Anita—for the years 2017, 2018, and 2019 in the winter season. Thus, the combinations of the considered time series models generated 27 combinations for each sampling station. They demonstrated signiﬁcant forecasts of the sample based on highly accurate and efﬁcient descriptive, statistical, and graphic analysis tests, as a lower mean error occurred in the optimized forecast models compared to baseline models. The most effective hybrid models for the ATE, Campo de Marte, San Borja, and Santa Anita stations were identiﬁed based on their superior out-of-sample forecast results, as measured by RMSE (4.611, 3.637, 1.495, and 1.969), RMSPE (4.464, 11.846, 1.864, and 15.924), MAE (1.711, 2.356, 1.078, and 1.462), and MAPE (14.862, 20.441, 7.668, and 76.261) errors. These models signiﬁcantly outperformed other models due to their lower error values. In addition, the best models are statistically signiﬁcant ( p < 0.05) and superior to the rest of the combination models. Furthermore, the ﬁnal proposed models show signiﬁcant performance with the least mean error, which is comparatively better than the considered baseline models. Finally, the authors also recommend using the proposed hybrid time series combination forecasting models to predict ozone concentrations in other districts of Lima and other parts of Peru.


Introduction
The stratosphere is the atmospheric layer characterized by the significant presence of ozone (O 3 ), which benefits all kinds of life on the planet due to the filtering process of solar ultraviolet radiation that occurs in the environment.Its presence in the biosphere is harmful to the health of all living beings and the environment because ozone is not only a greenhouse gas but also a powerful oxidant that contributes to global warming [1].In addition, the impact caused by this atmospheric pollutant on crop production is known [2].
Recently, the atmospheric levels of ozone in the air have increased, affecting more and more people, especially in the cardiovascular system, causing inflammation, oxidative stress, and imbalances that have been related to mortality and morbidity [3].It is important to develop stricter controls on O 3 precursors to mitigate the increased risks of ozone pollution episodes [4].Tropospheric ozone monitoring represents a practical tool to analyze spatiotemporal trends in the behavior of this polluting agent in the air [5].Thus, accurately forecasting ozone concentration is crucial to safeguarding vulnerable individuals, such as children, the elderly, and outdoor workers, from air pollution during hazardous periods of the day.Ground-level ozone concentrations are of significant concern due to their toxic agents, which can adversely affect the respiratory systems of people who inhale high ozone concentrations for extended periods.These adverse health effects can lead to decreased lung function, chest pain while breathing, coughing, throat infections, congestion, and worsening symptoms of asthma.
Time series record the observations made in a particular place and are associated with the evolution over time of a particular variable; observed behavior cannot be replicated with repeated experiments, and observations are often time-dependent.This information has allowed the development of traditional deterministic modeling and statistical models.Although chemical transport models have generally been applied to differentiate emission sources and meteorological variables to explain short-or long-term ozone fluctuations, temporal analysis can show spatial and seasonal changes in the distribution of ozone concentrations [6].At the same time, statistical models are generated by relational analysis between factors influencing pollutants, producing powerful statistical prediction equations [7].However, when you want to study the behavior in the spatiotemporal distribution of a pollutant, the problem lies in the variability of pollutant concentrations, which are strongly influenced by the fluctuations of the emitting sources and the meteorological statehourly, daily, seasonal, and annual.Thus, the impact exerted by trends in the behavior of air pollutants may be beneficial to optimize the performance of modeling [7].
Currently, statistical modeling is evolving, including the management of time series that deserves to be compared with traditional models, mainly multiple linear regression.However, it is still necessary to continue exploring new studies to improve the prediction of reliable models, the reduction of noise through filters, and the organization of the numerical information of contaminants [7].Decomposition is a methodology applied to analyze time series air pollutant data; the decomposition in ensemble empirical mode is counted to process these non-stationary and nonlinear signals and allows one to gradually separate the different fluctuation components [8].Generally, the numerical data of the ozone time series have various types of patterns, so it is essential to break the database into several components or sub-classes in such a way that each one is a unique pattern of the data.Furthermore, the time series for an air pollutant is considered to be additive and may comprise elements over time [6].Interrupted time series designs are a powerful tool for comparing the variation of levels and the trend of results [9].
According to Din [6], the ozone concentration at time t is given by the sum of each component (from decomposition).One component is given by the "trend" of time in the time series and is relevant to the persistent decrease or increase in ozone concentration driven via emission sources or meteorological variables.For its part, the second component, "seasonality", describes the fluctuations of the periodic seasons (decomposed), and the third, fixed by a short-term component, shows the "rest" of the random data once the first two components have been separated.In addition, other combined decomposition methods or structures have been proposed in series and time convolution and long-term short-memory bidirectional networks [10].Other models use a non-parametric Theil-Sen estimator as a robust Kendall [11] line-fit method or locally estimated scatterplot models for smoothing to filter the data obtained and subsequently decompose the time series models into trend, seasonal, and residual components of data and then recombine them appropriately [12].After the decomposition of the time series into components or subseries, the data can be used in standardized time series modeling as linear, nonlinear autoregressive, or autoregressive moving averages.Linear models such as autoregressive are difficult to handle with nonlinear and time-varying data [13].However, the application of combined auto-correlation function (ACF) and partial autocorrelation function (PACF) graphs overcomes the limitations of simple techniques by showing the correlation between the time series and the lags after excluding the contributions from previous lags [14].Iftikhar et al. [15,16] applied a nonlinear autoregressive model relating a past value and smoothed functions of the original values of the time series.An autoregressive moving average was also applied to take into account the errors that make up the model, as well as linear models of combination for all lag observations and the lag error term.On the other hand, machine learning models have also been used to forecast ozone levels.For instance, the researchers in [17] proposed a deep learning model for the prediction of ozone levels in Aarhus using a grid search technique and implemented it as an accuracy tool for forecasting ozone levels in smart cities.The ozone concentration in India is predicted [18] using eight machine learning models, including XGboost, random forest, k-nearest neighbor, support vector regression, decision tree, Adaboost, linear regression, and bidirectional long-shortterm memory, which achieved the predictive capabilities with a R 2 of 0.75 in winter.The researchers further divided the predicted capabilities in terms of season, and the winter season was found to be more predictable with 97.3%, post-monsoon 92.8%, monsoon 90.3%, and summer 88.9%.The authors in [19][20][21] applied time series, hybrid decomposition, machine learning, and deep learning models for forecasting ozone concentration in Tehran, Iran, in 11 municipal districts of Nanjing, China, and 8 out of 35 stations in Turkey.
Peru is a country located in South America in the Southeast Pacific Region, and its capital, Lima, is no stranger to ozone air pollution.Lima has become a megacity with more than 10 million inhabitants and severe air pollution problems.Romero et al. [22] evaluated the impact of meteorological variables on the ozone concentration and other pollutants present in the air through linear correlations made for data obtained between 2015 and 2018 at eight sampling stations in metropolitan Lima and reported that this pollutant increased with solar irradiation around 10:00 and 16:00 h, especially in spring, possibly caused by the interaction of primary NOx and hydrocarbon emissions from vehicle engines.Carbo-Bustinza et al. [23] instead studied the behavior of ozone in winter using machine learning algorithms in four stations in the city of Lima and found the highest critical levels (165.80 µg/m 3 ) in the Ate district (ATE).However, we observed, in general, a drop in values in the cold season (O 3 < 100 µg/m 3 ), similar to another study [24].At the same time, there is a need to comprehensively analyze the time series of the most polluted districts to optimize the prediction of ozone concentration.In this context, this research aims to propose an improved tool to forecast tropospheric ozone concentration using hybrid combinations of time series models in four districts of the megacity of Metropolitan Lima in a very precise way, through an innovative methodology based on the decomposition of a time series of data and the combination of traditional methods to achieve efficient predictions.The following are the contributions of this research:

•
We improve the efficiency and accuracy of one-hour-ahead ozone concentration forecasting using a proposed hybrid combination of time series models based on the seasonal trend decomposition technique and various standard time series models.

•
We apply the seasonal trend decomposition method of the ozone concentration database in four districts-ATE, Campo de Marte (CDM), San Borja (SB), and Santa Anita (STA)-with severe episodes of ozone contamination between 2017 and 2019.This article describes the proposed hybrid time series forecasting methodology and explains its construction step by step in Section 2. The results of the case study for each district studied are in Section 3. Discussion about the best combination model of this study versus the standard time series models is detailed in Section 4, and the conclusions, along with limitations and future challenges, are presented in Section 5.

The Proposed Hybrid Time Series Forecasting Methodology
Before starting the modeling, it often makes sense to prepare the data.The goal of preprocessing is usually to simplify the modeling of the data.To do this, the database is sorted, classified, and analyzed for each monitoring station, taking into account the winter period of the city, which runs from 21 June to 22 September, for ozone.From 2017 to 2019, four monitoring stations located at strategic points in the capital of Lima were considered.It should be noted that the number of monitoring stations in the capital of Lima is ten; however, four were selected due to a lack of data in the registry.The hourly ozone concentrations were measured with a Teledyne analyzer (an instrument with about 15 sensing technologies used in the monitoring and manufacturing of gas, liquid analysis, and medical fields).Analyzer operations include zero and span testing, calibration, and leak detection.Data are transmitted by telemetry to SENAMHI (National Meteorology and Hydrology Service of Peru) for validation after correcting zeros, duplicates, and/or anomalies.Similarly, SENAMHI has a systematic network of stations that normally and automatically monitor and report the variables studied to a processing center.These stations use high-quality instruments and sensors to measure temperature, relative humidity, wind speed, and direction on an hourly scale.In addition, an inductive algorithm called Multiple Imputation by Chained Equations was applied.This algorithm is based on a fully conditional specification, where each incomplete variable is specified by a separate model [25].This performs multiple assignments to replace missing values in a dataset, in this case, for hourly rate records details (see Table 1).After obtaining the imputed ozone time series (free from missing values), we then proceed with the imputed ozone series and achieve a one-hour-ahead ozone concentration using the proposed hybrid combination of time series models.As explained previously, the hourly time series of ozone contains specific properties, such as a nonlinear long-run trend, an hourly cycle, and a different mean and variance.Considering these particular features in the model improves forecast accuracy significantly.To get these results, the ozone concentration in time series (C n ) is divided into three new sub-series: the first is a longrun trend (l n ), the second is a seasonal series (h n ), and the third is a residual (r n ) series.The mathematical description of the decomposed subsequence is given by however, these sub-series are obtained using the seasonal trend decomposition method described in the following subsection.

Seasonal Trend Decomposition Method
Cleveland et al. [26] proposed the decomposition technique where a seasonal time series model is split into three components of trend, seasonal, and stochastic.Seasonal trend decomposition (STLD) uses losses to decompose the seasonal component of a time series into other three components, including seasonal, trend, and stochastic.In particular, the steps included in STLD are: first de-trending; second cyclic smoothing of a sub-sequence, which creates the sequence of each seasonal component and smooths them individually; third, the regular sub-string is smoothed by a low-pass filter, which recombines and smooths sub-strings; fourth, we clean up the season series; fifth, the seasonal component computed in the previous step is used to de-trend the original series, and sixth, the seasonal sequence smoothing is used to get the trend component.To graphically explore the performance of the STLD method described above, the decomposed sub-series are shown in Figure 1.In each sub-figure (a to d) over a year (only winter season), the top panel indicates the long-term trend (l n ), the seasonal component is shown in the middle panel (h n ), and the residual component is presented in the bottom panel (r n ).Hence, the STLD technique was applied to decompose (C n ) to properly extract the long-term trend and hourly cycle in the ozone concentration time series.Moreover, the considered decomposition method extracts the specific features in all four station ozone concentration time series very well.

Modeling the Decomposed Sub-Series
Once the sub-series are obtained from the hourly ozone concentration time series using the STL decomposition technique, the extracted sub-series are fit by applying the three considered standard time series models, including linear autoregressive (AR), nonlinear autoregressive (NLAR), and autoregressive moving averages (ARMA) [27,28].These three models are explained in the following subsections.

Autoregressive Model
The autoregressive model (AR) model uses a linear combination of x lagged observations of C n to explain the short-term dynamics of C n [29] and can be expressed as where ξ i (i = 1, 2, . . ., r) are the parameters of AR model and m denotes the white noise process.In the present study, the maximum likelihood method is used for parameter estimation.The lags 1, 2, 3, 4, and 5 were included in the model due to their significant results after the plotting of autocorrelation function (ACF) and partial autocorrelation function (PACF) for the series.

Nonlinear Autoregressive Model
The nonlinear autoregressive model (NLAR) is the additive counterpart of the AR model, in which there is no specific linear form between z n and its corresponding lag values [30].Mathematically, it can be expressed as where w i represents each lag value, and smoothing function C n expresses the relationship between C n .In this study, the function w i is described by a cubic regression spline, and lags 1, 2, 3, 4, and 5 are used for NLAR modeling.

Autoregressive Moving Average Model
The autoregressive moving average (ARMA) model includes both error terms and lagged values of the time series.In this work, the sub-series are modeled with a linear combination of x lagged values and delayed error terms [31].Mathematically, the model equation can be expressed as where µ is the intercept, ξ i (i = 1, 2, . . ., x) and ψ j (j = 1, 2, . . ., s) are the parameters for the MA and AR models, respectively, and n ∼ N(0, σ 2 ).In this work, the descriptive and graphical analysis indicates that, in the MA part, the first two lags are significant, whereas in the AR part, only lags 1, 2, 3, 4, and 5 are significant.
In this research study, each combined model is denoted with the STLD method by l n STLD h n r n , where the l n in the top left corner represent the long-run component/sub-series, the h n in the top right indicates the seasonal component/sub-series, and the residual component/sub-series is represented in the bottom right by r n .In the forecasting models, we assign the codes "a", "b", and "c" to the autoregressive, the nonlinear autoregressive, and the autoregressive moving average models, respectively.For example, a STLD b c describes the estimate of the long-term trend (l n ) with AR model, the seasonal series (h n ) estimated with the NLAR model, and the residual series (r n ) estimated by using ARMA.The individual forecast models are combined to obtain the final one-hour-ahead forecasts of ozone concentration.

Accuracy Measures
In order to check the performance of the forecasting models in previous studies, many researchers used various performance measures and statistical tests [32][33][34][35].Hence, in this study, for model evaluation, first, we used five accuracy mean errors: two relative mean errors, two absolute mean errors, and one correlation measure for observed versus forecasted values, such as root mean square error (RMSE), root mean square percentage error (RMSPE), mean absolute error (MAE), and mean absolute percentage error (MAPE).The mathematical formula for accuracy means errors are expressed as Here, the observed value is C i of the time series, and Ĉi represent the forecasted ozone concentration value of the ith observation (i = 1, 2, . .., n), with the size of n in the testing set.Second, the Diebold and Mariano (DM) test [36] was conducted to test the significance of the differences among the performance of the forecasting models.The DM test is a broadly used statistical test for the comparison of forecasts extracted from various models [37][38][39].To understand it, consider two forecasts, Ĉ1n and Ĉ2n , that are available for the time series C n for n = 1, . . ., N. The associated forecast errors are e 1n = C n − Ĉ1n and e 2n = C n − Ĉ2n .Let the loss associated with forecast error {e in } 2 i=1 be L(e in ).For example, the absolute loss in time n would be L(e in ) = |e in |, and the differential loss between forecast 1 and forecast 2 for time t is then w n = L(e 1n ) − L(e 2n ).The null hypothesis of equal forecast accuracy for two forecasts is E[w n ] = 0.The DM test needs the differential loss to be covariance stationary, i.e., Under these assumptions, the DM test of equal forecast accuracy is where w = 1 N ∑ N n=1 w n is the differential loss of the sample mean, and σ w is a consistent estimate of standard error w n .Finally, we verify the superiority of the proposed hybrid combination of time series forecasting models using various figures, such as the box plot, line plot, bar plot, and dot plot in this work.To conclude this section, the design of the proposed hybrid combination of time series modeling and forecasting technique is presented in Figure 2.

Case Study Results
This work uses hourly ozone concentration datasets from four monitoring stations: ATE, CDM, SB, and Santa Anita, in Metropolitan Lima, for the duration of three consecutive years: 2017, 2018, and 2019.Within each year, only winter days are considered.Therefore, there are 6768 data points for one station.The graphic presentation of all four stations' hourly time series can be seen in Figure 3.The descriptive statistics and non-stationary statistics (augmented Dickey-Fuller (ADF) [40] test) for all four stations' imputed hourly ozone time series and the log imputed hourly ozone time series are listed in Table 2. Hence, descriptive metrics are a collection of methods for summarising and describing the key characteristics of a dataset, such as its central tendency, variability, and distribution.These statistics give an overview of the data and aid in determining the presence of patterns and linkages.It can be seen from Table 2 that the clear effect of the log and without log time series is in terms of all descriptive statistics, especially the variance and standard deviation stabilization.To conclude, the log-filtered series has the least descriptive statistic values.In addition to the above, we check the unit root issue for all four stations' imputed hourly ozone time series and the log imputed hourly ozone time series statistically by the ADF test.The results (statistic values), listed in Table 2, suggest that both the log-filtered imputed hourly ozone time series and the log-imputed hourly ozone time series have a higher negative statistic value, which indicates that the series is stationary.Therefore, once the database addresses all the essential treatments, we proceed further, and for forecasting and model estimation purposes, the data are divided into two parts: a training part (for model fit) and a testing part (for out-of-sample forecast).The training part contains the data for 5424 h, which is about 80% of the overall data, and 1344 h are used as the out-of-sample (testing).To obtain the forecast for ozone concentration one step ahead of an hour using the proposed hybrid methodology time series forecasting presented in Section 2, the given steps need to be followed: first, the STL method of decomposition was used to get a long-run trend (l n ), a seasonal (h n ), and the residual (r n ) of the time sub-series.Second, the previously explained three famous models of times series were used for each sub-series.Therefore,  From all twenty-seven models, in each monitoring station, the best four hybrid combination models are selected for comparison and compared with other models in each case.The outcome of all these best hybrid combination models is tabulated in Table 4.For example, in the case of the ATE station, based on the performance accuracy measure findings, it is evident that the a STLD b c give the least values (RMSE = 4.611, RMSPE = 4.464, MAE = 1.711,MAPE = 14.862, and CC = 0.949).Therefore, it is concluded that the a STLD b To confirm the dominance of models for all monitoring stations (the ATE, the CDM, the SB, and the STA) listed in Table 4, in this work, we performed the DM test on each pair of models.The null hypothesis is that the two models on the columns and rows are equally accurate, and the alternative hypothesis is that the model on the columns is more accurate than the model on the rows (using the loss-squared function).The results (p-values) of the DM test are given in Table 5 for all four stations (ATE, CDM, SB, and STA) of Metropolitan Lima.The results of the ATE station show that the final best ( a STLD b c ) model within all four best models is statistically superior to the other best combination models at the 5% level of significance.However, in the CDM, the SB, and the STA stations, the final best combination models, the ( b STLD c c ), the ( b STLD b c ), and ( c STLD b c ), are statistically superior to the other best combination models at the 5% level of significance.
Once the proposed hybrid time series combination models' performance was evaluated by accuracy performance measures (RMSPE, RMSE, MAE, MAPE, and CC) and a statistical test (the DM test), we then processed the models for graphic analysis.For instance, a graphical representation of mean errors (RMSE, RMSPE, MAE, and MAPE) for all twenty-seven models is shown in Figure 4a for the ATE station, Figure 4b for the CDM station, Figure 4c for the SB station, and Figure 4d for the STA station.From Figure 4a-d MAPE) in comparison to the rest of all combination models.On the other hand, from all twenty-seven models in each monitoring station, the best four hybrid combination models are selected for comparison and compared with other models in each station.The results of all these best hybrid combination models are plotted in Figure 5.For example, see the ATE station in Figure 5a, the CDM station in Figure 5b, the SB station in Figure 5c, and the STA station in Figure 5d It can be observed from these plots that the a STLD b c , a STLD b c , and a STLD b c show the least mean errors, respectively.In addition to the above, we plot the scatter diagrams for each station using their respective best model, which were obtained previously.For instance, Figure 6 displays the scatter plots for all considered monitoring stations.This figure showed that the best model produces greater correlation coefficient values, and it indicates that the correlation between forecast and actual ozone concentration values is highly significant.In the same way, the forecasted and observed values for the supermodel in each monitoring station are plotted in Figure 7.In Figure 7, forecasts of the best models follow the observed concentration of ozone very closely; from this, we can conclude that the supermodel in each considered station has accurate and efficient forecasts.Thus, from the descriptive statistical analysis, tests, and graphical results, we can conclude that the proposed hybrid combination of time series models is highly efficient and accurate in forecasting hourly ozone concentration.
Table 5. Ozone concentration in four Metropolitan Lima stations (µg/m 3 ): results (p-value) of the DM test for the best four models given in Table 4.

Discussion
Finally, according to the results (descriptive statistical analysis, tests, and visual analysis), it is concluded that the final best models for forecasting hourly ozone concentration were the a STLD b c , the b STLD c c , the b STLD b c , and the c STLD b c for the ATE, the CDM, the SB, and the STA, respectively.However, to verify the superiority of these final best models, we compare them with some standard baseline time series models, including parametric autoregressive (PAR), nonparametric autoregressive (NPAR), and autoregressive integrated moving averages (ARIMA) models.For example, the comparative results are presented in Table 6 for all four monitoring stations.The results show that the considered baseline time series models are significantly outperformed by the best-proposed model in each station.In addition, to confirm the dominance of the best-proposed models given in Table 6 for each station, we performed a statistical DM test on each pair of models.The results (p-values) of the DM test are listed in Table 7, indicating that the baseline time series (PAR, NPAR, and ARIMA) models performed poorly in comparison to our best-proposed models in the considered stations at the 5% level of significance.To conclude, based on overall results, the performance measures of accuracy for the proposed methods of fore-casting are comparatively better and more efficient than all other benchmark models in the competition.In addition to the above, in the literature, Carbo-Bustinza [23] explored the correlations between ozone and meteorological variables and predicted ozone concentration for the same sites and winter periods selected in this study.They used models such as linear regression, support vector regression, decision trees, random forest, and multilayer perceptron and based their arguments on R 2 , MSE, and MAE.The linear model presented the highest prediction performance for all the places evaluated (R 2 : 0.9849-9923), supported by the lowest calculated errors (MAE: 0.0087-0.0724and MSE: 0.0036-0.0087).Conversely, when the ozone concentration model is represented exclusively as a function of time as a relevant factor without considering meteorological factors, the decomposition methods have shown great performance, since in this investigation the significant models (p < 0.05; R 2 max: 0.949) with errors less than 20% (RMSE, RMSPE, MAE, MAPE) showed great performance.These errors have been comparable to other STL decomposition studies that used root mean square error (RMSE: 6.8%) and mean absolute percentage error (MAPE: 10.49%) as benchmarks for forecast reliability for ozone [10].This evaluation of tropospheric ozone explains its long-term and seasonal behavior with temporary ozone patterns [41], in accordance with what was demonstrated by Carbo-Bustinza [23] for the winter months in these geographic areas.This approach has presented high precision and strong performance that allows for preventing serious tropospheric ozone pollution events and optimizing the powers of the authorities and actors involved in decision making, especially at the urban level.

Conclusions
An improved tool for forecasting ozone concentration has been proposed using hybrid combinations of time series models in four districts of Metropolitan Lima between the years 2017 and 2019.It was shown that the combination of the models through the decomposition of the series ozone temporal data into "long-term trend", "seasonal", and "stochastic" series, by the use of the seasonal trend decomposition method, produced efficient model performance.The combinations made of the autoregressive models, nonlinear autoregressive models, and autoregressive moving average models generated 27 combinations for each sampling station.They demonstrated significant forecasts of the sample based on highly accurate and efficient descriptive, statistical, and graphic analysis tests, as a lower mean error occurred in the optimized forecast models compared to traditional models.Thus, the best hybrid models for the ATE ( a STLD b c ), CDM ( b STLD c c ), SB ( b STLD b c ), and Santa Anita ( c STLD b c ) stations were presented because they showed the best forecast reflected in the measurement of RMSE, RMSPE, MAE, MAPE, and CC, which were very small compared to the other models.The confirmation of the best models was statistically significant (p < 0.05), being superior to the other models.The graphical representation of the mean errors (RMSE, RMSPE, MAE, MAPE, and CC) for the twenty-seven models at each sampling station presented a better precision for the supermodels compared to the rest of all the models combined.These statistical tests and graphical results show that the proposed forecast methodology is highly accurate and efficient in predicting hourly ozone concentration, which meant that the independent AR, NPAR, and ARIMA models were outperformed by our best models (p < 0.05).
The main drawback of this study is that it only provides hourly data on ozone concentration.It can be extended to include additional exogenous factors such as wind speed, temperature, wind direction, and humidity, which may improve the short-term forecast of ozone concentration.In addition, the current work uses only four district datasets in Lima, Peru.This can be extended to other districts of Lima (San Juan de Lurigancho, Chorrillos, Comas, San Juan de Miraflores, etc.) or to different regions of Peru (Huánuco, Coyhaique, Traiguén, Padre Las Casas, Santiago, etc.).It could also be extended to the world level (Mexico, China, Japan, Malaysia, Pakistan, etc.) to evaluate the performance of the proposed hybrid time series modeling and forecasting technique.Moreover, only univariate time series models were used in this study, which should be extended by machine learning models such as deep learning and artificial neural networks.They can also be considered in the current hybrid time series forecasting framework.It can also be extended and applied to other approaches and datasets (for example, energy [42][43][44], air pollution [45,46], solid waste [47], and academic performance [48]).

Figure 1 .
Figure 1.Ozone concentration in the metropolitan area of Lima (µg/m 3 ): the hourly ozone concentration of the decomposed time series by the STLD method; ATE (a), Campo de Marte (b), San Borja (c), and Santa Anita (d), in each sub-figure, the top panel shows the long-run trend (l n ), the middle shows the seasonal (h n ) component, and the bottom shows the residual (r n ) component over a year.

Figure 2 .
Figure 2. A flowchart of the proposed forecasting methodology.
, we can see that within all twenty-seven models, the c STLD b c model in the ATE station, the c STLD b c model in the CDM station, the c STLD b c model in the SB station, the c STLD b c model in the STA station produce the highest accuracy measures (RMSE, RMSPE, MAE, and

Table 1 .
This table is based on 6768 observations taken throughout the winter season encompassing three years (2017, 2018, and 2019).It includes the percentage of imputation for each monitoring site.

Table 2 .
This table contains descriptive statistics for the time series of ozone concentration and the logarithmic time series of the ozone concentration for all considered monitoring stations.

Table 3 .
Ozone concentration in four Metropolitan Lima stations (µg/m 3 ): out-of-sample one-hour ahead mean forecast error for all models combined with the STL decomposition method.

Table 6 .
Ozone concentration in four Metropolitan Lima stations (µg/m 3 ): mean accuracy measures of the proposed versus the baseline models.

Table 7 .
Ozone concentration in four Metropolitan Lima stations (µg/m 3 ): results (p-value) of the DM test for the final best-proposed model versus the baseline models given in Table6.