
Short-Term Hourly Ozone Concentration Forecasting Using Functional Data Approach

1 Department of Statistical Sciences, University of Padua, 35121 Padua, Italy
2 Department of Statistics, Quaid-i-Azam University, Islamabad 45320, Pakistan
3 Department of Economics, School of Business, Lebanese International University-LIU, Beirut 146404, Lebanon
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
Econometrics 2024, 12(2), 12; https://doi.org/10.3390/econometrics12020012
Submission received: 9 March 2024 / Revised: 20 April 2024 / Accepted: 23 April 2024 / Published: 5 May 2024

Abstract

Air pollution, especially ground-level ozone, poses severe threats to human health and ecosystems. Accurate forecasting of ozone concentrations is essential for reducing its adverse effects. This study uses the functional time series approach, a method less explored in the literature, to model ozone concentrations and compares it with traditional time series and machine learning models. To this end, the hourly ozone concentration time series is first filtered for yearly seasonality using smoothing splines, which leaves the stochastic (residual) component. The stochastic component is modeled and forecast using a functional autoregressive model (FAR), where each daily ozone concentration profile is treated as a single functional datum. For comparison purposes, different traditional and machine learning techniques, such as autoregressive integrated moving average (ARIMA), vector autoregressive (VAR), neural network autoregressive (NNAR), random forest (RF), and support vector machine (SVM) models, are also used to model and forecast the stochastic component. Once the forecasts of the yearly seasonality and stochastic components are obtained, they are added to produce the final forecast. For the empirical investigation, data consisting of hourly ozone measurements from Los Angeles from 2013 to 2017 are used, and one-day-ahead out-of-sample forecasts are obtained for a complete year. Based on evaluation metrics such as R², root mean squared error (RMSE), and mean absolute error (MAE), the forecasting results indicate that the FAR outperforms the competitors in most scenarios, with the SVM model performing the least favorably across all cases.

1. Introduction

Air pollution refers to the presence of detrimental materials such as gases, solids, or liquid particles in the atmosphere, leading to harmful consequences for human health, the environment, and ecosystems. These materials are termed pollutants and can have diverse origins, arising from sources such as industrial operations, transportation, natural occurrences, and human activities. Typical pollutants include ozone (O₃), nitric oxide (NO), nitrogen dioxide (NO₂), sulfur dioxide (SO₂), carbon monoxide (CO), particulate matter 2.5 (PM2.5), and particulate matter 10 (PM10), among others. Air pollution, especially ground-level O₃, is a significant global concern affecting public health and the environment (Seinfeld and Pandis 2016).
In the United States of America (USA), the Environmental Protection Agency (EPA) estimates that mobile sources such as cars, buses, planes, trucks, and trains account for about half of the cancer risk and more than 70% of the non-cancer health effects associated with air toxics (Munshed et al. 2023). Stationary sources such as power plants, oil refineries, industrial facilities, and factories emit large amounts of pollutants from particular sites called point sources. Area sources are made up of several minor pollution contributors that, combined, have a significant impact on air quality. Area sources include agricultural areas, urban regions, and wood-burning fireplaces. Natural sources such as wind-blown dust, wildfires, and volcanic activity can also affect air quality. Figure 1 displays the different sources described above that contribute significantly to air pollution. The collective impact of these sources on the environment poses a serious challenge for researchers and policymakers (Lotrecchiano et al. 2020; Mavroidis and Ilia 2012).
Exposure to ground-level O₃ can have a range of adverse effects on human health. For example, O₃ exposure is associated with an elevated risk of chronic respiratory and cardiovascular ailments, as indicated by numerous studies (World Health Organization 2020). Even short-term exposure to O₃ can perturb lung mucociliary function, consequently weakening resistance to bacterial infections (Sujith et al. 2017). Alarming statistics reveal that approximately 16,400 premature deaths within the European Union (EU) are attributed to O₃ pollution (Nuvolone et al. 2018). Beyond human health, O₃ poses a substantial threat to vegetation and biodiversity. Its capacity to impair crops, forests, and other forms of vegetation manifests in the form of diminished photosynthesis, reduced carbon sequestration, and losses in biodiversity. Moreover, it induces visible leaf injuries (Paoletti and Manning 2007). This concern extends to various regions, including parts of southern Europe, where a multitude of grassland areas face the risk of high O₃ levels, impacting plant composition, seasonal flowering, and seed production for diverse natural species (Mills et al. 2016).
Ozone concentrations are important for air quality management and public health, and accurate prediction of these concentrations can aid in decision-making processes. Due to the importance of this topic, researchers have proposed different methods and techniques to model and forecast O₃ concentrations. For example, Hong et al. (2023) presented a novel approach for predicting hourly O₃ levels using deep learning techniques, called the multi-order difference embedded long short-term memory (MDELSTM) method, using a Los Angeles air quality dataset. The performance of the proposed model was compared with partial least squares (PLS), gated recurrent unit (GRU), long short-term memory (LSTM), multilayer perceptron (MLP), and stationary difference embedded LSTM (SDELSTM) models. Based on the results, the MDELSTM model demonstrated superior prediction performance among the models considered. Machine learning techniques are popular for forecasting O₃ concentrations (Eslami et al. 2020; Hashim et al. 2022; Yafouz et al. 2022). For instance, Yafouz et al. (2022) conducted a study to predict the tropospheric O₃ concentration using different machine learning models. These models included linear regression (LR), support vector regression (SVR), tree regression (TR), Gaussian process regression (GPR), ensemble regression (ER), and artificial neural networks (ANNs). Using data from Peninsular Malaysia, the results indicated that the LR, SVR, GPR, and ANN models performed better, achieving higher R² values.
Many researchers have investigated the performance of classical time series models for forecasting O₃ concentrations (Salazar et al. 2019). For example, Kumar et al. (2004) studied the autoregressive integrated moving average (ARIMA) model to forecast the daily surface O₃ concentration. Using a real dataset, the ARIMA(1,0,1) model demonstrated the best performance in predicting the maximum daily O₃ concentration, with a mean absolute percentage error (MAPE) of 13.14%. Aneiros-Pérez et al. (2004) forecast daily maximum O₃ concentrations in Toulouse, France, using nonlinear models based on kernel estimators. They also included exogenous variables in their analysis. The study found that the functional additive model (FAM), with its back-fitting kernel approach, produced the lowest quadratic errors and explained the highest percentage of variability in the data compared to other models. Due to the challenges posed by O₃ to human health, researchers continue to propose new models for modeling and forecasting O₃ concentration time series data (Arsić et al. 2020; Gao et al. 2018; Ghoneim et al. 2017; Rahman and Nasher 2024; Su et al. 2020).
This research contributes to the literature on O₃ forecasting by applying functional data analysis (FDA) methods, which can capture the dynamic and complex features of the O₃ concentration as a function of time. FDA methods have been used in various fields such as bio-statistics, econometrics, and environmental science, but are less explored in the context of O₃ forecasting (Jan et al. 2022; Shah et al. 2022). This study proposes a novel time series model based on FDA, which treats each day as a single functional observation with 24 discrete points. The performance of the FDA model is compared with classical time series and machine learning models using standard accuracy metrics. The results of this study provide insights into the advantages and limitations of different forecasting methods and suggest ways to improve the accuracy and reliability of O₃ forecasts. This study also has practical implications for policymakers, environmental managers, and public health officials, who can use the O₃ forecasts to implement timely and effective measures to mitigate the adverse effects of O₃ pollution.
The rest of the article is organized in the following manner. Section 2 provides details about the general modeling framework and the details of different models used in this research work. Analysis and results based on empirical investigation are provided in Section 3. Finally, Section 4 concludes the study.

2. Methodology

This section describes the predictive models used in this study to model and forecast the O₃ concentration data. To this end, our proposed FAR approach is outlined in detail. Several parametric and non-parametric competing models, namely ARIMA, VAR, SVM, NNAR, and RF, are also briefly described in this section. Before going into detail, we revisit some preliminaries of functional data analysis.

2.1. Functional Data Analysis

James O. Ramsay coined the term “functional data analysis” to describe this field of study (Ramsay 1982). FDA is at the forefront of modern statistical computing, ushering in transformative changes across diverse fields. The rapid advancement of technology allows for quicker and more accurate measurement equipment, facilitating the collection of continuous data spanning time, space, or other continua. This paradigm shift challenges classical statistical assumptions, particularly the conventional belief that the number of data points should surpass the number of variables in a dataset. In response to this evolution, FDA emerged as a field treating data as functional, where each curve is regarded as a single observation rather than a collection of discrete data points. The FDA method leverages information within and across sample units to incorporate derivatives, smoothness, and other inherent features of the functional data structure. The process of converting data to functional objects involves collecting information at discretized points, typically equally spaced, and implementing suitable basis functions to eliminate common noise while accurately representing each data curve.
Basis functions are pivotal in FDA, serving as the fundamental components for modeling intricate functional observations. A functional observation is expressed as a linear combination of known basis functions ϕ_k with coefficients c_k. The representation of a functional observation y(j) is given by

y(j) = \sum_{k=1}^{K} c_k \phi_k(j)    (1)

where c_k denotes the coefficients, ϕ_k(j) represents the known basis functions, and K is the number of basis functions used. A more compact representation using matrix notation can be written as

y = \mathbf{c}^{\top} \boldsymbol{\phi} = \boldsymbol{\phi}^{\top} \mathbf{c}
Several common basis systems, including the Fourier basis, the B-spline basis, polynomial bases, principal components, and the exponential basis, are used to cater to different data characteristics. The selection depends on the nature of the data; e.g., the Fourier basis is generally suitable for periodic data, while the B-spline basis is preferred for non-periodic data. In our research, we chose the B-spline basis system for its applicability to non-periodic data.
The B-spline basis functions are polynomial segments that are joined at certain knots or breakpoints. The compact support property, where each basis function is positive over a limited number of adjacent intervals, increases the computational efficiency. The order of a B-spline determines the degree of polynomial segments while the knot sequence governs their placement. The strategic placement of knots, whether equally spaced or tailored to data characteristics, plays a vital role in achieving accurate and meaningful representations. Coincident knots are employed strategically, offering flexibility to induce specific characteristics like derivative discontinuities.
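To make the representation concrete, the following minimal Python sketch fits a cubic B-spline basis to a single day's 24 hourly values by least squares; the number of interior knots and the simulated `daily_profile` series are illustrative assumptions, not the authors' exact configuration:

    import numpy as np
    from scipy.interpolate import make_lsq_spline

    # One day's (detrended) ozone profile at 24 hourly points; `daily_profile`
    # is a simulated stand-in for one row of the days-by-hours data matrix.
    hours = np.arange(1.0, 25.0)
    rng = np.random.default_rng(0)
    daily_profile = np.sin(2 * np.pi * hours / 24) + 0.1 * rng.standard_normal(24)

    # Cubic B-spline basis (degree k = 3): interior knots equally spaced over
    # the day, boundary knots repeated k + 1 times as the construction requires.
    k = 3
    interior = np.linspace(1.0, 24.0, 9)[1:-1]          # 7 interior knots (assumed)
    knots = np.r_[[1.0] * (k + 1), interior, [24.0] * (k + 1)]

    # Least-squares fit; the fitted coefficients play the role of c_k in Equation (1).
    spline = make_lsq_spline(hours, daily_profile, knots, k=k)
    print("number of basis coefficients K:", spline.c.size)   # K = 11 here
    print("smoothed value at hour 12.5:", float(spline(12.5)))

The coefficient vector returned by the fit is exactly the c of the matrix representation above, so each daily curve is summarized by K numbers rather than 24 raw points.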

2.2. Functional Autoregressive Model

The FAR model used in this study is an extension of the classical AR model to functional data. The FAR model of order 1, FAR(1), operates within the framework of a separable Hilbert space denoted by H. It specifically considers the Hilbert space L²[0,1], although the idea is applicable to other L²-spaces. An autoregressive Hilbertian process of order 1, ARH(1), which is also called FAR(1), is identified in this context as a sequence Y_t of H-valued random variables. The process is strictly stationary and satisfies the AR equation

Y_t(j) - \mu(j) = \psi\big(Y_{t-1}(j) - \mu(j)\big) + \epsilon_t(j), \quad j \in J    (2)

where μ(j) is the mean function, ψ is a bounded linear operator, and ϵ_t(j) is the shock or innovation term. The AR operator ψ is assumed to be a compact, Hilbert–Schmidt, symmetric, and positive bounded linear operator from L²[0,1] to itself. The compactness property permits a decomposition using orthonormal bases and real numbers, and the operator is said to be nuclear if certain eigenvalue conditions are met. Within the functional framework, we drop the argument (j) for simplicity.

2.2.1. Operators in the Hilbert Space L²[0,1]

Operators on the Hilbert space L²[0,1] are bounded linear operators whose norm is induced by the inner product of the space H. This norm is the supremum of the operator applied to all unit vectors y in H; it is denoted by ‖ψ‖_L and written as

\|\psi\|_{\mathcal{L}} = \sup_{\|y\| \le 1} \|\psi(y)\|

An operator ψ is termed compact if, with respect to orthonormal bases ν_j and f_j in H and a sequence λ_j approaching zero, it can be expressed as

\psi(y) = \sum_{j=1}^{\infty} \lambda_j \langle y, \nu_j \rangle f_j

If the sum of the squares of the operator's sequence λ_j is finite, the operator is Hilbert–Schmidt, i.e.,

\sum_{j=1}^{\infty} \lambda_j^{2} < \infty

The space S of Hilbert–Schmidt operators is separable and is equipped with an inner product and associated norm:

\langle \psi_1, \psi_2 \rangle_{\mathcal{S}} = \sum_{i,j=1}^{\infty} \langle \psi_1(g_i), h_j \rangle \langle \psi_2(g_i), h_j \rangle

\|\psi\|_{\mathcal{S}} = \Big( \sum_{j} \lambda_j^{2} \Big)^{1/2}

Operators are classified as symmetric if \langle \psi(x), y \rangle = \langle x, \psi(y) \rangle and positive if \langle \psi(y), y \rangle \ge 0. A symmetric positive Hilbert–Schmidt operator admits the decomposition

\psi(y) = \sum_{j=1}^{\infty} \lambda_j \langle y, \nu_j \rangle \nu_j

A compact operator is nuclear if the sum of the absolute values of its sequence λ_j is finite:

\sum_{j=1}^{\infty} |\lambda_j| < \infty

The norms of these operators satisfy \|\cdot\|_{\mathcal{N}} \ge \|\cdot\|_{\mathcal{S}} \ge \|\cdot\|_{\mathcal{L}}. If ψ is an integral operator on L² defined by \psi(x)(t) = \int \psi(t,s)\, x(s)\, ds, where ψ(·,·) is a real kernel, it is a Hilbert–Schmidt operator if and only if \int\!\int \psi^{2}(t,s)\, dt\, ds < \infty. The model is non-parametric, as ψ represents an infinite-dimensional parameter.

2.2.2. Estimation of the Operator ψ

Estimating the AR operator ψ in the FAR model involves addressing specific assumptions for obtaining a stationary solution. Two assumptions are considered to ensure the existence of a stationary solution. The first assumption requires the existence of an integer j_0 \ge 1 such that \|\psi^{j_0}\|_{\mathcal{L}} < 1, while the second necessitates the existence of a > 0 and 0 < b < 1 such that \|\psi^{j}\|_{\mathcal{L}} \le a b^{j} for all j \ge 0. These assumptions, under certain conditions, guarantee a unique strictly stationary solution, as proven in Bosq (2000).
It is important to note that estimating ψ cannot rely on likelihood, owing to the non-existence of the Lebesgue measure in non-locally compact spaces, and the concept of a density is not available for functional data. Instead, the classical method of moments is employed. The estimator is based on the relation ψ = CΓ^{-1}, where Γ = E(Y_t ⊗ Y_t) and C = E(Y_t ⊗ Y_{t+1}) are the covariance and cross-covariance operators of the process, and ⊗ denotes the tensor product. To simplify notation, it is assumed that the mean of the process is known and equal to zero, E(Y_t) = 0. The sample versions of the covariance and cross-covariance operators, denoted \hat{\Gamma}_n and \hat{C}_n, are given by

\hat{\Gamma}_n = \frac{1}{n} \sum_{t=0}^{n-1} Y_t \otimes Y_t, \qquad \hat{C}_n = \frac{1}{n} \sum_{t=0}^{n-1} Y_t \otimes Y_{t+1}
The covariance operator Γ is a symmetric, positive definite, and compact operator. It can be decomposed into eigenfunctions and eigenvalues, denoted ν_j and λ_j, respectively. However, Γ^{-1} is not a bounded operator. To address this, a practical solution is to use the first p most important empirical functional principal components as substitutes for the unknown population principal components, which yields the regularized inverse

\hat{\Gamma}_n^{\dagger}(y) = \sum_{j=1}^{p} \hat{\lambda}_j^{-1} \langle y, \hat{\nu}_j \rangle \hat{\nu}_j
From the ARH(1) equation, multiplying by Y_t we obtain the relation

Y_t \otimes Y_{t+1} = Y_t \otimes (\psi Y_t) + Y_t \otimes \epsilon_{t+1} = \psi (Y_t \otimes Y_t) + Y_t \otimes \epsilon_{t+1}

By the definitions of the covariance and cross-covariance operators of the ARH(1) process and using E(ϵ) = 0, we have C = ψΓ and ψ = CΓ^{-1}. The estimate of ψ is then given by

\hat{\psi}_n(y) = \frac{1}{n-1} \sum_{k=1}^{n-1} \sum_{j=1}^{p} \sum_{i=1}^{p} \hat{\lambda}_j^{-1} \langle y, \hat{\nu}_j \rangle \langle Y_k, \hat{\nu}_j \rangle \langle Y_{k+1}, \hat{\nu}_i \rangle \hat{\nu}_i

The last term is derived by performing an additional smoothing step on Y_{k+1} with the eigenfunctions \hat{\nu}_i. The empirical eigenfunctions are known to converge asymptotically to the population eigenfunctions.
Once the estimator ψ ^ of the population parameter ψ is obtained, it is crucial to assess its optimality in estimating the true parameter. For the FAR parameter ψ , Didericksen et al. (2012) demonstrated that the estimator is optimal in terms of MSE and MAE, as its prediction error is comparable to the infeasible predictor ψ ( y ) for an appropriately chosen p.
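The following Python sketch illustrates the moment-based estimator on a discretized grid: empirical FPCA of the covariance matrix, inversion on the first p components, and a one-step-ahead curve forecast. It is a simplified illustration under the assumption of centered data on a common grid; the dimensions and simulated input are hypothetical, not the authors' implementation.

    import numpy as np

    def far1_forecast(Y, p=3):
        """One-step FAR(1) forecast via the moment estimator psi = C Gamma^{-1},
        truncated to the first p empirical principal components. Y is an
        (n, m) matrix of centered curves (n days discretized at m points)."""
        n, m = Y.shape
        # Empirical covariance and lag-1 cross-covariance (m x m matrices).
        Gamma = Y[:-1].T @ Y[:-1] / (n - 1)
        C = Y[:-1].T @ Y[1:] / (n - 1)          # analogue of E[Y_t (x) Y_{t+1}]
        # Empirical FPCA: keep the top-p eigenpairs of Gamma.
        lam, nu = np.linalg.eigh(Gamma)
        lam, nu = lam[::-1][:p], nu[:, ::-1][:, :p]
        # Regularized inverse of Gamma on the span of the first p eigenfunctions.
        Gamma_dag = nu @ np.diag(1.0 / lam) @ nu.T
        psi_hat = C.T @ Gamma_dag               # estimated AR operator
        return psi_hat @ Y[-1]                  # forecast of the next daily curve

    # Hypothetical usage on simulated centered curves (100 days, 24 hours):
    rng = np.random.default_rng(1)
    Y = rng.standard_normal((100, 24))
    print(far1_forecast(Y, p=3)[:5])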

2.3. Autoregressive Integrated Moving Average (ARIMA) Models

Time series data analysis is a vital tool for comprehending and forecasting temporal patterns. The autoregressive moving average (ARMA) model combines elements of autoregressive (AR) and moving average (MA) models, providing a framework for modeling univariate time series data. For a univariate time series Y t , an ARMA(p, q) model can be written as
Y_t = C + \sum_{r=1}^{p} \phi_r Y_{t-r} + \sum_{l=1}^{q} \Phi_l Z_{t-l} + Z_t    (3)

Equation (3) contains an intercept term C, AR parameters ϕ_r (r = 1, 2, …, p), MA parameters Φ_l (l = 1, 2, …, q), and a white noise term Z_t ∼ N(0, σ_z²). The ARMA models are well-suited for stationary time series data. However, differencing is required to achieve stationarity for non-stationary data, and this is where the ARIMA model comes into play. The ARIMA model is an extension of ARMA specifically tailored for non-stationary time series (Shumway et al. 2000). It involves differencing the data to attain stationarity, which can be expressed as

Y_t^{d} = \sum_{r=1}^{p} \phi_r Y_{t-r}^{d} + \sum_{l=1}^{q} \Phi_l Z_{t-l} + Z_t    (4)

where Y_t^{d} represents the dth difference of the series, ϕ_r (r = 1, 2, …, p) and Φ_l (l = 1, 2, …, q) are the parameters of the AR and MA components, respectively, and Z_t ∼ N(0, σ_z²).
Identifying the appropriate ARIMA model is a critical step, and it involves determining the order of differencing d and the numbers of AR (p) and MA (q) terms. The autocorrelation function (ACF) and partial autocorrelation function (PACF) plots are valuable tools in this process. The ACF indicates the correlation of the series with itself at different lags, whereas the PACF reveals the autocorrelation at a lag k after the effects of intervening observations have been removed. For parameter estimation, maximum likelihood estimation (MLE) is generally used (Shumway et al. 2000). This study investigated different models and found that the ARIMA(5,0,0), a pure autoregressive model with five lagged observations, fits the data well and yields white noise errors, making it suitable for forecasting.
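As an illustration of this step, a minimal sketch (Python, using statsmodels; the placeholder series `y_hour` stands in for one hourly sub-series of the stochastic component) of fitting the selected ARIMA(5,0,0) and producing a one-step-ahead forecast:

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    # Fit an ARIMA(5,0,0) to one hour's stochastic component and produce a
    # one-step-ahead (one-day-ahead for that hour) forecast. `y_hour` is a
    # simulated stand-in for one column of the days-by-hours residual matrix.
    rng = np.random.default_rng(2)
    y_hour = 0.01 * rng.standard_normal(1460)

    fitted = ARIMA(y_hour, order=(5, 0, 0)).fit()   # pure AR(5), no differencing
    print(fitted.forecast(steps=1))                 # next day's value, this hour

In the rolling-window procedure of Section 2.8, a fit of this kind would be repeated for each hour and each forecast day.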

2.4. Vector Autoregressive Model

A vector autoregressive (VAR) model is a powerful and commonly used time series analysis method that enables us to capture the dynamic interactions between variables over time. It extends the concept of univariate AR models to a multivariate situation. Because of their flexibility and forecasting accuracy, VAR models are extremely useful for understanding and forecasting complicated real-world behavior. They can effectively express variable interdependence.
A VAR model is built on a set of equations that describe the evolution of several time series variables. Each variable in the model is treated as a linear function of its own lagged values as well as the lagged values of all other variables in the model. Suppose Y_t is a vector of univariate time series; then a VAR model can be written as

Y_t = C + \Phi_1 Y_{t-1} + \Phi_2 Y_{t-2} + \cdots + \Phi_p Y_{t-p} + \epsilon_t    (5)

where Y_t is an n × 1 vector representing the current values of n distinct response time series variables at time t. The n × 1 vector C consists of constant offsets that serve as intercepts, accounting for the baseline level of the variables. The Φ_j matrices, for j = 1, …, p, are n × n matrices of AR coefficients. These matrices capture the relationships between the variables at different time lags; specifically, the impact of past values up to lag p on the current values. The parameter p defines the order of the VAR model, determining the maximum number of past periods considered in each equation and influencing the model's ability to capture system dynamics. Finally, ϵ_t represents an n × 1 vector of white noise terms, introducing randomness into the model and accounting for unexplained variability not captured by the lagged variables.
The choice of the order is a critical step and is typically determined through cross-validation techniques and information criteria such as AIC, BIC, and HQ. As we estimate models for 365 days, the value of the selected p can vary. However, in most cases, a VAR of order five, i.e., VAR(5), is suitable as it generally provides whitened residuals and produces lower AIC values compared to other orders of the model. Estimating and inferring parameters in VAR models is typically achieved by using the MLE or ordinary least squares (OLS) techniques. These methods rely on certain assumptions, such as the error terms having a conditional mean of zero, stationary variables, and the absence of perfect multicollinearity.
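A minimal sketch of this step (Python/statsmodels; the simulated `resid_matrix` stands in for the days × hours stochastic component) fitting a VAR(5) to the 24 hourly series jointly and forecasting one day ahead:

    import numpy as np
    from statsmodels.tsa.api import VAR

    # Treat the 24 hourly series jointly and fit a VAR(5), as used for most
    # days; `resid_matrix` is a simulated days-by-hours stochastic component.
    rng = np.random.default_rng(3)
    resid_matrix = 0.01 * rng.standard_normal((1460, 24))

    fitted = VAR(resid_matrix).fit(5)                        # VAR of order 5
    day_ahead = fitted.forecast(resid_matrix[-5:], steps=1)  # 1 x 24 forecast
    print(day_ahead.shape)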

2.5. Artificial Neural Networks

Artificial neural networks (ANNs) are a foundational component of machine learning, particularly in the realm of deep learning. They mirror the human brain’s structure and functionality, comprising interconnected nodes arranged in layers: input, hidden, and output. These nodes, akin to artificial neurons, process information through weighted connections, activating based on a threshold to transmit data across layers. The ANNs are renowned for their proficiency in rapid data classification, clustering, and various applications such as speech and image recognition, with Google’s search algorithm being a prominent example of their implementation.
ANNs constitute a multi-layered architecture comprising an input layer, hidden layers, and an output layer. The input layer receives diverse data formats, while the hidden layer, functioning as a “distillation layer”, extracts pertinent patterns to enhance network efficiency by recognizing crucial information and discarding redundancy. The output layer processes the transformed information from the hidden layers to produce the final output. The most commonly employed ANNs for predictive tasks are MLPs with hidden layers, which use a three-layered network interconnected by acyclic connections. Nodes act as elementary processing units, and the architecture allows for more than one hidden layer.

Neural Network Autoregressive

The neural network autoregressive (NNAR) model is a potent deep learning structure widely employed in various fields like natural language processing, time series forecasting, and speech recognition. It utilizes past data to predict future values, employing a recursive approach where each time step’s input is the prior prediction. This model handles nonlinear relationships within intricate datasets and can be constructed using feedforward or recurrent neural networks.
The NNAR(p, k) architecture involves “p” lagged inputs for forecasting and includes “k” nodes in the hidden layer. It shares similarities with ARIMA models but operates without constraints for stationarity. The NNAR model equation comprises weighted node connections, nonlinear activation functions, AR dependencies, exogenous impacts, and error terms, and can be written as
Y_{it} = f\left( \alpha_1 \sum_{j=1}^{N} w_{ij} Y_{j(t-1)} + \lambda^{\top} Z_{i(t-1)} \right) + \epsilon_t    (6)

where Y_{it} represents the univariate response at node i and time t, w_{ij} indicates the connection strength between nodes i and j for i, j = 1, …, N, \lambda^{\top} Z_{i(t-1)} captures the exogenous impact, and f(·) is an unknown smooth link function. Estimation methods for the NNAR include profile least squares estimation and local linear approximation to model the unknown link function and optimize parameter estimates. The model iterates through historical inputs to generate multi-step forecasts in time series analysis, showing adaptability and robustness in handling intricate data patterns. In our case, a simple autoregressive model of order one with one node in the hidden layer, NNAR(1,1), is used.
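NNAR is most directly available in R's forecast package; the following Python sketch is only a rough analogue of an NNAR(1,1), a single-hidden-layer network with one lagged input and one hidden node. The simulated series and training settings are assumptions, not the authors' configuration.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    # Rough NNAR(1,1) analogue: one lagged input and one hidden node. `y` is
    # a simulated stand-in for one hourly stochastic-component series.
    rng = np.random.default_rng(4)
    y = 0.01 * rng.standard_normal(1460)

    X, target = y[:-1].reshape(-1, 1), y[1:]      # lag-1 design matrix
    nnar = MLPRegressor(hidden_layer_sizes=(1,), activation="logistic",
                        max_iter=2000, random_state=0)
    nnar.fit(X, target)
    print(nnar.predict([[y[-1]]]))                # one-step-ahead forecast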

2.6. Support Vector Machine

Support vector machines (SVMs) are powerful supervised learning models widely employed for data analysis in classification, regression, and outlier detection tasks. They excel in handling both linear and nonlinear data separations, making them versatile in various domains such as text classification, image recognition, gene expression analysis, and even time series prediction. Notably, SVMs are designed not only to rank effectively but also to generalize well from the training set. This quality has led to the widespread adoption of SVM techniques, especially in forecasting time series.
An SVM commences with the input data, composed of labeled feature vectors, where each vector corresponds to one of two classes. Extending beyond binary classification, SVMs can also tackle multi-class classification problems. These feature vectors are then transformed into a higher-dimensional space using various kernel functions, such as the linear, polynomial, or radial basis functions. This transformation is pivotal as it calculates the similarity between different feature vectors, allowing the SVM to address nonlinear separable data.

Key Elements and Optimization in the SVM

  • Training data and class labels: The training data contain assigned class labels y_i ∈ {−1, 1} for binary classification, serving as an anchor for the model's learning process.
  • Margin and decision boundary: The SVM aims to establish a hyperplane maximizing the margin between classes, mathematically expressed as w^{\top} x + b = 0, where w denotes the weight vector and b is the bias.
  • Support vectors and decision boundary: Essential in shaping the decision boundary, support vectors are the data points lying within or on the margin. They play a defining role in establishing the decision boundary.
  • Optimization and decision function: The optimization process fine-tunes key parameters like the weight vector, bias, and regularization factor to craft the decision function.
  • Optimization (mathematical model): The hyperplane determination involves an optimization problem:
    \min_{w,\, b,\, \xi} \; \frac{1}{2} \|w\|^{2} + C \sum_{i} \xi_i

    subject to y_i (w^{\top} x_i + b) \ge 1 - \xi_i with \xi_i \ge 0, where C denotes the regularization parameter and the ξ_i are slack variables permitting misclassifications within the margin boundaries.
  • Decision function and prediction: The decision function predicts the class label for a new data point x, utilizing the transformed feature vector generated via the kernel function. This function is represented as f(x) = \mathrm{sign}(w^{\top} \phi(x) + b), where ϕ(x) denotes the transformed input feature vector.
In this work, the SVM model is an epsilon-regression type with a radial basis function kernel. It uses two lagged values of the ozone concentration as inputs and is parameterized with a cost of 1.0, a gamma of 0.5, and an epsilon of 0.1. The fitted model relies on 1191 support vectors.
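A sketch of this configuration (Python/scikit-learn; the simulated series is a hypothetical stand-in for one hourly sub-series of the stochastic component) with hyperparameters matching those stated above:

    import numpy as np
    from sklearn.svm import SVR

    # Epsilon-SVR with a radial basis kernel, cost C = 1.0, gamma = 0.5, and
    # epsilon = 0.1, using two lagged ozone values as inputs; `y` simulates
    # one hourly stochastic-component series.
    rng = np.random.default_rng(5)
    y = 0.01 * rng.standard_normal(1460)

    X = np.column_stack([y[1:-1], y[:-2]])        # lags 1 and 2
    target = y[2:]
    svr = SVR(kernel="rbf", C=1.0, gamma=0.5, epsilon=0.1)
    svr.fit(X, target)
    print(svr.predict([[y[-1], y[-2]]]))          # one-step-ahead forecast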

2.7. Random Forest

The random forest (RF) algorithm, introduced by Leo Breiman (Breiman 2001), is an important machine learning method suitable for both classification and regression tasks. It employs an ensemble of decision trees, each trained on a distinct subset of data created through bagging. Bagging, or bootstrap aggregating, involves forming diverse training sets for each tree by random sampling with replacement. A distinctive feature of the RF is the additional randomness introduced during tree construction. Instead of opting for the optimal feature for node splitting, it considers a random subset of features, thereby enhancing model diversity. This unique characteristic contributes to RF’s robustness against overfitting and its effectiveness in handling complex datasets.
Random forest constructs a “forest” of decision trees through bagging. Each decision tree is descriptive, with a root node containing the variable of interest and leaf nodes representing predicted outcomes. The trees grow without pruning, and predictions for new observations are made by aggregating results using f(x) = \frac{1}{K} \sum_{i=1}^{K} f_i(x), where f(x) is the RF regression predictor, K is the number of trees, and f_i(x) represents the individual regression trees.

Working Principle of Random Forests

  • Tree construction:
1. Random subsampling (bootstrap aggregating): The algorithm selects a subset of the training data with replacement (a bootstrap sample) and trains a decision tree on each subset. Samples are obtained by randomly selecting observations with replacement from the original dataset.
2. Random feature selection: At each decision tree node, a random subset of features is chosen for splitting, reducing the correlation between trees and enhancing ensemble diversity. The number of features considered at each node is typically the square root of the total features for classification and one-third for regression.
  • Prediction and aggregation:
3. OOB error and feature importance: Post-training, RF provides two indices: the out-of-bag (OOB) error and the importance value of each feature. The OOB error measures prediction accuracy on data not included in the bootstrap sample, while feature importance is determined using the OOB dataset, aiding in variable selection.
4. Voting: Once all decision trees are trained, predictions are made by aggregating the individual tree results. Majority voting is used for classification, and averaging for regression. The algorithm's performance is assessed through the OOB error, providing an estimate of the RF's generalization error, with the aggregated prediction \hat{y} = \frac{1}{n_{\text{tree}}} \sum_{j=1}^{n_{\text{tree}}} y_j, where \hat{y} is the final predicted value, n_{\text{tree}} is the number of trees in the RF, and y_j is the prediction of the jth tree.
  • Introducing randomness:
The RF introduces randomness by selecting a subset of features for node splitting, enhancing model performance and reducing overfitting. The equation for the RF regression predictor with randomness in feature selection is f(x) = \frac{1}{K} \sum_{i=1}^{K} f_i(x, \theta_i), where the θ_i are independent random vectors with the same distribution, representing the randomly selected features.
  • Hyperparameters and feature importance:
The RF has hyperparameters such as the number of trees (n_estimators), the maximum number of features considered for splitting a node (max_features), and the minimum number of samples required to split an internal node (min_samples_split). These can be tuned for optimal performance. The RF assesses feature importance by measuring the accuracy decrease using OOB error estimation during feature selection.
Finally, our RF model uses four lagged values of ozone concentration as predictors in this work. It was fine-tuned through 5-fold cross-validation, achieving optimal performance with the number of variables tried at each split (mtry) set to 2.
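A sketch of this setup (Python/scikit-learn, where mtry corresponds to max_features; the number of trees and the simulated series are assumptions, not the authors' exact configuration):

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    # Random forest with four lagged ozone values as predictors and mtry = 2
    # (max_features=2); the number of trees (500) is an assumed default.
    rng = np.random.default_rng(6)
    y = 0.01 * rng.standard_normal(1460)          # simulated hourly sub-series

    X = np.column_stack([y[3:-1], y[2:-2], y[1:-3], y[:-4]])   # lags 1..4
    target = y[4:]
    rf = RandomForestRegressor(n_estimators=500, max_features=2, random_state=0)
    rf.fit(X, target)
    print(rf.predict([[y[-1], y[-2], y[-3], y[-4]]]))   # one-step-ahead forecast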

2.8. General Modeling Framework

The main goal of this study is to forecast O₃ concentrations and compare the proposed model, FAR(1), with other traditional time series and machine learning models. To explain in detail, for the O₃ concentration series denoted by X_{t,h} for the tth day (t = 1, 2, …, n) and the hth hour of the day (h = 1, 2, …, 24), the dynamics of the O₃ concentration can be modeled as

X_{t,h} = D_{t,h} + Y_{t,h}    (7)

This means that the O₃ series X_{t,h} is decomposed into two components. The first, D_{t,h}, is the deterministic component that contains the long-term dynamics, and the second, Y_{t,h}, is the stochastic component that captures short-term variations. The deterministic component consists of the yearly seasonality, which is modeled and forecast using the generalized additive modeling (GAM) technique with smoothing splines. More precisely, the yearly seasonality, represented by the day-of-year series (1, 2, …, 365, 1, 2, …, 365, 1, 2, …, 365, 1, 2, …, 366, 1, 2, …, 365), is modeled using a smoothing spline. Since this component represents long-term dynamics relative to our forecast horizon, its one-day-ahead forecast is taken equal to its current value:

\hat{D}_{t+1,h} = \hat{D}_{t,h}    (8)

To compute the stochastic component, we subtract the estimated deterministic component \hat{D}_{t,h} from X_{t,h}. Mathematically, this is represented as

Y_{t,h} = X_{t,h} - \hat{D}_{t,h}    (9)
The modeling and forecasting of the stochastic component is performed using FAR(1) and the five competing models. To this end, the stochastic component Y_{t,h} is arranged as a matrix of dimension days × hours, where each row represents a day and each column represents an hour. In the case of FAR(1), this matrix is transformed into a functional object using the B-spline basis given in Equation (1), resulting in Y_t(j), and the FAR(1) model is applied thereafter. In the case of the VAR model, the matrix of hourly series is considered and modeled jointly, as described in Section 2.4. Finally, for all other models, each hourly series is modeled and forecast separately as a univariate time series.
Once both components, deterministic and stochastic, are modeled and forecast separately, the final one-day-ahead out-of-sample forecasts are obtained by combining the forecasts of both components. In summary, the following procedure is used to obtain one-day-ahead out-of-sample forecasts from different models.
  • For an hourly time series, decompose the O₃ concentration time series into deterministic and stochastic components, as given in Equation (7).
  • Using Equation (8), obtain a one-step-ahead forecast for each hour that will result in a 24 h one-day-ahead forecast.
  • Obtain the stochastic component using Equation (9). Use the models described in Section 2 to obtain a one-day-ahead forecast.
  • Compute the final one-day-ahead out-of-sample forecasts by combining the forecasts from deterministic and stochastic components.
  • Repeat the above steps for 365 days using a rolling window to obtain the one-day-ahead out-of-sample forecasts for the entire year.
Figure 2 shows the flowchart of the proposed modeling framework.
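A compact sketch of the overall procedure follows (Python; a simplification in which a smoothing-spline fit of the day-of-year mean profile stands in for the authors' GAM fit, and the stochastic forecaster is a pluggable function, so any of the models above can be substituted):

    import numpy as np
    from scipy.interpolate import UnivariateSpline

    def one_day_ahead(X, doy, stochastic_forecaster):
        """X: (n_days, 24) hourly ozone matrix; doy: day-of-year labels in 1..366;
        stochastic_forecaster: maps the residual matrix to a 24-vector forecast."""
        days = np.unique(doy)
        D_hat = np.empty_like(X)
        for h in range(X.shape[1]):
            # Smooth the day-of-year mean profile of hour h (deterministic part).
            mean_prof = np.array([X[doy == d, h].mean() for d in days])
            spline = UnivariateSpline(days, mean_prof, s=len(days) * mean_prof.var())
            D_hat[:, h] = spline(doy)
        Y = X - D_hat                                  # stochastic component, Eq. (9)
        return D_hat[-1] + stochastic_forecaster(Y)    # Eq. (8) plus stochastic part

    # Hypothetical usage with naive persistence for the stochastic component:
    rng = np.random.default_rng(7)
    X = 0.02 + 0.01 * rng.standard_normal((1460, 24))
    doy = (np.arange(1460) % 365) + 1
    print(one_day_ahead(X, doy, lambda Y: Y[-1]).shape)    # -> (24,)

Repeating this call over a rolling window, with the chosen stochastic model in place of the persistence forecaster, yields the 365 one-day-ahead out-of-sample forecasts described above.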

3. Analysis and Results

3.1. Data Overview and Preprocessing

The data for this study originate from Los Angeles, an area highly prone to ground-level O₃ pollution due to its unique geography, surrounded by mountains and bordered by the sea, resulting in a Mediterranean climate with high temperatures and minimal rainfall. The dataset utilized for our research was sourced exclusively from O₃ concentration measurements taken at the North Main Street monitoring site (−118.22688 E, 34.06659 N) as part of the Environmental Protection Agency's (EPA's) air quality system in Los Angeles (see the Note below). This dataset spans the years from 2013 to 2017, capturing a comprehensive record of O₃ levels in the region. While the EPA's air quality system typically collects a wide range of meteorological and air pollution data, we have focused specifically on the O₃ concentration data for our study, thereby refining our dataset to provide a precise and specialized foundation for our research. The dataset is plotted in Figure 3, where the red line separates the model estimation and out-of-sample forecasting periods. The figure shows considerable variation in O₃ concentration throughout the year.
Missing values are common in air quality datasets, and our dataset contains only a very small number of them. To address this, linear interpolation is applied: a missing value is estimated by assuming a linear relationship between the nearest known data points before and after it. This method is commonly used in fields such as time series analysis, geographic information systems, and general data analysis to fill in missing data points. The dataset was organized with 80% of the data, spanning from 2013 to 2016, allocated for model estimation, while the remaining 20%, corresponding to the data for 2017, was reserved for one-day-ahead out-of-sample forecasts.
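For illustration, a minimal pandas sketch of this gap-filling step (the six-value series is hypothetical):

    import numpy as np
    import pandas as pd

    # Linear interpolation of the few missing hourly values: each gap is
    # filled on the straight line between its known neighbors.
    series = pd.Series([0.021, 0.023, np.nan, np.nan, 0.031, 0.029],
                       index=pd.date_range("2013-01-01", periods=6, freq="h"))
    print(series.interpolate(method="linear").round(4).tolist())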
The descriptive statistics in Table 1 reveal that the smallest observed value is 0.0010 parts per million (ppm), while the largest stands at 0.1160 ppm. The mean O₃ concentration is 0.0239 ppm. Notably, the distribution appears slightly positively skewed, evident from the mean being marginally higher than the median (0.0220 ppm). Moreover, the standard deviation of 0.0180 ppm signifies a reasonable level of variability around the mean O₃ concentration. Additionally, the first quartile, at 0.0070 ppm, and the third quartile, at 0.0360 ppm, indicate where the majority of values cluster within the dataset.

3.2. Forecasting Accuracy Metrics

The accuracy of our forecasting models is measured using three standard error measures, i.e., the mean absolute error (MAE), the root mean squared error (RMSE), and the coefficient of determination (R²). These error measures are common descriptive statistics that show how close the predictions are to the actual values. Mathematically, they are defined as

\text{MAE} = \frac{1}{N} \sum_{t=1}^{N} \left| X_{t,h} - \hat{X}_{t,h} \right|

\text{RMSE} = \sqrt{ \frac{1}{N} \sum_{t=1}^{N} \left( X_{t,h} - \hat{X}_{t,h} \right)^{2} }

R^{2} = \frac{ \sum_{t=1}^{N} \big( \hat{X}_{t,h} - \bar{X}_h \big)^{2} }{ \sum_{t=1}^{N} \big( X_{t,h} - \bar{X}_h \big)^{2} }
where
  • N is the number of observations in the out-of-sample forecast period.
  • X_{t,h} and \hat{X}_{t,h} represent the actual and forecast O₃ values of the tth (t = 1, 2, …, 365) day and hth (h = 1, 2, …, 24) hour.
  • \bar{X}_h denotes the mean O₃ value of the hth hour.
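A short sketch of the evaluation step implementing these three measures on the 365 × 24 matrices of actual and forecast values (the matrices are simulated here purely for illustration):

    import numpy as np

    # Compute MAE, RMSE, and R^2 from the actual and forecast matrices
    # (365 out-of-sample days x 24 hours, as in the study).
    def evaluate(actual, forecast):
        err = actual - forecast
        mae = np.abs(err).mean()
        rmse = np.sqrt((err ** 2).mean())
        hourly_mean = actual.mean(axis=0, keepdims=True)   # X-bar_h for each hour
        r2 = ((forecast - hourly_mean) ** 2).sum() / ((actual - hourly_mean) ** 2).sum()
        return mae, rmse, r2

    rng = np.random.default_rng(8)
    actual = 0.02 + 0.01 * rng.standard_normal((365, 24))
    forecast = actual + 0.002 * rng.standard_normal((365, 24))
    print(evaluate(actual, forecast))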

3.3. Results

A comparison of the forecasting accuracy and model performance using R², MAE, and RMSE as the evaluation criteria is shown in Table 2 and depicted in Figure 4. The results indicate that the FAR(1) model achieved the best performance, as it had the highest R² value and the lowest error values. The second-best model was the VAR model, followed by the ARIMA, NNAR, and RF models. The SVM model had the worst performance among the six models.
The O₃ concentration forecasting performance of each model for different days of the week is summarized in Table 3 and illustrated in Figure 5. These errors are averages over the whole year, calculated separately for each day of the week. From these results, one can notice that the proposed model outperforms the other models by producing the lowest RMSE and MAE values for every day of the week. The forecast errors differ across the days of the week, with the lowest values on Wednesday, for which the O₃ concentration is more stable than on the other days, and the highest values on Saturday and Sunday. The SVM model is the worst among the six models, as it has the highest RMSE and MAE values for every day.
Table 4 compares the hourly O₃ forecasting errors of the six models using the RMSE and MAE metrics. The results show that the FAR(1) is the most accurate for most hours, except for the first five hours, for which the VAR model performs better. The O₃ levels and the errors change throughout the day due to various factors such as weather, traffic, and emissions. The errors are usually higher during the peak hours (14–18) and the late hours (21–24), and lower during the off-peak hours (1–4). The VAR model is slightly better than the ARIMA, NNAR, and SVM models, but still worse than the FAR(1) model. Again, the SVM model has the highest errors for most hours and is the least accurate model. These results are also depicted in Figure 6.
Finally, Table 5 lists the month-specific RMSE and MAE values for the models used to forecast the one-day-ahead O₃ concentration, which are further depicted in Figure 7. These errors are calculated by averaging the errors month-wise for the year 2017. The results show that the forecasting errors are not uniform across the months. The errors tend to be higher in February, March, April, and October, which implies that O₃ is more challenging to predict in these months. Again, the FAR(1) model performs the best in most months, except for April, where the VAR model is slightly better, and February, where both models have equal errors. The SVM model has the worst performance in all months, which suggests that it is not well suited to O₃ forecasting. The ARIMA, NNAR, and RF models have comparable performance but are inferior to the FAR(1) model. The FAR(1) model attains its lowest RMSE and MAE values in December, when the O₃ level is the lowest.

4. Conclusions

Air pollution, particularly ground-level O₃, is a global issue with severe implications for both human health and ecosystems. It reduces agricultural output, exacerbates global warming as a greenhouse gas, and affects respiratory health, causing symptoms such as coughing, chest tightness, and worsening asthma. Given the severity of these consequences, accurate forecasting of ground-level O₃ concentrations is crucial. This study uses a functional approach to model ground-level O₃ concentrations, a method less explored in the literature. The performance of this approach is compared with traditional time series and machine learning models, including ARIMA, VAR, NNAR, RF, and SVM. Hourly data from Los Angeles, collected from 2013 to 2017, are used, with 80% allocated for model estimation and the remaining 20% for one-day-ahead out-of-sample forecasting. The performance of the models is evaluated using R², MAE, and RMSE.
The results revealed that the FAR(1) model performs better than the other models included in the study, producing lower forecasting errors. The second-best model was the VAR model, followed by the ARIMA, NNAR, and RF models. The SVM model performed the worst among all models. These findings highlight the importance of choosing a suitable model for accurate O₃ forecasting, which is crucial for mitigating the impacts of air pollution. In future research, including exogenous variables such as temperature, humidity, wind speed, solar radiation, and other meteorological factors may improve the forecasting accuracy. As the current study is based on the Los Angeles dataset, the performance of the proposed approach could also be assessed on datasets from other sites.

Author Contributions

Conceptualization, I.S. and N.G.; methodology, I.S.; software, S.A.; validation, N.G. and H.H.; formal analysis, N.G.; investigation, I.S.; resources, H.H.; data curation, S.A.; writing—original draft preparation, I.S. and H.H.; writing—review and editing, N.G. and S.A.; visualization, S.A.; supervision, I.S.; project administration, S.A.; funding acquisition, H.H. All authors have read and agreed to the published version of the manuscript.

Funding

The study was funded by Lebanese International University-LIU, Lebanon.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are available from the corresponding author on request.

Acknowledgments

The authors appreciate the efforts of the anonymous reviewers in improving the quality and presentation of their work. Also, the authors extend their appreciation to Lebanese International University-LIU, Lebanon for funding.

Conflicts of Interest

The authors declare no conflicts of interest.

Note

1. https://aqs.epa.gov/aqsweb/airdata (accessed on 12 March 2023).

References

  1. Aneiros-Pérez, Germán, Herve Cardot, Graciela Estévez-Pérez, and Philippe Vieu. 2004. Maximum ozone concentration forecasting by functional non-parametric approaches. Environmetrics 15: 675–85.
  2. Arsić, Milica, Ivan Mihajlović, Djordje Nikolić, Živan Živković, and Marija Panić. 2020. Prediction of ozone concentration in ambient air using multilinear regression and the artificial neural networks methods. Ozone: Science & Engineering 42: 79–88.
  3. Bosq, Denis. 2000. Linear Processes in Function Spaces: Theory and Applications. Berlin/Heidelberg: Springer Science & Business Media, vol. 149.
  4. Breiman, Leo. 2001. Random forests. Machine Learning 45: 5–32.
  5. Didericksen, Devin, Piotr Kokoszka, and Xi Zhang. 2012. Empirical properties of forecasts with the functional autoregressive model. Computational Statistics 27: 285–98.
  6. Eslami, Ebrahim, Yunsoo Choi, Yannic Lops, and Alqamah Sayeed. 2020. A real-time hourly ozone prediction system using deep convolutional neural network. Neural Computing and Applications 32: 8783–97.
  7. Gao, Meng, Liting Yin, and Jicai Ning. 2018. Artificial neural network model for ozone concentration estimation and Monte Carlo analysis. Atmospheric Environment 184: 129–39.
  8. Ghoneim, Osama, Doreswamy, and Busnur Rachotappa Manjunatha. 2017. Forecasting of ozone concentration in smart city using deep learning. Paper presented at the 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Udupi, India, September 13–16; pp. 1320–26.
  9. Hashim, NurIzzah M., Norazian Mohamed Noor, Ahmad Zia Ul-Saufie, Andrei Victor Sandu, Petrica Vizureanu, György Deák, and Marwan Kheimi. 2022. Forecasting daytime ground-level ozone concentration in urbanized areas of Malaysia using predictive models. Sustainability 14: 7936.
  10. Hong, Fugui, Cheng Ji, Jingzhi Rao, Chang Chen, and Wei Sun. 2023. Hourly ozone level prediction based on the characterization of its periodic behavior via deep learning. Process Safety and Environmental Protection 174: 28–38.
  11. Jan, Faheem, Ismail Shah, and Sajid Ali. 2022. Short-term electricity prices forecasting using functional time series analysis. Energies 15: 3423.
  12. Kumar, Krishan, A. K. Yadav, M. P. Singh, H. Hassan, and V. K. Jain. 2004. Forecasting daily maximum surface ozone concentrations in Brunei Darussalam—An ARIMA modeling approach. Journal of the Air & Waste Management Association 54: 809–14.
  13. Lotrecchiano, Nicoletta, Daniele Sofia, Aristide Giuliano, Diego Barletta, and Massimo Poletto. 2020. Pollution dispersion from a fire using a Gaussian plume model. International Journal of Safety and Security Engineering 10: 431–39.
  14. Mavroidis, Ilias, and M. Ilia. 2012. Trends of NOx, NO2 and O3 concentrations at three different types of air quality monitoring stations in Athens, Greece. Atmospheric Environment 63: 135–47.
  15. Mills, Gina, Harry Harmens, Serena Wagg, Katrina Sharps, Felicity Hayes, David Fowler, Mark Sutton, and Bill Davies. 2016. Ozone impacts on vegetation in a nitrogen enriched and changing climate. Environmental Pollution 208: 898–908.
  16. Munshed, Mohammad, Jesse Van Griensven Thé, and Roydon Fraser. 2023. Methodology for mobile toxics deterministic human health risk assessment and case study. Atmosphere 14: 506.
  17. Nuvolone, Daniela, Davide Petri, and Fabio Voller. 2018. The effects of ozone on human health. Environmental Science and Pollution Research 25: 8074–88.
  18. Paoletti, Elena, and William J. Manning. 2007. Toward a biologically significant and usable standard for ozone that will also protect plants. Environmental Pollution 150: 85–95.
  19. Rahman, Azizur, and N. M. Refat Nasher. 2024. Forecasting hourly ozone concentration using functional time series model—A case study in the coastal area of Bangladesh. Environmental Modeling & Assessment 29: 125–34.
  20. Ramsay, James O. 1982. When the data are functions. Psychometrika 47: 379–96.
  21. Salazar, Ledys, Orietta Nicolis, Fabrizio Ruggeri, Jozef Kisel'ák, and Milan Stehlík. 2019. Predicting hourly ozone concentrations using wavelets and ARIMA models. Neural Computing and Applications 31: 4331–40.
  22. Seinfeld, John H., and Spyros N. Pandis. 2016. Atmospheric Chemistry and Physics: From Air Pollution to Climate Change. Hoboken: John Wiley & Sons.
  23. Shah, Ismail, Izhar Muhammad, Sajid Ali, Saira Ahmed, Mohammed M. A. Almazah, and A. Y. Al-Rezami. 2022. Forecasting day-ahead traffic flow using functional time series approach. Mathematics 10: 4279.
  24. Shumway, Robert H., and David S. Stoffer. 2000. Time Series Analysis and Its Applications. Berlin/Heidelberg: Springer, vol. 3.
  25. Su, Xiaoqian, Junlin An, Yuxin Zhang, Ping Zhu, and Bin Zhu. 2020. Prediction of ozone hourly concentrations by support vector machine and kernel extreme learning machine using wavelet transformation and partial least squares methods. Atmospheric Pollution Research 11: 51–60.
  26. Sujith, Babu, Meena Sehgal, Karthik L. Balajee, and Rizwan Suliankatchi. 2017. Characteristics of the ozone pollution and its health effects in India. International Journal of Medicine and Public Health 7: 56–60.
  27. Suraki, Mohsen Yaghoubi. 2013. NFC: A surveillance system for preventing and controlling air pollution. Paper presented at the 2013 7th International Conference on Application of Information and Communication Technologies, Baku, Azerbaijan, October 23–25; pp. 1–4.
  28. World Health Organization. 2020. Air Pollution. Available online: https://www.who.int/health-topics/air-pollution#tab=tab_1 (accessed on 21 September 2020).
  29. Yafouz, Ayman, Nouar AlDahoul, Ahmed H. Birima, Ali Najah Ahmed, Mohsen Sherif, Ahmed Sefelnasr, Mohammed Falah Allawi, and Ahmed Elshafie. 2022. Comprehensive comparison of various machine learning algorithms for short-term ozone concentration prediction. Alexandria Engineering Journal 61: 4607–22.
Figure 1. Different sources of air pollution. Source: (Suraki 2013).
Figure 2. Flowchart of the proposed general modeling framework.
Figure 3. O₃ concentration time series for Los Angeles. The red line divides the estimation and out-of-sample forecasting periods.
Figure 4. One-day-ahead out-of-sample forecasts for O₃ concentration: (top) RMSE, (middle) MAE, and (bottom) R² values.
Figure 5. Day-specific RMSE and MAE values for O₃ concentration forecasting.
Figure 6. One-day-ahead out-of-sample hour-specific RMSE and MAE for O₃ concentration.
Figure 7. Out-of-sample month-specific RMSE and MAE for O₃ concentration.
Table 1. Descriptive statistics of the O₃ concentration time series (ppm).

Statistic             Value
Minimum               0.0010
1st Quartile          0.0070
Median                0.0220
Mean                  0.0239
3rd Quartile          0.0360
Maximum               0.1160
Standard Deviation    0.0180
Table 2. One-day-ahead out-of-sample forecast results for O₃ concentration.

Method         MAE      RMSE     R²
FAR(1)         0.0066   0.0090   0.7971
ARIMA(5,0,0)   0.0077   0.0101   0.7429
VAR(5)         0.0069   0.0092   0.7827
NNAR           0.0078   0.0102   0.7375
SVM            0.0085   0.0116   0.6693
RF             0.0078   0.0106   0.7136
Table 3. Day-specific forecasting errors of all models.

Method   Error   Monday   Tuesday  Wednesday  Thursday  Friday   Saturday  Sunday
FAR(1)   RMSE    0.0086   0.0087   0.0077     0.0090    0.0091   0.0099    0.0095
ARIMA    RMSE    0.0099   0.0095   0.0087     0.0103    0.0105   0.0111    0.0106
VAR      RMSE    0.0087   0.0089   0.0081     0.0093    0.0092   0.0103    0.0099
NNAR     RMSE    0.0101   0.0096   0.0089     0.0105    0.0108   0.0107    0.0103
RF       RMSE    0.0107   0.0105   0.0089     0.0104    0.0110   0.0114    0.0113
SVM      RMSE    0.0112   0.0110   0.0101     0.0110    0.0123   0.0126    0.0127
FAR(1)   MAE     0.0066   0.0063   0.0057     0.0066    0.0068   0.0072    0.0074
ARIMA    MAE     0.0078   0.0074   0.0069     0.0077    0.0078   0.0084    0.0083
VAR      MAE     0.0067   0.0066   0.0060     0.0069    0.0069   0.0075    0.0076
NNAR     MAE     0.0079   0.0074   0.0069     0.0079    0.0081   0.0081    0.0081
RF       MAE     0.0082   0.0079   0.0066     0.0076    0.0078   0.0083    0.0085
SVM      MAE     0.0084   0.0081   0.0076     0.0079    0.0085   0.0090    0.0097
Table 4. Hour-specific forecasting errors for O₃ concentration.

Hour  Errors  FAR(1)   ARIMA    VAR      NNAR     RF       SVM
1     RMSE    0.0049   0.0108   0.0048   0.0108   0.0115   0.0124
      MAE     0.0034   0.0083   0.0033   0.0082   0.0083   0.0088
2     RMSE    0.0057   0.0105   0.0057   0.0104   0.0111   0.0119
      MAE     0.0040   0.0081   0.0041   0.0080   0.0080   0.0085
3     RMSE    0.0069   0.0109   0.0068   0.0109   0.0116   0.0124
      MAE     0.0050   0.0085   0.0051   0.0084   0.0083   0.0089
4     RMSE    0.0075   0.0107   0.0074   0.0108   0.0114   0.0120
      MAE     0.0055   0.0084   0.0054   0.0083   0.0082   0.0086
5     RMSE    0.0081   0.0106   0.0081   0.0105   0.0111   0.0116
      MAE     0.0060   0.0083   0.0060   0.0080   0.0079   0.0079
6     RMSE    0.0077   0.0096   0.0078   0.0096   0.0102   0.0106
      MAE     0.0054   0.0075   0.0057   0.0073   0.0071   0.0071
7     RMSE    0.0070   0.0084   0.0072   0.0084   0.0090   0.0091
      MAE     0.0050   0.0062   0.0051   0.0061   0.0062   0.0060
8     RMSE    0.0067   0.0080   0.0069   0.0079   0.0087   0.0090
      MAE     0.0048   0.0060   0.0049   0.0057   0.0061   0.0062
9     RMSE    0.0070   0.0079   0.0072   0.0077   0.0085   0.0088
      MAE     0.0054   0.0060   0.0055   0.0058   0.0063   0.0065
10    RMSE    0.0076   0.0082   0.0077   0.0077   0.0086   0.0094
      MAE     0.0059   0.0065   0.0061   0.0061   0.0067   0.0072
11    RMSE    0.0087   0.0093   0.0089   0.0092   0.0099   0.0111
      MAE     0.0068   0.0073   0.0072   0.0073   0.0078   0.0086
12    RMSE    0.0092   0.0098   0.0097   0.0097   0.0103   0.0117
      MAE     0.0070   0.0074   0.0075   0.0073   0.0079   0.0088
13    RMSE    0.0092   0.0100   0.0098   0.0098   0.0106   0.0120
      MAE     0.0071   0.0077   0.0077   0.0077   0.0083   0.0093
14    RMSE    0.0100   0.0105   0.0104   0.0107   0.0109   0.0124
      MAE     0.0076   0.0081   0.0081   0.0083   0.0085   0.0097
15    RMSE    0.0104   0.0107   0.0106   0.0111   0.0111   0.0127
      MAE     0.0080   0.0082   0.0084   0.0086   0.0087   0.0097
16    RMSE    0.0109   0.0110   0.0110   0.0117   0.0115   0.0133
      MAE     0.0081   0.0083   0.0085   0.0089   0.0088   0.0100
17    RMSE    0.0106   0.0109   0.0109   0.0112   0.0113   0.0127
      MAE     0.0080   0.0082   0.0083   0.0085   0.0087   0.0096
18    RMSE    0.0109   0.0110   0.0113   0.0113   0.0114   0.0127
      MAE     0.0082   0.0084   0.0087   0.0087   0.0086   0.0097
19    RMSE    0.0099   0.0098   0.0104   0.0101   0.0101   0.0110
      MAE     0.0078   0.0077   0.0082   0.0080   0.0076   0.0085
20    RMSE    0.0097   0.0099   0.0101   0.0101   0.0102   0.0108
      MAE     0.0077   0.0078   0.0080   0.0080   0.0078   0.0084
21    RMSE    0.0103   0.0104   0.0107   0.0107   0.0108   0.0115
      MAE     0.0079   0.0080   0.0081   0.0083   0.0080   0.0087
22    RMSE    0.0105   0.0106   0.0110   0.0106   0.0112   0.0124
      MAE     0.0079   0.0080   0.0083   0.0080   0.0080   0.0087
23    RMSE    0.0108   0.0109   0.0112   0.0109   0.0114   0.0124
      MAE     0.0083   0.0084   0.0086   0.0083   0.0083   0.0090
24    RMSE    0.0109   0.0109   0.0112   0.0109   0.0115   0.0124
      MAE     0.0083   0.0084   0.0086   0.0083   0.0083   0.0088
Table 5. Month-specific forecasting errors of all models.

Month      Errors  FAR(1)   ARIMA    VAR      NNAR     RF       SVM
January    RMSE    0.0085   0.0092   0.0085   0.0092   0.0098   0.0101
           MAE     0.0064   0.0072   0.0065   0.0073   0.0074   0.0074
February   RMSE    0.0106   0.0119   0.0106   0.0121   0.0126   0.0136
           MAE     0.0081   0.0091   0.0081   0.0094   0.0095   0.0101
March      RMSE    0.0100   0.0112   0.0107   0.0112   0.0121   0.0126
           MAE     0.0075   0.0086   0.0080   0.0085   0.0088   0.0093
April      RMSE    0.0111   0.0125   0.0110   0.0124   0.0136   0.0146
           MAE     0.0085   0.0097   0.0084   0.0096   0.0105   0.0112
May        RMSE    0.0082   0.0095   0.0089   0.0098   0.0101   0.0116
           MAE     0.0064   0.0075   0.0070   0.0076   0.0078   0.0090
June       RMSE    0.0090   0.0103   0.0094   0.0102   0.0103   0.0117
           MAE     0.0068   0.0082   0.0072   0.0080   0.0082   0.0092
July       RMSE    0.0083   0.0090   0.0086   0.0089   0.0093   0.0103
           MAE     0.0061   0.0067   0.0064   0.0066   0.0070   0.0076
August     RMSE    0.0091   0.0099   0.0094   0.0099   0.0102   0.0117
           MAE     0.0066   0.0076   0.0069   0.0074   0.0075   0.0087
September  RMSE    0.0091   0.0105   0.0094   0.0106   0.0111   0.0115
           MAE     0.0070   0.0083   0.0072   0.0084   0.0087   0.0090
October    RMSE    0.0097   0.0114   0.0097   0.0115   0.0119   0.0136
           MAE     0.0071   0.0088   0.0072   0.0088   0.0089   0.0099
November   RMSE    0.0068   0.0074   0.0071   0.0077   0.0076   0.0085
           MAE     0.0052   0.0060   0.0054   0.0063   0.0056   0.0060
December   RMSE    0.0061   0.0070   0.0064   0.0071   0.0071   0.0070
           MAE     0.0043   0.0054   0.0046   0.0054   0.0045   0.0043