Uncertainty Quantification of Complex Weather Dynamics Using a Novel Functional Autoregressive Model

Shah, Ismail; Uzair, Muhammad; Ali, Sajid; Aljeddani, Sadiah M.

doi:10.3390/math14050835


¹ Department of Statistics, Quaid-i-Azam University, Islamabad 45320, Pakistan

² Department of Statistical Sciences, University of Padova, 35121 Padova, Italy

³ Mathematics Department, Al-Lith University College, Umm Al-Qura University, Al-Lith 21961, Saudi Arabia

* Authors to whom correspondence should be addressed.

Mathematics 2026, 14(5), 835; https://doi.org/10.3390/math14050835

Submission received: 18 January 2026 / Revised: 13 February 2026 / Accepted: 25 February 2026 / Published: 1 March 2026

(This article belongs to the Special Issue Uncertainty Quantification Techniques in Statistics, Machine Learning and FinTech: 2nd Edition)


Abstract

Functional time series (FTS) modeling has emerged as a powerful framework for capturing complex temporal dependencies using the functional autoregressive models FAR(p, m) and FARX(p, m, τ). These functional models characterize the evolution of functional observations by incorporating p lagged functional responses, m truncated dimensions from functional principal component analysis (FPCA), and τ scalar covariates, with optimal parameter selection guided by the minimization of the functional final prediction error fFPE(p, m). The aim of this study is to propose a computationally efficient FAR model that can integrate several functional covariates to achieve high predictive accuracy in terms of standard out-of-sample accuracy measures. To this end, an integrated functional autoregressive model $FAR X (p, m, \underset{̲}{g}, τ)$ is developed, where X denotes the exogenous information, this being a lagged or modeled functional profile within the FAR(p, m) framework, and $\underset{̲}{g}$ represents a vector of optimal dimensions for the functional covariates. The theoretical contributions are twofold: first, deriving the distribution of the modified functional final prediction error, denoted as $fFPE X (p, m, \underset{̲}{g}, τ)$; second, using this derivation to establish formal criteria for optimal model selection. To empirically investigate the predictive performance of the proposed model, hourly temperature data from the NASA POWER project are considered, and day-ahead out-of-sample forecasts over a full annual cycle are computed. The forecasting performance of the proposed model is assessed against state-of-the-art models using different error summary metrics. The results show that functional models consistently outperform traditional time series and neural network-based approaches, with $FAR X (p, m, \underset{̲}{g}, τ)$ achieving superior predictive accuracy compared to FAR(p, m) and FARX(p, m, τ), thereby underscoring the efficacy of incorporating functional exogenous information in FTS modeling.

Keywords:

functional autoregressive models; forecasting; functional final prediction error; functional principal component analysis; ARIMA; temperature

MSC:

62M10; 62R10; 46-11; 62J05

1. Introduction

Climate patterns exhibit intricate variability, driven by complex interactions between atmospheric, oceanic, and terrestrial systems. Understanding these patterns is crucial for predicting extreme weather events, assessing climate change impacts, and developing effective mitigation strategies. Among the many climatic variables, temperature plays a pivotal role, serving as a primary driver of atmospheric circulation, precipitation dynamics, and weather extremes [1]. It is a fundamental variable in meteorology, shaping the intricate patterns of weather systems and influencing atmospheric circulation. As key drivers of energy exchange within the Earth’s climate system, temperature variations dictate the formation of storms, heatwaves, and cold spells, making them crucial parameters for understanding and predicting complex weather structures. Moreover, long-term temperature trends serve as vital indicators of environmental changes, including global warming and shifts in regional climate regimes. Accurate modeling and forecasting of temperature are essential for climate resilience, disaster preparedness, and sustainable environmental policies [2,3].

To accurately forecast air temperature, researchers have used methods that vary in complexity, methodology, and performance. For example, ref. [4] used linear regression to model daily maximum temperature data for a site in Nashville, Tennessee, with geopotential thickness forecasts as a covariate. The study reveals that the forecasting accuracy of daily maximum temperature is high for warmer air, whereas it tends to decrease in the winter season. Although this method improves daily maximum temperature modeling, it has limitations, including dependence on geopotential thickness forecasts, seasonal effects, and other environmental factors. In the context of producing long-horizon predictive densities, which are crucial for pricing weather derivatives, daily average temperature data have been modeled by various authors [5,6]. For pricing temperature derivatives, the Chicago Mercantile Exchange (CME) defines the daily average temperature underlying its contracts as the mean of the daily high and low temperatures. In an evaluation using daily average temperature data from the United States of America, ref. [7] proposed an Extreme Value Theory (EVT) approach that outperformed classical models such as AR, generalized autoregressive conditional heteroskedasticity (GARCH), and standard regression.

Based on 62 years of historical time series (TS) data from Zhengzhou, China, ref. [8] modeled the daily average temperature using a mean-reverting Ornstein–Uhlenbeck process to provide accurate pricing for weather derivatives. Recent advances in nonlinear and complex time series modeling include Markov switching bilinear models for higher-order spectral analysis and threshold-based stochastic volatility frameworks, which are capable of capturing asymmetric dynamics and regime changes [9,10]. Heatwaves significantly impact society, which highlights the importance of modeling temperature. Ref. [11] evaluated the Regional Atmospheric Modelling System (RAMS) for predicting maximum and minimum summer temperatures in the Valencia Region from 2007 to 2010. The results show that the model predicts maximum temperatures well, with small errors of about 2 °C. The impact of global warming, based on daily minimum and maximum temperature data from Coimbra in Portugal, is assessed by [12] using the Hadley Centre model and a neural network. The results suggest that the two-layer neural network outperforms the competitors. Ref. [13] used the Abductory Induction Mechanism (AIM) as a modeling tool for temperature forecasting within a machine learning framework. The model was validated using daily maximum temperature data from the city of Dhahran in Saudi Arabia, achieving 97% accuracy within a range of ±3 °C, and hence provides better predictions than other traditional models. A novel abductive network model is developed by [14] to forecast day-ahead and hour-ahead temperatures. The model was trained using 5 years of hourly data and tested on the following full year, for which the mean absolute errors (MAEs) of the day-ahead and hour-ahead forecasts were 1.68 °F and 1.05 °F, respectively.
Multiple linear regression and three artificial neural networks (ANNs), namely feed-forward backpropagation (FFBP), radial basis function (RBF), and generalized regression neural network (GRNN), have also been applied to model daily minimum, average, and maximum temperatures [15]. These models are evaluated for the Geyve and Sakarya basins, which are among Turkey's most important areas for agricultural production and are located in the southeast of the Marmara region. The study reveals that the three ANN models outperform the classical benchmark.

A comparative study of data-intelligent models, namely a generalized regression neural network (GRNN), multivariate adaptive regression splines (MARS), random forest (RF), and extreme learning machines (ELMs), is presented by [16], in which longitude, latitude, altitude, and periodicity are used as covariates rather than traditional atmospheric features such as humidity and precipitation. These models were implemented at eleven different sites in Madhya Pradesh, central India, and the results show that the GRNN works more effectively for modeling air temperature, particularly in data-sparse regions where only geographic and topographic factors are utilized for temperature forecasting. One can also read [17] for an in-depth review of air temperature modeling using machine learning and neural network models. A systematic literature review (from 2019 to 2024) in connection with modeling and forecasting daily temperature is also given in Table 1.

Accurate day-ahead temperature forecasts are essential for energy, agriculture, and public health, as they help optimize resource management and reduce climate-related risks. Traditional methods often struggle to capture the continuous and complex nature of temperature variations. To overcome this deficiency, this work models and forecasts the daily temperature curves using functional data analysis (FDA). To this end, a novel functional autoregressive model $FAR X (p, m, \underset{̲}{g}, τ)$ is developed and the distribution of the modified functional final prediction error $fFPE X (p, m, \underset{̲}{g}, τ)$ is derived for optimal model selection.

The rest of the article is structured as follows. Section 2 provides an introduction to functional data analysis along with the fundamental tools necessary for its implementation. Section 3 presents the development of a new functional model, including discussions on classical and functional competitors. An empirical application of the proposed methodology is provided in Section 4, where the results are analyzed in detail. Finally, Section 5 concludes the study.

2. Functional Data Analysis

FDA offers a strong and promising framework for treating complex and high-dimensional datasets. Within FDA, data are represented as smooth curves or functions rather than discrete points or vectors [36]. This technique contrasts with traditional approaches that often concentrate on summary statistics or discrete observations. There are diverse applications of FDA in various disciplines such as medicine, finance, neuroscience, economics, environmental studies, and quality control [37,38]. A systematic review about the applications of the FDA in various fields for the period from 1995 to 2010 is given by [39]. A recent study [40] involves forecasting crude oil prices using derivative information in a multivariate empirical mode decomposition (MEMD) model within the FDA framework, which results in better performance as compared with traditional time series models. An application of the FDA to analyze the behavioral patterns of the Colombian stock market, specifically the effects of COVID-19 and the aftermath of the Ukraine war, is conducted by [41]. Ref. [42] employed the FDA approach to predict age-specific brain cancer mortality trends for planning public health policies and resource allocation. FDA has also significantly contributed to numerical weather prediction. For example, it is utilized to create point and interval forecasts for drought time series patterns [43]. They developed a reliable predictive method for predicting drought intervals. Based on the FDA technique, ref. [44] illustrated the temporal changes of rainfall to assist in forecasting the future and to achieve a clear understanding of rainfall patterns. Ref. [45] implemented the FDA framework to forecast temperature and precipitation under various climate change scenarios. Ref. [46] performed FPCA to model the structure of rainfall. 
A functional autoregressive model of the first order, FAR(1), is explored to forecast hourly air temperature up to 24 h ahead, where the traditional time series model ARIMA served as the competitor, and the results suggest that FAR(1) is superior. Online auction prices and traffic volume are also predicted under the FDA framework by [47,48,49]. Ref. [50] indicates that, due to substantial fluctuations, electricity consumption, demand, and prices have a non-linear and complex structure. Refs. [51,52,53] applied FDA to estimate and forecast electricity consumption and demand. Hour-ahead electricity demand forecasting strategies are implemented using non-parametric FDA [54,55,56,57], and the results significantly outperform those obtained from the classical seasonal ARIMA (SARIMA) models.

2.1. Preliminaries

Let $\{S_{t} (υ) : t \in \mathbb{Z}, υ \in [0, 1]\}$ denote a stationary functional time series of observations, where t represents the time index and υ the argument in the continuous domain of each function. Each $S_{t}$ is assumed to belong to the Hilbert space $H = L^{2} ([0, 1])$ with the inner product $〈 x, y 〉 = \int_{0}^{1} x (υ) y (υ) d υ$. Thus, each $S_{t}$ is square-integrable, satisfying $∥ S_{t} ∥^{2} = \int_{0}^{1} S_{t}^{2} (υ) d υ < \infty$, and all random functions are defined on a common probability space $(Ω, A, P)$. For p > 0, we write $S_{t} \in L_{H}^{p}$ if $E [∥ S_{t} ∥^{p}] < \infty$. Since the series is not necessarily centered, we define the mean function $λ (υ) = E [S_{t} (υ)]$, $υ \in [0, 1]$, and the covariance operator of the centered functions $S_{t} - λ$ as:

C (x) (υ) = E [〈 S_{t} - λ, x 〉 (S_{t} (υ) - λ (υ))], x \in H

or equivalently via its kernel representation

C (x) (υ) = \int_{0}^{1} c (υ, γ) x (γ) d γ, x \in H

(1)

with

c (υ, γ) = Cov (S_{t} (υ), S_{t} (γ)) .

(2)

Equation (1) expresses the covariance operator C as an integral operator with kernel

c (υ, γ)

, while Equation (2) defines this kernel as the covariance between the functional observations at points υ and γ. For a more detailed discussion of covariance operators in functional time series, the reader is referred to [58].

2.2. Basis Function System

In the FDA framework, a basis function system (BFS) plays a crucial role in converting discrete data into a functional object, enabling a more compact representation without losing essential information. Various types of basis function systems exist, including polynomial, B-spline, and Fourier basis functions (FBFs), each suitable for different data structures. Among these, FBFs, which consist of sine and cosine waves with increasing frequency, are particularly effective for modeling periodic data. They are especially useful for cyclical environmental data, such as temperature, wind speed, and atmospheric pressure, where seasonal and diurnal variations play a crucial role. The sinusoidal components of the FBFs effectively capture periodic fluctuations, providing a smooth and continuous representation of temperature profiles. A Fourier basis function is defined as

ξ_{j} (υ) = cos (j φ (υ)) + sin (j φ (υ)),

(3)

where j denotes the Fourier frequency index and $φ (υ)$ is the fundamental frequency function, typically defined as $φ (υ) = 2 π υ$, $υ \in [0, 1]$. Using M Fourier basis functions, the functional observation at time t can be approximated as

S_{t} (υ) = \sum_{j = 1}^{M} b_{t, j} ξ_{j} (υ), υ \in [0, 1]

(4)

where $b_{t, j}$ are time-varying Fourier coefficients and $\{ξ_{j} (υ)\}_{j = 1}^{M}$ form the Fourier basis function system. As an example, Figure 1 shows the M = 20 FBFs, and Figure 2 shows the resulting functional data, namely the 2192 daily temperature curves for the years 2019 to 2024. The collection $S_{t} (υ)$ is known as the functional time series (FTS), in which one can investigate the major sources of variation among the different temperature curves.
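The basis expansion in Equations (3) and (4) can be illustrated with a minimal NumPy sketch. It evaluates the basis exactly as written in Equation (3) with $φ (υ) = 2 π υ$ and estimates the coefficients $b_{t, j}$ by ordinary least squares. The function names `fourier_basis` and `fit_coefficients`, the equispaced hourly grid, and the use of least squares are assumptions made for illustration; the article does not state how the coefficients are estimated.

```python
import numpy as np

def fourier_basis(u, M):
    """Evaluate xi_j(u) = cos(2*pi*j*u) + sin(2*pi*j*u), j = 1..M (Equation (3)).

    Returns an array of shape (len(u), M); column j-1 holds xi_j on the grid u.
    """
    u = np.asarray(u, dtype=float)
    j = np.arange(1, M + 1)
    phase = 2.0 * np.pi * np.outer(u, j)
    return np.cos(phase) + np.sin(phase)

def fit_coefficients(u, y, M):
    """Least-squares estimate of the coefficients b_{t,j} in Equation (4)
    for one discretely observed curve y on the grid u (illustrative choice)."""
    B = fourier_basis(u, M)
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    return coef

# Hypothetical usage: 24 hourly readings mapped to [0, 1)
u = np.arange(24) / 24.0
b_true = np.array([1.0, -0.5, 0.25, 0.0, 2.0])
y = fourier_basis(u, 5) @ b_true          # a curve lying exactly in the span
b_hat = fit_coefficients(u, y, 5)         # recovers b_true up to rounding
```

With noisy hourly data, the fitted curve `fourier_basis(u, M) @ b_hat` is a smoothed version of the observations; a larger M follows the data more closely.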

2.3. Functional Principal Component Analysis (FPCA)

Suppose we observe an FTS, $S_{t} (υ)$, consisting of n curves, where $υ \in [0, 1]$. The mean function is defined as $λ (υ) = \frac{1}{n} \sum_{t = 1}^{n} S_{t} (υ)$. The mean-centered curves are then given by

S_{t}^{⋆} (υ) = S_{t} (υ) - λ (υ) .

(5)

Let $α_{j} (υ)$ denote the jth FPC. Each mean-centered curve $S_{t}^{⋆} (υ)$ can be projected onto $α_{j} (υ)$ in the Hilbert space $H = L^{2} ([0, 1])$. The corresponding FPC score is defined as

θ_{j, t} = 〈 S_{t}^{⋆}, α_{j} 〉 = \int_{0}^{1} S_{t}^{⋆} (υ) α_{j} (υ) d υ .

(6)

The FPCs satisfy the normalization condition $\int_{0}^{1} α_{j}^{2} (υ) d υ = 1$, ensuring the identifiability of the scores. The covariance operator $C : H \to H$ of the mean-centered process is defined as $C (x) (υ) = E [〈 S_{t}^{⋆}, x 〉 S_{t}^{⋆} (υ)]$, $x \in H$. According to Mercer’s theorem, C admits the spectral decomposition

C (x) (υ) = \sum_{j = 1}^{\infty} δ_{j} 〈 x, α_{j} 〉 α_{j} (υ), x \in H

(7)

where $\{δ_{j}\}_{j \geq 1}$ are the eigenvalues in decreasing order and $\{α_{j}\}_{j \geq 1}$ are the corresponding orthonormal eigenfunctions [59]. Using the Karhunen–Loève expansion, each functional observation can be expressed as

S_{t} (υ) = λ (υ) + \sum_{j = 1}^{\infty} θ_{j, t} α_{j} (υ),

(8)

where $E [θ_{j, t}] = 0$ and $Var (θ_{j, t}) = δ_{j}$. For dimension reduction, we retain the first m FPCs and obtain the approximation [60,61]

S_{t} (υ) = λ (υ) + \sum_{j = 1}^{m} θ_{j, t} α_{j} (υ) + \sum_{j = m + 1}^{\infty} θ_{j, t} α_{j} (υ),

(9)

where the third term in the above expression accounts for the variability not explained by the first m functional principal components and has zero mean and finite variance [62]. In practice, the mean function and covariance kernel are estimated as

\hat{λ} (υ) = \frac{1}{n} \sum_{t = 1}^{n} S_{t} (υ),

(10)

\hat{C} (υ, γ) = \frac{1}{n - 1} \sum_{t = 1}^{n} [S_{t} (υ) - \hat{λ} (υ)] [S_{t} (γ) - \hat{λ} (γ)],

(11)

Let ${\hat{δ}}_{j}$ and ${\hat{α}}_{j} (υ)$ denote the eigenvalues and eigenfunctions of $\hat{C}$. The reconstructed FTS is then given by

{\hat{S}}_{t} (υ) = \hat{λ} (υ) + \sum_{j = 1}^{m} {\hat{θ}}_{j, t} {\hat{α}}_{j} (υ),

(12)

where ${\hat{θ}}_{j, t} = \int_{0}^{1} [S_{t} (υ) - \hat{λ} (υ)] {\hat{α}}_{j} (υ) d υ$.
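The estimation steps in Equations (10)–(12) can be sketched numerically for curves observed on a common grid. The sketch below assumes an equispaced grid of N points on [0, 1] and approximates every integral by a Riemann sum with weight h = 1/N; the function name `fpca` and this quadrature rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fpca(curves, m):
    """Discretized FPCA for an (n, N) array whose row t is S_t on an
    equispaced grid of N points in [0, 1].

    Returns the sample mean, all eigenvalues (decreasing), the first m
    eigenfunctions (columns, normalized so that sum(alpha**2) * h = 1),
    the (n, m) score matrix, and the rank-m reconstruction.
    """
    n, N = curves.shape
    h = 1.0 / N                                  # Riemann-sum quadrature weight
    mean = curves.mean(axis=0)                   # Equation (10)
    centered = curves - mean
    cov = centered.T @ centered / (n - 1)        # Equation (11) on the grid
    eigval, eigvec = np.linalg.eigh(cov * h)     # discretized integral operator
    order = np.argsort(eigval)[::-1]             # sort eigenvalues decreasingly
    eigval, eigvec = eigval[order], eigvec[:, order]
    alpha = eigvec[:, :m] / np.sqrt(h)           # unit L2 norm on [0, 1]
    scores = centered @ alpha * h                # discretized score integral
    recon = mean + scores @ alpha.T              # Equation (12)
    return mean, eigval, alpha, scores, recon
```

Under this discretization, the sample variance of the jth score column (with divisor n - 1) equals the jth eigenvalue, which mirrors $Var (θ_{j, t}) = δ_{j}$, and the sum of the eigenvalues beyond m is the variance lost by truncation.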

3. Functional Time Series Modeling

An FTS captures nonlinear dependencies while preserving smooth temporal structures, making it particularly suitable for modeling complex dynamic processes. This section discusses the existing methodologies and the newly developed approach, which can effectively handle such dependencies. Despite its statistical relevance and mathematical appeal, functional time series modeling has practical limitations. Few user-friendly software packages exist, with notable exceptions being the far [63] and ftsa [64] packages in R version 4.5.0 (2025-04-11 ucrt). The lack of dedicated tools often requires manual implementation, which limits its use largely to academic researchers. This article aims to bridge this gap by introducing a prediction algorithm.

3.1. Functional Autoregressive Models

An autoregressive (AR) model is a well-known linear model where the response variable is regressed over its lagged values plus a noise term [65,66]. If the response observations are functions defined over a continuous domain (rather than scalar or vector values), the autoregressive model is referred to as an FAR model. An FAR process and its theoretical properties are described in a Hilbert space framework, which is one of the best techniques for modeling the complex nature of FTS [67]. In traditional univariate and multivariate time series analysis, reliable predictions rely on recursive methods like the Durbin–Levinson algorithm and the innovations algorithm [68], which systematically update predictions based on past observations. Prediction equations for general stationary FTS can be derived explicitly. However, their practical implementation is challenging due to the infinite-dimensional nature of functional data. As a result, most of the research in this area is limited to FAR(1) models primarily due to the computational and theoretical challenges of estimating higher-order FAR(p) models, as functional time series are infinite-dimensional and more complex models require substantial data and computation for reliable parameter estimation. Mathematically, the FAR(1) model [69] is given in Equation (13) as

S_{t} (υ) = λ (υ) + ϕ S_{t - 1} (υ) + N_{t} (υ),

(13)

where $λ (υ)$ is the functional mean curve, $S_{t - 1} (υ)$ is the first lagged functional variable, $N_{t} (υ)$ is i.i.d. in $L_{H}^{2}$ such that $E [N_{t} (υ)] = 0$, and ϕ is a bounded linear operator mapping $H \to H$ such that Equation (13) has a unique solution. If the FAR(1) model is inadequate, a multiple testing procedure can be used to determine the optimal order p and ensure a better fit to the data [70]; the procedure uses the kernel functions of the model to form a test statistic that follows an approximate chi-square distribution, with degrees of freedom based on the number of FPCs. A more general FAR(p) process is given by Equation (14) as

S_{t} (υ) = λ (υ) + \sum_{i = 1}^{p} ϕ_{i} S_{t - i} (υ) + N_{t} (υ),

(14)

where $ϕ_{i}$ are the linear operators of the corresponding lagged functional variables $S_{t - i} (υ)$, i = 1, …, p. Another approach, FAR(p, m), is based on multivariate time series and FPCA: a vector autoregressive (VAR) model is fitted to the FPC scores. To implement the VAR model, an automatic procedure is suggested for selecting the optimal lag order (p) and dimension (m) by minimizing the functional final prediction error (fFPE) given in Equation (15).

fFPE (p, m) = \left[\frac{n + p \times m}{n - p \times m}\right] trace ({\hat{Σ}}_{N}) + \sum_{j = m + 1}^{\infty} {\hat{δ}}_{j},

(15)

where the first part is the product of a penalty term and the trace of the estimated covariance matrix of the residuals from a VAR model fitted to the FPC scores, and n in the penalty term is the sample size. The second term is the sum of the eigenvalues associated with the discarded principal components. The values of p and m that minimize the fFPE are the optimal values used in FAR(p, m). The predictive performance of FAR(p, m) is also increased by incorporating τ scalar covariates, which gives the model FARX(p, m, τ) in Equation (16):

S_{t} (υ) = λ (υ) + \sum_{i = 1}^{p} ϕ_{i} S_{t - i} (υ) + \sum_{k = 1}^{τ} ζ_{k} Z_{k} + N_{t} (υ),

(16)

where $ζ_{k}$ are the coefficients of the scalar covariates $Z_{k}$. Hence, the optimal orders p and m for Equation (16) are obtained by minimizing the fFPE, adjusted for the degrees of freedom of the scalar covariates, as

fFPE (p, m) = \left[\frac{n + p \times m + τ}{n - p \times m - τ}\right] trace ({\hat{Σ}}_{N}) + \sum_{j = m + 1}^{\infty} {\hat{δ}}_{j} .

(17)

For detailed discussions about Equations (13)–(17) and their applications to real data, one can consult [18,52,58] and the references therein.
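The order selection based on Equation (15) can be sketched as a grid search over (p, m). The sketch below assumes the FPC scores and estimated eigenvalues are already available, fits each VAR(p) by ordinary least squares without an intercept (the scores are centered), and replaces the infinite eigenvalue sum by the sum over the eigenvalues actually estimated. The function names, the least-squares fit, and the residual covariance divisor are illustrative assumptions.

```python
import numpy as np

def var_residual_cov(scores, p):
    """Fit a VAR(p) to an (n, m) score matrix by least squares and
    return the residual covariance matrix (divisor: number of fitted rows)."""
    n = scores.shape[0]
    Y = scores[p:]
    # Column blocks are lag 1, lag 2, ..., lag p of the score vector
    X = np.hstack([scores[p - i:n - i] for i in range(1, p + 1)])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ coef
    return resid.T @ resid / Y.shape[0]

def ffpe(scores, eigvals, p, m):
    """Equation (15): penalty * trace(residual covariance) + truncated eigenvalue mass."""
    n = scores.shape[0]
    sigma = var_residual_cov(scores[:, :m], p)
    penalty = (n + p * m) / (n - p * m)
    return penalty * np.trace(sigma) + eigvals[m:].sum()

def select_order(scores, eigvals, p_max, m_max):
    """Return the pair (p, m) that minimizes fFPE over the search grid."""
    grid = [(p, m) for p in range(1, p_max + 1) for m in range(1, m_max + 1)]
    return min(grid, key=lambda pm: ffpe(scores, eigvals, pm[0], pm[1]))
```

For the FARX(p, m, τ) criterion in Equation (17), the same structure applies with the penalty replaced by (n + p m + τ)/(n - p m - τ) and the scalar covariates added as regressors.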

3.2. Building $FAR X (p, m, \underset{̲}{g}, τ)$ Model

To enhance the forecasting accuracy of the FARX(p, m, τ) model, this work introduces a novel functional model, denoted as $FAR X (p, m, \underset{̲}{g}, τ)$. This model incorporates $\underset{̲}{g} = {[\begin{matrix} g_{1} & g_{2} & \dots & g_{ρ} \end{matrix}]}^{'}$, where each $g_{l}$ represents the optimal dimension of the corresponding functional exogenous variable, extending the standard framework. Additionally, a modified version of fFPE(p, m) is derived, leading to the formulation of $fFPE X (p, m, \underset{̲}{g}, τ)$ for improved model selection.

Suppose a stationary functional response $S_{t} (υ)$ and functional exogenous variables $Y_{1} (υ), Y_{2} (υ), \dots, Y_{ρ} (υ)$ are given and the goal is to derive an empirical forecast ${\hat{S}}_{t + 1} (υ)$. Then Equation (16) can be extended to a $FAR X (p, m, \underset{̲}{g}, τ)$ model given as

S_{t} (υ) = λ (υ) + \sum_{i = 1}^{p} ϕ_{i} S_{t - i} (υ) + \sum_{l = 1}^{ρ} ψ_{l} Y_{l} (υ) + \sum_{k = 1}^{τ} ζ_{k} Z_{k} + N_{t} (υ),

(18)

where $ψ_{l}$ are the functional operators of the ρ functional covariates $Y_{l} (υ)$ of the above model. The remaining notation is the same as in Section 3.1. In this work, we estimate the model in Equation (18) using the FPCA approach. This method ensures both accurate model estimation and computational efficiency, as the first few FPCs capture most of the variation in the endogenous and exogenous functional variables. In this context, the identification step involves selecting the optimal dimension m and lag order p for the endogenous functional variable and a set of optimal dimensions $\underset{̲}{g} = {[\begin{matrix} g_{1} & g_{2} & \dots & g_{ρ} \end{matrix}]}^{'}$ for the ρ exogenous functional variables. To achieve this, the following steps are employed for selecting $p, m, \underset{̲}{g}$ and τ; a flowchart is given in Figure 3.

1. Based on the algorithm in Section 3.3, fix the dimension m of the functional endogenous variable. Using FPCA based on m dimensions, obtain the estimated FPC scores ${\hat{s}}_{j, t} = \int {\hat{S}}_{t} (υ) {\hat{α}}_{j} (υ) d υ$ for each ${\hat{S}}_{t} (υ)$, which gives the m-variate vectors of estimated FPC scores ${\hat{S}}_{t} = {[\begin{matrix} {\hat{s}}_{1, t} & {\hat{s}}_{2, t} & \dots & {\hat{s}}_{m, t} \end{matrix}]}^{'}$ where t = 1, 2, 3, …, k and j = 1, 2, 3, …, m.
2. Fix the dimensions $\underset{̲}{g} = {[\begin{matrix} g_{1} & g_{2} & \dots & g_{ρ} \end{matrix}]}^{'}$ for the ρ functional exogenous variables and obtain the estimated FPC scores for each $Y_{l} (υ)$:

${\hat{y}}_{j_{1}, t} = \int {\hat{Y}}_{1} (υ) {\hat{α}}_{j_{1}} (υ) d υ, j_{1} = 1, 2, 3, \dots, g_{1}$

${\hat{y}}_{j_{2}, t} = \int {\hat{Y}}_{2} (υ) {\hat{α}}_{j_{2}} (υ) d υ, j_{2} = 1, 2, 3, \dots, g_{2}$

and so on,

${\hat{y}}_{j_{ρ}, t} = \int {\hat{Y}}_{ρ} (υ) {\hat{α}}_{j_{ρ}} (υ) d υ, j_{ρ} = 1, 2, 3, \dots, g_{ρ}$

This gives the $g_{1}, g_{2}, \dots, g_{ρ}$-variate vectors of estimated FPC scores, respectively:

${\hat{Y}}_{t_{1}} = {[\begin{matrix} {\hat{y}}_{1, t_{1}} & {\hat{y}}_{2, t_{1}} & \dots & {\hat{y}}_{g_{1}, t_{1}} \end{matrix}]}^{'}$

${\hat{Y}}_{t_{2}} = {[\begin{matrix} {\hat{y}}_{1, t_{2}} & {\hat{y}}_{2, t_{2}} & \dots & {\hat{y}}_{g_{2}, t_{2}} \end{matrix}]}^{'}$

and so on,

${\hat{Y}}_{t_{ρ}} = {[\begin{matrix} {\hat{y}}_{1, t_{ρ}} & {\hat{y}}_{2, t_{ρ}} & \dots & {\hat{y}}_{g_{ρ}, t_{ρ}} \end{matrix}]}^{'}$
3. If the number of scalar exogenous variables τ is sufficiently large, it is better to use their functional form instead of their scalar form. That is, fix the dimension τ using the same algorithm as described in Section 3.3. Using FPCA based on τ dimensions, obtain the estimated FPC scores ${\hat{z}}_{j, t} = \int {\hat{Z}}_{k} (υ) {\hat{α}}_{j} (υ) d υ$ for ${\hat{Z}}_{k} (υ)$, which gives the τ-variate vectors of estimated FPC scores ${\hat{Z}}_{t} = {[\begin{matrix} {\hat{z}}_{1, t} & {\hat{z}}_{2, t} & \dots & {\hat{z}}_{τ, t} \end{matrix}]}^{'}$ where t = 1, 2, 3, …, k and j = 1, 2, 3, …, τ.
4. The first and second derivatives of functional data capture essential dynamic features, such as trends and curvature, making them valuable predictors for improving the accuracy and interpretability of functional response models. Therefore, using the same idea as discussed earlier, one can also use the derivatives of functional variables as predictors.
5. Once the vectors of estimated FPC scores from all functional and scalar exogenous variables are obtained, combine them into a single vector as follows: $C_{t} = {[\begin{matrix} {\hat{Y}}_{t_{1}} & {\hat{Y}}_{t_{2}} & \dots & {\hat{Y}}_{t_{ρ}} & {\hat{Z}}_{t} \end{matrix}]}^{'}$ .
6. Next, fix the lag order p and, using the estimated vector of FPC scores of the endogenous variable ${\hat{S}}_{t} = {[\begin{matrix} {\hat{s}}_{1, t} & {\hat{s}}_{2, t} & \dots & {\hat{s}}_{m, t} \end{matrix}]}^{'}$ obtained in Step (1) and the $C_{t}$ from Step (5), fit an appropriate multivariate model, for example the VAR model with exogenous variables (VARX), given as

$S_{t} = \sum_{i = 1}^{p} Ψ_{i} S_{t - i} + Γ C_{t} + N_{t}$

where Γ is a matrix of coefficients of $C_{t}$ and $N_{t}$ is a white noise process. Then obtain a one-step-ahead forecast ${\hat{S}}_{t + 1}$ as

${\hat{S}}_{t + 1} = {[\begin{matrix} {\hat{s}}_{1, t + 1} & {\hat{s}}_{2, t + 1} & \dots & {\hat{s}}_{m, t + 1} \end{matrix}]}^{'} .$
7. In the last step, ${\hat{S}}_{t + 1}$ is converted back to a functional object using the Karhunen–Loève expansion, and a one-step-ahead forecast in functional form ${\hat{S}}_{t + 1} (υ)$ is obtained as

${\hat{S}}_{t + 1} (υ) = \hat{λ} (υ) + \sum_{j = 1}^{m} {\hat{s}}_{t + 1, j} {\hat{α}}_{j} (υ) .$
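Steps 5 to 7 above can be sketched as follows, assuming the response scores and the combined exogenous vector $C_{t}$ have already been computed (Steps 1 to 5). The VARX model is fitted by ordinary least squares without an intercept, and the exogenous vector for the forecast day is assumed to be available, for example as a lagged or separately modeled profile. The function names `fit_varx`, `forecast_scores`, and `forecast_curve` are hypothetical and are not taken from the article or from any package.

```python
import numpy as np

def fit_varx(S, C, p):
    """Least-squares fit of S[t] = sum_i Psi_i S[t-i] + Gamma C[t] + N[t].

    S: (n, m) response FPC scores; C: (n, q) combined exogenous vectors,
    with row t of C aligned to row t of S.
    Returns the stacked coefficients, shape (p*m + q, m), and the
    residual covariance matrix.
    """
    n = S.shape[0]
    lags = [S[p - i:n - i] for i in range(1, p + 1)]   # lag 1, ..., lag p
    X = np.hstack(lags + [C[p:]])
    Y = S[p:]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ coef
    return coef, resid.T @ resid / Y.shape[0]

def forecast_scores(S, c_next, coef, p):
    """One-step-ahead score forecast from the last p score vectors and
    the exogenous vector for the forecast day (assumed known)."""
    x = np.concatenate([S[-i] for i in range(1, p + 1)] + [c_next])
    return x @ coef

def forecast_curve(mean, alpha, s_next):
    """Step 7: map forecast scores back to a curve on the grid,
    mean + sum_j s_j * alpha_j, with alpha of shape (N, m)."""
    return mean + alpha @ s_next
```

In this layout the first p·m rows of `coef` hold the transposed autoregressive matrices $Ψ_{1}, \dots, Ψ_{p}$ and the remaining rows hold the transpose of Γ.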

3.3. Selection of Optimal Orders $p, m, \underset{̲}{g}, τ$ by $fFPE X (p, m, \underset{̲}{g}, τ)$

The performance of FAR(p, m) clearly depends on the optimal selection of the order parameters p and m, representing the lag order and the number of FPCs, respectively. Past studies suggest several techniques for obtaining these optimal values for FAR models. For example, the value of m was suggested by [70], based on multistage hypothesis testing. Also, an automated technique for obtaining the optimal p and m, based on minimizing the mean squared error (MSE), was proposed by [58]; this is the fFPE(p, m) in Equations (15) and (17), as discussed in Section 3.1. In this section, we present a modified fFPE(p, m), called $fFPE X (p, m, \underset{̲}{g}, τ)$, which is minimized for the selection of the optimal parameter values $p, m, \underset{̲}{g}, τ$, where $\underset{̲}{g} = {[\begin{matrix} g_{1} & g_{2} & \dots & g_{ρ} \end{matrix}]}^{'}$. We begin by analyzing the MSE. Since the eigenfunctions $α_{j}$ are orthogonal and the FPC scores $s_{j, t}$ are uncorrelated, we can partition the MSE as follows:

\begin{aligned} E \{{∥S_{t + 1} (υ) - {\hat{S}}_{t + 1} (υ)∥}^{2}\} &= E \{{∥\sum_{j = 1}^{\infty} s_{t + 1, j} α_{j} - \sum_{j = 1}^{m} {\hat{s}}_{t + 1, j} α_{j}∥}^{2}\} \\ &= E \{{∥S_{t + 1} - {\hat{S}}_{t + 1}∥}^{2}\} + \sum_{j = m + 1}^{\infty} δ_{j} + \sum_{l = 1}^{ρ} \sum_{r = g_{l} + 1}^{\infty} Z_{l, r} . \end{aligned}

(19)

Here, the usual

L^{2}

Euclidean norm of vectors is represented by

{∥ \cdot ∥}^{2}

. In the above equation, the first term

E \{{∥S_{t + 1} - {\hat{S}}_{t + 1}∥}^{2}\}

represents the finite-dimensional MSE for the approximation based on the truncated score vectors. Essentially, it quantifies the error within the selected subspace of dimension m. This finite-dimensional space contains the most significant components (typically derived from the eigenfunctions of the functional data), allowing us to approximate

S_{t + 1}

using a lower-dimensional projection. The second term

\sum_{j = m + 1}^{\infty} δ_{j}

captures the information loss due to truncation at dimension m. Here,

δ_{j}

represents the eigenvalues associated with the components beyond m. By truncating at m, we exclude components with eigenvalues

δ_{m + 1}, δ_{m + 2} \dots

which means losing some functional information, as these components are no longer part of the approximation. The third term

\sum_{l = 1}^{ρ} \sum_{r = g_{l} + 1}^{\infty} Z_{l, r}

captures the information loss due to truncation at dimension

\underset{̲}{g} = {[\begin{matrix} g_{1} & g_{2} & \dots & g_{ρ} \end{matrix}]}^{'}

based on ρ number of functional covariates. Here l is the index of the functional covariate ranging from 1 to ρ, while r indexes the higher-order terms beyond the optimal dimension

g_{l}

.

Z_{l, r}

captures the information loss due to truncating the

l^{t h}

functional covariate at its

g_{l}^{t h}

dimension and corresponds to the eigenvalue of the

r^{t h}

principal component of that covariate, quantifying the variance it explains. Summing over all components beyond

g_{l}

accounts for the total information lost due to dimension reduction, analogous to

\sum_{j = m + 1}^{\infty} δ_{j}

for the response. Hence, the third term represents the cumulative contribution of the truncated higher-order components of the functional covariates beyond the selected optimal dimensions

\underset{̲}{g}

. Assuming stationarity, the process

S_{t}

is a

(m + K + τ)

-variate

VARX (p)

, where

K = \sum_{l = 1}^{ρ} g_{l}

, and has the following form:

S_{t + 1} = Ψ_{1} S_{t} + Ψ_{2} S_{t - 1} + \dots + Ψ_{p} S_{t - p + 1} + Γ C_{t} + N_{t + 1},

where Γ is a matrix of coefficients of

C_{t} = {[\begin{matrix} {\hat{Y}}_{t_{1}} & {\hat{Y}}_{t_{2}} & \dots & {\hat{Y}}_{t_{ρ}} & {\hat{Z}}_{t} \end{matrix}]}^{'}

and

N_{t}

is a white noise process such that

\sqrt{n} (\hat{W} - W) \overset{D}{\to} N_{(p \times m^{2}) + K + τ} (0, Σ_{N} \otimes Δ_{(p \times m + K + τ)}^{- 1}),

(20)

where

\hat{W} = vec ({[{\hat{Ψ}}_{1}, \dots, {\hat{Ψ}}_{p}]}^{'}, {[{\hat{Γ}}_{1}, \dots, {\hat{Γ}}_{K + τ}]}^{'})

is the least squares estimator of

W = vec ({[Ψ_{1}, \dots, Ψ_{p}]}^{'}, {[Γ_{1}, \dots, Γ_{K + τ}]}^{'})

,

\sqrt{n} (\hat{W} - W)

is the difference between the estimator

\hat{W}

and the true parameter vector

W

scaled by

\sqrt{n}

and

Δ_{(p \times m + K + τ)} = Var [vec (S_{1}, \dots, S_{p})]

, and

Σ_{N} = E [N_{1} N_{1}^{'}]

. The asymptotic distribution suggests that as

n \to \infty

, this scaled difference converges to a normal distribution with zero mean and a covariance structure determined by the Kronecker product

(\otimes)

of

Σ_{N}

and

Δ_{(p \times m + K + τ)}^{- 1}

. Suppose that the

\hat{W}

are estimated from an independent training sample

(R_{1}, \dots, R_{t}) \overset{D}{=} (S_{1}, \dots, S_{t})

. It follows then that

\begin{matrix} E \{{∥S_{t + 1} - {\hat{S}}_{t + 1}∥}^{2}\} = & E \{{∥S_{t + 1} - ({\hat{Ψ}}_{1} S_{t} + {\hat{Ψ}}_{2} S_{t - 1} + \dots + {\hat{Ψ}}_{p} S_{t - p + 1} + \hat{Γ} C_{t})∥}^{2}\} \end{matrix}

\begin{matrix} E \{{∥S_{t + 1} - {\hat{S}}_{t + 1}∥}^{2}\} \\ = E \{{∥N_{t + 1}∥}^{2}\} + E \{∥ (Ψ_{1} - {\hat{Ψ}}_{1}) S_{t} + \dots + (Ψ_{p} - {\hat{Ψ}}_{p}) S_{t - p + 1} + \sum_{t = 1}^{K + τ} (Γ_{t} - \hat{Γ_{t}}) C_{t} ∥^{2}\} \\ = trace \{Σ_{N}\} + E \{{∥[I_{(p \times m + K + τ)} \otimes (S_{t}^{'}, \dots, S_{t - p + 1}^{'}, C_{1}^{'}, \dots, C_{K + τ}^{'})] (W - \hat{W})∥}^{2}\} \end{matrix}

(21)

The first term

trace \{Σ_{N}\}

is the intrinsic noise variance from the white noise process

N_{t + 1}

and the second term arises from the estimation error, which is expressed compactly using the Kronecker product (⊗) and vector notation. The vector

(S_{t}^{'}, \dots, S_{t - p + 1}^{'}, C_{1}^{'}, \dots, C_{K + τ}^{'})

collects the past values in the autoregressive component together with the reduced-dimension scores of the functional covariates and the scalar exogenous variables. The independence of

\hat{W}

and

(S_{1}, \dots, S_{t}, C_{t})

yields that

\begin{matrix} E \{{∥[I_{(p \times m + K + τ)} \otimes (S_{t}^{'}, \dots, S_{t - p + 1}^{'}, C_{1}^{'}, \dots, C_{K + τ}^{'})] (W - \hat{W})∥}^{2}\} \\ = E [trace \{{(W - \hat{W})}^{'} [I_{(p \times m + K + τ)} \otimes Δ_{(p \times m + K + τ)}] (W - \hat{W})\}] \\ = trace \{[I_{(p \times m + K + τ)} \otimes Δ_{(p \times m + K + τ)}] E [(W - \hat{W}) {(W - \hat{W})}^{'}]\} \end{matrix}

I_{(p \times m + K + τ)} \otimes Δ_{(p \times m + K + τ)}

represents the overall variance–covariance structure for the autoregressive, functional exogenous, and scalar terms. By using Equation (20), the last term can be approximated as

\frac{1}{n} (trace [Σ_{N} \otimes I_{(p \times m + K + τ)}] + o (1)) \sim \frac{(p \times m + K + τ)}{n} trace \{Σ_{N}\} .

Using the above results, Equation (21) can be written as

E \{{∥S_{t + 1} - {\hat{S}}_{t + 1}∥}^{2}\} = trace \{Σ_{N}\} + \frac{(p \times m + K + τ)}{n} trace \{Σ_{N}\} .

Replacing

trace \{Σ_{N}\}

by

\frac{n}{n - (p \times m + K + τ)} trace \{{\hat{Σ}}_{N}\}

, we obtain

\begin{matrix} E \{{∥S_{t + 1}^{⋆} - {\hat{S^{⋆}}}_{t + 1}∥}^{2}\} \\ = \frac{n}{n - (p \times m + K + τ)} trace \{{\hat{Σ}}_{N}\} + \frac{(p \times m + K + τ)}{n} \times \frac{n}{n - (p \times m + K + τ)} trace \{{\hat{Σ}}_{N}\} \end{matrix}

= [\frac{n}{n - (p \times m + K + τ)} + \frac{(p \times m + K + τ)}{n} \times \frac{n}{n - (p \times m + K + τ)}] trace \{{\hat{Σ}}_{N}\}

= [\frac{n + (p \times m + K + τ)}{n - (p \times m + K + τ)}] trace \{{\hat{Σ}}_{N}\} .

Using the above results, Equation (19) can be written as

E {∥S_{t + 1} (υ) - {\hat{S}}_{t + 1} (υ)∥}^{2} \approx [\frac{n + (p \times m + K + τ)}{n - (p \times m + K + τ)}] trace \{{\hat{Σ}}_{N}\} + \sum_{j = m + 1}^{\infty} δ_{j} + \sum_{l = 1}^{ρ} \sum_{r = g_{l} + 1}^{\infty} Z_{l, r}

In practice, the population-level quantities in the second and third terms are not observable and are therefore replaced by their corresponding empirical estimates. Thus, the orders

p, m, \underset{̲}{g} = {[\begin{matrix} g_{1} & g_{2} & \dots & g_{ρ} \end{matrix}]}^{'}

and τ are simultaneously selected by minimizing the functional final prediction error criterion generalized for exogenous covariates as

f F P E X (p, m, \underset{̲}{g}, τ) = [\frac{n + (p \times m + K + τ)}{n - (p \times m + K + τ)}] trace \{{\hat{Σ}}_{N}\} + \sum_{j = m + 1}^{\infty} {\hat{δ}}_{j} + \sum_{l = 1}^{ρ} \sum_{r = g_{l} + 1}^{\infty} {\hat{Z}}_{l, r}

(22)

The implementation of the

f F P E X (p, m, \underset{̲}{g}, τ)

criterion renders the proposed forecasting framework fully automated, eliminating reliance on arbitrarily specified tuning parameters. Notably, the optimal dimensions are now sample-adaptive and depend explicitly on the observed sample size n. The empirical results presented in Section 4 confirm the strong practical efficacy of this approach.
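As an illustration of how the criterion in Equation (22) can drive order selection, the following Python sketch evaluates it for candidate orders and keeps the minimizer. It is not the authors' R implementation. The function names `ffpex` and `select_orders`, the callable `fit_trace` (assumed to return the trace of the estimated noise covariance for a fitted candidate), and the toy eigenvalues are all hypothetical. Because only finitely many empirical eigenvalues are available, the infinite sums are replaced by sums over the supplied lists.

```python
def ffpex(n, p, m, g, tau, trace_sigma, resp_eigs, cov_eigs):
    """Evaluate the criterion of Equation (22) for one candidate order.

    resp_eigs: empirical eigenvalues of the response, in decreasing order.
    cov_eigs:  one list of empirical eigenvalues per functional covariate.
    g:         tuple of truncation dimensions, one per functional covariate.
    """
    K = sum(g)
    q = p * m + K + tau                      # number of regressors per equation
    if n <= q:
        raise ValueError("need n > p*m + K + tau")
    penalty = (n + q) / (n - q)              # finite-sample inflation factor
    resp_loss = sum(resp_eigs[m:])           # variance dropped from the response
    cov_loss = sum(sum(e[gl:]) for e, gl in zip(cov_eigs, g))
    return penalty * trace_sigma + resp_loss + cov_loss


def select_orders(n, candidates, fit_trace, resp_eigs, cov_eigs):
    """Return the candidate (p, m, g, tau) with the smallest criterion value."""
    def score(c):
        p, m, g, tau = c
        # fit_trace is a hypothetical hook that fits the model and returns
        # trace(Sigma_hat_N) for this candidate.
        return ffpex(n, p, m, g, tau, fit_trace(p, m, g, tau),
                     resp_eigs, cov_eigs)
    return min(candidates, key=score)


# Toy usage with made-up eigenvalues and a constant noise trace.
resp = [3.0, 2.0, 1.0, 0.5]
covs = [[4.0, 1.0, 0.25]]
cands = [(1, 1, (1,), 0), (1, 2, (1,), 0)]
best = select_orders(100, cands, lambda p, m, g, tau: 2.0, resp, covs)
```

In this toy setting the second candidate wins because the variance it recovers from the response outweighs the slightly larger penalty factor.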

3.4. Competitive Models

The traditional AutoRegressive Integrated Moving Average (ARIMA) time series model, the neural network autoregressive (NNAR) model, and the two functional autoregressive models FAR(p, m) and FARX(p, m, τ) in Equations (14) and (16), as described in Section 3.1, are used as benchmarks against which the forecasting accuracy of the proposed functional model is compared.

3.4.1. Autoregressive Integrated Moving Average Model

ARIMA is one of the most commonly applied time series models for forecasting a univariate time series. ARIMA is the extended form of the simple ARMA model, which is a tool that extrapolates the signal into the future to generate forecasts by separating signal from noise. It comprises three parametric parts known as the order of AR, MA, and the number of differences needed for a time series to be stationary; i.e., p, q, and d, respectively. Mathematically, it can be written as

{(1 - B)}^{d} S_{t, k} = α + \sum_{h = 1}^{p} β_{h} {(1 - B)}^{d} S_{t - h, k} + N_{t} + \sum_{h^{'} = 1}^{q} ϕ_{h^{'}} N_{t - h^{'}}

(23)

where

S_{t, k}

is the stochastic part obtained from partitioning the

S_{t, k}

, d is the order of differencing required to achieve stationarity, B is the backward shift operator,

N_{t}

is the white noise term, α is an intercept term, and

β_{h} (h = 1, \dots, p)

and

ϕ_{h^{'}} (h^{'} = 1, \dots, q)

are the parameters of the AR and MA parts, respectively, which are estimated using the maximum likelihood estimation method.
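To make the roles of d, p, and q in Equation (23) concrete, the sketch below applies the differencing operator and computes a one-step-ahead value from given coefficients, with the future noise term set to its mean of zero. The helper names `difference` and `arma_one_step` and the numbers in the usage lines are illustrative assumptions. In the study itself the parameters are estimated by maximum likelihood, which this sketch does not attempt.

```python
def difference(series, d):
    """Apply the operator (1 - B)^d to a sequence of numbers."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


def arma_one_step(w, resid, alpha, beta, phi):
    """One-step-ahead value of the differenced series w under Equation (23).

    w:     differenced observations, most recent last
    resid: past noise estimates, most recent last
    beta:  AR coefficients beta_1..beta_p
    phi:   MA coefficients phi_1..phi_q
    """
    ar = sum(b * w[-h] for h, b in enumerate(beta, start=1))
    ma = sum(c * resid[-h] for h, c in enumerate(phi, start=1))
    return alpha + ar + ma


# Toy usage: second differences of a quadratic sequence are constant.
second_diff = difference([1, 4, 9, 16], 2)
forecast = arma_one_step([1.0, 2.0], [0.5], 0.1, [0.5, 0.2], [0.4])
```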

3.4.2. Neural Network Autoregressive Model

Artificial neural networks (ANNs) allow complex nonlinear relationships between the response variable and its predictors. The strength of the NNAR model comes from the parallel processing of data, which eliminates the need for classical assumptions. As a result, the network model may be easily chosen based on the characteristics of the data. ANN includes three types of layers, namely input, output, and one or more hidden layer(s) with an activation function that determines the relationship (represented by a sigmoid) between the input (

S_{t - 1, k}, S_{t - 2, k}, \dots, S_{t - d, k}

) and output (

S_{t, k}

) of a node and network. The mathematical form of the NNAR model is

S_{t, k} = ω_{0} + \sum_{a = 1}^{q} ω_{a} f (ω_{0 a} + \sum_{z = 1}^{d} ω_{z a} S_{t - z, k}) + N_{t}

(24)

where

ω_{0}

and

ω_{0 a}

are the biases on the nodes,

ω_{a}

and

ω_{z a}

(

a = 1, 2, \dots, q

,

z = 1, 2, \dots, d

) are the connection weights between the layers of the model,

f (\cdot)

is the activation function of the hidden layer, d is the number of input nodes, q is the number of hidden nodes, and

N_{t} \overset{iid}{\sim} N (0, σ^{2})

. Furthermore, the sigmoid is based on the logistic function given as

f (x) = {(1 + e^{- x})}^{- 1}

.
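The forward pass implied by Equation (24) can be sketched in a few lines of Python. The function and argument names below are hypothetical, the weights in the usage line are arbitrary, and the noise term is omitted because only the conditional mean is computed. Training the weights is outside the scope of this sketch.

```python
import math


def logistic(x):
    """Logistic activation f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def nnar_predict(lags, w0, w_hidden, bias_hidden, w_in):
    """Conditional mean of Equation (24) with a single hidden layer.

    lags:        [S_{t-1}, ..., S_{t-d}]
    w0:          output bias (omega_0)
    w_hidden:    hidden-to-output weights omega_a, length q
    bias_hidden: hidden biases omega_{0a}, length q
    w_in:        w_in[a][z] holds the input-to-hidden weight omega_{za}
    """
    total = w0
    for a in range(len(w_hidden)):
        z = bias_hidden[a] + sum(w * s for w, s in zip(w_in[a], lags))
        total += w_hidden[a] * logistic(z)
    return total


# Toy usage: with zero input weights every hidden node outputs f(0) = 0.5.
pred = nnar_predict([20.0, 21.0], 1.0, [2.0, 4.0], [0.0, 0.0],
                    [[0.0, 0.0], [0.0, 0.0]])
```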

4. Application to Real Data

4.1. Study Area

For the empirical investigation of the proposed FAR model, an hourly temperature time series (in °C) for the site Karachi, Pakistan (Latitude: 24.8607° N, Longitude: 67.0011° E), is considered (Figure 4). The data used in this study span the period from 1 January 2019 to 31 December 2024 and were collected from the NASA POWER data access portal (https://power.larc.nasa.gov/data-access-viewer/ (accessed on 14 January 2025)).

In addition, exogenous variables such as wet-bulb temperature (°C), wind speed at 10 m (m/s), and surface pressure (kPa) are also collected for the same location. The time series data for each variable consist of 52,632 hourly observations for 2192 days. The descriptive summaries for these variables are given in Table 2.

Figure 5 depicts the nature of variations for the response and the three exogenous variables. The time series plot of the temperature data over six years clearly distinguishes the training set (43,848 observations) used for modeling and estimation from the testing set (8784 observations), which corresponds to the out-of-sample forecasts. Since 2024 is a leap year, a total of 366 day-ahead forecasts are generated using the expanding window modeling technique. The hourly temperature data also exhibit seasonal variation. Figure 6a illustrates the variation across different seasons in 2024, showing that winter and autumn experience larger fluctuations in temperature than the other seasons.
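The expanding window scheme mentioned above can be sketched as a generator of index ranges: each day-ahead forecast uses all hours observed so far, and the window grows by one day after every forecast. The function name is hypothetical, and the assumption that the models are re-estimated at every step is ours rather than something stated in the text.

```python
def expanding_window_splits(n_train_days, n_test_days, hours_per_day=24):
    """Yield (train, test) slices over an hourly series, one pair per day."""
    for i in range(n_test_days):
        train_end = (n_train_days + i) * hours_per_day
        # Train on everything up to train_end, forecast the next 24 hours.
        yield slice(0, train_end), slice(train_end, train_end + hours_per_day)


# Usage with the sample sizes reported above (43,848 and 8784 hours).
splits = list(expanding_window_splits(43848 // 24, 8784 // 24))
```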

4.2. Data-Driven Modeling Procedure

Figure 7 clearly illustrates the step-by-step data-driven modeling procedure that is used to implement the proposed and competitive models, beginning with data preprocessing.

Extreme values in the time series of the response, as well as in the explanatory variables, pose challenges for estimation and forecasting. Identifying and adjusting these values improves prediction accuracy. This study applies the Shifting Filter on time series (SFT), an extension of [71], which operates on a rolling window instead of the entire series. Each of the original time series is divided into

P = (N / r)

segments, where N is the total number of observations and r is the window width. Values deviating beyond 2.58 (based on a 99% confidence interval from a normal distribution) times the sample standard deviation σ from the mean

\hat{λ}

are flagged as extreme [72]. The process iterates across all segments for all series as follows:

S_{t}^{°} = ⋃_{i = 1}^{P} \{S_{t} : | S_{t} - \hat{λ} | \geq 2.58 \cdot S D_{S}\},

where

S_{t}

is the temperature (°C) at time t,

\hat{λ}

is the sample mean, and

S D_{S}

is the standard deviation. The replacement rule can be mathematically defined as

{\tilde{S}}_{t} = \{\begin{matrix} S_{t}, & if | S_{t} - \hat{λ} | < 2.58 \cdot S D_{S} \\ \tilde{λ}, & otherwise \end{matrix}

Extreme values are replaced with the median (

\tilde{λ}

) of the respective window for improved forecasting stability.
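A minimal Python sketch of the filtering rule described above is given below. It assumes non-overlapping segments of width r, uses the sample mean, sample standard deviation, and median of each segment, and keeps a short trailing segment as it is. These implementation details, along with the function name `shifting_filter`, are assumptions made for illustration and are not taken from the authors' code.

```python
import statistics


def shifting_filter(series, r, k=2.58):
    """Replace values beyond k standard deviations of their window mean
    with the window median, processing the series window by window."""
    cleaned = []
    for start in range(0, len(series), r):
        window = series[start:start + r]
        if len(window) < 2:
            # Too short for a standard deviation; keep unchanged.
            cleaned.extend(window)
            continue
        mean = statistics.fmean(window)
        sd = statistics.stdev(window)
        med = statistics.median(window)
        cleaned.extend(x if abs(x - mean) < k * sd else med for x in window)
    return cleaned


# Toy usage: one spike in an otherwise flat window is pulled to the median.
data = [10.0] * 19 + [100.0]
filtered = shifting_filter(data, r=20)
```

The window width r matters here: in very short windows a single outlier inflates the standard deviation so much that it can no longer exceed the 2.58 threshold.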

Once

{\tilde{S}}_{t}

are obtained for each time series, we proceed to the data-driven framework indicated in Figure 7, which includes the application of the traditional time series model ARIMA, the machine learning model NNAR, and three FAR models. Consider

S_{t, k}

to be a filtered temperature time series where

t \in N

and

k = 1, 2, 3, \dots, 24

. Due to significant seasonal variations (

U_{t, k}

), long-term trends (

T_{t, k}

), and year effects (

Y_{t, k}

), as represented in Figure 6b, we partitioned

S_{t, k}

into deterministic (

D_{t, k}

) and stochastic (

S_{t, k}

) parts, such that

S_{t, k} = D_{t, k} + S_{t, k},

where the first term

D_{t, k} = U_{t, k} + T_{t, k} + Y_{t, k}

is estimated via a generalized additive model (GAM) and the second term

S_{t, k} = S_{t, k} - {\hat{D}}_{t, k}

is modeled by ARIMA. Finally, the results are combined (

{\hat{D}}_{t, k} + {\hat{S}}_{t, k}

) for the day-ahead final forecast; that is,

{\hat{S}}_{t + 1, k} = {\hat{D}}_{t + 1, k} + {\hat{S}}_{t + 1, k}

. Additionally, models like NNAR and FAR can capture non-linearity directly without decomposing

S_{t, k}

, and they perform even better in handling complex patterns.

Order Identification and Estimation

For different variables used in the study, order identification is crucial for ensuring accurate time series modeling and forecasting. The optimal orders required for the implementation of the models in Figure 7 are estimated and given in Table 3.

The optimal orders for all models are selected based on the model selection criteria: for ARIMA, using AIC/BIC; for neural networks via cross-validation; and for FAR(p, m), FARX(p, m, τ) and FAR

X (p, m, \underset{̲}{g}, τ)

, the respective fFPE criteria in Equations (15), (17) and (22) are minimized to determine the functional orders that best capture the dependency structure in the FTS, enhancing predictive performance. Thus, based on Table 3, the estimated time series models are ARIMA(3, 0, 2), NNAR(32, 16), and the three FAR models FAR(6, 14), FARX(6, 14, 3), and FAR

X (2, 15, {[\begin{matrix} 15 & 15 & 9 \end{matrix}]}^{'}, 6)

, respectively.

To evaluate the residual behavior of the selected models, the autocorrelation function (ACF) and partial ACF (PACF) plots are depicted in Figure 8. Overall, the final estimated residuals appear to be largely whitened, indicating that the fitted models can be regarded as adequate. Since the fitted models exhibit satisfactory diagnostic behavior, we now proceed to the out-of-sample forecasting analysis.

4.3. Out-of-Sample Forecasting

Out-of-sample forecasting evaluates a model’s predictive performance on unobserved data beyond the estimation period. This approach helps assess the generalization and robustness of the model in real-world applications. To validate the accuracy of the forecasting models, different performance measures are used. In particular, the forecasting accuracy of a model is determined by the mean absolute error (MAE), root mean squared error (RMSE), mean absolute percentage error (MAPE) and mean absolute scaled error (MASE) for

T

number of out-of-sample forecasts used for accuracy evaluation, which are given as

MAE = \frac{1}{T} \sum_{t = 1}^{T} |S_{t, k} - {\hat{S}}_{t, k}|

RMSE = \sqrt{\frac{1}{T} \sum_{t = 1}^{T} {(S_{t, k} - {\hat{S}}_{t, k})}^{2}}

MAPE = \frac{1}{T} \sum_{t = 1}^{T} |\frac{S_{t, k} - {\hat{S}}_{t, k}}{S_{t, k}}| \times 100

MASE = \frac{\sum_{t = 1}^{T} | S_{t, k} - {\hat{S}}_{t, k} |}{\frac{T}{T - 1} \sum_{t = 2}^{T} | S_{t, k} - S_{t - 1, k} |}
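The four accuracy measures above translate directly into code. The sketch below follows the formulas as written, including the in-sample naive scaling in the MASE denominator. The function names and the toy numbers are illustrative only. Note that MAPE is undefined when an observed value equals zero.

```python
import math


def mae(actual, pred):
    return sum(abs(a - p) for a, p in zip(actual, pred)) / len(actual)


def rmse(actual, pred):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual))


def mape(actual, pred):
    # Undefined if any actual value is zero.
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, pred)) / len(actual)


def mase(actual, pred):
    T = len(actual)
    num = sum(abs(a - p) for a, p in zip(actual, pred))
    # Absolute one-step changes of the observed series (naive forecast errors).
    naive = sum(abs(actual[t] - actual[t - 1]) for t in range(1, T))
    return num / (T / (T - 1) * naive)


# Toy usage with made-up temperatures.
obs = [10.0, 12.0, 11.0, 13.0]
fc = [11.0, 12.0, 10.0, 15.0]
scores = (mae(obs, fc), rmse(obs, fc), mape(obs, fc), mase(obs, fc))
```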

Using the proposed and competitor models, day-ahead out-of-sample forecasts are obtained for the complete leap year 2024. The univariate hourly temperature time series is used to fit ARIMA(3, 0, 2) and NNAR(32, 16), while the daily temperature curves are used to estimate the three FAR models.

The results for various models are summarized in Table 4 by means of RMSE, MAE, MAPE and MASE. From the results we can see that the NNAR outperforms the classical ARIMA by producing relatively small RMSE (0.9720) and MAPE (2.7928). While the FAR and FARX models provide considerable improvement in forecasting in comparison with ARIMA and NNAR, the proposed FAR

X (2, 15, {[\begin{matrix} 15 & 15 & 9 \end{matrix}]}^{'}, 6)

ensures superior performance within the class of functional models. It achieves the lowest RMSE (0.2942) and MAE (0.2004), indicating improved forecasting accuracy. Additionally, the significant reduction in MAPE (0.8213) highlights lower prediction errors, while the MASE (0.3188) is also the lowest among all models, showcasing its robustness. These findings confirm that FAR

X (2, 15, {[\begin{matrix} 15 & 15 & 9 \end{matrix}]}^{'}, 6)

provides more accurate day-ahead temperature forecasts while minimizing overall errors, which is a major advancement in climate data analysis.

To assess the significance of the differences between the accuracy metrics given in Table 4, we applied the Diebold and Mariano (DM) [73] test, which is a standard statistical method for comparing forecast accuracy across models. It evaluates whether differences in accuracy between pairs of forecasts are statistically significant, with the null hypothesis stating no difference in accuracy. Table 5 reports the p-values of the DM test, which are used to test the null hypothesis that the forecasting accuracies of the model in the row and the model in the column are the same against the alternative hypothesis that the forecasting accuracy of the model in the column is greater than that of the model in the row.
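For intuition, a simplified one-sided DM comparison can be sketched as follows. The sketch assumes squared-error loss, a one-step horizon with no autocorrelation correction of the variance, and a standard normal reference distribution. The text does not state which loss or variance estimator was used, so these choices and the function name `dm_test` are assumptions.

```python
import math


def dm_test(err_row, err_col):
    """Simplified Diebold-Mariano test on two forecast-error series.

    H0: equal accuracy; H1: the 'column' model is more accurate.
    Returns (statistic, one-sided p-value).
    """
    # Loss differential under squared-error loss (assumed here).
    d = [a * a - b * b for a, b in zip(err_row, err_col)]
    T = len(d)
    dbar = sum(d) / T
    var = sum((x - dbar) ** 2 for x in d) / (T - 1)
    stat = dbar / math.sqrt(var / T)
    # Upper-tail probability of a standard normal: 1 - Phi(stat).
    p_value = 0.5 * math.erfc(stat / math.sqrt(2.0))
    return stat, p_value


# Toy usage with made-up forecast errors.
row_errors = [2.0, -2.0, 2.5, -1.5, 2.0, -2.2]
col_errors = [0.5, -0.4, 0.6, -0.5, 0.3, -0.4]
stat, p_value = dm_test(row_errors, col_errors)
```

A small p-value in this setup favors the column model, matching the direction of the alternative hypothesis used for Table 5.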

Figure 8. ACF and PACF plots for all models under study, (a) ACF for ARIMA; (b) PACF for ARIMA; (c) ACF for NNAR; (d) PACF for NNAR; (e) ACF for FAR(p, m); (f) PACF for FAR(p, m); (g) ACF for FARX(p, m, τ); (h) PACF for FARX(p, m, τ); (i) ACF for FAR

X (p, m, \underset{̲}{g}, τ)

; (j) PACF for FAR

X (p, m, \underset{̲}{g}, τ)

.

Table 5 indicates that the functional models produce significantly more accurate forecasts than ARIMA and NNAR. Among the functional models, the proposed FAR

X (p, m, \underset{̲}{g}, τ)

outperforms the other functional models, and the improvement is statistically significant.

To evaluate the performance of our proposed functional model further, we compute the same accuracy measures for different seasons of the year as well as for each month, and the results are listed in Table 6 and Table 7, respectively. The study site goes through four different seasons, and the daily temperature curves differ considerably between them, resulting in different forecasting errors. For example, Table 6, which lists the one-day-ahead out-of-sample forecasting errors for different seasons, shows that the errors vary across seasons. More precisely, looking at the MAE values, one can see that in summer, the value is 0.1550 for the FAR

X (2, 15, {[\begin{matrix} 15 & 15 & 9 \end{matrix}]}^{'}, 6)

model, which is the lowest value as compared to other seasons and compared to the other models. Although the forecasting accuracies of all the models are high, the three FAR approaches have relatively better performance, especially the proposed FAR

X (2, 15, {[\begin{matrix} 15 & 15 & 9 \end{matrix}]}^{'}, 6)

model. The error metrics of all models are relatively small in autumn but large in winter. For instance, FAR(6, 14) and FARX(6, 14, 3) produce MAPE values of 1.2031 and 1.1875 in autumn, but 3.4099 and 3.4241 in winter, respectively. On the other hand, the benchmark models ARIMA(3, 0, 2) and NNAR(32, 16) produce fairly large MAPE values. The results indicate that the proposed model performs better in summer and autumn compared to winter and spring. This can be attributed to the relatively smoother and more stable temperature patterns during these seasons, whereas winter and spring exhibit rapid fluctuations due to complex atmospheric interactions.

Table 7 lists the monthly forecasting accuracy for the models used in this study. The results suggest that the forecasting errors are lower in June, September and November compared to the other months of the year. The lowest errors are observed in the month of June, whereas the higher errors correspond to the month of March. Comparing the forecasting accuracies of different models, it is evident that the functional models perform relatively better than ARIMA and NNAR. Among the functional models, the FAR

X (2, 15, {[\begin{matrix} 15 & 15 & 9 \end{matrix}]}^{'}, 6)

gives better results than FAR(6, 14) and FARX(6, 14, 3). The lowest MAPE produced by our proposed functional model is 0.4081 for June, whereas the value of 1.4048, produced for February, is the highest MAPE value.

Finally, Table 8 provides a comparison of the models for each hour of the day. The results are based on the one-day-ahead out-of-sample forecasts for a whole year. That is, the accuracy measures are obtained for the first hour based on all first hours for a complete cycle in 2024 (366 days), and so forth. This table indicates that daily temperature curves are more stable in the second-to-sixth and eighteenth-to-twenty-first hours of the day than they are in the first, seventh, thirteenth, and the twenty-fourth hour. The forecasting errors for the thirteenth and fourteenth hours are comparatively lower than in the first and last hour of the day. Additionally, higher prediction errors around dawn are observed, likely caused by abrupt temperature transitions and lower signal-to-noise ratios at these times, which increase the difficulty of accurate short-term forecasting. The results suggest that our proposed functional model FAR

X (2, 15, {[\begin{matrix} 15 & 15 & 9 \end{matrix}]}^{'}, 6)

is more efficient at capturing the non-linear trends and performs significantly better than the traditional ARIMA and NNAR models as well as the other functional models. Hence, the inclusion of several exogenous variables enhances forecasting capability beyond that of FAR(6, 14) and FARX(6, 14, 3).

To summarize, Tables 4–8 show that the proposed approach is efficient, producing significantly lower error metrics than its competitors, which indicates better predictive power for modeling and forecasting the daily temperature curves and hence demonstrates the usefulness of our estimation framework. Moreover, the forecasting results show that the functional forecasting approach is superior to the classical ARIMA and NNAR methods.

4.4. Computational Complexity

Finally, we compare the average computational time in seconds for the five models, as given in Table 9. For the functional models, the authors wrote their own code in R, a statistical computing language, and all computations were performed on an Intel(R) Core(TM) i7-4770 CPU running at 3.40 GHz. For the deterministic part, the library gam is used, while for ARIMA and NNAR, the library forecast is used. For the functional models, the libraries fda and vars are used. The documentation provided with these packages gives in-depth details on the particular algorithms used in the estimation process.

Table 9 presents the average computation time in seconds for one-day-ahead temperature forecasting across various models. ARIMA exhibits the lowest execution time (0.33 s), serving as the baseline. The functional models, particularly FARX(

p, m, \underset{̲}{g}, τ

), show increased computational cost, with a maximum time of 2.17 s. This rise reflects the added complexity due to functional and exogenous components. However, such models provide improved predictive performance, justifying the additional time.

5. Conclusions

Accurate modeling and forecasting of daily temperature curves are essential for many practical applications, such as weather forecasting, agriculture, energy management, coastal activities, engineering, and climate analysis. However, due to the complex and non-linear patterns in weather data, the forecasting problem is a challenging task. To address this issue, a novel functional autoregressive model, namely FAR

X (p, m, \underset{̲}{g}, τ)

, is proposed to accommodate any number of functional and scalar covariates. The optimum orders used for different parts of the proposed model are estimated by minimizing a modified functional final prediction error called fFPE

X (p, m, \underset{̲}{g}, τ)

. To evaluate the performance of the proposed methodology, the daily temperature curve data obtained from NASA POWER for Karachi in Pakistan are used, and one-day-ahead out-of-sample forecasts are obtained for a complete year. For comparison purposes, forecasts are also obtained using the classical ARIMA and NNAR models, as well as two functional models, namely FAR(p, m) and FARX(p, m, τ). The forecasting performance of the different models is assessed through different accuracy measures, including the RMSE, MAE, MAPE, and MASE.

The study findings suggest that the proposed functional model FAR

X (p, m, \underset{̲}{g}, τ)

is efficient at forecasting daily temperature curves, producing considerably lower forecasting errors than its competitors. Both of the benchmark functional models, FAR(p, m) and FARX(p, m, τ), perform relatively better than ARIMA and NNAR. Among the functional models, our proposed model, taking into consideration several functional and scalar exogenous variables, produces the best forecasting results. The proposed model effectively captures functional temperature dynamics, particularly for short-term horizons; however, it may be less accurate under extreme or abrupt climatic events. Future work could incorporate additional high-frequency exogenous variables and explore hybrid approaches with deep learning to further enhance predictive performance. Additionally, the model should be evaluated across regions with diverse climatic patterns, such as continental or tropical climates, to assess its robustness and generalizability beyond Karachi.

Author Contributions

Conceptualization, I.S. and M.U.; methodology, I.S. and M.U.; software, M.U. and S.A.; validation, S.A. and S.M.A.; formal analysis, I.S. and M.U.; investigation, S.A. and S.M.A.; resources, S.A. and S.M.A.; data curation, M.U. and S.M.A.; writing—original draft preparation, I.S.; writing—review and editing, S.A. and S.M.A.; visualization, M.U. and S.M.A.; supervision, I.S.; project administration, I.S. and S.A.; funding acquisition, S.M.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research work was funded by Umm Al-Qura University, Saudi Arabia, under grant number: 26UQU4310037GSSR03.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this study were collected from the NASA POWER data access portal (https://power.larc.nasa.gov/data-access-viewer/), accessed on 14 January 2025.

Acknowledgments

The authors extend their appreciation to Umm Al-Qura University, Saudi Arabia, for funding this research work through grant number: 26UQU4310037GSSR03.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Tajfar, E.; Bateni, S.M.; Margulis, S.A.; Gentine, P.; Auligne, T. Estimation of turbulent heat fluxes via assimilation of air temperature and specific humidity into an atmospheric boundary layer model. J. Hydrometeorol. 2020, 21, 205–225.
Valipour, M.; Bateni, S.M.; Gholami Sefidkouhi, M.A.; Raeini-Sarjaz, M.; Singh, V.P. Complexity of forces driving trend of reference evapotranspiration and signals of climate change. Atmosphere 2020, 11, 1081.
Cifuentes, J.; Marulanda, G.; Bello, A.; Reneses, J. Air temperature forecasting using machine learning techniques: A review. Energies 2020, 13, 4215.
Massie, D.R.; Rose, M.A. Predicting daily maximum temperatures using linear regression and Eta geopotential thickness forecasts. Weather Forecast 1997, 12, 799–807.
Campbell, S.D.; Diebold, F.X. Weather forecasting for weather derivatives. J. Am. Stat. Assoc. 2005, 100, 6–16.
Svec, J.; Stevenson, M. Modelling and forecasting temperature based weather derivatives. Glob. Financ. J. 2007, 18, 185–204.
Dupuis, D.J. Forecasting temperature to price CME temperature derivatives. Int. J. Forecast. 2011, 27, 602–618.
Wang, Z.; Li, P.; Li, L.; Huang, C.; Liu, M. Modeling and forecasting average temperature for weather derivative pricing. Adv. Meteorol. 2015, 2015, 837293.
Cavicchioli, M.; Ghezal, A.; Zemmouri, I. (Bi)spectral analysis of Markov switching bilinear time series. Stat. Methods Appl. 2025, 1–30.
Alraddadi, R. The logTG-SV model: A threshold-based volatility framework with logarithmic shocks for exchange rate dynamics. AIMS Math. 2025, 10, 19495–19511.
Gomez, I.; Estrela, M.J.; Caselles, V. Operational forecasting of daily summer maximum and minimum temperatures in the Valencia Region. Nat. Hazards 2014, 70, 1055–1076.
Trigo, R.M.; Palutikof, J.P. Simulation of daily temperatures for climate change scenarios over Portugal: A neural network model approach. Clim. Res. 1999, 13, 45–59.
Abdel-Aal, R.E.; Elhadidy, M.A. Modeling and forecasting the daily maximum temperature using abductive machine learning. Weather Forecast 1995, 10, 310–325.
Abdel-Aal, R.E. Hourly temperature forecasting using abductive networks. Eng. Appl. Artif. Intell. 2004, 17, 543–556.
Ustaoglu, B.; Cigizoglu, H.K.; Karaca, M. Forecast of daily mean, maximum and minimum temperature time series by three artificial neural network methods. Meteorol. Appl. 2008, 15, 431–445.
Sanikhani, H.; Deo, R.C.; Samui, P.; Kisi, O.; Mert, C.; Mirabbasi, R.; Gavili, S.; Yaseen, Z.M. Survey of different data-intelligent modeling strategies for forecasting air temperature using geographic information as model predictors. Comput. Electron. Agric. 2018, 152, 242–260.
Tran, T.T.K.; Bateni, S.M.; Ki, S.J.; Vosoughifar, H. A review of neural networks for air temperature forecasting. Water 2021, 13, 1294.
Shah, I.; Mubassir, P.; Ali, S.; Albalawi, O. A functional autoregressive approach for modeling and forecasting short-term air temperature. Front. Environ. Sci. 2024, 12, 1411237.
Selmy, H.A.; Mohamed, H.K.; Medhat, W. A predictive analytics framework for sensor data using time series and deep learning techniques. Neural Comput. Appl. 2024, 36, 6119–6132.
Alexander, V.O.; Ali, M.I. Enhancing Time Series Data Predictions: A Survey of Augmentation Techniques and Model Performance. In Proceedings of the 2024 Australasian Computer Science Week (ACSW 2024); Association for Computing Machinery: New York, NY, USA, 2024.
Uluocak, I.; Bilgili, M. Daily air temperature forecasting using LSTM-CNN and GRU-CNN models. Acta Geophys. 2024, 72, 2107–2126.
An, H.; Li, Q.; Lv, X.; Li, G.; Qian, Q.; Zhou, G.; Nie, G.; Zhang, L.; Zhu, L. Forecasting daily extreme temperatures in Chinese representative cities using artificial intelligence models. Weather. Clim. Extrem. 2023, 42, 100621.
Sen, A.; Mazumder, A.R.; Dutta, D.; Sen, U.; Syam, P.; Dhar, S. Comparative evaluation of metaheuristic algorithms for hyperparameter selection in short-term weather forecasting. arXiv 2023, arXiv:2309.02600.
Toharudin, T.; Pontoh, R.S.; Caraka, R.E.; Zahroh, S.; Lee, Y.; Chen, R.C. Employing long short-term memory and Facebook prophet model in air temperature forecasting. Commun. Stat.-Simul. Comput. 2023, 52, 279–290.
Elshewey, A.M.; Shams, M.Y.; Elhady, A.M.; Shohieb, S.M.; Abdelhamid, A.A.; Ibrahim, A.; Tarek, Z. A Novel WD-SARIMAX model for temperature forecasting using daily delhi climate dataset. Sustainability 2022, 15, 757.
Gong, B.; Langguth, M.; Ji, Y.; Mozaffari, A.; Stadtler, S.; Mache, K.; Schultz, M.G. Temperature forecasting by deep learning methods. Geosci. Model Dev. 2022, 15, 8931–8956.
Shin, K.-H.; Jung, J.-W.; Chang, K.-H.; Kim, K.; Jung, W.-S.; Lee, D.-I.; You, C.-H. Dynamical prediction of two meteorological factors using the deep neural network and the long short-term memory (II). J. Korean Phys. Soc. 2022, 80, 1081–1097.
Alomar, M.K.; Khaleel, F.; Aljumaily, M.M.; Masood, A.; Razali, S.F.M.; AlSaadi, M.A.; Al-Ansari, N.; Hameed, M.M. Data-driven models for atmospheric air temperature forecasting at a continental climate region. PLoS ONE 2022, 17, e0277079.
Haque, E.; Tabassum, S.; Hossain, E. A comparative analysis of deep neural networks for hourly temperature forecasting. IEEE Access 2021, 9, 160646–160660.
Wang, H.; Pathan, M.S.; Lee, Y.H.; Dev, S. Day-ahead forecasts of air temperature. In 2021 IEEE USNC-URSI Radio Science Meeting (Joint with AP-S Symposium); IEEE: New York, NY, USA, 2021; pp. 94–95.
Lee, S.; Lee, Y.-S.; Son, Y. Forecasting daily temperatures with different time interval data using deep neural networks. Appl. Sci. 2020, 10, 1609.
Zhang, Z.; Dong, Y. Temperature Forecasting via Convolutional Recurrent Neural Networks Based on Time-Series Data. Complexity 2020, 2020, 3536572.
Shin, J.-Y.; Kim, K.R.; Ha, J.-C. Seasonal forecasting of daily mean air temperatures using a coupled global climate model and machine learning algorithm for field-scale agricultural management. Agric. For. Meteorol. 2020, 281, 107858.
Wanishsakpong, W.; Owusu, B.E. Optimal time series model for forecasting monthly temperature in the southwestern region of Thailand. Model. Earth Syst. Environ. 2020, 6, 525–532.
Zahroh, S.; Hidayat, Y.; Pontoh, R.S.; Santoso, A.; Sukono, F.; Bon, A.T. Modeling and forecasting daily temperature in Bandung. In Proceedings of the International Conference on Industrial Engineering and Operations Management, Riyadh, Saudi Arabia, 26–28 November 2019; pp. 406–412.
Ramsay, J.O.; Silverman, B.W. Applied Functional Data Analysis: Methods and Case Studies; Springer: Berlin/Heidelberg, Germany, 2002.
Uzair, M.; Shah, I.; Ali, S. An adaptive strategy for wind speed forecasting under functional data horizon: A way towards enhancing clean energy. IEEE Access 2024, 12, 68730–68746.
Naeem, N.; Ali, S.; Shah, I. Functional EWMA control chart for phase II profile monitoring. J. Stat. Comput. Simul. 2025, 95, 96–116.
Ullah, S.; Finch, C.F. Applications of functional data analysis: A systematic review. BMC Med. Res. Methodol. 2013, 13, 43.
Tao, Z.; Wang, M.; Liu, J.; Wang, P. A Functional Data Analysis Framework Incorporating Derivative Information and Mixed-Frequency Data for Predictive Modeling of Crude Oil Price. IEEE Trans. Ind. Inform. 2025, 21, 3226–3235.
Rodríguez Cuadro, D.; Pérez-Plaza, S.; Castaño-Martínez, A.; Fernández-Palacín, F. A Study of the Colombian Stock Market with Multivariate Functional Data Analysis (FDA). Mathematics 2025, 13, 858.
Pokhrel, K.P.; Tsokos, C.P. Forecasting age-specific brain cancer mortality rates using functional data analysis models. Adv. Epidemiol. 2015, 2015, 721592.
Beyaztas, U.; Yaseen, Z.M. Drought interval simulation using functional data analysis. J. Hydrol. 2019, 579, 124141.
Hael, M.A.; Yongsheng, Y.; Saleh, B.I. Visualization of rainfall data using functional data analysis. SN Appl. Sci. 2020, 2, 461.
Ghumman, A.R.; Haider, H.; Shafiquzamman, M. Functional data analysis of models for predicting temperature and precipitation under climate change scenarios. J. Water Clim. Change 2020, 11, 1748–1765.
Hael, M.A. Modeling of rainfall variability using functional principal component method: A case study of Taiz region, Yemen. Model. Earth Syst. Environ. 2021, 7, 17–27.
Wang, S.; Jank, W.; Shmueli, G. Explaining and forecasting online auction prices and their dynamics using functional data analysis. J. Bus. Econ. Stat. 2008, 26, 144–160.
Wagner-Muns, I.M.; Guardiola, I.G.; Samaranayke, V.A.; Kayani, W.I. A functional data analysis approach to traffic volume forecasting. IEEE Trans. Intell. Transp. Syst. 2017, 19, 878–888.
Shah, I.; Muhammad, I.; Ali, S.; Ahmed, S.; Almazah, M.M.A.; Al-Rezami, A.Y. Forecasting day-ahead traffic flow using functional time series approach. Mathematics 2022, 10, 4279.
Liebl, D. Modeling and forecasting electricity spot prices: A functional data perspective. Ann. Appl. Stat. 2013, 7, 1562–1592.
Lisi, F.; Shah, I. Forecasting next-day electricity demand and prices based on functional models. Energy Syst. 2020, 11, 947–979.
Jan, F.; Shah, I.; Ali, S. Short-Term Electricity Prices Forecasting Using Functional Time Series Analysis. Energies 2022, 15, 3423.
Shah, I.; Jan, F.; Ali, S. Functional data approach for short-term electricity demand forecasting. Math. Probl. Eng. 2022, 2022, 6709779.
Chen, Y.; Koch, T.; Lim, K.G.; Xu, X.; Zakiyeva, N. A review study of functional autoregressive models with application to energy forecasting. Wiley Interdiscip. Rev. Comput. Stat. 2021, 13, e1525.
Frois Caldeira, J.; Gupta, R.; Suleman, M.T.; Torrent, H.S. Forecasting the term structure of interest rates of the BRICS: Evidence from a nonparametric functional data analysis. Emerg. Mark. Financ. Trade 2021, 57, 4312–4329.
Vilar, J.M.; Cao, R.; Aneiros, G. Forecasting next-day electricity demand and price using nonparametric functional methods. Int. J. Electr. Power Energy Syst. 2012, 39, 48–55.
Curceac, S.; Ternynck, C.; Ouarda, T.B.M.J.; Chebana, F.; Niang, S.D. Short-term air temperature forecasting using nonparametric functional data analysis and SARMA models. Environ. Model. Softw. 2019, 111, 394–408.
Aue, A.; Norinho, D.D.; Hörmann, S. On the prediction of stationary functional time series. J. Am. Stat. Assoc. 2015, 110, 378–392.
Karhunen, K. On linear methods in probability theory. Ann. Acad. Sci. Fenn. Ser. A. I. Math.-Phys. 1947, 37, 3–79.
Shang, H.L. A survey of functional principal component analysis. AStA Adv. Stat. Anal. 2014, 98, 121–142.
Hall, P.; Hosseini-Nasab, M. On properties of functional principal components analysis. J. R. Stat. Soc. Ser. B Stat. Methodol. 2006, 68, 109–126.
Hörmann, S.; Kokoszka, P. Weakly dependent functional data. Ann. Stat. 2010, 38, 1845–1884.
Damon, J.; Guillas, S. far: Modelization for Functional AutoRegressive Processes. 2024. Available online: https://CRAN.R-project.org/package=far (accessed on 14 January 2025).
Hyndman, R.J.; Shang, H.L. ftsa: Functional Time Series Analysis. 2025. Available online: https://CRAN.R-project.org/package=ftsa (accessed on 14 January 2025).
Dickey, D.A.; Fuller, W.A. Distribution of the estimators for autoregressive time series with a unit root. J. Am. Stat. Assoc. 1979, 74, 427–431.
Dickey, D.A.; Fuller, W.A. Likelihood ratio statistics for autoregressive time series with a unit root. Econometrica 1981, 49, 1057–1072.
Hörmann, S.; Kokoszka, P. Functional time series. In Handbook of Statistics; Elsevier: Amsterdam, The Netherlands, 2012; Volume 30, pp. 157–186.
Brockwell, P.J.; Davis, R.A. Time Series: Theory and Methods; Springer: Berlin/Heidelberg, Germany, 1991.
Bosq, D. Linear Processes in Function Spaces: Theory and Applications; Springer: Berlin/Heidelberg, Germany, 2000.
Kokoszka, P.; Reimherr, M. Determining the order of the functional autoregressive model. J. Time Ser. Anal. 2013, 34, 116–129.
Borovkova, S.; Permana, F.J. Modelling electricity prices by the potential jump-diffusion. In Stochastic Finance; Springer: Berlin/Heidelberg, Germany, 2006; pp. 239–263.
Shah, I.; Akbar, S.; Saba, T.; Ali, S.; Rehman, A. Short-term forecasting for the electricity spot prices with extreme values treatment. IEEE Access 2021, 9, 105451–105462.
Diebold, F.X.; Mariano, R.S. Comparing predictive accuracy. J. Bus. Econ. Stat. 2002, 20, 134–144.

Figure 1. The Fourier basis functions using M = 20.

Figure 2. Functional data representation of (a) response variable; (b) first exogenous variable; (c) second exogenous variable; (d) third exogenous variable, under study using Fourier basis functions.

Figure 3. Framework for the proposed functional model.

Figure 4. Spatial location of the study area, (a) map of Pakistan with Karachi highlighted; (b) zoomed-in view of Karachi.

Figure 5. Time series plots of (a) endogenous variable indicating training and testing data partition; (b) first exogenous variable; (c) second exogenous variable; (d) third exogenous variable.

Figure 6. (a) Seasonal variations within a year; (b) long-term trend, yearly, and seasonal variations for the whole dataset.

Figure 7. Flowchart of data-driven modeling procedure.

Table 1. Summary of literature review.

No.	Authors	Location	Forecasting	Data Length (Years)	Methods	Error Metrics
1	Shah et al. (2024) [18]	Islamabad, Pakistan	Day-ahead	5	FAR, ARIMA, VAR	MAE, RMSE, MAPE
2	Selmy et al. (2024) [19]	Delhi, India	Day-ahead	5	ARIMA, SARIMA, LSTM, CNN-LSTM	MAE, RMSE, MAPE
3	Alexander et al. (2024) [20]	Jena, Germany	Day-ahead	8	ARIMA, WaveNet, LSTM	MAE, RMSE, MSE
4	Uluocak et al. (2024) [21]	Adana and Ankara, Türkiye	Day-ahead	8	GRU-CNN, LSTM-CNN, FNN, ANFIS, ARMA, GRU, LSTM, CNN	MAE, RMSE, R², NSE
5	An et al. (2023) [22]	China	Day-ahead	10	MLR, SVR, GBRT, LSTM, MLP, GFS	MAE
6	Sen et al. (2023) [23]	Ottawa, Canada	Hour-ahead	11	ARIMA, ANN, LSTM, GRU, GA, DE, PSO	MAPE, MSE
7	Toharudin et al. (2023) [24]	Bandung, Indonesia	Day-ahead	5.5	LSTM, Facebook Prophet	RMSE
8	Elshewey et al. (2022) [25]	Delhi, India	Day-ahead	5	WD-SARIMAX	MAE, RMSE, R², MAPE, MSE, MedAE
9	Gong et al. (2022) [26]	Europe	Hour-ahead	13	ConvLSTM, SAVP	MSC, ACC, SSIM, rG
10	Shin et al. (2022) [27]	South Korea	Day-ahead	6	ANN, DNN, ELM, LSTM, LSTM-PC	MAE, RMSE, R, Theil’s U-statistic
11	Alomar et al. (2022) [28]	North America	Day-ahead	22	SVR, RT, QRT	MAE, RMSE, R, Theil’s U-statistic
12	Haque et al. (2021) [29]	Beijing, China, Toronto, Las Vegas, Seattle, Dallas	Hour-ahead	4	SRN, GRU, LSTM, CNN, CNN-LSTM, GRU-LSTM	MAE, RMSE, R²
13	Wang et al. (2021) [30]	Michigan, United States	3 days ahead	6	Exponential Smoothing	RMSE
14	Lee et al. (2020) [31]	South Korea	Day- and hour-ahead	10	MLP, RNN, CNN	MAE
15	Zhang et al. (2020) [32]	Mainland China	Day-ahead	67	CRNN	MAE, RMSE
16	Shin et al. (2020) [33]	South Korea	Day-ahead	3	Hybrid (GloSea5GC2, RELM), Climatology Model	RMSE
17	Wanishsakpong et al. (2020) [34]	Ranong and Phuket, Thailand	Month-ahead	10	ARIMA, ARIMAX	RMSE, RRMSE
18	Zahroh et al. (2019) [35]	Bandung, Indonesia	Day-ahead	5.5	LSTM	MAPE

Vector Autoregressive (VAR), Gated Recurrent Unit (GRU), Wavelet Decomposition (WD), Stochastic Adversarial Video Prediction (SAVP), Anomaly Correlation Coefficient (ACC), Structural Similarity Index (SSIM), Gradient Ratio (rG), Support Vector Regression (SVR), Regression Tree (RT), Quantile Regression Tree (QRT), Multilayer Perceptron (MLP), Convolutional Neural Network (CNN), Global Climate Model (GloSea5GC2), Regularized Extreme Learning Machine (RELM), Genetic Algorithm (GA), Differential Evolution (DE), and Particle Swarm Optimization (PSO).

Table 2. Descriptive Summaries of Endogenous and Exogenous Variables.

Variables	Min.	Q1	Median	Q3	Max.	Mean	SD
Temperature (°C)	9.48	22.98	27.03	29.7	39.57	26.2	4.95
Wet-bulb temperature (°C)	3.36	18.44	24.43	26.98	31.43	22.35	5.73
Surface Pressure (kPa)	98.51	99.7	100.29	100.82	102	100.26	0.67
Wind Speed at 10 m (m/s)	0.03	2.93	4.09	5.59	16.52	4.36	2.01

Table 3. Optimal orders for various models.

ARIMA	NNAR	FAR(p, m)	FARX(p, m, τ)	FARX( $p, m, \underset{̲}{g}, τ$ )
p = 3	d = 32	p = 6	p = 6	p = 2
d = 0	q = 16	m = 14	m = 14	m = 15
q = 2			τ = 3	$g_{1}$ = 15
				$g_{2}$ = 15
				$g_{3}$ = 9
				τ = 6

p: Autoregressive order, d: Integration order, q: Moving average order, d: Input neurons, q: Hidden layers, p: Lagged value, m: Dimensions of functional response variable, τ: number of scalar covariates, $g_{1}$, $g_{2}$, and $g_{3}$: Dimensions of first, second, and third functional covariates.

Table 4. One-day-ahead out-of-sample temperature forecasting performance for various models.

Models	RMSE	MAE	MAPE	MASE
ARIMA	1.1021	0.6924	2.8604	1.1013
NNAR	0.9720	0.6725	2.7928	1.0697
FAR(p, m)	0.7887	0.5370	2.1412	0.8542
FARX(p, m, τ)	0.7873	0.5360	2.1378	0.8526
FARX( $p, m, \underset{̲}{g}, τ$ )	0.2942	0.2004	0.8213	0.3188
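
As a rough illustration of how the four accuracy measures in Table 4 are typically computed, the following Python sketch uses their textbook definitions. The function names, the toy data, and the choice of the in-sample naive forecast as the MASE scaling term are assumptions made for illustration only; the scaling used in this study may differ.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def mae(actual, forecast):
    """Mean absolute error."""
    n = len(actual)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / n


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n


def mase(actual, forecast, train, season=1):
    """MAE scaled by the in-sample MAE of a (seasonal) naive forecast.

    Assumption: the scale comes from the training series with lag `season`.
    """
    diffs = [abs(train[i] - train[i - season]) for i in range(season, len(train))]
    scale = sum(diffs) / len(diffs)
    return mae(actual, forecast) / scale


# Hypothetical toy temperatures (deg C), not taken from the study.
train = [20.0, 22.0, 21.0, 23.0, 24.0]
actual = [25.0, 26.0, 24.0]
forecast = [24.0, 27.0, 24.5]

print(rmse(actual, forecast), mae(actual, forecast))
print(mape(actual, forecast), mase(actual, forecast, train))
```

Under this definition, a MASE below 1 means the model beats the naive benchmark on average.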

Table 5. p-values from the Diebold–Mariano test. The null hypothesis is equal predictive accuracy of the two models; the one-sided alternative is that the column model is more accurate than the row model.

Models	ARIMA	NNAR	FAR(p, m)	FARX(p, m, τ)	FARX( $p, m, \underset{̲}{g}, τ$ )
ARIMA	–	<0.01	<0.01	<0.01	<0.01
NNAR	>0.99	–	<0.01	<0.01	<0.01
FAR(p, m)	>0.99	>0.99	–	<0.01	<0.01
FARX(p, m, τ)	>0.99	>0.99	>0.99	–	<0.01
FARX( $p, m, \underset{̲}{g}, τ$ )	>0.99	>0.99	>0.99	>0.99	–
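
The one-sided comparison summarized in Table 5 can be sketched as follows. This is a minimal sketch, assuming a squared-error loss, a one-step-ahead horizon (so only the lag-0 variance of the loss differential enters the statistic), and a standard normal reference distribution. The function name `dm_test_one_sided` and the toy forecast errors are hypothetical, and the study's implementation may use a different loss or variance estimator.

```python
import math


def dm_test_one_sided(e_row, e_col, power=2):
    """Diebold-Mariano test of H0: equal accuracy against
    H1: the column model is more accurate than the row model.

    e_row, e_col: forecast errors of the row and column models.
    Returns the test statistic and the one-sided p-value.
    """
    # Loss differential: positive values mean the row model loses more.
    d = [abs(r) ** power - abs(c) ** power for r, c in zip(e_row, e_col)]
    n = len(d)
    d_bar = sum(d) / n
    # Assumption: horizon h = 1, so no autocovariance terms are added.
    var_d = sum((x - d_bar) ** 2 for x in d) / n
    stat = d_bar / math.sqrt(var_d / n)
    # Upper-tail probability P(Z > stat) under the standard normal.
    p_value = 0.5 * math.erfc(stat / math.sqrt(2))
    return stat, p_value


# Hypothetical errors: the "column" model is clearly more accurate.
e_row = [1.0, -1.2, 0.8, -0.9, 1.1, -1.0]
e_col = [0.2, -0.1, 0.3, -0.2, 0.1, -0.3]

stat, p = dm_test_one_sided(e_row, e_col)
_, p_swapped = dm_test_one_sided(e_col, e_row)
print(stat, p, p_swapped)
```

Swapping the two models flips the sign of the statistic, so the two one-sided p-values sum to one. That is why an entry of <0.01 above the diagonal is mirrored by >0.99 below it.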

Table 6. Accuracy measures for one-day-ahead out-of-sample temperature forecasts for different seasons based on various models.

Seasons	Errors	ARIMA	NNAR	FAR(p, m)	FARX(p, m, τ)	FARX( $p, m, \underset{̲}{g}, τ$ )
Winter	RMSE	1.2935	1.2562	0.9180	0.9204	0.3584
	MAE	0.9444	0.9645	0.6977	0.7008	0.2553
	MAPE	4.6811	4.8695	3.4099	3.4241	1.2796
	MASE	0.9805	1.0014	0.7244	0.7276	0.2651
Spring	RMSE	1.4741	1.1767	0.9757	0.9791	0.3604
	MAE	0.8976	0.8576	0.6904	0.6904	0.2509
	MAPE	3.6264	3.3937	2.6307	2.6418	0.9884
	MASE	0.9770	0.9334	0.7514	0.7554	0.2730
Summer	RMSE	0.7738	0.7140	0.6650	0.6640	0.2237
	MAE	0.4985	0.4667	0.4146	0.4159	0.1550
	MAPE	1.6006	1.4836	1.3111	1.3150	0.5073
	MASE	0.9778	0.9154	0.8133	0.8157	0.3041
Autumn	RMSE	0.6402	0.5542	0.4905	0.4898	0.1941
	MAE	0.4289	0.4015	0.3409	0.3370	0.1404
	MAPE	1.5389	1.4323	1.2031	1.1875	0.5114
	MASE	1.0445	0.9778	0.8302	0.8206	0.3419

Table 7. One-day-ahead out-of-sample temperature forecasts: month-specific forecasting errors.

Months	Errors	ARIMA	NNAR	FAR(p, m)	FARX(p, m, τ)	FARX( $p, m, \underset{̲}{g}, τ$ )	Months	Errors	ARIMA	NNAR	FAR(p, m)	FARX(p, m, τ)	FARX( $p, m, \underset{̲}{g}, τ$ )
January	RMSE	1.0441	1.1219	0.8484	0.8507	0.2912	July	RMSE	0.8731	0.8616	0.8178	0.8156	0.2552
	MAE	0.7858	0.8907	0.6552	0.6560	0.2120		MAE	0.5706	0.5588	0.5031	0.5042	0.1814
	MAPE	3.9065	4.4024	3.1678	3.1675	1.0754		MAPE	1.7669	1.7291	1.5459	1.5497	0.5746
	MASE	0.9319	1.0563	0.7770	0.7780	0.2514		MASE	0.8855	0.8671	0.7807	0.7823	0.2815
February	RMSE	1.2813	1.2888	0.9585	0.9609	0.4120	August	RMSE	0.6580	0.6037	0.5751	0.5747	0.2318
	MAE	0.9975	0.9998	0.7142	0.7171	0.2960		MAE	0.4483	0.4011	0.3692	0.3699	0.1558
	MAPE	4.7546	4.7935	3.3520	3.3663	1.4048		MAPE	1.5386	1.3706	1.2467	1.2490	0.5359
	MASE	0.9573	0.9595	0.6855	0.6882	0.2841		MASE	0.9449	0.8453	0.7781	0.7797	0.3284
March	RMSE	2.1264	1.5213	1.1283	1.1315	0.4168	September	RMSE	0.6907	0.4020	0.3780	0.3583	0.1746
	MAE	1.1900	1.1580	0.8042	0.8086	0.2863		MAE	0.3753	0.3094	0.2684	0.2551	0.1270
	MAPE	5.7452	5.3436	3.5958	3.6115	1.3095		MAPE	1.3450	1.0949	0.9423	0.8945	0.4570
	MASE	1.0631	1.0345	0.7184	0.7224	0.2558		MASE	1.2614	1.0398	0.9021	0.8573	0.4269
April	RMSE	0.9866	0.9736	0.9405	0.9448	0.3628	October	RMSE	0.6771	0.7008	0.6261	0.6314	0.2133
	MAE	0.7717	0.7549	0.7040	0.7061	0.2664		MAE	0.4800	0.5043	0.4329	0.4337	0.1550
	MAPE	2.7698	2.7160	2.4912	2.4962	0.9867		MAPE	1.6210	1.7116	1.4546	1.4556	0.5416
	MASE	0.8584	0.8397	0.7831	0.7854	0.2963		MASE	0.9315	0.9786	0.8401	0.8417	0.3007
May	RMSE	0.9928	0.9370	0.8346	0.8373	0.2904	November	RMSE	0.5408	0.5123	0.4266	0.4330	0.1918
	MAE	0.7271	0.6565	0.5634	0.5678	0.2004		MAE	0.4297	0.3875	0.3184	0.3189	0.1388
	MAPE	2.3367	2.0995	1.8007	1.8128	0.6689		MAPE	1.6481	1.4811	1.2039	1.2034	0.5346
	MASE	1.0272	0.9274	0.7960	0.8022	0.2831		MASE	0.9343	0.8425	0.6923	0.6932	0.3017
June	RMSE	0.7753	0.6476	0.5685	0.5689	0.1752	December	RMSE	1.5119	1.3489	0.9460	0.9484	0.3653
	MAE	0.4757	0.4392	0.3702	0.3721	0.1269		MAE	1.0534	1.0052	0.7248	0.7303	0.2605
	MAPE	1.4929	1.3466	1.1349	1.1408	0.4081		MAPE	5.3870	5.4079	3.7061	3.7348	1.3666
	MASE	1.0418	0.9619	0.8107	0.8150	0.2779		MASE	0.9921	0.9467	0.6826	0.6878	0.2454

Table 8. Hour-specific one-day-ahead out-of-sample forecasting errors for temperature.

Hours	Errors	ARIMA	NNAR	FAR(p, m)	FARX(p, m, τ)	FARX( $p, m, \underset{̲}{g}, τ$ )	Hours	Errors	ARIMA	NNAR	FAR(p, m)	FARX(p, m, τ)	FARX( $p, m, \underset{̲}{g}, τ$ )
1	RMSE	0.9675	0.8736	0.2658	0.2674	0.4146	13	RMSE	1.2435	1.0984	1.0217	1.0236	0.1243
	MAE	0.5696	0.5823	0.1888	0.1880	0.2831		MAE	0.8210	0.7966	0.7409	0.7421	0.0799
	MAPE	2.7269	2.8245	0.8946	0.8962	1.3748		MAPE	2.7335	2.6322	2.4490	2.4526	0.2726
	MASE	1.0825	1.1065	0.3588	0.3572	0.5380		MASE	1.0176	0.9874	0.9183	0.9198	0.0990
2	RMSE	0.9744	0.9191	0.2568	0.2579	0.1878	14	RMSE	1.2470	1.1036	1.0463	1.0478	0.1691
	MAE	0.5856	0.6094	0.1754	0.1758	0.1331		MAE	0.8251	0.8044	0.7542	0.7552	0.0968
	MAPE	2.8407	3.0073	0.8257	0.8282	0.6338		MAPE	2.7504	2.6557	2.4991	2.5029	0.3244
	MASE	1.0766	1.1204	0.3225	0.3232	0.2447		MASE	1.0297	1.0038	0.9412	0.9424	0.1208
3	RMSE	0.9922	0.9248	0.3398	0.3397	0.1589	15	RMSE	1.2243	1.0939	1.0443	1.0465	0.2320
	MAE	0.6025	0.6073	0.2371	0.2371	0.1164		MAE	0.8207	0.8094	0.7600	0.7603	0.1475
	MAPE	2.9540	3.0337	1.1291	1.1297	0.5349		MAPE	2.7664	2.7094	2.5471	2.5487	0.4940
	MASE	1.0776	1.0863	0.4240	0.4242	0.2082		MASE	1.0366	1.0223	0.9598	0.9603	0.1863
4	RMSE	1.0209	0.9145	0.4320	0.4329	0.1756	16	RMSE	1.1947	1.0533	1.0437	1.0482	0.3303
	MAE	0.6276	0.6153	0.3169	0.3185	0.1300		MAE	0.8025	0.7845	0.7741	0.7802	0.2495
	MAPE	3.1106	3.0861	1.5174	1.5240	0.5956		MAPE	2.7748	2.6959	2.6526	2.6740	0.8707
	MASE	1.0829	1.0618	0.5468	0.5496	0.2244		MASE	1.0266	1.0035	0.9902	0.9981	0.3192
5	RMSE	1.0489	0.9513	0.5264	0.5280	0.1861	17	RMSE	1.1610	1.0083	0.9853	0.9893	0.3261
	MAE	0.6481	0.6405	0.3779	0.3808	0.1219		MAE	0.7688	0.7536	0.7234	0.7273	0.2512
	MAPE	3.2162	3.2141	1.8461	1.8565	0.5786		MAPE	2.8123	2.7290	2.6079	2.6220	0.9071
	MASE	1.0642	1.0517	0.6206	0.6253	0.2002		MASE	1.0285	1.0082	0.9678	0.9730	0.3360
6	RMSE	1.0852	0.9843	0.6217	0.6200	0.2210	18	RMSE	1.1184	0.9714	0.9231	0.9235	0.3684
	MAE	0.6764	0.6661	0.4279	0.4273	0.1560		MAE	0.7344	0.6892	0.6854	0.6830	0.2882
	MAPE	3.3638	3.3555	2.1425	2.1370	0.7077		MAPE	2.9013	2.7089	2.6769	2.6703	1.1229
	MASE	1.0962	1.0794	0.6935	0.6925	0.2528		MASE	1.0239	0.9608	0.9555	0.9521	0.4018
7	RMSE	1.1272	1.0036	0.6943	0.6912	0.3283	19	RMSE	1.0298	0.8876	0.8178	0.8187	0.3043
	MAE	0.6861	0.6715	0.4890	0.4875	0.2428		MAE	0.6512	0.6000	0.5840	0.5814	0.2206
	MAPE	3.3010	3.2858	2.3783	2.3639	1.1503		MAPE	2.6950	2.4874	2.3813	2.3740	0.8757
	MASE	1.1042	1.0806	0.7869	0.7845	0.3907		MASE	1.0187	0.9386	0.9135	0.9095	0.3451
8	RMSE	1.1043	0.9855	0.6914	0.6903	0.3360	20	RMSE	0.9974	0.8428	0.7720	0.7759	0.2804
	MAE	0.6846	0.6732	0.5061	0.5063	0.2499		MAE	0.6129	0.5645	0.5307	0.5320	0.1947
	MAPE	3.0044	2.9736	2.1990	2.1960	1.0419		MAPE	2.6254	2.4273	2.2383	2.2460	0.7976
	MASE	1.0515	1.0340	0.7773	0.7776	0.3838		MASE	1.0188	0.9383	0.8822	0.8844	0.3236
9	RMSE	1.1115	0.9322	0.7256	0.7238	0.3447	21	RMSE	0.9972	0.8712	0.7481	0.7525	0.2687
	MAE	0.7109	0.6463	0.5318	0.5309	0.2600		MAE	0.6028	0.5704	0.5149	0.5166	0.1972
	MAPE	2.8150	2.5586	2.0833	2.0763	0.9898		MAPE	2.6531	2.5108	2.2339	2.2426	0.8209
	MASE	1.0396	0.9451	0.7777	0.7765	0.3802		MASE	1.0454	0.9892	0.8928	0.8959	0.3420
10	RMSE	1.1806	1.0114	0.8336	0.8343	0.3465	22	RMSE	1.0114	0.8914	0.7448	0.7491	0.2754
	MAE	0.7737	0.7318	0.6075	0.6097	0.2575		MAE	0.5939	0.5802	0.5112	0.5136	0.2034
	MAPE	2.8290	2.6613	2.1932	2.1996	0.9148		MAPE	2.6777	2.6370	2.2792	2.2900	0.8583
	MASE	1.0249	0.9694	0.8048	0.8077	0.3411		MASE	1.0446	1.0205	0.8993	0.9035	0.3577
11	RMSE	1.2384	1.0871	0.9507	0.9506	0.3125	23	RMSE	1.0097	0.8585	0.7501	0.7538	0.3214
	MAE	0.8267	0.7905	0.6877	0.6876	0.2155		MAE	0.5813	0.5708	0.5049	0.5086	0.2268
	MAPE	2.8801	2.7358	2.3755	2.3735	0.7479		MAPE	2.6842	2.6486	2.3189	2.3334	1.0017
	MASE	1.0112	0.9669	0.8412	0.8410	0.2636		MASE	1.0545	1.0355	0.9160	0.9227	0.4114
12	RMSE	1.2481	1.0956	0.9987	0.9993	0.2212	24	RMSE	1.0165	0.8786	0.7799	0.7820	0.5023
	MAE	0.8379	0.7981	0.7240	0.7250	0.1491		MAE	0.5731	0.5845	0.5103	0.5134	0.3389
	MAPE	2.8288	2.6782	2.4242	2.4265	0.5102		MAPE	2.7055	2.7713	2.4133	2.4251	1.5841
	MASE	1.0167	0.9684	0.8785	0.8797	0.1809		MASE	1.0745	1.0958	0.9568	0.9626	0.6354

Table 9. Average time in seconds for one-day-ahead forecast of temperature data.

Average Time	ARIMA	NNAR	FAR(p, m)	FARX(p, m, τ)	FARX( $p, m, \underset{̲}{g}, τ$ )
Time (s)	0.33	1.05	1.14	1.17	2.17
Relative Time	1	3.18	3.45	3.55	6.58
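
The relative times in Table 9 appear to be each model's average time divided by the ARIMA time. A short check of that reading, with the dictionary keys chosen here purely as labels:

```python
# Average times in seconds, copied from Table 9.
times = {
    "ARIMA": 0.33,
    "NNAR": 1.05,
    "FAR": 1.14,
    "FARX (scalar covariates)": 1.17,
    "FARX (functional covariates)": 2.17,
}

baseline = times["ARIMA"]  # assumption: ARIMA is the reference model
relative = {model: round(t / baseline, 2) for model, t in times.items()}

for model, ratio in relative.items():
    print(f"{model}: {ratio}")
```

The rounded ratios match the Relative Time row of the table.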

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Shah, I.; Uzair, M.; Ali, S.; Aljeddani, S.M. Uncertainty Quantification of Complex Weather Dynamics Using a Novel Functional Autoregressive Model. Mathematics 2026, 14, 835. https://doi.org/10.3390/math14050835

