Article

Uncertainty Quantification of Complex Weather Dynamics Using a Novel Functional Autoregressive Model

1 Department of Statistics, Quaid-i-Azam University, Islamabad 45320, Pakistan
2 Department of Statistical Sciences, University of Padova, 35121 Padova, Italy
3 Mathematics Department, Al-Lith University College, Umm Al-Qura University, Al-Lith 21961, Saudi Arabia
* Authors to whom correspondence should be addressed.
Mathematics 2026, 14(5), 835; https://doi.org/10.3390/math14050835
Submission received: 18 January 2026 / Revised: 13 February 2026 / Accepted: 25 February 2026 / Published: 1 March 2026

Abstract

Functional time series (FTS) modeling has emerged as a powerful framework for capturing complex temporal dependencies using the functional autoregressive models FAR(p, m) and FARX(p, m, τ). These functional models characterize the evolution of functional observations by incorporating p lagged functional responses, m truncated dimensions from functional principal component analysis (FPCA), and τ scalar covariates, with optimal parameter selection guided by the minimization of the functional final prediction error fFPE(p, m). The aim of this study is to propose a computationally efficient FAR model that can integrate several functional covariates to achieve high predictive accuracy in terms of standard out-of-sample accuracy measures. To this end, an integrated functional autoregressive model $\mathrm{FAR}\mathcal{X}(p, m, \underline{g}, \tau)$ is developed, where $\mathcal{X}$ denotes the exogenous information, this being a lagged or modeled functional profile within the FAR(p, m) framework, and $\underline{g}$ represents a vector of optimal dimensions for the functional covariates. The theoretical contributions are twofold: first, deriving the distribution of the modified functional final prediction error, denoted as $\mathrm{fFPE}\mathcal{X}(p, m, \underline{g}, \tau)$; second, using this derivation to establish formal criteria for optimal model selection. To empirically investigate the predictive performance of the proposed model, hourly temperature data from the NASA POWER project are considered, and day-ahead out-of-sample forecasts over a full annual cycle are computed. The forecasting performance of the proposed model is assessed against state-of-the-art models using different error summary metrics.
The results show that functional models consistently outperform traditional time series and neural network-based approaches, with $\mathrm{FAR}\mathcal{X}(p, m, \underline{g}, \tau)$ achieving superior predictive accuracy compared to FAR(p, m) and FARX(p, m, τ), thereby underscoring the efficacy of incorporating functional exogenous information in FTS modeling.

1. Introduction

Climate patterns exhibit intricate variability, driven by complex interactions between atmospheric, oceanic, and terrestrial systems. Understanding these patterns is crucial for predicting extreme weather events, assessing climate change impacts, and developing effective mitigation strategies. Among the many climatic variables, temperature plays a pivotal role, serving as a primary driver of atmospheric circulation, precipitation dynamics, and weather extremes [1]. It is a fundamental variable in meteorology, shaping the intricate patterns of weather systems and influencing atmospheric circulation. As key drivers of energy exchange within the Earth’s climate system, temperature variations dictate the formation of storms, heatwaves, and cold spells, making them crucial parameters for understanding and predicting complex weather structures. Moreover, long-term temperature trends serve as vital indicators of environmental changes, including global warming and shifts in regional climate regimes. Accurate modeling and forecasting of temperature are essential for climate resilience, disaster preparedness, and sustainable environmental policies [2,3].
To accurately forecast air temperature, researchers have used methods that vary in complexity, methodology, and performance. For example, ref. [4] used linear regression to model daily maximum temperature data, with geopotential thickness forecasts as a covariate, for a site in Nashville, Tennessee. The study reveals that the forecasting accuracy of the daily maximum temperature is high for warmer air, whereas it tends to decrease in the winter season. Although this method improves daily maximum temperature modeling, it has limitations, including dependence on geopotential thickness forecasts, seasonal effects, and other environmental factors. In the context of producing long-horizon predictive densities, which are crucial for pricing weather derivatives, daily average temperature data have been modeled by various authors [5,6]. For pricing temperature derivatives, the Chicago Mercantile Exchange (CME) defines the daily average temperature underlying its contracts as the mean of the daily high and low temperatures. In an evaluation using daily average temperature data from the United States of America, ref. [7] proposed an Extreme Value Theory (EVT) approach that outperformed classical models such as AR, generalized autoregressive conditional heteroskedasticity (GARCH), and standard regression.
Based on 62 years of historical time series (TS) data from Zhengzhou, China, ref. [8] modeled the daily average temperature using a mean-reverting Ornstein–Uhlenbeck process to provide accurate pricing for weather derivatives. Recent advances in nonlinear and complex time series modeling include Markov switching bilinear models for higher-order spectral analysis and threshold-based stochastic volatility frameworks, which are capable of capturing asymmetric dynamics and regime changes [9,10]. Heatwaves significantly impact society, which highlights the importance of modeling temperature. Ref. [11] evaluated the Regional Atmospheric Modelling System (RAMS) for predicting maximum and minimum summer temperatures in the Valencia Region from 2007 to 2010. The results show that the model predicts maximum temperatures well, with small errors of about 2 °C. The impact of global warming, based on daily minimum and maximum temperature data from Coimbra in Portugal, was assessed by [12] using the Hadley Centre model and a neural network. The results suggest that the two-layer neural network outperforms the competitors. Ref. [13] used the Abductory Induction Mechanism (AIM) as a modeling tool for temperature forecasting within a machine learning framework. The model was validated using daily maximum temperature data from the city of Dhahran in Saudi Arabia, achieving 97% accuracy within a range of ±3 °C, and hence provides better predictions than other traditional models. A novel abductive network model was developed by [14] to forecast day-ahead and hour-ahead temperatures. This model was trained using 5 years of hourly data and tested on the following full year, for which the mean absolute errors (MAEs) of the day-ahead and hour-ahead forecasts were 1.68 °F and 1.05 °F, respectively.
Multiple linear regression and three artificial neural networks (ANNs), namely feed-forward backpropagation (FFBP), radial basis function (RBF), and generalized regression neural network (GRNN), have also been applied to model daily minimum, average, and maximum temperatures [15]. These models were evaluated for the Geyve and Sakarya basins, among Turkey’s most important areas for agricultural production, which are located in the southeast of the Marmara region. The study reveals that the three ANN models outperform the classical benchmark.
A comparative study of data-intelligent models, namely a generalized regression neural network (GRNN), multivariate adaptive regression splines (MARS), random forest (RF), and extreme learning machines (ELMs), is presented by [16], in which longitude, latitude, altitude, and periodicity are used as covariates rather than traditional atmospheric features such as humidity and precipitation. These models were implemented at eleven different sites in Madhya Pradesh, central India, and the results show that the GRNN works more effectively for modeling air temperature, particularly in data-sparse regions where only geographic and topographic factors are utilized for temperature forecasting. An in-depth review of air temperature modeling using machine learning and neural network models is given in [17]. A systematic literature review (from 2019 to 2024) on modeling and forecasting daily temperature is also given in Table 1.
Accurate day-ahead temperature forecasts are essential for energy, agriculture, and public health, as they help optimize resource management and reduce climate-related risks. Traditional methods often struggle to capture the continuous and complex nature of temperature variations. To overcome this deficiency, this work models and forecasts daily temperature curves using functional data analysis (FDA). To this end, a novel functional autoregressive model $\mathrm{FAR}\mathcal{X}(p, m, \underline{g}, \tau)$ is developed, and the distribution of the modified functional final prediction error $\mathrm{fFPE}\mathcal{X}(p, m, \underline{g}, \tau)$ is derived for optimal model selection.
The rest of the article is structured as follows. Section 2 provides an introduction to functional data analysis along with the fundamental tools necessary for its implementation. Section 3 presents the development of a new functional model, including discussions on classical and functional competitors. An empirical application of the proposed methodology is provided in Section 4, where the results are analyzed in detail. Finally, Section 5 concludes the study.

2. Functional Data Analysis

FDA offers a strong and promising framework for treating complex and high-dimensional datasets. Within FDA, data are represented as smooth curves or functions rather than discrete points or vectors [36]. This contrasts with traditional approaches that often concentrate on summary statistics or discrete observations. FDA has diverse applications in disciplines such as medicine, finance, neuroscience, economics, environmental studies, and quality control [37,38]. A systematic review of the applications of FDA in various fields for the period from 1995 to 2010 is given by [39]. A recent study [40] forecasts crude oil prices using derivative information in a multivariate empirical mode decomposition (MEMD) model within the FDA framework, which results in better performance than traditional time series models. An application of FDA to analyze the behavioral patterns of the Colombian stock market, specifically the effects of COVID-19 and the aftermath of the Ukraine war, is conducted by [41]. Ref. [42] employed the FDA approach to predict age-specific brain cancer mortality trends for planning public health policies and resource allocation. FDA has also contributed significantly to numerical weather prediction. For example, it has been used to create point and interval forecasts for drought time series patterns [43], yielding a reliable method for predicting drought intervals. Based on the FDA technique, ref. [44] illustrated the temporal changes of rainfall to assist in forecasting and to achieve a clear understanding of rainfall patterns. Ref. [45] implemented the FDA framework to forecast temperature and precipitation under various climate change scenarios. Ref. [46] performed FPCA to model the structure of rainfall.
A functional autoregressive model of the first order, FAR(1), has been explored to forecast hourly air temperature up to 24 h ahead, with the traditional time series model ARIMA serving as the competitor, and the results suggest that FAR(1) is superior. Online auction prices and traffic volume have also been predicted under the FDA framework [47,48,49]. Ref. [50] indicates that, due to substantial fluctuations, electricity consumption, demand, and prices have a non-linear and complex structure. Refs. [51,52,53] applied FDA to estimate and forecast electricity consumption and demand. Hour-ahead electricity demand forecasting strategies have been implemented using non-parametric FDA [54,55,56,57], and the results significantly outperform those obtained from classical seasonal ARIMA (SARIMA) models.

2.1. Preliminaries

Let $\{S_t(\upsilon) : t \in \mathbb{Z},\ \upsilon \in [0, 1]\}$ denote a stationary functional time series of observations, where $t$ represents the time index and $\upsilon$ the argument in the continuous domain of each function. Each $S_t$ is assumed to belong to the Hilbert space $H = L^2([0, 1])$ with the inner product $\langle x, y \rangle = \int_0^1 x(\upsilon)\, y(\upsilon)\, d\upsilon$. Thus, each $S_t$ is square-integrable, satisfying $\|S_t\|^2 = \int_0^1 S_t^2(\upsilon)\, d\upsilon < \infty$, and all random functions are defined on a common probability space $(\Omega, \mathcal{A}, P)$. For $p > 0$, we write $S_t \in L_H^p$ if $E[\|S_t\|^p] < \infty$. Since the series is not necessarily centered, we define the mean function $\lambda(\upsilon) = E[S_t(\upsilon)]$, $\upsilon \in [0, 1]$, and the covariance operator of the centered functions $S_t - \lambda$ as:
$$C(x)(\upsilon) = E\big[\langle S_t - \lambda, x \rangle\, (S_t(\upsilon) - \lambda(\upsilon))\big], \quad x \in H,$$
or equivalently via its kernel representation
$$C(x)(\upsilon) = \int_0^1 c(\upsilon, \gamma)\, x(\gamma)\, d\gamma, \quad x \in H, \tag{1}$$
with
$$c(\upsilon, \gamma) = \operatorname{Cov}\big(S_t(\upsilon), S_t(\gamma)\big). \tag{2}$$
Equation (1) expresses the covariance operator $C$ as an integral operator with kernel $c(\upsilon, \gamma)$, while Equation (2) defines this kernel as the covariance between the functional observations at points $\upsilon$ and $\gamma$. For a more detailed discussion of covariance operators in functional time series, the reader is referred to [58].

2.2. Basis Function System

In the FDA framework, a basis function system (BFS) plays a crucial role in converting discrete data into a functional object, enabling a more compact representation without losing essential information. Various types of basis function systems exist, including polynomial, B-spline, and Fourier basis functions (FBFs), each suitable for different data structures. Among these, FBFs, which consist of sine and cosine waves of increasing frequency, are particularly effective for modeling periodic data. They are especially useful for cyclical environmental data, such as temperature, wind speed, and atmospheric pressure, where seasonal and diurnal variations play a crucial role. The sinusoidal components of the FBFs capture periodic fluctuations, providing a smooth and continuous representation of temperature profiles. A Fourier basis function is defined as
$$\xi_j(\upsilon) = \cos\big(j \varphi(\upsilon)\big) + \sin\big(j \varphi(\upsilon)\big), \tag{3}$$
where $j$ denotes the Fourier frequency index and $\varphi(\upsilon)$ is the fundamental frequency function, typically defined as $\varphi(\upsilon) = 2 \pi \upsilon$, $\upsilon \in [0, 1]$. Using $M$ Fourier basis functions, the functional observation at time $t$ can be approximated as
$$S_t(\upsilon) = \sum_{j=1}^{M} b_{t,j}\, \xi_j(\upsilon), \quad \upsilon \in [0, 1], \tag{4}$$
where $b_{t,j}$ are time-varying Fourier coefficients and $\{\xi_j(\upsilon)\}_{j=1}^{M}$ form the Fourier basis function system. As an example, Figure 1 shows the $M = 20$ FBFs, and Figure 2 shows the resulting 2192 daily temperature curves for the years 2019 to 2024. The sequence $S_t(\upsilon)$ is known as the functional time series (FTS), in which one can investigate the major sources of variation among the temperature curves.
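The conversion from discrete hourly readings to Fourier coefficients can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the conventional Fourier basis (a constant plus separate sine and cosine terms) rather than the combined form of Equation (3), an equally spaced hourly grid mapped to [0, 1], ordinary least squares for the coefficients, and synthetic data in place of the NASA POWER series. The function names `fourier_basis` and `fit_fourier_coefficients` are hypothetical.

```python
import numpy as np

def fourier_basis(grid, n_basis):
    """Conventional Fourier basis on [0, 1]: a constant, then sine/cosine pairs."""
    columns = [np.ones_like(grid)]
    j = 1
    while len(columns) < n_basis:
        columns.append(np.sqrt(2.0) * np.sin(2.0 * np.pi * j * grid))
        if len(columns) < n_basis:
            columns.append(np.sqrt(2.0) * np.cos(2.0 * np.pi * j * grid))
        j += 1
    return np.column_stack(columns)  # shape: (len(grid), n_basis)

def fit_fourier_coefficients(raw, n_basis):
    """Least-squares coefficients b[t, j] for each row (one day) of `raw`."""
    n_points = raw.shape[1]
    # Assumption: observations are equally spaced and mapped to midpoints in [0, 1].
    grid = (np.arange(n_points) + 0.5) / n_points
    basis = fourier_basis(grid, n_basis)
    coefs, *_ = np.linalg.lstsq(basis, raw.T, rcond=None)
    return coefs.T, basis

# Synthetic stand-in for 10 days of hourly temperatures (not the NASA POWER data).
rng = np.random.default_rng(0)
hours = (np.arange(24) + 0.5) / 24
raw = 20.0 + 5.0 * np.sin(2.0 * np.pi * hours) + rng.normal(0.0, 0.5, size=(10, 24))

coefs, basis = fit_fourier_coefficients(raw, n_basis=5)
smoothed = coefs @ basis.T  # smoothed curves evaluated on the hourly grid
rmse = float(np.sqrt(np.mean((raw - smoothed) ** 2)))
```

Each row of `coefs` plays the role of the coefficients $b_{t,j}$ for one day. A small number of basis functions is used here only to keep the example short.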

2.3. Functional Principal Component Analysis (FPCA)

Suppose we observe an FTS, $S_t(\upsilon)$, consisting of $n$ curves, where $\upsilon \in [0, 1]$. The mean function is defined as $\lambda(\upsilon) = \frac{1}{n} \sum_{t=1}^{n} S_t(\upsilon)$. The mean-centered curves are then given by
$$\tilde{S}_t(\upsilon) = S_t(\upsilon) - \lambda(\upsilon). \tag{5}$$
Let $\alpha_j(\upsilon)$ denote the $j$th FPC. Each mean-centered curve $\tilde{S}_t(\upsilon)$ can be projected onto $\alpha_j(\upsilon)$ in the Hilbert space $H = L^2([0, 1])$. The corresponding FPC score is defined as
$$\theta_{j,t} = \langle \tilde{S}_t, \alpha_j \rangle = \int_0^1 \tilde{S}_t(\upsilon)\, \alpha_j(\upsilon)\, d\upsilon. \tag{6}$$
The FPCs satisfy the normalization condition $\int_0^1 \alpha_j^2(\upsilon)\, d\upsilon = 1$, ensuring the identifiability of the scores. The covariance operator $C : H \to H$ of the mean-centered process is defined as $C(x)(\upsilon) = E\big[\langle \tilde{S}_t, x \rangle\, \tilde{S}_t(\upsilon)\big]$, $x \in H$. According to Mercer’s theorem, $C$ admits the spectral decomposition
$$C(x)(\upsilon) = \sum_{j=1}^{\infty} \delta_j\, \langle x, \alpha_j \rangle\, \alpha_j(\upsilon), \quad x \in H, \tag{7}$$
where $\{\delta_j\}_{j \ge 1}$ are the eigenvalues in decreasing order and $\{\alpha_j\}_{j \ge 1}$ are the corresponding orthonormal eigenfunctions [59]. Using the Karhunen–Loève expansion, each functional observation can be expressed as
$$S_t(\upsilon) = \lambda(\upsilon) + \sum_{j=1}^{\infty} \theta_{j,t}\, \alpha_j(\upsilon), \tag{8}$$
where $E[\theta_{j,t}] = 0$ and $\operatorname{Var}(\theta_{j,t}) = \delta_j$. For dimension reduction, we retain the first $m$ FPCs and obtain the approximation [60,61]
$$S_t(\upsilon) = \lambda(\upsilon) + \sum_{j=1}^{m} \theta_{j,t}\, \alpha_j(\upsilon) + \sum_{j=m+1}^{\infty} \theta_{j,t}\, \alpha_j(\upsilon), \tag{9}$$
where the third term in the above expression accounts for the variability not explained by the first $m$ functional principal components and has zero mean and finite variance [62]. In practice, the mean function and covariance kernel are estimated as
$$\hat{\lambda}(\upsilon) = \frac{1}{n} \sum_{t=1}^{n} S_t(\upsilon), \tag{10}$$
$$\hat{C}(\upsilon, \gamma) = \frac{1}{n-1} \sum_{t=1}^{n} \big(S_t(\upsilon) - \hat{\lambda}(\upsilon)\big)\big(S_t(\gamma) - \hat{\lambda}(\gamma)\big). \tag{11}$$
Let $\hat{\delta}_j$ and $\hat{\alpha}_j(\upsilon)$ denote the eigenvalues and eigenfunctions of $\hat{C}$. The reconstructed FTS is then given by
$$\hat{S}_t(\upsilon) = \hat{\lambda}(\upsilon) + \sum_{j=1}^{m} \hat{\theta}_{j,t}\, \hat{\alpha}_j(\upsilon), \tag{12}$$
where $\hat{\theta}_{j,t} = \int_0^1 \big(S_t(\upsilon) - \hat{\lambda}(\upsilon)\big)\, \hat{\alpha}_j(\upsilon)\, d\upsilon$.
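The estimation steps above (sample mean, covariance kernel, eigen-decomposition, scores, and truncated reconstruction) can be sketched numerically as follows. This is an illustrative discretization, not the authors' code: it assumes curves evaluated on an equally spaced grid over [0, 1], approximates integrals by a simple Riemann sum with weight $h = 1/G$, and uses synthetic curves. The function name `fpca_on_grid` is hypothetical.

```python
import numpy as np

def fpca_on_grid(curves, m):
    """FPCA for curves sampled on an equally spaced grid over [0, 1].

    curves: array of shape (n, G), one row per curve.
    Returns the mean curve, all eigenvalues, the first m eigenfunctions,
    the FPC scores, and the rank-m reconstruction.
    """
    n, n_grid = curves.shape
    h = 1.0 / n_grid                              # quadrature weight for d(upsilon)
    mean = curves.mean(axis=0)                    # estimated mean function
    centered = curves - mean
    kernel = centered.T @ centered / (n - 1)      # covariance kernel on the grid
    evals, evecs = np.linalg.eigh(kernel * h)     # discretized covariance operator
    order = np.argsort(evals)[::-1]               # decreasing eigenvalues
    evals = evals[order]
    efuns = evecs[:, order[:m]] / np.sqrt(h)      # rescale so the integral of alpha_j^2 is 1
    scores = centered @ efuns * h                 # numerical inner products (FPC scores)
    recon = mean + scores @ efuns.T               # truncated reconstruction
    return mean, evals, efuns, scores, recon

# Synthetic curves with two dominant modes of variation plus small noise.
rng = np.random.default_rng(1)
grid = (np.arange(24) + 0.5) / 24
curves = (15.0
          + rng.normal(0.0, 3.0, (60, 1)) * np.sin(2.0 * np.pi * grid)
          + rng.normal(0.0, 1.0, (60, 1)) * np.cos(2.0 * np.pi * grid)
          + rng.normal(0.0, 0.1, (60, 24)))

mean, evals, efuns, scores, recon = fpca_on_grid(curves, m=2)
explained = evals[:2].sum() / evals.sum()         # share of variance kept by m = 2
```

With this scaling, the sample variance of each score column equals the corresponding eigenvalue, which mirrors the population statement $\operatorname{Var}(\theta_{j,t}) = \delta_j$.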

3. Functional Time Series Modeling

An FTS captures nonlinear dependencies while preserving smooth temporal structures, making it particularly suitable for modeling complex dynamic processes. This section discusses the existing methodologies and the newly developed approach, which can effectively handle such dependencies. Despite its statistical relevance and mathematical appeal, functional time series modeling has practical limitations. Few user-friendly software packages exist, with notable exceptions being the far [63] and ftsa [64] packages in R version 4.5.0 (2025-04-11 ucrt). The lack of dedicated tools often requires manual implementation, which limits its use largely to academic researchers. This article aims to bridge this gap by introducing a prediction algorithm.

3.1. Functional Autoregressive Models

An autoregressive (AR) model is a well-known linear model in which the response variable is regressed on its lagged values plus a noise term [65,66]. If the response observations are functions defined over a continuous domain (rather than scalar or vector values), the autoregressive model is referred to as an FAR model. An FAR process and its theoretical properties are described in a Hilbert space framework, which is one of the best techniques for modeling the complex nature of FTS [67]. In traditional univariate and multivariate time series analysis, reliable predictions rely on recursive methods like the Durbin–Levinson algorithm and the innovations algorithm [68], which systematically update predictions based on past observations. Prediction equations for general stationary FTS can be derived explicitly. However, their practical implementation is challenging due to the infinite-dimensional nature of functional data. As a result, most of the research in this area is limited to FAR(1) models, because higher-order FAR(p) models require substantial data and computation for reliable parameter estimation. Mathematically, the FAR(1) model [69] is given in Equation (13) as
$$S_t(\upsilon) = \lambda(\upsilon) + \phi S_{t-1}(\upsilon) + N_t(\upsilon), \tag{13}$$
where $\lambda(\upsilon)$ is the functional mean curve, $S_{t-1}(\upsilon)$ is the first lagged functional variable, $N_t(\upsilon)$ is i.i.d. in $L_H^2$ such that $E[N_t(\upsilon)] = 0$, and $\phi$ is a bounded linear operator mapping $H \to H$ such that Equation (13) has a unique solution. If the FAR(1) model is inadequate, a multiple testing procedure can be used to determine the optimal order $p$ [70]; its test statistic approximately follows a chi-square distribution, with the degrees of freedom based on the number of FPCs. The more general FAR(p) process is given by Equation (14) as
$$S_t(\upsilon) = \lambda(\upsilon) + \sum_{i=1}^{p} \phi_i S_{t-i}(\upsilon) + N_t(\upsilon), \tag{14}$$
where $\phi_i$ are the linear operators acting on the corresponding lagged functional variables $S_{t-i}(\upsilon)$, $i = 1, \ldots, p$. An alternative approach, FAR(p, m), is based on multivariate time series and FPCA: a vector autoregressive (VAR) model is fitted to the FPC scores. To implement the VAR model, an automatic procedure is suggested for selecting the optimal lag order $p$ and dimension $m$ by minimizing the functional final prediction error (fFPE) given in Equation (15):
$$\mathrm{fFPE}(p, m) = \frac{n + p \times m}{n - p \times m}\, \operatorname{trace}(\hat{\Sigma}_N) + \sum_{j=m+1}^{\infty} \hat{\delta}_j, \tag{15}$$
where the first term is the product of a penalty factor, in which $n$ is the sample size, and the trace of the estimated covariance matrix of the residuals from a VAR model fitted to the FPC scores. The second term is the sum of the eigenvalues associated with the ignored principal components. The values of $m$ and $p$ that minimize the fFPE are the optimal values used in FAR(p, m). The predictive performance of FAR(p, m) can be further improved by incorporating $\tau$ scalar covariates, which gives the FARX(p, m, τ) model in Equation (16):
$$S_t(\upsilon) = \lambda(\upsilon) + \sum_{i=1}^{p} \phi_i S_{t-i}(\upsilon) + \sum_{k=1}^{\tau} \zeta_k Z_k + N_t(\upsilon), \tag{16}$$
where $\zeta_k$ are the coefficients of the scalar covariates $Z_k$. The optimal orders $p$ and $m$ for Equation (16) are obtained by minimizing the fFPE, adjusted for the degrees of freedom of the scalar covariates, as
$$\mathrm{fFPE}(p, m) = \frac{n + p \times m + \tau}{n - p \times m - \tau}\, \operatorname{trace}(\hat{\Sigma}_N) + \sum_{j=m+1}^{\infty} \hat{\delta}_j. \tag{17}$$
For detailed discussions about Equations (13)–(17) and their applications to real data, one can consult [18,52,58] and the references therein.
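As a rough illustration of how the fFPE(p, m) criterion in Equation (15) can drive order selection, the sketch below fits a VAR(p) to the first m score columns by ordinary least squares and evaluates the criterion over a small grid. It is a simplified sketch under stated assumptions: the scores are centered (no intercept), the residual covariance is divided by the number of residual rows, the eigenvalue tail is a finite sum over the available estimates, and the scores are simulated. The names `var_residual_cov`, `ffpe`, and `select_p_m` are hypothetical.

```python
import numpy as np

def var_residual_cov(scores, p):
    """Residual covariance of an OLS VAR(p) fit (no intercept; scores are centered)."""
    n = scores.shape[0]
    Y = scores[p:]
    # Column blocks are lag 1, lag 2, ..., lag p of the score vectors.
    X = np.hstack([scores[p - i:n - i] for i in range(1, p + 1)])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ coef
    return resid.T @ resid / resid.shape[0]

def ffpe(scores, evals, p, m):
    """fFPE(p, m): penalized residual trace plus the truncated eigenvalue tail."""
    n = scores.shape[0]
    sigma_hat = var_residual_cov(scores[:, :m], p)
    penalty = (n + p * m) / (n - p * m)
    return penalty * np.trace(sigma_hat) + evals[m:].sum()

def select_p_m(scores, evals, max_p, max_m):
    """Grid search for the (p, m) pair with the smallest fFPE."""
    candidates = [(p, m) for p in range(1, max_p + 1) for m in range(1, max_m + 1)]
    return min(candidates, key=lambda pm: ffpe(scores, evals, pm[0], pm[1]))

# Simulated scores: the first component is an AR(1) series, the others are noise.
rng = np.random.default_rng(2)
n = 300
scores = rng.normal(0.0, [1.0, 0.5, 0.2], size=(n, 3))
for t in range(1, n):
    scores[t, 0] += 0.8 * scores[t - 1, 0]
scores -= scores.mean(axis=0)
evals = scores.var(axis=0, ddof=1)   # stand-in for the estimated eigenvalues

best_p, best_m = select_p_m(scores, evals, max_p=3, max_m=3)
```

The adjustment in Equation (17) would only change the penalty factor, replacing `p * m` with `p * m + tau` in both the numerator and the denominator.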

3.2. Building the $\mathrm{FAR}\mathcal{X}(p, m, \underline{g}, \tau)$ Model

To enhance the forecasting accuracy of the FARX(p, m, τ) model, this work introduces a novel functional model, denoted as $\mathrm{FAR}\mathcal{X}(p, m, \underline{g}, \tau)$. This model incorporates $\underline{g} = (g_1, g_2, \ldots, g_\rho)$, where each $g_l$ is the optimal dimension of the corresponding functional exogenous variable, extending the standard framework. Additionally, a modified version of fFPE(p, m) is derived, leading to the formulation of $\mathrm{fFPE}\mathcal{X}(p, m, \underline{g}, \tau)$ for improved model selection.
Suppose a stationary functional response $S_t(\upsilon)$ and functional exogenous variables $Y_1(\upsilon), Y_2(\upsilon), \ldots, Y_\rho(\upsilon)$ are given, and the goal is to obtain the forecast $\hat{S}_{t+1}(\upsilon)$. Then Equation (16) can be extended to a $\mathrm{FAR}\mathcal{X}(p, m, \underline{g}, \tau)$ model given as
$$S_t(\upsilon) = \lambda(\upsilon) + \sum_{i=1}^{p} \phi_i S_{t-i}(\upsilon) + \sum_{l=1}^{\rho} \psi_l Y_l(\upsilon) + \sum_{k=1}^{\tau} \zeta_k Z_k + N_t(\upsilon), \tag{18}$$
where $\psi_l$ are the functional operators of the $\rho$ functional covariates $Y_l(\upsilon)$. The rest of the notation is the same as in Section 3.1. In this work, we estimate the model in Equation (18) using the FPCA approach. This method ensures both accurate model estimation and computational efficiency, as the first few FPCs capture most of the variation in the endogenous and exogenous functional variables. In this context, the identification step involves selecting the optimal dimension $m$ and lag order $p$ for the endogenous functional variable and a set of optimal dimensions $\underline{g} = (g_1, g_2, \ldots, g_\rho)$ for the $\rho$ exogenous functional variables. To achieve this, the following steps are employed for selecting $p$, $m$, $\underline{g}$, and $\tau$; a flowchart is given in Figure 3.
1. Based on the algorithm in Section 3.3, fix the dimension $m$ of the functional endogenous variable. Using FPCA based on $m$ dimensions, obtain the estimated FPC scores $\hat{s}_{j,t} = \int \hat{S}_t(\upsilon)\, \hat{\alpha}_j(\upsilon)\, d\upsilon$ for each $\hat{S}_t(\upsilon)$, which gives the $m$-variate vectors of estimated FPC scores $\hat{\mathbf{S}}_t = (\hat{s}_{1,t}, \hat{s}_{2,t}, \ldots, \hat{s}_{m,t})^\top$, where $t = 1, 2, 3, \ldots, k$ and $j = 1, 2, 3, \ldots, m$.
2. Fix the dimensions $\underline{g} = (g_1, g_2, \ldots, g_\rho)$ for the $\rho$ functional exogenous variables and obtain the estimated FPC scores for each of the $Y_l(\upsilon)$:
$$\hat{y}_{j_1,t} = \int \hat{Y}_1(\upsilon)\, \hat{\alpha}_{j_1}(\upsilon)\, d\upsilon, \quad j_1 = 1, 2, 3, \ldots, g_1,$$
$$\hat{y}_{j_2,t} = \int \hat{Y}_2(\upsilon)\, \hat{\alpha}_{j_2}(\upsilon)\, d\upsilon, \quad j_2 = 1, 2, 3, \ldots, g_2,$$
and so on,
$$\hat{y}_{j_\rho,t} = \int \hat{Y}_\rho(\upsilon)\, \hat{\alpha}_{j_\rho}(\upsilon)\, d\upsilon, \quad j_\rho = 1, 2, 3, \ldots, g_\rho,$$
such that we have $g_1$-, $g_2$-, $\ldots$, $g_\rho$-variate vectors of estimated FPC scores, respectively:
$$\hat{\mathbf{Y}}_t^{1} = (\hat{y}_{1,t}^{1}, \hat{y}_{2,t}^{1}, \ldots, \hat{y}_{g_1,t}^{1})^\top,$$
$$\hat{\mathbf{Y}}_t^{2} = (\hat{y}_{1,t}^{2}, \hat{y}_{2,t}^{2}, \ldots, \hat{y}_{g_2,t}^{2})^\top,$$
and so on,
$$\hat{\mathbf{Y}}_t^{\rho} = (\hat{y}_{1,t}^{\rho}, \hat{y}_{2,t}^{\rho}, \ldots, \hat{y}_{g_\rho,t}^{\rho})^\top.$$
3. If the number of scalar exogenous variables $\tau$ is sufficiently large, then it is better to use their functional form instead of their scalar form. That is, fix the dimension $\tau$ using the same algorithm as described in Section 3.3. Using FPCA based on $\tau$ dimensions, obtain the estimated FPC scores $\hat{z}_{j,t} = \int \hat{Z}_k(\upsilon)\, \hat{\alpha}_j(\upsilon)\, d\upsilon$ for $\hat{Z}_k(\upsilon)$, which gives the $\tau$-variate vectors of estimated FPC scores $\hat{\mathbf{Z}}_t = (\hat{z}_{1,t}, \hat{z}_{2,t}, \ldots, \hat{z}_{\tau,t})^\top$, where $t = 1, 2, 3, \ldots, k$ and $j = 1, 2, 3, \ldots, \tau$.
4. The first and second derivatives of functional data capture essential dynamic features, such as trends and curvature, making them valuable predictors for improving the accuracy and interpretability of functional response models. Therefore, following the same idea, one can also use the derivatives of functional variables as predictors.
5. Once all vectors of estimated FPC scores from all functional and scalar exogenous variables are obtained, combine them into a single vector as follows: $\mathbf{C}_t = (\hat{\mathbf{Y}}_t^{1\top}, \hat{\mathbf{Y}}_t^{2\top}, \ldots, \hat{\mathbf{Y}}_t^{\rho\top}, \hat{\mathbf{Z}}_t^\top)^\top$.
6. Next, fix the lag order $p$ and, using the estimated vector of FPC scores of the endogenous variable $\hat{\mathbf{S}}_t = (\hat{s}_{1,t}, \hat{s}_{2,t}, \ldots, \hat{s}_{m,t})^\top$ obtained in Step (1) and $\mathbf{C}_t$ from Step (5), fit an appropriate multivariate model, for example the VAR model with exogenous variables (VARX), given as
$$\mathbf{S}_t = \sum_{i=1}^{p} \Psi_i \mathbf{S}_{t-i} + \Gamma \mathbf{C}_t + \mathbf{N}_t,$$
where $\Gamma$ is the matrix of coefficients of $\mathbf{C}_t$ and $\mathbf{N}_t$ is a white noise process. Then obtain a one-step-ahead forecast as
$$\hat{\mathbf{S}}_{t+1} = (\hat{s}_{1,t+1}, \hat{s}_{2,t+1}, \ldots, \hat{s}_{m,t+1})^\top.$$
7. In the last step, $\hat{\mathbf{S}}_{t+1}$ is converted back to a functional object using the Karhunen–Loève (KL) expansion, and the one-step-ahead forecast in functional form, $\hat{S}_{t+1}(\upsilon)$, is obtained as
$$\hat{S}_{t+1}(\upsilon) = \hat{\lambda}(\upsilon) + \sum_{j=1}^{m} \hat{s}_{j,t+1}\, \hat{\alpha}_j(\upsilon).$$
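Steps 5 to 7 above can be sketched in Python as a least-squares VARX fit on the score vectors followed by a reconstruction of the forecast curve. This is a minimal sketch rather than the authors' algorithm. It assumes centered scores, ordinary least squares in place of a dedicated VARX routine, and that the exogenous score vector for the forecast day (`c_next`) is already available, for instance from a lagged or separately modeled profile. All function names, the simulated coefficients, and the placeholder mean curve are hypothetical.

```python
import numpy as np

def fit_varx(S, C, p):
    """OLS fit of S_t = sum_i Psi_i S_{t-i} + Gamma C_t + N_t on centered scores."""
    n = S.shape[0]
    lagged = [S[p - i:n - i] for i in range(1, p + 1)]   # lag 1, ..., lag p
    X = np.hstack(lagged + [C[p:]])                      # regressors for t = p, ..., n-1
    coef, *_ = np.linalg.lstsq(X, S[p:], rcond=None)
    return coef                                          # shape: (p*m + q, m)

def forecast_next_scores(S, c_next, coef, p):
    """One-step-ahead score forecast from the last p score vectors and c_next."""
    x = np.concatenate([S[-i] for i in range(1, p + 1)] + [c_next])
    return x @ coef

def scores_to_curve(mean_curve, efuns, s_next):
    """Truncated Karhunen-Loeve reconstruction of the forecast curve on the grid."""
    return mean_curve + efuns @ s_next

# Simulated score series with known coefficients (illustration only).
rng = np.random.default_rng(3)
n, m, q = 400, 2, 3
Psi = np.array([[0.5, 0.1], [0.0, 0.3]])
Gamma = rng.normal(0.0, 1.0, (m, q))
C = rng.normal(0.0, 1.0, (n, q))        # stacked exogenous scores C_t
S = np.zeros((n, m))
for t in range(1, n):
    S[t] = Psi @ S[t - 1] + Gamma @ C[t] + rng.normal(0.0, 0.01, m)

coef = fit_varx(S, C, p=1)
c_next = rng.normal(0.0, 1.0, q)        # assumed known for the forecast day
s_next = forecast_next_scores(S, c_next, coef, p=1)
expected = Psi @ S[-1] + Gamma @ c_next  # conditional mean under the simulated model

# Placeholder mean curve and eigenfunctions on an hourly grid.
grid = (np.arange(24) + 0.5) / 24
efuns = np.column_stack([np.sqrt(2.0) * np.sin(2.0 * np.pi * grid),
                         np.sqrt(2.0) * np.cos(2.0 * np.pi * grid)])
mean_curve = np.full(24, 18.0)
curve_next = scores_to_curve(mean_curve, efuns, s_next)
```

In a real application, `S`, `C`, `efuns`, and `mean_curve` would come from the FPCA of the response and covariate curves rather than from simulation.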

3.3. Selection of Optimal Orders $p, m, \underline{g}, \tau$ by $\mathrm{fFPE}\mathcal{X}(p, m, \underline{g}, \tau)$

The performance of FAR(p, m) clearly depends on the optimal selection of the order parameters p and m representing the lag and appropriate number of FPCs, respectively. Past studies suggest some techniques to get these optimal values for the FAR models. For example, the value of m was suggested by [70], based on multistage hypothesis testing. Also, a mechanical and automated technique is suggested for getting an optimum p and m based on minimizing the mean squared error (MSE) as proposed by [58], and called ffPE(p, m) in Equations (15) and (17), as discussed in Section 3.1. In this section, we demonstrate a modified ffPE(p, m) called fFPE X ( p , m , g ̲ , τ ) , which is minimized for the selection of optimal parameter values p , m , g ̲ , τ where g ̲ = g 1 g 2 g ρ . We begin by analyzing the MSE. Since the eigenfunctions α j are orthogonal and the FPCs s j , t are uncorrelated, we can partition the MSE as follows:
E S t + 1 ( υ ) S ^ t + 1 ( υ ) 2 = E j = 1 s t + 1 , j α j j = 1 m s ^ t + 1 , j α j 2 = E S t + 1 S ^ t + 1 2 + j = m + 1 δ j + l = 1 ρ r = g l + 1 Z l , r .
Here, the usual L 2 Euclidean norm of vectors is represented by · 2 . In the above equation, the first term E S t + 1 S ^ t + 1 2 represents the finite-dimensional MSE for the approximation based on the truncated score vectors. Essentially, it quantifies the error within the selected subspace of dimension m. This finite-dimensional space contains the most significant components (typically derived from the eigenfunctions of the functional data), allowing us to approximate S t + 1 using a lower-dimensional projection. The second term j = m + 1 δ j captures the information loss due to truncation at dimension m. Here, δ j represents the eigenvalues associated with the components beyond m. By truncating at m, we exclude components with eigenvalues δ m + 1 , δ m + 2 which means losing some functional information, as these components are no longer part of the approximation. The third term l = 1 ρ r = g l + 1 Z l , r captures the information loss due to truncation at dimension g ̲ = g 1 g 2 g ρ based on ρ number of functional covariates. Here l is the index of the functional covariate ranging from 1 to ρ, while r indexes the higher-order terms beyond the optimal dimension g l . Z l , r captures the information loss due to truncating the l t h functional covariate at its g l t h dimension and corresponds to the eigenvalue of the r t h principal component of that covariate, quantifying the variance it explains. Summing over all components beyond g l accounts for the total information lost due to dimension reduction, analogous to j = m + 1 δ j for the response. Hence, the third term represents the cumulative contribution of the truncated higher-order components of the functional covariates beyond the selected optimal dimensions g ̲ . Assuming the stationarity of the process S t is a ( m + K + τ ) -variate VARX ( p ) , where K = l = 1 ρ g l , has the following form:
S t + 1 = Ψ 1 S t + Ψ 2 S t 1 + , , + Ψ p S t p + 1 + Γ C t + N t + 1 ,
where Γ is a matrix of coefficients of C t = Y ^ t 1 Y ^ t 2 Y ^ t ρ Z ^ t and N t is a white noise process such that
n ( W ^ W ) D N ( p × m 2 ) + K + τ 0 , Σ N Δ ( p × m + K + τ ) 1 ,
where
W ^ = vec Ψ ^ 1 , , Ψ ^ p , Γ ^ 1 , , Γ ^ K + τ
is the least squares estimator of W = vec Ψ 1 , , Ψ p , Γ 1 , , Γ K + τ , n ( W ^ W ) is the difference between the estimator W ^ and the true parameter vector W scaled by n and Δ ( p × m + K + τ ) = Var vec S 1 , , S p , and Σ N = E N 1 , N 1 . The asymptotic distribution suggests that as n , this scaled difference converges to a normal distribution with a zero mean and a covariance structure-determined Kronecker product ( ) of Σ N and Δ ( p × m + K + τ ) 1 . Suppose that the W ^ are estimated from an independent training sample R 1 , , R t = D S 1 , , S t . It follows then that
E S t + 1 S ^ t + 1 2 = E S t + 1 Ψ ^ 1 S t + Ψ ^ 2 S t 1 + , , + Ψ ^ p S t p + 1 + Γ ^ C t 2
E S t + 1 S ^ t + 1 2 = E N t + 1 2 + E Ψ 1 Ψ ^ 1 S t + + Ψ p Ψ ^ p S t p + 1 + t = 1 K + τ Γ t Γ t ^ C t 2 = trace Σ N + E I ( p × m + K + τ ) S t , , S t p + 1 , C 1 , , C K + τ ( W W ^ ) 2
The first term trace Σ N is the intrinsic noise variance from the white noise process N t + 1 and second term is obtained from the estimation error, which is expressed compactly using the Kronecker product (⊗) and vector notation ( S t , , S t p + 1 , C 1 , , C K + τ ) represents the past values in the autoregressive component, and combined from a reduced dimension from functional covariates and scalar exogenous variables. The independence of W ^ and ( S 1 , , S t , C t ) yields that
E I ( p × m + K + τ ) S t , , S t p + 1 , C 1 , , C K + τ ( W W ^ ) 2   = E trace ( W W ^ ) I ( p × m + K + τ ) Δ ( p × m + K + τ ) ( W W ^ )   = trace I ( p × m + K + τ ) Δ ( p × m + K + τ ) E ( W W ^ ) ( W W ^ )
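Here, $I_{(p \times m + K + \tau)} \otimes \Delta_{(p \times m + K + \tau)}$ represents the overall variance–covariance structure for the autoregressive, functional exogenous, and scalar terms. By using Equation (20), the last term can be approximated as
$$\frac{1}{n} \operatorname{trace}\left( \Sigma_N \otimes I_{(p \times m + K + \tau)} \right) + o(1) \approx \frac{(p \times m + K + \tau)}{n} \operatorname{trace}(\Sigma_N).$$
Using the above results, Equation (21) can be written as
$$E\left[\left\| S_{t+1} - \hat{S}_{t+1} \right\|^{2}\right] = \operatorname{trace}(\Sigma_N) + \frac{(p \times m + K + \tau)}{n} \operatorname{trace}(\Sigma_N).$$
Replacing $\operatorname{trace}(\Sigma_N)$ by $\dfrac{n}{n - (p \times m + K + \tau)} \operatorname{trace}(\hat{\Sigma}_N)$,
$$\begin{aligned}
E\left[\left\| S_{t+1} - \hat{S}_{t+1} \right\|^{2}\right]
&= \frac{n}{n - (p \times m + K + \tau)} \operatorname{trace}(\hat{\Sigma}_N) + \frac{(p \times m + K + \tau)}{n} \times \frac{n}{n - (p \times m + K + \tau)} \operatorname{trace}(\hat{\Sigma}_N) \\
&= \left[ \frac{n}{n - (p \times m + K + \tau)} + \frac{(p \times m + K + \tau)}{n} \times \frac{n}{n - (p \times m + K + \tau)} \right] \operatorname{trace}(\hat{\Sigma}_N) \\
&= \frac{n + (p \times m + K + \tau)}{n - (p \times m + K + \tau)} \operatorname{trace}(\hat{\Sigma}_N).
\end{aligned}$$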
Using the above results, Equation (19) can be written as
$$E\left[\left\| S_{t+1}(\upsilon) - \hat{S}_{t+1}(\upsilon) \right\|^{2}\right] \approx \frac{n + (p \times m + K + \tau)}{n - (p \times m + K + \tau)} \operatorname{trace}(\hat{\Sigma}_N) + \sum_{j=m+1}^{\infty} \delta_j + \sum_{l=1}^{\rho} \sum_{r=g_l+1}^{\infty} Z_{l,r}.$$
In practice, the population-level quantities in the second and third terms are not observable and are therefore replaced by their corresponding empirical estimates. Thus, the orders $p$, $m$, $\underline{K} = (g_1, g_2, \ldots, g_\rho)$ and $\tau$ are simultaneously selected by minimizing the functional final prediction error criterion generalized for exogenous covariates,
$$\mathrm{fFPE}_X(p, m, \underline{g}, \tau) = \frac{n + (p \times m + K + \tau)}{n - (p \times m + K + \tau)} \operatorname{trace}(\hat{\Sigma}_N) + \sum_{j=m+1}^{\infty} \hat{\delta}_j + \sum_{l=1}^{\rho} \sum_{r=g_l+1}^{\infty} \hat{Z}_{l,r}. \tag{22}$$
The $\mathrm{fFPE}_X(p, m, \underline{g}, \tau)$ criterion renders the proposed forecasting framework fully automated, eliminating reliance on arbitrarily specified tuning parameters. Notably, the optimal dimensions are now sample-adaptive and depend explicitly on the observed sample size $n$. The empirical results presented in Section 4 confirm the strong practical efficacy of this approach.
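As an illustration of how the criterion could be evaluated for one candidate order, the following minimal Python sketch computes the penalty factor and the truncation tails. It is not the authors' R implementation. The function name `ffpe_x` and its argument names are hypothetical, `K` is assumed to be supplied by the caller as the total number of retained covariate dimensions, and the infinite tail sums are assumed to be replaced by sums over the finitely many estimated quantities that are available.

```python
def ffpe_x(n, p, m, K, tau, sigma_hat, delta_tail, z_tails):
    """Evaluate the fFPE_X criterion for one candidate order (illustrative sketch).

    n          : sample size (number of curves)
    p, m       : autoregressive order and number of retained response components
    K, tau     : covariate dimension count (assumed given) and number of scalar covariates
    sigma_hat  : estimated noise covariance matrix as a list of rows
    delta_tail : estimated delta_j for j > m (discarded response components)
    z_tails    : one list per functional covariate with the estimated Z_{l,r} for r > g_l
    """
    k = p * m + K + tau
    if n <= k:
        raise ValueError("the criterion requires n > p*m + K + tau")
    penalty = (n + k) / (n - k)
    trace_sigma = sum(sigma_hat[i][i] for i in range(len(sigma_hat)))
    response_tail = sum(delta_tail)
    covariate_tail = sum(sum(tail) for tail in z_tails)
    return penalty * trace_sigma + response_tail + covariate_tail


# Hypothetical inputs chosen only to exercise the formula.
example_value = ffpe_x(
    n=100, p=2, m=3, K=2, tau=2,
    sigma_hat=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    delta_tail=[0.2, 0.1],
    z_tails=[[0.05], [0.05, 0.1]],
)
```

In a selection loop, this value would be computed over a grid of candidate orders and the combination with the smallest value retained. The penalty factor grows as the number of estimated parameters approaches $n$, which is what makes the chosen dimensions depend on the sample size.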

3.4. Competitive Models

A traditional time series model, the autoregressive integrated moving average (ARIMA) model; the neural network autoregressive (NNAR) model; and the two functional autoregressive models FAR(p, m) and FARX(p, m, τ) in Equations (14) and (16), as described in Section 3.1, are used as benchmarks against which the forecasting accuracy of the proposed functional model is compared.

3.4.1. Autoregressive Integrated Moving Average Model

ARIMA is one of the most commonly applied time series models for forecasting a univariate time series. ARIMA is the extended form of the simple ARMA model, which is a tool that extrapolates the signal into the future to generate forecasts by separating signal from noise. It comprises three parametric parts known as the order of AR, MA, and the number of differences needed for a time series to be stationary; i.e., p, q, and d, respectively. Mathematically, it can be written as
$$(1 - B)^{d}\, \mathcal{S}_{t,k} = \alpha + \sum_{h=1}^{p} \beta_h (1 - B)^{d}\, \mathcal{S}_{t-h,k} + N_t + \sum_{h=1}^{q} \phi_h N_{t-h},$$
where $\mathcal{S}_{t,k}$ is the stochastic part obtained from partitioning $S_{t,k}$, $d$ is the order of differencing needed to achieve stationarity, $B$ is the backward shift operator, $N_t$ is the white noise term, $\alpha$ is an intercept term, and $\beta_h$ $(h = 1, \ldots, p)$ and $\phi_h$ $(h = 1, \ldots, q)$ are the parameters of the AR and MA parts, respectively, which are estimated using the maximum likelihood estimation method.

3.4.2. Neural Network Autoregressive Model

Artificial neural networks (ANNs) allow complex nonlinear relationships between the response variable and its predictors. The strength of the NNAR model comes from the parallel processing of data, which eliminates the need for classical assumptions. As a result, the network model may be easily chosen based on the characteristics of the data. An ANN includes three types of layers, namely input, output, and one or more hidden layer(s), with an activation function (here a sigmoid) that determines the relationship between the input $(S_{t-1,k}, S_{t-2,k}, \ldots, S_{t-d,k})$ and the output $(S_{t,k})$ of a node and of the network. The mathematical form of the NNAR model is
$$S_{t,k} = \omega_0 + \sum_{a=1}^{q} \omega_a\, f\!\left( \omega_{0a} + \sum_{z=1}^{d} \omega_{za} S_{t-z,k} \right) + N_t,$$
where $\omega_0$ and $\omega_{0a}$ are the biases on the nodes, $\omega_a$ and $\omega_{za}$ $(a = 1, 2, \ldots, q;\ z = 1, 2, \ldots, d)$ are the connection weights between the layers of the model, $f(\cdot)$ is the activation function of the hidden layer, $d$ is the number of input nodes, $q$ is the number of hidden nodes, and $N_t \overset{iid}{\sim} N(0, \sigma^{2})$. Furthermore, the sigmoid is based on the logistic function, given as $f(x) = \left(1 + e^{-x}\right)^{-1}$.
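The NNAR equation can be read as a single forward pass through one hidden layer. The sketch below is a minimal Python illustration of that point prediction, with the noise term $N_t$ omitted. The function names and the weight layout (`w_in[a][z]` for the weight from lag $z+1$ to hidden node $a$) are assumptions made for the example and do not reproduce the `forecast` package used in the study.

```python
import math


def logistic(x):
    """Logistic activation f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def nnar_predict(lags, w0, w_out, b_hidden, w_in):
    """One-step NNAR point prediction (illustrative sketch).

    lags     : [S_{t-1,k}, ..., S_{t-d,k}], the d most recent values
    w0       : output bias (omega_0)
    w_out    : q hidden-to-output weights (omega_a)
    b_hidden : q hidden-node biases (omega_{0a})
    w_in     : q lists of d input-to-hidden weights (omega_{za})
    """
    prediction = w0
    for a in range(len(w_out)):
        # Weighted sum of the lagged inputs entering hidden node a.
        activation_input = b_hidden[a] + sum(
            w_in[a][z] * lags[z] for z in range(len(lags))
        )
        prediction += w_out[a] * logistic(activation_input)
    return prediction


# Hypothetical weights: with zero hidden inputs every hidden node outputs 0.5.
example_prediction = nnar_predict(
    lags=[0.3, -0.2],
    w0=1.0,
    w_out=[2.0, 4.0],
    b_hidden=[0.0, 0.0],
    w_in=[[0.0, 0.0], [0.0, 0.0]],
)
```

With $d$ inputs and $q$ hidden nodes, the network has $q(d + 1) + q + 1$ parameters, so an NNAR(32, 16) specification of the kind reported later is considerably more heavily parameterized than a low-order ARIMA model.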

4. Application to Real Data

4.1. Study Area

For the empirical investigation of the proposed FAR model, an hourly temperature time series (in °C) for the site Karachi, Pakistan (Latitude: 24.8607° N, Longitude: 67.0011° E), is considered (Figure 4). The data used in this study span the period from 1 January 2019 to 31 December 2024 and were collected from the NASA POWER data access portal (https://power.larc.nasa.gov/data-access-viewer/ (accessed on 14 January 2025)).
In addition, exogenous variables such as wet-bulb temperature (°C), wind speed at 10 m (m/s), and surface pressure (kPa) are also collected for the same location. The time series data for each variable consist of 52,632 hourly observations for 2192 days. The descriptive summaries for these variables are given in Table 2.
Figure 5 depicts the nature of the variations in the response and the three exogenous variables. The time series plot of the temperature data over six years clearly distinguishes the training set (43,848 observations), used for modeling and estimation, from the testing set (8784 observations), which corresponds to the out-of-sample forecasts. Since 2024 is a leap year, a total of 366 day-ahead forecasts are generated using the expanding window modeling technique. The hourly temperature data also exhibit seasonal variation. Figure 6a illustrates the variation across the seasons in 2024, showing that winter and autumn experience larger fluctuations in temperature than the other seasons.

4.2. Data-Driven Modeling Procedure

Figure 7 clearly illustrates the step-by-step data-driven modeling procedure that is used to implement the proposed and competitive models, beginning with data preprocessing.
Extreme values in the time series of the response, as well as in the explanatory variables, pose challenges for estimation and forecasting. Identifying and adjusting these values improves prediction accuracy. This study applies the Shifting Filter on time series (SFT), an extension of [71], which operates on a rolling window instead of the entire series. Each of the original time series is divided into $P = N/r$ segments, where $N$ is the total number of observations and $r$ is the window width. Values deviating from the mean $\hat{\lambda}$ by more than 2.58 times the sample standard deviation (based on a 99% confidence interval from a normal distribution) are flagged as extreme [72]. The process iterates across all segments for all series as follows:
$$S_t^{\circ} = \bigcup_{i=1}^{P} \left\{ S_t : \left| S_t - \hat{\lambda} \right| \geq 2.58\, SD_S \right\},$$
where $S_t$ is the temperature (°C) at time $t$, $\hat{\lambda}$ is the sample mean, and $SD_S$ is the sample standard deviation. The replacement rule can be mathematically defined as
$$\tilde{S}_t = \begin{cases} S_t, & \text{if } \left| S_t - \hat{\lambda} \right| < 2.58 \cdot SD_S, \\ \tilde{\lambda}, & \text{otherwise.} \end{cases}$$
Extreme values are replaced with the median ($\tilde{\lambda}$) of the respective window for improved forecasting stability.
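The filtering rule above can be sketched in a few lines of Python. This is an illustrative reading of the description rather than the authors' code: the function name `shifting_filter` is hypothetical, the segments are assumed to be consecutive and non-overlapping, and the mean, standard deviation, and median are all assumed to be computed within each segment.

```python
import statistics


def shifting_filter(series, r, threshold=2.58):
    """Replace extreme values segment by segment (illustrative sketch).

    series    : list of observations
    r         : window width (segment length)
    threshold : multiple of the sample standard deviation beyond which
                a value is treated as extreme (2.58 in the text)
    """
    cleaned = []
    for start in range(0, len(series), r):
        window = series[start:start + r]
        if len(window) < 2:
            # A sample standard deviation needs at least two points.
            cleaned.extend(window)
            continue
        mean = statistics.fmean(window)
        sd = statistics.stdev(window)
        median = statistics.median(window)
        for value in window:
            is_regular = abs(value - mean) < threshold * sd
            cleaned.append(value if is_regular else median)
    return cleaned


# Hypothetical segment: eleven readings of 20.0 and one spike of 50.0.
filtered = shifting_filter([20.0] * 11 + [50.0], r=12)
```

A practical consequence of working per segment is that a single spike inflates the standard deviation of its own window only, so the flagging threshold elsewhere in the series is unaffected.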
Once $\tilde{S}_t$ is obtained for each time series, we move to the data-driven framework indicated in Figure 7, which includes the traditional time series model ARIMA, the machine learning model NNAR, and three FAR models. Let $S_{t,k}$ be the filtered temperature time series, where $t \in \mathbb{N}$ and $k = 1, 2, 3, \ldots, 24$. Due to significant seasonal variations ($U_{t,k}$), long trends ($T_{t,k}$), and year effects ($Y_{t,k}$), as represented in Figure 6b, we partition $S_{t,k}$ into a deterministic part ($D_{t,k}$) and a stochastic part ($\mathcal{S}_{t,k}$), such that $S_{t,k} = D_{t,k} + \mathcal{S}_{t,k}$. The first term, $D_{t,k} = U_{t,k} + T_{t,k} + Y_{t,k}$, is estimated via a generalized additive model (GAM), and the second term, $\mathcal{S}_{t,k} = S_{t,k} - \hat{D}_{t,k}$, is modeled by ARIMA. Finally, the two parts are combined for the day-ahead final forecast; that is, $\hat{S}_{t+1,k} = \hat{D}_{t+1,k} + \hat{\mathcal{S}}_{t+1,k}$. Additionally, models like NNAR and FAR can capture non-linearity directly without decomposing $S_{t,k}$, and they performed even better in handling complex patterns.

Order Identification and Estimation

For different variables used in the study, order identification is crucial for ensuring accurate time series modeling and forecasting. The optimal orders required for the implementation of the models in Figure 7 are estimated and given in Table 3.
The optimal orders for all models are selected using model selection criteria: AIC/BIC for ARIMA; cross-validation for the neural network; and, for FAR(p, m), FARX(p, m, τ) and $\mathrm{FAR}_X(p, m, \underline{g}, \tau)$, minimization of the respective fFPE criteria in Equations (15), (17) and (22). Minimizing the fFPE determines the functional orders that best capture the dependency structure in the FTS, which enhances predictive performance. Thus, based on Table 3, the estimated time series models are ARIMA(3, 0, 2) and NNAR(32, 16), and the three estimated FAR models are FAR(6, 14), FARX(6, 14, 3), and $\mathrm{FAR}_X(2, 15, (15, 15, 9), 6)$.
To evaluate the residual behavior of the selected models, the autocorrelation function (ACF) and partial ACF (PACF) plots are depicted in Figure 8. Overall, the final estimated residuals appear to be largely whitened, indicating that the fitted models can be regarded as adequate. Since the fitted models exhibit satisfactory diagnostic behavior, we now proceed to the out-of-sample forecasting analysis.

4.3. Out-of-Sample Forecasting

Out-of-sample forecasting evaluates a model’s predictive performance on unobserved data beyond the estimation period. This approach helps assess the generalization and robustness of the model in real-world applications. To validate the accuracy of the forecasting models, different performance measures are used. In particular, the forecasting accuracy of a model is determined by the mean absolute error (MAE), root mean squared error (RMSE), mean absolute percentage error (MAPE) and mean absolute scaled error (MASE) for T number of out-of-sample forecasts used for accuracy evaluation, which are given as
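$$\mathrm{MAE} = \frac{1}{T} \sum_{t=1}^{T} \left| S_{t,k} - \hat{S}_{t,k} \right|$$
$$\mathrm{RMSE} = \sqrt{ \frac{1}{T} \sum_{t=1}^{T} \left( S_{t,k} - \hat{S}_{t,k} \right)^{2} }$$
$$\mathrm{MAPE} = \frac{1}{T} \sum_{t=1}^{T} \left| \frac{ S_{t,k} - \hat{S}_{t,k} }{ S_{t,k} } \right| \times 100$$
$$\mathrm{MASE} = \frac{ \sum_{t=1}^{T} \left| S_{t,k} - \hat{S}_{t,k} \right| }{ \dfrac{T}{T-1} \sum_{t=2}^{T} \left| S_{t,k} - S_{t-1,k} \right| }$$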
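The four measures can be computed directly from paired actual and forecast values. The following Python sketch follows the formulas as written above, including the MASE scaling by the naive one-step differences of the same out-of-sample series. The function name and the example numbers are hypothetical and are not taken from the study's data.

```python
import math


def accuracy_measures(actual, forecast):
    """Return MAE, RMSE, MAPE (in percent) and MASE for paired series (sketch)."""
    T = len(actual)
    if T != len(forecast) or T < 2:
        raise ValueError("need two series of equal length with at least two points")
    abs_errors = [abs(a - f) for a, f in zip(actual, forecast)]
    mae = sum(abs_errors) / T
    rmse = math.sqrt(sum(e * e for e in abs_errors) / T)
    # MAPE is undefined when an actual value equals zero.
    mape = 100.0 * sum(e / abs(a) for e, a in zip(abs_errors, actual)) / T
    # Scaling term: T/(T-1) times the sum of naive one-step absolute differences.
    naive_sum = sum(abs(actual[t] - actual[t - 1]) for t in range(1, T))
    mase = sum(abs_errors) / (T / (T - 1) * naive_sum)
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "MASE": mase}


# Hypothetical values in degrees Celsius, used only to exercise the formulas.
scores = accuracy_measures(
    actual=[20.0, 22.0, 24.0, 26.0],
    forecast=[21.0, 21.0, 25.0, 25.0],
)
```

A MASE below 1 means the forecasts have a smaller mean absolute error than the naive forecast that repeats the previous observation, which is how the MASE values reported for the models can be read.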
Using the proposed and competitor models, day-ahead out-of-sample forecasts are obtained for the complete leap year 2024. The univariate hourly temperature time series is used to fit ARIMA(3, 0, 2) and NNAR(32, 16), while the daily temperature curves are used to estimate the three FAR models.
The results for the various models are summarized in Table 4 by means of RMSE, MAE, MAPE and MASE. The results show that NNAR outperforms the classical ARIMA, producing a relatively small RMSE (0.9720) and MAPE (2.7928). While the FAR and FARX models provide considerable improvement in forecasting in comparison with ARIMA and NNAR, the proposed $\mathrm{FAR}_X(2, 15, (15, 15, 9), 6)$ delivers the best performance within the class of functional models. It achieves the lowest RMSE (0.2942) and MAE (0.2004), indicating improved forecasting accuracy. Additionally, the significant reduction in MAPE (0.8213) highlights lower prediction errors, while the MASE (0.3188) is also the lowest among all models, showcasing its robustness. These findings confirm that $\mathrm{FAR}_X(2, 15, (15, 15, 9), 6)$ provides more accurate day-ahead temperature forecasts while minimizing overall errors, which is a major advancement in climate data analysis.
To assess the significance of the differences between the accuracy metrics given in Table 4, we applied the Diebold and Mariano (DM) test [73], which is a standard statistical method for comparing forecast accuracy across models. It evaluates whether the differences in accuracy between pairs of forecasts are statistically significant, with a null hypothesis of no difference in accuracy. Table 5 reports the p-values of the DM test. For each pair, the null hypothesis that the model in the row and the model in the column have the same forecasting accuracy is tested against the alternative hypothesis that the model in the column is more accurate than the model in the row.
Figure 8. ACF and PACF plots for all models under study: (a) ACF for ARIMA; (b) PACF for ARIMA; (c) ACF for NNAR; (d) PACF for NNAR; (e) ACF for FAR(p, m); (f) PACF for FAR(p, m); (g) ACF for FARX(p, m, τ); (h) PACF for FARX(p, m, τ); (i) ACF for $\mathrm{FAR}_X(p, m, \underline{g}, \tau)$; (j) PACF for $\mathrm{FAR}_X(p, m, \underline{g}, \tau)$.
Table 5 indicates that the functional models are significantly more accurate than ARIMA and NNAR. Among the functional models, the proposed $\mathrm{FAR}_X(p, m, \underline{g}, \tau)$ significantly outperforms the others, which supports its use for temperature forecasting.
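The one-sided comparison described for the DM test can be sketched as follows for one-step-ahead forecasts. This is a simplified illustration, not the exact procedure of [73] or of any R package: it assumes squared-error loss, uses only the lag-0 variance of the loss differential (adequate for one-step forecasts with serially uncorrelated differentials), and relies on the standard normal approximation for the p-value. The function name and the error values are hypothetical.

```python
import math
import statistics


def dm_test_one_sided(errors_row, errors_col):
    """Diebold-Mariano statistic and one-sided p-value (illustrative sketch).

    H0: equal accuracy; H1: the column model is more accurate than the row model.
    Loss differential: d_t = e_row,t^2 - e_col,t^2 (squared-error loss assumed).
    """
    if len(errors_row) != len(errors_col) or len(errors_row) < 2:
        raise ValueError("need two error series of equal length with at least two points")
    d = [er * er - ec * ec for er, ec in zip(errors_row, errors_col)]
    T = len(d)
    variance = statistics.pvariance(d)  # lag-0 autocovariance of d_t
    if variance == 0.0:
        raise ValueError("loss differential has zero variance")
    statistic = statistics.fmean(d) / math.sqrt(variance / T)
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(statistic / math.sqrt(2.0))
    return statistic, p_value


# Hypothetical forecast errors: the column model has much smaller errors.
row_errors = [1.0, -2.0, 1.5, -1.0, 2.0, -1.5]
col_errors = [0.1, -0.2, 0.1, -0.1, 0.2, -0.1]
dm_stat, dm_p = dm_test_one_sided(row_errors, col_errors)
```

A small p-value in this orientation favors the column model. Swapping the two arguments reverses the sign of the statistic, so the two one-sided p-values for a pair of models sum to one.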
To evaluate the performance of our proposed functional model in more detail, we compute the same accuracy measures for each season and for each month of the year; the results are listed in Table 6 and Table 7, respectively. The study site goes through four different seasons, and the daily temperature curves differ considerably between them, resulting in different forecasting errors. For example, Table 6, which lists the one-day-ahead out-of-sample forecasting errors by season, shows that the errors vary across seasons. More precisely, the MAE in summer is 0.1550 for the $\mathrm{FAR}_X(2, 15, (15, 15, 9), 6)$ model, which is the lowest value across all seasons and all models. Although the forecasting accuracies of all the models are high, the three FAR approaches perform relatively better, especially the proposed $\mathrm{FAR}_X(2, 15, (15, 15, 9), 6)$ model. The error metrics of all models are relatively small in autumn but large in winter. For instance, FAR(6, 14) and FARX(6, 14, 3) produce MAPE values of 1.2031 and 1.1875 in autumn, but 3.4099 and 3.4241 in winter, respectively. On the other hand, the benchmark models ARIMA(3, 0, 2) and NNAR(32, 16) produce fairly large MAPE values. The results indicate that the proposed model performs better in summer and autumn than in winter and spring. This can be attributed to the relatively smoother and more stable temperature patterns during these seasons, whereas winter and spring exhibit rapid fluctuations due to complex atmospheric interactions.
Table 7 lists the monthly forecasting accuracy for the models used in this study. The results suggest that the forecasting errors are lower in June, September and November compared to the other months of the year. The lowest errors are observed in June, whereas the higher errors correspond to March. Comparing the forecasting accuracies of the different models, it is evident that the functional models perform relatively better than ARIMA and NNAR. Among the functional models, $\mathrm{FAR}_X(2, 15, (15, 15, 9), 6)$ gives better results than FAR(6, 14) and FARX(6, 14, 3). The lowest MAPE produced by our proposed functional model is 0.4081, for June, whereas the highest is 1.4048, for February.
Finally, Table 8 provides a comparison of the models for each hour of the day. The results are based on the one-day-ahead out-of-sample forecasts for a whole year. That is, the accuracy measures for the first hour are obtained from all first hours over the complete cycle in 2024 (366 days), and so forth. This table indicates that daily temperature curves are more stable in the second-to-sixth and eighteenth-to-twenty-first hours of the day than they are in the first, seventh, thirteenth, and twenty-fourth hours. The forecasting errors for the thirteenth and fourteenth hours are comparatively lower than in the first and last hours of the day. Additionally, higher prediction errors are observed around dawn, likely caused by abrupt temperature transitions and lower signal-to-noise ratios at these times, which increase the difficulty of accurate short-term forecasting. The results suggest that our proposed functional model $\mathrm{FAR}_X(2, 15, (15, 15, 9), 6)$ is more efficient at capturing the non-linear trends: it performs significantly better than the traditional ARIMA and NNAR models and is the best within the class of functional models. Hence, the inclusion of a number of exogenous variables improves on the capability of FAR(6, 14) and FARX(6, 14, 3).
To summarize the work, one can see from Table 4, Table 5, Table 6, Table 7 and Table 8 that the proposed approach is efficient, producing significantly lower error metrics than the other competitors, which indicates better predictive power for modeling and forecasting the daily temperature curves and hence demonstrates the usefulness of our estimation framework. Moreover, the forecasting results show that the functional forecasting approach is superior to the classical ARIMA and NNAR methods.

4.4. Computational Complexity

Finally, we compare the average computational time in seconds for the five models, as given in Table 9. For the functional models, the authors wrote their own code. All programming was carried out in R, a statistical computing language, and all computations were performed on an Intel(R) Core(TM) i7-4770 CPU running at 3.40 GHz. For the deterministic part, the library gam is used, while for ARIMA and NNAR, the library forecast is used. For the functional models, the libraries fda and vars are used. The documentation provided with these packages gives in-depth details on the particular algorithms used in the estimation process.
Table 9 presents the average computation time in seconds for one-day-ahead temperature forecasting across the various models. ARIMA exhibits the lowest execution time (0.33 s), serving as the baseline. The functional models, particularly $\mathrm{FAR}_X(p, m, \underline{g}, \tau)$, show increased computational cost, with a maximum time of 2.17 s. This rise reflects the added complexity due to the functional and exogenous components. However, such models provide improved predictive performance, justifying the additional time.

5. Conclusions

Accurate modeling and forecasting of daily temperature curves are essential for many practical applications, such as weather forecasting, agriculture, energy management, coastal activities, engineering, and climate analysis. However, due to the complex and non-linear patterns in weather data, forecasting is a challenging task. To address this issue, a novel functional autoregressive model, namely $\mathrm{FAR}_X(p, m, \underline{g}, \tau)$, is proposed to accommodate any number of functional and scalar covariates. The optimal orders for the different parts of the proposed model are estimated by minimizing a modified functional final prediction error, $\mathrm{fFPE}_X(p, m, \underline{g}, \tau)$. To evaluate the performance of the proposed methodology, daily temperature curve data obtained from NASA POWER for Karachi, Pakistan, are used, and one-day-ahead out-of-sample forecasts are obtained for a complete year. For comparison purposes, forecasts are also obtained using the classical ARIMA and NNAR models, as well as two functional models, namely FAR(p, m) and FARX(p, m, τ). The forecasting performance of the different models is assessed through several accuracy measures, including the RMSE, MAE, MAPE, and MASE.
The study findings suggest that the proposed functional model $\mathrm{FAR}_X(p, m, \underline{g}, \tau)$ is efficient at forecasting daily temperature curves, producing considerably lower forecasting errors than the competitors. Both of the existing functional models, FAR(p, m) and FARX(p, m, τ), perform relatively better than ARIMA and NNAR. Among the functional models, our proposed model, which takes into consideration several functional and scalar exogenous variables, produces the best forecasting results. The proposed model effectively captures functional temperature dynamics, particularly for short-term horizons; however, it may be less accurate under extreme or abrupt climatic events. Future work could incorporate additional high-frequency exogenous variables and explore hybrid approaches with deep learning to further enhance predictive performance. Additionally, the model should be evaluated across regions with diverse climatic patterns, such as continental or tropical climates, to assess its robustness and generalizability beyond Karachi.

Author Contributions

Conceptualization, I.S. and M.U.; methodology, I.S. and M.U.; software, M.U. and S.A.; validation, S.A. and S.M.A.; formal analysis, I.S. and M.U.; investigation, S.A. and S.M.A.; resources, S.A. and S.M.A.; data curation, M.U. and S.M.A.; writing—original draft preparation, I.S.; writing—review and editing, S.A. and S.M.A.; visualization, M.U. and S.M.A.; supervision, I.S.; project administration, I.S. and S.A.; funding acquisition, S.M.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research work was funded by Umm Al-Qura University, Saudi Arabia, under grant number: 26UQU4310037GSSR03.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this study were collected from the NASA POWER data access portal (https://power.larc.nasa.gov/data-access-viewer/), accessed on 14 January 2025.

Acknowledgments

The authors extend their appreciation to Umm Al-Qura University, Saudi Arabia, for funding this research work through grant number: 26UQU4310037GSSR03.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Tajfar, E.; Bateni, S.M.; Margulis, S.A.; Gentine, P.; Auligne, T. Estimation of turbulent heat fluxes via assimilation of air temperature and specific humidity into an atmospheric boundary layer model. J. Hydrometeorol. 2020, 21, 205–225. [Google Scholar] [CrossRef]
  2. Valipour, M.; Bateni, S.M.; Gholami Sefidkouhi, M.A.; Raeini-Sarjaz, M.; Singh, V.P. Complexity of forces driving trend of reference evapotranspiration and signals of climate change. Atmosphere 2020, 11, 1081. [Google Scholar] [CrossRef]
  3. Cifuentes, J.; Marulanda, G.; Bello, A.; Reneses, J. Air temperature forecasting using machine learning techniques: A review. Energies 2020, 13, 4215. [Google Scholar] [CrossRef]
  4. Massie, D.R.; Rose, M.A. Predicting daily maximum temperatures using linear regression and Eta geopotential thickness forecasts. Weather Forecast 1997, 12, 799–807. [Google Scholar] [CrossRef]
  5. Campbell, S.D.; Diebold, F.X. Weather forecasting for weather derivatives. J. Am. Stat. Assoc. 2005, 100, 6–16. [Google Scholar] [CrossRef]
  6. Svec, J.; Stevenson, M. Modelling and forecasting temperature based weather derivatives. Glob. Financ. J. 2007, 18, 185–204. [Google Scholar] [CrossRef]
  7. Dupuis, D.J. Forecasting temperature to price CME temperature derivatives. Int. J. Forecast. 2011, 27, 602–618. [Google Scholar] [CrossRef]
  8. Wang, Z.; Li, P.; Li, L.; Huang, C.; Liu, M. Modeling and forecasting average temperature for weather derivative pricing. Adv. Meteorol. 2015, 2015, 837293. [Google Scholar] [CrossRef]
  9. Cavicchioli, M.; Ghezal, A.; Zemmouri, I. (Bi)spectral analysis of Markov switching bilinear time series. Stat. Methods Appl. 2025, 1–30. [Google Scholar] [CrossRef]
  10. Alraddadi, R. The logTG-SV model: A threshold-based volatility framework with logarithmic shocks for exchange rate dynamics. AIMS Math. 2025, 10, 19495–19511. [Google Scholar] [CrossRef]
  11. Gomez, I.; Estrela, M.J.; Caselles, V. Operational forecasting of daily summer maximum and minimum temperatures in the Valencia Region. Nat. Hazards 2014, 70, 1055–1076. [Google Scholar] [CrossRef]
  12. Trigo, R.M.; Palutikof, J.P. Simulation of daily temperatures for climate change scenarios over Portugal: A neural network model approach. Clim. Res. 1999, 13, 45–59. [Google Scholar] [CrossRef]
  13. Abdel-Aal, R.E.; Elhadidy, M.A. Modeling and forecasting the daily maximum temperature using abductive machine learning. Weather Forecast 1995, 10, 310–325. [Google Scholar] [CrossRef]
  14. Abdel-Aal, R.E. Hourly temperature forecasting using abductive networks. Eng. Appl. Artif. Intell. 2004, 17, 543–556. [Google Scholar] [CrossRef]
  15. Ustaoglu, B.; Cigizoglu, H.K.; Karaca, M. Forecast of daily mean, maximum and minimum temperature time series by three artificial neural network methods. Meteorol. Appl. 2008, 15, 431–445. [Google Scholar] [CrossRef]
  16. Sanikhani, H.; Deo, R.C.; Samui, P.; Kisi, O.; Mert, C.; Mirabbasi, R.; Gavili, S.; Yaseen, Z.M. Survey of different data-intelligent modeling strategies for forecasting air temperature using geographic information as model predictors. Comput. Electron. Agric. 2018, 152, 242–260. [Google Scholar] [CrossRef]
  17. Tran, T.T.K.; Bateni, S.M.; Ki, S.J.; Vosoughifar, H. A review of neural networks for air temperature forecasting. Water 2021, 13, 1294. [Google Scholar] [CrossRef]
  18. Shah, I.; Mubassir, P.; Ali, S.; Albalawi, O. A functional autoregressive approach for modeling and forecasting short-term air temperature. Front. Environ. Sci. 2024, 12, 1411237. [Google Scholar] [CrossRef]
  19. Selmy, H.A.; Mohamed, H.K.; Medhat, W. A predictive analytics framework for sensor data using time series and deep learning techniques. Neural Comput. Appl. 2024, 36, 6119–6132. [Google Scholar] [CrossRef]
  20. Alexander, V.O.; Ali, M.I. Enhancing Time Series Data Predictions: A Survey of Augmentation Techniques and Model Performance. In Proceedings of the 2024 Australasian Computer Science Week (ACSW 2024); Association for Computing Machinery: New York, NY, USA, 2024. [Google Scholar] [CrossRef]
  21. Uluocak, I.; Bilgili, M. Daily air temperature forecasting using LSTM-CNN and GRU-CNN models. Acta Geophys. 2024, 72, 2107–2126. [Google Scholar] [CrossRef]
  22. An, H.; Li, Q.; Lv, X.; Li, G.; Qian, Q.; Zhou, G.; Nie, G.; Zhang, L.; Zhu, L. Forecasting daily extreme temperatures in Chinese representative cities using artificial intelligence models. Weather. Clim. Extrem. 2023, 42, 100621. [Google Scholar] [CrossRef]
  23. Sen, A.; Mazumder, A.R.; Dutta, D.; Sen, U.; Syam, P.; Dhar, S. Comparative evaluation of metaheuristic algorithms for hyperparameter selection in short-term weather forecasting. arXiv 2023, arXiv:2309.02600. [Google Scholar] [CrossRef]
  24. Toharudin, T.; Pontoh, R.S.; Caraka, R.E.; Zahroh, S.; Lee, Y.; Chen, R.C. Employing long short-term memory and Facebook prophet model in air temperature forecasting. Commun. Stat.-Simul. Comput. 2023, 52, 279–290. [Google Scholar] [CrossRef]
  25. Elshewey, A.M.; Shams, M.Y.; Elhady, A.M.; Shohieb, S.M.; Abdelhamid, A.A.; Ibrahim, A.; Tarek, Z. A Novel WD-SARIMAX model for temperature forecasting using daily delhi climate dataset. Sustainability 2022, 15, 757. [Google Scholar] [CrossRef]
  26. Gong, B.; Langguth, M.; Ji, Y.; Mozaffari, A.; Stadtler, S.; Mache, K.; Schultz, M.G. Temperature forecasting by deep learning methods. Geosci. Model Dev. 2022, 15, 8931–8956. [Google Scholar] [CrossRef]
  27. Shin, K.-H.; Jung, J.-W.; Chang, K.-H.; Kim, K.; Jung, W.-S.; Lee, D.-I.; You, C.-H. Dynamical prediction of two meteorological factors using the deep neural network and the long short-term memory (II). J. Korean Phys. Soc. 2022, 80, 1081–1097. [Google Scholar] [CrossRef]
  28. Alomar, M.K.; Khaleel, F.; Aljumaily, M.M.; Masood, A.; Razali, S.F.M.; AlSaadi, M.A.; Al-Ansari, N.; Hameed, M.M. Data-driven models for atmospheric air temperature forecasting at a continental climate region. PLoS ONE 2022, 17, e0277079. [Google Scholar] [CrossRef]
  29. Haque, E.; Tabassum, S.; Hossain, E. A comparative analysis of deep neural networks for hourly temperature forecasting. IEEE Access 2021, 9, 160646–160660. [Google Scholar] [CrossRef]
  30. Wang, H.; Pathan, M.S.; Lee, Y.H.; Dev, S. Day-ahead forecasts of air temperature. In 2021 IEEE USNC-URSI Radio Science Meeting (Joint with AP-S Symposium); IEEE: New York, NY, USA, 2021; pp. 94–95. [Google Scholar] [CrossRef]
  31. Lee, S.; Lee, Y.-S.; Son, Y. Forecasting daily temperatures with different time interval data using deep neural networks. Appl. Sci. 2020, 10, 1609. [Google Scholar] [CrossRef]
  32. Zhang, Z.; Dong, Y. Temperature Forecasting via Convolutional Recurrent Neural Networks Based on Time-Series Data. Complexity 2020, 2020, 3536572. [Google Scholar] [CrossRef]
  33. Shin, J.-Y.; Kim, K.R.; Ha, J.-C. Seasonal forecasting of daily mean air temperatures using a coupled global climate model and machine learning algorithm for field-scale agricultural management. Agric. For. Meteorol. 2020, 281, 107858. [Google Scholar] [CrossRef]
  34. Wanishsakpong, W.; Owusu, B.E. Optimal time series model for forecasting monthly temperature in the southwestern region of Thailand. Model. Earth Syst. Environ. 2020, 6, 525–532. [Google Scholar] [CrossRef]
  35. Zahroh, S.; Hidayat, Y.; Pontoh, R.S.; Santoso, A.; Sukono, F.; Bon, A.T. Modeling and forecasting daily temperature in Bandung. In Proceedings of the International Conference on Industrial Engineering and Operations Management, Riyadh, Saudi Arabia, 26–28 November 2019; pp. 406–412. [Google Scholar]
  36. Ramsay, J.O.; Silverman, B.W. Applied Functional Data Analysis: Methods and Case Studies; Springer: Berlin/Heidelberg, Germany, 2002. [Google Scholar] [CrossRef]
  37. Uzair, M.; Shah, I.; Ali, S. An adaptive strategy for wind speed forecasting under functional data horizon: A way towards enhancing clean energy. IEEE Access 2024, 12, 68730–68746. [Google Scholar] [CrossRef]
  38. Naeem, N.; Ali, S.; Shah, I. Functional EWMA control chart for phase II profile monitoring. J. Stat. Comput. Simul. 2025, 95, 96–116. [Google Scholar] [CrossRef]
  39. Ullah, S.; Finch, C.F. Applications of functional data analysis: A systematic review. BMC Med. Res. Methodol. 2013, 13, 43. [Google Scholar] [CrossRef]
  40. Tao, Z.; Wang, M.; Liu, J.; Wang, P. A Functional Data Analysis Framework Incorporating Derivative Information and Mixed-Frequency Data for Predictive Modeling of Crude Oil Price. IEEE Trans. Ind. Inform. 2025, 21, 3226–3235. [Google Scholar] [CrossRef]
  41. Rodríguez Cuadro, D.; Pérez-Plaza, S.; Castaño-Martínez, A.; Fernández-Palacín, F. A Study of the Colombian Stock Market with Multivariate Functional Data Analysis (FDA). Mathematics 2025, 13, 858. [Google Scholar] [CrossRef]
  42. Pokhrel, K.P.; Tsokos, C.P. Forecasting age-specific brain cancer mortality rates using functional data analysis models. Adv. Epidemiol. 2015, 2015, 721592. [Google Scholar] [CrossRef]
  43. Beyaztas, U.; Yaseen, Z.M. Drought interval simulation using functional data analysis. J. Hydrol. 2019, 579, 124141. [Google Scholar] [CrossRef]
  44. Hael, M.A.; Yongsheng, Y.; Saleh, B.I. Visualization of rainfall data using functional data analysis. SN Appl. Sci. 2020, 2, 461. [Google Scholar] [CrossRef]
  45. Ghumman, A.R.; Haider, H.; Shafiquzamman, M. Functional data analysis of models for predicting temperature and precipitation under climate change scenarios. J. Water Clim. Change 2020, 11, 1748–1765. [Google Scholar] [CrossRef]
  46. Hael, M.A. Modeling of rainfall variability using functional principal component method: A case study of Taiz region, Yemen. Model. Earth Syst. Environ. 2021, 7, 17–27. [Google Scholar] [CrossRef]
  47. Wang, S.; Jank, W.; Shmueli, G. Explaining and forecasting online auction prices and their dynamics using functional data analysis. J. Bus. Econ. Stat. 2008, 26, 144–160. [Google Scholar] [CrossRef]
  48. Wagner-Muns, I.M.; Guardiola, I.G.; Samaranayke, V.A.; Kayani, W.I. A functional data analysis approach to traffic volume forecasting. IEEE Trans. Intell. Transp. Syst. 2017, 19, 878–888. [Google Scholar] [CrossRef]
  49. Shah, I.; Muhammad, I.; Ali, S.; Ahmed, S.; Almazah, M.M.A.; Al-Rezami, A.Y. Forecasting day-ahead traffic flow using functional time series approach. Mathematics 2022, 10, 4279.
  50. Liebl, D. Modeling and forecasting electricity spot prices: A functional data perspective. Ann. Appl. Stat. 2013, 7, 1562–1592.
  51. Lisi, F.; Shah, I. Forecasting next-day electricity demand and prices based on functional models. Energy Syst. 2020, 11, 947–979.
  52. Jan, F.; Shah, I.; Ali, S. Short-Term Electricity Prices Forecasting Using Functional Time Series Analysis. Energies 2022, 15, 3423.
  53. Shah, I.; Jan, F.; Ali, S. Functional data approach for short-term electricity demand forecasting. Math. Probl. Eng. 2022, 2022, 6709779.
  54. Chen, Y.; Koch, T.; Lim, K.G.; Xu, X.; Zakiyeva, N. A review study of functional autoregressive models with application to energy forecasting. Wiley Interdiscip. Rev. Comput. Stat. 2021, 13, e1525.
  55. Frois Caldeira, J.; Gupta, R.; Suleman, M.T.; Torrent, H.S. Forecasting the term structure of interest rates of the BRICS: Evidence from a nonparametric functional data analysis. Emerg. Mark. Financ. Trade 2021, 57, 4312–4329.
  56. Vilar, J.M.; Cao, R.; Aneiros, G. Forecasting next-day electricity demand and price using nonparametric functional methods. Int. J. Electr. Power Energy Syst. 2012, 39, 48–55.
  57. Curceac, S.; Ternynck, C.; Ouarda, T.B.M.J.; Chebana, F.; Niang, S.D. Short-term air temperature forecasting using nonparametric functional data analysis and SARMA models. Environ. Model. Softw. 2019, 111, 394–408.
  58. Aue, A.; Norinho, D.D.; Hörmann, S. On the prediction of stationary functional time series. J. Am. Stat. Assoc. 2015, 110, 378–392.
  59. Karhunen, K. On linear methods in probability theory. Ann. Acad. Sci. Fenn. Ser. A. I. Math.-Phys. 1947, 37, 3–79.
  60. Shang, H.L. A survey of functional principal component analysis. AStA Adv. Stat. Anal. 2014, 98, 121–142.
  61. Hall, P.; Hosseini-Nasab, M. On properties of functional principal components analysis. J. R. Stat. Soc. Ser. B Stat. Methodol. 2006, 68, 109–126.
  62. Hörmann, S.; Kokoszka, P. Weakly dependent functional data. Ann. Stat. 2010, 38, 1845–1884.
  63. Damon, J.; Guillas, S. far: Modelization for Functional AutoRegressive Processes. 2024. Available online: https://CRAN.R-project.org/package=far (accessed on 14 January 2025).
  64. Hyndman, R.J.; Shang, H.L. ftsa: Functional Time Series Analysis. 2025. Available online: https://CRAN.R-project.org/package=ftsa (accessed on 14 January 2025).
  65. Dickey, D.A.; Fuller, W.A. Distribution of the estimators for autoregressive time series with a unit root. J. Am. Stat. Assoc. 1979, 74, 427–431.
  66. Dickey, D.A.; Fuller, W.A. Likelihood ratio statistics for autoregressive time series with a unit root. Econometrica 1981, 49, 1057–1072.
  67. Hörmann, S.; Kokoszka, P. Functional time series. In Handbook of Statistics; Elsevier: Amsterdam, The Netherlands, 2012; Volume 30, pp. 157–186.
  68. Brockwell, P.J.; Davis, R.A. Time Series: Theory and Methods; Springer: Berlin/Heidelberg, Germany, 1991.
  69. Bosq, D. Linear Processes in Function Spaces: Theory and Applications; Springer: Berlin/Heidelberg, Germany, 2000.
  70. Kokoszka, P.; Reimherr, M. Determining the order of the functional autoregressive model. J. Time Ser. Anal. 2013, 34, 116–129.
  71. Borovkova, S.; Permana, F.J. Modelling electricity prices by the potential jump-diffusion. In Stochastic Finance; Springer: Berlin/Heidelberg, Germany, 2006; pp. 239–263.
  72. Shah, I.; Akbar, S.; Saba, T.; Ali, S.; Rehman, A. Short-term forecasting for the electricity spot prices with extreme values treatment. IEEE Access 2021, 9, 105451–105462.
  73. Diebold, F.X.; Mariano, R.S. Comparing predictive accuracy. J. Bus. Econ. Stat. 2002, 20, 134–144.
Figure 1. The Fourier basis functions using M = 20.
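As an illustration of the basis shown in Figure 1, the following is a minimal sketch of an orthonormal Fourier basis with M = 20 functions. The period of 24 (one day of hourly observations), the ordering of the sine and cosine terms, and the function names `fourier_basis` and `evaluate_curve` are assumptions made for this example and are not taken from the article.

```python
import math


def fourier_basis(t, n_basis=20, period=24.0):
    """Evaluate the first n_basis orthonormal Fourier functions at time t.

    Assumed order: constant, sin(wt), cos(wt), sin(2wt), cos(2wt), ...
    with w = 2*pi/period.
    """
    omega = 2.0 * math.pi / period
    values = [1.0 / math.sqrt(period)]   # constant term
    scale = math.sqrt(2.0 / period)      # normalises the sine and cosine terms
    k = 1
    while len(values) < n_basis:
        values.append(scale * math.sin(k * omega * t))
        if len(values) < n_basis:
            values.append(scale * math.cos(k * omega * t))
        k += 1
    return values


def evaluate_curve(coefficients, t, period=24.0):
    """Evaluate X(t) = sum_j c_j * phi_j(t) from its basis coefficients."""
    basis = fourier_basis(t, len(coefficients), period)
    return sum(c * b for c, b in zip(coefficients, basis))
```

With this normalisation the functions are orthonormal on one period, so the coefficients of a daily curve are simply its inner products with the basis functions.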
Figure 2. Functional data representation of (a) response variable; (b) first exogenous variable; (c) second exogenous variable; (d) third exogenous variable, under study using Fourier basis functions.
Figure 3. Framework for the proposed functional model.
Figure 4. Spatial location of the study area, (a) map of Pakistan with Karachi highlighted; (b) zoomed-in view of Karachi.
Figure 5. Time series plots of (a) endogenous variable indicating training and testing data partition; (b) first exogenous variable; (c) second exogenous variable; (d) third exogenous variable.
Figure 6. (a) Seasonal variations within a year; (b) long-term trend, yearly, and seasonal variations for the whole dataset.
Figure 7. Flowchart of data-driven modeling procedure.
Table 1. Summary of literature review.
| No. | Authors | Location | Forecasting | Data Length (Years) | Methods | Error Metrics |
|---|---|---|---|---|---|---|
| 1 | Shah et al. (2024) [18] | Islamabad, Pakistan | Day-ahead | 5 | FAR, ARIMA, VAR | MAE, RMSE, MAPE |
| 2 | Selmy et al. (2024) [19] | Delhi | Day-ahead | 5 | ARIMA, SARIMA, LSTM, CNN-LSTM | MAE, RMSE, MAPE |
| 3 | Alexander et al. (2024) [20] | Jena, Germany | Day-ahead | 8 | ARIMA, WaveNet, LSTM | MAE, RMSE, MSE |
| 4 | Uluocak et al. (2024) [21] | Adana and Ankara, Türkiye | Day-ahead | 8 | GRU-CNN, LSTM-CNN, FNN, ANFIS, ARMA, GRU, LSTM, CNN | MAE, RMSE, R2, NSE |
| 5 | An et al. (2023) [22] | China | Day-ahead | 10 | MLR, SVR, GBRT, LSTM, MLP, GFS | MAE |
| 6 | Sen et al. (2023) [23] | Ottawa, Canada | Hour-ahead | 11 | ARIMA, ANN, LSTM, GRU, GA, DE, PSO | MAPE, MSE |
| 7 | Toharudin et al. (2023) [24] | Bandung, Indonesia | Day-ahead | 5.5 | LSTM, Facebook Prophet | RMSE |
| 8 | Elshewey et al. (2022) [25] | Delhi, India | Day-ahead | 5 | WD-SARIMAX | MAE, RMSE, R2, MAPE, MSE, MedAE |
| 9 | Gong et al. (2022) [26] | Europe | Hour-ahead | 13 | ConvLSTM, SAVP | MSC, ACC, SSIM, rG |
| 10 | Shin et al. (2022) [27] | South Korea | Day-ahead | 6 | ANN, DNN, ELM, LSTM, LSTM-PC | MAE, RMSE, R, Theil's U-statistic |
| 11 | Alomar et al. (2022) [28] | North America | Day-ahead | 22 | SVR, RT, QRT | MAE, RMSE, R, Theil's U-statistic |
| 12 | Haque et al. (2021) [29] | Beijing, China, Toronto, Las Vegas, Seattle, Dallas | Hour-ahead | 4 | SRN, GRU, LSTM, CNN, CNN-LSTM, GRU-LSTM | MAE, RMSE, R2 |
| 13 | Wang et al. (2021) [30] | Michigan, United States | 3 days ahead | 6 | Exponential Smoothing | RMSE |
| 14 | Lee et al. (2020) [31] | South Korea | Day- and hour-ahead | 10 | MLP, RNN, CNN | MAE |
| 15 | Zhang et al. (2020) [32] | Mainland China | Day-ahead | 67 | CRNN | MAE, RMSE |
| 16 | Shin et al. (2020) [33] | South Korea | Day-ahead | 3 | Hybrid (GloSea5GC2, RELM), Climatology Model | RMSE |
| 17 | Wanishsakpong et al. (2020) [34] | Ranong and Phuket, Thailand | Month-ahead | 10 | ARIMA, ARIMAX | RMSE, RRMSE |
| 18 | Zahroh et al. (2019) [35] | Bandung, Indonesia | Day-ahead | 5.5 | LSTM | MAPE |
Abbreviations: Vector Autoregressive (VAR), Gated Recurrent Unit (GRU), Wavelet Decomposition (WD), Stochastic Adversarial Video Prediction (SAVP), Anomaly Correlation Coefficient (ACC), Structural Similarity Index (SSIM), Gradient Ratio (rG), Support Vector Regression (SVR), Regression Tree (RT), Quantile Regression Tree (QRT), Multilayer Perceptron (MLP), Convolutional Neural Network (CNN), a global climate model (GloSea5GC2), Regularized Extreme Learning Machine (RELM), Genetic Algorithm (GA), Differential Evolution (DE), and Particle Swarm Optimization (PSO).
Table 2. Descriptive Summaries of Endogenous and Exogenous Variables.
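| Variables | Min. | Q1 | Median | Q3 | Max. | Mean | SD |
|---|---|---|---|---|---|---|---|
| Temperature (°C) | 9.48 | 22.98 | 27.03 | 29.7 | 39.57 | 26.2 | 4.95 |
| Wet-bulb temperature (°C) | 3.36 | 18.44 | 24.43 | 26.98 | 31.43 | 22.35 | 5.73 |
| Surface pressure (kPa) | 98.51 | 99.7 | 100.29 | 100.82 | 102 | 100.26 | 0.67 |
| Wind speed at 10 m (m/s) | 0.03 | 2.93 | 4.09 | 5.59 | 16.52 | 4.36 | 2.01 |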
Table 3. Optimal orders for various models.
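| ARIMA | NNAR | FAR(p, m) | FARX(p, m, τ) | FARX(p, m, g̲, τ) |
|---|---|---|---|---|
| p = 3 | d = 32 | p = 6 | p = 6 | p = 2 |
| d = 0 | q = 16 | m = 14 | m = 14 | m = 15 |
| q = 2 | | | τ = 3 | g1 = 15 |
| | | | | g2 = 15 |
| | | | | g3 = 9 |
| | | | | τ = 6 |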
ARIMA: p, autoregressive order; d, integration order; q, moving average order. NNAR: d, input neurons; q, hidden layers. Functional models: p, lagged value; m, dimensions of the functional response variable; τ, number of scalar covariates; g1, g2, and g3, dimensions of the first, second, and third functional covariates.
Table 4. One-day-ahead out-of-sample temperature forecasting performance for various models.
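| Models | RMSE | MAE | MAPE | MASE |
|---|---|---|---|---|
| ARIMA | 1.1021 | 0.6924 | 2.8604 | 1.1013 |
| NNAR | 0.9720 | 0.6725 | 2.7928 | 1.0697 |
| FAR(p, m) | 0.7887 | 0.5370 | 2.1412 | 0.8542 |
| FARX(p, m, τ) | 0.7873 | 0.5360 | 2.1378 | 0.8526 |
| FARX(p, m, g̲, τ) | 0.2942 | 0.2004 | 0.8213 | 0.3188 |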
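The four error summaries reported in Table 4 can be sketched as follows. This is a minimal illustration under stated assumptions: MAPE is expressed in percent, and MASE is scaled by the in-sample mean absolute error of a naive forecast with lag `season`. The exact scaling used in the article is not given in this section, so the `mase` helper and its `season` parameter are hypothetical.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def mase(actual, forecast, train, season=1):
    """Mean absolute scaled error.

    Assumption: the scale is the in-sample MAE of the naive forecast
    train[t - season]; the article may use a different benchmark.
    """
    naive_errors = [abs(train[t] - train[t - season]) for t in range(season, len(train))]
    scale = sum(naive_errors) / len(naive_errors)
    return mae(actual, forecast) / scale
```

Under this definition, a MASE below 1 means the model is more accurate on average than the naive benchmark.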
Table 5. p-values from the Diebold–Mariano test, where the null hypothesis assumes equal predictive accuracy between models, while the alternative suggests the column model outperforms the row model.
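| Models | ARIMA | NNAR | FAR(p, m) | FARX(p, m, τ) | FARX(p, m, g̲, τ) |
|---|---|---|---|---|---|
| ARIMA | - | <0.01 | <0.01 | <0.01 | <0.01 |
| NNAR | >0.99 | - | <0.01 | <0.01 | <0.01 |
| FAR(p, m) | >0.99 | >0.99 | - | <0.01 | <0.01 |
| FARX(p, m, τ) | >0.99 | >0.99 | >0.99 | - | <0.01 |
| FARX(p, m, g̲, τ) | >0.99 | >0.99 | >0.99 | >0.99 | - |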
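The one-sided comparison behind Table 5 can be sketched as below. This is a minimal illustration, assuming squared-error loss, a normal approximation for the test statistic, and a long-run variance built from autocovariances up to lag h - 1. The loss function and variance estimator used in the article are not stated in this section, and the function name `dm_test` is hypothetical.

```python
import math


def dm_test(errors_row, errors_col, h=1):
    """One-sided Diebold-Mariano test.

    H0: equal predictive accuracy.
    H1: the column model is more accurate than the row model.
    Returns (statistic, p_value) under the assumptions stated above.
    """
    # Loss differential: positive when the row model has the larger loss.
    d = [er ** 2 - ec ** 2 for er, ec in zip(errors_row, errors_col)]
    n = len(d)
    d_bar = sum(d) / n

    def autocov(k):
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    # For h = 1 this reduces to the sample variance of d.
    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, h))
    stat = d_bar / math.sqrt(long_run_var / n)
    # Upper-tail probability of a standard normal variable.
    p_value = 0.5 * math.erfc(stat / math.sqrt(2.0))
    return stat, p_value
```

Swapping the two arguments flips the sign of the statistic, so the two p-values sum to one. This is consistent with the mirrored pattern of values below 0.01 above the diagonal and above 0.99 below it.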
Table 6. Accuracy measures for one-day-ahead out-of-sample temperature forecasts for different seasons based on various models.
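| Seasons | Errors | ARIMA | NNAR | FAR(p, m) | FARX(p, m, τ) | FARX(p, m, g̲, τ) |
|---|---|---|---|---|---|---|
| Winter | RMSE | 1.2935 | 1.2562 | 0.9180 | 0.9204 | 0.3584 |
| | MAE | 0.9444 | 0.9645 | 0.6977 | 0.7008 | 0.2553 |
| | MAPE | 4.6811 | 4.8695 | 3.4099 | 3.4241 | 1.2796 |
| | MASE | 0.9805 | 1.0014 | 0.7244 | 0.7276 | 0.2651 |
| Spring | RMSE | 1.4741 | 1.1767 | 0.9757 | 0.9791 | 0.3604 |
| | MAE | 0.8976 | 0.8576 | 0.6904 | 0.6904 | 0.2509 |
| | MAPE | 3.6264 | 3.3937 | 2.6307 | 2.6418 | 0.9884 |
| | MASE | 0.9770 | 0.9334 | 0.7514 | 0.7554 | 0.2730 |
| Summer | RMSE | 0.7738 | 0.7140 | 0.6650 | 0.6640 | 0.2237 |
| | MAE | 0.4985 | 0.4667 | 0.4146 | 0.4159 | 0.1550 |
| | MAPE | 1.6006 | 1.4836 | 1.3111 | 1.3150 | 0.5073 |
| | MASE | 0.9778 | 0.9154 | 0.8133 | 0.8157 | 0.3041 |
| Autumn | RMSE | 0.6402 | 0.5542 | 0.4905 | 0.4898 | 0.1941 |
| | MAE | 0.4289 | 0.4015 | 0.3409 | 0.3370 | 0.1404 |
| | MAPE | 1.5389 | 1.4323 | 1.2031 | 1.1875 | 0.5114 |
| | MASE | 1.0445 | 0.9778 | 0.8302 | 0.8206 | 0.3419 |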
Table 7. One-day-ahead out-of-sample temperature forecasts: month-specific forecasting errors.
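| Months | Errors | ARIMA | NNAR | FAR(p, m) | FARX(p, m, τ) | FARX(p, m, g̲, τ) |
|---|---|---|---|---|---|---|
| January | RMSE | 1.0441 | 1.1219 | 0.8484 | 0.8507 | 0.2912 |
| | MAE | 0.7858 | 0.8907 | 0.6552 | 0.6560 | 0.2120 |
| | MAPE | 3.9065 | 4.4024 | 3.1678 | 3.1675 | 1.0754 |
| | MASE | 0.9319 | 1.0563 | 0.7770 | 0.7780 | 0.2514 |
| February | RMSE | 1.2813 | 1.2888 | 0.9585 | 0.9609 | 0.4120 |
| | MAE | 0.9975 | 0.9998 | 0.7142 | 0.7171 | 0.2960 |
| | MAPE | 4.7546 | 4.7935 | 3.3520 | 3.3663 | 1.4048 |
| | MASE | 0.9573 | 0.9595 | 0.6855 | 0.6882 | 0.2841 |
| March | RMSE | 2.1264 | 1.5213 | 1.1283 | 1.1315 | 0.4168 |
| | MAE | 1.1900 | 1.1580 | 0.8042 | 0.8086 | 0.2863 |
| | MAPE | 5.7452 | 5.3436 | 3.5958 | 3.6115 | 1.3095 |
| | MASE | 1.0631 | 1.0345 | 0.7184 | 0.7224 | 0.2558 |
| April | RMSE | 0.9866 | 0.9736 | 0.9405 | 0.9448 | 0.3628 |
| | MAE | 0.7717 | 0.7549 | 0.7040 | 0.7061 | 0.2664 |
| | MAPE | 2.7698 | 2.7160 | 2.4912 | 2.4962 | 0.9867 |
| | MASE | 0.8584 | 0.8397 | 0.7831 | 0.7854 | 0.2963 |
| May | RMSE | 0.9928 | 0.9370 | 0.8346 | 0.8373 | 0.2904 |
| | MAE | 0.7271 | 0.6565 | 0.5634 | 0.5678 | 0.2004 |
| | MAPE | 2.3367 | 2.0995 | 1.8007 | 1.8128 | 0.6689 |
| | MASE | 1.0272 | 0.9274 | 0.7960 | 0.8022 | 0.2831 |
| June | RMSE | 0.7753 | 0.6476 | 0.5685 | 0.5689 | 0.1752 |
| | MAE | 0.4757 | 0.4392 | 0.3702 | 0.3721 | 0.1269 |
| | MAPE | 1.4929 | 1.3466 | 1.1349 | 1.1408 | 0.4081 |
| | MASE | 1.0418 | 0.9619 | 0.8107 | 0.8150 | 0.2779 |
| July | RMSE | 0.8731 | 0.8616 | 0.8178 | 0.8156 | 0.2552 |
| | MAE | 0.5706 | 0.5588 | 0.5031 | 0.5042 | 0.1814 |
| | MAPE | 1.7669 | 1.7291 | 1.5459 | 1.5497 | 0.5746 |
| | MASE | 0.8855 | 0.8671 | 0.7807 | 0.7823 | 0.2815 |
| August | RMSE | 0.6580 | 0.6037 | 0.5751 | 0.5747 | 0.2318 |
| | MAE | 0.4483 | 0.4011 | 0.3692 | 0.3699 | 0.1558 |
| | MAPE | 1.5386 | 1.3706 | 1.2467 | 1.2490 | 0.5359 |
| | MASE | 0.9449 | 0.8453 | 0.7781 | 0.7797 | 0.3284 |
| September | RMSE | 0.6907 | 0.4020 | 0.3780 | 0.3583 | 0.1746 |
| | MAE | 0.3753 | 0.3094 | 0.2684 | 0.2551 | 0.1270 |
| | MAPE | 1.3450 | 1.0949 | 0.9423 | 0.8945 | 0.4570 |
| | MASE | 1.2614 | 1.0398 | 0.9021 | 0.8573 | 0.4269 |
| October | RMSE | 0.6771 | 0.7008 | 0.6261 | 0.6314 | 0.2133 |
| | MAE | 0.4800 | 0.5043 | 0.4329 | 0.4337 | 0.1550 |
| | MAPE | 1.6210 | 1.7116 | 1.4546 | 1.4556 | 0.5416 |
| | MASE | 0.9315 | 0.9786 | 0.8401 | 0.8417 | 0.3007 |
| November | RMSE | 0.5408 | 0.5123 | 0.4266 | 0.4330 | 0.1918 |
| | MAE | 0.4297 | 0.3875 | 0.3184 | 0.3189 | 0.1388 |
| | MAPE | 1.6481 | 1.4811 | 1.2039 | 1.2034 | 0.5346 |
| | MASE | 0.9343 | 0.8425 | 0.6923 | 0.6932 | 0.3017 |
| December | RMSE | 1.5119 | 1.3489 | 0.9460 | 0.9484 | 0.3653 |
| | MAE | 1.0534 | 1.0052 | 0.7248 | 0.7303 | 0.2605 |
| | MAPE | 5.3870 | 5.4079 | 3.7061 | 3.7348 | 1.3666 |
| | MASE | 0.9921 | 0.9467 | 0.6826 | 0.6878 | 0.2454 |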
Table 8. Hour-specific one-day-ahead out-of-sample forecasting errors for temperature.
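| Hours | Errors | ARIMA | NNAR | FAR(p, m) | FARX(p, m, τ) | FARX(p, m, g̲, τ) |
|---|---|---|---|---|---|---|
| 1 | RMSE | 0.9675 | 0.8736 | 0.2658 | 0.2674 | 0.4146 |
| | MAE | 0.5696 | 0.5823 | 0.1888 | 0.1880 | 0.2831 |
| | MAPE | 2.7269 | 2.8245 | 0.8946 | 0.8962 | 1.3748 |
| | MASE | 1.0825 | 1.1065 | 0.3588 | 0.3572 | 0.5380 |
| 2 | RMSE | 0.9744 | 0.9191 | 0.2568 | 0.2579 | 0.1878 |
| | MAE | 0.5856 | 0.6094 | 0.1754 | 0.1758 | 0.1331 |
| | MAPE | 2.8407 | 3.0073 | 0.8257 | 0.8282 | 0.6338 |
| | MASE | 1.0766 | 1.1204 | 0.3225 | 0.3232 | 0.2447 |
| 3 | RMSE | 0.9922 | 0.9248 | 0.3398 | 0.3397 | 0.1589 |
| | MAE | 0.6025 | 0.6073 | 0.2371 | 0.2371 | 0.1164 |
| | MAPE | 2.9540 | 3.0337 | 1.1291 | 1.1297 | 0.5349 |
| | MASE | 1.0776 | 1.0863 | 0.4240 | 0.4242 | 0.2082 |
| 4 | RMSE | 1.0209 | 0.9145 | 0.4320 | 0.4329 | 0.1756 |
| | MAE | 0.6276 | 0.6153 | 0.3169 | 0.3185 | 0.1300 |
| | MAPE | 3.1106 | 3.0861 | 1.5174 | 1.5240 | 0.5956 |
| | MASE | 1.0829 | 1.0618 | 0.5468 | 0.5496 | 0.2244 |
| 5 | RMSE | 1.0489 | 0.9513 | 0.5264 | 0.5280 | 0.1861 |
| | MAE | 0.6481 | 0.6405 | 0.3779 | 0.3808 | 0.1219 |
| | MAPE | 3.2162 | 3.2141 | 1.8461 | 1.8565 | 0.5786 |
| | MASE | 1.0642 | 1.0517 | 0.6206 | 0.6253 | 0.2002 |
| 6 | RMSE | 1.0852 | 0.9843 | 0.6217 | 0.6200 | 0.2210 |
| | MAE | 0.6764 | 0.6661 | 0.4279 | 0.4273 | 0.1560 |
| | MAPE | 3.3638 | 3.3555 | 2.1425 | 2.1370 | 0.7077 |
| | MASE | 1.0962 | 1.0794 | 0.6935 | 0.6925 | 0.2528 |
| 7 | RMSE | 1.1272 | 1.0036 | 0.6943 | 0.6912 | 0.3283 |
| | MAE | 0.6861 | 0.6715 | 0.4890 | 0.4875 | 0.2428 |
| | MAPE | 3.3010 | 3.2858 | 2.3783 | 2.3639 | 1.1503 |
| | MASE | 1.1042 | 1.0806 | 0.7869 | 0.7845 | 0.3907 |
| 8 | RMSE | 1.1043 | 0.9855 | 0.6914 | 0.6903 | 0.3360 |
| | MAE | 0.6846 | 0.6732 | 0.5061 | 0.5063 | 0.2499 |
| | MAPE | 3.0044 | 2.9736 | 2.1990 | 2.1960 | 1.0419 |
| | MASE | 1.0515 | 1.0340 | 0.7773 | 0.7776 | 0.3838 |
| 9 | RMSE | 1.1115 | 0.9322 | 0.7256 | 0.7238 | 0.3447 |
| | MAE | 0.7109 | 0.6463 | 0.5318 | 0.5309 | 0.2600 |
| | MAPE | 2.8150 | 2.5586 | 2.0833 | 2.0763 | 0.9898 |
| | MASE | 1.0396 | 0.9451 | 0.7777 | 0.7765 | 0.3802 |
| 10 | RMSE | 1.1806 | 1.0114 | 0.8336 | 0.8343 | 0.3465 |
| | MAE | 0.7737 | 0.7318 | 0.6075 | 0.6097 | 0.2575 |
| | MAPE | 2.8290 | 2.6613 | 2.1932 | 2.1996 | 0.9148 |
| | MASE | 1.0249 | 0.9694 | 0.8048 | 0.8077 | 0.3411 |
| 11 | RMSE | 1.2384 | 1.0871 | 0.9507 | 0.9506 | 0.3125 |
| | MAE | 0.8267 | 0.7905 | 0.6877 | 0.6876 | 0.2155 |
| | MAPE | 2.8801 | 2.7358 | 2.3755 | 2.3735 | 0.7479 |
| | MASE | 1.0112 | 0.9669 | 0.8412 | 0.8410 | 0.2636 |
| 12 | RMSE | 1.2481 | 1.0956 | 0.9987 | 0.9993 | 0.2212 |
| | MAE | 0.8379 | 0.7981 | 0.7240 | 0.7250 | 0.1491 |
| | MAPE | 2.8288 | 2.6782 | 2.4242 | 2.4265 | 0.5102 |
| | MASE | 1.0167 | 0.9684 | 0.8785 | 0.8797 | 0.1809 |
| 13 | RMSE | 1.2435 | 1.0984 | 1.0217 | 1.0236 | 0.1243 |
| | MAE | 0.8210 | 0.7966 | 0.7409 | 0.7421 | 0.0799 |
| | MAPE | 2.7335 | 2.6322 | 2.4490 | 2.4526 | 0.2726 |
| | MASE | 1.0176 | 0.9874 | 0.9183 | 0.9198 | 0.0990 |
| 14 | RMSE | 1.2470 | 1.1036 | 1.0463 | 1.0478 | 0.1691 |
| | MAE | 0.8251 | 0.8044 | 0.7542 | 0.7552 | 0.0968 |
| | MAPE | 2.7504 | 2.6557 | 2.4991 | 2.5029 | 0.3244 |
| | MASE | 1.0297 | 1.0038 | 0.9412 | 0.9424 | 0.1208 |
| 15 | RMSE | 1.2243 | 1.0939 | 1.0443 | 1.0465 | 0.2320 |
| | MAE | 0.8207 | 0.8094 | 0.7600 | 0.7603 | 0.1475 |
| | MAPE | 2.7664 | 2.7094 | 2.5471 | 2.5487 | 0.4940 |
| | MASE | 1.0366 | 1.0223 | 0.9598 | 0.9603 | 0.1863 |
| 16 | RMSE | 1.1947 | 1.0533 | 1.0437 | 1.0482 | 0.3303 |
| | MAE | 0.8025 | 0.7845 | 0.7741 | 0.7802 | 0.2495 |
| | MAPE | 2.7748 | 2.6959 | 2.6526 | 2.6740 | 0.8707 |
| | MASE | 1.0266 | 1.0035 | 0.9902 | 0.9981 | 0.3192 |
| 17 | RMSE | 1.1610 | 1.0083 | 0.9853 | 0.9893 | 0.3261 |
| | MAE | 0.7688 | 0.7536 | 0.7234 | 0.7273 | 0.2512 |
| | MAPE | 2.8123 | 2.7290 | 2.6079 | 2.6220 | 0.9071 |
| | MASE | 1.0285 | 1.0082 | 0.9678 | 0.9730 | 0.3360 |
| 18 | RMSE | 1.1184 | 0.9714 | 0.9231 | 0.9235 | 0.3684 |
| | MAE | 0.7344 | 0.6892 | 0.6854 | 0.6830 | 0.2882 |
| | MAPE | 2.9013 | 2.7089 | 2.6769 | 2.6703 | 1.1229 |
| | MASE | 1.0239 | 0.9608 | 0.9555 | 0.9521 | 0.4018 |
| 19 | RMSE | 1.0298 | 0.8876 | 0.8178 | 0.8187 | 0.3043 |
| | MAE | 0.6512 | 0.6000 | 0.5840 | 0.5814 | 0.2206 |
| | MAPE | 2.6950 | 2.4874 | 2.3813 | 2.3740 | 0.8757 |
| | MASE | 1.0187 | 0.9386 | 0.9135 | 0.9095 | 0.3451 |
| 20 | RMSE | 0.9974 | 0.8428 | 0.7720 | 0.7759 | 0.2804 |
| | MAE | 0.6129 | 0.5645 | 0.5307 | 0.5320 | 0.1947 |
| | MAPE | 2.6254 | 2.4273 | 2.2383 | 2.2460 | 0.7976 |
| | MASE | 1.0188 | 0.9383 | 0.8822 | 0.8844 | 0.3236 |
| 21 | RMSE | 0.9972 | 0.8712 | 0.7481 | 0.7525 | 0.2687 |
| | MAE | 0.6028 | 0.5704 | 0.5149 | 0.5166 | 0.1972 |
| | MAPE | 2.6531 | 2.5108 | 2.2339 | 2.2426 | 0.8209 |
| | MASE | 1.0454 | 0.9892 | 0.8928 | 0.8959 | 0.3420 |
| 22 | RMSE | 1.0114 | 0.8914 | 0.7448 | 0.7491 | 0.2754 |
| | MAE | 0.5939 | 0.5802 | 0.5112 | 0.5136 | 0.2034 |
| | MAPE | 2.6777 | 2.6370 | 2.2792 | 2.2900 | 0.8583 |
| | MASE | 1.0446 | 1.0205 | 0.8993 | 0.9035 | 0.3577 |
| 23 | RMSE | 1.0097 | 0.8585 | 0.7501 | 0.7538 | 0.3214 |
| | MAE | 0.5813 | 0.5708 | 0.5049 | 0.5086 | 0.2268 |
| | MAPE | 2.6842 | 2.6486 | 2.3189 | 2.3334 | 1.0017 |
| | MASE | 1.0545 | 1.0355 | 0.9160 | 0.9227 | 0.4114 |
| 24 | RMSE | 1.0165 | 0.8786 | 0.7799 | 0.7820 | 0.5023 |
| | MAE | 0.5731 | 0.5845 | 0.5103 | 0.5134 | 0.3389 |
| | MAPE | 2.7055 | 2.7713 | 2.4133 | 2.4251 | 1.5841 |
| | MASE | 1.0745 | 1.0958 | 0.9568 | 0.9626 | 0.6354 |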
Table 9. Average time in seconds for one-day-ahead forecast of temperature data.
| Average Time | ARIMA | NNAR | FAR(p, m) | FARX(p, m, τ) | FARX(p, m, g̲, τ) |
|---|---|---|---|---|---|
| Time (s) | 0.33 | 1.05 | 1.14 | 1.17 | 2.17 |
| Relative Time | 1 | 3.18 | 3.45 | 3.55 | 6.58 |
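The relative times in Table 9 are consistent with each average time divided by the ARIMA time. The short check below reproduces that arithmetic from the reported values. Treating ARIMA as the baseline is inferred from its relative time of 1, and the dictionary keys are plain-text labels chosen for this example.

```python
# Average seconds per one-day-ahead forecast, as reported in Table 9.
avg_time = {
    "ARIMA": 0.33,
    "NNAR": 1.05,
    "FAR(p, m)": 1.14,
    "FARX(p, m, tau)": 1.17,
    "FARX(p, m, g, tau)": 2.17,
}

# Assumption: ARIMA is the baseline, since its relative time is 1.
baseline = avg_time["ARIMA"]
relative_time = {model: round(t / baseline, 2) for model, t in avg_time.items()}

for model, ratio in relative_time.items():
    print(f"{model}: {ratio}")
```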
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Shah, I.; Uzair, M.; Ali, S.; Aljeddani, S.M. Uncertainty Quantification of Complex Weather Dynamics Using a Novel Functional Autoregressive Model. Mathematics 2026, 14, 835. https://doi.org/10.3390/math14050835