Article

Varying-Coefficient Additive Models with Density Responses and Functional Auto-Regressive Error Process

1 Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
2 School of Statistics and Data Science, Shanghai University of Finance and Economics, Shanghai 200433, China
3 Department of Mathematics and Statistics, McMaster University, Hamilton, ON L8S 4L8, Canada
* Author to whom correspondence should be addressed.
Entropy 2025, 27(8), 882; https://doi.org/10.3390/e27080882
Submission received: 11 July 2025 / Revised: 15 August 2025 / Accepted: 19 August 2025 / Published: 20 August 2025
(This article belongs to the Section Information Theory, Probability and Statistics)

Abstract

In many practical applications, data collected over time often exhibit autocorrelation, which, if unaccounted for, can lead to biased or misleading statistical inferences. To address this issue, we propose a varying-coefficient additive model for density-valued responses, incorporating a functional auto-regressive (FAR) error process to capture serial dependence. Our estimation procedure consists of three main steps, utilizing spline-based methods after mapping density functions into a linear space via the log-quantile density transformation. First, we obtain initial estimates of the bivariate varying-coefficient functions using a B-spline series approximation. Second, we estimate the error process from the residuals using spline smoothing techniques. Finally, we refine the estimates of the additive components by adjusting for the estimated error process. We establish theoretical properties of the proposed method, including convergence rates and asymptotic behavior. The effectiveness of our approach is further demonstrated through simulation studies and applications to real-world data.

1. Introduction

Density data or, more broadly, distributional data, are increasingly encountered across a wide range of scientific and applied research domains. Notable examples include the distributions of cross-sectional or intraday stock returns [1,2], mortality rate densities [3], and the distributions of intrahub connectivity in neuroimaging studies [4,5]. In many settings, such density functions are observed sequentially over time, forming what we refer to in this paper as a density time series. A motivating example is presented in Figure 1. Panel (a) displays a density time series of the global COVID-19 mortality rate (‰) over a 100-day period, based on data collected from 22 January 2020 to 15 April 2021. Panel (b) offers a complementary view by plotting the densities on three selected days, highlighting the temporal evolution of the distributional patterns. In this work, we study a regression framework in which the response consists of a density time series, while the predictors are scalar covariates. This setting allows for the exploration of how scalar factors influence the dynamic evolution of entire distributions over time.
Unlike conventional functional data, density functions do not form a linear space due to inherent constraints, such as nonnegativity and the requirement that they integrate to one. These restrictions pose significant challenges for directly applying standard functional data analysis techniques to random densities. To address this, several approaches have been proposed, which can be broadly grouped into two categories. The first approach involves transforming densities into a Hilbert space through suitable continuous and invertible mappings, thereby overcoming the nonlinear structure of the density space. For example, Petersen and Müller [6] introduced two such transformations, the log-hazard transformation and the log-quantile density (LQD) transformation, that map probability densities to an unrestricted space of square-integrable functions. Building on this, Han et al. [7] employed the LQD transformation to model density responses within an additive functional-to-scalar regression framework. Similarly, Kokoszka et al. [1] developed two methods for forecasting density functions derived from cross-sectional and intraday financial returns, using compositional data analysis and a modified log-quantile transformation combined with functional principal component (FPC) analysis and exponential smoothing techniques. The second category of methods takes a geometric perspective by defining appropriate metrics on the space of probability distributions. For instance, Talská et al. [8] used an infinite-dimensional extension of Aitchison geometry to construct a density-on-scalar linear regression model within Bayes-Hilbert spaces. Meanwhile, Petersen and Müller [3] studied Fréchet regression in general metric spaces equipped with the Wasserstein metric. Extending this line of work, Chen et al. [9] leveraged the geometry of tangent bundles in Wasserstein space to propose distribution-on-distribution regression models and developed auto-regressive extensions for distribution-valued time series. Additionally, Zhang et al. [10] explored auto-regressive models of order p for density-valued time series using the Wasserstein metric through a different methodological framework.
Let $\mathcal{F}$ denote the space of density functions $f$ defined on a common support $\mathcal{U}$. Without loss of generality, we assume that $\mathcal{U} = [0,1]$. Given a transformation $\Psi: \mathcal{F} \to L^2$, the conditional Fréchet mean of a random density $f$, given a covariate $X \in \mathbb{R}^d$, is defined as
$$\mu(\cdot \mid X) = \underset{d \in \mathcal{F}}{\arg\min}\; E\left( \|\Psi(f) - \Psi(d)\|_2^2 \mid X \right),$$
where the expectation $E$ is taken with respect to the joint distribution of $(X, f)$.
This is equivalent to the following formulation:
$$\Psi(\mu(\cdot \mid X))(u) = E\left( \Psi(f)(u) \mid X \right), \quad 0 \le u \le 1,$$
leading to the fact that
$$\mu(s \mid X) = \Psi^{-1}\left( E\left( \Psi(f)(u) \mid X \right) \right)(s), \quad 0 \le s \le 1.$$
The data considered in this article consist of a density time series $d_t$, observed sequentially over time, along with associated scalar predictors $(X_t, Z_t)$. To facilitate the analysis of density functions, we employ the LQD transformation $\Psi: \mathcal{F} \to L^2$, where $\mathcal{F}$ denotes the space of density functions $d$ satisfying the moment condition $\int_{\mathbb{R}} u^2 d(u)\,du < \infty$. For each $d_t \in \mathcal{F}$, let $F_t(y)$ be the corresponding cumulative distribution function with support on $[0,1]$, and let $Q_t(u)$ denote the associated quantile function. The quantile density function is given by $q_t(u) = Q_t'(u) = \frac{d}{du} F_t^{-1}(u)$ for $u \in [0,1]$. Then, the LQD transformation of $d_t$ is defined as
$$\Psi(d_t)(u) = \log\left\{ \frac{d}{du} F_t^{-1}(u) \right\}, \quad u \in [0,1].$$
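To make the transformation concrete, the following minimal sketch (ours, not the authors' implementation) computes the LQD representation of a density tabulated on a grid, using the identity $\frac{d}{du}F^{-1}(u) = 1/d(Q(u))$; the grid sizes, interpolation scheme, and clipping threshold are illustrative assumptions.

```python
import numpy as np

def lqd_transform(density, grid, n_out=101):
    # Map a density tabulated on `grid` to its LQD representation
    # Psi(d)(u) = log q(u) = -log d(Q(u)), using dF^{-1}/du = 1/d(Q(u)).
    incr = 0.5 * (density[1:] + density[:-1]) * np.diff(grid)
    cdf = np.r_[0.0, np.cumsum(incr)]
    cdf /= cdf[-1]                          # enforce F(1) = 1 numerically
    u = np.linspace(0.0, 1.0, n_out)
    Q = np.interp(u, cdf, grid)             # quantile function Q(u)
    d_at_Q = np.interp(Q, grid, density)    # d(Q(u))
    return -np.log(np.clip(d_at_Q, 1e-10, None))
```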
In this study, we propose a varying-coefficient additive model with a functional auto-regressive error process to estimate the conditional expectation E ( Ψ ( d t ) | X t , Z t ) . Under this framework, the density function d t can be expressed as
$$d_t = \Psi^{-1}\left( E(\Psi(d_t) \mid X_t, Z_t) \right) + \delta_{t1},$$
where $\delta_{t1}$ represents the regression error.
In addition to $\delta_{t1}$, a second source of error commonly arises from the estimation of the density function $d_t$. Specifically, in most practical settings, the density $d_t$ is not directly observed. Instead, only a finite sample $Y_{t1}, \ldots, Y_{tn_t} \sim d_t$ is available at each time point $t$, leading to an estimated density $\hat{d}_t$ given by
$$\hat{d}_t = d_t + \delta_{t2},$$
where $\delta_{t2}$ denotes the error due to density estimation. Throughout this article, we assume that the sample size $n_t = n$ is fixed across time.
Following the approach of [6], we estimate $d_t$ using a modified kernel density estimator that addresses boundary effects. The estimator is defined as
$$\hat{d}_t(y) = \sum_{i=1}^{n} K\left( \frac{y - Y_{ti}}{h} \right) w(y, h) \Big/ \sum_{i=1}^{n} \int_0^1 K\left( \frac{s - Y_{ti}}{h} \right) w(s, h)\, ds,$$
where $K$ is a symmetric kernel function with bandwidth $h < 1/2$ and the weight function $w(y, h)$ is designed to correct for boundary bias. Specifically, $w(y, h)$ is given by
$$w(y,h) = \left( \int_{-y/h}^{1} K(u)\,du \right)^{-1} I\{y \in [0,h)\} + \left( \int_{-1}^{(1-y)/h} K(u)\,du \right)^{-1} I\{y \in (1-h,1]\} + I\{y \in [h, 1-h]\}.$$
We assume that the kernel $K$ is of bounded variation, symmetric about 0, and satisfies the following conditions: $\int_0^1 K(u)\,du > 0$; $\int_{\mathbb{R}} |u| K(u)\,du$, $\int_{\mathbb{R}} K^2(u)\,du$, and $\int_{\mathbb{R}} |u| K^2(u)\,du$ are finite. Therefore, when fitting the regression model using the estimated density $\hat{d}_t$ in place of the true $d_t$, the model can be written as
$$\hat{d}_t = \Psi^{-1}\left( E(\Psi(d_t) \mid X_t, Z_t) \right) + \delta_{t1} + \delta_{t2}.$$
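For readers who want to reproduce the preprocessing, here is a minimal sketch of the boundary-corrected estimator above; the Epanechnikov kernel and the numerical integration rule are our assumptions, not choices made in the paper.

```python
import numpy as np
from scipy.integrate import quad

def K(u):
    # Epanechnikov kernel: symmetric, bounded variation, support [-1, 1]
    return 0.75 * max(1.0 - u * u, 0.0)

def w(y, h):
    # Boundary weight: inverse of the kernel mass retained inside [0, 1]
    if y < h:                                  # left boundary region
        mass, _ = quad(K, -y / h, 1.0)
        return 1.0 / mass
    if y > 1.0 - h:                            # right boundary region
        mass, _ = quad(K, -1.0, (1.0 - y) / h)
        return 1.0 / mass
    return 1.0                                 # interior: no correction

def kde_boundary(sample, y_grid, h, n_int=401):
    # Numerator: kernel-weight products; the denominator normalizes the
    # estimate to integrate to one over [0, 1].
    s_grid = np.linspace(0.0, 1.0, n_int)
    num = np.zeros(len(y_grid))
    denom = 0.0
    for Y in sample:
        num += np.array([K((y - Y) / h) * w(y, h) for y in y_grid])
        vals = np.array([K((s - Y) / h) * w(s, h) for s in s_grid])
        denom += np.trapz(vals, s_grid)
    return num / denom
```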
The key contribution of this article lies in the integration of density time series modeling with a functional auto-regressive (FAR) error process, a direction that has not been previously studied in the literature. A common assumption in regression analysis, including functional and density-based models, is the independence of random errors; however, this assumption is often violated in time-indexed data, where observations naturally exhibit serial dependence. By explicitly incorporating a FAR(1) structure into the error process, our approach effectively captures the temporal correlation inherent in density time series, thereby enhancing both the flexibility and accuracy of the model.
Many real-world phenomena are characterized by time-evolving densities that exhibit strong temporal dependencies [11,12,13,14]. In these settings, the observed distribution at a given time point is not independent of previous distributions, but rather influenced by them through complex temporal dynamics. This phenomenon, commonly referred to as serial dependence, cannot be adequately modeled under the assumption of independent errors. Ignoring such dependencies often leads to biased estimation, underestimated variability, and invalid inference, as documented in numerous empirical and theoretical studies.
While FAR models have been widely explored as standalone tools for modeling functional time series data [11,12,13,14,15,16,17], their use as an error structure within a regression model for density-valued responses remains largely underdeveloped. This article addresses this methodological gap by embedding a FAR(1) process into the error term of a varying-coefficient additive regression framework tailored to density time series. This novel integration enables the model to more faithfully capture both the structured signal and the dynamic residual behavior present in such data.
In addition to this modeling innovation, we make several theoretical contributions. Specifically, we develop a new estimation procedure that accommodates both the infinite-dimensional nature of the response and the temporal dependence in the errors. We further derive the asymptotic normality of the proposed estimator, which requires nontrivial extensions of existing techniques in functional data analysis and Hilbert space theory. This allows for valid statistical inference and construction of confidence intervals in practice.
In summary, this work contributes a new class of models for density-valued time series with auto-regressive error dynamics, bridging gaps between functional time series, density regression, and auto-regressive modeling. The proposed framework provides both a theoretical foundation and a practical tool for analyzing complex time-evolving distributional data in a wide range of applications.
The remainder of this article is organized as follows: Section 2 presents the methodology for constructing a varying-coefficient additive model with a density response, incorporating a functional auto-regressive (FAR) error process. In Section 3, we propose a three-step estimation procedure for the bivariate varying-coefficient components within the model. Section 4 establishes the theoretical properties of the proposed model and discusses related inferential results. Section 5 reports Monte Carlo simulation studies that evaluate the efficiency and robustness of our approach. In Section 6, we demonstrate the practical utility of the model through applications to COVID-19 mortality data and U.S. income distribution data. Finally, Section 7 offers concluding remarks, and the Supplementary Material contains detailed proofs of the theoretical results.

2. Model Setup

In this article, we focus on modeling density responses. Due to the inherent constraints of density functions, namely nonnegativity and integration to one, we work with their representations after applying the log-quantile density (LQD) transformation.
Our primary goal is to estimate the conditional expectation $E(\Psi(d_t)(u) \mid x)$ through the transformed density functions, expressed as
$$E(\Psi(d_t)(u) \mid x) = \sum_{m=1}^{k} z_{t,m}\, g_m(u, x_{t,m}), \quad 0 \le u \le 1,$$
which leads to the proposed varying-coefficient additive model with density responses and functional auto-regressive error process FAR(p) (DVCA-FAR):
$$f_t(u) = \Psi(d_t)(u) = \sum_{m=1}^{k} z_{t,m}\, g_m(u, x_{t,m}) + \varepsilon_t(u), \quad 0 \le u \le 1,\ 1 \le t \le T,$$
where the error process $\varepsilon_t(u)$ follows a functional auto-regressive process of order $p$:
$$\varepsilon_t(u) = \int \gamma_1(s,u)\, \varepsilon_{t-1}(s)\, ds + \cdots + \int \gamma_p(s,u)\, \varepsilon_{t-p}(s)\, ds + e_t(u).$$
In this framework, the random density $d_t(\cdot) \in \mathcal{F}$ serves as the response variable, and $\Psi: \mathcal{F} \to L^2$ denotes the LQD transformation. Each density is associated with two sets of $k$-dimensional covariates, $\mathbf{x}_t = (x_{t,1}, \ldots, x_{t,k})^\tau$ and $\mathbf{z}_t = (z_{t,1}, \ldots, z_{t,k})^\tau$, with supports $S_x$ and $S_z$, respectively. Without loss of generality, we assume $S_x = S_z = [0,1]$. In this article, the covariate $\mathbf{x}_t$ can represent $\mathbf{z}_t$ or the rescaled time index $t/T$.
The bivariate functions $g_m(\cdot, x_m)$ capture the effects of the covariates $\mathbf{z}$, while the kernel functions $\gamma_l(\cdot, \cdot)$ are smooth and satisfy the integrability condition $\int\!\!\int \gamma_l^2(s,u)\, du\, ds < \infty$. The innovation process $e_t(u)$ consists of independent and identically distributed random functions with zero mean $E(e_t(u)) = 0$ and covariance function $\mathrm{Cov}(e_t(u), e_t(s)) = \sigma_t^2(u,s)$.
When the density functions are estimated, denoted by $\hat{d}_t$, we write $\hat{f}_t = \Psi(\hat{d}_t)$. The DVCA-FAR model then takes the form
$$\hat{f}_t(u) = \sum_{m=1}^{k} z_{t,m}\, g_m(u, x_{t,m}) + \varepsilon_t(u) + \varepsilon_{ft}(u), \quad 0 \le u \le 1,$$
where $\varepsilon_{ft}(u)$ represents the additional random error introduced by transforming the estimated density.
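As an illustration of the error model, the following sketch simulates a FAR(1) error process on a grid by approximating the kernel integral with a Riemann sum; the kernel, the simplified one-term innovation law, and the burn-in length are illustrative assumptions (the kernel shown matches the Case 1 design of Section 5).

```python
import numpy as np

def simulate_far1_errors(T, grid, gamma, innovation, burn_in=50):
    # eps_t(u) = int gamma(s, u) eps_{t-1}(s) ds + e_t(u), with the
    # integral approximated by a Riemann sum on `grid`.
    du = grid[1] - grid[0]
    G = gamma(grid[:, None], grid[None, :])    # G[i, j] = gamma(s_i, u_j)
    eps, path = np.zeros(len(grid)), []
    for t in range(T + burn_in):
        eps = eps @ G * du + innovation(grid)
        if t >= burn_in:
            path.append(eps.copy())
    return np.asarray(path)                    # (T, len(grid)) error curves

# Example with the Case 1 kernel gamma_1(s, u) = 0.2*u*s from Section 5
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 101)
innov = lambda g: 0.02 * rng.standard_normal() * np.sin(np.pi * g)
errors = simulate_far1_errors(100, grid, lambda s, u: 0.2 * u * s, innov)
```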

3. Three-Step Estimation Methodology

We propose a three-step estimation procedure to estimate the varying-coefficient functions in the presence of a functional auto-regressive error structure. In the first step, we apply B-spline smoothing to obtain initial estimates of the bivariate varying-coefficient functions, ignoring the temporal dependence in the error process. In the second step, using the initial estimates and the transformed response, we estimate the error component. Specifically, the order and structure of the functional auto-regressive (FAR) process are determined using the sequential testing procedure proposed by Kokoszka and Reimherr [16]. In the final step, after removing the estimated FAR error from the response, we refine the estimation of the varying-coefficient functions using the spline-based method to obtain improved results.

3.1. Initial Estimation of Bivariate Varying-Coefficient Function

To begin, we estimate the bivariate varying-coefficient functions $g_m(u, x_m)$, for $m = 1, \ldots, k$, by applying a tensor product B-spline approximation, ignoring the temporal structure in the error term.
Let $\{B_0(u), \ldots, B_{N_0}(u)\}$ denote a set of B-spline basis functions of order $q$ with $L_0$ interior knots, defined on the domain $u \in [0,1]$, so that $N_0 + 1 = L_0 + q$. Similarly, for each $m = 1, \ldots, k$, let $\{B_{0,m}(x_m), \ldots, B_{N_m,m}(x_m)\}$ be a set of B-spline basis functions of order $q$ for the covariate $x_m$, with $L_m$ interior knots, so that $N_m + 1 = L_m + q$. Let $b_{j,m}^*(x_m)$ denote the normalized version of the B-spline basis function $B_{j,m}(x_m)$, and define the scaled basis $b_r(u) = N_0^{1/2} B_r(u)$.
The tensor product of the B-spline basis functions is given by
$$b_{r,j,m}(u, x_m) = b_r(u)\, b_{j,m}^*(x_m), \quad 1 \le r \le N_0,\ 1 \le j \le N_m,\ 1 \le m \le k.$$
Using this basis, the function $g_m(u, x_m)$ can be approximated as
$$g_m(u, x_m) \approx \sum_{r=1}^{N_0} \sum_{j=1}^{N_m} \lambda_{r,j,m}\, b_{r,j,m}(u, x_m), \quad 1 \le m \le k,$$
where λ r , j , m are the spline coefficients.
The least squares estimator of $g_m(u, x_m)$ is then given by
$$\tilde{g}_m(u, x_m) = \sum_{r=1}^{N_0} \sum_{j=1}^{N_m} \tilde{\lambda}_{r,j,m}\, b_{r,j,m}(u, x_m), \quad 1 \le m \le k,$$
where the vector of estimated coefficients $\tilde{\lambda} = (\tilde{\lambda}_{1,1,1}, \ldots, \tilde{\lambda}_{N_0,N_k,k})^\tau$ is an $(N_0 \sum_{m=1}^{k} N_m)$-dimensional parameter vector obtained by solving
$$\tilde{\lambda} = \underset{\lambda}{\arg\min} \sum_{t=1}^{T} \sum_{i=1}^{n} \left[ \hat{f}_t(u_i) - \sum_{m=1}^{k} z_{t,m} \sum_{r=1}^{N_0} \sum_{j=1}^{N_m} \lambda_{r,j,m}\, b_{r,j,m}(u_i, x_{t,m}) \right]^2.$$
Theoretical properties of this estimator are established in Theorem 1, which shows that the initial estimators $\tilde{g}_m(u, x_m)$ are uniformly consistent under suitable regularity conditions.
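A minimal sketch of this first step is given below. It builds open-uniform tensor-product B-spline design matrices and solves the least squares problem directly; the knot layout, spline degree, and basis sizes are our assumptions, and, unlike the normalized and scaled basis used in the text, no normalization is applied.

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_design(x, n_basis, degree=3):
    # Open-uniform B-spline design matrix on [0, 1]; knot layout is an
    # illustrative assumption, not the paper's choice.
    interior = np.linspace(0.0, 1.0, n_basis - degree + 1)
    t = np.r_[np.zeros(degree), interior, np.ones(degree)]
    x = np.clip(x, 0.0, 1.0 - 1e-12)  # keep evaluation inside the knot span
    return BSpline.design_matrix(x, t, degree).toarray()

def initial_fit(f_hat, u_grid, x, z, n_u=8, n_x=8):
    # Step 1: least squares for sum_m z_{t,m} g_m(u, x_{t,m}) with
    # tensor-product B-splines, ignoring serial dependence in the errors.
    # f_hat: (T, n) transformed response curves; x, z: (T, k) covariates.
    T, n = f_hat.shape
    k = x.shape[1]
    Bu = bspline_design(u_grid, n_u)                       # (n, n_u)
    rows = []
    for t in range(T):
        blocks = []
        for m in range(k):
            Bx = bspline_design(np.array([x[t, m]]), n_x)  # (1, n_x)
            # tensor product b_r(u_i) * b_j(x_{t,m}), scaled by z_{t,m}
            blocks.append(z[t, m] * np.kron(Bu, Bx))
        rows.append(np.hstack(blocks))
    D = np.vstack(rows)                                    # (T*n, k*n_u*n_x)
    coef, *_ = np.linalg.lstsq(D, f_hat.ravel(), rcond=None)
    return coef.reshape(k, n_u, n_x)                       # lambda_{r,j,m} by m
```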

3.2. Estimation of FAR Error Process

With the initial estimates $\tilde{g}_m(u, x_m)$ obtained, we proceed to estimate the FAR error process. To do so, define the residuals as
$$\tilde{\varepsilon}_t(u) = f_t(u) - \sum_{m=1}^{k} z_{t,m}\, \tilde{g}_m(u, x_{t,m}), \quad 1 \le t \le T.$$
Let $\rho_t(u) = \sum_{l=1}^{p} \int \gamma_l(s,u)\, \varepsilon_{t-l}(s)\, ds$ denote the additive component in the FAR(p) error process (3). Then, the FAR process can be written as
$$\varepsilon_t(u) = \rho_t(u) + e_t(u),$$
where e t ( u ) is a zero-mean innovation term.
Let $\{B_0(u), B_1(u), \ldots, B_N(u)\}$ be a set of B-spline basis functions of order $q$ with $L$ interior knots, such that $N + 1 = L + q$. Define the tensor product of the B-spline basis as
$$b_{r,j}(u, s) = B_r(u)\, B_j(s), \quad 1 \le r, j \le N.$$
Using this basis, the FAR kernel functions $\gamma_l(\cdot, \cdot)$ are approximated as
$$\gamma_l(s, u) \approx \sum_{r=1}^{N} \sum_{j=1}^{N} \mu_{r,j,l}\, b_{r,j}(u, s), \quad 1 \le l \le p.$$
The vector of spline coefficients $\mu = (\mu_{1,1,1}, \ldots, \mu_{N,N,p})^\tau \in \mathbb{R}^{pN^2}$ is obtained by minimizing the following squared error criterion:
$$\hat{\mu} = \underset{\mu}{\arg\min} \sum_{t=p+1}^{T} \sum_{i=1}^{n} \left[ \tilde{\varepsilon}_t(u_i) - \sum_{l=1}^{p} \sum_{r=1}^{N} \sum_{j=1}^{N} \mu_{r,j,l} \int b_{r,j}(u_i, s)\, \tilde{\varepsilon}_{t-l}(s)\, ds \right]^2.$$
The estimated FAR kernel functions and the additive component of the error process are then given by
$$\hat{\gamma}_l(s, u) = \sum_{r=1}^{N} \sum_{j=1}^{N} \hat{\mu}_{r,j,l}\, b_{r,j}(u, s), \quad 1 \le l \le p,$$
and
$$\hat{\rho}_t(u) = \sum_{l=1}^{p} \sum_{r=1}^{N} \sum_{j=1}^{N} \hat{\mu}_{r,j,l} \int b_{r,j}(u, s)\, \tilde{\varepsilon}_{t-l}(s)\, ds, \quad 0 \le u \le 1,\ p+1 \le t \le T,$$
respectively.
Since the order p of the FAR error process is typically unknown in practice, we employ the sequential testing procedure proposed by [16] to determine the optimal order p. The details of this procedure are provided in Section 3.4.2.
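The following sketch illustrates the second step under the same assumptions, reusing `bspline_design` from the previous sketch: the lag integrals are computed by the trapezoidal rule and the coefficients $\mu_{r,j,l}$ are obtained by ordinary least squares.

```python
import numpy as np

def fit_far_kernels(resid, grid, p, n_basis=8):
    # Step 2: spline least squares for the FAR(p) kernels
    # gamma_l(s, u) ~ sum_{r,j} mu_{r,j,l} B_r(u) B_j(s).
    # resid: (T, n) residual curves evaluated on `grid`.
    T, n = resid.shape
    B = bspline_design(grid, n_basis)          # from the Step 1 sketch
    X_rows, y_rows = [], []
    for t in range(p, T):
        # c[l, j] = int B_j(s) resid_{t-l}(s) ds, by the trapezoidal rule
        c = np.stack([np.trapz(resid[t - 1 - l][:, None] * B, grid, axis=0)
                      for l in range(p)])      # (p, n_basis)
        # feature[(i), (l, j, r)] = c[l, j] * B_r(u_i)
        X_rows.append(np.kron(c.ravel()[None, :], B))
        y_rows.append(resid[t])
    mu, *_ = np.linalg.lstsq(np.vstack(X_rows), np.concatenate(y_rows),
                             rcond=None)
    mu = mu.reshape(p, n_basis, n_basis)       # indexed as mu[l, j, r]
    # gamma_l on the grid: gammas[l, a, b] = gamma_l(s_a, u_b)
    gammas = np.einsum('ljr,br,aj->lab', mu, B, B)
    return gammas
```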

3.3. Improved Estimation of Bivariate Varying-Coefficient Function

With an estimate of the FAR error component obtained, we now refine the estimation of the varying-coefficient functions $g_m(u, x_m)$ by removing the estimated serial dependence (8) from the response $f_t(u)$.
Define the adjusted response function as
$$f_t^c(u) = f_t(u) - \sum_{l=1}^{p} \int \gamma_l(s,u)\, \varepsilon_{t-l}(s)\, ds, \quad 0 \le u \le 1,\ p+1 \le t \le T,$$
and its empirical estimate $\hat{f}_t^c(u)$ as
$$\hat{f}_t^c(u) = f_t(u) - \sum_{l=1}^{p} \int \hat{\gamma}_l(s,u)\, \tilde{\varepsilon}_{t-l}(s)\, ds, \quad 0 \le u \le 1,\ p+1 \le t \le T.$$
From the model specification, we have
$$f_t^c(u) = \sum_{m=1}^{k} z_{t,m}\, g_m(u, x_{t,m}) + e_t(u), \quad 0 \le u \le 1,$$
which allows us to re-estimate $g_m(u, x_m)$ by repeating the same spline-based procedure as described in Section 3.1, but now applied to the corrected responses $\hat{f}_t^c(u)$.
The improved spline approximation estimates take the form
$$\hat{g}_m(u, x_m) = \sum_{r=1}^{N_0} \sum_{j=1}^{N_m} \hat{\lambda}_{r,j,m}\, b_{r,j,m}(u, x_m), \quad 1 \le m \le k,$$
where the coefficient vector $\hat{\lambda} = (\hat{\lambda}_{1,1,1}, \ldots, \hat{\lambda}_{N_0,N_k,k})^\tau$ is an $(N_0 \sum_{m=1}^{k} N_m)$-dimensional vector obtained by minimizing
$$\hat{\lambda} = \underset{\lambda}{\arg\min} \sum_{t=p+1}^{T} \sum_{i=1}^{n} \left[ \hat{f}_t^c(u_i) - \sum_{m=1}^{k} z_{t,m} \sum_{r=1}^{N_0} \sum_{j=1}^{N_m} \lambda_{r,j,m}\, b_{r,j,m}(u_i, x_{t,m}) \right]^2.$$
Theoretical guarantees for this refined estimator are provided in Theorems 2 and 3, which establish its uniform convergence and asymptotic normality under regularity conditions. In addition, simulation results reported in Section 5 demonstrate that the improved estimator $\hat{g}_m(u, x_m)$ achieves greater efficiency and accuracy compared to the initial estimator $\tilde{g}_m(u, x_m)$.
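A short sketch of the final step, reusing `initial_fit` from the Step 1 sketch: subtract the estimated serial component from each response curve and refit.

```python
import numpy as np

def refined_fit(f_hat, resid, gammas, grid, x, z, p, **spline_kw):
    # Step 3: subtract the estimated serial component
    # sum_l int gamma_l(s, u) resid_{t-l}(s) ds, then rerun the Step 1
    # spline fit on the corrected responses (defined for t >= p+1 only).
    T = f_hat.shape[0]
    f_corr = f_hat[p:].copy()
    for t in range(p, T):
        for l in range(p):
            f_corr[t - p] -= np.trapz(gammas[l] * resid[t - 1 - l][:, None],
                                      grid, axis=0)
    return initial_fit(f_corr, grid, x[p:], z[p:], **spline_kw)
```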

3.4. Implementation

3.4.1. Selection of Bandwidth

In empirical applications, it is necessary to estimate the underlying density functions before model fitting. This step requires selecting an appropriate bandwidth for the modified kernel density estimator. In this section, we adopt a leave-one-out cross-validation (LOOCV) strategy to select the optimal bandwidth.
Specifically, the bandwidth $h$ is chosen to minimize the following mean squared error (MSE) criterion:
$$CV(h) = \frac{1}{nT} \sum_{t=1}^{T} \sum_{i=1}^{n} \left[ \hat{d}_t(y_{ti}) - \hat{d}_t^{(-i)}(y_{ti}) \right]^2,$$
where, for each $i = 1, \ldots, n$, $\hat{d}_t^{(-i)}(y_{ti})$ denotes the density estimate of $d_t(y_{ti})$ with bandwidth $h$ at the point $y_{ti}$, using all observations from time point $t$ except the $i$-th one.
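A direct, if brute-force, sketch of this criterion follows, reusing `kde_boundary` from the earlier sketch; comparing the full-sample and leave-one-out estimates at each held-out point is our reading of the criterion above.

```python
import numpy as np

def loocv_bandwidth(samples, grid, h_grid):
    # samples: list of T arrays, each holding the n draws at one time point.
    # For each candidate h, compare the full-sample estimate with the
    # leave-one-out estimate at every held-out data point.
    scores = []
    for h in h_grid:
        err, count = 0.0, 0
        for Y in samples:
            full = kde_boundary(Y, Y, h)   # estimator sketched earlier
            for i in range(len(Y)):
                loo = kde_boundary(np.delete(Y, i), np.array([Y[i]]), h)
                err += (full[i] - loo[0]) ** 2
                count += 1
        scores.append(err / count)
    return h_grid[int(np.argmin(scores))]
```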

3.4.2. Identifying the Order of the FAR Process

To determine the order $p$ of the FAR error process, we apply the sequential testing procedure proposed by [16]. The method frames FAR modeling as a fully functional linear regression with dependent regressors and systematically tests whether increasing the order improves the model fit.
The procedure tests the following nested hypotheses:
$$H_{0,p}: \{\varepsilon_t\} \text{ follows a FAR}(p) \quad \text{vs.} \quad H_{a,p+1}: \{\varepsilon_t\} \text{ follows a FAR}(p+1), \quad p = 0, 1, 2, \ldots.$$
Here, FAR(0) corresponds to an independent and identically distributed process. The testing begins at $p = 0$ and continues sequentially, stopping when $H_{0,p}$ is not rejected, at which point the selected model order is taken to be $p$. See [16] for the full theoretical development.
To construct the test statistic, define the following components. Let
$$\eta_j(s) = \sum_{l=1}^{p} \tilde{\varepsilon}_{j-l}\big(sp - (l-1)\big)\, I_l(s), \qquad \varphi(s, u) = p \sum_{l=1}^{p} \gamma_l\big(sp - (l-1), u\big)\, I_l(s),$$
where $I_l$ is the indicator function of the interval $[(l-1)/p, l/p]$. Denote
$$\hat{C}_\eta(s, u) = \frac{1}{T} \sum_{j=1}^{T} \big(\eta_j(s) - \bar{\eta}(s)\big)\big(\eta_j(u) - \bar{\eta}(u)\big)$$
as the empirical covariance operator of $\{\eta_j\}$, where $\bar{\eta}(s)$ is the sample mean function. Let $\{\hat{x}_j\}$ and $\{\hat{\lambda}_j\}$ be the eigenfunctions and corresponding decreasingly ordered eigenvalues of $\hat{C}_\eta$, respectively. We retain only the first $q_\eta$ eigenfunctions for dimensionality reduction. Similarly, for the functional responses $\{\pi_j\}$, define the eigenpairs $\{\hat{y}_j\}$ and the corresponding truncation number $q_\pi$ analogously.
For the product space $L^2([0,1] \times [0,1])$, define the projections
$$\eta(j,k) = \langle \eta_j, \hat{x}_k \rangle, \qquad \pi(j,m) = \langle \pi_j, \hat{y}_m \rangle, \qquad \psi(k,m) = \langle \varphi, \hat{x}_k \hat{y}_m \rangle.$$
Denote the matrices $\eta = [\eta(j,k)]_{T \times q_\eta}$, $\pi = [\pi(j,m)]_{T \times q_\pi}$, and $\psi = [\psi(k,m)]_{q_\eta \times q_\pi}$, for $j = 1, \ldots, T$, $k = 1, \ldots, q_\eta$, and $m = 1, \ldots, q_\pi$.
Next, construct the matrix $\hat{A} \in \mathbb{R}^{q_\eta \times q_\eta}$ with entries
$$\hat{A}(k, k') = \langle \hat{x}_{k,p}, \hat{x}_{k',p} \rangle, \quad \text{where } \hat{x}_{k,p}(s) = \hat{x}_k\left( \frac{s + p - 1}{p} \right), \quad 0 \le s \le 1.$$
Define the orthonormal eigenvectors $\hat{\beta}_k$ with corresponding ordered eigenvalues $\hat{\xi}_1 \ge \cdots \ge \hat{\xi}_{q_\eta}$ by $\hat{A} \hat{\beta}_k = \hat{\xi}_k \hat{\beta}_k$, $1 \le k \le q_\eta$. Define the matrix $\hat{B} = [\hat{\beta}_1, \ldots, \hat{\beta}_{q^*}]$, where
$$q^* = \max\left\{ k \in \{1, \ldots, q_\eta\} : \|\hat{z}_{k,p}\|^2 \ge 0.9p \right\}, \quad \text{with } \hat{z}_{k,p}(s) = \sum_{i=1}^{q_\eta} \hat{\beta}_{k,i}\, \hat{x}_{i,p}(s).$$
Finally, following [16], the test statistic is constructed as
$$\hat{\tau}_p = \frac{1}{T}\, \mathrm{vec}[\hat{B}^\tau \hat{\psi}]^\tau \left\{ (I_{q_\pi} \otimes \hat{B}^\tau)(\hat{C} \otimes \hat{\Lambda})(I_{q_\pi} \otimes \hat{B}) \right\}^{-1} \mathrm{vec}[\hat{B}^\tau \hat{\psi}],$$
where $\hat{\Lambda} = \mathrm{diag}(\hat{\lambda}_1, \ldots, \hat{\lambda}_{q_\eta})$ and $\hat{C} = \frac{1}{T}(\pi - \eta \hat{\psi})^\tau (\pi - \eta \hat{\psi})$. Under $H_{0,p}$, the test statistic $\hat{\tau}_{p+1}$ asymptotically follows a chi-squared distribution with $q_\pi q^*$ degrees of freedom.
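Implementing the full statistic requires the projections and matrices above; the sketch below shows only the sequential stopping logic, with `far_test_statistic` as a hypothetical stand-in returning the statistic $\hat{\tau}_{p+1}$ and its degrees of freedom $q_\pi q^*$.

```python
import numpy as np
from scipy.stats import chi2

def select_far_order(resid, grid, alpha=0.05, p_max=5):
    # Sequential rule: test H_{0,p} (FAR(p)) against H_{a,p+1} (FAR(p+1))
    # for p = 0, 1, 2, ... and stop at the first non-rejection.
    # `far_test_statistic` is a hypothetical stand-in for the statistic of
    # Kokoszka and Reimherr [16]; only the stopping logic is shown here.
    for p in range(p_max + 1):
        stat, df = far_test_statistic(resid, grid, p)  # hypothetical helper
        if chi2.sf(stat, df) > alpha:                  # H_{0,p} not rejected
            return p
    return p_max
```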

4. Theoretical Results

In this section, we investigate the asymptotic properties of both the initial and improved estimators of the bivariate varying-coefficient functions $g_m(u, x_m)$. We also establish the consistency of the estimator for the order $p$ of the functional auto-regressive (FAR) process. All technical proofs are deferred to the Supplementary Materials.
Throughout the remainder of this paper, for any fixed interval $[a, b]$, we denote the space of functions that are $l$-times continuously differentiable on $[a, b]$ by $C^{(l)}[a,b] = \{g \mid g^{(l)} \in C[a,b]\}$. Let $\mathrm{Lip}([a,b], C) = \{g : |g(x) - g(x')| \le C|x - x'|,\ \forall x, x' \in [a,b]\}$ denote the class of Lipschitz-continuous functions with Lipschitz constant $C > 0$. Let $S_{x_m}$ and $S_{z_m}$ denote the supports of $x_m$ and $z_m$, respectively. Then, the supports of the covariate vectors $\mathbf{x}$ and $\mathbf{z}$ are given by $S_x = \prod_{m=1}^{k} S_{x_m}$ and $S_z = \prod_{m=1}^{k} S_{z_m}$, respectively. The following regularity conditions are imposed to derive the asymptotic properties of the proposed estimators.
(A1)
For any density function $d \in \mathcal{F}$, $d$ is differentiable, and there exists a constant $M > 1$ such that $\|d\|_\infty$, $\|1/d\|_\infty$, and $\|d'\|_\infty$ are all bounded by $M$.
(A2)
(a) The kernel function $K$ is Lipschitz-continuous, bounded, and symmetric about zero; furthermore, $K \in \mathrm{Lip}([-1,1], L_k)$ for some constant $L_k > 0$. (b) The kernel function satisfies the following conditions: $\int_0^1 K(u)\,du > 0$, $\int_{\mathbb{R}} |u| K(u)\,du < \infty$, $\int_{\mathbb{R}} K^2(u)\,du < \infty$, and $\int_{\mathbb{R}} |u| K^2(u)\,du < \infty$.
(A3)
The covariates $x_{t,m}$, $z_{t,m}$, for $1 \le m \le k$, and the error functions $\varepsilon_t(u)$ satisfy the following moment conditions: for some $s > 2$,
$$\max_{1 \le t \le T} \max_{1 \le m \le k} E(|x_{t,m}|^{2s}) < \infty, \quad \max_{1 \le t \le T} \max_{1 \le m \le k} E(|z_{t,m}|^{2s}) < \infty, \quad \max_{1 \le t \le T} \sup_u E(|\varepsilon_t(u)|^{2s}) < \infty.$$
For each $t = 1, \ldots, T$, the covariance function of the error process, $\mathrm{Cov}(\varepsilon_t(s), \varepsilon_t(v)) = \Sigma_t(s, v)$, has finite nonincreasing eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots$ such that $\sum_j \lambda_j < \infty$.
(A4)
The varying-coefficient functions $g_m(u, x_m)$ are continuous over the domain $[0,1] \times [a_m, b_m]$ and are twice continuously partially differentiable with respect to $u$ and $x_m$, for each $1 \le m \le k$. Here, $[a_m, b_m]$ is a compact subset of the support $S_{x_m}$.
(A5)
The numbers of basis functions satisfy $N_0 \asymp (nT)^{1/6} \log nT$ and $N_m \asymp (nT)^{1/6} \log nT$, $1 \le m \le k$, and the bandwidth satisfies $h \asymp n^{-1/3}$, as $n, T \to \infty$.
Remark 1. 
Assumption (A1) is standard and ensures the well-posedness of density transformations. Assumption (A2) imposes mild conditions on the kernel function K ( · ) , which are satisfied by commonly used kernel functions such as the uniform and Epanechnikov kernels. The moment conditions in (A3) are essential for establishing the uniform convergence and other asymptotic properties of spline-based estimators. Assumption (A4) requires only moderate smoothness of the coefficient functions and is relatively weak compared to traditional nonparametric assumptions. Lastly, the growth conditions in (A5) are widely adopted in the literature on spline smoothing to ensure optimal convergence rates.
We begin by examining the uniform convergence of the initial estimator of the bivariate functions $g_m(u, x_m)$, as stated in Theorem 1.
Theorem 1. 
Assume that Assumptions (A1)–(A5) hold, and let $\tilde{g}_m(u, x_m)$ denote the initial estimator of $g_m(u, x_m)$, defined in Equation (5), for $m = 1, \ldots, k$. Then, as $n \to \infty$ and $T \to \infty$, we have
$$\sup_{u, x_m \in [0,1]} \left| \tilde{g}_m(u, x_m) - g_m(u, x_m) \right| = O_p\left( (nT)^{-1/3} \log(nT) + n^{-1/3} \right).$$
Theorem 2 characterizes the uniform convergence of the improved estimator of $g_m(u, x_m)$, and Theorem 3 describes the asymptotic properties of both the initial and improved estimators.
Theorem 2. 
Assume that Assumptions (A1)–(A5) hold, and that the order $p$ of the functional error process is known. Let $\hat{g}_m(u, x_m)$ denote the improved estimator of $g_m(u, x_m)$, as defined in Equation (9), for $m = 1, \ldots, k$. Then, as $n \to \infty$ and $T \to \infty$, it holds that
$$\sup_{u, x_m \in [0,1]} \left| \hat{g}_m(u, x_m) - g_m(u, x_m) \right| = O_p\left( (nT)^{-1/3} (\log(nT))^2 + n^{-1/3} \right).$$
To establish the asymptotic normality of the estimators, we introduce the following notation. Denote $\mathbf{b}(u, x_{t,m}) = (b_{1,1,m}(u, x_{t,m}), \ldots, b_{N_0,N_m,m}(u, x_{t,m}))^\tau$, $\mathbf{b}_z(u, x_{t,m}) = z_{t,m}\, \mathbf{b}(u, x_{t,m})$, $\mathbf{Bz}_{t,m} = (\mathbf{b}_z(u_1, x_{t,m}), \ldots, \mathbf{b}_z(u_n, x_{t,m}))^\tau_{n \times N_0 N_m}$, $\mathbf{B}_m = (\mathbf{Bz}_{1,m}^\tau, \ldots, \mathbf{Bz}_{T,m}^\tau)^\tau$, $\mathbf{B} = (\mathbf{B}_1, \ldots, \mathbf{B}_k)$, and $\mathbf{B}^* = \mathbf{B}/\sqrt{nT}$.
Let $A_m = (\mathbf{0}, \ldots, \mathbf{I}, \ldots, \mathbf{0})$ denote a $1 \times k$ block matrix whose $m$-th block is the identity matrix of size $N_0 N_m \times N_0 N_m$, with all other blocks being zero matrices of appropriate dimensions.
Theorem 3. 
Assume that Assumptions (A1)–(A5) hold, and let $\tilde{g}_m(u, x_m)$ and $\hat{g}_m(u, x_m)$ denote the initial and improved estimators of $g_m(u, x_m)$, as defined in Equations (5) and (9), respectively, for $m = 1, \ldots, k$. Then, as $n, T \to \infty$, for any $u \in (0,1)$ and $x_m \in [0,1]$, the following results hold:
(i) 
The initial estimator $\tilde{g}_m(u, x_m)$ is asymptotically normally distributed, i.e.,
$$\sqrt{nT}\, \left( C_m \Sigma_\varepsilon C_m^\tau \right)^{-1/2} \left( \tilde{g}_m(u, x_m) - g_m(u, x_m) \right) \xrightarrow{D} N(0, 1),$$
where $C_m = \mathbf{b}^\tau(u, x_m)\, E\left( A_m (\mathbf{B}^{*\tau} \mathbf{B}^*)^{-1} \mathbf{B}^{*\tau} \right)$ and $\Sigma_\varepsilon = (\Sigma_{t,s})_{1 \le t,s \le T}$ is the covariance matrix with $\Sigma_{t,s} = \mathrm{Cov}(\varepsilon_t, \varepsilon_s)$.
(ii) 
The improved estimator $\hat{g}_m(u, x_m)$ is asymptotically normally distributed, i.e.,
$$\sqrt{nT}\, \left( C_m \Xi_\varepsilon C_m^\tau \right)^{-1/2} \left( \hat{g}_m(u, x_m) - g_m(u, x_m) \right) \xrightarrow{D} N(0, 1),$$
where $\Xi_\varepsilon = \mathrm{diag}(\Xi_{t,t})_{1 \le t \le T}$ is the covariance matrix with $\Xi_{t,t}(u, s) = \sigma_t^2(u, s)$.

5. Numerical Study

In this section, we present two simulation studies designed to evaluate the performance of the proposed identification and estimation procedures for the additive model.

5.1. Case 1

This scenario assesses the estimation accuracy of the proposed method when the order of the auto-regressive error process is known, for finite $n$ and $T$. We consider a DVCA-FAR(1) model given by
$$f_t(u) = z_{t,1}\, g_1(u, x_{t,1}) + z_{t,2}\, g_2(u, x_{t,2}) + \varepsilon_t(u), \quad 0 \le u \le 1,$$
where the functional error process $\varepsilon_t(u)$ takes the form
$$\varepsilon_t(u) = \int \gamma_1(s, u)\, \varepsilon_{t-1}(s)\, ds + e_t(u), \quad 2 \le t \le T.$$
The bivariate varying-coefficient functions are specified as
$$g_1(u, x_{t,1}) = \sin(2\pi u)(2x_{t,1} - 1), \qquad g_2(u, x_{t,2}) = \sin(2\pi u)\sin(2\pi x_{t,2}),$$
and the coefficient function and innovation process are given by
$$\gamma_1(s, u) = 0.2us, \qquad e_t(u) = 0.2\eta_{t,1}\sin(\pi u) + \eta_{t,2}\sin(2\pi u),$$
with $\eta_{t,1} \sim N(0, 0.1^2)$, $\eta_{t,2} \sim N(0, 0.05^2)$, and $\eta_{t,1}$ independent of $\eta_{t,2}$, for $u \in [0,1]$.
The covariates are generated as follows: $z_{t,1} \sim N(0,1)$, $z_{t,2} \sim N(0, 0.5^2)$, and $(x_{t,1}, x_{t,2})^\tau = (\Phi(v_{t,1}), \Phi(v_{t,2}))^\tau$, $1 \le t \le T$, where $\Phi$ denotes the cumulative distribution function of the standard normal distribution and $v_{t,1}, v_{t,2}$ are independent standard normal variables.
To generate the response densities, for each given $Z = z$ and $X = x$, let $\alpha(u, x, z)$ be the additive predictor given by $\alpha(u, x, z) = \sum_{m=1}^{k} z_m\, g_m(u, x_m)$. The conditional quantile function $Q(\cdot \mid x, z)$ with the error process $\varepsilon(u)$, corresponding to the density $d_t$, is constructed as
$$Q(u \mid x, z) = F^{-1}(u \mid x, z) = \theta(x, z)^{-1} \int_0^u \exp\{\alpha(v, x, z) + \varepsilon(v)\}\, dv, \quad \text{where } \theta(x, z) = \int_0^1 \exp\{\alpha(v, x, z) + \varepsilon(v)\}\, dv.$$
Given this construction, we generate response samples by applying the quantile function to uniform random variables $\{U_{t,1}, \ldots, U_{t,n_t}\} \sim U(0,1)$, which are independent of $X_t$ and $Z_t$. Specifically, for each $1 \le t \le T$, we obtain the random samples $Y_t = \{Y_{t,j} = Q(U_{t,j} \mid X_t, Z_t) : 1 \le j \le n_t\}$, ensuring that $Y_{t,1}, \ldots, Y_{t,n_t} \sim d_t$, where $d_t$ denotes the response density. The transformed density, as used in model (12), is then defined as $f_t(u) = \Psi(d_t)(u)$. For simplicity, we assume that independent and identically distributed observations are available for each response distribution, i.e., $n_t = n$.
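A sketch of this generator for a single time point is given below; the trapezoidal CDF construction and the interpolation-based inversion are our numerical shortcuts, not the authors' exact implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_from_density(alpha_vals, eps_vals, v_grid, n):
    # Build Q(u | x, z) = theta^{-1} int_0^u exp{alpha(v) + eps(v)} dv on a
    # grid, then push n uniforms through it (inverse-transform sampling).
    expo = np.exp(alpha_vals + eps_vals)
    Q = np.r_[0.0, np.cumsum(0.5 * (expo[1:] + expo[:-1]) * np.diff(v_grid))]
    Q /= Q[-1]                        # normalize so that Q(1) = 1
    U = rng.uniform(size=n)
    return np.interp(U, v_grid, Q)    # Q is tabulated at the grid points
```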
The simulation is conducted with $T = 100$ and $n = 100$, and results are averaged over 200 Monte Carlo replications. Figure 2 presents the true error process $\varepsilon(u)$ in panel (a) and its corresponding spline-based estimates in panel (b), demonstrating a high degree of accuracy in error recovery. Figure 3 provides a comparative view of the bivariate function estimates. Specifically, the left panel displays the true surfaces of $g_m(u, x_m)$, while the middle and right panels show the averages of the initial and improved estimates, respectively. The initial estimates are obtained without accounting for the FAR(1) error structure, whereas the improved estimates incorporate the estimated error process. To facilitate visual comparison, the surfaces are presented from two distinct viewing angles. This allows for a more comprehensive assessment of the estimation performance before and after error correction.
As illustrated in Figure 2, the proposed method achieves a highly accurate estimation of the error process. Moreover, the right panel of Figure 3 clearly demonstrates that incorporating the estimated FAR structure leads to substantially improved function estimates when compared to the initial results shown in the middle panel. These findings confirm the effectiveness of the proposed approach in refining the estimation of bivariate varying-coefficient functions by properly addressing the temporal dependence in the functional error process.
To further evaluate the performance of the proposed estimation procedure, we conduct simulations under varying sample sizes, specifically $T = 50, 100$ and $n = 50, 100$. The accuracy of the initial and improved estimators of $g_m(u, x_m)$ is assessed using the root mean squared error (RMSE), defined as
$$\mathrm{RMSE}(\dot{g}_m) = \frac{1}{T} \sum_{t=1}^{T} \left\{ \frac{1}{n} \sum_{i=1}^{n} \left| \dot{g}_m(u_i, x_{t,m}) - g_m(u_i, x_{t,m}) \right|^2 \right\}^{1/2},$$
where $\dot{g}_m$ denotes either the initial estimate $\tilde{g}_m$ or the improved estimate $\hat{g}_m$.
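Evaluated on the grid $\{(u_i, x_{t,m})\}$, the criterion reduces to a short array computation; the array layout below is an assumption of this sketch.

```python
import numpy as np

def rmse(g_est, g_true):
    # g_est, g_true: (T, n) fitted and true values of one component g_m
    # on the evaluation grid {(u_i, x_{t,m})}.
    return np.mean(np.sqrt(np.mean((g_est - g_true) ** 2, axis=1)))
```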
Table 1 presents the average root mean square errors (RMSEs) along with their standard deviations, calculated over 200 Monte Carlo replications for both the initial and improved estimators of $g_m(u, x_m)$. The results demonstrate a clear trend: the RMSEs decrease as both the number of time points $T$ and the number of observations per curve $n$ increase. More importantly, the improved estimators consistently outperform the initial estimators, yielding substantially lower RMSEs across all settings. This improvement is anticipated, as the initial estimates are obtained without adjusting for the auto-regressive error structure, which introduces bias and additional variability into the estimation process. In contrast, the improved estimates incorporate the estimated error component, leading to more accurate and reliable results.
To offer a deeper understanding of the relative performance between the two estimation strategies, we also compare their biases and standard deviations, taking the first setting in the simulation study as a representative example.
Table 2 presents the average bias and standard deviation of both the initial and improved estimators of $g_m(u, x_m)$. The results further confirm the superiority of the improved approach, indicating that across all combinations of sample size, both the bias and standard deviation of the improved estimators are markedly smaller than those of the initial estimators. Furthermore, both metrics exhibit a decreasing trend as the sample size increases, highlighting the consistency and efficiency of the improved estimation method. These findings provide strong empirical support for the theoretical result that the improved estimator possesses a smaller asymptotic variance–covariance matrix, thereby offering enhanced precision and robustness in practical applications.

5.2. Case 2

Case 2 is designed to evaluate the efficiency of identifying the auto-regressive order of the functional error process. The response densities are also generated from model (12), but now the error process follows a FAR(2) structure, with the second-order coefficient function specified as $\gamma_2(s, u) = \frac{1}{4}us^2$. All other simulation settings remain consistent with those in Case 1.
Table 3 reports the empirical power of the testing procedure used to determine the order of the FAR error process across various sample sizes and significance levels. The results clearly show that the test’s power increases as both the number of time points T and the number of observations per curve n grow. In particular, the power approaches one when T and n reach 100, especially when testing the null hypothesis of independent and identically distributed (i.i.d.) errors. This indicates that the test becomes highly reliable with larger sample sizes. While the power is somewhat lower when testing the null hypothesis of FAR(1) against FAR(2), this is expected due to the inherent difficulty in distinguishing between these closely related models. Additionally, the test maintains an appropriate size when assessing FAR(2) against higher-order alternatives, confirming its accuracy and practical feasibility for identifying the correct order of the functional error process. Overall, these findings demonstrate the robustness and effectiveness of the proposed testing algorithm in diverse settings.
To further explore the impact of correctly identifying the auto-regressive order p on estimation accuracy, Table 4 presents the average RMSEs of the bivariate varying-coefficient functions. The observed pattern closely parallels the results in Case 1, reinforcing the validity and reliability of the model’s identification and estimation procedures. This evidence highlights the critical role that the accurate determination of the auto-regressive order plays in improving estimation precision. The consistent RMSE patterns across different sample sizes and scenarios underline the model’s robustness in effectively accounting for the error structure, thus providing precise and reliable estimates of the bivariate varying-coefficient functions.

5.3. Case 3

To further examine the performance of the proposed estimation approach and identification procedure under different scenarios, we consider the coefficient function and innovation process
$$\gamma_1(s, u) = 0.2us, \qquad e_t(u) = 0.2\eta_{t,1}\sin(\pi u) + \eta_{t,2}\sin(2\pi u),$$
with $\eta_{t,1} \sim \mathrm{Gamma}(3, 2)$, $\eta_{t,2} \sim t(5)$, and $\eta_{t,1}$ independent of $\eta_{t,2}$, for $u \in [0,1]$. All other simulation settings remain consistent with those in Case 2.
Table 5 summarizes the power performance of the proposed testing approach under a setting where the innovation process $e_t(u)$ follows a non-Gaussian distribution with increased variability. The outcomes indicate that, as in Case 2, the test remains capable of identifying the correct order of the FAR process, even in the presence of more complex error structures. Although the overall power is somewhat reduced relative to the Gaussian innovation case of Case 2, particularly in distinguishing closely related models such as FAR(1) versus FAR(2), the test still demonstrates satisfactory performance, especially as the sample size increases. Notably, when both the number of time points $T$ and the number of observations per curve $n$ are large (e.g., 100), the power approaches unity, confirming that the test remains reliable under more challenging, non-ideal conditions.
Table 6 further examines how accurately identifying the auto-regressive order influences the estimation quality of the bivariate varying-coefficient functions. Despite the non-Gaussian error distribution and larger noise fluctuations leading to visibly higher RMSEs—both for the initial and refined estimates—consistent improvements are observed when the correct FAR order is utilized. These improvements closely mirror the trends seen in Case 2, reaffirming the stability and practical value of the estimation procedure. The results suggest that, although the estimation becomes inherently more difficult under heavy-tailed or heteroskedastic error conditions, the proposed methods remain applicable and beneficial in terms of both model identification and estimation refinement.

6. Real Data Analysis

In this section, we illustrate the feasibility and effectiveness of the proposed estimation procedure through the analysis of two real-world datasets. By applying our methodology to empirical data, we demonstrate its practical capability to capture the underlying patterns and dependencies present in complex data. This analysis not only serves to validate the performance of the estimation approach but also underscores its broad applicability across diverse domains. Moreover, it highlights the model’s flexibility and robustness in handling intricate, time-dependent, and non-Euclidean data structures, thus emphasizing its value as a versatile tool for real-world applications.

6.1. COVID-19 Data

On 11 March 2020, the World Health Organization (WHO) officially declared COVID-19, a contagious disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), a global pandemic. The rapid and widespread transmission of the virus presented unprecedented challenges to global public health, prompting countries worldwide to implement lockdowns and other measures aimed at controlling the spread of the disease. As of 15 August 2021, WHO reports indicated a staggering 221,885,822 confirmed cases and 4,583,539 deaths spanning nearly all countries, underscoring the extensive and profound impact of the pandemic. Given the scale of this crisis, it is crucial for international health organizations and research institutions to continuously monitor the evolving global trends of COVID-19. Such monitoring enables timely, accurate analysis that supports effective public health responses, informs medical treatment strategies, and guides prevention and control measures for future outbreaks. Understanding the epidemic’s dynamics through data-driven modeling is therefore essential for shaping informed policy decisions and improving health outcomes worldwide amid this ongoing global crisis.
To illustrate this point, we focus on the mortality rate as a key indicator for tracking the global trend of the COVID-19 pandemic. The mortality rate is defined as the ratio of the cumulative number of deaths each day to the total population of each country, serving as a critical measure of the disease’s lethality and spread. Importantly, the calculation of mortality rates inherently involves temporal dependence, as daily figures are based on previous days’ data. Consequently, the mortality rates, and thus the global epidemic trend, exhibit temporal autocorrelation.
The data on COVID-19-related deaths, essential to our analysis, are sourced from the publicly accessible Johns Hopkins University repository. This resource provides a dynamic tracking map offering comprehensive insights into global pandemic trends. The dataset, available at https://www.jhu.edu/ (accessed on 25 July 2021), covers the period from 22 January 2020 through 15 April 2021. Additionally, the most recent population data required to compute mortality rates for each country are obtained from the World Bank’s online platform, accessible at https://data.worldbank.org (accessed on 17 September 2021). These publicly available datasets form a valuable foundation for monitoring the pandemic’s progression and conducting rigorous statistical and epidemiological analyses to better understand the disease’s behavior across regions.
Because the timing of outbreaks varies between countries and regions, we standardize the time scale by defining day zero as the date when each country reached 100 cumulative confirmed COVID-19 cases. Our analysis considers daily cumulative death data from 189 countries over the subsequent 100-day period. At each time point $t$, we estimate the density function of the mortality rate, denoted $\hat{d}_t(y)$, using data from these countries. Figure 1a displays the estimated densities of the global mortality rate (‰) across the 100 days, with data from up to 189 countries at each time point. Figure 1b presents an alternative view by showing the estimated densities on three selected days. From these visualizations, it is clear that the mortality rate densities remain well-defined throughout the observed period. Moreover, a temporal dependency among the distributions is clearly observable, suggesting the presence of an auto-regressive structure in the data, which supports the hypothesis of a functional auto-regressive (FAR) error process.
The main objective of this analysis, based on the COVID-19 data, is to identify the FAR process underlying the mortality rate and estimate its component functions. For the sake of simplicity, we begin by considering a special case where the covariate $z$ is constant (set to 1) and $x$ represents the rescaled time variable $t/T$. The model is specified as
$$\hat{f}_t(u) = \Psi(\hat{d}_t)(u) = g_1(u, x_{t,1}) + \varepsilon_t(u), \quad 1 \le t \le 100,$$
where $\varepsilon_t(u) = \sum_{l=1}^{p} \int \gamma_l(u, s)\, \varepsilon_{t-l}(s)\, ds + e_t(u)$ and $x_{t,1}$ denotes the time scale $t/T$.
Using the initial spline estimate of $g_1(u, x_1)$, we apply the testing algorithm to determine the order $p$ of the FAR process. Table 7 reports the corresponding p-values under different hypotheses. The results provide strong evidence of significant autocorrelation in the data, supporting a first-order functional auto-regressive process, FAR(1). These findings empirically confirm the presence of temporal dependencies in the COVID-19 mortality rates, further justifying the use of a FAR error structure to capture the evolving epidemic dynamics.
Figure 4 presents a heat map of the estimated bivariate function $\hat{g}_1(u, x_1)$, obtained after accounting for the functional error process and determining the auto-regressive order. The heat map reveals a relatively stable temporal pattern, in which the function initially attains a minimum at lower values of $u$, gradually increases, and reaches a maximum at later time points. This pattern reflects the underlying dynamics of the COVID-19 mortality rate over successive days. The observed correlation between consecutive days further supports the notion that the global mortality rate exhibits substantial temporal dependence, consistent with the nature of the mortality measure derived from prior daily data.
To evaluate the uncertainty associated with these estimates, we conducted a residual-based bootstrap analysis. Specifically, we first fitted the model to obtain the estimated coefficient surface $\hat{g}_1(u, x_1)$ and the residual functions. Then, using the estimated functional auto-regressive operator from the FAR(1) error process, we recursively generated bootstrap residual samples by resampling the innovation functions with replacement. For each of the 500 bootstrap replications, new response functions were constructed by adding the bootstrap residuals to the fitted values based on $\hat{g}_1(u, x_1)$. The entire estimation procedure was repeated on each bootstrap dataset to obtain bootstrap replicates of the coefficient surface, and the variability among these replicates was used to calculate point-wise standard errors and confidence intervals. The resulting standard errors were generally small across the domain, typically ranging between 0.04 and 0.07, indicating stable estimates throughout. The 95% confidence intervals for the bivariate varying-coefficient surface consistently excluded zero along the increasing temporal trend, confirming its statistical significance. Moreover, the bootstrap results showed that the identified pattern, a minimum at early $u$ values followed by a gradual rise, was robust across replications, indicating that the observed dynamic is not an artifact of noise but reflects a meaningful underlying structure. Overall, these findings reinforce the necessity of incorporating a functional auto-regressive process to accurately model the temporal structure of the mortality rate; the clear and significant progression observed in the heat map further validates the model's ability to capture the global evolution of the pandemic over time.
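For concreteness, a condensed sketch of this bootstrap scheme (under a FAR(1) fit, with the refitting step omitted) might look as follows; the array shapes and the innovation-recovery step are our assumptions.

```python
import numpy as np

def far1_residual_bootstrap(fitted, resid, gammas, grid, n_boot=500, seed=0):
    # fitted: (T, n) fitted response curves; resid: (T, n) residual curves;
    # gammas[0]: estimated FAR(1) kernel on the grid, gammas[0][a, b] =
    # gamma_1(s_a, u_b). Refitting the model on each draw (omitted here)
    # yields point-wise standard errors for the coefficient surfaces.
    rng = np.random.default_rng(seed)
    T = resid.shape[0]
    # recover innovations e_t = eps_t - int gamma_1(s, u) eps_{t-1}(s) ds
    innov = np.array([resid[t] - np.trapz(gammas[0] * resid[t - 1][:, None],
                                          grid, axis=0) for t in range(1, T)])
    draws = []
    for _ in range(n_boot):
        e_star = innov[rng.integers(0, T - 1, size=T)]  # resample innovations
        eps_star = np.zeros_like(resid)
        for t in range(1, T):
            eps_star[t] = np.trapz(gammas[0] * eps_star[t - 1][:, None],
                                   grid, axis=0) + e_star[t]
        draws.append(fitted + eps_star)
    return np.asarray(draws)          # (n_boot, T, n) bootstrap responses
```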

6.2. USA Income Data

Personal income statistics are essential for enabling governments to understand the interplay between national income, consumption, and saving. These statistics also serve as a valuable tool for assessing and comparing economic well-being across different regions or countries. In this study, we focus on the density time series of per capita personal income, defined as the total personal income of a region divided by its population. This metric offers a detailed perspective on the economic conditions within a region by capturing the distribution and evolution of income on a per-person basis over time. Analyzing such time series allows policymakers and researchers to gain insights into the long-term economic trends of a region, evaluate income disparities, and make more informed decisions regarding fiscal policies, social welfare programs, and strategies for economic development.
Income data for the United States are publicly accessible through the official website of the United States Bureau of Economic Analysis (http://www.bea.gov/, accessed on 16 October 2021). We consider the quarterly per capita personal income of all 50 states in the USA, spanning from the first quarter of 2010 through the fourth quarter of 2020, resulting in 44 time points, $t = 1, \ldots, 44$. At each quarter $t$, we estimate the density function of per capita income, $\hat{d}_t(y)$, based on these 50 observations. Given that quarterly personal income reflects broader national economic conditions, we incorporate two related covariates, 'GDP' (quarterly gross domestic product of the USA) and 'Population' (quarterly total population of the USA), both also available from the BEA (http://www.bea.gov/, accessed on 16 October 2021).
Traditionally, income curves are studied as panel data in economics, focusing on the relationship between consumers’ equilibrium points. As individual incomes fluctuate, the connections among these equilibrium points form trajectories that represent not only income growth but also increased consumer satisfaction. This perspective highlights the dynamic nature of income changes and their impact on well-being, offering valuable insights into consumer behavior over time.
In contrast, the income density curve, treated here as functional data, captures the distribution of income within a region or demographic group. It visually represents the shape and trends of income across different intervals, providing a more comprehensive view of the socio-economic environment. By examining income density curves, one can effectively observe income inequality within a population and identify key patterns of wealth distribution. Such curves are critical for economic research as they facilitate a deeper understanding of consumption behavior, socio-economic status, and the design of social policies.
Furthermore, income density curves are important tools for economic forecasting and analysis. By tracking changes in income distribution over time, economists can integrate insights about consumer preferences and consumption habits at different income levels. This approach enhances the ability to predict future economic conditions and shifts in consumption patterns, making income density curves indispensable for both microeconomic and macroeconomic analyses. Consequently, these curves play a crucial role in shaping policy decisions, economic planning, and our broader comprehension of economic well-being.
Figure 5a depicts the density time series of quarterly personal income across the 44 quarters. The density curves indicate a consistent pattern in the distribution of per capita income across states over the past decade. Specifically, there are relatively few individuals in the high-income and upper-middle-income brackets, a moderate number in the middle-income category, and a larger share in the lower-middle-income range.
To illustrate the temporal changes in income distribution more clearly, Figure 5b shows density curves at three distinct time points: the second quarter of 2015, the first quarter of 2017, and the third quarter of 2018. The curves reveal a gradual shift towards higher income levels over time, alongside a corresponding decrease in the peak density. This trend aligns with broader economic and technological progress in recent years. As the economy develops, the proportion of individuals in lower income brackets steadily decreases, while the middle-to-high income groups grow. As a result, income distribution is becoming more balanced, with an increasing share of the population moving into middle- and higher-income categories. This pattern reflects general trends of economic growth and income redistribution over the period.
We model the income density curves using the following DVCA-FAR model:
$$\hat{f}_t(u) = \Psi(\hat{d}_t)(u) = g_0(u, x_{t,0}) + z_{t,1}\, g_1(u, x_{t,1}) + z_{t,2}\, g_2(u, x_{t,2}) + \varepsilon_t(u), \quad 1 \le t \le 44,$$
where $\varepsilon_t(u) = \sum_{l=1}^{p} \int \gamma_l(u, s)\, \varepsilon_{t-l}(s)\, ds + e_t(u)$. Here, $z_{t,1}$ denotes the quarterly gross domestic product of the USA, $z_{t,2}$ denotes the quarterly total population of the USA, and $x_{t,0}, x_{t,1}, x_{t,2}$ represent the rescaled time variable $t/T$.
Following the initial spline-based estimation, we apply a testing procedure to determine the appropriate order p of the functional auto-regressive error process. The p-values in Table 8 indicate that an FAR(2) model best captures the autocorrelation structure in the error terms. This finding suggests that a second-order functional auto-regressive process effectively accounts for the dynamic dependencies within the income data.
Using the three-step estimation procedure, we estimate the bivariate varying-coefficient functions. Figure 6 presents heat maps of the three estimated functions, where $g_0(u, x_0)$ represents the baseline effect over time, $g_1(u, x_1)$ captures the influence of GDP, and $g_2(u, x_2)$ reflects the impact of population. The heat map of $g_0$ shows an alternating pattern over time between high and low values, indicating that individuals at both high and low per capita income levels experience similar effects, whereas those in the middle-income range tend to display the opposite trend. The function $g_1$, related to GDP, exhibits a consistent pattern across income levels $u$ that changes over time, with an initial peak followed by a dip and a subsequent rise toward the end of the period. Conversely, the effect of population, as shown in $g_2$, generally opposes the baseline pattern and remains relatively stable across both high- and low-income groups.
To evaluate the uncertainty associated with these estimated varying-coefficient surfaces, we conducted a residual-based bootstrap analysis analogous to that described previously. Specifically, we used the estimated FAR(2) operator and innovation residuals to generate 500 bootstrap samples, each time reconstructing the response functions and refitting the entire model. Point-wise standard errors and confidence intervals for all estimated coefficient functions $g_j(u, x_j)$ obtained from these replications show that the estimation uncertainty is generally low across most of the domain. For example, the standard errors of $g_0(u, x_0)$ remain below 0.25 in regions corresponding to the high- and low-income groups, supporting the stability of the observed alternating pattern over time. Similarly, $g_1(u, x_1)$, representing the GDP effect, exhibits standard errors typically under 0.38 near the temporal peaks and troughs, confirming that the initial peak, mid-period dip, and late-period rise are statistically significant features rather than random fluctuations. The function $g_2(u, x_2)$, reflecting the population effect, has slightly larger uncertainty in the mid-income range but remains stable and significant across the majority of the income spectrum. Overall, the 95% bootstrap confidence intervals exclude zero in these key regions, reinforcing the robustness and reliability of the identified macroeconomic influences on quarterly personal income distributions. Together, these findings highlight a significant dependence of quarterly personal income distributions in the United States on prior values, with the dynamics closely linked to key macroeconomic factors such as GDP and demographic changes represented by population growth.

7. Conclusions

Sequentially collected data often exhibit autocorrelation, which must be properly addressed to ensure accurate statistical modeling. At the same time, the analysis of non-Euclidean data structures, such as probability density functions, has gained increasing attention in modern statistical research. To address these challenges, we propose a varying-coefficient additive model with density-valued responses, incorporating a functional auto-regressive (FAR) error process to capture temporal dependence. Given the intrinsic nonlinearity and geometric constraints of density functions, we begin by applying the transformation method proposed by Petersen and Müller [6] to map the density functions into a linear Hilbert space, enabling the use of conventional regression techniques. We then develop a three-step estimation procedure for the varying-coefficient components. In the first step, we employ B-spline series approximations to obtain preliminary estimates of the bivariate varying-coefficient functions, initially ignoring the functional error structure. In the second step, we determine the order of the FAR process using the test statistic introduced by Kokoszka and Reimherr [16], based on the residuals obtained from the first step. In the final step, we account for the FAR error process and construct refined estimators for the varying-coefficient functions by removing the estimated auto-correlated components and reapplying the B-spline estimation. We provide theoretical justification for the proposed procedure by establishing convergence rates and asymptotic properties for both the initial and refined estimators. The effectiveness of the proposed method is further demonstrated through comprehensive simulation studies and applications to two real-world datasets. The results underscore the importance of addressing temporal dependence in density-valued data and validate the accuracy and efficiency of our approach.
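As a rough illustration of this pipeline, the sketch below implements a stripped-down version of the three steps on the transformed scale, using tensor-product cubic B-splines and, for brevity, an FAR(1) error fit obtained by least squares on the grid. The basis dimensions, the kernel estimator, and all function names are simplifying assumptions rather than the exact procedure analyzed in the paper.

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_design(x, n_basis=6, degree=3):
    # Cubic B-spline design matrix on [0, 1] with equally spaced interior knots.
    inner = np.linspace(0.0, 1.0, n_basis - degree + 1)
    knots = np.concatenate([np.zeros(degree), inner, np.ones(degree)])
    x = np.clip(np.asarray(x, dtype=float), 0.0, 1.0 - 1e-12)
    return BSpline(knots, np.eye(n_basis), degree, extrapolate=False)(x)

def three_step_fit(Y, X, u_grid, n_basis=6):
    # Y: T x n_grid matrix of LQD-transformed densities evaluated on u_grid;
    # X: T x M matrix of scalar covariates, each scaled to [0, 1].
    T, n_grid = Y.shape
    M = X.shape[1]
    du = u_grid[1] - u_grid[0]
    Bu = bspline_design(u_grid, n_basis)                 # n_grid x n_basis

    def design(t):
        # Tensor-product basis rows at time t: [B(x_{t,m}) (x) B(u)] over m.
        return np.hstack([np.kron(bspline_design(X[t, m:m + 1], n_basis), Bu)
                          for m in range(M)])

    D = np.vstack([design(t) for t in range(T)])

    # Step 1: initial least squares spline fit, ignoring error dependence.
    theta0, *_ = np.linalg.lstsq(D, Y.ravel(), rcond=None)
    resid = (Y.ravel() - D @ theta0).reshape(T, n_grid)

    # Step 2: estimate the FAR kernel from the residual curves
    # (order 1 for brevity: eps_t ~ int gamma(u, s) eps_{t-1}(s) ds + e_t).
    A, *_ = np.linalg.lstsq(resid[:-1] * du, resid[1:], rcond=None)
    Gamma_hat = A.T                                      # eps_t ~ Gamma_hat @ eps_{t-1} * du

    # Step 3: remove the estimated auto-correlated component and refit.
    ar_part = np.zeros_like(resid)
    ar_part[1:] = (Gamma_hat @ resid[:-1].T).T * du
    theta1, *_ = np.linalg.lstsq(D, (Y - ar_part).ravel(), rcond=None)
    return theta0, theta1, Gamma_hat
```

In practice, the number of basis functions would be chosen to balance approximation bias and variance, and the FAR order would be selected by the testing procedure described above rather than fixed at one.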
This work opens several avenues for future research. While our model establishes the relationship between density-valued responses and scalar predictors within a varying-coefficient additive framework, the growing prevalence of complex, high-dimensional data calls for further methodological extensions. In particular, although an FAR structure is assumed for the temporal dependence, consistent estimation may still be possible under simpler or approximate structures, analogous to working correlations in generalized estimating equations (GEE); exploring such alternatives could yield more flexible and efficient modeling. Moreover, while the estimation method proposed in this article combines well-established techniques, its tailored integration for density time series with functional auto-regressive errors addresses challenges unique to this setting. The least squares-based estimator is not optimal in the classical sense because of the temporally dependent functional errors, yet it remains theoretically justified and practically effective: it achieves consistency and asymptotic normality while accommodating the model's structural complexity. Developing more efficient estimation approaches, such as methods that explicitly incorporate the error dependence structure, is therefore a promising direction, as is the application of additional theoretical tools and frameworks that could further strengthen the asymptotic properties. Finally, a limitation of the current approach is that it models the distribution of responses pooled across units at each time point and thus does not capture individual unit trajectories over time; development within specific countries or states cannot be directly traced. Extending the model to include unit-specific effects or hierarchical structures, such as functional mixed-effects models, would allow within-unit temporal dynamics to be tracked while still accounting for autocorrelation. Future studies will aim to develop such extensions and explore their theoretical and empirical properties in greater depth.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/e27080882/s1. The supplementary materials provide a detailed proof of the theoretical results. Refs. [18,19] are cited in the supplementary materials.

Author Contributions

Conceptualization, Z.H., T.L. and J.Y.; data curation, Z.H.; formal analysis, Z.H.; funding acquisition, T.L. and J.Y.; methodology, Z.H., T.L. and J.Y.; project administration, T.L., J.Y. and N.B.; supervision, T.L., J.Y. and N.B.; writing—original draft, Z.H.; writing—review and editing, Z.H., T.L., J.Y. and N.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Li and You. Li’s research is supported by grants from the Humanities and Social Science Fund of the Ministry of Education of China (No. 21YJA910001). You’s research is supported by grants from the National Natural Science Foundation of China (NSFC) (No. 11971291).

Data Availability Statement

The original datasets employed in this study are publicly accessible from the official website of Johns Hopkins University at https://www.jhu.edu/ (accessed on 25 July 2021), the World Bank’s online platform at https://data.worldbank.org/ (accessed on 17 September 2021), and the United States Bureau of Economic Analysis at http://www.bea.gov/ (accessed on 16 October 2021).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Kokoszka, P.; Miao, H.; Petersen, A.; Shang, H.L. Forecasting of density functions with an application to cross-sectional and intraday returns. Int. J. Forecast. 2019, 35, 1304–1317.
  2. Sen, R.; Ma, C. Forecasting density function: Application in finance. J. Math. Financ. 2015, 5, 433–447.
  3. Petersen, A.; Müller, H. Fréchet regression for random objects with Euclidean predictors. Ann. Stat. 2019, 47, 691–719.
  4. Petersen, A.; Chen, C.; Müller, H. Quantifying and visualizing intraregional connectivity in resting-state functional magnetic resonance imaging with correlation densities. Brain Connect. 2019, 9, 37–47.
  5. Saha, A.; Banerjee, S.; Kurtek, S.; Narang, S.; Lee, J.; Rao, G.; Martinez, J.; Bharath, K.; Rao, A.; Baladandayuthapani, V. DEMARCATE: Density-based magnetic resonance image clustering for assessing tumor heterogeneity in cancer. NeuroImage Clin. 2016, 12, 132–143.
  6. Petersen, A.; Müller, H. Functional data analysis for density functions by transformation to a Hilbert space. Ann. Stat. 2016, 44, 183–218.
  7. Han, K.; Müller, H.; Park, B. Additive functional regression for densities as responses. J. Am. Stat. Assoc. 2020, 115, 997–1010.
  8. Talská, R.; Menafoglio, A.; Machalová, J.; Hron, K.; Fiserová, E. Compositional regression with functional response. Comput. Stat. Data Anal. 2018, 123, 66–85.
  9. Chen, Y.; Lin, Z.; Müller, H. Wasserstein regression. J. Am. Stat. Assoc. 2023, 118, 869–882.
  10. Zhang, C.; Kokoszka, P.; Petersen, A. Wasserstein autoregressive models for density time series. J. Time Ser. Anal. 2022, 43, 30–52.
  11. Berhoune, K.; Bensmain, N. Sieves estimator of functional autoregressive process. Stat. Probab. Lett. 2018, 135, 60–69.
  12. Chen, Y.; Chua, W.S.; Hardle, W. Forecasting limit order book liquidity supply-demand curves with functional autoregressive dynamics. Quant. Financ. 2019, 19, 1473–1489.
  13. Chen, Y.; Li, B. An adaptive functional autoregressive forecast model to predict electricity price curves. J. Bus. Econ. Stat. 2017, 35, 371–388.
  14. Kowal, D.R.; Matteson, D.S.; Ruppert, D. Functional autoregression for sparsely sampled data. J. Bus. Econ. Stat. 2019, 37, 97–109.
  15. Bosq, D. Linear Processes in Function Spaces: Theory and Applications; Springer Science & Business Media: New York, NY, USA, 2000.
  16. Kokoszka, P.; Reimherr, M. Determining the order of the functional autoregressive model. J. Time Ser. Anal. 2013, 34, 116–129.
  17. Xu, X.; Chen, Y.; Zhang, G.; Koch, T. Modeling functional time series and mixed-type predictors with partially functional autoregressions. J. Bus. Econ. Stat. 2022, 42, 349–366.
  18. Stone, C. The use of polynomial splines and their tensor products in multivariate function estimation. Ann. Stat. 1994, 22, 118–171.
  19. DeVore, R.; Lorentz, G. Constructive Approximation; Springer Science & Business Media: New York, NY, USA, 1993; Volume 303.
Figure 1. Densities of global COVID-19 mortality rates (‰) observed over a 100-day period. (a) Three-dimensional representation of the evolving density time series across the entire time span. (b) Density curves plotted for three selected days.
Figure 2. Average estimates of the FAR(1) error process ε(u) obtained from 200 Monte Carlo replications with sample size T = 100 and n = 100 observations. Panel (a) shows the true curves; panel (b) shows the spline-based estimates. Each color represents a curve corresponding to an individual simulated subject.
Figure 3. Average estimates of the bivariate functions g_m(u, x_m), m = 1, 2. Left panels: true density surfaces; middle panels: initial spline-based estimates; right panels: improved estimates after adjusting for the error structure. The top two rows show g_1(u, x_1) from two different angles; the bottom row shows g_2(u, x_2).
Figure 4. Heat map of the bivariate varying-coefficient function g_1(u, x_1) in the model based on the COVID-19 mortality rate (‰) data.
Figure 5. Densities of national quarterly personal income in the USA over 44 quarters. (a) Three-dimensional view of the density time series over the entire period; (b) density curves at three selected quarters.
Figure 6. Heat maps of the bivariate varying-coefficient functions g_m(u, x_m), m = 0, 1, 2, based on the USA income data.
Table 1. Average RMSEs of both initial and improved estimators of the bivariate varying-coefficient additive functions g_m(u, x_m).

                   g_1(u, x_1)            g_2(u, x_2)
  T     n     Initial   Improved     Initial   Improved
  50    50    0.2247    0.1848       0.2139    0.1785
  50    100   0.1759    0.1325       0.1844    0.1521
  100   50    0.1826    0.1471       0.1732    0.1354
  100   100   0.1431    0.1164       0.1319    0.1057
Table 2. Average standard deviation (SD) and bias of both initial and improved estimators of the bivariate varying-coefficient additive functions g_m(u, x_m).

                          g_1(u, x_1)                       g_2(u, x_2)
                   Initial        Improved          Initial        Improved
  T     n       SD      Bias    SD      Bias     SD      Bias    SD      Bias
  50    50     0.205   0.147   0.168   0.104    0.219   0.137   0.183   0.117
  50    100    0.179   0.122   0.142   0.093    0.196   0.128   0.164   0.095
  100   50     0.174   0.136   0.151   0.082    0.187   0.131   0.158   0.086
  100   100    0.133   0.099   0.112   0.057    0.153   0.111   0.129   0.061
Table 3. Empirical power of the testing algorithm for determining the order of the FAR error process under different significance levels.

                H0: p = 0 vs. p ≥ 1    H0: p ≤ 1 vs. p ≥ 2    H0: p ≤ 2 vs. p ≥ 3
  T     n       α = 0.05   α = 0.10    α = 0.05   α = 0.10    α = 0.05   α = 0.10
  50    50       0.893      0.962       0.787      0.846       0.082      0.134
  50    100      0.931      0.985       0.824      0.893       0.073      0.125
  100   50       0.942      0.972       0.821      0.881       0.071      0.121
  100   100      0.985      1.000       0.889      0.935       0.064      0.113
Table 4. Average RMSEs of both initial and improved estimators of the bivariate varying-coefficient additive functions g_m(u, x_m).

                   g_1(u, x_1)            g_2(u, x_2)
  T     n     Initial   Improved     Initial   Improved
  50    50    0.2739    0.2438       0.2691    0.2235
  50    100   0.2264    0.1852       0.2157    0.1809
  100   50    0.2136    0.1817       0.2232    0.1761
  100   100   0.1729    0.1263       0.1816    0.1224
Table 5. Empirical power of the testing algorithm for determining the order of the FAR error process under different significance levels.

                H0: p = 0 vs. p ≥ 1    H0: p ≤ 1 vs. p ≥ 2    H0: p ≤ 2 vs. p ≥ 3
  T     n       α = 0.05   α = 0.10    α = 0.05   α = 0.10    α = 0.05   α = 0.10
  50    50       0.832      0.891       0.724      0.795       0.154      0.197
  50    100      0.876      0.932       0.776      0.843       0.136      0.162
  100   50       0.884      0.937       0.792      0.838       0.131      0.159
  100   100      0.923      0.951       0.854      0.893       0.089      0.127
Table 6. Average RMSEs of both initial and improved estimators of the bivariate varying-coefficient additive functions g_m(u, x_m).

                   g_1(u, x_1)            g_2(u, x_2)
  T     n     Initial   Improved     Initial   Improved
  50    50    0.5379    0.4906       0.5582    0.5072
  50    100   0.4631    0.4125       0.4824    0.4436
  100   50    0.4582    0.4207       0.4871    0.4320
  100   100   0.3965    0.3518       0.4241    0.3736
Table 7. p-values from the testing algorithm applied to identify the order of the functional error process based on the COVID-19 mortality rate data.

  Null hypothesis           p = 0     p ≤ 1
  Alternative hypothesis    p ≥ 1     p ≥ 2
  p-value                   0.000     0.194
Table 8. p-values from the testing algorithm for determining the order of the functional error process based on the USA income data.

  Null hypothesis           p = 0     p ≤ 1     p ≤ 2
  Alternative hypothesis    p ≥ 1     p ≥ 2     p ≥ 3
  p-value                   0.000     0.000     0.436