Abstract
Functional data such as curves, shapes, and manifolds have become increasingly common with modern technological advances. The multiplicative regression model is well suited to analyzing data with positive responses. In this paper, we study estimation for the partial functional multiplicative regression model (PFMRM) under the least absolute relative error (LARE) criterion and the least product relative error (LPRE) criterion. The functional predictor and the slope function are approximated by functional principal component basis functions. Under certain regularity conditions, we derive the convergence rate of the slope function and establish the asymptotic normality of the slope vector for both estimation methods. Monte Carlo simulations are carried out to evaluate the proposed methods, and an application to the Tecator data is presented for illustration.
Keywords:
functional principal component analysis; least absolute relative error; least product relative error; partial functional multiplicative regression model
MSC:
62L05; 62L12; 62F10; 62F12; 62F25
1. Introduction
We consider the PFMRM, which includes some scalar covariates and a functional predictor, paired with a positive scalar response. That is,
$$ Y = \exp\Big\{ \mathbf{Z}^\top \boldsymbol{\theta} + \int_T X(t)\,\beta(t)\,dt \Big\}\,\varepsilon, \qquad (1) $$
where $Y$ is a positive response variable; $\mathbf{Z}$ is a $p$-dimensional vector covariate and $\boldsymbol{\theta}$ is the corresponding $p$-vector of slope coefficients, in which $p$ is assumed to be fixed; $\beta(\cdot)$ is the unknown slope function associated with the functional predictor $X(\cdot)$; and $\varepsilon$ is a positive random error independent of $\mathbf{Z}$ and $X$. Here, the Hilbert space $L^2(T)$ is the set of all square-integrable functions on $T$, endowed with the inner product $\langle f, g \rangle = \int_T f(t) g(t)\,dt$ and the norm $\|f\| = \langle f, f \rangle^{1/2}$. Model (1) generalizes both the classic multiplicative regression model [1] and the functional multiplicative regression model [2], which correspond to the cases $\beta \equiv 0$ and $\boldsymbol{\theta} = \mathbf{0}$, respectively. When the log transformation is applied to both sides, the model reduces to a partial functional linear regression model [3]. When the response variable $Y$ is a failure time, model (1) is called the functional accelerated failure time model in survival analysis; see [4], for example. For simplicity of notation, we assume throughout that $T = [0, 1]$ and that $X$ and $\mathbf{Z}$ have zero mean.
In many applications, the response variable is positive: survival times, stock prices, incomes, body fat levels, emissions of nitrogen oxides, and the values of owner-occupied homes frequently arise in statistical practice. The multiplicative regression model plays an important role in describing these types of data. To estimate multiplicative regression models, Refs. [1,5] proposed LARE and LPRE estimation, respectively. Writing $\hat{Y}_i$ for the fitted value of $Y_i$, the LARE criterion minimizes $\sum_{i=1}^{n} \{ |(Y_i - \hat{Y}_i)/Y_i| + |(Y_i - \hat{Y}_i)/\hat{Y}_i| \}$, and the LPRE criterion minimizes $\sum_{i=1}^{n} |(Y_i - \hat{Y}_i)/Y_i| \cdot |(Y_i - \hat{Y}_i)/\hat{Y}_i|$, which is equivalent to minimizing $\sum_{i=1}^{n} ( Y_i/\hat{Y}_i + \hat{Y}_i/Y_i )$. As pointed out by [5], LARE estimation is robust and scale-free, but its optimization may be challenging because the objective function being minimized is non-smooth. In addition, confidence intervals for the parameters are not very accurate, owing to the complexity of the asymptotic covariance matrix, which involves the density of the model error. To overcome these shortcomings of LARE, Ref. [5] proposed the LPRE criterion, which is strictly convex and infinitely differentiable, so the optimization procedure is much easier. In recent years, owing to the excellent properties of LARE and LPRE estimation, scholars in various fields have been attracted to extending this line of research; readers may refer to [6,7,8].
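To make the two criteria concrete, the display below writes them for the generic multiplicative model $Y_i = \exp(\mathbf{x}_i^\top \boldsymbol{\beta})\,\varepsilon_i$ of [1,5], with fitted value $\hat{Y}_i = \exp(\mathbf{x}_i^\top \boldsymbol{\beta})$; the notation here is ours, but the last two equalities are a direct expansion.

```latex
\begin{align*}
\mathrm{LARE}_n(\boldsymbol{\beta})
  &= \sum_{i=1}^{n}\left\{\left|\frac{Y_i-\hat{Y}_i}{Y_i}\right|
     +\left|\frac{Y_i-\hat{Y}_i}{\hat{Y}_i}\right|\right\},\\
\mathrm{LPRE}_n(\boldsymbol{\beta})
  &= \sum_{i=1}^{n}\left|\frac{Y_i-\hat{Y}_i}{Y_i}\right|
     \cdot\left|\frac{Y_i-\hat{Y}_i}{\hat{Y}_i}\right|
   = \sum_{i=1}^{n}\frac{(Y_i-\hat{Y}_i)^2}{Y_i\,\hat{Y}_i}
   = \sum_{i=1}^{n}\left\{\frac{Y_i}{\hat{Y}_i}+\frac{\hat{Y}_i}{Y_i}-2\right\}.
\end{align*}
```

The last form makes the contrast plain: each LPRE summand is a smooth, strictly convex function of $\mathbf{x}_i^\top \boldsymbol{\beta}$, whereas the LARE summands are non-smooth in the relative errors.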
For functional multiplicative models, to the best of our knowledge, there are only a few works and all of them focus on the above two criteria. For example, Ref. [9] developed the functional quadratic multiplicative model and derived the asymptotic properties of the estimator with the LARE criterion. Later, Refs. [2,10] considered the variable selection for partially and locally sparse functional linear multiplicative models based on the LARE criterion. In this paper, we consider the modeling of a positive scalar response variable with both scalar and functional predictors under the PFMRM. The above two criteria are employed to estimate the parametric vector and the slope function in model (1).
The major contributions of this paper are four-fold. First, this study extends the LPRE criterion to the estimation of functional regression models for the first time. Second, we approximate the unknown slope function and the functional predictor using the functional principal component analysis technique, derive the convergence rate of the slope function, and establish the asymptotic normality of the parameter vector under mild regularity conditions for the two estimation methods. Third, we develop an iterative algorithm for the resulting optimization problem and propose a data-driven procedure for selecting the tuning parameter. Finally, we conduct extensive numerical studies of the finite sample performance of the proposed methods and find that the LPRE method outperforms the LARE, least squares, and least absolute deviation methods.
The rest of the article is organized as follows. Section 2 describes the detailed estimation procedures for model (1). Section 3 is dedicated to the asymptotic study of our estimators. Section 4 presents a feasible algorithm, based on the LPRE criterion, for estimating the parametric and nonparametric components of the PFMRM. Section 5 reports simulation studies evaluating the finite sample performance of the proposed methods. In Section 6, we apply the proposed methods to the Tecator data. The article concludes with a discussion in Section 7. Proofs are provided in Appendix A.
2. Estimation Method
Let $(Y_i, \mathbf{Z}_i, X_i)$, $i = 1, \dots, n$, be independent realizations of $(Y, \mathbf{Z}, X)$ generated from model (1), that is,
$$ Y_i = \exp\Big\{ \mathbf{Z}_i^\top \boldsymbol{\theta} + \int_T X_i(t)\,\beta(t)\,dt \Big\}\,\varepsilon_i, \qquad i = 1, \dots, n, \qquad (2) $$
where the random errors $\varepsilon_i$, $i = 1, \dots, n$, are independent and identically distributed (i.i.d.) and independent of $\mathbf{Z}_i$ and $X_i$.
The covariance and empirical covariance functions of $X$ can be defined as
$$ C(s, t) = E\{X(s) X(t)\}, \qquad \hat{C}(s, t) = \frac{1}{n} \sum_{i=1}^{n} X_i(s) X_i(t). $$
According to Mercer’s theorem, the spectral expansions of $C$ and $\hat{C}$ can be written as
$$ C(s, t) = \sum_{j=1}^{\infty} \lambda_j \phi_j(s) \phi_j(t), \qquad \hat{C}(s, t) = \sum_{j=1}^{\infty} \hat{\lambda}_j \hat{\phi}_j(s) \hat{\phi}_j(t), $$
where $\lambda_1 > \lambda_2 > \cdots > 0$ and $\hat{\lambda}_1 \ge \hat{\lambda}_2 \ge \cdots \ge 0$ are the ordered eigenvalue sequences of the linear operators with kernels $C$ and $\hat{C}$, respectively, and $\{\phi_j\}$ and $\{\hat{\phi}_j\}$ are the corresponding orthonormal eigenfunction sequences. With a slight abuse of notation, we use $C$ to denote both the covariance operator and the covariance function of $X$. We assume that the covariance operator defined by $(Cf)(s) = \int_T C(s, t) f(t)\,dt$ is strictly positive. In addition, $\hat{C}$ can be regarded as an estimator of $C$.
On the basis of the Karhunen–Loève decomposition, $X_i$ and $\beta$ can be expanded as
$$ X_i(t) = \sum_{j=1}^{\infty} \xi_{ij}\,\phi_j(t), \qquad \beta(t) = \sum_{j=1}^{\infty} \gamma_j\,\phi_j(t), \qquad (3) $$
where $\gamma_j = \langle \beta, \phi_j \rangle$, and $\xi_{ij} = \langle X_i, \phi_j \rangle$ represents the coordinate of the $i$th curve with respect to the $j$th eigenbasis.
Analogously, for a truncation level $m$, we define the score vector $\mathbf{U}_i = (\xi_{i1}, \dots, \xi_{im})^\top$ and the coefficient vector $\boldsymbol{\gamma} = (\gamma_1, \dots, \gamma_m)^\top$. Their corresponding empirical counterparts are constructed with $\hat{\phi}_j$ in place of $\phi_j$.
Given the orthogonality of the $\phi_j$ and the expansion (3), model (2) can be rewritten as
$$ Y_i = \exp\Big\{ \mathbf{Z}_i^\top \boldsymbol{\theta} + \sum_{j=1}^{\infty} \gamma_j \xi_{ij} \Big\}\,\varepsilon_i \approx \exp\big( \mathbf{Z}_i^\top \boldsymbol{\theta} + \mathbf{U}_i^\top \boldsymbol{\gamma} \big)\,\varepsilon_i, $$
where the truncation parameter $m \to \infty$ as $n \to \infty$.
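As a concrete illustration of this truncation step, the following is a minimal base-R sketch of the empirical FPCA — covariance estimation, eigendecomposition, and score extraction — on simulated curves observed on a common grid. All names (`n`, `ngrid`, `X`, `m`) are illustrative rather than taken from the paper.

```r
## Minimal FPCA sketch in base R, assuming curves observed on a common grid
set.seed(1)
n <- 200; ngrid <- 101
grid <- seq(0, 1, length.out = ngrid)    # T = [0, 1]
h <- grid[2] - grid[1]                   # Riemann quadrature step

## Toy curves built from a cosine basis with decaying score variances
X <- matrix(0, n, ngrid)
for (j in 1:20) X <- X + outer(rnorm(n, sd = 1 / j), sqrt(2) * cos(j * pi * grid))
X <- sweep(X, 2, colMeans(X))            # center, so X has (empirical) zero mean

## Empirical covariance C_hat(s, t) = n^{-1} sum_i X_i(s) X_i(t) on the grid
Chat <- crossprod(X) / n

## Eigendecomposition of the discretized covariance operator; rescaling by
## the grid step makes the eigenfunctions unit-norm in L2
eig    <- eigen(Chat * h, symmetric = TRUE)
lambda <- eig$values                     # estimated eigenvalues
phi    <- eig$vectors / sqrt(h)          # estimated eigenfunctions (columns)

## Estimated scores xi_hat_ij = <X_i, phi_hat_j>, truncated at level m
m <- 4
U <- (X %*% phi[, 1:m]) * h              # n x m matrix of FPC scores
```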
2.1. LARE Estimation
This is based on the LARE criterion of [1], with $\phi_j$ replaced by its estimator $\hat{\phi}_j$. The LARE estimates of model (1) can be obtained by minimizing the following loss function:
$$ L_n(\boldsymbol{\theta}, \boldsymbol{\gamma}) = \sum_{i=1}^{n} \left\{ \left| \frac{Y_i - \exp(\mathbf{Z}_i^\top \boldsymbol{\theta} + \hat{\mathbf{U}}_i^\top \boldsymbol{\gamma})}{Y_i} \right| + \left| \frac{Y_i - \exp(\mathbf{Z}_i^\top \boldsymbol{\theta} + \hat{\mathbf{U}}_i^\top \boldsymbol{\gamma})}{\exp(\mathbf{Z}_i^\top \boldsymbol{\theta} + \hat{\mathbf{U}}_i^\top \boldsymbol{\gamma})} \right| \right\}, \qquad (4) $$
where $\hat{\mathbf{U}}_i = (\hat{\xi}_{i1}, \dots, \hat{\xi}_{im})^\top$ with $\hat{\xi}_{ij} = \langle X_i, \hat{\phi}_j \rangle$, $j = 1, \dots, m$. Moreover, denoting the minimizers by $(\hat{\boldsymbol{\theta}}, \hat{\boldsymbol{\gamma}})$, we can obtain the LARE estimator of the slope function, $\hat{\beta}(t) = \sum_{j=1}^{m} \hat{\gamma}_j \hat{\phi}_j(t)$.
2.2. LPRE Estimation
This is based on the LPRE criterion of [5], with $\phi_j$ replaced by its estimator $\hat{\phi}_j$. The LPRE estimates of model (1) can be obtained by minimizing the following loss function:
$$ \tilde{L}_n(\boldsymbol{\theta}, \boldsymbol{\gamma}) = \sum_{i=1}^{n} \left\{ Y_i \exp\big( -\mathbf{Z}_i^\top \boldsymbol{\theta} - \hat{\mathbf{U}}_i^\top \boldsymbol{\gamma} \big) + Y_i^{-1} \exp\big( \mathbf{Z}_i^\top \boldsymbol{\theta} + \hat{\mathbf{U}}_i^\top \boldsymbol{\gamma} \big) - 2 \right\}, \qquad (5) $$
where $\hat{\mathbf{U}}_i$ is defined as in Section 2.1. Moreover, denoting the minimizers by $(\check{\boldsymbol{\theta}}, \check{\boldsymbol{\gamma}})$, we can obtain the LPRE estimator of the slope function, $\check{\beta}(t) = \sum_{j=1}^{m} \check{\gamma}_j \hat{\phi}_j(t)$.
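Since the objective in (5) is smooth and convex in $(\boldsymbol{\theta}, \boldsymbol{\gamma})$, a generic optimizer already works; Section 4 replaces this with an explicit Newton–Raphson scheme. A hedged sketch follows, reusing `n`, `m`, and the score matrix `U` from the FPCA sketch in Section 2, with toy data whose parameter values are illustrative.

```r
## LPRE objective for b = (theta, gamma), with stacked design W = [Z, U]; cf. (5)
lpre_loss <- function(b, W, y) {
  eta <- drop(W %*% b)                   # Z'theta + U_hat'gamma
  sum(y * exp(-eta) + exp(eta) / y - 2)
}

## Toy data on top of the FPCA sketch above (coefficient values illustrative)
p <- 2
Z <- cbind(rnorm(n), rbinom(n, 1, 0.5))
y <- drop(exp(Z %*% c(1, -1) + U %*% rep(0.5, m))) * exp(rnorm(n, sd = 0.3))
W <- cbind(Z, U)

fit <- optim(rep(0, p + m), lpre_loss, W = W, y = y, method = "BFGS")
theta_hat <- fit$par[1:p]                # scalar-covariate coefficients
gamma_hat <- fit$par[-(1:p)]             # FPC score coefficients
```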
3. Asymptotic Properties
In this section, we establish the asymptotic properties of the estimators. Formulating the results requires the following technical assumptions. First, we introduce some notation. Suppose that $\boldsymbol{\theta}_0$ and $\beta_0$ are the true values of $\boldsymbol{\theta}$ and $\beta$, respectively, and let $\boldsymbol{\gamma}_0$ denote the true score coefficient vector. The notation $\|\cdot\|$ denotes the $L^2$ norm for a function and the Euclidean norm for a vector. In what follows, $c$ denotes a generic positive constant that may take various values. Moreover, $a_n \asymp b_n$ means that $a_n / b_n$ is bounded away from zero and infinity as $n \to \infty$.
- C1.
- The random process and the score satisfy the following conditions:
- C2.
- For the eigenvalues of the linear operator and the score coefficients, the following conditions hold:There exist some constants c and such that ;There exist some constants c and such that .
- C3.
- The tuning parameter .
- C4.
- For the random vector ,
- C5.
- There exists some constant c such that .
- C6.
- Let with , for each k, then are independent and identically distributed random variables. Assume thatwhere is the kth diagonal element of with , and is a positive definite matrix.
- C7.
- The error $\varepsilon$ has a continuous density in a neighborhood of 1 and is independent of $(\mathbf{Z}, X)$.
- C8.
- , , and .
- C9.
- , .
Remark 1.
C1–C3 are standard assumptions used in classical functional linear regression (see, e.g., [11,12]). More specifically, C1 is needed for the consistency of the empirical eigenstructure. C2(a) is required to identify the slope function by preventing the spacings between eigenvalues from being too small, while C2(b) is used to make the slope function sufficiently smooth. C3 is required to obtain the convergence rate of the slope function. C4–C6 are used to handle the linear part of the vector-type covariate in the model and are similar to those in [3,13]. C4 is a little stronger than its counterparts in classical linear models and is primarily used to ensure the asymptotic behavior of the estimators. C5 makes the effect of truncation on the estimation small enough. Notably, the quantity in C6 is the regression error of the scalar covariates on the functional scores, and the conditions in C6 essentially restrict the scalar covariates to being only linearly related to the scores. C6 is also used to establish the asymptotic normality of the parameter estimator, in a manner similar to that applied in [3,13] for modeling the dependence between the parametric and nonparametric components. C7–C8 are standard assumptions on the random errors for the LARE estimator, as used in [1]. C9 is the standard assumption on the random errors for the LPRE estimator, as used in [5].
The following two theorems present the convergence rate of the slope function estimator and establish the asymptotic normality of the parameter estimator, respectively, for the LARE method introduced in Section 2.1 above.
Theorem 1.
If conditions C1–C8 hold, then
Theorem 2.
If conditions C1–C8 hold, then, as $n \to \infty$, we have
where $\stackrel{d}{\longrightarrow}$ represents convergence in distribution.
The following two theorems give the rate of convergence of the slope function and the asymptotic normality of the parameter vector, respectively, for the LPRE method introduced in Section 2.2 above.
Theorem 3.
Suppose conditions C1–C6 and C9 hold; then,
Theorem 4.
Suppose conditions C1–C6 and C9 hold; then, as $n \to \infty$, we have
where .
Remark 2.
The convergence rate of the slope function obtained in Theorems 1 and 3 is the same as that of [12,13], which is optimal in the minimax sense. The variance in Theorems 2 and 4 involves the density function of the random error, which is a standard feature of multiplicative regression models. One can consult Theorem 3.2 of [14] for more details.
4. Implementation
Considering that the minimization problems of the LARE method are special cases of the LPRE procedure, we only provide a detailed implementation of the LPRE approach. Specifically, we use the Newton–Raphson iterative algorithm to solve the LPRE problem in Equation (5). Let $\boldsymbol{\vartheta} = (\boldsymbol{\theta}^\top, \boldsymbol{\gamma}^\top)^\top$ and let $\ell(\boldsymbol{\vartheta})$ denote the LPRE objective in (5). Then, the computation can be implemented as follows:
Step 1 Initialization step. In this paper, the least squares estimator is chosen as the initial estimator.
Step 2 Update the estimator of $\boldsymbol{\vartheta}$ by using the following iterative procedure:
$$ \boldsymbol{\vartheta}^{(k+1)} = \boldsymbol{\vartheta}^{(k)} - \big\{ \nabla^2 \ell(\boldsymbol{\vartheta}^{(k)}) \big\}^{-1} \nabla \ell(\boldsymbol{\vartheta}^{(k)}), $$
where $\nabla \ell(\boldsymbol{\vartheta}^{(k)})$ and $\nabla^2 \ell(\boldsymbol{\vartheta}^{(k)})$ represent the gradient and Hessian matrix of $\ell$ at $\boldsymbol{\vartheta}^{(k)}$, respectively.
Step 3 Step 2 is repeated until convergence. We declare convergence when the norm of the difference between two consecutive estimates falls below a prescribed tolerance. Note that [8] proposed a profiled LPRE method for partial linear multiplicative models; the algorithm in [8] requires two conditions to hold simultaneously. Since the LPRE objective function (5) is infinitely differentiable and strictly convex, the proposed Newton–Raphson method can relax that restriction. Moreover, the minimizer of the objective function (5) is exactly the root of its first derivative. We denote the final LPRE estimator of $\boldsymbol{\vartheta}$ by $\check{\boldsymbol{\vartheta}}$ and the LPRE estimator of the slope function by $\check{\beta}(t) = \sum_{j=1}^{m} \check{\gamma}_j \hat{\phi}_j(t)$.
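For concreteness, the following is a base-R sketch of Steps 1–3, under the assumption that the objective has the exponential form in (5) with stacked design $W = [\mathbf{Z}, \hat{\mathbf{U}}]$; the gradient and Hessian follow by direct differentiation, and the tolerance value is an illustrative choice since the exact threshold is not stated above.

```r
## Newton-Raphson for l(b) = sum_i { y_i e^{-eta_i} + y_i^{-1} e^{eta_i} - 2 }
## gradient: sum_i w_i (y_i^{-1} e^{eta_i} - y_i e^{-eta_i})
## Hessian : sum_i w_i w_i' (y_i e^{-eta_i} + y_i^{-1} e^{eta_i}), positive definite
lpre_newton <- function(W, y, b0, tol = 1e-8, maxit = 100) {
  b <- b0
  for (it in seq_len(maxit)) {
    eta  <- drop(W %*% b)
    a    <- y * exp(-eta)                 # y_i e^{-eta_i}
    d    <- exp(eta) / y                  # y_i^{-1} e^{eta_i}
    grad <- drop(crossprod(W, d - a))     # first derivative of the objective
    hess <- crossprod(W * (a + d), W)     # rows of W scaled by (a_i + d_i)
    step <- solve(hess, grad)
    b    <- b - step                      # Newton update (Step 2)
    if (sqrt(sum(step^2)) < tol) break    # convergence criterion (Step 3)
  }
  b
}

## Step 1: least squares on log(y) as the initial estimator
b0   <- qr.solve(W, log(y))
bhat <- lpre_newton(W, y, b0)
```

Because each summand of (5) is convex and the weights $a_i + d_i$ are strictly positive, the Hessian is positive definite whenever $W$ has full column rank, so the Newton step is always well defined.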
5. Simulation Studies
In this section, the finite sample properties of the proposed estimation methods are investigated through Monte Carlo simulation studies. We compare the performance of the two proposed methods with the least absolute deviations (LAD) method of [15] and the least squares (LS) method of [3], where both the LS and LAD estimates are based on taking the logarithm of both sides of model (6) below. The sample size $n$ is set to 150, 300, and 600, and the datasets are generated from the following model:
where $Z_{i1}$ follows the standard normal distribution and $Z_{i2}$ follows the Bernoulli distribution with success probability 0.5. For the functional linear component, we use a setting similar to that of [13] for the slope function and the functional predictor, whose scores are independently distributed according to normal distributions with mean 0 and decaying variances. Similar to [1], the random error $\varepsilon$ is generated from three distributions, (i)–(iii), where the constant $a$ is chosen to satisfy the identifiability condition of the corresponding criterion.
Implementing the proposed estimation methods requires choosing the tuning parameter $m$. Here, $m$ is selected as the smallest value for which the first $m$ leading components reach a given proportion $\tau$ of the cumulative percentage of total variance (CPV):
$$ m = \min\Big\{ k \ge 1 : \sum_{j=1}^{k} \hat{\lambda}_j \Big/ \sum_{j=1}^{M} \hat{\lambda}_j \ge \tau \Big\}, $$
where $M$ is the largest number of functional principal components under consideration, and the same proportion $\tau$ is used throughout this study.
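The CPV rule is a one-liner in R; the sketch below reuses the eigenvalues `lambda` from the FPCA sketch in Section 2, and the threshold value is an assumption, since the proportion used in the simulations is not recoverable here.

```r
## Smallest m whose leading eigenvalues explain at least proportion tau of
## the total variance among the first M components (tau, M illustrative)
select_m_cpv <- function(lambda, tau, M = length(lambda)) {
  lam <- pmax(lambda[seq_len(M)], 0)   # guard against tiny negative eigenvalues
  which(cumsum(lam) / sum(lam) >= tau)[1]
}
m_hat <- select_m_cpv(lambda, tau = 0.95)
```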
Based on 500 replications, Table 1 summarizes the performance of the different estimators in terms of the bias (Bias) and standard deviation (Sd) of the estimated scalar coefficients, as well as the mean squared error (MSE) of the estimated coefficient vector. Table 2 provides the root average square errors (RASEs) of the estimated slope function for the LARE estimation, where the RASE is defined as follows:
$$ \mathrm{RASE} = \left\{ \frac{1}{n_{\mathrm{grid}}} \sum_{k=1}^{n_{\mathrm{grid}}} \big( \hat{\beta}(t_k) - \beta(t_k) \big)^2 \right\}^{1/2}, $$
where $t_1, \dots, t_{n_{\mathrm{grid}}}$ are equally spaced grid points at which the value of the function is calculated. We compute the RASE for each replication and report the average. The definitions of the RASE for the LPRE, LAD, and LS methods are similar; we simply replace $\hat{\beta}$ with the corresponding estimators.
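Once the estimated and true slope functions have been evaluated on the grid, the RASE is immediate; a sketch with illustrative names:

```r
## Root average square error of an estimated slope function over a grid
rase <- function(beta_hat_vals, beta_true_vals) {
  sqrt(mean((beta_hat_vals - beta_true_vals)^2))
}
## e.g., with the FPCA objects above: beta_hat_vals <- drop(phi[, 1:m] %*% gamma_hat)
```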
Table 1.
The biases and standard deviations of the estimators of the scalar coefficients, and the mean squared error of the estimated coefficient vector, under different error distributions.
Table 2.
The RASEs of the estimators of the slope function under different error distributions.
From Table 1 and Table 2, we have the following observations: (a) Sd, MSE, and RASE decrease, and the estimation performance improves, as the sample size $n$ increases from 150 to 600. The estimates of the parametric covariate effects are essentially unbiased and close to their true values, indicating that our proposed approaches produce consistent estimators. (b) When the random error is normal, as expected, both LS and LPRE perform the best, and LAD performs the worst. (c) Under the second error distribution, LPRE performs the best, LARE also performs well, and LAD still performs the worst. (d) Under the third error distribution, the random error violates condition C8 for the LARE method, and neither the zero-mean assumption of least squares nor the zero-median assumption of LAD regression holds. Meanwhile, the LPRE method still works well in this case and performs considerably better than LARE and LAD, indicating that LPRE is much more robust than the LARE and LAD methods. In summary, LPRE performs the best in almost all the scenarios considered, confirming its superiority over LARE and the other competing methods.
6. Application to Tecator Data
In this section, we apply the proposed estimation methods to the Tecator data. The dataset is contained in the R package fda.usc [16] and includes 215 independent food samples, with the fat, protein, and water content of meat measured in percent. It has been widely used in the analysis of functional data. The Tecator data consist of a 100-channel spectrum of absorbances recorded over the wavelength range from 850 to 1050 nanometers (nm). Further details on the data can be found in [2,9]. The purpose is to tease out the relation among the fat content $Y$ (response), the protein content $Z_1$ and water content $Z_2$ (scalar covariates), and the spectrometric curve $X(t)$ (a functional predictor). To predict the fat content of a meat sample, we consider the following PFMRM:
$$ Y = \exp\Big\{ \theta_1 Z_1 + \theta_2 Z_2 + \int_T X(t)\,\beta(t)\,dt \Big\}\,\varepsilon. \qquad (7) $$
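The data can be loaded directly from fda.usc; the sketch below follows the package's documented structure for `tecator`, while the seed and split ratio are assumptions, since the exact training size is not recoverable from the text below.

```r
## Loading the Tecator data and forming a random training/testing split
library(fda.usc)
data(tecator)
absorp <- tecator$absorp.fdata$data            # 215 x 100 absorbance curves
wave   <- tecator$absorp.fdata$argvals         # wavelengths, 850-1050 nm
y      <- tecator$y$Fat                        # positive response: fat content (%)
Z      <- as.matrix(tecator$y[, c("Protein", "Water")])

set.seed(2025)                                 # illustrative seed and split ratio
train  <- sample(length(y), floor(2 * length(y) / 3))
```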
To assess the predictive capability of the proposed methods, we followed [13] and randomly divided the sample into two subsamples: a training sample, used to estimate the parameters, and a testing sample, used to check the accuracy of the prediction. We used the mean quadratic error of prediction (MQEP) as the criterion to evaluate the performance of the various estimation procedures. The MQEP is defined as follows:
$$ \mathrm{MQEP} = \frac{1}{n_2} \sum_{i=1}^{n_2} \frac{ \big( Y_i - \hat{Y}_i \big)^2 }{ \widehat{\mathrm{Var}}(Y) }, $$
where $\hat{Y}_i$ is predicted based on the training sample, $n_2$ is the size of the testing sample, and $\widehat{\mathrm{Var}}(Y)$ is the sample variance of the response over the testing sample.
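Under this reading of the definition — squared prediction errors standardized by the test-sample response variance, which is our interpretation of the garbled source — the MQEP is a one-line function:

```r
## Mean quadratic error of prediction on the test set
mqep <- function(y_test, y_pred) mean((y_test - y_pred)^2) / var(y_test)
```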
In addition, we compare the performance of the proposed model with that of the partial functional linear regression model of Shin [3] and with the model obtained by taking the log transformation of both sides of model (7) (denoted "LogPFLM"). Specifically,
The CPV criterion introduced in Section 5 was used to determine the cutoff parameter $m$, which was selected to explain approximately 95% of the variance in the Tecator data. Table 3 shows the average MQEP over the repeated random splits. The first and second rows of Table 3 show the prediction results of the LogPFLM using the LS and LAD methods, respectively. The third and fourth rows give the prediction results of (7) under the LARE and LPRE methods, respectively. The final row presents the prediction results of the PFLM without logarithmic transformation fitted by the LS method. Overall, LPRE outperforms all other competing methods regardless of the number of random splits; LS performs second best, whereas LAD performs the worst. In addition, we applied the above models and methods to the Tecator data using only the scalar predictors or only the functional predictor; the results were relatively poor, so we do not report them.
Table 3.
MQEP of different random partitions.
Then, we used the best-performing LPRE method to estimate the unknown parameters based on the entire dataset. Both estimated scalar coefficients are positive; that is, protein and water content are positively associated with the logarithmic transformation of fat content. Figure 1 depicts the estimated slope function. In general, the spectrometric curve has a positive effect on the logarithmic transformation of fat content, and the magnitude of the estimated curve is small owing to the large integration domain. The advantages of the LPRE method are particularly evident in the analysis of this dataset.
Figure 1.
The estimated functional weight in model (7) with the LPRE method.
7. Conclusions
In this paper, we studied the estimation problems of the PFMRM based on the LARE and LPRE criteria, with the unknown slope function and the functional predictor approximated by the functional principal component analysis technique. Under some regularity conditions, we obtained the convergence rates of the slope function and the asymptotic normality of the parameter vector for the two estimation methods. Both the numerical simulation results and the real data analysis show that the LPRE method is superior to the LARE, least squares, and least absolute deviation methods. Several issues still warrant further study. First, we chose the Karhunen–Loève expansion to approximate the slope function in this article; other nonparametric smoothing techniques, such as B-splines, kernel estimation, and penalized regression splines, could be used in the proposed LARE and LPRE estimation methods, and the corresponding large-sample properties and finite-sample comparisons are worth studying. Furthermore, the proposed methods can also be extended to more general situations, including, but not limited to, dependent functional data, partially observed functional data, and multivariate functional data. Substantial efforts must be devoted to related advances in the future.
Author Contributions
Conceptualization, X.L. and P.Y.; methodology and proof, X.L. and P.Y.; numerical study, X.L. and J.S.; writing—original draft preparation, P.Y. and J.S.; writing—review and editing, X.L., P.Y. and J.S. All authors have read and agreed to the published version of the manuscript.
Funding
This work was supported by the National Natural Science Foundation of China (12401356), the Natural Science Foundation of Shanxi Province (20210302124262, 20210302124530, 202203021222223), the National Statistical Science Research Project of China (2022LY089), and the Natural Science Foundation of Shanxi Normal University (JYCJ2022004).
Data Availability Statement
Researchers can download the Tecator dataset from the R package “fda.usc”.
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| PFMRM | partial functional multiplicative regression model |
| PFLM | partial functional regression model |
| LARE | least absolute relative error |
| LPRE | least product relative error |
| LAD | least absolute deviations |
| LS | least square |
| RASE | root average square error |
| MQEP | mean quadratic error of prediction |
| MSE | mean squared error |
| CPV | cumulative percentage of variance |
Appendix A
In this appendix, we provide the technical proofs for the results presented in Section 3.
Proof of Theorem 1.
Let , , , , and , where L is a large enough constant. Next, we show that, for any given , there exists a sufficiently large constant , such that
This implies that, with probability at least $1 - \epsilon$, there exists a local minimizer and in the ball such that , .
By using (see, e.g., [3,13]), we have
For , by conditions C1, C2, and the Hölder inequality, we can obtain
For , given that
We have
Taking these together, we obtain
Furthermore, a simple calculation yields
For , by the Knight identity (see, e.g., [17]), we have
By routine calculation, we have
where .
Using the Taylor series expansion, we have
where is between and , and is between and .
For , we have
It follows that , , and . Therefore,
For , define , and ; then,
Therefore,
Similarly, we can prove that
Proof of Theorem 2.
Firstly, let
According to the convexity lemma of [18] and Lemma A.1 in [1], for any compact sets in $\Re$, as $n \to \infty$, we have
With a simple calculation, we have
According to condition C8, we can obtain that the sum of the first term in Equation (A6) is 0. Further, by condition C8, we have
This means , and the second term in Equation (A6) is non-negative. It is also easy to prove that the third term in Equation (A6) is non-negative. Thus, for all and , we have . In addition, we have , , and then is a unique minimum point of . According to condition C8 and , we can obtain that is a minimizer of . Let . Then, for each , , there exist , such that , .
For any constant , , and , let () be a minimizer of such that . According to (A6), as , we have , and .
On the other hand, according to (A6), for arbitrary positive constants and , we have . Thus, with probability tending to 1, the minimizer is taken inside the compact set, by the strict convexity of the objective. Therefore, the local minimizer inside it is the only global minimizer. According to the definition of , when , . Thus, as , , we can obtain the weak consistency of and .
Next, we prove the asymptotic normality. Note that . By invoking the Taylor expansion, we have
where ,
Let , .
Next, we will show that, for every , one has
Let , . Then, Equation (A8) is rewritten as
To prove the above equation, we will first prove that the following Equation (A9) holds for each fixed and , that is,
Let
Then, we have
For each fixed and , as , one has
where such that for . As , one has
On the other hand, by the Taylor series expansion, for every fixed and , we have
where b such that .
Similarly, we have
where such that . Furthermore, for each fixed and , one has
Combining Equation (A11) with condition C8, we complete the proof of (A9). According to Lemma A.1 in [1], we know that is convex. Then, for each constant , we have
Lastly, let
Then, for each , as , we have
Let , be a minimizer of . A simple calculation yields . According to the definition of , for each , there exists some constant , , and , as . Therefore, one has . According to (A11), for each , there exists some constant , for any , such that
Therefore, for each , there exists , for any , such that
This means . Similarly, for each constant , one has
Let , and
A simple calculation yields . Note that
Then, for any constant , , , , such that , , one has
where , and are the smallest eigenvalues of , and , respectively.
Proof of Theorem 3.
Let , , , , , where L is a large enough constant. Next, we show that, for any given , there exists a sufficiently large constant , such that
This implies that, with probability at least $1 - \epsilon$, there exists a local minimizer and in the ball such that , .
With a simple calculation, we have
Invoking the Taylor expansion, we have
where is between and , and is between and . For , we have
It is easy to obtain , , and .
Furthermore, we have
Therefore, Equation (A15) holds, and there exists a local minimizer , such that .
Note that
Proof of Theorem 4.
According to Theorem 3, we know that, as , with probability tending to 1, achieves the minimal value at . We have the following score equations:
Using the Taylor series expansion for Equation (A21), we have
Let , , , . A simple calculation yields
Similarly,
Furthermore, we have
Let . Then,
where .
According to the law of large numbers and the central limit theorem, as , we obtain
□
References
- Chen, K.; Guo, S.; Lin, Y.; Ying, Z. Least absolute relative error estimation. J. Am. Stat. Assoc. 2010, 105, 1104–1112. [Google Scholar] [CrossRef] [PubMed]
- Fan, R.; Zhang, S.; Wu, Y. Penalized relative error estimation of functional multiplicative regression models with locally sparse properties. J. Korean Stat. Soc. 2022, 51, 666–691. [Google Scholar] [CrossRef]
- Shin, H. Partial functional linear regression. J. Stat. Plan. Inference 2009, 139, 3405–3418. [Google Scholar] [CrossRef]
- Liu, C.; Su, W.; Su, W. Efficient estimation for functional accelerated failure time model. arXiv 2024, arXiv:2402.05395. [Google Scholar]
- Chen, K.; Guo, S.; Lin, Y.; Ying, Z. Least product relative error estimation. J. Multivar. Anal. 2016, 144, 91–98. [Google Scholar] [CrossRef]
- Ming, H.; Liu, H.; Yang, H. Least product relative error estimation for identification in multiplicative additive models. J. Comput. Appl. Math. 2022, 404, 113886. [Google Scholar] [CrossRef]
- Ye, F.; Zhou, H.; Yang, Y. Asymptotic properties of relative error estimation for accelerated failure time model with divergent number of parameters. Stat. Its Interface 2024, 17, 107–125. [Google Scholar] [CrossRef]
- Zhang, J.; Feng, Z.; Peng, H. Estimation and hypothesis test for partial linear multiplicative models. Comput. Stat. Data Anal. 2018, 128, 87–103. [Google Scholar] [CrossRef]
- Zhang, T.; Zhang, Q.; Li, N. Least absolute relative error estimation for functional quadratic multiplicative model. Commun. Stat.-Theory Methods 2016, 45, 5802–5817. [Google Scholar] [CrossRef]
- Zhang, T.; Huang, Y.; Zhang, Q.; Ma, S.; Ahmed, S. Penalized relative error estimation of a partially functional linear multiplicative model. In Matrices, Statistics and Big Data: Selected Contributions from IWMS 2016; Springer: Cham, Switzerland, 2019; Volume 45, pp. 127–144. [Google Scholar]
- Cai, T.; Hall, P. Prediction in functional linear regression. Ann. Stat. 2006, 34, 2159–2179. [Google Scholar] [CrossRef]
- Hall, P.; Horowitz, J.L. Methodology and convergence rates for functional linear regression. Ann. Stat. 2007, 35, 70–91. [Google Scholar] [CrossRef]
- Yu, P.; Song, X.; Du, J. Composite expectile estimation in partial functional linear regression model. J. Multivar. Anal. 2024, 203, 105343. [Google Scholar] [CrossRef]
- Xia, X.; Liu, Z.; Yang, H. Regularized estimation for the least absolute relative error models with a diverging number of covariates. Comput. Stat. Data Anal. 2016, 96, 104–119. [Google Scholar] [CrossRef]
- Tang, Q.; Cheng, L. Partial functional linear quantile regression. Sci. China Math. 2014, 57, 2589–2608. [Google Scholar] [CrossRef]
- Febrero-Bande, M.; Oviedo de la Fuente, M. Statistical Computing in Functional Data Analysis: The R Package fda.usc. J. Stat. Softw. 2012, 51, 1–28. [Google Scholar] [CrossRef]
- Knight, K. Limiting distributions for L1 regression estimators under general conditions. Ann. Stat. 1998, 26, 755–770. [Google Scholar] [CrossRef]
- Pollard, D. Asymptotics for least absolute deviation regression estimators. Econom. Theory 1991, 7, 186–199. [Google Scholar] [CrossRef]