Article

Reproducing Kernel Hilbert Space Approach to Multiresponse Smoothing Spline Regression Function

Budi Lestari, Nur Chamidah, Dursun Aydin and Ersin Yilmaz

1 Department of Mathematics, Faculty of Mathematics and Natural Sciences, The University of Jember, Jember 68121, Indonesia
2 Research Group of Statistical Modeling in Life Sciences, Faculty of Science and Technology, Airlangga University, Surabaya 60115, Indonesia
3 Department of Mathematics, Faculty of Science and Technology, Airlangga University, Surabaya 60115, Indonesia
4 Department of Statistics, Faculty of Science, Muğla Sıtkı Koçman University, Muğla 48000, Turkey
* Author to whom correspondence should be addressed.
Symmetry 2022, 14(11), 2227; https://doi.org/10.3390/sym14112227
Submission received: 20 September 2022 / Revised: 13 October 2022 / Accepted: 14 October 2022 / Published: 23 October 2022
(This article belongs to the Section Mathematics)

Abstract

In statistical analyses, especially those using a multiresponse regression model approach, a mathematical model describing a functional relationship between more than one response variable and one or more predictor variables is often involved. The relationship between these variables is expressed by a regression function. In the multiresponse nonparametric regression (MNR) model, which is part of the multiresponse regression model, estimating the regression function is the main problem, as there is a correlation between the responses; it is therefore necessary to include a symmetric weight matrix in a penalized weighted least squares (PWLS) optimization during the estimation process, which is mathematically complicated. In this study, to estimate the regression function of the MNR model, we developed the PWLS optimization method for the MNR model proposed by a previous researcher, and used a reproducing kernel Hilbert space (RKHS) approach based on a smoothing spline to obtain the solution to the developed PWLS optimization. Additionally, we determined the symmetric weight matrix and the optimal smoothing parameters, and investigated the consistency of the regression function estimator. We illustrate the effects of the smoothing parameters on the estimation results using simulated data. In the future, the theory generated by this study can be developed within the scope of statistical inference, especially for testing hypotheses involving multiresponse nonparametric regression models and multiresponse semiparametric regression models, and can be used to estimate the nonparametric component of a multiresponse semiparametric regression model used to model Indonesian toddlers' standard growth charts.

1. Introduction

The theory of reproducing kernel Hilbert spaces (RKHS) was first introduced by Aronszajn in 1950 [39]. This theory was later developed by [1,2] to solve optimization problems in regression, especially nonparametric spline regression. The RKHS approach was used by [3] for an M-type spline estimator; next, Ref. [4] used the RKHS approach for a relaxed boundary smoothing spline estimator.
There are many cases in daily life that we have to analyze, especially cases involving the functional relationship between different variables. In statistics, regression analysis is used to analyze the functional relationship between several variables, namely, the influence of the independent (predictor) variables on the dependent (response) variables. Regression analysis requires building a mathematical model, commonly referred to as a regression model, in which this functional relationship is expressed by a regression function. There are two basic regression model approaches: parametric regression models and nonparametric regression models. In general, the main problem in regression analysis, whether using a parametric or a nonparametric regression model approach, is estimating the regression function. In the parametric regression model, estimating the regression function is equivalent to estimating the parameters of the model. This differs from the nonparametric regression model, in which estimating the regression function is equivalent to estimating an unknown smooth function contained in a Sobolev space using smoothing techniques.
There are several frequently used smoothing techniques for estimating nonparametric regression functions, for example, local linear, local polynomial, kernel, and spline estimators. The results of several previous studies have shown that smoothing techniques such as local linear, local polynomial, and kernel estimators are highly recommended for estimating nonparametric regression functions for prediction purposes. These include [5,6], who used local linear estimators for predicting hypertension risk and the number of Mycobacterium tuberculosis, respectively; Ref. [7], who used a local linear estimator for boundary correction of a nonparametric regression function; Ref. [8], who used a local linear estimator for bias reduction of a regression function estimate; Ref. [9], who used a local linear estimator to design a standard growth chart for assessing the nutritional status of toddlers; Refs. [10,11], who used local polynomial estimators for estimating regression functions in the cases of errors-in-variables and correlated errors, respectively; Refs. [12,13], who used local polynomial estimators to estimate the regression function for functional data and for finite populations, respectively; Ref. [14], which discussed kernel smoothing techniques; Refs. [15,16], which discussed the consistency of kernel regression estimation and the estimation of regression functions with correlated errors using kernels, respectively; and Refs. [17,18], which discussed estimating the covariance matrix and selecting the bandwidth using kernels, respectively. However, local linear, local polynomial, and kernel estimators are highly dependent on the bandwidth in the neighborhood of the target point. Thus, if we use these approaches to estimate a model with fluctuating data, a small bandwidth is required, which results in too rough an estimate of the curve. This means that these approaches consider only the goodness of fit, not smoothness. Therefore, for estimating models with data that fluctuate within sub-intervals, these methods are not suitable, as the estimation yields a large mean square error (MSE). This differs from spline approaches, which consider both goodness of fit and smoothness, as discussed by [1,19], who used splines for modeling observational data and estimating nonparametric regression functions. Furthermore, for prediction and interpretation purposes, smoothing techniques such as smoothing splines and truncated splines are better and more flexible for estimating nonparametric regression functions [20]. Due to the flexible nature of splines, many researchers have been interested in using and developing them in several cases. For example, M-type splines were used by [21] for analysis of variance of correlated data and by [22] for estimating both nonparametric and semiparametric regression functions; truncated splines were used by [23] to estimate mean arterial pressure for prediction purposes and by [24] to estimate blood pressure for prediction and interpretation purposes. Additionally, Ref. [25] developed truncated splines for estimating a semiparametric regression model and determining the asymptotic properties of the estimator. Furthermore, Ref. [26] discussed the flexibility of B-splines and penalties in estimating regression functions; Ref. [27] discussed analyzing current status data using penalized splines; Ref. [28] analyzed the association between cortisol and ACTH hormones using bivariate splines; and Ref. [29] analyzed censored data using spline regression. In addition, Ref. [30] used both kernel and spline estimators for estimating the regression function and selecting the optimal smoothing parameter of a uniresponse nonparametric regression (UNR) model; Refs. [31,32] developed both kernel and spline estimators for estimating the regression function and selecting the optimal smoothing parameter of a multiresponse nonparametric regression (MNR) model; and Ref. [33] discussed kernel and spline smoothing techniques for estimating the coefficients of a rates model.
In regression modeling, a common problem involves more than one response variable observed at several values of the predictor variables, with the responses correlated with each other. The multiresponse nonparametric regression (MNR) model approach is appropriate for modeling functions that represent the relationship between response and predictor variables with correlated responses. Because of this correlation, it is necessary to construct a weight matrix. Constructing the weight matrix is one of the things that distinguishes the MNR model approach from a classical model approach, that is, a parametric regression or uniresponse nonparametric regression model approach. Thus, the process of estimating the regression function requires a weight matrix in the form of a symmetric matrix, specifically a diagonal matrix. Furthermore, in the MNR model there are several smoothing techniques that can be used to estimate the regression function, one of which is the smoothing spline approach. In recent years, studies on smoothing splines have attracted a great deal of attention, and the methodology has been widely used in many areas of research; for example, Refs. [34,35] used smoothing spline, mixed smoothing spline, and Fourier series estimators for estimating regression functions of nonparametric regression models; Refs. [36,37] estimated regression functions of a semiparametric nonlinear regression model and a semiparametric regression model, respectively; and Ref. [38] discussed smoothing splines in an ANOVA model. The smoothing spline estimator, with its powerful and flexible properties, is one of the most popular estimators for estimating the regression function of a nonparametric regression model. Although the researchers mentioned above discussed splines for estimating regression functions in many cases, none of them used a reproducing kernel Hilbert space (RKHS) approach to estimate the regression function of the MNR model. On the other hand, even though there are studies, as mentioned above, that used the RKHS approach for estimating regression functions, they applied the RKHS to single-response (uniresponse) linear regression models only. This means that RKHS approaches have not been used for estimating the regression function of the MNR model based on a smoothing spline estimator. In addition, although [34] used the RKHS approach to estimate the regression function of the MNR model and discussed a special case involving a simulation study, Ref. [34] assumed that the three responses of the MNR model have the same smoothing parameter values, which is a difficult assumption to fulfill in real-life situations. Moreover, Ref. [34] did not discuss the consistency of the smoothing spline regression function estimator. Therefore, in this study we provide a theoretical discussion of estimating the smoothing spline regression function of the MNR model in the case of unequal smoothing parameter values using the RKHS approach; in other words, we discuss the more general case.

2. Materials and Methods

In this section, we briefly describe the materials and methods used according to the needs of this study, following the steps in the order in which they were carried out.

2.1. Multiresponse Nonparametric Regression Model

Suppose we are given paired observations $(y_{ri}, t_{ri})$ satisfying the following multiresponse nonparametric regression (MNR) model:
$$y_{ri} = f_r(t_{ri}) + \varepsilon_{ri}, \quad i = 1, 2, \dots, n_r, \quad r = 1, 2, \dots, p \tag{1}$$
where $y_{ri}$ is the observed value of the response variable for the $r$th response and the $i$th observation; $f_r(\cdot)$ represents an unknown nonparametric regression function of the $r$th response, which is assumed to be smooth in the sense that it is contained in a Sobolev space; $t_{ri}$ is the observed value of a predictor variable for the $r$th response and the $i$th observation; and $\varepsilon_{ri}$ represents the random error for the $r$th response and the $i$th observation, which is assumed to have zero mean and variance $\sigma_{ri}^2$ (heteroscedastic). In this study, we assume that the correlation between responses is
$$\rho_{rs} = \begin{cases} \rho_r, & \text{for } r = s \\ 0, & \text{for } r \neq s. \end{cases}$$
In general, the main problem in MNR modeling is estimating the MNR model, which is equivalent to estimating the regression function of the MNR model. Many smoothing techniques can be used to estimate the MNR model presented in (1), for example, kernel, local linear, local polynomial, and spline estimators. Among these, the smoothing spline is the most flexible estimator for estimating fluctuating data within sub-intervals. The following briefly presents the estimation method using the smoothing spline estimator; further details can be found in [20].

2.2. Smoothing Spline Estimator

In this study, we estimate the regression function $f_r(t_{ri})$ of the MNR model presented in (1) based on the smoothing spline estimator using the reproducing kernel Hilbert space approach, which is discussed in the following section. An estimate of the regression function of the MNR model presented in (1) can be obtained by developing the penalized weighted least squares (PWLS) optimization method proposed by [31], which applies only to the two-response nonparametric regression model with equal error variances, namely, the homoscedastic case. We develop this PWLS optimization to estimate a nonparametric regression model with more than two responses, namely, the MNR model, in the case of unequal error variances, known as the heteroscedastic case. Hence, the estimated smoothing spline of the MNR model presented in (1) can be obtained by carrying out the following PWLS optimization:
$$\min_{f_1, \dots, f_p \in W_2^m[a_r, b_r]} \left\{ N^{-1} \left[ (\mathbf{y}_1 - \mathbf{f}_1)^T \mathbf{W}_1 (\mathbf{y}_1 - \mathbf{f}_1) + \cdots + (\mathbf{y}_p - \mathbf{f}_p)^T \mathbf{W}_p (\mathbf{y}_p - \mathbf{f}_p) \right] + \lambda_1 \int_{a_1}^{b_1} \left( f_1^{(2)}(t) \right)^2 dt + \cdots + \lambda_p \int_{a_p}^{b_p} \left( f_p^{(2)}(t) \right)^2 dt \right\} \tag{2}$$
where $N = \sum_{r=1}^{p} n_r$; $\mathbf{W}_1, \dots, \mathbf{W}_p$ are symmetric weight matrices; $\lambda_1, \dots, \lambda_p$ are smoothing parameters; and $f_1, f_2, \dots, f_p$ are unknown regression functions in the Sobolev space $W_2^m[a_r, b_r]$, defined as
$$W_2^m[a_r, b_r] = \left\{ f \,\middle|\, f^{(v)},\ v = 0, 1, \dots, m - 1, \text{ are absolutely continuous on } [a_r, b_r] \text{ and } f^{(m)} \in L_2[a_r, b_r] \right\}, \quad r = 1, 2, \dots, p,$$
where $L_2[a_r, b_r]$ is the collection of square integrable functions on $[a_r, b_r]$.
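As a concrete illustration of how the two terms in (2) trade off, the following sketch evaluates the PWLS criterion numerically for two responses. The data, the candidate functions, the identity weights, and the finite-difference approximation of the roughness penalty are all illustrative assumptions rather than part of the proposed method.

```python
# A minimal numerical sketch of the PWLS criterion in (2) for p = 2 responses.
# All inputs below (data, candidate functions, identity weights, grid) are
# illustrative assumptions; the roughness penalty is approximated by finite
# differences rather than evaluated exactly.
import numpy as np

def pwls(y_list, f_list, W_list, lam_list, t):
    """Goodness of fit (weighted least squares) plus roughness penalties."""
    N = sum(len(y) for y in y_list)
    fit = sum((y - f) @ W @ (y - f) for y, f, W in zip(y_list, f_list, W_list)) / N
    dt = t[1] - t[0]
    # Riemann-sum approximation of each integral of the squared second derivative.
    pen = sum(lam * np.sum(np.gradient(np.gradient(f, dt), dt) ** 2) * dt
              for lam, f in zip(lam_list, f_list))
    return fit + pen

t = np.linspace(0.0, 1.0, 100)
rng = np.random.default_rng(0)
y1 = np.sin(2 * np.pi * t) + rng.normal(0.0, 0.3, t.size)
y2 = np.cos(2 * np.pi * t) + rng.normal(0.0, 0.4, t.size)
W = np.eye(t.size)
# A smooth candidate pays a small penalty; interpolating the noisy data
# (f = y) gives zero fit error but a large penalty.
print(pwls([y1, y2], [np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)], [W, W], [0.1, 0.1], t))
print(pwls([y1, y2], [y1, y2], [W, W], [0.1, 0.1], t))
```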
Furthermore, to obtain the solution to the PWLS provided in (2), we used the reproducing kernel Hilbert space (RKHS) approach. In the following section, we provide a brief review of RKHS. Further details related to RKHS can be found in [39], a paper concerning the theory of RKHS, and in [40], a textbook which discusses the use of RKHS in probability and statistics.

2.3. Reproducing Kernel Hilbert Space

The need for reproducing kernel Hilbert spaces (RKHS) arises in various fields, including statistics, approximation theory, machine learning theory, group representation theory, and complex analysis. In statistics, the RKHS method is often used for estimating a regression function based on the smoothing spline estimator for prediction purposes. In machine learning, the RKHS method is arguably the most popular approach for dealing with nonlinearity in data. Several researchers have discussed the RKHS method; for example, Refs. [41,42] discussed the use of RKHS in support vector machines (SVMs) and optimization problems, respectively, and Refs. [43,44] discussed the use of RKHS for asymptotic distributions in regression and in machine learning.
A Hilbert space $\mathcal{H}$ is called an RKHS on a set $X$ over a field $\mathbb{F}$ if the following conditions are met [1,39]:
(i) $\mathcal{H}$ is a vector subspace of $\mathcal{F}(X, \mathbb{F})$, where $\mathcal{F}(X, \mathbb{F})$ is the vector space of functions from $X$ to $\mathbb{F}$;
(ii) $\mathcal{H}$ is endowed with an inner product $\langle \cdot, \cdot \rangle$, making it into a Hilbert space;
(iii) the linear evaluation functional $E_y : \mathcal{H} \to \mathbb{F}$, defined by $E_y(f) = f(y)$, is bounded for every $y \in X$.
Furthermore, if $\mathcal{H}$ is an RKHS on $X$, then for every $y \in X$ there exists a unique vector $k_y \in \mathcal{H}$ such that $f(y) = \langle f, k_y \rangle$ for every $f \in \mathcal{H}$. This is because every bounded linear functional is given by the inner product with a unique vector in $\mathcal{H}$ (the Riesz representation theorem). The function $k_y$ is called the reproducing kernel (RK) for the point $y$. The reproducing kernel for $\mathcal{H}$ is the two-variable function defined by $K(x, y) = k_y(x)$. Hence, we have $K(x, y) = k_y(x) = \langle k_y, k_x \rangle$ and $\| E_y \|^2 = \| k_y \|^2 = \langle k_y, k_y \rangle = K(y, y)$.
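The reproducing property can be checked numerically. In the following sketch (an illustration we add here, not taken from the paper), the Gaussian kernel $K(x, y) = \exp(-(x - y)^2 / 2)$ is an assumed example kernel, and $f$ is taken in the span of the kernel sections $k_{x_i}$, where $\langle \sum_i a_i k_{x_i}, \sum_j b_j k_{x_j} \rangle = \mathbf{a}^T \mathbf{K} \mathbf{b}$.

```python
# Numerical check of the reproducing property f(y) = <f, k_y> for an assumed
# Gaussian kernel; the centers x and coefficients a are arbitrary choices.
import numpy as np

def K(x, y):
    return np.exp(-0.5 * (x - y) ** 2)

x = np.array([0.1, 0.4, 0.7, 0.9])   # centers of the kernel sections k_{x_i}
a = np.array([1.0, -2.0, 0.5, 3.0])  # f = sum_i a_i k_{x_i}

def f(y):
    return np.sum(a * K(x, y))

y0 = 0.55
# <f, k_{y0}> = sum_i a_i <k_{x_i}, k_{y0}> = sum_i a_i K(x_i, y0)
print(f(y0), a @ K(x, y0))  # the two values coincide
```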
Additionally, in this study we provide a simulation study to evaluate the performance of the proposed MNR model estimation method.

2.4. Simulation

The simulation in this study consists of two parts: a simulation to determine the optimal smoothing parameters based on the generalized cross-validation (GCV) criterion, in order to obtain the best estimated MNR model, and a simulation to describe the effect of the smoothing parameters on the estimation results of the regression function of the MNR model based on the minimal GCV value. We generate samples of size n = 100 from the MNR model and illustrate the effects of the smoothing parameters on the estimation results by comparing three kinds of smoothing parameter values, namely, small, optimal, and large values.
In the following section, we present the results and discussion of this study, covering estimation of the regression function of the MNR model using the RKHS approach, estimation of the symmetric weight matrix and the optimal smoothing parameters, a simulation study, and an investigation of the consistency of the smoothing spline regression function estimator.

3. Results and Discussions

The results and discussion presented in this section include estimating the regression function of the MNR model using the RKHS approach, estimating the weight matrix, estimating the optimal smoothing parameters, investigating the consistency of the regression function estimator, a simulation study, and an illustration of the theorems.

3.1. Estimating the Regression Function of the MNR Model Using the RKHS Approach

The MNR model presented in (1) can be expressed in matrix notation as follows:
$$\mathbf{y} = \mathbf{f} + \boldsymbol{\varepsilon} \tag{3}$$
where $\mathbf{y} = (\mathbf{y}_1, \mathbf{y}_2, \dots, \mathbf{y}_p)^T$, $\mathbf{f} = (\mathbf{f}_1, \mathbf{f}_2, \dots, \mathbf{f}_p)^T$, $\mathbf{t} = (\mathbf{t}_1, \mathbf{t}_2, \dots, \mathbf{t}_p)^T$, $\boldsymbol{\varepsilon} = (\boldsymbol{\varepsilon}_1, \boldsymbol{\varepsilon}_2, \dots, \boldsymbol{\varepsilon}_p)^T$, $\mathbf{y}_r = (y_{r1}, y_{r2}, \dots, y_{rn_r})^T$, $\mathbf{f}_r = (f_r(t_{r1}), f_r(t_{r2}), \dots, f_r(t_{rn_r}))^T$, $\mathbf{t}_r = (t_{r1}, t_{r2}, \dots, t_{rn_r})^T$, and $\boldsymbol{\varepsilon}_r = (\varepsilon_{r1}, \varepsilon_{r2}, \dots, \varepsilon_{rn_r})^T$.
We assume that $\boldsymbol{\varepsilon}$ is a zero-mean random error vector with covariance matrix $\mathbf{W}^{-1}$. In this case, the covariance matrix $\mathbf{W}^{-1}$ is a symmetric matrix, specifically a block-diagonal matrix, which can be expressed as follows:
$$\mathbf{W}^{-1} = \mathrm{diag}\left( \mathbf{W}_1^{-1}, \mathbf{W}_2^{-1}, \dots, \mathbf{W}_p^{-1} \right) \tag{4}$$
where $\mathbf{W}_r^{-1}$ is the $r$th-response covariance matrix of $\boldsymbol{\varepsilon}_r$ for $r = 1, 2, \dots, p$.
To determine the regression function of the MNR model (1) using the RKHS approach, we first express the MNR model as a general smoothing spline regression model [20]:
$$y_{ri} = L_{t_{ri}} f_r + \varepsilon_{ri}, \quad i = 1, 2, \dots, n_r, \quad r = 1, 2, \dots, p \tag{5}$$
where $f_r \in \mathcal{H}_r$ is an unknown smooth function, $L_{t_{ri}}$ is a bounded linear functional on $\mathcal{H}_r$, and $\mathcal{H}_r$ is a Hilbert space.
Next, the Hilbert space $\mathcal{H}_r$ is decomposed into a direct sum of the Hilbert subspaces $\mathcal{G}_r$ and $\mathcal{K}_r$, where $\mathcal{G}_r$ has basis $\{\alpha_{r1}, \alpha_{r2}, \dots, \alpha_{rm_r}\}$ and $\mathcal{K}_r$ has basis $\{\beta_{r1}, \beta_{r2}, \dots, \beta_{rn_r}\}$, as follows:
$$\mathcal{H}_r = \mathcal{G}_r \oplus \mathcal{K}_r. \tag{6}$$
This implies that for $g_r \in \mathcal{G}_r$, $z_r \in \mathcal{K}_r$, and $r = 1, 2, \dots, p$, we can express every function $f_r \in \mathcal{H}_r$ as follows:
$$f_r = g_r + z_r. \tag{7}$$
Because $\{\alpha_{r1}, \alpha_{r2}, \dots, \alpha_{rm_r}\}$ is the basis of the Hilbert subspace $\mathcal{G}_r$ and $\{\beta_{r1}, \beta_{r2}, \dots, \beta_{rn_r}\}$ is the basis of the Hilbert subspace $\mathcal{K}_r$, the function $f_r$ in (7) can be expressed as follows:
$$f_r = \sum_{i=1}^{m_r} b_{ri} \alpha_{ri} + \sum_{j=1}^{n_r} d_{rj} \beta_{rj} = \boldsymbol{\alpha}_r^T \mathbf{b}_r + \boldsymbol{\beta}_r^T \mathbf{d}_r \tag{8}$$
where $r = 1, 2, \dots, p$; $b_{ri}, d_{rj} \in \mathbb{R}$; $\boldsymbol{\alpha}_r = (\alpha_{r1}, \alpha_{r2}, \dots, \alpha_{rm_r})^T$; $\mathbf{b}_r = (b_{r1}, b_{r2}, \dots, b_{rm_r})^T$; $\boldsymbol{\beta}_r = (\beta_{r1}, \beta_{r2}, \dots, \beta_{rn_r})^T$; and $\mathbf{d}_r = (d_{r1}, d_{r2}, \dots, d_{rn_r})^T$.
Hence, for $r = 1, 2, \dots, p$ and $i = 1, 2, \dots, n_r$, we have
$$L_{t_{ri}} f_r = L_{t_{ri}} (g_r + z_r) = L_{t_{ri}} g_r + L_{t_{ri}} z_r = g_r(t_{ri}) + z_r(t_{ri}) = f_r(t_{ri}).$$
Because $L_{t_{ri}}$ is a bounded linear functional on the Hilbert space $\mathcal{H}_r$, according to [20] there exists a Riesz representer $\delta_{ri} \in \mathcal{H}_r$ such that
$$L_{t_{ri}} f_r = \langle \delta_{ri}, f_r \rangle = f_r(t_{ri}) \tag{9}$$
where $f_r \in \mathcal{H}_r$ and $\langle \cdot, \cdot \rangle$ denotes an inner product. Next, by considering Equations (8) and (9) and applying the properties of the inner product, the function $f_r(t_{ri})$ can be written as follows:
$$f_r(t_{ri}) = \langle \delta_{ri}, \boldsymbol{\alpha}_r^T \mathbf{b}_r + \boldsymbol{\beta}_r^T \mathbf{d}_r \rangle = \langle \delta_{ri}, \boldsymbol{\alpha}_r^T \rangle \mathbf{b}_r + \langle \delta_{ri}, \boldsymbol{\beta}_r^T \rangle \mathbf{d}_r. \tag{10}$$
Then, based on Equation (10), we can obtain the regression functions $f_r(t_{ri})$ for $r = 1, 2, \dots, p$, i.e., the regression functions for the first response, the second response, …, and the $p$th response, as follows:
$$\begin{aligned} \text{For } r = 1: \quad & f_1(t_{1i}) = \langle \delta_{1i}, \boldsymbol{\alpha}_1^T \rangle \mathbf{b}_1 + \langle \delta_{1i}, \boldsymbol{\beta}_1^T \rangle \mathbf{d}_1, \quad i = 1, 2, \dots, n_1 \\ \text{For } r = 2: \quad & f_2(t_{2i}) = \langle \delta_{2i}, \boldsymbol{\alpha}_2^T \rangle \mathbf{b}_2 + \langle \delta_{2i}, \boldsymbol{\beta}_2^T \rangle \mathbf{d}_2, \quad i = 1, 2, \dots, n_2 \\ & \vdots \\ \text{For } r = p: \quad & f_p(t_{pi}) = \langle \delta_{pi}, \boldsymbol{\alpha}_p^T \rangle \mathbf{b}_p + \langle \delta_{pi}, \boldsymbol{\beta}_p^T \rangle \mathbf{d}_p, \quad i = 1, 2, \dots, n_p. \end{aligned} \tag{11}$$
Hence, following Equation (11), we obtain the regression function for the first response ($r = 1$) as follows:
$$\mathbf{f}_1(\mathbf{t}_1) = (f_1(t_{11}), f_1(t_{12}), \dots, f_1(t_{1n_1}))^T = \mathbf{A}_1 \mathbf{b}_1 + \mathbf{C}_1 \mathbf{d}_1 \tag{12}$$
where $\mathbf{b}_1 = (b_{11}, b_{12}, \dots, b_{1m_1})^T$; $\mathbf{d}_1 = (d_{11}, d_{12}, \dots, d_{1n_1})^T$;
$$\mathbf{A}_1 = \begin{pmatrix} \langle \delta_{11}, \alpha_{11} \rangle & \langle \delta_{11}, \alpha_{12} \rangle & \cdots & \langle \delta_{11}, \alpha_{1m_1} \rangle \\ \langle \delta_{12}, \alpha_{11} \rangle & \langle \delta_{12}, \alpha_{12} \rangle & \cdots & \langle \delta_{12}, \alpha_{1m_1} \rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle \delta_{1n_1}, \alpha_{11} \rangle & \langle \delta_{1n_1}, \alpha_{12} \rangle & \cdots & \langle \delta_{1n_1}, \alpha_{1m_1} \rangle \end{pmatrix}; \quad \mathbf{C}_1 = \begin{pmatrix} \langle \delta_{11}, \beta_{11} \rangle & \langle \delta_{11}, \beta_{12} \rangle & \cdots & \langle \delta_{11}, \beta_{1n_1} \rangle \\ \langle \delta_{12}, \beta_{11} \rangle & \langle \delta_{12}, \beta_{12} \rangle & \cdots & \langle \delta_{12}, \beta_{1n_1} \rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle \delta_{1n_1}, \beta_{11} \rangle & \langle \delta_{1n_1}, \beta_{12} \rangle & \cdots & \langle \delta_{1n_1}, \beta_{1n_1} \rangle \end{pmatrix}.$$
Similarly, we obtain the regression functions for the remaining responses $r = 2, 3, \dots, p$, namely $\mathbf{f}_2(\mathbf{t}_2), \mathbf{f}_3(\mathbf{t}_3), \dots, \mathbf{f}_p(\mathbf{t}_p)$, as follows:
$$\begin{aligned} \mathbf{f}_2(\mathbf{t}_2) &= (f_2(t_{21}), f_2(t_{22}), \dots, f_2(t_{2n_2}))^T = \mathbf{A}_2 \mathbf{b}_2 + \mathbf{C}_2 \mathbf{d}_2 \\ \mathbf{f}_3(\mathbf{t}_3) &= (f_3(t_{31}), f_3(t_{32}), \dots, f_3(t_{3n_3}))^T = \mathbf{A}_3 \mathbf{b}_3 + \mathbf{C}_3 \mathbf{d}_3 \\ & \vdots \\ \mathbf{f}_p(\mathbf{t}_p) &= (f_p(t_{p1}), f_p(t_{p2}), \dots, f_p(t_{pn_p}))^T = \mathbf{A}_p \mathbf{b}_p + \mathbf{C}_p \mathbf{d}_p. \end{aligned} \tag{13}$$
Hence, based on Equations (12) and (13), we obtain the regression function $\mathbf{f}(\mathbf{t})$ of the MNR model as follows:
$$\begin{aligned} \mathbf{f}(\mathbf{t}) &= (\mathbf{f}_1(\mathbf{t}_1), \mathbf{f}_2(\mathbf{t}_2), \dots, \mathbf{f}_p(\mathbf{t}_p))^T \\ &= (\mathbf{A}_1 \mathbf{b}_1, \mathbf{A}_2 \mathbf{b}_2, \dots, \mathbf{A}_p \mathbf{b}_p)^T + (\mathbf{C}_1 \mathbf{d}_1, \mathbf{C}_2 \mathbf{d}_2, \dots, \mathbf{C}_p \mathbf{d}_p)^T \\ &= \mathrm{diag}(\mathbf{A}_1, \mathbf{A}_2, \dots, \mathbf{A}_p)(\mathbf{b}_1, \mathbf{b}_2, \dots, \mathbf{b}_p)^T + \mathrm{diag}(\mathbf{C}_1, \mathbf{C}_2, \dots, \mathbf{C}_p)(\mathbf{d}_1, \mathbf{d}_2, \dots, \mathbf{d}_p)^T \\ &= \mathbf{A} \mathbf{b} + \mathbf{C} \mathbf{d}. \end{aligned} \tag{14}$$
Thus, we can express the MNR model presented in (1) in matrix notation as follows:
$$\mathbf{y} = \mathbf{A} \mathbf{b} + \mathbf{C} \mathbf{d} + \boldsymbol{\varepsilon} \tag{15}$$
where $\mathbf{A} = \mathrm{diag}(\mathbf{A}_1, \mathbf{A}_2, \dots, \mathbf{A}_p)$ is an $N \times M$ block-diagonal matrix with $N = \sum_{r=1}^{p} n_r$ and $M = \sum_{r=1}^{p} m_r$; $\mathbf{b} = (\mathbf{b}_1^T, \mathbf{b}_2^T, \dots, \mathbf{b}_p^T)^T$ is an $M \times 1$ vector of parameters; $\mathbf{C} = \mathrm{diag}(\mathbf{C}_1, \mathbf{C}_2, \dots, \mathbf{C}_p)$ is an $N \times N$ block-diagonal matrix; and $\mathbf{d} = (\mathbf{d}_1^T, \mathbf{d}_2^T, \dots, \mathbf{d}_p^T)^T$ is an $N \times 1$ vector of parameters.
Now, we can determine the estimated smoothing spline regression function of the MNR model presented in (1) using the RKHS approach by solving the following optimization:
$$\min_{f_r \in \mathcal{H}_r,\ r = 1, 2, \dots, p} \left\{ \left\| \mathbf{W}^{1/2} \boldsymbol{\varepsilon} \right\|^2 \right\} = \min_{f_r \in \mathcal{H}_r,\ r = 1, 2, \dots, p} \left\{ \left\| \mathbf{W}^{1/2} (\mathbf{y} - \mathbf{f}) \right\|^2 \right\} \tag{16}$$
subject to the constraints $\int_{a_r}^{b_r} \left( f_r^{(m)}(t_r) \right)^2 dt_r < \gamma_r$, where $\gamma_r \geq 0$ and $r = 1, 2, \dots, p$.
Note that determining the solution to Equation (16) is equivalent to determining the solution to the following PWLS optimization:
$$\min_{f_r \in W_2^m[a_r, b_r],\ r = 1, 2, \dots, p} \left\{ N^{-1} (\mathbf{y} - \mathbf{f})^T \mathbf{W} (\mathbf{y} - \mathbf{f}) + \sum_{r=1}^{p} \lambda_r \int_{a_r}^{b_r} \left( f_r^{(m)}(t_r) \right)^2 dt_r \right\} \tag{17}$$
where $N = \sum_{r=1}^{p} n_r$; $N^{-1} (\mathbf{y} - \mathbf{f})^T \mathbf{W} (\mathbf{y} - \mathbf{f})$ is a weighted least squares term that represents the goodness of fit; $\sum_{r=1}^{p} \lambda_r \int_{a_r}^{b_r} (f_r^{(m)}(t_r))^2 dt_r$ is a penalty that measures smoothness; and $\lambda_r$, $r = 1, 2, \dots, p$, are smoothing parameters that control the trade-off between the goodness of fit and the penalty.
Next, we decompose the penalty in (17) as follows:
$$\sum_{r=1}^{p} \lambda_r \int_{a_r}^{b_r} \left( f_r^{(m)}(t_r) \right)^2 dt_r = \lambda_1 \int_{a_1}^{b_1} \left( f_1^{(m)}(t_1) \right)^2 dt_1 + \cdots + \lambda_p \int_{a_p}^{b_p} \left( f_p^{(m)}(t_p) \right)^2 dt_p. \tag{18}$$
Because
$$\begin{aligned} \int_{a_1}^{b_1} \left( f_1^{(m)}(t_1) \right)^2 dt_1 &= \langle \boldsymbol{\beta}_1^T \mathbf{d}_1, \boldsymbol{\beta}_1^T \mathbf{d}_1 \rangle = \mathbf{d}_1^T \langle \boldsymbol{\beta}_1, \boldsymbol{\beta}_1^T \rangle \mathbf{d}_1 = \mathbf{d}_1^T \mathbf{C}_1 \mathbf{d}_1 \\ \int_{a_2}^{b_2} \left( f_2^{(m)}(t_2) \right)^2 dt_2 &= \langle \boldsymbol{\beta}_2^T \mathbf{d}_2, \boldsymbol{\beta}_2^T \mathbf{d}_2 \rangle = \mathbf{d}_2^T \langle \boldsymbol{\beta}_2, \boldsymbol{\beta}_2^T \rangle \mathbf{d}_2 = \mathbf{d}_2^T \mathbf{C}_2 \mathbf{d}_2 \\ &\vdots \\ \int_{a_p}^{b_p} \left( f_p^{(m)}(t_p) \right)^2 dt_p &= \langle \boldsymbol{\beta}_p^T \mathbf{d}_p, \boldsymbol{\beta}_p^T \mathbf{d}_p \rangle = \mathbf{d}_p^T \langle \boldsymbol{\beta}_p, \boldsymbol{\beta}_p^T \rangle \mathbf{d}_p = \mathbf{d}_p^T \mathbf{C}_p \mathbf{d}_p \end{aligned}$$
we can write the penalty in (17) or (18) as follows:
$$\sum_{r=1}^{p} \lambda_r \int_{a_r}^{b_r} \left( f_r^{(m)}(t_r) \right)^2 dt_r = \mathbf{d}^T \boldsymbol{\Lambda} \mathbf{C} \mathbf{d} \tag{19}$$
where $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1 \mathbf{I}_{n_1}, \lambda_2 \mathbf{I}_{n_2}, \dots, \lambda_p \mathbf{I}_{n_p})$. Furthermore, we can write the goodness-of-fit component in (17) as follows:
$$N^{-1} (\mathbf{y} - \mathbf{f})^T \mathbf{W} (\mathbf{y} - \mathbf{f}) = N^{-1} (\mathbf{y} - \mathbf{A}\mathbf{b} - \mathbf{C}\mathbf{d})^T \mathbf{W} (\mathbf{y} - \mathbf{A}\mathbf{b} - \mathbf{C}\mathbf{d}). \tag{20}$$
Based on Equations (19) and (20), we can express the PWLS optimization in (17) as follows:
$$\min_{\mathbf{b} \in \mathbb{R}^M,\ \mathbf{d} \in \mathbb{R}^N} Q(\mathbf{b}, \mathbf{d}) = \min_{\mathbf{b} \in \mathbb{R}^M,\ \mathbf{d} \in \mathbb{R}^N} \left\{ N^{-1} (\mathbf{y} - \mathbf{A}\mathbf{b} - \mathbf{C}\mathbf{d})^T \mathbf{W} (\mathbf{y} - \mathbf{A}\mathbf{b} - \mathbf{C}\mathbf{d}) + \mathbf{d}^T \boldsymbol{\Lambda} \mathbf{C} \mathbf{d} \right\}. \tag{21}$$
The solution to (21) can be obtained by setting the partial derivatives of $Q(\mathbf{b}, \mathbf{d})$ with respect to $\mathbf{b}$ and $\mathbf{d}$ to zero. In this step, we obtain the estimates of $\mathbf{b}$ and $\mathbf{d}$ as follows:
$$\hat{\mathbf{b}} = \left( \mathbf{A}^T \mathbf{D}^{-1} \mathbf{W} \mathbf{A} \right)^{-1} \mathbf{A}^T \mathbf{D}^{-1} \mathbf{W} \mathbf{y}, \quad \hat{\mathbf{d}} = \mathbf{D}^{-1} \mathbf{W} \left[ \mathbf{I} - \mathbf{A} \left( \mathbf{A}^T \mathbf{D}^{-1} \mathbf{W} \mathbf{A} \right)^{-1} \mathbf{A}^T \mathbf{D}^{-1} \mathbf{W} \right] \mathbf{y}$$
where $\mathbf{D} = \mathbf{W}\mathbf{C} + N\boldsymbol{\Lambda}$.
From this step, we obtain the estimate of the smoothing spline regression function of the MNR model presented in (1) or (15) as follows:
$$\hat{\mathbf{f}} = \mathbf{A}\hat{\mathbf{b}} + \mathbf{C}\hat{\mathbf{d}} = \mathbf{H}\mathbf{y} \tag{22}$$
where $\mathbf{H} = \mathbf{A}(\mathbf{A}^T \mathbf{D}^{-1} \mathbf{W} \mathbf{A})^{-1} \mathbf{A}^T \mathbf{D}^{-1} \mathbf{W} + \mathbf{C}\mathbf{D}^{-1}\mathbf{W}\left[ \mathbf{I} - \mathbf{A}(\mathbf{A}^T \mathbf{D}^{-1} \mathbf{W} \mathbf{A})^{-1} \mathbf{A}^T \mathbf{D}^{-1} \mathbf{W} \right]$, $\mathbf{D} = \mathbf{W}\mathbf{C} + N\boldsymbol{\Lambda}$, $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1 \mathbf{I}_{n_1}, \lambda_2 \mathbf{I}_{n_2}, \dots, \lambda_p \mathbf{I}_{n_p})$, $\mathbf{I}$ is the identity matrix of dimension $N$, and $N = \sum_{r=1}^{p} n_r$.
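The closed-form estimates above translate directly into a few lines of linear algebra. The following sketch computes $\hat{\mathbf{b}}$, $\hat{\mathbf{d}}$, and the fitted values for given $\mathbf{A}$, $\mathbf{C}$, $\mathbf{W}$, and $\boldsymbol{\Lambda}$; the random stand-in matrices are assumptions made only to keep the sketch self-contained, since constructing $\mathbf{A}$ and $\mathbf{C}$ from the bases and reproducing kernels is problem-specific.

```python
# A sketch of the closed-form solution in (22). The matrices A, C, W, Lambda
# below are random stand-ins (assumptions) for the block-diagonal matrices
# built from the bases and reproducing kernels.
import numpy as np

rng = np.random.default_rng(1)
N, M = 30, 4                          # N = total observations, M = total parametric terms
A = rng.normal(size=(N, M))           # stands in for diag(A_1, ..., A_p)
B = rng.normal(size=(N, N))
C = B @ B.T + N * np.eye(N)           # symmetric positive definite stand-in for diag(C_1, ..., C_p)
W = np.eye(N)                         # weight matrix
Lam = 1e-2 * np.eye(N)                # Lambda = diag(lambda_1 I_{n_1}, ..., lambda_p I_{n_p})
y = rng.normal(size=N)

D = W @ C + N * Lam                   # D = WC + N*Lambda
Dinv = np.linalg.inv(D)
b_hat = np.linalg.solve(A.T @ Dinv @ W @ A, A.T @ Dinv @ W @ y)
d_hat = Dinv @ W @ (y - A @ b_hat)    # equals the bracketed formula in the text
f_hat = A @ b_hat + C @ d_hat         # fitted values, f_hat = H y
print(f_hat[:5])
```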

3.2. Estimating the Symmetric Weight Matrix

Based on the MNR model presented in (3), $\mathbf{W}^{-1}$ in Equation (4) is the covariance matrix of the random error $\boldsymbol{\varepsilon}$. To obtain the estimated weight matrix $\hat{\mathbf{W}}$, where the weight matrix $\mathbf{W}$ is the inverse of the covariance matrix, we first consider paired observations $(y_{ri}, t_{ri})$, $r = 1, 2, \dots, p$, $i = 1, 2, \dots, n_r$, that follow the MNR model presented in (3). Second, supposing that $\mathbf{y} = (\mathbf{y}_1, \mathbf{y}_2, \dots, \mathbf{y}_p)^T$ is a multivariate (i.e., $N$-variate, where $N = \sum_{r=1}^{p} n_r$) normally distributed random sample with mean $\mathbf{f}$ and covariance $\mathbf{W}^{-1}$, we have the following likelihood function:
$$L(\mathbf{f}, \mathbf{W} \mid \mathbf{y}) = \prod_{j=1}^{n} (2\pi)^{-N/2} \left| \mathbf{W}^{-1} \right|^{-1/2} \exp\left( -\tfrac{1}{2} (\mathbf{y}_j - \mathbf{f}_j)^T \mathbf{W} (\mathbf{y}_j - \mathbf{f}_j) \right). \tag{23}$$
Because $N = \sum_{r=1}^{p} n_r$ and $\mathbf{W} = \mathrm{diag}(\mathbf{W}_1, \mathbf{W}_2, \dots, \mathbf{W}_p)$, the likelihood function in (23) can be written as follows:
$$\begin{aligned} L(\mathbf{f}, \mathbf{W} \mid \mathbf{y}) = {} & (2\pi)^{-n n_1 / 2} \left| \mathbf{W}_1^{-1} \right|^{-n/2} \exp\left( -\tfrac{1}{2} \sum_{j=1}^{n} (\mathbf{y}_{1j} - \mathbf{f}_{1j})^T \mathbf{W}_1 (\mathbf{y}_{1j} - \mathbf{f}_{1j}) \right) \\ & \times (2\pi)^{-n n_2 / 2} \left| \mathbf{W}_2^{-1} \right|^{-n/2} \exp\left( -\tfrac{1}{2} \sum_{j=1}^{n} (\mathbf{y}_{2j} - \mathbf{f}_{2j})^T \mathbf{W}_2 (\mathbf{y}_{2j} - \mathbf{f}_{2j}) \right) \\ & \times \cdots \times (2\pi)^{-n n_p / 2} \left| \mathbf{W}_p^{-1} \right|^{-n/2} \exp\left( -\tfrac{1}{2} \sum_{j=1}^{n} (\mathbf{y}_{pj} - \mathbf{f}_{pj})^T \mathbf{W}_p (\mathbf{y}_{pj} - \mathbf{f}_{pj}) \right). \end{aligned} \tag{24}$$
Next, based on (24), the estimated weight matrix can be obtained by carrying out the following optimization:
$$\begin{aligned} L(\mathbf{f}, \hat{\mathbf{W}} \mid \mathbf{y}) = {} & \max_{\mathbf{W}_1} \left\{ (2\pi)^{-n n_1 / 2} \left| \mathbf{W}_1^{-1} \right|^{-n/2} \exp\left( -\tfrac{1}{2} \sum_{j=1}^{n} (\mathbf{y}_{1j} - \mathbf{f}_{1j})^T \mathbf{W}_1 (\mathbf{y}_{1j} - \mathbf{f}_{1j}) \right) \right\} \\ & \times \cdots \times \max_{\mathbf{W}_p} \left\{ (2\pi)^{-n n_p / 2} \left| \mathbf{W}_p^{-1} \right|^{-n/2} \exp\left( -\tfrac{1}{2} \sum_{j=1}^{n} (\mathbf{y}_{pj} - \mathbf{f}_{pj})^T \mathbf{W}_p (\mathbf{y}_{pj} - \mathbf{f}_{pj}) \right) \right\}. \end{aligned} \tag{25}$$
According to [45], the maximizer of each component of the likelihood function in Equation (25) is given by the following equations:
$$\hat{\mathbf{W}}_1 = \frac{\hat{\boldsymbol{\varepsilon}}_1 \hat{\boldsymbol{\varepsilon}}_1^T}{N} = \frac{(\mathbf{y}_1 - \hat{\mathbf{f}}_1)(\mathbf{y}_1 - \hat{\mathbf{f}}_1)^T}{N}, \quad \hat{\mathbf{W}}_2 = \frac{\hat{\boldsymbol{\varepsilon}}_2 \hat{\boldsymbol{\varepsilon}}_2^T}{N} = \frac{(\mathbf{y}_2 - \hat{\mathbf{f}}_2)(\mathbf{y}_2 - \hat{\mathbf{f}}_2)^T}{N}, \quad \dots, \quad \hat{\mathbf{W}}_p = \frac{\hat{\boldsymbol{\varepsilon}}_p \hat{\boldsymbol{\varepsilon}}_p^T}{N} = \frac{(\mathbf{y}_p - \hat{\mathbf{f}}_p)(\mathbf{y}_p - \hat{\mathbf{f}}_p)^T}{N}.$$
We may express the estimated smoothing spline regression function in (22) as follows:
$$\hat{\mathbf{f}}_{\boldsymbol{\lambda}, \boldsymbol{\sigma}^2} = \mathbf{H}_{\boldsymbol{\lambda}, \boldsymbol{\sigma}^2} \mathbf{y} \tag{26}$$
where $\boldsymbol{\lambda} = (\lambda_1, \lambda_2, \dots, \lambda_p)^T$ and $\boldsymbol{\sigma}^2 = (\sigma_1^2, \sigma_2^2, \dots, \sigma_p^2)^T$.
Hence, the maximum likelihood estimator for the weight matrix $\mathbf{W}$ is given by:
$$\begin{aligned} \hat{\mathbf{W}} &= \mathrm{diag}\left( \hat{\mathbf{W}}_1, \hat{\mathbf{W}}_2, \dots, \hat{\mathbf{W}}_p \right) = \mathrm{diag}\left( \frac{(\mathbf{y}_1 - \hat{\mathbf{f}}_1)(\mathbf{y}_1 - \hat{\mathbf{f}}_1)^T}{N}, \frac{(\mathbf{y}_2 - \hat{\mathbf{f}}_2)(\mathbf{y}_2 - \hat{\mathbf{f}}_2)^T}{N}, \dots, \frac{(\mathbf{y}_p - \hat{\mathbf{f}}_p)(\mathbf{y}_p - \hat{\mathbf{f}}_p)^T}{N} \right) \\ &= \mathrm{diag}\left( \frac{(\mathbf{I}_{n_1} - \mathbf{H}_{\lambda_1, \sigma_1^2}) \mathbf{y}_1 \mathbf{y}_1^T (\mathbf{I}_{n_1} - \mathbf{H}_{\lambda_1, \sigma_1^2})^T}{N}, \frac{(\mathbf{I}_{n_2} - \mathbf{H}_{\lambda_2, \sigma_2^2}) \mathbf{y}_2 \mathbf{y}_2^T (\mathbf{I}_{n_2} - \mathbf{H}_{\lambda_2, \sigma_2^2})^T}{N}, \dots, \frac{(\mathbf{I}_{n_p} - \mathbf{H}_{\lambda_p, \sigma_p^2}) \mathbf{y}_p \mathbf{y}_p^T (\mathbf{I}_{n_p} - \mathbf{H}_{\lambda_p, \sigma_p^2})^T}{N} \right). \end{aligned}$$
This shows that the estimated weight matrix is a symmetric matrix, specifically a block-diagonal matrix whose main diagonal blocks are the estimated weight matrices of the first response, the second response, and so on, up to the $p$th response.
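In practice, $\hat{\mathbf{W}}$ and $\hat{\mathbf{f}}$ depend on each other, so the two estimates are typically computed by alternating updates. The sketch below is a simplified version of one weight update under the extra assumption that each response's errors are i.i.d., so that $\hat{\mathbf{W}}_r$ reduces to $\hat{\sigma}_r^{-2} \mathbf{I}_{n_r}$; it is not the paper's exact outer-product formula.

```python
# A simplified weight update (our assumption: i.i.d. errors within each
# response), giving W_hat_r = I / sigma_hat_r^2 instead of the outer-product
# form above.
import numpy as np
from scipy.linalg import block_diag

def estimate_weights(y_list, f_hat_list):
    """One update: per-response residual variance, then W_hat = diag(W_1, ..., W_p)."""
    blocks = []
    for y, f_hat in zip(y_list, f_hat_list):
        resid = y - f_hat
        sigma2 = resid @ resid / len(resid)   # sigma_hat_r^2
        blocks.append(np.eye(len(y)) / sigma2)
    return block_diag(*blocks)                # symmetric block-diagonal W_hat
```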

3.3. Estimating Optimal Smoothing Parameters

In MNR modeling, selection of the optimal smoothing parameter value λ cannot be omitted, as it is crucial to obtaining a good regression function fit of the MNR model based on the smoothing spline estimator. According to [46], several criteria can be used to select λ, including minimizing cross-validation (CV), generalized cross-validation (GCV), Mallows' $C_p$, and Akaike's information criterion (AIC). However, according to [47], for good regression function fitting based on the spline estimator, Mallows' $C_p$ and GCV are the most satisfactory.
In this section, we determine the optimal smoothing parameter values for good regression function fitting of the MNR model (1). Taking into account Equation (26), we may express the estimated smoothing spline regression function in (22) as $\hat{\mathbf{f}}_{\boldsymbol{\lambda}, \boldsymbol{\sigma}^2} = \mathbf{H}_{\boldsymbol{\lambda}, \boldsymbol{\sigma}^2} \mathbf{y}$, where $\boldsymbol{\lambda} = (\lambda_1, \lambda_2, \dots, \lambda_p)^T$ and $\boldsymbol{\sigma}^2 = (\sigma_1^2, \sigma_2^2, \dots, \sigma_p^2)^T$. The mean squared error (MSE) of the estimated smoothing spline regression function in (26) is given by
$$\mathrm{MSE} = \frac{(\mathbf{y} - \hat{\mathbf{f}}_{\boldsymbol{\lambda}, \boldsymbol{\sigma}^2})^T \mathbf{W} (\mathbf{y} - \hat{\mathbf{f}}_{\boldsymbol{\lambda}, \boldsymbol{\sigma}^2})}{N} = \frac{\left[ (\mathbf{I} - \mathbf{H}_{\boldsymbol{\lambda}, \boldsymbol{\sigma}^2}) \mathbf{y} \right]^T \mathbf{W} (\mathbf{I} - \mathbf{H}_{\boldsymbol{\lambda}, \boldsymbol{\sigma}^2}) \mathbf{y}}{N} = \frac{\left\| \mathbf{W}^{1/2} (\mathbf{I} - \mathbf{H}_{\boldsymbol{\lambda}, \boldsymbol{\sigma}^2}) \mathbf{y} \right\|^2}{N}.$$
Hereinafter, we define the following generalized cross-validation (GCV) function:
$$G(\boldsymbol{\lambda}) = \frac{N^{-1} \left\| \mathbf{W}^{1/2} (\mathbf{I} - \mathbf{H}_{\boldsymbol{\lambda}, \boldsymbol{\sigma}^2}) \mathbf{y} \right\|^2}{\left[ N^{-1}\, \mathrm{trace}(\mathbf{I}_N - \mathbf{H}_{\boldsymbol{\lambda}, \boldsymbol{\sigma}^2}) \right]^2}. \tag{27}$$
Therefore, based on (27), we can obtain the optimal smoothing parameter values $\boldsymbol{\lambda}_{\mathrm{opt}} = (\lambda_{1;\mathrm{opt}}, \lambda_{2;\mathrm{opt}}, \dots, \lambda_{p;\mathrm{opt}})^T$ by solving the following optimization:
$$G_{\mathrm{opt}}(\boldsymbol{\lambda}_{\mathrm{opt}}) = \min_{\lambda_1, \dots, \lambda_p \in \mathbb{R}^+} G(\boldsymbol{\lambda}) = \min_{\lambda_1, \dots, \lambda_p \in \mathbb{R}^+} \left\{ \frac{N^{-1} \left\| \mathbf{W}^{1/2} (\mathbf{I} - \mathbf{H}_{\boldsymbol{\lambda}, \boldsymbol{\sigma}^2}) \mathbf{y} \right\|^2}{\left[ N^{-1}\, \mathrm{trace}(\mathbf{I}_N - \mathbf{H}_{\boldsymbol{\lambda}, \boldsymbol{\sigma}^2}) \right]^2} \right\} \tag{28}$$
where $\mathbb{R}^+$ denotes the set of positive real numbers and $N = \sum_{r=1}^{p} n_r$.
Thus, the optimal smoothing parameter values $\boldsymbol{\lambda}_{\mathrm{opt}} = (\lambda_{1;\mathrm{opt}}, \lambda_{2;\mathrm{opt}}, \dots, \lambda_{p;\mathrm{opt}})^T$ are obtained by minimizing the function $G(\boldsymbol{\lambda})$ in (27), which is called the generalized cross-validation function [1].
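A minimal sketch of the GCV computation in (27) and the grid search in (28) follows; the function `hat_matrix`, which would assemble $\mathbf{H}_{\boldsymbol{\lambda}, \boldsymbol{\sigma}^2}$ from Equation (22) for a given vector of smoothing parameters, is a hypothetical placeholder.

```python
# GCV criterion G(lambda) of (27) and a grid search for its minimizer per (28).
# hat_matrix is a hypothetical callable returning H for a given lambda vector.
import numpy as np

def gcv(y, W, H):
    N = len(y)
    resid = (np.eye(N) - H) @ y
    num = (resid @ W @ resid) / N                 # N^{-1} ||W^{1/2}(I - H) y||^2
    den = (np.trace(np.eye(N) - H) / N) ** 2
    return num / den

def select_lambda(y, W, hat_matrix, grid):
    """Return the candidate lambda vector minimizing G over the grid."""
    scores = [gcv(y, W, hat_matrix(lam)) for lam in grid]
    return grid[int(np.argmin(scores))]
```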

3.4. Simulation Study

In this section, we provide a simulation study for estimating the smoothing spline regression function of the MNR model, where the performance of the proposed estimation method depends on the selection of optimal smoothing parameter values. We generate samples of size $n = 100$ from an MNR model, namely, a three-response nonparametric regression model, as follows:
$$\begin{aligned} y_{1i} &= 5 + 3\sin(2\pi t_{1i}^2) + \varepsilon_{1i}, \quad i = 1, 2, \dots, n \\ y_{2i} &= 3 + 3\sin(2\pi t_{1i}^2) + \varepsilon_{2i}, \quad i = 1, 2, \dots, n \\ y_{3i} &= 1 + 3\sin(2\pi t_{1i}^2) + \varepsilon_{3i}, \quad i = 1, 2, \dots, n \end{aligned} \tag{29}$$
where $n = 100$, with correlations $\rho_{12} = 0.6$, $\rho_{13} = 0.7$, $\rho_{23} = 0.8$ and variances $\sigma_1^2 = 2$, $\sigma_2^2 = 3$, $\sigma_3^2 = 4$. Based on the results of this simulation, we obtain a minimum generalized cross-validation (GCV) value of 2.526286 and three optimal smoothing parameter values, namely, $\lambda_{1,\mathrm{opt}} = 2.146156 \times 10^{-7}$ (first response), $\lambda_{2,\mathrm{opt}} = 1.084013 \times 10^{-7}$ (second response), and $\lambda_{3,\mathrm{opt}} = 5.930101 \times 10^{-8}$ (third response).
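A sketch of how such data can be generated follows. Our reading of the garbled source is that $f_r(t) = c_r + 3\sin(2\pi t^2)$ with intercepts 5, 3, and 1, and that the errors $(\varepsilon_{1i}, \varepsilon_{2i}, \varepsilon_{3i})$ are drawn jointly from a trivariate normal with the stated variances and cross-response correlations; both points are assumptions.

```python
# Generating data from the three-response model (29), under the assumptions
# stated above about the mean function and the joint error distribution.
import numpy as np

rng = np.random.default_rng(2022)
n = 100
t = rng.uniform(0.0, 1.0, n)

sd = np.sqrt([2.0, 3.0, 4.0])                 # sigma_1, sigma_2, sigma_3
corr = np.array([[1.0, 0.6, 0.7],
                 [0.6, 1.0, 0.8],
                 [0.7, 0.8, 1.0]])            # rho_12, rho_13, rho_23
cov = np.outer(sd, sd) * corr                 # covariance of (eps_1i, eps_2i, eps_3i)
eps = rng.multivariate_normal(np.zeros(3), cov, size=n)

base = 3.0 * np.sin(2.0 * np.pi * t ** 2)
y1 = 5.0 + base + eps[:, 0]
y2 = 3.0 + base + eps[:, 1]
y3 = 1.0 + base + eps[:, 2]
```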
Next, we illustrate the effects of the smoothing parameters on the estimation results of the MNR model by comparing three sets of smoothing parameter values: small values $\lambda_1^{\mathrm{small}} = 10^{-10}$, $\lambda_2^{\mathrm{small}} = 2 \times 10^{-10}$, and $\lambda_3^{\mathrm{small}} = 3 \times 10^{-10}$; the optimal values $\lambda_{1,\mathrm{opt}} = 2.146156 \times 10^{-7}$, $\lambda_{2,\mathrm{opt}} = 1.084013 \times 10^{-7}$, and $\lambda_{3,\mathrm{opt}} = 5.930101 \times 10^{-8}$; and large values $\lambda_1^{\mathrm{large}} = 10^{-5}$, $\lambda_2^{\mathrm{large}} = 2 \times 10^{-5}$, and $\lambda_3^{\mathrm{large}} = 3 \times 10^{-5}$. The following table and figures provide the results of this simulation study.
Table 1 shows that the smoothing parameter values $2.146156 \times 10^{-7}$, $1.084013 \times 10^{-7}$, and $5.930101 \times 10^{-8}$ are optimal, as they yield the lowest GCV value (2.526286) among all candidates. Thus, according to (28), these are the optimal smoothing parameters, and they provide the best estimation results for the MNR model presented in (29).
The plots of the estimated regression function of the MNR model presented in (29) for the three different smoothing parameters are shown in Figure 1, Figure 2 and Figure 3.
Figure 1 shows that, for all responses, namely, $y_1$ (first response), $y_2$ (second response), and $y_3$ (third response), the small smoothing parameter values yield estimates of the regression functions of the MNR model presented in (29) that are too rough.
Figure 2 shows that the optimal smoothing parameter values provide the best estimates of the regression functions of the MNR model presented in (29) for all responses.
Figure 3 shows that, for all responses, the large smoothing parameter values yield estimates of the regression functions of the MNR model presented in (29) that are too smooth.

3.5. Investigating the Consistency of the Smoothing Spline Regression Function Estimator

We first investigate the asymptotic properties of the smoothing spline regression function estimator $\hat{\mathbf{f}}$ based on the integrated mean square error (IMSE) criterion. We develop the IMSE proposed by [14] from the uniresponse case to the multiresponse case. We decompose the IMSE into two components, $\mathrm{Bias}^2(\boldsymbol{\lambda})$ and $\mathrm{Var}(\boldsymbol{\lambda})$, as follows:
$$\mathrm{IMSE}(\boldsymbol{\lambda}) = E \int_a^b \left( \mathbf{f}(t) - \hat{\mathbf{f}}(t) \right)^T \mathbf{W} \left( \mathbf{f}(t) - \hat{\mathbf{f}}(t) \right) dt = \mathrm{Bias}^2(\boldsymbol{\lambda}) + \mathrm{Var}(\boldsymbol{\lambda}) \tag{30}$$
where $\mathrm{Bias}^2(\boldsymbol{\lambda}) = \int_a^b E\left[ (\mathbf{f}(t) - E\hat{\mathbf{f}}(t))^T \mathbf{W} (\mathbf{f}(t) - E\hat{\mathbf{f}}(t)) \right] dt$ and $\mathrm{Var}(\boldsymbol{\lambda}) = \int_a^b E\left[ (E\hat{\mathbf{f}}(t) - \hat{\mathbf{f}}(t))^T \mathbf{W} (E\hat{\mathbf{f}}(t) - \hat{\mathbf{f}}(t)) \right] dt$. Furthermore, in order to investigate the asymptotic property of $\mathrm{Bias}^2(\boldsymbol{\lambda})$, we give the solution to the PWLS optimization in the following theorem.
Theorem 1.
If $\hat{\mathbf{f}}(t)$ is the solution that minimizes the following penalized weighted least squares (PWLS):
$$N^{-1} (\mathbf{y} - \mathbf{g}(t))^T \mathbf{W} (\mathbf{y} - \mathbf{g}(t)) + \sum_{r=1}^{p} \lambda_r \int_{a_r}^{b_r} \left( g_r^{(m)}(t_r) \right)^2 dt_r \tag{31}$$
then the solution that minimizes the following PWLS:
$$N^{-1} (\mathbf{f}(t) - \mathbf{g}(t))^T \mathbf{W} (\mathbf{f}(t) - \mathbf{g}(t)) + \sum_{r=1}^{p} \lambda_r \int_{a_r}^{b_r} \left( g_r^{(m)}(t_r) \right)^2 dt_r \tag{32}$$
is $\hat{\mathbf{g}}^*(t) = E\hat{\mathbf{f}}(t)$.
Proof of Theorem 1.
In Section 3.1, we obtained the solution to the PWLS in (31), namely $\hat{\mathbf{f}} = \mathbf{A}\hat{\mathbf{b}} + \mathbf{C}\hat{\mathbf{d}} = \mathbf{H}\mathbf{y}$, where, as given in (22), $\mathbf{H} = \mathbf{A}(\mathbf{A}^T \mathbf{D}^{-1} \mathbf{W} \mathbf{A})^{-1} \mathbf{A}^T \mathbf{D}^{-1} \mathbf{W} + \mathbf{C}\mathbf{D}^{-1}\mathbf{W}[\mathbf{I} - \mathbf{A}(\mathbf{A}^T \mathbf{D}^{-1} \mathbf{W} \mathbf{A})^{-1} \mathbf{A}^T \mathbf{D}^{-1} \mathbf{W}]$, $\mathbf{D} = \mathbf{W}\mathbf{C} + N\boldsymbol{\Lambda}$, $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1 \mathbf{I}_{n_1}, \lambda_2 \mathbf{I}_{n_2}, \dots, \lambda_p \mathbf{I}_{n_p})$, $N = \sum_{r=1}^{p} n_r$, and $\mathbf{I}$ is an identity matrix. Next, if we substitute $\mathbf{y} = \mathbf{f}(t)$ into Equation (31), we find that the value that minimizes the PWLS in (32) is
$$\hat{\mathbf{g}}^*(t) = \left\{ \mathbf{A}(\mathbf{A}^T \mathbf{D}^{-1} \mathbf{W} \mathbf{A})^{-1} \mathbf{A}^T \mathbf{D}^{-1} \mathbf{W} + \mathbf{C}\mathbf{D}^{-1}\mathbf{W}\left[ \mathbf{I} - \mathbf{A}(\mathbf{A}^T \mathbf{D}^{-1} \mathbf{W} \mathbf{A})^{-1} \mathbf{A}^T \mathbf{D}^{-1} \mathbf{W} \right] \right\} \mathbf{f}(t) = E\hat{\mathbf{f}}(t).$$
Thus, Theorem 1 is proved. □
Furthermore, we investigate the asymptotic property of $\mathrm{Bias}^2(\boldsymbol{\lambda})$. For this purpose, we first make the following assumptions.
Assumptions (A):
(A1) For every $r = 1, 2, \dots, p$: $n_r = n$ and $\lambda_r = \lambda$.
(A2) For every $i = 1, 2, \dots, n$: $t_i = \frac{2i - 1}{2n}$.
(A3) For any continuous function $g$ and $0 < W_i = W < \infty$, $i = 1, 2, \dots, n$, the following statements hold [48,49]:
(a) $n^{-1} \sum_{i=1}^{n} g(t_i) \to \int_a^b g(t)\, dt$, as $n \to \infty$;
(b) $n^{-1} \sum_{i=1}^{n} W_i\, g(t_i) \to \int_a^b W g(t)\, dt$, as $n \to \infty$;
(c) $n^{-1} \sum_{i=1}^{n} \lambda_i \int_{a_i}^{b_i} \left( g_i^{(m)}(t_i) \right)^2 dt_i \to \lambda \int_a^b \left( g^{(m)}(t) \right)^2 dt$, as $n \to \infty$.
Next, given Assumptions (A), the asymptotic property of $\mathrm{Bias}^2(\boldsymbol{\lambda})$ is provided in Theorem 2.
Theorem 2.
If Assumptions (A) hold, then $\mathrm{Bias}^2(\boldsymbol{\lambda}) \leq O(\lambda)$, as $n \to \infty$.
Proof of Theorem 2.
Suppose $\hat{\mathbf{g}}(t)$ is the value which minimizes the following penalized weighted least squares (PWLS):
$$\int_a^b (\mathbf{f}(t) - \mathbf{g}(t))^T \mathbf{W} (\mathbf{f}(t) - \mathbf{g}(t))\, dt + \sum_{r=1}^{p} \lambda_r \int_{a_r}^{b_r} \left( g_r^{(m)}(t_r) \right)^2 dt_r;$$
then, considering Assumptions (A), we have
$$n^{-1} (\mathbf{f}(t) - \mathbf{g}(t))^T \mathbf{W} (\mathbf{f}(t) - \mathbf{g}(t)) \to \int_a^b (\mathbf{f}(t) - \mathbf{g}(t))^T \mathbf{W} (\mathbf{f}(t) - \mathbf{g}(t))\, dt, \quad \text{as } n \to \infty.$$
Hence, we obtain $\hat{\mathbf{g}}^*(t) = E\hat{\mathbf{f}}(t) \to \hat{\mathbf{g}}(t)$. Thus, for every $g \in W_2^m[a, b]$ we have
$$\mathrm{Bias}^2(\boldsymbol{\lambda}) = \int_a^b E\left[ (\mathbf{f}(t) - E\hat{\mathbf{f}}(t))^T \mathbf{W} (\mathbf{f}(t) - E\hat{\mathbf{f}}(t)) \right] dt \leq \int_a^b E\left[ (\mathbf{f}(t) - E\hat{\mathbf{f}}(t))^T \mathbf{W} (\mathbf{f}(t) - E\hat{\mathbf{f}}(t)) \right] dt + \sum_{r=1}^{p} \lambda_r \int_a^b \left( \hat{g}_r^{(m)}(t_r) \right)^2 dt_r.$$
Because $E\hat{\mathbf{f}}(t) \to \hat{\mathbf{g}}(t)$, we obtain
$$\mathrm{Bias}^2(\boldsymbol{\lambda}) \leq \int_a^b E\left[ (\mathbf{f}(t) - \hat{\mathbf{g}}(t))^T \mathbf{W} (\mathbf{f}(t) - \hat{\mathbf{g}}(t)) \right] dt + \sum_{r=1}^{p} \lambda_r \int_a^b \left( \hat{g}_r^{(m)}(t_r) \right)^2 dt_r.$$
Thus, we have the following relationship:
$$\mathrm{Bias}^2(\boldsymbol{\lambda}) \leq \int_a^b E\left[ (\mathbf{f}(t) - \mathbf{g}(t))^T \mathbf{W} (\mathbf{f}(t) - \mathbf{g}(t)) \right] dt + \sum_{r=1}^{p} \lambda_r \int_a^b \left( g_r^{(m)}(t_r) \right)^2 dt_r \tag{33}$$
for every $g \in W_2^m[a, b]$.
Because every $g \in W_2^m[a, b]$ satisfies the relationship in (33), by taking $\mathbf{g}(t) = \mathbf{f}(t)$ we have
$$\mathrm{Bias}^2(\boldsymbol{\lambda}) \leq \sum_{r=1}^{p} \lambda_r \int_a^b \left( f_r^{(m)}(t_r) \right)^2 dt_r = O(\lambda), \quad \text{as } n \to \infty,$$
where $O(\cdot)$ denotes "big oh". Details about "big oh" can be found in [14,50].
Thus, Theorem 2 is proved. □
Furthermore, the asymptotic property of $\mathrm{Var}(\boldsymbol{\lambda})$ is provided in Theorem 3.
Theorem 3.
If Assumptions (A) hold, then, for $r = 1, 2, \dots, p$, $\mathrm{Var}(\lambda_r) \leq O\left( \dfrac{1}{n \lambda_r^{1/(2m)}} \right)$.
Proof of Theorem 3.
To investigate the asymptotic property of $\mathrm{Var}(\boldsymbol{\lambda})$, we first define the following function:
$$\omega(\hat{\mathbf{f}}, h, \alpha) = R(\hat{\mathbf{f}} + \alpha h) + \sum_{r=1}^{p} \lambda_r J_r(\hat{\mathbf{f}} + \alpha h)$$
where $R(\mathbf{g}) = n^{-1} (\mathbf{y} - \mathbf{g}(t))^T \mathbf{W} (\mathbf{y} - \mathbf{g}(t))$; $J_r(\mathbf{g}) = \int_a^b \left( g_r^{(m)}(t_r) \right)^2 dt_r$; $h \in W_2^m[a, b]$; and $\alpha \in \mathbb{R}$.
Hence, for any $\mathbf{f}, \mathbf{g} \in W_2^m[a, b]$ we have
$$\omega(\hat{\mathbf{f}}, h, \alpha) = n^{-1} (\mathbf{y} - \mathbf{f}(t) - \alpha \mathbf{g}(t))^T \mathbf{W} (\mathbf{y} - \mathbf{f}(t) - \alpha \mathbf{g}(t)) + \sum_{r=1}^{p} \lambda_r \int_{a_r}^{b_r} \left( f_r^{(m)}(t_r) + \alpha g_r^{(m)}(t_r) \right)^2 dt_r. \tag{34}$$
Next, setting $\frac{\partial \omega(\hat{\mathbf{f}}, h, \alpha)}{\partial \alpha} = 0$ at $\alpha = 0$, we have
$$n^{-1} (\mathbf{y} - \mathbf{f}(t))^T \mathbf{W} \mathbf{g}(t) = \sum_{r=1}^{p} \lambda_r \int_{a_r}^{b_r} f_r^{(m)}(t_r)\, g_r^{(m)}(t_r)\, dt_r. \tag{35}$$
Furthermore, if $\gamma_1, \gamma_2, \dots, \gamma_n$ are the basis functions for the natural spline and $f_r(t_r) = \sum_{k=1}^{n} \beta_{rk} \gamma_{rk}(t_r)$, $r = 1, 2, \dots, p$, then according to [51] this implies that
$$\sum_{j=1}^{n} \left( y_{rj} - \sum_{k=1}^{n} \beta_{rk} \gamma_{rk}(t_{rj}) \right) W_{rj}\, g_{rj}(t_{rj}) = n \lambda_r (-1)^m (2m - 1)! \sum_{i=1}^{n} g_r(t_{ri}) \sum_{k=1}^{n} \beta_{rk} d_{rik}$$
where $d_{rik}$ is the $r$th-response diagonal element of the matrix $\mathbf{H}$ in (26).
Now, because Equation (35) holds for every $g_r \in W_2^m[a_r, b_r]$, $r = 1, 2, \dots, p$, determining the solution to Equation (35) is equivalent to determining the values of $\beta_{rk}$ that satisfy the following equation:
$$y_{ri} = \sum_{k=1}^{n} \left[ n \lambda_r (-1)^m (2m - 1)!\, W_{ri}^{-1} d_{rik} + \gamma_{rk}(t_{ri}) \right] \beta_{rk}, \quad i = 1, 2, \dots, n. \tag{36}$$
We can express Equation (36) in matrix notation as follows:
$$\mathbf{y} = \left[ n \lambda_r (-1)^m (2m - 1)!\, \mathbf{W}^{-1} \mathbf{K} + \boldsymbol{\gamma} \right] \boldsymbol{\beta} \tag{37}$$
where $\mathbf{K} = [d_{rik}]$, $\boldsymbol{\gamma} = [\gamma_{rk}(t_{ri})]$, $r = 1, 2, \dots, p$, and $i, k = 1, 2, \dots, n$.
Hence, we obtain the estimator for $\boldsymbol{\beta}$ in Equation (37) as follows:
$$\hat{\boldsymbol{\beta}} = \mathrm{diag}\left( \frac{1}{1 + n \lambda_r \theta_1}, \dots, \frac{1}{1 + n \lambda_r \theta_n} \right) \boldsymbol{\gamma}^T \mathbf{y}$$
where $\{\theta_1, \dots, \theta_n\} \subset \mathcal{H}$ (here, $\mathcal{H}$ is an RKHS) and is perpendicular to $\boldsymbol{\gamma}$.
Thus, the estimator for the regression function $\mathbf{f}(t)$ can be expressed as follows:
$$\hat{\mathbf{f}}(t) = \boldsymbol{\gamma}(t) \hat{\boldsymbol{\beta}} = \sum_{j=1}^{n} \frac{1}{1 + n \lambda \theta_j} \left( \boldsymbol{\gamma}_j^T \mathbf{y} \right) \gamma_j(t). \tag{38}$$
Hence, for $r = 1, 2, \dots, p$, Equation (38) yields
$$\hat{f}_r(t_r) = \sum_{j=1}^{n} \frac{1}{1 + n \lambda_r \theta_{rj}} \left( \boldsymbol{\gamma}_{rj}^T \mathbf{y} \right) \gamma_{rj}(t_r) = \sum_{j=1}^{n} \frac{1}{1 + n \lambda_r \theta_{rj}} \left( \sum_{k=1}^{n} \gamma_{rj}(t_{rk})\, y_{rk} \right) \gamma_{rj}(t_r).$$
Thus, for $r = 1, 2, \dots, p$ we have
$$\mathrm{Var}\left( \hat{f}_r(t_r) \right) = \sigma_r^2 \sum_{j=1}^{n} \frac{\gamma_{rj}^2(t_r)}{(1 + n \lambda_r \theta_{rj})^2} \sum_{k=1}^{n} \gamma_{rj}(t_{rk})\, \gamma_{rj}(t_{rk})\, W_{rk}^{-1} \leq \max_{1 \leq i \leq n} W_{ri}^{-1}\, \sigma_r^2 \sum_{j=1}^{n} \frac{\gamma_{rj}^2(t_r)}{(1 + n \lambda_r \theta_{rj})^2}.$$
From this step, for $r = 1, 2, \dots, p$ we have
$$\mathrm{Var}(\lambda_r) = \int_{a_r}^{b_r} E\left[ \left( E\hat{f}_r(t_r) - \hat{f}_r(t_r) \right)^T W_r \left( E\hat{f}_r(t_r) - \hat{f}_r(t_r) \right) \right] dt_r \leq \max_{1 \leq i \leq n} W_{ri}^{-1}\, \sigma_r^2 \sum_{j=1}^{n} \frac{1}{(1 + n \lambda_r \theta_{rj})^2} \int_{a_r}^{b_r} \gamma_{rj}^2(t_r)\, W_r\, dt_r.$$
In the next step, following [51,52], we use the approximation
$$n^{-1} \sum_{j=1}^{n} W_{rj}\, \gamma_{rj}^2(t_{rj}) \approx \int_{a_r}^{b_r} \gamma_r^2(t_r)\, W_r\, dt_r,$$
so that
$$\mathrm{Var}(\lambda_r) \leq \max_{1 \leq i \leq n} W_{ri}^{-1}\, \sigma_r^2\, n^{-1} \sum_{j=1}^{n} \frac{1}{(1 + \lambda_r \gamma_j)^2}.$$
Furthermore, Ref. [51] leads to the following result:
$$\mathrm{Var}(\lambda_r) \leq \max_{1 \leq i \leq n} W_{ri}^{-1}\, \sigma_r^2\, n^{-1} \sum_{j=1}^{n} \frac{1}{\left( 1 + \lambda_r (\pi j)^{2m} \right)^2}.$$
Next, using an integral approximation [51], for $r = 1, 2, \dots, p$ we have
$$\mathrm{Var}(\lambda_r) \leq \frac{\max_{1 \leq i \leq n} W_{ri}^{-1}\, \sigma_r^2}{n\, \pi\, \lambda_r^{1/(2m)}} \int_0^{\infty} \frac{dx}{(1 + x^{2m})^2} = \frac{1}{n\, \lambda_r^{1/(2m)}} K(m, \sigma) = O\left( \frac{1}{n\, \lambda_r^{1/(2m)}} \right)$$
where $K(m, \sigma) = \frac{\sigma_r^2}{\pi} \max_{1 \leq i \leq n} W_{ri}^{-1} \int_0^{\infty} \frac{dx}{(1 + x^{2m})^2}$. Thus, Theorem 3 is proved. □
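As a quick side check (ours, not the paper's), the constant $\int_0^{\infty} dx / (1 + x^{2m})^2$ appearing in $K(m, \sigma)$ can be evaluated numerically to confirm that it is finite for $m \geq 1$:

```python
# Numerical check that the integral in K(m, sigma) is finite for m >= 1.
# For m = 1 the exact value is pi/4.
import numpy as np
from scipy.integrate import quad

for m in (1, 2, 3):
    val, _ = quad(lambda x, m=m: 1.0 / (1.0 + x ** (2 * m)) ** 2, 0.0, np.inf)
    print(m, val)
```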
Here, based on Theorems 2 and 3, we obtain the asymptotic property of the smoothing spline regression function estimator of the MNR model presented in (1) based on the integrated mean square error (IMSE) criterion, as follows:
$$\mathrm{IMSE}(\boldsymbol{\lambda}) = \mathrm{Bias}^2(\boldsymbol{\lambda}) + \mathrm{Var}(\boldsymbol{\lambda}) \leq O(\boldsymbol{\lambda}) + O(\mathbf{z}), \quad \text{as } n \to \infty, \tag{39}$$
where $\boldsymbol{\lambda} = (\lambda_1, \lambda_2, \dots, \lambda_p)^T$ and $\mathbf{z} = \left( \dfrac{1}{n \lambda_1^{1/(2m)}}, \dfrac{1}{n \lambda_2^{1/(2m)}}, \dots, \dfrac{1}{n \lambda_p^{1/(2m)}} \right)^T$.
The consistency of the smoothing spline regression function estimator of the MNR model presented in (1) is provided by the following theorem.
Theorem 4.
If $\hat{\mathbf{f}}(t)$ is the smoothing spline estimator for the regression function $\mathbf{f}(t)$ of the MNR model presented in (1), then $\hat{\mathbf{f}}(t)$ is a consistent estimator for $\mathbf{f}(t)$ based on the integrated mean square error (IMSE) criterion.
Proof of Theorem 4.
Based on Equation (39), we have the following relationship:
$$\mathrm{IMSE}(\boldsymbol{\lambda}) = \mathrm{Bias}^2(\boldsymbol{\lambda}) + \mathrm{Var}(\boldsymbol{\lambda}) \leq O(\boldsymbol{\lambda}) + O(\mathbf{z}) \leq O(n\boldsymbol{\lambda}), \quad \text{as } n \to \infty.$$
Hence, according to [48], for any small positive number $\delta > 0$ we have
$$\lim_{n \to \infty} P\left( \frac{\mathrm{IMSE}(\boldsymbol{\lambda})}{n\boldsymbol{\lambda}} > \delta \right) \leq \lim_{n \to \infty} P\left( \mathrm{IMSE}(\boldsymbol{\lambda}) > \delta \right) = 0. \tag{40}$$
Because $P(\mathrm{IMSE}(\boldsymbol{\lambda}) > \delta) = 1 - P(\mathrm{IMSE}(\boldsymbol{\lambda}) \leq \delta)$, based on Equation (40) and applying the properties of probability we have
$$\lim_{n \to \infty} \left[ 1 - P(\mathrm{IMSE}(\boldsymbol{\lambda}) \leq \delta) \right] = 0 \;\Leftrightarrow\; 1 - \lim_{n \to \infty} P(\mathrm{IMSE}(\boldsymbol{\lambda}) \leq \delta) = 0 \;\Leftrightarrow\; \lim_{n \to \infty} P(\mathrm{IMSE}(\boldsymbol{\lambda}) \leq \delta) = 1. \tag{41}$$
Equation (41) means that the smoothing spline regression function estimator of the multiresponse nonparametric regression (MNR) model is a consistent estimator based on the integrated mean square error (IMSE) criterion. Thus, Theorem 4 is proved. □

3.6. Illustration of Theorems

Suppose paired observations $(y_{ri}, t_{ri})$ follow the multiresponse nonparametric regression (MNR) model:
$$y_{ri} = f_r(t_{ri}) + \varepsilon_{ri}, \quad t_{ri} \in [0, 1], \quad r = 1, 2, \dots, p, \quad i = 1, 2, \dots, n_r. \tag{42}$$
Next, for every $r = 1, 2, \dots, p$, we assume $f_1 = f_2 = \cdots = f_p = f$ and $\varepsilon_1 = \varepsilon_2 = \cdots = \varepsilon_p = \varepsilon$, such that $y_1 = y_2 = \cdots = y_p = y$. Hence, we have the following (uniresponse) nonparametric regression model:
$$y_i = f(t_i) + \varepsilon_i, \quad t_i \in [0, 1], \quad i = 1, 2, \dots, n. \tag{43}$$
Based on the model presented in (43), we illustrate the four theorems in Section 3.5 through a simulation study with sample size $n = 200$. Let $f(t) = \sin^3(2\pi t^3)$, where $t$ is generated from a uniform $(0, 1)$ distribution and $\varepsilon$ is generated from a standard normal distribution. The first step is to create plots of the observation values $(t, y)$ and of $f(t)$, as shown in Figure 4.
It can be seen from Figure 4 that there is a tendency towards inequality of the variance of $y$: for larger values of $t$, the variance of $y$ tends to be larger. Next, the data are fitted by a weighted spline with weights $w_i = 1/t_i^2$, $i = 1, 2, \dots, 200$. The next step is to determine the order, number, and location of the knot points. Here, we use a weighted cubic spline model with two knot points, namely, 0.5 and 0.785. This weighted cubic spline model can be written as follows:
$$E(y) = \beta_0 + \beta_1 t + \beta_2 t^2 + \beta_3 t^3 + \beta_4 (t - 0.5)_+^3 + \beta_5 (t - 0.785)_+^3. \tag{44}$$
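Fitting (44) amounts to weighted least squares on a truncated power basis. The sketch below reproduces the setup under our assumptions about the garbled source, namely that $f(t) = \sin^3(2\pi t^3)$ and that the weights are $w_i = 1/t_i^2$:

```python
# Weighted least squares fit of the weighted cubic spline (44) with knots at
# 0.5 and 0.785; the data-generating function is our reading of the source.
import numpy as np

rng = np.random.default_rng(7)
n = 200
t = np.sort(rng.uniform(0.0, 1.0, n))
y = np.sin(2.0 * np.pi * t ** 3) ** 3 + rng.normal(size=n)

knots = (0.5, 0.785)
X = np.column_stack([np.ones(n), t, t ** 2, t ** 3] +
                    [np.clip(t - k, 0.0, None) ** 3 for k in knots])  # truncated cubic terms
w = 1.0 / t ** 2                                  # weights from the illustration
WX = X * w[:, None]                               # diag(w) @ X
beta = np.linalg.solve(X.T @ WX, WX.T @ y)        # WLS normal equations
fitted = X @ beta
```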
The plot of this weighted cubic spline with two knot points is shown in Figure 5, and the plots of the residuals and the estimated weighted cubic spline are shown in Figure 6.
Figure 6 shows that, with weights $w_i = 1/t_i^2$, the variance of the response variable $y$ tends to be constant. Meanwhile, Figure 7 provides a residual normality plot for the weighted cubic spline model; the plot shows no indication of deviation from the normal distribution.
Next, as a comparison, we investigate a weighted cubic polynomial model with weights $w_i = 1/t_i^2$, $i = 1, 2, \dots, 200$, for fitting the model in (43). The fit of the weighted cubic polynomial model is shown in Figure 8. From the visualization in Figure 8, it can be seen that the weighted cubic polynomial approximates the function $f(t)$ only globally. This is in contrast to the weighted cubic spline with two knot points in (44), which approximates the function $f(t)$ more locally. Thus, the weighted cubic spline model with two knot points is adequate as an approximation of the model presented in (43).
Based on the illustration above and Figures 4–8, if $\hat{f}(t)$ is an estimator for model (43), that is, if $\hat{f}(t)$ is the penalized weighted least squares (PWLS) solution, then $\hat{g}^*(t) = E(y)$ from Equation (44) is also an estimator for model (43), such that $\hat{g}^*(t) = E\hat{f}(t)$, as provided by Theorem 1.
The plots of the asymptotic curves of $\mathrm{IMSE}(\lambda)$, $\mathrm{Bias}^2(\lambda)$, and $\mathrm{Var}(\lambda)$, where $\mathrm{IMSE}(\lambda) = \mathrm{Bias}^2(\lambda) + \mathrm{Var}(\lambda)$, are shown in Figure 9 [53].
Figure 9 shows that the integrated mean square error curve $\mathrm{IMSE}(\lambda)$, represented by curve (a), is the sum of the squared bias curve $\mathrm{Bias}^2(\lambda)$, represented by curve (b), and the variance curve $\mathrm{Var}(\lambda)$, represented by curve (c). It can be seen from Figure 9b that $\mathrm{Bias}^2(\lambda) \leq O(\lambda)$ as $n \to \infty$, that is, $\limsup_{n \to \infty} \frac{\mathrm{Bias}^2(\lambda)}{\lambda} < \infty$ [14,50,54]. This means that $\frac{\mathrm{Bias}^2(\lambda)}{\lambda}$ remains bounded as $n \to \infty$, as provided by Theorem 2. Furthermore, Figure 9c shows that $\mathrm{Var}(\lambda) \leq O\left( \frac{1}{n \lambda^{1/(2m)}} \right)$, that is, $\limsup_{n \to \infty} \frac{\mathrm{Var}(\lambda)}{1 / (n \lambda^{1/(2m)})} < \infty$ [14,50,54]. This means that $\frac{\mathrm{Var}(\lambda)}{1 / (n \lambda^{1/(2m)})}$ remains bounded as $n \to \infty$, as provided by Theorem 3. Figure 9 also shows that $\mathrm{IMSE}(\lambda) = \mathrm{Bias}^2(\lambda) + \mathrm{Var}(\lambda) \leq O(n\lambda)$. According to [14,50,54], this means that $\limsup_{n \to \infty} \frac{\mathrm{IMSE}(\lambda)}{n\lambda} < \infty$; in other words, $\mathrm{IMSE}(\lambda) \leq O(n\lambda)$, since $\frac{\mathrm{IMSE}(\lambda)}{n\lambda}$ remains bounded as $n \to \infty$. According to [14,50,54], for any small positive number $\delta > 0$ we have $\lim_{n \to \infty} P\left( \frac{\mathrm{IMSE}(\lambda)}{n\lambda} > \delta \right) \leq \lim_{n \to \infty} P(\mathrm{IMSE}(\lambda) > \delta) = 0$; hence, $\lim_{n \to \infty} P(\mathrm{IMSE}(\lambda) \leq \delta) = 1$, that is, the estimator is consistent.

4. Conclusions

The smoothing spline estimator with the RKHS approach has good ability to estimate the MNR model, a nonparametric regression model in which the responses are correlated with each other, because the goodness of fit and the smoothness of the estimated curve are controlled by the smoothing parameters, making the estimator very suitable for prediction purposes. Therefore, selection of the optimal smoothing parameter values cannot be omitted and is crucial to good regression function fitting of the MNR model based on the smoothing spline estimator using the RKHS approach. The estimator of the smoothing spline regression function of the MNR model obtained here is linear with respect to the observations, as shown in Equation (22), and is a consistent estimator based on the integrated mean square error (IMSE) criterion. The main contribution of this study lies in the easier estimation of a multiresponse nonparametric regression model with correlated responses using the RKHS approach based on a smoothing spline estimator. This approach is easier, faster, and more efficient, as estimation is carried out simultaneously rather than response by response for each observation. In addition, the theory generated in this study can be used to estimate the nonparametric component of the multiresponse semiparametric regression model used to model Indonesian toddlers' standard growth charts. In the future, the results of this study can be further developed within the scope of statistical inference, especially for testing hypotheses involving multiresponse nonparametric regression models and multiresponse semiparametric regression models.

Author Contributions

All authors have contributed to this research article. Conceptualization, B.L. and N.C.; methodology, B.L. and N.C.; software, B.L., N.C., D.A. and E.Y.; validation, B.L., N.C., D.A. and E.Y.; formal analysis, B.L., N.C. and D.A.; investigation, resource and data curation, B.L., N.C., D.A. and E.Y.; writing—original draft preparation, B.L. and N.C.; writing—review and editing, B.L., N.C. and D.A.; visualization, B.L. and N.C.; supervision, B.L., N.C. and D.A.; project administration, N.C. and B.L.; funding acquisition, N.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Directorate of Research, Technology, and Community Service (Direktorat Riset, Teknologi, dan Pengabdian kepada Masyarakat–DRTPM), the Ministry of Education, Culture, Research, and Technology, the Republic of Indonesia through the Featured Basic Research of Higher Education Grant (Hibah Penelitian Dasar Unggulan Perguruan Tinggi–PDUPT, Tahun Ketiga dari Tiga Tahun) with master contract number 010/E5/PG.02.00.PT/2022 and derivative contract number 781/UN3.15/PT/2022.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The authors confirm that there are no available data.

Acknowledgments

The authors thank Airlangga University for technical support and DRTPM of the Ministry of Education, Culture, Research, and Technology, the Republic of Indonesia for financial support. The authors are grateful to the editors and anonymous peer reviewers of the Symmetry journal, who provided comments, corrections, criticisms, and suggestions that were useful for improving the quality of this article.

Conflicts of Interest

The authors declare no conflict of interest. In addition, the funders had no role in the design of the study, in the collection, analysis, or interpretation of data, in the writing of the article, or in the decision to publish the results.

References

1. Wahba, G. Spline Models for Observational Data; SIAM: Philadelphia, PA, USA, 1990.
2. Kimeldorf, G.; Wahba, G. Some results on Tchebycheffian spline functions. J. Math. Anal. Appl. 1971, 33, 82–95.
3. Cox, D.D. Asymptotics for M-type smoothing spline. Ann. Stat. 1983, 11, 530–551.
4. Oehlert, G.W. Relaxed boundary smoothing spline. Ann. Stat. 1992, 20, 146–160.
5. Ana, E.; Chamidah, N.; Andriani, P.; Lestari, B. Modeling of hypertension risk factors using local linear of additive nonparametric logistic regression. J. Phys. Conf. Ser. 2019, 1397, 012067.
6. Chamidah, N.; Yonani, Y.S.; Ana, E.; Lestari, B. Identification the number of mycobacterium tuberculosis based on sputum image using local linear estimator. Bull. Electr. Eng. Inform. 2020, 9, 2109–2116.
7. Cheruiyot, L.R. Local linear regression estimator on the boundary correction in nonparametric regression estimation. J. Stat. Theory Appl. 2020, 19, 460–471.
8. Cheng, M.-Y.; Huang, T.; Liu, P.; Peng, H. Bias reduction for nonparametric and semiparametric regression models. Stat. Sin. 2018, 28, 2749–2770.
9. Chamidah, N.; Zaman, B.; Muniroh, L.; Lestari, B. Designing local standard growth charts of children in East Java province using a local linear estimator. Int. J. Innov. Creat. Chang. 2020, 13, 45–67.
10. Delaigle, A.; Fan, J.; Carroll, R.J. A design-adaptive local polynomial estimator for the errors-in-variables problem. J. Amer. Stat. Assoc. 2009, 104, 348–359.
11. Francisco-Fernandez, M.; Vilar-Fernandez, J.M. Local polynomial regression estimation with correlated errors. Comm. Statist. Theory Methods 2001, 30, 1271–1293.
12. Benhenni, K.; Degras, D. Local polynomial estimation of the mean function and its derivatives based on functional data and regular designs. ESAIM Probab. Stat. 2014, 18, 881–899.
13. Kikechi, C.B. On local polynomial regression estimators in finite populations. Int. J. Stat. Appl. Math. 2020, 5, 58–63.
14. Wand, M.P.; Jones, M.C. Kernel Smoothing; Chapman & Hall: London, UK, 1995.
15. Cui, W.; Wei, M. Strong consistency of kernel regression estimate. Open J. Stats. 2013, 3, 179–182.
16. De Brabanter, K.; De Brabanter, J.; Suykens, J.A.K.; De Moor, B. Kernel regression in the presence of correlated errors. J. Mach. Learn. Res. 2011, 12, 1955–1976.
17. Chamidah, N.; Lestari, B. Estimating of covariance matrix using multi-response local polynomial estimator for designing children growth charts: A theoretically discussion. J. Phys. Conf. Ser. 2019, 1397, 012072.
18. Aydin, D.; Güneri, Ö.I.; Fit, A. Choice of bandwidth for nonparametric regression models using kernel smoothing: A simulation study. Int. J. Sci. Basic Appl. Res. 2016, 26, 47–61.
19. Eubank, R.L. Nonparametric Regression and Spline Smoothing, 2nd ed.; Marcel Dekker: New York, NY, USA, 1999.
20. Wang, Y. Smoothing Splines: Methods and Applications; Taylor & Francis Group: Boca Raton, FL, USA, 2011.
21. Liu, A.; Qin, L.; Staudenmayer, J. M-type smoothing spline ANOVA for correlated data. J. Multivar. Anal. 2010, 101, 2282–2296.
22. Gao, J.; Shi, P. M-type smoothing splines in nonparametric and semiparametric regression models. Stat. Sin. 1997, 7, 1155–1169.
23. Chamidah, N.; Lestari, B.; Massaid, A.; Saifudin, T. Estimating mean arterial pressure affected by stress scores using spline nonparametric regression model approach. Commun. Math. Biol. Neurosci. 2020, 2020, 1–12.
24. Fatmawati, I.N.; Budiantara, B.L. Comparison of smoothing and truncated spline estimators in estimating blood pressures models. Int. J. Innov. Creat. Change 2019, 5, 1177–1199.
25. Chamidah, N.; Lestari, B.; Budiantara, I.N.; Saifudin, T.; Rulaningtyas, R.; Aryati, A.; Wardani, P.; Aydin, D. Consistency and asymptotic normality of estimator for parameters in multiresponse multipredictor semiparametric regression model. Symmetry 2022, 14, 336.
26. Eilers, P.H.C.; Marx, B.D. Flexible smoothing with B-splines and penalties. Statist. Sci. 1996, 11, 86–121.
27. Lu, M.; Liu, Y.; Li, C.-S. Efficient estimation of a linear transformation model for current status data via penalized splines. Stat. Meth. Medic. Res. 2020, 29, 3–14.
28. Wang, Y.; Guo, W.; Brown, M.B. Spline smoothing for bivariate data with applications to association between hormones. Stat. Sin. 2000, 10, 377–397.
29. Yilmaz, E.; Ahmed, S.E.; Aydin, D. A-Spline regression for fitting a nonparametric regression function with censored data. Stats 2020, 3, 11.
30. Aydin, D. A comparison of the nonparametric regression models using smoothing spline and kernel regression. World Acad. Sci. Eng. Technol. 2007, 36, 253–257.
31. Lestari, B.; Budiantara, I.N.; Chamidah, N. Smoothing parameter selection method for multiresponse nonparametric regression model using spline and kernel estimators approaches. J. Phys. Conf. Ser. 2019, 1397, 012064.
32. Lestari, B.; Budiantara, I.N.; Chamidah, N. Estimation of regression function in multiresponse nonparametric regression model using smoothing spline and kernel estimators. J. Phys. Conf. Ser. 2018, 1097, 012091.
33. Osmani, F.; Hajizadeh, E.; Mansouri, P. Kernel and regression spline smoothing techniques to estimate coefficient in rates model and its application in psoriasis. Med. J. Islam. Repub. Iran 2019, 33, 1–5.
34. Lestari, B.; Budiantara, I.N. Spline estimator and its asymptotic properties in multiresponse nonparametric regression model. Songklanakarin J. Sci. Technol. 2020, 42, 533–548.
35. Mariati, M.P.A.M.; Budiantara, I.N.; Ratnasari, V. The application of mixed smoothing spline and Fourier series model in nonparametric regression. Symmetry 2021, 13, 2094.
36. Wang, Y.; Ke, C. Smoothing spline semiparametric nonlinear regression models. J. Comp. Graphical Stats. 2009, 18, 165–183.
37. Lestari, B.; Chamidah, N. Estimating regression function of multiresponse semiparametric regression model using smoothing spline. J. Southwest Jiaotong Univ. 2020, 55, 1–9.
38. Gu, C. Smoothing Spline ANOVA Models; Springer: New York, NY, USA, 2002.
39. Aronszajn, N. Theory of reproducing kernels. Transact. Amer. Math. Soc. 1950, 68, 337–404.
40. Berlinet, A.; Thomas-Agnan, C. Reproducing Kernel Hilbert Spaces in Probability and Statistics; Kluwer Academic: Norwell, MA, USA, 2004.
41. Wahba, G. Support vector machines, reproducing kernel Hilbert spaces and the randomized GACV. Adv. Kernel Methods-Support Vector Learn. 1999, 6, 69–87.
42. Scholkopf, B.; Smola, A.J. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond; MIT Press: Cambridge, MA, USA, 2002.
43. Zeng, X.; Xia, Y. Asymptotic distribution for regression in a symmetric periodic Gaussian kernel Hilbert space. Stat. Sin. 2019, 29, 1007–1024.
44. Hofmann, T.; Scholkopf, B.; Smola, A.J. Kernel methods in machine learning. Ann. Stat. 2008, 36, 1171–1220.
45. Johnson, R.A.; Wichern, D.W. Applied Multivariate Statistical Analysis; Prentice Hall: New York, NY, USA, 1982.
46. Li, J.; Zhang, R. Penalized spline varying-coefficient single-index model. Commun. Stat.-Simul. Comp. 2010, 39, 221–239.
47. Ruppert, D.; Carroll, R. Penalized Regression Splines; Working Paper; School of Operations Research and Industrial Engineering, Cornell University: New York, NY, USA, 1997.
48. Tunç, C.; Tunç, O. On the stability, integrability and boundedness analyses of systems of integro-differential equations with time-delay retardation. RACSAM 2021, 115, 115.
49. Aydin, A.; Korkmaz, E. Introduce Gâteaux and Frêchet derivatives in Riesz spaces. Appl. Appl. Math. Int. J. 2020, 15, 16.
50. Sen, P.K.; Singer, J.M. Large Sample Methods in Statistics: An Introduction with Applications; Chapman & Hall: London, UK, 1993.
51. Eubank, R.L. Spline Smoothing and Nonparametric Regression; Marcel Dekker, Inc.: New York, NY, USA, 1988.
52. Speckman, P. Spline smoothing and optimal rate of convergence in nonparametric regression models. Ann. Stat. 1985, 13, 970–983.
53. Budiantara, I.N. Estimator Spline dalam Regresi Nonparametrik dan Semiparametrik. Ph.D. Dissertation, Universitas Gadjah Mada, Yogyakarta, Indonesia, 2000.
54. Serfling, R.J. Approximation Theorems of Mathematical Statistics; John Wiley: New York, NY, USA, 1980.
Figure 1. Plots of the estimated MNR models in (29) for small smoothing parameters.
Figure 2. Plots of the estimated MNR models in (29) for optimal smoothing parameters.
Figure 3. Plots of the estimated MNR models in (29) for large smoothing parameters.
Figure 4. Plots of the observation values $(t, y)$ and $f(t)$.
Figure 5. Plots of (a) the weighted cubic spline with two knot points and (b) the curve of $f(t)$.
Figure 6. Plots of the residuals and estimated values of the weighted cubic spline.
Figure 7. Residual normality plot for the weighted cubic spline.
Figure 8. Plots of (a) the weighted cubic polynomial and (b) the curve of $f(t)$.
Figure 9. Plots of the asymptotic curves of (a) $\mathrm{IMSE}(\lambda)$, (b) $\mathrm{Bias}^2(\lambda)$, and (c) $\mathrm{Var}(\lambda)$.
Table 1. Comparison of estimation results for the MNR model in (29) under three kinds of smoothing parameter values ($n = 100$; $\rho_{12} = 0.6$; $\rho_{13} = 0.7$; $\rho_{23} = 0.8$; $\sigma_1^2 = 2$; $\sigma_2^2 = 3$; $\sigma_3^2 = 4$).

| Smoothing Parameters | Minimum GCV Value | Estimation Results |
| --- | --- | --- |
| Small: $\lambda_1 = 10^{-10}$, $\lambda_2 = 2 \times 10^{-10}$, $\lambda_3 = 3 \times 10^{-10}$ | 4.763234 | Too rough |
| Optimal: $\lambda_1 = 2.146156 \times 10^{-7}$, $\lambda_2 = 1.084013 \times 10^{-7}$, $\lambda_3 = 5.930101 \times 10^{-8}$ | 2.526286 | Best |
| Large: $\lambda_1 = 10^{-5}$, $\lambda_2 = 2 \times 10^{-5}$, $\lambda_3 = 3 \times 10^{-5}$ | 4.617504 | Too smooth |