On Optimal and Asymptotic Properties of a Fuzzy $L_2$ Estimator

Abstract: A fuzzy least squares estimator in the multiple linear regression model with fuzzy inputs and fuzzy outputs is considered. The paper provides a formula for the $L_2$ estimator of the fuzzy regression model. It proposes several operations for fuzzy numbers and for fuzzy matrices with fuzzy components, and discusses the algebraic properties needed to prove the theorems. Using the proposed operations, a formula for the variance of the estimator is provided, and the estimator is shown to have several important optimal and asymptotic properties: it is the best linear unbiased estimator (BLUE), asymptotically normal, and strongly consistent. Confidence regions for the coefficient parameters and the asymptotic relative efficiency (ARE) are also discussed. In addition, several examples are provided, including a Monte Carlo simulation study illustrating the validity of the proposed theorems.


Introduction
Regression analysis is commonly perceived as one of the most useful tools in statistical modeling. If the data can be observed precisely, classical regression is usually a sufficient solution. However, we encounter many situations in which the observations cannot be obtained precisely. In such cases we need a framework that handles uncertainty coming from two sources: randomness and imprecision. While randomness can be satisfactorily managed by probability theory, one has to adopt a suitable approach for modeling imprecise data. Applying the fuzzy sets proposed by Zadeh [1], Tanaka et al. [2] introduced fuzzy regression analysis. On the other hand, Diamond [3] generalized the main technique of regression analysis, i.e., the least squares method, to fuzzy numbers. Subsequently, numerous researchers considered different fuzzy regression models, known as crisp-input-fuzzy-output (CIFO) or fuzzy-input-fuzzy-output (FIFO), with either crisp or fuzzy parameters (see, e.g., ). Far fewer efforts have been dedicated to exploring the statistical properties of the estimators in fuzzy regression models. For situations where the data have an assumed error structure, Diamond [25] and Näther [26][27][28][29] discussed fuzzy best linear unbiased estimators (FBLUEs). Kim et al. [16] proved asymptotic properties of fuzzy least squares estimators (FLSEs) for a fuzzy simple linear regression model. Due to the complexity of the mathematical formulas describing fuzzy least squares estimators, some authors use α-cuts of fuzzy numbers to express the estimates (e.g., [21,30]), while others separate the estimators into parts, e.g., corresponding to the mode and the two spreads in the case of triangular fuzzy numbers (see, e.g., [6,26,31,32]). Moreover, many authors do not give analytic formulas for the desired estimators but determine the estimates directly from the normal equations (see [3,19]).
To overcome these problems, Yoon et al. [33][34][35] redefined the mathematical model of fuzzy linear regression using the so-called triangular fuzzy matrix and suitable operations defined both on triangular fuzzy numbers and on triangular fuzzy matrices. Even though this paper deals with triangular fuzzy numbers, the approach can be extended to the general case [36]. Moreover, the importance of triangular and trapezoidal fuzzy numbers has been emphasized in [37]. This approach enables us to determine fuzzy least squares estimators of the regression parameters in a concise form, which is also useful for exploring the statistical properties of the estimators. The asymptotic theory for the fuzzy multiple regression model has hardly been discussed in the literature so far. In this contribution we continue the examination of the fuzzy least squares estimator obtained there, focusing on its fundamental finite-sample and asymptotic properties.
The paper is organized as follows. In Section 2, we introduce basic notation related to triangular fuzzy numbers, triangular fuzzy matrices and the operations defined on them that are used later in the paper, together with some algebraic properties needed to prove the theorems. In Section 3, we discuss the fuzzy linear regression model based on the authors' previous studies [33,35]. Next, in Section 4 we prove that the fuzzy least squares estimator introduced in the previous section is the best linear unbiased estimator (BLUE). Then we present the next important results of the paper, showing that under some general assumptions this estimator is asymptotically normal (Section 5) and strongly consistent (Section 6). In addition, the confidence regions of the coefficient parameters and the asymptotic relative efficiency (ARE) are discussed in Section 6. Several examples are provided, including a Monte Carlo simulation study. Section 7 concludes the paper.

Preliminaries
Each triangular fuzzy number $A$ can be represented by an ordered triple, i.e., $A = (l_a, a, r_a)$, where $a$ is the mode of $A$, while $l_a$ and $r_a$ denote the lower and the upper bound of the support of $A$, respectively. In what follows, let $\mathcal{F}_T$ denote the family of all triangular fuzzy numbers defined on the real numbers $\mathbb{R}$.
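Besides the well-known basic operations on fuzzy numbers, like addition

$$X \oplus Y = (l_x + l_y,\; x + y,\; r_x + r_y), \tag{1}$$

where $X = (l_x, x, r_x)$, $Y = (l_y, y, r_y) \in \mathcal{F}_T$, or scalar multiplication

$$kX = (k l_x,\; kx,\; k r_x), \quad k \ge 0, \tag{2}$$

some other operations defined on $\mathcal{F}_T$ are sometimes useful. Let us recall here some concepts proposed in [33]:

$$X \odot Y = l_x l_y + xy + r_x r_y, \tag{3}$$

$$X \otimes Y = (l_x l_y,\; xy,\; r_x r_y).$$

Clearly, $X \oplus Y,\ X \otimes Y \in \mathcal{F}_T$, but the output of (3) is a crisp number (i.e., it is isomorphic with the corresponding real value).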
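The operations on triangular fuzzy numbers can be illustrated with a short sketch. It assumes that addition is component-wise and that $\odot$ is the sum of the component-wise products, which is the expansion used later in the proofs. The class name `TFN` and its method names are hypothetical and serve only this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TFN:
    """Triangular fuzzy number stored as (lower bound, mode, upper bound)."""
    l: float
    m: float
    r: float

    def oplus(self, other: "TFN") -> "TFN":
        # Assumed component-wise addition of two triangular fuzzy numbers.
        return TFN(self.l + other.l, self.m + other.m, self.r + other.r)

    def otimes(self, other: "TFN") -> "TFN":
        # Component-wise product; the result is again a triple.
        return TFN(self.l * other.l, self.m * other.m, self.r * other.r)

    def odot(self, other: "TFN") -> float:
        # Sum of the component-wise products; the result is a crisp number.
        return self.l * other.l + self.m * other.m + self.r * other.r


x = TFN(1.0, 2.0, 3.0)
y = TFN(0.5, 1.0, 2.0)
print(x.oplus(y))   # TFN(l=1.5, m=3.0, r=5.0)
print(x.otimes(y))  # TFN(l=0.5, m=2.0, r=6.0)
print(x.odot(y))    # 8.5
```

Only `odot` leaves the family of triples, which is why it can later produce crisp normal equations.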
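Further on we will also need some operations defined on matrices of fuzzy numbers. We denote by $M_{\mathbb{R}^*}$ the set of all $n \times n$ real crisp matrices with nonnegative elements, and let $M_{\mathcal{F}_T}$ be the set of all triangular fuzzy matrices (t.f.m.), i.e., matrices whose elements belong to $\mathcal{F}_T$. For any two $n \times n$ triangular fuzzy matrices $\tilde{\Gamma} = [X_{ij}]$ and $\tilde{\Lambda} = [Y_{ij}]$, a crisp matrix $A = [a_{ij}]$ and a constant $k > 0$ we have

$$\tilde{\Gamma} \oplus \tilde{\Lambda} = [X_{ij} \oplus Y_{ij}], \qquad \tilde{\Gamma} \otimes \tilde{\Lambda} = \Big[\bigoplus_{m=1}^{n} X_{im} \otimes Y_{mj}\Big], \qquad \tilde{\Gamma} \odot \tilde{\Lambda} = \Big[\sum_{m=1}^{n} X_{im} \odot Y_{mj}\Big],$$

$$k\tilde{\Gamma} = [k X_{ij}], \qquad A\tilde{\Gamma} = \Big[\bigoplus_{m=1}^{n} a_{im} X_{mj}\Big].$$

Of course, $\tilde{\Gamma} \oplus \tilde{\Lambda},\ \tilde{\Gamma} \otimes \tilde{\Lambda},\ A\tilde{\Gamma} \in M_{\mathcal{F}_T}$ and $\tilde{\Gamma} \odot \tilde{\Lambda} \in M_{\mathbb{R}^*}$.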
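As a minimal sketch of the matrix operations, a triangular fuzzy matrix can be stored as a NumPy array whose last axis holds the three components (lower bound, mode, upper bound). This storage layout and the function names `tfm_oplus`, `tfm_otimes` and `tfm_odot` are assumptions made for illustration. The sketch also assumes that the $(i, j)$ entry of the $\odot$ product sums $X_{im} \odot Y_{mj}$ over $m$, as in the expansions used in the proofs.

```python
import numpy as np


def tfm_oplus(G, H):
    """Entry-wise sum of two fuzzy matrices of shape (n, m, 3)."""
    return G + H


def tfm_otimes(G, H):
    """Fuzzy matrix product: entry (i, j) is the component-wise sum over k of X_ik (x) Y_kj."""
    return np.einsum("ikc,kjc->ijc", G, H)


def tfm_odot(G, H):
    """Crisp matrix product: entry (i, j) is the sum over k of X_ik (.) Y_kj."""
    return np.einsum("ikc,kjc->ij", G, H)


rng = np.random.default_rng(0)
G = rng.uniform(0.0, 1.0, size=(2, 2, 3))  # nonnegative components
H = rng.uniform(0.0, 1.0, size=(2, 2, 3))

# The crisp product equals the sum of three ordinary matrix products,
# one for each component of the triples.
by_components = sum(G[..., c] @ H[..., c] for c in range(3))
print(np.allclose(tfm_odot(G, H), by_components))
```

Under this layout the $\odot$ product reduces to ordinary linear algebra on the three component matrices, which is what makes a closed-form estimator possible.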
We can also define the following three types of fuzzy scalar multiplications of a crisp matrix:

Fuzzy Least Squares Estimation
Consider the classical linear regression model

$$Y = X\beta + \varepsilon, \tag{13}$$

where $Y = (Y_1, \ldots, Y_n)^T$ is a vector of observed responses, $X$ is an $n \times (p+1)$ design matrix of explanatory variables $x_{ij}$, $\beta = (\beta_0, \beta_1, \ldots, \beta_p)^T$ denotes a $(p+1)$-dimensional vector of unknown parameters, and $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_n)^T$ is a vector of errors. We usually assume that $E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2 I_n$, where $\sigma^2 < \infty$. It is also usual to take $x_{i0} \equiv 1$, $i = 1, \ldots, n$.
The most common estimator in this regression model is the least squares estimator (LSE) given by

$$\hat{\beta} = (X^T X)^{-1} X^T Y, \tag{14}$$

where the design matrix $X$ is supposed to have full rank. By the Gauss-Markov theorem, (14) is the best linear unbiased estimator (BLUE) of the parameters, where "best" means giving the lowest variance of the estimate, as compared to other unbiased linear estimators. Moreover, estimator (14) is strongly consistent under certain conditions on the design matrix, i.e., $(X_n^T X_n)^{-1} \to 0$.

In this paper, we consider a fuzzy generalization of (13), i.e., the following linear regression model with fuzzy inputs and fuzzy outputs

$$Y_i = \beta_0 \oplus \beta_1 X_{i1} \oplus \cdots \oplus \beta_p X_{ip} \oplus \Phi_i, \quad i = 1, \ldots, n, \tag{15}$$

where $X_{ij} = (l_{x_{ij}}, x_{ij}, r_{x_{ij}})$ and $Y_i = (l_{y_i}, y_i, r_{y_i})$, while $\beta_0, \beta_1, \ldots, \beta_p$ denote unknown crisp regression parameters to be estimated. Moreover, let $\Phi_i$, $i = 1, \ldots, n$, denote fuzzy error terms which express both randomness and fuzziness, allowing negative spreads [8,16,38], i.e., $\Phi_i = (\theta^l_i, \varepsilon_i, \theta^r_i)$, where $\theta^l_i$, $\varepsilon_i$, $\theta^r_i$ are crisp random variables which satisfy the following assumptions:

It can be shown (see [33]) that, defining the design matrix $\tilde{X} = [(l_{x_{ij}}, x_{ij}, r_{x_{ij}})]_{n \times (p+1)}$ and the vector $\tilde{y} = [(l_{y_i}, y_i, r_{y_i})]_{n \times 1} = [(l_{y_1}, y_1, r_{y_1}), \ldots, (l_{y_n}, y_n, r_{y_n})]^T$, and assuming that $\det(\tilde{X}^T \odot \tilde{X}) \neq 0$, we obtain the following least squares estimator

$$\hat{\beta} = (\tilde{X}^T \odot \tilde{X})^{-1}\, \tilde{X}^T \odot \tilde{y}. \tag{16}$$

In the next section, we explore basic properties of (16) and show that it is BLUE. Then, in Section 5 we examine the asymptotic behavior of (16).
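The estimator (16) can be computed with ordinary linear algebra once the $\odot$ products are expanded. The sketch below stores the fuzzy design matrix as an array of shape `(n, p + 1, 3)` and the fuzzy responses as an array of shape `(n, 3)`. This layout, the function name `flse`, and the use of $(1, 1, 1)$ as the intercept column are assumptions made for illustration.

```python
import numpy as np


def flse(X, y):
    """Fuzzy least squares estimate for a fuzzy design X (n, p+1, 3) and responses y (n, 3)."""
    XtX = np.einsum("ijc,ikc->jk", X, X)  # crisp (p+1) x (p+1) matrix X^T (.) X
    Xty = np.einsum("ijc,ic->j", X, y)    # crisp (p+1) vector X^T (.) y
    return np.linalg.solve(XtX, Xty)      # requires det(X^T (.) X) != 0


# Noise-free check: responses built component-wise from known coefficients.
mode = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.ones((5, 2, 3))                                    # column 0: intercept (1, 1, 1)
X[:, 1, :] = np.stack([mode - 0.5, mode, mode + 0.5], axis=1)
beta = np.array([1.5, 0.2])
y = np.einsum("ijc,j->ic", X, beta)                       # no error term
print(flse(X, y))
```

With error-free responses the estimate reproduces the coefficients up to rounding, because the crisp normal equations are then satisfied exactly.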

BLUE in Fuzzy Regression Model
The following theorems provide formulas to find the expectation and variance of our estimator.
$$\sum_{j=1}^{n} (l_{z_{ij}}, z_{ij}, r_{z_{ij}}) \odot (l_{y_j}, y_j, r_{y_j}) = \sum_{j=1}^{n} \left(l_{z_{ij}} l_{y_j} + z_{ij} y_j + r_{z_{ij}} r_{y_j}\right).$$

Hence, by (A4), we obtain

Note that $\sigma^2_{\Phi}$ is a crisp vector, not a triangular fuzzy number.

Corollary 1.
Let $\hat{\beta}$ be the least squares estimator of $\beta$. Then

Hence, by Lemma 2, we get

which completes the proof.
Theorem 1. Let $\hat{\beta}$ be the least squares estimator given by (16). Then $\hat{\beta}$ is an unbiased estimator of $\beta$.
Proof. First, we want to show that $E(\tilde{y}) = \tilde{X}\beta$. Since

Now, by Lemma 1, we conclude that

which means that $\hat{\beta}$ is an unbiased estimator of $\beta$.
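The chain of equalities behind this argument can be sketched as follows. This is a reconstruction under two stated assumptions, not necessarily the derivation of the original proof: every component of the error term $\Phi_i$ has mean zero, and a crisp coefficient multiplies a triangular fuzzy number component-wise, so that $\odot$ is linear in $\beta$.

```latex
\begin{align*}
E(\tilde{y}) &= \tilde{X}\beta
  && \text{(zero-mean error components, assumed)} \\
E(\hat{\beta})
  &= (\tilde{X}^T \odot \tilde{X})^{-1}\, \tilde{X}^T \odot E(\tilde{y}) \\
  &= (\tilde{X}^T \odot \tilde{X})^{-1}\, \bigl(\tilde{X}^T \odot (\tilde{X}\beta)\bigr) \\
  &= (\tilde{X}^T \odot \tilde{X})^{-1} (\tilde{X}^T \odot \tilde{X})\, \beta
   = \beta .
\end{align*}
% Third step, entry k, using component-wise scalar multiplication:
% \sum_i \sum_j \beta_j ( l_{x_{ik}} l_{x_{ij}} + x_{ik} x_{ij} + r_{x_{ik}} r_{x_{ij}} )
%   = \sum_j \beta_j \, [\tilde{X}^T \odot \tilde{X}]_{kj} .
```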
Now we are able to state the main theorem in this section.

Theorem 2.
The least squares estimator $\hat{\beta}$, given by (16), is the best linear unbiased estimator of $\beta$.
Proof. It is clear that $\hat{\beta} = (\tilde{X}^T \odot \tilde{X})^{-1}\, \tilde{X}^T \odot \tilde{y}$ is a linear estimator with respect to the fuzzy operation $\odot$, i.e., a fuzzy-type linear estimator. By Theorem 1 we also know that $\hat{\beta}$ is unbiased. Therefore, it is enough to prove that $\hat{\beta}$ has the minimum variance among the linear unbiased estimators of $\beta$.
Let $\hat{\beta}^*$ be an arbitrary linear unbiased estimator of $\beta$. Then

Since

where $i, l = 1, \ldots, p+1$. So $\sum_{k=1}^{n} (l_{z_{ik}} l_{x_{kl}} + z_{ik} x_{kl} + r_{z_{ik}} r_{x_{kl}}) = 0$ for all $i, l = 1, \ldots, p+1$. Since we assumed $\tilde{\Lambda}, \tilde{X} \in M_{\mathcal{F}_T} = M_{\mathcal{F}_T}(\mathbb{R}^+)$, all components are non-negative, i.e., $l_{z_{ik}} l_{x_{kl}} + z_{ik} x_{kl} + r_{z_{ik}} r_{x_{kl}} \ge 0$ for all $k = 1, \ldots, n$. Thus we conclude that $l_{z_{ik}} l_{x_{kl}} + z_{ik} x_{kl} + r_{z_{ik}} r_{x_{kl}} = 0$ for all $k = 1, \ldots, n$. Moreover, since $l_{z_{ik}} l_{x_{kl}} \ge 0$, $z_{ik} x_{kl} \ge 0$ and $r_{z_{ik}} r_{x_{kl}} \ge 0$, it follows that $l_{z_{ik}} l_{x_{kl}} = 0$, $z_{ik} x_{kl} = 0$ and $r_{z_{ik}} r_{x_{kl}} = 0$, so $Z_{ik} \otimes X_{kl} = (l_{z_{ik}} l_{x_{kl}}, z_{ik} x_{kl}, r_{z_{ik}} r_{x_{kl}}) = (0, 0, 0)$ for all $k = 1, \ldots, n$. Thus

where $\tilde{\Theta}$ is the zero triangular fuzzy matrix in $M_{\mathcal{F}_T}$, all of whose elements are $(0, 0, 0) \in \mathcal{F}_T$. Of course, $\tilde{X}^T \otimes \tilde{\Lambda}^T = (\tilde{\Lambda} \otimes \tilde{X})^T$. Therefore, Equation (19) reduces to

Then $\mathrm{Var}(\hat{\beta}^*_k)$, for $k = 0, \ldots, p$, appear on the diagonal of $\mathrm{Var}(\hat{\beta}^*)$.

Example 1.
With the modified data of [7], given in Table 1, we evaluate the covariance matrices of several fuzzy-type linear estimators in order to compare their variances. The data involve student grades and family income and were fuzzified as triangular fuzzy numbers. We compare four fuzzy-type linear estimators.

Table 1. Numerical data for an example.

Here $\hat{\beta}$ is our estimator and $\hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3$ are modified fuzzy-type linear estimators. We define the estimators as follows: $\hat{\beta} = (\tilde{X}^T \odot \tilde{X})^{-1}\, \tilde{X}^T \odot \tilde{y}$,

$\mathrm{Var}(\hat{\beta}_0) = 1.2366$ is smaller than 1.2367 and 1.2466. $\mathrm{Var}(\hat{\beta}_1) = 0.0090$ is the smallest among 0.0090, 0.0091, 0.0092 and 0.0190. Finally, $\mathrm{Var}(\hat{\beta}_2) = 0.0155$ is smaller than 0.0156 and 0.0255. Hence we conclude that the estimator $\hat{\beta}$ has the minimum variances among $\hat{\beta}$, $\hat{\beta}_1$, $\hat{\beta}_2$ and $\hat{\beta}_3$.

Asymptotic Normality
We start this section by citing some theorems concerning the Central Limit Theorem (CLT) and Strong Law of Large Numbers (SLLN) for martingales which will be useful in the proof of the main result.
Theorem 3 (SLLN for martingales). Let $S_n = \sum_{i=1}^{n} X_i$, $n \ge 1$, be a martingale such that $E|X_k|^p < \infty$ for $k \ge 1$ and $1 \le p \le 2$. Suppose that $\{b_n\}$ is a sequence of positive constants increasing to $\infty$ as $n \to \infty$,

where the notation $\xrightarrow{L}$ stands for convergence in law.
The proof of Theorem 4 can be found in [39].
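Theorem 5 (Courant-Fischer minimax theorem). For any $n \times n$ real symmetric matrix $A$, its eigenvalues $\lambda_1 \le \cdots \le \lambda_n$ satisfy

$$\lambda_k = \min_{C:\ \dim C = k}\ \max_{x \in C,\ \|x\| = 1} x^T A x,$$

where $C$ is a subspace of $\mathbb{R}^n$.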
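The consequence of the minimax theorem that matters for the asymptotic argument is the bound $x^T Q x \le \mathrm{ch}_{\max}(Q)\, x^T x$ for a real symmetric $Q$. A small numerical check of this bound is sketched below; the matrix and the test vectors are arbitrary random draws chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.normal(size=(4, 4))
Q = (B + B.T) / 2.0              # arbitrary real symmetric matrix
eig = np.linalg.eigvalsh(Q)      # eigenvalues in ascending order

x = rng.normal(size=(1000, 4))   # random nonzero test vectors
rayleigh = np.einsum("ni,ij,nj->n", x, Q, x) / np.einsum("ni,ni->n", x, x)

# Every Rayleigh quotient lies between the smallest and largest eigenvalue.
print(eig[0] <= rayleigh.min(), rayleigh.max() <= eig[-1])
```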
To obtain some asymptotic properties of our least squares estimator in the generalized fuzzy regression model, the following additional assumption is required besides Assumption A given in Section 3: $\tilde{x}_i \to 0$ as $n \to \infty$, where $\tilde{x}_i^T$ denotes the $i$-th row of the fuzzy matrix $\tilde{X}$.
Now we are able to formulate one of the main results of this contribution. Let $\hat{\beta}_n$ be the estimator of $\beta$ based on a sample of size $n$.

Theorem 6. If model (15) satisfies Assumptions A and B, then the least squares estimator $\hat{\beta}_n$ is asymptotically normal, i.e.,

where $\sigma^2_{\Phi} = (\sigma^2_l, \sigma^2, \sigma^2_r)$.
Proof. By (16) one can find that

so, consequently, $\hat{\beta}_n - \beta = (\tilde{X}^T \odot \tilde{X})^{-1}\, \tilde{X}^T \odot \Phi$. Let $\lambda_n \in \mathbb{R}^{p+1}$ ($\lambda_n \neq 0$) be an arbitrary but fixed vector. Moreover, let $Z_n = \lambda_n^T (\hat{\beta}_n - \beta)$ and $\tilde{C}_n = (C_{n1}, \ldots, C_{nn})$ with $C_{ni} \in \mathcal{F}_T$; then (see [33])

We claim that $\tilde{C}_n$ satisfies the regularity condition of Theorem 4. Then we can obtain the asymptotic distribution of $Z_n$. Let $\tilde{x}_i^T$ be the $i$-th row of $\tilde{X}$. Then we get $C_{ni} = \tilde{x}_i^T (\tilde{X}^T \odot \tilde{X})^{-1} \lambda_n$.
Since $C_{ni} \in \mathcal{F}_T$ we have $C_{ni}^T = C_{ni}$. Hence

Therefore, by Theorem 5 (see [33])

becomes

where $\mathrm{ch}_{\max}(Q)$ stands for the largest characteristic value of the matrix $Q$. Thus

which, by Assumption (B1), converges to 0 as $n \to \infty$. It means that

as $n \to \infty$. So, consequently, by the Hájek-Šidák Central Limit Theorem (Theorem 4),

On the other hand, one may notice that

as $n \to \infty$ (by Assumption B2). Thus

which completes the proof.

Strong Consistency and Confidence Region
The weak consistency is a direct consequence of the asymptotic normality.
Theorem 7. For the model (15), under Assumptions A and B, $\hat{\beta}_n$ is a weakly consistent estimator of $\beta$, that is, $\hat{\beta}_n - \beta \xrightarrow{P} 0$, where the notation $\xrightarrow{P}$ means convergence in probability.
Proof. Since $\sqrt{n}(\hat{\beta}_n - \beta)$ converges in law to a non-degenerate random variable, it follows that $\sqrt{n}(\hat{\beta}_n - \beta) = O_P(1)$, where $O_P(1)$ stands for bounded in probability. This implies that each component $\hat{\beta}_i$ is weakly consistent, and hence $\hat{\beta}_n - \beta \xrightarrow{P} 0$.
One of the main results of this section is the following theorem on the strong consistency of the fuzzy least squares estimator. Moreover, this theorem shows that asymptotic normality is not needed for strong consistency, and hence some of the conditions may be relaxed.

where the notation $\xrightarrow{a.s.}$ means almost sure convergence.
Proof. Since $\hat{\beta}_n - \beta = (\tilde{X}^T \odot \tilde{X})^{-1}\, \tilde{X}^T \odot \Phi$ and, by assumption, $s_j^2 \to \infty$ for all $j$, it can be proved by Theorem 3 that $\mathrm{diag}(1/s_0^2, \ldots, 1/s_p^2)\, \tilde{X}^T \odot \Phi \to 0$ almost surely. The result of the theorem is now an obvious consequence of the assumption. The proof is completed.
Next, we provide an approximate confidence region for $\beta$ based on the large-sample normality of the FLSE. The asymptotic normality of $\sqrt{n}(\hat{\beta}_n - \beta)$, derived in Theorem 6 under the regularity conditions, suggests the use of a pivotal quantity of the form

The following corollary gives the large-sample distribution of $Q_n(\hat{\beta}_n)$.

Corollary 2.
Under the conditions of Theorem 6, $Q_n(\hat{\beta}_n)$ has asymptotically a chi-squared distribution with $p + 1$ degrees of freedom.
Proof. It follows directly from Theorem 6.
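The chi-squared calibration can be checked by simulation. The sketch below does not use the paper's pivotal quantity, whose exact form is not reproduced here. Instead it assumes a component-wise model with independent normal errors of equal variance $\sigma^2$ on all three components, in which case the covariance of the estimator is $\sigma^2 (\tilde{X}^T \odot \tilde{X})^{-1}$ and the quadratic form below is chi-squared with $p + 1 = 2$ degrees of freedom. The function names and the design settings are hypothetical.

```python
import numpy as np


def odot_gram(X):
    """Crisp matrix X^T (.) X for a fuzzy design X of shape (n, p+1, 3)."""
    return np.einsum("ijc,ikc->jk", X, X)


def flse(X, y):
    """Fuzzy least squares estimate for responses y of shape (n, 3)."""
    return np.linalg.solve(odot_gram(X), np.einsum("ijc,ic->j", X, y))


def coverage(n=50, reps=2000, sigma=0.5, alpha=0.05, seed=0):
    """Share of replications in which the true beta falls inside the region."""
    rng = np.random.default_rng(seed)
    beta = np.array([1.5, 0.2])
    mode = rng.normal(15.0, 4.0, size=n)
    spread = np.abs(rng.normal(2.0, 0.5, size=n))
    X = np.ones((n, 2, 3))                       # intercept column (1, 1, 1), assumed
    X[:, 1, :] = np.stack([mode - spread, mode, mode + spread], axis=1)
    A = odot_gram(X)
    crit = -2.0 * np.log(alpha)                  # chi-squared quantile for 2 degrees of freedom
    hits = 0
    for _ in range(reps):
        y = np.einsum("ijc,j->ic", X, beta) + rng.normal(0.0, sigma, size=(n, 3))
        d = flse(X, y) - beta
        hits += (d @ A @ d) / sigma**2 <= crit   # quadratic form in the estimation error
    return hits / reps


print(coverage())
```

The closed-form quantile $-2\ln\alpha$ is specific to two degrees of freedom; for other values of $p$ a chi-squared quantile function would be needed.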

Remark 1.
Note that it is well known that, under certain regularity conditions, the sequence of the crisp LSEs $\hat{\beta}^*_n$ is asymptotically normal in the sense that

where $\sigma^{*2}$ is the variance of the errors $\varepsilon_i$ in model (13) and $V$ is given by $V_n = \frac{1}{n}(X^T X) \to V$ as $n \to \infty$. Thus, a $100(1 - \alpha)$ percent approximate confidence region based on the LSE, denoted by $C^*_{1-\alpha}(\beta)$, is the set of $\hat{\beta}^*_n$ such that

where $\delta^* = \frac{\sigma^{*2}}{n} \chi^2_{1-\alpha}(p)$ and $\sigma^{*2} = \mathrm{Var}[\varepsilon]$. Then, for large $n$, $C^*_{1-\alpha}(\beta)$ provides a $100(1 - \alpha)$ percent confidence region for $\beta$. Now we compare the sequences of the FLSE $\{\hat{\beta}_n\}$ and the classical crisp LSE $\{\hat{\beta}^*_n\}$. A numerical measure of the asymptotic relative efficiency (ARE) of $\{\hat{\beta}_n\}$ with respect to $\{\hat{\beta}^*_n\}$, based on the inverse ratio of their generalized limiting variances and denoted by $e(F, C)$, is given by [40]

If the ARE of the FLSE $\{\hat{\beta}_n\}$ with respect to the crisp LSE $\{\hat{\beta}^*_n\}$ is greater than 1, which implies a strictly smaller asymptotic confidence region, we can say that $\{\hat{\beta}_n\}$ is more efficient than $\{\hat{\beta}^*_n\}$.

Example 2.
The dataset from Table 1 is used as an example of ARE. If we put $\sigma^2_{\theta^l} = \sigma^2_{\theta^r} = \sigma^2 = 1$ and $\sigma^{*2} = 1$, then we obtain

In this case, it can be concluded that our estimator $\{\hat{\beta}_n\}$ is more efficient than $\{\hat{\beta}^*_n\}$. Let us now regard the data as crisp numbers, i.e., $Y_i = (y_i, y_i, y_i)$ and $X_i = (x_i, x_i, x_i)$. We take $\sigma^2_{\theta^l} = \sigma^2_{\theta^r} = 0$, $\sigma^2 = 1$ and $\sigma^{*2} = 1$. Then we have

We can verify that the efficiency is approximately 1, as expected.

Example 3 (Monte Carlo Simulation).
We performed Monte Carlo simulations to examine the performance of the proposed estimator with fuzzy observations. The asymptotic behavior and the accuracy for some finite-sample datasets are investigated. Two parameters are chosen for this simulation: $(\beta_0, \beta_1) = (1.50, 0.20)$. For the independent variables, the spreads and the modes are selected from the normal distributions $N(15, 4^2)$ and $N(2, 0.5^2)$, respectively. In addition, the measurement errors of the modes and spreads are chosen to be Gaussian white noise with mean zero and variance 0.25: $\Phi_i = (\theta^l_i, \varepsilon_i, \theta^r_i)$ with $\varepsilon_i, \theta^l_i, \theta^r_i \sim NID(0, 0.5^2)$ ($i = 1, \ldots, n$). Sample sizes $n = 10, 50, 100$ were used to represent small, moderate and large samples. For each sample size $n$, 1000 different datasets were generated. For each dataset we estimated the parameters $\beta_0, \beta_1$ by the proposed estimator and computed the average estimates and the average mean squared error (MMSE) over the 1000 simulations. These results are shown in Table 2. In addition, the percentiles (minimum, 1st quartile, median, 3rd quartile, 95 percent point, and maximum) of the 1000 estimation errors are given in Tables 3 and 4. The accuracy of the estimator was thus assessed for several sample sizes; our simulation results indicate that the estimation procedure has smaller mean bias and smaller mean squared error in the parameter estimates as the sample size increases.
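A simulation of this kind can be sketched as follows. Several details are assumptions of the sketch rather than the settings of the original study: the modes are drawn from $N(15, 4^2)$ and the (absolute) spreads from $N(2, 0.5^2)$, the spreads are symmetric, the intercept column is $(1, 1, 1)$, and the coefficients act component-wise on the triples. The function names `flse` and `simulate` are hypothetical.

```python
import numpy as np


def flse(X, y):
    """Fuzzy least squares estimate for a fuzzy design X (n, p+1, 3) and responses y (n, 3)."""
    XtX = np.einsum("ijc,ikc->jk", X, X)
    Xty = np.einsum("ijc,ic->j", X, y)
    return np.linalg.solve(XtX, Xty)


def simulate(n, reps=1000, beta=(1.5, 0.2), sigma=0.5, seed=0):
    """Return the average estimate and the mean squared error over `reps` datasets."""
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta)
    est = np.empty((reps, beta.size))
    for r in range(reps):
        mode = rng.normal(15.0, 4.0, size=n)             # assumed distribution of the modes
        spread = np.abs(rng.normal(2.0, 0.5, size=n))    # assumed distribution of the spreads
        X = np.ones((n, 2, 3))
        X[:, 1, :] = np.stack([mode - spread, mode, mode + spread], axis=1)
        noise = rng.normal(0.0, sigma, size=(n, 3))      # independent errors on all components
        y = np.einsum("ijc,j->ic", X, beta) + noise
        est[r] = flse(X, y)
    return est.mean(axis=0), ((est - beta) ** 2).mean(axis=0)


for n in (10, 50, 100):
    mean_est, mse = simulate(n)
    print(n, mean_est.round(3), mse.round(4))
```

Under these assumptions the mean squared errors should shrink as $n$ grows, in line with the consistency results; the numbers will not match Tables 2 to 4, since the exact design of the original study is not reproduced.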

Conclusions
In this paper we considered a multiple fuzzy-input-fuzzy-output regression model with fuzzy error terms, described by a suitable matrix called the triangular fuzzy matrix. We proposed a simple formula for the fuzzy least squares estimator and examined its fundamental properties. The suggested estimator turns out to be BLUE, so it is in some sense optimal. Moreover, its asymptotic normality under quite general assumptions opens a new perspective for constructing statistical tests and confidence intervals useful both in model validation and in forecasting. These topics will be considered in further research. Another open problem is to establish analogous results in more general fuzzy regression models based on trapezoidal fuzzy numbers, LR-fuzzy numbers and so on.