A Large Sample Study of Fuzzy Least-Squares Estimation

Jin Hee Yoon; Seung Hoe Choi

doi:10.3390/axioms14030181

¹ Department of Mathematics and Statistics, Sejong University, Seoul 05006, Republic of Korea

² School of Liberal Arts and Science, Korea Aerospace University, Goyang 10540, Republic of Korea

* Author to whom correspondence should be addressed.

Axioms 2025, 14(3), 181; https://doi.org/10.3390/axioms14030181

This article belongs to the Section Logic

Abstract

In many real-world situations, we deal with data that exhibit both randomness and vagueness. To manage such uncertain information, fuzzy theory provides a useful framework. Specifically, to explore causal relationships in these datasets, a lot of fuzzy regression models have been introduced. However, while fuzzy regression analysis focuses on estimation, it is equally important to study the mathematical characteristics of fuzzy regression estimates. Despite the statistical significance of optimal properties in large-sample scenarios, only limited research has addressed these topics. This study establishes key optimal properties, such as strong consistency and asymptotic normality, for the fuzzy least-squares estimator (FLSE) in general linear regression models involving fuzzy input–output data and random errors. To achieve this, fuzzy analogues of traditional normal equations and FLSEs are derived using a suitable fuzzy metric. Additionally, a confidence region based on FLSEs is proposed to facilitate inference. The asymptotic relative efficiency of FLSEs, compared to conventional least-squares estimators, is also analyzed to highlight the efficiency of the proposed estimators.

Keywords:

fuzzy least-squares estimation; asymptotic normality; strong consistency; triangular fuzzy matrix

MSC:

62A86; 62E17; 62E86

1. Introduction

The least-squares estimation technique has proven to be highly effective in various fields. By analyzing the provided data, this method establishes a mathematical relationship between dependent and independent variables, which can subsequently be applied for forecasting, optimization, and control. Under appropriate design conditions, the parameter estimates obtained through this method typically exhibit optimal properties. Furthermore, when it is assumed that the error terms follow a Gaussian distribution, these estimates align with maximum likelihood estimators. With certain regularity conditions in place, the least-squares method produces the best linear unbiased estimates for regression parameters. The Gauss–Markov theorem further generalizes this result to linear regression models with arbitrary error distributions. In these fields, it is typically presumed that the observed data are accurate, leading to the development of statistical techniques tailored for clearly defined and precise datasets.

Nevertheless, in many real-world situations, the accessible data are not only random but also ambiguous because of imprecise information or expressions such as “approximately 10”, “slightly more than 10”, “somewhere between 5 and 10”, or subjective qualifiers such as “fair”, “good”, or “excellent”. In such instances, the classical least-squares method is not appropriate. Regression analysis typically involves two forms of uncertainty, randomness and fuzziness, but often only random errors are modeled. Therefore, a broader regression framework is necessary to account for both random and fuzzy uncertainties. To handle data ambiguity, various statistical techniques have been developed [1,2]. The pioneering works on regression analysis using the fuzzy model were conducted by Tanaka and his collaborators [3,4,5]. In recent decades, numerous researchers have studied regression within fuzzy environments [6,7,8,9,10,11]. Choi and Yoon introduced several fuzzy linear regression models, such as the componentwise fuzzy linear regression model [6], the fuzzy rank linear model [7], and the general fuzzy regression model [8], and studied the equivalence in alpha-level linear regression [9]. Lee et al. [10] employed bootstrap techniques to make inferences about fuzzy regression models. Namdari et al. [11] and Sohn et al. [12] proposed fuzzy logistic regression models utilizing LAD (least absolute deviation) and LSE (least-squares estimation).

More recently, Yoon developed fuzzy mediation and moderation analysis techniques grounded in fuzzy regression analysis [13,14] and introduced a variable selection technique for multiple fuzzy regression models [15]. Additionally, Bas and Egrioglu [16,17] discussed robust fuzzy regression, and some authors [18,19] applied the TSK system to fuzzy regression.

In these studies, the discrepancies between the regression models and observed data are interpreted not as random errors governed by probability distributions but rather as stemming from the fuzziness of the model structure. Nonetheless, various researchers have investigated models incorporating random errors in this context [20,21,22,23,24,25,26,27,28].

The mathematical properties of regression models, including optimality and large-sample behavior, hold significant importance in statistics [28,29,30,31]. However, research on fuzzy estimation has predominantly focused on practical applications, and theoretical studies of these aspects remain relatively limited. Körner and Näther [32,33,34], as well as Yoon and Grzegorzewski [28], have examined the characteristics of the BLUE (Best Linear Unbiased Estimator). Furthermore, Kim et al. [25] and Yoon et al. [27,28] have investigated asymptotic theories in the context of large-sample studies. Recently, Croci et al. [29] used semidefinite programming to analyze BLUE properties, while Song et al. [30] and Young [31] demonstrated consistency-related properties among optimality criteria.

This study builds on a prior investigation [25] that explored the asymptotic properties of least-squares estimation using vague data. The main goal of this paper is to identify sufficient conditions for the consistency and asymptotic normality of a sequence of least-squares estimators with vague data applied to a multiple linear regression model; the asymptotic efficiency of these estimators is also discussed.

This paper is structured as follows. Section 2 introduces the assumptions underlying the fuzzy model, along with key preliminary concepts essential for deriving the main results. In Section 3, we develop the counterparts of the normal equations and derive the estimators. The large-sample properties of fuzzy least-squares estimators (FLSEs), including their strong consistency and asymptotic normality, are discussed in Section 3 and Section 4. Lastly, Section 5 presents the approximate confidence region for the parameters as well as a comparison of the asymptotic relative efficiency of the FLSEs with that of the crisp least-squares estimators.

2. Preliminaries

Let us begin by examining the standard multiple linear regression model:

Y_{i} = β_{0} + β_{1} x_{i1} + \dots + β_{p-1} x_{i,p-1} + ϵ_{i}, i = 1, \dots, n,

(1)

which can be expressed in matrix notation as

Y = X β + ϵ,

(2)

where Y = (Y_{1}, \dots, Y_{n})^{t} is an (n \times 1) vector of observed response variables, X denotes an (n \times p) matrix of known constants x_{ij}, β = (β_{0}, β_{1}, \dots, β_{p-1})^{t} represents the (p \times 1) vector of unknown parameters, and ϵ = (ϵ_{1}, \dots, ϵ_{n})^{t} is an (n \times 1) vector of unobserved random errors, which are assumed to follow a distribution with distribution function F such that E ϵ = 0 and Var ϵ = σ^{2} I_{n}, where σ^{2} < \infty.

The most popular estimation method for β in model (2) is least-squares estimation; the least-squares estimator (LSE) is given by

\hat{β}_{n} = (X^{t} X)^{-1} X^{t} Y,

where the regression matrix X is supposed to have full rank, i.e., the columns of X are linearly independent. Within the Gaussian model, where the error terms are assumed to follow a Gaussian distribution, the LSE satisfies the standard optimal properties: it is unbiased and asymptotically efficient. The Gauss–Markov theorem is a broader and more general result: assuming only that the error distribution has a mean of zero and a finite variance σ^{2}, the LSE is the best linear unbiased estimator (BLUE).
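As a small illustration of the LSE formula above, the following sketch solves the normal equations for a toy design. The function name `least_squares` and the noise-free data are assumptions made for this example only; NumPy is used purely for convenience.

```python
import numpy as np

def least_squares(X, y):
    """Return the LSE (X^t X)^{-1} X^t y for a full-rank design matrix X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Solve the normal equations X^t X b = X^t y instead of forming the inverse.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Hypothetical noise-free data generated from y = 1 + 2x, so the fit is exact.
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])  # first column carries the intercept
y = 1.0 + 2.0 * x
beta_hat = least_squares(X, y)
```

Solving the linear system directly is a common numerical choice; it yields the same estimator as the closed-form expression whenever X has full column rank.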

In [35,37], the input variables are assumed to be observed without error, whereas An et al. [36] considered random inputs for a linear regression model.

An estimator \hat{δ}_{n} of β is said to be strongly (or weakly) consistent if it converges to β almost surely (or in probability) as n \to \infty. The strong consistency of \hat{β}_{n} was established in [35,37] under specific conditions on the design points x_{ij}, which correspond to the requirement (X^{t} X)^{-1} \to 0. For the linear model with random inputs, An et al. [36] introduced a novel estimator of the unknown parameters, derived from the Fourier transform of a symmetric weight function. They established the strong consistency of this estimator under the assumption of independent and identically distributed observations (Y_{i}, x_{i}).

In this study, the experimental data are regarded as imprecise. Moreover, we assume that these data can be interpreted as samples drawn from a fuzzy set-valued random variable.

To extend the least-squares method to cases involving imprecise data—while ensuring that it reduces to the classical method when the model elements are precise—we introduce the following fuzzy linear regression model:

Y_{i} = β_{0} \oplus β_{1} X_{i1} \oplus \dots \oplus β_{p-1} X_{i,p-1} + Φ_{i}, i = 1, \dots, n,

(3)

where X_{ij} and Y_{i} (j = 1, \dots, p - 1) are random fuzzy variables, β_{j} are unknown crisp regression parameters that must be estimated from the observed values of Y_{i} and X_{ij}, and Φ_{i} are assumed to be crisp random error vectors. In Equation (3), the operator ⊕ denotes the addition of fuzzy sets, while + represents the conventional addition of vectors.

It is important to note that model (3) incorporates two distinct types of uncertainty: vagueness and randomness, simultaneously. Hence, model (3) can be viewed as an expansion of the conventional linear regression model to encompass situations where the observations of both explanatory and response variables are expressed as fuzzy numbers, where precise values are treated as special cases of degenerate fuzzy numbers. Therefore, this model is different from Tanaka’s fuzzy regression models, which are discussed in [32,38,39], etc.

Models addressing the ambiguous representation of data will consist of specific fuzzy sets within the real number space.

Following [21,24], we present certain definitions pertaining to fuzzy sets and fuzzy numbers, alongside fundamental findings from fuzzy theory.

A fuzzy subset of R^{1} is defined as a mapping, known as the membership function, from R^{1} to the interval [0, 1]. Consequently, a fuzzy subset A is represented by its membership function m_{A}(x).

For any α \in (0, 1], the crisp set A_{α} = \{x \in R^{1} : m_{A}(x) \geq α\} is referred to as the α-cut of A.

A fuzzy number A is a normal and convex fuzzy subset of the real line R^{1} with bounded support.

The collection of all fuzzy numbers is denoted by F_{c}(R^{1}). Notably, there are no universally applicable guidelines for determining the membership function of a fuzzy observation.

As a specific instance, we utilize a particular parametric class of fuzzy numbers, referred to as LR-fuzzy numbers:

m_{A}(x) = \begin{cases} L(\frac{m - x}{l}) & \text{if } x \leq m, \\ R(\frac{x - m}{r}) & \text{if } x > m, \end{cases} \quad \text{for } x \in R^{1},

where L, R : R^{+} \to [0, 1] are predefined left-continuous and non-increasing functions satisfying R(0) = L(0) = 1 and R(1) = L(1) = 0. The functions L and R serve as the left and right shape functions of A, with m representing the mode of A, while l > 0 and r > 0 correspond to the left and right spreads of A. The notation A = (m, l, r)_{LR} is used to represent an LR-fuzzy number.

The parameters l and r indicate the level of fuzziness associated with the numerical value, which may be either symmetric or asymmetric. In the degenerate case where both l and r are equal to 0, the numerical value is entirely crisp, meaning it has no fuzziness. The α-cuts of the fuzzy numbers are given by the interval representation:

A_{α} = [m - L^{-1}(α) l, m + R^{-1}(α) r], α \in (0, 1].

The collection of all LR-fuzzy numbers is denoted as F_{LR}(R^{1}). Specifically, when L(x) = R(x) = [1 - x]^{+} in A = (m, l, r)_{LR}, the fuzzy number A is referred to as a triangular fuzzy number, represented as A = (m, l, r)_{T}. Consequently, fuzzy numbers provide an effective means for modeling imprecise data.
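To make the triangular case concrete, the sketch below evaluates the membership function and the α-cut of a triangular fuzzy number (m, l, r)_T, using the fact that L(x) = R(x) = [1 - x]^+ gives L^{-1}(α) = R^{-1}(α) = 1 - α. The function names and the example value "approximately 10" encoded as (10, 2, 3)_T are illustrative assumptions, not part of the original text.

```python
def triangular_membership(x, m, l, r):
    """Membership degree of x in the triangular fuzzy number (m, l, r)_T, l, r > 0."""
    if x <= m:
        return max(0.0, 1.0 - (m - x) / l)  # left branch: L((m - x) / l)
    return max(0.0, 1.0 - (x - m) / r)      # right branch: R((x - m) / r)

def alpha_cut(m, l, r, alpha):
    """Interval [m - (1 - alpha) l, m + (1 - alpha) r] for alpha in (0, 1]."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    return (m - (1.0 - alpha) * l, m + (1.0 - alpha) * r)

# Hypothetical encoding of "approximately 10" with asymmetric spreads.
degree = triangular_membership(9.0, 10.0, 2.0, 3.0)
cut = alpha_cut(10.0, 2.0, 3.0, 0.5)
```

At α = 1 the cut collapses to the mode, and as α decreases toward 0 it widens toward the support [m - l, m + r].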

The statistical management of fuzzy numbers usually requires considering elementary operations between them. Operations of fuzzy numbers are defined based on Zadeh’s extension principle [40].

An advantageous aspect of LR-fuzzy numbers lies in their capability to express the operations ⊕ and · through straightforward operations on the parameters m, l, r:

(m_{1}, l_{1}, r_{1})_{LR} \oplus (m_{2}, l_{2}, r_{2})_{LR} = (m_{1} + m_{2}, l_{1} + l_{2}, r_{1} + r_{2})_{LR}

and

λ (m, l, r)_{LR} = \begin{cases} (λ m, λ l, λ r)_{LR} & \text{if } λ > 0, \\ (λ m, - λ r, - λ l)_{LR} & \text{if } λ < 0, \\ (0, 0, 0)_{LR} & \text{if } λ = 0. \end{cases}
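The two parameter rules above translate directly into code. The following is a minimal sketch; the type `LRNumber` and the functions `add` and `scale` are hypothetical names, and the example assumes identical shape functions L = R (as for triangular numbers), so that swapping the spreads for negative λ needs no further bookkeeping.

```python
from typing import NamedTuple

class LRNumber(NamedTuple):
    m: float  # mode
    l: float  # left spread
    r: float  # right spread

def add(a, b):
    """Fuzzy addition: modes and spreads add componentwise."""
    return LRNumber(a.m + b.m, a.l + b.l, a.r + b.r)

def scale(lam, a):
    """Scalar multiplication; a negative scalar swaps left and right spreads."""
    if lam > 0:
        return LRNumber(lam * a.m, lam * a.l, lam * a.r)
    if lam < 0:
        return LRNumber(lam * a.m, -lam * a.r, -lam * a.l)
    return LRNumber(0.0, 0.0, 0.0)

a = LRNumber(1.0, 0.5, 0.25)
b = LRNumber(2.0, 1.0, 0.75)
total = add(a, b)        # spreads accumulate under addition
flipped = scale(-2.0, a)  # spreads swap and stay non-negative
```

Note that spreads never cancel under ⊕, which is why fuzziness can only grow when fuzzy numbers are added.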

On the other hand, applying the least-squares method to fuzzy data requires a suitable metric defined on the space of fuzzy sets. Various metrics can be established within the class F_{c}(R^{1}). The distance between two fuzzy numbers is often determined by the disparity between their α-cuts.

A significant class of metrics can be formulated using support functions. The support function corresponding to any fuzzy set A \in F_{c}(R^{1}) is expressed as:

s_{A}(α, r) = \sup_{a \in A_{α}} \langle r, a \rangle, r \in S^{d-1}, α \in (0, 1],

where S^{d-1} denotes the (d - 1)-dimensional unit sphere in R^{d} (with d = 1 in the present setting), and \langle \cdot, \cdot \rangle signifies the inner product in R^{d}. A metric on F_{c}(R^{1}) is introduced based on the L_{2}-metric within the space of Lebesgue integrable functions:

δ_{2}(A, B) = [d \cdot \int_{0}^{1} \int_{S^{d-1}} | s_{A}(α, r) - s_{B}(α, r) |^{2} μ(dr) dα]^{1/2}

for any A, B \in F_{c}(R^{1}). Ming and Friedman [39] proposed a metric for fuzzy numbers X and Y, defined in terms of the distance between their respective images. For X, Y \in F_{c}(R^{1}), the metric is given as:

d_{M}^{2}(X, Y) = \int_{0}^{1} (X_{α}^{-} - Y_{α}^{-})^{2} dα + \int_{0}^{1} (X_{α}^{+} - Y_{α}^{+})^{2} dα,

where X_{α}^{-} and X_{α}^{+} represent the lower and upper endpoints of X_{α}. Diamond [21] introduced an alternative metric applicable to the set of all triangular fuzzy numbers. Let T(R^{1}) denote the collection of all triangular fuzzy numbers in R^{1}. For X, Y \in T(R^{1}), the metric is defined as follows:

d^{2}(X, Y) = D_{2}^{2}(supp X, supp Y) + [m(X) - m(Y)]^{2},

where supp X represents the compact interval supporting X, and m(X) refers to its mode. Given X = (x, ξ^{l}, ξ^{r})_{T} and Y = (y, η^{l}, η^{r})_{T}, the metric is equivalently expressed as:

d^{2}(X, Y) = [y - η^{l} - (x - ξ^{l})]^{2} + [y + η^{r} - (x + ξ^{r})]^{2} + (y - x)^{2}.
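The closed form of Diamond's metric for triangular fuzzy numbers is easy to evaluate numerically. The sketch below is one possible implementation; the function name `diamond_distance_sq` and the (mode, left spread, right spread) tuple layout are assumptions chosen for this example.

```python
def diamond_distance_sq(X, Y):
    """Squared Diamond distance between triangular numbers given as (mode, left, right)."""
    x, xi_l, xi_r = X
    y, eta_l, eta_r = Y
    left = (y - eta_l) - (x - xi_l)    # difference of the left endpoints of the supports
    right = (y + eta_r) - (x + xi_r)   # difference of the right endpoints of the supports
    mode = y - x                       # difference of the modes
    return left ** 2 + right ** 2 + mode ** 2

# Hypothetical triangular numbers (1, 1, 1)_T and (3, 1, 2)_T.
d2 = diamond_distance_sq((1.0, 1.0, 1.0), (3.0, 1.0, 2.0))
```

The three squared terms correspond to the two endpoints of the support and the mode, so two triangular numbers are at distance zero exactly when all three coincide.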

Then,

S_{n} / b_{n} \overset{a . s .}{⟶} 0

, where the notation

\overset{a . s .}{⟶}

means that it converges almost surely.

Theorem 1

(Lemma in [35], p. 125). Let \{r_{i}\} be a sequence of real numbers and \{e_{i}\} be a sequence of i.i.d. r.v.’s such that E[e_{i}] = 0 and 0 < E[e_{i}^{2}] < \infty for all i. Moreover, suppose that \sum_{i=1}^{n} r_{i}^{2} \to \infty as n \to \infty. Then,

(\sum_{i=1}^{n} r_{i} e_{i}) (\sum_{i=1}^{n} r_{i}^{2})^{-1} \overset{a.s.}{⟶} 0

as n \to \infty.

3. Fuzzy Least-Squares Estimators and Asymptotic Normality

The focus of statistical analysis under (3) is primarily on making inferences about the parameters β_{0}, \dots, β_{p-1}. We reconsider the multiple regression model:

Y_{i} = β_{0} + β_{1} X_{i1} + \dots + β_{p-1} X_{i,p-1} + Φ_{i}, i = 1, \dots, n,

(4)

where X_{ij}, Y_{i} (j = 1, \dots, p - 1) are triangular fuzzy numbers Y_{i} = (y_{i} - η_{i}^{l}, y_{i}, y_{i} + η_{i}^{r}), X_{ij} = (x_{ij} - ξ_{ij}^{l}, x_{ij}, x_{ij} + ξ_{ij}^{r}), where y_{i}, x_{ij} are the modes, η_{i}^{l}, ξ_{ij}^{l} are the left spreads, and η_{i}^{r}, ξ_{ij}^{r} are the right spreads of Y_{i} and X_{ij}, respectively. We assume that the randomness is expressed by the crisp random vectors Φ_{i} = (θ_{i}^{l}, ϵ_{i}, θ_{i}^{r}) with crisp random variables ϵ_{i}, θ_{i}^{l}, θ_{i}^{r}.

In (4), the operations + and · represent standard addition and scalar multiplication of vectors, respectively.

It is important to note that the component-wise representation of model (4) is expressed through the following crisp models:

\begin{aligned} y_{i} &= β_{0} + β_{1} x_{i1} + \dots + β_{p-1} x_{i,p-1} + ϵ_{i}, \\ η_{i}^{l} &= β_{1} ξ_{i1}^{l} + \dots + β_{p-1} ξ_{i,p-1}^{l} + (ϵ_{i} - θ_{i}^{l}), \\ η_{i}^{r} &= β_{1} ξ_{i1}^{r} + \dots + β_{p-1} ξ_{i,p-1}^{r} + (θ_{i}^{r} - ϵ_{i}), \end{aligned}

(5)

where the terms θ_{i}^{l}, ϵ_{i}, and θ_{i}^{r} in Φ_{i} are subject to the constraints η_{i}^{l} > 0 and η_{i}^{r} > 0, almost surely.

In the least-squares approach, our goal is to determine the estimators for

β = {(β_{0}, \dots, β_{p - 1})}^{t}

that minimize the sum of squared residuals between the n observed values of Y and their corresponding predicted values

\hat{Y}

.

Any vector

{\hat{β}}_{n} = {({\hat{β}}_{0}, \dots, {\hat{β}}_{p - 1})}^{t}

that minimizes

Q (β_{0}, \dots, β_{p - 1}) = \sum_{i = 1}^{n} d_{H}^{2} (Y_{i}, \sum_{j = 0}^{p - 1} β_{j} X_{i j})

is referred to as the fuzzy least-squares estimator of

β

, given the fuzzy data

{(X_{i j}, Y_{i})}

, where

j = 0, 1, \dots, p - 1

and

i = 1, \dots, n

, with

X_{i 0} \equiv 1

and d.

Furthermore, if

ξ_{i 0}^{l} \equiv 0

and

ξ_{i 0}^{r} \equiv 0

, then

\begin{matrix} Q (β_{0}, \dots, β_{p - 1}) = \sum_{i = 1}^{n} {[(3 y_{i} + η_{i}^{r} - η_{i}^{l}) - \sum_{j = 0}^{p - 1} β_{j} (3 x_{i j} + ξ_{i j}^{r} - ξ_{i j}^{l})]}^{2} . \end{matrix}

Denoting

Q = Q (β_{0}, \dots, β_{p - 1})

, we obtain for

k = 0, 1, \dots, p - 1

:

\begin{matrix} \frac{\partial Q}{\partial β_{k}} = - 2 \sum_{i = 1}^{n} [(3 y_{i} + η_{i}^{r} - η_{i}^{l}) - \sum_{j = 0}^{p - 1} β_{j} (3 x_{i j} + ξ_{i j}^{r} - ξ_{i j}^{l})] (3 x_{i k} + ξ_{i k}^{r} - ξ_{i k}^{l}) = 0 . \end{matrix}

Thus, the fuzzy least-squares normal equations for

k = 0, 1, \dots, p - 1

are given by:

\begin{matrix} \sum_{i = 1}^{n} (3 y_{i} + η_{i}^{r} - η_{i}^{l}) (3 x_{i k} + ξ_{i k}^{r} - ξ_{i k}^{l}) = \sum_{i = 1}^{n} [(3 x_{i k} + ξ_{i k}^{r} - ξ_{i k}^{l}) \sum_{j = 0}^{p - 1} {\hat{β}}_{j} (3 x_{i j} + ξ_{i j}^{r} - ξ_{i j}^{l})] . \end{matrix}

(6)

In matrix terms, the normal Equation (6) is as follows:

(3 X + D)^{t} (3 X + D) \hat{β}_{n} = (3 X + D)^{t} (3 Y + η),

(7)

where X = (x_{ij}) is the (n \times p) matrix of known constants x_{ij}, which express the value of the jth independent variable for the ith sample, D = (d_{ij}) is the (n \times p) matrix of the values d_{ij} = ξ_{ij}^{r} - ξ_{ij}^{l}, i.e., the difference between the right spread ξ_{ij}^{r} and the left spread ξ_{ij}^{l} of X_{ij}, Y = (y_{1}, \dots, y_{n})^{t} is the (n \times 1) vector of observations, and η = η^{r} - η^{l}, where η^{r} = (η_{1}^{r}, \dots, η_{n}^{r})^{t} and η^{l} = (η_{1}^{l}, \dots, η_{n}^{l})^{t} are the (n \times 1) vectors of right and left spreads of the response variable Y, respectively.

If rank(3 X + D) = p, then (7) has a unique solution, given by

\hat{β}_{n} = [(3 X + D)^{t} (3 X + D)]^{-1} (3 X + D)^{t} (3 Y + η).

(8)
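The closed form (8) can be sketched numerically as follows. The function name `flse`, the array layout (one row per observation, with column 0 reserved for the intercept, mode 1 and spreads 0), and the noise-free data are all assumptions made for illustration; the data are constructed from the component-wise model (5) with β = (1, 2) and zero errors, so the estimator should recover β exactly.

```python
import numpy as np

def flse(x, xi_l, xi_r, y, eta_l, eta_r):
    """Fuzzy least-squares estimator (8) from modes and spreads.

    x, xi_l, xi_r: (n, p) arrays of input modes, left spreads, right spreads.
    y, eta_l, eta_r: length-n arrays of response modes and spreads.
    """
    Z = 3.0 * np.asarray(x, float) + (np.asarray(xi_r, float) - np.asarray(xi_l, float))   # 3X + D
    w = 3.0 * np.asarray(y, float) + (np.asarray(eta_r, float) - np.asarray(eta_l, float))  # 3Y + eta
    if np.linalg.matrix_rank(Z) < Z.shape[1]:
        raise ValueError("3X + D must have full column rank")
    # Solve the normal equations (7) rather than forming the inverse explicitly.
    return np.linalg.solve(Z.T @ Z, Z.T @ w)

# Hypothetical noise-free triangular data with beta = (1, 2).
x1 = np.array([0.0, 1.0, 2.0, 3.0])
l1 = np.array([0.5, 0.2, 0.4, 0.1])
r1 = np.array([0.1, 0.3, 0.2, 0.6])
x = np.column_stack([np.ones(4), x1])
xi_l = np.column_stack([np.zeros(4), l1])
xi_r = np.column_stack([np.zeros(4), r1])
beta_hat = flse(x, xi_l, xi_r, 1.0 + 2.0 * x1, 2.0 * l1, 2.0 * r1)
```

When all spreads are zero, 3X + D reduces to 3X and 3Y + η to 3Y, so the factor 3 cancels and the function returns the classical LSE.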

Moreover, we obtain from (5) that

\begin{matrix} 3 Y + η = (3 X + D) β + ϵ^{*} \end{matrix}

(9)

where

ϵ^{*} = {(ϵ_{1} + θ_{1}^{l} + θ_{1}^{r}, \dots, ϵ_{n} + θ_{n}^{l} + θ_{n}^{r})}^{t}

.
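For readers who want to see how (9) arises, the following worked step combines the three rows of (5) for a single observation i. It is a sketch that assumes, as in the text, x_{i0} = 1 and ξ_{i0}^{l} = ξ_{i0}^{r} = 0, so that d_{i0} = 0.

```latex
\begin{aligned}
3 y_i + \eta_i^r - \eta_i^l
&= 3\Big(\beta_0 + \sum_{j=1}^{p-1}\beta_j x_{ij} + \epsilon_i\Big)
 + \Big(\sum_{j=1}^{p-1}\beta_j \xi_{ij}^r + \theta_i^r - \epsilon_i\Big)
 - \Big(\sum_{j=1}^{p-1}\beta_j \xi_{ij}^l + \epsilon_i - \theta_i^l\Big) \\
&= 3\beta_0 + \sum_{j=1}^{p-1}\beta_j \,(3 x_{ij} + \xi_{ij}^r - \xi_{ij}^l)
 + \epsilon_i + \theta_i^l + \theta_i^r \\
&= \sum_{j=0}^{p-1}\beta_j \,(3 x_{ij} + d_{ij}) + \epsilon_i^{*}.
\end{aligned}
```

Stacking these n identities gives the matrix form (9), with the i-th component of ϵ^{*} equal to ϵ_{i} + θ_{i}^{l} + θ_{i}^{r}.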

Under the same assumptions, it is worth noting that the renowned Gauss–Markov Theorem asserts that for any fixed p-vector

c \in R^{p}

, the expression

c^{t} {\hat{β}}_{n}

serves as the Best Linear Unbiased Estimator (BLUE) of

c^{t} β

. This designation indicates that

c^{t} {\hat{β}}_{n}

exhibits the minimum variance among all linear unbiased estimators of

c^{t} β

. While this represents a significant optimality characteristic of the FLSE, its practical utility is limited unless we possess some understanding of the associated distribution.

If we assume that the random errors in (9) follow a normal distribution, then the FLSE \hat{β}_{n} is identical to the MLE, and we have

\hat{β}_{n} \sim N_{p}(β, σ^{2} V^{-1}), where V = (3 X + D)^{t} (3 X + D).

Moreover, utilizing standard theoretical results, one can construct confidence intervals and perform hypothesis tests for the fuzzy regression parameters, as well as derive prediction intervals for new observations, based on the known values of the fuzzy regression variables.

Nonetheless, the Gaussian assumption proves overly restrictive and often challenging to validate in practical applications, underscoring the need for more suitable alternatives. This is where the large sample methods can help, and Section 4 discusses the asymptotic properties of the FLSE

{\hat{β}}_{n}

.

4. Asymptotic Consistency

This section examines the consistency of the FLSE, which is among the large sample properties of estimators. The weak consistency of the FLSE follows directly from Theorem 2, which establishes its asymptotic normality.

Theorem 2.

Consider the model (4) and suppose that Assumptions A and B hold. Then, the FLSE \hat{β}_{n}, as defined in (8), is weakly consistent for β, meaning that

\hat{β}_{n} \overset{P}{⟶} β,

where the notation \overset{P}{⟶} denotes convergence in probability.

Proof.

Since \sqrt{n} (\hat{β}_{n} - β) converges in distribution to a non-degenerate random variable, it follows that \sqrt{n} (\hat{β}_{n} - β) = O_{P}(1), where O_{P}(1) signifies boundedness in probability. Hence \hat{β}_{n} - β = O_{P}(n^{-1/2}), so each component \hat{β}_{j} (j = 0, 1, \dots, p - 1) is weakly consistent, leading to \hat{β}_{n} - β \overset{P}{⟶} 0. □

This section also presents an important theorem that provides sufficient conditions for the strong consistency of FLSEs.

Additionally, this theorem elucidates that for the property of strong consistency, asymptotic normality is unnecessary. Consequently, certain assumed regularity conditions stipulated in Theorem 2 could be relaxed or adjusted.

Theorem 3.

For the model (4), assume that Assumptions A and B hold. Furthermore, assume that s_{j}^{2} = \sum_{i=1}^{n} (3 x_{ij} + d_{ij})^{2} \to \infty (j = 0, 1, \dots, p - 1) as n \to \infty, and that the sequence of matrices [(3 X + D)^{t} (3 X + D)]^{-1} diag(s_{0}^{2}, \dots, s_{p-1}^{2}) is bounded. Then, the FLSE \hat{β}_{n} defined in (8) is strongly consistent for β, that is,

\hat{β}_{n} \overset{a.s.}{⟶} β.

Proof.

From (8) and (9), we may write

\begin{aligned} \hat{β}_{n} &= [(3 X + D)^{t} (3 X + D)]^{-1} (3 X + D)^{t} (3 Y + η) \\ &= [(3 X + D)^{t} (3 X + D)]^{-1} (3 X + D)^{t} [(3 X + D) β + ϵ^{*}], \end{aligned}

so that

\hat{β}_{n} - β = [(3 X + D)^{t} (3 X + D)]^{-1} (3 X + D)^{t} ϵ^{*},

(10)

where ϵ^{*} = (ϵ_{1} + θ_{1}^{l} + θ_{1}^{r}, \dots, ϵ_{n} + θ_{n}^{l} + θ_{n}^{r})^{t} as before. Under the assumption s_{j}^{2} \to \infty for all j, it can be proved by Theorem 1 that diag(1 / s_{0}^{2}, \dots, 1 / s_{p-1}^{2}) (3 X + D)^{t} ϵ^{*} \to 0 almost surely. Writing the right-hand side of (10) as the product of [(3 X + D)^{t} (3 X + D)]^{-1} diag(s_{0}^{2}, \dots, s_{p-1}^{2}) and diag(1 / s_{0}^{2}, \dots, 1 / s_{p-1}^{2}) (3 X + D)^{t} ϵ^{*}, the conclusion of the theorem follows directly from the assumed boundedness of the first factor. This completes the proof. □
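The consistency result can be illustrated, though of course not proved, by a small simulation. The sketch below is built on several assumptions that are not in the original: observations are generated directly through the aggregated model (9), which sidesteps the positivity constraints on the response spreads; input modes and spreads are uniform; and ϵ_{i}, θ_{i}^{l}, θ_{i}^{r} are independent standard normal. The function name `flse_max_error` is hypothetical.

```python
import numpy as np

def flse_max_error(n, beta, rng):
    """Largest absolute FLSE error for one simulated sample of size n."""
    beta = np.asarray(beta, dtype=float)
    p = beta.size
    # Input modes and spreads; column 0 is the intercept (mode 1, spreads 0).
    x = np.column_stack([np.ones(n), rng.uniform(0.0, 10.0, size=(n, p - 1))])
    xi_l = np.column_stack([np.zeros(n), rng.uniform(0.1, 1.0, size=(n, p - 1))])
    xi_r = np.column_stack([np.zeros(n), rng.uniform(0.1, 1.0, size=(n, p - 1))])
    Z = 3.0 * x + (xi_r - xi_l)                      # 3X + D
    # Aggregated error eps* = eps + theta^l + theta^r (three standard normals).
    eps_star = rng.normal(size=(n, 3)).sum(axis=1)
    w = Z @ beta + eps_star                          # 3Y + eta, generated via (9)
    beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ w)     # estimator (8)
    return float(np.max(np.abs(beta_hat - beta)))

rng = np.random.default_rng(0)
errors = {n: flse_max_error(n, [1.0, 2.0], rng) for n in (100, 1_000, 100_000)}
```

Under these assumptions the entries of `errors` should shrink roughly at the rate n^{-1/2} as n grows, in line with the large-sample behavior described above.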

5. Confidence Region and Asymptotic Relative Efficiency

In this section, we will present an approximate confidence region for the parameters

β

in model (4), leveraging the large-sample normality of the FLSE. We will also explore the asymptotic relative efficiency (ARE) of the proposed region compared to the standard result obtained from the limiting distribution of the crisp LSE, evaluating the ARE of the two estimators.

5.1. Confidence Region

The asymptotic normality of \sqrt{n} (\hat{β}_{n} - β), as established in Theorem 2 under the regularity conditions, indicates that the pivotal quantity can be expressed as

Q_{n}(\hat{β}_{n}) = \frac{n}{σ_{ϵ^{*}}^{2}} (\hat{β}_{n} - β)^{t} Σ_{n} (\hat{β}_{n} - β),

where Σ_{n} = \frac{1}{n} (3 X + D)^{t} (3 X + D).

The theorem below describes the large sample distribution of Q_{n}(\hat{β}_{n}).

Theorem 4.

Under the assumptions of Theorem 2, Q_{n}(\hat{β}_{n}) converges in distribution to a chi-squared distribution with p degrees of freedom.

As a direct result of Theorem 4, we define C_{1-α}^{*}(β) as the set of \hat{β}_{n} satisfying the condition

(\hat{β}_{n} - β)^{t} Σ_{n} (\hat{β}_{n} - β) \leq δ^{*},

where δ^{*} = \frac{σ_{ϵ^{*}}^{2}}{n} χ_{1-α}^{2}(p) and σ_{ϵ^{*}}^{2} = σ_{ϵ}^{2} + σ_{r}^{2} + σ_{l}^{2}, as previously stated, and χ_{1-α}^{2}(p) denotes the (1 - α)th quantile of the chi-squared distribution with p degrees of freedom. Consequently, for sufficiently large n, C_{1-α}^{*}(β) serves as an approximate 100(1 - α)% confidence region for β.
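The defining inequality of the region can be checked numerically as in the sketch below. The function name `in_confidence_region` and its arguments are hypothetical; the variance σ_{ϵ^{*}}^{2} is treated as known, whereas in practice it would have to be estimated; and the chi-squared quantile is passed in by the caller (for example 5.991 for p = 2 and α = 0.05, or a value from a routine such as `scipy.stats.chi2.ppf`).

```python
import numpy as np

def in_confidence_region(beta, beta_hat, Z, sigma2_star, chi2_quantile):
    """Check (beta_hat - beta)^t Sigma_n (beta_hat - beta) <= delta*.

    Z is the (n, p) matrix 3X + D; sigma2_star is the variance of eps*.
    """
    Z = np.asarray(Z, dtype=float)
    n = Z.shape[0]
    sigma_n = Z.T @ Z / n                        # Sigma_n = (1/n)(3X + D)^t (3X + D)
    diff = np.asarray(beta_hat, dtype=float) - np.asarray(beta, dtype=float)
    delta_star = sigma2_star * chi2_quantile / n
    return bool(diff @ sigma_n @ diff <= delta_star)

# Hypothetical 3X + D matrix with an intercept column equal to 3.
Z = np.array([[3.0, 0.0], [3.0, 3.0], [3.0, 6.0], [3.0, 9.0]])
inside = in_confidence_region([1.0, 2.0], [1.0, 2.0], Z, 1.0, 5.991)
outside = in_confidence_region([1.0, 2.0], [2.0, 2.0], Z, 1.0, 5.991)
```

In the second call the quadratic form equals 9, well above δ^{*} ≈ 1.5, so the pair is rejected.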

It is well established that under certain regularity conditions, the sequence of crisp LSEs, denoted as \breve{β}_{n}, exhibits asymptotic normality in the sense that

\sqrt{n} (\breve{β}_{n} - β) \overset{L}{⟶} N(0, σ_{ϵ}^{2} V^{-1}),

where σ_{ϵ}^{2} represents the variance of errors ϵ_{i} in the model (1), and V is given by the limit V_{n} = \frac{1}{n} (X^{t} X) \to V as n \to \infty. Accordingly, an approximate 100(1 - α)% confidence region based on the LSE, denoted as C_{1-α}(β), is defined as the set of \breve{β}_{n} for which

(\breve{β}_{n} - β)^{t} V_{n} (\breve{β}_{n} - β) \leq δ,

where δ = \frac{σ_{ϵ}^{2}}{n} χ_{1-α}^{2}(p) and σ_{ϵ}^{2} = Var[ϵ]. Consequently, for large n, C_{1-α}(β) provides an approximate 100(1 - α)% confidence region for β.

In particular, if the designed input data satisfy the condition where

{ξ_{i}^{l}}

and

{ξ_{i}^{r}}

are chosen such that

ξ_{i}^{l} = ξ_{i}^{r}

, for

i = 1, \dots, n

, meaning that

X_{i}

consists of symmetric triangular fuzzy input data, then

Σ_{x d} = Σ_{d d} = 0

in (B2), leading to

Σ = 9 Σ_{x x} + 6 Σ_{x d} + Σ_{d d} = 9 Σ_{x x} .

5.2. Asymptotic Relative Efficiency

Now, let us compare the sequences of the FLSE

{\hat{β}}_{n}

and the classical crisp LSE

{\overset{˘}{β}}_{n}

.

Following Serfling [17], p. 141, we measure the Asymptotic Relative Efficiency (ARE) of \{\hat{β}_{n}\} with respect to \{\breve{β}_{n}\} by the ratio of their generalized limiting variances, denoted as e_{(F, C)}; a value greater than 1 indicates a narrower asymptotic confidence region for the FLSE. In the symmetric case, where Σ = 9 Σ_{xx}, we obtain

e_{(F, C)} = (\frac{| σ_{ϵ}^{2} V^{-1} |}{| σ_{ϵ^{*}}^{2} Σ^{-1} |})^{\frac{1}{p}} = (\frac{9^{p} σ_{ϵ}^{2p} | Σ_{xx}^{-1} |}{σ_{ϵ^{*}}^{2p} | Σ_{xx}^{-1} |})^{\frac{1}{p}} = \frac{9 σ_{ϵ}^{2}}{σ_{ϵ}^{2} + σ_{r}^{2} + σ_{l}^{2}}.

(11)

If the ARE of the FLSE

{\hat{β}}_{n}

with respect to the crisp LSE

{\overset{˘}{β}}_{n}

exceeds 1, then we can conclude that

{\hat{β}}_{n}

is more efficient than

{\overset{˘}{β}}_{n}

.

In (11), as a particular case, if

σ_{ϵ}^{2} = σ_{r}^{2} = σ_{l}^{2}

, then

e_{(F, C)} = 3

.
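The closed form in (11) is simple enough to evaluate directly. The helper below is a sketch with a hypothetical name; it covers only the symmetric triangular input case to which (11) applies, and takes the three error variances as given.

```python
def asymptotic_relative_efficiency(var_eps, var_l, var_r):
    """ARE of the FLSE relative to the crisp LSE: 9 var_eps / (var_eps + var_r + var_l)."""
    if var_eps <= 0 or var_l < 0 or var_r < 0:
        raise ValueError("var_eps must be positive and the other variances non-negative")
    return 9.0 * var_eps / (var_eps + var_r + var_l)

equal_case = asymptotic_relative_efficiency(1.0, 1.0, 1.0)  # all three variances equal
break_even = asymptotic_relative_efficiency(1.0, 4.0, 4.0)  # spreads much noisier than the mode
```

The value exceeds 1 whenever σ_{r}^{2} + σ_{l}^{2} < 8 σ_{ϵ}^{2}; when the spread errors are noisier than that, the formula no longer favors the FLSE.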

This result suggests that, in the linear regression model with random error terms, the fuzzy least-squares estimator (FLSE) derived from symmetric triangular fuzzy data is asymptotically more efficient than the least-squares estimator (LSE) relying on crisp data. Hence, for sufficiently large n, the FLSE possesses a smaller asymptotic confidence region than the crisp LSE.

6. Conclusions

Within the domain of data analysis, numerous scenarios arise where data exhibit stochasticity and fuzziness due to uncertain information or linguistic subtleties. Navigating through such ambiguous data necessitates the application of fuzzy theory, which emerges as a promising approach. Specifically, many fuzzy regression models have been advanced to investigate causal relationships within such datasets. However, beyond mere estimation lies the imperative task of delving into the mathematical properties of fuzzy regression estimates. The significance of optimal properties, particularly in the context of large samples, holds paramount importance within statistical frameworks. Nevertheless, a notable gap persists in the scholarly discourse concerning these pivotal subjects.

In this context, the present study examines a fuzzy regression model with multiple fuzzy input and fuzzy output variables and random error terms, and establishes optimal properties such as consistency, asymptotic normality, and confidence regions with relative efficiency. Our investigation introduces a streamlined formulation for the fuzzy least-squares estimator and examines its foundational properties. Our findings indicate that the proposed estimator satisfies the Best Linear Unbiased Estimator (BLUE) criterion. Furthermore, the asymptotic normality obtained under broad assumptions lays the groundwork for devising statistical tests and confidence intervals, which are crucial for both model validation and forecasting. These directions merit further exploration in subsequent research. The asymptotic relative efficiency shows that the analysis of this paper using triangular fuzzy numbers is valid.

In addition, an important challenge for future research is to establish analogous results for broader and more intricate fuzzy regression models based on trapezoidal fuzzy numbers, LR-fuzzy numbers, and similar constructs.

Author Contributions

J.H.Y. developed the conceptualization, proved the theorems, performed the data analysis, and drafted the manuscript. Investigation and reviewing were performed by S.H.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2024-00351610).

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Jung, H.-Y.; Choi, S.H.; Lee, W.-J.; Yoon, J.H. A unified approach of asymptotic behaviors for the autoregressive model with fuzzy data. Inf. Sci. 2014, 257, 127–137.
Lee, W.-J.; Jung, H.Y.; Yoon, J.H.; Choi, S.H. Analysis of variance for fuzzy data based on permutation method. Int. J. Fuzzy Log. Intell. Syst. 2017, 17, 43–50.
Tanaka, H.; Uejima, S.; Asai, K. Linear regression analysis with fuzzy model. IEEE Trans. Syst. Man Cybern. 1982, 12, 903–907.
Tanaka, H.; Hayashi, I.; Watada, J. Possibilistic linear regression analysis based on possibility measure. In Proceedings of the Second IFSA World Congress, Tokyo, Japan, 20–25 July 1987; pp. 317–320.
Tanaka, H. Fuzzy data analysis by possibilistic linear models. Fuzzy Sets Syst. 1987, 24, 363–375.
Yoon, J.H.; Choi, S.H. Componentwise fuzzy linear regression using least squares estimation. J. Mult.-Valued Log. Soft Comput. 2009, 55, 137–153.
Jung, H.-Y.; Yoon, J.H.; Choi, S.H. Fuzzy linear regression using rank transform method. Fuzzy Sets Syst. 2015, 274, 97–108.
Choi, S.H.; Yoon, J.H. General fuzzy regression using least squares method. Int. J. Syst. Sci. 2010, 4, 477–485.
Yoon, J.H.; Choi, S.H. Equivalence in alpha-level linear regression. Commun. Korean Stat. Soc. 2010, 17, 611–624.
Lee, W.-J.; Jung, H.Y.; Yoon, J.H.; Choi, S.H. The statistical inferences of fuzzy regression based on bootstrap techniques. Soft Comput. 2015, 19, 883–890.
Namdari, M.; Yoon, J.H.; Abadi, A.; Taheri, S.M.; Choi, S.H. Fuzzy logistic regression with least absolute deviations estimators. Soft Comput. 2015, 19, 909–917.
Sohn, S.Y.; Kim, D.H.; Yoon, J.H. Technology credit scoring model with fuzzy logistic regression. Appl. Soft Comput. 2016, 43, 150–158.
Yoon, J.H. Fuzzy mediation analysis. Int. J. Fuzzy Syst. 2020, 22, 338–349.
Yoon, J.H. Fuzzy moderation and moderated mediation analysis. Int. J. Fuzzy Syst. 2020, 22, 1948–1960.
Yoon, J.H. Novel Fuzzy Correlation Coefficient and Variable Selection Method for Fuzzy Regression Analysis based on Distance Approach. Int. J. Fuzzy Syst. 2023, 25, 2969–2985.
Bas, E. Robust fuzzy regression functions approaches. Inf. Sci. 2022, 613, 419–434.
Bas, E.; Egrioglu, E. Robust Picture Fuzzy Regression Functions Approach Based on M-Estimators for the Forecasting Problem. Comput. Econ. 2024.
Shi, Z.; Wu, D.; Guo, C.; Zhao, C.; Cui, Y.; Wang, F.-Y. FCM-RDpA: TSK fuzzy regression model construction using fuzzy C-means clustering, regularization, Droprule, and Powerball Adabelief. Inf. Sci. 2021, 574, 490–504.
Mei, Z.; Zhao, T.; Xie, X. Hierarchical fuzzy regression tree: A new gradient boosting approach to design a TSK fuzzy model. Inf. Sci. 2024, 652, 119740.
Celminš, A. Least-squares model fitting to fuzzy vector data. Fuzzy Sets Syst. 1987, 22, 245–269.
Diamond, P. Fuzzy least-squares. Inf. Sci. 1988, 46, 141–157.
Diamond, P.; Körner, R. Extended fuzzy linear models and least-squares estimates. Comput. Math. Appl. 1997, 33, 15–32.
Kao, C.; Chyu, C. A fuzzy linear regression model with better explanatory power. Fuzzy Sets Syst. 2002, 126, 401–409.
Kao, C.; Chyu, C. Least-squares estimates in fuzzy regression analysis. Eur. J. Oper. Res. 2003, 148, 426–435.
Kim, H.K.; Yoon, J.H.; Li, Y. Asymptotic properties of least squares estimation with fuzzy observations. Inf. Sci. 2008, 178, 439–451.
Yoon, J.H.; Choi, S.H. Fuzzy least squares estimation with new fuzzy operations. Adv. Intell. Syst. Comput. 2013, 8, 193–202.
Yoon, J.H.; Choi, S.H.; Grzegorzewski, P. On Asymptotic Properties of the Multiple Fuzzy Least Squares Estimator. Adv. Intell. Syst. Comput. 2017, 456, 525–532.
Yoon, J.H.; Grzegorzewski, P. On optimal and asymptotic properties of a fuzzy L2 estimator. Mathematics 2020, 8, 1956.
Croci, M.; Willcox, K.E.; Wright, S.J. Multi-output multilevel best linear unbiased estimators via semidefinite programming. Comput. Methods Appl. Mech. Eng. 2023, 413, 116130.
Song, Y.; Dhariwal, P.; Chen, M.; Sutskever, I. Consistency models. arXiv 2023, arXiv:2303.01469.
Young, A. Consistency without inference: Instrumental variables in practical application. Eur. Econ. Rev. 2022, 147, 104112.
Körner, R.; Näther, W. Linear regression with random fuzzy variables: Extended classical estimates, best linear estimates, least-squares estimates. Inf. Sci. 1998, 109, 95–118.
Näther, W. Linear statistical inference for random fuzzy data. Statistics 1997, 29, 221–240.
Näther, W. On random fuzzy variables of second order and their application to linear statistical inference with fuzzy data. Metrika 2000, 51, 201–221.
Drygas, H. Weak and strong consistency of the least square estimates in regression models. Z. Wahrscheinlichkeitstheorie Verwandte Geb. 1976, 34, 119–127.
An, H.; Hickernell, F.J.; Zhu, L. A new class of consistent estimators for stochastic linear regressive models. J. Multivar. Anal. 1997, 63, 242–258.
Baran, S. A new consistent estimator for linear errors-in-variables models. Comput. Math. Appl. 2001, 41, 821–833.
Lai, T.L.; Robbins, H.; Wei, C.Z. Strong consistency of least-squares estimators in multiple regression. Proc. Natl. Acad. Sci. USA 1978, 75, 3034–3036.
Ming, M.; Friedman, M.; Kandel, A. General fuzzy least squares. Fuzzy Sets Syst. 1997, 88, 107–118.
Zadeh, L.A. Fuzzy sets. Inf. Control 1965, 8, 338–353.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
