1. Introduction
Data science has been the subject of numerous studies. Datasets play an important role in several areas of knowledge, since information can be extracted from them. This information can be used, for example, in decision making, product improvement, process automation, and trend forecasting [1,2,3].
A number of methods and algorithms have been developed in the literature to extract information from datasets through mathematical and computational methods. In general, these algorithms were developed to model datasets collected from a single source. Few algorithms have been formulated for a data fusion scenario, that is, a scenario where data come from different sources [4].
The least-squares method (LSM) is a widely used technique for data modeling based on the minimization of a quadratic function [4,5,6,7,8,9]. LSM was initially conceived for modeling data from a single source. In [4], an LSM for data fusion (LSM-DF) was developed, that is, a method that considers data from different sources. LSM-DF was designed for weighted data fusion.
From a mathematical point of view, the LSM-DF is based on a weighted average of the lengths of the residual vectors of the equations $b_i = A_i x + v_i$, with $i = 1, \ldots, L$, expressed by $\sum_{i=1}^{L} \|b_i - A_i x\|^2_{W_i}$, where $W_i$ are the weights, that is, an aggregation of $L$ values with their corresponding weightings. Here, an interesting question arises: is weighted averaging the best method for aggregating the data in all scenarios? Within this context, the study of different aggregation methods has recently gained prominence.
Aggregation operators constitute a subarea of fuzzy theory concerned with combining finite datasets of the same nature into a single dataset [1,2,6,7,10,11,12,13,14,15,16,17,18,19]. These operators are basically classified into three categories: mean, conjunctive, and disjunctive. Applications of these operators can be found in medical problems, image processing, decision making, and engineering problems.
In LSM-DF, the weights are directly related to the lengths of the residual vectors. However, in some situations, it would be useful to allocate the weights dynamically, putting more weight on the more important values. Thus, aggregation operators can be considered a viable alternative for changing the behavior of LSM-DF.
This study seeks to optimally combine the least-squares method and aggregation operators of the average type, more specifically, the ordered weighted averaging (OWA) [3,20,21,22], Choquet integral [23,24], and mixture [25,26] operators. Furthermore, the aim of this study is to formulate and solve appropriate least-squares methods to model finite collections of datasets of the same nature. An important goal of these algorithms is to generate optimal estimates that aggregate data from different sources. This is necessary for situations that involve systems that can operate under different failure conditions. A numerical example is presented to show the effectiveness of the proposed algorithm.
This paper is organized as follows: Section 2 presents preliminary results on an admissible order for matrices, aggregation operators, and LSM. In Section 3, the LSM-DF via aggregation operators is derived. In Section 4, a numerical example is shown.
2. Preliminaries
This section addresses topics that form the theoretical basis for the development of LSM-DF via aggregation operators. Initially, the admissible order for matrices is discussed, followed by the aggregation operators of the average type and the classical least-squares method.
2.1. Admissible Order for Matrices
In this section, we present the concept of admissible order for matrices based on [2,16,27]. This is a special way to consider total orders on the set of all matrices of order $m \times n$ with entries in $\mathbb{R}$ (the set of real numbers), denoted by $M_{m \times n}(\mathbb{R})$.
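Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be real matrices of order $m \times n$. It is clear that the relation $\le$ given by
$$A \le B \iff a_{ij} \le b_{ij} \ \text{for all } i, j$$
is a partial order on the set of these matrices.
Considering a matrix $A$ as a vector of columns, i.e., $A = [A_1, \ldots, A_n]$, where $A_j$ are the columns of $A$ ($j = 1, \ldots, n$), then $\le$ can be defined as
$$A \le B \iff A_j \le B_j \ \text{for all } j = 1, \ldots, n.$$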
One can extend that partial order for a total order by considering the concept of admissible order as follows.
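Definition 1. A total order $\preceq$ on the set of real matrices of order $m \times n$ is admissible if, for each pair of matrices $A$ and $B$ in this set, we have that $A \preceq B$ whenever $A \le B$.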
Example 1. Let A and B be column matrices on and the projection on the i-th line of A. Then, is an admissible order. Therefore, one can generalize an admissible order on by considering the following definition: Let such that and . Then is an admissible order on .
2.2. Aggregation Operators
Aggregation operators are numeric operators that combine multiple input values into a single output value. In this data fusion process, operators aggregate data from different sources to obtain a single unit of data from the conducted analysis. Next, the operators used in this study are presented: OWA, Choquet integral, and mixture operators.
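Definition 2 ([12]). (OWA operator) Given an $n$-dimensional weight vector, that is, $w = (w_1, \ldots, w_n) \in [0,1]^n$ with $\sum_{i=1}^{n} w_i = 1$, the function $OWA_w$ is defined by
$$OWA_w(x_1, \ldots, x_n) = \sum_{i=1}^{n} w_i\, x_{(i)},$$
where $(x_{(1)}, \ldots, x_{(n)})$ is the descending ordering of the vector $(x_1, \ldots, x_n)$, and is named an ordered weighted average function.
Example 2. Defining the vector of weights, where and , for some fixed . So is the so-called static OWA operator.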
Remark 1. As one can see in Definition 2, the weights in the OWA aggregation sum to 1 ($\sum_{i=1}^{n} w_i = 1$). If the weights are matrices, the sum is given by where is the norm of the matrices given by
Remark 2. The entries in the OWA aggregation must be sorted; if the entries are matrices, an ordering relation must be used over the set of matrices. So, we can consider an admissible order on this set, as defined in Definition 1.
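The OWA operator of Definition 2 can be sketched numerically for scalar inputs as follows. This is a minimal illustration, not the authors' implementation: the function name `owa` and the sample values are assumptions introduced here, and matrix-valued entries (which would require an admissible order, as noted in Remark 2) are not covered.

```python
import numpy as np

def owa(values, weights):
    """Ordered weighted average of scalar inputs.

    The weights are attached to positions in the descending ordering
    of the inputs, not to the inputs themselves.
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if values.shape != weights.shape:
        raise ValueError("values and weights must have the same length")
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = np.sort(values)[::-1]  # descending ordering x_(1) >= ... >= x_(n)
    return float(ordered @ weights)

# Hypothetical inputs: the largest value (9) receives weight 0.5,
# the second largest (5) weight 0.3, the smallest (3) weight 0.2.
print(owa([3, 9, 5], [0.5, 0.3, 0.2]))
```

With weights (1, 0, ..., 0) the function returns the maximum of the inputs, and with (0, ..., 0, 1) the minimum, which is a quick sanity check on the ordering convention.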
The next definition introduces the discrete fuzzy measure, which is needed to define the Choquet integral operator.
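Definition 3 ([15]). A discrete fuzzy measure is a function $\mu: 2^{N} \to [0,1]$, where $N = \{1, \ldots, n\}$ and $2^{N}$ is the power set of $N$, such that: (i) $\mu(\emptyset) = 0$ and $\mu(N) = 1$; (ii) $\mu(A) \le \mu(B)$ whenever $A \subseteq B \subseteq N$.
Definition 4 ([10]). (Choquet integral operator) Let $\mu$ be a discrete fuzzy measure. The discrete Choquet integral related to the measure $\mu$ is the function $C_\mu$ defined by:
$$C_\mu(x) = \sum_{i=1}^{n} x_{(i)} \left[ \mu(\{j : x_j \ge x_{(i)}\}) - \mu(\{j : x_j \ge x_{(i+1)}\}) \right],$$
where $(x_{(1)}, \ldots, x_{(n)})$ is an ascending ordering of the vector $x$ and $x_{(n+1)} = \infty$ by convention. The Choquet integral operator can also be calculated with the following simplified expression:
$$C_\mu(x) = \sum_{i=1}^{n} \left( x_{(i)} - x_{(i-1)} \right) \mu(H_i),$$
where $x_{(0)} = 0$ and $H_i = \{(i), \ldots, (n)\}$ is the set of indices of the $n - i + 1$ largest components of $x$.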
Example 3. Considering fuzzy discrete measure Thus, the following Choquet integral can be defined by: and for the other values of i; therefore, the result is .
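The simplified expression of the discrete Choquet integral, $\sum_i (x_{(i)} - x_{(i-1)})\,\mu(H_i)$ with $x_{(0)} = 0$, can be sketched as follows. This is an illustration under stated assumptions: the fuzzy measure is stored as a dictionary keyed by `frozenset` of 0-based indices, and the cardinality-based measure $\mu(A) = |A|/n$ is chosen here only as a simple test case, not taken from the example above.

```python
from itertools import combinations

def cardinality_measure(n):
    """Symmetric fuzzy measure mu(A) = |A| / n on subsets of {0, ..., n-1}.

    Illustrative choice: it satisfies mu(empty) = 0, mu(N) = 1 and monotonicity.
    """
    return {
        frozenset(subset): size / n
        for size in range(n + 1)
        for subset in combinations(range(n), size)
    }

def choquet(values, mu):
    """Discrete Choquet integral via sum of (x_(i) - x_(i-1)) * mu(H_i)."""
    order = sorted(range(len(values)), key=lambda i: values[i])  # ascending indices
    total, previous = 0.0, 0.0  # previous plays the role of x_(0) = 0
    for position, index in enumerate(order):
        h_i = frozenset(order[position:])  # indices of the largest remaining components
        total += (values[index] - previous) * mu[h_i]
        previous = values[index]
    return total

mu = cardinality_measure(3)
# With the cardinality-based measure the integral coincides with the arithmetic mean.
print(choquet([3.0, 9.0, 5.0], mu))
```

For an additive measure built from weights $p_i$, the same function returns the weighted arithmetic mean $\sum_i p_i x_i$, which gives a second check of the implementation.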
Definition 5 ([15]). (Mixture Operator) Let $w_1, \ldots, w_n$ be functions called weight functions. The function defined by
$$MIX(x_1, \ldots, x_n) = \frac{\sum_{i=1}^{n} w_i(x_i)\, x_i}{\sum_{i=1}^{n} w_i(x_i)}$$
is called the mixture function associated with the weight functions $w_1, \ldots, w_n$.
Example 4. For simplicity, consider . In this case, considering that is the mixture function determined by the weights defined above.
2.3. Least-Squares Method
LSM is a widely known and applied mathematical optimization method used to solve several problems, including parameter estimation. The method consists of finding an optimal solution by minimizing the squared length of a residual vector.
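As a minimal numerical sketch of this idea, the snippet below minimizes the squared residual length by solving the normal equations $A^T A x = A^T b$. It assumes that $A$ has full column rank; the function name `least_squares` and the sample data are hypothetical and serve only as an illustration.

```python
import numpy as np

def least_squares(A, b):
    """Return the minimizer of ||b - A x||^2 and the minimal cost.

    Assumes A has full column rank, so that A^T A is invertible.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    if np.linalg.matrix_rank(A) < A.shape[1]:
        raise ValueError("A must have full column rank")
    x_hat = np.linalg.solve(A.T @ A, A.T @ b)  # normal equations
    residual = b - A @ x_hat
    return x_hat, float(residual @ residual)

# Hypothetical data lying exactly on the line y = 1 + 2 t for t = 0, 1, 2.
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 3.0, 5.0])
x_hat, cost = least_squares(A, b)
print(x_hat, cost)  # intercept and slope, and a residual cost close to zero
```

For ill-conditioned problems, a QR- or SVD-based solver such as `numpy.linalg.lstsq` is usually preferred over forming $A^T A$ explicitly; the normal equations are used here because they mirror the closed-form solution.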
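Consider the equation
$$b = Ax + v,$$
where $x \in \mathbb{R}^n$ is an unknown vector, $A \in \mathbb{R}^{m \times n}$ is a known parameter matrix, $b \in \mathbb{R}^m$ is a known vector, and $v \in \mathbb{R}^m$ is a vector named residual.
The least-squares problem is to find a solution $\hat{x}$ that minimizes the length of the residual vector, that is, satisfying the following property:
$$\|b - A\hat{x}\|^2 \le \|b - Ax\|^2$$
for all $x \in \mathbb{R}^n$. The $\|\cdot\|^2$ denotes the square of the Euclidean norm, $\|v\|^2 = v^T v$.
Therefore, the solution to the least-squares problem consists of solving the optimization problem
$$\min_{x} J(x), \tag{9}$$
where the cost functional $J(x)$ is given by
$$J(x) = \|b - Ax\|^2.$$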
Theorem 1 ([4]). (Least-Squares Method) If matrix $A$ has full rank, then there is a unique optimal solution to least-squares Problem (9), given by
$$\hat{x} = (A^T A)^{-1} A^T b.$$
Moreover, the resulting minimal value of the cost function can be written as
$$J(\hat{x}) = b^T \left[ I - A (A^T A)^{-1} A^T \right] b.$$
3. LSM-DF via Aggregation Operators
In this section, LSM-DF is developed via aggregation operators: LSM-DF via an OWA operator, LSM-DF via a Choquet integral operator, and LSM-DF via a mixture operator. These LSM-DFs are an alternative for estimation problems involving several data sources.
The next result is needed for the proofs of the LSM-DF via aggregation operators.
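Lemma 1. If matrices $A_i$ have full rank and matrices $W_i$ are symmetric positive-definite, with $i = 1, \ldots, L$, then $\sum_{i=1}^{L} A_i^T W_i A_i$ is nonsingular.
Proof. Suppose that $\sum_{i=1}^{L} A_i^T W_i A_i$ is singular; then, there must exist a nonzero vector $p$ such that $\sum_{i=1}^{L} A_i^T W_i A_i\, p = 0$, which implies that $p^T \left( \sum_{i=1}^{L} A_i^T W_i A_i \right) p = 0$, i.e.,
$$\sum_{i=1}^{L} p^T A_i^T W_i A_i\, p = 0. \tag{15}$$
(15) can be rewritten as
$$\sum_{i=1}^{L} \|A_i p\|^2_{W_i} = 0, \tag{16}$$
where $\|\cdot\|^2_{W}$ denotes the square of the weighted Euclidean norm, $\|v\|^2_{W} = v^T W v$. As matrices $W_i$ are symmetric positive-definite, it follows from (16) that $\|A_i p\|^2_{W_i} = 0$, so that $A_i p = 0$ with $p \ne 0$. This, in turn, means that the columns of $A_i$ are linearly dependent. Hence, $A_i$ is not full-rank, which contradicts the hypothesis. □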
3.1. LSM-DF via OWA Operator
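For the deduction of the LSM-DF via the OWA operator, the following equations should be considered:
$$b_i = A_i x + v_i, \quad i = 1, \ldots, L,$$
where $x \in \mathbb{R}^n$ is an unknown vector, $A_i$ are known parameter matrices, $b_i$ are known vectors, and $v_i$ are vectors named residuals.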
A solution $\hat{x}$ to the least-squares problem via the OWA operator must minimize the length of the residual vector, that is, it must satisfy the following property:
for all $x \in \mathbb{R}^n$, where $W_i$ are positive-definite symmetric matrices.
The optimal solution $\hat{x}$ is found by solving the following minimization problem:
Functional $J$ can be defined as
where $W_i$ are weight matrices and
Therefore, by the definition of the OWA operator, Function (21) can be rewritten as
The next theorem gives the solution to the least-squares problem via the OWA operator in (20).
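Theorem 2. (LSM-DF via OWA Operator) If matrices $A_{(i)}$ with $i = 1, \ldots, L$ have full rank and $W_i$ are symmetric positive-definite matrices, then there is a unique optimal solution $\hat{x}$ to the least-squares problem via OWA operator (LSM-DF via OWA operator) that is given by:
$$\hat{x} = \left( \sum_{i=1}^{L} A_{(i)}^T W_i A_{(i)} \right)^{-1} \sum_{i=1}^{L} A_{(i)}^T W_i b_{(i)}.$$
The corresponding minimal value of $J$ is
$$J(\hat{x}) = \sum_{i=1}^{L} b_{(i)}^T W_i b_{(i)} - \left( \sum_{i=1}^{L} b_{(i)}^T W_i A_{(i)} \right) \left( \sum_{i=1}^{L} A_{(i)}^T W_i A_{(i)} \right)^{-1} \left( \sum_{i=1}^{L} A_{(i)}^T W_i b_{(i)} \right).$$
Proof. Consider the cost function
$$J(x) = \sum_{i=1}^{L} \|b_{(i)} - A_{(i)} x\|^2_{W_i}, \quad \|v\|^2_{W} = v^T W v,$$
that can be rewritten in matrix form as
$$J(x) = \|\mathcal{B} - \mathcal{A} x\|^2_{\mathcal{W}}, \tag{30}$$
where
$$\mathcal{A} = \begin{bmatrix} A_{(1)} \\ \vdots \\ A_{(L)} \end{bmatrix}, \quad \mathcal{B} = \begin{bmatrix} b_{(1)} \\ \vdots \\ b_{(L)} \end{bmatrix}, \quad \mathcal{W} = \mathrm{diag}(W_1, \ldots, W_L). \tag{31}$$
Entries $A_{(i)}$ and $b_{(i)}$ are descending orders of $A_i$ and $b_i$, respectively. $\mathcal{W}$ is a block-diagonal positive-definite symmetric matrix with entries $W_i$.
To find the critical point in $x$, $J$ must be differentiated and set equal to zero:
$$\frac{\partial J}{\partial x} = -2 \mathcal{A}^T \mathcal{W} (\mathcal{B} - \mathcal{A} x) = 0.$$
Via Lemma 1, matrix $\mathcal{A}^T \mathcal{W} \mathcal{A}$ is invertible. Therefore,
$$\hat{x} = (\mathcal{A}^T \mathcal{W} \mathcal{A})^{-1} \mathcal{A}^T \mathcal{W} \mathcal{B}. \tag{33}$$
Replacing (31) into (33), the solution can be rewritten as
$$\hat{x} = \left( \sum_{i=1}^{L} A_{(i)}^T W_i A_{(i)} \right)^{-1} \sum_{i=1}^{L} A_{(i)}^T W_i b_{(i)}.$$
In fact, since the Hessian matrix $2 \mathcal{A}^T \mathcal{W} \mathcal{A}$ is positive-definite, $J$ in (30) is a strictly convex function; therefore, $\hat{x}$ is the unique global minimum.
The minimal cost $J(\hat{x})$ can be expressed as
$$J(\hat{x}) = \|\mathcal{B} - \mathcal{A} \hat{x}\|^2_{\mathcal{W}}. \tag{36}$$
Replacing (33) into (36) results in
$$J(\hat{x}) = \mathcal{B}^T \left[ \mathcal{W} - \mathcal{W} \mathcal{A} (\mathcal{A}^T \mathcal{W} \mathcal{A})^{-1} \mathcal{A}^T \mathcal{W} \right] \mathcal{B}. \tag{37}$$
Replacing (31) into (37), the optimal cost can be rewritten as
$$J(\hat{x}) = \sum_{i=1}^{L} b_{(i)}^T W_i b_{(i)} - \left( \sum_{i=1}^{L} b_{(i)}^T W_i A_{(i)} \right) \left( \sum_{i=1}^{L} A_{(i)}^T W_i A_{(i)} \right)^{-1} \left( \sum_{i=1}^{L} A_{(i)}^T W_i b_{(i)} \right).$$
□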
Remark 3. Applying in Theorem (2), the LSM-DF via OWA operator reduces to the classical LSM in Theorem (1).
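The fused estimate of this subsection can be sketched numerically under the assumption that the OWA-weighted cost takes the form $\sum_i \|b_{(i)} - A_{(i)} x\|^2_{W_i}$, so that the minimizer solves $\left(\sum_i A_{(i)}^T W_i A_{(i)}\right) x = \sum_i A_{(i)}^T W_i b_{(i)}$. The function name `fused_least_squares` is hypothetical, and the `order` argument is a stand-in for the admissible ordering of the sources, which is supplied by the caller rather than computed here.

```python
import numpy as np

def fused_least_squares(As, bs, Ws, order=None):
    """Weighted least-squares fusion of several data sources.

    order is a permutation of source indices standing in for the
    (descending) admissible ordering; the source placed at position k
    of the ordering receives the weight matrix Ws[k].
    """
    As = [np.asarray(A, dtype=float) for A in As]
    bs = [np.asarray(b, dtype=float) for b in bs]
    Ws = [np.asarray(W, dtype=float) for W in Ws]
    order = list(range(len(As))) if order is None else list(order)
    n = As[0].shape[1]
    gram = np.zeros((n, n))  # accumulates sum of A^T W A
    rhs = np.zeros(n)        # accumulates sum of A^T W b
    for W, k in zip(Ws, order):
        gram += As[k].T @ W @ As[k]
        rhs += As[k].T @ W @ bs[k]
    x_hat = np.linalg.solve(gram, rhs)
    cost = 0.0
    for W, k in zip(Ws, order):
        r = bs[k] - As[k] @ x_hat
        cost += float(r @ W @ r)
    return x_hat, cost

# Hypothetical example: two sources observing the same line y = 1 + 2 t,
# the second one with a constant offset of 0.5.
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
b1 = A @ np.array([1.0, 2.0])
b2 = b1 + 0.5
x_hat, cost = fused_least_squares([A, A], [b1, b2], [0.8 * np.eye(3), 0.2 * np.eye(3)])
print(x_hat, cost)
```

With a single source and an identity weight matrix, the function returns the classical least-squares estimate, which is a convenient consistency check.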
3.2. LSM-DF via Choquet Integral Operator
The deduction of the LSM-DF via the Choquet integral operator follows from the equations
$$b_i = A_i x + v_i, \quad i = 1, \ldots, L,$$
where $x \in \mathbb{R}^n$ is an unknown vector, $A_i$ are known parameter matrices, $b_i$ are known vectors, and $v_i$ are vectors named residuals.
A solution $\hat{x}$ to the least-squares problem via the Choquet integral operator must minimize the length of the residual vector, that is, it must satisfy the following property:
for all $x \in \mathbb{R}^n$, where $W_i$ is the identity matrix multiplied by the discrete fuzzy measure.
The optimal solution $\hat{x}$ is found by solving the following minimization problem:
Functional $J$ can be defined as
where
Therefore, by the definition of the Choquet integral operator, Function (42) can be rewritten as
where $W_i$ is a positive-definite symmetric matrix.
The next theorem gives the solution to the least-squares problem via the Choquet integral operator in (41).
Theorem 3. (LSM-DF via Choquet Integral Operator) If the matrices with have a full rank and are symmetric definite-positive matrices, then there is a single optimal solution for the least-squares problem via Choquet integral operator (LSM-DF via Choquet integral operator) that is given by: The corresponding minimal value of is Proof. Using the matrices, this can be rewritten as
where
Entries , and are ascending orders of , and , respectively. is a diagonal symmetric definite-positive matrix with entries .
On the basis of Function (48) and the solution of the LSM-DF via the OWA operator presented in Theorem 2, the solution to Optimization Problem (41) is given by
which, through Matrices (49), can be rewritten as
Similar to the procedure performed in Theorem 2, the minimal cost $J(\hat{x})$ can be expressed as
Replacing (49) into (52), the optimal cost can be rewritten as
□
Remark 4. is the null matrix and is the null vector by convention.
Remark 5. By applying in Theorem (3), the LSM-DF via Choquet integral operator reduces to the classical LSM in Theorem (1).
3.3. LSM-DF via Mixture Operator
For the deduction of the LSM-DF via the mixture operator, it is necessary to adapt the mixture operator presented in Definition 5.
The weight functions, which are dynamic in the mixture operator, are calculated beforehand and thereby become constant (static) weights. Thus, the adapted mixture operator is calculated in two steps. In the first step, the weights are calculated and fixed. In the next step, the aggregation is carried out. The next definition gives the adapted mixture operator.
Definition 6. (Adapted Mixture Operator) The adapted MIX function can be calculated using the following steps:
Step 1: weight functions with can be calculated and fixed as follows:
Step 2: with the fixed weight functions, the MIX function can be calculated as follows:
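The two-step calculation described above can be sketched for scalar inputs as follows. This is only an illustration: it uses the standard mixture function (weighted sum of the inputs divided by the sum of the weights), and the weight function `v + 1` is a hypothetical choice made here, not the rule used in Step 1 of the definition.

```python
import numpy as np

def mixture(values, weight_fns):
    """Mixture function computed in two steps.

    Step 1 evaluates each weight function at its own input and fixes the
    result; Step 2 aggregates the inputs with these fixed weights.
    """
    values = np.asarray(values, dtype=float)
    if len(weight_fns) != len(values):
        raise ValueError("one weight function per input is required")
    # Step 1: compute and fix the weights.
    weights = np.array([fn(v) for fn, v in zip(weight_fns, values)], dtype=float)
    if np.any(weights < 0) or weights.sum() <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    # Step 2: aggregate with the fixed weights.
    return float(weights @ values / weights.sum())

# Hypothetical weight function: larger inputs receive larger weights.
fns = [lambda v: v + 1.0] * 3
print(mixture([1.0, 2.0, 3.0], fns))  # (2*1 + 3*2 + 4*3) / (2 + 3 + 4)
```

With constant weight functions the result is the arithmetic mean, so any departure from the mean comes entirely from how the weights depend on the inputs.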
Now, the LSM-DF via the mixture operator is deduced. The following equations must be considered:
$$b_i = A_i x + v_i, \quad i = 1, \ldots, L,$$
where $x \in \mathbb{R}^n$ is an unknown vector, $A_i$ are known parameter matrices, $b_i$ are known vectors, and $v_i$ are vectors named residuals.
A solution $\hat{x}$ to the least-squares problem via the mixture operator must minimize the length of the residual vector, that is, it must satisfy the following property:
for all $x \in \mathbb{R}^n$, where $W_i$ is a positive-definite symmetric matrix.
The optimal solution $\hat{x}$ is found by solving the following minimization problem:
Functional $J$ can be defined as
where
By the definition of the mixture operator in (59), the function can be rewritten as
The next theorem gives the solution to the least-squares problem via the mixture operator in (58).
Theorem 4. (LSM-DF via Mixture Operator) If the matrices with have a full rank and are symmetric definite-positive matrices, then there is a single optimal solution to the least-squares problem via the mixture operator (LSM-DF via mixture operator) (58) that is given by: The corresponding minimal value of is Proof. Consider the function
that can be rewritten as
where
where
is a diagonal positive-definite symmetric matrix with entries
.
To find the solution $\hat{x}$ to the optimization problem, $J$ in (65) must be differentiated and set equal to zero. On the basis of Theorem 2, the solution is given by
Through Matrices (66), the solution can be rewritten as
The minimal cost $J(\hat{x})$ can be expressed as
Replacing (69) into (71), the result is
Replacing (66) into (72), the optimal cost can be rewritten as
□
Remark 6. The optimal solution of the LSM-DF via a mixture operator reduces to the LSM-DF in [4].
4. Illustrative Example
In this section, we use datasets artificially created by the authors to illustrate the behavior and effectiveness of the proposed methods, and the relationship between them, when finding the best-fitting curve for a given set of points.
Table 1 shows two simulated datasets about income and consumption.
First, the LSM was separately applied to the datasets, and the following results were found:
The MSEs between
with
and
with
were
and
, respectively. Model (74) was more accurate than Model (75).
Second, the LSM-DF via OWA, Choquet integral, and mixture operators were calculated in the two datasets, and the following weighting matrices were used in the simulation:
and
; more weight was given to
than to
. The following results were found:
The MSEs between
,
and
with
were
,
,
, respectively. The MSEs between
,
and
with
were
,
and 223 respectively.
Table 2 and
Table 3 compare samples with regard to
and
, respectively, of Equations (
76)–(78).
Table 4 compares the samples of
to the samples generated by Equations (
74), (
76)–(78).
Table 5 compares the samples of
with the samples generated with Equations (
76)–(78).
The MSE values show that Models (76)–(78) were more accurate than Model (74); that is, on this example, the LSM-DF via OWA, Choquet integral, and mixture operators outperformed the LSM.
5. Conclusions
In this paper, the LSM-DF was studied through aggregation operators in order to explore different ways to aggregate data. More specifically, the LSM-DF via an OWA operator, the LSM-DF via a Choquet integral operator, and the LSM-DF via a mixture operator were defined. These operators were chosen because of their efficiency when applied to other methods in different areas of knowledge [12,13,22,24,26]. These new methods provide a theoretical framework with variations of the classical least-squares method, which may be more suitable in certain applications. For instance, the LSM-DF via OWA operator could be chosen for situations where one wants to place greater weights on the first data entries.
The main objective of developing these methods is to estimate an optimal parameter for situations involving more than one dataset, and to show how it can be changed for different types of data. The methods were mathematically derived by applying aggregation operators of the average type to the optimization problem. The illustrative example was set up to demonstrate the mathematical behavior of these procedures through curve fitting, in comparison with an approach that does not incorporate the aggregation operators in its formulation.
In future studies, we want to explore some applications that can show the advantages and disadvantages of each method, and set up LSM for other aggregation operators such as a weighted OWA (WOWA) operator and a Sugeno integral operator. Furthermore, these methods will be extended to models subject to parametric uncertainties.