Statistical Parameters Based on Fuzzy Measures

Abstract: In this paper, we study the problem of defining statistical parameters when the uncertainty is expressed using a fuzzy measure. We extend the concept of monotone expectation in order to define a monotone variance and monotone moments. We also study parameters that allow the joint analysis of two functions defined over the same reference set. Finally, we propose some parameters over product spaces, considering the case in which a function over the product space is available and also the case in which such a function is obtained by combining those in the marginal spaces.


Introduction
Fuzzy measures [1], also known as capacities [2], non-additive measures or monotone measures [3], have proven to be a valuable tool for representing uncertainty, since they are able to cope with more general scenarios than probability measures. Even though fuzzy measures have been successfully applied in a wide range of applications [4], no theory analogous to mathematical statistics has emerged around them in the general case, due to the difficulty of defining statistical parameters with a clear interpretation when additivity is replaced by monotonicity.
A remarkable exception is the case of the so-called imprecise probabilities [5,6], characterized by upper and lower expectations that provide rich semantics and interpretability. Dempster-Shafer belief functions [7,8], for instance, can be formulated as special cases of imprecise probabilities.
The field of fuzzy probability and statistics [9][10][11][12][13][14] has received significant attention during the last two decades. The contributions in this field can be classified into two basic groups according to the underlying approach they follow [15]. One group includes the methods that analyze classical (non-fuzzy) data using techniques based on fuzzy set theory, while the other focuses on analyzing fuzzy data using statistical methods. In this context, fuzzy data refers to data whose values are fuzzy numbers [16], each characterized by a membership function that returns a value between 0 and 1 indicating the extent to which a given real number matches the fuzzy number.
Examples within the first group include fuzzy clustering [17], fuzzy linear regression [18], testing fuzzy hypotheses from non-fuzzy data [19], fuzzy statistical quality control [20], time series forecasting based on fuzzy logic [21] and making statistical decisions with fuzzy utilities [22].
In this paper, we are interested in the definition of statistical parameters when the uncertainty is represented by a general fuzzy measure. More precisely, our starting point is a measurable space of the reference set.

Given a permutation σ of the set of indices {1, . . . , n}, we will denote by X σ the ordering of the elements of X according to permutation σ, i.e., X σ = {x σ(1) , . . . , x σ(n) }. When it is clear from the context, we will drop σ from the subscripts and write X σ = {x (1) , . . . , x (n) }.

Definition 7. [2] Let (X, A, µ) be a measurable space, and let h be a measurable real function of X. The Choquet integral of h with respect to µ is (4) where A ∈ A and H α are the α-cuts of h, defined as If the reference set is finite, the integral can be expressed as where X σ is an ordering such that h( where P h is the probability function associated with the ordering X σ induced by h (see Definition 2).

Given two measurable spaces (X 1 , A X 1 , µ 1 ) and (X 2 , A X 2 , µ 2 ), the concept of product fuzzy measure is defined as follows.

Definition 8. [31] A product fuzzy measure of µ 1 and µ 2 is a function µ 12 : A X 1 ×X 2 −→ [0, 1] satisfying:

The next definitions particularize the concept of product fuzzy measure so that it is guaranteed to be compatible with the intuitive idea of independence, in the sense that if two fuzzy measures are independent, it should be possible to obtain their product using exclusively the two original fuzzy measures.

Definition 9. [31] Let (X 1 , A X 1 , µ 1 ) and (X 2 , A X 2 , µ 2 ) be measurable spaces. µ 1 and µ 2 are -independent fuzzy measures if there exists a product fuzzy measure µ 12 such that for any H ∈ R, where H = A × B and is a t-norm. µ 12 is called the -independent product of µ 1 and µ 2 .
Definition 10. [31] Let (X 1 , A X 1 , µ 1 ) and (X 2 , A X 2 , µ 2 ) be measurable spaces. The -exterior product measure for any H ∈ A X 1 ×X 2 is defined as where is a t-norm.

Definition 11. [31] Let (X 1 , A X 1 , µ 1 ) and (X 2 , A X 2 , µ 2 ) be measurable spaces. The -interior product measure for any H ∈ A X 1 ×X 2 is defined as where is a t-norm.

Both measures constitute lower and upper bounds for any -independent product fuzzy measure.

Proposition 1. [31] Let (X 1 , A X 1 , µ 1 ) and (X 2 , A X 2 , µ 2 ) be measurable spaces. Given any -independent product of µ 1 and µ 2 , it holds that for all C ∈ A X 1 ×X 2 ,

Note that, for the particular case of the class R, both measures coincide [31], i.e., for all H ∈ R,

Product fuzzy measures can also be defined in terms of the associated probability measures [31].

Definition 12. [31] Let (X 1 , A X 1 , µ 1 ) and (X 2 , A X 2 , µ 2 ) be measurable spaces and P µ 1 σ 1 and P µ 2 σ 2 be the probability functions associated with X σ 1 1 and X σ 2 2 , respectively. The lower product p-measure is defined as for all C ∈ A X 1 ×X 2 , where ⊗ is the standard probabilistic product, i.e., P

Definition 13. [31] Given the conditions in Definition 12, the upper product p-measure is defined as where ⊗ is the standard probabilistic product.

Parameters over One Measurable Space
In this section we propose statistical parameters aimed at characterizing the behavior of functions defined on a measurable space endowed with a fuzzy measure. We will separately address the case of analyzing a single function and the case of simultaneously analyzing two functions.

The Case of Only One Function
Our proposals rely on the extension of the concept of mathematical expectation associated with probability measures, to the more general case of fuzzy measures. Consider a measurable space (X, A, µ) where µ is a fuzzy measure, and the class P of all the additive measures over X. One way to extend the concept of mathematical expectation [5,35] is based on defining the set M P (µ) = {P ∈ P|P(A) ≥ µ(A), ∀A ∈ A} (15) of all the probability measures that dominate the fuzzy measure µ.
Since all the elements in M P (µ) are additive measures, the expectation of a function h with respect to a fuzzy measure µ can be defined as where E P (h) is the mathematical expectation of h with respect to the probability measure P.
The problem with this definition is that it is not always well defined, since there can exist a fuzzy measure µ for which M P (µ) = ∅. This happens, for instance, when the sum of the values of µ over the singletons of X is greater than 1, since then no probability measure can bound µ from above.
A class of fuzzy measures that are compatible with the definition of expectation in Equation (16) are those that form a lower envelope of a set of probability measures [6], i.e., µ(A) = min{P(A)|P ∈ M ⊆ P}, because in that case M P (µ) ≠ ∅.
A more general definition of expectation, based on the Choquet integral [2], was given in [30] with the aim of extending the probabilistic concept of expectation to non-additive settings.

Definition 14. [30] Let (X, A, µ) be a measurable space and let h be a non-negative, real valued measurable function of X. The monotone expectation of h with respect to the fuzzy measure µ is defined as the Choquet integral of h with respect to µ,

$$E_\mu(h) = \int_X h \, d\mu. \qquad (17)$$

Since a fuzzy measure can always be characterized by a set of probability measures, it is clear from Definition 14 and Equation (7) that the monotone expectation is equal to the mathematical expectation obtained with the probability function associated with the fuzzy measure µ and the ordering induced by the function h (see Definition 2), i.e.,

$$E_\mu(h) = E_{P_{\mu,h}}(h), \qquad (18)$$

where P µ,h denotes the probability function associated with µ and the ordering induced by h. In the particular case of a finite reference set, the monotone expectation can be expressed as

$$E_\mu(h) = \sum_{i=1}^{n} h(x_i)\, P_{\mu,h}(x_i). \qquad (19)$$

The relation between the monotone expectation and the mathematical expectation is also illustrated in Proposition 2.

Proposition 2. [30] Let (X, A, µ) be a measurable space and let {P σ , σ ∈ S n } be the set of all the probability functions associated with the fuzzy measure µ. Then, for any non-negative real valued, measurable function h of X it holds that

$$\min_{\sigma \in S_n} E_{P_\sigma}(h) \le E_\mu(h) \le \max_{\sigma \in S_n} E_{P_\sigma}(h). \qquad (20)$$
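The relation between the monotone expectation, the associated probability function and the bounds in Proposition 2 can be checked numerically on a small finite example. The following Python sketch is only illustrative: the fuzzy measure `MU`, the function `H` and all helper names are hypothetical, and it assumes the usual convention in which the ordering induced by h is increasing and the associated probability of x (i) is µ({x (i) , . . . , x (n) }) − µ({x (i+1) , . . . , x (n) }).

```python
from itertools import permutations

# Hypothetical fuzzy measure on X = {x1, x2, x3}; values are illustrative only.
MU = {
    frozenset(): 0.0,
    frozenset({"x1"}): 0.2,
    frozenset({"x2"}): 0.3,
    frozenset({"x3"}): 0.1,
    frozenset({"x1", "x2"}): 0.6,
    frozenset({"x1", "x3"}): 0.4,
    frozenset({"x2", "x3"}): 0.5,
    frozenset({"x1", "x2", "x3"}): 1.0,
}
H = {"x1": 0.5, "x2": 0.2, "x3": 0.9}  # hypothetical non-negative function


def probability_for_ordering(mu, order):
    """Assumed convention: P(x_(i)) = mu({x_(i..n)}) - mu({x_(i+1..n)})."""
    return {
        x: mu[frozenset(order[i:])] - mu[frozenset(order[i + 1:])]
        for i, x in enumerate(order)
    }


def choquet(mu, h):
    """Finite Choquet integral of a non-negative h, summed over alpha-cuts."""
    order = sorted(h, key=h.get)  # elements by increasing value of h
    total, previous = 0.0, 0.0
    for i, x in enumerate(order):
        # order[i:] is the alpha-cut at level h(x); ties add zero width
        total += (h[x] - previous) * mu[frozenset(order[i:])]
        previous = h[x]
    return total


def expectation(p, h):
    """Ordinary expectation of h under the probability function p."""
    return sum(p[x] * h[x] for x in h)


e_choquet = choquet(MU, H)
p_h = probability_for_ordering(MU, sorted(H, key=H.get))
e_associated = expectation(p_h, H)
# Expectations under the probability functions of all n! orderings.
e_all = [expectation(probability_for_ordering(MU, s), H) for s in permutations(H)]
print(e_choquet, e_associated, min(e_all), max(e_all))
```

In this example both computations of the monotone expectation agree, and the value lies between the smallest and largest expectations obtained over all orderings.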

Monotone Variance
In the same way as the monotone expectation extends in a natural way the concept of mathematical expectation to non-additive measures, we will pursue the extension of other statistical parameters in a similar way.
We will start by considering the extension of the concept of variance to a non-additive context. A direct approach is to define an extension of the variance using the Choquet integral, as in the case of the monotone expectation, which yields

$$E_\mu\big((h - E_\mu(h))^2\big). \qquad (21)$$

However, the definition of variance in Equation (21) is problematic, since the distribution associated with µ and the ordering induced by h is not, in general, the same as the one induced by (h − E µ (h))². The reason is that the functions h and (h − E µ (h))² are not, in general, comonotone, and therefore they may induce different orderings of the reference set. Hence, the monotone variance defined in this way could not be considered as a measure of dispersion with respect to the monotone expectation, as the underlying probability distribution can be different (see Definition 2).
Taking this into account, we propose a definition of monotone variance that preserves the underlying probability measure associated with µ and the ordering induced by h.

Definition 15. Let (X, A, µ) be a measurable space and let h be a non-negative real valued measurable function of X. We define the monotone variance of h with respect to the fuzzy measure µ as

$$\mathrm{Var}_\mu(h) = E_{P_{\mu,h}}\big((h - E_\mu(h))^2\big), \qquad (22)$$

where P µ,h is the probability function associated with µ and the ordering induced by h.
It is clear from the definition that Var µ (h) ≥ 0 and that it is equal to the traditional variance when µ is a probability measure.
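The difference between the monotone variance and the direct Choquet-based extension can be seen on a small example. The sketch below is illustrative only: the measure `MU`, the function `H` and the helper names are hypothetical, and the associated probability function is computed under the assumed convention P(x (i) ) = µ({x (i) , . . . , x (n) }) − µ({x (i+1) , . . . , x (n) }) with h sorted increasingly.

```python
# Hypothetical fuzzy measure on X = {x1, x2, x3}; values are illustrative only.
MU = {
    frozenset(): 0.0,
    frozenset({"x1"}): 0.2,
    frozenset({"x2"}): 0.3,
    frozenset({"x3"}): 0.1,
    frozenset({"x1", "x2"}): 0.6,
    frozenset({"x1", "x3"}): 0.4,
    frozenset({"x2", "x3"}): 0.5,
    frozenset({"x1", "x2", "x3"}): 1.0,
}
H = {"x1": 0.5, "x2": 0.2, "x3": 0.9}


def associated_probability(mu, h):
    """Probability function for mu and the increasing ordering induced by h."""
    order = sorted(h, key=h.get)
    return {
        x: mu[frozenset(order[i:])] - mu[frozenset(order[i + 1:])]
        for i, x in enumerate(order)
    }


def monotone_expectation(mu, h):
    p = associated_probability(mu, h)
    return sum(p[x] * h[x] for x in h)


def monotone_variance(mu, h):
    """Variance of h under the probability function induced by h itself."""
    p = associated_probability(mu, h)
    mean = sum(p[x] * h[x] for x in h)
    return sum(p[x] * (h[x] - mean) ** 2 for x in h)


mean = monotone_expectation(MU, H)
variance = monotone_variance(MU, H)

# Direct extension: monotone expectation of the squared deviation, which
# uses the ordering induced by (h - mean)^2 rather than the one induced by h.
squared_deviation = {x: (H[x] - mean) ** 2 for x in H}
direct_extension = monotone_expectation(MU, squared_deviation)

print(variance, direct_extension)
```

Here the squared deviation orders the reference set as x1, x2, x3, whereas h orders it as x2, x1, x3, so the two quantities are computed under different probability functions and take different values.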

Example 2.
Consider the fuzzy measure over the reference set X = {x 1 , x 2 , x 3 } and its associated probability distributions in Table 1, and the function h defined as h( , the ordering induced by permutation σ = (2, 1, 3), which corresponds to the probability distribution P (2,1,3) . Therefore, according to Equation (22), the monotone variance of h is just the variance of h computed using probability distribution P (2,1,3) , resulting in Var µ (h) = 0.0621. Table 1. A fuzzy measure and the associated probability distributions corresponding to all the possible permutations of the indices (1, 2, 3).
Our definition of monotone variance preserves some properties of the traditional variance, just as the monotone expectation preserves some properties of the mathematical expectation. In particular, the result in Theorem 1 is of practical value as it simplifies the calculation, and it is also of interest because it links the concepts of monotone variance and monotone expectation.

Theorem 1. Let (X, A, µ) be a measurable space and let h be a non-negative real valued measurable function of X, then it holds that

$$\mathrm{Var}_\mu(h) = E_\mu(h^2) - E_\mu(h)^2. \qquad (23)$$

Proof. According to Equation (22), $\mathrm{Var}_\mu(h) = E_{P_{\mu,h}}\big((h - E_\mu(h))^2\big)$, i.e., the variance of h computed according to probability distribution P µ,h , which can be calculated as

$$\mathrm{Var}_\mu(h) = E_{P_{\mu,h}}(h^2) - E_{P_{\mu,h}}(h)^2. \qquad (24)$$

The functions h and h² are comonotone, and therefore they induce the same ordering of the reference set and hence yield the same associated probability distribution (see Definition 2). Thus, it holds that P µ,h = P µ,h² and therefore $E_{P_{\mu,h}}(h^2) = E_{P_{\mu,h^2}}(h^2) = E_\mu(h^2)$. In addition, according to Equation (18), $E_{P_{\mu,h}}(h) = E_\mu(h)$. Substituting both identities into Equation (24) we obtain Equation (23).
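The shortcut in Theorem 1 can be verified numerically. The sketch below uses a hypothetical fuzzy measure `MU` and function `H` (not the data of Example 2), hypothetical helper names, and the assumed convention P(x (i) ) = µ({x (i) , . . . , x (n) }) − µ({x (i+1) , . . . , x (n) }) for the associated probability function.

```python
# Hypothetical fuzzy measure on X = {x1, x2, x3}; values are illustrative only.
MU = {
    frozenset(): 0.0,
    frozenset({"x1"}): 0.2,
    frozenset({"x2"}): 0.3,
    frozenset({"x3"}): 0.1,
    frozenset({"x1", "x2"}): 0.6,
    frozenset({"x1", "x3"}): 0.4,
    frozenset({"x2", "x3"}): 0.5,
    frozenset({"x1", "x2", "x3"}): 1.0,
}
H = {"x1": 0.5, "x2": 0.2, "x3": 0.9}


def associated_probability(mu, h):
    """Probability function for mu and the increasing ordering induced by h."""
    order = sorted(h, key=h.get)
    return {
        x: mu[frozenset(order[i:])] - mu[frozenset(order[i + 1:])]
        for i, x in enumerate(order)
    }


def monotone_expectation(mu, h):
    p = associated_probability(mu, h)
    return sum(p[x] * h[x] for x in h)


def monotone_variance(mu, h):
    p = associated_probability(mu, h)
    mean = sum(p[x] * h[x] for x in h)
    return sum(p[x] * (h[x] - mean) ** 2 for x in h)


h_squared = {x: value ** 2 for x, value in H.items()}

# h and h^2 are comonotone for non-negative h, so they share one distribution.
same_distribution = associated_probability(MU, H) == associated_probability(MU, h_squared)

direct = monotone_variance(MU, H)
shortcut = monotone_expectation(MU, h_squared) - monotone_expectation(MU, H) ** 2
print(same_distribution, direct, shortcut)
```

Both routes give the same value in this example, and only two monotone expectations are needed for the shortcut.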
As a continuation of Example 2, we will compute Var µ (h) using Equation (23).
The next result shows that the monotone variance behaves in the same way as the traditional variance under affine transformations.

Proposition 3. Assume the conditions in Theorem 1 and let t be a function defined as t = ah + b with $a \in \mathbb{R}_0^+$ and b ∈ R. It holds that Var µ (t) = a² Var µ (h). (25)

Proof. First, we have to show that t and h are comonotone, i.e., that for all x, y ∈ X, (h(x) − h(y)) and (t(x) − t(y)) have the same sign. This holds because t(x) − t(y) = a (h(x) − h(y)) and $a \in \mathbb{R}_0^+$. Therefore, the probability distribution associated with the measure µ is the same for both functions, i.e., P µ,t = P µ,h , and thus

$$\mathrm{Var}_\mu(t) = E_{P_{\mu,h}}\big((ah + b - a E_\mu(h) - b)^2\big) = a^2\, \mathrm{Var}_\mu(h).$$

The next results analyze when the monotone variance is equal to 0.

Theorem 2. Let (X, A, µ) be a measurable space and let h be a non-negative real valued measurable function of X. Let P µ,h be the probability function associated with µ and h. Then, the following three conditions are equivalent:

Proof. Let us assume without loss of generality that Suppose that Var µ (h) = 0 and there exist two different j, k ∈ {1, . . . , n}, j < k, such that p σ (x j ) ≠ 0 and p σ (x k ) ≠ 0. Then it holds that However, according to the assumption in Equation (26) which is a contradiction with the assumption that p σ (x j ) ≠ 0. Thus, there is only one p σ (x i ) ≠ 0 and furthermore, p σ (x i ) = 1.

(2) =⇒ (3)
Assume ∃!i such that p σ (x i ) ≠ 0. Then, On the other hand, since µ( It is straightforward from the definition of monotone variance.

Example 4. Assume a function h defined on
and an associated probability distribution p σ such that p σ (x 1 ) = 1, p σ (x 2 ) = 0 and p σ (x 3 ) = 0. We will see how the monotone variance is equal to 0. However, first we need to calculate the monotone expectation. Thus, Now we will calculate the value of the measure µ over the sets We can obtain the values of µ from p σ using Definition 2. The result is Therefore, µ(H α 1 ) = 1, µ(H α 2 ) = 1 and µ(H α 3 ) = 0.

Monotone Moments
Following the same idea underlying the definition of monotone variance, we can extend the concepts of central and non-central moments from a probabilistic setting to a monotone one.

Definition 16. Let (X, A, µ) be a measurable space and let h be a non-negative real valued measurable function of X. We define the k-th non-central monotone moment of h with respect to µ as

$$E_\mu(h^k). \qquad (27)$$

Note that Equation (27) is well defined, since h and h^k are comonotone, and therefore the corresponding probability function is the same for both of them, regardless of the value of k.
The definition of central monotone moments is, however, more problematic. If we follow the same idea as in Definition 16, and define the central monotone moment as $E_\mu\big((h - E_\mu(h))^k\big)$, we find the problem that the functions h and (h − E µ (h))^k are not comonotone in general, which would mean that different underlying probability distributions would be used to compute E µ (h) and $E_\mu\big((h - E_\mu(h))^k\big)$. We will therefore generalize the definition of monotone variance to values of k ≠ 2, utilizing the probability function associated with µ and h.

Definition 17. Let (X, A, µ) be a measurable space and let h be a non-negative real valued measurable function of X. We define the k-th central monotone moment of h with respect to µ as

$$E_{P_{\mu,h}}\big((h - E_\mu(h))^k\big), \qquad (28)$$

where P µ,h is the probability function associated with µ and h.
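Both kinds of monotone moments can be computed from the single probability function P µ,h . The sketch below is illustrative: the measure `MU`, the function `H` and the helper names are hypothetical, and the associated probability function follows the assumed convention P(x (i) ) = µ({x (i) , . . . , x (n) }) − µ({x (i+1) , . . . , x (n) }). As a consistency check, it compares the third central moment with the usual binomial expansion in terms of non-central moments, which applies because all moments are taken under the same probability function.

```python
from math import comb

# Hypothetical fuzzy measure on X = {x1, x2, x3}; values are illustrative only.
MU = {
    frozenset(): 0.0,
    frozenset({"x1"}): 0.2,
    frozenset({"x2"}): 0.3,
    frozenset({"x3"}): 0.1,
    frozenset({"x1", "x2"}): 0.6,
    frozenset({"x1", "x3"}): 0.4,
    frozenset({"x2", "x3"}): 0.5,
    frozenset({"x1", "x2", "x3"}): 1.0,
}
H = {"x1": 0.5, "x2": 0.2, "x3": 0.9}


def associated_probability(mu, h):
    """Probability function for mu and the increasing ordering induced by h."""
    order = sorted(h, key=h.get)
    return {
        x: mu[frozenset(order[i:])] - mu[frozenset(order[i + 1:])]
        for i, x in enumerate(order)
    }


def noncentral_moment(mu, h, k):
    """k-th non-central monotone moment: expectation of h^k under P_{mu,h}."""
    p = associated_probability(mu, h)  # h^k is comonotone with h for h >= 0
    return sum(p[x] * h[x] ** k for x in h)


def central_moment(mu, h, k):
    """k-th central monotone moment, always taken under P_{mu,h}."""
    p = associated_probability(mu, h)
    mean = sum(p[x] * h[x] for x in h)
    return sum(p[x] * (h[x] - mean) ** k for x in h)


k = 3
mean = noncentral_moment(MU, H, 1)
# Binomial expansion of (h - mean)^k under the fixed distribution P_{mu,h}.
expansion = sum(
    comb(k, j) * noncentral_moment(MU, H, j) * (-mean) ** (k - j)
    for j in range(k + 1)
)
print(central_moment(MU, H, k), expansion)
```

For k = 2 the central moment reduces to the monotone variance, and for k = 1 it vanishes, as in the probabilistic case.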
The following result establishes the relation between central and non-central monotone moments.

Proposition 4.
Let (X, A, µ) be a measurable space and let h be a non-negative real valued measurable function of X. It holds that Proof. Assume X = {x 1 , . . . , x n }.

The Case of Two Functions
In this section we approach the simultaneous analysis of two functions h 1 and h 2 over the same reference set, X. Our goal is to model the information that both functions have in common, or the way in which they interact with one another.
Generalizing the concept of covariance directly raises the problem that the underlying probability distribution used to compute the monotone expectation is induced neither by h 1 nor by h 2 for the same fuzzy measure µ, and therefore it is not clear that such a monotone covariance actually measures the relationship between both functions. We will therefore explore a different approach, in which we model the degree of similarity between h 1 and h 2 by measuring the common region determined by both functions.

Definition 18. Let (X, A, µ) be a measurable space and let h 1 and h 2 be non-negative real valued measurable functions of X. We define the common expectation of h 1 and h 2 with respect to µ as

$$\psi_\mu(h_1, h_2) = E_\mu(\min\{h_1, h_2\}). \qquad (30)$$

The concept of common expectation is illustrated in Figure 1. More precisely, the value of the common expectation of h 1 and h 2 is the monotone expectation, with respect to µ, of the function that bounds the shaded area from above.

Example 5. We want to obtain the global grade for two students from the individual grades they obtained in four different courses {x 1 , x 2 , x 3 , x 4 }. In the final grade we want to reflect whether a student shows a good performance in the two scientific courses, {x 1 , x 2 }, the humanistic ones, {x 3 , x 4 }, or in the combination {x 2 , x 3 }, corresponding to a social sciences profile. These criteria are encoded in the fuzzy measure in Table 2, while the grades obtained by both students (between 0 and 1) in each of the courses are shown in Table 3.

Table 2. A fuzzy measure matching the criteria in Example 5 (columns: Reference Subsets, Measure).

Table 3. Grades obtained by the students in Example 5 in the individual courses.
The calculation of the respective monotone expectations and variances result in which are quite similar, while the common expectation is ψ µ (h 1 , h 2 ) = 0.25.
The next proposition states the basic properties of the common expectation.

Proposition 5.
Let (X, A, µ) be a measurable space and let h 1 and h 2 be non-negative real valued measurable functions of X. Then, ψ µ satisfies the following properties: If ∀x ∈ X, h 1 (x) ≤ h 2 (x), then for any non-negative real valued measurable function h of X,

Proof.

1. It is straightforward from Equation (30).

2. It follows from the facts that E µ is a monotone functional and that min{h 1 , h 2 } is bounded from above by both h 1 and h 2 .

3. It is a direct consequence of the monotonicity of operator min and functional E µ .

5. If {x ∈ X|h 1 (x) > 0} ∩ {x ∈ X|h 2 (x) > 0} = ∅ then the minimum of both functions is the identically null function, which is known to be the only one that has null monotone expectation [30].
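The common expectation and some of the properties used in the proof above can be illustrated with a short Python sketch. It takes the common expectation to be the monotone expectation of the pointwise minimum of both functions, as the proof indicates. The fuzzy measure `MU`, the functions `H1`, `H2`, `G1`, `G2` and the helper names are hypothetical and are not the data of Example 5.

```python
# Hypothetical fuzzy measure on X = {x1, x2, x3}; values are illustrative only.
MU = {
    frozenset(): 0.0,
    frozenset({"x1"}): 0.2,
    frozenset({"x2"}): 0.3,
    frozenset({"x3"}): 0.1,
    frozenset({"x1", "x2"}): 0.6,
    frozenset({"x1", "x3"}): 0.4,
    frozenset({"x2", "x3"}): 0.5,
    frozenset({"x1", "x2", "x3"}): 1.0,
}
H1 = {"x1": 0.9, "x2": 0.7, "x3": 0.1}
H2 = {"x1": 0.2, "x2": 0.6, "x3": 0.8}


def monotone_expectation(mu, h):
    """Finite Choquet integral of a non-negative h, summed over alpha-cuts."""
    order = sorted(h, key=h.get)  # elements by increasing value of h
    total, previous = 0.0, 0.0
    for i, x in enumerate(order):
        total += (h[x] - previous) * mu[frozenset(order[i:])]
        previous = h[x]
    return total


def common_expectation(mu, h1, h2):
    """Monotone expectation of the pointwise minimum of h1 and h2."""
    return monotone_expectation(mu, {x: min(h1[x], h2[x]) for x in h1})


psi = common_expectation(MU, H1, H2)
e1 = monotone_expectation(MU, H1)
e2 = monotone_expectation(MU, H2)

# Functions with disjoint supports: their pointwise minimum is identically 0.
G1 = {"x1": 0.5, "x2": 0.0, "x3": 0.0}
G2 = {"x1": 0.0, "x2": 0.4, "x3": 0.7}
psi_disjoint = common_expectation(MU, G1, G2)

print(psi, e1, e2, psi_disjoint)
```

In this example the common expectation is bounded by the smaller of the two monotone expectations, coincides with the monotone expectation when both functions are equal, and vanishes for functions with disjoint supports.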
The common expectation is not normalized, and therefore its value alone is not enough to determine whether it should be regarded as high or low. For instance, in Example 5 we obtained ψ µ (h 1 , h 2 ) = 0.25, but that value by itself is not informative. However, the common expectation can be bounded from above: by property 2 in Proposition 5, ψ µ (h 1 , h 2 ) ≤ min{E µ (h 1 ), E µ (h 2 )}, and for any positive real numbers a and b it holds that min{a, b} ≤ √(a · b) ≤ max{a, b}, with equality only when a = b. Hence, we can normalize the common expectation using these bounds, which yields three possible definitions of coefficients of concordance between h 1 and h 2 .

Definition 19. Let (X, A, µ) be a measurable space and let h 1 and h 2 be non-negative real valued measurable functions of X. We define the coefficients of concordance ρ 1 , ρ 2 and ρ 3 between h 1 and h 2 with respect to µ as
The next proposition shows the basic properties of the three concordance coefficients (when it is clear from the context, we will drop the measure and the functions, thus denoting ρ µ i (h 1 , h 2 ) by ρ i ).

Proposition 6.
Assume the conditions in Definition 19. The coefficients of concordance satisfy the following conditions:

1.
It is clear that According to property 2 in Proposition 5, ψ µ (h 1 , h 2 ) ≤ min{E µ (h 1 ), E µ (h 2 )}, and thus On the other hand, since h 1 and h 2 are non-negative, so it is ψ µ (h 1 , h 2 ), which means that ρ ; hence, the three coefficients are equal to 1.

3.
From the proof of property 1, we know that

Example 6.
As a continuation of Example 5, we can use the data in Tables 2 and 3 to compute the coefficients of concordance, obtaining Note how the three coefficients have low values, which is consistent with the data in the example, as in spite of the similar values for the monotone expectation and variance corresponding to both students, they have a clearly different profile, scientific in the case of h 1 and humanistic in the case of h 2 .
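The normalization behind the coefficients of concordance can be sketched as follows. The code divides the common expectation by the minimum, the geometric mean and the maximum of the two monotone expectations. The data are hypothetical (not those of Tables 2 and 3), and the key names `by_min`, `by_geometric_mean` and `by_max` are illustrative labels that are deliberately not mapped to ρ 1 , ρ 2 and ρ 3 .

```python
from math import sqrt

# Hypothetical fuzzy measure on X = {x1, x2, x3}; values are illustrative only.
MU = {
    frozenset(): 0.0,
    frozenset({"x1"}): 0.2,
    frozenset({"x2"}): 0.3,
    frozenset({"x3"}): 0.1,
    frozenset({"x1", "x2"}): 0.6,
    frozenset({"x1", "x3"}): 0.4,
    frozenset({"x2", "x3"}): 0.5,
    frozenset({"x1", "x2", "x3"}): 1.0,
}
H1 = {"x1": 0.9, "x2": 0.7, "x3": 0.1}
H2 = {"x1": 0.2, "x2": 0.6, "x3": 0.8}


def monotone_expectation(mu, h):
    """Finite Choquet integral of a non-negative h, summed over alpha-cuts."""
    order = sorted(h, key=h.get)
    total, previous = 0.0, 0.0
    for i, x in enumerate(order):
        total += (h[x] - previous) * mu[frozenset(order[i:])]
        previous = h[x]
    return total


def concordance(mu, h1, h2):
    """Common expectation normalized by three upper bounds (assumes E > 0)."""
    e1 = monotone_expectation(mu, h1)
    e2 = monotone_expectation(mu, h2)
    psi = monotone_expectation(mu, {x: min(h1[x], h2[x]) for x in h1})
    return {
        "by_min": psi / min(e1, e2),
        "by_geometric_mean": psi / sqrt(e1 * e2),
        "by_max": psi / max(e1, e2),
    }


coefficients = concordance(MU, H1, H2)
identical = concordance(MU, H1, H1)
print(coefficients, identical)
```

Because min{a, b} ≤ √(a · b) ≤ max{a, b}, the three normalized values are ordered and never exceed 1, and all of them equal 1 when both functions coincide.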

Parameters Defined over Product Spaces
In this section we explore scenarios where we have two measurable spaces, each of them equipped with a different fuzzy measure. We will consider the definition of statistical parameters on the product space.
As in Section 3, we will separately study the cases of one and two real functions. In both cases, it is necessary to obtain a fuzzy measure over the product space. We will rely on the proposals in [31] to obtain the product measures.

The Case of One Function
The methods proposed in [31] for constructing fuzzy measures over product spaces, rather than single measures, usually yield a set of them, bounded by an upper and lower measure. Similarly, our proposals here will consist of intervals of parameters rather than a single one.
We will start by defining the concept of joint expectation, making use of the interior and exterior product measures (see Definitions 10 and 11).
The concept of joint -expectations is analogous to the concept of monotone expectation in a marginal space, with the difference that, in the case of the product space, the underlying fuzzy measure is not known, but instead we have an interval of measures bounded by the interior and exterior -product measures. We can also define joint expectations using other product measures, such as the p-measures given in Definitions 12 and 13.

Definition 21.
Let (X 1 , A X 1 , µ 1 ) and (X 2 , A X 2 , µ 2 ) be measurable spaces, h : X 1 × X 2 → [0, 1], and let $\underline{m}_{12}$ and $\overline{m}_{12}$ be the lower and upper product p-measures, respectively. We define the lower and upper joint probabilistic expectations as

Since we have a function defined over the product space and fuzzy measures defined over the marginal spaces, it is natural to define marginal expectations. We will utilize the concept of ⊕-marginal of a function [31].

Definition 22.
[31] Let h be a function defined on X 1 × X 2 and taking values on [0, 1]. We define the ⊕-marginals of h as where ⊕ is a t-conorm (see Definition 6), n is the cardinality of X 1 and m is the cardinality of X 2 .

Definition 23.
Let (X 1 , A X 1 , µ 1 ) and (X 2 , A X 2 , µ 2 ) be measurable spaces and let h be a function defined on X 1 × X 2 and taking values on [0, 1]. We define the marginal ⊕-expectations as where h ⊕ X i are the ⊕-marginals of h.

The Case of Two Functions
We will now assume that we have two different functions, one for each marginal space, and define parameters that combine the information provided by the marginal spaces. Definition 24. Let (X 1 , A X 1 , µ 1 ) and (X 2 , A X 2 , µ 2 ) be measurable spaces, and let h 1 , h 2 be functions defined on X 1 and X 2 respectively, taking values on [0, 1]. We define the upper and lower global expectation of h 1 and h 2 as where and are arbitrary t-norms (see Definition 5), h 12 (x 1 , and µ 12 and µ 12 are the interior and exterior product measures of µ 1 and µ 2 respectively.
The next proposition shows that both expectations coincide when the t-norm used to combine h 1 and h 2 is the min t-norm.

Proposition 8.
Assume the conditions in Definition 24. If the t-norm used to combine h 1 and h 2 is the min t-norm, the lower and upper global expectations coincide.

Proof. According to ([31], Proposition 8), the α-cuts generated by h belong to R when h 1 and h 2 are combined with the min t-norm. Furthermore, Equation (12) establishes that the interior and exterior product measures coincide on the elements of R, which proves the result.
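The argument in the proof suggests a simple way to compute the global expectation when the functions are combined with the min t-norm: every α-cut of min{h 1 (x 1 ), h 2 (x 2 )} is the rectangle formed by the α-cuts of h 1 and h 2 , so only values of the product measure on rectangles are needed. The Python sketch below is illustrative and rests on a stated assumption: the product measure of a rectangle A × B is taken to be the t-norm of µ 1 (A) and µ 2 (B). The measures, functions and helper names are hypothetical.

```python
# Hypothetical marginal fuzzy measures and functions; values are illustrative.
MU1 = {
    frozenset(): 0.0,
    frozenset({"a1"}): 0.3,
    frozenset({"a2"}): 0.5,
    frozenset({"a1", "a2"}): 1.0,
}
MU2 = {
    frozenset(): 0.0,
    frozenset({"b1"}): 0.4,
    frozenset({"b2"}): 0.2,
    frozenset({"b1", "b2"}): 1.0,
}
H1 = {"a1": 0.8, "a2": 0.4}
H2 = {"b1": 0.6, "b2": 0.9}


def alpha_cut(h, alpha):
    """Elements whose value of h is at least alpha."""
    return frozenset(x for x, value in h.items() if value >= alpha)


def global_expectation_min(mu1, h1, mu2, h2, t_norm):
    """Choquet integral of min{h1(x1), h2(x2)} over the product space.

    Assumption: the product measure of a rectangle A x B is
    t_norm(mu1(A), mu2(B)).
    """
    # Distinct values taken by the combined function on the product space.
    levels = sorted({min(u, v) for u in h1.values() for v in h2.values()})
    total, previous = 0.0, 0.0
    for alpha in levels:
        # The alpha-cut of the combined function is a rectangle of alpha-cuts.
        rectangle = t_norm(mu1[alpha_cut(h1, alpha)], mu2[alpha_cut(h2, alpha)])
        total += (alpha - previous) * rectangle
        previous = alpha
    return total


phi_min = global_expectation_min(MU1, H1, MU2, H2, min)
phi_product = global_expectation_min(MU1, H1, MU2, H2, lambda a, b: a * b)
print(phi_min, phi_product)
```

The t-norm passed as `t_norm` only affects the measure of the rectangles; the rectangular shape of the α-cuts comes from combining the functions with the minimum.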
As a consequence of Proposition 8, when using the min t-norm we will just write φ for both the lower and the upper global expectation.

As with the common expectation, the global expectation is not normalized, but it can be easily normalized in the same way, as stated in the next definition.

Definition 25. Let (X 1 , A X 1 , µ 1 ) and (X 2 , A X 2 , µ 2 ) be measurable spaces and let h 1 and h 2 be functions defined on X 1 and X 2 respectively and taking values on [0, 1]. We define the global coefficients of concordance of h 1 and h 2 as

$$\Phi_1(h_1, h_2) = \frac{\phi}{\sqrt{E_{\mu_1}(h_1)\, E_{\mu_2}(h_2)}}, \qquad \Phi_2(h_1, h_2) = \frac{\phi}{\min\{E_{\mu_1}(h_1), E_{\mu_2}(h_2)\}}. \qquad (48)$$

Example 8. (Continuation of Example 5)
Using the data in Table 3 we can obtain the function h min 12 , the values of which are given in Table 5.  Using the fuzzy measure in Table 2, we find that φ min min = 0.6 and the global concordance coefficients are Φ 1 (h 1 , h 2 ) = 0.953 and Φ 2 (h 1 , h 2 ) = 0.984.
The value of the global expectation (0.6) is very close to the values of the monotone expectations for each student in Example 5 (0.61 and 0.65 respectively). It can be interpreted as the fact that the grades of both students are acceptable individually and also globally, which is reflected in high values of the global coefficients of concordance. Note how the global expectation is not detecting the fact that both students have different profiles (scientific and humanistic), while the common expectation detected this fact yielding a much lower value (0.25) resulting in lower values of the coefficients of concordance as well.

Conclusions
With the introduction of the concept of monotone variance, we have complemented the already known concept of monotone expectation. It can be regarded as a measure of dispersion with respect to a central position measure. We have also introduced the concepts of central and non-central monotone moments, which can serve as a vehicle to define further statistical parameters based on fuzzy measures, such as shape measures. The potential application scope is wide, as it covers non-additive scenarios like the ones described in the examples in this paper; such scenarios arise, for example, in engineering and social science applications.
The common expectation and concordance coefficients can be interpreted as measures of match between the functions, and in that sense can provide information about the extent to which one function explains the other. A possible application of these concepts is the development of prediction models when the measures are not additive.
Thanks to the developments in [31] we have been able to extend the concept of monotone expectation to product spaces, where, in addition, we have shown how to marginalize the information provided by a function over a product space using the marginal ⊕-expectations.
All the developments in this paper are restricted to finite reference sets. Even though this covers a wide variety of practical applications, it is worth exploring the extension of the results obtained here to uncountable reference sets, which seems to be a promising research line. The first step in this direction would be the extension of the results in [31] to continuous domains.

Funding: This research was funded by the Spanish Ministry of Science and Innovation through grants TIN2016-77902-C3-3-P, PID2019-106758GB-C32 and by ERDF-FEDER funds.

Conflicts of Interest:
The authors declare no conflict of interest.