1. Introduction
Data are often called the ‘oil’ of the 21st century because they have become a new factor of production and a strategic asset, contributing significantly to technological innovation and industrial upgrading. At the enterprise level, data play an increasingly important role in enhancing core competitiveness. A scientific assessment of the value of enterprise data assets is therefore crucial for enterprise development. Feng [
1] investigated data asset value and impact factors, suggesting a valuation model that is adaptable across societal sectors. Brennan, Attard, Petkov, Nagle and Helfert [
2] pointed out that there is a lack of research on data valuation techniques and observed that data value perceptions differ among organizations. Li and Alotaibi [
3] utilized nonparametric estimation methods and nonlinear expectations to build various risk metric models for the asset pricing and financing risk assessments of small businesses. Harish, Liu, Zhong and Huang [
4] studied digital asset valuation and risk assessment for logistics companies and used digital assets based on letters of credit to help finance logistics. However, these studies deal only with quantitative data [
5,
6,
7].
Enterprise data asset value (EDAV) evaluation is a complex issue involving multiple variables and factors, some of which are difficult to express in precise numerical terms. Zadeh [
8] argued that traditional set theory is too precise to handle the uncertainty and fuzziness present in the real world and therefore introduced the concept of fuzzy sets (FSs) in 1965. In the last few decades, the application of FSs has driven the development of various fields and has given rise to numerous variations and extensions (IFS [
9], HFS [
10], PHFS [
11], PDHFS [
12], HFNs [
13], GHFNs [
14], etc. [
15,
16,
17]). Among these, generalized hesitant fuzzy numbers (GHFNs) can comprehensively describe the potential information of variables and represent multiple possible degrees of membership. In multiple-attribute decision-making (MADM) problems, they excel at handling complex uncertainty and fuzziness. Keikha [
14] proposed GHFNs based on HFNs and introduced their definition, operation laws, aggregation operators, and so on. Keikha [
18] gave some useful distance measures for GHFNs and proposed an updated TOPSIS method, which was applied to the selection of energy projects. Based on the general forms of t-norm and t-conorm functions, Garg and Keikha [
19] introduced several aggregation operators for GHFNs, thereby expanding the aggregation theory of GHFNs. Liu, Wang, Ning and Wei [
20] extended the CPT-TODIM method to GHFNs and used it for researcher selection. Liu, Wang and Wei [
21] proposed a new score function and entropy measure for GHFNs to select energy projects using the GHF-EDAS method. GHFNs have been widely used in MADM problems, but there are currently no relevant applications for EDAV evaluation.
One central problem in MADM is how to fuse and rank the evaluation information. The COPRAS method considers the importance and validity of the alternatives when evaluating and ranking them, and its calculation process is simple and transparent. It is therefore widely used in MADM problems. Seker, Baglan, Aydin, Deveci and Ding [
22] used the IVq-ROF-COPRAS method to evaluate COVID-19 social risk factors. Mishra, Rani, Saha, Senapati, Hezam and Yager [
23] proposed the COPRAS method for Fermatean FSs and applied the method to the selection of renewable energy sources. Naz, Akram and Muzammal [
24] extended the COPRAS method to the 2-tuple linguistic T-spherical fuzzy MAGDM problem. Dang, Nguyen, Nguyen and Dang [
25] proposed SFs Gray COPRAS (G-COPRAS) and applied it to the SSS problem. Buyukozkan and Gocer [
26] proposed the PFS-COPRAS method and applied it to the MADM problem of partner selection. Yuan, Xu and Zhang [
27] proposed a hybrid DEMATEL-COPRAS approach for probabilistic linguistic term sets and applied it to third-party supplier selection. Song and Chen [
28] proposed the COPRAS method for MADM problems under PHFSs. Further applications of the COPRAS method in other fuzzy environments can be found in [29,30,31]. However, the application of the COPRAS method to GHFNs has remained unexplored until now.
It is noteworthy that in real-world scenarios, decision-makers frequently exhibit bounded rationality, not always aiming to maximize utility. Instead, they tend to opt for choices that best align with their preferences. For this reason, Tversky and Kahneman [
32] proposed cumulative prospect theory (CPT) for decision analysis under uncertainty and risk. CPT has since been applied successfully to a wide range of risk-based MADM problems with fuzzy information. Zhang, Wei, Guo and Wei [
33] developed a CPT-TODIM model for MADM with 2TLPFSs and applied it to company credit risk assessment. Zhang, Wei, Lin and Chen [
34] proposed an intuitionistic fuzzy TOPSIS method (IF-CPT-TOPSIS) based on CPT and applied it to the MAGDM problem. Liao, Gao, Lin, Wei and Chen [
35] proposed the PHF-CPT-EDAS method by combining CPT and information entropy theory and used it to solve the MAGDM. Zhang and Wei [
36] established the SF-CPT-CoCoSo model based on CPT in a spherical fuzzy environment and used this method for the location of electric vehicle charging stations. Mao, Chen, Lv, Guo and Xie [
37] proposed a MADM method based on the CPT and DEMATEL methods and applied it to the problem of municipal plastic solid waste disposal. Han, Zhang and Deng [
38] proposed IF-CPT-VIKOR in an intuitionistic fuzzy environment based on the CPT and VIKOR decision-making methods and applied it to commercial concrete supplier selection. However, research on CPT-based generalized hesitant fuzzy MADM methods is still relatively limited to date. It is interesting to note that the CPT-COPRAS method has not yet been proposed to cope with the uncertainty problem in GHFNs.
Another problem in MADM is how to determine the weights of the criteria. Objective weighting methods derive the weights from the available data and information, which better reflects the relationships among the decision criteria and their importance and reduces the subjective bias of the decision-makers (DMs), thus making the decision-making process more objective. Objective weighting methods mainly include the entropy weight method [
39], MEREC method [
40], CRITIC method [
41], etc. The CRITIC method determines the weights from the contrast intensity within each index and the degree of conflict between indexes. It is therefore widely used to determine attribute weights in MADM problems [
42,
43,
44,
45]. As research into the CRITIC method deepens, we have identified distinct limitations in its approach to determining attribute weights: (1) The conflict ability of the indices should only be associated with the degree of relevance, independent of positive or negative correlations. Hence, it is necessary to eliminate the positive and negative signs of the correlation coefficients. (2) The CRITIC method tends to assign higher weights to attributes of indices that are directly assigned or less relevant, thereby requiring a reduction in conflict ability. Recently, Krishnan, Kasim, Hamid and Ghazali [
46] proposed the D-CRITIC method, which integrates distance correlation into the CRITIC method to capture both linear and nonlinear relationships between criteria. This overcomes the inadequacy of Pearson’s correlation coefficient in describing conflicting relationships and yields attribute weights efficiently. Zhang and Wei [
36] extended the D-CRITIC method to spherical fuzzy sets to compute attribute weights and applied it to uncertain fuzzy decision problems. Maneengam [
47] used the D-CRITIC method to weight the objective functions and then used a modified TOPSIS method to study the MRP problem with multiple objective functions. Wu, Yan, Wang, Chen, Jin and Shen [
48] used a modified CRITIC method to calculate attribute weights, then simulated a multidimensional connectivity cloud and calculated the connectivity relative to the evaluation criteria to evaluate eutrophication water quality. However, there are few applications of the D-CRITIC method in other fuzzy environments. One of the key factors in the D-CRITIC method is the distance measure. Kullback–Leibler (K-L) divergence, from which the Jensen–Shannon divergence is derived [
49], is an effective method for data fusion that distinguishes between two probability distributions on the same variable, reflecting the distance of one probability distribution from the other. Kumar, Patel and Mahanta [
50] proposed a new distance measure for PFSs based on K-L divergence, proved its mathematical properties, and conducted a comparative study with existing distance measures to verify its superiority. Moreno, Ho and Vasconcelos [
51] derived a kernel distance between generative probabilistic models based on the K-L divergence. However, there are no relevant works on the K-L divergence measure under GHFNs.
The literature review shows that EDAV evaluation is a typical MADM problem. In this paper, we first propose a CPT-based COPRAS decision-making method that accounts for the DMs’ bounded rationality and establish the GHF-CPT-COPRAS model for MADM. Second, we propose K-L divergence measures for GHFNs and extend the D-CRITIC method to GHFNs to obtain the weights of the MADM criteria. Finally, we illustrate the applicability of the GHF-CPT-COPRAS model through an EDAV evaluation example and conduct a comparative study to verify the validity and feasibility of the model.
The primary motivations of this paper are as follows: (1) In the era of big data, EDAV evaluation holds significant practical importance. However, there is a scarcity of related studies. Therefore, this paper aims to establish an EDAV evaluation index system and translate decision-making information into GHFNs to facilitate better decision-making on EDAV evaluation problems. (2) The K-L divergence measure distinguishes between two probability distributions on the same variable, indicating the distance between them. This measure has been extended to the GHF environment to reflect the distance measure of two GHFNs. (3) The COPRAS method has been widely utilized due to its capability to consider the importance and validity of different alternatives in the evaluation and ranking process, along with its simple and transparent calculation process. Decision-makers exhibit various psychological preferences when facing losses and gains, and CPT effectively simulates these preferences. By integrating CPT with COPRAS, the CPT-COPRAS model can fully capture DMs’ psychological preferences and provide effective and rational rankings. (4) The D-CRITIC method combines distance correlation with the CRITIC method to capture linear and nonlinear relationships between criteria, which overcomes the inadequacy of conflicting relationships between Pearson’s correlation coefficients and minimizes the possible deviation of the final weights. However, so far, the D-CRITIC method has rarely been applied in GHFNs. (5) It is important to apply the proposed GHF-CPT-COPRAS model to the EDAV evaluation problem for decision-making. For the reasons stated above, this paper first proposes the K-L divergence measure for GHFNs. Second, the GHF-CPT-COPRAS model is established to be applied to uncertain fuzzy decision-making problems, and the D-CRITIC method is extended to obtain the criteria weights in the MADM problem. 
Finally, the developed model is applied to the EDAV evaluation problem to verify its effectiveness. In addition, the effectiveness and feasibility of the GHF-CPT-COPRAS model are further demonstrated through a comparative discussion with existing decision-making methods for GHFNs.
The main contributions are as follows: (1) we establish the EDAV evaluation index system and transform the EDAV evaluation information into GHFNs; (2) we propose the K-L divergence measure for GHFNs, which enriches the distance measure theory of GHFNs; (3) we extend the D-CRITIC method to assign the weights of unknown attributes in GHF decision-making; (4) we establish the GHF-CPT-COPRAS model to solve the MADM problem, integrating CPT into the COPRAS method so that the decision-making habits and risk preferences of DMs are reflected in the evaluation of the alternatives; (5) we apply the proposed model to the EDAV evaluation problem to evaluate the data asset value of five Internet financial enterprises, and the results can provide a reference for managers; (6) we conduct comparative analyses to validate the GHF-CPT-COPRAS model’s validity and feasibility, which provides a reference for extending the CPT-COPRAS method to other decision-making environments and the established model to other MADM problems.
The remainder of this paper is organized as follows. In
Section 2, we review the definition and operational laws of GHFNs, CPT, the COPRAS method, and the D-CRITIC method.
Section 3 proposes a distance measure of GHFNs based on the K-L divergence measure.
Section 4 introduces the GHF-CPT-COPRAS model, incorporating the D-CRITIC method. In
Section 5, we establish the EDAV evaluation system, apply the proposed method to practical EDAV evaluation problems, and compare GHFN operators and decision-making methods to illustrate the effectiveness and feasibility of the EDAV evaluation method. Finally,
Section 6 provides a summary of the paper and suggests interesting directions for future research.
3. GHFNs K–L Divergence Measure
In this section, we propose a new distance measure for GHFNs based on the K-L divergence. The K-L divergence is an effective data fusion tool: it distinguishes between two probability distributions on the same variable and quantifies the distance of one distribution from the other.
For any discrete random variable $X$, suppose $P$ and $Q$ are two probability distributions. Then, the K-L divergence between $P$ and $Q$ is defined as:
$$D_{KL}(P\,\|\,Q)=\sum_{x} P(x)\log\frac{P(x)}{Q(x)}$$
To avoid the situation where
,
can be modified as follows:
where
is a value with
.
The K-L divergence does not satisfy the symmetry property. We therefore transform it into a symmetric divergence and obtain the following result:
Since all probabilities are less than 1, we set
.
Divergence is used to measure the difference or dissimilarity between two probability distributions. Therefore, we can construct a distance measure to measure the difference between information using K-L divergence. Next, we define the K-L divergence measure for two GHFNs.
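The construction described above (a K-L divergence modified so that a zero probability cannot appear in the denominator, then made symmetric) can be illustrated with a minimal Python sketch for ordinary discrete distributions. The smoothing used here, which compares `p` against the mixture `eps * p + (1 - eps) * q`, is an assumption chosen for illustration, as are the function names and the default `eps = 0.5`; the modification actually used in this section may differ, and the sketch does not cover GHFNs.

```python
import math

def kl_divergence(p, q, eps=0.5):
    """Smoothed K-L divergence of distribution p from distribution q.

    Assumption: q is replaced by the mixture eps*p + (1-eps)*q, with
    0 < eps < 1, so the denominator is positive whenever p_i > 0.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:  # terms with p_i = 0 contribute nothing, by convention
            total += pi * math.log(pi / (eps * pi + (1 - eps) * qi))
    return total

def symmetric_kl(p, q, eps=0.5):
    """Symmetric divergence: the sum of the two directed divergences."""
    return kl_divergence(p, q, eps) + kl_divergence(q, p, eps)

p = [0.2, 0.5, 0.3]
q = [0.4, 0.4, 0.2]
print(symmetric_kl(p, q))            # small positive value
print(symmetric_kl([1, 0], [0, 1]))  # two disjoint distributions
```

Under these assumptions the value for two disjoint distributions is 2 ln 2, which exceeds 1, so a distance built this way has to be divided by its maximum if values in [0, 1] are required.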
Let
and
be two GHFNs, then the new distance measures of GHFNs are recorded as:
Since
and
when
,
and
,
the distance measure
takes the maximum value
, which is more than 1. Consequently
is not limited to the interval [0, 1]. Therefore, we normalize the distance measure
by dividing it by its maximum value.
Theorem 1. Let , and be three AGHFNs, and if satisfies the three properties of distance, then is the distance measure between two AGHFNs.
- (A1)
;
- (A2)
if and only if ;
- (A3)
;
Proof. The GHFNs K-L divergence measure Equation (32) can be reformulated as follows:
To expedite the proof of the distance properties, we exploit the following function
- (A1)
Obtained the partial derivatives of with respect to and as
Assuming has no loss of normality, we have . In addition, we also obtain and . This shows that the monotonicity of with respect to is the exact opposite of that with respect to . That is, the largest value of occurs at (1,0), and the largest value is . Furthermore, since , then there is , so we have .
Since , then, we get , so .
- (A2)
If , then and , therefore, and , we obtain
Therefore, the equation holds. □
5. An Illustrative Example
One company plans to carry out data business cooperation with Internet financial enterprises and now evaluates the data asset value of five Internet financial enterprises; the five enterprises are . The GHF-CPT-COPRAS method is applied to the EDAV evaluation problem in the following sections.
5.1. Background
Data assets encompass the data that an enterprise owns and manages and that can, in turn, generate value for the enterprise. From an input perspective, EDAV primarily comprises the labor, equipment, material, power and related expenses associated with data collection, storage, analysis and business applications. These can be further categorized into costs for data carriers, operations and maintenance, and services. From an output perspective, value manifests in two main ways. First, there is the directly tangible value from external services, evident in the amount and quality of the processed and analyzed data provided to external customers and in the resulting gains. Second, there is the value derived from internally processed data used as an enterprise resource, which influences quality and contributes to decision-making support. Guided by the principles of being systematic, hierarchical, objective and comparable, we structure the EDAV evaluation index system as illustrated in
Figure 2.
Data cost reflects the value of various types of cost inputs. It mainly includes the following indexes: Carrier cost, the construction and transformation cost of creating various types of business data systems and convergent data systems (such as data warehouses, data marts) and other data carriers; Operation and maintenance cost, the cost of daily data collection, cleaning, loading, storage, dynamic monitoring, and integration, as well as the cost of security and maintenance, fault detection, etc.; Service cost, the cost to meet the needs of the enterprise’s internal business scenarios and customer customization needs, the cost of data computation, analysis, mining, delivery of products and outputs.
The apparent value reflects the quantity and quality of data assets, which are the source of current service value and its future value added. It mainly includes the following indexes: Data scale, the amount of data owned and controlled by the enterprise; Data completeness, the completeness of the coverage of the delivered data to support internal decision-making in the business area and external services; and Data rationality, the degree of accuracy and reasonableness of the delivered data.
Service value includes both external and internal services and reflects the application value of EDAV. It mainly includes the following indexes: Service revenue, the amount of revenue gained from the delivery of data products to external customers; External customer satisfaction, the degree of satisfaction of external customers with the quality and delivery time of data deliverables; and Decision support contribution, the level of contribution of the data deliverables or data sources to the enterprise’s decision support in terms of strategy, operation and so on.
In practice, the EDAV evaluation indexes include both quantitative and qualitative indexes. When dealing with decision-making problems containing uncertainty and ambiguity, uniformly converting the information to GHFNs has certain advantages and avoids the additional information distortion caused by converting uncertain variables into deterministic ones. According to this EDAV evaluation system: , , , and are quantitative indexes; the real part of each GHFN is given by experts according to the actual survey data, and the membership degree part, a finite set of values ranging from 0 to 1, expresses the experts’ degree of hesitancy about the survey data. , , , , and are qualitative indexes; the real part of each GHFN is given by experts according to interviews and surveys as a finite set of values from 0 to 10, and the membership degree part, a finite set of values from 0 to 1, expresses the experts’ degree of hesitation in evaluating the data.
5.2. Decision Process
Step 1: The EDAV evaluation results are obtained through the research and organized into a data matrix expressed by GHFNs. The evaluation results are displayed in
Table 1,
Table 2 and
Table 3.
Next, we select the optimal cooperative enterprise using the new GHF-CPT-COPRAS method and the D-CRITIC method.
Step 2: Since the GHFNs in the GHF decision matrix
are not of equal length, we adjust them in an optimistic way and obtain the adjusted matrix
, which is exhibited in
Table 4,
Table 5 and
Table 6.
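The length adjustment in Step 2 can be sketched as follows. This is a minimal illustration that assumes the usual convention from the hesitant fuzzy literature, in which the optimistic adjustment pads the shorter set by repeating its largest element. Representing a GHFN as a pair of Python lists `(real_values, memberships)` and the helper names are hypothetical choices made for this example only.

```python
def extend_optimistic(values, target_len):
    """Pad a hesitant set to target_len by repeating its largest element."""
    if not values:
        raise ValueError("cannot extend an empty set")
    ordered = sorted(values)
    return ordered + [ordered[-1]] * (target_len - len(ordered))

def equalize(ghfn_a, ghfn_b):
    """Give two GHFNs real parts of equal length and membership parts of equal length.

    Each GHFN is modelled as (real_values, memberships), a hypothetical
    representation used only in this sketch.
    """
    (real_a, mem_a), (real_b, mem_b) = ghfn_a, ghfn_b
    n_real = max(len(real_a), len(real_b))
    n_mem = max(len(mem_a), len(mem_b))
    return (
        (extend_optimistic(real_a, n_real), extend_optimistic(mem_a, n_mem)),
        (extend_optimistic(real_b, n_real), extend_optimistic(mem_b, n_mem)),
    )

a = ([2, 4], [0.6])
b = ([3], [0.2, 0.7, 0.9])
print(equalize(a, b))
```

A pessimistic adjustment would repeat the smallest element instead; only the padding rule changes.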
Step 3: In the EDAV evaluation decision problem, attributes
,
and
are cost attributes and attributes
,
,
,
,
and
are benefit attributes. The matrix is normalized according to Equation (36) and the results are displayed in
Table 7,
Table 8 and
Table 9.
Step 4. The decision reference point for the gains and losses of each attribute is calculated by Equation (37), and the results are displayed in
Table 10,
Table 11 and
Table 12.
Step 5. The standard deviation of each attribute is calculated through Equation (38), and the results are displayed in
Table 13.
Step 6. The distance correlation matrix for each pair of attributes is calculated through Equations (39)–(43), and the results are displayed in
Table 14.
Step 7. The information content of each attribute is calculated through Equation (44), and the results are displayed in
Table 15.
Step 8. The D-CRITIC method weights are calculated through Equation (45), and the results are displayed in
Table 16.
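Steps 5 to 8 follow the general D-CRITIC pattern: a standard deviation per attribute, a distance correlation for each pair of attributes, an information content per attribute, and normalized weights. The sketch below shows that pattern for crisp (real-valued) normalized scores only. It is an assumption-laden illustration and not the GHFN version of Equations (38) to (45): pairwise distances are plain absolute differences instead of the K-L-based GHFN distance, the sample standard deviation is used, and all function names are hypothetical.

```python
import math

def centered_distances(x):
    """Double-centred matrix of pairwise absolute differences of a sample."""
    n = len(x)
    d = [[abs(a - b) for b in x] for a in x]
    row_mean = [sum(r) / n for r in d]  # equals the column means, d is symmetric
    grand_mean = sum(row_mean) / n
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]

def mean_product(a, b):
    """Mean of the element-wise product of two square matrices."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)

def distance_correlation(x, y):
    """Sample distance correlation between two equally long samples."""
    a, b = centered_distances(x), centered_distances(y)
    denom = math.sqrt(mean_product(a, a) * mean_product(b, b))
    if denom == 0:
        return 0.0  # at least one sample is constant
    return math.sqrt(max(mean_product(a, b), 0.0) / denom)

def d_critic_weights(matrix):
    """matrix[i][j]: normalized score of alternative i on attribute j."""
    m = len(matrix)
    cols = [list(c) for c in zip(*matrix)]
    means = [sum(c) / m for c in cols]
    std = [math.sqrt(sum((v - mu) ** 2 for v in c) / (m - 1))
           for c, mu in zip(cols, means)]
    # Information content: spread times total conflict with all attributes.
    info = [s * sum(1.0 - distance_correlation(cj, ck) for ck in cols)
            for s, cj in zip(std, cols)]
    total = sum(info)
    return [v / total for v in info]

scores = [[0.2, 0.9, 0.5],
          [0.4, 0.7, 0.1],
          [0.6, 0.4, 0.8],
          [0.9, 0.1, 0.3]]
print(d_critic_weights(scores))
```

Because distance correlation is never negative, the conflict term `1 - dCor` needs no sign correction, unlike the Pearson-based conflict term of the classical CRITIC method.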
Step 9. Using Equation (46), we obtain the normalized
matrix, and the results are displayed in
Table 17,
Table 18 and
Table 19.
Step 10. The expected mean of the attributes is obtained by Equation (47), and the results are displayed in
Table 20,
Table 21 and
Table 22.
Step 11. The distance matrix
between each GHFN and the expected value of the corresponding attribute is computed according to Equation (48), and the results are displayed in
Table 23.
Step 12. The transformed probability weight of each alternative is computed through Equation (49), and the results are displayed in
Table 24.
Step 13. The comprehensive prospect value matrix
is computed from Equation (50), and the results are displayed in
Table 25.
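As a rough illustration of the CPT ingredients behind Steps 12 and 13, the sketch below implements the value function and the probability weighting function in the functional forms given by Tversky and Kahneman, with their commonly cited parameter estimates as defaults. It is an assumption that Equations (49) and (50) use these forms and parameters; the GHFN-specific distances and reference points are not modelled, and the function names are hypothetical.

```python
def cpt_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """CPT value of an outcome x measured relative to the reference point.

    x >= 0 is treated as a gain, x < 0 as a loss; lam > 1 encodes loss aversion.
    """
    if x >= 0:
        return x ** alpha
    return -lam * (-x) ** beta

def cpt_weight(p, gain=True, gamma=0.61, delta=0.69):
    """Transformed probability weight for a probability p in [0, 1]."""
    c = gamma if gain else delta
    return p ** c / (p ** c + (1 - p) ** c) ** (1 / c)

# A loss looms larger than a gain of the same size.
print(cpt_value(0.5), cpt_value(-0.5))
# Small probabilities are overweighted, moderate ones underweighted.
print(cpt_weight(0.01), cpt_weight(0.5))
```

In a prospect-value matrix, each entry would combine such a transformed weight with the value of the corresponding gain or loss; how that is done for GHFNs is defined by the equations referenced in the steps above.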
Step 14. The negative attributes are
,
and
, and the positive attributes are
,
,
,
,
and
. The maximizing and minimizing indexes of each attribute are obtained by Equations (51) and (52), and the results are displayed in
Table 26.
Step 15. The relative significance value
is calculated through Equation (53), and the results are displayed in
Table 26.
Step 16. The relative significance values are ranked in descending order, and the optimal cooperative enterprise is .
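Steps 14 to 16 follow the usual COPRAS pattern of summing the benefit and cost contributions of each alternative, combining them into a relative significance value, and ranking. The sketch below shows the classical crisp form of that calculation. It assumes strictly positive weighted values, which need not hold for prospect values, so Equations (51) to (53) may differ in detail; the function name and the sample numbers are hypothetical.

```python
def copras_rank(weighted, cost_cols):
    """Classical COPRAS ranking.

    weighted[i][j]: weighted normalized value of alternative i on attribute j.
    cost_cols: set of column indexes that are cost (negative) attributes.
    Returns the relative significance values and the ranking (best first).
    """
    s_plus = [sum(v for j, v in enumerate(row) if j not in cost_cols)
              for row in weighted]
    s_minus = [sum(v for j, v in enumerate(row) if j in cost_cols)
               for row in weighted]
    total_minus = sum(s_minus)
    inv_sum = sum(1.0 / s for s in s_minus)  # requires every s_minus > 0
    q = [sp + total_minus / (sm * inv_sum) for sp, sm in zip(s_plus, s_minus)]
    order = sorted(range(len(q)), key=lambda i: q[i], reverse=True)
    return q, order

# Column 0 is a cost attribute, column 1 a benefit attribute.
values = [[0.10, 0.30],
          [0.20, 0.25],
          [0.15, 0.40]]
q, order = copras_rank(values, cost_cols={0})
print(q, order)
```

A smaller cost sum raises the second term of the relative significance value, so an alternative is rewarded both for high benefit and for low cost.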
5.3. Comparative Analysis
In this subsection, we compare the ranking obtained by the proposed model with those of the GHWAA operator [
14], the GHWGA operator [
14], the A-GHWAA operator [
19], the A-GHWGA operator [
19], the GHF-TOPSIS method [
18] and the GHF-CPT-TODIM method (
,
,
and
) [
20]. The results are shown in
Table 27 and
Table 28, and the
is the preferable alternative.
From
Table 28, it can be seen that the GHF-CPT-COPRAS method yields almost the same results as the other methods; only the ordering of individual alternatives differs. The GHWAA operator and the A-GHWAA operator emphasize the overall impact, while the GHWGA operator and the A-GHWGA operator emphasize the impact of the extremes. The GHF-TOPSIS method evaluates each alternative by its distance from the ideal solution. The GHF-CPT-TODIM method describes the decision-making process by taking into account the bounded rationality of the DMs and uses the overall value to measure the degree of dominance of the alternatives. By contrast, the method proposed in this paper not only retains the advantages of the COPRAS method but also integrates CPT into the decision-making process, which simulates the psychological and behavioral characteristics of DMs facing risks. In addition, we use the proposed K-L divergence distance measure to extend the D-CRITIC method so that it assigns objective attribute weights under GHFNs. Therefore, the proposed D-CRITIC-based method and GHF-CPT-COPRAS technique make the EDAV evaluation results more scientific. In addition,
Table 29 shows further details of the advantages of the various methods.