A Hybrid ANP Method for Evaluation of Government Data Sustainability

Abstract: The evaluation of government data sustainability is a multicriteria decision making problem. The analytic network process (ANP) is among the most popular methods for determining the weights of criteria, but it is limited by the non-convergence problem. This paper proposes a hybrid ANP (H-ANP) method, which aims to improve the ANP by combining it with the weights obtained from the analytic hierarchy process (AHP). The proposed method is proved to be convergent since the network of the H-ANP is strongly connected. According to the simulation experiments, H-ANP is more robust than ANP under different parameter settings. It also shows a higher Kendall correlation and a lower MSE with respect to AHP, compared with the existing method (e.g., the averagely connected ANP method). An empirical example is also provided, which uses H-ANP to evaluate the government data sustainability of a city.


Introduction
Government agencies generate, acquire, and preserve a large amount of data in fulfilling their administrative duties every day. The data are valuable for the whole society in improving the transparency of government work and promoting sustainable social development [1]. In ISO 37122:2019, data sustainability is regarded as "the capabilities that data shall be verifiable, auditable, trustworthy and justified". Improving the sustainability of government data has become an important issue. The evaluation of government data sustainability is also deemed an effective way to increase the performance of the public sector [2].
The evaluation of government data sustainability is a multicriteria decision making (MCDM) problem. Once the complete list of criteria is determined, weights are computed for each criterion; then the judgement with respect to the criteria is quantified and collected from multiple experts. The AHP [3][4][5][6][7] and the ANP [8][9][10][11] have been widely used to determine the weights of the criteria [12][13][14]. The AHP organizes the criteria in a multilevel hierarchy and uses ratio scales to derive relative priorities for elements at the same level by making pairwise comparisons [3]. As a generalization of the AHP, the ANP replaces the hierarchies with networks and makes it possible to network decisions that involve functional dependencies. By taking into consideration both inner (among elements in a cluster) and outer (among elements in different clusters) dependency to prioritize alternatives, the ANP provides more information and reduces the error from expert judgement, thus ensuring the accuracy of estimation [11].
In the ANP, the dependencies are organized in a matrix (known as the supermatrix), which is used to compute the priorities of the alternatives. The supermatrix is raised to powers until it converges to a limiting matrix, whose columns contain the priorities of the nodes. A limitation of the ANP is that the convergence assumption of the weighted supermatrix only holds under certain conditions [9]. It has been demonstrated that the supermatrix of the ANP converges when it is acyclic and irreducible. In dynamic system terms, this condition means that the eigenvalue $\lambda_{\max} = 1$ must be a simple root, where $\lambda$ is an eigenvalue of the supermatrix. When $\lambda_{\max} = 1$ is a multiple root, the convergence condition of the ANP does not hold: the influence network is reducible and the dynamical system is not Lyapunov stable. This situation is common, which restricts the application of the ANP. Mu et al. [15] found that many of the supermatrices in the studies they reviewed converge to a limit matrix of zeros. The situation is not rare in complex system evaluation scenarios such as data sustainability evaluation, which usually involve a large number of indexes and therefore easily produce a sparse supermatrix. Zammori and Gabbrielli [16] note the non-convergence problem and propose the averagely connected method, which adds weights to the zero entries of the supermatrix but does not take the differences among elements into consideration.
This paper attempts to deal with this problem by using the weight information obtained from the AHP, which provides the relative importance of criteria, to add connections to the supermatrix. A hybrid method, H-ANP, is proposed, in which the AHP provides direct comparison information between two criteria and the ANP provides indirect comparison information between the two through a third one ($W_{ij} \times W_{jk} = W_{ik}$). Based on theoretical proof and simulation experiments, the proposed method is shown to solve the non-convergence problem in complex system evaluation and to be robust in terms of convergence and accuracy. The contributions of this paper are the extension of the original ANP method to higher-order decision matrices and a theoretical solution to the non-convergence problem. Both issues are very common in data sustainability evaluation, which usually involves a large number of indicators. On the one hand, more indicators produce a sparse supermatrix, which leads to the non-convergence of the ANP. On the other hand, more indicators increase the number of pairwise comparisons among indicators. Without correction from outside information, the inaccuracy becomes more pronounced as the order of the evaluation matrix increases. The H-ANP method is therefore proposed to overcome these shortcomings in data sustainability evaluation.
The rest of the paper is structured as follows. Section 2 introduces the AHP and ANP methods. The related works on data sustainability evaluation are also presented. Section 3 details the mathematical equations and the H-ANP method, including the evidence by Markov stochastic process theory and dynamic system theory. Section 4 presents the simulation results with different settings of parameters, stochastic matrix orders, and matrix connectivity probabilities. Section 5 introduces an empirical example of government data sustainability evaluation to illustrate the proposed method. Section 6 concludes the paper and discusses the future work.

Related Works
This section introduces the AHP, ANP and hybrid MCDM methods, and also presents the related works on data sustainability.

Data Sustainability
In recent years, we have witnessed efforts to evaluate the quality of government data at various levels (e.g., country, city, etc.). For example, the Government Data Openness Survey evaluated 193 member countries of the United Nations by accessing government portals and data [17]. The Digital and Mobile Governance Laboratory (DMG) of Fudan University conducts the government open data evaluation of provinces and cities in China every six months [18]. Vetrò et al. [19] presented a framework of indicators to measure the quality of open government data on a series of data quality dimensions at the most granular level of measurement. In these evaluations, various frameworks have been proposed with different focuses. Viscusi et al. [20] evaluated the open government data compliance in terms of completeness, accuracy and timeliness. Vetrò et al. [19] developed a framework with seven dimensions: completeness, accuracy, traceability, timeliness, expiration, compliance, understandability. Donker and Loenen [21] developed a framework to assess open data supply, open data governance, and open data user characteristics holistically. For China, the China Open Data Index was proposed based on four dimensions: readiness, data, platform, utilization [18].
A wider, related concept is the data ecosystem, in which a number of actors interact with each other to exchange, produce and consume data. Such ecosystems provide an environment for creating, managing and sustaining data sharing initiatives [22]. The data ecosystem needs periodic monitoring with the participation of stakeholders with different interests. Furthermore, changes in technology, user practices, regulation, industrial networks, infrastructure, and symbolic meaning or culture need careful consideration. ISO 37122:2019 regards data sustainability as "the capabilities that data shall be verifiable, auditable, trustworthy and justified". Sustainable open government data emphasizes the realization of long-term continuous open government data and the reuse of data to generate political, economic and social value [23]. The sustainability of open government data means utilizing government data that has been continuously open to the public through the platform and maximizing the benefits of open government data [1].
The evaluation of sustainability generally includes the evaluation of processes, social effects, and their link to sustainability transition impacts [24]. For evaluating the performance on data sustainability, the OECD is well known for its handbook on constructing composite indicators and for the OUR-data Index, which evaluates open, useful and reusable data worldwide [25]. In the handbook, AHP is among the recommended methods for weighting the indexes. Chang et al. [26] introduced a modified indifference threshold-based attribute ratio analysis (ITARA) technique that can obtain the objective weights of the indexes and the preference ranking in the field of sustainable supplier evaluation and selection. Jiang et al. [1] adopted the Open Government Data assessment framework proposed by New York University, and evaluated the sustainability of open government data from the environment, data, use and impact dimensions with 18 indexes. The Decision-Making Trial and Evaluation Laboratory (DEMATEL) is a well-known weighting technique [27]. It assumes that the weighting results are decided only by the influential relationships. Moreover, in the DEMATEL equation $T = D(I - D)^{-1}$, the matrix $(I - D)$ is not necessarily invertible. Invertibility is guaranteed, for example, when $I - D$ is positive definite. Thus, when $\det(I - D) = 0$, an alternative method is needed. DANP combines the DEMATEL method and the ANP method to construct the evaluation model [28]. To address the deficiencies and gaps generally found in past studies on benchmarking intellectual capital (IC), an approach that integrates the analytic network process and the concept of thinking and non-thinking assets with the generic benchmarking procedure has been proposed [29]. It resolves the lack of consideration of relationships among past benchmarking concepts and the impacts of their managerial factors, and examines the wide range of elements and indicators of IC influencing the sustainable development of organizations.
To develop a practical framework and to assess the development sustainability of Karaj, Group Fuzzy BWM (GFBWM) and Analytic Hierarchy Process (AHP) methods are applied [30]. To identify the content of manufacturing strategy infrastructural decisions that attempt to integrate a sustainability and classical manufacturing strategy framework, triangular fuzzy numbers (TFNs) are used to elucidate the judgment of elements in pairwise comparison in the framework of the analytic network process (ANP) [31].

AHP, ANP and Hybrid MCDM
The AHP is a multicriteria decision making method that decomposes a decision into a hierarchy of sub-problems, each of which can be analyzed independently. First, it constructs the judgment matrices by pairwise comparison, and assigns a 1 to 9 scale (i.e., equal importance to extreme importance) or its reciprocal to represent the comparative weight $w_i / w_j$ of two alternatives. The scale table has been improved by a number of scholars [32][33][34][35][36], who address the inaccuracy of estimation with the linear scale. Second, it calculates the eigenvector to determine the exact weight of each alternative.
By calculating the eigenvector of the largest eigenvalue $\lambda_{\max}$, the real comparative weight of each alternative is obtained. Third, consistency validation is conducted to verify the consistency of the judgments of all the decision makers. The consistency ratio (C.R.) is defined as the ratio between the consistency index of the evaluation matrix and the average consistency index of randomly generated matrices of the same order. When the C.R. is below 10%, the estimation of the AHP is recognized as consistent. Otherwise, it is necessary to: (1) find the most inconsistent judgment in the matrix and (2) ask the judge to reconsider whether that judgment can be changed to a plausible value.
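The three AHP steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `ahp_weights` and the 3 × 3 judgment matrix are hypothetical, the consistency index is assumed to be $(\lambda_{\max} - n)/(n - 1)$, and the random index values are the commonly tabulated ones attributed to Saaty.

```python
import numpy as np

# Commonly tabulated random consistency index (RI) by matrix order n.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_weights(judgments):
    """Return (weights, lambda_max, consistency_ratio) for a reciprocal matrix."""
    a = np.asarray(judgments, dtype=float)
    n = a.shape[0]
    eigvals, eigvecs = np.linalg.eig(a)
    k = np.argmax(eigvals.real)            # index of the principal eigenvalue
    lam_max = eigvals[k].real
    w = np.abs(eigvecs[:, k].real)         # principal eigenvector, sign removed
    w = w / w.sum()                        # normalise so the weights sum to 1
    ci = (lam_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0        # orders 1 and 2 are always consistent
    return w, lam_max, cr

# Hypothetical judgment matrix on the 1-9 scale (illustrative values only).
example = [[1.0, 3.0, 5.0],
           [1 / 3, 1.0, 3.0],
           [1 / 5, 1 / 3, 1.0]]
weights, lam_max, cr = ahp_weights(example)
print(weights, lam_max, cr)
```

A judgment matrix whose `cr` is 0.1 or more would be sent back to the expert for revision, as described above.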
As a generalization of the AHP [9], the ANP can model complex decision problems where a hierarchical model is insufficient. It allows feedback and self-loop relationships among different indexes, which enables the method to solve a network decision-making problem [37]. The decision-making process involves three kinds of matrix calculation: (1) the unweighted supermatrix of column eigenvectors obtained from the pairwise comparison matrices of elements; (2) the weighted supermatrix, in which each block of column eigenvectors belonging to a cluster is weighted by the priority of influence of that cluster, rendering the weighted supermatrix column stochastic; (3) the limiting supermatrix obtained by raising the weighted supermatrix to high powers [38]. Since the Cesàro average always exists for a classical finite-state Markov chain, the transition probability matrix converges (possibly periodically) after infinitely many steps [39].
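The difference between plain powers and the Cesàro average can be seen on a small periodic example. The sketch below is illustrative only: the two-node supermatrix and the function names `limit_by_powers` and `cesaro_average` are hypothetical and not taken from the paper.

```python
import numpy as np

def limit_by_powers(w, steps=200):
    """Raise a column-stochastic matrix to a high power."""
    return np.linalg.matrix_power(np.asarray(w, dtype=float), steps)

def cesaro_average(w, steps=200):
    """Average W^1, ..., W^steps; this average exists even for periodic chains."""
    w = np.asarray(w, dtype=float)
    total = np.zeros_like(w)
    power = np.eye(w.shape[0])
    for _ in range(steps):
        power = power @ w
        total += power
    return total / steps

# Hypothetical weighted supermatrix: two nodes that only influence each other.
periodic = np.array([[0.0, 1.0],
                     [1.0, 0.0]])

even_power = limit_by_powers(periodic, 200)   # equals the identity matrix
odd_power = limit_by_powers(periodic, 201)    # equals the swap matrix again
average = cesaro_average(periodic, 200)       # every entry equals 0.5
print(even_power, odd_power, average, sep="\n")
```

The powers oscillate between two matrices and never settle, while the averaged sequence does; this is the "periodic" kind of convergence mentioned above.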
Although Saaty demonstrated the correctness of the ANP method by applying it to more than 100 case studies [11], the method has been questioned from several perspectives. According to Beasley [40], for non-negative irreducible square matrices, an attractive structure exists due to the Frobenius theorem (see Theorem 1), which is the premise of the ANP theory. When the matrix has a reducible structure, however, which is more common in practice, the Frobenius theorem does not apply. Sekitani and Takahashi [41] believe that Saaty treats the reducible and irreducible conditions as a whole (not necessarily irreducible). However, the results are neither perfect nor acceptable, because there is no satisfactory solution if the graph is not strongly connected. Orrin and Guoqing [42] cite an example to demonstrate that a decision-maker can form a disjointed supermatrix by following the ANP process. Zammori and Gabbrielli [16] provide a typical example to explain the fatal problem caused by a black hole in the ANP network (a weakly connected network). It is clearly demonstrated that, in some cases, the convergence results of the ANP are obtained from the zero entries. In this circumstance, the rating results contradict the actual weights [36]. As indicated by Adams and Saaty [43], the ANP method ignores the feedback connections between sub-networks. Kinoshita and Sugiura [44] suspect that the ANP method transforms the direct priorities into indirect priorities, which may affect the evaluation of the ultimate goal. Some negative effects of these problems have been observed. For example, in the work of Ozcan-Deniz and Zhu [45], the supermatrix does not converge and the results do not sum to one, which means that the matrix lacks stability.
Efforts have been made to deal with the non-convergence problem of the ANP. Saaty [3,5] introduces an additional "linking" comparison to obtain stable results. Zammori and Gabbrielli [16] address the problem by removing the goal cluster and adding a feedback relationship to connect the nodes. The patent of Adams and Saaty [43] describes an ANP-Version-2 algorithm that resolves the non-convergence problem by adding new relationships. Sekitani and Takahashi [41] develop a unified method based on the Frobenius min-max theorem. Azis [46] believes that a loop at the bottom level of the hierarchy is required for obtaining stable results. In addition, the entries in the first row and last column of the matrix must be non-zero. Sekitani and Takahashi [47] convert the convergence problem into an optimization problem by modifying the matrices $W$ and $S$ into stable $\hat{W}$ and $\hat{S}$ for the calculation of the weights, which proves effective but is difficult to implement. Zammori and Gabbrielli [16] also propose the averagely connected method, which adds weights to the zero entries of the supermatrix but does not take the differences among elements into consideration.
Recently, many hybrid MCDM methods have been developed, which can solve the ANP non-convergence problem to some degree (see Table 1). Two methods deserve particular attention. PROFUZANP is an interval estimation method based on fuzzy set theory and ANP theory [31]. Through confidence interval estimation, this method can avoid the discussion of convergence to some degree. However, it lacks a mathematical proof of the convergence of the end points of the interval, so the convergence problem is reduced rather than resolved. Fuzzy AHP is a method that elicits more estimates from the experts [48]. Its assumption is that the experts' opinions can be fully measured by using more finely scaled alternatives and then defuzzifying them. Our assumption, however, is that experts' opinions cannot be fully measured by providing wider initial comparison scales, since the defuzzification operation is still a human-based choice. The H-ANP, instead, starts the estimation from the AHP's approximate point and then corrects it with real correlation data from other experts or real data sources. H-ANP thus turns the problem from a human decision problem into a data-powered decision problem.

Table 1. Hybrid MCDM methods.

| Study | Problem | Method | Findings |
| --- | --- | --- | --- |
| [49] | Address the deficiencies and gaps generally found in past studies on benchmarking, and benchmark intellectual capital (IC) in the underdeveloped domain of logistics. | The proposed approach integrates the analytic network process and the concept of thinking and non-thinking assets with the generic benchmarking procedure. | Resolves the lack of consideration of relationships among past benchmarking concepts and the impacts of their managerial factors, and examines the wide range of elements and indicators of IC influencing the sustainable development of organizations. |
| Shanshan et al., 2018 [50] | Facility layout selection for a new aeronautic component assembly workshop. | A hybrid MCDM approach that employs the ANP and the technique for order preference by similarity to an ideal solution (TOPSIS) to rank the optimal facility layout alternatives. | To illustrate the advantage of the proposed approach, the differences between the ANP-TOPSIS and AHP-TOPSIS methods are compared and discussed. Results demonstrate the effectiveness and feasibility of the proposed method. |
| Wudhikarn et al., 2015 [51] | Handle the uncertainty inherent in the input data. Select among newly developed roof formulas by considering the uncertainty and interrelation among decision criteria and elements as well as alternatives. | An improved process that considers uncertainty by using Monte Carlo analysis, with the input values then applied to the ANP procedures. | The results of the improved method differ from the rankings produced by the original ANP. The observed dissimilarities mainly result from the consideration of uncertainty. |
| Wudhikarn et al., 2015 [52] | Account for the tradeoff issues among the criteria (quality, cost and green issue) in new green product selection processes. | Eight quality dimensions proposed by Garvin are used to manage the quality issue, and a life cycle costing (LCC) method is applied for the cost and green issue. The dependency among the criteria is therefore considered. | An optimal environmentally friendly product does not overcome the existing toxic product of the focused company. The environmental performance is necessarily balanced by the quality and cost capabilities. |
| Ozceylan et al., 2022 [53] | Evaluation and selection of the best device. | The fuzzy analytic hierarchy process (FAHP) is applied for assigning weights of the attributes, and the weighted aggregated sum product assessment (WASPAS) method is used to determine the most suitable alternative device for explosive and narcotics trace detection. | Three well-known devices in the market are evaluated and the best alternative is suggested. |
| Foroozesh et al., 2022 [30] | Develop a practical framework consisting of GIS and MCDM to assess the development sustainability of Karaj. | The criteria were weighted and prioritized using Group Fuzzy BWM (GFBWM) and Analytic Hierarchy Process (AHP) methods. Fuzzy logic and Weighted Linear Combination (WLC) methods in GIS were used to determine the sustainability of Karaj for urban development. | The socioeconomic criterion and the employment sub-criterion were the most important in the AHP and GFBWM methods. |
| Çelik and Akmermer, 2021 [54] | Selection of priority products to support the exporting potential of Turkey. | Provide forecasting about target markets based on qualitative and quantitative criteria by combining the fuzzy analytic hierarchy process (FAHP) and the technique for order preference by similarity to ideal solution (TOPSIS). | According to the FAHP results, the trade balance criterion has the most significant effect, while the distance criterion has the least effect on the decision problem of ranking the target countries. |
| Ocampo, 2018 [31] | Identify the content of manufacturing strategy infrastructural decisions that attempt to integrate a sustainability and classical manufacturing strategy framework, in the presence of firm size as a relevant component in decision-making. | Triangular fuzzy numbers (TFNs) are used to elucidate the judgment of elements in pairwise comparison in the framework of the analytic network process (ANP). | A novel approach that holistically captures the judgmental uncertainty of individual decision-makers and the uncertainty of group decision-making. |
| Ocampo and Clark, 2017 [55] | A unifying framework for formulating a manufacturing strategy which espouses sustainability with due consideration of the manufacturing's internal and external competitive functions. | The proposed framework integrates the features based on the classical theories of manufacturing strategy and the other features that must be considered to transform a firm's manufacturing strategy into a sustainable manufacturing strategy. | The framework serves as a guide for decision makers in identifying policies in various manufacturing decision areas that would comprise a sustainable manufacturing strategy. |
| Alyamani and Long, 2020 [48] | Rank the different project criteria based on relative importance and impact on sustainable projects. | Use the fuzzy analytic hierarchy process (FAHP) methodology, in which fuzzy numbers are utilized to realistically represent human judgment. | The most important criterion to consider in sustainable project selection is project cost, followed by novelty and uncertainty as the second and third most important criteria, respectively. |
| Dahooie et al., 2021 [56] | Allow for interaction between different decision makers, considering multiple and sometimes conflicting criteria. | Provide a framework to assess the NSD performance in the healthcare industry using multiple-criteria decision-making methods. | The proposed model consists of 17 different criteria that have been identified and finalized based on previous studies as well as experts' opinions. |
| Tabatabaee et al., 2021 [57] | Develop a risk assessment tool for BIM-based IBS projects and employ a hybrid, comprehensive and efficient method for model development. | The "Fuzzy Delphi Method" was employed to identify the critical risk factors, while "DEMATEL" and the "Parsimonious-fuzzy analytic network process" were employed for data analyses. | Developing a risk assessment tool for BIM-based IBS projects and employing a hybrid, comprehensive, and efficient method for model development. |
| Torbacki, 2021 [28] | Ranking of the proposed three groups of measures, seven dimensions and twenty criteria to be implemented in companies to ensure cybersecurity in Industry 4.0 and facilitate the implementation of the sustainable production principles. | Using the combined DEMATEL and ANP (DANP) and PROMETHEE II methods. | Achieving the Sustainable Development goals, reducing the carbon footprint of companies and introducing circular economy elements were also indicated. |
| Hosseini et al., 2021 [58] | Assess the urban heritage of central districts in Tehran with an emphasis on tourism risk as a real case study. | Fourteen criteria developed on the basis of the fuzzy decision-making trial and evaluation laboratory (FDEMATEL) method are adopted to construct the fuzzy influential network relation map and find the fuzzy influential weights; the hybrid modified fuzzy VIKOR method is adopted to evaluate and reduce the tourism risk, closing the gap towards zero. | According to the model and the results of the risk assessment in tourism, this method is a reasonable solution for assessment and risk analysis in real-world problems. The proposed method can be a useful tool for managers in tourism and urban planning. |
| Gupta and Jayant, 2021 [59] | A novel hybrid framework is proposed, which can provide sound support for implementations of LCSCM practices by effective evaluation of the concerned criteria. | A novel hybrid MCDM model involving the decision making trial and evaluation laboratory (DEMATEL), the analytic network process (ANP) and the technique for order performance by similarity to ideal solution (TOPSIS), followed by fuzzy methodologies, is developed for the evaluation and selection of low carbon suppliers. | The best alternative under the novel hybrid MCDM approach for evaluating low carbon suppliers is the one with the greatest final performance index, with a value of 0.2350, corresponding to supplier (T3). |
| Ortiz-Barrios et al., 2020 [60] | Supplier selection. | FAHP and FDEMATEL are combined to obtain the final contributions of both criteria and sub-criteria on the basis of interrelations and uncertainty. | The results show that the quality criterion is the most crucial aspect when selecting suppliers of forklift filters. |

Method
This section firstly analyzes the convergence problem of the ANP and then presents the proposed H-ANP method, with the mathematical proofs and the implementation algorithm.

The Convergence Problem of ANP
The supermatrix possesses a steady-state distribution only under certain conditions, namely that the network is strongly connected. PageRank faces the same problem in its implementation, i.e., the rank sinks and cycles problem [61]. In the simple example of Figure 1, the dangling node 3 is a rank sink. The ANP also faces these two problems, which cause non-convergence. The modified PageRank formula (see Equation (1)) adjusts the weighting mechanism, which is also helpful for solving the non-convergence problem of the ANP.
The personalization vector $v^T$ is a probability vector [61]. $G$ is primitive, which means that a unique stationary vector for the Markov chain exists; this is the PageRank vector. $S$ is the transition probability matrix obtained using the traditional PageRank method; it is a row stochastic matrix. $\alpha$ denotes a parameter in the range $[0, 1)$.
The modified power method is expressed in Equation (2).
This is a hint for solving the non-convergence problem in the ANP.
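The effect of the damping term can be illustrated with a minimal sketch. It assumes that Equations (1) and (2) take the standard form from the PageRank literature [61], $G = \alpha S + (1 - \alpha) e v^T$ and $x^{(k+1)T} = \alpha x^{(k)T} S + (1 - \alpha) v^T$; the equations themselves are not reproduced above, so this form is an assumption. The three-node matrix, in which node 3 acts as a rank sink, and the function name `pagerank` are hypothetical.

```python
import numpy as np

def pagerank(s, v, alpha=0.85, tol=1e-12, max_iter=10_000):
    """Modified power method: x <- alpha * x S + (1 - alpha) * v (row vectors)."""
    s = np.asarray(s, dtype=float)
    v = np.asarray(v, dtype=float)
    x = v.copy()
    for _ in range(max_iter):
        x_next = alpha * (x @ s) + (1 - alpha) * v   # uses x e = 1
        if np.abs(x_next - x).sum() < tol:
            return x_next
        x = x_next
    raise RuntimeError("power method did not converge")

# Hypothetical row-stochastic matrix: node 3 only links to itself (rank sink).
s = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0]])
v = np.full(3, 1 / 3)                      # uniform personalization vector

sink = v @ np.linalg.matrix_power(s, 200)  # undamped chain: the sink absorbs all weight
rank = pagerank(s, v, alpha=0.85)          # damped chain: every node keeps some weight
print(sink, rank)
```

Without damping, the sink absorbs all the weight and nodes 1 and 2 end at zero; with damping, the stationary vector is strictly positive.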

H-ANP
To make the transition matrix positive (all entries greater than zero) by adding extra relationships to the supermatrix, an option besides the averagely connected method (AC-ANP) is to use the vector $v^T$ obtained from the AHP method. The unified AHP/ANP equation is given in Equation (3a). This is a hybrid method, in which the AHP provides direct comparison information between two indexes, and the ANP provides indirect comparison information. The H-ANP method degenerates to the AHP when $\alpha = 0$.
where $A$ denotes the transition probability matrix, which is column stochastic. $D$ denotes the impact matrix, which can be determined through indirect comparison [11] or other statistical methods based on real data, provided that $D$ is row stochastic. Thus, $D^T = [d_{ji}]$ is a column stochastic matrix, and $v$ is the AHP eigenvector. When $\alpha = 0$, the result reduces to the AHP (see Equation (6)), which places more weight on expert judgment. As $\alpha$ approaches 1, the result more closely resembles the ANP. The limit exists if and only if $\alpha \in [0, 1)$ (see Equation (7)). $w$ denotes the eigenvector. Equation (5) provides a simple data representation of the supermatrix $D$. The priorities derived from pairwise comparison matrices are entered as parts of the columns of $D$. The supermatrix represents the influence priority of an element on the left of the matrix on an element at the top of the matrix with respect to a particular control criterion. A supermatrix along with an example of one of its general entry matrices is shown in Figure 3. The component $C_1$ in the supermatrix includes all the priority vectors derived for nodes that are "parent" nodes in the $C_1$ cluster. The entry from $C_4$ to $C_2$ indicates the outer dependence of elements in $C_2$ on the elements in $C_4$ with respect to a common property. Let $A$ be the stochastic matrix for which we wish to obtain $f(A) = A^{\infty}$, and let $a_i$ denote the $i$-th component of its principal eigenvector. We have
$$\max_i \sum_{j=1}^{n} a_{ij} \ge \sum_{j=1}^{n} a_{ij} \frac{a_j}{a_i} = \lambda_{\max} \quad \text{for } \max a_i,$$
$$\min_i \sum_{j=1}^{n} a_{ij} \le \sum_{j=1}^{n} a_{ij} \frac{a_j}{a_i} = \lambda_{\max} \quad \text{for } \min a_i.$$
In addition, for a row stochastic matrix, $1 = \min_i \sum_{j=1}^{n} a_{ij} \le \lambda_{\max} \le \max_i \sum_{j=1}^{n} a_{ij} = 1$; thus $\lambda_{\max} = 1$. When $\lambda = 1$, Equation (3b) simplifies to $AW = W$. According to numerical linear algebra, the limit can be progressively approximated by the power method, which can be interpreted as follows: the weight is transferred along the network with each additional step.
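The properties stated above can be checked on one plausible form of the hybrid matrix. The derivation below is a sketch under an assumption: it supposes that Equation (3a), which is not reproduced in the text, has the PageRank-style form $A = \alpha D^T + (1 - \alpha) v e^T$, which is one reading of the symbol definitions. Here $e$ is the all-ones vector and $v$ is normalised so that $e^T v = 1$.

```latex
% Assumed form of the H-ANP matrix (not confirmed by the source):
A = \alpha D^{T} + (1-\alpha)\, v e^{T},
\qquad \alpha \in [0,1), \quad e = (1,\dots,1)^{T}, \quad e^{T} v = 1 .

% A is column stochastic, because D e = e (D is row stochastic):
e^{T} A = \alpha\, (D e)^{T} + (1-\alpha)\,(e^{T} v)\, e^{T}
        = \alpha\, e^{T} + (1-\alpha)\, e^{T} = e^{T} .

% If v > 0, every entry of A is positive, so A is irreducible and primitive:
a_{ij} = \alpha\, d_{ji} + (1-\alpha)\, v_{i} \;\ge\; (1-\alpha)\, v_{i} \;>\; 0 .

% For alpha = 0 the method reduces to the AHP: every column of A^k equals v.
\alpha = 0 \;\Rightarrow\; A = v e^{T}, \qquad
A^{k} = v\,(e^{T} v)^{k-1} e^{T} = v e^{T} \quad (k \ge 1).
```

Under this assumed form, the positivity of $A$ follows from $v > 0$ and $\alpha < 1$ alone, regardless of how sparse $D$ is.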
To prove the convergence of Equation (7), the Perron-Frobenius theorem (see Theorem 1 [62]) and Markov stochastic process theory [63] are used.

Proof of Equation (7). Since $A$ is an irreducible matrix, as a finite-state Markov chain, it is recurrent. The set $\{n \ge 1 : p^{n}_{ii} > 0\}$ has greatest common divisor 1, so $d(i) = 1$ and every state of $A$ is acyclic. Every state of $A$ is recurrent and acyclic, as a result of which every state is ergodic. For an irreducible ergodic Markov chain $A$, $\lim_{n \to \infty} p^{n}_{ij}$ exists and is independent of $i$. Furthermore, let $\pi_j = \lim_{n \to \infty} p^{n}_{ij}$; then $\pi_j$ is the unique non-negative solution of $\pi_j = \sum_i \pi_i p_{ij}$ with $\sum_j \pi_j = 1$.

The existence of the limit in the proof above is guaranteed by Theorem 1. To understand Theorem 1, the following notions are defined. A matrix $A_{n \times n}$ is reducible when there exists a permutation matrix $P$ such that $P^T A P = \begin{pmatrix} X & Y \\ 0 & Z \end{pmatrix}$, where $X$ and $Z$ are both square; otherwise $A$ is irreducible. A square matrix $A$ is irreducible if and only if its directed graph is strongly connected. That is, $A$ is irreducible if and only if, for each pair of indices $(i, j)$, there is a sequence of non-zero entries $a_{i k_1}, a_{k_1 k_2}, \dots, a_{k_t j}$ in $A$.
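The strong-connectivity criterion can be checked numerically. The following is a minimal sketch for small matrices, using the standard fact that a non-negative $n \times n$ matrix $A$ is irreducible exactly when $(I + A)^{n-1}$ is entrywise positive. The function name `is_irreducible` and both example matrices are hypothetical.

```python
import numpy as np

def is_irreducible(a):
    """True iff the directed graph of the non-negative matrix `a` is strongly connected.

    Uses the test (I + pattern(a))^(n-1) > 0; intended for small n only,
    because the entries of the power grow quickly.
    """
    a = np.asarray(a, dtype=float)
    n = a.shape[0]
    pattern = (a != 0).astype(float) + np.eye(n)   # zero/non-zero pattern plus self-loops
    reach = np.linalg.matrix_power(pattern, n - 1)
    return bool(np.all(reach > 0))

# Hypothetical network with a sink: node 3 never passes influence back.
with_sink = np.array([[0.0, 0.5, 0.5],
                      [0.5, 0.0, 0.5],
                      [0.0, 0.0, 1.0]])

# Hypothetical 3-cycle: every node can reach every other node.
cycle = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [1.0, 0.0, 0.0]])

print(is_irreducible(with_sink), is_irreducible(cycle))
```

Note that the 3-cycle is irreducible but periodic, so irreducibility alone does not make the plain powers converge; aperiodicity is needed as well.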
Theorem 1 (Perron-Frobenius Theorem, e.g., see [62]). If $A_{n \times n} \ge 0$ is irreducible, then each of the following is true.
1. $r = \rho(A) > 0$;
2. $r \in \sigma(A)$;
3. The algebraic multiplicity of $r$ is 1;
4. There exists an eigenvector $x > 0$ such that $Ax = rx$;
5. The Perron vector is the unique vector defined by $Ap = rp$, $p > 0$, $\|p\|_1 = 1$. In addition, except for positive multiples of $p$, there are no other non-negative eigenvectors for $A$, regardless of the eigenvalue;
6. It is unnecessary for $r$ to be the only eigenvalue on the spectral circle of $A$;
7. The Collatz-Wielandt formula holds: $r = \max_{x \in N} f(x)$, where $f(x) = \min_{1 \le i \le n,\, x_i \ne 0} [Ax]_i / x_i$ and $N = \{x \mid x \ge 0,\ x \ne 0\}$.
If $A_{n \times n} \ge 0$ is irreducible but imprimitive, there are $h > 1$ eigenvalues on the spectral circle. It can then be shown that each of these eigenvalues is simple and that they are distributed uniformly on the spectral circle, so that they are $r$ times the $h$-th roots of unity, with $r = \rho(A)$.
The proof above is general and theoretical; a second, more detailed proof is presented on the basis of dynamic system theory [64] and the stability theory proposed by Lyapunov [65].

Proof of Equation (7). Since
be a non-singular matrix with the eigenvector e as its first column. Let So e T y T = 0 and e T Y T = 1.
As for Equation (13), λ = 1 is the largest eigenvalue, as well as a simple root. According to Lyapunov's first method (the eigenvalue criterion), for a discrete-time linear time-invariant autonomous system, the necessary and sufficient condition for the equilibrium state at the origin, i.e., x = 0, to be stable in the sense of Lyapunov is that all eigenvalues of the transition matrix G have moduli no greater than 1, and that every eigenvalue with modulus equal to 1 is a simple root of the minimal polynomial of G. This criterion is demonstrated by De la Fuente [64]. The difference between Equation (3a) and the ANP is that the transition matrix of the ANP may possess more than one eigenvector for λ = 1.
Two examples are provided to illustrate the H-ANP method. We first take the black hole situation as an example.
If α = 0.15, the H-ANP matrix in this case is $A_1$. The final result of the power method is (0.5866, 0.3044, 0.1090). In contrast, the original ANP method gives the unexplainable result (0.0000, 0.3760, 0.6240).
The second example concerns a weakly connected graph.

Implementation
Algorithm 1 presents an implementation of the H-ANP, which includes 5 steps. The initial weights act as the influence. The matrix reflects the impact of element j on element i, where i, j = 1, 2, ..., n, and $w_{ij}$ stays the same within the same row. After the limit power is calculated, the steady-state distribution of weights is obtained.

1. Primitive weight calculation. The eigenvector with the largest eigenvalue is obtained. The consistency validation is performed.
2. Construction of the influence network. 1 and 0 are used to represent the relationship between each pair of elements.
3. Construction of matrix D, the supermatrix in the ANP.
4. Choice of the damping coefficient. The coefficient α indicates the extent to which the real relationship can affect the final result.
5. Power calculation. The limit power of the stochastic matrix A is calculated until a stable result is obtained.
The implementation of the H-ANP is depicted in Figure 4. When the H-ANP is used, additional influence is added to element i. After the steady state is reached, A is stable and represents the n-step transition probability. Multiplying the limiting matrix of A by the initial weights correctly describes the final real weight of each element.
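The primitive weight calculation and consistency validation of step 1 can be sketched as follows. This is a minimal illustration that assumes the usual AHP consistency ratio CR = CI / RI with Saaty's random index values; the function name `ahp_weights` and the 3 × 3 comparison matrix are hypothetical and not taken from the paper.

```python
import numpy as np

# Saaty's random consistency index (RI) for matrix orders 1 to 10.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def ahp_weights(H):
    """Return (weights, consistency_ratio) for a pairwise comparison matrix H."""
    H = np.asarray(H, dtype=float)
    n = H.shape[0]
    eigvals, eigvecs = np.linalg.eig(H)
    k = int(np.argmax(eigvals.real))         # index of the principal eigenvalue
    lam_max = eigvals[k].real
    w = np.abs(eigvecs[:, k].real)
    w = w / w.sum()                          # normalise so the weights sum to 1
    ci = (lam_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0
    return w, cr

# A perfectly consistent example: element 1 is twice element 2 and four times element 3.
H = [[1.0, 2.0, 4.0],
     [0.5, 1.0, 2.0],
     [0.25, 0.5, 1.0]]
weights, cr = ahp_weights(H)   # weights close to (4/7, 2/7, 1/7), cr close to 0
```

A consistency ratio below 0.1 is the threshold commonly used in AHP practice to accept the judgements.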

Algorithm 1 H-ANP
Input: An AHP matrix H; the coefficient α, the preference between AHP and ANP; a matrix D of pairwise impact.
Output: Weight vector P_n.
First, the matrix D is established, each entry of which is d_ij ∈ {0, 1}. Then, each entry of 1 in a row can be replaced with the real ratio obtained from other data sources or from the principal eigenvector generated from the AHP comparison, so that each row sums to 1, indicating the different impacts of d_i on {d_ij | d_ij = 1}. Finally, the matrix D is transposed into D^T, which is a column-stochastic matrix.
function H-ANP(H, α)
    v ← MAXEIGENVECTOR(H)
    k ← 0                          ▷ Record the number of iterations
    error ← 100,000                ▷ Initialize error to be large enough
    P_n ← v                        ▷ v can be replaced by any column vector that sums to 1
    A ← H-ANP matrix built from D^T, α and v e^T    ▷ e is a column vector of ones
    while error > 0.00000001 and k < 1000 do
        P_n1 ← A P_n
        error ← ||P_n1 − P_n||_∞
        P_n ← P_n1
        k ← k + 1
    end while
    return P_n
end function
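The power calculation of Algorithm 1 can be sketched in Python as follows. The exact form of the matrix A is not reproduced in this section, so the sketch assumes A = α v e^T + (1 − α) D^T, chosen so that α = 1 returns the AHP weights, in line with the statement in the Conclusions; this form, the function name `h_anp`, and the example data are assumptions rather than the authors' implementation.

```python
import numpy as np

def h_anp(v, D, alpha, tol=1e-8, max_iter=1000):
    """Power iteration on A = alpha * v e^T + (1 - alpha) * D^T (assumed form).

    v     : AHP weight vector (positive, sums to 1)
    D     : row-stochastic influence matrix
    alpha : preference for the AHP; alpha = 0 gives the plain ANP iteration
    Returns (weights, iterations, converged).
    """
    v = np.asarray(v, dtype=float)
    D = np.asarray(D, dtype=float)
    n = v.size
    A = alpha * np.outer(v, np.ones(n)) + (1.0 - alpha) * D.T
    p = v.copy()
    for k in range(1, max_iter + 1):
        p_next = A @ p
        error = np.max(np.abs(p_next - p))   # infinity norm of the update
        p = p_next
        if error < tol:
            return p, k, True
    return p, max_iter, False

v = np.array([0.5, 0.3, 0.2])            # illustrative AHP weights
D = np.array([[0.0, 1.0, 0.0],           # a 3-cycle: strongly connected but periodic
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])

w_hybrid, steps_hybrid, ok_hybrid = h_anp(v, D, alpha=0.15)
w_plain, steps_plain, ok_plain = h_anp(v, D, alpha=0.0)
```

On this periodic example the plain iteration (α = 0) keeps rotating the weights and never meets the tolerance, while the hybrid iteration settles on a single weight vector that sums to 1.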

Experiment Results
To demonstrate the effectiveness of the H-ANP, simulations are conducted on three sets of randomly generated instances as follows.
Set A: The goal is to examine how the number of convergence steps of the H-ANP and the ANP changes under different connectivity probabilities of the matrix D. Through a random generator and the mapping functions, a vector agent is constructed, which follows the steps of the AHP. A second agent is constructed to generate stochastic D matrices with different connectivity probabilities. These stochastic matrices are of order 35. Moreover, α is set to 0.85 as recommended [61]. For comparison, the experiment also involves the real 35-order matrix D obtained in the empirical example.
Set B: The most stable connectivity probability from Set A is chosen, with α set to 0.85. The goal is to observe how the convergence steps of the H-ANP and the ANP change with the matrix size.
Set C: The matrix size is set to 35, and the connectivity probability is set to the optimum obtained in Set A. The goal is to examine how α affects the convergence steps of the H-ANP and the ANP.
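An experiment of the Set A type can be sketched as below. This is a simplified illustration, not the authors' simulation code: it assumes the hybrid matrix has the form A = α v e^T + (1 − α) D^T, uses random positive weights as a stand-in for AHP weights, and gives every empty row of D one random link so that each row can be normalised. All function names are hypothetical.

```python
import numpy as np

def random_influence_matrix(n, prob, rng):
    """Random row-stochastic D whose links appear with probability `prob`.

    Assumption of this sketch: a row with no link receives one random link,
    so that every row can be normalised to sum to 1.
    """
    links = rng.random((n, n)) < prob
    for i in np.flatnonzero(~links.any(axis=1)):
        links[i, rng.integers(n)] = True
    D = links.astype(float)
    return D / D.sum(axis=1, keepdims=True)

def convergence_steps(v, D, alpha, tol=1e-8, max_iter=1000):
    """Steps until the update of p <- A p falls below tol (max_iter if it never does)."""
    A = alpha * np.outer(v, np.ones(v.size)) + (1.0 - alpha) * D.T
    p = v.copy()
    for k in range(1, max_iter + 1):
        p_next = A @ p
        if np.max(np.abs(p_next - p)) < tol:
            return k
        p = p_next
    return max_iter

rng = np.random.default_rng(0)
n = 35
v = rng.random(n)
v /= v.sum()                                   # stand-in for AHP weights
results = {}
for prob in (0.05, 0.1, 0.2, 0.5):
    D = random_influence_matrix(n, prob, rng)
    results[prob] = (convergence_steps(v, D, alpha=0.85),   # hybrid iteration
                     convergence_steps(v, D, alpha=0.0))    # plain ANP iteration
```

Each entry of `results` maps a connectivity probability to a pair (hybrid steps, plain ANP steps); a value of 1000 for the plain iteration means it did not converge within the step limit. Repeating the loop over many random draws gives the kind of distribution of convergence steps that the experiment examines.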
The computation is conducted on a 64-bit Windows desktop with 16 GB of RAM. All of the tests are confined to a single thread (Intel(R) Core(TM) i7-8750H CPU @ 2.20 GHz). The code is written in Python 3.7.4 using the PyCharm IDE (Professional 2019.2) and relies on the "decimal" module to raise the calculation precision to 100 decimal places.
Figure 5 and Table 2 show the results of experiment A. The number of convergence steps of the H-ANP (denoted as NO-ANP in the figure) is more stable than that of the ANP. When the 35-order matrix D of low connectivity probability is used, the ANP takes more than 1000 steps in all 100 runs, indicating that the ANP method is unfit for this situation, whereas the H-ANP produces stable results at low computational cost (see Figure 5a). As the connectivity probability of the matrix D increases, the ANP is found to be unstable when the connectivity probability ranges between 0 and 0.2 (see Figure 5b-i). The ANP fails to converge within 1000 steps in nearly 40% of the cases (see Table 2). Thus, the algorithm significantly improves the stability of the ANP.
Figure 6 shows the simulation results for Set B. The stability of both the ANP and the H-ANP (denoted as NO-ANP in the figure) is acceptable when the size of the stochastic matrix D changes. Thus, the solution is effective in the context of large-scale matrix calculation.

Example and Discussion
This section reports the evaluation of data sustainability of the local government in City X, which was conducted between June and July 2018. The process is summarized in Appendices A-C. The standards, policies, and projects were collected and processed using content analysis and axial coding. According to ISO 37122:2019, in which data sustainability is regarded as "the capabilities that data shall be verifiable, audit-able, trustworthy and justified", 6 groups of indexes were finally extracted, namely, management strategy ($U_1$), standards ($U_2$), data security ($U_3$), resource assurance ($U_4$), network security ($U_5$), and system security ($U_6$). In total, there are 35 indexes grouped into the 6 groups, as shown in Table 3. $U_1$ and $U_2$ belong to the justified dimension of data sustainability; the indicators in these two sets evaluate rule-based requirements and capabilities. $U_3$ and $U_5$ evaluate problems related to data trustworthiness, including data encapsulation. $U_4$ evaluates the verifiable dimension and shows to what degree data sustainability is accepted and supported by the stakeholders of interest. $U_6$ includes the indicators of the audit-able dimension, in which the digital signature and operation log are used to audit the operating environment of data processing and utilization. The relationship among the evaluation clusters is depicted in Figure 8.
Then a questionnaire was designed and used in an expert survey (n = 8) in January 2019. After repeated consultation and full feedback, consistent Delphi index results were obtained. A follow-up expert survey was conducted (n = 5) to collect expert opinions on the AHP weighting of the indexes. The experts are professors or associate professors, and one of them is the technical leader on data sustainability in City X. Among the experts, three specialize in information resource management, one in data sustainability, and one in information technology. The questions are all pairwise comparison questions among the 35 indexes. The geometric mean was used to combine the opinions of the different experts.
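Combining expert opinions by the geometric mean can be sketched as follows. The example is a minimal illustration with two hypothetical 2 × 2 comparison matrices; the function name `aggregate_judgements` is an assumption and the numbers are not survey data.

```python
import numpy as np

def aggregate_judgements(matrices):
    """Element-wise geometric mean of several pairwise comparison matrices."""
    stacked = np.asarray(matrices, dtype=float)
    return np.exp(np.log(stacked).mean(axis=0))

expert_1 = [[1.0, 2.0], [1 / 2, 1.0]]    # expert 1: index 1 is 2 times as important
expert_2 = [[1.0, 8.0], [1 / 8, 1.0]]    # expert 2: index 1 is 8 times as important
combined = aggregate_judgements([expert_1, expert_2])
# combined[0, 1] is sqrt(2 * 8) = 4 and combined[1, 0] is 1 / 4
```

Unlike the arithmetic mean, the element-wise geometric mean keeps the combined matrix reciprocal, i.e., the entry (j, i) remains the inverse of the entry (i, j).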
For the ANP, one expert in authority (n = 1) weighted the indexes using the ANP method. The indexes are built as a network structure (i.e., the supermatrix), which allows for inner dependency and outer dependency. The H-ANP method was then applied to calculate the index weights. The coefficient α is set to 0.85 as recommended by Langville and Meyer [61]. Although the ANP does not work for this case due to the convergence problem, the H-ANP provides a meaningful result. The matrix D was constructed based on the expert questionnaire (n = 1) and is visualized with the Gephi tool in Figure 9. Each node represents an evaluation index, each arrow indicates an influence relationship between indexes, and the number along each arrow denotes the influence degree determined by the expert. The size of a node denotes its in-degree. It can be seen that nodes E4, E14 and E17 have relatively large in-degrees. Compared with the AHP, these indexes obtain higher weights under the H-ANP. This process can be regarded as reducing the overlapping weights in the AHP.
To assess the effectiveness of the H-ANP, it was compared with the averagely connected ANP, as shown in Table 3. The ANP is not included since it fails to work for this case due to the matrix sparsity. The results of the two methods were compared with those of the AHP, which represent the expert opinions on the weights. The results of the H-ANP are closer to those of the AHP in terms of MSE (0.000018) and Kendall correlation coefficient (0.8319), which means that the H-ANP reflects the expert opinions better than the averagely connected ANP.
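The two comparison measures can be sketched as follows. The paper does not state which variant of the Kendall coefficient is used, so the sketch assumes the simple tau-a form without tie correction; the weight vectors are illustrative only and are not the values in Table 3.

```python
from itertools import combinations

def mse(x, y):
    """Mean squared error between two equally long weight vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs, no tie correction."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

ahp = [0.40, 0.30, 0.20, 0.10]         # illustrative reference weights
method_a = [0.38, 0.27, 0.25, 0.10]    # same ranking as the reference
method_b = [0.30, 0.40, 0.10, 0.20]    # two adjacent pairs swapped

tau_a = kendall_tau(ahp, method_a)     # 1.0: identical ranking
tau_b = kendall_tau(ahp, method_b)     # (4 - 2) / 6, about 0.333
```

A method whose weights give a higher Kendall coefficient and a lower MSE with respect to the AHP weights preserves both the ranking and the magnitudes of the expert judgements more closely.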
A sensitivity analysis with respect to the parameter α is conducted to check its impact on the final weights in this case. Figure 10 shows that among the 35 index weights only a few are sensitive to α, e.g., Operation plan and control (E4), Metadata (E13), Data exchange security (E14) and Finance and materials (E25).
Let $V = \{V_1, V_2, V_3, V_4, V_5\}$ be the judgment set, whose elements represent "Extensive poor", "Poor", "Average", "Good" and "Excellent", respectively. The matrices $R_i$ are obtained from the expert judgment generated by $U_i$ and $V_i$. The H-ANP results in Table 3 are segmented into small column vectors $A_i$ according to the group criteria of U. The operator (·, +) is used to perform the first- and second-grade fuzzy comprehensive evaluation, $B_i = A_i \times R_i$ and $B = A \times R$.
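The first-grade evaluation $B_i = A_i \times R_i$ with the operator (·, +) is a weighted average of the membership rows. The sketch below assumes that the final grade is chosen as the one with the largest membership; the weights and the membership matrix are illustrative values, not the data of City X, and the function name `fuzzy_evaluate` is hypothetical.

```python
import numpy as np

GRADES = ["Extensive poor", "Poor", "Average", "Good", "Excellent"]

def fuzzy_evaluate(weights, R):
    """Operator (., +): B = A x R, then pick the grade with the largest membership."""
    B = np.asarray(weights, dtype=float) @ np.asarray(R, dtype=float)
    return B, GRADES[int(np.argmax(B))]

A_i = [0.5, 0.3, 0.2]                      # illustrative index weights of one group
R_i = [[0.0, 0.0, 0.3, 0.7, 0.0],          # illustrative membership rows, one per index
       [0.0, 0.1, 0.9, 0.0, 0.0],
       [0.0, 0.0, 0.4, 0.5, 0.1]]
B_i, grade = fuzzy_evaluate(A_i, R_i)
# B_i = [0.00, 0.03, 0.50, 0.45, 0.02], so the grade is "Average"
```

For the second-grade evaluation, the vectors $B_i$ of all groups are stacked as the rows of R and the same function is applied with the group weights A.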
According to the results in Table 4, the data sustainability of City X is evaluated as "Extensive poor". "Standards" ($U_2$) and "Network security" ($U_5$) are evaluated as "Extensive poor", which reveals that the data sustainability of City X in these two aspects is unsatisfactory. As far as "Standards" is concerned, City X is evaluated as "Poor" in such indexes as "Storage period and appraisal disposal plan (E6)", "Electronic records filing process specification (E8)", "Datatype of storage and transfer plan (E9)", "Quality inspection (E10)", and "Metadata (E13)". To be specific, a storage period and appraisal disposal plan is lacking in the data center of City X ("the city brain"). The electronic records filing process specification is not well designed. The business systems of multiple units are not integrated, and the transfer of massive data is not supported. In addition, the existing data storage in City X cannot adequately support data analysis and management. The existing system is not able to automatically identify damaged, virus-infected, or confidential files and suspicious users, or to monitor the data sustainability of each link in real time. The metadata include few manageable items.
As far as "Network security" is concerned, City X is evaluated as "Poor" in such indexes as "Invasion detection (E28)", "Data flow cleaning (E29)", and "Single sign on (E30)". To be specific, the existing B/S architecture of the data center does not impose restrictions on data upload, task management and other service interfaces, and thus the data can be easily exposed. A control strategy for sensitive links is lacking, resulting in the potential leakage of sensitive information. Anti-tampering protection for Internet access to the application system is lacking, so illegal tampering cannot be detected in time. Data flow cleaning is not utilized to detect service attacks. The existing systems do not support single sign-on, which leads to security risks for user information. The evaluation results reveal the weaknesses of the data sustainability in City X and will help to improve the work efficiently.

Conclusions
To address the non-convergence problem of the ANP in weighting the indexes of data sustainability, this paper proposes a hybrid method, denoted as H-ANP, which combines the weight information obtained from the AHP. With the parameter α, the H-ANP tends towards the AHP or the ANP, and degenerates to the AHP when α = 1. The simulation results demonstrate that the method is more stable than the ANP in the context of low matrix connectivity probabilities. It is also shown that the H-ANP applies to large-scale matrix calculations, which is an advantage in evaluating complex systems with a sparse matrix D (e.g., data sustainability evaluation). Based on the theoretical proof and the simulation experiments, the proposed method solves the non-convergence problem in complex system evaluation and is robust in terms of convergence and accuracy. An empirical example on the data sustainability evaluation of a city has been conducted to illustrate the proposed method.
Government data sustainability evaluation is a society-scale problem that is more complex than traditional enterprise-scale decision making problems. To make the ANP applicable to more complex scenarios, we develop the H-ANP method, which overcomes the non-convergence problem for sparse and large supermatrices. Compared with the traditional ANP, the proposed H-ANP shows advantages in stability when the supermatrix D is sparse. It is useful not only for large-scale index weighting in data sustainability evaluation but also for other complex system evaluations. Our theoretical contribution to the literature is a clear analysis of the ANP non-convergence problem and of one of its solutions, the H-ANP, using dynamic system theory and discrete stochastic process theory. The non-convergence problem is evident in the ANP case. The proof can help prevent the non-convergence problem in ANP-related research, e.g., the point estimation DANP and the interval estimation FANP, which avoid discussing the problem since the non-convergence rate is low in their cases. We also give numerical experiments related to the supermatrix size. The results indicate that the H-ANP extends the application scenarios of the ANP to data-driven decision making with higher-order matrices and solves the non-convergence problem in theory.
In practice, the proposed method can solve real-life data sustainability evaluation problems, which usually contain more indicators than ordinary evaluation problems. A larger number of indicators causes the sparse supermatrix problem, which leads to the non-convergence of the ANP method, and to estimation inaccuracy caused by the limited scales as the number of pairwise comparisons among indicators increases. Without corrections from outside information, the inaccuracy and non-convergence problems become more pronounced in the data sustainability evaluation matrix. By adding a damping coefficient α to the equation constraints, we make the network of the supermatrix strongly connected, which overcomes the non-convergence problem. Moreover, the combination of the AHP and the ANP has the new property that it can adjust the propensity between the two methods and overcome the inaccuracy caused by the limited scales.
A limitation of this research is that the supermatrix D in the empirical study is constructed based on one expert's judgement, which was supposed to capture actual evaluation results from practice. Future work includes the exploitation of related available data and further improvement of the evaluation.

Funding: This research was funded by the major foundation of Renmin University of China "Construction and application of standardized collaborative management system for government big data governance and comprehensive utilization", grant number 21XNL019.

Informed Consent Statement: Not applicable.
Data Availability Statement: The supporting data are stored by Jicang Xu, Email: xujicang@ruc.edu.cn.

Conflicts of Interest: The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Abbreviations
The following abbreviations are used in this manuscript:

Appendix B. ANP Hybrid Evaluating Process
Algorithm 1 is applied in this process with α = 0.15; w is obtained through the AHP weighting process. The matrix D is given in Table A9.

Table A9. Matrix D in government data sustainability evaluation of City X.

… is obtained by weighted average according to the final convergent weight in the previous chapter, which will not be repeated below, and the evaluation matrix is established as follows:
$$\begin{pmatrix}
0.0000 & 0.0000 & 0.3000 & 0.7000 & 0.0000 \\
0.0000 & 0.1000 & 0.9000 & 0.0000 & 0.0000 \\
0.0000 & 0.0000 & 0.4000 & 0.5000 & 0.1000 \\
0.0000 & 0.0000 & 1.0000 & 0.0000 & 0.0000 \\
0.0000 & 0.0000 & 0.0000 & 0.0000 & 1.0000
\end{pmatrix}$$
The operator m(·, +) is used for the first-order fuzzy comprehensive evaluation.
$$\begin{pmatrix}
0.0000 & 0.1000 & 0.3000 & 0.6000 & 0.0000 \\
1.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\
0.0000 & 0.0000 & 1.0000 & 0.0000 & 0.0000 \\
0.0000 & 0.0000 & 0.0000 & 1.0000 & 0.0000 \\
0.0000 & 0.1000 & 0.6000 & 0.3000 & 0.0000 \\
0.0000 & 0.0000 & 1.0000 & 0.0000 & 0.0000 \\
0.0000 & 0.1000 & 0.9000 & 0.0000 & 0.0000 \\
0.0000 & 0.2000 & 0.4000 & 0.4000 & 0.0000 \\
0.0000 & 0.0000 & 0.6000 & 0.4000 & 0.0000
\end{pmatrix}$$
The operator m(·, +) is used for the first-order fuzzy comprehensive evaluation.
According to the principle of maximum membership, the government data sustainability assessment result of City X is as follows: dangerous.