A Complex MCDM Procedure for the Assessment of Economic Development of Units at Different Government Levels

Abstract: Studies on the economic development of government units are among the key challenges for authorities at different levels and an issue often investigated by economists. In spite of considerable interest in the issue, there is no standard procedure for assessing the economic development level of units at different levels of government (national, regional, sub-regional). This assessment needs a complex system of methods and techniques applicable to various types of data, so adequate methods must be used at each level. This paper proposes a complex procedure for constructing a synthetic indicator with which units are assessed at different government levels. Each level (national, regional, and sub-regional) may be described with a particular type of variables. Data sets may include variables with a normal or near-normal distribution, strong asymmetry or extreme values. The objective of this paper is to present the potential behind the application of a complex Multi-Criteria Decision Making (MCDM) procedure based on the tail selection method used in Extreme Value Theory (EVT), i.e., the Mean Excess Function (MEF), together with one of the most popular MCDM methods, namely the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS), to assess the economic development level of units at different government levels. MEF helps to identify extreme values of variables and limit their impact on the ranking of local administrative units (LAUs). TOPSIS is suitable for ranking units described with a multidimensional data set. The study explored the use of two types of TOPSIS (classical and positional) depending on the type of variables. These approaches were used to assess the economic development level of LAUs at national, regional and sub-regional levels in Poland in 2017.


Introduction
Regional and local development are processes that overlap and condition each other. Local development can be considered a process which is similar to regional development in that it extends over economic, political, social, and cultural aspects, but takes place on a smaller scale. The source literature provides various definitions of local and regional development. In the American literature (e.g., [1,2]), local development is related to economic development, while in the European literature it is often called socio-economic development (e.g., [3]). Generally, local development means economic development. It should be noted that most definitions view development as positive changes. In contrast, the definition by Schumpeter [4] clearly differs from classical definitions of development, and considers it to be a transition between stages of an economic system which cannot be subdivided into infinitely small steps.
The remainder of this paper is structured as follows: Section 2 presents the methods and data used in the empirical study, Section 3 comprises the results of the research with their discussion, while the last two sections sum up the study and include conclusions and recommendations.

Materials and Methods
Each level of analysis (national, regional, sub-regional) can be described with variables of different types (i.e., ones which follow a normal, asymmetric or strongly asymmetric distribution). Different data types have different properties. For these reasons, this paper proposes a comprehensive multiway procedure for assessing the economic development level of local administrative units at national, regional and sub-regional levels. The construction of a synthetic measure is a multi-stage process which requires the researcher, at each stage, to make many decisions regarding aspects such as the method for selecting the variables to be studied, the normalization procedure and the method of aggregating values. These are important and often difficult tasks, especially if actual datasets include extreme values which may be due to the particularities of the complex phenomenon considered.
This paper proposes constructing a synthetic measure with the use of MCDM methods combined with MEF. Since the beginning of the 21st century, MCDM methods have gained in importance and have been developing very fast, boosted by progress in computer-aided calculations. MCDM methods are suitable for complex problems which are characterized by uncertainty and often include conflicting objectives and different data and information types in comprehensive, evolving socio-economic systems [13]. Generally, there are two groups of MCDM methods: Multi-Attribute Decision Making (MADM) and Multi-Objective Decision Making (MODM) [14]. The first one includes a finite number of alternatives. In the latter, the number of alternatives is infinite (cf. [15]). Usually, problems related to the selection and assessment of alternative solutions are finite, i.e., belong to the first group. In turn, as regards development-related problems, the decisive attribute may take any value from a specific interval. Therefore, the number of potential alternative solutions may be infinite [10]. Many MCDM methods exist, e.g., the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS), the Analytic Hierarchy Process (AHP), VIšekriterijumska optimizacija i KOmpromisno Rešenje (VIKOR), the Weighted Sum Method (WSM), ELimination Et Choix Traduisant la REalité (ELECTRE), and the Preference Ranking Organization METHod for Enrichment Evaluation (PROMETHEE) (see [16]). Modifications, enhancements and new approaches to MCDM continue to emerge around the world (see, e.g., [17]). With its broad range of versatile uses, TOPSIS stands apart from the other MCDM methods. Behzadian et al. [18] identified the following application areas: supply chain management and logistics, design, engineering and manufacturing systems, business and marketing management, health, safety and environment management, human resources management, energy management, chemical engineering, water resources management and other issues.
Different variants of TOPSIS have appeared: fuzzy TOPSIS based on fuzzy sets, hesitant fuzzy sets or intuitionistic fuzzy sets (see also [19][20][21][22][23]), interval type-2 fuzzy TOPSIS [24,25], extended intuitionistic fuzzy TOPSIS [26], fuzzy TOPSIS with emerging interval-valued spherical fuzzy sets [27], rough cloud TOPSIS [28], TOPSIS-Sort for sorting problems [29], and fuzzy rough number-based TOPSIS [30]. There are also many applications of TOPSIS combined with other MCDM methods: fuzzy DEMATEL and TOPSIS [31], AHP and fuzzy TOPSIS [32], fuzzy AHP and TOPSIS [33], fuzzy TOPSIS and fuzzy COPRAS [34]. We propose to combine TOPSIS with MEF, which makes it possible to identify extreme values of variables. The appropriate identification of extreme values is a demanding task which can be done using a wide variety of methods. This paper is a follow-up to the authors' previous research, which relied on the use of TOPSIS with MEF to identify extreme values and eliminate their impact on the synthetic measure used to assess the financial situation of Polish municipalities (see [35]).
The proposed multiway procedure includes four approaches based on TOPSIS [36] and MEF (see e.g., [37]) to assess the economic development level of LAUs at different government levels. This process includes the construction of a synthetic measure which enables a synthetic assessment of a phenomenon described by multiple variables. It consists of eight key stages (Figure 1).
The first stage of constructing a synthetic measure is to determine a hierarchical structure for the multi-criteria assessment of economic development levels of LAUs. This is done by decomposing the problem into the following components: main assessment criterion, sub-criteria, variables, and units to be assessed (e.g., LAUs) (Figure 2). The main assessment criterion is the primary standard for judging or assessing a complex phenomenon (first level). Sub-criteria are sets of factors related to the main criterion (second level). Each sub-criterion is described by a package of variables (third level). All variables describe each unit (fourth level).
Particular attention should be paid to selecting the variables. For this purpose, they need to be validated in substantive and statistical terms. Substantive validation precedes statistical validation. The substantive analysis may rely on expert opinions, the researcher's knowledge of the issue concerned, or guidelines developed as part of theories of economic phenomena (see [38]). Once the substantive validation is complete, descriptive statistics can be used in the statistical analysis, including e.g., the arithmetic mean, variance and correlation coefficient. The analysis of the inverse correlation matrix between values of variables $R^{-1} = [r_{ij}]$, where $r_{ij} \in (1, +\infty)$ ($i, j = 1, \ldots, K$), facilitates the elimination of variables strongly correlated with each other. If some variables are excessively correlated with the other ones, the corresponding elements on the main diagonal $r_{ii}$ are much greater than ten [39]. Excessively correlated variables are eliminated from the initial data set.
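The statistical screening based on the inverse correlation matrix can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names and the pure-Python Gauss-Jordan inversion are hypothetical choices made for this example, and the limit of ten follows the rule of thumb quoted from [39]. The diagonal elements of the inverse correlation matrix are the quantities commonly known as variance inflation factors.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long, non-constant columns."""
    centered = []
    for col in columns:
        m = sum(col) / len(col)
        centered.append([v - m for v in col])
    norms = [sum(v * v for v in col) ** 0.5 for col in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
             for j in range(k)] for i in range(k)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (intended for small matrices)."""
    k = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(k)] for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def excessively_correlated(columns, limit=10.0):
    """Indices of variables whose diagonal element of the inverse correlation matrix exceeds `limit`."""
    inverse = invert(correlation_matrix(columns))
    return [i for i in range(len(columns)) if inverse[i][i] > limit]
```

In practice the flagged variables would be removed one at a time, with the matrix recomputed after each removal, since dropping one variable changes the diagonal elements of the others.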
The variables describing the economic development level of local territorial units (i.e., districts) at country or voivodeship level usually include extreme values or they follow a strongly (or extremely strongly) asymmetrical distribution. That problem can be solved by analyzing the graph of the Mean Excess Function in order to identify the extreme values; and by using the positional TOPSIS method together with the Weber spatial median to restrict the impact of the strong asymmetry (cf. [35]).
In the second stage, the graph of the Mean Excess Function is used to identify the extreme values. The analysis of the MEF graph is one of the methods for identifying the threshold level which defines the beginning of the distribution tail modeled with the Peaks Over Threshold (POT) model. In the POT method (see e.g., [40]), the distribution tail of the variable $X_k$ (the $k$-th variable) is modeled based on the generalized Pareto distribution:

$$\hat{F}(x_k) = 1 - \frac{N_{ul_k}}{N}\left(1 + \hat{\xi}\,\frac{x_k - ul_k}{\hat{\beta}}\right)^{-1/\hat{\xi}}, \tag{1}$$

where $ul_k$ is the threshold value, $N$ is the number of observations (units, e.g., districts), $N_{ul_k}$ is the number of observations above the threshold value $ul_k$, $\hat{\xi}$ and $\hat{\beta}$ are estimates of the shape and scale parameters, respectively, and $\hat{F}(x_k)$ is a tail estimator of the distribution function of the variable $X_k$. The starting point for analyzing the MEF graph is the conditional expected value [37]:

$$e(ul_k) = E\left(X_k - ul_k \mid X_k > ul_k\right) = \frac{\beta(ul_k)}{1 - \xi}. \tag{2}$$

$\beta(ul_k)$ is linearly dependent on $ul_k$, and therefore the empirical estimator of the conditional expected value must also be linearly dependent on $ul_k$. Hence, the MEF graph:

$$\left\{\left(ul_k, \hat{e}(ul_k)\right)\right\}, \qquad \hat{e}(ul_k) = \frac{1}{N_{ul_k}} \sum_{i:\, x_{ik} > ul_k} \left(x_{ik} - ul_k\right), \tag{3}$$

where $x_{ik}$ is the value of the variable $k$ in the $i$-th unit, should be linear above $ul_k$. The method described above is used to identify the beginning of the right tail of the variable (extreme values in the right tail of the variable). For the left tail, the values of the variable are multiplied by minus one to calculate the threshold value $ll_k$. Following the identification of thresholds, the values of the variable located in the distribution tails are replaced by the corresponding threshold values. This transformation is referred to as winsorization.
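The empirical mean excess function and the winsorization step can be sketched in Python as follows. This is a minimal illustration rather than the procedure used in the study: the function names are hypothetical, every distinct observation except the largest is taken as a candidate threshold (an assumption), and the choice of the threshold itself remains a visual judgment on the resulting graph.

```python
def mean_excess(values, threshold):
    """Empirical mean excess: the average of (x - threshold) over observations above the threshold."""
    excesses = [x - threshold for x in values if x > threshold]
    if not excesses:
        raise ValueError("no observations above the threshold")
    return sum(excesses) / len(excesses)


def mef_points(values):
    """Points (u, e(u)) of the MEF graph for each distinct observation except the largest."""
    candidates = sorted(set(values))[:-1]
    return [(u, mean_excess(values, u)) for u in candidates]


def winsorize(values, lower=None, upper=None):
    """Replace values below `lower` or above `upper` with the respective threshold."""
    result = []
    for x in values:
        if lower is not None and x < lower:
            x = lower
        if upper is not None and x > upper:
            x = upper
        result.append(x)
    return result
```

For the left tail, the same graph can be drawn for the negated values, e.g. `mef_points([-x for x in values])`, and the threshold read from it is negated again before being passed as `lower`.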
The third stage consists in determining the nature of variables, and grouping them by whether they have a stimulating or destimulating effect. Variables with a destimulating effect need to be converted into ones with a stimulating effect using the negative coefficient transformation.
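In the fourth stage, the variables are normalized in order to make them mutually comparable. This consists in rescaling the variables and unifying their orders of magnitude. This study uses four different types of normalization of variables depending on data type:
• classical standardization: for all variables that follow a normal or near-normal distribution (approach I):

$$z_{ik} = \frac{x_{ik} - \bar{x}_k}{s_k}, \tag{4}$$

• classical standardization for winsorized data: for a set of variables which includes extreme values (approach II):

$$z_{ik} = \begin{cases} \dfrac{ll_k - \bar{x}_k}{s_k} & \text{if } x_{ik} \le ll_k, \\[2mm] \dfrac{x_{ik} - \bar{x}_k}{s_k} & \text{if } ll_k < x_{ik} < ul_k, \\[2mm] \dfrac{ul_k - \bar{x}_k}{s_k} & \text{if } x_{ik} \ge ul_k, \end{cases} \tag{5}$$

• classical median standardization based on the Weber spatial median: for a set of variables which includes variables that follow a strongly asymmetric distribution (approach III) [41,42]:

$$z_{ik} = \frac{x_{ik} - \mathrm{med}_k}{1.4826 \cdot \mathrm{mad}_k}, \tag{6}$$

• modified median standardization based on the Weber spatial median: for a set of variables which includes variables that have extreme values and follow a strongly asymmetric distribution (approach IV) [35]:

$$z_{ik} = \begin{cases} \dfrac{ll_k - \mathrm{med}_k}{1.4826 \cdot \mathrm{mad}_k} & \text{if } x_{ik} \le ll_k, \\[2mm] \dfrac{x_{ik} - \mathrm{med}_k}{1.4826 \cdot \mathrm{mad}_k} & \text{if } ll_k < x_{ik} < ul_k, \\[2mm] \dfrac{ul_k - \mathrm{med}_k}{1.4826 \cdot \mathrm{mad}_k} & \text{if } x_{ik} \ge ul_k. \end{cases} \tag{7}$$

In Formulas (4)-(7), $x_{ik}$ is the value of the variable $k$ ($k = 1, 2, \ldots, K$) in the $i$-th unit (e.g., LAU) ($i = 1, 2, \ldots, N$); $\bar{x}_k$ is the mean value of the variable $k$; $s_k$ is the standard deviation of the variable $k$; $z_{ik}$ is the normalized value of the variable $k$ in the $i$-th unit; $\mathrm{med}_k$ is the Weber median for the variable $k$; $\mathrm{mad}_k = \mathrm{med}_i \left| x_{ik} - \mathrm{med}_k \right|$ is the absolute median deviation for the variable $k$; and 1.4826 is a constant scaling coefficient which depends on the distribution of variables (see [41,42]). The calculations were performed with the robustX package in the R program [43].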
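The classical and the median-based standardization can be sketched in Python as below. The study itself used the robustX package in R; this sketch instead uses a plain Weiszfeld iteration to approximate the Weber spatial median, which is a swapped-in technique chosen for illustration. The function names, the centroid starting point, the iteration limit and the use of the population standard deviation are all assumptions. For approaches II and IV, the same functions would be applied to data that has already been winsorized.

```python
import math
from statistics import mean, median, pstdev


def classical_standardize(column):
    """One variable: (x - mean) / standard deviation.
    Assumption: population standard deviation; the source does not specify which one is used."""
    m, s = mean(column), pstdev(column)
    return [(x - m) / s for x in column]


def weber_median(rows, iterations=200, tol=1e-9):
    """Weiszfeld iteration for the point minimizing the sum of Euclidean distances to all rows."""
    k = len(rows[0])
    point = [mean(r[j] for r in rows) for j in range(k)]  # start at the centroid
    for _ in range(iterations):
        # Inverse-distance weights; the floor avoids division by zero at a data point.
        weights = [1.0 / max(math.dist(r, point), 1e-12) for r in rows]
        total = sum(weights)
        new = [sum(w * r[j] for w, r in zip(weights, rows)) / total for j in range(k)]
        if math.dist(new, point) < tol:
            return new
        point = new
    return point


def median_standardize(rows):
    """All variables at once: (x - Weber median coordinate) / (1.4826 * absolute median deviation)."""
    center = weber_median(rows)
    k = len(center)
    mads = [median(abs(r[j] - center[j]) for r in rows) for j in range(k)]
    return [[(r[j] - center[j]) / (1.4826 * mads[j]) for j in range(k)] for r in rows]
```

Because the Weber median is computed jointly for all variables, `median_standardize` takes the whole data matrix (one row per unit), whereas `classical_standardize` works on a single column.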
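The fifth stage consists in determining the positive ideal solution (PIS):

$$A^{+} = \left(\max_i z_{i1}, \max_i z_{i2}, \ldots, \max_i z_{iK}\right) = \left(z_1^{+}, z_2^{+}, \ldots, z_K^{+}\right) \tag{8}$$

and the negative ideal solution (NIS):

$$A^{-} = \left(\min_i z_{i1}, \min_i z_{i2}, \ldots, \min_i z_{iK}\right) = \left(z_1^{-}, z_2^{-}, \ldots, z_K^{-}\right). \tag{9}$$

Next, in the sixth stage, L1 (Manhattan) distances are calculated between each multi-attribute object and the PIS:

$$d_i^{+} = \sum_{k=1}^{K} \left| z_{ik} - z_k^{+} \right| \tag{10}$$

and between each multi-attribute object and the NIS:

$$d_i^{-} = \sum_{k=1}^{K} \left| z_{ik} - z_k^{-} \right|. \tag{11}$$

In the seventh stage, the aggregation formula proposed by Hwang and Yoon [36] is used to construct the synthetic measure of development:

$$S_i = \frac{d_i^{-}}{d_i^{+} + d_i^{-}}, \qquad i = 1, 2, \ldots, N. \tag{12}$$

The synthetic variable $S_i$ ranges from 0 to 1 in value. The higher the value of the synthetic measure of development, the higher the level of economic development.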
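Stages five to seven can be sketched in a few lines of Python. The sketch assumes that the input matrix already holds normalized values and that all variables have been converted to a stimulating effect; the function name is hypothetical, and no variable weights are used because the text does not mention any.

```python
def topsis_scores(z):
    """Synthetic measure for each row of the normalized matrix `z` (one row per unit)."""
    k = len(z[0])
    pis = [max(row[j] for row in z) for j in range(k)]  # positive ideal solution
    nis = [min(row[j] for row in z) for j in range(k)]  # negative ideal solution
    scores = []
    for row in z:
        d_plus = sum(abs(v - p) for v, p in zip(row, pis))   # L1 distance to the PIS
        d_minus = sum(abs(v - n) for v, n in zip(row, nis))  # L1 distance to the NIS
        scores.append(d_minus / (d_plus + d_minus))
    return scores


# Three units described by two variables: the first equals the PIS, the second the NIS.
print(topsis_scores([[1, 1], [0, 0], [1, 0]]))  # [1.0, 0.0, 0.5]
```

If all units have identical values on every variable, both distances are zero and the ratio is undefined; such a data set would have to be handled separately.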
In the eighth stage, the calculated values of the synthetic measure of development are used to linearly order the LAUs and to classify them typologically. The entire range of the synthetic variable may be divided into classes by defining numeric intervals for the measure $S_i$ (Table 1).
Table 1. Levels of economic development distinguished by intervals of the measure $S_i$: very high, high, medium-high, medium-low, low, very low. Source: own adjustment based on Wysocki [44].
The classes of economic development level for districts were evaluated with a measure of homogeneity. The total cluster homogeneity measure $H_O$, based on the idea of Hubert and Levin (cf. [45]), was proposed by Łuczak and Just [35]. In this measure, $N_c$ is the number of objects in the $c$-th class ($c = 1, \ldots, C$); $P_c$ is the set of subscripts of objects in the $c$-th class; and $r$ is the number of non-empty classes. The lower the value of the measure $H_O$, the more homogeneous the classes are.
Phenomena involved in the economic development level of territorial units are described using three fundamental dimensions: units covered by the study (e.g., districts); diagnostic variables used to describe a phenomenon related to development processes; and time. The proposed multiway procedure was used to assess the level of economic development of LAUs (districts) at national (N = 314 districts in Poland), regional (N = 31 districts in the Wielkopolskie voivodeship) and sub-regional (N = 8 districts in the Pleszew sub-region: the Pleszew district and neighboring districts) levels in Poland in 2017. The study was based on 2017 statistical data obtained from the Central Statistical Office of Poland [12].
Variables were selected based on substantive criteria to represent public finance, technical infrastructure, social infrastructure, and economy. Eight variables were selected based on a substantive and statistical analysis, as follows: $x_1$, own revenue of municipal budgets in the district (PLN per capita); $x_2$, population per public pharmacy; $x_3$, share of population served by water supply in the total population (%); $x_4$, share of population served by a sewerage network in the total population (%); $x_5$, entities of the national economy per 10,000 working-age population; $x_6$, investment outlays in enterprises per capita (in current prices) (PLN); $x_7$, average monthly gross wages and salaries (PLN); $x_8$, registered unemployment rate (%).
A five-year (2013-2017) average was calculated for the variable $x_1$, which represents own revenue of municipal budgets in the district. The distributions of variables for the districts at national and regional levels are shown in Figures 3 and 4.
Figure 4. Estimated densities of the variables at the regional level: (a) $x_1$, own revenue of municipal budgets in the district; (b) $x_2$, population per public pharmacy; (c) $x_3$, share of population served by water supply; (d) $x_4$, share of population served by a sewerage network; (e) $x_5$, entities of the national economy per 10,000 of population in working age; (f) $x_6$, investment outlays in enterprises per capita; (g) $x_7$, average monthly gross wages and salaries; (h) $x_8$, registered unemployment rate.

Results
This study proposed and used a procedure for the assessment of the economic development level of LAUs at national, regional and sub-regional levels in Poland in 2017. The analysis assumed that two variables ($x_2$ and $x_8$) have a destimulating effect while the others have a stimulating effect.
At national level, most variable distributions describing districts ($x_1$, $x_2$, $x_5$, $x_6$ and $x_7$) demonstrated strong or extremely strong positive skewness. Only one variable distribution ($x_3$) was found to demonstrate extremely strong negative skewness. These variables also had high positive kurtosis values, which means that extreme values are highly likely to appear (Table 2). The distribution analysis of these variables, based on the graphs of the empirical density function (Figure 3), corroborated these findings. On the basis of the Jarque-Bera test, it was found that the variables describing districts at national level were not normally distributed (at the significance level of 0.05). It should be noted that only one variable ($x_4$) followed a near-normal distribution (there was no reason to reject the hypothesis that it is normally distributed at the significance level of 0.01). As regards variables that demonstrated strong or extremely strong positive skewness, MEF graphs were used as a basis for determining the threshold values for the right tail of the distribution (or the left tail for the variable with extremely strong negative skewness) (Table 2).
At regional level, the skewness coefficient also indicated strong or extremely strong asymmetry in distribution for three variables ($x_1$, $x_2$ and $x_5$) attributed to districts (Figure 4, Table 3). As regards these variables, kurtosis also indicated that extreme values were more likely to emerge in the right tail of their distribution than in a normal distribution. Based on the analysis of measures of volatility (range, standard deviation) and density function graphs, these variables can be concluded to be less volatile than the corresponding variables describing districts at national level. In the case of these three variables attributed to districts at the regional level, there were only a few outliers.
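The distribution diagnostics used here (skewness, kurtosis and the Jarque-Bera test) can be sketched in Python as follows. The exact estimators used in the study are not stated, so this sketch assumes moment-based coefficients, reports excess kurtosis (zero for a normal distribution), and relies on the asymptotic chi-square distribution with two degrees of freedom, whose survival function is exp(-x/2). The function name is hypothetical.

```python
import math


def jarque_bera(values):
    """Return (skewness, excess kurtosis, JB statistic, asymptotic p-value)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - m) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - m) ** 4 for x in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # chi-square survival function with 2 degrees of freedom
    return skewness, excess_kurtosis, jb, p_value
```

A p-value below the chosen significance level (0.05 in the text) leads to rejecting the hypothesis of normality. The asymptotic approximation is rough for very small samples such as the eight districts at sub-regional level.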
Considering the descriptive statistics of variables attributed to districts at regional and sub-regional levels (Tables 3 and 4), and the relatively small number of districts at these levels, this study did not identify extreme values based on the corresponding MEF graphs.
Table 4. Descriptive statistics of the variables of LAUs at the sub-regional level.
Afterwards, winsorization was performed to replace tail values with the calculated threshold values for districts at national level (Table 2). Once transformed, the variable distributions exhibited moderate skewness and smaller kurtosis than the non-winsorized variables (Table 2).
This study used and compared four approaches at national, regional, and sub-regional levels (Tables 5-8). In approach I at national level, LAUs are primarily concentrated in one class (nearly 88% of the total number of LAUs), representing a low level (Table 5). This results from extreme values and a strong asymmetry of variables. The positional approach (III), which should restrict the impact of strong asymmetry through positional standardization, did not improve the classification. The results obtained with winsorized variables (approach II) were much better. Reducing the impact of extreme values through data winsorization enabled a more adequate attribution of LAUs to classes. In turn, approach IV, while restricting the impact of extreme values through data winsorization, also reduced the impact of strong asymmetry by using the positional standardization formula. This approach gave a result similar to approach II, i.e., the substantively best classification, which reflected the differences in levels of economic development between LAU classes. As a consequence, it provided a more complete and more correct identification of economic development types. The similarity of the results of approaches II and IV was due to the lack of significant asymmetries in the central part of the distributions of variables describing districts.

At the regional level, approaches I and III gave results similar to those at national level (Table 6). In the classical approach I, LAUs are mainly concentrated in two classes, at medium-low (about 38.7% of LAUs) and low (51.6% of LAUs) levels. Winsorization could not be used because of the relatively small number of districts at regional level. No extreme values were identified at sub-regional level. At that level of analysis, only approaches I and III were used (Table 7). In approach I, four classes were distinguished, and in approach III, only three classes.
Furthermore, Table 8 presents the descriptive statistics for synthetic measures of economic development level of LAUs by approaches and analysis levels.
In the final ranking, objects described by variables with atypical values may have an excessively high or low rank. Outlier-robust approaches (approaches II, III and IV) were proposed as a way of solving these problems. As shown by the empirical research, the approaches with winsorized data (approaches II and IV) proposed in this paper improve the rankings if the variables include extreme values.
Values of the homogeneity measure for each approach are shown in Figure 5. At national level, classifications based on a data set including variables with extreme values gave more homogeneous classes for the robust approaches (II and IV). In turn, the positional approach (III) led to more homogeneous classes at regional level, because the data set includes asymmetric variables. At sub-regional level, however, classification based on data sets including variables with a normal or near-normal distribution provided more homogeneous classes for the classical approach (I). It should be noted that the more classes are distinguished, the more homogeneous they are. Approaches II and IV were not appropriate for the data at regional and sub-regional levels, so the values of the synthetic measures and of the homogeneity measure were not calculated for them.
Figure 5. Values of the homogeneity measure for approaches I-IV at national, regional and sub-regional levels.
In summary, the study also revealed advantages and disadvantages of the methods used (Table 9). The disadvantages listed there include a partial loss of information caused by the winsorization of data and a much more complicated procedure than classical TOPSIS.

Conclusions
The proposed multiway procedure for a multidimensional analysis of the economic development levels of LAUs was implemented with the use of a complex MCDM procedure. A critical analysis of four different approaches was presented. The complex approaches (approach II and approach IV) are original proposals that reduce the impact of strong asymmetry and extreme values of variables attributed to objects (LAUs at national level). These approaches use the MEF to identify extreme values and set up the positive and negative ideal solutions. The occurrence of even one extreme (very large or very small) value for an LAU can significantly affect its final position in the ranking, resulting in it being ranked too high or too low. This is evidenced by the use of the classical and positional TOPSIS methods (approach I and approach III). The main advantage of the proposed hybrid positional approach (IV) is that, unlike the other approaches considered, it can be applied to variables with extreme values and strongly asymmetric distributions. It was shown that the use of the other approaches considered can lead to wrong rankings and incorrect typologies of units. Approaches I-III are less suitable for the assessment of units described simultaneously by variables with extreme values and strongly asymmetric distributions, as they distort the ranking.
At national level, extreme values were identified, and the positional TOPSIS based on winsorized data yielded a more accurate typological classification of districts in Poland in 2017 in terms of the level of economic development. Reducing the impact of extreme values through winsorization of data enabled a more adequate attribution of objects to classes, and this typological classification was in line with expectations. At regional and sub-regional levels, this study did not identify extreme values, and only the classical and positional methods could be used.

Recommendations
The analysis of the economic development level of local administrative units (i.e., districts) at different (national, regional and sub-regional) levels involves different data types. If the data set includes atypical values of variables, the use of classical MCDM methods (e.g., TOPSIS) may excessively reduce the range of variation of the synthetic measure. As a consequence, it may become problematic to properly identify the development types of the complex phenomenon under consideration. The reason for these problems is that empty classes may appear if the types of development are identified in an arbitrary manner.
The authors recommend using:
• The hybrid positional approach: analyzing the MEF graphs and using the positional TOPSIS method with winsorized data, if the variables follow an asymmetrical distribution and include extreme values.
• The hybrid approach: analyzing the MEF graphs and using the classical TOPSIS method with winsorized data, if the distribution of the variables includes extreme values.
• The positional approach: using the positional TOPSIS method with positional standardization based on the Weber spatial median, if the variables follow an asymmetric distribution without extreme values.
• The classical approach: using the classical TOPSIS method based on classical standardization, if all variables follow a normal or near-normal distribution.
It should be noted that the choice of the most appropriate approach mainly depends on the distribution of variables.
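The recommendations above amount to a simple decision rule, which can be written down as a short Python sketch. The function name and the two boolean flags are hypothetical; how extreme values and asymmetry are detected (e.g., from MEF graphs and skewness coefficients) is left to the analyst. The approach numbers follow the numbering used in the description of the normalization formulas.

```python
def recommend_approach(has_extreme_values, is_asymmetric):
    """Map the properties of the data set to the recommended approach (I-IV)."""
    if has_extreme_values and is_asymmetric:
        return "IV"   # hybrid positional: MEF, winsorization, positional TOPSIS
    if has_extreme_values:
        return "II"   # hybrid: MEF, winsorization, classical TOPSIS
    if is_asymmetric:
        return "III"  # positional TOPSIS with Weber-median standardization
    return "I"        # classical TOPSIS with classical standardization
```

For example, a data set with strongly skewed variables and identified extreme values, such as the national-level data in this study, maps to approach IV.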
Moreover, the proposed procedure for assessing the economic development levels of LAUs is a universal technique that may be used for other administrative units. The multiway procedure discussed in this paper may also provide a basis for development strategies and other strategic documents (e.g., financial plans, investment plans, operating plans).