Efficiency Analysis of Regional Innovation Development Based on DEA Malmquist Index

The aim of the work was to evaluate the dynamics of regional innovation development and compare the Russian regions according to their innovation efficiency, used resources, and achieved results. To estimate direct and indirect innovation effects, this study used data on Russian regions for 2006–2017 covering the following variables: the volume of innovative products, the share of high-tech products in the gross regional product (GRP) structure, the number of used patents, and investment in innovation activity. To obtain a representative sample, a cluster analysis was applied as a preliminary step, which made it possible to select a group of regions that were most advanced in terms of their innovative development. Output-oriented data envelopment analysis models were applied for Malmquist Productivity Index calculation. The obtained results indicate average growth of the total factor productivity of regional innovation development over time. Innovative development derives largely from economies of scale: the effectiveness of regional innovation systems is increasing mainly through broader resource bases rather than through their more effective utilization. The research findings can be applied to diagnose regional innovation effectiveness, justify public investment in research and development (R & D), and identify the priorities of regional innovation policy for specific regions.


Introduction
The crucial role of regions in generating innovation and national economic growth makes three issues highly topical: analyzing the quality of innovative growth, finding instruments for quantitatively measuring regional innovation effectiveness, and developing a methodology for an objective assessment of regional innovative development. Policy-making in the field of promoting innovations requires a better understanding of all the factors that influence the pace of innovative growth in a region, the causes of their success, and their effectiveness.
The aim of this research was to develop a methodology for studying innovative effects and to evaluate the dynamics of regional innovation development. The main research targets were to characterize the dynamics of innovative development in Russian regions and to identify the regions that, amid decreased resourcing, could demonstrate growth in the resulting indicators.
We expanded the set of output parameters in comparison with existing studies, differentiated the direct and indirect effects of the innovation process, and tried to draw attention to the complexity and multifactor nature of evaluating the innovative system effectiveness. The use of the data envelopment analysis (DEA) Malmquist Productivity Index enabled us to determine the dynamics of regional innovative development, to reach new conclusions regarding the development of Russian regions, and to identify effective regions, which (amid a decrease in resourcing) demonstrate an increase in the resulting indicators.
The novelty of the research lies in the fact that we developed the approaches of regional innovation efficiency assessment based on the direct and indirect effects of innovations, and we evaluated the structural changes and innovative activity dynamics of regions. The academic contributions of this research are the development of DEA instruments and Malmquist Productivity Index, the identification of trends, and the presentation of new essential findings concerning Russian regions development and the nature of innovative effects in Russian regions.
The research strategy consisted of two sequential stages: the pre-processing of regional data and the calculation of the DEA Malmquist Index figures to create region rankings. This strategy was chosen because of Russia's vast territory and the big gap between leading and lagging economic regions in the country. Therefore, we conducted a cluster analysis to select regions for this study. We often observed that the resources allocated for the development of innovations did not lead to significant structural shifts at the macroeconomic level, whereas in regions that succeeded in innovation, input indicators stayed unchanged or decreased while the results of the regions' performance dramatically improved. To evaluate these processes, the DEA model for the measurement of regional innovation systems' efficiency and the Malmquist Index were adopted. We also selected two important time periods for the study (2011-2014 and 2014-2017), as the pre-crisis and post-crisis dynamics were very important for the study findings.
The study tested the following hypotheses:
Hypothesis 1 (H1). The total factor productivity of regional innovation development indicates positive growth for the regions that are advanced in terms of innovative development.
Hypothesis 2 (H2). The gap between the levels of innovation development by region has increased over the selected time period.
Hypothesis 3 (H3). Scale factors have a major influence on regional innovation development.
To achieve the aim, the following tasks were accomplished:
• Methodological approaches for the analysis of the innovative development dynamics were determined, and new DEA-based instruments for its assessment were developed.
• Indicators characterizing the cost-effectiveness and results of innovative activities in the Russian regions were collected and analyzed.
• A cluster model for selecting regions relevant for the analysis of innovative development was built, and a research sample on the basis of Russian regional data was formed.
• Various models to evaluate the direct and indirect effects of innovation at the regional level were developed.
• The Malmquist Productivity Index was calculated to assess the dynamics of innovative development over three periods.
• The developed instruments to evaluate the dynamics of the development effectiveness of regional innovative systems in Russia from 2006 to 2017 were implemented.
The present work is organized as follows. The first section contains the main research question and research strategy. The second section discusses the theoretical framework of the research and literature review, approaches to the assessment of the regional innovative dynamics based on direct and indirect effects of innovations, and the description of variables and models used for assessing regional innovative growth. The third section presents the research methods. The fourth section illustrates the empirical results. The fifth section presents the discussion, and conclusions are presented in the sixth section.

Literature Review
Researchers of regional innovation systems have paid great attention to identifying factors and developing methodologies for evaluating the effectiveness of regional innovation development. The literature on regional innovative development has presented different studies aimed at measuring the effectiveness and the performance of regional innovative development. Researchers have evaluated the effectiveness of regional innovation systems by using both econometric and nonparametric methods for the estimation of production frontiers.
The factors that influence regional innovation activities, innovation performance, the geographic localization of knowledge, and R & D transfer have frequently been measured and widely analyzed with econometric methods based on data from various countries, as in the work by Griliches and Nilsson [1]. An important aspect of R & D activities in terms of policy implications is the diffusion of knowledge spillovers. The geographical scope of knowledge spillover as a spatial process was examined by Audretsch and Belitski [2], Feldman [3], and Boschma [4], all of whom substantiated the influence of geographical proximity on the effectiveness of regional innovation development. A set of different parameters has been applied to the measuring and modeling of innovation system efficiency [5]. To assess the level of innovative development, scholars have mostly applied, with some variations, the ratio of total patents granted in the region to R & D expenditure and R & D employees by using the models of spatial econometrics and knowledge production functions.
In contrast to regression analysis, which focuses on identifying averaged trends, data envelopment analysis (DEA) makes it possible to identify cases of best practice. DEA is a methodology for measuring the efficiency of different decision-making units, and it is often used in research on the functioning of economic systems. This approach was proposed and actively developed by Farrell and Fieldhouse [6], and Charnes and Cooper [7]. DEA as a measure of the relative efficiency of decision-making units has already been applied in research concerning regional innovation systems, where it provides estimation instruments for regional innovation activity.
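For orientation, a standard textbook form of the output-oriented DEA program under constant returns to scale is sketched below for an evaluated unit 0 among n decision-making units. The notation is an assumption introduced here for illustration (x_{ij} is input i of unit j, y_{rj} is output r of unit j, \lambda_j are intensity weights, \varphi is the output expansion factor); the studies reviewed in this section use their own variants.

```latex
\begin{aligned}
\max_{\varphi,\,\lambda}\quad & \varphi \\
\text{s.t.}\quad
& \sum_{j=1}^{n} \lambda_j x_{ij} \le x_{i0}, && i = 1,\dots,m, \\
& \sum_{j=1}^{n} \lambda_j y_{rj} \ge \varphi\, y_{r0}, && r = 1,\dots,s, \\
& \lambda_j \ge 0, && j = 1,\dots,n.
\end{aligned}
```

The optimal value satisfies \varphi^{*} \ge 1, and the efficiency score is usually reported as 1/\varphi^{*}, so a unit on the best-practice frontier scores 1. Adding the constraint \sum_{j} \lambda_j = 1 gives the variable-returns-to-scale version of the same program.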
The efficiency of regional innovation systems means obtaining the maximum innovative output from the amount of resources used (human capital and investments in R & D). In research dealing with regional innovation systems, approaches to DEA modeling have differed in the objects under consideration and in the sets of inputs and outputs. The decision-making units can include regional innovation systems, regional development programs, innovation development centers, universities, and even various projects within universities. Foddi and Usai [8] used DEA to analyze the efficiency of knowledge production in 29 regions of the European Union. Zabala-Iturriagagoitia [9] and Matei and Aldea [10] determined the factors that influence regional innovation system development programs in Europe and used DEA to assess their effectiveness. Stejskal, Nekolova, and Rouag [11] developed a methodology for assessing the efficiency of an innovation system based on the simplest algorithm for multi-criteria decision making, and they used it to analyze the regions of the Czech Republic. Chen and Guan [12] explored the relative efficiency of regional innovation systems in 31 Chinese provinces using stochastic frontier analysis, and then they analyzed the factors that determined the effectiveness of regional innovation systems. Multi-stage DEA models take the time factor into account [13,14] and allow for the analysis of the effectiveness of regional innovation systems in different periods.
Xuanli [15] used applications for patents as a resulting feature and evaluated the following indicators as variables for assessing innovation development in Chinese regions: foreign direct investment per capita, total industrial production, the average number of employees, the number of human resources employed in the technical and technological fields, the regional R & D ratio to total industrial production, R & D expenditures, the scientific and technical standard of expenditures, and the ratio of regional R & D expenditures to gross regional product (GRP).
Information 2020, 11, 294
Broekel, Rogge, and Brenner [16] used DEA to analyze regional innovation efficiency using R & D expenditures, R & D employees, and employment data to approximate input factors and patent numbers to approximate innovative output, i.e., the scenario that is most common in this type of literature.
Examples of the analysis of regional innovation systems in Russia with the DEA approach can be found in the following studies.
Baburin and Zemtsov [17] examined the effectiveness of regional innovation systems in Russia by using the input variables of human capital and R & D expenditures and the output variable of the number of potentially commercializable patents.
Zemtsov and Kotsemir [18] used the following variables in their DEA models. The input parameters were real domestic expenditure on R & D and the number of employed urban citizens with higher education; the indicator of innovation output was the number of patents, an indicator that has been used for many decades. The most important and significant factors of regional innovation system efficiency in Russia in the long term were regional patent stock, R & D intensity, and entrepreneurial activity. They came to the conclusion that the effectiveness of regional innovation systems was higher in more technologically advanced regions with the oldest universities and greatest number of patents available. It is advantageous to be located near major innovation centers due to more intense interregional knowledge spillovers. In general, the effectiveness of regional innovation systems in Russia has increased over this period, especially in the least developed territories, but at the same time, a significant regional differentiation was revealed. The most effective regional innovation systems have been formed in the largest agglomerations with leading universities and research centers.
Firsova and Chernyshova [19] presented an approach to analyzing the efficiency of innovation activity in 80 Russian regions, using DEA to evaluate the performance of regional innovation systems in terms of the structure of R & D financing. Their paper evaluated the impact of financing structures on the functioning of regional innovation systems: the input parameters were the sources and structure of internal funds spent on R & D, and the output parameter was the volume of innovative goods. The regions were then ranked according to the efficiency with which financial resources were used in regional innovation systems.
Didenko, Loseva, and Abdikeev [20] worked out a DEA-based index of the scientific and innovation efficiency of regions. Their DEA model evaluates the transformation of inputs (human capital as costs for research and development personnel, innovation activity (the number of R & D companies that are innovative enterprises), and R & D costs) into outcomes (research and development personnel, patents, the volume of innovation goods, technologies, articles, scientific papers, and Ph.D. theses) in the national innovation process. This study revealed that not only obvious leaders (such as Moscow and the Moscow region) but also small regions with relatively limited resources can lead in efficiency.
Rudskaia and Rodionov [21] used the following parameters in their DEA model. Their input consisted of the number of studies, internal R & D costs in the regions, and share of investment in the gross regional product. Their output consisted of the number of advanced production technologies developed in the region, the ratio of innovation activity (share of innovation-active enterprises in the total number of enterprises), the share of innovative products (goods, works, and services) in the total volume of manufactured products, and the share of high-tech products (works and services) in the gross regional product. The authors came to the conclusion that strong regions predominate among the technically efficient ones (they are small in number but significant in representation), as do moderately strong (second category) regions, which indicates the overall efficiency of the authorities' efforts to conduct innovation policy and the high exposure of the innovation environment to it.
Given the difficulty in justifying the choice of "input" and "output" variables when constructing a DEA model, the indicators used in the DEA assessment models for data from Russian regions were of particular interest for our analysis and discussion.
This literature review demonstrates that there is no single approach to evaluating the effectiveness of regional innovative development. In our opinion, some of the available measures can be criticized for mixing the inputs and outputs of innovation processes instead of evaluating the output based on the used inputs. However, in general, we identified a number of indicators that can be attributed to the key indicators of evaluating innovative development according to the research results obtained by most authors: R & D expenses, the volume of innovative goods, operations, services, employment in the innovation sphere, and the number of patents applied for and granted. These studies are consistent with the international system of innovation indicators and with the fundamental Oslo Manual [22], Frascati [23], and the Organisation for Economic Co-Operation and Development recommendations [24] as main guidelines to measure innovation that determine the methodology for the accounting and analysis of innovations and to determine the main indicators, thus allowing for access to a country's statistical reporting.
For our purposes, it was important that all the previous studies employed a wide range of variables that were considered as input factors, including the number of R & D employees [12,16], R & D employees in combination with the level of highly qualified employees in a region [16], the number of employees, and a set of regional factors including R & D expenditures. The variation on the output side was relatively small, as data availability left patents as the dominant approximation of innovative output (the number of innovations that have been registered for patenting). All researchers have used patents as a readily accessible statistical indicator. Some authors have used new product sales as an output in their models.
Thus, there are widely used determinants to identify the main regional innovation system efficiency factors in DEA models, but we did not find a proper theoretical model to evaluate the direct and indirect effects of innovation and to assess structural changes in regional economic dynamics. The novelty of this research is that we developed approaches for the regional innovation efficiency evaluation based on the direct and indirect effects of innovations to measure the innovative activity dynamics of regions.
This study continues the scientific discussion on the effectiveness of regional economic development and the search for tools to assess regional innovation policy and the dynamics of innovative structural shifts. The paper contributes to the debate on how to measure regions' innovation performance using the tools of the DEA Malmquist Index and to evaluate efficiency change of the regional innovation system over a time period for Russia.

Development of Approaches to Assess the Innovative Dynamics of Regions
It is now widely recognized that the regional innovation system is responsible for the transformation of innovation resources into innovation results. We suggest considering and evaluating the results of innovation in terms of their direct and indirect effects. This will allow for the evaluation of what factors and sources lead to the innovative development of a region, whether due to expanding resources or their efficient use.
The direct effects of innovation are the immediate results of innovation. They manifest themselves in an increase in the quantity and quality of produced innovative goods or products and are defined as the anticipated or actual result of introducing the innovations. Direct effects are measured by indicators such as the volume of innovative products, the share of innovative products in the GRP, patents, R & D costs, and other quantitative indicators. The effectiveness of innovation in this case is defined as the balance between the effect and the costs resulting from it.
The indirect effects of innovation, in our opinion, reflect the systemic effect of innovation policy and the results in the form of structural shifts in the economy as a process of increasing the share of the innovative component in the regional economy. Indirect effects derived from innovation spillover can be defined as effects that influence businesses' activities that are not directly involved in the process of innovation dissemination, as well as the resulting structural changes in the economy under the impact of innovations [25][26][27].
Indirect effects are expressed in the larger proportion of high-tech products in the GRP structure, the larger proportion of the increase in the number of innovations introduced, the larger proportion of the results of using intellectual property, a greater investment effort in the economy and financial support for innovation, and in the development of innovative susceptibility and investment processes.
While the direct effect of innovations can be measured (and while it is expressed in the direct impact of the innovation spillover on the economic growth of a country or region through an increase in their basic economic development indicators), the indirect effect of innovation diffusion on economic development cannot be directly attributed to the results of its activities [28,29]. The indirect results and effects of innovation are much more difficult to measure, and the economic literature has so far not sufficiently examined the aspects of quantitative assessment, methods, and instruments for evaluating the indirect effect of innovation diffusion on economic growth.
In order to determine the nature of the impact of innovative direct and spillover effects on regional innovative development, we compared and analyzed the indicators of innovative activity in the regions of the Russian Federation for the period of 2006-2017 based on DEA Malmquist Productivity Index instruments.
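The standard output-oriented form of the Malmquist Productivity Index, in the formulation commonly attributed to Färe and co-authors, is reproduced below as a reference sketch. The notation is an assumption introduced here: D_o^{t}(x, y) denotes the output distance function measured against the period-t frontier, and (x^{t}, y^{t}) is a region's input-output vector in period t. The exact specification used in the study is not stated in this section.

```latex
\begin{aligned}
M_o\left(x^{t+1}, y^{t+1}, x^{t}, y^{t}\right)
&= \left[
\frac{D_o^{t}\left(x^{t+1}, y^{t+1}\right)}{D_o^{t}\left(x^{t}, y^{t}\right)}
\cdot
\frac{D_o^{t+1}\left(x^{t+1}, y^{t+1}\right)}{D_o^{t+1}\left(x^{t}, y^{t}\right)}
\right]^{1/2} \\[4pt]
&= \underbrace{\frac{D_o^{t+1}\left(x^{t+1}, y^{t+1}\right)}{D_o^{t}\left(x^{t}, y^{t}\right)}}_{\text{efficiency change}}
\cdot
\underbrace{\left[
\frac{D_o^{t}\left(x^{t+1}, y^{t+1}\right)}{D_o^{t+1}\left(x^{t+1}, y^{t+1}\right)}
\cdot
\frac{D_o^{t}\left(x^{t}, y^{t}\right)}{D_o^{t+1}\left(x^{t}, y^{t}\right)}
\right]^{1/2}}_{\text{technical change}}
\end{aligned}
```

A value M_o > 1 indicates growth in total factor productivity between the two periods, and M_o < 1 indicates decline. When the distance functions are computed under both constant and variable returns to scale, the efficiency change term can be split further into pure efficiency change and scale efficiency change, which is the component relevant to a hypothesis about scale factors.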

Indicators and Description of Variables
We chose our variables in accordance with the studies presented in the literature review, and those variables served as the basis for assessing the innovative dynamics of the regions in this research. The proposed set of output and input variables was included in the model due to the presence of a theoretical or empirical causal relationship between them and the level of innovative development in terms of managing the innovation process.
The study used the following metrics as variables for evaluating innovative effects. The input parameters for modeling were the resources invested in maintaining the functions of the regional innovation system. The indicators of activity, the growth of which contributes to the innovation development of a regional economy, were used as outputs. The proposed indicators are official indicators used in statistical sources to assess the effectiveness of innovation results in Russia. The following variables, in the form of innovative direct and indirect effects, were selected as output parameters:
• The volume of innovation goods is the volume of innovative goods, works, and services of organizations by economic activity.
• Hi-tech share in GRP is the share of value-added of high-tech and knowledge-intensive industries in the GRP of Russian regions.
• Investment in fixed assets is the ratio of investments in fixed capital to GRP; it characterizes the investment activity of the Russian regions. The acceleration of investment growth in fixed capital in the economy is an important indicator for the Russian Federation Government.
• Used patents are patents that have been used and commercialized in real businesses.
A quantitative analysis of the dynamics of the variables under examination for Russia is presented in Figures 1 and 2. The research materials were statistical data from Rosstat [30]. An analysis of the dynamics of the presented variables demonstrated that there were no significant structural changes in the economy in terms of innovation dynamics. This explains the approach outlined in our study, which highlights the direct and indirect effects of innovation and applies additional methods for assessing regional indicators of innovation development, based on DEA and the Malmquist Index, to analyze the nature of structural changes.
To assess the comparative effectiveness of various parameters of innovative systems, the DEA methodology was used [31][32][33]. The formation of approaches for assessing regional innovation systems based on direct and indirect effects and the use of DEA instruments to assess the effectiveness of innovative development of Russian regions was carried out for the first time.
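To make the mechanics concrete, the sketch below computes a Malmquist index in the simplest possible setting: one input, one output, and constant returns to scale, where the DEA program has a closed-form solution and no linear-programming solver is needed. This is an illustrative simplification, not the specification used in the study, which works with several inputs and outputs. The function names and all numbers are hypothetical placeholders.

```python
from math import sqrt


def output_distance(x, y, frontier):
    """Output distance function for one input and one output under constant
    returns to scale: the unit's output per unit of input divided by the best
    ratio observed in the reference period."""
    best = max(fy / fx for fx, fy in frontier)
    return (y / x) / best


def malmquist(unit_t, unit_t1, frontier_t, frontier_t1):
    """Return (index, efficiency_change, technical_change) for one unit."""
    d_t_t = output_distance(*unit_t, frontier_t)      # period-t unit, period-t frontier
    d_t_t1 = output_distance(*unit_t1, frontier_t)    # period-(t+1) unit, period-t frontier
    d_t1_t = output_distance(*unit_t, frontier_t1)    # period-t unit, period-(t+1) frontier
    d_t1_t1 = output_distance(*unit_t1, frontier_t1)  # period-(t+1) unit, period-(t+1) frontier
    efficiency_change = d_t1_t1 / d_t_t
    technical_change = sqrt((d_t_t1 / d_t1_t1) * (d_t_t / d_t1_t))
    return efficiency_change * technical_change, efficiency_change, technical_change


# Placeholder (input, output) pairs for three hypothetical regions;
# these are not Rosstat figures.
period_t = [(2.0, 2.0), (4.0, 6.0), (5.0, 5.0)]
period_t1 = [(2.0, 3.0), (4.0, 8.0), (5.0, 5.0)]

results = [malmquist(a, b, period_t, period_t1) for a, b in zip(period_t, period_t1)]
for name, (index, ec, tc) in zip(["A", "B", "C"], results):
    print(f"region {name}: index={index:.3f} efficiency={ec:.3f} technical={tc:.3f}")
```

In this single-ratio case the index reduces to the change in output per unit of input, which gives a quick sanity check on the decomposition. With several inputs and outputs, each distance function would instead be obtained by solving a linear program per region and period.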

Models for Assessing Innovative Growth
Our approach for assessing the effectiveness of regional innovation development in this study involved taking the following aspects into account:
• The choice of a non-specific set of indicators (input and output parameters) reflecting the effects of innovations that would be available in Russian statistical databases and would be statistically significant and accessible for researchers.
The study strategy presented and analyzed two models for evaluating innovative growth with matching inputs and different outputs:
• Model 1 reflects the immediate direct effects of innovation policy: the volume of innovative products as the main indicator of innovation dynamics (Table 1).
• Model 2 depicts indirect effects, so-called spillover effects, which accompany the innovation diffusion processes and characterize the level of effectiveness of innovation policy through the creation of general conditions for innovative susceptibility, the growth of investment indicators, the patents used, and, as a result, structural changes and increases in the volume of industries with high-tech development in regional economies (Table 2).
The specificity of the innovation process lies in the long life cycle of an innovation project in contrast to projects that use traditional products, as well as in the scope of all its stages: research and development, the implementation of innovations in practice, placing products on the market, and obtaining an economic effect. Therefore, to assess the effectiveness of innovation policy, we chose a five-year time lag, which meant that the resources invested in the innovation system in 2006, 2009, and 2012 turned into performance in 2011, 2014, and 2017, respectively.
The factors of innovative development were the input resources that affected output with a lag; since the real effects in the form of implemented developments and changes in the structure of the regional economy appeared in the medium term, we selected indicators with a gap of five years. In addition, these years witnessed important events in Russia's economic development: 2014 was a period of crisis, which is why the analysis of the indicators of pre-crisis and post-crisis trends was of particular interest (Figure 3).
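The lag structure described above can be sketched as a small data-alignment step: inputs observed in year t are paired with outputs observed in year t + 5, and consecutive pairs define the comparison windows. The helper name and the data values below are hypothetical placeholders, not figures from the study.

```python
LAG = 5  # five-year lag between resources and results, as stated in the text


def lagged_pairs(inputs_by_year, outputs_by_year, lag=LAG):
    """Pair each input year with the output year `lag` years later,
    skipping input years whose lagged output year is not available."""
    pairs = {}
    for year in sorted(inputs_by_year):
        target = year + lag
        if target in outputs_by_year:
            pairs[(year, target)] = (inputs_by_year[year], outputs_by_year[target])
    return pairs


# Placeholder values for a single hypothetical region.
inputs_by_year = {2006: {"region_a": 1.0}, 2009: {"region_a": 1.2}, 2012: {"region_a": 1.1}}
outputs_by_year = {2011: {"region_a": 2.0}, 2014: {"region_a": 2.6}, 2017: {"region_a": 2.9}}

pairs = lagged_pairs(inputs_by_year, outputs_by_year)
periods = sorted(pairs)
# Consecutive observation points give the windows over which change is measured.
windows = [(a[1], b[1]) for a, b in zip(periods, periods[1:])]

print(periods)
print(windows)
```

Keeping the pairing explicit makes it easy to test a different lag length without touching the rest of the analysis.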

Data Preprocessing by Cluster Analysis
A cluster analysis was used as an initial stage in assessing the dynamics of regional development. To apply the DEA method using the Malmquist Index, it is advisable to distinguish groups of regions that have similar characteristics in terms of innovative development. The uneven development of the regions led to a high differentiation of indicators used in the DEA analysis process and necessitated the alignment of the initial data.
The level of innovative development in the Russian Federation varies across regions depending on the level of development of productive forces, natural and geographical conditions, historical and socio-economic development, and the territorial distribution of industry and science. As a result, there is high differentiation and polarization across Russian regions, which is reflected in the regional specifics of financing innovative enterprises. These factors inhibit the spillover of innovation capital and innovative evolution.
Initially, the effectiveness of regional innovation systems was evaluated based on the data of 80 regions, as a number of Russian regions in 2006-2017 did not have comparable indicators. However, there was a big gap between the leading regions and the regions in which the utilized innovative resources had not led to significant innovative structural shifts in economic development at the macroeconomic level. The specifics of the territorial and federal structure of the Russian Federation therefore explain the use of cluster analysis for selecting regions.
The examined set included regions in better socio-economic conditions than others, and inter-regional differentiation by indicators of territory and economy was high, which is why the spatial indicators characterizing innovative development (volume of innovative goods, domestic R & D costs, patents, and number of innovative companies) varied significantly across Russian regions.
Thus, some regions in terms of innovative development were not of research interest, since the importance of innovative development and, accordingly, their indicators were not significant for economic regional development. They were excluded from examination.
Bringing the original set to a homogeneous form was a separate data mining task. Within the framework of the study, it was necessary to carry out stratification of regions so that we could single out homogeneous groups and select comparable objects; therefore, we conducted a cluster analysis according to the indicators chosen for analysis: high-tech share in GRP, investment share in fixed assets, volume of innovation goods, and internal R & D costs.
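The stratification step can be sketched as follows. The section does not say which clustering algorithm was used, so k-means on standardized indicators is an assumption made here purely for illustration, as are the number of clusters, the rule for picking the leading cluster, and every data value. The four columns stand for high-tech share in GRP, investment share in fixed assets, volume of innovation goods, and internal R & D costs.

```python
from math import sqrt
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so indicators in different units are comparable."""
    columns = list(zip(*rows))
    centers = [mean(col) for col in columns]
    spreads = [pstdev(col) or 1.0 for col in columns]  # guard against constant columns
    return [[(v - c) / s for v, c, s in zip(row, centers, spreads)] for row in rows]


def distance(a, b):
    """Euclidean distance between two points."""
    return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def kmeans(points, k, iterations=100):
    """Plain k-means with a deterministic start: the first k points
    serve as the initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda c: distance(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [mean(col) for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Placeholder indicator values for six hypothetical regions.
rows = [
    [30.0, 25.0, 900.0, 400.0],
    [12.0, 15.0, 40.0, 10.0],
    [28.0, 27.0, 850.0, 380.0],
    [11.0, 14.0, 35.0, 12.0],
    [31.0, 24.0, 950.0, 420.0],
    [13.0, 16.0, 45.0, 9.0],
]

labels = kmeans(standardize(rows), k=2)


def mean_volume(cluster):
    """Average volume of innovation goods (third column) within a cluster."""
    return mean(row[2] for row, lab in zip(rows, labels) if lab == cluster)


# Keep the cluster with the highest average volume of innovation goods.
leading = max(set(labels), key=mean_volume)
selected = [i for i, lab in enumerate(labels) if lab == leading]
print(labels, selected)
```

Standardizing first matters because the indicators mix shares and absolute volumes; without it, the largest-valued column would dominate the distance calculation.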

Data Preprocessing by Cluster Analysis
A cluster analysis was used as an initial stage in assessing the dynamics of regional development. To apply the DEA method using the Malmquist Index, it is advisable to distinguish groups of regions that have similar characteristics in terms of innovative development. The uneven development of the regions led to a high differentiation of indicators used in the DEA analysis process and necessitated the alignment of the initial data.
The level of innovative development in the Russian Federation varies across regions depending on the level of development of productive forces, natural and geographical conditions, historical and socio-economic development, and the territorial distribution of industry and science. As a result, innovative enterprises are highly differentiated and polarized in number across Russian regions, which is reflected in the regional specifics of their financing. These factors inhibit the spillover of innovation capital and innovative evolution.
Initially, the effectiveness of regional innovation systems was evaluated based on the data of 80 regions, as a number of Russian regions did not have comparable indicators for 2006-2017. However, there was a large gap between the leading regions and the regions in which the utilized innovative resources had not led to significant innovative structural shifts in economic development at the macroeconomic level. The specifics of the territorial and federal structure of the Russian Federation therefore explain the use of cluster analysis for selecting regions.
In the examined set, some regions were in better socio-economic conditions, and inter-regional differentiation by indicators of territory and economy was high. This is why the spatial indicators characterizing innovative development (volume of innovative goods, domestic R & D costs, patents, and number of innovative companies) varied significantly across Russian regions.
Thus, some regions were not of research interest in terms of innovative development, since innovative development and, accordingly, its indicators were not significant for their regional economic development. They were excluded from examination.
Bringing the original set to a homogeneous form was a separate data mining task. Within the framework of the study, it was necessary to stratify the regions so that we could single out homogeneous groups and select comparable objects; therefore, we conducted a cluster analysis according to the indicators chosen for analysis: high-tech share in GRP, investment share in fixed assets, volume of innovation goods, and internal R & D costs.
As a result of cluster analysis, regions were divided by several essential features. In addition, the cluster analysis allowed us to reduce large amounts of socio-economic information and to make them compact and homogeneous. These features seemed essential for the preliminary processing of regional development data in combination with DEA methods.
Clustering consists of dividing an analyzed set into groups of similar objects that are called clusters. The task of the clustering algorithm is the distribution of objects among clusters, the separation of large and the organization of small clusters, and then the redistribution of objects between them [34,35].
There are various clustering techniques for unsupervised learning, such as:
• Partitional: k-means, k-medoids, k-modes, Clustering LARge Applications (CLARA), and Clustering Large Applications based on RANdomized Search (CLARANS).
• Density-based: density-based spatial clustering of applications with noise (DBSCAN).
The most commonly used clustering algorithms are non-hierarchical, such as the k-means and k-medoids algorithms.
k-medoids is a greedy algorithm; it is a modification of k-means that does not use linear space properties and that uses compactness as the clustering criterion [36]. A medoid is the point in a cluster with the minimal average dissimilarity to the other data points in that cluster.
A significant advantage in this regional study was that the k-medoids algorithm is less sensitive to outliers than other partitional algorithms, k-means in particular.
The common realization of k-medoids is an iterative algorithm (partitioning around medoids) that minimizes the sum of distances from each object to its cluster medoid. Its scheme is as follows:
• Input: dataset A of L data points, number of clusters k.
In order to use a clustering algorithm and implement separation, it is required to choose a similarity measure. The measure of similarity is a function that determines the distance between objects in multidimensional space. The similarity measure is selected based on attribute types, as well as computational complexity requirements.
Distance measures fall into several classes; the approach based on Bregman divergences yields generalized distance measures [37]. For a given family of probability distributions, there exists a corresponding generalized distance measure (squared Euclidean distance, Kullback-Leibler distance, Itakura-Saito distance, and others). It should be noted that some types of divergence provide better cluster separability.
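The iteration described above can be illustrated with a minimal sketch. It implements the simpler alternating variant of k-medoids (assign each point to its nearest medoid, then re-pick each medoid inside its cluster) rather than the full swap-based partitioning-around-medoids search, and it is not the authors' implementation. The names `k_medoids`, `squared_euclidean`, and the `init` parameter are hypothetical; any dissimilarity function, including a Bregman divergence, can be passed in.

```python
import random


def squared_euclidean(a, b):
    """Squared Euclidean dissimilarity between two equal-length tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_medoids(points, k, dissimilarity=squared_euclidean,
              init=None, max_iter=100, seed=0):
    """Alternating k-medoids sketch. Returns (medoid_indices, labels).

    init: optional list of k starting medoid indices (illustrative
    parameter); otherwise k distinct points are drawn at random.
    """
    if init is not None:
        medoids = list(init)
    else:
        medoids = random.Random(seed).sample(range(len(points)), k)

    def assign(current):
        # Label every point with the index of its nearest medoid.
        return [
            min(range(k), key=lambda c: dissimilarity(p, points[current[c]]))
            for p in points
        ]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                # Keep the previous medoid if a cluster became empty.
                new_medoids.append(medoids[c])
                continue
            # The medoid is the member with the lowest total dissimilarity
            # from the other members of its cluster.
            new_medoids.append(min(
                members,
                key=lambda m: sum(dissimilarity(points[j], points[m])
                                  for j in members),
            ))
        if new_medoids == medoids:  # no medoid moved: converged
            break
        medoids = new_medoids
    return medoids, assign(medoids)
```

Because every medoid is an observed data point rather than a mean, a single extreme observation cannot drag a cluster representative away from the bulk of its cluster, which is the robustness property noted above.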
The Bregman divergence is defined on a convex set as

$$D_\phi(x, y) = \phi(x) - \phi(y) - \langle x - y, \nabla\phi(y) \rangle,$$

where ϕ(x) is a strictly convex, continuously differentiable function on a convex set, and ∇ϕ(y) is its gradient at y. Bregman divergences are not metrics. In the generalized divergence case, ϕ is a convex function on the r-dimensional real vector space.
Many indicators used to measure clustering accuracy (Rand index, Jaccard Index, F-measure, mutual information, and Fowlkes-Mallows index) are external validation methods that require target variables. Indices such as Davies-Bouldin, Dunn, and Silhouette can be used as internal metrics for evaluating clustering algorithms.
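To make the Bregman construction concrete, the sketch below evaluates D_ϕ(x, y) = ϕ(x) − ϕ(y) − ⟨x − y, ∇ϕ(y)⟩ for two standard generating functions. The function names are illustrative. Which generating function the study actually used is not recoverable from the text, so the two shown here are examples only.

```python
import math


def bregman(phi, grad_phi, x, y):
    """Bregman divergence D_phi(x, y) for vectors given as lists."""
    g = grad_phi(y)
    inner = sum((xi - yi) * gi for xi, yi, gi in zip(x, y, g))
    return phi(x) - phi(y) - inner


# phi(x) = sum x_i^2 generates the squared Euclidean distance.
def sq_norm(x):
    return sum(v * v for v in x)


def sq_norm_grad(x):
    return [2.0 * v for v in x]


# phi(x) = sum x_i log x_i (all x_i > 0) generates the generalized
# I-divergence, which equals the Kullback-Leibler distance when both
# vectors sum to one.
def neg_entropy(x):
    return sum(v * math.log(v) for v in x)


def neg_entropy_grad(x):
    return [math.log(v) + 1.0 for v in x]
```

Swapping the arguments of the entropy-based divergence generally changes its value, which is one way to see that Bregman divergences are not metrics.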
A significant drawback of non-hierarchical methods is the need to specify the number of clusters k. The Davies-Bouldin score is used to evaluate the optimal number of clusters [38].
One option for calculating the similarity of clusters i and j is determined by the formula

$$R_{ij} = \frac{S_i + S_j}{d_{ij}},$$

where S_i is the average distance from the objects of cluster i to its center and d_ij is the distance between the centers of clusters i and j. The Davies-Bouldin Index (DBI) is calculated as follows:

$$DBI = \frac{1}{k} \sum_{i=1}^{k} R_i,$$

where $R_i = \max_{j \in \{1, \dots, k\},\, j \neq i} R_{ij}$, i = 1, . . . , k. A low DBI value indicates a more appropriate cluster structure.
Data preparation is an important part of cluster analysis. As the normalization method, we used the proportion transformation: each value was divided by the total sum of the values of that attribute, ignoring non-numeric or missing values. In this case, the normalized attribute values were positive, in accordance with the requirements of the DEA method used.
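Both preprocessing steps can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' pipeline: missing values are assumed to be encoded as `None`, distances inside the index are Euclidean, cluster representatives are passed in explicitly, and every cluster is assumed to be non-empty. The index needs at least two clusters, since each cluster is compared with its most similar neighbor. The function names are hypothetical.

```python
import math


def proportion_normalize(rows):
    """Divide each value by its column total, skipping None entries."""
    n_cols = len(rows[0])
    totals = [sum(r[c] for r in rows if r[c] is not None)
              for c in range(n_cols)]
    return [
        [None if r[c] is None else r[c] / totals[c] for c in range(n_cols)]
        for r in rows
    ]


def davies_bouldin(points, labels, centers):
    """DBI = (1/k) * sum_i max_{j != i} (S_i + S_j) / d_ij.

    centers[i] is the representative (e.g. the medoid) of cluster i.
    """
    k = len(centers)
    scatter = []
    for i in range(k):
        members = [p for p, lab in zip(points, labels) if lab == i]
        # S_i: average distance from the members to their representative.
        scatter.append(
            sum(math.dist(p, centers[i]) for p in members) / len(members)
        )
    total = 0.0
    for i in range(k):
        # R_i: similarity to the most similar other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k) if j != i
        )
    return total / k
```

Evaluating `davies_bouldin` for each candidate k and keeping the smallest value mirrors the selection rule described above.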

DEA Malmquist Index Modeling
The methodology for assessing the dynamics of regional innovative development and innovative spillovers was based on the application of the DEA model with the Malmquist Productivity Index. DEA is a linear programming methodology that uses input and output data for a group of homogeneous objects in order to build a piece-wise linear production frontier over the objects in the sample. DEA provides an estimation of efficiency relative to the best practices under the condition that the technology is fixed at the current level. To construct the frontier, linear programming problems are solved (for each object in the sample, a separate problem is formulated and solved). The degree of technical inefficiency of each object is defined as the distance between the observed data point and the frontier. DEA makes it possible to obtain a quantitative evaluation of the analyzed entities, which are usually called decision-making units (DMUs).
Currently, there are different types of DEA models depending on orientation (input-oriented and output-oriented), returns to scale (constant returns to scale (CRS) and variable returns to scale (VRS)), distance function, frontier type, and other aspects [39]. In an input-oriented model, the DEA method constructs the frontier by searching for the maximum possible proportional reduction of inputs with constant output levels for each object. In an output-oriented model, the DEA method defines the maximum proportional increase in outputs under the assumption of fixed input levels. These two approaches give the same estimations of technical efficiency when applying the CRS model, but they are unequal in the VRS model.
The Malmquist Index is used to evaluate technological efficiency obtained for relatively different sets of objects [40-42]. The Malmquist Index measures the total factor productivity change of a DMU between two consecutive time periods. It is defined as the product of a change in efficiency (catch-up) and a technological change (frontier shift). A change in efficiency reflects the extent to which the DMU improves or deteriorates its effectiveness, while technological changes reflect a change in the frontiers of efficiency between two periods [43].
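The statement that both orientations give the same technical efficiency under CRS can be checked by hand in the simplest setting. The sketch below assumes a single input and a single output (a simplification not made in the study), where the CRS frontier is the ray through the DMU with the highest output-to-input ratio; the function name is hypothetical.

```python
def crs_efficiencies(x, y, i):
    """Input- and output-oriented CRS scores of DMU i (one input, one output).

    x, y: lists of positive input and output quantities for all DMUs.
    Returns (theta, phi): theta is the feasible proportional input
    contraction factor, phi the feasible proportional output expansion.
    """
    best = max(yj / xj for xj, yj in zip(x, y))  # slope of the CRS frontier
    ratio = y[i] / x[i]
    theta = ratio / best   # input-oriented technical efficiency (<= 1)
    phi = best / ratio     # output-oriented expansion factor (>= 1)
    return theta, phi
```

In this case the output-oriented technical efficiency 1/phi equals theta for every DMU, so the two orientations rank and score the units identically; under VRS the frontier is no longer a ray and this identity generally fails.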
The Malmquist TFP index measures the change in total factor productivity (TFP) between two data points (e.g., those of a particular region in two adjacent time periods) by calculating the ratio of the distances of each data point relative to a common technology. The Malmquist TFP change index in the output-oriented DEA model between period t and period (t + 1) is:

$$M_o(y_{t+1}, x_{t+1}, y_t, x_t) = \left[ \frac{d_o^{t}(x_{t+1}, y_{t+1})}{d_o^{t}(x_t, y_t)} \times \frac{d_o^{t+1}(x_{t+1}, y_{t+1})}{d_o^{t+1}(x_t, y_t)} \right]^{1/2}$$

where $(x_{t+1}, y_{t+1})$ and $(x_t, y_t)$ represent the input and output vectors of periods (t + 1) and t, respectively, and the notation $d_o^{t+1}(x_t, y_t)$ represents the distance from the period t observation to the period (t + 1) technology.
A value of M_o greater than 1 indicates positive TFP growth from period t to period (t + 1). A value of M_o less than 1 indicates a TFP decline. If M_o = 1, there is neither progress nor regression between period t and period (t + 1).
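M_o is the geometric mean of two TFP indices. The first is evaluated with respect to the period t technology, and the second is evaluated with respect to the period (t + 1) technology. An equivalent decomposed form of the TFP index [44,45] is:

$$M_o(y_{t+1}, x_{t+1}, y_t, x_t) = \frac{d_o^{t+1}(x_{t+1}, y_{t+1})}{d_o^{t}(x_t, y_t)} \left[ \frac{d_o^{t}(x_{t+1}, y_{t+1})}{d_o^{t+1}(x_{t+1}, y_{t+1})} \times \frac{d_o^{t}(x_t, y_t)}{d_o^{t+1}(x_t, y_t)} \right]^{1/2} \tag{2}$$

Let us consider the first factor in Equation (2), which is named technical efficiency change (EC):

$$EC = \frac{d_o^{t+1}(x_{t+1}, y_{t+1})}{d_o^{t}(x_t, y_t)}$$

EC is the change in the output-oriented measure of technical efficiency between periods t and (t + 1), and it shows how the ratio of actual outputs to potential outputs has changed. EC indicates the capability of a DMU to catch up with more efficient DMUs.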
The second factor in Equation (2) is technological change (TEC):

$$TEC = \left[ \frac{d_o^{t}(x_{t+1}, y_{t+1})}{d_o^{t+1}(x_{t+1}, y_{t+1})} \times \frac{d_o^{t}(x_t, y_t)}{d_o^{t+1}(x_t, y_t)} \right]^{1/2}$$

TEC is the potential index (the geometric mean of two ratios) that characterizes the shift of the potential technology frontier between periods t and (t + 1). In other words, it reflects a change in the technological efficiency of the evaluated object caused by a shift in the efficient frontier. Another decomposition of the technical efficiency change is based on using both CRS and VRS DEA frontiers; it yields a scale efficiency change (SEC) and a pure efficiency change (PEC). The TFP index can thus be represented in the form of such factors as the change in efficiency and the technical change.
Let DMU_i, i = 1, . . . , N, use P inputs to produce S outputs in time period t, t = 1, . . . , T. In a particular time period t, the following definitions hold:
• y_i is an S×1 vector of output quantities for DMU_i.
• x_i is a P×1 vector of input quantities for DMU_i.
• Y is an N×S matrix of output quantities for all N DMUs.
• X is an N×P matrix of input quantities for all N DMUs.
• λ is an N×1 vector of weights.
• ϕ is a scalar.
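Once the four distance values are available, the index and its two factors reduce to simple arithmetic. The following sketch uses hypothetical function and argument names: `d_a_b` stands for the distance of the period-b observation to the period-a technology, with `t1` meaning period t + 1. The numbers in the usage note are invented for illustration and are not results from the study.

```python
import math


def malmquist(d_t_t, d_t_t1, d_t1_t, d_t1_t1):
    """Output-oriented Malmquist TFP change between periods t and t + 1.

    d_t_t   = d_o^t(x_t, y_t)          d_t_t1  = d_o^t(x_{t+1}, y_{t+1})
    d_t1_t  = d_o^{t+1}(x_t, y_t)      d_t1_t1 = d_o^{t+1}(x_{t+1}, y_{t+1})
    """
    return math.sqrt((d_t_t1 / d_t_t) * (d_t1_t1 / d_t1_t))


def decompose(d_t_t, d_t_t1, d_t1_t, d_t1_t1):
    """Return (EC, TEC): the catch-up and frontier-shift factors."""
    ec = d_t1_t1 / d_t_t
    tec = math.sqrt((d_t_t1 / d_t1_t1) * (d_t_t / d_t1_t))
    return ec, tec


def interpret(m, tol=1e-9):
    """Verbal reading of an index value relative to 1."""
    if m > 1 + tol:
        return "growth"
    if m < 1 - tol:
        return "decline"
    return "no change"
```

For example, with illustrative distances 0.5, 0.9, 0.4, and 0.6 (in the argument order above), `decompose` gives EC = 1.2 and a TEC of about 1.369, and their product equals the value returned by `malmquist`, as the decomposed form requires.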
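To calculate $M_o(y_{t+1}, x_{t+1}, y_t, x_t)$, it is necessary to solve the following linear programming problems [44]:

$$[d_o^{t}(x_t, y_t)]^{-1} = \max_{\phi, \lambda} \phi \quad \text{s.t.} \quad -\phi y_{i,t} + Y_t^{\top}\lambda \geq 0, \quad x_{i,t} - X_t^{\top}\lambda \geq 0, \quad \lambda \geq 0 \tag{12}$$

$$[d_o^{t+1}(x_{t+1}, y_{t+1})]^{-1} = \max_{\phi, \lambda} \phi \quad \text{s.t.} \quad -\phi y_{i,t+1} + Y_{t+1}^{\top}\lambda \geq 0, \quad x_{i,t+1} - X_{t+1}^{\top}\lambda \geq 0, \quad \lambda \geq 0 \tag{13}$$

$$[d_o^{t}(x_{t+1}, y_{t+1})]^{-1} = \max_{\phi, \lambda} \phi \quad \text{s.t.} \quad -\phi y_{i,t+1} + Y_t^{\top}\lambda \geq 0, \quad x_{i,t+1} - X_t^{\top}\lambda \geq 0, \quad \lambda \geq 0 \tag{14}$$

$$[d_o^{t+1}(x_t, y_t)]^{-1} = \max_{\phi, \lambda} \phi \quad \text{s.t.} \quad -\phi y_{i,t} + Y_{t+1}^{\top}\lambda \geq 0, \quad x_{i,t} - X_{t+1}^{\top}\lambda \geq 0, \quad \lambda \geq 0 \tag{15}$$

Here, the subscript on X and Y indicates the period whose observations define the reference technology. Equations (12)-(15) need to be solved for each DMU in a sample. Hence, the total number of linear programming problems for N DMUs and T time periods is N·(3T − 2).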
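A single routine can cover all four problems if the evaluated observation and the reference set are passed separately. The sketch below is one possible formulation, assuming NumPy and SciPy are available and using `scipy.optimize.linprog` with the HiGHS solver; it is not the software used in the study, and the function names are hypothetical. Data are assumed to be strictly positive.

```python
import numpy as np
from scipy.optimize import linprog


def output_distance(x_eval, y_eval, X_ref, Y_ref):
    """Output-oriented CRS distance of (x_eval, y_eval) to a reference set.

    Solves max phi s.t. phi * y_eval <= Y_ref^T lam, X_ref^T lam <= x_eval,
    lam >= 0, and returns 1 / phi. X_ref is N x P, Y_ref is N x S.
    """
    X_ref = np.asarray(X_ref, dtype=float)
    Y_ref = np.asarray(Y_ref, dtype=float)
    x_eval = np.asarray(x_eval, dtype=float)
    y_eval = np.asarray(y_eval, dtype=float)
    n, p = X_ref.shape
    s = Y_ref.shape[1]

    # Decision vector z = [phi, lam_1, ..., lam_N]; linprog minimizes,
    # so maximizing phi means minimizing -phi.
    c = np.zeros(n + 1)
    c[0] = -1.0
    out_rows = np.hstack([y_eval.reshape(-1, 1), -Y_ref.T])  # phi*y - Y^T lam <= 0
    in_rows = np.hstack([np.zeros((p, 1)), X_ref.T])         # X^T lam <= x
    res = linprog(
        c,
        A_ub=np.vstack([out_rows, in_rows]),
        b_ub=np.concatenate([np.zeros(s), x_eval]),
        bounds=[(0, None)] * (n + 1),
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    return 1.0 / res.x[0]


def n_linear_programs(n_dmus, n_periods):
    """T own-period problems plus 2 * (T - 1) cross-period ones per DMU."""
    return n_dmus * (3 * n_periods - 2)
```

Passing the same period for the observation and the reference set reproduces the own-period problems, while mixing periods gives the two cross-period ones. If the three time points 2011, 2014, and 2017 are read as T = 3 (an assumption about how the periods are counted), 31 regions would require 31 · 7 = 217 problems per model.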

Clustering of Regions According to the Level of Innovative Development
A specific feature of the present study was that, in the set of objects under examination, one could observe situations where the resource indicators of the innovation system decreased or remained the same while the performance of the regional innovation system improved sharply. By means of cluster analysis, we tried to take the influence of this phenomenon into account in evaluating the effectiveness of innovative development and assessing its dynamics for 2006-2017.
In principle, the proposed set of indicators allowed all regions to be clustered according to an integrated assessment of their rate of innovative development. To ensure the adequacy of the assessment, it was necessary to exclude from the sample the regions for which there were no data for the periods under review or for which zero values of indicators were observed. The authors applied a cluster analysis to create a balanced data panel.
It was decided to use the following indicators for 2017 to perform cluster analysis:
• High-tech share in GRP, %.
For clustering regions, the k-medoids method was applied with Bregman divergence as the distance type. To select the number of clusters k, the Davies-Bouldin Index was used. The Davies-Bouldin criterion was calculated for each number of clusters k = 1, . . . , 10 (Table 3). The plot (Figure 4) shows that the lowest resulting Davies-Bouldin value occurred at k = 3, thus suggesting that the optimal number of clusters was three.
The regions were divided into three clusters in accordance with the given model. The description of the cluster analysis result is presented in Table 4. Table 5 shows the characteristics of the cluster medoids of the obtained cluster model for the normalized data. In terms of intramural R & D costs, cluster 2 represents an abnormal object.
The essential purpose of the preliminary data processing was the selection of abnormal objects. It was necessary to reject outliers that did not fit into the data structure. The results of the preliminary data processing using the cluster analysis allowed us to identify Moscow as an abnormal sample object (cluster 2), which was quite consistent with the expert assessment.
An important step in the visualization of the clustering results was the formation of scatter plots and histograms, where the diagonal cells are distribution plots and the other elements are scatter plots for different pairs of attributes, color-coded according to cluster labels (Figure 5). This allowed us to check whether outliers were falling inside cluster influence areas and to detect which clusters were distant from one another. There was no significant correlation between attributes.
A visualization of the cluster analysis results demonstrates the obtained data structure (Figure 6). A bar plot indicates the differences between the cluster means in such normalized model variables as hi-tech share in GRP, investment share in fixed assets, and intramural R & D costs.
As a result of the preliminary data analysis by cluster analysis methods, it was proposed to use only a part of the regions for a comparative assessment of technical efficiency. The regions belonging to cluster 0 and cluster 2 are economic objects that are more developed in terms of innovative potential. The regions forming cluster 1 are much less successful in terms of innovative development with regard to the selected indicators. It should be noted that in most cluster 1 regions, the indicators that were proposed to be used in DEA models were close to the minimum values. Thus, to assess the innovative development dynamics, it was further proposed to use the 31 regions belonging to clusters 0 and 2.
As a result, 31 out of 80 regions were selected for subsequent analysis; a number of regions were excluded from the analysis due to incomparably polarized indicators or a lack of statistical data. Finally, this allowed us to distinguish two groups of regions (underdeveloped regions and leading regions) in accordance with the level of their innovative development.

Analysis of Malmquist Productivity Index
To test the research hypotheses, the Malmquist Productivity Index and its components were calculated to assess the dynamics of innovative development over three adjustment periods. Values of M_o > 1 for assessing direct and indirect effects of innovation indicated an increased efficiency of the regions under consideration during the study period. The construction of the Malmquist Index allowed us to obtain an estimate of a region's distance from the production possibility frontier (EC), characterizing the dynamics of total factor productivity, as well as to reveal the dynamics of technological progress (TEC). The EC indicator made it possible to separate the innovative development dynamics of a region from the change of the frontier itself over time. A TEC value of more than 1 indicated technological progress; otherwise, there was technological degradation in the development of the region.
An analysis of the Malmquist Productivity Index and its decomposition based on the data of 31 regions of the Russian Federation allowed us to present the tendency of regional innovation system development. Tables 6 and 7 present the Malmquist Index by region for model 1 with a single output (volume of innovation goods) as the direct effect of innovation activity. As for the results of modeling the direct effects of innovative development, regions with M_o > 1 showed progress in the period from 2011 to 2017. Other regions (Republic of Tatarstan, Ryazan Region, Belgorod Region, Novosibirsk Region, Kirov Region, Moscow City, Stavropol Region, Penza Region, and Ulyanovsk Region) with M_o < 1 showed regression in the two periods. Tables 8 and 9 show the Malmquist Index by region for model 2 with three outputs (hi-tech share in GRP, investment share in fixed assets, and used patents) as indirect effects of innovation activity.
Among well-developed regions, 54.84% had M_o > 1 in accordance with both models for assessing direct and indirect effects, i.e., more than half of the territorial entities in the research sample showed steady innovative development for the period from 2011 to 2017.
The decomposition into two components additionally demonstrated that part of the improvement was attained through the technical change rather than through the efficiency change of the relatively inefficient regions catching up with efficient ones. If EC > 1, the region reduced its distance to the optimal region in terms of technical efficiency. If EC < 1, that distance increased. The average technical efficiency change index was 0.739, which was less than 1. This indicated that the gap between the regions' levels of innovative system development had increased significantly during the selected time period.
The technological change index TEC denotes the degree of technical advancement or technical innovation. If TEC > 1, technical advancement was indicated, and TEC < 1 meant that the technology had a recessionary tendency. As seen from the index decomposition, the average technological change index was 1.598, which was greater than 1. This indicated that the technical efficiency indicators of the reference regions increased monotonically.
PEC and SEC can be used to identify the main direction of further catching up with the frontier in a regional innovation system. In model 1 for the direct effects of the innovation process, PEC = 0.937, which reflected the decreasing impact of technical and management factors. SEC = 1.061 for model 1 indicated that scale factors had the main influence on the regional innovation system. As for model 2 with indirect factors of the innovation process, PEC = 0.905 and SEC = 0.817, thus indicating an insufficient use of both technical and management capabilities and scale economy factors. It should be noted that for model 2 with indirect innovation factors, the value of SEC was less than for model 1. Since SEC mainly depends on institutional arrangement, we can conclude that there was a negative spillover trend.
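As a quick consistency check on the reported averages, one can assume that the efficiency change factors multiplicatively as EC = PEC × SEC. This is a common convention for the CRS/VRS decomposition, but it is an assumption here, since the text does not state the identity explicitly. Under it, the products of the reported components are:

$$
\begin{aligned}
\text{model 1:}\quad & PEC \times SEC = 0.937 \times 1.061 \approx 0.994,\\
\text{model 2:}\quad & PEC \times SEC = 0.905 \times 0.817 \approx 0.739.
\end{aligned}
$$

The model 2 product coincides with the average technical efficiency change of 0.739 quoted above, which suggests that this average may refer to the indirect effects model.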
The significance of the obtained results is as follows. A decrease in technical efficiency in unfavorable environment conditions was established, whereas there was a shift of the regional efficient frontier. A change in the index components (efficiency shift and technology change) suggested that in an unfavorable environment, the technological frontier is moving and there has been no catch-up development among the regions.


Discussion
The results of the regional data analysis over the period 2011-2017 indicated growth of the total factor productivity of regional innovation development on average. At the same time, a negative trend in the dynamics of the direct effects of innovative development was observed in 2014-2017. The evaluation of innovation activity effectiveness in terms of indirect effects demonstrated positive dynamics in Russia (H1). This may indicate the impact of innovation policy on innovative development in the regions.
The average value of technical efficiency change for both models with direct and indirect effects was less than 1. This demonstrated a reduction in changes reflecting the diffusion effects of technology. Moreover, whereas the direct effects model showed positive dynamics in 2014-2017, the spillover effects showed a negative trend in the past period. This indicated a growing gap between the levels of regional innovation development (H2), which in turn revealed the process of polarization in regional innovative development and its decreased effectiveness.
The reverse trend was observed for technological change, which records shifts in the frontier. In 2014-2017, the value of the regions' technological change indicated a downward trend with respect to the direct effects of regional innovation development. We could see significant growth in the indirect effect in 2014-2017.
The scale efficiency change, on average for all time periods, indicated a downward trend according to the direct effects model and an upward trend for the indirect effects model. This indicated that the main source of innovative development is mostly the economy of scale (H3). This gives evidence of an extensive, resource-based type of economic growth, where the effectiveness of regional innovation systems grows only due to the expansion of resources rather than due to their effective use.
Compared with previous studies, we proposed a methodology for assessing innovation results in terms of direct and indirect effects of innovation in order to estimate which factor is responsible for the innovative development of a region: the expansion of resources or their effective use. Russian regions were ranked according to changes in their total factor productivity and the rate of their regional innovative development; we identified leaders and outsiders in terms of direct and indirect effects of innovation. Our study made it possible to identify promising regions in terms of innovation growth and revealed the reasons for this effectiveness.
Previous results and literature findings [17,21] have established that over the past few years, the effectiveness of Russian regional innovation systems has increased, and the most efficient innovation development has occurred in the largest agglomerations and technologically developed regions with leading universities and research centers. We also confirmed the findings about the positive dynamics of innovative development in Russia for 2010-2017; however, at the same time, the polarization of regions has been increasing, while the effectiveness of innovative development has been decreasing. Our study showed that the regions that have been witnessing significant growth over our chosen 15-year period are not research and university centers, and the growth of their innovation indicators is caused not by patent activity but by some other factors.
Previous Russian studies have employed the results of patenting as their resulting parameters. We expanded the set of output parameters, which is relevant given the Russian specifics of "not exactly accurate" patent statistics. Many Russian regions fail to demonstrate high patent activity, but at the same time, they have good indicators of the innovation development rate. We differentiated the direct and indirect effects of the innovation process and tried to draw attention to the complexity and multifactorial nature of evaluating innovative system effectiveness. Thus, the growth of the resulting factors for innovative development was extensive and occurred due to resource factors.
Our study was designed to draw attention to the fact that the standard statistical approach for assessing the effectiveness of innovative development needs to be expanded. The DEA Malmquist Index and cluster analysis that we used made it possible to account for the inconsistency in the development of Russian innovation systems due to the vast territory and extreme heterogeneity of economic development in the Russian regions. Our approach and the proposed models allowed us to obtain significant conclusions and identify those regions that, with a decrease in resourcing, have demonstrated an increase in the resulting indicators. It is important to know which regions are more efficient and why they achieve better innovation results. This matters for innovation decision-making and for regional policy, and it is particularly important for the identification of regions with best practices. This information makes it possible to draw attention to the best practices of these regions and to benchmark other regions against them.
This finding could be important in terms of policy realization. The results can be used to diagnose effectiveness and prioritize regional innovation policies for specific regions, as well as to justify public investments in R & D, the development of mechanisms for attracting investors' funds to innovation, the improvement of legislative documents regulating innovative activity, and the development of measures to promote the innovation sphere in order to solve the problems of innovative development of the Russian economy.

Conclusions
The paper presents a study of innovation system effectiveness in Russian regions based on cluster analysis and the DEA Malmquist Index. As a result, conclusions were drawn about the shift of the technology frontier and the dynamics of regional innovation development in Russian regions.
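To make the link between the Malmquist index and the technology frontier shift concrete, the following is a minimal sketch of the standard two-way decomposition of the output-oriented index into efficiency change (catch-up) and technical change (frontier shift). It assumes that the four output distance function values have already been obtained from DEA models. The function name, argument names, and numerical values are hypothetical illustrations and are not taken from the study.

```python
from math import sqrt, isclose


def malmquist(d_t_t, d_t_t1, d_t1_t, d_t1_t1):
    """Output-oriented Malmquist index with its two-way decomposition.

    Naming assumption: d_<frontier>_<data>, e.g. d_t_t1 is the output
    distance of period t+1 data measured against the period t frontier.
    A distance of 1 means the region lies on that frontier.
    """
    # Efficiency change: movement of the region relative to its own frontier.
    ec = d_t1_t1 / d_t_t
    # Technical change: geometric mean of the frontier shift, evaluated
    # at the period t+1 data and at the period t data.
    tc = sqrt((d_t_t1 / d_t1_t1) * (d_t_t / d_t1_t))
    # Malmquist index: geometric mean of the productivity ratios
    # measured against the period t and period t+1 frontiers.
    mpi = sqrt((d_t_t1 / d_t_t) * (d_t1_t1 / d_t1_t))
    return mpi, ec, tc


# Hypothetical distance values for a single region (not from the study).
mpi, ec, tc = malmquist(d_t_t=0.80, d_t_t1=1.00, d_t1_t=0.64, d_t1_t1=0.80)

# The index equals the product of its two components.
assert isclose(mpi, ec * tc)
```

With these illustrative values, the efficiency change equals 1 and the index is approximately 1.25, so the whole productivity gain of this hypothetical region would be attributed to the frontier shift. An index above 1 indicates productivity growth, and a value below 1 indicates decline.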
To assess direct and indirect innovation effects, the study used indicators of the growth dynamics of regional innovative development in Russian regions for two periods: 2011-2014 and 2014-2017. The analysis covered the following variables: the volume of innovative products and the increase in innovative activity, the increase in the share of high-tech products in the GRP structure, the number of patents put into use, and the growth of financial and investment activity in the economy.
The present analysis has some limitations that could form interesting topics for further research. In particular, the sensitivity analysis of the dynamic Malmquist DEA models was complicated by the limited availability of statistical data for long periods and by the insufficiency of economic performance indicators to explain the nature and specifics of regional innovative dynamics.
An assessment of innovation development could reveal other aspects in the analysis of regional strategy. Further research should aim to identify the root causes of the growth in innovative effectiveness in these regions. A separate analysis of particular regions would be an extension of great interest. It would also be useful to apply additional DEA methods and stochastic frontier analysis models, which would allow us to compare models and supplement the sensitivity analysis with econometric approaches. A logical continuation of the research would be to extend the time interval.
Funding: This research was funded by the Russian Science Foundation, grant number 19-18-00199.

Conflicts of Interest: The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: