Dynamic Convergence of Green Total Factor Productivity in Chinese Cities

Abstract: China's urban areas account for a large proportion of the country's total energy consumption, and this consumption is accompanied by substantial pollutant emissions. Given the requirement for green economic development, it is necessary to study green total factor productivity (GTFP) in cities. In this study, the Malmquist index, spatial autocorrelation analysis and convergence analysis are used to analyze the GTFP of 263 prefectural or higher-level cities in China. The results show a growing trend in GTFP in Chinese cities, indicating an increase in efficiency. In addition, the eastern region has the highest efficiency, followed by the central region, with the western region the lowest. The calculated GTFP values show relatively strong overall spatial clustering, with some local high-high clusters. GTFP also shows relatively weak divergence and no sign of convergence. Thus, we propose that, to improve GTFP and narrow the gap between regions, it is necessary to enhance technological progress and restructure industrial productivity in cities.


Introduction and Literature Review
More than half of the world's population now lives in cities, with 67 percent predicted to do so by 2050 [1]. Indeed, cities play an increasingly important role in human development. After the reform and opening-up, China has seen an accelerating urbanization process. By 2018, China's urbanization rate reached 59.58% [2]. With the increase of population in urban areas, the secondary and tertiary industries would greatly agglomerate, playing an increasingly important role in each region's economic development. The changes in lifestyle and increase in economic activities would definitely lead to increased energy consumption. Cities now account for 85 percent of China's total energy consumption, according to the China Energy Consumption Report [3]. Increasing energy consumption results in the increase of pollutant emission, which threatens the sustainable development of urban areas, even at a regional scale. According to the 2018 State of China's Ecological Environment Bulletin [4], the air quality in 64.2% (217 out of 338) of the cities at or above the prefectural level (from the monitoring network of the ministry of ecological environment) exceeded China's environmental standards. China's provincial-level (highest-level) administrative divisions contain 23 provinces, 5 autonomous regions, with 4 municipalities directly under the central government and 2 special administrative regions. Each province and autonomous region can be divided into a number of cities, known as prefectural-level cities.
In 2018, the area of acid rain zone was about 530,000 square kilometers, accounting for 5.5% of China's total land area [4]. According to China National Environmental Analysis (2012), the cost of air pollution in China was at around 1.2% of GDP in terms of cost of illness and was as high as 3.8% in terms of willingness to pay for it [5]. Therefore, it is particularly important to reduce energy consumption and pollution without affecting economic development. An important way to achieve this goal is to develop green economy and achieve green growth. As defined by the Organization for Economic Cooperation and Development (OECD), green growth means "fostering economic growth and development, while ensuring that natural assets continue to provide the resources and environmental services on which our well-being relies" [6]. Thus, it is of great significance to study the efficiency of green growth in cities.
There are different terms for green growth efficiency. For example, Ma [7] directly used the term green growth efficiency, while Lei [8] and Li [9] called it GTFP (green total factor productivity). Total factor productivity (TFP) refers to everything that affects growth other than physical capital and labor (institutions, policies, technologies, etc.). TFP is at the center of studies on economic growth and production efficiency. Similarly, total factor energy productivity (TFEP) is an energy efficiency measure proposed by Hu [10] that takes into account the input factors related to economic output and energy efficiency. In doing so, the TFEP measure overcomes the deficiency of traditional single-factor energy efficiency [11]. Chang [12] conducted an empirical analysis of TFEP for the 28 member countries of the European Union and found that countries in the Baltic Sea region performed better than countries outside it in terms of average energy efficiency.
Using a slacks-based measure (SBM) model, Chen [13] analyzed TFEP of Anhui Province's 38 industrial sub-industries in China. The results suggested that industrial sector in Anhui had low TFEP which left much room for improvement. Based on the non-radial directional distance function of the data envelopment analysis (DEA) model, Cheng [14] used the meta-frontier method to measure the TFEP of 30 provinces in China from 1997 to 2016. They found that TFEP had shown a significant regional heterogeneity, with the highest being the Eastern region, the second highest being the central region and the lowest being the western region. Data envelopment analysis (DEA) is a method of evaluation of relative data validity based on input-output data. DEA method can be applied with many calculation methods. If the decision making unit (DMU) achieves the efficient frontier by reducing all inputs or expanding all outputs in the same proportion, this is called the radial distance DEA function. If the inputs or outputs of the DMU does not necessarily decrease or increase proportionally, this is called the non-radial directional distance function. Meta-frontier method refers to an algorithm that takes into account the potential technical level of all DMUs. Wu [15] studied the spatiotemporal pattern of total factor energy efficiency in 11 provinces and cities along the Yangtze River economic belt. The results showed that the total factor energy efficiency of the Yangtze River economic belt under environmental constraints had decreased with a 2.9% annual rate between 1999 and 2013. Liu [16] utilized the inter-provincial panel data and DEA to analyze the total factor productivity of energy in eight major regions of China. With the growing emphasis on green economy and green growth, scholars also considered undesirable outcomes by the growth in industrial outputs when analyzing TFEP, such as carbon dioxide, sulfur dioxide and other pollutants. 
When calculating TFEP, the efficiency obtained by taking energy, labor and capital as inputs, GDP as the desirable output and pollutants such as sulfur dioxide and carbon dioxide as undesirable outputs is called green total factor productivity (GTFP).
The research on green energy efficiency can be divided into three different directions: (1) Research on GTFP in different industries. For instance, Feng [17] investigated the sources of GTFP changes and inefficiency in China's metal industry during 2000-2015 using a meta-frontier approach. They found that GTFP in China's metal industry increased by 11.52% annually, with technological progress being the most critical driving factor. They also found that narrowing the regional technology gap played a significant role in promoting GTFP growth. Finally, they concluded that the declines in scale efficiency and pure technical efficiency were two inhibitors of economic growth. Using a global DEA, Zhu [18] analyzed the GTFP of China's mining and quarrying industries for the period of 1991-2014 with  [21] studied land transfer marketization (LTM) on GTFP and its mechanisms. They found that LTM had a significant promoting effect on GTFP in Chinese cities and that the effect was also significant in the eastern, central and western regions, indicating that land transfer policy regulating regional economic development was an important factor in most regions of China. (3) Research on the GTFP in different regions. Rusiawan [22] evaluated the effect of CO2 intensity on TFP in Indonesia and stated that GTFP would improve productivity growth and emission reduction. Xia [23] studied provincial GTFP in China. Their results showed little evidence of either traditional productivity growth or green productivity growth for most provinces, despite dramatic and continuous GDP growth in the country. In addition, after incorporating environmental factors, productivity performance improved in some provinces but deteriorated in others.
Focusing on the primary provinces along China's belt and road (BRI) route, Liu [24] used a global Malmquist-Luenberger (GML) index based on SBM directional distance function to evaluate provincial GTFP and quantitatively analyzed the BRI's net effect on provincial GTFP. The results indicated a relatively good development for provincial GTFP, with technological progress being its main driving force. Shao [25] used a stochastic frontier analysis method to study Shanghai's industrial green development transformation. Shao's results showed an overall upward trend for the industrial GTFP in Shanghai with technical efficiency changes being identified as the main factor.
Based on findings from the aforementioned studies, it can be seen that most studies on GTFP in China were carried out with analyses limited to several provinces, regions or cities-or just certain types or industrial sectors. Furthermore, there are only a few studies on urban GTFP that were conducted with a national coverage. China is a large country enclosing a large number of administrative regions with different levels of economic and technological development. Given the vast territory, differences are expected to exist among these regions in terms of GTFP. Due to the influence of the financial crisis in 2008, China's economic growth had since slowed down. Under this new normal, improving GTFP became an important proposition in the economic development of China. To resolve the increasingly higher energy and environmental pressures, it is necessary to make a comprehensive analysis of GTFP from the perspective of city.
In this study, we use the Malmquist index, spatial autocorrelation analysis and convergence analysis to study the GTFP of Chinese cities. This study has seven sections. The second section introduces the analytical methodology applied in the study. The third section is the introduction to computational methods. The fourth section discusses the selection of indicators and data sources. The fifth section presents the analytical results from the analysis. The sixth section is the discussion of the research outcome. The last section concludes the discussion along with some policy recommendations.

DEA-Malmquist Index Method
Data envelopment analysis (DEA) is a quantitative analysis that evaluates the relative efficiency of comparable units of the same type according to multiple input indexes and multiple output indexes by using the method of linear programming. The DEA method can be used for efficiency analysis with multiple input and output variables [26]. Analytical results of this method are not affected by the data units. Data of different units can be directly calculated without standardization. This method also has the advantage of avoiding manual or arbitrary assignments of weights in the calculation of index values, so it avoids the influence of human factors. These characteristics allow DEA to be the most commonly used method for assessing efficiency. It should be noted, however, that the general DEA method can only make comparisons of efficiency among different places in the same period instead of making comparisons of efficiency over different periods. To that end, the DEA-Malmquist index method can overcome these shortcomings and calculate the panel data and spatiotemporally compare the efficiency [27]. The DEA-Malmquist index is a model that describes the change of productivity by calculating the ratio of the distance functions at two different moments using DEA.
Most scholars analyzed efficiency from the perspective of technological changes as a whole without any in-depth discussion on the specific components of technological changes. This shortcoming can be avoided by using the DEA method. Another advantage of the DEA-Malmquist index is its ability to decompose efficiency index into two indices: one on technical efficiency changes and another on technical changes, making it easier to understand the driving factors for the efficiency changes.
Many studies tended to treat the undesirable output of industrial growth and those of growth in GDP together. Hailu [28] suggested that the undesirable output can be treated as an input variable and GDP as an output variable. They compared such method with traditional computational method where the undesirable output is treated as an input variable, finding such method is better than the traditional method. Following the practice in Xie [29], we treated the undesirable output as the input variable in the analysis.

Spatial Clustering
Spatial autocorrelation indices can describe the interdependence or the level of clustering among geographic phenomena as represented by the observed data in an area. For example, Chen [30] studied the park quality of Cache County in northern Utah by using spatial autocorrelation index. The results showed that parks located near each other often share similar qualities. Parks located further away from each other often displayed many different qualities. Using the 1990 data from 62 large American metropolitan areas to analyze the effects of subcenters on the spatial distribution of employment, McMillen [31] found that proximity to subcenters significantly increased employment densities in this sample of cities. Simonetta [32] studied employment levels of the 326 West German districts in year 2000. The figure suggested that high-employment regions tend to be located close to other high-employment regions, while low-employment regions tend to be located close to other low-employment regions. Cities that are near to each other tend to engage similar economic activities and have similar consumer cultures. There are certain spatial interactions and technological diffusion between neighboring cities. Consequently, cities adjacent to each other may be interdependent on each other in some ways. Liu [33] found that carbon emissions in 30 provincial-level regions in China have a spatial spillover effect, where the spread of advanced technology from one province to a neighboring province reduced carbon emissions in the neighboring province. Yuan [34] found that in prefecture-level cities, financial agglomeration had a spatial spillover effect. Wang [35] used a three-stage DEA model and the global spatial autocorrelation test to study the energy efficiency of 30 provinces and regions in China. They also found that the energy efficiency had a spillover effect. 
Yang [36] used panel data from 30 provinces and regions to study carbon dioxide emissions in the transportation industry. They, too, found spatial spillover effects in carbon dioxide emissions. In a study of the Indian chemical industry, Kutlu [37] found that technological efficiency also had a spatial spillover effect.

Convergence Analysis
Research on the convergence of efficiency has also been a focus of scholars. Herrerias [38] studied the convergence of energy intensities in China and found a mixture of convergence clubs and divergence. Club convergence is the idea that areas with different initial conditions form different clubs in terms of development, while areas with similar conditions within a club show convergence in development. Bhattacharya [39] analyzed the convergence of energy efficiency among Indian states and regions. Their club merging analysis showed that energy productivity across the states and territories converges into two clubs, with one divergent club. Ma [40] analyzed the convergence of energy efficiency over time in the construction industry in seven regions of Australia, finding convergence in five states and divergence in the other two. Parker [41] analyzed the convergence of energy efficiency in manufacturing in Organization for Economic Cooperation and Development (OECD) countries and non-OECD countries, and indicated that both developed and less-developed regions showed a convergence trend. Pan [42] analyzed the convergence of China's energy efficiency at the provincial level and found that China's regional energy efficiency was converging. Convergence was also found by Li [43], who employed a panel data model to study the convergence characteristics of energy efficiency in the eastern, central and western regions of China.
The research on convergence can be divided into β-convergence, σ-convergence and club convergence. β-convergence analyses whether poor countries or regions will catch up with rich ones and describes the rate at which countries are converging. σ-convergence looks at income inequalities or differences among countries or regions, and it analyses whether the dispersion of income distribution shrinks [44].
Most studies on convergence and related problems have been dominated by the β-convergence method [45], whereas studies based on σ-convergence remain scarce. Yet σ-convergence analysis is an important method for studying regional disparity in economic growth [46], and it can reflect the dynamic process of regionally unbalanced development. Therefore, to fill this gap, we adopted σ-convergence in this study to analyze the convergence of green total factor productivity in China.

Green Total Factor Productivity
The input-output efficiency that accounts for energy consumption and pollutant emissions is called green total factor productivity. Green total factor productivity was calculated using the Malmquist index method. Let $(x^t, y^t)$ and $(x^{t+1}, y^{t+1})$ represent the input and output vectors for period $t$ and period $t+1$, respectively. A change in the input-output relationship from $(x^t, y^t)$ to $(x^{t+1}, y^{t+1})$ indicates a change in production efficiency, which includes changes in the technology level and changes in technical efficiency. $D^t(x^t, y^t)$ and $D^t(x^{t+1}, y^{t+1})$ are the distance functions of the observations in periods $t$ and $t+1$, respectively, measured against the technology of period $t$; $D^{t+1}(x^t, y^t)$ and $D^{t+1}(x^{t+1}, y^{t+1})$ are the corresponding distance functions measured against the technology of period $t+1$. The calculation of the Malmquist index follows Färe [47], Zhou [48] and Huang [49].
Following Caves [50], the Malmquist index can be expressed as

$$\mathit{tfpch} = M(x^{t+1}, y^{t+1}, x^t, y^t) = \left[ \frac{D^t(x^{t+1}, y^{t+1})}{D^t(x^t, y^t)} \times \frac{D^{t+1}(x^{t+1}, y^{t+1})}{D^{t+1}(x^t, y^t)} \right]^{1/2} \tag{1}$$

If tfpch > 1, efficiency is higher than in the previous year. If tfpch < 1, efficiency is lower than in the previous year. If tfpch = 1, there is no change compared with the previous year.
The Malmquist index can be decomposed into:

$$\mathit{tfpch} = \mathit{effch} \times \mathit{techch}, \quad \mathit{effch} = \frac{D^{t+1}(x^{t+1}, y^{t+1})}{D^t(x^t, y^t)}, \quad \mathit{techch} = \left[ \frac{D^t(x^{t+1}, y^{t+1})}{D^{t+1}(x^{t+1}, y^{t+1})} \times \frac{D^t(x^t, y^t)}{D^{t+1}(x^t, y^t)} \right]^{1/2} \tag{2}$$

where effch represents the change in technical efficiency. It measures the allocation level of production input factors in different industries, that is, the change in industrial management levels [51,52]. The value of techch represents the change in technology levels, which can be used to measure both changes in the technology level and the application range of new technology. Technological progress is reflected not only in the application of energy technology, but also in the whole process from the input of energy, which is itself a factor of production, into the economic system through to economic output [53]. When effch > 1, management has improved over the previous year. When effch < 1, management has declined over the previous year. When effch = 1, the level of management is unchanged between the two periods. The value of techch is interpreted in the same way as effch.
Under the assumption of variable returns to scale (VRS), effch can be further decomposed into a pure technical efficiency index and a scale efficiency index:

$$\mathit{effch} = \mathit{pech} \times \mathit{sech} \tag{3}$$

where pech denotes the change in pure technical efficiency. Pure technical efficiency reflects the utilization efficiency of input factors in a decision-making unit (DMU) at a given scale [54]. The term sech stands for the scale effect, that is, the change in technical efficiency caused by the change in industrial scale [55].
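The decomposition above can be illustrated numerically. The following is a minimal sketch, assuming the four distance-function values have already been obtained from a DEA run; the function name `malmquist` and the input numbers are hypothetical and are not taken from the study's data.

```python
import math


def malmquist(d_t_t, d_t_t1, d_t1_t, d_t1_t1):
    """Return (tfpch, effch, techch) from four distance-function values.

    d_t_t   : D^t(x^t, y^t)
    d_t_t1  : D^t(x^{t+1}, y^{t+1})
    d_t1_t  : D^{t+1}(x^t, y^t)
    d_t1_t1 : D^{t+1}(x^{t+1}, y^{t+1})
    """
    # Technical efficiency change: catching up to the frontier.
    effch = d_t1_t1 / d_t_t
    # Technological change: geometric mean of the frontier shift
    # evaluated at the period t+1 and period t observations.
    techch = math.sqrt((d_t_t1 / d_t1_t1) * (d_t_t / d_t1_t))
    # Malmquist index as the geometric mean of the two period ratios.
    tfpch = math.sqrt((d_t_t1 / d_t_t) * (d_t1_t1 / d_t1_t))
    return tfpch, effch, techch


# Hypothetical distance-function values for a single city.
tfpch, effch, techch = malmquist(0.80, 0.95, 0.70, 0.88)
print(f"tfpch={tfpch:.4f} effch={effch:.4f} techch={techch:.4f}")
# The index equals the product of its two components.
assert math.isclose(tfpch, effch * techch)
```

A value above 1 for any of the three terms is read as an improvement over the previous period, in line with the interpretation given above.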

Global Spatial Autocorrelation
Moran's Index reflects the degree to which attribute values cluster among adjacent area units. The formula of Moran's I is:

$$I = \frac{n}{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \cdot \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$$

where $x_i$ and $x_j$ are the observed values in regions $i$ and $j$ and $\bar{x}$ is their mean.
In this equation, $n$ is the number of regions and $w_{ij}$ is an element of the spatial weights matrix. The value of Moran's I is generally between −1 and 1, with an expected value of −1/(n − 1) under a random distribution of attribute values across the area units. When I < −1/(n − 1), there is negative spatial autocorrelation: similar values do not cluster and instead form a dispersed pattern. When I is close to −1/(n − 1), which approaches 0 for large n, no spatial autocorrelation is indicated. When I > −1/(n − 1), there is positive spatial autocorrelation, with similar values clustering together. Each calculated I value is accompanied by a p-value and a z-score, which are used to assess the statistical significance of the index. The smaller the p-value and the larger the absolute value of the z-score, the more safely the null hypothesis of spatial randomness can be rejected.
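To make the sign convention of Moran's I concrete, the following is a minimal sketch that assumes the standard form of the statistic with a user-supplied weights matrix. The function name `morans_i`, the four-unit layout and the values are hypothetical; the study itself used GeoDa rather than code of this kind.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of all spatial weights.
    s0 = sum(sum(row) for row in weights)
    # Cross-products of deviations for every weighted pair.
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


# Four hypothetical units in a row; adjacent units are neighbors (weight 1).
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1.0, 1.0, 2.0, 2.0], w)    # similar values side by side
alternating = morans_i([1.0, 2.0, 1.0, 2.0], w)  # dissimilar values side by side
expected = -1.0 / (len(w) - 1)                   # value under randomness
print(clustered, alternating, expected)
```

In this toy layout the clustered pattern yields an index above −1/(n − 1) and the alternating pattern yields one below it, matching the interpretation of positive and negative spatial autocorrelation.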

Local Spatial Autocorrelation
Moran's I can only indicate that GTFP may have global spatial autocorrelation, but it cannot indicate if it is high-high clustering or low-low clustering in local areas. Note that high-high clustering means a location with a high GTFP value is surrounded by neighboring locations that also have high GTFP values. Conversely, a low-low clustering means a location with a low GTFP value is surrounded by neighboring locations that also have low GTFP values. There are two local autocorrelation coefficients available for us to use: local Moran index and local G coefficient, but local G coefficient is better than local Moran index in exploring spatial clustering (according to [56]).
The local G coefficient is calculated as follows:

$$G_i(d) = \frac{\sum_{j=1}^{n} w_{ij}(d)\, x_i x_j}{\sum_{j=1}^{n} x_i x_j}$$

where $d$ is the threshold distance between spatial units and $w_{ij}(d)$ is the spatial weight defined according to the user-specified distance function. In our study, if the distance between unit $i$ and unit $j$ is less than $d$, the corresponding weight is 1; otherwise, it is 0. $x_i$ and $x_j$ are the attribute observations of unit $i$ and unit $j$. When the z value is positive and significant, there is high-high clustering in the data; when the z value is negative and significant, there is low-low clustering. As the z value approaches zero, the data are randomly distributed [57].

Analysis of Convergence
To explore how the dispersion of GTFP across the studied cities changes over time, σ-convergence was used. In the σ-convergence formula, k (k = 1, 2, ..., n) indexes the cities, t denotes the time period, $x_{kt}$ is the GTFP of city k at time t and $\bar{x}_t$ is the average GTFP of all cities at time t. When the value of σ becomes smaller over time, the differences in GTFP narrow and GTFP converges. Otherwise, GTFP diverges and regional differences widen. Using data for 83 countries over the 1960-1989 period, Miller [58] found that real GDP per worker provided little evidence of σ-convergence, while σ-convergence existed for total factor productivity per worker. Carree [59] studied labor productivity across manufacturing industries in 18 OECD countries over the period 1972-1992. They found evidence that industries with relatively high labor productivity had a low rate of σ-convergence of productivity.
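The σ-convergence check can be sketched in a few lines. This is an illustration only: it assumes σ is the population standard deviation of GTFP across cities in each period, which is one common choice (a coefficient of variation or a log-based dispersion are alternatives), and the names `sigma_by_period` and `trend` as well as the panel values are hypothetical.

```python
from statistics import pstdev


def sigma_by_period(gtfp):
    """Map each period label to the dispersion of city GTFP values.

    Assumption: sigma is the population standard deviation around
    the cross-city mean of that period.
    """
    return {period: pstdev(values) for period, values in gtfp.items()}


def trend(sigmas):
    """Compare sigma in the first and last period (insertion order)."""
    ordered = list(sigmas.values())
    if ordered[-1] < ordered[0]:
        return "convergence"
    if ordered[-1] > ordered[0]:
        return "divergence"
    return "no change"


# Hypothetical GTFP values for three cities in two periods.
panel = {
    "2008-2009": [0.9, 1.0, 1.1],
    "2009-2010": [0.8, 1.0, 1.2],
}
sig = sigma_by_period(panel)
print(sig, trend(sig))
```

Comparing only the first and last periods is a simplification; in practice the whole σ series would be inspected for a sustained downward or upward movement.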

Index Selection and Data Source
Since 1997, different indicators have been used in the statistics for prefectural and higher-level cities and for county-level cities. As a result, some indicators are not comparable between types of cities whose sizes or administrative ranks differ greatly. To avoid this problem, this study uses only prefectural and higher-level cities for analysis.
To measure urban GTFP, we need to select evaluation indices that represent both urban economic inputs and outputs. Following the principles of representativeness, accuracy, authenticity and availability in index selection, the local gross domestic product (GDP) of the city is set as the output index, while the input indices include urban electricity consumption, urban fixed-asset investment, the number of urban employees and industrial sulfur dioxide emissions.
(1) Local GDP. The unit is 100 million yuan (CNY). Using the GDP index, all GDP figures are converted using the 2008 consumer prices as the base.
(2) Energy consumption. Because primary energy consumption data are lacking for many Chinese cities, we use electricity consumption data instead; the unit is 10,000 kWh. Electricity consumption data are collected by computers and are more accurate.
(3) Fixed-asset investment. We use urban fixed-asset investment; the unit is 100 million CNY. Due to the lack of conversion data, current-year prices are used.
(4) Labor population. The number of urban employees is used; the unit is ten thousand.
(5) Pollutant emissions. These are measured by the volume of industrial sulfur dioxide emissions; the unit is ton. Sulfur dioxide is chosen as the pollutant index because it accounts for 70% of industrial pollutant emissions in China [60] and such data are easy to acquire.
According to the measurement method of efficiency, DEA models can be divided into input-oriented, output-oriented and non-oriented models. An input-oriented model measures the degree of inefficiency of the evaluating DMU from the perspective of input, focusing on the degree to which each input should be reduced to achieve the technical effectiveness without reducing the output. Whereas an output-oriented model measures the inefficiency degree of the evaluating DMU from the perspective of output, focusing on the degree to which the technically effective outputs should be increased without increasing input. A non-oriented model measures both input and output simultaneously. According to the above principles, the input-oriented model is used in this study. Industrial sulfur dioxide emissions are non-expected outputs. Sulfur dioxide emissions are taken as input along with electricity consumption, fixed assets and labor and urban GDP as output.
The 2008 global financial crisis had a great impact on world economic development. In the wake of the financial crisis, China's response to the economic crisis led to a brief relaxation of capacity controls. At the same time, another 4 trillion CNY was invested to stimulate economic development, making the problem of over-capacity worse [61].
In order to understand the change in GTFP under such circumstances, data from 2008 to 2016 were selected for this study. The data mainly come from the statistical yearbooks of Chinese cities (2009-2017) [62]. Part of the data come from the statistical yearbook of China, the statistical yearbooks of each province, the statistical yearbooks of prefectural cities and the socioeconomic development bulletins of each city. Cities with relatively more missing data were excluded, including Yinchuan, Xining, Zhongwei, Linfen, Qiqihar, Meizhou, Heyuan, Qingyuan, Zhongshan, Panzhihua, Nanchong, Bazhong, Ziyang and other cities. Interpolation was used to fill in values for cities with relatively few missing data. In addition, due to the lack of data, this analysis does not include Hong Kong, Macao and Taiwan. Finally, a total of 263 prefecture-level cities with complete data were obtained. The descriptive analysis of each input/output variable is presented in Table 1. Due to the large differences in economic scale and population size among cities, the value range of each variable is also large.
Note: The unit of local gross domestic product (GDP) is 100 million Chinese yuan (CNY); the unit of SO2 is ton; the unit of energy is 10,000 kilowatt-hours; the unit of fixed assets is 100 million CNY; the unit of labor is ten thousand.
The DEA-Malmquist calculation method requires that the number of DMUs be at least twice the number of input and output variables; an insufficient number of DMUs will affect the analysis results [63]. With four inputs and one output, 263 DMUs fully meet this requirement.

Results
As previously stated, we used power consumption, fixed assets investment, labor population and sulfur dioxide emissions as the input variables and economic output as the output variable. Using the software DEAP 2.1 that was developed by Coelli's team (https://economics.uq.edu.au/cepa/software), we calculated a GTFP value for each city and the average of all calculated values of GTFP over 9 years, with a total of 2375 records. With these data, we analyze the spatiotemporal variations of GTFP in studied Chinese cities. Table 2 shows the changes in the GTFP and its components over the studied period. As we can see from Table 2, although the GTFP of Chinese cities had fluctuated over the studied period, the overall trend shows a continuous improvement of efficiency. There are seven years with GTFP values that are greater than 1, with the only exception in that of 2012-2013.  The decrease of pure technology efficiency is related to the decrease of technology import quantities and the slower rates of independent innovation in China [64,65]. The weakening of the scale effect may be related to the serious overproduction after 2012. The government had vigorously compressed production capacity, resulting in the weakening of the scale effect. It can also be related to the rising price of human capital in China in recent years, which had caused many enterprises to shift their industries to southeast Asia and other places.

GTFP Analysis of Cities in Different Regions
According to the classification standard of [66], China's provinces are divided into three regions: East, Central and West. The distribution of the three regions is shown in Figure 1. The eastern region comprises 11 provinces and municipalities (Beijing, Tianjin, Shanghai, Liaoning, Hebei, Shandong, Jiangsu, Zhejiang, Fujian, Guangdong and Hainan), with a total of 92 cities. The central region comprises 8 provinces (Heilongjiang, Jilin, Shanxi, Henan, Anhui, Hubei, Hunan and Jiangxi), with a total of 97 cities. The western region comprises 12 provinces (Shaanxi, Gansu, Qinghai, Ningxia, Xinjiang, Sichuan, Chongqing, Guizhou, Yunnan, Tibet, Guangxi and Inner Mongolia). Due to the lack of data on Qinghai and Tibet, this study covers 10 of these provinces, with 74 cities in the western region.
Table 3 reports the calculated GTFP values for the entire country and for the three regions. Over the 9 years studied, the eastern region has the highest GTFP values, followed by the central region. Not surprisingly, the western region has the lowest efficiency. The effects of technological progress differ among the three regions (1.039, 1.036 and 1.032, respectively), whereas the differences in technical efficiency are small. As Figure 2 shows, the GTFP values of the three regions follow basically the same trend. From 2011 to 2014, GTFP growth slowed, which is related to the large amount of overcapacity that built up after 2009 and lowered the efficiency of economic development. After capacity was cut in 2015, GTFP improved rapidly. Figure 2 also shows that efficiency sometimes improves more slowly in the eastern region than in the central region. This could be related to the greater competitive pressure between cities in the eastern region [67]. Government stimulation of enterprise investment through fiscal subsidies and other means can easily lead to overcapacity [68]. The improvement of GTFP in the western region reflects its previously lagging technology and management. After taking on industries transferred from the eastern region, the western region had an obvious advantage in improving GTFP quickly.
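The regional comparison above rests on the Malmquist decomposition, in which the productivity index is the product of a technical efficiency change index and a technological change index. The following is a minimal sketch of that arithmetic, not the computation used in this study. It assumes that city-level indices are averaged within a region by a geometric mean, which is a common choice for multiplicative indices but is not stated in the text. The function names and all numerical values are hypothetical.

```python
import math

def gtfp_index(efficiency_change, technical_change):
    """Malmquist decomposition: productivity change = efficiency change x technical change."""
    return efficiency_change * technical_change

def geometric_mean(values):
    """Geometric mean (assumed averaging rule for multiplicative indices)."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical (efficiency change, technical change) pairs per city.
# These are illustrative numbers, not data from the study.
cities = {
    "East": [(1.00, 1.04), (1.01, 1.05)],
    "Central": [(0.99, 1.04), (1.00, 1.03)],
    "West": [(0.98, 1.03), (1.00, 1.02)],
}

# Regional GTFP as the geometric mean of the city-level indices.
regional_gtfp = {
    region: geometric_mean(gtfp_index(ec, tc) for ec, tc in pairs)
    for region, pairs in cities.items()
}

for region, value in regional_gtfp.items():
    print(f"{region}: {value:.4f}")
```

An index above 1 indicates an improvement in productivity relative to the previous period, and an index below 1 indicates a decline.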

Spatial Effects of Green Total Factor Productivity Index
Global and local spatial autocorrelation analyses were performed using GeoDa software. The results of the global spatial autocorrelation analysis are shown in Table 4. Moran's Index is generally considered statistically significant if the p-value is less than 0.05 and the z value is ≥ 1.65 or ≤ −1.65. From Table 3 The technological progress and technological efficiency of neighboring cities can affect each other. This differs from the conclusion in Han [69] that urban economic development has a negative effect on the industrial energy efficiency of neighboring cities.
The univariate local Geary statistic detects significant spatial clustering of a single variable at the local level [69,70]. The univariate local Geary index in GeoDa software was used to assess the 2012-2013 GTFP distribution, which showed strong spatial autocorrelation. The results are shown in Figure 3. In the figure, red indicates high-high clusters and yellow indicates low-low clusters. High-high clusters are areas where cities with high GTFP are surrounded by other high-GTFP cities, and low-low clusters are areas where cities with low GTFP are surrounded by other low-GTFP cities. The high-high clusters are mainly distributed in Jilin, Jiangsu, Shanghai, Beijing, Hebei and western Sichuan, while the low-low clusters are mainly distributed in central China. The 2010-2011 GTFP distribution, which showed weak spatial autocorrelation, is shown in Figure 4. There are two high-high clusters: one is located in Jiangsu and Shanghai and the other is in the central region. The low-low clusters are relatively dispersed. These findings indicate that the growth patterns of urban GTFP exhibit high-high clustering, although the clustering is not strongly statistically significant.
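The two statistics used in this section can be illustrated with a short sketch. It is not GeoDa's implementation, and it assumes a binary contiguity weight matrix and a permutation-based z value and pseudo p-value. The function names, the four-city weight matrix and the input values are hypothetical.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for n observations and an n x n spatial weight matrix."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()
    return (x.size / w.sum()) * (z @ w @ z) / (z @ z)

def permutation_test(values, weights, n_perm=999, seed=0):
    """Return (I, z value, one-sided pseudo p-value) from random relabelling of locations."""
    rng = np.random.default_rng(seed)
    observed = morans_i(values, weights)
    sims = np.array([morans_i(rng.permutation(values), weights) for _ in range(n_perm)])
    z_value = (observed - sims.mean()) / sims.std(ddof=1)
    pseudo_p = (np.sum(sims >= observed) + 1) / (n_perm + 1)
    return observed, z_value, pseudo_p

def is_significant(p_value, z_value, alpha=0.05, z_crit=1.65):
    """Decision rule quoted in the text: p < 0.05 and |z| >= 1.65."""
    return p_value < alpha and abs(z_value) >= z_crit

def local_geary(values, weights):
    """Univariate local Geary c_i = sum_j w_ij (z_i - z_j)^2 on standardized values."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = (x - x.mean()) / x.std()
    diff_sq = (z[:, None] - z[None, :]) ** 2
    return (w * diff_sq).sum(axis=1)

# Hypothetical example: four cities on a line, each adjacent to its neighbors.
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
gtfp = [1.00, 1.02, 1.05, 1.07]  # illustrative values, not data from the study

i_obs, z_obs, p_obs = permutation_test(gtfp, w, n_perm=99)
c_local = local_geary(gtfp, w)
```

A positive Moran's I indicates that similar values are located near each other. A small local Geary value indicates that a city resembles its neighbors; whether that similarity forms a high-high or a low-low cluster depends on whether the values involved are above or below the mean.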

Analysis of Convergence of GTFP
The σ-convergence values were calculated for the GTFP of the whole country and for the GTFP of cities in the eastern, central and western regions. As the blue line in Figure 5 shows, the convergence values of GTFP in Chinese cities fluctuate slightly. The σ value for 2015-2016 is slightly higher than that for 2008-2009, indicating that the overall GTFP shows a slight divergence trend and no regional convergence.
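The comparison of σ values between periods can be sketched as follows. This is a minimal illustration that assumes σ is the cross-sectional sample standard deviation of the logarithm of GTFP across cities in each period, which is one common definition and may differ from the one used in this study. The function names and the input values are hypothetical.

```python
import numpy as np

def sigma_convergence(gtfp_by_period):
    """Return {period: sigma}, with sigma the sample std of ln(GTFP) across cities (assumed definition)."""
    return {
        period: float(np.std(np.log(np.asarray(values, dtype=float)), ddof=1))
        for period, values in gtfp_by_period.items()
    }

def trend(sigmas, first, last):
    """Label the change in cross-city dispersion between two periods."""
    if sigmas[last] > sigmas[first]:
        return "divergence"
    if sigmas[last] < sigmas[first]:
        return "convergence"
    return "no change"

# Hypothetical city-level GTFP values for two periods; not data from the study.
toy = {
    "2008-2009": [1.00, 1.02, 1.04, 1.03],
    "2015-2016": [0.98, 1.03, 1.08, 1.05],
}

sigmas = sigma_convergence(toy)
result = trend(sigmas, "2008-2009", "2015-2016")
print(sigmas, result)
```

A σ value that rises over time means that the dispersion of GTFP across cities is widening (divergence), while a falling σ value means that cities are becoming more alike (convergence).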
The convergence values of GTFP for cities in the three regions also fluctuate, but the ranges of fluctuation are quite different. The convergence values in the central and western regions fluctuate greatly, while those in the eastern region fluctuate only slightly. These trends indicate that there is little difference in technology and management levels between cities in the eastern region, while there are some differences in technology and management levels between cities in the central and western regions. The trends of the convergence values in the three regions and the whole country are basically the same, showing a weak divergence compared with 2008-2009. The convergence values of the three regions and the whole country are also basically the same in 2014-2015 and 2015-2016, indicating that their trends were similar and showed a weak divergence. Therefore, measures should be taken to make the areas with low GTFP converge toward those with high efficiency in order to significantly improve the overall GTFP in China.

Discussion
Based on the analysis of the Malmquist index, spatial autocorrelation and convergence, we found:
(1) The values of GTFP tend to increase, indicating that China's GTFP continues to improve. The changes in GTFP were driven by changes in the technology change index and the technology efficiency index. Before 2012-2013, the changes mainly reflected the combined effect of the two indexes, while after 2012-2013 the technology change index was the main factor. From a spatial perspective, the eastern region had the highest GTFP values over the nine years, followed by the central region, with the western region the lowest.
(2) The calculated values of Moran's Index suggest that GTFP shows a certain level of global spatial autocorrelation. The univariate local Geary index maps show weak high-high clustering and low-low clustering of GTFP values. This indicates that technology diffusion between adjacent cities is relatively weak.
(3) The σ-convergence values for the whole country and the three regions show that GTFP exhibits both convergence and divergence, with a slightly divergent overall trend. Moreover, the σ-convergence trend in China and in the three regions is basically the same. Divergence of GTFP indicates that there are certain efficiency differences among regions and among cities, whereas convergence of GTFP indicates that the efficiency of different regions gradually becomes similar, that is, the differences in technology and management levels among regions become smaller over time.
Unlike other scholars who study GTFP at the national, provincial or regional scale, we study GTFP at a smaller scale, the city level. Zhang [70] and Ma [7] also conducted research on GTFP at the city level. However, the results of Zhang [70] showed that overall growth declines gradually and that technical progress is the main contributor to GTFP growth. This differs from our results, in which the overall growth of GTFP tended to increase and the main contributor to growth is the joint effect of the technical efficiency and technical change indexes. Ma [7] analyzed data only for 2005, 2010 and 2016 and performed a static analysis using SBM, whereas we performed a dynamic analysis using the Malmquist method and data for the years from 2008 to 2016, which we consider more convincing. We also employed a novel method that includes the undesirable output among the input variables, in contrast to the traditional method used by most scholars, in which the undesirable output is included among the output variables.
It should be noted that GTFP has many influencing factors, and an analysis using the Malmquist method alone is not fully comprehensive. With richer data, we look forward to a more in-depth analysis using a panel regression model, which would allow us to identify more influencing factors.

Conclusions
Currently, China is facing problems, such as energy security and severe environmental issues. It is of great significance to study GTFP. From the perspective of cities, variations in spatiotemporal