Spatial Distribution Pattern of the Headquarters of Listed Firms in China

This study chose 4667 listed firms with headquarters in China as the research objects. The spatial aggregations of headquarters in sub-periods and subsectors were investigated based on prefecture-level cities using the Global Moran's I statistic, the Getis-Ord Gi* statistic, and kernel density estimation. The spatial association and frequent patterns of headquarters belonging to different sectors were analyzed using the Apriori algorithm. The results showed the following: (1) Most headquarters of listed firms agglomerate around megacities and concentrate in the coastal regions. (2) Many firms prefer to locate their headquarters in an area within a 5 km radius, forming the pivotal local industry cluster. (3) Industrial, information technology, and consumer discretionary firms show a higher degree of agglomeration than materials and health care firms. Materials and consumer discretionary industries have a stronger spillover effect on surrounding cities compared to industrial, information technology, and health care firms. (4) Service industries show a strong spatial association with other sectors at all distance thresholds and frequent items. These results can provide a reference for future considerations of headquarters locations, industrial development, and the reshaping of urban networks.


Introduction
A market is composed of a large number of small firms and a small number of large firms [1]. Starting a new business, establishing headquarters, and opening joint ventures, branches, and subsidiaries contribute to expansions in regional employment, provide earning opportunities, and promote local economic growth [2,3]. In general, a very small number of large firms account for most of the economic output and are crucial to the development of the regional economy [4]. Moreover, unforeseen shocks to a single sector that is strongly interconnected with others in an economic system can have dramatic effects on the overall economy [4,5]. Analyzing the behavior of microeconomic entities, especially large-scale or strategically important companies, can thus enable a better understanding of the macroeconomy.
Compared to non-listed companies, listed companies benefit from the competitive advantages of a larger scale, better performance, smoother information transmission, and a stronger ability to source finance, making them the most dynamic and high-potential enterprise organizations in the modern market economy [6][7][8]. Listed companies with strong market performance and a good social reputation can easily attract and retain talent, which in turn promotes scientific and technological progress and innovation, forming a virtuous circle. When emergencies arise in the market, listed companies can recapitalize themselves and spread risk with the aid of the capital market [9,10]. Headquarters outsource a large amount of work, such as advertising, consulting, accounting, and financing, which stimulates the development of the producer services industry and creates more jobs and tax revenue [11][12][13][14][15]. In addition, employees at these headquarters, who are regarded as elite members of society, can help enhance levels of local consumption. Cities in which more headquarters of large firms are located are often considered the "commanding" centers of the economy, occupying a dominant strategic and economic position [16]. The "headquarters economy" is also considered "a path of industrial upgrade". Examples of municipalities seeking to attract headquarter activity can be found across North America, Europe, and the Asia-Pacific region [13,[17][18][19][20]. Therefore, the headquarters of listed firms can be considered an ideal research object for microeconomic studies, and the location and behavior of these headquarters have drawn a great deal of attention from both the public and private sectors [17].
Analyzing the evolution of the spatial distribution and spatial aggregation of large-scale listed firms' headquarters is fundamental for understanding the structure and constitution of each market and of the regional economy. This, in turn, is key to effective urban planning and strategic urban development. In previous studies, geographers and economic geographers have tended to use the concentration ratio of industry, the entropy index, the Herfindahl-Hirschman index, the spatial Gini coefficient, and the EG (Ellison-Glaeser) index to identify clustering in an industry. They have, moreover, conducted spatial autocorrelation analysis using the Gi-statistic or Moran's I index, and used the ESDA (Exploratory Spatial Data Analysis) method to study the topological relationships of geospatial entities in a city or region [21,22]. However, these methods for measuring agglomeration are mainly cluster-based approaches based on administrative districts. A large difference in district size may be misleading and conceal important information related to small areas. Few studies have analyzed spatial aggregation and spatial location correlation among different industries based on point position relationships. With the advent of big data, the ability to access data and the potential for data mining have greatly improved. The Apriori algorithm, a classic data mining method, offers the advantages of a simple theoretical basis, easy implementation, and good performance. This makes it well suited for calculating the spatial association rules and frequent patterns of listed firms from a microenterprise perspective.
The remainder of this paper is organized as follows: Section 2 conducts a literature review regarding patterns in the locations of listed firms' headquarters and the evolution of their spatial distribution. Section 3 describes the methodology for data collection and analysis. In Section 4, the spatial aggregation of listed firm headquarters is analyzed in sub-periods and subsectors based on prefecture-level cities. Further, the spatial associations and frequent patterns of headquarters belonging to different sectors are analyzed based on coordinates. Finally, Section 5 provides concluding remarks.

Literature Review
Researchers in economic geography have an ongoing interest in defining the patterns in business location and identifying the complex interaction rules behind the evolution of the spatial distribution of company headquarters. Geographers began analyzing the locations of corporate headquarters in the 1960s. Goodwin provided the first analysis of the metropolitan distribution of US management activity [23]. By assessing changing levels in the geographic concentration of US headquarters' activity, Semple and Phipps proposed a developmental sequence of headquarters activity as follows: "beginning with a period of increasing geographic concentration, a period of peak concentration in nationally dominant cities, and finally involving the growth of a dispersed, national system of major regional corporate centers" [24,25]. This conceptualization was gradually demonstrated by later research and became the early theoretical foundation for this field [16,26]. Studies on headquarters location specifically consider the benefits and costs of locational choices and the influences of state factors, policy interventions, and institutional background. On the one hand, firms prefer to locate their headquarters in the central business districts (CBDs) of larger metropolitan areas due to the regional advantages of high-quality infrastructure, high-end producer services, excellent universities and research institutes, and convenient living environments. They choose to gather in only a few larger metropolitan cities and set up close to each other to facilitate the sharing of common inputs and the same labor market, in addition to enjoying information spillover [14,[27][28][29]. For example, Sydney and Melbourne accounted for 44.2% and 27.8% of the headquarters of the largest 300 listed firms in Australia in 2010 [30]. On the other hand, due to the growing cost of operating in CBDs and the development of information and communications technology, some firms have moved their headquarters outside of urban centers [31].
The evolution of the distribution of corporate headquarters appears to be linked to the economic, social, and political histories of particular places [32]. Some recent studies have shown that headquarters activities are geographically dispersed and configured in increasingly complex ways. Large firms (e.g., General Electric, IBM, Lenovo, and Royal Dutch Shell) have disaggregated and dispersed their financial, legal, and managerial headquarters activities across different locations [33][34][35]. Pan (2003) investigated the relocation of the headquarters of listed firms from 2001 to 2012 in China and found that Beijing and Shanghai were the most attractive cities for inter-province relocation. Firms in the mining and real estate industries have the highest relocation rates, and firms in the service industry have higher relocation rates than those in the manufacturing industry [36].
Furthermore, geographical research has not been limited to investigating the location and distribution of headquarters activity. Since the 1980s, quaternary research, which is the study of "corporate decision-making and high-level business", has broadened to examine the nature of the interactions among headquarters, the internal elements of a firm (such as subsidiaries, divisions, and plants), and external elements in the firm's competitive environment (such as suppliers, sources of finance, and customers) [16,[37][38][39]. Research by Rice and Pooler (2013) [39] revealed the geographically evolving tracks of the largest subsidiaries in North America from 1996 to 2004, stressing that cities with a high subsidiary-parent ratio represent additional spatial focal points of power. Li et al. (2016) suggested that regulative and cultural distances between home and host countries can positively influence the quality of the headquarters-subsidiary relationship, in contrast to the previously held opinion that institutional distance increases the potential for conflicts between headquarters and subsidiaries [40,41].
The preceding summary shows that extensive research has been conducted on the development of the distribution and location patterns of company headquarters and the complex interaction rules behind them. A large number of these studies have taken MNCs (multinational corporations) as the research object, and only a small number have focused on temporal tracking and geographic analysis of the headquarters of listed firms belonging to different industries in China. Few researchers have investigated the rules of spatial association and frequent patterns of headquarters in different industries. In light of these opportunities for contribution, the present study has the potential to broaden research on the spatial distribution of headquarters.

Data
Information (e.g., the date of establishment, office address, and financials) on all firms listed prior to the end of 2016 in both domestic and overseas stock exchanges was obtained from Wind, a high-quality, large-scale financial engineering and financial data warehouse in China. A total of 4667 publicly listed companies with offices in China were included in this research, as shown in Figure 1. The WGS84 coordinates of all listed firms were extracted from the head office address using the API of the Baidu Map open platform.
This study adopted the classification standards used by Wind, which include 11 first-level industries and 24 second-level industries; these refer to the Global Industry Classification Standard (GICS) industry system but with slight adjustments according to actual Chinese conditions.

Global Moran's I Statistic
The Global Moran's I statistic was used to explore the degree of clustering or dispersion of the headquarters of listed firms throughout China and to measure spatial autocorrelation based on feature location and attribute values, with prefecture-level cities as the assessment units [42][43][44]. This can be expressed by Formulas (1) and (2):

$$I = \frac{n}{S_0} \, \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{i,j} z_i z_j}{\sum_{i=1}^{n} z_i^2} \tag{1}$$

$$S_0 = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{i,j} \tag{2}$$

where $z_i$ is the deviation of an attribute for city i from its mean ($x_i - \bar{x}$), $w_{i,j}$ is the spatial weight between cities i and j, n is equal to the total number of cities, and $S_0$ is the aggregate of all spatial weights. The Global Moran's I statistic evaluates whether the pattern expressed is clustered, dispersed, or random, using both the z-score and p-value to evaluate the significance of the index; p-values are numerical approximations of the area under the curve for a known distribution limited by the test statistic. The $z_I$-score for the statistic is computed as per Formula (3):

$$z_I = \frac{I - E[I]}{\sqrt{V[I]}} \tag{3}$$

with $E[I] = -1/(n-1)$ and $V[I] = E[I^2] - E[I]^2$. The range of values for the Global Moran's I index is −1 to 1. A positive coefficient indicates that contiguous cities have a similar number of listed firms, and a higher value suggests a stronger association. By contrast, the lower the $z_I$ value, the stronger the negative correlation. When Global Moran's I approaches zero, it indicates a random distribution.
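As a rough illustration of the index described above, the following sketch computes Global Moran's I for a toy set of cities. The function names, the four-city chain, and the binary contiguity weights are hypothetical and are not taken from the study's data; the significance test is omitted.

```python
def global_morans_i(values, weights):
    """Global Moran's I for attribute values and an n x n spatial weight matrix.

    The diagonal of the weight matrix is ignored (a city is not its own neighbor).
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]  # deviations from the mean
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    s0 = sum(weights[i][j] for i, j in pairs)  # aggregate of all weights
    cross = sum(weights[i][j] * z[i] * z[j] for i, j in pairs)
    return (n / s0) * (cross / sum(zi * zi for zi in z))


def expected_morans_i(n):
    """Expected value of I under the null hypothesis of no spatial autocorrelation."""
    return -1.0 / (n - 1)


# Hypothetical example: four cities in a chain, each adjacent to the next.
CHAIN_WEIGHTS = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
SMOOTH = [1, 2, 3, 4]       # neighbors hold similar counts
ALTERNATING = [1, 5, 1, 5]  # neighbors hold dissimilar counts

if __name__ == "__main__":
    print(global_morans_i(SMOOTH, CHAIN_WEIGHTS))
    print(global_morans_i(ALTERNATING, CHAIN_WEIGHTS))
    print(expected_morans_i(4))
```

In this toy case the smoothly varying counts give a positive index and the alternating counts give a negative one, which matches the interpretation of the sign given in the text.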

Getis-Ord Gi* Statistic
Because it performs better than the Local Moran's Index in detecting the center and range of headquarters gathering areas, the Getis-Ord Gi* statistic was used to identify cold and hot spots. The locations of clusters of high or low values were identified by calculating their $Z_G$ scores. The formula by which the statistic is calculated is as follows [45,46]:

$$G = \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{i,j} x_i x_j}{\sum_{i=1}^{n} \sum_{j=1}^{n} x_i x_j}, \quad \forall j \neq i \tag{4}$$

where $x_i$ and $x_j$ are attribute values for cities i and j, $w_{i,j}$ is the spatial weight between features i and j, n is the number of prefecture-level cities, and $i \neq j$.
The $Z_G$-score for the statistic is computed as

$$Z_G = \frac{G - E[G]}{\sqrt{V[G]}} \tag{5}$$

where

$$E[G] = \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{i,j}}{n(n-1)}, \quad \forall j \neq i \tag{6}$$

and

$$V[G] = E[G^2] - E[G]^2 \tag{7}$$

The Getis-Ord Gi* statistic of each element is its $Z_G$ score. For statistically significant positive $Z_G$ scores, a higher $Z_G$ score implies a closer grouping of high values. For statistically significant negative $Z_G$ scores, a lower $Z_G$ score indicates a closer grouping of low values. The Jenks natural breaks method was used in this study to divide the $Z_G$ scores into four grades: cold spot, secondary cold spot, secondary hot spot, and hot spot.
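Hot and cold spots are assigned per city, which calls for a local score. The sketch below computes a local Gi* z-score in its commonly used form; this form, the binary weights that include the city itself, and the five-city toy data are assumptions made for illustration and are not taken from the study.

```python
import math


def local_gi_star(values, weights):
    """Local Gi* z-score for every unit.

    weights[i][j] is the spatial weight between units i and j; the diagonal
    is expected to be non-zero because Gi* includes the unit itself.
    Assumes the values are not all identical (otherwise S would be zero).
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        w_sum = sum(weights[i])
        w_sq_sum = sum(w * w for w in weights[i])
        weighted = sum(w * x for w, x in zip(weights[i], values))
        numerator = weighted - mean * w_sum
        denominator = s * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores


# Hypothetical example: five cities in a chain; high counts sit at one end.
COUNTS = [10, 8, 1, 1, 1]
WEIGHTS_WITH_SELF = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]

if __name__ == "__main__":
    for city, score in enumerate(local_gi_star(COUNTS, WEIGHTS_WITH_SELF)):
        print(city, round(score, 3))
```

Cities surrounded by high counts receive large positive scores and cities surrounded by low counts receive negative ones; in practice the scores would then be classed (for example with natural breaks) into hot and cold spot grades.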

Kernel Density Estimation
Kernel density estimation was used in this study to describe the agglomeration of firms [47]. If one industry has a larger frequency at a specific distance, this indicates that the industry has a higher degree of agglomeration at that distance. The kernel estimator used in this paper is shown in Formula (8), which was developed by Silverman [48]:

$$\hat{K}(d) = \frac{1}{n(n-1)h} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} f\left(\frac{d - d_{i,j}}{h}\right) \tag{8}$$

where f is the kernel function, n is the number of observations, $d_{i,j}$ is the distance between the ith and the jth observation, and h is the smoothing parameter (bandwidth), whose formula is as follows:

$$h = 0.9 \, A \, n^{-1/5} \tag{9}$$

where A is min (standard deviation of the distances between headquarters, interquartile range of the distances between headquarters).
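The following sketch shows one way a kernel density of pairwise distances could be computed. It assumes a Gaussian kernel, planar coordinates in kilometres, a density normalized by the number of distinct pairs, and Silverman's rule of thumb with the interquartile range divided by 1.34; the text does not state this scaling, and the point coordinates are hypothetical.

```python
import math
import statistics
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distances between all distinct pairs of (x, y) points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def silverman_bandwidth(distances):
    """Rule-of-thumb bandwidth h = 0.9 * A * m^(-1/5) for m distances.

    A is the smaller of the standard deviation and IQR / 1.34
    (the 1.34 divisor is an assumption, see the lead-in).
    """
    sd = statistics.stdev(distances)
    q1, _, q3 = statistics.quantiles(distances, n=4)
    iqr_scaled = (q3 - q1) / 1.34
    a = min(sd, iqr_scaled) if iqr_scaled > 0 else sd
    return 0.9 * a * len(distances) ** (-0.2)


def gaussian_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def distance_density(d, distances, h):
    """Kernel density of pairwise distances evaluated at distance d."""
    total = sum(gaussian_kernel((d - dij) / h) for dij in distances)
    return total / (len(distances) * h)


# Hypothetical headquarters coordinates (km): two tight clusters 30 km apart.
POINTS = [(0, 0), (1, 0), (0, 1), (30, 0), (31, 0), (30, 1)]
DISTANCES = pairwise_distances(POINTS)

if __name__ == "__main__":
    h = silverman_bandwidth(DISTANCES)
    for d in (1, 15, 30):
        print(d, round(distance_density(d, DISTANCES, h), 4))
```

A high density at short distances indicates that many pairs of headquarters lie close together, which is how the curves in the results are read.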

Apriori Algorithm
The objective of this work was to integrate GIS and association mining techniques to determine the distribution relationships among headquarters classified in different sectors within a certain distance. Such information can be valuable for decision-making. If the frequency of co-occurrence of industries X and Y is high, we can say they are strongly associated with each other. The association rules are defined as follows: $A = \{a_1, a_2, \ldots, a_m\}$ denotes the itemset, in which each item represents a specific literal. D represents a set of industries distributed in a database in which each transaction I stands for an itemset (I ⊆ A).
An association rule is denoted by X→Y. The rule X→Y holds in D according to two measure standards: support and confidence.
Support (denoted as Sup (X, D)) represents the share of transactions in the database D in which a rule shows up. The higher the value of Support, the more important the rule is.
Confidence (denoted as Conf (X→Y)) represents the conditional probability of Y given X. This indicates that if one set of industry distribution includes X, the chance of occurrence of Y is relatively high. Thus, it is employed to evaluate the level of confidence regarding the association rule X→Y.
The Apriori algorithm is used to find all transaction rules that meet user-specified minimum support (Minsup) and minimum confidence (Minconf). First, the large itemsets whose support is larger than Minsup are detected; second, association rules are generated using the large itemsets. Such rules must satisfy the conditions Sup (X∪Y, D) ≥ Minsup and Conf (X→Y) ≥ Minconf.
The lift judgment is used to judge possible biases incurred when using these measure standards.It is defined as Lift = Conf (X→Y)/Sup(Y) [49].
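The measures above can be sketched as follows. The transactions and sector labels are hypothetical, support is taken as a share of transactions, and the level-wise search is a minimal illustration of the Apriori idea rather than the implementation used in the study.

```python
from itertools import combinations


def support(itemset, transactions):
    """Share of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def rule_metrics(x, y, transactions):
    """Support, confidence, and lift of the rule X -> Y."""
    sup_xy = support(x | y, transactions)
    conf = sup_xy / support(x, transactions)
    lift = conf / support(y, transactions)
    return sup_xy, conf, lift


def apriori(transactions, min_support):
    """Level-wise search for frequent itemsets; returns {itemset: support}."""
    singles = {frozenset([item]) for t in transactions for item in t}
    current = {c for c in singles if support(c, transactions) >= min_support}
    frequent = {}
    k = 1
    while current:
        for c in current:
            frequent[c] = support(c, transactions)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c, transactions) >= min_support}
    return frequent


# Hypothetical transactions: sectors found together within one distance threshold.
TRANSACTIONS = [
    frozenset({"Banks", "Insurance", "Retailing"}),
    frozenset({"Banks", "Insurance"}),
    frozenset({"Banks", "Software"}),
    frozenset({"Retailing", "Software"}),
    frozenset({"Banks", "Insurance", "Software"}),
]

if __name__ == "__main__":
    print(rule_metrics(frozenset({"Insurance"}), frozenset({"Banks"}), TRANSACTIONS))
    for itemset, sup in sorted(apriori(TRANSACTIONS, 0.4).items(), key=lambda kv: -kv[1]):
        print(sorted(itemset), sup)
```

A lift above 1 means that Y appears with X more often than its overall frequency would suggest, which is why rules are usefully ranked by lift once the support and confidence thresholds have been met.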

Temporal Growth of Listed Firms in China
China has exhibited dramatic economic growth over the past few decades. Chinese companies listed overseas or domestically have been the engines of this growth, and their number has risen simultaneously and comparably (Figure 2). In 2001, revenues from publicly listed companies accounted for 14% of China's GDP [50]. By 2007, that figure had doubled, accounting for 28% of the GDP. In 2016, the total operating revenues of listed companies accounted for 44% of China's GDP. The listing activity of Chinese firms began in the late 1980s via reverse takeovers in the Hong Kong stock market. From 1992 to the end of 2016, nearly 4700 Chinese firms became listed on a variety of stock markets.
The study period can be broadly divided into four stages on the basis of the derivative of the cumulative number of listed companies established each year (Figure 3). Because the date of their founding precedes their being listed, such firms will be referred to below as "potential listed firms". The first period (Period I) was before 1991, during which time both the annual increase and the cumulative number of potential listed firms were small (Figure 2). The second period (Period II) was from 1992 to 2000, a period that witnessed extraordinary growth in the number of potential listed firms (totaling 2337). This phenomenon was promoted by Xiaoping Deng's "South Tour Speech" in 1992. The derivative value of cumulative listed firms peaked in 2000 (Figure 3). Affected by the global financial crisis initiated by the subprime mortgage crisis of 2007-2008 in the US, there was a slowing in the growth in the number of potential listed firms from 2001 to 2007 (Period III). Lastly, in Period IV, from 2008 to 2016, the number of new potential listed companies dropped sharply to the level of Period I, with a decreasing trend year by year. To sum up, the annual number of new potential listed companies has been a barometer of China's economy.
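For a yearly series, the derivative of the cumulative count reduces to the number of firms founded in each year. The sketch below illustrates this with hypothetical founding years; the period boundaries follow the text, with 1991 assigned to Period I as an assumption.

```python
from collections import Counter


def annual_and_cumulative(founding_years, start, end):
    """Yearly counts of newly founded firms and their running total."""
    counts = Counter(founding_years)
    years = list(range(start, end + 1))
    annual = [counts.get(year, 0) for year in years]
    cumulative = []
    total = 0
    for value in annual:
        total += value
        cumulative.append(total)
    return years, annual, cumulative


def first_difference(series):
    """Discrete derivative of a yearly series (first element kept as is)."""
    return [series[0]] + [b - a for a, b in zip(series, series[1:])]


def period_of(year):
    """Period label for a founding year (1991 placed in Period I by assumption)."""
    if year <= 1991:
        return "I"
    if year <= 2000:
        return "II"
    if year <= 2007:
        return "III"
    return "IV"


# Hypothetical founding years, for illustration only.
FOUNDING_YEARS = [1990, 1992, 1992, 1995, 2003, 2010]

if __name__ == "__main__":
    years, annual, cumulative = annual_and_cumulative(FOUNDING_YEARS, 1990, 2010)
    print(first_difference(cumulative) == annual)
    print(Counter(period_of(y) for y in FOUNDING_YEARS))
```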

Global Moran's I Statistic
There are significant regional differences in the distribution of cities hosting the headquarters of listed firms. Global spatial autocorrelation analysis was performed on the locations of headquarters in the four periods (Table 1). The z-values ranged from 4.84 to 7.74, well over the critical value of 2.58, indicating that the headquarters showed definite clustering at the 99% confidence level (Figure 4). The Global Moran index, ranging from 0.05 to 0.09, suggests a weak positive spatial autocorrelation among the headquarters. Overall, the degree of agglomeration in headquarters locations experienced an "increase-decrease-increase" fluctuation.
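The 2.58 threshold is the two-sided critical value of the standard normal distribution at the 99% level. A short check, assuming the z-scores are compared against a standard normal reference distribution:

```python
from statistics import NormalDist


def two_sided_critical_z(confidence):
    """Critical z-value for a two-sided test at the given confidence level."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def two_sided_p_value(z):
    """Two-sided p-value of a z-score under the standard normal distribution."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    print(round(two_sided_critical_z(0.99), 2))
    for z in (4.84, 7.74):  # lowest and highest z-values reported in the text
        print(z, two_sided_p_value(z))
```

Both reported z-values lie far beyond the critical value, so their p-values are well below 0.01.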

Getis-Ord Gi* Statistic in Different Periods
The Global Moran index provides only a rough description of the distribution patterns observed. The Getis-Ord Gi* statistic was used in this study to detect more concrete hot spots: areas with concentrations of events that are higher or lower than expected. Figure 5 illustrates the hot and cold spots for the headquarters locations identified, and Table 2 lists all hot spots and secondary cold spots in the five periods. Beijing, Shanghai, and Shenzhen were hot spots at all time nodes. Prior to 2000, Tianjin, Guangzhou, and several cities along the Yangtze River were secondary hot spots. Subsequently, the secondary hot spots mainly clustered around the Yangtze River Delta. The secondary cold spots were mainly distributed along the coastal areas and central region of China. These three types of clustering only accounted for 12-16% of all of the cities. Both market orientation and government guidance contribute to the accumulation patterns of headquarters [1,18,29,51,52]. The implementation of the "Unbalanced Development Strategy" largely resulted in this pronounced inhomogeneity of headquarters among cities and districts.
Overall, listed company headquarters in China tend to agglomerate in a few metropolitan areas with easy access to the producer service industry, high-quality infrastructure, and economies of agglomeration. Beijing overwhelmingly outstrips the other cities in terms of the resource advantages it offers to company headquarters. The sound basic infrastructure, advanced business facilities, strong momentum of scientific and technological innovation, and continuous optimization of the innovation environment foster and attract large numbers of listed firms to establish their headquarters there [29]. The first-rate services environment, excellent business facilities, and openness of the port city of Shanghai make it another optimal location for listed firm headquarters. Shenzhen, meanwhile, offers a great degree of openness, a sound environment for creativity and entrepreneurship, and favorable policies, and these advantages have likewise attracted a large number of listed firms to locate their headquarters there. This phenomenon is consistent with Marshall's theory of external economies [53].

Spatial Patterns in Different Sectors
The number of headquarters in different sectors reflects the importance of each type of industry to the domestic economy. Figure 6 shows the proportion of different industries in the first and second levels, respectively. At the first level, the listed firms belonging to industrials, consumer discretionary, information technology, materials, and health care are recognized as pillars of Chinese industry, accounting for 78% of the total listed firms. Capital goods, materials II, technology hardware and equipment, consumer durables and apparel, software and services, pharmaceuticals, biotechnology & life, and real estate are the principal industries at the sub-industry level, making up 63% of the total listed firms.
Line and bar graphs are plotted in Figure 7 to further describe and understand the characteristics of the development of publicly listed firms belonging to the top five first-level industries during 1992-2016. In general, the number of potential listed firms in the top five sectors showed a continuous decline from Period II to Period IV. The core period for the establishment of listed companies in industrials and materials was in the earlier stage (Period I). This provided a foundation for China's early economic development, promoting economic growth through heavy industry and infrastructure construction. The large numbers of potential listed firms in information technology established in Periods II and III represent the vigorous development of the information technology industry. Improvements in living standards and an increase in consumption have increased spending on consumer discretionary items. Thus, after 2008, the largest proportion of newly established listed firms belonged to consumer discretionary. This is an omen of the advent of a new consumption era, suggesting that China's economic development has entered a new period.

Global Moran's I Statistics in the Top Five Sectors
Table 3 shows the results for Global Moran's I for the top five sectors. The z-values are above 2.58 and are significant at the 1% level, which demonstrates that there is strong spatial clustering among the locations of headquarters in the different sectors. All values of the Global Moran's index are larger than zero, suggesting that the spatial autocorrelation of the headquarters in the different sectors is positive. There is a remarkable difference in z-values and the Global Moran index among these sectors. The spatial autocorrelation of materials and consumer discretionary industries is larger than that of health care and information technology, indicating a stronger spillover effect on surrounding cities. Meanwhile, health care and information technology tend to gather in a few regions.
For deeper insight into the clustering pattern of headquarters in these five sectors, Figure 8 shows the hot and cold spot maps. Table 4 provides a summary of cities that are hot and secondary cold spots for headquarters. We can conclude the following: the primary hot spots for listed firms in the industrials sector are Beijing, Shanghai, and Shenzhen; Beijing and Shenzhen are the most-favored destinations for the headquarters of firms in information technology; the core areas for consumer discretionary and health care firms are Beijing and Shanghai; and the materials sector has more hot spots than the other sectors. Besides Beijing and Shenzhen, materials firms prefer to locate their headquarters in the Yangtze River Delta, in cities such as Shanghai, Wuxi, Jiaxing, and Suzhou. The proportions of hot spots and secondary cold spots for information technology (3.8%), industrials (11.4%), and consumer discretionary (11.4%) are extremely low, suggesting a higher degree of agglomeration than in materials and health care. All these distribution patterns are the consequence of historical factors, specific features of industries, and government actions. Because priority was given to industrials, the firms founded early in this sector were mainly located in the central areas of major cities, producing a high degree of agglomeration. Being close to consumer markets and scientific and technological talent, companies in consumer discretionary and information technology have a stronger internal impetus to gather in a few megacities with strong consumer demand and a large quantity of high-end talent. Materials and health care have a more scattered distribution. This is because these firms prefer to locate their headquarters in regions with an integrated industrial foundation or with raw materials.

Kernel Density Estimation with Different Radius
Figure 9 plots the kernel density distribution of headquarters classified in the top five industries and the total headquarters within 100 km.The Gaussian variant of the kernel function was used, and the bandwidth was set to 1.As shown in the kernel density estimation curve, the maximum kernel density value for the total headquarters appears in the 0-5 km interval, which indicates that headquarters tend to agglomerate in an area with a 5 km radius to obtain the benefits of a cluster economy.The second and third peak values are approximately 28 and 35 km.These two peaks may be the general distance between two separate agglomeration areas.The kernel density value falls with distance, suggesting a diminishing spatial agglomeration.
Headquarters categorized into different industries show different characteristics. With the exception of industrials, all other industries show a maximum value within a 5 km radius. In contrast to the other industries, the kernel density of industrials peaks at approximately 30 km, and the overall kernel density values tend to be low for this sector. The headquarters of consumer discretionary firms have the highest level of agglomeration within a 5 km radius, with materials headquarters ranking second. The high mean kernel density value within 20 km implies that information technology has a higher degree of agglomeration than the other sectors. All of these results are consistent with the conclusion in Section 4.3.2.

Spatial Association Analysis of Industry Sectors

Rules of Spatial Association
The spatial association rules between the headquarters of different sectors were mined using ArcGIS 10.2 (Esri, California, USA, 2014) and the Apriori algorithm in the R language. For each distance, the support threshold was set to 0.01, and the confidence threshold was set to 0.5. The number of association rules ranges from several hundred to tens of thousands, depending on the maxlen of the frequent itemsets. The rules were sorted in descending order by lift. In this paper, only the five rules with the highest lift at the 5, 10, 30, and 50 km thresholds are shown (Table 5). Smaller distance thresholds have lower support and confidence and higher lift. Meanwhile, at longer distances, confidence is larger and lift is lower. The confidence and lift of "{Insurance II} ⇒ {Banks}" at the 5 km threshold are high. The business cooperation model between banks and insurance companies contributes to this aggregation phenomenon. "{Insurance II} ⇒ {Consumer Services II}" is a strong association rule at distance thresholds of 10, 30, and 50 km. The top five association rules at all distance thresholds are mainly in the financial, services, and retailing industries, suggesting that these industries have close connections with other industries.
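To make the roles of support, confidence, and lift concrete, the following sketch mines rules from a handful of toy transactions with a small level-wise Apriori search. It is an illustration only: the study used the Apriori implementation in R, whereas this is a plain Python stand-in, and the function names `apriori` and `mine_rules`, the toy transactions, and the idea that each transaction lists the sectors found within one distance threshold are assumptions made for this example.

```python
from itertools import combinations

def apriori(transactions, min_support=0.01, max_len=3):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    tsets = [frozenset(t) for t in transactions]
    n = len(tsets)
    current = [frozenset([i]) for i in sorted({i for t in tsets for i in t})]
    supports = {}
    k = 1
    while current and k <= max_len:
        # Count each candidate, then keep those meeting the support threshold.
        frequent = {}
        for cand in current:
            supp = sum(1 for t in tsets if cand <= t) / n
            if supp >= min_support:
                frequent[cand] = supp
        supports.update(frequent)
        # Join frequent k-itemsets into (k+1)-candidates; prune any candidate
        # that has an infrequent k-subset.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k + 1 and all(
                frozenset(sub) in frequent for sub in combinations(union, k)
            ):
                candidates.add(union)
        current = list(candidates)
        k += 1
    return supports

def mine_rules(supports, min_confidence=0.5):
    """Rules with a single-item consequent, sorted by lift (descending)."""
    rules = []
    for itemset, supp in supports.items():
        if len(itemset) < 2:
            continue
        for item in itemset:
            rhs = frozenset([item])
            lhs = itemset - rhs
            confidence = supp / supports[lhs]
            if confidence >= min_confidence:
                lift = confidence / supports[rhs]
                rules.append((lhs, rhs, supp, confidence, lift))
    rules.sort(key=lambda r: r[4], reverse=True)
    return rules

# Toy transactions (hypothetical): sectors found together within one threshold.
toy = [
    {"Insurance II", "Banks"},
    {"Insurance II", "Banks", "Consumer Services II"},
    {"Banks"},
    {"Consumer Services II", "Retailing"},
]
for lhs, rhs, supp, conf, lift in mine_rules(apriori(toy, 0.01, 3), 0.5):
    print(sorted(lhs), "=>", sorted(rhs), round(supp, 2), round(conf, 2), round(lift, 2))
```

A lift above 1 means the consequent sector appears near the antecedent sector more often than its overall frequency would suggest, which is why the rules in the text are ranked by lift.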

Frequent Itemsets
To illustrate the findings in a more intuitive manner, Figure 10 shows the graphs of the two-item and three-item frequent itemsets generated at each threshold. The shortened forms of the names of the industries used in Figure 10 are shown in Table 6. The size of each circle in Figure 10 represents the confidence value, and the color indicates the lift value. At the 5 km threshold (Figure 10a), telecommunication services II, banks, and insurance II take the central positions in both the two-item and three-item frequent itemsets, connecting other industries. In the two-item frequent itemsets at both the 10 and 30 km thresholds (Figure 10b,c), headquarters are located around telecommunication services II, banks, insurance II, and food & staples retailing II. At the 10 km threshold, the three-item frequent itemsets form two clearly separate association systems. As displayed in Figure 10c,d, for the three-item frequent itemsets at the 30 and 50 km thresholds, headquarters tend to cluster around insurance II and consumer services II.
Generally, insurance II and consumer services II showed strong correlations with other industries at all distance thresholds and for all frequent itemsets. They show a dispersed spatial distribution pattern and have no fixed aggregation model with any particular industry, being scattered around the headquarters of other sectors. Such a layout enables them to conveniently provide services for clients.

Conclusions and Policy Implications
The headquarters of listed firms are the commanding centers of an economy, and their locations can have a major influence on urban structure and local economic growth. The spatial distribution of listed firms reflects the patterns of economic development and the spatial distribution of resources in a district. Based on the spatial autocorrelation and spatial association analysis detailed in this paper, we can conclude the following distribution characteristics of the headquarters of listed firms in China:
(1) The headquarters of listed firms in China agglomerate around megacities, especially Beijing, Shanghai, and Shenzhen, and are concentrated in certain regions along the Pacific coast, notably the Pearl River Delta region in the south, the Yangtze River Delta region in the southeast, and the Bohai Rim region in the northeast.
(2) Headquarters belonging to firms in different sectors show different clustering patterns. Headquarters of listed companies classified as consumer discretionary and information technology display a higher degree of agglomeration than those in materials and health care. This enables them to obtain clustering benefits or take advantage of technology diffusion and knowledge spillover. Being close to raw materials or located in industrial parks with specific industrial facilities, the headquarters of materials firms show more dispersed location patterns than other sectors.
(3) The service industries, especially insurance II and consumer services II, show strong correlations with other industries at all distance thresholds. This allows them to provide professional services to clients conveniently and quickly.
Headquarters are an important force leading the innovation and development of industry. The "headquarters economy" can not only move China's industry toward the high end of the global value chain but also promote the development of the modern service industry. This study provides preliminary grounds for understanding the geography of the headquarters of listed firms in China, revealing how headquarters classified into different sectors connect to each other. There are significant differences in the development capacity of the "headquarters economy" among major cities in China. The eastern region occupies more resources in national resource allocation, which gives it an obvious advantage in the "headquarters economy". Overall, the Matthew Effect in headquarters location is prominent, which indicates that the task of promoting balanced regional development is still arduous.
Along with the constraints of market orientation, government guidance has played a significant role in shaping the accumulation patterns of headquarters. Given the importance of listed firms' headquarters in promoting economic growth and shaping urban structure, the results of this study can help identify interconnections between cities and can provide vital reference materials for firms deciding on future locations and for planners reshaping urban networks and constructing city groups:
(1) The development directions of different urban function zones for industrial undertakings should be clear, so that regional industrial division, joint development, and industrial upgrading are promoted. Development models such as "enclave economy", "joint development zone", and "service outsourcing" can be implemented in the Beijing-Tianjin-Hebei region to adjust the regional spatial economic structure. The Yangtze River Delta should make full use of the advantages of its developed industrial system to upgrade its position in the global value chain, whereas the Pearl River Delta region should continue to deepen reform and develop a world-class manufacturing industry and advanced service industries.
(2) Building a multi-industry distribution architecture that incorporates leading listed firms together with small, medium-sized, and micro enterprises should be encouraged. This can help form a regional division and cooperation system in which continuous innovation and development are achieved through the correlation and complementarity of multiple industries.
(3) High-performing listed companies are crucial for advancing innovation and development. The leading technological innovation enterprises should be encouraged, since they ultimately drive the industrialization of new advanced technologies. In addition, setting up their bases for industrializing innovation achievements in other cities can help promote local economic growth.
(4) Finally, the government should offer services tailored to local characteristics to attract headquarters to settle. The practical problems of professional talent (e.g., settling down and children's education) should be solved effectively in order to attract more outstanding people. Government services in the fields of trade and business, finance, commerce, and logistics should be improved, providing headquarters with more support for their development.

Figure 1. Distribution of listed companies in China.

Figure 2. (a) The annual real GDP and the number of cumulative firms from 1978 to 2016; and (b) trends in the annual GDP growth rate and new listed firms from 1978 to 2016.
Affected by the global financial crisis initiated by the subprime mortgage crisis of 2007-2008 in the US, there was a slowing in the growth in the number of potentially listed firms from 2001 to 2007 (Period III). Lastly, in Period IV, from 2008 to 2016, the number of new potentially listed companies dropped sharply to the level of Period I, with a decreasing trend year by year. To sum up, the annual number of new potentially listed companies has been a barometer of China's economy.

Figure 4. A spatial autocorrelation report for Global Moran's I statistic.
illustrates the hot and cold spots for the headquarters locations identified, and Table 2 lists all hot spots and secondary cold spots in the five periods. Beijing, Shanghai, and Shenzhen were hot spots in all time nodes. Prior to 2000, Tianjin, Guangzhou, and several cities along the Yangtze River were secondary hot spots. Subsequently, the secondary hot spots mainly clustered around the Yangtze River Delta. The secondary cold spots were mainly distributed along the coastal areas and central region of China. These three types of clustering only accounted for 12-16% of all of the cities. Both market orientation and government guidance contribute to the accumulation patterns of headquarters [1,18,29,51,52]. The implementation of the "Unbalanced Development Strategy" largely resulted in this pronounced inhomogeneity of headquarters among cities and districts.

Figure 5. Getis-Ord Gi* statistics for the headquarters of listed firms in different periods: (a) period I before 1991; (b) period II from 1992 to 2000; (c) period III from 2001 to 2007; (d) period IV from 2008 to 2016; and (e) all listed firms until 2016.

Figure 6. Proportion of different industries in the first and second levels of the classification system.

Figure 7. The development of publicly listed firms belonging to the top five first-level industries, 1992-2016.

Figure 9. Kernel density estimation curves of the total headquarters and headquarters belonging to the top five industries.


Figure 10. (a) Two and three frequent items at the 5 km threshold; (b) Two and three frequent items at the 10 km threshold; (c) Two and three frequent items at the 30 km threshold; and (d) Two and three frequent items at the 50 km threshold.

Table 1. Output for Global Moran's I statistic during different periods.

Table 2. Summary of hot spot/cold spot clusters of headquarters in different periods.

Table 3. Output for Moran's I statistic in the top five sectors.

Table 4. Summary of hot spot/cold spot clusters of headquarters in the top five sectors.

Table 5. The top five rules by degree of lift at different distance thresholds.

Table 6. Shortened forms of the names of the 24 second-level industries.