Clustering Analysis of Soybean Production to Understand its Spatiotemporal Dynamics in the North China Plain

The production gap of soybean (Glycine max L. Merr.) has been expanding in China recently, due to the increasing demand and decreasing production. Identifying soybean production dynamics is contributable to appropriate adjustment of crop rotation system and efficient use of agricultural resources—and thus to ensure food security. Taking the North China plain (NCP) as a case area, this study first analyzed the spatiotemporal dynamics of soybean production during 1998–2015 based on the spatial autocorrelation method, and then calculated contributions to the total production by yield and sown area using the factor decomposition method. The results indicated that total soybean production in the NCP decreased dramatically from 1998 to 2015 and showed a decreasing trend in 80.4% (263) of the counties, mainly (83.9%) contributed by the shrinkage of sown area, largely caused by decreasing benefit. Two regions were found with significantly spatial clustering degree of soybean production. In the south part of NCP, soybean production was highly clustered in Anhui province, and in north it was mainly clustered in western Hebei plain. It was found that soybean production in the NCP was rather sensitive to the return gaps of soybean from maize (Zea mays L.). These imply that the reduced area of soybean production can be restored if the return is improved by adopting appropriate policies such as appropriate subsidies. These findings could be helpful for the policymakers to make soybean production planning in the NCP, contributing to the national revitalization strategy of soybean production.


Introduction
Demand for milk and meat in China has been expanding with much higher rate than the global average, due to dramatically improved dietary structure in recent decades [1][2][3][4], stimulating considerable increase of soybean and maize consumption [5,6]. Since 1980s, their consumptions have increased twelvefold and threefold, respectively, and will continue to grow rapidly in the future [6][7][8][9]. To meet this increasing demand, maize production has maintained a rapid growing trend, increased by 323.3%, while soybean production showed obviously phased change characteristics [10][11][12]. During 1980-2004, soybean production increased from 7.9 Mt to the highest 17.4 Mt, but then decreased to 12.4 Mt in 2015, due mainly to the shrinkage of sown area resulted from maize expansion. From 2004 to 2015, maize sown area in China increased by 76.7% (19.5 Mha), because of the government promotion and its higher yield. Since maize and soybean grew in the same season of June-October [13], this expansion of maize area was partly achieved by the reduction of soybean sown area, resulted in a marked decrease in soybean area from 9.6 Mha in 2004 to 6.8 Mha in 2015 at a rate of −45.8 k ha/yr [7,14].
To meet the self-sufficiency of staple grain supplies has always been the strategic priority of China's agricultural policy. But for non-staple or feed grains particularly soybean, China has to depend on import, due to limited arable land resources. So, international trade has gradually become the main approach to meet the increasing soybean demand [4,15]. Before 2004, the import volume of soybean was less than 20.2 Mt in China, but in 2015 it increased to 86.7 Mt, accounting for 86.9% of the total consumption. This increasing import has largely reduced the soybean growing area, resulted in quick shrinkage of soybean production, and thus increased the risk of food security in China. In addition, the reduction of soybean growing area has a potential to increase n losses and thus greenhouse gas emission, as the reduced soybean area was used for growing maize and other cereal crops that need much higher chemical fertilizer inputs [16][17][18]. Sun et al. indicated that conversion of soybean land to cereal crops resulted in soil n balance turned from negative to positive, increasing risk of n losses and soil pollution [18]. Recognizing these problems, the Chinese government has made efforts to promote soybean production, and in 2015, enacted the National planting structure adjustment plan (2016-2020), in which the sown area of soybean was planned to restore to 9.33 Mha in 2020. In the No.1 Document of the year 2019 from the central government, revitalization of soybean production by expanding the sowing area was listed as one of the top rural and agricultural development tasks [19,20].
To revitalize soybean production in China, it is necessary to have an analysis on the spatiotemporal changes during past decades, as to identify priority areas that should be promoted to restore the soybean production. Presently, some studies analyzed the spatial distribution of soybean production in provinces of Jiangsu and Henan [21,22] and identified the spatial distribution patterns in the Northeast China [23], but no study gives specific analysis on soybean production in the North China plain (NCP). To well understand the spatiotemporal variation and clustering characteristics is helpful for the appropriate arrangement of soybean production and efficient use of agricultural resources, and thus to ensure the sustainability of resources and food security [24][25][26][27]. Identification of clustering patterns can help recognize the priority areas for specific crop production, and thus support agricultural planning. For this identification, the spatial autocorrelation method is often used, as it can measure the mutual dependence in a certain area and the similarity degree in adjacent space [28][29][30][31].
The NCP has favorable climate, soil properties and agricultural infrastructure, and is one of the key production regions of food crops including soybean in China [14,32]. During recent decades, soybean has been largely replaced by maize, leading to a decline in soybean production and soil degradation to some extent [18]. To realize the national goal of soybean revitalization, soybean production in the NCP should be restored and expanded. For this purpose, this study first, analyzed the spatiotemporal changes of soybean production at county level from 1998 to 2015, and then identified the cluster types and the dynamics, using the methodologies of spatial autocorrelation and factor decomposition. Finally, the causes of changes and implications were discussed.

Study Area
The NCP is located between 31.9-39.8 • N, 112.0-122.5 • E, and involves 327 counties, extending over parts of the provinces (municipalities) of Beijing, Tianjin, Hebei, Henan, Shandong, Anhui and Jiangsu ( Figure 1) [32]. It covers an area of approximately 39 million ha, of which 81.8% was farmland, 8.7% built-up land, 6.9% forestland and grassland and 2.6% water bodies and unused land, based on the 300 m spatial resolution data interpreted from European Space Agency in 2015. The double cropping system is the common planting pattern in the NCP, where winter wheat (T. aestivum L.) grows in October to June, followed by summer crops of maize, soybean, cotton, etc. [33,34]. As one of the most important agricultural area in China, the NCP produced more than 34.5% of China's total grain production with 32.3% of the total sown area in 2015 [7]. The NCP was once the main soybean producing area in the 1980s, playing an important role in guaranteeing soybean production in China.
For instance, the sown area and production of soybean in the NCP were 2.79 Mha and 4.09 Mt in 1980, accounting for 38.6% and 51.5% of the total in China, respectively.

Data Preparation
Soybean production data in the NCP were obtained from the city or provincial rural yearbooks during 1998-2015, including the total production, sown area, yield, irrigation rate. To analyze the influence of economic factors, we collected input costs and market prices of soybean and maize from the Compilation of national agricultural product cost and income data, published by China price press (Table 1). For few missing data of individual counties, we interpolated the values using the average values of adjacent four years. There are some changes in administrative boundary of counties such as Shouxian, Kaifeng and Bozhou, and we used the data after the administration adjusted.

Spatial Autocorrelation
Spatial autocorrelation is often used to indicate the mutual dependence of variable attributes in a certain region and to measure the similarity degree of selected indicator in adjacent space [35][36][37]. It includes global and local spatial autocorrelation, involved global and local Moran indices, respectively. The global Moran index (I) is used to evaluate whether the distribution is grouped or random through verifying the degree of spatial autocorrelation between target unit and its neighbors [38]. Direct correlation exists when the Moran index is positive (between 0 and 1), while inverse correlation is identified when it is negative (between −1 and 0) [39]. The higher absolute value of the index implies the higher cluster degree of the selected region. The Equations are presented below.

Data Preparation
Soybean production data in the NCP were obtained from the city or provincial rural yearbooks during 1998-2015, including the total production, sown area, yield, irrigation rate. To analyze the influence of economic factors, we collected input costs and market prices of soybean and maize from the Compilation of national agricultural product cost and income data, published by China price press (Table 1). For few missing data of individual counties, we interpolated the values using the average values of adjacent four years. There are some changes in administrative boundary of counties such as Shouxian, Kaifeng and Bozhou, and we used the data after the administration adjusted.

Spatial Autocorrelation
Spatial autocorrelation is often used to indicate the mutual dependence of variable attributes in a certain region and to measure the similarity degree of selected indicator in adjacent space [35][36][37]. It includes global and local spatial autocorrelation, involved global and local Moran indices, respectively. The global Moran index (I) is used to evaluate whether the distribution is grouped or random through verifying the degree of spatial autocorrelation between target unit and its neighbors [38]. Direct correlation exists when the Moran index is positive (between 0 and 1), while inverse correlation is identified when it is negative (between −1 and 0) [39]. The higher absolute value of the index implies the higher cluster degree of the selected region. The Equations are presented below.
where n is the number of space units; Y i and Y j are the attribute values of units i and j; x is the average of the selected attribute; W i,j signifies the adjacent degree between units i and j and W i,j = 1, when i and j are adjacent, otherwise W i,j = 0. Moran's I is between −1 and 1 and I > 0 means that the attribute values of adjacent units are similar spatially and I < 0 means that there is no spatial similarity. S 0 is the aggregate of all spatial weights. The global Moran index indicates the correlation between an observation and its neighbor units but cannot measure the spatial arrangement. As a supplementary, local Moran index (I i ) can be used to capture possible local patterns of spatial autocorrelation and identify the location of the spatial clusters with similar values or outliers [40]. The indices that are significantly high and positive indicate the presence of unit clusters with similar values, while low values indicate the no similarity among the selected units [41]. Further, Z and p-values are used to evaluate the significance of similarity of the target unit and its adjacent units. The local Moran index is expressed as Equations (3) and (4).
where x i is the value of the selected attribute of unit i, x j is the value of the selected attribute of unit i, x is the average value of the selected attribute, w k,l is the weight signifies the adjacent degree between units i and j and n is the number of space units. Z is the standardized statistical value of global Moran's I. E(I i ) and S(I i ) are the theoretical average and variance of Moran index for all units in the region. We calculated the global Moran index I, local Moran index (I i ) and their Z and p-values for each county, using the spatial autocorrelation (Moran's I) and hot spot analysis (Getis-Ord Gi *) tools in the ArcMap software, respectively. Based on the Z and p-values, all county units were classified into 7 clustering patterns (Table 2). HH, HM, HL indicate counties that have higher soybean production and the similarity to its neighboring counties are high, medium and low, respectively; LH, LM, LL indicate counties that have lower soybean production and the similarity to its neighboring counties are high, medium and low, respectively; NS pattern indicates no significant similarity.
Note: HH, HM, HL indicate high production patterns with high, moderate and low clustering characteristics; LL, LM, LH indicate low production patterns with high, moderate and low clustering characteristics, respectively. NS indicates no clustering significance.

Factor Decomposition
Factor decomposition method was used to calculate the contribution rates to the change of total production by yield and sown area, based on the time series data [32]. The equations for each county are shown as below: where C y, i+1 and C S, i+1 are the contribution rates by yield and sown area; i is the serial number of the yield and i = 1, 2, . . . 11. Y i+1 and Y i are the soybean yield in the (i + 1)-th and i-th years. S i+1 and S i are sown area in the (i + 1)-th and i-th years. P i+1 and P i are the total production in the (i + 1)-th and i-th years.

Factor Decomposition
Factor decomposition method was used to calculate the contribution rates to the change of total production by yield and sown area, based on the time series data [32]. The equations for each county are shown as below: where Cy, i+1 and CS, i+1 are the contribution rates by yield and sown area; i is the serial number of the yield and i = 1, 2, … 11. Yi+1 and Yi are the soybean yield in the (i + 1)-th and i-th years. Si+1 and Si are sown area in the (i + 1)-th and i-th years. Pi+1 and Pi are the total production in the (i + 1)-th and i-th years.  Soybean production showed obviously spatial changes during 1998-2015 ( Figure 3). In 1998, there were 53 counties producing more than 20.0 Kt soybean, 37 counties mainly distributed in the southern and northeastern NCP, producing 20.0-40.0 Kt and 16 counties located in the south, producing more than 40.0 Kt. In the remaining 186 counties spreading over the NCP and 88 counties mainly located in the northwestern and central parts, the production was between 5.0 and 20.0 Kt and was less than 5.0 Kt, respectively. Subsequently, the counties with the production below 5.0 Kt were expanded from north to south rapidly and in 2015 the total number accounted for 65.7% (215)  Based on the Sen's slope approach, the change rate of soybean production in each county during 1998-2015 was obtained [32] and mapped in Figure 3f. The results showed that the total production of soybean decreased in 80.4% (263) counties throughout the NCP. The decrease rate varied from 1.7 to 494.8 t/yr in 178 counties and from 510.9 to 9877.2 t/yr in 98 counties, mainly located in northeastern and central parts. In other 51 counties, the production increased, at a rate of 3.7-3533.0 t per year. Among these counties, 20 counties showed a higher increase rate, averagely increased by more than 500 t/yr, mainly concentrated in the south-central NCP.

Spatial Statistics of Clustering Characteristics
The value of global Moran index was above zero and the p-values were less than 0.01 in all over the years, indicating the existence of spatial cluster patterns significantly in the NCP (Table 32). From 1998 to 2015, the index was significantly increased from 0.270 to 0.380 (R 2 = 0.560, p < 0.01) and the Z values increased from 7.312 to 9.721, suggesting that the degree of spatial clustering for soybean production was increasing gradually, particularly in Hebei and Anhui provinces, which were the major soybean production areas. Based on the Sen's slope approach, the change rate of soybean production in each county during 1998-2015 was obtained [32] and mapped in Figure 3f. The results showed that the total production of soybean decreased in 80.4% (263) counties throughout the NCP. The decrease rate varied from 1.7 to 494.8 t/yr in 178 counties and from 510.9 to 9877.2 t/yr in 98 counties, mainly located in northeastern and central parts. In other 51 counties, the production increased, at a rate of 3.7-3533.0 t per year. Among these counties, 20 counties showed a higher increase rate, averagely increased by more than 500 t/yr, mainly concentrated in the south-central NCP.

Spatial Statistics of Clustering Characteristics
The value of global Moran index was above zero and the p-values were less than 0.01 in all over the years, indicating the existence of spatial cluster patterns significantly in the NCP (Table 3). From 1998 to 2015, the index was significantly increased from 0.270 to 0.380 (R 2 = 0.560, p < 0.01) and the Z values increased from 7.312 to 9.721, suggesting that the degree of spatial clustering for soybean production was increasing gradually, particularly in Hebei and Anhui provinces, which were the major soybean production areas. The results of clustering analysis showed that the high production counties with high clustering degree were mainly concentrated in the south-central part, while low production counties with low clustering characteristic were mainly distributed in the northwest (Figure 4). The HH cluster appeared in the southern NCP since the beginning of the study period and its location and extent were changed with relatively small amplitude over time. In 1998, for instance, the HH cluster comprised of 40 counties in Anhui and Henan provinces, and then it was shrunk to 31 counties in 2003, and the disappeared counties were mainly located in Anhui province. Thereafter, it extended westwards in Henan province, but its number of counties decreased from 68 counties in 2011 to 35 in 2015. Both of the HM and HL clusters covered rather less counties, mainly distributed around the HH cluster. Different from the HH cluster, the LL cluster was mainly located in the northern region and the change amplitudes of its extent and location were far larger than that of HH cluster. In 1998, it covered 14 counties, located in Hebei province, but in 2011, the extent expanded around obviously, and its number of counties increased to 106 counties dramatically. However, these counties disappeared completely in other years as 2003, 2007 and 2015. Additionally, the clusters of LM and LL covered 50-107 counties in Hebei and Shandong provinces and the clustering characteristic was not significant in other regions (Figure 4a-e).

Spatiotemporal Changes of Yield and Sown Area
During 1998-2015, the sown area of soybean decreased from 2.18 to 1.17 Mha, with an annual change rate of −3.49%. The yield showed an increase trend, but the amplitude was rather small, only 0.01 t/ha/yr. Similar to the total production, the sown area and yield also showed a phased variation. In more than 85% of counties (280) in the NCP, soybean area showed a decreasing trend, and in 89 counties scattered all over the NCP, the decrease rate exceeded 0.2 k ha/yr. The sown area increased in 47 counties, mainly distributed in Anhui, Henan and Jiangsu provinces, and in 17 counties, the increase amplitude was more than 0.2 k ha per year (Figure 5a). The changes of yield and sown area differed spatially, for instance, the yield change rate in northern counties was higher than that in the south. Yield showed an increase trend in 251 counties, mostly distributed in the eastern Hebei and southern Shandong provinces, where soybean yield increase rate was more than 40 kg/ha per year (Figure 5b). In other 74 counties, yield showed a decreasing trend, of which 19 counties showed the yield decreased at a rate of 40-78.9 kg/ha per year.

Spatiotemporal Changes of Yield and Sown Area
During 1998-2015, the sown area of soybean decreased from 2.18 to 1.17 Mha, with an annual change rate of −3.49%. The yield showed an increase trend, but the amplitude was rather small, only 0.01 t/ha/yr. Similar to the total production, the sown area and yield also showed a phased variation. The sown area decreased during the periods of 1998-2003 and 2006-2015 but increased (by 6.32%) from 2003 to 2006. Regarding to the yield, it increased from 1.99 to 2.27 t/ha from 1998 to 2009, but then decreased and returned to 2.00 t/ha again in 2015.
In more than 85% of counties (280) in the NCP, soybean area showed a decreasing trend, and in 89 counties scattered all over the NCP, the decrease rate exceeded 0.2 k ha/yr. The sown area increased in 47 counties, mainly distributed in Anhui, Henan and Jiangsu provinces, and in 17 counties, the increase amplitude was more than 0.2 k ha per year (Figure 5a). The changes of yield and sown area differed spatially, for instance, the yield change rate in northern counties was higher than that in the south. Yield showed an increase trend in 251 counties, mostly distributed in the eastern Hebei and southern Shandong provinces, where soybean yield increase rate was more than 40 kg/ha per year (Figure 5b). In other 74 counties, yield showed a decreasing trend, of which 19 counties showed the yield decreased at a rate of 40-78.9 kg/ha per year.

Contribution Rates to Total Production by Yield and Sown Area
During 1998-2015, only 16.1% of the change in total soybean production was contributed by yield changes, and 83.9% of total production change was caused by the sown area variation for the whole NCP. The contribution rates of both yield and sown area to total production fluctuated largely during the period. Before 2010, the contribution rate of sown area change was higher than yield, but it became lower thereafter. Further linear regression indicated that the difference was widening gradually after 2013 ( Figure 6). At county level, the contribution rates to production change by yield and sown area were rather similar to that at the regional level. The sown area change contributed more than 50% of production change in 181 counties, distributed in northern NCP (Figure 7a). The contribution rate of yield was more than 50% in 146 counties, mainly located in the southern part (Figure 7b). Summarily, the total production change was mainly contributed by the yield change in the southern NCP, while in the north it was mainly caused by the sown area.

Contribution Rates to Total Production by Yield and Sown Area
During 1998-2015, only 16.1% of the change in total soybean production was contributed by yield changes, and 83.9% of total production change was caused by the sown area variation for the whole NCP. The contribution rates of both yield and sown area to total production fluctuated largely during the period. Before 2010, the contribution rate of sown area change was higher than yield, but it became lower thereafter. Further linear regression indicated that the difference was widening gradually after 2013 ( Figure 6).

Contribution Rates to Total Production by Yield and Sown Area
During 1998-2015, only 16.1% of the change in total soybean production was contributed by yield changes, and 83.9% of total production change was caused by the sown area variation for the whole NCP. The contribution rates of both yield and sown area to total production fluctuated largely during the period. Before 2010, the contribution rate of sown area change was higher than yield, but it became lower thereafter. Further linear regression indicated that the difference was widening gradually after 2013 ( Figure 6). At county level, the contribution rates to production change by yield and sown area were rather similar to that at the regional level. The sown area change contributed more than 50% of production change in 181 counties, distributed in northern NCP (Figure 7a). The contribution rate of yield was more than 50% in 146 counties, mainly located in the southern part (Figure 7b). Summarily, the total production change was mainly contributed by the yield change in the southern NCP, while in the north it was mainly caused by the sown area. At county level, the contribution rates to production change by yield and sown area were rather similar to that at the regional level. The sown area change contributed more than 50% of production change in 181 counties, distributed in northern NCP (Figure 7a). The contribution rate of yield was more than 50% in 146 counties, mainly located in the southern part (Figure 7b). Summarily, the total production change was mainly contributed by the yield change in the southern NCP, while in the north it was mainly caused by the sown area.

Variation of Soybean Production
Our results showed that the total production of soybean decreased dramatically in the NCP during 1998-2015, except for the increase during 2003-2006. The short increase was mainly promoted by the policy implementation. In 1998, the government initiated the Grain for Green Project to mitigate the land degradation, by supplying attractive subsidies to the farmers who returned the sloping croplands to forests or grasslands [42,43]. This policy stimulated overconversion of croplands, and thus affected farmer's enthusiasm for grain production. This caused decline of crop production over the whole China, and thus the soybean production in the NCP was also impacted, resulting in continuously decreasing of the sown area and total production from 1998 to 2003. Recognizing this problem, the government strictly regulated the scope of cropland conversion in 2004 and implemented series of policy measures including the exemption of agricultural taxes and increase of direct subsidies to crop production [44]. These policies mobilized the farmer's enthusiasm of planting crop including soybean. Under the effects of these policies, the soybean production in the NCP got a short recovery growth during 2003-2006.
After 2006, soybean production in the NCP experienced a decrease again, mainly due to shrinkage of the sown area. This decrease could be mainly caused by the low benefit and expanding return gaps compared to maize production. Although total inputs and production costs of maize were higher than soybean (Figure 8a), the average gross and net return of maize per hectare were around 29.2% and 15.9% higher than that of soybean, respectively (Figure 8b,c). These large return gaps promoted the farmers to grow more maize instead of soybean. In addition, the price of imported soybeans from international trade was 13.5-54.7% lower than the domestic price during 1998-2015 [45], which compressed the room to increase the soybean price, causing the sustained low price in China. Furthermore, the government paid more attention to increase grain production, and less effort and insufficient subsidy was given to soybean production, due to its much lower (66.4-71.6%) yield than maize (Figure 8d). As a result, farmers lost their enthusiasm to plant soybeans, resulted in gradual reduction of the sown area.

Variation of Soybean Production
Our results showed that the total production of soybean decreased dramatically in the NCP during 1998-2015, except for the increase during 2003-2006. The short increase was mainly promoted by the policy implementation. In 1998, the government initiated the Grain for Green Project to mitigate the land degradation, by supplying attractive subsidies to the farmers who returned the sloping croplands to forests or grasslands [42,43]. This policy stimulated overconversion of croplands, and thus affected farmer's enthusiasm for grain production. This caused decline of crop production over the whole China, and thus the soybean production in the NCP was also impacted, resulting in continuously decreasing of the sown area and total production from 1998 to 2003. Recognizing this problem, the government strictly regulated the scope of cropland conversion in 2004 and implemented series of policy measures including the exemption of agricultural taxes and increase of direct subsidies to crop production [44]. These policies mobilized the farmer's enthusiasm of planting crop including soybean. Under the effects of these policies, the soybean production in the NCP got a short recovery growth during 2003-2006. After 2006, soybean production in the NCP experienced a decrease again, mainly due to shrinkage of the sown area. This decrease could be mainly caused by the low benefit and expanding return gaps compared to maize production. Although total inputs and production costs of maize were higher than soybean (Figure 8a), the average gross and net return of maize per hectare were around 29.2% and 15.9% higher than that of soybean, respectively (Figure 8b,c). These large return gaps promoted the farmers to grow more maize instead of soybean. In addition, the price of imported soybeans from international trade was 13.5-54.7% lower than the domestic price during 1998-2015 [45], which compressed the room to increase the soybean price, causing the sustained low price in China. Furthermore, the government paid more attention to increase grain production, and less effort and insufficient subsidy was given to soybean production, due to its much lower (66.4-71.6%) yield than maize (Figure 8d). As a result, farmers lost their enthusiasm to plant soybeans, resulted in gradual reduction of the sown area. With Pearson regression analysis, we calculated the correlation coefficients of soybean production with the gaps of cost, gross and net returns from maize in the NCP during 1998-2015, respectively (− (Table 34). The results suggest that the gross return gap between maize and soybean had the most significant correlation with total soybean production (R = −0.726, p < 0.05) and sown area (R = −0.841, p < 0.05), but the net return gap was not significant. The gap between domestic and international trade prices of soybean also showed significant influences on the sown area (R = −0.556, p < 0.05) and total production (R = −0.320). Therefore, the changes in soybean production were rather sensitive to the relative benefit compared to maize in the NCP. Table 34. Pearson coefficients among the soybean production and the gap of economic variables between maize and soybean in the NCP during 1998-2015.

Item
Total

Statistical Techniques and Production Clusters
Understanding the spatiotemporal dynamics and clustering characteristics of grains over time is important for regional agricultural planning. The global and local Moran indices adopted in this study have been applied in some agricultural studies and are able to reveal the spatial clusters of soybean production [26,[46][47][48]. They are useful tools to identify the hot plots of agricultural development for a specific region, supporting the regional planning of crop production.
From the values of global Moran indices, it could be concluded that the soybean production in the NCP generally showed a strong clustering characteristic, and the clustering degree was enhanced annually, implying that the soybean production areas tended to concentrate. Regionally, the soybean production significantly concentrated in and shifted to Anhui and Henan provinces, from covering 30.7% and 19.7% of the total in the NCP in 1998, increased to 38.6% (R 2 = 0.678, p < 0.01) and 29.7% With Pearson regression analysis, we calculated the correlation coefficients of soybean production with the gaps of cost, gross and net returns from maize in the NCP during 1998-2015, respectively ( Table 4). The results suggest that the gross return gap between maize and soybean had the most significant correlation with total soybean production (R = −0.726, p < 0.05) and sown area (R = −0.841, p < 0.05), but the net return gap was not significant. The gap between domestic and international trade prices of soybean also showed significant influences on the sown area (R = −0.556, p < 0.05) and total production (R = −0.320). Therefore, the changes in soybean production were rather sensitive to the relative benefit compared to maize in the NCP. Table 4. Pearson coefficients among the soybean production and the gap of economic variables between maize and soybean in the NCP during 1998-2015.

Item Total Cost Gap Gross Return Gap Net Return Gap Market Price Gap
Total production −0.666 ** −0.726 ** −0.255 −0.320 Sown area −0.675 ** −0.841 ** −0.069 −0.556 ** Note: total cost gap, gross return gap and net return gap indicates the gap between maize and soybean. The market price gap is the gap between domestic and international trade prices of soybean. ** indicates significant (p < 0.05).

Statistical Techniques and Production Clusters
Understanding the spatiotemporal dynamics and clustering characteristics of grains over time is important for regional agricultural planning. The global and local Moran indices adopted in this study have been applied in some agricultural studies and are able to reveal the spatial clusters of soybean production [26,[46][47][48]. They are useful tools to identify the hot plots of agricultural development for a specific region, supporting the regional planning of crop production.
From the values of global Moran indices, it could be concluded that the soybean production in the NCP generally showed a strong clustering characteristic, and the clustering degree was enhanced annually, implying that the soybean production areas tended to concentrate. Regionally, the soybean production significantly concentrated in and shifted to Anhui and Henan provinces, from covering 30.7% and 19.7% of the total in the NCP in 1998, increased to 38.6% (R 2 = 0.678, p < 0.01) and 29.7% (R 2 = 0.324, p < 0.05) in 2015, respectively. Accordingly, the proportion sum of soybean area in these two provinces was increased significantly from 53.2% to 75.2% of the total sown area (R 2 = 0.872, p < 0.01). The reasons could be mainly attributed to the risk tolerance ability of household and farmers' planting habits. For instance, in these two provinces, the weather and soil conditions were relatively unfavorable, and crop production was frequently affected by extreme weather hazards including high temperature and rainstorms, resulting in production losses of maize and soybean [49,50], and the yields of maize and soybean were lower than other regions [51][52][53]. Compered to soybean, growing maize suffered greater economic risks, because maize required more material and capital inputs than soybean (Figure 8b). Thus, local farmers preferred to plant soybeans, although the net return of soybean was lower than maize in normal years. Furthermore, our household surveys indicated that they had the tradition to grow soybean for many years, as maize planting required more labor for the management.

Implications
The import dependence of soybean in China approached 90% in 2019, which may be a potential threat to the food security. Given the possible risks of trade friction, it is necessary for China to increase soybean production and improve the self-supply level. To this end, the NCP, a main production region of soybean in China, should be promoted to recover the soybean production in the traditional growing areas with high clustering characteristics, such as the Henan plain and northern Anhui. As the first aim, it is suggested that the sowing area of soybean in the NCP should be gradually restored to the level of 1998, i.e., 2.18 Mha, by compressing the sowing area of summer maize. This can also help resolve the over-production problem of maize recently [54][55][56], to reduce the gap of soybean demand and thus contribute to food security in China.
Expanding of soybean area can reduce chemical fertilizer use and thus benefit the environment. The currently practiced wheat-maize double cropping pattern is highly efficient in grain production, but it needs high fertilizer and water inputs, and thus has high risk of agrochemical contamination to soil [57,58]. For instance, problems such as excessive heavy metal content, soil compaction, acidification, decreased fertility and imbalanced nutrition were reported in some regions of the NCP [59][60][61]. Furthermore, crop production in the NCP heavily relies on irrigation. The expansion of irrigated area to increase grain production has caused severe water problem [62]. Several studies reported that the groundwater was depleted increasingly since 1960s, especially in the northern NCP with scarce water resources [14,63,64]. Changing summer maize to soybean could be an effective way to alleviate these problems, because maize production requires much higher inputs of fertilizers and water than soybean [14,58]. Therefore, the wheat-maize double cropping system should be partly converted to wheat-soybean growing pattern in the NCP. In addition, high-yield varieties should be adopted to improve soybean yield, and thus increase the return of soybean production and reduce the dependence on subsidy.

Conclusions
This study presented a procedure to reveal spatiotemporal dynamics and spatial clustering characteristic of soybean production in the NCP, by combined use of the methods of spatial autocorrelation, factor decomposition and GIS-based spatial analysis. With the analyses, the soybean production changes and its regional clustering patterns during 1998-2015 in the NCP were revealed, helping to identify the hot plots for soybean production development and thus supporting the regional agricultural planning. The results indicated that the total production of soybean decreased dramatically from 1998 to 2015, mainly caused by the shrinkage of sown area due to decreasing benefit. This implies that if the benefit is improved, the reduced areas of soybean production in the NCP can be restored to contribute to the national soybean revitalization strategy. Two regions identified were the main soybean production areas of the NCP during 1998-2015, which had significant clustering characteristics, and the clustering degree was enhanced during the period. One is located in south-central NCP with higher clustering characteristic and higher production, and another one is located in northwestern NCP with relatively low clustering characteristic and low production. To stimulate soybean production in the NCP and thus contribute to China's soybean revitalization, development priorities of soybean production should be given to these two regions, aimed first, to restore the soybean growing area to the level of 1998. To this end, appropriate subsidy policies to soybean production are very necessary and should be adopted, to make the return at an acceptable level.
Author Contributions: C.L. promoted the research issue and revised the manuscript; Z.Z. detailed the research procedure, analyzed the data and wrote the manuscript. All authors have read and agreed to the published version of the manuscript.