Spatialization of Actual Grain Crop Yield Coupled with Cultivation Systems and Multiple Factors: From Survey Data to Grid

: The spatialization of actual grain crop yield helps to understand the spatial heterogeneity of yield and support for the precise farming and targeted farmland management. However, inadequate consideration and quantiﬁcation of anthropogenic factors a ﬀ ecting the estimation of actual yield distribution easily cause uncertainties in recent researches. Here, we developed a new grain crop yield spatialization (GCYS) model in order to downscale the yield from county to grid scale. The GCYS model is composed of four modules: (a) cultivated land Net Primary Productivity (NPP) calculation module, (b) comprehensive agricultural system construction module, (c) key factors establishment module, and (d) integration and validation module. Its novelty is to realize the actual grain crop yield spatialization from county scale to grid scale by quantifying and spatializing the comprehensive agricultural system when considering the diversity of cultivated structure and ﬁeld management factors. Taking Guizhou and Guangxi Karst Mountains Region as a study-area, the GCYS model is developed and tested. The determination coe ﬃ cients of regression analysis between agricultural survey data and spatialization results of paddy rice yield calculated by our model reach 0.72 and 0.76 in 2000 and 2015, respectively ( p < 0.01). The results visualize the spatial pattern of actual grain crop yield at the grid scale, which show a gradually decreasing trend from southeast to northwest. With an increase in potential yield and improvement of ﬁeld management technologies, the actual average yield of grain crops per unit increased form 4745.10 kg / ha of 2000 to 5592.89 kg / ha of 2015. Especially in high-yield zones in southeast area, mechanized cultivation became the dominated factor, rather than chemical fertilizer application. This demonstrates that our model can provide a reference for agricultural resource utilization and policy-making. and high productivity ﬁelds. It shows that the per unit area yield improved from 2000 to and might serve for agricultural development planning in the future. The novelty of the GCYS model is the construction of a comprehensive agricultural system, then coupled with key physical and anthropogenic factors to realize accurate and actual grain crop yield spatialization from agricultural survey data at a ﬁner-scale grid, and ﬁnally reﬁned the details of spatial heterogeneity. The GCYS model is applied and tested while taking Guizhou and Guangxi Karst Mountains Region as the case study area. The results show that the mean per unit yield of mixed grain crops (i.e., rice, wheat, maize, soybean, and potato) weighted by sown area proportion of each grain crop of the total cultivated land is 4745.10 kg / ha and 5592.89 kg / ha in 2000 and 2015, respectively. Although the average per unit area yield of mixed grain crops increased from 2000 to 2015, its value of newly reclaimed cultivated lands trended downward in general. The main limiting and improvement factors are AWC and chemical fertilizer application, respectively. Meanwhile, the dominated proportional area of mechanization increased rapidly from 0.23% in 2000 to 17.47% in 2015. The signiﬁcant correlation between agricultural survey data and spatialization results for rice, with a determination coe ﬃ cient of 0.72 and 0.76 in 2000 and 2015, respectively. This validated the reliability of the GCYS model. The slopes of regression lines are close to 1, indicating the high accuracy of our model. Finally, a mixed grain crop calorie map is generated based on the actual yield spatialization results. This helps to identify the calories, where people could obtain from local grain crop cultivation. Emerging grain crop recognition and distribution mapping methods that are based on remote sensing technology can be combined with our GCYS model to further optimize the accuracy of spatialization, and our results are expected to provide a reference for agricultural development or land use planning in the future.


Introduction
The actual grain crop yield is affected to different degrees by the physical environment and social factors [1][2][3][4], such as irregular topography [1,3], irrigation [5,6], soil nutrients [1,7], and the application of pesticides and chemical fertilizers [8,9]. These factors can generate grain crop yield differences, even at a fine scale [10][11][12][13][14]. Understanding the spatial distribution of the grain crop yield of specific regions plays a crucial role in policy-making and market management [15,16]. However, agricultural survey data at a county level mask the spatial heterogeneity of these factors. Spatialization downscales the grain crop yield statistics from administrative units to grid by coupling the physical and anthropogenic factors. The downscaled grid data can contribute to understanding the spatial heterogeneity of agricultural resource distribution. Identifying the major obstacles or opportunities at

Study Area
The Guizhou and Guangxi karst mountain region is a typical karst region in Southwest China, located between 22 • 8'54" N-28 • 12'27" N and 104 • 18'27" E-110 • 20'40" E. It covers 211,400 km 2 within the Guizhou and Guangxi provinces ( Figure 1). The area lies in the subtropical monsoon humid climate zone with a mean annual temperature of 19°C and a mean annual precipitation of 1350 mm. The terrain gradually descends from northwest to southeast with an elevation range of 0 to 2848 m. The topography in this region is mainly composed of the positive terrain that is represented by Fengcong and the negative terrain represented by depressions. The sloping land is common in karst regions, because the terrain is dominated by positive terrain with great amplitude relief, and the expansion of construction land occupied relatively flat and limited negative terrain regions [43]. The agricultural survey data show that the per capita grain holding in 2015 is about 300 kg in this region, which is only 66% of the national level and barely met basic food requirements.

Data Collection and Preprocessing
The administrative divisions at the county level are adjusted between 2000 and 2015; therefore, the names and boundaries of counties in the study area needed to be unified before analysis. We consolidated data within inconsistent counties in order to match the administrative division adjustment. Ninety counties in the Guizhou and Guangxi karst mountain region are used for analysis.
We collected the main agricultural production survey data for grain crops (i.e., rice, wheat, maize, soybean, and potato) from the Ministry of Agriculture and Rural Affairs of the People's Republic of China, including cultivated land area, irrigation area, use of chemical fertilizer, sown area, and crop yield for the five main crops in 90 counties. We selected these data at the county level because there are no available data at the land parcel unit or a more detailed scale. These agricultural survey data, except the five grain crop yields, constituted the comprehensive agricultural system, which is an important module of our spatialization model. The light and temperature potential productivity and climatic potential productivity are collected from the national standard of the Regulation for Gradation on Agricultural Land Quality in each county. The potential productivity of the main grain crops and calculations from the appointed meteorological observation station data of each county are revised based on the topographic characteristics of the station to the mean elevation and slope of the cultivated land of the county, according to the research conducted by Wang et al. [45]. The revised dataset reflects the maximum actual crop productivity of each grain crop at the county level.

Key Factors of Yield Spatialization
Ten parameters in five categories are applied to analyze the spatial differences of actual crop yields at different stages based on previous studies of the Guizhou and Guangxi karst mountain region ( Table 1). The land use types, categorized into six first level and twenty-five second level classes, are interpreted from Landsat images. The average interpretation accuracy according to field surveys and random sample checks are 92.9% and 97.6%, respectively. Paddy field and dry farmland distributions are extracted from all land use categories and they are used for actual yield spatialization, and the area of each cultivated land plot is calculated. The spatial scale of the land use classification and the topographic data is 30 m. The spatial scale of climate data, soil data, and economic data is 1 km. However, as the spatial heterogeneity of these data are embodied at a larger scale than land use and topography in practice, a grid scale with 1 km resolution is unable to reflect the fine spatial pattern. The resolution of 250 m is set as the appropriate grid scale when considering the complex topography of the study region and the need to maintain the details on land use as far as possible. Climatic data are used to calculate the NPP of cultivated land; topographic and soil data are used to modify the available soil water capacity. The area of each cultivated land plot and the topographic data are the key parameters for mechanized cultivation, and economic data are used to spatialize chemical fertilizer application. These factors are the key to determining the spatialization results of the actual grain crop yield.

The Grain Crop Yield Spatialization Model
NPP is defined as the net carbon retained by plants [46], which can be calculated and converted into potential productivity based on growth characteristic parameters; the converted yield is the light and temperature potential productivity. However, for a rainfed scenario, the yield should be revised by the available soil water capacity (AWC) to climate productivity. The potential productivity must be modified by anthropogenic field management practices to realize the transformation from potential productivity to actual crop yield, because factors, such as mechanization and chemical fertilizer application, have generated a gradually increasing impact on grain crop yield. Finally, the spatialization results should be calibrated at the grid and county levels, respectively, to improve the accuracy of actual grain crop yield at different scales.
Based on the framework mentioned above, the GCYS model consists of four modules ( Figure 2): (a) cultivated land NPP calculation, (b) comprehensive agricultural system construction, (c) key factors establishing, and (d) integration and validation. The GCYS model can realize the actual crop yield spatial allocation, which couples a comprehensive cultivation system with the spatial heterogeneity that is generated by human management factors. In the cultivated land NPP calculation module, NPP is estimated based on the Thornthwaite Memorial model, which is an improved version of the Miami model [47]. Once the NPP is obtained, it can be converted to potential yield by using the parameters of crop growth characteristics. However, comprehensive crop productivity rather than the specific single crop yield is the target of our research. Therefore, we improve the input parameters of the Thornthwaite Memorial model to fit the cultivation system of each county in this module.
In the agricultural system construction module, a comprehensive cultivation system conforming to reality is constructed, instead of only estimating a single crop yield distribution, in order to generate the comprehensive crop productivity parameters that are needed to be applied in the first module. Therefore, a standard cultivation system, a total cultivation structure, a multiple cropping index, and crop growth categories in paddy field and dry farmland for each county are coupled in our agricultural system.
In the key factors establishing module, physical and anthropogenic factors, which can influence the grain crop yields, are integrated to achieve a better response of actual grain crop yield spatialization. Several improvements are made in this module, which is based on the SAORES (Spatial Assessment and Optimization Tool for Regional Ecosystem Services) model [48]. When considering the complex topographic and lithological characteristics of the karst mountain region, the AWC is revised in dry farmland, according to the slope difference. In addition, human management factors, including mechanization and chemical fertilizer application, are spatialized to further revise the potential yield close to reality.
In the model integration and validation module, a complete model integrating the former three modules is established. Subsequently, the grid value is calibrated by the data collected from the national standard and revised by terrain characteristics in the previous step to achieve higher reliability at the grid level. Finally, the reliability of the model is evaluated at the county level.

Cultivated Land NPP Calculation Module
First, the NPP of cultivated land is calculated by the Thornthwaite Memorial model [49], which is driven by precipitation and temperature data: where NPP ck indicates the annual potential NPP in grid k of county c (g·C·m −2 ·a −1 ), V ck represents the actual annual evapotranspiration (mm) in grid k of county c, L ck represents the annual average evapotranspiration (mm) in grid k of county c, P ck is the annual average precipitation (mm) in grid k of county c, and T ck is the annual average temperature ( • C) in grid k of county c. Subsequently, the mixed potential grain crop yield (i.e., rice, wheat, maize, soybean, and potato) is calculated according to the improved Equation (2) [50][51][52], based on the specific crop growth characteristics. However, the actual crop cultivation distribution is difficult to obtain in different years, and some variable factors influence the grain crop category in one site, like market price fluctuation, and the comprehensive crop productivity rather than specific grain crop yield in one grid is the target of spatialization. The input parameters were improved from Equation (3) to Equation (5), which combines the proportional area of various crops in paddy field or dry farmland, according to the standard cultivation system. The calculations are as follows: where PY ck indicates the mixed potential crop yield in grid k of county c (kg/ha), which means that PY ck is the light and temperature potential productivity of paddy field or dry farmland; Mc c , HI c , and f AG_c are the mixed water content (%) for different cultivation systems, harvest index (the ratio of the crop harvest and economic production), and the ratio of the aboveground biomass and the total biomass in county c, respectively; for the parameters referring to crop n, if grid k in county c is located in a paddy field, then n = i, namely, i is the grain crop category that is planted in the paddy field of county c; or else, n = j, and j represented the crops cultivated in the dry farmland of county c; m c is a calibration coefficient for multiple indices: if cropping in county c take place once a year, then m c is 0.5, otherwise, if it occurs more than once a year, its value is 1.0; 0.45 gC/g is the carbon conversion coefficient; and, t means the transform coefficient of converting NPP to potential grain crop yield. The parameters in Equation (2) can be calculated, as follows: where AreaP nc represents the sown area proportion of the crop n planted in the paddy field or dry farmland of county c; Mc n , HI n , and f AG_n are the water content (%) for different cultivation systems, harvest index, and ratio of the aboveground biomass in county c (Table 2). Therefore, the NPP is the comprehensive net primary productivity weighted by sown area proportion of the mixed five grain crops, i.e., rice, wheat, maize, soybean, and potato. Once the AreaP nc become known quantity, the mixed potential grain crop yield in grid k of county c can be calculated.

Agricultural System Construction Module
The purpose of the agricultural system construction module is to obtain the proportional area for each grain crop in the irrigation (paddy field) or rainfed (dry farmland) scenarios, and then light and temperature productivity could be obtained using Equation (2). The paddy field area is not included in the agricultural survey data; therefore, the irrigation area is used for estimation instead. Based on the survey data that were collected from the Regulation for the Gradation on Agriculture Land Quality, the multiple cropping index showed that crops in nine counties are harvested once a year, and three counties had triple cropping. Thus, most of the ninety counties harvest at least twice a year. For the counties where the multiple cropping index is larger than one, the cropping pattern of rice combined with another grain crop is universal.
The sown area in paddy yield is equal to the sown area of rice in the counties that did not have multiple cropping of rice with other grain crops. In contrast, in the counties practicing multiple cropping of rice with rainfed crops, no more than two crops are planted. For example, the standard double or triple cropping pattern is "a-b" or "a-a-b"("-" indicates the planting order of multiple crops in a year; "a" and "b" stands for the different grain crops), and the sown area estimation of these grain crops in paddy field is calculated, as follows: where IrrigationArea c is the irrigation area of county c (ha), MCI c is the multiple cropping index in county c, SownArea ac is the sown area of the crop a in county c (ha), and SurplusArea ac is the surplus sown area with irrigation in county c (ha) after planting crop a. Subsequently, an allocation on crop area is made while using Equation (7): SownArea Pbc = SownArea Pc − SownArea ac (8) where SownArea bc is the sown area of the crop b (ha) in county c, SownArea Pc is the total area of grain crops cultivating in the paddy field (ha) in county c, and SownArea Pbc and SownArea Dbc are areas (ha) of crop b cultivating in the paddy field and dry farmland in county c, respectively. If the SurplusArea ac is not less than the sown area of crop b in county c, then the total area of crop a and b is the sown area of grain crops in paddy fields in county c; if the SurplusArea c is less than the sown area of crop b, the SurplusArea c becomes the sown area of crop b in the paddy field. After carrying out the calculation of Equation (7), the area of crop b in the paddy field and dry farmland in county c can be obtained based on Equations (8) and (9). After the sown area of grain crops in the paddy field or dry farmland is determined, the total sown area in the two scenarios is calculated, as follows:

Key Factors Establishing Module
In the key factors establishing module, three physical or anthropogenic factors, including the AWC revised by topography, mechanization, and chemical fertilizer application, are calculated in order to improve the spatial heterogeneity accuracy of the grain crop yield at the grid level.

Soil Available Water Capacity
The AWC is the volume of soil water that should be available to plants (unit: 10 −2 cm 3 ·cm −3 ) [53]. The AWC is related to the ratio of sand, silt, clay soil, and soil organic matter based on previous experiments in the study area [54]. The revision of AWC is only calculated in the dry farmland. When considering the sum proportion of sand, silt, clay soil is 1, the quantitative relations are as follows: where (SAN%) ck , (SIL%) ck , and (C%) ck are the ratios of the sand, silt, and organic matter of grid k in county c, respectively, and the AWC ck is the available soil water capacity of grid k.
Equation (13) could be used to estimate the mean AWC k value of most soil categories. However, the complex terrains in the karst mountain region impacted AWC k . An experimental research in Puding County, which is located in our study area, showed a linear decrease in the capacity until the slope is up to 40 • [53]. Therefore, the AWC k of farmland where the slope is larger than 6 • is revised according to the slope land grade of the technical regulation of the Third Nationwide Land and Resources Survey, as follows: where Slope k is the slope of grid k, when the slope degree is larger than 6 • , Max(Slope k ) is the maximum slope value of whole study area, S k is the calibration coefficient of AWC k , and CAWC k is the actual soil water capacity calibrated by slope degree. If the slope degree of grid k is less than 6 • , the calibration coefficient S k is 1; Min(CAWC) k and Max(CAWC) k are the minimum and maximum value of CAWC k in the whole study area, respectively, and NCAWC k is the normalization result of CAWC k . For the paddy field, the value of NCAWC k is 1.

Mechanized Cultivation
The farmland patch area and grain crop categories are the two main factors impacting the mechanized cultivation rate. Previous researches have shown that the length of farmland patches that are available for mechanized cultivation are usually larger than 200 m [55], and the slope is no more than 6 • [56]. Restricted by the spatial resolution of the raster data in the present study, we select the area of one grid cell as the lower limit for mechanized cultivation.
The mechanized cultivation process included three steps: plowing, seeding, and harvesting. The contribution of the mechanized cultivation rate to the increase in production of grain crops is calculated, as follows: As for the potato and soybean, the mechanization rate of the latter two steps is extremely low, and the sown areas of both are relatively lower than the other three grain crops in most counties, so we regarded the M nc values of potato and soybean as 0.

Chemical Fertilizer Application
Li. et al. [58] showed that the amount of chemical fertilizer application is significantly correlated with the farmers' household income, which is determined by using questionnaires. However, this indicator is not included in the agricultural survey data of all counties each year. Therefore, the chemical fertilizer application is quantified, as follows: where PV c and GDP c are the planting industry production value (10 4 yuan) and the gross regional domestic product (10 4 yuan) of county c (c = 1, 2, . . . , 90), respectively; GDP ck is the GDP value of grid k in county c (10 4 yuan); therefore, PV ck is the planting industry production value of grid k in county c (10 4 yuan), and the k PV ck is the total planting industry production value of county c (10 4 yuan); Cf c is the net application of chemical fertilizer of county c (kg/ha), which is collected from the available agricultural survey data; and, Cf ck is the spatialization value of the net application of chemical fertilizer in grid k of county c (kg/ha). The upper limit is set as 1000 kg/ha when considering the amount of chemical fertilizer applied to grain crops. The mean crop yield increase from the fertilizer application is calculated, as follows [28]: where ∆Y ck is the mean increasing yield in grid k of county c (kg/ha). After revising the potential yield by the three factors, respectively, the maximum reduction or increase in the yield, namely, the dominated factor, can be identified among the three factors. Then the area proportion of each domination factors can be calculated, which help to understand the sphere of each factor influence.

Integration and Validation Module
The former three modules are integrated to determine the spatialization of crop yield in this module, and then the reliability of the model is verified; and finally, the grid values are calibrated to be lower than the revised crop productivity collected through the national standard, while the spatialized results are revised being the same as agricultural survey data at the county level.
The three factors developed in the second module are combined further to describe the spatial heterogeneity of the actual grain crop yield, while the revised and normalized available soil water capacity, and anthropogenic management factors, including mechanized cultivation and chemical fertilizer application, are considered both in dry farmland and paddy field, as follows: where Y ck is the final actual crop yield in grid k of county c after revised by the three key factors (kg/ha). Our results indicating the comprehensive productivity of grain crop yield is difficult to be accurately assessed. However, it is feasible to assess the accuracy of the GCYS model through the available crop distribution extent. We collected the rice distribution within the pure pixels of paddy field based on our comprehensive agricultural system module, and calculated the average per unit area of paddy rice of each county in ArcGIS 10.2. The average per unit yield of paddy field can be calculated according to the yield and sown area of rice. The significant correlation test is used to validate the reliability of our GCYS model. First, the data normal distribution is examined before the correlation analysis. We set the null hypothesis that there is no correlation between the two groups. Otherwise, the alternative hypothesis, that a significant correlation existed, is accepted. The p-value is set as 0.01. The linear regression is applied to analyze the function relationship and the difference between the two groups.
From the micro-scale perspective, the crop yield in a grid should be less than the light and temperature productivity of paddy field or the climate productivity of dry farmland, which are calculated through observational data and topographic calibration. The calibration at the grid level is as follows: where if the grid k is located in paddy field of county c, Max(RY ci ) indicates the maximum per unit area yield of crop i of county c; or else, Max(RY cj ) is the maximum per unit area yield of crop j of county c.

The Application of the GCYS Model
Grain crops are a basic and important source of nutritional calories for humans, and grain crop production is one of the most important indicators of food security and healthy diet [59]. Because all the grain crop yields are estimated in our model, the spatialization result is used to calculate the calories of local grain crops by using the following formula [59]: where Q ck is the total calorie in grid k of county c (kcal/ha); EP n is the edible proportion of crop n (%); EP c is the comprehensive edible proportion of grain crops of county c (%); C n is the calorie of crop n (kcal/100 g); and, C c is the comprehensive calorie of grain crops of county c (kcal/100 g).

Spatial Pattern of Net Primary Productivity
NPP calculation is the first step for estimating the potential yield of grain crops. The mean mixed NPP value of cultivated land is 1629.92 in 2000 and 1765.00 g·C·m −2 ·a −1 in 2015, respectively (Figure 3a,b). In 2000, the NPP distribution values are higher in the central region, and lower in the north and south regions of the study area. Relatively higher elevation and complex karst terrain are distributed in the northern region, and temperature became the limiting factor for NPP. The lowest annual average temperature of the whole area is 9.16°C in the north, which is significantly lower than the maximum average temperature in the whole study area of 23.13 • C. Precipitation is the limiting factor in the southern region, with the lowest precipitation of 1029 mm in this region, which is far below the 1849 mm in the central area. In 2015, precipitation is a dominant factor in NPP spatial pattern changes. The higher precipitation area moves slightly to the south, and the NPP shows a similar spatial distribution.

Spatialization of Potential Productivity
Generally, the spatial pattern of the mixed transform coefficient decreased slightly by 0.02 from 2000 to 2015. The mixed coefficient of paddy field decreased by 0.0052. However, the change of that in dry farmland is −0.0311 due to the diversity of cropping categories of dry farmland. The most significant change is found in Dafang County, in the north of the study area, where there is a maximum reduction of 0.31 due to remarkable changes in the cropping structure. In 2000, the sown area of potato is just 1.05%, but the ratio sharply increased to 40.35% in 2015, while the sown areas of the other four crops decreased. The mixed transform coefficient of potato from NPP to potential productivity is the lowest among the five grain crops based on Table 3; thus, the mixed transform coefficient of Dafang County shows the greatest reduction (Figure 4). A similar situation is observed across the northeast and northwest.  The mixed potential grain yield (i.e., rice, wheat, maize, soybean, and potato) is calculated ( Figure 5). The lower mixed potential grain crop yields are mainly distributed in the northeast and southwest regions in 2000, and the higher mixed potential grain crop yields are mainly located in the central part of the study area. In contrast, the relatively low grain crop yield region expanded in the northern area in 2015, while the regions with the higher values showed a southeast shift. The minimum and the maximum per unit area potential yield both increased from 2000 to 2015, and the mean value of the whole study area increased from 14,600.57 kg/ha in 2000 to 15,521.49 kg/ha in 2015.  Figure 6 shows the areas of different dominant factors influencing the grain crop yield. In general, the AWC is the limiting factor affecting the largest area. The proportion of dry farmland increased from 2000 to 2015. Although field management technologies, including the mechanization level and chemical fertilizer application, improved rapidly, the AWC increased slightly from 59.08% to 59.29% (Table 4).  The impact of the proportional area of mechanization is the smallest among the three key factors, and it had little effect with the area proportion of 0.23% in 2000. However, its dominated area proportion increased to 17.47% in 2015. In the southern region, where the terrain is relatively flat and multiple crops are harvested at least twice a year, the advantages of mechanization gradually become the dominant factor, thereby playing a more important role of increasing production in 2015.

Key Factors Analysis of Actual Yield
The chemical fertilizer application proved to be the most efficient field management technology for improving crop yield during the study period, even though the effect of mechanization grew rapidly. From 2000 to 2015, the dominant area of chemical fertilizer application decreased by 17.45%, but the mean per unit area of production increased from 332.27 kg/ha to 347.52 kg/ha. Figure 7 shows the regression lines of the per unit area yield in county level between the spatialized results of GCYS model and agricultural survey data. The determination coefficients of paddy field in 2000 and 2015 are 0.72 and 0.76, respectively, and the model results are significantly correlated with the agricultural survey data (p < 0.01) and have a higher fitting degree when compared with previous studies [16,28,35]. The slopes of the two curves are 0.95 and 1.23, respectively, indicating that the spatialization results are close to reality, although a slight underestimation and overestimation occurred for paddy rice yield estimation per unit area at the county level in 2000 or 2015. The significant correlation of spatialization results and agricultural survey data verified the reliability of our model. Based on this, the spatialization results were revised with the agricultural survey data at the county level, and then calibrate any grid productivity values higher than potential productivity. However, the effect of field management technologies, like chemical fertilizer application in lower potential yield regions, will be larger than that in other regions. Although we have taken these factors into account, the uncertainty of the simulated results still exists, especially the points that are located in lower value area of the plot while deviating from the straight line.

Spatialization Results of Actual Grain Crop Yield
The mean mixed grain crop yield (i.e., rice, wheat, maize, soybean, and potato) of the total cultivated lands is 4745.10 kg/ha in 2000 and 5592.89 kg/ha in 2015, respectively. In 2000, the mean mixed grain crop yield of paddy field and dry farmland are 8495.48 kg/ha and 2894.82 kg/ha, respectively. Additionally, the yield changed to 8473.25 kg/ha and 3676.96 kg/ha, respectively, by 2015. In general, although the local spatial pattern showed variations at different degrees, the gradient of the actual yield in the whole study area increased from northwest to southeast, consistent with elevation gradient (Figure 8). In the north, the actual grain crop yield decreased in 2015. The AWC, the dominant factor of actual grain crop yield, is relatively steady, but changes in local temperature and precipitation led to a decrease in NPP. Meanwhile, the change in cropping structure decreased the transform coefficient from NPP to potential yield (Figure 4), thus limiting potential productivity and the actual yield. In the southern region, the productivity of low actual yield obviously improved, and the high value region is more concentrated and moved southward. The increase in potential yield and its southward shift was attributed to the rapid mechanized cultivation development in these regions. Huanjiang County is used as an example to analyze the spatial pattern change in actual grain crop yield. Huanjiang County is a typical karst plateau-peak-cluster-depression transition region that is located in the middle of the study area, and the elevation gradually decreased from northeast to southwest. The average per unit area yield of mixed actual grain crop is 2395.84 kg/ha in 2000. Except for farmlands with a low productivity (less than 1000 kg/ha) scattered on the steep mountains in the north, the lower values mainly surrounded the mountains in the east (Figure 9). Relatively high productivities in the paddy field are found in the southern karst depression areas. In 2015, the average per unit area yield of mixed actual grain crop increased to 4505.41 kg/ha (an increase of 88.51%). The proportional area of increasing yields dominated by mechanization increased from 0% in 2000 to 13.07% in 2015 in the southern flat region. However, this substantial improvement does not alter the spatial pattern of the whole county. Especially in the northern and eastern mountains, AWC determined by terrain and soil attributes is the dominant limiting factor for actual grain crop yield, although the per unit area yield value is improved by other factors.

The Spatio-Temporal Change in Actual Grain Crop Yield
The mixed average grain crop yield (i.e., rice, wheat, maize, soybean, and potato) increased by 847.79 kg/ha from 2000 to 2015. We set (0,4000), (4000,8000), and (8000,15,000) kg/ha as the ranges for low, medium, and high productivity fields. It shows that the per unit area yield improved from 2000 to 2015 (Figure 10), mainly due to the decreasing percentage of low productivity fields and the increasing percentage of medium productivity fields. The per unit area yield were reclassified as seven groups with an interval of 2000 kg/ha. The percentage of the lowest productivity fields registers the largest decline, decreasing from 11.84% in 2000 to 0.71% in 2015, and the percentage of medium productivity fields (4000-6000 kg/ha) increases from 14.81% to 25.20%, showing the largest increase. From 2000 to 2015, the cultivated land area decreased, and a total of 119,112.50 ha of farmland is lost. Of the total land lost, 50.70% is paddy field and 49.30% is dry farmland, which is mainly located in the northern karst mountains. Even though 56,781.25 ha of land was newly cultivated by 2015, its average per unit yield is 4255.75 kg/ha, which is less than the lost farmland per unit yield of 4485.20 kg/ha ( Figure 11). Therefore, from the perspective of general grain crop productivity, the productivity of supplementary cultivated land declined. However, the average per unit yield of total cultivated land increased from 4745.10 kg/ha to 5592.89 kg/ha, which arose from the beneficial effects from the physical environment and improved field management on existing farmland.  Figure 12 shows the mixed grain crop (i.e., rice, wheat, maize, soybean, and potato) calorie map. The spatial distribution of calorie outputs is higher in the south and lower in the north. The average calories for 2000 and 2015 are 6.37 × 10 6 kcal/ha and 6.63 × 10 6 kcal/ha, respectively, and the per capita calories obtained from grain crops each day are about 654.76 kcal/person and 1025.77 kcal/person, respectively. It should be noted that the grain crop calorie distributions are not the same as the actual yield spatialization results, because they are related to the planting structure and the multiple indices. For example, the grain crop calories of relatively flat paddy field in the southeast, where rice is harvested twice a year, would be far greater than the calories of dry farmland in the northern region, where the standard cultivation system only allows harvest once a year and potatoes occupied a large percentage area. The unit mass calories of the potato are just one-fifth of the other four grain crops. In the counties with a low proportion of potato cultivation, the spatial pattern of calorie spatialization results is similar to that of the grain crop yield. Therefore, the variation of calorie values is larger than that in yield in the whole study area.

Discussion
This research aims to realize the spatialization of actual grain crop yield. Knowledge of cultivation systems and planting structures are essential preconditions for analyzing the grain crop production [1,28,60]. Therefore, we built a comprehensive agricultural system, which integrates the standard cultivation system, the planting structure, including all five grain crops (i.e., rice, wheat, maize, soybean, and potato), and the cropping difference between the paddy field and dry farmland at the county level in the Guizhou and Guangxi karst mountain region in China. This agricultural system reflects the agricultural production rather than only estimating one crop at the county level [6,12,18,21]. The characteristics of comprehensiveness and localization of the cultivation system underpin the basic framework of the GCYS model.
Previous studies focused on the physical factors limiting crop productivity [27,28,51]. However, improvements in field management technologies have resulted that anthropogenic factors gradually become the key elements of increasing crop yields [3]. Such yield differences at a micro-scale cannot been described in accurate detail just by using physical limiting factors in previous studies. Therefore, based on former studies, questionnaires, and experiments in the Guizhou and Guangxi karst mountain region, we can spatialize the effects of mechanization and chemical fertilizer applications on improved crop production to establish the key factors module of the GCYS model. These factors highlighted the spatial differences in the grain crop yield at the grid level. The reliability of our GCYS model is validated by the significant correlations between spatialization results and agricultural survey data of paddy rice yields, and the determination coefficients are 0.72 and 0.76 in 2000 and 2015, respectively. Furthermore, validation at the county and grid scale helped to improve the accuracy of the GCYS model.
The cultivation area of grain crops in the paddy field or dry farmland is quite different in our study. The proportions of farmland used for grain crop cultivation in paddy fields are 74.22% and 71.84% in 2000 and 2015, respectively, according to the estimation of our agricultural system module. In contrast, the average proportions of dry farmland are only 49.97% in 2000, and decreased to 43.13% in 2015. The lower proportion of grain crops in dry farmland might be less accurately estimated than in paddy fields, because greater uncertainty existed in practice, especially whether the grain crops are cultivated on low or high productivity farmland [35]. This problem is hard to solve, unless the precise grain crop distribution can be obtained [15,35,42]. However, precise mapping of grain crop distribution is also difficult to implement, particularly in large-scale studies [11,23]. The validation and calibration process at both the county and grid scale of GCYS model helped to minimize such uncertainty to some extent, improving the adaptability and stability of the GCYS model.
The per unit area yield in a grid is the mixed grain crop yield following the cropping structure within its county. Although the mixed process decreased the spatial heterogeneity of the grain crop yield, the grid yield is a comprehensive reflection of farmland productivity, field management level, and cultivated systems, where these factors are relatively stable or controllable. Therefore, drastic yield fluctuations because of changes in the type of cultivated grain crop or sown area resulting from unstable factors, like policy [16] or market forces [17], are avoided. Additional refinement of crop yield spatialization might come from the quantification of high-resolution nitrogen or other flows in cultivated land [8,34,61,62]. The relationship between chemical fertilizer application and crop yield would provide us a new perspective to improve actual yield spatialization. Furthermore, algorithms, such as cross-entropy and the general algebraic modeling system (GAMS) [17,26], can be coupled with our model to improve spatial allocation and optimization procedure, realizing a higher accuracy in the GCYS model.
The GCYS model can spatialize the agricultural survey data of grain crop yields coupled with physical and anthropogenic factors, while indicating the main limiting or improvement factors for local grain crop yields. With multiple demands for food security, economic development, and ecological protection [63,64], more informative and precise results, accurately spatialized through the GCYS model, will help to improve field management measures [17], generate more rational farmland use planning, and improve land use efficiency [63,65]. It is important to explore the production potential of cultivated land, especially where the scale of farmland is limited by physical or anthropogenic factors [3].
With the advances in remote sensing technology, efforts have been made to search for efficient ways of extracting the spatial distributions of major grain crops [15,66]. Some new indicators, such as the evaporative stress index (ESI), have been found to be significantly correlated with crop yields at the field scale [67]. Combined with the GCYS model, the accuracy of actual grain crop yield spatialization would be potentially improved. In order to avoid fluctuations caused by unstable factors, the GCYS model could also be applied to predict grain crop yield and spatialization patterns in combination with agricultural development planning in the future and provide a reference for decision-making.

Conclusions
The spatialized grain crop yield helped to explore the major obstacles or key factors for crop yields, contributed to an understanding of the spatial heterogeneity of agricultural resource distributions, and might serve for agricultural development planning in the future. The novelty of the GCYS model is the construction of a comprehensive agricultural system, then coupled with key physical and anthropogenic factors to realize accurate and actual grain crop yield spatialization from agricultural survey data at a finer-scale grid, and finally refined the details of spatial heterogeneity.
The GCYS model is applied and tested while taking Guizhou and Guangxi Karst Mountains Region as the case study area. The results show that the mean per unit yield of mixed grain crops (i.e., rice, wheat, maize, soybean, and potato) weighted by sown area proportion of each grain crop of the total cultivated land is 4745.10 kg/ha and 5592.89 kg/ha in 2000 and 2015, respectively. Although the average per unit area yield of mixed grain crops increased from 2000 to 2015, its value of newly reclaimed cultivated lands trended downward in general. The main limiting and improvement factors are AWC and chemical fertilizer application, respectively. Meanwhile, the dominated proportional area of mechanization increased rapidly from 0.23% in 2000 to 17.47% in 2015. The significant correlation between agricultural survey data and spatialization results for rice, with a determination coefficient of 0.72 and 0.76 in 2000 and 2015, respectively. This validated the reliability of the GCYS model. The slopes of regression lines are close to 1, indicating the high accuracy of our model. Finally, a mixed grain crop calorie map is generated based on the actual yield spatialization results. This helps to identify the calories, where people could obtain from local grain crop cultivation. Emerging grain crop recognition and distribution mapping methods that are based on remote sensing technology can be combined with our GCYS model to further optimize the accuracy of spatialization, and our results are expected to provide a reference for agricultural development or land use planning in the future.