1. Introduction
Amid the global shift toward a low-carbon and intelligent economy, agriculture—the fundamental pillar sustaining human survival and development—faces dual challenges of resource constraints and ecological pressures in pursuing sustainable development. These challenges are particularly acute for China: as the world’s largest developing agricultural nation, China supports nearly 20% of the global population with only 7% of the world’s arable land, making the tension between food security and ecological protection exceptionally pronounced. Currently, China’s agricultural development is confronted with unique structural problems. The degradation of the vital black soil in the Northeast Plain—characterized by thinning, nutrient depletion, and compaction—has already spread to core production regions. In the southern red soil areas, over 40% of the land is affected by acidification, and declining soil quality directly limits the stability of grain productivity. Persistent overuse of fertilizers and pesticides has led to non-point source pollution, creating a vicious cycle of “diffuse pollution—water eutrophication—deteriorating soil quality,” with 19.4% of the country’s arable land now polluted. Furthermore, the mismatch between smallholder, dispersed farming and the requirements of modern, large-scale agriculture has resulted in inefficient resource allocation; water resource utilization efficiency remains below 50%, significantly lagging behind developed countries. Collectively, these issues represent specific constraints on China’s green agricultural transition and pose real obstacles to achieving high-quality development in the sector. The direct impact of global trends on China’s agricultural environment has become increasingly evident, further intensifying these contradictions. First, global warming has led to a higher frequency of agro-meteorological disasters in China, with more than 66.7 million hectares of crops affected annually over the past five years. 
The southward shift of the northern arid belt has expanded irrigation gaps in the major wheat-producing areas of North China, while increased heavy rainfall in the south has exacerbated soil and nutrient losses, with the spread of non-point source pollution rising by 30% compared to a decade ago. Second, new international low-carbon trade rules have put pressure on agricultural exports. The EU’s Carbon Border Adjustment Mechanism (CBAM) has already included some agricultural products under carbon footprint tracing, and China’s agricultural carbon emission intensity per unit of output remains 1.5 to 2 times higher than that of developed countries, increasing the challenge posed by green trade barriers. Third, fluctuations in the global food market have heightened supply risks, compelling China to maintain both food self-sufficiency and the long-term productivity of its arable land. This dual imperative—to “ensure yield” while “preserving ecology”—constitutes a unique challenge for Chinese agriculture amid unfolding global trends.
In recent years, agricultural and rural big data have emerged as strategic production factors driving transformative changes within agriculture. By building comprehensive data governance systems—encompassing soil moisture monitoring, precision input management, and dynamic carbon accounting—such data technologies offer innovative solutions to persistent challenges like information silos, decision-making gaps, and implementation delays. New infrastructure developments, such as remote sensing monitoring networks, IoT sensors, and blockchain traceability platforms, are further driving spatial optimization and procedural reform in resource management. Data-driven intelligence systems, which integrate meteorological forecasts, market fluctuations, and ecological carrying capacities, have significantly improved the efficient allocation of resources such as water, fertilizer, and pesticides, providing quantitative foundations for practices like fallowing, pollution interception, and biodiversity conservation. However, leveraging big data to empower green agricultural development still faces deep-seated institutional barriers. Unclear data property rights limit market-based circulation, a lack of digital literacy among smallholders undermines technology adoption, and interdepartmental data silos impede coordinated governance. The urgent task at hand is to shift from traditional models of green agricultural development toward approaches that incorporate big data into sustainable and smart practices. To address this, the present study employs a difference-in-differences method to investigate both the extent and the mechanisms by which agricultural and rural big data policy promotion impacts high-quality agricultural development, thereby advancing sustainable agriculture.
The advancement of modern agriculture is inextricably linked to the deep integration of agricultural digitalization [1]. The rise of agricultural big data not only aligns with inevitable trends in agricultural modernization but is also reshaping how we produce and live. Extensive research has already highlighted the role of big data in supporting rural revitalization and smart agriculture, providing solid theoretical underpinnings for this study. For instance, Liu et al. [2] proposed an integrated solutions framework for smart rural clusters centered on “smart villages + big data technology + agricultural services,” which builds a data-driven ecosystem incorporating real-time data processing, machine learning, and containerization for data collection, smart agriculture analytics, and direct marketing of agricultural products. These subsystems collaborate to enhance scientific and intelligent agricultural services, driving rural social advancement and sustainable development. Feng et al. [3] examined the application of big data within rural revitalization policies (spanning dimensions such as industrial prosperity, ecological livability, cultural vitality, effective governance, and affluence), while Ma [4] empirically verified the positive relationship between IoT-driven smart agriculture and rural revitalization. From an international perspective, despite agriculture’s foundational role in India, declines in rural population and per capita arable land represent formidable challenges. In response, Shankarnarayan et al. [5] and Yang et al. [6] identified the promise of big data research for addressing these structural issues. Zang et al. [7] and Tang et al. [8] emphasize that only through a thorough analysis of the current state of big data infrastructure, combined with a careful consideration of the actual circumstances of rural households, can appropriate development models be identified.
Despite deepening integration between agriculture and internet technologies, farmers still face significant barriers in accessing and utilizing data and services, highlighting the importance of improving farmers’ data literacy for advancing green, high-quality development [9,10,11]. Achieving this requires a focus on agricultural and rural ecosystems, which exert crucial influences on the broader economy, society, and environment. Enhanced national attention to these issues has made scientific management of agricultural ecosystems a research priority. The adoption of big data technologies, coupled with system theory and data mining, enables more effective analysis and management, thereby facilitating modernization and sustainability [12]. The goal of green governance is to realize the transformation and high-quality development of agriculture; economic growth theories and total factor productivity provide established frameworks for evaluating such progress [13]. Digital technologies continue to fuel advances in sustainable agriculture in China, with spatial patterns indicating robust digital village construction in coastal areas and strong green development in southern regions. The degree of coupling between digital villages and green agricultural development is trending upward in eastern China, and factors such as e-commerce, income, innovation, and education play a decisive role in this coordinated linkage [14,15].
In summary, agricultural and rural big data are becoming vital drivers of green, high-quality agricultural development, offering technological solutions to structural challenges and laying the groundwork for integrated management of agriculture and ecosystems. By systematically reviewing relevant literature and employing empirical DID analysis, this study explores the policy mechanisms by which big data advances green agricultural transformation, enhances decision-making, and improves resource allocation. These insights bear significant theoretical and practical relevance for the implementation of high-quality agricultural development and rural revitalization strategies.
3. Construction and Measurement of the High-Quality Agricultural Development Indicator System
3.1. Indicator System for High-Quality Agricultural Development
Table 1 systematically constructs an indicator system for high-quality agricultural development along five dimensions: innovative development, coordinated development, green development, open development, and shared development. This multidimensional framework provides a scientific basis for the sustainable, stable, and efficient advancement of agriculture.
From a theoretical perspective, the selection of these five dimensions is well grounded. First, they align with the core requirements of the new development philosophy: innovation serves as the driving force, coordination as the intrinsic requirement, greenness as the necessary condition, openness as the essential pathway, and sharing as the fundamental goal. Together, these five dimensions form a comprehensive logical framework for high-quality development. Second, the framework reflects the multi-objective nature of agricultural economic systems. High-quality agricultural development pursues not only efficiency improvements but also ecological protection, structural optimization, and the broad sharing of developmental outcomes. Existing research confirms that the quality of agricultural development should be assessed through dimensions such as innovation-driven growth, industrial coordination, green transformation, openness and integration, and improvements in people’s livelihoods. Third, these dimensions encompass the entire agricultural value chain, from innovation inputs at the production end to the sharing of achievements at the consumption end, and from domestic industrial coordination to international market openness. Collectively they form a complete and coherent set of indicators that reflects the essential connotations of high-quality agricultural development.
Innovation: This dimension emphasizes the role of technological input and innovation capacity, as reflected in indicators such as the share of government R&D expenditures, the level of agricultural mechanization (total power of agricultural machinery per sown area), R&D intensity, number of patent applications, and agricultural GDP per unit area. It captures the critical importance of technological progress in driving modernization, productivity, and economic upgrading.
Coordination: Indicators under this dimension focus on synergistic interactions within agriculture and between agriculture and rural economies. Key metrics include financial support for agriculture, the rural Engel coefficient, rural consumption levels, the contribution of primary industry to GDP, and an index reflecting adjustments to agricultural industry structure. The dimension highlights the integration between agricultural development and improvements in farmers’ income and consumption, aiming for balanced and harmonious economic systems.
Green development: Here, attention centers on ecological protection and resource use efficiency, assessed by the per-unit-area application of fertilizers, pesticides, and agricultural plastic film, together with forest coverage. These indicators reflect the embedding of ecological considerations within agriculture, supporting the sector’s green transformation.
Openness: This dimension measures the degree of agricultural integration with international markets via export and import dependency ratios. It highlights the competitiveness of an open agricultural economy and its capacity for global integration.
Sharing: Finally, indicators here gauge the equitable distribution of development benefits between urban and rural populations—including income ratios, per capita rural disposable income, and rural spending on education, culture, and entertainment—highlighting agriculture’s role in enhancing life quality and social equity.
3.2. Measurement Method
This study employs the entropy method to comprehensively assess the level of high-quality agricultural development. After constructing a composite evaluation index system, the entropy method is used to determine the weight of each indicator. Weighted aggregation then produces a composite index for each province, which serves as the measure of its level of high-quality agricultural development.
Assigning weights in indicator evaluation systems generally falls into two categories. The first is subjective weighting, whereby scholars or experts determine indicator weights based on knowledge, experience, or preferences—examples include the analytic hierarchy process (AHP) and the Delphi method. Such approaches involve a high degree of subjectivity and may introduce bias into results. The second is objective weighting, in which weights are determined by analyzing either the information content of original data or correlations among indicators. The entropy method and principal component analysis are representative of this category. To ensure objectivity and accuracy in evaluation, this study adopts the entropy method, which quantitatively captures information diversity within data and is especially suitable for comparative studies of regional developmental differences. It is important to note, however, that the entropy method has certain limitations. First, it is highly sensitive to extreme values; the presence of unusually high or low data points can significantly affect the calculation of information entropy and thus introduce bias in the weight assignment. Second, the method is sensitive to missing data. If indicator data for certain provinces or years are incomplete, even if missing values are supplemented using interpolation techniques, the accuracy of weight calculation may still be compromised. Third, because the entropy method determines weights solely based on the degree of data dispersion, it does not account for the theoretical significance of each indicator, potentially leading to a disconnect between data-driven and theory-driven approaches. To address these limitations, this study applies several data preprocessing steps before using the entropy method: extreme values in the raw data are managed through techniques such as Winsorization, and multiple imputation is employed to fill in missing data. 
These measures are intended to mitigate the methodological limitations and enhance the robustness of the results. Despite its constraints, the entropy method effectively captures data variability and is particularly well-suited for comparative studies on developmental differences across provinces. Given the panel data structure, a time-dimensional entropy method is introduced to calculate indicator weights and to measure high-quality agricultural development across provinces for the period 2011–2022.
First, let $D$ denote the original $m \times n$ matrix containing the values of $n$ indicators for $m$ regions, where $d_{ij}$ is the value of the $j$th indicator in the $i$th province. Next, according to model assumptions and the characteristics of the indicators, all variables are processed as positive indicators and standardized using the following formula:
where $X = (x_{ij})_{m \times n}$ denotes the matrix of normalized indicators.
Subsequently, the weights for each indicator are calculated as follows:
Finally, the composite score $S$ for each dimension is computed as follows:
where $S$ represents the comprehensive level of high-quality agricultural development in province $i$.
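The three steps above (standardization, weighting, aggregation) can be illustrated with a minimal sketch of the textbook cross-sectional entropy method. It is not the time-dimensional variant used in this study, whose exact formulas are not reproduced here. The min-max standardization, the small shift `EPS` that avoids taking the logarithm of zero, the function name `entropy_weights`, and the toy data are all assumptions made for illustration.

```python
import math

EPS = 1e-4  # assumed shift so that standardized values are strictly positive


def entropy_weights(data):
    """Return (weights, scores) for a list of rows (regions) of positive-direction indicators."""
    m, n = len(data), len(data[0])

    # Step 1: min-max standardization of every indicator (column).
    x = [[0.0] * n for _ in range(m)]
    for j in range(n):
        column = [row[j] for row in data]
        low, span = min(column), max(column) - min(column)
        for i in range(m):
            x[i][j] = ((data[i][j] - low) / span if span > 0 else 0.0) + EPS

    # Step 2: information entropy e_j and divergence 1 - e_j of every indicator.
    divergence = []
    for j in range(n):
        total = sum(x[i][j] for i in range(m))
        shares = [x[i][j] / total for i in range(m)]
        entropy = -sum(p * math.log(p) for p in shares) / math.log(m)
        divergence.append(1.0 - entropy)

    # Step 3: normalize divergences into weights (equal weights if nothing varies).
    total_div = sum(divergence)
    weights = [d / total_div for d in divergence] if total_div > 0 else [1.0 / n] * n

    # Step 4: composite score of every region as the weighted sum of standardized values.
    scores = [sum(weights[j] * x[i][j] for j in range(n)) for i in range(m)]
    return weights, scores


# Toy data: 3 hypothetical regions, 2 indicators. The second indicator is far
# more concentrated in one region, so it carries more information.
toy = [[1.0, 0.0], [2.0, 0.0], [3.0, 1.0]]
weights, scores = entropy_weights(toy)
print(weights, scores)
```

In this toy case the second indicator receives the larger weight because its values are concentrated in a single region, which is exactly the dispersion-driven behavior (and the sensitivity to extreme values) discussed in the limitations above.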
3.3. Data Analysis
3.3.1. Overview of Spatial Patterns
Figure 1 illustrates the spatial distribution and changes in high-quality agricultural development across China from 2011–2016 and 2017–2022. The six maps on the left depict the overall characteristics for 2011–2016, while those on the right correspond to 2017–2022. Major grain-producing areas and developed eastern coastal regions consistently demonstrate outstanding performance, and although the overall core distribution pattern remains unchanged, some provinces’ rankings shift between periods. The introduction of the Agricultural and Rural Big Data policy in 2016 marks a critical historical inflection point in the temporal segmentation.
Specifically, both eastern and central regions maintained leadership in high-quality agricultural development from 2011–2016 through 2017–2022. The advantages of key grain-producing provinces such as Jiangsu, Shandong, and Henan were significantly above the national average. In contrast, while certain western and northeastern provinces made progress, their aggregate development remained comparatively lagging, underscoring persistent regional disparities. Notably, the 2016 policy introduction precipitated a structural shift in spatial patterns—during 2017–2022, eastern coastal and some central provinces achieved more conspicuous improvements. This shift reflects not only the accelerated adoption of digital and intelligent tools in resource allocation, production management, and agricultural services in leading regions but also the lag in policy diffusion to the central and western provinces, where infrastructural and contextual differences impede the realization of digitalization benefits. At a micro-level, provinces with greater policy participation and more robust information infrastructure displayed even more pronounced gains in high-quality agricultural development. Their comprehensive development capacity, efficiency of resource inputs, and environmental sustainability improved notably, accelerating the transition toward digitally driven agricultural models. Overall, since 2016, high-quality agricultural development in China has exhibited a dynamic, gradient evolution—from isolated centers to broader spatial expansion.
3.3.2. Temporal Analysis
Panel (a) of Figure 2 captures the dynamic kernel density distribution of national high-quality agricultural development from 2011–2022. Panels (b–d) represent the distributions for eastern, central, and western regions, respectively. By comparing these kernel densities, one can discern both the temporal evolution and spatial heterogeneity of regional agricultural development quality.
Panel (a) reveals that, nationwide, the distribution of high-quality agricultural development shifted rightward and became more concentrated from 2011 to 2022, with peaks rising especially after 2016. This denotes a marked acceleration in national progress, closely linked to the implementation of major policy initiatives, such as the rollout of agricultural big data practices after 2016, which propelled agricultural development to new levels. Regional subpanels highlight further heterogeneity:
Eastern region (b): The primary density peaks remained in higher-value intervals and shifted most significantly to the right, with distributions contracting around higher values. This suggests not only that the eastern region consistently leads in development but also that policy effects are most pronounced here.
Central region (c): Though with lower peak values than the east, the distribution shows a steady rightward shift, reflecting a strengthening developmental foundation and gradually released potential as a result of policy and resource optimization.
Western region (d): Peaks persist at relatively lower values, and the rightward movement is notably slower, constrained by geography, infrastructure, and information access. Nonetheless, select years demonstrate peak elevation, indicating internal differentiation and some progress.
3.4. Construction of the Empirical Model
3.4.1. Parallel Trend Test
The key identifying assumption of the difference-in-differences (DID) model is the parallel trends precondition—that is, prior to the policy intervention, treatment and control provinces experience similar temporal trajectories in high-quality agricultural development. If this is not satisfied, DID estimation is invalid. To examine this, model (5) is estimated as follows:
where $HADA_{it}$ is the level of high-quality agricultural development for province $i$ in year $t$; $Data_i$ identifies pilot provinces for the big data policy; $D_t$ is a year dummy (1 if province $i$ implemented the policy in year $t$, 0 otherwise); $X_{it}$ represents time-varying controls; $\mu_i$ and $\gamma_t$ are province and year fixed effects; and $\varepsilon_{it}$ is the error term. The coefficients $\alpha$, $\beta_i$, and $\delta$ are to be estimated. The study expects that, prior to policy implementation, $\beta_i$ is statistically insignificant, confirming parallelism; after the policy shock, a significantly positive $\beta_i$ would indicate policy effectiveness.
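For orientation, one form of the event-study specification that is consistent with the variable definitions above is sketched below. It is a reconstruction under assumptions, not the study's own equation: the coefficients are indexed here by calendar year $\tau$ rather than by $i$, the dummy $D_t^{\tau}$ equals 1 when $t = \tau$, and one reference year $\tau_0$ is omitted to avoid collinearity.

```latex
HADA_{it} = \alpha + \sum_{\tau \neq \tau_0} \beta_{\tau} \, \bigl( Data_i \times D_t^{\tau} \bigr)
          + \delta X_{it} + \mu_i + \gamma_t + \varepsilon_{it}
```

Under this form, the pre-policy coefficients $\beta_{\tau}$ should be statistically indistinguishable from zero if the parallel trends assumption holds.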
3.4.2. Difference-in-Differences Model
To identify the causal effect of agricultural big data policy on high-quality development, the following generalized DID model is constructed:
where $I_t^{post}$ denotes a post-policy period dummy, and all other variables remain as previously defined. The model controls for province and year effects and tests whether $\beta_i$ is significantly positive.
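A two-way fixed effects DID specification that matches the description above can be sketched as follows. This is a reconstruction under assumptions rather than the study's own equation; in particular, the policy coefficient is written as a single $\beta$, whereas the text labels it $\beta_i$.

```latex
HADA_{it} = \alpha + \beta \, \bigl( Data_i \times I_t^{post} \bigr)
          + \delta X_{it} + \mu_i + \gamma_t + \varepsilon_{it}
```

Here $\beta$ measures the average change in high-quality agricultural development in pilot provinces after the policy, relative to non-pilot provinces over the same period.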
3.5. Data Sources and Variable Selection
3.5.1. Data Sources
Due to data availability, this study utilizes standardized panel data for 31 provincial-level administrative regions in China (excluding Hong Kong, Macau, and Taiwan) from 2011 to 2022, with some variables log-transformed as appropriate. Data were primarily obtained from the China Statistical Yearbook and the China Science and Technology Statistical Yearbook, supplemented by provincial statistics and the Compendium of Statistics for Sixty Years of New China.
3.5.2. Variable Selection
The dependent variable is the level of high-quality agricultural development, while mechanism variables include five dimensions: innovation, coordination, green, open, and shared development. Detailed measurement methods are provided above. The core explanatory variable is the interaction term between agricultural big data policy implementation and the post-policy period (coded as 1 for pilot provinces post-2016, 0 otherwise). Control variables include the following:
R&D intensity: Ratio of R&D internal expenditure to regional GDP;
Industrial structure: Ratio of value added by the secondary industry to that of the tertiary industry;
Urban–rural income gap: Ratio of urban per capita disposable income to rural per capita disposable income;
Tax burden: Ratio of tax revenue to regional GDP;
Transportation infrastructure: Natural log of aggregate freight volume (descriptive statistics report unlogged figures);
Fiscal support: Ratio of general budgetary fiscal expenditure to regional GDP.
Specific details are shown in Table 2.
4. Empirical Analysis
4.1. Parallel Trends Plot Test
Figure 3 presents the event-study regression coefficients and their confidence intervals for the impact of agricultural and rural big data policies on high-quality agricultural development, with the policy implementation point indicated by a dashed line. Prior to the policy intervention (2012–2016), the estimated coefficients fluctuate narrowly around zero, and confidence intervals consistently cross the zero line—demonstrating no significant pre-treatment differences in trends between the treatment and control groups. Thus, the parallel trend assumption is well supported, providing a robust foundation for causal inference from the difference-in-differences (DID) approach.
After policy implementation (2017 onward), the estimated coefficients rise appreciably each year, with most confidence intervals excluding zero—revealing that the policy has had a persistent and statistically significant positive effect on high-quality agricultural development. The effect peaks in 2021, reflecting amplified impact as both policy measures and supporting infrastructure were scaled up. Overall, this analysis both validates the parallel trend assumption and elucidates the dynamic policy effect before and after implementation, empirically supporting the role of digital transformation in advancing high-quality, green agricultural development in China. As the rollout of digital infrastructure accelerated, big data’s contribution to optimizing agricultural systems and achieving high-quality growth has become increasingly pronounced.
4.2. Baseline Regression Results
Significance levels are reported as follows. A p-value below 0.1 means that, if the null hypothesis were true, the probability of obtaining a result at least as extreme as the one observed would be less than 10%; the null hypothesis is then rejected at the 10% significance level, which is generally regarded as a relatively lenient threshold. p < 0.05 is the more commonly used and stricter threshold, corresponding to rejection at the 5% level, and p < 0.01 is stricter still, corresponding to rejection at the 1% level. The same notation is used in the following table.
This study adopts a DID approach, using the agricultural and rural big data policy as the core explanatory variable. The regression results are summarized in Table 3. Columns (1) and (2) report regressions without control variables, employing different standard errors for robustness. Column (3) adds a set of controls (R&D intensity, industrial structure, urban–rural income gap, tax burden, transport infrastructure, and fiscal support), further improving model specification. Across all models, the policy coefficient on $Data_i \times I_t^{post}$ is positive and highly significant at the 1% level. Specifically, in the full model with controls, the coefficient is 0.0161, confirming the robust, significant effect of big data policy implementation on high-quality agricultural development. These findings validate the empirical hypothesis, providing evidence to support digitally empowered agricultural transformation.
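To make the estimation step concrete, the sketch below computes a two-way fixed effects DID coefficient on a small, noiseless, synthetic panel. It is an illustration under stated assumptions, not the study's estimation code: it handles a balanced panel only, omits control variables and standard errors, and the function names, province labels, and data are hypothetical. The value 0.0161 is borrowed from the text purely as the effect planted in the synthetic data.

```python
from collections import defaultdict


def double_demean(values, units, periods):
    """Two-way within transformation: remove unit and period means (balanced panel only)."""
    unit_sum = defaultdict(float)
    period_sum = defaultdict(float)
    for u, t, v in zip(units, periods, values):
        unit_sum[u] += v
        period_sum[t] += v
    n_units, n_periods = len(unit_sum), len(period_sum)
    grand_mean = sum(values) / len(values)
    return [
        v - unit_sum[u] / n_periods - period_sum[t] / n_units + grand_mean
        for u, t, v in zip(units, periods, values)
    ]


def twfe_did(units, periods, treat_post, outcome):
    """OLS slope of the outcome on the interaction term after removing both fixed effects."""
    d = double_demean(treat_post, units, periods)
    y = double_demean(outcome, units, periods)
    return sum(a * b for a, b in zip(d, y)) / sum(a * a for a in d)


# Hypothetical panel: 4 provinces, 2011-2022, two pilots, post period from 2017.
TRUE_EFFECT = 0.0161  # planted effect, for illustration only
pilot = {"A": 1, "B": 1, "C": 0, "D": 0}
province_effect = {"A": 0.30, "B": 0.25, "C": 0.20, "D": 0.15}

units, periods, treat_post, outcome = [], [], [], []
for prov in pilot:
    for year in range(2011, 2023):
        interaction = pilot[prov] * (1 if year >= 2017 else 0)
        units.append(prov)
        periods.append(year)
        treat_post.append(float(interaction))
        # Outcome = province effect + common year trend + planted policy effect.
        outcome.append(province_effect[prov] + 0.01 * (year - 2011) + TRUE_EFFECT * interaction)

beta_hat = twfe_did(units, periods, treat_post, outcome)
print(round(beta_hat, 4))
```

Because the synthetic outcome is exactly additive in province effects, year effects, and the interaction term, the estimator recovers the planted effect up to floating-point error. With real data, controls and clustered standard errors would normally be handled by a dedicated panel regression package.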
4.3. Robustness Checks
To further validate the robustness of the baseline regression results, this study conducts a placebo test by artificially advancing the policy implementation year to 2012 and 2013, constructing fictitious treatment variables ($Data_i \times I_t^{post2012}$ and $Data_i \times I_t^{post2013}$) and re-running the regressions. As shown in Table 4, the coefficients for these placebo variables are consistently close to zero (0.0026, 0.0026, 0.0012, 0.0011) and fail to attain statistical significance in all models, regardless of whether control variables are included. This consistency indicates that the policy effect does not exist in years prior to the actual policy implementation, effectively ruling out the possibility that other contemporaneous factors or random model specification are driving the observed policy impact on high-quality agricultural development. Thus, the policy effect identified in the baseline regression is not attributable to model design or sample selection bias.
To further test the robustness of the results, we apply winsorization to the key variables, capping extreme values in the top and bottom 5% and 1% of the distribution, as shown in Columns (1) and (2) of Table 5. Regardless of whether 95% or 99% winsorization is used, the estimated effect of the agricultural and rural big data policy remains positive and statistically significant (at the 5% and 1% levels, respectively), consistent with the baseline results and confirming the robustness of the findings to extreme value treatment. Specifically, the policy effect coefficient is 0.0046 (significant at the 5% level) under 95% winsorization and 0.0154 (significant at the 1% level) under 99% winsorization, underscoring the stability of the results.
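The winsorization step can be sketched as follows. This is a minimal illustration, assuming symmetric percentile cut-offs computed with the standard library's `statistics.quantiles` (inclusive method); the study does not state which percentile convention it uses, and the function name `winsorize` and the sample values are hypothetical.

```python
import statistics


def winsorize(values, pct):
    """Cap values below the pct-th percentile and above the (100 - pct)-th percentile.

    pct is an integer between 1 and 49. Unlike trimming, no observation is dropped.
    """
    cuts = statistics.quantiles(values, n=100, method="inclusive")  # 99 cut points
    lower, upper = cuts[pct - 1], cuts[99 - pct]
    return [min(max(v, lower), upper) for v in values]


# Hypothetical indicator with one extreme value.
raw = list(range(100)) + [10_000]
capped_5 = winsorize(raw, 5)   # corresponds to "95% winsorization" in the text
capped_1 = winsorize(raw, 1)   # corresponds to "99% winsorization" in the text
print(min(capped_5), max(capped_5), len(capped_5))
```

Capping rather than deleting observations keeps the panel balanced, which matters for fixed effects estimation on a province-year panel of fixed size.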
Moreover, as a robustness check for estimation methods, we employ cluster-robust standard errors at the provincial level, as reported in column (3) of Table 5. The core policy variable retains a coefficient of 0.0161, significant at the 10% level, and the sign and magnitude remain consistent with previous results. This further demonstrates the positive and robust effect of the agricultural and rural big data policy on high-quality agricultural development. In sum, the principal findings are unchanged under both winsorization and alternative estimation techniques, strengthening the reliability and credibility of the empirical results.
4.4. Heterogeneity Analysis
To assess regional heterogeneity, we perform subgroup regressions by region (East, Central, and West) and by grain production zones (major and non-major). The results of these subgroup analyses are presented in Table 6. As seen in Columns (1)–(3), the policy effect is significant for all regions but differs in magnitude. The largest effect is observed in the eastern region (coefficient = 0.0323, significant at the 1% level), highlighting the greater impact of big data policies where informatization and mechanization foundations are stronger. Central and western regions also show significant positive effects (coefficients = 0.0066 and 0.0051, respectively), though smaller, reflecting more limited infrastructure but positive policy impacts.
In terms of grain production zones, Columns (4) and (5) reveal the policy effect is greater and significant in non-major grain producing regions (coefficient = 0.0433, significant at the 1% level), whereas the effect is negative and insignificant in major grain regions (coefficient = −0.0029). This suggests that in areas with more advanced production and mechanization systems, marginal policy effects are limited; in less developed regions, the policy more effectively enhances technical services and supports high-quality growth.
To further investigate heterogeneity, we divide the sample based on the initial level of high-quality agricultural development. The results in Table 7 show that the policy effect is significant and positive in both low and high development regions (coefficients = 0.0056 and 0.0289, both significant at the 1% level), but insignificant in regions of medium development. This suggests that big data policies have the greatest effect where either baseline capacity is low and thus more can be gained, or where there is already sufficient infrastructure to maximize policy synergy. In contrast, policy impact in medium-level regions may be limited due to diminishing marginal returns or unresolved institutional bottlenecks.
4.5. Mechanism Analysis
The results presented in Table 8 indicate that the agricultural and rural big data policy has its most statistically significant effects on the “openness” and “sharing” dimensions. Specifically, the policy coefficient for the openness index is 0.0560, which is statistically significant at the 1% level. This suggests that big data policies facilitate the flow of resources and international cooperation in the agricultural sector, promoting the effective allocation of production factors. For the sharing index, the policy coefficient is −0.0203, also significant at the 1% level, but negative. This outcome implies that the policy may face structural constraints in promoting sharing mechanisms within agriculture, or that its positive effects in this domain have yet to fully materialize in the short term.
In terms of green development, the policy effect coefficient is −0.0156 and significant at the 5% level, also indicating a negative impact. This finding warrants attention, as it may reflect that while big data policies enhance production efficiency, insufficient supporting measures for green technologies and ecological protection in certain regions may impede their positive influence on green development, and may even lead to unintended adverse effects in the short run. In the dimension of innovation-driven development, the estimated policy effect coefficient is 0.0242, which is positive but not statistically significant. This suggests that agricultural and rural big data policies may have a positive influence on innovation, but the effect has yet to reach statistical significance. One possible explanation is the inherently long cycle of agricultural technological innovation; the entire process from data integration to the implementation of new technologies typically spans three to five years. However, within the observation period of this study (2011–2022), policy implementation only covered the most recent seven years, making it difficult for short-term effects to fully materialize. Additionally, the standard deviation of R&D investment intensity among pilot regions is only 0.012, indicating low data dispersion, which may further contribute to the lack of statistical significance. Regarding coordinated development, the policy effect coefficient is −0.0065, which is negative and not statistically significant. This outcome is closely related to the persistence of regional data barriers; although the policy promotes data openness and sharing, 12 out of 19 pilot regions have yet to establish cross-regional data sharing platforms, thus impeding the coordinated allocation of agricultural resources.
Furthermore, the standard deviation of the industrial coordination level indicator is 0.08, reflecting minimal overall fluctuation, which also limits the extent to which policy effects can be detected through changes in the data.
Overall, agricultural and rural big data policies have demonstrated a prominent role in advancing openness in the sector, but challenges remain in the areas of green development and sharing. The effects on innovation and coordination are not yet evident. Future policy design should focus on strengthening support for green development and sharing mechanisms, enhancing agricultural innovation, and optimizing regional coordination, in order to fully realize the comprehensive benefits of big data policies.
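The dimension-by-dimension estimates discussed above can be illustrated with a minimal sketch of a two-way fixed-effects DID regression, run once per dimension index. This is not the authors’ code: the panel is synthetic, the planted effects are illustrative values rather than estimates, control variables are omitted, and the names `twfe_did`, `simulate_panel`, and the 30-unit, 10-treated layout are hypothetical. For a balanced panel, removing unit and period means from both the outcome and the treatment-by-period interaction yields the same coefficient as including province and year fixed effects directly.

```python
import random


def twfe_did(y, treated, post):
    """Two-way fixed-effects DID coefficient for a balanced panel.

    y[i][t] : outcome of unit i in period t (e.g. one dimension index)
    treated : list of bools, True for pilot units
    post    : list of bools, True for policy periods
    """
    n, T = len(y), len(post)
    # DID interaction term: 1 only for pilot units in policy periods.
    d = [[float(treated[i] and post[t]) for t in range(T)] for i in range(n)]

    def demean(m):
        # Within transformation: subtract unit and period means, add back the grand mean.
        row = [sum(r) / T for r in m]
        col = [sum(m[i][t] for i in range(n)) / n for t in range(T)]
        grand = sum(row) / n
        return [[m[i][t] - row[i] - col[t] + grand for t in range(T)] for i in range(n)]

    yw, dw = demean(y), demean(d)
    num = sum(dw[i][t] * yw[i][t] for i in range(n) for t in range(T))
    den = sum(dw[i][t] ** 2 for i in range(n) for t in range(T))
    return num / den


def simulate_panel(effect, treated, post, noise_sd=0.01, seed=0):
    """Synthetic panel: unit effect + period effect + effect * (treated x post) + noise."""
    rng = random.Random(seed)
    unit = [rng.gauss(0.3, 0.05) for _ in treated]
    period = [0.01 * t for t in range(len(post))]
    return [
        [
            unit[i] + period[t] + effect * (treated[i] and post[t]) + rng.gauss(0, noise_sd)
            for t in range(len(post))
        ]
        for i in range(len(treated))
    ]


# Hypothetical setup: 30 units, years 2011-2022, policy from 2016, first 10 units treated.
years = list(range(2011, 2023))
post = [yr >= 2016 for yr in years]
treated = [i < 10 for i in range(30)]

# Planted effects for illustration only; these are not estimates from real data.
planted = {"openness": 0.05, "sharing": -0.02, "green": -0.015}
for name, beta in planted.items():
    est = twfe_did(simulate_panel(beta, treated, post, seed=1), treated, post)
    print(f"{name:10s} planted={beta:+.3f} estimated={est:+.4f}")
```

With noise switched off, the estimator recovers the planted effect exactly, which is a quick check that the within transformation is implemented correctly. Inference (standard errors clustered by province) and control variables would need to be added before such a sketch could be compared with the reported coefficients.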
5. Conclusions and Policy Implications
Drawing on provincial panel data from 2011 to 2022 and treating the pilot implementation of agricultural and rural big data policies as a quasi-natural experiment, this study systematically investigates both the mechanisms and the effects of these policies on high-quality agricultural development in China. Using difference-in-differences (DID) estimation together with parallel trend tests, placebo checks, winsorization, heterogeneity analysis, and transmission mechanism analysis, the study reaches the following principal conclusions:
First, the spatial distribution and temporal evolution of high-quality agricultural development in China exhibit marked regional gradients and staged leaps. From 2011 to 2022, the eastern region, the central core grain-producing areas, and economically advanced provinces consistently held leading positions, although the rankings of some provinces shifted across stages. The year 2016, when agricultural and rural big data policies were implemented, marked a watershed in the developmental trajectory: since then, national and regional development levels have improved substantially, and the peaks of the kernel density curves have shifted steadily rightward, reflecting a consolidation of the eastern region’s advantages. However, information infrastructure and institutional constraints in the central and western regions have delayed policy diffusion, and regional imbalances remain pronounced. This pattern indicates that digital tools first took hold in regions with favorable foundational conditions and accelerated optimization there, enhancing production, management, and market services. It also suggests that measured regional disparities may partly reflect the completeness of data acquisition: data omissions or inaccuracies caused by outdated equipment or limited awareness in remote areas may reduce the precision of policy effect evaluation and obscure the reasons for the lag in the central and western regions.
Second, the parallel trend test and dynamic effect analysis confirm the suitability of the difference-in-differences (DID) model, and both the baseline regression and the robustness checks consistently indicate that agricultural and rural big data policies significantly promote high-quality agricultural development. It is important to note, however, that big data policies do not operate in isolation: their effects are intertwined with those of subsidy, land, and other agricultural policies, and these interaction effects have not been fully disentangled. As a result, the study may not have entirely isolated the independent impact of big data policies, which limits the precise attribution of their net effects.
Third, heterogeneity analysis reveals differentiated policy effects across several dimensions. Regionally, the policy effect is most pronounced in the eastern region and comparatively weaker in the central and western regions. The marginal effect in major grain-producing areas is limited, with more substantial impacts observed in non-grain-producing regions. Regarding development levels, a “the higher, the better” pattern emerges, although the effect is not significant in moderately developed regions. These differences are related to disparities in information infrastructure and economic development, and they also expose a shortcoming of this study: diversity in policy implementation has not been adequately considered. Variations in government attention and implementation capacity may produce heterogeneous policy outcomes, so the general conclusions may not apply to all regions. In particular, the micro-level differences in execution that could explain policy ineffectiveness in moderately developed areas of the central and western regions have not been incorporated into the analysis.
Fourth, in terms of mechanisms, the policy mainly fosters agricultural resource flow and external cooperation through the openness dimension, while the coefficients for the sharing and green dimensions, although statistically significant, are negative. Effects on the innovation and coordination dimensions are not yet significant. This suggests that some channels are beginning to show effects, but it also highlights the stage-specific limitations of technology adoption: advanced technologies such as big data analytics and artificial intelligence remain in their early stages, and underdeveloped data security and privacy protection mechanisms limit the positive impact of sharing mechanisms. Consequently, the study may overestimate the actual effectiveness of these technologies and inadequately account for practical bottlenecks that constrain their contribution to green development. Furthermore, the acceptance, willingness to participate, and actual needs of farmers, who are the core policy implementers, have not been fully considered in the analysis; as a result, interpretations of the negative effect in the sharing dimension overlook heterogeneity in farmer behavior. The proposed policy optimization strategies may thus fail to fully mobilize farmers’ engagement, making the improvement of mechanisms to better reflect farmer realities a key challenge for future policy design.
Based on these findings, this paper proposes four key policy recommendations:
First, implement regionally differentiated policy strategies. Policymakers should tailor big data strategies to the varying resource endowments and development foundations of different regions. For the eastern region, the focus should be on integrating big data deeply into the agricultural value chain and establishing hubs of smart agricultural innovation. The central and western regions should prioritize the development of digital infrastructure and supporting systems to close the digital divide. In major grain-producing areas, policymakers should explore integrating big data with high-standard farmland construction and food security initiatives. In non-major grain regions, big data can be leveraged to optimize the layout of specialty agricultural industries and enhance overall efficiency.
Second, strengthen targeted policy support. Policies should be fine-tuned according to the varying levels of agricultural development. In low-development areas, increased investment in big data training and funding is necessary to overcome deficits in data acquisition and application capacity. In advanced regions, innovation in big data application models should be encouraged to establish benchmarks for agricultural digitalization. For regions with intermediate development, more detailed analysis is needed to identify implementation bottlenecks and optimize policy execution to overcome institutional barriers.
Third, improve supporting mechanisms for big data applications. Addressing weak links in the policy transmission chain requires particular focus on green and shared development mechanisms. In green development, efforts should center on integrating big data platforms with green production technologies, promoting precision input management systems, and enhancing environmental monitoring and early warning systems. In sharing, policy should foster open data services and equitable benefit-distribution systems, dismantling data barriers and resolving structural imbalances to ensure the sustained release of policy dividends.
Fourth, enhance policy coordination and innovation incentives. To maximize synergies, big data initiatives must be aligned with broader innovation and coordination policies. A dedicated innovation fund should be created to support key R&D and demonstration projects in agricultural big data, nurturing leading entities in agri-tech innovation. Mechanisms for inter-regional data sharing and coordinated development should be established to optimize resource allocation nationwide. Lastly, the policy effectiveness evaluation system should be refined to enable dynamic adjustment and continual optimization of the agricultural and rural big data policy framework.
Finally, the short-term negative effects observed in green and shared development indicate that the implementation of digital policies must balance efficiency improvement with equitable transformation, so that the sustainability of these policies is not undermined by uneven cost-sharing or imbalanced benefits. This calls for incorporating transitional buffer mechanisms into policy design and providing additional support for vulnerable groups and ecologically sensitive areas.
Future research can be extended in three main directions:
First, by extending the observation period to 8–10 years to track the long-term effects of these policies and to determine whether the negative impacts on green and shared development are reversed as the transformation process is completed.
Second, by incorporating micro-level survey data from farmers to analyze differences in policy perception among various scales of agricultural operations, thereby enhancing the micro-foundation of research conducted at the provincial level.
Third, by conducting policy package simulations to assess how varying degrees of subsidies and data openness affect the transmission mechanisms, thus providing more precise quantitative evidence for policy optimization.