Article

Unveiling the Effects of Digital Transformation on Agribusiness Green Innovation in China: An Explainable Machine Learning-Based Approach

by Wanqi Liang and Xin Feng *
College of Economics and Management, Nanjing Forestry University, Nanjing 210037, China
* Author to whom correspondence should be addressed.
Agriculture 2026, 16(12), 1288; https://doi.org/10.3390/agriculture16121288
Submission received: 16 April 2026 / Revised: 16 May 2026 / Accepted: 27 May 2026 / Published: 10 June 2026
(This article belongs to the Section Agricultural Economics, Policies and Rural Management)

Abstract

Digital transformation is a key driver of green innovation in agribusiness. While the positive impact of digital transformation on firm innovation has been well documented, its multidimensional nature and heterogeneous associations with agribusiness green innovation remain underexplored. This study deconstructs digital transformation into five business dimensions and two structural features, using a sample of 155 Chinese A-share listed agricultural companies from 2011 to 2021. Using an explainable machine learning framework that integrates Bayesian-optimized XGBoost and SHAP, we identify the individual and interaction predictive effects of each feature on green innovation, measured by green patent applications. The results provide correlational evidence that governance digitalization is the dominant predictive driver of agricultural green innovation, followed by institutional digitalization. Merely expanding the scope of digital transformation delivers no substantial improvements in green-patent-based innovation outputs. Different digital dimensions present notably heterogeneous nonlinear correlations with distinct threshold characteristics. We further find significant synergistic interaction linkages across digital dimensions, where coordinated multi-dimensional digital development is critical to fully unlocking the green innovation potential of digital transformation. These findings provide insights for agribusiness managers and policymakers to prioritize digital investment and facilitate low-carbon transition.

1. Introduction

In the context of the deep integration of the digital economy and the real economy, digital transformation has emerged as a critical pathway for traditional industries to achieve high-quality development [1,2,3,4,5]. As a major agricultural country, China is accelerating the modernization and green transformation of its agricultural sector. Agribusinesses, as the main drivers of agricultural technological innovation, play a crucial role in promoting the modernization of agriculture. However, compared with manufacturing and high-tech enterprises, the innovation of Chinese agribusinesses faces unique sector-specific constraints. Specifically, their production processes are non-standardized, and the industrial chains are long and fragmented. In addition, they hold insufficient collateral and have weak risk resilience. These inherent characteristics lead to low innovation efficiency and insufficient high-quality innovation output [6]. More significantly, the agricultural sector is facing mounting pressure to pursue green development. Therefore, green innovation has become an urgent imperative for Chinese agribusinesses to balance productivity growth with ecological sustainability.
Existing studies have confirmed that digital transformation can significantly promote firm innovation [7,8,9,10]. However, most of them focus on industrial enterprises, with limited attention paid to the unique characteristics of agricultural enterprises. Unlike industrial firms that implement digital transformation mainly within standardized production and closed supply chains, agribusinesses’ digitalization needs to cover the entire chain from “farm to table” [11], address the high uncertainty of natural production conditions [12,13], and coordinate with tens of thousands of dispersed smallholder farmers [14]. In addition, the existing literature measures digital transformation as a single-dimensional construct by aggregating the frequency of digital keywords in annual reports. This approach obscures the heterogeneous manifestations of digital transformation across different business links. In the context of agribusiness, digital transformation encompasses multiple facets, including production, circulation, traceability, governance, and institutional digitalization. The heterogeneous predictive impacts of these dimensions on green innovation remain unexamined.
Methodologically, most existing studies rely on traditional econometric models such as structural equation modeling or linear regression, which assume linear relationships between variables. However, the relationship between digital transformation and innovation may involve nonlinearities and complex feature interactions [15,16]. Different digital dimensions may have synergistic or threshold predictive effects on environmental innovation that linear models fail to capture. Explainable machine learning algorithms, such as XGBoost, offer advantages in capturing such complexities and identifying key predictive features [17,18,19]. Combining XGBoost with SHAP analysis enables a more nuanced understanding of the heterogeneous predictive effects of different digital dimensions. While recent studies have begun to apply machine learning methods to examine digital transformation, most still treat it as a single aggregate indicator and fail to unpack the heterogeneous predictive contributions of different digital dimensions. Moreover, few studies adopt explainable artificial intelligence techniques to provide actionable insights into the underlying mechanisms, which limits their practical value for both managers and policymakers.
Against this backdrop, this study addresses three core research questions (RQs) as follows.
RQ1. Which digital transformation dimension is the core predictive driver of agribusiness green innovation?
RQ2. How are different digital transformation dimensions heterogeneously associated with agribusiness green innovation?
RQ3. Is there a synergistic interaction between different digital transformation dimensions?
Drawing on the resource-based view and dynamic capabilities theory, these questions correspond to three testable propositions: first, digital resources embedded in different business links differ in their value, rarity, and inimitability, leading to a clear hierarchy of importance; second, the transformation of digital resources into innovation capabilities follows a nonlinear path bounded by absorptive capacity thresholds; and third, cross-dimensional digital resources generate complementarities that produce synergistic effects beyond the sum of their individual contributions.
To answer these questions, this study systematically analyzes the complex relationship between multi-dimensional digital transformation and agribusiness green innovation. Using panel data of Chinese A-share listed agricultural companies from 2011 to 2021, we deconstruct agribusiness digital transformation into five business dimensions and two structural features. We then employ the XGBoost algorithm with SHAP analysis to identify the most critical digital dimensions, uncover their nonlinear predictive patterns, and explore their interaction effects. This study moves beyond the traditional single-dimensional and linear analytical framework and provides a more nuanced understanding of how digital transformation is associated with green innovation in the agribusiness context.
This study makes three contributions to the existing literature. First, it refines the measurement system of agribusiness digital transformation by decomposing it into five distinct dimensions and two structural features. This is in contrast to prior research that typically measures digital transformation as a single aggregate indicator. This decomposition extends the resource-based view by distinguishing heterogeneous digital resources that underpin green innovation. Second, it introduces the XGBoost–SHAP explainable machine learning method to capture nonlinear relationships and threshold predictive effects, thereby expanding the application of dynamic capability theory in agricultural digitalization research. Unlike traditional linear models that impose restrictive assumptions, this approach identifies hidden nonlinear patterns and interactive effects that cannot be detected in conventional analyses. Third, by uncovering the nonlinear characteristics and synergistic interactions among digital dimensions and between digitalization and firm capabilities, this study provides actionable insights for agribusiness managers to prioritize core digital dimensions and for policymakers to design targeted interventions that account for threshold effects and complementarities. By doing this, this study advances both measurement and methodological approaches beyond the existing literature and provides new empirical evidence for institutional theory.
The remainder of this study is structured as follows. Section 2 provides a comprehensive review of the related literature. Section 3 builds a theoretical framework for analyzing the relationship between multi-dimensional digital transformation and agribusiness green innovation. Section 4 describes the data sources, variable definitions, and methodology. The model results and validation are presented in Section 5, and Section 6 discusses the findings and implications. Finally, Section 7 concludes the paper.

2. Literature Review

2.1. The Influencing Factors of Enterprise Green Innovation

Green innovation, defined as the development of environmentally friendly technologies, products, and processes, has emerged as a critical pathway for balancing economic growth with ecological sustainability [20,21]. In the context of global climate change and increasing environmental regulations, exploring factors associated with firms’ green innovation engagement has become a central research question. Existing studies have identified multiple determinants of green innovation, including environmental regulations [22,23,24], government subsidies and policy support [25,26,27], and technological capabilities [28,29,30]. However, the majority of these studies have focused on manufacturing or high-polluting industries, and few studies compare or evaluate the relative importance of these factors in agricultural settings.
Agribusinesses face unique challenges that distinguish their green innovation activities from those of manufacturing firms. Unlike the standardized, factory-based production model of the manufacturing sector, agricultural production is deeply tied to natural conditions, with highly non-standardized production processes and strong dependence on scenario-specific technological empowerment [31,32]. Meanwhile, the green innovation of agribusinesses runs through the entire “farm-to-table” chain, covering core links including production, circulation, quality traceability, internal governance, and external institutional support, rather than being limited to production and R&D links as in manufacturing enterprises [33,34]. Moreover, in the context of China’s “dual carbon” goals, agricultural enterprises are under mounting pressure to reduce carbon emissions, minimize resource waste, and develop environmentally sustainable production methods [35]. Green innovation in agribusiness has therefore become an urgent imperative.
Despite its importance, research on the determinants of green innovation in agribusiness remains fragmented. Existing studies have explored the effects of isolated factors such as climate policy uncertainty [36,37,38], digital finance [39], and symbiotic relationships [40] on agricultural enterprises’ green innovation. However, the role of digital transformation as a potential driver has only recently begun to attract scholarly attention [41,42]. The literature lacks a synthetic view of how these factors interact with digital transformation. More importantly, existing studies leave three key gaps unaddressed: over-reliance on one-dimensional measurement, neglect of nonlinear mechanisms, and the absence of agriculture-specific theoretical explanations. These shortcomings directly motivate the design of this study.

2.2. Impact of Digital Transformation on Enterprise Innovation

With the rapid development of the digital economy, digital transformation has emerged as a core factor correlated with corporate innovation performance [43]. Existing studies have explored the core mechanisms through which digital transformation promotes innovation from multiple perspectives. These include reducing information acquisition costs [44,45], facilitating knowledge spillovers and cross-organizational collaboration [16,46,47], optimizing production processes and resource allocation [48,49,50], and enhancing corporate risk-taking capacity [51]. Regarding innovation outcomes, scholars have moved beyond the focus on innovation quantity to examine the impact of digital transformation on innovation quality [9], green innovation [52,53,54,55,56,57,58], and ambidextrous innovation [59,60,61,62], which confirms the wide-ranging innovation-enhancing effects of digital transformation.
A critical limitation of existing research is the near-universal reliance on unidimensional measures of digital transformation. Most studies aggregate the frequency of digital-related keywords in annual reports to construct a single composite index [7,8,9,10,63]. Although this approach captures the overall level of corporate digitalization, it ignores the heterogeneous correlations between digitalization practices in different business segments and innovation activities. Recent studies have begun to address this limitation by decomposing digital transformation into multiple dimensions [64,65]. However, these decompositions are primarily designed for manufacturing or service contexts and fail to account for the unique operational characteristics of agribusinesses. Digital transformation in agribusinesses involves multiple unique links, including production, circulation, traceability, management, and institutional interaction [66,67,68], and the differentiated linkages between these dimensions and green innovation remain largely unexamined. While our study also adopts a keyword-based approach to measure digital transformation, we address this fundamental limitation by decomposing digital-related keywords into five business dimensions and two structural features tailored explicitly to the unique operational characteristics of agribusinesses. This multi-dimensional decomposition allows us to capture the heterogeneous effects of different digital resources rather than treating digitalization as an undifferentiated monolithic construct.
For agribusinesses, digital transformation is not a monolithic process but a systemic reconfiguration of the entire agricultural value chain [69]. Existing studies have either focused on single dimensions [70] or treated digital transformation as a black box [63], which fails to identify the most critical dimensions for green innovation. This gap is particularly significant given agricultural enterprises’ limited resources and need to prioritize digital investments with the highest sustainability returns.
Methodologically, existing empirical studies on the relationship between digital transformation and corporate green innovation rely primarily on traditional econometric models. Although these models are effective in testing linear causal relationships between core variables under strict theoretical assumptions, their pre-imposed linear assumptions make it difficult to capture the potential nonlinear relationships and complex interactive effects between digital transformation and green innovation [71,72]. Recent theoretical work further emphasizes that economic systems are inherently nonlinear and interconnected, reinforcing the need for methodological approaches that explicitly account for such complexities [73]. In recent years, machine learning methods such as XGBoost and random forest have been used to explore the complex nonlinear linkages between the digital economy, corporate digitalization, and green development [12,74,75]. Among them, the explainable machine learning framework combining XGBoost with SHAP (SHapley Additive exPlanations) analysis has unique advantages for this study. This framework not only retains XGBoost’s strengths in capturing nonlinear relationships between multi-dimensional digital transformation and green innovation but also addresses the “black box” problem of traditional machine learning models through SHAP value decomposition [76,77]. Nevertheless, the application of this method in research on agribusiness green innovation remains limited, forming a methodological gap in the existing literature.
In summary, existing studies have systematically identified the core determinants of corporate green innovation, widely verified the innovation-enabling effects of digital transformation, and conducted preliminary explorations into the driving paths of digital transformation for green innovation in the agricultural sector. However, extant research lacks a multi-dimensional deconstruction of digital transformation tailored to the unique operational characteristics of agribusinesses. Furthermore, prior empirical analyses cannot effectively capture the nonlinear relationships and complex interactive effects between multi-dimensional digital transformation and green innovation. The XGBoost–SHAP explainable machine learning framework has rarely been applied to research on agribusiness green innovation, and no systematic research framework has been established, which leaves a critical methodological gap in the literature. This paper aims to fill these research gaps.

3. Theoretical Foundation and Research Propositions

Building on the literature review and the identified research gaps, this study develops a theoretical framework integrating the resource-based view, dynamic capabilities theory, and resource complementarity theory, and proposes three testable research propositions to guide the subsequent empirical analysis.

3.1. Heterogeneity of Digital Resources and Hierarchical Predictive Importance

The resource-based view posits that firms’ sustained competitive advantage stems from heterogeneous resources that are valuable, rare, inimitable, and non-substitutable [78]. Digital transformation essentially involves the conversion of digital technologies into firm-specific heterogeneous resources [79]. Unlike traditional production factors, digital resources exhibit non-rivalry, replicability, and network externalities, which enable them to reshape resource allocation patterns across the agricultural value chain [80]. However, not all digital resources create equal value. Digital resources embedded in different business links differ significantly in their value, rarity, and inimitability [81]. For example, underlying digital infrastructure requires large upfront investment and long-term accumulation, making it highly inimitable and valuable. In contrast, superficial digital applications such as basic e-commerce platforms are easily replicable and thus create limited competitive advantage. This heterogeneity in digital resource value should lead to a clear hierarchy of importance in their predictive contributions to green innovation.
Proposition 1.
Digital transformation dimensions exhibit a clear hierarchical structure in their predictive contributions to agribusiness green innovation, with digital resources embedded in core business links showing greater predictive importance.

3.2. Nonlinear Threshold Characteristics of Digital Resource Transformation

Dynamic capabilities theory emphasizes that firms need to integrate, build, and reconfigure internal and external resources to adapt to changing environments and achieve sustainable innovation [82]. The transformation of digital resources into innovation capabilities is not a linear process but requires firms to develop sufficient absorptive capacity to effectively utilize digital technologies. For agribusinesses, digital transformation often involves high upfront investment in hardware, software, and talent, which may initially crowd out resources available for green R&D [8]. Only when firms accumulate a critical mass of digital capabilities and develop the absorptive capacity to integrate digital technologies into their innovation processes can digital resources begin to generate significant green innovation dividends [9]. This suggests that the relationship between digital transformation and green innovation should exhibit nonlinear threshold characteristics, where the marginal predictive contribution of digitalization increases significantly after crossing a certain threshold.
Proposition 2.
The predictive relationship between digital transformation dimensions and agribusiness green innovation exhibits nonlinear threshold characteristics, with the marginal predictive contribution increasing significantly after crossing a critical absorptive capacity threshold.

3.3. Complementarity of Digital Resources and Synergistic Predictive Effects

Resource complementarity theory argues that combinations of complementary resources can generate synergistic effects that exceed the sum of their individual contributions [83]. Digital transformation is a systemic process that involves multiple business links, and digital resources in different dimensions are often complementary rather than independent [65].
For example, governance digitalization provides the underlying technological infrastructure for production digitalization, while institutional digitalization alleviates the financing constraints that limit firms’ investment in both governance and production digitalization [84]. When multiple digital dimensions develop in tandem, they can reinforce each other’s effects, smooth the nonlinear fluctuations of individual dimensions, and lower the threshold required to generate positive innovation returns [2]. In contrast, isolated development of a single digital dimension may fail to unlock the full potential of digital transformation.
Proposition 3.
There exist significant synergistic interaction effects between different digital transformation dimensions, where the combined predictive contribution of complementary digital resources exceeds the sum of their individual contributions.

4. Data, Methodology and Research Design

4.1. Sample Selection and Data Source

This study selects Chinese A-share listed agricultural companies from 2011 to 2021 as the initial research sample. Because the sample period covers 2020–2021, years that were heavily affected by the COVID-19 pandemic, we conduct a robustness check by excluding observations from 2020–2021 to ensure the reliability of our findings, and the main results remain highly consistent. Corporate annual report textual data for measuring digital transformation are retrieved from the China Securities Information Network (CNINFO). Green patent data for measuring agribusiness green innovation are obtained from the Chinese Research Data Services Platform (CNRDS). Financial and corporate governance data are collected from the China Stock Market & Accounting Research Database (CSMAR) and Wind Database.

4.2. Variable Definitions

4.2.1. Explained Variable: Agribusiness Green Innovation

Drawing on the studies by [20,42], we use the number of green patent applications of enterprises to represent the level of agribusiness green innovation. Given the large number of zero values in the raw green patent application data, this variable is measured via logarithmic transformation of the annual number of enterprise green patent applications, i.e., ln(1 + number of patents), covering invention patents. It is worth noting that this measure primarily captures formal and codified technological innovation and may understate process-based, incremental, and operational green improvements that are common in agricultural practice.
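The transformation described above can be illustrated with a minimal sketch. The function name and the patent counts are hypothetical and chosen only for illustration; the point is that ln(1 + x) keeps firm-years with zero applications in the sample and maps them to exactly zero.

```python
import math


def green_innovation(patent_count: int) -> float:
    """Return ln(1 + patent_count); zero applications map to exactly 0.0."""
    if patent_count < 0:
        raise ValueError("patent counts cannot be negative")
    return math.log1p(patent_count)


# Hypothetical annual green patent application counts for three firm-years.
counts = [0, 1, 9]
values = [green_innovation(c) for c in counts]
```

Because the logarithm compresses large counts, a firm with nine applications scores ln(10), roughly 2.30, rather than nine times the score of a firm with one application.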

4.2.2. Multi-Dimensional Digital Transformation Variables

Referring to the research of [63,85], the level of firm-level digital transformation is measured by the occurrence frequency of keywords associated with digital transformation in the annual reports of Chinese listed agricultural enterprises. Nevertheless, a unidimensional indicator cannot capture the reality that digital transformation is not a monolithic phenomenon but rather a multifaceted process permeating various operational links of agribusiness. Therefore, this study decomposes digital transformation into five business dimensions and two structural features.
1. Production digitalization (PD)
PD refers to the application of digital technologies in agricultural production processes. Through real-time monitoring and data-driven decision-making, production digitalization optimizes the allocation of agricultural inputs, thereby freeing up internal resources for R&D activities [31,86]. Furthermore, the operational data generated by precision farming delivers actionable insights for the development of eco-friendly agricultural technologies, directly fueling agribusiness green innovation [87].
2. Circulation digitalization (CD)
CD captures the deployment of digital platforms and technologies across agricultural product distribution and marketing. A defining feature of agriculture is its long industrial chains and high circulation costs, which frequently drive information asymmetry between producers and consumers [88]. By enabling real-time information sharing and precision demand forecasting, circulation digitalization alleviates this asymmetry, allowing agribusinesses to accurately identify unmet green product demand [48,87]. This directly incentivizes targeted green innovation, as firms develop novel products and processes to match consumers’ evolving sustainability preferences [42].
3. Traceability digitalization (TD)
TD refers to the deployment of traceability technologies across the full farm-to-table cycle of agricultural products. A robust traceability system elevates product transparency, helps agribusinesses comply with increasingly stringent environmental regulations, and generates strong reputational incentives to adopt greener production practices [89]. The imperative to maintain full traceability records drives process innovations that reduce environmental footprints, while the granular data collected informs the development of novel, high-quality green products.
4. Governance digitalization (GD)
GD refers to the application of digital technologies that support agribusinesses’ internal management, R&D, and decision-making processes. As the core technological infrastructure for digital operation, GD directly strengthens innovation capacity by streamlining information processing, optimizing knowledge management, and boosting R&D efficiency [90]. It enables firms to identify green innovation opportunities, facilitate cross-team R&D collaboration, and accelerate the development of novel low-environmental-footprint technologies, ultimately fostering both incremental and radical green innovation [91].
5. Institutional digitalization (ID)
ID captures the digitalization of interactions between agribusinesses and external institutional environments. Agribusinesses face severe financial constraints due to the inherent seasonality of agricultural production and insufficient collateralizable assets, and institutional digitalization eases such constraints by reducing information asymmetry between agribusinesses, financial institutions and government authorities [92]. This expands access to external funding, allowing firms to increase investment in long-term, high-risk green innovation projects that would otherwise suffer from insufficient funding [93].
To ensure the accuracy and validity of the above dimensional measurement, we developed a standardized keyword dictionary through a rigorous expert review process. An initial keyword pool was developed based on authoritative literature and agricultural digitalization application scenarios. Three rounds of independent screening and revision were conducted by three experts in agricultural economics and digital management to ensure accuracy, relevance, and theoretical consistency. The complete keyword dictionary is presented in Table 1.
Beyond these business-specific dimensions that capture the content of digital transformation across operational links, the structural distribution of a firm’s digital investment and implementation also plays a non-negligible role in shaping its innovation outcomes. To fully characterize the multi-faceted nature of agribusiness digital transformation, we further construct the following two structural features to profile the overall pattern of firms’ digital transformation efforts.
6. Digital transformation scope (DS)
To capture the extensiveness of a firm’s digital transformation across different functional dimensions, we measure digital transformation scope using the number of digital dimensions in which a firm exhibits non-zero keyword frequency as follows.
$$DS_{it} = \sum_{d=1}^{5} \mathbf{1}\left(\mathrm{Score}_{idt} > 0\right),$$
where $\mathrm{Score}_{idt}$ is the normalized keyword frequency of dimension $d$ for firm $i$ in year $t$ and $\mathbf{1}(\cdot)$ is an indicator function that takes the value of 1 if the normalized score is greater than zero, and 0 otherwise. $DS_{it}$ ranges from 0 to 5, with higher values indicating that a firm engages in digital transformation across a broader range of business functions.
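As a minimal sketch of the scope measure, the snippet below counts the dimensions with a strictly positive normalized score for a single firm-year. The function name and the score values are hypothetical and are not taken from the study's data.

```python
def digital_scope(scores):
    """Count the dimensions whose normalized score is strictly positive."""
    return sum(1 for s in scores if s > 0)


# Hypothetical normalized scores for one firm-year, ordered PD, CD, TD, GD, ID.
firm_year = [0.12, 0.0, 0.0, 0.45, 0.08]
scope = digital_scope(firm_year)
```

Here the firm is active in production, governance, and institutional digitalization, so its scope is 3 regardless of how intensive each activity is.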
7. Digital transformation balance (DB)
To capture the structural distribution of digital transformation across the five dimensions, we measure digital transformation balance using the inverse of one plus the standard deviation across the five dimensions as follows:
$$DB_{it} = \frac{1}{1 + \sigma_{it}},$$
where $\sigma_{it}$ is the standard deviation of keyword frequencies across the five dimensions for firm $i$ in year $t$. This specification ensures that higher values indicate more balanced digital development.
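The balance measure can be sketched in the same way. The text does not state whether the population or the sample standard deviation is used, so the population version (`statistics.pstdev`) below is an assumption, as are the function name and the example scores.

```python
import statistics


def digital_balance(scores):
    """Return 1 / (1 + sigma), where sigma is the spread across dimensions.

    Assumption: sigma is the population standard deviation.
    """
    sigma = statistics.pstdev(scores)
    return 1.0 / (1.0 + sigma)


# Hypothetical firm-years: one evenly digitalized, one concentrated in PD.
even = [0.3, 0.3, 0.3, 0.3, 0.3]
skewed = [0.9, 0.0, 0.0, 0.0, 0.0]
balance_even = digital_balance(even)
balance_skewed = digital_balance(skewed)
```

A perfectly even profile has zero dispersion and therefore the maximum balance of 1, while the concentrated profile has a standard deviation of 0.36 and a balance of about 0.74.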

4.3. Data Preprocessing and Cleaning

We conduct full-text extraction and standardized preprocessing of annual reports in Python 3.14, with all procedures fully reproducible. The specific steps are as follows:
Step 1. Noise removal: eliminate page numbers, headers, footers, tables, special symbols, URLs, and redundant blank lines from the raw text;
Step 2. Chinese word segmentation: perform text segmentation using the Jieba library in Python;
Step 3. Stop-word removal: exclude general function words, conjunctions, pronouns, and other semantically empty terms;
Step 4. Text standardization: unify character encoding, consolidate synonymous expressions, and standardize technical terms;
Step 5. Keyword matching: count the frequency of digital-transformation-related keywords using exact whole-word matching;
Step 6. Normalization: apply the Min–Max normalization method to map the scores of all digital dimensions to the [0, 1] interval.
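Steps 5 and 6 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: it starts from text that has already been cleaned and segmented (Steps 1 to 4), the two-dimension dictionary and its English placeholder terms are hypothetical stand-ins for the dictionary in Table 1, and normalization is assumed to be applied per dimension across all firm-year observations.

```python
from collections import Counter

# Hypothetical two-dimension dictionary; each keyword is assumed to be a
# single token after segmentation.
KEYWORDS = {
    "PD": {"precision_farming", "smart_greenhouse"},
    "GD": {"ERP", "data_platform"},
}


def count_keywords(tokens, keywords=KEYWORDS):
    """Step 5: count exact whole-token matches for each dimension."""
    freq = Counter(tokens)
    return {dim: sum(freq[w] for w in words) for dim, words in keywords.items()}


def min_max(values):
    """Step 6: map raw values to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical token lists for three annual reports after Steps 1-4.
reports = [
    ["ERP", "precision_farming", "ERP", "harvest"],
    ["harvest", "data_platform"],
    ["harvest"],
]
raw = [count_keywords(r) for r in reports]
gd_scores = min_max([r["GD"] for r in raw])
```

Counting on whole tokens rather than substrings mirrors the exact whole-word matching in Step 5, so a keyword embedded inside a longer term is not counted.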
To ensure data reliability, we process the raw data as follows: (1) abnormal enterprise samples such as ST and *ST firms are excluded; (2) observations with substantial missing values for core variables are removed; and (3) continuous variables are winsorized at the 1% and 99% levels to mitigate the impact of outliers. Ultimately, a final valid sample comprising 155 firms and 1636 firm–year observations is obtained.
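Winsorization at the 1% and 99% levels can be sketched as below. The percentile convention (linear interpolation between order statistics) is an assumption, since the text does not specify one, and the data are synthetic.

```python
def percentile(sorted_vals, q):
    """Linearly interpolated percentile (q in [0, 100]) of a sorted list."""
    if not sorted_vals:
        raise ValueError("cannot take a percentile of an empty list")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_vals) - 1)
    frac = pos - lower
    return sorted_vals[lower] + frac * (sorted_vals[upper] - sorted_vals[lower])


def winsorize(values, lower_q=1.0, upper_q=99.0):
    """Clip values to the [lower_q, upper_q] percentile range, keeping order."""
    ordered = sorted(values)
    lo = percentile(ordered, lower_q)
    hi = percentile(ordered, upper_q)
    return [min(max(v, lo), hi) for v in values]


# Synthetic variable: 1..100 plus one extreme outlier.
data = list(range(1, 101)) + [10_000]
clipped = winsorize(data)
```

Unlike trimming, winsorizing keeps every observation and only replaces the extreme values with the percentile bounds, so the sample size is unchanged.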

4.4. Measurement Validity and Limitations

The dimensional division of digital transformation in this study follows the established valuation logic and scenario classification standards in authoritative agricultural digitalization studies, which ensures good discriminant validity. Meanwhile, we acknowledge the inherent limitations of the annual report keyword measurement method. While this approach effectively captures the strategic emphasis and information disclosure level of corporate digital transformation, it cannot fully reflect the actual implementation depth, investment scale, or operational efficiency of digitalization. This is a common limitation shared by all text-mining-based empirical research. To avoid potential bias caused by voluntary disclosure, we exclude firm–year observations with zero digital transformation keywords and further conduct a series of robustness checks in Section 5.6 to confirm the stability of our core conclusions.

4.5. Methodology

4.5.1. Explainable Machine Learning Method: XGBoost Algorithm

The eXtreme Gradient Boosting (XGBoost) algorithm proposed by [94] is a scalable machine learning method based on gradient-boosted decision trees. XGBoost builds an ensemble of weak learners (decision trees) sequentially, where each new tree corrects the errors of the previous ones by optimizing a regularized objective function. Compared with traditional econometric models, XGBoost can capture nonlinear relationships and complex feature interactions without imposing a pre-specified functional form. The objective function of XGBoost is defined as
$$\Phi = \sum_{i=1}^{n} L(y_i, \hat{y}_i) + \sum_{k=1}^{t} \Omega(f_k),$$
where $L$ denotes the loss function that measures the difference between the predicted value $\hat{y}_i$ and the true value $y_i$ and $\Omega(f_k)$ represents the regularization term for the $k$-th decision tree to avoid overfitting, and its calculation formula is
$$\Omega(f_k) = \gamma T + \frac{1}{2} \lambda \sum_{j=1}^{T} w_j^2.$$
In the formula, $T$ is the number of leaf nodes of the decision tree; $\gamma$ and $\lambda$ are regularization parameters; and $w_j$ is the weight of the $j$-th leaf node.
The Gain from splitting a decision tree node can be expressed as
$$Gain = \frac{G_L^2}{H_L + \lambda} + \frac{G_R^2}{H_R + \lambda} - \frac{(G_L + G_R)^2}{H_L + H_R + \lambda},$$
where $G_L$ and $H_L$ are the sums of the first and second derivatives of the loss function over the samples in the left child node after splitting, respectively, and $G_R$ and $H_R$ are the corresponding sums for the right child node. If the Gain is positive, the split is beneficial and is retained; otherwise, the split is abandoned.
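The Gain expression can be checked on a toy node. The sketch below assumes a squared-error loss with first derivative (prediction minus target) and second derivative 1 for every sample; the four observations and the value of the regularization parameter are invented for illustration.

```python
def split_gain(g_left, h_left, g_right, h_right, lam=1.0):
    """Gain of a split: child scores minus the score of the unsplit parent."""
    def score(g, h):
        return g * g / (h + lam)
    parent = score(g_left + g_right, h_left + h_right)
    return score(g_left, h_left) + score(g_right, h_right) - parent

# Toy node: four samples, all currently predicted as 1.0.
targets = [0.0, 0.0, 2.0, 2.0]
preds = [1.0, 1.0, 1.0, 1.0]
grad = [p - t for p, t in zip(preds, targets)]  # first derivatives (assumed loss)
hess = [1.0] * len(targets)                     # second derivatives (assumed loss)

# Split A separates the low targets from the high targets.
gain_a = split_gain(sum(grad[:2]), sum(hess[:2]), sum(grad[2:]), sum(hess[2:]))
# Split B mixes one low and one high target on each side.
gain_b = split_gain(grad[0] + grad[2], 2.0, grad[1] + grad[3], 2.0)
print(gain_a, gain_b)
```

Split A groups samples whose gradients share a sign, so it earns a positive Gain, whereas split B leaves the gradients cancelling on both sides and earns none.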

4.5.2. Input Features and Hyperparameter Optimization

The input features for the XGBoost model consist of the five business-specific digitalization dimension variables and two structural feature variables, yielding a total of 7 features. We divide the sample into a training set (2011–2018, 1190 observations) and a test set (2019–2021, 446 observations) to evaluate model performance.
XGBoost models require extensive parameter tuning, which can involve testing tens of thousands of parameter combinations. Identifying the optimal configuration within such a vast search space is known as hyperparameter optimization, with mainstream methods including grid search and Bayesian optimization. Bayesian optimization is a global optimization method for black-box functions that uses the results of previously evaluated parameter settings (prior knowledge) to iteratively search for optimal hyperparameters [95]. Compared with grid search, it typically approaches good parameters in fewer iterations, offering higher computational efficiency, and is less likely to become trapped in local optima. As a result, a growing number of studies use Bayesian optimization for hyperparameter tuning.
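The size of the search space can be made concrete with a short calculation. The grid below is hypothetical (six candidate values for each of six XGBoost hyperparameters); the 200 iterations and 5 folds are the settings reported for the Bayesian search in this study.

```python
from math import prod

# Hypothetical grid: six candidate values per hyperparameter.
grid = {
    "max_depth": [3, 4, 5, 6, 7, 8],
    "learning_rate": [0.01, 0.03, 0.05, 0.1, 0.2, 0.3],
    "n_estimators": [100, 200, 300, 400, 500, 600],
    "subsample": [0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
    "gamma": [0, 0.1, 0.2, 0.5, 1, 2],
    "min_child_weight": [1, 2, 3, 5, 7, 10],
}

n_combinations = prod(len(values) for values in grid.values())
cv_folds = 5
bo_iterations = 200

grid_fits = n_combinations * cv_folds  # model fits for an exhaustive grid search
bo_fits = bo_iterations * cv_folds     # model fits for the Bayesian search budget
print(n_combinations, grid_fits, bo_fits)
```

Under these assumptions, an exhaustive grid needs 46,656 configurations, while the Bayesian search evaluates 200.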

4.5.3. SHAP for Model Interpretation

To interpret the XGBoost model and uncover the complex relationships between digital transformation dimensions and green innovation, we employ the SHapley Additive exPlanations (SHAP) method proposed by [76]. It provides a unified framework that decomposes the prediction of each instance into additive feature contributions based on Shapley values from cooperative game theory. The core advantage of the SHAP model lies in its ability to quantify feature importance while also revealing the direction, nonlinear patterns, and interaction effects of feature impacts, thus addressing the “black box” problem of many machine learning models. It is important to stress that SHAP values measure each feature’s contribution to the model’s prediction; they do not imply causal relationships. The analysis that follows identifies predictive associations, marginal effects, and nonlinear shapes among variables, rather than causal effects. The basic SHAP formula is
$$h(z) = \varphi_0 + \sum_{i=1}^{N} \varphi_i z_i,$$
where $h(z)$ is the explanation model; $\varphi_0$ is the base value, i.e., the average predicted value over all samples; $\varphi_i$ is the SHAP value of the $i$-th feature, representing the marginal contribution of this feature to the prediction result; $z_i$ indicates whether the $i$-th feature is present in the simplified input (in the original SHAP formulation, $z_i \in \{0, 1\}$); and $N$ is the total number of features.
For the interaction effects between features, the SHAP model expands the formula by introducing interaction terms as
$$h(z) = \varphi_0 + \sum_{i=1}^{N} \varphi_i z_i + \sum_{i=1}^{N} \sum_{j=i+1}^{N} \varphi_{ij} z_i z_j,$$
where $\varphi_{ij}$ is the SHAP interaction value between the $i$-th feature and the $j$-th feature, reflecting the additional contribution of their combined effect to the prediction result.
The calculation of the SHAP value for a single feature follows the principle of fair distribution of the Shapley value, with the specific formula
$$\varphi_i(f, x) = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|!\,(|F| - |S| - 1)!}{|F|!} \left[ f(x_{S \cup \{i\}}) - f(x_S) \right],$$
where $F$ is the set of all input features; $S$ is any subset that does not include the $i$-th feature; $|S|$ is the number of features in subset $S$; $f(x_{S \cup \{i\}})$ and $f(x_S)$ respectively represent the predicted values of the model when the $i$-th feature is included and excluded; and $\frac{|S|!\,(|F| - |S| - 1)!}{|F|!}$ is the weight coefficient of the feature subset, ensuring that the contribution of each feature is calculated fairly.
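The Shapley formula can be evaluated exactly for a small model by enumerating all subsets. The sketch below is a brute-force illustration rather than the tree-based algorithm used by SHAP libraries. It assumes that an "excluded" feature is replaced by a fixed baseline value, which is a simplification, and the three-feature toy model is hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x; absent features take baseline values."""
    n = len(x)

    def value(subset):
        z = [x[j] if j in subset else baseline[j] for j in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight |S|! (|F| - |S| - 1)! / |F|! for subsets of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_model(z):
    """Hypothetical model: one additive term plus one pairwise interaction."""
    return 2.0 * z[0] + z[1] * z[2]

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi, sum(phi), toy_model(x) - toy_model(baseline))
```

In this toy case the additive feature receives its full effect, the interaction term is shared equally between the two interacting features, and the values sum to the difference between the prediction and the baseline prediction.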
In this study, SHAP values serve three purposes: (1) feature importance ranking based on the mean absolute SHAP values across all samples; (2) nonlinear effect plots (SHAP dependence plots) that show how the marginal contribution of a feature varies with its value; and (3) interaction effect plots (SHAP interaction values) that reveal how the contribution of one feature depends on the value of another feature. These visualizations allow us to identify whether digital dimensions exhibit diminishing returns, threshold effects, or synergistic interactions, all within a predictive rather than causal framework.
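Purpose (1) amounts to averaging absolute SHAP values per feature and sorting. A minimal sketch follows; the feature names and the SHAP matrix are invented for illustration and are not the study's results.

```python
def mean_abs_shap_ranking(shap_matrix, feature_names):
    """Rank features by mean |SHAP| across observations (rows)."""
    n_rows = len(shap_matrix)
    scores = {
        name: sum(abs(row[j]) for row in shap_matrix) / n_rows
        for j, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Invented SHAP values: rows are observations, columns are features.
features = ["feature_a", "feature_b", "feature_c"]
toy_shap = [
    [-0.40, 0.10, 0.01],
    [0.60, 0.20, -0.01],
    [0.50, -0.30, 0.00],
]
ranking = mean_abs_shap_ranking(toy_shap, features)
print(ranking)
```

Taking absolute values before averaging matters because positive and negative contributions would otherwise cancel, understating the importance of a feature whose effect changes sign across observations.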

4.5.4. Implementation Details

We implemented all analyses in Python using several well-established libraries. The XGBoost regressor was built with the xgboost package, with the objective function set to reg:squarederror. The evaluation metrics are RMSE, MAE, and $R^2$, and the input features include PD, CD, TD, GD, ID, DS, and DB. The dependent variable is agribusiness green innovation, measured as the natural logarithm of one plus the number of green invention patent applications. Model evaluation and data splitting relied on scikit-learn, while Bayesian hyperparameter optimization was conducted with the bayesian-optimization package. LightGBM was used for robustness checks.
Bayesian optimization was adopted to tune XGBoost hyperparameters and avoid overfitting, with the optimal hyperparameters and their search ranges presented in Table 2. With the optimal hyperparameters fixed, the final model was trained on the complete training set and interpreted using the SHAP framework. SHAP values were computed to generate feature importance rankings, nonlinear dependence plots and interaction effect plots, revealing the heterogeneous marginal effects and synergistic interactions among digital transformation dimensions.

5. Results

5.1. Bayesian Hyperparameter Optimization

This study employed Bayesian optimization to identify the optimal XGBoost hyperparameter configuration, balancing model capacity and overfitting risk given the moderate size of our dataset. To strictly prevent data leakage, cross-validation was performed exclusively within the training set (2011–2018, 1190 firm–year observations), while the test set (2019–2021, 446 firm–year observations) was held out entirely until the final model evaluation. The number of Bayesian optimization iterations was set to 200, with each candidate configuration evaluated by 5-fold cross-validation. The process of adjusting the six hyperparameters using Bayesian optimization is depicted in Figure 1. Table 2 summarizes the search space for all seven hyperparameters and the final optimal values.
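The leakage-avoiding design described above can be sketched as follows: split by year first, then build the cross-validation folds from training indices only. The toy panel and the interleaved fold assignment are assumptions; the source does not state how observations were allocated to folds.

```python
def chronological_split(years, last_train_year=2018):
    """Return (train, test) index lists split by observation year."""
    train_idx = [i for i, y in enumerate(years) if y <= last_train_year]
    test_idx = [i for i, y in enumerate(years) if y > last_train_year]
    return train_idx, test_idx

def k_fold(indices, k=5):
    """Yield (fit, validate) index lists; folds are interleaved (assumed)."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, folds[i]

# Toy panel: two observations per year, 2011-2021.
years = [year for year in range(2011, 2022) for _ in range(2)]
train_idx, test_idx = chronological_split(years)
splits = list(k_fold(train_idx, k=5))
print(len(train_idx), len(test_idx), [len(val) for _, val in splits])
```

Because the folds are drawn from the training indices alone, no 2019–2021 observation can influence the choice of hyperparameters.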
Figure 1 presents the performance of key XGBoost hyperparameters under 5-fold cross-validation, including max_depth, learning_rate, n_estimators, subsample, gamma, and min_child_weight. Each subplot maps the relationship between one hyperparameter and the out-of-sample $R^2$ score obtained during the search. The red and yellow points identify the configurations that achieved the best cross-validation performance. For several parameters, the $R^2$ rises sharply when moving from less favorable regions toward the optimum and then remains relatively flat or declines slightly, a pattern consistent with a well-converged search. The relatively stable exploration near the optimum, particularly for max_depth, learning_rate, and n_estimators, suggests that the Bayesian optimizer successfully identified a region of strong and consistent performance rather than fitting noise.

5.2. Model Validation and Comparison

We adopt root mean squared error (RMSE), mean absolute error (MAE), and R-squared ($R^2$) to evaluate the performance of the XGBoost model. RMSE and MAE quantify prediction accuracy, with lower values indicating smaller deviations between predicted and observed values, while $R^2$ measures the model’s explanatory power for agribusiness green innovation, with higher values indicating stronger explanatory ability. The dataset was chronologically split into a training set (2011–2018, 1190 observations) and a test set (2019–2021, 446 observations). To verify the reliability and relative performance of the Bayesian-optimized XGBoost model, we compare it with representative mainstream machine learning models, including decision tree (DT), random forest (RF), Elastic Net, and Light Gradient-Boosting Machine (LightGBM), together with their grid-search-optimized variants. The detailed comparison results are summarized in Table 3.
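For reference, the three metrics can be written out directly. The sketch below uses the standard definitions, with $R^2$ computed as one minus the ratio of the residual sum of squares to the total sum of squares around the mean of the observed values; the toy vectors are invented.

```python
from math import sqrt

def rmse(y_true, y_pred):
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

y_true = [0.0, 1.0, 2.0, 3.0]
close_pred = [0.1, 0.9, 2.2, 2.8]      # small errors
reversed_pred = [3.0, 2.0, 1.0, 0.0]   # systematically wrong

print(rmse(y_true, close_pred), mae(y_true, close_pred), r2(y_true, close_pred))
print(r2(y_true, reversed_pred))
```

The second case shows that $R^2$ on held-out data is not bounded below by zero: it turns negative whenever the predictions are worse than a constant equal to the mean of the observed values.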
Table 3 presents the performance comparison results of all evaluated models. The XGBoost model tuned by Bayesian optimization (XGBoost+BO) achieves the best overall performance across all three metrics, with the lowest RMSE (1.2657), the lowest MAE (0.6521), and the highest $R^2$ (0.6562). For readers less familiar with machine learning, lower RMSE and MAE values indicate smaller deviations between predicted and observed values and hence higher prediction accuracy, while a higher $R^2$ indicates stronger explanatory power for agribusiness green innovation. The linear Elastic Net model yields the lowest $R^2$ among all baseline models (0.2762), which is consistent with strong nonlinear relationships between digital transformation and agribusiness green innovation that traditional linear models cannot fully capture. Meanwhile, the grid-search-optimized variants show limited improvement over their default versions and even underperform them in some cases (e.g., decision tree+grid). This suggests that Bayesian optimization is more effective in navigating the hyperparameter space and avoiding local optima, thereby improving model performance and generalization ability. The superior overall performance and robustness of XGBoost+BO support its use as the core model for the subsequent SHAP-value-based interpretation and heterogeneity analysis.

5.3. Predictive Drivers of Digital Transformation to Agribusiness Green Innovation

This section tests Proposition 1 by examining the relative importance of different digital transformation dimensions. We use the term “predictive driver” to refer to features that make the largest contribution to the model’s predictive performance for agribusiness green innovation. This term does not imply causal relationships but rather reflects the relative importance of each feature in the XGBoost–SHAP framework.
To assess the relative importance of multi-dimensional digital transformation features, we employ SHAP values, which not only quantify feature importance but also reveal the direction and heterogeneity of their predictive contributions. Figure 2 depicts the predictive importance of the multi-dimensional digital transformation features for agribusiness green innovation. Specifically, Figure 2a presents the XGBoost built-in feature importance ranking, where longer bars indicate a greater contribution to model prediction. Figure 2b reports the mean absolute SHAP value ranking, quantifying each feature’s average marginal contribution to predicted green innovation. Figure 2c presents the SHAP honeycomb (beeswarm) diagram, where each dot corresponds to a firm–year observation of Chinese A-share listed agribusinesses, mapping the full distribution of SHAP values across the sample. The horizontal axis shows SHAP values, with positive and negative values indicating positive and negative predictive contributions to green innovation, respectively, while the color gradient indicates the magnitude of each feature’s value.
As illustrated in Figure 2a,b, GD ranks first in both XGBoost feature importance and SHAP total effect, establishing it as the dominant predictive driver. ID consistently ranks second in both importance rankings, emerging as the second most critical feature for predicting agribusiness green innovation. This indicates that the external digital institutional environment, including digital credit, digital agricultural insurance and intelligent subsidy application systems, is a key predictor of agricultural enterprises’ green innovation activities, second only to the enterprise’s internal underlying digital infrastructure.
The remaining features show minor discrepancies in their rankings across the two metrics because the two importance measures emphasize different things. The XGBoost Gain metric reflects each feature’s contribution to the model’s overall explanatory power, while the mean absolute SHAP value offers a more direct measure of each feature’s marginal contribution to the predicted green innovation output of agricultural enterprises. Specifically, TD contributes more to the model’s overall explanatory power. PD, by contrast, delivers a more stable and widespread positive effect on predicted green innovation across the full sample of firms. Both CD and DB register positive but more modest contributions in both rankings. This indicates that they play a supportive, secondary role in predicting green innovation relative to core digital infrastructure and production-focused digitalization. Notably, DS ranks lowest in both rankings, with a near-zero mean absolute SHAP value. This suggests that simply expanding the number of digital dimensions has negligible predictive relevance for agricultural enterprises’ green innovation. Meaningful predictive associations only materialize when firms deepen digital implementation in their core business links.
The SHAP honeycomb diagram in Figure 2c further reveals the relationship between the value of each digital transformation dimension and the direction of its SHAP contribution to agribusiness green innovation. The scatter distribution of GD and ID is wider than that of other dimensions, indicating that these two dimensions exhibit a greater range of predictive influence on agribusiness green innovation. In contrast, DS shows a near-zero SHAP total effect, suggesting that simply increasing the number of digital dimensions without depth yields negligible predictive relevance.

5.4. Nonlinear Relationship Between Digital Transformation and Agribusiness Green Innovation

This section tests Proposition 2 by analyzing the nonlinear threshold characteristics of the predictive relationship between digital transformation and green innovation. Figure 3 illustrates the predictive associations between various dimensions of digital transformation and agribusiness green innovation. The figure presents seven core digital transformation features, with each scatter point representing a sampled agricultural enterprise in our dataset. The red dashed line in each subplot depicts the marginal contribution of the corresponding digital transformation feature to the predicted agribusiness green innovation output measured by SHAP values, which captures the nonlinear changing pattern of the feature’s predictive association as its value increases.
Among these features, CD and DB show positive contributions that strengthen steadily as the feature value increases. PD exhibits a mild U-shaped pattern in its SHAP contribution. Its marginal contribution turns negative at lower values (Ln(PD) roughly below 0.2), then gradually shifts upward and becomes weakly positive beyond that approximate region. TD presents an L-shaped pattern in its SHAP contribution, with a strong positive contribution at low TD levels, followed by a sharp decline in its marginal contribution as TD increases, and the contribution stabilizes at a low level with slight fluctuations after reaching the trough at Ln(TD) = 0.2. GD features a significant U-shaped pattern in its SHAP contribution, with negative contributions at low-to-medium levels that give way to strongly positive ones once the feature moves into the upper value range (Ln(GD) approximately above 0.25). ID demonstrates an exponentially increasing positive SHAP contribution, with a negligible contribution when Ln(ID) < 0.3, whereas its marginal contribution surges dramatically when the feature value exceeds this threshold. In contrast, DS shows no significant variation in marginal contribution across its full value range, indicating no meaningful linear or non-linear predictive association with agricultural enterprises’ green innovation. These approximate turning points should be read as empirical regularities observed in our sample rather than as exact universal thresholds.

5.5. Interaction Effects Between Features of Digital Transformation

This section tests Proposition 3 by exploring the synergistic interaction effects between different digital transformation dimensions. Figure 4 shows the SHAP interaction effects of digital transformation features on predicted agribusiness green innovation. In each subplot, the red dashed curve represents the marginal contribution of a single feature to green innovation, while the green curve represents the interactive contribution after incorporating another feature. Observations with high values of the moderating feature are marked in red.
Figure 4a presents the interaction effect between GD and ID. The individual contribution of GD presents a significant U-shaped pattern. When ID enters the interaction, the green curve flattens substantially, staying close to zero across the GD range. This empirical pattern indicates that higher institutional digitalization correlates with the mitigation of GD’s early-stage negative predictive tendency while retaining its long-term positive relevance. High-ID firms (red points) cluster in the high-SHAP region once GD reaches elevated levels, indicating that the model predicts a stronger positive contribution of governance digitalization to green innovation when institutional digitalization is well-developed. Consistent with this predictive pattern, agricultural enterprises with early investment in data-driven decision protocols and digital compliance systems tend to show weaker negative SHAP contributions from GD at low levels and a steadier upward predictive contribution once governance digitization matures.
Figure 4b presents the interaction effect between GD and PD. The individual U-shaped contribution of GD is significantly weakened after interacting with PD, as reflected by the flat green curve near zero. Samples with high PD levels (marked in red) show higher SHAP values in the high GD interval, indicating that higher PD is associated with a pattern where the negative SHAP contribution of GD at low levels is weaker and the positive contribution at high GD levels is stronger. From a predictive perspective, agricultural enterprises with sensor-rich and data-ready production tend to show a smoother GD contribution trajectory: the early downward part of the U-curve is mitigated, and the late upward part is reinforced in the model’s prediction.
Figure 4c presents the interaction effect between ID and PD. The individual contribution of ID shows an exponentially increasing positive trend. After interacting with PD, the green curve is flat and close to zero, indicating that the interaction with PD substantially smooths the exponential growth pattern of ID’s individual contribution. Meanwhile, samples with high PD levels (marked in red) are concentrated in the high SHAP value region in the high ID interval, meaning that high-level production digitalization is associated with a significantly amplified positive contribution of institutional digitalization on green innovation, especially when ID breaks through the critical threshold. This predictive pattern suggests that production digitalization stabilizes ID’s contribution by channeling institutional innovations into green innovation through a feedback loop.
Figure 4d presents the interaction effect between GD and TD. After the interaction with TD, the green curve shows an almost flat trend near the zero line, indicating that the interaction between GD and TD significantly smooths the U-shaped nonlinear fluctuation of GD’s individual contribution. Meanwhile, samples with high TD levels (marked in red) are concentrated in the high SHAP value region when GD is at a high level, which means that the improvement of TD is associated with a stronger positive contribution of GD on green innovation after GD breaks through the critical threshold and a weaker negative contribution of GD at low levels. This predictive pattern suggests that, in the model, higher traceability digitalization tends to coincide with an earlier and stronger upward shift in GD’s SHAP contribution.
Figure 4e presents the interaction effect between GD and DB. After interacting with DB, the green curve is flat and close to the zero line, indicating that the interaction with DB significantly weakens the U-shaped nonlinear contribution pattern of GD’s individual contribution. Meanwhile, samples with high DB levels (marked in red) show higher SHAP values in the high GD interval, indicating that higher digital transformation balance is associated with a pattern where the early negative SHAP contribution of GD is attenuated and the later positive contribution is amplified. Specifically, when GD exceeds the critical threshold, higher DB tends to correspond with a further amplification of the positive contribution of GD on green innovation in the model’s prediction. This predictive pattern suggests that balanced multi-dimensional digitalization helps governance digitalization enter its positive contribution range more smoothly by providing complementary support, rather than operating in isolation.
Figure 4f presents the interaction effect between DB and ID. After interacting with ID, the green curve becomes largely flat and close to the zero line, indicating that the interaction with ID substantially flattens the nonlinear pattern of DB’s individual contribution. Meanwhile, samples with high ID levels (marked in red) are concentrated in the high SHAP value region in the medium and high DB interval, meaning that high-level institutional digitalization is associated with a significantly amplified positive contribution of digital transformation balance to green innovation, especially when DB breaks through the critical threshold.
Overall, the interaction effects between digital transformation features show significant synergistic patterns. Interactions between different digital transformation dimensions can smooth the nonlinear fluctuation of individual features’ contributions, and high levels of the moderating features are associated with significantly amplified positive contributions of core features after they break through the critical threshold. This indicates that the coordinated development of multi-dimensional digital transformation is the key to fully releasing the predictive gains of digitalization for agricultural enterprises’ green innovation.

5.6. Robustness Checks

To ensure the reliability of our findings, we conduct three robustness tests. We first replace the original green innovation measure with the logged sum of both green patent applications and grants, namely ln(1 + applications + grants). The model yields a test-set $R^2$ of 0.3670, with RMSE and MAE of 0.3898 and 0.2143, respectively. Reassuringly, the SHAP-based feature importance ranking remains highly consistent with the baseline specification. This suggests that the direction, relative importance, and nonlinear patterns of the digitalization dimensions’ associations with agribusiness green innovation are robust to the choice of green innovation proxy.
The pandemic period introduced substantial operational disruptions. To rule out confounding effects, we drop observations from 2020 and 2021 and re-estimate the model using only the 2011–2019 subsample. The overall fit declines substantially: the resulting test-set $R^2$ is 0.0496, with RMSE and MAE of 0.9461 and 0.5995, respectively. This decline is expected given the reduced sample size and the associated loss of variation in both the digitalization and green innovation measures; it does not by itself indicate a fundamental change in the underlying relationships. The SHAP importance ranking stays broadly stable, which indicates that our core inferences are not driven by the pandemic period.
Finally, we check whether the results depend on the choice of the base learner by switching from XGBoost to LightGBM. This alternative model returns a test-set $R^2$ of −0.0607, with RMSE and MAE of 1.0781 and 0.6999, respectively. A negative $R^2$ means that the model predicts the test set no better than a constant equal to the test-set mean. While the predictive performance is therefore considerably weaker, the SHAP feature importance pattern remains highly consistent with the baseline. This consistency across different algorithms lends further support to the stability of our findings.
In conclusion, these checks indicate that our main SHAP-based findings are not sensitive to measurement choices, sample periods, or model specifications, even though predictive accuracy varies considerably across them. The core story (which features matter most and how they relate nonlinearly to green innovation) holds up across all three tests.

6. Discussion

6.1. Representative Features of Digital Transformation for Agribusiness Green Innovation

To answer the first research question (RQ1) and empirically examine Proposition 1, this study employs the XGBoost algorithm with SHAP analysis to unravel the heterogeneous associations of multi-dimensional digital transformation with agribusiness green innovation. It finds that DS has no significant linear or nonlinear association with agricultural enterprises’ green innovation across its full value range. This result indicates that simply expanding the coverage of digital transformation across business links without deepening the implementation of digitalization in specific dimensions is not associated with substantive green innovation performance improvements in agricultural enterprises. Theoretically, this finding qualifies the conventional undifferentiated understanding of digital transformation. It suggests that digital resources differ fundamentally in their association with value creation. Only those with distinct value and limited replicability are associated with sustainable competitive advantages. Superficial expansion of digital coverage cannot achieve this outcome.
This study identifies a clear hierarchy of predictively important digital transformation features for agribusiness green innovation. GD, which reflects enterprises’ underlying digital technology layout including artificial intelligence, big data, and cloud computing, ranks first in both XGBoost feature importance and SHAP total effect. This suggests that a complete underlying digital infrastructure is a key feature that shows a strong predictive association with agricultural enterprises’ capacity to carry out green innovation activities. ID consistently ranks second, indicating that the external digital institutional environment (including digital credit, digital agricultural insurance, and intelligent subsidy application systems) is a critical pillar that is predictively associated with green innovation, second only to internal digital infrastructure. From an institutional theory perspective, digital institutional environments correlate with reduced resource constraints for agricultural firms, thereby showing associations with innovation-related activities. TD and PD, which are deeply embedded in the core production and operation links of agricultural enterprises, also show strong explanatory power for green innovation. This aligns with the unique characteristics of agricultural enterprises, whose green innovation activities are highly dependent on the optimization of production links and the improvement of quality and safety management systems. Digital empowerment in production and traceability links is predictively associated with lower resource consumption and pollutant emissions in the production process and with enterprises’ compliance with environmental regulation and green certification requirements, and thereby with higher green innovation output. In addition, CD and DB show positive but modest predictive associations, acting as important supplementary features for green innovation.
In particular, the stable predictive pattern of DB indicates that balanced digital resource allocation is persistently associated with favorable green innovation performance.
It is important to note that the hierarchical structure identified in this study is based on a sample of Chinese A-share listed agricultural enterprises. These enterprises typically possess relatively abundant capital and technical resources to support the independent development of underlying digital infrastructure. For resource-constrained small- and medium-sized agricultural enterprises and family farms that dominate global agricultural production, as well as agricultural enterprises in emerging economies with less developed digital infrastructure, the relative importance of different digital dimensions may shift significantly. In these contexts, external digital institutional resources provided by governments or third-party platforms may play a more dominant role than internal governance digitalization. They allow small-scale operators to access digital capabilities without bearing prohibitive upfront costs. This contextual boundary highlights the need for differentiated digital transformation strategies tailored to the resource endowments of different types of agricultural operators.

6.2. Impact of Multi-Dimensional Digital Transformation on Agribusiness Green Innovation

6.2.1. Individual Nonlinear Effects

Unlike traditional econometric models that pre-assume a linear nexus between digital transformation and green innovation, this study systematically captures the complex nonlinear patterns and heterogeneous threshold characteristics of multi-dimensional digital transformation on agribusiness green innovation via SHAP dependence plots. This analysis empirically addresses the second research question (RQ2) and offers empirical support for Proposition 2.
Across the five business dimensions and two structural features of digital transformation examined, we observe pronounced heterogeneity in nonlinear predictive patterns. For core dimensions closely tied to enterprises’ underlying digital infrastructure and external institutional environment, i.e., GD and ID, we identify significant threshold effects. GD exhibits a robust U-shaped predictive pattern with a critical threshold at approximately Ln(GD) = 0.25. Its contribution to predicted green innovation is negative at low-to-medium levels, plausibly because high upfront investment in AI, big data and cloud computing crowds out green R&D resources, yet it turns strongly positive once the underlying digital system is fully built and integrated into operations. This U-shaped trajectory reflects a common pattern of technology adoption. Early-stage investment in digital infrastructure initially strains resources without immediate returns. Once a critical mass of digital capabilities is accumulated, the enterprise may become capable of applying these technologies in ways that are predictively associated with green innovation. It should be noted that the threshold value observed here applies primarily to medium and large listed agricultural enterprises with relatively sufficient capital reserves. Smaller operators with tighter financial constraints may face higher thresholds or even be unable to cross this initial investment barrier.
Meanwhile, ID presents an exponentially increasing positive nonlinear predictive pattern with a threshold at Ln(ID) = 0.3. Its marginal predictive contribution increases sharply only when the digital institutional system (covering digital credit, agricultural insurance and intelligent subsidy schemes) is sufficiently developed to unlock its network effects, which is associated with alleviation of the financing constraints and operational risks facing agricultural enterprises’ green innovation activities. This pattern suggests that the enabling role of external support systems becomes particularly pronounced only after reaching a certain scale and level of maturity. In regions with less developed digital public infrastructure, this threshold may be higher and the network effects may take longer to materialize.
For PD and TD, which are deeply embedded in agricultural enterprises’ core production and operation links, nonlinear effects also feature clear thresholds but with distinct patterns. PD has a mild U-shaped predictive pattern, with an initial negative predictive association below Ln(PD) = 0.2 driven by the high upfront asset-specific investment in smart agricultural production equipment, followed by a weak positive predictive association as the scale effects of resource conservation and production efficiency improvement materialize. This trajectory reflects the time needed for firms to integrate and reconfigure production-related digital resources into functional innovation capabilities. TD displays an L-shaped predictive pattern. It shows a strong initial positive predictive association by enabling enterprises to quickly meet environmental regulations and obtain green certification. Its marginal benefit diminishes sharply after Ln(TD) = 0.2 once basic regulatory compliance requirements are fulfilled.
In contrast, circulation digitalization (CD) and digital transformation balance (DB) demonstrate a continuously and steadily strengthening positive predictive pattern without obvious thresholds. They are associated with stable incremental benefits for green innovation throughout the whole process and show no evidence of the crowding-out effect caused by high upfront investment.
These heterogeneous nonlinear patterns highlight the importance of avoiding one-size-fits-all digital transformation strategies. The specific threshold values observed in this study are context-dependent: they may vary across countries with different levels of digital development and across enterprises with different resource endowments.
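One way to read a threshold such as Ln(GD) = 0.25 off a dependence curve is to find where the SHAP value changes sign from negative to positive. The sketch below illustrates that reading under stated assumptions: the function name `find_sign_change_threshold` and the `toy_curve` values are hypothetical and were chosen only so that the crossing lands at 0.25. They are not the study's data, and the procedure the authors used to identify thresholds may differ. A real dependence plot is a scatter of firm-year observations, so it would normally be binned or smoothed before a step like this.

```python
from typing import List, Optional, Tuple


def find_sign_change_threshold(
    points: List[Tuple[float, float]],
) -> Optional[float]:
    """Return the feature value at which the SHAP curve first crosses
    from negative to non-negative, using linear interpolation between
    the two neighbouring points. Returns None if there is no crossing."""
    ordered = sorted(points)  # sort by feature value
    for (x0, s0), (x1, s1) in zip(ordered, ordered[1:]):
        if s0 < 0 <= s1:
            # Zero of the straight line through (x0, s0) and (x1, s1).
            return x0 + (x1 - x0) * (0 - s0) / (s1 - s0)
    return None


# Hypothetical (feature value, SHAP value) pairs tracing a U-shape:
# the contribution dips below zero first, then turns positive.
toy_curve = [
    (0.00, -0.02),
    (0.10, -0.06),
    (0.20, -0.03),
    (0.30, 0.03),
    (0.40, 0.12),
    (0.50, 0.25),
]

threshold = find_sign_change_threshold(toy_curve)
print(f"Estimated sign-change threshold: {threshold:.2f}")
```

A sign change is only one possible definition of a threshold. For an L-shaped pattern such as TD, a change in slope would be the more natural criterion.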

6.2.2. Interaction and Synergistic Effects

To answer the third research question (RQ3) and empirically examine Proposition 3, this study further reveals the significant synergistic interaction patterns between different digital transformation dimensions. The results show that the interaction between core digital transformation dimensions is associated with a smoothing of the nonlinear fluctuation of individual features’ predictive patterns. Specifically, the interaction between GD and ID is associated with a weakening of the U-shaped pattern of GD’s individual effect. High ID levels are associated with both a stronger positive predictive contribution of GD after the threshold and a less negative predictive association of GD at low levels. Similar synergistic patterns are observed in the interaction between GD and PD, as well as between ID and PD. These interaction patterns are consistent with a fundamental principle of digital resource deployment. When multiple digital resources are developed in tandem, they are associated with combined predictive benefits that appear to exceed the sum of their individual contributions. This appears to lower the threshold each dimension must cross before showing positive predictive returns.
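The idea that jointly developed dimensions contribute more than the sum of their stand-alone contributions can be illustrated with an exact Shapley decomposition of a two-feature toy model. This is a minimal sketch, not the study's XGBoost model or its SHAP interaction values: `hypothetical_model`, its coefficients, and the binary "low/high" coding of GD and ID are all assumptions made for illustration.

```python
from typing import Callable, Dict


def two_feature_shapley(f: Callable[[int, int], float]) -> Dict[str, float]:
    """Exact Shapley decomposition for a model with two binary inputs
    (0 = low level, 1 = high level)."""
    base = f(0, 0)
    only_a = f(1, 0) - base   # stand-alone gain of the first feature
    only_b = f(0, 1) - base   # stand-alone gain of the second feature
    both = f(1, 1) - base     # joint gain when both are high
    # Joint gain beyond the sum of the stand-alone gains.
    interaction = both - only_a - only_b
    # With two players, each Shapley value equals the stand-alone gain
    # plus half of the interaction term.
    return {
        "phi_gd": only_a + interaction / 2,
        "phi_id": only_b + interaction / 2,
        "interaction": interaction,
    }


def hypothetical_model(gd: int, id_: int) -> float:
    """Toy predictor: GD alone lowers the output, ID alone raises it,
    and the two together add a positive joint term."""
    return 1.0 - 0.5 * gd + 0.5 * id_ + 1.5 * gd * id_


result = two_feature_shapley(hypothetical_model)
print(result)
```

In this toy setting GD has a negative stand-alone gain, yet its marginal contribution turns positive once ID is high, which mirrors the qualitative pattern described above. A positive `interaction` term is what "more than the sum of individual contributions" means in this decomposition.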
In addition, DB shows a positive moderating pattern in the relationship between GD and green innovation. High-level DB is associated with an amplification of the positive predictive contribution of GD after the threshold, while the U-shaped pattern of GD is less pronounced under low DB levels. This finding further suggests that the coordinated development of multi-dimensional digital transformation is closely associated with unlocking the predictive contribution of digitalization to agricultural enterprises’ green innovation. Single-dimensional digital transformation, even in the core governance dimension, is associated with higher threshold constraints and less stable innovation-related patterns. The results suggest that agricultural enterprises reach a positive and more consistent association between digital transformation and green innovation more quickly when multiple dimensions are developed in a coordinated way.
It is worth noting that the strength of these synergistic effects is also context-dependent. For enterprises with extremely limited resources, simultaneous investment in multiple digital dimensions may exacerbate resource constraints and delay the emergence of innovation dividends. In such cases, a phased development approach that first builds basic institutional digital capabilities may be more practical.

6.3. Implications

These findings provide clear and actionable practical implications for agricultural enterprises, managers, and policymakers. While the results are derived from Chinese listed agricultural firms, the core logic underlying these findings is that different digital dimensions play hierarchically distinct roles and interact synergistically. This logic offers reference value for agribusinesses in other emerging economies facing similar digitalization challenges.
For agricultural enterprises, three implications follow. First, priority should be given to advancing governance digitalization and institutional digitalization, which show the strongest predictive associations with green innovation. Enterprises should abandon the “wide but shallow” symbolic digital transformation strategy and shift from simply expanding the coverage of digitalization to deepening its implementation in core business dimensions. Building the underlying digital infrastructure and integrating digitalization deeply into core production and traceability links helps avoid dispersing limited resources through an indiscriminate rollout across multiple digital dimensions. Concretely, this means building AI-driven decision systems and cloud-based data platforms for governance digitalization and deploying IoT sensors and precision irrigation equipment in production, rather than spreading limited budgets thinly across all digital dimensions. Second, enterprises should fully recognize the threshold effects of digital transformation. For core dimensions such as governance and institutional digitalization, they should maintain sustained investment to break through the critical threshold so as to release the innovation dividend of digitalization. Third, enterprises should pay attention to the balanced allocation of digital resources and the synergistic development of multi-dimensional digitalization. While focusing on the construction of core digital dimensions, they should also promote the coordinated development of digitalization in circulation, institutional and other links to amplify the driving effect of core digital dimensions on green innovation through synergistic effects. For managers, the nonlinear and threshold effects identified in this study indicate that enterprises should focus on deepening digitalization in key business links rather than blindly expanding the scope of digital transformation.
For policymakers, three measures follow. First, improving the digital institutional environment, including digital credit, digital agricultural insurance, and intelligent subsidy systems, can effectively support enterprises’ green innovation activities and help them cross critical digitalization thresholds. In operational terms, this means accelerating the rollout of digital credit products tailored to agribusiness cash-flow cycles, expanding pilot programs for digital agricultural insurance, and streamlining intelligent subsidy application platforms for faster disbursement. Second, they should formulate differentiated support policies for agricultural enterprises’ digital transformation. For small- and medium-sized agricultural enterprises with an insufficient digital foundation, governments can provide targeted subsidies for purchasing cloud computing services and sensor equipment, alongside tax rebates for early-stage digital infrastructure investment, to alleviate the crowding-out effect on green R&D. For enterprises with a certain digital foundation, policies should guide them toward integrating digital tools into core operational processes, such as linking traceability systems with green certification schemes and embedding AI analytics in production planning, to translate digital investment into measurable green outcomes. Third, they should build an open sharing platform for agricultural digital technology, promote the flow and sharing of digital technology knowledge between enterprises, and help agricultural enterprises form a collaborative development pattern of multi-dimensional digital transformation.

7. Conclusions

Digital transformation has become a core factor associated with agribusiness green innovation and a key strategic path for China’s agricultural sector to achieve low-carbon transformation, high-quality development and rural revitalization. Existing studies mostly treat digital transformation as a single aggregate variable, overlooking the heterogeneous associations of digitalization across different business links of agricultural enterprises, and traditional linear econometric models are limited in capturing the complex nonlinear patterns and synergistic interaction patterns between digital transformation and agribusiness green innovation. To fill these research gaps, this study takes 2011–2021 Chinese A-share listed agricultural companies as the research sample, decomposes digital transformation into five business dimensions (production, circulation, traceability, governance, and institutional digitalization) and two structural features (transformation scope and balance), and systematically examines the associations of multi-dimensional digital transformation on agribusiness green innovation through an explainable machine learning framework integrating Bayesian optimization, the XGBoost algorithm and the SHAP method.
The core findings of this study are threefold. First, the “wide but shallow” symbolic digital strategy that merely expands the number of digital dimensions is not associated with substantive green innovation performance for agricultural enterprises. The predictively important factors of agribusiness green innovation present a clear hierarchical structure. Governance digitalization is the most important predictive driver, followed by institutional digitalization, while traceability and production digitalization embedded in core production and operation links are also important predictive factors, and circulation digitalization and digital transformation balance play a positive supplementary role. Second, different dimensions of digital transformation show significant heterogeneous nonlinear patterns and clear threshold characteristics in their associations with agribusiness green innovation. Governance digitalization presents a significant U-shaped predictive pattern. Institutional digitalization shows an exponentially increasing positive nonlinear pattern with prominent network effects. Production digitalization has a mild U-shaped predictive pattern. Traceability digitalization presents an L-shaped pattern with diminishing predictive contributions, while circulation digitalization and digital transformation balance show a steadily strengthening positive pattern without obvious threshold constraints. Third, there are significant synergistic interaction patterns between different digital transformation dimensions. The interaction between core dimensions is associated with a smoothing of the nonlinear fluctuation of individual patterns, and the balanced development of multi-dimensional digitalization positively moderates the predictive contributions of core digital dimensions to green innovation, a pattern closely associated with unlocking the innovation potential of digital transformation.
It should be noted that the XGBoost–SHAP framework identifies predictive associations rather than causal relationships. The observed nonlinear patterns and interaction patterns reflect the model’s estimation of feature contributions and should be interpreted with due caution.
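For reference, the "feature contributions" mentioned above can be written in the standard SHAP additive form. The notation here is introduced for illustration and may differ from the notation used elsewhere in the article: f is the fitted model, x is one firm-year observation, M is the number of model inputs, and φ_i is the SHAP value of input i.

```latex
% Additive decomposition of a single prediction into a baseline
% and per-feature SHAP values
f(x) = \phi_0 + \sum_{i=1}^{M} \phi_i(x),
\qquad \phi_0 = \mathbb{E}\left[ f(X) \right]

% SHAP interaction values split each \phi_i into a main effect
% \phi_{i,i} and pairwise interaction terms \phi_{i,j}
\phi_i(x) = \phi_{i,i}(x) + \sum_{j \neq i} \phi_{i,j}(x)
```

Both identities describe how the fitted model distributes its own prediction across inputs. They hold regardless of whether the underlying relationships are causal, which is why the patterns are read as predictive associations.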
Notwithstanding its findings, this study has several limitations. First, the sample is limited to Chinese A-share listed agricultural enterprises, which hold advantages in capital, resources and policy support over the small- and medium-sized agricultural enterprises and family farms that dominate China’s agricultural sector. The conclusions’ applicability to small and micro-operators with weak digital foundations thus needs further verification, and future research can expand the sample to explore heterogeneous associations across agricultural operators of different types and scales. Second, this study measures multi-dimensional digital transformation through annual report keyword frequency, a mainstream method that cannot fully capture digital transformation’s actual implementation, input–output efficiency and application depth. Future research can optimize this measurement system by combining field survey data with digital investment and operation indicators to more accurately identify digitalization’s associations with agribusiness green innovation. Third, green patent applications, while widely used, mainly capture formal technological innovation and may overlook process-oriented improvements and informal green practices that are common in the agricultural sector. Future research could therefore incorporate a broader set of indicators, such as green product certifications, the adoption of resource-saving production techniques, and environmental compliance records, to provide a more comprehensive assessment of agribusiness green innovation. Fourth, given the observational nature of the data, potential endogeneity concerns cannot be fully ruled out. Future research could employ quasi-experimental designs or instrumental variable approaches to further strengthen causal identification.
Fifth, while the explainable machine learning framework (XGBoost–SHAP) adopted in this study provides rich insights into heterogeneous patterns, the results should be interpreted as correlational rather than causal evidence. Future research could combine quasi-experimental designs with explainable machine learning to strengthen causal identification.

Author Contributions

Conceptualization, W.L. and X.F.; methodology, W.L. and X.F.; software, W.L.; validation, W.L. and X.F.; formal analysis, W.L.; investigation, W.L.; data curation, W.L.; writing—original draft preparation, W.L.; writing—review and editing, X.F.; supervision, X.F.; project administration, X.F.; funding acquisition, X.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research is supported by the Key Project of Philosophy and Social Science Research in Colleges and Universities in Jiangsu Province (No. 2023SJZD020).

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no competing interests.

Abbreviations

The following abbreviations are used in this manuscript:
PD: Production digitalization
CD: Circulation digitalization
TD: Traceability digitalization
GD: Governance digitalization
ID: Institutional digitalization
DS: Digital transformation scope
DB: Digital transformation balance

References

  1. Bi, S.Y.; Zhang, X.J.; Xiao, G.L.; Li, H.X. Marketization, digital economy, and industrial structure transformation: Mechanisms and regional variations. Int. Rev. Econ. Financ. 2025, 102, 104228. [Google Scholar] [CrossRef]
  2. Liu, J.F.; Xian, X. Impact of digital transformation on high-quality economic development: The mediating role of human capital investment. Financ. Res. Lett. 2025, 86, 108657. [Google Scholar] [CrossRef]
  3. Wu, Y.; Li, H.; Luo, R.; Yu, Y. How digital transformation helps enterprises achieve high-quality development? Empirical evidence from Chinese listed companies. Eur. J. Innov. Manag. 2024, 27, 2753–2779. [Google Scholar] [CrossRef]
  4. Peng, Y.Z.; Tao, C.Q. Can digital transformation promote enterprise performance? From the perspective of public policy and innovation. J. Innov. Knowl. 2022, 7, 100198. [Google Scholar] [CrossRef]
  5. Calle, F.; Carrasco, I.; González-Moreno, A.; Córcoles, C. Are environmental regulations to promote eco-innovation in the wine sector effective? A study of Spanish wineries. Agronomy 2022, 12, 21. [Google Scholar] [CrossRef]
  6. Zhang, Y.L.; Wang, M.Z.; Qian, J.F.; Liao, Y.S. Development characteristics, problems of leading enterprises in agricultural industrialization in China and its development thoughts. Issues Agric. Econ. 2021, 8, 135–144. [Google Scholar] [CrossRef]
  7. Fang, X.B.; Liu, M.T. How does the digital transformation drive digital technology innovation of enterprises? Evidence from enterprise’s digital patents. Technol. Forecast. Soc. Change 2024, 204, 123428. [Google Scholar] [CrossRef]
  8. Liang, R.; Li, Y. How digital transformation affects exploitative and exploratory innovation: An innovation structure perspective. IEEE Trans. Eng. Manag. 2024, 71, 10912–10923. [Google Scholar] [CrossRef]
  9. Zhuo, C.F.; Chen, J. Can digital transformation overcome the enterprise innovation dilemma: Effect, mechanism and effective boundary. Technol. Forecast. Soc. Change 2023, 190, 122378. [Google Scholar] [CrossRef]
  10. Niu, Y.; Wen, W.; Wang, S.; Li, S. Breaking barriers to innovation: The power of digital transformation. Financ. Res. Lett. 2023, 51, 103457. [Google Scholar] [CrossRef]
  11. Stanescu, S.-G.; Ionescu, C.A.; Ștefan, M.C.; Ionescu, L.; Bondac, G.-T.; Cristea, A.M. Digitalization and Blockchain Integration in Agri-Food Supply Chains: Towards a Resilient, Circular, and Sustainable Future. Sustainability 2025, 17, 9276. [Google Scholar] [CrossRef]
  12. Huang, Y.; Liu, Q.; Xiong, N.; Liu, C.K. Digital transformation and environmentally sustainable innovation: Based on machine learning and text analysis methods. J. Environ. Manag. 2025, 393, 127090. [Google Scholar] [CrossRef]
  13. Richards, C.; Messner, R.; Higgins, V. Digital technology and on-farm responses to climate shocks: Exploring the relations between producer agency and the security of food production. Agric. Hum. Values 2024, 42, 53–67. [Google Scholar] [CrossRef]
  14. Amar, S.; Bori, N.; Cörvers, R. From collection to control: Data governance, digital technologies, and the politics of inclusion in the digitalisation of smallholder agriculture. Outlook Agric. 2026, 55, 19–27. [Google Scholar] [CrossRef]
  15. Lu, R.; Peng, X.; Reve, T. Firms’ digital transformation, competitive strategies, and innovation: Evidence from Chinese listed companies. J. Manag. Organ. 2024, 31, 575–601. [Google Scholar] [CrossRef]
  16. Liu, B. How does digital transformation enhance enterprise technological innovation? Evidence from Chinese manufacturing listed companies. Technol. Soc. 2025, 82, 102884. [Google Scholar] [CrossRef]
  17. Schäper, T.; Jung, C.; Foege, J.N.; Bogers, M.L.A.M.; Fainshmidt, S.; Nüesch, S. The S-shaped relationship between open innovation and financial performance: A longitudinal perspective using a novel text-based measure. Res. Policy 2023, 52, 104764. [Google Scholar] [CrossRef]
  18. Zhou, M.M.; Zhou, S.Q.; Hei, J.H.; Yang, S.J.; Liu, Q.; Yang, T.; Wu, Z. Impact of innovation drivers in Chinese cities: Machine learning analysis using XGBoost. Cities 2025, 167, 106347. [Google Scholar] [CrossRef]
  19. Shi, B.W.; Mao, X.J.; Yang, M.C.; Li, B. What, why, and how: An empiricist’s guide to double/debiased machine learning. Inf. Syst. Res. 2025. [Google Scholar] [CrossRef]
  20. Tao, A.P.; Wang, C.X.; Zhang, S.; Kuai, P. Does enterprise digital transformation contribute to green innovation? Micro-level evidence from China. J. Environ. Manag. 2024, 370, 122609. [Google Scholar] [CrossRef]
  21. Du, Y.J.; Chen, S. Going for sustainability: How digital transformation affects corporate radical green innovation in China. Econ. Change Restruct. 2025, 59, 14. [Google Scholar] [CrossRef]
  22. Peng, H.; Shen, N.; Ying, H.Q.; Wang, Q.W. Can environmental regulation directly promote green innovation behavior? Based on situation of industrial agglomeration. J. Clean. Prod. 2021, 314, 128044. [Google Scholar] [CrossRef]
  23. Yan, Z.M.; Yu, Y.; Du, K.R.; Zhang, N. How does environmental regulation promote green technology innovation? Evidence from China’s total emission control policy. Ecol. Econ. 2024, 219, 108137. [Google Scholar] [CrossRef]
  24. Lu, H.; Zhang, Y.; Jiang, J.; Cao, G. Do market-based environmental regulations always promote enterprise green innovation commercialization? J. Environ. Manag. 2025, 375, 124183. [Google Scholar] [CrossRef]
  25. Lyu, H.; Ma, C.N.; Farnoosh, A. Government innovation subsidies, green technology innovation and carbon intensity of industrial firms. J. Environ. Manag. 2024, 369, 122274. [Google Scholar] [CrossRef] [PubMed]
  26. An, J.; He, G.Q.; Ge, S.L.; Wu, S.S. The impact of government green subsidies on corporate green innovation. Financ. Res. Lett. 2025, 71, 106378. [Google Scholar] [CrossRef]
  27. Zhang, J.X.; Wu, Y.H.; Lin, L. Government subsidies, corporate ESG ratings, and green innovation. Financ. Res. Lett. 2025, 84, 107793. [Google Scholar] [CrossRef]
  28. Füller, J.; Hutter, K.; Wahl, J.; Bilgram, V.; Tekic, Z. How AI revolutionizes innovation management: Perceptions and implementation preferences of AI-based innovators. Technol. Forecast. Soc. Change 2022, 178, 121598. [Google Scholar] [CrossRef]
  29. Ning, J.; Jiang, X.Y.; Luo, J.M. Relationship between enterprise digitalization and green innovation: A mediated moderation model. J. Innov. Knowl. 2023, 8, 100326. [Google Scholar] [CrossRef]
  30. Guo, L.H.; Pei, H.C.; Liu, Y.Z. Artificial intelligence and corporate green innovation: Evidence from China. Res. Int. Bus. Financ. 2025, 79, 103039. [Google Scholar] [CrossRef]
  31. Ali, A.; Hassan, M.U.; Kaul, H.P. Broad scope of site-specific crop management and specific role of remote sensing technologies within it—A review. J. Agron. Crop Sci. 2024, 210, e12732. [Google Scholar] [CrossRef]
  32. Yao, D.; Yan, K. Can factoring business alleviate the seasonal impact on agricultural supply chain enterprises? Int. Rev. Financ. Anal. 2025, 98, 103891. [Google Scholar] [CrossRef]
  33. Yang, Y.; Jiang, Y.; Yang, Y. Institutional logics and organizational green transformation: Evidence from the agricultural industry in emerging economies. J. Environ. Manag. 2024, 370, 122932. [Google Scholar] [CrossRef]
  34. Ali, A.; Tan, Y.W.; Medani, K.; Xia, C.P.; Abdullahi, N.M.; Mahmood, I.; Yang, S.L. Horticultural postharvest loss and its socio-economic and environmental impacts. J. Environ. Manag. 2025, 373, 123458. [Google Scholar] [CrossRef] [PubMed]
  35. Wu, H.; Wang, B.; Lu, M.; Irfan, M.; Miao, X.; Luo, S.; Hao, Y. The strategy to achieve zero-carbon in agricultural sector: Does digitalization matter under the background of COP26 targets? Energy Econ. 2023, 126, 106916. [Google Scholar] [CrossRef]
  36. Muench, S.; Cechura, L.; Bavorova, M. Exploring the motives behind the adoption of climate change adaptation strategies among farmers in the Czech Republic. Mitig. Adapt. Strateg. Glob. Change 2024, 29, 84. [Google Scholar] [CrossRef]
  37. Sponagel, C.; Weik, J.; Witte, F.; Back, H.; Wagner, M.; Ruser, R.; Bahrs, E. Climate change mitigation potential and economic evaluation of selected technical adaptation measures and innovations in conventional arable farming in Germany. J. Environ. Manag. 2025, 374, 123884. [Google Scholar] [CrossRef] [PubMed]
  38. Lu, Z.Y.; Li, N.; Feng, H.L.; Dong, J.L.; Gou, D.; Xu, M. Climate change risks and green low-carbon development in agriculture: Evidence from China on the regulatory role of agricultural insurance and spatial spillover effects. Agriculture 2025, 16, 24. [Google Scholar] [CrossRef]
  39. Liu, X.; Huo, X. Digital finance, financial regulation and green innovation in agricultural enterprises. Financ. Res. Lett. 2026, 92, 109562. [Google Scholar] [CrossRef]
  40. Zheng, L.Y.; Huang, H.J.; Han, J.L. Can symbiotic relationship promote green technology innovation of agricultural enterprises? A study based on the empirical evidence of Chinese agricultural listed companies. Sustainability 2024, 16, 10841. [Google Scholar] [CrossRef]
  41. Yuan, Y.; Guo, X.Y.; Shen, Y. Digitalization drives the green transformation of agriculture-related enterprises: A case study of A-share agriculture-related listed companies. Agriculture 2024, 14, 1308. [Google Scholar] [CrossRef]
  42. Zhou, Y.; Wang, W.; Liu, Y. How does digital transformation drive green innovation in agricultural supply chains? Renew. Sustain. Energy Rev. 2025, 217, 115780. [Google Scholar] [CrossRef]
  43. Li, S.L.; Gao, L.W.; Han, C.J.; Gupta, B.; Alhalabi, W.; Almakdi, S. Exploring the effect of digital transformation on firms’ innovation performance. J. Innov. Knowl. 2023, 8, 100317. [Google Scholar] [CrossRef]
  44. Meng, F.L.; Wang, W.P. The impact of digitalization on enterprise value creation: An empirical analysis of Chinese manufacturing enterprises. J. Innov. Knowl. 2023, 8, 100385. [Google Scholar] [CrossRef]
  45. He, Q.; Yang, Z.W. Generative AI-driven knowledge management in manufacturing firms: A five-stage framework for dynamic knowledge optimization and digital innovation. J. Knowl. Manag. 2025, 30, 1447–1467. [Google Scholar] [CrossRef]
  46. Xu, C.; Lin, B. Embracing artificial intelligence: How does intelligent transformation affect the technological innovation of new energy enterprises? IEEE Trans. Eng. Manag. 2025, 72, 703–716. [Google Scholar] [CrossRef]
  47. Xia, H.; Liu, M.W.; Wang, P.C.; Tan, X.K. Strategies to enhance the corporate innovation resilience in digital era: A cross-organizational collaboration perspective. Heliyon 2024, 10, e39132. [Google Scholar] [CrossRef]
  48. Yu, Y.B.; Zeng, H.Y.; Zhang, M. Digital transformation for supply chain collaborative innovation and market performance. Eur. J. Innov. Manag. 2025, 28, 2446–2468. [Google Scholar] [CrossRef]
  49. Chen, W.M.; Lu, H.Y.; Mora, L.; Chen, T.; Beckers, D.; Hu, M.Y. Linking manufacturing digitalization and technological innovation: The mediating role of dynamic capabilities. Technol. Soc. 2025, 83, 103041. [Google Scholar] [CrossRef]
  50. Liang, P.; Sun, X. Does digital transformation promote the green innovation of China’s listed companies? Environ. Dev. Sustain. 2024, 26, 22199–22235. [Google Scholar] [CrossRef]
  51. Liu, M.Y.; Li, C.Y.; Wang, S.; Li, Q.H. Digital transformation, risk-taking, and innovation: Evidence from data on listed enterprises in China. J. Innov. Knowl. 2023, 8, 100332. [Google Scholar] [CrossRef]
  52. Zhao, Y.N.; Fang, W. How does digital transformation affect green innovation performance? Evidence from China. Technol. Anal. Strateg. Manag. 2023, 37, 139–154. [Google Scholar] [CrossRef]
  53. Tang, M.G.; Liu, Y.L.; Hu, F.X.; Wu, B.J. Effect of digital transformation on enterprises’ green innovation: Empirical evidence from listed companies in China. Energy Econ. 2023, 128, 107135. [Google Scholar] [CrossRef]
  54. Chen, R.; Zhang, B.; Chen, Y. How does digital transformation influence collaborative green innovation? J. Glob. Inf. Manag. 2024, 32, 1–18. [Google Scholar] [CrossRef]
  55. He, Q.Q.; Ribeiro-Navarrete, S.; Botella-Carrubi, D. A matter of motivation: The impact of enterprise digital transformation on green innovation. Rev. Manag. Sci. 2024, 18, 1489–1518. [Google Scholar] [CrossRef]
  56. Zhang, W.; Zhao, J.; Li, H.; Chen, S. Does digital transformation empower green innovation? Evidence from listed companies in heavily polluting industries in China. Financ. Res. Lett. 2024, 66, 105685. [Google Scholar] [CrossRef]
  57. Sun, Y.M. Digital transformation and corporates’ green technology innovation performance: The mediating role of knowledge sharing. Financ. Res. Lett. 2024, 62, 105105. [Google Scholar] [CrossRef]
  58. Zhang, J.; Yu, C.H.; Zhao, J.; Lee, C.C. How does corporate digital transformation affect green innovation? Evidence from China’s enterprise data. Energy Econ. 2025, 142, 108217. [Google Scholar] [CrossRef]
  59. Scuotto, V.; Arrigo, E.; Candelo, E.; Nicotra, M. Ambidextrous innovation orientation effected by the digital transformation: A quantitative research on fashion SMEs. Bus. Process Manag. J. 2020, 26, 1121–1140. [Google Scholar] [CrossRef]
  60. Li, T.; Xiong, S.X. The differential impact of enterprise digital transformation on ambidextrous innovation: Evidence from China. Int. Rev. Econ. Financ. 2025, 103, 104436. [Google Scholar] [CrossRef]
  61. Zhong, X.; Zhang, Y. Digital transformation speed and firms’ ambidextrous green innovation: Do employee stock ownership and education levels matter? Technol. Soc. 2025, 83, 103024. [Google Scholar] [CrossRef]
  62. Lu, N. Digitization and ambidextrous green innovation: A resource orchestration perspective. Ind. Manag. Data Syst. 2025, 126, 542–565. [Google Scholar] [CrossRef]
  63. Xue, Z.; Hou, Y.J.; Cao, G.Q.; Sun, G.L. How does digital transformation drive innovation in Chinese agribusiness: Mechanism and micro evidence. J. Innov. Knowl. 2024, 9, 100489. [Google Scholar] [CrossRef]
  64. Romero, I.; Mammadov, H. Digital transformation of small and medium-sized enterprises as an innovation process: A holistic study of its determinants. J. Knowl. Econ. 2024, 16, 8496–8523. [Google Scholar] [CrossRef]
  65. Valdivia, C.A.S.; Mamédio, D.F.; Loures, E.D.R.; Tortato, U. Dimensions of digital transformation for digital supply chains: Evidence from an automotive OEM group. Res.-Technol. Manag. 2024, 67, 57–68. [Google Scholar] [CrossRef]
  66. Liu, Y.; Ma, X.Y.; Shu, L.; Hancke, G.P.; Abu-Mahfouz, A.M. From industry 4.0 to agriculture 4.0: Current status, enabling technologies, and research challenges. IEEE Trans. Ind. Inform. 2021, 17, 4322–4334. [Google Scholar] [CrossRef]
  67. Mendes, J.A.J.; Carvalho, N.G.P.; Mourarias, M.N.; Careta, C.B.; Zuin, V.G.; Gerolamo, M.C. Dimensions of digital transformation in the context of modern agriculture. Sustain. Prod. Consum. 2022, 34, 613–637. [Google Scholar] [CrossRef]
  68. Tzachor, A.; Richards, C.E.; Jeen, S. Transforming agrifood production systems and supply chains with digital twins. npj Sci. Food 2022, 6, 47. [Google Scholar] [CrossRef]
  69. Wang, W.H.; Li, Z.; Meng, Q.F. Digital transformation drivers, technologies, and pathways in agricultural product supply chains: A comprehensive literature review. Appl. Sci. 2025, 15, 10487. [Google Scholar] [CrossRef]
  70. Xu, J.H.; Li, Y.Z.; Zhang, M.P.; Zhang, S.H. Sustainable agriculture in the digital era: Past, present, and future trends by bibliometric analysis. Heliyon 2024, 10, e34612. [Google Scholar] [CrossRef]
  71. Zhang, H.K.; Wu, J.C.; Mei, Y.; Hong, X.Y. Exploring the relationship between digital transformation and green innovation: The mediating role of financing modes. J. Environ. Manag. 2024, 356, 120558. [Google Scholar] [CrossRef]
  72. Chen, W.; Song, Z.C.; Xie, Y. Energy transition across the climate policy uncertainty divide: The critical role of green technology innovation and digital transformation. Econ. Anal. Policy 2026, 90, 322–342. [Google Scholar] [CrossRef]
  73. Angelidis, G. A Mathematical Framework for Modeling Global Value Chain Networks. Foundations 2026, 6, 8. [Google Scholar] [CrossRef]
  74. Han, Y.A.; Li, Z.T.; Feng, T.C.; Qiu, S.L.; Hu, J.; Yadav, K.K.; Obaidullah, A.J. Unraveling the impact of digital transformation on green innovation through microdata and machine learning. J. Environ. Manag. 2024, 354, 120271. [Google Scholar] [CrossRef]
  75. Li, H.; Tian, H.; Tang, H. How does enterprise digital transformation impact green innovation performance? A machine learning-based study. Ind. Manag. Data Syst. 2025, 125, 2999–3023. [Google Scholar] [CrossRef]
  76. Lundberg, S.M.; Lee, S. A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems; Neural Information Processing Systems: Barcelona, Spain, 2017; pp. 4768–4777. [Google Scholar] [CrossRef]
  77. Shan, T.L.; Feng, S.; Li, K.J.; Chang, R.D.; Huang, R.P. Unveiling the effects of artificial intelligence and green technology convergence on carbon emissions: An explainable machine learning-based approach. J. Environ. Manag. 2025, 373, 123657. [Google Scholar] [CrossRef]
  78. Barney, J.B. Firm resources and sustained competitive advantage. J. Manag. 1991, 17, 99–120. [Google Scholar] [CrossRef]
  79. Verhoef, P.C.; Broekhuizen, T.; Bart, Y.; Bhattacharya, A.; Dong, J.Q.; Fabian, N.; Haenlein, M. Digital transformation: A multidisciplinary reflection and research agenda. J. Bus. Res. 2021, 122, 889–901. [Google Scholar] [CrossRef]
  80. Klerkx, L.; Rose, D. Dealing with the game-changing technologies of Agriculture 4.0: How do we manage diversity and responsibility in food system transition pathways? Glob. Food Secur. 2020, 24, 100347. [Google Scholar] [CrossRef]
  81. Wade, M.; Hulland, J. Review: The resource-based view and information systems research: Review, extension, and suggestions for future research. MIS Q. 2004, 28, 107–142. [Google Scholar] [CrossRef]
  82. Teece, D.J.; Pisano, G.; Shuen, A. Dynamic capabilities and strategic management. In Knowledge and Strategy; Butterworth-Heinemann: Oxford, UK, 1999; pp. 77–115. [Google Scholar] [CrossRef]
  83. Ennen, E.; Richter, A. The whole is more than the sum of its parts—Or is it? A review of the empirical literature on complementarities in organizations. J. Manag. 2010, 36, 207–245. [Google Scholar] [CrossRef]
  84. Li, J.W. Digital Resources Collaborative optimization of enterprise service efficiency: A dual-cycle orchestration perspective. Int. J. Inf. Syst. Serv. Sect. 2026, 17, 1–19. [Google Scholar] [CrossRef]
  85. Chen, W.Y.; Zhang, L.G.; Jiang, P.Y.; Meng, F.L.; Sun, Q.Y. Can digital transformation improve the information environment of the capital market? Evidence from the analysts’ prediction behavior. Account. Financ. 2022, 62, 2543–2578. [Google Scholar] [CrossRef]
  86. Cao, Y.Y.; Tang, L. Digitalization and agricultural businesses’ environmental sustainability. Financ. Res. Lett. 2025, 86, 108442. [Google Scholar] [CrossRef]
  87. Wang, Z.C.; Pan, Z.F.; Lai, W.L.; Lu, S.; Liu, H.T.; Wang, X.Q.; Wu, H.B. How does digital technology enhance sustainable operations in agribusiness? A case analysis of a Chinese agricultural enterprise. Front. Sustain. Food Syst. 2025, 9, 1718405. [Google Scholar] [CrossRef]
  88. Van Campenhout, B. ICTs to address information inefficiencies in food supply chains. Agric. Econ. 2022, 53, 968–975. [Google Scholar] [CrossRef]
  89. Kuei, S.C.; Chen, M.C. Blockchain technology for risk management in food supply chain: A systematic literature review on emerging themes and sustainability implications. Food Control 2026, 181, 111689. [Google Scholar] [CrossRef]
  90. Hu, C.J.; Xu, Y.T.; Gao, P.B. Leveraging big data analytics capability for firm innovativeness: The role of sustained innovation and organizational slack. Systems 2025, 13, 730. [Google Scholar] [CrossRef]
  91. Vo-Thai, H.C.; Tran, M.L. Green innovation strategies in Vietnamese enterprises: Leveraging knowledge management and digitalization for sustainable competitiveness. J. Knowl. Manag. 2025, 29, 1055–1091. [Google Scholar] [CrossRef]
  92. Shao, F.J.; Jiao, Z.Y.; Jin, T.Q.; Zhu, X.W. Bridging the gap: Digital finance’s role in addressing maturity mismatch in investment and financing for agricultural enterprises. Financ. Res. Lett. 2024, 64, 105415. [Google Scholar] [CrossRef]
  93. Shen, Z.Y.; Hong, T.Y.; Blancard, S.; Bai, K.X. Digital financial inclusion and green growth: Analysis of Chinese agriculture. Appl. Econ. 2024, 56, 5555–5573. [Google Scholar] [CrossRef]
  94. Chen, T.Q.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; Association for Computing Machinery: New York, NY, USA, 2016; pp. 785–794. [Google Scholar] [CrossRef]
  95. Frazier, P.I. A Tutorial on Bayesian Optimization. arXiv 2018, arXiv:1807.02811. [Google Scholar] [CrossRef]
Figure 1. Bayesian optimization process for XGBoost hyperparameters. Notes: The optimal hyperparameters were determined using 200 iterations and 5-fold cross-validation. The horizontal axis represents the number of iterations, and the vertical axis represents the 5-fold cross-validation score. The optimal hyperparameters (the red dot in the figure) are: max_depth = 7, learning_rate = 0.0818, n_estimators = 338, subsample = 0.8898, gamma = 0.2469, min_child_weight = 1, and colsample_bytree = 0.8135.
Figure 2. XGBoost and SHAP feature importance analysis. Notes: GD = governance digitalization; ID = institutional digitalization; PD = production digitalization; TD = traceability digitalization; CD = circulation digitalization; DB = digital transformation balance; DS = digital transformation scope. (a) XGBoost feature importance ranking; (b) Mean SHAP feature importance; (c) SHAP beeswarm (summary) plot.
Figure 3. Relationship between digital transformation and agribusiness green innovation. Notes: Vertical dashed lines mark key threshold values. GD = governance digitalization; ID = institutional digitalization; PD = production digitalization; TD = traceability digitalization; CD = circulation digitalization; DB = digital transformation balance; DS = digital transformation scope. (a) SHAP dependence plot of PD; (b) SHAP dependence plot of CD; (c) SHAP dependence plot of TD; (d) SHAP dependence plot of GD; (e) SHAP dependence plot of ID; (f) SHAP dependence plot of DB; (g) SHAP dependence plot of DS.
Figure 4. SHAP interaction effects between features. Notes: Red dashed curves represent single-feature marginal effects; green curves represent interactive effects. Red points indicate high values of the moderating feature. SHAP interaction values reflect synergistic contributions to green innovation. GD = governance digitalization; ID = institutional digitalization; PD = production digitalization; TD = traceability digitalization; DB = digital transformation balance; DS = digital transformation scope. (a) SHAP dependence plot of GD and ID; (b) SHAP dependence plot of GD and PD; (c) SHAP dependence plot of ID and PD; (d) SHAP dependence plot of GD and TD; (e) SHAP dependence plot of GD and DB; (f) SHAP dependence plot of ID and DB.
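To make the idea behind the SHAP attributions in Figures 2–4 concrete, the following minimal sketch computes exact Shapley values for a toy cooperative game by enumerating all feature coalitions. It is an illustration only and not the procedure used in this study: the `toy_value` payoff function and its numbers are hypothetical, and SHAP for tree models relies on a dedicated tree algorithm rather than brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Shapley weight for a coalition of size k out of n players.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_value(coalition):
    """Hypothetical payoff: GD adds 2, ID adds 1, GD and ID together add 1 more."""
    v = 0.0
    if "GD" in coalition:
        v += 2.0
    if "ID" in coalition:
        v += 1.0
    if {"GD", "ID"} <= coalition:
        v += 1.0  # synergy term, split equally between GD and ID
    return v


phi = shapley_values(["GD", "ID", "DS"], toy_value)
print(phi)
```

In this toy game the synergy term is shared equally between the two interacting features, while a feature that never changes the payoff (here DS) receives an attribution of zero. The attributions sum to the payoff of the full coalition, which is the additivity property that SHAP plots rely on.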
Table 1. Keyword dictionary for multi-dimensional digital transformation in agribusiness.
| Business-Specific Digital Transformation Dimension | Keywords |
| --- | --- |
| Production Digitalization (PD) | precision irrigation, drone-based plant protection, smart greenhouses, soil sensors, satellite remote sensing, autonomous agricultural machinery, IoT monitoring |
| Circulation Digitalization (CD) | e-commerce platforms, cold-chain logistics tracking, order management systems, digital fresh produce distribution, production-marketing docking platforms |
| Traceability Digitalization (TD) | blockchain traceability, product QR code tracking, quality and safety databases, digital pesticide residue detection |
| Governance Digitalization (GD) | artificial intelligence, big data, cloud computing, data middle platform, intelligent decision-making, digital management platforms |
| Institutional Digitalization (ID) | digital credit, digital agricultural insurance, intelligent subsidy application, online bank-enterprise docking, digital policy declaration systems |
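A keyword dictionary such as Table 1 is typically applied by counting keyword occurrences in firm documents. The sketch below shows one way this could be done, assuming the input is plain English report text. The abbreviated `KEYWORDS` subset, the `count_keywords` and `log_scores` helpers, the sample sentence, and the ln(1 + count) transformation are illustrative assumptions and are not taken from the study's own procedure.

```python
import math
import re

# Abbreviated, illustrative subset of the Table 1 dictionary.
KEYWORDS = {
    "PD": ["precision irrigation", "smart greenhouses", "soil sensors"],
    "CD": ["e-commerce platforms", "cold-chain logistics tracking"],
    "TD": ["blockchain traceability", "product QR code tracking"],
    "GD": ["artificial intelligence", "big data", "cloud computing"],
    "ID": ["digital credit", "digital agricultural insurance"],
}


def count_keywords(text, keywords=KEYWORDS):
    """Count case-insensitive keyword occurrences for each dimension."""
    lowered = text.lower()
    return {
        dim: sum(len(re.findall(re.escape(kw.lower()), lowered)) for kw in kws)
        for dim, kws in keywords.items()
    }


def log_scores(counts):
    """Assumed transformation: ln(1 + count), which dampens very large counts."""
    return {dim: math.log(1 + n) for dim, n in counts.items()}


# Hypothetical report excerpt used only for demonstration.
report = (
    "The firm deployed soil sensors and precision irrigation, built a "
    "big data platform on cloud computing, and used big data for pricing."
)
counts = count_keywords(report)
scores = log_scores(counts)
print(counts)
print(scores)
```

Simple substring matching of this kind does not handle negation or context, so counts obtained this way are best read as a rough proxy for the emphasis a firm places on each dimension.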
Table 2. Search space and optimal values of XGBoost hyperparameters.
| XGBoost Hyperparameter | Description | Search Range | Optimal Value |
| --- | --- | --- | --- |
| max_depth | Maximum depth of a tree. | (4, 12] | 7 |
| learning_rate | Step size shrinkage used in each update to prevent overfitting. | (0.01, 0.15] | 0.0818 |
| n_estimators | Number of weak estimators in the ensemble. | (300, 1500] | 338 |
| min_child_weight | Minimum sum of instance weight (hessian) needed in a child. | (1, 8] | 1 |
| subsample | Subsample ratio of the training instances. | (0.7, 1.0] | 0.8898 |
| gamma | Minimum loss reduction required to make a further partition on a leaf node of the tree. | (0, 3] | 0.2469 |
| colsample_bytree | Proportion of features randomly selected when training each tree. | (0.7, 1.0] | 0.8135 |
Notes: Optimal hyperparameters were obtained via Bayesian optimization with 5-fold cross-validation.
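The search space in Table 2 can be written down programmatically, as in the minimal sketch below, which draws one candidate configuration from the listed half-open ranges. Plain random sampling is used here as a simple stand-in and is not the Bayesian optimization reported in the study; the `SEARCH_SPACE` layout and the `sample_candidate` helper are hypothetical names introduced for illustration.

```python
import random

# Ranges copied from Table 2, read as half-open intervals (low, high].
SEARCH_SPACE = {
    "max_depth": (4, 12, int),
    "learning_rate": (0.01, 0.15, float),
    "n_estimators": (300, 1500, int),
    "min_child_weight": (1, 8, int),
    "subsample": (0.7, 1.0, float),
    "gamma": (0.0, 3.0, float),
    "colsample_bytree": (0.7, 1.0, float),
}


def sample_candidate(rng, space=SEARCH_SPACE):
    """Draw one random configuration (a stand-in for a Bayesian proposal step)."""
    candidate = {}
    for name, (low, high, kind) in space.items():
        if kind is int:
            # Integers strictly above low and up to high inclusive.
            candidate[name] = rng.randint(low + 1, high)
        else:
            # rng.random() lies in [0, 1), so the draw excludes low and can equal high.
            candidate[name] = high - rng.random() * (high - low)
    return candidate


rng = random.Random(0)  # fixed seed for reproducibility of the demonstration
candidate = sample_candidate(rng)
print(candidate)
```

A Bayesian optimizer would replace the random draw with proposals guided by a surrogate model of the cross-validation score, evaluating each candidate with 5-fold cross-validation as stated in the notes to Table 2.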
Table 3. Performance comparison of different machine learning models.
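| Model | RMSE | MAE | R² |
| --- | --- | --- | --- |
| Decision Tree (DT) | 1.4003 | 0.6564 | 0.5792 |
| Random Forest (RF) | 1.3342 | 0.6617 | 0.618 |
| Elastic Net | 1.8365 | 1.0151 | 0.2762 |
| LightGBM | 1.4205 | 0.7494 | 0.567 |
| XGBoost | 1.4522 | 0.7145 | 0.5474 |
| Decision Tree + Grid | 1.7026 | 0.8462 | 0.3779 |
| Random Forest + Grid | 1.3435 | 0.6686 | 0.6126 |
| Elastic Net + Grid | 1.8174 | 1.0021 | 0.2912 |
| LightGBM + Grid | 1.4205 | 0.7494 | 0.567 |
| XGBoost + Grid | 1.4269 | 0.7123 | 0.5631 |
| XGBoost + BO | 1.2657 | 0.6521 | 0.6562 |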
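For readers who wish to reproduce the three metrics reported in Table 3, the sketch below implements their standard definitions in plain Python and computes the relative RMSE reduction of XGBoost + BO over default XGBoost from the tabulated values. The `y_true` and `y_pred` vectors are made-up numbers used only to exercise the functions and do not come from the study's data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up values for demonstration only.
y_true = [0.0, 1.0, 2.0, 3.0]
y_pred = [0.5, 1.0, 1.5, 3.0]
print(rmse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))

# Relative RMSE reduction of XGBoost + BO versus default XGBoost (Table 3 values).
rmse_default, rmse_bo = 1.4522, 1.2657
rmse_reduction = (rmse_default - rmse_bo) / rmse_default
print(f"RMSE reduction: {rmse_reduction:.1%}")
```

Using the Table 3 figures, Bayesian optimization lowers the RMSE of XGBoost by roughly 13% relative to the default configuration, while R² rises from 0.5474 to 0.6562.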
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
