Article

How Can Enterprises’ Green Innovation Persist? A Study Based on Explainable Machine Learning

School of Management Science and Engineering, Shanxi University of Finance and Economics, Taiyuan 030006, China
* Author to whom correspondence should be addressed.
Sustainability 2025, 17(22), 10071; https://doi.org/10.3390/su172210071
Submission received: 9 September 2025 / Revised: 26 October 2025 / Accepted: 2 November 2025 / Published: 11 November 2025

Abstract

Based on the strategy tripod framework, this study identifies 27 feature variables that influence the persistence of enterprise green innovation. Using data on Chinese listed enterprises from 2012 to 2022, it employs machine learning models and the SHAP method to analyze the driving factors and their underlying mechanisms. The findings indicate that the persistence of enterprise green innovation results from multiple factors, among which enterprise size, R&D investment, and technological utilization capability are the three most important determinants. Enterprise size has a positive linear effect on the persistence of green innovation, while market competition has a negative linear effect. R&D investment, technological utilization capability, enterprise green culture, financing capacity, and integration capability all show non-linear positive effects. The conclusions provide theoretical guidance and micro-level evidence for promoting high-quality green development in enterprises and supporting governmental policy formulation.

1. Introduction

As global sustainable development strategies receive increasing emphasis, the dual-driver approach of environmental sustainability and technological innovation has emerged as a critical pathway for enterprises to balance economic efficiency with environmental benefits [1]. However, short-term green innovation only brings temporary profits. Therefore, to enhance long-term competitive advantage, enterprises must adopt a strategic framework driven by persistent green innovation [2]. Persistent green innovation reflects the intrinsic connection between an enterprise’s past and current levels of green innovation over time [3], and its importance far exceeds that of input scale and intensity [4]. In this study, we define persistence of enterprise green innovation as the capability of an enterprise to sustainably produce green innovation outcomes and maintain steady growth over a continuous period. Accordingly, important questions persist for both theorists and practitioners: What is the current level of persistence in enterprise green innovation? What factors influence persistent green innovation in enterprises? Which factors are most critical in promoting it?
However, current research remains limited on enterprises’ persistent green innovation. First, existing literature primarily focuses on institutional factors, such as digital policies [2,5], ESG ratings [6], government green attention [3,7], and green finance [8], while paying insufficient attention to resource-related and industry-related factors. Second, current studies often use linear regression to analyze the causal relationship between a single factor and persistent green innovation [6,7]. This approach hinders a comprehensive understanding of the complex drivers behind persistent green innovation. These identified research gaps provide two points of departure for this study. On the one hand, the strategy tripod framework serves as a comprehensive analytical framework that integrates resource, industry, and institution factors, thereby enabling a holistic examination of how different dimensions influence the persistence of enterprise green innovation. On the other hand, machine learning is a practical tool that integrates multiple disciplines. It offers significant advantages for conducting multi-factor comparative analyses and examining interaction effects.
The key characteristics of machine learning are highlighted in the following aspects: First, machine learning does not assume predefined causal relationships among variables. Instead, it identifies potentially relevant variables based on the existing literature, allowing algorithms to uncover complex and hidden patterns within the data. This trait grants the model autonomy, facilitating accurate statistical predictions [9]. Second, machine learning enables the quantitative evaluation of feature importance and enhances the interpretability of model predictions. It provides an empirical basis for testing the reliability of existing theories. Models with strong predictive performance could validate the practicality of these theories, guiding researchers toward developing more explanatory and robust frameworks [10]. Third, machine learning emphasizes performance optimization to improve predictive accuracy on unseen data rather than pursuing a perfect fit within the training set. This approach significantly enhances the model’s generalizability, allowing researchers to rely on it for more precise and reliable predictions [11]. Despite the broad applicability and inherent flexibility of machine learning methods, their interpretability remains challenging. To address this limitation, the SHapley Additive exPlanations (SHAP) method was developed, leveraging game theory to elucidate black-box models and quantify feature contributions [12].
Research on enterprise green innovation remains nascent in its adoption of machine learning, with only a limited number of studies exploring this nexus [13]. These pioneering efforts, however, seldom address critical aspects such as model hyper-parameter tuning, performance benchmarking, and interpretability. In light of this background, the objectives of this study are to employ the strategy tripod framework to identify characteristic factors influencing the persistence of enterprise green innovation from three dimensions: resources, industry, and institutions. Subsequently, this study evaluates the importance ranking and effects of these factors by comparing the performance of seven machine learning models and integrating the SHAP method. The marginal contributions of this study are threefold. In terms of the theoretical contribution, this study enriches the framework’s application context and provides a new theoretical lens for investigating the persistence of green innovation. At the methodological level, this study first expands the scope of application of machine learning and the research paradigm in the management field. Second, this study addresses the “black box” problem in machine learning models by introducing the SHAP method. The findings broaden the methodological approaches to researching enterprise green innovation and provide theoretical guidance and micro-level evidence for promoting high-quality corporate green development and supporting policy-making.

2. Research Framework and Theoretical Analysis

2.1. Strategy Tripod Framework

For a long time, identifying the drivers of corporate strategic behaviour has been a central focus of academic research. As the field of strategic management has evolved, the dominant theoretical perspectives addressing this question have also shifted significantly. In the early 1980s, the industry-based view posited that the external industry environment was the primary shaper of enterprises’ strategies [14]. By the early 1990s, Barney introduced the resource-based view, shifting the focus to the enterprise’s internal environment [15]. This perspective emphasized that inimitable and non-substitutable heterogeneous resources determine the substance and timing of corporate strategies. At the turn of the century, Peng et al. advanced an institution-based view, arguing that formal and informal institutions exert coercive pressure on firms, constraining their strategic decisions and actions [16]. Industry-based, resource-based, and institution-based views have become strategic management’s three key theoretical pillars. While these views are significant for understanding strategic behaviour, they exhibit limitations by largely neglecting their interactions. Acknowledging the complex interplay among managers, organizations, industries, and institutions, Peng et al. integrated these three strands into a coherent strategy tripod framework [17].
Nevertheless, existing theoretical frameworks predominantly adopt unidimensional analytical approaches in examining enterprise strategic behaviour. While such perspectives yield focused insights, they often fail to fully account for the multidimensional interactive mechanisms underpinning strategic decision-making. This theoretical limitation has spurred scholarly interest in developing integrated frameworks capable of systematically elucidating strategic drivers and pathways. Notable exemplars include the Technology–Organization–Environment framework (TOE), the Ability–Motivation–Opportunity framework (AMO), and the strategy tripod framework. The TOE framework contends that strategic choices emerge from dynamic interactions among technological contexts, organizational structures, and external environments [18]. Concurrently, the AMO framework conceptualizes performance outcomes as emanating from the confluence of ability, motivation, and opportunity [19]. Despite their multidimensional analytical utility, both frameworks exhibit limitations in accounting for cross-institutional variations and their differential impacts, thereby revealing a theoretical void regarding institutional dimensions.
In comparison, the strategy tripod framework provides a more systematic and comprehensive analytical lens for examining how firm-specific resources, industry competition, and institutional forces interact to shape strategic decisions. With clearly delineated analytical boundaries, it emphasizes internal resource endowments and capabilities and also divides the external environment along two key dimensions: industry and institutions. The institutional dimension is further categorized into formal and informal institutions, allowing a holistic capture of the complex contextual factors influencing enterprise strategy and behaviour. Consequently, the strategy tripod framework offers a more robust theoretical explanation for the mechanisms underlying sustained enterprise green innovation, thereby establishing a solid conceptual foundation for this research. A concise summary of the framework is presented in Table 1.

2.2. Resource Factors and Persistence of Enterprise Green Innovation

According to the resource-based view, long-term engagement in innovation often requires substantial funding. The stability of funding sources and the adequacy of investment significantly affect the persistence of green innovation [20]. The existing literature generally agrees that strong profitability, financing capacity, and growth potential provide enterprises with the economic foundation and developmental prospects necessary to sustain green innovation activities. These capabilities enable enterprises to maintain essential investments in long-term green research and development, supporting persistent green innovation decisions [21,22]. A strong debt-paying capacity indicates that enterprises can effectively manage their capital chains, maintaining a long-term balance between fundraising and green innovation outputs [23]. Additionally, scholars argue that enterprise fame is a key differentiating resource that helps enterprises gain stakeholder support, secure sufficient funding, and facilitate sustained green innovation [24].
Regarding capital allocation, the abundant resources of large-scale enterprises provide a foundation for funding deployment in green innovation, endowing them with relatively sustainable green core competitiveness and performance [25]. Ample and stable R&D investment drives continuous innovation, offering guarantees and momentum for ongoing green R&D activities. It is considered a critical factor affecting the persistence of enterprise green innovation [26].
The resource-based view also highlights technological capabilities as valuable, scarce, and irreplaceable core resources [27]. Studies suggest that strong capabilities in resource integration, technology utilization, and technology perception could enable enterprises to anticipate and keep pace with the latest trends in green innovation. These capabilities enable enterprises to synergistically leverage external advanced technologies, reduce the difficulty of green research and development, and improve the accuracy of green innovation, thereby ensuring steady and sustained enterprise green innovation [28,29,30]. Furthermore, research has shown that improved data processing capabilities create a favourable digital environment for continuously advancing green innovation. This fosters information sharing and knowledge integration, significantly shortens research and development cycles, and encourages new paradigms of green innovation, laying a solid foundation for sustainability [30,31].
Drawing on the prevailing literature cited above, this study selects 11 characteristic variables to explore the impact of resource factors on the persistence of enterprise green innovation. These variables include profitability, financing capacity, growth potential, debt-paying capacity, enterprise fame, enterprise size, R&D investment, integration capability, technology utilization capability, technology perception capability, and data processing capability.

2.3. Industry Factors and Persistence of Enterprise Green Innovation

According to the industry-based view, the persistence of enterprise green innovation largely depends on the industry and market environment. The existing literature shows that an uncertain industry environment is a significant factor influencing the persistence of enterprise green innovation. Stable and abundant industry environments provide enterprises with ample financing opportunities and create a conducive atmosphere for innovation. This helps enterprises build confidence in achieving expected returns, enabling them to overcome doubts about sustaining green innovation and increase long-term green innovation investments [32,33]. In contrast, in high-growth industry environments, external financing gaps require enterprises to invest considerable effort into resource selection and continuous trial-and-error processes [34]. This exacerbates the risk of rapid core technology degradation, resulting in a heightened fear of innovation failure and diminishing motivation for green innovation.
It is worth noting that different industries possess varying resource bases, environmental challenges, and innovation opportunities, leading to divergent demands and pressures for green innovation [13]. Compared with enterprises in heavily polluting industries, those in non-heavily polluting industries face lower environmental regulatory pressure. They typically encounter fewer difficulties, shorter timeframes, and lower costs in pursuing green innovation, granting them greater flexibility and motivation to innovate [13]. High-tech industries, as leaders in technology application, rely heavily on R&D for value creation. Consequently, enterprises in these sectors usually engage in innovative activities characterized by high knowledge content, high technological intensity, and substantial investments of capital and human resources. Their motivation and commitment to innovation are often stronger [35].
Additionally, some scholars have examined the impact of market environment, identifying market demand and competition as key factors affecting persistent innovation in enterprises [6,36]. Strong market demand helps businesses accumulate the resources necessary for green innovation and ensures potential returns from their innovative activities [37]. Conversely, intense market competition can constrict an enterprise’s operational space, reduce expected returns on research and development, and weaken the willingness to invest in green innovation [38].
Consistent with the literature reviewed above, this study identifies seven key variables to explore the impact of industry factors on the persistence of enterprise green innovation. These variables include industry abundance, dynamism, growth, heavily polluting industries, high-tech industries, market demand, and market competition.

2.4. Institution Factors and Persistence of Enterprise Green Innovation

The institution-based view posits that formal and informal institutions significantly influence the persistence of enterprise green innovation. From the perspective of formal institutions, environmental regulations, ESG ratings, intellectual property protection, and government subsidies play critical roles in affecting the persistence of enterprise green innovation. Some scholars argue that environmental regulations could increase the costs related to pollution governance, which may crowd out funding for green innovation activities [39] and reduce the persistence of green innovation. As a complement to environmental regulation, ESG ratings can enhance decision-making capabilities related to green innovation [40], which promotes persistence in production processes and increases investments in environmental protection [6]. Robust intellectual property protection effectively reduces concerns about free-riding behaviour, alleviating enterprises’ fears regarding investments in green innovation and ensuring sustained commitment to these efforts [41]. Finally, adequate government subsidies can facilitate access to external technological resources and innovation, motivating enterprises to pursue sustained green innovation [5].
Informal institutions also impact the persistence of enterprise green innovation. Factors such as ownership structure, political connections, enterprise culture, investor attention, and media attention play important roles. For instance, state-owned enterprises benefit from greater policy support, face lower risks in conducting green innovation, and possess stronger persistence capabilities [8]. Similarly, enterprises with political connections often gain information advantages and tax perks, enhancing their willingness to engage in long-term green innovation [42]. Furthermore, enterprise culture is embedded in an enterprise as an informal institution, representing a crystallization of its historically established norms, practices, and conventions. A strong internal green culture can foster environmental awareness and innovation among employees and executives, guiding enterprises’ research and decision-making on technology and green products [43]. Additionally, sustained attention from investors and media could influence enterprise green innovation by applying pressure concerning environmental activities and reducing opportunistic behaviour [29,44].
Based on the aforementioned literature, this study identifies nine key variables to explore the impact of institutional factors on the persistence of enterprise green innovation. These variables include environmental regulation, ESG rating, intellectual property protection, government subsidy, ownership structure, political connection, enterprise green culture, investor attention, and media attention.

3. Research Design

3.1. Machine Learning Models

3.1.1. Multiple Linear Regression (MLR)

MLR is a classic supervised machine learning model that captures the stochastic linear relationship between a target variable and multiple predictors. It typically yields more accurate predictions than those based on a single predictor. Owing to its simple structure, fast learning speed, and high interpretability, MLR has been widely adopted in various fields.

3.1.2. Elastic Net (E-Net)

E-Net is designed to tackle the challenges of high-dimensional feature selection in multiple linear regression models. It combines the L1 penalty from Lasso regression and the L2 penalty from Ridge regression. This combination helps reduce multicollinearity among variables while preventing excessive sparsity resulting from overly aggressive coefficient shrinkage. As a result, E-Net allows for continuous coefficient shrinkage and automatically selects correlated features. It often demonstrates superior predictive accuracy, particularly when the number of features greatly exceeds the number of samples or when multiple correlated features are present.
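To make the combined penalty concrete, one common way to write the E-Net objective is sketched below. This parameterization is an assumption for illustration (it follows the convention used by several software packages) and is not taken from the study; $\lambda \ge 0$ controls the overall penalty strength and $\alpha \in [0, 1]$ controls the mix between the L1 and L2 terms.

$$\hat{\beta} = \arg\min_{\beta} \; \frac{1}{2n} \lVert y - X\beta \rVert_2^2 + \lambda \left( \alpha \lVert \beta \rVert_1 + \frac{1 - \alpha}{2} \lVert \beta \rVert_2^2 \right)$$

Under this form, $\alpha = 1$ recovers Lasso regression and $\alpha = 0$ recovers Ridge regression, while intermediate values shrink coefficients continuously and still allow some of them to be set exactly to zero.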

3.1.3. Support Vector Machine (SVM)

SVM is a fundamental model in machine learning theory and has found extensive application in classification and regression tasks. It works by constructing an optimal hyperplane within a high-dimensional feature space to separate data points effectively. By employing optimization techniques, SVM aims to achieve a globally optimal solution, thereby minimizing the risks of local optima and over-fitting. Support Vector Regression (SVR) is an extension of SVM tailored for regression problems that ensures the regression hyperplane is as smooth as possible. Numerous studies have shown that SVM performs exceptionally well in managing high-dimensional data.

3.1.4. Decision Tree (DT)

DT is a supervised machine learning method for classification and regression tasks within high-dimensional data spaces. The core idea is to start from a top-level root node and recursively select the optimal feature to split the data into subsets, thereby minimizing error at each division. This process continues until additional splits no longer significantly improve overall predictive accuracy. DTs are recognized for their clear interpretability and computational efficiency, and they have been widely utilized in the analysis of corporate behaviour, often yielding favourable predictive results.
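The splitting step described above can be illustrated with a minimal sketch for a single numeric feature. It assumes the split criterion is the sum of squared errors around each side's mean, which is a common choice for regression trees; the function names and the toy data are hypothetical and are not drawn from the study.

```python
def sse(values):
    """Sum of squared errors of the values around their own mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, error) for the split `x <= threshold` with the lowest total SSE."""
    pairs = sorted(zip(xs, ys))
    best_threshold, best_error = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical feature values cannot be separated
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        error = sse(left) + sse(right)
        if error < best_error:
            best_threshold, best_error = threshold, error
    return best_threshold, best_error


# Hypothetical toy data: the target jumps when the feature exceeds roughly 6.
xs = [1, 2, 3, 10, 11, 12]
ys = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
threshold, error = best_split(xs, ys)
print(threshold, error)
```

A full tree would apply the same search recursively to each side and across all features, stopping once further splits no longer reduce the error meaningfully.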

3.1.5. Random Forest (RF)

RF is an ensemble model built on decision trees. Its fundamental principle involves repeated bootstrap sampling to grow a diverse collection of weakly correlated decision trees. The final prediction aggregates the outputs of these trees: by majority voting in classification tasks and by averaging the individual tree predictions in regression tasks. The inherent randomness in subsample selection and feature space partitioning allows RF to mitigate certain random errors, making it robust to outliers and noise. This approach effectively addresses the limited generalization of single decision trees while also helping alleviate the risk of overfitting.

3.1.6. Gradient Boosting Decision Tree (GBDT)

GBDT is another ensemble model based on decision trees. Unlike RF, GBDT constructs base regression trees iteratively via gradient boosting. In each iteration, a new tree is fitted to the negative gradient of the loss function with respect to the current predictions, progressively reducing the loss until it converges or a stopping criterion is reached. GBDT has a broad range of applications and offers advantages such as high stability, short computation time, and a small number of algorithmic parameters. It can handle a wide variety of regression problems and often achieves higher accuracy than SVM and RF.
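The boosting idea can be sketched in a few lines under simplifying assumptions: a squared-error loss (for which the negative gradient equals the residual), one-split trees ("stumps") as base learners, and a single feature. All names, the learning rate, and the toy data are hypothetical; this is an illustration of the principle, not the configuration used in the study.

```python
def fit_stump(xs, targets):
    """Fit a one-split regression tree on one feature; return a predict function."""
    pairs = sorted(zip(xs, targets))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        err = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or err < best[0]:
            best = (err, (pairs[i - 1][0] + pairs[i][0]) / 2, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=20, learning_rate=0.3):
    """Boost stumps on squared loss; return (predict, training MSE after each round)."""
    base = sum(ys) / len(ys)          # initial prediction: the mean of the target
    preds = [base] * len(ys)
    stumps, history = [], []
    for _ in range(n_rounds):
        # For the loss 0.5 * (y - p)^2, the negative gradient is the residual y - p.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
        history.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    return predict, history


# Hypothetical non-linear toy data.
xs = list(range(10))
ys = [float(x * x) for x in xs]
predict, history = gradient_boost(xs, ys)
print(history[0], history[-1])
```

Each round shrinks the training error a little further, which is the "progressive reduction of the loss" described above. XGBoost builds on the same principle with additional regularization and engineering optimizations.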

3.1.7. eXtreme Gradient Boosting (XGBoost)

XGBoost is an efficient ensemble model representing an advanced implementation of GBDT. It is widely recognized for its speed, performance, and scalability. The core concept behind XGBoost is to iteratively train new weak learners to correct the errors made by previous learners. Each subsequent learner focuses on predicting the residuals of the earlier ones, leading to gradual improvements in overall model performance. XGBoost shows strong predictive capabilities, effectively handles non-linear relationships, and captures complex interactions among diverse features. In regression tasks, it often outperforms alternative methods such as Random Forest.

3.2. Hyper-Parameter Optimization

Selecting an optimal combination of hyper-parameters before training is essential for maximizing the model’s performance and predictive accuracy. To avoid subjective and arbitrary parameter choices, this study employs a grid search coupled with 5-fold cross-validation. The grid search systematically evaluates every parameter combination within a defined search space. For each combination, the data are partitioned into five subsets; the model is trained on four of them and validated on the remaining one, rotating until each subset has served as the validation set once, and the five validation scores are averaged. The hyper-parameter set yielding the highest average accuracy is then identified as optimal. Detailed parameter settings are shown in Table 2.
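The procedure can be sketched with the standard library alone. The example below is a minimal illustration under stated assumptions: the model is a one-feature ridge regression with a closed-form solution, the score is the mean validation MSE (so the lowest value wins, rather than the highest accuracy), and the folds are formed deterministically without shuffling. The function names, the grid, and the noise-free toy data are hypothetical and do not correspond to the settings in Table 2.

```python
from itertools import product


def fit_ridge_1d(xs, ys, alpha, fit_intercept):
    """Closed-form one-feature ridge regression; returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs) if fit_intercept else 0.0
    mean_y = sum(ys) / len(ys) if fit_intercept else 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)
    return slope, mean_y - slope * mean_x


def mean_squared_error(model, xs, ys):
    slope, intercept = model
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grid_search_cv(xs, ys, grid, k=5):
    """Evaluate every combination in `grid` by its mean validation MSE over k folds."""
    n = len(xs)
    # Interleaved, deterministic folds: fold f holds indices f, f + k, f + 2k, ...
    folds = [list(range(start, n, k)) for start in range(k)]
    results = []
    for combo in product(*grid.values()):
        params = dict(zip(grid.keys(), combo))
        fold_errors = []
        for held_out in folds:
            held = set(held_out)
            train_x = [xs[i] for i in range(n) if i not in held]
            train_y = [ys[i] for i in range(n) if i not in held]
            model = fit_ridge_1d(train_x, train_y, **params)
            valid_x = [xs[i] for i in held_out]
            valid_y = [ys[i] for i in held_out]
            fold_errors.append(mean_squared_error(model, valid_x, valid_y))
        results.append((sum(fold_errors) / k, params))
    best_error, best_params = min(results, key=lambda r: r[0])
    return best_params, best_error


# Hypothetical noise-free data generated from y = 3 + 2x.
xs = [i / 2 for i in range(20)]
ys = [3.0 + 2.0 * x for x in xs]
grid = {"alpha": [0.0, 1.0, 10.0], "fit_intercept": [True, False]}
best_params, best_error = grid_search_cv(xs, ys, grid)
print(best_params, best_error)
```

Because the toy data are exactly linear with a non-zero intercept, the unpenalized model with an intercept is selected. With real data, the cross-validated score typically favours some intermediate amount of regularization.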

3.3. Model Evaluation

Drawing on the work of Yu et al. [11], this study evaluates model performance through two lenses: explanatory power and prediction error. First, the coefficient of determination (R²) serves as a key indicator for assessing the model’s explanatory power. R² reflects the degree to which the predicted values align with the actual observations, with values typically ranging from 0 to 1 (it can fall below 0 on unseen data when a model predicts worse than the sample mean). A value closer to 1 signifies a better fit and indicates stronger explanatory power of the model. The formula for R² is given as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
Second, this study uses the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE) as evaluation metrics to assess the prediction errors of the model. The MAE is the average of the absolute differences between the predicted and actual values, while the RMSE is the square root of the average squared difference, which penalizes large errors more heavily. Smaller values of both MAE and RMSE indicate reduced model error and enhanced prediction accuracy. The respective formulas are as follows:
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$
Here, $i$ indexes the individual samples, $n$ is the number of samples, $y_i$ is the true value, $\bar{y}$ is the mean of the true values, and $\hat{y}_i$ is the value predicted by the model.
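The three metrics above can be computed directly from their definitions. The following is a minimal sketch using only the standard library; the function names and the four-sample toy data are hypothetical and serve only to show the arithmetic.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def mae(y_true, y_pred):
    """Mean of the absolute prediction errors."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of the mean squared prediction error."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical true and predicted values.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
print(r2_score(y_true, y_pred), mae(y_true, y_pred), rmse(y_true, y_pred))
```

In this toy case the residual sum of squares is 2 and the total sum of squares is 20, so R² is 0.9, while the MAE is 0.5 and the RMSE is the square root of 0.5.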

3.4. SHAP

In pursuit of model complexity and predictive accuracy, machine learning methods have largely sacrificed interpretability and transparency, resulting in what are often referred to as “black-box models”. This constitutes one of the major obstacles to their application in predicting corporate behaviour. The SHAP method has been proposed as an alternative approach for further interpreting machine learning models to reveal each feature variable’s importance and mechanism. Its foundational idea originates from the Shapley value in cooperative game theory, which aims to distribute profits or costs fairly among coalition members. In the context of machine learning, SHAP treats each feature as a contributor and quantifies its marginal contribution when incorporated into the model, thereby achieving model interpretability. The SHAP framework can be applied to any model and is considered one of today’s most effective methods for model interpretation. The specific formula is as follows:
$$y_i = \bar{y} + f(x_{i1}) + f(x_{i2}) + \cdots + f(x_{ij})$$
Here, $f(x_{ij})$ denotes the SHAP value of $x_{ij}$, which represents the contribution of feature $j$ in sample $i$ to the prediction $y_i$, and $\bar{y}$ is the baseline value around which the contributions are added. When $f(x_{ij}) > 0$, the feature has a positive effect on the prediction outcome; otherwise, it has a negative effect.
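The additive decomposition can be checked on a small example by computing exact Shapley values through enumeration of all feature coalitions. The sketch below rests on simplifying assumptions: features outside a coalition are replaced by a single reference point (so the baseline is the model output at that point, not an average over a dataset), and the three-feature model is a hypothetical toy function. Practical SHAP implementations rely on faster, model-specific algorithms; this brute-force version only illustrates the principle.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, background):
    """Exact Shapley values for one sample `x`.

    Features absent from a coalition take the value in `background`
    (a single reference point, which is a simplifying assumption).
    """
    n = len(x)

    def value(coalition):
        z = [x[j] if j in coalition else background[j] for j in range(n)]
        return predict(z)

    phi = []
    for j in range(n):
        others = [k for k in range(n) if k != j]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature j.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {j}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical model with one linear term and one interaction term."""
    return 2.0 * z[0] + z[1] * z[2] - 0.5 * z[2]


x = [1.0, 2.0, 3.0]
background = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, background)
baseline = toy_model(background)
print(phi, baseline + sum(phi), toy_model(x))
```

The contributions sum, together with the baseline, to the model's prediction for the sample, which mirrors the formula above. The interaction term is shared equally between the two features involved.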

4. Data Source and Variable Selection

4.1. Data Source

In 2012, China proposed a fundamental development path centred on “green development, low-carbon development, and circular development”, following which enterprise green innovation practices entered a period of rapid diffusion. Therefore, this study selects A-share listed companies in China from 2012 to 2022 as the research sample. Following common practice in the mainstream literature, the data were processed as follows: (1) enterprises in the financial industry were excluded because of their unique financial metrics; (2) enterprises with abnormal listing status, such as ST and *ST, were excluded; (3) samples with missing key variables or obvious outliers were excluded. The final sample comprises 17,561 observations. Micro-level data were obtained from the CSMAR, CNRDS, and HuaZheng ESG Ratings databases, while macro-level data were sourced from the China City Statistical Yearbook and the Statistical Yearbook of the National Intellectual Property Administration. Additionally, to reduce the impact of extreme values, all continuous variables were winsorized at the 1st and 99th percentiles.
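Winsorization at the top and bottom 1% can be sketched as follows. The example assumes percentiles are obtained by linear interpolation between order statistics, which is one of several conventions, and the study does not state which one it used. The function names and the toy data are hypothetical.

```python
def percentile(sorted_vals, q):
    """Percentile for q in [0, 1] by linear interpolation between order statistics."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[hi] - sorted_vals[lo])


def winsorize(values, lower=0.01, upper=0.99):
    """Clip values below the lower percentile or above the upper percentile."""
    ordered = sorted(values)
    low_cut = percentile(ordered, lower)
    high_cut = percentile(ordered, upper)
    return [min(max(v, low_cut), high_cut) for v in values]


# Hypothetical variable: 100 ordinary values plus one extreme observation.
raw = [float(v) for v in range(1, 101)] + [10_000.0]
clean = winsorize(raw)
print(min(clean), max(clean))
```

Unlike trimming, winsorization keeps every observation and only replaces the extreme values with the cut-off values, so the sample size is unchanged.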

4.2. Variable Definition

4.2.1. Label Variable

With reference to the study by Hou et al. [7], this study measures the persistence of green innovation by calculating the chain growth rate of the total number of green patent applications and multiplying it by the scale of green patent output. The specific calculation formula is as follows:
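$$\mathit{Oip}_{it} = \frac{\mathit{Patent}_{it} + \mathit{Patent}_{i,t-1}}{\mathit{Patent}_{i,t-1} + \mathit{Patent}_{i,t-2}} \times \left( \mathit{Patent}_{it} + \mathit{Patent}_{i,t-1} \right)$$
Here, $\mathit{Oip}_{it}$ represents the persistence of green innovation of enterprise $i$ in year $t$, while $\mathit{Patent}_{i,t-2}$, $\mathit{Patent}_{i,t-1}$, and $\mathit{Patent}_{it}$ denote the total number of green patent applications filed by enterprise $i$ in years $t-2$, $t-1$, and $t$, respectively.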
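A short worked example may help in reading the persistence measure. The sketch below evaluates it for one enterprise-year; the function name and the patent counts are hypothetical, and returning `None` when the denominator is zero is an assumption, since the study does not say how such cases are handled.

```python
def green_innovation_persistence(patents_t, patents_t1, patents_t2):
    """Chain growth rate of two-year patent totals times the current two-year total.

    patents_t, patents_t1, patents_t2: green patent applications in years
    t, t-1 and t-2. Returns None when no patents were filed in t-1 and t-2
    (assumption: the measure is treated as undefined in that case).
    """
    current_scale = patents_t + patents_t1
    previous_scale = patents_t1 + patents_t2
    if previous_scale == 0:
        return None
    return current_scale / previous_scale * current_scale


# Hypothetical enterprise with growing output: 1, 4, then 6 applications.
growing = green_innovation_persistence(6, 4, 1)    # (10 / 5) * 10 = 20.0
# Hypothetical enterprise with shrinking output: 6, 4, then 2 applications.
shrinking = green_innovation_persistence(2, 4, 6)  # (6 / 10) * 6
print(growing, shrinking)
```

The measure rewards both growth (a ratio above 1) and scale (a large two-year total), so an enterprise with rising output scores far higher than one with the same patent counts in reverse order.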

4.2.2. Feature Variable

Considering the theoretical analysis and data availability, this study utilizes established practices from existing literature to measure 27 feature variables. The specific variable definitions are provided in Table 3.

4.3. Descriptive Statistics

Table 4 reports the descriptive statistics of the main variables in this study. The label variable, enterprise green innovation persistence, has a maximum value of 159.853, a minimum value of 0.000, a mean of 7.079, and a standard deviation of 22.509. A mean far below the maximum, combined with a standard deviation more than three times the mean, indicates a distinctly right-skewed distribution, with significant disparities in the level of green innovation persistence among enterprises and an overall need for improvement in sustained green innovation. Therefore, it is necessary to investigate further the key factors influencing enterprise green innovation persistence. Additionally, both R&D investment and technology utilization capability exhibit markedly right-skewed distributions. This similarity in distributional shape suggests, although it does not establish, a possible relationship between R&D investment, technology utilization capability, and the persistence of green innovation.

5. Analysis of Empirical Results

5.1. Performance Measure of Machine Learning Models

Seeking explanations from more effective and robust machine learning models is essential for addressing research questions [45]. Table 5 presents the performance of the models on the test set under their optimal hyper-parameter settings. As the results show, ensemble models (RF, GBDT, XGBoost) generally outperform traditional single models (MLR, E-Net, SVM, DT). This suggests that ensemble models can mitigate the potential over-fitting or under-fitting issues associated with a single model, thereby enhancing predictive capability. Among the ensemble methods, XGBoost achieves the highest R² value (0.512), along with the lowest MAE (0.336) and RMSE (0.738), followed by GBDT, which ranks second overall. RF also demonstrates competitive performance, securing the third position among all models. In contrast, among the single models, SVM yields better results than DT, whereas MLR and E-Net show the poorest performance.

5.2. Analysis Based on the SHAP

5.2.1. Feature Importance Analysis

This study introduces the SHAP framework to enhance the transparency and interpretability of machine learning models, while intuitively illustrating how different feature variables influence outcomes. This approach aims to balance the model’s interpretability and accuracy. Specifically, the SHAP method assesses the weight of each feature variable, reflecting their contributions to the prediction results. A higher mean absolute SHAP value for a feature indicates a more substantial influence on the target variable, thus signifying greater importance within the model.
Figure 1 displays the feature importance ranking based on mean absolute SHAP values. The horizontal axis shows the mean absolute SHAP value, while the vertical axis lists the feature variables in descending order of importance. As Figure 1 shows, the most significant features affecting the persistence of enterprise green innovation are enterprise size, R&D investment, technological utilization capability, enterprise green culture, and high-tech industry. Notably, resource-related factors are particularly influential. Enterprise size emerges as the most impactful feature, with a mean absolute SHAP value of 0.17. R&D investment and technological utilization capability follow closely, with their respective values shown in the figure. The high-tech industry feature has a mean absolute SHAP value of 0.06, making it the most important industry-related factor. Additionally, enterprise green culture and market competition are significant institutional factors, with mean absolute SHAP values of 0.07 and 0.04, respectively. In contrast, ownership structure and political connection rank low and play relatively minor roles in the overall evaluation.
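The importance measure described above, the mean absolute SHAP value per feature, can be sketched in a few lines. The SHAP matrix below is invented (three samples, three features labelled with the abbreviations from Table 3) and `rank_by_mean_abs_shap` is a hypothetical helper, so the output only illustrates the calculation and does not reproduce Figure 1.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names):
    """Rank features by the mean of |SHAP| over all samples, largest first."""
    n = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)

# Assumed SHAP values: one row per sample, one column per feature.
feature_names = ["Siz", "Rdm", "Comp"]
shap_rows = [
    [0.20, -0.10, -0.05],
    [-0.15, 0.12, 0.03],
    [0.18, 0.08, -0.04],
]

ranking = rank_by_mean_abs_shap(shap_rows, feature_names)
for name, value in ranking:
    print(name, round(value, 3))
```

Because the absolute value is taken before averaging, a feature that pushes some predictions up and others down still ranks high; the direction of the effect has to be read from the signed values.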

5.2.2. Feature Effect Direction Analysis

To elucidate the influence of various features on the persistence of enterprise green innovation, this study employs the optimized XGBoost model to generate density scatter plots of SHAP values for the samples in the test set. These plots illustrate the distribution of feature values and their relationship with the model’s predictions. As illustrated in Figure 2, each point represents an individual sample, with colour intensity indicating the magnitude of the feature value. Darker colours represent higher values, while lighter colours indicate lower values. Each row corresponds to a specific feature, and the horizontal axis denotes the SHAP value, reflecting the direction and magnitude of the feature’s impact on green innovation persistence. Positive SHAP values signify a positive effect, whereas negative values indicate a negative effect.
The findings presented in Figure 2 reveal that most features contribute positively to green innovation persistence. Among resource factors, enterprise size, R&D investment, technological utilization capability, financing capability, and resource integration capability all exhibit positive SHAP values at higher feature values, suggesting that enhancements in these areas foster green innovation persistence. Concerning industry factors, the high-tech industry and industry growth demonstrate positive effects, while heavily polluting industries and industry dynamism have inhibitory effects. Furthermore, market competition has a negative influence, with greater competition diminishing persistence. In terms of institutional factors, a stronger corporate green culture and government subsidies are positively associated with green innovation persistence, whereas stricter environmental regulations have a negative impact.
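One simple way to summarise the direction that a density scatter plot conveys is to correlate each feature's values with its SHAP values: a positive correlation means that higher feature values tend to push the prediction upward. The sketch below does this in plain Python. The numbers are invented for illustration, and `effect_direction` is a hypothetical helper rather than part of the SHAP library or the authors' code; it assumes that neither input is constant.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient (inputs must not be constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def effect_direction(feature_values, shap_values):
    """Label the overall direction of a feature's effect from the sign of the correlation."""
    r = pearson(feature_values, shap_values)
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "flat"

# Assumed values: enterprise size (Siz) and market competition (Comp) with their SHAP values.
size_direction = effect_direction([20.5, 22.0, 23.5, 25.0], [-0.20, -0.05, 0.08, 0.25])
comp_direction = effect_direction([2.0, 10.0, 20.0, 40.0], [0.05, 0.02, -0.01, -0.06])
print(size_direction, comp_direction)
```

A single correlation sign hides non-linear shapes such as thresholds or inverted U curves, so it complements rather than replaces the dependency plots.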

5.2.3. Feature Dependency Analysis

Dependency analysis can elucidate how the effect of a feature changes across its range of values. In the dependency plots, the horizontal axis represents the range of feature values, while the vertical axis indicates the SHAP values. A SHAP value greater than 0 signifies a positive impact on green innovation persistence, whereas a value below 0 reflects a negative impact. The absolute value indicates the strength of this effect. This study focuses on the ten most significant features.
From the results illustrated in Figure 3a–j, several conclusions can be drawn. Enterprise size demonstrates a consistent linear positive correlation with green innovation persistence. Larger enterprises tend to have higher SHAP values, indicating greater persistence. Market competition exhibits a linear negative correlation. As competition increases, SHAP values decline, resulting in weaker persistence. R&D investment, technological utilization capability, enterprise green culture, financing capability, and resource integration capability show non-linear positive correlations. The SHAP values for R&D investment and resource integration rise with higher values and stabilize after reaching a certain threshold. In contrast, technological utilization capability, enterprise green culture, and financing capability initially remain flat before rising, suggesting potential threshold effects.
Government subsidies display an inverted U-shaped relationship: SHAP values initially increase and then decline slightly. This may suggest that moderate subsidies help alleviate financing constraints and promote innovation, while excessive subsidies could lead to dependency and hinder sustained efforts in green innovation. Notably, significant differences exist across industries. High-tech industries show positive SHAP values, indicating stronger green innovation persistence, while non-high-tech industries often show negative values. Likewise, non-heavily polluting industries show positive effects, whereas heavily polluting industries display negative effects.
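The shapes read off a dependency plot (linear, saturating, threshold, inverted U) can be approximated numerically by sorting samples on the feature value, cutting them into equal-sized bins, and averaging the SHAP values in each bin. The sketch below is a simplified stand-in for the plot under that assumption; the data are invented and `binned_dependence` is a hypothetical helper.

```python
def binned_dependence(feature_values, shap_values, n_bins):
    """Return (mean feature value, mean SHAP value) for each equal-count bin.

    Requires at least n_bins samples; the last bin absorbs any remainder.
    """
    pairs = sorted(zip(feature_values, shap_values))
    size = len(pairs) // n_bins
    curve = []
    for b in range(n_bins):
        start = b * size
        end = len(pairs) if b == n_bins - 1 else start + size
        chunk = pairs[start:end]
        mean_x = sum(x for x, _ in chunk) / len(chunk)
        mean_s = sum(s for _, s in chunk) / len(chunk)
        curve.append((mean_x, mean_s))
    return curve

# Assumed pattern: SHAP rises with the feature and then levels off (a saturating effect).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
s = [-0.2, -0.1, 0.1, 0.2, 0.2, 0.2]
curve = binned_dependence(x, s, n_bins=3)
for mean_x, mean_s in curve:
    print(round(mean_x, 2), round(mean_s, 3))
```

Shrinking increments between successive bins suggest a plateau, near-zero values followed by a rise suggest a threshold, and a rise followed by a fall suggests an inverted U.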

5.2.4. Feature Interaction Analysis

The persistence of green innovation in enterprises is influenced by multiple interacting factors rather than any single element. This study therefore takes the most significant feature from each of the resource, industry, and institutional dimensions and conducts an interaction analysis among them. Specifically, we examine how enterprise size, high-tech industry, and enterprise green culture jointly contribute to the persistence of green innovation.
As shown in Figure 4a, there is a positive correlation between enterprise size and enterprise green culture. As enterprise size increases, the SHAP value associated with green culture also rises, indicating that, at comparable scales, a stronger internal green culture enhances the persistence of green innovation.
As shown in Figure 4b, non-high-tech industries exhibit higher SHAP values in smaller enterprises. Conversely, high-tech industries exhibit higher SHAP values in larger enterprises. This indicates that larger high-tech enterprises display stronger green innovation persistence, likely because innovation-driven high-tech enterprises are more responsive to external changes and, with ample resources, can more effectively pursue green innovation initiatives.
As shown in Figure 4c, high-tech and non-high-tech industries exhibit negative SHAP values when the enterprises’ green culture is lacking. However, high-tech enterprises show heightened persistence in innovation efforts when the green culture is robust. This may result from the hindrance posed by weak green culture on innovation activities. In contrast, a strong culture and technical expertise empower high-tech enterprises to achieve more sustained outcomes in green innovation.
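A coarse numerical counterpart to reading an interaction plot is to average the SHAP value of one feature within groups defined by a second feature. The sketch below groups the SHAP value of the high-tech indicator by a size bucket and by the indicator itself. It is a conditional average, not the SHAP interaction index, and the records, the "small"/"large" bucketing, and the helper name `mean_shap_by_group` are all assumptions made for illustration.

```python
def mean_shap_by_group(records):
    """Average SHAP values per (size_bucket, is_high_tech) group.

    records: iterable of (size_bucket, is_high_tech, shap_value) tuples.
    """
    totals, counts = {}, {}
    for size_bucket, is_high_tech, shap_value in records:
        key = (size_bucket, is_high_tech)
        totals[key] = totals.get(key, 0.0) + shap_value
        counts[key] = counts.get(key, 0) + 1
    return {key: totals[key] / counts[key] for key in totals}

# Assumed records: SHAP value of the high-tech indicator for eight enterprises.
records = [
    ("small", 0, 0.03), ("small", 1, -0.02), ("small", 0, 0.01), ("small", 1, -0.04),
    ("large", 0, -0.05), ("large", 1, 0.08), ("large", 0, -0.03), ("large", 1, 0.06),
]
group_means = mean_shap_by_group(records)
for key in sorted(group_means):
    print(key, round(group_means[key], 3))
```

If the gap between high-tech and non-high-tech enterprises changes sign or size between the buckets, the two features interact; equal gaps would indicate purely additive effects.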

5.2.5. Local Interpretation Analysis

Global interpretation methods facilitate understanding overall model behaviour across the feature space, while local interpretation methods clarify individual predictions. This study utilizes SHAP analysis to illuminate the prediction of green innovation persistence for specific instances. Figure 5a,b illustrate the force plots for two sample enterprises.
In Sample 1, enterprise size emerges as the most significant feature with a positive impact. Additionally, technological integration capability and enterprise fame contribute positively. Conversely, R&D investment, being in a non-high-tech industry, and environmental regulations diminish the predicted persistence. The overall outcome suggests this enterprise’s relatively high level of green innovation persistence.
In Sample 2, enterprise size, belonging to a non-heavily polluting industry, and financing capability function as positive forces. In contrast, factors such as operating in a non-high-tech industry, low R&D investment, market competition, and profitability are negative forces. The combined result indicates this enterprise’s low level of green innovation persistence.
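A force plot rests on the additivity property of Shapley values: the contributions of all features, added to a base value, recover the model's prediction for that sample. The sketch below verifies this on a toy three-feature model by enumerating every coalition. It replaces absent features with a fixed baseline input, which is a simplification and not the tree-based algorithm that would be used with XGBoost; the model, its coefficients, and the input values are invented.

```python
from itertools import combinations
from math import factorial

def exact_shapley(predict, x, baseline):
    """Exact Shapley values for one sample; absent features take baseline values."""
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        phi.append(total)
    return phi

def toy_model(z):
    """Hypothetical model: size and R&D help and interact; competition hurts."""
    size, rd, comp = z
    return 0.5 * size + 0.3 * rd - 0.2 * comp + 0.1 * size * rd

x = [2.0, 1.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = exact_shapley(toy_model, x, baseline)

# Additivity: base value plus all contributions equals the prediction.
print(abs(toy_model(baseline) + sum(phi) - toy_model(x)) < 1e-9)
```

In this toy case the interaction term is split equally between size and R&D, so each positive or negative bar in a force plot already includes that feature's share of any interactions.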

5.3. Robustness Test

5.3.1. Substituting Measurement of Label Variable

This study redefines the label variable to mitigate estimation bias arising from the way the variable is measured. Following the approach of Li and Wei [46], we replace the number of green patent applications with the number of green patents granted. The persistence of green innovation, recalculated based on the number of green patents granted, is used as an alternative label variable for robustness testing. As shown in Figure 6, enterprise size remains the most important feature in predicting the persistence of enterprise green innovation, and the top five feature variables stay largely unchanged. These results support the robustness of the study's findings.
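The statement that the leading features stay largely unchanged can be made quantitative with a top-k overlap between two importance rankings. The sketch below shows one such check; the two rankings are hypothetical orderings written with the abbreviations from Table 3, not the rankings in Figure 1 or Figure 6, and `top_k_overlap` is a hypothetical helper.

```python
def top_k_overlap(ranking_a, ranking_b, k):
    """Share of the top-k features of ranking_a that also appear in the top k of ranking_b."""
    return len(set(ranking_a[:k]) & set(ranking_b[:k])) / k

# Assumed rankings under the original and the alternative label definition.
baseline_ranking = ["Siz", "Rdm", "Tecu", "Grc", "Hti", "Fnc"]
alternative_ranking = ["Siz", "Tecu", "Rdm", "Grc", "Fnc", "Hti"]

overlap = top_k_overlap(baseline_ranking, alternative_ranking, k=5)
same_leader = baseline_ranking[0] == alternative_ranking[0]
print(overlap, same_leader)
```

An overlap close to 1 together with an unchanged leading feature corresponds to the kind of stability the robustness test looks for.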

5.3.2. Adjusting the Sample Split Ratio

Following the methodology of Chen et al. [47], this study changes the training-to-test ratio from 80:20 to 70:30 and 50:50 for robustness testing, and the XGBoost model is retrained accordingly. As shown in Figure 7a,b, enterprise size remains the most significant factor in predicting the persistence of enterprise green innovation after the sample split ratios are adjusted. Moreover, the top five features (enterprise size, technological utilization capability, R&D investment, enterprise green culture, and high-tech industry) remain the same, with only minor shifts in their relative rankings. These findings further confirm the robustness of the study's conclusions.
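The re-splitting step can be sketched as follows, assuming a simple random split with a fixed seed; the source does not state how the split was drawn, and the sample count of 1000 as well as the helper name `split_indices` are placeholders.

```python
import random

def split_indices(n_samples, test_ratio, seed=42):
    """Shuffle sample indices reproducibly and split them into train and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_ratio)
    return indices[n_test:], indices[:n_test]

# The three train-to-test ratios used in the text: 80:20, 70:30 and 50:50.
splits = {ratio: split_indices(1000, ratio) for ratio in (0.2, 0.3, 0.5)}
for ratio, (train, test) in splits.items():
    print(ratio, len(train), len(test))
```

Retraining the model on each split and recomputing the importance ranking then shows whether the leading features depend on how much data is held out.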

5.4. Discussion

Based on the above results, this study addresses several key issues.
First, addressing a practical challenge in enterprise strategic management, this study investigates which factors influence the persistence of enterprise green innovation and how these factors exert their effects. Diverging from the econometric models commonly employed in existing research [2,3,5,6,7,8], this study adopts a machine learning approach. It thereby responds to the call by Liu et al. [9] to “enhance the quality of management research through machine learning” and offers insights into the introduction of new methodologies in the management area.
Second, this study incorporates the strategy tripod framework proposed by Peng et al. [17] into the research domain of enterprise innovation, aiming to address the insufficient attention paid to institutional factors in prior research. This also responds to another call from Liu et al. [9] to “incorporate institutional factors into green innovation prediction studies”.
Finally, the empirical results show that enterprise size, R&D investment, and technological utilization capability are the most prominent factors influencing the persistence of enterprise green innovation. Enterprise size exerts a positive influence on the persistence of enterprise green innovation, whereas market competition exerts a negative one. In addition, R&D investment, technological utilization capability, enterprise green culture, financing capacity, and resource integration capability exhibit non-linear positive effects on the persistence of enterprise green innovation. These results are broadly consistent with the ideas of Gao et al. [48] and Zhao et al. [49]. The impact of technological utilization capability, enterprise green culture, and financing capacity is initially gradual and then progressively intensifies, which is in line with the conclusions of previous studies [22,26,30].
These conclusions have been validated to some extent in existing research. For instance, Sun et al. [25] pointed out that there is a positive correlation between enterprise size and the persistence of innovation. Similarly, R&D investment and technological utilization capability could significantly promote the persistence of enterprise green innovation [26,29], and heightened market competition adversely impacts it [38]. However, although existing research has proliferated, it has generally overlooked a central question: which influencing factors are the most critical for enterprises? Providing a clear answer to this question constitutes the primary objective of this study.

6. Conclusions and Implications

6.1. Conclusions

This study uses Chinese A-share listed companies from 2012 to 2022 as the sample and employs machine learning models alongside the SHAP method to examine the importance ranking and the effects of various features on the persistence of enterprise green innovation. The main conclusions are as follows:
(1) Persistent green innovation has emerged as a central pathway for enterprises to address environmental challenges and achieve sustainable development [3]. However, there is significant variation in the persistence of green innovation among enterprises, and a majority of them struggle to sustain their green innovation efforts.
(2) Twenty-seven feature variables influencing the persistence of enterprise green innovation have been identified within the strategy tripod framework. Empirical testing shows that some of these variables play important roles. Specifically, enterprise size, R&D investment, and technological utilization capability are identified as key determinants. Notably, enterprise size, high-tech industry, and enterprise green culture are the most significant variables within the resource, industry, and institution dimensions, respectively. R&D investment, technological utilization capability, enterprise green culture, financing capacity, and integration capability exhibit non-linear positive effects on the persistence of green innovation.
(3) Among the seven machine learning models evaluated, the ensemble models outperform the traditional single models. Based on this result, an XGBoost-based prediction model for enterprise green innovation persistence has been developed.

6.2. Implications

Based on these empirical insights, this study offers the following recommendations for enterprises and the government. First, enterprises should adopt a philosophy emphasizing persistent green innovation, recognizing its complex, long-term, and systematic nature. Persistence should be integrated into core development strategies to foster synergy between high-quality economic growth and environmental sustainability. Second, enterprises should identify the principal drivers of persistent green innovation and proactively adjust development plans to strengthen innovation resources. In particular, they should increase R&D investment, enhance technological innovation capabilities, and promote stable green R&D initiatives so as to ensure persistent green innovation outcomes while pursuing strategic expansion. Furthermore, enterprises should prioritize fostering a green culture, raising managers’ environmental awareness, and encouraging employee participation to build strong internal commitment to persistent green innovation.
Concurrently, governments should strengthen guidance on internal enterprise governance and optimize the external market and institutional environment. Policymakers should holistically address the key roles of resource, industry, and institutional factors by formulating equitable policies that support the sustainable development of green innovation. This can be achieved by creating a favourable financing environment, prudently regulating market competition, and providing sufficient innovation subsidies, thereby reducing the risks of stagnant green innovation and short-term innovation behaviour. Governments should also strive to build a supportive atmosphere for the persistent advancement of green innovation and offer tailored support for diverse micro-entities. Furthermore, local governments should differentiate among types of enterprises in terms of green innovation persistence, encouraging those not yet engaged in green innovation to make substantial progress while assisting those already involved in consistently improving their persistence. In particular, based on this study’s insights, policy support should be directed toward small-sized enterprises, non-high-tech industries, and heavily polluting industries to incentivize their active adoption of green technologies, innovative management practices, and sustainable production models.

6.3. Limitations and Future Research Directions

First, the enterprise samples selected for this study are entirely drawn from China. As the largest developing country in the world, China possesses noteworthy potential for growth, but findings from a single country may not generalize to other contexts. Therefore, future research could incorporate a comparative perspective, for instance, by comparing the persistence of enterprise green innovation across developing and developed countries, or between enterprises in Asian and European nations, which could yield more practical conclusions. Second, this study is limited to seven machine learning models, which may provide an insufficient basis for comparison. Hence, subsequent research could include a wider array of models and employ more precise hyper-parameter tuning to achieve more accurate results.

Author Contributions

Conceptualization, H.Z. and J.W.; methodology, H.Z. and J.W.; software, J.W.; validation, J.W.; formal analysis, H.Z., J.W. and Y.Y.; data curation, Y.Y.; writing—original draft preparation, H.Z. and J.W.; writing—review and editing, H.Z., J.W. and Y.Y.; visualization, J.W.; supervision, H.Z.; project administration, H.Z.; funding acquisition, H.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Social Science Foundation of China, grant number 24BGL053, and the Shanxi Provincial Strategic Research Project on Science and Technology, grant number 202404030401065.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Albloushi, B.; Alharmoodi, A.; Jabeen, F.; Mehmood, K.; Farouk, S. Total quality management practices and corporate sustainable development in manufacturing companies: The mediating role of green innovation. Manag. Res. Rev. 2023, 46, 20–45.
  2. Tang, L.; Chen, X. Can digital finance alleviate the adverse effects of short-term loans for long-term investments on corporate sustainable green innovation? China Popul. Resour. Environ. 2024, 34, 123–131.
  3. Zhou, Z.; Gao, Y. Local government environmental protection concern and corporate green sustainable innovation levels. Syst. Eng. Theory Pract. 2025, 45, 17–35.
  4. Yu, F.; Fan, X. TMT cognition, industrial regulation and firm innovation persistence. Sci. Res. Manag. 2022, 43, 173–181.
  5. Liu, R.; Hou, M.; Jing, R.; Bauer, A.; Wu, M. The impact of national big data pilot zones on the persistence of green innovation: A moderating perspective based on green finance. Sustainability 2024, 16, 9570.
  6. Shan, H.; Zhao, K.; Liu, Y. ESG performance and the persistence of green innovation: Empirical evidence from Chinese manufacturing enterprises. Multinatl. Bus. Rev. 2025, 33, 268–296.
  7. Hou, L.; Cai, S.; Ma, W.; Wang, Y. How government green procurement enhances the sustainability of corporate green innovation. Sci. Technol. Prog. Policy 2025, 42, 110–120.
  8. Jing, R.; Liu, R. The impact of green finance on persistence of green innovation at firm-level: A moderating perspective based on environmental regulation intensity. Financ. Res. Lett. 2024, 62, 105274.
  9. Liu, J.; Zheng, C.; Hong, Y. How can machine learning empower management research? A domestic-foreign frontier review and future prospects. J. Manag. World 2023, 39, 191–216.
  10. Fang, S.; Cheng, Y.; Liu, Z.; Zhang, J. Chairman and CEO value differences and firm innovation: Empirical evidence from a machine learning approach. China Soft Sci. 2024, S1, 203–222.
  11. Yu, M.; Liu, M.; Zhao, X. What kind of “navigator” does enterprise digital transformation need: An investigation based on machine learning methods. China Soft Sci. 2024, 5, 173–187.
  12. Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS’17), Long Beach, CA, USA, 4–9 December 2017.
  13. Liu, F.; Wang, R.; Fang, M. Mapping green innovation with machine learning: Evidence from China. Technol. Forecast. Soc. Change 2024, 200, 123107.
  14. Porter, M.E. The contributions of industrial organization to strategic management. Acad. Manag. Rev. 1981, 6, 609–620.
  15. Barney, J. Firm resources and sustained competitive advantage. J. Manag. 1991, 17, 99–120.
  16. Peng, M.W. Towards an institution-based view of business strategy. Asia Pac. J. Manag. 2002, 19, 251–267.
  17. Peng, M.W.; Wang, D.Y.L.; Jiang, Y. An institution-based view of international business strategy: A focus on emerging economies. J. Int. Bus. Stud. 2008, 39, 920–936.
  18. Tang, G.; Wang, L.; Zheng, T.; Wu, W. What types of business environment fosters the emergence of more specialized and sophisticated “little giant” enterprises? An empirical study based on the TOE framework and configuration adaptation theory. Manag. Decis. Econ. 2024, 3, 1557–1572.
  19. Li, T.; Zhang, J.; Chen, W. The configuration effect of factors influencing the performance of intelligent transformation in manufacturing enterprises: Based on AMO theoretical analysis framework. Sci. Technol. Manag. Res. 2024, 10, 143–152.
  20. Wang, F.; He, J.; Chen, L. Will interlocking directors with green experience promote quantity increase and quality improvement of enterprise green innovation. China Ind. Econ. 2023, 10, 155–173.
  21. Sun, X.; Zhai, Y. Does profitability affect companies’ R&D decision-making? An empirical research of Chinese listed manufacturing companies. Manag. Rev. 2021, 33, 68–80.
  22. Yu, C.; Wu, X.; Zhang, D.; Chen, S.; Zhao, J. Demand for green finance: Resolving financing constraints on green innovation in China. Energy Policy 2021, 153, 112255.
  23. Wang, H.; Sun, H.; Xiao, H.; Xin, L. “Be cautious” or “Win in danger”? Environmental policy uncertainty and green innovation of pollution-intensive enterprises. Ind. Econ. Res. 2021, 2, 30–41+127.
  24. Yuan, B.; Cao, X. Do corporate social responsibility practices contribute to green innovation? The mediating role of green dynamic capability. Technol. Soc. 2022, 68, 101868.
  25. Sun, B.; Tan, Y.; Yang, X. Firm size and innovation sustainability: A dual-regulation based on firmness and flexibility. Soft Sci. 2024, 38, 101–107+124.
  26. Bai, Y.; Song, S.; Jiao, J.; Yang, R. The impacts of government R&D subsidies on green innovation: Evidence from Chinese energy-intensive firms. J. Clean. Prod. 2019, 233, 819–829.
  27. Chen, S.; Liu, Z.; Zhang, N. The empirical study on influence of enterprise informatization on innovation capability in resource-oriented regions: Based on resource view theory. Soft Sci. 2017, 31, 44–48.
  28. Wang, F.; Liu, X.; Zhang, L.; Cheng, W. Does digitalization promote green technology innovation of resource-based enterprises? Stud. Sci. Sci. 2022, 40, 332–344.
  29. Jiang, Y.; Yao, S. Organizational capital, stakeholder pressure and green innovation of enterprises. Sci. Res. Manag. 2023, 44, 71–81.
  30. Li, B.; Sun, W. Digital business models, dynamic capabilities, and enterprise innovation performance. Econ. Rev. J. 2024, 8, 106–117.
  31. Gu, J. Digital economy, peer influence, and persistent green innovation of firms: A mixed embeddedness perspective. Environ. Sci. Pollut. Res. 2024, 31, 13883–13896.
  32. Ran, Q.; Yang, X.; Ye, L. Digitalization and continuous green innovation: Evidence from A-share listed industrial companies. J. Soochow Univ. (Philos. Soc. Sci. Ed.) 2024, 45, 119–131.
  33. Li, M.; Gao, S. Research on relationships between environmental uncertainty, organizational slack and original innovation. Manag. Rev. 2014, 26, 47–56.
  34. Fu, H.; Yu, B.; Wang, K. Environmental uncertainty, slack resources and corporate strategic change. Sci. Sci. Manag. S.& T. 2018, 39, 92–105.
  35. Cao, Y.; Zhao, T. Whether industry growth reduces corporate tax burden: Evidence from Chinese listed companies. J. Zhongnan Univ. Econ. Law 2018, 2, 14–24+158.
  36. Raymond, W.; Mohnen, P.; Palm, F.; Loeff, S.S. Persistence of innovation in Dutch manufacturing: Is it spurious? Rev. Econ. Stat. 2010, 92, 495–504.
  37. Wang, X.; Tao, F. Does downstream digitalization lead to green innovation in upstream enterprises: Based on the perspective of supply chain spillover. South China J. Econ. 2024, 5, 132–149.
  38. Kang, Z.; Tang, X.; Liu, X. Can you have your cake and eat it too? Market competition, government subsidies and enterprise R&D. World Econ. Pap. 2018, 4, 101–117.
  39. Du, K.; Chen, G.; Liang, J. Heterogeneous environmental regulation, environmental dual strategy and green technology innovation. Sci. Technol. Prog. Policy 2023, 40, 130–140.
  40. Li, M.; Rasiah, R. Can ESG disclosure stimulate corporations’ sustainable green innovation efforts? Evidence from China. Sustainability 2024, 16, 9390.
  41. Xiang, X.; Liu, C.; Yang, M. Who is financing corporate green innovation? Int. Rev. Econ. Financ. 2022, 78, 321–337.
  42. Li, J.; Chen, Z. Institutional advantage transfer: Political connection and business green innovation. Financ. Econ. 2020, 9, 108–120.
  43. Bai, F.; Huang, Y.; Wang, J.; Shang, M. Entering the chamber of orchids: Corporate green culture and green innovation. Foreign Econ. Manag. 2025, 47, 137–152.
  44. Yang, Z.; Chen, J.; Ling, H. Media attention, environmental policy uncertainty and firm’s green technology: Empirical evidence from Chinese A-share listed firms. J. Ind. Eng. Eng. Manag. 2023, 37, 1–15.
  45. Lei, X.; Lin, L.; Xiao, B.; Yu, H. Re-exploration of small and micro enterprises’ default characteristics based on machine learning models with SHAP. Chin. J. Manag. Sci. 2024, 32, 1–12.
  46. Li, X.; Wei, S. Can overseas M&A promote the green innovation level of enterprises? Foreign Econ. Manag. 2024, 46, 106–121.
  47. Chen, Y.; Zhou, J.; Huang, J. How does the generosity of enterprises come? Evidence from machine learning. J. Financ. Econ. 2023, 49, 153–169.
  48. Gao, C.; Shi, X.; Zhang, Y. Corporate innovation culture and audit pricing. Audit. Res. 2023, 6, 123–135.
  49. Zhao, Y.; Qi, N.; Meng, Q. Technology integration capability, green patent quality and firms’ sustainable innovation. Sci. Technol. Prog. Policy 2023, 40, 11–21.
Figure 1. Feature importance.
Figure 2. Feature effect direction.
Figure 3. (a) Enterprise size; (b) R&D investment; (c) technology utilizing capability; (d) enterprise green culture; (e) high-tech industry; (f) financing capacity; (g) integration capability; (h) government subsidy; (i) heavily polluting industry; (j) market competition.
Figure 4. (a) Enterprise size and enterprise green culture; (b) enterprise size and high-tech industry; (c) enterprise green culture and high-tech industry.
Figure 5. (a) Sample 1; (b) Sample 2.
Figure 6. Feature importance (substituting measurement of label variable).
Figure 7. (a) Feature importance (adjusting the sample split ratio, 70:30); (b) feature importance (adjusting the sample split ratio, 50:50).
Table 1. Brief theoretical overview of the strategic tripod framework.
Theoretical Framework | Main Consideration | Author
Resource-based view | The specific resources and capabilities possessed by an organization determine its strategic decisions. | Barney [15]
Industry-based view | The competitive advantage of an enterprise is derived from the conjunction of industry structure and the enterprise’s specific positioning within that industry. | Porter [14]
Institution-based view | Legitimacy and other normative institutional factors constrain an enterprise’s selection of innovation strategies. | Peng et al. [16]
Strategy tripod framework | Factors at the resource, industrial, and institutional levels are interdependent and interact with one another, collectively shaping an enterprise’s innovation strategic choices. | Peng et al. [17]
Table 2. Hyper-parameter optimization of each machine learning model.
Model | Parameters | Value Range
E-Net | alpha | 0.01, 0.1, 1, 10, 100
E-Net | L1-ratio | 0.1, 0.5, 0.7, 0.9
SVM | C | 0.01, 0.1, 1, 10
SVM | Gamma | 0.01, 0.1, 1
DT | max_depth | 2, 5, 7, 10
DT | min_samples_split | 3, 5, 7, 9
DT | min_samples_leaf | 2, 5, 8
RF | n_estimators | 100, 200, 300
RF | max_depth | 2, 5, 7, 10
RF | min_samples_split | 3, 5, 7, 9
RF | min_samples_leaf | 2, 5, 8
GBDT | n_estimators | 100, 200, 300
GBDT | max_depth | 2, 5, 7, 10
GBDT | min_samples_split | 3, 5, 7, 9
GBDT | min_samples_leaf | 2, 5, 8
GBDT | Learning_rate | 0.01, 0.1, 0.5, 1.0
GBDT | Subsample | 0.5, 0.8, 1.0
XGBoost | max_depth | 2, 5, 7, 10
XGBoost | Learning_rate | 0.01, 0.1, 0.5, 1.0
XGBoost | Subsample | 0.5, 0.8, 1.0
XGBoost | min_child_weight | 3, 6, 9, 12
Notes. This article uses Anaconda for experiments, with Python version 3.10.7.
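To give a sense of the size of the search space in Table 2, the sketch below expands the XGBoost value ranges into every parameter combination. It assumes an exhaustive grid search, which the table does not state explicitly; the lowercase parameter keys and the helper name `expand_grid` are choices made for this illustration.

```python
from itertools import product

# XGBoost value ranges copied from Table 2 (keys lowercased for this sketch).
xgboost_grid = {
    "max_depth": [2, 5, 7, 10],
    "learning_rate": [0.01, 0.1, 0.5, 1.0],
    "subsample": [0.5, 0.8, 1.0],
    "min_child_weight": [3, 6, 9, 12],
}

def expand_grid(grid):
    """Return one dict per combination of the listed parameter values."""
    names = list(grid)
    return [dict(zip(names, combo)) for combo in product(*(grid[name] for name in names))]

candidates = expand_grid(xgboost_grid)
print(len(candidates))  # 4 * 4 * 3 * 4 = 192 combinations
print(candidates[0])
```

Each candidate would be fitted and scored on validation data, so the number of combinations multiplied by the number of validation folds gives the total number of model fits.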
Table 3. Definition of variables.
Table 3. Definition of variables.

| Variable Type | Name of Variables | Abbreviation | Definition of Variables |
| --- | --- | --- | --- |
| Label Variable | Persistence of Green Innovation | Oip | Green patent year-over-year growth rate × R&D output scale |
| Feature Variable | Profitability | Prf | Net profit/Total assets |
| | Financing Capacity | Fnc | 1/SA Index |
| | Growth Potential | Egr | Increase in operating revenue/Total operating revenue of the previous year |
| | Debt-Paying Capacity | Dbp | Total current assets/Total current liabilities |
| | Enterprise Fame | Fam | Ln(Number of analysts tracking the enterprise + 1) |
| | Enterprise Size | Siz | Ln(Total assets) |
| | R&D Investment | Rdm | R&D expenditure/Operating revenue |
| | Integration Capability | Itg | Total asset turnover ratio |
| | Technology Utilizing Capability | Tecu | Employees with bachelor’s degree or above/Total employees |
| | Technology Perception Capability | Tecp | Executives with technical background/Total executives |
| | Data Processing Capability | Dgt | Digital intangible assets/Total intangible assets |
| | Industry Abundance | Mab | Measurement method refers to Fu et al. [34] |
| | Industry Dynamism | Mdy | Measurement method refers to Fu et al. [34] |
| | Industry Growth | Igr | Industry Tobin’s Q |
| | Heavily Polluting Industry | Hpi | Heavily polluting industry, 1; otherwise, 0 |
| | High Tech Industry | Hti | High tech industry, 1; otherwise, 0 |
| | Market Demand | Dem | Cost of sales/Average inventory balance |
| | Market Competition | Comp | 1/HHI |
| | Environmental Regulation | Evr | Ln(Number of local environmental regulations) |
| | ESG Rating | Esg | Huazheng ESG rating |
| | Intellectual Property Protection | Ipr | Ln(Number of concluded patent infringement cases in the region) |
| | Government Subsidy | Gos | Ln(Government subsidies) |
| | Ownership Structure | Prs | State-owned enterprise, 1; otherwise, 0 |
| | Political Connection | Poc | Current chairman or general manager has political background, 1; otherwise, 0 |
| | Enterprise Green Culture | Grc | Word frequency of environmental terms in executive sections of enterprise’s annual reports |
| | Investor Attention | Iva | Institutional shareholding/Total shares |
| | Media Attention | Mea | Ln(Number of online + News reports) |

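
Several of the ratio and logarithm definitions in Table 3 can be illustrated with a short sketch. The function names, the dictionary keys, and the sample figures are hypothetical. Computing the HHI from revenue shares within an industry is also an assumption, because Table 3 gives only the transformation 1/HHI and not how the index itself was built.

```python
import math


def herfindahl(industry_revenues):
    """Herfindahl-Hirschman index as the sum of squared revenue shares.

    Assumption: shares are based on operating revenue; Table 3 does not say.
    """
    total = sum(industry_revenues)
    return sum((r / total) ** 2 for r in industry_revenues)


def build_features(firm, industry_revenues):
    """Compute a subset of the Table 3 features for one firm-year (illustrative)."""
    return {
        # Profitability: net profit / total assets
        "Prf": firm["net_profit"] / firm["total_assets"],
        # Growth potential: increase in operating revenue / previous-year revenue
        "Egr": (firm["revenue"] - firm["prev_revenue"]) / firm["prev_revenue"],
        # Debt-paying capacity: current assets / current liabilities
        "Dbp": firm["current_assets"] / firm["current_liabilities"],
        # Enterprise fame: ln(number of tracking analysts + 1)
        "Fam": math.log(firm["n_analysts"] + 1),
        # Enterprise size: ln(total assets)
        "Siz": math.log(firm["total_assets"]),
        # Market competition: 1 / HHI
        "Comp": 1 / herfindahl(industry_revenues),
    }


# Made-up firm-year used only to exercise the formulas.
sample_firm = {
    "net_profit": 50.0,
    "total_assets": 1000.0,
    "revenue": 600.0,
    "prev_revenue": 500.0,
    "current_assets": 400.0,
    "current_liabilities": 200.0,
    "n_analysts": 0,
}
features = build_features(sample_firm, industry_revenues=[600.0, 200.0, 200.0])
print(features)
```

The `+ 1` inside the logarithm for `Fam` keeps the value defined for firms that no analyst tracks, which then map to zero.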
Table 4. Results of descriptive statistics.
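
| Variable | Mean Value | Standard Deviation | Min | Max |
| --- | --- | --- | --- | --- |
| Oip | 7.079 | 22.509 | 0.000 | 159.853 |
| Prf | 0.028 | 0.070 | −0.313 | 0.196 |
| Fnc | 0.258 | 0.164 | 0.220 | 0.316 |
| Egr | 0.292 | 0.670 | −0.681 | 4.193 |
| Dbp | 2.102 | 1.705 | 0.359 | 10.715 |
| Fam | 7.090 | 9.913 | 0.000 | 45.000 |
| Siz | 22.480 | 1.264 | 20.129 | 26.381 |
| Rdm | 4.657 | 4.581 | 0.028 | 26.530 |
| Itg | 0.625 | 0.397 | 0.108 | 2.479 |
| Tecu | 29.622 | 20.568 | 1.723 | 88.384 |
| Tecp | 0.313 | 0.230 | 0.000 | 0.857 |
| Dgt | 0.091 | 0.200 | 0.000 | 1.000 |
| Mab | 0.113 | 0.147 | −0.276 | 0.538 |
| Mdy | 0.049 | 0.042 | 0.005 | 0.239 |
| Igr | 1.301 | 0.769 | 0.133 | 4.667 |
| Hpi | 0.296 | 0.457 | 0.000 | 1.000 |
| Hti | 0.678 | 0.467 | 0.000 | 1.000 |
| Dem | 9.733 | 28.525 | 0.333 | 243.890 |
| Comp | 14.142 | 9.386 | 1.482 | 41.514 |
| Evr | 0.951 | 0.227 | 0.518 | 1.692 |
| Esg | 4.339 | 1.910 | 1.000 | 6.000 |
| Ipr | 6.786 | 1.860 | 1.609 | 9.763 |
| Gos | 16.601 | 1.511 | 11.829 | 20.419 |
| Prs | 0.088 | 0.284 | 0.000 | 1.000 |
| Poc | 0.273 | 0.445 | 0.000 | 1.000 |
| Grc | 2.037 | 0.670 | 0.000 | 3.401 |
| Iva | 42.491 | 23.988 | 0.341 | 90.211 |
| Mea | 4.541 | 1.171 | 1.099 | 7.645 |
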
Table 5. Comparison of the results of the performance measure of machine learning models.

| Metric | MLR | E-Net | SVM | DT | RF | GBDT | XGBoost |
| --- | --- | --- | --- | --- | --- | --- | --- |
| R² | 0.161 (7) | 0.158 (6) | 0.314 (4) | 0.269 (5) | 0.423 (3) | 0.508 (2) | 0.521 (1) |
| MAE | 0.480 (7) | 0.470 (6) | 0.325 (4) | 0.382 (5) | 0.358 (3) | 0.340 (2) | 0.336 (1) |
| RMSE | 0.977 (6) | 0.978 (7) | 0.883 (4) | 0.911 (5) | 0.810 (3) | 0.747 (2) | 0.738 (1) |

Notes. The numbers in parentheses are each model's performance ranking on that metric.
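
For reference, the three performance measures reported in Table 5 can be computed as in the sketch below, which uses their standard textbook definitions. The function names and the toy values are hypothetical, and the sketch makes no claim about how the authors preprocessed the label before scoring.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Toy values, not taken from the article.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
print(r2_score(y_true, y_pred), mae(y_true, y_pred), rmse(y_true, y_pred))
```

A higher R² indicates a better fit, whereas lower MAE and RMSE indicate smaller prediction errors; RMSE penalizes large individual errors more heavily than MAE because the errors are squared before averaging.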

Share and Cite

MDPI and ACS Style

Zhao, H.; Wang, J.; Yuan, Y. How Can Enterprises’ Green Innovation Persist? A Study Based on Explainable Machine Learning. Sustainability 2025, 17, 10071. https://doi.org/10.3390/su172210071
