2.1. Proposal of Research Hypotheses
To systematically reveal the impact mechanism of DT on the SMP of NEV enterprises, this study constructs a structural path model involving three core variables based on industry-specific characteristics and relevant theoretical frameworks. The level of corporate DT is set as the independent variable, the RMO as the mediating variable, and SMP as the dependent variable. The goal is to examine how enterprises enhance organizational responsiveness through technological empowerment to achieve sustainable growth in marketing performance amid rising green consumption demands.
H1: The DT level has a significant positive impact on the SMP of enterprises.
In recent years, digital technologies such as artificial intelligence (AI), big data, cloud computing, and blockchain have been widely applied in business operations of NEV enterprises, including user data collection, product iteration, intelligent network control, and supply chain management [20,21]. These applications not only enhance operational efficiency but also strengthen the ability of enterprises to communicate sustainable value under green consumption trends. For example, energy optimization and carbon footprint visualization help improve green brand image [22,23]. Therefore, enterprises with stronger digital capabilities are more likely to achieve synergistic growth in both long-term performance and environmental responsibility.
H2: The DT level has a significant positive impact on an enterprise’s RMO.
Amid the rapid development of the NEV industry, enterprises must continuously monitor changes in green consumer preferences, policy adjustments, and public opinion trends. DT enables access to more detailed customer behavior data, faster feedback mechanisms, and more flexible strategy adjustment tools [24,25]. More broadly, digital transformation enables organizations to leverage digital technologies for enhanced operational efficiency and strategic decision-making across multiple levels of analysis. For instance, some enterprises use cloud platforms to process user charging behavior data in real time, thereby dynamically optimizing market layouts and marketing strategies [26,27]. Hence, the deep integration of digital technology helps activate enterprises’ perception–judgment–response mechanisms, enhancing their market responsiveness and adaptability.
H3: Responsive market orientation has a significant positive impact on the SMP of enterprises.
RMO reflects an enterprise’s sensitivity and responsiveness to market fluctuations. In the context of the NEV industry, consumer decisions increasingly rely on recognition of green values and transparency across the product lifecycle [28,29]. Enterprises that can identify and respond to these evolving preferences are more likely to earn consumer trust, boost brand loyalty, and improve marketing performance [30]. RMO is not only a cognitive capability but also a strategic execution mechanism driven by values, integrating belief-based orientation with institutional implementation [31,32].
H4: RMO partially mediates the relationship between DT and SMP.
Although DT provides a technological foundation for performance enhancement, its actual impact depends on whether enterprises have internal mechanisms to convert technological advantages into market adaptability, particularly in transforming green supply chains to achieve carbon neutrality [33,34]. RMO serves as a key channel in the transformation from “capability” to “outcome” [35]. For example, even with strong data-processing abilities, enterprises may fail to continuously enhance marketing performance if they lack adjustment mechanisms driven by consumers’ green preferences. Therefore, RMO is likely to play a partial mediating role in the effect of DT on SMP, forming an organizational response cycle of cognition–belief–behavior.
Although this study does not empirically introduce moderating variables into the model, it theoretically recognizes that enterprise scale, market maturity, and policy environment may influence the strength of the relationships among DT, RMO, and SMP. For instance, larger enterprises may possess stronger capabilities in resource allocation and technological transformation. In regions with high levels of policy subsidies, firms are more likely to activate green strategic awareness. Similarly, market maturity may affect the heterogeneity of consumer preferences regarding green products. These contextual variables offer potential directions for future research, particularly through multi-group analysis or interaction modeling, to enhance the model’s generalizability and explanatory power.
This study proposes a three-variable mediating pathway, “DT → RMO → SMP”, which provides theoretical support for investigating the mechanisms linking digital strategy to marketing performance. Future research could refine this framework by decomposing RMO into a three-stage structure of “cognition–behavior–institution,” incorporating time-series data to examine the dynamic evolution of organizational responses, or adopting multilevel models to explore variations across different organizational strata.
In the theoretical analysis of Hypothesis H2, this study also posits the potential existence of nonlinear relationships (such as diminishing marginal effects or stepwise improvements) in the link between DT and RMO. However, given the primary objective of identifying mechanism pathways and verifying their statistical significance, a linear SEM framework is employed to ensure model identifiability and estimation robustness [32]. These limitations are further discussed in the concluding section, along with suggestions for future studies to consider nonlinear or threshold-based modeling approaches.
To clarify the influence of DT on SMP, this study distinguishes between its direct and indirect effects. Within the model, DT exerts both a direct impact on SMP and an indirect effect mediated through RMO. Specifically, DT enhances a firm’s digital capabilities, thereby improving its market responsiveness. This heightened responsiveness enables better strategic execution and market behavior, ultimately enhancing SMP. As to why RMO serves as a partial rather than a full mediator, the study argues that while DT significantly improves RMO, RMO is not the sole channel through which DT influences SMP. Digital transformation also directly contributes to marketing performance, particularly when firms utilize digital technologies to enhance operational efficiency and strengthen brand image. Therefore, RMO is positioned as a partial mediator in the DT–SMP relationship. This assertion is grounded in a theoretical understanding of the complex interactive mechanisms between digital transformation and green marketing transformation.
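The direct and indirect effects described above can be written compactly. The following is a sketch in the conventional mediation notation; the coefficient labels $a$, $b$, and $c'$ and the residuals $e_1$, $e_2$ are assumptions introduced here for illustration, not symbols taken from this study.

```latex
\begin{aligned}
\mathrm{RMO} &= a\,\mathrm{DT} + e_1 \\
\mathrm{SMP} &= c'\,\mathrm{DT} + b\,\mathrm{RMO} + e_2 \\
\text{indirect effect} &= a\,b, \qquad
\text{direct effect} = c', \qquad
\text{total effect} = c' + a\,b
\end{aligned}
```

Under this notation, partial mediation corresponds to both $a\,b \neq 0$ and $c' \neq 0$, whereas full mediation would require $c' = 0$.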
2.2. Model Structure and Path Framework
To verify the theoretical pathway through which DT affects the SMP of NEV enterprises via RMO, this study constructs a SEM framework incorporating three core variables. The SEM framework is composed of two parts: a measurement model and a structural model [32,34].
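In conventional SEM notation, the two parts can be written as

$$y = \Lambda\,\omega + \varepsilon, \qquad \omega = (\xi^{\top}, \eta^{\top})^{\top}$$

$$\eta = B\,\eta + \Gamma\,\xi + \zeta$$

In these equations, $y$ denotes the vector of observed variables; $\xi$ and $\eta$ represent exogenous and endogenous latent variables, respectively; $\Lambda$ is the factor loading matrix; $\varepsilon$ refers to the measurement error; $B$ denotes the structural path coefficient matrix between endogenous variables; $\Gamma$ represents the path coefficient matrix from exogenous to endogenous variables; and $\zeta$ is the structural residual.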
In this model, DT functions as the independent variable, RMO as the mediating variable, and SMP as the dependent variable. The overall structure includes three direct paths and one mediating path. Departing from traditional unidirectional hypotheses such as “DT → SMP,” this study highlights the intermediary role of organizational response mechanisms. It emphasizes how firms convert technological strategies into actual performance through market sensing, value-driven guidance, and adaptive behavior. The model includes the following specific paths:
Path 1: DT → SMP (direct path)
This path captures the direct impact of an enterprise’s digital technology adoption on marketing performance. For instance, AI-powered personalized recommendations for green products can directly increase purchase rates and customer satisfaction.
Path 2: DT → RMO (direct path)
This path reflects how digital capabilities enhance an enterprise’s ability to perceive, interpret, and respond to market changes. For example, cloud-based user behavior analytics can improve responsiveness to evolving green consumer preferences.
Path 3: RMO → SMP (direct path)
This path demonstrates how enhanced market responsiveness leads to improved performance in green markets. RMO enables firms to convert external green value signals into internal strategic actions—such as green brand communication or supply chain adjustments—thereby fostering competitive advantage.
Path 4: DT → RMO → SMP (mediating path)
This path indicates that DT indirectly boosts SMP by enhancing RMO. Based on theoretical assumptions and preliminary empirical findings, this mediation is considered partial, suggesting that DT not only influences SMP through RMO but also has a direct effect.
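The four paths above can be illustrated with a minimal product-of-coefficients sketch. It is a simplification: it uses ordinary least squares on observed composite scores rather than the latent-variable SEM estimated in this study, the data are synthetic, and the function names `ols` and `mediation_effects` are hypothetical.

```python
import numpy as np

def ols(y, X):
    """Return OLS coefficients for y ~ [1, X], intercept first."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

def mediation_effects(dt, rmo, smp):
    """Product-of-coefficients decomposition for DT -> RMO -> SMP."""
    a = ols(rmo, dt)[1]                           # Path 2: DT -> RMO
    coefs = ols(smp, np.column_stack([dt, rmo]))
    direct, b = coefs[1], coefs[2]                # Path 1 (direct) and Path 3
    total = ols(smp, dt)[1]                       # total effect of DT on SMP
    return {"a": a, "b": b, "direct": direct,
            "indirect": a * b, "total": total}    # Path 4 is a * b

# Synthetic data for illustration only (not the study's sample).
rng = np.random.default_rng(0)
n = 430
dt = rng.normal(size=n)
rmo = 0.5 * dt + rng.normal(scale=0.5, size=n)
smp = 0.3 * dt + 0.4 * rmo + rng.normal(scale=0.5, size=n)

effects = mediation_effects(dt, rmo, smp)
print(effects)
```

With OLS on a single sample, the total effect equals the direct effect plus the indirect effect, which is why partial mediation shows up as both components being non-zero.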
To construct a multi-dimensional RMO variable system, this study, building on prior theory, attempts for the first time to extend VBN theory from the individual behavioral level to the organizational level. Originally developed to explain individual environmental behaviors, VBN theory has been shown to be applicable to the behavioral transformation processes of enterprises undergoing green transition, particularly in understanding strategic green marketing orientation drivers from an owner-manager perspective [36]. Specifically, the perception of green responsibility at the organizational level parallels the value cognition of individuals; the construction of a green corporate culture reflects belief formation; and institutional mechanisms such as ESG evaluation and green Key Performance Indicators (KPIs) serve as organizational-level behavioral norms [37,38]. Based on this theoretical translation, this study conceptualizes RMO as a five-dimensional mechanism comprising (1) customer perception responsiveness, (2) green value orientation, (3) belief dissemination, (4) behavioral adjustment, and (5) institutionalized execution. These five interrelated dimensions collectively form an organizational-level “cognition–belief–norm–behavior” feedback loop, which structurally aligns with the logic of the original VBN pathway. As such, they capture the dynamic process through which enterprises transition from external green signal perception to internal strategic response within the context of green marketing transformation.
From this theoretical foundation, two primary contributions emerge. First, on a conceptual level, this study extends VBN theory beyond its original scope, applying it to organizational strategic decision-making in the context of green consumption. In doing so, it proposes an innovative, multi-dimensional closed-loop response model tailored to enterprise transformation behavior, thereby enriching the theoretical interface between RMO and SMP. Second, in terms of mechanism structure, the study emphasizes the dynamic feedback process of “perception–orientation–dissemination–adjustment–institution.” In contrast to traditional sales-oriented market models, this perspective foregrounds the internalization of values and behavioral adaptation processes, offering deeper insights into how organizations embed green strategies.
To improve the practical applicability and explanatory power of the model, this study also accounts for specific characteristics of the NEV industry, such as rapid technological iteration, strong policy dependence, and heterogeneous green consumer preferences. Special attention is paid to the fact that many small- and medium-sized enterprises (SMEs) in this sector face dual constraints in digital transformation and green marketing due to limited resources. Consequently, considerations of operational feasibility and adaptability across enterprise sizes are integrated into sample construction and mechanism design, supporting broader applicability and extrapolation of the model.
It is important to acknowledge the boundary conditions of this study. The model is most relevant for enterprises with a foundational level of digital infrastructure and green awareness. It may not fully apply to micro-enterprises or firms lacking basic transformation capabilities. Nevertheless, by reconstructing VBN theory at the organizational level and clarifying the dimensions of the response mechanism, this study provides both theoretical innovation and empirical validation. These contributions collectively enhance the understanding of enterprise performance generation mechanisms in the dual context of digitalization and green transformation.
The design of variables in this study adheres to the following specifications:
(1) DT is measured using a text-mining approach applied to corporate annual reports. Standardized keyword frequencies, based on terms such as AI, big data, cloud computing, and blockchain, are used to reflect the degree of digital technology adoption by enterprises;
(2) RMO is constructed based on the theoretical framework of VBN theory, capturing the organizational response loop that progresses from perception to belief to behavioral execution;
(3) SMP draws on the ESG framework, and is assessed across four key dimensions: economic benefits, risk control, marketing compliance, and growth sustainability.
While these dimensions originate from broader considerations of corporate performance, they are highly applicable to the green marketing context and possess strong explanatory relevance:
Economic benefits capture whether green marketing efforts have yielded financial gains—for example, through energy-saving cost reductions or increased sales of NEV products.
Risk control evaluates the extent to which green strategies help enterprises mitigate potential external risks, such as regulatory scrutiny, reputational pressures, or NGO activism.
Marketing compliance measures alignment with environmental regulations in branding, labeling, and channel practices. This is particularly critical in light of increasingly stringent oversight of green consumption.
Growth sustainability reflects the firm’s ability to cultivate long-term customer loyalty, brand credibility, and stable revenue streams through its green marketing efforts.
In contrast to traditional marketing performance indicators (e.g., the 4A model, sales-based metrics, or brand awareness), which tend to emphasize short-term consumer response or sales outcomes, the ESG-based framework enables a more comprehensive evaluation of institutional implementation, environmental adaptability, and sustainable value creation. By integrating financial and non-financial indicators, ESG effectively addresses the core components of environmental responsibility (E), social expectations (S), and governance compliance (G), making it especially well-suited for assessing the outcomes of green marketing initiatives.
To tailor ESG to the specific context of green marketing, this study refines its internal structure—focusing on the four dimensions most relevant to marketing performance: economy, risk, compliance, and growth. This adjustment maintains the integrity of the ESG framework while enhancing its contextual relevance.
However, several limitations of the ESG framework are acknowledged. First, SMEs often disclose less ESG-related information, potentially affecting data completeness. Second, attributing specific performance outcomes directly to green initiatives remains empirically complex. These issues are further discussed in the research limitations section. To address them, future studies are encouraged to introduce complementary behavioral indicators—such as green market share growth or improvements in consumer environmental awareness—to enhance the precision and explanatory power of performance evaluation.
Importantly, the ESG framework has already been extensively validated in the fields of green finance and strategic management, and it demonstrates strong feasibility in terms of both sample coverage and theoretical alignment. Therefore, this study adopts ESG as the observational variable system for measuring green marketing performance within the SEM framework—balancing theoretical robustness, empirical operability, and green-oriented expressiveness.
Finally, it is important to note that a nonlinear relationship or threshold effect may exist between the intensity of digital technology adoption and RMO in NEV enterprises. For example, in the early stages of digital transformation, an enterprise’s market response capability may remain underdeveloped. Upon surpassing a critical threshold, however, the enterprise may experience a rapid acceleration in responsiveness. At higher levels of digital investment, diminishing marginal returns may occur. Because this study primarily focuses on identifying mechanisms and verifying path significance, a linear SEM framework is employed to ensure model identifiability and estimation robustness. Future research could explore nonlinear or threshold models to capture these more complex dynamics.
2.3. Variable Description and Indicator System Construction
To ensure scientific rigor, measurability, and theoretical consistency within the SEM framework, this study designs a comprehensive observation indicator system for the three core latent variables—DT, RMO, and SMP—and incorporates control variables to enhance model robustness.
(1) DT
DT reflects the breadth and depth of an enterprise’s application of key digital technologies, such as AI, big data, cloud computing, and blockchain, representing the strategic level of digital adoption [39]. This study employs a text-mining approach to extract standardized word frequencies from corporate annual reports. Specifically, the presence and frequency of relevant digital terms are measured, and an aggregate indicator (“DT5”) is constructed to reflect the relative intensity of digital discourse in managerial narratives.
(2) RMO
Rooted in the VBN theory, RMO captures an enterprise’s dynamic perception, belief formation, behavioral adjustment, and institutionalization capacity in response to evolving green market demands [40,41]. The RMO construct spans five interrelated dimensions, emphasizing the organizational mechanisms that translate digital capabilities into adaptive and sustainable marketing behaviors.
(3) SMP
Drawing on both the ESG framework and marketing performance literature, SMP is operationalized through four indicators reflecting financial and non-financial dimensions. These include the economic efficiency, stability, risk control, and compliance level of an enterprise’s green marketing efforts.
The full indicator system is presented in Table 1.
Considering that an enterprise’s financial structure and growth capacity may significantly influence the implementation and effectiveness of green strategies, this study introduces a set of control variables to account for potential confounding effects. The selected variables are detailed in Table 2.
These control variables are theoretically grounded and widely adopted in the literature related to green strategy, corporate performance, and marketing behavior analysis. They effectively capture enterprises’ resource allocation capabilities and their potential to influence sustainable marketing behaviors.
From the financial structure perspective:
CV1 (Debt-to-Asset Ratio) reflects an enterprise’s capital structure and solvency. As a key indicator of financial stability and risk tolerance, a high leverage ratio may restrict an enterprise’s capacity or willingness to invest in digitalization and green initiatives, thereby hindering sustainable marketing performance.
CV2 (ROE) measures the efficiency of equity utilization and reflects the enterprise’s capability to generate profit from shareholders’ investment. Prior studies have established a strong link between ROE and an enterprise’s strategic engagement in green innovation and market development.
CV3 (Operating Cash Flow Ratio) represents the actual cash generated from core business activities. Sufficient operational cash flow is critical for sustaining long-term investments in green transformation and building flexible, responsive marketing systems.
From the growth capability perspective:
CV4 (Revenue Growth Rate) serves as a direct proxy for enterprise growth potential and agility in market response. High-growth firms typically possess stronger resource input capacity and greater organizational responsiveness to green market demands.
CV5 (Accounts Receivable Ratio) reflects liquidity risk and capital lock-in. Elevated levels may impede enterprises’ ability to respond quickly to green market signals or to invest in related promotional activities.
CV6 (Price-to-Book Ratio) indicates market valuation and investor expectations. A higher P/B ratio generally signals positive market sentiment regarding a firm’s prospects in digitalization and sustainable development, which may translate into greater access to resources for implementing green strategies.
In summary, the control variable framework integrates considerations of both financial robustness and growth dynamics, establishing a theoretically and empirically cohesive model foundation. These variables address the question of whether enterprises possess the financial and organizational capacity to make sustainable investments and execute responsive strategies, which is essential for unbiased estimation of the relationships among DT, RMO, and SMP.
All indicators are treated as first-order reflective constructs, and will be subjected to reliability and validity assessments in subsequent empirical analyses. For text-based indicators, the Term Frequency–Inverse Document Frequency (TF-IDF) algorithm is applied for weighting, while synonym consolidation and semantic filtering are performed using Python to ensure indicator consistency, interpretability, and inter-variable correlation.
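The TF-IDF weighting of text-based indicators can be sketched as follows. This is a minimal illustration under stated assumptions: the keyword list is hypothetical and in English (the study's actual dictionary, synonym consolidation, and semantic filtering are not shown here), the smoothed IDF formula is one common variant, and the function names are invented for this example.

```python
import math
import re
from collections import Counter

# Hypothetical keyword list mirroring the technologies named in the text.
KEYWORDS = ["artificial intelligence", "big data", "cloud computing", "blockchain"]

def keyword_counts(text):
    """Count case-insensitive occurrences of each keyword in one report."""
    lowered = text.lower()
    return Counter({kw: len(re.findall(re.escape(kw), lowered)) for kw in KEYWORDS})

def tfidf_scores(reports):
    """Return one TF-IDF-weighted digital keyword score per report."""
    counts = [keyword_counts(r) for r in reports]
    lengths = [max(len(r.split()), 1) for r in reports]  # guard against empty text
    n_docs = len(reports)
    idf = {}
    for kw in KEYWORDS:
        doc_freq = sum(1 for c in counts if c[kw] > 0)
        # Smoothed IDF (an assumed variant): log((1 + N) / (1 + df)) + 1
        idf[kw] = math.log((1 + n_docs) / (1 + doc_freq)) + 1
    return [
        sum(c[kw] / length * idf[kw] for kw in KEYWORDS)
        for c, length in zip(counts, lengths)
    ]

# Toy inputs for illustration only.
reports = [
    "We invest in big data and cloud computing platforms",
    "Sales grew this year",
    "Blockchain and big data pilots",
]
scores = tfidf_scores(reports)
print(scores)
```

Terms that appear in fewer reports receive a larger weight, so a report mentioning a rarely discussed technology scores higher than one repeating a term that nearly every firm uses.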
It is worth noting that variables such as research and development (R&D) investment, patent counts, and supply chain digital maturity may also influence the implementation path of corporate green strategies. However, these factors are not included in the current study’s control variable system for the following reasons. First, the core objective of this research is to investigate the marketing performance-oriented response mechanism (RMO), constructing a theoretical pathway of “DT → RMO → SMP”. Therefore, the selection of control variables primarily aims to address potential confounding effects arising from the financial capacity and growth potential that shape corporate marketing behaviors [54,55]. Accordingly, the study prioritizes financial structure indicators such as the debt-to-asset ratio and cash flow ratio, which reflect an enterprise’s resource allocation capability. In parallel, growth metrics such as revenue growth rate and accounts receivable ratio are incorporated to control for fundamental operational factors that may influence RMO responsiveness and SMP outcomes. Second, although innovation-related variables such as R&D investment and patent output hold relevance in the context of technology strategy, their applicability in this study is limited. These indicators often exhibit high cross-industry heterogeneity, suffer from inconsistent disclosure standards, and face partial data unavailability, particularly within the NEV sector. These issues limit data accessibility and comparability, and they risk introducing estimation biases into the model. Third, including an excessive number of control variables, particularly those with potential collinearity, may weaken the explanatory power of key paths and reduce model fit robustness in SEM [56]. To mitigate this risk, the current study limits control variable inclusion to six indicators that successfully pass Variance Inflation Factor (VIF) tests, thereby ensuring both model parsimony and structural clarity.
Given these considerations, this study does not incorporate technological innovation indicators as control variables within the SEM framework. The external generalizability of this decision and the associated risks of omitted variable bias are explicitly acknowledged in the research limitations part of the Conclusions section.
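The VIF screening mentioned above can be illustrated with a short sketch. It assumes the standard definition, VIF_j = 1 / (1 − R²_j), where R²_j comes from regressing control variable j on the remaining controls; the cutoff applied in this study is not stated here, and the function name `vif` and the synthetic data are hypothetical.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n_samples x n_vars)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    factors = []
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])   # include an intercept
        beta, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ beta
        r_squared = 1.0 - resid.var() / target.var()
        factors.append(1.0 / (1.0 - r_squared))
    return np.array(factors)

# Synthetic controls for illustration only: three independent columns,
# then a fourth that is nearly a copy of the first.
rng = np.random.default_rng(0)
independent = rng.normal(size=(430, 3))
near_copy = independent[:, 0] + 0.1 * rng.normal(size=430)
collinear = np.column_stack([independent, near_copy])

vif_independent = vif(independent)
vif_collinear = vif(collinear)
print(vif_independent, vif_collinear)
```

Independent controls yield values close to 1, while the nearly duplicated column inflates the factor for both itself and the column it copies.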
To empirically test the proposed “DT → RMO → SMP” path model, this study develops a structured questionnaire-based measurement instrument grounded in the previously established variable system. The overall design adheres to a “theory–indicator–measurement” logic chain and adopts a five-point Likert scale for all items (1 = “completely disagree” to 5 = “completely agree”). Detailed measurement design is presented in Table 3.
To ensure theoretical consistency and empirical validity—particularly for the RMO construct—the measurement scale is developed through a rigorous, multi-step process: theoretical framework formulation → dimension extraction → item drafting → expert consultation and refinement → pilot testing → finalization of measurement instrument.
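As an illustration of how the internal consistency of a multi-item Likert scale might be checked during pilot testing, the sketch below computes Cronbach's alpha. Treating alpha as the reliability statistic is an assumption made here for illustration; the specific reliability measures used in this study are not stated in this section, and the function name and toy responses are hypothetical.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a respondents x items matrix of Likert scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                               # number of items
    item_variances = items.var(axis=0, ddof=1)       # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of summed score
    return k / (k - 1) * (1.0 - item_variances.sum() / total_variance)

# Toy responses on a five-point scale (illustrative, not survey data).
responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 5, 5],
    [3, 3, 2],
    [1, 2, 1],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Items that move together across respondents push alpha toward 1, whereas unrelated items pull it toward 0.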
2.4. Experiment Settings
To empirically investigate the mechanism through which DT affects enterprises’ SMP via RMO, this study selects NEV enterprises in China as the research context. As a representative and dynamic sub-sector within the broader green consumer goods industry, NEV firms demonstrate notable leadership in both digital technology adoption (e.g., AI, big data, cloud computing, and blockchain) and environmental responsibility transformation. Therefore, focusing on this sector allows for a clearer examination of the interaction mechanisms between digital capabilities and green marketing performance. However, the NEV sector possesses certain unique characteristics—such as stronger policy intervention, higher technological concentration, and more integrated supply chain coordination—compared to the general green consumer goods industry. Consequently, the extrapolation of findings should be contextualized within the specificities of the NEV industry.
Sample firms were drawn from A-share listed companies on the Shanghai and Shenzhen Stock Exchanges, encompassing the Main Boards, Science and Technology Innovation Board (STAR Market), and Growth Enterprise Market. The sample includes midstream and downstream enterprises within the NEV value chain, such as complete vehicle manufacturers, core component providers, and intelligent connected system developers.
The screening criteria were as follows:
(1) Industry classification: Firms must be classified under NEV-related subcategories of “Automobile Manufacturing” in the National Economic Industry Classification (2021);
(2) Data completeness: Firms with missing annual reports or critical financial disclosures are excluded;
(3) Time coverage: Only enterprises with continuous disclosure of annual reports and key indicators from 2018 to 2022 are included to ensure longitudinal data consistency;
(4) Business focus: The primary business revenue must predominantly derive from NEV-related operations.
Following the above criteria, 86 eligible listed enterprises were identified, yielding a balanced panel dataset of 430 firm-year observations (86 firms × 5 years).
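The data-completeness and time-coverage criteria can be sketched as a simple filter that keeps only firms observed in every year of the window. The record layout (keys `firm`, `year`, `complete`) and the function name are hypothetical, and the industry and business-focus criteria are assumed to have been applied beforehand.

```python
YEARS = range(2018, 2023)  # 2018 to 2022 inclusive

def balanced_panel(records):
    """Keep firm-year records only for firms complete in every year of YEARS.

    `records` is a list of dicts with hypothetical keys 'firm', 'year', and
    'complete' (True when the annual report and key indicators are available).
    """
    years_by_firm = {}
    for rec in records:
        if rec["complete"] and rec["year"] in YEARS:
            years_by_firm.setdefault(rec["firm"], set()).add(rec["year"])
    keep = {firm for firm, years in years_by_firm.items() if years == set(YEARS)}
    return [
        rec for rec in records
        if rec["firm"] in keep and rec["complete"] and rec["year"] in YEARS
    ]

# Toy records: firm A is complete in all five years, firm B lacks 2020.
records = [{"firm": "A", "year": y, "complete": True} for y in YEARS]
records += [{"firm": "B", "year": y, "complete": y != 2020} for y in YEARS]

panel = balanced_panel(records)
print(len(panel))
```

Applied to the 86 retained firms over the five-year window, a filter of this kind produces the 86 × 5 = 430 firm-year observations reported above.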
To enhance transparency and sample interpretability, this study conducts further descriptive analysis on geographical distribution, enterprise scale, and industrial chain positioning:
(1) Geographical distribution: Sample firms are predominantly located in economically developed eastern and central provinces, including Guangdong, Shanghai, Jiangsu, Zhejiang, and Beijing, regions that serve as strategic hubs for China’s NEV sector, demonstrating significant spatial clustering;
(2) Enterprise scale: The majority of the sample comprises large and medium-sized firms, with an average total asset size of approximately 8.9 billion, indicating strong financial capacity and technological investment potential;
(3) Industrial chain position: About 35% of sampled firms are complete vehicle manufacturers, 45% are core component producers (e.g., electric drives, batteries, electronic control systems), and the remaining portion includes intelligent connected systems and supporting service providers. This reflects a relatively complete and representative industrial chain structure.
It is important to acknowledge that the current sample consists exclusively of listed enterprises, enabling access to standardized, continuous, and publicly disclosed data. However, this may introduce sampling bias, as SMEs are not included. SMEs often differ significantly from listed firms in terms of resource constraints, digital transformation pathways, and policy sensitivity. These limitations are further addressed in the “Conclusions” section, where the scope of external generalizability is clarified.
Moreover, the chosen time window of 2018–2022 reflects a distinct policy and market context. On one hand, this period marks a critical transition phase for the NEV industry, shifting from government subsidy-led growth to a market-oriented development model. It is a time of significant transformation in industrial ecosystems, enterprise behavior, and market dynamics—thus offering valuable empirical insights. On the other hand, this stage is also marked by external shocks such as subsidy phase-outs, capacity restructuring, and the COVID-19 pandemic, which may introduce structural disturbances into the dataset.
To address potential confounding effects, this study controls for variables related to enterprise size, financial structure, and growth capability during model construction and SEM analysis. Furthermore, the broader policy environment and potential impact boundaries are thoroughly discussed in the later Discussion section to contextualize the findings and enhance research robustness.
The DT indicator is constructed based on the frequency and weight of digital-technology-related keywords extracted from corporate annual reports. Textual analysis is conducted using Python v3.12 in a Jupyter Notebook environment (https://jupyter.org/), following a structured pipeline to ensure accuracy and reproducibility. The steps are outlined as follows:
- (1)
Text Preprocessing
All corporate annual report texts are first cleaned by standardizing formats, removing punctuation, and eliminating irrelevant content (e.g., disclaimers, appendices, or repeated headers) to ensure the quality and consistency of input data.
- (2)
Tokenization and Dictionary Customization
Text segmentation is performed using the Jieba Chinese tokenizer, incorporating a custom dictionary to unify and merge synonyms (e.g., “artificial intelligence” and “AI”). This ensures semantic consistency and reduces redundancy in term recognition.
- (3)
Keyword Selection
Based on the prior literature and industry practices [57,58], four core digital technology terms are selected: “AI,” “big data,” “cloud computing,” and “blockchain.” These terms reflect the most relevant and impactful digital capabilities within the NEV industry.
- (4)
Weighting via TF-IDF Algorithm
To account for differences in semantic range and prevalence across firms, the TF-IDF algorithm is applied. This weighting method corrects for term overuse or underrepresentation, adjusting raw frequency values to reflect term distinctiveness and informativeness [43,45].
The general formula for the DT index is defined as

$$DT_j = \sum_{i=1}^{n} w_i \cdot \mathrm{TFIDF}_{ij}$$

Here, $DT_j$ denotes the DT score for firm $j$; $\mathrm{TFIDF}_{ij}$ is the TF-IDF value for keyword $i$ in firm $j$’s report; $w_i$ represents the weight of keyword $i$; and $n$ is the number of selected keywords [57,58].
Specifically, the annual reports of all firms from 2018 to 2022 are used as the reference corpus. For each report, the relative frequency of each keyword is calculated by dividing its count by the total word count of the report, which normalizes across documents of varying lengths [59]. The document frequency (i.e., the proportion of reports containing a given keyword) is then used to compute the inverse document frequency component. The final TF-IDF scores for the four keyword categories are aggregated and standardized to construct the DT index.
- (5)
Standardization
To facilitate comparability across firms, the calculated keyword frequencies are transformed into standard deviation units (z-scores). The final DT score is computed as follows:

$$DT = \sum_{i=1}^{n} w_i \cdot Z_i$$

where $w_i$ represents the TF-IDF weight for the $i$-th keyword category; $DT$ is the enterprise’s final DT index; and $Z_i$ denotes the standardized term frequency of the $i$-th keyword [60,61].
- (6)
Data Processing and Outlier Treatment
All continuous variables, including the DT score, are Winsorized at the 1st and 99th percentiles to mitigate the influence of extreme values. For observations with missing data, industry-year averages or linear interpolation within the same sector and year are used to ensure completeness and reduce sample bias.
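The pipeline above can be sketched in plain Python. This is a hedged illustration under stated assumptions, not the authors' implementation: reports are assumed to be already tokenized (the study uses Jieba with a custom dictionary, so multi-word terms such as “big data” arrive as single tokens), the IDF is taken as the unsmoothed log(N/df), keyword weights are assumed equal, and the aggregated TF-IDF score is standardized across reports, which is one possible reading of steps (4) and (5). All function names and the toy reports are hypothetical.

```python
import math
from statistics import mean, pstdev

KEYWORDS = ["AI", "big data", "cloud computing", "blockchain"]


def tfidf_scores(docs):
    """docs: list of token lists, one per annual report. Returns one dict per report."""
    n_docs = len(docs)
    # Document frequency: number of reports that contain each keyword
    df = {k: sum(1 for d in docs if k in d) for k in KEYWORDS}
    scores = []
    for d in docs:
        row = {}
        for k in KEYWORDS:
            tf = d.count(k) / len(d) if d else 0.0  # relative term frequency
            idf = math.log(n_docs / df[k]) if df[k] else 0.0  # assumed IDF variant
            row[k] = tf * idf
        scores.append(row)
    return scores


def dt_raw(scores, weights=None):
    """Weighted sum of keyword TF-IDF values per report (equal weights assumed)."""
    weights = weights or {k: 1 / len(KEYWORDS) for k in KEYWORDS}
    return [sum(weights[k] * row[k] for k in KEYWORDS) for row in scores]


def zscores(values):
    """Standardize to z-scores using the population standard deviation."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd if sd else 0.0 for v in values]


def winsorize(values, lower_pct=1, upper_pct=99):
    """Clip values to nearest-rank percentiles (integer arithmetic for the ranks)."""
    s = sorted(values)
    n = len(s)
    lo = s[(lower_pct * (n - 1)) // 100]       # floor of the lower rank
    hi = s[-(-(upper_pct * (n - 1)) // 100)]   # ceiling of the upper rank
    return [min(max(v, lo), hi) for v in values]


# Toy tokenized reports (hypothetical)
docs = [
    ["AI", "AI", "vehicle", "battery"],
    ["blockchain", "vehicle", "battery", "sales"],
    ["vehicle", "battery", "sales", "charging"],
]
raw = dt_raw(tfidf_scores(docs))
z = zscores(raw)
dt = winsorize(z)
```

With only three toy reports the percentile bounds coincide with the minimum and maximum, so Winsorization leaves the scores unchanged; clipping only takes effect on samples large enough for the 1st and 99th percentiles to fall inside the range, such as the 430 firm-year observations used here.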