Article

Data Elements Marketization and Corporate Investment Efficiency: Causal Inference via Double Machine Learning

Faculty of Applied Economics, University of Chinese Academy of Social Sciences, Beijing 102488, China
*
Author to whom correspondence should be addressed.
Systems 2025, 13(7), 609; https://doi.org/10.3390/systems13070609 (registering DOI)
Submission received: 11 June 2025 / Revised: 16 July 2025 / Accepted: 17 July 2025 / Published: 19 July 2025
(This article belongs to the Section Systems Practice in Social Science)

Abstract

Amid the rapid development of the digital economy, data elements—emerging as a new type of production factor—are gradually becoming a key resource for enhancing corporate efficiency and promoting high-quality development. The marketization of data elements is also steadily progressing and playing an increasingly important role. Based on data from Chinese A-share listed companies spanning 2007 to 2023, this study systematically evaluates the impact of data element marketization on corporate investment efficiency using a Double Machine Learning approach. The findings reveal that data element marketization significantly improves investment efficiency. Mechanism analysis further demonstrates that such improvement is primarily driven by reduced information dispersion, enhanced risk-bearing capacity, and improved operational efficiency. Heterogeneity analysis indicates that these effects are more pronounced for firms in high-tech industries, high growth potential firms, enterprises located in regions with strong digital infrastructure, and firms experiencing overinvestment problems. This study provides empirical evidence on how the marketization of data elements in China enhances economic outcomes, improving corporate investment decisions, which could serve as a reference for other countries undergoing digital transformation.

1. Introduction

In today’s digital era, data has become a critical strategic asset for organizations seeking competitive advantage and long-term value creation [1,2]. It is increasingly recognized as a new factor of production in modern economies. It is characterized by non-rivalry, high replicability, and inherent non-excludability [3]. These features enable data to play a pivotal role in driving innovation outcomes. Empirical evidence shows that improvements in data-driven innovation efficiency significantly boost innovation output across regions, highlighting data’s strategic importance in fostering high-quality development [4]. The marketization of data elements involves the institutionalization of data trading and circulation through platforms such as data exchanges and big data infrastructures. By clarifying data ownership, improving transaction mechanisms, and eliminating monopolies and data silos, this process enhances the flow and reuse of data, thereby unlocking its value and improving resource allocation efficiency across sectors [5]. The marketization of data elements enhances the allocation and utilization efficiency of data resources within enterprises. By facilitating data circulation and reducing transaction and operational costs, it strengthens firms’ capacity for data-driven transformation and promotes their long-term, sustainable development [6]. On 20 December 2024, the National Data Bureau of China, in conjunction with other government departments, issued the Opinions on Promoting the Development and Utilization of Corporate Data Resources [7]. This marks a new stage in the marketization of data elements in China, driving data elements to play a more significant role in economic development. As data becomes more structured, accessible, and analytically actionable within firms, decision-making processes are undergoing a clear shift. 
Empirical evidence shows that enterprises increasingly rely on data-driven approaches to replace trial-and-error and experience-based judgments, leading to more accurate, timely, and efficient operational outcomes [8]. The application of data strengthens firms’ access to information, reduces decision-making biases, and facilitates more rational capital allocation, ultimately contributing to enhanced investment efficiency [9]. As the core agents of a market economy, enterprises play a pivotal role in promoting national economic transformation. Their investment and innovation behaviors directly influence the allocation efficiency of production factors and serve as essential levers for improving total factor productivity, driving industrial upgrading, and achieving high-quality economic development [10]. Therefore, examining the impact of data element marketization on corporate investment efficiency in the current stage of development holds substantial theoretical and practical significance.
This paper is closely related to two main strands of literature: the economic effects of data element marketization and the determinants of corporate investment efficiency. From the firm-level perspective, existing research suggests that the marketization of data elements enhances firm-level performance by strengthening innovation capacity, improving data utilization efficiency, and alleviating financing constraints, thereby supporting firms’ green governance and sustainable development [11]. Further research indicates that data trading platforms enhance energy efficiency by reducing innovation risk, stimulating innovation activity, and promoting breakthrough technological innovation [12]. In addition, this natural resource-saving effect gradually increases as energy efficiency improves [5]. Specifically, the enhancement of enterprise innovation through data factor marketization operates via four key mechanisms: strengthening enterprise information infrastructure, alleviating financing constraints, stabilizing supplier relationships, and expanding the scale of intangible assets. Additionally, the marketization of data elements alleviates corporate financing constraints by enhancing information transparency and management efficiency [13]. At the regional level, the development of the digital economy enhances innovation efficiency by reducing information costs, promoting interregional knowledge spillovers, and strengthening collaborative innovation networks [14]. It also supports green growth by improving market resource allocation, advancing green technologies, and expanding inclusive digital finance [15]. Collectively, these findings underscore the transformative impact of data element marketization in enhancing enterprise capabilities and promoting regional innovation and sustainable development.
As for the determinants of corporate investment efficiency, a growing body of literature has examined the role of digital economic development, particularly from the perspectives of digital innovation, digital technology, and government digitalization. Prior studies suggest that digital economy development can effectively improve corporate investment efficiency, mainly by inhibiting their overinvestment [16]. At a more granular level, digital innovation contributes to more efficient corporate investment by reducing information asymmetry, enhancing transparency, and mitigating agency conflicts, which in turn facilitates more rational capital allocation and curbs both overinvestment and underinvestment [17]. Digital technology adoption enhances enterprise investment efficiency by reducing information asymmetry and easing financing constraints, with stronger effects observed in firms with higher technological intensity or more concentrated ownership structures [18]. As an integral component of digital technologies, fintech has played a critical role in improving investment efficiency [19]. Government digitalization improves enterprise investment efficiency by enhancing information transparency, reducing policy uncertainty, and expanding firms’ access to financing and public services [20]. The capital market has long been recognized as a traditional mechanism for improving corporate investment efficiency [21]. Given the transformative nature of the big data era, investigating the role of data factor marketization in enhancing corporate investment efficiency has become as important as understanding the contributions of traditional capital markets. However, most existing studies address digitalization in general terms and overlook the distinct role of data element marketization in resource allocation. Furthermore, empirical evidence on the mechanisms and channels through which data element marketization influences corporate investment efficiency remains scarce. 
To address this gap, this paper draws on panel data from Chinese A-share listed firms from 2007 to 2023 and employs a Double Machine Learning approach to examine its impact on investment efficiency.
The contributions of this study are as follows: At the theoretical level, this study focuses on the impact of data element marketization on corporate investment efficiency. It incorporates data factor marketization into the explanatory framework of investment behavior and finds that reduced information dispersion, enhanced risk-bearing capacity, and improved operational efficiency are key channels through which it affects corporate investment efficiency. This enriches the literature on both data element marketization and enterprise investment determinants. At the empirical level, based on panel data of Chinese A-share listed firms from 2007 to 2023, this study applies a Double Machine Learning approach to investigate the impact of data element marketization on investment efficiency. The analysis includes a comprehensive examination of the direction, mechanism, and heterogeneity of this effect. Specifically, heterogeneity is analyzed across industry types, firm growth potential levels, regional digital infrastructure development, and types of investment inefficiency.
The novelty of this study lies in three aspects: first, it introduces corporate investment efficiency as a new micro-level evaluation dimension into the research on data element marketization; second, it constructs a robust causal inference framework using Double Machine Learning to reduce endogeneity and overfitting risks; third, it offers a detailed heterogeneity analysis to reveal differentiated responses among firm types. These innovations provide new insights into how emerging factor markets reshape firm behavior and resource allocation. Given the relevance of data governance reform and digital economy development across developing countries, the findings have significant policy implications for other economies undergoing digital transformation.

2. Theoretical Model and Research Hypothesis

From the perspective of overall digital economy development, as the level of digitalization continues to rise, the transition into the stage of data element marketization is expected to further enhance corporate investment efficiency. In the digital economy, data has become a key production factor. Yet its value is largely captured by a few organizations with the necessary expertise and resources. To advance the digital economy, data marketplaces have emerged, addressing key challenges of data sharing, discovery, and integration to facilitate effective data exchange and utilization, thereby enhancing the value creation potential of data [22]. The development of the digital economy enhances corporate investment efficiency by reducing transaction costs, improving resource allocation, and inhibiting overinvestment [16]. A data market platform enhances data sharing, discovery, and integration through well-designed incentive mechanisms and governance rules. By encouraging data owners to share their data and addressing the discovery and integration challenges for data consumers, enterprise data marketplaces significantly enhance the usability, reusability, and overall utilization efficiency of data within organizations [23]. Data, in turn, enhances decision-making quality by strengthening the forecasting information base, improving the relevance of decision inputs, and enabling real-time integration of business and financial functions. It also optimizes organizational structures and resource allocation, promoting dynamic decision-making. These capabilities collectively improve the accuracy, efficiency, and strategic value of corporate decision-making [24].
From the perspective of data element markets, marketization reduces firms’ search, negotiation, and compliance costs in data acquisition by establishing standardized trading platforms and structured data circulation mechanisms. The marketization of data elements facilitates access to external information resources, which reduces information costs and improves enterprise resource allocation efficiency [25]. Moreover, the substantial fixed costs and capital investment associated with processing and analyzing large volumes of data, coupled with the negligible marginal cost of data distribution and replication, result in pronounced economies of scale [26]. In the context of data marketization, this enables firms to access comprehensive, decision-relevant information at lower cost, thereby enhancing its affordability and strategic value. The development of the data element market has also facilitated the market-based pricing of data assets, making data valuation more transparent and standardized. Clear and well-structured data pricing mechanisms reduce information asymmetry and improve the evaluability of data assets. By enhancing the measurability and transparency of data value, they facilitate more efficient data resource allocation and provide a technical basis for rational decision-making by market participants [27]. Moreover, clearer valuation of data assets has unlocked new avenues for enterprise investment, such as data-backed financing and equity contributions through data. These mechanisms expand financing options, reduce costs, and increase funding flexibility. As a result, they help alleviate financial constraints and create favorable conditions for firms to invest in high-quality projects, ultimately enhancing both the efficiency and quality of investment.
At the level of specific data products, data exchanges such as the Shanghai Data Exchange have built their core business systems around three key dimensions: datasets, data services, and data applications, offering comprehensive data solutions to enterprises. In terms of datasets, exchanges provide highly reliable and complete industrial and enterprise-level datasets through rigorous quality control and standardized processing. The quality of datasets directly affects decision-making outcomes, as high-quality data can minimize erroneous business judgments caused by poor data quality [28]. Verified data resources enable firms to better monitor industry trends and make sound strategic and investment decisions. In terms of data services, professional data service providers leverage advanced technical teams and infrastructure to deliver end-to-end services—including data cleaning, labeling, analysis, and modeling—especially for small and medium-sized enterprises (SMEs) that often lack in-house data processing capabilities. SMEs commonly face challenges such as shortages of technical personnel and limited capital for infrastructure investment. With the continuous advancement of business intelligence and analytics (BI&A) technologies, professional organizations are increasingly enhancing their capabilities in data collection, storage, and analysis by adopting sophisticated BI&A tools and methodologies. This enables them to more effectively address complex, data-intensive tasks in dynamic business environments [29]. This enables firms to outsource data-related functions to these professional organizations, thereby minimizing redundant investments in digital infrastructure, personnel recruitment, and training. As a result, firms can allocate limited resources more efficiently toward product development and market expansion, leading to more targeted investments and improved investment efficiency. 
In terms of data applications, the marketization of data elements provides firms with well-established algorithmic systems and rapid deployment solutions. This allows businesses to bypass prolonged R&D cycles and avoid the high costs associated with trial-and-error experimentation. By directly integrating market-tested data products into their decision-making systems, firms can significantly reduce the cost and complexity of technology adoption. At the same time, they benefit from improved timeliness and accuracy in data analysis. These advantages translate into tangible investment returns and sustainable competitive edges. Empirical studies have shown that adopting mature technologies, such as ERP systems and cloud computing, enhances firms’ competitive positioning by streamlining operations, lowering costs, and enabling more agile responses to dynamic market conditions [30].
Building on the above analysis, we propose the following hypothesis:
Hypothesis 1 (H1). 
The marketization of data elements significantly improves corporate investment efficiency.
Information with potential economic value is not concentrated in the hands of a few individuals. Instead, it is widely distributed among various market participants. This leads to a fundamental issue of information dispersion. A high degree of information dispersion increases the complexity of decision-making processes, diminishes the accuracy and reliability of outcomes, and ultimately hinders organizations from making effective strategic decisions [31]. When investment-related information is fragmented and dispersed, it considerably raises the costs associated with information transmission. This often results in delayed, incomplete, or inaccurate signals, undermining management’s ability to make informed decisions. As a consequence, it becomes increasingly difficult for firms to accurately assess market conditions and effectively identify investment opportunities. This limitation ultimately restricts improvements in investment efficiency, as timely and precise information is a critical driver in optimizing capital allocation and reducing market uncertainties.
In the digital economy, technological advancement serves as a key driver of innovation, but it also exacerbates the challenges associated with volatility, uncertainty, complexity, and ambiguity (VUCA). In this dynamic context, experience-based decision-making alone is increasingly inadequate. Organizations are therefore adopting more adaptive and data-driven approaches, integrating algorithmic analysis and computational models to process large-scale data. This shift enables firms to detect patterns, anticipate disruptions, and make timely, informed decisions amid rapidly evolving conditions [32]. The development of data element markets provides firms with access to a wide array of high-quality industrial and enterprise data that more accurately reflect real market conditions. By reducing information dispersion, such access enables more effective identification of high-return opportunities and improves overall investment decision-making. Furthermore, digital innovation plays a crucial role. The marketization of data elements acts as a key catalyst for digital innovation. Enhancements in a firm’s technological capabilities strengthen its information processing and managerial coordination capacity [33]. This advancement reduces information uncertainty and dispersion, enabling firms to make more informed investment decisions and thereby improving overall investment efficiency. As data element marketization advances, the diffusion and application of digital technologies and data services further enhance firms’ innovation capabilities while reducing the fragmentation of information. This dual mechanism—comprising data-driven insights and technological enablement—constitutes a causal pathway whereby the marketization of data elements mitigates information dispersion faced by firms through both data resources and digital technologies, ultimately enhancing corporate investment efficiency.
Accordingly, we propose the following hypothesis:
Hypothesis 2 (H2). 
The marketization of data elements improves corporate investment efficiency by reducing information dispersion.
Firms with greater risk-bearing capacity are better able to identify, undertake, and execute high-return investment projects in complex and uncertain environments, thereby enhancing investment efficiency. Research has shown that higher risk-bearing capacity allows firms to manage project risks more effectively by coordinating organizational, contractual, and financial mechanisms within project governance structures, ultimately improving project outcomes and investment returns [34]. In the context of the digital economy, effective utilization of data elements can enhance firms’ ability to bear risk, thereby boosting investment efficiency. The marketization of data elements further strengthens this process.
Specifically, data element marketization promotes the widespread opening of public data, which significantly enhances firms’ risk-bearing capacity. On the one hand, by analyzing public and industry datasets, firms can enhance their market insight and adaptive capacity [35]. This allows firms to make more precise, data-driven decisions under uncertainty, thereby enhancing their risk-bearing capacity. On the other hand, Open Government Data serves as a catalyst for enhanced information disclosure and decision-making transparency among enterprises and government agencies by reinforcing external oversight and accountability mechanisms [36]. As a result, firms are encouraged to behave more rationally and prudently when making high-risk decisions, which enhances their capacity to manage and absorb risk, and ultimately improves the efficiency and resilience of their investment activities. In addition, cost management plays a critical role in enterprise risk management by supporting risk mitigation strategies [37]. Excessive investment costs increase financial pressure and constrain risk-taking capacity, ultimately hindering investment efficiency. The marketization of data elements helps firms obtain accurate price information for key inputs such as raw materials, labor, and technology, enhancing the precision of cost estimation and enabling more effective cost control across all stages of the investment process. Furthermore, data element marketization alleviates information asymmetries between data suppliers and users, reduces information search and trial-and-error costs in investment decision-making, and further improves both risk-bearing capacity and investment efficiency.
Accordingly, we propose the following hypothesis:
Hypothesis 3 (H3). 
The marketization of data elements improves corporate investment efficiency by enhancing risk-bearing capacity.
Operational efficiency refers to a firm’s ability to maximize output and minimize waste within given resource constraints by optimizing management, streamlining processes, and controlling costs. Improvements in operational efficiency reduce marginal production costs, improve capital utilization, and accelerate cash flow turnover—thereby creating greater capacity for new investment projects and enhancing overall investment efficiency. Data-driven practices have been shown to significantly improve operational efficiency, serving as a key enabler for more effective investment decisions [38]. The marketization of data elements plays a supportive role in enabling data-driven improvements in operational efficiency. From an internal perspective, data element marketization empowers firms to undergo a profound digital transformation, which strengthens control over internal processes. Enterprises enhance internal control through the integration of internal data analysis and emerging technologies [39]. This integration helps firms monitor operations more effectively, simplify workflows, and improve coordination between departments. With data analytics applied to daily tasks, enterprises can make decisions more quickly and reduce unnecessary steps, which leads to better operational efficiency. As data element marketization continues, it further supports these improvements in internal processes and digital management. From an external perspective, data element marketization improves firms’ responsiveness to changing market conditions by facilitating the integration of external resources. Research has shown that data-driven supply chain capabilities significantly enhance firms’ responsiveness to external fluctuations, thereby improving their ability to adapt to dynamic market environments [40]. 
Moreover, big data integration and real-time analytics foster high-efficiency collaboration across firms and along the value chain, thereby improving inter-organizational efficiency and enhancing the timeliness of joint decision-making [41]. These developments ultimately contribute to improved investment efficiency.
Accordingly, we propose the following hypothesis:
Hypothesis 4 (H4). 
The marketization of data elements improves corporate investment efficiency by enhancing operational efficiency.
The theoretical derivation and hypothetical relationships of this research are summarized in Figure 1.

3. Methodology

3.1. Model Specification

Traditional approaches to causal inference often rely on strict model specifications and linearity assumptions. These methods are susceptible to the “curse of dimensionality” and model misspecification, particularly when dealing with high-dimensional data. To address these challenges, Chernozhukov et al. (2018) [42] proposed the Double Machine Learning approach, which relaxes the linearity assumptions among variables while enhancing the flexibility and predictive accuracy of non-parametric estimation. Because the explained variable, corporate investment efficiency, is not directly observable and is influenced by a large number of complex, interacting factors, traditional econometric models are often insufficient to fully capture its underlying mechanisms. Therefore, this study adopts the empirical strategy proposed by Chernozhukov et al. [42], constructing an empirical model within the Double Machine Learning framework to systematically evaluate the impact and pathways of data element marketization on corporate investment efficiency.
Specifically, we begin by formulating a partially linear Double Machine Learning model and a series of auxiliary regressions:
$$\mathit{Inv}_{it} = \theta_0 \, \mathit{Data}_{it} + g(X_{it}) + U_{it} \tag{1}$$
$$E[U_{it} \mid \mathit{Data}_{it}, X_{it}] = 0 \tag{2}$$
In the model above, $\mathit{Inv}_{it}$ denotes corporate investment efficiency. $\mathit{Data}_{it}$ represents the treatment variable, indicating whether the firm’s local region implemented data element marketization in a given year. $X_{it} = (X_{it,1}, X_{it,2}, \ldots, X_{it,p})$ denotes a set of high-dimensional control variables. $g(X_{it})$ captures the true, potentially nonlinear impact of these control variables and is estimated using machine learning algorithms, yielding the estimate $\hat{g}(X_{it})$. $U_{it}$ is the error term, which has a conditional mean of zero given the treatment and control variables. The key parameter of interest, $\theta_0$, reflects the treatment effect of data element marketization on corporate investment efficiency.
When directly estimating this model, a key challenge arises from the regularization bias introduced by machine learning algorithms, which typically employ penalization techniques. If the function $\hat{g}(X_{it})$ is estimated via regularized machine learning methods, the resulting bias may contaminate the estimate of the treatment effect, leading to inaccurate inference. To address this issue, we construct an auxiliary regression for the treatment variable $\mathit{Data}_{it}$ as follows:
$$\mathit{Data}_{it} = m(X_{it}) + V_{it} \tag{3}$$
$$E[V_{it} \mid X_{it}] = 0 \tag{4}$$
In this step, the function $m(X_{it})$ is also estimated using machine learning models. The residual $V_{it}$ isolates the component of the treatment (data element marketization) that is orthogonal to the control variables $X_{it}$. This procedure helps mitigate the influence of first-stage model estimation errors on the final causal effect estimate. As a result, it ensures that the estimator remains unbiased and asymptotically normal under standard regularity conditions.
To mitigate the overfitting bias that arises when the same sample is used for both model training and prediction, and to ensure the unbiasedness of the estimator, $\hat{g}(X_{it})$ and $\hat{m}(X_{it})$ are estimated with a K-fold cross-fitting procedure. Specifically, the full sample is randomly divided into $K$ folds. For each fold $k$, we do not use the observations within that fold to train the machine learning models. Instead, we train the models for $\hat{g}(X_{it})$ and $\hat{m}(X_{it})$ on the remaining $K - 1$ folds, and then use the trained models to compute residuals on fold $k$.
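After calculating the residuals across all folds, we pool the residualized outcome $\mathit{Inv}_{it} - \hat{g}(X_{it})$ and the residualized treatment $\hat{V}_{it} = \mathit{Data}_{it} - \hat{m}(X_{it})$ and estimate the treatment effect: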
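$$\hat{\theta}_0 = \frac{\frac{1}{n} \sum_{i \in I,\, t \in T} \hat{V}_{it} \left( \mathit{Inv}_{it} - \hat{g}(X_{it}) \right)}{\frac{1}{n} \sum_{i \in I,\, t \in T} \hat{V}_{it} \, \mathit{Data}_{it}} \tag{5}$$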
Equation (5) can also be expressed as follows:
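$$\sqrt{n} \, (\hat{\theta}_0 - \theta_0) = \frac{1}{E[V_{it}^2]} \left( \frac{1}{\sqrt{n}} \sum_{i \in I,\, t \in T} V_{it} U_{it} + \frac{1}{\sqrt{n}} \sum_{i \in I,\, t \in T} \left[ m(X_{it}) - \hat{m}(X_{it}) \right] \left[ g(X_{it}) - \hat{g}(X_{it}) \right] \right) \tag{6}$$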
In this expression, the first term, $\frac{1}{E[V_{it}^2]} \frac{1}{\sqrt{n}} \sum_{i \in I,\, t \in T} V_{it} U_{it}$, is asymptotically normal with mean zero under standard regularity conditions. The second term, $\frac{1}{E[V_{it}^2]} \frac{1}{\sqrt{n}} \sum_{i \in I,\, t \in T} [m(X_{it}) - \hat{m}(X_{it})][g(X_{it}) - \hat{g}(X_{it})]$, captures the product of the estimation errors from the two machine learning stages of the Double Machine Learning procedure. Because $\hat{m}(X_{it})$ and $\hat{g}(X_{it})$ converge at rates $n^{-\varphi_m}$ and $n^{-\varphi_g}$, respectively, the product term is $O_p(n^{-(\varphi_m + \varphi_g - 1/2)})$. As long as $\varphi_m + \varphi_g > 1/2$, this term is $o_p(1)$, so $\sqrt{n}(\hat{\theta}_0 - \theta_0)$ is dominated by the first term and $\hat{\theta}_0$ converges faster than a direct estimate that does not orthogonalize the treatment, which ensures that $\hat{\theta}_0$ is asymptotically unbiased.
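The cross-fitting procedure and the estimator in Equation (5) can be sketched in a few lines of Python. This is a minimal illustration on simulated data, not the implementation used in the study: it assumes a single control variable on [0, 1), uses a simple bin-mean learner as a stand-in for the machine learning models, and forms $\hat{g}$ as $\hat{l} - \tilde{\theta}\hat{m}$, where $\hat{l}$ estimates $E[\mathit{Inv} \mid X]$ and $\tilde{\theta}$ is a preliminary estimate from the training folds. The function names, the simulated data-generating process, and that construction of $\hat{g}$ are all assumptions made for the example.

```python
import math
import random


def bin_mean_learner(x_train, y_train, n_bins=20):
    """Fit the mean of y within equal-width bins of x on [0, 1); return a predictor."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(x_train, y_train):
        b = min(int(x * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    overall = sum(y_train) / len(y_train)  # fallback for empty bins
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda x: means[min(int(x * n_bins), n_bins - 1)]


def dml_plr(y, d, x, n_folds=5, seed=0):
    """Cross-fitted estimate of theta in y = theta * d + g(x) + u."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    num = den = 0.0
    for k, held_out in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        xt = [x[i] for i in train]
        yt = [y[i] for i in train]
        dt = [d[i] for i in train]
        l_hat = bin_mean_learner(xt, yt)  # estimates E[y | x]
        m_hat = bin_mean_learner(xt, dt)  # estimates E[d | x]
        # Preliminary theta from the training folds (assumption of this sketch).
        v_tr = [di - m_hat(xi) for di, xi in zip(dt, xt)]
        r_tr = [yi - l_hat(xi) for yi, xi in zip(yt, xt)]
        theta_prelim = sum(v * r for v, r in zip(v_tr, r_tr)) / sum(v * v for v in v_tr)
        # Residuals are computed only on the held-out fold.
        for i in held_out:
            v_hat = d[i] - m_hat(x[i])
            g_hat = l_hat(x[i]) - theta_prelim * m_hat(x[i])
            num += v_hat * (y[i] - g_hat)
            den += v_hat * d[i]
    return num / den


# Simulated data: binary treatment whose probability depends on x,
# nonlinear confounding through sin(2*pi*x), true effect theta0 = 1.0.
rng = random.Random(42)
theta0 = 1.0
n = 4000
x = [rng.random() for _ in range(n)]
d = [1.0 if rng.random() < 0.2 + 0.6 * xi else 0.0 for xi in x]
y = [theta0 * di + math.sin(2 * math.pi * xi) + rng.gauss(0.0, 0.5)
     for di, xi in zip(d, x)]

theta_hat = dml_plr(y, d, x)
print(f"true effect: {theta0}, cross-fitted estimate: {theta_hat:.3f}")
```

In an applied setting the bin-mean learner would be replaced by a flexible learner suited to high-dimensional controls, and standard errors would be computed from the same residuals.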

3.2. Variable Selection

3.2.1. Dependent Variable: Corporate Investment Efficiency

This paper follows the method proposed by Richardson to measure corporate investment efficiency [43]. The specific calculation is as follows:
$$\mathit{Invest}_{i,t} = \alpha_0 + \alpha_1 \mathit{Tobinq}_{i,t-1} + \alpha_2 \mathit{Cash}_{i,t-1} + \alpha_3 \mathit{Age}_{i,t-1} + \alpha_4 \mathit{Size}_{i,t-1} + \alpha_5 \mathit{Return}_{i,t-1} + \alpha_6 \mathit{Invest}_{i,t-1} + \varepsilon_{i,t} \tag{7}$$
Investment expenditure ($\mathit{Invest}$) is calculated based on data from the cash flow statement. Specifically, it is defined as the net value of “cash paid for the purchase and construction of fixed assets, intangible assets, and other long-term assets” plus “net cash paid for acquiring subsidiaries,” minus “net cash received from the disposal of long-term assets and subsidiaries,” all scaled by beginning total assets.
Following this, a regression equation is constructed incorporating the firm’s Tobin’s Q ($\mathit{Tobinq}$), cash holdings ($\mathit{Cash}$), firm age ($\mathit{Age}$), asset size ($\mathit{Size}$), stock return ($\mathit{Return}$), and lagged investment expenditure ($\mathit{Invest}_{i,t-1}$). The absolute value of the residual from this regression is used to measure investment efficiency, where a higher absolute residual indicates a greater deviation from the expected level of investment, that is, a lower level of investment efficiency.
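The residual-based measure can be illustrated with a short Python sketch. It fits the expected-investment regression by ordinary least squares and returns the absolute residuals. The toy data and the function names are hypothetical, and only two lagged regressors are used for brevity, whereas the specification above has six. In Richardson’s framework a positive residual is read as overinvestment and a negative one as underinvestment.

```python
def ols_fit(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[a] * row[b] for row in X) for b in range(k)]
         + [sum(row[a] * yi for row, yi in zip(X, y))]
         for a in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def investment_inefficiency(invest, lagged_regressors):
    """Return (absolute residuals, signed residuals) from the expected-investment regression."""
    X = [[1.0] + list(row) for row in lagged_regressors]  # prepend the intercept
    beta = ols_fit(X, invest)
    residuals = [yi - sum(b * v for b, v in zip(beta, row))
                 for yi, row in zip(invest, X)]
    return [abs(e) for e in residuals], residuals


# Hypothetical firm-year observations: (lagged Tobin's Q, lagged investment).
lagged = [(1.2, 0.05), (0.8, 0.03), (1.5, 0.08), (1.1, 0.04), (0.9, 0.06), (1.3, 0.02)]
invest = [0.06, 0.03, 0.09, 0.05, 0.04, 0.07]

inefficiency, residuals = investment_inefficiency(invest, lagged)
for abs_e, e in zip(inefficiency, residuals):
    label = "overinvestment" if e > 0 else "underinvestment"
    print(f"|residual| = {abs_e:.4f} ({label})")
```

A larger absolute residual corresponds to lower investment efficiency, so estimated effects on this measure are interpreted with the sign reversed.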

3.2.2. Independent Variable: Data Element Marketization

This paper adopts a dummy variable approach to measure data element marketization, drawing on established methodologies in prior research [13]. Specifically, the binary variable Data is assigned a value of 1 for city i in the year its data element trading platform was established and in all subsequent years; otherwise, it is set to 0. The theoretical rationale behind this measure lies in the fact that the establishment of a data trading platform signifies a shift from the internal, closed use of data within organizations to the institutional transformation of data into a tradable economic asset. This transformation, through the design of platform infrastructure, fosters the construction of market mechanisms, the formulation of regulatory standards, and improvements in trading efficiency [44]. As such, it represents a critical institutional node in the marketization of data elements.
The data on the establishment of data trading platforms at the city level is primarily derived from the White Paper on Big Data. Since the study sample spans the period from 2007 to 2023, and the white paper provides information only up to 2021, we extended the dataset by tracking the progress of projects labeled as “under construction” in the white paper. We systematically collected the actual launch dates of these platforms between 2021 and 2023, thereby expanding the measurement to cover the full sample period. The specific establishment times in each area are shown in Figure 2.
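A minimal sketch of how such a staggered treatment dummy could be built is given below. It assumes that the dummy switches on in the platform's launch year and remains on afterwards; the city names, launch years, and the function name `data_dummy` are hypothetical.

```python
def data_dummy(city, year, launch_year):
    """Return 1 if the city has a data trading platform in operation in `year`, else 0."""
    start = launch_year.get(city)  # None for cities that never launch a platform
    return int(start is not None and year >= start)


# Hypothetical launch years; cities absent from the mapping are never treated.
launch_year = {"CityA": 2015, "CityB": 2021}

panel = [("CityA", 2014), ("CityA", 2015), ("CityA", 2020), ("CityB", 2020), ("CityC", 2023)]
data = [data_dummy(city, year, launch_year) for city, year in panel]
```

Each firm-year observation would then inherit the value for the city in which the firm is located.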

3.2.3. Control Variables

Based on existing research [45], this study incorporates a set of firm-level control variables, including firm size, Tobin’s Q, financial leverage, profitability, book-to-market ratio, cash turnover, ownership concentration, CEO duality, and the proportion of independent directors, to mitigate omitted variable bias. Additionally, the analysis controls for city-level heterogeneity by including the regional economic development level as a macro-level factor. Variable definitions are given in Table 1.

3.3. Sample Selection and Data Sources

This study uses A-share listed firms in China from 2007 to 2023 as the research sample. To ensure the accuracy and reliability of the results, the initial dataset is processed as follows: (1) firms labeled as ST, *ST, or PT are excluded; (2) financial industry firms are removed; (3) observations with missing values for key variables are excluded. In addition, all continuous variables are winsorized at the 1st and 99th percentiles to minimize the influence of outliers. After data cleaning, the final sample consists of 25,477 firm-year observations. The data come from two sources: firm-level basic information, financial indicators, and corporate governance variables are drawn from the CSMAR database, and city-level data are obtained from the China City Statistical Yearbook.
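The winsorization step can be illustrated with a short sketch. It assumes two-sided clipping at the empirical 1st and 99th percentiles with linear interpolation between order statistics; the function name `winsorize` and the toy data are hypothetical, and statistical packages may use slightly different quantile conventions.

```python
def winsorize(values, lower=0.01, upper=0.99):
    """Clip values to the empirical lower/upper quantiles (linear interpolation)."""
    s = sorted(values)

    def quantile(p):
        pos = p * (len(s) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(s) - 1)
        return s[lo] + (pos - lo) * (s[hi] - s[lo])

    q_lo, q_hi = quantile(lower), quantile(upper)
    return [min(max(v, q_lo), q_hi) for v in values]


x = list(range(1, 100)) + [10_000]  # 100 values with one extreme outlier
w = winsorize(x)
```

Values strictly inside the two cut-offs are left unchanged; only the tails are pulled in, so the outlier no longer dominates the mean or the regression fit.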
Descriptive statistics for all variables are presented in Table 2. Corporate investment efficiency is measured by the absolute value of the regression residuals, multiplied by 1000 for presentation, so a higher value indicates lower investment efficiency. Its mean is 37.408 and its standard deviation is 43.156; a standard deviation larger than the mean points to substantial variation in investment efficiency across firms. The data element marketization variable has a mean of 0.308, indicating that 30.8% of firm-year observations belong to firms located in cities where a data trading platform is in operation.

4. Empirical Results and Analysis

4.1. Benchmark Regression

This paper treats data element marketization as the core independent variable and employs a Double Machine Learning approach based on the gradient boosting algorithm for estimation. A five-fold cross-fitting strategy is applied to construct a robust baseline regression model, with both firm fixed effects and year fixed effects included. The results are presented in Table 3. Column (1) reports the results with first-order terms of the control variables. The estimated coefficient on data element marketization is approximately −2.520 and is statistically significant at the 1% level. Column (2) adds second-order terms of the control variables to the specification in Column (1), and the coefficient on data element marketization remains significantly negative. Because the dependent variable measures investment inefficiency, these negative coefficients indicate that data element marketization significantly reduces firms’ inefficient investment behavior, thereby improving corporate investment efficiency, which provides empirical support for Hypothesis H1.
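The cross-fitting logic behind the estimator can be sketched in a few lines. This is an illustration under loud assumptions rather than the paper's implementation: a simple least-squares line replaces the gradient boosting learner, there is a single control variable, fixed effects are omitted, and the data are simulated with an arbitrary true effect of −2.0. The names `fit_line` and `dml_plr` are hypothetical.

```python
import random


def fit_line(x, y):
    """Least-squares line y = a + b*x; a stand-in for the gradient boosting learner."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return lambda v: a + b * v


def dml_plr(y, d, x, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(x) + e."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    y_res, d_res = [0.0] * len(y), [0.0] * len(y)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        g_hat = fit_line([x[i] for i in train], [y[i] for i in train])  # E[y | x]
        m_hat = fit_line([x[i] for i in train], [d[i] for i in train])  # E[d | x]
        for i in held_out:  # residualize only on observations not used for fitting
            y_res[i] = y[i] - g_hat(x[i])
            d_res[i] = d[i] - m_hat(x[i])
    return sum(dr * yr for dr, yr in zip(d_res, y_res)) / sum(dr * dr for dr in d_res)


# Simulated data (hypothetical): binary treatment whose probability depends on x.
rng = random.Random(42)
x = [rng.random() for _ in range(500)]
d = [1.0 if rng.random() < 0.3 + 0.4 * xi else 0.0 for xi in x]
y = [-2.0 * di + 3.0 * xi + rng.gauss(0, 0.5) for di, xi in zip(d, x)]

theta_hat = dml_plr(y, d, x)  # should land near the simulated effect of -2.0
```

The key design point is that the nuisance functions are always predicted on observations held out from their own fitting, which is what limits overfitting bias when flexible learners are used.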

4.2. Endogeneity Test

Although a comprehensive set of control variables is included in the baseline model, potential endogeneity concerns, such as omitted variable bias, may still persist. To address these issues, this paper adopts an instrumental variable approach, following strategies widely used in the existing literature [46]. Specifically, we use the number of post offices per million inhabitants in each city in 1984 as an instrumental variable, based on the premise that the historical density of postal networks reflects a region’s longstanding patterns of information exchange. Importantly, the allocation of postal facilities during this period was predominantly shaped by administrative planning and population considerations under the planned economy, rather than by market-driven commercial activity. As such, postal density serves as a proxy for the historical infrastructure supporting information flows, which may have fostered a persistent cultural and institutional orientation toward information sharing. This orientation, in turn, plausibly contributes to greater contemporary demand for the marketization of data factors. While it is conceivable that postal density may correlate with historical levels of economic activity, we argue that such associations are unlikely to translate into a direct effect on current firm-level investment decisions. The significant temporal distance, combined with the profound economic and institutional transformations spurred by China’s Reform and Opening-Up policy since the late 1970s, undermines the plausibility of any direct causal channel linking historical postal distribution to present-day corporate behavior. Rather, we posit that the influence of postal density operates indirectly, through the persistence of regional norms related to information exchange. In this sense, the instrument satisfies the relevance condition while remaining plausibly exogenous to current micro-level investment outcomes.
As the 1984 post office data are cross-sectional, we construct a time-varying instrument by interacting the 1984 post office density with the lagged number of internet broadband access ports during the study period. This interaction term (Off) is used as the instrumental variable. Columns (1) and (2) of Table 4 report the first-stage and second-stage regression results, respectively. The first-stage coefficient is significantly positive at the 1% level, indicating a strong correlation between the instrumental variable and data element marketization. The second-stage result shows that the estimated coefficient remains significantly negative, confirming that the core findings are robust even after controlling for potential endogeneity.
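The construction of the interaction instrument and the logic of the two-stage estimate can be sketched as follows. This is a simplified illustration, not the paper's estimator: the city values are hypothetical, the simulated treatment is continuous rather than binary, no controls are included, and with a single instrument the two-stage estimate reduces to the ratio of covariances used in `iv_estimate` (a hypothetical helper name).

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)


def iv_estimate(y, d, z):
    """Just-identified IV estimate: equals 2SLS with one instrument and no controls."""
    return cov(z, y) / cov(z, d)


# Instrument construction (hypothetical numbers): 1984 post office density
# interacted with the lagged number of broadband access ports.
post_1984 = {"CityA": 12.0, "CityB": 30.0}
ports_lag = {("CityA", 2015): 1.5, ("CityB", 2015): 2.0}
off = {key: post_1984[key[0]] * ports for key, ports in ports_lag.items()}

# Simulated check: an unobserved confounder u biases OLS but not IV.
rng = random.Random(7)
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
d = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [-1.0 * di + 2.0 * ui + rng.gauss(0, 1) for di, ui in zip(d, u)]

iv = iv_estimate(y, d, z)   # should be close to the simulated effect of -1.0
ols = cov(d, y) / cov(d, d)  # pulled toward zero by the confounder
```

The comparison between `iv` and `ols` shows why a valid instrument matters: the instrument is unrelated to the confounder, so only the variation in the treatment that it induces is used to estimate the effect.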

4.3. Robustness Check

4.3.1. Change the Measurement Method of Investment Efficiency

To ensure the robustness of the empirical findings, this paper follows the methodology of Biddle et al. [47] and re-estimates corporate investment efficiency using industry-specific regression models. This approach helps mitigate potential measurement error in the construction of the dependent variable, thereby reducing bias in model estimation. As shown in Column (1) of Table 5, the coefficient on the core explanatory variable, data element marketization, remains significantly negative, indicating that the results are robust to an alternative measure of investment efficiency.

4.3.2. Lagging the Explanatory Variable by One Period

To rule out the potential influence of lag effects on the empirical results, this paper further re-estimates the model using a one-period lag of the explanatory variable. The regression results are reported in Column (2) of Table 5. Even after introducing the lagged variable, the coefficient of the core explanatory variable—data element marketization—remains significantly negative. This finding suggests that the main conclusion is not driven by time-lagged effects, further confirming the robustness and reliability of the empirical results.

4.3.3. Changing the Assumptions of the Double Machine Learning Model

To further verify the robustness of the baseline regression results derived from the Double Machine Learning model, this paper conducts two additional tests.
First, we adjust the sample-splitting ratio in the DML framework. As shown in Columns (1) and (2) of Table 6, when the splitting ratio of the full sample is changed from the baseline 1:4 to 1:2 and 1:7, respectively, the estimated effect of the core explanatory variable, data element marketization, varies slightly in magnitude but remains consistently negative and statistically significant. This suggests that the empirical results are robust to different sample-splitting configurations.
Second, we re-estimate the DML model using multiple machine learning algorithms. In addition to the gradient boosting algorithm used in the baseline estimation, we apply random forest (RF) and support vector machine (SVM) methods. As reported in Columns (3) and (4) of Table 6, the estimated coefficients on the core variable remain significantly negative across all models, reinforcing the validity and robustness of the core conclusions.
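For the first test, the link between a splitting ratio and the number of cross-fitting folds can be made explicit. The sketch below assumes that a ratio such as 1:4 means one held-out part for every four training parts, which matches the five-fold baseline; the function names are hypothetical.

```python
def folds_from_ratio(ratio):
    """Convert a 'held-out:training' ratio such as '1:4' into a number of folds."""
    held_out, train = (int(part) for part in ratio.split(":"))
    return (held_out + train) // held_out


def split_folds(n_obs, n_folds):
    """Assign observation indices to folds in a simple round-robin fashion."""
    return [list(range(k, n_obs, n_folds)) for k in range(n_folds)]


fold_counts = {r: folds_from_ratio(r) for r in ("1:4", "1:2", "1:7")}
folds = split_folds(10, fold_counts["1:2"])
```

Under this reading, the 1:2 and 1:7 configurations correspond to three-fold and eight-fold cross-fitting, so the check varies how much data each nuisance model is trained on.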

4.3.4. Parallel Trends Test

This study further employs an event-study approach to examine the parallel trends assumption prior to the policy intervention and the dynamic effects following the policy implementation. As shown in Figure 3, the regression coefficients prior to the establishment of the data exchange are statistically insignificant, supporting the validity of the parallel trends assumption. In the year of policy implementation, the coefficient of the policy shock is significantly positive. Given that the dependent variable in this study is a proxy for corporate investment inefficiency, this result indicates that the marketization of data elements, in fact, exacerbated corporate investment distortions in the short term. We attribute this to initial adjustment costs and an insufficient scale of market participants. As an emerging market, the establishment of a data exchange requires time to accumulate buyers and sellers until a sufficient threshold is reached for the market mechanism to function effectively. This interpretation is corroborated by the dynamic results: as time progresses, the policy coefficient turns from positive to negative, and its absolute value continues to increase. This suggests that after overcoming the initial hurdles, the marketization of data elements significantly enhances corporate investment efficiency.
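An event-study specification of this kind relies on relative-time indicators. The sketch below shows one way to construct them, assuming a window of three years on each side with binned endpoints and the year before launch as the omitted reference period; these choices and the function name `event_time_dummies` are illustrative assumptions, since the exact window is not stated here.

```python
def event_time_dummies(year, launch_year, window=3, reference=-1):
    """Map a firm-year to relative-time indicators; never-treated cities get all zeros."""
    periods = [k for k in range(-window, window + 1) if k != reference]
    dummies = {f"rel_{k}": 0 for k in periods}
    if launch_year is None:
        return dummies
    rel = max(-window, min(window, year - launch_year))  # bin the endpoints
    if rel != reference:
        dummies[f"rel_{rel}"] = 1
    return dummies


two_years_after = event_time_dummies(2018, 2016)
year_before = event_time_dummies(2015, 2016)    # reference period: all zeros
long_before = event_time_dummies(2005, 2016)    # binned into rel_-3
never_treated = event_time_dummies(2018, None)
```

Coefficients on the pre-launch indicators test for differential pre-trends, while those on the launch-year and post-launch indicators trace the dynamic effect.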

5. Further Study

5.1. Mechanism Analysis

To further explore how data element marketization affects corporate investment efficiency, this paper conducts a mechanism analysis along three distinct channels: information dispersion, risk-bearing capacity and operational efficiency.

5.1.1. Information Dispersion

As indicated by the preceding theoretical analysis, the marketization of data factors improves corporate investment efficiency by reducing the degree of information dispersion faced by firms during the investment decision-making process. A firm exhibiting high volatility in its return on equity (ROE) often signals unstable or unpredictable earnings performance. This volatility increases the difficulty for investors and analysts to form accurate expectations about the firm’s future profitability. As a result, the dispersion of analysts’ earnings forecasts is likely to widen, reflecting greater disagreement among analysts and a more opaque information environment [48]. Therefore, ROE volatility can be interpreted as a proxy for the degree of information dispersion faced by the firm [49]. A higher standard deviation indicates that management must process more fragmented and volatile information, making it more difficult to accurately identify investment opportunities and select projects with the highest expected returns. This increased complexity ultimately reduces the likelihood of firms making optimal investment decisions. The degree of information dispersion is calculated as follows:
$$\sqrt{\frac{1}{T-1}\sum_{t=1}^{T}\left(ROE_{it} - \frac{1}{T}\sum_{t=1}^{T} ROE_{it}\right)^{2}}, \quad T = 3$$
In the above equation, i denotes the firm and t denotes the year. For presentation purposes, the measure of information dispersion is scaled by a factor of 1000.
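The dispersion measure is a rolling sample standard deviation and can be computed as in the sketch below. It assumes that the three-year window ends in the current year and that firm-years without a complete window are dropped; the function name `roe_dispersion` and the ROE values are hypothetical.

```python
from statistics import stdev


def roe_dispersion(roe_by_year, year, window=3, scale=1000):
    """Scaled sample standard deviation of ROE over the `window` years ending in `year`."""
    years = list(range(year - window + 1, year + 1))
    if any(y not in roe_by_year for y in years):
        return None  # incomplete window
    return scale * stdev([roe_by_year[y] for y in years])


roe = {2019: 0.10, 2020: 0.12, 2021: 0.08}
dispersion_2021 = roe_dispersion(roe, 2021)  # 1000 * 0.02 = 20, up to rounding
dispersion_2020 = roe_dispersion(roe, 2020)  # None: 2018 is missing
```

`statistics.stdev` divides by T − 1, which matches the 1/(T − 1) factor in the formula.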
The regression results are reported in Column (1) of Table 7. The estimated coefficient of data element marketization on information dispersion is significantly negative, indicating that marketization of data elements effectively reduces the degree of information dispersion firms face. This finding supports the proposed mechanism that reducing information dispersion serves as a potential channel through which data element marketization enhances investment efficiency. The rationale behind this mechanism is that greater information dispersion increases uncertainty and the cost of judgment in corporate investment decision-making. When firms are confronted with large volumes of fragmented and inconsistent information, managers may encounter issues such as information omission, misinterpretation, or misallocation of resources during the process of filtering and integrating relevant data. These challenges hinder their ability to timely and accurately identify high-return investment opportunities. By contrast, data element marketization improves the concentration, accessibility, and quality of information, thereby reducing the difficulty of acquiring critical decision-making inputs. As a result, firms are able to make investment judgments more efficiently, minimize resource waste and decision errors, and ultimately achieve more optimal capital allocation and investment returns.

5.1.2. Risk-Bearing Capacity

The marketization of data elements contributes to enhancing firms’ ability to identify and tolerate risk in the face of uncertainty, thereby promoting more precise and scientifically grounded investment decisions. Drawing on prior literature, this study uses the standard deviation of industry-adjusted stock returns over a five-year window as an inverse proxy for firms’ risk-bearing capacity [50].
Specifically, the metric is calculated as:
$$\sqrt{\frac{1}{T-1}\sum_{t=1}^{T}\left[\left(R_{it} - \bar{R}_{industry,t}\right) - \frac{1}{T}\sum_{t=1}^{T}\left(R_{it} - \bar{R}_{industry,t}\right)\right]^{2}}, \quad T = 5$$
where $R_{it}$ is the annual return of firm $i$ in year $t$, and $\bar{R}_{industry,t}$ is the average return of all firms in the same industry and year. A higher value of this measure indicates greater firm-specific volatility, which is interpreted as lower risk-bearing capacity.
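The calculation can be sketched as follows, assuming a five-year window ending in the current year. The function name `risk_proxy` and all return figures are hypothetical and serve only to show the two steps: industry adjustment first, then the sample standard deviation.

```python
from statistics import mean, stdev


def risk_proxy(firm_returns, industry_returns, year, window=5):
    """Std. dev. of industry-adjusted annual returns over the window ending in `year`."""
    years = list(range(year - window + 1, year + 1))
    # Subtract the same-year industry average from the firm's return.
    adjusted = [firm_returns[y] - mean(industry_returns[y]) for y in years]
    return stdev(adjusted)


firm_returns = {2017: 0.10, 2018: 0.30, 2019: -0.10, 2020: 0.20, 2021: 0.00}
# Returns of all firms in the same industry for each year (toy values, mean 0.05).
industry_returns = {y: [0.00, 0.10] for y in firm_returns}

volatility = risk_proxy(firm_returns, industry_returns, 2021)
```

Because the measure is an inverse proxy, a lower `volatility` is read as stronger risk-bearing capacity.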
As shown in Column (2) of Table 7, data element marketization has a negative and statistically significant effect on the standard deviation of industry-adjusted stock returns at the 1% level. Because this standard deviation serves as an inverse proxy for firms’ risk-bearing capacity, the result implies that greater data element marketization is associated with stronger risk-bearing capacity, indicating that firms are better able to withstand uncertainty and volatility in more data-enabled environments. This result confirms the positive role of data element marketization in enhancing firms’ capacity to deal with complex and dynamic environments. The reason why improved risk-bearing capacity enhances investment efficiency lies in the firm’s ability to proactively identify and pursue high-return opportunities under uncertainty, rather than avoiding risk and missing out on high-potential projects. Firms with higher tolerance for risk tend to make forward-looking investment decisions, even under conditions of incomplete information or volatile market environments. This allows them to allocate resources to promising but uncertain projects, ultimately leading to higher investment efficiency. Moreover, stronger risk-bearing capacity also encourages firms to adopt a long-term perspective in their investment strategies. Rather than prematurely locking in suboptimal projects simply to secure short-term returns, firms are more willing to wait for higher-quality opportunities to emerge. This strategic patience enables more effective capital allocation and reduces the likelihood of inefficient investments driven by excessive risk aversion.

5.1.3. Operational Efficiency

The marketization of data elements helps improve operational efficiency, thereby promoting more optimal and efficient resource allocation, which ultimately contributes to enhancing corporate investment efficiency. Referring to existing studies, this paper uses the asset turnover ratio as a proxy for operational efficiency to measure how efficiently firms allocate and utilize resources during the observation period [51]. A higher asset turnover ratio indicates that the firm operates with greater efficiency. As reported in Column (3) of Table 7, data element marketization has a positive and statistically significant effect on operational efficiency at the 5% level. This result suggests that the advancement of data element marketization creates favorable conditions for improving firms’ operational efficiency, thereby contributing directly to improved investment efficiency. Improvements in operational efficiency imply more precise resource allocation, faster responsiveness in process management, and stronger cost control capabilities. Such gains allow firms to base their investment decisions on more accurate cost-benefit expectations, enabling better project selection and capital allocation while effectively avoiding inefficient investments. Furthermore, efficient operations can shorten the time from investment to output, accelerate capital turnover, and increase the realizability of investment returns.

5.2. Heterogeneity Test

5.2.1. Industry Heterogeneity

Firms in different industries exhibit varying degrees of demand for and reliance on data, which leads to heterogeneous effects of data element marketization across sectors. Following the Classification of Strategic Emerging Industries (2018) issued by the National Bureau of Statistics of China, the sample is divided into two groups: high-tech industries and non-high-tech industries. Separate regressions are then conducted for each group, with the results presented in Columns (1) and (2) of Table 8.
The empirical results reveal significant industry-level heterogeneity in the impact of data element marketization on corporate investment efficiency. The enhancement effect is strongly significant for high-tech firms but not statistically significant for firms in non-high-tech industries. This divergence primarily stems from the intrinsic characteristics of high-tech sectors, which are highly reliant on cutting-edge technologies and data resources.
Importantly, although high-tech firms generally possess more advanced data analytics capabilities, the marketization of data elements does not merely provide them with resources they already control—it substantially expands the volume, diversity, and accessibility of external data available to them. Given their superior analytical capacities, high-tech firms are better positioned to incorporate newly accessible data into their decision-making processes, improving forecasting, optimizing resource allocation, and refining investment strategies. In other words, data element marketization enhances the marginal utility of these firms’ pre-existing data capabilities by increasing the scope and quality of available inputs, thereby significantly improving investment efficiency.
In contrast, non-high-tech firms often rely more heavily on traditional production factors such as labor and capital, with investment decisions typically shaped by more rigid, experience-based routines. Even when external data resources become more abundant through marketization, these firms may lack the necessary infrastructure, talent, or organizational flexibility to effectively integrate such data into their operations, leading to limited improvements in investment efficiency.
Additionally, differences in the policy environment may further amplify this heterogeneity. High-tech industries occupy a strategically prioritized position within China’s national industrial policy framework and are more likely to benefit from preferential treatments, such as tax incentives and targeted government subsidies. These policy supports not only alleviate financial constraints but also convey strong signals to the market, attracting additional resources and investments into the high-tech sector. Combined with the expanded data access brought by marketization, these factors jointly reinforce improvements in investment efficiency for high-tech firms.

5.2.2. Firm Growth Potential Heterogeneity

To investigate the heterogeneous effects of data element marketization across different types of firms, this study divides the sample into high growth potential and low growth potential firms based on the median value of Tobin’s Q. Grouped regression analyses are then conducted for each subsample. The regression results, shown in Columns (3) and (4) of Table 8, reveal significant differences in the effect of data element marketization on investment efficiency.
For high growth potential firms, the estimated coefficient of data element marketization is negative and statistically significant at the 5% level; because the dependent variable measures investment inefficiency, this indicates that marketization plays a significant role in promoting investment efficiency. In contrast, the coefficient for low growth potential firms is statistically insignificant, suggesting no discernible effect. This heterogeneity may be attributed to the strategic and operational advantages typically held by high growth potential firms. These firms often exhibit stronger expansion intentions, proactive strategic orientation, and greater responsiveness to market dynamics, enabling them to identify and seize the potential value offered by data element marketization more rapidly. In practice, high growth potential firms are more likely to incorporate data resources into their production and investment decision-making processes, thereby optimizing resource allocation through data-driven strategies and improving investment efficiency. By contrast, low growth potential firms generally lack the momentum and flexibility to adapt to fast-changing environments. Their limited capacity in resource acquisition, strategic adaptation, and organizational innovation constrains their ability to effectively access and utilize data resources. As a result, they may fail to establish systematic and efficient data utilization mechanisms, which hinders their ability to capitalize on the benefits of data element marketization. This leads to delayed responses, weaker execution, and underutilization of data’s potential in enhancing investment efficiency.

5.2.3. Digital Infrastructure Heterogeneity

Given the significant regional disparities in digital infrastructure development, firms may exhibit region-specific characteristics in terms of investment efficiency. The number of registered domain names is widely recognized as an important indicator of a region’s internet resources and network connectivity, and thus serves as a proxy for the overall level of digital infrastructure. In this study, the sample is divided into two groups—high and low digital infrastructure regions—based on the median number of domain names at the provincial level. Grouped regressions are then conducted for each subsample. The results, presented in Columns (1) and (2) of Table 9, show that the positive effect of data element marketization on investment efficiency is more pronounced in regions with better-developed digital infrastructure.
Firms located in regions with strong digital infrastructure typically enjoy more advanced connectivity environments and greater capacity to utilize data, which allows them to respond more effectively to market signals. As a result, they are better positioned to benefit from data element marketization in improving the scientific rigor and precision of investment decision-making. In contrast, in regions with relatively weak digital infrastructure, firms may face limitations related to information connectivity and data application environments, which hinder their ability to fully leverage data elements in optimizing investment decisions. In these areas, the estimated coefficient is not statistically significant.

5.2.4. Investment Inefficiency Heterogeneity

Although both underinvestment and overinvestment are forms of investment inefficiency, they stem from different underlying causes. Based on the direction of the deviation between actual and expected investment, we classify firms into two groups: overinvesting firms and underinvesting firms. Grouped regression results, presented in Columns (3) and (4) of Table 9, show that the impact of data element marketization on investment efficiency varies significantly between these two groups. The coefficient of data element marketization for underinvesting firms is not statistically significant, whereas for overinvesting firms, the coefficient is significantly negative at the 5% level. This suggests that the positive effect of data element marketization on investment efficiency is stronger for firms with overinvestment behavior.
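The grouping rule can be written down in a few lines. The sketch assumes that the deviation is the signed residual (actual minus expected investment) and that a residual of exactly zero is placed in the underinvestment group, which is an arbitrary tie-breaking choice; the function name `classify_investment` and the numbers are hypothetical.

```python
def classify_investment(actual, expected):
    """Label a firm-year by the sign of its investment residual and return |residual|."""
    residual = actual - expected
    group = "overinvestment" if residual > 0 else "underinvestment"
    return group, abs(residual)


over = classify_investment(actual=0.09, expected=0.06)
under = classify_investment(actual=0.03, expected=0.06)
```

Both groups share the same inefficiency measure (the absolute residual); only the sign determines which subsample a firm-year enters.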
This finding may be explained by several mechanisms. First, data element marketization reduces information dispersion and enhances information transparency and accessibility, enabling firms to make more rational and accurate investment decisions, especially when facing high-risk investment environments. In this context, firms are less likely to engage in irrational expansion and resource misallocation, effectively curbing overinvestment. From the perspective of risk-bearing capacity, firms with lower risk tolerance may lack the ability to cope with uncertain environments and tend to preemptively invest in currently available projects—regardless of their quality or return potential—in an effort to lock in short-term opportunities. This kind of opportunistic investment behavior can lower capital allocation efficiency and result in overinvestment. By contrast, firms with stronger risk-bearing capacity are more likely to adopt prudent investment strategies under uncertainty. They are willing to delay investment decisions, allowing time to identify and capture higher-quality, higher-return opportunities. This behavior—commonly referred to as “selective waiting”—optimizes investment portfolios, improves return on investment, and ultimately enhances overall investment efficiency. From the perspective of operational efficiency, data element marketization also helps firms improve internal operations by reducing excessive asset expansion and resource redundancy. This contributes to controlling overinvestment and enhancing investment efficiency.
In sum, the reduction of information dispersion, the enhancement of risk-bearing capacity, and the improvement of operational efficiency all play important roles in curbing irrational investment and enhancing efficiency for firms prone to overinvestment. In contrast, underinvesting firms often face constraints related to external resource availability or credit access, which data element marketization may not alleviate in the short term. As such, its effect on their investment efficiency remains limited.

6. Conclusions and Insights

With the growing integration of data elements into corporate investment decision-making, the marketization of data elements has created new developmental opportunities for enterprises. Based on a panel dataset of Chinese A-share listed firms from 2007 to 2023, this study employs a Double Machine Learning approach to systematically examine the impact of data element marketization on corporate investment efficiency. It further explores the underlying mechanisms from three perspectives: information dispersion, risk-bearing capacity, and operational efficiency. The findings suggest that data element marketization effectively reduces inefficient investment behavior and enhances firms’ investment efficiency. These improvements result from reduced information dispersion, greater risk tolerance, and enhanced operational efficiency, all of which lead to more informed and efficient investment decisions. Moreover, the effect of data element marketization is not uniform across firms. It has a notably stronger impact on high-tech enterprises compared to non-high-tech firms, and shows greater efficiency gains for high growth potential firms than for low growth potential ones. Regionally, the effect is stronger in regions with better digital infrastructure. From the perspective of investment behavior, data element marketization exerts a more significant regulatory effect on firms characterized by overinvestment, while its influence on underinvesting firms remains limited.
To more effectively leverage the role of data element marketization in enhancing corporate investment efficiency, the following policy recommendations are proposed based on the above findings:
Develop Robust Data Trading Platforms and Institutional Frameworks. To fully harness the potential of data elements in improving investment efficiency, it is essential to accelerate the development of unified, transparent, and trustworthy data trading platforms. These platforms should serve as central hubs that efficiently connect data suppliers and users, fostering seamless data circulation. In parallel, a comprehensive institutional framework must be established to define data ownership, classification standards, transaction compliance, and property rights. This legal clarity is crucial for building trust and reducing transaction costs. Furthermore, standardizing data formats and interface protocols will enhance interoperability across platforms, break down information silos, and improve the liquidity and composability of data assets. A well-structured institutional foundation will enable secure and efficient data flows, thereby strengthening corporate decision-making and resource allocation.
Enhance Data Accessibility Through Public and Corporate Disclosure. Improving data availability is fundamental to effective data element allocation. Governments and enterprises should prioritize openness by implementing mechanisms for timely, accurate, and comprehensive data disclosure. Doing so will reduce information asymmetry and increase the practical utility of available data. In addition, establishing a rigorous data monitoring and evaluation system can enhance transparency, facilitate external oversight, and encourage public participation. These initiatives not only improve governance and regulatory compliance but also bolster corporate risk management and accountability. Institutionalizing data circulation will standardize market behavior and elevate the quality and efficiency of data usage. Moreover, by aligning disclosure practices with firms’ internal operational systems, such policies can streamline decision-making processes, improve resource allocation accuracy, and ultimately enhance overall operational efficiency.
Optimize Market-Based Allocation to Support Strategic Industries and Firms. A more refined market-oriented approach is needed to maximize the impact of data elements, particularly in high-tech and strategically important sectors. This includes targeted support for firms in building capabilities for data collection, processing, storage, modeling, and data-driven decision-making. Investments in data infrastructure and the cultivation of skilled talent should be prioritized to address existing gaps. Enterprises should also be incentivized to collaborate in the construction and sharing of regional and industry-level data ecosystems. Promoting best practices and successful data application models will create scalable examples and drive broader improvements in investment efficiency across sectors.
Strengthen Digital Infrastructure to Underpin the Data Economy. The marketization of data elements is closely tied to the availability of advanced digital infrastructure. To unlock data-driven growth, governments must increase strategic investments in digital infrastructure, particularly in underserved regions. This includes expanding broadband access, cloud computing capabilities, and industrial internet infrastructure in central and western areas. Local governments should coordinate with industry stakeholders to accelerate the deployment of next-generation digital infrastructure, creating a high-speed, low-latency, and widely accessible digital environment. This foundation is essential for enabling enterprises to adopt data-centric investment models and contribute to regional economic development.
This study has a few limitations worth noting. First, our analysis is based only on data from Chinese A-share listed companies. While this ensures data quality and consistency, it also means that the conclusions may not fully reflect the behaviors of unlisted small and medium-sized enterprises (SMEs). Given that SMEs contribute significantly to employment, innovation, and economic activity in China, the effects of data element marketization on their investment decisions might be different. The exclusion of these firms is due to significant and well-documented data availability constraints. Much of the existing research on unlisted SMEs either relies on non-public, proprietary databases that are inaccessible for broad empirical study, or uses smaller listed firms as proxies. However, these proxies often differ fundamentally from the vast majority of unlisted enterprises in their financing channels, governance structures, and regulatory environments, making the direct extrapolation of our findings problematic. Future research could expand in this direction once better and more representative SME-level data becomes accessible. Second, the study focuses exclusively on China, a country with unique economic and digital development characteristics. As such, the findings may not directly apply to countries with different regulatory environments or stages of digital transformation. Future studies could explore whether similar effects exist in other national contexts to enrich the understanding of data marketization from a global perspective.

Author Contributions

Conceptualization, Y.M. and L.H.; methodology, Y.M.; software, Y.M.; validation, Z.L.; writing—original draft preparation, Y.M.; writing—review and editing, L.H.; visualization, Z.L.; supervision, L.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the General Special Project for Studying and Interpreting Xi Jinping Thought on Socialism with Chinese Characteristics for a New Era of University of Chinese Academy of Social Sciences “Research on Strategies for Enhancing the Resilience and Security of Enterprise Supply Chains from the Perspective of Deep Integration of Digital Economy and Real Economy” (grant number: 20240153).

Data Availability Statement

The original data presented in the study are openly available in the CSMAR database at https://data.csmar.com (accessed on 3 April 2025) and in the White Paper on Big Data at https://www.caict.ac.cn/kxyj/qwfb/bps/202112/P020211220495261830486.pdf (accessed on 3 April 2025).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Grover, V.; Chiang, R.H.; Liang, T.P.; Zhang, D. Creating strategic business value from big data analytics: A research framework. J. Manag. Inf. Syst. 2018, 35, 388–423. [Google Scholar] [CrossRef]
  2. DalleMule, L.; Davenport, T.H. What’s your data strategy. Harv. Bus. Rev. 2017, 95, 112–121. [Google Scholar]
  3. Jones, C.I.; Tonetti, C. Nonrivalry and the Economics of Data. Am. Econ. Rev. 2020, 110, 2819–2858. [Google Scholar] [CrossRef]
  4. Chen, X.; Liu, Z.; Ma, C. Chinese innovation-driving factors: Regional structure, innovation effect, and economic development—Empirical research based on panel data. Ann. Reg. Sci. 2017, 59, 43–68. [Google Scholar] [CrossRef]
  5. Wang, D.; Liao, H.; Liu, A.; Li, D. Natural resource saving effects of data factor marketization: Implications for green recovery. Resour. Policy 2023, 85, 104019. [Google Scholar] [CrossRef]
  6. Wang, D.; Yang, T. Research on the Promotion Effect of the Marketization of Data Elements on the Digital Transformation of Manufacturing Enterprises: An Empirical Evaluation of a Multiperiod DID Model. Sustainability 2025, 17, 3199. [Google Scholar] [CrossRef]
  7. National Data Administration; Cyberspace Administration of China; Ministry of Industry and Information Technology; Ministry of Public Security; State-owned Assets Supervision and Administration Commission of the State Council. Opinions on Promoting the Development and Utilization of Enterprise Data Resources (Guoshu Ziyuan [2024] No. 125); Gov.cn. Available online: https://www.gov.cn/zhengce/zhengceku/202412/content_6994570.htm (accessed on 30 May 2025).
  8. Sala, R.; Pirola, F.; Pezzotta, G.; Cavalieri, S. Data-driven decision making in maintenance service delivery process: A case study. Appl. Sci. 2022, 12, 7395. [Google Scholar] [CrossRef]
  9. Wang, L.; Wu, Y.; Huang, Z.; Wang, Y. Big data application and corporate investment decisions: Evidence from A-share listed companies in China. Int. Rev. Financ. Anal. 2024, 94, 103331. [Google Scholar] [CrossRef]
  10. Huang, H.; Qi, B.; Chen, L. Innovation and high-quality development of enterprises—Also on the effect of innovation driving the transformation of China’s economic development model. Sustainability 2022, 14, 8440. [Google Scholar] [CrossRef]
  11. Wang, W.; Xiao, D. Marketization of Data Elements and Enterprise Green Governance Performance: A Quasi-Natural Experiment Based on Data Trading Platforms. Manag. Decis. Econ. 2025, 46, 1686–1700. [Google Scholar] [CrossRef]
  12. Yang, B. Data element marketization and energy efficiency of heavy polluting enterprises: A technology innovation perspective. J. Environ. Manag. 2025, 391, 126318. [Google Scholar] [CrossRef] [PubMed]
  13. Ouyang, Y.; Hu, M. The impact of data elements marketization on corporate financing constraints: Quasi-experimental evidence from the establishment of data trading platforms in China. Financ. Res. Lett. 2024, 69, 106132. [Google Scholar] [CrossRef]
  14. Wang, P.; Cen, C. Does digital economy development promote innovation efficiency? A spatial econometric approach for Chinese regions. Technol. Anal. Strateg. Manag. 2024, 36, 931–945. [Google Scholar] [CrossRef]
  15. Dong, L.; Zhu, X.; Yang, L.; Jiang, G. Unleashing the power of data element markets: Driving urban green growth through marketization, innovation, and digital finance. Int. Rev. Econ. Financ. 2025, 99, 104070. [Google Scholar] [CrossRef]
  16. Peng, H.; Wang, L. Digital economy and business investment efficiency: Inhibiting or facilitating. Res. Int. Bus. Financ. 2022, 63, 101797. [Google Scholar] [CrossRef]
  17. Wang, Q.; Liu, T.; Zhong, X. Digital innovation, investment efficiency and total factor productivity: Evidence from enterprise digital patents. Appl. Econ. 2024, 56, 8183–8197. [Google Scholar] [CrossRef]
  18. Jiang, Q.; Zhang, C.; Wei, Q. Digital technology adoption and enterprise investment efficiency. Financ. Res. Lett. 2025, 72, 106623. [Google Scholar] [CrossRef]
  19. Lv, P.; Xiong, H. Can FinTech improve corporate investment efficiency? Evidence from China. Res. Int. Bus. Financ. 2022, 60, 101571. [Google Scholar] [CrossRef]
  20. Xu, C.; Jin, L. Effects of government digitalization on firm investment efficiency: Evidence from China. Int. Rev. Econ. Financ. 2024, 92, 819–834. [Google Scholar] [CrossRef]
  21. Tsai, H.J.S.; Wu, Y.; Xu, B. Does capital market drive corporate investment efficiency? Evidence from equity lending supply. J. Corp. Financ. 2021, 69, 102042. [Google Scholar] [CrossRef]
  22. Abbas, A.E.; Agahari, W.; Van de Ven, M.; Zuiderwijk, A.; De Reuver, M. Business data sharing through data marketplaces: A systematic literature review. J. Theor. Appl. Electron. Commer. Res. 2021, 16, 3321–3339. [Google Scholar] [CrossRef]
  23. Eichler, R.; Gröger, C.; Hoos, E.; Schwarz, H.; Mitschang, B. From data asset to data product–the role of the data provider in the enterprise data marketplace. In Service-Oriented Computing, Proceedings of the Symposium and Summer School on Service-Oriented Computing, Cham, Switzerland, 4–8 July 2022; Springer: Cham, Switzerland, 2022; pp. 119–138. [Google Scholar]
  24. Ren, S. Optimization of Enterprise Financial Management and Decision-Making Systems Based on Big Data. J. Math. 2022, 2022, 1708506. [Google Scholar] [CrossRef]
  25. Liu, Y.; He, Q. Digital transformation, external financing, and enterprise resource allocation efficiency. Manag. Decis. Econ. 2024, 45, 2321–2335. [Google Scholar] [CrossRef]
  26. Nuccio, M.; Guerzoni, M. Big data: Hell or heaven? Digital platforms and market power in the data-driven economy. Compet. Change 2019, 23, 312–328. [Google Scholar] [CrossRef]
  27. Xu, J.; Hong, N.; Xu, Z.; Zhao, Z.; Wu, C.; Kuang, K.; Shum, H. Data-driven learning for data rights, data pricing, and privacy computing. Engineering 2023, 25, 66–76. [Google Scholar] [CrossRef]
  28. Popovič, A.; Hackney, R.; Coelho, P.S.; Jaklič, J. Towards business intelligence systems success: Effects of maturity and culture on analytical decision making. Decis. Support Syst. 2012, 54, 729–739. [Google Scholar] [CrossRef]
  29. Chen, H.; Chiang, R.H.; Storey, V.C. Business intelligence and analytics: From big data to big impact. MIS Q. 2012, 36, 1165–1188. [Google Scholar] [CrossRef]
  30. Stratopoulos, T.C.; Wang, V.X. Estimating the duration of competitive advantage from emerging technology adoption. Int. J. Account. Inf. Syst. 2022, 47, 100577. [Google Scholar] [CrossRef]
  31. Scala, N.M.; Rajgopal, J.; Vargas, L.G.; Needy, K.L. Group decision making with dispersion in the analytic hierarchy process. Group Decis. Negot. 2016, 25, 355–372. [Google Scholar] [CrossRef]
  32. Minciu, M.; Berar, F.A.; Dobrea, R.C. New decision systems in the VUCA world. Manag. Mark. 2020, 15, 236–254. [Google Scholar] [CrossRef]
  33. Jia, N.; Rai, A.; Xu, S.X. Reducing capital market anomaly: The role of information technology using an information uncertainty lens. Manag. Sci. 2020, 66, 979–1001. [Google Scholar] [CrossRef]
  34. Chang, C.Y. Risk-bearing capacity as a new dimension to the analysis of project governance. Int. J. Proj. Manag. 2015, 33, 1195–1205. [Google Scholar] [CrossRef]
  35. Wamba, S.F.; Gunasekaran, A.; Akter, S.; Ren, S.J.F.; Dubey, R.; Childe, S.J. Big data analytics and firm performance: Effects of dynamic capabilities. J. Bus. Res. 2017, 70, 356–365. [Google Scholar] [CrossRef]
  36. Reggi, L.; Dawes, S. Open government data ecosystems: Linking transparency for innovation with transparency for partici-pation and accountability. In Electronic Government, Proceedings of the 15th IFIP WG 8.5 International Conference on Electronic Government (EGOV 2016), Guimarães, Portugal, 5–8 September 2016; Springer: Cham, Switzerland, 2016; pp. 74–86. [Google Scholar]
  37. Elsayed, M.; Wickramainghe, A.; Razik, M.A. The association between strategic cost management and enterprise risk management: A critical literature review. Corp. Ownersh. Control 2011, 9, 184–195. [Google Scholar] [CrossRef]
  38. Riipa, M.B.; Begum, N.; Hriday, M.S.H.; Haque, S.A. Role of data analytics in enhancing business decision-making and operational efficiency. Int. J. Commun. Netw. Inf. Secur. 2025, 17, 400–412. [Google Scholar]
  39. Zhou, J. Empowering Internal Control: A Case Study in Leveraging Data and Technologies. J. Account. Ethics Public Policy 2024, 25, 26. [Google Scholar] [CrossRef]
  40. Yu, W.; Chavez, R.; Jacobs, M.A.; Feng, M. Data-driven supply chain capabilities and performance: A resource-based view. Transp. Res. Part E Logist. Transp. Rev. 2018, 114, 371–385. [Google Scholar] [CrossRef]
  41. Olayinka, O.H. Big data integration and real-time analytics for enhancing operational efficiency and market responsiveness. Int. J. Sci. Res. Arch. 2021, 4, 280–296. [Google Scholar] [CrossRef]
  42. Chernozhukov, V.; Chetverikov, D.; Demirer, M.; Duflo, E.; Hansen, C.; Newey, W.; Robins, J. Double/debiased machine learning for treatment and structural parameters. Econom. J. 2018, 21, C1–C68. Available online: https://academic.oup.com/ectj/article-abstract/21/1/C1/5056401 (accessed on 30 May 2025). [CrossRef]
  43. Richardson, S. Over-investment of free cash flow. Rev. Account. Stud. 2006, 11, 159–189. [Google Scholar] [CrossRef]
  44. Thomas, L.D.; Leiponen, A. Big data commercialization. IEEE Eng. Manag. Rev. 2016, 44, 74–90. [Google Scholar] [CrossRef]
  45. He, J.; Xu, S.; Wang, B.; Chan, K.C. Learn from peers? The impact of peer firms’ analyst earnings forecasts on a focal firm’s corporate investment efficiency. Int. Rev. Financ. Anal. 2023, 89, 102750. [Google Scholar] [CrossRef]
  46. Gong, M.; Zeng, Y.; Zhang, F. New infrastructure, optimization of resource allocation and upgrading of industrial structure. Financ. Res. Lett. 2023, 54, 103754. [Google Scholar] [CrossRef]
  47. Biddle, G.C.; Hilary, G.; Verdi, R.S. How does financial reporting quality relate to investment efficiency? J. Account. Econ. 2009, 48, 112–131. [Google Scholar] [CrossRef]
  48. Ali, A.; Liu, M.; Xu, D.; Yao, T. Corporate disclosure, analyst forecast dispersion, and stock returns. J. Account. Audit. Financ. 2019, 34, 54–73. [Google Scholar] [CrossRef]
  49. Wang, H.C.; Li, X.C.; Li, H.T. Digital innovation and enterprise investment efficiency: Evidence based on patent text analysis. Kuaiji Yanjiu (Account. Res.) 2023, 7, 55–71. Available online: https://kns.cnki.net/kcms2/article/abstract?v=qAFzQoAYOgTobQHmD69RWMVysk351AmTsMWzoDbQB0u_fpnor-zweXaqQMWREVojnL0SiE0X2ku49jJXsqRacPFYOj2sgkyapcvbzW_dJoLBBIhYlI_tsxMQ0YFEi_MSPP_Bb2vCNpeu_ZditrJD7PdyW_qvY9zPg_rdidNAljtobB33-RzQcQ==&uniplatform=NZKPT&language=CHS (accessed on 30 May 2025).
  50. Xu, Z. Can the participation of party organizations in corporate governance enhance the risk-bearing capacity of listed companies? Int. Rev. Econ. Financ. 2025, 101, 104129. [Google Scholar] [CrossRef]
  51. Obeidat, M.I.S.; AlMOMANI, M.A.; Almomani, T.M.; Darkel, N.M.A.M.Y. The moderating impact of major shareholding of equity on operational performance efficiency and firm value relationship: The evidence of the manufacturing listed firms at ASE. Wseas Trans. Bus. Econ. 2023, 20, 1408–1421. [Google Scholar] [CrossRef]
Figure 1. Theoretical framework for the improvement of enterprise investment efficiency driven by data elements marketization.
Figure 2. The establishment time of data exchanges in each city.
Figure 3. Parallel trend test results. Note: The dashed line perpendicular to the horizontal axis represents the 95% confidence interval.
Table 1. Variable description.

| Variable Name | Symbol | Variable Definition |
|---|---|---|
| Corporate Investment Efficiency | Inv | Regression Residual |
| Data Element Marketization | Data | Whether a Data Trading Platform Was Established in the City in the Given Year |
| Firm Size | Size | Natural Logarithm of Total Assets |
| Tobin’s Q | Tobinq | Tobin’s Q |
| Financial Leverage | Lev | Total Liabilities/Total Assets |
| Profitability | Roa | Return on Assets |
| Book-to-Market Ratio | Btm | Book Value of Equity/Market Value of the Firm |
| Cash Turnover | Cet | Operating Revenue/Average Cash Balance |
| Ownership Concentration | Stock | Shareholding Ratio of the Largest Shareholder |
| CEO Duality | Cp | Dummy Variable: 1 if CEO and Chairman Are the Same Person, 0 Otherwise |
| Proportion of Independent Directors | Idr | Number of Independent Directors/Total Number of Board Members |
| Regional Economic Development Level | Agdp | Natural Logarithm of Regional GDP per Capita |

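
The ratio and logarithm definitions in Table 1 can be illustrated with a minimal sketch. The input field names (`total_assets`, `avg_cash_balance`, and so on) and the example figures are hypothetical and do not come from the CSMAR schema or from the study's data; only the formulas follow Table 1. Table 2 appears to report Idr and Stock in percentage points, so a ratio computed this way would need to be multiplied by 100 to match that scale.

```python
import math

def build_controls(raw):
    """Compute several Table 1 controls for one firm-year record.

    The keys of `raw` are hypothetical field names chosen for illustration.
    """
    return {
        # Firm Size: natural logarithm of total assets
        "Size": math.log(raw["total_assets"]),
        # Financial Leverage: total liabilities / total assets
        "Lev": raw["total_liabilities"] / raw["total_assets"],
        # Book-to-Market Ratio: book value of equity / market value of the firm
        "Btm": raw["book_equity"] / raw["market_value"],
        # Cash Turnover: operating revenue / average cash balance
        "Cet": raw["operating_revenue"] / raw["avg_cash_balance"],
        # Proportion of Independent Directors, as a ratio (not percentage points)
        "Idr": raw["independent_directors"] / raw["board_size"],
        # CEO Duality: 1 if CEO and chairman are the same person, 0 otherwise
        "Cp": 1 if raw["ceo_is_chairman"] else 0,
    }

# Made-up firm-year values, for illustration only.
example = {
    "total_assets": 5e9,
    "total_liabilities": 2e9,
    "book_equity": 3e9,
    "market_value": 1e10,
    "operating_revenue": 4e9,
    "avg_cash_balance": 5e8,
    "independent_directors": 3,
    "board_size": 9,
    "ceo_is_chairman": True,
}

controls = build_controls(example)
for name, value in controls.items():
    print(name, round(value, 3))
```

The dependent variable Inv is not covered here, because it is a regression residual and depends on an investment model that this section does not restate.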
Table 2. Descriptive statistics of main variables.

| Variable | Obs | Mean | Std. Dev | Min | Max |
|---|---|---|---|---|---|
| Inv | 25,477 | 37.408 | 43.156 | 0.400 | 258.222 |
| Data | 25,477 | 0.308 | 0.462 | 0.000 | 1.000 |
| Size | 25,477 | 22.363 | 1.296 | 20.170 | 26.410 |
| Tobinq | 25,477 | 1.998 | 1.180 | 0.832 | 7.315 |
| Lev | 25,477 | 0.426 | 0.192 | 0.061 | 0.839 |
| Roa | 25,477 | 0.040 | 0.055 | −0.179 | 0.199 |
| Btm | 25,477 | 0.339 | 0.153 | 0.074 | 0.797 |
| Cet | 25,477 | 6.765 | 7.730 | 0.405 | 48.243 |
| Stock | 25,477 | 34.280 | 14.836 | 8.259 | 73.984 |
| Cp | 25,477 | 0.268 | 0.443 | 0.000 | 1.000 |
| Idr | 25,477 | 37.510 | 5.228 | 33.330 | 57.140 |
| Agdp | 25,477 | 11.460 | 0.561 | 9.744 | 12.208 |

Table 3. Benchmark regression results.

| | (1) | (2) |
|---|---|---|
| | Inv | Inv |
| Data | −2.520 *** (0.807) | −2.488 *** (0.810) |
| First-Order Control Variables | Yes | Yes |
| Second-Order Control Variables | No | Yes |
| Year FE | Yes | Yes |
| Firm FE | Yes | Yes |
| N | 25,477 | 25,477 |

Note: Robust standard errors are shown in parentheses. *** denotes significance at the 1% level.
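
As a rough check on how the stars relate to the reported figures, the sketch below converts a coefficient and its standard error into a two-sided p-value and a star label. It assumes a standard normal reference distribution, which is an assumption made here and not something the table note states; the function names are illustrative.

```python
import math

def two_sided_p(coef, se):
    """Two-sided p-value for coef/se under a standard normal approximation."""
    z = abs(coef / se)
    # For a standard normal Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))

def stars(p):
    """Map a p-value to the 1% / 5% / 10% star convention used in the tables."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""

# Coefficient and standard error pairs taken from Table 3, columns (1) and (2).
table3 = {"(1)": (-2.520, 0.807), "(2)": (-2.488, 0.810)}
labels = {col: stars(two_sided_p(coef, se)) for col, (coef, se) in table3.items()}
print(labels)
```

Both columns give a ratio above 3 in absolute value, which is consistent with the 1% significance reported in the table.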
Table 4. Endogenous problem test.

| | (1) 2SLS | (2) 2SLS |
|---|---|---|
| | Data | Inv |
| Data | | −7.507 ** (3.818) |
| Off | −2.620 *** (0.065) | |
| Control Variables | Yes | Yes |
| Year FE | Yes | Yes |
| Firm FE | Yes | Yes |
| N | 20,762 | 20,762 |

Note: Robust standard errors are shown in parentheses. *** and ** denote significance at the 1% and 5% levels, respectively.
Table 5. Robustness checks (I).

| | (1) Change the Measurement Method of Investment Efficiency | (2) Explanatory Variable Lag by One Period |
|---|---|---|
| | Inv | Inv |
| Data | −4.071 *** (0.922) | −3.149 *** (0.894) |
| Control Variables | Yes | Yes |
| Year FE | Yes | Yes |
| Firm FE | Yes | Yes |
| N | 21,497 | 21,497 |

Note: Robust standard errors are shown in parentheses. *** denotes significance at the 1% level.
Table 6. Robustness checks (II).

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| | Change K | Change K | Change ML Algorithms | Change ML Algorithms |
| Machine Learning Algorithms | K = 3 | K = 8 | RF | SVM |
| | Inv | Inv | Inv | Inv |
| Data | −2.448 *** (0.804) | −2.635 *** (0.810) | −5.348 ** (2.160) | −4.091 *** (0.607) |
| Control Variables | Yes | Yes | Yes | Yes |
| Year FE | Yes | Yes | Yes | Yes |
| Firm FE | Yes | Yes | Yes | Yes |
| N | 25,477 | 25,477 | 25,477 | 25,477 |

Note: Robust standard errors are shown in parentheses. *** and ** denote significance at the 1% and 5% levels, respectively.
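
The role of K in Table 6 can be illustrated with a minimal sketch of K-fold cross-fitting in a partially linear model, y = θ·d + g(x) + e, estimated in the partialling-out (residual-on-residual) form. This is not the study's implementation: the data are simulated, fixed effects are omitted, there is a single control, and a plain least-squares line stands in for the machine learning nuisance learner so that the example needs only the standard library. All function names and parameter values are assumptions made for illustration.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; a stand-in for an ML nuisance learner."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x

def dml_plr(y, d, x, k, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(x) + e."""
    n = len(y)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering all rows
    y_res = [0.0] * n
    d_res = [0.0] * n
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        # Nuisance models are fitted only on rows outside the current fold.
        predict_y = fit_line([x[i] for i in train], [y[i] for i in train])
        predict_d = fit_line([x[i] for i in train], [d[i] for i in train])
        # Residuals are computed out of sample, on the held-out fold.
        for i in fold:
            y_res[i] = y[i] - predict_y(x[i])
            d_res[i] = d[i] - predict_d(x[i])
    # Regress outcome residuals on treatment residuals (no intercept).
    return sum(dr * yr for dr, yr in zip(d_res, y_res)) / sum(dr * dr for dr in d_res)

def simulate(n=4000, theta=-2.5, seed=1):
    """Simulated data with a binary treatment that depends on the control x."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    d = [1.0 if 0.5 * xi + rng.gauss(0, 1) > 0 else 0.0 for xi in x]
    y = [theta * di + 2.0 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]
    return y, d, x

y, d, x = simulate()
estimates = {k: dml_plr(y, d, x, k) for k in (3, 5, 8)}
for k, est in estimates.items():
    print(f"K = {k}: theta_hat = {est:.3f}")
```

Changing `k` only changes how the sample is split between nuisance fitting and residual computation, so estimates that stay close across values of K, as in columns (1) and (2), suggest the result is not an artifact of one particular split. Swapping `fit_line` for another learner would correspond to the algorithm change in columns (3) and (4).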
Table 7. Mechanism test.

| | (1) Information Dispersion | (2) Risk-Bearing Capacity | (3) Operational Efficiency |
|---|---|---|---|
| Data | −0.606 ** (0.304) | −0.019 *** (0.005) | 0.014 ** (0.006) |
| Control Variables | Yes | Yes | Yes |
| Year FE | Yes | Yes | Yes |
| Firm FE | Yes | Yes | Yes |
| N | 12,251 | 19,395 | 25,143 |

Note: Robust standard errors are shown in parentheses. *** and ** denote significance at the 1% and 5% levels, respectively.
Table 8. Heterogeneity test (I).

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| | Industry | Industry | Firm Growth Potential | Firm Growth Potential |
| | Non-High-Tech Industry | High-Tech Industry | Low Growth Potential | High Growth Potential |
| Data | −0.324 (1.496) | −3.201 *** (0.968) | −1.587 (1.039) | −3.233 ** (1.251) |
| Control Variables | Yes | Yes | Yes | Yes |
| Year FE | Yes | Yes | Yes | Yes |
| Firm FE | Yes | Yes | Yes | Yes |
| N | 8558 | 16,919 | 12,739 | 12,738 |

Note: Robust standard errors are shown in parentheses. *** and ** denote significance at the 1% and 5% levels, respectively.
Table 9. Heterogeneity test (II).

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| | Digital Infrastructure | Digital Infrastructure | Investment Inefficiency | Investment Inefficiency |
| | Low-Level Digital Infrastructure | High-Level Digital Infrastructure | Under-Investing Firms | Over-Investing Firms |
| Data | −2.362 (1.856) | −1.921 * (1.109) | −0.941 (0.811) | −4.018 ** (1.582) |
| Control Variables | Yes | Yes | Yes | Yes |
| Year FE | Yes | Yes | Yes | Yes |
| Firm FE | Yes | Yes | Yes | Yes |
| N | 11,092 | 14,385 | 15,214 | 10,263 |

Note: Robust standard errors are shown in parentheses. ** and * denote significance at the 5% and 10% levels, respectively.

Share and Cite

MDPI and ACS Style

Ma, Y.; Li, Z.; He, L. Data Elements Marketization and Corporate Investment Efficiency: Causal Inference via Double Machine Learning. Systems 2025, 13, 609. https://doi.org/10.3390/systems13070609
