A System Model for Valuing Data Assets in Commercial Banks

Wang, Hu; Song, Liangrong; Zong, Qingying

doi:10.3390/systems14010115

by Hu Wang ¹, Liangrong Song ¹,* and Qingying Zong ²

¹ School of Management, University of Shanghai for Science and Technology, Shanghai 200082, China

² School of Statistics and Data Science, Lanzhou University of Finance and Economics, Lanzhou 730030, China

* Author to whom correspondence should be addressed.

Systems 2026, 14(1), 115; https://doi.org/10.3390/systems14010115

Submission received: 12 December 2025 / Revised: 19 January 2026 / Accepted: 19 January 2026 / Published: 22 January 2026

(This article belongs to the Special Issue Data-Driven Formation and Development of Business Ecosystems)

Abstract

With the ongoing development of the digital economy, the productive function of data as an economic factor has become increasingly salient. Scientifically and rigorously assessing the value of data assets is essential for improving the national economic accounting system and promoting sustainable economic growth. In light of the limitations inherent in existing cost-based and market-based valuation approaches, this paper proposes a comprehensive valuation model that integrates the cost approach with the income approach and applies it to the commercial banking sector. Specifically, text analysis is employed to estimate human capital investment in data assets from the perspective of labor supply and demand, after which total costs are derived based on the proportion of human capital. An ARIMA model is used to forecast future cost inputs and net profits associated with data assets. Furthermore, the income-based approach is adopted to estimate the average present value of data assets, with the results of the two methods serving to validate each other. The comparison of estimation results under the cost approach and the income approach further validates the relationship between input and output in data assets. This also demonstrates that data assets follow the law of diminishing marginal utility, thereby contradicting the notion that data increases in value with greater usage. This study enriches the theoretical framework of data asset valuation, broadens its application scope, and provides meaningful guidance for advancing data asset accounting practices and related research.

Keywords: data assets; value assessment; commercial banks; ARIMA model

1. Introduction

With the continuous advancement of digital technology, data has increasingly been recognized as a core factor of production. This recognition offers a new source of momentum for economic growth [1]. To further unlock the latent value of data, theoretical investigation and practical exploration of data assets are considered essential. The promulgation of the Interim Provisions on Accounting Treatment for Enterprise Data Resources (hereinafter referred to as the Interim Provisions) marks the conclusion of the debate over the question “Should data assets be included in the balance sheet?” However, the increase in assets resulting from the inclusion of data assets on the balance sheet warrants careful consideration. The inclusion of data assets on corporate balance sheets is likely to increase enterprise value significantly [2]. Because the data market is immature and existing valuation frameworks are inadequate [3], firms may be incentivized to overstate the value of their data assets. This may contribute to the emergence of a “data asset bubble” [4]. Over time, the reported value of data assets may diverge from underlying fundamentals, ultimately hindering economic growth. Therefore, balancing data assets between “stabilizing growth” and “mitigating risks” has become a central challenge in contemporary data governance. Establishing a rigorous, effective data asset valuation system is essential to preventing data asset bubbles and improving the efficiency of data circulation.

Based on the characteristics of data assets and their current applications, their value primarily manifests in two aspects. The first is economic value, which refers to the financial benefits generated for owners during the asset’s revenue-generating period. The second is social value, which arises from conceptualizing data assets as a new form of production factor [5]. As tools that support social production activities, data assets contribute to cost reduction and efficiency gains across society. However, the academic and professional communities currently lack a scientifically standardized methodology for reliably measuring the value of data assets. The Interim Provisions, recognizing the “non-physical” nature of data assets, classify them as intangible assets and measure them using historical cost [6]. Yet, owing to the complexity of data asset cost structures and their limited observability, subsequent measurement challenges, such as impairment and amortization, also require careful consideration [7]. Consequently, the operational feasibility and long-term measurement accuracy of the cost approach remain questionable. Moreover, the value creation of data assets is highly contingent upon the digital ecosystems in which they are embedded. Commercial banks have spearheaded the digital transformation of the financial industry, accumulating substantial volumes of data in the process. Currently, the primary applications of these data assets within commercial banks include user profiling, precision marketing, risk monitoring, and operational optimization. The main objectives are to enhance the efficiency of financial services and reduce systemic risks. Accurately assessing the value of data assets in commercial banks is therefore crucial for improving the predictive capacity of risk management strategies.

With ongoing advances in data asset valuation theory and practice, a diversified valuation framework has gradually emerged, encompassing the cost, income, and market approaches [8,9]. In theory, the income approach to valuation aligns more closely with the “income-generating” characteristic of data assets and their intrinsic economic value. Within the System of National Accounts (SNA), the scope of the income approach is relatively narrow. Moreover, according to the SNA’s analysis of operational feasibility, priority should be given to the income approach over the market and cost approaches. The bottleneck in measuring data asset returns lies in their embedded nature, which makes it impossible to reliably quantify their contribution rate, thereby hindering the accurate calculation of the cash flows they generate.

Given the current lack of unified standards for valuing data assets, commercial banks are employed as a case study to explore valuation methods and their practical applications. By examining the strengths, weaknesses, and applicability of cost-based and income-based approaches, the limitations of each measurement method are systematically identified. Building on this foundation, an improved, integrated valuation model that combines both methodologies is introduced to measure the data assets of sample banks. It is assumed that this integrated model enhances measurement accuracy. Finally, by comparing valuation results, we further illustrate the overall investment in data assets and their profitability within China’s banking sector. Compared with the existing literature, the incremental contributions of this paper are primarily reflected in the following aspects: (1) Although research on data assets in China began relatively late, practical applications in this field have advanced rapidly. Among the current diverse valuation frameworks for data assets, the cost and income approaches are the two most widely advocated methods, attracting significant attention across various sectors. This paper provides a theoretical framework for future research on data asset management by comprehensively exploring two approaches. (2) Compared to existing research, this paper offers a relatively novel accounting perspective. While prior studies on the cost-based measurement of data assets provide detailed explanations of cost composition and types, the difficulty in obtaining statistical data for each cost component limits their broader applicability. This paper adopts a labor supply-and-demand perspective and employs the dictionary method to assess labor costs associated with data assets, thereby deriving the total investment cost of data assets. (3) Enhancing the decision-making value of data assets contributes to the stable development of both data assets and capital markets. 
This paper enriches the valuation framework by exploring methods for valuing data assets and their practical applications, thereby improving banks’ efficiency in risk prevention and data governance. Furthermore, scientifically rigorous and effective valuation methods can facilitate the circulation of data assets, promote the implementation of policies for the “inclusion of data assets on balance sheets,” and help maintain the stability of capital markets.

2. Literature on Data Asset Valuation

The income approach primarily values assets based on the present value of their expected net income over their useful lives. Evolving from the residual income model, it was first proposed by Edwards and Bell (1961) for calculating corporate equity value and was later systematically elaborated by Ohlson (1995) in equity valuation studies [10,11]. To date, the primary models within the income approach include the discounted free cash flow model, the discounted economic value-added model, the discounted dividend model, and the adjusted present value model. Since the income approach reflects the present value of future earnings, it requires reliable estimates of both incremental future earnings and the duration of the earnings period. The Autoregressive Integrated Moving Average (ARIMA) model, first introduced by Box and Jenkins (1962), is a time series analysis tool that linearly combines historical data and influencing factors to estimate future values [12]. As a result, it is commonly employed in scenarios such as revenue forecasting. However, the core assumption of ARIMA models necessitates testing for stationarity. As the time series lengthens, the model’s long-term predictive power diminishes due to factors such as seasonal effects and external shocks. Therefore, ARIMA is frequently used to capture short-term future trends. Financial time series typically exhibit characteristics such as non-stationarity and periodicity. As a result, ARIMA models have been adopted by some scholars to forecast financial risks. Pahlavani and Roshan (2015), as well as Adesina and Obokoh (2025), have employed ARIMA models to forecast bank exchange rates [13,14]. Furthermore, Sözen (2025) applied ARIMA models to forecast cryptocurrency yields, providing a useful reference for the application of ARIMA models in emerging financial domains [15].

Given the characteristics of commercial bank data assets in scenario-based applications, the application of the income approach is currently constrained by three primary challenges: (1) Determining the income period of data assets; (2) Establishing the discount rate; (3) Characterizing the income distribution function during the income period of data assets [16].

Determining the revenue distribution of data assets is a prerequisite for valuation under the income approach. Owing to the inherently embedded nature of data assets [17], isolating them for direct observation of actual revenues is difficult. Therefore, this paper draws on income-approach methods for intangible assets, which are similarly “non-physical,” for value measurement [18,19]. A representative example of such an approach is the patent renewal model [20]. The core calculation method of this model involves evaluating the present value of the average expected net revenue from a patent by integrating its initial revenue distribution function and projected decay pattern, while ensuring the maximization of the patent holder’s revenue. Similar to intangible assets such as patents, data assets possess an initial value. After being processed along the data value chain, data assets acquire an initial value upon registration and rights confirmation [21]. Theoretically, this value reflects the price that willing parties are prepared to pay, and it tends to increase in the short term as the data is applied. However, data assets differ from the setting assumed by the patent renewal model. Patents and other intangible assets represent a concentrated embodiment of the contributions made by the stock of technical knowledge [22]. Their high imitation and replication costs make it unlikely that they will be replaced in the short term. In contrast, owing to characteristics such as data replicability and low marginal costs, data circulate rapidly. Over time, this leads to increased data reuse, while gradually diminishing the monopoly power of the data holder [23]. Based on this analysis, the return distribution of data assets exhibits an overall pattern of initial growth followed by gradual decline.

Given that the physical lifespan of data assets is difficult to observe and there is no statutory time limit for reference, the revenue period of data assets requires discussion. The economic lifespan of data assets is influenced by multiple factors, including contractual terms, legal stipulations, and data asset management practices. For example, the economic lifespan of data assets sold under license agreements is determined by the contractual terms agreed upon by both parties. Currently, several cities in China are conducting pilot programs for data asset management. Local governments have issued relevant documents to actively promote the management of data assets, such as Shenzhen’s “Measures for the Administration of Data Property Rights Registration”. The revenue periods of data assets vary depending on their scenario-based features. From the perspective of commercial banks, the informational value of data assets primarily includes user profiling, precision marketing, risk monitoring, and operational optimization. Regarding data collection, ensuring the quality, security, and sourcing of external data remains challenging [24]. Commercial banks in China operate under an internally isolated network data exchange system comprising internal, external, dedicated, and financial networks. From a micro-prudential regulatory perspective, to facilitate penetrating supervision, banks primarily share information with higher-level regulatory bodies, such as the central bank, via external networks. Consequently, the central bank possesses comprehensive data on all commercial banks. Corporate management practices within banks have intensified their profit-seeking, and competitive dynamics have led to information barriers between institutions. As a result, the data-sharing and exchange mechanisms of commercial banks primarily operate through internal exchanges [25].
This information exchange mechanism can reduce the reuse rate of data assets, extend their economic lifespan, and thereby maximize returns.

The cost approach assesses the value of data assets by measuring the total cost inputs incurred during data production and processing. Accordingly, this approach requires a systematic classification of the costs associated with data assets. Building on this methodology, Xu et al. (2022) classified data asset costs along the data value chain, encompassing labor compensation, intermediate inputs, fixed capital consumption, net operating surplus, and other production-related taxes [26]. However, the cost composition of data assets is complex, and obtaining cost data is difficult, resulting in a lack of reliable statistical information for the initial measurement of data asset costs. In addition, challenges related to the subsequent measurement of data assets, such as impairment and amortization, have emerged as major constraints on the application of the cost approach [27]. The current lack of scientifically unified professional assessment methodologies prevents the determination of key parameters, such as amortization methods and useful lives, thereby hindering the confirmation of expected consumption patterns for data assets. Taken together, these challenges constrain the practical application of the cost approach and impede the broader recognition of data assets on balance sheets.

In the digital economy era, the role of data as a productive factor has become increasingly prominent [28]. With the progressive recognition of data assets on balance sheets and the implementation of policies such as data asset pledging, data assets are expected to give rise to a more active market in the future. However, the existing data asset valuation framework remains insufficient to meet the requirements of both theoretical advancement and practical application. This paper reviews existing research to summarize the primary approaches to data asset valuation and their application constraints. It aims to establish a theoretical foundation and provide directional guidance for the subsequent optimization of existing valuation methods and their application in specific business scenarios.

3. Data Asset Valuation Model Design

Building on the above analysis, this paper presents a composite measurement system consisting of three subsystems, as shown in Figure 1: First, the cost tracking system primarily addresses the issue of data capitalization—specifically, the aggregation of data asset costs. By implementing data asset tagging, costs are progressively allocated to data assets, enabling cost traceability and real-time tracking. This paper classifies data asset costs into labor and non-labor categories from a labor input perspective. Text analysis methods are used to estimate labor costs related to data assets, with proportional calculations subsequently applied to determine overall cost levels. Second, the dynamic value forecasting system primarily addresses the future revenue of data assets under the income approach. By setting parameters such as useful life and discount rate and projecting future revenues, the current value of data assets can be determined. This system reflects the earning potential and economic value of data assets. It also reduces errors arising from relying on a single measurement method through comparative analysis, enhancing the persuasiveness of estimation results. This paper utilizes the ratio of initial data asset investment to bank net profit as the input-output ratio, thereby reflecting its asset contribution rate. This figure is subtracted from the asset pool to represent the baseline revenue of data assets. An ARIMA model is employed to forecast future revenues for data assets beyond the sample, which are then discounted to determine their present value. Finally, the value verification system provides feedback and optimization for both the cost tracking system and the dynamic value prediction system. By forecasting future data asset cost investments using the ARIMA model, the marginal contribution rate of data assets is calculated based on future cost investments divided by future asset returns, enabling assessment of data asset quality. 
Based on these estimates, cost investments are optimized and projections of future data asset returns adjusted.

Because the value of data assets is inherently embedded in other assets and activities, isolating and separating them is challenging. Therefore, to ensure the validity and scientific rigor of the proposed valuation system, the following assumptions are introduced: (1) As productive assets, the ultimate output of data assets is reflected in their contributions to bank revenue growth, cost savings, or risk reduction. These contributions are comprehensively captured by changes in a bank’s net profit and other profitability indicators. To more accurately identify the incremental contribution and economic value of data assets, Hypothesis 1 is proposed: Changes in a commercial bank’s net profit can be used as an indicator of the contribution rate of data assets. (2) Within the commercial banking context, data assets are classified into self-use data assets and trading data assets. Due to the underdeveloped state of the data circulation market, data assets generally have low liquidity. As a result, the analysis focuses on the valuation of self-use data assets. Thus, Hypothesis 2 is proposed: Data assets in commercial banks are productive assets that do not independently generate revenue. (3) Unlike traditional production factors such as land and labor, data elements primarily function as enabling factors and do not independently generate revenue. As an “empowering factor,” data enhances the productivity of conventional production factors. For instance, in commercial banking, data-driven risk-control models improve the quality of credit assets. Changes in net profit result from synergies within an asset portfolio that includes data assets. Therefore, Hypothesis 3 is proposed: Annual profit changes in commercial banks are the result of the synergistic interaction between data assets and other asset types.
(4) Building on Hypothesis 3, it is necessary to isolate the contribution of data assets by quantifying their share within the asset portfolio in preparation for estimating the present value of data assets. Therefore, Hypothesis 4 is proposed: By treating data assets and other asset types as a single asset group, the revenue change attributable to data assets is identified by excluding costs associated with other asset types.

3.1. Estimation of Initial Costs for Data Assets

The immediate priorities for addressing the initial cost estimation of data assets are twofold. First, the types of costs incurred in producing and creating data assets must be determined. This involves the issue of cost aggregation. Second, it is necessary to identify which costs can be capitalized and included in the cost of data assets. Xu et al. (2022) proposed a data value chain model based on the production and processing of data [26]. A relatively high share of data capitalization expenditures consists of labor costs, including production and technical personnel expenses. With the expansion of the digital economy, occupational structures and labor supply and demand have shifted. The transformation of human resources is now an essential component of digital technology development. Changes in human capital expenditures have become a key indicator of progress in digital technological innovation. Babina et al. (2024) used employee résumé data from U.S. firms to measure the level of corporate AI development [29]. Tan et al. (2025) evaluated corporate digital transformation by examining digital talent demand using corporate recruitment data [30]. These studies offer new insights for the present research. Therefore, this paper proposes a novel method for measuring overall cost investment in data assets from the perspectives of production activities and labor input. Based on disclosures of the main work content of digital professionals in annual reports, text analysis techniques are used to estimate these costs.

3.2. Data Asset Revenue Model Assumptions

Patents generate expected returns as intangible assets by granting market monopoly power through rights-based protection of highly concentrated knowledge capital. However, this monopoly power gradually diminishes as patent information becomes publicly disclosed, a process commonly referred to as patent revenue decay [31,32]. Data assets differ from patents in that they grant short-term data monopoly power to rights holders via legal protections, thereby generating temporary excess returns stemming from data asymmetry [33]. Data assets are inherently defined by their replicability, substitutability, and low marginal costs [34]. These characteristics promote data flow, thus enhancing data reuse. Consequently, the excess returns generated by data assets gradually decrease after reaching a peak, ultimately reverting to a normal level. Therefore, this paper posits that the marginal benefits of data assets do not increase monotonically; rather, they exhibit an inverted U-shaped pattern.

3.3. Setting the Revenue Period for Data Assets

The retirement mechanism of data assets differs fundamentally from that of tangible assets, such as fixed assets. With respect to useful life, fixed assets are continuously depreciated as production activities advance, ultimately reaching their net salvage value. In contrast, data assets, as non-consumable resources [35], experience only transitions between old and new storage media—such as hard drives and servers—rather than actual extinction. Instead, data capital gradually accumulates during the utilization of data assets. Although the application registration date of data assets can be obtained from the data asset registration authority, relevant departments have not issued documents specifying the validity period of data assets. Consequently, the revenue period for data assets cannot be determined through a simple subtraction of the application year from the expiration year. The Interim Provisions classify self-used data assets as intangible assets; accordingly, this paper estimates the revenue period of data assets by referencing the amortization period applicable to intangible assets. In accordance with the prudence principle in accounting—which seeks to prevent the overstatement of asset values and revenues—the minimum amortization period of ten years for intangible assets is adopted as the lower bound for the revenue period of data assets. This benchmark is subsequently employed to estimate the present value of data-asset-generated revenues.

3.4. Discount Rate Calculation

The discount rate represents the expected rate of return under given conditions. Owing to the absence of a mature and efficient market for the circulation of data assets, an appropriate discount rate cannot be determined based on market interest rates for comparable assets. Moreover, the determination of the discount rate must incorporate the internal risk profiles of commercial banks, as well as the uncertainty surrounding future returns from data assets. Therefore, this paper adopts a discount rate defined as the sum of the benchmark interest rate and the market risk premium, wherein the benchmark interest rate reflects the risk-free rate, and the market risk premium consists of both the risk compensation required by investors and asset-specific risks, as presented in Equation (1):

R_{d} = R_{f} + β (R_{m} - R_{f}) + ε

(1)

In Equation (1), R_{d} denotes the discount rate for data assets, R_{f} denotes the risk-free rate, R_{m} represents the average market return, β denotes the systematic risk coefficient, and ε denotes the asset-specific risk coefficient.
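As a minimal sketch of how Equation (1) could feed into a present-value calculation, the Python snippet below computes R_{d} and discounts a flat revenue stream over the ten-year lower-bound revenue period from Section 3.3. All numeric inputs are hypothetical and chosen only for illustration, and the end-of-year discounting convention is an assumption rather than something stated in the paper.

```python
def discount_rate(r_f, beta, r_m, epsilon):
    """Equation (1): R_d = R_f + beta * (R_m - R_f) + epsilon."""
    return r_f + beta * (r_m - r_f) + epsilon


def present_value(cash_flows, rate):
    """Discount year-end cash flows (year 1, 2, ...) at a constant rate.

    End-of-year timing is an assumption made for this sketch.
    """
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows, start=1))


# Hypothetical inputs, not taken from the paper.
r_d = discount_rate(r_f=0.03, beta=1.1, r_m=0.08, epsilon=0.01)

# Ten equal annual revenues, matching the ten-year lower bound on the revenue period.
flows = [100.0] * 10
pv = present_value(flows, r_d)
print(f"R_d = {r_d:.4f}, PV = {pv:.2f}")
```

In practice the flat revenue stream would be replaced by forecast revenues, and ε would need a separately justified estimate of asset-specific risk.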

4. Data Asset Valuation Model Application

This section applies and evaluates the aforementioned model. Using commercial banks from 2007 to 2022 as the sample, text analysis is employed to measure data asset costs, forecast data-asset-related investments and net profits, and assess the contribution rate of data assets based on their input–output performance. Future returns on data assets are subsequently estimated, and the average total present value of data assets is calculated according to the return period and discount rate. While validating the effectiveness of the proposed model, this study also addresses a widely debated question in both academic and media circles: “Does data become more valuable the more it is used?”

4.1. Data Sources, Explanations, and Processing

When estimating the initial cost of data assets, this paper selected 41 commercial banks in China from 2007 to 2022 as the research sample. To ensure comprehensive coverage of bank types, the sample comprised five large state-owned commercial banks, 10 joint-stock banks, 20 city commercial banks, and six rural commercial banks. To estimate the future returns of data assets, Huaxia Bank is used as an illustrative example for both calculation and cross-verification. Annual report text data were obtained from the official websites of the respective commercial banks, and financial data were sourced from the CSMAR and WIND databases.

4.2. Data Asset Initial Cost Estimate

Xu et al. (2022) classified data asset costs into labor costs, intermediate inputs, and depreciation of fixed assets, among other categories, according to the production and processing stages of the data value chain [26]. Nevertheless, the limited availability of relevant statistical data constrains the practical applicability of this classification. Based on the preceding analysis, the substantial increase in human capital is identified as a crucial manifestation of data asset accumulation. Accordingly, this study categorizes data asset costs into labor and non-labor components. For the labor cost component, text analysis is employed to estimate costs, following the procedure outlined below:

First, the underlying logic was established, and seed terms were selected. Data evolves from unstructured, low-value fragments into structured, high-value assets that can be utilized, forming a “data value chain”. Accordingly, this paper identifies data collection, data storage, data analysis, and data application as foundational seed terms. Additionally, classifications and descriptions of newly introduced digital occupations were gathered from the 2022 edition of the National Occupational Classification of the People’s Republic of China. By filtering primary job responsibilities relevant to commercial banking applications, these foundational seed terms were supplemented to construct a comprehensive seed dictionary.

Second, the seed vocabulary was expanded to construct the final seed dictionary. Starting from the foundational seed terms and the occupational descriptions gathered in the first step, the vocabulary was expanded, and terms with a frequency below 100, expanded terms with a similarity score below 0.6, and semantically redundant expansions were excluded. This process produced the final seed dictionary, as presented in Table 1 and Figure 2.
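The filtering rule in this step can be sketched in Python as follows. This is an illustration under stated assumptions: the candidate terms, frequencies, and similarity scores are hypothetical (they are not the entries of Table 1), the paper does not specify how similarity is computed, and "semantically redundant" is approximated here by exact duplicates only.

```python
def filter_expansions(candidates, min_freq=100, min_sim=0.6):
    """Keep candidate terms that pass the frequency and similarity thresholds.

    candidates: list of (term, corpus_frequency, similarity_to_seed) tuples.
    Exact duplicate terms are dropped, keeping the first occurrence
    (a crude stand-in for removing semantically redundant expansions).
    """
    kept, seen = [], set()
    for term, freq, sim in candidates:
        if freq < min_freq or sim < min_sim or term in seen:
            continue
        seen.add(term)
        kept.append(term)
    return kept


# Hypothetical candidates: (term, frequency, similarity score).
candidates = [
    ("data mining", 350, 0.82),
    ("data mining", 350, 0.82),    # duplicate, dropped
    ("cloud storage", 90, 0.75),   # frequency below 100, dropped
    ("big data", 1200, 0.55),      # similarity below 0.6, dropped
    ("data governance", 410, 0.71),
]
dictionary = filter_expansions(candidates)
print(dictionary)
```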

Third, word frequency and sentiment polarity were analyzed. Given the complexity of Chinese-language expression in financial corpora, Python 3.7 was used to calculate the frequency of each feature term in corporate annual reports using the data asset dictionary. Subsequently, SnowNLP was applied to conduct sentiment polarity analysis for each lexeme, resulting in feature term frequencies adjusted by sentiment polarity (positive, neutral, negative).

Fourth, to estimate labor costs associated with data assets, the frequencies of positive and neutral feature words—adjusted for sentiment polarity—are summed. This sum is divided by the total word count in the annual report and multiplied by 100 to generate an indicator representing the bank’s digital job penetration rate. This indicator is then multiplied by the bank’s increment in employee compensation payable for the respective period to calculate labor costs attributable to data assets. The calculation formula is as follows:

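cl_{i,t} = \frac{\sum_{n} Dwords_{i,t,n}}{Tword_{i,t}} \times w_{i,t} \times 100

(2)

In Equation (2), cl_{i,t} denotes the labor cost attributable to data assets for bank i in year t, Dwords_{i,t,n} denotes the sentiment-adjusted frequency of feature word n, Tword_{i,t} denotes the total word count of the annual report, and w_{i,t} denotes the increment in employee compensation payable.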
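A minimal Python sketch of Equation (2) is given below. It assumes that "adjusted for sentiment polarity" means occurrences labeled negative are excluded from the sum, which is one reading of the fourth step; the term counts, polarity labels, word total, and compensation increment are all hypothetical.

```python
def labor_cost(term_counts, total_words, compensation_increment):
    """Equation (2): share of positive/neutral dictionary-term occurrences
    in the annual report, times the compensation increment, times 100.

    term_counts: dict mapping term -> (frequency, polarity), where polarity
                 is "positive", "neutral", or "negative".
    """
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    kept = sum(
        freq
        for freq, polarity in term_counts.values()
        if polarity in ("positive", "neutral")  # assumption: negatives are dropped
    )
    return kept / total_words * compensation_increment * 100


# Hypothetical counts for one bank-year.
counts = {
    "data analysis": (40, "positive"),
    "data storage": (25, "neutral"),
    "data application": (10, "negative"),
}
cl = labor_cost(counts, total_words=100_000, compensation_increment=2_000.0)
print(round(cl, 2))
```

The result is expressed in the same monetary unit as the compensation increment.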

After deriving labor cost estimates, it is essential to determine non-labor costs to obtain the total cost of data assets. Following the methodology of Liu et al. (2023) [36], the proportional estimation principle is applied to estimate the ratio of labor to non-labor costs, thereby calculating overall data asset costs. It is assumed that the ratio of labor to non-labor costs within data capital remains constant. Relevant data were collected from 376 banks covering the period 2007–2022. Statistical analysis reveals that, for most sample banks, the ratio of fixed asset depreciation and related expenses to employee compensation averages approximately 0.45, indicating that capital costs constitute roughly 45% of labor costs. Furthermore, according to a questionnaire survey conducted by the Chinese Academy of Fiscal Sciences on enterprise costs, tax obligations and related expenses in the financial and insurance sector account for about 14% of total enterprise costs. Non-labor costs in this sector primarily comprise capital costs, tax burdens, and other operational expenses. Overall, calculations suggest that non-labor costs account for approximately 70% of labor costs, thereby enabling the estimation of annual data asset costs for banks.
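The proportional step can be sketched in Python as below. The 0.70 ratio is the figure reported in the text. The helper `implied_ratio` is a hypothetical consistency check, not the paper's derivation: it assumes non-labor cost consists only of capital cost (0.45 of labor cost) and taxes (14% of total cost) and ignores other operating expenses.

```python
NON_LABOR_TO_LABOR = 0.70  # ratio of non-labor to labor cost reported in the text


def total_data_asset_cost(labor_cost, ratio=NON_LABOR_TO_LABOR):
    """Total cost = labor cost + non-labor cost, with non-labor = ratio * labor."""
    return labor_cost * (1.0 + ratio)


def implied_ratio(capital_to_labor=0.45, tax_share_of_total=0.14):
    """Solve N = a*L + s*(L + N) for N/L.

    Assumption for this sketch: non-labor cost N is capital cost (a*L) plus
    taxes (a share s of total cost L + N); other expenses are ignored.
    """
    return (capital_to_labor + tax_share_of_total) / (1.0 - tax_share_of_total)


total = total_data_asset_cost(100.0)   # hypothetical labor cost of 100
ratio_check = implied_ratio()          # close to, but slightly below, 0.70
print(round(total, 2), round(ratio_check, 3))
```

Under these assumptions the implied ratio comes out slightly below 0.70, which is compatible with the rounded figure once other operating expenses are allowed for.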

As shown in Figure 3, commercial banks’ overall data investment exhibits a trend of annual growth. This indicates that since the People’s Bank of China formally proposed the digital transformation of the banking industry in 2009, commercial banks have progressively strengthened their development of information infrastructure. Concurrently, greater emphasis has been placed on data-related investment. An examination of the growth trajectory reveals that China Construction Bank’s data-related cost investment substantially exceeds that of the other four banks. This disparity is likely attributable to the substantial advantages held by large state-owned banks in terms of data scale, capital resources, and cost efficiency. Together, these factors constitute a solid foundation for the advancement of data assetization.

4.3. Data Asset Revenue Forecast Validation

The Autoregressive Integrated Moving Average (ARIMA) model is one of the most widely used approaches for modeling non-stationary time series and is frequently applied in contexts such as economic forecasting and earnings projections. The ARIMA model combines an autoregressive (AR) component with a moving average (MA) component and is characterized by three parameters: p, d, and q. Here, p is the number of lagged observations of the series included in the forecasting model, d is the order of differencing required to convert the original time series into a stationary sequence, and q is the number of lagged forecast errors incorporated into the model. Given p, d, and q, the ARMA model for the differenced series y_{t}, with forecast errors e_{t}, can be expressed as:

y_{t} = μ + φ_{1} y_{t-1} + \dots + φ_{p} y_{t-p} + θ_{1} e_{t-1} + \dots + θ_{q} e_{t-q}

(3)
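To make Equation (3) concrete, the following sketch evaluates its right-hand side for given coefficients and histories. It is a minimal illustration under stated assumptions, not the estimation procedure used in the paper: the function name, the coefficient values, and the short example series are all hypothetical.

```python
from typing import Sequence


def arma_one_step(mu: float, phi: Sequence[float], theta: Sequence[float],
                  y_hist: Sequence[float], e_hist: Sequence[float]) -> float:
    """Evaluate the right-hand side of Equation (3).

    y_hist and e_hist are ordered oldest to newest; phi[0] multiplies the most
    recent observation and theta[0] the most recent forecast error.
    """
    p, q = len(phi), len(theta)
    if len(y_hist) < p or len(e_hist) < q:
        raise ValueError("not enough history for the requested lag orders")
    ar_part = sum(phi[i] * y_hist[-1 - i] for i in range(p))
    ma_part = sum(theta[j] * e_hist[-1 - j] for j in range(q))
    return mu + ar_part + ma_part


# Hypothetical ARMA(1, 2) example on an already differenced series.
forecast = arma_one_step(mu=0.5, phi=[0.5], theta=[0.2, -0.1],
                         y_hist=[1.0, 2.0], e_hist=[0.3, -0.4])
print(forecast)
```

With these inputs the AR part contributes 0.5 × 2.0 and the MA part contributes 0.2 × (−0.4) + (−0.1) × 0.3, so the value is about 1.39.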

A time series is considered stationary if its statistical properties, such as mean and variance, remain constant over time. Most time series models are developed under the assumption of stationarity and typically decompose the series into four components: long-term trends, seasonal variations, cyclical fluctuations, and irregular variations. Long-term trends (T) represent the overall directional movement over an extended period driven by fundamental factors. Seasonal variation (S) reflects regular, predictable fluctuations within a year associated with seasonal changes. Cyclical variation (C) captures wave-like fluctuations that span multiple years. Irregular variation (I) encompasses unpredictable fluctuations, including purely random changes as well as abrupt, irregular events with significant impact.

This study analyzes the data asset costs and net profits of Huaxia Bank. The data are first subjected to visualization procedures. Stationarity tests are then performed on the series using logarithmic transformation, differencing, and decomposition techniques. Stationarity is achieved through decomposition, which separates the series into residual, trend, and seasonal components. After conducting white noise tests, an additive model is applied to generate a forecast sequence for the original series.

4.3.1. Data Preprocessing

This study employs Huaxia Bank’s quarterly data from 2007 to 2022, focusing on the current period’s changes in asset costs and net profits as research variables. The dataset comprises 64 observations, spanning from the first quarter of 2007 to the fourth quarter of 2022. The model’s predictive accuracy is subsequently validated using 2023 quarterly data. Initially, the collected data are visualized to assess the time-series distribution of historical values. As illustrated in Figure 4 and Figure 5, both data asset costs and net profits demonstrate an upward trend over time, interspersed with periodic fluctuations.

4.3.2. Stationarity Test

The Augmented Dickey–Fuller (ADF) test is commonly used to test the stationarity of a raw time series. Its null hypothesis is that the series contains a unit root, that is, that the series is non-stationary. For a non-stationary series, the test statistic is not significant at the chosen significance level (it lies above the critical value), so the null hypothesis cannot be rejected. Conversely, if the series is stationary, the null hypothesis is rejected.

First, the original time series for the increase in data asset costs and the increase in net profit were examined. The results, presented in Table 2 and Table 3, report t-statistics of −2.759 and −0.630, respectively. Both values lie above the critical values at the 1%, 5%, and 10% significance levels for the ADF test, so the null hypothesis cannot be rejected for either original time series, and both series are identified as non-stationary. Second, after logarithmic transformations were applied to both series, the resulting t-statistics were −9.381 and −2.931, respectively. These results indicate that the log-transformed increase in data asset costs is stationary, whereas the log-transformed increase in net profit remains non-stationary. Finally, after first-order differencing was applied to both series, the t-statistics were −6.960 and −10.024, respectively. Both statistics fall below the corresponding critical values at the 1%, 5%, and 10% significance levels, demonstrating that both differenced time series are stationary.
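The decision rule applied to these statistics can be sketched as a comparison of each t-statistic with the critical values reported in Table 2 and Table 3. The helper name below is hypothetical; the numbers are those reported in the tables.

```python
def adf_rejections(t_stat: float, critical_values: dict) -> dict:
    """For each significance level, report whether the unit-root null is rejected.

    The null is rejected when the test statistic lies below the critical value.
    """
    return {level: t_stat < cv for level, cv in critical_values.items()}


# Data asset cost series, raw and first-differenced (Table 2).
raw_cost = adf_rejections(-2.759, {"1%": -4.118, "5%": -3.487, "10%": -3.172})
diff_cost = adf_rejections(-6.960, {"1%": -4.127, "5%": -3.491, "10%": -3.174})

# Net profit series after the logarithmic transformation (Table 3).
log_profit = adf_rejections(-2.931, {"1%": -4.148, "5%": -3.500, "10%": -3.179})

print(raw_cost)
print(diff_cost)
print(log_profit)
```

The raw cost series and the log-transformed net profit series reject at no level, while the differenced cost series rejects at all three, which matches the conclusions drawn in the text.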

4.3.3. Model Solution

In the aforementioned stationarity tests, the first-differenced time series satisfied the requirements for stationarity. Accordingly, the differencing order was set to d = 1 for both time series. The truncation and tailing patterns of the autocorrelation and partial autocorrelation coefficients were then analyzed to determine the model’s p and q parameters. Different combinations of p and q were tested repeatedly, and the lag orders were chosen by minimizing the Akaike Information Criterion (AIC) and the Schwarz Criterion (SC). As shown in Table 4 and Table 5, the final model selected for the data asset cost series is ARIMA(1,1,2), whereas the net profit series is best represented by an ARIMA(1,1,1) specification. R² measures how well a model fits the observed data; the closer the value is to 1, the better the fit. Here, the R² and adjusted R² values for the two time series are approximately 0.4. This suggests that the models capture certain trends and periodicity within the series, yet a substantial portion of the fluctuations remains unexplained. The operations of commercial banks are influenced by numerous complex factors, and historical time series data may be affected by external shocks such as COVID-19, changes in monetary policy, and economic uncertainty. As a result, the models achieve only a moderate fit and have limited ability to explain the time series data.
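The information criteria in Table 4 and Table 5 appear consistent with per-observation definitions of AIC, SC, and HQ computed from the reported log-likelihoods. The sketch below reproduces them under two assumptions that are inferred by matching the reported figures and are not stated in the paper: 63 and 61 usable observations, respectively, and four estimated parameters in each model.

```python
import math


def information_criteria(log_likelihood: float, n_obs: int, n_params: int) -> dict:
    """Per-observation AIC, SC, and HQ computed from a log-likelihood."""
    base = -2.0 * log_likelihood
    return {
        "AIC": (base + 2.0 * n_params) / n_obs,
        "SC": (base + n_params * math.log(n_obs)) / n_obs,
        "HQ": (base + 2.0 * n_params * math.log(math.log(n_obs))) / n_obs,
    }


# n_obs and n_params are back-solved assumptions, not values stated in the paper.
cost_ic = information_criteria(-1.035394, n_obs=63, n_params=4)    # Table 4
profit_ic = information_criteria(24.38113, n_obs=61, n_params=4)   # Table 5
print(cost_ic)
print(profit_ic)
```

Under these definitions a lower value is preferred, so comparing candidate (p, q) pairs reduces to computing the three criteria for each fitted model and choosing the smallest.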

After the models for the two time series are determined, an adaptability test must be performed on the models. The unit root test confirmed that the AR and MA parameters fall within reasonable bounds. As illustrated in Figure 6 and Figure 7, the autocorrelation and partial autocorrelation coefficients of the residual series remained within the standard deviation bands and exhibited low p-values. The absence of autocorrelation in the residuals indicates that they behave as white noise. This finding confirms that no critical information remains unexplained in the residual series, thereby validating the appropriateness of the specified models.

4.3.4. Model Prediction

This study employs Huaxia Bank’s data from the first quarter of 2007 through the fourth quarter of 2021 as the in-sample period. Dynamic forecasting methods were applied to the out-of-sample period in 2022. As illustrated in Figure 8 and Figure 9, the forecasted values align well with the original sample series, indicating that the model demonstrates satisfactory predictive performance. According to Table 6, the forecasted data asset costs for Q1–Q4 2022 are 86,025; 88,751; 90,011; and 90,593, respectively, while the forecasted net profits for the same period are 6,589,609; 6,691,213; 6,794,110; and 6,895,741, respectively. Although notable deviations exist between these estimates and the actual 2022 values, the overall annual discrepancies remain relatively small—at 3% and 7%—which supports the conclusion that the model’s predictive performance is generally robust. These deviations may be attributable to factors such as the bank’s annual and quarterly operational planning. To enhance the persuasiveness of the findings, this paper expanded the sample size and repeated the testing process, utilizing cross-validation to reinforce the robustness of the prediction results. To ensure comprehensive coverage of sample types, the Bank of China and Suzhou Rural Commercial Bank were chosen for testing. The final results, presented in Table 7 and Table 8, reveal significant deviations between the quarterly forecast values and the actual values. However, the overall annual discrepancies are relatively small and closely align with the estimates for Huaxia Bank.
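The relative errors reported for Huaxia Bank in Table 6 can be reproduced from the quarterly figures. The sketch below assumes the error is defined as (actual − predicted) / predicted, which is the definition that matches the tabulated values; the function name is hypothetical.

```python
def relative_error(actual: float, predicted: float) -> float:
    """(actual - predicted) / predicted, the definition that matches Table 6."""
    return (actual - predicted) / predicted


# Quarterly data asset costs for 2022 from Table 6.
actual_cost = [104_679, 76_246, 101_838, 62_327]
predicted_cost = [86_025, 88_751, 90_011, 90_593]

quarterly = [round(relative_error(a, p), 3)
             for a, p in zip(actual_cost, predicted_cost)]
annual = round(relative_error(sum(actual_cost), sum(predicted_cost)), 3)
print(quarterly, annual)
```

The quarterly errors have opposite signs and largely offset one another, which is why the annual error of about 3% is much smaller than any single quarterly error.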

4.4. Discount Rate Estimation

Based on the key indicators required for discount rate estimation in the aforementioned model, the risk-free rate R_{f} was obtained from the official website of the China Banking and Insurance Regulatory Commission. The average yield of Chinese government bonds over the past decade was adopted as the risk-free rate, giving R_{f} = 3.11%. The market’s expected return was approximated using the annualized average return of the CSI 300 Composite Index. According to the WIND database, R_{m} = 10.08%, and the average risk coefficient for the monetary and financial sector from 2007 to 2022 is 0.36. Therefore, β = 0.36. In addition to the market risk premium, a specific risk coefficient must be incorporated, because the Guidelines for Data Asset Valuation stipulate that risks specific to data assets must be considered. Recent valuation practices by securities firms for digital assets within the banking sector imply a risk premium of between 3% and 5%. Consequently, this study prudently adopts a specific risk coefficient of 4%, which falls within this range and aligns with market risk perceptions for comparable assets. Substituting these values into the formula gives R_{e} = 9.62%.
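The reported discount rate can be checked with a short calculation. The sketch below assumes a CAPM-style rate plus a specific risk premium, a form inferred here because it reproduces the reported 9.62% from the stated inputs; the function and parameter names are hypothetical.

```python
def required_return(risk_free: float, beta: float, market_return: float,
                    specific_risk: float) -> float:
    """CAPM-style rate plus a specific risk premium; all rates in percent."""
    return risk_free + beta * (market_return - risk_free) + specific_risk


r_e = required_return(risk_free=3.11, beta=0.36, market_return=10.08,
                      specific_risk=4.0)
print(round(r_e, 2))  # 9.62
```

The market risk premium term contributes 0.36 × (10.08 − 3.11) ≈ 2.51 percentage points, so the specific risk coefficient of 4% is the largest single component of the rate above the risk-free level.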

4.5. Data Asset Value Estimation Results

This paper employs the cost approach to estimate the investment costs of data assets and applies an ARIMA model to forecast their input–output relationship. From an incremental perspective, the marginal contribution of data assets is clearly identified. However, to further infer the annual contribution rate of data assets, it is necessary to incorporate their stock contribution. For intangible assets with uncertain lifespans, the amortization period should be at least 10 years. Accordingly, this study sets the amortization period for data assets at ten years. The contribution rate of data assets is calculated using the formula “(stock + incremental)/net profit.” This rate is used to estimate the annual revenue value of data assets, which is subsequently incorporated into the income-based valuation model to derive the present value of current data assets.
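The steps described above can be sketched as a contribution rate followed by a standard discounted-income calculation. This is an illustrative sketch only: it assumes that annual income attributed to data assets equals the contribution rate times net profit and that incomes are discounted at year end over the ten-year period; the function names and all input values are hypothetical.

```python
from typing import Sequence


def contribution_rate(stock: float, incremental: float, net_profit: float) -> float:
    """(stock + incremental) / net profit, as described in the text."""
    if net_profit == 0:
        raise ValueError("net_profit must be non-zero")
    return (stock + incremental) / net_profit


def present_value(annual_income: Sequence[float], discount_rate: float) -> float:
    """Discount end-of-year incomes for years 1..n at a constant rate."""
    return sum(income / (1 + discount_rate) ** t
               for t, income in enumerate(annual_income, start=1))


# Hypothetical inputs: stock 30, incremental 10, net profit 400 per year.
rate = contribution_rate(stock=30.0, incremental=10.0, net_profit=400.0)
income = [rate * 400.0] * 10          # ten-year benefit period, constant income
pv = present_value(income, discount_rate=0.0962)
print(rate, pv)
```

In practice the income stream would come from forecast net profits rather than a constant value, but the discounting step is unchanged.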

As shown in Figure 10, the value of data assets estimated using the income approach is considerably higher than that calculated using the cost approach. Given that data possess non-rivalrous and low marginal cost characteristics [28], both measures exhibit an upward trend over time, indicating that banks’ investments in data elements have been continuously increasing. The steady annual growth in both data asset investments and net profit suggests that banks have progressively recognized data elements as a new driver of growth. Driven by data elements, banks’ operational performance has further improved. Data value, as a key criterion for evaluating data assets, has nevertheless attracted substantial attention across all sectors. Therefore, to further examine issues such as data value and production efficiency, this paper organizes and analyzes both the original and forecast data. As shown in Figure 9, assuming the contribution rates of other production factors remain constant, the growth rate of banks’ net profits declines as data inputs increase. This decline indicates a reduction in the marginal contribution of data assets, which indirectly demonstrates the law of diminishing marginal returns for data assets [37] and supports the hypothesis that “data does not become more valuable with increased usage”. As the data reuse rate increases, data monopoly power weakens, and the excess returns generated gradually decline until they return to a rational level. Therefore, the proposition that “data becomes more valuable the more it is used” is not supported by these results.

5. Conclusions

With the rapid advancement of the digital economy, the productive role of data as an economic factor has become increasingly prominent and has emerged as a critical driver of economic growth. Consequently, the scientific and reasonable measurement of data asset value is of substantial importance for optimizing resource allocation, enhancing market competitiveness, and establishing a fair and effective market-based circulation system. Therefore, this paper examines the methods and applications used to assess the value of data assets. By comparing the strengths and weaknesses of the cost and income approaches, their inherent limitations are analyzed, and a composite valuation system integrating these two traditional methods is designed. The innovation of this paper lies in its ability to scientifically estimate initial investment costs for data assets from a labor-input perspective using text analysis methods, while ensuring the scientific validity of the measurement strategy. This approach provides a scientific and intuitive reflection of the changing trends in data asset investment levels, offering a viable new perspective for advancing related theoretical research. However, this paper employs a proportional estimation method to measure the non-labor component of data assets for the sake of simplifying cost aggregation, which carries certain limitations. First, banks exhibit individual heterogeneity in their cost structures. This uniform cost structure ratio overlooks the structural differences between banks, which may lead to systematic overestimation or underestimation of costs for specific institutions. Second, cost structures are dynamic in nature. As the digital economy progresses, digital technologies demonstrate pronounced economies of scale and learning curve effects. 
As data volumes expand and technologies mature, non-labor costs may decrease, while the demand for digital labor—represented by digital occupations—could rise rapidly, gradually increasing the proportion of labor costs. Static ratios fail to capture these dynamic shifts, introducing errors in cross-period comparisons. This paper recommends abandoning the proportional estimation principle when measuring future investments in data asset costs and, instead, suggests exploring new bases for cost allocation. In addition, this paper calculates the input-output ratio of data assets by integrating their initial investment costs with net profits, thereby determining their contribution rate. It then employs an ARIMA model to estimate future returns on data assets. Subsequently, through scientific theoretical inference, it sets the key parameters—the benefit period and discount rate—of the data asset income approach model. Finally, it derives the present value of the average expected return on data assets. Furthermore, by cross-validating the results derived from the two measurement approaches, their respective strengths and limitations are more clearly identified. This offers valuable insights for advancing related theoretical research and for promoting the development of data asset valuation systems.

Based on the analysis of data asset valuation, this paper proposes the following recommendations: (1) A comprehensive data asset valuation system should be established to accelerate the monetization process of data assets. As policies regarding the incorporation of data assets into balance sheets and their use as collateral are implemented, the importance of accurate data asset valuation has become increasingly evident. A robust data asset valuation system should be established to expedite the monetization of data assets, providing a basis for decision-making to advance the development of data circulation and monetization methods such as data trading, data pledging, and corporate financing. (2) Establish a diversified data asset valuation system and refine statistical survey methods for data. The findings of this study once again validate the limitations inherent in various valuation approaches. Therefore, while acknowledging these constraints, we can develop a relatively equitable data asset valuation framework. For instance, given the differences in digital ecosystems across various industries, distinct valuation methods can be applied to various sectors and application scenarios. Furthermore, the lack of relevant statistical data significantly constrains the application of data asset valuation methods. By strengthening statistical survey systems for data production activities, improving accounting statements and foundational accounting materials, and optimizing the recognition and accounting of data asset accounts, enterprises can enhance their data asset management capabilities and reinforce the standardization of data assets. For instance, after establishing data asset rights, a dedicated data asset tab can be added to record and input all cost information associated with the data assets. (3) Optimize the data governance system and extend the lifecycle of data assets. 
As technology continues to advance, data openness and sharing have become an inevitable trend. However, this paper observes that data assets possess a distinct revenue lifecycle. Therefore, enterprises should enhance data asset management capabilities, develop tailored management approaches for different application scenarios, safeguard critical data privacy, and reduce reuse frequency to extend the revenue lifecycle. For instance, as data asset registration systems mature, timely registration should be conducted to monetize data assets through rights-based approaches.

Author Contributions

Conceptualization, H.W.; Data curation, H.W. and Q.Z.; Formal analysis, H.W. and L.S.; Investigation, H.W. and Q.Z.; Validation, H.W.; Supervision, L.S.; Writing—original draft, H.W.; Writing—review & editing, L.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Shanghai Municipal Plateau Discipline Development Fund Project: “Management Science and Engineering—Financial Management Engineering”, grant number SH1201GYXK.

Data Availability Statement

Data is available on request from the authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Hannila, H.; Silvola, R.; Harkonen, J.; Haapasalo, H. Data-driven begins with DATA; potential of data assets. J. Comput. Inf. Syst. 2022, 62, 29–38.
2. Sestino, A.; Kahlawi, A.; De Mauro, A. Decoding the data economy: A literature review of its impact on business, society and digital transformation. Eur. J. Innov. Manag. 2025, 28, 298–323.
3. Stein, H.; Maass, W. Requirements for data valuation methods. In Proceedings of the 55th Annual Hawaii International Conference on System Sciences, Maui, HI, USA, 4 January 2022.
4. Penman, S.H. Accounting for intangible assets: There is also an income statement. Abacus 2009, 45, 358–371.
5. Nolin, J.M. Data as oil, infrastructure or asset? Three metaphors of data as economic value. J. Inf. Commun. Ethics Soc. 2020, 18, 28–43.
6. Tang, Z. Research on the accounting recognition and measurement problems of enterprise data assets. Int. J. Glob. Econ. Manag. 2024, 3, 242–253.
7. Zhang, Y.; Huang, Y.; Zhang, D.; Qian, Y. The importance of data assets and its accounting confirmation and measurement methods. In Proceedings of the 2019 6th International Conference on Behavioral, Economic and Socio-Cultural Computing (BESC), Beijing, China, 28 October 2019.
8. Bendechache, M.; Attard, J.; Ebiele, M.; Brennan, R. A systematic survey of data value: Models, metrics, applications and research challenges. IEEE Access 2023, 11, 104966–104983.
9. Fleckenstein, M.; Obaidi, A.; Tryfona, N. A review of data valuation approaches and building and scoring a data valuation model. Harv. Data Sci. Rev. 2023, 5, 1–36.
10. Edwards, E.O.; Bell, P.W. The Theory and Measurement of Business Income, 1st ed.; China People’s University Press: Beijing, China, 1961; pp. 45–50.
11. Ohlson, J.A. Earnings, book values, and dividends in equity valuation. Contemp. Account. Res. 1995, 11, 661–687.
12. Box, G.E.P.; Jenkins, G.M. Some statistical aspects of adaptive optimization and control. J. R. Stat. Soc. B 1962, 24, 297–331.
13. Pahlavani, M.; Roshan, R. The comparison among ARIMA and hybrid ARIMA-GARCH models in forecasting the exchange rate of Iran. Int. J. Bus. Dev. Stud. 2015, 7, 31–50.
14. Adesina, O.S.; Obokoh, L.O. A Hybrid Framework of Deep Learning and Traditional Time Series Models for Exchange Rate Prediction. Sci. Afr. 2025, 29, e02818.
15. Sözen, Ç. Volatility dynamics of cryptocurrencies: A comparative analysis using GARCH-family models. Futur. Bus. J. 2025, 11, 166.
16. Mohan, S.K.; Bharathy, G.; Jalan, A. Enterprise Data Valuation—A Targeted Literature Review. J. Econ. Surv. 2026, 40, 73–92.
17. Sai, Z.; Cheng, Y. Difficulties and Countermeasures in Data Asset Pricing. In Proceedings of the 16th International Conference on Cloud Computing, Shenzhen, China, 17–18 December 2023.
18. Xiong, F.; Xie, M.; Zhao, L.; Li, C.; Fan, X. Recognition and evaluation of data as intangible assets. Sage Open 2022, 12, 21582440221094600.
19. Dugast, J.; Foucault, T. Data abundance and asset price informativeness. J. Financ. Econ. 2018, 130, 367–391.
20. Danish, M.S.; Ranjan, P.; Sharma, R. Estimating the value of expired and active patents: A renewal model approach. Technol. Anal. Strateg. Manag. 2025, 37, 2587–2604.
21. Dalessandro, B.; Perlich, C.; Raeder, T. Bigger is better, but at what cost? estimating the economic value of incremental data assets. Big Data 2014, 2, 87–96.
22. Choi, W.W.; Kwon, S.S.; Lobo, G.J. Market valuation of intangible assets. J. Bus. Res. 2000, 49, 35–45.
23. Lim, C.; Kim, K.H.; Kim, M.J.; Heo, J.-Y.; Kim, K.-J.; Maglio, P.P. From data to value: A nine-factor framework for data-based value creation in information-intensive services. Int. J. Inf. Manag. 2018, 39, 121–135.
24. Liu, C.; Nitschke, P.; Williams, S.P.; Zowghi, D. Data quality and the Internet of Things. Computing 2020, 102, 573–599.
25. Montes, R.; Sand-Zantman, W.; Valletti, T. The value of personal information in online markets with endogenous privacy. Manag. Sci. 2019, 65, 1342–1362.
26. Xu, X.C.; Zhang, Z.W.; Hu, Y.R. Research on Data Asset Statistics and Accounting. Manag. World 2022, 38, 16–30+2. (In Chinese)
27. Coyle, D.; Manley, A. What is the value of data? A review of empirical methods. J. Econ. Surv. 2024, 38, 1317–1337.
28. Jones, C.I.; Tonetti, C. Nonrivalry and the Economics of Data. Am. Econ. Rev. 2020, 110, 2819–2858.
29. Babina, T.; Fedyk, A.; He, A.; Hodson, J. Artificial intelligence, firm growth, and product innovation. J. Financ. Econ. 2024, 151, 103745.
30. Tan, Y.; Liu, Y.J.; Zhang, X. Statistical Measurement of Corporate Digital Transformation: A Perspective Based on Demand for Digital Technology Talent. Econ. Res. J. 2025, 60, 122–138. (In Chinese)
31. Deng, Y. Private value of European patents. Eur. Econ. Rev. 2007, 51, 1785–1812.
32. Svensson, R. Patent value indicators and technological innovation. Empir. Econ. 2022, 62, 1715–1742.
33. Birch, K.; Cochrane, D.T.; Ward, C. Data as asset? The measurement, governance, and valuation of digital personal data by Big Tech. Big Data Soc. 2021, 8, 20539517211017308.
34. Xu, T.; Shi, H.; Shi, Y.; You, J. From data to data asset: Conceptual evolution and strategic imperatives in the digital economy era. Asia Pac. J. Innov. Entrep. 2024, 18, 2–20.
35. Goldfarb, A.; Tucker, C. Digital economics. J. Econ. Lit. 2019, 57, 3–43.
36. Liu, T.X.; Rong, K.; Zhang, Y.D. Estimating Data Capital and Its Contribution to China’s Economic Growth—A Perspective Based on the Data Value Chain. Soc. Sci. China 2023, 10, 44–64+205. (In Chinese)
37. Cai, T.T.; Wang, Y.; Zhang, L. The cost of privacy: Optimal rates of convergence for parameter estimation with differential privacy. Ann. Stat. 2021, 49, 2825–2850.

Figure 1. Data asset valuation system.

Figure 2. Data asset dictionary example.

Figure 3. Trends in data asset cost changes for selected banks.

Figure 4. Trend in current period increase in data asset cost.

Figure 5. Trend in the Current Period Increase in Bank Net Profit.

Figure 6. Autocorrelation analysis results for data asset cost investment forecasts.

Figure 7. Residual autocorrelation analysis of bank net profit forecast models.

Figure 8. ARIMA model prediction and fitting performance for data asset costs.

Figure 9. ARIMA model prediction and fitting performance for net profit.

Figure 10. Comparison of valuation results for Huaxia Bank using two methods.

Table 1. Digital roles in commercial banking settings.

Major Group	Sub-Major Group	Minor Group	Unit Group	Work Content
Professional and technical personnel	Economic and Financial Professionals	FinTech Professionals	FinTech Engineer	FinTech Engineer is an expert engaged in applied technology research, product design, service operations, and performance evaluation within the fields of financial technology and digital finance.
	Economic and Financial Professionals	Commerce Professionals	Digitalization Manager	Digitalization managers refers to personnel who utilize digital office software platforms to perform the following functions: editing enterprise and organizational personnel structures, maintaining organizational operational processes, facilitating workflow collaboration, conducting big data decision-making analysis, and establishing online connections with upstream and downstream partners. This enables enterprises to achieve online presence, online communication, online collaboration, online operations, and an online ecosystem, thereby realizing the digital transformation of business management and operations.
	Engineering Technicians	Digital Technology Engineering Technicians	Big Data Engineering Technician	Big Data engineering technicians are primarily responsible for conducting technical research on big data collection, cleaning, analysis, governance, and mining, as well as utilizing, managing, maintaining, and servicing these data.
		Digital Technology Engineering Technicians	Blockchain Engineering Technician	Blockchain engineering technicians refer to professionals engaged in blockchain architecture design, underlying technologies, system applications, system testing, system deployment, and operational maintenance.
		Management (Industrial) Engineering Technicians	Supply Chain Engineering Technician	Supply chain engineering technicians are specialists dedicated to researching, building, and optimizing supply chain systems and related platforms. They must possess a solid technical background and systematic thinking capabilities, enabling them to design efficient, stable, and intelligent supply chain systems from a corporate strategic perspective, while maintaining a strong awareness of risk prevention and control.
		Management (Industrial) Engineering Technicians	Data Analysis and Processing Engineering Technician	Data analysis and processing engineering technicians are engineering technicians engaged in information system data planning, collection, management, analysis, database design and optimization, data resource integration, data mining, and data analysis, as announced by the Ministry of Human Resources and Social Security.

Table 2. Data asset cost (ADF) stationarity test results.

ADF Parameter	Raw Series	Log-Transformed Series	First-Differenced Series
t-statistic	−2.759	−9.381	−6.960
p-value	0.2179	0.0000	0.0000
Lag order	4	4	4
Number of observations	64	63	64
Critical value (1%)	−4.118	−4.118	−4.127
Critical value (5%)	−3.487	−3.487	−3.491
Critical value (10%)	−3.172	−3.172	−3.174

Table 3. Bank net profit (ADF) stationarity test results.

ADF Parameter	Raw Series	Log-Transformed Series	First-Differenced Series
t-statistic	−0.630	−2.931	−10.024
p-value	0.9734	0.1617	0.0000
Lag order	10	10	3
Number of observations	64	61	64
Critical value (1%)	−4.118	−4.148	−4.418
Critical value (5%)	−3.487	−3.500	−3.500
Critical value (10%)	−3.172	−3.179	−3.179

Table 4. Data asset cost ARIMA model estimate.

Variable	Coefficient	Standard Deviation	T-Statistic	p-Value
AR(1)	0.994187	0.007073	140.5555	0.0000
MA(1)	−1.863323	0.084219	−22.12462	0.0000
MA(2)	0.883176	0.084148	10.49555	0.0000
R-squared	0.409989	Mean of the dependent variable		0.037458
Adjusted R-squared	0.379988	Dependent variable variance		0.313792
Regression Standard Deviation	0.247082	Akaike Information Criterion (AIC)		0.159854
Sum of Squared Residuals	3.60193	Schwarz Criterion (SC)		0.295926
Log-likelihood value	−1.035394	Hannan-Quinn Criterion (HQ)		0.213372
Durbin-Watson statistic (DW)	1.894306

Table 5. Bank net profit ARIMA model estimate.

Variable	Coefficient	Standard Deviation	T-Statistic	p-Value
AR(1)	0.982392	−0.038153	−25.74905	0.0000
MA(1)	0.827650	0.128697	−22.12462	0.0000
R-squared	0.461291	Mean of the dependent variable		0.054628
Adjusted R-squared	0.432938	Dependent variable variance		0.220561
Regression Standard Deviation	0.16609	Akaike Information Criterion (AIC)		−0.668234
Sum of Squared Residuals	1.572397	Schwarz Criterion (SC)		−0.529816
Log-likelihood value	24.38113	Hannan-Quinn Criterion (HQ)		−0.613986
Durbin-Watson statistic (DW)	2.09666

Table 6. ARIMA prediction error comparison table.

Year	First Quarter of 2022	Second Quarter of 2022	Third Quarter of 2022	Fourth Quarter of 2022	Total for 2022
Actual cost of data assets	104,679	76,246	101,838	62,327	345,090
Predicted value of data asset costs	86,025	88,751	90,011	90,593	355,380
Relative error of the ARIMA model	0.217	−0.141	0.131	−0.312	−0.029
Actual net profit	5,727,000	6,040,000	5,670,000	7,598,000	25,035,000
Predicted value of net profit	6,589,609	6,691,213	6,794,110	6,895,741	26,970,673
Relative error of the ARIMA model	−0.131	−0.097	−0.165	0.102	−0.072

Table 7. ARIMA prediction error comparison (Bank of China).

Item	First Quarter of 2022	Second Quarter of 2022	Third Quarter of 2022	Fourth Quarter of 2022	Total for 2022
Actual cost of data assets	630,987	569,371	672,098	560,888	2,433,344
Predicted cost of data assets	763,258	705,199	607,277	600,155	2,675,889
Relative error of the ARIMA model (cost)	−0.173	−0.193	0.107	−0.065	−0.091
Actual net profit	60,541,000	63,762,000	56,714,000	56,487,000	237,504,000
Predicted net profit	64,878,824	50,097,815	60,329,328	54,092,933	245,609,100
Relative error of the ARIMA model (net profit)	−0.067	0.273	−0.060	0.044	−0.033

Table 8. ARIMA prediction error comparison (Suzhou Rural Commercial Bank).

Item	First Quarter of 2022	Second Quarter of 2022	Third Quarter of 2022	Fourth Quarter of 2022	Total for 2022
Actual cost of data assets	2985	3672	3128	4232	14,017
Predicted cost of data assets	2637	3891	2734	4630	13,892
Relative error of the ARIMA model (cost)	0.132	−0.056	0.144	−0.086	0.009
Actual net profit	310,937	532,997	412,340	252,675	1,508,949
Predicted net profit	255,789	573,288	371,159	199,000	1,399,236
Relative error of the ARIMA model (net profit)	0.216	−0.070	0.111	0.270	0.078
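
The annual relative errors in Tables 6 to 8 are smaller in magnitude than most of the quarterly ones. One way to see this is to compare the mean absolute quarterly error with the absolute annual error for each table. The sketch below does so for the data asset cost rows, using the relative errors exactly as printed; the dictionary layout and function name are hypothetical, and the mean absolute error is a summary chosen here for illustration rather than a statistic reported in the article.

```python
# Quarterly and annual relative errors for data asset costs, as printed.
cost_errors = {
    "Table 6": ([0.217, -0.141, 0.131, -0.312], -0.029),
    "Table 7 (Bank of China)": ([-0.173, -0.193, 0.107, -0.065], -0.091),
    "Table 8 (Suzhou Rural Commercial Bank)": ([0.132, -0.056, 0.144, -0.086], 0.009),
}


def mean_abs(values):
    """Mean of the absolute values of a non-empty list."""
    return sum(abs(v) for v in values) / len(values)


# Map each table to (mean absolute quarterly error, absolute annual error).
summary = {
    name: (mean_abs(quarterly), abs(annual))
    for name, (quarterly, annual) in cost_errors.items()
}

for name, (quarterly_mae, annual_abs) in summary.items():
    print(f"{name}: quarterly {quarterly_mae:.4f}, annual {annual_abs:.3f}")
```

In each table the annual figure is smaller in magnitude than the average quarterly figure, because over- and under-predictions partly offset when the quarters are summed.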

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, H.; Song, L.; Zong, Q. A System Model for Valuing Data Assets in Commercial Banks. Systems 2026, 14, 115. https://doi.org/10.3390/systems14010115

