Data Descriptor

A Dataset for Examining the Problem of the Use of Accounting Semi-Identity-Based Models in Econometrics

by
Francisco Javier Sánchez-Vidal
Department of Economics, Accounting and Finance, Faculty of Business Studies, Universidad Politécnica de Cartagena, 30201 Cartagena, Spain
Data 2025, 10(5), 62; https://doi.org/10.3390/data10050062
Submission received: 26 December 2024 / Revised: 4 April 2025 / Accepted: 24 April 2025 / Published: 28 April 2025

Abstract

The problem of using accounting semi-identity-based (ASI) models in Econometrics can be severe in certain circumstances, and estimations from OLS regressions in such models may not accurately reflect causal relationships. This dataset was generated through Monte Carlo simulations, which allowed for the precise control of a causal relationship. The problem of an ASI cannot be directly demonstrated in real samples, as researchers lack insight into the specific factors driving each company’s investment policy. Consequently, it is impossible to distinguish whether regression results in such datasets stem from actual causality or are merely a byproduct of arithmetic distortions introduced by the ASI. Addressing this issue through simulations allows researchers to know the true value of any coefficient with certainty. The model selected for testing the influence of the ASI problem is the investment-cash flow sensitivity model of Fazzari, Hubbard and Petersen (FHP hereinafter) (1988), which seeks to establish a relationship between a company’s investments and its cash flows and which is itself an ASI. The dataset includes randomly generated independent variables (cash flows and Tobin’s Q) to analyze how they influence the dependent variable (investments). The Monte Carlo methodology in Stata enabled repeated sampling to assess how ASIs affect regression models, highlighting their impact on variable relationships and the unreliability of estimated coefficients. The purpose of this paper is twofold. First, it provides a deeper explanation of the syntax in the related article, offering more insights into the ASI problem. The openly available dataset supports replication and further research on ASIs’ effects in economic models and can be adapted for other ASI-based analyses, as the reusability examples show. Second, it aims to encourage research supported by Monte Carlo simulations, as they enable the modeling of a comprehensive ecosystem of economic relationships between variables. This allows researchers to address a variety of issues, such as partial correlations, heteroskedasticity, multicollinearity, autocorrelation, endogeneity, and more, while testing their impact on the true value of coefficients.
Dataset: Please see the Supplementary File. Although the analyses were conducted using Stata software (version 13.1), the main dataset is also provided in .csv format for the sake of transparency. An equivalent R (version 4.4.0) syntax is provided as well.
Dataset License: CC BY 4.0

1. Summary

In the financial management literature, it is widely accepted that a company’s annual cash flow significantly influences its investment decisions. The model proposed by Fazzari, Hubbard and Petersen (FHP hereinafter) (1988) [1] explores this relationship using a linear regression framework with investments as the dependent variable. The coefficient of the cash flow variable, known as investment–cash flow sensitivity (icfs hereinafter), should indicate the extent to which firms depend on internal funds for investments. Higher coefficients suggest greater financial constraints due to severe information asymmetry, making external financing costly or inaccessible. Since then, numerous studies have attempted to test this model, often yielding contradictory or unexpected results [2].
The data of this work are related to the research article “A Cautionary Note on the Use of Accounting Semi-Identity-Based Models” by Sánchez-Vidal (2023) [3], published in the Journal of Risk and Financial Management. In this related work, Ref. [3] critiques the FHP (1988) [1] model by identifying that the equation used is an accounting semi-identity (ASI), a partial representation of an accounting identity that omits certain variables. In this case, the accounting identity is given by the following expression:
$$\text{Inv} = \text{CF} + (\Delta\text{LTD} + \Delta\text{Capital Stock} - \text{Deprec.} - \text{Dividends} - \Delta\text{Working capital} - \Delta\text{OFA}),$$
where LTD is long-term debt; Capital Stock is share capital; and OFA is other fixed assets. All components within the parentheses (the rest) are missing from the FHP equation. Running the full accounting identity would yield the fitted regression $\text{Inv} = 1 \times \text{CF} + 1 \times \text{Rest}$, with an $R^2$ of 100% and an intercept of 0 (see [3], Table 1). Omitting parts of the complete accounting identity can introduce biases in econometric models, leading to an inaccurate estimation of causal relationships. Specifically, excluding other components from the complete accounting identity may distort the estimated coefficient of cash flow, causing more financially constrained firms to appear less constrained in the estimation results. To illustrate, if a company’s investments amount to EUR 28 and its cash flows to EUR 20, the rest is EUR 8. When running a regression of investments on cash flows, the cash flow coefficient must exceed 1 (e.g., 1.4 = 28/20) to compensate for the omitted rest. This higher “icfs” paradoxically appears in companies that have invested beyond their generated cash flows. From a double-entry perspective, firms that have increased their long-term debt (LTD), and are therefore less financially constrained than those that have not, will have, ceteris paribus, a higher rest, artificially inflating the “icfs” coefficient.
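The arithmetic behind this distortion can be sketched in a few lines of Python. Only the first firm (cash flows of 20, investments of 28) comes from the text; the other two firms, the helper name `slope_through_origin`, and the use of a no-intercept fit are illustrative assumptions and are not part of the original Stata syntax.

```python
# Minimal sketch: how the omitted "rest" pushes the cash flow coefficient above 1.
# Only the first firm (CF = 20, Inv = 28) is taken from the text; the others are hypothetical.

def slope_through_origin(x, y):
    """Least-squares slope of y on x with no intercept: sum(x*y) / sum(x*x)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

cash_flow = [20.0, 10.0, 30.0]
rest = [8.0, 4.0, 12.0]                                   # omitted part of the identity
investment = [cf + r for cf, r in zip(cash_flow, rest)]   # Inv = CF + Rest

# Regressing Inv on CF alone: the slope has to absorb the omitted rest.
asi_slope = slope_through_origin(cash_flow, investment)

# Netting the rest out of investments restores the identity coefficient of 1.
identity_slope = slope_through_origin(
    cash_flow, [inv - r for inv, r in zip(investment, rest)]
)

print(f"ASI slope: {asi_slope:.2f}, identity slope: {identity_slope:.2f}")
# prints "ASI slope: 1.40, identity slope: 1.00"
```

In this toy case every firm has a rest equal to 40% of its cash flows, so the fitted coefficient is exactly 1.4; a firm with a negative rest would pull the coefficient below 1 in the same mechanical way.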
Monte Carlo simulations have been widely applied in finance, including option pricing and portfolio risk management strategies [4,5], and, more recently, have been integrated with machine learning models to enhance financial forecasting [6]. By generating synthetic data, Monte Carlo simulations allow us to assess the impact of omitted variables in ASI-based models by comparing estimated and true coefficients, highlighting the limitations of the FHP model and providing a valuable tool for evaluating ASI-based models. The synthetic nature of this database is critical. The alternative would be to use ‘real’ databases and split the total sample into a priori more financially constrained firms and other companies based on any academically accepted criterion. Suppose that the linear regression is then run and produces higher coefficients for the financially constrained subset. Researchers would still face uncertainty about whether these results are due to the actual impact of financial constraints, which should lead highly constrained firms to depend more on internally generated funds, or whether the larger coefficients are solely a consequence of the limitations imposed by the ASI. Synthetic data address this issue by giving the researcher full control to distinguish between results driven by causality and those influenced by the ASI problem. The accompanying dataset facilitates the replication of the FHP [1] model with similar or alternative parameters.

2. Methods and Data Description

As previously mentioned, the novelty of this work lies in the decision to use a randomly generated database instead of a real one, allowing for pinpoint control over the causal relationships. The approach used to create the data combines a fixed (deterministic) economic model with random (Monte Carlo) simulations to handle uncertainty in input factors, all performed using Stata software. Essentially, the explanatory variables that affect investments are generated using Monte Carlo simulations, and the investment values (dependent variable) are calculated by applying these randomly generated inputs to a model that explains how investments are determined.

Model for Generating Investments as a Function of Cash Flows and Tobin’s Q

The approach employed to produce the foundational data in this article combines a deterministic economic process model with Monte Carlo-based stochastic simulations to account for uncertainty in input parameters. The distributional assumptions for the inputs are grounded in the literature and practical knowledge, and all steps are executed in Stata.
The simulation was designed by generating 50,000 values of an error term from a normal distribution with a mean of zero and a standard deviation of 0.1. Additionally, 50,000 values of a cash flow variable ($\text{CF}_i$) were created, also normally distributed, with a mean of 0.25 and a standard deviation of 0.1. Another 50,000 values of Tobin’s Q were generated, with a mean of 2.45 and a standard deviation of 0.2, along with a final 50,000 values of a further error term with a mean of zero and a standard deviation of 0.1. As is typical in Monte Carlo simulations, the selection of parameters is somewhat arbitrary [7]. Investments are calculated from the randomly generated values for cash flows, Tobin’s Q and the error term as follows:
$$\text{Inv}_i = 0 + 0.45\,\text{CF}_i + 0.03\,Q_i + \varepsilon_i,$$
where $i$ indexes the $i$-th company.
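The data-generating process just described can be sketched as follows. This is a minimal Python illustration rather than the author's Stata or R syntax: the seed, the variable names and the small `ols` helper (normal equations solved by Gauss-Jordan elimination) are assumptions introduced here for the example.

```python
import random

random.seed(12345)   # arbitrary seed, chosen only for reproducibility
N = 50_000           # number of simulated firms, as in the text

def ols(y, *xs):
    """Return [intercept, b1, b2, ...] from OLS of y on xs via the normal equations."""
    cols = [[1.0] * len(y)] + [list(x) for x in xs]
    k = len(cols)
    # Augmented matrix [X'X | X'y].
    a = [[sum(p * q for p, q in zip(cols[i], cols[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(cols[i], y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [u - f * v for u, v in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]

# Randomly generated inputs with the means and standard deviations given in the text.
cf = [random.gauss(0.25, 0.10) for _ in range(N)]
q = [random.gauss(2.45, 0.20) for _ in range(N)]
eps = [random.gauss(0.0, 0.10) for _ in range(N)]

# Deterministic relation plus noise: Inv_i = 0 + 0.45 CF_i + 0.03 Q_i + eps_i.
inv = [0.45 * c + 0.03 * t + e for c, t, e in zip(cf, q, eps)]

intercept, b_cf, b_q = ols(inv, cf, q)
print(f"intercept={intercept:.3f}  CF={b_cf:.3f}  Q={b_q:.3f}")
```

On the full sample, the estimated coefficients should land close to the true values of 0, 0.45 and 0.03, which is the benchmark against which any ASI-induced distortion can later be measured.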
Since the goal is to test the model using Ordinary Least Squares (OLS), the choice of a normal distribution is appropriate. The true parameters in the synthetic relation of investments with cash flows and Tobin’s Q are chosen so that the distribution of the investment variable is analogous to that in the seminal work of [1], from which the model we want to test comes. The descriptive statistics of the randomly generated cash flows, as well as of the investment variable, show means and standard deviations very similar to the values for the whole subsample of firms studied by [1] (pp. 22, 25), as can be seen in Table 1. These similar metrics generate a value of the rest of the ASI comparable to that of FHP (1988) [1], which is crucial for explaining the ASI problem, particularly when, for instance, subsampling by the sign of the rest. The concrete values used in this controlled relationship imply that, ceteris paribus, an increase in cash flows of 0.20 (20%) would cause an increase in investments of 0.09 (9%), since 0.45 × 0.20 = 0.09.
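As a rough check on Table 1, the population moments implied by these parameters can be worked out by hand. The derivation below assumes that cash flows, Q and the error term are drawn independently of one another, and that the rest is defined as investments minus cash flows, following the accounting identity in the Summary; both are assumptions of this sketch rather than statements taken from the syntax file.

```latex
\begin{aligned}
E[\text{Inv}_i] &= 0.45 \times 0.25 + 0.03 \times 2.45 = 0.1125 + 0.0735 = 0.186,\\
\mathrm{SD}(\text{Inv}_i) &= \sqrt{0.45^2 \times 0.1^2 + 0.03^2 \times 0.2^2 + 0.1^2} = \sqrt{0.012061} \approx 0.110,\\
E[\text{Rest}_i] &= E[\text{Inv}_i] - E[\text{CF}_i] = 0.186 - 0.25 = -0.064,\\
\mathrm{SD}(\text{Rest}_i) &= \sqrt{(0.45 - 1)^2 \times 0.1^2 + 0.03^2 \times 0.2^2 + 0.1^2} = \sqrt{0.013061} \approx 0.114.
\end{aligned}
```

These values are close to the sample statistics reported in Table 1 (0.187, 0.110, −0.063 and 0.114); the small differences are what one would expect from sampling variation in 50,000 draws.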
All factors that could influence investments, and which are not considered in the [1] model, are proxied by the error term $\varepsilon_i$ [8]. These factors include broader macroeconomic conditions like interest rates and inflation, competitive forces within industries, regulatory changes, etc., all of which can shape investment behavior. The model also does not account for technological advancements, the quality of management and corporate governance, or the availability of external financing [9], which are all crucial in determining a firm’s investment strategy. Additionally, factors such as investor sentiment, supply chain disruptions, strategic considerations, and labor market conditions are not incorporated but can have a significant impact on investment decisions. One example is agency costs, where managers may be motivated to either overinvest or underinvest based on various corporate governance factors, as suggested by the free cash flow theory [10].
The inherent arbitrariness involved in constructing the investment–cash flow relationship is typical of any simulated model. This approach is comparable to the Electric Scooter Project example found in a widely renowned finance handbook [11] (p. 263). However, it offers the added benefit of high comparability with the original dataset from the seminal work of [1] due to the carefully chosen parameters for the randomly generated cash flows and the true coefficients of the synthetic relationship. Starting from this point, researchers can alter the distributions of the randomly generated variables and the features of the primary relationship to ensure they align with any newer research pertaining to this model.
Figure 1 displays a scatterplot of the cash flow and investment values for each observation, clearly illustrating the positive correlation between the two variables. This visible correlation is instructive because, as demonstrated in the related article [3], the ASI problem masks it in certain OLS estimations.
The results of the different regressions in [3] with the simulated database described in Table 1 show that the rest drives the results when, for example, subsampling into the positive and negative values of the rest, making the coefficient of cash flows depart from its true value.
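The effect of subsampling by the sign of the rest can be sketched as follows. This is an illustrative Python reconstruction under stated assumptions, not the regressions reported in [3]: the rest is taken to be investments minus cash flows, the regression includes cash flows only, and the seed and helper names (`slope`, `sub_slope`) are hypothetical.

```python
import random

random.seed(2023)    # arbitrary seed for reproducibility
N = 50_000

def slope(x, y):
    """OLS slope of y on x (with intercept): cov(x, y) / var(x)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

cf = [random.gauss(0.25, 0.10) for _ in range(N)]
q = [random.gauss(2.45, 0.20) for _ in range(N)]
inv = [0.45 * c + 0.03 * t + random.gauss(0.0, 0.10) for c, t in zip(cf, q)]

# Assumption: the rest is whatever the identity needs to close, Rest = Inv - CF.
rest = [i - c for i, c in zip(inv, cf)]

def sub_slope(keep):
    """Slope of Inv on CF within the subsample whose rest satisfies keep(rest)."""
    pairs = [(c, i) for c, i, r in zip(cf, inv, rest) if keep(r)]
    return slope([p[0] for p in pairs], [p[1] for p in pairs])

full_slope = slope(cf, inv)
pos_slope = sub_slope(lambda r: r > 0)
neg_slope = sub_slope(lambda r: r < 0)

print(f"full sample: {full_slope:.3f}")
print(f"rest > 0:    {pos_slope:.3f}")
print(f"rest < 0:    {neg_slope:.3f}")
```

In this sketch the full-sample slope stays near the true 0.45, while both subsample slopes are pulled away from it toward 1, the value implied by the accounting identity, even though the causal relationship is the same for every simulated firm.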

3. Results and Discussion

In this section, I will further analyze and discuss additional examples of reusability, focusing on the specific case of this ASI-based model. Additionally, I will explore a potential instance of reusability by modeling simulated variables that present econometric challenges, such as autocorrelation.

3.1. First Example of Reusability: Controlling for Q

In the following paragraphs, I will provide an example of how this dataset can be reused. The aim is to utilize a subset of companies with similar Q ratios, thereby controlling for one of the factors that could influence investment decisions [12]. This approach enables a more focused examination of the relationship between cash flows and investments, which represents the primary objective of this database and the associated research. This focus is particularly significant because both variables are integral to the ASI component in the [1] model, and the analysis of their relationship is framed within the context of the ASI problem.
As shown in the Stata syntax file associated with this article, I have selected companies in the highest quintile of Q values. Figure 2 illustrates the combinations of cash flows and investments for this subset of firms. In real-world scenarios, these high-Q firms—those valued in financial markets at significantly higher prices than their accounting values—are typically companies with highly valuable investment opportunities [12]. In Figure 2, I have highlighted two subsamples of companies, shown in brown and green, which will be analyzed further to confirm the ASI problem and the associated risks of using the ASI. The brown subset represents high-Q firms with cash flows below the overall sample mean (though still positive) and investments below the threshold of 0.1 (10%). In contrast, the green subset consists of companies with a similar combination of high Q and low cash flow generation but with investments exceeding 0.1.
For example, a company in the brown group might generate 0.08 in cash flows (considered low) and invest 0.08 (below the average), allocating all its available funds toward growth. This indicates a high reliance on internal cash flows to finance its investments. In contrast, a company in the green group might also generate 0.08 in cash flows but invest 0.16. This suggests it invests more due to better access to external funding sources, such as bank financing, which supplement its cash flow and enable higher levels of investment. The green subset, consisting of companies capable of making substantial investments despite limited cash flows, is clearly less financially constrained than the brown subset.
After defining these two groups in Stata, I perform a regression of investments on cash flows for each group. The results are displayed in Table 2.
The results in the first four columns (Models I and II) reveal that companies with higher investments exhibit greater cash flow sensitivity, a finding that contradicts both logical expectations and the predictions of the [1] model. As discussed in the related article, this inconsistency stems from the ASI problem: the value of the rest is higher for high-investment companies, and arithmetic prevents causality from being correctly tested. To analyze the two subsamples within a single regression framework, I constructed a multiplicative variable combining cash flows with an investment dummy (1 for low investments, 0 for high investments) and performed a unified regression analysis. The results, presented in Model III, indicate a lower sensitivity of investment to cash flows for the brown subsample, which contains the more financially constrained companies. The coefficient for these companies, obtained as the sum of the coefficients for cash flows and the multiplicative variable (0.510 − 0.910), is −0.4. This outcome is illogical in causal terms and stems from the mechanical bias introduced by the rest of the ASI. In contrast, less financially constrained firms (green subsample) exhibit higher investment–cash flow sensitivity. This is because they have been able to finance their greater investment needs using sources beyond their internally generated cash flows. External funding, such as financial debt, increases the rest (see [3], p. 4). Since this factor is unaccounted for, the regression compensates by yielding a higher coefficient for this subsample. This reusability example underscores, once again, the ineffectiveness of ASI-based models. Details on the creation of subsamples, the multiplicative variable, and the regression analysis for this exercise are provided in the accompanying syntax file.
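The construction of the brown and green subsamples can be sketched in Python as follows. This is a hedged illustration rather than the accompanying Stata syntax: the seed, the percentile calculation and the `slope` helper are assumptions, and only the separate regressions of Models I and II are mimicked (the interaction regression of Model III is not reproduced).

```python
import random

random.seed(42)      # arbitrary seed for reproducibility
N = 50_000

def slope(x, y):
    """OLS slope of y on x (with intercept): cov(x, y) / var(x)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Each firm is a tuple (cash flow, Q, investment) from the synthetic relation.
firms = []
for _ in range(N):
    cf = random.gauss(0.25, 0.10)
    q = random.gauss(2.45, 0.20)
    inv = 0.45 * cf + 0.03 * q + random.gauss(0.0, 0.10)
    firms.append((cf, q, inv))

q_cutoff = sorted(f[1] for f in firms)[int(0.8 * N)]   # start of the top Q quintile
mean_cf = sum(f[0] for f in firms) / N

# High-Q firms with positive but below-average cash flows.
subset = [f for f in firms if f[1] >= q_cutoff and 0 < f[0] < mean_cf]
brown = [f for f in subset if f[2] < 0.1]    # low investments
green = [f for f in subset if f[2] >= 0.1]   # high investments

slope_brown = slope([f[0] for f in brown], [f[2] for f in brown])
slope_green = slope([f[0] for f in green], [f[2] for f in green])

print(f"brown: n={len(brown)}, slope={slope_brown:.3f}")
print(f"green: n={len(green)}, slope={slope_green:.3f}")
```

Because every firm is generated with the same true coefficient of 0.45, any gap between the two estimated slopes in this sketch is produced by the way the subsamples are cut, not by a difference in causal sensitivity.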

3.2. Second Example of Reusability: Decoupling Investments from Cash Flows

In the following paragraphs, I recreate the same example, but this time the investment variable is independent of cash flows while still being influenced by Q ($\text{Inv2}_i = 0 + 0.075\,Q_i + \varepsilon_i$). Table 3 presents the regression results for the entire sample (Model I), which accurately capture this artificially created relationship: the estimated Q coefficient is not significantly different from 0.075 at the 99% confidence level.
However, the coefficient for Tobin’s Q loses significance when the regression is applied only to the subsample with a positive rest (Model II of Table 3), highlighting one of the issues associated with using ASIs: variables outside the ASI framework lose their significance in the presence of a dominant ASI. This emerging problem is also evident in the increased $R^2$ compared to the ‘true’ regression in Model I.
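A minimal Python sketch of this second exercise is given below. It assumes that the rest is computed as Inv2 minus cash flows and that the positive-rest subsample is selected on that quantity; the seed and the `ols` helper are illustrative choices and are not taken from the accompanying syntax.

```python
import random

random.seed(99)      # arbitrary seed for reproducibility
N = 50_000

def ols(y, *xs):
    """Return [intercept, b1, b2, ...] from OLS of y on xs via the normal equations."""
    cols = [[1.0] * len(y)] + [list(x) for x in xs]
    k = len(cols)
    a = [[sum(p * q for p, q in zip(cols[i], cols[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(cols[i], y))] for i in range(k)]
    for i in range(k):                      # Gauss-Jordan with partial pivoting
        pivot = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [u - f * v for u, v in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]

cf = [random.gauss(0.25, 0.10) for _ in range(N)]
q = [random.gauss(2.45, 0.20) for _ in range(N)]

# Investments depend on Q only: Inv2_i = 0 + 0.075 Q_i + eps_i.
inv2 = [0.075 * t + random.gauss(0.0, 0.10) for t in q]

whole = ols(inv2, cf, q)

# Assumption: Rest = Inv2 - CF; keep only firms with a positive rest.
keep = [k for k in range(N) if inv2[k] - cf[k] > 0]
sub = ols([inv2[k] for k in keep], [cf[k] for k in keep], [q[k] for k in keep])

print(f"whole sample: CF={whole[1]:.3f}  Q={whole[2]:.3f}")
print(f"rest > 0:     CF={sub[1]:.3f}  Q={sub[2]:.3f}  (n={len(keep)})")
```

In the whole sample the cash flow coefficient should be close to zero and the Q coefficient close to 0.075. In the positive-rest subsample a sizeable cash flow coefficient appears and the Q coefficient shrinks, although cash flows play no role in how Inv2 is generated.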

3.3. Third Example: Developing Another Model

The previously mentioned Scooter example [11] could be interesting to expand upon. This example involves analyzing a business plan in which cash flows (inflows and outflows, derived from revenues and costs) must be estimated. The handbook provides guidance on the characteristics of cash flow estimations, with a more detailed explanation for the Sales variable. For the sake of brevity, I will focus solely on the simulation of the Sales variable. To fully simulate the complete cash flows variable, the researcher should recreate all components, incorporating all interrelationships between them. Sales are determined by the following formula:
$$\text{Revenues} = \text{MarketSize} \times \text{MarketShare} \times \text{UnitPrice}$$
In this specific example, the market size is estimated at 1 million scooters for the first year, although the actual value is subject to estimation errors:
$$\text{MarketSize}_1 = \text{ExpectedMarketSize}_1 \times (1 + \text{errorMS}_1),$$
where $\text{errorMS}_1$ refers to the error in the Market Size forecast in year 1.
As expected, the Market Size for year $t$ is calculated as the previous year’s Market Size ($t − 1$) multiplied by one plus the forecasting error in year $t$. Due to the construction of this component, errors accumulate over time. In the example, the expected Sales value is set at 100,000, but the actual amount could be 110,000, reflecting a forecast error of 10%. As “what happens in year 2 is affected by what happens in year 1”, when “scooter sales are below expectations in year 1, it is likely that they will continue to be below in subsequent years”.
Since the Sales estimations are made in year 0, forecasting further into the future becomes increasingly complex as errors compound. For instance, by year 4, the Market Size is forecasted using the following formula:
$$\text{MarketSize}_4 = \text{ExpectedMarketSize}_1 \times (1 + \text{errorMS}_1) \times (1 + \text{errorMS}_2) \times (1 + \text{errorMS}_3) \times (1 + \text{errorMS}_4)$$
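The compounding of forecast errors in this formula can be sketched in Python. The expected market size of 1 million comes from the text; the error distribution (normal with a standard deviation of 0.10), the 10-year horizon, the number of simulated paths and the seed are assumptions made only for this illustration.

```python
import random
import statistics

random.seed(7)                 # arbitrary seed for reproducibility
EXPECTED_SIZE_1 = 1_000_000    # from the text: 1 million scooters expected in year 1
ERROR_SD = 0.10                # assumption: forecast errors drawn from N(0, 0.10)
N_YEARS, N_PATHS = 10, 5_000   # assumption: horizon and number of simulated paths

def simulate_path():
    """Return market sizes for years 1..N_YEARS along one simulated path."""
    sizes, size = [], EXPECTED_SIZE_1
    for _ in range(N_YEARS):
        # MarketSize_t = MarketSize_{t-1} * (1 + errorMS_t)
        size *= 1 + random.gauss(0.0, ERROR_SD)
        sizes.append(size)
    return sizes

paths = [simulate_path() for _ in range(N_PATHS)]

# Dispersion of market size across paths, year by year.
sd_by_year = [statistics.pstdev([p[t] for p in paths]) for t in range(N_YEARS)]

for year in (1, 4, 10):
    print(f"year {year}: SD of market size = {sd_by_year[year - 1]:,.0f}")
```

Because each year's value multiplies the previous one, the spread of simulated market sizes widens with the horizon, which is the accumulation of errors described above.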
Monte Carlo simulations can be used to model this accumulation in panel data, as illustrated in the syntax, where a Sales value is generated for each year. The example incorporates not only interdependence between periods but also interdependence between different variables. The authors suggest a correlation between Market Size and Price, noting that “the price of electrically powered scooters is likely to increase with market size”. For example, “a 10 percent shortfall in market size would lead to a predicted 3 percent reduction in price”. Using this relationship, the Price for the first year can be modeled as follows:
$$\text{Price}_1 = \text{ExpectedPrice}_1 \times (1 + 0.3 \times \text{errorMS}_1)$$
Table 4 demonstrates the correlation between these two components of total revenues.
As market size errors accumulate over time, price errors accumulate over the years in a manner similar to sales. Figure 3 shows the histogram of the simulated values for the cash flows variables of year 1 and year 10, respectively. Due to the method used to generate random values for each year of the period, the standard deviations for different years are significantly different (this result is not reported).
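The link between market size and price can be sketched in the same way. The 0.3 pass-through and the expected market size are taken from the text; the starting price, the error distribution, the seed and the rule that the price compounds year by year with the same errors are assumptions of this sketch, and the `pearson` helper is hypothetical.

```python
import random

random.seed(11)                # arbitrary seed for reproducibility
N_PATHS, N_YEARS = 5_000, 10   # assumption: number of paths and horizon
EXPECTED_SIZE_1 = 1_000_000    # from the text
EXPECTED_PRICE_1 = 100.0       # hypothetical placeholder price
ERROR_SD = 0.10                # assumption: forecast errors drawn from N(0, 0.10)
PASS_THROUGH = 0.3             # from the text: 10% size shortfall, 3% price reduction

def pearson(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

size_by_year = [[] for _ in range(N_YEARS)]
price_by_year = [[] for _ in range(N_YEARS)]
for _ in range(N_PATHS):
    size, price = EXPECTED_SIZE_1, EXPECTED_PRICE_1
    for t in range(N_YEARS):
        err = random.gauss(0.0, ERROR_SD)   # market size forecast error in year t+1
        size *= 1 + err                     # market size compounds the full error
        price *= 1 + PASS_THROUGH * err     # price moves by 30% of the same error
        size_by_year[t].append(size)
        price_by_year[t].append(price)

corr_year1 = pearson(size_by_year[0], price_by_year[0])
corr_year10 = pearson(size_by_year[9], price_by_year[9])
print(f"corr(size, price): year 1 = {corr_year1:.3f}, year 10 = {corr_year10:.3f}")
```

In year 1 the two series are exact linear functions of the same error, so their correlation is 1; after several years of compounding the relationship is no longer exactly linear and the correlation falls slightly below 1 while remaining very high.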
A researcher could then generate the remaining variables with various interrelated relationships and derive a probability distribution for the associated NPV after estimating all components of cash flows and the appropriate cost of capital.
As demonstrated, Monte Carlo simulations enable the development of more complex models, such as this Scooter example. In this model, the behavior of one of the components of cash flows (revenues) introduces autocorrelation, as a portion of sales depends on its previous values. Additionally, a potential heteroskedasticity issue may arise due to the accumulation of errors over time.

4. Conclusions and User Notes

This article begins with a brief literature review, followed by a discussion on the significance of the dataset introduced in the previously published related study [3]. This data descriptor relates to the use of Monte Carlo simulations to examine the relationship between cash flows and investments. Although Monte Carlo simulations are commonly used to test the validity of statistical tests [13], they can also be employed to generate a whole system of interrelated variables. The generated data are designed to evaluate whether the regressions effectively capture the causal relationship, taking into account the constraints imposed by the accounting semi-identity (ASI). The Monte Carlo simulation process can be replicated to produce data that closely resemble or significantly diverge from those of [3], as reusability examples 1 and 2 show. The first example focuses on a specific subsample of companies with high Q and low cash flows but a wide range of investment values. In this subset, the influence of the rest is evident, causing less financially constrained firms to appear more constrained due to errors introduced by the ASI. The second example constructs a new synthetic relationship, where investments are independent of cash flows but influenced by Tobin’s Q. The dominance of the ASI incorrectly renders the Q coefficient insignificant while artificially inflating the regression’s goodness of fit. These additional examples serve as a robustness test for the findings in the related article [3], reinforcing its conclusions and further highlighting the potentially significant adverse effects of using ASIs. These simulations also allow for the generation of entirely new datasets to analyze alternative models, as demonstrated in the last reusability example.
Hopefully, this paper can help other researchers better understand the use of simulations and realize their usefulness for research in finance, as well as in other areas related to business and economics. The effectiveness of Monte Carlo simulations is evident, as shown in Model I of Tables 3 and 4 in [3], and in, for example, Model I of Table 3 in this article. To generalize this to any other ASI-based model, the researcher can begin by identifying the complete accounting identity in the underlying model and then create a synthetic relation that mirrors the causal relationship described in the model. The researcher can then assess whether the ASI problem is present, for example by splitting the sample into observations with a positive rest and observations with a negative rest. Another approach could involve simulating the components of the rest using descriptive statistics that closely match those of a comparable real database, modeling them with their respective partial correlations (again, as similar as possible to the real database), and analyzing how changes in each component affect the overall rest. This would help identify which component of the rest contributes most to ASI-related issues in real database estimations. In a simulation, the researcher always knows the true coefficient that measures the relationship between the dependent variable and each independent variable that may affect it, and can therefore assess the importance of the various issues that may arise in estimating this true coefficient. Additionally, simulations enable the generation of panel data with autocorrelation (as shown in example 3 of reusability), as well as other potential problems (multicollinearity, heteroskedasticity, etc.), and help calibrate the effectiveness of various potential solutions.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/data10050062/s1.

Funding

This research received no external funding.

Data Availability Statement

The data are available at https://figshare.com/articles/dataset/Sup_material_for_DATA-Mdpi_-_A_Dataset_Examining_Problem_of_ASIs/28874555 (accessed on 23 April 2025). The repository consists of one Stata syntax file (and its equivalent in R) in .txt format and the two randomly generated datasets created for the specific analyses of this article (and their equivalents in .csv format).

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Fazzari, S.; Hubbard, R.G.; Petersen, B. Financing Constraints and Corporate Investment; National Bureau of Economic Research: Cambridge, MA, USA, 1988.
  2. Chen, H.J.; Chen, S.J. Investment-cash flow sensitivity cannot be a good measure of financial constraints: Evidence from the time series. J. Financ. Econ. 2012, 103, 393–410.
  3. Sánchez-Vidal, F.J. A cautionary note on the use of accounting semi-identity-based models. J. Risk Financ. Manag. 2023, 16, 389.
  4. Zhang, Y. The value of Monte Carlo model-based variance reduction technology in the pricing of financial derivatives. PLoS ONE 2020, 15, e0229737.
  5. Simsek, K. Monte Carlo Simulation in Financial Modeling. J. Portf. Manag. 2023, 49, 178.
  6. Deep, A. Advanced financial market forecasting: Integrating Monte Carlo simulations with ensemble Machine Learning models. Quant. Financ. Econ. 2024, 8, 286–314.
  7. Carbone, E. Discriminating between Preference Functionals: A Monte Carlo Study. J. Risk Uncertain. 1997, 15, 29–54.
  8. Gujarati, D.N.; Porter, D.C. Basic Econometrics; McGraw-Hill Irwin: New York, NY, USA, 2013.
  9. Rajan, R.G.; Zingales, L. Financial dependence and growth. Am. Econ. Rev. 1998, 88, 559–586.
  10. Jensen, M. Agency costs of free cash flow, corporate finance, and takeovers. Am. Econ. Rev. 1986, 76, 323–329. Available online: https://www.jstor.org/stable/1818789 (accessed on 23 April 2025).
  11. Brealey, R.A.; Myers, S.C.; Allen, F. Principles of Corporate Finance; McGraw-Hill: New York, NY, USA, 2003.
  12. Tobin, J. A general equilibrium approach to monetary theory. J. Money Credit Bank. 1969, 1, 15–29.
  13. Dufour, J.M.; Khalaf, L. Monte Carlo test methods in econometrics. In Companion to Theoretical Econometrics; Blackwell Companions to Contemporary Economics; Basil Blackwell: Oxford, UK, 2001; pp. 494–519.
Figure 1. Scatterplot of cash flows (X) and investments (Y).
Figure 2. Scatterplot of cash flows (X) and investments (Y) for a subset of high-Q companies. The brown subset represents high-Q, low-cash flows and low-investments companies, and the green subset represents high-Q, low-cash flows and high-investments companies.
Figure 3. Histograms of the simulated cash flow variable (in millions of JPY) for years 1 (a) and 10 (b), respectively.
Table 1. Descriptive statistics for the simulated variables.
| Variable name in the article | Variable name in the syntax | Mean | Median | SD | Min | 1st quartile | 3rd quartile | Max |
| Cash flows | rgcfc | 0.250 | 0.250 | 0.100 | −0.143 | 0.182 | 0.317 | 0.659 |
| Q | rgq | 2.450 | 2.450 | 0.200 | 1.634 | 2.316 | 2.585 | 3.276 |
| Investments | rgcapex | 0.187 | 0.187 | 0.110 | −0.280 | 0.113 | 0.261 | 0.658 |
| Rest | rgres | −0.063 | −0.063 | 0.114 | −0.536 | −0.140 | 0.014 | 0.564 |
N observations: 50,000.
Table 2. Results of OLS estimation for Equation (1) for a subset of high-Q, low-cash-flow companies, subsampled into low vs. high investments.
| | Model I: Subsample low investments (brown), Coef. | Model I, t | Model II: Subsample high investments (green), Coef. | Model II, t | Model III: Whole sample, Coef. | Model III, t |
| Cash flow | 0.105 *** | 4.47 | 0.232 *** | 10.17 | 0.510 *** | 28.27 |
| Dummy low invest × Cash flow | | | | | −0.910 *** | −68.31 |
| Constant | 0.021 *** | 5.30 | 0.165 *** | 38.55 | 0.111 *** | 33.53 |
| R2 | 0.014 | | 0.028 | | 0.511 | |
| N | 1306 | | 3586 | | 4892 | |
Robust standard errors in parentheses. *** p < 0.01.
Table 3. Results of OLS estimation for $\text{Inv}_i = \alpha + \beta_1 \text{CF}_i + \varepsilon_i$ for a subset of investments independent of the cash flows variable.
| | Model I: Whole sample, Coef. | Model I, t | Model II: Subsample for positive rest, Coef. | Model II, t |
| Cash flow | 0.002 | 0.38 | 0.558 *** | 85.69 |
| Q | 0.076 *** | 34.21 | 0.032 | 12.36 |
| Constant | −0.003 | −0.47 | 0.089 *** | 13.74 |
| R2 | 0.023 | | 0.323 | |
| N | 50,000 | | 16,109 | |
Robust standard errors in parentheses. *** p < 0.01.
Table 4. Correlation of Sales and Price.
| | Sales (Units) | Price |
| Sales (Units) | 1.000 | |
| Price | 0.992 *** | 1.000 |
***: significance at the 1 percent level.
