Disentangling Relevance from Reliability in Value Relevance Tests

Abstract: The literature shows that during the COVID-19 pandemic, the value relevance of earnings decreased. Traditionally, the literature measures value relevance using the relationship between stock returns and earnings. However, these tests are, in fact, “joint tests of relevance and reliability”. This caveat can distort the measurement of relevance, especially during the COVID-19 pandemic where the exceptional level of uncertainty could have affected relevance and reliability to different extents. This study disentangles reliability and relevance by extending the value relevance test. We use this extended test to examine firm categories in two dimensions: profits versus losses and intensive versus scarce use of accounting estimates. The results show that reliability and relevance are complementary when both are sufficiently high, but reliability has no significant impact on the usefulness of earnings when relevance is weak.


Introduction
Value relevance (or, in short, relevance) and reliability are the two basic premises of useful accounting information. The Statement of Financial Accounting Concepts No. 8 (2010) defined accounting information as reliable (termed "faithfully represented") if it is complete, neutral, and free from error, and challenged the value relevance stream of studies in empirical accounting research: "studies have not yet provided techniques for empirically measuring faithful representation apart from relevance" (SFAC 8, BC3.30). Although numerous studies examined value relevance using the relationship between stock returns and earnings [1], Barth, Beaver, and Landsman [2] stated that these tests are "joint tests of relevance and reliability." A significant earnings response coefficient (ERC) suggests the accounting variable is both relevant and reliable to some degree, but does not enable us to determine the distinct effect of each attribute.
Recently, research documented a decrease in the value relevance of earnings during the COVID-19 period [3]. A possible explanation offered for this decrease is greater transitory noise in earnings brought on by the pandemic, which made earnings less informative of a firm's future earnings. However, when we apply joint tests of relevance and reliability, another possible reason for the empirical findings emerges as diminished reliability of earnings. Reliability could have decreased during the COVID-19 pandemic period because of the enormous uncertainty that dominated the markets (see, e.g., [4]). Additionally, lockdowns and social distancing rules may have hindered the work of internal and external auditors, thus damaging the reliability of financial reporting. Since the standard tests of value relevance do not enable us to analyze relevance and reliability separately, the evidence from the literature does not provide a clear indication of whether financial reporting during the COVID-19 period suffered from lower relevance, lower reliability, or both. To determine this, a more discerning value relevance test is required.
This study introduces a method to disentangle reliability from relevance by extending the standard value relevance tests. For this purpose, we construct a comprehensive measure of reliability based on the findings of previous studies. Specifically, we build on observable incidents formerly associated in the literature with a lack of reliability. We use this reliability measure to extend the standard value relevance tests, to gain insights into reliability and relevance as characteristics of useful accounting information. In particular, we look for cases where increasing reliability or relevance does not improve the usefulness of reported earnings.
We start by disentangling reliability from relevance in estimating the usefulness of losses versus profits, where about 30% of our sample firms report losses. Hayn [5] and Collins, Pincus, and Xie [6] reported that the value relevance of losses is significantly lower than the value relevance of profits. Yet, it remains unclear whether the usefulness of reported losses is weak due to low reliability or low relevance (or both). Utilizing the proposed reliability measure, we find that, on average, losses are significantly less reliable than profits. We then employ the extended tests for further testing of the incremental usefulness of profits over losses, controlled for the reliability level. For high-reliability reports, the estimated ERC for losses is significantly lower than the respective ERC estimated for profits. A significantly greater ERC for profits over losses is also estimated using low-reliability reports. After controlling for reliability levels, we interpret the results as suggesting that the relevance of losses is significantly lower than the relevance of profits. However, we find that the ERC estimated for high-reliability loss reports is insignificantly different from the ERC estimated for low-reliability loss reports. We conclude that high reliability among firms experiencing losses does not enhance the usefulness of reported earnings because it does not compensate for weak relevance. The results are robust when assessed using alternative specifications and robustness checks.
Next, we conduct a similar analysis based on the intensity of reporting of accounting estimates. We use 10 accounting estimates based on the list in the work of Lev, Li, and Sougiannis [7] and test the relevance and reliability of firms with financial statements that include numerous estimates versus firms that sparsely use estimates. When applying the proposed reliability measure, we find that reporting more accounting estimates is negatively and significantly associated with reliability. On extending the value relevance tests for profit reports with a high intensity of accounting estimates, we note that the estimated ERC for high-reliability reports is significantly greater than the estimated ERC for low-reliability reports. As expected for firms with a high intensity of accounting estimates, this result suggests that more reliable reports improve the usefulness of reporting. We also assess the group of profit reports with just a few accounting estimates. In this group, the estimated ERC for highly reliable reports is insignificantly different from the estimated ERC for poorly reliable reports. We interpret the findings as suggesting that reliability does not enhance the usefulness of reported profits for firms with insufficient relevance due to their rare production of accounting estimates. Again, we conclude that the high reliability of such firms does not enhance the usefulness of their reported earnings due to the weak relevance of their reports.
The contribution of this study lies in the new method we propose to disentangle reliability from relevance in value relevance tests. Our extended test can be used in multiple settings to examine the distinct effect of each attribute. In particular, this test enables us to identify changes in the relevance and reliability of financial reporting during unique periods such as the COVID-19 pandemic. Another contribution stems from the findings in both settings we use, suggesting that increasing the reliability does not improve the usefulness of reports when the relevance is insufficient.

Relevance and Reliability in Standard Value Relevance Tests
When setting standards, the goal of the Financial Accounting Standards Board (FASB), the organization responsible for establishing accounting and financial reporting standards in the United States, is to heighten the usefulness of the information that entities report in their financial statements to the users of those statements. In assessing whether the usefulness of information would be enhanced, the FASB considers relevance and reliability as two qualitative characteristics that make accounting information useful. Relevant financial information is capable of making a difference in decisions made by users. Reliable information is complete, neutral (unbiased), and free from error. To be useful, information must be both relevant and reliable. Hence, neither reliable information about an irrelevant economic construct nor unreliable information about a relevant economic construct helps users make decisions. Accordingly, measuring the effects of relevance and reliability on the usefulness of accounting information is a key issue for accounting regulators. In particular, this study attempts to unravel the joint measurement of the relevance and reliability of earnings reported in financial statements.
A vast stream of value relevance studies attempts to operationalize key aspects of the FASB's conceptual framework. Value relevance studies use share prices (and/or stock returns) to infer whether capital market participants consider accounting information to be sufficiently relevant and reliable to be useful in making investment decisions. For example, [8][9][10][11] regress share price or returns on earnings to deduce value relevance in various contexts, such as the application of international accounting standards, accounting rules for joint ventures, and transitions in the economy. This stream of studies implicitly assumes: (1) investors perceive the relevance of a specific accounting information for the future cash flows of the firm, (2) investors perceive the reliability of that specific accounting information, (3) an asset-pricing model that the investors use to control for all the other factors that explain share prices, such as risk, and (4) market efficiency. Because the standard value relevance tests are joint tests, they allow drawing inferences about reliability and relevance together, but not separately. Barth, Beaver, and Landsman [2] make a clear argument: "Value relevance tests generally are joint tests of relevance and reliability. Although finding value relevance indicates the accounting amount is relevant and reliable, at least to some degree, it is difficult to attribute the cause of lack of value relevance to one or the other attribute. Neither relevance nor reliability is a dichotomous attribute, and SFAC No. 5 does not specify "how much" relevance or reliability is sufficient to meet the FASB's criteria. In addition, it is difficult to test separately relevance and reliability of an accounting amount".
In a similar vein, SFAC 8 draws attention to the measurement of reliability and relevance. Criticizing the empirical accounting research, SFAC 8 (2010, BC3.30) emphasizes that: "Empirical accounting researchers have accumulated considerable evidence supporting relevant and faithfully represented financial information through correlation with changes in the market prices of entities' equity or debt instruments. However, such studies have not provided techniques for empirically measuring faithful representation apart from relevance." We address this void by disentangling reliability apart from relevance. To do that, we build on prior studies to propose a measure of reliability and extend the standard value relevance tests.

Measuring Reliability
We built on prior studies for measuring the reliability of reported earnings. To facilitate as much separation between relevance and reliability as possible, we constructed a measure that acknowledges five observable indicators suggesting impaired reliability. Notably, ample literature detects earnings management using discretionary accruals (e.g., [12][13][14][15][16][17][18][19][20]). However, accrual metrics do not allow measurement of reliability apart from relevance because accruals are value-relevant accounting information.
The first indicator is the issuing of restatements, revealing that earnings reported on prior periods did not reliably reflect the firm's underlying economic constructs and needed to be restated. The second indicator is internal controls demonstrating one or more material weaknesses and consequent noncompliance with the Public Company Accounting Oversight Board (PCAOB) Auditing Standard No. 5 (2007) requirement for effective internal control over financial reporting. Moreover, such weaknesses are likely to suggest that a material misstatement of the company's reported earnings may not be prevented or detected on a timely basis. Accordingly, ineffective controls disclosed under the Sarbanes-Oxley Act (SOX) suggest impaired reliability (see also [21,22]). The third indicator is just meeting/beating earnings benchmarks. Prior literature hypothesized and found that firms that slightly beat benchmarks are more likely to have managed earnings (see [23][24][25]). That is, just meeting/beating earnings benchmarks is likely to reflect biased accounting figures. The fourth indicator builds on prior studies suggesting that a change of auditor increases the likelihood of misstated earnings. Stice [26] notes that a new auditor is less able to detect material misstatements in the audit process because of a lack of familiarity with the client. Hence, the risk of audit failure and subsequent litigation is higher during an initial engagement than in subsequent years. The fifth indicator is a qualified, disclaimed, or adverse audit opinion. The auditor's failure or reluctance to produce an unqualified opinion indicates a reporting problem [27]. Admittedly, qualified, disclaimed, and adverse audit opinions are rare. Nevertheless, we include this indicator in the reliability measure for completeness.
We assume the reliability of reported earnings is adversely affected by the five indicators comprising the metric. Accordingly, we propose a comprehensive reliability measure, RSCORE (Reliability Score), which aims to capture the extent to which financial statements contain accounting information that is complete, neutral, and free from error. RSCORE is based on the five adverse reliability indicators detailed above. For each firm-year, it counts the number of indicators recorded and known before the financial statements are announced, out of the following:
I. Filing of a restatement (RESTATE): RSCORE builds on information known to investors. Thus, we record whether a restatement of earlier financial statements was filed during the year prior to the announcement of financial statements.
II. Material weakness in internal controls over financial reporting (MW, material weakness), disclosed either under Section 302 or under Section 404 of SOX.
III. Just meeting/beating earnings benchmarks (MBE, just meeting/beating earnings): We employ the three earnings benchmarks frequently used in the literature: zero, last year's earnings per share (EPS), and analyst forecast consensus [23][24][25]. We use these three benchmarks alternatively; that is, meeting or slightly beating any of them indicates manipulation, and hence impaired reliability. Following Cohen, Dey, and Lys [24], observations suspected of indicating just beating/meeting the zero benchmark are defined as firm-years with earnings before extraordinary items over lagged assets between 0 and 0.005. Observations suspected of indicating just beating/meeting last year's earnings are firm-years with a change in basic EPS excluding extraordinary items from last year between zero and two cents. Observations suspected of indicating just beating/meeting the analyst forecast consensus are firm-years in which actual EPS exceeds the analyst forecast consensus outstanding prior to the earnings announcement date by between zero and one cent.
IV. Change of auditor (CHANGE).
V. Auditor adverse, qualified, or no opinion (OPINION).
Specifically, for every firm-year, RSCORE counts the number of adverse indicators recorded out of the five indicators listed above with a negative sign. That is, the highest reliability is indicated by RSCORE = 0 (none of the indicators exist), whereas RSCORE = −5 indicates the lowest reliability (all five indicators exist).
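The counting rule can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used to build the sample: the `FirmYear` record and its field names are hypothetical, the benchmark intervals are assumed to be inclusive at both ends, and a missing analyst consensus is assumed to leave the other two benchmarks in force.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FirmYear:
    # Hypothetical record; field names are illustrative, not database items.
    restate: bool                  # restatement filed in the year before the announcement
    mw: bool                       # material weakness disclosed under SOX Section 302 or 404
    change: bool                   # change of auditor
    opinion: bool                  # adverse, qualified, or no audit opinion
    ib_over_lagged_assets: float   # earnings before extraordinary items / lagged total assets
    eps_change: float              # change in basic EPS from last year, in dollars
    eps_surprise: Optional[float]  # actual EPS minus consensus, in dollars; None if no consensus


def just_meets_or_beats(fy: FirmYear) -> bool:
    """MBE: True if the firm-year just meets or beats any of the three benchmarks."""
    zero_benchmark = 0.0 <= fy.ib_over_lagged_assets <= 0.005
    last_year_benchmark = 0.0 <= fy.eps_change <= 0.02
    consensus_benchmark = (
        fy.eps_surprise is not None and 0.0 <= fy.eps_surprise <= 0.01
    )
    return zero_benchmark or last_year_benchmark or consensus_benchmark


def rscore(fy: FirmYear) -> int:
    """Number of adverse indicators with a negative sign: 0 (best) to -5 (worst)."""
    indicators = [fy.restate, fy.mw, just_meets_or_beats(fy), fy.change, fy.opinion]
    return -sum(indicators)


clean = FirmYear(False, False, False, False, 0.08, 0.30, 0.05)
flagged = FirmYear(True, False, False, False, 0.003, 0.50, None)
print(rscore(clean), rscore(flagged))
```

In this example the second firm-year has a restatement and earnings just above zero, so two indicators are counted.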
RSCORE is an appealing measure of reliability for a number of reasons: (i) it captures indicators of whether reported earnings are complete, neutral, and free from error, as documented in the literature and as stated in SFAC 8; (ii) it is an observable, transparent, and easily reproducible metric; (iii) it does not involve relevance; and (iv) it is an equal weight procedure that follows Lev and Thiagarajan [28], Piotroski [29], and Gompers, Ishii, and Metrick [30]. It is a simple procedure that avoids complex and controversial weighting of the relative impact of different indices. Utilizing RSCORE provides a means for comparing the differential usefulness of reported earnings in a setting of reliable information (RSCORE = 0) with that in a setting of weak reliability (RSCORE < 0). Thus, using RSCORE enables us to extend the standard value relevance tests.

Sample and Descriptive Statistics
We downloaded data for all nonfinancial firms on Compustat from 2002 to 2012 with available total assets and market value data, a total of 57,169 firm-years. Our sample period begins in 2002 because this year is the earliest for which we were able to obtain data on material weaknesses over internal controls reported under SOX. We deleted observations with share prices below one dollar from the sample to eliminate economically marginal firms. We also required firms to have at least two consecutive years of available data in order to allow for deflation of variables and for sufficient CRSP stock return data. We did not limit the applicability of RSCORE to observations with available I/B/E/S data. Thus, when I/B/E/S data were unavailable, we used earnings per share or change in earnings per share. These requirements reduced the sample size to 40,542 firm-year observations.
To compute RSCORE, we utilized data on its five components. We collected each restatement attributed to the year in which the restatement was announced and material weaknesses over internal controls reported under SOX (Section 302 or Section 404 reports). We considered a firm as having ineffective controls if it disclosed one material weakness or more in internal controls under either of these sections. We obtained data on change of auditor and auditor opinion from Compustat, as well as data necessary to identify firm-years just meeting/beating earnings benchmarks (zero and last year's EPS). We extracted data on the third benchmark, consensus analyst forecasts, from the Institutional Brokers' Estimate System (I/B/E/S). We calculated the consensus earnings forecast as the mean of all forecasts announced in the month preceding that of the earnings announcement. We compared earnings forecast to actual earnings taken from the I/B/E/S, because these data are more likely than the Compustat data to be consistent with the forecast in terms of the treatment of extraordinary items and special items. We obtained financial data from the Compustat industrial annual file and stock return information from the CRSP monthly file.

#EST = The number of accounting estimates firm i recorded in year t out of the following 10: change in inventories, depreciation and amortization, deferred taxes, pension expense, post-retirement benefits, doubtful receivables, restructuring costs, in-process research and development, stock compensation expenses, and asset write-downs.
FCF = Firm i's free cash flow, calculated as the difference between operating cash flow (OANCF_t) and average capital expenditure (CAPX_t) over years t and t − 1, deflated by total assets at the beginning of year t (AT_{t−1}).
The percentage change in firm i's sales (SALE_t) from year t − 1 to year t.
LEV = Financial leverage equal to the sum of long-term debt (Compustat DLTT_t) and debt in current liabilities (Compustat DLC_t) divided by the sum of the long-term debt, debt in current liabilities, and market value of equity.
LOSS = A dummy variable equal to 1 if firm i in year t recorded negative earnings before extraordinary items; zero otherwise.
MBE = A dummy variable equal to 1 if firm i just meets/beats at least one of three earnings benchmarks in year t; zero otherwise. Observations suspected of indicating just beating/meeting the zero benchmark are defined as firm-years with earnings before extraordinary items over lagged assets (IB_t / AT_{t−1}) between 0 and 0.005. Observations suspected of indicating just beating/meeting last year's earnings are firm-years with a change in basic EPS excluding extraordinary items from last year (EPSPX_t − EPSPX_{t−1}) between 0 and 2 cents. Observations suspected of indicating just beating/meeting analyst forecast consensus are firm-years in which actual EPS exceeds the analyst forecast consensus outstanding prior to the earnings announcement date by between 0 and 1 cent.
MW = A dummy variable equal to 1 if firm i reports ineffective controls under Section 302 or 404 in year t; zero otherwise.

Table 2 reports descriptive statistics of RSCORE. The mean value of RSCORE is −0.325 and, as expected, the distribution of RSCORE is left-skewed: most observations sit at zero, with a tail toward negative values. As for the components of RSCORE, being indicator variables, their means represent frequency in the sample. The most frequent indicators of low reliability are restatements (RESTATE), recording a mean of 0.149, and meeting/beating earnings benchmarks (MBE), with a mean of 0.103. Less frequent are change of auditor (CHANGE) and material weakness in controls over financial reporting (MW), which have means of 0.085 and 0.069, respectively. Lastly, the occurrence of auditor adverse, qualified, or no opinion (OPINION) is rare, reflected by a mean of about 0.001.
Table 3 presents the pattern of RSCORE in our sample observations. We find that 72.7% of the observations have no adverse indicators of unreliability, whereas 22.6% have a single adverse indicator of unreliability. The remaining 4.7% of the sample have two indicators or more, suggesting serious reliability problems. The columns on the right side of the panel indicate a reasonable distribution of the five indicators among firms with one to four adverse incidents (where the values of RSCORE are −1 to −4, respectively).

Extending the Standard Value Relevance Tests
The reliability measure allows for extending the standard value relevance tests in two ways. First, we classified our sample observations into two groups, using RSCORE levels to separate firm-year observations with no signal of weak reliability (RSCORE = 0) from observations with signals of weak reliability (RSCORE < 0). For the observations in each of the two groups, we estimated the standard value relevance regression model in its simplest form, a regression of stock returns on earnings surprises (Equation (1)). We used this simple model because earnings are a summary accounting variable that aggregates revenues and expenses. Earnings surprises are interpreted as value relevant if they influence investors' valuation of securities in decisions to buy or sell stocks. A vast body of empirical accounting research employs regressions of stock returns on earnings surprises (and earnings levels) to infer value relevance. A higher value for ERC suggests more useful reported earnings for security valuation. To verify that the findings do not depend on how stock returns are measured, we used two alternative stock return variables: raw stock returns and market-adjusted stock returns (see, for example, [31][32][33][34][35][36]).
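The group-wise estimation can be illustrated with a short Python sketch. It assumes the simplest possible specification, a one-regressor regression of returns on the earnings surprise with the slope read as the ERC, and it omits the fixed effects and clustering used in the reported estimations. The function names and the `(ret, d_ern, rscore)` row layout are hypothetical.

```python
def ols_intercept_slope(x, y):
    """Closed-form OLS for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def erc_by_reliability(rows):
    """rows: iterable of (ret, d_ern, rscore). Returns the ERC per reliability group."""
    groups = {"high": ([], []), "low": ([], [])}
    for ret, d_ern, rscore in rows:
        key = "high" if rscore == 0 else "low"   # RSCORE = 0 versus RSCORE < 0
        groups[key][0].append(d_ern)
        groups[key][1].append(ret)
    return {key: ols_intercept_slope(x, y)[1] for key, (x, y) in groups.items()}


# Synthetic data constructed so that the true slopes are 0.8 and 0.5.
surprises = [-0.2, -0.1, 0.0, 0.1, 0.3]
rows = [(0.10 + 0.8 * d, d, 0) for d in surprises]
rows += [(0.05 + 0.5 * d, d, -1) for d in surprises]
ercs = erc_by_reliability(rows)
print(ercs)
```

Comparing the two slopes is the portfolio-style version of the test; a formal comparison would also need standard errors for each estimate.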
Next, we extended the standard value relevance test to gain further insights into the impact of reliability on the usefulness of reported earnings. This extended model, Equation (2), disentangles the effects of reliability and relevance by letting the association between earnings surprises and stock returns vary with RSCORE through an interaction term. As before, we used raw returns and market-adjusted returns in estimating the model. To increase confidence in our findings, we estimated two analogous versions of the two regression models. We estimated a pooled cross-sectional regression model, with firm effects, year effects, and both firm and year clustering. We applied the extended value relevance tests to disentangle relevance from reliability in two contexts: losses and accounting estimates.
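A sketch of the extended test follows, under stated assumptions: returns are regressed on the earnings surprise, RSCORE, and their product. Including RSCORE as a stand-alone regressor is an assumption of this sketch, and fixed effects and clustered standard errors are left out. All function names are hypothetical, and the solver assumes a full-rank design matrix.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(design, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def extended_test(rows):
    """rows: iterable of (ret, d_ern, rscore)."""
    rows = list(rows)
    design = [[1.0, d, float(rs), d * rs] for _, d, rs in rows]
    y = [ret for ret, _, _ in rows]
    names = ["intercept", "d_ern", "rscore", "d_ern_x_rscore"]
    return dict(zip(names, ols(design, y)))


def erc_at(coefs, rscore):
    """Implied ERC at a given reliability level."""
    return coefs["d_ern"] + coefs["d_ern_x_rscore"] * rscore


# Synthetic data with known coefficients 0.02, 0.8, 0.01, and 0.1.
rows = [
    (0.02 + 0.8 * d + 0.01 * rs + 0.1 * d * rs, d, rs)
    for d in [-0.2, -0.1, 0.05, 0.1, 0.3]
    for rs in [0, -1, -2]
]
coefs = extended_test(rows)
print(coefs, erc_at(coefs, -1))
```

Because RSCORE is zero or negative, a positive interaction coefficient means the implied ERC shrinks as adverse indicators accumulate, while the coefficient on the earnings surprise is the ERC at RSCORE = 0.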

The Determinants of Poor Reliability
We started by confirming that both contexts, losses and intensive accounting estimates, were associated with low reliability, which affects the results of the standard value relevance test. For this purpose, we employed a cross-sectional regression model in which RSCORE was the dependent variable, and the explanatory variables included a dummy variable for loss observations (LOSS, equal to one for firm-years reporting a loss; otherwise zero) and a variable reflecting the number of accounting estimates in the financial statement (#EST). The regression also incorporated other variables associated in the literature with low reliability, such as the size of the firm, book-to-market value of equity, free cash flow, the number of segments, and whether the auditor is one of the big four. The estimation includes fixed firm and year effects. Table 4 reports the results. In the specification using both variables of interest, the coefficient estimate on LOSS is −0.023 and highly significant (p-value = 0.001). As expected, this result suggests that losses are associated with low reliability. Similarly, the coefficient estimate on #EST is −0.014 (p-value < 0.001), indicating that accounting estimates are also associated with low reliability. The signs of the coefficient estimates on the control variables match their expected effect on reliability, as explored by prior studies. The coefficient estimates on size of the firm, free cash flow, age, and the presence of a big four auditor are all positive and significant, suggesting they contribute to reliability. Conversely, the coefficient estimate on the number of segments is negative, indicating that more segments hamper reliability.

Relevance and Reliability of Profits versus Losses
Prior studies employed the standard value relevance framework and found that the return-earnings relation for loss firms was much weaker than that for profit firms (e.g., [5,6]). A lower ERC in the standard value relevance framework suggests that losses are less useful for investment decision making than profits. The findings are attributed to the perception of losses as being transitory. The explanation lies in the abandonment option, whereby loss firms are more likely to curtail operations ([5,37]). Following a similar path, Balakrishnan, Bartov, and Faurel [38] reported that earnings surprises of loss firms are substantially larger than those of profit firms, indicating weaker relevance.
However, other studies documented signals of weak reliability in loss firms (e.g., [21,39,40]). Since value relevance tests are joint tests of relevance and reliability, they do not provide means for inferring which of the two characteristics of useful information is lacking in loss firms (or, perhaps both). To examine this question, we utilized the reliability measure and the extended value relevance tests for gaining insights on each of the two characteristics in profit reports versus loss reports.
In the first test, we classified our sample observations into four portfolios: profit firms with high reliability (RSCORE = 0), profit firms with low reliability (RSCORE < 0), loss firms with high reliability (RSCORE = 0), and loss firms with low reliability (RSCORE < 0). We then estimated Equation (1), the standard value relevance regression model in its simplest form, for each firm category separately. Table 5 reports the estimation results. For the estimation using raw returns, the estimated ERC for profit firms with high reliability is 0.810 and the estimated ERC for profit firms with low reliability is 0.691, both highly significant (p-value < 0.001). The difference between the ERCs, 0.119, is significant (p-value = 0.001), indicating that reliability enhances the usefulness of reported earnings for profit firms. By contrast, the estimated ERC for loss firms with high reliability is 0.362 and the estimated ERC for loss firms with low reliability is 0.366, both highly significant (p-value < 0.001). Importantly, the difference between the ERCs is close to zero and insignificant (p-value = 0.994), indicating that reliability has no significant impact on the ERC for loss firms. Comparing the estimated ERCs between profit and loss firms, we found significantly lower ERCs for loss firms than for profit firms when reliability is high (0.362 and 0.810, respectively, p-value < 0.001). Similarly, we found significantly lower ERCs for loss firms than for profit firms when reliability is low (0.366 and 0.691, respectively, p-value < 0.001). These findings indicate a higher ERC in profit firms than in loss firms after controlling for the level of reliability; that is, reported earnings are more useful in profit firms than in loss firms at a given RSCORE level. The result confirms the argumentation in Hayn [5] and subsequent studies. Results from using market-adjusted returns for estimating Equation (1) are essentially the same.
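The two-way comparison in this paragraph can be laid out as simple arithmetic on the four raw-return ERCs quoted above. The snippet below only reorganizes those reported point estimates; the dictionary layout and names are illustrative, and the significance levels come from the paper's tests, not from this calculation.

```python
# ERC point estimates quoted in the text for raw returns (Table 5).
erc = {
    ("profit", "high"): 0.810,
    ("profit", "low"): 0.691,
    ("loss", "high"): 0.362,
    ("loss", "low"): 0.366,
}

# Effect of reliability within each earnings sign: high-reliability minus low-reliability ERC.
reliability_gap = {
    sign: round(erc[(sign, "high")] - erc[(sign, "low")], 3)
    for sign in ("profit", "loss")
}

# Profit-versus-loss gap within each reliability level.
profit_loss_gap = {
    level: round(erc[("profit", level)] - erc[("loss", level)], 3)
    for level in ("high", "low")
}

print(reliability_gap)
print(profit_loss_gap)
```

The first dictionary reproduces the 0.119 difference for profit firms and a near-zero difference for loss firms. The second shows that the profit-loss gap remains large at both reliability levels.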
Next, we employed the extended value relevance test, described in Section 3.3, to disentangle the effects of relevance and reliability on the usefulness of financial statements of profit versus loss firms. The estimation results of Equation (2) are reported in Table 6. For raw stock returns, the coefficient estimate on earnings surprise (∆ERN) in profit firms is 0.803, more than twice the corresponding coefficient estimate of loss firms, 0.353. Since this model controls for reliability, the findings suggest that the relevance of profits is higher than the relevance of losses, even after considering their dissimilar reliability. Turning to the interaction between the two characteristics, the coefficient estimate for profit firms is 0.080 and significant (p-value = 0.003), indicating that higher values of RSCORE increase the association between ∆ERN and stock returns. However, for loss firms, the interaction coefficient estimate is −0.028 and insignificant (p-value = 0.264), indicating that higher reliability has an insignificant impact on the association between ∆ERN and stock returns. We found similar results for market-adjusted returns. Overall, the results indicate that loss reports are significantly less reliable and significantly less relevant than profit reports. Moreover, the results suggest that if the level of relevance is sufficient, as is the case in profit firms, increasing reliability is expected to enhance the usefulness of reported earnings. In contrast, if reported earnings in loss firms have a low, insufficient level of relevance, reliability is unlikely to enhance their usefulness. These results are consistent with the statement made in SFAC 8, that information must be both relevant and reliable if it is to be useful. Neither a reliable reporting of an irrelevant phenomenon, nor an unreliable reporting of a relevant phenomenon, helps financial statement users to make good decisions. The results reported here confirm this premise by demonstrating that reliability matters only when the information is highly relevant. Furthermore, our findings complement prior experiment-based evidence within the fair-value context that users do not view relevance and reliability as independent constructs [41].

Relevance and Reliability of Accounting Estimates
In this section, we further explore the extent to which accounting estimates influence reliability and relevance. The FASB encourages firms to report accounting estimates because they increase the relevance of accounting information (IFRS Memorandum 2005; Johnson, 2005). However, when considering reliability, prior studies found that accounting latitude leads to biases and manipulations of earnings (see [42] for a review), which are likely to impede reliability. Prior studies also reported that accounting estimates lead to impaired item-specific reliability [43][44][45][46][47]. Specifically, Lev, Li, and Sougiannis ([7], p. 780) point out that "accounting estimates . . . introduce a considerable and unknown degree of noise, and perhaps bias, to financial information, detracting from their usefulness. . . . Add to the above objective difficulties in generating reliable estimates the expected and frequently documented susceptibility of accounting estimates to managerial manipulation, and the consequent adverse impact of estimates on the usefulness of financial information becomes apparent." Based on the FASB's approach, we assume that reliance on accounting estimates increases the relevance of the reports. However, relevance of reported earnings is likely to expand their usefulness only in the presence of sufficient reliability. If relevance is insufficient in the group of weak reliability reports (as in SFAC 8, 2010, QC17), increasing reliability is not expected to enhance the usefulness of reported earnings.
To test this assertion, we followed Lev, Li, and Sougiannis [7] and compiled 10 accounting estimates underlying financial information. These estimates are change in inventories, depreciation and amortization, deferred taxes, pension expense, post-retirement benefits, doubtful receivables, restructuring costs, in-process research and development, stock compensation expense, and asset write-downs. Next, we computed the intensity of accounting estimates (#EST) for each firm-year as the number of estimates recorded in the financial statements out of the 10 on our list. Hence, #EST can take values between zero and 10. In this section, we included only profit firms to prevent the confounding effect of profits versus losses from driving our results.
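The construction of #EST can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the dictionary keys are hypothetical field names, and treating an estimate as "recorded" when its value is non-missing and non-zero is an assumption made here. The LOW EST and HIGH EST cutoffs (up to three estimates versus at least four) follow the definitions used in the tables.

```python
from typing import Mapping, Optional

# The ten estimate categories listed in the text; key names are hypothetical.
ESTIMATE_ITEMS = (
    "change_in_inventories",
    "depreciation_amortization",
    "deferred_taxes",
    "pension_expense",
    "post_retirement_benefits",
    "doubtful_receivables",
    "restructuring_costs",
    "in_process_rnd",
    "stock_compensation",
    "asset_write_downs",
)


def estimate_intensity(firm_year: Mapping[str, Optional[float]]) -> int:
    """Count how many of the ten estimates are recorded for one firm-year.

    Assumption: an estimate counts as recorded if it is non-missing and non-zero.
    """
    count = 0
    for item in ESTIMATE_ITEMS:
        value = firm_year.get(item)
        if value is not None and value != 0:
            count += 1
    return count


def intensity_group(n_est: int) -> str:
    """LOW_EST: up to three estimates; HIGH_EST: at least four."""
    return "LOW_EST" if n_est <= 3 else "HIGH_EST"


# Hypothetical firm-year: a zero pension expense is not counted as recorded.
example = {
    "depreciation_amortization": 12.4,
    "deferred_taxes": -1.1,
    "pension_expense": 0.0,
    "stock_compensation": 3.2,
}
n_est = estimate_intensity(example)
print(n_est, intensity_group(n_est))
```

By construction the count lies between zero and 10, matching the range stated for #EST.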
As we did for profits versus losses, we began with portfolio analysis in which we used four firm categories: we first classified the sample into low- versus high-intensity of accounting estimates, and then divided each group into high reliability firms (RSCORE = 0) and low reliability firms (RSCORE < 0). The low-intensity category includes all profit observations recording up to three estimates, while the high-intensity category includes all profit observations recording at least four estimates. For each of the four firm categories, we estimated Equation (1), the standard value relevance test. As before, we used either raw returns or market-adjusted returns. Table 7 presents the results. The coefficient estimate on earnings surprise (∆ERN) for high-estimate-intensity firms is significantly higher than the corresponding coefficient estimate for low-estimate-intensity firms. For the estimation using raw returns, the estimated ERC for high-estimate-intensity firms with high reliability is 0.847 and the estimated ERC for high-estimate-intensity firms with low reliability is 0.703, both highly significant (p-value < 0.001). The difference between the ERCs, 0.144, is significant (p-value = 0.001), indicating that reliability enhances the usefulness of reported earnings for high-estimate-intensity firms. By contrast, the estimated ERC for low-estimate-intensity firms with high reliability is 0.631 and the estimated ERC for low-estimate-intensity firms with low reliability is 0.620, both highly significant (p-value < 0.001). Importantly, the difference between the ERCs is close to zero and insignificant (p-value = 0.783), indicating that reliability has no significant impact on the ERC for low-estimate-intensity firms.
Comparing the estimated ERCs between high- and low-estimate-intensity firms, we found significantly lower ERCs for low-estimate-intensity firms than for high-estimate-intensity firms when reliability is high (0.637 and 0.847, respectively, p-value < 0.001). Similarly, we found significantly lower ERCs for low-estimate-intensity firms than for high-estimate-intensity firms when reliability is low (0.620 and 0.703, respectively, p-value < 0.001). These findings indicate a higher ERC in high-estimate-intensity firms than in low-estimate-intensity firms at each level of reliability, suggesting that reported earnings are more useful in high-estimate-intensity firms after controlling for RSCORE. The results from using market-adjusted returns for estimating Equation (1) are essentially the same.
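The two-by-two portfolio procedure described above can be sketched as follows. The sketch is illustrative only: it runs on synthetic data generated inside the script (no figures from Table 7 are reproduced), it assumes that Equation (1) is a regression of returns on ∆ERN with an intercept and no further controls, and the helper name `erc` is hypothetical. It also omits the significance tests for differences between portfolios that the paper reports.

```python
import numpy as np


def erc(returns: np.ndarray, d_ern: np.ndarray) -> float:
    """OLS slope of returns on earnings surprise, with an intercept."""
    X = np.column_stack([np.ones_like(d_ern), d_ern])
    coef, *_ = np.linalg.lstsq(X, returns, rcond=None)
    return float(coef[1])


# Synthetic sample (all values are made up for illustration).
rng = np.random.default_rng(0)
n = 4000
n_est = rng.integers(0, 11, size=n)                 # #EST between 0 and 10
rscore = -rng.integers(0, 3, size=n).astype(float)  # 0 = high reliability, < 0 = low
d_ern = rng.normal(0.0, 0.05, size=n)

high_est = n_est >= 4     # HIGH EST: at least four estimates
high_rel = rscore == 0    # high reliability: RSCORE = 0

# Assumed data-generating process: reliability raises the ERC only when
# estimate intensity is high; the ERC is flat in RSCORE otherwise.
true_slope = np.where(high_est, 0.85 + 0.07 * rscore, 0.63)
ret = 0.02 + true_slope * d_ern + rng.normal(0.0, 0.01, size=n)

# Estimate the ERC separately in each of the four portfolios.
ercs = {}
for est_label, est_mask in (("HIGH_EST", high_est), ("LOW_EST", ~high_est)):
    for rel_label, rel_mask in (("high_rel", high_rel), ("low_rel", ~high_rel)):
        cell = est_mask & rel_mask
        ercs[(est_label, rel_label)] = erc(ret[cell], d_ern[cell])

for key, value in ercs.items():
    print(key, round(value, 3))
```

Comparing the two HIGH_EST cells with each other, and the two LOW_EST cells with each other, mirrors the within-group ERC differences discussed in the text.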
Our second analysis employed the extended value relevance test, described in Section 3.3, to disentangle the effects of relevance and reliability on the usefulness of financial statements. The estimation results of Equation (2) are reported in Table 8. For raw stock returns, the coefficient estimate for earnings surprise (∆ERN) of high-estimate-intensity firms is 0.835, significantly higher than the coefficient estimate for low-estimate-intensity firms, which is 0.648. The findings suggest that the relevance of high estimate intensity is higher than the relevance of low estimate intensity, even after considering their dissimilar reliability. As for the interaction between the two characteristics, the coefficient estimate for high-estimate-intensity firms is 0.087 and significant (p-value = 0.003), indicating that higher values of RSCORE increase the association between ∆ERN and stock returns. However, for low-estimate-intensity firms, the interaction coefficient estimate is 0.054 and insignificant (p-value = 0.407), indicating that higher reliability has an insignificant impact on the association between ∆ERN and stock returns. We report similar findings for market-adjusted returns.
Notes to Tables 7 and 8: Table 7 presents portfolio results, where the coefficient estimates β are obtained from estimating the value relevance regression described in Equation (1). LOW EST includes all profit observations recording up to three estimates, while HIGH EST includes all profit observations recording at least four estimates. Table 8 presents coefficient estimates of cross-sectional value relevance regressions with RSCORE and an interaction of ∆ERN with RSCORE, as in Equation (2). p-values are reported in parentheses. Definitions of all variables are in Table 1.
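A regression with RSCORE and an interaction of ∆ERN with RSCORE, as described for Table 8, can be sketched as follows. This is a hedged illustration rather than the authors' specification: it assumes Equation (2) contains an intercept, ∆ERN, RSCORE, and their product with no further controls, it is fitted to synthetic data generated in the script, and the function names `fit_extended` and `implied_erc` are hypothetical.

```python
import numpy as np


def fit_extended(ret: np.ndarray, d_ern: np.ndarray, rscore: np.ndarray) -> np.ndarray:
    """OLS of ret on [1, d_ern, rscore, d_ern * rscore]; coefficients in that order."""
    X = np.column_stack([np.ones_like(d_ern), d_ern, rscore, d_ern * rscore])
    coef, *_ = np.linalg.lstsq(X, ret, rcond=None)
    return coef


def implied_erc(coef: np.ndarray, rscore_level: float) -> float:
    """Slope of returns on earnings surprise at a given RSCORE level."""
    return float(coef[1] + coef[3] * rscore_level)


# Synthetic sample (all parameter values below are made up for illustration).
rng = np.random.default_rng(1)
n = 5000
d_ern = rng.normal(0.0, 0.05, size=n)
rscore = -rng.integers(0, 4, size=n).astype(float)  # 0 = highest reliability
ret = (
    0.01
    + 0.80 * d_ern            # main effect of earnings surprise
    + 0.005 * rscore          # main effect of reliability
    + 0.08 * d_ern * rscore   # interaction effect
    + rng.normal(0.0, 0.01, size=n)
)

coef = fit_extended(ret, d_ern, rscore)
print("coefficient on d_ern:", round(float(coef[1]), 3))
print("coefficient on interaction:", round(float(coef[3]), 3))
print("implied ERC at RSCORE = 0:", round(implied_erc(coef, 0.0), 3))
print("implied ERC at RSCORE = -2:", round(implied_erc(coef, -2.0), 3))
```

In this setup the coefficient on ∆ERN is the slope at RSCORE = 0, and the interaction coefficient measures how that slope changes per unit of RSCORE. A positive interaction therefore means the slope falls as RSCORE becomes more negative.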
Overall, the results indicate that low-estimate-intensity reports are significantly less relevant than high-estimate-intensity reports. These results are consistent with the ample evidence in the prior literature on the value relevance of accounting estimates (e.g., [7], [43][44][45][46][47]). Unlike the prior literature that focused on the typically high relevance and low reliability of estimates, thus indicating a trade-off between these characteristics, we examined the interaction between them. The results suggest that if the level of relevance is sufficiently high, as is the case in high-estimate-intensity firms, increasing reliability is expected to enhance the usefulness of reported earnings. In contrast, if reported earnings in low-estimate-intensity firms have an insufficient level of relevance, reliability is unlikely to enhance their usefulness. Similar conclusions were obtained in our analysis of profit versus loss firms.

Conclusions
This study extends the standard value relevance test to disentangle the distinct effects of relevance and reliability on the usefulness of financial reporting. We applied a new measure of reliability in two settings: one comparing profit and loss firms, and the other comparing firms with high versus low estimate intensity. In both settings, the differential reliability between the opposing firm categories may distort the results of the standard value relevance test, which does not consider reliability.
The findings demonstrate that profits provide more relevant information than losses, above and beyond their higher reliability. Furthermore, the reports of high-estimate-intensity firms are more relevant than those of low-estimate-intensity firms, controlling for their low reliability. Moreover, both settings reveal that reliability matters only if the level of relevance is sufficiently high, in which case increased reliability enhances the usefulness of reported earnings. However, if reported earnings have a low level of relevance (losses and low estimate intensity), reliability is unlikely to improve their usefulness.
This study contributes to the rich value relevance literature by presenting a test that differentiates between relevance and reliability, as opposed to the standard test used in the literature. Future research may employ the method we offer in various contexts to disentangle these two attributes and determine the distinct contribution of each to the usefulness of accounting information. Moreover, the proposed reliability measure, RSCORE, can be used in future studies to estimate the reliability of various firm categories or its evolution over time. The comprehensiveness of this measure and its independence from the reported values make it an appealing measure of reliability. Nonetheless, this measure has some drawbacks, since some of its components may depend on enforcement policy (such as restatements), and some are noisy signals of poor reliability (such as auditor changes). An additional limitation is that our measure is designed to assess the reliability of mandatory financial reporting, not voluntary disclosures made by the firm. The prior literature suggests that such disclosures may also suffer from impaired reliability (e.g., [48]). Notwithstanding these limitations, we believe RSCORE can help regulators and researchers identify low reliability in financial reporting.