Sensitivity of Performance Indexes to Disaster Risk

Abstract: We examine how sensitive the new performance indexes incorporating high moments and disaster risk are to disaster risk. These indexes are the Aumann-Serrano performance index and the Foster-Hart performance index proposed by Kadan and Liu. They provide evaluations sensitive to the underlying risk. We show, by numerical and empirical examples, how sensitive these indexes are to disaster risk. Although these indexes are known in the literature to be either quite sensitive or excessively sensitive to disaster risk or maximum loss, we show by regression analysis of the indexes on summary statistics that they are in fact not excessively sensitive to maximum loss in representative stock data, which contain disastrous observations. The numerical estimate of the Foster-Hart performance index is found to be an effective estimate of the index. Our analysis suggests these indexes can handle various empirical data containing quite disastrous observations.


Introduction
Recently, new performance measures incorporating high moments and disaster risk were proposed by Kadan and Liu (2014). The new performance measures are reciprocals of the risk indexes proposed by Aumann and Serrano (hereafter AS) (2008) and Foster and Hart (hereafter FH) (2009) based on axiomatic approaches. Kadan and Liu (2014) demonstrated that the new performance indexes provide more risk-averse assessments of various assets than the de facto industry standard performance measure, i.e., the Sharpe ratio. These conservative assessments can differ substantially from evaluations by the Sharpe ratio, which incorporates only the first two moments. Many performance measures have been proposed in the literature. However, most of them were put forward in a rather ad hoc way and are not based on economic principles. See, e.g., Hubner (2009a, 2009b), Eling and Schuhmacher (2007), and Farinelli et al. (2008) for various performance measures proposed in the literature. The AS and FH performance measures, by contrast, are based on axiomatic approaches. However, there have been no further studies of the properties of the new performance measures since Kadan and Liu (2014) except Hodoshima (2019), who studies only the properties of the AS performance measure. The AS performance measure is known to be quite sensitive to the underlying risk of targets in question (cf., Kadan and Liu 2014; Miyahara 2014; Ban et al. 2016; Hodoshima 2019). The FH performance measure is also known to be extremely sensitive to rare disasters or maximal loss (cf., Foster and Hart 2009; Kadan and Liu 2014; Anand et al. 2016, 2017; Riedel and Hellmann 2015). We address the issue of sensitivity of the AS and FH performance measures to disaster risk in this study.
It is not clear in the existing literature how sensitive these new performance measures are to disaster risk. The AS and FH performance measures correspond one-to-one to the AS and FH risk measures. Therefore, although we focus on the AS and FH performance measures in this study, our findings apply equally to the AS and FH risk measures in the opposite sense: good (bad) performance is equivalent to being less (more) risky.
The performance measures we study in the numerical examples are the AS and FH performance measures, Sharpe ratio, Sortino ratio, and Calmar ratio. By comparing these performance measures numerically, we can see more clearly the merits and demerits of the AS and FH performance measures relative to the more traditional performance measures. We also study empirical examples to see how sensitive the new performance measures are to disaster risk or maximal loss. As empirical examples we use the DOW 30 stocks, which make up the world-leading Dow Jones Industrial Average Index (DOW). We regard the DOW 30 stocks as representative stocks. The daily and monthly data of the DOW 30 stocks contain quite disastrous days or months, which look more disastrous than our numerical examples where the FH performance measure attains its lower limit of zero. However, we show both the AS and FH performance measures can handle these data and produce sensible scores in our empirical examples. We show the new performance measures, particularly the FH performance measure, are, contrary to the predominant view in the literature, able to produce robust scores for real data containing quite disastrous observations. The numerical estimate of the FH performance measure by grid search is found to be an effective estimate of the FH performance measure. We also show, by regressing the new performance measures on summary statistics including percentiles, how sensitive the new performance measures are to disaster risk or maximal loss in typical stock data.
In the following, we review the existing literature with respect to sensitivity of the AS and FH performance measures to disaster risk. In their seminal paper, Aumann and Serrano (2008) showed the AS risk index satisfies first- and second-order stochastic dominance when the underlying gamble is restricted to take finite values and noted the AS risk index is sensitive to losses. Miyahara (2010, 2014) showed by numerical examples that the IRRA, equivalent to the AS performance index, is sensitive to losses. Kadan and Liu (2014) described how the reciprocal of the AS risk index can be used as a performance index and demonstrated through several empirical examples that it can provide more risk-sensitive evaluation than the Sharpe ratio. Ban et al. (2016) showed the sensitivity of the IRRA to disaster risk by numerical examples. Hodoshima (2019) examined the properties of the IRRA using examples of normal mixture distributions.
On the other hand, Foster and Hart (2009) proved the FH risk index also satisfies first- and second-order stochastic dominance when the underlying gamble is restricted to take finite values and remarked it is quite sensitive to the maximal loss as an implication of their proposition, i.e., Proposition 3 of Foster and Hart (2009). Foster and Hart (2009) showed there exists a unique positive solution R(g) of the implicit equation E[ln(1 + g/R(g))] = 0 when a gamble g takes finite values with E[g] > 0 and P(g < 0) > 0, where R(g) denotes the FH risk measure. Kadan and Liu (2014) also remarked the FH performance measure is extremely sensitive to the maximal loss by considering a composite gamble with a small probability of a large loss, i.e., Proposition 8 of Kadan and Liu (2014). Anand et al. (2016) and Anand et al. (2017) use as the FH risk measure the extended FH risk given by Riedel and Hellmann (2015) for gambles with continuous distribution. Riedel and Hellmann (2015) extended the FH risk measure for a gamble with continuous distribution by means of the maximum loss L. They remarked the FH risk measure does not exist for many common distributions and that the sign of the expectation E[ln(1 + g/L)] determines whether the FH risk measure exists or not. In other words, the FH risk measure defined by Foster and Hart (2009) does (not) exist when the sign of E[ln(1 + g/L)] is negative (nonnegative), and the maximal loss L is used as the extended FH risk measure when the FH risk measure does not exist. Since Riedel and Hellmann (2015), the FH risk measure has been identified in some cases with the maximum loss. However, to the best of our knowledge no study has examined whether the maximum loss or the reciprocal of the maximum loss of a gamble is appropriate in real data.
In this study, we verify this thesis empirically using the numerical estimate of the FH performance measure proposed by Kadan and Liu (2014) and the reciprocal of the maximum loss, with the help of the sign of the expectation E[ln(1 + g/L)]. As a result, we show the numerical estimate of the FH performance measure by grid search is effective in finding the FH performance measure or the extended FH performance measure defined by the reciprocal of the maximal loss.
The rest of the article is organized as follows. In Section 2, we describe methods, i.e., the AS and FH performance measures as well as the extended FH performance measure. In Section 3, we provide results, i.e., numerical examples of a set of performance measures, including the AS and FH performance measures, and empirical examples. In Section 4, we provide discussion of results. In Section 5, we present concluding comments.

Methods (the AS and FH Performance Measures and FH Extended Performance Measure)
In this section, we describe the AS and FH performance measures and the FH extended performance measure. We follow Kadan and Liu (2014) to describe the AS and FH performance measures, i.e., Propositions 1 and 2 of Kadan and Liu (2014).
First we describe the AS performance measure. We begin with the following two definitions.

Definition 1.
A gamble g is wealth-uniformly rejected by an investor with utility function u, if u rejects g at all initial wealth levels.

Definition 2.
A gamble g wealth-uniformly dominates gamble g′ if whenever g is wealth-uniformly rejected by a utility function u, g′ is also wealth-uniformly rejected by u.
Based on the above definitions, the following result can be obtained.
Proposition 1 (Aumann and Serrano 2008; Hart 2011). Wealth-uniform dominance induces a complete order on the set G of gambles that extends second-order stochastic dominance. This order can be represented by a performance index $P^{AS}(g)$ assigned to any gamble g ∈ G, which is given by the unique positive solution to the implicit equation
$$E\left[\exp\left(-P^{AS}(g)\, g\right)\right] = 1.$$
That is, for any two gambles g and g′, g wealth-uniformly dominates g′ if and only if $P^{AS}(g) \geq P^{AS}(g')$. The set G denotes the set of gambles.
Next, we describe the FH performance measure. We assume the class of utility functions $U^{*}$ satisfies three conditions: (i) decreasing absolute risk aversion, (ii) increasing relative risk aversion, and (iii) $\lim_{w \downarrow 0} u(w) = -\infty$.

Definition 3.
A gamble g is utility-uniformly rejected at an initial wealth level $w_0$ if all utility functions $u \in U^{*}$ reject g at $w_0$.

Definition 4.
A gamble g utility-uniformly dominates gamble g′ if whenever g is utility-uniformly rejected at an initial wealth level $w_0$, g′ is also utility-uniformly rejected at $w_0$.
Then we have the following result.
Proposition 2 (Foster and Hart 2009; Hart 2011). Utility-uniform dominance induces a complete order on G that extends second-order stochastic dominance. This order can be represented by a performance index $P^{FH}(g)$ assigned to any gamble g ∈ G, which is given by the unique positive solution to the implicit equation
$$E\left[\ln\left(1 + P^{FH}(g)\, g\right)\right] = 0.$$
That is, for any two gambles g and g′, g utility-uniformly dominates g′ if and only if $P^{FH}(g) \geq P^{FH}(g')$.
Therefore, the AS performance index of a gamble g is defined to be the positive solution α of the implicit equation
$$E[\exp(-\alpha g)] = 1.$$
On the other hand, the FH performance index of a gamble g is defined to be the positive solution γ of the implicit equation
$$E[\ln(1 + \gamma g)] = 0.$$
The AS and FH risk indexes are the reciprocals of the AS and FH performance indexes, i.e., the AS risk index proposed by Aumann and Serrano (2008) is given by 1/α and the FH risk index proposed by Foster and Hart (2009) is given by 1/γ. Hodoshima and Miyahara (2020) extended the AS performance index to include the negative solution of the implicit equation of the AS performance index and provided a sufficient condition for the existence of the unique negative solution of the implicit equation. Riedel and Hellmann (2015) gave an interesting theorem regarding a gamble with continuous distribution by introducing finite gambles that approximate the continuous gamble.
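As a small worked illustration of the two implicit equations (a toy gamble of our own choosing, not one of the examples in this article), suppose g takes the values +2 and −1 with probability 1/2 each, so that E[g] = 1/2 > 0 and the maximal loss is 1. Both equations can then be solved in closed form:

```latex
\begin{align*}
\text{FH:}\quad & \tfrac12\ln(1+2\gamma) + \tfrac12\ln(1-\gamma) = 0
  \;\Longleftrightarrow\; (1+2\gamma)(1-\gamma) = 1
  \;\Longleftrightarrow\; \gamma\,(1-2\gamma) = 0
  \;\Longrightarrow\; \gamma = \tfrac12, \\
\text{AS:}\quad & \tfrac12 e^{-2\alpha} + \tfrac12 e^{\alpha} = 1
  \;\Longleftrightarrow\; x^{3} - 2x^{2} + 1 = 0 \quad (x = e^{\alpha})
  \;\Longleftrightarrow\; (x-1)(x^{2}-x-1) = 0
  \;\Longrightarrow\; \alpha = \ln\frac{1+\sqrt{5}}{2} \approx 0.481 .
\end{align*}
```

In both cases the root corresponding to a zero index (γ = 0, or x = 1, i.e., α = 0) is discarded because the index is defined as the positive solution. The implied risk indexes are 1/γ = 2 and 1/α ≈ 2.08, both above the maximal loss of 1.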
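Proposition 3 (Riedel and Hellmann 2015). Let g be a gamble with maximal loss L > 0. Let $g_n$ be a sequence of finite-valued gambles with $g_n \uparrow g$ a.s., where each $g_n$ has the same maximal loss L. Denote by $\rho_n \equiv \rho(g_n) > L$ their FH risk measure. Then the following statements hold true:
1. The sequence $\rho_n$ is decreasing. Put $\rho_\infty = \lim \rho_n \geq L$ for its limit.
2. If E[ln(1 + g/L)] < 0, then $\rho_\infty > L$ and $\rho_\infty$ is the unique positive solution of E[ln(1 + g/ρ)] = 0.
3. If E[ln(1 + g/L)] ≥ 0, then $\rho_\infty = L$.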
In other words, Riedel and Hellmann (2015) defined the maximal loss L as the extended FH risk measure when E[ln(1 + g/L)] is nonnegative, i.e., when the FH risk measure does not exist.
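To see how the sign of E[ln(1 + g/L)] separates the two cases, consider a distribution for which the expectation has a closed form. The distribution is an illustrative assumption of ours, not one analyzed in this article: g is uniformly distributed on [−L, b] with b > L, so that E[g] > 0.

```latex
\begin{align*}
1 + \frac{g}{L} &\sim \mathrm{Uniform}(0, c), \qquad c = 1 + \frac{b}{L}, \\
E\!\left[\ln\!\left(1 + \frac{g}{L}\right)\right]
  &= \frac{1}{c}\int_{0}^{c} \ln u \, du
   = \frac{1}{c}\,\bigl(c \ln c - c\bigr)
   = \ln c - 1 .
\end{align*}
```

Under this assumption the expectation is negative exactly when c < e, i.e., when b < (e − 1)L ≈ 1.718 L, and the FH risk measure then exists as the solution of the implicit equation. For b ≥ (e − 1)L the expectation is nonnegative and the maximal loss L serves as the extended FH risk measure.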

Results
In this section, we present numerical examples of a set of performance measures including the AS and FH performance measures and empirical examples of the DOW 30 stocks.

Numerical Examples of the Performance Measures
We first present numerical examples of a set of performance measures using random variables of cash flows. Although Kadan and Liu (2014) presented empirical examples of the AS and FH performance measures in various financial assets, the characteristics of the two performance measures were not fully explained there. We intend to show the characteristics of the two performance measures in comparison to the traditional performance measures of the Sharpe ratio, Sortino ratio, and Calmar ratio. The Sortino ratio and Calmar ratio are derived from the Sharpe ratio by replacing the standard deviation in the Sharpe ratio by other risk measures. The Sortino ratio replaces the standard deviation by the downside deviation in order to take into account only downside risk, rather than both the downside and upside risk captured by the standard deviation. The Calmar ratio replaces the standard deviation by the maximum drawdown, where a drawdown is a peak-to-trough decline. Comparing the two performance measures with these traditional performance measures, we can see how the new performance measures function relative to the old ones.
We consider the following two sets of random variables in Examples 1 and 2, given in Tables 1 and 2. We consider four cases of random variables in Example 1, where the values each random variable takes are given with the corresponding probabilities and pr stands for probability. Mean, s.d., third, fourth, skew, kurt, downrisk, and maxdrawd are respectively mean, standard deviation, the third central moment, the fourth central moment, skewness, kurtosis, downside deviation, and maximum drawdown for each random variable. Similar numerical examples were examined in Hodoshima (2020a), where the traditional performance measures were compared with the AS performance measure.
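The three traditional ratios described above can be sketched for a series of returns as follows. This is a minimal illustration under stated assumptions, not the computation behind Tables 1-3: the function names are hypothetical, the target return for the downside deviation is assumed to be zero, the risk-free rate defaults to zero, and the drawdown is measured on the cumulative sum of (log-)returns.

```python
import math


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation."""
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / len(excess)
    var = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    return mean / math.sqrt(var)


def sortino_ratio(returns, target=0.0):
    """Mean excess return divided by the downside deviation below `target`."""
    mean_excess = sum(r - target for r in returns) / len(returns)
    downside_sq = sum(min(r - target, 0.0) ** 2 for r in returns) / len(returns)
    return mean_excess / math.sqrt(downside_sq)


def max_drawdown(returns):
    """Largest peak-to-trough decline of the cumulative return path."""
    cumulative = peak = worst = 0.0
    for r in returns:
        cumulative += r
        peak = max(peak, cumulative)
        worst = max(worst, peak - cumulative)
    return worst


def calmar_ratio(returns):
    """Mean return divided by the maximum drawdown."""
    return (sum(returns) / len(returns)) / max_drawdown(returns)
```

Only the denominator changes across the three functions, which is the sense in which the Sortino and Calmar ratios are derived from the Sharpe ratio.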
Sharpe, Sortino, Calmar, AS, and FH denote respectively the Sharpe ratio, Sortino ratio, Calmar ratio, AS performance measure, and FH performance measure.
We estimate the two performance measures by the generalized method of moments (GMM) estimator, as described in Kadan and Liu (2014). In particular, we find the two performance measures via grid search for the solutions of the sample analogs of the implicit equations for the two performance measures. The GMM estimator is consistent and asymptotically normally distributed. The implicit equations of the two performance measures are given as follows. The AS performance measure of a gamble g is given by α, the unique positive solution of the implicit equation
$$E[\exp(-\alpha g)] = 1.$$
On the other hand, the FH performance index of a gamble g is defined by γ, the unique positive solution of the implicit equation
$$E[\ln(1 + \gamma g)] = 0.$$
In Example 1, the case with a higher number dominates the case with a lower number since the former is larger than the latter with probability one. Hence, appropriate performance measures ought to take higher values in the case with a higher number than in the case with a lower number, which is the property called monotonicity. The de facto industry standard performance measure, the Sharpe ratio, fails to provide a larger value in the case with a higher number than in the case with a lower number. Therefore, the Sharpe ratio does not satisfy monotonicity, one of the most fundamental criteria for performance measures. On the other hand, the Sortino ratio and Calmar ratio both satisfy this criterion. However, the Calmar ratio increases sharply in the case with a higher number, where a large value replaces a small value of the case with a lower number with a small probability 0.009. Therefore, the Calmar ratio is too sensitive to an increase of a value that the random variable takes with a small probability 0.009. The Sortino ratio increases gradually in the case with a higher number, so that it satisfies monotonicity in Example 1. The AS performance measure also provides a larger value in the case with a higher number than in the case with a lower number.
We remark the AS performance measures in cases 2-4 are the same up to the third decimal place in Example 1, but the AS performance measure in the case with a higher number is in fact larger, at the fourth decimal place or beyond, than in the case with a lower number. The increase of the AS performance measure in the case with a higher number is very small, and hence the AS performance measure is not sensitive to gains of the underlying random variable. The FH performance measure increases more clearly than the AS performance measure in the case with a higher number. This implies the FH performance measure is more sensitive to gains than the AS performance measure, which has not been mentioned in the literature. The magnitude of the FH performance measure is similar to that of the AS performance measure and Sortino ratio.
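A grid search over the sample analogs of the two implicit equations can be sketched as follows. This is an illustrative sketch only: the function names, the grid step, the upper bound for α, and the number of grid points are assumptions of ours, since the grid actually used in this article is not specified here. Each function returns the first grid point at which the sample moment changes sign, and returns None when the sample has no loss, has a nonpositive mean, or shows no sign change on the grid.

```python
import math


def _has_gain_and_loss(sample):
    """A positive solution requires a positive mean and at least one loss."""
    return sum(sample) / len(sample) > 0.0 and min(sample) < 0.0


def as_index_grid(sample, step=1e-4, upper=10.0):
    """Grid estimate of alpha > 0 solving mean(exp(-alpha * g)) = 1."""
    if not _has_gain_and_loss(sample):
        return None

    def moment(alpha):
        return sum(math.exp(-alpha * g) for g in sample) / len(sample) - 1.0

    # The moment is negative just above zero and crosses to positive at the root.
    for k in range(1, int(upper / step) + 1):
        alpha = k * step
        if moment(alpha) >= 0.0:
            return alpha
    return None  # no sign change found below `upper`


def fh_index_grid(sample, points=100_000):
    """Grid estimate of gamma in (0, 1/L) solving mean(log(1 + gamma * g)) = 0."""
    if not _has_gain_and_loss(sample):
        return None
    upper = 1.0 / -min(sample)  # gamma must stay below 1 / (maximal loss)

    def moment(gamma):
        return sum(math.log1p(gamma * g) for g in sample) / len(sample)

    # The moment is positive just above zero and crosses to negative at the root.
    for k in range(1, points):
        gamma = upper * k / points
        if moment(gamma) <= 0.0:
            return gamma
    return None  # no sign change found on the grid
```

Searching for the sign change, rather than for the grid point that minimizes the absolute moment, avoids the trivial solution at zero, where both sample moments also vanish.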
Example 2 provides four cases of random variables of uncertain cash flows where values each random variable takes are given with corresponding probabilities where pr stands for probability. Mean, s.d., third, and fourth, skew, kurt, downrisk, and maxdrawd are respectively mean, standard deviation, the third central moment, the fourth central moment, skewness, kurtosis, downside risk, and maximum drawdown for each random variable.
In the table, a, b, c, and d denote respectively the probability given by 0.4 × 0.999, 0.6 × 0.999, 0.591 × 0.999, and 0.009 × 0.999. Sharpe, Sortino, Calmar, AS, and FH denote respectively the Sharpe ratio, Sortino ratio, Calmar ratio, AS performance measure, and FH performance measure. In the table, $0^{+}$ denotes the limit from the right, i.e., the limit of sequences of positive numbers converging to zero.
In Example 2, the random variable in each case has a disaster −15 with a small probability 0.001 and other cash flows with the remaining probabilities proportional to the probabilities in each case in Example 1. The disaster risk −15 with a small probability 0.001 does not much affect the Sharpe ratio and Sortino ratio but does significantly affect the Calmar ratio. Moreover, the Calmar ratio is again seen to be too sensitive to an increase of a value that the random variable takes with a small probability 0.009 × 0.999 in the case with a higher number. Hence it is not a reliable performance measure. The AS performance measure in Example 2 falls to less than half of its value in Example 1. We can say the disaster risk has a large effect on the AS performance measure. Thus, the AS performance measure is sensitive to losses or the maximal loss of the underlying random variable. The disaster risk has an even more significant negative effect on the FH performance measure. It becomes virtually zero, which is a lower bound of the FH performance measure, in every case in Example 2. Adding larger positive values in the case with a higher number does not affect the FH performance measure in Example 2. Therefore, the FH performance measure is virtually determined by disaster risk or maximal loss. Hence, it is the measure most sensitive to disaster risk, which conforms to the previous studies of the FH performance measure (cf., Foster and Hart 2009; Kadan and Liu 2014; Anand et al. 2016, 2017; Riedel and Hellmann 2015). Hence, the newly introduced AS and FH performance measures are both quite sensitive to losses of the underlying random variable.
We introduce another example, Example 3 given in Table 3, to show the problematic nature of the traditional performance measures of the Sharpe ratio, Sortino ratio, and Calmar ratio. Example 3 has a loss of −5, which is sizable though not as large as the disaster risk −15 in Example 2, with a small probability 0.001. It has four cases where a case with a higher number is larger than a case with a smaller number with probability 1. The loss −5 produces larger skewness and kurtosis in absolute value in Example 3 than in Example 2. We can see absurd values of the Sharpe ratio in Example 3, which again fails to satisfy monotonicity. In Example 3, the Sortino ratio also fails to satisfy monotonicity. Therefore, we cannot always trust the Sortino ratio because it can give irrational values, depending on the underlying random variable. We can also observe the Calmar ratio is again too sensitive to an increase of a value that the random variable takes with small probability 0.009. Hence, we cannot trust the Calmar ratio as an appropriate performance measure. The FH performance index is again $0^{+}$, the limit of sequences of positive numbers converging to zero, in every case of Example 3. Adding larger positive values does not change the FH performance measure in Example 3. Hence, the FH performance measure is again virtually determined by the disaster risk of −5 in Example 3. One may say this indicates that the FH performance measure is excessively sensitive to disaster risk. On the other hand, the AS performance measure provides larger scores in Example 3 than in Examples 1 and 2. This also conforms to the behavior of the mean in the three examples, i.e., the mean in Example 3 is larger than in Examples 1 and 2. Therefore, the AS performance measure is sensitive to losses of the underlying random variable but, unlike the FH performance measure, not excessively sensitive to disaster risk. We can see the AS performance measure is again insensitive to gains in Example 3.
One may consider the AS performance measure, sensitive to losses but not excessively sensitive to disaster risk, is more appropriate than the FH performance measure since it can provide assessments more often.
Example 3 provides four cases of random variables of uncertain cash flows where values each random variable takes are given with corresponding probabilities where pr stands for probability. Mean, s.d., third, fourth, skew, kurt, downrisk, and maxdrawd are respectively mean, standard deviation, the third central moment, the fourth central moment, skewness, kurtosis, downside risk, and maximum drawdown for each random variable. Sharpe, Sortino, Calmar, AS, and FH denote respectively the Sharpe ratio, Sortino ratio, Calmar ratio, AS performance measure, and FH performance measure. In the table, $0^{+}$ denotes the limit from the right, i.e., the limit of sequences of positive numbers converging to zero.
Although our numerical examples are limited, we can summarize our numerical comparisons as follows. Overall, the traditional performance measures of the Sharpe ratio, Sortino ratio, and Calmar ratio are not reliable since they either fail to satisfy monotonicity or sometimes give irrational evaluations, depending on the underlying target in question. On the other hand, the AS and FH performance measures are reliable, i.e., satisfy monotonicity, when they are well defined. However, the FH performance measure is excessively sensitive to the maximal loss, which makes it incapable of providing appropriate assessments in such cases. The AS performance measure can provide assessments more often than the FH performance measure. The AS performance measure is quite sensitive to losses but, unlike the FH performance measure, not excessively sensitive to the maximal loss. The AS performance measure is less sensitive to gains than the FH performance measure when both performance measures can provide assessments.
In the next subsection, we provide empirical examples to show how the AS and FH performance measures function when we can compute the two performance measures. In particular, we show by empirical examples how the FH performance measure performs when we can obtain its assessment, which, among the three numerical examples in this section, we could observe only in Example 1.

Empirical Results
In this subsection, we present empirical results of evaluations by the two new performance measures and the Sharpe ratio for the DOW 30 components as of 2 April 2019. The DOW 30 components are listed in Table 4. Our sample period for daily (monthly) return data is from 4 January 2005 (February 2005) to 30 December 2019 (December 2019). As stock returns, we use log-returns in this paper. The same data were studied by Hodoshima and Yamawake (2020), where only winners and losers of the DOW 30 components were described, without percentiles including the maximal loss. The current study, by contrast, focuses on the issue of sensitivity of the new performance measures to disaster risk, and describes the performance of the DOW 30 stocks only briefly. We present summary statistics of daily return data in Table 5 and percentiles, including the maximum and minimum, in Table 6. In the tables, s.d., max, min, 80%, 60%, 40%, and 20% denote respectively standard deviation, maximum, minimum, 80% percentile, 60% percentile, 40% percentile, and 20% percentile. Mean in the last row in Tables 5 and 6 denotes the mean of summary statistics and percentiles over 29 stocks. Mean ranges from 0.017 in Walgreens to 0.114 in Apple. Standard deviation ranges from 1.017 in Johnson & Johnson to 2.350 in JP Morgan. Skewness shows 11 stocks are negatively skewed and 18 stocks are positively skewed. Kurtosis shows all the data have heavy tails compared to the normal distribution. The maximum ranges from 8.975 in McDonald's to 29.829 in UnitedHealth. The minimum ranges from −8.226 in Procter & Gamble to −23.228 in JP Morgan. These minimum values are for daily returns and hence represent quite disastrous losses. Four other percentiles are given in Table 6. They are listed from larger values (80% percentile) to smaller values (20% percentile).
Table 7 presents the three performance measures, the AS and FH performance measures and the Sharpe ratio, for daily return data. We do not provide the Sortino ratio and Calmar ratio in this section since our focus is on the two new performance measures, with the Sharpe ratio, the de facto industry standard performance measure, as the benchmark for comparison. Mean in the last row in Table 7 denotes the mean of performance measures over 29 stocks. We can obtain the AS performance measure for every stock of the DOW 30 stocks. In other words, we can obtain the AS performance measure without much difficulty in representative real stock data. The AS performance measure ranges from 0.010 in Goldman Sachs to 0.086 in McDonald's. Therefore, the AS performance measure scores are much smaller than those in the numerical examples in the previous subsection, although the maximum loss in some stocks is larger than in the numerical examples. The difference between the AS performance measure and Sharpe ratio is large in outperforming stocks but small in underperforming stocks. The FH performance measure is generally similar to the AS performance measure. Example 1 in the previous subsection shows the FH performance measure is similar to the AS performance measure when there is no disaster risk, which can be the reason why the two are similar here, although the minimum values in Table 6 show the existence of severe negative returns in many stocks. The existence of these severe negative returns seems to have downward effects on the FH performance measure as well as the AS performance measure in daily data. The two performance measures are substantially smaller than those in Example 1 and than the AS performance measure in Examples 2 and 3 in the previous subsection.
Table 8 presents summary statistics of monthly return data for the DOW 30 stocks.
Mean in the last row in Table 8 denotes the mean of summary statistics over 29 stocks. Mean ranges from 0.335 in Walgreens to 2.297 in Apple; as in daily data, the minimum is in Walgreens and the maximum is in Apple. In the table, mean* denotes the mean derived from the formula where the mean in monthly data should be close to 30 ÷ 7 × 5 times the mean in daily data if daily returns follow identical distributions, and s.d.* denotes the standard deviation derived from the formula where the standard deviation in monthly data should be close to √(30 ÷ 7 × 5) times the standard deviation in daily data if daily returns follow independent and identical distributions. Standard deviation ranges from 3.982 in Johnson & Johnson to 9.314 in Apple. Standard deviation in Johnson & Johnson is also the minimum in daily data. Skewness is negative in monthly data for all stocks except American Express. Negative skewness of the distribution indicates that we may expect frequent small gains and a few large losses. There are only four companies, 3M, American Express, Cisco Systems, and Walgreens, where skewness is larger in monthly data than in daily data. Skewness is more negative in monthly data than in daily data for the remaining companies. Kurtosis shows that, except for American Express, the DOW components have tails closer to the normal distribution in monthly data than in daily data. Table 9 presents percentiles of monthly return data for the DOW 30 stocks. Mean in the last row in Table 9 denotes the mean of percentiles over 29 stocks. The maximum ranges from 10.364 in Johnson & Johnson to 62.866 in American Express. The minimum ranges from −43.479 in Caterpillar to −10.881 in McDonald's. The range of returns is wider in monthly data than in daily data. Consequently, most summary statistics are larger in absolute value in monthly data than in daily data, except kurtosis and the 40% percentile. This applies to the maximum and minimum.
Hence, the minimum shows more severe disastrous observations in monthly data than in daily data. Since the AS and FH performance measures are sensitive to losses but insensitive to gains, as we saw in the numerical examples in the previous subsection, negative observations have disproportionately larger adverse effects on the two performance measures than positive observations of equal absolute value.
In Table 8, s.d., skew, and kurt denote respectively standard deviation, skewness, and kurtosis.
Table 10 presents the three performance measures of monthly return data for the DOW 30 stocks. The Sharpe ratio is high in monthly data in some stocks such as Visa, McDonald's, Nike, and Apple. The Sharpe ratio in monthly data is much higher than that in daily data. This is natural since the monthly Sharpe ratio is close to √(30 ÷ 7 × 5) times the daily Sharpe ratio if daily returns are independently and identically distributed (cf. Lo 2002), where 30 denotes the average number of days in a month, 7 denotes the number of days in a week, and 5 denotes the number of weekdays in a week. On the other hand, the AS and FH performance measures in monthly data are much closer to those in daily data. They are nearly closed under temporal aggregation in some stocks, i.e., they have time-invariant values regardless of data frequency (cf. Hodoshima 2020b). The AS performance measure ranges from 0.012 in Walgreens to 0.132 in McDonald's.
McDonald's has the highest AS performance measure in monthly data, as in daily data. McDonald's is by far the best stock by the AS performance measure but only the second-best, after Visa, by the Sharpe ratio. The difference between the AS performance measure and the Sharpe ratio is large for outperforming stocks but small for underperforming stocks.
The FH performance measure is small compared to the AS performance measure. The companies with high AS performance measure scores in monthly data referred to above all have an FH performance measure considerably smaller than the corresponding AS performance measure. This is in contrast to the result in daily data. Since, as we saw in the previous subsection, the FH performance measure is much more sensitive to losses of the underlying stock, we consider that the lower FH performance measure score in monthly data is due to the larger losses in monthly stock returns. Therefore, our result indicates the FH performance measure is more sensitive than the AS performance measure to losses, i.e., to negative skewness or the left tail of the underlying distribution.
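As a rough illustration of the temporal-aggregation argument for the mean, standard deviation, and Sharpe ratio, the following Python sketch computes the scaling factors $(30/7) \times 5$ and $\sqrt{(30/7) \times 5}$ and applies them to daily statistics. The daily mean and standard deviation below are made-up values, not figures from the tables, the function name is our own, and the mean is treated as an excess return so that the Sharpe ratio is simply mean divided by standard deviation.

```python
import math

# Approximate number of trading days per month: (30 / 7) weeks times 5 weekdays.
TRADING_DAYS_PER_MONTH = 30 / 7 * 5


def implied_monthly_stats(daily_mean, daily_sd, q=TRADING_DAYS_PER_MONTH):
    """Return (mean*, s.d.*, Sharpe*) implied by i.i.d. daily returns.

    Under i.i.d. daily returns the mean scales with q and the standard
    deviation with sqrt(q), so the Sharpe ratio scales with sqrt(q).
    """
    mean_star = q * daily_mean
    sd_star = math.sqrt(q) * daily_sd
    sharpe_star = mean_star / sd_star
    return mean_star, sd_star, sharpe_star


# Hypothetical daily figures in percent (illustrative only).
daily_mean, daily_sd = 0.05, 1.5
mean_star, sd_star, sharpe_star = implied_monthly_stats(daily_mean, daily_sd)
daily_sharpe = daily_mean / daily_sd

print(f"q = {TRADING_DAYS_PER_MONTH:.3f}")
print(f"sqrt(q) = {math.sqrt(TRADING_DAYS_PER_MONTH):.3f}")
print(f"monthly / daily Sharpe ratio = {sharpe_star / daily_sharpe:.3f}")
```

The ratio of the implied monthly Sharpe ratio to the daily one equals $\sqrt{(30/7) \times 5}$, roughly 4.6, whatever daily values are supplied. This is why a Sharpe ratio computed at different data frequencies is not directly comparable without rescaling.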

Discussion
In this section, to see how the AS and FH performance measures are related to summary statistics of the underlying return data, we run regressions with each of the two performance measures as the dependent variable and summary statistics as explanatory variables, in both daily and monthly data. We add six percentiles (the maximum, the minimum, and the 80%, 60%, 40%, and 20% percentiles) to the summary statistics in the regressions to see how these percentiles are related to the two performance measures. This allows us to examine how sensitive the two performance measures are to losses or to the maximal loss of the underlying return data. In particular, in this section we focus on the FH performance measure more than the AS performance measure, since the former is known to be extremely sensitive to disaster risk or maximal loss compared to the latter.
We begin by comparing the estimate of the FH performance measure with the reciprocal of the maximal loss, since the maximal loss is identified as the extended FH risk measure when no FH risk measure exists (cf. Riedel and Hellmann 2015). Table 11 shows, for daily data, the FH performance measure estimate, the reciprocal of the maximal loss $L$, and the sample estimate of the expectation $E[\ln(1 + g/L)]$, which serves as a discriminant for whether the reciprocal of the maximal loss should be used as the extended FH performance measure. Proposition 3 in Section 2 implies the FH risk (performance) measure exists when $E[\ln(1 + g/L)]$ is negative and does not exist when it is nonnegative. If the discriminant $E[\ln(1 + g/L)]$ is nonnegative, then (the reciprocal of) the maximal loss should be used as the extended FH risk (performance) measure. The estimate of the discriminant is positive only for Apple and negative otherwise. Therefore, the GMM estimate of the FH performance measure should be used as the FH performance measure except for Apple. For every stock in Table 11, the GMM estimate of the FH performance measure is smaller than the estimate of the extended FH performance measure, i.e., the reciprocal of the maximal loss. The difference between the two estimates is in general quite large, so that one should use the GMM estimate as the FH performance measure. Even in Apple's case, where one should use the reciprocal of the maximal loss, the difference between the two estimates is very small, so that using the GMM estimate as the FH performance measure estimate is not problematic.
Table 12 shows, for monthly data, the FH performance measure estimate, the reciprocal of the maximal loss, and the sample estimate of the expectation $E[\ln(1 + g/L)]$. The discriminant estimate takes positive values for 12 stocks, for which one should use the reciprocal of the maximal loss as the extended FH performance measure.
However, the difference between the GMM estimate of the FH performance measure and the reciprocal of the maximal loss is generally small in these cases but large when the discriminant estimate is negative. Therefore, erroneously using the reciprocal of the maximal loss as the extended FH performance measure leads to poor inference on the FH performance measure when the discriminant estimate is negative, whereas incorrectly using the GMM estimate of the FH performance measure causes little trouble when the discriminant estimate is positive. This observation is new to the literature. We emphasize the importance of finding the GMM estimate by grid search in order to obtain a good estimate of the FH performance (risk) measure.
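The comparison between the FH estimate and the reciprocal of the maximal loss can be sketched in a few lines of Python. The sketch below is not the GMM grid-search procedure used in this paper: it solves the sample analogue of the standard Foster-Hart defining equation, $E[\ln(1 + g/R)] = 0$, by simple bisection, and it computes the discriminant with the maximal-loss observation excluded, as described for Tables 11 and 12. All function names and the two-point sample are hypothetical, and the fallback to $L$ when no sign change is found numerically is our own assumption.

```python
import math


def max_loss(returns):
    """Maximal loss L = -min(returns); the sample must contain a loss."""
    L = -min(returns)
    if L <= 0:
        raise ValueError("sample contains no losses")
    return L


def discriminant(returns):
    """Sample mean of ln(1 + g/L), excluding observations equal to -L."""
    L = max_loss(returns)
    kept = [g for g in returns if g > -L]
    if not kept:
        raise ValueError("all observations equal the maximal loss")
    return sum(math.log1p(g / L) for g in kept) / len(kept)


def fh_riskiness(returns, iters=200):
    """Bisection root R > L of mean(ln(1 + g/R)) = 0 (not the paper's GMM)."""
    if sum(returns) <= 0:
        raise ValueError("positive sample mean required")
    L = max_loss(returns)
    n = len(returns)

    def f(R):
        return sum(math.log1p(g / R) for g in returns) / n

    lo = L * (1 + 1e-12)
    if f(lo) >= 0:
        # Assumption of this sketch: the root is numerically equal to L.
        return L
    hi = 2 * L
    while f(hi) < 0:  # f is negative below the root and positive above it
        hi *= 2
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical two-point sample: gain 120 or lose 100 (illustrative only).
sample = [120.0, -100.0]
R = fh_riskiness(sample)
print("FH riskiness:", R)
print("FH performance (1/R):", 1 / R)
print("extended FH performance (1/L):", 1 / max_loss(sample))
```

For this two-point sample the riskiness is approximately 600, so the FH performance measure (about 1/600) is far below the reciprocal of the maximal loss (1/100), which mirrors the large gap reported when the FH risk measure exists.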
We now turn to regression analysis to find how sensitive the new performance measures are to disaster risk or maximal loss. Table 13 shows the result of regressing the AS performance measure on summary statistics in daily return data. We test whether each regression coefficient is zero by a t-test. The explanatory variables for percentiles are named risk0, risk1, etc. in the tables of regression results given below. Significance and insignificance in the regressions of the two performance measures on summary statistics in this section are based on p-values under the classical assumptions of the standard regression model, i.e., independently and identically normally distributed error terms with homoskedastic variance, so that t-values are assumed to follow the t distribution with 18 degrees of freedom. We rely on these assumptions because we are concerned with finite-sample inference in a regression with only 29 observations and 10 explanatory variables, i.e., with few observations relative to the number of explanatory variables. The goodness-of-fit statistics, $R^2 = 0.907$ and $\bar{R}^2 = 0.855$, indicate the regression fits the data fairly well. Mean is significant at the one percent level and has the largest estimate, 0.503. Standard deviation is significant at the five percent level, with the second-largest estimate in absolute value, −0.043. Skewness has a positive estimate of 0.023, significant at the ten percent level, but kurtosis is not significant. The six percentiles all have small t-values in absolute value, and hence they are not significant. Therefore, only mean, standard deviation, and skewness are significant.
Table 11 shows the FH performance measure estimate, the reciprocal of the maximal loss, and the estimate of the discriminant $E[\ln(1 + g/L)]$ for daily return data of the DOW 30 stocks. The estimate of the discriminant is obtained from the sample excluding the observation at which the maximal loss occurs, so that the estimate does not take the value $-\infty$.
Table 12 shows the FH performance measure estimate, the reciprocal of the maximal loss, and the estimate of the discriminant $E[\ln(1 + g/L)]$ for monthly return data of the DOW 30 stocks. As in Table 11, the observation at which the maximal loss occurs is excluded when estimating the discriminant.
Table 13 shows the regression result of the AS performance measure as the dependent variable and summary statistics as explanatory variables for daily return data of the DOW 30 stocks. In the table, s.d., skew, kurt, risk0, and risk5 denote standard deviation, skewness, kurtosis, maximum, and minimum, respectively, and risk1, risk2, risk3, and risk4 denote the 80%, 60%, 40%, and 20% percentiles, respectively. $R^2$ and $\bar{R}^2$ are the goodness-of-fit statistics R-square and adjusted R-square, respectively. In the table, *** denotes significance at the 1% level, ** at the 5% level, and * at the 10% level.
Table 14 shows the regression result of the FH performance measure and summary statistics in daily data. The goodness of fit in the regression, given by $R^2$ and $\bar{R}^2$, is fairly good, similar to the result for the AS performance measure.
The estimation result in Table 14 is similar to that in Table 13. Mean is highly significant with the largest estimate, 0.470. Standard deviation is significant at the ten percent level, with the second-largest t-value in absolute value. Skewness is also significant at the ten percent level and has a positive estimate, while kurtosis is not significant. None of the six percentiles is significant in Table 14, as in Table 13. Therefore, mean, standard deviation, and skewness are the most important factors influencing the FH performance measure in daily data. The percentiles, including the maximum and minimum, are not significant in daily data, despite the previous view that the FH performance measure is extremely sensitive to rare disasters and despite our numerical investigation in Section 3. Overall, the regression result for the FH performance measure in Table 14 is similar to that for the AS performance measure in Table 13 for daily data.
Table 14 shows the regression result of the FH performance measure as the dependent variable and summary statistics as explanatory variables for daily return data of the DOW 30 stocks. Variable names, goodness-of-fit statistics, and significance markers are as in Table 13.
Table 15 shows the regression result of the AS performance measure and summary statistics in monthly data. The goodness-of-fit statistics are a little smaller in monthly data than in daily data. Mean and standard deviation, the most influential factors in daily data, are no longer significant in monthly data. The only significant factor is skewness, which is significant at the ten percent level with a p-value of 0.055. Since the majority of the 29 companies have negative skewness in monthly data, as we saw above, the positive estimate of skewness implies that skewness has a negative effect on the AS performance measure. The skewness estimate, 0.045, is the largest estimate in Table 15. We consider that the negative effect of skewness in Table 15 indicates the AS performance measure is sensitive to losses or the underlying risk of stock returns (cf., e.g., Kadan and Liu 2014; Miyahara 2014; Ban et al. 2016; Hodoshima 2019). As for why mean and standard deviation are no longer significant in monthly data, we consider that daily data contain microstructure noise and can be summarized by traditional summary statistics such as mean and standard deviation, whereas monthly data, with the daily microstructure noise washed out, have distributions for which only skewness is significant when the percentile explanatory variables, including the maximum and minimum, are present.
Table 16 shows the regression result of the FH performance measure and summary statistics in monthly data. The goodness-of-fit statistics are also a little smaller in monthly data than in daily data.
Mean and standard deviation, the two most influential factors in daily data, are no longer significant in monthly data, as in the regression of the AS performance measure in Table 15. Monthly return data have distributions for which the traditional summary statistics of mean and standard deviation lose explanatory power in the regression when the percentile summary statistics are present. Instead, skewness and the 40% percentile are significant at the five percent level, with p-values of 0.023 and 0.030, respectively. The maximum is significant at the ten percent level, but its estimate, −0.002, is small, so it is not an important factor. Skewness has the largest estimate, 0.036, and hence it is the most influential factor. Since the majority of the 29 companies have negative skewness in monthly data, the positive estimate of skewness implies that skewness has a negative effect on the FH performance measure. The significance of the 40% percentile at the five percent level seems to indicate the FH performance measure is more sensitive to losses of stock returns than the AS performance measure in monthly data. However, the minimum has a small estimate and a low t-value. This implies the FH performance measure is not sensitive to the maximal loss or rare disaster, which has been used as the value of the extended FH riskiness or performance measure in some previous studies (cf. Kadan and Liu 2014; Anand et al. 2016, 2017; Riedel and Hellmann 2015). This is evidence that the FH performance measure is more sensitive than the AS performance measure to losses of the underlying financial target but is not excessively determined by the maximal loss, in contrast to the previous studies in the literature and to our own numerical investigation in Section 3.
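The sign argument for skewness can be made concrete with a small worked calculation. The coefficient 0.036 is the skewness estimate reported for the monthly FH regression; the skewness value of −0.5 is a hypothetical figure chosen only for illustration, and the notation $\hat{\beta}_{\text{skew}}$ is ours. Holding the other explanatory variables fixed, the fitted contribution of skewness relative to a stock with zero skewness is

```latex
\[
  \hat{\beta}_{\text{skew}} \times \text{skew}
  = 0.036 \times (-0.5)
  = -0.018 .
\]
```

A positive coefficient combined with a negative skewness therefore lowers the fitted FH performance measure, which is the sense in which skewness has a negative effect for most of the 29 stocks.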
Table 15 shows the regression result of the AS performance measure as the dependent variable and summary statistics as explanatory variables for monthly return data of the DOW 30 stocks. Variable names, goodness-of-fit statistics, and significance markers are as in Table 13.
Table 16 shows the regression result of the FH performance measure as the dependent variable and summary statistics as explanatory variables for monthly return data of the DOW 30 stocks. Variable names, goodness-of-fit statistics, and significance markers are as in Table 13.
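The significance markers used in Tables 13 to 16 follow from comparing each t-value with critical values of the t distribution with 18 degrees of freedom. The Python sketch below illustrates that convention. It assumes the regression includes an intercept, which is our reading of how 29 observations and 10 explanatory variables give 18 degrees of freedom; the critical values are the standard two-sided tabulated values, and the t-values in the usage lines are hypothetical, not taken from the tables.

```python
# Two-sided critical values of the t distribution with 18 degrees of freedom
# (standard tabulated values, rounded to three decimals).
T_CRIT_18 = {0.10: 1.734, 0.05: 2.101, 0.01: 2.878}


def degrees_of_freedom(n_obs, n_regressors, intercept=True):
    """Residual degrees of freedom of an OLS regression."""
    return n_obs - n_regressors - (1 if intercept else 0)


def stars(t_value, crit=T_CRIT_18):
    """Map a t-value to the markers ***, **, * (1%, 5%, 10% levels)."""
    a = abs(t_value)
    if a >= crit[0.01]:
        return "***"
    if a >= crit[0.05]:
        return "**"
    if a >= crit[0.10]:
        return "*"
    return ""


# 29 stocks, 10 explanatory variables, intercept assumed.
print("degrees of freedom:", degrees_of_freedom(29, 10))

# Hypothetical t-values (illustrative only).
for name, t in [("mean", 3.4), ("s.d.", -2.3), ("skew", 1.9), ("risk5", 0.4)]:
    print(f"{name:6s} t = {t:5.2f} {stars(t)}")
```

With so few degrees of freedom the critical values are noticeably larger than their normal-distribution counterparts, so a coefficient needs a larger t-value to earn a marker than it would under asymptotic inference.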

Concluding Comments
We examined how sensitive the new AS and FH performance measures are to disaster risk or maximal loss, using numerical examples and empirical examples. Although the numerical examples are limited, we showed that the AS performance measure is sensitive to disaster risk but insensitive to gains. We also showed that the FH performance measure is more sensitive than the AS performance measure both to disaster risk and to gains. The FH performance measure sometimes cannot be obtained when the maximal loss is large. In our empirical examples, however, we can always obtain the two performance measures despite large maximal losses. Therefore, the infeasibility of the FH performance measure due to a large maximal loss does not arise in our representative stock return data.
We closely studied how the GMM estimate of the FH performance measure is related to the maximal loss, the reciprocal of which is the extended FH performance measure defined by Riedel and Hellmann (2015) when the FH risk (performance) measure does not exist. We found that the GMM estimate is valid for every stock except Apple in daily DOW 30 stock data. In daily data, the difference between the GMM estimate and the reciprocal of the maximal loss (the extended FH performance measure) is very small for Apple but large for the other stocks. Therefore, the extended FH risk (performance) measure should be used with care to avoid erroneous estimation. The GMM estimate of the FH performance measure is more often invalid in monthly DOW 30 stock data. However, in monthly data the difference between the GMM estimate and the reciprocal of the maximal loss is small when the FH risk (performance) measure does not exist but large when it exists, i.e., when (the GMM estimate of) the FH performance measure is valid. Therefore, one should check the sign of the discriminant $E[\ln(1 + g/L)]$ to decide whether to use the FH performance measure or the extended FH performance measure, i.e., the reciprocal of the maximal loss. Nevertheless, (the GMM estimate of) the FH performance measure does not perform badly when it is not valid, i.e., when the FH risk (performance) measure does not exist, and does much better than the extended FH performance measure when it is valid. Therefore, using (the GMM estimate of) the FH performance measure does not lead to a serious mistake.
We also examined how the two performance measures are related to summary statistics, including percentiles, by running regressions with the two performance measures as dependent variables and summary statistics as explanatory variables. In daily data, at the five percent level or better, only mean and standard deviation are significant for the AS performance measure (at the one and five percent levels, respectively), and only mean is significant for the FH performance measure (at the one percent level). In monthly data, only skewness is significant for the AS performance measure, at the ten percent level, while skewness and the 40% percentile are significant at the five percent level for the FH performance measure. Therefore, unlike the previous view in the literature, the maximal loss or disaster risk is not significantly related to the two performance measures. However, the fact that skewness and the 40% percentile are significant at the five percent level for the FH performance measure, whereas skewness is significant only at the ten percent level for the AS performance measure, indicates the FH performance measure is more sensitive to losses in return observations than the AS performance measure. Therefore, the two performance measures can provide evaluations in representative stock data, even though the data contain quite disastrous observations, despite the previous view in the literature and our numerical examples. Hence, the two performance measures could be used in empirical studies to shed new light by providing risk-averse assessments compared to other traditional performance measures.
In this study, we identified risk with losses. However, risk is sometimes associated with regional economic concepts, such as regional risks for doing business. For example, the World Economic Forum has published The Global Risks Report (cf., e.g., The Global Risks Report (2020)) since 2006, highlighting each year the vulnerability of our world to volatility and disruption. Studies such as ours, where risk is associated with losses, may be limited when risk is considered more broadly.