Revisiting Structural Modeling of Credit Risk: Evidence from the Credit Default Swap (CDS) Market

The ground-breaking Black-Scholes-Merton model has brought about a generation of derivative pricing models that have been successfully applied in the financial industry. It has been a long-standing puzzle that the structural models of credit risk, as an application of the same modeling paradigm, do not perform well empirically. We argue that the ability to accurately compute and dynamically update hedge ratios to facilitate a capital structure arbitrage is a distinctive strength of the Black-Scholes-Merton modeling paradigm, which could be utilized in credit risk models as well. Our evidence is economically significant: we improve the implementation of a simple structural model so that it is more suitable for our application and then devise a simple capital structure arbitrage strategy based on the model. We show that the trading strategy persistently produced substantial risk-adjusted profit.


Introduction
"Seldom has the marriage of theory and practice been so productive." (Peter Bernstein, financial historian, in Capital Ideas: The Improbable Origins of Modern Wall Street [1])
The quote above is a compliment to the ground-breaking Black-Scholes-Merton model [2,3] and the rapidly developing generation of derivative pricing models. These models, with stellar empirical performance that facilitated the mushrooming of the financial derivative markets, have brought the lending capacity of the financial industry to a new height.
However, this modeling paradigm faced major difficulties when it was applied to one of the oldest financial instruments: bonds. Merton [4] started this endeavor by pointing out that the value of a bond is the difference between the value of the underlying asset (the value of the business) and a call option on the underlying asset, or equivalently, the difference between a riskless bond and a put option on the underlying asset. This key insight made the application of the Black-Scholes-Merton modeling paradigm to credit risk straightforward. The credit risk models that follow the Black-Scholes-Merton modeling paradigm are referred to as "structural models of credit risk".
Subsequently, the hedge ratio has been discussed in later studies as a key to understanding the weakness of structural models of credit risk. For example, Bao and Hou [23] use the hedge ratio to look into the heterogeneity in the co-movement of corporate bonds and equities. They found that bonds that are later in their firms' maturity structure have higher hedge ratios, and that hedge ratios are larger for firms with greater credit risk within ratings. Their findings are consistent with the prediction of the extended Merton model. Huang and Shi [24] reconciled the "equity-credit market integration anomaly" from the perspective of the hedge ratio.
We argue that accurate computation of the hedge ratio is only part of a bigger picture. A distinctive feature of the Black-Scholes-Merton modeling paradigm is the construction of a dynamic trading strategy aimed at exploiting pricing discrepancies among different securities on the same underlying asset. Such a trading strategy is called a capital structure arbitrage. The success of a capital structure arbitrage relies critically on the dynamic rebalancing of the portfolio, which in turn relies on the accurate computation and dynamic updating of the constantly changing hedge ratio. In other words, the accuracy and dynamic updating of the hedge ratio is the key to a capital structure arbitrage.
We argue that the ability to accurately compute and dynamically update hedge ratios to facilitate a successful capital structure arbitrage is a distinctive strength of the Black-Scholes-Merton modeling paradigm, which could be utilized in credit risk models as well. Consequently, an application where the structural model of credit risk performs well is capital structure arbitrage.
Empirically, we implemented a simple structural model of credit risk, which facilitated a capital structure arbitrage that persistently produced substantial risk-adjusted profit.
Our empirical result is an improvement upon those of Schaefer and Strebulaev [22]. First, in Schaefer and Strebulaev [22], linear time series regression was used as the benchmark to "test" whether or not the hedge ratio produced by the structural model was accurate. The empirical result of their study was that the hedge ratios produced by the structural model and the linear time series regression models were not far apart (except for the AAA bonds). An immediate question is: since a linear regression model is much easier to implement than a structural model, and it is accurate enough to serve as the benchmark of hedge ratio calculation, what is the point of using a structural model at all?
Second, the way the structural model was implemented in Schaefer and Strebulaev [22] largely eliminated the advantage of a structural model over a linear regression type of model. Specifically, every grade of bonds has only one hedge ratio, computed as an average of the hedge ratios of all the bonds of the same grade and across time. However, one distinctive advantage of structural models is the capacity to dynamically recalculate the constantly changing hedge ratio of an individual firm, and such hedge ratios surely vary cross-sectionally, which is beyond the reach of linear regression types of models.
Moreover, the problem may go deeper than that: although intuitively such "average" hedge ratios converge to a certain "expected hedge ratio", the way the model is implemented may fail to deliver it. Namely, every time series has only one volatility estimate, which is an average volatility over time. One hedge ratio is calculated according to this average, and that hedge ratio is considered an average hedge ratio. However, the hedge ratio is a nonlinear function of volatility. Writing $h(\sigma)$ for the hedge ratio as a function of volatility $\sigma$, the following equation was an implicit assumption of the algorithm in Schaefer and Strebulaev [22]:
$$E[h(\sigma_t)] = h(E[\sigma_t]) \quad (1)$$
Equation (1) is true if $h$ is a linear function of $\sigma$ or $\sigma$ is constant over time. In the case of a structural model, $h$ is not linear, and $\sigma$ evidently changes over time; therefore, Equation (1) is not true in general.
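The point about Equation (1) can be checked numerically. The sketch below is only an illustration under stated assumptions: the function `h` is the Black-Scholes call delta, used as a convenient nonlinear stand-in for a structural-model hedge ratio, and the volatility path is made up. It compares the average of the hedge ratios with the hedge ratio evaluated at the average volatility.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def h(sigma, s=100.0, k=120.0, r=0.02, tau=1.0):
    """Illustrative nonlinear map from volatility to a hedge ratio.

    Assumption: this is the Black-Scholes call delta, a stand-in only;
    it is not the hedge ratio of the structural model in the text.
    """
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    return norm_cdf(d1)

# Hypothetical volatility path that changes over time.
sigmas = [0.15, 0.20, 0.45, 0.60, 0.25]

avg_of_h = sum(h(x) for x in sigmas) / len(sigmas)  # average of the hedge ratios
h_of_avg = h(sum(sigmas) / len(sigmas))             # hedge ratio at average volatility

print(f"mean of h(sigma_t): {avg_of_h:.4f}")
print(f"h(mean sigma_t):    {h_of_avg:.4f}")
```

With a time-varying volatility path the two numbers differ, whereas a constant path would make them coincide, which is the sense in which Equation (1) holds only in special cases.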
Our study overcomes all of the aforementioned limitations of Schaefer and Strebulaev [22]. Instead of using a linear regression as a benchmark, we used the market as the benchmark: we utilized a structural model in capital structure arbitrage trading, demonstrating substantial and persistent risk-adjusted profit. It is worth mentioning that our evidence is therefore economically significant, not just statistically significant. The hedge ratio is recalculated at every point in time for every firm, demonstrating the unique strength of the structural model, which would be difficult for a linear regression type of model to accomplish.
Our results do not suggest that all the shortcomings of structural models of credit risk are due to non-credit-risk-related factors. Bao [25] found that the part of the cross-sectional variation unexplained by the Black-Cox model is related to credit risk proxies such as recent equity volatility, ratings, and option expensiveness. Further, Bao [25] found that unexplained yield spreads are related to unexplained CDS spreads.
We utilized option implied volatility, as opposed to historical volatility, as one of the key inputs of the structural model. We do so because option implied volatility is capable of expressing new information incorporated in the options market, while historical volatility moves slowly. The advantage of option implied volatility has been discussed previously. For example, the information content of option implied volatility has been discussed in [26], where a linear regression approach was used to demonstrate that option implied volatility dominates historical volatility in explaining the time series of CDS spreads. That regression analysis did not use option implied volatility directly as an input to the structural model. Cao, Yu and Zhong [26] went one step further by using implied volatility as an input to the structural model. They demonstrated that the CDS spread provided by such an implementation matched the observed CDS spread better than an implementation using historical volatility as the input. Empirically, our study goes yet one step further by providing economically significant evidence that a structural model with option implied volatility as an input can facilitate successful capital structure arbitrage.
In comparison with this literature, our data set is relatively old. However, the focus of our study is neither the information content of a certain asset (options) nor the advantage of using one parameter over another. The focus of this study is to show whether or not a structural model of credit risk can be put to good use despite all of the difficulties in credit risk modeling. Showing that it works in one period, in a certain way, is enough to make the point. A trading strategy should not be expected to work the same way all the time; it surely has to be modified according to changing regulatory environments and other conditions. The regulation around CDS trading has been through substantial changes, and we would not expect the same trading strategy to work the same way all of the time.
The profitability of capital structure arbitrage has been discussed in previous studies; see, for example, [27][28][29]. The profits of their trading strategies were very modest, far less pronounced than those in our study.
This paper is organized as follows: Section 2 introduces the model used; Section 3 presents our improved implementation of the model; Section 4 describes the capital structure arbitrage based on the prediction produced by the model; Section 5 reports empirical test results; and Section 6 concludes.

The CreditGrades™ Model
The structural model we used in this study is the CreditGrades™ model [30]. It was developed jointly by the RiskMetrics Group, J.P. Morgan, Goldman Sachs, and Deutsche Bank. The CreditGrades™ model is a descendant of Black and Scholes [2] and Merton [4], wherein the equity is modeled as a call option on a firm's assets, and the value of a firm's debt is therefore the value of its assets in excess of the equity. One problem structural models encounter in pricing corporate bonds is that the predicted short-term credit spread is too low compared to market data. CreditGrades™ addresses this problem by introducing uncertainty in the default barrier, motivated by the fact that the market does not know the exact level of leverage of a firm except at the time when the firm actually defaults. This approach produces higher short-term spreads than those found in a typical structural credit risk model.
We choose the CreditGrades™ model here because it is simple in comparison with the technically more sophisticated structural models. Good performance of a simple structural model helps to demonstrate the strength of structural models in general.
For detailed information on the model, please refer to the original CreditGrades™ technical document.

Improved Implementation
This section describes what distinguishes our implementation from the conventional implementation. The CreditGrades™ model provides a function $f$ of externally observed (calculated) factors $Z_t$ and model parameters $\beta$ to predict the CDS spread $Y_t$ at time $t$. Namely, the model predicts that $Y_t = f(Z_t, \beta)$. $Z_t$, the vector of externally observed (calculated) factors, includes variables such as stock price, stock return volatility, and interest rate; $\beta$, the vector of model parameters, includes three elements: $L$, $\lambda$, and $R$.¹ For a detailed interpretation of the parameters, please refer to the original CreditGrades™ technical document.
In the original CreditGrades™ technical documentation, $\lambda$ and $L$ are estimated to be 0.3 and 0.5, respectively, across all firms and all time, based on a study of historical data; the bond-specific recovery rate $R$ is taken from a proprietary database from J.P. Morgan. In short, among the three unobservable parameters, $\lambda$ and $L$ take fixed values across firms and time, while $R$ varies across firms but is constant over time. In Yu [27], $\lambda$ and $R$ are fixed across firms and across time, while $L$ is allowed to vary across firms but is constant over time. More specifically, in Yu [27], $L$ is estimated as the solution of the following optimization program:
In our implementation, we free up all three parameters: they are allowed to vary across firms and over time. In fact, the estimation is performed at every point of time $t$, with the most current data from the same firm, instead of estimating once and using the estimates forever, or using cross-sectional data to estimate. Specifically, all three model parameters are estimated as the solution of the following optimization program:

Distinction 1:
We freed up more model parameters. Instead of taking fixed values across firms and time, we allowed them to take the values that give the best match between the model and the data. The advantage of doing so is that the model has more flexibility in expressing the firm-specific information in the data.

Distinction 2:
We re-estimated the model parameters at every point of time with the most current data of the same firm. The advantage of doing so is that the most current information is reflected in the estimates.
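The re-estimation described in Distinction 2 can be sketched as a rolling fit. Everything specific below is an assumption made for illustration: `toy_spread` is a hypothetical one-parameter stand-in for $f(Z_t, \beta)$ (it is not the CreditGrades™ formula), the objective is a sum of squared pricing errors over a recent window, and the window length and parameter grid are arbitrary.

```python
def toy_spread(z, beta):
    """Hypothetical stand-in for the model function f(Z_t, beta).

    NOT the CreditGrades formula; chosen only so the example runs.
    """
    return beta * z

def fit_beta(zs, ys, grid):
    """Grid value minimizing the sum of squared pricing errors (assumed objective)."""
    return min(grid, key=lambda b: sum((toy_spread(z, b) - y) ** 2
                                       for z, y in zip(zs, ys)))

def rolling_estimates(zs, ys, window, grid):
    """Re-estimate beta at every date from the most recent `window` observations of one firm."""
    return [fit_beta(zs[t - window:t], ys[t - window:t], grid)
            for t in range(window, len(zs) + 1)]

# Made-up data for one firm: the true parameter shifts from 2 to 3 halfway through.
zs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 4.0, 6.0, 12.0, 15.0, 18.0]
grid = [1.0, 1.5, 2.0, 2.5, 3.0]

estimates = rolling_estimates(zs, ys, window=3, grid=grid)
print(estimates)
```

A single estimate over the whole sample would blend the two regimes, while the rolling estimates follow the shift, which is the motivation given in the text for re-estimating at every date.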

Distinction 3:
In addition, one of the observable variables in $Z_t$ is stock return volatility. Traditional implementations use historical volatility; see, for example, Yu [27]. We replaced historical volatility with option implied volatility. The advantage of using option implied volatility is that it reacts to new market information more promptly and in a more pronounced way, allowing the model to capture the newest information.
¹ The division between economic factors and model parameters is purely pragmatic. Theoretically, they are all variables. "Economic factors" here refers to the variables that are directly observable or straightforward to calibrate, while "model parameters" are the variables that are more difficult to calibrate.

Capital Structure Arbitrage
This section provides a description of the capital structure arbitrage trading strategy.

Trading against the Market
The arbitrageur compares the real (market observed) CDS spread of an obligor with the theoretical CDS spread calculated from the CreditGrades™ (CG) model, using parameters obtained from the recent spread sheet, the current stock price, the current interest rate, the maturity of the CDS, and, in some cases, the current price of the stock options of the obligor.
For a pre-specified threshold $h$, if the modeled spread $c'$ is higher than $(1+h)$ times the market spread $c$, namely,
$$c' > (1+h)\cdot c \quad (2)$$
the arbitrageur longs the CDS contract, which means she buys insurance on the bonds and pays periodic premiums.
If the market spread $c$ is higher than $(1+h)$ times the model spread $c'$, namely,
$$c > (1+h)\cdot c' \quad (3)$$
the arbitrageur shorts the CDS contract, which means she sells insurance on the bond and receives periodic premiums.
Why would this work? Suppose Equation (2) is true. That would be a signal that the market spread could be lower than its "fair" value, and it would tend to go up. Entering the long position of the CDS today means paying a lower than "fair" spread periodically for the insurance of the bond. Notice that the long position of the CDS contract is worth zero, just as is the short position of the same CDS contract, at the time the CDS contract initiates. Sometime later, the market CDS spread might increase to its "fair" value. When this occurs, the market demands a higher periodic premium for the purchase of the same insurance. Now someone who needs to protect her bond value by entering the long position of a CDS contract will have to pay a higher periodic premium. She is willing to pay the arbitrageur a premium in exchange for the long position of the CDS contract, so that she can pay a lower periodic premium for the same protection. The situation when Equation (3) holds is similar and will not be detailed here.
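The entry rule described above can be written as a small function. This is a minimal sketch: the function name and the example spreads (in basis points) are hypothetical, and only the comparison against $(1+h)$ comes from the text.

```python
def csa_signal(model_spread, market_spread, h):
    """Return 'long', 'short', or None for the CDS leg of the trade.

    'long'  : buy protection when the model spread exceeds (1 + h) times the market spread.
    'short' : sell protection when the market spread exceeds (1 + h) times the model spread.
    None    : the two spreads are within the threshold band, so no trade is opened.
    """
    if model_spread > (1.0 + h) * market_spread:
        return "long"
    if market_spread > (1.0 + h) * model_spread:
        return "short"
    return None

# Hypothetical spreads in basis points with threshold h = 0.5.
print(csa_signal(160.0, 100.0, 0.5))  # model spread well above market
print(csa_signal(100.0, 160.0, 0.5))  # market spread well above model
print(csa_signal(120.0, 100.0, 0.5))  # inside the band
```

A larger threshold $h$ widens the no-trade band, so fewer but more pronounced discrepancies trigger a position.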
The "fair" value today may no longer be "fair" tomorrow simply because the underlying economic elements, such as the stock price and the volatility of the stock return, change. Therefore, profit is not guaranteed even when the theoretical value and the real value converge. The theoretical value and the market value can fluctuate up and down together simply due to changes in the underlying economic elements, and converge at a value that makes the arbitrageur lose money.
Formally, suppose the theoretical CDS spread is given by $c'_t = f(S_t, \sigma_t; \vartheta_t)$, a function of the stock price $S_t$, the asset volatility $\sigma_t$, and a vector of other parameters $\vartheta_t$. The real CDS spread is given by $c_t = c'_t - m_t$, where $m_t$ measures the mispricing.
The arbitrageur assumes that $m_t$ cannot stay positive (negative) forever. Formally,
With the above assumption assuring the convergence, the arbitrageur still cannot expect $c'_t = f(S_t, \sigma_t; \vartheta_t)$ or $c_t = c'_t - m_t$ to stay still while the underlying economic elements $(S_t, \sigma_t; \vartheta_t)$ are constantly changing. In the previous example, the arbitrageur enters a long position of CDS when $c'_t > (1+h)\cdot c_t$. The arbitrageur can expect $m_t$ to come back down to zero, but when it does, both $c'_t$ and $c_t$ may have gone down, which leaves the arbitrageur in a money-losing situation.

Delta Hedge
To hedge against movements of the CDS spread, stock is introduced into the portfolio. The hedge ratio here is the ratio between the change in the value of the CDS position and the change in the stock price over a sufficiently short period of time. Intuitively, when the stock price increases (decreases), the CDS spread tends to decrease (increase), and consequently the value of a long CDS position decreases (increases). It therefore makes sense to hold a long stock position alongside a long CDS position, so that the gain (loss) on the stock offsets the impact of the stock price change on the value of the CDS position.
Suppose that the long position of CDS maturing at time $T$ is worth $\pi(S_t, \sigma_t; \vartheta_t)$. The value change of the long position due to a change in the stock price $S_t$, holding $\sigma_t$ and the vector of other parameters $\vartheta_t$ constant, is
$$\Delta\pi(S_t, \sigma_t; \vartheta_t) = \frac{d\pi(S_t, \sigma_t; \vartheta_t)}{dS_t}\,\Delta S_t + o(\Delta S_t),$$
which implies
$$\Delta\pi(S_t, \sigma_t; \vartheta_t) - \frac{d\pi(S_t, \sigma_t; \vartheta_t)}{dS_t}\,\Delta S_t \approx 0.$$
To hedge, the arbitrageur needs to purchase $-\frac{d\pi(S_t, \sigma_t; \vartheta_t)}{dS_t}$ units of stock, whose value change is $-\frac{d\pi(S_t, \sigma_t; \vartheta_t)}{dS_t}\,\Delta S_t$. (The arbitrageur needs to short sell $\frac{d\pi(S_t, \sigma_t; \vartheta_t)}{dS_t}$ units of stock if $-\frac{d\pi(S_t, \sigma_t; \vartheta_t)}{dS_t}$ is negative.)
The hedge does not provide complete protection of the portfolio because $S_t$ is not the only changing economic element that plays a role in the CDS spread. Other parameters that affect the CDS spread can fluctuate over time and therefore bring changes to the value of the portfolio. Additionally, the speed at which $m_t$ converges to zero is important. If $m_t$ converges slowly, the extended holding time comes with direct holding costs and with more chances that the underlying value changes adversely. Finally, if $m_t$ moves far away from zero before converging, the arbitrageur might be forced to liquidate.
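The hedge quantity can be approximated numerically when the value function has no convenient closed-form derivative. The following is a sketch under stated assumptions: `toy_value` is a hypothetical value function for a long CDS position (not the CreditGrades™ valuation), and the derivative is taken by a central finite difference with an arbitrary step size.

```python
def hedge_units(position_value, s, sigma, ds=1e-4):
    """Units of stock to buy against a long CDS position: -d(pi)/dS.

    The derivative is approximated by a central difference in the stock
    price, holding volatility fixed. A negative result means short selling.
    """
    dpi_ds = (position_value(s + ds, sigma) - position_value(s - ds, sigma)) / (2.0 * ds)
    return -dpi_ds

def toy_value(s, sigma):
    """Hypothetical value of a long CDS position (illustration only).

    It falls when the stock price rises and rises with volatility,
    which is the qualitative behavior described in the text.
    """
    return 10.0 * sigma / s

units = hedge_units(toy_value, s=50.0, sigma=0.4)
print(f"stock units to hold per long CDS position: {units:.6f}")
```

Because the toy value decreases in the stock price, the resulting hedge is a long stock position, and it would need to be recomputed whenever the stock price or volatility changes.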

Empirical Tests
This section offers empirical tests of the performance of our improved implementation of the CreditGrades™ model. One distinctive feature of this study is that performance is measured by the profitability of the capital structure arbitrage (CSA) based on the implementation. Consequently, the evidence produced is economically significant. In what follows, Section 5.1 describes the data and explains how profitability is measured; Section 5.2 compares our improved implementation and the conventional implementation; Section 5.3 assesses the profitability of CSA based on our improved implementation; and Section 5.4 assesses the persistence of the profitability of CSA based on our improved implementation.

Data and Computation of Profit
The CDS spread this study uses is the mid-quote CDS spread of a five-year CDS contract on senior unsecured debt of North American industrial obligors from January 2001 to June 2004. The CDS data are merged with equity price and adjusted equity return data from the Center for Research in Security Prices (CRSP) and accounting information from Compustat quarterly data. Five-year U.S. treasury yields and three-month T-bill rates are also from CRSP. Option data come from OptionMetrics.
Merging the above data yields 178 investment grade firms and 35 speculative grade firms, with daily data in the period from January 2001 to June 2004. The total number of observations is 138,815. Basic statistics of the data are presented in Tables 1 and 2.
The Fama-French risk factors and momentum risk factors used in regressions are downloaded from Kenneth French's data library online. For a detailed description, please refer to French's online library [31].
The other four risk factors used in Yu [27] are from Datastream:
RSPIND: excess return on the S&P Industry Index, proxying for equity market risk
RINV: the excess return on the Lehman Brothers Baa intermediate index, proxying for investment grade bond market risk
RSPEC: Lehman Brothers Baa intermediate index, proxying for speculative grade bond market risk
RARB: excess return on the CSFB/Tremont Fixed Income Arbitrage Index, proxying for the variations in the monthly excess returns not related to the market indices
Data of Option Implied Volatility: The structure of the option data is irregular: some stocks have more related options than others. The largest number of options related to a stock in a single day is thirty-six. Additionally, the number of options related to a stock can differ from day to day. We calculate option implied volatility by minimizing the squared pricing error. Namely, suppose that at a certain date $t$, a stock (with stock price $s_t$) has $n$ related call options $(C_{1,t}, C_{2,t}, \ldots, C_{n,t})$ with strike prices $(K_{1,t}, K_{2,t}, \ldots, K_{n,t})$ and maturities $(\tau_{1,t}, \tau_{2,t}, \ldots, \tau_{n,t})$. Suppose also that the risk-free interest rate is $r_t$. The Black-Scholes formula to calculate the call option price $C$ from volatility $\sigma$ is $C(\sigma, s, \tau, K, r)$. We implement the following optimization:
$$\sigma_t = \arg\min_{\sigma} \sum_{i=1}^{n} \left[ C(\sigma, s_t, \tau_{i,t}, K_{i,t}, r_t) - C_{i,t} \right]^2$$
The interest rate used above is the 3-month treasury rate. The solution of the above optimization, $\sigma_t$, is the option implied stock return volatility. It is then used to calculate the asset volatility, using the relationship between the stock volatility, asset volatility, stock value, and asset value. For a more detailed derivation, please refer to the original CreditGrades™ technical document.
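The least-squares implied volatility described above can be sketched in a few lines. The code is an illustration, not the authors' implementation: it assumes European calls without dividends, uses a golden-section search (which assumes the objective is unimodal on the search interval), and the search bounds, tolerance, and sample option data are all made up.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(sigma, s, tau, k, r):
    """Black-Scholes price of a European call on a non-dividend-paying stock."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)

def implied_vol(prices, strikes, maturities, s, r, lo=0.01, hi=3.0, tol=1e-8):
    """Single volatility minimizing the sum of squared pricing errors over all calls."""
    def sse(sigma):
        return sum((bs_call(sigma, s, t, k, r) - c) ** 2
                   for c, k, t in zip(prices, strikes, maturities))

    # Golden-section search on [lo, hi]; assumes sse is unimodal there.
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        x1 = b - g * (b - a)
        x2 = a + g * (b - a)
        if sse(x1) < sse(x2):
            b = x2  # minimum lies in [a, x2]
        else:
            a = x1  # minimum lies in [x1, b]
    return 0.5 * (a + b)

# Made-up option chain for one stock on one day, priced at a volatility of 0.30.
strikes = [90.0, 100.0, 110.0]
maturities = [0.25, 0.5, 1.0]
prices = [bs_call(0.30, 100.0, t, k, 0.02) for k, t in zip(strikes, maturities)]

sigma_hat = implied_vol(prices, strikes, maturities, s=100.0, r=0.02)
print(f"fitted implied volatility: {sigma_hat:.4f}")
```

Fitting one volatility to all available options handles the irregular data structure mentioned in the text, since the sum simply runs over however many options are quoted on a given day.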
Transaction Costs: A 5% bid-ask spread is assumed for the trading of CDS. For example, for a CDS spread quoted as 100 bps, the arbitrageur would have to pay 102.5 bps annually if she buys the contract; she receives only 97.5 bps annually if she sells the CDS contract. Stock trading is assumed to have no transaction cost.
Initial Capital: Every trade starts with an initial capital of $0.5 and a position in CDS with a $1.00 notional amount.
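The bid-ask convention under Transaction Costs amounts to shifting the quoted mid spread by half of the proportional bid-ask spread. A minimal sketch, with a hypothetical function name:

```python
def cds_trade_spread(quoted_mid_bps, side, bid_ask=0.05):
    """Spread actually paid or received, given a proportional bid-ask spread.

    A buyer of protection pays the mid quote plus half the bid-ask spread;
    a seller receives the mid quote minus half the bid-ask spread.
    """
    half = 0.5 * bid_ask
    if side == "buy":
        return quoted_mid_bps * (1.0 + half)
    if side == "sell":
        return quoted_mid_bps * (1.0 - half)
    raise ValueError("side must be 'buy' or 'sell'")

print(round(cds_trade_spread(100.0, "buy"), 4))
print(round(cds_trade_spread(100.0, "sell"), 4))
```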
Index Return: We build our investment/speculative grade CSA index in the following way. We consider every trade to be carried out by a different hedge fund. The fund starts when the trade starts; it liquidates when the trade liquidates. We define the investment/speculative grade CSA index as the equally weighted portfolio of all such hedge funds that hold investment/speculative grade CDS positions. The daily return of the index is then the average of the daily returns of all hedge funds in the index. It is worth noting that the number of hedge funds in the index can change from time to time due to the creation of new hedge funds and the termination of old hedge funds. We then compound the daily returns of the index into monthly index returns.
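The index construction can be sketched as follows. The function names and the sample returns are hypothetical, and one choice is an assumption not stated in the text: a day on which no fund is alive is given an index return of zero.

```python
def index_daily_returns(fund_returns_by_day):
    """Equally weighted daily index return.

    `fund_returns_by_day[d]` lists the daily returns of the funds alive on day d;
    the number of funds may differ from day to day.
    Assumption: a day with no live fund contributes a return of 0.0.
    """
    return [sum(day) / len(day) if day else 0.0 for day in fund_returns_by_day]

def compound(daily_returns):
    """Compound a sequence of daily returns into one period return."""
    total = 1.0
    for r in daily_returns:
        total *= 1.0 + r
    return total - 1.0

# Made-up example: two funds alive on day 1, one on day 2, none on day 3.
daily = index_daily_returns([[0.01, 0.03], [0.02], []])
monthly = compound(daily)
print(daily, round(monthly, 6))
```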
Individual Trade Return: Suppose a trade starts at $t$ and ends at $T$, with the values of the portfolio being $V_t$ and $V_T$, respectively. The excess return (ER) of the trade is then $\frac{V_T - V_t}{V_t}$ minus the return of the 3-month T-bill over the period $(t, T)$.
The AER of an individual trade is the annualized excess return of the trade. Namely, $\mathrm{AER} = \frac{\mathrm{ER}}{T - t}\cdot 252$.³
Measures of Profitability: Various statistics are used to measure profitability, namely, the Sharpe ratio, the proportion of months with positive excess return, the proportion of trades with positive excess return, the proportion of trades with a 20% drawdown, the proportion of trades in which the modeled CDS spread and the real CDS spread converged, and the mean and skewness of the AER across trades.
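The two return definitions above translate directly into code. In this sketch the function names and the example numbers are hypothetical, and the holding period $T - t$ is assumed to be measured in trading days, which is what the factor of 252 suggests.

```python
def excess_return(v_start, v_end, tbill_return):
    """ER of a trade: portfolio return minus the T-bill return over the same period."""
    return (v_end - v_start) / v_start - tbill_return

def annualized_excess_return(er, holding_days, trading_days_per_year=252):
    """AER = ER / (T - t) * 252, with T - t assumed to be in trading days."""
    return er / holding_days * trading_days_per_year

# Made-up trade: capital grows from 0.50 to 0.52 over 21 trading days,
# while the T-bill returns 0.1% over the same period.
er = excess_return(0.50, 0.52, 0.001)
aer = annualized_excess_return(er, holding_days=21)
print(round(er, 6), round(aer, 6))
```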

Advantage of Our Improved Implementation over Conventional Implementation
In this section we make three comparisons to show the incremental advantages of the distinctions of our improved implementation.
First, we define the capital structure arbitrage based on the conventional implementation of the CreditGrades™ model as TS0.⁴ We define the implementation with only Distinction 1 and Distinction 2 of our improved implementation as Implementation 1, and the CSA based on Implementation 1 as Trading Strategy 1 (TS1). We study the difference in profitability between TS1 and TS0.
Second, we define the CSA based on our improved implementation with all three distinctions as Trading Strategy 2 (TS2) and study the difference in profitability between TS2 and TS1.
Third, we compare TS2 and TS0 directly.

TS1 vs. TS0
Before a systematic comparison, let us look into an individual firm to provide an intuitive perspective. The firm we will look at is Wyeth, a major drug manufacturer based in Madison, New Jersey. Wyeth engages in the discovery, development, manufacture, distribution, and sale of pharmaceuticals, consumer healthcare, and animal health products. It operates through three segments: Pharmaceuticals, Consumer Healthcare, and Animal Health. Famous products of Wyeth include Robitussin, Advil (ibuprofen), Premarin, and Effexor.
As Figure 1 demonstrates, TS1 has a much tighter fit than Yu [27]. Yu [27] persistently over-estimates the market realized CDS spread for an extended period from 30 August 2002 to 3 March 2004. In the middle of the period, the fit is far off. At a threshold of 0.5, Yu [27] initiates 395 trades, with an average AER of 4.94%, while TS1 initiates only 34 trades, with an average AER of 20.28%, four times that of Yu [27]. All the trades TS1 initiates converge, while only 38% of the trades Yu [27] initiates converge.

The statistical results of a systematic comparison between TS1 and Yu [27] are presented in Table 3, and the statistical procedure is described in the caption of the table.
The conclusion we draw in this section is that, for investment grade firms, in comparison with Yu [27], TS1 has a statistically significantly higher Sharpe ratio, a higher proportion of trades and of months with positive excess return, a higher mean AER across trades, and a higher proportion of trades that converge. For speculative grade firms, in comparison with Yu [27], TS1 has a statistically significantly higher Sharpe ratio, a higher proportion of trades with positive excess return, higher mean and skewness of AER across trades, and a higher proportion of trades that converge.

Figure 1. Wyeth, CDS spread, TS1 vs. TS0. Figure 1 presents the market CDS spread, the modeled CDS spread computed with the method in Yu [27], and the CDS spread computed with TS1. TS1 has a much tighter fit than Yu [27]. (Vertical axis: CDS spread. Series: Market Spread, TS1, Yu (2006).)
It is worth noting that for the investment grade index, the skewness of the AER across trades of TS1 is actually statistically lower than that of Yu [27]. The proportion of months with positive excess return in TS1 is statistically significantly larger than that in Yu [27], but only at a confidence level of 94%. For both the investment grade and speculative grade indexes, the proportion of trades with a 20% drawdown in TS1 is not statistically different from that in Yu [27].
Step 1: For every threshold $h$, simulate both TS1 and Yu [27] and compute the difference between their Sharpe ratios, $D_h$.⁵ This step produces the series of $D_h$, with $h$ ranging from 0.5 to 1.5.
Step 2: We cannot use a straightforward t-test because the series $D_h$ is autocorrelated.⁶ Instead, we use an intercept-only regression and adjust for serial correlation in the error term with the Newey-West adjustment:
$$D_h = \alpha + \varepsilon_h, \quad E(\varepsilon_h) = 0 \text{ for all } h.$$
The hypothesis is now on the intercept $\alpha$: $H_0: \alpha = 0$ vs. $H_a: \alpha > 0$.
Step 3: Repeat Step 1 and Step 2 but replace the Sharpe ratio with other measures of profitability, including the proportion of months with positive excess return, the proportion of trades with positive excess return, the mean and skewness of the AER across trades, the proportion of trades that converge, and the proportion of trades with a 20% drawdown.⁷ For the proportion of trades with a 20% drawdown, the alternative hypothesis needs to be changed to $H_a: \alpha < 0$.
Table 3 reports the estimate, standard deviation, t-statistics, and the p-value of the intercept in the regression.
For the skewness of the AER across trades in the investment grade CSA index, since the estimate is negative, the alternative hypothesis has to be changed to $H_a: \alpha < 0$. Rejection of the corresponding null hypothesis is interpreted as TS1 underperforming Yu [27] in the corresponding measure of profitability.
Items with ** have a confidence level of more than 99%. Items with a single * have a confidence level of more than 90%. TS1 outperforms Yu [27] in 11 measures and underperforms it in one measure at a confidence level of at least 90%. In general, TS1 outperforms Yu [27].
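The intercept-only regression with a Newey-West adjustment from Step 2 can be sketched without any statistics library, because the intercept estimate is just the sample mean of $D_h$. The sketch below is an illustration under stated assumptions: it uses Bartlett kernel weights with no small-sample correction, the lag length is a free choice that the text does not specify, and the sample series is made up.

```python
import math

def newey_west_intercept(d, lags):
    """Intercept-only regression D_h = alpha + eps_h with a Newey-West standard error.

    Returns (alpha, standard error, t-statistic). The long-run variance uses
    Bartlett weights 1 - j / (lags + 1) on the residual autocovariances.
    """
    n = len(d)
    alpha = sum(d) / n
    resid = [x - alpha for x in d]

    def autocov(j):
        return sum(resid[i] * resid[i - j] for i in range(j, n)) / n

    long_run_var = autocov(0)
    for j in range(1, lags + 1):
        long_run_var += 2.0 * (1.0 - j / (lags + 1.0)) * autocov(j)

    se = math.sqrt(long_run_var / n)
    return alpha, se, alpha / se

# Made-up series of Sharpe-ratio differences across thresholds.
d_h = [1.0, 2.0, 3.0, 4.0]
alpha, se, t_stat = newey_west_intercept(d_h, lags=1)
print(alpha, se, t_stat)
```

With `lags=0` the standard error reduces to the usual one for a sample mean (with variance divisor n), so the lag terms are exactly what accounts for the serial correlation across neighboring thresholds.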

TS2 vs. TS1
This section compares the profitability of TS2 and TS1. Before a systematic comparison, we again use Wyeth to provide an intuitive perspective. Figure 2 demonstrates the fit of both the TS1 modeled spread and the TS2 modeled spread. The TS1 model spread smoothly follows the market realized spread, while the TS2 model spread has more sudden large moves. TS2 gives more trading signals than TS1. At a threshold of 0.95, TS2 gives 34 trading signals while TS1 gives only 13. Among the 13 trading signals given by TS1, 11 are also given by TS2. (All the trading signals of both trading strategies are profitable.) The AHPERR is 1.9% for TS2 and 1.2% for TS1.
As demonstrated in Figure 3, historical volatility$^{8}$ was a smooth curve while option-implied volatility had more sudden jumps, which confirmed the intuition that option-implied volatility incorporated information more promptly than historical volatility. The CDS spread computed with option-implied volatility was therefore able to send clearer trading signals than the spread computed with historical volatility.

$^{5}$ $D_h$ = Sharpe ratio of TS1 minus Sharpe ratio of Yu [27].
$^{6}$ Although $h$ is not time, we can still borrow the concept and methodology.
$^{7}$ A 20% drawdown means that the value of the portfolio drops below 80% of the original capital some time before liquidation. For this measurement, we need to change $H_a: \alpha > 0$ into $H_a: \alpha < 0$.

Figure 2. Wyeth, CDS spread, TS2 vs. TS1 (axis label: CDS spread; series: market spread, TS2, TS1). Figure 2 presents the market CDS spread, the modeled CDS spread computed with TS2, and the CDS spread computed with TS1. It is shown that TS1 has a much tighter fit than TS2. However, it turns out that the trading performance of TS2 is better than that of TS1. TS2 gives more trading signals than TS1.
J. Risk Financial Manag. 2016, 9, 3
The better fit of TS1 was also due to the special features of the CDS spread in relation to volatility and stock price. First, the historical volatility was a very smooth curve: the difference between the historical volatilities of consecutive days was small. Second, the day-to-day changes of the CDS spread were small most of the time. The difference between the model spreads of two consecutive days was caused by the different inputs to the CDS function. The only inputs that differed in this case were the stock price and the historical volatility. When historical volatilities were close to each other, the difference in model spread was mostly caused by the difference in stock prices. However, the stock price did not have a strong effect on the CDS spread, as demonstrated by the small magnitude of the partial derivative of the CDS spread with respect to the stock price. In a nutshell, the smoothness of both the historical volatility and the market realized CDS spread made TS1 fit better than TS2.
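The smoothness of historical volatility can be illustrated with a rolling-window estimator: consecutive windows share all but one observation, so the estimate can only move gradually. The sketch below is an assumption-laden illustration, not the estimator used in this study; the function name `rolling_hist_vol`, the window length, the 252-day annualization, and the price path are all hypothetical.

```python
import math

def rolling_hist_vol(prices, window, periods_per_year=252):
    """Annualized rolling standard deviation of daily log returns.

    window and periods_per_year are illustrative choices, not values
    taken from the paper.
    """
    log_ret = [math.log(prices[i] / prices[i - 1]) for i in range(1, len(prices))]
    vols = []
    for end in range(window, len(log_ret) + 1):
        chunk = log_ret[end - window:end]
        mean = sum(chunk) / window
        var = sum((r - mean) ** 2 for r in chunk) / (window - 1)
        vols.append(math.sqrt(var * periods_per_year))
    return vols

# Made-up price path: steady small moves followed by a single large drop
toy_prices = [100.0, 100.5, 100.2, 100.8, 100.6, 101.0, 100.9, 101.3,
              101.1, 101.5, 101.4, 95.0, 95.3, 95.1, 95.6]
toy_vols = rolling_hist_vol(toy_prices, window=10)
```

In the toy path, the single large drop enters the window as one observation out of ten, which is why a windowed estimate reacts less abruptly than a forward-looking quantity such as option-implied volatility.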
To provide an intuitive perspective, let us take a closer look at one of the trading opportunities that TS2 catches but TS1 misses. The stock price of Wyeth dropped on 9 July 2009 because of the concern that the Senate committee that oversees the U.S. Food and Drug Administration was likely to pass a compromise bill, sponsored by Senators John McCain, an Arizona Republican, and Charles Schumer, a New York Democrat, to tighten two loopholes in a 1984 law that let brand-name drug makers delay generic competition. One of these loopholes allows brand-name drug makers like Wyeth to get a 30-month hold on approvals of generic versions of their drugs if they claim their patents have been infringed, which stalls competition from generic drug manufacturers.
As demonstrated in Figure 4, on 8 July 2008, the market spread, the TS1 modeled spread, and the TS2 modeled spread were all at a low level of around 50 basis points. On 9 July, the market CDS spread increased substantially. The TS2 spread soared upward, far above the market CDS spread, giving a strong signal that the market CDS spread would increase further. The large difference between the TS2 and market CDS spreads sparked a trading signal. This trading signal turned out to be profitable: on 22 July, when the TS2 spread and the market CDS spread converged, the market CDS spread had more than tripled and the arbitrageur who used TS2 to trade would have profited. The TS1 spread, on the other hand, did not generate any trading signals, and the arbitrageur who carried out TS1 would have missed the opportunity. The TS1 spread rose a bit higher than the market CDS spread on 9 July and then dropped below the market CDS spread, where it stayed until the trade initiated by TS2 was liquidated.
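The mechanics of this episode (a large gap opens a trade, convergence closes it) can be sketched in a few lines. The entry rule below is an assumption made only for illustration: a trade opens when the model spread differs from the market spread by more than `threshold` times the market spread, and closes when the two spreads meet or cross. The function name `find_trades` and all numbers are hypothetical and are not the Wyeth data.

```python
def find_trades(market, model, threshold):
    """Return (entry_index, exit_index, direction) for each closed trade.

    Assumed rule (not taken from the paper): enter when
    |model - market| > threshold * market; exit when the gap closes
    or changes sign. direction is +1 if the model spread was above
    the market spread at entry, -1 otherwise.
    """
    trades = []
    position = None  # (entry_index, direction) while a trade is open
    for i, (mkt, mdl) in enumerate(zip(market, model)):
        gap = mdl - mkt
        if position is None:
            if abs(gap) > threshold * mkt:
                position = (i, 1 if gap > 0 else -1)
        else:
            entry, direction = position
            if direction * gap <= 0:      # spreads have converged
                trades.append((entry, i, direction))
                position = None
    return trades  # trades still open at the end are not reported

# Made-up spreads in basis points
toy_market = [50, 80, 100, 130, 160]
toy_model = [52, 170, 165, 150, 158]
toy_trades = find_trades(toy_market, toy_model, threshold=0.95)
```

In the toy data the model spread jumps far above the market spread on the second day, and the trade closes on the last day, by which time the market spread has doubled from its entry level.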
Figure 4. Wyeth, Bad News, CDS spread, TS2 vs. TS1. Figure 4 shows the dynamics of the market CDS spread, the TS2 modeled CDS spread, and the TS1 modeled spread between 1 July and 23 July 2008. The market spread and the TS1 and TS2 model spreads are all at a low level of around 50 basis points. On 9 July the market CDS spread increases substantially. TS2 soars upward, far above the market CDS spread, giving a strong signal that the market CDS spread will increase further. The large difference between the TS2 and market CDS spreads sparks a trading signal. This trading signal turns out to be profitable: on 22 July, when the TS2 spread and the market CDS spread converge, the market CDS spread has more than tripled, and the arbitrageur who uses TS2 to trade would have profited. The TS1 spread, on the other hand, does not generate any trading signals, and the arbitrageur who carries out TS1 would have missed the opportunity. The TS1 spread rises a little higher than the market CDS spread on 9 July and then drops below the market CDS spread until the trade initiated by TS2 is liquidated.

Why did TS2 catch an opportunity while TS1 missed it? As demonstrated in Figure 5, option-implied volatility gave a much clearer signal than historical volatility on 9 July 2008. Both volatilities stayed flat before 9 July 2008, when option-implied volatility jumped much more than historical volatility. This is why the CDS spread computed with the option-implied volatility could send a clearer signal of a trading opportunity. The different behavior of implied and historical volatility can be interpreted as follows: first, implied volatility reacts to information more promptly than historical volatility; second, implied volatility reacts to the information with a larger movement, which might incorporate the time-varying volatility premium, as Cao, Yu, and Zhong [26] show.

We now compare the profitability of TS1 and TS2 at an index level. We use the same methodology as in Section 5.2.1, where we compared TS1 and Yu [27]. The statistics of the comparison are presented in Table 4. TS2 outperforms TS1 in five measures of profitability while underperforming it in three measures, at a confidence level of at least 95%, implying that TS2 outperforms TS1 in general. Items with ** have a confidence level of more than 99%. Items with a single * have a confidence level of more than 90%.

TS2 vs. Trading Strategy in Yu [27]
To make the analysis complete, we now compare TS2 and Yu [27]. Again, to provide an intuitive perspective, we compare TS2 and Yu [27] in the case of an individual firm, Wyeth. As Figure 6 demonstrates, TS2 had a much tighter fit than Yu [27]. Yu [27] persistently overestimated the market spread for an extended period from 30 August 2002 to 3 March 2004. In the middle of the period, the fit was far off. At threshold = 0.5, Yu [27] initiated 395 trades, with an average AER of 0.3114%, while TS2 initiated only 74 trades, with an average AER of 1.6486%, about 5 times that of Yu [27]. All the trades in TS2 converge while only 38% of the trades Yu [27] initiates converge.

Figure 6. Wyeth CDS spread, TS2 vs. Yu [27] (axis label: CDS spread; series: market spread, TS2, Yu (2006)). Figure 6 compares the modeled CDS spreads computed by TS2 and those computed with the methods of Yu [27]. As shown above, TS2 has a much tighter fit than Yu [27]. Yu [27] persistently overestimates the market spread for an extended period from 30 August 2002 to 3 March 2004. In the middle of the period, the fit is far off.

We now compare the profitability of TS2 and Yu [27] at an index level. We use the same methodology as in the comparison of TS1 and Yu [27]. The statistics of the comparison are presented in Table 5. Except for the skewness of AER across trades in TS2, all the null hypotheses are rejected at confidence levels of at least 99.9%, implying that TS2 outperforms Yu [27] substantially. Items with ** have a confidence level of more than 99.99%.
TS2 improves upon Yu [27] in all measures of profitability except the skewness of AER for the investment grade index.

Profitability of TS2
The profitability of TS2, measured with the same statistics as in the previous section, is presented in Table 6. The profitability of the investment grade index is impressive while the performance of the speculative grade index is modest.
Step 1: For every threshold $h$, simulate TS2 and compute the Sharpe ratio $SR_h$.$^{9}$ This step produces the series of $SR_h$, with $h$ ranging from 0.5 to 1.5 in increments of 0.025.
Step 2: Run the following intercept-only regression and adjust for serial correlation in the error term with the Newey-West adjustment: $SR_h = \alpha + \varepsilon_h$, with $E(\varepsilon_h) = 0$ for all $h$. The hypothesis test is about the intercept $\alpha$: $H_0: \alpha = 0$ vs. $H_a: \alpha \neq 0$.
Step 3: Repeat Step 1 and Step 2 but replace the Sharpe ratio with other measures of profitability, including the proportion of months with positive excess return, the proportion of trades with positive excess return, the mean and skewness of the AER across trades, the proportion of trades that converge, and the proportion of trades with a 20% drawdown.$^{10}$
Table 6 reports the lower bound of the confidence interval, upper bound of the confidence interval, estimate, t-statistics, and p-value of multiple measures of profitability.
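Several of the profitability measures listed in the steps above are simple to compute once the return series of a simulated strategy is available. The following sketch is illustrative only: the function names are hypothetical, the Sharpe ratio is annualized with a square-root-of-12 convention that the text does not specify, and the drawdown check follows the verbal definition of a 20% drawdown (portfolio value falling below 80% of the original capital before liquidation).

```python
import math

def sharpe_ratio(monthly_excess_returns, periods_per_year=12):
    """Annualized Sharpe ratio; the annualization is an assumed convention."""
    n = len(monthly_excess_returns)
    mean = sum(monthly_excess_returns) / n
    var = sum((r - mean) ** 2 for r in monthly_excess_returns) / (n - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

def proportion_positive(excess_returns):
    """Share of periods (or trades) with strictly positive excess return."""
    return sum(1 for r in excess_returns if r > 0) / len(excess_returns)

def hits_drawdown(portfolio_values, initial_capital, floor_fraction=0.80):
    """True if the portfolio value ever falls below 80% of initial capital."""
    floor = floor_fraction * initial_capital
    return any(value < floor for value in portfolio_values)

# Made-up inputs for a single simulated strategy
toy_returns = [0.012, -0.004, 0.009, 0.015, -0.002, 0.007]
toy_sharpe = sharpe_ratio(toy_returns)
toy_share = proportion_positive(toy_returns)
toy_drawdown = hits_drawdown([100.0, 96.0, 88.0, 93.0, 104.0], 100.0)
```

Repeating these computations for each threshold yields one series per measure, which is the input to the intercept-only regression in Step 2.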

Persistence against Systematic Risk
Is TS2 market neutral? This question is answered by regressing the monthly excess returns of the TS2 indexes on eight risk factors that proxy a wide range of systematic risks. The t-value and p-value of the intercept and the factor loadings are listed in Table 7.
As demonstrated in the table, for the investment grade index, none of the coefficients of the risk factors are statistically significant while all the intercepts are statistically significant.
The result for the speculative grade is not as clean, which might be because the sample of speculative grade obligors is small. Therefore, we may have less confidence in the test result for speculative grade obligors.

$^{9}$ $D_h$ = Sharpe ratio of TS3 without transaction cost minus Sharpe ratio of TS2.
$^{10}$ A 20% drawdown means that the value of the portfolio drops below 80% of the original capital some time before liquidation. For this measurement, we need to change $H_a: \alpha > 0$ into $H_a: \alpha < 0$.

As demonstrated in the table, for investment grade, Null Hypothesis 1 is rejected for all measures of profitability, and Null Hypothesis 2 is not rejected for any measure of profitability. Our conclusion is therefore that, for investment grade obligors, TS2 does not load on any of the risk factors. For speculative grade, the result is less clean.

Persistence against Time
For a trading strategy, there is always a worry that, as the market matures with time, the profit of the trading strategy will dwindle. Does that happen to TS2? This question is answered by regressing the monthly excess returns of the TS2 indexes on time.
The statistics of the tests are reported in Table 8, and the statistical procedure is described in the caption of the table.
For investment grade obligors, the coefficient of time is not statistically significant while the intercept is statistically significant; the profit of TS2 therefore persists over time. For speculative grade obligors, the coefficient of time is actually significantly positive, which means that the return of CSA grows with time. The intercept of the regression, however, is not statistically significant. This strange result for the speculative grade might be due to the fact that the sample of speculative grade obligors is very small in comparison with that of investment grade obligors; therefore, we may have less confidence in the test result for speculative grade obligors.
Step 1: For every threshold $h$, simulate TS2 and record the monthly return of the CSA index. This step produces 41 time series of CSA index monthly returns $R_{h,t}$, with $h$ ranging from 0.5 to 1.5 and $t$ indexing the months.
Step 2: For $h = 0.5$, perform the following time series regression, with autocorrelation in the error term adjusted with the Newey-West adjustment:
Regression 1: $R_{h,t} = \alpha_h + \beta_h \cdot t + \varepsilon_{h,t}$, where $t$ indexes the month, with $E(\varepsilon_{h,t}) = 0$ for all $t$.
Step 3: Repeat Step 2 for all the other values of $h$. This step results in coefficient estimate pairs $(\alpha_h, \beta_h)$ for 41 different values of $h$, ranging from 0.5 to 1.5.
Step 4: Run the following regression with autocorrelation in the error term adjusted with the Newey-West adjustment.
If the estimate of $\beta$ turns out to be positive, we test the following hypotheses instead: Null Hypothesis 3: $\beta = 0$ vs. Alternative Hypothesis 3: $\beta > 0$.
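Steps 1 to 3 of this procedure can be sketched as a loop over the threshold grid with one time-trend regression per threshold. The code below is a rough illustration under stated assumptions: the function names are hypothetical, the returns are placeholder numbers rather than simulated TS2 returns, and only the point estimates $(\alpha_h, \beta_h)$ are computed, without the Newey-West adjustment or the Step 4 regression.

```python
def ols_time_trend(returns):
    """Fit R_t = alpha + beta * t + eps_t by ordinary least squares, t = 1..n."""
    n = len(returns)
    t = list(range(1, n + 1))
    t_bar = sum(t) / n
    r_bar = sum(returns) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (ri - r_bar) for ti, ri in zip(t, returns))
    beta = sxy / sxx
    alpha = r_bar - beta * t_bar
    return alpha, beta

def fit_all_thresholds(returns_by_threshold):
    """Map each threshold h to its (alpha_h, beta_h) estimate pair."""
    return {h: ols_time_trend(r) for h, r in returns_by_threshold.items()}

# Threshold grid from 0.5 to 1.5 in increments of 0.025 (41 values)
thresholds = [round(0.5 + 0.025 * k, 3) for k in range(41)]

# Placeholder monthly returns, reused for every h purely for illustration;
# in practice each h would have its own simulated return series.
placeholder_returns = [0.010, 0.012, 0.008, 0.011, 0.009, 0.013]
pairs = fit_all_thresholds({h: placeholder_returns for h in thresholds})
```

A slope estimate near zero together with a positive intercept corresponds to the pattern reported for the investment grade index.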

5.4.3. Picking up Nickels in front of a Steam Roller?
Some trading strategies pick up profits most of the time but suffer huge losses once in a while. Does TS2 suffer from the same symptom? This question is answered by studying the skewness of the AER across all trades of TS2. Table 9 reports the statistics of the skewness. Formally, we test the following hypotheses: $H_0$: The skewness of the AER across all trades of TS2 is zero. $H_a$: The skewness of the AER across all trades of TS2 is positive.
Step 1: For every threshold $h$, simulate TS2 and compute the annualized excess return $AER_h$.$^{11}$ This step produces the series of $AER_h$, with $h$ ranging from 0.5 to 1.5 in increments of 0.025.
Step 2: Run the following intercept-only regression and adjust for serial correlation in the error term with the Newey-West adjustment: $AER_h = \alpha + \varepsilon_h$, with $E(\varepsilon_h) = 0$ for all $h$. The hypothesis test is about the intercept $\alpha$: $H_0: \alpha = 0$ vs. $H_a: \alpha \neq 0$.
Rejection of the null hypothesis is interpreted as the skewness of AER being positive, and therefore TS2 is not "picking up nickels in front of a steam roller". Table 9 reports the lower and upper bounds of the confidence interval, the estimate, the t-statistic, and the p-value of $\alpha$.
The null hypothesis is rejected for both grades of CSA, which is interpreted as TS2 not "picking up nickels in front of a steam roller." The skewness is greater than 0 with a confidence level of more than 99.9%. Therefore, the answer is "No".
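To make the role of skewness concrete, the following sketch computes a moment-based sample skewness for two made-up sets of trade returns. The estimator (third central moment divided by the 1.5 power of the second) is an assumed, standard choice; the text does not state which estimator is used, and the function name `skewness` and the numbers are hypothetical.

```python
def skewness(values):
    """Moment-based sample skewness: m3 / m2 ** 1.5 (assumed estimator)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Many small gains and one large loss: the "nickels and steam roller" pattern
nickel_returns = [1.0, 1.0, 1.0, -3.0]
# Many small outcomes and one large gain: the opposite pattern
lottery_returns = [0.0, 0.0, 0.0, 4.0]

nickel_skew = skewness(nickel_returns)    # negative
lottery_skew = skewness(lottery_returns)  # positive
```

A strategy of the first kind shows negative skewness, so a significantly positive skewness of AER across trades is evidence against that pattern.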

Conclusions and Discussion
This study sheds new light on the credit spread puzzle by pointing out that the ability to accurately compute and dynamically update hedge ratios to facilitate a successful capital structure arbitrage is a distinctive strength of the Black-Scholes-Merton modeling paradigm, which could be utilized in credit risk models as well.
We provide economically significant evidence. Using an improved implementation, we showed that the model facilitated a successful capital structure arbitrage, which persistently produced substantial risk-adjusted profit.
We do not suggest that our implementation is better than that of Huang and Huang [5]. In general, implementation should be tailored to the model and to the specific application. Moreover, even the same model should be implemented differently if different applications so demand. The way we implemented this particular model for this particular application does not serve as a general standard for the implementation of other models in the same paradigm. We do hope, of course, that individual elements of our implementation will be found helpful by fellow researchers.
We need to caution against the notion that the capital structure arbitrage in this study always works. The profitability of a particular trading strategy is not the focus of this study. In general, a trading strategy needs to be modified constantly in response to changes in the regulatory environment, market conditions, etc. The regulatory environment of CDS trading has changed substantially over the years, and there is no guarantee that a trading strategy will always work in spite of such changes.

$\sum_{s} \left( f(Z_{t-s}, \lambda, R, L) - y_{t-s} \right)^{2}$
Namely, we modified the original implementation in the following way: Distinction 1:

Figure 1. Wyeth, CDS spread, TS1 vs. TS0. Figure 1 presents the market CDS spread, the modeled CDS spread computed with the methods in Yu [27], and the CDS spread computed with TS1. It is shown that TS1 has a much tighter fit than Yu [27].

Figure 3. Wyeth, Implied Volatility vs. Historical Volatility. Figure 3 compares implied volatility and historical volatility. It demonstrates that historical volatility is a smooth curve while option-implied volatility has more sudden jumps. This confirms the intuition that option-implied volatility incorporates information more promptly than historical volatility.

Figure 4. Wyeth, Bad News, CDS spread, TS2 vs. TS1. Figure 4 shows the dynamics of the market CDS spread, the TS2 modeled CDS spread, and the TS1 modeled spread between 1 July and 23 July 2008. The market spread and the TS1 and TS2 model spreads are all at a low level of around 50 basis points. On 9 July the market CDS spread increases substantially. TS2 soars upward, far above the market CDS spread, giving a strong signal that the market CDS spread will increase further. The large difference between the TS2 and market CDS spreads sparks a trading signal. This trading signal turns out to be profitable: on 22 July, when the TS2 spread and the market CDS spread converge, the market CDS spread has more than tripled, and the arbitrageur who uses TS2 to trade would have profited. The TS1 spread, on the other hand, does not generate any trading signals, and the arbitrageur who carries out TS1 would have missed the opportunity. The TS1 spread rises a little higher than the market CDS spread on 9 July and then drops below the market CDS spread until the trade initiated by TS2 is liquidated.

Figure 5. Wyeth, Bad News, Implied Vol. vs. Historical Vol. Figure 5 shows the dynamics of option-implied volatility and historical volatility in the period between 1 July and 23 July 2002. Option-implied volatility gives a much clearer signal than historical volatility on 9 July 2008. Both volatilities stay flat before 9 July 2008. On 9 July 2008, option-implied volatility jumps up by a much larger margin than historical volatility.

Table 1. Number of observations and market cap. Inv. Nob is the number of observations of investment grade firms. Inv. Mkt cap is the market cap of investment grade firms. Spec. Nob is the number of observations of speculative grade firms. Spec. Mkt cap is the market cap of speculative grade firms. Unit "B" represents one billion dollars.
Note: Table 2 presents the industrial composition of the obligors. Firms in the financial industry are excluded because their capital structure is hard to interpret.

Table 6. Risk and Return Characteristics of TS2.

Table 7. Market Neutrality of TS2.
Step 1: For every threshold $h$, simulate TS2 and record the monthly return of the CSA index. This step produces 41 time series of CSA index monthly returns $R_{h,t}$, with $h$ ranging from 0.5 to 1.5 and $t$ indexing the months.
Step 2: For $h = 0.5$, regress the time series of monthly returns on the systematic risk factors.
Step 3: Repeat Step 2 for all the other values of $h$. This step results in 41 coefficient estimate pairs $(\alpha_{h,SMB}, \beta_{h,SMB})$ for 41 different values of $h$, ranging from 0.5 to 1.5.
Step 4: Run the following regression with autocorrelation in the error term adjusted with the Newey-West adjustment.
If Null Hypothesis 1 is rejected while Null Hypothesis 2 is not rejected, we conclude that the CSA index return does not load on the risk factor SMB. If Null Hypothesis 2 is rejected, then we conclude that the CSA index return does load on the risk factor SMB.
Step 6: Repeat Steps 4-5, but with SMB replaced by the other risk factors.

Table 9. Skewness of AER across all trades in TS2.