Some Notes on the Formation of a Pair in Pairs Trading

Abstract: The main goal of the paper is to introduce different models to calculate the amount of money that must be allocated to each stock in a statistical arbitrage technique known as pairs trading. The traditional allocation strategy is based on an equal weight methodology. However, we will show how, with an optimal allocation, the performance of pairs trading increases significantly. Four methodologies are proposed to set up the optimal allocation. These methodologies are based on distance, correlation, cointegration and the Hurst exponent (mean reversion). It is shown that the new methodologies improve on the results obtained with an equally weighted strategy.


Introduction
The Efficient Market Hypothesis (EMH) is a well-known topic in finance. The weak form of efficiency implies that information about the past is already reflected in the market price of a stock and that, therefore, historical market data are not helpful for predicting the future. An investor in an efficient market will not be able to obtain a significant advantage over a benchmark portfolio or a market index by trading on historical data (for a review see References [1,2]).
In contrast, some researchers have shown that historical data and trading techniques can sometimes be exploited because of temporary market anomalies. Although most economists consider these anomalies incompatible with an efficient market, recent papers have put forward new perspectives, called the Fractal Market Hypothesis (FMH) and the Adaptive Market Hypothesis (AMH), that try to integrate market anomalies into the efficient market hypothesis.
The EMH was questioned by the mathematician Mandelbrot in 1963, and later the economist Fama expressed doubts about the Normal distribution of stock returns, an essential point of the efficient hypothesis. Mandelbrot concluded that stock prices exhibit long memory, and proposed a fractional Brownian motion to model the market. Di Matteo [3,4] considered that investors can be distinguished by the investment horizons in which they operate. This consideration allows us to connect the idea of long memory and the efficiency hypothesis. In the context of an efficient market, the information is considered as a generic item. This means that the impact that public information has on each investor is similar. However, the FMH assumes that information and expectations affect traders, who are focused only on the short term, differently from long-term investors [5,6].
The idea of an AMH has been recently introduced by Lo [7] to reflect an evolutionary perspective of the market. Under this new idea, markets show complex dynamics at different times, so that some arbitrage techniques perform well in some periods and poorly in others.
In an effort at reconciliation, Sanchez et al. [8] remark that the market dynamic is the result of the interactions of different investors. In this way, the scaling behavior patterns of a specific market can characterize it. Price series of developed markets usually show only short memory or no memory, whereas emerging markets do exhibit long-memory properties. Following this line, in a recent contribution, Sanchez et al. [9] showed that pairs trading strategies are quite profitable in Latin American stock markets, whereas for Nasdaq 100 stocks they are profitable only in high volatility periods. These results are in accordance with both market hypotheses. A similar result is obtained by Zhang and Urquhart [10], where the authors are able to obtain a significant excess return with a trading strategy across Mainland China and Hong Kong, but not when the trading is limited to one of the markets. The authors argue that this is because of the increase in the efficiency of the Mainland China stock market and the decrease in that of the Hong Kong one, due to the integration of Chinese stock markets and the permission of short selling.
These new perspectives on market rules explain why statistical arbitrage techniques, such as pairs trading, can outperform market indexes if they are able to take advantage of market anomalies. In a previous paper, Ramos et al. [11] introduced a new pairs trading technique based on the Hurst exponent, which is the classic and well-known indicator of market memory (for more details, References [8,12] contain an interesting review). For our purpose, the pair selection policy is to choose those pairs with the lowest Hurst exponent, that is, the most anti-persistent pairs. Then we use a reversion to the mean trading strategy with the most anti-persistent pairs, in accordance with the previously mentioned idea that developed market prices show short memory [3,13-15].
The pairs trading literature is extensive and mainly focused on the pair selection during the trading period as well as on the development of a trading strategy. The pioneering paper was Gatev et al. [16], where the authors introduced the distance method with an application to the US market. In 2004, Vidyamurthy [17] presented the theoretical framework for pair selection using the cointegration method. Since then, different analyses have been carried out using this methodology in different markets, such as the European market [18,19], the DJIA stocks [20], the Brazilian market [21,22] or the STOXX 50 index [23]. Galenko et al. [24] applied the cointegration method to arbitrage in funds traded on different markets. Lin et al. [25] introduced the minimum profit condition into the trading strategy and Nath [26] used the cointegration method with intraday data. Elliott et al. [27] used Markov chains to study a mean reversion strategy based on differential predictions and calibration from market observations. The mean reversion approach has been tested in markets not considered efficient, such as Asian markets [28] or Latin American stock markets [9]. A recent contribution of Ramos et al. [29] introduced a new methodology for testing the co-movement between assets and tested it in statistical arbitrage. However, researchers have not paid attention to the amount of money invested in each asset, always assuming zero dollar market exposure. This means that when one stock is sold, the same amount of the other stock is purchased. In this paper we propose a new methodology to improve pairs trading performance by developing new methods for calculating the ratio to invest in each stock that makes up the pair.

Pair Selection
One of the central questions in pairs trading is how to find a suitable pair. Several methodologies have been proposed in the literature, but the most common ones are co-movement and the distance method.

Co-Movement
Baur [30] defines co-movement as the shared movement of all assets at a given time and it can be measured using correlation or cointegration techniques.
The correlation technique is quite simple: the higher the correlation coefficient is, the more the two stocks move in sync. An important issue to be considered is that correlation is intrinsically a short-run measure, which implies that a correlation strategy will work better with a lower frequency trading strategy.
In this work, we will use the Spearman correlation coefficient, which is a nonparametric rank statistic that measures the relationship between two variables. This coefficient is particularly useful when the relationship between the two variables is described by a monotonic function, and it does not assume any particular distribution of the variables [31].
The Spearman correlation coefficient for a sample $A_i$, $B_i$ of size n can be described as follows: first, consider the ranks of the samples, $rg_{A_i}$ and $rg_{B_i}$; then the Spearman correlation coefficient $r_s$ is calculated as:

$$ r_s = \rho_{rg_A, rg_B} = \frac{\mathrm{cov}(rg_A, rg_B)}{\sigma_{rg_A}\,\sigma_{rg_B}} \qquad (1) $$

where
• $\rho$ denotes the Pearson correlation coefficient, applied to the rank variables.
• $\mathrm{cov}(rg_A, rg_B)$ is the covariance of the rank variables.
• $\sigma_{rg_A}$ and $\sigma_{rg_B}$ are the standard deviations of the rank variables.
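As an illustration of the definition above (Pearson correlation applied to the ranks), a minimal sketch follows. It assumes NumPy and, for simplicity, samples without ties; the function names `ranks` and `spearman` are illustrative and not taken from the paper.

```python
import numpy as np

def ranks(x):
    """Ranks 1..n of the values in x (assumption: no tied values)."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x)
    r = np.empty(len(x), dtype=float)
    r[order] = np.arange(1, len(x) + 1)
    return r

def spearman(a, b):
    """Pearson correlation coefficient of the rank variables rg_A and rg_B."""
    rg_a, rg_b = ranks(a), ranks(b)
    cov = np.mean((rg_a - rg_a.mean()) * (rg_b - rg_b.mean()))
    return cov / (rg_a.std() * rg_b.std())

# A monotonic but nonlinear relationship still gives r_s close to 1.
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(spearman(a, a ** 3))
print(spearman(a, -a))
```

Because only the ranks enter the computation, any strictly increasing transformation of one variable leaves the coefficient unchanged.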
The cointegration approach was introduced by Engle and Granger [32] and it considers a different type of co-movement. In this case, cointegration refers to movements in prices, not in returns, so cointegration and correlation are related, but different, concepts. In fact, cointegrated series may well have a low correlation.
Two stocks A and B are said to be cointegrated if there exists $\gamma$ such that $P_t^A - \gamma P_t^B$ is a stationary process, where $P_t^A$ and $P_t^B$ are the log-prices of A and B, respectively. In this case, the following model is considered:

$$ P_t^A - \gamma P_t^B = \mu + \epsilon_t \qquad (2) $$

where
• $\mu$ is the mean of the cointegration model.
• $\epsilon_t$ is the cointegration residual, which is a stationary, mean-reverting process.
• $\gamma$ is the cointegration coefficient.
We will use the ordinary least squares (OLS) method to estimate the regression parameters. Through the Augmented Dickey-Fuller test, we will verify whether the residual $\epsilon_t$ is stationary or not, and with it we will check whether the stocks are cointegrated.
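The OLS step can be sketched as follows. This is only an illustration on synthetic data, assuming NumPy; the function name `cointegration_ols` is hypothetical. The stationarity check itself is not implemented here; in practice the returned residual would be passed to an Augmented Dickey-Fuller routine (for instance `adfuller` from `statsmodels.tsa.stattools`).

```python
import numpy as np

def cointegration_ols(log_p_a, log_p_b):
    """OLS fit of log_p_a = mu + gamma * log_p_b + eps; returns (mu, gamma, eps)."""
    y = np.asarray(log_p_a, dtype=float)
    x = np.asarray(log_p_b, dtype=float)
    design = np.column_stack([np.ones_like(x), x])      # intercept and slope columns
    (mu, gamma), *_ = np.linalg.lstsq(design, y, rcond=None)
    eps = y - mu - gamma * x                            # cointegration residual
    return mu, gamma, eps

# Synthetic example (assumed data): B follows a random walk, A = 0.5 + 1.5 * B + noise.
rng = np.random.default_rng(0)
log_p_b = np.cumsum(0.01 * rng.standard_normal(500))
log_p_a = 0.5 + 1.5 * log_p_b + 0.02 * rng.standard_normal(500)
mu, gamma, eps = cointegration_ols(log_p_a, log_p_b)
print(mu, gamma)
```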

The Distance Method
This methodology was introduced by Gatev et al. [16]. It is based on minimizing the sum of squared differences between somehow normalized price series:

$$ D_{A,B} = \sum_t \left( S_A(t) - S_B(t) \right)^2 \qquad (3) $$

where $S_A(t)$ is the cumulative return of stock A at time t and $S_B(t)$ is the cumulative return of stock B at time t.
The best pair will be the pair whose distance between its stocks is the lowest possible, since this means that the stocks move in sync and there is a high degree of co-movement between them.
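A small sketch of this selection rule is given below. It assumes NumPy and takes the normalized series of a stock to be its price divided by its first price (one possible normalization, not necessarily the one used in the paper); the tickers and prices are made up for the example.

```python
import numpy as np
from itertools import combinations

def normalize(prices):
    """Cumulative return index: price divided by the initial price (assumption)."""
    prices = np.asarray(prices, dtype=float)
    return prices / prices[0]

def distance(prices_a, prices_b):
    """Sum of squared differences between the two normalized series."""
    return float(np.sum((normalize(prices_a) - normalize(prices_b)) ** 2))

def closest_pairs(prices, n_pairs):
    """Return the n_pairs pairs of tickers with the smallest distance."""
    scored = [(distance(prices[a], prices[b]), a, b)
              for a, b in combinations(sorted(prices), 2)]
    return [(a, b) for _, a, b in sorted(scored)[:n_pairs]]

# Hypothetical tickers and prices, for illustration only.
prices = {"AAA": [10.0, 11.0, 12.0], "BBB": [20.0, 22.0, 24.2], "CCC": [5.0, 4.0, 3.0]}
print(closest_pairs(prices, 1))
```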
An interesting contribution to this trading system was introduced by Do and Faff [33,34]. The authors replicated this methodology for the U.S. CRSP stock universe and an extended period. The authors confirmed a declining profitability in pairs trading as well as the unprofitability of the trading strategy due to the inclusion of trading costs. Do and Faff then refined the selection method to improve the pair selection. The authors restricted the possible combinations only within the 48 Fama-French industries and they looked for pairs with a high number of zero-crossings to favor the pairs with greatest mean-reversion behavior.

Pairs Trading Strategy Based on Hurst Exponent
The Hurst exponent (H from now on) was introduced by Hurst in 1951 [35] to deal with the problem of reservoir control for the Nile River Dam. Until the beginning of the 21st century, the most common methodologies to estimate H were the R/S analysis [36] and the DFA [37], but due to accuracy problems remarked on by several studies (see for example References [38-41]), new algorithms were developed for a more efficient estimation of the Hurst exponent, some of them focused on financial time series. One of the most important methodologies is the GHE algorithm, introduced in Reference [42], which is a general algorithm with good properties.
The GHE is based on the scaling behavior of the statistic

$$ K_q(\tau) = \frac{\langle |X(t+\tau) - X(t)|^q \rangle}{\langle |X(t)|^q \rangle} \propto \tau^{qH} \qquad (4) $$

where $\tau$ is the scale (usually chosen between 1 and a quarter of the length of the series), H is the Hurst exponent, $\langle \cdot \rangle$ denotes the sample average over time t and q is the order of the moment considered. In this paper we will always use q = 1. The GHE is calculated by linear regression, taking logarithms in the expression contained in (4) for different values of $\tau$ [3,43].
The interpretation of H is as follows: when H is greater than 0.5, the process is persistent; when H is less than 0.5, it is anti-persistent; Brownian motion has a value of H equal to 0.5.
With this technique, the pairs with the lowest Hurst exponent have to be chosen in order to apply reversion to the mean strategies, which are also the basis of the correlation and cointegration strategies.
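To make the estimation procedure concrete, the following sketch computes the GHE by a log-log regression. It assumes NumPy and the usual scaling relation in which the statistic grows as the scale raised to the power qH; the function name `ghe` and the choice of scales in the example are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def ghe(x, q=1, tau_max=None):
    """Generalized Hurst exponent from the slope of log K_q(tau) against log tau."""
    x = np.asarray(x, dtype=float)
    if tau_max is None:
        tau_max = len(x) // 4                 # a quarter of the series length
    taus = np.arange(1, tau_max + 1)
    denom = np.mean(np.abs(x) ** q)           # does not depend on tau
    k = np.array([np.mean(np.abs(x[tau:] - x[:-tau]) ** q) / denom for tau in taus])
    slope = np.polyfit(np.log(taus), np.log(k), 1)[0]
    return slope / q                          # slope equals q * H under the assumed scaling

# Synthetic checks: a random walk should give H near 0.5,
# while white noise (strongly anti-persistent) should give H near 0.
rng = np.random.default_rng(1)
walk = np.cumsum(rng.standard_normal(20000))
noise = rng.standard_normal(20000)
print(ghe(walk, tau_max=20), ghe(noise, tau_max=20))
```

A small maximum scale is used in the example only to keep the estimate stable on a short synthetic series.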

Pairs Trading Strategy
Next, we describe the pairs trading strategy, which is taken from Reference [11]. As usual, we consider two periods. The first one is the formation period (one year), which is used for the pair selection. This is done using the four methods defined in this section (distance, correlation, cointegration and Hurst exponent). The second period is the execution period (six months), in which all selected pairs are traded as follows:
• In case s > m + σ the pair will be sold. The position will be closed if s < m or s > m + 2σ.
• In case s < m − σ the pair will be bought. The position will be closed if s > m or s < m − 2σ.
where s is the series of the pair, m is a moving average of the series of the pair and σ is the corresponding moving standard deviation.
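The trading rule can be sketched as a small state machine. The sketch reads s as the current value of the pair series, m as its moving average and σ as a moving standard deviation; it assumes NumPy, a hypothetical window of 20 observations, and that both moving statistics are computed from the observations preceding the current one. The multiplier `k` (equal to 1 in the rule above) is included only for convenience.

```python
import numpy as np

def pair_positions(s, window=20, k=1.0):
    """Position per time step: -1 pair sold, +1 pair bought, 0 no position."""
    s = np.asarray(s, dtype=float)
    positions = np.zeros(len(s), dtype=int)
    pos = 0
    for t in range(window, len(s)):
        hist = s[t - window:t]                 # assumption: window excludes s[t]
        m, sigma = hist.mean(), hist.std()
        if pos == 0:
            if s[t] > m + k * sigma:
                pos = -1                       # sell the pair
            elif s[t] < m - k * sigma:
                pos = 1                        # buy the pair
        elif pos == -1:
            if s[t] < m or s[t] > m + 2 * k * sigma:
                pos = 0                        # reversion reached or stop triggered
        else:
            if s[t] > m or s[t] < m - 2 * k * sigma:
                pos = 0
        positions[t] = pos
    return positions

# Toy series: oscillates around 0, then jumps above m + sigma and reverts.
s_short = np.array([1.0, -1.0] * 10 + [1.5, -0.5])
print(pair_positions(s_short)[-2:])
```

In the toy series the pair is sold when the value 1.5 exceeds m + σ and the position is closed at the next step, when the series falls below the moving average.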

Forming the Pair: Some New Proposals
As we remarked previously, all previous works assume that the amount purchased in one stock is equal to the amount sold in the other component of the pair. The main contribution of this paper is to analyze whether dropping the equal weight assumption in the formation of the pair improves the performance of the different pairs trading strategies. In this section different methods are proposed.
When a pair is formed, we use two stocks A and B. These two stocks have to be normalized somehow, so we introduce a constant b such that stock A is comparable to stock bB. Then, to buy an amount T of the pair AB means that we buy $\frac{1}{b+1}T$ of stock A and sell $\frac{b}{b+1}T$ of stock B, while to sell an amount T of the pair AB means that we sell $\frac{1}{b+1}T$ of stock A and buy $\frac{b}{b+1}T$ of stock B. We will denote by $p_X(t)$ the logarithm of the price of stock X at time t minus the logarithm of the price of stock X at time t = 0, that is, $p_X(t) = \log(\mathrm{price}_X(t)) - \log(\mathrm{price}_X(0))$, and by $r_X(t)$ the log-return of stock X between times t − 1 and t, $r_X(t) = p_X(t) - p_X(t-1)$.
In this paper we discuss the following ways to calculate the weight factor b:

1. Equal weight (EW).
In this case b = 1. This is the way used in most of the literature. In this case, the position in the pair is dollar neutral. This method was used in Reference [16], and since then, it has become the most popular procedure to fix b.

2. Based on volatility.
The volatility of stock A is $\mathrm{std}(r_A)$ and the volatility of stock B is $\mathrm{std}(r_B)$. If we want A and bB to have the same volatility, then $b = \mathrm{std}(r_A)/\mathrm{std}(r_B)$. This approach was used in Reference [11] and it is based on the idea that both stocks are normalized if they have the same volatility.

3. Based on minimal distance of the log-prices.
In this case we minimize the function $f(b) = \sum_t |p_A(t) - b\,p_B(t)|$, so we look for the weight factor b such that $p_A$ and $b\,p_B$ have the minimum distance. This approach is based on the same idea as the distance selection method. The closer the evolution of the log-prices of stocks A and bB, the stronger the mean-reverting properties of the pair will be.

4. Based on correlation of returns.
If returns are correlated then $r_A$ is approximately equal to $b\,r_B$, where b is obtained by the linear regression $r_A = b\,r_B$. In this case, if the returns of stocks A and B are correlated, then the distributions of $r_A$ and $b\,r_B$ will be the same, so we can use this b to normalize both stocks.

5. Based on cointegration of the prices.
If the prices (in fact, the log-prices) of both stocks A and B are cointegrated then $p_A - b\,p_B$ is stationary, whence b is obtained by the linear regression $p_A = b\,p_B$. In this case, this value of b makes the pair series stationary, so we can expect mean-reverting properties of the pair series. Even if the stocks A and B are not perfectly cointegrated, this method for the calculation of b may still be valid since, though $p_A - b\,p_B$ may not be stationary, it can be close to stationary or still have mean-reversion properties.

6. Based on lowest Hurst exponent of the pair.
The series of the pair is defined as $s(b)(t) = p_A(t) - b\,p_B(t)$. In this case, we look for the weight factor b such that the series of the pair s(b) has the lowest Hurst exponent, which implies that the series is as anti-persistent as possible. So we look for the b that minimizes the function $f(b) = H(s(b))$, where $H(s(b))$ is the Hurst exponent of the pair series s(b). The idea here is similar to the cointegration method but, from a theoretical point of view, we do not expect $p_A - b\,p_B$ to be stationary (which is quite difficult with real stocks), but to be anti-persistent, which is enough for our trading strategy.
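The six ways of fixing b can be compared side by side in a short sketch. Everything below is an illustration under stated assumptions rather than the authors' implementation: NumPy is used, the regressions are taken through the origin (as the expressions for the regressions suggest), the distance is taken as a sum of absolute differences of log-prices, and the distance and Hurst criteria are minimized over a hypothetical grid of candidate values of b between 0.1 and 5.

```python
import numpy as np

def hurst_q1(x, tau_max=20):
    """GHE with q = 1: slope of log <|X(t+tau) - X(t)|> against log tau."""
    taus = np.arange(1, tau_max + 1)
    k = [np.mean(np.abs(x[tau:] - x[:-tau])) for tau in taus]
    return np.polyfit(np.log(taus), np.log(k), 1)[0]

def weight_factor(p_a, p_b, method, grid=None):
    """Weight factor b from the log-price series p_a and p_b."""
    p_a, p_b = np.asarray(p_a, dtype=float), np.asarray(p_b, dtype=float)
    r_a, r_b = np.diff(p_a), np.diff(p_b)            # log-returns
    if grid is None:
        grid = np.linspace(0.1, 5.0, 50)             # hypothetical search grid
    if method == "ew":
        return 1.0
    if method == "volatility":
        return float(np.std(r_a) / np.std(r_b))
    if method == "distance":
        costs = [np.sum(np.abs(p_a - b * p_b)) for b in grid]
        return float(grid[int(np.argmin(costs))])
    if method == "correlation":                      # regression r_a = b * r_b
        return float(r_a @ r_b / (r_b @ r_b))
    if method == "cointegration":                    # regression p_a = b * p_b
        return float(p_a @ p_b / (p_b @ p_b))
    if method == "hurst":
        hursts = [hurst_q1(p_a - b * p_b) for b in grid]
        return float(grid[int(np.argmin(hursts))])
    raise ValueError(f"unknown method: {method}")

def allocation(total, b):
    """Amounts placed in stock A and stock B when trading an amount total of the pair."""
    return total / (b + 1), b * total / (b + 1)

# Synthetic log-prices (assumed data): p_a is roughly twice p_b plus noise.
rng = np.random.default_rng(0)
p_b = np.concatenate([[0.0], np.cumsum(0.01 * rng.standard_normal(2000))])
p_a = 2.0 * p_b + 0.01 * rng.standard_normal(2001)
for name in ["ew", "volatility", "distance", "correlation", "cointegration", "hurst"]:
    print(name, weight_factor(p_a, p_b, name))
print(allocation(100.0, 3.0))
```

With b = 3, for instance, an amount of 100 in the pair corresponds to 25 in stock A and 75 in stock B.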

Experimental Results
For testing the different models introduced in this paper, we use the components of the Nasdaq 100 index technological sector (see Table A1 in Appendix A) over two periods: January 1999 to December 2003, coinciding with the "dot.com" bubble crash, and January 2007 to December 2012, coinciding with the financial instability caused by the "subprime" crisis. These periods are chosen based on the results shown by Sánchez et al. [9].
We use Pairs Trading traditional methods (Distance Method, Correlation and Cointegration) in addition to the method developed by Ramos et al. [11] based on the Hurst exponent.
Appendix B shows the results obtained for the different selection methods and the different ways to calculate b, for the two selected periods. In addition to the returns obtained for each portfolio of pairs, we include two indicators of portfolio performance and risk, the Sharpe ratio and the maximum drawdown.
In the first period analyzed, the EW method to calculate b is never the best one. The best methods to calculate b seem to be the cointegration method and the minimization of the Hurst exponent. Also note that the Spearman correlation, the cointegration and the Hurst exponent selection methods provide strategies with high Sharpe ratios for several methods to calculate b.
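The text does not spell out how these two indicators are computed, so the following is only a sketch of one common convention. It assumes NumPy, a series of periodic log-returns of the strategy, a zero risk-free rate and 252 trading periods per year; all of these are assumptions made for the example.

```python
import numpy as np

def sharpe_ratio(log_returns, periods_per_year=252):
    """Annualized mean return over standard deviation (risk-free rate assumed zero)."""
    r = np.asarray(log_returns, dtype=float)
    return float(np.sqrt(periods_per_year) * r.mean() / r.std(ddof=1))

def max_drawdown(log_returns):
    """Largest relative fall of the wealth curve from a previous peak."""
    wealth = np.exp(np.cumsum(np.asarray(log_returns, dtype=float)))
    wealth = np.concatenate([[1.0], wealth])       # start from an initial wealth of 1
    peak = np.maximum.accumulate(wealth)
    return float(np.max(1.0 - wealth / peak))

# Toy log-returns, for illustration only.
toy = [0.1, -0.2, 0.05]
print(sharpe_ratio(toy), max_drawdown(toy))
```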
In the second period analyzed, the EW method to calculate b works fine with the cointegration selection method, but it is not so good with the other ones, while the correlation method to calculate b is often one of the best ones.
Note that, in both periods, the Sharpe ratios obtained when we use EW to calculate b are usually quite low with respect to the other methods. Figures 1-4 show the cumulative log-return of the strategy for different selection methods and different ways to calculate b. Figure 1 shows the returns obtained for the period 1999-2003 using the cointegration approach as a selection method. We can observe that during the whole period, the best option is to calculate the b factor by means of the lowest value of the Hurst exponent, while the EW method is the worst. In Figure 2, it can be observed that during the period studied, the results obtained using the EW method are also negative, while the Hurst exponent method is again the best option.
For the period 2007-2012, for a portfolio composed of 20 pairs selected using the distance method, Figure 3 shows the cumulative returns for the different methods proposed. In this case we can highlight the methods of correlation, minimal distance and cointegration as the methods to calculate b that provide the highest returns. Again, we can observe that the worst options would be the EW method together with the volatility one.
Figure 4 shows the results obtained using the different models to calculate the b factor for a portfolio of 10 pairs selected using the Spearman model. We can observe that all returns are positive throughout the period studied (2007-2012). The most outstanding are the methods of correlation, minimal distance and volatility, which move in a very similar way during this period. On the contrary, the method of the lowest value of the Hurst exponent and the EW one are the worst options during the whole period.
Finally, we complete our sensitivity analysis by analyzing the influence of the strategy considered in Section 2.4. We consider the Hurst exponent as the selection method, 20 pairs in the portfolio and the period 1999-2003. We change the strategy by using 1 (as before), 1.5 and 2 standard deviations. That is, we modify the strategy as follows:
• In case s > m + kσ the pair will be sold. The position will be closed if s < m or s > m + 2kσ.
• In case s < m − kσ the pair will be bought. The position will be closed if s > m or s < m − 2kσ.
where k = 1, 1.5, 2. Table A2 shows that the EW, correlation and minimal distance methods obtain the worst results, while cointegration and the Hurst exponent obtain robust and better results for the different values of k.

Discussion of the Results
Tables A3-A10 show the results obtained with a pairs trading strategy. In those tables, we have considered four different methods for the pair selection (distance, correlation, cointegration and Hurst exponent), three different numbers of pairs (10, 20 and 30 pairs) and two periods (1999-2003 and 2007-2012). Overall, if we focus on the Sharpe ratio of the results, in 58% of the cases (14 out of 24) the EW method for calculating b obtains one of the three (out of seven) worst results. If we compare the EW method with the other methods proposed we obtain the following: minimal Hurst exponent is better than EW in 58% of the cases, minimal distance is better than EW in 58% of the cases, correlation is better than EW in 67% of the cases, cointegration is better than EW in 58% of the cases and volatility is better than EW in 50% of the cases. So, in general, the proposed methods (except the volatility one) tend to be better than the EW one.
However, since we are considering stocks in the technology sector, if we focus on the dot.com bubble (that is, the period 1999-2003), which affected the stocks in the portfolio more drastically, we find, considering the Sharpe ratio of the results, that in 83% of the cases (10 out of 12) the EW method for calculating b obtains one of the three (out of seven) worst results. In this period, if we compare the EW method with the other methods proposed we obtain the following: minimal Hurst exponent is better than EW in 75% of the cases, minimal distance is better than EW in 83% of the cases, correlation is better than EW in 83% of the cases, cointegration is better than EW in 83% of the cases and volatility is better than EW in 58% of the cases. So, in general, the proposed methods (except the volatility one) tend to be much better than the EW one in this period.
On the other hand, in the second period (2007-2012), the EW performs much better than in the first period (1999-2003) and it does similarly or slightly better than the other methods.
Results show that these novel approaches used to calculate the factor b improve the results obtained compared with the classic EW method for the different strategies, mainly in the first period considered (1999-2003). Therefore, it seems that the performance of pairs trading can be improved not only by acting on the strategy, but also on the method for the allocation in each stock.
In this section we have tested different methods for the allocation in each stock of the pair. Though we have used the different allocation methods with all the selection methods analyzed, it is clear that some combinations make more sense than others. For example, if the selection of the pair is done by selecting the pair with a lower Hurst exponent, the allocation method based on the minimization of the Hurst exponent of the pair should work better than other allocation methods.
One of the main goals of this paper is to point out that the allocation in each stock of the pair can be improved in the pairs trading strategy, and we have given some ways to make this allocation. However, further research is needed to assess which of the methods is the best for this purpose or, even better, which of the combinations of selection and allocation methods is the best. Though this problem depends on many factors, some of which change depending on investor preferences, a multi-criteria decision analysis (see, for example, References [44-46]) seems to be a good approach to deal with it.
In fact, in future research it can be tested if the selection method can be improved if we take into account the allocation method. For example, for the distance selection method, we can use the allocation method based on the minimization of the distance to normalize the price of the stocks in a different way than in the classical distance selection method, taking into account the allocation in each stock. Not all selection methods can be improved in this way (for example, the correlation selection method will not improve), but some of them, including some methods which we have not analyzed in this paper or future selection methods, could be improved.

Conclusions
In the pairs trading literature, researchers have focused their attention on increasing pairs trading performance by proposing different methodologies for pair selection. However, in all cases it is assumed that the amount invested in each stock of a pair must be equal (b = 1). This technique is called Equally Weighted (EW). This paper presents a novel approach to try to improve the performance of this statistical arbitrage technique through novel methodologies in the calculation of b. Any selection method can benefit from these new allocation methods. Depending on the selection method used, we show that the new methodologies for calculating the factor b obtain a greater return than those used up to the present time.
Results show that the classic EW method does not perform as well as the others. Cointegration, correlation and Hurst exponent give excellent results when they are used to calculate the factor b.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix B. Empirical Results
For each model (Equal Weight, Volatility, Minimal Distance of the log-prices, Correlation of returns, Cointegration of the prices, lowest Hurst exponent of the pair), we have considered 3 scenarios, depending on the number of pairs included in the portfolio.
