An Entropic Approach for Pair Trading in PSX

The premise of pair trading is that when two stocks move together, their prices will converge to a mean value in the future. However, finding the mean-reversion point to which the value of the pair will converge, as well as the optimal boundaries of the trade, is not easy, as uncertainty and model misspecification may lead to losses. To address these problems, this study employed a novel entropic approach that uses entropy as a penalty function for model misspecification. The use of entropy as a measure of risk in pair trading is a nascent idea, and this study used daily data for 64 companies listed on the Pakistan Stock Exchange (PSX) for the years 2017, 2018, and 2019 to compute their returns based on the entropic approach. The returns to these stocks were then evaluated and compared with the buy and hold strategy. The results show positive and significant returns from pair trading using the entropic approach. The entropic approach seems to have an edge over the buy and hold, distance-based, and machine learning approaches in the context of the Pakistani market.


Introduction
Pair trading is a statistical arbitrage strategy whose driving mechanism, according to quantitative models, is mean reversion. The premise is that when two stocks move together, their prices will converge to a mean value in the future [1]. Mean reversion allows traders to make a profit by matching a long position in one stock with an offsetting position in another stock [2]. Pair trading is an efficient method for the formation of portfolios [3,4]; however, finding accurate pairs and boundary points is not an easy task.
The profitability of pair trading decreased due to an increasing share of non-converging pairs [5]. To resolve the issue of non-converging pairs, several researchers contributed to the literature [6][7][8][9] and proposed cointegration as the most efficient solution for structuring pair trading [10].
Once accurate pairs can be found, the next problem is how to find the mean-reversion point between them and how to identify the boundaries that tell investors exactly when to buy or sell. Yoshikawa [11] derived entropy-based optimal boundary points for pair trading using Tokyo Stock Exchange 2015 data. The proposed approach to the optimal stopping problem is motivated by the work of Ekström et al. [12] and Suzuki [13]. The method maximizes the profit from pair trading while minimizing the relative entropy (risk). It is robust, as it directly tackles model misspecification [14] and provides a more persuasive solution. The choice of pairs is made through cointegration, the most effective way to identify stocks that move together [15]. Entropy has wide application in finance as well [16][17][18].
In the context of Pakistan, only a handful of studies have been conducted on pair trading [19,20], and, interestingly, none has yet considered the optimal stopping problem using stocks listed on the Pakistan Stock Exchange (PSX). This study employs the novel entropic approach proposed by Yoshikawa [11] to explore the optimal boundary points that yield the maximum profit for 64 companies listed on the PSX for the period 2017-2019.
The concept of maximizing the profit in pair trading based on relative entropy is a nascent idea in the literature, and this study is the first attempt in the context of Pakistan. The performance of this entropic approach is compared with the buy and hold strategy in terms of returns.

Data & Methodology
As mentioned in the last section, this study used daily data for 64 companies listed on the PSX for the years 2017, 2018, and 2019. These companies cover the major sectors, including cement, chemical, automobile assembler, food and personal care products, oil and gas marketing companies, oil and gas exploration companies, power generation and distribution, refinery, and pharmaceuticals. The selection criterion was based on year-wise price earnings ratios (PER): a firm was included in a given year if its PER was lower than the sample median. The underlying idea is that a stock with a PER below the median is undervalued and signifies potential for higher returns [21,22]. The choice of pairs was made through Johansen cointegration, which is the most effective way to identify stocks that move together [15]. In each year, we formed all (n² − n)/2 pairs of the n selected stocks and assessed each pair for cointegration.
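The selection and pairing steps can be sketched in a few lines of Python. This is an illustrative sketch only: the tickers and PER values are made up, and the helper names `select_undervalued` and `candidate_pairs` are hypothetical rather than taken from the study. It shows the below-median PER filter and the enumeration of all unordered pairs, of which there are (n² − n)/2 for n stocks.

```python
from itertools import combinations
from statistics import median


def select_undervalued(per_by_ticker):
    """Keep tickers whose PER is strictly below the sample median."""
    cutoff = median(per_by_ticker.values())
    return sorted(t for t, per in per_by_ticker.items() if per < cutoff)


def candidate_pairs(tickers):
    """Return all unordered pairs; n tickers give (n^2 - n) / 2 pairs."""
    return list(combinations(tickers, 2))


# Hypothetical PER values for six made-up tickers (not PSX data).
per = {"AAA": 4.0, "BBB": 6.5, "CCC": 9.0, "DDD": 12.0, "EEE": 15.5, "FFF": 21.0}
selected = select_undervalued(per)   # tickers below the median PER of 10.5
pairs = candidate_pairs(selected)    # every pair would then be tested for cointegration
```

Each pair returned by `candidate_pairs` would then be passed to a cointegration test; that step is omitted here because it depends on the statistical package used.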
Keeping in view the potential jumps/structural breaks in high-frequency financial data [23], the following breakpoint unit root test proposed by Bai and Perron [24] was employed.
where $\mu_t$ is white noise.

Ornstein-Uhlenbeck (OU) Process
Pair trading exploits the mean reversion of the composite process of two stocks. Following Yoshikawa [11], we considered the Ornstein-Uhlenbeck (OU) process $X_t$ such that

$$dX_t = \mu(\alpha - X_t)\,dt + \sigma\,dB_t, \tag{2}$$

where µ and σ are positive constants, α is the mean-reversion point, and $B_t$ is a P-Brownian motion. Let $\tilde{X}_t = X_t - \alpha$. Then, Equation (2) implies:

$$d\tilde{X}_t = -\mu\tilde{X}_t\,dt + \sigma\,dB_t. \tag{3}$$

Equation (4) defines the optimal stopping problem at time t for the process $\tilde{X}_t$ over the set of all stopping times of B, where ρ is the discount rate. The solution of Equation (4) gives us the trading strategy: we short pair X when it attains the highest value and liquidate it when $\tilde{X}$ attains zero value. Alternatively, we take the long position in X at zero value and liquidate it at the highest value. The superscript S in Equation (4) denotes the solution to an entropy-penalized problem in which λ is a positive constant and H(·) is the relative entropy. The optimal boundary for Equation (4) is denoted b(t); its derivation is given in Yoshikawa [11].

Any investor holding pair X should liquidate when X touches b(t) and, if not holding X, should short the pair when X touches b(t) and liquidate it when it reverts to mean zero.
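To make the mean-reverting dynamics concrete, the following Python sketch simulates an OU path using the exact one-step transition of the standard OU equation dX = µ(α − X)dt + σ dB. The function name, the parameter values, and the daily step size are assumptions chosen for illustration; they are not estimates from the PSX data.

```python
import math
import random


def simulate_ou(x0, mu, alpha, sigma, dt, n_steps, seed=0):
    """Simulate an OU path with the exact Gaussian transition over a step dt.

    One step: X_next = alpha + (X - alpha) * exp(-mu * dt) + noise,
    where noise has variance sigma^2 * (1 - exp(-2 * mu * dt)) / (2 * mu).
    """
    if mu <= 0.0:
        raise ValueError("mu must be positive for mean reversion")
    rng = random.Random(seed)
    decay = math.exp(-mu * dt)
    step_sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * mu))
    path = [x0]
    for _ in range(n_steps):
        prev = path[-1]
        path.append(alpha + (prev - alpha) * decay + step_sd * rng.gauss(0.0, 1.0))
    return path


# Illustrative values only: one year of daily steps, starting above the mean.
path = simulate_ou(x0=70.0, mu=5.0, alpha=60.0, sigma=8.0, dt=1.0 / 250, n_steps=250)
```

With σ set to zero the path decays deterministically toward α, which is a quick way to check the role of µ as the speed of mean reversion.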

Results and Discussion
From the eight selected sectors, we found 64 active firms listed on the PSX for the years 2017, 2018, and 2019. After applying the PER benchmark, we obtained 33, 34, and 40 companies, respectively. Having selected the companies, we applied the unit root test to the time series of these stocks to find the order of integration. All the time series are integrated of order one. This led us to search for cointegrated pairs using the Johansen cointegration test at the 0.05 level of significance. We found 79 unique cointegrated pairs (28 + 29 + 23 = 80, of which one pair was repeated) out of 1869 (528 + 561 + 780) candidate pairs of the selected stocks over the 3-year period.
Having found the pairs, we applied the maximum likelihood method to estimate the parameters of the Ornstein-Uhlenbeck processes, µ, α, and σ, as given in Equation (2). MATLAB R2021b was used for the coding and estimation of these parameters. However, to compute the optimal boundary points, we needed the parameters ρ and λ as well. The parameter ρ is the discount rate, and the parameter λ represents the level of confidence. The lower the value of λ, the lower the confidence of the agent in the reference measure as the true probability measure among the class of all probability measures, and vice versa. We used ρ = 0.08978, 0.1315, and 0.1440 as per the annual reports of the State Bank of Pakistan for 2017, 2018, and 2019, respectively, and, following Yoshikawa [11], considered four cases for the parameter λ: 0.001, 0.01, 0.1, and ∞. Tables 1 and 2 present the results for only five pairs of stocks in each year involving the top listed companies (see Appendix A, Tables A1-A7 for the results of other companies). After computing the values of µ, α, and σ as furnished in Table 1, we estimated the rate of returns for different values of λ for the selected companies (Table 2). On balance, pair trading yielded optimal returns for lower values of λ, which is understandable, as λ is linked with the penalty function. All the estimated parameter values are presented in Figures 1-3 and in Table 2 for their respective years. From these figures, it is evident that the values of the mean reversion parameter differ when the stocks in the pair are selected within a sector in comparison to when the stocks are selected across sectors.
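The estimation itself was carried out in MATLAB. As a rough Python counterpart, one common way to obtain maximum likelihood estimates for an OU process sampled at a fixed interval is to regress each observation on its predecessor, because the exact discretization of the OU process is an AR(1) model. The sketch below assumes that approach (conditional likelihood, equal spacing `dt`); `fit_ou` is a hypothetical helper and is not the authors' routine.

```python
import math
import random


def fit_ou(path, dt):
    """Estimate (mu, alpha, sigma) from an equally spaced path via AR(1) regression.

    Model assumed: X_next = a + b * X + eps, with b = exp(-mu * dt),
    a = alpha * (1 - b), and Var(eps) = sigma^2 * (1 - b^2) / (2 * mu).
    """
    if len(path) < 3:
        raise ValueError("need at least three observations")
    x, y = path[:-1], path[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                      # least-squares slope
    a = mean_y - b * mean_x            # least-squares intercept
    if not 0.0 < b < 1.0:
        raise ValueError("slope outside (0, 1): no mean reversion detected")
    resid_var = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y)) / n
    mu = -math.log(b) / dt
    alpha = a / (1.0 - b)
    sigma = math.sqrt(resid_var * 2.0 * mu / (1.0 - b * b))
    return mu, alpha, sigma


# Demo on a synthetic AR(1) path with strong mean reversion (made-up values).
rng = random.Random(42)
demo = [70.0]
for _ in range(500):
    demo.append(60.0 + 0.5 * (demo[-1] - 60.0) + rng.gauss(0.0, 1.0))
mu_hat, alpha_hat, sigma_hat = fit_ou(demo, dt=1.0)
```

The guard on the slope reflects the fact that the mapping back to µ is only defined for a mean-reverting series, which is consistent with restricting the analysis to cointegrated pairs.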
For the real data sets, the pair trading strategy was to set the position when the pair value touched either the mean-reversion point or the boundary. For example, in Figure 1 (pair: PSO and MPLF), the mean-reversion point was 60.29, where we set the position, and we liquidated the position when the pair value touched the boundary b(t). If the position was set when the pair value touched the boundary, it was liquidated when the pair value touched the mean-reversion point α. In Figure 2 (pair: PSO and BYCO), if we set our position when the pair value touched the boundary, we would liquidate at the mean-reversion point, α = 9.26. The next position was set when the pair value touched either the boundary b(t) or the mean-reversion point α and was liquidated following the same rule.
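The entry and exit rule described above can be sketched as a small state machine. The Python sketch below rests on simplifying assumptions that are not from the study: the boundary is held constant instead of following the time-dependent b(t), trades are filled at the observed value on the day the level is touched or crossed, and costs are ignored. The function names and the sample series are hypothetical.

```python
def crossed(level, prev, cur):
    """True if the path touches or crosses `level` between two observations."""
    return (prev - level) * (cur - level) <= 0.0


def backtest(spread, alpha, boundary):
    """Return the profit (in pair-value units) of each completed round trip.

    Assumptions: constant upper boundary, fills at the observed value,
    no transaction costs.
    """
    if boundary <= alpha:
        raise ValueError("boundary must lie above the mean-reversion point")
    position = 0      # +1 long the pair, -1 short the pair, 0 flat
    entry = 0.0
    profits = []
    for prev, cur in zip(spread, spread[1:]):
        if position == 0:
            if crossed(boundary, prev, cur):
                position, entry = -1, cur      # short the pair at the boundary
            elif crossed(alpha, prev, cur):
                position, entry = 1, cur       # go long at the mean-reversion point
        elif position == 1 and crossed(boundary, prev, cur):
            profits.append(cur - entry)        # liquidate the long at the boundary
            position = 0
        elif position == -1 and crossed(alpha, prev, cur):
            profits.append(entry - cur)        # liquidate the short at the mean
            position = 0
    return profits


spread = [1.0, -0.5, 1.0, 2.5, 1.5, -0.2]      # made-up pair values
profits = backtest(spread, alpha=0.0, boundary=2.0)
```

On this toy series the rule completes two round trips: a long opened near the mean and closed at the boundary, followed by a short opened at the boundary and closed near the mean.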
Following this trading strategy, we estimated the rate of returns for the 80 cointegrated pairs (79 of them unique) for the years 2017, 2018, and 2019. Gatev et al. [6] highlighted the transaction fee as an obstacle in trading. Because the transaction cost on the Pakistan Stock Exchange is 0.15 percent and pair trading involves two stocks, we reduced our return values by 0.3 percent. Table 2 provides these return values for five pairs from each year. The return values ranged from 0.2 to 25.2 percent for the year 2017, 0.4 to 19.5 percent for the year 2018, and 1.5 to 15.7 percent for the year 2019. All returns are positive, confirming profitable trades, which is in line with the findings in the literature [1,11]. For all cointegrated pairs (Appendix A), average return values ranged from 2.9% to 18.5%, which is much higher than the return values estimated in [25], which ranged from 0.1% to 1.71% using the distance-based approach for stocks listed on the PSX during the period 2009-2016. Shaukat et al. (2021) employed the distance-based method to select the pairs and compute returns to pair trading for the financial (banks) and non-financial (cement industries) sectors with a formation period of 12 months. Cement industries yielded higher returns, whereas the banks yielded lower returns. Sohail et al. [20] estimated the return on pair trading using 80 stocks from five different sectors (banking, chemicals, cement, textile, and food and care products), all of which were listed on the PSX from 2011 to 2019. Trading periods of two years and one year were used for the machine learning (clustering) algorithm and the distance-based method, respectively. The study found a maximum return of 2.07 percent for the textile sector using the distance-based approach, whereas the clustering algorithm yielded a maximum return of 2.55 percent.
The distance-based approach relies on the average squared differences between the normalized prices of stocks, and principal component analysis (PCA) is used to generate indices that represent the weighted average prices of the stocks for use in the machine learning algorithm. By construction, PCA indices resemble those generated with the cointegration technique; we found parameters α and β such that the linear combination of the two stock prices, αp₁ + βp₂, yielded a stationary process, whereas the weights in PCA may not yield stationary indices. Further, neither of the studies [20,25] allowed cross-sector pairing, which might explain their low returns in comparison with our study. The profitability of pair trading decreases due to non-convergence of the pairs [5], and cointegration is the most efficient method to explore converging pairs [10]. Thus, the entropic approach seems to have an edge over the distance-based and machine learning approaches in the context of the Pakistani market. Further, to evaluate our results, we contrasted them against the buy and hold strategy with trading periods of one quarter, one year, 2 years, and 3 years (Table 3). A trading period of one year is in line with the literature [20,25]. The rate of returns for the alternative strategy is summarized in Table 3. In general, except for 2019-Q4, the top-performing stocks made a loss under this strategy, whereas Table 2 shows that pair trading provided stable profits. The buy and hold strategy carries a considerable risk of human error, considering the pressure of all the wrong choices one can make [26]. The optimization of the boundaries backed by the Ornstein-Uhlenbeck process allowed us to incorporate all risks, improve the profitability of pair trading, and receive maximum positive returns [27]. Therefore, we suggest the pair trading strategy while taking model uncertainty into account.
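For readers unfamiliar with the two baselines discussed above, the following Python sketch shows one plausible way to compute them. It assumes that prices are normalized by dividing by the first observation and that the buy and hold return is the simple return between the first and last price; the cited studies may define these quantities differently. All function names and price series are hypothetical.

```python
def normalize(prices):
    """Scale a price series so that it starts at 1.0 (assumed normalization)."""
    base = prices[0]
    return [p / base for p in prices]


def mean_squared_distance(prices_a, prices_b):
    """Average squared difference between two normalized price series."""
    if len(prices_a) != len(prices_b):
        raise ValueError("series must have the same length")
    a, b = normalize(prices_a), normalize(prices_b)
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def buy_and_hold_return(prices):
    """Simple return from buying at the first price and selling at the last."""
    return prices[-1] / prices[0] - 1.0


# Made-up series: two stocks that move proportionally, and one falling stock.
distance = mean_squared_distance([10.0, 11.0, 12.0], [20.0, 22.0, 24.0])
held = buy_and_hold_return([100.0, 90.0, 80.0])
```

A small distance only says that two normalized price paths stayed close over the formation period; unlike cointegration, it does not test whether a linear combination of the prices is stationary.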

Conclusions
This study employed a novel entropic approach to explore the optimal boundary points that yield maximum profit for 64 companies listed on the Pakistan Stock Exchange (PSX) for the period 2017-2019. The concept of maximizing the profit in pair trading based on relative entropy is a nascent idea in literature, and this study is the first attempt to implement it in the context of Pakistan. The performance of this entropic approach is contrasted with the buy and hold strategy in terms of returns. The following are the key findings of the study.

1. The values of the mean reversion parameter differ when the stocks in the pair are selected within a sector in comparison to when the stocks are selected across sectors.
2. On balance, optimal returns are associated with lower values of λ; approximately 84 percent of pairs yielded optimal returns for low values of λ (λ = 0.001 and 0.01).
3. The return values based on the entropic pair trading approach range from 0.2 to 25.2 percent for the year 2017, 0.4 to 19.5 percent for the year 2018, and 1.5 to 15.7 percent for the year 2019. These values are much higher than the returns estimated in [20,25].
4. Under the buy and hold strategy, the top-performing stocks generally make a loss, with 2019-Q4 as the exception.
5. The entropic approach seems to have an edge over the buy and hold, distance-based, and machine learning approaches in the context of the Pakistani market.
Pair trading is an efficient method that allows maximization of profitability by exploiting short-term price deviations from long-term historical pricing relationships. The entropy-based pair trading method yielded positive returns for all the cointegrated pairs tested, which is in line with the findings in the literature [1,11]. According to the efficient market hypothesis (EMH), an active investor cannot be more effective than one who buys and holds. Therefore, the returns estimated from the entropic approach were contrasted against the returns estimated through the buy and hold strategy. The buy and hold strategy yielded negative returns, implying losses, except in a few cases. Consequently, we suggest the pair trading strategy while taking model uncertainty into account.