Analyzing, Modeling, and Utilizing Observation Series Correlation in Capital Markets

Abstract: In this paper, we consider the analysis, modeling, and application of dependencies between asset quotes in various capital markets. As an example, we study the dependency between financial instrument observation series in the currency and stock markets. Our work intends to give a theoretical basis to asset management strategies that estimate an asset's price via regression, taking into account its correlated assets in various markets. Furthermore, we provide a way to increase the estimate quality using an evolutionary algorithm.


Introduction
The majority of prominent news in economics, politics, and other areas influences asset quotations and integral trading indicators in various capital markets. This fact makes it possible to hypothesize a significant correlation between these numerical indicators of market trade. In turn, the presence of correlation creates a background for developing asset management strategies based on it.
An obvious version of such a strategy is to promptly estimate the price of the instrument in use by means of multivariate statistical analysis. If the asset turns out to be under- or over-priced, i.e., its market value lies below or above the estimate, a probabilistic conclusion on its quotation dynamics can be drawn. This, in turn, makes it possible to construct various proactive management strategies that differ in the selection of regressors (i.e., the market parameters used for the multiregression estimate of a financial instrument's value), the width of the sliding window used for constructing statistical estimates, and other factors.
The theoretical basis of this management strategy is the hypothesis that various capital market asset quotations significantly correlate, as well as the assumption that correlation change dynamics are less variable than the changes in asset quotations.
The considered approach to the analysis and application of financial instrument quotation dependencies is focused on the stochastic chaos model, in which the conditions of stationarity and ergodicity of observation series do not hold [1][2][3][4]. In this case, conventional statistical forecast quality estimates are inherently nonoptimal, and the only available research method is a numerical comparative analysis run on a sufficiently large testing polygon. The current article is dedicated to the investigation of this issue.
Further research directions in the area of dependencies between financial instruments include the analysis of inter-market connections, presented in works such as [5][6][7][8]. This approach opens a wide perspective for constructing forecast algorithms that take into account the higher inertia of integral financial dynamics indicators. The subjects of this research are the evaluation of the efficiency of multivariate statistical analysis in conditions of chaotic dynamics, as well as the application of specialized econometric approaches used for analyzing causal relationships (for example, Granger causality [9]).

Methods
Let $Y = (y_j,\ j = 1, \dots, m)$ be a quotation state vector formed by $m$ parameters. Its individual components may pertain to different capital markets. In this case, the observation matrix $\mathbf{Y} = [y_{kj},\ k = 1, \dots, n;\ j = 1, \dots, m]$ defines a vector-valued random process in discrete time that reflects the dynamics of market asset quotations.
The most widely adopted direct vector observation model [10][11][12][13] is an additive scheme of the form $y_k = x_k + v_k$, $k = 1, \dots, n$, where $x_k$ is the systemic component (i.e., the one used for the analysis and development of management strategies) and $v_k$ is noise.
An example of the process $y_k$, $k = 1, \dots, n$, given by the USDCHF, EURUSD, EURJPY, and USDJPY quotations on a five-day observation interval with equidistant counts (dt = 1 min), can be seen in Figure 1. It can be easily seen that the systemic component $x_k$, $k = 1, \dots, n$, is an oscillatory nonperiodic process with a large number of local trends. In the general theory of dynamic processes, such observations are classified as chaotic [1][2][3][4]. The noise $v_k$, $k = 1, \dots, n$, is a nonstationary process roughly described by the Gaussian model with changing parameters. A more detailed justification of this model can be found in [1,3,14].
The presence of a chaotic structure in observation series of market assets contradicts the simplest quotation model based on a stationary Gaussian process. It violates the principal assumption of probabilistic analysis, namely the reproducibility of experiments, and thereby significantly obstructs the use of otherwise well-established methods: the conditions under which the existing computational schemes produce optimal results no longer hold. As a result, conventional analytic methods of evaluating algorithm efficiency become almost inapplicable. In this case, numerical studies of large observation samples become the main tool for analyzing the efficiency of estimation and management.
At the same time, useful patterns can also be found in trading asset quotation dynamics when they are represented by multivariate observation series. As will be shown below, increasing the observation interval used for estimating the correlation reveals strong dependencies between financial instruments. This makes it possible to obtain a regression estimate of the managed asset's value using its correlated observations. In this case, it is possible to construct an indicator of the discrepancy between the market value and the estimated value of the asset, which may indicate a potential change in the quotation's direction. In other words, this allows for forecasting the price dynamics using regression.
Let us now move on to the correlation analysis of changes in the state of the stock and currency markets. As the parameters of the observation vector, we used the most widespread currency pairs as well as two stock market indicators. They are listed in Table 1 and are referred to below by their numbers in that table. We represent the asset observation series as a rectangular table $X$ of size $n \times m$, in which $n$ is the number of observations (rows) and $m$ is the number of observed financial instruments (columns). In this case, the covariance matrix is given by the standard sample covariance formula. The estimates of the pairwise correlations between the stock indexes and the selected currency instruments for three 30-day intervals and one common 90-day interval are presented in Table 2.
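The covariance and correlation computations described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the 1/(n-1) normalization is an assumption, the function names are hypothetical, and the data are synthetic stand-ins for the quotation table.

```python
import numpy as np

def covariance_matrix(X):
    """Sample covariance of an n x m table (rows = observations, columns = instruments)."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)                      # subtract each column's mean
    return centered.T @ centered / (X.shape[0] - 1)    # assumed 1/(n-1) normalization

def correlation_matrix(X):
    """Pairwise Pearson correlations derived from the covariance matrix."""
    C = covariance_matrix(X)
    sd = np.sqrt(np.diag(C))
    return C / np.outer(sd, sd)

# Synthetic stand-in for the quotation table: 500 observations of 3 instruments.
rng = np.random.default_rng(0)
base = rng.normal(size=500).cumsum()
X = np.column_stack([base + rng.normal(size=500),
                     -base + rng.normal(size=500),
                     rng.normal(size=500).cumsum()])
R = correlation_matrix(X)   # m x m matrix with ones on the diagonal
```

The result agrees with `np.cov(X, rowvar=False)` and `np.corrcoef(X, rowvar=False)`, which can be used directly in practice.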
The correlations between the currency pairs from Table 1 and the FTSE and DJ indexes are presented as tonal matrices in Figure 2a,b. The estimates were performed using the time intervals drawn from Table 2.
In order to select regressors, it is important to know not only the values of the pairwise correlations, but also their dynamic properties. We estimated the pairwise correlations looping over time within a sliding observation window. The respective computation examples are presented in Figure 3. The presented plots show that the estimated correlation values become highly variable as the observation interval changes.
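A sliding-window correlation estimate of this kind can be sketched as below. The function name and the synthetic series are illustrative assumptions. The window is taken as trailing, i.e., it ends at the current observation.

```python
import numpy as np

def rolling_correlation(a, b, window):
    """Pearson correlation of a and b over each trailing window of `window` samples.

    The output has the same length as the inputs; the first window-1 entries are NaN.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    out = np.full(a.shape[0], np.nan)
    for k in range(window - 1, a.shape[0]):
        wa = a[k - window + 1:k + 1]
        wb = b[k - window + 1:k + 1]
        out[k] = np.corrcoef(wa, wb)[0, 1]
    return out

# Synthetic random-walk series standing in for two quotation series.
rng = np.random.default_rng(1)
n = 3000
a = rng.normal(size=n).cumsum()
b = 0.5 * a + rng.normal(size=n).cumsum()
corr_short = rolling_correlation(a, b, window=60)     # one hour of 1-min counts
corr_long = rolling_correlation(a, b, window=1440)    # one day of 1-min counts
```

Plotting `corr_short` and `corr_long` side by side is one way to compare the variability of the estimates for different window lengths.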
We selected EURUSD, EURJPY, GBPJPY, and NZDJPY as our currency instruments, setting the lengths of the sliding windows to 10, 5, and 1 observation days. Pairwise correlations between the currency instruments and the FTSE index of the London Stock Exchange were considered as an example.  The presented calculations show that decreasing the size of the observation window leads to a significant increase in variability of the pairwise correlations. Consequently, employing multiregressional analysis for "fast" asset management, such as intraday trading, would not be efficient.
Using a sliding window larger than five days reveals intervals with relatively stable correlation estimates. In such cases, these financial instruments can be used as regressors for estimating the managed indicators.
An immutable set of predictors cannot guarantee an efficient regression estimate for the indicators. This makes it necessary to structurally adapt the regression function to the correlational market changes.
Further enlargement of the sliding window increases the stability of pairwise correlations, but reduces the ability to react flexibly to possible variations in the correlations. A smaller sliding window increases the forecast quality, but may lead to false alarm errors caused by the fluctuating properties of the initial processes. Using robust processing algorithms is one of the ways of resolving this contradiction [15][16][17]. The difference $\delta_k = y_k - \hat{y}_k$ between the quotation $y_k$ of the managed asset and its regression estimate $\hat{y}_k$ is an oscillatory indicator of the under- or over-pricing of the current asset that allows for forecasting the direction of its quotation.
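The text does not spell out how the regression estimate is computed, so the following is only a sketch under stated assumptions: ordinary least squares with an intercept, refitted at every step on the preceding `window` samples, so that the fit never sees the current quotation. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def regression_estimate(y, regressors, window):
    """Estimate y[k] from the regressors at time k, using a least-squares fit
    on the previous `window` samples. The first `window` entries are NaN."""
    y = np.asarray(y, dtype=float)
    Z = np.column_stack([np.ones(len(y)), np.asarray(regressors, dtype=float)])
    y_hat = np.full(len(y), np.nan)
    for k in range(window, len(y)):
        coef, *_ = np.linalg.lstsq(Z[k - window:k], y[k - window:k], rcond=None)
        y_hat[k] = Z[k] @ coef
    return y_hat

# Synthetic stand-ins: three correlated regressor series and a managed asset.
rng = np.random.default_rng(2)
regressors = rng.normal(size=(300, 3)).cumsum(axis=0)
y = 1.0 + regressors @ np.array([0.5, -2.0, 3.0]) + rng.normal(scale=0.1, size=300)
y_hat = regression_estimate(y, regressors, window=100)
delta = y - y_hat   # quotation minus estimate: the oscillatory indicator
```

Refitting at every step is slow on long minute-level series. A recursive or batched update would be a natural optimization, which this sketch omits for clarity.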
As an illustration, Figure 4 contains the observation plots of USDCHF over a seven-day period with an observation interval of 1 min, as well as its regression forecast constructed with the use of the correlated quotations of EURUSD, EURJPY, and USDJPY. Figure 5 presents the difference $\delta_k$, $k = 1, \dots, n$, between the quotations of USDCHF and their regression estimates, as well as its smoothed version, obtained by applying an exponential filter with a transfer coefficient of $\alpha = 0.05$. Let us now consider the simplest application of this difference to the construction of an asset management strategy.
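The exact form of the exponential filter is not given in the text. A common first-order form, assumed here, weights the new sample by the transfer coefficient and the previous smoothed value by its complement. The function name is hypothetical.

```python
import numpy as np

def exponential_filter(x, alpha):
    """First-order recursive smoothing:
    s[0] = x[0], s[k] = alpha * x[k] + (1 - alpha) * s[k-1]."""
    x = np.asarray(x, dtype=float)
    s = np.empty_like(x)
    s[0] = x[0]
    for k in range(1, len(x)):
        s[k] = alpha * x[k] + (1.0 - alpha) * s[k - 1]
    return s

# Smoothing a noisy synthetic difference series with the coefficient from the text.
rng = np.random.default_rng(3)
raw = rng.normal(size=1000)
smoothed = exponential_filter(raw, alpha=0.05)
```

With a coefficient of 0.05, each new sample contributes 5% to the smoothed value, so short-lived spikes in the difference are strongly damped.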

Multiregressional Data Analysis-Based Management Strategy Constructed upon Financial Instrument Correlation
The simplest asset management strategy based on computational regression estimates can be constructed with the following criterion: $|\delta_k| > \delta^*$, $k = 1, \dots, n$, where $\delta_k = y_k - \hat{y}_k$ is the difference between the quotation and its regression estimate, and $\delta^*$ is the critical value.
If $\delta_k > \delta^*$, this means $y_k > \hat{y}_k$, i.e., the financial instrument is overpriced, and its price can be expected to go down. Conversely, $\delta_k < -\delta^*$, i.e., $y_k < \hat{y}_k$, indicates underpricing, and therefore the price should go up. A statistical approach produces solutions that are correct only to a certain confidence level. Ordinarily, the critical value $\delta^*$ is determined from the distribution tables of $\delta$ (or of a statistic constructed upon it with a known distribution law). Under nonstationary dynamics with a chaotic systemic component, such an approach is unfeasible. The critical value has to be selected based on a preliminary analysis of retrospective information drawn from a large observation interval.
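The decision rule above can be sketched in a few lines. The difference is taken as quotation minus estimate, so positive values point to overpricing. The text only says that the critical value comes from retrospective analysis, so using an empirical quantile of the absolute difference is purely an assumption made for illustration. Both function names are hypothetical.

```python
import numpy as np

def classify(delta, threshold):
    """Return +1 where the asset looks overpriced (delta > threshold),
    -1 where it looks underpriced (delta < -threshold), and 0 otherwise."""
    delta = np.asarray(delta, dtype=float)
    return np.where(delta > threshold, 1, np.where(delta < -threshold, -1, 0))

def empirical_threshold(delta_history, level=0.95):
    """Assumed selection rule: the `level` quantile of |delta| on retrospective data."""
    return float(np.quantile(np.abs(np.asarray(delta_history, dtype=float)), level))

signals = classify([5.0, -5.0, 1.0], threshold=4.0)   # overpriced, underpriced, neutral
```

Because the quantile is taken from the observed history rather than from a distribution table, it does not rely on stationarity or on a Gaussian law.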
As an example, let us consider how $\delta_k = y_k - \hat{y}_k$ changes for USDCHF on a five-day observation interval with 1-min counts. The respective plot, the smoothed version $\tilde{\delta}_k$ of the difference, and the decision levels are presented in Figure 6. For better readability, the value of the oscillator and its critical value are scaled up by a factor of 1.5. Figure 7 contains a histogram of $\delta_k$, $k = 1, \dots, n$, which demonstrates a weak convergence of the distribution of the difference to the Gaussian law.
According to the described management strategy, if the oscillator drops below the lower threshold, i.e., $\delta(k) < -\delta^*$, the asset is considered underpriced and one can open a long position. Alternatively, a short position can be opened if $\delta(k) > \delta^*$. Positions can be closed upon a reverse crossing of the threshold $\delta^*$, or by the conventional method of setting the "take profit" and "stop loss" levels.
Figure 6 shows that there is no stable trend toward either underpricing or overpricing. The reason for this is clear: the oscillator's movement in either direction is determined by the estimate obtained from the prices of the financial instruments used as regressors. At the same time, there are external factors that lead to the appearance of dynamic trends. The sum of these movements produces the final form of the dynamics, the direction of which is determined by a vector sum of heterogeneous and hard-to-forecast influencing factors. Because of this, our asset management strategy algorithm has to be adapted further to consider the additional external factors that affect the financial instrument's price.
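A toy back-test of the threshold strategy is sketched below. It rests on several assumptions: the oscillator is the quotation minus its estimate, so a value above the threshold reads as overpricing and triggers a short position, and a value below the negative threshold triggers a long one. A "reverse crossing" is read as the oscillator returning inside the band. A position still open at the end of the data is ignored. The function name and the numbers are hypothetical.

```python
import numpy as np

def backtest(price, oscillator, threshold):
    """Open short when oscillator > threshold, long when oscillator < -threshold;
    close when the oscillator re-enters the band. Returns per-trade results
    in price units (positive = profit)."""
    position, entry, results = 0, 0.0, []
    for p, d in zip(price, oscillator):
        if np.isnan(d):
            continue                         # no estimate available yet
        if position == 0:
            if d > threshold:
                position, entry = -1, p      # looks overpriced: sell
            elif d < -threshold:
                position, entry = 1, p       # looks underpriced: buy
        elif (position == -1 and d <= threshold) or (position == 1 and d >= -threshold):
            results.append(position * (p - entry))
            position = 0
    return results

# Toy numbers only: the oscillator is in arbitrary units here.
price = [1.0, 1.0, 0.9, 0.9, 1.0, 1.0]
osc = [0.0, 5.0, 1.0, -5.0, -1.0, 0.0]
trades = backtest(price, osc, threshold=4.0)
```

In this toy run, the short trade and the long trade each gain about 0.1 in price units. "Take profit" and "stop loss" exits are left out of the sketch.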

Evolutionary Adaptation of Our Multiregressional Asset Management Strategy Algorithm
A natural development of the presented study lies in answering the question of whether a winning regression-based strategy exists. This requires adapting the parameters of the management algorithm in order to obtain the most positive result. It is impossible to obtain a strict solution to an optimization problem in the conditions of chaotic dynamics. Because of this, we implemented a numerical iterative optimization based on evolutionary modeling [23][24][25][26][27][28]. In essence, this method is a modification of random search inspired by Darwinian evolution [23]. An example of how evolutionary modeling can be applied to optimize a management strategy can be found in [24].
A base algorithm of sequential adaptation is characterized by an array of modifiable parameters that form a genome vector $G = [g_1, g_2, \dots, g_p]$. In structural adaptation, the genome also contains the numbers of the regressors used for the estimation.
The criterion for opening a position during asset management, as stated previously, is $|\tilde{\delta}_k| > \delta^*$, where $\tilde{\delta}_k$ is the smoothed value of the difference between the observed value of the process and its current estimate, and $\delta^*$ is the critical value.
The starting value of the genome, $G_0 = [g_1, g_2, \dots, g_p]_0$, is estimated based on the results of the preliminary numerical analysis of the retrospective data. As threshold values for the modifications, we used the standard deviations (SDs) of the modified parameters observed on the specified interval.
The functional block diagram of the evolutionary adaptation of the asset value estimation algorithm is presented in Figure 8. Within the current study, we employed the following three parametric genome modifications:
1. Little single modification, LSM. This term refers to a small change (within a single SD) in a single randomly selected parameter of the parent genome.
2. Little group modification, LGM.
3. Strong single mutation, SSM, also known as parametric mutation. This refers to a significant modification (within three SDs) of a single randomly selected parameter.
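The three modifications listed above can be sketched as follows. The text gives no description of the LGM, so treating it as a small change (within one SD) applied to every parameter at once is an assumption. The uniform distribution of the shifts, the function names, and the SD values are also illustrative.

```python
import numpy as np

def lsm(genome, sd, rng):
    """Little single modification: shift one random parameter by at most one SD."""
    child = np.array(genome, dtype=float)
    i = rng.integers(len(child))
    child[i] += rng.uniform(-1.0, 1.0) * sd[i]
    return child

def lgm(genome, sd, rng):
    """Little group modification (assumed form): shift every parameter by at most one SD."""
    child = np.array(genome, dtype=float)
    return child + rng.uniform(-1.0, 1.0, size=len(child)) * np.asarray(sd, dtype=float)

def ssm(genome, sd, rng):
    """Strong single mutation: shift one random parameter by at most three SDs."""
    child = np.array(genome, dtype=float)
    i = rng.integers(len(child))
    child[i] += rng.uniform(-3.0, 3.0) * sd[i]
    return child

rng = np.random.default_rng(4)
parent = np.array([0.5, 0.03, 4.0])   # example genome
sd = np.array([0.1, 0.01, 1.0])       # hypothetical SDs of the parameters
children = [lsm(parent, sd, rng), lgm(parent, sd, rng), ssm(parent, sd, rng)]
```

In practice the children would also be clipped to the allowed parameter intervals before being evaluated.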
The first generation of parent genomes was formed by applying LSMs to the starting genome $G_0$. Next, using every mentioned type of modification, we created a group of child genomes, which, combined with their parents, constitute the first generation of our estimation algorithm parameters. The next step, in accordance with the computational scheme shown in Figure 8, is testing every modified version of the algorithm on a polygon of retrospective data. The result of the testing is evaluated using the terminal indicator of the quality of management, i.e., the gain. Next, all of the genome versions are ordered by their gain. During the selection process, the genomes with the lowest gain are pruned, and the best genomes become the parents of the next generation. Then, looping over the generations, the versions are modified and selected, i.e., they are sequentially adapted to the particularities of the currently observed asset quotation fragments. Theoretically rigorous optimality of the found solutions is not guaranteed. However, because the best genomes are carried over into each new generation and are evaluated on the same data, the best gain value cannot decrease as the number of generations increases. Furthermore, the presence of parametric and structural mutations allows the adaptation process to escape the vicinities of local extrema.
Note that in contrast with random search, every iteration preserves a finite group of genomes instead of a single best solution. This approach allows for obtaining the best terminal solution using sequences of intermediate nonoptimal solutions.
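The adaptation loop can be sketched as below, with parents kept in the population so that the best gain never decreases. This is an illustration under assumptions, not the authors' implementation. The modification counts per parent (three small single changes, one small group change, one strong single mutation), the bounds, the SDs, and the quadratic toy gain are all hypothetical. In the real setting, the gain would come from testing the strategy on retrospective data.

```python
import numpy as np

def mutate(genome, sd, rng, scale, single):
    """Shift one random parameter (single=True) or all of them by up to `scale` SDs."""
    child = np.array(genome, dtype=float)
    step = rng.uniform(-scale, scale, size=len(child)) * sd
    if single:
        mask = np.zeros(len(child), dtype=bool)
        mask[rng.integers(len(child))] = True
        step = np.where(mask, step, 0.0)
    return child + step

def evolve(fitness, g0, sd, bounds, generations=8, n_parents=4, seed=0):
    """Return the best genome and the best gain after each generation.

    fitness: callable mapping a genome to a gain (larger is better).
    bounds:  array of shape (n_params, 2) holding [low, high] for each parameter.
    """
    rng = np.random.default_rng(seed)
    sd = np.asarray(sd, dtype=float)
    low, high = np.asarray(bounds, dtype=float).T
    clip = lambda g: np.clip(g, low, high)
    # First parents: small single modifications of the starting genome.
    parents = [clip(mutate(g0, sd, rng, 1.0, True)) for _ in range(n_parents)]
    history = []
    for _ in range(generations):
        children = []
        for p in parents:
            children += [clip(mutate(p, sd, rng, 1.0, True)) for _ in range(3)]  # small single
            children.append(clip(mutate(p, sd, rng, 1.0, False)))                # small group
            children.append(clip(mutate(p, sd, rng, 3.0, True)))                 # strong single
        population = parents + children            # parents stay in the pool
        population.sort(key=fitness, reverse=True)
        parents = population[:n_parents]           # keep the best, prune the rest
        history.append(fitness(parents[0]))
    return parents[0], history

# Toy gain with a single maximum at `target` (a stand-in for a back-test result).
target = np.array([1.0, 0.05, 6.0])
sd = np.array([0.1, 0.01, 1.0])
bounds = np.array([[0.1, 5.0], [0.01, 0.2], [1.0, 20.0]])
fitness = lambda g: -float(np.sum(((g - target) / sd) ** 2))
best, history = evolve(fitness, [0.5, 0.03, 4.0], sd, bounds)
```

Because the gain function is deterministic and the parents compete with their children, `history` is non-decreasing by construction. A stochastic or re-sampled gain would not give this guarantee.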

An Evolutionally Adapted Implementation of a Regression Asset Management Strategy
Let us present a numerical example of using evolutionary adaptation for the regression asset management task. Consider USDCHF as the managed asset, and EURUSD, EURJPY, and USDJPY as regressors.
We used the parameter vector $G = [L, \alpha, \delta^*]$ as the genome, where $L$ is the size of the sliding window of data used to construct the regression estimate $\hat{y}$, $\alpha$ is the transfer coefficient of the smoothing exponential filter, and $\delta^*$ is the boundary (critical) value used for decision-making during asset management. The starting values of the modified parameters, $G_0 = [0.5, 0.03, 4]$, are estimates based on the analysis of USDCHF quotation dynamics over a 1-year interval with 1-min counts. The first parameter is measured in days, the second one is dimensionless, and the third one is measured in pips. The intervals of allowed variation in the genome parameters are defined by a matrix of bounds.
We used from 7 to 10 generation changes for evolutionary adaptation, preserving four parent genomes each time. For each parent, we performed three LSMs, a single LGM for all parameters, and one SSM. Thus, every generation included 24 genome versions: the 4 parents and their 4 × 5 = 20 modifications.
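Since the three genome parameters carry different units, it can help to convert them explicitly before use. The sketch below assumes round-the-clock 1-min quotes (1440 counts per day) and a pip of 0.0001 for USDCHF. The class and method names are hypothetical.

```python
from dataclasses import dataclass

MINUTES_PER_DAY = 24 * 60   # assumption: quotes arrive around the clock, one per minute
PIP = 0.0001                # assumption: pip size for a 4-decimal pair such as USDCHF

@dataclass
class Genome:
    window_days: float      # sliding window for the regression estimate, in days
    alpha: float            # transfer coefficient of the exponential filter
    threshold_pips: float   # critical value for opening a position, in pips

    def window_samples(self):
        """Window length expressed as a number of 1-min counts."""
        return int(round(self.window_days * MINUTES_PER_DAY))

    def threshold_price(self):
        """Critical value expressed in price units."""
        return self.threshold_pips * PIP

g0 = Genome(window_days=0.5, alpha=0.03, threshold_pips=4)
```

Under these assumptions, the starting genome corresponds to a window of 720 counts and a threshold of 0.0004 in price units.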
Next, we tested each genome version on the same data polygon. During selection, we preserved the four most successful genomes, making them parents of the next generation. The procedure was run for 7-8 iterations. The plot of the profit for the best solutions can be seen in Figure 9. The plot of the implemented regression asset management model that corresponds to the best genome is presented in Figure 10. For a better illustration, we used a short trading interval of 24 h. The plot shows the observed process of quotation dynamics $y(k)$, the criterial statistic $\tilde{\delta}(k)$ (scaled up for readability), and the decision levels $\pm\delta^*$ (horizontal lines). Stars on the line $y(k)$ denote the moments at which $\tilde{\delta}(k)$ crosses the $\pm\delta^*$ levels, i.e., position openings. Positions were closed at the moments corresponding to the reverse crossing of the levels $\pm\delta^*$ by $\tilde{\delta}(k)$. The graph denotes these moments either with diamonds (profit) or circles (loss). It can be seen that the result is not clear-cut: it is rather probabilistic. In the provided example, our evolutionarily adapted regression algorithm obtained 140 pips of profit in a single day.

Discussion
The obtained results let us conclude that the chaotic multivariate environment of trading assets contains explicit correlations, the presence of which allows for implementing multiregressional computational schemes for asset management.
Using multiregressional estimates allows for constructing oscillating indicators and, on their basis, an asset management strategy for multivariate chaotic environments that carries an inherent possibility of a positive result.
A significant distinction of this approach is that it does not require trend identification and usage. This makes it possible to move away from the conventional method of extrapolation forecasting, a step that is essential when working with chaotic processes. Further improvement in management quality is provided by evolutionary adaptation.
Using regression analysis allows for replacing the direct forecasting of a chaotic process with an indirect estimate of the forthcoming price dynamics of the financial instrument. Such an approach makes the forecast more stable in view of nonstationary variations of quotation dynamics.
It is important to note that the proposed approach, just as with any other oscillator, may lead to a negative result. In particular, an oscillator-based judgement of an instrument's overprice during a long-lasting positive trend may lead to a loss. These situations can be overcome with a complex solution based on multi-expert management. These issues will constitute the subject of further research.