A sequential method of detecting abrupt changes in the correlation coefficient and its application to Bering Sea climate

A new method of regime shift detection in the correlation coefficient is proposed. The method is designed to find multiple change-points with unknown locations in time series. It signals a possible regime shift in real time and allows for its monitoring. The method is tested on randomly generated time series with predefined change-points. It is applied to examine structural changes in the Bering Sea climate. A major shift is found in 1967, which coincides with a transition from a zonal type of atmospheric circulation to a meridional one. The roles of the Siberian and Alaskan centers of action on winter temperatures in the eastern Bering Sea have been investigated.


Introduction and Problem Formulation
The correlation coefficient is a principal statistical tool and the most widely used measure of a relationship between two variables [1]. When using this tool, it is assumed that the nature of the relationship is linear and can be modelled by a simple linear regression. Large natural systems, however, such as the climate system, exhibit behaviors that are far more complex and seem to require equally complex, possibly nonlinear, models to simulate them adequately. An alternative view, which is adopted here, is that such systems may be governed by simple rules, but the parameters of those rules are changing, possibly in an abrupt fashion. Indeed, quasi-permanent regimes with abrupt changes between them are characteristic of the climate system and have been observed on different time scales [2][3][4][5][6][7]. In this paradigm, a major task would be to determine the timing of those abrupt changes, also known as breaks, regime shifts, discontinuities, or change-points. In practice, when new data arrive over time, it is also important to allow for monitoring of those shifts and detect them as soon as possible.

OPEN ACCESS
Given this framework, a relationship between two variables can be written in a sequential manner as a dynamic regression model [8]: = β + , = 1, … , + 1, …, (1) where and are observations at time of the independent and dependent variables, respectively, β are the regression coefficients and are the error terms or residuals. The sample {( 1 , 1 ), … , ( , )} is called the historic sample, for which observations are available, and new observations are expected to arrive after time . Let this historic sample be a random sample from a bivariate normal distribution with means μ , μ , variances σ 2 , σ 2 , and population correlation coefficient ρ. The idea is to estimate the regression coefficient just once for the history period 1, … , , assuming it is constant during all that time, i.e., β ≡ β 0 . Alternatively, one can estimate the correlation coefficient instead, because in a simple linear regression, the values of the ordinary least squares (OLS) estimate β and the Pearson's correlation coefficient are the same when the time series and are normalized by their respective standard deviations , . Then a process is devised that captures fluctuations either in estimates or in residuals of a regression model during the monitoring period. Whenever there are excessive fluctuations in these processes, that is, β or for > become significantly different from β 0 or 0 , respectively, or the residuals deviate systematically from their zero mean, the null hypothesis of stability is rejected and a regime shift is declared. This approach works well in statistical quality control applications, when the underlying process is known and the constancy of the correlation coefficient during the history period can be assured [9]. In the case of natural systems, however, there is no guarantee that the data collected came from the same population. On the contrary, it is quite possible that there were already multiple regime shifts during the history period that occurred at unknown times. Therefore, a more appropriate model would be = β + , = , + 1, … , +1 − 1, = 1, … , , + 1, … where each change point is the first point of regime . The initial task here is to find the number of regimes and locations of change-points , = 2, … , before the monitoring phase can begin. To resolve this task, both sequential and non-sequential (also known as retrospective or "ex-post") approaches can be used. The retrospective tests usually fall into one of the two categories: (1) Cumulative sum (CUSUM) and fluctuation tests or (2) -tests, such as the Chow test [8]. The classic CUSUM tests are based on the behavior of recursive residuals [10]. Fluctuation tests are CUSUM-type tests, but deal with changes in estimates of the correlation coefficients β [8]. While the tests in the first category can be used in both retrospective and monitoring settings, the -tests cannot be applied to monitor out-of-sample stability each time new data arrive [11]. Due to the law of the iterated logarithm, repeated application of such tests yields a procedure that rejects a true null hypothesis of no change with probability approaching one as the number of applications grows [12]. In addition, all retrospective tests have a common problem: the drastic deterioration of the test statistics toward the end of time-series. Therefore, the null hypothesis of constancy of the regression coefficient for the last regime in the historic sample, = , … , , would be weak and the transition to the monitoring phase would be problematic.
Staying within the sequential framework, one can use subsamples of the data (e.g., in a form of a sliding window as in Leisch et al. [13]), assuming that the subsample size is equal or smaller than the length of anticipated regimes. This will improve data homogeneity within the subsamples, but increase the standard error of the correlation coefficient. It is known that the Pearson's is a biased estimator of ρ, and the bias increases as the sample size decreases [14]. The bias can be substantially reduced by an estimator suggested by Fisher [15] and almost completely eliminated by a related estimator recommended by Olkin and Pratt [16]. A much bigger issue, however, is that even if the correlation coefficient remains constant within a subsample, there might be changes in the means and variances in one or both variables occurring at unknown times. In these circumstances, an accurate estimation of the correlation coefficient directly from data, without addressing this issue first, is not feasible.
A solution suggested here represents a three-step procedure discussed in detail in Section 2. In Section 3, this procedure is tested using synthetic time series with predefined statistics. It is demonstrated that failure to remove regime shifts in the mean and variance first may result in spurious regime shifts in the correlation, whereas true regime shifts may not be detected at all. Section 4 discusses a real world example of structural changes in the Bering Sea climate. This example was previously used by Wang et al. [17] who found a shift in the Arctic influence on the region in the 1960s. Since they used smoothed (by 5-yr running means) temperature data series and 25-yr running correlation coefficients as a tool, the timing of the shift was approximate. The shift is reexamined in Section 4 using the proposed method. The results are summarized in Section 5.

The Three-Step Procedure
The procedure starts with the detection of regime shifts in the means of both and time series. It then proceeds with the detection of regime shifts in the variances of the residuals after the first step. Finally, it finds the regime shifts in the correlation coefficient after the residuals are normalized by the standard deviations of each regime found in step two.

Regime Shifts in the Mean
Most methods related to abrupt structural changes with unknown timing focus solely on such changes in the mean [18]. Rodionov [19] proposed a new method that can detect multiple shifts in the mean level of fluctuations for a given time scale. The method, known as STARS (Sequential T-test Analysis of Regime Shifts), has been used in numerous research projects, primarily in climatology (e.g., [20]) and marine ecosystem research [21]. Lately, it has found its way into other areas, such as oceanography [22], hydrology [23], biochemistry [24], metrology [25], and economics [26].
STARS falls into a category of CUSUM-type tests. It uses the -statistic to estimate a threshold, or critical level, for the new regime to be detected. In this forward-looking approach, cumulative sums of deviations are calculated not from the mean of the "current" regime, but from that critical level for a new regime. This makes it easier to automate the decision process when testing a potential change-point. The critical level for a new regime ̅ is defined as ̅ ↑ = ̅ + δ (if the shift is up), or ̅ ↓ = ̅ − δ (if the shift is down), where ̅ is the mean of the "current" regime and Here, is the value of Student's -distribution with 2 − 2 degrees of freedom for a given probability level , is the cut-off length of the regimes (somewhat similar to the cut-off point in low-pass filtering), and ̅ 2 is the average variance for running -point intervals in the time series. If the value of a newly arrived observation at time is outside the interval [ ̅ ↓ , ̅ ↑ ], it is marked as a potential change-point , which triggers the calculation of the regime shift index (RSI) for this point Equation (4) shows that the next − 1 points after are used to test the null hypothesis of no regime shift at . If the RSI retains its sign during this test, then the null hypothesis is rejected, and becomes a true change point +1 . If at any time the RSI changes its sign, the test for stops. The value of at = is now considered just a random fluctuation that belongs to the "current" regime . The mean value for the regime, ̅ , is recalculated, and the search for the next change-point continues.
In the case of autocorrelation, or "red noise", in the time series, it is important to filter it out, using a procedure known as "prewhitening". When a time series contains both the autocorrelation and regime shifts, it is impossible to estimate the autocorrelation coefficient using the entire historic sample. It can be done using subsamples of the size smaller than . It is known, however, that the smaller the sample, the larger the bias of the OLS estimator of the autocorrelation coefficient [27]. The bias can be substantially reduced using one of the two techniques: (1) MPK, named after Marriott and Pope [28] and Kendall [29]; or (2) IP4 (Inverse Proportionality with 4 corrections), as described in [30]. The IP4 technique is based on the assumption that the bias is approximately inversely proportional to the subsample size and is always negative [31]. Since the residual bias is also inversely proportional to the subsample size, three additional corrections practically eliminate the bias. Monte Carlo simulations showed that for subsample sizes smaller than 10, IP4 substantially outperforms MPK in terms of both the magnitude of the bias and variability of the estimates [30]. Both these techniques became part of the STARS package.
After all the regime shifts in the mean are detected in both and , the stepwise trends are removed. The residual time series ′ and ′ , which represent deviations from the respective stepwise trends, are then passed to step two for detection of regime shifts in the variance.

Regime Shifts in the Variance
The procedure for detecting regime shifts in the variance is similar to the one for the mean, except that, instead of the -statistic, it uses the -statistic to estimate the critical variance for the new regime. The -test consists of comparing the ratio of sample variances for two adjacent regimes with the critical value : Here is the value of the -distribution with ν 1 and ν 2 degrees of freedom (where ν 1 = ν 2 = − 1) and a significance level (two-tailed test), = ( 2 ⁄ , ν 1 , ν 2 ). For the new regime to be statistically different from the "current" regime, variance +1 2 should be greater than critical variance 2↑ , if the variance increases, or smaller than 2↓ , if the variance decreases, where If at the "current" time , the value of ′2 is greater than 2↑ or smaller than 2↓ , this point is marked as a potential change-point , and the subsequent − 1 data points are used to test the null hypothesis of no regime shift. The decision rule is similar to the one used for shifts in the mean, but based on the residual sum of squares index (RSSI): If at any point during this testing period the index turns negative in the case of increasing variance, or positive in the case of decreasing variance, the test for stops, because the null hypothesis cannot be rejected. The data point ′2 is included in the "current" regime , and its variance 2 is recalculated. Otherwise, the null hypothesis is rejected, and becomes a true change-point +1 .
After finding all the regime shits in both ′ and ′ , the time series are normalized by dividing each value by the standard deviation for the corresponding regime. The normalized time series * and * are then passed to step three for detection of regime shifts in the correlation.

Regime Shifts in the Correlation Coefficient
The detection of regime shifts in the correlation coefficient between * and * is based on the formula for the variance of their sum [1]: * + * 2 = * 2 + * 2 + 2 * * Since in our case the variables are normalized, i.e., ̅ * = ̅ * = 0 and * 2 = * 2 = 1 , the formula becomes * + * 2 = 2(1 + ) Equation (9) shows that, as the correlation coefficient increases, the variance also increases from a minimum of 0 at = −1 to a maximum of 4 at = 1. To detect abrupt shifts in , the technique described in step 2, can be applied to a series of the sums * + * . It can also be applied to a series of the differences * − * since * − * 2 For practical reasons, the detection test is performed for both * + * and * − * series. Our experiments show that when the difference between the correlation coefficients for two adjacent regimes is statistically significant at ≤ 0.05, the detected change-points most often are the same for both series. For > 0.05, the change points may differ by a few time steps or may be detected only in one of the series. In the case of two competing potential change-points, the decision about which one to select is based on their -values computed using the Fisher r-to-z-transformation [32]. The change point with the lower -value is chosen as the final one.

An Example with Synthetic Time Series
The above three-step algorithm was implemented as a software package called the Sequential Regime Shift Detector (SRSD). It is written in VBA for MS Excel and is available at [33].
To illustrate the software capabilities, two correlated random time series of size = 70 were generated from the bivariate normal distribution with μ = μ = 0 and σ 2 = σ 2 = 1. The change-point for the correlation coefficient from regime 1 to regime 2 was set in the middle of the series at 2 = 36, with 1 = −0.6 in the first half and 2 = 0.6 in the second half. Then the shifts in the variance were added to time series from σ 1 2 = 1 to σ 2 2 = 9 at 2 = 51 and to time series from σ 1 2 = 9 to σ 2 2 = 1 at 2 = 21.
Finally, the shifts in the mean were superimposed to from μ 1 = −1 to μ 2 = 1 at 2 = 26 and to from μ 1 = 1 to μ 2 = −1 at 2 = 41. The resultant time series are presented in Figure 1. The addition of regime shifts in the variance and, particularly, in the mean significantly changes the direct OLS estimates of ρ. As shown in Figure 2, in the presence of those shifts, sample correlation coefficient tends to underestimate ρ by as much as 0.6. In some other cases, when the shifts in the mean and variance are placed differently, it may lead to an overestimation of ρ. Therefore, in order to detect shifts in the correlation coefficient accurately, it is necessary to remove shifts in the mean and variance first.
The SRSD starts by detecting shifts in the mean. In this example, the target significance level and the cut-off length are set at = 0.05 and = 20. It is always good to know an approximate length of the regimes in time series to set these two parameters right, but since there is only one strong regime shift in the mean in each of the series, some variations in and do not change the results. Figure 3a,b show that the detected change points ̂2 = 26 and ̂2 = 41 in and , respectively, coincide with the theoretical change points. The calculated -values in both cases are much lower than 0.05. After the stepwise trends are removed, the detection of regime shifts in the variance is performed using the same target significance level = 0.05 and the cut-off length = 20. The increase in the variance in (Figure 3c) and its decrease in (Figure 3d) were both accurately detected at ̂2 = 51 and ̂2 = 21, respectively.

Shifts in the mean and variance
In its final step, the SRSD uses the normalized time series to detect regime shifts in the correlation coefficient. As shown in Figure 4a, the regime shift in our synthetic time series is found at ̂2 = 36, which is precisely where it is supposed to be. Interestingly, when the first two steps are skipped, that is the regime shifts in the mean and variance are not removed, a shift in is incorrectly detected at ̂2 = 21 (Figure 4b). Although the difference in before and after this spurious shift appears to be statistically significant (the exact -value cannot be calculated due to violation of the assumptions for ), its timing is way off.

Change in the Climatological Structure of the Bering Sea
Wang et al. [17] reported a change of unknown origin in the climatological structure of the Bering Sea that occurred sometime in the 1960s. Using the running 25-yr correlation coefficients between winter (DJFM) surface air temperatures (SAT) at two Alaskan stations, Barrow (71.3°N, 156.8°W) and St. Paul (57.1°N, 170.2°W), they found a shift from about 0.2 to 0.7 between the 1940s and 1980s. They concluded that the shift was statistically significant at the 95% confidence level, based on their Monte Carlo simulations.

Shifts in the correlation coefficient between and
With the approach they used, it was difficult to determine the exact timing of the shift, especially when they smoothed the SAT time series with running 5-yr means, which only increased the autocorrelation in the series. In addition, a visual inspection of their Figure 2a reveals a strong shift in SAT at Barrow, from a cold regime before the late 1970s to a warm regime thereafter. As shown in Section 3, it may lead to large errors in estimation of the correlation coefficient.
The station SAT data used in this study was obtained from GISS [34]. To facilitate a comparison between SATs at Barrow and St. Paul, the data for both stations is presented as anomalies from the 1971-2000 base period normalized by the standard deviations for the same period. Figure 5a shows that SAT at Barrow experienced an abrupt, and statistically highly significant, shift upward in 1978, the -value of which was 8 • 10 −7 . It coincided with a major step in the North Pacific climate in the late 1970s [35]. Interestingly, no shift at that time was detected in SAT at St. Paul, despite its proximity to the North Pacific Ocean (Figure 5b). It should be noted that, using the same method, a weak shift in 1977 was detected for that station in [36]. The discrepancy can be explained by the difference in the data sources used, that is, the raw data from NCDC in [36] and adjusted data from GISS here. Apparently, the homogeneity adjustment technique used in GISS made the 1977 shift in SAT at St. Paul even weaker. In any case, the absence of a pronounced shift in SAT at St. Paul is a reflection of the complex climate dynamics in the region and the lack of a clear-cut connection with the major modes of climate variability in the Pacific. As shown in [37], the correlation coefficients between winter SAT at St. Paul and such indices as the Pacific Decadal Oscillation, Aleutian Low Pressure, Pacific-North American, and Southern Oscillation are all statistically insignificant.  [38], which implies the advection of cold continental air over the southeastern Bering Sea [36].

Shifts in the mean in SATs at Barrow and St. Paul
When the regime shifts in the mean were removed, the maximum r between SAT residuals at Barrow and St. Paul for any 25-yr sample decreased from 0.7 to 0.6. Since no regime shifts in the variance were detected, step two in the three-step procedure was skipped, and the test for shifts in the correlation coefficient was performed using the residuals after the first step. The target significance level and cut-off length were the same as for the tests of shifts in the mean (i.e., target = 0.1 and = 15). Here the focus will be on the latter shift due to its higher statistical significance and more readily available meteorological data to explore it further. Baines and Folland [5] reported a number of rapid climate changes in various parts of the globe centered on the late 1960s. In the Bering Sea region, that was the time when atmospheric circulation changed from a generally zonal flow to a meridional pattern [17]. Wang et al. [17] demonstrated that Arctic and Pacific air had fewer meridional excursions before the late 1960s. They alluded, however, that since the increase in the north/south correlation structure did not begin near the well-known shift in the late 1970s, but about a decade earlier, it might be more related to changes in the Siberian High.
The Siberian High is part of the so-called Siberian-Alaskan Index (SAI) designed specifically to characterize the effect of atmospheric circulation on thermal conditions in the Bering Sea. The SAI is defined as a difference between the mean winter (DJFM) normalized 700-hPa anomalies in two regions, Siberia (55°N-70°N, 90°E-150°E) and Alaska/Yukon (60°N-70°N, 130°W-160°W). Positive (negative) values of the index indicate anomalously strong north-westerly (south-easterly) winds and colder (warmer) than normal winters in the Bering Sea. The SAI is available at the Bering Climate web site [39]. It should be emphasized that the SAI is not a dipole, as, for example, the North Atlantic Oscillation. The two parts of the SAI, the Siberian Index (SI) and Alaskan Index (AI), are completely uncorrelated, and therefore, represent two independent sources of influence on Bering Sea climate. The SI reflects the strength of the Siberian High (Figure 7a) and the advection of cold Siberian air into the Bering Sea. Judging from the correlations between the SI and winter temperatures computed for the entire period of observations, 1949-2015 (Figure 7c), its effect on winter conditions in the Bering Sea is rather modest. Note that the correlation maps in Figure 7 are computed for the first differences rather than for absolute values. Taking first differences serves as a high-pass filter that practically eliminates the effect of regime shifts in the mean on the correlation coefficient.  (Figure 7c,d) shows that the linear effect of the Alaskan center of action on SATs in the Bering Sea is much stronger than that of the Siberian High. The AI is strongly correlated with SAT at Barrow as well, which can be seen in Figure 7c. For the entire period of observations, 1949-2015, the correlation coefficient between the AI and residuals of SAT at Barrow, after removing the stepwise trend, is 0.61. The results of regime shift detection in the mean and variance in the SI and AI are presented in Figure 8. Prior to 1966, the SI was mostly positive, indicating a strong Siberian High (Figure 8a). Another period of predominantly positive SI started in 2005. No shifts in the mean were detected for the AI (Figure 8b). As for the variance, a statistically significant decrease in the SI fluctuations since 2001 was detected (Figure 8c). In contrast, the AI index experienced a remarkable increase in its fluctuations in recent years, when it jumped from a record low value in 2012 to a record high value in 2014 (Figure 8d). However, due to a small number of observations in this high variance regime, normalization for the AI was not performed.
The results of regime shift detection in the correlation coefficient between SAT at St. Paul and the two atmospheric circulation indices, SI and AI, are presented in Figure 9. Although the effect of the SI on temperature in the Bering Sea appeared to be rather weak overall (Figure 7c), there was a strong correlation regime from 1966 through 1997, when reached −0.54, a statistically significant value at the 99.8% confidence level (Figure 9a). The linear relationship between the AI and SAT at St. Paul (Figure 9b)  It is interesting to compare atmospheric circulation patterns for the correlation regimes in Figure 9. As shown in Figure 10, the most prominent feature of atmospheric circulation during the strong correlation regime for the SI, 1966-1997, is a strong Siberian-North Pacific (NP) dipole in the 700-hPa field, with the correlation coefficient between the two centers reaching 0.8 (Figure 10c). During a positive phase of the dipole (SI+, NP−), the pressure gradient between the Siberian High and Aleutian low increases that leads to an enhanced advection of Arctic air directed into the Bering Sea. During its negative phase (SI−, NP+), the advection of Arctic air decreases, while the advection of warm Pacific air increases, apparently due to a more active Siberian storm track [36]. The NP center is weaker during the weak correlation regime of 1949-1965 ( Figure 10a) and completely disappears during the recent weak correlation regime of 1998-2015 (Figure 10e). Figure 11 shed some light on why the correlation coefficient between the AI and SAT at St. Paul was low during the period of 1949-1967. A comparison of Figure 11a,c shows that during the earlier period, the Alaskan center of action was expanded westward. In that situation the advection of warm Pacific air (AI+), or cold Arctic air (AI−) was directed into the western part of the Bering Sea, where the correlation coefficients with SATs reached 0.8, whereas they were statistically insignificant in the eastern part (Figure 11b). Note that the correlation between the AI and SAT at Barrow remained strong during this Shifts in the correlation coefficient earlier regime, and that the shift in the correlation for the pairs AI-St. Paul and Barrow-St. Paul occurred almost at the same time. This indicates that the primary reason for the break in the correlation for the latter pair in 1967 is likely to be the westward expansion and subsequent contraction of the Alaskan center of action, and not the change in the Arctic influence as suggested by Wang et al. [17].

Conclusions
A new method of regime shift detection in the correlation coefficient is proposed. It is capable of automatic detection of multiple change-points at unknown time. It also allows early warning and monitoring of regime shifts. The method is built on previous works by the author [19,30,41], which describe algorithms of sequential regime shift detection in the mean and variance, as well as the prewhitening procedure to eliminate the effect of red noise. Using the synthetic time series, this paper demonstrates that in the presence of shifts in the mean and variance, the direct OLS estimation of the correlation coefficient becomes unreliable. Therefore, a three-step procedure is suggested, which detects and removes the regime shifts in the mean and variance first, and then uses the sequential -test applied to the sums * + * and differences * − * of the residuals to detect regime shifts in the correlation coefficient.
The SRSD software based on this procedure has been applied to detect structural changes in the Bering Sea climate. It is shown that a major shift in the correlation between winter SATs at Barrow and St Paul occurred in 1967, which coincides with the change in the atmospheric circulation from a zonal to meridional pattern, as described in [17].
The role of the Siberian and Alaskan centers of action, which represent two independent sources of influence on winter thermal conditions in the Bering Sea, has been investigated. Although the overall effect of the Siberian center is much weaker than the Alaskan one, there was a period  when the correlation coefficient between the SI and SAT at St. Paul reached −0.59, which was statistically significant at the 99.98% confidence level. The principal feature during that strong correlation regime was a well-expressed North Pacific center of action, which acted in a coordinated fashion with the Siberian center. The effect of the Alaskan center of action on the eastern Bering Sea was relatively weak during the earlier period, 1949-1967. This can be explained by a westward expansion of the center during that period. As a result, fluctuations in the strength of the AI had more effect on the western Bering Sea. After the regime shift in 1968, almost 50% of the variance of winter SAT at St. Paul can be explained by the AI alone. This westward expansion of the Alaskan center before 1968 and its subsequent contraction appears to be the primary reason for the 1967 break in the correlation between SATs at Barrow and St. Paul.