An Alternative Estimate of Potential Predictability on the Madden–Julian Oscillation Phase Space Using S2S Models

This study proposes an alternative method to estimate the potential predictability without assuming the perfect model. A theoretical consideration relates a maximum possible value of the initial-value error to the covariance between analysis and bias-corrected ensemble-mean forecast. To test the method, the prediction limit of the Madden–Julian Oscillation (MJO) was evaluated, based on three pairs of reanalysis and forecast datasets provided by the European Centre for Medium-Range Weather Forecasting, the Japan Meteorological Agency and the National Centers for Environmental Prediction, participating in the subseasonal-to-seasonal prediction project. The results showed that the predictability was higher when MJO amplitude exceeded unity, consistent with the conventional method in which the error is evaluated as the ensemble-forecast spread. Moreover, the multimodel analysis was also conducted because the proposed method is readily applicable to the multimodel average of ensemble-mean forecasts. The phase dependency of the MJO’s potential predictability is also discussed.


Introduction
The Madden-Julian Oscillation (MJO) is dominant intraseasonal variability in the tropics [1,2].Typically, the convective center of an MJO event is formed over the Indian Ocean, propagates eastward at an average speed of ~5 m s −1 and dissipates in the region between Maritime Continent and the central Pacific.Because of its regular behavior, the MJO is expected to be potentially predictable for 3 weeks or longer [3].Most operational general circulation models (GCMs) have succeeded to forecast the planetary-scale phenomena related to the MJO for 2-3 weeks [4,5], although they are not capable of an accurate simulation of finer-scale aspects of MJOs such as precipitation distribution [6].
Dynamical MJO forecast has been evaluated as two aspects: prediction skill and potential predictability.The prediction skill on the MJO phase space, meaning model accuracy against the insufficient representation of the MJO, has been gradually increased.Rashid et al. [7] found that the MJO could be predicted for 21 days in an Australian GCM measured with a bivariate correlation coefficient (e.g., [8]).A new-generation cloud-resolving model also successfully predicted several MJO events [9].The monthly forecasts operated by the European Centre for Medium-Range Weather Forecasts (ECMWF) furthermore improved the skill to nearly a month for any initial MJO phases [10].On the other hand, the potential predictability, representing the intrinsic uncertainty due to the chaotic nature of the atmosphere, was usually estimated from a single-model ensemble forecast by calculating spread among members involved with a small initial perturbation.However, because most of the GCMs could not sustain the convective activity of MJO, the ensemble members tended to dump into the origin of MJO phase space.Waliser et al. [3] focused on this dumping problem and estimated the intrinsic limit of predictability as the forecast lead time at which spread multiplied by √ 2 attains the MJO amplitude.Their estimate was about 25-30 days for 200 hPa velocity potential.Neena et al. [5], applying the same method to each of eight GCMs, estimated that the potential predictability is 20-30 days in the MJO phase space.They also found the dependence of predictability on the initial phase of the MJO.In spite of a similar metric in an ensemble forecast analysis, Kim et al. [11] suggested that the potential predictability was insensitive to the initial MJO phase.
We now caution that all the above studies on the potential predictability implicitly assumed that a forecast model perfectly represented the MJO time-series in their estimates.However, the perfect-model assumption is not always reasonable for the MJO prediction, because most GCMs still underestimate the MJO amplitude and then distort the climatological probability distribution in the MJO phase space [6,10,12,13].If we developed an alternative approach to evaluate the potential predictability without the perfect-model assumption, we could extricate ourselves from model dependence in the estimation.The theoretical consideration by Kumar and Hoerling [14] may help us relate the magnitude of the initial-value error to the statistics of analysis and imperfect-model forecast data.If one removed the model bias averaged over the forecasts [15], the potential predictability would be related to the manner in how the forecasts are covariated with the analyses, as will be discussed later [16][17][18].Recently, the sub-seasonal to seasonal (S2S) prediction project has been launched and posed the MJO forecast as one of the most important targets [19][20][21].We can access a set of hindcasts with operational GCMs, each of which has different resolution and physical parameterizations.A model estimate of potential predictability has not been established yet, because the perfect-model assumption obstructs a reasonable compilation of results.Hence a new approach without the perfect-model assumption would be applicable to multimodel ensemble analysis.
The purpose of this study is to propose an alternative method to estimate the potential predictability without assuming the perfect model, which is completely different from a conventional method to evaluate the spread of initial perturbations in an ensemble forecast.The point of this paper is the strategy to cope with an averaged time-series over the ensemble members, posing the analysis time-series as its counterpart.A set of analyses and forecasts projected onto the MJO phase space starting from various dates are then prepared.In order to achieve this purpose, we first formulate the estimation method of potential predictability based on a set of analyses and forecasts.The method newly developed in this paper is capable of being applied to multimodel ensemble analysis.The rest of the paper is organized as follows.Section 2 gives the description on analysis and forecast data that we used here.The MJO phase space is also defined.Section 3 develops an alternative method to estimate potential predictability.Section 4 compares the new method with a conventional method, and then the new method is applied to multimodel ensemble analysis.Concluding remarks are given in Section 5.

Verification and Phase Space
The verification in this paper is an average of ERA-Interim of ECMWF [22], JRA55 of the Japan Meteorological Agency (JMA; [23]), and the Climate Forecast System (CFS) reanalysis version 2 from the National Centers for Environmental Prediction (NCEP; [24]).Presumably because of the sparseness of surface observation and the deficit of model parameterizations, a discrepancy among reanalyses is not negligible in the tropics [25,26].While most previous works evaluated GCMs' performance with their own reanalysis as the verification, the reanalysis average enabled us to fairly compare the prediction among models [27], which was expected to provide a more optimal estimate than a single analysis [28].
The MJO phase space is spanned by two gravest combined empirical orthogonal function (EOF) modes of zonal wind at 200 hPa and 850 hPa, averaged between 15 • S and 15 • N based on NCEP's CFS (Climate Forecast System) reanalysis version 2 from 1979 to 2015 [29].If the data were replaced with the reanalysis average, the MJO phase space changed little.We excluded the outgoing longwave radiation data from the combined EOF analysis for simplicity, as the two gravest EOF modes with outgoing longwave radiation and zonal winds have little difference from those with zonal winds only [4,30].Climatology and the 120-day average were removed from zonal wind prior to the EOF analysis to extract the intraseasonal variations, and the signal of the El Niño/Southern Oscillation was automatically removed in this processing [8].We classified the MJO phase as conventional and the non-MJO phase was defined as the state where the MJO amplitude is less than unity.

Forecast Data
We used reforecast data with three GCMs joining the S2S project (http://s2sprediction.net/):operational models of ECMWF [10], GSM1403C of the JMA [31] and CFS version 2 of NCEP [32].The details are provided in Table 1.Hereafter, each model is called by the name of its organization (e.g., the JMA model for GSM1403).The data period is from January 1999 to December 2009.Model forecast data were also projected on the phase space, after a removal of the seasonality and 120-day average using the reanalysis provided with the corresponding model.In S2S reforecast data, an initial perturbation was created using singular vector and ensemble data assimilation perturbation by ECMWF, using the bred vector and lagged averaging (LAF) method by the JMA, and using the LAF method for NCEP.A stumbling block when using the reforecast data joining the S2S project is a different interval of initial date and time, which makes it difficult to find a set of models initialized on the same date.The ECMWF model has 1397 forecasts that were initialized at 00 UTC of the calendar date, corresponding to Thursdays between January 2015 and May 2016, or Mondays from 14 May 2015 to 30 May 2016.The JMA model has 396 forecasts that ran from 12 UTC of the 10th, 20th and the last day of each month.The NCEP model has 3972 forecasts beginning from 00 UTC of every date during the period.For the multimodel ensemble forecast analysis, ECMWF, JMA and NCAR reforecasts were selected only if the difference of initial time is less than 12 h; for example, the JMA model's reforecast, initialized on the 20th of March at 12 UTC, was grouped in the multimodel analysis with the ECMWF and NCEP models' reforecasts, which were initialized on the 21st of March at 00 UTC.This selection was effective in matching the initial date among the forecasts, and we realized multimodel ensemble analysis with 250 independent forecasts.The list of calendar dates is provided in

Alternative Method to Estimate Potential Predictability
Consider the time-series of the true value, X o , and the perfect model hypothetically (Figure 1).We first clip a part of the time-series X o in the range from t 0 to t 0 + T, and the time segment is renamed as X o 1 (τ) for 0 ≤ τ ≤ T. We numbered the time segment (see below) in the subscript.The superscripts o, f and e mean the truth, the ensemble mean and the initial-value error, respectively, almost following the notation of data assimilation [33].The perfect model, as a thought experiment, would be repeatedly integrated from X o 1 (0) plus an initial error, the magnitude of which is infinitesimally small and randomly given.Taking an average over infinite ensemble members of the above experiment, we can obtain the time segments of the ensemble mean perfect-model forecast written as X f 1 here.We now emphasize that the nonlinearity intrinsic to the system causes the difference between X o 1 and X f 1 for τ > 0. This difference, expanding in time, is regarded as the initial-value error As τ approaches infinity, the ensemble members of the perfect model scatter over the phase space following the climatological probability density function (PDF) of the true value.The ensemble-mean X f 1 (τ) is asymptotic to the origin of the phase space, while the true value X o 1 (τ) is not converged to a certain value as τ → ∞ .Next, we labeled a set of time segments of the true value as and that of corresponding ensemble-mean forecasts as where N is the total number of selected time segments.As ( ) is uncorrelated with the initialvalue error ( ) for all time segments 1 ≤ ≤ , the variance of the true values among a set of time segments Equation ( 1), ( ) , can be represented by the variance of the ensemble-mean forecasts among a set of time segments Equation ( 2), ( ) , added by the initial-value error spread [16,34,35]: At = 0, ( ) is equal to as the initial-value error should be zero.As approaches infinity, ( ) becomes the variance of climatological probability density function (PDF).For this limit, since → for all time segments , the variance is exactly zero.Now we introduce the anomaly correlation coefficient (ACC) between the true value and ensemble-mean forecast as where the angle bracket denotes the average over the time segments.The ACC is high if the true value is well covariated with the forecast (Figure 2a); otherwise, it is low (Figure 2b).By using the relation ( ) = ( ) + ( ) and the independence between ( ) and ( ) , the ACC is reduced to the ratio of the standard deviation of the forecast to that of the true value: Next, we labeled a set of time segments of the true value as and that of corresponding ensemble-mean forecasts as where N is the total number of selected time segments.As X f j (τ) is uncorrelated with the initial-value error X e j (τ) for all time segments 1 ≤ j ≤ N, the variance of the true values among a set of time segments Equation (1), σ 2 [X o (τ)], can be represented by the variance of the ensemble-mean forecasts among a set of time segments Equation (2), σ 2 X f (τ) , added by the initial-value error spread [16,34,35]: At τ = 0, σ 2 (X o ) is equal to σ 2 X f as the initial-value error should be zero.As τ approaches infinity, σ 2 (X o ) becomes the variance of climatological probability density function (PDF).For this limit, since X f j → 0 for all time segments j, the variance is exactly zero.Now we introduce the anomaly correlation coefficient (ACC) between the true value and ensemble-mean forecast as where the angle bracket denotes the average over the time segments.The ACC is high if the true value is well covariated with the forecast (Figure 2a); otherwise, it is low (Figure 2b).By using the relation τ) and the independence between X f j (τ) and X f i (τ), the ACC is reduced to the ratio of the standard deviation of the forecast to that of the true value: Finally, we arrive at the hypothesis that would be created with the perfect model.If is reconsidered to be the ensemble-mean of an imperfect model forecast, we add one extra term of covariance between forecast and error, , , as Because the initial-value error deteriorates the forecast, the extra cross-correlation term is, in general, negative.Therefore, the equality in Equation ( 6) is replaced with the inequality as below [18]: The ACC is unity at the initial time and almost monotonically decreases with the lead time.Within the range of a positive ACC, using Equation (3), we can derive the inequality that determines a possible maximum value of initial-value error based on a set of time segments of the true value and the ensemble-mean forecasts as The prediction limit should be evaluated as the time when ( )√1 − reaches a threshold value.On the MJO phase space, we used the threshold value of unity, that is, about 70 percent of the climatological variance.It is worthwhile remarking that the new method only uses the ensemblemean forecast data and does not use each ensemble member like the conventional method.Expanding Equation ( 8) to multi-model analysis with M models, the inequality should be ( ) ≤ ( )√1 − .Here, is the maximum ACC among the models and their mean, because our evaluation only provides the lower bound of predictability.
For a practical use of the above theoretical consideration in this paper, we replaced the true value with verification defined in Section 2, the reanalysis average.Even so, the discrepancy in the initial state between verification and ensemble-mean forecast was not negligible.We hence constantly offset the time segment of forecast ( ), so as to match its initial state (0) with the verification (0) for each time segment: Moreover, the bias correction is made, the details of which will be explained in the next section.Since we focus on the MJO behavior, and are hereafter two-dimensional vectors on the MJO phase space.
It is cautioned that the ACC is different from the bivariate correlation coefficient (BCC) that was often used for the evaluation of prediction skill on the MJO phase space [15]: Finally, we arrive at the hypothesis that X f j would be created with the perfect model.If X f is reconsidered to be the ensemble-mean of an imperfect model forecast, we add one extra term of covariance between forecast and error, C X f , X e , as Because the initial-value error deteriorates the forecast, the extra cross-correlation term is, in general, negative.Therefore, the equality in Equation ( 6) is replaced with the inequality as below [18]: The ACC is unity at the initial time and almost monotonically decreases with the lead time.Within the range of a positive ACC, using Equation (3), we can derive the inequality that determines a possible maximum value of initial-value error based on a set of time segments of the true value and the ensemble-mean forecasts as The prediction limit should be evaluated as the time when σ(X o ) √ 1 − r 2 reaches a threshold value.On the MJO phase space, we used the threshold value of unity, that is, about 70 percent of the climatological variance.It is worthwhile remarking that the new method only uses the ensemble-mean forecast data and does not use each ensemble member like the conventional method.Expanding Equation ( 8) to multi-model analysis with M models, the inequality should be σ(X e ) ≤ σ(X o ) Here, R is the maximum ACC among the models and their mean, because our evaluation only provides the lower bound of predictability.
For a practical use of the above theoretical consideration in this paper, we replaced the true value with verification defined in Section 2, the reanalysis average.Even so, the discrepancy in the initial state between verification and ensemble-mean forecast was not negligible.We hence constantly offset the time segment of forecast X f j (τ), so as to match its initial state X j f (0) with the verification X o j (0) for each time segment: Moreover, the bias correction is made, the details of which will be explained in the next section.Since we focus on the MJO behavior, X o and X f are hereafter two-dimensional vectors on the MJO phase space.
It is cautioned that the ACC is different from the bivariate correlation coefficient (BCC) that was often used for the evaluation of prediction skill on the MJO phase space [15]: where s(X) is mean amplitude 1/N ∑ N j=1 X j 2 .The BCC is formerly related to the ACC as only when the lead time is sufficiently large so that σ(X) = s(X).Now, as we evaluate the predictability for forecasts initialized on some blocks of the MJO phase space, the mean of state is generally non-zero.Hence, the BCC cannot simply be relevant to the initial-value error in the way described in our formulation.

The Prediction Skill and Bias Correction
This section shows the predictability estimated with the new method developed in the above section, which provides a maximum possible value of initial-value error based on a pair of analysis and ensemble-mean forecast datasets.To effectively test the new method, it is better to remove the model bias from the forecast data in advance.Hence, we first discuss the prediction skill for each model, and then make the bias correction by removing the bias vector averaged over each MJO phase.The prediction skill is now evaluated as the distance between ensemble-mean forecast state and the analysis state on the MJO phase space, defined as the two gravest combined-EOF modes of tropical zonal wind.Each separated block in the figures of this paper corresponds to the MJO phase: the convective center is located over the Indian Ocean in Phases 2-3, the Maritime Continent in Phases 4-5 and the western Pacific in Phases 6-7, while Phases 8-1 correspond to suppressed or formation stages.Figure 3 displays the prediction skill at 10-day lead time calculated for each forecast as the dot plot at the initial-state position on the MJO phase space, with the root-mean-square error (RMSE) as the statistics among forecasts.The best-performing model was ECMWF's, with the RMSE less than unity at this lead time.The RMSE was 1.07 for the NCEP model and 1.15 for JMA's.The two models (Figure 3b,c) especially provided erroneous forecasts with the RMSE exceeding 1.4 when initialized in Phases 4-5, in which convection is active over the Maritime Continent.Contrastingly, the ECMWF model predicted the MJO state more accurately when initialized there, while it provided erroneous forecasts initialized in Phases 8-2, in which the signal resides between Africa and the central Indian Ocean.
unity at this lead time.The RMSE was 1.07 for the NCEP model and 1.15 for JMA's.The two models (Figure 3b,c) especially provided erroneous forecasts with the RMSE exceeding 1.4 when initialized in Phases 4-5, in which convection is active over the Maritime Continent.Contrastingly, the ECMWF model predicted the MJO state more accurately when initialized there, while it provided erroneous forecasts initialized in Phases 8-2, in which the signal resides between Africa and the central Indian Ocean.The model bias as a function of lead time was defined for each initial phase as the bias vector averaged over the forecasts starting from a phase (Figure 4).The bias vector at 10-day lead time in the JMA model showed a clockwise tendency with a leftward translation over the phase space (Figure 4b), moderately consistent with Ichikawa and Inatsu [15].The bias vector in the NCEP model showed a tendency toward the positive direction of the second principal component with a weak clockwise motion (Figure 4c).The bias vector of ECMWF was characterized by clockwise rotation uniformly over the space, corresponding to a bias of slower eastward propagation of the MJO convective center (Figure 4a).The bias correction was made for each model forecast by subtracting the bias vector shown in Figure 4 from the original forecast state vector.Figure 5 shows the error plot after making a bias correction.The bias correction made the error phase-independent in all three models.The RMSE decreased by 0.1 for the ECMWF and NCEP models (Figure 5a,c).In the JMA model, the RMSE dropped from 1.15 to 0.97 as a result of the effective bias correction (Figure 5b).The model bias as a function of lead time was defined for each initial phase as the bias vector averaged over the forecasts starting from a phase (Figure 4).The bias vector at 10-day lead time in the JMA model showed a clockwise tendency with a leftward translation over the phase space (Figure 4b), moderately consistent with Ichikawa and Inatsu [15].The bias vector in the NCEP model showed a tendency toward the positive direction of the second principal component with a weak clockwise motion (Figure 4c).The bias vector of ECMWF was characterized by clockwise rotation uniformly over the space, corresponding to a bias of slower eastward propagation of the MJO convective center (Figure 4a).The bias correction was made for each model forecast by subtracting the bias vector shown in Figure 4 from the original forecast state vector.Figure 5 shows the error plot after making a bias correction.The bias correction made the error phase-independent in all three models.The RMSE decreased by 0.1 for the ECMWF and NCEP models (Figure 5a,c).In the JMA model, the RMSE dropped from 1.15 to 0.97 as a result of the effective bias correction (Figure 5b).

Prediction Limit and Its Comparison with a Conventional Method
Predictability was evaluated with prediction limit in this paper.The proposed method provided a maximum-possible value of initial-value error variance as a function of the lead time.Therefore, it provided a minimum-possible value of the prediction limit defined as the lead day just before the initial-value error variance attains unity, accounting for half the climatological variance of the data on the 2-dimensional phase space.As the time goes to infinity, the variance of analysis time segments is asymptotic to 2. The new method, in essence, evaluates the prediction limit when a possible maximum of initial-value error multiplied by √2 reaches this limitation.This is analogous to the conventional method in which the prediction limit is defined as the time when the ensemble spread multiplied by √2 reaches the MJO amplitude.It is worthwhile noting that ( ) → √2 can be used

Prediction Limit and Its Comparison with a Conventional Method
Predictability was evaluated with prediction limit in this paper.The proposed method provided a maximum-possible value of initial-value error variance as a function of the lead time.Therefore, it provided a minimum-possible value of the prediction limit defined as the lead day just before the initial-value error variance attains unity, accounting for half the climatological variance of the data on the 2-dimensional phase space.As the time goes to infinity, the variance of analysis time segments is asymptotic to 2. The new method, in essence, evaluates the prediction limit when a possible maximum of initial-value error multiplied by √ 2 reaches this limitation.This is analogous to the conventional method in which the prediction limit is defined as the time when the ensemble spread multiplied by √ 2 reaches the MJO amplitude.It is worthwhile noting that σ(X o ) → √ 2 can be used in the new method but the spread goes to the MJO amplitude in the limit due to the model's imperfectness [5].In order to be fairly compared with the method proposed in this paper, the prediction limit was evaluated following Neena et al. [5], except that we did not take a 51-day running average for amplitude due to the limited length of prediction.As the new method gives the minimum possible value of prediction limit, it can be said that the conventional method provides a fake short limit when it is shorter than the new method estimation.
We first tested the new method in light of the classification of forecasts on the initial MJO amplitude.Averaging over forecasts of which the initial MJO amplitude exceeds unity, a minimum possible value of prediction limit was 14 days based on the ECMWF model and 10 days based on the JMA and NCEP models (Figure 6a-c).In contrast, averaging over forecasts starting from a non-MJO state, the minimum-possible prediction limit was 12 days based on the ECMWF model, 9 days based on JMA's and 10 days based on NCEP's (Figure 6d-f), slightly shorter than forecasts from MJO states.This result suggests that the forecasts starting from an MJO state are, on average, more predictable than those starting from a non-MJO state.This is consistent with the consequence from the conventional method.The prediction limit in the ECMWF model was estimated as 23 days for forecasts from an MJO state, while it was 18 days for forecasts from a non-MJO state (Figure 6a,d).Similarly, the prediction limit for the NCEP model was 23 days for an initial MJO state and 21 days for an initial non-MJO state (Figure 6c,e).Although the conventional method did not provide the prediction limit for the JMA model due to no crossing of amplitude and spread lines, the prediction limit seems shorter in forecasts with initial non-MJO states (Figure 6d).This result was not substantially changed even if we classified forecasts by the average MJO amplitude through forecast period, and we found a longer prediction limit for the forecasts for which average amplitude exceeded unity (not shown).
for an initial non-MJO state (Figure 6c,e).Although the conventional method did not provide the prediction limit for the JMA model due to no crossing of amplitude and spread lines, the prediction limit seems shorter in forecasts with initial non-MJO states (Figure 6d).This result was not substantially changed even if we classified forecasts by the average MJO amplitude through forecast period, and we found a longer prediction limit for the forecasts for which average amplitude exceeded unity (not shown).(d-f) are the same as (a-c), but for forecasts from the initial state with the MJO amplitude less than unity.
Our method can also be utilized to estimate the dependence of MJO predictability at the initial phase.Here, the results are shown in Figure 7 for the NCEP model and Table 4 for the ECMWF model.A minimum-possible prediction limit based on the NCEP model was longest for forecasts from Phases 6-7 and shortest for forecasts from Phases 2-3.This phase dependence is coherent with the  (d-f) are the same as (a-c), but for forecasts from the initial state with the MJO amplitude less than unity.
Our method can also be utilized to estimate the dependence of MJO predictability at the initial phase.Here, the results are shown in Figure 7 for the NCEP model and Table 4 for the ECMWF model.A minimum-possible prediction limit based on the NCEP model was longest for forecasts from Phases 6-7 and shortest for forecasts from Phases 2-3.This phase dependence is coherent with the prediction limit estimated with ensemble spread, and the low prediction limit for forecasts initialized in Phases 2-3 may be due to an irregular amplitude change over the Maritime Continent.The ECMWF model provided a minimum-possible prediction limit at 27 days for forecasts from Phases 4-5, much longer than that for forecasts from other phases.This is likely because there is little possibility to initiate another MJO event under the dry environment that occurs after the MJO passage over the Indian Ocean and the Maritime Continent.This is also in line with the prediction limit estimated with the conventional method, although the difference among phases is not so prominent.in Phases 2-3 may be due to an irregular amplitude change over the Maritime Continent.The ECMWF model provided a minimum-possible prediction limit at 27 days for forecasts from Phases 4-5, much longer than that for forecasts from other phases.This is likely because there is little possibility to initiate another MJO event under the dry environment that occurs after the MJO passage over the Indian Ocean and the Maritime Continent.This is also in line with the prediction limit estimated with the conventional method, although the difference among phases is not so prominent.

The Multimodel Ensemble Analysis
The multimodel analysis combined the estimates of prediction limit from each of the single models and the multimodel average.The multimodel average is expected, in general, to provide a better forecast than the single model.Figure 8 shows the prediction skill of forecasts averaged among the ECMWF, JMA and NCEP models after a bias correction.Although the sample number was only 250, as pointed out in section 2, the RMSE was less than any other models' (Figure 5).Similarly, the multimodel analysis gives a longer estimate of potential predictability than the single model analysis.Considering this point, we now regard the multimodel average as one of the model results, and a minimum-possible prediction limit is defined as the lead day just before the smallest estimate of initial-value error among the models reaches unity, in order to get the longest minimum values possible.The new method can extend to multimodel analysis in this manner, while the conventional method cannot.

The Multimodel Ensemble Analysis
The multimodel analysis combined the estimates of prediction limit from each of the single models and the multimodel average.The multimodel average is expected, in general, to provide a better forecast than the single model.Figure 8 shows the prediction skill of forecasts averaged among the ECMWF, JMA and NCEP models after a bias correction.Although the sample number was only 250, as pointed out in section 2, the RMSE was less than any other models' (Figure 5).Similarly, the multimodel analysis gives a longer estimate of potential predictability than the single model analysis.Considering this point, we now regard the multimodel average as one of the model results, and a minimum-possible prediction limit is defined as the lead day just before the smallest estimate of initial-value error among the models reaches unity, in order to get the longest minimum values possible.The new method can extend to multimodel analysis in this manner, while the conventional method cannot.A minimum-possible prediction limit in the multimodel analysis is shown in Figure 9.For forecasts of which the initial amplitude exceeded unity, the smallest estimate of initial-value error reached unity at 18 days (Figure 9a).For forecasts starting from a non-MJO state (Figure 9b), in contrast, the minimum-possible prediction limit was 12 days.The multimodel analysis also supports the notion that the potential predictability of forecasts from an MJO state is longer than that from a A minimum-possible prediction limit in the multimodel analysis is shown in Figure 9.For forecasts of which the initial amplitude exceeded unity, the smallest estimate of initial-value error reached unity at 18 days (Figure 9a).For forecasts starting from a non-MJO state (Figure 9b), in contrast, the minimum-possible prediction limit was 12 days.The multimodel analysis also supports the notion that the potential predictability of forecasts from an MJO state is longer than that from a non-MJO state.Having classified the initial state as in the previous section, we investigated the dependence of the potential predictability on MJO phase in the multimodel analysis.The minimum-possible prediction limit for forecasts initialized in Phases 4-5 was 28 days, while that for forecasts initialized in other phases was 15-18 days.This is because the variance σ 2 (X 0 ) is so small that the initial-value error can keep less than unity even with the relatively low ACC.
It is cautioned that there is an estimation error in the evaluation above caused by the small sample size in the multimodel analysis.For example, there are only 31 and 28 forecasts initialized in Phases 6-7 and Phases 8-1, respectively (Table 3).In these cases, the estimation error of the initial-value error could be as large as ~0.05-0.08 around the lead time when it reaches unity, according to the boot-strap method (not shown).

Concluding Remarks
This study has proposed an alternative method to estimate potential predictability without making the perfect model assumption.The formulation started from the definition of the initial-value error as the difference between the analysis and ensemble-mean forecast.Their covariance among a bundle of time segments is a key point in developing the new method.Applying it to the two-

Concluding Remarks
This study has proposed an alternative method to estimate potential predictability without making the perfect model assumption.The formulation started from the definition of the initial-value error as the difference between the analysis and ensemble-mean forecast.Their covariance among a bundle of time segments is a key point in developing the new method.Applying it to the two-dimensional state on the MJO phase space, the potential predictability for three operational models participating in the S2S project was successfully evaluated.The results suggested that forecasts from an MJO state provided a longer prediction limit, on average, than forecasts from a non-MJO state.Our method also detected a longer prediction limit for forecasts initially from Phases 4-5.We obtained consistent results from a conventional method in which the initial-value error was estimated with the spread of ensemble members normalized by the amplitude of the MJO signal.Though we obtained a similar result to the conventional method, our new method has three merits: it can be used for phenomena that are not well forecasted by models; it can be used for multimodel ensemble analysis; and it can identify a fake short prediction limit obtained using the conventional method.Another different point from the conventional method is the use of the ensemble spread.The ensemble members or their spread might have more information than their mean.Though they might give some statistical stability, however, this point is actually beyond the scope of this paper and will be reported elsewhere.
We have already compared the new method with the conventional method, though the former only gives a minimum-possible value of prediction limit.As reviewed in the introduction, several publications were devoted to exploring potential predictability related to the MJO with the conventional method.Even with different metrics or different models, Waliser et al. [3] and Neena et al. [5] found that the prediction limit for forecasts from an MJO state is a few days longer than that for forecasts from a non-MJO state, quite consistent with our result.Related to the phase-dependency, however, our result-that an initial state in Phases 4-5 is preferable to predictability-is different from previous studies: Neena et al. [5] found slightly higher potential predictability for forecasts initialized in Phases 2-3 and Phases 6-7, while Kim et al. [11] indicated little dependence of potential predictability on the initial phase.We consider that more forecast samples are necessary to get stable statistics on the phase dependency.Moreover, if one used the precipitation pattern as the metric to estimate predictability, the prediction limit would be much shorter [3].Similarly, the prediction limit was only 4 days for unfiltered zonal wind that contains small scale variability [36].It should be kept in mind that our estimate focused only on the planetary-scale aspect of MJO that was well represented by the projection onto the two-dimensional phase space.
Even though the phase dependency on potential predictability is still unsolved, the comparison with prediction skill is interesting.At least in our result based on the JMA and NCEP models, forecasts from Phases 4-5 provide a longer prediction limit.In contrast, they provide a low prediction skill, which is probably related to the so-called "maritime continent barrier", which prevents the model MJO from propagating eastwards beyond the Maritime Continent [4,10,13].This difference of phase dependency between prediction skill and prediction limit from Phases 4-5 suggests that the "maritime continent barrier" is not related to the intrinsic prediction limit, but instead to the insufficient representation of the MJO in numerical models.This is in line with the suggestion by Neena et al. [5], Kim et al. [11] and Seo et al. [37] with different analyses from ours.We finally discuss a possible effect of the initial variance of time segments to prediction limit estimation in this study.Since the standard deviation of the state σ(X o ) is initially 1.6 in the MJO state and 0.7 in the non-MJO state (Figure 10), the latter case allows a smaller ACC if σ(X o ) slowly grows in the lead time.However, σ(X o ) for the forecasts initialized with the MJO and non-MJO states rapidly converge to the same value around 14 days, which is related to the system memory over the phase space.The prediction limit is solely decided by ACC after that.Because the ACC was higher for forecasts from an MJO state (not shown), its prediction limit turns out to be slightly longer than that from a non-MJO state, in spite of a larger σ(X o ) in the initial time.Hence, the difference in initial amplitude has little effect on prediction limit estimation.
the same value around 14 days, which is related to the system memory over the phase space.The prediction limit is solely decided by ACC after that.Because the ACC was higher for forecasts from an MJO state (not shown), its prediction limit turns out to be slightly longer than that from a non-MJO state, in spite of a larger ( ) in the initial time.Hence, the difference in initial amplitude has little effect on prediction limit estimation.Acknowledgments: We are deeply indebted to Masahide Kimoto, Yukari N. Takayabu, Tetsuo Nakazawa, Shuhei Maeda, Seok-Woo Son, and two anonymous reviewers for giving us insightful comments related to this manuscript.Discussion with Hiroaki Miura, Kazuaki Yasunaga, Atsushi Hamada., Satoru Yokoi, Tomoki Miyakawa, and Tamaki Suematsu is helpful to improve our earlier works.Akihiko Shimpo and Yuhei Takaya introduced the S2S project and the reference of the JMA operational model.This work was supported by Grant-

Figure 1 .
Figure 1.A schematic of time evolution of a single time segment of the true value and perfect model's ensemble forecast.True value (solid line and closed circles) and the ensemble-mean perfect-model forecast (open circles) from the initial to the infinite lead time .The forecast expansion of the ensemble members is represented by the thick lines enclosed by their mean.

Figure 1 .
Figure 1.A schematic of time evolution of a single time segment of the true value and perfect model's ensemble forecast.True value (solid line and closed circles) and the ensemble-mean perfect-model forecast (open circles) from the initial to the infinite lead time τ.The forecast expansion of the ensemble members is represented by the thick lines enclosed by their mean.

Figure 2 .
Figure 2. A schematic describing forecast groups with (a) high and (b) low anomaly correlation coefficient skill.The open circles are the forecasts and the filled circles are the verifications.The number on them denotes the time segment label.

Figure 2 .
Figure 2. A schematic describing forecast groups with (a) high and (b) low anomaly correlation coefficient skill.The open circles are the forecasts and the filled circles are the verifications.The number on them denotes the time segment label.

Figure 3 .
Figure 3.The prediction skill measured by a two-dimensional error vector on the Madden-Julian Oscillation (MJO) phase space for each forecast at 10-day lead time plotted at the initial-state position for (a) ECMWF, (b) JMA, and (c) NCEP models.The NCEP model plot is every 10 days for clarity.The magnitude of error is represented with gray scale as per the reference on the right.The MJO phases are labeled and partitioned by dashed lines in the plot.The number of forecasts and the rootmean-square error among all forecasts are provided at the top of the panels.

Figure 3 .
Figure 3.The prediction skill measured by a two-dimensional error vector on the Madden-Julian Oscillation (MJO) phase space for each forecast at 10-day lead time plotted at the initial-state position for (a) ECMWF, (b) JMA, and (c) NCEP models.The NCEP model plot is every 10 days for clarity.The magnitude of error is represented with gray scale as per the reference on the right.The MJO phases are labeled and partitioned by dashed lines in the plot.The number of forecasts and the root-mean-square error among all forecasts are provided at the top of the panels.

Figure 4 .
Figure 4.The error vector at 10-day lead time averaged over forecasts initially located in each MJO phase for (a) ECMWF, (b) JMA and (c) NCEP models.The magnitude of the error vector is as measured by the scale (bottom left).

Figure 4 .
Figure 4.The error vector at 10-day lead time averaged over forecasts initially located in each MJO phase for (a) ECMWF, (b) JMA and (c) NCEP models.The magnitude of the error vector is as measured by the scale (bottom left).

Figure 4 .
Figure 4.The error vector at 10-day lead time averaged over forecasts initially located in each MJO phase for (a) ECMWF, (b) JMA and (c) NCEP models.The magnitude of the error vector is as measured by the scale (bottom left).

Figure 5 .
Figure 5. Same as Figure 3, except that the error is based on the bias-corrected data.

Figure 5 .
Figure 5. Same as Figure 3, except that the error is based on the bias-corrected data.

Figure 6 .
Figure 6.(a-c) MJO amplitude (solid line), ensemble spread multiplied by √2 on the MJO phase space (dashed line), and a maximum possible value of initial-value error evaluated with the method proposed in this paper (dotted line), averaged over forecasts of which the MJO amplitude initially exceeds unity for (a) ECMWF, (b) JMA and (c) NCEP models.The prediction limit (day) is denoted as the vertical line for the new and conventional methods, if any.Figures(d-f) are the same as (a-c), but for forecasts from the initial state with the MJO amplitude less than unity.

Figure 6 .
Figure 6.(a-c) MJO amplitude (solid line), ensemble spread multiplied by √ 2 on the MJO phase space (dashed line), and a maximum possible value of initial-value error evaluated with the method proposed in this paper (dotted line), averaged over forecasts of which the MJO amplitude initially exceeds unity for (a) ECMWF, (b) JMA and (c) NCEP models.The prediction limit (day) is denoted as the vertical line for the new and conventional methods, if any.Figures(d-f) are the same as (a-c), but for forecasts from the initial state with the MJO amplitude less than unity.

Figure 8 .
Figure 8. Scatter plot of the prediction skill at the 10-day lead time for each forecast based on the multimodel average.

Figure 8 .
Figure 8. Scatter plot of the prediction skill at the 10-day lead time for each forecast based on the multimodel average.

Figure 10 .
Figure 10.The standard deviation of analysis time segments as a function of lead time for forecasts initialized at the MJO state (black) and forecasts initialized at the non-MJO state (gray).

Figure 10 .
Figure 10.The standard deviation of analysis time segments as a function of lead time for forecasts initialized at the MJO state (black) and forecasts initialized at the non-MJO state (gray).

Table 2
, and the number of single model and multimodel forecasts initialized in each MJO phase are shown in Table3.It is noted that the forecasts initialized on 1 January 1999, 1 February 2003 and 11 February 2003 were excluded for a technical reason.

Table 2 .
A list of months and days on which the multimodel ensemble analysis was performed between 1999 and 2009.

Table 3 .
The number of individual forecasts, each with its own forecast period, starting from each piecewise area in the Madden-Julian Oscillation (MJO) phase space.

Table 4 .
The list of prediction limits for the ECMWF model (day) for forecasts from each MJO phase with the proposed and conventional methods.

Table 4 .
The list of prediction limits for the ECMWF model (day) for forecasts from each MJO phase with the proposed and conventional methods.