An Empirical Seasonal Rainfall Forecasting Model for the Northeast Region of Brazil

: The Northeast region of Brazil (NEB) is characterized by large climate variability that causes extreme and long unseasonal wet and dry periods. Despite signiﬁcant model developments to improve seasonal forecasting for the NEB, the achievement of a satisfactory accuracy often remains a challenge, and forecasting methods aimed at reducing uncertainties regarding future climate are needed. In this work, we implement and assess the performance of an empirical model (EmpM) based on a decomposition of historical data into dominant modes of precipitation and seasonal forecast applied to the NEB domain. We analyzed the model’s performance for the February-March-April quarter and compared its results with forecasts based on data from the North American Multi-model Ensemble (NMME) project for the same period. We found that the ﬁrst three leading precipitation modes obtained by empirical orthogonal functions (EOF) explained most of the rainfall variability for the season of interest. Thereby, this study focuses on them for the forecast evaluations. A teleconnection analysis shows that most of the variability in precipitation comes from sea surface temperature (SST) anomalies in various areas of the Paciﬁc and the tropical Atlantic. The modes exhibit different spatial patterns across the NEB, with the ﬁrst being concentrated in the northern half of the region and presenting remarkable associations with the El Niño-Southern Oscillation (ENSO) and the Atlantic Meridional Mode (AMM), both linked to the latitudinal migration of the intertropical convergence zone (ITCZ). As for the second mode, the correlations with oceanic regions and its loading pattern point to the inﬂuence of the incursion of frontal systems in the southern NEB. The time series of the third mode implies the inﬂuence of a lower frequency mode of variability, probably related to the Interdecadal Paciﬁc Oscillation (IPO). The teleconnection patterns found in the analysis allowed for a reliable forecast of the time series of each mode, which, combined, result in the ﬁnal rainfall prediction outputted by the model. Overall, the EmpM outperformed the post-processed NMME for most of the NEB, except for some areas along the northern region, where the NMME showed superiority.


Introduction
Climate prediction is an important tool for the preventive management of undesirable consequences of climate variability. Despite the strong chaotic component of the atmosphere, the oceanic phenomena that trigger an ocean-atmosphere energy transfer in present an analysis of the teleconnection patterns associated to the climate variability of the NEB, as well as the validation outcomes. In Section 4, we expose the main conclusions derived from the results.

Data and Region of Study
For this work, we used an interpolated gridded monthly precipitation time series provided by the University of Delaware, with the spatial resolution of 0.5 • × 0.5 • for the NEB (Figure 1). Reference [46] assessed the quality of version 3 of this dataset by comparing it to control stations in Rio Grande do Norte that represent different physiographic properties. This paper uses version 4 (from www.esrl.noaa.gov/psd/data/ gridded/data.UDel_AirT_Precip.html (accessed on 26 February 2019)). SST data were used as the predictor variable of the model. The data originate from the Extended Reconstructed Sea Surface Temperature (ERSST) project, version 5, which is derived from the International Comprehensive Ocean-Atmosphere Dataset (ICOADS) (from www. esrl.noaa.gov/psd/data/gridded/data.noaa.ersst.v5.html (accessed on 3 March 2019)), with a 2 • × 2 • resolution. Layer thickness (1000-500 mb) was also used, calculated from a geopotential height reanalysis provided by the National Center for Environmental Prediction (NCEP) Reanalysis 2, with a 2.5 • × 2.5 • spatial resolution (from http: //www.cpc.ncep.noaa.gov/products/wesley/reanalysis2/kana/reanl2-1.htm (accessed on 3 March 2019)).
validation methods and the GCM forecasts that were used as a reference. In Section 3, w present an analysis of the teleconnection patterns associated to the climate variability o the NEB, as well as the validation outcomes. In Section 4, we expose the main conclusion derived from the results.

Data and Region of Study
For this work, we used an interpolated gridded monthly precipitation time serie provided by the University of Delaware, with the spatial resolution of 0.5° × 0.5° for th NEB (Figure 1). Reference [46] assessed the quality of version 3 of this dataset b comparing it to control stations in Rio Grande do Norte that represent differen physiographic properties. This paper uses version 4 (from www.esrl.noaa.gov/psd/data/gridded/data.UDel_AirT_Precip.html (accessed on 2 February 2019)). SST data were used as the predictor variable of the model. The dat originate from the Extended Reconstructed Sea Surface Temperature (ERSST) projec version 5, which is derived from the International Comprehensive Ocean-Atmospher Dataset (ICOADS) (from www.esrl.noaa.gov/psd/data/gridded/data.noaa.ersst.v5.htm (accessed on 3 March 2019)), with a 2° × 2° resolution. Layer thickness (1000-500 mb) wa also used, calculated from a geopotential height reanalysis provided by the Nationa Center for Environmental Prediction (NCEP) Reanalysis 2, with a 2.5° × 2.5° spatia resolution (from http://www.cpc.ncep.noaa.gov/products/wesley/reanalysis2/kana/reanl2-1.htm (accesse on 3 March 2019)).

Description of the Empirical Model
The statistical model used in this study is based on the method initially proposed by [47] and subsequently implemented in [31,[48][49][50]. Figure 2 illustrates a diagram with details on how the model is built. There are three main steps (phases), from tuning to validation. Phase 1 consists of an empirical process of calculating lagged SST indices that have a good correlation with the principal components (PCs) of precipitation. These indices and the PCs of precipitation are later used as predictor and predictand variables, respectively, in a linear regression model. First, the EOF technique, also known in statistics as principal component analysis (PCA) [51], is applied to the precipitation data. The technique is applied to obtain an n number of time series that concentrate most of the precipitation variability observed in the NEB (denoted as PC n in Figure 2). Each PC is then compared to all lagged SST series in the global domain to obtain the Pearson correlation values. The outcomes of this are the correlation fields (CF) that are used to calculate the indices. This is shown in phase 1 in Figure 2, on the right. The bottom part of the phase 1 box illustrates how each index is calculated. Each grid point in the SST field has a corresponding correlation value from CF that works as a weight. The higher (lower) the correlation at a given point, the more (less) important that point is for the index. Points with statistically significant correlation lower than the 5% confidence level are set to a correlation value equal to 0. Finally, the indices are calculated by adding all the points to the CF-weighted lagged SST field, that is, for each PC in the precipitation field, a highly correlated SST index is generated.
Once the PCs of the precipitation field and the lagged SST indices for each PC are obtained, these series are used in a simple linear regression model in phase 2. The model is used to predict the next PC value using the last value of its corresponding index. In this phase, the model undergoes an adjustment and a cross-validation procedure. Thus, it is possible to obtain a hindcast of forecasts independently of the precipitation PCs.
One of the products derived from the EOF technique after it has been applied to spatial data is the spatial pattern of the modes (spatial pattern of the PCs). These patterns arise from parameters that are derived from the spatially referenced technique. It is from these parameters that the PCs are calculated in the EOF process, allowing us to couple the PCs and, therefore, to reconstruct the precipitation field. These parameters are used by combining the expected PCs to build a predicted precipitation field. This process occurs in phase 3, which is detailed in the last panel of Figure 2. The grids at the left of the phase 3 panel represent the spatial patterns of each precipitation data mode. The equations at the top of that panel show how the forecast is performed for each grid point: the product between each predicted time series (PC) and their corresponding EOF is the spatial pattern of the mode. The sum of all of these predicted spatial patterns is the model's final precipitation forecast for the NEB.

North American Multi-Model Ensemble (NMME)
The NMME is a recent multi-agency effort to provide an operational set of forecasts of global climate models [45]. As described in [52], this project, launched in 2011, provides forecasts at global level from USA and Canada climate centers, with a uniform spatial resolution of 1 • × 1 • . The NMME comprises forecasts from the Climate Forecast System, version 2 (CFSv2; [53]); the Geophysical Fluid Dynamics Laboratory (GFDL [54]) and the GFDL Forecast-Oriented Low Ocean Resolution (FLOR; [55]) models; the National Center for Atmospheric Research (NCAR); the Community Climate System Model, version 4 (CCSM4; [56]); the National Aeronautics and Space Administration (NASA) and Goddard Earth Observing System, version 5 (GEOS-5), ocean data and sea ice models and data assimilation systems [57,58]; as well as two models from the Canadian Meteorological Center: the Third Generation Canadian Coupled Global Climate Model (CanCM3) and the Fourth Generation Canadian Coupled Global Climate Model (CanCM4) [59]. The NMME is the result of the integration of this set of models. The complete list of models is described in detail by [45].

North American Multi-Model Ensemble (NMME)
The NMME is a recent multi-agency effort to provide an operational set of forecasts of global climate models [45]. As described in [52], this project, launched in 2011, provides forecasts at global level from USA and Canada climate centers, with a uniform spatial resolution of 1° × 1°. The NMME comprises forecasts from the Climate Forecast System, version 2 (CFSv2; [53]); the Geophysical Fluid Dynamics Laboratory (GFDL [54]) and the GFDL Forecast-Oriented Low Ocean Resolution (FLOR; [55]) models; the National Center for Atmospheric Research (NCAR); the Community Climate System Model, version 4 (CCSM4; [56]); the National Aeronautics and Space Administration (NASA) and Goddard Earth Observing System, version 5 (GEOS-5), ocean data and sea ice models and data We decided to use NMME outputs because several studies have revealed that the NMME's skill outperforms forecasts of individual models [60][61][62][63]. Other works focused on assessing the NMME's skill for specific regions and situations. For instance, there are efforts to improve seasonal climate forecasts for the African continent [64] and to predict droughts in China [65], as well as an assessment of statistical downscaling of precipitation and 2 m air temperature forecasts from the NMME for Florida, Georgia and Alabama [66].

Model Downscaling Using the Climate Predictability Tool
In this work, we did not use NMME raw data. In order to accomplish a better performance of the ensemble for the NEB, we applied a downscaling technique for postprocessing the NMME's hindcast. The technique is known as canonical correlation analysis (CCA) and is part of the climate predictability tool (CPT), which is widely used for generating operational climate predictions [67][68][69].
The CCA can be used in two ways (see Figure 1 in [15]). The first, as a purely statistical forecast model, by relating predictor variables to the predictand ones (e.g., to obtain forecasts for deviations of accumulated precipitation from SST anomalies), based on a broad hindcast period without the involvement of dynamic models [70][71][72]. The second, which is employed in the present study, relates the raw outputs of dynamic model prediction to their corresponding observations for the targeted forecast time based on a hindcast period [15]. In our case, the CCA was used to relate raw NMME forecasts of rainfall for FMA, outputted in January, to the rainfall observed in that quarter based on the hindcast period of 1982-2010.
In this process, a pre-orthogonalization is achieved by using EOF analysis separately on the model hindcasts (the X variable, or predictor) and on the corresponding observations (the Y variable, or predictand), and the set of the PC time series from these EOFs is used as input to the CCA [15,70]. This method reduces the number of variables used by the CCA, preserving the most consistent patterns of variability. The CPT applies the EOF pre-processing prior to the CCA method and then estimates the model skill using a crossvalidation process [15], that is, the datasets (hindcasts) used to estimate the model skill are split in mutually exclusive subsets, so that some of them are used to make the estimates and the remaining ones are used for validation. The method of cross-validation is characterized by the aforementioned process of subsequent training and validation steps, repeating itself until every dataset has been used at some point to validate the model without having been used to train it.
Analogously to [15], the predictor domain for the NEB (Domain Data X, Figure 3, left panel) is purposely designed to be larger than the predictand region itself (Domain Data Y, Figure 3, right panel). The selected NMME predictor grid spans from 150 • E to 0 • in longitude and from 66 • S to 48 • N in latitude. The choice was motivated by the fact that the grid encompasses the domains of the main modes of climate variability that are known to affect the behavior of weather systems that, in turn, modulate the rainfall regime of the NEB. That same domain was proven to be effective in [15].

Deterministic and Probabilistic Assessment
For newly developed models aimed at climate forecasts, it is expected from the developer to perform and present assessments of the model skill from both the deterministic and probabilistic perspectives. The dexterity of the model in outputting deterministic forecasts was estimated by using the anomaly correlation coefficient (ACC) (Equation (1)), also known as the Pearson correlation coefficient, where F' is the predicted

Deterministic and Probabilistic Assessment
For newly developed models aimed at climate forecasts, it is expected from the developer to perform and present assessments of the model skill from both the deterministic and probabilistic perspectives. The dexterity of the model in outputting deterministic forecasts was estimated by using the anomaly correlation coefficient (ACC) (Equation (1)), also known as the Pearson correlation coefficient, where F' is the predicted rainfall anomaly, O' is the observed anomaly and N is the total number of observed and predicted data that are available in the sample. For the probabilistic assessment of the model, we used the Brier score (BS) (Equation (2)), where f t is the probability of occurrence of any precipitation and O t is the current condition of the event (1 for occurrence and 0 for non-occurrence). In order to generate probabilistic forecasts, it is necessary to convert the deterministic forecast in probabilistic values, which, in turn, are derived from a probability distribution. For three-month data, a probability distribution of the Gaussian (normal) type is usually well suited to model the statistical behavior of the data. We applied the Yule-Kendall index of skewness (YKI) to verify the suitability of the series to a normal distribution. The closer to zero the YKI value is, the greater the suitability of the dataset is to a normal distribution. Figure 4 shows that, for most of the grid points of the NEB, the YKI values are equal to or very close to zero. This confirms the reasonability of applying the Gaussian distribution to these data.

Results and Discussion
For the precipitation forecast of the FMA quarter in the NEB, we focused on the firs three leading modes obtained by EOF. The choice of this number of modes was mad because the general dexterity of the model tends to decay if more than three main mode are adopted, and also because they presented reasonable teleconnection patterns that ar discussed in the next subsection. The first three leading modes account for 44%, 18% an 8% of the variability, respectively. Together, they represent 70% of the total datase variability. The remaining modes were discarded, as they were regarded as noise.

Precipitation and Teleconnection Patterns
The EOF spatial loading patterns for each of the leading modes are shown in Figur 5a,c,e, and their respective lagged SST anomalies correlation fields are shown in Figur As for the probabilistic forecasts, precipitation is considered below (above) the climatological normal when accumulated rainfall is lower (higher) than the 0.33 (0.66) percentile. The deterministic forecasts were converted to probabilistic ones by using logistic regression. As the NMME is the reference model in this study, we used relative metrics to assess the skill gain of one model over the other. The skill correlation score (SCS) (Equation (3)) is the metric that determines, in percentage terms, how much a model was better than the other in outputting deterministic forecasts. In Equation (3), r forecast is the Pearson correlation value of the model that is undergoing evaluation (EmpM), and r reference is the Pearson correlation of the reference model (NMME). For the probabilistic forecasts, we used the Brier skill score (BSS) (Equation (4)) to quantitatively assess the relative gain in skill of one

Results and Discussion
For the precipitation forecast of the FMA quarter in the NEB, we focused on the first three leading modes obtained by EOF. The choice of this number of modes was made because the general dexterity of the model tends to decay if more than three main modes are adopted, and also because they presented reasonable teleconnection patterns that are discussed in the next subsection. The first three leading modes account for 44%, 18% and 8% of the variability, respectively. Together, they represent 70% of the total dataset variability. The remaining modes were discarded, as they were regarded as noise.

Precipitation and Teleconnection Patterns
The EOF spatial loading patterns for each of the leading modes are shown in Figure 5a,c,e, and their respective lagged SST anomalies correlation fields are shown in Figure 5b,d,f. Prior to proceeding with the teleconnection analysis presented hereinafter, it is important to clarify that negative values in the loading patterns imply correlations with SST anomalies of the opposite sign (relative to the ones presented in the scale at the bottom of Figure 5b,d,f).  The EOF loading pattern of the first, second and third leading modes obtained from their FMA precipitation (a,c,e, respectively) and their corresponding December SST correlation fields (b,d,f, respectively). The points on the map represent statistical significance at the 5% level using the student's t-test.
The loading pattern inherent to the first leading mode is shown in Figure 5a, exhibiting a prominent concentration in the northern sector of the NEB (Figure 5a). The pattern showed a gradual decrease towards the southern sector of the NEB, with values nearing the maximum of 0.10 along the northern coast, and neutral to slightly negative in the southern half of the region. Significant negative teleconnection patterns were observed between the mode and the equatorial Pacific, most of the South Pacific and in the northwestern region of the Indian Ocean (Figure 5b). A positive teleconnection pattern was present along the equatorial Atlantic, with higher significance and prevalence of correlation coefficients higher than 0.45 around the South Atlantic tropical (SAT) region, outlined by [7] as one of the two regions chosen by them to propose their SST Atlantic dipole index. This teleconnection is coherent with results found in literature that show the influence of the Atlantic Ocean in the rainfall regime of the NEB [73][74][75]. The huge area of significant and strong negative correlation (ranging from 0.3 to 0.6) along the equatorial Pacific presented a pattern that corresponds to the ENSO domain.
Primarily, the reduction in the NEB precipitation usually observed during El Niño has been attributed to large-scale changes in the Walker circulation, causing an eastward displacement of the affecting cell, positioning its rising branch over the eastern equatorial Pacific, where abnormally warm surface waters prevail, and its sinking branch over the tropical Atlantic [76]. Such anomalies in the Walker circulation can produce subsidence over the region, which can also be affected by a strengthened and shifted sinking branch of the Hadley cell when the ITCZ is displaced northward [77,78]. However, the ENSO SST anomalies may have different effects on the tropical Atlantic depending on their spatial structure and duration [37]. Some authors use the terms "canonical" and "noncanonical" to differentiate the cases according to their observed response in the Atlantic and, consequently, in the NEB precipitation. Among them, are the authors of [37,76,79]. Canonical El Niño events exhibit their maximum anomalous warming in the eastern Pacific, which, in turn, causes a warming of the tropical North Atlantic and a cooling of the equatorial and tropical South Atlantic, leading to deficient rainfall over the NEB [79]. This interbasin connection is mainly through the changes mentioned above in the Walker circulation, which lead to an intensification of the southeast trades over the equatorial Atlantic in DJF, with the stronger southeast winds triggering upwelling equatorial Kelvin waves that cause cooling of the equatorial Atlantic in MAM [37,76]. The opposite tends to happen for canonical La Niña events [79].
Our findings are highly consistent with the teleconnection described above between the equatorial Pacific and the tropical Atlantic during canonical ENSO episodes, as the first leading mode (Figure 5a,b) exhibited, at the same time, strongly significant negative correlations with the ENSO area and positive correlations with the SAT area, as well as a slight negative correlation with the tropical North Atlantic. It is also worth noting that the teleconnection with the equatorial Pacific was stronger in, if not almost restricted to, the domains of the Niño3.4, Niño3 and Niño1+2 indices, which are areas that define the canonical ENSO events. It should be mentioned that the core of the negative correlation found here, with values exceeding 0.45, covered parts of the Niño3.4 and almost the entirety of the Niño3 regions. Furthermore, the lagged nature of the teleconnection was also present in our findings, as the SST anomaly field (Figure 5b) is built with December data.
Additionally, the effect in the Atlantic was mostly seen in the northern NEB (Figure 5a), which is precisely the region regarded as most susceptible to the latitudinal migration of the ITCZ, which, in turn, is governed by the meridional gradient of the SST anomalies in tropical Atlantic, as highlighted by [73,75,[80][81][82].
Indeed, during the positive phase of the AMM, i.e., with a tropical North Atlantic warmer than the tropical South Atlantic, the descending branch of the Hadley circulation is positioned over the NEB and the adjoining tropical South Atlantic. This causes the ITCZ to fail to reach a more austral position (and, therefore, the northern NEB) during the FMA quarter [73,75,80,83,84].
Considering that the time series used in this study encompassed a total period of 29 years, it was not possible to be more assertive regarding teleconnections with oscillations of much lower frequencies. However, it is known that the ENSO and PDO climate patterns are clearly related, both spatially and temporally, to the extent that the PDO may be viewed as ENSO-like interdecadal climate variability [85], with the main difference between PDO and ENSO being their time scales, for, while the PDO phases persist for 20 to 30 years, the typical ENSO events persist for 6 to 18 months [86]. Furthermore, according to [86,87], El Niño (La Niña) events are more (less) intense and occurs in higher (lesser) number during the positive phase of the PDO, with the opposite occurring during its negative phase. Therefore, a teleconnection between the first mode and the PDO may also be hypothesized.
Regarding the second leading mode, its EOF loading pattern exhibited negative values for nearly all the NEB, to the point of exceeding 0.06 for areas in the center-south and in the southwest. The far northern region was an exception presenting low positive numbers, which were rarely above 0.03 (Figure 5c). Therefore, we focus our analysis of this mode on the southern NEB region. The teleconnection patterns with the global SST field (Figure 5f) were rather punctual, suggesting that the mode was not linked to any known major oscillations with frequencies lower than the seasonal scale. There were significant teleconnections with waters off the eastern coast of Madagascar and the southern coast of Australia, where the correlation coefficients reached about −0.5 and 0.5, respectively (Figure 5d). These regions are considered key points in Rossby wave instability, which ripples across the Pacific until it reaches South America. The correlation areas observed off the western coast of South America, with values in excess of 0.3, were also consistent with the path of Rossby wave trains that ripple through the subtropical and polar jets, as it is known that, when they reach approximately 80 • W, the wave trains combine into a single wave [88].
Those waves are associated with the penetration of frontal systems in the South American continent. Eventually, these transient systems reach lower latitudes and, therefore, the southern NEB, being considered one of the most important weather systems to modulate the rainfall regime of that region [78,[89][90][91], especially the state of Bahia, where the maximum EOF loading values of this mode are observed (Figure 5c). However, it should be noted that the wet season varies within the NEB, and, in this region, it is concentrated in the DJF period of the year [78,82,88,91], having only one month in common with the FMA quarter.
Still, on the possible teleconnection with higher latitudes analyzed above, in addition to the mid-latitude areas commented before, it is interesting to notice several pockets off the coast of Antarctica presenting significant correlations ( Figure 5d). As recollected by the authors of [92], 20%-30% of monthly sea level pressure (SLP) variability south of 20 • S is explained by the Southern Annular Mode (SAM), also known as the Antarctic Oscillation (AAO). This oscillation is considered the most significant forcing of intraseasonal to decadal climate variability in the middle-to high-latitude Southern Hemisphere [93]. Therefore, given the relevance of the southern oceans to the behavior of the frontal systems that affect South America, it is possible to conjecture a teleconnection between the southern NEB precipitation and the SAM as well.
There was also a statistically significant correlation area in the North Pacific, around Japan, that may be linked to the Kuroshio-Oyashio Extension. However, given that the inherent correlation was subtle (lower than 0.3 for most of it) and that a possible influence does not have grounds in the literature, we chose to nullify that area for this study, so that the model could technically disregard it.
The third leading mode presented a zonal gradient loading pattern, with positive values in the eastern half of the NEB and negative values in the westernmost sector ( Figure 5e). Almost throughout the eastern area, the EOF values remained above 0.05. From the corresponding global SST field, it was possible to identify a significant negative teleconnection between the precipitation over the eastern NEB and the SST anomalies in the Caribbean Sea (between −0.4 and −0.6), as well as with the eastern tropical North Atlantic (Figure 5f), with the latter corresponding to the NAT index domain, and the other region defined by [7] for their aforementioned Atlantic dipole. It is useful to recall that the North Atlantic may, by itself, determine the latitudinal migration of the ITCZ, by leading the AMM [91,94]. According to these authors, during the positive phase of the NAO, the northeast trade winds, which push the ITCZ to the south during the austral summer, are enhanced. Still conforming to them, these effects, i.e., the greater vigor of the Hadley cell, its closer proximity to the NEB and the increased moisture flux from the Atlantic, eventually result in increased precipitation over the region.
Moreover, in that regard, the evident upward trend in the time series of the third mode (Figure 6c) may be linked to the Atlantic Multidecadal Oscillation (AMO), which transitioned from the negative to positive phase in the 1990s. The positive phase of the AMO naturally implies the predominance of positive phases of the NAO in shorter time scales, which, according to the above, are considered favorable to rainfall over the NEB.  It should be remarked, though, that the eastern NEB is known to have its wet season concentrated in the AMJJ period [82,87,89,91,95], not exactly the FMA quarter.
Significant correlations were also seen with the southwest of the Pacific Ocean, presenting a dipolar pattern of association, with coefficients exceeding 0.4 in the high latitudes and −0.4 in the middle latitudes. Interestingly, this pattern resembled the one identified by [78] as responsible for 39% of the variability of the South Pacific, with anomalies of opposite sign centered around 35 • S, 160 • W and 60 • S, 130 • W. Still, according to those authors, this mode is highly correlated to ENSO and the Interdecadal Pacific Oscillation (IPO), being characterized by a horseshoe structure with positive (negative) anomalies located on subtropical regions of both the North and South Pacific, and negative (positive) anomalies on the eastern tropical Pacific. That is almost exactly the correlation pattern that was observed here in Figure 5f, where a broad region of significant negative correlations formed a horseshoe shape that contoured a tropical area of positive (although not significant) correlations in the eastern tropical Pacific. Therefore, a teleconnection with this mode, called the South Pacific Ocean Dipole (SPOD) [96] was also identified.
Additionally, one of the regions that formed the horseshoe pattern mentioned above, the one presenting significant negative correlations (with values between −0.4 and −0.6) centered in the subtropical to middle latitudes of the central North Pacific, is generally used to define the PDO index [6,85]. Actually, this same pattern of SST anomalies has been described by [97] as a boomerangshaped pattern of stronger anomalies of one sign along the equator over the central and eastern Pacific and weaker anomalies of the opposite sign over mid-latitude regions of both hemispheres; it was also described by [98] as a tripole pattern of SST anomalies in the Pacific, with three large centers of action and variations on decadal timescales, with the sign of the equatorial region being the opposite of the one in the central-west of the North Pacific and in the central-west of the South Pacific. In both cases, the authors referred to the SST anomalies pattern that characterizes the IPO. For both of them, the IPO and the PDO are closely related, highlighting the similarity between their time series and the fact that the two of them exert a low-frequency modulation of ENSO and its teleconnections. It is also suggested that the PDO can be considered as the North Pacific node of the IPO [98], or that the IPO can be a Pacific-wide manifestation of the PDO [97].
Therefore, the SST correlation pattern in Figure 5f made us infer that the upward trend in rainfall observed for our third leading mode in Figure 6c, in addition to the AMO influence discussed before, is also possibly owed to the transition from the positive to the negative phases of the IPO (and the PDO) in the end of the past century.
Similar to the correlation pattern of the second leading mode, the one associated with the third mode also exhibited several areas of correlation around the Antarctic continent, almost forming a belt of correlations of the same sign, which occasionally surpassed 0.6 ( Figure 5f). Knowing that the Easterly Wave Disturbances (EWDs) are the main cause of rainfall over the eastern NEB [73,89,91,95] and that more than 70% of the occurrences of that system in the region have origins found to be associated with cold fronts [95] (which naturally originate in the high latitudes), a teleconnection of the rainfall over eastern NEB with the SAM can be inferred too.

Forecasting Patterns of Precipitation Anomalies
The significant teleconnections found between each mode and the two-month lagged SST anomalies are useful for the adjustment of simple linear regression models that use the time series of each mode as the predictand, and the SST anomalies as the predictor field. Overall, the teleconnections allowed for a reasonable predictability for all the analyzed modes (Figure 6a-c). The first leading mode was linked to a variability at relatively lower frequencies (e.g., ENSO and AMM). For this reason, probably, it proved to be the most predictable mode, with a 0.76 correlation between the observed and the predicted time series. The association of the second mode with the propagation of higher frequency atmospheric waves, as discussed in the previous section, resulted in the attainment of the lowest correlation among the modes, although still satisfactory, at 0.58. The third mode presented a noticeable trend and lower variability, with both factors leading to a stability that, in turn, resulted in a fair predictability of the mode, having achieved a correlation of 0.73.
By having the ability to predict modes individually, it becomes possible to predict the spatial precipitation patterns in the NEB by the method presented in phase 3 of the diagram (Figure 2). The NMME, which represents the state-of-the-art in seasonal forecasting by general circulation models, accomplished significant performance in the northwestern part of the NEB (Figure 7a). However, for other areas within the domain, it obtained statistically non-significant results. This demonstrates a clear need to improve forecasts for the NEB, especially for its southern and eastern sectors. Conversely, the EmpM achieved significant predictability for most of the NEB (Figure 7b). The areas with higher predictability were located at the center-west and southern portions, where the spatial correlations were usually above 0.6. A region of statistically non-significant predictability spread over parts of the states of Bahia, Sergipe, Pernambuco, Paraíba and Ceará (Figure 7b). This region represents the core of the semi-arid zone.
The SCS allowed us to assess the performance of the empirical model against the performance of the NMME. It is possible to observe that the EmpM was able to consistently outperform the NMME for most of the NEB (Figure 7c), to the point of reaching 100% in the some of the southernmost areas. The average EmpM gain against NMME was of 50%. On the other hand, the NMME forecasts for the northernmost part of the NEB were overall superior to those outputted by the EmpM.
Water 2021, 13, x FOR PEER REVIEW 15 of 22 Figure 7. Spatial correlation of the observed time series with the NMME (a) and with the EmpM (b), as well as the skill correlation score (SCS) of the EmpM against the NMME (c). The points on the map represent statistical significance at the 5% level using the student's t-test.
In probabilistic terms, obtaining BS values less than or equal to 0.25 means that the model has an accuracy rate higher than 50%, rendering the model superior to the referential climatology. The NMME was inefficient in a small area of the southwestern NEB in forecasts below the normal (Figure 8a) and in most of the southern area, as well as in some of the eastern coast, in forecasts above the normal (Figure 8b). The EmpM, in turn, obtained a BS value lower than 0.25 for nearly all the NEB territory under circumstances below the normal (Figure 8c). In forecasts above the normal, the model was inefficient over a broad area that encompassed the southern and most of the eastern regions ( Figure   Figure 7. Spatial correlation of the observed time series with the NMME (a) and with the EmpM (b), as well as the skill correlation score (SCS) of the EmpM against the NMME (c). The points on the map represent statistical significance at the 5% level using the student's t-test.
In probabilistic terms, obtaining BS values less than or equal to 0.25 means that the model has an accuracy rate higher than 50%, rendering the model superior to the referential climatology. The NMME was inefficient in a small area of the southwestern NEB in forecasts below the normal (Figure 8a) and in most of the southern area, as well as in some of the eastern coast, in forecasts above the normal (Figure 8b). The EmpM, in turn, obtained a BS value lower than 0.25 for nearly all the NEB territory under circumstances below the normal (Figure 8c). In forecasts above the normal, the model was inefficient over a broad area that encompassed the southern and most of the eastern regions ( Figure 8d). Visually, the NMME was superior to the EmpM with regards to the BS parameter. Finally, we resorted to the BSS to compare once again the EmpM performance to the one of the NMME. In forecasts above the normal (Figure 9b), the pattern bore similarity to the one shown in Figure 7c, with the NMME prevailing in the northwestern region, and the EmpM showing superiority for nearly all the other areas. In forecasts below the normal (Figure 9a), the NMME was superior in a broader strip across the northern region, but the EmpM was still better for most of the NEB.
Water 2021, 13, x FOR PEER REVIEW 16 of 22 (Figure 9a), the NMME was superior in a broader strip across the northern region, but the EmpM was still better for most of the NEB.

Conclusions
In this study, we developed and calibrated an empirical model to create seasonal forecasts for the rainy period of the NEB and compared it to an ensemble of dynamic models that represent the state-of-the-art in seasonal forecasting. The model predicts the rainfall of the FMA quarter from individual forecasts of each dominant mode of the region's precipitation. The teleconnection analysis between the leading precipitation modes and the lagged SST field showed consistent associations that enabled the individual forecasting of each mode and, subsequently, precipitation forecasts for the NEB.
The first leading mode showed remarkable correlations with broad regions of the Pacific, Atlantic and Indian Oceans. It is widely established in the literature that the SSTs of the equatorial Pacific and the tropical Atlantic exert considerable influence on the NEB precipitation. According to the spatial pattern obtained by EOF, such influence is significantly concentrated in the northern part of the NEB, which is known to be the most susceptible to both the AMM and ENSO. Additionally, it is the region that has the FMA period for the wet season, so obtaining that particular spatial pattern for the first leading mode is substantially coherent.
The teleconnection patterns inherent to the second leading mode suggested that it may be linked to the propagation of high-frequency atmospheric waves (e.g., Rossby waves). The statistically significant correlations with the lagged SST anomalies of the South Pacific, in areas that typically nest atmospheric wave excitation, point in the direction of that hypothesis. Moreover, the loading pattern of this mode resembled a meridional dipole, with the maximum EOF loading in the southern NEB, which is precisely the sector that owes its rainfall primarily to the incursion of frontal systems. Teleconnections of this mode with SST regions of higher southern latitudes made it possible to conjecture an influence of the SAM as well.
A zonal gradient pattern was observed for the third EOF leading mode, with maximum loading in the eastern NEB. The positive teleconnections with two broad areas in the tropical North Atlantic may be owed to the influence of the NAO, as previously discussed. As for the Pacific, the horseshoe shape of the negative correlation areas resembled that of the SPOD, with one of those areas, the one centered in the North Pacific, having been identified as the PDO index domain. The remarkable upward trend noticed in the time series of this mode is probably linked to the PDO (and the IPO) or the AMO, or to both of these low-frequency modes of the Pacific and the Atlantic, respectively.

Conclusions
In this study, we developed and calibrated an empirical model to create seasonal forecasts for the rainy period of the NEB and compared it to an ensemble of dynamic models that represent the state-of-the-art in seasonal forecasting. The model predicts the rainfall of the FMA quarter from individual forecasts of each dominant mode of the region's precipitation. The teleconnection analysis between the leading precipitation modes and the lagged SST field showed consistent associations that enabled the individual forecasting of each mode and, subsequently, precipitation forecasts for the NEB.
The first leading mode showed remarkable correlations with broad regions of the Pacific, Atlantic and Indian Oceans. It is widely established in the literature that the SSTs of the equatorial Pacific and the tropical Atlantic exert considerable influence on the NEB precipitation. According to the spatial pattern obtained by EOF, such influence is significantly concentrated in the northern part of the NEB, which is known to be the most susceptible to both the AMM and ENSO. Additionally, it is the region that has the FMA period for the wet season, so obtaining that particular spatial pattern for the first leading mode is substantially coherent.
The teleconnection patterns inherent to the second leading mode suggested that it may be linked to the propagation of high-frequency atmospheric waves (e.g., Rossby waves). The statistically significant correlations with the lagged SST anomalies of the South Pacific, in areas that typically nest atmospheric wave excitation, point in the direction of that hypothesis. Moreover, the loading pattern of this mode resembled a meridional dipole, with the maximum EOF loading in the southern NEB, which is precisely the sector that owes its rainfall primarily to the incursion of frontal systems. Teleconnections of this mode with SST regions of higher southern latitudes made it possible to conjecture an influence of the SAM as well.
A zonal gradient pattern was observed for the third EOF leading mode, with maximum loading in the eastern NEB. The positive teleconnections with two broad areas in the tropical North Atlantic may be owed to the influence of the NAO, as previously discussed. As for the Pacific, the horseshoe shape of the negative correlation areas resembled that of the SPOD, with one of those areas, the one centered in the North Pacific, having been identified as the PDO index domain. The remarkable upward trend noticed in the time series of this mode is probably linked to the PDO (and the IPO) or the AMO, or to both of these low-frequency modes of the Pacific and the Atlantic, respectively.
The identified teleconnection patterns were used to generate simulated time series that awee noticeably correlated with the observed time series of each mode. This allowed for a reasonable adjustment of linear regression models to predict each mode using the SST anomalies field as the predictor variable. Deterministically, the EmpM outperformed the NMME, especially in the center-south portion of the NEB, whereas the northwestern region was better forecasted by the NMME. The results were similar when the models were compared stochastically. The EmpM was better than the NMME for most of the region, especially for the center-south area. The NMME superiority was restricted to some areas spread over the northern region. From the probabilistic perspective, an overall better performance was attained by the NMME in forecasts above the normal, with the EmpM showing prevalence in circumstances below the normal.
As a future activity, we intend to apply the results obtained in this work for the calibration and verification of the Brazilian Global Atmospheric Model (BAM) [99], which is the atmospheric module of the Brazilian Earth System Model (BESM), aiming to achieve a hybrid dynamic-statistic coupling for observed surface data and to perform adjustments in the BESM seasonal forecasting for the NEB.

Conflicts of Interest:
The authors declare no conflict of interest.