An Ensemble Mean and Evaluation of Third Generation Global Climate Reanalysis Models

We have produced a global ensemble mean of the four third-generation climate reanalysis models for the years 1981–2010. The reanalysis system models used in this study are the National Centers for Environmental Prediction (NCEP) Climate Forecast System Reanalysis (CFSR), the European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis Interim (ERA-I), the Japan Meteorological Agency (JMA) 55-year Reanalysis (JRA-55), and the National Aeronautics and Space Administration (NASA) Modern-Era Retrospective Analysis for Research and Applications (MERRA). Two gridded datasets are used as baselines: the Global Historical Climatology Network (GHCN) for temperature and the Global Precipitation Climatology Centre (GPCC) for precipitation. The reanalysis ensemble mean is used here as a tool for comparing the four reanalysis members. Meteorological fields investigated within the reanalysis models include 2-m air temperature, precipitation, and 500-hPa geopotential heights. Comparing the individual reanalysis models to the ensemble mean, we find that each performs similarly over large domains but exhibits significant differences over particular regions.


Introduction
Reanalysis models are numerical frameworks from which gridded solutions for past atmospheric states are obtained through the assimilation of historical meteorological observations. The first such model, the National Centers for Environmental Prediction and National Center for Atmospheric Research (NCEP and NCAR, respectively) Reanalysis (R1) [1], provides meteorological outputs on a global domain, over the period 1948-present, on a coarse 2.5° × 2.5° horizontal grid. A second generation of global climate reanalysis models, including the European Centre for Medium-Range Weather Forecasts (ECMWF) 40-year Reanalysis (ERA-40) [2] and the Japan Meteorological Agency (JMA) 25-year Reanalysis (JRA-25) [3], improved on R1 and subsequent reanalyses by integrating satellite data, cloud motion winds (ERA-40), and wind profiles around tropical cyclones (JRA-25), with output on a finer 1.125° × 1.125° grid. A third generation of global reanalysis models made significant advances in data assimilation and internal physics, computed at resolutions finer than 1° × 1°. These models are the NCEP Climate Forecast System Reanalysis (CFSR) [4], the ECMWF Reanalysis Interim (ERA-I) [5], the JMA 55-year Reanalysis (JRA-55) [6], and the NASA Modern-Era Retrospective Analysis for Research and Applications (MERRA) [7]. However, since the beginning of this analysis, a new version of the National Aeronautics and Space Administration (NASA) MERRA has been released, MERRA-2 [8]. The aforementioned reanalysis products span from January 1979 to at least December 2017 (at the time of this study), with the exception of JRA-55, which begins in January 1958. Each of the third-generation models is considered state-of-the-art, but differences in meteorological outputs arise from the different modeling approaches (refer to Table 1 for general model comparisons; refer also to the Overview & Comparison Tables at https://climatedataguide.ucar.edu/climate-data/atmosphericreanalysis-overview-comparison-tables). Chou et al.
[10] Reanalysis products are increasingly used for investigations into the changing climate. For example, Santer et al. [11] find a trend of increasing tropopause heights in ERA-40 linked to anthropogenic-forced warming, in agreement with climate model simulations and satellite data. Serreze et al. [12] compare vertical temperature and moisture profiles from CFSR, ERA-I, and MERRA to radiosonde station data within the Arctic, a data-sparse region where gridded output is heavily simulated. They find good agreement between the reanalyses and measured temperature and moisture trends, even though both the radiosonde data and the reanalyses carry uncertainties. Lindsay et al. [13] evaluate seven reanalysis models (at least one model per reanalysis generation) for the Arctic region and find that the third-generation reanalyses (CFSR, ERA-I, and MERRA) provide the most significant correlations to observations, but with biases between the reanalyses for 2-m air temperature and other metrics. Chen et al. [14] examine variability between CFSR, ERA-I, JRA-55, and MERRA in the warm-season diurnal cycle over East Asia. They find that although these four reanalysis models reproduce the diurnal precipitation well over large spatial scales, individual models disagree considerably over regional scales. Chen et al. [14] also show that JRA-55 is in good agreement with observational data showing an increase in morning precipitation over their investigated domain. The same study determines that JRA-55 accurately reproduces wind speeds on the leeside of the Tibetan Plateau.
The above evaluations can be referred to as model-observation comparisons, where reanalysis model simulations are assessed by comparison to satellite and/or station observations. Here, we present an ensemble mean of the four leading reanalysis models (CFSR, ERA-I, JRA-55, and MERRA) and compare the individual reanalysis models against the ensemble mean to examine differences in the 2-m air temperature (T2m), precipitation, and 500-hPa geopotential height (Z500) fields. Spatial and time series comparisons are made with respect to the 30-year climate average period 1981-2010. The reanalysis ensemble mean should not be taken as "truth", as our primary intention is to identify similarities and differences between the reanalysis models to provide baseline information for future studies. Although interannual variability correlations for T2m are strong on the global mean, we find significant differences in T2m between the reanalyses over many arid regions. There are also notable precipitation departures between the models in association with the Intertropical Convergence Zone (ITCZ). For T2m and precipitation, gridded observations from the Global Historical Climatology Network (GHCN) and the Global Precipitation Climatology Centre (GPCC) are used to show baseline differences for each reanalysis model as well as the ensemble mean.

Experiments
Each reanalysis model is available as monthly-average files, which were used to produce the 1981-2010 averages. For CFSR, monthly files were found at the National Centers for Environmental Information (NCEI; www.ncdc.noaa.gov/data-access/model-data/model-datasets/climate-forecast-system-version2-cfsv2). Output from the ERA-I and JRA-55 models was obtained from the NCAR Computational Information Systems Laboratory (CISL) Research Data Archive (RDA; http://rda.ucar.edu) under ds627.1 and ds628.1, respectively. Monthly files for MERRA were obtained from the NASA Modeling and Assimilation Data and Information Services Center (MDISC; http://disc.sci.gsfc.nasa.gov/mdisc).
The unprocessed outputs for each reanalysis model in this study are on different horizontal grids (Table 1). For simplicity, all model output fields were regridded to the same regular 0.5° × 0.5° latitude-longitude grid, matching that of CFSR. Although it is most common to regrid to the lowest resolution of a model suite, the choice to use a finer common grid does not alter interpretations of the large-scale features reported here. Regridding was done using bilinear interpolation from the Earth System Modeling Framework (ESMF) software embedded within the NCAR Command Language (NCL). Once regridded, netCDF files were produced for monthly, seasonal, and annual averages for each year and reanalysis product. The 1981-2010 climate average was then created, again for each month and season and for the annual mean. The reanalysis ensemble was generated by averaging the monthly, seasonal, and annual means of the regridded CFSR, ERA-I, JRA-55, and MERRA outputs produced as explained above.
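The regridding and averaging steps described above can be illustrated with a minimal NumPy sketch. This is not the ESMF/NCL workflow used in the study: the function names `regrid_bilinear` and `ensemble_mean`, the 2.5° source grid, the synthetic field, and the constant member offsets are all assumptions made for illustration. On a rectilinear grid, bilinear interpolation reduces to linear interpolation along longitude followed by linear interpolation along latitude, which is what the sketch does.

```python
import numpy as np

def regrid_bilinear(field, src_lat, src_lon, dst_lat, dst_lon):
    """Bilinear interpolation of a (lat, lon) field on a rectilinear grid.

    Coordinates must be ascending; longitude is treated as periodic (0-360).
    """
    # Step 1: interpolate every source-latitude row onto the target longitudes.
    rows = np.stack([np.interp(dst_lon, src_lon, row, period=360.0) for row in field])
    # Step 2: interpolate every target-longitude column onto the target latitudes.
    return np.stack([np.interp(dst_lat, src_lat, col) for col in rows.T], axis=1)

def ensemble_mean(members):
    """Unweighted mean of equally shaped member fields."""
    return np.mean(np.stack(members), axis=0)

# Common 0.5-degree grid with nodes at 90N and 0E (stored south-to-north here).
dst_lat = np.arange(-90.0, 90.25, 0.5)
dst_lon = np.arange(0.0, 360.0, 0.5)

# Hypothetical coarse source grid (2.5 degrees) carrying a synthetic field.
src_lat = np.arange(-90.0, 90.1, 2.5)
src_lon = np.arange(0.0, 360.0, 2.5)
synthetic = 30.0 * np.cos(np.deg2rad(src_lat))[:, None] + 0.0 * src_lon[None, :]

regridded = regrid_bilinear(synthetic, src_lat, src_lon, dst_lat, dst_lon)

# Four synthetic "members" that differ only by a constant offset (illustrative).
members = [regridded + offset for offset in (-0.2, -0.1, 0.1, 0.2)]
ens = ensemble_mean(members)
departures = [m - ens for m in members]  # member minus ensemble mean
```

In practice each member would be read from its own netCDF file and regridded from its native grid before averaging; the equal weighting of members in `ensemble_mean` mirrors the simple averaging described in the text.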
Gridded datasets are used in this study as a baseline for the comparisons. For T2m, the GHCN version 3 (https://data.nodc.noaa.gov/cgi-bin/iso?id=gov.noaa.ncdc:C00839) [15,16] gridded dataset is used. The GHCN output is already on a 0.5° × 0.5° latitude-longitude grid; however, the grid cells do not match those of the regridded reanalysis output (the upper-left corner of the array is 89.75° N and 0.25° E). Thus, it was necessary to regrid GHCN so that the upper-left corner starts at 90° N and 0° E, matching the 0.5° × 0.5° latitude-longitude grid of the reanalyses. As a baseline for precipitation, the Global Precipitation Climatology Centre (GPCC; https://www.esrl.noaa.gov/psd/data/gridded/data.gpcc.html) [17] files are used. Like GHCN, GPCC output is on a 0.5° × 0.5° latitude-longitude grid with the same upper-left corner. The same regridding scheme was applied to GPCC output for comparisons to the reanalyses. As with the reanalyses, 30-year mean files were made for the period 1981-2010.

Two-Meter Air Temperature
For the T2m baseline (Figure 1), we show each reanalysis (panels a-d) and the ensemble mean (panel e) minus GHCN. It is evident that where observations are scarce there are major differences between the reanalyses/ensemble and GHCN output, such as in Greenland, central South America, the Sahara Desert, and the Himalayas. Differences exceed 8 °C in some of these regions. Conversely, where observations are available for the duration of the reanalysis models, the outputs are closer to GHCN, as expected. The average differences between the reanalyses and GHCN range from −0.
Global T2m differences between the reanalysis models and the ensemble mean (Figure 2) arise primarily near the poles, with slight departures from the mean generally in locations where observations are relatively scarce. For example, CFSR shows a difference of −2 to −3 °C over the desert belts in Africa and the Middle East (i.e., Sahara, Sahel, and Saudi Arabia), while ERA-I differs most, by about −1 °C, in Arctic Canada and East Antarctica. JRA-55 T2m generally compares well with the ensemble mean, with negative departures from the mean over the Arctic and Southern Oceans. Last, MERRA T2m is generally warmer than the mean over most of Africa, Australia, Central Asia, and South America.
Time series of T2m during the 30-year study period over desert regions (black boxes in Figure 2) can be found in Figure 3. Interannual variability between the reanalyses agrees well for each region (r > 0.8). The range of time series correlations across the smallest desert analyzed in this study, the Taklimakan Desert, China, is larger (0.41 < r < 0.97) than the correlation coefficient ranges for the Sahara Desert, Saudi Arabia, and Central Greenland (0.91 < r < 0.99, 0.94 < r < 0.99, and 0.95 < r < 0.99, respectively). The topography surrounding the Taklimakan Desert varies much more than that of the other deserts referred to here, which could explain the variability in correlations found between the reanalyses.
Figure 4 shows the T2m time series with accompanying correlation coefficients over the global domain. Note that all reanalysis-ensemble correlations are greater than 0.97. Reanalysis-reanalysis T2m correlations are also high, with the lowest being between CFSR and JRA-55 at 0.93. ERA-I shows the highest correlation against the ensemble mean at 0.99, with an average T2m difference of about −0.10 °C relative to the ensemble mean. MERRA T2m has the highest average deviation from the ensemble, about +0.14 °C. The MERRA T2m time series curve does not intersect the other reanalysis curves, making MERRA outputs consistently warmer than all other reanalyses. The JRA-55 T2m time series is closest to the ensemble mean, with a mean departure of only −0.02 °C.
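As an illustration of how global-mean time series, their correlation coefficients, and mean departures from the ensemble of the kind reported for Figure 4 could be computed, consider the following sketch. The cos(latitude) area weighting is an assumption (the text does not state how the global mean was formed), the member names "A", "B", and "C" are placeholders, and all data are synthetic rather than reanalysis output.

```python
import numpy as np

def global_mean(field, lat):
    """cos(latitude)-weighted mean of a (time, lat, lon) array, one value per time step."""
    weights = np.cos(np.deg2rad(lat))
    zonal = field.mean(axis=-1)                      # average over longitude
    return (zonal * weights).sum(axis=-1) / weights.sum()

def correlation_table(series):
    """Pearson r between every pair of named time series."""
    names = list(series)
    return names, np.corrcoef(np.vstack([series[n] for n in names]))

# Synthetic stand-in data: 30 "years" on a coarse grid.
rng = np.random.default_rng(0)
lat = np.arange(-87.5, 90.0, 5.0)
years = np.arange(1981, 2011)
signal = 0.02 * (years - years[0]) + 0.1 * rng.standard_normal(years.size)

series = {}
for name, offset in [("A", -0.1), ("B", 0.0), ("C", 0.1)]:
    noise = 0.02 * rng.standard_normal((years.size, lat.size, 8))
    field = 14.0 + offset + signal[:, None, None] + noise
    series[name] = global_mean(field, lat)

# Ensemble series is the unweighted mean of the member series.
series["Ens"] = np.mean([series[n] for n in ("A", "B", "C")], axis=0)

names, r = correlation_table(series)
# Mean departure of each series from the ensemble over the whole period.
mean_departure = {n: float(np.mean(series[n] - series["Ens"])) for n in names}
```

Because the synthetic members share one interannual signal and differ mainly by a constant offset, they correlate strongly while keeping distinct mean departures, which is the same qualitative pattern the text describes for the real reanalyses.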
Atmosphere 2018, 9, x FOR PEER REVIEW 6 of 12
To compare T2m trends, the derivative of the 5-year running mean is used for each reanalysis model. ERA-I, JRA-55, and MERRA display similar trends across the 1981-2010 period. However, CFSR yields a steeper increase in T2m during the 1997-2003 period, which can be seen in Figure 4. The CFSR T2m increase over the 1997-2003 period is 0.061 °C/year, whereas the other reanalysis trends are less than 0.038 °C/year. In the last year of the study period, 2010, CFSR deviates from the trend of the other three reanalyses (Figure 4).
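One way to obtain a trend from the derivative of a 5-year running mean is sketched below. The text does not specify how the derivative was taken, so the centred finite differences of `np.gradient` are an assumption, as are the function names and the synthetic linear series used as input.

```python
import numpy as np

def running_mean(values, window=5):
    """Centred running mean; only windows fully inside the series are kept."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")

def running_mean_trend(years, values, window=5):
    """Derivative (units per year) of the running mean, for an odd window length."""
    smoothed = running_mean(values, window)
    half = window // 2
    # Each smoothed value is assigned to the central year of its window.
    centre_years = np.asarray(years, dtype=float)[half:len(years) - half]
    return centre_years, np.gradient(smoothed, centre_years)

years = np.arange(1981, 2011)
synthetic = 14.0 + 0.02 * (years - 1981)       # synthetic series warming at 0.02 per year
centre_years, trend = running_mean_trend(years, synthetic)

# Average trend over a sub-period, e.g. 1997-2003.
period = (centre_years >= 1997) & (centre_years <= 2003)
period_trend = float(trend[period].mean())
```

With a 5-year window the smoothed series loses two years at each end, so trends are defined for 1983-2008 only; a sub-period such as 1997-2003 lies well inside that range.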
For ocean-based 2-m temperature, the reanalyses generally agree well with the ensemble mean, which is to be expected as sea surface temperature (SST) strongly influences T2m. The largest departures from the mean arise over the Arctic and Southern Oceans in JRA-55 (Figure 2). As these regions are data-sparse, this is expected. However, this could also be due to differences in sea ice, as near-surface temperature is highly influenced by sea ice. Each reanalysis, except CFSR, uses a prescribed SST field interpolated from observations. CFSR incorporates a combination of versions 1 and 2 of the optimum interpolation (OI) methods for the period November 1981-present [18]. For the January 1979-October 1981 period, SST fields from ERA-40 are used. For further information on reanalyses and SST data sets and methods for CFSR, ERA-I, and MERRA, refer to Kumar et al. [19]. JRA-55 uses the Centennial in situ Observation-Based Estimates (COBE) SST data set [20].

Precipitation
Using GPCC as the baseline for precipitation on land, we show each reanalysis model and the ensemble mean minus GPCC (Figure 5). It is evident that CFSR outputs more precipitation in the Northern Hemisphere, with MERRA closer to GPCC. In South America, all four reanalyses are drier than GPCC, with MERRA showing values over 1 m less than GPCC. Over the Himalayas, each reanalysis displays more precipitation, with MERRA showing less on the southern slopes. The average differences across the domain are all positive, ranging from 0.04 m (MERRA) to 0.18 m (CFSR). However, the fact that MERRA's average is the closest to GPCC does not suggest it is "better" than the other reanalyses in this study; its small average difference is likely due to the large negative bias over South America. The ensemble mean average difference is 0.11 m.
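A land-only baseline comparison of this kind can be sketched as follows. The function name `baseline_difference`, the use of NaN to mark cells without baseline data, and the cos(latitude) weighting of the domain average are all assumptions; the text does not state how the domain-average differences were computed, and the arrays are synthetic.

```python
import numpy as np

def baseline_difference(model, baseline, lat):
    """Model minus baseline, plus its area-weighted mean over valid baseline cells."""
    diff = model - baseline                      # NaN wherever the baseline is missing
    weights = np.broadcast_to(np.cos(np.deg2rad(lat))[:, None], diff.shape)
    valid = ~np.isnan(diff)
    mean_diff = (diff[valid] * weights[valid]).sum() / weights[valid].sum()
    return diff, float(mean_diff)

lat = np.arange(-89.75, 90.0, 0.5)
lon = np.arange(0.25, 360.0, 0.5)

# Synthetic fields: the "model" is uniformly 0.1 wetter than the baseline.
baseline = np.full((lat.size, lon.size), 1.0)
baseline[:, : lon.size // 2] = np.nan            # pretend half the domain is ocean
model = np.full((lat.size, lon.size), 1.1)

diff_map, mean_diff = baseline_difference(model, baseline, lat)
```

A single domain average can hide compensating regional biases, which is why the text cautions against reading the smallest average difference as the best agreement.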

Difference plots between the reanalyses and the ensemble mean for precipitation are shown in Figure 6. The spatial differences are noisy, particularly along large mountain ranges (e.g., Himalayas) and near small islands that are resolved by the reanalyses (e.g., Philippines). Major differences are found along the ITCZ. In the eastern hemispheric tropics, CFSR and MERRA show less precipitation than the ensemble mean. JRA-55 shows precipitation values higher than the mean over the oceans and generally less than the mean over Indonesia, the Philippines, and Malaysia, as well as northern South America. The ERA-I precipitation departure is smaller than that of the other members along the eastern hemispheric tropics, showing less precipitation over the islands and less than 0.5 m/year above the ensemble mean over the western tropical Pacific Ocean.
Figure 7 shows precipitation outputs over the western hemispheric tropics. CFSR and ERA-I in this region generally agree with the ensemble mean, whereas JRA-55 and MERRA deviate from the mean on the order of 0.47 and −0.67 m/year, respectively, at the peak of the zonal average. JRA-55 presents a much stronger precipitation belt, while the zonal average from MERRA suggests a weaker precipitation belt, with respect to CFSR and ERA-I. Seasonal differences (Supplementary Figure S1) relative to the ensemble mean arise during the boreal summer (JJA) and fall (SON) months for JRA-55 and MERRA. The reanalysis models have poor agreement on the precipitation total in the ITCZ, but good agreement overall on the timing of the seasonal shift of the ITCZ in the Western Hemisphere. The zonal average peaks for each reanalysis model fall within 1° latitude of the ensemble zonal average peak during the 1981-2010 climatology.
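The zonal-average comparison can be illustrated with a short sketch that averages a precipitation field over a longitude band, locates the latitude of the peak, and measures a member's departure at the ensemble peak. The longitude bounds (180°-360°), the Gaussian "rain belts", and the function names are hypothetical; the exact region of Figure 7 and the way the peak difference was measured are not given in the text.

```python
import numpy as np

def zonal_average(field, lon, lon_min, lon_max):
    """Mean over all longitudes inside [lon_min, lon_max] for each latitude."""
    in_box = (lon >= lon_min) & (lon <= lon_max)
    return np.nanmean(field[:, in_box], axis=1)

def peak_latitude(profile, lat):
    """Latitude at which the zonal-average profile is largest."""
    return float(lat[np.nanargmax(profile)])

lat = np.arange(-30.0, 30.5, 0.5)
lon = np.arange(0.0, 360.0, 0.5)

def synthetic_belt(centre, amplitude):
    """Synthetic Gaussian rain belt (m/year), uniform in longitude."""
    profile = amplitude * np.exp(-((lat - centre) / 5.0) ** 2)
    return profile[:, None] * np.ones(lon.size)[None, :]

ens_field = synthetic_belt(7.0, 3.0)       # stand-in for the ensemble mean
member_field = synthetic_belt(7.5, 3.4)    # stand-in for one member

ens_profile = zonal_average(ens_field, lon, 180.0, 360.0)
member_profile = zonal_average(member_field, lon, 180.0, 360.0)

peak_shift = peak_latitude(member_profile, lat) - peak_latitude(ens_profile, lat)
i_peak = int(np.nanargmax(ens_profile))
difference_at_peak = float(member_profile[i_peak] - ens_profile[i_peak])
```

Separating `peak_shift` from `difference_at_peak` reflects the distinction drawn in the text between agreement on where the ITCZ sits and disagreement on how much precipitation falls there.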
CFSR exhibits a specific weakness not found in the other models: spherical harmonic artifacts appear in the precipitation (Figure 8) and T2m fields. These unrealistic features are found not only in sub-daily and monthly outputs but also in the 1981-2010 climatological average. Slight spherical harmonic artifacts in T2m and precipitation are present within the ensemble mean produced here as a result of the inclusion of CFSR.

Geopotential Heights at 500 hPa
Geopotential heights at 500 hPa (Z500) generally agree across the four reanalyses, with slight departures from the mean (±20 m) (Figure 9). The largest deviations are over Antarctica, where radiosonde observations are sparse and the models are therefore poorly constrained. Above the friction layer, the free atmosphere becomes geostrophic, enabling model simulations to better approximate real-world observations.


Conclusions
In conclusion, three widely used atmospheric variables (T2m, precipitation, and Z500) are compared between each of the third-generation global climate reanalysis models (CFSR, ERA-I, JRA-55, and MERRA) and the respective ensemble mean of these reanalyses. While the four products generally agree on the large scale, there are differences when examining regional-scale climatology. For example, we note that the precipitation belt in the ITCZ displays differences between model solutions on the order of 13-20% (i.e., JRA-55 and MERRA, respectively) relative to the ensemble mean.
Over regional domains, T2m outputs for each reanalysis are shown to have large differences in extremely dry regions, such as polar ice sheets and low-latitude deserts. As examples, T2m over the Sahara and Saudi Arabia in CFSR (MERRA) is much lower (higher) with respect to the ensemble mean during the 1981-2010 period, whereas ERA-I and JRA-55 are generally in good accord with the mean (Figures 2 and 3). Differences in these areas are most likely linked to scarce observation data. T2m correlations between the reanalysis models reveal strong agreement (Figure 4) for interannual variability, meaning that the models are in agreement on large-scale circulation processes over long temporal scales. Differences nevertheless arise near the surface between the reanalysis model solutions, which could be attributed to any number of internal model differences, such as resolution, assimilation methods, observational input data error, land surface schemes, and physics. Future work on meteorological case studies will prove useful for improving seasonal, monthly, daily, and sub-daily reanalysis outputs and anomalies within the next generation of reanalysis products. Since most reanalysis differences occur at or near the surface, investigating the internal models and surface model schemes will bring reanalysis solutions closer to observations.

Supplementary Materials: Supplementary materials can be found at http://www.mdpi.com/2073-4433/9/6/236/s1.

Figure 1. (a-d) Gridded differences in 2-m air temperature for each reanalysis model minus the Global Historical Climatology Network (GHCN) over the 1981-2010 period. Panel (e) shows the same but for the ensemble mean.

Figure 2. (a-d) Gridded differences in 2-m air temperature for each reanalysis model minus (e) the ensemble mean 2-m air temperature over the 1981-2010 period. Black boxes show locations for the annual average time series in Figure 3.

Figure 3. Time series of each member and the ensemble mean (Ens) of annual 2-m air temperature averaged over the 1981-2010 period. Locations are shown as black boxes in Figure 2.

Figure 4. Global annual 2-m air temperature for each member and the ensemble mean (Ens) during 1981-2010. Correlation coefficients (r) for each time series are shown in the upper-left table.

Figure 5. (a-d) Gridded differences in precipitation for each reanalysis model minus the Global Precipitation Climatology Centre (GPCC) over the 1981-2010 period. Panel (e) shows the same but for the ensemble mean.

Figure 6. (a-d) Gridded differences in precipitation for each reanalysis model minus (e) the ensemble mean annual precipitation over the 1981-2010 period.

Figure 7. (a) The ensemble mean precipitation over the 1981-2010 average for the western hemispheric tropics. The black curve in (b) is the ensemble zonal average of precipitation from (a). The black curves in (c-f) are the respective precipitation zonal averages over the same region as (a) for each reanalysis model. The gray curves in (c-f) show the differences of the ensemble subtracted from the reanalysis model. The thin black line in (c-f) marks zero difference.

Figure 8. A comparison between CFSR (top) and ERA-I (bottom) 1981-2010 average annual precipitation totals. CFSR contains unrealistic spherical harmonic artifacts in both short- and long-term averages, whereas ERA-I (as well as JRA-55 and Modern-Era Retrospective Analysis for Research and Applications (MERRA)) displays a more spatially consistent output.

Figure 9. (a-d) The difference in 500-hPa geopotential heights between each reanalysis and (e) the ensemble (reanalysis minus the ensemble) over the 1981-2010 period. Horizontal-line artifacts in maps (a-d) are inherited from regridding within MERRA, which was averaged into the ensemble.

Table 1. Summary of reanalysis models used for the ensemble. Major differences between each reanalysis model are shown here.