Comparison of GCM Precipitation Predictions with Their RMSEs and Pattern Correlation Coefficients

This study evaluated 20 general circulation models (GCMs) of the Coupled Model Intercomparison Project, Phase 5 (CMIP5) that provide predictions for the period of 2006 to 2014, for which observation data (the Global Precipitation Climatology Project (GPCP) data) are available. The GCM predictions of precipitation and the GPCP data were compared for three data structures (the global, zonal, and grid means) using conventional statistics like the root mean square error (RMSE) and the pattern correlation coefficient of the cyclostationary empirical orthogonal functions (CSEOFs). As a result, it was possible to select the GCM that showed the best performance among the 20 GCMs considered in this study. Overall, the NorESM1-M model was found to be the most similar to the GPCP data. Additionally, the IPSL-CM5A-LR, BCC-CSM, and GFDL-CMS models were also found to be quite similar to the GPCP data.


Introduction
Due to global warming, most countries are currently preparing for a future with different climate conditions. The greenhouse effect has been considered a key factor of global warming since the twentieth century [1]. Recently, the Intergovernmental Panel on Climate Change (IPCC) stated that the main cause of global warming is the ongoing emission of greenhouse gases from various anthropogenic activities [2]. Rapid climate change can have a significantly negative impact on a country's economy and industry, and the health of the ecosystem is also believed to be significantly affected. As climate change has continued globally, both national and international cooperation is becoming increasingly urgent [3,4]. Highly reliable scenarios of climate change are very important for preparing countermeasures to the possible impact of climate change [5]. The IPCC presented four different scenarios of representative concentration pathways (RCP) for climate change predictions [2]. The RCP scenarios were proposed by converting the emitted greenhouse gases into solar radiative forcing, as in the cases of RCP2.6, RCP4.5, RCP6.0, and RCP8.5. For example, the RCP2.6 scenario represents the case where the forcing from the total greenhouse gases emitted is 2.6 W/m², in addition to the solar radiative forcing at the preindustrial condition, defined as the year 1750. The RCP2.6 scenario represents the case in which the earth can recover its original pre-anthropogenic state, prior to the emission of the greenhouse gases. The RCP4.5 scenario represents the case where the reduction of greenhouse gases is successfully achieved, and the RCP6.0 scenario represents the case where the reduction of greenhouse gases is moderately achieved. Finally, the RCP8.5 scenario represents the case where no changes are implemented and the emission trend observed in the early 21st century continues [2].
Along with providing the RCP scenarios, the IPCC has also presented the Coupled Model Intercomparison Project, Phase 5 (CMIP5) for evaluating and comparing the general circulation model (GCM) predictions based on RCP scenarios [2]. More than 60 GCMs were used in the CMIP5, all of which were developed to make predictions about the future state of meteorology, considering both the non-stationarity of the climate system and the physical process of meteorological phenomena [6]. However, as the theories and algorithms applied in each GCM differ, a single, unique prediction cannot be made. The purpose of CMIP5 is simply to evaluate and compare these GCM predictions. Currently, the use of GCM predictions is assumed to be the best way to acquire information about future global meteorological data [7,8]; however, considerable differences exist among the GCM predictions [9-11].
Since the GCM predictions have become available through the CMIP5, many studies have reported the use and application of GCM predictions for hydrology as well as for meteorology. For example, Landerer et al. [12] evaluated the spatial and temporal features of sea surface height simulated by 33 GCMs of CMIP5, by using observed data from satellite altimetry. Also, Tabari et al. [13] investigated the potential impact of climate change on water availability in central Belgium, by using 30 GCMs of CMIP5. Ning and Bradley [14] examined the regional winter temperature and precipitation over the eastern United States by using 17 GCMs of CMIP5. Aloysius et al. [9] assessed the performance of 25 GCMs of CMIP5, by analyzing the precipitation and temperature predictions over central Africa.
While many studies have concentrated on the comparison and application of the GCM predictions of the CMIP5, few studies have validated the potential ability of each GCM to simulate the future climate. Most studies have focused on comparing the basic statistics of the GCM predictions with observed data; however, this is insufficient for application to the global climate system, which shows strongly non-linear behavior. For example, Lovejoy and Varotsos [15] examined the GCM output, and found that the linear assumption can be invalid for GCM output with timescales of less than 50 years. Therefore, a more fundamental approach is necessary to evaluate various GCMs and their predictions. Principal component analysis can be a good option.
In this study, the GCM predictions of precipitation will be compared using conventional statistics like the root mean square error (RMSE) and the pattern correlation coefficient of the CSEOF. The GCMs to be compared are those that provide predictions over the period of 2006 to 2014, which is the period for which the global precipitation data from GPCP are available. The comparison will also be repeated with several data structures, namely the global, zonal, and grid means. It is especially important to check if the result derived by comparing the conventional statistics is similar to that from the comparison of the major principal components (i.e., CSEOFs) derived by applying the CSEOF analysis. The comparison results will also be merged, to distinguish one GCM from the others.

Data
More than 60 GCM predictions of CMIP5 are available from the World Data Center for Climate (WDCC). A total of 23 development institutes contributed to the CMIP5 GCM data. First, the authors selected one GCM simulation from each development institute. Also, among the 23 development institutes, the National Institute for Space Research and Institute of Atmospheric Physics and the Chinese Academy of Sciences were excluded, as their simulations were not based on the RCP scenarios. In the case of the CMCC-CM model, its grid size was too small (0.750° × 0.750°) to conduct CSEOF analysis. Finally, the authors selected a total of 20 GCMs to be used for the CSEOF analysis in this study. Table 1 summarizes these GCMs with their development institute and grid resolution.
Among the GCM predictions available for the period from 2001 to 2300, only the predictions from 2006 to 2014 were used in this study, in order to compare them with the observed data. Among the four RCP scenarios, RCP8.5 was selected, as it is based on the assumption that the trend of greenhouse gas emission observed in the early 21st century continues. The GCM precipitation data collected in this study were monthly data, which were accumulated to make the annual data. All the analyses in this study were based on these annual data. Spatially, however, three different data structures were considered: global, zonal, and grid data. The observed data used for the validation of these 20 GCMs were sourced from the Global Precipitation Climatology Project (GPCP) version 2.3 of the Combined Precipitation Data Set [37]. The GPCP-observed precipitation data were constructed by the Mesoscale Atmospheric Processes Laboratory of the National Aeronautics and Space Administration (NASA) Goddard Space Flight Center. The GPCP data are well-organized grid data, developed by merging rain gauge data and satellite data. The GPCP data are available from 1979 to 2016 in the form of monthly average data. In this study, the GPCP data were collected for the period of 2006 to 2014, for comparison with the GCM predictions. The GPCP data were also accumulated to make the annual data.

Global Data
In this study, the global data were derived simply as the arithmetic mean of all grid values. Figure 1 compares the global mean precipitation of eight GCMs with the GPCP data from 2006 to 2014. Eight GCMs were selected from the 20 GCMs to make the comparison clearer. As can be seen in Figure 1, all eight GCMs show higher annual precipitation than the observed. Among them, the IPSL-CM5A-LR model was found to have the most similar predictions (about 810 mm) to the GPCP data (about 800 mm), with the NorESM1-M model second (about 820 mm). On the other hand, the MIROC5 model provided the highest value (about 950 mm). The GISS-E2-R model also provided annual precipitation higher than 900 mm. In addition, the GPCP data showed a weak decreasing trend, and the GCMs showed steady or weak increasing trends. However, the trends during this period of 2006 to 2014 were very weak, so all the GCM and GPCP annual precipitation amounts may be assumed to be stationary.
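The two data-preparation steps described so far (accumulating monthly values into annual totals, then taking the arithmetic mean of all grid values) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code; the array layout `(n_months, n_lat, n_lon)` and the function names are assumptions, and the mean is deliberately unweighted because the text describes a simple arithmetic mean.

```python
import numpy as np

def annual_totals(monthly):
    """Sum monthly precipitation (n_months, n_lat, n_lon) into annual totals."""
    n_months = monthly.shape[0]
    if n_months % 12 != 0:
        raise ValueError("expected a whole number of years of monthly data")
    # Group the time axis into (n_years, 12) and sum over the 12 months.
    return monthly.reshape(n_months // 12, 12, *monthly.shape[1:]).sum(axis=1)

def global_mean(annual):
    """Unweighted arithmetic mean over all grid cells, one value per year."""
    return annual.mean(axis=(1, 2))

# Synthetic example: 2 years of a constant 70 mm/month on a 4 x 8 grid.
monthly = np.full((24, 4, 8), 70.0)
annual = annual_totals(monthly)
print(global_mean(annual))  # [840. 840.]
```

An area-weighted mean (for example, weighting by the cosine of latitude) would give different values; it is not used here only because the text does not mention it.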

Zonal Data
In this study, the zonal data were derived as the arithmetic mean of each 10° latitudinal zone, from −90° to 90° latitude. The comparison of zonal data can also be found in Yoo [38], where the annual mean temperature was of interest for the validation of the GCM predictions. A total of 18 zonal mean data were made each year for each GCM and the GPCP data, whose annual mean is compared with other GCMs in Figure 2. Again, eight of the 20 GCMs were selected to compare the zonal average more clearly. In this figure, zone 1 indicates the latitudinal zone from −90° to −80°, zone 2 from −80° to −70°, and finally zone 18 indicates the latitudinal zone from 80° to 90°.
As can be seen in Figure 2, the overall pattern of the zonal mean data was similar in both the GCM predictions and the GPCP data. Very high precipitation was found around the equator in all GCMs and the GPCP data, and medium-high precipitation was also found in the mid-latitude zones of both the northern and southern hemispheres. In greater detail, however, each GCM prediction differs somewhat from the others. First of all, in most GCMs, the annual precipitation in zones 9 and 10, around the equator, was predicted to be very similar, but the GPCP data do not show this similarity. Specifically, the GPCP precipitation in zone 9 was found to be less than that in zone 10. Only GCMs like NCAR-CAM5 and MIROC5 predicted this non-symmetric pattern of annual precipitation around the equator. In Figure 2, it can also be found that some of the GCMs produced rather high precipitation predictions around the equator. In particular, the GISS-E2-R, MIROC5, and NCAR-CAM5 models had relatively high precipitation predictions, with a difference of more than 500 mm from other GCMs and the GPCP data. The high equatorial predictions may be the main reason for these GCMs to have a high annual mean precipitation amount (see Figure 1). Most of the GCMs considered in this study seem to follow the pattern of the GPCP data well, especially the IPSL-CM5A-LR and NorESM1-M models, which seem to be very close to the pattern of the GPCP data over all latitudinal zones.
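The zone numbering used for the zonal data (zone 1 for −90° to −80° up to zone 18 for 80° to 90°, so that zones 9 and 10 lie on either side of the equator) can be sketched as below. This is an illustrative sketch on a synthetic field; the function names and the 2.5° example grid are assumptions, not details taken from the study.

```python
import numpy as np

def zone_number(lat):
    """Map latitude in degrees to a zone number 1..18 (zone 1 = -90 to -80)."""
    idx = np.floor((np.asarray(lat, dtype=float) + 90.0) / 10.0).astype(int)
    # Latitude +90 itself would fall into a 19th bin, so clip it into zone 18.
    return np.clip(idx, 0, 17) + 1

def zonal_means(field, lats):
    """Average a (n_lat, n_lon) field over each 10-degree latitudinal zone."""
    zones = zone_number(lats)
    return np.array([field[zones == z].mean() for z in range(1, 19)])

# Synthetic field whose value equals its latitude, on a 2.5-degree grid.
lats = np.arange(-88.75, 90.0, 2.5)
field = np.repeat(lats[:, None], 144, axis=1)
zm = zonal_means(field, lats)
print(zone_number([-5.0, 5.0]))  # [ 9 10]
```

With this synthetic field, each zonal mean equals the central latitude of its zone, which is a convenient check that the binning is correct.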

Grid Data
The grid data were derived as the arithmetic mean for the 10° × 10° grid, and as a result, a total of 648 grid data were prepared for each year. In this study, the basic statistics of the global, zonal, and grid mean data of the GCM predictions and the GPCP data were compared first. The mean, standard deviation, coefficient of variation, and normalized root mean square error (NRMSE) of the GCM predictions and the GPCP data are summarized in Table 2. As can be found in this table, each GCM shows a slightly different mean, standard deviation, and NRMSE. In the case of the global mean, the GISS-E2-R model showed the highest value, and the CanESM2 model the lowest. As seen in Figure 1, the annual mean precipitation of all GCMs is higher than that of the GPCP data. The standard deviation of the global data was also found to differ among the GCMs and the GPCP data. Unlike the annual mean, the standard deviation of the GPCP data was found to be the highest. Among the GCMs, the MRI-CGCM3 model showed the highest standard deviation, 5.7 mm. The GISS-E2-R model showed the lowest standard deviation, at just 1.9 mm. The standard deviation of the grid data reflects both the temporal and spatial variation of the precipitation data. Very different from the global data, the standard deviation of the GPCP grid data (652.5 mm) was found to be smaller than that of most of the GCM data. Even though the GISS-E2-R model showed a very high standard deviation (780.0 mm), most other GCMs showed less than 5% difference from the GPCP data. The zonal data are not appropriate for calculating the standard deviation, so it was not derived in this study.
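The count of 648 grid data follows from 18 latitude bands × 36 longitude bands at 10° spacing. A minimal sketch of such block averaging is given below, under the assumption that the native grid divides evenly into 10° blocks (for example a 2.5° grid with 72 × 144 cells); the text does not state how GCM grids that do not divide evenly were handled, so that case is simply rejected here. The function name is hypothetical.

```python
import numpy as np

def block_mean(field, n_lat_out=18, n_lon_out=36):
    """Average a (n_lat, n_lon) field into n_lat_out x n_lon_out blocks."""
    n_lat, n_lon = field.shape
    if n_lat % n_lat_out or n_lon % n_lon_out:
        raise ValueError("grid does not divide evenly into the target blocks")
    fy, fx = n_lat // n_lat_out, n_lon // n_lon_out
    # Split each axis into (block index, position within block), then
    # average over the two within-block axes.
    return field.reshape(n_lat_out, fy, n_lon_out, fx).mean(axis=(1, 3))

# Synthetic 2.5-degree field (72 x 144 cells) reduced to 10-degree cells.
field = np.arange(72 * 144, dtype=float).reshape(72, 144)
coarse = block_mean(field)
print(coarse.shape, coarse.size)  # (18, 36) 648
```

Because every block contains the same number of cells, the mean of the coarse field equals the mean of the original field.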
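Table 2 also shows the NRMSE estimated for each GCM. The NRMSE is a dimensionless measure, calculated by dividing the error of the GCM predictions by the mean of the observed data [39]. The NRMSE accounts for both the mean difference (or bias) and the variability of the difference between two datasets. The NRMSE is calculated by the following equation:

$$\mathrm{NRMSE} = \frac{1}{\bar{X}_{obs}} \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( X_{simul,i} - X_{obs,i} \right)^{2}}$$

where $N$ is the total number of data, $X_{obs}$ represents the observed data, $\bar{X}_{obs}$ is the mean of the observed data, and $X_{simul}$ represents the GCM predictions. As the NRMSE represents the arithmetic difference between the GCM predictions and the observed data, a smaller NRMSE indicates GCM predictions that are closer to the observed data.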
As the GPCP data were considered to be the truth in this study, their NRMSE is zero for all three data structures.
As can be found in Table 2, for the global data, the NRMSE of the IPSL-CM5A-LR model was estimated to be the smallest (0.03), followed by the NorESM1-M model (0.04) and the BCC-CSM model (0.06). For the zonal data, the BCC-CSM model was found to have the smallest NRMSE (0.19), followed by the IPSL-CM5A-LR model (0.20) and the NorESM1-M model (0.21). Finally, for the grid data, the IPSL-CM5A-LR model showed the smallest NRMSE (0.60), followed by the BCC-CSM (0.63), NorESM1-M (0.63), EC-EARTH (0.63), and FGOALS-g2 (0.63) models. Based on these results, it could be concluded that GCMs like IPSL-CM5A-LR, NorESM1-M, and BCC-CSM showed higher similarity to the GPCP data than the other GCMs.
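As a small worked illustration of the measure, the sketch below computes the NRMSE as the root mean square error divided by the mean of the observed data, which is how the text describes it. The function name and the numbers are made up for illustration and are not values from Table 2.

```python
import numpy as np

def nrmse(simulated, observed):
    """Root mean square error divided by the mean of the observed data."""
    simulated = np.asarray(simulated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    rmse = np.sqrt(np.mean((simulated - observed) ** 2))
    return rmse / observed.mean()

# Illustrative values only: a constant +40 mm bias on an 800 mm mean.
obs = np.array([800.0, 790.0, 810.0])
sim = obs + 40.0
print(nrmse(sim, obs))  # 0.05
print(nrmse(obs, obs))  # 0.0
```

The second call shows why the reference data themselves have an NRMSE of zero, and the first shows that a pure bias with no random error still produces a non-zero NRMSE.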

EOF and CSEOF Analysis
Empirical orthogonal function (EOF) analysis is another term used for principal component analysis (PCA). Proposed in 1901 by Pearson [40], and independently developed by Hotelling [41], PCA is known by various terms, depending on the field of application, such as the discrete Kosambi-Karhunen-Loève transform in signal processing [42], the proper orthogonal decomposition in mechanical engineering [43], singular value decomposition (SVD) in linear algebra [44], and empirical modal analysis in structural dynamics [45].
EOF analysis is a statistical procedure used to convert a set of observations into a set of values of linearly uncorrelated variables, called EOFs. EOF analysis can be performed easily by the eigenvalue decomposition of a data covariance (or correlation) matrix, or by the SVD of a data matrix $X$. The number of EOFs in a set of data is less than or equal to the number of original variables, and the resulting EOFs are orthogonal to each other. Since efficient algorithms are now available for the SVD of $X$ without needing to form the matrix $X^{T}X$, the SVD is now the standard way to derive EOFs from a data matrix. The simplicity of EOF analysis is largely the reason for its popularity in study areas such as meteorology, geology, and oceanography [16-22,24,35,46].
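The SVD route to EOFs can be sketched as follows. This is a minimal example of plain EOF analysis on random data, not of the CSEOF analysis used in this study; the function name, the `(n_time, n_space)` layout, and the removal of the time mean before decomposition are assumptions made for the illustration.

```python
import numpy as np

def eof_analysis(data):
    """EOFs of a (n_time, n_space) data matrix via SVD.

    Returns the EOFs (one spatial pattern per row), the principal
    component time series, and the fraction of variance per mode.
    """
    anomalies = data - data.mean(axis=0)  # remove the time mean per location
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    variance_fraction = s**2 / np.sum(s**2)
    pcs = u * s  # scale the left singular vectors by the singular values
    return vt, pcs, variance_fraction

rng = np.random.default_rng(0)
data = rng.normal(size=(9, 20))  # e.g., 9 years at 20 locations
eofs, pcs, frac = eof_analysis(data)
print(frac.round(3))
```

The rows of `eofs` are orthonormal, the variance fractions sum to one, and `pcs @ eofs` reproduces the anomaly matrix, which matches the properties of EOFs stated above.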
In contrast to EOF analysis, cyclostationary empirical orthogonal function (CSEOF) analysis is used to extract the major principal components that explain the spatial variability of the raw data while considering the cyclostationarity. Cyclic behavior occurs in various climatological and hydrological data, among which the seasonal data and the global data are typical [47-50]. CSEOF analysis was introduced by Kim and North [20], and many application examples can be found in the fields of meteorology [26-28,30,35,51-53] and oceanography [54-57].
As the CSEOF analysis considers the cyclic behavior of the data, the variable showing this cyclic behavior should be defined correctly. The global data analysis can be divided into two cases, one considering the spatial cyclic behavior, and the other considering the seasonal cyclic behavior. Specifically, the spatial cyclic behavior reflects the shape of the earth, in which the longitudinal characteristics are repeated with each circuit around the earth. Seasonal cyclic behavior can be easily understood, as it is repeated every year. As this study focuses on the annual precipitation data, the CSEOF analysis was used to consider the spatial cyclic behavior.
The pattern correlation coefficient, as well as the NRMSE given in Equation (2), was generally used to compare the major CSEOFs derived from the GCM predictions with those from the observed data [11,39,58-61]. The pattern correlation coefficient is the correlation coefficient of data given in the form of a matrix or vector. The mean difference is not considered in the pattern correlation coefficient. The NRMSE, on the other hand, does reflect the mean difference and can become large when the mean difference is large, which is the main difference between the pattern correlation coefficient and the NRMSE. The pattern correlation coefficient $R_{pat}$ is defined as the absolute value of the correlation coefficient as follows [20]:

$$R_{pat} = \left| \frac{1}{N} \sum_{i=1}^{N} \frac{\left( X_{obs,i} - \bar{X}_{obs} \right) \left( X_{simul,i} - \bar{X}_{simul} \right)}{\sigma_{obs}\, \sigma_{simul}} \right|$$

where $N$ is the total number of GCM predictions and observed data, $X_{obs}$ represents the observed data, $\bar{X}_{obs}$ is the mean of the observed data, $X_{simul}$ represents the GCM predictions, $\bar{X}_{simul}$ is the mean of the GCM predictions, $\sigma_{obs}$ is the standard deviation of the observed data, and $\sigma_{simul}$ is the standard deviation of the GCM predictions. The range of the pattern correlation coefficient, calculated using the above equation, is from zero to 1. If the pattern correlation coefficient is 1, the two datasets are assumed to be perfectly linearly correlated. If the pattern correlation coefficient is zero, the two datasets are assumed to have no linear correlation.
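The following sketch illustrates the pattern correlation coefficient as the absolute value of the correlation coefficient between two flattened fields, using the population standard deviation (division by N). The function name and the toy arrays are assumptions for illustration only.

```python
import numpy as np

def pattern_correlation(simulated, observed):
    """Absolute correlation coefficient between two fields of equal shape."""
    x = np.asarray(simulated, dtype=float).ravel()
    y = np.asarray(observed, dtype=float).ravel()
    xa = x - x.mean()
    ya = y - y.mean()
    # np.std defaults to ddof=0, i.e., division by N.
    r = np.mean(xa * ya) / (x.std() * y.std())
    return abs(r)

obs = np.array([[1.0, 2.0], [3.0, 4.0]])
shifted = obs + 100.0   # same pattern, large mean difference
mirrored = -obs         # opposite sign, same pattern strength
print(pattern_correlation(shifted, obs))
print(pattern_correlation(mirrored, obs))
```

Both calls return a value of 1 up to floating-point rounding: the first because a constant offset does not affect the coefficient, the second because of the absolute value. An NRMSE computed for the shifted field would, in contrast, be large.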

Comparison of Cyclostationary Empirical Orthogonal Functions of General Circulation Model Predictions and Global Precipitation Climatology Project Data
Various spatial patterns were derived by the CSEOF analysis of the GCM predictions and the GPCP data (see Figure 3 as an example). All of these patterns were independent and orthogonal to each other. Among the CSEOFs derived, relatively important patterns could be selected, based on the ratio of variance that each CSEOF explains. Table 3 summarizes the ratio of variance that each CSEOF of the GCM predictions and GPCP data explains. As can be found in this table, the ratio of variance that the first CSEOF explains was higher than 95% in all GCM predictions: the highest was 98.49% in the INMCM4.0 model, and the lowest was 95.29% in the NorESM1-M model. The first CSEOF of the GPCP data explained 97.23% of the original variance, which was similar to the NCAR-CAM5, GISS-E2-R, CNRM-CM5, EC-EARTH, and IPSL-CM5A-LR models.
Water 2018, 10, 28

As can be seen in Figure 3, each CSEOF shows a distinct pattern, which may be interpreted as typical behavior of the annual precipitation field recorded in the GPCP data. For example, the first CSEOF in Figure 3 is composed of all positive numbers, which is believed to represent the spatial variation of the annual mean precipitation over the globe. A rather high annual precipitation amount was noticed around the equator, as well as in the Pacific and Atlantic Oceans of the northern hemisphere. In fact, the dominance of the first CSEOF was expected, given how the annual precipitation data are handled in this study. Seasonal variation was removed in the process of temporal averaging. Only the spatial variability remained in the resulting annual precipitation data, so the annual climatology seems to be the dominant characteristic of these data.
However, it was not so obvious what the other CSEOFs explain exactly. Most of them seemed to explain somewhat different spatial patterns, with some distinctions around the equator. In the second CSEOF, very strong precipitation spots were found around the equator; on the other hand, very weak precipitation spots were found in the same region in the third CSEOF. The fourth CSEOF seems rather opposite to the first CSEOF. The ratio of variance was around 0.38-1.87% for the second CSEOF, around 0.30-1.19% for the third CSEOF, and around 0.26-0.63% for the fourth CSEOF. In all models, including the GPCP data, more than 98% of the total variance was explained by the first four CSEOFs.
Before analyzing the characteristics of the CSEOFs derived from the 20 GCM predictions and the GPCP data, it should be mentioned that the difference in the level of darkness between the CSEOFs does not represent a difference in precipitation amount. As the NCAR-CAM5, MRI-CGCM3, and MIROC5 models have a greater number of cells (due to higher spatial resolution), the CSEOFs derived for those GCM predictions look rather brighter than the others. On the other hand, for the BCC-CSM1.1 and IPSL-CM5A-LR models, as the number of cells is rather small, the CSEOFs derived look a bit darker than the others. This is simply because each CSEOF was normalized so that its sum of squares equals one.
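The effect of this normalization on the apparent brightness can be illustrated with a short sketch, assuming a spatially uniform pattern and two hypothetical grid sizes (18 × 36 and 72 × 144 cells) chosen only for the example.

```python
import numpy as np

def normalize_unit_sum_of_squares(pattern):
    """Scale a pattern so that the sum of its squared values equals one."""
    pattern = np.asarray(pattern, dtype=float)
    return pattern / np.sqrt(np.sum(pattern**2))

# The same uniform pattern on a coarse and on a fine (hypothetical) grid.
coarse = normalize_unit_sum_of_squares(np.ones((18, 36)))
fine = normalize_unit_sum_of_squares(np.ones((72, 144)))
print(coarse[0, 0], fine[0, 0])
```

With 16 times as many cells, each cell of the fine pattern carries a value four times smaller than in the coarse pattern, even though both describe the same field. This is why shading levels are not comparable across models of different resolution.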
Figure 4 compares the first CSEOFs derived from eight GCM predictions and the GPCP data. Among the 20 GCMs, eight were selected to show the first CSEOF. Similar to the first CSEOF of the GPCP data, the first CSEOFs of all eight GCMs in Figure 4 are composed of only positive numbers. All of these first CSEOFs show the obvious and major spatial pattern of annual precipitation around the equator. Different from the first CSEOF of the GPCP data, the precipitation was mostly and clearly concentrated around ±5° latitude, not the equator. A strong and clear symmetric pattern in the northern and southern hemispheres around the equator was also common to all eight GCM predictions. A high precipitation amount in Southeast Asia was also a common pattern found in all eight GCM predictions. Some dominant patterns around the equator could also be found in other CSEOFs. However, as they look similar to each other, it was not easy to distinguish one from another. As provided in Table 3, the ratios of variance explained by the second to fourth CSEOFs were all far less than two percent, so it may be unreasonable to try to link these CSEOFs to spatial patterns with physical meaning.
Water 2018, 10, 28 10 of 17 variation of the annual mean precipitation over the globe.Rather high annual precipitation amount was noticed around the equator, as well as in the Pacific and Atlantic Ocean of the northern hemisphere.In fact, the dominance of the first CSEOF was expected, given how the annual precipitation data is handled in this study.Seasonal variation was removed in the process of temporal averaging.Only the spatial variability remained in the resulting annual precipitation data, so the annual climatology seems to be the dominant characteristic in this data.However, it was not so obvious what other CSEOFs explain exactly.Most of them seemed to explain somewhat different spatial patterns, with some distinctions around the equator.In the second CSEOF, very strong precipitation spots were found around the equator; on the other hand, very weak precipitation spots were found in the same region in the third CSEOF.The fourth CSEOF seems rather opposite to the first CSEOF.The ratio of variance was around 0.38-1.87%for the second CSEOF, around 0.30-1.19%for the third CSEOF, and around 0.26-0.63%for the fourth CSEOF.In all models, including the GPCP data, more than 98% of the total variance was explained by the first four CSEOFs.
Before analyzing the characteristics of the CSEOFs derived from 20 GCM predictions and the GPCP data, it should be mentioned here that the difference in the level of darkness between the CSEOFs did not represent the difference of the precipitation amount.As the NCAR-CAM5, MRI-CGCM3, and MIROC5 models have a greater number of cells (due to higher spatial resolution), the CSEOF derived for those GCM predictions looks rather brighter than others.On the other hand, for the BCM-CSM1.1 and IPSL-CM5a-LR models, as the number of cells is rather small, the CSEOF derived can be seen a bit darker than others.This is simply because these CSEOFs were normalized to make the sum of squares of each CSEOF be unified.
Figure 4 compares the first CSEOFs derived from eight GCM predictions and the GPCP data; eight of the 20 GCMs were selected for this figure. Similar to the first CSEOF of the GPCP data, the first CSEOFs of all eight GCMs in Figure 4 are composed of only positive numbers. All of these first CSEOFs show the major spatial pattern of annual precipitation around the equator. Unlike in the first CSEOF of the GPCP data, however, the precipitation was clearly concentrated around ±5° latitude rather than on the equator itself. A strong and clear pattern of symmetry between the northern and southern hemispheres about the equator was also common to all eight GCM predictions, as was a high precipitation amount in Southeast Asia. Some dominant patterns around the equator could also be found in the other CSEOFs. However, as they look similar to each other, it was not easy to distinguish one from another. As provided in Table 3, the ratios of variance explained by the second to fourth CSEOFs were all far less than two percent, so it would be hard to justify linking these CSEOFs to spatial patterns with physical meaning.
The CSEOFs derived from the 20 GCM predictions were compared with those from the GPCP data to evaluate the prediction ability of each GCM. As the CSEOFs are used as building blocks in the prediction of the precipitation field, it may be assumed that the more similar the CSEOFs are to those of the observed data, the more accurate the prediction results will be. As measures for the evaluation of the GCM predictions, the pattern correlation coefficient and the NRMSE were derived from the CSEOFs of the GCM predictions and the GPCP data. Table 4 summarizes the resulting pattern correlation coefficients and NRMSEs between the CSEOFs. Even though this table provides the pattern correlation coefficients and NRMSEs for the first to the fourth CSEOFs, those derived for the first CSEOF played a dominant role in evaluating the GCM predictions.
... the GCMs, but it was excluded, as the resulting resolution was too coarse. Thus, the results of this study may carry some uncertainty due to this inconsistency in the spatial resolution used when comparing the GCM predictions and the GPCP data. The uncertainty, however, was found not to be serious, based on an evaluation with some other possible spatial resolutions based on common multiples.
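The two similarity measures applied to the CSEOFs, the pattern correlation coefficient and the NRMSE, can be sketched as follows. This is an illustration under stated assumptions rather than the exact formulation used for Table 4: the correlation is taken as centered and unweighted (no latitude-dependent area weights), both fields are assumed to share one grid, and the RMSE is normalized by the mean absolute value of the reference field. The function names are hypothetical.

```python
import numpy as np

def pattern_correlation(model, obs):
    """Centered, unweighted pattern correlation over all grid cells."""
    m = np.asarray(model, dtype=float).ravel()
    o = np.asarray(obs, dtype=float).ravel()
    m_anom = m - m.mean()   # anomalies about the spatial mean
    o_anom = o - o.mean()
    denom = np.sqrt(np.sum(m_anom ** 2) * np.sum(o_anom ** 2))
    return float(np.sum(m_anom * o_anom) / denom)

def nrmse(model, obs):
    """RMSE divided by the mean absolute value of obs (ASSUMED normalization)."""
    m = np.asarray(model, dtype=float).ravel()
    o = np.asarray(obs, dtype=float).ravel()
    rmse = np.sqrt(np.mean((m - o) ** 2))
    return float(rmse / np.mean(np.abs(o)))

# Tiny made-up fields, used only to exercise the two functions.
obs = np.array([[1.0, 2.0], [3.0, 4.0]])
model = 2.0 * obs   # same pattern, doubled amplitude

pcc = pattern_correlation(model, obs)
err = nrmse(model, obs)
```

In this toy case the pattern correlation is one because the two fields have the same shape, while the NRMSE is clearly non-zero because the amplitudes differ. The two measures therefore carry complementary information, which is consistent with reporting both of them.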

Discussions on the Performance of GCM Predictions
In the previous two sections, the GCM predictions of precipitation were compared with the GPCP observed data. The comparison was based on two methods: first, using conventional statistics, like the mean and standard deviation; and second, using the pattern correlation coefficient and NRMSE of the CSEOFs derived from the GCM predictions and the GPCP data. In this part of the study, the conventional statistics and the statistics for the comparison of CSEOFs were merged to evaluate the performance of the GCM predictions.
Among the many conventional statistics considered in the comparison, the NRMSEs of the grid mean and zonal mean were selected for the evaluation of the GCM predictions. These two statistics were selected to make the comparison of GCMs easier, as the NRMSE reflects both the bias (i.e., mean difference) and the random error (i.e., standard deviation). The NRMSE for the global data was not considered, as it was too small to be used with the other statistics. Both the pattern correlation coefficient and the NRMSE were selected as measures for the comparison of the CSEOFs. Only the first CSEOF was considered in this comparison, as it covers most of the original variability. As a result, a total of four different statistics were selected, which were used to compare the GCM predictions as in Figure 5. In this figure, as the NRMSE is a measure of badness (i.e., zero represents the best), the pattern correlation coefficient was replaced by one minus the pattern correlation coefficient.
Figure 5 can be interpreted as showing that the closer a GCM is to the origin, the more similar it is to the GPCP data. The first quadrant in Figure 5 shows that NorESM1-M is the closest to the origin. That is, when the pattern correlation coefficient and NRMSE of the first CSEOF are considered, the NorESM1-M model is the most similar to the GPCP data. In the second quadrant, where the pattern correlation coefficient for the first CSEOF and the NRMSE of the grid mean are concerned, the NorESM1-M model was also found to be the nearest to the origin. However, the IPSL-CM5A-LR model was found to be closer to the origin in the third quadrant, where the NRMSEs of the grid and zonal means are concerned. Finally, in the fourth quadrant, where the NRMSEs of the zonal mean and the first CSEOF are considered, the NorESM1-M model was also found to be the nearest to the origin. Overall, the NorESM1-M model is the most similar to the GPCP data. However, the IPSL-CM5A-LR model was found to be the best when only the conventional statistics are concerned. Additionally, the BCC-CSM model was found to be similar to the GPCP data in all four quadrants.
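The quadrant-by-quadrant reading of Figure 5 can be sketched as a nearest-to-origin search over pairs of badness measures. The scores below are hypothetical placeholders, not values from Table 2 or Table 4, the model names are generic, and the use of the Euclidean distance is an assumption, since the text only speaks of closeness to the origin.

```python
import math

# Hypothetical placeholder scores, NOT values from Table 2 or Table 4.
scores = {
    "model_A": {"pcc_cseof1": 0.95, "nrmse_cseof1": 0.30,
                "nrmse_grid": 0.40, "nrmse_zonal": 0.30},
    "model_B": {"pcc_cseof1": 0.90, "nrmse_cseof1": 0.45,
                "nrmse_grid": 0.42, "nrmse_zonal": 0.10},
    "model_C": {"pcc_cseof1": 0.80, "nrmse_cseof1": 0.60,
                "nrmse_grid": 0.70, "nrmse_zonal": 0.40},
}

def badness(stats):
    """Convert the correlation to a badness measure: zero is best."""
    out = dict(stats)
    out["one_minus_pcc"] = 1.0 - out.pop("pcc_cseof1")
    return out

# Pairs of statistics shown in quadrants one to four, as described above.
quadrants = [
    ("one_minus_pcc", "nrmse_cseof1"),
    ("one_minus_pcc", "nrmse_grid"),
    ("nrmse_grid", "nrmse_zonal"),
    ("nrmse_zonal", "nrmse_cseof1"),
]

def closest_to_origin(all_scores, x_key, y_key):
    """Return the model whose (x, y) point has the smallest Euclidean norm."""
    def distance(name):
        b = badness(all_scores[name])
        return math.hypot(b[x_key], b[y_key])
    return min(all_scores, key=distance)

best = [closest_to_origin(scores, x, y) for x, y in quadrants]
```

With these placeholder numbers, one model is nearest to the origin in three quadrants and another in the quadrant built from the two conventional NRMSEs, which mirrors the kind of split described above. A different distance measure or a weighting of the axes could change the ranking, so the choice should be stated alongside any result.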

Conclusions
In this study, 20 GCM predictions of precipitation were compared with the observed GPCP data. The GCMs considered were all those covering the period from 2006 to 2014, which was the longest period available for comparing the GCM predictions with the GPCP data. Among the GCMs of 23 development institutes, a total of 20 GCMs were selected for the CSEOF analysis. The comparison was done for three data structures (the global, zonal, and grid means), using conventional statistics like the root mean square error (RMSE) together with the pattern correlation coefficient of the CSEOF. The results obtained from this study are summarized as follows.
In the comparison of the GCM predictions and GPCP data with the conventional statistics, the IPSL-CM5A-LR and NorESM1-M models were found to be the most similar to the GPCP data. This result was consistent across all the comparisons of the annual mean, as well as the NRMSEs of the global, zonal, and grid data.
The CSEOF analysis of the GCM predictions and GPCP data showed that the first CSEOF was dominant, in that it covered more than 95% of the total variance of the original data. Thus, the comparison was done only for the first CSEOF, to derive the pattern correlation coefficient and NRMSE. In this comparison, the NorESM1-M model was also found to be the most similar to the GPCP data. The BCC-CSM, NCAR-CAM5, and MRI-CGCM3 models were also found to have quite a high pattern correlation coefficient.
Similar results were also derived from the merged comparison of the conventional statistics and the statistics for the comparison of CSEOFs. Overall, the NorESM1-M model was found to be the most similar to the GPCP data. Additionally, the IPSL-CM5A-LR, BCC-CSM, and GFDL-CMS models were also found to be quite similar to the GPCP data.
In this study, the precipitation predictions of the NorESM1-M model were found to be the most similar to the GPCP data. However, it should be remembered that this result is based on only several statistics. The use of annual data in the comparison may be the most serious limitation of this study; a different conclusion may be reached when considering monthly or daily data. Additionally, the results of this study do not guarantee the quality of future predictions. It is also true that not many other measures are available for the selection of a proper GCM. Consideration of multiple GCMs, not just one, may be a better approach for the study of global warming and climatic change.

Figure 1. Annual variation of global mean precipitation of eight GCMs and the Global Precipitation Climatology Project (GPCP) data from 2006 to 2014.

Again, eight of the 20 GCMs were selected to compare the zonal average more clearly. In Figure 2, zone 1 indicates the latitudinal zone from −90° to −80°, zone 2 from −80° to −70°, and finally zone 18 indicates the latitudinal zone from 80° to 90°.
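The zone numbering described above can be written as a small helper. This is a sketch only: the function name `zone_index` is hypothetical, and the handling of latitudes that fall exactly on a zone boundary (assigned here to the zone on the northern side, with +90° kept in zone 18) is an assumption that the text does not specify.

```python
import math

def zone_index(lat_deg, width_deg=10.0):
    """Return the 1-based latitudinal zone for a latitude in [-90, 90]."""
    if not -90.0 <= lat_deg <= 90.0:
        raise ValueError("latitude must lie between -90 and 90 degrees")
    n_zones = int(round(180.0 / width_deg))
    idx = int(math.floor((lat_deg + 90.0) / width_deg)) + 1
    # ASSUMPTION: the north pole is placed in the last zone, not a 19th one.
    return min(idx, n_zones)

south_polar = zone_index(-85.0)   # inside -90 to -80 degrees
north_polar = zone_index(85.0)    # inside 80 to 90 degrees
```

A zonal mean then follows by averaging all grid cells whose latitude maps to the same zone index.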

Figure 2. Zonal mean precipitation of eight GCM predictions and GPCP data from 2006 to 2014.

Figure 5. Evaluation of 20 GCM predictions, with their NRMSEs and pattern correlations of CSEOF.

Table 1. Description of Coupled Model Intercomparison Project, Phase 5 (CMIP5) general circulation models (GCMs) used in this study.

Table 2. Some statistics of the 20 GCM predictions and GPCP data. The normalized root mean square errors (NRMSEs) were calculated by assuming the GPCP data as truth.

Table 4. Pattern correlation coefficients and NRMSEs estimated for the CSEOFs of GCM predictions and GPCP data.
