Evaluation of Satellite Precipitation Estimates over the South West Pacific Region

Abstract: Rainfall estimation over the Pacific region is difficult due to the large distances between rain gauges and the highly convective nature of many rainfall events. This study evaluates space-based rainfall observations over the South West Pacific Region from the Japan Aerospace Exploration Agency's (JAXA) Global Satellite Mapping of Precipitation (GSMaP), the USA National Oceanic and Atmospheric Administration's (NOAA) Climate Prediction Center morphing technique (CMORPH), the Climate Hazards group Infrared Precipitation with Stations (CHIRPS), and the National Aeronautics and Space Administration's (NASA) Integrated Multi-Satellite Retrievals for GPM (IMERG). Collocation analysis (CA) is used to compare the performance of monthly satellite precipitation estimates (SPEs). Multi-Source Weighted-Ensemble Precipitation (MSWEP) was used as a reference dataset to compare with each SPE. The European Centre for Medium-Range Weather Forecasts' (ECMWF) ERA5 reanalysis was also combined with Soil Moisture-2-Rain-ASCAT (SM2RAIN-ASCAT) to perform triple CA for the six sub-regions of Fiji, New Caledonia, Papua New Guinea (PNG), the Solomon Islands, Timor, and Vanuatu. It was found that GSMaP performed best over low rain gauge density areas, including mountainous areas of PNG (cross-correlation, CC = 0.64) and the Solomon Islands (CC = 0.74). CHIRPS had the most consistent performance (high correlations and low errors) across all six sub-regions in the study area. Based on the results, recommendations are made for the use of SPEs over the South West Pacific Region.


Introduction
Rainfall is fundamental to sustaining communities, economy, and the natural environment. Numerous climate-sensitive sectors rely upon accurate precipitation measurements, and the demand for greater accuracy will increase as rainfall variability increases globally due to anthropogenic climate change [1]. Accurate rainfall estimation crucially impacts a range of sectors, e.g., agriculture, fire and forestry management, landslide prevention [2], and hydrological modelling of reservoirs [3].
Rain gauges and weather radars have been the traditional means of measuring precipitation, yet satellite remote sensing is increasingly considered as complementary to rain gauges due to its improving accuracy and spatial coverage [4,5]. Rain gauges themselves are prone to errors from evaporative losses, maintenance, and distribution issues (especially in remote areas), and the small representative area is particularly problematic in regions with high topography and convective rainfall [6]. Weather radars can be effective for capturing the spatial extent of precipitation, but their use is limited, particularly in developing and least developed countries, due to large upfront expense and maintenance costs [7].
The study area extends from 0° N to 30° S, and from 120° E to the International Date Line (further extension eastward was limited by the availability of the soil moisture product, Soil Moisture 2 Rain-ASCAT (SM2RAIN-ASCAT)). The study area includes six sub-regions: Timor, Papua New Guinea (PNG), Bougainville Island and the Solomon Islands, Vanuatu, New Caledonia, and the main islands of Fiji. The study period is limited to the temporal range of SM2RAIN-ASCAT, from January 2007 to December 2020.

Datasets
Four satellite precipitation datasets commonly used in evaluation studies were selected for this study: the Japan Aerospace Exploration Agency's (JAXA) Global Satellite Mapping of Precipitation (GSMaP V6), the USA National Oceanic and Atmospheric Administration's (NOAA) Climate Prediction Center morphing technique (CMORPH V1), the Climate Hazards group Infrared Precipitation with Stations (CHIRPS V2), and the National Aeronautics and Space Administration's (NASA) Integrated Multi-Satellite Retrievals for GPM (IMERG V6b).
The Global Precipitation Measurement (GPM) mission is a constellation of low Earth orbiting satellites which uses passive microwave (PMW) imagers and sounders and a dual-frequency radar for remotely detecting precipitation; it is maintained by both NASA and JAXA. GSMaP [22] and IMERG [23] provide several SPEs created using different precipitation retrieval algorithms based on data from the GPM constellation.
In addition to utilising microwave (MW)-based estimates from the GPM constellation, GSMaP also uses infrared (IR) data from geostationary satellites to observe the movement of cloud systems and advect estimates, and applies a Kalman filter to form better estimates when data are sparse [24]. A daily global dataset of rain gauges from the CPC is used for calibration by matching 24-hour totals, which allows for more accurate calibration of MW readings than the use of monthly datasets [25].
CMORPH uses a similar cloud advection algorithm to GSMaP and the same rain gauge network for daily calibration [26]. CMORPH MW inputs have been calibrated by matching the probability distribution functions of the estimates to those from rain gauge values over land and through the use of corrective ratios formed from matching means to the GPCP dataset over the ocean. In this study, the CMORPH Blended (CMORPH-BLD) product, which further incorporates gauge data through optimal interpolation, was selected for evaluation. CMORPH-BLD has been shown to be more accurate than the gauge-corrected product, CMORPH-CRT, over regions where a dense gauge network exists [27], though performance was similar where gauge density is low, for example over PNG [28].
GSMaP and CMORPH data were made available through the World Meteorological Organization's Space-based Weather and Climate Extremes Monitoring Demonstration Project [29].
IMERG Final Run monthly product blends the Global Precipitation Climatology Centre (GPCC) V5 Monitoring Product and is retroactively used to gauge-correct the Early and Late runs [30]. IMERG also uses a cloud advection algorithm and a Kalman filter. Morphed MW estimates are used where their availability is sufficient, with an IR-based SPE called PERSIANN-CCS being used otherwise. Although GSMaP and IMERG are similar in selecting the satellite sensors that they use, the difference in rain gauge data used for calibration and in their precipitation retrieval algorithms results in different performance.
The CHIRPS dataset was created as part of the United States Agency for International Development Famine Early Warning Systems Network (FEWS NET) [5]. It forms a climatological field using long-term trends from both gauge and satellite products (including CMORPH and Land Surface Temperature estimates) onto which precipitation estimates from near-real-time IR Cold Cloud Duration (CCD) are incorporated. Blending also takes place with interpolation from up to five nearby gauges. Its higher resolution (0.05° compared to 0.1° or 0.25° for other datasets) and temporal coverage (beginning from 1981 compared to around the 2000s for other SPEs) may make it applicable for Pacific Island Nations that lack consistent rain gauge history.
All SPEs require calibration to surface rainfall [31] and the choice of ground-truth data impacts the accuracy of the satellite products. CMORPH blends in CPC daily data, CHIRPS blends in nearest gauges, IMERG blends in monthly rainfall data, and GSMaP corrects to daily data. It has been shown that using daily gauge data resulted in GSMaP being more accurate over the Tibetan Plateau than IMERG [25].
The Multi-Source Weighted-Ensemble Precipitation dataset (MSWEP V2.8) is a merged product combining numerous gauge, reanalysis, and satellite products [32] and will serve as a gridded reference dataset for this study. However, as ERA-INT, GSMaP, and CMORPH are inputs to MSWEP, it thus will not be used in collocation analysis performed in this study.
SM2RAIN-ASCAT considers an inverse water equation where rainfall is estimated from observed soil moisture and modelled soil temperature and evaporation [33]. It can be more effective than other MW based SPEs when an estimation of accumulated rainfall is desired as opposed to the instantaneous estimates generated by SPEs [33] which are subject to sampling bias. The SM2RAIN-ASCAT product is used in this study, as it has the largest temporal range of all SM2RAIN products covering January 2007 until December 2020. The ASCAT product is based upon three MetOp C-Band MW sounders [21].
The European Centre for Medium-Range Weather Forecasts (ECMWF) produces the ERA5 reanalysis product, which uses a wide range of weather observations from multiple sources including satellites, weather balloons, aircraft, and weather stations, for humidity, air pressure, wind, and temperature, to provide an accurate history of global meteorological observations from 1950 to present [34]. It does not directly ingest rain gauge data, with a second-order inclusion occurring from the use of a gauge-adjusted radar analysis over the USA, in addition to the use of satellite data. A comparison of ERA5 to global rain gauge datasets reveals that ERA5 overestimates rainfall over the oceans, but is in very close agreement for precipitation over land [34].

Method
CA is a methodology that assumes an unknown uncertainty for a geophysical process (in this case precipitation) and then compares at least three datasets made up of mutually independent measurements to determine their relative accuracy [11,35]. There are two main types of CA which make different assumptions about errors. There is a simple additive error model:

$$R = a + B\,T + \varepsilon \qquad (1)$$

and a multiplicative error model, which is considered more accurate for rainfall data due to the sporadic nature of rainfall amounts [11]:

$$R = a\,T^{B} e^{\varepsilon} \qquad (2)$$

where R is the observed rainfall amount, T is the unknown true rainfall amount, ε is the random error, and a and B denote systematic biases. CA methods overcome the need for an exact source of truth, and are spatially consistent across a domain [36]. A noteworthy feature of CA is that the methodology relies upon the contributing datasets being independent of each other. More specifically, CA's underlying assumptions are (i) stationarity of the statistics, (ii) linearity between at least three estimates (versus the same target) across all timescales, and (iii) existence of uncorrelated errors between at least three estimates [10,11].
Monthly averaged daily precipitation data (January 2007-December 2020) was downloaded as NetCDF4 files for all datasets except IMERG (downloaded as HDF5), converted to units of mm per day, and regridded to a common resolution of 0.25°, as this was the native resolution of the SM2RAIN-ASCAT, ERA5, and CMORPH datasets. The land-based SPEs of CHIRPS, CMORPH, and SM2RAIN-ASCAT were extrapolated to smooth around coastlines, and a common land-sea mask from CMORPH was applied to all the datasets to remove the variation of masks amongst the datasets. CMORPH's mask was used as it presented a compromise of coverage over Pacific Island Countries with the other land-based SPEs (CHIRPS, SM2RAIN-ASCAT) as well as the land-sea mask of ERA5.
Due to the performance of each product varying in accuracy throughout the year [17] it is conventional to use monthly anomalies rather than rainfall totals. Using anomalies restricts our analysis to assume an additive error model (Equation (1)), rather than the multiplicative variant (Equation (2)). Despite the benefits of a multiplicative error model, recent studies, such as Chen et al. [17] and Massari et al. [10], used an additive error model which still provided robust results. Therefore, it is reasonable to remove the climatological component of the data from all the datasets and perform an additive error analysis.
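The removal of the climatological component can be illustrated with a minimal sketch. It assumes that the anomaly for each month is the value minus the long-term mean of that calendar month at a single pixel; the function name `monthly_anomalies` and the synthetic series are illustrative only and are not taken from the study.

```python
import numpy as np


def monthly_anomalies(series, months):
    """Subtract each calendar month's long-term mean from a monthly series.

    series : 1-D array of monthly mean rainfall (mm/day) for one pixel
    months : 1-D array of calendar months (1-12), same length as series
    """
    series = np.asarray(series, dtype=float)
    months = np.asarray(months)
    anomalies = np.empty_like(series)
    for m in np.unique(months):
        sel = months == m
        # Climatology for this calendar month, computed over all years.
        anomalies[sel] = series[sel] - series[sel].mean()
    return anomalies


# Synthetic example: 14 years (as in 2007-2020) with an assumed seasonal cycle.
rng = np.random.default_rng(0)
months = np.tile(np.arange(1, 13), 14)
seasonal = 8.0 + 4.0 * np.cos(2 * np.pi * (months - 1) / 12)
rain = seasonal + rng.normal(0.0, 1.0, months.size)
anom = monthly_anomalies(rain, months)
```

By construction, the anomalies for each calendar month average to zero, so the seasonal cycle no longer dominates the covariances used in the additive error analysis.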
MSWEP was used as a reference dataset to compare each SPE. It is considered to be the most accurate near-real-time satellite product available due to its optimised merging of other rainfall products [20]. The Mean Bias, Mean Absolute Error, Root Mean Square Error, and Pearson correlation coefficient (R) are used, as they are conventional comparative statistical measures used to understand dataset accuracy in SPE research (formulas can be found in [28]). Assumptions of linearity and normality were also validated through plotting to ensure that the Pearson correlation was the correct statistic for this process. Fisher's z-transformation comparison of correlation coefficients was used to determine the significance of differences between the Pearson correlations.
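The comparison statistics named above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the study's code: the function names are hypothetical, the statistics use their textbook definitions, and the Fisher z comparison shown is the common form for two independent correlations (correlations computed against the same reference are not strictly independent, so this is a simplification).

```python
import math

import numpy as np


def comparison_stats(spe, ref):
    """Mean bias, MAE, RMSE and Pearson R of an SPE against a reference."""
    spe = np.asarray(spe, dtype=float)
    ref = np.asarray(ref, dtype=float)
    diff = spe - ref
    return {
        "mean_bias": diff.mean(),
        "mae": np.abs(diff).mean(),
        "rmse": np.sqrt((diff ** 2).mean()),
        "pearson_r": np.corrcoef(spe, ref)[0, 1],
    }


def fisher_z_test(r1, n1, r2, n2):
    """Two-sided test of H0: two independent correlations are equal."""
    z1, z2 = math.atanh(r1), math.atanh(r2)  # Fisher z-transformation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return z, p


# Arbitrary illustrative values (not results from the study).
ref = np.array([2.0, 5.5, 1.0, 7.2, 3.3])
spe = np.array([2.4, 5.0, 1.5, 8.0, 3.0])
stats = comparison_stats(spe, ref)
z, p = fisher_z_test(0.80, 168, 0.70, 168)  # 168 = 14 years x 12 months
```

A small p-value would suggest that the two correlations differ by more than sampling variability alone would explain.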
The upper and lower quantiles for each SPE were also compared with MSWEP to record how consistently a value in the lowest 20% (below the 20th percentile) or the highest 20% (above the 80th percentile) of the reference dataset was identified by the SPE. The metric of this comparison is the percent hit rate, which compares the number of events that lie in the top and bottom quantile of the SPE data to the number expected from MSWEP. While daily data may be more relevant for the accuracy of precipitation estimates for individual tropical cyclone events, the use of monthly data can be relevant to climate extremes such as droughts and accumulated heavy precipitation during the Southern Hemisphere tropical cyclone season, typically from November through to April [37,38].
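One possible reading of the percent hit rate is sketched below. It assumes that an "event" is a month falling in a dataset's own lower or upper 20% tail, and that a hit is a reference (MSWEP) event that the SPE also flags; the function `quantile_hit_rate` and this exact definition are assumptions made for illustration, as the text does not give a formula.

```python
import numpy as np


def quantile_hit_rate(spe, ref, q=0.2, tail="lower"):
    """Percent of reference tail events that the SPE also places in its tail."""
    spe = np.asarray(spe, dtype=float)
    ref = np.asarray(ref, dtype=float)
    if tail == "lower":
        ref_event = ref <= np.quantile(ref, q)
        spe_event = spe <= np.quantile(spe, q)
    elif tail == "upper":
        ref_event = ref >= np.quantile(ref, 1.0 - q)
        spe_event = spe >= np.quantile(spe, 1.0 - q)
    else:
        raise ValueError("tail must be 'lower' or 'upper'")
    hits = np.sum(ref_event & spe_event)
    return 100.0 * hits / np.sum(ref_event)


# Synthetic monthly anomalies: the SPE is the reference plus noise.
rng = np.random.default_rng(1)
ref = rng.normal(0.0, 2.0, 168)
spe = ref + rng.normal(0.0, 0.5, 168)
lower_rate = quantile_hit_rate(spe, ref, tail="lower")
upper_rate = quantile_hit_rate(spe, ref, tail="upper")
```

Using each dataset's own quantile thresholds makes the measure insensitive to a constant bias, which is one reason to prefer it over fixed rainfall thresholds; other definitions are equally plausible.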
The time series of one SPE, SM2RAIN-ASCAT, and ERA5 were used to create a 3 × 3 covariance matrix (C) for each pixel. This matrix was then used to calculate the cross-correlation (CC) and RMSE between the SPE and the unknown true rainfall for each SPE. As we are using an additive error model, the equations are:

$$CC_X = \sqrt{\frac{C_{XY}\,C_{XZ}}{C_{XX}\,C_{YZ}}}$$

$$RMSE_X = \sqrt{C_{XX} - \frac{C_{XY}\,C_{XZ}}{C_{YZ}}}$$

where X represents the SPE being considered, Y is SM2RAIN-ASCAT, and Z is ERA5. The assumption of stationarity was tested with the Augmented Dickey-Fuller unit root test, and linearity by visual inspection. It is worth noting that the assumption of these datasets having uncorrelated errors is not completely met, as some sensors from the GPM constellation are included in ERA5 [34]. Although SM2RAIN-ASCAT is best not used with ERA5 in triple CA [21], it is still used in cases where a suitable rain gauge product is not available [10,17]. Similarly, Quadruple Collocation using the four SPEs was explored, but was not utilised due to the overlap of sensors violating the assumptions of independence.
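The covariance-based triple collocation step for a single pixel can be sketched as follows. This is a minimal illustration that assumes the widely used additive-model estimators built from the 3 × 3 covariance matrix; the function name, the clipping of negative variances, and the synthetic data are assumptions for the example rather than details confirmed by the study.

```python
import numpy as np


def triple_collocation(x, y, z):
    """Estimate CC and RMSE of x against the unknown truth (additive model).

    x : SPE anomalies, y : SM2RAIN-ASCAT anomalies, z : ERA5 anomalies
    """
    c = np.cov(np.vstack([x, y, z]))  # 3 x 3 covariance matrix
    c_xx, c_xy, c_xz, c_yz = c[0, 0], c[0, 1], c[0, 2], c[1, 2]
    # Variance of x explained by the common (true) signal.
    signal_var = c_xy * c_xz / c_yz
    # Sampling noise can make these slightly negative, so clip at zero.
    cc = np.sqrt(max(signal_var, 0.0) / c_xx)
    rmse = np.sqrt(max(c_xx - signal_var, 0.0))
    return cc, rmse


# Synthetic check: three noisy, differently scaled views of one truth.
rng = np.random.default_rng(42)
n = 50_000
truth = rng.normal(0.0, 1.0, n)
x = truth + 0.5 * rng.normal(size=n)        # error std 0.5
y = 0.8 * truth + 0.7 * rng.normal(size=n)
z = 1.2 * truth + 0.3 * rng.normal(size=n)
cc_x, rmse_x = triple_collocation(x, y, z)
```

With independent errors, as constructed here, the estimates should approach the known values for x (CC near 0.89, RMSE near 0.5) as the sample grows; correlated errors between inputs would bias them.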

Performance of Datasets Compared to MSWEP
Figure 2 shows the spatial differences in Mean Bias Error (MBE) between each SPE and the reference MSWEP for the six sub-regions. It is evident that topography impacts bias in SPEs, often with biases of opposite direction between highlands and lowlands. In general, there is an overestimation of precipitation over mountains. Around coastal areas, the SPEs are more likely to underestimate rainfall, which is consistent with issues of SPEs measuring rainfall over oceans [39].
The comparison of each of the SPEs to MSWEP is summarised in Figure 3; see Appendix A for details and comparison, including SM2RAIN-ASCAT. GSMaP and IMERG have stronger, though not significantly stronger, Pearson correlations in the Solomon Islands, Timor, and PNG; however, CMORPH and CHIRPS perform significantly better over the more densely gauged sub-regions of Fiji, Vanuatu, and New Caledonia (Figure 3b). CMORPH has the largest RMSE over PNG, GSMaP has the largest over Fiji, Vanuatu, New Caledonia, and the Solomon Islands, while IMERG has the smallest or near smallest errors over these sub-regions (Figure 3c). SM2RAIN-ASCAT had significantly lower average correlations than the other SPEs for all sub-regions except Timor (p = 0.86) and PNG (p = 0.19), where the R values were lower but not significantly so.
A comparison of the percent hit rate for the upper and lower quantiles with MSWEP indicated there was no consistent difference between the accuracy of lower and upper quantiles for any of the SPEs. The gauge-based products of CMORPH-BLD and IMERG show greater quantile accuracy in Fiji and New Caledonia, both sub-regions with denser gauge networks. CHIRPS had overall the lowest average hit rate (90.6%), while GSMaP performed well for sub-regions with low gauge density.

Triple Collocation Analysis
As in the comparison to MSWEP, we found poor performance of the SPEs over PNG compared to the other sub-regions. While most of the datasets show agreement, there are some areas where performance across the datasets diverges.
Cross-correlation between each SPE and the unknown truth from triple CA with ERA5 and SM2RAIN-ASCAT is shown in Figure 4. All SPEs demonstrated best performance over Fiji, New Caledonia, and Vanuatu. GSMaP shows far greater CC over PNG, particularly over highlands, than the other datasets. IMERG appears to perform weaker than other SPEs over PNG, Timor, and the Solomon Islands, but has comparable performance over Fiji, Vanuatu, and New Caledonia.
Comparisons of the mean CC and RMSE using triple CA for each SPE are shown in Figure 5a,b, respectively. GSMaP outperformed other SPEs over PNG and the Solomon Islands, while CHIRPS outperformed over Timor. CHIRPS had greater or equal correlations and lower RMSE than IMERG and CMORPH. IMERG did not perform as well as other SPEs over PNG, Timor, the Solomon Islands, and Fiji, whereas its performance was comparable over New Caledonia and Vanuatu.
A comparison of the SPEs with the highest correlation at each pixel for each sub-region is provided in Figure 6; the second highest correlation is also presented.
GSMaP performs best over mountainous areas of PNG, Timor, and the Solomon Islands, while CHIRPS shows consistent performance across the study region, except for poor performance over PNG. Performance of CHIRPS appears best around some coastlines and lower-lying areas of Timor and New Caledonia. CMORPH performs well over Fiji, and IMERG is comparable with CHIRPS over New Caledonia.

Discussion
Evidence that SPEs over PNG are the least accurate of the six regions reinforces the known difficulties of estimating rainfall from satellites over areas with complex topography, dense tropical rainforest, and low rain-gauge coverage [28]. Studies on evaluating SPEs in Pacific Island Countries are very limited, with the only previous study over PNG performed by Chua et al. (2020) [28] comparing CMORPH, GSMaP, and ERA5 with rain gauge data. The results obtained in this study showed similar but consistently higher correlations and smaller errors than those presented in Chua et al. [28], likely due to the use of anomaly values rather than raw rainfall totals.
As expected, CHIRPS's weaker quantile performance was due to the blending technique from multiple nearby gauges rather than the nearest neighbour [5].
Including rain gauge data within the comparison of MSWEP and ERA5 would be a useful future study to benefit users in the Pacific Islands and for the wider field of collocation analysis.
Using triple CA, stronger CCs in less mountainous areas were found, supporting earlier studies that SPEs are more accurate at low elevations [40,41]. CMORPH performed best over Fiji, likely due to a higher density of contributing rain gauges [16]. This supports previous findings that the CMORPH-BLD is dependent on the availability of CPC rain gauge data [27,28]. It appears that IMERG's strong performance over New Caledonia is also due to the presence of a dense GPCC rain gauge network, the difference from the Fiji result being due to fewer contributing rain gauges in the GPCC dataset than the CPC [42,43]. As this difference in the rain gauge data is most pronounced in the case of Fiji, we can assume that IMERG would have higher correlations than CMORPH over Fiji if the same rain gauge dataset was used for calibration.
Although GSMaP has been shown to have a significant positive bias, particularly over PNG [44], it still demonstrates the strongest CCs. While the SPEs have similar correlations over Vanuatu, Fiji, and New Caledonia, CHIRPS is shown to perform most similarly to ERA5 and SM2RAIN-ASCAT by having the smallest errors.
The main limitations of using SM2RAIN-ASCAT are that it (i) underestimates peak rainfall events due to soils becoming saturated and incurring surface runoff, (ii) incorrectly records spurious low-intensity rainfall events due to high-frequency soil moisture fluctuations associated with random measurement error, and (iii) is limited to only liquid-phase precipitation over land [21]. The accuracy of SM2RAIN-ASCAT is being improved through the use of constellations of longer-wave MW sensors [45,46] incorporating newer technologies from Sentinel-1 [46] and the Cyclone Global Navigation Satellite System (CYGNSS) constellation [47].
The CC values from triple CA are considered a more important metric for determining accuracy than RMSE [18,36], and the errors are used here to contextualise the results.
It was found that CHIRPS has relatively high CCs compared to the other SPEs for all sub-regions. In sub-regions with low rain gauge density, GSMaP is recommended for use due to its high correlations and low latency. GSMaP performs better over PNG, and CHIRPS over Timor. For Vanuatu, the Solomon Islands, and Bougainville, CHIRPS has slightly better quantile performance and lower errors, while GSMaP has stronger correlations and smaller biases.
As GSMaP uses a similar satellite constellation to IMERG, it is expected that it will be able to provide a more accurate spatiotemporal distribution of tropical cyclone rainfall than CHIRPS [48]. Given the latency for GSMaP is shorter than for CHIRPS (3 days versus 1 month), it is preferred for rapid tropical cyclone impact assessment.
Over New Caledonia, it is recommended to use IMERG due to its dense GPCC gauge network and quantile analysis performance. CHIRPS proves to be a close second; however, IMERG is recommended due to its more homogeneous distribution of bias and stronger CC from triple CA.
Similarly, within Fiji, it can be confidently recommended that CMORPH be used for applications due to its strong CCs. The negative bias around coasts could lead to greater rates of false alarms for drought, so if CMORPH historical datasets are used for statistical precipitation forecasting, the inclusion of in situ verification of drought conditions is recommended. In the case of probabilistic drought forecasting, these biases will not impact the seasonal forecasts themselves, but we caution that any coupling of CMORPH with forecasts for drought monitoring should similarly include in situ measurements where possible. If other countries had similar amounts of rain gauge data in the CPC dataset as Fiji, then CMORPH could also become the recommended SPE. If similar gauge data was included in the GPCC dataset, then IMERG could become the recommended dataset for use over Fiji due to its strong performance over New Caledonia.
Analysis indicates that SPE accuracy varies over a range of variables and geographic features. Careful consideration of strengths and limitations of SPEs is required before implementing them in operational services. It is recommended that Pacific Island Country National Meteorological Services seeking to operationalise a certain SPE dataset perform correlation analyses on the SPE datasets available to them.

Conclusions
This research found that a key factor for SPE accuracy in the study region was topography, with more mountainous sub-regions (PNG, Timor, the Solomon Islands) consistently having weaker correlations than those characterised by less mountainous terrain (Fiji, Vanuatu, and New Caledonia).
A comparison to MSWEP was performed and results were in line with those obtained in a previous analysis by Chua et al. (2020) [28], which showed GSMaP to be more accurate than CMORPH over PNG. The triple CA found that GSMaP performed best in particularly mountainous areas and CHIRPS had the most consistent performance across all six sub-regions in the study region.
These different levels of analysis demonstrated that the choice of dataset can reveal quite different accuracy results. The blended datasets of IMERG and CMORPH performed well for the Fiji and New Caledonia sub-regions where high density rain gauge observations were representative of the small domains. GSMaP appears to perform best in sub-regions of low rain gauge density.
Recommendations for SPE use during tropical cyclone seasons and drought periods were inferred; however, there remains a variety of other avenues of research that could provide more specific insights for applying SPEs to natural hazard risk reduction. CHIRPS was recommended for drought applications due to its consideration of historical climate data, whereas GSMaP was favoured for tropical cyclone seasons owing to its strong performance over regions with low density or no rain gauges, and its lower latency.
Case studies using daily raw data in a multiplicative error model could provide more relevant performance of tropical cyclone rainfall accuracy, and new techniques of soil moisture measurement and SPE merging can provide improved ability for SPEs to contribute to disaster risk reduction efforts.
This research reaffirms the importance of rain gauges for increasing SPE accuracy, and strongly recommends maintaining, and even expanding where possible, a network of surface-based meteorological observation stations. The methodology established in this study for the South West Pacific Region can be replicated for other precipitation data sparse regions, and would be a crucial step in the process of operationalising SPE applications.