Satellite Rainfall (TRMM 3B42-V7) Performance Assessment and Adjustment over Pahang River Basin, Malaysia

The Tropical Rainfall Measuring Mission (TRMM) was the first Earth Science mission dedicated to studying tropical and subtropical rainfall. Up until now, there is still limited knowledge on the accuracy of the version 7 research product TRMM 3B42-V7 despite having the advantage of a high temporal resolution and large spatial coverage over oceans and land. This is particularly the case in tropical regions in Asia. The objective of this study is therefore to analyze the performance of rainfall estimation from TRMM 3B42-V7 (henceforth TRMM) using rain gauge data in Malaysia, specifically from the Pahang river basin as a case study, and using a set of performance indicators/scores. The results suggest that the altitude of the region affects the performances of the scores. Root Mean Squared Error (RMSE) is lower mostly at a higher altitude and mid-altitude. The correlation coefficient (CC) generally shows a positive but weak relationship between the rain gauge measurements and TRMM (0 < CC < 0.4), while the Nash-Sutcliffe Efficiency (NSE) scores are low (NSE < 0.1). The Percent Bias (PBIAS) shows that TRMM tends to overestimate the rainfall measurement by 26.95% on average. The Probability of Detection (POD) and Threat Score (TS) demonstrate that more than half of the pixel-point pairs have values smaller than 0.7. However, the Probability of False Detection (POFD) and False Alarm Rate (FAR) show that most of the pixel-point gauges have values lower than 0.55. The seasonal analysis shows that TRMM overestimates during the wet season and underestimates during the dry season. The bias adjustment shows that Mean Bias Correction (MBC) improved the scores better than Double-Kernel Residual Smoothing (DS) and Residual Inverse Distance Weighting (RIDW). The large errors imply that TRMM may not be suitable for applications in environmental, water resources, and ecological studies without prior correction.


Introduction
For many environmental, water resources, and ecological applications, it is important to quantify rainfall accurately. However, quantifying the rainfall can be challenging, especially in inaccessible areas such as Antarctica and Amazonia. The Tropical Rainfall Measuring Mission (TRMM) is a research satellite designed to provide information on rainfall in the tropical and sub-tropical regions of the Earth. The TRMM Multi-Satellite Precipitation Analysis (TMPA), also known as TRMM 3B42, is an algorithm that provides the best estimation of rainfall based on a combination of measurements from multiple sensors onboard multiple satellites [1]. TRMM 3B42 provides rainfall values between 1998 and 2014 and presents one of the most valuable 17 years of spatiotemporal rainfall datasets to date.
The TRMM Multisatellite Precipitation Analysis algorithm draws on data from multiple other satellites, namely precipitation-related passive microwave and infrared measurements [1]. These are inter-calibrated and merged with the sensor measurements onboard the TRMM satellite, followed by gauge adjustment at the monthly timescale before back-scaling to a three-hourly temporal resolution. The multiple data sources and multiple levels of processing result in many sources of error in the final product. There have been efforts to evaluate TRMM rainfall data, but until now, there is still limited knowledge of the accuracy of TRMM 3B42 in its latest version 7 (henceforth TRMM for brevity) over the tropical regions of Asia, which restricts its application in the fields of ecology, climate, and hydrology [2]. Topography, seasonality, and climatology have been shown to play a role in the performance of satellite precipitation products.
Topography affects TRMM rainfall performance, especially in terms of detection probability and bias. In Thailand, regions with complex terrain exhibited poor rain detection and magnitude-dependent mean errors [3]. In Morocco, TRMM can accurately estimate precipitation events at elevations below 1000 m, but faces difficulties in high-elevation areas with high snow content and low rainfall [4]. Signals in the microwave region can be scattered by cold (or snow-covered) areas and complex terrain [1]. This may be attributed to the TRMM Microwave Imager (TMI) onboard the TRMM satellite: TMI is sensitive to temperature, which can lead to inaccurate rainfall rates. A study carried out in Northeastern Iberia showed that TRMM performs well in low-altitude regions with few precipitation events and poorly in coastal areas and complex terrain [5]. The impact of altitude on TRMM accuracy is consistent with a study carried out by Ouatiki et al. (2017) in Morocco, which showed that the TRMM product had problems in estimating rainfall rates over mountainous regions [6]. The difficulties relate to the ability of the Precipitation Radar (PR) to obtain information about the rain event, such as its intensity, distribution, and rain type [7]. Mountainous terrain may block the radar beam and prevent it from obtaining such information. Errors can also be caused by radar waves that hit the ground and return false echoes [8].
In addition, the temporal and spatial scales also affect the performance of TRMM estimates. There are some uncertainties which make the sensitivity of local-scale rainfall in a small region contentious [9]. These include the lack of sensitivity of the TRMM precipitation algorithm to low and high precipitation clouds [10,11], the effect of upscaling the rainfall rate to an effective temporal scale [12], and the coarse grid size of the TRMM data for resolving local rainfall patterns [13].
The TRMM rainfall performances are also found to be affected by the climatology of a region. Areas with high snow content are problematic due to the sensitivity to the surface emissivity in passive microwave sensors, which produces signals similar to those of precipitation [13]. Satellite-based precipitation datasets also show different performances in warm and cold seasons [14]. Complex emissivity from a cold place and area covered in ice, especially in winter, may be recognized as a rain event, which could lead to missed precipitation by TRMM sensors [15].
Apart from topography and climatology, seasonality also plays a big role in TRMM rainfall performance. TRMM is reported to be affected by monsoon systems. A study carried out in Malaysia shows that TRMM is less sensitive to low precipitation clouds than to heavy precipitation clouds, and that the correlation obtained was better during the northeast monsoon [9]. The heaviest rainfall events occurred during the northeast monsoon and were mainly caused by large-scale monsoon flows that contained heavy precipitation clouds [10]. The error can be explained by the presence of low versus heavy precipitation clouds in the wet and dry seasons. Based on a study carried out in Bali, most high-rainfall events occur during the wet season (December-January-February) and the lowest rainfall events occur during the dry season (June-July-August) [16]. Those conditions are associated with the northeast and southwest monsoon patterns. That study found that the correlation between rain gauges and TRMM is lower during the wet season and somewhat higher during the dry season. A study carried out in China found that TRMM can both overestimate and underestimate rainfall in the dry and wet seasons, depending on the rain rate [17]. TRMM under-detects and underestimates rainfall rates during high precipitation events, whereas it over-detects and overestimates rainfall rates during light precipitation events [17,18]. Meanwhile, a study carried out in the Qinghai Lake Basin, China, shows that TRMM performs poorly between May and September, which is the dry season [19]. This is consistent with a study carried out in Morocco, which showed that July and August exhibit the lowest correlation coefficients [6]. A study in the Circum-Bohai-Sea region, China, showed that the rainfall pattern is more effectively captured by TRMM for the wet region or season than for the dry region or season [20].
In short, the performance of TRMM rainfall estimates can depend on the season. In the case of Malaysia, the performance may differ between the two monsoons, namely the northeast monsoon (wet season) and the southwest monsoon (dry season).
Rainfall information from a large number of rain gauges is already adopted as part of global satellite algorithms, including TRMM at a final stage of monthly bias correction [1]. A practical method of estimating rainfall is to merge satellite and rain gauge data, combining accurate quantitative rainfall from stations with spatially continuous information from remote sensing observations. Even so, most developing countries have restricted access to rain gauge data. As a result, the global precipitation product may be unsatisfactory and require local adjustment. To improve the performance of satellite algorithms, multiple statistical methods have been proposed. Example statistical methods, many with applications originating from ground-based weather radars, are mean bias correction (MBC) [21,22], double-kernel smoothing (DS) [23], and residual inverse distance weighting (RIDW) [24].
In February 2014, the Global Precipitation Measurement (GPM) mission was launched as a follow-on to TRMM, with the objective of observing global precipitation more frequently and more accurately than TRMM. The GPM design is based on improving the shortcomings of TRMM, and hence an in-depth study of the performance of TRMM could provide the basis for a study on GPM improvements. Yet, despite the significant period of rainfall records, extensive studies of TRMM accuracy in measuring rainfall in South East Asia, specifically in Malaysia, are sparse. Thus, the objectives of this paper are: (1) to analyse the performance of rainfall estimation from TRMM in Malaysia using the Pahang river basin as a case study; and (2) to compare several adjustment methods for correcting TRMM based on rainfall estimates from ground stations. The Pahang river is the largest river in Peninsular Malaysia, and the river basin is one of the important water catchment areas in Malaysia that also provides water resources to neighbouring urbanized states such as Selangor. The performance will be assessed by computing a set of performance indicators using rain gauge data. Section 2 will describe the study area, methodology, and datasets used, while Section 3 will present the results and discussion. Finally, Section 4 will provide some conclusions and recommendations for future work.

Study Area
The study area is Pahang, which is situated at 2°25′55″-4°48′4″ N and 101°19′18″-104°14′31″ E in Peninsular Malaysia. As shown in Figure 1, the topography of Peninsular Malaysia is dominated by a mountain range known as Banjaran Titiwangsa (Main Range), which extends from the Thailand border southwards to Negeri Sembilan. The largest basin in Peninsular Malaysia is the Sg. Pahang Basin [25]. The Sg. Pahang basin has an annual rainfall of about 2170 mm, a large proportion of which occurs during the North-East Monsoon between mid-October and mid-January [26].
Malaysia's climate is hot and humid throughout the year since it is situated near the equator. The rainfall distribution patterns depend on the seasonal wind flow patterns and the local topographic features [27]. In particular, Malaysia experiences two distinct monsoons: the northeast monsoon, which brings the wet season from October to March, and the southwest monsoon, which brings the dry season from June to September. Throughout the northeast monsoon season,

Rainfall Data
The TRMM 3B42-V7 dataset was obtained from https://pmm.nasa.gov/data-access/downloads/trmm. The satellite estimates were obtained with a 3-h temporal resolution and 0.25° × 0.25° (approximately 27.8 km × 27.8 km) spatial resolution. The data were extracted for the region covering Pahang (latitude and longitude boundaries) and for the years 1998-2014 (17 years). In order to match the temporal resolution of the rain gauge data, which is daily, the values were aggregated to the daily scale. The three-hourly rainfall rate was assumed to be constant over the 3 h, and all the three-hourly rainfall totals in the defined 24-h period were summed to obtain the total daily rainfall.
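The daily aggregation described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: it assumes the three-hourly values are rain rates in mm/h, that the series starts at the beginning of the 24-h accumulation period, and that there are no missing steps. The function name `daily_totals` is hypothetical.

```python
def daily_totals(rates_mm_per_h, step_hours=3.0, steps_per_day=8):
    """Sum 3-hourly rain rates (mm/h) into daily totals (mm/day).

    Assumption: the series starts at the beginning of the 24-h period
    and contains only whole days with no gaps.
    """
    if len(rates_mm_per_h) % steps_per_day != 0:
        raise ValueError("series must cover whole days")
    totals = []
    for start in range(0, len(rates_mm_per_h), steps_per_day):
        window = rates_mm_per_h[start:start + steps_per_day]
        # Constant-rate assumption: each step contributes rate * 3 h of rain.
        totals.append(sum(rate * step_hours for rate in window))
    return totals


# Two days: a steady 1 mm/h on the first day, no rain on the second.
example = daily_totals([1.0] * 8 + [0.0] * 8)
```

In practice the 24-h window would need to be aligned with the reporting day of the rain gauges, which the text does not specify.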
Meanwhile, the ground-based (rain gauge) rainfall data were obtained from the Department of Irrigation and Drainage (DID). The data were taken from all 32 stations in Pahang with data on record between 1998 and 2014. Table A1 (Appendix A) shows the rain gauge stations, which were collocated with the midpoints of TRMM pixels. As several TRMM pixels contained more than one station, this resulted in a total of 27 pixel-point pairs. For these pixels, the rain gauge values from the multiple stations were averaged to obtain a pixel-averaged time series.
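A possible way to form the pixel-point pairs is sketched below. It assumes that the 0.25° grid cell edges fall on multiples of 0.25°, so that pixel centres lie at values such as 4.625°N, 101.375°E, and that all station series have the same length with no missing days. The function names and the station values are hypothetical.

```python
import math
from collections import defaultdict


def pixel_index(lat, lon, res=0.25):
    """Return the (row, col) index of the grid cell containing a point."""
    return (math.floor(lat / res), math.floor(lon / res))


def pixel_centre(index, res=0.25):
    """Return the (lat, lon) midpoint of a grid cell."""
    row, col = index
    return ((row + 0.5) * res, (col + 0.5) * res)


def pixel_averaged_series(stations, res=0.25):
    """Average the daily series of all stations that share a pixel.

    stations: list of (lat, lon, series) tuples with equal-length series.
    Returns a dict mapping pixel index to the pixel-averaged series.
    """
    groups = defaultdict(list)
    for lat, lon, series in stations:
        groups[pixel_index(lat, lon, res)].append(series)
    return {
        idx: [sum(day) / len(day) for day in zip(*group)]
        for idx, group in groups.items()
    }


# Hypothetical stations: the first two fall in the same pixel.
stations = [
    (4.52, 101.38, [10.0, 0.0]),
    (4.60, 101.45, [20.0, 4.0]),
    (3.10, 102.90, [1.0, 1.0]),
]
pairs = pixel_averaged_series(stations)
```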
The study region is further divided into low-altitude (h < 100 m), mid-altitude (100 m < h < 500 m), and high-altitude (h > 500 m) areas using the Digital Elevation Model from the Shuttle Radar Topography Mission (SRTM) Version 3.0 Global 1 arc second Data (data available from the U.S. Geological Survey). Table 1 shows the number of pixel-point pairs in each altitude class. It is important to note that satellite-derived measurements differ from ground station observations for various reasons. Satellite sensors have finite fields of view; thus, a measurement taken by a satellite sensor represents the average value within a pixel. In contrast, measurements from ground stations are generally point values. Therefore, satellite measurements are intrinsically different from ground station measurements.

Detection Metrics
Rainfall detection metrics are performance criteria that compare the occurrence and non-occurrence of rainfall events between TRMM and rain gauges. Table 2 shows the detection metrics used in this study.

Table 2. Detection metrics used in the study (Sakolnakhon (2013)).
The probability of detection (POD) or the hit rate
• The fraction of the observed 'yes' events that were also forecast as 'yes' events.
• This score is sensitive to hits, but it ignores false alarms.
The false alarm ratio (FAR)
• A measure of the fraction of predicted 'yes' events that did not actually happen.
• This score is sensitive to false alarms, but it ignores missed events.
The probability of false detection (POFD) or the false alarm rate
• The fraction of the observed 'no' events that were incorrectly forecast as 'yes' events.
• Range: 0 to 1 (perfect forecast = 0)
• While it is sensitive to false alarms, it ignores missed events.
The threat score (TS) or critical success index (CSI)
• It tells how well the forecast 'yes' events correspond to the observed 'yes' events.
• Range: 0 to 1 (perfect forecast = 1)
• It can be thought of as the accuracy when correct negatives have been removed from consideration.
Here a, b, c, and d are the numbers of events shown in Table 3.

Table 3. Contingency table of forecast versus observed 'yes' and 'no' events, with totals (... (2013)).
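The four detection scores can be computed from a contingency table as sketched below. The formulas are the standard forecast-verification definitions that match the verbal descriptions above; the code names the four cells explicitly (hits, false alarms, misses, correct negatives) rather than assuming which of a, b, c, and d denotes which cell. The 1 mm/day rain/no-rain threshold and the function names are assumptions made for illustration and are not stated in the text.

```python
def contingency_counts(satellite, gauge, threshold=1.0):
    """Count hits, false alarms, misses and correct negatives.

    A day counts as 'yes' when rainfall >= threshold (mm/day).
    The threshold value is an assumption, not taken from the study.
    """
    hits = false_alarms = misses = correct_negatives = 0
    for sat, obs in zip(satellite, gauge):
        sat_yes, obs_yes = sat >= threshold, obs >= threshold
        if sat_yes and obs_yes:
            hits += 1
        elif sat_yes:
            false_alarms += 1
        elif obs_yes:
            misses += 1
        else:
            correct_negatives += 1
    return hits, false_alarms, misses, correct_negatives


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when undefined."""
    return numerator / denominator if denominator else float("nan")


def detection_scores(hits, false_alarms, misses, correct_negatives):
    """Return POD, FAR, POFD and TS from the contingency counts."""
    return {
        "POD": _ratio(hits, hits + misses),
        "FAR": _ratio(false_alarms, hits + false_alarms),
        "POFD": _ratio(false_alarms, false_alarms + correct_negatives),
        "TS": _ratio(hits, hits + false_alarms + misses),
    }


# Hypothetical six-day series (mm/day) for one pixel-point pair.
counts = contingency_counts([5, 0, 3, 0, 2, 0], [4, 0, 0, 2, 6, 0])
scores = detection_scores(*counts)
```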

Volumetric Metrics
The satellite product and rain gauge time series were compared to analyse their variation and relationship using various volumetric metrics: Root Mean Squared Error (RMSE), Correlation Coefficient (CC), Nash-Sutcliffe Efficiency (NSE), and Percent Bias (PBIAS). Table 4 shows the standard verification indices for evaluating the TRMM rainfall data.

Table 4. Standard verification indices for evaluating the TRMM rainfall data (... [28]; Zambrano-Bigiarini (2014) [29]).
Root Mean Squared Error (RMSE)
• The square root of the average of the squared differences between the forecast and observed values.
Nash-Sutcliffe Efficiency (NSE)
• It indicates how well the plot of observed vs. forecast values fits the 1:1 line.
• NSE ranges from -Inf to 1. The closer to 1, the more accurate the model is.
Percent Bias (PBIAS)
• A measure of the average tendency of the forecast values to be larger or smaller than their observed ones.
The correlation coefficient (CC)
• A measure of the strength and direction of the linear relationship between two variables.
• The correlation coefficient may take any value between −1.0 and +1.0.
Here S_i is the estimated value, O_i is the observed value, and N is the number of samples.
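The four volumetric metrics can be written compactly as below. These are the standard textbook definitions in terms of the estimated values S_i, the observed values O_i, and the sample size N; they are offered as a sketch and may differ in detail from the implementation used in the study. PBIAS is written so that positive values mean overestimation by the satellite. The function names are hypothetical.

```python
import math


def rmse(sim, obs):
    """Root mean squared error between estimates and observations."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))


def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 is perfect, below 0 is worse than the observed mean."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((s - o) ** 2 for s, o in zip(sim, obs))
    obs_variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / obs_variance


def pbias(sim, obs):
    """Percent bias: positive values indicate overestimation."""
    return 100.0 * sum(s - o for s, o in zip(sim, obs)) / sum(obs)


def cc(sim, obs):
    """Pearson correlation coefficient."""
    mean_sim = sum(sim) / len(sim)
    mean_obs = sum(obs) / len(obs)
    cov = sum((s - mean_sim) * (o - mean_obs) for s, o in zip(sim, obs))
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    return cov / math.sqrt(var_sim * var_obs)


# Hypothetical daily rainfall (mm/day): gauge observations and satellite estimates.
gauge = [0.0, 10.0, 20.0, 30.0]
satellite = [5.0, 10.0, 25.0, 40.0]
```

Note that NSE and CC are undefined when the observed series is constant, and PBIAS is undefined when the observed total is zero; a production version would need to guard against these cases.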

Bias Adjustment Parameters
The adjusted estimates are evaluated against the rain gauge observations using cross validation, in order to examine the improvement in the accuracy of the satellite rainfall estimates. In this study, leave-one-out cross validation was used to evaluate the three adjustment methods for all events using the performance indicators: one rain gauge observation is removed at a time from the data set, and its value is estimated from the remaining data [30].
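The leave-one-out procedure can be sketched generically as below. The `estimator` argument stands in for any of the three adjustment methods; the `mean_of_others` estimator is only a placeholder used to make the example runnable and is not one of the methods evaluated in the study. All names are hypothetical.

```python
def leave_one_out(values, estimator):
    """Estimate each gauge value from the remaining gauges only.

    values: gauge values at one time step.
    estimator: callable (held_out_index, remaining_values) -> estimate.
    """
    estimates = []
    for k in range(len(values)):
        # Withhold gauge k and predict it from all the others.
        remaining = values[:k] + values[k + 1:]
        estimates.append(estimator(k, remaining))
    return estimates


def mean_of_others(held_out_index, remaining):
    """Placeholder estimator: the mean of the remaining gauges."""
    return sum(remaining) / len(remaining)


cross_validated = leave_one_out([2.0, 4.0, 9.0], mean_of_others)
```

The cross-validated estimates can then be scored against the withheld gauge values with the same performance indicators used for the unadjusted product.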

Mean Bias Correction (MBC)
The mean bias correction (MBC) provides information on the long-term performance of the correlations by allowing a comparison of the actual deviation between calculated and measured values term by term. MBC is the simplest method, in which the bias correction factor is constant over time and space [31-33]. The MBC factor is calculated for each time step as an average across all point-pixel pairs and is multiplied with the satellite estimates over the entire study area. For each time step, the correction factor is calculated from the gauge and satellite values at the gauged locations, where N is the number of available gauges inside the satellite domain, and Z_G(x_j) and Z_S(x_j) are the gauge and satellite daily rainfall values corresponding to gauged location j.
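A minimal sketch of a multiplicative mean bias correction for one time step is given below. It assumes that the factor is the ratio of the summed gauge values to the summed satellite values over the N gauged locations, which is a common form of this correction but is an assumption here. Returning a factor of 1 when the satellite total is zero is also an assumption, and the function names are hypothetical.

```python
def mbc_factor(gauge, satellite):
    """Multiplicative bias factor for one time step.

    Assumed form: sum of gauge values / sum of satellite values
    over the N gauged locations.
    """
    satellite_total = sum(satellite)
    if satellite_total == 0:
        # Assumption: leave the field unchanged when the ratio is undefined.
        return 1.0
    return sum(gauge) / satellite_total


def apply_mbc(satellite_field, factor):
    """Scale every satellite pixel by the same factor."""
    return [factor * value for value in satellite_field]


# Hypothetical values (mm/day) at three gauged pixels.
factor = mbc_factor([10.0, 20.0, 30.0], [20.0, 20.0, 40.0])
adjusted = apply_mbc([8.0, 0.0, 4.0], factor)
```

Because a single factor is applied everywhere, this correction changes the magnitude of the satellite field but not its spatial pattern.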

Residual Inverse Distance Weighting (RIDW)
The method estimates the unknown value at a point using a linearly weighted combination of its neighbouring sample points, with the weights inversely related to the distance between the estimated and sample points [34]. The bias adjustment is based on a previous study carried out in Ethiopia [35]. First, the satellite rainfall estimates are extracted at the rain gauge locations. The differences between the satellite and rain gauge estimates are then calculated at each station location. These residuals are interpolated to each satellite pixel centre using inverse distance weighting (IDW), and the interpolated differences are added back on to the satellite estimates. The weights for each sample are inversely proportional to the distance from the point being estimated. In this formulation, P_x is the estimate of rainfall for the ungauged station, Z_G(x_j) and Z_S(x_j) are the gauge and satellite daily rainfall values corresponding to gauged location j, d_j is the distance from each gauged location to the point being estimated, and N is the number of surrounding stations.
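The residual interpolation can be sketched as below. Several choices are assumptions rather than details from the text: the residual is taken as gauge minus satellite so that adding it moves the satellite value toward the gauge, the distance exponent defaults to 1, distances are computed in planar coordinates, and negative adjusted rainfall is clipped to zero. The function names are hypothetical.

```python
import math


def idw(points, values, target, power=1.0):
    """Inverse-distance-weighted average of values at a target point."""
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in zip(points, values):
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a sample point: use its value.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def ridw_adjust(gauge_xy, gauge_values, satellite_at_gauges,
                pixel_xy, satellite_at_pixels, power=1.0):
    """Add IDW-interpolated residuals to the satellite estimates."""
    # Assumed sign convention: residual = gauge - satellite.
    residuals = [g - s for g, s in zip(gauge_values, satellite_at_gauges)]
    adjusted = []
    for pixel, satellite_value in zip(pixel_xy, satellite_at_pixels):
        correction = idw(gauge_xy, residuals, pixel, power)
        # Assumption: rainfall cannot be negative after adjustment.
        adjusted.append(max(0.0, satellite_value + correction))
    return adjusted


# Two hypothetical gauges and two pixel centres (planar coordinates).
result = ridw_adjust(
    gauge_xy=[(0.0, 0.0), (2.0, 0.0)],
    gauge_values=[10.0, 20.0],
    satellite_at_gauges=[12.0, 16.0],
    pixel_xy=[(1.0, 0.0), (0.0, 0.0)],
    satellite_at_pixels=[14.0, 12.0],
)
```

At a pixel that coincides with a gauge, the adjusted value equals the gauge value, so in a leave-one-out evaluation the withheld gauge must be excluded from `gauge_xy`.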

Double-Kernel Residual Smoothing (DS)
The double-kernel smoothing technique (DS) estimates the residual field as a weighted average of the point residuals ε_S, using kernel functions, and then adjusts the satellite field by the predicted residual field [23]. The point residual at each gauged location j = 1, ..., N is defined from Z_G(x_j) and Z_S(x_j), the gauge and satellite daily rainfall values corresponding to gauged location j.
The idea of the double smoothing estimator is to add new pseudo-observations through a coarse interpolation, which then contribute to the final estimates. The estimator is constructed in two steps. First, the original residual ε_S is transformed to a gridded pseudoresidual ε_SS with equal spacing, defined at each grid point location i = 1, ..., M, where ||.|| is the Euclidean norm and Λ is the kernel function, defined as a Gaussian kernel following Li and Shao [23]. The variable H is the position of the points, and the bandwidth b is determined using Silverman's rule of thumb, b = (4σ^5 / (3n))^(1/5), where n is the number of samples and σ is the standard deviation of the samples. The second step of DS is applied to both the residuals and the pseudoresiduals to estimate the final error field ε_DS. The merged product Z_DS at point k is calculated by subtracting the corresponding error from the satellite estimate Z_S at point k. The kernel smoothing (interpolation) of the residuals does not rely on the stationarity assumption. Thus, the merged product converges toward the rain gauge estimates as the distance to the ground observations decreases.
Remote Sens. 2018, 10, 388 8 of 24
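The two smoothing steps can be sketched as below. This is an illustrative reading of the description, not the authors' implementation. It assumes that the residual is satellite minus gauge (so that subtracting the error moves the estimate toward the gauge), that both steps are kernel-weighted averages with a Gaussian kernel of the scaled Euclidean distance, and that separate bandwidths may be used for the gauge residuals and the gridded pseudoresiduals. Which samples enter Silverman's rule is not stated in the text, so the bandwidth helper simply takes a list of samples and uses the sample standard deviation. All function names are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def silverman_bandwidth(samples):
    """Silverman's rule of thumb: b = (4 * sigma^5 / (3 * n))^(1/5)."""
    n = len(samples)
    mean = sum(samples) / n
    sigma = math.sqrt(sum((v - mean) ** 2 for v in samples) / (n - 1))
    return (4.0 * sigma ** 5 / (3.0 * n)) ** 0.2


def kernel_weights(points, target, bandwidth):
    """Kernel weight of each point with respect to the target location."""
    return [gaussian_kernel(math.dist(p, target) / bandwidth) for p in points]


def kernel_average(points, values, target, bandwidth):
    """Kernel-weighted average of values at the target location."""
    weights = kernel_weights(points, target, bandwidth)
    total = sum(weights)
    if total == 0.0:
        return 0.0  # Assumption: no adjustment far from all points.
    return sum(w * v for w, v in zip(weights, values)) / total


def ds_error(gauge_xy, residuals, grid_xy, target, b_gauge, b_grid):
    """Double-smoothed error at the target location."""
    # Step 1: coarse interpolation of the residuals onto a regular grid.
    pseudo = [kernel_average(gauge_xy, residuals, g, b_gauge) for g in grid_xy]
    # Step 2: smooth the residuals and pseudoresiduals together.
    w_gauge = kernel_weights(gauge_xy, target, b_gauge)
    w_grid = kernel_weights(grid_xy, target, b_grid)
    total = sum(w_gauge) + sum(w_grid)
    if total == 0.0:
        return 0.0
    weighted = sum(w * e for w, e in zip(w_gauge, residuals))
    weighted += sum(w * e for w, e in zip(w_grid, pseudo))
    return weighted / total


def ds_adjust(satellite_value, error):
    """Merged product: satellite estimate minus the estimated error."""
    return satellite_value - error


# Hypothetical residuals (satellite minus gauge, mm/day) at two gauges.
error = ds_error(
    gauge_xy=[(-1.0, 0.0), (1.0, 0.0)],
    residuals=[0.0, 4.0],
    grid_xy=[(0.0, 0.0)],
    target=(0.0, 0.0),
    b_gauge=1.0,
    b_grid=1.0,
)
merged = ds_adjust(10.0, error)
```

A useful sanity check is that a spatially constant residual field is reproduced exactly, since every step is a weighted average.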

Detection Metrics
The probability of detection (POD) values for rainfall estimates in Pahang are shown in Figure 2a. The figure shows that 14 out of 27 pixel-point pairs located in the low- and mid-altitude regions have POD scores lower than 0.7. The lowest POD value is 0.54, which is located near the coast. Table 5 shows that the POD values vary from 0.5 to 0.8 (with an average POD of 0.68), which can be considered an acceptable performance. A previous study analysing the performance of TRMM, amongst other satellite products, over a smaller time window (2003-2007) shows that the POD for the rain gauges in Pahang ranges from 0.7 to 0.8, which is close to the range of scores obtained here (0.5-0.8) [36]. The region-specific performance could be due to the fact that the spatial resolution (0.25°) causes TRMM estimates to be insensitive to local-scale precipitation events. Figure 2b shows the False Alarm Ratio (FAR) of the rainfall estimates in Pahang Basin. The lowest FAR value, 0.32, is located at the Gunung Brinchang station (4.625°N, 101.375°E). The highest FAR value is 0.62, which is located near the east coast of Pahang. There are 14 out of 27 pixel-point pairs located in the low- and mid-altitude regions that have FAR scores of more than 0.45. This result is consistent with a study carried out in Nepal, which shows a higher FAR in a low-elevation region [37]. Table 5 shows that FAR values vary from 0.3 to 0.5. Figure 2c shows the Probability of False Detection (POFD) for rainfall estimates in Pahang Basin. The POFD scores across the study area appear to be independent of geographical characteristics. This independence can be explained by the sensitivity of microwave and infrared retrieval algorithms to scattering signals and cold cloud temperatures from the rainfall systems over these terrain features [38].
The majority of the pixel-point pairs located at the coast have a POFD value of more than 0.6, indicating a tendency of TRMM to over-detect rainfall. Table 5 shows the POFD values at each pixel-point pair. Figure 2d shows the TS of TRMM in Pahang. The highest TS value is in Gunung Brinchang

Volumetric Metrics
Table 6 shows the performance scores calculated for TRMM against the rain gauge measurements. Figure 3a shows the RMSE between the gauge and TRMM daily rainfall over the Pahang Basin. The lowest RMSE (12.80 mm/day) is found at Gunung Brinchang Station, one of the high-altitude stations. However, this is still a large error, considering that the average daily rainfall at the station is 19.14 mm/day. On the other hand, a higher RMSE (29.87 mm/day) is found near the coastal region. This result agrees with a study in Iran, which shows that estimates of satellite rainfall in highland and mountainous areas are more accurate than in lowland areas [39]. In addition, TRMM performs poorly at coastal and island sites, which is consistent with a study carried out in the Tropical Pacific Basin [40]. Convergence zones can be persistent when the breezes and synoptic gradient winds interact with each other and the surface friction increases in the transition from sea to land [41]. Low-level convergence of moist air assists the dynamical and microphysical processes for the formation of clouds and precipitation [41]. As a result, the TRMM estimates can be inaccurate. Figure 3b shows the percent bias (PBIAS) for Pahang. Positive and negative values indicate overestimation and underestimation bias, respectively. Most of the PBIAS values that are close to zero or negative are on the east coast of Pahang. This is consistent with a previous study which also showed that TRMM underestimates rainfall on the coast [36]. These are also the same pixel-point pairs associated with high RMSE values.
During the northeast monsoon season, exposed areas like the east coast of Peninsular Malaysia experience heavy rain spells. Underestimation in satellite-based precipitation products may be caused by heavy rainfall, which can reduce the signal of the passive microwave (PMW) sensor [42]. In coastal areas, the frequent occurrence of low stratiform clouds can affect the TRMM estimates if the clouds are under stable conditions, which are detached from precipitation patterns [43]. The average PBIAS value is 26.95%, which indicates an overall overestimation bias. Figure 3c shows the Nash-Sutcliffe Efficiency (NSE) of rainfall estimates in Pahang Basin. Efficiency scores below 0 indicate that the TRMM rainfall results are less accurate than the mean of the observed data and are therefore very poor estimates [44]. The figure shows that most of the values are under 0, and that the NSE values decrease with distance from the coast. There is only one station with a positive value (NSE = 0.1), which is located at Pulau Tioman (2.875°N, 104.125°E). Even though the value is positive, it is still low, indicating the weak performance of the satellite product at the daily scale. Figure 3d shows the correlation coefficient (CC) of rainfall estimates in Pahang basin. The correlation coefficients are shown in Table 6, where the correlations vary from 0 to 0.49. Most of the CC values are positive; however, the average CC value is 0.26, indicating a weak performance. There are 16 out of 27 pixel-point pairs that have CC scores ranging from 0.30 to 0.49. A study in Malaysia also displayed the same range of CC scores in the Pahang area [36].

Detection Metrics
In Figure 4a, approximately 50% of the pixel-point pairs in the high-altitude region have POD scores higher than 0.77, although the distribution contains a higher frequency of lower-valued scores. At mid-altitude, about 75% of the pixel-point pairs have POD scores above 0.74, although two pixel-point pairs have POD scores below 0.60. In the low-altitude region, the distribution is positively skewed and all station scores are under 0.80; about 50% of the pixel-point pairs have POD scores lower than 0.63.
For the FAR scores, Figure 4b shows that the minimum score is located in the high-altitude region (FAR = 0.32), while the highest score is located in the low-altitude region (FAR = 0.62). Unlike in the low-altitude region, the score distributions in the high-altitude and mid-altitude regions are almost symmetrical. For the most part, the interquartile range is large, indicating a large variability in the results between the pixel-point pairs.
Figure 4c shows the performance of POFD. In the high-altitude region, the highest score is 0.54 and the lowest score is 0.44, which is also the lowest score across all regions. Almost all pixel-point pairs in the mid-altitude region have scores higher than 0.54. In the low-altitude region, the data distribution is negatively skewed, indicating that the data contain a higher frequency of low-valued scores.
The TRMM threat score (TS), on the other hand, shows a large variability in each altitude region. The boxplot for the mid-altitude region in Figure 4d shows that the score distribution is negatively skewed and that 75% of the pixel-point pairs have scores lower than 0.49. In the high-altitude and low-altitude regions, the score distributions are positively skewed, indicating that the data contain a higher frequency of high-valued scores.
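The detection scores above are all derived from a 2×2 contingency table of rain/no-rain days. The sketch below uses the common textbook definitions of POD, FAR, POFD, and TS. The function name, the sample data, and the 1.0 mm/day rain/no-rain threshold are assumptions made for illustration; the threshold used in the study is not stated in this section.

```python
def detection_scores(gauge, trmm, threshold=1.0):
    """Compute POD, FAR, POFD and TS from paired daily rainfall values.

    threshold (mm/day) separates rain from no-rain days; 1.0 is an assumed value.
    """
    hits = misses = false_alarms = correct_negatives = 0
    for g, s in zip(gauge, trmm):
        gauge_rain = g >= threshold
        sat_rain = s >= threshold
        if gauge_rain and sat_rain:
            hits += 1
        elif gauge_rain:
            misses += 1
        elif sat_rain:
            false_alarms += 1
        else:
            correct_negatives += 1

    def ratio(num, den):
        # Undefined scores (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "POD": ratio(hits, hits + misses),                       # perfect score: 1
        "FAR": ratio(false_alarms, hits + false_alarms),         # perfect score: 0
        "POFD": ratio(false_alarms, false_alarms + correct_negatives),  # perfect: 0
        "TS": ratio(hits, hits + misses + false_alarms),         # perfect score: 1
    }

# Hypothetical eight-day record (mm/day) for one pixel-point pair.
gauge = [0.0, 5.0, 10.0, 0.0, 20.0, 0.0, 0.0, 3.0]
trmm = [2.0, 4.0, 0.0, 0.0, 12.0, 1.5, 0.0, 0.2]
print(detection_scores(gauge, trmm))
```

Because TS penalises both misses and false alarms, it can stay low even when POD is moderate, which is the pattern described for many pixel-point pairs.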

Volumetric Metrics
For the RMSE in Figure 4e, the scores for the high-altitude and low-altitude regions show a large variability, since the interquartile range for both regions is relatively large. Overall, both the highest and the lowest RMSE scores are located in the high-altitude region. The mid-altitude region has the smallest variability in scores compared to the other regions. Station Sg. Anak Endau Kg. Mok in the mid-altitude region has a high score (RMSE = 28.56 mm/day) that is numerically far from the scores of the other stations. The results suggest that pixel-point pairs in the mid-altitude region have better overall RMSE performance than those in the other regions.
In Figure 4f, the PBIAS score distribution in the mid-altitude area is negatively skewed; about half of the pixel-point pairs in the region have scores below 35.6%. In contrast, the score distribution in the low-altitude region is positively skewed, where the scores contain a higher frequency of high PBIAS. In the high-altitude region, the score distribution is normal.
The highest NSE score is 0.09, which is located in the low-altitude region. In Figure 4g, the TRMM score distribution in the low-altitude region is positively skewed. This indicates that the scores contain a higher frequency of higher NSEs. Similarly, the TRMM score distributions in the mid-altitude and high-altitude regions are also positively skewed. The overall NSE performance was poor: in general, all pixel-point pairs in the high- and mid-altitude areas had NSE values below 0, and almost 75% of the pixel-point pairs in the low-altitude area had NSE values below 0.
The CC score distributions in the high-altitude and low-altitude regions are negatively skewed in Figure 4h. The scores in both regions contain a higher frequency of lower CC. The overall highest CC score is located in the mid-altitude region (CC = 0.49). Figure 4h shows that most of the pixel-point pairs in all regions have CC values below 0.4, which indicates poor performance. Compared to the other regions, the mid-altitude region shows a slightly better performance, as there are two pixel-point pairs with high CC values and the differences in CC values within the region are small.
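The boxplot comparisons above rest on quartiles, the interquartile range (IQR), and points that sit far from the rest of a group. A minimal sketch of such a summary follows, assuming Tukey's 1.5 × IQR rule for flagging distant points (the study does not state which rule its boxplots use). Apart from the 28.56 mm/day value quoted in the text, the RMSE values are hypothetical.

```python
import statistics

def boxplot_summary(scores):
    """Return quartiles, IQR and Tukey outliers for one altitude group."""
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [s for s in scores if s < lower_fence or s > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}

# Hypothetical mid-altitude RMSE scores (mm/day); only 28.56 comes from the text.
mid_altitude_rmse = [14.0, 15.0, 15.5, 16.0, 16.5, 17.0, 28.56]
print(boxplot_summary(mid_altitude_rmse))
```

A small IQR with a single flagged point corresponds to the situation described for the mid-altitude RMSE scores: low variability overall, with one station numerically far from the others.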

Seasonal Analysis of TRMM
A seasonal analysis is performed by taking all seasons from 1998 to 2014 into consideration. Figure 5 shows the PBIAS of TRMM in each season. During the wetter seasons (DJF and MAM), negative bias occurs in the coastal region and in Cameron Highland (the highest-altitude pixel-point pair). The rest of the locations mostly show positive PBIAS values, that is, overestimation. This is contrary to other reports of underestimation during the wet season, for example, by Zulkafli in the Andean-Amazon River [45]. Inland locations generally show overestimation. The DJF and MAM seasons are overestimated on average by 33.05% and 30.33%, respectively. In the dry seasons, several locations show an underestimation by TRMM, mostly in areas close to the coast. However, the average PBIAS values for JJA and SON are 29.81% and 22.92%, respectively, demonstrating overall overestimation. The overestimation in both the wet and dry seasons at inland locations may be attributed to the type of rainfall events that predominantly occur, which are longer-duration, low- to moderate-intensity events. TRMM can overestimate light and moderate rain rates and underestimate heavy rain rates, as reported by Guo [17].
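The seasonal comparison amounts to pooling daily gauge and TRMM totals by season (DJF, MAM, JJA, SON) before computing PBIAS. The sketch below illustrates this under stated assumptions: the record layout, the function name, and the sample values are hypothetical, and all years are pooled into one total per season.

```python
from collections import defaultdict
from datetime import date

SEASON_BY_MONTH = {
    12: "DJF", 1: "DJF", 2: "DJF",
    3: "MAM", 4: "MAM", 5: "MAM",
    6: "JJA", 7: "JJA", 8: "JJA",
    9: "SON", 10: "SON", 11: "SON",
}

def seasonal_pbias(records):
    """records: iterable of (date, gauge_mm, trmm_mm) tuples for one pixel-point pair.

    Returns {season: PBIAS in percent}; positive means TRMM overestimates.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # season -> [gauge total, TRMM total]
    for day, gauge_mm, trmm_mm in records:
        season = SEASON_BY_MONTH[day.month]
        totals[season][0] += gauge_mm
        totals[season][1] += trmm_mm
    # Seasons with no gauge rainfall are skipped to avoid dividing by zero.
    return {
        season: 100.0 * (trmm_total - gauge_total) / gauge_total
        for season, (gauge_total, trmm_total) in totals.items()
        if gauge_total > 0
    }

# Hypothetical records (mm/day).
records = [
    (date(2000, 1, 5), 10.0, 13.0),
    (date(2000, 12, 20), 10.0, 12.0),
    (date(2000, 7, 1), 8.0, 6.0),
]
print(seasonal_pbias(records))
```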

Detection Metrics
Figure 6 shows the performance scores for detection metrics after bias adjustment. In Table 7, the average scores of both pre-corrected and post-corrected TRMM are compared. The average pre-corrected POD shows a moderate performance, but after bias adjustment, the scores improved. POD scores at high altitude generally show a smaller increase than POD scores at mid-altitude and low-altitude.
As for FAR and POFD, the scores decreased towards zero, indicating better detection skill. The MBC method reduces FAR especially at high-altitude and mid-altitude pixel-point pairs. The DS method, by contrast, reduces FAR especially for pixel-point pairs in the low-altitude and mid-altitude regions. The RIDW method has the highest average FAR, and altitude seems to have no effect on the reduction. For POFD, the MBC method shows that 50% of the pixel-point pairs have scores below 0.2, which indicates a good performance. The majority of the pixel-point pairs located in the low-altitude and mid-altitude regions have POFD scores below 0.2 after correction using the DS method. However, the RIDW method shows that only a few of the pixel-point pairs in the coastal and inland regions have POFD scores below 0.3. Half of the pixel-point pairs have scores in the range of 0.3-0.4.
On the other hand, the average pre-corrected TS shows a slightly weak performance, but after bias adjustment, the scores improved. The MBC method improved the scores especially for pixel-point pairs in the coastal region and for some pixel-point pairs in the inland region. The DS method particularly improved some pixel-point pairs in the inland region, and there is not much improvement in the TS scores for the RIDW method, as the average pre-corrected TS score (TSav = 0.423) is close to the average post-corrected TS score (TSav = 0.552).
Figure 7 shows the performance scores for volumetric metrics after bias adjustment. In Table 8, the average scores of both pre-corrected and post-corrected TRMM are compared. Overall, the average scores in Pahang show a poor performance before correction. RMSE scores are significantly reduced after the DS method is applied, with an average of 19.14 mm/day before correction and 15.86 mm/day after bias adjustment. The MBC method also performed well in reducing the RMSE scores. However, the RMSE scores are not consistent with the elevations. This may be because the correction factor applied to the pixel-point pairs was calculated using both ground-based and satellite estimates from different elevation regions. The range of pre-corrected RMSE is 12.80-29.87 mm/day and the range of post-corrected RMSE is 9.95-27.17 mm/day.
Most of the PBIAS values corrected by the MBC method that are closer to 0 are on the east coast of Pahang. During the northeast monsoon season, exposed areas such as the east coast of Peninsular Malaysia experience heavy rain spells. The heavy rainfall may attenuate the signal of the passive microwave (PMW) sensor, which may cause underestimation in satellite-based precipitation products (Qin et al., 2014). The average PBIAS value before correction is +26.95%, which indicates overestimation bias. After the MBC method was applied (range of −34.1% to 47.1%), there was a significant decrease in the average PBIAS value (PBIAS = 3.904%), which indicates a much smaller amount of overestimation. Twelve of the 27 pixel-point pairs underestimate the rainfall. For the RIDW and DS methods, the PBIAS scores decrease but remain positive, which indicates overestimation bias. The majority of pixel-point pairs under RIDW (range of −12.9% to 98.8%) and the DS method (range of −18.8% to 51.2%) overestimate the rainfall. However, in contrast to the MBC and DS methods, the average PBIAS after correction by RIDW increases, from +26.95% (pre-corrected) to +30.32% (post-corrected).
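The remark that the correction factor was calculated from gauge and satellite estimates across different elevation regions can be illustrated with a small sketch. The MBC formula is not given in this section, so the code assumes the common multiplicative form: one factor, the ratio of total gauge rainfall to total TRMM rainfall pooled over all pixel-point pairs, scales every TRMM value. Function names and data are hypothetical.

```python
def mean_bias_factor(gauge_series, trmm_series):
    """Single pooled factor: total gauge rainfall / total TRMM rainfall (assumed form)."""
    total_gauge = sum(sum(series) for series in gauge_series)
    total_trmm = sum(sum(series) for series in trmm_series)
    return total_gauge / total_trmm

def apply_mbc(trmm_series, factor):
    """Scale every TRMM value by the same pooled factor."""
    return [[factor * value for value in series] for series in trmm_series]

def pbias(obs, sim):
    """Percent bias; positive means overestimation."""
    return 100.0 * (sum(sim) - sum(obs)) / sum(obs)

# Two hypothetical pixel-point pairs (mm/day), e.g. one coastal and one inland.
gauge = [[0.0, 10.0, 5.0], [2.0, 8.0, 0.0]]
trmm = [[1.0, 12.0, 7.0], [4.0, 6.0, 1.25]]

factor = mean_bias_factor(gauge, trmm)
corrected = apply_mbc(trmm, factor)
for station_gauge, station_corrected in zip(gauge, corrected):
    print(round(pbias(station_gauge, station_corrected), 2))
```

With a pooled factor the overall bias is removed, but individual pairs can still end up over- or underestimating, which is consistent with some pixel-point pairs showing negative PBIAS after MBC.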
Before correction, half of the pixel-point pairs have NSE values below 0, and the NSE values decrease with distance from the coast. After correction with the MBC method, the NSE values generally increase towards 1, which indicates improvement. Some pixel-point pairs rise above 0, which indicates a better estimation. Most of the pixel-point pairs after adjustment have NSE values above 0, even though the average for the MBC method is still below 0 (NSE = −0.26). Furthermore, with the RIDW method, only one station has an NSE value below −1. The range of pre-corrected NSE scores is −0.21 to 0.09 and the range of post-corrected NSE scores is −1.68 to 0.32.
Prior to correction, most of the pixel-point pairs have a positive CC value. However, the average CC value is 0.26 (range of 0-0.5), which shows a slightly weak overall performance. The higher correlations are located on the east coast of Pahang and in the mountainous areas. Most of the post-corrected CC values increase towards the perfect score (CC = 1). The average CC value for the MBC method is 0.31 (range of 0.07-0.52), which is the highest average value, followed by the MBC method (CCav = 0.29, range of 0-0.59) and RIDW (CCav = 0.18, range of 0.01-0.36).
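For context on why RIDW behaves differently from a single pooled factor, the sketch below shows one common form of residual inverse distance weighting: gauge-minus-satellite residuals at the gauge locations are interpolated to a target pixel with inverse-distance weights and added to the TRMM value there. This is an assumed formulation, not necessarily the study's implementation; the function names, coordinates, values, and the distance power of 2 are illustrative.

```python
import math

def idw_residual(target, gauge_points, residuals, power=2.0):
    """Inverse-distance-weighted residual at `target` (x, y); power=2 is an assumption."""
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), residual in zip(gauge_points, residuals):
        distance = math.hypot(target[0] - x, target[1] - y)
        if distance == 0.0:
            return residual  # target coincides with a gauge
        weight = 1.0 / distance ** power
        weighted_sum += weight * residual
        weight_total += weight
    return weighted_sum / weight_total

def ridw_correct(trmm_value, target, gauge_points, gauge_values, trmm_at_gauges, power=2.0):
    """Add the interpolated (gauge - TRMM) residual to the TRMM value at `target`."""
    residuals = [g - s for g, s in zip(gauge_values, trmm_at_gauges)]
    corrected = trmm_value + idw_residual(target, gauge_points, residuals, power)
    return max(corrected, 0.0)  # rainfall cannot be negative

# Hypothetical gauges at (0, 0) and (2, 0) with residuals of -4 and +2 mm/day.
points = [(0.0, 0.0), (2.0, 0.0)]
gauge_values = [6.0, 12.0]
trmm_at_gauges = [10.0, 10.0]
print(ridw_correct(10.0, (1.0, 0.0), points, gauge_values, trmm_at_gauges))
```

Because the correction at each pixel depends on nearby residuals for that time step, large local residuals can push the corrected value in either direction, which may help explain the wider post-correction ranges reported for RIDW.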

Conclusions
Rain gauge data were used to evaluate the TRMM performance over the Pahang River basin, covering approximately 48,380 km². A variety of performance indicators are used to evaluate the rainfall measurements in different ways.
Based on the results, TRMM shows a moderate performance in terms of detection and volumetric correspondence with the rain gauges. This is reflected in the moderate performance of POD and the weak performance of TS at most of the pixel-point pairs. The poor performance of TRMM at the coast reflects the inability of TRMM instruments and algorithms to capture the complex rain formation system there. POFD and FAR demonstrate that the pixel-point pairs generally have high false alarm values (POFD > 0.5 and FAR > 0.5). The errors between the rain gauge and TRMM measurements at most of the pixel-point pairs are high. The high errors were located mostly in the coastal areas, which may be due to the coastal convergence zone. At most of the pixel-point locations on the east coast of Pahang, TRMM underestimates the rainfall because heavy rain can reduce the signal from which rain event information is obtained. The performance of all pixel-point pairs in Pahang is also characterised by a weak average correlation. The results of this study are consistent with those of Tan et al. [36], but are based on more rain gauges and a longer time period.
Based on the seasonal analysis, TRMM generally shows overestimation in both the wet and dry seasons. This may be caused by the rainfall rates during each season. Three methods are used in this study to adjust the biases. Overall, MBC performed better than DS and RIDW. The differences in the post-corrected results arise from the different ways in which each method adjusts the biases. Most of the detection and volumetric scores improved after adjustment.
Based on the detection scores and volumetric scores, TRMM shows a weak to moderate performance at the daily scale. Therefore, the use of the product for analysis and models in the fields of ecology, climate, and hydrology will be limited at this scale. The product may be more suitable at higher levels of temporal aggregation, such as monthly or annual scales. Furthermore, the bias correction results show that the errors between TRMM and rain gauge estimates were reduced by up to 18.38% and the correlation between TRMM and rain gauge estimates was increased by up to 16.28%. This improvement shows that the post-corrected TRMM product may be more reliable than the pre-corrected product for environmental, water resources, and ecological studies. Correction/merging methods other than MBC, DS, and RIDW can be used to improve the product in its representation of the region's rainfall characteristics. Simple bias correction or more complex data assimilation methods such as the Bayesian Combination [46] may be assessed and implemented to achieve this.
This study has also focussed on elevation as a controlling factor in the comparison of the performance at various locations in the study area. Generally, TRMM performance at all altitudes shows large errors between satellite and rain gauge estimates. However, compared to the high-altitude and low-altitude regions, the range of errors in mid-altitude areas is smaller. TRMM also tends to detect rainfall events more accurately in the high-altitude and mid-altitude regions than in the low-altitude regions. More advanced analysis, such as examining the performance by storm type (convective, stratiform, et cetera), may provide better insights into TRMM 3B42 capabilities and limitations in measuring tropical rainfall.