2.3.1. Scatter Index
The Scatter Index (SI) is a widely used microwave-based precipitation retrieval algorithm proposed by Grody [20]. Theoretically, the high-frequency microwave radiation emitted by the land surface is strongly scattered by precipitation particles (such as large raindrops and ice crystals) in the clouds, so the TBs at high frequency over rain areas are usually lower than those over no rain areas. In contrast, low-frequency microwave radiation is scattered much less by these particles, leading to similar TBs at low frequency over rain and no rain areas. Based on this theory, Grody [20] assumed that the TBs at high frequency over no rain areas could be simulated from the TBs at low frequency, so that the difference between the simulated and actual TBs at high frequency represents the scattering intensity of the precipitation particles in the cloud, which is correlated with their spatial distribution and physical attributes. Therefore, the SI is a reasonable indicator of the occurrence and intensity of precipitation. According to the research of Grody [20] and Ferraro [21], the SI can be derived from the vertically-polarized brightness temperatures at 19.0, 21.3 and 85.5 GHz (termed TB19V, TB21V and TB85V) using a two-step algorithm. First, the brightness temperature at 85 GHz under the no scattering condition at the location k is defined as E(k) and calculated by Equation (1).
The coefficients A, B, C and D in this equation are regressed using the brightness temperatures of the pixels under the no rain condition. Second, the SI is calculated using Equation (2).
After the SI was calculated, the rain areas were delineated with a threshold, and the precipitation of the test dataset was then calculated over the delineated rain areas according to Equation (3), where P(y) is the estimated precipitation at the raining location y, and the coefficients m and n are obtained by regressing the precipitation against the SI for the samples under raining conditions in the training datasets.
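The two-step SI calculation and the rain-rate regression described above can be sketched as follows. This is a minimal illustration, not the implementation used in this study. The functional forms are assumptions: a quadratic form E = A + B·TB19V + C·TB21V + D·TB21V² for Equation (1), SI = E − TB85V for Equation (2), and a power law P = m·SI^n for Equation (3). All function names and the threshold argument are hypothetical.

```python
import numpy as np

def fit_no_scatter_model(tb19v, tb21v, tb85v):
    """Regress A, B, C, D on no-rain pixels (assumed quadratic form)."""
    design = np.column_stack(
        [np.ones_like(tb19v), tb19v, tb21v, tb21v ** 2]
    )
    coeffs, *_ = np.linalg.lstsq(design, tb85v, rcond=None)
    return coeffs  # A, B, C, D

def scatter_index(coeffs, tb19v, tb21v, tb85v):
    """SI = simulated no-scattering TB at 85 GHz minus the observed TB85V."""
    a, b, c, d = coeffs
    expected = a + b * tb19v + c * tb21v + d * tb21v ** 2
    return expected - tb85v

def delineate_rain(si, threshold):
    """Flag pixels whose SI exceeds a user-supplied threshold (in K)."""
    return si > threshold

def fit_rain_rate(si, rain):
    """Fit P = m * SI**n in log space; needs SI > 0 and P > 0 (assumed form)."""
    n, log_m = np.polyfit(np.log(si), np.log(rain), 1)
    return np.exp(log_m), n
```

The coefficients would be fitted once on the no-rain training pixels and then reused to compute the SI for every pixel of the test dataset.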
2.3.2. Dynamic Cluster K-Means Method
To delineate rain areas over the Tibetan Plateau, where the rain gauge network is sparse, Yao, Li, Zhu, Zhao and Chen [22] classified the TBs measured by TMI into rain/no rain areas using an unsupervised classification method, the dynamic cluster K-means (termed Kmean hereafter). In this study, this method served as a baseline for the new rain area delineation method proposed here. In the Kmean method, several categories are first defined, and the center of each category is calculated by averaging the values of its samples. For each category, the sum of the squared distances between the category center and the samples assigned to it is called the classification error. The Kmean method attempts to make the classification error as small as possible by assigning each sample to the nearest category: the distances between a sample i and all category centers are calculated, the sample i is assigned to the category with the smallest distance, and the category centers are then re-calculated. This process is iterated until the classification error no longer decreases. The advantage of this method is that ground observations are not necessary. Following Yao, Li, Zhu, Zhao and Chen [22], the TBs at 19.4, 21.3, 37.0 and 85.5 GHz were used as the inputs of the Kmean method to delineate rain/no rain areas. In this study, the Kmean method was run using the “kmeans” function provided by the MATLAB 2009a software.
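The iteration described above can be sketched in a few lines. This is a plain NumPy illustration of the assign-and-update loop, not the MATLAB “kmeans” routine used in this study. The function name, the random initialization and the stopping rule are assumptions made for the example.

```python
import numpy as np

def kmeans(samples, k, n_iter=100, seed=0):
    """Cluster rows of `samples` into k categories by alternating
    nearest-center assignment and center re-calculation."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(samples), size=k, replace=False)
    centers = samples[start].astype(float)
    for _ in range(n_iter):
        # Squared distance from every sample to every center, shape (n, k).
        d2 = ((samples[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Re-calculate each center; keep the old one if a category is empty.
        new_centers = np.array([
            samples[labels == j].mean(axis=0) if np.any(labels == j)
            else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment and classification error (sum of squared distances).
    d2 = ((samples[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    error = d2[np.arange(len(samples)), labels].sum()
    return labels, centers, error
```

For rain/no rain delineation, each row of `samples` would hold the TBs of one pixel at the four input frequencies and `k` would be 2. Deciding which of the two resulting categories corresponds to rain is a separate step that this sketch does not cover.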
2.3.4. Rain Area Delineation Method Based on PNN
With the ground observations of the rain gauge network over the Yangtze River Basin, the PNN-based rain area delineation method was established according to the following steps:
(1) The multi-channel TBs stored in TRMM 1B11 datasets were extracted and pre-processed. First, the TBs at the nine channels (the vertical and horizontal polarizations at 10.7, 19.4, 37.0 and 85.5 GHz, and the vertical polarization at 21.3 GHz), as well as the geo-location of each pixel, were extracted using the “hdftool” of the MATLAB 2009a software, generating a dataset composed of the latitude, longitude and the TBs at the nine channels of each pixel. Then, this dataset was resampled to a spatial resolution of 0.1° × 0.1° using the nearest neighbor method provided by the ArcGIS software.
Figure 2. The structure of the probabilistic neural network.
(2) Three parameters, which have the potential to distinguish rain areas from the underlying surface, were derived from the resampled TBs. The first parameter is the polarization-corrected temperature (PCT) at 85 GHz (termed PCT85). PCT was proposed by Grody [19] to delineate rain areas based on the scattering effect of raindrops on high-frequency microwave radiation, as well as the difference in microwave polarization between the land surface and rain areas. Because scattering depresses the TBs at high frequency, low values of PCT usually indicate intense precipitation in the clouds and a high probability of rain. Kidd [33] investigated the relationship between the occurrence of rain and the PCT at different frequencies, and the results indicated that the PCT at 85 GHz was sensitive to light precipitation and more suitable for the delineation of rain areas than other channels. PCT85 can be calculated using Equation (4):

$$\mathrm{PCT}_{85} = (1 + \mu)\,TB_{85V} - \mu\,TB_{85H} \qquad (4)$$

In this equation, TB85V and TB85H are the vertically and horizontally polarized brightness temperatures at 85 GHz, and µ is a constant value (0.818), according to Spencer et al. [34].
The second parameter is the difference between the TBs at high frequency and low frequency (termed TD). Ferraro, Smith, Berg and Huffman [25] pointed out that the TBs of the land surface (except snow and desert) at high frequency are higher than or similar to those at low frequency; therefore, the TD tends to be positive over no rain areas. However, because of the strong scattering effect of raindrops on high-frequency microwave radiation, the TBs at high frequency decrease substantially over rain areas, leading to a negative TD. Therefore, the TD is a promising indicator of rain areas. In this study, the vertically polarized TBs at 37 GHz (high frequency) and 19 GHz (low frequency) were used to calculate TD, following Yao, Li, Zhu, Zhao and Chen [22].
The third parameter is the sum of the TBs at 37 and 19 GHz (termed TS). Because the emissivity of water is much lower than that of the land surface, water bodies usually show low TBs, similar to rain areas. As addressed by Yao, Li, Zhu, Zhao and Chen [22], TS is an effective indicator for identifying water bodies, which usually show a much lower TS than other land surfaces and rain areas.
(3) The three parameters calculated in Step (2) were collocated with the ground observations measured by the rain gauges. For the pixel i (with a spatial resolution of 0.1° × 0.1°), the rain gauges located inside pixel i were first selected, and then, the mean value of the selected rain gauge observations was calculated as the true precipitation for this pixel. Pixels containing at least one rain gauge were selected to establish and validate the PNN-based rain area delineation method.
(4) Based on the ground observations, a rain flag was generated for each pixel to indicate whether the pixel was under raining conditions. Pixels with ground observations above 0 mm/h were deemed to be under raining conditions, and their rain flag was set to 2, while pixels with ground observations equal to 0 mm/h were deemed to be under no rain conditions, and their rain flag was set to 1. In order to independently validate the new rain area delineation method, 30% of the pixels were randomly selected as the training datasets, while the rest of the pixels were used to assess the performance of the established rain area delineation method.
(5) The PNN was trained with the three parameters (PCT85, TD and TS) of the training datasets as inputs and the corresponding rain flag as the output. In this study, the PNN model was designed and trained with the “newpnn” function provided by MATLAB 2009a. This function establishes a PNN classification model from a matrix of input vectors and a matrix of target class vectors. It includes a parameter that controls the spread of the radial basis functions in the PNN. If this parameter is near zero, the network acts as a nearest neighbor classifier. As this parameter becomes larger, the network takes several nearby input vectors into account and, thus, takes more time to train. In this study, the default value of this parameter provided by the MATLAB 2009a software (0.1) was used. This step establishes a PNN model that determines whether a sample is under raining conditions.
(6) The PNN-based rain area delineation method was validated using the test datasets generated in Step (4). In this step, the PNN generates a rain flag for each pixel (1 for the no rain pixels and 2 for the rain pixels). Subsequently, the rain flags determined by PNN were compared with the rain flags determined by the ground observations to assess the performance of the PNN method.
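Steps (2), (4) and (5) can be illustrated with a short sketch. It is a generic Parzen-window PNN with Gaussian kernels written in NumPy, not a reproduction of the MATLAB “newpnn” function, whose internal scaling of the spread may differ. The function names are hypothetical. The PCT85 form (1 + µ)·TB85V − µ·TB85H and the use of vertically polarized TBs for TS are assumptions.

```python
import numpy as np

MU = 0.818  # constant from Spencer et al. [34]

def rain_features(tb19v, tb37v, tb85v, tb85h):
    """Return columns PCT85, TD and TS for each pixel (assumed forms)."""
    pct85 = (1.0 + MU) * tb85v - MU * tb85h
    td = tb37v - tb19v   # high frequency minus low frequency
    ts = tb37v + tb19v
    return np.column_stack([pct85, td, ts])

def rain_flags(gauge_rain):
    """2 for gauge rain above 0 mm/h, 1 for no rain, as in Step (4)."""
    return np.where(gauge_rain > 0, 2, 1)

def split_train_test(n, train_fraction=0.3, seed=0):
    """Randomly pick a training subset; the rest is the test subset."""
    order = np.random.default_rng(seed).permutation(n)
    n_train = int(round(train_fraction * n))
    return order[:n_train], order[n_train:]

def pnn_predict(train_x, train_flag, test_x, spread=0.1):
    """Assign each test sample to the class with the largest mean
    Gaussian kernel response over that class's training samples."""
    classes = np.unique(train_flag)
    d2 = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    log_k = -d2 / (2.0 * spread ** 2)
    scores = np.empty((len(test_x), len(classes)))
    for j, c in enumerate(classes):
        lk = log_k[:, train_flag == c]
        peak = lk.max(axis=1, keepdims=True)
        # Log of the mean kernel value, computed stably to avoid underflow.
        scores[:, j] = peak[:, 0] + np.log(np.exp(lk - peak).mean(axis=1))
    return classes[scores.argmax(axis=1)]
```

In this sketch the spread is expressed in the same units as the inputs, so a small value such as 0.1 makes the classifier behave almost like a nearest neighbor rule, which matches the behaviour described in Step (5).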
2.3.5. Validation
In this study, six categorical statistics, including the probability of detection (POD), false alarm ratio (FAR), critical success index (CSI), equitable threat score (ETS), Hanssen and Kuiper discriminant (HK) and Heidke skill score (HSS), were introduced to evaluate the performance of the rain area delineation methods [35]. The POD and FAR were used to measure the misdetected and false alarm precipitation, respectively, of the rain areas delineated by satellite observations. A higher POD and a lower FAR indicate fewer misclassified rain/no rain pixels. CSI, ETS, HK and HSS were used to evaluate the strength of the correspondence between the rain areas measured by rain gauges and those delineated by satellite observations. More accurate rain area delineation produces higher values of CSI, ETS, HK and HSS. These categorical statistics are calculated by Equations (5)–(10):
In these equations, h and z represent the numbers of correctly-identified rain pixels and no rain pixels, respectively; f is the number of false alarm pixels, which are under no rain conditions according to the rain gauges, but are misclassified as rain by satellite observations; m is the number of misdetected pixels, which are rain pixels according to the rain gauges, but are misclassified as no rain by satellite observations. The meanings of these variables are presented in Table 1.
Table 1. The contingency table of rain areas detected by rain gauges and delineated by rain area delineation methods.
| Ground observations | Delineated as rain by the models | Delineated as no rain by the models |
|---|---|---|
| Rain | h | m |
| No rain | f | z |
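The six categorical statistics can be computed from the four counts in Table 1 as sketched below. The formulas are the commonly used definitions of these scores and are assumed here to correspond to Equations (5)–(10). The function names are hypothetical, and the flag convention (2 for rain, 1 for no rain) follows Step (4).

```python
def contingency(obs_flags, est_flags):
    """Count h, m, f, z from gauge-based and model-based rain flags."""
    pairs = list(zip(obs_flags, est_flags))
    h = sum(1 for o, e in pairs if o == 2 and e == 2)  # hits
    m = sum(1 for o, e in pairs if o == 2 and e == 1)  # misses
    f = sum(1 for o, e in pairs if o == 1 and e == 2)  # false alarms
    z = sum(1 for o, e in pairs if o == 1 and e == 1)  # correct no rain
    return h, m, f, z

def categorical_scores(h, m, f, z):
    """Standard definitions of POD, FAR, CSI, ETS, HK and HSS."""
    n = h + m + f + z
    h_random = (h + m) * (h + f) / n  # hits expected by chance
    return {
        "POD": h / (h + m),
        "FAR": f / (h + f),
        "CSI": h / (h + m + f),
        "ETS": (h - h_random) / (h + m + f - h_random),
        "HK": h / (h + m) - f / (f + z),
        "HSS": 2 * (h * z - f * m)
               / ((h + m) * (m + z) + (h + f) * (f + z)),
    }
```

The sketch does not guard against empty categories (for example, no observed rain pixels), which would cause a division by zero.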