A Machine Learning Approach for Improving Near-Real-Time Satellite-Based Rainfall Estimates by Integrating Soil Moisture

Abstract: Near-real-time (NRT) satellite-based rainfall estimates (SREs) are a viable option for flood/drought monitoring. However, SREs are often associated with complex and nonlinear errors. One way to enhance the quality of SREs is to use soil moisture information, and a few studies have indicated that this can improve SRE quality. Nowadays, satellite-based soil moisture products are becoming available at the desired spatial and temporal resolutions on an NRT basis. Hence, this study proposes an integrated approach to improve NRT SRE accuracy by combining it with NRT soil moisture through a nonlinear support vector machine-based regression (SVR) model. To test this novel approach, the Ashti catchment, a sub-basin of the Godavari river basin, India, is chosen. The Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA)-based NRT SRE 3B42RT and Advanced Scatterometer-derived NRT soil moisture are considered in the present study. The performance of 3B42RT and the corrected product is assessed using different statistical measures, namely the correlation coefficient (CC), bias, and root mean square error (RMSE), for the monsoon seasons of 2012–2015. A detailed spatial analysis of these measures and their variability across different rainfall intensity classes is also presented. Overall, the results reveal a significant improvement in the corrected product compared to 3B42RT (except in CC) across the catchment. In particular, the corrected product shows the highest improvement (except in CC) for the light and moderate rainfall classes. On the other hand, the corrected product shows limited performance for the heavy rainfall class. These results demonstrate that the proposed approach has the potential to enhance the quality of NRT SRE through the use of NRT satellite-based soil moisture estimates.


Introduction
Accurate measurement of rainfall in near-real-time (NRT) is a primary requirement for forecasting and monitoring of floods [1,2]. Ground-based rain gauges provide reliable point rainfall values [3]. However, these ground-based rainfall values are often not available in NRT, especially in developing nations of Asia and Africa. Even the available ones are scarcely distributed, which makes the

Remote Sens. 2019, 11, 2221

based NRT soil moisture. To the authors' knowledge, this is the first study in which NRT SRE is integrated with NRT soil moisture in a machine learning framework to improve NRT SRE. This article is organized into four sections. Following this introduction, the materials and methods are given in Section 2. The results and discussion of the analyses carried out are provided in Section 3. Finally, the summary and conclusions of the study are presented in Section 4.

Study Area
The Ashti catchment, a sub-basin of the Godavari River basin, India, is the test site for this study. The catchment lies between 78°0′ and 81°0′ East longitude and 19°30′ and 22°50′ North latitude, covering an area of approximately 50,000 km². The elevation of the catchment varies from 144 to 1036 m above sea level [53]. Agricultural land and forest are the major land uses over the catchment [54]. Figure 1 shows the location of the catchment in India along with the observed monsoonal average rainfall during the study period. There are 86 rainfall grids of 0.25° × 0.25° spatial resolution enclosing the catchment. The entire study area is in the rainfed region and falls under the tropical climate zone. Most of the annual rainfall over the Ashti catchment occurs during the southwest monsoon period between mid-June and mid-October [55]. Therefore, only the monsoon season is considered in the present study. The observed monsoonal average rainfall during the study period varies from 1100 to 2100 mm across the rainfall grids over the Ashti catchment (Figure 1b). Significant spatial variability in rainfall, complex terrain, and high vulnerability to floods make the Ashti catchment a suitable test site for the present study.

Datasets
The datasets comprise rainfall and soil moisture estimates. The monsoon seasons of 2012–2015 form the study period. This period is constrained by the availability of (i) ground-based rainfall observations (up to 2015) and (ii) a consistent data record for the Advanced Scatterometer (ASCAT)-based NRT soil moisture product (starting in August 2011). Each dataset is described in the following subsections.

Observed Rainfall Data
Gridded daily observed rainfall data at a high spatiotemporal resolution (0.25° × 0.25°, daily) were obtained from the India Meteorological Department (IMD). This gridded dataset for India was prepared by Pai et al. [56] from rainfall measurements at comparatively well-distributed rain gauge stations over the Indian land region, after extensive quality control. The IMD gridded rainfall data are an officially certified commercial product for use in hydrometeorological applications across the Indian region. Many recent studies [52,57–59] have used IMD gridded rainfall as the reference data to evaluate SREs.

Satellite-Based Rainfall Data
The TMPA-based NRT SRE 3B42RT Version 7 (hereafter referred to as 3B42RT), at a high spatiotemporal resolution (0.25° × 0.25°, 3 h), is considered in the present study. 3B42RT relies on microwave observations from low-orbiting satellites. The spatial and temporal gaps in the microwave observations are filled with infrared (IR) data. 3B42RT has a latency of 6–9 h, making it suitable for NRT applications such as monitoring of floods and droughts. Furthermore, 3B42RT performs relatively well compared with other contemporary NRT SREs [60–63]. Also, 3B42RT is the benchmark product for the current GPM Mission [27,64]. 3B42RT data can be freely downloaded through the simplified data search tool "Mirador" (NASA Goddard Space Flight Center, Greenbelt, MD, USA), developed at the NASA Goddard Earth Sciences Data and Information Services Center (GES DISC).

Satellite-Based Soil Moisture Data
The ASCAT-derived [65,66] satellite-based soil moisture products H101 (Metop-A (European Space Agency, Paris, France)) and H16 (Metop-B (European Space Agency, Paris, France)) are considered because of their NRT availability (latency of 130 min after sensing) and their good spatial and temporal resolutions [67]. Moreover, several studies have used ASCAT-based soil moisture data [2,68,69] and obtained good performance in streamflow prediction and rainfall estimation. These products are distributed by the EUMETSAT Satellite Application Facility on Support to Operational Hydrology and Water Management (H SAF). The ASCAT-based product gives the degree of saturation of the topsoil layer (up to 5 cm depth), ranging between 0 and 100%. These estimates are obtained from backscatter coefficients measured by the Metop-A and Metop-B satellites using the change detection method developed at the Research Group Remote Sensing, Department for Geodesy and Geoinformation, Vienna University of Technology [70]. The native spatial sampling of the soil moisture product is 12.5 km × 12.5 km, and its temporal resolution is approximately once per day across India. The ASCAT-based NRT soil moisture data can be accessed freely through EUMETSAT's website [71]. Note that the ASCAT-based soil moisture retrievals are associated with larger errors and limitations in orographic regions, over frozen soils, and under dense vegetation [72,73].

SVR Model
In the present study, the SVR model is chosen for its capability to handle nonlinearity and complexity [74–80]. Support vector machine-based algorithms are supervised learning techniques originally developed for classification problems [81] and later extended to regression problems [82–84]. In recent times, SVR models have gained popularity due to their good generalization capability, as they seek to minimize the upper bound of the generalization error rather than the training error [85]. SVR models have been used extensively in hydrological problems [86–89]. The main advantage of SVR models over other methods (e.g., artificial neural networks, ANNs) is that they can overcome major limitations such as trapping in local minima and network overfitting [90]. Additionally, several studies that compared the relative performance of SVR and ANN [91–94] found SVR to be better suited for hydrological applications. Consequently, SVR is chosen for the proposed rainfall correction method.
The SVR model provides a solution to a regression problem with multiple inputs {x_i} and a target output y_i, where i = 1, 2, 3, ..., n (n is the number of observations of the inputs and output). In its standard form, the SVR equation can be represented as

$$f(x) = w^{T}\varphi(x) + b \qquad (1)$$

where the coefficients w and b are the weight vector and the offset, respectively, and φ(x) denotes the transformation function that maps the original input vectors into a high-dimensional feature space. w and b are estimated by solving the following optimization problem:

$$\min_{w,\,b,\,\xi,\,\xi^{*}} \ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\left(\xi_{i} + \xi_{i}^{*}\right) \quad \text{subject to} \quad \begin{cases} y_{i} - w^{T}\varphi(x_{i}) - b \le \varepsilon + \xi_{i} \\ w^{T}\varphi(x_{i}) + b - y_{i} \le \varepsilon + \xi_{i}^{*} \\ \xi_{i},\ \xi_{i}^{*} \ge 0, \quad i = 1, \dots, n \end{cases} \qquad (2)$$

where C is a user-defined penalty constant that sets the trade-off between the dispersion of the weights and the training error, and ξ_i and ξ_i^* are nonnegative slack variables that quantify deviations beyond the error tolerance ε [95]. The regression problem in Equation (1) is difficult to solve directly because the dimension of the feature space is high [96]. Hence, the problem is solved in dual space using the Lagrange multipliers α_i and α_i^*. Finally, the regression model becomes

$$f(x) = \sum_{i=1}^{n}\left(\alpha_{i} - \alpha_{i}^{*}\right)K(x_{i}, x) + b \qquad (3)$$

where K(x_i, x_j) = φ(x_i)^T φ(x_j) is a kernel function that describes the inner product in the D-dimensional feature space, with x_i, x_j ∈ x. A detailed description of SVRs is available in the literature [97,98]. The entire SVR analysis and calculation in the present study are performed using the LIBSVM software developed by Chang and Lin [99].
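To make the dual-form prediction concrete, the following minimal sketch evaluates f(x) = Σ (α_i − α_i^*) K(x_i, x) + b for a radial basis kernel. It is an illustration only and not the LIBSVM implementation used in the study: the function names `rbf_kernel` and `svr_predict`, and the convention of passing the differences (α_i − α_i^*) as a single list `dual_coefs`, are assumptions made for this example.

```python
import math
from typing import List, Sequence


def rbf_kernel(xi: Sequence[float], xj: Sequence[float], gamma: float) -> float:
    """Radial basis kernel exp(-gamma * ||xi - xj||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * sq_dist)


def svr_predict(x: Sequence[float],
                support_vectors: List[Sequence[float]],
                dual_coefs: Sequence[float],
                b: float,
                gamma: float) -> float:
    """Evaluate the dual-form SVR model at a single input x.

    dual_coefs[i] holds (alpha_i - alpha_i^*) for support vector i
    (a convention assumed for this sketch).
    """
    kernel_sum = sum(c * rbf_kernel(sv, x, gamma)
                     for sv, c in zip(support_vectors, dual_coefs))
    return kernel_sum + b
```

Only observations with a nonzero (α_i − α_i^*) contribute to the sum, which is why the prediction cost scales with the number of support vectors rather than with the full training set.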

Construction of the SVR Model
The construction of the SVR model involves four main steps:
1. Preprocessing of satellite-based rainfall, soil moisture, and observed rainfall;
2. Correlation analysis between satellite-based rainfall and soil moisture with the observed rainfall dataset;
3. Selection of the kernel function for SVR;
4. Estimation of the optimum values of the hyperparameters associated with the SVR model.
Each of these steps is described in the following subsections.

Preprocessing of Dataset
3B42RT rainfall data are available at 3-h and daily temporal resolutions, and the latter is accumulated at 0000 UTC. However, IMD provides daily observed rainfall accumulated at 0300 UTC only. Hence, for consistency in the analysis, daily 3B42RT rainfall is computed from the 3-h data accumulated up to 0300 UTC. The ASCAT-based NRT soil moisture estimates, with a native resolution of 12.5 km, are resampled to 0.25° to match the spatial resolution of the IMD gridded rainfall. As IMD accumulates daily rainfall at 0300 UTC, only the nearby ASCAT morning-pass data are considered in this study. However, the morning-pass data contain some temporal gaps, which are filled primarily with the ASCAT evening-pass data. On days when neither morning- nor evening-pass ASCAT data are available (~15% on average), the values are filled with the closest available previous-day data. Moreover, for both 3B42RT and IMD data, a day with rainfall less than 0.5 mm is considered a no-rainfall day, which is consistent with previous studies [58,100].
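The gap-filling order for soil moisture (morning pass, then evening pass, then the closest previous day) and the 0.5 mm no-rainfall threshold can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `fill_soil_moisture` and `is_no_rain_day`, and the use of `None` to mark a missing retrieval, are hypothetical choices for the example.

```python
from typing import List, Optional

NO_RAIN_THRESHOLD_MM = 0.5  # daily totals below this count as a no-rainfall day


def fill_soil_moisture(morning: List[Optional[float]],
                       evening: List[Optional[float]]) -> List[Optional[float]]:
    """Build a gap-filled daily soil moisture series for one grid cell.

    Priority for each day: morning pass, else evening pass, else the most
    recent earlier value. None marks a missing retrieval (an assumption
    of this sketch).
    """
    filled: List[Optional[float]] = []
    last_valid: Optional[float] = None
    for am, pm in zip(morning, evening):
        value = am if am is not None else pm
        if value is None:
            value = last_valid  # remains None if no earlier day is available
        else:
            last_valid = value
        filled.append(value)
    return filled


def is_no_rain_day(daily_rain_mm: float) -> bool:
    """Return True when the daily total is below the 0.5 mm threshold."""
    return daily_rain_mm < NO_RAIN_THRESHOLD_MM
```

A leading gap (no earlier valid day) is left as `None` here rather than guessed, so the caller can decide how to treat the first days of a season.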
In the present study, there are two input datasets (3B42RT and ASCAT-based NRT soil moisture) and one target/output dataset (IMD gridded rainfall). All the datasets are scaled between 0 and 1 before setting up the SVR model in order to prevent the model from being dominated by variables with large values. Finally, the model outputs are back-transformed to their original scale and the performance assessment is carried out.
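The scaling to the range 0 to 1 and the later back-transformation can be sketched as a min-max transform. This is an assumption about the exact scaling used, since the text does not name the formula, and the function names below are hypothetical. The sketch also assumes that the minimum and maximum are taken from the data used to set up the model and then reused when back-transforming the outputs.

```python
from typing import List, Sequence, Tuple


def fit_min_max(values: Sequence[float]) -> Tuple[float, float]:
    """Return the (minimum, maximum) used to define the scaling."""
    return min(values), max(values)


def scale(values: Sequence[float], lo: float, hi: float) -> List[float]:
    """Map values linearly so that lo -> 0 and hi -> 1."""
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]  # constant series: no spread to scale
    return [(v - lo) / span for v in values]


def inverse_scale(scaled: Sequence[float], lo: float, hi: float) -> List[float]:
    """Back-transform scaled values to the original units."""
    return [s * (hi - lo) + lo for s in scaled]
```

For example, rainfall values of 0, 5, and 20 mm/day map to 0, 0.25, and 1, and `inverse_scale` recovers the original values in mm/day.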

Correlation Analysis of Datasets
Correlation analysis is necessary to check the significance of the input variables for improving the target variable and has been performed in several previous studies [41–43,101–103]. Correlation measures the degree of agreement between two variables and is typically quantified by the correlation coefficient (CC), which ranges from −1 to +1. CC values of +1, −1, and 0 represent perfect direct, perfect inverse, and no correlation, respectively. In this study, correlation analysis is performed between the inputs (3B42RT and ASCAT-based NRT soil moisture) and the target variable (IMD gridded rainfall). The CC between 3B42RT and IMD gridded rainfall ranges from 0.51 to 0.82, and the CC between soil moisture and IMD gridded rainfall ranges from 0.21 to 0.43, both statistically significant at the 95% confidence level. As expected, 3B42RT rainfall is better correlated with the observed rainfall than ASCAT soil moisture is. Moreover, to identify any multicollinearity between the inputs (3B42RT and ASCAT-based NRT soil moisture), the variance inflation factor (VIF) [104] is computed for all grids. The VIF for every grid is close to one, which indicates no multicollinearity between the inputs, as multicollinearity is indicated by VIF values greater than 5 [105,106].
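As an illustration of the two checks described above, the sketch below computes the Pearson correlation coefficient and, for the special case of exactly two inputs, the VIF as 1 / (1 − r²), where r is the correlation between the two inputs. The function names are hypothetical, and the two-input shortcut is an assumption that matches this study's setup (3B42RT and soil moisture) but does not generalize to more predictors.

```python
import math
from typing import Sequence


def pearson_cc(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_inputs(x1: Sequence[float], x2: Sequence[float]) -> float:
    """VIF when the model has exactly two inputs: 1 / (1 - r^2)."""
    r = pearson_cc(x1, x2)
    denom = 1.0 - r * r
    return math.inf if denom <= 0.0 else 1.0 / denom
```

A VIF near one means the two inputs carry largely independent information, whereas values above the threshold of 5 cited in the text would indicate that one input is nearly a linear function of the other.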

Selection of Kernel Function for SVR
Selection of an appropriate kernel function is essential for reliable performance of the SVR model. Several kernel functions, such as linear, sigmoid, polynomial, and radial basis functions, are available for SVR. Various hydrometeorological studies show favorable performance with the radial basis function (RBF) kernel [107–109]. In addition, the RBF can effectively handle the nonlinear relation between inputs and output. The RBF is also computationally simpler and more efficient than the polynomial kernel function, as the latter requires more parameters [110]. Therefore, the RBF is used in the present study. In its standard form, the RBF is given by

$$K(x_{i}, x_{j}) = \exp\left(-\gamma \lVert x_{i} - x_{j} \rVert^{2}\right) \qquad (4)$$

where x_i and x_j are the ith and jth input vectors, respectively, and γ is a kernel width parameter.
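The effect of the kernel width parameter can be seen by evaluating the standard RBF form, K = exp(−γ‖x_i − x_j‖²), for a few input pairs. The sketch below is illustrative only; the function name `rbf` and the sample inputs (scaled rainfall and scaled soil moisture) are assumptions for the example.

```python
import math
from typing import Sequence


def rbf(xi: Sequence[float], xj: Sequence[float], gamma: float) -> float:
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * sq_dist)


# Two hypothetical samples: (scaled rainfall, scaled soil moisture)
sample_a = (0.20, 0.50)
sample_b = (0.60, 0.80)

# A larger gamma makes the similarity decay faster with distance
similarity_wide = rbf(sample_a, sample_b, gamma=0.5)
similarity_narrow = rbf(sample_a, sample_b, gamma=5.0)
```

Identical inputs always give a kernel value of 1, and the value approaches 0 as the inputs move apart, more quickly for larger γ.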

Estimation of the Optimum Value of the Hyperparameters for SVR
The performance of SVR depends on the hyperparameters C, ε, and γ [111,112]. Hence, obtaining the optimum values of these hyperparameters is essential for an efficient SVR model setup. However, there are no predefined values for the hyperparameters associated with SVR [113]. Therefore, the optimum values are obtained using the grid search optimization technique over their valid ranges [96,98,114–116]. Five-fold cross validation is used to avoid or minimize the risk of overfitting during the optimization process [91,117,118]. Minimum root mean square error (RMSE) is the selection criterion used to optimize C, ε, and γ. Once the optimum parameter values are obtained for each grid point (provided in Figure S1), the output is computed using these optimal parameters for the training and testing periods.
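The grid search with five-fold cross validation and minimum RMSE as the selection criterion can be sketched generically as below. This is a minimal sketch under stated assumptions, not the study's implementation: the names `k_fold_indices`, `grid_search_cv`, and the `fit_predict` callback are hypothetical, the folds are contiguous and unshuffled (the text does not say how folds were formed), and the actual SVR training (done with LIBSVM in the study) is left to the callback.

```python
import itertools
import math
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Params = Dict[str, float]
# fit_predict(params, X_train, y_train, X_val) -> predictions for X_val
FitPredict = Callable[[Params, List[List[float]], List[float], List[List[float]]],
                      List[float]]


def rmse(obs: Sequence[float], est: Sequence[float]) -> float:
    """Root mean square error between observations and estimates."""
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(obs, est)) / len(obs))


def k_fold_indices(n: int, k: int) -> List[List[int]]:
    """Split range(n) into k contiguous folds of near-equal size."""
    if n < k:
        raise ValueError("need at least k samples")
    folds, start = [], 0
    for f in range(k):
        size = n // k + (1 if f < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def grid_search_cv(fit_predict: FitPredict,
                   X: List[List[float]],
                   y: List[float],
                   grid: Dict[str, Sequence[float]],
                   k: int = 5) -> Tuple[Optional[Params], float]:
    """Return the parameter set with the lowest mean validation RMSE."""
    names = sorted(grid)
    folds = k_fold_indices(len(y), k)
    best_params, best_score = None, math.inf
    for combo in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        scores = []
        for val_idx in folds:
            held_out = set(val_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            preds = fit_predict(params,
                                [X[i] for i in train_idx],
                                [y[i] for i in train_idx],
                                [X[i] for i in val_idx])
            scores.append(rmse([y[i] for i in val_idx], preds))
        mean_score = sum(scores) / k
        if mean_score < best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score
```

In this setting the grid would hold candidate values for C, ε, and γ, and the search would be repeated independently for each of the 86 grid points, which is consistent with the per-grid optimum values reported in Figure S1.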

Performance Metrics
CC, bias, and root mean square error (RMSE) have been selected to assess the performance of 3B42RT and the corrected product (obtained by integrating 3B42RT and ASCAT-based NRT soil moisture in the SVR model). Relevant contemporary studies [15,17,24,33,119–121] have also used these quantitative statistical measures to assess the performance of satellite-based products. Table 1 shows the possible ranges of these performance measures along with their optimal values.
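The three measures can be sketched as follows. The definitions are the commonly used ones and are an assumption here, since the formulas of Table 1 are not reproduced in this text: bias is taken as the mean of (estimate − observation) in mm/day, so that positive values indicate overestimation, which matches how bias is reported in the results. The function names are hypothetical.

```python
import math
from typing import Sequence


def bias(obs: Sequence[float], est: Sequence[float]) -> float:
    """Mean error (estimate minus observation); positive = overestimation."""
    return sum(e - o for o, e in zip(obs, est)) / len(obs)


def rmse(obs: Sequence[float], est: Sequence[float]) -> float:
    """Root mean square error; 0 is the optimal value."""
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(obs, est)) / len(obs))


def correlation_coefficient(obs: Sequence[float], est: Sequence[float]) -> float:
    """Pearson CC; returns NaN when either series has zero variance."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_e = sum(est) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(obs, est))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_e = sum((e - mean_e) ** 2 for e in est)
    if var_o == 0 or var_e == 0:
        return float("nan")  # CC is undefined for a constant series
    return cov / math.sqrt(var_o * var_e)
```

A constant offset between the two series changes bias and RMSE but leaves CC at 1, which is one reason a correction can improve bias and RMSE substantially while CC barely moves.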

Results and Discussion
In this section, 3B42RT and the corrected product are evaluated and compared. The training and testing periods considered for this analysis cover the monsoon seasons of 2012–2014 and 2015, respectively. Section 3.1 presents the results in terms of box plots and the spatial distribution of the adopted performance measures across the catchment for both the training and testing periods. The rainfall intensity-based performance of the corrected product is investigated in Section 3.2. This is crucial for assessing the performance of rainfall products, as the errors may be heterogeneous across rainfall intensities [62]. In Section 3.3, time-series plots of IMD gridded rainfall, 3B42RT, and the corrected product for the testing period are shown to visualize the performance of 3B42RT and the corrected product on a daily scale.

Performance Assessment Across the Ashti Catchment
All the adopted statistical measures across the study area are shown in Figures 2 and 3. The box plots in Figure 2 present the results for the training and testing periods in terms of CC (Figure 2a), bias (Figure 2b), and RMSE (Figure 2c). The spatial distribution of these performance measures is presented in Figure 3. Figures 2 and 3 show a substantial improvement (mainly in terms of bias and RMSE) in the corrected product compared with 3B42RT during the training and testing periods. However, the improvement in the median CC of the corrected product over 3B42RT is very limited (Figure 2a). The spatial distribution of CC also indicates only a small improvement in the corrected product over the catchment during the training and testing periods (Figure 3a–d). This limited improvement in CC is consistent with the study carried out by Crow et al. [46] and might be due to no or limited improvement in the residual (random) error of the corrected product compared to 3B42RT. On the other hand, bias and RMSE are improved, possibly due to improvements in the systematic error of the corrected product as compared to 3B42RT. From Figure 2b, the median bias of 3B42RT is 3.57 mm/day (5.21 mm/day) during the training (testing) period, whereas in the corrected product it is reduced in magnitude to −1.21 mm/day (0.17 mm/day). Similarly, the spatial plot of bias (Figure 3e–h) indicates a notable reduction for the corrected product compared to 3B42RT. Hence, it can be concluded that the bias is improved significantly all over the catchment. Figure 3e,g provides clear evidence of overestimation by 3B42RT relative to the IMD gridded rainfall over the entire catchment during the training and testing periods. From Figure 2c, the median RMSE of 3B42RT is quite high, i.e., 16.81 mm/day (17.28 mm/day) during the training (testing) period. For the corrected rainfall product, RMSE decreased by 28% and 33% in the training and testing periods, respectively. The spatial distribution of RMSE also indicates a considerable improvement in the corrected product over 3B42RT throughout the catchment for both the training and testing periods (Figure 3i–l).

Performance Assessment Based on Various Rainfall Intensity Classes
The IMD classifies rainfall amounts into seven classes based on intensity (mm/day). However, because of the low number of samples in some of the IMD-defined classes, four classes are defined for this study: no rainfall (<0.5 mm/day), light rainfall (0.5 to 7.5 mm/day), moderate rainfall (7.5 to 35.5 mm/day), and heavy rainfall (>35.5 mm/day). Figure 4 presents box plots of the statistical measures for these four rainfall classes over the training and testing periods. The spatial distributions of the statistical measures for these rainfall intensity classes during the training and testing periods are shown in Figures S2 and S3, respectively. CC is reported only for three rainfall classes (light, moderate, and heavy rainfall), because the observed IMD rainfall is nil for every sample in the no-rainfall class (Figure 4). For the no-rainfall class, 3B42RT shows an overestimation with a median bias of 1.56 mm/day (1.21 mm/day) during the training (testing) period, which increased to 2.65 mm/day (2.23 mm/day) in the corrected product (Figure 4a). On the other hand, the median RMSE of 3B42RT is 4.44 mm/day (3.38 mm/day) during the training (testing) period, which is reduced by 29% (17%) in the corrected product (Figure 4b). Along with the box plots, the spatial distribution of RMSE (Figures S2b and S3b) shows that this improvement of the corrected product over 3B42RT occurs throughout the catchment during the training and testing periods. Note that the bias increased whereas the RMSE decreased in the corrected product compared to 3B42RT. This indicates a reduction in the random error of the corrected product as compared to 3B42RT, which is consistent with the study carried out by Bhuiyan et al. [44].
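The four-class grouping can be sketched as a simple threshold function applied to the observed rainfall. This is an illustration under stated assumptions: the text does not say which class a value of exactly 7.5 mm/day belongs to, so assigning it to the light class is a choice made here, and the function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def rainfall_class(obs_mm_per_day: float) -> str:
    """Assign one of the four intensity classes used in this study."""
    if obs_mm_per_day < 0.5:
        return "no rainfall"
    if obs_mm_per_day <= 7.5:   # boundary handling at 7.5 is an assumption
        return "light"
    if obs_mm_per_day <= 35.5:  # heavy is defined as strictly above 35.5
        return "moderate"
    return "heavy"


def group_by_class(obs: Sequence[float],
                   est: Sequence[float]) -> Dict[str, List[Tuple[float, float]]]:
    """Group (observed, estimated) pairs by the class of the observation."""
    groups: Dict[str, List[Tuple[float, float]]] = {}
    for o, e in zip(obs, est):
        groups.setdefault(rainfall_class(o), []).append((o, e))
    return groups
```

Computing bias and RMSE separately on each group then gives class-wise measures of the kind summarized in Figure 4. Classes with few pairs, such as heavy rainfall, yield less stable statistics.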
With regard to the light and moderate rainfall classes, a marginal improvement in the median CC is obtained in the corrected product compared to 3B42RT (Figure 4c,f). On the other hand, the median bias of 3B42RT in the light rainfall class is 5.13 mm/day (7.39 mm/day) during the training (testing) period, which is reduced by 50% (55%) in the corrected product (Figure 4d). For the moderate rainfall class, it changes from 6.86 mm/day (13.02 mm/day) to −4.07 mm/day (−1.19 mm/day) (Figure 4g). Similarly, the median RMSE of 13.54 mm/day (15.39 mm/day) associated with 3B42RT during the training (testing) period for light rainfall is reduced by 58% (59%) in the corrected product (Figure 4e). For moderate rainfall, it is reduced from 21.88 mm/day (26.82 mm/day) in 3B42RT to 11.08 mm/day (12.03 mm/day) in the corrected product (Figure 4h). Besides these box plots, the spatial plots also indicate a significant improvement in bias and RMSE all over the catchment in the corrected product for the light and moderate rainfall classes (Figures S2d,e,g,h and S3d,e,g,h). Therefore, a clear improvement in these rainfall classes is observed all over the catchment for the corrected product. The results obtained for these rainfall classes agree with the study carried out by Bhuiyan et al. [44].
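For comparison with the percentage reductions quoted for the light rainfall class, the moderate-class median RMSE values given above can be converted to relative reductions. The following is only arithmetic on the reported medians, using the relative-reduction definition (original − corrected) / original, which is assumed here to be the definition behind the quoted percentages:

$$\frac{21.88 - 11.08}{21.88} \approx 0.494 \ (\text{training}), \qquad \frac{26.82 - 12.03}{26.82} \approx 0.551 \ (\text{testing})$$

That is, the median RMSE for moderate rainfall drops by roughly 49% in the training period and 55% in the testing period.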
For the heavy rainfall class, the median CC shows hardly any improvement in the corrected product over the catchment (Figure 4i), which can also be inferred from the spatial distribution maps (Figures S2i and S3i). Some of the grids show CC values near +1 or −1 in Figure S3i, which is due to the very limited number of heavy rainfall samples during the testing period. From Figure 4j,k, it is clear that the median bias and RMSE deteriorate in the corrected product as compared to 3B42RT during both the training and testing periods. These results are consistent with the work carried out by Bhuiyan et al. [44], and this relatively poor performance may be attributed to the small number of heavy rainfall samples available during the model training stage (see Figure S4).
In addition to the box plots (Figure 4), 2-dimensional histograms (Figure 5) and the corresponding values of the performance measures (Table 2) are shown for the training and testing periods to demonstrate the reliability of the correction method. Data from all 86 grids in this study are included in these plots. Overall, 3B42RT shows considerable scatter around the 1:1 line, as evidenced by the substantial bias and RMSE (shown in bold in Figure 5a,c). In the corrected product, the scatter is considerably reduced and the samples lie closer to the 1:1 line, which is reflected in the reduced RMSE and bias (shown in bold in Figure 5b,d) compared with 3B42RT.
Regarding the intensity-based classes, positive biases (overestimations) are clearly present in 3B42RT for the no-rainfall class in the training and testing periods (Figure 5a,c), which is expected as rainfall cannot be negative. These positive biases are also present in the corrected product, but with a reduced range of scatter (Figure 5b,d). This is why the RMSE is low in the corrected product for the no-rainfall class (Table 2). For light and moderate rainfall, 3B42RT shows notable scatter around the 1:1 line (Figure 5a,c), which introduces considerable bias and RMSE during the training and testing periods (Table 2). In the corrected product, the scatter is reduced significantly and the samples lie closer to the 1:1 line (Figure 5b,d), thereby reducing the bias and RMSE (Table 2). For the heavy rainfall class, the scatter around the 1:1 line is not reduced in the corrected product compared to 3B42RT, which is evident from the performance measures (Table 2).

Performance Assessment Based on Time Series
In this section, the time series plots of IMD gridded rainfall, 3B42RT, and the corrected product for the testing period are shown (Figure 6). Out of the 86 grid points enclosing the catchment, three points are selected to represent the highest, medium, and no improvement of the corrected product over 3B42RT (Figure 6a–c). From Figure 6a, it can be observed that 3B42RT overestimates IMD rainfall for most of the testing period. In contrast, the corrected product is close to the IMD gridded rainfall for most of the testing period, which indicates that the corrected product is superior to 3B42RT. However, during heavy rainfall events (more than 35.5 mm/day), the corrected product does not match the IMD gridded rainfall. These results are consistent with those obtained in Section 3.2. Similar findings are obtained for the other grid points considered (Figure 6b,c). It is also evident that the performance of the corrected product (in terms of RMSE) deteriorates significantly as the frequency and magnitude of heavy rainfall increase (Figure 6a–c).

Summary and Conclusions
In this study, 3B42RT NRT SRE and ASCAT-based NRT soil moisture data are integrated through a machine learning-based SVR model to improve 3B42RT. The statistical measures, i.e., CC, bias, and RMSE, have been chosen to assess the performance. All these performance measures are presented with boxplots and spatial plots. In addition, the time-series plots of IMD, 3B42RT, and the corrected product are also shown to assess the temporal performance of this integration approach.
The results reveal that 3B42RT is associated with significant bias and RMSE. In the corrected product, however, bias and RMSE are significantly reduced compared to 3B42RT rainfall. In particular, RMSE is decreased by 28% and 33% during the training and testing periods, respectively. With regard to the intensity-based performance, both bias and RMSE are reduced significantly in the corrected product for light and moderate rainfall over the entire catchment. The reduction in RMSE compared with 3B42RT in these two classes is about 50 to 60%. A marginal improvement is also observed in CC values for the corrected product. However, for the heavy rainfall class, no clear improvements are observed, indicating the limitation of the developed algorithm in capturing heavy rainfall events. In the no-rainfall class, RMSE (bias) is decreased (increased) in the corrected product as compared to 3B42RT, which is due to the improvement in the random error. The results indicate that the proposed approach can effectively reduce the error associated with 3B42RT over the Ashti catchment. However, the robustness of the approach needs to be tested rigorously in catchments located in different climatic conditions and using different rainfall products and soil moisture datasets.
Supplementary Materials: The following are available online at http://www.mdpi.com/2072-4292/11/19/2221/s1. Figure S1: Optimum value of the support vector machine-based regression model's hyperparameters for various grid points across the Ashti catchment. Figure S2: Spatial distribution of the performance of 3B42RT and the corrected product for different rainfall intensity classes, i.e., (a,b) no rainfall; (c–e) light rainfall; (f–h) moderate rainfall; (i–k) heavy rainfall; across the Ashti catchment using CC, bias, and RMSE during the training period. Figure S3: Spatial distribution of the performance of 3B42RT and the corrected product for different rainfall intensity classes, i.e., (a,b) no rainfall; (c–e) light rainfall; (f–h) moderate rainfall; (i–k) heavy rainfall; across the Ashti catchment using CC, bias, and RMSE during the testing period. Figure S4: Number of samples corresponding to the various classes of rainfall in 3B42RT and the corrected product during the training and testing periods.