Estimating Mean Surface Backscatter from GPM Surface Backscatter Observations

: The radar on the Global Precipitation Measurement (GPM) mission observes precipitation at 13.6 GHz (Ku-band) and 35.6 GHz (Ka-band) and also receives echoes from the earth’s surface. Statistics of surface measurements for non-raining conditions are saved in a database for later use in estimating the precipitation path-integrated attenuation. Previous work by Meneghini and Jones (2011) showed that while averaging over larger latitude/longitude bins increase the number of samples, it can also increase sample variance due to spatial inhomogeneity in the data. As a result, Meneghini and Kim (2017) proposed a new, adaptive method of database construction, in which the number of measurements averaged depends on the spatial homogeneity. The purpose of this work is to re-visit previous, single-frequency results using dual-frequency data and optimal interpolation (kriging). Results include that (1) temporal inhomogeneity can create similar results as spatial, (2) Ka-band behavior is similar to Ku-band, (3) the Ku-/Ka-band difference has less spatial inhomogeneity than either band by itself, and (4) kriging and the adaptive method can reduce the sample variance. The author concludes that ﬁner spatial and temporal resolution is necessary in constructing the database for single frequencies but less so for the Ku-/Ka-band difference. The adaptive approach reduces sample standard deviation with a relatively modest computational increase.


Introduction
The Global Precipitation Measurement (GPM) mission [1,2] was launched to space in 2014 and includes the Dual-frequency Precipitation Radar (DPR), which observes precipitation at 13.6 GHz (Ku-band) and 35.6 GHz (Ka-band). In addition to precipitation echoes, the DPR also receives echoes from the earth's surface, providing measurements of the radar cross-section normalized by surface area, denoted by σ • [3]. These surface σ • measurements are important to GPM for estimating the path-integrated attenuation (PIA) in precipitation. The measured PIA for a given precipitation reflectivity profile provides an important constraint, or boundary condition, used in the operational retrieval of precipitation from that profile [3]. The algorithm for using the GPM surface return for this application is called the surface reference technique (SRT) and has been a key part of precipitation retrieval for both GPM and its predecessor, the Tropical Rainfall Measuring Mission (TRMM) [3][4][5][6].
While there are various subtleties associated with the SRT, the technique can be described simply as estimating total PIA as the difference of the σ • in rain with an equivalent σ • without rain. Methods include comparisons of the surface measured in rain with nearby non-raining measurements, also used for attenuation estimation for airborne radar measurements [7], and comparison with previous measurements of the same location under non-raining conditions (temporal method). These previous measurements are stored in a so-called temporal database as a function of space and time [3][4][5][6]. Figure 1 illustrates DPR measurements of σ • on an irregular grid (blue dots), determined by the spacecraft orbit and cross-track scan pattern. Additionally shown is a latitude/longitude grid on which we want to compute the sum and sum-of-squares of the σ • measurements. These two sums Eng 2021, 2 493 provide the information needed for estimating the mean σ • and sample standard deviation at each grid point. The mean σ • is used for comparison with σ • measured in precipitation; the sum-of-squares is used to find the sample variance and standard deviation for assessing the SRT error. The simplest approach for computing the sums at each point is to bin all the data near each grid point and then perform the sums over the points in each bin. This is the method normally used to create the temporal database for GPM, and was also used for the temporal database for TRMM.
Eng 2021, 2, FOR PEER REVIEW 2 which we want to compute the sum and sum-of-squares of the σ° measurements. These two sums provide the information needed for estimating the mean σ° and sample standard deviation at each grid point. The mean σ° is used for comparison with σ° measured in precipitation; the sum-of-squares is used to find the sample variance and standard deviation for assessing the SRT error. The simplest approach for computing the sums at each point is to bin all the data near each grid point and then perform the sums over the points in each bin. This is the method normally used to create the temporal database for GPM, and was also used for the temporal database for TRMM. Previous work [8] has examined effects of the bin size on the sample standard deviation (SSD, as used in [8]) of the mean σ° estimates when using the binning approach. While the use of relatively large bins provides more samples for estimating the mean σ°, increasing the spatial bin size of the Ku-band TRMM PR data can increase the SSD, as reported in [8]. This is true even when starting with 0.1° spatial bins and monthly time bins, as was available toward the latter part of the TRMM mission, which had provided roughly 17 years of data when it ceased operation in 2015. The work reported here extends the analysis of [8] to GPM dual-frequency data, with geographical coverage to ±65° latitude versus ±35° latitude for TRMM. This work also considers the effect of the size of time bins as well as spatial bins. Because of the especially large SSD of land σ° compared with ocean σ° at smaller incidence angles relative to nadir [5], we focus on land data here.
Besides simply averaging the data in each bin, the situation in Figure 1 can be considered an interpolation of the radar measurements to a desired grid. A statistically optimal method of spatial interpolation is known as kriging in geostatistics or optimal interpolation in atmospheric science [9][10][11]. Kriging was originally developed for spatial data, but extensions to space-time are also possible [12]. Kriging is also related to Wiener filtering in the signal-processing literature [13]. In addition to kriging is the approach of Meneghini and Kim [14], which starts with gridded (i.e., finely binned) data, rather than the original measurements (i.e., the blue dots in Figure 1). It then increases the size of each spatial bin by sequentially including surrounding high-resolution bins and checking the new SSD. If including the new spatial bin decreases the SSD, it is kept. If not, it is discarded. This same logic is applied to all grid locations, so that the resulting σ° statistics for a given grid point are based on a larger number of samples if that grid point is in a relatively homogeneous region. Results in [14], applied to GPM data, have already shown the Previous work [8] has examined effects of the bin size on the sample standard deviation (SSD, as used in [8]) of the mean σ • estimates when using the binning approach. While the use of relatively large bins provides more samples for estimating the mean σ • , increasing the spatial bin size of the Ku-band TRMM PR data can increase the SSD, as reported in [8]. This is true even when starting with 0.1 • spatial bins and monthly time bins, as was available toward the latter part of the TRMM mission, which had provided roughly 17 years of data when it ceased operation in 2015. The work reported here extends the analysis of [8] to GPM dual-frequency data, with geographical coverage to ±65 • latitude versus ±35 • latitude for TRMM. This work also considers the effect of the size of time bins as well as spatial bins. Because of the especially large SSD of land σ • compared with ocean σ • at smaller incidence angles relative to nadir [5], we focus on land data here.
Besides simply averaging the data in each bin, the situation in Figure 1 can be considered an interpolation of the radar measurements to a desired grid. A statistically optimal method of spatial interpolation is known as kriging in geostatistics or optimal interpolation in atmospheric science [9][10][11]. Kriging was originally developed for spatial data, but extensions to space-time are also possible [12]. Kriging is also related to Wiener filtering in the signal-processing literature [13]. In addition to kriging is the approach of Meneghini and Kim [14], which starts with gridded (i.e., finely binned) data, rather than the original measurements (i.e., the blue dots in Figure 1). It then increases the size of each spatial bin by sequentially including surrounding high-resolution bins and checking the new SSD. If including the new spatial bin decreases the SSD, it is kept. If not, it is discarded. This same logic is applied to all grid locations, so that the resulting σ • statistics for a given grid point are based on a larger number of samples if that grid point is in a relatively homogeneous region. Results in [14], applied to GPM data, have already shown the basic soundness of this approach. In the remainder of this paper, this method is referred to as the "adaptive" method, since the amount of averaging depends on the data itself.
The next section describes the GPM DPR data used for this study. The following section describes the methodology used to carry out the study. This is followed by results for both the effects of averaging and the comparison of simple binning with kriging and adaptive database construction. The final section provides a discussion of the results.

Data
The data used for this work are GPM dual-frequency 2A DPR data, Version 6. The version number is provided here since the GPM operational processing algorithms and resulting data change somewhat between versions. Within a version, the reconstructed and calibrated instrument data at full resolution are referred to as Level 1 [15]. Level 2 data consist of derived geophysical data with the same resolution and acquisition geometry as the Level 1 instrument data. The GPM DPR Level 2 data used here contain both Kuband and Ka-band surface-normalized radar cross-section σ • [3]. The DPR Ku-band data are acquired in scans across-track to ±17 • off nadir with a 5 km footprint at the Earth's surface [1]. The Ku-band swath consists of 49 beams. The Ka-band swath (until May 2018) is based on a similar cross-track scan but only for a maximum angle of about 9 • , yielding 25 beams across-track. Beginning in May of 2018, the Ka-band scanning pattern covers the same ± 17 • swath as the Ku-band [6,9]. However, processing of the Ka-band data beyond 9 • requires updated algorithms; hence, products using those data are not available as operational products; products using the full Ka-band scan data are anticipated for Release 7 of the GPM data. The irregular-looking surface locations for the Level 1 and Level 2 surface measurements have already been illustrated in Figure 1. The dates for the data used here are April 2014 through March 2020, 6 years of data. While secular variations extending through the six years might be detectable with sufficient averaging, the focus here is on seasonal and shorter time variation. Hence, the data for all six years are tagged with the week number, 1-52, so that data with the same week number are analyzed together, regardless of the year.

Data Analysis Methodology
The DPR temporal database, used in DPR operational processing, is not generally available. Instead, I created a local version by the same binning approach used by GPM, in which the geolocated Level 2 data falling within a specific space and time bin are summed for the computation of mean and variance [8]. This database-generation algorithm can duplicate the current GPM temporal database by averaging to the same 0.5 • space and quarterly time averaging [5,6] and can create alternative databases with other spatial and temporal bin sizes. This latter feature is needed to determine the effect of various spatial or temporal bin sizes on the sample standard deviation of the estimated mean σ • , as discussed in Section 1.
In kriging, the random process (in this case, σ • ) is modeled as Gaussian: , where x (denoted by boldface) is the spatial (x,y) or spatiotemporal (x,y,t) location. The underlying Gaussian process is ε(x), with known covariance, while β and m(x) are vectors of length p, with the apostrophe on m'(x) denoting transpose. The desired output of the linear estimator is σ • at each specified grid point, given noisy observations at locations that generally do not coincide with grid points (Figure 1). In simple kriging, m(x) is assumed to be 0, so the optimal estimate at x is σ • (x) = KZ, where Z is a vector of σ • measurements at a set of locations, as in Figure 1. The kriging coefficients in K are found by minimizing the mean square error of the kriging predictor versus observations. The coefficients depend on the covariance of ε(x), although kriging is often formulated via the variogram, which is related to the expected difference between points, rather than the covariance [10]. In practice, either the covariance or variogram is estimated from the observations prior to kriging. Besides simple kriging, there is ordinary kriging, which assumes that m(x) is an unknown constant that is estimated as part of kriging. The ordinary spatial kriging routine used here is from the ltsk package in the R language [16].
The last main algorithm needed for this study is the adaptive approach of [14]. I have implemented this approach in R; it captures the overall intent of the method described in [14] but may differ in detail. In summary, it loops over each grid point in latitude and longitude, starting with higher-resolution binned data (not the original Level 2 measurements). For a given point, the code finds all the neighbor bins and then tests the effect of including each along with the original bin. If at least one neighbor decreases the SSD, a new set of neighbors is generated and tested. When no neighbors in a given step decrease SSD, the search process for that latitude and longitude location is terminated. The number of total samples (original high-resolution bin plus added neighbors) is saved, along with the total sum and sum of squares. Then, the code moves to the next latitude/longitude location and begins the search there. This particular version of the adaptive method is called "stepwise" in [14] and is described in detail there. When estimating the mean of a stochastic process, drawing from a larger set of samples typically reduces the sample standard deviation, if the process is stationary. This can be seen by approximating the measurements as independent, identically distributed (IID) random variables. As discussed in [17], the sample mean of n IID random variables has a variance of κ 2 /n, where κ 2 is the variance of each measurement. This reduces as n increases. However, if the process is not stationary, the measurements are no longer identically distributed. Additional samples can then increase the standard deviation of the estimated mean.
Meneghini and Jones [8] provide a good description of the theory behind their findings on averaging. I give an alternative illustration using Figure 2, which shows the results of a simple experiment in estimating the mean of a random process with a varying mean. On the left is a single realization of the process, with the first half (128 samples) having mean 1 and the second half (another 128 points) having mean 6. I then break the data into varying sized sections of 4, 8, 16, 32, 64, 128, 140, and 256. The bar plot on the right shows the SSD of the estimated mean for each segment size. The SSD of the estimate of the mean in Figure 2 decreases approximately as expected (~1/ √ n) for all cases where the segment is 128 samples or less. However, segments larger than 128 cross the boundary between the means of 1 and 6, increasing the SSD within the segment and biasing the estimated mean. 21, 2, FOR PEER REVIEW 4 servations prior to kriging. Besides simple kriging, there is ordinary kriging, which assumes that m(x) is an unknown constant that is estimated as part of kriging. The ordinary spatial kriging routine used here is from the ltsk package in the R language [16]. The last main algorithm needed for this study is the adaptive approach of [14]. I have implemented this approach in R; it captures the overall intent of the method described in [14] but may differ in detail. In summary, it loops over each grid point in latitude and longitude, starting with higher-resolution binned data (not the original Level 2 measurements). For a given point, the code finds all the neighbor bins and then tests the effect of including each along with the original bin. If at least one neighbor decreases the SSD, a new set of neighbors is generated and tested. When no neighbors in a given step decrease SSD, the search process for that latitude and longitude location is terminated. The number of total samples (original high-resolution bin plus added neighbors) is saved, along with the total sum and sum of squares. Then, the code moves to the next latitude/longitude location and begins the search there. This particular version of the adaptive method is called "stepwise" in [14] and is described in detail there. When estimating the mean of a stochastic process, drawing from a larger set of samples typically reduces the sample standard deviation, if the process is stationary. This can be seen by approximating the measurements as independent, identically distributed (IID) random variables. As discussed in [17], the sample mean of n IID random variables has a variance of 2 / , where 2 is the variance of each measurement. This reduces as increases. However, if the process is not stationary, the measurements are no longer identically distributed. Additional samples can then increase the standard deviation of the estimated mean.
Meneghini and Jones [8] provide a good description of the theory behind their findings on averaging. I give an alternative illustration using Figure 2, which shows the results of a simple experiment in estimating the mean of a random process with a varying mean. On the left is a single realization of the process, with the first half (128 samples) having mean 1 and the second half (another 128 points) having mean 6. I then break the data into varying sized sections of 4, 8, 16, 32, 64, 128, 140, and 256. The bar plot on the right shows the SSD of the estimated mean for each segment size. The SSD of the estimate of the mean in Figure 2 decreases approximately as expected (~1/√ ) for all cases where the segment is 128 samples or less. However, segments larger than 128 cross the boundary between the means of 1 and 6, increasing the SSD within the segment and biasing the estimated mean.  The GPM DPR data are averaged onboard, giving a standard deviation for each measured σ • in the Level 2 data (blue dots in Figure 1) of less than 1 dB [6]. Using latitude/longitude bins of 0.25 • and time bins of 1 week, we find that with the 6-year data Eng 2021, 2 496 record we have, on average, 5 samples per bin at nadir and 10 at other angles; many bins have no samples. Hence, averaging the σ • measurements in a bin should then reduce the 1 dB standard deviation to less than about 0.3 dB (based on 1/ √ n and n = 10), if the measurements represent IID random variables. However, significantly larger standard deviations are found from the GPM data, indicating non-uniformity among the samples. Figure 3 shows the cumulative (probability) distribution function (CDF) for the SSD of Ku-band σ • with varying amounts of spatial and temporal averaging for incidence angles of nadir (0 • ) and 5.25 • . This can be compared with the TRMM results found in [8] (their Figure 6), which has similar curves for several values of spatial averaging. For any choice of σ • standard deviation (SSD) on the horizontal axis, the higher the curve, the smaller the probability of the SSD exceeding the chosen value on the horizontal axis. For example, the median SSD for nadir incidence is 4.3 dB for 0.25 • spatial and weekly temporal averaging. However, the green dashed curve, corresponding to 0.5 • quarterly binning, indicates a CDF of less than 0.2 at 4.3 dB. Hence, the space-time averaging increases the probability of the sample standard deviation exceeding 4.3 dB. The CDF curves for the Ka-band look very similar to those for the Ku-band and so are not shown. For both frequencies, the curves at 5.25 • incidence are generally higher than the corresponding curves at nadir, implying smaller standard deviations off-nadir. The median Ku-band standard deviation at 5.25 • is 1.8 dB, much smaller than the nadir value. I also computed curves for both frequencies at an 8.25 • angle (not shown). These look similar to the curves for 5.25 • in Figure 3, but the median standard deviation is reduced to 1.5 dB. This corresponds to the previously observed smaller variability of σ • off-nadir [3,5].
, 2, FOR PEER REVIEW 5 The GPM DPR data are averaged onboard, giving a standard deviation for each measured σ° in the Level 2 data (blue dots in Figure 1) of less than 1 dB [6]. Using latitude/longitude bins of 0.25° and time bins of 1 week, we find that with the 6-year data record we have, on average, 5 samples per bin at nadir and 10 at other angles; many bins have no samples. Hence, averaging the σ° measurements in a bin should then reduce the 1 dB standard deviation to less than about 0.3 dB (based on 1/√ and n = 10), if the measurements represent IID random variables. However, significantly larger standard deviations are found from the GPM data, indicating non-uniformity among the samples. Figure  3 shows the cumulative (probability) distribution function (CDF) for the SSD of Ku-band σ° with varying amounts of spatial and temporal averaging for incidence angles of nadir (0°) and 5.25°. This can be compared with the TRMM results found in [8] (their Figure 6), which has similar curves for several values of spatial averaging. For any choice of σ° standard deviation (SSD) on the horizontal axis, the higher the curve, the smaller the probability of the SSD exceeding the chosen value on the horizontal axis. For example, the median SSD for nadir incidence is 4.3 dB for 0.25° spatial and weekly temporal averaging. However, the green dashed curve, corresponding to 0.5° quarterly binning, indicates a CDF of less than 0.2 at 4.3 dB. Hence, the space-time averaging increases the probability of the sample standard deviation exceeding 4.3 dB. The CDF curves for the Ka-band look very similar to those for the Ku-band and so are not shown. For both frequencies, the curves at 5.25° incidence are generally higher than the corresponding curves at nadir, implying smaller standard deviations off-nadir. The median Ku-band standard deviation at 5.25° is 1.8 dB, much smaller than the nadir value. I also computed curves for both frequencies at an 8.25° angle (not shown). These look similar to the curves for 5.25° in Figure  3, but the median standard deviation is reduced to 1.5 dB. This corresponds to the previously observed smaller variability of σ° off-nadir [3,5]. The GPM DPR retrieval algorithms have separate processing for the Ku-and Kaband frequencies, as well as algorithms for dual-frequency processing [5,6]. In this latter case, the SRT uses the differential σ° (DS0), defined as the Ku-band σ° minus Ka-band σ° (in dB). Figure 4 shows the CDFs for DS0. Previous work has noted the advantage of the DS0 relative to single-frequency σ° [18], in terms of smaller errors in the PIA estimates. The CDF curves for DS0 at both angles are shifted well above the corresponding results for the Ku-band in Figure 3 (and for the Ka-band). This confirms the underlying ideas behind the dual-frequency processing. Specifically, σ° at the Ku-and Ka-bands are wellcorrelated; this is discussed in [18] and also shown using GPM data in [5] (their Figure 3). This correlation implies that the spatial variability of the DS0 should be smaller than the variability of σ° at a single frequency. This reduced variability of DS0 results in smaller The GPM DPR retrieval algorithms have separate processing for the Ku-and Ka-band frequencies, as well as algorithms for dual-frequency processing [5,6]. In this latter case, the SRT uses the differential σ • (DS0), defined as the Ku-band σ • minus Ka-band σ • (in dB). Figure 4 shows the CDFs for DS0. Previous work has noted the advantage of the DS0 relative to single-frequency σ • [18], in terms of smaller errors in the PIA estimates. The CDF curves for DS0 at both angles are shifted well above the corresponding results for the Ku-band in Figure 3 (and for the Ka-band). This confirms the underlying ideas behind the dual-frequency processing. Specifically, σ • at the Ku-and Ka-bands are well-correlated; this is discussed in [18] and also shown using GPM data in [5] (their Figure 3). This correlation implies that the spatial variability of the DS0 should be smaller than the variability of σ • at a single frequency. This reduced variability of DS0 results in smaller SSD and improved PIA estimates using the dual-frequency SRT algorithm, relative to single-frequency algorithms, discussed in [5,6]. 1, 2, FOR PEER REVIEW 6 SSD and improved PIA estimates using the dual-frequency SRT algorithm, relative to single-frequency algorithms, discussed in [5,6]. I have confirmed the reduced variability of DS0 using a principal component analysis (PCA) [19]. The input to PCA is the vector of Ku-and Ka-band σ° measurements over the angle. The results of PCA using a large number of these measurement vectors show that the first principal component corresponds to about 92% of the variability of the DPR measurements. This first principal component turns out to essentially be a weighted average of the Ku-band and Ka-band measurements. However, the third principal component is related to the Ku-Ka-band difference (i.e., DS0); it explains only about 8% of the total variance, confirming the much larger variability of the single-frequency measurements of σ°. The effect of the reduced variability of DS0 is the change in CDFs from Figure 3 to Figure  4. The median SSD for DS0 at nadir is 2 dB, dropping to 1 dB at 5.25°, well below the Kuband values provided above. At nadir, for SSDs on the horizontal axis greater than about 3.5 dB, it seems that averaging in space and/or time can actually reduce the probability of exceeding the chosen SSD. For smaller SSDs, averaging still increases the probability of exceeding the chosen SSD (lowers the curve). At 5.25°, the separation between curves is small, compared with the Ku-band in Figure 3, indicating that averaging has a relatively small effect.
The results in Figures 3 and 4 can inform the choice of space and time averaging for the temporal database. First considering the Ku-band results in Figure 3, we can start with the green dashed lines, which correspond to the current database bin sizes of 0.5° in space and quarterly in time and are the lowest curves in the plots. As noted, the Ka-band results are similar, so the same considerations apply. Moving upward, we next encounter solid black lines, which correspond to the same 0.5° spatial average but monthly averaging in time. Above this is 0.25° spatial averaging and quarterly temporal averaging (purple dotdash line). The position of this line above the 0.5° monthly curve indicates that improving the spatial resolution (by 2 × 2 for latitude and longitude) is slightly more beneficial than improving the temporal resolution from quarterly to monthly. At nadir, the 0.25° monthly curve and the 0.5° weekly curves are very close together, while at the 5.25° incidence, the 0.25° monthly curve is slightly higher. Furthermore, improving the spatial resolution to 0.25° appears preferable to improving time averaging from quarterly to monthly. Once the 0.25° spatial resolution is achieved, additional data can be used to improve the time resolution. The dual-frequency results show a different behavior of the CDFs for DS0, with little or no expected improvement with higher spatial and/or temporal resolution, especially for angles away from nadir. I have confirmed the reduced variability of DS0 using a principal component analysis (PCA) [19]. The input to PCA is the vector of Ku-and Ka-band σ • measurements over the angle. The results of PCA using a large number of these measurement vectors show that the first principal component corresponds to about 92% of the variability of the DPR measurements. This first principal component turns out to essentially be a weighted average of the Ku-band and Ka-band measurements. However, the third principal component is related to the Ku-Ka-band difference (i.e., DS0); it explains only about 8% of the total variance, confirming the much larger variability of the single-frequency measurements of σ • . The effect of the reduced variability of DS0 is the change in CDFs from Figure 3 to Figure 4. The median SSD for DS0 at nadir is 2 dB, dropping to 1 dB at 5.25 • , well below the Ku-band values provided above. At nadir, for SSDs on the horizontal axis greater than about 3.5 dB, it seems that averaging in space and/or time can actually reduce the probability of exceeding the chosen SSD. For smaller SSDs, averaging still increases the probability of exceeding the chosen SSD (lowers the curve). At 5.25 • , the separation between curves is small, compared with the Ku-band in Figure 3, indicating that averaging has a relatively small effect.
The results in Figures 3 and 4 can inform the choice of space and time averaging for the temporal database. First considering the Ku-band results in Figure 3, we can start with the green dashed lines, which correspond to the current database bin sizes of 0.5 • in space and quarterly in time and are the lowest curves in the plots. As noted, the Ka-band results are similar, so the same considerations apply. Moving upward, we next encounter solid black lines, which correspond to the same 0.5 • spatial average but monthly averaging in time. Above this is 0.25 • spatial averaging and quarterly temporal averaging (purple dot-dash line). The position of this line above the 0.5 • monthly curve indicates that improving the spatial resolution (by 2 × 2 for latitude and longitude) is slightly more beneficial than improving the temporal resolution from quarterly to monthly. At nadir, the 0.25 • monthly curve and the 0.5 • weekly curves are very close together, while at the 5.25 • incidence, the 0.25 • monthly curve is slightly higher. Furthermore, improving the spatial resolution to 0.25 • appears preferable to improving time averaging from quarterly to monthly. Once the 0.25 • spatial resolution is achieved, additional data can be used to improve the time resolution. The dual-frequency results show a different behavior of the CDFs for DS0, with little or no expected improvement with higher spatial and/or temporal resolution, especially for angles away from nadir.
Applying binning, kriging, and the adaptive methods yields the result of Figure 5, showing the resulting mean and standard deviation for a test case in SE Asia. The data correspond to the month of July and are the Ku-band with incidence angles from 0.0 • to 0.8 • ; all methods estimate σ • on a grid with spacing of 0.5 • . The Ku-band at small angles generally has a larger variability than either larger angles or the Ka-band and so is used here as a worst-case test of the methods. The means of the binning and kriging approaches are similar. However, the adaptive approach looks smoother, since it involves degrading resolution in areas with smaller variability. Looking at the bottom row in Figure 5, the estimation standard deviation (i.e., the SSD) is reduced by both kriging and the adaptive approach. However, the reduction in the mean SSD is not large. The mean/maximum of the SSDs over all the bins in the Figure 5 area, for binning, kriging, and adaptive, respectively, are 6.8/10.0, 6.2/7.5, and 6.1/8.3 dB. Kriging has the best overall performance since the mean SSD is similar to the adaptive method, while its maximum SSD is lower. Figure 6 shows the same information as in Figure 5 but for a second test case, over North America. The SSDs for Figure 6 are 6.4/10.0, 5.8/7.2, and 5.8/7.9 dB (mean/max), similar to the case in Figure 5. While slower than binning, the adaptive approach appears much faster than kriging (by more than a factor of 10), with better performance than binning. 1, 2, FOR PEER REVIEW 7 Applying binning, kriging, and the adaptive methods yields the result of Figure 5, showing the resulting mean and standard deviation for a test case in SE Asia. The data correspond to the month of July and are the Ku-band with incidence angles from 0.0° to 0.8°; all methods estimate σ° on a grid with spacing of 0.5°. The Ku-band at small angles generally has a larger variability than either larger angles or the Ka-band and so is used here as a worst-case test of the methods. The means of the binning and kriging approaches are similar. However, the adaptive approach looks smoother, since it involves degrading resolution in areas with smaller variability. Looking at the bottom row in Figure 5, the estimation standard deviation (i.e., the SSD) is reduced by both kriging and the adaptive approach. However, the reduction in the mean SSD is not large. The mean/maximum of the SSDs over all the bins in the Figure 5 area, for binning, kriging, and adaptive, respectively, are 6.8/10.0, 6.2/7.5, and 6.1/8.3 dB. Kriging has the best overall performance since the mean SSD is similar to the adaptive method, while its maximum SSD is lower. Figure  6 shows the same information as in Figure 5 but for a second test case, over North America. The SSDs for Figure 6 are 6.4/10.0, 5.8/7.2, and 5.8/7.9 dB (mean/max), similar to the case in Figure 5. While slower than binning, the adaptive approach appears much faster than kriging (by more than a factor of 10), with better performance than binning.

Figure 5.
Top row: Mean nadir σ° of a test area over SE Asia using data from one week in July. Grid resolution is 0.5°; latitude and longitude are relative to the area's center. On the left is the result of binning all measurements within specified space and time ranges. In the center is the standard deviation resulting from kriging, in place of binning. On the right are the results for our implementation of the Meneghini and Kim [14] adaptive approach. Bottom row: Same as top row but standard deviation (SSD) instead of mean.

Discussion
This paper has reported on the use of six years of GPM DPR data to investigate the variability of the estimated mean surface radar backscatter versus time and space averaging, with an application to use the surface backscatter for accurate estimation of the pathintegrated attenuation for correction of GPM DPR rainfall profile correction and retrieval. The accuracy of this method is directly related to the standard deviation of estimates of σ° that are stored in a so-called temporal database. The analysis presented here extends the results of [8] to dual-frequency, higher latitude, and changes in temporal averaging. The last of these can cause the same effect as seen in [8] for spatial averaging; namely, the sample standard deviation tends to increase for increased time averaging. I also find that the individual behavior of the Ka-band σ° is quite similar to that of the Ku-band. However, when taking the difference (in dB) of the Ku-and Ka-band σ° measurements, the behavior changes. The relatively high correlation between the Ku-band and the Ka-band σ° causes the difference to have less spatial and temporal variability than σ° at either frequency. This results in a reduced effect of averaging; the current averaging strategy for the dual-frequency data appears reasonable, at least away from nadir. I also experiment with applying optimal spatial interpolation or averaging methods. This leads to the finding that both kriging and the adaptive averaging strategy in [14] can reduce the sample standard deviation of the estimated σ°, with kriging producing the lowest maximum sample standard deviation. However, the adaptive method of [14] is significantly faster than kriging, for the implementations used here.

Discussion
This paper has reported on the use of six years of GPM DPR data to investigate the variability of the estimated mean surface radar backscatter versus time and space averaging, with an application to use the surface backscatter for accurate estimation of the path-integrated attenuation for correction of GPM DPR rainfall profile correction and retrieval. The accuracy of this method is directly related to the standard deviation of estimates of σ • that are stored in a so-called temporal database. The analysis presented here extends the results of [8] to dual-frequency, higher latitude, and changes in temporal averaging. The last of these can cause the same effect as seen in [8] for spatial averaging; namely, the sample standard deviation tends to increase for increased time averaging. I also find that the individual behavior of the Ka-band σ • is quite similar to that of the Ku-band. However, when taking the difference (in dB) of the Ku-and Ka-band σ • measurements, the behavior changes. The relatively high correlation between the Ku-band and the Ka-band σ • causes the difference to have less spatial and temporal variability than σ • at either frequency. This results in a reduced effect of averaging; the current averaging strategy for the dual-frequency data appears reasonable, at least away from nadir. I also experiment with applying optimal spatial interpolation or averaging methods. This leads to the finding that both kriging and the adaptive averaging strategy in [14] can reduce the sample standard deviation of the estimated σ • , with kriging producing the lowest maximum sample standard deviation. However, the adaptive method of [14] is significantly faster than kriging, for the implementations used here.