The GPM Validation Network and Evaluation of Satellite-Based Retrievals of the Rain Drop Size Distribution

Abstract: A unique capability of the Global Precipitation Measurement (GPM) mission is its ability to better estimate the raindrop size distribution (DSD) on a global scale. To validate the GPM DSD retrievals, a network of more than 100 ground-based polarimetric radars from across the globe is used within the broader context of the GPM Validation Network (VN) processing architecture. The GPM VN ensures quality-controlled dual-polarimetric radar moments for use in providing reference estimates of the DSD. The VN DSD estimates are geometrically matched with the GPM core satellite measurements for evaluation of the GPM algorithms. We use the GPM VN to compare the DSD retrievals from the GPM's Dual-frequency Precipitation Radar (DPR) and combined DPR-GPM Microwave Imager (GMI) Level-2 algorithms. Results suggest that the Version 06A GPM core satellite algorithms provide estimates of the mass-weighted mean diameter (Dm) that are biased 0.2 mm too large when considered across all precipitation types. In convective precipitation, the algorithms tend to overestimate Dm by 0.5-0.6 mm, leading the DPR algorithm to underestimate the normalized DSD intercept parameter (Nw) by a factor of two and to introduce a significant bias to the DPR retrievals of rainfall rate for DSDs with large Dm. The GPM Combined algorithm performs better than the DPR algorithm in convection but provides a severely limited range of Nw estimates, highlighting the need to broaden its a priori database in convective precipitation.


Introduction
Understanding the distribution of precipitation is vital to gaining new insights into the water and energy cycle of the Earth system and how this life-essential resource evolves in a changing climate. Hence, accurate precipitation estimates are required on a global scale, something only possible with Earth-observing satellites. The Global Precipitation Measurement (GPM) mission provides these observations using a core satellite, which carries the Dual-frequency Precipitation Radar (DPR) and the GPM Microwave Imager (GMI), operating within and calibrating a constellation of microwave radiometers [1,2]. Building upon the 17 years of satellite-based observations from the Tropical Rainfall

Atmosphere 2020, 11, x FOR PEER REVIEW 3 of 14
variables (e.g., DSD, rainfall rate) derived from more than 400,000 min of two-dimensional video disdrometer measurements in the following regions: southern Finland; central Oklahoma; southern Ontario, Canada; eastern Iowa; western North Carolina; Olympic Peninsula; Huntsville, Alabama; and Wallops Island, Virginia [17]. Hence, the VN database serves the GPM mission as the connection between the disdrometer observations and the satellite-based DSD retrievals.
The VN ingests ground-based polarimetric radar data from a variety of sources, including both operational and research weather radars, processes the raw radar moment data to obtain geophysical variables, and geometrically matches those with the DPR and GMI retrievals during GPM core satellite overpasses within 100 km of each radar ingested. A schematic of this workflow is given in Figure 2.


Data Sources
Operational radar datasets such as National Oceanic and Atmospheric Administration (NOAA) National Weather Service WSR-88D Level II radar data are acquired in near real time from NOAA's publicly accessible archive hosted on the Amazon Web Services Simple Storage Service (S3) cloud platform [26]. Across the continental United States (CONUS), the VN includes most polarimetric Weather Surveillance Radar, 1988, Doppler radars (WSR-88Ds) east of the Rocky Mountains. This region is selected in a somewhat conservative fashion as a zeroth-order approach to maximize ground validation (GV) reference data quality by avoiding radar observations affected by terrain blockage and associated clutter, while at the same time optimizing the associated sampling range/area of the given radars underneath periodic overpasses of the GPM core satellite. Beyond this initial selection of radars, WSR-88Ds in Alaska, Hawaii, Guam and Puerto Rico are now also processed. This equates to nearly 200 GB of WSR-88D data ingested by the VN each day. In addition to the network of WSR-88D radars, the VN also routinely ingests S-band polarimetric radar data from Brazil's Center for Monitoring and Alerting of Natural Disasters (CEMADEN) radar network, and radar data from the Kwajalein Polarimetric S-band Weather Radar (KPOL) [27], which is located in the Central Pacific. Radars primarily used for research are also ingested by the VN, but less frequently, depending on their operations and the timing of GPM overpasses. These include the following polarimetric radars: the NASA Polarimetric Doppler Weather Radar (NPOL) [28]; the Colorado State University-University of Chicago-Illinois State Water Survey (CSU-CHILL) [29]; the University of Alabama in Huntsville/WHNT-TV Advanced Radar for Meteorological and Operational Research (ARMOR) [30]; and the C-band polarimetric/Doppler meteorological radar system (C-POL) in Darwin, Australia [31].
The VN also includes data from Météo-France's S-band polarimetric radar at Piton Villers on La Réunion [32] collected between 2014 and 2017, providing a unique site in the southern Indian Ocean to enrich GPM GV.
The VN routinely ingests atmospheric sounding information from the National Oceanic and Atmospheric Administration (NOAA) Rapid Refresh (RAP) hourly model analysis [33], the Global Forecast System [34], the Flow-Following, Finite Volume Icosahedral Model [35] and from radiosondes launched daily at Kwajalein Atoll to extract vertical temperature and moisture profiles used in the VN hydrometeor identification (HID) scheme [36], whose weights have been adjusted for S-band. Additionally, the GPM Precipitation Processing System [37] provides orbital subsets of the GPM products for each ground radar ingested by the VN. When a GPM overpass occurs within 200-km of a given radar, the VN ground radar (GR) processing is activated. Approximately 40 GPM overpasses occur over VN radars daily. The GR processing is largely automated, except for occasional manual checks to ensure only valid precipitation events are included in the database. Since the VN must process radar data from a variety of different radar platforms, the VN software includes data readers to handle each of the various radar data formats (e.g., netCDF, Archive Level-II, SIGMET, Rainbow).
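The overpass trigger described above can be sketched as a simple distance test. This is a minimal illustration, not the VN software: the function names, the use of discrete ground-track points, and the spherical-Earth haversine distance are assumptions of this sketch; only the 200 km threshold comes from the text.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption of this sketch)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def overpass_triggers_processing(radar_latlon, track_latlons, max_km=200.0):
    """Return True if any ground-track point lies within max_km of the radar.

    radar_latlon  : (lat, lon) of the ground radar, in degrees
    track_latlons : iterable of (lat, lon) satellite ground-track points
    max_km        : trigger distance (200 km in the text above)
    """
    rlat, rlon = radar_latlon
    return any(great_circle_km(rlat, rlon, lat, lon) <= max_km
               for lat, lon in track_latlons)
```

For example, a hypothetical radar at (35.0, -97.0) with a track point at (36.0, -97.0) is roughly 111 km away and would activate the GR processing under this test.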

GR Data Processing
Each GR data set is put through automated and manual quality control (QC). First, NASA's Dual Polarimetric Quality Control (DPQC) algorithm is used to identify and remove non-precipitation echoes [38]. If any unwanted echoes remain, the QC algorithm's polarimetric thresholds are manually adjusted and the data are processed through it again [38]. This iterative QC procedure continues until the GR data are deemed acceptable (i.e., a trained operator does not find any remaining clutter or other non-precipitation echoes). Most of the GR datasets ingested by the VN are from operational networks, which typically have routine maintenance schedules; hence, calibration is routinely checked only for NPOL and KPOL reflectivity and differential reflectivity data. For these non-operational radars, the VN uses the Relative Calibration Adjustment technique [39,40], self-consistency of polarimetric variables [41], and vertical profile (or birdbath) scans [42]. While occasional manual verification of operational radar datasets is undertaken, for the large operational radar datasets such as the WSR-88Ds it is more generally assumed that the large statistical database approach, relying on tens of thousands or more data points and extended duration, will provide an unbiased reference comparison with a varied degree of random error (which will be apparent in the following figures). This is because it is highly unlikely that systematic bias will (a) remain at any single radar for multiple years; or (b) exist uniformly in the entire data sample of 100+ radars at any given time over multiple years. We are able to approximately validate this assumption using time-series comparisons of GPM DPR radar reflectivity aloft in stratiform regions where ice dominates. The time-series comparisons show occasional negative and positive departures of GR calibration relative to the GPM, but the departures often oscillate within ±2 dB of the GPM radar values.
We assume this is also the case for biases in the polarimetric moments, and that the large number of samples ensures that our comparisons to GPM are dominated by random error with only minimal bias (i.e., much less than the requirements dictate [18]).
An additional step related to GR data quality is the computation of specific differential phase (Kdp), which is used for attenuation correction and rainfall estimation. The VN uses an adaptive Kdp estimation algorithm developed by [43] to extract the differential propagation phase shift from the total differential phase measured at each range gate along the radar ray and estimate the Kdp. Although many of the GR datasets already contain an estimate of Kdp when ingested by the VN, this new Kdp is used instead.
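To make the quantity concrete: Kdp is half the range derivative of the two-way differential propagation phase. The sketch below estimates it with a fixed-width least-squares slope along a ray. It is not the adaptive algorithm of [43]; the function name, the fixed window, and the assumption that the input phase has already been unfolded and stripped of backscatter differential phase are all choices made for this illustration.

```python
def kdp_from_phidp(phidp_deg, gate_km, half_window=2):
    """Estimate Kdp (deg/km) at each gate of one radar ray.

    phidp_deg   : differential propagation phase per gate (degrees),
                  assumed already unfolded and filtered
    gate_km     : range gate spacing in km (must be > 0)
    half_window : gates on each side used in the local linear fit
    """
    n = len(phidp_deg)
    kdp = []
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        rng = [j * gate_km for j in range(lo, hi)]
        phi = phidp_deg[lo:hi]
        if len(rng) < 2:
            kdp.append(float("nan"))  # not enough gates for a slope
            continue
        r_mean = sum(rng) / len(rng)
        p_mean = sum(phi) / len(phi)
        num = sum((r - r_mean) * (p - p_mean) for r, p in zip(rng, phi))
        den = sum((r - r_mean) ** 2 for r in rng)
        # Least-squares slope is d(phidp)/dr; Kdp is half of it because
        # phidp accumulates over the two-way path.
        kdp.append(0.5 * num / den)
    return kdp
```

A wider window gives a smoother but lower-resolution Kdp; adaptive schemes vary this trade-off with signal strength.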

GR-Based Retrievals of the DSD
The processed (i.e., QC, attenuation corrected) GR data are used to retrieve, amongst other geophysical variables (e.g., rainfall rate), two parameters of the normalized gamma DSD model: Dm, which is defined as the ratio of the fourth to the third moment of the DSD, and the normalized intercept parameter, Nw [11]. The retrieval is derived from T-matrix scattering simulations performed using a very large (>300,000 min of DSDs) and diverse database of optical disdrometer observations collected for GPM GV efforts [17]. Using a third-order polynomial with a sequential intensity filtering technique [44], the Dm is retrieved from the GR observed differential reflectivity, Zdr (after QC and attenuation correction), and used together with the GR observed radar reflectivity (i.e., Zh after QC and attenuation correction) to retrieve Nw. The DSD model-fit of the "ALL campaign" described by [17] produces a relative bias of less than 10% and a relative mean absolute error of less than 15% near the ground. Further details of this retrieval process, including the relevant equations, are found in [17]. It is worth noting here that the Level II radar data the VN obtains from the WSR-88D network include an internal Zdr bias correction, which is monitored by the National Weather Service Radar Operations Center and found to be within 0.2 dB [45]. Additionally, we examined the VN data used in this study and found the Zdr bias to be 0.06 dB for drizzle (as identified by the polarimetric hydrometeor classification) with reflectivity of 15-23 dBZ. Hence, a Zdr bias does not significantly impact the DSD retrievals from the ground-based radars presented in this study.
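The two DSD parameters can be illustrated from a binned drop spectrum, as a disdrometer would provide. The sketch below computes Dm as the ratio of the fourth to the third moment, as defined above, and Nw from the commonly used normalization Nw = (4^4 / (pi rho_w)) (W / Dm^4), which in moment form is (256 / 6) M3 / Dm^4; that normalization is assumed here to be the one meant by [11]. This is not the radar retrieval itself, whose polynomial coefficients are given in [17]. Function names and the trapezoidal integration are choices of this sketch.

```python
import math


def dsd_moment(diam_mm, conc, order):
    """Trapezoidal estimate of the integral of N(D) * D**order dD.

    diam_mm : increasing drop diameters (mm)
    conc    : N(D) at those diameters (m^-3 mm^-1)
    """
    total = 0.0
    for k in range(len(diam_mm) - 1):
        f0 = conc[k] * diam_mm[k] ** order
        f1 = conc[k + 1] * diam_mm[k + 1] ** order
        total += 0.5 * (f0 + f1) * (diam_mm[k + 1] - diam_mm[k])
    return total


def dm_nw(diam_mm, conc):
    """Return (Dm in mm, Nw in m^-3 mm^-1) for a binned DSD."""
    m3 = dsd_moment(diam_mm, conc, 3)
    m4 = dsd_moment(diam_mm, conc, 4)
    dm = m4 / m3                            # mass-weighted mean diameter
    nw = (4.0 ** 4 / 6.0) * m3 / dm ** 4    # normalized intercept parameter
    return dm, nw


# Check with an exponential DSD, N(D) = N0 * exp(-L * D), for which the
# analytic values are Dm = 4 / L and Nw = N0.
diam = [0.01 * k for k in range(1, 3001)]            # 0.01 to 30 mm
conc = [8000.0 * math.exp(-2.0 * d) for d in diam]   # N0 = 8000, L = 2 mm^-1
dm, nw = dm_nw(diam, conc)
```

With these inputs the numerical result should be close to Dm = 2.0 mm and Nw = 8000 m^-3 mm^-1.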

GPM Core Satellite-Based Retrievals of the DSD
The VN includes observations and retrievals from both the DPR and GMI. The GPM/DPR Level 2 algorithm (2ADPR; [46]) uses single and dual frequency measurements from the DPR to retrieve the DSD. Its approach relies on the relationship between rain rate, R, and Dm, and performs a path-integrated attenuation-constrained optimization to find a unique R-Dm relation for each precipitation type, with Nw following from the resultant optimal solution of Dm and R [47]. The Level 2B GPM Combined Radar-Radiometer Precipitation Algorithm (2BCMB) uses both the Level 2 calibrated reflectivity profiles from the DPR and Level 1C GMI brightness temperatures together with an a priori database of particle size distributions and corresponding environmental conditions to retrieve the DSD and other integral rain parameters [48]. It begins with an initial guess of Nw profiles based on the a priori database in relation to the satellite observations and employs an ensemble filter within an optimal estimation framework to find the most likely profile of Nw and Dm [49].

Matching the Satellite and GR Retrievals
The ground radar (GR) and DPR observe the precipitation from different perspectives and at different resolutions. The DPR has a swath of 245 km with a nadir pixel-level footprint of 5 km in diameter and a vertical resolution of 250 m. A typical GR has 125-250 m range gate spacing (depending on whether it is a research or operational radar) and a 3 dB beam width of 1°, giving it better horizontal resolution than the DPR. Hence, the VN applies a geometrical matching of the GR bins for intersecting DPR rays. A thorough discussion of the specifics of the geometrical matching procedure is provided in [25], so here we give only an example for a single DPR ray (Figure 3). To capture the precipitation variability within any given DPR footprint, the VN computes statistics (e.g., mean, median, standard deviation, maximum) of the GR fields at each GR elevation angle intersecting each DPR ray within the GR domain. For the HID field, the GR bins assigned to each HID class at each GR elevation angle intersecting a DPR ray are summed for each class (i.e., histogram of HID). In addition, for each GR elevation angle, the DPR bins are averaged within each intersecting GR ray. These geometrically matched data are stored in netCDF formatted files and archived (see supplemental material for data location online). Matchup files exist for each version of 2ADPR and 2BCMB during GPM overpasses of GR sites when precipitation was detected by the satellite. In this study, we include VN matchups for six years of Version 06A GPM algorithms from 2014 through 2019.
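The per-footprint statistics described above can be sketched as follows, assuming the GR bins falling inside one DPR footprint at one elevation angle have already been selected. The function names, the dictionary layout, the use of the population standard deviation, and the convention that missing bins are `None` are assumptions of this sketch; how the VN handles missing data or whether it averages reflectivity in linear or logarithmic units is not specified here.

```python
import statistics
from collections import Counter


def footprint_stats(gr_values):
    """Summarize GR bin values inside one DPR footprint at one GR sweep.

    gr_values : iterable of floats, with None marking missing bins
    Returns a dict of statistics, or None if no valid bins exist.
    """
    vals = [v for v in gr_values if v is not None]
    if not vals:
        return None
    return {
        "n": len(vals),
        "mean": statistics.fmean(vals),
        "median": statistics.median(vals),
        "stddev": statistics.pstdev(vals),  # population standard deviation
        "max": max(vals),
    }


def hid_histogram(hid_classes):
    """Count GR bins per hydrometeor class inside one DPR footprint."""
    return dict(Counter(hid_classes))
```

For instance, `hid_histogram(["rain", "rain", "graupel"])` gives two rain bins and one graupel bin for that footprint and sweep.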


Results
In this section, we present comparisons of the DSD retrievals (1 km below the melting layer) from 2ADPR and 2BCMB with those from the CONUS ground-based radars (GRs) in the GPM VN matchup archive. The focus is on evaluating the performance of the GPM algorithms and using the VN to help diagnose sources of error. Figure 4 shows the retrievals of Dm from Version 06A of 2ADPR and 2BCMB for stratiform and convective rainfall, with the precipitation type taken from the classification of 2ADPR [46]. Note that, in contrast to that discussed in [17], this comparison focuses on a new algorithm version (V06A) and significantly extends the time period of comparisons (and hence sample numbers). It includes 319,063 (116,218) matchups of Dm retrievals from 2ADPR (2BCMB) for the DPR matched scans (i.e., the Ku-band and Ka-band scans) within its inner swath (2BCMB also includes GMI observations). Both GPM algorithms produce a larger Dm than that of the GR, as in Petersen et al. (2020), with 2BCMB exhibiting a 0.2 mm high bias and 2ADPR exhibiting a slightly lower mean absolute error (MAE), particularly in stratiform precipitation (Figure 4a). However, in convective precipitation the MAE of the 2ADPR Dm retrieval is 0.5 mm, whereas 2BCMB has the lower error. Furthermore, the GPM retrievals of convective precipitation exhibit peculiar behavior at large Dm (>2.5 mm), especially 2ADPR (Figure 4c). In Figure 4c, the GPM retrievals appear to saturate around 3.0 mm, but this is an artifact of the matched scan (MS), in which the Ka-band retrieval of Dm is limited to <3.0 mm. The 2AKu retrieval (i.e., Ku-band only) of Dm does not exhibit such artificial behavior but, as in Figure 4c, it increasingly departs from the ground radars with a marked positive bias (not shown).
Although the GPM retrieval of Dm meets the mission Level 1 requirement of being within ±0.5 mm [50], which is largely due to its superb performance in stratiform precipitation, the V06 DPR algorithms clearly exhibit departures from the GR retrievals of Dm within convective precipitation.
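The two summary statistics quoted above can be computed from matched pairs as in the following sketch. It assumes bias is the mean of (satellite minus GR) and MAE is the mean of the absolute differences; the exact definitions and any normalization used for the figures are not stated in this section, and the function name is hypothetical.

```python
def bias_and_mae(sat, ref):
    """Return (bias, MAE) of satellite retrievals against a reference.

    sat : satellite Dm retrievals (mm), one per matched footprint
    ref : GR Dm retrievals (mm) for the same footprints
    """
    if len(sat) != len(ref) or len(sat) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [s - r for s, r in zip(sat, ref)]
    bias = sum(diffs) / len(diffs)                 # signed mean difference
    mae = sum(abs(d) for d in diffs) / len(diffs)  # mean absolute difference
    return bias, mae
```

The two numbers answer different questions: a retrieval can have near-zero bias yet a large MAE if positive and negative errors cancel, which is why both are reported.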
As previously discussed, disdrometer measurements across the globe and a variety of precipitation types represent the foundation of the DSD retrievals in the VN. Indeed, Figure 5a demonstrates that DSD retrievals by ground radars (GRs) included in the VN database agree quite well with observations from a global disdrometer-based DSD dataset [51].
The DPR retrievals are offset relative to the GR (Figure 5b,c). In addition to the high bias in Dm, the DPR retrievals tend to have significantly lower Nw, by an order of magnitude for 2AKu (Figure 5b). The DSD bias is less severe for the dual-frequency retrieval, but the upper constraint on Dm from the Ka-band information is readily apparent (Figure 5c). However, in precipitation classified as convective by 2ADPR, the GR Dm retrievals are shifted approximately 0.3 mm lower than the global DSD observations classified by [51] as being associated with convective precipitation processes (i.e., their Groups 1, 3, 5, and 6). In fact, both the GR and DPR median Dm retrievals are within 0.2 mm of the observed DSDs classified as convective. The GR retrievals of Nw for convective precipitation agree closely with the observations (Figure 5a), but the DPR algorithms have a problem with the retrieval of Nw in convective precipitation. In stratiform precipitation, the DPR retrievals of Nw are within ±15% of the GR retrieved Nw (not shown).
Figure 6 shows the 2BCMB and 2ADPR retrievals of Nw relative to the GR within the inner swath of the DPR. The behavior is markedly different between the two algorithms. The correlation between the GR and 2BCMB retrievals of Nw is very weak compared to that of the GR and 2ADPR estimates. The DPR retrievals of Nw extend across three orders of magnitude, which is similar to the dynamic range of the GR retrievals, but the 2BCMB retrievals of Nw span less than two orders of magnitude. This is especially the case for convective precipitation. The 2BCMB retrievals of log10(Nw) are largely concentrated around 3.5 (Figure 6d), and this represents a strong a priori database constraint on Nw retrievals in the 2BCMB algorithm. The low bias of 2ADPR estimates of Nw in convection (Figure 6c) is reflective of its high bias in Dm (Figure 4c).


Discussion
We have identified a few issues with the DSD retrievals from the GPM V06A algorithms, predominately in convective rain. In this section, we will further discuss them and utilize the GR data included in the VN to shed some light on potential sources of these errors.
For DSDs comprised of a relatively high number of large raindrops (e.g., Dm > 2.5 mm), the 2ADPR retrievals of Dm become increasingly positively biased relative to GR retrieved Dm, primarily for convective precipitation (Figure 4). Using the polarimetric radar retrievals of hydrometeor type (i.e., HID), we find relatively more rimed ice aloft (e.g., graupel and hail) when 2ADPR retrievals of Dm > 2.5 mm occur within convective precipitation (Figure 7). The presence of larger ice particles such as graupel and hail, and associated potential mixed-phase conditions, is consistent with the presence of larger drops and a modest positive shift in Dm [51,52]. However, these conditions can also produce non-uniform beam filling and multiple scattering at the Ku- and, especially, Ka-bands, thereby complicating the attenuation correction of the DPR [53]. The 2BCMB algorithm includes a solution for handling multiple scattering at the Ka-band [48], which may be a reason for it being slightly less biased at large Dm than 2ADPR is in convective precipitation. However, the 2BCMB retrievals of Dm still exhibit an increasing bias with Dm > 2.5 mm, suggesting it may not be fully accounting for the effects of multiple scattering and/or non-uniform beam filling (NUBF), which can also affect the satellite retrievals in deep convection [54].
In GPM DPR algorithm retrievals of rain rate, the R-Dm relationship plays a dominant role [55]. Moreover, recall that when the Dm bias gets large in the DPR estimates, log10(Nw) is observed to become markedly lower relative to both the VN and ground-based disdrometer measurements (Figure 5). Collectively, this results in an underestimation of rain rate for similar radar reflectivity. Indeed, the impact of a positive bias in the Dm retrievals for the GPM algorithms is readily observed in comparisons of rainfall rate to VN estimates.
For example, if convective rainfall rates associated with large Dm in the outer swath of the DPR (i.e., Ku-only measurements) are extracted from the rain rate sample, and then compared to the VN polarimetric radar-based rainfall rates [10], the effect of large Dm on the rain rate bias is removed (Figure 8).
Reasons for the bias in Dm are not clear. However, in a NUBF situation (more likely to occur in convective precipitation) it is conceivable that measured attenuation at a lower reflectivity value (due to the NUBF) could be overcompensated by an increase in the initial Dm estimate via the R-Dm approach, which would drive down Nw and the rainfall rate.
The GPM retrievals of Nw are significantly biased low. The 2ADPR approach to retrieving Nw is similar to that used with the GR in that it depends upon Dm and reflectivity, Z. However, 2ADPR assumes a fixed shape parameter in the gamma DSD (i.e., µ = 3) and includes an adjustment factor, ε, that defines the R-Dm relationship, which is used to estimate Dm, and the Z-Dm relationship, which is then used to estimate Nw. A thorough description of 2ADPR's DSD retrieval approach is provided in [56]. The optimal ε is determined by comparing the forward (i.e., downward starting at the storm top) recursively estimated attenuation with that observed via the surface-reference technique [47,55]. Since an inherent inverse relationship exists between Nw and Dm (e.g., Figure 5), retrieval approaches such as 2ADPR with a high bias in Dm can in turn produce a low bias in Nw. We find severely negatively biased Nw retrievals in convective precipitation, where the Dm estimates are 0.5–0.6 mm too high relative to the GR. The significantly more accurate Nw estimate in stratiform precipitation is in part evidence for the impact of errors in the Dm retrieval on 2ADPR's retrievals of the DSD and ultimately its rainfall estimates. Additional factors may contribute to 2ADPR's order-of-magnitude Nw bias in convective precipitation. The optimization approach to find the ε used in the 2ADPR DSD retrievals involves simulating the attenuation at the Ka-band, which in convection can suffer from multiple scattering and non-uniform beam filling [54,57]. The assumption of a fixed µ in the formulation of the gamma DSD is certainly another possible source of error; however, it is likely to be of lower impact since µ exhibits little variability over a vast range of Dm and Nw-normalized rainfall rates [58].
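To give a sense of the magnitudes involved, the following sketch compares how much log10(Nw) shifts for a Dm error versus a change in µ at fixed reflectivity. It assumes a normalized gamma DSD and Rayleigh scattering, for which Z = Nw F(µ) Dm^7 with F(µ) = (6/4^4)(µ + 6)(µ + 5)/(µ + 4)^2. This is an idealization rather than the 2ADPR or VN formulation, and the function names are hypothetical.

```python
import math


def z_coefficient(mu):
    """F(mu) such that Z = Nw * F(mu) * Dm**7 (Rayleigh, normalized gamma DSD).

    Units: Z in mm^6 m^-3, Dm in mm, Nw in mm^-1 m^-3.
    """
    return 6.0 / 4.0**4 * (mu + 6.0) * (mu + 5.0) / (mu + 4.0) ** 2


def log10_nw(z_dbz, dm, mu=3.0):
    """log10(Nw) implied by a reflectivity (dBZ) and Dm (mm) for a given mu."""
    z_linear = 10.0 ** (z_dbz / 10.0)
    return math.log10(z_linear / (z_coefficient(mu) * dm**7))


if __name__ == "__main__":
    z_dbz = 45.0  # illustrative reflectivity; the shifts below do not depend on it
    # Effect of a +0.5 mm Dm error at fixed reflectivity and mu = 3.
    shift_dm = log10_nw(z_dbz, 2.5) - log10_nw(z_dbz, 2.0)
    # Effect of changing mu from 3 to 0 at fixed reflectivity and Dm.
    shift_mu = log10_nw(z_dbz, 2.0, mu=0.0) - log10_nw(z_dbz, 2.0, mu=3.0)
    print(f"log10(Nw) shift for Dm 2.0 -> 2.5 mm: {shift_dm:+.2f}")
    print(f"log10(Nw) shift for mu 3 -> 0:        {shift_mu:+.2f}")
```

In this idealized setting, the seventh-power dependence on Dm makes a 0.5 mm overestimate at Dm = 2.0 mm lower log10(Nw) by about 0.7, whereas changing µ from 3 to 0 shifts it by only about 0.1, which is in line with the expectation that the fixed-µ assumption has the smaller impact.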
The 2BCMB estimates of Nw have greater error than those of 2ADPR, except in convection, which is a testament to the 2BCMB attempt to account for multiple scattering and non-uniform beam filling. However, in convection 2BCMB often arrives at Nw estimates between 2500 and 3500 m⁻³ mm⁻¹, a range too narrow to be realistic, which highlights a problem with the algorithm. The 2BCMB DSD retrieval approach differs from 2ADPR in that it first retrieves Nw and then Dm. The 2BCMB relies on a database of Nw profiles that serves as a starting point and constraining a priori distribution for this iterative Bayesian-based approach [49]. Hence, the retrievals are only as representative as the a priori database. This database may not be representative enough in convective precipitation, causing the optimal estimation used in 2BCMB to often converge on the initial Nw profile.
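Why a narrow a priori distribution can hold a Bayesian retrieval near its starting point can be seen in a scalar toy example. The sketch below uses a Gaussian prior on log10(Nw) and a single Gaussian observation. It is not the 2BCMB scheme, and all numbers (prior center, spreads, observation) are hypothetical values chosen only for illustration.

```python
import math


def posterior_mean(prior_mean, prior_sd, obs, obs_sd):
    """Posterior mean for a scalar Gaussian prior and Gaussian observation error."""
    # Weight given to the observation; it approaches 0 as the prior narrows.
    gain = prior_sd**2 / (prior_sd**2 + obs_sd**2)
    return prior_mean + gain * (obs - prior_mean)


if __name__ == "__main__":
    prior_mean = math.log10(3000.0)  # hypothetical prior center in log10(Nw)
    obs, obs_sd = 4.3, 0.5           # hypothetical observation-implied value and error
    for prior_sd in (0.1, 0.8):      # narrow vs. broad a priori spread (hypothetical)
        post = posterior_mean(prior_mean, prior_sd, obs, obs_sd)
        print(f"prior sd {prior_sd:.1f}: posterior log10(Nw) = {post:.2f}")
```

With the narrow prior, the estimate moves only a few hundredths away from the prior center even though the observation points to a much larger value; with the broad prior, the observation dominates. Broadening the a priori database plays the role of increasing the prior spread in this picture.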

Conclusions
The VN offers detailed insights into the inner workings of the GPM algorithms and can help to pinpoint any deficiencies. In this study, we compared DSD retrievals from 2ADPR and 2BCMB with those from numerous ground-based polarimetric radars. We find that both 2ADPR and 2BCMB estimates of Dm are biased high by only 0.2 mm, thereby meeting the level 1 science requirement. However, the algorithms do not estimate the DSD as well in convective precipitation. They tend to increasingly overestimate Dm when graupel and hail are present in the column of convective precipitation, likely causing multiple scattering at the Ka-band and possibly exacerbating errors in the attenuation correction. This is especially true when 2ADPR estimates Dm > 2.5 mm, but less so for 2BCMB.
Additionally, the 2AKa upper constraint on Dm artificially limits the 2ADPR retrievals to Dm < 3.0 mm within the inner (i.e., MS) swath and subsequently affects the rain rate retrievals for these large Dm.
The high bias in Dm partially explains the low bias in the Nw retrievals with 2ADPR, and multiple scattering and non-uniform beam filling are also likely to blame in convective precipitation. Although 2BCMB provides a better estimate of Nw than 2ADPR, its significantly smaller dynamic range highlights the need to broaden its a priori database of convective precipitation and perhaps revisit its optimal estimation approach.
The VN matchup files are available for download from the NASA Distributed Active Archive Center Global Hydrology Resource Center (see Supplementary Materials). Additionally, a project is underway to store the VN dataset in an Athena database hosted on Amazon Web Services to enable GV and other users to readily query and extract subsets of the VN dataset [59]. The VN is continually being enhanced as a tool for precipitation science. It now includes vertical motion retrieved via a multi-Doppler three-dimensional variational (3DVAR) approach [60,61] implemented in the Python Direct Data Assimilation (PyDDA) software package [62] for more than 20 pairs of WSR-88D radars [63]. Hence, the VN now facilitates investigation of microphysical and dynamical processes that occur within the GPM satellite footprint, a vital observation to serve the program of record for future satellite missions focused on convective precipitation processes.