Above-Ground Biomass Retrieval over Tropical Forests: A Novel GNSS-R Approach with CyGNSS

: An assessment of the National Aeronautics and Space Administration NASA’s Cyclone Global Navigation Satellite System (CyGNSS) mission for biomass studies is presented in this work on rain, coniferous, dry, and moist tropical forests. The main objective is to investigate the capability of Global Navigation Satellite Systems Reﬂectometry (GNSS-R) for biomass retrieval over dense forest canopies from a space-borne platform. The potential advantage of CyGNSS, as compared to monostatic Synthetic Aperture Radar (SAR) missions, relies on the increasing signal attenuation by the vegetation cover, which gradually reduces the coherent scattering component σ coh,0 . This term can only be collected in a bistatic radar geometry. This point motivates the study of the relationship between several observables derived from Delay Doppler Maps (DDMs) with Above-Ground Biomass (AGB). This assessment is performed at di ﬀ erent elevation angles θ e as a function of Canopy Height (CH). The selected biomass products are obtained from data collected by the Geoscience Laser Altimeter System (GLAS) instrument on-board the Ice, Cloud, and land Elevation Satellite (ICESat-1). An analysis based on the ﬁrst derivative of the experimentally derived polynomial ﬁtting functions shows that the sensitivity requirements of the Trailing Edge TE and the reﬂectivity Γ reduce with increasing biomass up to ~ 350 and ~ 250 ton / ha over the Congo and Amazon rainforests, respectively. The empirical relationship between TE and Γ with AGB is further evaluated at optimum angular ranges using Soil Moisture Active Passive (SMAP)-derived Vegetation Optical Depth (VOD), and the Polarization Index (PI). Additionally, the potential inﬂuence of Soil Moisture Content (SMC) is investigated over forests with low AGB.


Introduction
Tropical forests are a key terrestrial ecosystem that play a leading role in the global carbon cycle. They are forested landscapes in tropical regions, i.e., land areas approximately bounded by the tropic of Cancer and Capricorn. Biomass is an Essential Climate Variable (ECV) that determines the spatial distribution of carbon in the biosphere [1]. Quantifying the spatio-temporal fluctuations of the biomass could help to understand the dynamics of tropical forests [2]. On the other hand, biomass is also an important parameter of the water cycle. Biomass affects evapotranspiration, which has a strong impact in freshwater supply in river runoff.
The Above-Ground Biomass (AGB) of tropical forests represents an important carbon pool [2]. Improving our capabilities to estimate AGB over this type of forests is essential to further understand both the carbon and the water cycles. At present, AGB estimation mainly relies on the direct estimation 1.
What is the DDMs' response to the interaction of GNSS signals with tropical forests? 2.
What is the GNSS signals saturation level (if any) over tropical forests? 3.
What is the optimum GNSS satellites elevation angle for biomass estimation over tropical forests?
This work can be understood as a follow-on study of the sensitivity analysis performed in [17]. Here, the main interest is on the sensitivity of Γ and TE to AGB over tropical forests as a function of CH, PI, and VOD. The study in [17] was focused on the analysis of the relationship between Γ and the normalized first Stokes parameter over a wide variety of target areas. The "tau-omega" model was found to properly fit this relationship. Several improvements have been introduced: (a) the use of raw DDMs, (b) signal calibration, and (c) the use of several observation geometries. This paper is organized as follows. Section 2 motivates the use of L-band bistatic radar for biomass studies. Section 3 describes the selected observables and the methodology. Section 4 analyses the sensitivity of CyGNSS to AGB, including a comprehensive interpretation of the experimental results. The main conclusions are included in Section 5. Information about test sites and reference data is included in Appendices A and B, respectively. Finally, Appendix C includes supplemental information for the evaluation of results in Section 4.

Background
The CyGNSS mission (down looking antenna gain~14.5 dB, LHCP) was first proposed for ocean wind speed estimation over tropical cyclones using GPS L1 C/A signals of opportunity. The orbital configuration of each CyGNSS satellite is a circular Low Earth Orbit (LEO) with a height~500 km and an inclination angle of~35 • . Each single Delay Doppler Mapping Instrument (DDMI) [12,14] aboard the 8 CyGNSS microsatellites collects forward scattered signals along four specular directions (elevation angle of the incident wave θ e,i equals elevation angle of the reflected wave θ e,s ; θ e,i = θ e,s = θ e ) corresponding to four different transmitting GPS spacecrafts, simultaneously ( Figure 1). As such, CyGNSS allows one to sample the Earth's surface along 32 tracks simultaneously, within a wide range of satellites' elevation angles θ e~ [ 20,90] • over tropical latitudes~ [−40, 40] • . scattering [29]. GNSS signals become mostly RHCP after double scattering, while they become mostly LHCP after direct scattering [29]. The CyGNSS constellation (8 spacecrafts) provides 32 Delay Doppler Maps at a time.
The retrieval of geophysical parameters from CyGNSS requires that the DDMI payload crosscorrelates directly the collected signals against a known replica of the GPS L1 C/A code, after compensating for the Doppler shift [8,12,14]. Consequently, 32 complex reflected DDMs Y ( ,f )  can be modelled using geometrical and scattering related parameters as follows [40]: where T P is the power of the RHCP transmitted signals, ~ 19 cm is the wavelength of the signals at L1, T G and R G are the transmitting and receiving antenna gains, T R , and R R are the ranges from the transmitter and the receiver to the specular point, respectively,  is the Woodward Ambiguity Function (WAF),  is the delay of the signal from the transmitter to the receiver, The bistatic scattering coefficient is defined as follows [41]: Left-HCP (LHCP) because of polarization changes during multiple (scattering processes which include at least double reflections) and direct (scattering processes that only account for single reflections) scattering [29]. GNSS signals become mostly RHCP after double scattering, while they become mostly LHCP after direct scattering [29]. The CyGNSS constellation (8 spacecrafts) provides 32 Delay Doppler Maps at a time.
The retrieval of geophysical parameters from CyGNSS requires that the DDMI payload cross-correlates directly the collected signals against a known replica of the GPS L1 C/A code, after compensating for the Doppler shift [8,12,14]. Consequently, 32 complex reflected DDMs Y r are generated on-board at a time, using a short coherent integration time T c~1 ms. Complex DDMs Y r are incoherently averaged along~1 s to finally generate the so-called power DDMs Y r (τ, f) 2 .
Power DDMs Y r (τ, f) 2 can be modelled using geometrical and scattering related parameters as follows [40]: where P T is the power of the RHCP transmitted signals, λ~19 cm is the wavelength of the signals at L1, G T and G R are the transmitting and receiving antenna gains, R T , and R R are the ranges from the transmitter and the receiver to the specular point, respectively, χ is the Woodward Ambiguity Function (WAF), τ is the delay of the signal from the transmitter to the receiver, f D is the Doppler shift of the reflected signal, f c is aimed to compensate the Doppler shift of the signal, σ 0 is the bistatic scattering coefficient at LHCP, and ρ is the positioning vector of the scattering point. The bistatic scattering coefficient is defined as follows [41]: where σ coh,0 and σ incoh,0 are the coherent and the incoherent scattering terms, respectively. σ incoh,0 describes scattering properties of the surface and the vegetation, i.e., volume scattering. σ coh,0 depends mainly on surface properties and it has a significantly narrower (practically, a delta function) power Remote Sens. 2020, 12, 1368 5 of 29 distribution over scattering angles than σ incoh,0 . Consequently, DDMs consists on a sum of two terms, as follows [28,42,43]: Y r,coh (τ, f) 2 accounts for coherent reflections, while Y r,incoh (τ, f) 2 is responsible for the diffuse scattering. The relative portion of both components mainly depends on the coherent integration time T c , the observation geometry, and both the dielectric and structural properties of the scattering media. Several works found a transition from incoherent to coherent scattering regime over the ocean surface [43,44], where Y r,incoh (τ, f) 2 is dominant for high θ e [45]. Theoretical simulations have been carried out to evaluate the behavior of the incoherent reflected power as a function of θ e [45].
It was stated that Y r,incoh (τ, f) 2 decreases~20% when θ e decreases from θ e~9 0 • to θ e~5 0 • . Over land surfaces, global scale results [46] have shown that Γ decreases with decreasing θ e down to~55 • . This empirical observation was justified because of the effect of the incoherent scattering term σ incoh,0 . On the other hand, Γ increases for lower θ e because of the increasing signal coherence. The reflectivity Γ accounts for contributions from both coherent and incoherent scattering terms along both angular ranges. The most relevant difference between both scattering terms is on the combination of the electromagnetic field vectors. σ coh,0 is the result of the coherent combination of the signals reflected on individual facets within the first Fresnel zone. σ incoh,0 comes from the random combination of electromagnetic signals scattered within the glistening zone that add together at the GNSS-R receiver. When the combination is fully coherent, σ coh,0 accounts for all the reflected power. In the occurrence of diffuse scattering, the random signs of the electric field cross-products cancel out σ coh,0 , and the reflected power is provided by σ incoh,0 . The changes in the angular evolution of Γ depend on the dominant scattering mechanism.

Motivation
It is hypothesized that GNSS-R measurements have a certain sensitivity to biomass over tropical forests ( Figure 2) and this sensitivity can be higher than that of current SAR missions. This hypothesis is based on the following statements: (a) The backscatter intensity measured by monostatic SAR increases with biomass up to a saturation level AGB~100 ton/ha. (b) L-band GNSS signals can partially penetrate the vegetation cover because of the semi transparency of the vegetation at low frequency bands. (c) The bistatic scattering pattern of the coherent surface scattering component follows a delta function along the specular direction, while the incoherent scattered signals spread along all other directions.
(d) The coherent component Y r,coh (τ, f) 2 can be higher than the incoherent one Y r,incoh (τ, f) 2 even from a space-borne platform.  2  2  2  TR  c  2  2  T  r,coh  e  e  2  2  TR ), where R is the LHCP Fresnel reflection coefficient, k is the signal angular wavenumber,  is the surface height standard deviation (related to surface roughness), and  is the transmissivity of the vegetation: where VOD is the Vegetation Optical Depth. VOD is an index of leaf and woody biomass that is used to parameterize absorption effects. VOD data derived at L-band are better correlated to AGB and CH than C-band data because of the higher signal penetration depth at lower frequency bands [48]. C-band is only sensitive to top canopy layers, while L-band is sensitive to the canopy from top to bottom layers [48]. Increasing levels of biomass belong to higher signal attenuation effects. On the other hand, optically derived vegetation parameters, such as, e.g., NDVI are only correlated with green leaf material. Tropical rainforests are characterized by wet biomass and the impact of signal attenuation through the vegetation cover is thus higher than scattering effects [49]. This attenuation becomes higher for decreasing e  because of the larger signal path along the vegetation cover. The A significant difference in the DDMs is found between both study sites. The spreading in the delay domain is much higher over rain than over dry forests. On the other hand, the scattering is quite specular over dry forests.
Y r,coh (τ, f) 2 includes coherent scattering from the surface. It can be expressed analytically as follows [47]: where R is the LHCP Fresnel reflection coefficient, k is the signal angular wavenumber, σ is the surface height standard deviation (related to surface roughness), and γ is the transmissivity of the vegetation: where VOD is the Vegetation Optical Depth. VOD is an index of leaf and woody biomass that is used to parameterize absorption effects. VOD data derived at L-band are better correlated to AGB and CH than C-band data because of the higher signal penetration depth at lower frequency bands [48]. C-band is only sensitive to top canopy layers, while L-band is sensitive to the canopy from top to bottom layers [48]. Increasing levels of biomass belong to higher signal attenuation effects. On the other hand, optically derived vegetation parameters, such as, e.g., NDVI are only correlated with green leaf material. Tropical rainforests are characterized by wet biomass and the impact of signal attenuation through the vegetation cover is thus higher than scattering effects [49]. This attenuation becomes higher for decreasing θ e because of the larger signal path along the vegetation cover. The impact of θ e on Y r,coh (τ, f) 2 is twofold. On one hand it influences the Fresnel reflection coefficient, and on the other hand it influences the exponential decaying factor in Equation (4). For a totally flat surface, the root mean square of the surface height variation is zero, and Y r,coh (τ, f) 2 equals the square of the amplitude of the Fresnel reflection coefficient. R is constant for θ e > 40 • and it decreases significantly for lower angles [27]. However, the exponential factor increases significantly with lower θ e . As a consequence, Y r,coh (τ, f) 2 increases at grazing angles [46].
Finally, it is worth commenting that Y r,coh (τ, f) 2 can be assumed to be roughly independent of the platform's height in a LEO scenario because R T R R . Y r,coh (τ, f) 2 suffers lower free space losses than Y r,incoh (τ, f) 2 , following the classical bistatic radar equation.
Remote Sens. 2020, 12, 1368 7 of 29 Y r,incoh (τ, f) 2 includes two main contributions: (a) surface scattering over areas with moderate to high roughness (small-scale roughness) and topography, and (b) volume scattering [29] from the vegetation cover including leaves, branches, and trunks. The impact of double bouncing scattering can be assumed to be negligible because the LHCP down looking antennas filter out the RHCP component of the total scattered electromagnetic field. The polarization of GNSS signals changes from RHCP to LHCP after direct scattering. However, they become RHCP after double reflections, first from RHCP to LHCP and then from LHCP to RHCP.

Definition of the Selected CyGNSS Observables
CyGNSS Level 2.1 Science Data Record [14,37,38] was selected for this study. The original data were first filtered out using an equivalent "CyGNSS overall quality flag" over land surfaces to improve the quality of the observables. Reflected delay waveforms WF r,raw were obtained from the original DDMs Y r (τ, f) 2 at zero Doppler frequency: The delay bin resolution of the original 17 lag waveforms WF r,raw is~0.2552 GPS C/A chips. A re-sampling and interpolation of the resulting 1700 lag waveforms was performed using a spline method [50] to increase the accuracy of the waveforms, before applying the algorithms to extract the observables. Several observables were considered to evaluate the sensitivity of σ coh,0 and the volume scattering term to forests biomass. The trailing edge width TE was defined as the lag difference between the 70% power threshold of the high-resolution waveforms WF r,threshold and the corresponding maximum power of the waveforms WF r,peak [13]: TE = τ WF r,threshold − τ WF r,peak .
Different power thresholds were tested. The 50% power threshold was found to cut off some lags when this threshold was out of the original 17 lag waveforms. On the other hand, the 90% power threshold provided a lower dynamic range as compare to the 70%. The incoherent scattering term σ incoh,0 is the main contribution to WF r,threshold , while both the coherent σ coh,0 and the incoherent σ incoh,0 scattering terms significantly contribute to the peak power of the waveforms WF r,peak . Y r (τ, f) 2 are composed of both coherent and incoherent scattering terms over tropical forests. The width of these power waveforms depends on the scattering properties, in a differentiated manner depending on the coherent-to-incoherent scattering ratio. The width is expected to be small if the ratio is high because of the strong power contribution of the coherent term Y r,coh (τ, f) 2 to the peak of the power waveforms.
The reflectivity Γ is estimated as the ratio of the reflected WF r,peak and the direct WF d,peak power waveforms peaks, after compensating for the noise power floor and the antennas' gain patterns as a function of the elevation angle θ e as: Γ = WF r,peak /WF d,peak .
The antennas' gain patterns were compensated versus the gain at the corresponding boresight direction (down looking gain ∼14.5 dB, θ e~6 2 • and up looking gain ∼4.7 dB, θ e~9 0 • ), and the difference of both gains at boresight. The compensation of the antennas' gain was implemented as a function of θ e . This is relevant for an accurate estimation of Γ because the transmitted signal power depends on θ e , and because both gain patterns have a different relationship with this variable. The CyGNSS Level 1 Science Data Record variables used in this procedure are highlighted here: raw DDM and Zenith Signal-to-Noise Ratio (SNR) for the estimation of WF r,peak and WF d,peak , while Specular point Rx antenna gain for the information of the down looking antenna gain in the direction of the specular point. The up looking antenna is omnidirectional with a gain ∼4.7 dB at boresight and a half power beam width of ∼57 • . The noise power floor was estimated as the mean value of the DDM subset where the signal is not present [51]. The delay separation between the DDM subset and DDM peak was at least 0.75 chips, and estimation of Γ was discarded if this distance was shorter than 0.75 chips.
The half power beam width (−3 dB) of each down-looking CyGNSS antenna is~30 • , and the −6 dB field-of-view is~45 • . On the other hand, each antenna pointed to the Earth's surface with an elevation angle of θ e~6 2 • (antenna boresight). As such, GNSS signals reflected at low angles θ e down to~20 • can be accurately collected through the main lobe of each antenna.
The nominal mission lifetime is 2 years. This milestone was achieved on March 2019. Experience with CyGNSS wind measurements shows that absolute power measurements is currently one of the challenges of this mission [52,53]. In this work, the direct signal was used to partially improve the estimation of Γ. At present, the mission is operating nominally, and the Automatic Gain Control (AGC) for the Zenith channel was disabled from August 2018. This fact helps to improve the accuracy in the estimation of Γ using Equation (8). On the other hand, Γ could also be inverted from scattering models. However, this approach relies on the assumption that the scattering is totally coherent [51] or totally incoherent [54]. This was the main motivation to use Equation (8) in the estimation of the reflectivity

Auxiliary Forests Vegetation Parameters
The SMAP-Enhanced L3 Radiometer Global Daily 9 km Level L3 SPL3SMP_E Version 1.0 product was used to generate auxiliary vegetation parameters. This product was derived from SMAP radiometer (6 A.M. -6 P.M. data in different arrays) and ancillary data over the global 9-km Equal Area Scalable Earth (EASE 2-0) grid [55][56][57]. The SMAP L-band mission provides a better signal penetration depth over the vegetation as compared to higher frequency bands. As such, it is an adequate candidate to evaluate the relationship between CyGNSS derived TE and Γ with AGB, as a function of VOD, PI, and SMC.
(1) Vegetation Optical Depth (VOD): The retrieval is based on a priori NDVI information obtained from visible-near infrared reflectance data from the NPP/JPSS VIIRS instrument, and land cover type assumptions [55][56][57]. It is used to retrieve the transmissivity γ of the vegetation as an estimation of the attenuation of the electromagnetic signal through the vegetation layer. SMOS-derived VOD has been also used in, e.g., [58].
(2) Polarization Index (PI): It is defined as the difference between the microwave radiometric brightness temperatures at V-pol T BV and H-pol T BH , normalized to their average value as follows [59][60][61][62]: PI normalizes the measurements of the brightness temperatures and it is independent of the physical temperature of the surface. The potential sensitivity of PI to dense forest canopies is based on the assumption that V-polarized emission from soil is attenuated by vegetation, which, on the other hand, emits much less polarized radiation depending on the amount of biomass. This parameter was found to decrease when the biomass increases, independently of the crop type. The performance over tropical forests is evaluated here.
(3) Soil Moisture Content (SMC): The retrieval of SMC is based on the use of the V-pol single channel algorithm [55][56][57] only when favorable surface conditions appear at a given grid cell. Corrections for surface roughness, effective soil temperature, and Vegetation Water Content (VWC) were also applied.

Strategy
TE and Γ were classified into different groups according to different ranges of the satellites' elevation angles θ e in steps of~10 • from θ e~9 0 • to θ e~2 0 • . θ e is an important parameter that remarkably affects the ratio of the coherent Y r,coh (τ, f) 2 to incoherent Y r,incoh (τ, f) 2 scattering components. The contribution of the topography to the incoherent scattering term Y r,incoh (τ, f) 2 was partially filtered out to evaluate only the effect of the volume scattering term. In so doing, only pixels over the surface with a Terrain Ruggedness Index (TRI) lower than~15 were considered [63,64]. This threshold was found, empirically, to limit the effect of topography. Several TRI thresholds were tested. It was found that levels higher than~15 significantly increase the spreading of the TE. On the other hand, there is a remaining incoherent scattering contribution due to the small-scale surface roughness. However, the coherent scattering term Y r,coh (τ, f) 2 largely overpass the incoherent scattering term Y r,incoh (τ, f) 2 associated to the small-scale roughness over land surfaces [28,42,47].
Topography is an important factor that disturbs σ 0 because the local surface slopes modify the size of the scattering area [65]. Quantifying and analyzing the relationship between TE and Γ against AGB and CH is the main focus of this work. In so doing, SMAP-derived VOD, PI, and SMC products were also used to further analyze the results. A common grid was thus required, because of the variety of products with different footprint sizes and spatio-temporal sampling characteristics. A 0.1 • × 0.1 • latitude/longitude grid was selected. All products were averaged using a moving window of 0.2 • in steps of 0.1 • . The spatial resolution is~20 km × 20 km at equatorial latitudes, which is wide enough to account for the across-track spreading~20 km of the DDMs due to Y r,incoh (τ, f) 2 . The ultimate resolution at a cell center is~20 km × 20 km because multiple satellites overpasses with different orientations were considered. This grid also allows us to account for Y r,coh (τ, f) 2 . This term is limited by the along-track resolution~7.6 km, related with the orbital motion and the incoherent integration timẽ 1 s. Finally, it is worth pointing out that this strategy provides a filtering of potential fluctuations of Γ and TE due to remaining (after quality flags removal) noise sources, such as Attitude Determination and Control Subsystem (ADCS), the geolocation of the nominal specular reflection points, and the use of eight different DDMs sources (eight distributed microsatellites) in the moving averaging filter. The selected temporal window was 6 months (01/08/2018 to 31/01/2019). This temporal length is large enough to provide statistically significant results. This specific time period was selected because the AGC was disabled on July 2018, so as to improve the performance in the estimation of Γ using Equation (8). The temporal length was selected as large as possible at the time of starting this work because this study is not focused on the analysis of potential temporal fluctuations of different geophysical parameters. The temporal variations of AGB and CH during this period were assumed to be negligible.

Introduction
CyGNSS-derived TE and Γ were used in this study to improve our understanding in the performance of GNSS-R for biomass studies (Figures 3-8). The analysis was performed over different types of forests, such as Congo rain (Figure 3a Table 1 and Tables A1-A8.  Table 1 and Tables   A1-A8. Speckle is a source of noise that involves diffuse scattering. If the scattering medium is rough with respect to  , the different heights of the scatterers over the glistening zone will randomly shift the phases of the GNSS signals. Some of these paths will destructively interfere and others will constructively interfere. Thus, the signal power level as measured by CyGNSS will change randomly. From this point, it can be stated that the SNR of the reflected signals is affected by tropical forests. The strategy described previously was applied to mitigate this effect. At the same time, the AGB reference data could potentially have some errors because this dataset is derived from CH data and assumed continental allometric relationships. Nonetheless the validation of Avitabile's map [39] Table 1 and Tables A1-A8.  Table 1 and Tables   A1-A8. Speckle is a source of noise that involves diffuse scattering. If the scattering medium is rough with respect to  , the different heights of the scatterers over the glistening zone will randomly shift the phases of the GNSS signals. Some of these paths will destructively interfere and others will constructively interfere. Thus, the signal power level as measured by CyGNSS will change randomly. From this point, it can be stated that the SNR of the reflected signals is affected by tropical forests. The strategy described previously was applied to mitigate this effect. At the same time, the AGB reference data could potentially have some errors because this dataset is derived from CH data and assumed continental allometric relationships. Nonetheless the validation of Avitabile's map [39] Tables 1 and A1, Tables A2-A8.   starts to be dominant for angles lower than e  ~60° [46]. A complementary description of the impact of e  can be found in [46]. Overall, it is also found that the dispersion in the scatter plots is very high over the strongest levels of  , where surface reflectivity starts to play an important role in the overall reflectivity. This significant dispersion is related to the fluctuations of certain surface properties, depending on the surface area under consideration. This dispersion increases for grazing angles, which suggests that surface properties such as rivers start to play an important role. As such,  appears quite high even for high AGB levels.   Table   1.   Table 1.   Table   1.   Table 1. AGB and CH data were obtained from the Avitabile et al. product [39] and the ICESat-1/GLAS space-borne instrument [3,6], respectively. Both biomass references were displayed over the specific target areas, enabling a first qualitative evaluation (Figures 3 and 6). A certain correlation of TE and Γ with AGB and CH was found over the selected study sites (Table 1). This correlation improves over more dense forest canopies. Γ decreases with increasing biomass levels. On the other hand, the spreading of TE increases with increasing levels of biomass. Avitabile's map and CyGNSS sampling properties show a good performance, while GLAS-derived CH data are provided with less spatial density. This observation motivates the selection of AGB from Avitabile et al. as the main biomass reference. CH was used as an auxiliary product to assess the potential impact of this term in the interaction of GNSS signals with the vegetation.  -l and 8i-l) tropical forests. These plots are depicted as a function of CH. They were computed using the mean values of the biomass reference data in steps of TE~1 m and Γ~0.05 dB. The values of these steps were found empirically. This strategy allowed us to filter noise at a pixel level, such as speckle in Y r (τ, f), and potential errors in GLAS-derived AGB. On the other hand, it provided enough sampling density to compute the correlation coefficients, i.e., Spearman for rainforests, and Pearson for other forests (Table 1). Different steps were also tested. Decreasing values of these steps belonged to a noisier behavior in the scatter plots, while increasing values reduced sampling density. The extension of the target areas is large, providing enough data. The errors were assumed to be randomly distributed. Thus, noise could be reduced after averaging, so as to find the underlying relationship between CyGNSS observables and AGB. This work is focused on the analysis of this empirical relationship on a regional scale. The correlation between the mean values of both CyGNSS observables and AGB is clearly visible. Increasing levels of AGB belongs to decreasing Γ and increasing TE.

Sensitivity Analysis as a Function of the Elevation Angle
Speckle is a source of noise that involves diffuse scattering. If the scattering medium is rough with respect to λ, the different heights of the scatterers over the glistening zone will randomly shift the phases of the GNSS signals. Some of these paths will destructively interfere and others will constructively interfere. Thus, the signal power level as measured by CyGNSS will change randomly. From this point, it can be stated that the SNR of the reflected signals is affected by tropical forests. The strategy described previously was applied to mitigate this effect. At the same time, the AGB reference data could potentially have some errors because this dataset is derived from CH data and assumed continental allometric relationships. Nonetheless the validation of Avitabile's map [39] shows a lower root mean square error (RMSE) (reduction down to~74%) and bias (reduction down to~153%) than both original datasets [66,67] over all the continents. An interpretation of the results is provided for each study site: (1) Tropical rainforests (Figures 4 and 5): The empirical relationships of TE and Γ with AGB were fitted by empirically derived polynomial functions (  (Tables A1-A4). The values of the derivative reduce with increasing AGB levels. Uncertainties in AGB estimation were computed as the values of the first derivatives times the uncertainties in the GNSS-R observables. This point is translated into smaller uncertainties in the estimation of AGB with increasing biomass levels. A budget (ignoring noise) is provided assuming uncertainties of, e.g., TE~1 m and Γ~0.1 dB (Tables A5-A8). In this work, the main interest is in the sensitivity, ignoring noise in the measurements when observations deviate from the fits [68]. As such, at a first order, the required GNSS-R sensitivity and the AGB uncertainties are related by the first derivative of the fitting functions [69]. These results suggest that different types of forests have a different response. As such, different models should be applied to retrieve biomass at each type of forest. (Figure 4d) and TE~600 m over Amazon (Figure 4e-h). This experimental evidence is justified because, for increasing vegetation levels ( Figure A1a,c), (a) a higher volume scattering term in σ incoh,0 increases the tail of Y r (τ, f) 2 , and (b) the coherent scattering term σ coh,0 is gradually more attenuated. On the other hand, the measurable AGB dynamic range increases with decreasing θ e , from~140 ton/ha at θ ẽ [80, 90] • to 300 ton/ha at θ e~ [ 20,30] • over the Congo target area, while it increases from~140 ton/ha at θ e~[ 80, 90] • to 170 ton/ha a θ e~ [ 20,30] • over the Amazon site. The measurable AGB range is a function of θ e because of the impact of the different scattering mechanisms at different θ e . The coherent scattering increases at grazing angles, belonging to a higher measurable AGB dynamic range, as compared to a Nadir looking configuration. The improved measurable dynamic range at lower angles is explained because of the higher coherent reflectivity at this geometry Equation (4). This scattering property belongs to an increment of the SNR dynamic range, which in turn improves the sensitivity of TE to the attenuating cover. Coherence effects could also appear after scattering over the vegetation cover if the coherent integration time would set to be long enough, e.g., T c~2 0 ms [27,28,33], so as to filter out the noise and the volume scattering term. However, the volume scattering term is high for lower integration times, e.g., T c~1 ms.

Figures 4 and 5 show that TE increases with increasing AGB up to TE~800 m over Congo
Γ gradually decreases down to~−25 dB with increasing AGB because of the increasing attenuation of the GNSS signals along the vegetation cover ( Figure 5). The measurable AGB dynamic range first increases from θ e~[ 80, 90] • to~ [60,70] • , and later it reduces with decreasing angles from θ e~ [ 60,70] • to~ [20,30] • . On the other hand, Γ is significantly stronger at θ e~ [ 20,30] • than at higher observation angles. This is a symptom that the coherent scattering term σ coh,0 is dominant at grazing geometries, even over tropical rainforests [46]. The coherent scattering term σ coh,0 is mainly associated with surface scattering. As such, the sensitivity of Γ to AGB reduces when σ coh,0 is dominant. Indeed, it is found that the RMSE of the scatter plots increases with increasing Γ levels. Potential fluctuations of surface properties have a stronger impact at this geometry ( Figure 5). With this respect, the performance of TE is better than Γ. The information in WF r,peak is poorly related to biomass when σ coh,0 is dominant. However, the higher SNR dynamic range is linked to a higher TE dynamic range, which in turn increases the sensitivity of TE to AGB, although the sensitivity of Γ is poor. These observations suggest that the sensitivity of Γ to forests canopies improves when the incoherent scattering term is dominant over the coherent one σ coh,0 [46]. Indeed, the highest AGB dynamic range appears at θ e~ [ 60,70] • . The interpretation is two-fold: (a) a larger signal propagation path through the vegetation, as compared to that corresponding to higher θ e , improves the sensitivity to biomass, and (b) the coherent scattering term σ coh,0 starts to be dominant for angles lower than θ e~6 0 • [46]. A complementary description of the impact of θ e can be found in [46].
Overall, it is also found that the dispersion in the scatter plots is very high over the strongest levels of Γ, where surface reflectivity starts to play an important role in the overall reflectivity. This significant dispersion is related to the fluctuations of certain surface properties, depending on the surface area under consideration. This dispersion increases for grazing angles, which suggests that surface properties such as rivers start to play an important role. As such, Γ appears quite high even for high AGB levels.
The sensitivity over Congo rainforests shows a higher angular variability than over the Amazon. This experimental evidence could be linked to different structural properties of the vegetation cover. The AGB distribution over the Amazon is a bimodal one with the two peaks at~200 and~250 ton/ha ( Figure A1c). The re-radiation pattern could have isotropic properties because of the small separation between both modes. This characteristic could explain the small angular variability of the sensitivity. On the other hand, a bimodal distribution is also found over the Congo ( Figure A1a). The two modes are much more differentiated, with peaks at~300 and~500 ton/ha. It is hypothesized that this characteristic could belong to a much more complex structure of the canopy layer, and a higher angular variability.
The assessment using TE and Γ shows that the maximum correlation coefficients and slopes of the linear regressions (Table 1) appear gradually at decreasing θ e with decreasing CH levels: θ e~[ 80, 90] • over coniferous, θ e~ [ 60,70] • over dry, and θ e~ [ 40,50] • over moist forests ( Table 1). The angular differences are small. The RMSE is high even for low AGB levels. A reasonable justification is provided, despite the fact that these aspects complicate the interpretation of the results.
Larger signal propagation paths through the vegetation cover are required to increase the sensitivity with decreasing CH levels. On the other hand, the surface contribution partially masks the vegetation effects at grazing geometries. This latter aspect can be observed more clearly at θ e~ [ 20,30] • , where the correlation coefficients with AGB are quite low (Table 1). This behavior is different to that over rainforests, because low AGB levels belong to a stronger contribution of surface scattering in WF r,peak . Finally, it is worth pointing out that the correlation of AGB with Γ is much higher than with TE. This observation is also justified because of the dominant effect of surface scattering (Figure 2). On the other hand, the influence of volume scattering is low because of the low AGB levels. This point is translated into lower correlation coefficients with TE, because the sensitivity of WF r,threshold to AGB is almost negligible in this scenario. Over forests with low AGB~< 40 ton/ha, the soil surface contribution to the case of GNSS-R is dominant. This explains the lower spreading of the waveforms, as compared to rainforests (Figure 2). In the latter scenario, the volume scattering is very strong.

Evaluation with SMAP-derived VOD, PI, and SMC
At present, most of biomass monitoring studies are based on the use of visible and infrared indices. However, these types of measurements are only sensitive to the upper canopy layers and they additionally suffer from weather conditions and clouds. On the other hand, this study is based on measurable biomass parameters (AGB, CH). These considerations motivate to complement this work with microwave radiometry-derived parameters, such as VOD and PI. There is an increasing interest on the use of both parameters to estimate some vegetation properties, such as VWC and AGB [58][59][60][61][62]. Additionally, SMC is considered over coniferous, dry, and moist forests, because of the potential impact of this parameter in the performance of CyGNSS over forests with low AGB levels ( Figure A1). Over tropical rainforests, the impact of SMC is assumed to be negligible [24]. As a matter of fact, SMC data are not fully reliable over very dense forest canopies.
The functional relationship of AGB with both TE and Γ is further evaluated at the specific angular ranges where the optimum performance is found in terms of correlation coefficients and AGB dynamic ranges (Figures 4, 5, 7 and 8). This is a case by case study that depends on the characteristics of each type of forest: Congo rain (TE at θ e~ [ 20,30] • and Γ at θ e~ [ 60,70] • ), Amazon rain (TE at θ e~ [ 20,30] • and Γ at θ e~ [ 60,70] • ), coniferous (TE and Γ at θ e~[ 80, 90] • ), dry (TE and Γ at θ e~ [ 60,70] • ), and moist (TE and Γ at θ e~ [ 40,50] • ) forests. An interpretation of the results is provided for each target area: (1) Tropical rainforests (Figures 9 and 10): VOD increases with increasing AGB, so as to provide an estimation of the attenuation by the vegetation cover. The sensitivity of VOD to AGB is high up to AGB 350 ton/ha (TE~800 m and Γ~−25 dB) over Congo (Figures 9a and 10a), and up to AGB~250 ton/ha (TE~600 m & Γ~−25 dB) over Amazon (Figures 9b and 10b). This sensitivity is qualitatively similar to that of CH (Figure 4d,h and Figure 5b,f). Recently, a good correlation of Soil Moisture Ocean Salinity (SMOS)-derived VOD with CH has also been reported [48,70]. On the other hand, SMAP-derived VOD is linearly related to VWC trough a parameter that accounts for the structural effects of the vegetation cover [55]. As such, it seems reasonable to find this dependence with CH, although VWC is indirectly inferred from NDVI.   Microwave radiometric brightness temperatures, T BH and T BV , depend on frequency, polarization, and the geometrical and dielectric properties of the scattering media. Different polarization characteristics of bare soil and vegetation suggest the potential use of PI for biomass studies [59][60][61][62]. PI should gradually decrease with increasing AGB levels because (a) V-polarized emissivity from soil is attenuated by the upwelling vegetation cover, and (b) the radiation component due to forest canopies is roughly unpolarized. Figures 9 and 10 confirm these theoretical expectations. The polarization degree decreases with increasing AGB, without an apparent saturation. Additionally, it is found that PI takes different values for the Congo and Amazon for a given AGB level. This empirical observation suggests that PI slightly depends on CH additionally to AGB, at least over tropical rainforests. A similar dependence with CH is also found for VOD.
(2) Coniferous, dry, and moist tropical forests (Figures 11 and 12): A significant sensitivity of VOD is also found for low levels of AGB, over coniferous (Figures 11a and 12a), dry (Figures 11b and 12b), and moist (Figures 11c and 12c) forests. The sensitivity improves with increasing CH levels, from moist to coniferous forests, despite the roughly similar AGB levels ( Figure A1e,g,i). On the other hand, PI also performs well over this type of CH-driven forests (Figures 11d-f and 12d-f). The PI dynamic range over dry forests~[0.05 0.075] is slightly lower than over coniferous forests~[0.05 0.1] because of the lower CH dynamic range ( Figure A). At the same time, it is found that PI is higher over moist forests~[0.06 0.085] than over dry forests, because of the slightly lower CH levels ( Figure A). Remote Sens. 2020, 12, x FOR PEER REVIEW 19 of 30     The results are independent of SMC over coniferous and dry forests. Over coniferous forests, increasing levels of SMC up to~0.2 m 3 /m 3 are found for gradually increasing TE (Figure 11g) and decreasing Γ (Figure 12g). This observation means that CH ( Figure A1f) has a higher impact than SMC in both observables, TE and Γ. An increment in SMC would reduce TE and increase Γ, in the case that SMC would be dominant. Over dry forests, SMC~0.18 m 3 /m 3 is quite homogenous along both TE ( Figure 11h) and Γ (Figure 12h) dynamic ranges. As such, changes on VOD and PI are assumed to be mainly related to the effects of the vegetation cover. Finally, in the case of moist forests, SMC influences TE (Figure 11i) and Γ (Figure 12i). Increasing levels of SMC up to~0.25 m 3 /m 3 are found for decreasing TE (Figure 11i) and increasing Γ (Figure 12i). Even in this scenario, VOD (Figures 11c and  12c) and PI (Figures 11f and 12f) are found to have a certain degree of sensitivity to canopy forests.

First Evidences of Biomass Retrieval Using an Empirical Approach
Finally, the ability to retrieve biomass using an empirical approach is evaluated ( Figure 13). To do so, the empirically derived polynomial fitting functions corresponding to the relationship between TE and AGB at low elevation angles θ e~ [ 20,30] • are used to provide a preliminary AGB estimation at the two selected test sites over Congo and Amazon tropical rainforests. The percentage of data used as training and validation subsets ranged from 30% and 70%, respectively. On the other hand, the AGB map provided by Avitabile et al. is used as the AGB reference. Small absolute errors in AGB estimation are found except in areas with AGB >~350 ton/ha and in areas with little biomass and the presence of inland water bodies (Figure 3). These promising preliminary results should be further investigated in future works also using ground truth data for validation. Remote Sens. 2020, 12 Over rainforests, the optimum geometry for the use of TE appears at grazing angles because: (a) a higher volume scattering term increases the tail of On the other hand, the optimum geometry for  is found at moderate e  , when the incoherent scattering term is still dominant and the signal propagation path along the vegetation cover is higher than at a Nadir-looking configuration. Then, the first derivative of the corresponding polynomial fitting functions of the relationship between AGB with TE and  was computed. It was found that there is a certain sensitivity to AGB up to ~350 and 250 ton/ha, respectively, over Congo and Amazon target areas, without an apparent saturation. The first maps of absolute error in AGB retrieval from space using GNSS-R were included, showing a small error. These findings suggest the possibility of accurate AGB retrieval using GNSS-R on-board small satellites such as, e.g., CyGNSS. Over forests with low AGB ~ < 40 ton/ha, the optimum geometry for the use of both parameters TE and  appears at decreasing e  with gradually decreasing CH levels. This point is explained because an increasing signal propagation path through the vegetation is required to improve the sensitivity with decreasing CH levels. The correlation coefficients of AGB with  are higher than those with TE . The sensitivity of TE to AGB is low because of the negligible volume scattering

Conclusions
Earth-reflected GNSS signals collected in a bistatic radar configuration by CyGNSS, show a significant sensitivity to biomass over rain, coniferous, dry, and moist tropical forests. Future studies with the U.K. TDS-1 mission would help to also cover boreal forests. The sensitivity has been evaluated using several parameters derived from DDMs, as a function of θ e . The optimum results in terms of correlation coefficients and AGB dynamic ranges were found at different angles: Congo rain (TE at θ ẽ [20,30] • and Γ at θ e~ [ 60,70] • ), Amazon rain (TE at θ e~ [ 20,30] • and Γ at θ e~ [ 60,70] • ), coniferous (TE and Γ at θ e~[ 80, 90] • ), dry (TE and Γ at θ e~ [ 60,70] • ), and moist (TE and Γ at θ e~ [ 40,50] • ). The SNR reduces at low elevation angles for both direct and reflected signals. However, we use Γ. This observable is more related with geophysical parameters than the SNR. In particular, Γ increases at lower angles because of the lower effective surface roughness at this geometry.
Over rainforests, the optimum geometry for the use of TE appears at grazing angles because: (a) a higher volume scattering term increases the tail of Y r (τ, f) 2 , and (b) a higher coherent scattering term Y r,coh (τ, f) 2 belongs to an improved sensitivity of WF r,peak to the attenuating cover. On the other hand, the optimum geometry for Γ is found at moderate θ e , when the incoherent scattering term Y r,incoh (τ, f) 2 is still dominant and the signal propagation path along the vegetation cover is higher than at a Nadir-looking configuration. Then, the first derivative of the corresponding polynomial fitting functions of the relationship between AGB with TE and Γ was computed. It was found that there is a certain sensitivity to AGB up to~350 and 250 ton/ha, respectively, over Congo and Amazon target areas, without an apparent saturation. The first maps of absolute error in AGB retrieval from space using GNSS-R were included, showing a small error. These findings suggest the possibility of accurate AGB retrieval using GNSS-R on-board small satellites such as, e.g., CyGNSS. Over forests with low AGB~< 40 ton/ha, the optimum geometry for the use of both parameters TE and Γ appears at decreasing θ e with gradually decreasing CH levels. This point is explained because an increasing signal propagation path through the vegetation is required to improve the sensitivity with decreasing CH levels. The correlation coefficients of AGB with Γ are higher than those with TE. The sensitivity of TE to AGB is low because of the negligible volume scattering term over these types of forests. In this scenario, the spreading of Y r (τ, f) is low, which reduces the sensitivity of WF r,threshold to biomass.
Finally, it is worth pointing out that the application of a moving averaging filter minimizes potential errors in the estimation of Γ due to inaccuracies in the signal calibration. The uncertainty in establishing the relationship between CyGNSS measurables and forest parameters is due to (a) potential errors in CH data derived from GLAS and (b) noise at the pixel level, such as speckle in Y r (τ, f). Overall, RMSE in the scatter plots increases with increasing biomass levels, while the correlation coefficients increase. The mean monthly temperature is over~18 • C along the year and the average annual rainfall is over~1680 mm [71]. Both study sites contain the two major carbon pools on Earth. The dominant International Geosphere Biosphere Program (IGBP) land cover type is evergreen broadleaf forests, which are characterized by wet biomass. This type of biomass is associated with signal attenuation effects [49]. On the other hand, dry biomass is associated with incoherent scattering effects because vegetation albedo is higher over drylands with forests [72]. Albedo is related with land cover type heterogeneity and structural effects of the canopy layer [49]. It is worth noting that both attenuation (linked to vegetation opacity) and scattering (linked to albedo) can be found in a general case. Evergreen broadleaf forests comprise many species with rather unique vegetation structural patterns. They consist of different vertical layers, including overstory, canopy, understory, shrub layer, and ground level. As such, dielectric and structural properties are complex and heterogeneous. The AGB distribution over Congo differs from that over Amazon, although the CH distribution is relatively similar over both sites.
Coniferous tropical forests ( Figure A1e,f and Figure A2c) appear mainly in North and Central America. In this work, the selected target area was over Mexico (Latitude = [19,28] • , Longitude = [−104, −98] • ). This type of forest is characterized by low levels of precipitation and a moderate variability in temperature [73]. The dominant IGBP land cover type is woody savannas, which are characterized by diverse species of conifers (firs, pines, spruces, etc). The CH distribution is spread up to~35 m, although the AGB is low~< 40 ton/ha. Dry tropical forests ( Figure A1g,h and Figure A2d) appear mostly in tropical and subtropical bands. In this work, the selected target area was over Zambia (Latitude = [−14, −9] • , Longitude = [29, 33] • ). These forests are dominated by warm climates. They receive several hundreds of centimeters of rain per year, although they have long dry seasons that can last for several months [74]. Deciduous trees predominate in most of these forests. The CH distribution spreads up to~20 m and AGB is low < 40 ton/ha. Moist tropical forests ( Figure A1i,j and Figure A2e) are discontinuously found over patches on the equatorial belt and between the tropics of Cancer and Capricorn. They are characterized by low variability in annual temperature and high levels of rainfall. Forest composition is dominated by semi-evergreen and evergreen deciduous tree species [71]. In this work, the selected target area was over Brazil ( [75] contain the highest levels of AGB on Earth [39]. On the other hand, Amazon rainforests spread over ~6 million km 2 [76]. The world's tropical forests store ~ 250 billion tons of carbon.
Overall, the selected target areas cover a wide variety of forests at a pantropical scale. Additionally, different surface moisture conditions are expected to be found over different target areas. This aspect was also evaluated in this work.   [75] contain the highest levels of AGB on Earth [39]. On the other hand, Amazon rainforests spread over ~6 million km 2 [76]. The world's tropical forests store ~ 250 billion tons of carbon.
Overall, the selected target areas cover a wide variety of forests at a pantropical scale. Additionally, different surface moisture conditions are expected to be found over different target Figure A2. Photos of the selected study sites: (a) Congo rain, (b) Amazon rain, (c) coniferous, (d) dry, and (e) moist tropical forests. Congo rainforests (~2 million km 2 ) [75] contain the highest levels of AGB on Earth [39]. On the other hand, Amazon rainforests spread over~6 million km 2 [76]. The world's tropical forests store~250 billion tons of carbon.
Overall, the selected target areas cover a wide variety of forests at a pantropical scale. Additionally, different surface moisture conditions are expected to be found over different target areas. This aspect was also evaluated in this work.

Appendix B.1 GLAS ICESat-1 Canopy Height Map
The selected reference CH data is the global map generated by Healey et al. using data collected by the ICESat-1/GLAS payload from 2004 to 2008 [3,6]. GLAS is a waveform sampling lidar that uses three lasers, each one with a nominal lifetime of 18 months, and a spatial resolution of~60 m along-track and~170 m across-track. Only one laser operates at a time to achieve the nominal mission lifetime. GLAS transmits~40 Hz pulses using infrared light λ~1064 nm to detect changes on the elevation of the Earth's surface and visible green light λ~532 nm to measure clouds and aerosol heights. In so doing, ICESat-1 points~0.3 • off-Nadir during nominal operations to mitigate the potential damage of the detector due to specular reflections of laser pulses from mirror-like surfaces (e.g., calm water). ICESat-1 mission includes estimation of vegetation height as a secondary mission objective. The shape of the Earth-reflected waveforms depends on the vertical distribution of vegetation and surface properties. Level L1a waveforms were used to calculate the so called Lorey's height, which is the crown-area-weighted height. The applied algorithm [3,6] uses a height correction factor based on the trailing waveform edge extent to correct for topographic effects because of the non-homogeneous sampling properties of GLAS. The final product is delivered with a spatial resolution of~0.2 km.

Appendix B.2 Pantropical Above-Ground Biomass Map
The selected reference AGB data was the integrated pantropical map developed by Avitabile et al. [39]. This map combines Saatchi's [66] and Baccini's [67] AGB datasets into a~1-km resolution product. In so doing, it uses an independent reference of field observations for validation, and high-resolution biomass maps locally calibrated, harmonized, and upscaled to 14,477 AGB estimates with a resolution of~1 km as inputs for the fusion algorithm. The data fusion approach applies a bias removal and a weighted linear averaging. Baccini's and Saatchi's patterns use ICESat-1/GLAS as a primary data source, a similar strategy to upscale lidar data, and they assume continental allometric relationships. However, Baccini's dataset covers tropical latitudes~ [−23.4, 23.4] • , while Saatchi's dataset has a wider coverage. As such, the fusion model is first applied to the common area and then it is extended to the area where Saatchi's dataset is available. In this region, the fusion algorithm only removes the bias of the Saatchi's dataset using the values estimated over Baccini's coverage. The validation of Avitabile's map shows a lower RMSE (reduction down to~74%) and bias (reduction down to~153%) than both original datasets over all the continents.