Democratic Republic of the Congo Tropical Forest Canopy Height and Aboveground Biomass Estimation with Landsat-8 Operational Land Imager (OLI) and Airborne LiDAR Data: The Effect of Seasonal Landsat Image Selection

Inventories of tropical forest aboveground biomass (AGB) are often imprecise and sparse. Increasingly, airborne Light Detection And Ranging (LiDAR) and satellite optical wavelength sensor data are used to map tree height and to estimate AGB. In the tropics, cloud cover is particularly prevalent and so several years of satellite observations must be considered. This may reduce mapping accuracy because of seasonal and inter-annual changes in the forest reflectance. In this paper, the sensitivity of airborne LiDAR and Landsat-8 Operational Land Imager (OLI) based dominant canopy height and AGB 30 m mapping is assessed with respect to the season of Landsat acquisition for a ~10,000 Km2 tropical forest area in the Democratic Republic of the Congo. A random forest regression estimator is used to predict and assess the 30 m dominant canopy height using LiDAR derived test and training data. The AGB is mapped using an allometric model parameterized with the dominant canopy height and is assessed by comparison with field based 30 m AGB estimates. Experiments are undertaken independently using (i) only a wet season Landsat-8 image, (ii) only a dry season Landsat-8 image, and (iii) both Landsat-8 images. At the study area level there is little reported sensitivity to the season of Landsat image used. The mean dominant canopy height and AGB values are similar between seasons, within 0.19 m and 5 Mg ha−1, respectively. The mapping results are improved when both Landsat-8 images are used with Root Mean Square Error (RMSE) values that correspond to 18.8% of the mean study area mapped tree height (20.4 m) and to 41% of the mean study area mapped AGB (204 Mg ha−1). The mean study area mapped AGB is similar to that reported in other Congo Basin forest studies. The results of this detailed study are illustrated and the implications for tropical forest tree height and AGB mapping are discussed.


Introduction
Tropical forests play a key role in the terrestrial carbon cycle with globally significant amounts of carbon stored as aboveground biomass (AGB) [1][2][3]. National inventories of forest AGB are incomplete and imprecise in many tropical countries for several reasons and primarily because tropical forests have highly variable structure and composition that make them difficult to survey [4][5][6]. Forest AGB can be

Study Area
The study area covers 137 km × 80 km of Mai Ndombe province in the DRC (Figure 1). It was selected because it contains airborne LiDAR data and field plot data that can be used to derive AGB. In addition, it is located within a single Landsat Path (180) and Row (061) which reduces Landsat data processing complexity as overlapping images sensed from adjacent Landsat orbits [27] do not need to be processed. The study area falls in the tropical monsoon climate zone and because it lies close to the Equator has two wet and two dry seasons [28]. The Inongo weather station, operated by the Congolese national society of meteorology, is situated near the center of the study area. The annual mean temperature is 24 °C [29] and the annual total rainfall is 1800 mm that falls typically over about 115 days [30]. The main wet season is from September to December. The principal dry season extends from June to August, with a secondary dry season from January to February.
The majority of the study area is covered by dense tropical evergreen rainforest with low lying parts that can be flooded in the wet seasons and includes the northern end of Lake Mai Ndombe [19,31]. People subsist on the terra firme non-forest rural complex (evident in Figure 1a in pink tones), primarily growing cassava, corn, sorghum, upland rice, and peanuts and practicing slash and burn agriculture [32,33]. The main forest characteristics are high tree crown cover (70-100%) with mature tree heights of 35-45 m and predominantly evergreen heterogeneous shade tolerant species [30,31]. The interior forest is relatively undisturbed but at risk of deforestation and degradation due to unregulated resource exploitation and limited governance on timber harvesting, charcoal production, and mining [34,35]. The majority of the study area is covered by dense tropical evergreen rainforest with low lying parts that can be flooded in the wet seasons and includes the northern end of Lake Mai Ndombe [19,31]. People subsist on the terra firme non-forest rural complex (evident in Figure 1a in pink tones), primarily growing cassava, corn, sorghum, upland rice, and peanuts and practicing slash and burn Remote Sens. 2020, 12, 1360 4 of 21 agriculture [32,33]. The main forest characteristics are high tree crown cover (70-100%) with mature tree heights of 35-45 m and predominantly evergreen heterogeneous shade tolerant species [30,31]. The interior forest is relatively undisturbed but at risk of deforestation and degradation due to unregulated resource exploitation and limited governance on timber harvesting, charcoal production, and mining [34,35].

Airborne LiDAR Transect Data
Airborne discrete-return LiDAR transects were flown across the DRC in support of a World Wildlife Fund (WWF) carbon mapping and modelling project [16]. The transects were selected based on a systematic random sampling design where a 1 • × 1 • grid was overlaid on the forest cover map of the country produced by Observatoire satellital des forest d'Afrique Centrale (OSFAC) [16]. An Orion M300 LiDAR was flown at 700 m aboveground level with a 150 kHZ pulse frequency and a laser beam divergence of 0.25 mRad to provide an average density of 4 returns per square meter with a nominal footprint size of 0.17 m. The airborne LiDAR data were flown to have positional spatial errors no greater than 0.05 m horizontally and 0.10 m vertically. The data are available categorized as ground or non-ground returns [36]. Four airborne transects were flown over the study area, three were flown on 1 July 2014 (the N.W., S.W., and S.E. transects illustrated in Figure 1a) and one on 12 August 2014 (the N.E. transect). Each transect is approximately 2 km wide and 10 km long, i.e., each covers approximately 20 km 2 (2000 hectares).

Landsat-8 Operational Land Imager (OLI) Data
The Landsat-8 Operational Land Imager (OLI) provides 30 m optical wavelength images with improved radiometry and geolocation compared to previous Landsat sensors [37]. The most recent Collection 1 Landsat images that are defined with per-pixel cloud and quality information [38] were used. In order to select contemporaneous imagery, two years of Landsat-8 OLI images acquired from July 2013 to August 2015, i.e., from one year before to one year after the airborne LiDAR data acquisitions, were considered. Only those OLI images that had <20% cloud cover (defined by the Collection 1 "land cloud cover" metadata) and that were cloud-free over the four LiDAR transects were selected. In total, out of 48 OLI images acquired over the two years, only two images, sensed 14 July 2013 (i.e., dry season) and 8 December 2014 (i.e., wet season) met these selection criteria. This is not surprising given the prevalence of cloud at the time of Landsat overpass that is evident in Figure 2.
The Landsat-8 OLI has nine 30 m reflective wavelength (435 nm to 2200 nm) bands. In this study OLI bands 3 (Green,~560 nm), 4 (Red,~655 nm), 5 (NIR,~865 nm), 6 (Shortwave Infrared, 1610 nm), and band 7 (Shortwave Infrared,~2200 nm) were used. The two OLI blue bands were not used because of their sensitivity to atmospheric scattering [39,40]. The surface reflectance rather than top of atmosphere reflectance was used to minimize the effects of atmospheric contamination that can be particularly significant over the tropics due to high water vapor content and biogenic and pyrogenic aerosols. The surface reflectance imagery, were obtained from the United State Geographical Survey (USGS) website (https://earthexplorer.usgs.gov) and are derived using the Land Surface Reflectance Code (LaSRC) [40].
July 2013 to August 2015, i.e., from one year before to one year after the airborne LiDAR data acquisitions, were considered. Only those OLI images that had <20% cloud cover (defined by the Collection 1 "land cloud cover" metadata) and that were cloud-free over the four LiDAR transects were selected. In total, out of 48 OLI images acquired over the two years, only two images, sensed July 14th 2013 (i.e., dry season) and December 8th 2014 (i.e., wet season) met these selection criteria. This is not surprising given the prevalence of cloud at the time of Landsat overpass that is evident in Figure 2.  Figure 1) for two years from July 2013 to August 2015. In total, out of 48 images, 11 had a cloud cover < 20%, 23 had cloud cover ≥50%, and 16 had cloud cover ≥80%.

Aboveground Biomass Field Plot Validation Data
Field plot data used to validate AGB were collected in support of the Mai Ndombe Emission Reductions Program, a World Bank coordinated program (under the Forest Carbon Partnership Facility Carbon Fund) that aims to provide benefits for the local population while reducing greenhouse gas emissions from deforestation and forest degradation [41]. The field plots were in undisturbed primary forest [36]. The field plot data were collected November 2015 where the airborne LiDAR data had been flown the previous year (recall the LiDAR data were flown July and August 2014). The field plot data were collected in two of the study area LiDAR transects (the NE and SW transects illustrated in Figure 1a). In each transect, two field plots of one hectare situated <10 km apart were surveyed on the ground. Thus, there were four field plots in the study area.
Each field plot (1 ha) was divided into sixteen 25 m × 25 m (0.0625 ha) parcels. In each parcel, the diameter at breast height (1.3 m) for all trees with diameter >10 cm was measured. The species of each tree was identified by ecologists following the procedure described in [36] and the tree wood density was assigned using the Global Wood Density Database for tropical trees [7]. In cases where tree identification was not possible, the mean wood density of the plot was assigned. The tree AGB in each parcel was derived using a standard allometry model [10] as was undertaken by [36]: where AGB is the estimated tree aboveground biomass in the parcel (Mg ha −1 ), a is the parcel area (ha), d is the diameter at breast height of each tree in the parcel (cm), ρ is the wood density of each tree in the parcel (g cm -3 ), n is the number of trees in the parcel, and e is an environmental stress parameter that depends on the seasonality of temperature, precipitation, and the climatic water deficit [10]. There were 16 parcels and so 16 AGB estimates were derived as (1) for each one hectare field plot, except for one Remote Sens. 2020, 12, 1360 6 of 21 field plot (located over the study area NE LiDAR transect) where there were 15 AGB estimates as no data was collected in one of its parcels. This provided a total of 63 AGB 25 m × 25 m parcel estimates. No field measurements of the non-forest vegetation, i.e., grasses and shrubs, were made. The AGB of grasses and shrubs in tropical forests is not well documented, and although a minor fraction of tree AGB, post tropical forest disturbance (e.g., due to deforestation, degradation, or fire) grass, shrubs, and tree saplings grow rapidly [42]. However, in this study, only tree AGB was considered.

Forest Mask Classification
A 30 m forest mask was derived so that only the forest parts of the Landsat images would be considered. This is needed as about a quarter of the study area is composed of water and seasonally inundated soil (Figure 1). The Landsat images were acquired in the dry and wet seasons and so the lake level and the soil and cloud conditions were different between the images. In addition, the Landsat cloud and shadow mask was not always reliable, which has been observed by [43,44]. Therefore, in order to provide a reliable forest mask, a supervised random forest classification applied to both images was undertaken, which is a state of the practice land cover classification approach [45], and used to define forest, water, permanent wet soil (wet in both Landsat images), dry soil, cloud, and shadow classes. Training data were derived by visual interpretation of both Landsat images. Care was taken to ensure that the forest training pixels did not include mixed forest and non-forest pixels (e.g., over forest edges and small forest clearings) as they likely contain shrubs and grasses whose AGB is unknown. Care was also taken to select training samples across the study area and to ensure that the proportion selected among the different classes reflected the visually estimated study area class proportions in order to provide approximately similar class training portions as found by random sampling [46].
A total of 7280 training pixels of 30 m were collected composed of forest (67% of the pixels), water (25%), permanent wet soil (3%), dry soil (3%), cloud (1%), and shadow (1%) classes. The classification predictor variables were defined by the Landsat-8 OLI surface reflectance for bands 3, 4, 5, 6, and 7. In addition, normalized difference band ratios, defined like the normalized difference vegetation index (NDVI), for every possible two band combination of these bands were derived. This provided a total of 11 predictor variables. These bands and ratios have been used before for Landsat land cover classification [46][47][48].
The training data were used to develop a random forest classification tree using the default parameter settings, i.e., 500 trees were grown with each tree built using 63.2% of the training data selected randomly with replacement and three predictor variables (the square root of the number of predictor variables) randomly selected [49]. The random forest classification was applied to the 11 predictor variables at every 30 m study area pixel. The land cover classification was checked by visual comparison with the Landsat-8 OLI images and with Google Earth high resolution images. No formal quantitative per-pixel assessment of the classification accuracy was undertaken as the objective here was to develop a conservative forest mask used to discard non-forest pixels from the tree height and AGB mapping.

LiDAR Dominant Canopy Height Quantification
The data were processed using FUSION, a public software designed by the U.S. Forest Service to analyze LiDAR data [50]. The Landsat 30 m pixel grid was used to define a coordinate system.
First, the number of LiDAR ground returns in different sized grid cells (1 m, 2 m, 2.5 m, and 3 m side dimensions) aligned with the Landsat coordinate system were examined to determine an appropriate grid cell dimension for the subsequent processing. In the DRC national carbon mapping study undertaken by Xu et al. [36] a 2 m grid cell dimension was used. However, for the four study area transects, we found that a 2.5 m grid cell dimension was more appropriate as with smaller grid cells there were usually no ground returns in each grid cell. As an example, Figure 3 illustrates in detail Remote Sens. 2020, 12, 1360 7 of 21 the number of ground returns for the four different grid cell dimensions considered. The percentage of illustrated grid cells with no ground return data were 85%, 60.4%, 49.3%, and 39.7% for 1 m, 2 m, 2.5 m, and 3 m, grid cell dimensions, respectively. In Figure 3, the greatest ground return density is in the north east and occurs where there are no trees. The discrete-return airborne LiDAR transect data categorized as ground returns were used to generate a 2.5 m ground height digital terrain model (DTM) by averaging the heights of the ground returns falling in each 2.5 m grid cell. Some DTM grid cells had no data (e.g., white in Figure 3) and the DTM gaps were interpolated from neighboring DTM grid cell values by natural neighbor interpolation that has been recommended for LiDAR processing [51] and has elegant interpolation properties, i.e., no parameters are used, the interpolated values are guaranteed to be within the range of the samples used and to pass through the input samples, and are smooth everywhere except at the locations of the input samples [39].
A canopy height model (CHM) was estimated by extracting the DTM height from the maximum first return height in each 2.5 m grid cell and only considering first returns with heights >1 m. This is a common approach in tropical forests if the LiDAR first returns are not particularly noisy [52][53][54][55].
The dominant canopy heights in 30 m grid cells aligned with the Landsat 30 m pixel grid were derived by taking the mean of the 2.5 m CHM values falling in each 30 m grid cell. The mean rather than another metric, such as the maximum or the median, was used as it provides a reliable representation of forest structure and has been used in other LiDAR based tropical forest studies [55][56][57]. The dominant canopy height was derived only for 30 m grid cells ≥75% covered by LiDAR data (i.e., containing ≥108 2.5 m canopy height values). This resulted in a proportion of the 30 m grid cells along the LiDAR transect edges being discarded from the analysis. The discrete-return airborne LiDAR transect data categorized as ground returns were used to generate a 2.5 m ground height digital terrain model (DTM) by averaging the heights of the ground returns falling in each 2.5 m grid cell. Some DTM grid cells had no data (e.g., white in Figure 3) and the DTM gaps were interpolated from neighboring DTM grid cell values by natural neighbor interpolation that has been recommended for LiDAR processing [51] and has elegant interpolation properties, i.e., no parameters are used, the interpolated values are guaranteed to be within the range of the samples used and to pass through the input samples, and are smooth everywhere except at the locations of the input samples [39].
A canopy height model (CHM) was estimated by extracting the DTM height from the maximum first return height in each 2.5 m grid cell and only considering first returns with heights >1 m. This is a common approach in tropical forests if the LiDAR first returns are not particularly noisy [52][53][54][55].
The dominant canopy heights in 30 m grid cells aligned with the Landsat 30 m pixel grid were derived by taking the mean of the 2.5 m CHM values falling in each 30 m grid cell. The mean rather than another metric, such as the maximum or the median, was used as it provides a reliable representation of Remote Sens. 2020, 12, 1360 8 of 21 forest structure and has been used in other LiDAR based tropical forest studies [55][56][57]. The dominant canopy height was derived only for 30 m grid cells ≥75% covered by LiDAR data (i.e., containing ≥108 2.5 m canopy height values). This resulted in a proportion of the 30 m grid cells along the LiDAR transect edges being discarded from the analysis.
The above processing was also repeated independently for the LiDAR data falling over the 25 m × 25 m AGB field parcels (Section 2.4) and using the field plot corner locations to define a coordinate system. A 2.5 m DTM was generated, then, as above, any DTM gaps were filled by natural neighbor interpolation, canopy heights were estimated for each 2.5 m grid cell, and then the dominant canopy heights in the 25 m × 25 m grid cells falling over each 25 m × 25 m field plot parcel were derived.

Dominant Forest Canopy Height Prediction and Accuracy Assessment
The dominant forest canopy height was predicted at each Landsat 30 m pixel using the established non-parametric supervised random forest regression estimator [49]. Other researchers have also used this approach [16,22,55]. Only the 30 m pixels classified as forest (Section 3.1) were considered.
The response variable was defined by the 30 m dominant canopy height data (Section 3.2) sampled systematically every four pixels (120 m) north and south across each LiDAR transect. A four pixel sampling interval was used to reduce spatial autocorrelation effects that can introduce biases into the forest height prediction [58]. The four pixel sampling interval was selected beacause it is >100 m which is the distance that canopy heights in Mai Ndombe province were found to be significantly different from forest edge canopy heights [35].
A total of 5278 pixels with 30 m dominant canopy height response variables and 11 associated predictor variables were extracted. The predictor variables were defined by the Landsat-8 OLI surface reflectance for bands 3, 4, 5, 6, and 7. In addition, normalized difference band ratios, defined like the NDVI, for every possible two band combination of these bands were derived. The 5278 response and predictor values were divided into two equally sized portions, one portion was used to train the random forest regression and the other to test it. To ensure that a full range of forest canopy heights were used in both the training and testing, the following sampling procedure was used. The 5278 30 m dominant canopy height values were ranked into ascending canopy height order. Every second sample in the ranked list was selected as training data (n = 2639) and the remainder were used to define the test data (n = 2639). The dominant canopy heights for the training data ranged from 2.71 m to 43.99 m and for the test data ranged from 2.65 m to 42.71 m.
The random forest regression estimator was trained using the 2639 30 m dominant canopy height pixel training values and the 11 corresponding predictor values. The default random forest regression parameter settings were used, i.e., 500 trees were grown with each tree built using 63.2% of the training data selected at random with replacement and 3 predictor variables (one third the number of predictor variables) randomly selected [49]. The resulting random forest regression tree was applied to the 11 predictor variables at every forest mask Landsat-8 OLI pixel location to generate a 30 m dominant forest canopy height map.
The random forest regression prediction accuracy was assessed by application of the random forest regression tree to the 2639 test predictor variables. The resulting 2639 random forest regression predicted canopy heights were compared with the test 30 m dominant forest canopy heights and the Root Mean Square Error (RMSE) between them derived. In addition, scatterplots comparing the predicted and test 30 m dominant canopy heights were generated and Ordinary Least Squares (OLS) regressions between the data and the goodness of fit (R 2 ) and regression confidence (p value) statistics were derived.
The above process was undertaken independently three times, using the Landsat-8 OLI

Aboveground Biomass Mapping
The aboveground biomass (AGB) was derived at each 30 m pixel location with a dominant forest canopy height estimate as: AGB = 1.88 h 1.55 (2) where AGB is the predicted aboveground biomass (Mg ha −1 ) and h is the 30 m dominant forest canopy height predicted by the random forest regression tree (Section 3.3). This allometric equation was defined by Xu et al. [16] by statistically fitting 92 pairs of dominant canopy heights (derived using the same airborne LiDAR data as this study but extracted from more (33) LiDAR transects flown across the main forest types of the DRC) with field AGB estimates (derived as described in Section 2.4). Three AGB maps were generated by application of Equation (2) to the 30 m dominant forest canopy height maps generated using predictor variables derived from the Landsat-8 OLI (a) dry season, (b) wet season, and (c) both images.

Aboveground Biomass Map Accuracy Assessment
The study area 30 m AGB maps were validated by comparing them with the field plot AGB data that were defined in 25 × 25 m parcels (Section 2.4). The Landsat pixels and field parcels have different sizes and are not aligned. Consequently, the 25 × 25 m parcel AGB estimates falling under each 30 × 30 m Landsat pixel location were weighted to derive an equivalent 30 m field AGB estimate as: where AGB 30 is the 30 × 30 m AGB field estimate derived from the n (typically 4 but sometimes 2 or 1) parcel AGB estimates (AGB 25  f i ≥ 0.5 were considered. Thus, if a 30 m pixel was less than 50% covered by field plot parcels it was not considered. The RMSE between AGB 30 and the corresponding mapped 30 m AGB values (Section 3.4) were derived. Scatterplots comparing these data were generated and OLS regressions between the data and the goodness of fit (R 2 ) and regression confidence (p value) statistics were derived to quantify the correspondence of the data. Figure 4 shows the predicted 30 m dominant forest canopy heights for the study area derived using the Landsat-8 OLI predictor variables generated using (a) only the dry season, (b) only the wet season, and (c) both images. White shows the pixels that were classified as either water, permanent wet soil, dry soil, cloud, or shadow, and that were masked off from the subsequent AGB analysis. The masked off pixels include Lake Mai Ndombe evident in the wet season Landsat-8 OLI image (Figure 1a) and also capture most of the small rivers including streams with small axis dimensions greater than about half a 30 m pixel. Clouds and shadows located mostly in the North West that occurred in the dry season Landsat-8 OLI image are also apparent. Typically, forest edges and small forest clearings were classified as one of the non-forest classes (usually as wet soil or water) which is not a problem as these masked off pixels likely contain shrubs and grassed whose AGB is unknown.

Dominant Forest Canopy Height Prediction Accuracy Assessment
The dominant forest canopy height prediction accuracy was assessed, as described in Section 3.3, by application of the random forest regression tree to the 2639 test pixels that were not used to train the tree. This was undertaken three times using the trees derived with Landsat-8 OLI predictor variables generated from (a) the dry season, (b) the wet season, and (c) both images. Figure 5 shows scatterplots comparing the test and the predicted 30 m dominant canopy height values. There are two clouds of dots evident in the scatterplots, the larger cloud corresponds to tall trees > 20 m present in the mature tropical evergreen forest parts of the four LiDAR transects, and the other corresponds to shorter forest canopies about 18 m high that occur predominantly around the Lake Mai Ndombe and often in the S.E. and S.W. LiDAR transects.
The OLS regressions of the plotted data are shown in red in Figure 5. In all three cases the regressions are significant (p values < 0.05) with slopes less than unity and intercepts > 12 m. The random forest regression underestimates and overestimates the heights for pixels dominated by tall and short trees, respectively. The These results indicate that using both the dry and wet season Landsat-8 OLI images provides more accurate dominant canopy height prediction. Figure 6 shows the 30 m AGB biomass maps derived from the 30 m dominant forest canopy height maps (Figure 4) using Equation (2). The same broad patterns as the dominant forest canopy height maps are observed, which is expected given that the AGB is proportional to the dominant canopy height.

Aboveground Biomass Maps
The mean study area AGB was 206 Mg ha −1 , 211 Mg ha −1 , and 204 Mg ha −1 for the AGB maps generated using the dry season, wet season and both images dominant forest canopy height maps, respectively. The maximum AGB was found at the 30 m pixels with greatest dominant canopy height and was 493.    Figure 7 shows scatterplots comparing the 30 m AGB derived from the remotely sensed data ( Figure 6) and the 30 m area weighted AGB field estimates (AGB 30 ) over the four one-hectare field plots. The three scatterplots compare the same AGB 30 with AGB predicted using forest canopy heights generated from the dry season (Figure 7a coded to designate which of the four field plots the AGB 30 were derived from. They illustrate that two of the field plots (purple and green) had higher AGB 30 and that there was a wide range of values from about 96 Mg ha −1 to 503 Mg ha −1 . However, this range is smaller than present in the estimated 30 m AGB study area maps shown in Figure 6 that had AGB that varied from approximately 16 to 512 Mg ha −1 .  The OLS regressions of the plotted data are shown in red and were insignificant for the dry (Figure 7a) and wet (Figure 7b) season derived image results (p values > 0.05) with small R 2 values, of 0.05 and 0.07, respectively. Conversely, the OLS regression results for the AGB estimated using the dominant forest canopy heights derived from both Landsat images (Figure 7c) was more significant (p = 0.03) with a 0.11 R 2 value, and a slope closer to unity (0.135) and an intercept closer to zero (193 Mg ha −1 ). The RMSE values for the wet and dry season results were 92.43 Mg ha −1 and 87.76 Mg ha −1 , respectively, and smaller, 83.77 Mg ha −1 , for the combined image results. The 83.77 Mg ha −1 RMSE value corresponds to about 41% of the mean study area mapped AGB (204 Mg ha −1 ) (Figure 6c). However, clearly, the mapped AGB is over-estimated below about 225 Mg ha −1 and under-estimated above this value (Figure 7c).

Discussion
The AGB of the Congo Basin forest has been poorly documented due to a lack of inventory data and research [59]. Recently, airborne LiDAR and Landsat-8 OLI data have been used to map tree height and AGB across all of the Congo Basin [16]. Cloud cover at the time of Landsat overpass can be high (e.g., Figure 2), reducing the ability to obtain cloud-free imagery needed to undertake the mapping. Consequently, Xu et al. [16] defined Landsat predictor variables by the medians (i.e., 50th percentiles) of the red, NIR, and the two shortwave Landsat-8 OLI reflective wavelength bands acquired over three years. Thus, the predictor variables at adjacent pixel locations may have been selected from different years and seasons which is an issue if the forest and the Landsat forest reflectance changed between years and seasons. However, the detailed processing and results of this study show, on average at the study area level, no great difference between the dry and wet season dominant canopy height and AGB results, i.e., little sensitivity to the seasonality of the Landsat imagery used. This is discussed below.
The dominant forest canopy heights and AGB estimated from the dry season Landsat-8 OLI image were on average marginally lower than those estimated from the wet season image. The RMSE between the mapped and 2639 independent test 30 m dominant canopy heights was 4.17 m (dry season) and 4.43 m (wet season) and corresponds to 20.2% and 21.2% of the mean study area mapped dominant canopy heights that were 20.6 m (dry season) and 20.8 m (wet season). The RMSE between the mapped and 43 independent field based 30 m AGB estimates was 87.76 Mg ha −1 (dry season) and 92.43 Mg ha −1 (wet season) and corresponds to 42.6% and 43.8% of the mean study area mapped AGB that was 206 Mg ha −1 (dry season) and 211 Mg ha −1 (wet season).
There were seasonal geographic differences between the mapped dominant canopy height ( Figure 4) and the AGB (Figure 6) results, in particular around Lake Mai Ndombe and in the vicinity of several of the rivers. The reasons for this are complex but may be due to seasonal vegetation condition and surface differences. For example, although the Landsat 30 m forest mask was derived conservatively, sub-pixel disturbed forest patches, and degraded forest areas with reduced live tree cover, may include shrubs and saplings that exhibit greater seasonal reflectance differences than elsewhere in the forest. In addition, in these regions, wet season flood water may have been observable at Landsat resolution through the forest canopy.
The pre-processing applied to the LiDAR and Landsat-8 OLI data were state of the practice. However, the difference between the seasonal results may have been affected by Landsat bi-directional reflectance distribution (BRDF) effects. Landsat BRDF variations are smaller than in wider field of view satellite optical wavelength data and occur due to changes in the view geometry across the image swath and to temporal changes in the solar geometry [60,61]. The two Landsat-8 OLI images were not corrected for BRDF effects because they were both sensed from the same orbit and so have similar view geometry. The solar zenith angle at the center of each image was 35.85 • and 32.30 • for the dry and wet season images, respectively. This 3.55 • solar zenith difference is small compared to the 15 • Landsat field of view although variations of this magnitude can cause small reflectance variations [61]. Use of both the dry and wet season Landsat-8 OLI images provided the lowest RMSE value (3.84 m) between the predicted and test dominant forest canopy heights, corresponding to 18.8% of the mean study area mapped tree height (20.4 m) derived using both images. This 18.8% RMSE is quite small and was not particularly expected as many of the study area forest Landsat pixels have high NDVI (~0.8) which is indicative of "saturated vegetation" reflectance conditions. Typically, the NDVI saturates with increased Leaf Area Index (LAI) above about LAI~3.0 [62] and Congo basin tropical evergreen rainforest has LAI > 5 [63,64]. Presumably, despite this potential saturation issue, other factors related to the dominant forest canopy height could be discriminated by the random forest regression model using the wet and dry season Landsat-8 OLI images. Notably, the R 2 value was not particularly high (0.47) but the regression was significant (p < 0.001), indicating that the regression fit model illustrated in Figure 5c is better than not having a model. Moreover, the R 2 is comparable to other recent study results, for example, Staben et al. [22] reported a 0.49 R 2 between mapped and predicted forest canopy height in the Northern territory, Australia.
Using both Landsat-8 OLI images provided the most accurate AGB prediction and the lowest RMSE value (83.77 Mg ha −1 ) between predicted and field estimated AGB. This was expected because the AGB was derived as an allometric function of the dominant canopy height which was most accurately predicted using both Landsat-8 OLI images. Although the R 2 was low (0.11) the regression was significant (p < 0.03), indicating that the regression fit model illustrated in Figure 7c is better than not having a model. As noted earlier, the range of the field plot derived AGB (~96 Mg ha −1 to 503 Mg ha −1 ) is smaller than the range of the study area estimated 30 m AGB (~16 to 512 Mg ha −1 ), and the field plot data were available at only four 1 ha sites, which may reduce the representativeness of the validation results. However, the 83.77 Mg ha −1 AGB RMSE is comparable to the 89.83 Mg ha −1 RMSE value reported by Xu et al. [16] and corresponds to about~40% of the reported mean Congo Forest AGB. This is quite a high error but is not surprising as we found that the dominant canopy heights were under-estimated for trees >~25 m and over-estimated for trees <~18 m ( Figure 5). Similarly, the AGB was over-estimated below about 225 Mg ha −1 and under-estimated above this value (Figure 7).
The magnitude of the estimated mean study area AGB derived using both Landsat-8 OLI images (204 Mg ha −1 ) is similar to that reported in other Congo Basin forest studies. For example, Baccini et al. [59] reported 216 Mg ha −1 for the evergreen rainforest of Central Africa, Silva et al. [65] reported 223 Mg ha −1 for Lope National Park in Central Gabon, and Xu et al. [16] reported 231 Mg ha −1 for all the DRC. Similarly, the magnitude of the maximum study area AGB derived using both Landsat-8 OLI images (511.8 Mg ha −1 ) was comparable to the maximum Congo Basin forest AGB reported by Silva et al. [65] and Xu et al. [16]. Thus, the maximum AGB that a hectare of Congo basin forest can store is about 500 Mg, which demonstrates the importance of these forests for carbon storage.
The predicted dominant canopy height results were derived using LiDAR transect training data acquired in July and August 2014, i.e., up to 12 and 13 months, respectively, after the dry season Landsat image acquisition, and up to 6 and 5 months, respectively, before the wet season Landsat image acquisition. The reliability of the dominant canopy height prediction will be reduced if the forest within the transects was disturbed in these periods. However, given the paucity of cloud-free satellite data we have no way to check this. The field plot data used to derive the AGB validation data were collected November 2015, i.e., 11 and 28 months after the wet and dry season Landsat images, respectively. The field plot data were collected in undisturbed primary forest and so are unlikely to have been subsequently disturbed but, again, we have not way to check this definitively. These issues underscore the difficulty in mapping and validating DRC dominant forest canopy height and AGB.
The degree to which AGB estimation can be improved using optical wavelength data is unknown. Further research to examine the effects of using additional satellite data, such as the Landsat-like Sentinel-2 data [66], to see if the height estimation can be improved, is warranted. Using improved allometry [8] and recent spaceborne LiDAR data [67] is also warranted.

Conclusions
The sensitivity of airborne LiDAR and Landsat-8 OLI based dominant canopy height and AGB 30 m mapping was assessed with respect to the season of Landsat acquisition for a~10,000 km 2 Congo Basin tropical forest study area. Experiments were undertaken independently three times to map and assess the 30 m dominant canopy height and AGB using (i) only a wet season Landsat-8 image, (ii) only a dry season Landsat-8 image, and (iii) both images. The images were predominantly cloud-free. A random forest regression estimator was used to predict and assess the 30 m dominant canopy height using LiDAR derived test and training data. The AGB was mapped using an allometric model parameterized with the dominant canopy height and was assessed by comparison with 43 field based 30 m AGB estimates.
The most accurate results were obtained using both the dry and wet season Landsat-8 OLI images together. The RMSE between the mapped and test 30 m dominant canopy heights was 3.84 m, and the RMSE between the mapped and field based AGB estimates was 83.77 Mg ha −1 . These RMSE values correspond to 18.8% of the mean study area mapped tree height (20.4 m) and to 41% of the mean study area mapped AGB (204 Mg ha −1 ). The mean study area mapped AGB is similar to that reported in other Congo Basin forest studies [16,59,65].
At the study area level there was little sensitivity to the seasonality of the Landsat imagery used. The study area mean dominant canopy height and AGB values were similar between seasons, within 0.19 m and 5 Mg ha −1 , respectively, and the RMSE between the mapped and test 30 m dominant canopy heights was 4.17 m (dry season) and 4.43 m (wet season), and the RMSE between the mapped and field based AGB estimates was 87.76 Mg ha −1 (dry season) and 92.43 Mg ha −1 (wet season).
The degree to which AGB estimation can be improved using temporally richer optical wavelength data is unknown due to difficulties in obtaining cloud-free imagery. These results suggest that (i) using a single cloud-free Landsat-8 OLI image may be sufficient for airborne LiDAR and Landsat-8 OLI based dominant canopy height and AGB 30 m mapping in the Congo Basin tropical forest, but (ii) using Landsat imagery from different seasons is preferred to improve tropical forest inventories in the Congo Basin forest.