Correction of Interferometric and Vegetation Biases in the SRTMGL1 Spaceborne DEM with Hydrological Conditioning towards Improved Hydrodynamics Modeling in the Amazon Basin

Abstract: In the Amazon basin, the recently released SRTM Global 1 arc-second (SRTMGL1) remains the best topographic information for hydrological and hydrodynamic modeling purposes. However, its accuracy is hindered by errors, partly due to vegetation, leading to erroneous simulations. Previous efforts to remove the vegetation signal either did not account for its spatial variability or relied on a single assumed percentage of penetration of the SRTM signal. Here, we propose a systematic approach over an Amazonian floodplain to remove the vegetation signal, addressing its heterogeneity by combining estimates of vegetation height and a land cover map. We improve this approach by interpolating the first results with drainage network, field and altimetry data to obtain a hydrologically conditioned DEM. The averaged interferometric and vegetation biases over the forest zone were found to be −2.0 m and 7.4 m, respectively. Comparing the original and corrected DEM, vertical validation against Ground Control Points shows an RMSE reduction of 64%. Flood extent accuracy, controlled against Landsat and JERS-1 images, shows improvements in low and high water periods (+24% and +18%, respectively). This study also highlights that a ground truth drainage network, as the sole input during the interpolation, achieves reasonable results in terms of flood extent and hydrological characteristics.


Introduction
Wetlands are widely recognized as vital components of watersheds that provide various hydrological and ecological functions and valuable services for society, such as water storage, groundwater recharge, water quality improvement, carbon storage, and biodiversity conservation [1,2]. In the Amazon basin, wetlands cover 800,000 km², or 14% of the lowland basin [3]. At Óbidos (1.93°S, 55.5°W), the most downstream gauge, the flooded area nearly doubles between the low and high water periods in response to the pronounced discharge seasonality of the Amazon River. Water storage clearly contributes to the flood wave attenuation and delay in its propagation [4,5]. In comparing two regional approaches for channel hydrodynamic modeling, including or omitting floodplain storage, these authors found a difference in travel time of approximately 1 month at the scale of the Solimões and Amazon basin. When flood waters travel across the floodplain, a significant part of the main channel sediment load is trapped [6][7][8]. Floodplains also significantly alter the fate of carbon and nutrients in the basin [9,10]. They are considered hot spots of biodiversity [11], these characteristics being mainly interpreted in the framework of the flood pulse concept [12].
Floodplains mainly receive regional water through channelized and diffusive overbank flow from the main stem. Water in the floodplains also comes from local runoff, groundwater seepage and direct rainfall, whose contributions vary seasonally and inter-annually [13][14][15]. The seasonal contribution and the mixture of these different water sources could be a key driving factor of biochemical and ecological processes in floodplains [16,17]. Because of their hydrological connectivity, these systems are threatened by both local and distant anthropic pressures, particularly through regional and local deforestation and dam construction [18]. Thus far, the possible impacts of climatic trends and anthropic pressure on floodplain hydrology and ecology have not clearly been stated. Gloor et al. [19] recently showed that the Amazon basin has presented wetter climatic conditions since 1990.
Considering the strong interactions between water circulation, biogeochemical and ecological processes, it is important to quantify the fluxes exchanged between the river and the floodplain and the water circulation patterns within the floodplains. Hydrodynamic models, such as LISFLOOD-FP, originally developed by Bates and De Roo [20] and progressively improved [21][22][23], are attractive in this context. This model (or an adapted version) has been applied in the Amazon, from regional (thousands of km) [5,24] to medium (hundreds of km) spatial scales [25][26][27]. However, relatively high quality topographic data are required to produce consistent results. Indeed, hydrodynamic features in floodplains are in part controlled by small topographic features [28,29], such as narrow channels that drive floodplain connectivity at low and medium water levels [27].
To date, the best topographic dataset readily available for the Amazon is from the Shuttle Radar Topography Mission (SRTM), described in Farr et al. [30]. Nevertheless, the SRTM includes various types of errors, the dominant sources of which are summarized in Rodriguez et al. [31]. These errors, gathered hereafter under the term interferometric bias, make the data inappropriate for use as they stand in hydrodynamic models. In regions of moderate topography and low vegetation, the residual motion error of the interferometric baseline is the main contributor to the interferometric bias and a significant source of absolute height error. The baseline roll error results in a residual long-wavelength error of ±2 m with peak values of ±10 m [31]. In addition, as in most remotely sensed digital elevation models (DEMs), land cover affects the vertical accuracy of the SRTM: the denser the vegetation, the larger the elevation offset. This effect is due to the inability of C-band radar (5.6 cm wavelength) to fully reach the bare earth in the presence of a forest canopy. The capacity of SRTM to detect bare earth is a function of vegetation characteristics such as tree height, density, branching angle, soil moisture, and wood moisture [32,33]. In the Amazon lowland basin, Carabajal and Harding [34] estimated the height error to be 22.4 m and the SRTM elevation to be 40% of the distance from the canopy top to the ground. The reported error value in height estimation may prevent accurate estimation of the water exchange between the main stream and its floodplains. It exceeds the flood wave amplitude at most water level gauges along the Solimões/Amazon River (approximately 12 m at Manacapuru (3.317°S, 60.583°W), for example).
Interferometric bias might be estimated from ground control points (GCPs), as explained in Rodriguez et al. [31]. However, no systematic method has been developed to remove the positive vegetation-related offset. At the regional scale, the simplest method consists of removing a uniform vegetation bias independently of vegetation type. For example, in the view of hydrological modeling at the Amazonian Basin scale, Coe et al. [35] used a value of 23 m, whereas Paiva et al. [36] assumed 17 m in forested areas. Such a methodology, however, assumes a uniform spatial error, which is unlikely and can lead to inconsistent elevations in floodplain regions where spatial variations in vegetation heights occur naturally [37]. This procedure could result in certain areas of the SRTM being over or under-elevated, as shown in Paiva et al. [36]. An alternative approach consists of using a vegetation height map and subtracting a percentage of the vegetation height from the SRTM. From this perspective, Wilson et al. [27] made vegetation height field measurements in different representative land cover classes and, using the wetland cover class proposed by Hess et al. [38], built a vegetation map of their study area consisting of a 285 km-long floodplain segment. Comparing in situ measurements made at the edge of deforested areas and SRTM elevations along the same ground profile, they found that the percentage of penetration of the radar signal within the canopy was 50%. Combining these two sources of information allowed them to produce a corrected DEM. Based on vegetation height field data, however, this method was difficult to extrapolate to other areas or larger scales. Going further, and taking advantage of a global remote-sensed vegetation height map produced by Simard et al. [39], Baugh et al. [25] proposed a sensitivity analysis to assess the influence of the fixed value to remove the vegetation bias from SRTM data for hydrodynamic modeling. 
They created a set of distributed vegetation offsets by applying a fixed percentage of radar signal penetration to the Simard et al. [39] vegetation height map. Then, they generated a set of DEMs by subtracting these sets of distributed offset from the SRTM elevation; the best DEM was selected as the one leading to the most consistent results when comparing the LISFLOOD-FP hydrodynamic model results with independent hydrometric data. Their study suggested that subtracting 50%-60% of the Simard et al. [39] vegetation height from the SRTM was most appropriate for the vegetation encountered in the Amazon floodplains. One weak point of this method is the target criterion, which requires basing the DEM assessment upon a hydrodynamic modeling application. As models do not represent reality but only approach it, we should be careful in using models to validate forcing components of the models themselves [40].
In this study, we propose a correction methodology for the latest release of the SRTM, the SRTM Global 1 arc-second (SRTMGL1) product, with the aim of making this product usable for the hydrological and hydrodynamic modeling of a floodplain located in the Amazon basin. The correction methodology considers interferometric and vegetation offset corrections and aims to keep the DEM coherent with hydrologic knowledge acquired during field campaigns, such as the channels connecting the Solimões River and the lake. The elevation dataset merges the corrected SRTMGL1 elevations and in situ bathymetric data and is interpolated by the ANUDEM v5.3 algorithm [41,42], which is forced by a ground truth drainage network. The methodology remains relatively simple and independent of the model. We compare the improvement of the DEMs generated by each methodology step in terms of vertical accuracy, drainage network quality and flood extent against GCPs derived from altimetry sensors (e.g., Envisat, ICESat) and inundation maps deduced from freely available remote sensing imagery. For each DEM, we also assess the consequences for the morphological and hydrological characteristics of the floodplain local catchment. This analysis makes it possible to investigate the contribution of each data source to the DEM performance and to evaluate the possibility of using the methodology on a larger scale.

Study Site
The Janauacá Floodplain is located between 3.20°S and 3.25°S and between 60.23°W and 60.13°W, along the right margin of the Solimões River, approximately 40 km upstream from its junction with the Rio Negro (Amazonas state, Brazil) (Figure 1). It consists of one lake connected with the Solimões River. According to the classification proposed by Sioli [43], the Solimões River is a white water river, rich in suspended sediments and nutrients. In contrast, small streams (igarapés) draining the south of the watershed present properties closer to black waters, which are rich in dissolved organic matter. According to the rain gauge data in Manacapuru (3.317°S, 60.583°W), at approximately 40 km from the study site, the mean annual rainfall is 1976 mm/year for the period from 2006 to 2009 [44]. The mean annual fluctuation of the river water level reaches 12.2 m at the Manacapuru water level gauge (3.31°S, 60.26°W) over the 2006-2011 period. The Solimões River presents a mono-modal hydrograph with the water level rising from mid-November until mid-June, when the recession phase begins.

SRTMGL1 and SWBD
The global 1 arc-second SRTM V3.0 dataset (SRTMGL1) is a joint product of the National Geospatial-Intelligence Agency (NGA) and the National Aeronautics and Space Administration (NASA). Data were collected over 11 days in February 2000 using dual Spaceborne Imaging Radar (SIR-C) and dual X-band Synthetic Aperture Radar (X-SAR). From these data, a near-global DEM was generated. By the beginning of January 2015, the SRTMGL1 had been freely released for South America. Data can be downloaded at NASA's Earth Observing System Data and Information System (EOSDIS) website [45]. They are referenced to the World Geodetic System 84 (WGS84) ellipsoid and Earth Gravitational Model 1996 (EGM96) geoid. The EGM96 geoidal undulations were replaced by the EGM08 ones [46], using programs provided by NGA [47] for both geoid models. SRTM data products have been validated on continental scales: the absolute and relative vertical accuracies over South America are 6.2 m and 5.5 m, respectively [31]. Rudorff et al. [26] found a local negative bias of 4.4 m in an Amazonian floodplain located downstream from the Janauacá floodplain. Satgé et al. [48] found a negative bias of 7.2 m for the Andean Plateau region. In addition to the bias introduced by interferometric errors, because of the limitation of C-band radar in reaching the bare earth, SRTM elevations lie above the bare earth and below the maximum canopy height [33,34]. In the lowland Amazon basin, the vertical height error was estimated to be 22.4 m [34].
Because water surfaces have low radar backscatter, water bodies and coastlines are generally not well defined. As a by-product of the water body editing, a water mask was generated: the SRTM Water Body Data (SWBD). It presents the same coverage as SRTMGL1 and is available in ESRI Shapefile format [49] from NASA's EOSDIS. According to Lehner et al. [50], this product presents some inconsistencies because in some cases, water body depiction required ancillary data sources, such as Landsat 5 data that were collected much earlier than the shuttle mission.

Wetlands Map
Hess et al. [38] produced a dual-season map of wetland inundation and vegetation for the central Amazon basin, denoted by WM hereafter. It is available at NASA's EOSDIS website [45]. In this product, wetland areas were defined as inland areas that are periodically inundated or permanently waterlogged, including lakes, rivers, estuaries, and freshwater marshes. The product is based on mosaicked L-band Synthetic Aperture Radar (SAR) imagery acquired by the Japanese Earth Resources Satellite-1 (JERS-1) during two periods. In the study area, images were acquired on 10 October 1995 and 27 May 1996 (Hess, personal communication) for the low water (LW) and high water (HW) seasons, respectively. In addition to water and non-wetland classes, the classification provides 4 classes of vegetation (herb, woodland, shrub land and forest). The dual season approach allowed splitting these classes according to their inundated status (permanently inundated, seasonally inundated or never inundated). To date, with a resolution of 3 arc-second, it is the only freely available product for the region of the study. WM was validated using high-resolution, geocoded digital videography collected during aerial surveys. According to Hess et al. [38], the classification accuracy varies from 94% for the flooded class and 76% for the non-flooded class at the HW level and 84% for the flooded and 89% for the non-flooded class at the LW level. Flood maps during the LW and HW seasons (denoted LWFM and HWFM, respectively) deduced from the WM are commonly used to assess the ability of hydrological and hydraulic models to reproduce the inundation in the central Amazon basin [4,26,27,35,40].

Radar Altimetry Data
Satellite altimetry for continental waters really began in the 2000s, following the work involving the Geosat satellite [51] and the Topex/Poseidon (T/P) mission [52]. Many studies have now shown the great ability of radar altimetry missions for retrieving river stages [53][54][55][56][57]. In this study, we used data from three different altimeters, namely ENVISAT/RA-2, T/P and ICESat/GLAS, available through the CTOH website (Centre de Topographie des Océans et de l'Hydrosphère) [58]. The ENVISAT/RA-2 altimeter launched by the European Space Agency (ESA) collected data from 2002 to 2012. ENVISAT has a 35-day repeat period; its ground-track presents an inter-track distance of approximately 80 km at the equator. Comparison at crossovers and with in situ gauges showed that the quality of the series can be highly variable, with results ranging from 12 cm to several meters [54,56,59]. The T/P altimeter launched by NASA and the CNES (Centre National d'Etudes Spatiales) collected data from 1992 to 2004. T/P has a 10-day repeat cycle. Its ground-track presents an inter-track distance of 315 km at the equator. Birkett et al. [52] showed accuracies ranging from 10 cm to several meters (mean RMSE 1.1 m).
NASA acquired elevation data from 2003 to 2009 using the GLAS altimeter on the ICESat satellite. The ground-track repeat cycle is 183 days, yielding 15 km track spacing at the equator. The ICESat/GLAS mission was initially launched to monitor icecaps. However, many studies have addressed its capability of monitoring other land covers. ICESat/GLAS instrument demonstrates height accuracies of 3 cm over water bodies [60][61][62]. Due to its high accuracy, ICESat/GLAS data can be used as GCPs [48,60,61]. In this study, we used the most recent release of the GLA14 data (v34), specific for land-surface elevation. Data are referenced to the TOPEX/Poseidon ellipsoid and can be downloaded from NASA's EOSDIS website [45]. We filtered the data by selecting data acquired at LW, between October and January, outside the SWBD boundaries and inside the herb class of the wetlands map.

Forest Canopy Height Map
Simard et al. [39] released a global map of Forest Canopy Height (FCHM, hereafter) at 1 km spatial resolution. This map is based on data from the Moderate Resolution Imaging Spectroradiometer (MODIS) on NASA's Terra and Aqua satellites and from the ICESat/GLAS. The Simard et al. [39] FCHM was created by regressing ICESat RH100 (Relative Height) canopy height measurements with global grids of annual mean precipitation, precipitation seasonality, annual mean temperature, temperature seasonality, elevation, and percentage tree cover. It is distributed as an 890 MB GeoTIFF file at the website [63]. This product by Simard et al. [39] presents an accuracy of 6.1 m compared with measurements at 66 Fluxnet sites.
Landsat Products
USGS and NASA have managed the Landsat missions since 1972. Recently, the Landsat archive has been made freely available [64,65]. Landsat 5 Thematic Mapper (TM) and Landsat 7 Enhanced Thematic Mapper Plus (ETM+) imagery has moderate spatial resolution (30 m) and provides multispectral images (7 or 8 bands) with a short revisit interval (16 days). We chose images with low cloud contamination that matched the dates of the study.

In Situ Data
Water Level Data
In addition to the water level gauge at Manacapuru, available at the website of the Brazilian government water agency ANA [66], two gauges were installed in the floodplain: "RL1" at Tilhero place (3.424°S, 60.264°W) and "RL2" in the channel (3.368°S, 60.193°W) connecting the lake to the Solimões River (Figure 1). A series of daily values is available beginning 1 September 2006. Satellite radar altimetry provided 5 additional virtual stations from the T/P and ENVISAT missions. The manual method described in [53,59] was used to build the virtual gauges, which can be defined as any intersection between a water body surface and radar altimetric ground tracks. Three virtual stations were gained from the ENVISAT mission ( The Manacapuru water level gauge has been leveled against EGM08 by differential GPS. All the other water level gauges were leveled with the help of satellite radar altimetry level time series data, according to the method described in Santos da Silva et al. [56], and validated by high-precision dual-frequency Global Positioning System (GPS) stations. Finally, level time series were reported using the EGM08 geoid. According to the available altimetry data, the slope of the water surface along the reach between Manacapuru and the virtual station SVR shows a mean of 2 cm/km over the years 2006-2012.
Displacement of gauge elements and errors in reading or reporting data are always possible. The comparison between altimetry data and in situ records made it possible to correct or complete the in situ measurements, improving confidence in the in situ water level series [67]. In particular, as shown in Figure 2, the water level time series at the Manacapuru gauge distributed by ANA present noisy variations during the years 1994 and 1995. To estimate the water level in the lake at the dates of acquisition of the JERS-1 images (RL1 and RL2 records are available only from 2006), we needed reliable water levels from Manacapuru at the same dates. From the T/P series SV_076 and SV_063, we estimated the water level on 10 October 1995 to be 12.20 m, against the 10.42 m given by the Manacapuru gauge series. The same reasoning for the HW level gave a difference of 0.2 m between the altimetry estimation and the Manacapuru gauge level. As the T/P altimetry data have a vertical accuracy of approximately 0.5 m [59], we did not modify the recorded level provided by the ANA gauge for the HW level. Then, building a linear regression between the Manacapuru gauge and RL1 over the LW and HW periods, we estimated the water level in the floodplain at the RL1 gauge to be 13.5 m on 10 October 1995 and 22.1 m on 27 May 1996, respectively.
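The gauge-to-gauge transfer described above can be illustrated with an ordinary least-squares fit. This is only a minimal sketch: the function names and the sample levels are hypothetical, and the text does not say whether a single regression or separate LW and HW regressions were fitted.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical concurrent daily levels (m); these are NOT values from the study.
manacapuru = [10.0, 12.0, 15.0, 18.0, 21.0]
rl1 = [11.3, 13.2, 16.05, 18.9, 21.75]

slope, intercept = fit_line(manacapuru, rl1)


def predict_rl1(manacapuru_level):
    """Estimate the RL1 level from a Manacapuru level with the fitted line."""
    return slope * manacapuru_level + intercept
```

Once fitted on the period where both gauges have records, `predict_rl1` would be evaluated with the Manacapuru level at each JERS-1 acquisition date.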

Bathymetric in Situ Data
Most of the data were acquired during a field trip in June 2012, when the water level in the floodplain rose exceptionally to a high absolute level of 24.3 m recorded at the RL1 gauge. The in situ bathymetric data were acquired using an Acoustic Doppler Current Profiler (Teledyne RD Instruments, ADCP 1200 kHz) linked to a GPS. Everywhere else, we used an echo sounder linked to a GPS station. ADCP and echo sounder data were checked against manual measurements using a ruler and showed overall good agreement. Other in situ bathymetric data were acquired in May 2008 and in August 2006. Errors in the bathymetric grid are expected in both depth and position. According to the manufacturer, the ADCP has a vertical precision of 1 cm in 99% of the measurements, far below the errors induced by field conditions (instruments on board). In this study, we estimated that waves would generate errors on the order of 0.2 m. Position error is reported by both instrument manufacturers to be less than 15 m for 95% of the measurements, which is less than the grid mesh we aimed to build (30 m). Examining the survey data, we excluded all data points that were inconsistent with the rest.

Method
Prior to analysis, all gridded products except Landsat images were resampled to the SRTMGL1 1 arc-second resolution. All data were projected into the WGS84 Universal Transverse Mercator (UTM) zone 20S. We gathered the in situ bathymetric data and ICESat points into a global dataset of 201,935 GCPs. We split the latter into two independent sets, 90 GCP and 10 GCP, whose respective sizes were 90% and 10% of the size of the global GCP dataset.
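A split of this kind can be sketched as follows. The text does not state how points were assigned to each subset, so the random shuffle, the fixed seed and the function name are assumptions made for illustration only.

```python
import random


def split_gcps(gcps, calib_fraction=0.9, seed=0):
    """Shuffle the GCPs and split them into two disjoint subsets."""
    shuffled = list(gcps)
    random.Random(seed).shuffle(shuffled)  # assumed: random assignment
    n_calib = int(calib_fraction * len(shuffled))
    return shuffled[:n_calib], shuffled[n_calib:]


# Hypothetical GCPs as (x, y, elevation) tuples; the study used 201,935 points.
gcps = [(float(i), float(i), 10.0 + 0.01 * i) for i in range(1000)]
gcp_90, gcp_10 = split_gcps(gcps)
```

The first subset feeds the interpolation and the second is held back for validation, so no point is used for both.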
The methodological steps are presented in Figure 3. First, focusing on a window including a large area around the floodplain and the river, the process begins with the correction of the interferometric bias, followed by correction of the positive offsets related to land cover and vegetation height to produce an adjusted DEM (denoted DEM_ADJ hereafter). Then, DEM_ADJ is merged with the 90 GCP dataset and interpolated using the ANUDEM algorithm constrained by a drainage network to produce the corrected DEM (denoted DEM_COR hereafter). The drainage network used to force the ANUDEM algorithm was automatically extracted from SRTMGL1 and corrected by visual inspection using Landsat images, denoted GTSN hereafter. Several products were generated: DEM_COR, representing the whole of the inputs; DEM_COR_1, generated from the GTSN and DEM_ADJ (without vegetation correction); DEM_COR_2, generated without the river network; and DEM_COR_3, generated without the 90 GCP dataset. To evaluate the weight of each method step and each input during the interpolation process, these generated products, DEM 1 and SRTMGL1, were validated vertically (against the 10 GCPs dataset), horizontally (against maps of flood extent), and hydrologically (against watershed and river network assessments).

Interferometric Bias Correction
We followed a method similar to that of Rudorff et al. [26], which consisted of creating two elevation subsets: one built with the bathymetric and ICESat data that overlap bare soil areas, and the other built using the corresponding SRTM elevations. Bare soil areas were deduced from the WM. The Mann-Whitney U test was used to reject the hypothesis of equal means between real elevations and SRTMGL1 elevations. The bias was computed as the mean difference between the two elevation subsets mentioned above and applied uniformly over the entire study site domain. The resulting intermediary adjusted DEM is denoted DEM 1 hereafter.
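The bias computation can be sketched in a few lines. The sign convention (SRTM minus reference, so that a negative bias means SRTM lies too low), the function names and the sample values are assumptions; the statistical test is left out.

```python
from statistics import mean


def interferometric_bias(reference_elev, srtm_elev):
    """Mean of (SRTM - reference) over paired bare-soil points."""
    if len(reference_elev) != len(srtm_elev):
        raise ValueError("the two subsets must be paired point by point")
    return mean(s - r for r, s in zip(reference_elev, srtm_elev))


def remove_bias(dem, bias, nodata=None):
    """Subtract a uniform bias from every valid cell of a 2-D list DEM."""
    return [[nodata if z is nodata else z - bias for z in row] for row in dem]


# Hypothetical paired elevations (m) over bare soil, not data from the study.
reference = [20.0, 21.5, 19.0, 22.0]
srtm = [18.1, 19.4, 17.0, 20.0]

bias = interferometric_bias(reference, srtm)           # about -2.0 m here
dem_1 = remove_bias([[18.0, 25.0], [None, 30.0]], bias)
```

Subtracting a negative bias raises every valid cell by the same amount, which is what applying the correction uniformly over the domain amounts to.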

Vegetation Bias Correction
The bias introduced by the vegetation is a function of vegetation characteristics and may vary depending on vegetation height and cover density [32,33]. In the potentially flooded region, vegetation is spread over several types that clearly present differences in height and cover density [27]. In the non-flooded uplands, the forest is partially replaced by pasture and culture areas and secondary forest (capoeira). Thus, applying a uniform vegetation elevation correction to each region can lead to erroneous elevations.
WM classes were used to partition the study area into several regions. In each region, an elevation correction was computed from the elevation according to (1) the type of vegetation and (2) the flooded status. Using the SWBD, the water class issued from WM was split in two: water in the Solimões (ID61) and water outside the Solimões (ID62). All other classes are strictly derived from WM. As a whole, 16 regions were identified and summarized in Table 1. All regions except the terra firma and one forest class (ID41) may be flooded. Consequently, the pixel elevations of these regions must remain lower than or equal to the HWFM water level (22.1 m as mentioned above). In addition, regions that are non-flooded at LW and flooded at HW must exhibit pixel elevations ranging between the LWFM and HWFM water levels. However, a less restrictive criterion was applied to the elevations of permanently flooded regions: they must be lower than the HWFM water level, as the natural criterion for these zones (to be lower than the LWFM water level) was too restrictive. The adjusted DEM, DEM_ADJ, was obtained by subtracting an elevation offset from the DEM 1 pixels within each region and by turning into NODATA all pixels that do not fulfill the elevation constraints after the offset subtraction.
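The offset-then-constrain logic can be sketched per pixel as below. The rule table is hypothetical and much shorter than Table 1: the region names, the zero offset for the never-flooded class and the 9.9 m value are placeholders chosen for illustration, while the 13.5 m and 22.1 m levels are those quoted in the text.

```python
NODATA = None
LW_LEVEL, HW_LEVEL = 13.5, 22.1  # RL1 levels (m) at the two JERS-1 dates

# Hypothetical rules: region -> (offset in m, min elevation, max elevation)
RULES = {
    "never_flooded": (0.0, HW_LEVEL, float("inf")),
    "seasonally_flooded": (9.9, LW_LEVEL, HW_LEVEL),
    "permanently_flooded": (9.9, float("-inf"), HW_LEVEL),
}


def adjust_cell(elevation, region):
    """Subtract the region offset; return NODATA if the constraint fails."""
    offset, z_min, z_max = RULES[region]
    corrected = elevation - offset
    return corrected if z_min <= corrected <= z_max else NODATA


kept = adjust_cell(27.7, "seasonally_flooded")     # about 17.8 m, inside window
dropped = adjust_cell(35.0, "seasonally_flooded")  # 25.1 m, above HW level
too_low = adjust_cell(20.0, "never_flooded")       # below HW level
```

Cells returned as NODATA are not lost for good in this scheme: they are gaps that a later interpolation step has to fill.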
The vegetation elevation offsets adopted for each region are presented in Table 1. The elevation offset for open water is set to 0. The water outside the Solimões (ID62 class) was turned into NODATA values, whereas the Solimões (ID61) was kept unchanged. Similarly, the elevation offset for the herb and shrub regions was set to 0; shrubs are defined as small trees with heights lower than 5 m and a relatively small cover density [38]. The woodlands vegetation type (ID31) presents an open tree canopy with approximately 60% coverage [38]. In this condition, the elevation offset due to this type of vegetation is likely to vary from one pixel to another. As this class represents only a small percentage of the study area, we chose to maintain a 0 offset.
The elevation offset to be applied to forested regions was computed from the DEM 1 elevation distribution of the pixels overlapped by the respective classes to reduce the number of pixels that did not fulfill the elevation constraints after correction. The elevation distributions of the DEM 1 pixels overlapped by the never flooded forest (ID41), intermittently flooded forest (ID42) and always flooded forest (ID43) regions are shown in Figure 4a-c, respectively. The pixel elevation in the ID41 region must remain higher than 22.1 m (HWFM water level). The elevation distribution of the DEM 1 overlapped pixels is mono-modal (Figure 4a), with a median value of 26.7 m and with 33% of the pixels presenting an elevation lower than 22.1 m. As it was difficult to propose a vegetation offset that made sense while minimizing the quantity of pixels that would have been turned into NODATA after correction, the vegetation offset was kept undefined, and we turned only pixels less than 22.1 m into NODATA. The elevation distributions of the pixels overlapped by the ID42 and ID43 regions are mono-modal (Figure 4b,c), with median values of 27.7 m and 25.6 m, respectively. These regions represent 7.9% and 3.6% of the study zone window, respectively. For the ID42 class, we imposed a requirement that the median of the corrected elevation distribution be at the middle of the window of acceptable elevations (13.5 m, 22.1 m) for the ID42 region, that is, 17.8 m. Finally, the vegetation offset obtained for this region was 9.9 m. The ID43 area presents the same vegetation and histogram characteristics, and thus the same offset was used for both ID42 and ID43. A value of 9.9 m corresponds to 53% and 52% of the mean vegetation height deduced from the Simard et al. [39] product over the areas ID42 and ID43, respectively. The elevation distribution of the pixels overlapped by the ID51 region is bi-modal (Figure 4d). 
The region was split into two mono-modal sub-zones: pixels whose DEM 1 elevation was less than 30.0 m, denoted ID511, and the rest, denoted ID512. A comparison of the spatial distribution of these pixels with a Landsat image showed that ID511 was linked to deforested areas surrounding water bodies or to roads and that ID512 corresponded to forested areas. A subtraction of 14.70 m was used in the ID512 region, assuming the SRTM C-band penetration of 60% proposed by Carabajal and Harding [34]. However, the elevation range of the pixels located in ID511 was globally smaller and fit with the imposed elevation range. Consequently, the elevations of these pixels were kept unchanged in this zone.
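The ID42 offset follows from simple arithmetic: the median of the uncorrected distribution (27.7 m) minus the midpoint of the acceptable window (13.5 m to 22.1 m, i.e., 17.8 m) gives 9.9 m. A minimal sketch of that rule, with a hypothetical function name and a made-up sample whose median is 27.7 m:

```python
from statistics import median


def median_centering_offset(elevations, z_low, z_high):
    """Offset that moves the median elevation to the middle of [z_low, z_high]."""
    return median(elevations) - (z_low + z_high) / 2.0


# Hypothetical DEM 1 sample (m) with a median of 27.7 m, not data from the study.
sample = [24.0, 26.5, 27.7, 29.0, 31.2]
offset = median_centering_offset(sample, 13.5, 22.1)  # about 9.9 m
```

Centering the median in the window is a pragmatic choice: it minimizes neither error nor NODATA count in any formal sense, but it keeps roughly the central part of a mono-modal distribution inside the acceptable range.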
The correction for the influence of vegetation turned 31% of the whole study zone into NODATA. However, this removal was slightly counterbalanced by the introduction, during the interpolation process, of 181,741 points from the 90 GCP dataset, which represent 2.2% of the wetland zone.

DEM Elevation Interpolation
The second step in the DEM correction procedure consisted of the interpolation of DEM_ADJ using the ANUDEM v5.3 algorithm [41,42], implemented through the TOPO TO RASTER command in ArcGIS v10.1. This algorithm generates a hydrologically correct raster surface from points, in contrast to other interpolation techniques such as Inverse Distance Weighting or Kriging. The streamlines were generated by applying the commonly used D8 algorithm [68], which involved the following steps: filling sinks; generating flow direction; generating flow accumulation; generating the stream network. The river network was substantially improved with the help of a water/non-water classification obtained from Landsat images to produce a "Ground Truth" Stream Network (GTSN). The point elevation set was obtained by converting the DEM_ADJ raster to points after merging it with the 90 GCP dataset.
The optional parameter values adopted are given in Figure 3.
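The D8 steps mentioned above can be illustrated on a toy grid. This is a generic sketch of D8 flow direction and flow accumulation, not the GIS implementation used in the study; sink filling is omitted, the grid is assumed to have unit cell size, and all names are hypothetical.

```python
import math

NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_directions(dem):
    """For each cell, the (row, col) of its steepest downslope neighbor, or None."""
    rows, cols = len(dem), len(dem[0])
    target = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best = 0.0
            for dr, dc in NEIGHBORS:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    # Drop divided by distance (1 or sqrt(2) for diagonals).
                    slope = (dem[r][c] - dem[rr][cc]) / math.hypot(dr, dc)
                    if slope > best:
                        best, target[r][c] = slope, (rr, cc)
    return target


def flow_accumulation(dem, target):
    """Number of cells draining through each cell (the cell itself included)."""
    rows, cols = len(dem), len(dem[0])
    acc = [[1] * cols for _ in range(rows)]
    cells = sorted(
        ((dem[r][c], r, c) for r in range(rows) for c in range(cols)),
        reverse=True,
    )
    for _, r, c in cells:  # highest cells first, so upstream totals are final
        if target[r][c] is not None:
            rr, cc = target[r][c]
            acc[rr][cc] += acc[r][c]
    return acc


dem = [[9.0, 8.0, 7.0],
       [8.0, 6.0, 5.0],
       [7.0, 5.0, 3.0]]
directions = d8_directions(dem)
accumulation = flow_accumulation(dem, directions)
stream = [[a >= 3 for a in row] for row in accumulation]  # assumed threshold
```

A stream network is then the set of cells whose accumulation exceeds a chosen threshold; the threshold of 3 cells here is arbitrary.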

Quality Assessment of the Generated DEMs
Vertical Accuracy
The vertical accuracy was assessed by comparison against the 10 GCP dataset using 4 criteria. Setting x as the dataset consisting of the elevations of the DEM pixels that contain at least one GCP and y as the dataset consisting of the mean elevation of the 10 GCP points falling within each of these pixels, we computed the following statistics: the RMSE, the MEAN, the Standard Deviation (SD) of (x-y), and the ROUGHNESS, computed as the mean of the standard deviation of a 3×3 array of elevation spots.
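These four statistics can be sketched as follows. The use of the population standard deviation, the restriction of the roughness to interior cells and the function names are assumptions; the text does not specify these details.

```python
import math
from statistics import mean, pstdev


def vertical_stats(dem_values, gcp_values):
    """RMSE, MEAN and SD of the differences x - y between DEM and GCP elevations."""
    diffs = [x - y for x, y in zip(dem_values, gcp_values)]
    return {
        "RMSE": math.sqrt(mean(d * d for d in diffs)),
        "MEAN": mean(diffs),
        "SD": pstdev(diffs),  # assumed: population standard deviation
    }


def roughness(dem):
    """Mean, over interior cells, of the standard deviation in a 3x3 window."""
    rows, cols = len(dem), len(dem[0])
    sds = []
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [dem[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            sds.append(pstdev(window))
    return mean(sds)


# Hypothetical values (m), not data from the study.
stats = vertical_stats([10.0, 12.0, 14.0], [9.0, 12.0, 16.0])
flat = roughness([[5.0] * 3 for _ in range(3)])
```

With the population standard deviation, RMSE² equals MEAN² + SD², which gives a quick check on whether an error is dominated by bias or by scatter.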

Stream Network Assessment
For each generated DEM, the stream network was extracted following the D8 algorithm. The networks were compared through three characteristics: CONNECTIVITY (Boolean), which indicates whether the lake and igarapés are continuously connected; OUTLETS (Boolean), which indicates whether the outlet is well positioned; and the MATCHING PERCENTAGE TO THE GTSN. The latter is defined as the percentage of a stream network falling within a 200 m buffer surrounding the GTSN.
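The matching percentage can be illustrated with a short Python sketch. It assumes, purely for illustration, that both networks are represented as lists of (x, y) points in metres and uses a brute-force distance test rather than a GIS buffer operation; the function name and coordinates are hypothetical.

```python
import math


def matching_percentage(network_points, gtsn_points, buffer_m=200.0):
    """Percentage of extracted-network points lying within buffer_m of any GTSN point."""
    if not network_points:
        return 0.0
    inside = sum(
        1 for p in network_points
        if any(math.dist(p, q) <= buffer_m for q in gtsn_points)
    )
    return 100.0 * inside / len(network_points)


# Hypothetical coordinates (metres): three of the four points fall inside the buffer.
extracted = [(0.0, 0.0), (0.0, 150.0), (0.0, 500.0), (1000.0, 1000.0)]
reference = [(0.0, 0.0), (0.0, 400.0)]
score = matching_percentage(extracted, reference)
```

With densely sampled polylines, this point-based ratio approximates the length-based percentage that a buffer overlay would return.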

Flood Extent Assessment
In addition to HWFM and LWFM, we selected four Landsat images relatively free of clouds. Two images are representative of the LW period, with water elevations at the RL1 gauge of 11.5 m and 13.3 m on their acquisition dates. The third image is representative of the flushing period, with a corresponding water level of 18.4 m at the RL1 gauge. The fourth image is representative of the HW period, with a corresponding water level of 22.5 m at the RL1 gauge. The classifications derived from the images are denoted LFM11.5, LFM13.3, LFM18.4 and LFM22.5. Water areas were distinguished from non-water areas using the normalized difference ratio between the mid-infrared band TM5 and the visible band TM2, used by Toivonen et al. [69] to map open water areas over a 2.2 million km² portion of the western Amazon. A constant threshold, as used in Toivonen et al. [69], could not be applied to all Landsat images, so each image was associated with a specific threshold. To reduce the number of misclassified pixels, we required that a pixel classified as "water" remain classified as "water" in the images representing higher water levels. Conversely, pixels classified as "non-water" had to remain classified as "non-water" in the images representative of lower water levels.
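The classification and the consistency rule can be sketched in Python as follows. The exact form of the index is not given in the text, so the expression (TM2 − TM5) / (TM2 + TM5), the direction of the threshold test and all function names are assumptions. The sketch only flags pixels that violate the rule; it does not decide how they should be reclassified.

```python
def normalized_difference(tm2, tm5):
    """Assumed index form: (TM2 - TM5) / (TM2 + TM5); open water tends to score high."""
    total = tm2 + tm5
    return (tm2 - tm5) / total if total else 0.0


def classify_water(tm2_band, tm5_band, threshold):
    """Boolean water mask for one image, using an image-specific threshold."""
    return [
        [normalized_difference(g, m) > threshold for g, m in zip(g_row, m_row)]
        for g_row, m_row in zip(tm2_band, tm5_band)
    ]


def inconsistent_pixels(masks_low_to_high):
    """Pixels that are 'water' at one level but 'non-water' at the next higher level."""
    flagged = set()
    for lower, higher in zip(masks_low_to_high, masks_low_to_high[1:]):
        for r, (low_row, high_row) in enumerate(zip(lower, higher)):
            for c, (low, high) in enumerate(zip(low_row, high_row)):
                if low and not high:
                    flagged.add((r, c))
    return flagged


# Hypothetical reflectances for a one-row image.
mask = classify_water([[0.10, 0.05]], [[0.02, 0.20]], threshold=0.0)
suspect = inconsistent_pixels([[[True, False]], [[False, True]]])
```

Checking consecutive water levels is sufficient, because any pixel that is wet at a low level and dry at a higher one must switch from wet to dry between two adjacent levels.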
The flooded area can be roughly represented by the intersection of the DEM with the interpolated water surface, assuming that the water surface is horizontal in the floodplain. Hence, the extent of inundation can be computed from the DEM. Here, differences in level between upstream (SV2) and downstream (SV1) are in the range (−0.4 m, 0.3 m) for 90% of the data. The water level difference recorded at RL1 and RL2 is in the range (−0.2 m, 0.5 m) for 90% of the data. Finally, we estimate that the maximum slope that can appear in the floodplain is 2.5 cm/km. The water levels at the different stations distributed in the floodplain thus show that the horizontal assumption is reasonable.
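Under this horizontal-surface hypothesis, the flood extent reduces to a simple elevation threshold, as in the minimal Python sketch below. The function names are hypothetical, NODATA cells are represented by None, and a pixel size of 30 m (roughly 1 arc-second near the equator) is assumed. Hydraulic connectivity to the gauged water body is not checked.

```python
def flood_mask(dem, water_level):
    """Flag cells whose elevation lies at or below a horizontal water surface."""
    return [[z is not None and z <= water_level for z in row] for row in dem]


def flooded_area_km2(mask, pixel_size_m=30.0):
    """Flooded area, assuming square pixels of pixel_size_m (about 1 arc-second)."""
    n_flooded = sum(cell for row in mask for cell in row)
    return n_flooded * pixel_size_m ** 2 / 1e6


# Hypothetical elevations (m); None marks a NODATA cell.
dem = [[10.0, 12.0, None], [11.5, 13.0, 9.0]]
mask = flood_mask(dem, water_level=11.5)
area = flooded_area_km2(mask)
```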
The agreement between the flooding extent deduced from the DEM and from imagery was calculated using the following classical skill scores [36,70,71]: the threat score (TS), which measures the model accuracy with a perfect score of 100, and the bias index (BIAS), which indicates the type of error (overestimation or underestimation). Both scores are computed from three areas: a is the total area that is mapped as flooded both in the image and from the DEM, b is the area flooded in the DEM but not in the image, and c is the area flooded in the image but not in the DEM.

Biases Correction
Table 2 summarizes all the results relative to the biases correction. The replacement of EGM96 by EGM08 as the reference for the SRTMGL1 elevation resulted in lowering the terrain by a mean of 0.3 m (SD = 0.1 m). The Mann-Whitney U test rejected, at the 1% significance level, the hypothesis of equal means between SRTMGL1 and the overlapping altimetry and in situ elevations over bare soil. The interferometric bias was estimated to be −2.0 m (SD = 4.1 m). A vertical offset of this magnitude was applied to increase the SRTMGL1 elevations. This bias is half the value encountered by Rudorff et al. [26] in the Curuaí floodplain, located approximately 700 km downstream of the Janauacá floodplain. However, as reported by Rodriguez et al. [31], the interferometric bias is expected to vary from place to place.
The total correction, including the −2.0 m interferometric offset and the vegetation bias, has a mean value of 5.9 m (SD = 6.9 m) over the whole study area and 7.4 m (SD = 7.3 m) over the terra firma zone (ID51). As the SRTM elevation is located at approximately 40% of the distance from the canopy top to the ground [34], so that the offset represents about 60% of the canopy height, we estimated that the mean canopy height was 12.3 m (7.4 m / 0.6) over the terra firma zone. This value is consistent with the canopy heights found in studies of tree species in central Amazonian floodplain forests [72,73]. It is higher than the median vegetation offset of 1.4 m reported by Rudorff et al. [26] for the Curuaí floodplain, but that area includes a larger proportion of savanna and secondary vegetation than the Janauacá floodplain. It is also much smaller than the value of 22.4 m (SD = 11.8 m) estimated by Carabajal and Harding [34] or the values used in regional models (23 m and 17 m, suggested by Coe et al. [35] and Paiva et al. [36], respectively).
As mentioned earlier, Baugh et al. [25] proposed a procedure to remove vegetation based on the FCHM. This method is attractive because it does not require any in situ data, but in the case of the Janauacá floodplain, applying their methodology would have led to inconsistent results. First, as reported in Figure 5, where we present the difference between SRTMGL1 and DEM_COR as a percentage of the FCHM, a negative bias is found principally in floodable areas. In these areas, SRTMGL1 elevations are too low compared with GCPs, partly because of the interferometric bias. Second, we detected several inconsistencies between the SRTM, WM and FCHM products that hinder obtaining realistic results (Table 1). For example, vegetation height pixels overlapped by the herb WM classes have a mean vegetation height of 6.5 m, whereas the herb class should be considered as bare ground. Similarly, shrub lands present a mean vegetation height of 13.5 m, whereas Hess et al. [38] define them with a height below 5 m. The horizontal and vertical accuracy of each product can explain these inconsistencies, but we essentially attribute these errors to the resampling process at the SRTMGL1 resolution. In this sense, the method proposed by Baugh et al. [25] is more likely to be suitable for coarser-resolution studies. In the study area, the mean percentage of vegetation height subtracted from FCHM was 13%, with a large SD of 29%. The histogram of this percentage (Figure 5) presents two modes, −8% and 48%. The pixels from the (30%, 60%) class, which was the second most populated, are located in the upland of the catchment. For these pixels, the mean of the vegetation removal was close to the interval (50%, 60%) proposed by Baugh et al. [25]. On the banks, the mean percentage of vegetation removal was 29%. However, DEM_COR banks are raised by 0.5 m on average when all offsets (interferometric and vegetation) and interpolations are considered.
The corrections are more significant below SWBD, where SRTMGL1 was raised by a mean value of 3.3 m (SD = 3.3 m).

Vertical Accuracy Assessment
The vertical accuracy assessment against the 10 GCP dataset for the different generated DEMs and the original SRTMGL1, as well as the roughness criterion, are gathered in Table 3. Clearly, DEMs generated with in situ and altimetry data present a better vertical accuracy than the others. The mean of the differences between the 10 GCP and DEM elevations decreased from −1.2 m for SRTMGL1 to 0.1 m for DEM_COR. RMSE decreased from 4.7 m for SRTMGL1 to 1.7 m for DEM_COR. SD similarly decreased from 4.5 m for SRTMGL1 to 1.7 m for DEM_COR. Conversely, the descriptor values for DEM_COR_1 and DEM_COR_3, which do not include the 90 GCP dataset, remained similar to the values obtained for SRTMGL1, as RMSE and SD remain greater than 4 m. As expected, the roughness of the generated DEMs was significantly reduced, by approximately 50%, except for DEM_COR_1. The roughness reduction is partly explained by the interpolation process. DEM_COR_2 presented the lowest value as it was generated without the constraint of a stream network.
Table 3 presents the validation relative to the river network. Regarding horizontal agreement, DEM_COR presents the best matching with the GTSN (83%), against percentages below 60% in the case of SRTMGL1 and all other DEMs generated without the GTSN. All the extracted networks, except the one derived from DEM_COR_3, capture the connectivity between the different water bodies. However, only the networks extracted from DEM_COR and DEM_COR_1 presented both connectivity and the right outlet position. In the case of DEM_COR_3, generated with the GTSN input but without GCPs, two outlets were found. This result was partly due to the hydrological incoherence introduced by the vegetation removal process, which resulted in turning 31% of the pixels into NODATA values without any counterbalancing by GCPs.
However, DEM_COR_2, which was obtained without any drainage network as input to the ANUDEM algorithm, presented the worst GTSN matching index and failed the connectivity and outlet tests. Thus, the use of accurate streamlines as input data to generate the DEM through the ANUDEM algorithm is important.

Flood Extent Assessment
The values of the different skill scores (TS, BIAS) obtained comparing SRTMGL1 and the generated DEMs with flood maps are reported in Table 4.
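For reference, the two skill scores can be sketched in Python from the areas a, b and c defined in the methods. TS follows the usual threat score definition scaled to 100. The BIAS expression is an assumption chosen only to match the sign convention used in this section (positive when the DEM underestimates the flood extent, negative when it overestimates it) and may differ from the relation actually used. The function names and masks are hypothetical.

```python
def contingency_areas(dem_mask, image_mask):
    """Count cells flooded in both (a), in the DEM only (b) and in the image only (c)."""
    a = b = c = 0
    for dem_row, image_row in zip(dem_mask, image_mask):
        for in_dem, in_image in zip(dem_row, image_row):
            if in_dem and in_image:
                a += 1
            elif in_dem:
                b += 1
            elif in_image:
                c += 1
    return a, b, c


def threat_score(a, b, c):
    """Threat score: 100 means perfect agreement (requires a + b + c > 0)."""
    return 100.0 * a / (a + b + c)


def bias_index(a, b, c):
    """ASSUMED form: positive for underestimation, negative for overestimation by the DEM
    (requires a + c > 0)."""
    return 100.0 * (c - b) / (a + c)


# Hypothetical one-row masks: one hit, one DEM-only cell, one image-only cell.
a, b, c = contingency_areas([[True, True, False, False]], [[True, False, True, False]])
ts = threat_score(a, b, c)
bias = bias_index(a, b, c)
```

With this assumed form, the bias index is not bounded below by −100, which is compatible with strongly negative values being reported when the DEM floods a much larger area than the image.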
At low waters, compared with LWFM, all generated DEMs presented lower accuracy (TS) than SRTMGL1, except for DEM_COR_3. The flooding extent deduced from the generated DEMs was underestimated (positive BIAS), whereas SRTMGL1 presented a nearly 0% BIAS. However, the relatively good agreement of SRTMGL1 with LWFM is very likely due to the interferometric bias, which lowered the pixel elevations by 2 m. Indeed, DEM 1 and DEM_COR_1 corrected only for the interferometric bias presented a lower accuracy than SRTMGL1. Compared with LFM13.3, all generated DEMs presented slightly better accuracy than SRTMGL1, with TS values ranging from 42 to 48 for generated DEMs against 40 for SRTMGL1. The inundated area overestimation in SRTMGL1 (BIAS = −127) was greatly reduced in the other DEMs (BIAS ≥ −68). This trend was confirmed when comparing the DEMs with the LFM11.5. At this very low water level, DEM_COR presented the best accuracy, with a nearly 0 BIAS. During flushing, compared with LFM18.4, the best accuracies were obtained for SRTMGL1 and for DEMs generated without vegetation correction (DEM 1 and DEM_COR_1). For the remaining DEMs, the TS values were approximately 15% lower. The flood extent was overestimated by all DEMs (negative BIAS).
At high waters, compared with the HWFM, all generated DEMs presented better accuracy than SRTMGL1. Negative BIAS, in all cases, stressed an overestimation of the flood extent. Compared with the LFM22.5, the best accuracies were exhibited by SRTMGL1 and DEMs generated without vegetation correction (DEM 1 and DEM_COR_1). For the remaining DEMs, the TS values remained approximately 15% lower. The negative BIAS in all comparisons except for DEM_COR_1 showed an overestimation of the flood extent.
The spatial distribution of pixels contributing to overestimation or underestimation of the flood extent is shown in Figure 6 (DEM_COR was used for this purpose). At low water levels (Figure 6a,c,d), mis-flooded pixels were mainly located in the northwestern part of the study area, where bathymetric data indicated bottom elevations greater than 13.5 m. Some of these pixels very likely result from the limitation of retrieving the flooding extent from a simple water surface interpolation, as these flooded regions remained isolated from the water body where the water level used to interpolate the water surface is recorded (at RL1). This limitation may also explain some of the mis-flooded pixels (Figure 6a,b) encountered far upstream along the igarapés, where the inundated status of a pixel depends more on local runoff and seepage than on the floodplain inundation. Some of the pixels contributing to overestimation of the flood extent when DEMs were compared with flood maps derived from Landsat are likely due to misclassification, because of the difficulty of detecting water under vegetation with the Landsat imager. Indeed, most overestimated pixels lie in the upstream part of the catchment zone and around the Janauacá Lake (Figure 6e), where woody areas are encountered according to WM. This finding is confirmed by the fact that the accuracy of the DEMs increased significantly when they were compared with HWFM instead of LFM22.5.
In addition, as found by Rudorff et al. [26] for the Curuaí floodplain, our results suggested that LWFM very likely overestimates the flood extent in the Janauacá floodplain. The flooded area mapped from Landsat at 13.3 m only represented 45% of the inundation area derived from JERS-1, a difference that is hardly explained by Landsat misclassification. Moreover, all generated DEMs presented better accuracy against LFM13.3 than against LWFM, except for SRTMGL1, which strongly overestimated the extent of inundation, likely due to an interferometric bias of −2 m. Flood extent overestimation in the LWFM could also explain the relatively passable results at the LW level reported in other studies at larger spatial scales. For example, Wilson et al. [27] concluded that the accuracy (TS index) was less than 23 at the LW level, against 73 at the HW level. The matching result of Yamazaki et al. [40] reached 40 at the LW level against 60 at the HW level. On a global scale, Paiva et al. [36] found a model performance of 34 at the LW level, against 70 at the HW level. As mentioned by Hess et al. [38], the inundation mapping accuracies for SAR images acquired at low stages decrease as pixels are subjected to a signal mixture, especially pixels overlapping flooded vegetation and open-water surfaces.
Figure 7 highlights the tendencies of each flooding extent index when comparing each DEM against the different flooding extent maps. From a general perspective, the dispersion among products for each index decreased with increasing waters. This result suggests that at the HW level, the products are analogous in terms of flood extent. More specifically, accuracy increases with increasing water by a factor of 3 on average, and the BIAS value tends to 0. Thus, at the HW level, all the DEMs tend to present equal proportions of over- and underestimated pixels during the validation.
Finally, at the LW level, disparities between DEMs are important, highlighting that the flooding extent is sensitive to each modification of the DEM, and the proposed methodology clearly enhanced the DEM product. At the HW level, however, all DEMs present similar skill scores. Indeed, in the unforested regions, the methodology reduced the interferometric bias (−2 m), which is negligible compared to the tide (near 12.0 m). Over the forested zones located in the wetlands, interferometric and vegetation corrections compensate each other, as confirmed in Figure 5 (8% of the vegetation height is removed, corresponding to 0.5 m).

Implications for Morphological and Hydrological Characteristics
Table 5 summarizes the morphological and hydrological characteristics of the watershed delineated from SRTMGL1 and the generated DEMs in terms of total area, low and high water flooded areas, slopes, longest flow paths and concentration times. The catchment areas extracted from the different generated DEMs varied by 14% among the different products. The length of the longest flow path varies by 29%, from 68 km to 96 km, among the different DEMs. As expected, the slope is greatly reduced, ranging from 31 cm/km to 59 cm/km for all generated DEMs against 88 cm/km for SRTMGL1. We estimated the flooded area of each DEM by intersecting the interpolated water surfaces with the DEM floodplain (Table 5). At the 11.5 m water level, the DEM_COR flooding extent was estimated to be 23 km², i.e., 3% of the watershed. At the highest water level recorded (24.3 m in 2012), the DEM_COR flooding extent was estimated to be 391 km², i.e., 50% of the associated watershed. DEM 1 and DEM_COR_1 presented the lowest flooding extent, as they were raised by the positive interferometric offset whereas the vegetation offset was not removed from these DEMs. Apart from DEM_COR_1 and DEM 1, the dispersion between flood extents derived from the different DEMs was significantly reduced for water levels above 15.5 m. At a water level of 11.5 m, the dispersion around the mean value of the DEMs represents 83% of the mean flood extent. At 15.5 m, this dispersion is already reduced to 7%, and it is less than 4% for all higher water levels.
These variations in morphological characteristics may influence some classical parameters used to interpret the hydrological watershed response. In particular, the Kirpich concentration time, used in the MGB-IPH model [74], which has been intensively applied in the Amazon basin, varies by 95% among all the DEMs. It ranged from 26 h for SRTMGL1 to 50 h for DEM_COR. In local floodplain studies, where data are sparse, a classical approach is to approximate the local runoff by the linear formula Q = c × I × A_tf, where Q is the peak discharge (m³/yr), I is the rainfall intensity (m³/m²), A_tf is the emerged area within the catchment (m²), and c is the runoff coefficient. The value of c is fixed at 0.58, based on a basin-wide average and on local runoff coefficients from other lakes in the lowland basin [26,75,76]. The difference in terms of watershed and flooded area between SRTMGL1 and DEM_COR, for example, would lead to a reduction of 8% in runoff at low water level and to an increase of less than 1% at high water (Figure 8b) if SRTMGL1 was used instead of DEM_COR. Along the banks, the bias correction and interpolation raised the bank pixels by 0.5 m on average when comparing DEM_COR against SRTMGL1. Such a difference delays the diffusive overbank flow by a few days, typically less than 15 days considering an increase of water level of 4 cm/day in the Solimões River. It will also slightly reduce overbank flow.
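The sensitivity of these two quantities to the DEM can be illustrated with a short Python sketch. It uses the common SI form of the Kirpich formula, tc = 0.0195 L^0.77 S^−0.385 (tc in minutes, L in metres, S in m/m), which is an assumption, since the exact expression implemented in MGB-IPH is not given here. The pairing of the 68 km and 96 km flow paths and of the 88 cm/km and 31 cm/km slopes with SRTMGL1 and DEM_COR is also an assumption drawn from the ranges quoted above. The areas are the rounded values reported in the Conclusions, so the resulting percentages may differ slightly from those quoted in the text.

```python
def kirpich_hours(length_m, slope_m_per_m):
    """Kirpich concentration time in hours (common SI form, assumed here)."""
    minutes = 0.0195 * length_m ** 0.77 * slope_m_per_m ** -0.385
    return minutes / 60.0


def peak_runoff(c, intensity, emerged_area):
    """Linear runoff formula Q = c * I * A_tf."""
    return c * intensity * emerged_area


# Rounded watershed and flooded areas (km^2) reported for the two DEMs.
SRTM = {"watershed": 795.0, "low_water": 88.0, "high_water": 397.0}
COR = {"watershed": 786.0, "low_water": 23.0, "high_water": 391.0}


def relative_runoff_change(stage, c=0.58, intensity=1.0):
    """Relative change in Q when SRTMGL1 replaces DEM_COR (c and I cancel out)."""
    q_srtm = peak_runoff(c, intensity, SRTM["watershed"] - SRTM[stage])
    q_cor = peak_runoff(c, intensity, COR["watershed"] - COR[stage])
    return (q_srtm - q_cor) / q_cor


# Assumed pairing of flow-path length and slope with each DEM.
tc_srtm = kirpich_hours(68_000.0, 88e-5)  # 68 km, 88 cm/km
tc_cor = kirpich_hours(96_000.0, 31e-5)   # 96 km, 31 cm/km
change_lw = relative_runoff_change("low_water")
change_hw = relative_runoff_change("high_water")
```

Under these assumptions, the concentration times should land near the 26 h and 50 h reported above. Because the formula is linear, the relative runoff change depends only on the emerged areas, which is why the effect is marked at low water and nearly vanishes at high water.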

Usefulness of Bathymetric Data in the Correction Process
Collecting in situ bathymetric data over the whole Amazon basin is not feasible. The comparison between the different generated DEMs helps to highlight the improvement brought by the in situ bathymetric data. It is interesting to note that, apart from the LW level, the parameters derived from DEM_COR_3 and DEM_COR are relatively similar. In particular, for these two DEMs, watershed and flooding extent are comparable, as are the skill scores related to the flooding extent obtained at medium and high water levels. DEM_COR_1 also presented better skill scores and a more realistic watershed extent than DEM_COR_2, which included bathymetric data but not the GTSN in the ANUDEM interpolation step. Thus, as far as watershed hydrological characteristics and flooding extent are concerned, the use of in situ bathymetric data is not necessarily required as long as a realistic drainage network is provided. Indeed, altimetry data could be used to compute the interferometric bias with reasonable confidence if the study window is large enough to overlap a reasonable number of ICESat/GLAS points. The flooding extent at the LW stage might also be improved by selecting the minimum values of altimetry data for the pixels overlapped by SWBD, following a method similar to that proposed by Pfeffer et al. [77].

Conclusions
Hydrodynamic models are attractive tools for quantifying river-floodplain water exchanges and for studying water circulation patterns in the floodplain, but they require relatively high quality topography to produce realistic results. To date, the best free readily available topographic information for the lower Amazon basin is the most recent SRTMGL1 release, at 1 arc-second of resolution. However, this dataset still presents inconsistent elevations.
We proposed a method to remove some of the errors related to interferometric and vegetation bias, mobilizing in situ bathymetric data, altimetry data and flooding extent mapped from remote sensing.
The interferometric bias was estimated to be −2.0 m. The mean total offset correction over the study area was 5.9 m, and 7.4 m for pixels located in the upland region. As the SRTM elevation is located at approximately 40% of the distance from the canopy top to the ground, the latter value led to a mean canopy height of 12.3 m, which was reasonable considering the proportion of secondary forest in the region.
In a second step, unbiased elevations were interpolated using the ANUDEM v5.3 algorithm. Several DEMs were generated to control for the respective influence of using or not using a GCP dataset or a ground-truth drainage network in the interpolation process. The vegetation correction and the interpolation process made it possible to reduce the DEM roughness by almost 50%. As expected, using GCPs clearly improved the vertical accuracy: the RMSE value decreased from 4.7 m to less than 2 m for all DEMs that included the GCP dataset. The use of GCPs also improved the flooding extent predicted by the DEM, independently of using or not using a drainage network at low water levels. The correction method improved the agreement between the flooding extent derived from the DEMs and from remotely sensed products at low and high water levels (+10% and +27%, respectively), whereas accuracy at a medium water level was difficult to evaluate using the Landsat product. At medium and high water levels, the use of a ground truth drainage network as input to the interpolation algorithm permitted the algorithm to achieve relatively reasonable results regarding flooding extent and watershed hydrological characteristics, even with DEMs that did not include GCPs. This result is promising from the perspective of the application of the method, at least for hydrological studies at larger scales, as radar altimetry can likely furnish a sufficient GCP dataset to estimate the interferometric bias.
Finally, we investigated the influence of the different DEMs on the numerical retrieval of morphological watershed characteristics and consequently on some hydrological properties. Using the D8 algorithm, the best generated DEM (identified as DEM_COR in this study) according to our criteria led to a watershed area of 786 km², whereas we obtained 795 km² with SRTMGL1. The minimum flooding extent was 23 km², representing 3% of the generated DEM watershed, instead of 88 km² (i.e., 10% of the watershed) for SRTMGL1. The maximum flooding extent derived from the best generated DEM was 391 km², whereas we obtained 397 km² for SRTMGL1, a negligible difference during high waters. In both cases, the result represented 50% of the Janauacá catchment extent. The longest flow path deduced from the generated DEM was 30% greater than when deduced from SRTMGL1. These differences almost doubled the Kirpich concentration time, from 26 h when computed from SRTMGL1 to 50 h when computed from the generated DEM. In terms of local runoff and assuming a linear runoff formula, the difference in terms of the watershed and flooded area between SRTMGL1 and the generated DEM would lead to an increase of 10% in runoff at low water level and to a reduction of less than 1% at high water level if using the corrected DEM.