Accuracy Assessment and Correction of SRTM DEM Using ICESat/GLAS Data under Data Coregistration

Shuttle Radar Topography Mission (SRTM) digital elevation model (DEM) inherently suffers from various errors. Many previous works employed Geoscience Laser Altimeter System onboard the Ice, Cloud, and land Elevation Satellite (ICESat/GLAS) data to assess and enhance SRTM DEM accuracy. Nevertheless, data coregistration between the two datasets was commonly neglected in their studies. In this paper, an automated and simple three dimensional (3D) coregistration method (3CM) was introduced to align the 3-arc-second SRTM (SRTM3) DEM and ICESat/GLAS data over Jiangxi province, China. Then, accuracy evaluation of the SRTM3 DEM using ICESat/GLAS data with and without data coregistration was performed on different classes of terrain factors and different land uses, with the purpose of evaluating the importance of data coregistration. Results show that after data coregistration, the root mean square error (RMSE) and mean bias of the SRTM3 DEM are reduced by 14.4% and 97.1%, respectively. Without data coregistration, terrain aspects with a sine-like shape are strongly related to SRTM3 DEM errors; nevertheless, this relationship disappears after data coregistration. Among the six land uses, SRTM3 DEM produces the lowest accuracy in forest areas. Finally, by incorporating land uses, terrain factors and ICESat/GLAS data into the correction models, the SRTM3 DEM was enhanced using multiple linear regression (MLR), back propagation neural network (BPNN), generalized regression NN (GRNN), and random forest (RF), respectively. Results exhibit that the four enhancement models with data coregistration obviously outperform themselves without the coregistration. Among the four models, RF produces the best result, and its RMSE is about 3.1%, 2.7% and 11.3% lower than those of MLR, BPNN, and GRNN, respectively. Moreover, 146 Global Navigation Satellite System (GNSS) points over Ganzhou city of Jiangxi province were used to assess the accuracy of the RF-derived SRTM3 DEM. It is found that the DEM quality is improved and has a similar error magnitude to that relative to the ICESat/GLASS data.


Introduction
Digital elevation models (DEMs) are an indispensable input variable for geomorphological, hydrological, and ecological models in the applications, such as landslide detection [1], forest inventory [2] and flood hazard assessment [3]. The data source of DEMs includes field surveys, topographical maps, and remote sensing techniques [4,5]. At present, the remote-sensing-based techniques show significant advantages for producing high-accuracy and global-scale DEMs [6].
To date, DEMs with the most widespread use in geosciences were collected from the Shuttle Radar Topography Mission (SRTM) based on Interferometric Synthetic Aperture Radar (InSAR) technique [7]. Like other spatial datasets, SRTM DEM intrinsically contains various errors, which are associated with data acquisition sensors, measurement conditions on the ground and post-processing methodology [8].
With respect to spatial scale, the random component of SRTM DEM errors can be classified into speckle noise of short-wavelength (a few pixels), stripe noise of medium-wavelength (500 m-50 km) and absolute biases of long-wavelength (>20 km) [9,10]. SRTM DEM also suffers from systematic positive biases caused by vegetation canopy in forested areas [11]. Global assessment of SRTM DEM indicates that the magnitude of each height error component can reach to 10 m for 90% of the data [9]. These errors have a seriously negative impact on the success of SRTM DEM-related applications [12][13][14]. Thus, it is urgent to understand the spatial structure of SRTM DEM errors and then eliminate them to the greatest extent.
The accuracy of SRTM DEM has been fully assessed over different countries of the world. This is generally achieved by comparing SRTM DEM with high-quality DEMs [8,[15][16][17] or high-accuracy ground control points (GCPs) collected by Global Navigation Satellite System (GNSS) [9,[18][19][20][21][22]. However, field GCP collection is highly expensive and time-consuming, thereby being suited for small-scale areas. Some researchers suggested the use of Geoscience Laser Altimeter System (GLAS) onboard the Ice, Cloud, and land Elevation Satellite (ICESat/GLAS) data as GCPs [23][24][25][26][27][28][29][30], because ICESat/GLAS data are freely available and have a near-global coverage. More importantly, these data are highly accurate under good conditions. For example, Huber et al. [31] assessed the height errors of ICESat/GLAS data using a high-quality DEM under different terrain characteristics and land types, and found that the ICESat/GLAS data has the mean error of 0.35 m in bare and flat areas, and 1.34 m in bare areas. Nevertheless, almost all ICESat/GLAS data-based SRTM DEM accuracy assessment omitted the horizontal shift between the two datasets in previous researches, mainly based on the biased assumption that the effect of a subpixel misregistration on DEM accuracy is negligible [23,32]. In fact, this argument is untenable on steep slopes (s), since a small horizontal shift (a) can cause a large vertical error (∆h), where ∆h = a × tan(s). Van Niel et al. [33] also indicated that subpixel horizontal misalignment between two DEMs can result in greater difference than their true elevation difference.
In addition to the accuracy evaluation, plenty of researchers have focused on the correction of SRTM DEM by removing vegetation bias over vegetated areas. For example, Baugh et al. [34] subtracted the vegetation heights of a certain proportion from SRTM DEM to improve the performance of hydrodynamic model. Su and Guo [11] constructed a multivariate linear regression model using airborne Light Detection and Ranging (LiDAR)-derived vegetation information and DEM slope to correct SRTM DEM. Zhao et al. [35] extended this framework to the global-scale SRTM DEM correction using vegetation type, global canopy cover and vegetation height. O'Loughlin et al. [36] constructed a power model to produce bare-earth DEMs using global vegetation height, vegetation continuous field and ICESat/GLAS data. However, the aforementioned methods have two shortcomings: (i) some important terrain factors (e.g., aspect and slope) were not incorporated in the models [35], and (ii) simple linear models cannot fully capture the nonlinear relationship between SRTM DEM bias and its predictors [37].
Recently, Wendleder et al. [38] employed spherical harmonics to remove the long-wave height errors of SRTM DEM, where ICESat/GLAS data located in flat area were taken as GCPs. Similarly, Yue et al. [32] used ICESat/GLAS points to improve the accuracy of Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) GDEM based on an artificial neural network (ANN). Zhao et al. [24] corrected the ASTER data by interpolating the elevation differences between ICESat/GLAS and ASTER data, and then adding the error surface to ASTER GDEM. Yamazaki et al. [10] produced a global DEM using SRTM DEM, ASTER GDEM, Advanced Land Observing Satellite (ALOS) World 3 Dimensions 30 m (AW3D) DEM, global tree height map, global tree density map, and ICESat/GLAS data by removing all component errors of the spaceborne DEMs. Nevertheless, the horizontal offset between ICESat/GLAS data and the global DEMs was not taken into account.
To solve aforementioned problems, the objective of this paper is to evaluate the effect of data coregistration on the accuracy assessment and correction of SRTM DEM using ICESat/GLAS under different classes of terrain factors and different land uses, where an automated, simple and general Remote Sens. 2020, 12, 3435 3 of 22 3D data coregistration method (3CM) is introduced to perform data coregistration, and four models, including multiple linear regression (MLR), random forest (RF), generalized regression neural network (GRNN) and back propagation neural network (BPNN) are employed to correct SRTM DEM. The main contribution of this paper is to demonstrate the importance of data coregistration for SRTM DEM accuracy assessment and enhancement using ICESat/GLAS data, since this process was commonly ignored in almost all previous researches. Moreover, as far as we know, this is the first time that the 3CM is introduced to register SRTM DEM to ICESat/GLAS in the context of accuracy assessment and enhancement.
We organize the remainder of this paper as follows. Materials are introduced in Section 2. Methods for accuracy assessment and correction of SRTM DEM are presented in Section 3. Experimental results and discussion are provided in Section 4. Finally, Section 5 concludes the paper.

Study Area
Jiangxi province between 28 • 22 -29 • 45 N and 115 • 47 -116 • 45 E was taken as the study area (Figure 1a), because of its diverse terrain characteristics, large elevation range, and various land uses. It has an area of over 166,900 km 2 [39]. Most part of this province is characterized by hills and mountains, and its terrain is like a basin opening up in the north ( Figure 1b). Specifically, it is surrounded by mountains on the west, east, and south sides. Totally, mountainous regions cover 36% of the land surface and hilly lands cover 42%.
Remote Sens. 2020, 12, x FOR PEER REVIEW 3 of 23 3D data coregistration method (3CM) is introduced to perform data coregistration, and four models, including multiple linear regression (MLR), random forest (RF), generalized regression neural network (GRNN) and back propagation neural network (BPNN) are employed to correct SRTM DEM. The main contribution of this paper is to demonstrate the importance of data coregistration for SRTM DEM accuracy assessment and enhancement using ICESat/GLAS data, since this process was commonly ignored in almost all previous researches. Moreover, as far as we know, this is the first time that the 3CM is introduced to register SRTM DEM to ICESat/GLAS in the context of accuracy assessment and enhancement. We organize the remainder of this paper as follows. Materials are introduced in Section 2. Methods for accuracy assessment and correction of SRTM DEM are presented in Section 3. Experimental results and discussion are provided in Section 4. Finally, Section 5 concludes the paper.

Study Area
Jiangxi province between 28°22′-29°45′ N and 115°47′-116°45′ E was taken as the study area (Figure 1a), because of its diverse terrain characteristics, large elevation range, and various land uses. It has an area of over 166,900 km 2 [39]. Most part of this province is characterized by hills and mountains, and its terrain is like a basin opening up in the north ( Figure 1b). Specifically, it is surrounded by mountains on the west, east, and south sides. Totally, mountainous regions cover 36% of the land surface and hilly lands cover 42%.

Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM)
SRTM with the goal of DEM production was a cooperative project conducted by the National Aeronautics and Space Administration (NASA) and the National Geospatial-Intelligence Agency (NGA) [7]. The SRTM data was simultaneously measured using InSAR with C-band (5.6 cm) and Xband (3.1 cm) in February 2000. It covers the earth from 60° N to 56° S, approximately 80% of the global land surface [40]. The SRTM C-band data were used to produce 1-and 3-arc-seconds DEMs. The designed vertical and horizontal accuracies of SRTM DEM were specified to be 16 and 20 m, respectively [7]. SRTM DEM of the 3-arc-second was originally released in 2003 by Jet Propulsion Laboratory (JPL) of NASA (NASA-JPL). The latest available version (V4.1) was released in 2008 by Consortium for Spatial Information of the Consultative Group of International Agricultural Research (CGIAR-

Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM)
SRTM with the goal of DEM production was a cooperative project conducted by the National Aeronautics and Space Administration (NASA) and the National Geospatial-Intelligence Agency (NGA) [7]. The SRTM data was simultaneously measured using InSAR with C-band (5.6 cm) and X-band (3.1 cm) in February 2000. It covers the earth from 60 • N to 56 • S, approximately 80% of the global land surface [40]. The SRTM C-band data were used to produce 1-and 3-arc-seconds DEMs. The designed vertical and horizontal accuracies of SRTM DEM were specified to be 16 and 20 m, respectively [7]. The original SRTM DEM of the 1-arc-second was released in 2003 only available for the United States. It has become available in a phased manner for the other parts of the world since September 2014.
In our study, the 3-arc-second SRTM (SRTM3) DEM V4.1 (Figure 1b) was used due to its long-time use and high popularity. It was downloaded from the website (http://earthexplorer.usgs. gov/). The horizontal and vertical references of the SRTM data are WGS84 ellipsoid and EGM96 geoid, respectively.  11 October 2009. They were downloaded from the website (https://nsidc. org/data/icesat/). In this study, we adopted the GLAH14 data, since they are specifically designed for land-surface elevations. GLAH14 contains GLAS land elevation data that includes footprint centroid geolocation and reflectance plus the geodetic, instrument, and atmospheric corrections used. Totally, there were 233,478 ICESat/GLAS points in the study site ( Figure 1b).

Check Points
In order to show the accuracy of the enhanced SRTM3 DEM outside of ICESat/GLASS data, 146 check points collected by GNSS in 2010 were obtained. The GNSS points were randomly distributed in Ganzhou city, which is located in the south of Jiangxi province (Figure 1b). They were referenced to WGS84 ellipsoid with Huang-Hai Elevation System 1985. The reported height errors of these control points were lower than 5 cm.

Terrain Factors and Land Use
Many experiments indicated that terrain factors and land use have a great impact on SRTM DEM errors [8,17,23]. Thus, they should be considered for the enhancement of SRTM DEM. Here, terrain factors including DEM elevation (Figure 1b Here, terrain relief is the maximum elevation minus minimum elevation in the 21 × 21 kernel. The selection of the kernel size is based on the conclusion of Lang et al. [41]. In the paper, they first computed the terrain reliefs under different kernel sizes, and then the terrain reliefs with respect to kernel sizes were fitted with a curve. It is found that the curve has an obvious inflection point between the kernel sizes of 19 × 19 and 21 × 21, and becomes flat when the kernel size exceeds 21 × 21. Thus, it is regarded that the kernel size of 21 × 21 is more suited for the extraction of terrain reliefs.
The 30-m land use map (Figure 1f) in the study site was produced by Chinese Academy of Sciences in 2000, with an overall classification accuracy of 92.7% [42]. It includes six classes: cultivated land (27.1%), forestland (62.2%), grassland (4.4%), water area (4.1%), built-up land (1.7%), and unused land (0.5%). Here, unused land is a type of land use that is covered by barren sand, Gobi desert, saline soil, marsh or rock.
For the following analysis and computation, the three terrain factors, land use and the elevation of SRTM3 DEM were extracted for each ICESat/GLAS point based on the bilinear interpolation in ArcGIS 10.6, since the ICESat/GLAS footprint centroids may not locate on the corresponding pixel centers of the gridded surfaces.

Processing of ICESat/GLAS Data
ICESat/GLAS data and SRTM DEM must be homogenized, since the two datasets have different vertical datum. The vertical datum conversion for ICESat/GLAS data from the ellipsoidal height (H ICESat ) to the orthometric height of SRTM DEM (H SRTM ) was achieved by the equation (Figure 2): (1) where N is the geoid undulation and h offset is the vertical datum difference between the TOPEX and the WGS84 ellipsoids, which is a constant. Here, h offset = 0.707 m [43]. For each ICESat/GLAS elevation, the value of N in Equation (1) was contained in the product referred to as GLAH14 [44].

Processing of ICESat/GLAS Data
ICESat/GLAS data and SRTM DEM must be homogenized, since the two datasets have different vertical datum. The vertical datum conversion for ICESat/GLAS data from the ellipsoidal height (HICESat) to the orthometric height of SRTM DEM (HSRTM) was achieved by the equation (Figure 2): where N is the geoid undulation and hoffset is the vertical datum difference between the TOPEX and the WGS84 ellipsoids, which is a constant. Here, hoffset = 0.707 m [43]. For each ICESat/GLAS elevation, the value of N in Equation (1) was contained in the product referred to as GLAH14 [44]. To assure the accuracy of the selected ICESat/GLAS data, we adopted the following criteria as recommended by some studies [31,32,45]: signal width less than 25 m, peak number less than 6 and received energy lower than 10 FJ. Namely, ICESat/GLAS points which meet the aforementioned conditions were selected for subsequent operations. Outliers commonly exist in ICESat/GLAS data because of the impact of saturated waveforms [46], cloud and valley fog. Thus, elevation differences (EDs) between ICESat/GLAS points and the corresponding SRTM3 DEM values larger than 100 m were removed, as suggested by Carabajal and Harding [47]. Finally, in terms of ED, the ICESat/GLAS points beyond the three standard deviations of the mean were further removed based on the 3-sigma rule [48]. After above steps, 175,684 ICESat/GLAS points were left over the study site.
For the GNSS-based check points, we transformed their elevations (Hcontrol) from Huang-Hai Elevation System 1956 to that of SRTM DEM using the equation: where hoffset is the vertical offset between the two Elevation Systems. Here, Hoffset = 0.357 m [49]. To assure the accuracy of the selected ICESat/GLAS data, we adopted the following criteria as recommended by some studies [31,32,45]: signal width less than 25 m, peak number less than 6 and received energy lower than 10 FJ. Namely, ICESat/GLAS points which meet the aforementioned conditions were selected for subsequent operations. Outliers commonly exist in ICESat/GLAS data because of the impact of saturated waveforms [46], cloud and valley fog. Thus, elevation differences (EDs) between ICESat/GLAS points and the corresponding SRTM3 DEM values larger than 100 m were removed, as suggested by Carabajal and Harding [47]. Finally, in terms of ED, the ICESat/GLAS points beyond the three standard deviations of the mean were further removed based on the 3-sigma rule [48]. After above steps, 175,684 ICESat/GLAS points were left over the study site.
For the GNSS-based check points, we transformed their elevations (H control ) from Huang-Hai Elevation System 1956 to that of SRTM DEM using the equation: where h offset is the vertical offset between the two Elevation Systems. Here, H offset = 0.357 m [49].

Coregistration of 3-Arc-Second Shuttle Radar Topography Mission (SRTM3) DEM and ICESat/GLAS Data
As discussed above, the horizontal shift has an effect on the accuracy assessment of DEMs, especially in steep slope areas. Thus, the 3CM, proposed by Nuth and Kääb [50], was employed to eliminate the offset between ICESat/GLAS data and SRTM3 DEM. Data coregistration is based on the following equation: where ∆h is the elevation difference between the ICESat/GLAS point and the corresponding SRTM3 DEM value; s and ϕ represent terrain slope and aspect, respectively; x i , i = 1, 2, 3 respectively stand for horizontal shift magnitude, shift vector direction and mean bias (∆h) divided by the tangent of mean slope (s), i.e., x 3 = ∆h/tan(s).
Equation (3) shows that the inputs of 3CM include elevation differences between ICESat/GLAS points and the corresponding SRTM3 DEM pixels, terrain slopes and aspects. By solving Equation (3) with a nonlinear least squares fitting method (e.g., Levenberg-Marquardt), the parameters x i , i = 1, 2, 3 can be estimated. Thus, the mean horizontal shifts along the x and y directions (∆x and ∆y) are computed respectively as Moreover, the mean elevation bias is estimated as Based on above results, SRTM3 DEM is co-registered to ICESat/GLAS points by respectively shifting SRTM3 DEM in the x and y directions with the magnitudes of ∆x and ∆y, and then in the vertical direction with the magnitude of ∆h.
It can be concluded that 3CM has the following merits: (i) simplicity since it does not require other specialized software; (ii) robustness since it considers not only elevation differences between the two datasets, but also terrain parameters; (iii) universality since it can handle any two elevation datasets without the restriction of data collection sensors; and (iv) automation since it does not require manual intervention.

SRTM3 DEM Correction
To improve the quality of SRTM3 DEM using ICESat/GLAS data, the relationship between ICESat/GLAS elevations (H ICESat ), terrain factors and land uses was respectively simulated using MLR, RF, BPNN, and GRNN, and their results were compared and analyzed. The general relationship is expressed as where f (•) is a mapping function, quantitatively shows the relationship between dependent variable and its covariates; Lat and Lon represent latitude and longitude, respectively; Sl, Re, and As are slope, relief and aspect, respectively; Lu is the land-use type and H SRTM is the elevation of SRTM3 DEM at the ICESat/GLAS point. Since aspect degree is a direction, we used the sine and cosine values of the aspect in the model. Because land-use type is considered as categorical variable [51], five dummy variables were included in the model to represent the six land-use types. More specifically, taking the cultivated land (CL) as the reference, the other five land-use types including forestland (FL), grassland (GL), built-up land (BL), water area (WA), and unused land (UL) were used as dummy variables in the model. This indicates that the land-use type of a point can be described with five dummy variables. For example, if a point has the land-use type of forestland, it can be described as FL = 1 and other four dummy variables are 0 (i.e., GL = 0, BL = 0, WA = 0, and UL = 0). In comparison, if all five dummy variables are 0, the point has the land-use type of cultivated land.

Multiple Linear Regression (MLR)
The formulation of MLR is expressed as where β i (i = 0, . . . n) are the unknown coefficients to be computed and x = x 1 · · · x n represent the predictors of SRTM DEM accuracy, as shown in Equation (7). Generally, the linear model is effective to explain the linear relationship [51,52], yet has a poor performance for dealing with nonlinear problems. In our study, MLR was conducted in Statistical Package for the Social Sciences (SPSS) 22.0.

Back Propagation Neural Network (BPNN)
BPNN is one of the most popular ANNs for solving nonlinear problems. It generally includes three layers ( Figure 3): one input layer, one hidden layer and one output layer. Each layer has some neurons: the neuron numbers in the input and output layers equal to the input and output parameters, respectively. According to previous studies [53], the neuron number in the hidden layer ranges from 2 √ a + b to 2a + 1, where a and b are the neuron number in the input and output layers, respectively. In our study, based on the 10-fold cross valuation scheme [54], the neuron number in the hidden layer was set to 2a + 1, and the transfer functions for the hidden and output layers was set to 'tansig' and 'purelin', respectively. Moreover, the network consists of connections between the layers and each connection is assigned a weight that represents its relative importance. The BP algorithm was employed to train the multilayer networks for updating weights to minimize the loss function. if a point has the land-use type of forestland, it can be described as FL = 1 and other four dummy variables are 0 (i.e., GL = 0, BL = 0, WA = 0, and UL = 0). In comparison, if all five dummy variables are 0, the point has the land-use type of cultivated land.

Multiple Linear Regression (MLR)
The formulation of MLR is expressed as where βi (i = 0,… n) are the unknown coefficients to be computed and represent the predictors of SRTM DEM accuracy, as shown in Equation (7). Generally, the linear model is effective to explain the linear relationship [51,52], yet has a poor performance for dealing with nonlinear problems. In our study, MLR was conducted in Statistical Package for the Social Sciences (SPSS) 22.0.

Back Propagation Neural Network (BPNN)
BPNN is one of the most popular ANNs for solving nonlinear problems. It generally includes three layers ( Figure 3): one input layer, one hidden layer and one output layer. Each layer has some neurons: the neuron numbers in the input and output layers equal to the input and output parameters, respectively. According to previous studies [53], the neuron number in the hidden layer ranges from 2 a b + to 2a + 1, where a and b are the neuron number in the input and output layers, respectively. In our study, based on the 10-fold cross valuation scheme [54], the neuron number in the hidden layer was set to 2a + 1, and the transfer functions for the hidden and output layers was set to 'tansig' and 'purelin', respectively. Moreover, the network consists of connections between the layers and each connection is assigned a weight that represents its relative importance. The BP algorithm was employed to train the multilayer networks for updating weights to minimize the loss function.  In the study, BPNN was conducted in MATLAB R2019b using the commands of "newff", "train" and "sim" to create a feed-forward backpropagation network, train the network and simulate the network for a new input, respectively.
In the study, BPNN was conducted in MATLAB R2019b using the commands of "newff", "train" and "sim" to create a feed-forward backpropagation network, train the network and simulate the network for a new input, respectively.

Generalized Regression Neural Network (GRNN)
A GRNN includes four layers (Figure 4): input layer, pattern layer, summation layer and output layer. GRNN has the merit of highly computational speed [55], since its weights are pre-determined with a Gaussian function.  (9) where n is the number of training samples, zi is the output of the ith training sample, di is the Euclidean distance between the point x to be estimated and the ith training point x i , i.e., is the smoothing parameter of GRNN. Generally, the larger the value of the smoothing parameter λ, the smoother the function approximation will be. On the contrary, to fit data closely, a smaller value of λ should be used. Thus, the value of the smoothing parameter controls the performance of GRNN. Here, the optimal value of λ was tuned by the 10-fold cross validation.
In our study, GRNN was conducted in MATLAB R2019b using the commands of "newgrnn" and "sim" to design a network and simulate the network for a new input, respectively.

Random Forest (RF)
The RF model developed by Breiman [56] is an ensemble learning algorithm based on decision trees and bagging [57,58]. As shown in Figure 5, RF includes four stages: (i) selecting mtree subsamples from the original training samples using bootstrapping; (ii) building mtree independent decision trees The output of GRNN can be formulated as where n is the number of training samples, z i is the output of the ith training sample, d i is the Euclidean distance between the point x to be estimated and the ith training point x i , i.e., d i = x − x i 2 and λ is the smoothing parameter of GRNN.
Generally, the larger the value of the smoothing parameter λ, the smoother the function approximation will be. On the contrary, to fit data closely, a smaller value of λ should be used. Thus, the value of the smoothing parameter controls the performance of GRNN. Here, the optimal value of λ was tuned by the 10-fold cross validation.
In our study, GRNN was conducted in MATLAB R2019b using the commands of "newgrnn" and "sim" to design a network and simulate the network for a new input, respectively.

Random Forest (RF)
The RF model developed by Breiman [56] is an ensemble learning algorithm based on decision trees and bagging [57,58]. As shown in Figure 5, RF includes four stages: (i) selecting m tree subsamples from the original training samples using bootstrapping; (ii) building m tree independent decision trees with n cov covariates randomly chosen from all independent variables; (iii) performing prediction using each decision tree; and (iv) obtaining the final output by averaging the m tree predictions for regression or taking a simple majority vote for classification [59].
with ncov covariates randomly chosen from all independent variables; (iii) performing prediction using each decision tree; and (iv) obtaining the final output by averaging the mtree predictions for regression or taking a simple majority vote for classification [59]. It can be found that RF includes two parameters, i.e., the number of trees (mtree) in the forest and the number of covariates (ncov) in the random subsample. The bootstrapping method and the random selection of the subset of covariates reduces the redundancy of explanatory variables, thereby avoiding the overfitting problem [60]. Additionally, RF has the merits of simplicity, flexibility, ease of use, efficiency, stable computation and automatic rank of feature importance on the prediction, all of which make the method appealing for handling nonlinear problems.
In our test, the RF regression model was performed with the freely available codes, downloaded from the website (https://code.google.com/archive/p/randomforest-matlab/downloads), and the optimal values of the two parameters were determined by the 10-fold cross validation.

Accuracy Measures
Both root mean square error (RMSE) and mean error (ME) [61] were taken as the accuracy measure to assess the accuracy of SRTM3 DEM and the performance of the DEM correction models. RMSE and ME are formulated respectively as ( ) where n is the number of check points. For accuracy assessment, Oi and Si respectively represent the elevations of the ith ICESat/GLAS point and the corresponding SRTM3 DEM pixel. For evaluating the performance of DEM correction model, ICESat/GLAS points were randomly grouped into training and testing points with the proportions of 90% and 10%, respectively. More specifically, the former was first adopted to train the model and then the SRTM3 DEM was improved with the trained model. Finally, the testing points were employed to assess the accuracy of the enhanced SRTM3 DEM. It can be found that RF includes two parameters, i.e., the number of trees (m tree ) in the forest and the number of covariates (n cov ) in the random subsample. The bootstrapping method and the random selection of the subset of covariates reduces the redundancy of explanatory variables, thereby avoiding the overfitting problem [60]. Additionally, RF has the merits of simplicity, flexibility, ease of use, efficiency, stable computation and automatic rank of feature importance on the prediction, all of which make the method appealing for handling nonlinear problems.
In our test, the RF regression model was performed with the freely available codes, downloaded from the website (https://code.google.com/archive/p/randomforest-matlab/downloads), and the optimal values of the two parameters were determined by the 10-fold cross validation.

Accuracy Measures
Both root mean square error (RMSE) and mean error (ME) [61] were taken as the accuracy measure to assess the accuracy of SRTM3 DEM and the performance of the DEM correction models. RMSE and ME are formulated respectively as where n is the number of check points. For accuracy assessment, O i and S i respectively represent the elevations of the ith ICESat/GLAS point and the corresponding SRTM3 DEM pixel. For evaluating the performance of DEM correction model, ICESat/GLAS points were randomly grouped into training and testing points with the proportions of 90% and 10%, respectively. More specifically, the former was first adopted to train the model and then the SRTM3 DEM was improved with the trained model. Finally, the testing points were employed to assess the accuracy of the enhanced SRTM3 DEM. Thus, O and S respectively represent the sampled elevations and the corresponding values of the enhanced SRTM3 DEM. It is undoubtable that more check points can make the accuracy assessment and enhancement more reliable. Moreover, all ICESat/GLAS points can be easily downloaded from the website. Thus, from the point view of the rationality and practicability, all available ICESat/GLAS data were used for assessing and enhancing SRTM3 DEM in our study.
As described in previous Sections, the parameters of each machine learning method were tuned by the 10-fold cross validation in our study. More specifically, given one set of parameters, the 10-fold cross validation was performed as follows [54,62]: (i) averagely dividing the training points into 10 subsamples, and setting the counter variable i = 1, (ii) using the ith subsample for model validation and the other subsamples for model training, (iii) doing i = i + 1, (iv) repeating (ii)-(iii) until i > 10, (v) computing the accuracy measures like RMSE and ME. After several tests, the best parameter values correspond to the least RMSE or ME.

Overall Accuracy
Results indicate that based on data coregistration, the mean horizontal shifts of the SRTM3 DEM in the x and y directions with respect to ICESat/GLAS data are −17.588 m (move left) and −29.343 m (move down), respectively. In other words, the SRTM3 DEM relative to ICESat/GLAS data has a subpixel misregistration. In addition, the mean vertical shift of the SRTM3 DEM relative to ICESat/GLAS data is −2.107 m, demonstrating that it is averagely above ICESat/GLAS data. In fact, the horizontal shift between SRTM3 DEM and other DEMs or GCPs was also found in previous studies. For example, Rexer and Hirt [63] revealed an average −0.007 arc-seconds offset in North-South direction and −0.100 arc-seconds offset in East-West direction between ASTER GDEM and SRTM3 DEM over the Australian continent. Satgé et al. [23] found that compared to ICESat/GLAS points, SRTM3 DEM V4.1 has 3.7 and −7.3 m horizontal shifts in the NE-SW and NW-SE directions over the Altiplano watershed, respectively. Li et al. [64] indicated that the horizontal shifts between SRTM3 and ASTER DEMs at three sites of China range from 12.2 to 15.5 m and from 1.4 to 14 m in the xand y-directions, respectively. Figure 6 shows that before data coregistration, SRTM3 DEM has a negative bias with the ME of −2.079 m, exhibiting that the SRTM3 DEM is averagely higher than the ICESat/GLAS points. This result is expected as SRTM radar center is located somewhere between vegetation canopy and the bare-earth ground [11,28,47]. The computed RMSE of 11.740 m is within the specified accuracy of the SRTM DEM (i.e., 16 m). The RMSE in this study is approximate to the values in South America and other regions of China. For example, Huang et al. [29] showed that the SRTM3 DEM over the eastern Tibetan Plateau of China has the RMSE of 15.2 m. Du et al. [30] found that the RMSE of SRTM3 DEM over Pearl River Delta area of China is 14.01 m. Satgé et al. [23] reported that the RMSE of SRTM3 DEM over the Altiplano watershed is 11.1 m.
Remote Sens. 2020, 12, x FOR PEER REVIEW 12 of 23 Figure 6. Error histograms of SRTM3 DEM relative to ICESat/GLAS data before and after coregistration.
After the coregistration, the DEM vertical errors are symmetrically distributed around zero with the ME of −0.061 m, while the RMSE is 10.047 m. This indicates that data misregistration significantly affects the vertical accuracy of SRTM3 DEM. Specifically, after the data coregistration, the ME and RMSE of SRTM3 DEM are decreased by 97.1% and 14.4%, respectively. However, this data preprocess was commonly omitted in previous researches [32,43,44,65], which may result in a large RMSE. For After the coregistration, the DEM vertical errors are symmetrically distributed around zero with the ME of −0.061 m, while the RMSE is 10.047 m. This indicates that data misregistration significantly affects the vertical accuracy of SRTM3 DEM. Specifically, after the data coregistration, the ME and RMSE of SRTM3 DEM are decreased by 97.1% and 14.4%, respectively. However, this data preprocess was commonly omitted in previous researches [32,43,44,65], which may result in a large RMSE. For example, without the coregistration, Zhao et al. [44] demonstrated that the SRTM3 DEM relative to ICESat/GLAS data over the Shanbei Plateau, China has the RMSE of 20.72 m and the ME of −1.0 m. Figure 7a shows that without data coregistration, SRTM3 DEM errors with a sine-like shape are strongly associated with terrain aspects. This indicates the misalignment of SRTM3 DEM and ICESat/GLAS data, since a strong relationship between terrain aspects and DEM errors is a useful indicator of data misregistration [33]. Although this relationship was also found by some researchers [8,17,44], there is no doubt about its cause from data misalignment or inaccurate alignment. With the coregistration, this relationship seems to disappear (Figure 7b). Therefore, without the coregistration, terrain aspect might be wrongly taken as a dominant explanatory factor for SRTM3 DEM errors. Figure 7b shows that SRTM3 DEM obtains the lowest accuracy in the north direction (i.e., aspect values are around 0 • and 360 • ), and the highest accuracy in the south direction (i.e., aspect values are around 180 • ). This is attributed to the fact that for collecting elevation data using SAR technique, the data on the foreslopes (facing the sensor) have a high accuracy, while those on the backslopes are much poorly measured, especially in steep slope areas [66].  Table 1 and Figure 8 demonstrate the influence of terrain slope, relief and elevation classes on the MEs of SRTM3 DEM without and with data coregistration. It is obvious that no relationship is found between the ME and the three terrain factors on all classes, mainly because of the different penetration ability of SRTM radar signal under different environments. For all the classes without data coregistration, the MEs are about −2 m. After the coregistration, the systematic biases are significantly reduced. Moreover, the MEs on the first class (I) of the three terrain factors, and on the fourth class (IV) of elevation are positive. This is owed to the fact that these classes are mainly covered by water body and flat terrain, which have a smaller bias than the other land uses. In fact, the complex relationship between ME and the terrain slope was also reported by previous studies [17,44,65].   Table 1 and Figure 8 demonstrate the influence of terrain slope, relief and elevation classes on the MEs of SRTM3 DEM without and with data coregistration. It is obvious that no relationship is found between the ME and the three terrain factors on all classes, mainly because of the different penetration ability of SRTM radar signal under different environments. For all the classes without data coregistration, the MEs are about −2 m. After the coregistration, the systematic biases are significantly reduced. Moreover, the MEs on the first class (I) of the three terrain factors, and on the fourth class (IV) of elevation are positive. This is owed to the fact that these classes are mainly covered by water body and flat terrain, which have a smaller bias than the other land uses. In fact, the complex relationship between ME and the terrain slope was also reported by previous studies [17,44,65].    Figure 9 exhibit that for the classes II-V, there exists a strong positive relationship between RMSEs and terrain factors. However, for class I, the RMSEs are obviously smaller than those of the other classes. This is because class I has a flat terrain and SRTM3 DEM has a much higher accuracy in such terrain than in the other terrain types. The RMSEs with data coregistration are consistently smaller than those without the coregistration on all the classes. In comparison, SRTM3 V: (>20 • ) for slope class, (>400 m) for relief and elevation classes. "S", "R" and "E" stand for slope, relief, and elevation, respectively.). Table 1 and Figure 9 exhibit that for the classes II-V, there exists a strong positive relationship between RMSEs and terrain factors. However, for class I, the RMSEs are obviously smaller than those of the other classes. This is because class I has a flat terrain and SRTM3 DEM has a much higher accuracy in such terrain than in the other terrain types. The RMSEs with data coregistration are consistently smaller than those without the coregistration on all the classes. In comparison, SRTM3 DEM has the largest accuracy improvement on class V, since this class is mainly covered by dense forests, where the SRTM radar signal has the lowest penetration depth. Our results on the relationship between terrain factors and SRTM3 DEM errors are consistent with previously reported findings. For example, Szabó et al. [67] assessed the accuracy of SRTM3 V2 and V3 in two mountainous areas of Hungary and found that their error standard deviations (STDs) increased from 4.28 and 3.16 m on the slope class 0-5° to 9.8 and 8.02 m on the class >20°, respectively. Satgé et al. [23] demonstrated that the RMSE of SRTM3 DEM V4 over the Altiplano watershed in South America increases from 8.1 to 21.1 m when the slope class ranges from 0-2° to >20°. Zhang et al. [17] indicated that the STD of SRTM3 DEM changes from 4.29 m with the slope class 0-2° to 17.51 m of slope greater than 35° over the whole of China. Plenty of studies demonstrated that in addition to SRTM DEM, the vertical accuracy of other public DEMs also shows a strong negative relationship with terrain slope [19,65,[68][69][70]. Table 1 and Figure 10 show that land-use type has a great influence on SRTM3 DEM errors. More specifically, without data coregistration, the ME ranges from −0.70 m in water area to −2.55 m in cultivated land (Figure 10a), while the RMSEs from 1.78 m in unused land to 13.84 m in forested cover (Figure 10b). Namely, among the six land uses, forest area has the largest RMSE. This is an expected result, as tree trunks and limbs can cause irregular backscattering of SRTM radar signal, thereby magnifying the geometric distortions [43]. After data coregistration, the accuracy of SRTM3 DEM is significantly improved on almost all land uses, except for the water area. This is because the raw SRTM3 DEM has voids over water bodies, which were filled by other datasets in the production of SRTM3 DEM V4.1 [71]. Obviously, the data used for void filling has a low systematic bias (i.e., −0.70 Our results on the relationship between terrain factors and SRTM3 DEM errors are consistent with previously reported findings. For example, Szabó et al. [67] assessed the accuracy of SRTM3 V2 and V3 in two mountainous areas of Hungary and found that their error standard deviations (STDs) increased from 4.28 and 3.16 m on the slope class 0-5 • to 9.8 and 8.02 m on the class >20 • , respectively. Satgé et al. [23] demonstrated that the RMSE of SRTM3 DEM V4 over the Altiplano watershed in South America increases from 8.1 to 21.1 m when the slope class ranges from 0-2 • to >20 • . Zhang et al. [17] indicated that the STD of SRTM3 DEM changes from 4.29 m with the slope class 0-2 • to 17.51 m of slope greater than 35 • over the whole of China. Plenty of studies demonstrated that in addition to SRTM DEM, the vertical accuracy of other public DEMs also shows a strong negative relationship with terrain slope [19,65,[68][69][70]. Table 1 and Figure 10 show that land-use type has a great influence on SRTM3 DEM errors. More specifically, without data coregistration, the ME ranges from −0.70 m in water area to −2.55 m in cultivated land (Figure 10a), while the RMSEs from 1.78 m in unused land to 13.84 m in forested cover ( Figure 10b). Namely, among the six land uses, forest area has the largest RMSE. This is an expected result, as tree trunks and limbs can cause irregular backscattering of SRTM radar signal, thereby magnifying the geometric distortions [43]. After data coregistration, the accuracy of SRTM3 DEM is significantly improved on almost all land uses, except for the water area. This is because the raw SRTM3 DEM has voids over water bodies, which were filled by other datasets in the production of SRTM3 DEM V4.1 [71]. Obviously, the data used for void filling has a low systematic bias (i.e., −0.70 m). Thus, the fact that the biases in the slope class of 0-5 • , elevation and terrain relief classes of 0-100 m become positive after the coregistration (Figure 8) is mainly due to the large proportion of water bodies in the three classes. The poor performance of global DEMs in forested areas was also reported in other studies. For example, Hu et al. [72] assessed the accuracy of different public DEMs of Hubei province, China, and indicated that compared to built-up areas, wetland and cultivated land areas, forest area produces the lowest accuracy. Li and Zhao [65] evaluated the errors of AW3D30 in function of land cover, and The poor performance of global DEMs in forested areas was also reported in other studies. For example, Hu et al. [72] assessed the accuracy of different public DEMs of Hubei province, China, and indicated that compared to built-up areas, wetland and cultivated land areas, forest area produces the lowest accuracy. Li and Zhao [65] evaluated the errors of AW3D30 in function of land cover, and obtained that the forest gives the poorest accuracy, which is followed by grassland. Yap et al. [20] evaluated freely available latest 30-m DEMs over Cameroon and pointed out that all DEMs have the worst performance in the areas characterized by forests and shrubs. From our test and the previously reported results, it can be concluded that the forest areas have the poorest accuracy mainly due to the low penetration and irregular backscattering of SRTM radar signals. Table 2 shows the performance of MLR and the three machine learning methods for the correction of SRTM3 DEM with and without data coregistration. Note that the automated 3D data coregistration method (3CM) introduced in this paper can be also considered as a correction method, since after the coregistration, the DEM accuracy is improved. Thus, its performance is compared with those of the other correction models. It can be found that without data coregistration, MLR, BPNN, GRNN and RF have lower MEs and RMSEs than those of the original SRTM3 DEM (ME = −2.079 m and RMSE = 11.740 m). Namely, the four models improve the accuracy of the original SRTM3 DEM. However, in terms of ME, only MLR (ME = −0.045 m) and BPNN (ME = −0.029 m) outperform 3CM (ME = −0.061 m), while in terms of RMSE, only BPNN (RMSE = 9.793 m) yielded a better result than 3CM (RMSE = 10.047 m). With the data coregistration, almost all models perform better than 3CM, except for GRNN in terms of RMSE. Specifically, in terms of RMSE, MLR, BPNN, and RF are 5.4%, 5.8%, and 8.3% more accurate than 3CM, respectively, while GRNN is 3.2% less accurate than 3CM. The poor performance of GRNN might be attributed to the fact that its weighted computation (i.e., Equation (8)) is too simple to capture the nonlinear relationship between the predictors and SRTM3 DEM errors. The four models with data coregistration significantly outperform themselves without the coregistration. For instance, after data coregistration, the ME and RMSE of GRNN are reduced by 80.8% and 9.4%, respectively. Hence, data coregistration is essential for SRTM3 DEM correction.

Model Performance Comparison
In contrast, the four models with the data coregistration produce similar results for the bias elimination of SRTM3 DEM, as their absolute MEs are approximate to 0, especially for BPNN with ME = −0.003 m. With respect to RMSE, RF outperforms the other models. Specifically, RF is about 8.3%, 3.1%, 2.7%, and 11.3% more accurate than 3CM, MLR, BPNN, and GRNN, respectively. The high accuracy of RF is mainly attributed to its ensemble learning technique [56,73]. Moreover, the RF regression employed a bootstrap aggregation scheme to collect some data from all samples to train the model and other to validate the model. This strategy avoids the overfitting problem, which commonly confuses linear regression method and the classical machine learning methods [74].
The computational costs indicate that MLR has the fast speed, which is followed by 3CM. This is expected, as the two methods do not solve large linear systems. Among the three machine learning methods, RF has the largest computing cost, since it includes four stages, each of which is time-consuming.
Additionally, the corrected DEMs based on MLR and RF are compared, since the two models are representative of linear and nonlinear regressions, respectively. Figure 11 shows that the corrected DEMs based on MLR ( Figure 11b) and RF (Figure 11c) have a similar appearance to the original one ( Figure 11a). However, the high elevations of the two corrected DEMs (i.e., points with red color) are obviously less than those of the original DEM. This is expected as the elevations of the points located in forested areas are accurately corrected by the models. It is surprised to see that the maximum value of MLR-based DEM (i.e., 2153.8 m) is slightly larger than that of the original one (i.e., 2147.0 m). This seems unreasonable, as the highest point is more likely located on mountain, which is covered by forests, and these points should be lowered. In comparison, RF clearly lowers the highest elevation, indicating that the negative effect of the forest canopy on SRTM3 DEM was reduced. The difference between the RF-based and MLR-based DEMs (Figure 11d) shows that the elevations of RF in the forest areas are lower than those of MLR. This further indicates that RF is more effective than MLR for enhancing SRTM3 DEM, especially in forest areas. corrected DEMs based on MLR ( Figure 11b) and RF (Figure 11c) have a similar appearance to the original one (Figure 11a). However, the high elevations of the two corrected DEMs (i.e., points with red color) are obviously less than those of the original DEM. This is expected as the elevations of the points located in forested areas are accurately corrected by the models. It is surprised to see that the maximum value of MLR-based DEM (i.e., 2153.8 m) is slightly larger than that of the original one (i.e., 2147.0 m). This seems unreasonable, as the highest point is more likely located on mountain, which is covered by forests, and these points should be lowered. In comparison, RF clearly lowers the highest elevation, indicating that the negative effect of the forest canopy on SRTM3 DEM was reduced. The difference between the RF-based and MLR-based DEMs (Figure 11d) shows that the elevations of RF in the forest areas are lower than those of MLR. This further indicates that RF is more effective than MLR for enhancing SRTM3 DEM, especially in forest areas.  Finally, the GNSS-derived check points were employed to assess the accuracy of the RF-enhanced DEMs with and without data coregistration. Results (Table 3) show that the enhanced SRTM3 DEM with data coregistration is more accurate than that without the coregistration. Specifically, compared to the ME and RMSE without the coregistration, those with the coregistration are reduced by 74.5% and 8.2%, respectively. Moreover, the enhanced DEM accuracy relative to the GNSS points is slightly lower than that relative to the ICESat/GLASS data. This might be owed to the different locations of the two groups of the check points. Specifically, the GNSS points are located in Ganzhou city, while the ICESat/GLAS data are located in Jiangxi province. However, the local accuracy assessment is helpful for our understanding of the quality of the enhanced SRTM3 DEM. Therefore, it can be concluded that the accuracy of the enhanced DEM is improved by the RF method. Table 3. Accuracy assessment of the RF-enhanced DEMs with and without data coregistration relative to GNSS points (m).

Data Coregistration ME RMSE
Without −0.690 11.951 With −0.176 10.977 In previous works, the ICESat/GLAS-based correction of SRTM DEM was conducted under different areas. For example, Satgé et al. [23] improved SRTM3 DEM by removing its bias from ICESat/GLAS points over the Altiplano watershed in South America. Compared to the original SRTM3 DEM, the RMSE of the enhanced one was reduced by 22.5%. Zhao et al. [44] employed MLR to improve the SRTM3 DEM based on ICESAT/GLAS data over the Shanbei Plateau, China, where the RMSE was reduced by 54.4%. Wendleder et al. [38] used spherical harmonics to remove the error of SRTM3 DEM based on ICESat/GLAS data at a global scale, and found that the mean error was reduced by 66.7%. Similarly, Yue et al. [32] employed ANNs to improve the accuracy of ASTER GDEM with ICESat/GLAS data on a global scale. Therefore, the ICESat/GLAS-based correction of SRTM DEM can be achieved at different scales including regional-, watershed-and global-scales. It is obvious that their accuracy improvement levels are inconsistent. This is because the accuracy of the improved SRTM3 DEM seriously depends on terrain characteristics, land uses, forest types, and model performance.

Conclusions
Data coregistration between SRTM DEM and ICESat/GLAS data were commonly omitted in the context of DEM accuracy assessment and correction in previous researches. In this paper, a simple and general 3D data coregistration method (3CM) is introduced and used to co-register the two datasets, and then its influence on the accuracy of SRTM DEM is analyzed. Results show that the 3-arc-second SRTM (SRTM3) DEM has a subpixel misregistration relative to ICESat/GLAS data, with the mean horizontal shifts in the xand y-directions of −17.588 and −29.343 m, respectively. Moreover, the vertical shift is −2.107 m. After data coregistration, the mean bias and RMSE of SRTM3 DEM are −0.061 and 10.047 m, respectively, which are less than those without the coregistration. The effect of terrain factors including aspect, slope, elevation and relief, and land use types on the vertical accuracy of SRTM3 DEM is analyzed. The results demonstrate that there is a strong relationship between aspect and SRTM3 DEM errors before data coregistration; however, this relationship disappears after data coregistration. The vertical bias of SRTM3 DEM has no conclusive relationship with the three terrain factors, while the DEM RMSE linearly increases with the increase of terrain factors. Data coregistration greatly improves the accuracy of SRTM3 DEM on almost all the classes of the terrain factors. Among the six land use types, the forest areas produce the largest vertical errors.
The terrain factors and land uses were incorporated into the models including MLR, BPNN, GRNN, and RF for the correction of SRTM3 DEM. It is found that without data coregistration, the 3CM seems more accurate than the four models. However, after the coregistration, the performances of the four models are greatly improved. Among the four models, the RF method produces the best result, which is about 3.1%, 2.7%, and 11.3% more accurate than MLR, BPNN, and GRNN in terms of RMSE, respectively. However, it has the largest computational cost. Accuracy assessment of the RF-enhanced SRTM3 DEM relative to 146 GNSS points located in Ganzhou city also shows its quality improvement. Overall, our results clearly demonstrate the importance of data coregistration for accuracy assessment and enhancement of SRTM3 DEM using ICESat/GLAS data. Since all of the methods used in this paper are simple, general, and effective, they could be applied to other publicly available DEMs, such as ASTER GDEM, Advanced Land Observing Satellite (ALOS) World 3 Dimensions (3D) 30 m (AW3D) DEM, and TerraSAR-X add-on for Digital Elevation Measurements (TanDEM-X) DEM in the context of accuracy assessment and enhancement.