# Estimation of Forest Height Using Google Earth Engine Machine Learning Combined with Single-Baseline TerraSAR-X/TanDEM-X and LiDAR

## Abstract

[…] (R^{2} = 0.67, RMSE = 2.89 m) were achieved with the combination of coherence and decorrelation of volume scattering using the GBDT and RF algorithms. (2) In terms of machine learning methods, the GBDT algorithm proved suitable for estimating forest height. The most effective approach for forest height mapping combined the GBDT algorithm with coherence, decorrelation of volume scattering, and a small amount of LiDAR forest height data used as training data.

## 1. Introduction

## 2. Materials and Methods

#### 2.1. Study Area

#### 2.2. Data Acquisition and Processing

#### 2.2.1. TerraSAR-X/TanDEM-X Data

#### 2.2.2. Sentinel-2A Data

#### 2.2.3. SRTM Data

#### 2.2.4. ESA WorldCover 10 m 2020 Data

#### 2.2.5. Airborne LiDAR Data

- (1) Data Acquisition
- (2) Sample Point Collection

#### 2.3. Methods

#### 2.3.1. DSM-DEM Differencing Method

- (1) Extract a DSM from InSAR data, which includes vegetation height.
- (2) Obtain a high-precision DEM from LiDAR data.
- (3) Co-register the InSAR DSM and the LiDAR DEM to the same coordinate system. Subtract the LiDAR DEM from the InSAR DSM to obtain forest height.
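The differencing step can be sketched in a few lines of Python. This is a minimal illustration, not the processing chain used in the study: the function name `canopy_height`, the use of `None` as a no-data marker, the clipping of negative differences to zero, and the sample elevations are all assumptions made for the example, and the two grids are assumed to be co-registered already.

```python
from typing import List, Optional

Grid = List[List[Optional[float]]]  # None marks a no-data cell (assumed convention)


def canopy_height(dsm: Grid, dem: Grid, clip_negative: bool = True) -> Grid:
    """Subtract a terrain DEM from a surface DSM, cell by cell."""
    if len(dsm) != len(dem) or any(len(a) != len(b) for a, b in zip(dsm, dem)):
        raise ValueError("DSM and DEM must share the same grid shape")
    heights: Grid = []
    for dsm_row, dem_row in zip(dsm, dem):
        row: List[Optional[float]] = []
        for surface, terrain in zip(dsm_row, dem_row):
            if surface is None or terrain is None:
                row.append(None)  # propagate no-data
            else:
                diff = surface - terrain
                # Negative differences are physically implausible heights;
                # clipping them to zero is an assumption of this sketch.
                row.append(max(diff, 0.0) if clip_negative else diff)
        heights.append(row)
    return heights


# Made-up elevations in metres, for illustration only.
dsm = [[512.0, 520.5], [498.0, None]]
dem = [[500.0, 505.5], [499.0, 480.0]]
heights = canopy_height(dsm, dem)
print(heights)  # [[12.0, 15.0], [0.0, None]]
```

Passing `clip_negative=False` keeps the raw signed differences, which is more useful when inspecting bias in the DSM.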

#### 2.3.2. SINC Function Modeling Method

#### 2.3.3. Machine Learning Algorithms

- (1) CART: CART is a binary recursive partitioning technique: each non-leaf node splits its sample into two smaller samples, so every non-leaf node in the tree has two branches. It can be used for both regression and classification tasks; when used for regression, the resulting tree is referred to as a regression tree [42]. CART handles continuous data through binary splitting, selecting features and split points by minimizing the squared error criterion. In addition to the general advantages of decision tree models, such as simplicity and high accuracy, CART imposes no requirements on the probability distribution of the target and predictor variables. It can also handle missing values, reducing the bias caused by missing data [43,44]. In this study, the CART algorithm was implemented on the GEE cloud platform with the default parameter values: null for maxNodes and 1 for minLeafPopulation.
- (2) GBDT: GBDT is a boosting algorithm for ensemble learning proposed by Friedman [45]. Training is sequential: each weak learner is trained on the errors left by the learners before it. GBDT typically uses decision trees as the base weak learners. Each new tree is built along the negative gradient of the loss, so that it reduces the residual of all previously built trees. One decision tree is obtained at each training iteration, and the trained trees are combined iteratively to form a strong learner [46,47,48,49]. In this study, the GBDT algorithm was implemented on the GEE cloud platform. The following parameter settings were selected through iterative experiments: ntree = 160, shrinkage = 0.07.
- (3) RF: RF is a tree-based ensemble algorithm composed of many decision trees or regression trees, where each tree depends on the values of a randomly sampled vector and all trees share the same distribution [50,51,52,53]. When using the RF algorithm on the GEE platform, only two parameters need to be set: the number of trees to generate (ntree) and the number of variables used to split each node (Mtry). Through iterative experiments, ntree was set to 220 to avoid overfitting while maintaining accuracy. Mtry was left at its default, the square root of the number of input feature variables. maxNodes and minLeafPopulation were also left at their defaults, null and 1, respectively.
- (4) SVM: SVM is an algorithm based on statistical learning theory proposed by Vapnik. It is commonly used for small-sample nonlinear problems [54]. Its principle is to map linearly inseparable data into a higher-dimensional space and separate them with a hyperplane. By minimizing the structural risk, it improves generalization ability, so that effective statistical patterns can be obtained even from a limited number of samples [55,56,57,58]. In this study, the SVM algorithm was implemented on the GEE cloud platform, with the widely used radial basis function as the kernel function.
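To make the sequential residual fitting in GBDT concrete, the following toy sketch boosts one-split regression trees (stumps) under squared loss. It is not the GEE implementation used in the study: it handles a single feature, the coherence and height values are made up, and all function names are hypothetical. Only the default values `n_trees=160` and `shrinkage=0.07` mirror the settings reported above.

```python
import math
from typing import List, Tuple

Stump = Tuple[float, float, float]  # (threshold, value if x <= threshold, value otherwise)
Model = Tuple[float, float, List[Stump]]  # (initial prediction, shrinkage, stumps)


def fit_stump(x: List[float], residual: List[float]) -> Stump:
    """Return the one-split regression tree that minimises squared error."""
    values = sorted(set(x))
    if len(values) < 2:
        raise ValueError("need at least two distinct feature values to split")
    best_sse, best_stump = math.inf, None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, residual) if xi <= threshold]
        right = [r for xi, r in zip(x, residual) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if sse < best_sse:
            best_sse, best_stump = sse, (threshold, left_mean, right_mean)
    return best_stump


def fit_gbdt(x: List[float], y: List[float],
             n_trees: int = 160, shrinkage: float = 0.07) -> Model:
    """Boost stumps: each new stump is fitted to the current residuals."""
    base = sum(y) / len(y)  # start from the mean height
    current = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(n_trees):
        # For squared loss, the negative gradient is simply the residual.
        residual = [yi - ci for yi, ci in zip(y, current)]
        threshold, left_value, right_value = fit_stump(x, residual)
        stumps.append((threshold, left_value, right_value))
        current = [ci + shrinkage * (left_value if xi <= threshold else right_value)
                   for xi, ci in zip(x, current)]
    return base, shrinkage, stumps


def predict(model: Model, xi: float) -> float:
    """Sum the shrunken contributions of all stumps on top of the base value."""
    base, shrinkage, stumps = model
    total = base
    for threshold, left_value, right_value in stumps:
        total += shrinkage * (left_value if xi <= threshold else right_value)
    return total


def train_rmse(model: Model, x: List[float], y: List[float]) -> float:
    return math.sqrt(sum((yi - predict(model, xi)) ** 2 for xi, yi in zip(x, y)) / len(y))


# Made-up training pairs: lower coherence paired with taller forest.
coherence = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
height_m = [2.0, 4.0, 7.0, 11.0, 15.0, 20.0]
errors = [train_rmse(fit_gbdt(coherence, height_m, n_trees=n), coherence, height_m)
          for n in (0, 1, 160)]
print(errors)  # training RMSE shrinks as trees are added
```

The small shrinkage means each tree corrects only a fraction of the remaining residual, which is why many trees are needed. The training error here says nothing about accuracy on held-out samples.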

#### 2.3.4. Feature Combination and Performance Evaluation

[…] (R^{2}) and root mean square error (RMSE) were used to evaluate the accuracy of the feature combinations.
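Both metrics are straightforward to compute. The sketch below assumes R^{2} is the coefficient of determination, 1 − SS_res/SS_tot (some studies instead report the squared Pearson correlation, and the text here does not say which was used). The function names and sample values are illustrative only.

```python
import math
from typing import Sequence


def _check(observed: Sequence[float], predicted: Sequence[float]) -> None:
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error, in the units of the inputs (metres here)."""
    _check(observed, predicted)
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    _check(observed, predicted)
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up LiDAR heights and model predictions, in metres.
lidar = [2.0, 4.0, 6.0, 8.0]
estimate = [3.0, 4.0, 5.0, 8.0]
print(r_squared(lidar, estimate), rmse(lidar, estimate))
```

For these values SS_res = 2 and SS_tot = 20, so R^{2} is 0.9 and the RMSE is √0.5 ≈ 0.71 m.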

## 3. Results

#### 3.1. Validation of Forest Height Estimation Accuracy

#### 3.1.1. DSM-DEM Differencing Method and SINC Function Modeling Method

[…] R^{2} values of 0.38 and 0.23, respectively. The corresponding RMSE values are 4.34 m and 11.43 m. Both methods yield unsatisfactory estimates. The scatter plot of the DSM-DEM differencing method is more dispersed than that of the SINC function modeling method, and its underestimation becomes more evident as forest height increases. The scatter plot of the SINC function modeling method is relatively concentrated but shows significant overestimation at lower forest heights, resulting in a larger deviation from the LiDAR forest height.

#### 3.1.2. Feature Combination A

[…] R^{2} values of 0.51, 0.67, 0.67, and 0.54 for the CART, GBDT, RF, and SVM algorithms, respectively. The corresponding RMSE values are 3.74 m, 2.89 m, 2.89 m, and 3.44 m. The scatter plot of the CART algorithm is more dispersed than those of the other three machine learning algorithms. The GBDT and RF algorithms perform well in estimating the heights of low trees (0–7.5 m) and medium to high trees (15 m and above), while all machine learning algorithms show some scattering for trees between 7.5 m and 15 m. Some overestimation is observed when the LiDAR-derived forest height is close to 0 m, and underestimation becomes more pronounced for all machine learning algorithms as the LiDAR-derived forest height increases. In general, the GBDT and RF algorithms achieve better estimation accuracy, while the CART algorithm performs worst.

#### 3.1.3. Feature Combination B

[…] R^{2} values for the CART, GBDT, RF, and SVM algorithms are 0.40, 0.62, 0.62, and 0.46, respectively. The corresponding RMSE values are 4.22 m, 3.11 m, 3.12 m, and 3.70 m. Compared to feature combination A, the accuracy of forest height estimation decreases for every machine learning algorithm when feature combination B is used. The CART and SVM algorithms show the largest decreases in accuracy, with R^{2} decreasing by 0.11 and 0.08, respectively, and RMSE increasing by 0.48 m and 0.26 m, respectively. The scatter plot of the CART algorithm is more dispersed than with feature combination A, while the scatter plot of the SVM algorithm is concentrated in the range of shorter trees, with noticeable overestimation. In the range of medium to tall trees, all machine learning algorithms underestimate forest height. Overall, the GBDT and RF algorithms estimate forest height more accurately, while the CART and SVM algorithms perform relatively poorly.

#### 3.1.4. Feature Combination C

[…] R^{2} values for the CART, GBDT, RF, and SVM algorithms are 0.43, 0.65, 0.63, and 0.49, respectively. The corresponding RMSE values are 4.10 m, 2.91 m, 2.99 m, and 3.55 m.

#### 3.2. Large-Scale Forest Height Mapping

## 4. Discussion

#### 4.1. Deviation in Forest Height Estimation

#### 4.2. Comparison of Forest Height Estimation Methods

[…] R^{2} values and lower RMSE values, whereas the latter achieved lower R^{2} values and higher RMSE values. With feature combination C, accuracy improved relative to feature combination B but did not improve, and even decreased, relative to feature combination A. This indicates that interferometric information estimates forest height better than backscatter coefficient and local incidence angle information.

#### 4.3. Machine Learning Algorithm Variable Analysis

#### 4.4. Limitations and Prospects

## 5. Conclusions

- (1) Every combination of features and machine learning algorithm estimated forest height more accurately than the DSM-DEM differencing and SINC function modeling methods.
- (2) Across all machine learning algorithms, models based on interferometric information were more accurate than models based on backscatter coefficient and local incidence angle information, or on both sets of information combined.
- (3) The GBDT and RF algorithms both achieve accurate forest height estimation. GBDT is at least as accurate as RF for every feature combination, with R^{2} values of 0.67, 0.62, and 0.65, and corresponding RMSE values of 2.89 m, 3.11 m, and 2.91 m for feature combinations A, B, and C, respectively.
- (4) Interferometric information combined with machine learning algorithms has great potential in forest height estimation. The method proposed in this study allows the cost-effective estimation of forest height over large areas.

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest

## References

1. Vatandaslar, C.; Narin, O.G.; Abdikan, S. Retrieval of forest height information using spaceborne LiDAR data: A comparison of GEDI and ICESat-2 missions for Crimean pine (Pinus nigra) stands. Trees **2023**, 37, 717–731.
2. Chen, E.; Li, Z.; Pang, Y.; Tian, X. Average tree height extraction technique based on polarimetric synthetic aperture radar interferometry. For. Sci. **2007**, 66–70+145.
3. Pan, Y.; Birdsey, R.A.; Fang, J.; Houghton, R.; Kauppi, P.E.; Kurz, W.A.; Phillips, O.L.; Shvidenko, A.; Lewis, S.L.; Canadell, J.G.; et al. A large and persistent carbon sink in the world’s forests. Science **2011**, 333, 988–993.
4. Nikhil, S.; Danumah, J.H.; Saha, S.; Prasad, M.K.; Rajaneesh, A.; Mammen, P.C.; Ajin, R.S.; Kuriakose, S.L. Application of GIS and AHP method in forest fire risk zone mapping: A study of the Parambikulam tiger reserve, Kerala, India. J. Geovis. Spat. Anal. **2021**, 5, 14.
5. Amrutha, K.; Danumah, J.H.; Nikhil, S.; Saha, S.; Rajaneesh, A.; Mammen, P.C.; Ajin, R.S.; Kuriakose, S.L. Demarcation of forest fire risk zones in Silent Valley National Park and the effectiveness of forest management regime. J. Geovis. Spat. Anal. **2022**, 6, 8.
6. Wang, Y.; Lehtomäki, M.; Liang, X.; Pyörälä, J.; Kukko, A.; Jaakkola, A.; Liu, J.; Feng, Z.; Chen, R.; Hyyppä, J. Is field-measured tree height as reliable as believed–A comparison study of tree height estimates from field measurement, airborne laser scanning and terrestrial laser scanning in a boreal forest. ISPRS J. Photogramm. Remote Sens. **2019**, 147, 132–145.
7. Persson, H.J.; Ståhl, G. Characterizing uncertainty in forest remote sensing studies. Remote Sens. **2020**, 12, 505.
8. Fassnacht, F.E.; White, J.C.; Wulder, M.A.; Næsset, E. Remote sensing in forestry: Current challenges, considerations and directions. For. Int. J. For. Res. **2023**, cpad024.
9. Rodriguez, E.; Martin, J.M. Theory and design of interferometric synthetic aperture radars. In IEE Proceedings F (Radar and Signal Processing); IET Digital Library: London, UK, 1992; Volume 139, pp. 147–159.
10. Li, L.; Chen, E.; Li, Z.; Feng, Q.; Zhao, L. A Review on Forest Height and Above-ground Biomass Estimation based on Synthetic Aperture Radar. Remote Sens. Technol. Appl. **2016**, 31, 625.
11. Zhang, H.; Wang, C.; Zhu, J.; Fu, H.; Xie, Q.; Shen, P. Forest above-ground biomass estimation using single-baseline polarization coherence tomography with P-band PolInSAR data. Forests **2018**, 9, 163.
12. Soja, M.J.; Ulander, L.M. Digital canopy model estimation from TanDEM-X interferometry using high-resolution lidar DEM. In Proceedings of the 2013 IEEE International Geoscience and Remote Sensing Symposium-IGARSS, Melbourne, VIC, Australia, 21–26 July 2013; pp. 165–168.
13. Sadeghi, Y.; St-Onge, B.; Leblon, B.; Simard, M.; Papathanassiou, K. Mapping forest canopy height using TanDEM-X DSM and airborne LiDAR DTM. In Proceedings of the 2014 IEEE Geoscience and Remote Sensing Symposium, Quebec City, QC, Canada, 13–18 July 2014; pp. 76–79.
14. Cloude, S. Polarisation: Applications in Remote Sensing; OUP Oxford: Oxford, UK, 2009.
15. Cloude, S.R.; Chen, H.; Goodenough, D.G. Forest height estimation and validation using Tandem-X polinsar. In Proceedings of the 2013 IEEE International Geoscience and Remote Sensing Symposium-IGARSS, Melbourne, VIC, Australia, 21–26 July 2013; pp. 1889–1892.
16. Feng, Q.; Chen, E.; Li, Z.; Li, L.; Zhao, L. Forest Height Estimation from Airborne X-band Single-pass InSAR Data. Remote Sens. Technol. Appl. **2016**, 31, 551–557.
17. Caicoya, A.T.; Kugler, F.; Hajnsek, I.; Papathanassiou, K.P. Large-scale biomass classification in boreal forests with TanDEM-X data. IEEE Trans. Geosci. Remote Sens. **2016**, 54, 5935–5951.
18. Fan, Y.; Chen, E.; Li, Z.; Zhao, L.; Zhang, W.; Jin, Y.; Cai, L. Forest Height Estimation Method Using TanDEM-X Interferometric Coherence Data. For. Sci. **2020**, 56, 35–46.
19. Zhang, T.; Zhu, J.; Fu, H.; Wang, C. Forest height inversion with single-baseline TanDEM-X InSAR coherence. Acta Geod. Cartogr. Sin. **2022**, 51, 1931–1941.
20. Wang, H.; Zhao, Y.; Pu, R.; Zhang, Z. Mapping Robinia pseudoacacia forest health conditions by using combined spectral, spatial, and textural information extracted from IKONOS imagery and random forest classifier. Remote Sens. **2015**, 7, 9020–9044.
21. Zhao, J.; Zhang, Z.; Han, S.; Qu, C.; Yuan, Z.; Zhang, D. SVM based forest fire detection using static and dynamic features. Comput. Sci. Inf. Syst. **2011**, 8, 821–841.
22. Singh, S.K.; Srivastava, P.K.; Gupta, M.; Thakur, J.K.; Mukherjee, S. Appraisal of land use/land cover of mangrove forest ecosystem using support vector machine. Environ. Earth Sci. **2014**, 71, 2245–2255.
23. Fassnacht, F.E.; Hartig, F.; Latifi, H.; Berger, C.; Hernández, J.; Corvalán, P.; Koch, B. Importance of sample size, data type and prediction method for remote sensing-based estimations of aboveground forest biomass. Remote Sens. Environ. **2014**, 154, 102–114.
24. Chen, G.; Hay, G.J.; St-Onge, B. A GEOBIA framework to estimate forest parameters from lidar transects, Quickbird imagery and machine learning: A case study in Quebec, Canada. Int. J. Appl. Earth Obs. Geoinf. **2012**, 15, 28–37.
25. Gu, C.; Clevers, J.G.; Liu, X.; Tian, X.; Li, Z.; Li, Z. Predicting forest height using the GOST, Landsat 7 ETM+, and airborne LiDAR for sloping terrains in the Greater Khingan Mountains of China. ISPRS J. Photogramm. Remote Sens. **2018**, 137, 97–111.
26. García, M.; Saatchi, S.; Ustin, S.; Balzter, H. Modelling forest canopy height by integrating airborne LiDAR samples with satellite Radar and multispectral imagery. Int. J. Appl. Earth Obs. Geoinf. **2018**, 66, 159–173.
27. Li, W.; Niu, Z.; Shang, R.; Qin, Y.; Wang, L.; Chen, H. High-resolution mapping of forest canopy height using machine learning by coupling ICESat-2 LiDAR with Sentinel-1, Sentinel-2 and Landsat-8 data. Int. J. Appl. Earth Obs. Geoinf. **2020**, 92, 102163.
28. Brigot, G.; Simard, M.; Colin-Koeniguer, E.; Boulch, A. Retrieval of forest vertical structure from PolInSAR data by machine learning using LIDAR-derived features. Remote Sens. **2019**, 11, 381.
29. Xie, Y.; Fu, H.; Zhu, J.; Wang, C.; Xie, Q. A LiDAR-aided multibaseline PolInSAR method for forest height estimation: With emphasis on dual-baseline selection. IEEE Geosci. Remote Sens. Lett. **2019**, 17, 1807–1811.
30. Pourshamsi, M.; Garcia, M.; Lavalle, M.; Balzter, H. A machine-learning approach to PolInSAR and LiDAR data fusion for improved tropical forest canopy height estimation using NASA AfriSAR Campaign data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. **2018**, 11, 3453–3463.
31. Pourshamsi, M.; Garcia, M.; Lavalle, M.; Pottier, E.; Balzter, H. Machine-learning fusion of PolSAR and LiDAR data for tropical forest canopy height estimation. In Proceedings of the IGARSS 2018–2018 IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 22–27 July 2018; pp. 8108–8111.
32. Pourshamsi, M.; Xia, J.; Yokoya, N.; Garcia, M.; Lavalle, M.; Pottier, E.; Balzter, H. Tropical forest canopy height estimation from combined polarimetric SAR and LiDAR using machine-learning. ISPRS J. Photogramm. Remote Sens. **2021**, 172, 79–94.
33. Weber, M. TerraSAR-X and TanDEM-X: Reconnaissance applications. In Proceedings of the 2007 3rd International Conference on Recent Advances in Space Technologies, Istanbul, Turkey, 14–16 June 2007; pp. 299–303.
34. Krieger, G.; Moreira, A.; Fiedler, H.; Hajnsek, I.; Werner, M.; Younis, M.; Zink, M. TanDEM-X: A satellite formation for high-resolution SAR interferometry. IEEE Trans. Geosci. Remote Sens. **2007**, 45, 3317–3341.
35. Papathanassiou, K.P.; Cloude, S.R. Single-baseline polarimetric SAR interferometry. IEEE Trans. Geosci. Remote Sens. **2001**, 39, 2352–2363.
36. Cloude, S.R.; Papathanassiou, K.P. Three-stage inversion process for polarimetric SAR interferometry. IEE Proc.-Radar Sonar Navig. **2003**, 150, 125–134.
37. Kugler, F.; Lee, S.K.; Hajnsek, I.; Papathanassiou, K.P. Forest height estimation by means of Pol-InSAR data inversion: The role of the vertical wavenumber. IEEE Trans. Geosci. Remote Sens. **2015**, 53, 5294–5311.
38. Duque, S.; Balls, U.; Rossi, C.; Fritz, T.; Balzer, W. TanDEM-X. Ground Segment. CoSSC Generation and Interferometric Considerations. Issue: 1.0; Deutsches Zentrum fuer Luft-und Raumfahrt (DLR): Oberpfaffenhofen, Germany, 2012.
39. Fritz, T. TanDEM-X. Ground Segment. TanDEM-X Experimental Product Description. Issue: 1.2; Deutsches Zentrum fuer Luft-und Raumfahrt (DLR): Oberpfaffenhofen, Germany, 2012.
40. Martone, M.; Bräutigam, B.; Rizzoli, P.; Gonzalez, C.; Bachmann, M.; Krieger, G. Coherence evaluation of TanDEM-X interferometric data. ISPRS J. Photogramm. Remote Sens. **2012**, 73, 21–29.
41. Kugler, F.; Schulze, D.; Hajnsek, I.; Pretzsch, H.; Papathanassiou, K.P. TanDEM-X Pol-InSAR performance for forest height estimation. IEEE Trans. Geosci. Remote Sens. **2014**, 52, 6404–6422.
42. Dong, H.; Xu, H.; Lu, B.; Yang, Q. A CART-based approach to predict nitrogen oxide concentration along urban traffic roads. Acta Sci. Circumstantiae **2019**, 39, 1086–1094.
43. Li, Z.; Du, J.; Zhou, Y. Rainfall prediction model based on improved CART algorithm. Mod. Electron. Tech. **2020**, 43, 133–137+141.
44. Guan, Y.; Wang, W.; Liu, S. Building and application of summer high temperature prediction model based on CART algorithm. J. Meteorol. Sci. **2018**, 38, 539–544.
45. Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. **2001**, 29, 1189–1232.
46. Sun, J. Surface Water Information Extraction from High Resolution Remotely Sensed Image Based on Integrated Learning; Jilin University: Jilin, China, 2020.
47. Zhang, W.; Wei, Q.; Wu, T.; Lin, J.; Shao, G.; Ding, M. Prediction models of reference crop evapotranspiration based on gradient boosting decision tree (GBDT) algorithm in Jiangsu province. Jiangsu J. Agric. Sci. **2020**, 36, 1169–1180.
48. Wu, W.; Wang, J.; Huang, Y.; Zhao, H.; Wang, X. A novel way to determine transient heat flux based on GBDT machine learning algorithm. Int. J. Heat Mass Transf. **2021**, 179, 121746.
49. Paudel, D.; Boogaard, H.; De Wit, A.; Van Der Velde, M.; Claverie, M.; Nisini, L.; Janssen, S.; Osinga, S.; Athanasiadis, I.N. Machine learning for regional crop yield forecasting in Europe. Field Crops Res. **2022**, 276, 108377.
50. Breiman, L. Random forests. Mach. Learn. **2001**, 45, 5–32.
51. Wang, L.; Zheng, G.; Guo, Y.; He, J.; Cheng, Y. Prediction of Winter Wheat Yield Based on Fusing Multi-source Spatio-temporal Data. Trans. Chin. Soc. Agric. Mach. **2022**, 53, 198–204+458.
52. Odebiri, O.; Mutanga, O.; Odindi, J.; Peerbhay, K.; Dovey, S. Predicting soil organic carbon stocks under commercial forest plantations in KwaZulu-Natal province, South Africa using remotely sensed data. GIScience Remote Sens. **2020**, 57, 450–463.
53. Lin, Z.; Yao, J.; Su, X.; Cai, Z.; Liu, D. Extracting planting information of early rice using MODIS index and random forest in Jiangxi Province, China. Trans. Chin. Soc. Agric. Eng. **2022**, 38, 197–205.
54. Vapnik, V. Estimation of Dependences Based on Empirical Data; Springer: New York, NY, USA, 1982. (In Russian)
55. Zhang, R.; Sun, D.; Li, S.; Yu, Y. A stepwise cloud shadow detection approach combining geometry determination and SVM classification for MODIS data. Int. J. Remote Sens. **2013**, 34, 211–226.
56. Chu, Y.; Liu, C.; Tai, W.; Yang, H. Prediction model of TOC contents in source rocks with different salinity degrees based on Support Vector Machine (SVM). Pet. Geol. Exp. **2022**, 44, 739–746.
57. Vapnik, V.N. The Nature of Statistical Learning Theory; Springer: New York, NY, USA, 1995.
58. Guo, J.; Long, H.; He, J.; Mei, X.; Yang, G. Predicting soil organic matter contents in cultivated land using Google Earth Engine and machine learning. Trans. Chin. Soc. Agric. Eng. **2022**, 38, 130–137.
59. Karila, K.; Vastaranta, M.; Karjalainen, M.; Kaasalainen, S. Tandem-X interferometry in the prediction of forest inventory attributes in managed boreal forests. Remote Sens. Environ. **2015**, 159, 259–268.
60. Nandy, S.; Srinet, R.; Padalia, H. Mapping forest height and aboveground biomass by integrating ICESat-2, Sentinel-1 and Sentinel-2 data using Random Forest algorithm in northwest Himalayan foothills of India. Geophys. Res. Lett. **2021**, 48, e2021GL093799.

**Figure 6.** Forest height predicted by regression using feature combination A and different algorithms: (**a**) LiDAR, (**b**) CART, (**c**) GBDT, (**d**) RF, (**e**) SVM.

**Figure 8.** Forest height predicted by regression using feature combination B and different algorithms: (**a**) LiDAR, (**b**) CART, (**c**) GBDT, (**d**) RF, (**e**) SVM.

**Figure 10.** Forest height predicted by regression using feature combination C and different algorithms: (**a**) LiDAR, (**b**) CART, (**c**) GBDT, (**d**) RF, (**e**) SVM.

**Figure 12.** Forest canopy height distribution maps produced using feature combination A with the GBDT algorithm, along with a comparison to optical remote sensing images from Google Earth: (**a**) forest height; (**b**) zoomed-in map of the mountainous area; (**c**) zoomed-in map of the lake area.

**Figure 13.** Histograms depicting the difference between estimated forest heights and LiDAR-derived forest heights using various methods: (**a**) based on DSM-DEM, SINC; (**b**) based on feature combination A; (**c**) based on feature combination B; (**d**) based on feature combination C. The x-axis represents the difference in forest heights, where positive values indicate that the regression-predicted forest height is higher than the LiDAR-measured forest height, and negative values indicate that it is lower. The y-axis represents pixel density.

**Figure 14.** Feature importance ranking: (**a**) Feature Combination A; (**b**) Feature Combination B; (**c**) Feature Combination C.

| Acquisition Time | Height of Ambiguity (m) | Effective Baseline (m) | Incidence Angle (°) | Polarization Mode | k_{z} (rad/m) | Resolution (Rg × Az) (m) |
|---|---|---|---|---|---|---|
| 2020-10-07 | 32.7 | 203.4 | 40.6 | HH | 0.19 | 2.71 × 3.30 |
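The height of ambiguity and the vertical wavenumber listed in the table above are linked by the standard InSAR relation HoA = 2π/|k_{z}|. The short check below illustrates that relation with the tabulated values. The function names are hypothetical, and the result only confirms that the two columns agree to within rounding.

```python
import math


def height_of_ambiguity(kz_rad_per_m: float) -> float:
    """Height change (m) that produces a full 2*pi interferometric phase cycle."""
    return 2.0 * math.pi / abs(kz_rad_per_m)


def vertical_wavenumber(hoa_m: float) -> float:
    """Vertical wavenumber k_z (rad/m) for a given height of ambiguity."""
    return 2.0 * math.pi / abs(hoa_m)


kz_from_hoa = vertical_wavenumber(32.7)    # about 0.192 rad/m, tabulated as 0.19
hoa_from_kz = height_of_ambiguity(0.19)    # about 33.07 m, tabulated as 32.7
print(round(kz_from_hoa, 3), round(hoa_from_kz, 2))
```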

| Data | Count | Maximum Value (m) | Minimum Value (m) | Average Value (m) | Standard Deviation (m) | Number of Samples in the Range of 0–10 m | Number of Samples in the Range of 10–20 m | Number of Samples in the Range of 20–30 m |
|---|---|---|---|---|---|---|---|---|
| LiDAR forest height sample points | 6225 | 24.7 | 0.1 | 8.2 | 5.0 | 4193 | 2007 | 25 |

**Table 3.** Feature combinations of interferometric, backscatter coefficient, and local incidence angle information.

| Feature Combination | Features |
|---|---|
| A | Interferometric features (coherence, decorrelation of volume scattering) |
| | Spectral features (fraction of vegetation cover) |
| | Topographic features (slope, aspect, elevation) |
| B | Backscatter coefficient, local incidence angle |
| | Spectral features (fraction of vegetation cover) |
| | Topographic features (slope, aspect, elevation) |
| C | Interferometric features (coherence, decorrelation of volume scattering) |
| | Backscatter coefficient, local incidence angle |
| | Spectral features (fraction of vegetation cover) |
| | Topographic features (slope, aspect, elevation) |

| | DSM-DEM | SINC |
|---|---|---|
| R^{2} | 0.38 | 0.23 |
| RMSE (m) | 4.34 | 11.43 |

**Table 5.** Comparison of forest height estimation performance using different feature combinations and machine learning algorithms.

| Feature Combination | CART R^{2} | CART RMSE (m) | GBDT R^{2} | GBDT RMSE (m) | RF R^{2} | RF RMSE (m) | SVM R^{2} | SVM RMSE (m) |
|---|---|---|---|---|---|---|---|---|
| A | 0.51 | 3.74 | 0.67 | 2.89 | 0.67 | 2.89 | 0.54 | 3.44 |
| B | 0.40 | 4.22 | 0.62 | 3.11 | 0.62 | 3.12 | 0.46 | 3.70 |
| C | 0.43 | 4.10 | 0.65 | 2.91 | 0.63 | 2.99 | 0.49 | 3.55 |
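The comparisons drawn from Table 5 can be reproduced with a few lines of Python. The sketch below copies the tabulated values and computes the change in R^{2} and RMSE between feature combinations, along with the lowest-RMSE algorithm for each combination. The dictionary layout and function names are illustrative choices, not part of the study.

```python
from typing import Dict, List, Tuple

# (R^2, RMSE in metres) per feature combination and algorithm, copied from Table 5.
TABLE_5: Dict[str, Dict[str, Tuple[float, float]]] = {
    "A": {"CART": (0.51, 3.74), "GBDT": (0.67, 2.89), "RF": (0.67, 2.89), "SVM": (0.54, 3.44)},
    "B": {"CART": (0.40, 4.22), "GBDT": (0.62, 3.11), "RF": (0.62, 3.12), "SVM": (0.46, 3.70)},
    "C": {"CART": (0.43, 4.10), "GBDT": (0.65, 2.91), "RF": (0.63, 2.99), "SVM": (0.49, 3.55)},
}


def change(from_combo: str, to_combo: str, algorithm: str) -> Tuple[float, float]:
    """Change in (R^2, RMSE) when switching feature combinations, rounded to 2 decimals."""
    r2_from, rmse_from = TABLE_5[from_combo][algorithm]
    r2_to, rmse_to = TABLE_5[to_combo][algorithm]
    return round(r2_to - r2_from, 2), round(rmse_to - rmse_from, 2)


def best_algorithms(combo: str) -> List[str]:
    """Algorithms with the lowest RMSE for a feature combination (ties are kept)."""
    lowest = min(rmse for _, rmse in TABLE_5[combo].values())
    return sorted(name for name, (_, rmse) in TABLE_5[combo].items() if rmse == lowest)


print(change("A", "B", "CART"))  # (-0.11, 0.48)
print(change("A", "B", "SVM"))   # (-0.08, 0.26)
print({combo: best_algorithms(combo) for combo in TABLE_5})
```

GBDT and RF tie on RMSE for feature combination A, while GBDT alone has the lowest RMSE for B and C.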

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Bao, J.; Zhu, N.; Chen, R.; Cui, B.; Li, W.; Yang, B.
Estimation of Forest Height Using Google Earth Engine Machine Learning Combined with Single-Baseline TerraSAR-X/TanDEM-X and LiDAR. *Forests* **2023**, *14*, 1953.
https://doi.org/10.3390/f14101953

**Chicago/Turabian Style**

Bao, Junfan, Ningning Zhu, Ruibo Chen, Bin Cui, Wenmei Li, and Bisheng Yang.
2023. "Estimation of Forest Height Using Google Earth Engine Machine Learning Combined with Single-Baseline TerraSAR-X/TanDEM-X and LiDAR" *Forests* 14, no. 10: 1953.
https://doi.org/10.3390/f14101953