Estimating the Biomass of Maize with Hyperspectral and LiDAR Data

Accurate estimation of crop biomass during the growing season is important for crop growth monitoring and yield estimation. The objective of this paper was to explore the potential of hyperspectral and light detection and ranging (LiDAR) data for better estimation of the biomass of maize. First, we investigated the relationship between field-observed biomass and each metric, including vegetation indices (VIs) derived from hyperspectral data and LiDAR-derived metrics. Second, partial least squares (PLS) regression was used to estimate the biomass of maize using VIs only and LiDAR-derived metrics only. Third, the fusion of hyperspectral and LiDAR data was evaluated for estimating the biomass of maize. Finally, the biomass estimates were validated by leave-one-out cross-validation (LOOCV). Results indicated that all VIs were weakly correlated with field-observed biomass, with the highest correlation obtained for the red edge-modified simple ratio index (ReMSR). Among all LiDAR-derived metrics, the strongest relationship with field-observed biomass was observed for the coefficient of variation (HCV) of digital terrain model (DTM)-normalized point elevations. Combining VIs through PLS regression did not improve the biomass estimation accuracy because the VIs were highly correlated with one another. In contrast, HCV combined with Hmean performed better than any single LiDAR-derived metric (R² = 0.835, RMSE = 374.655 g/m², RMSECV = 393.573 g/m²). Additionally, the fusion of hyperspectral and LiDAR data provided better biomass estimates of maize (R² = 0.883, RMSE = 321.092 g/m², RMSECV = 337.653 g/m²) than LiDAR or hyperspectral data alone.


Introduction
Crop biomass is a critical input to growth monitoring and yield estimation models, and plays an important role in agricultural, ecological, and meteorological applications [1-5]. It is, therefore, essential to estimate crop biomass accurately, yet doing so remains challenging. Field measurement techniques, such as destructive sampling, can obtain crop biomass accurately. However, they are labor-intensive and time-consuming, and are not suitable for measurements over large areas or long periods [6].
Unlike field measurement techniques, remote sensing is an effective technology for estimating crop biomass at the regional or global scale [1,7,8]. Many previous studies have successfully estimated crop biomass using traditional optical remote sensing data [2,6,9,10]. Both physical and statistical models have been developed for this purpose. Physical models are complex, so only a few studies have estimated crop biomass by inverting them [11,12]. In contrast, spectral vegetation indices (VIs) are commonly used to estimate crop biophysical variables because of their simplicity and practicality [13-17]. However, when broadband VIs are used, the estimation accuracy of crop biomass is usually affected by background soil reflectance and by atmospheric and water absorption. Hyperspectral data overcome this limitation by using distinct narrow bands [18]. Hyperspectral sensors provide bands of fine spectral resolution and abundant information on the biochemical and biophysical composition of the vegetation canopy [19-22]. Hyperspectral data have been successfully employed for crop biomass estimation in many previous studies. However, the estimation accuracy is generally limited by the asymptotic saturation problem, which is common to passive optical sensors [9,18,22,23]. Additionally, passive optical sensors record electromagnetic energy that is mostly reflected or absorbed in the uppermost canopy layers [24,25]. Optical sensors therefore provide limited information on vertical vegetation structure, which may reduce the estimation accuracy of crop biomass.
Unlike optical sensors, light detection and ranging (LiDAR) is an active remote sensing technology capable of providing three-dimensional information on vegetation canopy structure [26,27]. Moreover, it can overcome the saturation problem of optical remote sensing. LiDAR can therefore provide effective data for estimating vegetation structure parameters, and it has been successfully used to estimate forest biomass in many previous studies. Both height metrics and laser penetration indices, derived from LiDAR returns, are commonly employed in forest biomass estimation [3,28-30]. However, the application of airborne discrete-return LiDAR data is generally limited in environments with short, dense vegetation, possibly for the following reasons [20,31,32]. First, dense vegetation limits laser penetration to the ground, which may lower the estimation accuracy of vegetation biophysical variables. Second, discrete-return LiDAR data provide limited information about the vertical vegetation structure because only one return is recorded for each emitted pulse in areas with short vegetation [31-33]. Finally, airborne LiDAR data cannot provide spectral characteristics of the vegetation canopy because the sensor operates at a single wavelength.
In summary, airborne LiDAR data provide detailed vertical structural information on the vegetation canopy, and hyperspectral data provide abundant spectral information. The fusion of LiDAR and hyperspectral data therefore appears to be a promising option for better estimating vegetation biomass. Previous studies have demonstrated that fused hyperspectral and LiDAR data can provide robust vegetation biomass estimates in forest ecosystems [25,29,34-36], and even in herbaceous environments [22,37]. However, to our knowledge, no study has focused on crop biomass estimation through the fusion of hyperspectral and LiDAR data. This study can, therefore, serve as a new attempt to evaluate the fusion of LiDAR and hyperspectral data for estimating crop biomass.
The objective of this paper is to explore the potential of hyperspectral and LiDAR data for better estimating the biomass of maize. More specifically, our goals were: (1) to examine the relationship between field-observed biomass and VIs derived from hyperspectral data; (2) to evaluate the utility of LiDAR-derived metrics for biomass estimation; and (3) to determine whether the fusion of LiDAR and hyperspectral data improves the biomass estimation accuracy of maize.

Study Area
The experiment was carried out in the Heihe River Basin, Gansu Province, China (100°20′E to 100°30′E, 38°50′N to 39°03′N; Figure 1). The average annual precipitation and temperature in the middle reaches of the Heihe River are 140 mm and 7 °C, respectively. The terrain at the experimental site is relatively flat, and the average elevation is about 1403 m above sea level. The dominant vegetation is cropland, with maize being the major crop, occupying 95% of the whole study area.

Airborne LiDAR Data
Airborne LiDAR data were acquired in July 2012 using a Leica Airborne Laser Scanner (ALS70) system (Leica, Lucerne, Switzerland). The average flying height was 1300 m above ground level, with an average point density of 7.4 points/m². The LiDAR data were post-processed to calculate LiDAR-derived metrics. Post-processing consisted of three stages: outlier removal, point cloud classification, and digital terrain model (DTM) generation. Outliers in the original LiDAR datasets were first removed using a statistical method based on the elevation frequency histogram of the point clouds [38]. The LiDAR point clouds were then classified into canopy and ground returns using the adaptive triangulated irregular network (TIN) filtering algorithm developed by Axelsson and embedded in the TerraScan software (TerraSolid, Ltd., Helsinki, Finland) [39]. Finally, a DTM with a grid cell size of 1 m was generated from the LiDAR ground returns. We removed the influence of topography by subtracting the generated DTM from the point elevations, which yielded the DTM-normalized point clouds. The LiDAR-derived metrics were then calculated from the DTM-normalized point clouds.
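The DTM normalization step can be illustrated with a minimal sketch. It assumes a north-up raster stored as a list of rows and a nearest-cell lookup; the function name, the grid layout, and the lookup rule are illustrative assumptions and are not taken from the processing chain used in this study.

```python
import math

def normalize_heights(points, dtm, x_min, y_max, cell=1.0):
    """Subtract the DTM ground elevation from each point elevation.

    points: list of (x, y, z) tuples in metres.
    dtm:    list of rows (north to south) of ground elevations in metres.
    x_min, y_max: map coordinates of the upper-left grid corner
                  (hypothetical layout, assumed for this sketch).
    cell:   grid cell size in metres (1 m in this study).
    """
    normalized = []
    for x, y, z in points:
        col = int(math.floor((x - x_min) / cell))
        row = int(math.floor((y_max - y) / cell))
        # Points outside the DTM extent are skipped rather than extrapolated.
        if 0 <= row < len(dtm) and 0 <= col < len(dtm[row]):
            normalized.append((x, y, z - dtm[row][col]))
    return normalized
```

A production workflow would more likely interpolate the ground surface at each point location; the nearest-cell lookup is used here only to keep the idea visible.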

Airborne Hyperspectral Data
Compact airborne spectrographic imager (CASI) imagery was acquired in July 2012 and provided by the "Heihe Watershed Allied Telemetry Experimental Research" (HiWATER) project [40]. CASI data were collected in "Hyperspectral Mission" mode, with 72 channels of approximately 7.5 nm bandwidth across a wavelength range of 400-940 nm. The final spatial resolution was 1 m after radiometric correction and ortho-rectification based on a LiDAR-derived DTM. Although radiometric calibration and geometric correction of the CASI imagery had already been carried out by the data provider, atmospheric correction must be applied to the original hyperspectral bands to calculate VIs accurately. In this study, atmospheric correction of the CASI imagery was performed using the Fast Line-of-Sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) algorithm [41], which is based on a MODTRAN4 approach for path-scattered radiance, absorption, and adjacency effects [42]. In addition, other preprocessing steps, such as removal of aircraft motion effects and flat-field adjustment of surface reflectance spectra, were conducted using ENVI software.

Field Measurement
The field measurements were carried out from 11 July 2012 to 13 July 2012. A total of 38 square sample plots (4.0 m × 4.0 m) were distributed across the study area. However, only 29 sample plots were analyzed in this study because of the poor quality of the hyperspectral data. For each plot, the center coordinate was determined using a real-time kinematic Global Positioning System (RTK-GPS) with 1 cm level accuracy (Trimble Navigation Ltd., Sunnyvale, CA, USA). Additionally, both the height and LAI of maize were measured directly (see Table 1). Field LAI was measured using an LAI-2200 (LI-COR Inc., Lincoln, NE, USA). The heights of all maize plants within a plot were measured using a measuring tape, and the average height was taken as the reference height of maize. The field-observed biomass of maize was calculated using Equation (1), which was derived from long-term observation over the entire maize growing season by Gao et al. [10].
where height is the mean plant height, LAI is the leaf area index, and Biomass is the field-observed biomass of maize.

Deriving Metrics from Hyperspectral Data
Many previous studies have investigated VIs derived from hyperspectral data for crop biomass estimation [1,7,17,43]. Based on the findings of these studies, six VIs were selected to estimate the biomass of maize in this study (Table 2): Simple Ratio Vegetation Index (SR) [44], Normalized Difference Vegetation Index (NDVI) [45], Modified Simple Ratio Index (MSR) [24], Modified Soil-Adjusted Vegetation Index (MSAVI) [46], Red Edge NDVI (ReNDVI) [47], and Red Edge Modified Simple Ratio Index (ReMSR) [15]. For each plot, each VI was calculated per pixel from the CASI imagery, and the pixel values within the plot were averaged.

Table 2 (columns: vegetation index, abbreviation, formula).
Note: There is no band centered at 705 nm in the CASI imagery; two adjacent bands were used to calculate the reflectance R705. R denotes reflectance.
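As an illustration of how the six indices and the per-plot averaging could be computed, the sketch below uses formulations that are common in the literature. The band centers (670, 705, 750, and 800 nm), the linear interpolation for R705, and the exact form of MSR and ReMSR are assumptions, since the formulas of Table 2 are not reproduced here. MSR is written as (SR - 1)/(sqrt(SR) + 1); some papers instead place the "+ 1" inside the square root.

```python
import math

def r705_from_adjacent(r_low, r_high, wl_low, wl_high, target=705.0):
    """Linearly interpolate reflectance at 705 nm from two adjacent bands
    (assumed rule; the source only says two adjacent bands were used)."""
    weight = (target - wl_low) / (wl_high - wl_low)
    return r_low + weight * (r_high - r_low)

def vegetation_indices(r670, r705, r750, r800):
    """Return the six VIs for one pixel from surface reflectances (0-1)."""
    sr = r800 / r670
    re_sr = r750 / r705
    msavi_root = math.sqrt((2.0 * r800 + 1.0) ** 2 - 8.0 * (r800 - r670))
    return {
        "SR": sr,
        "NDVI": (r800 - r670) / (r800 + r670),
        "MSR": (sr - 1.0) / (math.sqrt(sr) + 1.0),
        "MSAVI": 0.5 * (2.0 * r800 + 1.0 - msavi_root),
        "ReNDVI": (r750 - r705) / (r750 + r705),
        "ReMSR": (re_sr - 1.0) / (math.sqrt(re_sr) + 1.0),
    }

def plot_mean(pixel_indices, name):
    """Average one VI over all pixels of a plot."""
    values = [pixel[name] for pixel in pixel_indices]
    return sum(values) / len(values)
```

With 1 m pixels and 4.0 m × 4.0 m plots, each plot mean would be taken over roughly 16 pixels.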

Deriving Metrics from LiDAR
Several LiDAR-derived metrics, including height metrics and a laser penetration index, were adopted to estimate the biomass of maize, following previous studies [3,6,25,32]. These metrics consist of the mean (Hmean) and maximum (Hmax) height of LiDAR returns, the standard deviation (HSD) and coefficient of variation (HCV) of DTM-normalized point elevations, and canopy cover (CC) derived from airborne LiDAR data. CC is calculated as the ratio of the number of canopy returns to the number of all returns. The definitions of these LiDAR-derived metrics are listed in Table 3.

Table 3 (columns: metric, abbreviation, formula).
Note: h_i is the height of the i-th point, and N is the total number of LiDAR points in the plot.
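The five metrics described above can be sketched as follows, using h_i and N as defined in the note. The function name and the use of the population form of the standard deviation (division by N) are assumptions, because the formulas of Table 3 are not reproduced here.

```python
import math

def lidar_metrics(heights, is_canopy):
    """Compute plot-level LiDAR metrics.

    heights:   DTM-normalized heights h_i of all N returns in the plot (m).
    is_canopy: one boolean per return, True if classified as canopy.
    """
    n = len(heights)
    h_mean = sum(heights) / n
    h_max = max(heights)
    # Population standard deviation (division by N) is assumed here.
    h_sd = math.sqrt(sum((h - h_mean) ** 2 for h in heights) / n)
    h_cv = h_sd / h_mean
    # Canopy cover: canopy returns divided by all returns.
    cc = sum(1 for flag in is_canopy if flag) / n
    return {"Hmean": h_mean, "Hmax": h_max, "HSD": h_sd, "HCV": h_cv, "CC": cc}
```

For example, heights of 0, 1, 2, and 3 m with three canopy returns give Hmean = 1.5 m, Hmax = 3 m, and CC = 0.75.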

Regression Analysis
In this study, the relationship between field-observed biomass and each metric was first examined using both linear and exponential regression analyses, following previous studies [17,48]. These metrics included VIs derived from hyperspectral data and LiDAR-derived metrics.
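A single-predictor version of this analysis can be sketched as below. The exponential model y = a·exp(b·x) is fitted here by ordinary least squares on ln(y), which is one common choice and an assumption of this sketch; the source does not state how the exponential fit was obtained. All function names are illustrative.

```python
import math

def fit_linear(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_exponential(x, y):
    """Fit y = a * exp(b * x) through a linear fit of ln(y) on x (y > 0)."""
    ln_a, b = fit_linear(x, [math.log(yi) for yi in y])
    return math.exp(ln_a), b

def r2_and_rmse(observed, predicted):
    """Coefficient of determination and root-mean-square error."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)
```

Both models can then be compared on the same plots by passing their predictions to `r2_and_rmse`.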
Additionally, partial least squares (PLS) regression was implemented with VIs only, with the LiDAR-derived metrics listed in Table 3 only, and with all metrics. Unlike multiple linear regression, PLS regression is a multivariate statistical method that can deal effectively with multicollinearity among predictor variables [25,49]. It has been successfully employed in vegetation biomass estimation. In the PLS regression, variable importance in the projection (VIP) values were calculated to evaluate the importance of individual predictors. Only predictor variables with VIP values greater than 1.0 were retained, following a previous study [25].
For all biomass estimation models, both the coefficient of determination (R²) and the root-mean-square error (RMSE) were calculated to assess the established models. In addition, leave-one-out cross-validation (LOOCV) was used to evaluate the generalization capability of the regression models because no additional data were available for model validation [50,51]. Specifically, the root-mean-square error of cross-validation (RMSECV), calculated from the predicted residual sum of squares (PRESS statistic) and the number of observations, was used to validate the accuracy of the biomass estimation models.
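The LOOCV procedure can be sketched as follows, taking RMSECV = sqrt(PRESS / n) with n the number of observations. For brevity the sketch refits a single-predictor linear model in each fold instead of the PLS models used in this study; that substitution and the function names are assumptions made for illustration only.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def loocv_rmse(x, y):
    """Leave-one-out cross-validation: return (PRESS, RMSECV).

    x, y: lists of predictor values and observed biomass, one per plot.
    """
    n = len(x)
    press = 0.0
    for i in range(n):
        # Refit the model without plot i, then predict plot i.
        intercept, slope = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        residual = y[i] - (intercept + slope * x[i])
        press += residual ** 2
    return press, math.sqrt(press / n)
```

Because every plot is predicted by a model that never saw it, RMSECV is normally somewhat larger than the RMSE of the model fitted to all plots, as is the case for the values reported in this study.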

VIs as Predictors of Vegetation Biomass
To examine the relationship between biomass and VIs, both linear and exponential regression analyses were carried out. The R² and RMSE of the biomass estimation models using different VIs are shown in Figure 2.
As Figure 2 shows, all VIs had a weak relationship with field-observed biomass, with high RMSE and low R². Among all VIs, ReMSR had the lowest RMSE and the highest R² for biomass estimation of maize in both the linear and exponential models.
Because the VIs showed a high degree of multicollinearity (see Table 4), using them together in PLS regression did not improve the predictability considerably. Therefore, the best biomass estimation model using VIs was built on ReMSR alone. Figure 3 shows the scatterplot of field-observed biomass versus estimated biomass using VIs alone.

LiDAR-Derived Metrics as Predictors of Vegetation Biomass
The vertical accuracy of the DTM strongly affects the calculation of LiDAR-derived metrics and, in turn, the biomass estimation accuracy of maize. It is therefore essential to assess the accuracy of the LiDAR-derived DTM. Elevation differences were first calculated by subtracting the elevations of GPS points from the corresponding DTM elevations; the mean error (ME) and RMSE were then calculated to assess the vertical accuracy of the LiDAR-derived DTM. Results showed that the LiDAR-derived DTM at our experimental site is highly accurate, with an ME of 0.07 m and an RMSE of 0.09 m.
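The two accuracy statistics can be written out in a few lines. The sketch assumes that the DTM elevation has already been sampled at each GPS point; the function name and the sample values in the usage note are hypothetical.

```python
import math

def dtm_vertical_errors(dtm_elevations, gps_elevations):
    """Return (ME, RMSE) of DTM minus GPS elevation differences, in metres."""
    diffs = [d - g for d, g in zip(dtm_elevations, gps_elevations)]
    n = len(diffs)
    mean_error = sum(diffs) / n
    rmse = math.sqrt(sum(e * e for e in diffs) / n)
    return mean_error, rmse
```

For instance, `dtm_vertical_errors([10.1, 10.0, 9.9], [10.0, 10.0, 10.0])` yields an ME close to zero even though individual errors are 0.1 m, which is why ME (bias) and RMSE (overall error magnitude) are reported together.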
Additionally, we performed linear and exponential regression analyses of field-observed biomass against each LiDAR-derived metric. The R² and RMSE obtained with the different LiDAR-derived metrics are shown in Figure 4. Among all of the LiDAR-derived metrics, HCV provided the best agreement with field-observed biomass in an exponential model and explained 80.1% of the variance in vegetation biomass (RMSE = 429.144 g/m²). This was followed by Hmean (R² = 0.729, RMSE = 518.509 g/m²), also with an exponential model. However, CC was not sensitive to field-observed biomass in either the linear or the exponential regression model, with high RMSE and low R².
Similarly, PLS regression was used to estimate the biomass of maize using LiDAR-derived metrics. The VIP values of all LiDAR-derived metrics were calculated from the PLS regression (Figure 5). Only the LiDAR-derived metrics with VIP values greater than 1.0 were retained. The best estimation model was therefore built on HCV and Hmean. Figure 6 shows the scatterplot of field-observed biomass versus estimated biomass using LiDAR-derived metrics alone. The combined use of LiDAR-derived metrics performed better than any single LiDAR-derived metric in estimating the biomass of maize.
From Figures 3 and 6, we can conclude that LiDAR data had a stronger capability than hyperspectral data for estimating the biomass of maize, as shown by the higher R² and lower RMSE.

Fusion of LiDAR and Hyperspectral Data for Vegetation Biophysical Variables Predictions
To evaluate the fusion of LiDAR and hyperspectral data for estimating the biomass of maize, PLS regression of field-observed biomass against all metrics, including VIs and LiDAR-derived metrics, was conducted. Figure 7 shows the scatterplot of field-observed biomass versus estimated biomass using all metrics. The combined use of hyperspectral and LiDAR data explained 88.3% of the variance in biomass measurements (RMSE = 321.092 g/m²), whereas the best models using LiDAR and hyperspectral data alone explained 83.5% (RMSE = 374.655 g/m²) and 47.8% (RMSE = 658.885 g/m²) of the variance, respectively. The biomass estimation accuracy of maize was thus improved by the fusion of LiDAR and hyperspectral data.

Discussion
Many previous studies have successfully estimated crop biomass using airborne hyperspectral data [1,17]. However, our study found that VIs derived from CASI data had limited capability for estimating the biomass of maize. Although narrowband VIs are more sensitive to vegetation biomass than broadband VIs according to a previous study [49], they still saturate over areas of medium to high biomass. VIs derived from hyperspectral data therefore could not provide accurate biomass estimates, because most of the biomass values at this experimental site are high. Additionally, the combined use of VIs did not improve the predictability considerably because the VIs are highly correlated with one another.
The accuracy of the LiDAR-derived DTM is a key factor in calculating LiDAR-derived metrics accurately and, consequently, strongly affects the biomass estimation of maize. Previous studies indicated that the accuracy of a LiDAR-derived DTM is largely determined by: (1) vegetation canopy structure characteristics [33]; (2) terrain, mainly slope and terrain irregularities; and (3) sensor characteristics, such as laser point density. The accuracy of LiDAR-derived DTMs therefore generally differs significantly between environments [31,37]. Our findings revealed that the accuracy of the LiDAR-derived DTM is high in our study area, mainly because the terrain at the experimental site is relatively flat. Although there may be no ground points in extremely dense areas, the DTM can still be obtained accurately there by interpolation.
Several LiDAR-derived metrics were employed for crop biomass estimation in this analysis. Results indicated that HCV showed the highest correlation with the biomass of maize among all metrics. HCV represents the relative dispersion of the vertical distribution of laser point heights, which was demonstrated to be a good predictor of LAI in the study of Nie et al. (2016). At the same time, the biomass of maize depends strongly on LAI according to Equation (1). Thus, the biomass of maize can be accurately estimated by HCV. Although Li et al. found a strong relationship between canopy cover (CC) and the biomass of maize [6], a low correlation between CC and the biomass of maize was observed in this study. There are two possible reasons. On the one hand, a thick canopy greatly reduces the probability of laser penetration to the ground. On the other hand, the LiDAR system cannot separate first and subsequent returns over short vegetation, so only a single return is recorded per pulse. There may therefore not be enough ground returns in some plots with extremely high canopy cover, and CC cannot be calculated accurately. Thus, the accuracy of biomass estimates based on CC is low. Additionally, results indicated that the combination of Hmean and HCV provides more accurate biomass estimates of maize. This may be because biomass is mainly determined by the LAI and height of maize: the mean height Hmean showed a significant relationship with the height of maize, and HCV is closely related to the LAI of maize. Thus, the combined use of Hmean and HCV improved the biomass estimation accuracy.
Many previous studies indicated that the fusion of hyperspectral and LiDAR data can improve the estimation accuracy of forest biomass compared with either data source alone [25,29,34,36]. However, we did not find literature on sensor data fusion for crop biomass estimation. Results in this study indicated that the fusion of LiDAR and hyperspectral data yielded the highest biomass estimation accuracy of maize using PLS regression. This may be because LiDAR data provide abundant information about vegetation canopy structure, while hyperspectral data provide complementary spectral information on the biochemical and biophysical composition of the vegetation canopy. However, the data fusion approach yielded only a slight reduction in the uncertainty of biomass estimates compared with the LiDAR-derived metrics alone, with RMSE decreasing from 374.655 g/m² to 321.092 g/m². Similar results were reported in previous studies, which indicated that fusion with hyperspectral data could not significantly improve the estimation accuracy of forest biomass [25,49]. The main reason is that hyperspectral data cannot provide accurate biomass estimates because of the saturation problem, whereas LiDAR-derived metrics generally show a strong relationship with field-observed biomass; fusing hyperspectral and LiDAR data therefore produced only a small improvement in biomass estimation accuracy.
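The size of this improvement can be made concrete with a short calculation on the reported values. The helper name is illustrative; the four numbers are the RMSE and RMSECV values reported for the LiDAR-only and fused models.

```python
def relative_reduction(before, after):
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before

# Reported errors (g/m²): LiDAR-only model versus fused model.
rmse_drop = relative_reduction(374.655, 321.092)
rmsecv_drop = relative_reduction(393.573, 337.653)
print(f"RMSE reduced by {rmse_drop:.1%}, RMSECV reduced by {rmsecv_drop:.1%}")
```

Both errors fall by roughly 14% when the hyperspectral metrics are added to the LiDAR-derived metrics.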
To summarize, the biomass estimation accuracy of maize is low when using hyperspectral data because of saturation over areas of medium to high biomass. In contrast, LiDAR data can overcome the saturation problem and provided more accurate biomass estimates of maize. LiDAR is thus considered a more effective technology for biomass estimation. The fusion of LiDAR and hyperspectral data achieved a higher biomass estimation accuracy for maize than LiDAR data alone. Our study made a new attempt to estimate crop biomass from fused LiDAR and hyperspectral data using PLS regression, which can provide guidance for both data fusion and crop biomass estimation. However, this study also has several limitations. Field-observed biomass was calculated indirectly using an allometric equation, so more accurate measurements of field-observed biomass should be carried out in the future. Additionally, future work should focus on data fusion methods that better combine hyperspectral data with LiDAR data.

Conclusions
Our study aimed to explore the potential of hyperspectral and LiDAR data for estimating the biomass of maize. Each metric, whether a VI or a LiDAR-derived metric, was used individually to estimate the biomass of maize. Estimation models were also developed using PLS regression of biomass against VIs only, LiDAR-derived metrics only, and all metrics. The results led to the following conclusions. Only weak relationships existed between the biomass of maize and VIs. In contrast, LiDAR-derived metrics showed strong correlation with the field-observed biomass of maize. In addition, the combined use of LiDAR-derived metrics resulted in better biomass estimates (R² = 0.835, RMSE = 374.655 g/m², RMSECV = 393.573 g/m²), and the fusion of hyperspectral and LiDAR data further improved the biomass estimation accuracy of maize (R² = 0.883, RMSE = 321.092 g/m², RMSECV = 337.653 g/m²).

Figure 1. Location of the study area and field plots.

Figure 2. The determination coefficient R² (a) and root-mean-square error (RMSE) (b) of biomass estimation models using different VIs.

Figure 3. The scatterplot of field-observed biomass versus estimated biomass using VIs (only).

Figure 4. The determination coefficient R² (a) and root-mean-square error (RMSE) (b) of biomass estimation using different LiDAR-derived metrics.

Figure 5. VIP values of all LiDAR-derived metrics for biomass estimation.

Figure 6. The scatterplot of field-observed biomass versus estimated biomass using LiDAR-derived metrics (only).

Figure 7. The scatterplot of field-observed biomass versus estimated biomass using the fusion of hyperspectral and LiDAR data.

Table 1. Summary statistics of field-measured parameters in field plots.

Table 2. Definitions and formulas of VIs in this study.

Table 3. Definitions and formulas of LiDAR-derived metrics in this study.