Crop Monitoring Using Satellite / UAV Data Fusion and Machine Learning

Non-destructive crop monitoring over large areas with high efficiency is of great significance in precision agriculture and plant phenotyping, as well as decision making with regard to grain policy and food security. The goal of this research was to assess the potential of combining canopy spectral information with canopy structure features for crop monitoring using satellite/unmanned aerial vehicle (UAV) data fusion and machine learning. WorldView-2/3 satellite data acquisition was tasked in synchronization with high-resolution RGB image collection by an inexpensive UAV over a heterogeneous soybean (Glycine max (L.) Merr.) field. Canopy spectral information (i.e., vegetation indices) was extracted from the WorldView-2/3 data, and canopy structure information (i.e., canopy height and canopy cover) was derived from the UAV RGB imagery. Canopy spectral and structure information and their combination were used to predict soybean leaf area index (LAI), aboveground biomass (AGB), and leaf nitrogen concentration (N) using partial least squares regression (PLSR), random forest regression (RFR), support vector regression (SVR), and extreme learning regression (ELR) with a newly proposed activation function.
The results revealed that: (1) high-resolution and detailed canopy structure features derived from UAV imagery, namely canopy height and canopy cover, were significant indicators for crop growth monitoring; (2) integrating the rich canopy spectral information from satellite imagery with UAV-derived canopy structure features using machine learning improved soybean AGB, LAI, and leaf N estimation compared with using satellite or UAV data alone; (3) adding canopy structure information to spectral features reduced the background soil effect and the asymptotic saturation issue to some extent and led to better model performance; (4) the ELR model with the newly proposed activation function slightly outperformed PLSR, RFR, and SVR in the prediction of AGB and LAI, while RFR provided the best result for N estimation. This study highlights the opportunities and limitations of satellite/UAV data fusion using machine learning in the context of crop monitoring.


Introduction
Monitoring crop growth-related biochemical and biophysical traits, such as leaf nitrogen concentration (N), leaf area index (LAI), and aboveground biomass (AGB), at a fine scale is crucial for precision agriculture and plant phenotyping. In previous studies based on remote sensing data, different machine learning (ML) methods, such as partial least squares regression (PLSR) [48], support vector regression (SVR) [49], and random forest regression (RFR) [50], have been utilized for crop monitoring. With efficient and rapid learning, the extreme learning machine (ELM) [51] and its variants [52,53] have been employed for a variety of regression and classification analyses [54-56] and remote sensing-based plant trait estimations [4].
In this study, we aimed to (i) examine the potential of combining canopy spectral and structure information for crop monitoring of a heterogeneous soybean field using satellite/UAV data fusion and machine learning, (ii) investigate the contributions of UAV-derived structure information in accounting for background soil effects at the early developmental stage and saturation issues late in crop development, and (iii) evaluate the power of the extreme learning regression (ELR) method with a newly proposed activation function for the prediction of crop AGB, LAI, and N.

Test Site and Field Layout
The test site is located at the Bradford Research Center of the University of Missouri in Columbia, Missouri, USA (Figure 1). The average monthly temperatures measured by a nearby on-farm weather station were 17.35 °C, 22.63 °C, 25.04 °C, 20.96 °C, 20.16 °C, and 13.75 °C, respectively, from May to October 2017. The accumulated monthly precipitation was 114 mm, 82 mm, 116 mm, 77 mm, 20 mm, and 98 mm, respectively, from May to October.
As shown in Figure 1, the experiment field was divided into four replications, and each replication was split into rainfed and irrigated main plots, which were planted with sub-plots of three soybean cultivars ('AG3432', 'Pana', and 'Dwight') on 19 May 2017. Pre- and post-emergence herbicides were applied, along with manual weed removal, to control the weeds. No fertilizers were applied during this experiment.

Field Data Collection
Each AGB sample was harvested from an area of 1 m length and 2 rows width (about 1.52 m²) at locations evenly distributed throughout the field (6 samples within each sub-plot) on 8 July, 20 July, and 4 August, respectively (Figure 1), and oven-dried at 60 °C immediately after harvesting. UAV RGB orthomosaics acquired before and after biomass harvesting were co-registered and overlapped to identify the precise sampling locations of AGB [42].
Soybean LAI measurements were carried out at 48 sampling spots throughout the field (two sampling spots within each sub-plot; each sampling spot was about 21.3 m², composed of two rows (1.52 m wide) of about 14 m length) (Figure 1) on 8 July, 20 July, and 4 August, respectively. LAI values were obtained non-destructively using the LAI-2200C Plant Canopy Analyzer (LI-COR Inc., Lincoln, NE, USA) along a diagonal transect between the two rows at each sampling spot. Leaf nitrogen concentration was estimated based on DUALEX 4 Scientific (Force-A, Orsay, France) hand-held device measurements from three leaves at each sampling spot [4], and a total of 144 samples (6 samples within each sub-plot) were collected on 8 July, 20 July, and 4 August, respectively. Field-based canopy height (CH) measurement was conducted using a measuring stick at the same 144 locations where the leaf nitrogen concentration measurements were taken during each sampling day to validate UAV-derived CH. A summary of the field-measured plant trait variables is shown in Table 1.

Table 1. Summary statistics of measured leaf area index (LAI), aboveground biomass (AGB), nitrogen concentration (N), and canopy height (CH) data.

Remote Sensing Data
UAV-based image acquisition was conducted on the same dates as ground truth plant trait data collection (Table 2). WorldView-2/3 (WV-2/3) (Digital Globe Inc.) imagery was acquired on 10 July 2017, 22 July 2017, and 9 August 2017, as closely as possible to the dates of the UAV flights and ground truth data collection. As demonstrated in Table 2, WV-2 data had one panchromatic band and eight visible near-infrared (VNIR) bands; WV-3 data included one panchromatic band and eight VNIR bands.

* Pan band: panchromatic band; VNIR: visible near-infrared. The wavelengths of the Pan and VNIR bands of WorldView-3 were consistent with the WorldView-2 data. The wavelengths of the UAV RGB bands were not available.
UAV RGB imagery was acquired at an altitude of 30 m using a lightweight commercial platform, the DJI Mavic Pro quadcopter. The flight speed was set to 6 m s⁻¹, and the side and front overlaps were 85%. Equipped with a 12.4-megapixel RGB camera and a built-in GPS device, the DJI Mavic Pro provides automatically geotagged, high-resolution RGB images [41].

Worldview Data Preprocessing
Atmospheric correction and radiometric calibration were carried out for WV-2/3 imagery through the Fast Line-of-Sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) method using ENVI 5.4.1 software (Harris Visual Information Solutions, Boulder, CO, USA) [57,58]. Pan-sharpened WV-2/3 imagery with 0.3 m spatial resolution was generated using the Gram-Schmidt Pan Sharpening tool of ENVI 5.4.1 software [23].
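The study used ENVI's Gram-Schmidt Pan Sharpening tool; as an illustration only, the general idea behind Gram-Schmidt-style sharpening (injecting panchromatic spatial detail into each multispectral band, weighted by that band's regression against a synthetic low-resolution pan) can be sketched in NumPy. This simplified sketch is not ENVI's implementation, and the synthetic-pan and gain definitions are assumptions of the sketch.

```python
import numpy as np

def gs_pansharpen(ms, pan):
    """Simplified Gram-Schmidt-style pan sharpening (illustrative sketch).

    ms  : (bands, H, W) multispectral array already resampled to the pan grid
    pan : (H, W) panchromatic band
    """
    ms = ms.astype(np.float64)
    pan = pan.astype(np.float64)
    synth = ms.mean(axis=0)              # synthetic low-resolution pan band
    detail = pan - synth                 # spatial detail to inject
    out = np.empty_like(ms)
    sv = synth.ravel() - synth.mean()
    for k in range(ms.shape[0]):
        bv = ms[k].ravel() - ms[k].mean()
        gain = (bv @ sv) / (sv @ sv)     # cov(band, synth) / var(synth)
        out[k] = ms[k] + gain * detail
    return out

rng = np.random.default_rng(0)
ms = rng.random((4, 8, 8))               # placeholder 4-band reflectance stack
pan = rng.random((8, 8))                 # placeholder panchromatic band
sharp = gs_pansharpen(ms, pan)
```

When the pan band carries no detail beyond the synthetic pan, the bands are returned unchanged, which is the expected degenerate behavior of detail injection.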

UAV Imagery Preprocessing
Orthorectification and mosaicking of the UAV imagery collections from the different sampling days were carried out using Pix4Dmapper software (Pix4D SA, Lausanne, Switzerland). Eighteen highly accurate ground control points (GCPs) (Figure 1), surveyed with a Trimble R8 GNSS rover (10 mm and 20 mm accuracy in the horizontal and vertical directions, respectively), were leveraged throughout the Pix4Dmapper processing workflow to generate UAV RGB orthomosaics with precise scale and position [42]. Through the SfM-based photogrammetric workflow embedded in Pix4Dmapper, high-density and accurate point clouds were created.
Multitemporal, multiscale WV-2/3 imagery and UAV orthomosaics were co-registered and georeferenced to the UAV orthomosaic of 8 July 2017 through a number of GCPs, along with selected ground feature points, such as road intersections, using ArcGIS 10.4 software. The georeferencing errors were less than one pixel (<0.3 m). Finally, a unified and overlapping imagery dataset was generated in the WGS 1984 UTM Zone 15N coordinate system.

Satellite Imagery-Based Spectral Feature Extraction
As listed in Table 3, the eight VNIR bands of WV-2/3, as well as a series of VIs employed for crop monitoring in previous research, were selected.
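As an illustration of how such VIs are computed from the pan-sharpened reflectance stack, the sketch below derives NDVI and green NDVI with NumPy. The band ordering of the stacked cube (coastal, blue, green, yellow, red, red edge, NIR1, NIR2) is an assumption for the example, and the cube itself is a random placeholder, not WV-2/3 data.

```python
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """Normalized difference VI: (NIR - Red) / (NIR + Red)."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return (nir - red) / (nir + red + eps)

# Assumed WV-2/3 VNIR band order in the stacked reflectance cube:
# 0 coastal, 1 blue, 2 green, 3 yellow, 4 red, 5 red edge, 6 NIR1, 7 NIR2
rng = np.random.default_rng(0)
cube = rng.random((8, 4, 4))            # placeholder reflectance stack
vi_ndvi = ndvi(cube[6], cube[4])        # NDVI from NIR1 and red
vi_gndvi = ndvi(cube[6], cube[2])       # green NDVI from NIR1 and green
```

Other indices in Table 3 follow the same pattern of per-pixel band arithmetic on the reflectance cube.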

CC (%) = (number of crop pixels in the plot / total number of pixels in the plot) × 100 [73]

* DSM: digital surface model; DEM: digital elevation model.

UAV Imagery-Based Canopy Height Extraction
A digital elevation model (DEM) was retrieved using the photogrammetric point clouds from UAV imagery acquired before plant development on 25 May 2017. Additionally, digital surface models (DSMs) of the soybean field were generated for three dates (i.e., 8 July, 20 July, and 4 August 2017) representing different developmental stages, using the UAV imagery-derived point clouds. The DEM was subtracted from each DSM at the pixel level to create the canopy height model (CHM) [40].
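The DSM-minus-DEM step is a per-pixel raster subtraction; a minimal NumPy sketch follows. Clipping small negative residuals to zero is a common convention for photogrammetric CHMs and is an assumption of the sketch, not a step stated in the text; the elevation values are invented for the example.

```python
import numpy as np

def canopy_height_model(dsm, dem):
    """Pixel-wise CHM = DSM - DEM; negative heights (noise) clipped to 0."""
    chm = dsm.astype(np.float64) - dem.astype(np.float64)
    return np.clip(chm, 0.0, None)

dem = np.full((3, 3), 100.0)             # bare-ground elevation (25 May)
dsm = dem + np.array([[0.0, 0.4, 0.6],   # surface elevation on a later date
                      [0.3, 0.9, 0.5],
                      [0.2, -0.1, 0.7]])
chm = canopy_height_model(dsm, dem)      # heights in the same units as inputs
```

The -0.1 m residual (e.g., reconstruction noise over bare soil) is clipped to a zero canopy height rather than propagated as a negative value.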

UAV Imagery-Based Canopy Cover Extraction
The weeds and soil pixels were identified and removed from the UAV RGB orthomosaics from 8 July, 20 July and 4 August, respectively, using the Support Vector Machine (SVM) classifier [74]. An overall accuracy higher than 99% and a Kappa coefficient over 0.99 were achieved from the SVM classifier [42].
Additionally, canopy cover (CC), which reflects crop density and structure (i.e., LAI) information [75], was computed by dividing the number of soybean canopy pixels in each sampling spot by the total number of pixels within that spot [3]. The plot-level averages of the VI and CH values were computed in order to relate them to the corresponding AGB, LAI, and N data. Binary masks were applied to remove soil/weed pixels from the VI and CH raster layers. The workflow, including image preprocessing, feature extraction, and model calibration and validation, is shown in Figure 2.

Figure 2. A workflow diagram of remote sensing imagery preprocessing, canopy spectral and structure feature extraction, and model training and testing.
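Given the binary canopy mask produced by the SVM classifier, the CC computation is a simple pixel ratio; a minimal sketch (with an invented 10 × 10 mask) follows.

```python
import numpy as np

def canopy_cover(mask):
    """CC (%) = canopy pixels / all pixels in the sampling spot * 100.

    mask : 2-D boolean array from the classifier, True = soybean canopy
    (soil and weed pixels already set to False).
    """
    return 100.0 * np.count_nonzero(mask) / mask.size

mask = np.zeros((10, 10), dtype=bool)
mask[2:8, 1:9] = True                    # 48 canopy pixels out of 100
cc = canopy_cover(mask)                  # 48.0 %
```

The same mask doubles as the binary filter used to exclude soil/weed pixels before averaging the VI and CH rasters at the plot level.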

Modeling Methods
Several ML methods commonly used in remote sensing applications, particularly in crop monitoring and plant trait estimation, such as PLSR, RFR, and SVR [4,40,76], were implemented to estimate AGB, LAI, and N using the satellite image-derived VIs and UAV image-based CH and CC (Table 3) as input features. PLSR is a widely used multivariate statistical technique and is often used to construct predictive models when the number of explanatory variables is large and highly correlated/collinear. It reduces dimensionality and noise and transforms the collinear input features into a few non-correlated latent variables by performing component projection [77]. PLSR is similar to principal component regression (PCR), but PLSR uses both the variation of the spectra and the response variable to construct the new explanatory components/variables [78]. RFR is a non-parametric ensemble method with roots in classification and regression trees (CART); RFR consists of many trees that are trained through bagging and a random variable selection process [79], and it has good tolerance of outliers and noise [80]. SVR is the implementation of SVM [74] for regression and function approximation. SVR essentially converts the nonlinear regression into a linear projection through different kernel functions, transforming the original feature space into a new data space with multidimensional hyperplanes [81]. ELR is the regression version of ELM; the classic ELM is a feedforward neural network with one input layer, one hidden layer, and one output layer. Instead of iterative tuning, the hidden node parameters in ELM are randomly generated, which provides high computational efficiency [51].

Remote Sens. 2020, 12, 1357

In this study, a dual activation function-based extreme learning machine (ELM), which has been shown to outperform the regular ELM for crop monitoring applications in previous research [82], was modified and implemented. In our previous study [82], the activation function TanhRe, which combines the rectified linear unit (ReLU) [83] and hyperbolic tangent (Tanh) functions, showed superior performance for grapevine yield and berry quality prediction compared to commonly used nonlinear functions, such as sigmoid, Tanh, and ReLU (Figure 3a). Inspired by leaky ReLU, we herein proposed a modified TanhRe by introducing a small coefficient to the Tanh component, instead of using the fixed shape of Tanh, which allowed a flexible functionality to better fit the input pattern. Accordingly, the proposed activation function was defined as:

f(x) = x,          x > 0
f(x) = c · tanh(x), x ≤ 0

where x is the input of the nonlinear activation f, and c is a small constant, such as 0.001. In this paper, the constant c was optimized using values from 0 to 1 (Figure 3b).
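A minimal NumPy sketch of one plausible form of the modified TanhRe follows: identity for positive inputs (the ReLU branch) and a c-scaled Tanh for non-positive inputs, analogous to the small slope in leaky ReLU. This piecewise form is an assumption reconstructed from the surrounding description.

```python
import numpy as np

def modified_tanhre(x, c=0.001):
    """Modified TanhRe activation (assumed piecewise form):
    identity for x > 0, c-scaled tanh for x <= 0, where c is a small
    constant tuned between 0 and 1."""
    x = np.asarray(x, dtype=np.float64)
    return np.where(x > 0, x, c * np.tanh(x))
```

With c near 0 the negative branch is almost flat (leaky-ReLU-like); with c near 1 it recovers the original TanhRe shape, which is what makes the coefficient worth tuning.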
For implementing the different ML methods, 70% of the input features were randomly selected and used together with the corresponding AGB, LAI, and N data for model training, and the remaining 30% unseen data were used to test the models. All variables, including the VIs, CH, and CC, were used as input features for the modeling, and no feature selection method was applied. Grid-search and k-fold cross-validation techniques were applied to obtain optimal parameters for each method during the model calibration phase [84]. For the PLSR method, the number of principal components (PCs) was determined by the grid search. A radial basis function (RBF) kernel was used for SVR, and the kernel parameter gamma and regularization parameter C were determined by tuning.
The number of trees for the RFR method was set at 500, and the max_features parameter, which decides how many features each tree in the RFR considers at each split, was determined through the grid search procedure. For the ELR method, the number of neurons in the hidden layer, the generalization parameter C, as well as the proposed activation function's coefficient c, were optimized through the grid search. Model performance was assessed using the root mean square error (RMSE), relative RMSE (RMSE%), and the coefficient of determination (R²).
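The 70/30 split combined with grid search and k-fold cross-validation can be sketched with scikit-learn (SVR shown). The data and parameter grids below are illustrative placeholders, not the study's settings.

```python
import numpy as np
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X = rng.random((120, 10))                 # stand-in for VIs + CH + CC features
y = X @ rng.random(10)                    # synthetic response

# 70% training / 30% unseen test data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.7, random_state=0)

# Grid search with k-fold cross-validation on the training set only
grid = GridSearchCV(
    SVR(kernel="rbf"),
    param_grid={"C": [1, 10, 100], "gamma": [0.01, 0.1, 1.0]},
    cv=5,
    scoring="neg_root_mean_squared_error",
)
grid.fit(X_tr, y_tr)
y_pred = grid.best_estimator_.predict(X_te)   # evaluated on held-out 30%
```

The same pattern applies to the other regressors, with their respective grids (number of PLSR components, RFR max_features, ELR hidden neurons, C, and c).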
The equations were as follows:

RMSE = sqrt( (1/n) Σᵢ (ŷᵢ − yᵢ)² )
RMSE% = (RMSE / ȳ) × 100
R² = 1 − Σᵢ (ŷᵢ − yᵢ)² / Σᵢ (yᵢ − ȳ)²

where ŷᵢ and yᵢ represent the predicted and the measured AGB, LAI, and N data, respectively, ȳ is the average of each measured variable, and n is the sample size used for model testing.
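These three metrics can be computed directly with NumPy:

```python
import numpy as np

def evaluate(y_pred, y_true):
    """Return (RMSE, RMSE% relative to the measured mean, R^2)."""
    y_pred = np.asarray(y_pred, dtype=np.float64)
    y_true = np.asarray(y_true, dtype=np.float64)
    rmse = np.sqrt(np.mean((y_pred - y_true) ** 2))
    rrmse = 100.0 * rmse / y_true.mean()
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return rmse, rrmse, r2
```

A perfect prediction yields RMSE = 0, RMSE% = 0, and R² = 1, which provides a quick sanity check on the implementation.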

Distribution of Canopy Height and Canopy Cover
UAV imagery-derived CH was validated by comparing it to the manual measurements made in the field (Figure 4a), which yielded an R² of 0.898 and RMSE% of 11.8%. As documented in past studies [42,85], photogrammetric point cloud-based CH of smaller plants (i.e., 0.3 m in this case) was slightly underestimated, which might be due to missing points during the point cloud reconstruction and post-processing [86]. The distribution of UAV imagery-derived CH at different developmental stages and across all stages is displayed with violin plots (Figure 4b). CH ranged considerably within each developmental stage, especially on 20 July and 4 August, which indicated large spatial variability and heterogeneity of the soybean canopy structure in the vertical direction. This was likely due to a number of factors, including variation associated with soybean genotypes, soil conditions, and water availability. UAV imagery-extracted CC also demonstrated relatively wide data ranges (Figure 4c), especially at the early developmental stage (i.e., 8 July), which also implied spatial variability and heterogeneity of the soybean canopy. However, due to the gradual closing of the soybean canopy from 8 July to 4 August, the variation in CC became smaller, and the CC over most of the area approached 100% (Figure 4c).

Figure 4. Canopy height (CH) distribution on 8 July, 20 July, and 4 August, and when combining CH data from those three dates (b); canopy cover (CC) distribution on 8 July, 20 July, and 4 August, and when combining CC data from those three dates (c). The violin plots demonstrate the maximum, minimum, median, first, and third quartile values of CH or CC, and the width of each violin element represents the frequency of CH or CC at the specific value. The distributions of CH and CC indicated large spatial variability and heterogeneity of the soybean canopy structure in the vertical direction, with an almost closed canopy on 4 August.

Estimation Results of AGB, LAI, and N
PLSR, SVR, RFR, and ELR were employed for AGB, LAI, and N prediction using satellite-based canopy spectral information, UAV-based canopy structure information, as well as the combination of spectral and structure information, respectively.

The validation metrics for AGB estimation are presented in Table 4. Satellite-based spectral information yielded superior performance to UAV-based canopy structure information (i.e., CH and CC) regardless of the regression model, with R² ranging from 0.792 to 0.870 and RMSE% from 30.8% to 24.3%. Structure information presented lesser, yet comparable performance, with R² from 0.763 to 0.805 and RMSE% from 32.9% to 29.8%. The combination of satellite/UAV spectral and structure information produced the best results, with R² varying from 0.852 to 0.923 and RMSE% from 25.9% to 18.8%. Additionally, the predicted AGB values using satellite, UAV, and their combination, as well as the measured AGB values, were compared using ANOVA with Tukey's HSD tests (α = 0.05). The results showed no significant differences between the AGB predicted using data from the different platforms (Figure 5), or between measured and predicted AGB, demonstrating that data from the different platforms were effective in predicting AGB regardless of the regression method.
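The platform comparison described above (one-way ANOVA followed by Tukey's HSD at α = 0.05) can be sketched with SciPy; the group values below are synthetic placeholders, not the study's measurements, and `scipy.stats.tukey_hsd` is one of several ways to run the test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
measured = rng.normal(500.0, 60.0, 36)            # stand-in measured trait
pred_sat = measured + rng.normal(0.0, 25.0, 36)   # satellite-based predictions
pred_uav = measured + rng.normal(0.0, 30.0, 36)   # UAV-based predictions
pred_fus = measured + rng.normal(0.0, 20.0, 36)   # satellite + UAV fusion

# One-way ANOVA across the four groups, then pairwise Tukey HSD
f_stat, p_anova = stats.f_oneway(measured, pred_sat, pred_uav, pred_fus)
tukey = stats.tukey_hsd(measured, pred_sat, pred_uav, pred_fus)

# "No significant differences" corresponds to all off-diagonal p-values > 0.05
no_diff = bool(np.all(tukey.pvalue[~np.eye(4, dtype=bool)] > 0.05))
```

The `tukey.pvalue` matrix holds one adjusted p-value per group pair, mirroring the pairwise comparisons summarized by the letter codes in Figures 5-7.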
As shown in Table 5, satellite imagery-derived canopy spectral information slightly outperformed UAV-based canopy structure information for LAI estimation, with R² varying from 0.844 to 0.915 and RMSE% from 22.1% to 16.3%. UAV-derived canopy structure information provided decent and comparable results to the satellite spectral information, with R² ranging from 0.832 to 0.909 and RMSE% from 22.2% to 16.9%. The integration of spectral information with structure features boosted the prediction accuracy, with R² varying from 0.923 to 0.927 and RMSE% from 15.5% to 15.1%.

Table 5. Testing results of PLSR, SVR, RFR, and ELR methods for leaf area index (LAI) estimation.
In addition, the measured LAI and the LAI values predicted using data from satellite, UAV, and their combination were compared using ANOVA with Tukey's HSD tests (α = 0.05). The results showed no significant differences between measured and predicted LAI, or among the LAI predicted using the different platforms' data, regardless of the regression method (Figure 6). This indicated that data from the different platforms were effective in predicting LAI. Validation statistics of the leaf N estimation are displayed in Table 6. Overall, whether based on satellite or UAV data individually, or on their combination, the estimation of leaf N was poor.
However, satellite-based canopy spectral information yielded superior performance to UAV-based canopy structure information, with R² ranging from 0.301 to 0.565 and RMSE% from 23.1% to 18.2%. UAV-derived canopy structure features presented poorer performance, with R² varying from 0.125 to 0.328 and RMSE% from 25.8% to 22.6%. Although to a minor extent, the inclusion of canopy structure information with spectral features improved the estimation accuracy of N regardless of the regression model. The estimated N values using data from satellite, UAV, and their combination were compared with each other and against the measured N using ANOVA with Tukey's HSD tests (α = 0.05). The results showed no significant differences between measured and predicted N, or among the N predicted using the different platforms' data, regardless of the regression method (Figure 7). This indicated that data from the different platforms were effective in predicting N.

Satellite/UAV Data Fusion and Its Impact on Model Performance
Canopy spectral features (i.e., VIs) have been extensively used in the realm of remote sensing-based crop monitoring and agricultural applications. As shown in Tables 4-6, regardless of the regression model, canopy spectral information outperformed structure features for AGB, LAI, and N estimation, which agreed with previous studies [4,87]. Specifically, canopy structure features yielded slightly lesser, yet comparable performance to spectral features for AGB and LAI prediction. This indicated, to some extent, that canopy structure features are promising alternatives to spectral VIs for remote sensing-based crop monitoring. Many previous studies have noted the potential of canopy structure features in the estimation of AGB [42,88], LAI [4], and leaf N [89]. Due to the complementary information provided by the abundant satellite-based spectral features and the UAV-derived structure features, satellite/UAV data fusion boosted the model performance compared with satellite or UAV data alone (Tables 4-6). Canopy structure features (i.e., CH and CC) contain independent information on canopy growth and architecture [12,90], which supplements the spectral information to some extent and eventually improved model performance [4,91-93].
Compared with the estimation of AGB and LAI, canopy structure features have been rarely used for N estimation in remote sensing applications. In our case, canopy structure features produced a poor performance for N estimation; however, integrating canopy structure features with spectral features was able to improve model performance regardless of the regression methods (Table 6). It is Figure 7. Boxplot for measured (M) and predicted N using satellite (S), UAV (U), as well as combined satellite and UAV (S + U), data from each model. ANOVA was carried out, and the letter 'a' on the bars indicate no significant differences according to the HSD Tukey's test (α = 0.05).

Satellite/UAV Data Fusion and Its Impact on Model Performance
Canopy spectral features (i.e., VIs) have been used extensively in remote sensing-based crop monitoring and agricultural applications. As shown in Tables 4-6, canopy spectral information outperformed structure features for AGB, LAI, and N estimation regardless of the regression model, in agreement with previous studies [4,87]. Canopy structure features yielded slightly lower, yet comparable, performance to spectral features for AGB and LAI prediction, indicating that they are promising alternatives to spectral VIs for remote sensing-based crop monitoring. Many previous studies have noted the potential of canopy structure features for estimating AGB [42,88], LAI [4], and leaf N [89]. Because satellite-based spectral features and UAV-derived structure features provide complementary information, satellite/UAV data fusion improved model performance relative to satellite or UAV data alone (Tables 4-6). Canopy structure features (i.e., CH and CC) carry independent information on canopy growth and architecture [12,90] that supplements spectral information and ultimately improves model performance [4,91-93].
Compared with AGB and LAI estimation, canopy structure features have rarely been used for N estimation in remote sensing applications. In our case, canopy structure features alone performed poorly for N estimation; however, integrating them with spectral features improved model performance regardless of the regression method (Table 6). It is worth noting that previous studies have documented both improved model performance [94-96] and no improvement [89] when canopy structure features were added to VIs for N estimation. Additionally, the hand-held sensor used for N measurement sampled a small area (19.6 mm²) of a single leaf per reading, and, even with six measurements per plot, some error was expected regarding how representative these spot measurements were of the whole canopy. Further research is needed to investigate the potential of canopy structure features for estimating leaf biochemical traits, such as leaf N concentration, across different environments and crop species, and alternative ground-truth measurements should be examined to improve the prediction of biochemical traits.
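As an illustration of how a structure feature such as canopy cover can be derived from UAV RGB imagery, the following sketch uses a common excess-green thresholding heuristic. This is not necessarily the method used in this study; the threshold and toy image are hypothetical:

```python
import numpy as np

def canopy_cover(rgb):
    """Fraction of vegetated pixels in an RGB array (H, W, 3), values in [0, 1].

    Heuristic: classify a pixel as canopy when the excess-green index
    ExG = 2g - r - b exceeds a fixed threshold, then report the
    vegetated-pixel fraction as canopy cover (CC).
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    exg = 2.0 * g - r - b
    return float((exg > 0.1).mean())

# Toy plot image: left half "soil" (reddish), right half "canopy" (green).
img = np.zeros((10, 10, 3))
img[:, :5] = [0.5, 0.4, 0.3]   # soil:   ExG = 0.8 - 0.5 - 0.3 = 0.0
img[:, 5:] = [0.2, 0.6, 0.2]   # canopy: ExG = 1.2 - 0.2 - 0.2 = 0.8
print(canopy_cover(img))       # half of the plot is classified as canopy
```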

Canopy Structure Information and Spectral Saturation Issue
The asymptotic saturation phenomenon has been widely recognized as an issue for optical remote sensing-based crop monitoring (i.e., crop AGB, LAI, and N estimation). It represents one of the major limitations of optical remote sensing data, especially data collected over dense and heterogeneous canopies [11,12,40]. Canopy spectral features represented by commonly used VIs, such as the Normalized Difference Vegetation Index (NDVI) and Normalized Difference Red Edge (NDRE), along with the canopy structure features CH and CC, were plotted against AGB, LAI, and N (Figure 8). As AGB, LAI, and N increased at the late developmental stage (i.e., August 4, represented by red dots in Figure 8), NDVI and NDRE increased only slightly or not at all, presenting gradual saturation patterns; the canopy structure feature CC displayed a similar pattern. Asymptotic saturation of canopy spectral features is a challenge for estimating plant traits of crops planted at high density and leads to lower accuracy [26]. CH, however, remained responsive to increases in AGB, LAI, and N, which weakened the saturation issue and ultimately improved model accuracy to some extent [4].
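The saturation pattern described above can be reproduced with a toy two-component reflectance mixing model, in which the gap fraction decays exponentially with LAI (Beer-Lambert style). All parameter values below are illustrative, not fitted to the study's data:

```python
import numpy as np

def ndvi_from_lai(lai, k=0.7,
                  soil=(0.25, 0.30),   # (red, NIR) bare-soil reflectance
                  leaf=(0.05, 0.50)):  # (red, NIR) dense-canopy reflectance
    """Toy mixing model: gap fraction follows Beer-Lambert, exp(-k * LAI),
    so plot reflectance converges to the canopy value as LAI grows,
    and NDVI saturates."""
    gap = np.exp(-k * np.asarray(lai, dtype=float))
    red = gap * soil[0] + (1 - gap) * leaf[0]
    nir = gap * soil[1] + (1 - gap) * leaf[1]
    return (nir - red) / (nir + red)

lai = np.array([0.5, 1.5, 3.0, 4.5, 6.0])
print(ndvi_from_lai(lai).round(3))
# The NDVI increments shrink as LAI rises: the index barely separates
# LAI = 4.5 from LAI = 6.0, i.e., the asymptotic saturation pattern.
```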
Due to the superior performance of ELR compared to the other models, ELR estimates of AGB, LAI, and N were plotted against their corresponding ground-truth values (Figure 9). AGB, LAI, and N were underestimated for samples with higher values, particularly when using spectral data alone, which is likely attributable to the saturation issue [97,98]. Consistent with the previous study by Moeckel, Safari, Reddersen, Fricke and Wachendorf [8], an obvious improvement of fit was observed for all variables when combining canopy spectral and structure features, as demonstrated by higher R2 and lower RMSE%, as well as the convergence of the linear fitting line (black solid line in Figure 9) toward the bisector (red dashed line), implying that the saturation effect was reduced to some extent when employing satellite/UAV data fusion for crop monitoring.
Figure 8. Scatter plots of aboveground biomass (AGB) (a), LAI (b), and N (c) against crop spectral and structural features. NDVI is the normalized difference vegetation index, NDRE is the normalized difference red-edge index, CC is canopy coverage in percent, and CH is canopy height in meters. Green dots represent data from July 8, blue dots from July 20, and red dots from August 4, 2017. The spectral features NDVI and NDRE presented saturation patterns, while CH remained responsive to increases in AGB, LAI, and N.

Contribution of Data Fusion at Different Crop Developmental Stages
ELR with the newly proposed activation function consistently yielded the best performance for AGB, LAI, and N prediction compared to the other ML methods in most cases (Tables 4-6). Thus, it was employed to generate prediction results at different developmental stages of soybean (Table 7).
Overall, adding UAV-derived canopy structure information to satellite-based spectral features improved the model performance of AGB, LAI, and N estimation at all developmental stages, except for N estimation on July 20 (Table 7 and Figure 10). Canopy structure features (i.e., CH and CC) provide independent information on canopy growth and architecture [12,90] and supplement canopy spectral features for AGB, LAI, and N estimation [4,91-93], which likely contributed to the improved model performance. Additionally, as evidenced in Figure 11, background soil reflectance, especially at early stages (i.e., July 8) [99-101], as well as the saturation issue caused by a dense, closed canopy at a late stage (i.e., August 4; Figure 11c) [35,102], affected the performance of satellite-based crop monitoring. Thus, the improvement in model performance when adding high-resolution canopy structure features to VIs indicates that canopy structure features likely weakened the effects of soil background and saturation.
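The convergence of the fitting line toward the bisector can be quantified by the slope of the least-squares fit, reported alongside R2 and RMSE%. The following sketch illustrates this with synthetic AGB values, comparing a saturating predictor with an unbiased one; none of these numbers are the study's data:

```python
import numpy as np

def agreement(measured, predicted):
    """R2, RMSE as a percentage of the measured mean, and the slope of the
    least-squares line through (measured, predicted). A slope near 1 means
    the fit line converges to the 1:1 bisector (little high-value bias)."""
    m = np.asarray(measured, dtype=float)
    p = np.asarray(predicted, dtype=float)
    ss_res = np.sum((m - p) ** 2)
    ss_tot = np.sum((m - m.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    rmse_pct = 100.0 * np.sqrt(ss_res / m.size) / m.mean()
    slope = np.polyfit(m, p, 1)[0]
    return r2, rmse_pct, slope

# Synthetic AGB (t/ha): a saturating predictor underestimates high values,
# so its fit slope falls below 1; the "fused" predictor tracks the bisector.
agb = np.linspace(1.0, 8.0, 15)
pred_saturating = 1.0 + 0.75 * (agb - 1.0)  # compresses the upper range
pred_fused = agb + 0.1                      # small constant bias only

print(agreement(agb, pred_saturating))
print(agreement(agb, pred_fused))
```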

Figure 9. Cross-validation measured versus ELR-predicted AGB (a), LAI (b), and N (c) when using satellite, UAV, and satellite + UAV data. The red dashed line is the bisector, and the black solid line is the linear regression fitting line; S + U represents combined satellite and UAV data. The combination of satellite spectral information with UAV-derived structure features yielded higher estimation accuracies, as well as convergence of the linear fitting line.

Figure 10. Percentage changes of R2 for AGB, LAI, and N estimation when using satellite/UAV data fusion compared to satellite data alone at different developmental stages (note: all estimates here are based on the ELR method); 'All' represents the result when combining all data samples from different dates. The combination of satellite and UAV data improved model accuracy in most cases, except for N estimation with the data from July 20, 2017.

It is worth noting that satellite/UAV data fusion was not able to improve the model performance of leaf N estimation on July 20. The unstable performance of CH at different developmental stages, when combined with spectral data for N estimation, has also been reported by Näsi, Viljanen, Kaivosoja, Alhonoja, Hakala, Markelin and Honkavaara [89]. This might be due to the weaker correlation between UAV-derived features and leaf N on July 20 compared with the other developmental stages (Table 7), as well as possible information homogeneity and redundancy between canopy spectral and structure features [103]. Further investigation should examine the potential of canopy structure features for leaf N estimation at different developmental stages across crop species and environments. In addition, alternative methods of ground-truthing N concentration should be examined to improve the prediction of biochemical traits.

Comparison of Different Modeling Methods
As evidenced by high R2 and low RMSE% (Figure 12), ELR produced the most accurate models for AGB and LAI estimation, matching results obtained in previous studies [4,104]. This could be attributed to the nonlinear learning capability of the integrated ReLU and tanh activation functions, along with the adjusted ELR model's insensitivity to outliers [82]. Additionally, the coefficient newly introduced to ELR's dual activation function in this study (Figure 3b and Equation (2)) made the model more robust and flexible in capturing input feature patterns, ultimately improving model performance. For N estimation, RFR yielded the best performance, with the highest R2 and lowest RMSE% (Figure 12). RFR tolerates noisy, high-dimensional data well through its ensemble of decision trees [79]. SVR provided decent accuracies, comparable to RFR in most cases in this study; it can also cope with high data dimensionality and is less sensitive to sample size than RFR [105]. RFR and SVR are among the most widely used ML algorithms in remote sensing-based crop monitoring, and their ability to handle nonlinear and noisy data, as well as to mitigate overfitting, is well documented [106-108]. The PLSR model was outperformed by the other models in most cases, with lower R2 and higher RMSE% (Figure 12). PLSR can tackle multicollinearity by projecting the predictors onto a small set of latent components; however, its limitations in dealing with nonlinear and complex relationships between plant traits and spectral features have been noted in the literature [107].
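For readers unfamiliar with ELR, a standard extreme learning machine with a blended ReLU/tanh activation can be sketched as follows. The blending coefficient here is a hypothetical stand-in for the paper's Equation (2), whose exact form is not reproduced in this excerpt; the toy data are synthetic:

```python
import numpy as np

rng = np.random.default_rng(42)

def blended(x, alpha=0.5):
    """Hypothetical dual activation: a convex blend of ReLU and tanh
    (a stand-in for the coefficient defined in Equation (2))."""
    return alpha * np.maximum(x, 0.0) + (1.0 - alpha) * np.tanh(x)

def elm_fit(X, y, hidden=50, alpha=0.5, ridge=1e-3):
    """Extreme learning machine: input weights are random and fixed;
    only the output weights are solved, via ridge least squares."""
    W = rng.normal(0, 1, (X.shape[1], hidden))
    b = rng.normal(0, 1, hidden)
    H = blended(X @ W + b, alpha)                 # hidden-layer features
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(hidden), H.T @ y)
    return W, b, beta

def elm_predict(model, X, alpha=0.5):
    W, b, beta = model
    return blended(X @ W + b, alpha) @ beta

# Toy regression: recover a smooth nonlinear target from two features.
X = rng.uniform(-1, 1, (300, 2))
y = np.sin(2 * X[:, 0]) + 0.5 * X[:, 1] ** 2
model = elm_fit(X, y)
pred = elm_predict(model, X)
print(f"training RMSE = {np.sqrt(np.mean((y - pred) ** 2)):.4f}")
```

Because only the linear readout is trained, fitting reduces to one regularized least-squares solve, which is why ELM-style models are fast to train compared with iteratively optimized networks.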
Figure 12. R2 and RMSE% of AGB, LAI, and N estimation with the PLSR, SVR, RFR, and ELR methods using satellite, UAV, and combined satellite and UAV data, respectively. S represents satellite data, U represents UAV data, and S + U stands for the combination of satellite and UAV data. ELR outperformed the other methods in AGB and LAI estimation, and RFR produced the best results for N estimation.


Conclusions
The main merit of this research was the demonstration of a novel application for crop monitoring across a heterogeneous soybean field in the framework of satellite/UAV data fusion and machine learning. The main conclusions were: 1. With its capability of providing high-resolution, detailed three-dimensional canopy structure information (i.e., canopy height and canopy cover), a low-cost UAV with an RGB camera was a promising tool to supplement satellite-based crop monitoring.
2. UAV-derived high-resolution and detailed canopy structure features (i.e., canopy height and canopy cover) were significant indicators for crop monitoring. Combining satellite imagery-based rich canopy spectral information with UAV-derived canopy structural features using machine learning improved crop AGB, LAI, and leaf N concentration estimation. Additionally, adding canopy structure information to spectral features tended to reduce the background soil effect and the asymptotic saturation issue.
3. The ELR model with the proposed activation function slightly outperformed PLSR, RFR, and SVR methods in the prediction of AGB and LAI, while RFR yielded the best performance for leaf N concentration estimation.
The results of this study illustrated the tremendous potential of satellite/UAV synergy in the context of crop monitoring using machine learning. Nevertheless, the approach employed in this study should be tested across different field environments, as well as a variety of crop types.