Potato Leaf Area Index Estimation Using Multi-Sensor Unmanned Aerial Vehicle (UAV) Imagery and Machine Learning

Abstract: Potato holds significant importance as a staple food crop worldwide, particularly in addressing the needs of a growing population. Accurate estimation of the potato Leaf Area Index (LAI) plays a crucial role in predicting crop yield and facilitating precise management practices. Leveraging the capabilities of UAV platforms, we harnessed their efficiency in capturing multi-source, high-resolution remote sensing data. Our study focused on estimating potato LAI utilizing UAV-based digital red–green–blue (RGB) images, Light Detection and Ranging (LiDAR) points, and hyperspectral images (HSI). From these data sources, we computed four sets of indices and employed them as inputs for four different machine-learning regression models: Support Vector Regression (SVR), Random Forest Regression (RFR), Histogram-based Gradient Boosting Regression Tree (HGBR), and Partial Least-Squares Regression (PLSR). We assessed the accuracy of individual features as well as various combinations of feature levels. Among the three sensors, HSI exhibited the most promising results due to its rich spectral information, surpassing the performance of LiDAR and RGB. Notably, the fusion of multiple features outperformed any single component, with the combination of all features of all sensors achieving the highest R² value of 0.782. HSI, especially when utilized in calculating vegetation indices, emerged as the most critical feature in the combination experiments. LiDAR played a relatively smaller role in potato LAI estimation compared to HSI and RGB. Additionally, we discovered that the RFR excelled at effectively integrating features.


Introduction
Potatoes are one of the most widely consumed staple foods [1]. They provide essential carbohydrates and valuable nutrients, contributing to the diets of millions worldwide [2]. Moreover, being a major agricultural crop, potatoes play a vital role in economies and global food security [3]. Monitoring the growth status of potato crops is of utmost significance for effective agricultural management. However, due to their subterranean nature, direct observation of potato growth presents challenges. In order to overcome this limitation, an efficient approach involves calculating various indices based on above-ground plant parts. One such vital index is the LAI, which serves as a reliable indicator of the crop's health and vigor. LAI measures the total green leaf area (one side) per unit of ground surface area. It is a crucial parameter reflecting the canopy structure and the biochemical status and is commonly used in the estimation of biomass [4], chlorophyll content [5], crop yield [6], and other agroecosystem studies [7]. By leveraging LAI measurements, farmers and researchers can gain valuable insights into the overall performance of potato crops, enabling them to make informed decisions regarding irrigation, fertilization, and crop management practices.
coefficients ranging from 0.677 to 0.851 [39]. Ma et al. evaluated the performances of four methods for LAI estimation based on eight spectral indices, and the highest R² of the validation set reached 0.74 [40]. These methods could achieve good estimation accuracy (R² > 0.7), but the bands that constitute the VIs were still referenced from other studies, and no specific selection was made for the research target. Ma et al. conducted band selection based on four basic index types for cotton LAI estimation, and the highest R² value reached 0.9 [41].
It remains to be discussed whether other vegetation types, such as potatoes, require band selection and whether more complex index types are suitable for band selection methods.
Despite these promising results, each data source has limitations in estimating LAI on its own. Therefore, many researchers have started exploring the combination of multiple data sources to estimate LAI. Luo et al. (2019) combined LiDAR and hyperspectral data to predict the LAI of maize, wheat, and vegetables, and the R² of the combined features reached 0.829 [42]. Yue et al. (2018) fused the VIs and height from RGB and HSI to estimate above-ground biomass and LAI, with the highest R² for LAI exceeding 0.9 [43]. The combination of multiple data sources has shown a significant improvement in accuracy compared to using a single data source. However, further research is needed to determine the enhancement in crop LAI estimation that can be achieved by integrating all three data sources: RGB, LiDAR, and HSI.
The LAI assessment methods for various crops based on image features such as vegetation indices have achieved good estimation accuracy, which is further enhanced by the combination of multiple data sources. However, two issues still need further analysis. The first is how to select the bands for VI calculation. Most studies use fixed-band VIs rather than considering the differences between plants in different growth states. Taking NDVI as an example, many studies calculate NDVI as an input feature, yet the bands selected (λ1 and λ2) differ greatly among them [17,44]. The second is that few studies have reported how the features of the three data sources, RGB, LiDAR, and HSI, contribute to the performance of LAI estimation.
Thus, the specific objectives of this study are to (1) validate whether the band combinations used in previous studies are suitable for LAI estimation of potatoes, through comparative experiments between fixed and optimized band combinations; (2) investigate the performance of single and combined data sources in the LAI estimation of potatoes; and (3) examine the importance and contribution of features from different data sources in the combination experiments.
The article is organized as follows. After Section 1, field experiments, data acquisition and processing, and models with strategies are discussed in Section 2. Section 3 introduces the ground data and the comparison results of different feature combinations. Differences in ground data from two potato cultivars and the contribution and role of different data sources are discussed in Section 4. Finally, Section 5 concludes the work.

Field Experiments
The field experiment was conducted in 2021 at the University of Wisconsin (UW) Hancock Agricultural Research Station (HARS), which is a vegetable research farm located in the Central Sands area of Wisconsin. The whole field was 41 m wide by 78 m long, comprising a total of 32 individual plots. Each subplot was comprised of 8 rows, with each row measuring 7.6 m in length and 0.9 m in width.
The experiment followed a split-plot design with 4 replications, as shown in Figure 1. The field consisted of one strip without fertigation and one strip with fertigation. Within each strip, two different nitrogen (N) rates were randomly assigned to the entire plot, as specified in Table 1. Other production practices were implemented based on the recommendations provided by UW Extension [45]. Two cultivars, Snowden (chipping potato cultivar) and Colomba (yellow potato cultivar), were planted on 23 April and harvested. The sampling dates were scheduled for 30 June, 20 July, 23 July, 3 August, and 12 August.

Table 1 (N rates by treatment; unit: kg/ha).
C     37    37    -     -     -     -     -     -
R1    287   37    85    165   -     -     -     -
R2    287   37    85    30    34    34    34    34
R3    392   37    85    134   34    34    34    34

Data Collection
RGB, LiDAR, and hyperspectral data were synchronously collected by three sensors mounted on a Matrice 600 Pro platform (DJI Technology Co., Shenzhen, China) under clear sky conditions five times in the growing season, on 30 June, 20 July, 29 July, 3 August, and 12 August. RGB images were taken by a Cyber-shot DSC-RX1R II camera (Sony Group Corporation, Minato, Tokyo, Japan), and 3D point data were taken by a VLP-16 sensor (Velodyne Lidar Inc., San Jose, CA, USA), which uses an array of 16 infrared (IR) lasers paired with IR detectors and fires each laser at 18.08 kHz. The HSI were taken by a Nano-Hyperspec sensor (Headwall Photonics Inc., Bolton, MA, USA), containing 640 pixels in each scan line with a pixel pitch of 7.4 µm. The details of the sensors are provided in Table 2. A Global Navigation Satellite System-aided Inertial Navigation System (GNSS/INS), VN-300 (VectorNav, Dallas, TX, USA), was integrated with the sensors to provide the longitude, latitude, and attitude (yaw, pitch, and roll).
We used an LAI-2000 Plant Canopy Analyzer (LI-COR, Inc., Lincoln, NE, USA) to measure the LAI of the potato plants. It utilizes a "fisheye" optical sensor to capture light measurements both above and below the canopy. The device measures light interception at five different zenith angles simultaneously and then employs a radiative transfer model to compute the LAI.

Image Process and Features Calculation
In our methodology, we adopted a rigorous approach to ensure the accuracy of image processing. After collecting raw data from the three sensors, we used the GRYFN Processing Tool V 1.2.6 (West Lafayette, IN, USA) to preprocess them, including orthorectification, mosaicking, and geometric and radiometric correction. For HSI, the Hyperspec III software V 3.1.4 (Bolton, MA, USA) was also used for radiometric correction. Further corrections and processing were carried out in Python. Our workflow follows a standardized process that draws on methods commonly used in similar studies [17]. We conducted a visual interpretation to assess the correctness of the processing results. To analyze the relationships between the images and LAI, various related features were extracted from the processed RGB, LiDAR, and HSI data. These features reduce the redundancy of the raw data and emphasize specific information about the plant. All the processing steps and features are described below, and the corresponding formulas are shown in Table 3.
from it by the manually drawn plot boundaries, and we obtained 160 (32 plots by 5 times) individual images. Before extracting the features of plots, we used the balanced histogram thresholding (BHT) method to automatically remove the background (shadow and soil) in Python.
Six features of each plot were calculated based on the processed images: mean pixel value of red bands (R), mean pixel value of green bands (G), mean pixel value of blue bands (B), the mean normalized value of red band (Normalized_R), the mean normalized value of green band (Normalized_G), the mean normalized value of blue band (Normalized_B).
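The six RGB features can be illustrated with a minimal Python sketch. It assumes that a boolean plant mask is already available (for example from the BHT step) and that the normalized value of a channel is that channel divided by the sum of the three channels, computed per pixel; this definition is a common convention and an assumption here, since Table 3 is not reproduced in the text. The function name `rgb_plot_features` is hypothetical.

```python
import numpy as np

def rgb_plot_features(image, mask):
    """Mean and mean normalized R, G, B over the plant pixels of one plot.

    image: (H, W, 3) array with channels ordered R, G, B.
    mask:  (H, W) boolean array, True where the pixel is plant (assumed given).
    """
    pixels = image[mask].astype(float)           # (N, 3) plant pixels only
    mean_rgb = pixels.mean(axis=0)

    # Assumed normalization: channel / (R + G + B), per pixel.
    total = pixels.sum(axis=1)
    valid = total > 0                            # avoid division by zero
    normalized = pixels[valid] / total[valid, None]
    mean_norm = normalized.mean(axis=0)

    names = ["R", "G", "B", "Normalized_R", "Normalized_G", "Normalized_B"]
    return dict(zip(names, np.concatenate([mean_rgb, mean_norm])))
```

By construction, the three normalized features of a plot sum to one, so they carry color balance rather than brightness.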

LiDAR-Based Features
All the raw data collected from LiDAR were georeferenced with the position recorded by the GNSS/IMU unit and converted into a point cloud in GRYFN. Most LiDAR points were distributed within a reasonable range of potato plant heights. After separating the points of different plots by plot boundaries, we used cloth simulation filtering (CSF) [46] to distinguish the plant and ground points in each plot. The basic idea of CSF is to assume that an inverted point cloud surface is covered with a rigid cloth. By analyzing the interactions between the cloth nodes and the LiDAR points, the cloth nodes can be used to simulate the ground. The extracted ground points were used to generate a DTM with 8 cm spatial resolution. The median height of the ground points in each grid cell was set as the height of that cell, while for cells where no ground points were extracted, the height was set to the mean value of the surrounding cells. The vertical distances of the plant points to the DTM were taken as the heights of the plant points. Besides the point cloud, GRYFN also outputs a median-filtered digital surface model (DSM). With the categorized points, plant heights, and DSM, we calculated several features:
Height Percentile: The 50th, 75th, 90th, and 95th percentile height of plant points.
Canopy Volume: The number of points categorized as plants.
Canopy Cover: The ratio of the number of plant points to all the points.
Max Plant Height: The difference between the maximum height and the minimum height of DSM.
Plant Area Index (PAI): The plant area, the sum of plant area density (PAD) values multiplied by voxel volume per unit ground surface area. The index was calculated by the algorithm in [47]. The algorithm scales and averages the returned LiDAR intensities for each LiDAR pulse and uses the Beer-Lambert law to estimate the PAD.
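The point-based features listed above (all except PAI, which relies on the algorithm in [47]) can be sketched as follows. This is a minimal illustration, assuming the points have already been classified by CSF and the plant heights already computed relative to the DTM; the function name `lidar_plot_features` and the feature keys are hypothetical.

```python
import numpy as np

def lidar_plot_features(plant_heights, n_ground_points, dsm):
    """Plot-level LiDAR features from classified points and a DSM.

    plant_heights:   1-D array of plant point heights above the DTM (m).
    n_ground_points: number of points classified as ground in the plot.
    dsm:             2-D array of DSM elevations for the plot (NaN = no data).
    """
    plant_heights = np.asarray(plant_heights, dtype=float)
    n_plant = plant_heights.size
    h50, h75, h90, h95 = np.percentile(plant_heights, [50, 75, 90, 95])
    return {
        "H50": h50, "H75": h75, "H90": h90, "H95": h95,
        # Canopy volume is represented by the count of plant points.
        "CanopyVolume": n_plant,
        # Share of plant points among all points in the plot.
        "CanopyCover": n_plant / (n_plant + n_ground_points),
        # Range of the DSM within the plot.
        "MaxPlantHeight": float(np.nanmax(dsm) - np.nanmin(dsm)),
    }
```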

HSI-Based Features
The data collected from the hyperspectral scanner were orthorectified and georeferenced based on the position information of the GNSS/IMU unit. Then, raw digital numbers were calibrated to reflectance based on the metadata and calibration panels in Hyperspec III and GRYFN. The HSI exhibited standard green vegetation spectral curves. After separating the reflectance map of each plot, the background was removed based on the threshold in Python. The threshold was set to 0.15 at the 800 nm wavelength, and pixels with lower values were removed. Images of each plot were also extracted by the plot boundaries. We calculated two types of features, VI and statistical features, based on that.
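The background removal described above can be sketched in a few lines of Python. The sketch assumes a reflectance cube ordered as rows, columns, bands and a vector of band-center wavelengths; because a sensor band may not sit exactly at 800 nm, the nearest band is used, which is an assumption of this example. The function name `plant_mask` is hypothetical.

```python
import numpy as np

def plant_mask(cube, wavelengths, ref_nm=800.0, threshold=0.15):
    """Boolean mask of pixels kept as vegetation.

    cube:        (H, W, B) reflectance array.
    wavelengths: (B,) band-center wavelengths in nm.
    Pixels whose reflectance at the band nearest ref_nm is below the
    threshold are treated as background and dropped.
    """
    wavelengths = np.asarray(wavelengths, dtype=float)
    band = int(np.argmin(np.abs(wavelengths - ref_nm)))
    return cube[:, :, band] >= threshold
```

Indexing the cube with the mask, `cube[plant_mask(cube, wavelengths)]`, yields an array of shape (number of plant pixels, number of bands) from which plot-level features can be computed.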
We selected six typical indices as input features. The NDVI [48] is the most widely used VI and can be used to monitor the phenology, quantity, and activity of vegetation. The two-band enhanced vegetation index (EVI2) not only maintains the advantages of EVI, improving linearity with biophysical vegetation properties and reducing saturation effects, but also does not need a blue band [49]. Because red-edge reflectance-based VIs are preferable for estimating crop LAI and other canopy architecture parameters and are more sensitive to leaf chlorophyll content [50], we selected four VIs related to the red-edge bands and chlorophyll: the red-edge chlorophyll index (CI_rededge) [51], the green chlorophyll index (CI_green) [51], the red-edge modified simple ratio index (MSR_rededge) [52], and the MERIS terrestrial chlorophyll index (MTCI) [53].
In addition, the mean and standard deviation (Std) of each of the HSI bands were calculated and used as input features for estimating LAI.
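For reference, the six indices can be written out in Python using their commonly published formulations. This is a sketch, not a reproduction of Table 3, which does not appear in the text: the formulas below are the widely cited forms of each index, and the choice of which band supplies each input (including the three MTCI bands near 681, 709, and 754 nm) is an assumption. The function name `vegetation_indices` is hypothetical.

```python
import numpy as np

def vegetation_indices(green, red, red_edge, nir, r681, r709, r754):
    """Six VIs from band reflectances (scalars or arrays of equal shape)."""
    ratio_re = nir / red_edge
    return {
        "NDVI": (nir - red) / (nir + red),
        "EVI2": 2.5 * (nir - red) / (nir + 2.4 * red + 1.0),
        "CI_rededge": ratio_re - 1.0,
        "CI_green": nir / green - 1.0,
        "MSR_rededge": (ratio_re - 1.0) / np.sqrt(ratio_re + 1.0),
        # MTCI uses three narrow bands around the red edge.
        "MTCI": (r754 - r709) / (r709 - r681),
    }
```

For example, with NIR = 0.5 and red = 0.1, NDVI is 0.4 / 0.6 ≈ 0.667 and EVI2 is 1.0 / 1.74 ≈ 0.575.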

Grid Searching Bands and Fixed Bands
The HSI imagery with 274 narrow bands supplies more refined spectral information for constructing VIs, which are commonly calculated from fixed and broad spectral bands based on previous studies. For example, the NDVI is the ratio of the difference between the near-infrared (NIR) and red bands to the sum of these two bands. The band ranges are commonly 770-890 and 630-690 nm for NIR and red, respectively. However, with the HSI imagery, there are 55 bands within the NIR range and 27 within the red range. Therefore, to locate the specific HSI bands with the best performance for estimating LAI, we applied a grid-searching method in the feature selection step. First, we divided the bands into five sets: blue, green, red, red edge, and NIR. The spectral range of each set was slightly larger than the commonly used range to ensure that the best combination could be found; for example, the range of NIR was set to 760-1000 nm. Then, we calculated the six HSI-based VIs for every band combination drawn from the band sets, and for each VI the combination with the highest Pearson correlation coefficient (r) with LAI was selected as the final output of this index. Furthermore, we compared the performance in estimating LAI of VIs calculated using the selected HSI bands with that of VIs calculated using the fixed (empirical) bands. When the sensor wavelengths and the empirical wavelength did not match exactly, we used the band closest in wavelength as a substitute in the calculation.
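The band search can be illustrated for NDVI with a minimal Python sketch. It assumes plot-mean spectra as input and exhaustively tests every red and NIR band pair; the red search range of 620-700 nm is a hypothetical value (only the NIR range of 760-1000 nm is stated in the text), and the function names are illustrative.

```python
import numpy as np

def nearest_band(wavelengths, target_nm):
    """Wavelength of the sensor band closest to an empirical wavelength."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    return float(wavelengths[np.argmin(np.abs(wavelengths - target_nm))])

def search_ndvi_bands(spectra, wavelengths, lai,
                      red_range=(620.0, 700.0), nir_range=(760.0, 1000.0)):
    """Return (red_nm, nir_nm, r) for the band pair with the highest Pearson r.

    spectra:     (n_plots, n_bands) mean reflectance per plot.
    wavelengths: (n_bands,) band centers in nm.
    lai:         (n_plots,) measured LAI.
    """
    wavelengths = np.asarray(wavelengths, dtype=float)
    red_idx = np.flatnonzero((wavelengths >= red_range[0]) & (wavelengths <= red_range[1]))
    nir_idx = np.flatnonzero((wavelengths >= nir_range[0]) & (wavelengths <= nir_range[1]))

    best = (None, None, -np.inf)
    for i in red_idx:
        for j in nir_idx:
            ndvi = (spectra[:, j] - spectra[:, i]) / (spectra[:, j] + spectra[:, i])
            r = np.corrcoef(ndvi, lai)[0, 1]     # Pearson correlation with LAI
            if r > best[2]:
                best = (float(wavelengths[i]), float(wavelengths[j]), float(r))
    return best
```

The same loop structure extends to the other indices by swapping the index formula and the band sets; three-band indices such as MTCI would need a third nested loop, which increases the search cost accordingly.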

Combination of VIs from Different Data Sources
VIs derived from RGB, LiDAR, and HSI data can be regarded as distinct subsets of features, each with a unique focus. Combining these multi-source features enriches the input and can improve the estimation capability of the models. To assess the extent of this improvement, two sets of experiments were conducted. In the first, the three individual data sources were evaluated independently. In the second, four combinations (RGB + LiDAR, RGB + HSI, LiDAR + HSI, and RGB + LiDAR + HSI) were evaluated. All inputs were standardized, and the model parameters were determined through a combination of grid searching and manual adjustment.
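Building the combined inputs amounts to concatenating the per-source feature matrices column-wise and standardizing the result. The following is a minimal sketch under that assumption; the function names and the feature counts used in the usage note are hypothetical.

```python
from itertools import combinations

import numpy as np
from sklearn.preprocessing import StandardScaler

def combine_sources(feature_sets):
    """Column-wise concatenation for every combination of two or more sources.

    feature_sets: dict mapping a source name to an (n_samples, n_features) array.
    Returns a dict mapping labels such as "RGB + HSI" to the stacked matrix.
    """
    names = list(feature_sets)
    combos = {}
    for k in range(2, len(names) + 1):
        for combo in combinations(names, k):
            combos[" + ".join(combo)] = np.hstack([feature_sets[n] for n in combo])
    return combos

def standardize(X):
    """Zero-mean, unit-variance scaling of each feature column."""
    return StandardScaler().fit_transform(X)
```

With the three sources passed in the order RGB, LiDAR, HSI, this yields the four combinations named above. When the models are evaluated by cross-validation, fitting the scaler inside each training fold (for example through a scikit-learn `Pipeline`) avoids leaking information from the held-out plots.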

Statistical Features Selection and Combination with VIs
The VIs are considered subsets of the original reflectance data and typically emphasize specific vegetation properties. In addition to VIs, statistical features are also important for evaluating the biological conditions of vegetation. The mean and standard deviation of each band were calculated as the original statistical features. However, these 548 features cannot be directly used for model training due to the presence of invalid and redundant information. To address this, we employed Recursive Feature Elimination with Cross-Validation (RFECV) on the original features. RFECV is a method that depends on the chosen estimator and requires feature importance as input. In this case, we utilized Random Forest (RFR) and Support Vector Regression (SVR) as base models to extract two feature subsets. Subsequently, we evaluated the performance of the four regression models using the two resulting feature sets obtained from RFECV. This approach aimed to reduce redundancy and identify the essential features for model training and estimation. The selected features and the combination of VIs and selected features are evaluated.
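A minimal sketch of the RFECV step with a random forest base estimator is given below, using scikit-learn's `RFECV`. The step size, number of trees, fold count, and scoring choice are assumptions made for illustration; the text does not state the settings actually used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFECV

def select_features(X, y, n_estimators=100, step=1, cv=5, random_state=0):
    """Recursive feature elimination with cross-validation.

    Returns a boolean mask over the columns of X and the number of
    features kept. Feature ranking relies on the forest's
    feature_importances_ attribute.
    """
    selector = RFECV(
        estimator=RandomForestRegressor(n_estimators=n_estimators,
                                        random_state=random_state),
        step=step,            # features dropped per elimination round
        cv=cv,
        scoring="r2",
        min_features_to_select=1,
    )
    selector.fit(X, y)
    return selector.support_, int(selector.n_features_)
```

`RFECV` needs an estimator that exposes `coef_` or `feature_importances_`. In scikit-learn, `SVR` provides `coef_` only with a linear kernel, so an SVR-based selection of this kind would presumably use `SVR(kernel="linear")`.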

Machine Learning Model
Four commonly used machine learning approaches, Support Vector Regression (SVR), Random Forest Regression (RFR), Histogram-based Gradient Boosting Regression Tree (HGBR), and Partial Least-Squares Regression (PLSR), were selected as learners to evaluate the features. SVR is a supervised learning model derived from support vector machines (SVMs). The algorithm's goal is to place as many of the original points as possible inside a margin of width ε around the hyperplane and to reduce the error of the points outside it. Three parameters, kernel, gamma, and regularization parameter (C), were adjusted with different input features. RFR is a supervised ensemble learning algorithm that constructs several decision trees with training data. The outputs of these trees are integrated to calculate the final estimation. HGBR greatly accelerates gradient boosting by binning the continuous features into integer-valued groups instead of sorting continuous values. PLSR projects the input features and LAI to a new lower-dimensional space and applies a linear regressor to fit the data. In this study, all the regression models were implemented using Python and the scikit-learn library [54]. Because parameter tuning strongly affects model performance, the optimal values of the model parameters were obtained by grid search and manual adjustment. The names and explanations of the parameters are shown in Table 4.
Remote Sens. 2023, 15, 4108

Evaluation Metrics
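The accuracy of the LAI estimation was evaluated using the coefficient of determination (R²), root-mean-square error (RMSE), and the mean absolute error (MAE), defined as follows:

\[ R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - y_{i\_pre} \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2} \]

\[ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - y_{i\_pre} \right)^2} \]

\[ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - y_{i\_pre} \right| \]

where $y_i$ is the measured value, $y_{i\_pre}$ is the estimated value, $\bar{y}$ is the mean of the measured values, and $n$ is the sample number. A higher R² indicates higher model accuracy, whereas lower RMSE and MAE indicate higher accuracy. In our experiment, we applied a 5-fold cross-validation strategy repeated 5 times. The final evaluation metrics are the mean values of the 25 results.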
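The repeated cross-validation can be sketched with scikit-learn's `RepeatedKFold` and `cross_validate`. The random seed is an assumption of this example, and the function name `evaluate` is hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RepeatedKFold, cross_validate

def evaluate(model, X, y, seed=0):
    """Mean R2, RMSE, and MAE over 5-fold cross-validation repeated 5 times."""
    cv = RepeatedKFold(n_splits=5, n_repeats=5, random_state=seed)
    scores = cross_validate(
        model, X, y, cv=cv,
        scoring=("r2", "neg_root_mean_squared_error", "neg_mean_absolute_error"),
    )
    return {
        "R2": float(np.mean(scores["test_r2"])),
        # scikit-learn reports errors as negated scores, so flip the sign.
        "RMSE": float(-np.mean(scores["test_neg_root_mean_squared_error"])),
        "MAE": float(-np.mean(scores["test_neg_mean_absolute_error"])),
        "n_results": len(scores["test_r2"]),     # 5 folds x 5 repeats
    }
```

For example, `evaluate(RandomForestRegressor(random_state=0), X, y)` returns the three averaged metrics together with the number of individual fold results.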

Ground Data Statistics
The Colomba is an early maturing variety, harvested about 90 days after planting, whereas the Snowden matures late, approximately 110-120 days after planting. This discrepancy in maturation time is reflected in the distinct distributions of the LAI illustrated in Figure 2. The LAI of Snowden initially increased and then declined, while that of Colomba consistently displayed a downward trend. Additionally, Figure 2 provides time series images depicting the growth progression of both cultivars. In Figure 3, we also present the distribution of LAI of the two varieties under different N rates. For the Colomba, LAI decreased over time under all four fertilization conditions. However, the magnitude of LAI values was influenced by the rate of fertilizer application. Specifically, the LAI of the control group was lower than that of R1, which was lower than the LAI values of R2 and R3. This phenomenon was more prominent in Snowden, where both the control group and R1 exhibited a declining trend in LAI, whereas the LAI values of R2 and R3 demonstrated an increasing trend when higher amounts of fertigation were applied. The effect of fertigation on LAI is significant and leads to a dataset with large variations. Regression experiments based on this dataset can therefore demonstrate the generality of the model to some extent.

Comparison of VIs with Searched and Fixed Bands
Table 5 shows notable differences between the effectiveness of the searched bands and the fixed bands, especially in the red-edge and NIR range. In general, the correlation coefficients of the searched VIs increased greatly, with the maximum improvement reaching 0.211. The NDVI and EVI2, commonly used to characterize "greenness", demonstrated higher correlations than the others for both searched and fixed band combinations. After searching, the correlations of the two VIs composed of the red-edge band, CI_rededge and MSR_rededge, increased significantly from 0.565 to 0.736 and from 0.548 to 0.759, respectively. These improvements provide evidence of the effectiveness of the optimal (searched) VIs for estimating LAI. The searched bands differ from the fixed bands and generally have longer wavelengths. This indicates that vegetation with similar reflectance spectra can still differ in plant structure, biochemical composition, external environmental pressures, and diseases, all of which can affect spectral characteristics. As a result, relying solely on VI formulations based on fixed bands may not consistently yield optimal results. Grid searching, on the other hand, can help identify suitable combinations of bands to a certain extent.

Combination of VIs from Different Data Sources
The evaluation results of single sources are shown in Table 6. The highest R² for each data source is underlined, and the highest R² among all data sources is in bold. Despite employing different data acquisition mechanisms, RGB and LiDAR achieved comparable regression accuracy. The RGB image-derived traits have a slight advantage over those from LiDAR in estimating potato LAI. With its extensive spectral information, the HSI features exhibited significantly higher R² than the other two data sources. In general, RFR was the most accurate and stable model. However, as an ensemble of decision trees, it incurs higher time costs than the other methods. SVR excels in terms of runtime efficiency due to its lower training and prediction complexity, and its solid mathematical foundation and theoretical guarantees provide confidence in the model's performance. The HGBR model fell between RFR and SVR in terms of efficiency and accuracy, while PLSR was the fastest model.
Results of the four different combinations of the three data sources as predictors in estimating potato LAI are shown in Figure 4. The best performance (R² = 0.775) was given by the RFR model with all features from LiDAR, RGB, and HSI, followed by 0.768 achieved by the RFR model with features from LiDAR and HSI. Notably, the combined data sources outperformed the individual ones. For example, the R² of LiDAR + RGB was higher than that of LiDAR or RGB alone, and the R² of LiDAR + RGB + HSI was higher than that of any single component. The combinations of multiple data sources encompass diverse feature spaces, providing the regression model with richer input information. This, in turn, improves the performance of estimating LAI.
When examining the combinations with the lowest accuracy values (0.687, 0.734, 0.745, and 0.740), we notice that the last three combinations, which incorporate HSI features, exhibit similar and notably higher accuracy levels than the first combination. This observation highlights the significance of hyperspectral information in estimating LAI with different modeling approaches.
Among the four models, RFR emerged as the most suitable algorithm for estimating LAI with combined features, as evidenced by its superior performance across all combinations. Each combination outperformed its single components individually. Notably, RFR achieved the highest R² value of 0.775 when utilizing all three feature subsets. On the other hand, SVR, despite showing advantages in Table 6, did not perform well in this set of experiments. Its results exhibited an opposite trend compared to RFR, and in some combinations, the accuracy was even reduced. For instance, combinations such as LiDAR + HSI and LiDAR + RGB + HSI performed worse than using HSI alone. The results of HGBR were similar to those of SVR, with both strengths and weaknesses observed across different combinations. Finally, PLSR exhibited the lowest accuracy among the models, with a difference of approximately 0.2 to 0.3 compared to the other methods. In addition to accuracy, efficiency is an important metric to consider. During the grid search for the best parameters, each feature combination requires parameter tuning, leading to variations in evaluation times for each model. RFR had the greatest uncertainty in evaluation time, whereas PLSR and SVR were generally faster and more stable than RFR and HGBR.
From Table 6 and Figure 4, it can be observed that the R² of the combination involving all three data sources is slightly higher (by 0.009) than that of HSI alone. Despite doubling the amount of data and workload, the overall improvement is marginal, indicating that the complementary information provided by RGB and LiDAR to HSI is limited in this context.

Combination of VIs and Selected Statistical Features
The evaluation metrics of the original statistical features and the selected features are shown in Table 7. The highest R² values for the original or selected features are underlined, and the highest R² value in the table is in bold. RFR-based and SVR-based RFECV selected 44 and 40 features, respectively. These selected features and the original features were fed into the evaluation models to calculate the accuracy, and the selected ones achieved the same or even better performance. The highest R² reached 0.766. The large reduction in the number of features did not result in a significant decrease in accuracy, indicating that the insignificant features had been removed. Moreover, the refinement of features brought a significant reduction in evaluation time, which was reduced by 15.22%, 56.52%, 88.71%, and 30%, respectively. Both the statistical features and the combined VIs achieved reliable performance. When they were combined as input to the evaluation models, the best R² improved from 0.778 to 0.782, as shown in Figure 5. Given that the RFR model achieved the highest accuracy in the previous experiments, we generated the importance of each feature based on the RFR model, as shown in Figure 6. Again, HSI was the most important data source, and the six VIs derived from it occupied the top six positions. Consistent with the modeling results, the feature importance confirmed that the contributions of LiDAR and RGB were minimal. The importance of the features extracted from LiDAR varied considerably, with MaxPlantHeight in seventh position and the Height Percentile and Canopy Cover features among the ten least important. The differences in importance among the RGB features were not significant: the green channel was slightly higher than the red channel, and the red channel was slightly higher than the blue channel.
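A feature ranking of this kind can be obtained from a fitted random forest, as in the following minimal sketch. It uses the impurity-based `feature_importances_` attribute of scikit-learn's `RandomForestRegressor`; the text does not say which importance measure was used, so this choice, like the function name `rank_features`, is an assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def rank_features(X, y, names, n_estimators=200, random_state=0):
    """Return (name, importance) pairs sorted from most to least important."""
    model = RandomForestRegressor(n_estimators=n_estimators,
                                  random_state=random_state).fit(X, y)
    importances = model.feature_importances_      # impurity-based, sums to 1
    order = np.argsort(importances)[::-1]
    return [(names[i], float(importances[i])) for i in order]
```

Impurity-based importances can be diluted across strongly correlated features, which may matter here because several of the HSI-derived VIs share bands.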

Discussion
Many widely used vegetation indices, such as the VIs in Table 3, were originally defined on broad wavelength ranges (such as the green, red, and NIR bands). As hyperspectral data become more prevalent, selecting the appropriate narrow bands for calculating these indices becomes critical. Researchers have addressed this challenge by searching for the optimal bands of specific vegetation indices, such as an optimal band combination for simple indices [41,56] and an optimal bandwidth [57]. In this study, we performed a grid search over band combinations for several complex VIs and selected the combination most relevant to LAI as the optimal bands. The optimal bands we found for potatoes are distinct from the commonly used fixed bands and even differ from the optimal bands identified in other studies. This highlights the importance of conducting a dedicated band selection process for each research target when using hyperspectral data.
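To make the band search concrete, the following is a minimal sketch of an exhaustive two-band search that keeps the pair whose index correlates most strongly with LAI. It is not the implementation used in this study: the generic normalized-difference form, the function names, and the synthetic reflectance data are all assumptions made for illustration, and the complex VIs searched in the study would need their own index formulas.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant index carries no information
    return sxy / math.sqrt(sxx * syy)


def normalized_difference(r1, r2):
    """Generic two-band normalized-difference index (assumed index form)."""
    denom = r1 + r2
    return (r1 - r2) / denom if denom else 0.0


def search_band_pair(spectra, lai):
    """Test every unordered band pair and return (i, j, r) for the pair
    whose index has the largest absolute correlation with LAI.

    spectra: list of per-plot reflectance lists (one value per band)
    lai:     list of measured LAI values, one per plot
    """
    n_bands = len(spectra[0])
    best = (None, None, 0.0)
    # Swapping the two bands only flips the sign of the index, so
    # unordered pairs are sufficient when ranking by |r|.
    for i, j in itertools.combinations(range(n_bands), 2):
        index = [normalized_difference(s[i], s[j]) for s in spectra]
        r = pearson(index, lai)
        if abs(r) > abs(best[2]):
            best = (i, j, r)
    return best


# Synthetic example (hypothetical data): LAI is driven by bands 4 and 1.
rng = random.Random(0)
spectra = [[rng.uniform(0.05, 0.6) for _ in range(6)] for _ in range(30)]
lai = [3.0 + 4.0 * normalized_difference(s[4], s[1]) + rng.gauss(0, 0.05)
       for s in spectra]
i, j, r = search_band_pair(spectra, lai)
print(f"best band pair: ({i}, {j}), r = {r:.3f}")
```

The cost grows quadratically with the number of bands for a two-band index, so a search over real hyperspectral cubes would normally be vectorized, for example with NumPy.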
For feature combination, we conducted ablation experiments that tested single-source features, pairwise combinations, and all features together. HSI performed better than RGB, which in turn performed better than LiDAR. Our results suggest that the richer the spectral information in the data, the higher the prediction accuracy of LAI. Similar findings are reported in [58–60], which show that models derived from multispectral or hyperspectral imagery outperform LiDAR-derived models. In all four combination experiments, the accuracy of the combined features was higher than the accuracy of their components. This result indicates that multiple data sources can complement each other and improve LAI prediction performance. However, the improvement in accuracy was not substantial, which aligns with the findings in [59–61].
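The structure of such an ablation experiment can be sketched as follows. This is an illustrative sketch only: a leave-one-out k-nearest-neighbour regressor stands in for the SVR, RFR, HGBR, and PLSR models used in the study, and the function names, feature blocks, and synthetic data are assumptions rather than the study's actual pipeline.

```python
import itertools
import math
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination R^2."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def knn_loo_r2(X, y, k=3):
    """Leave-one-out R^2 of a k-NN regressor (stand-in evaluation model)."""
    preds = []
    for i, xi in enumerate(X):
        # Distances from the held-out plot to every other plot.
        others = [(math.dist(xi, xj), y[j]) for j, xj in enumerate(X) if j != i]
        others.sort(key=lambda t: t[0])
        preds.append(sum(v for _, v in others[:k]) / k)
    return r2_score(y, preds)


def ablation(sources, y, evaluate=knn_loo_r2):
    """Score every non-empty combination of named feature blocks.

    sources: dict mapping a source name to a list of per-plot feature lists
    Returns a dict mapping each tuple of source names to its score.
    """
    results = {}
    names = sorted(sources)
    for size in range(1, len(names) + 1):
        for combo in itertools.combinations(names, size):
            # Concatenate the feature vectors of the chosen sources per plot.
            X = [sum((sources[n][p] for n in combo), []) for p in range(len(y))]
            results[combo] = evaluate(X, y)
    return results


# Synthetic example (hypothetical data, features already on a common scale).
rng = random.Random(1)
n_plots = 40
sources = {
    "HSI": [[rng.random(), rng.random()] for _ in range(n_plots)],
    "RGB": [[rng.random()] for _ in range(n_plots)],
    "LiDAR": [[rng.random()] for _ in range(n_plots)],
}
lai = [2.0 * h[0] + h[1] + 0.5 * c[0] + 0.2 * l[0]
       for h, c, l in zip(sources["HSI"], sources["RGB"], sources["LiDAR"])]
for combo, score in ablation(sources, lai).items():
    print("+".join(combo), round(score, 3))
```

Passing a different `evaluate` callable, such as a cross-validated random forest, would reuse the same enumeration with a model closer to those in the study.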
VIs are only a subset of the features that can be extracted from HSI, and a large amount of spectral information remains unused. Therefore, we extracted another type of feature, statistical features, to analyze the importance of each band. The results indicate that, compared with using all of the information, the important bands selected by RFECV are more advantageous for estimating the target parameter. This is consistent with the results of other studies [62,63] that also employ RFECV. It indicates that hyperspectral information contains redundant data when used for parameter estimation, and that removing unnecessary data can lead to better parameter estimation.
In precision agriculture, the accuracy of LAI estimation should be considered along with the cost of UAV sensors, image acquisition efficiency, data processing complexity, and other factors. Currently, RGB cameras have the lowest cost (~USD 3000) and the simplest data processing workflow. The acceptable accuracy in our experiments (R² = 0.726 in Table 6) indicates that RGB cameras are a good choice for LAI estimation when cost and technical requirements must be kept low. Compared to RGB cameras, LiDAR devices cost slightly more (~USD 4000), and their powerful penetration capability may not be effectively utilized in agricultural fields, resulting in lower accuracy (R² = 0.666 in Table 6). Therefore, we do not recommend using LiDAR as the sole sensor for LAI estimation. Hyperspectral sensors perform best in terms of accuracy (R² = 0.766 in Table 6). However, their prohibitive cost (~USD 50,000) and the requirement for specialized knowledge are two main disadvantages. In cases where high accuracy is demanded and the participants are knowledgeable in data processing, hyperspectral sensors are a very suitable choice. If sensors that support custom spectral channel settings become widely used, users can acquire only the bands sensitive to the research parameters, such as the optimal bands obtained in Table 5. Consequently, the overall cost of using multispectral or hyperspectral sensors will be further reduced. In addition, our method can also be applied to other green vegetation, but the corresponding optimal bands may vary, necessitating a re-evaluation of band selection. The applicability of these methods to non-green vegetation, however, requires further investigation.
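The cost and accuracy trade-off described above can be expressed as a simple selection rule. The sketch below uses only the approximate prices and the R² values quoted in this section. The selection rule itself, the names `Sensor` and `pick_sensor`, and the omission of processing complexity and operator expertise are simplifying assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sensor:
    name: str
    approx_cost_usd: int  # approximate price quoted in the text
    best_r2: float        # R^2 quoted from Table 6 in the text


SENSORS = [
    Sensor("RGB", 3000, 0.726),
    Sensor("LiDAR", 4000, 0.666),
    Sensor("HSI", 50000, 0.766),
]


def pick_sensor(budget_usd, min_r2=0.0, sensors=SENSORS) -> Optional[Sensor]:
    """Return the most accurate sensor within budget that meets min_r2,
    or None if no sensor satisfies both constraints."""
    candidates = [
        s for s in sensors
        if s.approx_cost_usd <= budget_usd and s.best_r2 >= min_r2
    ]
    return max(candidates, key=lambda s: s.best_r2, default=None)


for budget in (1000, 5000, 60000):
    choice = pick_sensor(budget)
    print(budget, choice.name if choice else "no sensor within budget")
```

Under this rule, LiDAR is never selected on its own, because the RGB camera is both cheaper and more accurate in these experiments, which matches the recommendation above.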
Although current research emphasizes deep learning, conventional machine learning retains its advantages in this application. Deep learning, or a deep neural network, is a data-driven approach that requires a substantial amount of training data to adequately estimate parameters that may number in the millions. It is used extensively in tasks such as leaf classification [64,65] and crop classification [66,67], but its application to regression tasks such as LAI estimation is limited. This is because ground-truth LAI values are significantly harder to acquire than classification labels, and large, publicly available LAI datasets for thoroughly training deep networks are lacking. Hence, for small-scale (field) crop parameter estimation, mainstream machine learning methods and shallow neural networks offer performance that meets requirements. Moreover, machine learning methods have lower hardware demands, making them better suited for practical applications.

Conclusions
This study validates the necessity of band selection for potato LAI estimation by comparing experiments involving searched and fixed bands. Compared with Vegetation Indices (VIs) calculated from fixed bands based on empirical knowledge, the correlation between LAI and the optimized narrow-band VIs increased by 0.01 to 0.211. This provides evidence that optimal band selection is necessary for potatoes.
Additionally, a series of experiments using single and combined data sources was conducted to comprehensively analyze the accuracy, significance, and cost involved. The three data sources provide different types of features: structural features, general spectral features, and detailed spectral features. Among these features, the VIs calculated from HSI outperformed the others, with the highest R² reaching 0.766 (with the RFR), and RGB features achieved better accuracy than LiDAR features. The amount of spectral information in the data correlates with the accuracy of LAI estimation. In the ablation experiments, the VIs of HSI dominated the feature space, while adding features from RGB and LiDAR barely improved the model accuracy. The experiments with statistical features demonstrated that hyperspectral data contain redundancy when used for LAI estimation. Removing this redundant information not only reduces computational complexity but also improves estimation accuracy to some extent.
HSI achieved the highest accuracy, but it also has the highest acquisition and processing cost. In practical applications, it is essential to consider multiple factors, such as budget constraints and accuracy requirements, when choosing a sensor. Different situations and projects may call for different data sources, and it is crucial to strike a balance between the available resources and the desired level of information.