Article

Banana Fusarium Wilt Recognition Based on UAV Multi-Spectral Imagery and Automatically Constructed Enhanced Features

1 Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
2 University of Chinese Academy of Sciences, Beijing 100049, China
3 Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
4 Key Laboratory of Earth Observation of Hainan Province, Hainan Aerospace Information Research Institute, Sanya 572029, China
* Author to whom correspondence should be addressed.
Agronomy 2025, 15(8), 1837; https://doi.org/10.3390/agronomy15081837
Submission received: 18 April 2025 / Revised: 26 June 2025 / Accepted: 25 July 2025 / Published: 29 July 2025
(This article belongs to the Section Pest and Disease Management)

Abstract

Banana Fusarium wilt (BFW, also known as Panama disease) is a highly infectious and destructive disease that threatens global banana production, requiring early recognition for timely prevention and control. Current monitoring methods primarily rely on continuous variable features—such as band reflectances (BRs) and vegetation indices (VIs)—collectively referred to as basic features (BFs)—which are prone to noise during the early stages of infection and struggle to capture subtle spectral variations, thus limiting the recognition accuracy. To address this limitation, this study proposes a discretized enhanced feature (EF) construction method, the automated kernel density segmentation-based feature construction algorithm (AutoKDFC). By analyzing the differences in the kernel density distributions between healthy and diseased samples, the AutoKDFC automatically determines the optimal segmentation threshold, converting continuous BFs into binary features with higher discriminative power for early-stage recognition. Using UAV-based multi-spectral imagery, BFW recognition models are developed and tested with the random forest (RF), support vector machine (SVM), and Gaussian naïve Bayes (GNB) algorithms. The results show that EFs exhibit significantly stronger correlations with BFW’s presence than original BFs. Feature importance analysis via RF further confirms that EFs contribute more to the model performance, with VI-derived features outperforming BR-based ones. The integration of EFs results in average performance gains of 0.88%, 2.61%, and 3.07% for RF, SVM, and GNB, respectively, with SVM achieving the best performance, averaging over 90%. Additionally, the generated BFW distribution map closely aligns with ground observations and captures spectral changes linked to disease progression, validating the method’s practical utility. Overall, the proposed AutoKDFC method demonstrates high effectiveness and generalizability for BFW recognition. Its core concept of “automatic feature enhancement” has strong potential for broader applications in crop disease monitoring and supports the development of intelligent early warning systems in plant health management.

1. Introduction

Banana (Musa spp.) is a vital economic crop in tropical and subtropical regions. However, its production is severely threatened by Fusarium wilt (also known as Panama disease, BFW), a soilborne fungal disease caused by Fusarium oxysporum f. sp. cubense. The pathogen typically infects the plant from the roots upward, impeding the transport of water and nutrients, which results in leaf yellowing, wilting, and potentially plant death [1,2], posing a significant threat to the banana yield and quality [3,4]. BFW spreads in various ways, such as contaminated water, agricultural tools, infected seedlings, and animal activity, exhibiting high transmissibility and destructiveness. It is estimated that BFW has affected hundreds of thousands of hectares of plantations across multiple countries in Asia, Africa, and the Americas [5,6], causing substantial economic losses to the banana industry. Timely and accurate recognition of infected plants is therefore critical for effective disease control, optimized crop management, and protection of local economic benefits.
BFW progresses upward from the roots, and during its latent phase—prior to the appearance of visible symptoms such as leaf yellowing and wilting—external signs on the plant are typically absent [7,8]. At this stage, BFW can be recognized by longitudinally slicing the pseudo stem to observe the vascular discoloration or through molecular testing using portable devices [9,10]. However, these methods are labor-intensive, time-consuming, and unsuitable for large-scale rapid recognition. As the disease progresses into the early symptomatic stage, visible signs begin to appear on the leaves. Yet ground-based manual surveys are limited in terms of the spatial and temporal coverage, leading to monitoring blind spots. In contrast, remote sensing technologies offer large-scale and efficient monitoring capabilities. Unmanned aerial vehicle (UAV) remote sensing, with its high spatial resolution and strong mobility, has emerged as a key tool for pest and disease recognition. Equipped with multi-spectral or hyperspectral sensors, UAVs can rapidly capture spectral information across large scales, making them increasingly important for disease surveillance [11,12,13,14,15]. Given the long latency and rapid spread of BFW, the early symptomatic stage—when spectral changes become detectable but before widespread transmission—presents a critical time window for UAV-based monitoring to enable timely intervention.
Although several studies have demonstrated the feasibility of recognizing BFW using UAV-based multi-spectral imagery, this area remains underexplored. For instance, ref. [16] showed that multi-spectral UAV imagery can effectively identify BFW symptoms, while [17] combined four supervised and two unsupervised algorithms for BFW recognition. Their results suggested that random forest is more effective in early-stage detection, whereas hotspot analysis performs better in later stages. Ref. [18] further proposed a deep learning method for BFW detection based on the YOLOv8n network, aiming to improve both detection accuracy and speed. Despite these efforts, the current body of work is limited in terms of the early-stage spectral feature enhancement and robust feature construction.
Under disease stress, plants undergo physiological and structural changes—such as alterations in pigment composition, nutrient levels, and morphology—which in turn lead to changes in their spectral response [13,19,20]. Band reflectances (BRs) and vegetation indices (VIs) are widely used spectral features in remote sensing-based disease detection. Utilizing these features, previous studies have successfully monitored various crop and forest diseases, including wheat stripe rust, rice blast, pine wilt disease, citrus Huanglongbing, and so on [21,22,23,24]. While these continuous features generally exhibit strong discriminative capacity, they are susceptible to noise and limited in capturing subtle spectral differences, particularly in early-stage or mild infections. In such scenarios, the overlapping feature distributions between healthy and infected plants reduce the classification performance. Thus, enhancing the sensitivity to early-stage spectral variations while ensuring robustness against noise has become a key challenge in improving disease detection accuracy, yet it remains insufficiently addressed.
A promising approach to address this challenge involves constructing discretized enhanced features (EFs) through threshold segmentation. By exploring the relationship between continuous spectral features (e.g., BRs and VIs) and plant health status, these features can be converted into binary representations that are more sensitive to early disease signals. Such binary feature construction strategies have been successfully applied in various domains. For example, ref. [25] improved bike-sharing demand prediction by creating binary cross-features from meteorological and temporal data, while [26] enhanced PM2.5 forecasting using low-peak binary features derived from pollutant correlations and seasonal patterns. Similarly, ref. [27] developed binary vegetation–meteorological indicators to improve fire risk prediction. However, most existing studies rely on empirically set thresholds, which are often subjective and lack generalizability—particularly in high-dimensional feature spaces. Therefore, developing an automated feature construction method based on threshold segmentation holds great promise for improving early-stage disease detection sensitivity and model robustness.
In this study, we propose a novel feature enhancement method based on threshold segmentation using UAV multi-spectral imagery to improve the sensitivity of early-stage BFW recognition. A series of BFW recognition models are built using multiple machine learning (ML) algorithms. The main objectives include the following: (1) construct basic features (BFs)—such as band reflectances and vegetation indices—for BFW recognition, (2) develop an automated enhanced feature construction method based on kernel density segmentation of the BF distributions in healthy and diseased samples and then generate EFs, (3) analyze the correlation and contribution of BFs and EFs to BFW recognition, and establish recognition models using three ML algorithms to validate the effectiveness of EFs in improving the model accuracy, and (4) generate a spatial distribution map of BFW for the study area using the optimal model.

2. Materials and Methods

2.1. Study Area and Data

2.1.1. Study Area

The study area is in Long’an County, Nanning City, Guangxi Zhuang Autonomous Region, China (23°7′53.2″–23°8′4.0″ N, 107°43′44.9″–107°44′7.2″ E) (Figure 1a). Characterized by a subtropical monsoon climate, the region enjoys abundant sunlight and rainfall, with an average annual temperature ranging from 20.8 °C to 22.4 °C and an average annual precipitation of approximately 1200 mm—conditions highly favorable for banana cultivation. Bananas from Long’an County are known for their high quality and have gained considerable recognition in the national market. As of 2022, the total banana cultivation area in the county reached approximately 7840 hectares, with an annual output of 300,000 tons.
The specific study site is a banana plantation located about 5 km southeast of Long’an County’s urban center, covering an area of approximately 21 hectares (Figure 1b). The plantation grows the “Williams B6” variety, with the plants reaching a height of approximately 2.4 to 3 m and bearing 34 to 36 leaves. The planting density is about 1950 plants per hectare, with a spacing of 2.0 m × 2.6 m, and the growth period spans 10 to 12 months. A field survey conducted in August 2018 revealed severe BFW occurrence in this plantation, with over 40% of the plants exhibiting symptoms of varying severity (Figure 1c).

2.1.2. Data and Preprocessing

Field Survey of Banana Fusarium Wilt
On 7 August 2018, the research team conducted a field survey in Long’an County, Guangxi. A total of 120 samples were evenly collected across the study area to assess the occurrence of BFW, with each sample corresponding to a single banana plant (Figure 1b). BFW recognition was based on the proportion of yellowing leaf area relative to the total leaf area of the plant. Plants with less than 1% yellowing were classified as healthy, while those exceeding this threshold were considered diseased. During the field survey, we obtained the latitude and longitude of each sampled plant using a high-precision handheld GPS device and recorded its infection status (healthy or diseased). Ultimately, 57 healthy samples and 63 diseased samples were obtained and used for model training and validation.
UAV Multi-Spectral Data Acquisition and Preprocessing
Multi-spectral data of the banana plantation were collected using a DJI Phantom 4 UAV (Shenzhen Dajiang Innovation Technology Co., Ltd., Shenzhen, China) equipped with a MicaSense RedEdge-M™ multi-spectral imaging system (Figure 2), manufactured by MicaSense, Inc., Seattle, WA, USA. The imaging system, which was mounted on the drone by a professional UAV developer, consists of a MicaSense RedEdge-M™ multi-spectral camera with a downlink light sensor (DLS) module and a GPS module, acquiring data across five spectral bands: blue, green, red, red edge, and near-infrared. The camera uses an external power supply with a voltage range of 4.2 V DC to 15.6 V DC, power consumption of 4 W, and peak power of 8 W. With these loads, the UAV can fly for at least 20 min under ideal conditions. All the spectral data were recorded as geotagged TIFF files and stored onboard the camera’s memory card.
Aerial imaging was conducted between 12:30 and 13:30 on 7 August 2018, under a clear sky and low wind conditions. When collecting data, the UAV flew at an altitude of 120 m above ground level, and the flight plan ensured cross-track and along-track overlap of 80%. The multi-spectral imagery acquired by the UAV covered an area of 21 hectares. The ground sampling resolution was 8 cm/pixel. Prior to and following the flight, images of the MicaSense calibrated reflectance panel were captured using the RedEdge sensor. The Pix4D mapper V4.5.6 software was used to process the images, and the calibration images were used to perform radiometric correction of the UAV imagery, converting raw digital number values into reflectance data.
To integrate the ground-surveyed samples and UAV imagery, the sample data layer was overlaid with the UAV imagery layer in ArcGIS 10.6. The sample locations were manually adjusted to align with the centers of the corresponding banana plants in the UAV imagery, ensuring a precise match between the ground survey points and the plant centers in the UAV imagery. The processed data were stored in shapefile format. Subsequently, vector polygons were digitized for 120 banana plants, and the mean UAV-derived spectral reflectance and vegetation index values within each polygon were extracted for analysis.

2.2. Methods

Based on the UAV multi-spectral imagery and ground survey samples of BFW, this study first constructed basic features (BFs), including band reflectances (BRs) and vegetation indices (VIs). Subsequently, enhanced features (EFs) were generated using an automated kernel density segmentation method, which leverages differences in the kernel density distributions of healthy and diseased samples. These EFs were designed to amplify subtle spectral variations indicative of early-stage infection. Three ML algorithms were then employed to build recognition models of BFW, and the contribution and effectiveness of the EFs in improving the model performance were evaluated. Finally, the optimal model was used to generate a spatial distribution map of BFW in the study area. The methodological framework is illustrated in Figure 3.

2.2.1. Construction of Basic Features

Considering the remote sensing data source and this study’s objectives, BRs and VIs were selected as BFs. The band reflectance provides information on the canopy’s physiological condition; thus, the reflectance of five spectral bands—blue, green, red, red edge, and near-infrared—was included. Considering the physiological mechanisms of BFW—such as reductions in chlorophyll, carotenoids, and anthocyanins, leading to leaf yellowing, and impaired water transport, causing wilting and a decline in the leaf area index—eight VIs were also derived from these bands. These indices are sensitive to changes in plant pigments and structure caused by disease and include the following: Normalized Difference Vegetation Index (NDVI), Red Edge Normalized Difference Index (NDRE), Chlorophyll Index Green (CIgreen), Chlorophyll Index Red Edge (CIRE), Structure-Insensitive Pigment Index (SIPI), Red Edge Structure-Insensitive Pigment Index (SIPIRE), Carotenoid Reflectance Index (CARI), and Anthocyanin Reflectance Index (ARI). The formulas and corresponding physiological targets of these indices are presented in Table 1. In total, 13 BFs were constructed, comprising five BR-based BFs and eight VI-based BFs.
The specific feature extraction process was as follows. Based on the coordinates of the ground-surveyed banana Fusarium wilt samples and UAV multi-spectral imagery, the corresponding banana plants were located within the images. Using ArcGIS 10.6, vector outlines were drawn for 120 individual banana plants. Subsequently, eight vegetation indices were calculated from the UAV-acquired multi-spectral images. For each banana plant, the mean values of the reflectance across five spectral bands and the eight vegetation indices within its corresponding contour area were extracted as the sample’s feature values. In total, 13 basic features were obtained for each of the 120 samples, and the data were compiled into a CSV file.
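For illustration, a minimal Python sketch of this VI computation step is given below, assuming the per-plant mean band reflectances have already been exported to a CSV file; the file and column names are hypothetical, only the indices with standard, widely used formulas are shown, and the remaining indices would follow the definitions in Table 1.

```python
import pandas as pd

# Hedged sketch of the VI computation step: each row of the (hypothetical)
# CSV holds the per-plant mean reflectance of the five bands extracted from
# the digitized plant polygons. Only indices with standard, widely used
# formulas are shown; the remaining indices follow Table 1.
df = pd.read_csv("banana_plant_band_means.csv")  # columns: blue, green, red, rededge, nir

df["NDVI"] = (df["nir"] - df["red"]) / (df["nir"] + df["red"])
df["NDRE"] = (df["nir"] - df["rededge"]) / (df["nir"] + df["rededge"])
df["CIgreen"] = df["nir"] / df["green"] - 1.0
df["CIRE"] = df["nir"] / df["rededge"] - 1.0
# SIPI, SIPIRE, CARI, and ARI would be added analogously from Table 1.

df.to_csv("banana_basic_features.csv", index=False)  # 13 BFs per sample
```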

2.2.2. Automated Construction of Enhanced Features

The EFs developed in this study aim to capture subtle spectral variations associated with early disease onset, thereby improving the predictive performance of BFW recognition models. By analyzing the distributional differences of the BR-based BFs and VI-based BFs between healthy and diseased samples, optimal segmentation thresholds were identified, and the BFs were discretized into binary values. This transformation reduces the noise sensitivity and enhances the detectability of weak spectral signals. Kernel density estimation (KDE), a non-parametric method for estimating probability density functions, was employed to reveal the structure of the feature distributions [36]. Accordingly, this study proposed an automated kernel density segmentation-based feature construction algorithm (AutoKDFC), which includes the following steps: (1) define a discriminant metric to assess the separability of each BF; (2) determine the optimal threshold for feature segmentation; and (3) generate binary EFs based on the optimal threshold. The algorithm workflow is shown in Figure 4.
Evaluation of Basic Feature Separability Using Kernel Density
Figure 5 and Figure 6 illustrate the kernel density distributions of the BRs and VIs for healthy and diseased plants. Except for SIPIRE and SIPI, most features showed notable differences in the kernel density distributions between the two classes, indicating their potential for disease discrimination. However, there were considerable differences in the sensitivity of these features to BFW, as observed in the kernel density magnitudes, distribution shapes, and isodensity contour characteristics. These variations impact the selection of optimal thresholds and necessitate a quantitative evaluation of feature separability. To this end, this study developed a separability metric—the Kernel Density Discriminant Score (KDDS)—to quantify the separability of each basic feature. The features were then categorized into three levels: non-separable, considered separable, and highly separable, forming the basis for the subsequent threshold determination and EF construction.
We propose that the greater the vertical distance between the kernel density distributions of healthy and diseased samples, the lower the overall data dispersion and the lesser the overlap of isodensity contours, thereby indicating stronger feature separability. Accordingly, the Kernel Density Discriminant Score (KDDS) was defined as follows:
$$\mathrm{KDDS}=\frac{dist}{std_{total}}+ratio=\frac{\left|\arg\max_{x} f_h(x)-\arg\max_{x} f_d(x)\right|}{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i-\mu_h\right)^{2}}+\sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(x_i-\mu_d\right)^{2}}}+\frac{n_{max}}{n_{total}} \tag{1}$$

where dist represents the distance, in the vertical direction, between the "kernels" (the contour lines corresponding to the kernel density peaks) of the healthy and diseased samples; the larger the value, the stronger the separability of the feature. The terms arg max_x f_h(x) and arg max_x f_d(x) represent the feature values corresponding to the kernel density peaks of the healthy and diseased samples, respectively. std_total is the sum of the standard deviations of the healthy and diseased samples, representing the overall dispersion (i.e., distribution width) of the two classes of sample data; the smaller the value, the stronger the separability of the feature. n and m represent the numbers of healthy and diseased samples, respectively, while μ_h and μ_d are their means. ratio represents the non-overlap rate of the contour lines, with a higher value indicating stronger separability of the feature. n_max is the larger number of non-overlapping contour lines between the healthy and diseased samples, while n_total is the total number of contour lines. In this study, the number of contour lines for healthy and diseased samples is set to be the same.
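A minimal Python sketch of the KDDS computation for a single basic feature is given below. The dist and std_total terms follow Equation (1) directly, while the non-overlap ratio of the isodensity contours is approximated here from the one-dimensional kernel densities; the function name and grid size are illustrative.

```python
import numpy as np
from scipy.stats import gaussian_kde

def kdds(healthy, diseased, grid_size=512):
    """Sketch of the KDDS metric (Equation (1)) for one basic feature.

    `healthy` and `diseased` are 1-D arrays of feature values. The dist and
    std_total terms follow the equation; the contour non-overlap ratio is
    approximated here from the 1-D densities, whereas the paper derives it
    from the isodensity contours of the 2-D kernel density plots.
    """
    healthy = np.asarray(healthy, dtype=float)
    diseased = np.asarray(diseased, dtype=float)
    grid = np.linspace(min(healthy.min(), diseased.min()),
                       max(healthy.max(), diseased.max()), grid_size)
    f_h, f_d = gaussian_kde(healthy)(grid), gaussian_kde(diseased)(grid)

    # dist: distance between the feature values at the two density peaks
    dist = abs(grid[np.argmax(f_h)] - grid[np.argmax(f_d)])
    # std_total: summed dispersion of the two classes
    std_total = healthy.std() + diseased.std()
    # ratio: crude non-overlap proxy (1 minus the overlap of the densities)
    ratio = 1.0 - np.trapz(np.minimum(f_h, f_d), grid)

    return dist / std_total + ratio
```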
Once the KDDS values for all BFs were computed, they were categorized into three levels: non-separable, considered separable, and highly separable. Given that the KDDS values range over [0, +∞), a stepwise median method was employed to define the separability thresholds (Figure 7). Specifically, the KDDS values were sorted in ascending order. Features with KDDS values below the second median (Median 2) were labeled as non-separable; those between Median 2 and Median 3 were considered separable; and values above Median 3 were deemed highly separable. The first median (Median 1) was used solely to assist in computing Medians 2 and 3.
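The following sketch illustrates one possible reading of this stepwise median rule, under the assumption that Median 1 is the median of all KDDS values and that Medians 2 and 3 are the medians of the lower and upper halves, respectively; the exact procedure shown in Figure 7 may differ in detail.

```python
import numpy as np

def separability_levels(kdds_values):
    """Hedged sketch of the stepwise median categorization (Figure 7),
    assuming Median 1 is the median of all KDDS values and Medians 2 and 3
    are the medians of the values below and above Median 1, respectively."""
    v = np.sort(np.asarray(kdds_values, dtype=float))
    median1 = np.median(v)
    median2 = np.median(v[v <= median1])  # lower-half median
    median3 = np.median(v[v >= median1])  # upper-half median

    levels = []
    for x in kdds_values:
        if x < median2:
            levels.append("non-separable")
        elif x <= median3:
            levels.append("considered separable")
        else:
            levels.append("highly separable")
    return levels
```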
Using this approach, the separability levels were determined for all 13 BFs (Figure 8). The features SIPIRE, SIPI, and CARI were deemed non-separable and therefore unsuitable for EF construction. In contrast, RNIR, Rblue, Rgreen, Rred, and RRE were classified as considered separable, and ARI, NDVI, NDRE, CIRE, and CIgreen as highly separable, indicating that 10 EFs could be constructed.
Automatic Determination of Optimal Segmentation Thresholds
The AutoKDFC employs two distinct strategies to determine the optimal segmentation threshold (OST) for the considered separable and highly separable features.
(1)
Highly Separable Features—Stratified K-Fold Cross-Validation
In cases where a feature is highly separable, the distributions of healthy and diseased samples differ significantly, and the model performance is relatively insensitive to the specific threshold. However, relying on a single dataset can introduce randomness and reduce robustness. To address this, we apply stratified K-fold cross-validation (K = 5) to evaluate the F1 score across multiple train–validation splits and select the threshold with the best generalization performance. The procedure is as follows (a code sketch is given after the list):
  • Uniformly sample candidate thresholds within the range from the minimum to maximum values of the basic feature (e.g., NDVI) in the training set.
  • For each threshold, label samples based on whether the NDVI value is below or above the threshold (label 0 or 1, respectively).
  • Compute the F1 score on the validation set for each threshold.
  • Identify the threshold with the highest F1 score and record both the threshold and corresponding F1 score.
  • Compute the mean F1 score and the mean optimal threshold across all folds.
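A minimal sketch of this threshold search is given below, using scikit-learn's StratifiedKFold and f1_score; the number of candidate thresholds and the labeling direction (1 for values at or above the threshold) are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import f1_score

def kfold_threshold(feature, labels, n_thresholds=200, k=5, seed=42):
    """Sketch of the stratified K-fold threshold search for a highly
    separable feature (e.g., NDVI). `labels` are 0/1 health status; the
    number of candidate thresholds and the labeling direction (1 at or
    above the threshold) are illustrative assumptions."""
    feature = np.asarray(feature, dtype=float)
    labels = np.asarray(labels, dtype=int)
    skf = StratifiedKFold(n_splits=k, shuffle=True, random_state=seed)
    best_thresholds, best_scores = [], []

    for train_idx, val_idx in skf.split(feature.reshape(-1, 1), labels):
        # Candidate thresholds sampled uniformly over the training range
        candidates = np.linspace(feature[train_idx].min(),
                                 feature[train_idx].max(), n_thresholds)
        fold_t, fold_f1 = None, -1.0
        for t in candidates:
            pred = (feature[val_idx] >= t).astype(int)
            f1 = f1_score(labels[val_idx], pred)
            if f1 > fold_f1:
                fold_t, fold_f1 = t, f1
        best_thresholds.append(fold_t)
        best_scores.append(fold_f1)

    # Mean optimal threshold and mean F1 score across the folds
    return float(np.mean(best_thresholds)), float(np.mean(best_scores))
```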
(2)
Considered Separable Features—Binary Search
For the considered separable features with overlapping distributions across healthy and diseased samples, precisely locating the threshold is critical. As the optimal thresholds for these features tend to lie within a narrow interval, binary search provides an efficient way to narrow the search space and approach the optimal threshold. This study employs binary search based on the F1 score as the evaluation metric. The procedure is as follows (a code sketch is given after the list):
  • Set the minimum and maximum values of the feature (e.g., Rblue) as the initial search interval.
  • In each iteration, compute the midpoint as a candidate threshold and evaluate its F1 score.
  • If the current F1 score exceeds the previous best, update the optimal threshold.
  • Adjust the search interval based on the F1 score: If the left interval yields a better F1 score, narrow the search interval to the left (update Rblue_max); otherwise, narrow the search interval to the right (update Rblue_min).
  • Terminate the search when the range is less than 1 × 10⁻⁵ or the maximum number of iterations (max_iter) is reached, and return the optimal threshold and its F1 score.
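A minimal sketch of this binary search is shown below; probing the midpoints of the two half-intervals to decide the narrowing direction is one possible interpretation of the rule described above, and the tolerance and iteration limit follow the stated stopping criteria.

```python
import numpy as np
from sklearn.metrics import f1_score

def binary_search_threshold(feature, labels, tol=1e-5, max_iter=100):
    """Sketch of the binary-search threshold refinement for a considered
    separable feature (e.g., Rblue). Probing the two half-intervals to pick
    the narrowing direction is one interpretation of the rule above."""
    feature = np.asarray(feature, dtype=float)
    labels = np.asarray(labels, dtype=int)
    lo, hi = feature.min(), feature.max()
    best_t, best_f1 = None, -1.0

    def score(t):
        # Labeling direction (1 at or above the threshold) is illustrative.
        return f1_score(labels, (feature >= t).astype(int))

    for _ in range(max_iter):
        if hi - lo < tol:          # stop once the interval is narrow enough
            break
        mid = (lo + hi) / 2.0
        f1_mid = score(mid)
        if f1_mid > best_f1:
            best_t, best_f1 = mid, f1_mid
        # Keep the half-interval whose midpoint scores better
        if score((lo + mid) / 2.0) >= score((mid + hi) / 2.0):
            hi = mid               # left half is better: lower the maximum
        else:
            lo = mid               # right half is better: raise the minimum

    return best_t, best_f1
```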
Construction of Enhanced Features Based on OST
After determining the OST for each BF, EFs are constructed by comparing the feature values corresponding to the kernel density peaks of healthy and diseased samples (denoted as Hvalue and Dvalue, respectively). The strategy is as follows. If the Hvalue of a BF > Dvalue, construct the EF using Equation (2); if the Hvalue of a BF < Dvalue, build the EF according to Equation (3).
$$\mathrm{EF}_i=\begin{cases}1, & \text{if } \mathrm{BF}_i<\mathrm{OST}_i\\0, & \text{otherwise}\end{cases} \tag{2}$$

$$\mathrm{EF}_i=\begin{cases}0, & \text{if } \mathrm{BF}_i\le \mathrm{OST}_i\\1, & \text{otherwise}\end{cases} \tag{3}$$
where EFi represents the enhanced feature i, BFi the basic feature i, and OSTi the optimal segmentation threshold for BFi, with i = 1, 2, …, 10 referring to the ten BFs selected for enhancement.
Ultimately, 10 EFs are constructed from the 13 BFs, increasing the total number of features to 23, as detailed in Table 2.
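A short sketch of the binarization step defined by Equations (2) and (3) is given below; Hvalue and Dvalue correspond to the h_peak and d_peak arguments, and the function name is illustrative.

```python
import numpy as np

def build_enhanced_feature(bf, ost, h_peak, d_peak):
    """Sketch of Equations (2) and (3): binarize a basic feature into an
    enhanced feature using its optimal segmentation threshold. `h_peak` and
    `d_peak` are the feature values at the kernel density peaks of the
    healthy and diseased samples (Hvalue and Dvalue in the text)."""
    bf = np.asarray(bf, dtype=float)
    if h_peak > d_peak:                # Equation (2): low values flag disease
        return (bf < ost).astype(int)
    return (bf > ost).astype(int)      # Equation (3): high values flag disease
```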

2.2.3. Machine Learning Modeling

(1) Machine Learning Algorithms
Three ML algorithms were employed in this study for the recognition task: random forest (RF), a decision-tree-based ensemble method; support vector machine (SVM), a kernel-based approach; and Gaussian naive Bayes (GNB), a probabilistic model. The dataset was randomly divided into a training set (70%: 84 samples) and a testing set (30%: 36 samples). The characteristics of each algorithm are as follows.
RF constructs an ensemble of decision trees using the bagging strategy, where each tree is built on randomly selected samples and feature subsets [37]. Its robustness against high-dimensional data, nonlinear relationships, and noisy inputs makes it particularly suitable for modeling field-collected banana data that may contain measurement errors. SVM identifies the optimal hyperplane that maximizes the margin between two classes based on statistical learning theory [38]. Since the goal is binary classification—differentiating diseased from healthy banana plants—SVM is a suitable choice. It may even outperform RF in terms of generalization, especially when dealing with small datasets. GNB assumes that features follow a Gaussian distribution and applies Bayes’ theorem under the assumption of conditional independence among features [39]. As the dataset includes multiple continuous BR-based BFs and VI-based BFs, GNB provides an effective way to model such continuous inputs.
(2) Experimental Setup for BFW Recognition Modeling
To evaluate the effectiveness of EFs in improving model performance, two sets of comparative experiments and four model configurations were designed for BFW recognition. These configurations primarily differ in their feature set, as shown in Table 3 (Models I–IV). The comparison between Model I and Model II was intended to test whether the inclusion of EFs improves the recognition performance. The comparison between Model III and Model IV aimed to determine whether EFs outperform BFs in identifying BFW.
In this study, the experiments were conducted in a local Anaconda environment using Jupyter Notebook 6.3.0. The experimental setup included the Windows 10 operating system and the Python 3.8.8 interpreter. For reproducibility, the random_state parameter was set to 42 for both RF and SVM. All the other parameters were kept at their default settings. The implementations used were as follows: RF, sklearn.ensemble.RandomForestClassifier; SVM, sklearn.svm.SVC; GNB, sklearn.naive_bayes.GaussianNB.
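A minimal sketch reproducing this setup is given below; the CSV file and column names are hypothetical placeholders for the compiled feature table.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Hypothetical feature table: one row per plant, a 0/1 "status" column, and
# the feature columns of the chosen configuration (e.g., 23 for Model II).
data = pd.read_csv("banana_features.csv")
X, y = data.drop(columns=["status"]), data["status"]

# 70/30 random split; default hyperparameters; random_state = 42 for RF and SVM
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

models = {
    "RF": RandomForestClassifier(random_state=42),
    "SVM": SVC(random_state=42),
    "GNB": GaussianNB(),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, accuracy_score(y_test, model.predict(X_test)))
```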

2.2.4. Model Performance Evaluation

The performance of the BFW recognition models was evaluated using four metrics: accuracy, precision, recall, and F1 score (Equations (4)–(7)). Accuracy measures the overall correctness of the model’s predictions. Precision assesses the proportion of true positive predictions among all the positive predictions, reflecting the model’s ability to avoid false positives. Recall evaluates the proportion of true positives that were correctly identified, reflecting the model’s ability to detect actual positives. F1 score is the harmonic mean of the precision and recall, providing a balanced evaluation of both aspects. All the metrics range from 0 to 1, with higher values indicating better model performance. To ensure fairness and robustness in the evaluation, the train–test split (70% training, 30% testing) was repeated 50 times with random sampling. The arithmetic mean of the performance metrics across these runs was reported as the final score.
$$\mathrm{Accuracy}=\frac{TP+TN}{TP+TN+FP+FN} \tag{4}$$

$$\mathrm{Precision}=\frac{TP}{TP+FP} \tag{5}$$

$$\mathrm{Recall}=\frac{TP}{TP+FN} \tag{6}$$

$$F1\ \mathrm{Score}=\frac{2\times \mathrm{Precision}\times \mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}} \tag{7}$$
where TP denotes the number of samples predicted as healthy that are indeed healthy, FP represents the number of samples predicted as healthy but actually diseased, FN refers to the number of samples predicted as diseased but actually healthy, and TN indicates the number of samples predicted as diseased that are indeed diseased.
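A minimal sketch of this repeated evaluation protocol is given below, using SVM as an illustrative classifier; varying the split seed across runs is an assumption made to realize the 50 random train–test splits.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)
from sklearn.svm import SVC

def repeated_evaluation(X, y, n_runs=50):
    """Sketch of the evaluation protocol: 50 random 70/30 splits with the
    four metrics averaged over all runs. SVM is used as an illustrative
    classifier; varying the split seed per run is an assumption."""
    scores = {"accuracy": [], "precision": [], "recall": [], "f1": []}
    for run in range(n_runs):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.3, random_state=run)
        pred = SVC(random_state=42).fit(X_tr, y_tr).predict(X_te)
        scores["accuracy"].append(accuracy_score(y_te, pred))
        scores["precision"].append(precision_score(y_te, pred))
        scores["recall"].append(recall_score(y_te, pred))
        scores["f1"].append(f1_score(y_te, pred))
    return {name: float(np.mean(vals)) for name, vals in scores.items()}
```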

3. Results

3.1. Optimal Kernel Density Segmentation Thresholds

Figure 9 and Figure 10 present the optimal kernel density segmentation thresholds and corresponding F1 scores obtained using the AutoKDFC for five BR-based BFs and five VI-based BFs. The results show that among the five BRs, Rblue, Rgreen, and RNIR exhibited relatively weak performance, with F1 scores ranging from 72% to 78%, while Rred and RRE achieved better results, with F1 scores exceeding 80%. The VIs outperformed the BRs overall, with all the F1 scores above 90% except for ARI, which scored 84.36%, indicating the high reliability of the automatically derived thresholds. The kernel density distributions for healthy and diseased samples show significant overlaps for Rblue, Rgreen, RNIR, and ARI, which likely explains their lower thresholding performance (further discussed in Section 4). Overall, the optimal segmentation thresholds of the VIs were superior, suggesting that VI-based EFs are more effective than BR-based EFs in distinguishing the health status of banana plants affected by BFW.

3.2. Effectiveness of Enhanced Features

To assess the effectiveness of the constructed EFs in terms of BFW recognition, this study analyzed their correlation with plant health status and their relative importance.

3.2.1. Correlation Analysis

Figure 11 and Figure 12 show the kernel density distributions of the EFs for healthy and diseased plants. Compared to the BFs, the EFs demonstrate more pronounced distributional differences between healthy and diseased plants. Their kernel density curves are more uniform and show greater separability, highlighting their stronger discriminative capacity for BFW recognition. Additionally, a comparison between Figure 11 and Figure 12 reveals that the VI-based EFs exhibit better class separability and more consistent distribution patterns than the BR-based EFs.
Pearson correlation coefficients were used to analyze the correlation between the EFs/BFs and the BFW severity (Figure 13). Figure 13a indicates that, except for Rgreen, all the BR-based EFs show higher correlations with BFW than the BFs, with Rblue, Rred, RRE, and RNIR increasing by 2%, 15%, 3%, and 2%, respectively. Figure 13b shows that although ARI's correlation dropped by 7%, NDVI, NDRE, CIgreen, and CIRE improved by 17%, 11%, 10%, and 6%, respectively. In general, most EFs constructed by the AutoKDFC improved the correlation with BFW, thereby enhancing the separability of healthy and diseased plants. The reduced correlation of Rgreen and ARI may be attributed to the poor segmentation thresholds (as discussed in Section 3.1) and the low sensitivity of the Rgreen band to BFW (further explored in Section 4.1).

3.2.2. Feature Importance Analysis

Since the SVM and GNB algorithms cannot directly quantify feature importance, the built-in feature importance function of the RF algorithm was used for analysis. Figure 14 displays the relative importance of all 23 features. The results indicate that, apart from ARI_EF, the VI-based EFs generally hold higher importance, with some even exceeding that of BR-based BFs. Among the BR-based EFs, only RRE_EF showed notable importance, while the others were comparatively low. Overall, the EFs constructed using the AutoKDFC were found to be highly relevant in terms of BFW recognition, with the VI-based EFs demonstrating greater significance than the BR-based ones. Possible reasons why certain EFs performed worse than their basic counterparts will be explored further in Section 4.

3.3. Predictive Performance of BFW Recognition Models

Figure 15 and Table 4 provide both visual and quantitative comparisons of the model performance from two sets of experiments: Experiment 1 (Model I vs. Model II, using 13 BFs vs. all 23 features, i.e., BFs plus EFs) and Experiment 2 (Model III vs. Model IV, using 10 BFs vs. 10 EFs). In Experiment 1, both Figure 15a and Table 4 show improvements in all the performance metrics across the algorithms when the EFs were added, with the greatest average improvement observed in GNB (3.07%) and the smallest in RF (0.88%). Before introducing the EFs, RF was the best-performing model, whereas after enhancement, SVM achieved the highest accuracy at 91.39%. In Experiment 2, as shown in Figure 15b and Table 4, models using only the EFs outperformed those using only BFs across all three algorithms, again confirming the superior discriminative ability of the EFs. SVM continued to perform best, with an average accuracy of 91.44%. Overall, the EFs significantly improved the model prediction performance and demonstrated strong adaptability across multiple algorithms, with the most substantial improvement observed in SVM. This may be attributed to the consistency between SVM's optimal hyperplane classification principle and the design rationale behind the EFs.
Additionally, despite having fewer features, Models III and IV outperformed Models I and II, which had more features. This suggests that combining BFs and EFs might introduce redundancy, potentially reducing the model generalizability. An increased number of features could lead the models to overfit noise in the data rather than learn meaningful distinctions, ultimately affecting the test set performance. Therefore, future studies should consider optimized combinations of BFs and EFs to balance their contributions and improve the model accuracy and generalization under varying conditions.

3.4. Spatial Distribution Mapping of Banana Fusarium Wilt

Given that SVM was identified as the optimal-performing model, it was selected for mapping the spatial distribution of BFW. Although Model IV (EFs only) achieved the highest accuracy, completely replacing the BFs could risk information loss or reduced feature stability, as the EFs are derived from the basic ones. Thus, Model II, which integrates both BFs and EFs and achieved only a marginally lower accuracy (by 0.05%), was ultimately chosen for the mapping task.
In the mapping process, banana planting areas in the study region were first extracted using texture features and the NDVI. Then, the SVM model based on Model II was applied to predict the spatial distribution of BFW (Figure 16). The results show that the red dots (diseased samples) in Figure 16 are primarily located within the yellow regions (predicted diseased plants), while the blue dots (healthy samples) align well with the green regions (predicted healthy plants), indicating strong agreement between the model predictions and the actual conditions. Furthermore, the spatial pattern of the yellow areas in Figure 16b closely matches the distribution of bare soil zones shown in Figure 16a. This is likely because banana plants infected with BFW exhibit drooping and withered leaves, reducing canopy coverage and increasing soil exposure, which in turn alters the spectral characteristics. This observed phenomenon further supports the validity of the model's predictions.

4. Discussion

4.1. Mechanistic Interpretation of Feature Separability and Threshold Rationality

The separability of the kernel density distributions for a given feature primarily depends on its sensitivity to BFW—specifically, the distributional differences between healthy and diseased samples. Upon infection, the pigment content of banana leaves changes significantly during disease progression, with leaves turning from green to yellow and eventually withering to a grayish–white color. These pigment alterations result in shifts in the spectral reflectance, with varying magnitudes across different bands. As the chlorophyll content decreases, the reflectance in the Rred and RNIR bands increases, while RRE decreases. Similarly, Rblue and Rgreen also increase, but to a lesser extent than Rred and RNIR [40,41]. Accordingly, all five BR-based BFs examined in this study showed a degree of separability, but Rblue and Rgreen performed poorly in threshold segmentation (F1 scores of 72.87% and 72.55%, respectively), with Rgreen being the weakest. Prior studies have shown that the reflectance in the green band is relatively stable under physiological stress, making it less effective for distinguishing between healthy and diseased vegetation [16]. This aligns with the current study, where the kernel density distributions of Rgreen show considerable overlap between healthy and diseased samples.
For the VI-based BFs, significant differences were observed between healthy and diseased samples in indices that reflect the chlorophyll content and leaf area index (e.g., NDVI, NDRE, CIgreen, and CIRE), resulting in high kernel density separability and rational threshold partitioning. In contrast, the carotenoid content remains relatively stable during early leaf yellowing, degrading only in the late stages of BFW when leaves turn brown or gray–white. Since no severely diseased (gray–white) samples were collected during the field surveys in this study, the kernel density distributions of SIPI, SIPIRE, and CARI—indices related to the carotenoid content—were similar between healthy and diseased samples, resulting in lower separability. Regarding the ARI index, which reflects the anthocyanin content, there is minimal change during the early infection stages, with anthocyanin gradually accumulating in the mid-to-late stages. As a result, diseased samples exhibit a dispersed kernel density distribution that overlaps with healthy samples, yielding a KDDS score near the threshold between "considered separable" and "highly separable" (Median 3), and leading to weaker threshold performance compared to other VIs.

4.2. Mechanistic Interpretation of the Importance of Enhanced Features

This study employed a nonlinear threshold-based transformation of BFs to construct EFs aimed at improving the recognition of spectral differences between healthy and diseased banana plants—especially for identifying subtle spectral variations during early infection stages. The model evaluation indicated that the inclusion of, or partial substitution with, EFs improved the recognition of BFW. However, feature importance analysis using the RF algorithm (Figure 14) showed that EFs generally contributed less than their corresponding BFs. This phenomenon may stem from several factors. (1) Information redundancy: EFs are derived from BFs and, although optimized for disease recognition, share overlapping information sources. Consequently, in RF models, some EFs may be partially replaced by BFs, reducing their relative importance. (2) Feature discriminability: BFs capture global spectral differences between healthy and diseased plants, while EFs are tailored to optimize the discriminability in specific spectral ranges. Given that RF relies on multi-dimensional feature partitioning for classification, it tends to favor globally discriminative features, positioning EFs as supplementary. (3) Feature sensitivity differences: EFs may be more sensitive to subtle early-stage spectral variations, while in late stages—when spectral differences become more pronounced—BFs may play a dominant role, thereby diminishing the unique contribution of EFs.
In summary, while the inclusion of EFs improves disease recognition, their independent contribution is influenced by information redundancy, model preference, and disease progression, resulting in varying levels of impact across different models.

4.3. Methodological Benchmarking and Performance Reference to Similar Studies

In the field of plant disease recognition, model performance is highly dependent on the effectiveness of feature engineering. This study addresses the challenge of identifying BFW by innovatively proposing an enhanced feature construction method based on kernel density segmentation. A comparative analysis is presented below from two perspectives—feature engineering and model performance—against existing studies on BFW recognition.
In terms of feature engineering, compared with other BFW identification studies that rely solely on BFs [16,17], this study introduces EFs to improve the model’s ability to detect subtle spectral signals associated with early-stage BFW symptoms. Feature importance analysis shows that some EFs rank among the top five (Figure 14). In terms of model performance, compared with the logistic regression model using BFs in [16], which achieved an accuracy of 90.5%, the SVM model in this study reached an average accuracy of 91.39% through the use of EFs, demonstrating their effectiveness in boosting model performance. Although there are limitations in making direct comparisons across different plant diseases, the proposed enhanced feature construction approach may offer valuable insights for the monitoring of other crop diseases. Future work may apply this strategy to broader scenarios and conduct more extensive comparative studies.

4.4. Future Research Directions and Perspectives

This study validated the effectiveness of the AutoKDFC method for automatically constructing EFs in improving BFW recognition capability, offering a novel approach to plant health monitoring based on spectral features. However, several directions remain worthy of further exploration in future research:
(1)
Disease severity monitoring: EFs are constructed to capture subtle spectral variations during early-stage disease infection, showing particularly high recognition capability for mildly infected samples. However, due to the limited sample size, this study did not conduct an in-depth analysis of disease severity monitoring. Future research with a sufficient data volume could further validate the role of EFs in identifying mildly infected samples and evaluate their applicability across different disease progression stages.
(2)
Feature optimization: This study did not apply independent feature selection and instead combined BFs and EFs. Given their derivation, there may be significant information redundancy. Future work could apply feature selection methods such as the Pearson correlation coefficient [42], mutual information [43], or LASSO [44] to reduce redundancy and improve model generalizability [45,46,47].
(3)
Application extension: AutoKDFC demonstrates strong potential for disease recognition and could be extended to other crop disease monitoring tasks. Integrating this approach with multi-source remote sensing data and deep learning models may yield more efficient and accurate disease recognition systems, supporting the advancement of precision agriculture.

5. Conclusions

This study proposed the AutoKDFC to improve the spectral recognition of BFW. By applying threshold segmentation to basic features (BRs and VIs), the resulting EFs effectively captured subtle spectral variations in early infection stages. The experimental results showed that—except for a few vegetation indices—EFs based on VIs outperformed those based on BRs in BFW recognition. The inclusion of EFs significantly improved the recognition accuracy across multiple ML algorithms (RF, SVM, GNB), with SVM yielding the best overall performance. Although EFs ranked lower than BFs in importance within the RF model, their inclusion still notably improved the model performance, confirming their effectiveness in detecting early-stage spectral anomalies caused by BFW. Future work could further validate the ability of these features to detect mild disease cases through severity-stage monitoring. Additionally, combining BFs and EFs may introduce redundancy, and future efforts should focus on optimizing feature selection to enhance model generalizability. Overall, this study demonstrated the effectiveness and generalizability of the AutoKDFC method for BFW recognition and introduced a novel approach to spectral feature enhancement that could be applied to broader disease monitoring tasks, supporting precision agriculture and disease management.

Author Contributions

Conceptualization, Y.S., L.Z.; methodology, Y.S., H.Y.; software, Y.S.; validation, Y.S., H.L.; formal analysis, Y.S., L.Z., H.Y.; investigation, H.Y., W.K., B.Z.; resources, W.H., J.C.; writing—original draft preparation, Y.S., L.Z.; writing—review and editing, Y.S., L.Z., H.L., X.L., W.H., W.K., B.Z.; visualization, Y.S., L.Z.; supervision, J.C.; funding acquisition, L.Z., W.K., B.Z., J.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (No. 42171323), the Hainan Provincial Natural Science Foundation of China (No. 623QN325, 322QN346), and the Research Foundation of Shenzhen Science and Technology Innovation Bureau (No. KCXFZ20240903093800002).

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Shen, Z.; Xue, C.; Penton, C.R.; Thomashow, L.S.; Zhang, N.; Wang, B.; Ruan, Y.; Li, R.; Shen, Q. Suppression of banana Panama disease induced by soil microbiome reconstruction through an integrated agricultural strategy. Soil Biol. Biochem. 2019, 128, 164–174. [Google Scholar] [CrossRef]
  2. Ploetz, R.C. Fusarium Wilt of Banana. Phytopathology 2015, 105, 1512. [Google Scholar] [CrossRef] [PubMed]
  3. Ismaila, A.A.; Ahmad, K.; Siddique, Y.; Wahab, M.A.A.; Kutawa, A.B.; Abdullahi, A.; Zobir, S.A.M.; Abdu, A.; Abdullah, S.N.A. Fusarium wilt of banana: Current update and sustainable disease control using classical and essential oils approaches. Hortic. Plant J. 2023, 9, 1–28. [Google Scholar] [CrossRef]
  4. Pegg, K.G.; Coates, L.M.; O’Neill, W.T.; Turner, D.W. The Epidemiology of Fusarium Wilt of Banana. Front. Plant Sci. 2019, 10, 1395. [Google Scholar] [CrossRef]
  5. Ordonez, N.; Seidl, M.F.; Waalwijk, C.; Drenth, A.; Kilian, A.; Thomma, B.P.H.J.; Ploetz, R.C.; Kema, G.H.J. Worse Comes to Worst: Bananas and Panama Disease—When Plant and Pathogen Clones Meet. PLoS Pathog. 2015, 11, e1005197. [Google Scholar] [CrossRef]
  6. Segura-Mena, R.A.; Stoorvogel, J.J.; García-Bastidas, F.; Salacinas-Niez, M.; Kema, G.H.J.; Sandoval, J.A. Evaluating the potential of soil management to reduce the effect of Fusarium oxysporum f. sp. cubense in banana (Musa AAA). Eur. J. Plant Pathol. 2021, 160, 441–455. [Google Scholar] [CrossRef]
  7. Zhang, M.; Zhou, D.; Qi, D.; Wei, Y.; Chen, Y.; Feng, J.; Wang, W.; Xie, J. Research progress on the integrated control of Fusarium wilt disease in banana. Sci. Sin. Vitae 2024, 54, 1843–1852. [Google Scholar]
  8. Dita, M.; Barquero, M.; Heck, D.; Mizubuti, E.S.G.; Staver, C.P. Fusarium Wilt of Banana: Current Knowledge on Epidemiology and Research Needs Toward Sustainable Disease Management. Front. Plant Sci. 2018, 9, 1468. [Google Scholar] [CrossRef]
  9. Wang, W.; Liu, Z.; Zheng, C.; Zhang, L. Research Progress on Banana Fusarium Wilt. China Port Sci. Technol. 2024, 6, 44–51. [Google Scholar]
  10. Siamak, S.B.; Zheng, S. Banana Fusarium Wilt (Fusarium oxysporum f. sp. cubense) Control and Resistance, in the Context of Developing Wilt-resistant Bananas Within Sustainable Production Systems. Hortic. Plant J. 2018, 4, 208–218. [Google Scholar] [CrossRef]
  11. Lan, Y.; Zhu, Z.; Deng, X.; Lian, B.; Huang, Y.; Huang, Z.; Hu, J. Monitoring and classification of citrus Huanglongbing based on UAV hyperspectral remote sensing. Trans. Chin. Soc. Agric. Eng. 2019, 35, 92–100. [Google Scholar]
  12. Yu, R.; Luo, Y.; Zhou, Q.; Zhang, X.; Wu, D.; Ren, L. Early detection of pine wilt disease using deep learning algorithms and UAV-based multispectral imagery. For. Ecol. Manag. 2021, 497, 119493. [Google Scholar] [CrossRef]
  13. Liao, J.; Tao, W.; Zang, Y.; Wang, P.; Luo, X. Research Progress and Prospect of Key Technologies in Crop Disease and Insect Pest Monitoring. Trans. Chin. Soc. Agric. Mach. 2023, 54, 1–19. [Google Scholar]
  14. Duarte, A.; Borralho, N.; Cabral, P.; Caetano, M. Recent Advances in Forest Insect Pests and Diseases Monitoring Using UAV-Based Data: A Systematic Review. Forests 2022, 13, 911. [Google Scholar] [CrossRef]
  15. Kaivosoja, J.; Hautsalo, J.; Heikkinen, J.; Hiltunen, L.; Ruuttunen, P.; Näsi, R.; Niemeläinen, O.; Lemsalu, M.; Honkavaara, E.; Salonen, J. Reference Measurements in Developing UAV Systems for Detecting Pests, Weeds, and Diseases. Remote Sens. 2021, 13, 1238. [Google Scholar] [CrossRef]
  16. Ye, H.; Huang, W.; Huang, S.; Cui, B.; Dong, Y.; Guo, A.; Ren, Y.; Jin, Y. Recognition of Banana Fusarium Wilt Based on UAV Remote Sensing. Remote Sens. 2020, 12, 938. [Google Scholar] [CrossRef]
  17. Zhang, S.; Li, X.; Ba, Y.; Lyu, X.; Zhang, M.; Li, M. Banana Fusarium Wilt Disease Detection by Supervised and Unsupervised Methods from UAV-Based Multispectral Imagery. Remote Sens. 2022, 14, 1231. [Google Scholar] [CrossRef]
  18. Lin, S.; Ji, T.; Wang, J.; Li, K.; Lu, F.; Ma, C.; Gao, Z. BFWSD: A lightweight algorithm for banana fusarium wilt severity detection via UAV-Based Large-Scale Monitoring. Smart Agric. Technol. 2025, 11, 101047. [Google Scholar] [CrossRef]
  19. Yuan, L.; Pu, R.; Zhang, J.; Wang, J.; Yang, H. Using high spatial resolution satellite imagery for mapping powdery mildew at a regional scale. Precis. Agric. 2016, 17, 332–348. [Google Scholar] [CrossRef]
  20. Yang, G.; He, Y.; Feng, X.; Li, X.; Zhang, J.; Yu, Z. Methods and New Research Progress of Remote Sensing Monitoring of Crop Disease and Pest Stress Using Unmanned Aerial Vehicle. Smart Agric. 2022, 4, 1–16. [Google Scholar]
  21. Wang, G.; Lan, Y.; Qi, H.; Chen, P.; Hewitt, A.; Han, Y. Field evaluation of an unmanned aerial vehicle (UAV) sprayer: Effect of spray volume on deposition and the control of pests and disease in wheat. Pest Manag. Sci. 2019, 75, 1546–1555. [Google Scholar] [CrossRef]
  22. Das, S.; Biswas, A.; VimalKumar, C.; Sinha, P. Deep Learning Analysis of Rice Blast Disease using Remote Sensing Images. IEEE Geosci. Remote Sens. 2023, 20, 1. [Google Scholar] [CrossRef]
  23. Zhang, N.; Chai, X.; Li, N.; Zhang, J.; Sun, T.; Sveriges, L. Applicability of UAV-based optical imagery and classification algorithms for detecting pine wilt disease at different infection stages. GIScience Remote Sens. 2023, 60, 2170479. [Google Scholar] [CrossRef]
  24. Deng, X.; Zhu, Z.; Yang, J.; Zheng, Z.; Huang, Z.; Yin, X.; Wei, S.; Lan, Y. Detection of Citrus Huanglongbing Based on Multi-Input Neural Network Model of UAV Hyperspectral Remote Sensing. Remote Sens. 2020, 12, 2678. [Google Scholar] [CrossRef]
  25. Tu, T.; Su, Y.; Tang, Y.; Guo, G.; Tan, W.; Ren, S. SHFW: Second-order hybrid fusion weight–median algorithm based on machining learning for advanced IoT data analytics. Wirel. Netw. 2023, 30, 6055–6067. [Google Scholar] [CrossRef]
  26. Tu, T.; Su, Y.; Ren, S. FC-MIDTR-WCCA: A Machine Learning Framework for PM2.5 Prediction. IAENG Int. J. Comput. Sci. 2024, 51, 544–552. [Google Scholar]
  27. Su, Y.; Zhao, L.; Li, X.; Li, H.; Ge, Y.; Chen, J. FC-StackGNB: A novel machine learning modeling framework for forest fire risk prediction combining feature crosses and model fusion algorithm. Ecol. Indic. 2024, 166, 112577. [Google Scholar] [CrossRef]
  28. Rouse, J.W.; Haas, R.H.; Schell, J.A.; Deering, D.W. Monitoring vegetation systems in the great plains with ERTS. In Proceedings of the Third ERTS-1 Symposium NASA SP-351, Greenbelt, MD, USA, 10–14 December 1973. [Google Scholar]
  29. Gitelson, A.; Merzlyak, M.N. Spectral reflectance changes associated with autumn senescence of aesculus–hippocastanum L and acer-platanoides L leaves—Spectral features and relation to chlorophyll estimation. J. Plant Physiol. 1994, 143, 286–292. [Google Scholar] [CrossRef]
  30. Gitelson, A.A.; Gritz, Y.; Merzlyak, M.N. Relationships between leaf chlorophyll content and spectral reflectance and algorithms for non-destructive chlorophyll assessment in higher plant leaves. J. Plant Physiol. 2003, 160, 271–282. [Google Scholar] [CrossRef]
  31. Gitelson, A.A.; Vina, A.; Ciganda, V.; Rundquist, D.C.; Arkebauer, T.J. Remote estimation of canopy chlorophyll content in crops. Geophys. Res. Lett. 2005, 32, L08403. [Google Scholar] [CrossRef]
  32. Peñuelas, J.; Inoue, Y. Reflectance indices indicative of changes in water and pigment contents of peanut and wheat leaves. Photosynthetica 1999, 36, 355–360. [Google Scholar] [CrossRef]
  33. Ramoelo, A.; Skidmore, A.K.; Cho, M.A.; Schlerf, M.; Mathieu, R.; Heitkonig, I.M.A. Regional estimation of savanna grass nitrogen using the red-edge band of the spaceborne RapidEye sensor. Int. J. Appl. Earth Obs. Geoinf. 2012, 19, 151–162. [Google Scholar] [CrossRef]
  34. Kim, M.S.; Daughtry, C.S.T.; Chappelle, E.W.; McMurtrey, J.E.; Walthall, C.L. The use of high spectral resolution bands for estimating absorbed photosynthetically active radiation (APAR). In Proceedings of the 6th International Symposium on Physical Measurements and Signatures in Remote Sensing, Val d’Isère, France, 17–21 January 1994; pp. 299–306. [Google Scholar]
  35. Gitelson, A.A.; Merzlyak, M.N.; Chivkunova, O.B. Optical properties and nondestructive estimation of anthocyanin content in plant leaves. Photochem. Photobiol. 2001, 74, 38–45. [Google Scholar] [CrossRef] [PubMed]
  36. Kim, J.S.; Scott, C.D. Robust kernel density estimation. J. Mach. Learn. Res. 2012, 13, 2529–2565. [Google Scholar]
  37. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  38. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297. [Google Scholar] [CrossRef]
  39. Xue, J.; Titterington, D.M. Comment on “On Discriminative vs. Generative Classifiers: A Comparison of Logistic Regression and Naive Bayes”. Neural Process. Lett. 2008, 28, 169–187. [Google Scholar] [CrossRef]
  40. Fang, C.; Wang, L.; Xu, H. A comparative study of different red edge indices for remote sensing recognition of urban grassland health status. J. Geo-Inf. Sci. 2017, 19, 1382–1392. [Google Scholar]
  41. Yuan, X.; Zhou, G.; Wang, Q.; He, Q. Hyperspectral characteristics of chlorophyll content in summer maize under different water irrigation conditions and its inversion. Acta Ecol. Sin. 2021, 41, 543–552. [Google Scholar] [CrossRef]
  42. Benesty, J.; Chen, J.; Huang, Y.; Cohen, I. Pearson Correlation Coefficient. In Noise Reduction in Speech Processing; Springer: Berlin/Heidelberg, Germany, 2009; pp. 1–4. [Google Scholar] [CrossRef]
  43. Duncan, T.E. On the calculation of mutual information. SIAM J. Appl. Math. 1970, 19, 215–220. [Google Scholar] [CrossRef]
  44. Roth, V. The generalized LASSO. IEEE Trans. Neural Netw. 2004, 15, 16–28. [Google Scholar] [CrossRef]
  45. Kumar, V.; Minz, S. Feature Selection: A literature Review. Smart Comput. Rev. 2014, 4, 211–229. [Google Scholar] [CrossRef]
  46. Hancer, E.; Xue, B.; Zhang, M. A survey on feature selection approaches for clustering. Artif. Intell. Rev. 2020, 53, 4519–4545. [Google Scholar] [CrossRef]
  47. Li, J.; Cheng, K.; Wang, S.; Morstatter, F.; Trevino, R.P.; Tang, J.; Liu, H. Feature Selection. ACM Comput. Surv. 2018, 50, 1–45. [Google Scholar] [CrossRef]
Figure 1. Study area: (a) Long’an County, Guangxi, China; (b) UAV remote sensing imagery (true-color composite) of the study area with overlaid survey points; and (c) photos of healthy and diseased plants at different infection levels.
Figure 2. (a) The DJI Phantom 4 drone equipped with the MicaSense RedEdge-M™ multi-spectral camera system; and (b) the MicaSense RedEdge-M™ multi-spectral camera with a DLS module and a GPS module.
Figure 3. Technical flowchart.
Figure 4. AutoKDFC flowchart.
Figure 5. Kernel density distribution graphs of BR-based BFs. The color bar on the right side represents the kernel density intensity; the darker the color, the higher the kernel density.
Figure 6. Kernel density distribution graphs of VI-based BFs. The color bar on the right side represents the kernel density intensity; the darker the color, the higher the kernel density.
Figure 7. Diagram of feature separability threshold determination.
Figure 8. KDDS and kernel density separability determination of BR-based BFs and VI-based BFs.
Figure 9. Kernel density segmentation thresholds of BR-based BFs. (a) Rblue; (b) Rgreen; (c) Rred; (d) RRE; (e) RNIR.
Figure 10. Kernel density segmentation thresholds of VI-based BFs. (a) ARI; (b) NDVI; (c) NDRE; (d) CIRE; (e) CIgreen.
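As a reading aid for Figures 7–10, the following is a minimal Python sketch, not the authors' implementation, of how a segmentation threshold can be located where the kernel density curves of healthy and diseased samples cross for a single basic feature. The function name, grid resolution, and use of SciPy's gaussian_kde are illustrative assumptions; the authoritative procedure is the AutoKDFC algorithm summarized in Figure 4.

import numpy as np
from scipy.stats import gaussian_kde

def kde_crossing_threshold(healthy, diseased, num_points=512):
    """Locate a candidate segmentation threshold where the kernel density curves
    of healthy and diseased samples cross (illustrative, not the paper's code)."""
    healthy = np.asarray(healthy, dtype=float)
    diseased = np.asarray(diseased, dtype=float)
    lo = min(healthy.min(), diseased.min())
    hi = max(healthy.max(), diseased.max())
    grid = np.linspace(lo, hi, num_points)
    dens_h = gaussian_kde(healthy)(grid)   # density curve of healthy samples
    dens_d = gaussian_kde(diseased)(grid)  # density curve of diseased samples
    diff = dens_h - dens_d
    crossings = np.where(np.diff(np.sign(diff)) != 0)[0]  # sign changes = curve crossings
    if crossings.size == 0:
        return None  # curves never cross: the feature offers no usable separability
    # prefer the crossing that lies between the two class modes
    lo_mode, hi_mode = sorted((grid[np.argmax(dens_h)], grid[np.argmax(dens_d)]))
    between = [grid[i] for i in crossings if lo_mode <= grid[i] <= hi_mode]
    return between[0] if between else grid[crossings[0]]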
Figure 11. Kernel density graphs of BR-based EFs: (a) Rblue_EF; (b) Rgreen_EF; (c) Rred_EF; (d) RRE_EF; (e) RNIR_EF. The color bar on the right side represents the kernel density intensity; the darker the color, the higher the kernel density.
Figure 12. Kernel density graphs of VI-based EFs: (a) CIRE_EF; (b) NDVI_EF; (c) NDRE_EF; (d) ARI_EF; (e) CIgreen_EF. The color bar on the right side represents the kernel density intensity; the darker the color, the higher the kernel density.
Figure 13. Correlation comparison of BFs and EFs with BFW: (a) BR-based BFs and EFs; and (b) VI-based BFs and EFs. Numbers shown in red indicate features whose correlation with the target variable decreased after applying the AutoKDFC algorithm.
Figure 14. Feature importance based on random forest.
Figure 15. Model performance comparison: (a) comparison of Model I and Model II; and (b) comparison of Model III and Model IV.
Figure 16. Spatial distribution map of BFW: (a) UAV remote sensing image of the study area; and (b) spatial distribution map of BFW in the study area (overlaid with field survey samples).
Table 1. List of eight vegetation indices and their sensitive parameters.
Vegetation Indices | Formulation | Sensitive Parameter | Reference
NDVI | (RNIR - Rred) / (RNIR + Rred) | Leaf area index, green biomass | [28]
NDRE | (RNIR - RRE) / (RNIR + RRE) | Leaf area index, green biomass | [29]
CIgreen | RNIR / Rgreen - 1 | Chlorophyll content | [30]
CIRE | RNIR / RRE - 1 | Chlorophyll content | [31]
SIPI | (RNIR - Rblue) / (RNIR - Rred) | Pigment content | [32]
SIPIRE | (RRE - Rblue) / (RRE - Rred) | Pigment content | [33]
CARI | RRE / Rgreen - 1 | Carotenoid content | [34]
ARI | 1/Rgreen - 1/RRE | Anthocyanin content | [35]
Note: Rred, red band reflectance; Rgreen, green band reflectance; Rblue, blue band reflectance; RRE, red-edge band reflectance; RNIR, near-infrared band reflectance.
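For orientation, the indices in Table 1 map directly onto the five band reflectances listed in the note above. Below is a minimal sketch, assuming the reflectances are available as NumPy arrays extracted from the calibrated multi-spectral imagery; the variable names and the small epsilon guard are illustrative, not part of the paper's processing chain.

import numpy as np

def vegetation_indices(r_blue, r_green, r_red, r_re, r_nir):
    """Compute the eight indices of Table 1 from band reflectance arrays (illustrative)."""
    eps = 1e-10  # guard against division by zero in masked or shadowed pixels
    return {
        "NDVI":    (r_nir - r_red) / (r_nir + r_red + eps),
        "NDRE":    (r_nir - r_re)  / (r_nir + r_re + eps),
        "CIgreen": r_nir / (r_green + eps) - 1.0,
        "CIRE":    r_nir / (r_re + eps) - 1.0,
        "SIPI":    (r_nir - r_blue) / (r_nir - r_red + eps),
        "SIPIRE":  (r_re - r_blue)  / (r_re - r_red + eps),
        "CARI":    r_re / (r_green + eps) - 1.0,
        "ARI":     1.0 / (r_green + eps) - 1.0 / (r_re + eps),
    }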
Table 2. Feature set for BFW recognition.
Feature Type | Feature Category | Specific Feature
Basic Features (BFs) | BRBFs | (1) Rblue; (2) Rgreen; (3) Rred; (4) RRE; (5) RNIR
Basic Features (BFs) | VIBFs | (6) SIPIRE; (7) SIPI; (8) CARI; (9) ARI; (10) CIgreen; (11) NDVI; (12) NDRE; (13) CIRE
Enhanced Features (EFs) | BREFs | (14) Rblue_EF; (15) Rgreen_EF; (16) Rred_EF; (17) RRE_EF; (18) RNIR_EF
Enhanced Features (EFs) | VIEFs | (19) ARI_EF; (20) CIRE_EF; (21) NDVI_EF; (22) NDRE_EF; (23) CIgreen_EF
Note: BRBFs, band-reflectance-based basic features; VIBFs, vegetation index-based basic features; BREFs, band-reflectance-based enhanced features; VIEFs, vegetation index-based enhanced features.
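The enhanced features in Table 2 are binary recodings of the corresponding basic features, obtained by thresholding each feature at its kernel density segmentation threshold (Figures 9 and 10). The sketch below illustrates only the recoding step; the threshold value and the assumption that diseased samples fall below it are placeholders, since the actual thresholds and directions are produced per feature by AutoKDFC.

import numpy as np

def enhance_feature(values, threshold, diseased_below=True):
    """Recode a continuous basic feature into a binary enhanced feature (EF).
    threshold comes from AutoKDFC; diseased_below states which side marks disease."""
    values = np.asarray(values, dtype=float)
    ef = values < threshold if diseased_below else values > threshold
    return ef.astype(np.uint8)

# Illustrative use only: 0.62 is a placeholder, not a threshold reported in the paper.
ndvi = np.array([0.45, 0.71, 0.58, 0.80])        # example NDVI values of four plants
ndvi_ef = enhance_feature(ndvi, threshold=0.62)  # -> array([1, 0, 1, 0], dtype=uint8)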
Table 3. Four model configurations for BFW recognition modeling.
Model | Feature Set | Features Used for Modeling
I | BFs (13) | BRBFs (5) + VIBFs (8)
II | BFs (13) + EFs (10) | BRBFs (5) + VIBFs (8) + BREFs (5) + VIEFs (5)
III | BFs (10) | BRBFs (5) + VIBFs (5)
IV | EFs (10) | BREFs (5) + VIEFs (5)
Note: The numbers in parentheses represent the number of features.
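One way to realize the four configurations in Table 3 is to define each as a list of feature column names and fit the three classifiers on every subset. The sketch below assumes the samples are stored in a pandas DataFrame whose columns follow the naming of Table 2 with a binary "BFW" label column; the hyperparameters, 5-fold cross-validation, and the choice of which five VIs enter Model III are illustrative assumptions rather than the paper's experimental protocol.

from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

BRBFS = ["Rblue", "Rgreen", "Rred", "RRE", "RNIR"]
VIBFS = ["SIPIRE", "SIPI", "CARI", "ARI", "CIgreen", "NDVI", "NDRE", "CIRE"]
VIBFS_WITH_EF = ["ARI", "CIgreen", "NDVI", "NDRE", "CIRE"]  # assumed: the five VIs with EF counterparts
BREFS = [b + "_EF" for b in BRBFS]
VIEFS = [v + "_EF" for v in VIBFS_WITH_EF]

MODELS = {
    "I":   BRBFS + VIBFS,                   # BFs (13)
    "II":  BRBFS + VIBFS + BREFS + VIEFS,   # BFs (13) + EFs (10)
    "III": BRBFS + VIBFS_WITH_EF,           # BFs (10)
    "IV":  BREFS + VIEFS,                   # EFs (10)
}

CLASSIFIERS = {
    "RF":  RandomForestClassifier(n_estimators=500, random_state=0),
    "SVM": SVC(kernel="rbf"),
    "GNB": GaussianNB(),
}

def evaluate(df):
    """df: pandas DataFrame holding the feature columns above plus a binary 'BFW' label."""
    for model_name, cols in MODELS.items():
        for clf_name, clf in CLASSIFIERS.items():
            scores = cross_val_score(clf, df[cols], df["BFW"], cv=5, scoring="f1")
            print(f"Model {model_name} / {clf_name}: mean F1 = {scores.mean():.4f}")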
Table 4. Model performance statistics and comparison (%).
Comparative Experiment | Algorithm | Model | Accuracy | Precision | Recall | F1 Score | Average
Comparative Experiment 1 | RF | Model I | 89.78 | 90.89 | 89.02 | 89.76 | 89.86
 | | Model II | 90.72 | 92.17 | 89.44 | 90.63 | 90.74
 | | Improvement | 0.94 | 1.28 | 0.42 | 0.87 | 0.88
 | SVM | Model I | 88.72 | 90.32 | 87.46 | 88.64 | 88.79
 | | Model II | 91.39 | 93.22 | 89.68 | 91.27 | 91.39
 | | Improvement | 2.67 | 2.90 | 2.22 | 2.63 | 2.61
 | GNB | Model I | 86.78 | 92.32 | 81.30 | 86.13 | 86.63
 | | Model II | 89.72 | 92.86 | 86.75 | 89.49 | 89.71
 | | Improvement | 2.94 | 0.54 | 5.45 | 3.36 | 3.07
Comparative Experiment 2 | RF | Model III | 89.39 | 90.38 | 88.81 | 89.34 | 89.48
 | | Model IV | 90.83 | 92.31 | 89.58 | 90.76 | 90.82
 | | Improvement | 1.44 | 1.93 | 0.77 | 1.42 | 1.39
 | SVM | Model III | 90.56 | 91.63 | 89.69 | 90.50 | 90.60
 | | Model IV | 91.44 | 93.22 | 89.78 | 91.33 | 91.44
 | | Improvement | 0.85 | 1.59 | 0.09 | 0.83 | 0.84
 | GNB | Model III | 88.61 | 92.86 | 84.26 | 88.15 | 88.47
 | | Model IV | 90.78 | 93.08 | 88.55 | 90.58 | 90.62
 | | Improvement | 2.17 | 0.22 | 4.29 | 2.43 | 2.28
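For clarity on how the entries of Table 4 relate to one another: the Average column summarizes the four evaluation metrics of each row, and each Improvement row is the metric-wise difference between the two models above it (for example, RF accuracy in Experiment 1 improves by 90.72 - 89.78 = 0.94). The sketch below shows how such a row could be computed from held-out predictions; the variable names are illustrative.

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

def metric_row(y_true, y_pred):
    """Return accuracy, precision, recall, F1 (in %) plus their mean, as one Table 4 row."""
    metrics = [
        accuracy_score(y_true, y_pred),
        precision_score(y_true, y_pred),
        recall_score(y_true, y_pred),
        f1_score(y_true, y_pred),
    ]
    metrics = [100.0 * m for m in metrics]
    return metrics + [sum(metrics) / len(metrics)]

# Improvement row: element-wise difference between two model rows, e.g.
# improvement = [b - a for a, b in zip(metric_row(y, pred_model_I), metric_row(y, pred_model_II))]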
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
