Detection of Rubber Tree Powdery Mildew from Leaf Level Hyperspectral Data Using Continuous Wavelet Transform and Machine Learning

Abstract: Powdery mildew is one of the most significant rubber tree diseases, with a substantial impact on the yield of natural rubber. This study aims to establish a detection approach that couples continuous wavelet transform (CWT) and machine learning for the accurate assessment of powdery mildew severity in rubber trees. Hyperspectral reflectance data (350–2500 nm) of healthy and powdery mildew-infected leaves were measured with a spectroradiometer in a laboratory. Subsequently, three types of wavelet features (WFs) were extracted using CWT: WFs dimensionally reduced by the principal component analysis (PCA) of significant wavelet energy coefficients (PCA-WFs); WFs extracted from the top 1% of the determination coefficient between wavelet energy coefficients and the powdery mildew disease class (1%R²-WFs); and all WFs at a single decomposition scale (SS-WFs). To assess the detection capability of the WFs, each type of WF was input into random forest (RF), support vector machine (SVM), and back propagation neural network (BPNN) classifiers. As a control, 13 optimal traditional spectral features (SFs) were extracted and combined with the same classification methods. The results revealed that the WF-based models all performed well and outperformed those based on SFs. The models constructed based on PCA-WFs had a higher accuracy and more stable performance than the other models. The model combining PCA-WFs with RF exhibited the best performance among all models, with an overall accuracy (OA) of 92.0% and a kappa coefficient of 0.90. This study demonstrates the feasibility of combining CWT with machine learning in rubber tree powdery mildew detection.


Introduction
The rubber tree is a cash crop native to Brazil and mainly grows in tropical and subtropical regions. The milky latex extracted from rubber trees serves as the primary source of natural rubber and is regarded as one of the four major industrial raw materials, alongside steel, petroleum, and coal [1]. Rubber tree powdery mildew, caused by the infestation of rubber trees by the pathogen Oidium heveae Steinmann, is one of the most significant diseases affecting rubber trees. Its occurrence and prevalence are influenced by a combination of factors, including rubber tree phenology, meteorological conditions, and pathogen population [2]. This disease primarily affects the tender leaves, buds, shoots, and inflorescences of rubber trees, substantially impacting the yield of natural rubber [3]. During the initial stages of infection, rubber tree leaves exhibit radial silvery mycelium. As the disease progresses, the mycelium develops into radially shaped, spider-like lesions, eventually forming unevenly sized powdery mildew spots. In severe cases, the affected leaves become covered with powdery mildew, leading to wrinkling, deformation, yellowing, and the eventual dropping of the leaves [4]. Traditional detection of rubber tree powdery mildew involves periodic leaf surveys during the peak of the disease, with disease management guided by weather forecasts. This method has several drawbacks, including the need for extensive human and material resources, limited representativeness, and an inability to meet the demand for large-scale, rapid detection in production. In addition, some detection sites may provide incomplete or untimely data, weakening their role in providing timely scientific guidance. Therefore, achieving the rapid and accurate detection of rubber tree powdery mildew is of paramount importance.
Hyperspectral remote sensing technology has found widespread application in the identification of crop diseases, owing to its advantages of rapid measurements, large coverage, and non-destructive nature [5,6]. Spectroscopic observations of crop disease symptoms using field spectroradiometers are typically conducted at the leaf and canopy scales. The primary focus of research based on such measurements is on characterizing spectral responses in the visible to near-infrared bands for the assessment of crop diseases, including traditional spectral features (SFs) such as vegetation indices and differential spectral features. Zhang et al. extracted 32 spectral features from indoor spectral data collected from winter wheat leaves and established two regression models, namely, multiple linear regression (MLR) and partial least squares regression (PLSR), to evaluate the severity of winter wheat powdery mildew [7]. Feng et al. collected hyperspectral data from wheat canopies to identify sensitive bands for wheat powdery mildew (570–590 nm and 536–566 nm). Based on this, the authors developed the optimal dual-green vegetation index, which was effectively utilized to detect the disease [8]. Huang et al. combined spectral features of the first-order derivatives, spectral absorption features of continuum removal, and vegetation indices with a support vector machine (SVM) to construct a wheat Fusarium head blight identification model at the wheat ear scale [9].
Following on from the application of traditional spectral features, the continuous wavelet transform (CWT) is an emerging spectral analysis method that enables the multiscale decomposition of spectral data, allowing it to capture intricate spectral variations. As a result, it has been applied to detect and identify crop diseases [10,11]. Zhang et al. combined CWT with PLSR based on hyperspectral data from diseased leaves, enabling the assessment of winter wheat powdery mildew at the leaf level [12]. Luo et al. achieved the estimation of wheat aphid density by performing continuous wavelet analysis on the reflectance spectra of winter wheat leaf samples [13]. Shi et al. proposed a method based on wavelet features (WFs) to reveal the processes related to wheat stripe rust [14]. Ma et al. established a wheat Fusarium head blight identification model based on WFs [15]. The aforementioned studies employed a single thresholding method to determine WFs [16], demonstrating the superiority of CWT in crop disease detection. However, research on detecting rubber tree powdery mildew based on CWT is lacking, particularly using multiple types of WFs for detection. Therefore, the performance of CWT in rubber tree powdery mildew detection needs to be further explored.
Accurate classification algorithms are crucial in the establishment of precise crop disease detection models. In recent years, machine learning models have been extensively applied in the detection and identification of crop diseases due to their excellent classification performance [17–19]. Su et al. developed a wheat yellow rust detection system using the random forest (RF) method, enabling the identification of yellow rust at the farm scale [20]. Wang et al. proposed an SVM-based detection model for wheat disease severity, enabling the identification and severity evaluation of wheat stripe rust and leaf rust at the leaf level [21]. Bohnenkamp et al. also employed an SVM model to analyze images captured using a hyperspectral camera mounted on an unmanned aerial vehicle (UAV) operating 20 m above the ground, achieving the regional-level detection of wheat stripe rust [22]. Lan et al. employed remote sensing images to extract disease features, including texture, shape, and color, and utilized multiple machine learning algorithms, such as SVM and AdaBoost, allowing for the identification and classification of Citrus Huanglongbing [23]. The aforementioned studies have made substantial progress and provide a basis for further research. However, at present, research on the remote sensing detection of rubber tree powdery mildew is limited, highlighting the pressing need to develop new methods aimed at improving the accuracy of detecting rubber tree powdery mildew.
To fill this research gap, namely the limited research on the remote sensing detection of rubber tree powdery mildew and the absence of studies utilizing CWT for feature extraction, this paper uses the hyperspectral reflectance data of rubber tree leaves as the data source and proposes a rubber tree powdery mildew detection model that combines CWT and machine learning to determine the features and methods that can effectively detect rubber tree powdery mildew. This study represents the first attempt in the research community to integrate remote sensing technology with CWT for rubber tree powdery mildew detection, and it is also one of the few studies utilizing multiple types of WFs for crop disease detection. The specific objectives of the study were to (1) evaluate the performance of WFs in detecting rubber tree powdery mildew and (2) construct a rubber tree powdery mildew detection model coupling CWT and machine learning. This study holds significant reference value for the detection and control of rubber tree powdery mildew.

Study Area
The study area is located in the Dai Autonomous Prefecture of Xishuangbanna, Yunnan Province, China (22°2′N, 100°52′E), at an elevation of 852.2 m. Xishuangbanna features a warm and humid climate throughout the year, with annual average temperatures ranging from 18.9 to 23.5 °C and annual rainfall between 1214.8 and 1615.9 mm [24]. Rubber tree powdery mildew is a typical climate-dependent disease, with low temperatures and rainy conditions favoring its occurrence and spread. The experiment was conducted from 28 February to 2 March 2023 in both field and indoor settings. During this period, there was an above-average amount of precipitation, and the average temperatures were relatively low. The rubber tree phenology was predominantly in the bronze to pale-green stage, a critical period for detecting and controlling rubber tree powdery mildew.

Data Acquisition
The FieldSpec Pro FR spectrometer (ASD, Boulder, CO, USA) was employed to collect hyperspectral reflectance data of rubber tree leaves, with a spectral range of 350–2500 nm and spectral resolutions of 3 nm and 10 nm in the regions of 350–1000 nm and 1000–2500 nm, respectively [25,26]. During measurements, the leaves were positioned horizontally on the operating table, and each sample was measured five times to derive the average leaf spectral reflectance. Prior to measurements, spectral calibration was carried out using a standard white reference board, with recalibrations conducted at intervals of 10–20 min. A total of 1250 reflectance spectra were obtained, with the spectral curve resampled to 1 nm.
During the collection of the leaf hyperspectral reflectance data, disease severity levels were assessed according to the technical regulations for rubber tree powdery mildew prediction implemented in China in 2015 (NY/T1089-2015). Disease severity was classified into the following five levels based on the visual appearance of the rubber tree leaves and the size of lesions (Figure 1): (1) healthy (H) leaves without disease lesions; (2) disease Severity 1 (S1) describes leaves covered with a layer of a white powdery substance, with disease lesions occupying one-eighth of the total leaf area; (3) under disease Severity 2 (S2), disease lesions occupy one-quarter of the total leaf area, or leaves are mildly wrinkled due to the disease; (4) for disease Severity 3 (S3), disease lesions occupy half of the total leaf area, or leaves are moderately wrinkled due to the disease; and (5) for disease Severity 4 (S4) leaves, disease lesions occupy three-fourths of the total leaf area, or leaves are severely wrinkled due to the disease.

Severity level | Description
H | Leaves without disease lesions.
S1 | Leaves are covered with a layer of white powdery substance, with disease lesions occupying one-eighth of the total leaf area.
S2 | Disease lesions occupy one-quarter of the total leaf area, or leaves are mildly wrinkled due to the disease.
S3 | Disease lesions occupy half of the total leaf area, or leaves are moderately wrinkled due to the disease.
S4 | Disease lesions occupy three-fourths of the total leaf area, or leaves are severely wrinkled due to the disease.

Traditional Spectral Features
Vegetation indices significantly amplify spectral differences through the combination and transformation of wavebands, yielding favorable outcomes in the remote sensing detection of crop diseases. Applying a spectral differential transform to hyperspectral reflectance data can suppress background effects and stray light reflection signals and enhance implicit information; differential spectral features are therefore widely used in crop disease remote sensing detection research. In this study, based on previous research, we selected 23 spectral features of these two types, comprising 11 vegetation indices and 12 differential spectral features. These features were evaluated for their sensitivity to rubber tree powdery mildew using correlation analysis. Table 2 summarizes the definitions, descriptions, and references of these 23 features.


Feature | Definition | Description | Reference
TCARI | Transformed chlorophyll absorption and reflectance index | |
Dy | First-order maximal derivative inside yellow edge | Yellow edge covers 550–582 nm. Dy is the maximum value of the first-order derivatives within the yellow edge of 33 bands | [37]
SDy | Summation of first-order derivatives inside yellow edge | Defined by the sum of the first-order derivative values of 33 bands within the yellow edge | [37]
λy | Wavelength at Dy | λy is the wavelength at Dy | [37]
Dr | First-order maximal derivative inside red edge | Red edge covers 670–737 nm. Dr is the maximum value of the first-order derivatives within the red edge of 68 bands | [37]
SDr | Summation of first-order derivatives inside red edge | Defined by the sum of the first-order derivative values of 68 bands within the red edge | [37]
λr | Wavelength at Dr | λr is the wavelength at Dr |
(SDr − SDy)/(SDr + SDy) | Normalized value of the SDr and SDy | (SDr − SDy)/(SDr + SDy) | [37]
Note: Rx is the reflectance at wavelength x nm.
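The yellow-edge and red-edge derivative features defined above can be illustrated with a short sketch. It assumes a spectrum sampled at 1 nm and uses a synthetic logistic red edge rather than measured data; the function name `edge_features` and the use of `numpy.gradient` for the first-order derivative are illustrative choices, not the procedure used in this study.

```python
import numpy as np

def edge_features(wavelengths, reflectance, lo, hi):
    """Return (D, SD, lam) for the first-derivative spectrum inside [lo, hi] nm.

    D   : maximum first-order derivative inside the window
    SD  : sum of first-order derivatives inside the window
    lam : wavelength at which D occurs
    """
    deriv = np.gradient(reflectance, wavelengths)      # first-order derivative
    mask = (wavelengths >= lo) & (wavelengths <= hi)
    d = deriv[mask]
    i = int(np.argmax(d))
    return float(d[i]), float(d.sum()), float(wavelengths[mask][i])

# Synthetic spectrum (assumption): a logistic red edge centred at 705 nm.
wl = np.arange(350, 2501, dtype=float)
refl = 0.05 + 0.45 / (1.0 + np.exp(-(wl - 705.0) / 10.0))

Dy, SDy, lam_y = edge_features(wl, refl, 550, 582)     # yellow edge, 33 bands
Dr, SDr, lam_r = edge_features(wl, refl, 670, 737)     # red edge, 68 bands
norm_sd = (SDr - SDy) / (SDr + SDy)                    # normalized SDr and SDy
```

With real measurements, `refl` would be replaced by the averaged leaf reflectance of one sample, and the features would be computed per sample before the correlation analysis.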

Continuous Wavelet Transform and Features Extraction
Continuous wavelet transform [38] is an important signal processing method that can localize a signal in both the frequency and time domains and refine functions or signals at different scales and positions. In CWT, correlating the original spectral curves with Gaussian functions at different positions and scales generates a series of continuous wavelet energy coefficients. These energy coefficients are capable of extracting weak information from the spectra of various diseases. The output of CWT is as follows [39]:

$$W_f(a, b) = \langle f, \psi_{a,b} \rangle = \int_{-\infty}^{+\infty} f(\lambda)\, \psi_{a,b}(\lambda)\, d\lambda$$

where f(λ) is the reflectance spectrum, λ = 1, 2, ..., m, with m denoting the number of bands (here, m = 2151); W_f(a, b) represents the wavelet coefficients that constitute a scalogram; and ψ_{a,b}(λ) represents the mother wavelet basis function, with its generic form as follows:

$$\psi_{a,b}(\lambda) = \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{\lambda - b}{a}\right)$$

where a is the scaling factor indicating the width of the wavelet, and b is the shifting factor representing the position of the wavelet. In this study, the Mexican hat wavelet (mexh), whose shape resembles vegetation absorption features, was selected as the mother wavelet basis function [40]. To facilitate computation without compromising the accuracy of CWT, only the wavelet powers at dyadic scales (i.e., 2^n, n = 1, 2, ..., 10) [41] were retained and referred to as the 1st scale, 2nd scale, ..., up to the 10th scale.
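As an illustration of the transform described above, the following sketch evaluates a discrete approximation of the wavelet coefficients with a Mexican hat mother wavelet at the ten dyadic scales. It is a NumPy-only approximation under stated assumptions, not the MATLAB implementation used in this study: the integral is replaced by a sum over the 1 nm bands, no edge padding is applied, and the spectrum is synthetic (a flat background with one Gaussian absorption dip). The function names `mexh` and `cwt_mexh` are hypothetical.

```python
import numpy as np

def mexh(t):
    """Mexican hat mother wavelet (negative second derivative of a Gaussian)."""
    return (2.0 / (np.sqrt(3.0) * np.pi ** 0.25)) * (1.0 - t ** 2) * np.exp(-0.5 * t ** 2)

def cwt_mexh(spectrum, scales):
    """Discrete approximation of W_f(a, b); returns shape (n_scales, n_bands)."""
    f = np.asarray(spectrum, dtype=float)
    idx = np.arange(f.size, dtype=float)
    shift = idx[:, None] - idx[None, :]        # (lambda - b) for every band pair
    out = np.empty((len(scales), f.size))
    for i, a in enumerate(scales):
        psi = mexh(shift / a) / np.sqrt(a)     # psi_{a,b}(lambda)
        out[i] = f @ psi                       # sum over lambda for each shift b
    return out

wavelengths = np.arange(350, 2501)             # 2151 bands at 1 nm
scales = 2.0 ** np.arange(1, 11)               # dyadic scales 2^1 ... 2^10
# Synthetic spectrum (assumption): one Gaussian absorption dip at 1450 nm.
spectrum = 0.5 - 0.2 * np.exp(-0.5 * ((wavelengths - 1450) / 16.0) ** 2)
coeffs = cwt_mexh(spectrum, scales)            # rows: 1st to 10th scale
```

Because the Mexican hat integrates to zero, a flat spectrum yields coefficients close to zero away from the edges, while an absorption dip yields a strong negative coefficient at its centre. Coefficients near the two ends of the spectrum are affected by truncation and should be interpreted with care.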
Three types of WFs were extracted based on the CWT: wavelet features dimensionally reduced by principal component analysis (PCA) of significant wavelet energy coefficients (PCA-WFs); wavelet features extracted using the top 1% of the coefficient of determination values between wavelet energy coefficients and the powdery mildew disease class (1%R²-WFs); and all wavelet features at a single decomposition scale (SS-WFs). Figure 2 presents the workflow used in this study. All CWT-related analyses were conducted using MATLAB 2016a (MathWorks).
(1) PCA-WFs
Principal component analysis is a data dimensionality reduction algorithm that maps the original set of n-dimensional features onto k dimensions through a linear transformation. This process generates a new set of k-dimensional orthogonal features, often referred to as principal components, which are ranked in descending order of variance [42]. PCA can be applied in two ways: (1) dimensionality reduction by specifying the dimension of the lower-dimensional space and (2) dimensionality reduction by specifying a contribution rate. It is widely applied in the fields of data compression, feature extraction, and pattern recognition.
We determined the correlations between the calculated wavelet energy coefficients and disease severity levels, resulting in multiple significant results at different wavebands and scales. PCA was then employed to reduce the dimensionality of the significant results based on specified contribution rates. The features obtained through the process are referred to as PCA-WFs.
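Dimensionality reduction by a specified contribution rate can be sketched as follows. The example assumes that the significant wavelet energy coefficients have already been collected into a samples-by-features matrix; here that matrix is synthetic. The function name `pca_by_contribution`, the centring-only preprocessing, and the SVD-based computation are illustrative assumptions rather than the exact procedure used in this study.

```python
import numpy as np

def pca_by_contribution(X, rate=0.95):
    """Project X (n_samples, n_features) onto the fewest principal components
    whose cumulative explained-variance ratio reaches `rate`."""
    Xc = X - X.mean(axis=0)                          # centre each feature
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    ratio = s ** 2 / np.sum(s ** 2)                  # variance share per component
    k = int(np.searchsorted(np.cumsum(ratio), rate)) + 1
    k = min(k, s.size)
    return Xc @ Vt[:k].T, ratio[:k]

# Synthetic stand-in (assumption): 200 samples, 50 correlated features
# driven by 3 latent factors plus a small amount of noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
loadings = rng.normal(size=(3, 50))
X = latent @ loadings + 0.01 * rng.normal(size=(200, 50))

scores, ratio = pca_by_contribution(X, rate=0.95)    # scores play the role of PCA-WFs
```

The number of retained components is determined by the data rather than fixed in advance, which corresponds to the second of the two ways of applying PCA described above.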
(2) 1%R²-WFs
The calculated wavelet energy coefficients were subjected to correlation analysis with the severity levels of rubber tree powdery mildew, generating coefficients of determination (R²). The R² values of wavelet energy coefficients at different scales and wavebands form a correlation scalogram, characterizing the sensitivity of each wavelet energy coefficient to rubber tree powdery mildew. We sorted the R² values in descending order across all scales, and the top 1% of R² values were retained as wavelet feature regions. To minimize redundancy, within each wavelet feature region, only the wavelet energy coefficient with the highest R² was preserved as an extracted 1%R²-WF.
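The correlation scalogram and the top 1% selection can be sketched as follows. The sketch makes two simplifying assumptions that are not stated in this study: R² is taken as the squared Pearson correlation between each coefficient and the severity level, and a wavelet feature region is treated as a contiguous run of above-threshold bands within one scale. The data are synthetic, and the function names are hypothetical.

```python
import numpy as np

def r2_scalogram(coeffs, severity):
    """coeffs: (n_samples, n_scales, n_bands); severity: (n_samples,).
    Returns squared Pearson correlations with shape (n_scales, n_bands)."""
    c = coeffs - coeffs.mean(axis=0)
    y = severity - severity.mean()
    cov = np.tensordot(y, c, axes=(0, 0))
    denom = np.sqrt((c ** 2).sum(axis=0) * (y ** 2).sum())
    r = np.divide(cov, denom, out=np.zeros_like(cov), where=denom > 0)
    return r ** 2

def top_percent_features(r2, percent=1.0):
    """Keep the top `percent` of R² values and return the threshold plus one
    (scale, band) index per contiguous above-threshold run within a scale."""
    thr = np.percentile(r2, 100.0 - percent)
    picks = []
    for s in range(r2.shape[0]):
        above = r2[s] >= thr
        b = 0
        while b < above.size:
            if above[b]:
                e = b
                while e + 1 < above.size and above[e + 1]:
                    e += 1
                picks.append((s, b + int(np.argmax(r2[s, b:e + 1]))))
                b = e + 1
            else:
                b += 1
    return thr, picks

# Synthetic data (assumption): noise everywhere, plus a severity-driven
# signal at scale index 0, bands 50-54, strongest at band 52.
rng = np.random.default_rng(1)
severity = rng.integers(0, 5, size=100).astype(float)
coeffs = rng.normal(size=(100, 3, 200))
coeffs[:, 0, 50:55] += severity[:, None] * np.array([0.5, 1.0, 2.0, 1.0, 0.5])

r2 = r2_scalogram(coeffs, severity)
thr, picks = top_percent_features(r2, percent=1.0)
```

Keeping only the strongest coefficient per region reduces the redundancy among neighbouring bands, which are highly correlated with one another.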
(3) SS-WFs
Because CWT decomposes spectral information at various positions and multiple scales, we selected the wavelet energy coefficients at a single decomposition scale as wavelet features in order to effectively capture the spectral changes caused by rubber tree powdery mildew.

Model Construction
The extracted PCA-WFs, 1%R²-WFs, and SS-WFs were employed as feature variables and combined with three classification methods, namely, RF, SVM, and back propagation neural network (BPNN), to establish rubber tree powdery mildew detection models. The models were used to select suitable feature variables and classification methods for detecting rubber tree powdery mildew. To demonstrate the superiority of WFs in rubber tree powdery mildew detection, the study also combined the extracted SFs with the same classification methods to build detection models for comparison.
RF is a classification model based on ensemble learning [43], composed of a set of independent, unpruned decision trees. Multiple sample sets are generated through random sampling with replacement, and classification trees are constructed using a fully split method. The final classification result is obtained by aggregating the classification results from all individual binary decision trees. Compared with other classification models, RF has the advantages of relatively simple parameter tuning, resistance to overfitting, and improved robustness.
SVM is a supervised learning algorithm [44] that employs kernel functions to map data into high-dimensional feature spaces and subsequently performs a straightforward linear regression [45]. The resulting linear regression function in the high-dimensional feature space corresponds to a non-linear regression in the original input space, enabling the modeling and classification of complex data relationships [46]. SVM exhibits excellent performance in high-dimensional spaces and with small-sample data, effectively preventing overfitting, and can produce robust models even when dealing with limited training data.
BPNN is a widely used neural network composed of three layers: the input layer, the hidden layer, and the output layer. Information is propagated forward through connections between these layers, and error backpropagation and weight updates are performed using the backpropagation algorithm to minimize the error between predicted and actual outputs [47]. BPNN exhibits high self-learning potential and broad applicability, making it suitable for handling tasks involving non-linear and complex data pattern recognition.
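A minimal sketch of training the three classifiers on one feature set is given below, using scikit-learn. Everything specific in it is an assumption made for illustration: the data are synthetic (five classes, 25 features), `MLPClassifier` with one hidden layer stands in for the BPNN, the inputs of the SVM and the neural network are standardized, and all hyperparameter values are placeholders rather than the settings used in this study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in (assumption) for a feature matrix with five severity classes.
X, y = make_classification(n_samples=500, n_features=25, n_informative=10,
                           n_classes=5, n_clusters_per_class=1, random_state=0)

# 70/30 split of the samples, stratified by class.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Hyperparameters below are illustrative placeholders.
models = {
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale")),
    "BPNN": make_pipeline(StandardScaler(),
                          MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000,
                                        random_state=0)),
}

# Fit each model and record its accuracy on the held-out samples.
scores = {name: model.fit(X_train, y_train).score(X_test, y_test)
          for name, model in models.items()}
```

Running the same loop once per feature set (PCA-WFs, 1%R²-WFs, SS-WFs, and SFs) gives a like-for-like comparison of features and classifiers.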

Accuracy Assessment
To assess the accuracy and stability of the proposed detection models, the sample data were divided into training and validation sets at a ratio of 7:3. A confusion matrix was constructed using true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). The overall accuracy (OA) and kappa coefficient were computed as evaluation parameters to assess the model performance. OA is a comprehensive measure of the quality of the classification results, whereas the kappa coefficient gauges the consistency between the model's predictions and the actual classes [48,49]. The model development and data analysis were conducted using the Python machine learning library scikit-learn version 0.24.2 on a Windows 10 operating system [50].
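For reference, both evaluation parameters can be computed directly from a multi-class confusion matrix. With n_ij the number of samples of true class i predicted as class j and N the total number of samples, OA is the sum of the diagonal entries divided by N, and the kappa coefficient is (p_o − p_e)/(1 − p_e), where p_o = OA and p_e is the agreement expected by chance from the row and column totals. The sketch below uses a small hand-made example; the labels are illustrative and are not results from this study.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def overall_accuracy(cm):
    """Share of samples on the diagonal of the confusion matrix."""
    return np.trace(cm) / cm.sum()

def kappa(cm):
    """Cohen's kappa: agreement beyond what is expected by chance."""
    n = cm.sum()
    po = np.trace(cm) / n
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2
    return (po - pe) / (1.0 - pe)

# Illustrative labels: classes 0-4 stand for H, S1, S2, S3, S4.
y_true = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
y_pred = [0, 0, 1, 1, 2, 2, 3, 3, 4, 0]   # one S4 leaf misclassified as H

cm = confusion_matrix(y_true, y_pred, n_classes=5)
oa = overall_accuracy(cm)                  # 9 of 10 correct
k = kappa(cm)
```

In this example, OA is 0.9 and the chance agreement p_e is 0.2, which gives a kappa coefficient of 0.875.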

Spectral Responses of Rubber Tree Powdery Mildew
Spectral Responses in Leaves under Powdery Mildew Infection
Figure 3 presents the spectral reflectance curves of rubber tree leaf samples at different disease severity levels. The spectral reflectance curves of healthy and diseased leaves exhibit a similar general trend. However, noticeable differences are observed in the magnitude of spectral reflectance between samples of different disease severity levels. At the early stages of infection, during the transition from H to S1, the leaf spectral reflectance presents a sudden downward shift. As the disease severity level advances, the leaf spectral curves exhibit an overall upward trend. This increase is evident in both the visible and near-infrared wavelength ranges and is most prominent in the near-infrared range. The progress in disease severity clearly affects the reflectance peak at 550 nm, the absorption valley at 670 nm, and the peaks and valleys of the water absorption bands at 1450 nm and 1950 nm. These results indicate that powdery mildew infestation leads to remarkable changes in leaf color, pigments, and moisture content. Furthermore, there is a relatively small differentiation in the spectral curves between S2 and S3. However, a notable distinction in spectral reflectance is observed between S3 and S4, indicating prominent changes in the leaf characteristics during the late stages of disease infestation.

Wavelet Coefficient Responses under Powdery Mildew Infection
CWT decomposes the spectral information at multiple scales for each position, facilitating the effective detection of spectral changes induced by rubber tree powdery mildew. Figure 4 depicts the wavelet coefficients at different scales. The results reveal remarkable variations in WFs with different disease severity levels across different wavelet scales. As the wavelet scale increases, the spectral variations from wavelet decomposition become coarser. At lower scales, such as the 1st to 4th scales, the spectral changes from wavelet decomposition appear finer across multiple spectral regions. This is particularly true in the visible light range, where pigment-dominated spectral features are prominent, as well as the dominant moisture regions near 1400 nm and 1950 nm in the near-infrared range. At higher scales, such as the 10th scale, although there is a notable difference in wavelet coefficient values among different disease severity levels, the spectral decomposition ability is reduced, resulting in an overall change that is coarser.


Traditional Spectral Features
Table 3 reports the calculated correlation coefficients between the SFs and disease severity levels. With the exception of the maximum first-order differential value Dy within the yellow edge, all SFs exhibited highly significant correlations with the severity of powdery mildew (p-value < 0.001). Among them, the absolute values of the correlation coefficients for MCARI and TCARI exceeded 0.4, indicating strong correlations. Moreover, the majority of SFs exhibited relatively high correlations with powdery mildew, with |R| > 0.3. Based on the correlation results, 13 SFs with absolute correlation coefficients exceeding 0.2 (i.e., demonstrating highly significant correlations) were selected to construct the model.

Wavelet Features
(1) PCA-WFs
p-values were computed to assess the correlation between the wavelet energy coefficients at different scales and disease severity levels, allowing us to determine the presence of significant relationships between the two variables. Features with p-values < 0.05 (i.e., those with significant correlations) were retained, resulting in a total of 919 features. Dimensionality reduction using PCA was applied to the 919 significant features, with a specified cumulative contribution rate of 95%. This resulted in the selection of 25 features, as illustrated in Figure 5.
(2) 1%R²-WFs
Figure 6 presents the correlation scalogram for disease severity levels and rubber tree leaf spectral reflectance generated using CWT. The R² values ranged from 0 to 0.425, and the threshold of R² values in the top 1%, sorted in descending order, was 0.233. The extracted wavelet feature regions are highlighted in red in Figure 6. The results indicate that the extracted WFs are primarily located at relatively lower scales, with the majority at the 1st scale and the highest at the 4th scale. These features are concentrated in the near-infrared regions of 750–1350 nm, 1550–1800 nm, and 2200–2400 nm, which correspond to the sensitive regions of the original spectrum. To reduce redundancy, within each wavelet feature region, the wavelet energy coefficients with the highest R² are retained as the 1%R²-WFs, resulting in a total of 59 features. Furthermore, all of the selected 59 features exhibit highly significant correlations with disease severity levels (p-value < 0.001) [51], indicating that the 1%R²-WFs are sensitive to rubber tree powdery mildew.
(3) SS-WFs

Using the RF algorithm, detection models for rubber tree powdery mildew were individually constructed at 10 different decomposition scales, each based on all WFs at a single decomposition scale (Figure 7). The results indicate that the 1st scale of WFs exhibited superior accuracy, with an OA of 94.7%. As the wavelet scale increased, spectral variations from wavelet decomposition became coarser, and the ability to detect disease symptoms gradually weakened, resulting in a decline in accuracy. These findings demonstrate the applicability of the 1st scale of the WFs as the SS-WFs for the subsequent model construction.
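To make the idea of "all WFs at a single decomposition scale" concrete, the following sketch computes wavelet coefficients of a one-dimensional spectrum at one scale. It assumes a Mexican hat mother wavelet, zero padding at the spectrum edges, and a kernel truncated at eight scale units; none of these choices is specified in this section, so the code is illustrative rather than a reproduction of the study's processing.

```python
import math

def mexican_hat(t):
    """Mexican hat wavelet (negative second derivative of a Gaussian, normalized)."""
    norm = 2.0 / (math.sqrt(3.0) * math.pi ** 0.25)
    return norm * (1.0 - t * t) * math.exp(-0.5 * t * t)

def cwt_at_scale(signal, scale):
    """Wavelet coefficients of `signal` at one scale, one coefficient per sample."""
    half = int(math.ceil(8 * scale))  # the wavelet is negligible beyond this width
    kernel = [mexican_hat(k / scale) / math.sqrt(scale)
              for k in range(-half, half + 1)]
    n = len(signal)
    coeffs = []
    for b in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            i = b + j - half
            if 0 <= i < n:  # samples outside the spectrum are treated as zero
                acc += signal[i] * w
        coeffs.append(acc)
    return coeffs

# Synthetic "spectrum": a narrow Gaussian feature centred at sample 100.
spectrum = [math.exp(-0.5 * ((i - 100) / 3.0) ** 2) for i in range(200)]
fine = cwt_at_scale(spectrum, 2)     # small scale: responds to narrow features
coarse = cwt_at_scale(spectrum, 32)  # large scale: responds to broad shapes
```

Each call returns one coefficient per band, so a single-scale feature set has as many features as the spectrum has bands. Larger scales stretch the wavelet over more bands, which is consistent with the observation above that spectral variations become coarser as the scale increases.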
Table 4 summarizes the confusion matrices, OA, and kappa coefficients of the rubber tree powdery mildew detection models constructed by combining the three types of WFs with the three machine learning classification methods. For each WF type (PCA-WFs, 1%R²-WFs, and SS-WFs), the models based on RF achieved the highest OA and kappa coefficients, with overall accuracies of 92.0%, 88.0%, and 94.7%, and kappa coefficients of 0.90, 0.85, and 0.93, respectively. The BPNN models exhibited the second-best performance, while the SVM models achieved the lowest accuracy. Therefore, the choice of an appropriate classification method significantly impacts the accuracy of crop disease detection models. Using SS-WFs as the input achieved the highest accuracy of 94.7% and a kappa coefficient of 0.93 for the RF model. However, its performance in the SVM model was notably lower, with an accuracy of only 53.3%, and in the BPNN model, it reached just 70.7%. Substantial variations were thus observed among the classification methods. In contrast, PCA-WFs consistently exhibited relatively high accuracy and exceptional stability in all three machine learning models, with overall accuracies of 92.0%, 86.7%, and 84.0%, and kappa coefficients of 0.90, 0.83, and 0.80, respectively. Therefore, this study suggests that PCA-WFs are more suitable as representative wavelet features for the detection of rubber tree powdery mildew.

After determining PCA-WFs as the optimal wavelet features, we compared the detection performance of PCA-WFs and SFs for rubber tree powdery mildew (Table 5). When the 13 selected SFs (MCARI, TCARI, D_b, SD_b, (SD_r − SD_y)/(SD_r + SD_y), TVI, SD_r, D_r, RVSI, SD_r/SD_b, ARI, (SD_r − SD_b)/(SD_r + SD_b), and SD_y) were used as input, the detection models constructed using the RF, SVM, and BPNN algorithms exhibited overall accuracies of 45.3%, 41.3%, and 44.0%, with kappa coefficients of 0.32, 0.26, and 0.30, showing subpar performance for powdery mildew detection. In contrast, the disease recognition models based on PCA-WFs exhibited a prominently higher accuracy than the models based on SFs, with overall accuracies increasing by 46.7, 45.4, and 40.0 percentage points, and kappa coefficients increasing by 0.58, 0.57, and 0.50, respectively.
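For readers less familiar with the two evaluation metrics, the sketch below shows how OA and the kappa coefficient are obtained from a confusion matrix. The matrix used here is hypothetical and is not taken from Table 4; the convention that rows are actual classes and columns are predicted classes is also an assumption.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def cohen_kappa(cm):
    """Cohen's kappa: agreement beyond what the class marginals give by chance."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = sum(cm[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    return (p_observed - p_expected) / (1.0 - p_expected)

# Hypothetical two-class matrix (rows: actual, columns: predicted).
example_cm = [[20, 5],
              [10, 15]]
oa = overall_accuracy(example_cm)
kappa = cohen_kappa(example_cm)
```

Because kappa subtracts the agreement expected by chance, it is always lower than OA for an imperfect classifier, which matches the pattern in the reported results (for example, an OA of 92.0% alongside a kappa of 0.90).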
These results indicate that WF-based models outperform those built with SFs. WFs perform well in distinguishing healthy and powdery mildew-infected rubber tree leaves, demonstrating the feasibility of combining CWT with machine learning for rubber tree powdery mildew detection. Among the models that were stable across classifiers, the one combining PCA-WFs and RF exhibits the best performance, with an OA of 92.0% and a kappa coefficient of 0.90. This may be attributed to the multiscale decomposition of spectral information via CWT at various positions, which reveals spectral variations related to pigments, moisture, morphology, and structure induced by rubber tree powdery mildew. In addition, applying PCA for dimensionality reduction condenses the informative positions and scales into a small set of features.

Discussion
The natural rubber produced from latex extracted from rubber trees is a vital industrial resource. The global annual economic losses due to rubber tree powdery mildew reach billions of dollars [52]. The implementation of early warning capabilities prior to the occurrence of powdery mildew epidemics, or precise identification in the early stages of infection, could reduce pesticide usage, prevent unnecessary environmental pollution, and minimize economic losses, thus effectively controlling the outbreak of powdery mildew.
This study introduces a rubber tree powdery mildew detection model that combines PCA-WFs with RF, representing the first attempt in the research community to integrate remote sensing technology with CWT for rubber tree powdery mildew detection. It is also one of the few studies utilizing multiple types of WFs for crop disease detection. The optimal rubber tree powdery mildew detection model achieved an OA of 92.0% and a kappa coefficient of 0.90, demonstrating the substantial potential of using CWT technology for detecting rubber tree powdery mildew.
Despite these results, this study has several limitations. We employed three different machine learning classifiers to categorize rubber tree powdery mildew severity levels. Notably, these classifiers exhibited substantial discrepancies. For each type of feature, models based on RF consistently achieved the highest OA and kappa coefficients compared to BPNN and SVM. This may be attributed to the overall stability of the RF model. Because RF is an ensemble method based on multiple decision trees, the risk of overfitting in individual trees is reduced. This results in a more robust overall model that consistently performs well across different datasets. In addition, although SS-WFs performed less effectively with BPNN and SVM, using them as the input achieved the highest accuracy of 94.7% and a kappa coefficient of 0.93 for the RF model. A possible reason for SS-WFs outperforming PCA-WFs for the RF model is that SS-WFs retain all wavelet features at the first scale, encompassing data from various spectral positions and thus providing a greater amount of information. Additionally, compared to BPNN and SVM, the RF model excels in handling high-dimensional data [53]. The abundance of features in SS-WFs facilitates more effective feature selection by the RF model, enabling it to make more accurate and robust decisions. Moreover, machine learning classifiers have a limited generalization capability. As a result, the performance of disease identification on the same leaf may vary in different environmental settings, such as varying lighting conditions [54]. Consequently, selecting appropriate classification models is crucial when dealing with distinct diseases.
Furthermore, the study area encompasses various rubber tree varieties. Leaves from different varieties generally exhibit variations in color and shape. However, this study did not consider the potential impact of different varieties on the identification of rubber tree powdery mildew. Thus, the influence of distinct rubber tree varieties on the performance of disease detection models is worthy of further research. The lack of practical application experiments also constitutes a current limitation, necessitating additional research and experimentation to bridge the gap between theoretical advancements and real-world application. Moving forward, we aim to collaborate with agricultural practitioners to implement this detection method under real-world conditions. The integration of feedback from practical applications will contribute to refining and optimizing the research.
While our study results can be applied to indicate foliar lesions related to powdery mildew, the current research did not explore the canopy structure characteristics of rubber trees. In recent years, the rapid advancement of UAV technology has achieved certain successes in pest and disease detection applications [55][56][57]. Compared to proximal spectral measurement devices, instruments onboard UAVs can rapidly and non-invasively collect spectral information from entire crop fields, enabling a more in-depth exploration of the canopy structure, texture, and other information related to rubber trees. Therefore, integrating UAV technology with CWT for the future identification of rubber tree powdery mildew is expected to greatly enhance detection efficiency and accuracy.

Conclusions
In this study, based on hyperspectral data of rubber tree powdery mildew, three types of WFs were extracted using CWT for the detection of rubber tree powdery mildew, namely, PCA-WFs, 1%R²-WFs, and SS-WFs. These features were combined with three classification methods, RF, SVM, and BPNN, to establish rubber tree powdery mildew detection models. Based on the results, the following conclusions were drawn:

(1) The overall magnitude of the spectral curves of rubber tree leaves increases in the visible and near-infrared wavelength ranges as the disease severity progresses.

(2) All three selected WFs can effectively detect the severity of rubber tree powdery mildew. The models constructed based on PCA-WFs exhibited relatively high accuracy and exceptional stability. The models based on WFs all outperform those based on SFs. For example, with the RF classification method, the OA improves by 46.7, 49.4, and 42.7 percentage points for PCA-WFs, SS-WFs, and 1%R²-WFs, respectively. For models based on the same WF type, those constructed with the RF classification method achieve the highest OA and kappa coefficient. In particular, compared with the BPNN and SVM methods, the model's OA improves by more than 5% and 8%, respectively.

(3) The model combining PCA-WFs with RF demonstrates the best performance among all models, achieving an OA of 92.0% and a kappa coefficient of 0.90. This demonstrates the feasibility of CWT in the detection of rubber tree powdery mildew.
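The RF improvements quoted in conclusion (2) can be re-derived from the overall accuracies reported in the Results. The short check below uses only those reported values; the dictionary layout and function name are illustrative, and the differences are expressed in percentage points.

```python
# Overall accuracies (%) reported in the Results for the RF classifier.
rf_oa_by_feature = {"PCA-WFs": 92.0, "SS-WFs": 94.7, "1%R2-WFs": 88.0}
rf_oa_spectral = 45.3  # RF with the 13 traditional spectral features

def gains_in_points(feature_oa, baseline):
    """Difference between each OA and the baseline, in percentage points."""
    return {name: round(oa - baseline, 1) for name, oa in feature_oa.items()}

gains = gains_in_points(rf_oa_by_feature, rf_oa_spectral)
for name, gain in gains.items():
    print(f"{name}: +{gain} percentage points over SFs")
```

The three differences correspond to the figures 46.7, 49.4, and 42.7 given above, which is how each value can be matched to its WF type.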
However, it is important to note that our study lacks additional independent datasets for validation. Future research will acquire more diverse leaf samples representing different rubber tree varieties to serve as independent datasets, which should enhance the robustness and generalizability of our findings. Moreover, future research will focus on employing UAV remote sensing imagery and performing comprehensive analyses of rubber tree canopy structure, texture, and related information for rubber tree powdery mildew identification, aiming to establish a more complete and holistic detection model.
Feature | Definition | Description or formula | Reference
D_b | Maximal derivative inside blue edge | Blue edge covers 490–530 nm. D_b is the maximum value of the first-order derivatives within the blue edge of 41 bands | [37]
SD_b | Summation of first-order derivatives inside blue edge | Defined by the sum of the first-order derivative values of 41 bands within the blue edge | [37]
λ_b | Wavelength at D_b | λ_b is the wavelength at D_b | [37]
SD_r/SD_b | Ratio of SD_r and SD_b | SD_r/SD_b | [37]
(SD_r − SD_b)/(SD_r + SD_b) | Normalized value of SD_r and SD_b | (SD_r − SD_b)/(SD_r + SD_b) | [37]

Figure 3. Spectral reflectance curves of healthy and powdery mildew-infected rubber tree leaves.

Figure 7. Detection accuracy of the RF model based on all wavelet features at a single decomposition scale.

3.3. Evaluation of Rubber Tree Powdery Mildew Detection Models
3.3.1. Comparison of Disease Detection Models Based on Different Wavelet Features

Table 2. Traditional spectral features selected in the study.

Table 3. Correlation between traditional spectral features and disease severity.
Note: ** indicates a highly significant correlation with a p-value < 0.001.

Table 4. Comparison of detection model results based on different wavelet features and algorithms.

Table 5. Comparison of the detection model accuracy between optimal wavelet features and traditional spectral features.