The avocado crop in Florida is considered the second most economically important crop after citrus. Avocados account for approximately 7500 acres in Miami-Dade County, with an economic impact of more than $54 million [1]. Since 2011, the Florida avocado industry has lost thousands of trees to a deadly disease named laurel wilt (Lw). Laurel wilt was first identified in Savannah, Georgia, in 2002 [2]. In 2011, the ambrosia beetle attacked commercial avocado production in Florida [3]. The redbay ambrosia beetle, Xyleborus glabratus Eichhoff (Coleoptera, Curculionidae, Scolytinae), is associated with fungal symbionts such as Raffaelea lauricola [4], a fungus that blocks the flow of water to other branches, causing wilting. A tree can die within a few weeks [5].
It is not easy to recognize an infected tree because the symptoms of Lw resemble those of other disorders in both early and late stages [6]. Phytophthora root rot, salt damage, freezing, and nutrient deficiency produce the same symptoms in the early stages. Therefore, the disease must be identified by a qualified disease diagnostician.
At an early stage, leaves turn yellowish, and at late stages they turn reddish to purplish. The crown of the tree sometimes wilts only partially, so an expert is needed to distinguish which disease is affecting the tree. The expert removes the bark and inspects the sapwood to confirm a deep fungal infection. If the tree is infected by Lw, the grower should remove it to disrupt the life cycle of the ambrosia beetle [7]. Currently, there is no known chemical treatment for Lw. The best method for sterilization is to remove the tree, including the roots, and burn it in the same grove [8]. Time is of utmost concern in disrupting the disease and improving the odds of survival because Lw kills a tree in a few weeks.
Timely and accurate detection methods are essential to prevent the disease from spreading to another area or state [9]. Several methods have been utilized to detect diseases in the field [10]. Precision and sustainable applications in agriculture require modern, rapid techniques and technologies for assessing crop health and stress status [13], detecting and precisely treating pests [15] and diseases, developing site-specific traceability systems [17], etc. Nondestructive and accurate methods are necessary to improve early disease detection. Abdulridha et al. and Sankaran et al. [6] developed rapid techniques to detect Lw in avocado and distinguish it from other diseases and deficiencies that produce similar symptoms, utilizing hyperspectral and multispectral data and several classification algorithms (neural networks).
Hyperspectral data have been used in recent years to analyze, detect and classify plant diseases and environmental factors related to plant stress. Moshou et al. [20] detected yellow rust (Puccinia striiformis) in winter wheat at the asymptomatic stage using hyperspectral and fluorescence imaging (450–900 nm). The spectral data were compared with multispectral fluorescence images. After comparing the 550 and 690 nm bands, they found that the disease could be detected in its early stages. Varpe et al. [21] monitored the variance in chlorophyll content in Syzygium cumini plant species using spectral indices derived from hyperspectral data. The regression models demonstrate a good correlation between spectral indices (e.g., mSR, R² = 0.157; BIG2, R² = 0.069; ARI, R² = 0.454) and photosynthetic color contents in all species. Ahmadi et al. [22] utilized spectroradiometer techniques to detect early-stage Ganoderma basal stem rot in Malaysian oil palm. Artificial neural networks were used to distinguish infected plants from healthy ones, and satisfactory results (83%–100%) were obtained with first-derivative spectral data in the 540 nm to 550 nm range. Corti et al. [23] developed a technique to monitor nitrogen level and water stress in crops with different canopy geometries (rice and spinach) in greenhouses, using three detection methods: multispectral, hyperspectral and thermal imaging. Multivariate regression analysis was applied successfully between spectral wavelengths, and specific wavelengths were proposed for field detection in the same crops. Bravo et al. [24] mounted a spectrograph at spray boom height to detect yellow rust in wheat crops. A classification model based on quadratic discrimination was built on a group of wavebands selected by stepwise variable selection, and the classification error dropped from 12% to 4%. Franke and Menz [25] tracked the development of two diseases, powdery mildew (Blumeria graminis) and leaf rust (Puccinia recondita), in wheat crops over three different periods. Two classification methods were applied, mixture-tuned matched filtering and decision trees, using the Normalized Difference Vegetation Index (NDVI). The results varied with the developmental stage of the disease (from 56.8% to 88.6%). Pérez-Bueno et al. [26] detected white root rot in avocado trees utilizing remote sensing. Calderon et al. [27] utilized airborne hyperspectral and thermal cameras to detect Verticillium wilt in olive trees. Their contribution distinguished physiological parameters in the hyperspectral data that indicate disease.
Our approach uses hyperspectral data together with enhanced spectral methods and statistical correlation to detect and classify healthy, Lw-diseased, Fe-Deficient and N-Deficient avocados. The gradient spectra of the plant leaves were enhanced using spatial derivatives and curve fitting, and the enhanced data were then statistically correlated to assign the specimens (diseased stages, Fe and N deficiencies, and healthy plants) to their corresponding groups. The spectral radiance data can be transformed into percent reflectance and resampled to match the spectral band configuration used by the sensors. Finite differences were used to obtain higher order spectrograms, and the resulting characteristic signatures were used in the statistical correlation process. After obtaining the divided difference spectra for second and fourth order spatial differences and correlating the resultant spectra, the presented algorithm is able to clearly distinguish the healthy and diseased categories of the plant samples.
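As an illustration of the second and fourth order divided differences mentioned above, the standard central-difference forms on a uniformly sampled spectrum are given below. This is a sketch under assumptions: the symbols R (reflectance), λ_i (the i-th band center) and h (the band spacing) are notation introduced here, and the exact stencils used in this work may differ.

```latex
\begin{align}
\left.\frac{d^{2}R}{d\lambda^{2}}\right|_{\lambda_i} &\approx
  \frac{R(\lambda_{i-1}) - 2R(\lambda_i) + R(\lambda_{i+1})}{h^{2}}, \\
\left.\frac{d^{4}R}{d\lambda^{4}}\right|_{\lambda_i} &\approx
  \frac{R(\lambda_{i-2}) - 4R(\lambda_{i-1}) + 6R(\lambda_i) - 4R(\lambda_{i+1}) + R(\lambda_{i+2})}{h^{4}}.
\end{align}
```

Both stencils have a truncation error of order h². A sign change in the second difference marks an inflection point of the reflectance curve.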
Reflectance data collected by the sensor were obtained for healthy and diseased avocado leaves from the TREC and CREC. A standard normalized transformation was then applied to reduce the approximation error due to scattering and particle size differences embedded in the reflectance data. Finite difference approximation (FDA) and bivariate correlation were used to distinguish diseased or nutrient-deficient plants from healthy specimens. Normalized finite difference approximations of each leaf’s reflectance values with respect to wavelength were calculated.
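The normalization step can be sketched as follows. This assumes the "standard normalized transformation" corresponds to the standard normal variate (SNV) transform, in which each spectrum is centered on its own mean and scaled by its own standard deviation. The function name `snv` and the sample values are hypothetical, not taken from the study.

```python
from statistics import mean, stdev


def snv(reflectance):
    """Standard normal variate: center one spectrum on its own mean
    and scale it by its own sample standard deviation."""
    mu = mean(reflectance)
    sigma = stdev(reflectance)  # sample standard deviation (n - 1)
    if sigma == 0:
        raise ValueError("a flat spectrum cannot be normalized")
    return [(r - mu) / sigma for r in reflectance]


# Hypothetical percent-reflectance values for one leaf (not measured data).
leaf = [42.0, 44.5, 47.1, 46.8, 45.0, 43.2]
normalized = snv(leaf)
```

Because SNV removes a per-spectrum offset and scale, two leaves whose reflectance differs only by a multiplicative scattering factor map to the same normalized spectrum.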
3.1. Data Analysis
After careful examination of the data, and noting specifically the variance between 750 and 950 nm, it was determined that a higher order FDA could discern inflection point differences in this region. Figure 2 displays the averaged reflectance data (plant reflectance signatures) over the complete spectrum and in the 750 to 950 nm region for each leaf category. Figure 3 shows the zero crossing regions of the fourth order FDA between 750 and 950 nm, which reveal the inflection point variance. Figure 4 displays the second order FDA for each group between 750 and 950 nm, after normalizing and transforming the data. Each group’s FDA reveals a categorical spectrum indicative of inflection point variations in this region. The variations in phase, amplitude and peak data are significant. These variations, together with the correlation method defined in Section 2.3.2, were used to classify leaves as Fe-Deficient, N-Deficient, Laurel wilt diseased, or healthy.
The interval [750, 950] nm was selected as the region of interest (ROI) for approximating the bivariate correlation coefficient because of the inflection point variance displayed in the fourth order zero crossing data and the spectral variance shown in the second order FDA.
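A minimal sketch of how zero crossings of the second and fourth order FDA could be located inside the 750–950 nm interval is given below. It assumes central differences on a uniformly spaced wavelength grid. The function names and the synthetic quintic spectrum are illustrative only and do not reproduce the measured data.

```python
def second_difference(y, h):
    """Central second order difference; the result aligns with y[1:-1]."""
    return [(y[i - 1] - 2 * y[i] + y[i + 1]) / h ** 2
            for i in range(1, len(y) - 1)]


def fourth_difference(y, h):
    """Central fourth order difference; the result aligns with y[2:-2]."""
    return [(y[i - 2] - 4 * y[i - 1] + 6 * y[i] - 4 * y[i + 1] + y[i + 2]) / h ** 4
            for i in range(2, len(y) - 2)]


def zero_crossings_in_roi(wavelengths, values, lo=750, hi=950):
    """Wavelength intervals inside [lo, hi] where consecutive values change sign."""
    return [(wavelengths[i], wavelengths[i + 1])
            for i in range(len(values) - 1)
            if lo <= wavelengths[i] and wavelengths[i + 1] <= hi
            and values[i] * values[i + 1] < 0]


# Synthetic spectrum (not measured data): a quintic with an inflection at 855 nm.
wavelengths = list(range(700, 1001, 10))  # assumed 10 nm band spacing
spectrum = [((w - 855) / 100) ** 5 for w in wavelengths]

d2 = second_difference(spectrum, h=10)
d4 = fourth_difference(spectrum, h=10)
crossings_d2 = zero_crossings_in_roi(wavelengths[1:-1], d2)
crossings_d4 = zero_crossings_in_roi(wavelengths[2:-2], d4)
```

For this synthetic curve both difference orders change sign only between 850 and 860 nm, which brackets the inflection point placed at 855 nm.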
3.2. Applied FDA and Bivariate Correlation Process
After normalization and smoothing of the data, a second order FDA was performed on each leaf’s data set. The bivariate correlation coefficients were then calculated between each leaf’s second order FDA and the second order FDA of leaves in the other categories. For the calculation of the bivariate correlation coefficient between leaves of the same group, randomly sampled leaves from each group were used to create the average spectra and these were correlated against only the remaining leaves. Since over 300 data sets for numerous avocado leaves were available for each respective leaf group, thirty randomly sampled data sets were used to create the averages. Results follow below for each group of avocados.
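The sampling and correlation procedure described above can be sketched as follows. This is an assumption-laden illustration: the bivariate correlation is taken to be the Pearson coefficient, the helper names (`pearson`, `reference_and_holdout`) are hypothetical, and the spectra are synthetic stand-ins for one group's second order FDA data.

```python
import math
import random
from statistics import mean


def pearson(x, y):
    """Pearson (bivariate) correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def reference_and_holdout(spectra, n_reference=30, seed=0):
    """Average n_reference randomly chosen spectra; return that average
    together with the remaining spectra, which are kept for testing."""
    rng = random.Random(seed)
    order = list(range(len(spectra)))
    rng.shuffle(order)
    chosen, rest = order[:n_reference], order[n_reference:]
    reference = [mean(band) for band in zip(*(spectra[i] for i in chosen))]
    holdout = [spectra[i] for i in rest]
    return reference, holdout


# Synthetic stand-in for one group's second order FDA spectra (not measured data).
rng = random.Random(1)
pattern = [math.sin(0.4 * k) for k in range(21)]
group = [[p + rng.gauss(0.0, 0.05) for p in pattern] for _ in range(40)]

reference, holdout = reference_and_holdout(group)
coefficients = [pearson(leaf, reference) for leaf in holdout]
```

Keeping the 30 averaged leaves out of the test set avoids correlating a leaf against a reference that already contains it.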
3.2.1. Fe-Deficient Classification Results
As shown in Figure 5, all Fe-Deficient specimens produced a correlation greater than 0.5 only with the average of the 30 randomly selected Fe-Deficient specimens. Correlations with other groups center around zero. The Fe-Deficient correlation with the Lw-diseased FDA shows the most variance but was still below the 0.4–0.5 correlation coefficient point, which is being established as the categorical line of distinction.
3.2.2. N-Deficient Classification Results
All N-Deficient specimens produced a correlation greater than 0.5 only with the average of the 30 randomly selected N-Deficient specimens (Figure 6). Correlations with other groups centered around zero. Again, the N-Deficient correlation with the Lw-diseased FDA shows the most variance. For both the N-Deficient and the Fe-Deficient correlations, the variance is greatest with the Lw correlation. The Lw-N-Deficient and Healthy-N-Deficient correlation spectra are almost 180° out of phase.
3.2.3. Lw-Disease Classification Results
As seen in Figure 7, all 170 Lw specimens exceed a correlation of 0.48, while correlations with other groups centered near or below zero. This correlation set shows the highest degree of collinearity, which is why it is difficult to distinguish Lw from abiotic stressors or healthy specimens. From these results, a categorical line of distinction at the 0.4 correlation coefficient mark is considered the most applicable for distinguishing Lw from the other deficiencies and healthy plants.
3.2.4. Healthy Avocado Classification Results
All healthy specimens produced a correlation greater than 0.5 only with the average of the 30 randomly selected healthy specimens. Correlations with other groups centered around or below zero, as seen in Figure 8. Again, the inverse phase relationship appears when healthy specimens are correlated with Lw in this statistical process. The most variance is seen in the healthy to N-Deficient correlation.
3.3. Analysis of Classification Results
The results show that the FDA-BC algorithm is highly effective at distinguishing Lw disease and N and Fe deficiencies from healthy avocado plants. Each enhanced spectra group of avocado leaves correlated above 0.4, on the bivariate correlation range of −1 to 1, with its own condition and correlated near or below zero with any other group. Because of these results, an algorithm for distinguishing between the four groups of avocados was realized. The algorithm for each leaf consisted of:
1. Normalization of the hyperspectral data.
2. Polynomial fitting of the data.
3. Smoothing the data by moving median with absolute deviation.
4. Obtaining the second and fourth order finite difference approximation (FDA).
5. Establishing regions of interest (ROI) through FDA inflection point analysis of the spectra.
6. Detecting and correctly categorizing the leaf sample into one of the healthy or diseased/deficient categories based on the correlation coefficient result. For correlations greater than 0.4, the leaf specimen was classified with the specimen group correlated against.
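The final decision step can be sketched as a simple threshold rule. The function name `classify`, the choice to pick the highest coefficient when several exceed the threshold, and the example coefficients are assumptions made for illustration; only the 0.4 threshold comes from the text.

```python
def classify(correlations, threshold=0.4):
    """Return the group whose reference correlates best with the leaf,
    or None when no coefficient exceeds the threshold."""
    group, best = max(correlations.items(), key=lambda item: item[1])
    return group if best > threshold else None


# Hypothetical coefficients for a single leaf (illustrative values only).
leaf_correlations = {
    "Healthy": -0.12,
    "Lw": 0.63,
    "Fe-Deficient": 0.05,
    "N-Deficient": -0.30,
}
label = classify(leaf_correlations)
```

Returning None for a leaf that clears no threshold keeps unclassified samples visible instead of forcing them into the nearest group.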
The bivariate correlation of the enhanced spectra assigns each avocado leaf to the correct disease or deficiency group with high accuracy. The accuracy of the FDA-BC algorithm for each category is summarized in Table 1. As can be noted, all deficiency classifications using this method are highly accurate for the population data in our data set. Healthy and diseased leaves were also classified with greater than 99% accuracy in most cases. The FDA-BC method is consolidated in Figure 9 to show the results of the detection and classification method more concisely. The results vary only slightly in repeated testing (within ~0.41%), due to the random sampling and averaging of the pattern signature for the autocorrelation process.
Confusion Matrix as Figure of Merit
The Confusion Matrix is used as a measure of accuracy in predictive analysis. For the purpose of verifying the FDA-BC results, four rows of correlated data results represent true positives, true negatives, false positives and false negatives for each data set. The first row contains correct correlation results (true positives) for the “leaf being tested” that fall above the bivariate division line of 0.4 for leaves in the same category. The second row contains correlated leaf results of the same category that fall below 0.4; this signifies that the categorization was incorrect, i.e., a false negative (e.g., an Lw leaf should have correlated above 0.4 for its category of Lw). The graphs in Figure 10 show these first two rows of the Confusion Matrix, i.e., the [True Positives]/[False Negatives] prediction outcomes. Figure 10a shows the classification accuracy for Fe-Deficient plants; Figure 10b gives the results for the N-Deficient classification; Figure 10c shows the Lw classification results; and Figure 10d verifies the healthy avocado leaf identification. There are no false negatives in the Confusion Matrix (data row 2), thus verifying the 100% accuracy of the FDA-BC method in the ROI [750–950 nm].
The Accuracy (ACC) figure of merit represents the actual accuracy of the prediction, computed from the Confusion Matrix. Table 2 presents the Confusion Matrix for healthy avocado predictions using the FDA-BC algorithm. All other ACC data are given in Table 2. These ACC results verify that the FDA-BC method used for classification of healthy, diseased and deficient avocado plants is extremely accurate. For the set of randomly sampled and averaged FDA correlations in this analysis, 100% accuracy was achieved, supporting the robustness of the FDA-BC algorithm.
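The ACC figure of merit follows the usual definition, ACC = (TP + TN) / (TP + TN + FP + FN). The sketch below shows one way the four counts could be derived from correlation coefficients and the 0.4 division line. Treating other-group coefficients above the line as false positives is an assumption, and the function names and sample values are hypothetical.

```python
def confusion_counts(same_group_r, other_group_r, threshold=0.4):
    """Count outcomes for one category from correlation coefficients.

    same_group_r:  coefficients of leaves that belong to the category
    other_group_r: coefficients of leaves that belong to other categories
    """
    tp = sum(r > threshold for r in same_group_r)
    fn = len(same_group_r) - tp
    fp = sum(r > threshold for r in other_group_r)
    tn = len(other_group_r) - fp
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """ACC = (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no predictions to score")
    return (tp + tn) / total


# Hypothetical coefficients (illustrative values only).
same = [0.62, 0.71, 0.55, 0.48]
other = [0.05, -0.20, 0.12, -0.31, 0.02]
counts = confusion_counts(same, other)
acc = accuracy(*counts)
```

With no same-group coefficient below the line and no other-group coefficient above it, the false negative and false positive counts are both zero, which corresponds to the 100% case reported for the ROI.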
Techniques in hyperspectral data analysis have grown rapidly in recent years. Similar work is being carried out on the detection and classification of plant disorders from hyperspectral data [6]. Pérez-Bueno et al. [26] used the normalized difference vegetation index (NDVI) and canopy temperature to form a binary regression model for classifying hyperspectral data to identify white root rot in avocados. Ye et al. [29] used diverse signal processing methods on hyperspectral images to distinguish bruised potatoes (at three levels) from unbruised ones. Preprocessing included Savitzky-Golay smoothing of the data, averaging and transforming the data using the normal variate transformation to reduce scattering effects (clutter), first and second order derivative analysis, and multiplicative scatter correction. By using dimensional reduction and a three-dimensional support vector machine for decision analysis, the classification algorithm achieved 100% accuracy. While this algorithm is very accurate, its complexity and the strategies used to implement the objective function make it difficult to apply to other features in similar hyperspectral data.
Other recent methods for detecting and classifying diseased species through hyperspectral image analysis have been used to identify nutrient-deficient abiotic and biotic stressors in tomato plants [30]. Hyperspectral imaging analysis was used for early detection of the biotic stress caused by root knot nematodes and to distinguish it from the abiotic stress caused by drought in tomato plants. This study analyzed the data using partial least squares with support vector machine (PLS-SVM) algorithms and matched filter analysis to detect and classify biotic vs. abiotic stressors in the plants. Again, preprocessing methods similar to those of Ye et al. [29] were used. The method then applies supervised classification using spectral divergence and discrimination analysis to classify the data. Because of the variability in the data sets, reliable models could be built only for specific test sets.
Hyperspectral imaging techniques have recently been utilized to detect Lw disease in avocado at the asymptomatic and early stages. Abdulridha et al. [6] and Sankaran et al. [6] developed rapid techniques to detect Lw in avocado and distinguish it from other diseases and disorders that produce similar symptoms, utilizing hyperspectral and multispectral data and several classification algorithms (neural networks). Two data sets were collected at 10 nm and 40 nm spectral resolution, and 23 vegetation indices (VIs) were calculated to detect Lw-affected trees using two classification methods: decision tree (DT) and multilayer perceptron (MLP) neural networks. Additionally, the optimal wavelengths and VIs for discriminating among healthy trees, Lw-infected trees and trees with iron and nitrogen deficiencies were identified. The work focused on a comparative analysis of several types of neural networks for classifying diseased, deficient, and healthy avocados. The research showed that the MLP neural network was superior to the DT in classifying diseased vs. deficient and healthy trees.
The FDA-BC method described in this article is more general than these recent methods. It was successful for over 1,300 leaf samples tested. It uses a few well-known preprocessing steps (polynomial fit, finite difference approximation with zero crossing identification, normalization, smoothing), followed by statistical correlation. The FDA-BC provides a robust algorithm for detection and classification of diseased plants using hyperspectral data analytics. This technique can serve as a reliable, verifiable algorithm for other discrimination tasks based on hyperspectral data.
We have presented a numerical and statistical method to detect Lw-infected avocado trees (in early and late stages) and discriminate them from N-Deficient, Fe-Deficient and healthy avocado plants. The method proved to be highly accurate (ACC = 100%) in the ROI for the data sets tested. It is a robust and versatile method that can be used to detect and classify other hyperspectral data of plant diseases or related stress factors.
Future work to improve this algorithm would be to use more advanced smoothing methods, to define spectral pattern signatures for each diseased or deficient category of leaf as well as for the healthy plant in order to consolidate the correlation processing method, and to design an automatic zero crossing detection scheme. The smoothing would increase the accuracy when presented with noisy, in-field data. The signature pattern consolidation would deliver more rapid processing of the data. The zero crossing detection would provide immediate and automatic interpretation of the pertinent ROIs to be tested. That said, the accuracy of the result would not necessarily improve, but the algorithm would be more valuable as one stage in a remote sensing, hyperspectral data processing system.
A method for distinguishing disease factors in plants has been introduced and designed. By developing higher order spectra with Finite Difference Approximation, and statistically correlating the results of these enhanced spectra, a method (FDA-BC) for detecting variations in spectra between healthy and diseased avocado plants has been achieved. The formulation of the problem, the analyses, algorithm development, implementation, and results have been shown in this paper. The FDA-BC method is a more generalized algorithm than some of the more recent methods that have been undertaken to provide solutions for the identification and classification of diseased plant data through hyperspectral data processing. FDA-BC involves fitting the data to a fourth order Taylor Polynomial with specific fit properties using the finite difference approximation to determine regions of interest in the data, smoothing of the FDA by moving average estimate, and normalization of the data. A bivariate correlation coefficient algorithm was then applied to classify the preprocessed hyperspectral data. This novel method has proven to be over 99% accurate on the ROIs tested for avocado Lw, Fe/N deficiencies and healthy plant species. The results are evidence that by means of finite difference approximation to enhance hyperspectral data, and analyzing these enhanced spectra through statistical correlation, the FDA-BC algorithm has proven to be highly accurate in the categorization of diseased vs healthy samples. For the sets of data considered, avocado diseased/deficient/healthy, 100% accuracy was achieved for all leaf samples tested. Other plant species’ data are being considered for further verification of this numerical and statistical approach. By further investigation and refining of this algorithm, higher precision can be achieved for disease detection and classification in plant species using this methodology. 
It is, therefore, recommended that variations of this fundamental algorithm be applied to other hyperspectral data analyses as a general solution that is highly accurate and can be further designed to optimize or verify a neural network or support vector machine approach to detection and classification.