1. Introduction
Apple Valsa canker (AVC) is among the most destructive diseases of apple trees. AVC is caused by the filamentous fungus Cytospora ceratosperma (synonyms: Valsa ceratosperma, V. mali, C. capitata, and C. sacculus) [1,2,3], which grows in the trunks and branches of apple trees. Spores formed on the diseased bark are dispersed by wind and rain and carry the fungus into crevices in the bark and into wounds caused by the pruning of branches and fruits. The infected bark rots and, if left untreated, the infection gradually spreads and ultimately kills the tree, resulting in great economic loss. Once a tree develops AVC, the only method of control is to excise the diseased area. Early and accurate detection is therefore crucial to minimize the damage that excision causes to the tree. However, it is difficult to identify diseased bark visually, particularly in the early stages. In addition, the excised material must be promptly removed from the orchard and properly disposed of to prevent further transmission. These inspection and treatment processes impose a labor burden on apple farmers.
AVC was discovered in Japan in the early 20th century [
4]. Since its discovery, it has been reported in other countries and regions of East Asia, such as China, South Korea, and eastern Russia [
5], which account for approximately half of the world’s apple production [
6]. China’s apple production is the world’s largest, with 42.42 million tons produced annually as of 2019, eight times more than that of the second largest producer, the United States [
6]. It is estimated that between 10 and 55% of the major apple-producing areas in China suffer from AVC [
7], and a recent study predicted that future climate change will increase the area of suitable habitat for the pathogen in China [
8]. Thus, AVC control is becoming increasingly important for sustainable apple production.
As with most tree species, apple trees have a layer of chlorenchyma, which contains photosynthetic chloroplasts, beneath the periderm of the bark; hence, photosynthesis occurs in the bark of the trunk and branches as well as in the leaves. Chlorophyll, a pigment present in chloroplasts, strongly absorbs blue and red light but only weakly absorbs green light, so the reflectance of the chlorenchyma is relatively high in the green region, which gives leaves their color. The reflectance is much higher still in the near-infrared region, where light is hardly absorbed. Unlike the leaves, the trunk and branches are covered with a periderm, which strongly absorbs light at wavelengths shorter than the red region [9], giving the bark of the trunk and branches its brown color. Because near-infrared light is hardly absorbed by the periderm and penetrates to the chlorenchyma, the near-infrared reflectance of healthy bark, like that of leaves, is high.
Remote sensing is an effective tool for the non-destructive and non-contact diagnosis of crop growth, disease, and pest damage. It is widely used in precision agriculture, which applies cutting-edge technology to improve crop productivity [
10]. In precision agriculture, research on the application of hyperspectral imaging using a high spectral resolution sensor is progressing rapidly [
11]. Hyperspectral imaging can obtain more detailed spectral information than that acquired using other optical remote sensing methods, such as true color and multispectral imaging [
12]. Furthermore, it has been used to retrieve leaf traits from canopy reflectance for vegetation monitoring [
13,
14] and is expected to be highly effective in the diagnosis and detection of diseases in agricultural crops [
15,
16]. Regarding apple trees, hyperspectral imaging has been applied in the detection of other diseases, such as apple scab [
17] and fire blight [
18] on leaves. Recently, Zhao et al. [
19] applied hyperspectral measurement to the detection of AVC on the bark. They combined near-infrared spectroscopy (900–1700 nm) and Raman spectroscopy (0–2000 cm⁻¹) techniques with machine learning algorithms and generated a diagnostic model based on chemometric methods. Their hyperspectral measurement using portable spectrometers demonstrated that it is feasible to detect AVC on bark samples under laboratory conditions. However, the point-based spectroscopic measurement required to identify the diseased areas in outdoor orchard environments is time-consuming.
In this study, detection of AVC using hyperspectral imaging of apple tree bark in visible and near-infrared regions (460–780 nm) was investigated for the first time. If the diagnosis can be efficiently automated, it will lead to early detection of the disease and a reduction of the burden on apple farmers. The objectives of this study are (1) to determine the spectral characteristics of the AVC-diseased apple tree bark by hyperspectral imaging and (2) to develop a simple diagnostic model to discriminate between diseased and healthy apple tree bark.
2. Materials and Methods
The apple trees investigated in this study were cultivated in an orchard in the Central Agricultural Experiment Station, Agricultural Research Department, Hokkaido Research Organization in Naganuma, Hokkaido, Japan. The AVC infection was visually checked by experts, and the infected branches and trunks were cut down for hyperspectral imaging. All the hyperspectral measurements were conducted outdoors at approximately 11 a.m., within a few hours after cutting. The measurement angles were 20–40° off-nadir, and the sky conditions were mostly cloudy. After the measurements were completed, the bark was peeled off with a knife to confirm the extent of the disease (
Figure 1). The diseased areas were easily identified by removing the bark, as the chlorenchyma beneath the periderm had rotted and lost its green color, and the lesion often reached the xylem.
A sequential two-dimensional imager (Genesia Corp., Tokyo, Japan) using liquid crystal tunable filter technology was used to carry out the hyperspectral imaging. The central wavelength of its spectral bands can be changed sequentially from 460 to 780 nm at a minimum interval of 1 nm, and the bandwidth increases linearly with the wavelength from 6 nm at 460 nm to 23 nm at 780 nm. In this study, the imager scanned the wavelength at 10 nm intervals, and a total of 33 spectral bands were acquired for each hyperspectral dataset. A 127 mm square standard diffuse reflectance target, SRT-50-050 (Labsphere, Inc., North Sutton, NH, USA), with a spectral reflectance of approximately 45–50% over the range of the hyperspectral measurement, was placed in the field of view of the imager. The target was used to convert the measured spectral radiance to spectral reflectance. Detailed specifications of the imager and the image preprocessing methods have been described in a previous study [20].
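The radiance-to-reflectance conversion with an in-scene reference target can be sketched as follows. This is a minimal illustration and not the preprocessing of [20]: the function name `radiance_to_reflectance`, the synthetic cube, the target mask, and the flat 50% target reflectance are all assumptions made for the example.

```python
import numpy as np

# Band centers matching the scan described above: 460-780 nm in 10 nm steps.
wavelengths = np.arange(460, 781, 10)  # 33 bands


def radiance_to_reflectance(cube, target_mask, target_reflectance):
    """Convert a (rows, cols, bands) radiance cube to reflectance.

    target_mask        : boolean (rows, cols) array marking target pixels
    target_reflectance : (bands,) known reflectance of the target per band
    """
    # Mean radiance measured on the reference target in each band.
    target_radiance = cube[target_mask].mean(axis=0)
    # Scale every band so that the target takes its known reflectance.
    return cube * (target_reflectance / target_radiance)


# Synthetic stand-in data (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
cube = rng.uniform(0.5, 1.5, size=(20, 20, wavelengths.size))
target_mask = np.zeros((20, 20), dtype=bool)
target_mask[:5, :5] = True  # assumed location of the target in the image
target_reflectance = np.full(wavelengths.size, 0.5)  # assumed flat 50 %

reflectance = radiance_to_reflectance(cube, target_mask, target_reflectance)
```

Because the target sits in the same scene as the bark, this ratio cancels the illumination spectrum band by band, which is why a single exposure under cloudy sky can still yield reflectance.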
A diagnostic model of AVC was generated based on the hyperspectral measurements of the trunks from seven apple trees sampled on 19 May 2021. The accuracy of the diagnostic model was evaluated using a hyperspectral measurement of a branch from an apple tree sampled on 24 April 2020 (
Table 1). In addition, the model was assessed by using k-fold cross-validation (CV) with five cultivars (k = 5); the seven trunks were grouped into the five cultivars because one of the trunks had only a very small amount of healthy area. In the 5-fold CV, the accuracy was evaluated for each cultivar using the model generated from the other four cultivars.
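The cultivar-wise cross-validation can be sketched as a leave-one-group-out loop. The example below is an assumption-laden illustration: the cultivar names, the synthetic index values, and the placeholder midpoint-threshold classifier are hypothetical and stand in for the study's data and for the discriminant rule described later in this section.

```python
import numpy as np


def leave_one_group_out(values, labels, groups, fit, predict):
    """Return {group: accuracy}, training on all other groups each time."""
    scores = {}
    for g in np.unique(groups):
        test = groups == g
        model = fit(values[~test], labels[~test])
        predicted = predict(model, values[test])
        scores[str(g)] = float(np.mean(predicted == labels[test]))
    return scores


# Placeholder classifier: threshold halfway between the two class means.
def fit_midpoint(x, y):
    return 0.5 * (x[y == 0].mean() + x[y == 1].mean())


def predict_threshold(threshold, x):
    # Label 1 (healthy) is assumed to have the higher index value.
    return (x > threshold).astype(int)


# Synthetic data: 5 hypothetical cultivars, 40 superpixels each.
rng = np.random.default_rng(1)
groups = np.repeat(np.array(["A", "B", "C", "D", "E"]), 40)
labels = np.tile([0, 1], 100)  # 0 = diseased, 1 = healthy
values = np.where(labels == 1,
                  rng.normal(0.5, 0.05, 200),
                  rng.normal(0.2, 0.05, 200))

scores = leave_one_group_out(values, labels, groups,
                             fit_midpoint, predict_threshold)
```

Splitting by cultivar rather than by random superpixel keeps superpixels from the same tree out of both the training and the test fold, so each score reflects performance on an unseen cultivar.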
When a pixel-based analysis is used for the hyperspectral images with narrow spectral bands, noisy pixels can have a significant effect on the generation of a diagnostic model. To reduce the amount of noise, an object-based analysis using a superpixel segmentation algorithm, which groups pixels into small areas (superpixels) with similar brightness, was applied to the input image. Pixels of a composite color image created from the hyperspectral imagery were automatically segmented into superpixels using the simple linear iterative clustering (SLIC) algorithm [
21] implemented in the OpenCV open-source library [
22]. Among the algorithm variants of the SLIC, the SLICO (‘O’ stands for optimization) algorithm was applied. The SLICO algorithm optimizes superpixels using the adaptive compactness factor, and the average superpixel size was 15 pixels. Each superpixel was manually labeled as healthy or diseased based on the identification measurement after the bark was removed (
Figure 2). To avoid misclassification, superpixels that contained both healthy and diseased areas were not used for the analysis. The total numbers of healthy and diseased superpixels obtained from the seven trunks were 330 and 303, respectively.
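Once a superpixel label map is available, averaging the spectrum within each superpixel is a simple grouped mean. The sketch below assumes an integer label map with contiguous labels 0..n-1; in practice such a map could come from a SLIC implementation such as the one in OpenCV's contrib modules, but here a hand-made 2 × 2 block grid stands in for it, and the function name `superpixel_mean_spectra` is hypothetical.

```python
import numpy as np


def superpixel_mean_spectra(reflectance, segments):
    """Average the spectrum over each superpixel.

    reflectance : (rows, cols, bands) reflectance cube
    segments    : (rows, cols) integer label map, labels 0..n-1,
                  every label assumed to occur at least once
    """
    n = int(segments.max()) + 1
    flat_labels = segments.ravel()
    flat_spectra = reflectance.reshape(-1, reflectance.shape[-1])
    counts = np.bincount(flat_labels, minlength=n)
    sums = np.zeros((n, flat_spectra.shape[1]))
    # Unbuffered accumulation: adds each pixel spectrum to its label's row.
    np.add.at(sums, flat_labels, flat_spectra)
    return sums / counts[:, None]


# Toy example: a 4 x 4 image with 3 bands, split into four 2 x 2 blocks.
rng = np.random.default_rng(2)
reflectance = rng.uniform(0.05, 0.6, size=(4, 4, 3))
segments = np.repeat(np.repeat(np.arange(4).reshape(2, 2), 2, axis=0),
                     2, axis=1)

means = superpixel_mean_spectra(reflectance, segments)  # shape (4, 3)
```

Averaging over a superpixel reduces pixel-level noise before any index is computed, which is the motivation given above for the object-based analysis.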
After calculating the average spectral reflectance for each superpixel, the normalized difference spectral index (NDSI) was derived from a combination of two spectral bands using the following equation:
$$\mathrm{NDSI}(\lambda_i, \lambda_j) = \frac{R(\lambda_j) - R(\lambda_i)}{R(\lambda_j) + R(\lambda_i)} \qquad (1)$$
where R(λi) and R(λj) are the spectral reflectance values at the central wavelengths λi and λj of the ith and jth spectral bands, respectively. NDSI was calculated for all band pair combinations from the 33 spectral bands (₃₃C₂ = 528). The normalized difference vegetation index (NDVI), which is widely used in optical remote sensing for monitoring the quantity or status of vegetation, is a form of NDSI that combines spectral bands in the red and near-infrared regions as λi and λj, respectively. The advantages of NDSI are its relative insensitivity to changes in irradiance and the convenience of having values normalized between −1 and 1.
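The calculation of NDSI over all band pairs can be sketched as below. The form used is an assumption consistent with the NDVI convention described above, namely NDSI(λi, λj) = (R(λj) − R(λi)) / (R(λj) + R(λi)) with λi < λj; the synthetic spectra and the helper names `ndsi_all_pairs` and `lookup` are hypothetical.

```python
from itertools import combinations

import numpy as np

wavelengths = np.arange(460, 781, 10)  # 33 band centers in nm


def ndsi_all_pairs(spectra):
    """Compute NDSI for every band pair (i < j).

    spectra : (n_samples, n_bands) mean reflectance per superpixel
    Returns the list of (i, j) index pairs and an (n_samples, n_pairs) array.
    """
    pairs = list(combinations(range(spectra.shape[1]), 2))
    i, j = np.array(pairs).T
    ndsi = (spectra[:, j] - spectra[:, i]) / (spectra[:, j] + spectra[:, i])
    return pairs, ndsi


# Synthetic positive reflectance for five superpixels (illustration only).
rng = np.random.default_rng(3)
spectra = rng.uniform(0.05, 0.6, size=(5, wavelengths.size))

pairs, ndsi = ndsi_all_pairs(spectra)

# Map a wavelength pair in nm to its column in the NDSI array.
lookup = {(int(wavelengths[i]), int(wavelengths[j])): k
          for k, (i, j) in enumerate(pairs)}
ndsi_680_740 = ndsi[:, lookup[(680, 740)]]
```

With 33 bands this yields 528 columns, matching the number of combinations stated above, and every value lies between −1 and 1 as long as the reflectances are positive.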
It was assumed that the NDSI values of the superpixels had a normal distribution, and discriminant analysis was performed to classify the healthy and diseased areas using the Mahalanobis distance. The Mahalanobis distance is the distance from the center of a normal distribution, and the Mahalanobis distance for a certain point is expressed by the following equation using the standard deviation σ and the mean μ:
$$D_M = \frac{|x - \mu|}{\sigma} \qquad (2)$$
where D_M is the Mahalanobis distance and x is the position of the point. In discriminant analysis, the point where the Mahalanobis distances from two normal distributions are equal is the boundary that determines which distribution a certain point belongs to.
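For one-dimensional data the equal-distance boundary has a closed form. Assuming the Mahalanobis distance takes the form |x − μ| / σ and that the boundary of interest lies between the two class means, setting the two distances equal gives x = (μ₁σ₂ + μ₂σ₁) / (σ₁ + σ₂), with a common distance of |μ₂ − μ₁| / (σ₁ + σ₂). The sketch below illustrates this; the function names and the example means and standard deviations are hypothetical, not values from the study.

```python
import numpy as np


def mahalanobis_1d(x, mu, sigma):
    """One-dimensional Mahalanobis distance of x from a normal distribution."""
    return np.abs(x - mu) / sigma


def equal_distance_threshold(mu_a, sigma_a, mu_b, sigma_b):
    """Point between the two means where both Mahalanobis distances agree.

    Returns the threshold and the (shared) distance at that threshold.
    """
    threshold = (mu_a * sigma_b + mu_b * sigma_a) / (sigma_a + sigma_b)
    distance = abs(mu_b - mu_a) / (sigma_a + sigma_b)
    return threshold, distance


# Hypothetical class statistics of an NDSI (illustration only).
mu_diseased, sd_diseased = 0.2, 0.05
mu_healthy, sd_healthy = 0.5, 0.10

threshold, distance = equal_distance_threshold(
    mu_diseased, sd_diseased, mu_healthy, sd_healthy)
# threshold is approximately 0.30 and distance approximately 2.0

# A superpixel is then labeled healthy if its NDSI exceeds the threshold.
is_healthy = np.array([0.15, 0.28, 0.35, 0.55]) > threshold
```

The threshold sits closer to the class with the smaller spread, and the shared distance grows as the means move apart or the spreads shrink, which is why it serves as a measure of separability for each band pair. When the two standard deviations differ, a second equal-distance point exists outside the interval between the means; it is ignored in this sketch.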
To validate the efficacy of this method using the NDSI and the Mahalanobis distance, a linear discriminant analysis (LDA) and a quadratic discriminant analysis (QDA) were performed using the spectral reflectance values of all band pairs (instead of the NDSI values). Although both LDA and QDA assume a normal distribution for each class of data, LDA assumes that the healthy and diseased classes share the same covariance, whereas QDA estimates a separate covariance for each class. Models for LDA and QDA were also generated from the dataset of the seven trunks, and their accuracies were evaluated using the branch data. The LDA and QDA algorithms were implemented using the scikit-learn library [23] in Python 3.7 [24].
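A comparison of LDA and QDA on the reflectance of a single band pair can be sketched with scikit-learn as follows. The data are synthetic and the class means and spreads are assumptions chosen only to give the two classes different variances; the study itself repeated such a fit for every band pair using the trunk and branch datasets.

```python
import numpy as np
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)

rng = np.random.default_rng(4)

# Synthetic two-band reflectance, columns = (red band, near-infrared band).
healthy = rng.normal([0.10, 0.40], [0.02, 0.06], size=(200, 2))
diseased = rng.normal([0.10, 0.18], [0.02, 0.03], size=(200, 2))
X = np.vstack([healthy, diseased])
y = np.r_[np.ones(200, dtype=int), np.zeros(200, dtype=int)]  # 1 = healthy

# Hold out every fourth sample for evaluation.
test = np.arange(len(y)) % 4 == 0

accuracies = {}
for name, model in [("LDA", LinearDiscriminantAnalysis()),
                    ("QDA", QuadraticDiscriminantAnalysis())]:
    model.fit(X[~test], y[~test])
    accuracies[name] = model.score(X[test], y[test])  # mean accuracy
```

Because QDA fits a covariance per class, it can place a curved boundary when the healthy and diseased spreads differ, whereas LDA is restricted to a straight line in the two-band plane.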
3. Results
Figure 3 shows the spectral reflectance averaged for each superpixel, divided into the healthy and diseased areas. In the healthy area, the spectral reflectance increases sharply in the red edge spectral region (680 nm to 740 nm). In contrast, the spectral reflectance in the diseased area increases gradually in the red edge region. Depending on the superpixels, the values of the spectral reflectance in both the healthy and diseased areas vary by more than a factor of three.
Figure 4 shows the probability density distributions of the healthy and diseased areas for
NDSI (680 nm, 740 nm), which are calculated using Equation (1), where
λi = 680 nm and
λj = 740 nm. These two spectral bands are located at either end of the red edge region where the healthy and diseased areas are significantly different. As expected from the characteristics of the spectral reflectance in
Figure 3, the probability density distributions of the healthy and diseased areas are separated, and they can be reasonably assumed to be normally distributed.
Assuming a normal distribution for both distributions, the threshold value of NDSI at which the Mahalanobis distances are equal was determined.
Figure 5 shows the threshold values and their Mahalanobis distances, calculated for all combinations of the spectral bands. As can be seen in
Figure 5a, the threshold values of NDSI are large in the combinations of spectral bands where
λi is in the visible region and
λj is in the near-infrared region. In contrast,
Figure 5b shows that the Mahalanobis distances are large in the combination of spectral bands where
λi is on the short wavelength side and
λj is on the long wavelength side of the red edge region where the spectral reflectance increases sharply. The maximum value of the Mahalanobis distance is located at the combination of
λi = 690 nm and
λj = 710 nm.
Table 2 shows the results of the 5-fold CV, which was performed using the five cultivars of the seven trunks used to generate the diagnostic model. Although each cultivar has a different number of superpixels and a different proportion of healthy and diseased areas, the accuracy obtained for each cultivar was extremely high (>0.99). The spectral bands with high accuracy were similar for all cultivars and were located in the region where the Mahalanobis distance was large in
Figure 5b. These results support the validity of creating a diagnostic model from data including various cultivars with different sample sizes and proportions of healthy and diseased areas.
Figure 6 shows the results of the diagnostic model created from the NDSI threshold values using the hyperspectral image data of an apple tree branch obtained on 24 April 2020. Based on the identification measurement of the peeled-off bark, the ground-truth labels were prepared and compared with the discrimination result. The discrimination accuracy is defined as the ratio of the discriminated pixels having the same label as the ground-truth pixels.
Figure 6c shows the discrimination result for
NDSI (660 nm, 730 nm), which achieved the highest accuracy (0.961). Although the diagnostic model was created from the hyperspectral datasets of apple tree trunks, the discrimination result shows high diagnostic accuracy for the branch, including its twigs.
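The accuracy definition above, the fraction of pixels whose discriminated label matches the ground truth, can be sketched as follows. The 2 × 2 reflectance values, the threshold of 0.4, and the optional `valid` mask for excluding background pixels are hypothetical and serve only to illustrate the calculation.

```python
import numpy as np


def pixel_accuracy(predicted, truth, valid=None):
    """Fraction of pixels whose predicted label equals the ground truth.

    valid : optional boolean mask of pixels to evaluate (assumed here as a
            way to exclude background); all pixels are used if omitted.
    """
    if valid is None:
        valid = np.ones(truth.shape, dtype=bool)
    return float(np.mean(predicted[valid] == truth[valid]))


# Toy two-band reflectance images (hypothetical values).
r660 = np.array([[0.10, 0.10],
                 [0.10, 0.10]])
r730 = np.array([[0.40, 0.35],
                 [0.15, 0.30]])

ndsi = (r730 - r660) / (r730 + r660)
threshold = 0.4                       # hypothetical NDSI threshold
predicted = (ndsi > threshold).astype(int)   # 1 = healthy, 0 = diseased

truth = np.array([[1, 1],
                  [0, 0]])
accuracy = pixel_accuracy(predicted, truth)  # 3 of 4 pixels agree: 0.75
```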
Figure 7 shows the accuracy assessment results for all the spectral band combinations. For NDSIs with
λi = 660–690 nm and
λj = 720–760 nm, the accuracy was ≥0.94.
Table 3 shows a comparison of the accuracy assessment results obtained using the proposed method, LDA, and QDA. The proposed method achieved the same accuracy as QDA, whereas LDA was slightly less accurate. Although the spectral bands that provide the highest accuracy vary depending on the analysis method, they are all located in similar spectral regions.
4. Discussion
As shown in
Figure 3, when the reflectance of the diseased area is compared with that of the healthy area, there is no significant difference in the absorption of light in the shorter wavelength region, but the diseased area has low reflectance in the near-infrared region. This is likely because the near-infrared light is absorbed, as it can penetrate the damaged chlorenchyma layer and the xylem. This result suggests that the diseased area can be easily identified by the near-infrared reflectance, which is invisible to the human eye, even when it is difficult to distinguish the diseased area visually. Furthermore, the examination of near-infrared reflectance is potentially suitable for diagnosing other tree bark canker diseases that damage the chlorenchyma layer.
As is shown in
Figure 7, the diagnostic model using the NDSI threshold achieved high discrimination accuracy (≥0.94) for the combination of spectral bands at 660–690 nm and at 720–760 nm, across the red edge region. From the discriminant analysis, it is expected that a larger Mahalanobis distance yields higher discrimination accuracy. However, at the spectral bands with the maximum Mahalanobis distance (
λi = 690 nm,
λj = 710 nm) in
Figure 5b, the accuracy was low (0.886). This is likely because the NDSI threshold is small near the maximum of the Mahalanobis distance, which makes the classification sensitive to noise. As shown in Figure 4, the discriminant analysis assumes a normal distribution, but the actual measured values are scattered by noise. When the two spectral bands are close together and the NDSI threshold becomes small, this noise can lead to misclassification. In Figure 4, moreover, a small number of data points belonging to the diseased area fall on the healthy side of the threshold; these are examples of outliers possibly caused by noise. A diagnostic model should therefore be chosen on the basis of the accuracy assessment rather than the theoretical estimate.
The combination of spectral bands exhibiting a high discrimination accuracy in this study is relatively close to that of NDVI, a traditional vegetation index for multispectral remote sensing. Therefore, it is expected that NDVI can also discriminate AVC-diseased bark from healthy bark. In multispectral imaging, however, the spectral regions of the red and near-infrared bands used in the NDVI calculation vary widely depending on the sensor specifications, and the bands may have extremely wide bandwidths and thus overlap with each other. The Mahalanobis distance will become small in such cases, and the discrimination accuracy will decrease. Additionally, because the wavelength range of the hyperspectral imaging in this study extended only to 780 nm, the efficacy of an NDVI whose near-infrared band is set at a wavelength longer than 780 nm cannot be evaluated from the results of this study.
Until recently, there have been few studies on AVC using optical remote sensing. Thus, the study published by Zhao et al. [
19] last year is a notable advance, as it implements AVC detection via hyperspectral measurements using Raman spectroscopy. Their work can be used to detect asymptomatic early stages of AVC infection, as with many studies aimed at diagnosing other plant diseases using Raman spectroscopy, which has made remarkable progress in recent years [25]. However, laser Raman spectroscopy is a point-based active sensing method, which requires repeated measurements at many points to identify the extent of the diseased area. In addition, because Raman scattering is generally very weak, significant improvements in the measurement system, including measures against background light, will be required for outdoor applications. In contrast, the AVC disease visualization technology developed in this study is a passive imaging technique that utilizes sunlight as a light source. Practical application of this technology is therefore promising, as it faces only a few technical issues in in-situ measurements in orchards.
In this study, the healthy and diseased areas of apple trees were identified using the threshold value of NDSI. This method has a higher accuracy than that of LDA and the same accuracy as that of QDA. However, as slight misclassifications are evident from
Figure 6c even in these ideal and controlled conditions, the accuracy may decrease further in different illumination and measurement conditions. In such cases, more complicated techniques, such as machine learning, can be used. However, when considering the practical application of diagnostic technology, it is preferable that a diagnosis can be performed using a small number of spectral bands and a simple algorithm. If the number of spectral bands required is small, measurement can be performed using a multispectral sensor, which is significantly less expensive than a hyperspectral sensor. If the algorithm is simple, a low-cost and light-weight mobile device with low computing power can be used. The diagnostic model proposed in this study, which can identify the diseased area with only two specific spectral bands and a single threshold value, is suitable for practical use.