Detection of Two Different Grapevine Yellows in Vitis vinifera Using Hyperspectral Imaging

Grapevine yellows (GY) are serious phytoplasma-caused diseases affecting viticultural areas worldwide. At present, two principal agents of GY are known to infect grapevines in Germany: Bois noir (BN) and Palatinate grapevine yellows (PGY). Disease management is mostly based on prophylactic measures as there are no curative in-field treatments available. In this context, sensor-based disease detection could be a useful tool for winegrowers. Therefore, hyperspectral imaging (400–2500 nm) was applied to identify phytoplasma-infected greenhouse plants and shoots collected in the field. Disease detection models (Radial-Basis Function Network) have successfully been developed for greenhouse plants of two white grapevine varieties infected with BN and PGY. Differentiation of symptomatic and healthy plants was possible, reaching classification accuracies of up to 96%. However, identification of BN-infected but symptomless vines was difficult and needs further investigation. Regarding shoots collected in the field from different red and white varieties, correct classifications of up to 100% could be reached using a Multi-Layer Perceptron Network for analysis. Thus, hyperspectral imaging seems to be a promising approach for the detection of different GY. Moreover, the 10 most important wavelengths were identified for each disease detection approach, many of which could be found between 400 and 700 nm and in the short-wave infrared region (1585, 2135, and 2300 nm). These wavelengths could be used further to develop multispectral systems.


Introduction
Grapevine yellows (GY) are diseases caused by different phytoplasma groups that are distributed in viticultural areas worldwide [1]. In Germany, phytoplasmas associated with Bois noir (BN) and Palatinate grapevine yellows (PGY) are the main agents of GY [2]. While Candidatus Phytoplasma solani, the causal agent of BN, is assigned to the Stolbur taxonomic group (16SrXII-A) and transmitted to grapevine mainly by the planthopper Hyalesthes obsoletus [3,4], PGY is a member of the Elm yellows group (16SrV) with the leafhopper Oncopsis alni being its vector [5,6]. Both vectors feed only erratically on grapevine, as bindweed and nettle are the main hosts for H. obsoletus [7] and alder for O. alni [8]. Besides transmission via insects, phytoplasmas can also be distributed during vegetative propagation [9].
Although BN and PGY can easily be discriminated based on their etiology and epidemiology, the symptoms they induce on grapevine are indistinguishable [10]. Typical symptoms include downward rolling of leaves and discolorations that may be limited to the main veins but usually extend to the whole leaf blade. Depending on the variety, leaves turn chlorotic to yellow or develop a red to purple-reddish color. As the season progresses, affected leaves may become crispy and brittle [11]. Rows of black pustules can be observed along the internodes of shoots and, at the end of summer, lignification may be incomplete or lacking. In addition, phytoplasma infection may cause flower abortion as well as shriveling and early drying of grapes, which has a major economic impact on viticulture [1]. So far, no cultivated or wild grapevine species is known to be resistant, although grapevine varieties differ in their susceptibility towards the pathogens. Hence, some Vitis vinifera cultivars, most rootstocks, and wild Vitis species show milder symptoms than usual or may be completely symptomless carriers of GY [12].
Due to the complex epidemiology of BN and PGY (with grapevine being only an occasional host) and the lack of curative in-field treatments, disease management is almost exclusively based on prophylactic measures [13]. Besides planting phytoplasma-free propagation material, identification of infected vines and their subsequent uprooting are among the most common approaches [14]. A disease detection system could, therefore, be a useful tool in GY management.
In recent years, sensor-based methods have widely been applied for the noninvasive and objective analysis of different plant diseases. Since most plant pathogens interact with their hosts in a way that leads to biochemical and biophysical modifications, leaf spectral patterns change upon an infection and during symptom development [15]. These alterations in leaf optical properties can be detected by spectral sensors that capture the plants' reflectance not only in the visible range of light (VIS; 400-700 nm) but also in the near infrared (NIR; 700-1000 nm) and short-wave infrared (SWIR; 1000-2500 nm). Thereby, either the entire spectral region (hyperspectral) or selected spectral bands (multispectral) can be used [16].
In the work of Arens et al. [17], hyperspectral imaging was used for the detection of Cercospora beticola infection in sugar beet reaching classification accuracies of up to 99.9%. Polder et al. [18] performed a multispectral analysis directly in the field to identify tulips infected with tulip breaking virus, and Behmann et al. [19] showed the possibility to gain pre-symptomatic information of Puccinia triticina and Zymoseptoria tritici pathogenesis using spatial reference points on the leaves of wheat. Further studies were conducted regarding, e.g., the detection of Venturia inaequalis on apple [20] or powdery mildew on barley [21].
A first approach in phytoplasma detection was performed by Barthel et al. [22], who investigated the potential of SWIR spectroscopy for the detection of apple proliferation. Regarding GY, most studies focused on Flavescence dorée (FD), one of the most severe phytoplasma-caused diseases and therefore subject to quarantine restrictions in Europe [1]. Albetis et al. [23,24] tested the suitability of multispectral imaging in combination with an unmanned aerial vehicle (UAV) for the airborne detection of FD symptoms under field conditions. For this purpose, they successfully analyzed several spectral bands, vegetation indices, and biophysical parameters. In different field studies, Al-Saddik et al. [25-27] used a portable spectroradiometer (350-2500 nm) to collect hyperspectral reflectance data of healthy and symptomatic leaves, thereby reaching classification accuracies of more than 90%.
So far, only limited attention has been paid to the detection of GY other than FD. Therefore, this study focuses on the detection of BN and PGY using hyperspectral imaging in the range of 400-2500 nm. For this purpose, greenhouse plants and shoots collected in the field were recorded under laboratory conditions in order to (i) discriminate healthy and symptomatic plants, (ii) test the detection of infected but symptomless vines, and (iii) identify the most relevant wavelengths for the differentiation tasks.

Plant Material
Plant material includes greenhouse plants derived from wood cuttings and plant samples collected in the field. Hyperspectral data acquisition of this plant material was conducted when disease symptoms were fully expressed, i.e., in May 2017 and 2018 for greenhouse plants, and September 2018 for field samples. All plants obtained from the cuttings were grown from February until August in a greenhouse adjusted to 26/22 °C (day/night) and a photoperiod of 16 h per day. They were kept in plastic pots (1 L volume) filled with 80% substrate (Fruhstorfer Erde Typ Tray Substrat + Perlite, Hawita Gruppe GmbH, Vechta, Germany) and 20% sand. Plants were watered twice a week and fertilized once a week (Hakaphos® soft, Compo Expert GmbH, Münster, Germany).
Grapevines were visually inspected on a regular basis and hyperspectral images of all plants were recorded once in May after symptoms had developed and did not expand further.

Field Samples
In September 2018, 151 symptomatic and nonsymptomatic shoots of red- and white-berried grapevines were collected at different locations in the Palatinate and Middle Rhine region, Germany. Hyperspectral data were recorded on the same day the samples were taken. All shoots were visually inspected and tested by PCR (see Section 2.2) for phytoplasma infection (Table 2). Since only three shoots were infected by both BN and PGY, the hyperspectral data of these mixed infections were not analyzed.

Molecular Analysis
For extraction of total nucleic acids, a modified protocol of the CTAB method described by Maixner et al. [28] was used. Leaf midribs (120 mg) were ground in 2 mL microcentrifuge tubes with 3-4 mL CTAB buffer (2% cetyltrimethylammonium bromide, 1.4 M NaCl, 100 mM Tris-HCl, pH 8.0, 20 µM EDTA, 2% PVP-40, 0.2% mercaptoethanol). Further, 1.5 mL of the homogenate was incubated for 30 min at 65 °C and 1300 min⁻¹. After centrifugation for 10 min at 1000 g, the supernatant was transferred to a fresh microcentrifuge tube and an equal volume of chloroform/isoamyl alcohol (24:1, v/v) was added. The mixture was centrifuged for 10 min at 10,000 g, the supernatant was transferred to a sterile 1.5 mL microcentrifuge tube, and 500 µL of ice-cold (−20 °C) isopropanol was added. The preparation was stored for 30 min at −20 °C and centrifuged at 15,000 g for 15 min at 4 °C. The pellet was washed with 70% ethanol, dried in a vacuum concentrator, and resuspended in 150 µL TE buffer.
Phytoplasmas of the 16SrV taxonomic group (elm yellows group), which includes FD phytoplasma, were detected on total DNA extracts by amplification of parts of the 16S rRNA gene. A first PCR reaction was run with universal phytoplasma primers U5/P7 [29,30] followed by a nested PCR using 16SrV group-specific primers fAY/rEY [31]. For detection of phytoplasmas of the Stolbur group (16SrXII), including the agent of Bois noir disease, a first amplification with the universal primers U5/P7 was followed by a nested PCR using the primers fStol/rStol [28]. The PCR products obtained were analyzed by electrophoresis in a 1.5% horizontal agarose gel in TAE buffer (40 mM Tris-acetate, 1 mM EDTA, pH 8.0). DNA was stained with ethidium bromide and visualized by UV light.

Hyperspectral Sensors and Data Acquisition
In this study, hyperspectral imaging was performed covering the spectral range from 400 to 2500 nm. For this purpose, two hyperspectral line scanning sensors (Norsk Elektro Optikk A/S, Skedsmokorset, Norway) were used: (i) HySpex VNIR 1800 to record spectra in the visible and near-infrared range (VNIR; 400-1000 nm) and (ii) HySpex SWIR 384, which captures the short-wave infrared range (SWIR; 1000-2500 nm). Further sensor details can be found in Table 3.
In Figure 1, the experimental setup for hyperspectral data acquisition is depicted. In order to achieve reproducible measuring conditions, hyperspectral data were captured in an imaging Blackbox that shields the samples from disturbing environmental factors. Plant samples were placed as flat as possible in the Blackbox, and the two sensors were moved at a distance of 1 m above the samples by a horizontal translation stage to obtain images with the spatial resolution given in Table 3. For illumination, a 1000 W short-wave halogen spotlight (Hedler C12, Hedler Systemlicht, Runkel/Lahn, Germany) was installed between the cameras. A low reflective support surface assured minimal reflectance in the image background. In every image, a PTFE (polytetrafluoroethylene) spectralon (Sphere Optics GmbH, Herrsching, Germany) was included for calibration. The cameras' integration time was set to yield ~90% signal within the cameras' dynamic range. The camera frame period was set by the image acquisition software to match the driving speed of the linear stage in order to get a geometrically correct image. Image acquisition and radiometric calibration were performed using the camera vendor's acquisition software HySpex Ground.

Data Calibration and Labeling
Data calibration and labeling were performed similarly to previous studies [32,33]. Calibration measurements were performed before and after every plant sample. For this purpose, reflectance per pixel was calculated as
$$R(\lambda) = \frac{I(\lambda) - I_{\mathrm{dark}}(\lambda)}{I_{\mathrm{white}}(\lambda) - I_{\mathrm{dark}}(\lambda)}$$
where $I(\lambda)$ is the image pixel intensity at wavelength $\lambda$, $I_{\mathrm{white}}(\lambda)$ the intensity while recording the spectralon device (white reference), and $I_{\mathrm{dark}}(\lambda)$ the intensity when measured with closed camera shutter (dark current). White and dark values were obtained individually per pixel on the scan line in order to compensate for illumination gradients generated by the halogen light source.
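The white/dark calibration described above can be illustrated with a minimal Python sketch. It assumes the usual flat-field form, reflectance = (raw − dark) / (white − dark), and treats the white reference as a perfect reflector; the function name `calibrate_pixel` and the example intensities are hypothetical and not taken from the study.

```python
def calibrate_pixel(raw, white, dark):
    """Convert raw intensities of one scan-line pixel to reflectance.

    raw, white, dark: sequences with one intensity per wavelength band,
    all recorded at the same pixel position on the scan line.
    Assumption: the white reference reflects 100% at every band.
    """
    reflectance = []
    for i_raw, i_white, i_dark in zip(raw, white, dark):
        span = i_white - i_dark
        if span == 0:
            # Dead band: white and dark readings coincide, no valid value.
            reflectance.append(float("nan"))
        else:
            reflectance.append((i_raw - i_dark) / span)
    return reflectance


# Hypothetical two-band example for a single pixel.
example = calibrate_pixel(raw=[600, 300], white=[1100, 1100], dark=[100, 100])
```

Because the white and dark readings are taken at the same scan-line position as the raw value, an illumination gradient across the line cancels in the ratio.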
In order to remove background and nonrelevant sample parts like stems and pots, a segmentation for the leaf material was performed. For this purpose, a number of images were labeled by hand and a model based on the reflectance spectrum was trained to classify each spectral pixel into vegetation, background, and nonrelevant sample parts. A Multi-Layer Perceptron (MLP) [34] with SNV normalization [35] performed best and was, therefore, used for all segmentation purposes. The model training was performed using the AutoML platform HawkSpex® Flow developed by the Fraunhofer IFF. Data from both cameras were treated separately throughout the study, resulting in different models for VNIR and SWIR images.
An image registration was attempted but did not yield satisfactory results. For this purpose, we used MATLAB's (MathWorks Inc., Natick, MA, USA) methods for phase correlation and nonrigid image registration. Registration was performed using two grey value channel images from both cameras (VNIR at 785 nm and SWIR at 1056 nm). The VNIR camera was mapped to the SWIR camera that has a lower resolution. The same image registration mapping was then applied to all SWIR channel images.
In order to generate a labeled dataset for the subsequent modeling, data were labeled using the provided laboratory results (Table 1 and Table 2). Based on the visual assessment and the molecular analysis, all pixels per plant were labeled as either healthy or infected. For each label class, 10,000 pixel-spectra were randomly sampled from all available imaging data. This was seen as a good compromise between data representation and computational demand for generating the subsequent models. Figure 2 summarizes the steps from image acquisition over pre-processing to data modeling.
Figure 2. Overview of the principal workflow from image acquisition to data modeling. Online steps were processed directly during data acquisition and offline steps were processed in the computer infrastructure at Fraunhofer Institute for Factory Operation and Automation (IFF).
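The class-balanced random sampling of pixel spectra described above (10,000 spectra per label class) could be sketched as follows. This is only an illustration in plain Python: the function name `sample_balanced`, the fixed seed, and the dictionary layout are assumptions, and the labels follow the −1 (healthy) / +1 (infected) coding used for the models in this study.

```python
import random


def sample_balanced(spectra_by_label, n_per_class, seed=0):
    """Draw the same number of spectra from every label class.

    spectra_by_label: dict mapping a label (e.g. -1 healthy, +1 infected)
    to a list of pixel spectra (each spectrum is a list of reflectances).
    Returns a shuffled list of (spectrum, label) pairs.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible draw (assumption)
    dataset = []
    for label, spectra in spectra_by_label.items():
        if len(spectra) < n_per_class:
            raise ValueError(f"class {label!r} has only {len(spectra)} spectra")
        # Sampling without replacement within each class.
        for spectrum in rng.sample(spectra, n_per_class):
            dataset.append((spectrum, label))
    rng.shuffle(dataset)
    return dataset
```

With the study's setting one would call it with `n_per_class=10_000`; drawing the same number per class keeps the binary problem balanced regardless of how many leaf pixels each plant contributes.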

Model Development and Application
Each detection problem in this paper can be described as a binary classification problem with two classes: infected vs. healthy. In order to map the spectral reflectance data to a detection decision, a machine learning approach was followed, yielding a Soft-Sensor detection system [36]. In this study, a number of spectral pre-processing methods in combination with machine learning models were tested for their detection performance (Table 4 and Table 5). Pre-processing is typically performed on reflectance data to minimize the effect of geometry on the measured reflectance, which leads to offset and gain effects [34]. The output of the pre-processing is then used as input to the machine learning model. In Table 4, the pre-processing methods used in this study are listed.
Table 4. Pre-processing methods used in this study. Calculation is performed on the dataset to generate the input to the machine learning process. [40-42]
Hyperspectral data of this study were analyzed using four different machine learning algorithms. These include: (i) Linear Discriminant Analysis (LDA), (ii) Partial Least Squares (PLS), (iii) Multi-Layer Perceptron (MLP), and (iv) Radial-Basis Function Network with Relevance (rRBF) (Table 5). The chosen methods differ in how data classes are separated. The LDA and PLS models use a single linear hyperplane as decision boundary, but acquire their parameters through different optimization methods. While LDA optimizes for class discrimination, PLS optimizes for input to output correlation. The MLP and rRBF models are non-linear and allow a more complex decision boundary. The MLP generally works best for datasets that can be separated using a small number of hyperplanes, whereas an RBF network is able to separate more complex shaped data clusters, since it uses receptive fields in combination with hyperplanes. For the output of these models, a coding of −1 for control/healthy and +1 for pathogen infection was used.
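To make the offset and gain argument concrete, the sketch below shows standard normal variate (SNV) scaling, which this study names for the segmentation model; whether SNV is also among the methods of Table 4 cannot be confirmed from the text, so it serves only as an example of this class of pre-processing. The function name `snv` and the use of the sample standard deviation are assumptions.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by itself.

    Subtracting the spectrum's own mean removes an additive offset and
    dividing by its own standard deviation removes a multiplicative gain.
    Assumption: the sample standard deviation (n - 1) is used.
    """
    m = mean(spectrum)
    s = stdev(spectrum)
    if s == 0:
        raise ValueError("flat spectrum: standard deviation is zero")
    return [(value - m) / s for value in spectrum]


# A spectrum and a copy distorted by gain 2 and offset 5 map to the same result.
base = [0.10, 0.25, 0.40, 0.30]
distorted = [2.0 * v + 5.0 for v in base]
same = all(abs(a - b) < 1e-9 for a, b in zip(snv(base), snv(distorted)))
```

Each spectrum is normalized on its own, so no statistics from the training set are needed at prediction time.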

In order to test the model on unseen data, an n-fold cross validation with n = 10 was performed with the dataset being divided into n parts. The model was then optimized on n-1 folds while being tested on the nth fold. The modeling process was performed with all possible combinations without repetition of folds. As model performance indicators, the average and standard deviation of each performance value were calculated across modeling runs and reported in the result tables.
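The 10-fold procedure described above can be sketched in plain Python. The helper names `k_fold_indices` and `cross_validate`, the shuffling seed, and the callback `train_and_score` (which stands in for training any of the four models and returning one performance value) are hypothetical and only illustrate the fold logic.

```python
import random
from statistics import mean, stdev


def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Shuffle sample indices and deal them into n_folds disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[k::n_folds] for k in range(n_folds)]


def cross_validate(n_samples, train_and_score, n_folds=10, seed=0):
    """Use each fold once as test set and the other n_folds - 1 for training.

    train_and_score(train_idx, test_idx) must return a single number,
    e.g. the classification accuracy on the held-out fold.
    Returns the mean and standard deviation over all folds.
    """
    folds = k_fold_indices(n_samples, n_folds, seed)
    scores = []
    for k, test_idx in enumerate(folds):
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return mean(scores), stdev(scores)
```

Reusing the same seed for every model gives identical dataset splits, which is what makes the resulting averages comparable between models.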
Performance of all models was assessed using the following performance criteria (with sample being defined as one spectrum labeled with its respective class): classification accuracy (CA), true positive rate (TPR), and false positive rate (FPR). In their standard form, these are
$$\mathrm{CA} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{TPR} = \frac{TP}{TP + FN}, \qquad \mathrm{FPR} = \frac{FP}{FP + TN}$$
where $TP$ and $TN$ denote correctly classified infected and healthy samples, $FP$ healthy samples classified as infected, and $FN$ infected samples classified as healthy.
After model training, the best performing model in terms of classification accuracy was selected and applied to the hyperspectral images, resulting in a label for each vegetation pixel. In order to evaluate the detection performance of the selected model, the percentage of all considered vegetation pixels classified as either healthy or infected was calculated, and the label with the highest occurrence was regarded as the representing label for the plant sample (majority vote) (Figure 3).
Figure 3. Depicted are calibrated reflectance images at 800 nm coded as grey scale (a,b) and RGB (red-green-blue color space) images for visible and near-infrared range (VNIR) (c,d) and short-wave infrared (SWIR) (e,f), which are reconstructed for visualization only. The binary classifier for PGY detection was applied to all leaf pixels labeling them as either healthy or infected (g-j). Green pixels were classified as healthy by the machine learning algorithm and red pixels were classified as infected. The percentage of symptomatic pixels was calculated for VNIR (g,h) and SWIR (i,j) images.
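The three criteria reported in the result tables (CA, TPR, FPR) can be computed from true and predicted labels as in the following sketch. It uses the standard confusion-matrix definitions and the −1/+1 label coding of this study; the function name `detection_metrics` and the toy labels are illustrative assumptions.

```python
def detection_metrics(y_true, y_pred, positive=1):
    """Return classification accuracy, true and false positive rate.

    y_true, y_pred: equal-length sequences of labels (+1 infected,
    -1 healthy). Rates with an empty denominator are returned as NaN.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        "CA": (tp + tn) / total if total else nan,
        "TPR": tp / (tp + fn) if (tp + fn) else nan,
        "FPR": fp / (fp + tn) if (fp + tn) else nan,
    }


# Toy example: 4 infected and 6 healthy samples.
truth = [1, 1, 1, 1, -1, -1, -1, -1, -1, -1]
pred = [1, 1, 1, -1, -1, -1, -1, -1, 1, 1]
metrics = detection_metrics(truth, pred)
```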

Spectral Relevance and Important Wavelengths
While optimizing the Radial Basis Function Network (RBF), a weighting per wavelength is optimized as well and indicates the importance or relevance of a wavelength's contribution to the detection task [41]. Due to the high correlation of wavebands, such a weighting profile is rarely only activated at a single wavelength. In addition, a multispectral camera system measures reflectance with a resolution that is approximately an order of magnitude lower than that of a hyperspectral camera. Therefore, we developed an algorithm with which multispectral channels can be placed at optimal positions in the spectral range and the relevant waveband utilization can be maximized [43]. For this purpose, the relevance profile was used as probability density function (pdf), and an automatic algorithm was applied that generated 100,000 random wavelength values based on this pdf. Consequently, the generated values are denser in areas of high relevance than in areas of low relevance. This data set was then used to train a Neural Gas vector quantization algorithm [44], which placed a set number of wavelength values in a way to minimize the quantization error measured by the mean squared error between placed wavelengths and best machine generated wavelengths.
Naturally, the Neural Gas algorithm covered denser areas with more wavelength candidates than less dense areas. Wavelength candidates were then ordered by their interpolated value from the relevance curve. Consequently, the first wavelength candidate is of highest importance to the detection task and relevance decreases from one wavelength candidate to the next. A maximum of 10 wavelength candidates each for VNIR and SWIR was set in this study, as more were not considered useful. An example of the wavelength selection process is given in Figure 4.
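The selection procedure described above (sampling from the relevance profile, Neural Gas placement, ranking by interpolated relevance) can be sketched in plain Python. This is not the implementation used in the study: the function names, the learning-rate and neighbourhood schedules, the number of training steps, and the choice to sample band centres directly are all assumptions made for illustration.

```python
import math
import random
from bisect import bisect_left


def neural_gas_1d(samples, n_units, n_steps, rng, eps=(0.5, 0.01), lam_end=0.01):
    """Place n_units prototypes on 1-D samples with rank-based updates."""
    units = [float(u) for u in rng.sample(samples, n_units)]
    lam_start = n_units / 2.0
    for t in range(n_steps):
        frac = t / max(n_steps - 1, 1)
        eps_t = eps[0] * (eps[1] / eps[0]) ** frac      # decaying step size
        lam_t = lam_start * (lam_end / lam_start) ** frac  # shrinking neighbourhood
        x = rng.choice(samples)
        # Rank all prototypes by distance to the sample; closer ones move more.
        order = sorted(range(n_units), key=lambda i: abs(units[i] - x))
        for rank, i in enumerate(order):
            units[i] += eps_t * math.exp(-rank / lam_t) * (x - units[i])
    return units


def interpolate_relevance(wavelengths, relevance, x):
    """Linear interpolation of the relevance curve at wavelength x."""
    j = bisect_left(wavelengths, x)
    if j <= 0:
        return relevance[0]
    if j >= len(wavelengths):
        return relevance[-1]
    x0, x1 = wavelengths[j - 1], wavelengths[j]
    w = (x - x0) / (x1 - x0)
    return relevance[j - 1] * (1 - w) + relevance[j] * w


def select_wavelengths(wavelengths, relevance, n_units=10,
                       n_samples=100_000, n_steps=20_000, seed=0):
    """Return wavelength candidates ordered from most to least relevant."""
    rng = random.Random(seed)
    # Treat the non-negative relevance profile as an unnormalised pdf.
    samples = rng.choices(wavelengths, weights=relevance, k=n_samples)
    units = neural_gas_1d(samples, n_units, n_steps, rng)
    return sorted(units,
                  key=lambda u: interpolate_relevance(wavelengths, relevance, u),
                  reverse=True)
```

A call such as `select_wavelengths(band_centres, relevance_profile)` would return ten candidates, the first being the one with the highest interpolated relevance; `band_centres` and `relevance_profile` are placeholders for sorted band wavelengths and the model's relevance weights.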

Greenhouse Plants
In this study, four different machine learning models were applied to the hyperspectral data recorded. Table 6 shows classification accuracies (CAs), true positive rates (TPRs), and false positive rates (FPRs) of all models for the detection of phytoplasma-infected greenhouse plants. Clear differences could be seen between the performances of the four models within each detection task. While LDA and PLS performed worst, with comparable results, MLP achieved significantly higher CAs and TPRs. However, rRBF performed best and was therefore used for the further analysis of greenhouse plants.

Field Samples
The same four models were also used to analyze symptomatic and nonsymptomatic shoots collected in the field. Results of the different machine learning approaches are given in Table 7. In contrast to greenhouse plants, no significant differences could be observed between the models, as their performances were very similar within each detection task. However, MLP performed slightly better and was, therefore, used for the further analysis of samples collected in the field.
Table 7. Results of the different machine learning approaches for disease detection of symptomatic field material derived from red- and white-berried cultivars. Best machine learning approach according to its classification accuracy is highlighted in bold.

Symptomatic Greenhouse Plants
During model development, all pixels were evaluated without considering spatial scales. In order to achieve a decision per vine, these models were then applied on plant scale. A majority voting of all pixel results was performed, whereby the whole plant was classified as either healthy or infected. Table 8 shows the results for the detection of phytoplasma-induced leaf symptoms of greenhouse plants. Identification of PGY-induced symptoms appeared to be easier than the detection of BN-induced symptoms, which is indicated by higher TPRs and corresponding lower FPRs. Here, TPRs of 81% and 100% could be achieved for BN and PGY, respectively, in both wavelength ranges. However, FPRs were significantly higher in the VNIR range, making the SWIR range the better predictor of plants' disease status.
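The plant-level decision by majority vote described above can be sketched as follows. The function name `majority_vote` is hypothetical, labels follow the −1 (healthy) / +1 (infected) coding, and resolving a tie as healthy is an assumption, since the text does not state how ties were handled.

```python
def majority_vote(pixel_labels):
    """Aggregate per-pixel labels of one plant into a single decision.

    pixel_labels: iterable of -1 (healthy) or +1 (infected), one entry per
    vegetation pixel. Returns (plant_label, fraction_of_infected_pixels).
    Assumption: a tie is resolved as healthy (-1).
    """
    labels = list(pixel_labels)
    if not labels:
        raise ValueError("no vegetation pixels to vote on")
    infected = sum(1 for label in labels if label == 1)
    healthy = len(labels) - infected
    plant_label = 1 if infected > healthy else -1
    return plant_label, infected / len(labels)
```

For example, a plant with 3 infected pixels out of 4 is labeled infected with an infected fraction of 0.75; the fraction corresponds to the percentage of symptomatic pixels reported per image.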

Nonsymptomatic Greenhouse Plants
The detection of infected but symptomless plants was only possible for BN. Table 9 shows CAs, TPRs, and FPRs for the model application on plant level. Identification of symptomless greenhouse plants seemed to be more challenging than the detection of symptomatic plants as is indicated by lower model performance. TPRs of 68% and 79% were achieved for VNIR and SWIR, respectively. However, 29% (VNIR) and 41% (SWIR) of all pixels were falsely classified as symptomatic leading to rather low CAs of 68% and 64% for VNIR and SWIR, respectively.

Symptomatic Field Material
Symptom detection seemed to be easier for shoots collected in the field than for greenhouse plants (Table 10). Satisfying results could be accomplished for both diseases with TPRs of 95-100% and FPRs of 0-7%. Although symptom detection was successful for both diseases, PGY detection performed slightly better, reaching detection rates of 100% without misclassifications. In general, no differences could be seen between VNIR and SWIR, so both wavelength ranges seem to be suitable for the differentiation task.

Greenhouse Plants
The machine learning approach allows the calculation of relevance profiles that provide information about the most important wavelengths for the detection tasks. Relevance profiles for the identification of phytoplasma-infected greenhouse plants are depicted in Figure 5. Based on these relevance profiles up to 10 local maxima were selected. They are listed according to their importance in Table 11. Clear differences can be seen in the three differentiation tasks in both VNIR and SWIR.
Regarding symptomatic plants infected with PGY, most important wavelengths in VNIR are around 459-492 and 679 nm with some minor peaks at 748 and 905 nm. The peak around 679 nm overlaps with that of BN-infected and symptomatic plants at 689 nm showing the importance of this spectral region for the discrimination of symptomatic and control plants. However, no further concordance could be shown between the two diseases. Regarding BN-infected but symptomless plants, wavelengths at 503, 616, and 734 nm as well as in the range of 932-972 nm seem to be of highest relevance.
In the SWIR range, wavelengths around 1400 and 1865 nm are of importance for all three detection tasks, although peaks seem to be slightly shifted.

Field Material
Relevance profiles for the detection of symptomatic field material are depicted in Figure 6 and exact wavelengths are given in Table 12. Regarding PGY, most important wavelengths are around 557, 639, 672, and in the range of 801-940 nm. Some of these are also relevant for the identification of BN-infected shoots of white varieties, e.g., 637, 553, and 812-966 nm with an additional important wavelength at 741 nm. Obvious differences become visible between symptomatic shoots of white and red varieties with 528, 586-626, and 673 nm being the key wavelength ranges for the detection of BN-symptomatic shoots in red varieties. In contrast, a clear pattern can be seen in SWIR with wavelengths around 1585, 2135, and 2300 nm being of relevance in all three differentiation tasks. Small differences between PGY- and BN-infected shoots can be found between 1880 and 2000 nm. Here, important wavelengths are slightly shifted. Further differences become apparent in the range of 1050-1490 nm, with 1072 nm being important for the identification of PGY and 1350 nm for BN.
However, when comparing relevance spectra of greenhouse plants and field material, only a few concordances can be found: for PGY at 675 and 2130 nm, and for BN only at around 540 nm.
Table 11. The 10 most informative spectral bands for the detection of phytoplasma-infected greenhouse plants.

Discussion
For high-dimensional data, it is difficult to determine the best machine learning model in advance because every model rests on different mathematical assumptions, which may lead to different performances when applied to the same dataset. Therefore, four different machine learning models were tested in this study. While for greenhouse plants rRBF showed highest CAs and TPRs, MLP performed best for shoots collected in the field. High-dimensional data are also not easy to visualize, which makes it hard to check whether a dataset meets the assumptions of a given model. The training was done using careful validation and performed on identical dataset splits in order to generate comparable results. Furthermore, we analyzed the relevance of wavelength ranges to make the modeling outcome more interpretable. Comparing different classification algorithms is a common approach in hyperspectral data analysis, as has been shown in various studies. For the detection of Botrytis cinerea and Colletotrichum acutatum infections on strawberry fruits, Siedliska et al. [45] analyzed four different classification methods of which a Backpropagation Neural Network (BNN) showed highest accuracy. Wiegmann et al. [46] used PLS, MLP, and an RBF network with Transfer Learning for the prediction of nutrient content in barley grain. Here, PLS was identified as the best compromise between good prediction performance and lowest computing demand. For the detection of laurel wilt disease on avocado plants, Abdulridha et al. [47] also applied MLP as well as RBF neural networks and additionally performed a stepwise discriminant (STEPDISC) analysis on hyperspectral data in the VNIR region. In both early and late infection stages, MLP performed significantly better than STEPDISC and RBF.
Symptom development of BN and PGY is known to be influenced by environmental factors, scion-rootstock combination, and the grapevine cultivar [1]. Therefore, ungrafted vines grown under controlled greenhouse conditions were used as a first approach to assess the potential of hyperspectral imaging for GY disease detection. While high classification rates could be achieved for PGY-infected plants, BN symptom detection performed significantly worse with CAs of 68% and 79% for VNIR and SWIR, respectively. Although symptoms of GY are indistinguishable, their appearance on a vine may differ. Phytoplasmas associated with the 16SrV group like PGY or FD usually induce systemic symptoms, thus affecting the entire plant. BN-infected vines, however, express symptoms only partially on some shoots, while others seem to remain healthy [2]. This could also be observed for greenhouse plants in the present study. PGY-infected plants showed symptoms along the whole shoot; in contrast, vines infected by BN developed only some symptomatic leaves and symptoms did not further expand as the season progressed. This mixture of symptomatic and nonsymptomatic leaves might be the reason for lower CAs in BN-infected greenhouse plants, especially since disease detection was performed successfully using plant material collected in the field.
In general, phytoplasmas are erratically distributed in their host plants and their location as well as titer are assumed to play a significant role in symptom development [48]. In combination with the fact that some cultivars, most rootstocks, and wild Vitis species may be completely symptomless [12], identifying disease carriers would be a key element in reducing pathogen reservoirs in vineyards and nurseries and especially in rootstock motherblocks. The feasibility of such an approach has already been demonstrated for other phloem-limited pathogens such as citrus tristeza virus or grapevine leafroll-associated viruses [32,49]. Unfortunately, the detection of BN-infected but symptomless plants was not successful under greenhouse conditions, since VNIR and SWIR showed poor classification accuracies of 68% and 64%, respectively, which is close to a random classifier. Since results for symptom detection were significantly better for PGY- than for BN-infected vines, it would be interesting to evaluate in further studies whether this effect could also be observed in symptomless PGY-infected plants.
Even though analyzing greenhouse-grown vines allows environmental factors to be precisely controlled, phenotypes strongly differ from those grown in the field since grapevines are naturally large perennial plants. Therefore, as a next step, shoots collected in the field were recorded under laboratory conditions. In general, better results could be obtained for plant material from the field than for greenhouse plants. This might be due to a higher number of symptomatic leaves per sample, as shoots from field-grown grapevines were completely symptomatic and considerably larger than greenhouse vines. Moreover, symptoms slightly varied between greenhouse and field. As the season progresses, symptomatic leaves may turn crispy and brittle [11], thereby influencing especially the NIR spectral range that is strongly affected by cell tissue structures [15]. However, this could only be observed for field material and not for greenhouse plants. Furthermore, Mannini et al. [50] found that vegetative propagation of infected vines might lead to reduced infection intensity in the progeny, which could also be one reason for lower CAs in greenhouse plants.
In summary, disease detection could be performed successfully for both BN and PGY using field-grown plant material, leading to the assumption that these two GY may also be detectable directly in the field, as has been demonstrated for FD [23-27]. Identifying the most relevant wavelengths for a multispectral disease detection system, and the resulting reduction in data dimensionality, might be a promising concept for transferring BN and PGY detection into the field. Assessment of optimal spectral bands is a common approach that has widely been used for the detection of tomato spotted wilt virus [51], anthracnose on strawberries [52], three sugar beet diseases [53], or powdery mildew and Esca on grapevines [33,54]. In this study, a maximum of 10 relevant wavelength bands were selected separately for VNIR and SWIR, as more were not considered realistic for multispectral systems. Many of these wavelengths could be found in the visible range of the electromagnetic spectrum for both greenhouse plants and field material. Phytoplasmas are known to inhabit the phloem of their host plants; thus, callose is deposited near sieve plates and plasmodesmata to hinder pathogen spread [55]. As a consequence, phloem transport is inhibited, leading to an impairment of photosynthetic activity [56]. Moreover, infection typically causes a decrease in chlorophylls and carotenoids [57]. These changes in pigment content are predominantly expressed in the range of 400-700 nm [58]. Chlorophyll a and b strongly absorb incoming light in the blue and red region of the spectrum, thereby providing energy for photosynthesis [59]. Besides chlorophylls, carotenoids are the main factors influencing reflectance characteristics in the visible range of light, especially in the blue region [60].
The differences observed in VNIR reflectance spectra of BN-infected field material collected from red- and white-berried grapevines may be explained by an increase in anthocyanin content during symptom development [61] that is not observable in white cultivars since they lack several genes of the flavonoid biosynthetic pathway [62]. Anthocyanins affect leaf reflectance mainly around 550 nm. Based on this finding, Gitelson et al. [63] introduced the anthocyanin reflectance index (ARI). In general, several vegetation indices (VIs) have been described to estimate leaf pigment contents [16,64], and some of them might be applicable to the spectral data of this study. However, common VIs typically lack disease specificity; therefore, attempts have been made to develop individual spectral disease indices (SDIs) [53]. The generation of optimal wavelength pairs for suitable BN and PGY indices might be a subject for further studies.
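To illustrate how such indices are computed from reflectance spectra, the following minimal sketch implements the ARI in its commonly stated form, 1/R550 - 1/R700, together with a generic normalized difference of two bands as a simple building block for wavelength-pair indices. The helper names and the example spectrum are hypothetical, and the normalized difference is only a generic form, not a validated SDI for BN or PGY.

```python
import numpy as np

def nearest_band(wavelengths, target_nm):
    """Index of the band whose center wavelength is closest to target_nm."""
    return int(np.argmin(np.abs(np.asarray(wavelengths, dtype=float) - target_nm)))

def anthocyanin_reflectance_index(reflectance, wavelengths):
    """ARI = 1/R550 - 1/R700, computed along the last (band) axis."""
    reflectance = np.asarray(reflectance, dtype=float)
    r550 = reflectance[..., nearest_band(wavelengths, 550)]
    r700 = reflectance[..., nearest_band(wavelengths, 700)]
    return 1.0 / r550 - 1.0 / r700

def normalized_difference(reflectance, wavelengths, nm_a, nm_b):
    """Generic (Ra - Rb) / (Ra + Rb) index for a candidate wavelength pair."""
    reflectance = np.asarray(reflectance, dtype=float)
    ra = reflectance[..., nearest_band(wavelengths, nm_a)]
    rb = reflectance[..., nearest_band(wavelengths, nm_b)]
    return (ra - rb) / (ra + rb)

# Hypothetical leaf spectrum (reflectance as a fraction between 0 and 1)
wavelengths = np.array([500, 550, 600, 700, 800])
spectrum = np.array([0.05, 0.10, 0.08, 0.20, 0.50])
ari = anthocyanin_reflectance_index(spectrum, wavelengths)
nd = normalized_difference(spectrum, wavelengths, 800, 700)
```

Because both functions operate on the last axis, they can be applied to a single spectrum or to a full image cube without modification.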
Regarding relevance profiles in the SWIR range, a consistent pattern was seen for field material across the three detection tasks, with 1585, 2135, and 2300 nm being of high importance. As described by Curran et al. [59], wavelengths around 1580 nm are strongly associated with the absorption of starch and sugar. Due to the disturbed photosynthesis and phloem blockage upon phytoplasma infection, synthesis and transport of carbohydrates and starch are modified, leading to their accumulation in mature leaves [65]. Furthermore, phytoplasma infection may cause a significant decrease in lignin content [66]. According to Nagler et al. [67], cellulose and lignin have a relatively broad absorption feature around 2100 nm. In general, the range of 2100-2300 nm is heavily affected not only by leaf cellulose and lignin but also by protein content [68], which is known to be strongly reduced in many phytoplasma-infected plants [69]. However, these changes typically occur in heavily affected leaves only, which could explain the differences between relevance profiles of greenhouse and field material, as field material was more affected by phytoplasma infection.
The results of this study are based on data from a single year; therefore, the selected spectral bands need to be verified in further experimental years. Sinha et al. [70] identified relevant wavelengths for the identification of grapevine leafroll-associated virus 3 in Cabernet Sauvignon vines over two consecutive years, but transfer of these wavelengths from one year to the other was only partially satisfactory. Moreover, Al-Saddik et al. [27,71] tried to assess optimal spectral bands for FD detection in different white and red grapevine cultivars, but the best bands selected differed from one case to another. Nevertheless, in further studies, it will be of interest to transfer BN and PGY detection to the field.

Conclusions and Perspectives
In this study, greenhouse plants and shoots collected in the field were analyzed under controlled laboratory conditions to evaluate the potential of hyperspectral imaging for disease detection. So far, no similar work on BN and PGY has been conducted, as previous studies focused mainly on FD. While identification of PGY-infected greenhouse plants was successful, reaching CAs of up to 96%, symptom detection of BN needs to be improved. Further investigations will also be necessary for infected but symptomless plants. Identification of these plants could help to improve nurseries' ability to provide phytoplasma-free propagation material. In contrast, symptomatic field material could be easily classified with CAs of 96-100%. Further studies could expand upon our work by transferring the developed disease detection models into the field. For viticulture, a tractor-mounted system might be a suitable method since data could be collected in parallel to fieldwork. An airborne approach using UAVs could be a fast and flexible alternative. For this purpose, it would be useful to develop a multispectral sensor using phytoplasma-specific wavelength bands, thereby reducing data dimensionality and computational time. As a first step, the most relevant wavelengths for each differentiation task were identified in this study. Selected bands differed from one case to the other, except for field material in the SWIR spectral range, where consistent wavelengths were observed. So far, Germany is considered FD-free, but it cannot be excluded that the disease will be present in the future. Due to the relatedness of FD and PGY, the results of this study could be used as a basis to develop a sensor-based monitoring system that could then help to fulfill the required quarantine restrictions. In general, many future applications are possible; therefore, complementary studies with an increased number of samples should be conducted in order to validate the results presented in this work.