Non-Destructive Early Detection and Quantitative Severity Stage Classification of Tomato Chlorosis Virus (ToCV) Infection in Young Tomato Plants Using Vis–NIR Spectroscopy

Abstract: Tomato chlorosis virus (ToCV) is a serious, emerging tomato pathogen that has a significant impact on the quality and quantity of tomato production worldwide. Detecting ToCV by means of spectral measurements at an early, pre-symptomatic stage offers an alternative to the existing laboratory methods, leading to better disease management in the field. In this study, leaf spectra from healthy and diseased leaves were measured with a spectrometer. The diseased leaves were subjected to RT-qPCR for the detection and quantification of the ToCV titer. The neighborhood component analysis (NCA) algorithm was employed to select the effective wavelengths and the most important vegetation indices out of the 24 that were tested. Two machine learning methods, namely the XY-fusion network (XY-F) and the multilayer perceptron with automated relevance determination (MLP–ARD), were employed to estimate the presence of the disease and the viral load in the tomato leaves. The results showed that, before outlier elimination, the MLP–ARD classifier generally outperformed the XY-F network, with an overall accuracy of 92.1% against 88.3% for the XY-F. Outlier elimination improved the performance of the classifiers, as the overall accuracy of both XY-F and MLP–ARD reached 100%.


Introduction
Agriculture plays a significant role in the economic domain worldwide. The continuously increasing demand for food due to the increase of the global population has triggered the evolution of novel agro-technology methods in the sector of precision agriculture (PA) in order to optimize the productivity and reduce the agricultural waste and the production loss that is caused by biotic and abiotic factors [1,2].
Plant diseases constitute a great threat to the agricultural sector on a worldwide scale, causing significant losses of global production [3]. Diseases under field and greenhouse conditions are mostly addressed with chemical compounds, if the disease is curable. Although this method can be efficient, the cost of pesticides is high and the results are questionable. Apart from the economic impact, such methods also have a negative environmental impact [1]. Additionally, the control of viral diseases is mainly based on preventive measures, since it is not possible to cure the infected plants. Thus, targeted early detection is necessary for the efficient management of such diseases. For this purpose, preventive means and post-infection methods are employed, aiming to minimize the extent of disease damage. Compared to destructive methods, these approaches are noninvasive, since they can be applied on the same plants over time [4]. These approaches can yield useful data from the spectral bands outside the visual spectrum, enhancing crop monitoring potential [5,6].
The tomato yellows disease (TYD) is a serious tomato disease that causes leaf deterioration through yellowing and interveinal chlorosis and leads to important yield losses in the affected crops worldwide [7,8]. Two pathogens of the genus Crinivirus are implicated in this disease: tomato chlorosis virus (ToCV) and tomato infectious chlorosis virus (TICV) [8]. At the worldwide level, ToCV prevails over TICV [9]. Tomato yield is negatively affected by ToCV infection: although tomato fruits are not directly affected, the loss of photosynthetic area results in smaller and fewer fruits [10]. The virus is semipersistently transmitted by the whitefly species Trialeurodes vaporariorum, Bemisia tabaci (MED and MEAM1) and Trialeurodes abutilonea [10]. Methods to reduce the spread of ToCV in the field are limited to controlling the whitefly population and managing the virus sources, as there are no commercially available resistant or tolerant tomato varieties to date [9].
Visible symptoms typically appear at least 2–3 weeks after virus inoculation in healthy tomato plants. Spectroscopy methods are employed for preventive screening; they are non-destructive and could potentially detect the pathogen long before the appearance of symptoms. Examples of spectroscopy methods employed for preventive screening have been presented by Gold et al. [11], Fernández et al. [12] and Herrmann et al. [13]. Thus, removing infected plants before the virus is transmitted to neighboring plants or susceptible crops could be of vital importance for growers in maintaining production.
The conventional laboratory methods and techniques that are used for the detection of ToCV infection include the serological technique of ELISA, which is generally considered an efficient method for criniviruses such as ToCV [14][15][16], and real-time RT-PCR assay [14,17].
Spectroscopy techniques, such as fluorescence, Vis-NIR spectroscopy [11] and hyperspectral imaging [18], among others, are non-destructive methods for studying the leaves' spectral responses to the biotic and abiotic stresses to which they are subjected [19,20]. Such methods have been successfully used for the detection and diagnosis of plant diseases and weeds at both laboratory and field levels, either at an early stage of the epidemic [13,21–26] or at a later stage for precise targeting of herbicides and pesticides [27][28][29][30]. When spectroscopy methods are applied, differences between the healthy and the infected areas of the leaves can be assessed by studying the shifts of the visible and NIR curves. The visible region typically has lower reflectance due to absorption by leaf pigments, mostly chlorophyll, carotenoids and anthocyanins [31]. Reflectance in the NIR region depends on the leaf cellular structure [32]. Such differences in the way leaf pigments and structure interact with electromagnetic radiation can be detected by specialized spectrometry equipment in order to discriminate between healthy and infected plants [28], as plant diseases affect the leaf pigments, cellular structure and metabolic activity and even the leaf texture characteristics. In this paper, leaf reflectance in the Vis-NIR spectrum is employed in order to identify and quantify ToCV in young tomato plants.
The aim of this paper is the development of an algorithm for the nondestructive detection of a possible ToCV infection in young tomato plants, also taking into account the intensity level of the infection, using the spectral signatures collected by a portable Vis-NIR spectrometer. The intensity levels of the virus infection that were examined are based on the quantitative virus concentration in the plants and target the detection of ToCV before there is any visible symptom on the plants' leaves. For the detection of the pathogen, two machine learning classification methods, XY-fusion networks (XY-F) and multilayer perceptron with automatic relevance determination (MLP-ARD) artificial neural networks (ANN), were employed and compared for their accuracy, using effective wavelengths and vegetation indices that were selected by the neighborhood components analysis (NCA) algorithm.
Remote Sens. 2020, 12, 1920

Plant Material and Growth Conditions
Biologically untreated tomato plants (Solanum lycopersicum L. hybrid Belladonna) were placed in a growth chamber with a controlled environment. The temperature was set to a 23/25 °C day/night cycle, and the mean relative humidity was 70 ± 10%. Illumination was provided by artificial fluorescent lighting (Philips 32-Watt T8 4 ft Plant and Aquarium) under a 16 h photoperiod regime per day. The lamps were placed 65 cm above the plants with a mean photosynthetic photon flux density (PPFD) of 410 µmol·m⁻²·s⁻¹. Commercial enriched brown potting peat soil (Agricult®) mixed with vermiculite in a 3:1 ratio was used with a composition of 70% blond peat, 30% black peat and 1.5 kg of PG-Mix (12-14-24), with a pH of 5.6-6.4.

ToCV Infection and Quantitative Analysis
A total of 156 tomato plants were used as the experimental material, of which 132 were used for virus infection, and the remaining 24 served as negative controls. The isolate ToCV/Rh1835 (GenBank accession number HG380092) originated from a greenhouse in Rhodes Island (Greece), and it was maintained in an insect-proof cage by serial passages onto tomato plants (Solanum lycopersicum hybrid Belladonna) in the Laboratory of Plant Pathology (Aristotle University of Thessaloniki, Greece) using the whitefly vector Bemisia tabaci MED.
ToCV-infected tomato plants (hybrid Belladonna) were used as viral sources 4 weeks post inoculation (wpi). Twelve groups of 40 adult whiteflies were given a 48-h acquisition access period (AAP) on source plants, followed by a 72-h inoculation access period (IAP), in order to achieve maximum transmission efficiency. Then, the clip cages were removed and all plants were sprayed with the insecticide imidacloprid. Clip cages are useful experimental tools for gathering and transferring small insects, such as whiteflies, to leaves when aiming to study various biological parameters [33]. Plants were transferred to the chamber, where they were grown for approximately two weeks (until the appearance of interveinal yellowing symptoms induced by ToCV). During this period, no visible signs of leaf senescence were observed in the plant material that was used.
Finally, the tomato plants were taken to the laboratory, and the first true leaf, on which the whiteflies had fed and transmitted the virus, was cleaned of lingering dust and dirt before spectral signature acquisition. Eleven spectral measurements were taken during this two-week period. The first three measurements occurred within a two-day time interval, and after that there was a measurement every day until the end of the experiment. After the optical measurements, approximately 0.2 g of the leaf was cut off the plant and stored at −80 °C until it was processed. Total RNA was extracted from all leaves of both the negative control and the infected plants and subjected to RT-qPCR for detection and quantitation of ToCV in infected tissues according to the protocols described by Orfanidou et al. [34].

Optical Measurements
For each optical measurement that was performed during the experiment, 12 infected tomato plants were randomly picked from the growth chamber and carried to the laboratory. The measurements were carried out using a portable Unispec-SC spectrometer (PP Systems, Inc.). This instrument provides robust spectral measurements (reflectance and absorbance) in the visible and near-infrared range between 310 nm and 1100 nm, with a sampling interval of 3.3 nm and a spectral resolution (FWHM) of less than 10 nm. The leaf spectral signature was acquired using a halogen light source mounted on a leaf clip. A 100% white reference scan was performed before the initial scanning, and it was repeated before each individual measurement. The spectra in the ranges 310 nm to 399 nm and 1001 nm to 1100 nm, at the fringes of the active range of the spectrophotometer, were excessively noisy and were removed from further analysis.
At least 15 spectral scans were collected from each tomato plant sample, depending on the plant leaf area. The greater the leaf area, the more spectral samples were collected so that there would be a representative sample collection. The negative control plants were scanned three times during the experimental procedure. The first occurred on the first day of the start of the spectral measurements, the second occurred on the sixth day and the last one on the twelfth day.

Spectral Data Pre-Processing and Feature Selection
It is crucial for the success of the classifiers' training phase to exclude any excessive non-linearity that is related to noise or to any other external effect caused either by the sensitivity of the spectral measurement instrument or by the measurement conditions [35]. For this reason, apart from removing the noisy edges of the spectrometer range, as mentioned above, the spectral data were further smoothed. The raw Vis-NIR reflectance data (R) were transformed using the logarithm of their reciprocal, log(1/R), as this transformation was found to be very efficient in the quantification of plant pigments [36,37]. Additionally, the data were mean centered, and finally the Savitzky-Golay smoothing filter [38] was applied, using five supporting points on each side of the smoothing point (an 11-point window) and a third-degree polynomial.
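The pre-processing chain described above can be sketched in Python as follows. This is a minimal illustration, not the MATLAB code used in the study: the function names are hypothetical, a base-10 logarithm is assumed, mean centering is assumed to be done per wavelength across samples, and points closer than five samples to either end of a spectrum are left unsmoothed because the edge handling is not stated. The coefficients are the standard tabulated Savitzky-Golay values for an 11-point window and a cubic polynomial.

```python
import math

# Standard Savitzky-Golay smoothing coefficients for an 11-point window
# (five points on each side) and a quadratic/cubic polynomial.
SG_11_CUBIC = [c / 429 for c in (-36, 9, 44, 69, 84, 89, 84, 69, 44, 9, -36)]


def pseudo_absorbance(reflectance):
    """Transform reflectance values R (0 < R <= 1) into log10(1/R)."""
    return [math.log10(1.0 / r) for r in reflectance]


def mean_center(spectra):
    """Subtract the per-wavelength mean (taken over samples) from every spectrum."""
    n = len(spectra)
    means = [sum(col) / n for col in zip(*spectra)]
    return [[v - m for v, m in zip(row, means)] for row in spectra]


def savgol_smooth(signal, coeffs=SG_11_CUBIC):
    """Smooth one spectrum; points within half a window of an edge are kept as-is."""
    half = len(coeffs) // 2
    out = list(signal)
    for i in range(half, len(signal) - half):
        window = signal[i - half:i + half + 1]
        out[i] = sum(c * v for c, v in zip(coeffs, window))
    return out


def preprocess(raw_spectra):
    """Apply log(1/R), mean centering and Savitzky-Golay smoothing, in that order."""
    absorbance = [pseudo_absorbance(s) for s in raw_spectra]
    centered = mean_center(absorbance)
    return [savgol_smooth(s) for s in centered]
```

A cubic filter of this kind reproduces any polynomial up to third degree exactly at interior points, which is a convenient sanity check for the coefficients.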
Apart from data pre-processing, outlier detection was performed on the spectral data using high-dimensional robust principal component analysis (HR-PCA), as described by Xu et al. [39]. The outliers were eliminated by iteratively omitting the largest data points from the PCA projection onto the principal components that covered at least 80% of the total variance. The predictive accuracy both before and after the outlier elimination is presented in this paper.
Because of their high dimensionality, raw spectral data are not recommended as direct model input for statistical and machine learning methods. The problem is mostly related to the limited useful information that these data carry, which in turn leads to lower model performance due to lower variance, as well as to increased computational time. Instead, in most cases, dimensionality reduction techniques are used, such as principal component analysis (PCA) or techniques for feature selection. In this study, it was decided to examine the effect of selected vegetation indices (VIs) from the literature (Table 1) on the presymptomatic detection of ToCV infection on young tomato plants [40]. Apart from the VIs, the most effective raw wavelengths were also evaluated for their effect on the disease detection. The effective wavelengths were selected through neighborhood component analysis.
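As an illustration of how a vegetation index can be derived from a sampled spectrum, the sketch below computes two simple-ratio indices. The definitions PSSRa = R800/R680 and PSSRb = R800/R635 are the commonly cited forms and are assumed here, since Table 1 is not reproduced in this text; the helper names are hypothetical. Because the instrument samples every 3.3 nm, the nearest available band is used for each nominal wavelength.

```python
def nearest_band(wavelengths, target_nm):
    """Return the index of the sampled wavelength closest to target_nm."""
    return min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - target_nm))


def band_ratio(wavelengths, reflectance, numerator_nm, denominator_nm):
    """Simple-ratio index R(numerator) / R(denominator) using the nearest sampled bands."""
    num = reflectance[nearest_band(wavelengths, numerator_nm)]
    den = reflectance[nearest_band(wavelengths, denominator_nm)]
    return num / den


def pssr_a(wavelengths, reflectance):
    """PSSRa = R800 / R680 (commonly cited form; assumed here)."""
    return band_ratio(wavelengths, reflectance, 800.0, 680.0)


def pssr_b(wavelengths, reflectance):
    """PSSRb = R800 / R635 (commonly cited form; assumed here)."""
    return band_ratio(wavelengths, reflectance, 800.0, 635.0)
```

The same `band_ratio` helper could be reused for any other ratio-type index, provided its band positions are taken from the cited literature.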

Feature Selection
Adding many predictors to a model increases its complexity, which may improve the fit during training but can have a strongly negative impact on the predictive accuracy. For this reason, the abovementioned VIs, as well as the effective wavelengths (EWs) that were used as features in the classification models, were selected using the neighborhood component analysis (NCA) method, based on Yang et al.'s [57] feature-selection implementation of the original Goldberger et al. [58] algorithm.
The NCA method is a feature selection algorithm that learns a low dimensional embedding of the data for kNN classification using a direct gradient-based approach [59]. The quadratic (Mahalanobis) distance metrics are used because they can be represented by the symmetric positive semi-definite matrices. The aim of this algorithm is to find a system for determining the ideal feature distance through a linear transformation of input data to optimize classification in the transformed space. The irrelevant feature weights get reduced to zero, under the leave one out classification scheme. In this paper the feature weights were estimated using the stochastic gradient descent (SGD) solver.
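The quantity that such a feature-weighting scheme maximizes can be sketched as below. This is only an illustration of the general idea (a weighted distance, a soft leave-one-out neighbor assignment and an L2-type penalty on the weights), written under assumptions: the weighted L1 distance, the exponential kernel with width `sigma` and the averaging over samples are assumed forms, and the function name is hypothetical. Only the regularization value of 0.004 comes from the text; the solver itself (SGD) is not shown.

```python
import math


def nca_objective(X, y, w, lam=0.004, sigma=1.0):
    """Regularized leave-one-out objective for the feature weights w.

    X: list of feature vectors, y: list of class labels, w: one weight per feature.
    Returns the mean leave-one-out probability of correct classification
    minus lam * sum(w_l ** 2).
    """
    n = len(X)
    total = 0.0
    for i in range(n):
        # Kernel value to every other sample (a sample never selects itself).
        kernel = []
        for j in range(n):
            if j == i:
                kernel.append(0.0)
                continue
            dist = sum(wl ** 2 * abs(a - b) for wl, a, b in zip(w, X[i], X[j]))
            kernel.append(math.exp(-dist / sigma))
        norm = sum(kernel)
        if norm == 0.0:
            continue  # every reference point is numerically out of reach
        # Probability that sample i picks a reference point of its own class.
        total += sum(k for k, label in zip(kernel, y) if label == y[i]) / norm
    return total / n - lam * sum(wl ** 2 for wl in w)
```

With weights concentrated on an informative feature the objective is higher than with weights concentrated on a noisy one, which is why irrelevant features are driven toward zero during optimization.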
The initial learning rate was set to 0.1, and the number of learning rate tuning iterations was set to 20. After 16 iterations, convergence was reached at a learning rate of 6.4. The optimum regularization hyperparameter lambda (which minimizes the generalization error) was determined by tuning with five-fold cross-validation and was found to be 0.004. The threshold (θ) above which the most significant features were selected was decided according to Equation (1) [60].
In this equation, τ denotes the tolerance, set at 0.2 [60], and max(w) denotes the maximum value of the updated feature weight vector.
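Equation (1) is not reproduced in this text. One form that is consistent with the symbols defined above is θ = τ · max(w), and the sketch below assumes exactly that; it should be read as an assumption rather than as the published equation. The function name, feature names and weights are made up for illustration.

```python
def select_features(weights, names, tau=0.2):
    """Keep the features whose weight exceeds theta = tau * max(weights) (assumed form)."""
    theta = tau * max(weights)
    selected = [name for name, w in zip(names, weights) if w > theta]
    return theta, selected


# Hypothetical weights, not values from the study.
theta, selected = select_features([5.9, 0.3, 2.4, 1.0], ["f1", "f2", "f3", "f4"])
```

Under this assumed form, a tolerance of 0.2 and a maximum weight of 5.9 give a threshold of 1.18, so only `f1` and `f3` are retained in the example.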

Class Division
The initial division of the tomato samples into the control (healthy) and infected (virus positive) classes was done according to Orfanidou et al. [34], after the quantitative measurement of ToCV on tomato leaves. In their research, Orfanidou et al. [34] define the cycle threshold (Ct) value of 43.3 as a limit for positive ToCV detection. Any value in the Ct range of 30 to 43.5 is considered not detectable, and any higher value belonged to the negative control plants. The class division that reflected the different severity stages of the virus infection and that was used for the classifier training is shown in Table 2. A total of 2984 spectral signatures were collected from the plants. Out of those spectral signatures, 680 belonged to the control treatment, 749 belonged to class 2, 761 to class 3, and the remaining 794 belonged to mid to highly infected leaves (class 4).

Machine Learning Techniques
The machine learning techniques used in this paper were XY-fused networks and multilayer perceptron with automated relevance determination (MLP-ARD). XY-fused networks [61] are supervised artificial neural networks that are used for classification modelling in a similar way as supervised Kohonen maps. The winning neuron is defined from the fused similarity of the Euclidean distances calculated between the n input features (x_n) and their respective weights and the target class vector [62]. A graphical representation of the architecture of such a network is shown in Figure 1.
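The selection of the winning neuron from a fused distance can be sketched as follows. This is a simplified illustration under stated assumptions: each distance set is scaled by its maximum so that the input and class maps are comparable, and a fixed mixing factor `alpha` is used, whereas the actual scaling and the schedule of the mixing factor during training are not described here. All names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def fused_winner(x, y, x_weights, y_weights, alpha=0.5):
    """Return the index of the winning neuron for one sample.

    x: input feature vector, y: one-hot target class vector.
    x_weights / y_weights: per-neuron weight vectors of the input and class maps.
    alpha: relative importance of the input map (between 0 and 1).
    """
    dx = [euclidean(x, wx) for wx in x_weights]
    dy = [euclidean(y, wy) for wy in y_weights]
    # Scale each distance set by its maximum (assumed normalization).
    max_dx = max(dx) or 1.0
    max_dy = max(dy) or 1.0
    fused = [alpha * d1 / max_dx + (1 - alpha) * d2 / max_dy
             for d1, d2 in zip(dx, dy)]
    return min(range(len(fused)), key=fused.__getitem__)
```

With `alpha = 1.0` the class map is ignored and the network behaves like an unsupervised map; lower values let the class information steer the choice of the winner.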
For this study, a fully connected MLP with a three-layer architecture (input layer, hidden layer and output layer) was used to classify the spectral signatures into the healthy class or one of the disease severity levels. The weight correction was performed by the scaled conjugate gradient back propagation algorithm, and the transfer functions that were selected were the hyperbolic tangent (tanh) for the interconnections between the input and the hidden layer and the logistic function for the interconnections between the hidden and the output layer. The values of the weights were chosen by minimizing the value of the cost function G (Equation (2)), where t_i is the target class, y_i is the classifier output, m ∈ Z is the number of training samples and i ∈ Z is the index of a specific sample. Apart from the first-level hyperparameters, the values of which were randomly chosen as priors for the initialization of the MLP classifiers, automatic relevance determination (ARD) was used in this study. In the application of the ARD technique, a new regularization hyperparameter, alpha (α), was introduced for every weight associated with the i input variables in order to determine the relevance of the input data to the model.
Evidence maximization was used to infer the regularization parameters, and the inputs with the largest α_i values were not used for further analysis. In the specific algorithm, three additional alpha hyperparameters, including weight classes, were demonstrated; the first was related to the synaptic connection bias, the second to the interconnections between the hidden and the output neurons, and the last one was associated with the connection between the hidden layer bias neuron and the output neurons [63].
A quantification of the influence of the individual features in the model can be derived from the magnitude of the L2 regularization norms of the weights and the relevant values of the alpha hyperparameter (α_k), which can be calculated using Equation (3).
where k ∈ Z, $E_{W_k} = \frac{1}{2}\sum_{j} w_j^2$, $w_j \in \mathbb{R}$ represents the weights and j is the weight index within the class $W^{(k)}$. Both of the classification models that were used for this study had their optimal network architecture hyperparameters, such as the number of training epochs and the hidden layer size of the network, defined by means of a genetic algorithm (GA) implementation, as described by Ballabio et al. [64], for five different learning rate values. The hyperparameters and the candidate values that were tested are shown in Table 3.
Table 3. Neural network hyperparameter values that were tested by a genetic algorithm (GA) for the architecture optimization of the classifiers that were used.
For the XY-F network, the optimal architecture was 200 epochs, a 10 × 10 self-organizing map (SOM) layer and a learning rate of 0.005, while for the MLP-ARD network, the hyperparameters that were chosen were 500 training epochs, 10 hidden units and a learning rate of 0.005.
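The forward pass of a three-layer network of the kind described above (tanh hidden units, logistic output units, 10 hidden units and four classes) can be sketched as follows. The training procedure and the ARD step are not shown; the number of input features (7), the random initialization range and all function names are assumptions made only for the example.

```python
import math
import random


def logistic(z):
    """Logistic (sigmoid) transfer function."""
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: tanh hidden layer followed by logistic output units."""
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return [logistic(sum(w * h for w, h in zip(row, hidden)) + b)
            for row, b in zip(w_out, b_out)]


def init_layer(n_out, n_in, rng):
    """Small random initial weights (the initialization scheme is an assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


rng = random.Random(0)
n_features, n_hidden, n_classes = 7, 10, 4  # 7 input features is a hypothetical count
w_hidden = init_layer(n_hidden, n_features, rng)
w_out = init_layer(n_classes, n_hidden, rng)
outputs = mlp_forward([0.1] * n_features, w_hidden, [0.0] * n_hidden,
                      w_out, [0.0] * n_classes)
# Classes are numbered 1-4; the unit with the largest output gives the prediction.
predicted_class = 1 + max(range(n_classes), key=outputs.__getitem__)
```

Each output lies between 0 and 1 because of the logistic function, and the predicted class is taken as the output unit with the largest activation.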

All the data analyses and the implementation of the classification methods were carried out using MATLAB software, version 9.5 (R2018b) by MathWorks® Inc. (Natick, MA, USA).

Model Evaluation Metrics
Before data analysis, the whole dataset of spectral signatures was divided into training and testing sets by random sampling, using the ordinary 70–30% scheme. The predictive ability of the trained models was assessed using the confusion matrix of each classifier, from which the F1 score and the accuracy were computed, as shown in Equations (6) and (7). In these equations, TP corresponds to the true positive predictions, TN to the true negative, FP to the false positive and FN to the false negative predictions. Although accuracy is a widely used evaluation metric in the literature, it presupposes a balanced class distribution in order to give a trustworthy and robust result. In practice, this is not always the case, as a balanced class distribution is often difficult to obtain. The F1 score handles this problem better than accuracy, as it is the harmonic mean of precision and recall and penalizes extreme values in any class.
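Equations (6) and (7) are not reproduced in this text. The sketch below uses the standard definitions of accuracy and F1 score in terms of TP, TN, FP and FN, computed per class from a multi-class confusion matrix; the function names are hypothetical and the layout of the matrix (rows as true classes, columns as predicted classes) is an assumption.

```python
def per_class_counts(confusion, k):
    """Return (TP, FP, FN, TN) for class index k.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp
    fp = sum(confusion[i][k] for i in range(n)) - tp
    total = sum(sum(row) for row in confusion)
    return tp, fp, fn, total - tp - fp - fn


def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision TP/(TP+FP) and recall TP/(TP+FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def overall_accuracy(confusion):
    """Fraction of all samples that lie on the diagonal of the confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / sum(sum(row) for row in confusion)
```

For a made-up two-class matrix `[[8, 2], [1, 9]]`, class 0 has TP = 8, FP = 1, FN = 2 and TN = 9, which gives an accuracy of 0.85 and an F1 score of 16/19.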

Spectral Data Overview
Both before and after the outlier elimination, the pattern of the spectral signature depended on the disease severity (Figure 2). In both cases, the spectral peaks in the visible and in the near-infrared region behaved in an inversely proportional way. Before the outlier elimination, the mean spectral signatures were closer to each other than after it, which could make the classification more difficult because of possible overlap between the spectral signatures. A more distinct class division usually ensures a more successful classification [65].
A visual comparison between the graphs in Figure 2 shows that the outlier elimination had almost no influence on the visible part of the spectral signature, whereas in the NIR part, especially after the red edge peak at 750 nm, there was a distinct difference between the classes. Before the outlier elimination, it was difficult to differentiate the mean spectral response between classes 1 and 2 and between classes 3 and 4 in this part of the spectrum; after the outlier elimination, the classes became more distinct from each other, with class 1 on top and classes 2, 3 and 4 following in order of increasing severity.

Feature Selection
As mentioned, NCA was used for the feature selection of both VIs and EWs, and any spectral index or spectral value with a feature weight higher than 1.18, as determined by the NCA algorithm, was included as an input feature in the classification process. The selected features for both the VIs and the effective wavelengths are shown in Figure 3.
The features that were selected by the NCA algorithm and that were used as input for the classification algorithms are also shown in Table 4.
Table 4. Vegetation indices and effective wavelengths that were selected by the NCA algorithm as the input features for the selected classifiers before and after the outlier elimination (columns: Before Outlier Elimination, After Outlier Elimination).

XY-F classifier
The performance of the XY-F classifier in detecting ToCV and evaluating its severity on the young tomato plants, for the two feature sets and before and after the outlier elimination with the HR-PCA algorithm, is presented in Table 5. Out of the 2984 spectral signatures collected from the 156 plants during the experiment, 2089 were used for the training phase of the classifiers and 895 for their validation.
Table 5. XY-Fusion Network (XY-F) classifier prediction performance on the detection of Tomato Chlorosis Virus (ToCV) on the tomato plants and its severity before outlier elimination. Class 1 represents the healthy state and classes 2-4 the severity of the disease in increasing order, denoted as ToCV1-ToCV3.
A closer look at Table 5 shows that, for both outlier regimes (before and after outlier elimination), the prediction accuracy obtained with the effective wavelengths (EWs) was at most equal to, and usually slightly lower than, the accuracy obtained with the vegetation indices (VIs). This difference varied from 0%, for the prediction of classes 3 and 4 (ToCV2 and ToCV3) by the model that used the data after the outlier elimination, to 5.7%, for the prediction of class 1 by the model that used the data before the outlier elimination. The overall accuracy and F1 scores do not appear to differ significantly from each other. The best performing combination of classifier and feature selection was found to be the MLP-ARD classifier with vegetation indices as features.

Outliers
The elimination of the outliers seemed to have a very significant influence on the models' predictive accuracy, as the results were very close or even equal to the perfect 100% score for classes 3 and 4. The variance in the accuracy results before the outlier elimination was 4.53 for the EW features and 3.68 for the VI features, while after the outlier elimination, the variance was less than 0.05 for both feature sets.
The XY-F component maps for both feature sets, for the models before and after the outlier elimination, are presented in Figure 4. These component maps show the spatial distribution of the classes in the SOM plane, and they can better demonstrate that clusters from various pixels of the imagery can have similar characteristics while being spatially dispersed in the image [29]. The component maps show a uniformity (measured to be more than 80% for each component map) in the class assignment for the models both before and after the outlier elimination. This means that the correlation coefficient between the input SOM layers and the class layers should show the influence of each feature on the model structure. The correlation between the topological structure of the component map and the respective structure generated by the training vectors during the training phase is shown in Figure 5. For the model before the outlier elimination, 408.9, 425.7, 429.0 and 449.2 nm dominated the other EWs in terms of topological correlation to the component map when the EWs were used as features. Nevertheless, the rest of the EWs also played a significant role in the classifier, as almost all of them had a correlation coefficient higher than 0.5. A similar pattern was followed by the model that was created after the outlier elimination. The difference is that, although the maximum peaks were in exactly the same places as in the respective model before the outlier elimination, after this point the correlation fell to levels below 0.4.
On the other hand, when VIs were used as features, before the outlier elimination PSSRa, REIP and VOG1 showed a strong correlation (r = 0.7), with PSSRb following and ARI contributing only a little to the model creation, having an r value close to 0.1 (Figure 5). Similarly, for the model that was created after the outlier elimination, the SRCHLTOT feature also had a large correlation alongside PSSRa, PSSRb and VOG2, while REIP and TVI had a minor correlation, and ARI once again had almost no correlation.
Figure 4. Neuron class assignments in the XY-F network's class layer as it was trained for the EW features (A1,A2) and for the VI features (B1,B2). The axes represent the SOM grid (10 × 10 neurons). The index 1 denotes the results before the outlier elimination, while the index 2 denotes the respective results after the outlier elimination. Each pixel color represents one of the target classes (dark blue for class 1, light blue for class 2, green for class 3 and yellow for class 4).

MLP-ARD Classifier
The performance of the MLP-ARD classifier in detecting ToCV and its severity level on the young tomato plants, for the two selected feature sets before and after the outlier elimination, is presented in Table 6. Once again, out of the 2984 total spectral signatures that were collected, 2089 were used for the training phase of the classifiers and 895 for their validation.
Table 6. Multilayer Perceptron with Automated Relevance Determination (MLP-ARD) classifier prediction performance of the detection of ToCV on the tomato plants and its severity before outlier elimination. Class 1 represents the healthy state and classes 2-4 the severity of the disease in an increasing order, denoted as ToCV1-ToCV3.
Table 6 shows that the MLP-ARD algorithm achieved high classification performance, as all of the accuracy values exceeded 90%. For the models created before the outlier elimination, the model that used the VIs as features performed slightly better than the respective model that used the EWs for every class except class 4 (ToCV3), where the accuracy was the same in both cases. The maximum difference was 2.7%, in the prediction of class 1 (healthy leaves).

Outliers
As shown in the same table, after the outlier elimination, the model showed a perfect 100% accuracy in classifying the leaves' spectral signatures into the healthy class or one of the ToCV severity level classes. The variance in the accuracy results before the outlier elimination was 0.61 for the EW features and 2.03 for the VI features, while after the outlier elimination, there was no variance for either feature set.
The Hinton diagrams in Figure 6 depict the weight values connecting the input features to the hidden neurons for each of the cases tested in this study (before and after outlier elimination, with EWs and VIs as input features) and are a better way of visualizing the function of the MLP-ARD algorithm [30]. The ARD algorithm suppresses the weights of the least important features and reinforces the weights of the most active ones.
As seen in Figure 6, when the VIs were used as input features, the largest absolute weight values for the model before the outlier elimination belonged to ARI and VOG1. In the same model, the hidden neurons 1, 2, 3, 4, 7, 8 and 9 had a very small influence on the model's performance. For the model after the outlier elimination, REIP, SR CHLTOT and VOG2 had the largest absolute weight values, while ARI seemed to contribute almost nothing. As far as the EW input features are concerned, the range 402 to 412.2 nm seemed to have the highest impact on the model created before the outlier elimination, while 415.6, 425.7, 736.2 and 865.4 nm seemed to have the lowest impact. In contrast, in the Hinton diagram for the model that used the EW input features after the outlier elimination, the impact of the individual EWs on the model was uniform. Only 862.1 nm seemed to have slightly smaller weight values.
As the training procedure approached its last iterations, the highest L2 norms in the network were associated with the lowest alpha values and thus the highest weight variances. Figure 7 shows the alpha hyperparameter values for the interconnection weights between the input and the hidden layer of the MLP. Figure 7 reveals an inverse relationship between the magnitude of the alpha hyperparameter and the weight values presented in the Hinton diagrams of Figure 6. Indeed, ARI and VOG1 had the lowest alpha values for the model before the outlier elimination, and TVI, REIP, SRCHLTOT and VOG2 had the lowest alpha values for the model after it. Although ARI had the highest alpha value among the features of the model created after the outlier elimination, this value was not as high as the highest values of the model before the outlier elimination. Similarly, the EWs with the highest alpha values were 415.6, 425.7, 736.2 and 865.4 nm for the model before the outlier elimination and 862.1 nm for the model after the outlier elimination. In the latter case, the maximum alpha value was close to 0.05, which is relatively small compared with the respective data from Figure 7(B1).
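The inverse relationship between alpha and weight magnitude follows from the ARD prior, in which the weights leaving input i are given a zero-mean Gaussian prior with variance 1/alpha_i. The sketch below uses a simplified MacKay-style re-estimation, alpha_i = gamma_i / ||w_i||², and assumes gamma_i equals the number of weights in the group; a full implementation would derive gamma_i from the Hessian of the error function. The function name `ard_alpha` and the weight matrix are hypothetical and are not taken from the study.

```python
import numpy as np

def ard_alpha(weights, gamma=None):
    """Simplified ARD re-estimation: one alpha per input feature.

    weights: array of shape (n_inputs, n_hidden), input-to-hidden weights.
    gamma:   effective number of well-determined weights per input;
             assumed here to equal n_hidden when not supplied.
    """
    w = np.asarray(weights, dtype=float)
    squared_l2_norm = np.sum(w ** 2, axis=1)
    if gamma is None:
        gamma = np.full(w.shape[0], w.shape[1], dtype=float)
    return gamma / squared_l2_norm

# Hypothetical weights for three input features and three hidden neurons
W = np.array([
    [1.50, -2.00, 0.80],   # strongly used feature
    [0.30, 0.20, -0.40],   # moderately used feature
    [0.01, -0.02, 0.01],   # feature suppressed by ARD
])

alpha = ard_alpha(W)
prior_variance = 1.0 / alpha  # variance of the Gaussian prior on each weight group
for a, v in zip(alpha, prior_variance):
    print(f"alpha = {a:10.3f}   prior variance = {v:.5f}")
```

The feature with the largest L2 norm receives the smallest alpha and therefore the widest prior, which matches the pattern described for Figures 6 and 7.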

General and Spectral Data Overview
Spectral reflectance signatures collected from the leaf surfaces of young tomato plants were used in this study to detect a possible viral infection at an early stage, before the symptoms become visible on the leaf surface, and to estimate the infection severity. The selected virus was ToCV, because it is an emergent plant pathogen in tomato fields in Greece and worldwide [8,66]. The classifiers that were used were the XY-F network and the MLP-ARD.
The comparison of the mean spectral responses of the healthy vs. infected plants in Figure 3 showed that an increase in severity results in a decrease of the spectral response in the NIR region and a slight increase of the spectral response in the visible region (400-700 nm). This very slight increase in the spectral peak in the visible region (with the local maximum close to 550 nm) is not detectable by the human eye as a visible symptom and probably occurs because, as the ToCV infection becomes more serious, it affects the leaf pigments, causing interveinal chlorosis. In a similar case of sugarcane viral infection [57], yellowing was associated with a slightly increased reflectance in the visible region of the spectrum. However, the red edge did not shift to lower wavelengths, as happens when yellowing occurs on a leaf surface, according to Sims and Gamon [67].
A similar decrease in the red edge and NIR regions was shown by Grisham et al. [23] and Gazala et al. [68], who used hyperspectral imagery and spectroradiometer spectral signatures to identify sugarcane viruses causing leaf yellowing. Additionally, healthy leaves have been found to show higher reflectance in the NIR region compared to those infected by tobacco mosaic virus (TMV), but the visible part seems to have the highest reflectance in the case of severe infection [26]. The NIR region reflectance pattern is caused by the internal scattering of light by the leaf cells [46,68,69]. The observed change in the NIR reflectance could be explained by the effects of the virus infection [70], which induce the destruction of the cellular structures, in turn causing the collapse of cell compactness and the loss of air spaces.

The Effect of Outliers in the Models
Outlier elimination gave a more discernible profile in the average spectral signatures in the NIR region by segregating the four classes of infection severity in descending reflectance order, despite the absence of visible symptoms (Figure 3). This process significantly improved the performance of the classifiers used in this study, which in most cases reached a perfect accuracy and F1 score. This was also confirmed by the literature on the effect of outlier elimination in the classification process [70]. The reason for this remarkable improvement is probably that the spectrometer used has a clip that totally isolates the leaf from external disturbances. Additionally, the halogen light source of the instrument is fixed on the clip and almost in direct contact with the leaf surface, and thus it does not allow heat fluctuations.
Nevertheless, it should be noted that most of the classification error in the models before outlier elimination was probably due to the overlap of spectral signatures from different classes. Indeed, unpublished data from the confusion matrices showed that most of the misclassification errors in the models were between consecutive classes, i.e., classes 1 and 2 and classes 3 and 4. There were hardly any errors between classes 1 and 3, 2 and 4, and 1 and 4, and very few between classes 2 and 3. In other words, some signatures from healthy leaves were classified as slightly infected and vice versa, and some signatures from moderately infected leaves were misclassified as severely infected.
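The share of misclassifications that fall between consecutive severity classes can be read directly from a confusion matrix. The sketch below uses hypothetical counts, since the study's confusion matrices are unpublished, and assumes that rows hold the true classes and columns the predicted classes; `adjacent_error_share` is an illustrative helper name.

```python
import numpy as np

def adjacent_error_share(confusion):
    """Fraction of all misclassifications that land in a neighboring class."""
    cm = np.asarray(confusion)
    true_idx, pred_idx = np.indices(cm.shape)
    errors = cm[true_idx != pred_idx].sum()
    adjacent = cm[np.abs(true_idx - pred_idx) == 1].sum()
    return adjacent / errors if errors else 0.0

# Hypothetical 4-class confusion matrix (healthy, ToCV1, ToCV2, ToCV3)
cm = np.array([
    [210,  12,   1,   0],
    [ 10, 205,   4,   0],
    [  0,   3, 208,  14],
    [  0,   0,  11, 217],
])

share = adjacent_error_share(cm)
print(f"{share:.1%} of the errors are between consecutive classes")
```

A share close to 100% is what one would expect if the errors stem from signature overlap at the class limits rather than from measurement noise.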
Spectral signatures are very close to each other before outlier elimination (Figure 3), making it probable that signatures lying at the limits of neighboring classes (Table 3) were classified wrongly due to overlap of signatures. The absence of outliers could therefore raise robustness issues if a similar experiment were repeated. Thus, it is possible that models built on the collected data containing the outliers generalize better, either for online application or for the detection of the virus and its severity in similar cases, than the models created after the outlier elimination.

Spectral Bands and Vegetation Indices Selected by NCA
Figure 8 shows the spectral bands that were selected by the NCA algorithm and that were used for the classification either as EWs or as a structural part of a VI formula. This figure shows that, for the models both before and after the outlier elimination, the selected bands cover parts of most of the regions of the spectrum that are important for the detection of vegetation health (the peak in the visible part, the red edge, and the peak and parts of the NIR plateau region), as described by Kalacska and Sanchez-Azofeifa [71]. The difference between VIs and EWs is that the EWs also include bands in the range from 400 to 450 nm, which is not the case for the VIs. In this paper it was found that the bands in that region can play a significant role in the predictive accuracy of the model, having in most cases a high correlation coefficient in the case of the XY-F network (Figure 5) and low alpha values in the case of the MLP-ARD classifier (Figure 7). This is also in accordance with the findings of Zhu et al. [26], who selected 459.58 nm as the most influential spectral band feature using the successive projections algorithm (SPA).
Figure 8. Spectral bands covered in the classification algorithms, as selected by the NCA algorithm for the models before (left graph) and after the outlier elimination (right graph). White dots depict the bands that came from the VI formulas and black dots the bands that were selected as EWs.
It is known that the reflectance peak in the visible part of the spectrum is related to and dependent on the chlorophyll a and b, carotenoid and anthocyanin concentrations. The selection of bands in the region of 400 to 450 nm could probably be related to the chlorophyll concentration, while the peak around 550 nm could be related to anthocyanin absorption.
The VIs that were chosen in this paper by the NCA algorithm combine spectral bands from both the visible part of the spectrum and the part close to or on the red edge. Taking both regions into account makes it easier for the algorithm to detect changes either in the shifting of the local minima to higher or lower values or in the shifting of the red edge inflection point to the left or the right. This is probably why the REIP index was chosen for both outlier existence scenarios. REIP was also found to be useful for the detection of vegetation health by Hoque and Hutzler [72] and Vogelmann et al. [56]. Nevertheless, because there was almost no change in the position of the inflection point towards higher or lower wavelengths, this index was found to have, in most cases, an average contribution to the model's performance (Figure 5(B2) and Figure 7(A1)).
Simple ratio indices for the direct estimation of chlorophyll content, like PSSRa and PSSRb, were also selected for both model scenarios. For the models created after the outlier elimination, the SR CHLTOT index was also selected (Table 2). Indeed, chlorophyll is the most important pigment for photosynthesis and is thus one of the most important indicators of the general health condition of the plant [73]; this is why these indices had such an important contribution to the models, as can be seen in Figures 5 and 7. These findings are in accordance with those of Lu et al. [74], who investigated the contribution of different VIs to the detection of tomato leaf health condition and found a high contribution of the PSSR index. VOG1 and VOG2 are both spectral indices that deal with the changes that occur in the red edge zone, and according to Figures 5 and 7, they have a major effect on the models' efficiency. This is comparable with the related works of Lu et al. [74] and Lopez et al. [75], who found a dominant effect of the VOG index for early disease detection (including viral infection) in tomato plants and almond trees, respectively.
Finally, ARI is a VI that is generally used for the estimation of anthocyanins and was selected for both outlier scenarios in this study. Despite this, and despite the fact that ToCV infection induces increased anthocyanin accumulation in some tomato cultivars [76], the results from the classifiers showed a very low importance of this index for both the XY-F network and one case of the MLP-ARD models (Figures 5 and 7). These findings conflict with those of Devadas et al. [77], who found that ARI is a very efficient indicator for the differentiation of three different rust types. It is probable that either the contribution of ARI to the efficiency of the present paper's models was overshadowed by the contribution of the rest of the VIs that were selected by the NCA algorithm, or ToCV had no effect on the anthocyanin content in this tomato cultivar.
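To make the band combinations behind these indices concrete, the sketch below computes several of them from a reflectance spectrum. The formulas are the definitions commonly given in the literature (Blackburn for PSSRa and PSSRb, Vogelmann et al. for VOG1 and VOG2, Gitelson et al. for ARI, Guyot and Baret for REIP) and are assumed here; the exact bands used in this study are those listed in Table 2 and may differ. The helper names and the synthetic spectrum are hypothetical.

```python
import numpy as np

def reflectance_at(wavelengths, reflectance, target_nm):
    """Linearly interpolate the reflectance at a single wavelength (nm)."""
    return float(np.interp(target_nm, wavelengths, reflectance))

def vegetation_indices(wavelengths, reflectance):
    """Compute a few VIs using commonly cited (assumed) band definitions."""
    R = lambda nm: reflectance_at(wavelengths, reflectance, nm)
    return {
        "PSSRa": R(800) / R(680),
        "PSSRb": R(800) / R(635),
        "VOG1": R(740) / R(720),
        "VOG2": (R(734) - R(747)) / (R(715) + R(726)),
        "ARI": 1.0 / R(550) - 1.0 / R(700),
        # Linear four-point estimate of the red edge inflection point (nm)
        "REIP": 700.0 + 40.0 * ((R(670) + R(780)) / 2.0 - R(700)) / (R(740) - R(700)),
    }

# Synthetic leaf-like spectrum: green peak near 550 nm, red edge near 715 nm
wl = np.arange(400.0, 1001.0, 1.0)
green_peak = 0.08 * np.exp(-0.5 * ((wl - 550.0) / 30.0) ** 2)
red_edge = 0.45 / (1.0 + np.exp(-(wl - 715.0) / 10.0))
spectrum = 0.05 + green_peak + red_edge

vis = vegetation_indices(wl, spectrum)
for name, value in vis.items():
    print(f"{name:6s} {value:8.3f}")
```

Because REIP depends only on the position of the red edge, an infection that lowers the NIR plateau without shifting the inflection point leaves it nearly unchanged, while the ratio indices respond directly.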

Classifier Results
An overview of the results for the models created before and after the outlier elimination reveals the high performance achieved in all the cases studied, with individual class accuracies higher than 80% in the case of the XY-F network (Table 5) and higher than 90% in the case of the MLP-ARD ANN (Table 6). The performance of the models before the outlier elimination is comparable with the findings of Schor et al. [78] for the detection of tomato spotted wilt virus using PCA, with an overall accuracy of 90%, and of Xu et al. [39] for the detection of tobacco mosaic virus using a Mahalanobis distance based model.
Although in this paper MLP-ARD generally showed better performance than the XY-F network, previous studies [29,30] that worked on a dataset for the detection of a fungal infection on S. marianum weed plants have shown that hierarchical self-organizing models like the XY-F network performed better overall than the MLP-ARD algorithm. With the EWs as input features, the overall performance was slightly lower than with the VIs, and the variance between the models before and after the outlier elimination was higher.
A comparison between the classifiers and feature sets employed in this paper showed that, excluding the models created after the outlier elimination, which had a perfect performance, the best performing combination was the MLP-ARD classifier with VIs as features. This is probably due to the fact that the VIs were developed to reveal the structural and metabolic alterations that happen in plant leaves when they are subjected to a stress regime, by fusing the combined effect of two or more spectral bands from the most important regions of the spectrum in one formula.
A close observation of the Hinton diagram in Figure 6 shows that, for the best performing VIs (REIP, SR CHLTOT and VOG2) in the model after the outlier elimination, most of the hidden neurons have weights of opposite signs (positive and negative weight values). This could indicate that there is very low overlap between the features and, at the same time, a synergistic activity, which in turn indicates feature fusion performed by the ARD algorithm within the classifier.
Finally, the very low absolute difference between the F1 score and the accuracy values for every model indicates a balanced distribution of the selected signatures across the classes. This means that accuracy, a metric that can be misleading on imbalanced data, can be used in this study to give a satisfactory estimation of the detector's performance.
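The link between class balance and the gap between accuracy and F1 score can be checked on a confusion matrix. The sketch below compares overall accuracy with the macro-averaged F1 score on two hypothetical matrices; rows are assumed to hold the true classes and columns the predicted classes, and none of the counts come from this study.

```python
import numpy as np

def accuracy_and_macro_f1(confusion):
    """Return (overall accuracy, macro-averaged F1) from a confusion matrix."""
    cm = np.asarray(confusion, dtype=float)
    true_positives = np.diag(cm)
    precision = true_positives / cm.sum(axis=0)  # column sums: predicted counts
    recall = true_positives / cm.sum(axis=1)     # row sums: true counts
    f1 = 2.0 * precision * recall / (precision + recall)
    return true_positives.sum() / cm.sum(), f1.mean()

# Hypothetical balanced case: both classes equally represented
balanced = [[90, 10],
            [10, 90]]
# Hypothetical imbalanced case: the minority class is mostly missed
imbalanced = [[980, 0],
              [15, 5]]

acc_b, f1_b = accuracy_and_macro_f1(balanced)
acc_i, f1_i = accuracy_and_macro_f1(imbalanced)
print(f"balanced:   accuracy = {acc_b:.3f}, macro F1 = {f1_b:.3f}")
print(f"imbalanced: accuracy = {acc_i:.3f}, macro F1 = {f1_i:.3f}")
```

In the balanced case the two metrics coincide, whereas in the imbalanced case accuracy stays high while the macro F1 drops, which is why a small gap supports the use of accuracy here.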

Conclusions
In the present study, the spectral reflectance signatures of ToCV-infected tomato plants were studied for the detection of a possible virus infection and a quantitative severity level estimation by applying machine learning techniques on selected spectral features of these plants. A non-destructive disease detection approach is of value for the early prevention of the disease spread at nursery or farm level and of the subsequent loss of production.
Both the XY-F network and the MLP-ARD ANN classifiers were demonstrated to be highly effective in detecting the ToCV infection and its severity level, scoring an overall accuracy of over 85%. MLP-ARD seems to perform generally better than the XY-F classifier and also shows more robust results in terms of variance.
Outlier elimination plays a major role in the overall performance of the classifiers, both of which reach perfect accuracy once the outliers are removed. Retaining the outliers, though, could be valuable for the generalization of the model when the experiment is repeated in similar situations, as the outliers account for the possible class overlap between the spectral signatures of neighboring classes, given that the high resolution of the spectrometer measurements leaves no other likely reason for their existence.
VIs were shown to have a slightly better overall performance in comparison to the effective wavelengths that were chosen by the NCA algorithm. A combination of pigment specific indices like