Early Detection of Ganoderma boninense in Oil Palm Seedlings Using Support Vector Machines

Abstract: Ganoderma boninense (G. boninense) is a fungus that causes basal stem rot (BSR), one of the most destructive diseases in oil palm plantations in Southeast Asia, resulting in annual losses of up to USD 500 million. G. boninense infects both mature trees and seedlings. The current practice of detection still depends on manual inspection by a human expert every two weeks. This study aimed to detect early G. boninense infections using visible-near infrared (VIS-NIR) hyperspectral images where there are no BSR symptoms present. Twenty-eight oil palm seedlings at five months old were used, 15 of which were inoculated with the G. boninense pathogen. Five months later, the spectral reflectance of oil palm leaflets taken from fronds 1 (F1) and 2 (F2) was obtained from the VIS-NIR hyperspectral images. The significant bands were identified based on the high separation between uninoculated (U) and inoculated (I) seedlings. The results indicate that the differences were evident in the NIR spectrum. The bands were later used as input parameters for the development of Support Vector Machine (SVM) classification models, and these bands were optimized according to the classification accuracy achieved by the classifiers. The U and I seedlings were classified with 100% accuracy using 35 bands and 18 bands of F1. However, the combination of F1 and F2 (F12) gave better accuracy than F2 and almost similar accuracy to F1 for specific classifiers. This finding will provide an advantage when using aerial images, where there is no need to separate F1 and F2 during the data pre-processing stage.


Introduction
The production of oil palm in Southeast Asia has been persistently affected by basal stem rot (BSR) disease. Malaysia has reported annual losses of up to RM 1.5 billion due to this disease, which has made it the most economically devastating disease in the agricultural sector. Not only mature trees but also oil palm seedlings are susceptible to the infection, and in seedlings the symptoms appear earlier and are more severe [1]. BSR affects a plantation by reducing the number of standing trees as well as the weight of fresh fruit bunches (FFB) [2]. According to Subagio and Foster [3], FFB yield decreases by an average of 0.16 tons per hectare for every dead palm, or equivalent to 35% when half [...] infected seedlings (inoculated at four months old) after six months of inoculation. Their study used significant bands of first derivative spectra to develop a Maximum Likelihood classification model. The accuracy produced by the model was 82% with a kappa coefficient of 0.73. Table 1 gives a summary of methods and specific bands used to detect BSR disease in oil palm seedlings and mature trees. As observed, previous studies have achieved good discrimination accuracy between healthy and infected trees/seedlings, with highest accuracies of 82% and 97% at nursery and plantation level, respectively. Only six of the 20 research studies focused on BSR detection at nursery level using spectroscopy devices; the best method was found to be conventional classification, i.e., Maximum Likelihood, and none of the methods used hyperspectral imaging and machine learning techniques. Although a spectroscopy device is capable of detecting early BSR infection in oil palm seedlings, it can take only one reading at a time for a small sample point, thus requiring a longer duration of data collection.
In contrast, a hyperspectral camera can reduce the time taken for data collection due to its ability to cover large areas in a single imaging session.
The use of machine learning techniques, especially Support Vector Machines (SVMs), has been shown to be beneficial in agricultural applications [30,49-53] and is worth exploring, especially for early detection of BSR, as in the studies conducted by Santoso et al. [54], Santoso et al. [55] and Khaled et al. [56]. SVM offers robust analysis and good prediction because it seeks the hyperplane with the optimal margin between classes. It has been used for classification, regression, and clustering in agricultural studies. The main advantage of SVM is the kernel method, which enables separation of non-linear data in a higher-dimensional space and extends the power of linear learning machines. Widely used kernel functions are linear, Gaussian radial basis function (RBF), and polynomial. In addition, this study was carried out to determine the capability of various SVM classifiers to achieve a high degree of accuracy for different numbers of bands, features, and frond numbers in the early detection of G. boninense.
Table 1 (surviving fragment). Method: Mann-Whitney U test, band ratio, optimum index factor (OIF), K-means clustering, and average silhouette width (ASW) plot; bands: 610.5, 738; n.a.; reference [57]; 10 months old.

Study Area
The study was conducted at the Transgenic Greenhouse, Universiti Putra Malaysia (UPM), Serdang, Malaysia (2°59′33.10″ N, 101°43′19.16″ E), from 24 January 2019 to 24 June 2019. The greenhouse has dimensions of 12 m length × 6 m width × 4 m height and is made up of polycarbonate panels to protect against harmful UV light and is equipped with air conditioners, humidifiers, thermal screens, and a humidity/temperature sensor. The temperature inside the greenhouse was set at 27 °C following the work of Kamil and Omar [69].

Artificial Inoculation of Samples
A total of 28 oil palm seedlings (commercial standard crosses of Dura × Pisifera, D × P) were obtained from Sime Darby Plantation, Banting, Malaysia, at the age of four months old. The seedlings were permitted to acclimatize under greenhouse conditions for one month before transplanting.
At five months old, 15 of the seedlings were transplanted into 24 cm × 21 cm × 33 cm polybags comprising 30% of a mixture of 90% topsoil and 10% sand. A rubberwood block (RWB, 6 cm × 6 cm × 6 cm) colonized with G. boninense pathogen was placed at the center of a polybag. The roots of the seedlings were positioned on top to have direct contact with the RWB and covered with soil until the bole level. This method is called a sitting technique as described in Naidu et al. [70]. Thirteen seedlings were planted with uninoculated RWB acting as the control treatment (U). All the seedlings were arranged according to the standard triangular arrangement of an oil palm nursery with equal and sufficient water and fertilizer application. Two months after artificial inoculation, two of the inoculated (I) seedlings were sent to the Bacteriology Laboratory, Faculty of Agriculture, UPM, Serdang, Malaysia, for a polymerase chain reaction (PCR) test to confirm the G. boninense infection.

Data Collection
The process of image acquisition was conducted five months after transplanting. The camera used in this study was a FireflEYE S185 (Cubert GmbH, Ulm, Germany) snapshot camera with wavelengths ranging from 450 to 950 nm (125 bands) that covers the visible (blue, green, and red) and NIR regions with a sampling interval of 4 nm. The camera was mounted horizontally on a custom tripod and was positioned 2.6 m from the ground level ( Figure 1). Remote Sens. 2020, 12, x 6 of 21

The camera was calibrated with white and dark references before each image acquisition to reduce the effects of illumination and detector sensitivity. As a result, the integration time was approximately the same for every acquisition. The dark calibration was performed by closing the lens of the camera, while the white calibration was performed by placing the provided white rectangular board (99% light reflection) flat and close to the lens. Each collected spectrum was calibrated using these references. The white and dark calibrations were checked before the actual image acquisition to ensure good quality of the output images. One seedling was imaged at a time against a black background board. The images were taken on a sunny, clear day from 11:00 a.m. to 2:00 p.m. local time under natural illumination. The system was controlled by the Cube-Pilot software provided by the manufacturer.
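The white and dark reference calibration described above can be illustrated with a minimal sketch. The calibration equation itself does not survive in this text, so the commonly used form R = (S - D) / (W - D) is assumed here, where S is the raw sample reading, W the white reference, and D the dark reference for each band. The function name and the toy digital numbers are hypothetical and are not taken from the study.

```python
def calibrate_spectrum(raw, white, dark):
    """Return relative reflectance per band.

    Assumed form (not confirmed by the source): R = (raw - dark) / (white - dark).
    """
    if not (len(raw) == len(white) == len(dark)):
        raise ValueError("raw, white and dark must have the same number of bands")
    reflectance = []
    for s, w, d in zip(raw, white, dark):
        span = w - d
        # Guard against a dead band where white and dark readings coincide.
        reflectance.append((s - d) / span if span != 0 else float("nan"))
    return reflectance


# Toy digital numbers for three bands (made up for illustration only).
raw = [520.0, 1010.0, 2600.0]
white = [4020.0, 4010.0, 4100.0]
dark = [20.0, 10.0, 100.0]
reflectance = calibrate_spectrum(raw, white, dark)
print(reflectance)
```

Dividing by the white-minus-dark span expresses each band relative to the reference board, which is why the calibration is repeated whenever illumination may have changed.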

Data Pre-Processing
Figure 2 shows the top view image of frond 1 (F1) and frond 2 (F2). To minimize variations in spectral reflectance due to the effects of frond inclination [71], only the spectral reflectance of the first four leaflets of F1 and F2 was extracted, with sample points selected manually and at random. The use of F1 and F2 followed the work of Shafri et al. [48] and Izzuddin et al. [59] and was based on the morphological arrangement of the fronds. An average of 20 sample points was obtained from each frond, resulting in totals of 558 and 564 sample points for F1 and F2, respectively. The outliers of the data were identified using the box plot method. A box plot summarizes the data graphically using five parameters that represent the distribution, i.e., lower quartile, upper quartile, lower fence, upper fence, and interquartile range. These quartile-based measures are advantageous because of their reduced sensitivity to outliers [72,73].
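The box plot screening described above can be sketched as follows. The source names the quartiles, fences, and interquartile range but does not state the fence multiplier, so the conventional 1.5 × IQR rule is assumed; the function names and the toy reflectance values are hypothetical.

```python
import statistics


def box_plot_fences(values, k=1.5):
    """Return (lower fence, upper fence) using quartiles and the IQR.

    k = 1.5 is the conventional multiplier and is an assumption here.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def split_outliers(values, k=1.5):
    """Separate values inside the fences from those outside them."""
    low, high = box_plot_fences(values, k)
    kept = [v for v in values if low <= v <= high]
    outliers = [v for v in values if v < low or v > high]
    return kept, outliers


# Toy reflectance values (%) at one band; 90.0 is a deliberately extreme point.
values = [40.0, 41.0, 42.0, 43.0, 44.0, 45.0, 46.0, 47.0, 90.0]
kept, outliers = split_outliers(values)
print(box_plot_fences(values), outliers)
```

Because the fences depend only on the quartiles, a single extreme sample point does not shift them much, which is the robustness property the text refers to.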

Data Analysis
In this study, G. boninense infection was detected by analyzing the spectral signatures of the different treatments (U and I) as well as of the different frond numbers. Bands were selected as the first 35 bands (30% of the total bands) that gave high separation values between U and I. These bands were also subjected to a t-test using SPSS statistical software (IBM SPSS Statistics 25, IBM, New York, NY, USA) with a value of p ≤ 0.05. Then, the coefficient of variation (CV) of the sample points was calculated for all significant bands to quantify the dispersion of the data. The CV was calculated as:

$$\mathrm{CV} = \frac{s}{\bar{x}} \times 100\%$$

where $s$ is the standard deviation of the samples, and $\bar{x}$ is the mean of the samples.

Later, the identified significant bands were used as input parameters to develop SVM classification models using the machine learning toolbox of MATLAB (2019b, The MathWorks Inc., Natick, MA, USA). To evaluate the performance of the developed models, a five-fold cross-validation technique was applied, in which the data were randomly divided into five equal-sized subsamples. Each subsample was used once to test a model trained on the remaining four subsamples, so the process was repeated five times. The completed models were subsequently exported and assessed using the prediction dataset.
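The two calculations in this section, the CV and the five-fold split, can be sketched in a few lines. This is an illustration only: the study used SPSS and MATLAB, the use of the sample (n - 1) standard deviation is an assumption, and the function names, the seed, and the toy values are hypothetical.

```python
import random
import statistics


def coefficient_of_variation(values):
    """CV in percent: standard deviation divided by the mean, times 100.

    The sample (n - 1) standard deviation is assumed.
    """
    return statistics.stdev(values) / statistics.mean(values) * 100.0


def five_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and yield (train, test) index lists per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Toy reflectance values (%) for one band.
print(coefficient_of_variation([48.0, 50.0, 52.0]))

# 558 is the number of F1 sample points reported in the text.
folds = list(five_fold_indices(558))
print([len(test) for _, test in folds])
```

Every sample point appears in exactly one test fold, so each of the five models is evaluated on data it was not trained on.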
In this study, SVM classification models with different kernel functions, i.e., linear, Gaussian RBF, and polynomial, were trained. The linear kernel is suited to linearly separable data and may be expressed as:

$$k(x_i, x_j) = x_i^{T} x_j$$

where $k$ is the kernel function, $x_i$ and $x_j$ are $a$-dimensional inputs, and $x_i^{T} x_j$ is a map from the $a$-dimension to the $b$-dimension.

Where the data are not linearly separable, an appropriate kernel function may be used to enhance SVM classification. The kernel method allows SVM to identify a hyperplane in the kernel space, hence making non-linear separation feasible within the feature space. An example of a non-linear kernel is the Gaussian radial basis function (RBF), which can be represented as:

$$k(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^{2}}{2\sigma^{2}}\right)$$

where $\lVert x_i - x_j \rVert^{2}$ is the squared Euclidean distance between two feature vectors, and $\sigma$ is the kernel width. A small kernel width tends to reflect dissimilar patterns and causes overfitting, whereas a large kernel width results in very similar patterns and causes underfitting. The optimal kernel width is chosen as a tradeoff between underfitting and overfitting loss. Furthermore, $\sigma$ is related to the kernel scale ($\gamma$) by $\gamma = \frac{1}{2\sigma^{2}}$. In this study, the value of $\gamma$ was adjusted to different values according to assumptions based on $n$, the number of features.

Another kernel function used was the polynomial kernel, where $p$ is the order of the polynomial. The degree of the polynomial kernel influences the tolerance of the classifier: a higher-degree polynomial gives a more flexible decision boundary than a lower-degree one.
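The three kernels discussed above can be written out as a minimal sketch. Several details are assumptions because the corresponding expressions do not survive in this text: the polynomial kernel is taken in the common form (x_i^T x_j + c)^p with c = 1, and the Fine, Medium, and Coarse Gaussian settings are taken from the kernel-scale presets documented for MATLAB's Classification Learner (sqrt(n)/4, sqrt(n), and 4 sqrt(n)). Whether the study used exactly these values is not confirmed by the source, and all function names are hypothetical.

```python
import math


def linear_kernel(x, y):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel with kernel width sigma."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def polynomial_kernel(x, y, p, c=1.0):
    """Polynomial kernel of order p; the offset c = 1 is an assumption."""
    return (linear_kernel(x, y) + c) ** p


def gaussian_preset_scales(n_features):
    """Kernel-scale presets as documented for MATLAB's Classification Learner.

    Assumed to correspond to the Fine/Medium/Coarse Gaussian SVMs in the text.
    """
    root = math.sqrt(n_features)
    return {"fine": root / 4.0, "medium": root, "coarse": root * 4.0}


x, y = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(x, y))
print(rbf_kernel(x, y, sigma=1.0))
print(polynomial_kernel(x, y, p=2))
print(gaussian_preset_scales(35))
```

A smaller scale makes the Gaussian response more localized, which matches the description of a small kernel width reflecting dissimilar patterns and tending toward overfitting.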
After the classification models were constructed, an optimization process was carried out to find the optimal number of bands that could give suitable classification accuracy. Exploration runs were applied where the initial number of significant bands was optimized, as shown in Figure 3. Thirty-five significant bands that were confirmed to be statistically significant were used as inputs of the SVM classifiers. If the classification accuracy obtained by all SVM classifiers was greater than 85%, the current number of significant bands was reduced by 50%. Otherwise, if the condition was not satisfied, the current number of significant bands was increased by 50%. The classification models of the reflectance spectra were developed separately for F1, F2, and the combination of F1 and F2 (F12) using the respective significant bands.
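The halve-or-grow rule described above can be sketched as a short loop. Two points are assumptions rather than statements from the source: band counts are rounded up (inferred from the sequences 35, 18, 9, 5 and 9, 14 reported in the results), and the search stops either at a floor of five bands or after one 50% increase. The `evaluate` callback, the function name, and the toy accuracies are hypothetical.

```python
import math


def optimize_band_count(evaluate, start=35, threshold=85.0, min_bands=5):
    """Apply the 50% reduce/increase rule and return (final count, history).

    evaluate(n_bands) must return the accuracies (in %) of every classifier
    trained on the top n_bands significant bands.
    """
    history = []
    n_bands = start
    while True:
        passed = min(evaluate(n_bands)) > threshold
        history.append((n_bands, passed))
        if not passed:
            # Increase by 50%, re-check once, and stop (assumed stopping rule).
            n_bands = math.ceil(n_bands * 1.5)
            history.append((n_bands, min(evaluate(n_bands)) > threshold))
            return n_bands, history
        if n_bands <= min_bands:
            # Assumed floor: no further reduction below min_bands.
            return n_bands, history
        n_bands = math.ceil(n_bands * 0.5)


# Toy accuracies for two classifiers per band count (illustrative values only).
toy = {35: [93.0, 89.0], 18: [91.0, 89.0], 9: [88.0, 47.0], 14: [90.0, 88.0]}
final, history = optimize_band_count(lambda n: toy[n])
print(final, history)
```

With these toy values the search visits 35, 18, 9, and then 14 bands; when every classifier stays above the threshold, the same rule visits 35, 18, 9, and 5 bands.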
Figure 4 shows the condition of the oil palm seedlings that were sent to the Bacteriology Laboratory, Faculty of Agriculture, UPM, for a PCR test to confirm G. boninense infection. Both samples tested positive for G. boninense infection, although the seedlings displayed no apparent symptoms associated with BSR disease, such as fungal mass and yellowing of older leaves. However, longitudinal sectioning of the bole showed brown discoloration, indicating the presence of G. boninense infection. Furthermore, Figure 5 shows the condition of an I seedling 20 weeks after artificial inoculation. The seedling also did not show any visible symptoms related to G. boninense infection, as stated in Izzati et al. [74], Kok et al. [75], and Naidu et al. [70], despite being inoculated with the G. boninense pathogen.

Figures 6 and 7 show the average spectral reflectance of F1 and F2 for the U and I seedlings, taken five months after inoculation, together with the standard deviation about the mean. As shown in these figures, F1 and F2 yielded almost similar reflectance patterns for both the U and I seedlings. However, the U seedlings demonstrated higher reflectance in the green (520 to 570 nm) and NIR (750 to 950 nm) ranges, with maximum differences of around 1.4% and 15.4% for F1, and 1.6% and 17.3% for F2, respectively. Although the NIR range shows a higher standard deviation than the green range, it completely separates the U and I seedlings, with no overlap at any wavelength.

Table 2 tabulates the 35 significant bands for F1 and F2. Although the significant bands from both fronds were located in the NIR region, two of the selected bands were different. For example, 862 nm was not significant for F1 but was significant for F2, and 810 nm was not significant for F2 but was significant for F1. These significant bands comprised 30% of the total 125 bands and were verified as statistically significant. Considering only 35 significant bands instead of 125 bands could avoid analytical issues caused by unnecessary bands, which would make future hardware less complex and more economical to design.

The CV of all sample points of the significant bands for F1 and F2 is shown in Figure 8. The CV values were in the range of 5 to 14% and were considered to be good and reliable [76]. Based on the figure, the sample points of F2 demonstrated more variation than those of F1 at the significant bands (800 to 950 nm).

SVM Classification
All 35 significant bands tabulated in Table 2 were utilized as input parameters for the development of SVM classification models. Apart from F1 and F2, the combination of F1 and F2 (F12) was also included in the classification for comparative purposes. The number of significant bands was reduced based on the classification accuracy achieved by the SVM classifiers so that the ability of small data to achieve high classification accuracy could be determined.

Frond 1 (F1)
First, the 35 most significant bands tabulated in Table 2 were used as inputs for the SVM classifiers. The classification output in Table 3 shows that all classifiers classified the U and I seedlings with an accuracy above 85%. Therefore, this set of bands was reduced by 50% to 18 bands, which were used as new inputs for the classification. Table 4 lists the significant bands after band optimization. The 18 significant bands also gave a classification accuracy of over 85% for all classifiers, so they were reduced by 50% to nine bands. The nine significant bands again achieved classification accuracies of more than 85% for all classifiers, and the number of bands was reduced by a further 50%. With five bands, all classifiers still gave a classification accuracy greater than 85%. The optimization was terminated at five bands because the classification accuracy showed a decreasing trend from 35 to 5 bands; it was therefore predicted that any further reduction of bands would only decrease the accuracy.

The results also indicated that 100% accuracy could be achieved using 35 bands and 18 bands for all classifiers. Overall, Fine Gaussian SVM was the best classifier for F1, with 100% accuracy for 35 and 18 bands and 99% accuracy for nine and five bands, with a kappa coefficient of 0.97. This indicated that the Fine Gaussian SVM was not sensitive to band reduction, since it still gave 99% accuracy with nine and five bands, because it makes more finely detailed distinctions between classes than the other types of classifiers. In contrast, the Quadratic SVM and Coarse Gaussian SVM were very sensitive to band reduction: the use of nine bands reduced their accuracy by 3%.
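The kappa coefficient reported alongside accuracy can be computed from a confusion matrix, as in the minimal sketch below. It uses the standard definition of Cohen's kappa, (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name and the toy confusion matrix are hypothetical and are not results from the study.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: true, columns: predicted)."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / (total * total)
    # Undefined when expected == 1 (all samples in a single class).
    return (observed - expected) / (1.0 - expected)


# Toy two-class matrix for U and I seedlings (made-up counts).
toy_confusion = [[45, 5], [5, 45]]
print(cohen_kappa(toy_confusion))
```

Kappa corrects accuracy for agreement that would occur by chance, which is why it is lower than the raw accuracy for the same model.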

Frond 2 (F2)
SVM classification models were first developed using the 35 significant bands tabulated in Table 2. The results in Table 5 show that the 35 bands produced more than 85% accuracy for all classifiers. Therefore, the 35 bands were reduced by 50%. The resulting 18 bands were used as new inputs to the SVM classifiers and also yielded over 85% accuracy for all classifiers, so they were reduced by 50% again. The results indicate that nine significant bands could not provide above 85% accuracy for all classifiers. Consequently, the number of bands was increased by 50%, i.e., from nine bands to 14 bands. With 14 bands, all classifiers exceeded 85% accuracy. The optimization was ended at 14 bands because further reduction would cause a substantial loss of classification accuracy, as occurred for the Cubic SVM with nine bands. All significant bands in the optimization process are tabulated in Table 6.

In general, the number of bands affects the classification accuracy of the SVM classifiers: a larger number of bands tends to produce classification models with higher accuracy, which supports better prediction. Fine Gaussian SVM obtained the highest accuracy overall, 93%, when using 35 bands. With the same 35 bands, all SVM classifiers achieved classification accuracies above 90%, except for the Cubic SVM, which reached a slightly lower accuracy of 89%. For 18 bands, the classification accuracy was slightly lower than for 35 bands. Although Fine Gaussian SVM was among the classifiers with the highest accuracy for 18 bands, its accuracy fell by 2% compared to 35 bands, making it the classifier with the highest percentage of loss. Further, the only classifier that maintained the same accuracy as with 35 bands was the Cubic SVM, at 89%. However, this kernel still gave lower accuracy than the others.

Interestingly, reducing the bands from 18 to 9 cut the accuracy of the Cubic SVM sharply, from 89% to 47%. Meanwhile, Quadratic SVM, Medium Gaussian SVM and Coarse Gaussian SVM each showed a decrease of 1%. Almost all of the classifiers scored lower classification accuracies at nine bands than at 18 and 14 bands, except for the Linear SVM and Coarse Gaussian SVM, which maintained the same accuracy. In conclusion, the best classifier for this analysis was Fine Gaussian SVM, which consistently scored the highest accuracy for every number of bands, with kappa coefficients of 0.89, 0.84, 0.85, and 0.77 for 35, 18, 14, and 9 bands, respectively.

Combination of Frond 1 and Frond 2 (F12)
Thereafter, the reflectance dataset of F1 and F2 were combined to assess the performance of SVM classifiers on the combination dataset. Firstly, the 35 statistically significant bands tabulated in Table 7 were used as inputs to the SVM classifiers. The output of classification in Table 8 shows that all the classifiers gained over 85% accuracy when using 35 bands. Therefore, the 35 bands were reduced by 50%. The 18 bands were used as a new input for the classifiers. The 18 bands also obtained classification accuracy above 85% for all the classifiers, which resulted in a reduction of bands by 50%. For nine bands, one classifier obtained an accuracy lower than 85% by which the bands were then subsequently increased by 50% to 14 bands. The 14 bands successfully recovered the accuracy achieved by nine bands with all classifiers earning classification accuracies greater than 85%. However, there was no further optimization of 14 bands since nine bands already experienced an evident decrease in the classification accuracy of the Cubic SVM. Based on Table 8, the highest classification accuracy acquired was 95%, and at least three classifiers earned that accuracy for 35, 18, and 14 bands each. Linear SVM, Fine Gaussian SVM, and Medium Gaussian SVM scored the same accuracy of 95% with a kappa coefficient of 0.90 across the number of bands, except for nine bands where the accuracies were slightly lower. Furthermore, Quadratic SVM and Coarse Gaussian SVM experienced a one-time reduction in classification accuracy either at 18 or nine bands. After that, the accuracy became fixed until 14 bands. As for the Cubic SVM, the classification accuracy achieved gradually decreased as the number of bands decreased, initially 94% for 35 bands, then reduced to 78% for 9 bands and increased to 89% for 14 bands.
In general, the accuracy of the F12 models was not affected by the band optimization, except for the Cubic SVM. The 35 and 18 bands yielded more SVM models with the highest classification accuracy than the nine and 14 bands did. With 14 bands, the highest classification accuracy was reached only by Linear SVM (kappa coefficient 0.89), Fine Gaussian SVM (kappa coefficient 0.90), and Medium Gaussian SVM (kappa coefficient 0.90). Apart from the number of bands, the type of classifier therefore also played a major role. For instance, the Quadratic SVM scored the same accuracy of 93% at 18, 14, and 9 bands. The results also suggest that the localized, finite response of the RBF kernel function suits these data, as the accuracy it produced was almost constant despite the band reduction. For example, Fine Gaussian SVM and Medium Gaussian SVM achieved the same kappa coefficient of 0.90 for 35, 18, and 14 bands, while at nine bands the kappa coefficients were 0.88 and 0.87, respectively.
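To make the relation between overall accuracy and the kappa coefficient concrete, the sketch below computes kappa from a confusion matrix. It assumes the reported coefficient is Cohen's kappa, which this section does not state explicitly. The confusion matrix is hypothetical and perfectly balanced, whereas the actual class balance of the study's samples is not given here.

```python
def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: true, columns: predicted)."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(k)
    ) / (total * total)
    return (observed - expected) / (1 - expected)

# Hypothetical balanced two-class result (U vs. I) with 95% overall accuracy.
example = [[95, 5],
           [5, 95]]
kappa = cohens_kappa(example)
print(round(kappa, 2))  # 0.9
```

Under this balanced assumption, chance agreement is 0.5, so an accuracy of 95% corresponds to a kappa of 0.90, which matches the pairing of values reported for F12.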

Discussion
The reflectance pattern of the I seedlings was typical of diseased plants, with lower reflectance in the NIR spectrum due to the destruction of xylem, which reduced the chlorophyll pigments and caused water deficiency. According to Liaghat et al. [67], changes in NIR reflectance during a stress period were more evident than changes in the visible spectrum, since NIR could penetrate deeper through the leaf pigments than visible wavelengths. The changes in NIR were due to the rupture of the mesophyll cell wall [62,77–79], which caused higher absorbance and lower reflectance of NIR. This result agreed with the findings of Ausmus and Hilty [80], in which healthy and maize dwarf mosaic virus-infected leaves showed significant differences in the NIR range even before physical symptoms developed, with healthy leaves having higher NIR reflectance than infected leaves.
In this study, the U seedlings showed slightly higher reflectance than the I seedlings in the visible range. This pattern ran contrary to the spectral signatures reported by other researchers, who agreed that healthy plants normally have lower reflectance than diseased plants in the visible range, especially in the green (520 to 560 nm), owing to the higher chlorophyll content of their leaves. Nevertheless, the pattern was similar to that found by Shafri et al. [48], where healthy seedlings yielded higher reflectance than G. boninense-infected seedlings in the green wavelengths. This reflectance pattern might therefore be a unique spectral signature of oil palm seedlings, since each plant has a specific spectral signature. Furthermore, according to Schmidt and Skidmore [81], the spectral reflectance of different types of vegetation differs with statistical significance in various spectral regions.
Although the reflectance patterns of the U and I seedlings of F1 and F2 were very similar, the U of F2 produced a slightly higher reflectance than the U of F1 throughout the spectrum. This may indicate that older leaves generate higher reflectance than younger leaves. This idea is supported by Rapaport et al. [82], who found increases in the NIR reflectance of Cabernet Sauvignon in the second week, when the fourth leaf (young) of the control treatment moved to the eighth nodal position (old leaf position), and concluded that age variability mainly influenced the differences in reflectance spectra. In contrast, Ahmadi et al. [62] presented average reflectance curves of healthy oil palms in which frond 17 (old) produced lower NIR reflectance than frond 9 (young) during the first data collection. In the second data collection (8 months later), however, the NIR reflectance of fronds 17 and 9 was almost the same, which was not consistent with the first data collection.
Using the selected wavelengths of 800 to 950 nm, it was possible to obtain 100% classification accuracy in discriminating healthy from asymptomatic G. boninense-infected seedlings. The results also confirm that the prediction models developed using F1 generally had the highest accuracy, reaching 100% with 35 and 18 bands as input parameters. In addition, the accuracies attained by the SVM models showed that F12 improved on the accuracies of F2, which verified that both fronds could be used to detect G. boninense infection in oil palm. This outcome agreed with Shafri et al. [48], who conducted a Maximum Likelihood classification using a combination of F1 and F2 to determine the health status of oil palm seedlings and achieved a net accuracy of 82% with a kappa coefficient of 0.73. With the same type of input data as Shafri et al. [48], i.e., F1 and F2, our method gave better accuracy, exceeding 90% at several band counts. Shafri et al. [48] used 24 significant bands (three green bands, 20 red bands, and one NIR band) of first derivative spectra as input parameters. Our method, which used the NIR spectrum, performed well even with a small number of bands, i.e., nine bands with 94% accuracy. Reducing the number of bands makes the model less complex and more economical. This promising result is useful for aerial-view applications, such as image acquisition with an unmanned aerial vehicle (UAV), since both fronds can be clearly seen in the top-view image, which could expedite the detection of G. boninense disease.
The F1 and F12 classification models produced robust results in contrast to the F2 classification models. For example, several SVM classifiers scored high classification accuracy when using 35 and 18 bands (for F1 and F12), 9 and 5 bands (for F1), and 14 bands (for F12), which suggested that even a small number of input parameters could attain classification accuracies similar to those of a large number of input parameters. However, this depended on the type of classifier used. For F2, by contrast, the highest classification accuracy was achieved only by Fine Gaussian SVM using 35 bands. The differences in the number of bands had no significant impact on the accuracy generated by the SVM classifiers. For example, with F2, nine bands were able to reach classification accuracies above 90%, similar to 35, 18, and 14 bands. The distinct differences between the accuracy of F1 and F2 arose from the higher coefficient of variation (CV) of F2, which indicates a higher dispersion of data in F2. In addition, Ahmadi et al. [62] claimed that younger fronds were more suitable for the early detection of G. boninense infection, since they gave better classification accuracy than older fronds owing to their location at the top of the crown.
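The dispersion argument can be illustrated with a short sketch, assuming that CV denotes the coefficient of variation (standard deviation divided by the mean). Whether the study used the sample or population standard deviation is not stated in this section, so the sample form is an assumption. The reflectance readings below are hypothetical and are not data from this study.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, expressed as a percentage."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical reflectance readings at a single NIR band (illustrative only).
f1_reflectance = [0.50, 0.51, 0.49, 0.50, 0.52]
f2_reflectance = [0.50, 0.56, 0.44, 0.53, 0.47]

cv_f1 = coefficient_of_variation(f1_reflectance)
cv_f2 = coefficient_of_variation(f2_reflectance)
print(f"CV F1 = {cv_f1:.1f}%, CV F2 = {cv_f2:.1f}%")
```

A larger CV at similar mean reflectance means the two classes overlap more readily, which is one plausible reason for lower classification accuracy on the more dispersed frond.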
Moreover, the results also demonstrate the importance of the kernel method: Fine Gaussian SVM outperformed Linear SVM in 14 bands of F2 and nine bands of F12, while Medium Gaussian SVM attained accuracies similar to those of Linear SVM except when using nine bands of F2. In contrast, Coarse Gaussian SVM was less accurate than Linear SVM in all bands of F2 but more accurate in nine bands of F12. This showed that the Gaussian RBF classifiers could provide a much better classification than the linear kernel SVM when an appropriate kernel scale was used. For the polynomial kernel, however, Quadratic SVM yielded higher classification accuracy than Cubic SVM in F2 and F12, whereas Cubic SVM scored higher than Quadratic SVM in F1. This indicated that the performance of SVM-based classification models was highly affected by both the type of data and the kernel function, as the kernel effect depended on the data used. For example, F1 data could be optimized to five bands with classification accuracies still above 85% for all SVM classifiers. In comparison, F2 and F12 could only be optimized to 14 bands, since the Cubic SVM fell below 85% accuracy at nine bands for both datasets.
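The kernel families compared above can be written out as a minimal sketch. The classifier names resemble the presets of MATLAB's Classification Learner, so the example assumes that convention: polynomial kernels of the form (1 + x·y)^q, and Gaussian kernel scales of sqrt(P)/4 (fine), sqrt(P) (medium), and 4·sqrt(P) (coarse) for P input bands. This section does not confirm those settings, and the two reflectance vectors are hypothetical.

```python
import math

def linear_kernel(x, y):
    """Plain dot product of two band vectors."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree):
    """(1 + <x, y>)^degree; degree 2 is quadratic, degree 3 is cubic."""
    return (1.0 + linear_kernel(x, y)) ** degree

def gaussian_kernel(x, y, scale):
    """exp(-||x - y||^2 / scale^2); a smaller scale gives a more localized response."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / scale ** 2)

def preset_scales(n_bands):
    """Assumed fine/medium/coarse kernel scales for n_bands predictors."""
    root = math.sqrt(n_bands)
    return {"fine": root / 4, "medium": root, "coarse": 4 * root}

# Two hypothetical nine-band NIR reflectance vectors (illustrative only).
u_leaf = [0.60] * 9
i_leaf = [0.50] * 9

similarity = {
    name: gaussian_kernel(u_leaf, i_leaf, scale)
    for name, scale in preset_scales(9).items()
}
for name, value in similarity.items():
    print(f"{name}: {value:.3f}")
```

For the same pair of vectors, the fine scale yields the lowest similarity and the coarse scale the highest, so the fine Gaussian kernel separates nearby points most sharply while the coarse kernel behaves closer to a smooth, nearly linear boundary.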

Conclusions
NIR reflectance showed significant differences between the U and I seedlings. G. boninense infection can be detected at an early stage, even when there are no physical symptoms of the disease, by using SVM classifiers with a varying number of NIR bands. The reflectance spectra of F1 yielded 100% classification accuracy for all SVM kernel functions using 35 and 18 NIR bands. In contrast, F2 and F12 achieved highest classification accuracies of 93%, when using Fine Gaussian SVM at 35 NIR bands, and 95%, when using the Gaussian RBF and linear kernels at 35, 18, and 14 NIR bands, respectively. Since F12 produced higher classification accuracy than F2, it could be concluded that F12 would be better suited for early detection of G. boninense infection in oil palm when using aerial images, because there is no need to separate F1 and F2 during the data pre-processing stage. It was also observed that a high number of bands achieved high classification accuracy, while a small number of bands obtained slightly lower accuracy. In addition, the optimized Gaussian RBF kernel, i.e., Fine Gaussian SVM, could outperform the Linear SVM and other classifiers in classification accuracy, although this depended on the number of bands and the fronds used.
For future work, the method developed in this research could be tested in an open environment to confirm its reliability for field application. The camera could also be calibrated before every image acquisition to avoid errors due to illumination. However, careful consideration must be given to sun angle, shadow, and weather conditions. In addition, the inoculation period of oil palm seedlings could be shortened to less than 10 months to check the ability to detect the earliest infection of G. boninense. Research could also be extended to different oil palm varieties to test their tolerance to G. boninense infection and its effects on spectral reflectance.