Variety Identification of Orchids Using Fourier Transform Infrared Spectroscopy Combined with Stacked Sparse Auto-Encoder

The feasibility of using the Fourier transform infrared (FTIR) spectroscopic technique with a stacked sparse auto-encoder (SSAE) to identify orchid varieties was studied. Spectral data of 13 orchid varieties covering the spectral range of 4000–550 cm−1 were acquired to establish discriminant models and to select optimal spectral variables. K-nearest neighbors (KNN), support vector machine (SVM), and SSAE models were built using the full spectra. The SSAE model performed better than the KNN and SVM models, obtaining a classification accuracy of 99.4% in the calibration set and 97.9% in the prediction set. Then, three algorithms, principal component analysis loading (PCA-loading), competitive adaptive reweighted sampling (CARS), and stacked sparse auto-encoder guided backward (SSAE-GB), were used to select 39, 300, and 38 optimal wavenumbers, respectively. KNN and SVM models were then built on the optimal wavenumbers. Most of the optimal-wavenumber-based models performed slightly better than the full-spectrum models. SSAE-GB outperformed the other two algorithms in terms of both the accuracy of the discriminant models and the number of optimal wavenumbers. These results showed that the FTIR spectroscopic technique combined with the SSAE algorithm could be adopted for the identification of orchid varieties.


Introduction
Orchids, one of the two major families of flowering plants, have fascinated botanists and plant enthusiasts for centuries [1]. Since the introduction of tropical species into cultivation in the 19th century, a large number of orchid hybrids and cultivars have been produced by horticulturists. Many orchids are mistaken for other varieties because of their similar appearance. Thus, it is important to clarify the identity and type of an orchid and distinguish it from other similar orchids. Normally, orchid taxa are determined by their main morphology, ecology, and rarity [2,3]. In addition, many experts determine the type of orchid by identifying differences in genes [4,5], an expensive and time-consuming process which is often applied to only a few sample seeds. Thus, it is of significant interest to develop a rapid method for identifying orchid varieties. In recent years, spectroscopic techniques have proven to be powerful analytical tools that can provide detailed structural information on sample properties and composition at the molecular level [6]. Fourier transform infrared (FTIR) spectroscopy is considered a simple (requiring minimal sample preparation), rapid, low-cost, and highly sensitive spectroscopic method [7]. Like the fingerprints of each person, the infrared spectrum of a sample is characteristic of its chemical composition.

Spectral Profiles
The spectra of all original data are shown in Figure 1A, and the average spectra of each variety are shown in Figure 1B. Since each spectrum was obtained from 32 scans, the spectral profiles are very smooth. In a more detailed analysis of the spectrum, a broad-band O-H stretch was observed near 3322 cm−1 (between 3200 and 3650 cm−1) [21]. The absorption peaks at 2849 and 2924 cm−1 are mainly attributed to the stretching vibration of C-H [22]. The peak at 1620 cm−1 is caused by the O-H bending of absorbed water [23]. The absorption peak at 1417 cm−1 is mainly caused by the combination of the C-H bending of alkenes and the O-H bending of the C-OH group [24]. The peak at 1369 cm−1 mainly reflects C-H bending vibrations [25]. The absorption at 1237 cm−1 was attributed to O-H bending [26]. The largest absorption occurs near 1023 cm−1, mainly due to C-O stretching in cellulose [27]. There were clear differences in the spectral absorbances of the different orchid varieties, but the varieties could not be distinguished by spectral differences alone.
Figure 1. The profiles of (A) the raw spectra and (B) the average spectra of each orchid.

Discriminant Models Based on Full Spectra
Therefore, chemometric methods were used to analyze the spectra and establish recognition models for the different varieties. First, discriminant models were established using the full spectral data. KNN, SVM, and SSAE were used to establish the discriminant models, and classification accuracy was used as the evaluation metric of model performance. As shown in Table 1, the classification accuracy of the SSAE model was 99.4% for the calibration set and 97.9% for the prediction set, both of which were significantly higher than those of the SVM and KNN models. The performance of the KNN model was clearly worse than that of the other two models: the accuracy for the prediction set was lower than 60%, indicating serious over-fitting. Although the calibration set accuracy of the SVM model reached 100%, the prediction set accuracy was 92.6%, indicating slight over-fitting. Therefore, the classification performance of the SSAE model was significantly better than that of the other two models. In a previous study, Nie et al. applied near-infrared hyperspectral imaging combined with a deep convolutional neural network to identify hybrid seeds and reached accuracies of 93.87% and 96.12% for hybrid loofah and okra, respectively [28]. Wu et al. classified three varieties of tea samples using FTIR and the allied Gustafson-Kessel algorithm, reaching an accuracy of 93.9% in the wavenumber range of 4001.569–401.1211 cm−1 [9]. Michele et al. reported linear discriminant analysis (LDA) models with a discriminant accuracy of 92.0% in the wavenumber ranges of 2300–600 and 3000–2400 cm−1 for five different types of Moroccan olive cultivars [10]. Therefore, FTIR spectroscopy combined with the SSAE model for the identification of thirteen different types of orchids performed better than previous works, both in terms of the number of varieties examined and the accuracy of recognition.



Feature Visualization with t-SNE
To visually demonstrate the validity of the dispersion of the SSAE network, the distribution of feature variables was visualized by projecting features from the high-dimensional space into the two-dimensional space based on t-distribution stochastic neighbor embedding (t-SNE). Then, the two-dimensional data were used to draw a scatter plot to observe the distribution of the raw data. Different from the traditional dimensionality reduction techniques that perform a linear mapping of data from a high to low dimensional space, t-SNE has been widely used to visualize high-dimensional data with a non-linear dimensionality reduction method.
As shown in Figure 2, t-SNE was adopted to visualize the original data and the two-layer output features of the SSAE. From the visualization of the original data, it can be seen that the distribution of the different classes is disorderly. The feature representation of hidden layer 1 is more dispersed than the original spectral data: it differed to a certain degree from the raw data while being highly compact. In contrast, the feature representation of hidden layer 2 was clearly separated, although a few individual samples were misclassified. This visualization technique has the potential to explain the effectiveness of the SSAE in processing the spectral data and to establish an intuitive visual differentiation method.
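As a minimal sketch of this visualization step, the two-dimensional embedding can be produced with scikit-learn's TSNE. The feature matrix below is a synthetic stand-in for the SSAE hidden-layer outputs (the sample count, feature dimension, and t-SNE settings are illustrative assumptions, not those of the study):

```python
import numpy as np
from sklearn.manifold import TSNE

# Hypothetical stand-in for SSAE hidden-layer features:
# 13 classes with 10 samples each, 64 features per sample.
rng = np.random.default_rng(0)
features = np.vstack([rng.normal(loc=c, scale=0.3, size=(10, 64))
                      for c in range(13)])
labels = np.repeat(np.arange(13), 10)

# Non-linear projection into two dimensions for the scatter plot.
embedded = TSNE(n_components=2, perplexity=15, random_state=0,
                init="pca").fit_transform(features)
print(embedded.shape)  # (130, 2)
```

The resulting 130 × 2 array can then be drawn as a scatter plot colored by variety label, as in Figure 2.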


Optimal Wavenumber Selection
For the development of instruments that can be adopted for portable detection, the use of characteristic wavenumbers can effectively reduce instrument development cost, as processing the whole band of spectral data would increase the computational complexity and demand more computing power from the hardware. Thus, the selection of sensitive wavenumbers in multivariate spectral analysis is necessary to determine the principal spectral information and simplify the modeling process. PCA-loading, CARS, and stacked sparse auto-encoder guided backward (SSAE-GB) were applied to select optimal wavenumbers in this study. As shown in Figure 3, the strong peaks and valleys of the SSAE-GB partial derivative result with an absolute value over 0.5 were chosen as optimal wavenumbers, and 38 optimal wavenumbers were selected. The wavenumber distribution of each selection method is displayed in Figure 4. PCA-loading selected 39 optimal wavenumbers, and CARS selected the most at 300; relative to the full spectrum, the number of wavenumber variables was thus reduced by 95.81% (CARS) and 99.47% (SSAE-GB). There were many overlapping ranges among the three methods. The maximum overlapping range was between 3000 and 3750 cm−1, which can be assigned to O-H stretching [29]. The other common overlapping ranges included 1712 cm−1 (C=O stretching [30]), 1642 cm−1 (H-O-H bending of water [31]), 1530–1560 cm−1 (coupled N-H deformation and C-N stretching [32]), and near 861 cm−1 (CH2 rocking vibrations [33]).
To verify the validity of the feature wavenumber selection, SVM and KNN models were established based on the optimal wavenumbers extracted by PCA-loading, CARS, and SSAE-GB. The recognition accuracy of each model for the calibration set and the prediction set is shown in Table 2. As with the full-spectrum modeling, the SVM model performed much better than the KNN model.
In addition, most of the optimal-wavenumber-based models were slightly better than the full-spectrum models, a trend reflected in both the SVM and KNN models. This could indicate that the full spectrum, with 7157 variables, contained redundant information that affected the accuracy of the models. Although the SVM model based on the wavenumbers selected by CARS was slightly better than the SVM model based on SSAE-GB, the number of characteristic wavenumbers selected by SSAE-GB was only 0.531% of the full spectrum, while CARS selected as much as 4.192%. Overall, SSAE-GB showed satisfactory performance and was suitable for spectral feature selection.

Samples Preparation and FTIR Spectra Acquisition
Thirteen varieties of orchid leaves (including cl25, cl3215, cl3sheng, cl43, cl49, cl52, cl5839, cl_mei, cljin, cls39, hongfenjiaren, hongmeiren, and jiutoulan) were collected from the orchid nursery of the Institute of Horticulture, Zhejiang Academy of Agricultural Science, Hangzhou, China. The plants were collected from the same greenhouse to reduce environmental effects. The orchids whose names begin with 'cl' are hybrid combinations of Chunlan (Cymbidium goeringii) and other orchid species, and the orchids whose names begin with 'h' or 'j' are varieties of the four-season orchid (Cymbidium ensifolium). The number of experimental samples for each orchid is shown in Figure 5.

All samples were freeze-dried and ground to a powder using a grinder before being subjected to FTIR spectroscopy over a wavenumber range of 4000–550 cm−1. Before each sample scan, 0.02 g of the sample was evenly mixed with 0.98 g of dried KBr powder, and the entire process was carried out under an infrared heat lamp to minimize variations in the moisture content of the sample powder. The uniformly mixed powder was pressed at 15 MPa for 30 s using a tableting machine. The FTIR spectra of these samples were obtained using an FTIR spectrometer (FTIR 4100, JASCO Corporation, Tokyo, Japan) with a spectral resolution of 4 cm−1. During spectral scanning, each sample was scanned 32 times, and the average spectrum was taken as the sample spectrum. The sampling interval of the background signal was set to 45 min, and the experimental temperature was maintained at approximately 25 °C.

Multivariate Data Analysis
Three discriminant models, KNN, SVM, and SSAE, were applied to identify the orchid varieties based on the full-band spectra. The KNN and SVM models were used to compare the representative ability of the optimal wavenumbers selected by PCA-loading, CARS, and SSAE-GB. A high-dimensional dataset visualization method, t-SNE, was applied to explore the feature extraction ability of the SSAE. The multivariate data analysis was implemented in Python 3.6.8 and run in Jupyter 4.4.0. The flow chart of this research is shown in Figure 6.


K-Nearest Neighbor
K-nearest neighbor (KNN) is a non-parametric method widely used for pattern recognition [34]. It finds the k instances nearest to an unknown instance in the training dataset, where k is a manually set parameter. The unknown instance is assigned to the category to which the majority of these k nearest instances belong. Determining the parameter k is critical for KNN. In this study, the optimal k was selected from 3 to 20 with a step of 1 using three-fold cross validation.
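The k-selection procedure described above can be sketched with scikit-learn. The spectra here are a small synthetic stand-in (13 well-separated classes), since the real FTIR dataset is not included:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the FTIR data: 13 classes, 20 samples each,
# 30 spectral variables (dimensions are illustrative).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(20, 30)) for c in range(13)])
y = np.repeat(np.arange(13), 20)

# Select k from 3 to 20 (step 1) with three-fold cross validation,
# as described above.
search = GridSearchCV(KNeighborsClassifier(),
                      param_grid={"n_neighbors": list(range(3, 21))},
                      cv=3, scoring="accuracy")
search.fit(X, y)
best_k = search.best_params_["n_neighbors"]
print(best_k)
```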



Support Vector Machine
The support vector machine (SVM) is a supervised recognition algorithm whose basic model is the linear classifier with the largest margin in the feature space [35]. Owing to its efficiency in processing both linear and non-linear data, the SVM has been widely used in spectral data analysis. The SVM maps the data from the original space to a higher-dimensional feature space via a kernel function, such that instances of the individual categories are separated as clearly as possible. In this study, the SVM with the radial basis function kernel was applied. To achieve optimal performance, a grid-search procedure was used to obtain the optimal penalty parameter (c) and kernel function parameter (g) within the search range of 2^-8 to 2^8.
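A sketch of this grid search over c and g, assuming the common scikit-learn interface and synthetic stand-in data (the class structure and grid granularity are illustrative):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in data: 13 classes, 20 samples each, 30 variables.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(20, 30)) for c in range(13)])
y = np.repeat(np.arange(13), 20)

# Grid search for penalty parameter C and RBF kernel parameter gamma
# over 2^-8 .. 2^8, as described above.
grid = {"C": [2.0 ** p for p in range(-8, 9)],
        "gamma": [2.0 ** p for p in range(-8, 9)]}
search = GridSearchCV(SVC(kernel="rbf"), grid, cv=3, scoring="accuracy")
search.fit(X, y)
print(search.best_params_)
```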

Principal Component Analysis Loading
Principal component analysis (PCA) is an effective dimensionality reduction method which has been widely applied in processing large spectral datasets [36]. It converts linearly correlated high-dimensional variables into linearly independent low-dimensional variables (called principal components) by an orthogonal transformation. The first principal component has the largest variance, and each succeeding component in turn has the highest variance possible under the constraint that it is orthogonal to the preceding components. The main information of the raw variables is contained in the first few principal components. PCA-loading is obtained from the PCA algorithm and can be used to extract characteristic variables by analyzing the loadings. The loading reflects the degree of correlation between an original spectral variable and a principal component; the larger the absolute loading, the more important the band corresponding to that spectral variable. In this study, the optimal wavenumbers were selected from the loadings of the first 6 principal components, which explained 98.65% of the total variance.
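A minimal sketch of loading-based selection. The exact selection rule used in the study (how loadings across the six components are combined and thresholded) is not spelled out above, so scoring each wavenumber by its largest absolute loading is an illustrative assumption:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in: 100 spectra x 200 wavenumber variables.
rng = np.random.default_rng(3)
X = rng.normal(size=(100, 200))

pca = PCA(n_components=6).fit(X)
# pca.components_ holds the loadings: one row per principal component.
# Score each wavenumber by its largest absolute loading across the
# first six components and keep the highest-scoring variables.
importance = np.abs(pca.components_).max(axis=0)
n_select = 39  # the number reported in this study
optimal_idx = np.argsort(importance)[::-1][:n_select]
print(len(optimal_idx))  # 39
```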

Competitive Adaptive Reweighted Sampling
Competitive adaptive reweighted sampling (CARS) is a method for selecting feature bands by Monte Carlo sampling combined with the partial least squares (PLS) regression model [37]. CARS combines an exponential decay function with an adaptive reweighted sampling algorithm, selects variables by comparing the absolute regression coefficients of the variables in the PLS model, and removes the variables with small weights. It then selects the variable subset with the lowest RMSECV (root mean squared error of cross validation), which is an effective way to find the optimal combination of variables.

Stacked Sparse Auto-Encoder
The stacked sparse auto-encoder (SSAE) is a multi-layered feature expression framework consisting of a layer-by-layer stack of sparse auto-encoder (SAE) structures which can be used to extract the key information from original data [38].
A sparse auto-encoder (SAE) network is shown in Figure 7A. It is composed of one input layer, one hidden layer, and one output layer, which together form the encoder and decoder. The encoder and decoder operate as follows:

h = f(W1·x + b1)
x̂ = f(W2·h + b2)

where h is the feature representation; W1 and b1 are the encoding weights and biases, respectively; W2 and b2 are the corresponding decoding weights and biases; and f(·) is the sigmoid function (1 + exp(−x))−1, a nonlinear activation function introducing nonlinear characteristics into the feature representation process. The main purpose of the SAE is to reduce the dimensions of the input data by minimizing the error between the input and output variables. Mathematically, this can be formulated as follows:

J_SAE = (1/N) Σ_i ||x_i − x̂_i||² + β Σ_j KL(ρ‖ρ̂_j)

where β weights the sparsity penalty, and the KL divergence is adopted to penalize the difference between ρ and ρ̂_j so that only a few hidden units respond to specific categories and most are suppressed, denoted as follows:

KL(ρ‖ρ̂_j) = ρ log(ρ/ρ̂_j) + (1 − ρ) log((1 − ρ)/(1 − ρ̂_j))

where ρ is a sparsity parameter usually set to a small value close to 0 and ρ̂_j is the average activation value of hidden unit j over all data in the training set.
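The encoder/decoder pass and the sparsity penalty above can be written out directly. The dimensions, ρ, and β below are illustrative assumptions (a NumPy sketch of one forward pass, not the trained network from the study):

```python
import numpy as np

def sigmoid(x):
    # f(x) = (1 + exp(-x))^-1, the activation used above
    return 1.0 / (1.0 + np.exp(-x))

def kl_divergence(rho, rho_hat):
    # KL penalty between target sparsity rho and mean activations rho_hat
    return np.sum(rho * np.log(rho / rho_hat)
                  + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))

rng = np.random.default_rng(5)
X = rng.uniform(size=(50, 100))            # 50 spectra, 100 input variables
W1 = rng.normal(scale=0.1, size=(100, 30)); b1 = np.zeros(30)
W2 = rng.normal(scale=0.1, size=(30, 100)); b2 = np.zeros(100)

h = sigmoid(X @ W1 + b1)                   # encoder: h = f(W1 x + b1)
x_hat = sigmoid(h @ W2 + b2)               # decoder: x_hat = f(W2 h + b2)

rho, beta = 0.05, 3.0
rho_hat = h.mean(axis=0)                   # average activation of each hidden unit
loss = np.mean(np.sum((X - x_hat) ** 2, axis=1)) \
       + beta * kl_divergence(rho, rho_hat)
print(round(float(loss), 3))
```

Training would minimize this loss over W1, b1, W2, b2 by backpropagation; stacking repeats the process with h as the next SAE's input.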

As shown in Figure 7B, multiple SAEs build an SSAE network in which the feature representation h of each SAE is the input of the next SAE.
In this study, the encoder of the SSAE was pre-trained, and the decoder was removed, with the last coding unit followed by a classifier (Figure 7C). The softmax loss function was used as a classifier to fine-tune the entire SSAE network in a supervised manner, which indicates the ability of the model to fit the labeled data. The activation function of the softmax classifier, h_θ(x), gives the probability P(y = j | x; θ) that x belongs to category j, defined as follows:

P(y = j | x; θ) = exp(θ_j^T x) / Σ_{l=1..k} exp(θ_l^T x)
where k is the number of output categories of the classifier and θ_j is the parameter vector that maps x to the j-th category. In addition, the loss function of the entire supervised model, including the coding layers of the SSAE and the classifier, is defined as follows:

J(θ) = −(1/N) Σ_{i=1..N} Σ_{j=1..k} 1{y_i = j} log P(y_i = j | x_i; θ) + λ‖θ‖²

where λ is the weight decay coefficient, which can prevent overfitting, and ‖θ‖² is the L2 norm of the parameters in h_θ(x). 1{y_i = j} is the indicator function, defined as follows:

1{y_i = j} = 1 if y_i = j, and 0 otherwise.

It can be seen from 1{y_i = j} that if a sample x_i of class j is misclassified, P(y_i = j | x_i; θ) is very small, and the corresponding J(θ) is large because of the logarithmic mapping. Thus, with backward learning, θ is updated so as to make P(y_i = j | x_i; θ) larger, implementing the process of optimizing the parameters.
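The softmax probabilities, indicator-weighted cross-entropy, and weight decay term combine as in the loss above. A NumPy sketch with illustrative dimensions (random data, not the study's features):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(6)
n, d, k = 60, 30, 13                       # samples, coded features, classes
X = rng.normal(size=(n, d))                # stand-in for last coding layer output
y = rng.integers(0, k, size=n)
theta = rng.normal(scale=0.1, size=(d, k))
lam = 1e-3                                 # weight decay coefficient lambda

P = softmax(X @ theta)                     # P(y = j | x; theta)
# Cross-entropy with the indicator 1{y_i = j} plus the L2 penalty:
# only the log-probability of each sample's true class contributes.
J = -np.mean(np.log(P[np.arange(n), y])) + lam * np.sum(theta ** 2)
print(round(float(J), 3))
```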
It is well known that the SSAE has several hidden layers, so it is hard to choose characteristic variables based only on the weights of the first network layer. In order to take the features learned by each layer into account, this study used a method that takes the derivative of J(θ) backward through the network, called the stacked sparse auto-encoder guided backward (SSAE-GB). As described in the following equation, the partial derivative of J(θ) with respect to the input x_i was used to select the optimal variables:

g_i = ∂J(θ)/∂x_i

The larger the absolute value of the SSAE-GB partial derivative, the more important the absorbance at the corresponding wavenumbers.
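To make the guided-backward idea concrete, here is a sketch of the input gradient for a single softmax layer. In the study the derivative is taken through all SSAE coding layers, so this one-layer version (with random stand-in data) is only illustrative:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(7)
n, d, k = 60, 100, 13                      # samples, wavenumbers, classes
X = rng.normal(size=(n, d))
y = rng.integers(0, k, size=n)
theta = rng.normal(scale=0.1, size=(d, k))

# Guided-backward step for a single softmax layer: the gradient of the
# cross-entropy loss with respect to each input variable.
P = softmax(X @ theta)
Y = np.eye(k)[y]                           # one-hot encoding of the labels
grad_x = (P - Y) @ theta.T                 # dJ/dx for every sample

# Rank wavenumbers by mean absolute sensitivity and keep 38 of them,
# mirroring the thresholding on |partial derivative| described above.
sensitivity = np.abs(grad_x).mean(axis=0)
optimal_idx = np.argsort(sensitivity)[::-1][:38]
print(len(optimal_idx))  # 38
```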

Conclusions
The results of this study indicated that 13 orchid genotypes could be identified by FTIR spectroscopy combined with appropriate multivariate analysis techniques. Three discriminant models, SSAE, SVM, and KNN, were built on the full spectra of 4000–550 cm−1 obtained by FTIR spectroscopy. Each model was evaluated with three-fold cross validation, and the SSAE model performed more effectively than the SVM and KNN models based on the results of the calibration and prediction sets. The SSAE model achieved classification accuracies of 99.4% and 97.9% for the calibration set and prediction set, respectively. Although the calibration set accuracy of the SVM model reached 100%, its prediction set accuracy only reached 92.6%, indicating a certain degree of over-fitting. To further examine the SSAE model, the feature distribution of each layer was visualized by t-SNE, and the visualized results demonstrated that it was feasible to identify the orchid varieties using the SSAE model. In addition, three feature selection methods, PCA-loading, CARS, and SSAE-GB, were used to reduce the dimensionality of the data, selecting 39, 300, and 38 characteristic wavenumbers, respectively. SVM and KNN models built on the optimal wavenumbers were used to compare the effects of the three feature selection methods. The accuracy of the SVM model was much better than that of the KNN model, consistent with the two models built on the full band. The prediction set accuracy of the SVM model based on the characteristic wavenumbers selected by SSAE-GB was 94.5%, slightly lower than that of the SVM based on the CARS method. However, the number of characteristic features selected by SSAE-GB was 38, far fewer than the 300 features of the CARS method. Overall, SSAE-GB performed more effectively than the other two methods.
This analytical protocol undoubtedly has great prospects in orchid identification, as it provides a quick and effective detection method. Future experiments will enrich the spectral database of orchids to improve the robustness and generalization of the recognition model so as to identify orchids more quickly and efficiently.

Conflicts of Interest:
The authors declare no conflict of interest.