Storage Time Detection of Torreya grandis Kernels Using Near Infrared Spectroscopy

Abstract: To rapidly identify Torreya grandis kernels (T. grandis kernels) with different storage times, near-infrared (NIR) spectra of 300 T. grandis kernels stored for 4~9 months were collected. The spectral data were modeled, analyzed, and compared using unsupervised and supervised classification methods to determine the optimal rapid identification model. Principal component analysis (PCA) after derivative preprocessing made the spectral differences visible and achieved a basic separation of samples with different storage times under unsupervised classification. However, it could not differentiate 4-month from 5-month samples, or 8-month from 9-month samples. For supervised classification, support vector machine (SVM) modeling reached a classification accuracy of 97.33%, but it still could not reliably separate the samples stored for 8 and 9 months. Linear discriminant analysis after principal component analysis (PCA-DA) reached a classification accuracy of 99.33% and distinguished T. grandis kernels across all storage times. This research showed that near-infrared spectroscopy can be used for the rapid detection of T. grandis kernels with different storage times.


Introduction
Torreya grandis is mainly distributed in the Kuaiji Mountain and Tianmu Mountain regions of Zhejiang Province, the Huangshan region of Anhui Province, the mountainous region of southern Fujian Province, the northeastern region of Jiangxi Province, and the western region of Hunan Province, China [1]. T. grandis kernels contain a range of nutrient-rich and medicinally valuable components, including protein, vitamins, flavonoids, and oleic acid [2], and they are highly sought after for their rich nutrients and unique taste [3]. During storage, nutrients are lost over time, and prolonged storage can accelerate the loss of moisture and even lead to quality problems such as oxidation and decay, which affect the internal quality and taste of the kernels [4]. Chemical indicators such as moisture, protein, and fat differ significantly with storage time [5]. T. grandis kernels have a high value as both a food and a medicinal herb [6]. Because they are relatively rich in fat and protein, they readily take up moisture and support the growth of bacteria and other harmful microorganisms, which leads to mildew and insect infestation. In addition, the shelf life of shelled T. grandis kernels is much shorter than that of kernels kept in the shell. Because shelled kernels have no outer shell to protect them and their flesh is in direct contact with the air, they are more prone to mildew and can only be kept for about 3~9 months. Detecting the storage time of T. grandis kernels is necessary to ensure their quality and safety and to prevent the waste of resources caused by expired kernels [7]. It could also help producers better control the storage time of the kernels [8]. Therefore, the detection of the storage time of T. grandis kernels is of great significance.
Traditional methods of assessing the storage time of nuts rely on the senses. Appearance is observed: deterioration, discoloration, or insect damage indicates that the nuts have expired or declined in quality and should not be eaten. Sound can also be used: the nuts are shaken and the sound they make is assessed [9]. A crisp, resonant sound indicates good quality; otherwise, the quality is low. Smell is another indicator: a musty or spoiled odor indicates that the nuts have expired and should not be eaten. Taste is also an indicator: nuts that feel too hard or too soft, or that taste greasy, may be spoiled or insect-damaged. These methods, while simple, are not accurate enough to determine nut storage times reliably.
Near-infrared (NIR) spectroscopy is a fast, non-destructive detection technique that can analyze samples without damaging them. There is therefore no need to use harmful chemicals or to treat the samples, which reduces environmental pollution. The analysis is completed in a short time and does not consume a large amount of energy, which further reduces the environmental burden. NIR spectroscopy has been widely applied in research on the storage time and quality of various foods [10]. In one study, Vis/NIR spectroscopy was employed to evaluate quality changes in walnut kernels with different storage times. The spectral information was preprocessed using second-order derivatives, and partial least squares regression (PLSR) showed promising potential for distinguishing between spoiled and fresh walnut kernels stored for different periods, achieving an accuracy of 87%. This finding suggests that Vis/NIR spectroscopy could be an effective method for detecting quality changes in walnut kernels over time [11]. In another study, NIR hyperspectral imaging was employed to distinguish between Spanish and Chinese pine nut varieties; SIMCA modeling achieved identification accuracies ranging from 84% to 100% [12]. NIR spectroscopy was also used to determine the storage time of wheat samples collected from 2007 to 2012; the SIMCA classification method classified adjacent years with an accuracy of up to 97.05% using a dichotomous classification approach [13]. These results suggest that NIR spectroscopy is feasible for the rapid detection of T. grandis kernels with different storage times.
The present study was conducted on T. grandis kernels from Zhejiang Province. NIR spectroscopy was used to develop models for the discriminative analysis of T. grandis kernels with different storage times. The study is important for improving the quality of T. grandis kernels and for standardizing the development of the related industries. The experiment aimed to determine the storage time of T. grandis kernels in a sustainable manner that reduces resource consumption and environmental damage, thereby supporting the sustainability of resources and the environment for future generations, the protection and restoration of natural ecosystems, and the maintenance of ecological balance.

Sample Preparation
The T. grandis kernels used in the experiment were purchased from local farmers in Zhandai Village, Zhuji City, Zhejiang Province, China, and were harvested at the end of September 2021. Fifty T. grandis kernel samples were randomly selected every month from February to June 2022, and a total of 300 samples were collected. The shell was manually removed and the T. grandis kernels were ground into powder with a mortar. For storage, the T. grandis kernels were kept in damp-resistant, ventilated, and sealed packages to prevent the intrusion of moisture and odor. The temperature was kept between 8 °C and 15 °C. High temperature could cause the T. grandis kernels to degrade, while low temperature could cause them to become moist, which would affect their quality. To protect the kernels from moisture, a desiccant such as silica gel was placed in the sealed bag. The samples were stored for 4~9 months, after which they were taken out for spectral data collection.
Sustainability 2023, 15, 7757

Acquisition of Spectral Data
The test instrument used in this study was a SmartEye 1700 portable NIR spectrometer (FireEye Golden Eye Co., Hangzhou, China). The instrument had a wavelength range of 1000 to 1650 nm and a sampling interval of 1 nm. The light source was a double integrated vacuum tungsten lamp (Rays Lighting Co., Huizhou, China), and the detector was a 128-line source uncooled InGaAs diode array. During spectral acquisition, the laboratory temperature was maintained at 23 °C and the humidity at 55%. The portable NIR spectrometer was preheated for 30 min before acquisition to ensure the accuracy of the NIR spectra. Diffuse reflectance was used in this experiment, with a 100% Spectralon™ standard whiteboard as the background. The average number of scans was set to 50, and the integration time was set to 12.7 ms with a resolution of 8 cm⁻¹. The ground T. grandis kernel samples were placed in the sampling window area, and the spectra were taken while ensuring that the light source illuminated the samples vertically. Each sample was measured three times at different locations, and the three spectra were averaged to minimize experimental errors that may be caused by improper handling.

Principal Component Analysis (PCA)
Principal component analysis (PCA) is a common method for the dimensionality reduction of high-dimensional data [9]. The spectral data are normalized and feature vectors are extracted through PCA [14]. The most significant principal components are selected to remove redundancy and noise from the spectral data [15]. PCA is an unsupervised method and does not require the samples to be divided into classes beforehand [16]. The spectral data of T. grandis kernels with different storage times were preprocessed using the Normalization, first derivative (1-Der), second derivative (2-Der), Baseline, and standard normal variate (SNV) methods. PCA models were then established on the preprocessed data to differentiate between the storage times of T. grandis kernels.
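The preprocessing and projection steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the software used in the study: the spectra are synthetic, the first derivative is a simple finite difference (the derivative algorithm used by the authors is not specified), and the function names `snv`, `first_derivative`, and `pca_scores` are hypothetical.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def first_derivative(spectra, step_nm=1.0):
    """Finite-difference first derivative along the wavelength axis (assumed method)."""
    return np.gradient(spectra, step_nm, axis=1)


def pca_scores(spectra, n_components=2):
    """Scores of the mean-centred spectra on the leading principal components."""
    centred = spectra - spectra.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Synthetic stand-in for the measured spectra (hypothetical data, not the study's).
rng = np.random.default_rng(0)
wavelengths = np.arange(1000, 1651)         # 1000-1650 nm at 1 nm steps
months = np.repeat(np.arange(4, 10), 10)    # 10 mock samples per storage month
band = np.exp(-((wavelengths - 1200) / 60.0) ** 2)
spectra = (1.0 - 0.03 * (months[:, None] - 4)) * band + 0.2
spectra = spectra + rng.normal(0.0, 0.002, spectra.shape)

snv_spectra = snv(spectra)
scores = pca_scores(first_derivative(spectra), n_components=2)
print(scores.shape)  # (60, 2)
```

Plotting the two columns of `scores` against each other, colored by `months`, gives the kind of PC1-PC2 score plot used to inspect clustering by storage time.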

Support Vector Machine (SVM)
The support vector machine (SVM) classifies data effectively in high-dimensional spaces by retaining the key samples (the support vectors) and discarding a large number of redundant samples [17]. It is widely used in statistical classification and regression analysis because of its simple algorithm and good performance [18]. As a supervised classification method, SVM requires a set of training samples [19]. Kennard-Stone (K-S) and SPXY sampling are common methods for partitioning samples into calibration and validation sets [20]. Since only spectral data were used to analyze the storage times of T. grandis kernels, the K-S method was chosen in this study. K-S works on the Euclidean distances between spectra and assigns the samples that lie farthest from those already selected to the calibration set. Compared with random partitioning, it improves the homogeneity of the sample distribution [21]. The collected spectral data of T. grandis kernels were divided into calibration and validation sets at a ratio of 3:2 (180 and 120 samples, respectively) using the K-S method. Four kernel functions (linear, polynomial, RBF, and sigmoid) were used to construct the SVM models. After the kernel function was selected, various preprocessing methods were applied to improve the accuracy of the classification model.
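The Kennard-Stone selection rule described above can be sketched in a few lines. This is an illustrative implementation of the standard algorithm, assuming plain Euclidean distances on the spectra; the function name `kennard_stone` and the mock data are hypothetical, and the 180-sample calibration size follows the figure reported in this paper.

```python
import numpy as np


def kennard_stone(X, n_select):
    """Return the row indices of X chosen for the calibration set by Kennard-Stone."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")

    # Pairwise Euclidean distances via the Gram matrix (memory friendly).
    sq = np.sum(X ** 2, axis=1)
    dist = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0))

    # Seed the calibration set with the two most distant samples.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(n) if k not in selected]

    while len(selected) < n_select:
        # For each candidate, distance to its nearest already-selected sample.
        nearest = dist[np.ix_(remaining, selected)].min(axis=1)
        # Pick the candidate that is farthest from the selected set.
        pick = remaining[int(np.argmax(nearest))]
        selected.append(pick)
        remaining.remove(pick)
    return np.array(selected)


# Toy check on four one-dimensional "spectra".
toy = np.array([[0.0], [1.0], [2.0], [10.0]])
toy_pick = kennard_stone(toy, 3)
print(toy_pick.tolist())  # [0, 3, 2]

# Mock data shaped like the study: 300 samples, 180 kept for calibration.
rng = np.random.default_rng(0)
mock = rng.normal(size=(300, 20))
cal_idx = kennard_stone(mock, 180)
val_idx = np.setdiff1d(np.arange(300), cal_idx)
```

Because the rule always adds the sample farthest from the current selection, the calibration set spans the spectral space evenly, and the remaining samples form the validation set.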

Linear Discriminant Analysis after Principal Component Analysis (PCA-DA)
Linear discriminant analysis (LDA) projects high-dimensional spectral data into a low-dimensional space in which similar data are clustered together and dissimilar data are separated [22]. It is a supervised classification method that achieves classification by projecting data points of the same category closer together in the projection space [23]. However, LDA is limited in the number of variables, which cannot be higher than the sample size. To overcome this limitation, dimension reduction through PCA was used. LDA has three modeling methods: linear, quadratic, and Mahalanobis. To identify the optimal model for detecting the storage times of T. grandis kernels, the spectral data were first reduced in dimension using PCA to select the main components, and LDA was then applied to the principal component scores.
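The two-stage PCA-DA idea can be illustrated with a short sketch. It assumes the linear variant with a pooled covariance matrix and equal treatment of all classes; the quadratic variant would instead estimate one covariance matrix per class. The data are synthetic, and the helper names `fit_pca`, `fit_lda`, and `predict_lda` are hypothetical rather than taken from any chemometric package.

```python
import numpy as np


def fit_pca(X, n_components):
    """Return the calibration mean and the leading principal axes."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def fit_lda(scores, labels):
    """Linear discriminant model: class means, inverse pooled covariance, priors."""
    classes = np.unique(labels)
    means = np.array([scores[labels == c].mean(axis=0) for c in classes])
    resid = scores - means[np.searchsorted(classes, labels)]
    pooled = resid.T @ resid / (len(scores) - len(classes))
    priors = np.array([np.mean(labels == c) for c in classes])
    return classes, means, np.linalg.inv(pooled), priors


def predict_lda(scores, model):
    """Assign each sample to the class with the largest linear discriminant score."""
    classes, means, inv_cov, priors = model
    linear = scores @ inv_cov @ means.T
    const = -0.5 * np.sum((means @ inv_cov) * means, axis=1) + np.log(priors)
    return classes[np.argmax(linear + const, axis=1)]


# Hypothetical spectra whose band intensity falls with storage month.
rng = np.random.default_rng(1)
wl = np.linspace(1000, 1650, 66)
band_shape = (np.exp(-((wl - 1200) / 50.0) ** 2)
              + 0.8 * np.exp(-((wl - 1450) / 70.0) ** 2))
months = np.repeat(np.arange(4, 10), 30)
X = (1.0 - 0.05 * (months[:, None] - 4)) * band_shape
X = X + rng.normal(0.0, 0.005, X.shape)

# Simple interleaved split for the demo: every third sample is held out.
is_val = np.arange(len(months)) % 3 == 0
mean, axes = fit_pca(X[~is_val], n_components=7)   # 7 PCs, as in this study
model = fit_lda((X[~is_val] - mean) @ axes.T, months[~is_val])
pred = predict_lda((X[is_val] - mean) @ axes.T, model)
accuracy = float(np.mean(pred == months[is_val]))
print(f"validation accuracy on mock data: {accuracy:.2%}")
```

Reducing the spectra to a handful of scores first keeps the covariance matrix small and invertible, which is the reason PCA precedes the discriminant step.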

Near-Infrared Spectral Analysis
The average spectra of T. grandis kernel samples stored for 4~9 months are presented in Figure 1. Within the spectral range of approximately 1000 to 1600 nm, the peak at 1200 nm represented the fat in T. grandis kernels [24]. The absorbance decreased rapidly when the storage time increased from around 7 to 9 months, possibly because of the decrease in unsaturated fatty acids caused by long storage. The peak at 1450 nm represented the protein in T. grandis kernels [25]. Its absorbance decreased uniformly with increasing storage time, indicating a relatively uniform reduction in protein.
In general, the absorbance of NIR spectra of T. grandis kernels decreased gradually with increasing storage time. However, the differences between NIR spectral curves of T. grandis kernels with different storage times were not significant. Thus, distinguishing T. grandis kernels at different storage times solely from the spectra was difficult, and it was necessary to use chemometric software to analyze and process the spectral data to establish a detection model of T. grandis kernels with different storage times.



Principal Component Analysis (PCA) Model
Dimension reduction was performed on the near-infrared spectral data of T. grandis kernels, and the cumulative contribution rate of the principal components is depicted in Figure 2. The first seven principal components contained almost all of the information of the samples. In addition, the cumulative contribution rate of the first two principal components exceeded 80% for both the original and the preprocessed spectra. Although PC3 and PC4 could have been used for visualization purposes, the majority of the data variance was already explained by PC1 and PC2.
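The cumulative contribution rate used to choose the number of principal components can be computed directly from the singular values of the mean-centred data. The sketch below is illustrative only; the function names and the tiny example matrix are hypothetical, and the thresholds are arbitrary.

```python
import numpy as np


def cumulative_contribution(X):
    """Cumulative fraction of variance explained by successive principal components."""
    centred = X - X.mean(axis=0)
    s = np.linalg.svd(centred, compute_uv=False)
    ratio = s ** 2 / np.sum(s ** 2)
    return np.cumsum(ratio)


def n_components_for(cum, threshold):
    """Smallest number of components whose cumulative contribution reaches threshold."""
    k = int(np.searchsorted(cum, threshold)) + 1
    return min(k, len(cum))


# Tiny example: variance 8 along the first axis and 2 along the second.
X = np.array([[2.0, 0.0], [-2.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
cum = cumulative_contribution(X)
print(np.round(cum, 3))             # [0.8 1. ]
print(n_components_for(cum, 0.75))  # 1
print(n_components_for(cum, 0.90))  # 2
```

Applied to real spectra, the same calculation yields the curve plotted in a cumulative contribution figure, from which a cut-off such as seven components can be read.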
The first and second principal components were used to plot the PCA results, and the classification results are presented in Figure 3. With the original spectra (Figure 3a), almost all of the T. grandis kernel samples were mixed together, making them difficult to discriminate. However, the samples stored for 8~9 months clustered more tightly, indicating that spectral differences were more evident when T. grandis kernels were stored for 8~9 months. After Baseline processing (Figure 3b), the classification results were not significantly improved, and only two storage periods (4~7 months and 8~9 months) could be distinguished. After standard normal variate (SNV) (Figure 3c) and Normalize (Figure 3d) preprocessing, the samples were divided into three sets of 4~5 months, 6~7 months, and 8~9 months, although some 5-month samples overlapped with the 8~9-month set. The best results were obtained after 1-Der (Figure 3e) and 2-Der (Figure 3f) preprocessing, which divided the samples into four sets of 4~5 months, 6 months, 7 months, and 8~9 months.
The results showed that the PCA model with spectral preprocessing could only divide the T. grandis kernel samples into four sets of 4~5 months, 6 months, 7 months, and 8~9 months. It could not separate 4-month from 5-month samples, or 8-month from 9-month samples. In general, PCA alone achieved only a partial distinction of storage times. To improve the classification of the samples stored for 4~5 months and 8~9 months, SVM and LDA models were established. PCA has been used in a similar way for other foods. Visible/near-infrared (Vis/NIR) spectroscopy was used to rapidly determine the storage time of Fuji apples. The raw spectral data and SNR feature values were analyzed using PCA, and the apple samples were successfully differentiated using PCA and SNR spectra, which indicates that Vis/NIR spectroscopy is effective for the rapid discrimination of the storage time of Fuji apples [26]. NIR spectroscopy was also used to verify the authenticity of extra virgin olive oil (EVOO). PCA and redundancy analysis (RDA) were used to verify the authenticity of olive oil qualitatively or quantitatively by predicting the percentage of EVOO in mixed oil or pure EVOO. The results showed the potential of RDA factors for prediction and classification, which significantly improved on the calibration and validation results obtained from the PCA factors [27].


Support Vector Machine (SVM) Model
The support vector machine (SVM) models for T. grandis kernels were developed using four kernel functions: linear, polynomial, RBF, and sigmoid. The models were built on the 180 original spectra in the calibration set and validated on the 120 samples in the validation set. The results are shown in Table 1. The SVM models with polynomial, RBF, and sigmoid kernels had classification accuracies of less than 50%, whereas the model with the linear kernel had a classification accuracy of more than 80%. Therefore, the linear kernel was chosen for SVM modeling.

The classification results of the SVM model under different preprocessing methods are presented in Table 2. After multiplicative scatter correction (MSC), the classification accuracy changed little compared with the original spectrum, indicating that SVM classification was not affected by this preprocessing method. After Normalize and Baseline preprocessing, however, the classification accuracy rose to more than 90%. The best results were obtained after 1-Der preprocessing, with training, validation, and prediction accuracies of 98.89%, 97.50%, and 97.33%, respectively. 1-Der preprocessing eliminated the interference of irrelevant variables and retained the feature variables, leading to a significant improvement in classification accuracy.

The confusion matrix of the SVM model is presented in Figure 4. Compared with the PCA model, the prediction result of the SVM model was significantly improved. Among the samples stored for 4~5 months, only one 5-month sample was misclassified as 4 months. Among the samples stored for 8~9 months, three 8-month samples were misclassified as 9 months, and four 9-month samples were misclassified as 8 months. All other samples were correctly classified. Overall, the classification accuracy of the SVM model after 1-Der preprocessing was 97.33%. The model classified kernels stored for 4~7 months well but was less reliable for those stored for 8 and 9 months.

SVM models have performed similarly in related work. The NIR spectra of four types of bamboo shoots were compared using NIR reflectance technology, and the SVM model with second derivative treatment performed the best; the combination of NIR spectra and the SVM method provides a fast and non-destructive method for the classification of bamboo shoot species [28]. A portable NIR sensor was used to predict the freshness of eggs. The spectral data were processed using different preprocessing methods, and the SVM prediction model classified fresh and stale eggs with an accuracy of 87.0% [29].
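The misclassification counts reported for the SVM model can be checked against its stated overall accuracy. The sketch below rebuilds a confusion matrix from those counts, assuming 50 samples per storage month (300 in total) and assuming that the confusion matrix in Figure 4 covers all 300 samples; both are readings of the text rather than values taken from the figure itself.

```python
months = [4, 5, 6, 7, 8, 9]
samples_per_month = 50  # assumption: fifty kernels sampled for each storage month

# confusion[true][predicted], starting from a perfect classification.
confusion = {t: {p: 0 for p in months} for t in months}
for m in months:
    confusion[m][m] = samples_per_month


def misclassify(true_month, predicted_month, count):
    """Move `count` samples from the diagonal to an off-diagonal cell."""
    confusion[true_month][true_month] -= count
    confusion[true_month][predicted_month] += count


misclassify(5, 4, 1)  # one 5-month sample predicted as 4 months
misclassify(8, 9, 3)  # three 8-month samples predicted as 9 months
misclassify(9, 8, 4)  # four 9-month samples predicted as 8 months

correct = sum(confusion[m][m] for m in months)
total = sum(sum(row.values()) for row in confusion.values())
accuracy = correct / total
print(f"{correct}/{total} = {accuracy:.2%}")  # 292/300 = 97.33%

# Per-month recall shows where the errors concentrate.
recall = {m: confusion[m][m] / samples_per_month for m in months}
print(recall[8], recall[9])  # 0.94 0.92
```

Under these assumptions the eight reported errors reproduce the 97.33% prediction accuracy, and seven of the eight fall in the 8~9-month group.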


Linear Discriminant Analysis after PCA (PCA-DA) Model
In the PCA combined with linear discriminant analysis (PCA-DA) method, the spectral data must first undergo PCA because of the limitation of the LDA algorithm. As shown in Figure 2, the first seven principal components of the original spectrum contained almost all of the effective information of the spectrum. Thus, the number of principal components was set to 7 in this study. Setting the number of principal components too large could cause overfitting, while setting it too small could cause the loss of spectral information. The linear, quadratic, and Mahalanobis methods were used to model the principal component score data of the original spectrum, and the classification results are presented in Table 3. The quadratic method gave the highest classification accuracy for the calibration set, reaching 95%, and its classification accuracy for the prediction set was 90.83%. The classification accuracy of the prediction set for the other two modeling methods was less than 90%. Therefore, taking the results for both the calibration set and the prediction set into account, the quadratic method was used to establish the discriminant model in the subsequent analysis.

The classification accuracy of LDA for the different storage times of T. grandis kernels did not reach 100%. Therefore, PCA-DA models were established after spectral preprocessing, and the classification accuracies are shown in Table 4. The accuracy of the PCA-DA model decreased after Baseline and Normalize preprocessing.
In particular, the classification accuracy of the validation set decreased from 90.83% to 81.67~82.50%. This indicates that these preprocessing methods, which optimize and screen NIR spectral data, were not suitable for the principal component score data obtained after PCA. Although the classification accuracy of the calibration set increased by 2.22% after SNV preprocessing, that of the validation set decreased by 8.33%. The PCA-DA model performed best after derivative preprocessing, with the accuracies of the calibration and validation sets both higher than 96%. The confusion matrix of the optimal PCA-DA model is presented in Figure 5. The accuracy rates of the calibration and validation sets were 100% and 98.33%, respectively. Only two samples in the validation set were misclassified: one 8-month sample was classified as 9 months, and one 9-month sample was classified as 8 months. All other samples were correctly classified. Among the three classification approaches used, the PCA-DA model proved to be the most effective.

In summary, the detection of T. grandis kernel samples with different storage times showed that PCA could only cluster the samples into sets of 4~5 months, 6 months, 7 months, and 8~9 months. The SVM model gave good classification results except for the storage times of 8 and 9 months, while the PCA-DA model gave the best results and could differentiate every month of storage from 4 to 9 months. Discriminant models have also performed well in related studies. Sweet corn with and without husk was analyzed using a PLS-DA model in reflection and interaction modes, and the best prediction results were obtained after 2-Der preprocessing, with accuracies of 90% and 97.5% [30]. NIR spectroscopy and NMR were used to assess macadamia nut extranuclear defects, and the GA-LDA model achieved a classification accuracy and specificity of 97.8% and 100%, respectively [31]. The quality of fresh chestnuts was evaluated using NIR diffuse reflectance technology, and LDA was used to discriminate between normal and moldy chestnuts, achieving calibration and validation accuracies of 100% and 96.37%, respectively [32]. In this study, the PCA-DA model after 1-Der preprocessing could effectively differentiate T. grandis kernel samples stored for 4~9 months, making it applicable to practical production.
Table 5 summarizes studies that used near-infrared detection technology to analyze the storage time of foods. For navel oranges with a storage time of 1~6 months, a BP artificial neural network model was established from five indicators that change with storage time: total soluble sugar, total acid, vitamin C, soluble solids, and sugar-acid ratio. The model gave values of 0.9487 and 0.8770 for the calibration set and prediction set, respectively. It was concluded that the multi-factor model was more accurate than the single-factor model in predicting storage time and storage life [33]. For Carya cathayensis Sarg. with a storage time of 3~7 months, raw spectral data combined with LDA (linear discriminant analysis) achieved a classification accuracy of over 95%, enabling rapid detection of the storage time of these nuts [34]. For cakes with a storage time of 1~8 days, raw spectral data combined with PLS-DA (partial least squares discriminant analysis) predicted the storage time with an error of about 1 day [35]. Strawberries with a storage time of 0~60 h were examined using visible/near-infrared (Vis/NIR) hyperspectral imaging technology. Variable selection by CARS-SPA (competitive adaptive reweighted sampling followed by the successive projections algorithm) combined with PLSR (partial least squares regression) gave correlation coefficients of 0.9989 and 0.9974, respectively. The storage time distribution map generated from pixel-level spectra and the established model clearly showed the quality changes in strawberries [36]. As near-infrared spectroscopy technology develops, it could be applied more widely in the field of food safety. Improved algorithms and data processing methods could further raise prediction accuracy, allowing the storage time and quality of different foods to be assessed and monitored more accurately.

Conclusions
In this paper, near infrared spectroscopy was used to detect the storage time of T. grandis kernels. The results showed that the PCA model after derivative preprocessing could cluster T. grandis kernel samples into groups with storage times of 4~5 months, 6 months, 7 months, and 8~9 months. However, the method could not separate samples within the 4~5 month group or within the 8~9 month group. Among the SVM models, the linear kernel with 1-Der preprocessing exhibited the best accuracy. The training accuracy, validation accuracy, and prediction accuracy of this model were significantly improved compared with the PCA model. However, the SVM model still could not distinguish T. grandis kernels with storage times of 8 and 9 months. The principal component analysis-linear discriminant analysis (PCA-DA) model had the highest classification accuracy among all of the classification methods. The optimal PCA-DA model was obtained with the quadratic method after 1-Der preprocessing, and the classification accuracies of the calibration and validation sets were 100% and 98.33%, respectively. Finally, the effects of different storage times on the near infrared model were analyzed and compared. Near infrared detection technology made it possible to classify T. grandis kernels by storage time, and it can be used in practical production.
Near infrared detection technology is nondestructive, convenient, and fast. These advantages conform to the principle of sustainable development and are of great significance to the sustainable development of the nut industry. Applying this technology would allow more precise control of nut storage time, contributing to improved economic benefits and environmental protection. In the future, more samples with different storage times could be added to develop a more comprehensive classification model and broaden its scope.