A Rapid and Nondestructive Approach for the Classification of Different-Age Citri Reticulatae Pericarpium Using Portable Near Infrared Spectroscopy

Citri Reticulatae Pericarpium (CRP) has been used in China for hundreds of years as a functional food and medicine. However, some short-age CRPs are disguised as long-age CRPs by unscrupulous businessmen in order to obtain higher profits. In this paper, a rapid and nondestructive method for the classification of different-age CRPs was established using portable near infrared spectroscopy (NIRS) in diffuse reflectance mode in combination with appropriate chemometric methods. The spectra of the outer skin and inner capsule of CRPs at different storage ages were obtained directly without destroying the samples. Principal component analysis (PCA) with single and combined spectral pretreatment methods was used for the classification of different-age CRPs. Furthermore, the dimensionality of the data was reduced with PCA, and Fisher linear discriminant analysis (FLD) with optimized pretreatment methods was evaluated for improving the classification accuracy. Data pretreatment methods can be used to eliminate noise and background interference. The classification accuracy obtained with the inner capsule data is better than that obtained with the outer skin data. The best results, with 100% prediction accuracy, were obtained with the FLD method, even without pretreatment.


Introduction
Citri Reticulatae Pericarpium (CRP) has been used in China for hundreds of years as a functional food and medicine. CRP is rich in volatile oil, flavonoids, polysaccharides and alkaloids, and can be used to treat digestive problems and respiratory complaints [1]. Research shows that the longer the storage time, the higher the medicinal value of CRP [2]. However, the visible differences between CRPs of different ages are not significant. In recent years, unscrupulous businessmen have marketed young-age CRP as old-age CRP to obtain illegal profits. It is thus urgent to develop a rapid and simple identification technology for different-age CRP samples.

CRP Sample
Different-age CRPs (5, 10, 15, 20 and 25 years) were obtained from Guangdong Fu Dong Hai Co., Ltd. (Zhanjiang, China). The color of outer skin is brown and the color of the inner capsule is light brown. Each CRP is composed of three petals of pericarp (~50 mm diameter) and a petal for each CRP was used directly as the test sample without destroying it. Forty samples were taken from each age group and a total of 200 samples were collected. The samples were individually packed in sealed polyethylene bags and stored under dry conditions. To reduce the effect of sample temperature on the prediction accuracy, the samples were placed at room temperature for 24 h for equilibration.

Instrumentation and Measurements
The spectra of the outer skin and inner capsule were obtained directly using a QuasIR 4000 portable Fourier transform NIRS instrument (Galaxy Scientific, Nashua, NH, USA) in diffuse reflectance mode without destroying the samples. The system consists of a light source, interferometer, fiber optical sensor, InGaAs detector (Galaxy Scientific, Nashua, NH, USA) and data collection card, as shown in Figure 1. The CRP petal was placed directly in the middle of the spot without a container. The selected 200 CRP samples were measured. The measurements were repeated three times and averaged. Each spectrum is composed of 2098 data points recorded from 12,000 to 4000 cm⁻¹.
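The acquisition settings above can be illustrated with a minimal sketch. It assumes that the 2098 points are evenly spaced over the recorded range (the text does not state this) and uses random numbers as a stand-in for three repeated scans; the function name `average_replicates` is hypothetical.

```python
import numpy as np

N_POINTS = 2098                      # data points per spectrum (from the text)
WN_START, WN_END = 12000.0, 4000.0   # wavenumber limits in cm^-1 (from the text)

# Assumption: the points are evenly spaced across the recorded range.
wavenumbers = np.linspace(WN_START, WN_END, N_POINTS)
step = abs(wavenumbers[1] - wavenumbers[0])   # nominal spacing in cm^-1

def average_replicates(scans):
    """Average repeated scans of one sample; scans has shape (n_repeats, n_points)."""
    scans = np.asarray(scans, dtype=float)
    return scans.mean(axis=0)

# Synthetic stand-in for three repeated scans of a single petal.
rng = np.random.default_rng(0)
scans = 0.5 + 0.01 * rng.standard_normal((3, N_POINTS))
spectrum = average_replicates(scans)
```

Under the even-spacing assumption the nominal point spacing is 8000/2097, or about 3.8 cm⁻¹.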

Data Analysis
The 200 different-age CRP samples were divided into a calibration dataset with 150 samples and a validation dataset with 50 samples by the Kennard-Stone (KS) method. De-bias and detrend (DT) were used to eliminate the baseline drift in the spectra, while the multiplicative scatter correction (MSC) and standard normal variate (SNV) methods were used to eliminate the scattering effects. MinMax and Mean-Center methods were applied to normalize all variables into a certain range. Continuous wavelet transform (CWT), first-order derivative (1st) and second-order derivative (2nd) methods were used to subtract the influence of background and baseline drift. Combined pretreatment methods, namely first-order derivative-detrend (1st-DT), first-order derivative-standard normal variate (1st-SNV), first-order derivative-multiplicative scatter correction (1st-MSC), continuous wavelet transform-standard normal variate (CWT-SNV), continuous wavelet transform-multiplicative scatter correction (CWT-MSC) and standard normal variate-first-order derivative (SNV-1st), were applied in order to further improve the classification accuracy. PCA with single and combined spectral pretreatment methods was used for the classification of different-age CRPs. The spectra were mean-centered prior to the creation of the models. To obtain satisfactory classification results, the FLD method with single and combined pretreatment methods was used. The LDA method has the disadvantage that the number of calibration samples must be larger than the number of variables included in the LDA model [34]. Generally, the suggested total number of objects should be equal to at least three to five times the number of variables [35]. In this paper, the PCA method was applied to reduce the dimensionality, and the dataset was transformed into fewer principal components (PCs) before FLD calculation.
Sensors 2020, 20, 1586
The programs were performed using Matlab 2010a (The Mathworks, Natick, MA, USA) and run on a personal computer. The spectral data and results were visualized in Origin 9.0 Software (The OriginLab, Northampton, MA, USA).
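The Kennard-Stone split described above can be sketched as follows. This is a generic illustration in Python rather than the authors' Matlab code. It assumes Euclidean distances computed on whatever feature matrix is supplied and a single split over all 200 samples; the text does not say whether the split was made on raw spectra, on scores, or per age group. The random matrix is only a placeholder for real data.

```python
import numpy as np

def kennard_stone(X, n_select):
    """Return indices of n_select calibration samples chosen by Kennard-Stone."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances between all samples.
    sq = (X ** 2).sum(axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    dist = np.sqrt(d2)
    # Start with the two samples that are farthest apart.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(len(X)) if k not in selected]
    while len(selected) < n_select:
        # Distance from each remaining sample to its nearest selected sample.
        nearest = dist[np.ix_(remaining, selected)].min(axis=1)
        # Add the remaining sample that is farthest from the selected set.
        pick = remaining[int(np.argmax(nearest))]
        selected.append(pick)
        remaining.remove(pick)
    return np.array(selected)

# Placeholder data: 200 samples described by 10 features.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 10))
cal_idx = kennard_stone(X, 150)                          # calibration set
val_idx = np.setdiff1d(np.arange(200), cal_idx)          # validation set
```

Because the algorithm always adds the sample farthest from those already chosen, the calibration set tends to span the measured variation, and the validation samples fall inside it.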

Spectra of Different-age CRPs with Single Pretreatment Techniques
Figures 2a and 3a show the average spectra of each group for the outer skin and inner capsule, respectively. There is obvious baseline drift in the spectra, caused by the rough surface of the CRP samples. The spectral trends of the outer skin and inner capsule differ slightly. However, it is difficult to distinguish CRPs of different ages because of serious peak overlap and background interference.
Figures 2d-f and 3d-f. The variant background in the spectra can be removed with the 1st, 2nd and CWT methods. In addition, there is very serious noise interference in the wavenumber range of 12,000-10,000 cm⁻¹, especially in Figure 2i with the 2nd method. This is due to the marked increase in noise level in higher-order derivative calculations. Each spectrum has seven groups of peaks, in the wavenumber ranges of 11,700-10,500, 9000-7600, 7200-6000, 6000-5400, 5300-5000, 5000-4500, and 4500-4150 cm⁻¹, which are assigned to OH second overtone bands, CH second overtone bands, OH first overtone bands, CH first overtone bands, OH combination bands, NH and OH combination bands, and CH combination bands, respectively. In addition, there are clear differences between the inner and outer spectra in the wavenumber ranges of 9000-7000 and 6400-5600 cm⁻¹. However, there is almost no difference among the spectra of different-age CRPs, and the classification of different-age CRPs cannot be achieved with single pretreatment methods.
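For readers unfamiliar with the single pretreatment methods named above, the following sketch gives textbook forms of de-bias, DT, SNV, MSC and a first derivative. These are assumptions about the general techniques, not the authors' Matlab routines: the detrend polynomial degree, the use of the mean spectrum as the MSC reference and the finite-difference derivative (rather than, for example, a smoothed derivative) are illustrative choices.

```python
import numpy as np

def debias(X):
    """Subtract each spectrum's own mean (offset removal)."""
    return X - X.mean(axis=1, keepdims=True)

def detrend(X, degree=1):
    """Subtract a polynomial baseline fitted to each spectrum (degree is an assumption)."""
    x = np.linspace(-1.0, 1.0, X.shape[1])
    coefs = np.polynomial.polynomial.polyfit(x, X.T, degree)
    baseline = np.polynomial.polynomial.polyval(x, coefs)   # (n_samples, n_points)
    return X - baseline

def snv(X):
    """Standard normal variate: centre and scale each spectrum individually."""
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std

def msc(X, reference=None):
    """Multiplicative scatter correction against a reference (default: mean spectrum)."""
    ref = X.mean(axis=0) if reference is None else reference
    out = np.empty_like(X)
    for i, row in enumerate(X):
        slope, intercept = np.polyfit(ref, row, 1)   # row ~ intercept + slope * ref
        out[i] = (row - intercept) / slope
    return out

def first_derivative(X):
    """Finite-difference first derivative along the wavenumber axis."""
    return np.gradient(X, axis=1)

# Toy spectra: one band shape with sample-specific gain and offset ("scatter").
x = np.linspace(0.0, 1.0, 50)
base = np.sin(6.0 * x)
gains = np.array([0.8, 1.0, 1.3])[:, None]
offsets = np.array([0.1, -0.2, 0.4])[:, None]
X = gains * base + offsets
X_msc, X_snv = msc(X), snv(X)
```

On this toy input, both MSC and SNV map the three spectra onto a common curve, which is the sense in which they remove multiplicative and additive scatter effects.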

PCA of Different-Age CRPs with Single Pretreatment Techniques
In order to discriminate the different-age CRP samples, the PCA method was performed. The calibration dataset with 150 samples and the validation dataset with 50 samples were obtained by the KS method. Figures 4 and 5 show the classification results based on the raw spectra and on the spectra with single pretreatment methods, for the outer skin and inner capsule data, respectively. In the figures, the validation samples are labeled with hollow icons. The first two scores (PC1 and PC2) were used for the classification analysis, based on the explained variances noted on the axes. As shown in Figures 4a and 5a, the five groups overlap and the classification is poor with the raw spectra. The classification accuracies are 2.00% and 6.00% for the outer skin and inner capsule data, respectively. Figures 4b-i and 5b-i show the PCA results with the DT, de-bias, SNV transformation, MinMax, MSC, 1st, 2nd and CWT methods, for the outer skin and inner capsule, respectively. The classification results of the inner capsule are better than those of the outer skin. The best classification accuracy is 10.00% with the SNV transformation and MSC methods for the outer skin data, while the best classification accuracy is 22.00% with the 2nd method for the inner capsule data. Therefore, classification using the spectra with single pretreatment methods may not be feasible.
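The PC1/PC2 score plots discussed above rest on a projection that can be sketched as follows. This is a generic mean-centred PCA computed by singular value decomposition, with random matrices standing in for the 150 calibration and 50 validation spectra; the function names are hypothetical, and how the classification accuracy was derived from the score plots is not specified in the text and is not reproduced here.

```python
import numpy as np

def pca_fit(X_cal, n_components):
    """Fit PCA on calibration spectra; return mean, loadings and explained variance ratio."""
    mean = X_cal.mean(axis=0)
    _, s, Vt = np.linalg.svd(X_cal - mean, full_matrices=False)
    ratio = s ** 2 / np.sum(s ** 2)
    return mean, Vt[:n_components], ratio[:n_components]

def pca_transform(X, mean, loadings):
    """Project spectra onto the loadings, centring with the calibration mean."""
    return (X - mean) @ loadings.T

# Placeholder spectra: 150 calibration and 50 validation samples, 300 variables.
rng = np.random.default_rng(2)
X_cal = rng.standard_normal((150, 300))
X_val = rng.standard_normal((50, 300))

mean, loadings, ratio = pca_fit(X_cal, 2)
T_cal = pca_transform(X_cal, mean, loadings)   # PC1/PC2 scores, calibration set
T_val = pca_transform(X_val, mean, loadings)   # validation samples in the same axes
```

Projecting the validation samples with the calibration mean and loadings places them in the same score space as the calibration samples, which is what allows them to be drawn as hollow icons on the same plot.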

Spectra of Different-Age CRPs with Combined Pretreatment Techniques
In order to improve the accuracy of classification, combined pretreatment techniques were applied. Figures 6 and 7 show the spectra with the 1st-DT, 1st-SNV, 1st-MSC, CWT-SNV, CWT-MSC, and SNV-1st methods, for the outer skin and inner capsule, respectively. The CWT and 1st methods can significantly eliminate the background and baseline drift interference in the signal, and they change the signal more than the other methods do. Therefore, Figure 6a-c,f are similar, while Figure 6d is similar to Figure 6e. Similar results can be obtained for the inner capsule data, shown in Figure 7d-f. In addition, there are clear differences between the outer skin and inner capsule spectra in the wavenumber ranges of 9000-7000 and 6400-5600 cm⁻¹. The noise interference in the outer skin spectra is less than that in the inner capsule spectra.

PCA of Different-Age CRPs with Combined Pretreatment Techniques
Figures 8 and 9 show the classification results with combined pretreatment methods for the outer skin and inner capsule data, respectively. The validation samples are labeled with hollow icons, and the first two scores were used for the classification analysis. Figure 8a-c,f are similar, while Figure 8d is similar to Figure 8e. Similar results can be obtained for the inner capsule data, shown in Figure 9a-f. The classification results with combined pretreatment techniques are better than those with single pretreatment techniques, while the classification results of the inner capsule are better than those of the outer skin. The best classification accuracy is 6.00% with the SNV-1st method for the outer skin data, while the best classification accuracy is 30.00% with the SNV-1st method for the inner capsule data. However, the results are still unsatisfactory, even with the combined pretreatment techniques.
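Since both 1st-SNV and SNV-1st appear among the combined pretreatments, a short sketch may help show that the order of the two steps matters. It assumes that the name lists the steps in the order they are applied (the text does not state this), and it uses a plain finite-difference derivative and a textbook SNV as stand-ins; the helper `chain` is hypothetical.

```python
import numpy as np

def snv(X):
    """Standard normal variate applied to each spectrum (row)."""
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std

def first_derivative(X):
    """Finite-difference first derivative along the wavenumber axis."""
    return np.gradient(X, axis=1)

def chain(*steps):
    """Build a combined pretreatment that applies the given steps in order."""
    def apply(X):
        for step in steps:
            X = step(X)
        return X
    return apply

first_snv = chain(first_derivative, snv)   # assumed meaning of "1st-SNV"
snv_first = chain(snv, first_derivative)   # assumed meaning of "SNV-1st"

# Toy spectra: one band shape with sample-specific gain and offset.
x = np.linspace(0.0, 1.0, 50)
base = np.sin(6.0 * x)
X = np.array([[0.8], [1.3]]) * base + np.array([[0.1], [0.4]])
A = first_snv(X)
B = snv_first(X)
```

On this toy input both orders remove the gain and offset differences between the two spectra, yet the resulting curves differ from each other, so the two combinations are not interchangeable.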

FLD of Different-Age CRPs with Pretreatment Techniques
As a powerful supervised classification method, the FLD method has been developed to find the optimal boundary between object classes. To make the total number of objects equal to three to five times the number of variables, the PCA method was applied to reduce the dimensionality into fewer PCs before FLD calculation. The 200 different-age CRP samples were divided into a calibration dataset with 150 samples and a validation dataset with 50 samples by the KS method. Figure 10 shows the cumulative variance contribution rate as a function of the number of PCs. The cumulative variance contribution rate increases rapidly with the number of PCs and then reaches a stable high level. For the spectra with the DT, de-bias, SNV transformation, MinMax and MSC methods, most of the variation (~99%) is explained with 5 PCs and ~99.99% is explained with 30 PCs. For the spectra with the CWT and derivative methods, ~92% of the variation is explained with 5 PCs and most of the variation (~99%) is explained with 30 PCs, except for the data with the 2nd method (~92%). This is because the variant background in the spectra is removed with the 1st, 2nd and CWT methods. In addition, the noise level increases markedly in higher-order derivative calculations. Therefore, 30 PCs were selected for the FLD calculation of both outer skin and inner capsule data.
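The cumulative variance contribution rate and the sample-to-variable ratio mentioned above can be made concrete with a short sketch. The function names and the toy curve are illustrative assumptions, and the random matrix is only a placeholder for calibration spectra.

```python
import numpy as np

def cumulative_contribution(X):
    """Cumulative variance contribution rate of the PCs of mean-centred X."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)
    var = s ** 2
    return np.cumsum(var) / var.sum()

def n_pcs_reaching(cum, threshold):
    """Smallest number of PCs whose cumulative contribution reaches the threshold."""
    return int(np.searchsorted(cum, threshold)) + 1

# Toy curve: three PCs are needed to explain at least 95% of the variance.
toy = np.array([0.50, 0.90, 0.99, 1.00])
k = n_pcs_reaching(toy, 0.95)

# Placeholder calibration matrix: 150 samples, 300 variables.
rng = np.random.default_rng(4)
cum = cumulative_contribution(rng.standard_normal((150, 300)))

# Ratio of calibration samples to retained variables for the choice made in the text.
n_calibration, n_pcs = 150, 30
ratio = n_calibration / n_pcs
```

With 150 calibration samples and 30 retained PCs the ratio is 5, which lies within the suggested range of three to five objects per variable.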
With the selected PCs, the FLD method was used for the classification analysis of different-age CRP samples, and different pretreatment techniques were applied to optimize the classification model. Table 1 shows the classification accuracies obtained by FLD with different pretreatment methods for the outer skin and inner capsule spectra. It is clear that the classification accuracies with the FLD method are significantly higher than those with the PCA method. The identification accuracies of the raw data are more than 96% for both the outer skin and inner capsule spectra. The result of the 2nd method is not satisfactory, due to the marked increase in noise level in higher-order derivative calculations. Furthermore, 100% identification accuracy for the outer skin spectra can be obtained with the raw data or with the DT, SNV transformation, MinMax and MSC methods, while 100% identification accuracy for the inner capsule spectra can be obtained with the DT, de-bias, SNV transformation, MinMax and MSC methods. Figure 11 shows the FLD score plots of the outer skin spectra with the FLD method and of the inner capsule spectra with the SNV-FLD method; all five groups are visually separated. Satisfactory results can be obtained with both outer skin and inner capsule spectra, even without any spectral pretreatment. The results demonstrate that the classification of different-age CRPs can be achieved by the method.
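The FLD step applied to the PC scores can be sketched as follows. This is a generic Fisher discriminant with nearest-centroid assignment, not the authors' implementation, whose decision rule is not described in the text. The synthetic matrix stands in for 30 PC scores of 200 samples, the every-fourth-sample hold-out replaces the KS split for brevity, and all function names are hypothetical.

```python
import numpy as np

def fld_fit(T, y):
    """Fit Fisher discriminant directions on scores T (n_samples, n_vars)."""
    classes = np.unique(y)
    grand_mean = T.mean(axis=0)
    n_vars = T.shape[1]
    Sw = np.zeros((n_vars, n_vars))   # within-class scatter
    Sb = np.zeros((n_vars, n_vars))   # between-class scatter
    for c in classes:
        Tc = T[y == c]
        mc = Tc.mean(axis=0)
        Sw += (Tc - mc).T @ (Tc - mc)
        Sb += len(Tc) * np.outer(mc - grand_mean, mc - grand_mean)
    # Whiten Sw (requires more samples than variables, hence the PCA step),
    # then diagonalise Sb in the whitened space.
    lam, V = np.linalg.eigh(Sw)
    P = V / np.sqrt(lam)              # P.T @ Sw @ P is the identity
    evals, U = np.linalg.eigh(P.T @ Sb @ P)
    top = np.argsort(evals)[::-1][: len(classes) - 1]
    W = P @ U[:, top]
    centroids = np.array([(T[y == c] @ W).mean(axis=0) for c in classes])
    return W, classes, centroids

def fld_predict(T, W, classes, centroids):
    """Assign each sample to the class with the nearest centroid in FLD space."""
    Z = T @ W
    dist = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(dist, axis=1)]

# Synthetic stand-in for 30 PC scores of 200 samples in five age groups.
rng = np.random.default_rng(3)
ages = np.array([5, 10, 15, 20, 25])
y = np.repeat(ages, 40)
T = rng.standard_normal((200, 30))
for k, age in enumerate(ages):
    T[y == age, k] += 10.0            # shift each group along its own axis

# Simple every-fourth-sample hold-out (not Kennard-Stone), giving 150/50.
is_val = np.arange(200) % 4 == 0
W, classes, centroids = fld_fit(T[~is_val], y[~is_val])
pred = fld_predict(T[is_val], W, classes, centroids)
accuracy = float(np.mean(pred == y[is_val]))
```

With five classes there are at most four discriminant directions, so the 30 PC scores are reduced to a four-dimensional space in which the class centroids are compared. The whitening step fails if the within-class scatter matrix is singular, which illustrates why the number of calibration samples must exceed the number of variables.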

Conclusions
A rapid and nondestructive method for the classification of different-age CRPs was established using portable NIRS in reflectance mode in combination with appropriate chemometric methods. The spectra of outer skin and inner capsule can be obtained directly without destroying the samples.