Rapid Prediction of Adulteration Content in Atractylodis rhizoma Based on Data and Image Features Fusions from Near-Infrared Spectroscopy and Hyperspectral Imaging Techniques

Atractylodis rhizoma (AR) is an herb and food source with great economic, medicinal, and ecological value. Atractylodes chinensis (DC.) Koidz. (AC) and Atractylodes lancea (Thunb.) DC. (AL) are its two botanical sources. Commercial fraud in which AR is adulterated with Atractylodes japonica Koidz. ex Kitam (AJ) frequently occurs in pursuit of higher profit. To quickly determine the adulteration content in AC and AL powder, two spectroscopic techniques, near-infrared spectroscopy (NIRS) and hyperspectral imaging (HSI), were introduced. The partial least squares regression (PLSR) algorithm was selected for predictive modeling of AR adulteration levels. Preprocessing and feature variable extraction were used to optimize the prediction model, and data and image feature fusions were then developed to obtain the best predictive model. The results showed that, when only single-spectral techniques were considered, NIRS was more suitable than HSI for both tasks. In addition, by comparing the models built after the data fusion of NIRS and HSI with those built on single spectra, we found that the mid-level fusion (MLF) strategy produced the best models in both tasks. On this basis, combining the color-texture features further improved the predictive ability of the model. For the adulteration level prediction task of AC, the best strategy was combining MLF data (at the CARS level) and color-texture features (C-TF), at which time the R2T, RMSET, R2P, and RMSEP were 99.85%, 1.25%, 98.61%, and 5.06%, respectively. For AL, the best approach was combining MLF data (at the SPA level) and C-TF, with the highest R2T (99.92%) and R2P (99.00%), as well as the lowest RMSET (1.16%) and RMSEP (2.16%). Therefore, combining data and image features from NIRS and HSI is a promising strategy for predicting adulteration content quickly, non-destructively, and accurately.


Spectral Acquisition and Image Feature Extraction
All samples were scanned by an NIR spectrometer (Antaris™ II, Thermo Fisher Scientific Co., Ltd., Waltham, USA) and a hyperspectral imager (GaiaField-N17E, Sichuan Shuangli Hepu Technology Co., Ltd., Chengdu, China). The parameters of the NIR spectrometer were set as follows: the number of spectral scans was 32, and the resolution was 8 cm−1. The parameters of the hyperspectral imager were set as follows: the moving speed and distance of the carrier platform were 1.4 cm s−1 and 30 cm, respectively; the vertical distance between the sample and the lens was 42 cm; and the exposure time was 7 ms. For NIRS, the scan was repeated three times for each sample and averaged for subsequent data analysis. For HSI, each sample was scanned once, as in previous studies [21]. Finally, 1557 NIRS variables covering 4000-10,000 cm−1 and 512 HSI variables covering 900-1700 nm were acquired.
The raw NIRS data were entered into the Unscrambler X 10.4 software to extract the corresponding spectral values. As shown in Figure 1, four processes had to be completed before the HSI data could be used for further analysis: (i) the original hyperspectral image was calibrated using Formula (1); (ii) the calibrated hyperspectral image was applied to the mask image to identify the region of interest (ROI) and remove outlier pixels; (iii) two dilation and one erosion operations were performed; (iv) each image was cropped into sub-images, and the reflectance values of all pixels in each sub-image were averaged to obtain the average spectrum for each sample. In addition, the study also considered combining color and texture features to improve the prediction ability. Color features (CF) were the first-order, second-order, and third-order moments of the three RGB channels, while texture features (TF) included the energy, homogeneity, contrast, and correlation of each pixel in the Gray-Level Co-occurrence Matrix (GLCM) at three bands (47, 197, 378) and four directions (0°, 45°, 90°, 135°). Masked and filtered binary images were used to calculate the morphological features. A total of 48 texture features were extracted from the new hypercube by the GLCM algorithm, and 9 color features were calculated from the three RGB channels (Table S1) [23,24].
Note: RB = the all-black background image; RC = the corrected image; RO = the original image; RW = the all-white calibration plate image.
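As an illustration, the calibration and the color-moment features can be sketched in Python with NumPy. Formula (1) is not reproduced in this excerpt, so the standard black/white reflectance correction RC = (RO − RB)/(RW − RB) is assumed here; the function names are ours.

```python
import numpy as np

def calibrate(raw, white, dark):
    """Black/white reflectance calibration; assumes the standard form of
    Formula (1): R_C = (R_O - R_B) / (R_W - R_B)."""
    return (raw - dark) / (white - dark + 1e-12)  # epsilon guards dead pixels

def color_moments(rgb):
    """First-, second-, and third-order moments of each RGB channel,
    giving the 9 color features per image described above."""
    feats = []
    for c in range(3):
        ch = rgb[..., c].astype(float).ravel()
        mu = ch.mean()                             # first-order moment
        sigma = ch.std()                           # second-order moment
        skew = np.cbrt(((ch - mu) ** 3).mean())    # third-order moment (cube root)
        feats.extend([mu, sigma, skew])
    return np.array(feats)
```

For a uniform image the second- and third-order moments are zero, which is a quick sanity check on the feature layout.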

Preprocessing and Feature Variable Extraction
To remove systematic noise and outside environmental influences during sample collection, to reduce the substantial amount of redundant and useless spectral information across the full wavelength range, to simplify the model, and to enhance its predictive power, pretreatment and feature variable extraction were required [25]. Following previous studies, five pretreatment methods were chosen: Savitzky-Golay smoothing (SGS), standard normalized variate (SNV), multiplicative scatter correction (MSC), the first derivative (1 Der), and the second derivative (2 Der). Among them, SGS is used for smoothing filtering, which reduces the interference of noise on the sampled signal. Sample inhomogeneity causes light scattering, which introduces errors into the spectrum; SNV can remove such additive and multiplicative effects. MSC is mainly applied to eliminate scattering effects caused by inhomogeneity in particle distribution and particle size. Interference caused by baseline drift or background effects can be eliminated by the 1 Der and 2 Der algorithms, which also resolve overlapping peaks to improve resolution and sensitivity [26].
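The three main pretreatments can be sketched compactly in Python; this is a minimal illustration (NumPy/SciPy assumed, function names are ours), not the exact implementation used in the study, where X holds one spectrum per row.

```python
import numpy as np
from scipy.signal import savgol_filter

def snv(X):
    """Standard normalized variate: center and scale each spectrum (row),
    removing additive and multiplicative scatter effects."""
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

def msc(X, reference=None):
    """Multiplicative scatter correction against the mean spectrum."""
    ref = X.mean(axis=0) if reference is None else reference
    out = np.empty_like(X, dtype=float)
    for i, row in enumerate(X):
        b, a = np.polyfit(ref, row, 1)  # fit row ~ a + b * ref
        out[i] = (row - a) / b          # undo the offset and the slope
    return out

def sg(X, window=11, poly=2, deriv=0):
    """Savitzky-Golay smoothing (deriv=0) or 1st/2nd derivative (deriv=1, 2)."""
    return savgol_filter(X, window_length=window, polyorder=poly,
                         deriv=deriv, axis=1)
```

Note that MSC needs a reference spectrum (here the training-set mean), whereas SNV operates on each spectrum independently, which is why the two often give similar but not identical results.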

In addition, three feature variable extraction methods were selected: competitive adaptive reweighted sampling (CARS), the successive projections algorithm (SPA), and the genetic algorithm (GA) [15]. CARS is an efficient wavelength selection algorithm based on the principle of "survival of the fittest," designed to select critical wavelengths through a rigorous and computationally efficient procedure. Two consecutive wavelength selection steps are performed: first, wavelengths with relatively small PLSR coefficients are forcibly removed using an exponentially decreasing function; next, adaptive reweighted sampling further eliminates wavelengths in a competitive manner. SPA is a forward variable selection algorithm used in multivariate calibration to select wavelengths with minimal redundancy. It performs simple projection operations in vector space to obtain a subset of useful variables with minimal collinearity. The principle of variable selection via SPA is that the newly selected variable is the one with the largest projection onto the orthogonal subspace of the previously selected variables. The optimal initial variable and the number of variables can be determined from the minimum root-mean-square error of cross-validation. GA is a global optimization search method inspired by Darwin's theory of natural selection. It selects variables better suited to the fitness function through genetic operations such as reproduction, mutation, and selection over successive generations. In this study, the parameter values of the GA were set based on preliminary tests: population size (30), window width (3), penalty slope (0), maximum generations (100), mutation rate (0.01), crossover probability (0.5), and replicate runs (100) [27].
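The SPA selection rule described above can be sketched in a few lines of NumPy; this is a simplified illustration (function name and stopping rule are ours; the full SPA also scans initial variables and chooses the subset size by cross-validation).

```python
import numpy as np

def spa(X, n_select, start=0):
    """Successive projections algorithm (sketch): greedily pick the column
    with the largest norm after projecting out the columns already selected,
    which keeps collinearity among the chosen variables minimal."""
    Xp = X.astype(float).copy()
    selected = [start]
    for _ in range(n_select - 1):
        v = Xp[:, selected[-1]].copy()
        v /= np.linalg.norm(v)
        # project every column onto the orthogonal complement of v
        Xp = Xp - np.outer(v, v @ Xp)
        norms = np.linalg.norm(Xp, axis=0)
        norms[selected] = -1.0  # never re-select an already-chosen column
        selected.append(int(np.argmax(norms)))
    return sorted(selected)
```

On orthogonal data (e.g. an identity matrix) the projections leave the remaining columns untouched, so SPA simply walks through them in order, which makes the behavior easy to verify.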

Data and Image Feature Fusions
The NIRS and HSI data were directly stitched together to obtain the low-level fusion (LLF) dataset; LLF is usually used in conjunction with feature variable extraction because more information is introduced to interpret the sample. The features extracted by the same feature-filtering method were integrated into a new mid-level fusion (MLF) dataset, which effectively avoids the drawbacks of LLF by integrating data from the two sources without a large increase in data volume [28]. NIRS, HSI, and LLF each went through the five pretreatments and three feature variable selections to filter out the best combination of methods, whereas MLF built an optimal PLSR model by fusing the NIRS and HSI feature variables extracted by the same feature variable extraction method on top of the best pretreatment (Figure 2). CF, TF, or color-texture features (C-TF) were then added to the best PLSR models based on NIRS, HSI, LLF, or MLF to obtain the optimal model.
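The MLF assembly step amounts to column-wise concatenation of the selected feature blocks; a minimal sketch (function and argument names are ours) is:

```python
import numpy as np

def mid_level_fusion(nirs_X, hsi_X, nirs_idx, hsi_idx, image_feats=None):
    """Mid-level fusion: concatenate the feature variables selected (by the
    same extraction method, e.g. CARS or SPA) from the pretreated NIRS and
    HSI matrices, optionally appending the per-sample image features
    (e.g. the 9 color + 48 texture features, C-TF)."""
    blocks = [nirs_X[:, nirs_idx], hsi_X[:, hsi_idx]]
    if image_feats is not None:
        blocks.append(image_feats)
    return np.hstack(blocks)
```

Because only the selected variables are stacked, the fused matrix stays small (tens of columns) compared with the 2069-column LLF dataset, which is the practical advantage of MLF noted above.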

Data Set Partitioning and Quantitative Analysis Methods
The sample-set partitioning based on joint x-y distances (SPXY) algorithm accounts for the variability of both x-space and y-space. In contrast to partitioning schemes based only on x information or on random sampling, this approach covers the multidimensional space more effectively [29]. Therefore, SPXY was used to divide the dataset into training and prediction sets in a ratio of 4:1. For AC adulteration concentration prediction, the numbers of samples in the training and prediction sets were 672 and 168, respectively, while for AL the numbers were 461 and 115. PLSR is a common modeling method for quantitative spectral analysis that effectively addresses multicollinearity between variables and incorporates the principles of principal component analysis, multiple linear regression, and canonical correlation analysis. Because of its strong predictive power, the PLSR algorithm was chosen as the quantitative prediction model [30,31]. The choice of principal component number affects the modeling performance of PLSR; therefore, 10-fold cross-validation was used to select the principal component number that minimizes the root mean square error (RMSE). In this study, the performance of the model was assessed mainly by the correlation coefficient of the training set (R2T), the correlation coefficient of the prediction set (R2P), the root mean square error of the training set (RMSET), and the root mean square error of the prediction set (RMSEP). A good model should have a low RMSE and a high R2. In addition, the principal component number should be as small as possible, because a large number may introduce irrelevant information and cause overfitting [32].
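The SPXY split can be sketched as Kennard-Stone selection on the joint, max-normalized x- and y-distances; the following NumPy illustration (function name is ours) shows the core idea for a small dataset.

```python
import numpy as np

def spxy_split(X, y, train_frac=0.8):
    """SPXY partitioning (sketch): greedy Kennard-Stone selection on the
    joint distance d = dx/max(dx) + dy/max(dy)."""
    n = len(X)
    dx = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    dy = np.abs(y[:, None] - y[None, :]).astype(float)
    d = dx / dx.max() + dy / dy.max()
    # seed the training set with the two mutually most distant samples
    i, j = np.unravel_index(np.argmax(d), d.shape)
    train = [int(i), int(j)]
    n_train = round(train_frac * n)
    while len(train) < n_train:
        remaining = [k for k in range(n) if k not in train]
        # add the sample farthest from its nearest already-selected neighbor
        k = max(remaining, key=lambda r: d[r, train].min())
        train.append(k)
    test = [k for k in range(n) if k not in train]
    return train, test
```

With train_frac = 0.8 this reproduces the 4:1 training/prediction ratio used in the study (e.g. 672/168 for AC).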

Software
The Unscrambler X 10.4 was used for NIRS data format conversion, and SpecView was used for hyperspectral image capture and calibration. The rest of the data processing was conducted on MATLAB R2022a.


Sample and Spectral Analysis
The color, texture, and size of the powder particles were relatively similar between pure and adulterated AR powder, so it was challenging to quickly distinguish them by appearance (Figure 3). Unsurprisingly, the raw spectrograms did not seem to work well either. The curve trends of pure and adulterated samples were similar, and the peaks appeared at the same positions and heights in both the HSI and NIRS spectrograms (Figure 4), which was consistent with previous studies [33,34]. Therefore, specific processing of the raw spectra was necessary to achieve fast and accurate prediction of the adulteration level.

Quantitative Analysis Based on NIRS Data and Image Features
For predicting the adulteration concentration of AC samples, the PLSR model based on the original data obtained high R2T (95.32%) and R2P (95.00%) and acceptable RMSET (8.63%) and RMSEP (8.89%) (Table 1). All five preprocessing methods improved the model's performance to different degrees. Among them, 1 Der produced the best result, with higher R2T (99.77%) and R2P (97.61%) as well as lower RMSET (1.94%) and RMSEP (6.60%). 2 Der was not considered because its R2P (87.45%) and RMSEP (15.53%) were suboptimal, even though its R2T (99.98%) and RMSET (0.60%) were excellent. Feature variable extraction, in contrast, degraded the model's performance to different degrees. Although CARS degraded the model only slightly, the modeling efficiency increased substantially; hence, CARS was chosen as the best feature variable extraction method, at which time the model's R2T, RMSET, R2P, and RMSEP were 99.06%, 3.87%, 98.50%, and 5.14%, respectively. The principal component number was selected by plotting the RMSE of Y against the number of principal components. In Figure 5, the smallest cross-validated RMSE was obtained when the number of principal components was 20 (between 10 and 20 principal components, the RMSE showed a downward trend and was essentially stable after 20, so 20 was selected). Since previous studies have shown that fusing HSI with color features modeled whole wheat flour samples with different DON levels effectively, and since introducing a large number of features without reducing the data dimension may complicate the model, CF, TF, and C-TF were added after the feature variables were extracted [34,35]. The best performance was obtained by combining C-TF, at which time R2T, RMSET, R2P, and RMSEP were 99.06%, 3.87%, 98.50%, and 5.14%, respectively.
Only 85 features were combined; compared with the original data, the number of features was reduced by 94.54%, while R2T and R2P improved by 3.74% and 3.50%, respectively, and RMSET and RMSEP decreased by 4.76% and 3.75%, respectively. For predicting the adulteration level of AL, the PLSR model built from the raw NIRS data had R2T, RMSET, R2P, and RMSEP of 97.81%, 5.93%, 92.45%, and 11.73%, respectively (Table 1). Preprocessing did not always optimize the model, especially 2 Der, with which R2P was reduced by 24.29% and RMSEP increased by 9.36%. While previous studies found significant performance gains in models built with 2 Der preprocessing, our study reached the opposite result [9]. A reasonable explanation is that the signal-to-noise ratio decreases as the derivative order increases, and spectral information may be lost [36]. The best pretreatment was SGS, where the R2T, RMSET, R2P, and RMSEP were 99.66%, 2.24%, 97.60%, and 7.08%, respectively. The best feature selection method was SPA, which filtered out 95 feature variables, a reduction of 1462 compared to the full wavelength set. However, neither feature variable selection nor combining image features improved model performance, presumably because some useful information was eliminated and the algorithms utilized different information with different efficiency [37]. Note: AC = Atractylodes chinensis (DC.) Koidz.; AL = Atractylodes lancea (Thunb.) DC.; NIRS = near-infrared spectroscopy; HSI = hyperspectral imaging; SGS = Savitzky-Golay smoothing; SNV = standard normalized variate; MSC = multiplicative scatter correction; 1 Der = first derivative; 2 Der = second derivative; CARS = competitive adaptive reweighted sampling; SPA = successive projection algorithm; GA = genetic algorithm; CF = color features; TF = texture features; C-TF = color-texture features; R2T = correlation coefficient of training sets; RMSET = root mean square error of training sets; R2P = correlation coefficient of prediction sets; RMSEP = root mean square error of prediction sets.

Quantitative Analysis Based on HSI Data and Image Features
The models based on HSI data performed worse than those based on NIRS. For predicting the adulteration content of adulterated AC, the R2T, RMSET, R2P, and RMSEP of the model using the original data were only 81.92%, 16.39%, 79.99%, and 17.24%, and after preprocessing with 2 Der, the R2 exceeded 88% (Table 1). The best feature variable extraction method was SPA, with which R2T reached 90.06%. Although R2P was only 85.59%, it was still the highest among the three methods. We combined image features on the basis of 2 Der + SPA and found that the three resulting models ranked, from best to worst: 2 Der + SPA + CF > 2 Der + SPA + C-TF > 2 Der + SPA + TF. Similar results were found in a classification task [38]; it might be that CF carries useful information while TF contributes little to the modeling. Although the R2 of the model constructed by 2 Der + SPA + CF was higher than 86%, it was still far from the best PLSR model based on NIRS data.
Interestingly, in predicting the adulteration level in AL, the model built with the original data gave better results than those built with pretreatments, except for 2 Der. After 2 Der processing, the R2T, RMSET, R2P, and RMSEP of the model were 99.12%, 3.69%, 91.66%, and 11.58%, respectively, changes of 9.46%, 8.56%, 2.89%, and 4.03% compared with the original data (Table 1). In addition, we found that 2 Der was the best choice for predicting the adulteration level of either AC or AL based on HSI data. In the detection of metanil yellow adulteration in chickpea flour based on the full HSI spectrum, the best pretreatment was also 2 Der, consistent with our findings [33]. This might be related to the ability of 2 Der to separate overlapping peaks, which may lead to successful prediction [39]. The best feature variable selection method was again SPA, which filtered the spectrum down to 94 feature wavelengths. Combining TF on this basis seemed to be the best option. However, in general, feature variable extraction and combining TF did not lead to a significant improvement in model performance. Furthermore, the models built from HSI data seemed worse than those built from NIRS, which is explainable because HSI achieves its unique spatial resolution at the expense of spectral resolution compared with conventional NIRS [40]. However, when NIRS was combined with the C-TF extracted from HSI, the complementary advantages of the two further optimized model performance, which suggested that a better model might be obtained by fusing the two data sources and then combining the image features.

Quantitative Analysis Based on LLF Data and Image Features
The LLF dataset of 2069 features was obtained by integrating the 1557 variables of NIRS and the 512 variables of HSI. For the adulteration content prediction of adulterated AC, 1 Der was the best pretreatment choice, at which time R2T, RMSET, R2P, and RMSEP were 99.92%, 1.15%, 97.71%, and 6.73%, respectively (Table 1). A total of 117 feature variables were obtained after SPA feature extraction, 94.35% fewer than the total number of variables, but the performance of the model decreased slightly. Performance degraded further when image features were introduced.
For predicting the adulteration level of AL, the model built on the original LLF data obtained 98.16%, 5.47%, 91.95%, and 12.29% for R2T, RMSET, R2P, and RMSEP (Table 1). After pretreatment, SGS, SNV, MSC, and 1 Der optimized the model, while 2 Der did the opposite. The model performed best after SGS treatment, with R2T and R2P increasing by 1.65% and 5.24%, respectively, and RMSET and RMSEP decreasing by 3.72% and 5.92%, respectively, compared with the original data. However, similar to the AC prediction task, both feature variable selection and combining image features made the models worse; LLF did not seem suitable for combining image features in either prediction task. In addition, the models using LLF exhibited similar or poorer performance compared with models built on a single spectrum. In a previous study, however, the LLF strategy showed better predictive ability than the optimal single-spectrum model in predicting TVB-N content in chicken. Our results differ, which could be attributed to the large amount of data introduced by LLF, increasing the computational complexity and uncertainty of the PLSR method [21,41].

Quantitative Analysis Based on MLF Data and Image Features
Based on the best pretreatment, the MLF dataset was obtained by integrating the feature variables obtained by the same feature variable extraction method. For predicting the adulteration content of AC, the model built at the CARS level was the best, with R2T, RMSET, R2P, and RMSEP of 99.15%, 3.61%, 98.17%, and 6.55%, respectively (Table 1). When image features were considered, C-TF was the best choice, with R2T, RMSET, R2P, and RMSEP of 99.85%, 1.25%, 98.61%, and 5.06%, respectively, superior to the best model built on NIRS data.
For predicting the adulteration content in AL, the model built at the SPA level was the best, with R2T, RMSET, R2P, and RMSEP of 99.62%, 2.37%, 96.22%, and 11.00%, respectively (Table 1). Similarly, combining C-TF on this basis also achieved the best results, with R2T, RMSET, R2P, and RMSEP reaching 99.92%, 1.16%, 99.00%, and 2.16%, better than the best models built with single-spectrum or LLF data. In general, the MLF strategy achieved very good results, consistent with the findings of previous studies [15].

Analysis of Feature Variables
The characteristic variables selected by the MLF strategy were analyzed to find the reasons for the success of the adulteration content prediction. In Figure 6A, the feature variables were mainly distributed in the ranges of 4200-5300 cm−1 and 8200-9800 cm−1. Unfortunately, there is little information in the scientific literature about the NIRS of AR; nevertheless, an attempt was made to identify the compounds that may be present. We speculated that the feature variables selected between 4200 and 5300 cm−1 may be related to the absorption valleys near 4200, 4600, and 5000 cm−1. The variables near 4600 and 5000 cm−1 may be connected to the combination bands of C-H and N-H vibrations found in proteins, while those near 4200 cm−1 could be attributed to C-H and C-C vibrations [42]. The region between 8200 and 9800 cm−1 was associated with the second overtone vibrations of C-H and N-H [43]. In Figure 6B, the characteristic variables were mainly distributed between 1140 and 1500 nm, which may be related to the first overtone vibrations of C-H and N-H.
Intriguingly, the feature variables in Figure 6C were primarily dispersed over the ranges of 5200-7300 and 8200-9800 cm−1. The absorption band at 5200-7300 cm−1 might be due to the amide combination band vibration of CONH2 and the N-H stretching of protein [44]. As for the feature variables of the HSI data processed by 2 Der and SPA (Figure 6D), the selected variables were concentrated at 1000-1180, 1220-1340, and 1450-1600 nm. Feature variables in the 1000-1180 nm range may be influenced by the second overtone of N-H or O-H [36]. In summary, proteins may affect the modeling effectiveness, and the specific substances still need further in-depth study.


Conclusions
To predict the adulteration content of AC and AL powder quickly and non-destructively, two spectroscopic techniques, NIRS and HSI, were applied. Preprocessing, feature variable selection, data fusion, and image feature fusion strategies were introduced to obtain the optimal prediction model. The results suggested that NIRS alone might be more appropriate than HSI alone, but when higher prediction accuracy is pursued, adding C-TF to the MLF data is the better choice. Specifically, for the task of predicting the adulteration level of AC, the best strategy was combining MLF data (at the CARS level) and C-TF, at which time the R2T, RMSET, R2P, and RMSEP of the model were 99.85%, 1.25%, 98.61%, and 5.06%, respectively. For the AL task, the best approach was adding C-TF to the MLF data (at the SPA level), where the R2T, RMSET, R2P, and RMSEP were 99.92%, 1.16%, 99.00%, and 2.16%, respectively. In addition, we found that proteins may be one of the factors affecting successful modeling, but the exact substances still need further study. The instruments used in this study are suitable for analytical testing in the laboratory; therefore, using low-cost and convenient spectroscopic equipment for quality assessment in the field, or applying a demixing algorithm to obtain a better model, may be the focus of future work. In conclusion, the data and image feature fusions based on NIRS and HSI can predict the adulteration level of AR powder quickly, non-destructively, and accurately, which is beneficial for safeguarding the quality and efficacy of herbs as well as providing a theoretical basis and new ideas for the quality evaluation of Chinese herbs.

Data Availability Statement:
The data in this study are available from the corresponding authors. These data are not publicly available due to the requirements of the funded research projects.

Conflicts of Interest:
The authors declare no conflict of interest.