Rapid Determination of Soil Class Based on Visible-Near Infrared, Mid-Infrared Spectroscopy and Data Fusion

: Wise soil management requires detailed soil information, but conventional soil class mapping in a rather coarse spatial resolution cannot meet the demand for precision agriculture. With the advantages of non-destructiveness, rapid cost-e ﬃ ciency, and labor savings, the spectroscopic technique has proved its high potential for success in soil classiﬁcation. Previous studies mainly focused on predicting soil classes using a single sensor. In this study, we attempted to compare the predictive ability of visible near infrared (vis-NIR) spectra, mid-infrared (MIR) spectra, and their fused spectra for soil classiﬁcation. A total of 146 soil proﬁles were collected from Zhejiang, China, and the soil properties and spectra were measured by their genetic horizons. Along with easy-to-measure auxiliary soil information (soil organic matter, soil texture, color and pH), four spectral data, including vis-NIR, MIR, their simple combination (vis-NIR-MIR), and outer product analysis (OPA) fused spectra, were used for soil classiﬁcation using a multiple objectives mixed support vector machine model. The independent validation results showed that the classiﬁcation model using MIR (accuracy of 64.5%) was slightly better than that using vis-NIR (accuracy of 64.2%). The predictive model built on vis-NIR-MIR did not improve the classiﬁcation accuracy, having the lowest accuracy of 61.1%, which likely resulted from an over-ﬁtting problem. The model based on OPA fused spectra performed best with an accuracy of 68.4%. Our results prove the potential of fusing vis-NIR and MIR using OPA for improving prediction ability for soil classiﬁcation. D.X. administration, Z.S.; acquisition, Z.S.


Introduction
It is well known that soils have high spatial heterogeneity with different intrinsic and morphological characteristics. Therefore, it is necessary to have a good understanding of soil on a local scale for better soil management. In practice, soil class is a good indicator for characterizing soil information and provides a basis for subsequent land management, land resource evaluation, crop planting, and fertilization [1].
In addition, the sum of Halosols, Gleyosols and Isohumosols only represents less than 4% of the study area.
A total of 146 soil profiles were collected from Zhejiang Province between 2009 and 2012, resulting in 571 soil samples from each genetic horizon of these soil profiles. All soil samples were air-dried, ground and sieved to less than 2 mm. The samples were then sent to the laboratory for measurement of the physical and chemical properties as well as the spectral data (vis-NIR and MIR). The sample positions are shown in Figure 1. The soil classification for each soil profile was determined by an experienced soil judge based on the Chinese Soil Taxonomy (CST) [47].
Remote Sens. 2020, 12, x FOR PEER REVIEW 3 of 15 mainly occur in the highlands. In addition, the sum of Halosols, Gleyosols and Isohumosols only represents less than 4% of the study area. A total of 146 soil profiles were collected from Zhejiang Province between 2009 and 2012, resulting in 571 soil samples from each genetic horizon of these soil profiles. All soil samples were air-dried, ground and sieved to less than 2 mm. The samples were then sent to the laboratory for measurement of the physical and chemical properties as well as the spectral data (vis-NIR and MIR). The sample positions are shown in Figure 1. The soil classification for each soil profile was determined by an experienced soil judge based on the Chinese Soil Taxonomy (CST) [47].  Table 1. shows the number of samples of each soil class of each soil profile at the soil order and suborder levels. Table 2 shows soil orders in accordance with the CST and their relationships with the World Reference Base [48].   Table 1. shows the number of samples of each soil class of each soil profile at the soil order and suborder levels. Table 2 shows soil orders in accordance with the CST and their relationships with the World Reference Base [48].

Chemical Analysis
In this study, we selected soil organic matter, texture, moist color and pH as auxiliary soil information to build the classification model. Organic matter was measured through the H 2 SO 4 -K 2 Cr 2 O 7 oxidation method. Texture was measured through a pipette method. Soil pH was measured in 1:1 soil/water suspension, with the potentiometric method [49]. Soil moist color was recorded by a Munsell soil system [50]. In addition, the colors recorded by the Munsell color system were hard to apply to the classification algorithm. Thus, we transformed soil color data to RGB using the aqp package [51] in R 3.5.3. [52].

Spectroscopic Measurement and Pre-Processing
The vis-NIR spectral data was obtained with a FieldSpec®ProFR vis-NIR spectrometer (Analytical Spectral Devices, Boulder, CO, USA). The spectral range of the instrument is from 350 to 2500 nm. The instrument has a spectral resolution of 3 nm between 350 and 1000 nm and a spectral resolution of 10 nm between 1000 and 2500 nm. The resampling resolution of the spectra is 1 nm. For each measurement, a Spectralon®panel with 99% reflectance was used as the standard to calibrate the spectrometer. For every soil sample, we measured the vis-NIR spectra at three random positions with 10 internal replicates. Afterwards, we averaged the total 30 spectral data as the final spectra for each soil sample.
The MIR spectral data was obtained using the Agilent 4300 Handheld FTIR (Fourier transform infra-red) (Agilent Technologies, Santa Clara, CA, USA). The spectral range of this instrument is from 4000 to 650 cm −1 . We used a DTGS (deuterated triglycine sulphate) detector with a spectral resolution of 4 cm −1 to measure the soil MIR spectra. The measurement and preparation methods of soil samples were the same as those used for vis-NIR.
For the vis-NIR spectra, due to the high signal-to-noise ratio, the spectral regions of 350 to 399 nm and 2451 to 2500 nm were removed. Then, the vis-NIR and MIR spectra were resampled to 10 nm and 6 cm −1 , respectively. After several comparisons of the spectral pre-processing methods, we adopted the Savitzky-Golay algorithm to smooth the spectra. A polynomial of order 2 and window size 11 were used, and continuum removal (CR) followed.

Outer Product Analysis (OPA)
Outer product analysis is a data fusion algorithm. By analyzing the variance of the intervariable data matrix, the OPA process can calculate the frequencies between these two spectral domains [53]. Therefore, OPA has the potential to emphasize the co-evolution of different spectral domains and to determine whether there is any hidden information between the vis-NIR spectra and the MIR spectra [54]. For instance, for n soil samples having vis-NIR and MIR spectra with x and y wavelengths, we could acquire the intensities of each domain to construct two data vectors. We can naturally infer that the data vectors from the vis-NIR domain are the x dimension, the data vectors from the MIR domain are the y dimension and the number of both kinds of data vectors is n. Subsequently, Remote Sens. 2020, 12, 1512 5 of 15 we multiply both intensities to construct n new data matrixes (x rows by y columns), which contain all possible products of the intensities from the two vectors. Then, this matrix will be unfolded to n (x × y) vectors, which can be applied to further statistical analysis. Moreover, these vectors can be concatenated to a product matrix, called the outer product matrix, with n rows and x × y columns. This method was implemented in R 3.5.3. Figure 2 shows a workflow of data fusion by OPA.
Remote Sens. 2020, 12, x FOR PEER REVIEW 5 of 15 which contain all possible products of the intensities from the two vectors. Then, this matrix will be unfolded to n (x × y) vectors, which can be applied to further statistical analysis. Moreover, these vectors can be concatenated to a product matrix, called the outer product matrix, with n rows and x × y columns. This method was implemented in R 3.5.3. Figure 2 shows a workflow of data fusion by OPA. Figure 2. Workflow of data fusion by outer product analysis (OPA) (revised after [27]).
For OPA analysis, the spectra of MIR (549 variables) were multiplied by the spectra of vis-NIR (216 variables), which resulted in a matrix with 113094 variables (549 × 216) for each soil sample. To overcome data redundancy and improve the computing efficiency, we used principal component analysis (PCA) to reduce the dimension of the fused data.

Model Construction
To ensure that all representative soil profiles can be completely covered in the calibration set, data partitioning was made at the soil suborder level. At the soil suborder level, a stratified random sampling method was used with a 2:1 ratio of calibration and validation samples. Nevertheless, few soil samples belonged to Orthic Anthrosol or Anthric Primosol. These soil profiles were allocated to the calibration dataset and none were allocated to the validation dataset because Gleyosol, Isohumosol and Halosol have an extremely small sample size. It was found in further study that if these profiles were modeled, poor results were obtained. After excluding these three soil orders, 95 calibration profiles and 45 validation profiles remained.
We reduced data dimensionality by PCA before modeling, as spectral data are highly collinear. To capture spectral information as much as possible, the principal components (PCs), which varied from 20 to 50, were considered as input representing the spectral information for model calibration. Table 3 shows the selection of PCs, corresponding contribution rate and model results. All the contribution rates are approximately 99%.  For OPA analysis, the spectra of MIR (549 variables) were multiplied by the spectra of vis-NIR (216 variables), which resulted in a matrix with 113094 variables (549 × 216) for each soil sample. To overcome data redundancy and improve the computing efficiency, we used principal component analysis (PCA) to reduce the dimension of the fused data.

Model Construction
To ensure that all representative soil profiles can be completely covered in the calibration set, data partitioning was made at the soil suborder level. At the soil suborder level, a stratified random sampling method was used with a 2:1 ratio of calibration and validation samples. Nevertheless, few soil samples belonged to Orthic Anthrosol or Anthric Primosol. These soil profiles were allocated to the calibration dataset and none were allocated to the validation dataset because Gleyosol, Isohumosol and Halosol have an extremely small sample size. It was found in further study that if these profiles were modeled, poor results were obtained. After excluding these three soil orders, 95 calibration profiles and 45 validation profiles remained.
We reduced data dimensionality by PCA before modeling, as spectral data are highly collinear. To capture spectral information as much as possible, the principal components (PCs), which varied from 20 to 50, were considered as input representing the spectral information for model calibration. Table 3 shows the selection of PCs, corresponding contribution rate and model results. All the contribution rates are approximately 99%.
SVM is a common machine learning algorithm for classification and regression. This algorithm shows excellent performance on text classification [55] and is mainly used as a binary classifier. Multi-classification tasks still need further improvement when using SVM [56]. In our study, 5 soil order levels required classification; therefore, we adopted multiple objectives mixed support vector classification (MOM-SVC) for model construction. MOM-SVC is a combination of SVM and the majority voting method. MOM-SVC was performed as follows: first, an SVM was created based on spectral data of each soil horizon; second, each profile horizon was predicted; third, each vote was extracted from each profile and added together; finally, each soil profile was classified according to the most votes [45]. Figure 3 illustrates the workflow of this study. SVM is a common machine learning algorithm for classification and regression. This algorithm shows excellent performance on text classification [55] and is mainly used as a binary classifier. Multi-classification tasks still need further improvement when using SVM [56]. In our study, 5 soil order levels required classification; therefore, we adopted multiple objectives mixed support vector classification (MOM-SVC) for model construction. MOM-SVC is a combination of SVM and the majority voting method. MOM-SVC was performed as follows: first, an SVM was created based on spectral data of each soil horizon; second, each profile horizon was predicted; third, each vote was extracted from each profile and added together; finally, each soil profile was classified according to the most votes [45]. Figure 3 illustrates the workflow of this study. A confusion matrix was used to determine classification accuracy based on the validation set. To assess the uncertainty associated with random sampling and model structure, we established 100 models based on the stratified random sampling mentioned before. Finally, the prediction results depicted by the confusion matrix were calculated by the average of the prediction result of 100 models. In addition, 90% accuracy confidence intervals (CIs90%) were calculated for the calibration and validation sets.
This model was implemented with the package e1071 in R 3.5.3. We used C-classification with a radial kernel function in the SVM model. Two important parameters in the SVM model, "gamma" A confusion matrix was used to determine classification accuracy based on the validation set. To assess the uncertainty associated with random sampling and model structure, we established 100 models based on the stratified random sampling mentioned before. Finally, the prediction results depicted by the confusion matrix were calculated by the average of the prediction result of 100 models. In addition, 90% accuracy confidence intervals (CIs90%) were calculated for the calibration and validation sets.
This model was implemented with the package e1071 in R 3.5.3. We used C-classification with a radial kernel function in the SVM model. Two important parameters in the SVM model, "gamma" and "cost," were optimized by 10-fold cross-validation where gamma was tuned by 0.0125, 0.025, 0.625, 0.125, 0.25 and 0.5 and cost was tuned by 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5 and 5.

Extracting Auxiliary Soil Information
Violin plots are used to depict the distribution status and probability density of datasets. Figure 4 presents the data distribution of mainly auxiliary soil information. Figure 4a shows that the distribution shape of Gleyosol is quite different from others because there was only one soil profile belonging to Gleyosol. To reduce the effect from soil orders with only a few profiles, we excluded the soil profiles belonging to Gleyosol, Isohumosol and Halosol in further modeling.

Extracting Auxiliary Soil Information
Violin plots are used to depict the distribution status and probability density of datasets. Figure  4 presents the data distribution of mainly auxiliary soil information. Figure 4a shows that the distribution shape of Gleyosol is quite different from others because there was only one soil profile belonging to Gleyosol. To reduce the effect from soil orders with only a few profiles, we excluded the soil profiles belonging to Gleyosol, Isohumosol and Halosol in further modeling.
Anthrosols had the highest median SOM contents of 13 g kg -1 due to their long cultivation history. In land management, farmers always plow, fertilize, and irrigate, which causes SOM accumulation. This land management also causes the highest median silt contents (509 g kg -1 ). Argosols had moderate median values with clay contents of 220 g kg -1 . It is obvious that Argosols had higher median clay contents than Anthrosols, second to Ferrosols (308 g kg -1 ). The main pedogenesis process of Argosols is clayization, which makes up the argillic horizon and clay pan. Thus, many clay minerals are accumulated in the soil profile. Ferrosols are formed by desilicification and fersiallitisation. In tropical and subtropical zones, the minerals deeply suffer from weathering, so clay particles with low activity will start to aggregate. The diagnosis horizon of Cambosols was the cambic horizon, which is less developed and demonstrates poor illuviation and poor clayization. Therefore, Cambosols showed the second highest median sand contents (407 g kg -1 ). Primosols had the highest sand content (691 g kg -1 ) because Primosols have no significant development, and diagnostic horizons cannot be observed. This profile had the lowest SOM content (6 g kg -1 ), clay content (70 g kg -1 ) and silt content (237 g kg -1 ).  Anthrosols had the highest median SOM contents of 13 g kg −1 due to their long cultivation history. In land management, farmers always plow, fertilize, and irrigate, which causes SOM accumulation. This land management also causes the highest median silt contents (509 g kg −1 ). Argosols had moderate median values with clay contents of 220 g kg −1 . It is obvious that Argosols had higher median clay contents than Anthrosols, second to Ferrosols (308 g kg −1 ). The main pedogenesis process of Argosols is clayization, which makes up the argillic horizon and clay pan. Thus, many clay minerals are accumulated in the soil profile. Ferrosols are formed by desilicification and fersiallitisation. In tropical and subtropical zones, the minerals deeply suffer from weathering, so clay particles with low activity will start to aggregate. The diagnosis horizon of Cambosols was the cambic horizon, which is less developed and demonstrates poor illuviation and poor clayization. Therefore, Cambosols showed the second highest median sand contents (407 g kg −1 ). Primosols had the highest sand content (691 g kg −1 ) Remote Sens. 2020, 12, 1512 8 of 15 because Primosols have no significant development, and diagnostic horizons cannot be observed. This profile had the lowest SOM content (6 g kg −1 ), clay content (70 g kg −1 ) and silt content (237 g kg −1 ).
Remote Sens. 2020, 12, x FOR PEER REVIEW 9 of 15 After data fusion, in calibration, the model based on simple spectral combination was correctly classified 100% of the time. In validation, an overall accuracy of 61.1% with CIs90% between 57.8% and 62.31% was obtained (Figure 7). Ferrosols were misclassified completely. Argosols had the second lowest accuracy of 11% (0%, 14%) and was most often misclassified as Cambosols, and the remaining samples were allocated to Anthrosols. Anthrosols showed the highest accuracy of 94% (94%, 94%). Cambosols presented relatively moderate accuracy of 60% (60%, 60%).  Figure 8 illustrates that the model based on outer product analysis achieved a total accuracy of 88.9% with CIs90% between 88.4% and 89.5% in calibration and 68.4% with CIs90% between 66.7 and 71.1% in validation. The accuracy of validation data using OPA performed the best among the four models. In calibration, all the Anthrosols were correctly classified. Both the Argosols and Cambosols had relatively high accuracies of 86% (86%, 86%) and 97% (97%, 97%), respectively. In validation, Ferrosols failed to be classified once again. In all the models, Ferrosols always achieved 0% and were always misclassified to Argosols. Except for Ferrosols, Argosols were the second misclassified class, of which the accuracy was only 14% (14%, 14%). Cambosols showed relatively high accuracy of 80% (80%, 80%) with a few samples occurring with Argosols and Anthrosols. Similar to the previous models, Anthrosols achieved the highest accuracy of 94% (94%, 94%).   Figure 8 illustrates that the model based on outer product analysis achieved a total accuracy of 88.9% with CIs90% between 88.4% and 89.5% in calibration and 68.4% with CIs90% between 66.7 and 71.1% in validation. The accuracy of validation data using OPA performed the best among the four models. In calibration, all the Anthrosols were correctly classified. Both the Argosols and Cambosols had relatively high accuracies of 86% (86%, 86%) and 97% (97%, 97%), respectively. In validation, Ferrosols failed to be classified once again. In all the models, Ferrosols always achieved 0% and were always misclassified to Argosols. Except for Ferrosols, Argosols were the second misclassified class, of which the accuracy was only 14% (14%, 14%). Cambosols showed relatively high accuracy of 80% (80%, 80%) with a few samples occurring with Argosols and Anthrosols. Similar to the previous models, Anthrosols achieved the highest accuracy of 94% (94%, 94%).
Remote Sens. 2020, 12, x FOR PEER REVIEW 9 of 15 After data fusion, in calibration, the model based on simple spectral combination was correctly classified 100% of the time. In validation, an overall accuracy of 61.1% with CIs90% between 57.8% and 62.31% was obtained (Figure 7). Ferrosols were misclassified completely. Argosols had the second lowest accuracy of 11% (0%, 14%) and was most often misclassified as Cambosols, and the remaining samples were allocated to Anthrosols. Anthrosols showed the highest accuracy of 94% (94%, 94%). Cambosols presented relatively moderate accuracy of 60% (60%, 60%).  Figure 8 illustrates that the model based on outer product analysis achieved a total accuracy of 88.9% with CIs90% between 88.4% and 89.5% in calibration and 68.4% with CIs90% between 66.7 and 71.1% in validation. The accuracy of validation data using OPA performed the best among the four models. In calibration, all the Anthrosols were correctly classified. Both the Argosols and Cambosols had relatively high accuracies of 86% (86%, 86%) and 97% (97%, 97%), respectively. In validation, Ferrosols failed to be classified once again. In all the models, Ferrosols always achieved 0% and were always misclassified to Argosols. Except for Ferrosols, Argosols were the second misclassified class, of which the accuracy was only 14% (14%, 14%). Cambosols showed relatively high accuracy of 80% (80%, 80%) with a few samples occurring with Argosols and Anthrosols. Similar to the previous models, Anthrosols achieved the highest accuracy of 94% (94%, 94%).

Discussion
Soil spectra are able to extract extensive information based on electrical and electromagnetic, optical and radiometric characteristics [33]. This spectra shows the absorption rate of radiation at molecular vibration frequencies. The vibration of these atom groups, such as C-H, N-H, C=O and so on, change the frequency [17]. Previous studies showed the sensitive wave bands of spectral absorption in vis-NIR or MIR for prediction [8]. For vis-NIR spectra, it can contain the information on water (1400, 1900 nm), kaolinite (1400, 2200 nm), illite (2200, 2340, 2445 nm), smectite (2200 nm), carbonate (2335 nm), iron oxides (400, 450, 500, 650, 900 nm) and organic matter (1100, 1600, 1700, 1800, 2000, 2200-2400 nm) [8,57]. For MIR, there are more corresponding sensitive bands than in the vis-NIR regions. For example, under the influence of strong basic group vibration, the sensitive wave bands of SOM in the MIR region (1600-1400, 1630, 1720, 1670, 1530, 2930-2850 cm −1 ) are more than those in the vis-NIR region [8,58]. On the other hand, overtones and combination bands in vis-NIR region will overlap, and in MIR regions, the peaks would be resolved [21]. Hence, vis-NIR spectra is more difficult to characterize than the MIR spectra [6]. Previous studies showed that the MIR spectra performs better than vis-NIR spectra for predicting some soil properties such as pH, SOM, sand content and so on [6,17,32,57,58]. In contrast, for other properties, such as soil organic carbon, pH, CEC, and total nitrogen, MIR did not perform better than vis-NIR because some useful information might be masked by the strong absorptance of soil minerals [8]. Consequently, it is necessary to combine vis-NIR spectra with MIR spectra for covering as much information as possible. Table 3 shows the selection of parameters and corresponding prediction accuracy. The results based on MIR spectra (64.5%) are slightly higher than those based on vis-NIR spectra (64.2%). This may occur because spectral modeling based on a single domain would lose soil information due to the heterogeneity of the soil profiles. Thus, we proposed data fusion for soil classification in this study. However, Figure 6 shows that the result based on simple combination was the worst (61.1%), which may be caused by over-fitting in the calibration step. Data fusion via OPA obtained the best classification accuracy (68.4%), though the contribution rate of PCs is lower than that of other models. The simple combination and OPA presents two different levels for data fusion [59]. The simple combination belongs to the low level, which just concatenates spectral data from both sensors. OPA belongs to the high level, which can fuse the features extracted from vis-NIR and MIR [12]. Here, the simple combination achieves worse accuracy than that based on single sensors. The other reason is that spectral data are highly multicollinear. When using both sensors, the spectral data become more redundant and present more noise, which results in decreasing the model stability [59]. Therefore, spectral pre-processing methods play an important role in model prediction when using the simple combination [60]. Although the improvement of classification accuracy by OPA is not that significant (3%), we prove its potential for soil classification with proper pre-processing and a calibration model.
The sample size may be one of the factors affecting classification accuracy. In previous studies, some scholars have further advanced the research that focuses on soil suborder. It is evident that at the soil suborder level, the classification accuracy is significantly decreased [2,45]. When a more detailed soil type is taken into consideration, the sample size is relatively smaller than before. In our study, the accuracy of the classification results is positively correlated with sample size. It can be seen intuitively that Anthrosols with a maximum sample size achieve the highest classification accuracy. Conversely, Ferrosols only contain a few samples. The prediction accuracy of Ferrosols is 0 in each process, obviously. Here, all Ferrosols are misclassified as Argosols, which probably results from data imbalance in the calibration set. That is, the number of Ferrosols is low in the calibration data and thus leads to a small contribution in the calibration model. As for the problem of sample imbalance, it may be solved by a minority oversampling algorithm such as the synthetic minority oversampling technique [61]. Above all, we suggest that sample size is one of the factors affecting the classification accuracy, and the sample size differences of each class lead to sample imbalance while modeling. The establishment of soil spectral libraries across scales may be a good solution to further improve accuracy of soil classification [11,62,63].
For classification of each soil order, it is necessary to find a general pattern in the misclassification. In calibration, the misclassification of Ferrosols is predominantly observed with Argosols. We have mentioned before that the main diagnostic characteristics of Argosols are argic horizon, kandic horizon, natric horizon, and clay pan. The main diagnostic characteristics of Ferrosols are allitization and ferritization. Both of these genesis processes lead to soil clay accumulation and strong base eluviation, which makes the soil texture similar. For Argosols, most of errors are re-allocated to Cambosols. The diagnostic characteristics of Cambosols are a cambic horizon, whose soil structure is preliminarily formed without significant clayification and illuviation. Comparing Cambosols with Argosols, there is almost no correlation between their diagnostic characteristics. We found that soil properties of different classes or horizons do not show significant differences, and the spectra curves of some soil classes also overlap [11,43], either of which might cause the misclassification. For Primosols, most of the errors occur with Cambosols. The genesis process may explain the error. Primosols are less profile developed and have no diagnostic horizon and characteristic. With a short history of development, the weathering and pedogenesis of both of profiles are weak. Moreover, comparing Anthrosols with Cambosols, both of them achieve high accuracy. The accuracy of Anthrosols reaches nearly 100%. Only a few of the samples are misclassified as Cambosols in the validation process. However, the diagnostic horizon and characteristics between them are totally different. As mentioned above, both have relatively large proportions in calibration data. The calibration model will learn more and cover more features. These results might also explain the reason why most misclassifications were classified as Anthrosols and Cambosols.
In this study, we tried to optimize the gamma and cost parameters to improve the model. We determined the ranges of gamma and cost by expert knowledge. Actually, the parameter selection ranges are infinite. Future studies could focus on finding a better solution for parameter selection (or grid search) out of the range determined in this study.
The soil profiles used in this study only cover 5 soil orders, while other soil orders are still missing. This is because soil order is the highest classification level, which ensures enough data for classification. With the limited sample size, there are no studies focusing on classifying soil group, soil subgroup and soil series. To build a robust classification model at these levels, more soil profile data are needed. Therefore, if the number of samples is large enough and covers as widely as possible, models will perform better, and the classification accuracy will be significantly improved. Although it is time and cost consuming to collect more soil profile data, we will continue to expand our soil database and to improve the model. With continuous data collection, the database and model can be updated constantly. It is worth noting that in the update process, the sample treatment process and analysis should be as consistent as possible [43].
Apart from sample size, model parameter and characteristics of soil profile itself, the limitations of the spectra itself is another factor influencing the accuracy of soil classification. As we know, the processes of pedogenesis and soil development are extremely complex. The processes are affected by climate, biology, parent material, topography, time, and man-made factors. In the process of soil formation, interaction with the surrounding environment will be involved. The interaction of the geological cycle and the biological cycle is the basis of soil genesis. Under the action of soil forming factors, a series of soil physical, chemical, and biological characteristics are developed, thus leading to different soil types. Therefore, most current adopted soil classification is based on the theory of soil genesis and soil characteristics. [64]. It is the same case in China that CST uses diagnostic horizons and diagnostic characteristics as the core of soil classification. In this study, we obtained the soil property data and spectral data quantitatively at each horizon. Although quantitative data can guarantee the objectivity of classification [50], these quantitative data cannot well reflect all the diagnostic horizons and diagnostic characteristics. For instance, the main diagnostic characteristics of Ferrosols and Argosols are ferritization and clay pans, respectively. Though previous studies indicated a high prediction accuracy for clay contents and iron oxide contents by vis-NIR and MIR [6,20,49], we cannot infer that vis-NIR and MIR are able to predict clay pans and ferritization. Moreover, even for these well predicted soil properties (e.g., SOC, clay), there is still a large prediction uncertainty. Considering the comprehensive framework involved in soil classification, classification error occurs if we only simply use these soil properties.
Meanwhile, soil spectral data may ignore the information relevant to soil genesis. In the study of soil genesis, whether the homology or heterogeneity of parent material is clear can help us better understand and distinguish the real causes of the change trend of soil properties. For the soil genesis process, discontinuity of parent material is universal. However, it is not easy to determine the discontinuity of parent material. When there are obvious differences in soil texture and structure in profile morphology, the discrimination of discontinuity of parent material is easier. However it is difficult to identify it if the profile features are uniform. Previous studies discussed some solutions, such as deep functions, uniformity value, and discriminant analysis [65]; while all of them have their limitations and need to be improved. In our study, although soil spectra is able to provide information related to soil properties, it cannot well capture all the relevant diagnostic characteristics. Overall, due to its limitations, soil spectroscopy cannot fully replace conventional soil classification, but it has a high potential to aid soil classification once the soil spectral libraries are well established. It also should be noted that the prediction of soil classification based on soil spectra still remains an academic research topic and more efforts should be pursued for its practical use in decision making for farmers and stakeholders.

Conclusions
We used data fusion for soil classification. There were four models constructed in our study as follows: (1)