Prediction of the Carbon Content of Six Tree Species from Visible-Near-Infrared Spectroscopy

This study aimed to measure the carbon content of tree species rapidly and accurately using visible and near-infrared (Vis-NIR) spectroscopy coupled with chemometric methods. Currently, the carbon content used to calculate forest-tree carbon storage in carbon sequestration studies is obtained in one of two ways. One is to measure carbon content in the laboratory (K2Cr2O7-H2SO4 oxidation method or elemental analyzer); the other is to directly use the IPCC (Intergovernmental Panel on Climate Change) default carbon content of 0.45 or 0.5. The former is destructive, time-consuming, and expensive, while the latter is subjective. Vis-NIR detection technology can avoid these shortcomings and determine carbon content rapidly. In this study, 96 increment core samples were collected from six tree species in Heilongjiang Province, China. The spectral data were preprocessed using seven methods, namely extended multiplicative scatter correction (EMSC), first derivative (1D), second derivative (2D), baseline correction, de-trend, orthogonal signal correction (OSC), and normalization, to eliminate baseline drift and noise and to enhance model quality. Linear models were established from the spectra using partial least squares (PLS) regression. We also compared the effects of the full spectrum and a reduced spectrum on model performance. The results showed that spectral data processed by 1D with the full spectrum yielded a better prediction model. The 1D method yielded the highest R²c of 0.92, an RMSEC (root-mean-square error of calibration) of 0.0056, an R²p of 0.99, an RMSEP (root-mean-square error of prediction) of 0.0020, and the highest RPD (residual prediction deviation) value of 8.9. The results demonstrate the feasibility of Vis-NIR spectroscopy coupled with chemometric methods as a simple, rapid, and non-destructive method for determining the carbon content of tree species.


Introduction
Increased CO2 levels result in global warming and frequent natural disasters, which represent a series of positive feedback effects [1][2][3]. As the largest ecosystem on land, forests play a vital role in the global carbon cycle and represent a vast carbon reservoir [4]. Forest carbon stock accounts for about 45-54% of the total C in terrestrial ecosystems [5]. At present, there are many methods for estimating carbon reserves in forest trees, but the biomass method remains the most widely used, direct, and accurate [6]. Forest-tree carbon reserves are usually calculated by multiplying biomass by the carbon content of the vegetation biomass. Therefore, the carbon content of tree species and biomass are the two critical factors used to quantify carbon storage in forest ecosystems. Past studies paid more attention to biomass estimation and largely ignored the carbon content.
However, near-infrared spectroscopy cannot be used directly for quantitative detection, so it needs to be combined with chemometrics to establish a prediction model. Partial least squares (PLS) is the most common method used for quantitative analysis in near-infrared spectroscopy. Hou R. et al. [37] established a prediction model of protein content using PLS on the near-infrared spectra of 58 barley samples; the prediction correlation coefficient was 0.901. Marcelo A. used NIR spectroscopy combined with PLS to determine the total protein content in raw coffee samples [38]. Using the PLS method, Kensuke K. et al. established a near-infrared prediction model of soil carbon content [39]. Many other studies have established near-infrared quantitative prediction models via PLS regression [40][41][42][43].
Moreover, the Vis-NIR spectra at 350-2500 nm are broad, with overlapping bands, and the relationship between spectra and wood properties is generally complex and nonspecific. Meanwhile, the instrument response, stray light, light scattering, sample state, and other factors also commonly affect the raw near-infrared spectral data. Near-infrared spectroscopy data inevitably contain noise, such as random noise, baseline drift, and instrument background noise, which affects their analysis. Therefore, the pretreatment of spectral data is very important. In a study by James et al., the OSC method improved prediction accuracy (R²CV = 0.79; without OSC, the best result was R²CV = 0.62) [44]. Yin et al. [45] determined the basic density of Tilia tuan using different pretreatments; the results showed that the first derivative was optimal in the range of 350-2500 nm. The correlation coefficient of the calibration set was 0.9648, the root mean square error of calibration was 0.0027, and the correlation coefficient of the validation set was 0.9432. Other models with good predictive performance use different spectral pretreatment methods, such as multiplicative scatter correction, standard normal variate, and second derivative [21,46-49].
In this study, we collected the spectra and carbon content of 96 trunk increment core samples from six tree species in two different ecological regions, applied the seven spectral pretreatment methods, and established near-infrared quantitative models through partial least squares regression. We also examined the effects of the full spectrum and a reduced spectrum on model performance. The study is intended to establish a better-performing quantitative model for predicting the carbon content of tree species. We sought to answer two main questions: (1) Which spectral pretreatment method combined with PLS best predicts the carbon content of tree species? (2) Between the full spectrum and the reduced spectrum, which is more suitable for modeling and predicting the carbon content of tree species?

Sample Preparation
The samples were collected from 96 trunk increment core samples from 6  Active clay was used to fill the holes left by the increment borer after sampling, to help the trees heal and to prevent infection by insects and microorganisms. The increment core samples were first dried naturally in the shade, and their spectra were collected. The samples were then dried in a drying oven at a constant temperature (85 °C) to a constant weight, crushed, and sieved for chemical analysis of their carbon content.

Carbon Content Based on Chemical Analysis
The increment core samples were dried to a constant weight in a drying oven at a constant temperature of 85 °C, until the difference between two weighings was less than 0.2 mg. The samples were then ground and sieved (0.25 mm mesh screen). The carbon content was determined via the potassium dichromate-sulfuric acid wet oxidation method (LY/T1237-1999) [50]. During the experiment, the oil bath temperature was controlled at 170-180 °C, and the time was precisely controlled at 5 min. Three replicates were determined for each sample, and the relative deviation of the three replicates was controlled within 2%. If the average relative error exceeded ±2%, we repeated the process once more and retained the average of the three determinations with the smallest difference as the carbon content of the sample. The carbon content results are shown in Table 1.

Spectra Collection
The Vis-NIR spectra were collected using a LabSpec Pro FR/A114260 (Analytical Spectral Devices, Inc., Boulder, CO, USA) from 350 to 2500 nm with a spectral resolution of 3 nm at 700 nm and 10 nm at 1400/2100 nm. To ensure the accuracy of the data, the experiment was carried out under dry conditions at room temperature. Before spectral collection, the spectrometer was preheated for 30 min and calibrated with a commercial white plate made from polytetrafluoroethylene (PTFE). This plate was nearly 100% reflective within the whole wavelength range (350-2500 nm). White references were collected every 15 min from the surface of the white plate. In total, 30 scans were acquired and automatically averaged into one spectrum. Each sample was then scanned three times with a glare probe (unit 6523 h.i. for the contact probe), and the average spectrum was taken as the raw spectrum [31]. The signals were recorded in reflectance mode (R) and transformed into absorbance using log(1/R).
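The reflectance-to-absorbance step described above can be sketched as follows. This is only an illustration: the study performed the conversion in ViewSpecPro, and the base-10 logarithm, the choice to average the replicate scans before converting, and the function names are assumptions made here.

```python
import numpy as np


def reflectance_to_absorbance(reflectance):
    """Convert reflectance R (0 < R) to apparent absorbance A = log10(1/R).

    Assumption: base-10 logarithm, the usual convention for log(1/R).
    """
    r = np.asarray(reflectance, dtype=float)
    if np.any(r <= 0):
        raise ValueError("reflectance must be strictly positive")
    return np.log10(1.0 / r)


def mean_absorbance(replicate_scans):
    """Average replicate reflectance scans (one scan per row), then convert.

    Assumption: averaging happens in reflectance space before conversion.
    """
    scans = np.asarray(replicate_scans, dtype=float)
    return reflectance_to_absorbance(scans.mean(axis=0))
```

For example, a reflectance of 0.1 corresponds to an absorbance of 1, and a reflectance of 0.01 to an absorbance of 2.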
The sample was truncated from the middle position of the wood to collect the spectrum. To obtain a more stable model, each wood sample was polished with 80 mesh sandpaper five times such that the surface roughness parameter, Ra, was close to 12.5 µm. According to the research of Jiang Z. H. et al. [51] and Evelize A. [29], the information contained in the near-infrared spectra of the three sections of wood samples can be used to characterize wood samples, and the information content is rich. According to the characteristics of the experimental materials, the cross-section of the wood sample was selected to collect the near-infrared spectrum (Figure 1a). The spectrum collected by this method was the spectrum of the tangential section of the wood at the DBH position of the tree (Figure 1b). The near-infrared spectra of each sample are shown in Figure 2.

Pre-Processing of Spectroscopic Data
Vis-NIR spectroscopy provides a great deal of information about tree species at wavelengths between 350 and 2500 nm. However, instrument responses, stray light, light scattering, the sample state, and other factors usually affect the raw near-infrared spectral data. Therefore, it is essential to preprocess the spectral data before modeling. The primary function of all preprocessing methods is to reduce unmodeled variability in the data and enhance the features sought in the spectra, which are often linear (simple) relations. However, choosing the most robust preprocessing technique can be challenging, because applying the wrong type of preprocessing, or a method that is too severe, can remove valuable information or even introduce unwanted variation. Before NIR modeling, the spectra were subjected to pretreatments, including EMSC, 1D, 2D, baseline correction, de-trend, OSC, and normalization, to eliminate baseline drift and noise and to enhance model quality. EMSC eliminates the multiplicative and additive effects in spectra and allows a better separation of physical light scattering effects from chemical light absorbance effects. Derivative preprocessing can effectively remove baseline and other background interference, separate overlapping peaks, and improve resolution and sensitivity. Baseline correction adjusts the spectral offset by shifting the data to the minimum point in the data. De-trend is a transformation that seeks to remove nonlinear trends in spectroscopic data. OSC can be used as a transformation method for building PLS regression models from spectral data; it removes extraneous variance from the x data, sometimes making the PLS model more accurate. Normalization is a family of transformations that are computed sample-wise. Its purpose is to "scale" samples so that all data are on approximately the same scale.
The optimal spectral pretreatment method was identified by establishing PLS models.
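Three of the sample-wise transformations described above (baseline offset, range normalization, and de-trend) can be sketched in NumPy. This is a minimal illustration and not the Unscrambler implementation used in the study; the function names are hypothetical, and the polynomial order of 2 and the division by the row range follow the settings reported in the PLS Model Development section.

```python
import numpy as np


def baseline_offset(spectra):
    """Shift each spectrum (row) so that its minimum value becomes zero."""
    x = np.asarray(spectra, dtype=float)
    return x - x.min(axis=1, keepdims=True)


def range_normalize(spectra):
    """Divide each spectrum (row) by its range (max value - min value).

    Note: a perfectly flat spectrum has zero range and is not handled here.
    """
    x = np.asarray(spectra, dtype=float)
    span = x.max(axis=1, keepdims=True) - x.min(axis=1, keepdims=True)
    return x / span


def detrend(spectra, wavelengths, order=2):
    """Subtract a least-squares polynomial trend from each spectrum (row)."""
    x = np.asarray(spectra, dtype=float)
    w = np.asarray(wavelengths, dtype=float)
    # Standardize the wavelength axis to keep the polynomial fit well conditioned.
    w_scaled = (w - w.mean()) / w.std()
    coeffs = np.polyfit(w_scaled, x.T, order)         # shape (order + 1, n_samples)
    trend = np.vander(w_scaled, order + 1) @ coeffs   # shape (n_wavelengths, n_samples)
    return x - trend.T
```

Each function takes a matrix with one spectrum per row, so the transforms can be chained, for example `range_normalize(baseline_offset(spectra))`.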


Model Development
Chemometrics is needed in qualitative and quantitative analyses. There are three commonly used modeling methods: partial least squares (PLS), principal component analysis (PCA), and artificial neural network (ANN). Among the modeling methods for near-infrared spectral analysis, the PLS method can effectively handle the large amount of information in near-infrared spectra. By adding new information step by step, it can eliminate the influence of external noise to a certain extent, improve data accuracy, and relate the independent-variable matrix to the dependent-variable matrix to obtain the best model. The PLS regression method offers an effective combination of multiple linear regression and principal component regression. PLS analysis has become a standard tool in chemometrics and is used widely in Vis/NIR spectral analysis. Based on these characteristics, the present study uses the partial least squares method to model and analyze the carbon content of the samples.
The dataset was split into a calibration set and a validation set (2:1) using the SPXY algorithm before model development. First, the data were modeled via PLS regression, and the number of latent variables was optimized with a five-fold cross-validation procedure. The maximum number of model components was set to 10, and the model was run and tested for each number of components from 1 to 10. The optimal number of components was the one with the lowest RMSECV. Finally, the model was re-calibrated with the optimal number of components and validated, and the R² and RMSE were calculated.
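The component-selection procedure described above can be sketched as follows. This is a plain NumPy PLS1 (NIPALS) written for illustration, not the MATLAB code used in the study; the function names, the random fold assignment, and the seed are assumptions.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model (NIPALS); returns (x_mean, y_mean, coef)."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)          # weight vector
        t = Xr @ w                      # scores
        tt = t @ t
        p = Xr.T @ t / tt               # X loadings
        c = yr @ t / tt                 # y loading
        Xr = Xr - np.outer(t, p)        # deflate X
        yr = yr - c * t                 # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef


def pls1_predict(model, X):
    """Predict the response for new spectra X from a fitted model."""
    x_mean, y_mean, coef = model
    return y_mean + (np.asarray(X, dtype=float) - x_mean) @ coef


def rmsecv_curve(X, y, max_components=10, n_folds=5, seed=0):
    """RMSECV for 1..max_components components using k-fold cross-validation."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    rmsecv = []
    for k in range(1, max_components + 1):
        sq_err = 0.0
        for test_idx in folds:
            train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
            model = pls1_fit(X[train_idx], y[train_idx], k)
            residual = pls1_predict(model, X[test_idx]) - y[test_idx]
            sq_err += np.sum(residual ** 2)
        rmsecv.append(np.sqrt(sq_err / len(y)))
    return np.array(rmsecv)
```

The optimal number of components is then `int(np.argmin(rmsecv_curve(X_cal, y_cal))) + 1`, after which `pls1_fit` is rerun on the whole calibration set with that number and applied to the validation set.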

Model Evaluation
Model quality was evaluated using the R square of calibration (R²c), the R square of prediction (R²p), the root mean square error of calibration (RMSEC), the root mean square error of cross-validation (RMSECV), the root mean square error of prediction (RMSEP), and the residual prediction deviation (RPD). The R square describes the linear correlation between the predicted values and the measured values. The higher the R²p and the closer the R²c is to 1, the greater the correlation between the predicted and actual values, and the more robust the model. The RMSEC, RMSECV, and RMSEP were used to evaluate the feasibility of the calibration model. The lower the RMSEP is and the closer it is to the RMSEC, the stronger the predictive ability and robustness of the calibration model. RPD is a measure of a model's ability to predict a constituent. Values between 2.0 and 2.5 indicate approximate quantitative predictions, while values between 2.5 and 3.0 and above 3.0 indicate predictions that can be considered good and excellent, respectively [35,52]. In the equations for these criteria, y_i represents the measured carbon content; ŷ_i and ȳ are the predicted value and the mean of y_i, respectively; and n is the number of samples. When n is the number of samples in the calibration set, the coefficient of determination and the root mean square error are referred to as R²c and RMSEC, respectively. When n is the number of samples in the validation set, they are referred to as R²p and RMSEP, respectively. SD is the standard deviation.
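The evaluation criteria can be sketched with the definitions most commonly used in NIR calibration work. The equations themselves are not reproduced in this text, so the forms below are assumptions: R² = 1 − Σ(y_i − ŷ_i)² / Σ(y_i − ȳ)², RMSE = sqrt(Σ(y_i − ŷ_i)² / n), and RPD = SD / RMSE, with SD taken as the sample standard deviation of the measured values.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def rpd(y_true, y_pred):
    """Residual prediction deviation: SD of measured values / RMSE.

    Assumption: SD is the sample standard deviation (ddof=1).
    """
    sd = np.std(np.asarray(y_true, dtype=float), ddof=1)
    return float(sd / rmse(y_true, y_pred))
```

Applied to the calibration set, these functions give R²c and RMSEC; applied to the validation set, they give R²p, RMSEP, and RPD. Under this definition of RPD, the reported RPD of 8.9 and RMSEP of 0.0020 would correspond to a validation-set SD of roughly 0.018.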

Software
Statistical data analysis was completed in IBM SPSS Statistics 26.0 (IBM, Armonk, NY, USA). Transformation of the collected reflectance spectra into absorbance was performed in ViewSpecPro (ASD Inc., Boulder, CO, USA). All spectral preprocessing was performed using the Unscrambler® X v10.4 software (CAMO Software Inc., Woodbridge, NJ, USA). The PLS modeling, SPXY algorithm, and figures were implemented in MATLAB R2016a (MathWorks, Natick, MA, USA).

Near-Infrared Spectral Features
Figure 2 shows that the six tree species had similar absorbance patterns, so one spectrum was randomly selected for spectral analysis. The Vis-NIR spectrum (350-2500 nm) of an increment core sample is shown in Figure 3. Prominent absorption peaks can be seen at 1171, 1418, 1752, 1882, 2050, and 2225 nm. The peak at approximately 1171 nm was attributed to the second overtone of symmetric and anti-symmetric C-H (-CH, -CH2, -CH3) stretching vibrations. The peak at approximately 1418 nm indicates a combination of C-H stretching and bending vibrations. The same assignments for these two peaks were made in the experiment of Julia K. [40]. The peak at 1752 nm corresponds to the first overtone of C-H stretching vibrations and lies within the 1726-1761 nm range mentioned in [40]. The sharp peak at around 1882 nm was associated with the first overtone of O-H and C-H stretching vibrations. The peak at 2050 nm indicates a combination of N-H in-plane bending and C-H stretching from protein compounds. The peak at around 2225 nm might result from a combination of C-H and C=O stretching vibrations [42]. These peaks are mainly related to carbohydrates, lipids, and protein macromolecular organic matter [53,54]. These characteristic carbon-related peaks suggest the potential use of NIR spectroscopy to predict the carbon content of tree species.


The Selection of Sample Sets
In the process of near-infrared analysis, it is generally considered that more than 50 samples can be used for modeling and analysis. A. Vergnoux et al. used only 55 soil samples to establish a near-infrared model for predicting the properties of soil organic carbon [55]. Hou R. et al. used 58 samples to establish a near-infrared model of barley protein, and the prediction results were good [37]. Marcelo A. and Helena P. established near-infrared models that used sample sizes of 53 and 60, respectively [38,41]. Thus, 96 increment core samples were used in this study to establish a near-infrared prediction model, and the sample size was reasonable.
Random sampling, the Kennard-Stone (KS) algorithm, and the SPXY algorithm are the most common methods used to divide sample sets. The SPXY algorithm, which is based on the KS algorithm, is more commonly used than the first two [56,57]. In this paper, the SPXY algorithm is used to divide the sample set by taking the measured carbon content as the Y variable and the spectral data as the X variable. The distance between samples is calculated from both variables simultaneously to ensure maximum representation of the sample distribution, effectively covering the multi-dimensional vector space, increasing the difference and representativeness between samples, and improving the stability of the model. The 96 samples were divided into a calibration set of 64 samples and a validation set of 32 samples using the SPXY algorithm, at a ratio of 2:1. The statistical results are shown in Table 2. In Table 2, the carbon content of the calibration set covers the carbon content range of the validation set, and the coefficient of variation is less than 5%. Moreover, the standard deviation is very low, indicating that the data are relatively stable. The reasonable division of the sample set helped to establish a robust prediction model.
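The SPXY idea of selecting calibration samples by a joint X-Y distance can be sketched as below. This is an illustrative NumPy version, not the MATLAB code used in the study; it assumes Euclidean distances in X and Y, each normalized by its maximum, followed by Kennard-Stone max-min selection, and the function names are hypothetical.

```python
import numpy as np


def _pairwise_dist(a):
    """Euclidean distance matrix between the rows of a."""
    sq = np.sum(a * a, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (a @ a.T)
    return np.sqrt(np.maximum(d2, 0.0))   # clip tiny negatives from round-off


def spxy_split(X, y, n_calibration):
    """Return (calibration_indices, validation_indices) chosen by SPXY."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(len(X), -1)
    dx, dy = _pairwise_dist(X), _pairwise_dist(y)
    d = dx / dx.max() + dy / dy.max()      # joint, normalized X-Y distance
    # Start with the two samples that are farthest apart.
    first, second = np.unravel_index(np.argmax(d), d.shape)
    selected = [int(first), int(second)]
    remaining = [i for i in range(len(X)) if i not in selected]
    # Repeatedly add the sample farthest from its nearest selected sample.
    while len(selected) < n_calibration:
        nearest = d[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(nearest))]
        selected.append(pick)
        remaining.remove(pick)
    return np.array(selected), np.array(remaining)
```

With 96 samples, `spxy_split(X, y, 64)` reproduces the 2:1 proportion used here: 64 calibration indices and 32 validation indices.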

PLS Model Development
Due to light scattering and the different effective path lengths of solid samples, the NIR spectral data inevitably contained some unwanted variation or noise. To improve model development and accuracy, it was necessary to preprocess the data correctly to reduce such unwanted variation [21,58]. In this study, we applied two important spectral pretreatment methods that are not commonly used. First, we chose the EMSC method rather than MSC. EMSC is an extension of conventional MSC and is not limited to removing multiplicative and additive effects from spectra. This extended method allows for better separation of physical light scattering effects from chemical light absorbance effects by including wavelength-dependent effects or a priori information in the modeling [59]. Second, we applied the OSC method, which can be used as a transformation method for building PLS regression models from spectral data. This method removes extraneous variance from the x data, sometimes making the PLS model more accurate. Because OSC depends on the Y values, it requires a matrix of Y values, which must be accurate [44]. We also compared several common spectral pretreatment methods. Derivatives were computed as gap derivatives with a gap size of 27 points. Normalization was performed using range normalization, in which each row was divided by its range (max value − min value). Baseline correction combined a linear baseline correction (run first) with a subsequent baseline offset. Detrending is a type of transformation that seeks to remove nonlinear trends in spectroscopic data; here, we applied a polynomial of order 2 to the data. Table 3 summarizes the PLS regression models for carbon content under all the above spectral pretreatment methods. Parameters including R², RMSE, and RPD were used to evaluate model robustness.
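A gap derivative can be sketched as a difference between points separated by a fixed gap. The definition below is one simple form and is an assumption: the exact gap-derivative formula implemented in The Unscrambler may differ (for example in how the gap is counted or centered), and the function name is hypothetical.

```python
import numpy as np


def gap_derivative(spectra, gap=27, order=1):
    """Gap-difference derivative of each spectrum (row).

    Each pass computes (x[i + gap] - x[i]) / gap, so every pass shortens
    the spectrum by `gap` points. order=1 gives a first derivative (1D),
    order=2 a second derivative (2D).
    """
    x = np.asarray(spectra, dtype=float)
    for _ in range(order):
        x = (x[:, gap:] - x[:, :-gap]) / gap
    return x
```

Under this definition, a constant offset added to a spectrum disappears after the first derivative, and a linear slope disappears after the second, which matches the baseline-removal role attributed to derivative pretreatments.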
The results for the preprocessed spectra were improved compared with those for the raw spectra. This is consistent with previous studies, in which derivation [45,60,61], normalization, EMSC [59], and OSC [37] of the spectral data achieved better model performance than the raw spectra. Additionally, 1D offered better model performance than 2D and was much better than the other methods, yielding an R²c of 0.92, R²p of 0.99, and RPD of 8.9. This result is superior to that of the previous study by Steffen H. (R²c = 0.89, R²p = 0.79, and RPD = 2.17) [33]. It is generally agreed that an NIR model with an RPD value above 3.0 indicates an excellent prediction [28,53]. As a rule, after the different spectral pretreatments, the performance of the PLS model increased with the optimal principal-component number (see the corresponding column in Table 3), except under baseline correction and 1D. For PLS regression models, it is vital to determine the optimal number of principal components. Though an increase in the number of factors often leads to higher R² values, the model is then likely to over-fit and be unsuitable for predicting future unknown samples [62,63]. Table 3 shows that the optimal principal-component number differs somewhat between pretreatment methods. In this study, the number of latent variables in the PLS regression was optimized with a five-fold cross-validation procedure, and the optimal number of components was the one with the lowest RMSECV. The results are shown in Figure 4.
Figure 4 shows that almost all the curves first decrease and then increase. Therefore, for each preprocessing method, we must find the optimal principal-component number with the lowest RMSECV value. The optimal principal-component numbers of the raw spectrum, OSC, normalization, EMSC, de-trend, 2D, 1D, and baseline correction are 2, 3, 5, 4, 3, 10, 8, and 4, respectively. We built a PLS regression model for each pretreatment using the preprocessed data and the corresponding optimal principal-component number. The results showed that 1D and 2D offer lower RMSEC and RMSEP and higher R²c, R²p, and RPD values than the other pretreatment methods.
The raw spectra and the spectra processed by the seven methods are plotted in Figure 5. The 1D and 2D pretreated spectra showed more evident and sharper peaks than the raw spectra and the other methods at approximately 1400, 1800, and 2200 nm. Derivation of NIR data can eliminate baseline drift and reduce the influence of background interference [61]. Xavier H. argued that first derivatives are used to remove baselines and second derivatives to remove slopes [64]. Tian W.F. also found that the characteristic spectral peak after 1D pretreatment was sharper than in the raw spectrum [62]. This could explain why the derivative pretreatments of the spectra improved model performance.
Although the information in the near-infrared spectrum after 2D pretreatment was basically the same as after 1D pretreatment, the relative intensity of the spectral peaks was higher for 1D than for 2D, especially at 1882 nm. This may reduce white noise, such as background noise, when constructing a near-infrared prediction model, thereby improving the model's overall accuracy. It also explains why the modeling performance was higher for 1D with an optimal principal-component number of 8 than for 2D with an optimal principal-component number of 10. Figure 6a,b shows that, for the raw spectral model without pretreatment, the sample points in both the calibration and validation sets are scattered and far from the 1:1 line, and the fitted curve deviates from the 1:1 line. After 1D preprocessing, however, the data points, except for one point of obvious deviation, are all closer to the 1:1 line. Moreover, the fitted curve of the 1D pretreatment data is close to the 1:1 line, and the fitted curve of the validation set almost coincides with it. Comparing these two graphs, we can intuitively see that the spectral model performance after 1D pretreatment is much better than that of the raw spectrum.

Reduced Spectra Model
Near 350 and 2500 nm, the near-infrared detector reaches the edge of its range, which results in large noise and low information intensity relative to the rest of the spectral region [62,65]. The high-noise bands are mainly concentrated in the 350-400 and 2350-2500 nm bands [45], and the obvious peaks of the raw and processed spectra appear before 2350 nm. This section therefore studies a near-infrared prediction model of carbon content in the 400-2350 nm band. As in the full-spectrum modeling process, seven different spectral preprocessing methods were applied to the reduced spectrum. The PLS modeling results of the reduced spectrum are shown in Table 4.
Table 4. PLS regression models for the carbon content of tree species based on reduced spectra.
As shown in Table 4, the model with reduced spectra demonstrated better performance than the full-spectrum model for the raw spectrum. However, the R²c only improved by 7%. For the reduced spectra, except for OSC, which achieved better prediction performance than the raw data, the preprocessed spectral models were worse than the original spectral models. The R²c values of all models were lower than 0.7, and only the R²p of OSC was higher than that of the raw-spectrum model. Comparing Table 3 with Table 4 shows that the performance of the model after full-spectrum pretreatment was superior to that of all pretreated and non-pretreated reduced-spectrum models. These results illustrate that reduced spectra may not improve the prediction model's performance in determining the carbon content of tree species. The reason for this result is that, although the noise is larger after 2350 nm, there are still C-related peaks from the C-H combination bands for lipid and protein absorption at 2350-2500 nm [53,66,67]. These C-H structures affect the accuracy of the carbon content determined by the NIR model. Some researchers noted that the 2290-2400 nm wavelength region is associated with organic matter [17,39].
Tian W.F.'s study highlighted that, for raw and 1D data, the reduced-spectrum (1333-2222 nm) model achieved better overall performance than the full-range (1000-2500 nm) model. However, for other types of preprocessing, the full-spectrum model was better than the corresponding reduced-spectrum model [62]. This result means that narrowing the spectrum will not necessarily improve regression performance.
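For illustration, restricting a spectrum to the 400-2350 nm band amounts to a simple wavelength mask. The sketch below assumes a 1 nm sampling grid over 350-2500 nm and inclusive band limits; the grid, the placeholder values, and the name `select_band` are hypothetical and are not taken from the study.

```python
def select_band(wavelengths, spectrum, low=400, high=2350):
    """Keep only the points whose wavelength lies in [low, high] (inclusive)."""
    kept = [(w, v) for w, v in zip(wavelengths, spectrum) if low <= w <= high]
    kept_wavelengths = [w for w, _ in kept]
    kept_values = [v for _, v in kept]
    return kept_wavelengths, kept_values


# Hypothetical 1 nm grid spanning the full 350-2500 nm range.
wavelengths = list(range(350, 2501))
spectrum = [0.0] * len(wavelengths)  # placeholder values, not measured data

reduced_wavelengths, reduced_values = select_band(wavelengths, spectrum)
```

Under these assumptions the mask drops 200 of the 2151 points, including the entire 2350-2500 nm region where the C-H combination bands lie, which is the information loss discussed above.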
In summary, taking six tree species as the research object, the 96 samples were divided into a calibration set of 64 and a validation set of 32 (a ratio of 2:1) using the SPXY algorithm. The NIR prediction model for carbon content was established via PLS regression, and seven different spectral pretreatment methods were used to optimize the model. The performance of the full-spectrum and reduced-spectrum models was also compared. The results show that, for the full spectrum, the pretreatment methods can improve the accuracy of the model to different degrees. However, the reduced spectra could not improve the performance of the model in predicting the carbon content of tree species. The best-performing model used 1D pretreatment of the full spectrum. After 1D pretreatment, every performance index was better than with the other methods: the R²c was 0.92, the R²p was 0.99, and the RMSEC and RMSEP were 0.0056 and 0.0030, respectively. At the same time, the RPD reached 8.9, which shows that the prediction performance of the model is excellent. Although our research covered six tree species with a limited number of samples, we achieved higher prediction accuracy than other studies. Moreover, this method can overcome the limitation of using only one tree species to build a model to predict unknown samples. The best prediction model and the corresponding parameters are shown in Figure 7.
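The evaluation indices quoted above can be computed as in the following minimal sketch. It assumes the common conventions that R² is the coefficient of determination (one minus the ratio of residual to total sum of squares) and that RPD is the sample standard deviation of the reference values divided by the RMSE of prediction; the exact conventions used in this study are not stated in the text, and the example values are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed convention)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rpd(y_true, y_pred):
    """Sample standard deviation of the reference values divided by the RMSE."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    sd = math.sqrt(sum((t - mean_true) ** 2 for t in y_true) / (n - 1))
    return sd / rmse(y_true, y_pred)


# Hypothetical reference and predicted carbon contents (not data from this study).
reference = [0.44, 0.45, 0.46, 0.47]
predicted = [0.441, 0.449, 0.461, 0.469]

example_rmse = rmse(reference, predicted)
example_r2 = r_squared(reference, predicted)
example_rpd = rpd(reference, predicted)
```

Applying the same functions to the calibration set gives R²c and RMSEC, and applying them to the validation set gives R²p, RMSEP, and RPD.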

Comparison of Carbon Content of Tree Species
Here, we compared the carbon content of tree species predicted by the best near-infrared model with the global default value. In the calculation of forest carbon storage, the global default value of the carbon content of tree species is 0.5. The carbon content values of the tree species determined by the best spectral model are shown in Table 5. Our results showed that the average carbon content ranged from 0.4352 to 0.4745, which is lower than the global default value (0.5). The mean carbon content across species was 0.4517, in agreement with other multi-species studies [10-12,14]. The results of one-way ANOVA showed that the average carbon content differed significantly among species (F = 24.07, p < 0.001). Compared with the global default value, the deviation of the carbon content of the six tree species ranged from 5.4% to 14.9%.
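The reported deviation range is consistent with expressing the difference between the default and the measured carbon content as a percentage of the measured value. The sketch below checks this reading against the two extreme species means given above; the formula is an inference from the numbers, not a definition stated in the text, and the function name is hypothetical.

```python
DEFAULT_CARBON_CONTENT = 0.5  # global default value used in the comparison


def relative_deviation(measured, default=DEFAULT_CARBON_CONTENT):
    """Deviation of the default from the measured value, in percent of measured."""
    return (default - measured) / measured * 100.0


# Extreme species means reported in the text.
lowest_deviation = relative_deviation(0.4745)
highest_deviation = relative_deviation(0.4352)
```

Rounded to one decimal place, these two values reproduce the 5.4% and 14.9% limits quoted in the text.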
While sampling, we also investigated the biomass of a natural mixed secondary forest of larch and birch in Daxing'anling. The results showed that the above-ground biomass was 8.7 t ha⁻¹ for birch and 105.7 t ha⁻¹ for larch, giving a total above-ground biomass of 114.4 t ha⁻¹. Using 0.5 as the carbon content, we overestimated the carbon density of the above-ground tree layer of the stand by 5 t ha⁻¹ compared with the actual measurement, which corresponds to a 10% deviation in the estimate of the forest-tree carbon pool. If 0.45 is used as the carbon conversion coefficient, the carbon density is underestimated by 0.5 t ha⁻¹, a deviation of 1%. Although this difference is small for this stand, it may be much larger for other stands (such as a pure forest). Therefore, we believe that the measured carbon content of tree species is vital for an accurate estimate of the forest carbon sink.
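The carbon density arithmetic behind this comparison can be sketched as biomass multiplied by a carbon conversion coefficient. The example uses only the biomass figures and the two default coefficients given in the text; the function name and the per-species dictionary layout are assumptions, and the species-specific carbon contents from Table 5 are deliberately not reproduced here.

```python
def carbon_density(biomass_by_species, carbon_content_by_species):
    """Sum of biomass (t/ha) times carbon content over all species."""
    return sum(biomass_by_species[s] * carbon_content_by_species[s]
               for s in biomass_by_species)


# Above-ground biomass reported for the mixed stand (t/ha).
biomass = {"birch": 8.7, "larch": 105.7}
total_biomass = sum(biomass.values())

# Carbon density under each uniform default coefficient (t/ha).
density_default_050 = carbon_density(biomass, {s: 0.5 for s in biomass})
density_default_045 = carbon_density(biomass, {s: 0.45 for s in biomass})
```

Passing measured, species-specific carbon contents in the second argument instead of a uniform default gives the stand-level estimate against which the two defaults are compared.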

Conclusions
Overall, our study showed that NIR diffuse reflectance spectroscopy, combined with PLS regression and an appropriate spectral pretreatment method, can be employed to detect and rapidly analyze the carbon content of tree species. Compared with traditional carbon content measurement methods, this method saves time, labor, and money and enables non-destructive, in situ sensing, thereby providing a novel method for determining carbon content in forest carbon sequestration measurements. In addition, this study confirmed that coupling Vis-NIR spectroscopy with a machine learning model offers a promising method for accelerating site investigations.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available because they are being used in a project application.