Article

Exploring the Potential of vis-NIR Spectroscopy as a Covariate in Soil Organic Matter Mapping

1
Department of Environmental Management, Yuzhang Normal University, Nanchang 330103, China
2
ZJU-Hangzhou Global Scientific and Technological Innovation Center, Hangzhou 311200, China
3
Institute of Agricultural Remote Sensing and Information Technology Application, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou 310058, China
4
College of Land Resources and Environment, Jiangxi Agricultural University, Nanchang 330045, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2023, 15(6), 1617; https://doi.org/10.3390/rs15061617
Submission received: 26 February 2023 / Revised: 14 March 2023 / Accepted: 15 March 2023 / Published: 16 March 2023
(This article belongs to the Special Issue Remote Sensing for Soil Mapping and Monitoring)

Abstract

Robust soil organic matter (SOM) maps are required by farms, but generating them requires a large number of samples to be chemically analyzed, which is cost-prohibitive. Recent research has shown that visible and near-infrared (vis-NIR) reflectance spectroscopy is a fast, accurate, and cost-effective technique for estimating SOM. However, few studies have focused on using vis-NIR spectroscopy as a covariate to improve the accuracy of spatial modeling. In this study, our objective was to compare the mapping accuracy of spatial models using kriging methods with and without vis-NIR spectra as a covariate. We split the 261 samples into a calibration set (104) for building the spectral predictive model, a test set (131) for generating the vis-NIR augmented set from the predictions of the fitted spectral predictive model, and a validation set (26) for evaluating map accuracy. We used four datasets (235 samples each) for kriging: a laboratory-based dataset (Ld, observations from the calibration and test datasets), a vis-NIR augmented dataset (Au.p, observations from the calibration dataset and predictions for the test dataset), the laboratory-based dataset with vis-NIR spectra as the covariate (Ld.co), and the augmented dataset with vis-NIR spectra as the covariate (Au.p.co). The first one to seven accumulated principal components of the vis-NIR spectra were used as the covariates for Ld.co and Au.p.co. Map accuracy was evaluated on the validation set for the four datasets using kriging. The results indicated that adding vis-NIR spectra as covariates had great potential for improving map accuracy using kriging: much higher accuracies were observed for Ld.co (RMSE of 5.51 g kg−1) and Au.p.co (RMSE of 5.66 g kg−1) than for Ld (RMSE of 7.12 g kg−1) and Au.p (RMSE of 7.69 g kg−1), which did not use vis-NIR spectra as covariates. With a model performance similar to that of Ld.co, Au.p.co can reduce the cost of laboratory analysis for 60% of soil samples, demonstrating its advantage in cost-efficiency for spatial modeling of soil information. Therefore, we conclude that vis-NIR spectra can be used as a cost-effective technique to obtain augmented data to improve fine-resolution spatial mapping of soil information.

1. Introduction

Farmers, public policy managers, and environmental and agricultural scientists all need digital soil maps to inform appropriate decision-making for land management. Digital soil mapping (DSM) requires laboratory soil property measurements, which are costly and time-consuming. Furthermore, high-resolution DSM requires many representative soil samples, which adds to the cost [1,2]. Practitioners seek to reduce the number of samples, but this may degrade the accuracy of soil maps. Balancing budget and accuracy is a key issue for precision agriculture [3].
Visible and near-infrared (vis-NIR) spectroscopy has been used as an alternative to laboratory chemical analysis and, therefore, may solve this problem [4]. The vis-NIR technique has received much attention during the past three decades in soil surveys and assessment studies [5], and its benefits have been documented extensively [6,7,8]. Several important soil properties can be estimated from spectral scans of the samples, which is cheaper and faster than conventional laboratory methods. For this reason, vis-NIR can be used in DSM to address the trade-off between budget and sampling density [9,10,11].
A common method in DSM with vis-NIR spectra is to add spectrally predicted soil properties to the laboratory soil measurements, forming what is called the augmented data, to increase sampling density [1]. For example, Viscarra Rossel et al. [12] showed that augmented data could improve the accuracy of soil maps. DSM with augmented data is carried out by spatial interpolation, predominantly by kriging. Kriging requires the underlying random variable to be approximately normally distributed, but the properties estimated from vis-NIR spectral predictive models can be inaccurate and skewed [13]. Furthermore, spectrally predicted soil properties tend to smooth the variation compared with laboratory soil measurements [14]. Therefore, the accuracy of soil mapping with vis-NIR augmented data using kriging methods is frequently unsatisfactory [15].
Another common method in DSM with vis-NIR spectra is to use the values predicted by a nonlinear model (such as Cubist or Random Forest) built on the spectra and environmental covariates [16,17]. These studies demonstrate the usefulness of vis-NIR for mapping but do not quantify how much vis-NIR improves mapping accuracy.
Improving accuracy and reducing cost when using vis-NIR spectroscopy requires studies focused on methods that reduce the negative impacts of modeling error [2,5]. Obtaining better covariates that capture the soil formation factors could be useful in improving accuracy [2]. Given these open issues in mapping with vis-NIR spectroscopy, our study aimed to (i) demonstrate the potential of vis-NIR data for DSM at the field scale and (ii) compare the accuracy of maps based on laboratory measurements with that of maps based on a vis-NIR augmented dataset, with and without vis-NIR spectra as the covariate.
In this study, we used four kinds of data sources for mapping to explore the potential of vis-NIR spectra: (1) a laboratory-based dataset (Ld, observations from the calibration and test datasets); (2) laboratory-based data from the calibration dataset pooled with SOM predicted for the test dataset from its vis-NIR spectra using the vis-NIR calibration model, which is called the vis-NIR augmented data (Au.p); (3) the laboratory-based dataset with vis-NIR spectra from the total dataset as the covariate (Ld.co); and (4) the augmented dataset with vis-NIR spectra from the total dataset as the covariate (Au.p.co). We also analyzed the first one to seven accumulated principal components of the vis-NIR spectra, which were used as the covariates for Ld.co and Au.p.co. Finally, we compared four preprocessing methods for the vis-NIR spectra used in Au.p, Ld.co, and Au.p.co.

2. Materials and Methods

2.1. Study Area and Soil Sampling

The study area is located in Ji’an County, eastern Jiangxi Province (Figure 1), and covers 2470 ha at altitudes ranging from 50 to 60 m above sea level. It lies on flat terrain on both sides of the tributaries of the Ganjiang River. The predominant soil types developed from river alluvium. The area is used as high-quality prime farmland [18]. Soil texture varies from loamy sand to clay. We collected 261 samples on a regular grid of 300 × 300 m from the arable layer (0–20 cm) (Figure 1). In this study, we focused on soil organic matter (SOM), which has been documented to be accurately estimated by vis-NIR.

2.2. Measurement and Processing of vis-NIR Spectra

After soil samples were collected from the surface soil layer at each location, they were mixed into a composite sample and stored in a labeled plastic bag. All composite samples were transported to the laboratory, air-dried, ground, and passed through a 2 mm sieve. Each sample was divided into two portions using the quartering method, one for laboratory chemical analysis and the other for spectral measurements. For the dry spectral measurements, each soil sample was placed in a Petri dish with a diameter of 10 cm and a depth of 1.5 cm. Soil spectra were measured using the same ASD FieldSpec4 spectrometer (350–2500 nm) equipped with a contact probe (Malvern Panalytical Ltd., Malvern, UK). The spectra were taken at three arbitrarily selected locations on the soil surface. Ten spectra were recorded at each of the three sensing locations, and the thirty spectra were averaged into one spectrum representing the soil sample. The spectra in the range of 400 to 2450 nm were retained for subsequent analysis.
The vis-NIR reflectance spectra were transformed to apparent absorbance (log10(1/reflectance)). Then, we applied the Savitzky–Golay filter (window size = 11, polynomial order = 3, differentiation order = 1) (lg_sg) [19], standard normal variate (lg_snv) [20], multiplicative scatter correction + standard normal variate (msc_snv), and detrend normalization (lg_dt) to compare which preprocessing method was optimal. We used the 400–2450 nm spectral range to eliminate the high signal noise at the two ends of the spectrometer range. A total of 2051 spectral bands were used.
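As an illustration of the preprocessing chain, the following minimal sketch converts one reflectance spectrum to apparent absorbance and then applies the standard normal variate (the lg_snv path). The five-band spectrum and the function names are hypothetical, and the Savitzky–Golay, multiplicative scatter correction, and detrending steps are not reproduced here.

```python
import math
from statistics import mean, stdev


def to_absorbance(reflectance):
    """Apparent absorbance log10(1/R) for one spectrum of reflectance values in (0, 1]."""
    return [math.log10(1.0 / r) for r in reflectance]


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and std."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    return [(x - m) / s for x in spectrum]


# Hypothetical five-band reflectance spectrum, for illustration only.
reflectance = [0.10, 0.20, 0.25, 0.40, 0.50]
absorbance = to_absorbance(reflectance)
lg_snv = snv(absorbance)
```

After SNV, each spectrum has zero mean and unit standard deviation across its bands, which removes multiplicative scatter differences between samples before modeling.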
Samples were randomly split into calibration (104, 40%), validation (26, 10%), and test (131, 50%) sets.
During sample preparation, stones and plant residues were removed before the air-dried soil was ground and sieved to less than 2 mm.
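The random split can be sketched as follows. The set sizes come from the text (104 calibration and 26 validation samples out of 261, with the remainder forming the test set); the function name and the fixed seed are assumptions added for reproducibility, not the authors' procedure.

```python
import random


def split_samples(sample_ids, n_cal=104, n_val=26, seed=0):
    """Randomly split sample IDs into calibration, validation, and test sets."""
    ids = list(sample_ids)
    rng = random.Random(seed)  # the seed is an assumption, used only for repeatability
    rng.shuffle(ids)
    cal = ids[:n_cal]
    val = ids[n_cal:n_cal + n_val]
    test = ids[n_cal + n_val:]  # everything left over
    return cal, val, test


cal, val, test = split_samples(range(261))
print(len(cal), len(val), len(test))  # 104 26 131
```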

2.3. Spectroscopic Model and Augmented Data

We used partial least squares regression (PLSR) [21] as the spectroscopic model. PLSR is a linear regression model that is widely used for spectroscopic soil modeling. The optimum number of latent variables for PLSR was determined by the lowest root mean square error (RMSE) using leave-one-out cross-validation.
PLSR models were fitted on the calibration dataset, one for each of the four spectral preprocessing methods (see above), and used to predict SOM for the test dataset. The predicted SOM values were pooled with the measured SOM data from the calibration dataset to form the augmented data, resulting in four augmented datasets.
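The pooling step can be sketched as below. This is a minimal illustration under stated assumptions: the record layout (`x`, `y`, `som`, `spectrum`) and the `predict_som` callable are hypothetical stand-ins, and the constant dummy predictor replaces a fitted PLSR model purely so the example runs.

```python
def build_kriging_datasets(cal, test, predict_som):
    """Return (ld, au_p) as lists of (x, y, som) tuples.

    cal, test: lists of dicts with keys 'x', 'y', 'som' (measured), and 'spectrum'.
    predict_som: hypothetical callable standing in for the fitted spectral model.
    """
    measured_cal = [(s["x"], s["y"], s["som"]) for s in cal]
    # Ld: laboratory measurements for both calibration and test samples.
    ld = measured_cal + [(s["x"], s["y"], s["som"]) for s in test]
    # Au.p: measured calibration SOM pooled with spectral predictions for test samples.
    au_p = measured_cal + [(s["x"], s["y"], predict_som(s["spectrum"])) for s in test]
    return ld, au_p


# Toy records with made-up coordinates and SOM values (g/kg).
cal = [{"x": 0, "y": 0, "som": 20.0, "spectrum": [0.1, 0.2]}]
test = [{"x": 300, "y": 0, "som": 25.0, "spectrum": [0.2, 0.4]}]
ld, au_p = build_kriging_datasets(cal, test, predict_som=lambda spectrum: 24.0)
```

Both datasets cover the same locations; they differ only in whether the test-set SOM values are measured or predicted from spectra.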

2.4. Laboratory Analyses of Soil Organic Matter (SOM)

Soil organic carbon (SOC) content was measured using the H2SO4-K2Cr2O7 oxidation method at 180 °C for 5 min, according to the methods of the Institute of Soil Science of the Chinese Academy of Sciences [22]. The SOM content was obtained by multiplying SOC by the coefficient of 1.72, as suggested by Reference [23].
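The conversion from SOC to SOM is a single multiplication, shown here as a short sketch. The coefficient of 1.72 is the one stated in the text; the input value of 10 g/kg is an arbitrary illustration.

```python
SOC_TO_SOM = 1.72  # conversion coefficient stated in the text


def soc_to_som(soc_g_per_kg):
    """Convert soil organic carbon (g/kg) to soil organic matter (g/kg)."""
    return SOC_TO_SOM * soc_g_per_kg


print(round(soc_to_som(10.0), 2))  # 17.2
```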

2.5. Selection of Covariates

We analyzed the correlation of SOM with the terrain attributes (terrain roughness, vector terrain roughness, slope of aspect, slope of slope), but the correlations were quite low, so we did not consider them further and used only the spectra as the covariate. When using vis-NIR spectroscopy as the covariate, we selected those PCs of the scaled spectra that were significantly correlated with SOM (p < 0.01). PCs whose correlations with SOM were negative were multiplied by −1, which ensured that the correlation between the PCs and SOM was positive. The spectra used as the covariate were those of all samples under the four spectral preprocessing methods (Figure 2).
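The sign-orientation step for the principal component scores can be sketched as follows. The score and SOM values are made up, the function names are hypothetical, and the significance test (p < 0.01) is deliberately left out of this sketch; only the Pearson correlation and the sign flip are shown.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def orient_scores(pc_scores, som):
    """Flip the sign of a PC's scores if they correlate negatively with SOM."""
    r = pearson_r(pc_scores, som)
    if r < 0:
        return [-s for s in pc_scores], -r
    return list(pc_scores), r


# Made-up SOM values (g/kg) and scores of one principal component.
som = [12.0, 15.0, 19.0, 24.0, 30.0]
pc1 = [2.1, 1.0, 0.2, -1.1, -2.2]
oriented, r = orient_scores(pc1, som)
```

Flipping the sign of a principal component does not change the information it carries, so the orientation only makes the covariate easier to interpret.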

2.6. Spatial Modeling, Performance Estimation, and SOM Mapping

In this study, the soil property (i.e., SOM) at location s was defined as:
$$Y_i(s) = u_i + Z_i(s) + \varepsilon(s)$$
where $u_i$ is the mean, $Z_i(s)$ is a normally distributed, spatially autocorrelated random component with zero mean and unit variance, and $\varepsilon(s)$ is an uncorrelated random error. The variogram is defined as:
$$\gamma(h) = \frac{1}{2}\mathrm{var}\left[\varepsilon(u) - \varepsilon(u+h)\right] = \frac{1}{2}E\left[\left\{\varepsilon(u) - \varepsilon(u+h)\right\}^{2}\right]$$
where ε(u) and ε(u + h) are random variables at locations u and u + h separated by the vector h, and E denotes the expectation.
The semivariance was modeled with a spherical function, defined as:
$$\gamma(h) = \begin{cases} \tau + \sigma\left[\dfrac{3h}{2\alpha} - \dfrac{1}{2}\left(\dfrac{h}{\alpha}\right)^{3}\right], & 0 < h \le \alpha \\ \tau + \sigma, & h > \alpha \\ 0, & h = 0 \end{cases}$$
where τ is the nugget variance, which can be attributed to measurement errors or spatial sources of variation within the range of the sampling interval, and α is the range of spatial dependence or spatial autocorrelation; γ(h) is the semivariance at lag h and σ is the a priori variance of the autocorrelated process.
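The standard spherical semivariance model with nugget τ, a priori variance σ, and range α can be written as a small function. The parameter values in the example are arbitrary and are not the fitted values from this study.

```python
def spherical_semivariance(h, nugget, partial_sill, range_):
    """Standard spherical semivariogram model.

    nugget = tau, partial_sill = sigma, range_ = alpha in the notation of the text.
    """
    if h == 0:
        return 0.0
    if h > range_:
        return nugget + partial_sill  # the sill is reached beyond the range
    ratio = h / range_
    return nugget + partial_sill * (1.5 * ratio - 0.5 * ratio ** 3)


# Arbitrary illustrative parameters: nugget 2, partial sill 8, range 600 m.
print(spherical_semivariance(300.0, 2.0, 8.0, 600.0))  # 7.5
print(spherical_semivariance(600.0, 2.0, 8.0, 600.0))  # 10.0
```

At h = α the bracketed term equals 1, so the curve joins the sill τ + σ continuously.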
The parameters of the autocorrelation model (σ, τ, and α), the mean u, and the realizations of Z(s) are unknown and must be estimated from the data. The values of σ, τ, and α were determined by fitting a model to the points of the empirical semivariogram. Predictions at unvisited locations were made by global block kriging on a 10 m × 10 m grid.
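The empirical semivariogram to which such a model is fitted can be computed by grouping sample pairs into distance bins. The sketch below is a minimal version that assumes bins centred on multiples of the lag width, which suits a regular grid; the three points and their values are invented, and the function name is hypothetical.

```python
import math
from collections import defaultdict


def empirical_semivariogram(points, lag_width, max_lag):
    """Empirical semivariance per lag bin.

    points: list of (x, y, value). Bins are centred on multiples of lag_width.
    Returns {bin_centre: semivariance}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(points)):
        xi, yi, vi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, vj = points[j]
            h = math.hypot(xi - xj, yi - yj)
            k = round(h / lag_width)  # index of the nearest lag centre
            if k == 0 or h > max_lag:
                continue  # pairs closer than half a lag are ignored in this sketch
            sums[k] += (vi - vj) ** 2
            counts[k] += 1
    return {k * lag_width: sums[k] / (2 * counts[k]) for k in sorted(counts)}


# Three invented points on a 300 m transect with SOM-like values.
points = [(0.0, 0.0, 10.0), (300.0, 0.0, 12.0), (600.0, 0.0, 16.0)]
gamma = empirical_semivariogram(points, lag_width=300.0, max_lag=700.0)
print(gamma)  # {300.0: 5.0, 600.0: 18.0}
```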
To show how using the spectra as the covariate improved prediction accuracy, we first calculated the results from the augmented data, in which SOM was predicted using each of the four spectral preprocessing methods. Then, we calculated the results from the combinations of the four kinds of augmented data and the four preprocessed spectra as covariates. For the covariates, we calculated 1 to 7 principal components from each preprocessing technique. Therefore, there were 116 predicted results in total (112 SOM predictions using covariates with the augmented data, and 4 from the augmented data with no covariate; see Figure 2B, Section A).
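The count of 116 follows from enumerating the combinations: four augmented datasets, four covariate preprocessing methods, and one to seven principal components, plus the four runs without a covariate. A short sketch of the enumeration, with the tuple layout chosen here only for illustration:

```python
from itertools import product

PREPROCESSING = ["lg_sg", "lg_snv", "msc_snv", "lg_dt"]
N_PCS = range(1, 8)  # 1 to 7 accumulated principal components

# (augmented-data preprocessing, covariate preprocessing, number of PCs)
with_covariates = list(product(PREPROCESSING, PREPROCESSING, N_PCS))
# Augmented data kriged without any covariate.
without_covariates = [(aug, None, 0) for aug in PREPROCESSING]

print(len(with_covariates), len(without_covariates))   # 112 4
print(len(with_covariates) + len(without_covariates))  # 116
```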
To further compare which measure had the main effect on prediction and mapping, we also mapped SOM using spatial models built on the laboratory-based data with the PCs from lg_dt-preprocessed spectra as the covariate, which was called Ld.co. The maps finally shown were from the laboratory-based data (Ld), the laboratory-based data with PCs as the covariate using lg_dt preprocessing (Ld.co), the vis-NIR augmented data whose spectra used lg_dt preprocessing (Au.p), and the vis-NIR augmented data with PCs as the covariate, with all spectra using lg_dt preprocessing (Au.p.co); this part is shown in Section B of Figure 2.
Root mean squared error (RMSE), R2, bias, and the ratio of performance to interquartile range (RPIQ) were used to evaluate the accuracy and bias of the SOM predictions on the validation set.
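These four metrics can be computed as in the sketch below. The text does not give the exact formulas, so the definitions here are assumptions: R2 is taken as 1 − SSres/SStot, bias as the mean of predicted minus observed, and RPIQ as the interquartile range of the observed values divided by RMSE, with quartiles from the inclusive method of Python's `statistics.quantiles`. The observed and predicted values are invented.

```python
import math
from statistics import mean, quantiles


def evaluate(observed, predicted):
    """Return RMSE, R2, bias, and RPIQ for paired observed/predicted values."""
    errors = [p - o for o, p in zip(observed, predicted)]
    rmse = math.sqrt(mean([e ** 2 for e in errors]))
    bias = mean(errors)  # positive means over-prediction (assumed sign convention)
    obs_mean = mean(observed)
    ss_res = sum(e ** 2 for e in errors)
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    q1, _, q3 = quantiles(observed, n=4, method="inclusive")
    rpiq = (q3 - q1) / rmse
    return {"RMSE": rmse, "R2": r2, "bias": bias, "RPIQ": rpiq}


# Invented SOM values (g/kg); every prediction is 1 g/kg too high.
observed = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0]
predicted = [o + 1.0 for o in observed]
metrics = evaluate(observed, predicted)
print(metrics["RMSE"], metrics["bias"], metrics["RPIQ"])  # 1.0 1.0 7.0
```

Because RPIQ scales the error by the spread of the observations, it stays meaningful for skewed soil data where the standard deviation would be inflated by a few large values.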

3. Results

3.1. SOM and vis-NIR Spectra

Table 1 presents a summary of the calibration, test, and validation datasets. The mean SOM values of the three datasets were similar. The skewness of the calibration dataset was greater than that of the test and validation datasets because the calibration dataset contained the largest SOM value.
Apparent water absorption peaks were observed in the vis-NIR spectra bands near 1400 nm and 1900 nm, as shown in Figure 3. The absorption peaks near 2200 nm indicated the existence of kaolinite in the soil samples [19]. In addition, the absorption spectra at approximately 450 and 850 nm showed goethite and hematite, which are characteristic of iron-bearing minerals [20]. Three samples had a higher reflectance at 600–1400 nm than the rest of the soil samples because of their relatively late development [20].

3.2. Prediction of PLSR

The results showed that the vis-NIR spectroscopy calibration model built on the 101 calibration samples performed better on the test dataset (RPIQ values from 2.03 to 2.30) than on the validation dataset (RPIQ values from 1.51 to 1.71) (Table 2). The models using spectra preprocessed with msc_snv and lg_dt gave the best performance, with RPIQ values of 1.71 for the validation dataset and 2.30 for the test dataset, respectively. The msc_snv preprocessing was more effective for the validation dataset than for the test dataset. The bias values showed that the vis-NIR model predictions for the validation dataset were positively biased.

3.3. Correlation Analysis

Figure 4A shows that the correlations of SOM with the PCs differed among the preprocessed spectra. For spectra under msc_snv and lg_dt, the correlations calculated from the first four PCs were significant. However, after msc_snv and lg_sg were applied, significant correlations were found up to the seventh PC (the minimum significant correlation value was 0.18). Among the four preprocessing methods, lg_sg produced more negative correlations than the other three. Subsequent calculations therefore used up to seven PCs.
Figure 4B shows that the correlations between SOM and spectra preprocessed with lg_dt and lg_snv were more significant than those with lg_sg and msc_snv. The number of significant bands under the four preprocessing methods was as follows: lg_snv (1248) > lg_dt (1204) > lg_sg (761) > msc_snv (584).

3.4. The Prediction from the Spatial Model with Spectra as the Covariate

Figure 5 shows that the change in RMSE was similar along each row but differed along each column, which showed that the effect of the spectra used as the covariate was larger than the effect of the vis-NIR augmented data derived from the different preprocessing methods. Taken as a whole, lg_snv_co showed the smallest differences in RMSE across PCs among the four covariate preprocessing methods. The results from lg_dt_co and msc_snv_co had the same trend, but the number of PCs with the lowest RMSE was different: four for the former and five for the latter. For lg_sg_co, the lowest RMSE was obtained with two PCs.
When comparing the results of RMSE from the spatial model with and without spectra as the covariate, we found that regardless of the kind of pretreatment with PCs as the covariate, the RMSE value was reduced to different degrees except for the first PC from lg_dt_co and msc_snv_co. When not using the spectra as the covariate, the effect of preprocessing on the prediction result was not obvious, and the best result was obtained from lg_dt (Figure 5, red line in each column).
The bias in Figure 6 shows that the number of PCs had little effect for lg_dt_co and lg_snv_co. lg_snv_co had a lower bias than the model without the covariate (lg_snv, red lines in Figure 6). The bias values from lg_dt_co all stayed close to the bias value of lg_dt, except for the seventh PC. With three to five PCs, the biases from lg_sg_co and msc_snv_co were higher than those from lg_sg and msc_snv, which showed that when the spectra were used as the covariate, the reduction in RMSE was not due to reduced bias.

3.5. Spatial Analysis and SOM Mapping

The spatial predictions using the SOM values from the vis-NIR augmented dataset (Au.p) were slightly better than those using the data from the laboratory analysis (Ld) (Table 3 and Table 4) in both cross-validation and validation. When comparing the prediction results with and without vis-NIR spectra as covariates, the former obtained higher prediction accuracy (Table 3 and Table 4). The model from the laboratory data with vis-NIR spectra as the covariate (Ld.co) had a similar result to the model from the vis-NIR augmented dataset with vis-NIR spectra as the covariate (Au.p.co). The ME values in Table 3 show that the spatial predictions from the cross-validation were always slightly negatively biased, whereas those from the validation were always positively biased.
Figure 7 shows the spatial predictions from the maps produced by the four kinds of data. The spatial predictions from the laboratory-based and vis-NIR augmented data were similar, and the differences between them were relatively small. The largest difference between the maps was located in the northwest part of the study area, where the altitudes are highest (Figure 1B); there, the laboratory-based model predicted a lower SOM content and the vis-NIR augmented model a higher SOM content. Another noticeable difference was in the middle part of the study area, where the maps predicted with the vis-NIR augmented dataset were smoother than those predicted with the laboratory-based dataset. When comparing the maps from the laboratory data with and without the PC covariate (Ld.co vs. Ld), the main difference was located in the northwest, where the SOM values from Ld.co were higher than those from Ld. There were some differences between the maps from Ld and Au.p.co in the middle and northwest of the sampling area, where the SOM values from Au.p.co were lower than those from Ld.
The histograms showed no difference in the percentage of predicted SOM in any range between the maps from Ld and Au.p (Figure 8). However, when the vis-NIR spectra were used as covariates, the SOM content range of 12–21 g kg−1 accounted for 58% of the predictions from Ld.co and Au.p.co, compared with 64% for Ld and Au.p (Figure 8).
Figure 9A–D shows the locations and values of the differences between predicted and measured SOM from Ld (A), Au.p (B), Ld.co (C), and Au.p.co (D) for the validation dataset. The range of differences in Figure 9A,B was larger than that in Figure 9C,D. In Figure 9A,B, the greatest differences were located in the northwest, where the estimated SOM values were higher than the measured values, and in the north, where the estimated values were lower than the measured values. As shown in Figure 9C,D, the largest differences were located in the northwest and southeast, where the estimated SOM values were also lower than the measured values. The differences between measured and estimated SOM values in the middle of the sampling area were lower in Figure 9C,D than in Figure 9A,B. The differences between predicted and measured SOM from the vis-NIR model (Figure 9E) were much smaller than those from the spatial models (Figure 9A–D). The locations of the larger errors from the vis-NIR model were not consistent with those from the spatial models.

4. Discussion

The differences in SOM by location for the validation dataset between the spatial models and the vis-NIR augmented models (Figure 7 and Table 3) showed that visible and near-infrared (vis-NIR) spectroscopy is an alternative for SOM mapping that overcomes time and budget constraints, which is consistent with several studies [10,24,25,26]. However, these researchers only used vis-NIR spectra to predict soil properties and did not use vis-NIR spectra as covariates. In this study, we used both the spectra from the test dataset to predict SOM and the PCs of the spectra from all samples as covariates. We used PC1 to PC7, which were strongly correlated with SOM, as the covariates. When comparing the predictions from the laboratory-based dataset (Ld) and the vis-NIR augmented dataset with PCs as covariates (Au.p.co), the number of samples used for calibration of the spatial model was reduced to nearly 60%, but the accuracy improved by 22%. The main difference between the two maps was that the maps predicted with the vis-NIR augmented model were smoother than those predicted with the laboratory-based model. This can be explained by the fact that predictions from the vis-NIR model tend to smooth the variation.
Our study showed that adding spectroscopy as the covariate was useful in improving the prediction accuracy of kriging. The predicted results using spectra under different preprocessing methods were clearly different (Figure 5 and Figure 6, Table 3). The effect of the covariate was far greater than the effect of the augmented data (Figure 5 and Table 3). The optimal preprocessing method can be selected from the correlation of the spectra with SOM in the calibration dataset (Figure 4B). In this study, lg_dt and lg_snv had 1204 and 1248 significant bands, respectively, which were more than the other two methods. Although msc_snv had the lowest RMSE in the validation with five PCs, its prediction performance on the test dataset was the poorest, which showed that it could not be optimal. The other key to prediction is selecting the number of principal components, for which the correlation of each PC with SOM can be useful (Figure 4A).
Although the spatial models using the PCs as the covariate produced maps with improved accuracy, there was still a gap for high-resolution mapping. The differences at the sampling locations from the spatial models showed that the error might be related to topography: the error increased with the DEM value (Figure 1). This is consistent with Viscarra Rossel et al. [27], who used the DEM as a variable to predict soil carbon in cool temperate areas. In our study area, the higher the DEM, the lower the temperature, the greater the soil organic matter content, and the larger the difference between the predicted and measured SOM (Figure 9A–D). Reference [28] also used a multi-depth vis-NIR spectral library and terrain attributes in digital soil mapping at the local scale. In this study, we did not consider the DEM because of the small area and the lack of a significant correlation between the DEM and SOM.
The spatial model did not produce a more accurate result than the research of Reference [29], which used representative samples for calibration, whereas we used random selection. The relatively larger RMSE and lower bias from the spatial model showed that imprecision was the main source of error [28,30]. The imprecision was due to the large sill value and the large range of the fitted variogram, which showed strong spatial dependence and considerable variation in SOM [31,32]. Adding sampling points is a possible solution because the sampling grid in our study (300 × 300 m) was coarser than that needed for soil mapping in precision agriculture, which calls for one to five samples per hectare [34,35]; such sparse sampling would decrease the precision of the spatial relationship between spectra and soil properties [33]. Therefore, collecting a larger number of soil samples that represent soil spatial variation within the area, measuring their spectra, and using them as the covariate in spatial mapping can be a feasible measure for a future study.
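The gap between the sampling density in this study and the one to five samples per hectare cited for precision agriculture can be checked with simple arithmetic. The sketch below uses only the 300 m grid spacing stated in the text; the variable names are chosen for illustration.

```python
GRID_SPACING_M = 300  # grid spacing stated in the text
M2_PER_HA = 10_000    # square metres in one hectare

# Each grid node represents one 300 m x 300 m cell.
ha_per_sample = GRID_SPACING_M ** 2 / M2_PER_HA
samples_per_ha = 1 / ha_per_sample

print(ha_per_sample)             # 9.0
print(round(samples_per_ha, 2))  # 0.11
```

At roughly 0.11 samples per hectare, the grid is about an order of magnitude sparser than the lower bound of one sample per hectare.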
We did not investigate the uncertainty of the variogram parameters because of the regular grid sampling design, so the estimated nugget variance may not be accurate [29]. One solution would be to include a second-phase survey, which requires many samples. Vis-NIR spectral libraries could be considered for this purpose [28,30].

5. Conclusions

This study verified that vis-NIR spectroscopy is an alternative that overcomes the time and budget constraints of traditional chemical analysis methods in mapping SOM. In this study, we compared the 116 predicted results from the spatial models built on the laboratory-based dataset (Ld), the augmented dataset predicted using vis-NIR (Au.p), the laboratory-based dataset with vis-NIR spectra as the covariate (Ld.co), and the augmented dataset predicted using vis-NIR with vis-NIR spectra as the covariate (Au.p.co). The conclusions were drawn as follows:
(1) The spectra used as the covariate play a crucial part in the prediction accuracy. Different preprocessing methods had different influences, and the most effective method can be decided by the correlation of the spectra with SOM in the calibration dataset.
(2) When the vis-NIR model was used for augmentation and vis-NIR spectra were used as covariates for SOM mapping, the number of samples used for spatial calibration was reduced to nearly 60%, but the accuracy improved by 23%. The prediction error may be mainly due to imprecision, not bias; collecting a larger number of spectra of soil samples that represent soil spatial variation within the area and using them as the covariate in spatial mapping can improve the mapping accuracy. The conclusions of this study need to be verified by applying this strategy to spectra measured over a larger area.

Author Contributions

Conceptualization, M.Y. and X.Z.; methodology, M.Y.; investigation, X.G.; writing—original draft preparation, M.Y.; writing—review and editing, S.C.; supervision, Z.S. and X.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by grants from the National Natural Science Foundation of China (No. 41061031), the Natural Science Foundation of Jiangxi Province (20212BAB205022), and the Science and Technology Research Project of Jiangxi Provincial Department of Education (No. GJJ181150).

Data Availability Statement

The data that support the findings of this study are available from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Camargo, L.A.; Júnior, J.M.; Barrón, V.; Alleoni, L.R.F.; Barbosa, R.S.; Pereira, G.T. Mapping of clay, iron oxide and adsorbed phosphate in Oxisols using diffuse reflectance spectroscopy. Geoderma 2015, 251, 124–132.
  2. Ben-Dor, E.; Heller, D.; Chudnovsky, A. A Novel Method of Classifying Soil Profiles in the Field using Optical Means. Soil Sci. Soc. Am. J. 2008, 72, 1113–1123.
  3. Van Zijl, G. Digital soil mapping approaches to address real world problems in southern Africa. Geoderma 2019, 337, 1301–1308.
  4. Leenen, M.; Pätzold, S.; Tóth, G.; Welp, G. A LUCAS-based mid-infrared soil spectral library: Its usefulness for soil survey and precision agriculture. J. Plant Nutr. Soil Sci. 2022, 185, 370–383.
  5. Brevik, E.C.; Calzolari, C.; Miller, B.A.; Pereira, P.; Kabala, C.; Baumgarten, A.; Jordán, A. Soil mapping, classification, and pedologic modeling: History and future directions. Geoderma 2016, 264, 256–274.
  6. Brus, D. Sampling for digital soil mapping: A tutorial supported by R scripts. Geoderma 2019, 338, 464–480.
  7. Li, N.; Arshad, M.; Zhao, D.; Sefton, M.; Triantafilis, J. Determining optimal digital soil mapping components for exchangeable calcium and magnesium across a sugarcane field. Catena 2019, 181, 104054.
  8. Mirzaeitalarposhti, R.; Demyan, M.S.; Rasche, F.; Cadisch, G.; Müller, T. Mid-infrared spectroscopy to support regional-scale digital soil mapping on selected croplands of South-West Germany. Catena 2017, 149, 283–293.
  9. Lopo, M.; dos Santos, C.A.T.; Páscoa, R.N.M.J.; Graça, A.R.; Lopes, J.A. Near infrared spectroscopy as a tool for intensive mapping of vineyards soil. Precis. Agric. 2017, 19, 445–462.
  10. Páscoa, R.; Lopo, M.; dos Santos, C.T.; Graça, A.; Lopes, J. Exploratory study on vineyards soil mapping by visible/near-infrared spectroscopy of grapevine leaves. Comput. Electron. Agric. 2016, 127, 15–25.
  11. Paz-Kagan, T.; Zaady, E.; Salbach, C.; Schmidt, A.; Lausch, A.; Zacharias, S.; Notesco, G.; Ben-Dor, E.; Karnieli, A. Mapping the Spectral Soil Quality Index (SSQI) Using Airborne Imaging Spectroscopy. Remote Sens. 2015, 7, 15748–15781.
  12. Viscarra Rossel, R.; Walvoort, D.; Mcbratney, A.; Janik, L.J.; Skjemstad, J. Visible, near infrared, mid infrared or combined diffuse reflectance spectroscopy for simultaneous assessment of various soil properties. Geoderma 2006, 131, 59–75.
  13. Yang, M.; Chen, S.; Li, H.; Zhao, X.; Shi, Z. Effectiveness of different approaches for in situ measurements of organic carbon using visible and near infrared spectrometry in the Poyang Lake basin area. Land Degrad. Dev. 2021, 32, 1301–1311.
  14. Vasques, G.; Grunwald, S.; Sickman, J. Comparison of multivariate methods for inferential modeling of soil carbon using visible/near-infrared spectra. Geoderma 2008, 146, 14–25.
  15. Samson, M.; Deutsch, C. The Sill of the Variogram. Geostatistics Lessons. Available online: https://geostatisticslessons.com/lessons/sillofvariogram (accessed on 25 February 2021).
  16. Rosin, N.A.; Demattê, J.A.M.; Poppiel, R.R.; Silvero, N.E.Q.; Rodriguez-Albarracin, H.S.; Rosas, J.T.F.; Greschuk, L.T.; Bellinaso, H.; Minasny, B.; Gomez, C.; et al. Mapping Brazilian soil mineralogy using proximal and remote sensing data. Geoderma 2023, 432.
  17. Zhang, Y.; Saurette, D.D.; Easher, T.H.; Ji, W.; Adamchuk, V.I.; Biswas, A. Comparison of sampling designs for calibrating digital soil maps at multiple depths. Pedosphere 2022, 32, 588–601.
  18. Yu, D.; Ramsey, R.D.; Zhao, X.-M.; Fu, Y.-Q.; Sun, C.-K. Feasible conversion degree of dryland to paddy field in Jinxian County, Jiangxi province, China. Geocarto Int. 2019, 34, 1042–1053.
  19. Savitzky, A.; Golay, M.J. Smoothing and differentiation of data by simplified least squares procedures. Anal. Chem. 1964, 36, 1627–1639.
  20. Barnes, R.; Dhanoa, M.S.; Lister, S.J. Standard normal variate transformation and de-trending of near-infrared diffuse reflectance spectra. Appl. Spectrosc. 1989, 43, 772–777.
  21. Wold, S.; Sjöström, M.; Eriksson, L. PLS-regression: A basic tool of chemometrics. Chemom. Intell. Lab. Syst. 2001, 58, 109–130.
  22. Ru, R.S. Soil Physical and Chemical Analysis; Shanghai Science and Technology Publishing House: Beijing, China, 2000. (In Chinese)
  23. Klingenfuß, C.; Roßkopf, N.; Walter, J.; Heller, C.; Zeitz, J. Soil organic matter to soil organic carbon ratios of peatland soil substrates. Geoderma 2014, 235, 410–417. [Google Scholar] [CrossRef]
  24. Yang, M.; Mouazen, A.; Zhao, X.; Guo, X. Assessment of a soil fertility index using visible and near-infrared spectroscopy in the rice paddy region of southern China. Eur. J. Soil Sci. 2020, 71, 615–626. [Google Scholar] [CrossRef]
  25. Steinberg, A.; Chabrillat, S.; Stevens, A.; Segl, K.; Foerster, S. Prediction of Common Surface Soil Properties Based on Vis-NIR Airborne and Simulated EnMAP Imaging Spectroscopy Data: Prediction Accuracy and Influence of Spatial Resolution. Remote Sens. 2016, 8, 613. [Google Scholar] [CrossRef] [Green Version]
  26. Yang, M.; Xu, D.; Chen, S.; Li, H.; Shi, Z. Evaluation of Machine Learning Approaches to Predict Soil Organic Matter and pH Using vis-NIR Spectra. Sensors 2019, 19, 263. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  27. Viscarra Rossel, R.A.; Behrens, T.; Ben-Dor, E.; Brown, D.J.; Demattê, J.A.M.; Shepherd, K.D.; Shi, Z.; Stenberg, B.; Stevens, A.; Adamchuk, V.; et al. A global spectral library to characterize the world’s soil. Earth-Sci. Rev. 2016, 155, 198–230. [Google Scholar] [CrossRef] [Green Version]
  28. Rizzo, R.; Demattê, J.A.M.; Lepsch, I.F.; Gallo, B.C.; Fongaro, C.T. Digital soil mapping at local scale using a multi-depth Vis–NIR spectral library and terrain attributes. Geoderma 2016, 274, 18–27. [Google Scholar] [CrossRef]
  29. Ramirez-Lopez, L.; Wadoux, A.C.; Franceschini, M.H.; Terra, F.; Marques, K.P.P.; Sayão, V.M.; Demattê, J.A.M. Robust soil mapping at the farm scale with vis–NIR spectroscopy. Eur. J. Soil Sci. 2019, 70, 378–393. [Google Scholar] [CrossRef] [Green Version]
  30. Viscarra Rossel, R.A.; Webster, R.; Bui, E.N.; Baldock, J.A. Baseline map of organic carbon in Australian soil to support national carbon accounting and monitoring under climate change. Glob. Chang. Biol. 2014, 20, 2953–2970. [Google Scholar] [CrossRef] [Green Version]
  31. Viscarra Rossel, R.; Chen, C.; Grundy, M.; Searle, R.; Clifford, D.; Campbell, P. The Australian three-dimensional soil grid: Australia’s contribution to the GlobalSoilMap project. Soil Res. 2015, 53, 845–864. [Google Scholar] [CrossRef] [Green Version]
  32. Somarathna, P.; Minasny, B.; Malone, B.P.; Stockmann, U.; McBratney, A. Accounting for the measurement error of spectroscopically inferred soil carbon data for improved precision of spatial predictions. Sci. Total. Environ. 2018, 631-632, 377–389. [Google Scholar] [CrossRef]
  33. Wetterlind, J.; Stenberg, B.; Söderström, M. The use of near infrared (NIR) spectroscopy to improve soil mapping at the farm scale. Precis. Agric. 2008, 9, 57–69. [Google Scholar] [CrossRef] [Green Version]
  34. De Oliveira, J.F.; Brossard, M.; Corazza, E.J.; De Fátima Guimarães, M.; Marchão, R.L. Field-scale spatial correlation between soil and Vis-NIR spectra in the Cerrado biome of Central Brazil. Geoderma Reg. 2022, 30, e00532. [Google Scholar] [CrossRef]
  35. Chen, S.; Arrouays, D.; Mulder, V.L.; Poggio, L.; Minasny, B.; Roudier, P.; Libohova, Z.; Lagacherie, P.; Shi, Z.; Hannam, J.; et al. Digital mapping of GlobalSoilMap soil properties at a broad scale: A review. Geoderma 2022, 409, 115567. [Google Scholar] [CrossRef]
Figure 1. Location of the sampling sites (A) and digital elevation model (DEM) of the sampling area (B).
Figure 2. Dataset combinations and the mapping flowchart. Panels (A)–(D) correspond to the laboratory-based dataset (Ld), the vis-NIR augmented dataset (Au.p), the laboratory-based dataset with PCs as covariates (Ld.co), and the vis-NIR augmented dataset with PCs as covariates (Au.p.co).
Figure 3. Reflectance of the samples.
Figure 4. Correlation of SOM with the principal components (PC1–PC9) calculated from the reflectance spectra (R) under four preprocessing methods (A), and correlation of SOM with the spectra under the same four preprocessing methods (B). Note: lg_sg: Savitzky-Golay filtering of log(1/R); lg_snv: standard normal variate of log(1/R); msc_snv: multiplicative scatter correction + standard normal variate of log(1/R); lg_dt: detrended normalization of log(1/R). The two black dotted lines mark the correlation values at which the correlation is significant (p < 0.01).
Figure 5. RMSE of the spatial model for the four augmented datasets with up to seven PCs as covariates, where both the augmented data and the PCs were derived from spectra under four preprocessing methods. Within each column, the augmented data are the same and the PCs (covariates) are derived from different preprocessing methods. Within each row, the PCs (covariates) are the same and the augmented data are derived from different preprocessing methods. The red lines show the RMSE obtained without spectra as covariates; this value is the same for every panel in a column.
Figure 6. Bias of the spatial model for the four augmented datasets with up to seven PCs as covariates, where both the augmented data and the PCs were derived from spectra under four preprocessing methods. Within each column, the augmented data are the same and the PCs (covariates) are derived from different preprocessing methods. Within each row, the PCs (covariates) are the same and the augmented data are derived from different preprocessing methods. The red lines show the bias obtained without spectra as covariates; this value is the same for every panel in a column.
Figure 7. Maps of SOM generated from the laboratory-based dataset (Ld) (A), the vis-NIR augmented dataset (Au.p) (B), the laboratory-based dataset with PCs as covariates (Ld.co) (C), and the vis-NIR augmented dataset with PCs as covariates (Au.p.co) (D).
Figure 8. Histograms of predicted SOM using the laboratory-based dataset (Ld), the vis-NIR augmented dataset (Au.p), the laboratory-based dataset with PCs as covariates (Ld.co), and the vis-NIR augmented dataset with PCs as covariates (Au.p.co).
Figure 9. Locations and values of the difference between measured and predicted SOM for the validation dataset from Ld (A), Au.p (B), Ld.co (C), Au.p.co (D), and the vis-NIR model (E), and the plot of measured against predicted SOM for the test and validation datasets (F). The values shown in (A)–(E) are the measured SOM minus the predicted SOM.
Table 1. Summary statistics of SOM of calibration, test, and validation datasets.

| Dataset | N | Mean | SD | Skew | Min. | 1st Qu. | Median | 3rd Qu. | Max. |
|---|---|---|---|---|---|---|---|---|---|
| Calibration (g kg−1) | 104 | 18.92 | 8.87 | 0.61 | 2.31 | 12.08 | 18.72 | 23.55 | 49.68 |
| Test (g kg−1) | 131 | 18.32 | 6.57 | 0.31 | 2.06 | 14.37 | 16.99 | 23.31 | 35.58 |
| Validation (g kg−1) | 26 | 17.63 | 7.36 | 0.31 | 4.96 | 12.22 | 17.59 | 23.26 | 37.09 |

N is the number of samples, SD is the standard deviation, Min. is the minimum, 1st Qu. is the first quartile, 3rd Qu. is the third quartile, and Max. is the maximum.
Table 2. Prediction results of the validation and test dataset from vis-NIR calibration models under four spectral preprocessing methods.

| Transformation | Validation RMSE | Validation RPIQ | Validation Bias | Validation R2 | Test RMSE | Test RPIQ | Test Bias | Test R2 |
|---|---|---|---|---|---|---|---|---|
| lg.sg | 6.11 | 1.51 | 0.11 | 0.56 | 4.32 | 2.17 | 0.39 | 0.69 |
| msc.snv | 5.40 | 1.71 | 0.09 | 0.66 | 4.61 | 2.03 | 0.20 | 0.64 |
| lg.snv | 5.74 | 1.61 | 0.71 | 0.63 | 4.19 | 2.23 | 0.31 | 0.71 |
| lg.dt | 5.48 | 1.68 | 0.63 | 0.65 | 4.07 | 2.30 | −0.04 | 0.72 |

Note: lg.sg: Savitzky-Golay filtering of log(1/R); lg.snv: standard normal variate of log(1/R); msc.snv: multiplicative scatter correction + standard normal variate of log(1/R); lg.dt: detrended normalization of log(1/R).
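As a rough illustration of the lg_snv transformation named in the note above, the following minimal sketch converts reflectance to log(1/R) and then applies a standard normal variate to a single spectrum. The base-10 logarithm, the use of the sample standard deviation, the function names, and the reflectance values are all assumptions made for illustration; they are not taken from the article.

```python
import math
import statistics


def absorbance(reflectance):
    """Convert reflectance values R (0 < R <= 1) to log10(1/R).

    Base 10 is an assumption; the note only states log(1/R).
    """
    return [math.log10(1.0 / r) for r in reflectance]


def snv(spectrum):
    """Standard normal variate of one spectrum.

    Each spectrum is centred on its own mean and scaled by its own
    sample standard deviation (an assumed convention).
    """
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(x - mean) / sd for x in spectrum]


# Hypothetical reflectance values for a single sample (not from the study).
reflectance = [0.10, 0.20, 0.25, 0.40, 0.50]
lg = absorbance(reflectance)
lg_snv = snv(lg)
```

After this step each spectrum has zero mean and unit standard deviation across its wavelengths, which removes additive and multiplicative offsets between samples before PCA or regression.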
Table 3. The statistics of the cross-validation and prediction results of kriging for SOM.

| Dataset | LVs | Cross-validation R2 | Cross-validation RMSE (g kg−1) | Cross-validation RPIQ | Cross-validation ME (g kg−1) | Validation R2 | Validation RMSE (g kg−1) | Validation RPIQ | Validation ME (g kg−1) |
|---|---|---|---|---|---|---|---|---|---|
| Ld | 4 | 0.18 | 6.83 | 1.54 | 1.01 | 0.27 | 7.12 | 1.27 | 1.04 |
| Au.p | 6 | 0.18 | 6.74 | 1.56 | 0.68 | 0.35 | 7.69 | 1.20 | 0.97 |
| Ld.co | 7 | 0.50 | 5.36 | 2.08 | 0.17 | 0.58 | 5.51 | 1.64 | 0.74 |
| Au.p.co | 8 | 0.47 | 5.46 | 1.86 | 0.34 | 0.66 | 5.66 | 1.65 | 1.37 |

Ld: laboratory-based dataset; Au.p: augmented dataset; Ld.co: laboratory-based dataset with vis-NIR spectra as the covariate; Au.p.co: augmented dataset with vis-NIR spectra as the covariate; LVs: number of latent variables.
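For readers who want to reproduce accuracy statistics of this kind on their own data, the sketch below computes RMSE, ME, RPIQ, and R2 from paired measured and predicted values. Several choices here are assumptions rather than the authors' stated method: ME is taken as the mean of measured minus predicted, RPIQ as the interquartile range of the measured values divided by RMSE, the quartiles use Python's "inclusive" method, and R2 is the coefficient of determination. The function name and sample values are hypothetical.

```python
import math
import statistics


def validation_metrics(measured, predicted):
    """Return RMSE, ME, RPIQ and R2 for paired measured/predicted values."""
    errors = [m - p for m, p in zip(measured, predicted)]

    # Root mean squared error and mean error (measured minus predicted).
    rmse = math.sqrt(statistics.fmean([e * e for e in errors]))
    me = statistics.fmean(errors)

    # Ratio of performance to interquartile range (assumed: IQR / RMSE).
    q1, _, q3 = statistics.quantiles(measured, n=4, method="inclusive")
    rpiq = (q3 - q1) / rmse

    # Coefficient of determination (assumed definition of R2).
    mean_measured = statistics.fmean(measured)
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    r2 = 1.0 - ss_res / ss_tot

    return {"RMSE": rmse, "ME": me, "RPIQ": rpiq, "R2": r2}


# Hypothetical SOM values in g kg-1 (not from the study).
measured = [10.0, 14.0, 18.0, 22.0, 26.0]
predicted = [12.0, 13.0, 19.0, 20.0, 27.0]
metrics = validation_metrics(measured, predicted)
```

Because RPIQ scales the error by the spread of the measured values, it depends on the quartile convention used, so values computed this way may differ slightly from those reported by other software.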
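Table 4. The statistics of prediction results of kriging for SOM using different numbers of principal components (PCs) under different preprocessing methods.

| PCs | Pre-processing | lg.sg RMSE | lg.sg RPIQ | lg.sg Bias | lg.sg R2 | msc.snv RMSE | msc.snv RPIQ | msc.snv Bias | msc.snv R2 | lg.snv RMSE | lg.snv RPIQ | lg.snv Bias | lg.snv R2 | lg.dt RMSE | lg.dt RPIQ | lg.dt Bias | lg.dt R2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | lg.sg | 6.93 | 1.33 | 0.7 | 0.45 | 7.03 | 1.31 | 0.5 | 0.42 | 6.82 | 1.35 | 0.85 | 0.47 | 6.72 | 1.37 | 0.58 | 0.48 |
| 1 | msc.snv | 8.03 | 1.15 | 0.36 | 0.24 | 8.07 | 1.14 | 0.2 | 0.23 | 7.99 | 1.15 | 0.49 | 0.24 | 7.93 | 1.16 | 0.22 | 0.25 |
| 1 | lg.snv | 6.42 | 1.44 | 0.04 | 0.52 | 6.5 | 1.42 | −0.13 | 0.51 | 6.39 | 1.44 | −0.15 | 0.52 | 6.45 | 1.43 | −0.46 | 0.51 |
| 1 | lg.dt | 8.01 | 1.15 | 0.61 | 0.25 | 8.05 | 1.15 | 0.44 | 0.24 | 8.04 | 1.15 | 0.75 | 0.24 | 7.96 | 1.16 | 0.48 | 0.25 |
| 2 | lg.sg | 6.45 | 1.43 | 0.85 | 0.54 | 6.52 | 1.42 | 0.66 | 0.52 | 6.49 | 1.42 | 0.98 | 0.53 | 6.35 | 1.45 | 0.69 | 0.54 |
| 2 | msc.snv | 6.97 | 1.32 | 1.77 | 0.46 | 6.87 | 1.34 | 1.64 | 0.47 | 6.93 | 1.33 | 2.01 | 0.48 | 6.82 | 1.35 | 1.76 | 0.49 |
| 2 | lg.snv | 6.44 | 1.43 | −0.04 | 0.56 | 6.49 | 1.42 | −0.19 | 0.55 | 6.22 | 1.48 | −0.19 | 0.55 | 6.21 | 1.49 | −0.46 | 0.55 |
| 2 | lg.dt | 6.87 | 1.34 | 1.01 | 0.47 | 6.85 | 1.35 | 0.84 | 0.47 | 6.78 | 1.36 | 1.21 | 0.49 | 6.73 | 1.37 | 0.96 | 0.49 |
| 3 | lg.sg | 7.66 | 1.2 | 1.99 | 0.36 | 7.76 | 1.19 | 1.78 | 0.34 | 7.77 | 1.19 | 2.18 | 0.36 | 7.7 | 1.2 | 1.9 | 0.36 |
| 3 | msc.snv | 6.92 | 1.33 | 1.68 | 0.46 | 6.84 | 1.35 | 1.52 | 0.47 | 6.85 | 1.35 | 1.83 | 0.48 | 6.75 | 1.37 | 1.57 | 0.49 |
| 3 | lg.snv | 5.8 | 1.59 | 0.11 | 0.64 | 5.88 | 1.57 | −0.08 | 0.63 | 5.85 | 1.58 | −0.2 | 0.59 | 5.84 | 1.58 | −0.44 | 0.6 |
| 3 | lg.dt | 5.99 | 1.54 | 1.22 | 0.59 | 5.98 | 1.54 | 1.05 | 0.59 | 5.9 | 1.56 | 1.39 | 0.61 | 5.77 | 1.6 | 1.15 | 0.62 |
| 4 | lg.sg | 7.13 | 1.29 | 1.83 | 0.45 | 7.26 | 1.27 | 1.61 | 0.43 | 7.15 | 1.29 | 1.96 | 0.47 | 7.09 | 1.3 | 1.67 | 0.47 |
| 4 | msc.snv | 6.47 | 1.42 | 1.74 | 0.54 | 6.41 | 1.44 | 1.58 | 0.54 | 6.49 | 1.42 | 1.88 | 0.55 | 6.36 | 1.45 | 1.61 | 0.55 |
| 4 | lg.snv | 5.9 | 1.56 | 0.1 | 0.6 | 6.17 | 1.5 | −0.13 | 0.55 | 6.07 | 1.52 | −0.19 | 0.56 | 6.1 | 1.51 | −0.42 | 0.56 |
| 4 | lg.dt | 5.75 | 1.6 | 1.32 | 0.63 | 5.67 | 1.63 | 1.18 | 0.63 | 5.74 | 1.61 | 1.51 | 0.64 | 5.6 | 1.65 | 1.24 | 0.65 |
| 5 | lg.sg | 7.03 | 1.31 | 2.05 | 0.47 | 7.02 | 1.31 | 1.74 | 0.46 | 7.1 | 1.3 | 2.28 | 0.47 | 7.04 | 1.31 | 2.01 | 0.47 |
| 5 | msc.snv | 5.6 | 1.65 | 1.6 | 0.66 | 5.49 | 1.68 | 1.43 | 0.67 | 5.75 | 1.6 | 1.79 | 0.65 | 5.66 | 1.63 | 1.52 | 0.65 |
| 5 | lg.snv | 6.06 | 1.52 | −0.05 | 0.58 | 6.31 | 1.46 | −0.26 | 0.54 | 6 | 1.54 | −0.28 | 0.57 | 6.02 | 1.53 | −0.49 | 0.57 |
| 5 | lg.dt | 5.71 | 1.62 | 1.35 | 0.64 | 5.62 | 1.64 | 1.23 | 0.64 | 5.8 | 1.59 | 1.52 | 0.64 | 5.66 | 1.63 | 1.25 | 0.65 |
| 6 | lg.sg | 6.61 | 1.4 | 1.35 | 0.52 | 6.63 | 1.39 | 1.04 | 0.51 | 6.57 | 1.4 | 1.6 | 0.54 | 6.53 | 1.41 | 1.33 | 0.55 |
| 6 | msc.snv | 5.58 | 1.65 | 0.44 | 0.64 | 5.54 | 1.66 | 0.24 | 0.64 | 5.82 | 1.59 | 0.61 | 0.62 | 5.81 | 1.59 | 0.36 | 0.62 |
| 6 | lg.snv | 6.63 | 1.39 | −0.09 | 0.5 | 6.72 | 1.37 | −0.24 | 0.48 | 6.28 | 1.47 | −0.26 | 0.54 | 6.24 | 1.48 | −0.49 | 0.55 |
| 6 | lg.dt | 6.08 | 1.52 | 1.38 | 0.59 | 5.93 | 1.56 | 1.25 | 0.6 | 6.28 | 1.47 | 1.51 | 0.57 | 6.09 | 1.51 | 1.28 | 0.59 |
| 7 | lg.sg | 7.24 | 1.27 | 0.91 | 0.46 | 7.27 | 1.27 | 0.63 | 0.46 | 7.37 | 1.25 | 1.15 | 0.46 | 7.35 | 1.26 | 0.87 | 0.46 |
| 7 | msc.snv | 5.94 | 1.55 | 0.68 | 0.59 | 5.95 | 1.55 | 0.49 | 0.59 | 6.1 | 1.51 | 0.87 | 0.59 | 6.11 | 1.51 | 0.61 | 0.58 |
| 7 | lg.snv | 6.68 | 1.38 | −0.48 | 0.48 | 6.71 | 1.38 | −0.64 | 0.48 | 6.32 | 1.46 | −0.69 | 0.54 | 6.34 | 1.45 | −0.93 | 0.54 |
| 7 | lg.dt | 6.63 | 1.39 | 1.96 | 0.54 | 6.5 | 1.42 | 1.87 | 0.54 | 6.61 | 1.4 | 2.03 | 0.56 | 6.39 | 1.44 | 1.78 | 0.57 |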
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Yang, M.; Chen, S.; Guo, X.; Shi, Z.; Zhao, X. Exploring the Potential of vis-NIR Spectroscopy as a Covariate in Soil Organic Matter Mapping. Remote Sens. 2023, 15, 1617. https://doi.org/10.3390/rs15061617