Review

Applications and Challenges of Visible-Near-Infrared and Mid-Infrared Spectroscopy in Soil Analysis: Chemometric Approaches and Data Fusion

1
National Institute of Agricultural Sciences, Rural Development Administration, Wanju-gun 55365, Republic of Korea
2
School of Life and Environmental Sciences, The University of Sydney, Sydney 2006, Australia
*
Author to whom correspondence should be addressed.
Agriculture 2026, 16(1), 135; https://doi.org/10.3390/agriculture16010135
Submission received: 6 November 2025 / Revised: 2 January 2026 / Accepted: 3 January 2026 / Published: 5 January 2026
(This article belongs to the Special Issue Application of Smart Technologies in Orchard Management)

Abstract

Infrared (IR) spectroscopy has emerged as a rapid, cost-effective, and reliable alternative to traditional methods, enabling real-time, indirect monitoring of soil nutrients. Most reviews have discussed visible-near-infrared (Vis-NIR) and mid-infrared (MIR) spectroscopy individually for soil analysis. This review highlights the application of IR spectroscopy, particularly Vis-NIR and MIR spectroscopy and their data fusion, coupled with chemometrics and spectral preprocessing, for estimating soil attributes. The assessment of model accuracy and the validation of model estimates of soil properties are also discussed. Partial least squares regression (PLSR) was used in more than 100 studies in 2022. Based on the literature published from 2020 to 2025, data fusion generally predicts soil properties more accurately than the individual techniques. This review also sheds light on recent advances in spectroscopic methods, including improvements in speed (e.g., MIR spectroscopy is up to 12 times faster than traditional methods), instrument miniaturization, and integration with portable devices, which can make field analysis more affordable. However, the sensitivity of IR spectroscopy to soil moisture, sample heterogeneity, and vegetation cover, together with calibration transfer issues, remains a significant challenge in certain studies. Therefore, the challenges in implementing this technique are discussed, followed by future perspectives such as the integration of various sensors and portable devices for real-time soil assessment.

1. Introduction

Soil health and crop productivity are essential to sustainable agriculture, as nearly 90% of food production relies on soil resources [1,2]. To meet future food demand, the world’s soils are expected to support a population of nine billion by 2050, which requires increasing agricultural production by 60% globally and by nearly 100% in developing countries [3]. Accomplishing this goal requires proper soil management, as crop growth and productivity are influenced by soil properties and nutrient availability. For example, soil pH influences nutrient availability, with micronutrients being more readily available at acidic pH, thereby increasing plant growth [4].
Despite the importance of soils, soil degradation has become a major global concern, with one-third of soils degraded due to erosion, salinization, acidification, nutrient depletion, compaction, and pollution [5]. This widespread deterioration threatens food security and impedes progress toward the Sustainable Development Goals (SDGs), including SDG 1 (End poverty), SDG 2 (Zero hunger), SDG 13 (Climate action), and SDG 15 (Life on land) [6]. Consequently, accurate, efficient, and sustainable approaches to soil assessment are essential for maintaining ecosystem health and long-term agricultural productivity.
Soil assessment has traditionally relied on wet-chemistry methods to identify nutrient deficiencies and excesses. Although these techniques provide accurate measurements, they are labor-intensive, time-consuming, costly, and dependent on chemical reagents that may pose environmental risks [7]. In contrast, infrared (IR) spectroscopy has emerged as an innovative and eco-friendly alternative. As a rapid, non-destructive, and cost-effective technique, IR spectroscopy enables the simultaneous estimation of multiple soil properties [8]. In particular, mid-infrared (MIR) spectroscopy has been reported to be 12 times faster than conventional laboratory methods for analyzing a range of soil attributes [7], thereby enhancing its appeal for modern agricultural applications.
Numerous studies have highlighted the key role of near-infrared (NIR) (750 nm to 2500 nm) and MIR (2500 nm to 25,000 nm) spectroscopy in characterizing complex soil matrices [9,10,11,12]. These methods enable efficient estimation of several physicochemical soil properties, such as texture, moisture, soil organic carbon (SOC), pH, nutrients, and bulk density. For instance, Jeon et al. [7] applied MIR spectroscopy to recommend fertilizers for seven crops based on predicted soil properties, while Engelmann et al. [13] demonstrated its effectiveness in determining nitrogen (N), total organic carbon (TOC), inorganic carbon (IC), and pH. Similarly, NIR spectroscopy has been used to assess soil inorganic carbon (SIC), organic carbon (OC), N, clay, organic matter (OM), and pH [14,15,16,17]. Lelago and Bibiso [18] demonstrated the potential of MIR spectroscopy for evaluating soil nutrients. Although MIR provides superior predictive performance due to stronger fundamental molecular absorptions, NIR spectroscopy offers practical advantages, such as the availability of miniature spectrometers, affordability, and portability for in-field use [19], which has resulted in its widespread use in soil analysis.
Several reviews have explored the use of spectroscopy in soil characterization, but most have focused on Vis-NIR or MIR analysis individually. Nath et al. [20] reviewed the potential of MIR spectroscopy coupled with chemometric approaches to rapidly and cost-effectively estimate soil physical, chemical, and biological properties as an alternative to conventional laboratory methods for soil quality assessment. Piccini et al. [21] provided a comprehensive overview of in-field Vis-NIR spectroscopy for commonly measured soil properties across various sensor ranges and types but did not address data fusion methods. Additionally, the integration of these techniques into real-time practice is still in its early stages and requires further technological advancement. Several challenges, such as soil heterogeneity, moisture content, and sample preparation, can introduce errors in the spectral response and thereby reduce prediction accuracy [16]. This review provides a comprehensive overview of NIR and MIR spectroscopy and their data fusion for soil analysis, covering the progress made and the challenges encountered in implementation. Future possibilities for achieving high accuracy in predicting soil attributes, such as nutrients, moisture, and pH, through the integration of advanced techniques are also explored.

2. Spectroscopy

Spectroscopy involves the interaction of light with matter, providing information on molecular and atomic properties [22]. Owing to its versatility, it is widely applied across disciplines including biology, chemistry, physics, and materials science. Each spectroscopic technique relies on different physical phenomena that determine its spectral features [23,24,25,26]. Ultraviolet–visible (UV-Vis), IR, nuclear magnetic resonance (NMR), and Raman spectroscopy are examples of the available techniques, with the choice of method depending on the specific analytical objective. This review focuses on IR spectroscopy, particularly the Vis-NIR, NIR, and MIR ranges, and their applicability to agricultural soil analysis.

2.1. Principles of NIR and MIR Methods

This section briefly introduces the fundamental principle of spectroscopy, with emphasis on NIR and MIR techniques. Infrared (IR) spectroscopy involves the generation, measurement, and analysis of spectra produced by the interaction of electromagnetic radiation with matter (Figure 1). The technique relies on the absorbance of radiation at molecular vibrational frequencies associated with functional groups, such as C–H, O–H, C–N, C–C, Al–O, and Si–O [27,28,29,30]. The NIR spectral region covers 750 nm to 2500 nm wavelengths. In this region, absorption features arise mainly from overtone and combination vibrations, particularly those involving hydrogen-containing bonds [31].
Mid-infrared (MIR) spectroscopy operates over a wavelength range of 2500–25,000 nm [32]. At ambient temperature, most molecules occupy the ground vibrational state (ν = 0); absorption of MIR radiation promotes transitions to the first excited state (ν = 1). These transitions, known as fundamental vibrations, provide detailed information about functional groups and their structures [24]. Fundamental vibrations often produce multiple absorption peaks, and the MIR spectrum is commonly divided into four regions: fingerprint, double-bond, triple-bond, and single-bond (X–H) stretching [33].
The spectral behavior of soil is primarily controlled by its mineralogical and chemical composition. Each soil mineral has a distinct structural arrangement, composition, and set of physicochemical properties, and therefore exhibits a distinct spectral response at specific wavelengths [34]. Kaolinite and smectite are common soil-forming clay minerals [35] that exhibit diagnostic O–H and Si–O vibrational features in the MIR spectral region, whereas CO₃²⁻ and Si–O absorption bands are associated with carbonates and quartz, respectively [35,36]. The intensity and position of these spectral bands are further modulated by soil chemistry, including organic carbon, metal oxides, and cation exchange sites, which collectively affect both MIR and NIR signatures [35,37]. For example, kaolinite has diagnostic features in the Vis-NIR range at 4812–4411 cm⁻¹ and in the MIR range at 3636, 2710, 1923–1818, 1587, 1064, and 855 cm⁻¹ [35]. Smectite shows diagnostic features in the Vis-NIR range at 4721–4372 cm⁻¹, along with MIR features at 3636, 1961, 1852, 1639, 1111, and 1064 cm⁻¹ [35]. Organic carbon can be identified in the MIR region from diagnostic absorption peaks at 2920 and 1230 cm⁻¹ [35]. The diagnostic spectral features observed in both the NIR and MIR regions reflect the integrated effects of minerals and soil chemistry on spectral signatures, underscoring their crucial role in robust spectral interpretation. Variability in soil composition across landscapes, however, limits the transferability of spectral models. This challenge can be addressed through large spectral libraries, model transfer techniques, and data fusion approaches that improve robustness across diverse soil types.
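Because this section quotes spectral ranges in nanometres and diagnostic bands in wavenumbers, a small conversion helper can make the two scales comparable. The sketch below assumes only the standard relation wavenumber (cm⁻¹) = 10⁷ / wavelength (nm); the function names are illustrative and do not come from the cited works.

```python
def nm_to_wavenumber(wavelength_nm: float) -> float:
    """Convert a wavelength in nanometres to a wavenumber in cm^-1."""
    return 1e7 / wavelength_nm


def wavenumber_to_nm(wavenumber_cm: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in nanometres."""
    return 1e7 / wavenumber_cm


# Range limits quoted in the text, expressed as wavenumbers
nir_limits = [nm_to_wavenumber(w) for w in (750, 2500)]    # NIR: 750-2500 nm
mir_limits = [nm_to_wavenumber(w) for w in (2500, 25000)]  # MIR: 2500-25,000 nm

# Kaolinite Vis-NIR feature quoted as 4812-4411 cm^-1, expressed in nm
kaolinite_nm = [round(wavenumber_to_nm(v)) for v in (4812, 4411)]
```

On this scale the NIR/MIR boundary at 2500 nm corresponds to 4000 cm⁻¹, which is why the Vis-NIR features quoted above lie above 4000 cm⁻¹ and the MIR features below it.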

2.2. Spectral Range and Instrument Variability

Each spectroscopic technique operates over a specific wavelength range (e.g., UV: 250–400 nm, Vis-NIR: 700–2500 nm, MIR: 2500–25,000 nm), resulting in different prediction accuracies for soil properties. These differences can be attributed to variations in the spectral range, resolution, and optical configuration of the spectroscopy system. The MIR region often provides higher prediction accuracy than NIR [32], which may be due to the strong fundamental vibrations of functional groups present in the soil sample. For instance, the prediction accuracies for sand, clay, total nitrogen (TN), total carbon (TC), SOC, and SIC (mean R² > 0.8) were higher with MIR than with Vis-NIR [38]. However, this advantage is not universal. Some studies report better performance using NIR, depending on the soil attribute and modeling approach. For instance, van Groenigen et al. [39] observed improved prediction of soil exchangeable cations using NIR coupled with PLSR compared with diffuse reflectance infrared Fourier transform (DRIFT) MIR. This discrepancy may be attributed to limitations of linear models such as PLSR in capturing the complex, nonlinear information often present in MIR spectra, as well as to the sensitivity of MIR measurements to particle size and sample preparation. These results indicate that the ideal spectral region for accurately predicting soil properties varies with the soil attribute under study, because soil attributes differ in chemical and structural composition and therefore interact with light at different wavelengths.
Prediction accuracy is also influenced by instrumental variability, as different laboratories use spectrometers from different brands, with varying spectral ranges, sensors, and resolutions. Knadel et al. [40] demonstrated this effect by using different spectrometers with five distinct spectral regions to predict soil clay and SOC content, reporting a significant difference in prediction accuracy of clay (R² = 0.68–0.77) and SOC (R² = 0.35–0.59), highlighting the crucial role of spectral range selection on prediction accuracy.
Spectral resolution further contributes to this variability. Gomez et al. [41] reported improved clay prediction accuracy at resolutions between 5 and 100 nm, with reduced performance at 200 nm, suggesting that higher resolution enhances the capture of diagnostic absorption features. Conversely, other studies found that resolution had little impact on prediction accuracy [40,42], likely because advanced preprocessing techniques and robust modeling algorithms can extract predictive information from lower-resolution spectra. In addition to spectral range and resolution, factors such as calibration datasets, instrument type, sample texture, preparation, and thickness can influence spectral responses.
This demonstrates that no single spectral region is universally optimal, as soil attributes interact with radiation differently across wavelengths. Variations in instrumental configuration, spectral range, and resolution substantially affect the accuracy of soil predictions. Thus, robust calibration strategies and instrument-specific optimization are essential for reliable soil spectroscopy.

2.3. Factors Affecting the Spectrum

Infrared spectroscopy has contributed significantly to sustainable agriculture by facilitating the prediction of soil properties; however, its predictive accuracy can be influenced by physical, chemical, and environmental factors [43,44]. Soil particle size and OM strongly influence the spectral expression of soil minerals by reducing reflectance intensities or masking absorption features [45,46,47], thereby decreasing spectral sensitivity and the accuracy of mineral assessment. Fine particles generally enhance spectral sensitivity and prediction accuracy for cation exchange capacity (CEC), TC, TN, OC, Olsen P, sand, silt, and clay due to improved soil–light contact, whereas coarse particles reduce reflectance quality [45]. This challenge can be mitigated through standardized sample preparation, including drying, grinding, and sieving, as well as by applying scatter-correction and derivative preprocessing techniques to minimize particle-size effects. Similarly, OM can mask mineral absorption features, particularly in clay and sandy soils, thereby reducing their intensities [48] and lowering prediction accuracy. This limitation can be mitigated by using MIR spectroscopy, which yields stronger fundamental vibrational signals, and by employing multivariate calibration models that can separate overlapping organic and mineral spectral contributions. Although IR spectroscopy has been used for on-site measurements of various soil properties, the feasibility and efficiency of such predictions could be improved by enhancing calibration robustness. Several factors, including topography, parent materials, vegetation cover, and atmospheric conditions, significantly affect soil properties and ultimately alter spectral responses.
Topography affects soil physicochemical properties by regulating water drainage and land exposure to sunlight. For example, variations in slope and aspect influence soil development and moisture regimes, which in turn affect soil properties [49] and may modify spectral responses. The impact of topographic variability can be reduced by incorporating topographic covariates or terrain-based stratification into calibration models. Similarly, parent materials play a key role in shaping the soil’s chemical composition and structure [50], which ultimately affects its spectral characteristics. They also influence soil carbon (C) and the mineralization of soil nitrogen (N) [51,52]. Several soil attributes, including soil organic matter (SOM), calcium carbonate (CaCO₃), soil texture, pH, and bulk density, vary across parent materials [53,54] and may exhibit distinct spectroscopic responses. Xu et al. [54] evaluated the effects of four parent materials on the prediction of SOM using Vis-NIR spectroscopy. They showed that soils derived from shale exhibited higher prediction accuracy than those formed from red sandstone, river alluvium, and Quaternary red clay, owing to differences in mineralogy and SOM among parent materials.
Vegetation cover indirectly affects spectral predictions by modifying soil properties through nutrient cycling and microbial activity [55,56,57,58]. Wang et al. [59] found that different types of vegetation affect soil carbon and nitrogen levels. However, vegetation cover predominantly affects the spectral response by masking the soil surface, yielding mixed spectral signals from plant residues and soil rather than from the soil alone, which leads to interpretation errors.
Temperature is another factor affecting instrument stability and molecular vibrations, particularly those of the O–H bonds in water, leading to changes in spectral absorbance and prediction accuracy [60,61]. Consequently, spectra of soils with higher moisture content are more sensitive to temperature fluctuations, which reduces model robustness and accuracy. This effect can be addressed by controlling measurement conditions, applying temperature-correction algorithms, or incorporating temperature variability into calibration datasets. All these factors must be considered when building an effective model for soil characterization.
Soil moisture strongly influences spectral reflectance, with reflectance decreasing as moisture content increases [62]. Soil moisture also masks soil carbon features because its reflectance pattern resembles that of soil carbon [63,64]. It broadens the O–H stretching absorption band, which interferes with the prediction of carbon and soil minerals. Consequently, spectra should be corrected for soil moisture before a model is built, for example through moisture-standardized measurements, spectral normalization, or external parameter orthogonalization (EPO).
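For reference, EPO is commonly formulated as a projection that removes the spectral directions most affected by the external factor, here moisture. The following is a sketch of that general formulation rather than the procedure of any specific study cited here; the symbols are introduced for illustration and assume that the same samples were scanned both moist and dry, with spectra stored as rows.

```latex
\begin{aligned}
D &= X_{\mathrm{moist}} - X_{\mathrm{dry}}
  && \text{(difference spectra of the same samples)} \\
D^{\top} D &= V S V^{\top}
  && \text{(eigendecomposition of the moisture-induced variation)} \\
Q &= V_{c} V_{c}^{\top}
  && \text{(subspace spanned by the first } c \text{ eigenvectors)} \\
P &= I - Q
  && \text{(projection orthogonal to that subspace)} \\
X^{*} &= X P
  && \text{(corrected spectra used for calibration and prediction)}
\end{aligned}
```

The number of retained components c is a tuning choice, typically selected by cross-validation.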

2.4. Spectral Preprocessing

Instrument setup, sample handling, and environmental conditions can introduce errors in spectral measurements [65]; therefore, spectral data must be preprocessed before chemometric modeling to ensure accurate prediction of soil properties. Various preprocessing techniques are used in NIR and MIR spectroscopy, and the common steps are illustrated in Figure 2. Raw spectra obtained from the spectrometer are first trimmed to remove noisy or non-informative regions and then smoothed to reduce random noise while preserving spectral features, thereby improving data quality for model development [66]. The standard normal variate (SNV) transformation is then applied to correct baseline shifts and scatter caused by physical differences among samples [66]. The preprocessed spectra are used to develop an effective model, with the dataset split into two parts: (1) 70% for calibration, and (2) 30% for validation. All analyses are performed using R version 4.1.1.
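The steps described above (trimming, smoothing, SNV, and a 70/30 split) can be sketched in a few lines. This is an illustrative Python sketch rather than the R workflow referred to in the text; the function names, the 5-point window, and the synthetic spectrum are assumptions, and the smoothing weights are the standard tabulated 5-point quadratic Savitzky–Golay coefficients.

```python
import random
from statistics import mean, stdev

# 5-point quadratic Savitzky-Golay smoothing weights (standard tabulated values)
SG_SMOOTH_5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def trim(wavelengths, spectrum, low, high):
    """Keep only the bands whose wavelength lies in [low, high]."""
    kept = [(w, v) for w, v in zip(wavelengths, spectrum) if low <= w <= high]
    return [w for w, _ in kept], [v for _, v in kept]


def sg_smooth(spectrum):
    """5-point Savitzky-Golay smoothing; the two points at each edge are left unchanged."""
    out = list(spectrum)
    for i in range(2, len(spectrum) - 2):
        window = spectrum[i - 2:i + 3]
        out[i] = sum(c * v for c, v in zip(SG_SMOOTH_5, window))
    return out


def snv(spectrum):
    """Standard normal variate: centre and scale a spectrum by its own mean and SD."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(v - m) / s for v in spectrum]


def split_70_30(n_samples, seed=0):
    """Randomly assign sample indices to calibration (70%) and validation (30%)."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    cut = round(0.7 * n_samples)
    return idx[:cut], idx[cut:]


# Synthetic example: one spectrum sampled every 10 nm from 350 to 2500 nm
wavelengths = list(range(350, 2501, 10))
raw = [0.2 + 0.0001 * w for w in wavelengths]
kept_wl, kept = trim(wavelengths, raw, 400, 2450)  # drop the noisy edges
processed = snv(sg_smooth(kept))
calibration_idx, validation_idx = split_70_30(100)
```

SNV is applied per spectrum, so it needs no reference to the rest of the dataset, whereas the split is applied per sample.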
External factors such as moisture can reduce prediction accuracy by increasing light absorption, an effect that can be mitigated through preprocessing [67]. Nocita et al. [68] reported that increased moisture levels adversely affected SOC prediction using UV-Vis-NIR spectroscopy because moisture absorbs more light, thereby interfering with spectral data. Advanced preprocessing, such as EPO, has been shown to reduce the effects of moisture and thus improve SOC prediction accuracy [69]. This finding highlights the key role of preprocessing in extracting hidden information from spectral data by minimizing errors. Various pre-treatment methods have been applied to identify the most effective treatment for improving the prediction of soil properties, although no single method is universally optimal. Researchers therefore need to test different preprocessing treatments, each of which has a specific effect. The main treatments applied to spectral data are Savitzky–Golay (SG) smoothing to reduce noise [70], derivatives to correct the baseline [71,72,73], SNV and multiplicative scatter correction (MSC) to eliminate scatter effects [74,75], orthogonal signal correction (OSC) to remove uninformative variation [76], normalization to reduce sample preparation error [77], and mean centering to adjust the intercept [78].
Although selecting the optimal preprocessing treatment is difficult prior to model validation, it is advisable to minimize preprocessing to reduce model complexity. This is because applying incorrect or excessive preprocessing steps can eliminate valuable information and increase the risk of overfitting [75].
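Two of the treatments listed above, MSC and mean centering, operate on the whole set of spectra rather than on one spectrum at a time. The sketch below illustrates the usual formulation of each under simple assumptions (spectra stored as lists of equal length, the mean spectrum used as the MSC reference); the function names are hypothetical.

```python
from statistics import mean


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    Each spectrum x is regressed on the mean spectrum m (x ~ a + b * m)
    and then corrected as (x - a) / b.
    """
    n_bands = len(spectra[0])
    ref = [mean(s[j] for s in spectra) for j in range(n_bands)]
    ref_mean = mean(ref)
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        slope = sum((r - ref_mean) * (v - s_mean) for r, v in zip(ref, s)) / ref_ss
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected


def mean_center(spectra):
    """Subtract the per-band (column) mean from every spectrum."""
    n_bands = len(spectra[0])
    col_means = [mean(s[j] for s in spectra) for j in range(n_bands)]
    return [[v - m for v, m in zip(s, col_means)] for s in spectra]


# Synthetic example: the same underlying shape with different offsets and gains
shape = [1.0, 2.0, 4.0, 3.0]
spectra = [
    [0.1 + 0.8 * v for v in shape],
    [0.3 + 1.2 * v for v in shape],
    [-0.1 + 1.0 * v for v in shape],
]
scatter_corrected = msc(spectra)  # offsets and gains removed
centered = mean_center(spectra)   # each band now has zero mean
```

Because both treatments depend on dataset-level statistics, the reference spectrum and band means estimated on the calibration set are normally reused unchanged for validation samples.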

2.5. Chemometric Models

A chemometric model is a mathematical and statistical framework for solving specific problems by processing large input datasets and generating meaningful outputs. Researchers worldwide have applied a range of algorithms in soil science, predominantly in developed countries [79]. Frequently used algorithms for multivariate calibration include PLSR, principal component regression (PCR), multiple linear regression (MLR), support vector machines (SVM), K-nearest neighbors (KNN), random forest (RF), Cubist, and artificial neural networks (ANNs). These algorithms identify spectral patterns and perform multivariate calibration based on the characteristics of the dataset.
Model performance depends strongly on data structure and complexity. For instance, PLSR is effective with linear and collinear variables, whereas RF and ANN can handle complex datasets. Similarly, SVM can handle high-dimensional data, and Cubist is useful for modeling complex non-linear relationships. Advanced deep learning methods, including ANNs, convolutional neural networks (CNNs), long short-term memory (LSTM) networks, and autoencoders, have proven highly effective for extracting valuable information from large, high-dimensional datasets because they can learn complex, non-linear relationships. For example, Vis-NIR spectroscopy combined with a multi-CNN using one- and two-dimensional convolutions achieved higher prediction accuracy for OC, N, and clay (R² = 0.83–0.95) than traditional PLSR (R² = 0.50–0.54) [80], while an LSTM-CNN-attention model outperformed PLSR, support vector regression (SVR), and RF for the prediction of OC, N, CaCO₃, and pH (R² = 0.91–0.95) when applied to the LUCAS soil dataset [81]. Despite their advantages, deep learning models require more processing time and large training datasets [82,83,84], which restricts their widespread use.
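To make concrete why PLSR copes with collinear spectral variables, the following minimal sketch fits a single-response PLS model (PLS1) by successive deflation: each component is the direction in the spectra that covaries most with the remaining response. It is a teaching sketch in plain Python under stated assumptions (the function names and the tiny synthetic dataset are hypothetical), not the implementation used in any cited study; in practice an established library would be used.

```python
from math import sqrt


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, n_components):
    """Fit PLS1 on mean-centred data, extracting one component at a time."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred spectra
    f = [v - y_mean for v in y]                                # centred response
    components = []
    for _ in range(n_components):
        w = [_dot(col, f) for col in zip(*E)]  # covariance of each band with f
        norm = sqrt(_dot(w, w))
        if norm < 1e-12:                       # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [_dot(row, w) for row in E]        # scores
        tt = _dot(t, t)
        load = [_dot(col, t) / tt for col in zip(*E)]  # X loadings
        q = _dot(f, t) / tt                            # y loading
        E = [[e - ti * lj for e, lj in zip(row, load)] for row, ti in zip(E, t)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        components.append((w, load, q))
    return {"x_mean": x_mean, "y_mean": y_mean, "components": components}


def pls1_predict(model, x):
    """Predict one sample by repeating the same deflation on its spectrum."""
    e = [v - m for v, m in zip(x, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, load, q in model["components"]:
        t = _dot(e, w)
        y_hat += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return y_hat


# Synthetic example: y = 1 + 2 * x1 - x2, two "bands", five samples
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 4.0]]
y = [1.0, 4.0, 2.0, 6.0, 7.0]
model = pls1_fit(X, y, n_components=2)
prediction = pls1_predict(model, [6.0, 2.0])
```

With real spectra the number of components is much smaller than the number of bands and is usually chosen by cross-validation.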
Dataset imbalance remains a critical but often overlooked issue in machine-learning-based soil spectroscopy. Data imbalance not only affects the training of machine learning models but also influences their predictive performance, as predictive accuracy is often biased toward the dominant class and poor for the minority class [85]. For example, in a soil classification study using an RF model, Aydın et al. [86] found that high-plasticity clay dominated the samples (71%), whereas high-plasticity silt and clayey sand were underrepresented; performance improved when the dataset was balanced using the synthetic minority oversampling technique (SMOTE). Another study employed an RF model to predict soil texture, improving accuracy from 44% to 59% by balancing the dataset with SMOTE [87]. Neyestani et al. [88] also reported that imbalanced data adversely affected model performance for digital mapping of soil classes, and that the performance improved when an oversampling method was applied to the underrepresented soil class. These findings indicate that addressing dataset imbalance requires resampling methods (e.g., SMOTE) together with rigorous sampling strategies that cover a broad range of soil classes, parent materials, and textures relevant to the study area. Furthermore, stratified cross-validation can evaluate model performance across underrepresented soil groups, and domain-specific feature selection (e.g., selecting wavelengths associated with chemical constituents rather than overall soil type) can limit the model’s learning of misleading patterns and reduce the effect of dataset imbalance.
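The idea behind SMOTE is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below is a simplified illustration of that idea, not the reference implementation and not the code of the cited studies; the function name, the default of three neighbours, and the example data are assumptions.

```python
import random
from math import dist


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` within the minority class (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Synthetic example: four minority-class samples described by two features
minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority_class, n_new=6)
```

Oversampling should be applied to the calibration set only, so that validation samples remain genuine measurements.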
To identify the use of chemometric models in the past ten years, a search of the Scopus database (https://www.scopus.com/search/form.uri?display=basic#basic (accessed on 17 January 2025)) was conducted for each chemometric model used for soil analysis (e.g., ‘partial least squares regression for soil analysis using spectroscopy’, entered in the ‘search documents’ window). The search was restricted to the article title, abstract, and keywords, and the ‘year’ field was limited to 2014 to 2024. Figure 3 shows the use of algorithms in soil studies over this period, based on data retrieved from the Scopus database, highlighting that PLSR was the most frequently used algorithm for predicting soil properties between 2014 and 2024, with the highest use observed in 2022 (n = 125). The use of the SVM algorithm increased from 2017 to 2024. Random forest (RF) is the second most popular algorithm, and its use increased from 2018 to 2024, possibly due to its prediction accuracy, followed by MLR, ANN, CNN, PCR, Cubist, KNN, LSTM, and autoencoders. Several articles may currently be in the publication pipeline, suggesting that interest in and application of PLSR and RF in this field will continue to grow. The longer processing time and the need for larger datasets could be possible reasons for the lower use of deep learning methods.

2.6. Model Validation

Model validation is crucial in machine learning, as it helps detect both underfitting and overfitting during model development and enhances reliability in real-world applications. Among validation techniques, cross-validation is widely used in machine learning and statistical modeling, primarily to evaluate model performance and generalizability. In this method, data are split into training and testing sets to obtain a robust model by balancing variance and bias [89]. Several validation techniques are available, including leave-one-out cross-validation (LOOCV), k-fold cross-validation, repeated k-fold cross-validation, holdout validation, and stratified cross-validation. Among these, repeated cross-validation, LOOCV, and k-fold cross-validation are routinely used in machine learning. LOOCV is a form of k-fold cross-validation in which k equals the total number of observations in the dataset [90]. Each observation is used once as the test set, while the remaining observations serve as the training set, and the process continues until every observation in the dataset has been used once for validation. Although computationally expensive for large datasets, LOOCV provides a nearly unbiased error estimate for small datasets [91].
On the other hand, k-fold cross-validation splits the dataset into k equally sized folds: k − 1 folds are used for calibration, and the remaining fold is used for validation. This process is repeated k times, ensuring that each fold serves as the validation set exactly once. This method is computationally less expensive, is suitable for large datasets, and yields lower variance than LOOCV [92]. Repeated k-fold cross-validation performs k-fold cross-validation multiple times using distinct data splits, thereby building on the k-fold approach to yield a more stable result. Compared with a single k-fold cross-validation, this method reduces variance by averaging the results across all runs, yielding a more reliable estimate of model performance [93]. In addition, random sampling with data splitting (calibration: 70%, validation: 30%) is often used, particularly for small datasets, as it can reduce variance and improve model performance. The choice of validation method depends on the model, the time and cost available, and the dataset size.
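The splitting schemes described in this section differ only in how sample indices are partitioned. The following sketch shows one way to generate the index sets for k-fold, leave-one-out, and repeated k-fold cross-validation; the function names and the seeding scheme are illustrative assumptions.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (calibration, validation) index lists for k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k folds whose sizes differ by at most one
    for i, validation in enumerate(folds):
        calibration = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield calibration, validation


def loocv_indices(n_samples):
    """Leave-one-out is k-fold with k equal to the number of samples."""
    return k_fold_indices(n_samples, n_samples)


def repeated_k_fold_indices(n_samples, k, repeats, seed=0):
    """Repeat k-fold with a different shuffle on every repeat."""
    for r in range(repeats):
        yield from k_fold_indices(n_samples, k, seed=seed + r)


# Example: 5-fold cross-validation on 10 samples
for calibration, validation in k_fold_indices(10, k=5):
    pass  # fit the model on `calibration`, evaluate it on `validation`
```

Averaging the validation error over all folds (and over all repeats, in the repeated variant) gives the cross-validated estimate of model performance.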

2.7. Evaluating Model Accuracy

Chemometric models are used for both regression (estimation of soil properties) and classification (categorization of soil samples); therefore, performance assessment is an essential step to ensure their reliability in practical applications. For classification, the available metrics include precision, F1 score, accuracy, sensitivity, and specificity [74]. Specificity is the true negative rate, that is, the proportion of actual negatives that are correctly recognized [94]. It reflects the share of samples that do not belong to a specific class and are correctly classified as such by the model. Higher specificity indicates that the model more accurately rejects samples that fall outside the class.
Sensitivity quantifies the proportion of actual positives that the model correctly detects [74]. Accuracy measures the proportion of all samples that the model predicts correctly. Precision quantifies the proportion of predicted positive samples that are genuinely positive [94]. The F1 score is the harmonic mean of precision and recall (sensitivity) [95]. For regression models, performance is assessed using statistical indicators that describe explanatory power and prediction error. Commonly used metrics include the coefficient of determination (R²), coefficient of variation (CV), standard error of prediction (SEP), ratio of performance to deviation (RPD), bias, mean squared error (MSE), mean absolute error (MAE), and root mean squared error (RMSE) [74,96,97,98]. Each metric captures a different aspect of predictive performance: R² reflects the proportion of variance explained by the model, whereas RMSE is the square root of the mean squared difference between predicted and actual values. In spectroscopy studies, R² remains the most widely used metric for evaluating model performance across various soil properties [96].
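The metrics named in this section follow directly from a confusion matrix (classification) or from the residuals between observed and predicted values (regression). The sketch below uses the common textbook definitions; the function names and example numbers are illustrative, and RPD is computed here as the standard deviation of the observed values divided by RMSE, which is one of several conventions in use.

```python
from math import sqrt
from statistics import mean, stdev


def classification_metrics(tp, fp, tn, fn):
    """Metrics derived from the counts of a binary confusion matrix."""
    sensitivity = tp / (tp + fn)  # recall: actual positives that were detected
    specificity = tn / (tn + fp)  # actual negatives that were correctly rejected
    precision = tp / (tp + fp)    # predicted positives that are truly positive
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)  # harmonic mean
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


def regression_metrics(observed, predicted):
    """Explanatory power and prediction error of a regression model."""
    residuals = [p - o for o, p in zip(observed, predicted)]
    obs_mean = mean(observed)
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    rmse = sqrt(ss_res / len(observed))
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": rmse,
        "mae": mean(abs(r) for r in residuals),
        "bias": mean(residuals),
        "rpd": stdev(observed) / rmse,  # one common convention for RPD
    }


# Illustrative numbers only
cls = classification_metrics(tp=40, fp=5, tn=45, fn=10)
reg = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

Reporting an error metric such as RMSE alongside R² is useful because R² alone depends on the spread of the reference values in the validation set.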

3. Trends in the Use of NIR and MIR Spectroscopy in Soil Science

Figure 4 illustrates the application trends of NIR and MIR spectroscopy in soil science over the past decade, based on data retrieved from the Scopus database, and indicates their growing popularity. To identify these trends, a systematic search was conducted on the Scopus database (https://www.scopus.com/search/form.uri?display=basic#basic (accessed on 27 January 2025)) with the keywords ‘NIR spectroscopy for soil analysis’ and ‘MIR spectroscopy for soil analysis’. The search was restricted to article title, abstract, and keywords and limited to articles published from 2014 to 2024. The data showed that the use of NIR fluctuated from 2014 to 2024, with both upward and downward movements. The number of NIR studies on soil analysis rose sharply in 2016 (n = 66), decreased in 2017 (n = 53), and then increased steadily from 2018, peaking in 2024 (n = 114), which reflects the growing adoption of the technique in soil science.
Mid-infrared (MIR) usage is lower and relatively stable compared with NIR, fluctuating between 5 and 29 research articles per year from 2014 to 2024, with the highest number of publications (n = 29) in 2022. The comparatively limited adoption of MIR spectroscopy may be due to its high sensitivity to water content and the need for specific sample preparation [99]. Table 1 presents spectroscopy techniques (NIR, Vis-NIR, and MIR) and data fusion, along with commonly used multivariate calibration and preprocessing methods for predicting a wide range of soil properties.

3.1. Application of NIR and Vis-NIR Spectroscopy

Table 1 summarizes representative studies that employed NIR/Vis-NIR spectroscopy and data fusion methods, coupled with pre-treatment and chemometric models, to predict soil properties. As shown in Table 1, SOC has been the property most frequently predicted using Vis-NIR spectroscopy, with PLSR remaining the most widely used algorithm and generally providing good predictive performance across diverse datasets [100,101,102]. Performance varies considerably with the type of preprocessing, sample size, sample preparation, soil type, and validation strategy. For instance, Seema et al. [100] analyzed 280 soil samples from three soil types (Entisols, Inceptisols, and Alfisols) to predict SOC using Vis-NIR spectroscopy over the 350–2500 nm spectral range.
The authors applied six preprocessing treatments and four machine learning algorithms and found that PLSR achieved the highest SOC prediction accuracy, followed by RF, support vector regression (SVR), and multivariate adaptive regression splines (MARS), when raw reflectance was used rather than preprocessed spectra. This suggests that preprocessing is not universally beneficial, and its effectiveness depends on soil-specific spectral characteristics and the modeling approach.
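Because PLSR recurs throughout the studies discussed here, a compact illustration of the calibration step may be useful. The following is a minimal sketch of single-response PLS regression (PLS1) using the NIPALS algorithm, written with NumPy only; it is not the implementation used in any cited study, the function names pls1_fit and pls1_predict are hypothetical, and the data are simulated.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit PLS1 (one response variable) with the NIPALS algorithm."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean              # centered, later deflated
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                            # direction of max covariance with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:                         # nothing left to explain
            break
        w = w / norm
        t = Xr @ w                               # scores
        tt = t @ t
        p = Xr.T @ t / tt                        # X loadings
        c = (yr @ t) / tt                        # y loading
        Xr = Xr - np.outer(t, p)                 # deflate X
        yr = yr - c * t                          # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.column_stack(W), np.column_stack(P), np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)       # regression vector in original X space
    return {"x_mean": x_mean, "y_mean": y_mean, "coef": coef}

def pls1_predict(model, X):
    """Predict the response for new spectra."""
    return (np.asarray(X, dtype=float) - model["x_mean"]) @ model["coef"] + model["y_mean"]

# Simulated data: 40 "spectra" with 6 variables and an exactly linear response
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 6))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5, 0.0, 3.0, -1.0]) + 1.0
pls_model = pls1_fit(X_demo, y_demo, n_components=6)
y_hat = pls1_predict(pls_model, X_demo)
```

With as many components as variables, PLS1 reproduces the ordinary least-squares fit, which is why the simulated response is recovered here. On real spectra the number of components is usually much smaller than the number of wavelengths and is chosen by cross-validation.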
Similarly, Vis-NIR spectroscopy combined with PLSR and six preprocessing steps was applied to 132 soil samples to predict SOC, CEC, and clay content [102]. The study compared different preprocessing methods applied to raw spectra and showed that PLSR calibrated with the SG first derivative achieved the best prediction performance for SOC and clay, while moderate performance was obtained for CEC, highlighting soil-property-specific sensitivity to spectral transformations. Another study compared Vis-NIR and MIR spectroscopy for SOC prediction using 53 soil samples, demonstrating that the choice of preprocessing and spectral range strongly influences model performance [101]. Among the five preprocessing treatments, PLSR calibrated with the log10 (1/R) transformation yielded better results for Vis-NIR (R2 = 0.90) than for MIR (R2 = 0.85), suggesting that, in that dataset, the Vis-NIR region was more sensitive to SOC-related spectral features. In forest soil, samples from the 2–10 cm depth horizon yielded the most accurate SOC predictions across five distinct soil horizons when Vis-NIR spectroscopy was combined with an SVM model employing a radial basis kernel, calibrated using log10 (1/R) transformation and SG preprocessing [103]. These results demonstrate that relying on a single algorithm and preprocessing approach is inadequate for accurately predicting the same soil properties across soils, given their heterogeneity and complexity.
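The preprocessing treatments named above (log10 (1/R), SNV, and the SG first derivative) can be sketched as follows. This is an illustrative NumPy-only version under stated assumptions: the function names are hypothetical, the window length and polynomial order are arbitrary defaults rather than values from the cited studies, wavelengths are assumed to be evenly spaced, and the edges of the filtered spectra are zero-padded and therefore unreliable.

```python
import math
import numpy as np

def reflectance_to_absorbance(R):
    """Apparent absorbance log10(1/R); reflectance R must be strictly positive."""
    return np.log10(1.0 / np.asarray(R, dtype=float))

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def savgol_coefficients(window, polyorder, deriv=0):
    """Savitzky-Golay weights from a local least-squares polynomial fit."""
    half = window // 2
    z = np.arange(-half, half + 1, dtype=float)          # positions in the window
    A = np.vander(z, polyorder + 1, increasing=True)     # columns z^0, z^1, ...
    # Row `deriv` of pinv(A) returns the fitted coefficient of z^deriv
    return np.linalg.pinv(A)[deriv] * math.factorial(deriv)

def savgol_rows(spectra, window=11, polyorder=2, deriv=1):
    """Apply the SG filter along each spectrum (unit spacing between points)."""
    h = savgol_coefficients(window, polyorder, deriv)
    spectra = np.asarray(spectra, dtype=float)
    # np.convolve flips its kernel, so reverse h to obtain a correlation
    return np.array([np.convolve(row, h[::-1], mode="same") for row in spectra])

# Toy spectra for illustration: a straight line and a parabola over 50 points
idx = np.arange(50, dtype=float)
toy = np.vstack([2.0 * idx + 3.0, idx ** 2])
toy_deriv = savgol_rows(toy, window=11, polyorder=2, deriv=1)
toy_snv = snv(toy)
```

Away from the edges, the filtered straight line returns its slope and the parabola returns 2x, which is a quick check that the derivative weights are correct. A typical chain would apply the absorbance transform first and then SNV or the SG derivative, but the best order and combination are data-dependent, as the studies above indicate.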
Although the dry-combustion laboratory method is more accurate for SOC measurement, several studies have shown that spectroscopy offers a viable alternative. Ng et al. [127] reported comparable correlations among loss-on-ignition, dry combustion, and real-time NIR methods for SOC estimation. They observed similar positive correlations for determining SOC (R = 0.549–0.579) across methods, indicating that spectroscopy results were comparable to those obtained with the standard dry-combustion method. Semella et al. [128] found that Vis-NIR and MIR spectroscopy exhibited SOC reproducibility comparable to dry combustion, with MIR achieving accuracy (R2 = 0.98) similar to that of the reference method (R2 = 0.997–0.999), suggesting that spectroscopy offers a rapid and cost-effective alternative to the standard dry-combustion method.
Furthermore, studies have compared chemometrics and/or machine-learning techniques in Vis-NIR spectroscopy for estimating soil properties. Several researchers effectively predicted soil clay and CEC using Vis-NIR combined with PLSR [102,104,105]. In contrast, advanced machine learning models, including Cubist, RF, SVM, and MARS, outperformed PLSR in predicting complex soil attributes and in multi-depth calibration. Zhao et al. [105] compared Cubist and PLSR models for predicting sand, silt, clay, pH, and CEC using Vis-NIR and found that, under multi-depth calibration, Cubist achieved the highest prediction accuracy when combined with SG and SNV preprocessing. Consistent with these findings, another study evaluated four modeling approaches (RF, MARS, PLSR, and Cubist) with two preprocessing treatments to estimate eight soil properties (SOC, TN, total sulfur (TS), CEC, clay, sand, pH, and exchangeable calcium (Ex. Ca)) using Vis-NIR spectroscopy. They observed strong predictive accuracy with a Cubist model across all properties [106]. Beyond tree-based models, SVM approaches have also demonstrated strong predictive capability. A study integrating several multivariate methods with multiple preprocessing treatments over the 1200–2400 nm NIR range identified SVM combined with SNV as the most effective approach for SOM prediction [107].
Similarly, Vis-NIR spectroscopy coupled with SVMR and PLSR, using SG first-derivative and SNV preprocessing, showed strong predictive performance for SOC, whereas moderate to low accuracies were observed for avl. P, K, N, pH, and soil texture [108]. The strong prediction for SOC may be due to SOC exhibiting distinct patterns or relationships in the dataset, which SVMR can effectively capture. In contrast, the moderate and low accuracies for the other soil attributes may be attributable to less distinct patterns and greater noise in the dataset, which SVMR cannot capture [129].
Furthermore, Ramírez et al. [104] performed a comparative study of NIR and MIR spectroscopy for predicting soil properties in northern cold-region ecosystems. They reported that MIR spectroscopy combined with PLSR and the SG first derivative achieved superior performance in determining TOC and TN. Similarly, applying RF with the SG first derivative to MIR data improved the predictions of clay content. Conversely, NIR spectroscopy demonstrated limited predictive capability, achieving only moderate accuracy in estimating CEC via PLSR on reflectance data. The authors speculated that this result may be due to the soils’ heterogeneous parent materials, which make it difficult to accurately predict soil properties [104]. Researchers have also evaluated how different preprocessing treatments affect the accuracy of a single model. For example, Miloš et al. [102] evaluated multiple preprocessing methods with a PLSR model and reported strong predictions for SOC and clay, with moderate performance for CEC when SG first-derivative preprocessing was applied. In contrast, El-Sayed et al. [109] used a single preprocessing method (MSC) with multiple machine-learning models on Vis-NIR spectroscopy data. They demonstrated that RF performed best for pH and CaCO3, whereas ANN yielded superior predictions for EC. These findings highlight that prediction performance depends not only on the spectral range but also on the interaction between preprocessing techniques, modeling algorithms, and target soil properties.
Accurate characterization of soil variations is essential for agriculture but is challenged by the wide diversity of mineral compositions and particle-size distributions. Lucena et al. [110] demonstrated that chemometric-assisted portable NIR spectroscopy, combined with SVM, could classify soils based on mineralogy and particle size with up to 90% accuracy, indicating its potential for rapid field-based assessments.
Despite variability in performance across soil properties, Vis-NIR spectroscopy has proven robust for SOC estimation and soil classification when supported by optimized chemometrics. These findings underscore the need for property-specific modeling frameworks rather than universal spectral solutions. Nevertheless, Vis-NIR spectroscopy often provides lower predictive accuracy for quantitative soil properties because its absorption features mainly arise from overtones and combination bands, which are weaker and more overlapping than fundamental vibrations [130,131]. In contrast, MIR spectroscopy captures fundamental molecular vibrations, yields more distinct absorption features, and achieves higher prediction accuracy [132]. Consistent with this, comparative studies have shown that MIR outperforms Vis-NIR spectroscopy in predicting OC, N, and pH [133]. Given its superior accuracy and sensitivity for soil estimation, MIR spectroscopy has emerged as a promising method for soil assessment. The following section, therefore, focuses on the application of MIR spectroscopy for predicting soil properties.

3.2. Application of MIR Spectroscopy in Soil

Rapid and accurate soil diagnosis can be facilitated by developing effective models that combine MIR spectroscopy with machine learning algorithms, as MIR is highly sensitive to the molecular vibrations of soil minerals and OM. Several studies have demonstrated the superior performance of MIR spectroscopy for predicting soil properties across soil types, with key findings presented in Table 1. For example, Sabetizade et al. [111] reported high SOC prediction accuracy (R2 = 0.96) using an MIR-coupled Cubist model with latent variables in a semi-arid region, highlighting the potential of MIR spectroscopy in such environments. Similarly, the effect of anthropogenic activities on SOC in soil from Namoi Valley, Australia, was demonstrated using MIR spectroscopy [112]. The authors used multiple preprocessing treatments, including trimming, noise-to-signal ratio reduction, SG filtering, and SNV, to minimize signal errors. For model calibration, 151 soil samples were used to train the Cubist model, and 10-fold cross-validation yielded the best SOC prediction accuracy (R2 = 0.82); this might be due to MIR’s ability to capture fundamental vibrations of C–H, N–H, and O–H groups in OM.
Researchers have highlighted the flexibility of MIR spectroscopy in the agricultural field, demonstrating its ability to predict a wide range of soil properties. Several studies identified PLSR as a benchmark model, achieving reliable predictions for soil pH, SOC, Ca, iron (Fe), manganese (Mn), Mg, CaCO3, lead (Pb), texture, and lime requirement, particularly when combined with preprocessing techniques such as trimming, smoothing, SNV, and first-derivative transformations [113,114,115]. Nevertheless, prediction accuracy for certain nutrients, notably avl. P and K, remained limited across multiple models, ranking PLSR (avl. P: 0.38, K: 0.22) > SVM (avl. P: 0.37, K: 0.15) > RF (avl. P: 0.28, K: 0.05) [114]. Soil bulk density affects soil porosity, root penetration, and water-holding capacity; a high bulk density can impede plant root growth, air exchange, and water movement, underscoring its importance in agriculture. Sanderman et al. [116] used MIR and compared PLSR and memory-based learning (MBL) models for predicting soil health indicators, including CEC, bulk density, SOC, and texture, and found that MBL outperformed PLSR. Consistent results were also reported by Ng et al. [19], who successfully predicted 50 soil properties with high accuracy (accuracy fell within a to b categories) using a single MIR-coupled MBL model. Although prediction accuracy for an additional 40 soil properties was lower, this study indicates that MIR spectroscopy is a suitable alternative to traditional methods for large-scale soil assessment.
Shi et al. [117] compared the traditional pedo-transfer function with MIR spectra (600–4000 cm−1) coupled with machine-learning algorithms to estimate soil bulk density across different depth layers in Irish soils. The MIR-SVM model achieved a higher R2 (0.81) than the traditional pedo-transfer function and performed consistently across depths down to 50 cm, signifying the robustness of MIR-based approaches for vertical soil profiling. Additionally, MIR spectroscopy has contributed to the development of spectral libraries for soil classification, which support sustainable agriculture. For example, Sherif [118] developed a spectral library based on MIR, comprising over 80,000 soil samples collected across the United States. Partial least squares regression (PLSR) and random forest (RF) models, combined with first-derivative preprocessing, were applied to predict soil attributes such as OM, pH, TN, TC, P, K, Ca, and CEC. Most properties exhibited strong prediction performance, particularly with RF, although OM prediction remained comparatively weaker. Effective prediction of TN and SOM contents through MIR spectroscopy relies on the choice of variable selection techniques and multivariate regression methods. This was emphasized by Li et al. [119], who compared two multivariate methods (PLSR and MLR) and two spectral variable selection methods (stability competitive adaptive re-weighted sampling (sCARS) and the bootstrapping soft shrinkage approach (BOSS)) to predict TN and SOM using DRIFT-MIR. Their results showed that MLR combined with sCARS achieved greater prediction accuracy, with R2 values of 0.84 for TN and 0.72 for SOM.
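The k-fold cross-validation procedure mentioned above can be sketched generically. In the example below, a closed-form ridge regression stands in for the Cubist model (this substitution is an assumption made purely to keep the example self-contained), the function names are hypothetical, and the 151 samples are simulated to mirror the calibration set size quoted above rather than taken from real soil data.

```python
import numpy as np

def ridge_fit(X, y, alpha=1e-3):
    """Closed-form ridge regression on centered data (placeholder model)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]),
                           Xc.T @ (y - y_mean))
    return x_mean, y_mean, coef

def ridge_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean

def k_fold_predictions(X, y, k=10, seed=0, alpha=1e-3):
    """Return one out-of-fold prediction per sample."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)   # k disjoint test sets
    y_cv = np.empty(len(y))
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False                           # hold this fold out
        model = ridge_fit(X[train], y[train], alpha)
        y_cv[test_idx] = ridge_predict(model, X[test_idx])
    return y_cv

# Simulated data (random numbers, not soil spectra)
rng = np.random.default_rng(1)
X_sim = rng.normal(size=(151, 8))
y_sim = X_sim @ rng.normal(size=8) + 2.0
y_cv = k_fold_predictions(X_sim, y_sim, k=10)
r2_cv = 1.0 - np.sum((y_sim - y_cv) ** 2) / np.sum((y_sim - y_sim.mean()) ** 2)
```

Each sample is predicted by a model that never saw it during fitting, so the cross-validated R2 is a less optimistic estimate than the calibration R2. Any preprocessing that learns parameters from the data (for example, variable selection) should be refitted inside each fold to avoid leakage.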
Researchers increasingly prefer deep learning methods to traditional algorithms for handling large, complex datasets and achieving better performance. Adopting deep learning is particularly beneficial when the dataset is large, as deep learning models can automatically learn intricate patterns and interactions that are difficult for traditional methods to capture. For instance, Jakkan et al. [120] evaluated multiple models, including MBL, dropout sequential artificial neural networks (DrSeq-ANN), and RF, and found that logarithmic derivative preprocessing combined with DrSeq-ANN provided the most accurate predictions for TN, OC, K, and P. Another study by Nyawasha et al. [121] demonstrated that artificial neural network models applied to MIR data outperformed Vis-NIR-based prediction for various soil properties.
These studies highlight the inherent advantages of MIR spectroscopy for accurately predicting soil properties due to its sensitivity to molecular vibrations and interactions with soil constituents. MIR spectroscopy nevertheless requires appropriate pretreatment methods and modern algorithms to offer comprehensive solutions for large-scale soil assessment. Although it performs well across various physicochemical soil properties, it remains limited for certain soil nutrients and requires advanced computational modeling to fully exploit underlying molecular information. These challenges can be addressed through standardized sample preparation, advanced spectral preprocessing, variable selection, and the adoption of robust machine-learning and deep-learning algorithms supported by large spectral libraries.
Despite being less precise than dry combustion or an elemental analyzer, spectroscopy is gaining popularity due to its rapidity and nondestructive nature. Many studies have reported that spectroscopy is faster than traditional methods, but few have measured the actual analysis time against that of laboratory methods. Jeon et al. [7] compared MIR spectroscopy with a conventional laboratory method for fertilizer recommendations for various crops in Korea, and reported comparable results, with MIR analysis being approximately 12 times faster for evaluating 11 soil properties. Similarly, Janik et al. [134] showed that filter-press attenuated total reflectance (ATR)-based MIR spectroscopy reduced sample-preparation time and achieved good prediction accuracy for TOC (R2 = 0.89). These studies underscore the value of spectroscopy, which is narrowing the gap with traditional methods for carbon measurements.

4. Data Fusion Methods in Soil Analysis

Single-sensor methods, such as MIR and NIR, have been successfully used to determine soil physicochemical properties, as depicted in Table 1. Nevertheless, no single sensor can accurately characterize all soil properties, highlighting the limitations of single-sensor approaches [135]. To address this challenge, researchers have increasingly adopted spectral data fusion, which integrates information from multiple sensors to provide a more comprehensive representation of soil characteristics and improve prediction accuracy. For instance, Ludwig et al. [136] achieved higher prediction accuracies for SOC, TC, TN, and pH using a data fusion method based on outer product analysis (OPA) than with individual Vis-NIR and MIR methods. This ability has led to the adoption of data fusion for several soil characteristics (Table 1).
Data fusion methods are divided into three main categories: low-level, mid-level, and high-level fusion [137]. Low-level fusion directly combines entire datasets from multiple sensors without variable reduction. Using this approach, Li et al. [122] demonstrated improved predictive performance for TN, TK, and alkali-hydrolysable nitrogen (AN) using Vis-NIR and MIR relative to single-sensor approaches. A mid-level fusion strategy reduces the dimensionality of each dataset prior to data integration, thereby eliminating redundant data. This strategy often yields better predictive performance than low-level fusion; for example, mid-level fusion using spectral features selected by the least absolute shrinkage and selection operator (LASSO) improved prediction of TN, TP, AN, AP, and AK [122]. High-level data fusion, also referred to as decision fusion, combines predictions generated independently from each dataset. Li et al. [122] showed that high-level data fusion using Granger–Ramanathan averaging (GRA) further increased prediction accuracy for six soil nutrients (TN, TP, TK, avl. P, AN, and avl. K), outperforming low- and mid-level data fusion.
Despite these advantages, some studies have reported mixed performance results. Nyawasha et al. [121] found that MIR spectroscopy outperformed MIR-Vis-NIR spectral fusion, followed by Vis-NIR, for predicting soil properties, and Ng et al. [139] likewise observed superior performance of MIR compared with a fusion-based approach. Conversely, Johnson et al. [138] observed improved prediction accuracy for soil properties using data fusion compared with a single method. These contrasting findings indicate that data fusion does not universally improve prediction accuracy; its effectiveness may depend on factors such as soil type, pretreatment methods, and sample size.
Figure 5 illustrates trends in the application of data fusion methods for soil assessment from 2014 to 2024, based on data retrieved from the Scopus database. The search was conducted on the Scopus database (https://www.scopus.com/search/form.uri?display=basic#basic (accessed on 30 January 2025)) with the keyword “spectroscopy fusion method for soil analysis”, restricted to the article title, abstract, and keywords and limited to the years 2014 to 2024. The number of publications changed significantly from 2014 to 2021, and the use of data fusion has increased steadily from 2019 to 2024, which may be due to its increased accuracy and efficiency in predicting soil characteristics, which in turn supports farming practices.
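The three fusion levels described above can be contrasted in a short sketch. Several choices here are assumptions made for illustration and are not the procedure of Li et al. [122]: each block is scaled to unit total variance before concatenation, PCA scores replace LASSO-selected features in the mid-level step, and the Granger–Ramanathan weights are estimated by unconstrained least squares with an intercept. All function names are hypothetical and the data are simulated.

```python
import numpy as np

def block_scale(X):
    """Center a block and scale it to unit Frobenius norm (assumed weighting)."""
    Xc = X - X.mean(axis=0)
    return Xc / np.linalg.norm(Xc)

def low_level_fusion(X_a, X_b):
    """Low level: concatenate the full spectra of both sensors."""
    return np.hstack([block_scale(X_a), block_scale(X_b)])

def pca_scores(X, n_components):
    """Scores on the first principal components (stand-in feature extraction)."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

def mid_level_fusion(X_a, X_b, n_components=5):
    """Mid level: reduce each block first, then concatenate the features."""
    return np.hstack([pca_scores(X_a, n_components), pca_scores(X_b, n_components)])

def gra_weights(preds, y):
    """High level: regress observations on per-sensor predictions (with intercept)."""
    A = np.column_stack([np.ones(len(y)), preds])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def gra_combine(preds, w):
    """Weighted combination of the individual model outputs."""
    return w[0] + preds @ w[1:]

# Simulated stand-ins for Vis-NIR (200 bands) and MIR (300 bands) spectra
rng = np.random.default_rng(0)
X_vnir, X_mir = rng.normal(size=(60, 200)), rng.normal(size=(60, 300))
X_low = low_level_fusion(X_vnir, X_mir)
X_mid = mid_level_fusion(X_vnir, X_mir, n_components=5)

# Simulated per-sensor predictions and an observed value built from them
preds = rng.normal(size=(60, 2))
y_obs = 0.3 + 0.6 * preds[:, 0] + 0.4 * preds[:, 1]
w = gra_weights(preds, y_obs)
y_fused = gra_combine(preds, w)
```

The fused matrices X_low and X_mid would then be passed to a single calibration model, whereas the high-level route trains one model per sensor and only combines their outputs. In a real workflow the PCA loadings and the averaging weights should be estimated on calibration data and then applied unchanged to validation samples.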

Prediction of Soil Properties by Data Fusion

Spectroscopic techniques, such as Vis-NIR and MIR, have significant value in agriculture for predicting soil properties, either individually or through fusion approaches. Comparative studies consistently indicate that MIR spectroscopy achieves higher predictive accuracy than Vis-NIR for many soil attributes, and integrating these two techniques through data fusion may further increase their predictive accuracy. For instance, Hong et al. [123] compared individual Vis-NIR and MIR spectra with a Vis-NIR-MIR data fusion approach for predicting SOC content. Direct spectral concatenation and optimal band combination (OBC) were employed as fusion strategies, and the data were optimized using the continuous wavelet transform (CWT) before and after fusion (Vis-NIR-MIR). MIR (R2 = 0.45–0.64) outperformed NIR (R2 = 0.20–0.44) for predicting SOC. The highest SOC prediction accuracy (R2 = 0.66) was obtained using Vis-NIR-MIR fusion with OBC at CWT scale 1. Similarly, Karray et al. [124] evaluated soil properties, including OM, sand, clay, silt, CaCO3, and nitrate (NO3), using Vis-NIR field imaging spectra (IS), laboratory Vis-NIR spectra (LS), and their five combinations (C1-LS 100%, IS 0%; C2-LS 80%, IS 20%; C3-LS 50%, IS 50%; C4-LS 20%, IS 80%, and C5-LS 0%, IS 100%). Among the tested combinations, C1 and C5 achieved higher prediction accuracy when the SVR model was applied, followed by PLSR. Their findings indicated that the spectrum fusion technique (C2) improved SOM prediction accuracy when the SVR model was used.
Xu et al. [125] predicted soil class using Vis-NIR, MIR, simple Vis-NIR-MIR fusion, and OPA Vis-NIR-MIR fusion. The MIR-based independent model exhibited better performance for soil classification (64.5%) than the NIR model (64.2%). The best soil class prediction was obtained by a model based on OPA data fusion (68.4%). The model constructed using simple Vis-NIR-MIR data fusion did not improve the prediction accuracy of soil class (61.1%) relative to individual models (64.5%), which could be attributed to the small sample size and the resulting overfitting.
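The difference between simple concatenation and outer product analysis (OPA) fusion can be made concrete. In the sketch below, which is an illustration under assumed conventions rather than the procedure of Xu et al. [125], each sample's Vis-NIR vector is multiplied with its MIR vector to form a matrix of all pairwise band products, which is then unfolded into one row. The function name and toy numbers are hypothetical.

```python
import numpy as np

def outer_product_fusion(X_a, X_b):
    """Per-sample outer product of two spectral blocks, unfolded to one row each."""
    X_a = np.asarray(X_a, dtype=float)
    X_b = np.asarray(X_b, dtype=float)
    n = X_a.shape[0]
    # result[i, j, k] = X_a[i, j] * X_b[i, k]; then flatten the (j, k) grid
    return np.einsum("ij,ik->ijk", X_a, X_b).reshape(n, -1)

# Toy example: one sample with 2 "Vis-NIR" bands and 3 "MIR" bands
fused = outer_product_fusion([[1.0, 2.0]], [[3.0, 4.0, 5.0]])

# Simple concatenation of the same toy spectra, for comparison
concatenated = np.hstack([[[1.0, 2.0]], [[3.0, 4.0, 5.0]]])
```

Concatenation yields p + q columns, while the outer product yields p × q columns that encode interactions between bands of the two sensors. The rapid growth in column count means that spectra may need to be reduced in resolution before this kind of fusion is practical.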
Soil structure, stability, and fertility are influenced by soil aggregate stability (AS) and by the sand, silt, and clay fractions, which can be estimated using a data fusion method. The Vis-NIR-MIR high-level data fusion method was used successfully to determine soil AS properties [126]. The authors compared the performance of four approaches (Vis-NIR, MIR, Vis-NIR-MIR spectral fusion, and data fusion by model output averaging (MOA)), each with a PLSR model, for predicting three aggregate stability indices (fast wetting (FW), slow wetting (SW), and mechanical breakdown (MB)). In independent validation, the MIR model showed greater predictive accuracy across the three AS indices than the NIR model. The Vis-NIR-MIR data fusion outperformed independent sensors in predicting AS for the FW index. The MOA-based fusion model surpassed Vis-NIR-MIR and independent sensors in predicting AS for the SW and MB indices. In addition to physical properties, soil nutrients such as TN, TK, and avl. N were more accurately predicted by the Vis-NIR-MIR fusion approach than by a single sensor with LOOCV [122]. The authors used PLSR and SVM models and three fusion strategies (low-level, mid-level, and high-level), and high-level fusion based on Granger–Ramanathan averaging (GRA) improved predictive performance across soil nutrients. These results suggest that the nature of the dataset, soil heterogeneity, fusion selection method, and fusion strategy all influence the prediction accuracy of soil properties. They also suggest that high-level data fusion tended to be the most accurate approach in these studies. Because this approach neither combines raw spectra nor relies on selected features, it avoids the noise and overfitting risk associated with low-level fusion and the loss of key information during feature extraction in mid-level data fusion.
As summarized in Table 1, prediction accuracy varies widely and depends strongly on the methods and spectral ranges used, the chemometric models employed, the datasets, and the sampling depth. Larger datasets, the MIR method, and the data fusion approach tend to exhibit greater predictive accuracy. Vis-NIR spectroscopy has demonstrated good prediction accuracy for SOC, particularly when linked with models such as PLSR, SVM, and RF. It is frequently used in combination with preprocessing methods such as SG first-derivative, SNV, or log10 (1/R).
Significant challenges in soil characterization include spectral redundancy, noise, soil heterogeneity, and limited model transferability. These can be addressed when data are paired with an advanced algorithm and high-level data fusion techniques such as Granger–Ramanathan averaging or model output averaging.

5. Laboratory vs. Field Scenarios

In addition to laboratory applications, spectroscopy has been widely applied in field studies, enabling rapid on-site soil characterization [140]. Vis-NIR spectroscopy is predominantly used for in situ soil analysis, although its application is constrained by its reliance on overtone and combination bands. In contrast, recent advances in portable MIR spectroscopy have gained significant attention, as the MIR region captures fundamental vibrations of major soil constituents [141]. Laboratory-based and field-based spectroscopy offer distinct advantages. Laboratory-based spectroscopy is conducted under controlled environmental conditions, with additional sample preparation such as air-drying, fine grinding, and sieving, which helps mitigate environmental and physical interferences, thereby improving prediction accuracy [131,142,143]. For instance, Vis-NIR studies have reported higher SOC prediction accuracy in dry soil than in wet soil [144]. Wijewardane et al. [145] further demonstrated that fine grinding enhanced MIR spectroscopy-based prediction accuracy for TC, OC, TN, CEC, pH, clay, sand, and silt compared with unground samples. Despite these advantages, laboratory spectroscopy requires sample collection, transportation, and extensive preparation [141,146], which increases labor demands and costs.
In contrast, portable spectroscopy overcomes these limitations by enabling rapid field measurements with minimal sample handling [147,148], facilitating the acquisition of dense spatial data for soil variability mapping. Numerous studies have demonstrated the effectiveness of in situ spectroscopy; MIR-based field measurement achieved good predictive accuracy (R2 > 0.50) for SOM, water content, bulk density, Ca, Mg, and CEC [148], while in situ Vis-NIR spectroscopy successfully predicted SOM and clay with reasonable accuracy [149]. Silva et al. [150] further demonstrated that portable MIR spectroscopy can approach the accuracy of laboratory spectroscopy when supported by calibration transfer techniques, with spiking identified as effective.
Nevertheless, in situ soil assessment still faces significant challenges, including moisture content, soil roughness, and sample heterogeneity, which affect spectroscopic performance [151,152,153]. Greenberg et al. [154] reported that MIR outperforms Vis-NIR for predicting OC and TN under dry conditions, although both techniques perform worse in wet soils. Implementing advanced correction and calibration-transfer techniques, such as piecewise direct standardization, external parameter orthogonalization, generalized least squares weighting, orthogonal signal correction, and spiking, can help mitigate these environmental effects and improve field-based predictions [69,155,156,157,158]. Although portable spectrometers have made significant advances, they still underperform relative to laboratory benchtop spectrometers. Hutengs et al. [159] reported lower SOC predictive accuracy using MIR and NIR spectroscopy under field conditions relative to laboratory-dried samples, likely due to soil heterogeneity and soil moisture content. The narrow spectral range of portable spectrometers further limits their predictive accuracy [160]. In terms of instrument cost, laboratory spectrometers are generally more expensive than portable ones [160].
No single spectroscopic method consistently achieves higher prediction accuracy across soil attributes; rather, performance depends on the soil property of interest and the soil conditions, and it must be evaluated across different algorithms and spectral preprocessing strategies. Some studies reported higher predictive performance for NIR than MIR, whereas others found that MIR outperformed NIR. Similarly, some studies reported that data fusion is more accurate for predicting soil properties, whereas others reported that a single method outperforms data fusion. Overall, several challenges remain in spectroscopic soil characterization, including the effects of moisture, surface roughness, sample heterogeneity, and the limited spectral range of portable spectrometers, all of which reduce predictive accuracy. These challenges can be addressed through laboratory sample preparation, optimized preprocessing, advanced correction and calibration-transfer methods for portable spectroscopy, and the development of soil-specific calibration models. Furthermore, integrating laboratory and field spectra via domain adaptation may further enhance prediction accuracy and improve the reliability of soil spectroscopy across various measurement conditions.
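Spiking, the calibration-transfer technique mentioned above, can be illustrated with a short sketch: a small number of local (field) samples with reference values are added to an existing laboratory calibration library, optionally repeated to give them extra weight, before the model is refitted. The random selection of spiking samples, the weighting by replication, and all names below are assumptions for illustration and are not the procedure of Silva et al. [150].

```python
import numpy as np

def spike_calibration(X_lib, y_lib, X_local, y_local, n_spike=10, weight=1, seed=0):
    """Augment a library calibration set with a few (optionally replicated) local samples.

    Returns the spiked calibration set, the indices of the local samples used
    for spiking, and the indices left over for independent validation.
    """
    rng = np.random.default_rng(seed)
    n_spike = min(n_spike, len(y_local))
    spike_idx = rng.choice(len(y_local), size=n_spike, replace=False)
    rest_idx = np.setdiff1d(np.arange(len(y_local)), spike_idx)
    # Replicating the spiking samples gives them more influence on the fit
    X_spk = np.repeat(X_local[spike_idx], weight, axis=0)
    y_spk = np.repeat(y_local[spike_idx], weight)
    X_cal = np.vstack([X_lib, X_spk])
    y_cal = np.concatenate([y_lib, y_spk])
    return X_cal, y_cal, spike_idx, rest_idx

# Simulated library (50 samples) and local field set (20 samples), 4 variables each
rng = np.random.default_rng(2)
X_lib, y_lib = rng.normal(size=(50, 4)), rng.normal(size=50)
X_loc, y_loc = rng.normal(size=(20, 4)), rng.normal(size=20)
X_cal, y_cal, spike_idx, rest_idx = spike_calibration(
    X_lib, y_lib, X_loc, y_loc, n_spike=5, weight=3)
```

The local samples not used for spiking (rest_idx) remain available to validate the refitted model on the target conditions. How many spiking samples are needed, and how they are best selected and weighted, depends on the dataset.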

6. Advances in Spectroscopy

Detailed maps of soil spatial variability are crucial for precision agriculture [161]. Over the last few decades, soil spectroscopy, particularly Vis-NIR and MIR, has gained popularity in digital soil mapping and soil analysis due to its rapidity and improved accuracy; the high initial cost of spectroscopy equipment nevertheless restricts its adoption in agriculture [9,43,162].
To address this limitation, innovative developments, such as small, portable, and affordable spectrometers based on micro-electromechanical systems (MEMS), have been introduced for both laboratory and field settings. The evolution of micro, nano, and chip-scale spectrometers, along with their growing compatibility with portable platforms such as smartphones, vehicles, and drones, has further accelerated this trend. As a result, miniature and low-cost spectrometers are expected to occupy a rapidly expanding market in the coming years. For example, Hamamatsu launched a miniature ultra-lightweight spectrometer (0.3 g), operating in the 650 to 1050 nm range, suitable for analyzing small components. Similarly, NeoSpectra-Micro launched an integrated NIR spectral sensor with a wavelength range of 1250 to 2500 nm and a resolution of 16 nm, enabling both qualitative and quantitative materials analysis [163]. Samsung patented an embedded IR spectrometry system in a Galaxy S11 smartphone for applications such as CO2 monitoring and skin moisture assessment [164]. These examples illustrate the rapid pace of spectrometer development.
In the agricultural field, several researchers have adopted these systems owing to their compact size and affordability. Priori et al. [160] evaluated the performance of a MEMS-based Neospectra Scanner (1350 to 2500 nm) for predicting soil properties, including SOC, sand, silt, clay, and CaCO3, and reported accuracy comparable to that of a benchtop Vis-NIR spectrometer (350 to 2500 nm), despite slightly reduced performance attributed to the narrow spectral range. Similarly, Salazar et al. [165] demonstrated reliable prediction of SOC and exchangeable cations using a miniaturized portable NIR spectrometer (NeoSpectra-Module 2.5) in Mediterranean soils of central Chile. Another study used two miniaturized Fourier Transform Near-Infrared (FT-NIR) spectrometers, a laboratory-based analyzer and a field-portable analyzer, to predict soil attributes. The miniaturized, field-portable FT-NIR spectrometer effectively predicted total nitrogen [166]. These portable, miniature spectrometers could become a promising solution for cost-effective, field-based soil analysis with minimal compromise in predictive performance.

7. Challenges in Implementation

As covered in the above sections, extensive research has demonstrated spectroscopy’s promising role in determining soil physicochemical properties with reasonable accuracy. Despite their several advantages, numerous challenges hinder their implementation, particularly the effects of water on their prediction accuracy. Soil moisture interferes with NIR and MIR results because it occupies the air volume within the soil and reflects the bands due to overtones of OH, CO3, SO4, H2O, CH4, and CO2, as well as fundamental vibrations of OH stretching and OH bending. In soil, water exists in three forms—water bound to the mineral lattice, water on the surface of the clay, and water in soil pores—which can complicate soil analysis using spectroscopic methods. Previous studies showed that variations in moisture content substantially influence NIR spectra, often reducing prediction accuracy for SOC [68,156]. For example, poor SOC prediction performance was reported when calibration was developed using moist samples rather than dry soil, highlighting the importance of moisture-aware calibration strategies [64,68]. Furthermore, fluctuating moisture levels in soil samples complicate calibration of NIR and MIR instruments, suggesting that calibration models for moisture content should be reassessed frequently, and proper calibration is required to forecast soil parameters effectively.
To mitigate the effects of moisture, correction approaches such as external parameter orthogonalization (EPO) have been proposed and shown to improve prediction accuracy by minimizing moisture-related spectral variation [68,69]. However, EPO requires a larger dataset to achieve stable results, because a small dataset increases the risk of overfitting, which can lower accuracy.
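As a rough illustration of how an EPO-type correction works, the following minimal sketch builds a projection matrix from paired spectra of the same samples scanned dry and moist, then projects out the dominant moisture-related directions. It is a sketch under stated assumptions, not the implementation used in the cited studies: the function names, the synthetic data, and the choice of one removed component are all hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def epo_projection(spectra_dry, spectra_wet, n_components=2):
    """Return an EPO-style projection matrix of shape (n_bands, n_bands).

    Rows of both inputs must be the same samples, scanned dry and moist.
    """
    # Difference spectra: variation attributed to the external parameter (moisture).
    diff = np.asarray(spectra_wet, dtype=float) - np.asarray(spectra_dry, dtype=float)
    # Right singular vectors of diff are the eigenvectors of diff.T @ diff.
    _, _, vt = np.linalg.svd(diff, full_matrices=False)
    v = vt[:n_components].T  # (n_bands, n_components)
    # Projector onto the subspace orthogonal to the moisture directions.
    return np.eye(diff.shape[1]) - v @ v.T

def apply_epo(spectra, projection):
    """Project spectra so that the removed moisture directions vanish."""
    return np.asarray(spectra, dtype=float) @ projection

# Synthetic demonstration (hypothetical data, not real soil spectra).
rng = np.random.default_rng(0)
dry = rng.normal(size=(20, 50))                   # 20 samples, 50 bands
moisture_signature = rng.normal(size=50)          # assumed spectral effect of water
moisture_level = rng.uniform(0.1, 0.4, size=(20, 1))
wet = dry + moisture_level * moisture_signature   # rank-one moisture effect

P = epo_projection(dry, wet, n_components=1)
residual = np.abs(apply_epo(wet, P) - apply_epo(dry, P)).max()
print(f"largest dry/wet difference after projection: {residual:.2e}")
```

In practice, the calibration model would be trained on the projected dry spectra and applied to projected field spectra. The number of removed components is a tuning choice that is usually selected by validation, which is one reason a sufficiently large paired dataset matters.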
Recent advances, including Bayesian-optimized dynamic moisture correction methods, have further improved the prediction of soil nitrogen and organic matter across various models [167]. In addition to multivariate calibration methods, preprocessed NIR and MIR spectra could be combined with data from additional sensors, such as dielectric moisture sensors, to address this issue and improve the precision of soil property predictions.
Another challenge is soil heterogeneity, as soil properties vary significantly across landscapes, depths, and management practices. Such variability complicates the development of a reliable universal calibration model for spectroscopy. Even within the same sampling site, differences in texture and mineralogy can alter spectral responses and reduce prediction accuracy if these variations are not adequately represented in calibration datasets. Xu et al. [54] reported that Vis-NIR prediction accuracy for SOM varied significantly depending on parent material, underscoring the need for representative and diverse calibration samples.
Sample preparation and instrumental variability further affect the reproducibility of spectroscopic analysis. Although spectroscopy generally requires less sample preparation than conventional laboratory methods, inconsistent particle size, drying methods (air-dried vs. oven-dried), and sieving protocols can introduce variability in spectral measurements. Finer particles have been shown to yield more accurate predictions due to improved spectral homogeneity [45]. Furthermore, instrument configuration (spectral resolution, wavelength range, and detector type) and measurement protocols vary across laboratories, resulting in poor reproducibility [16]. Consequently, results from one laboratory may not be directly comparable with those from another.
The transferability of calibration models also remains a significant limitation in spectroscopy. Calibration models are typically valid only for the datasets on which they are trained, and transferring them across regions or instruments is challenging because of variations in soil types, soil conditions, and instrument types. A soil spectral library acts as a soil fingerprint that links measured spectral data to reference soil properties, but it is often region-specific because soil properties vary regionally. As a result, establishing a national library for each country can be costly and time-consuming. Consequently, advanced machine learning methods, standardized protocols, and collaborative soil library initiatives are required for the successful implementation of spectroscopic soil assessment.
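The paragraph above does not prescribe a particular transfer method. Purely as an illustration of what transferring spectra between instruments can involve, the sketch below uses direct standardization, a generic chemometric technique named here as an assumption rather than as the approach of any cited study. A set of transfer samples is scanned on both a primary and a secondary instrument, and a least-squares matrix maps secondary spectra into the primary instrument's space. All names and the synthetic distortion are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def fit_direct_standardization(spectra_secondary, spectra_primary):
    """Least-squares matrix F such that spectra_secondary @ F approximates spectra_primary."""
    return np.linalg.pinv(np.asarray(spectra_secondary, dtype=float)) @ np.asarray(
        spectra_primary, dtype=float
    )

# Synthetic demonstration (hypothetical data): the secondary instrument is
# modelled as a mild linear distortion of the primary instrument's response.
rng = np.random.default_rng(1)
n_bands = 10
distortion = np.eye(n_bands) + 0.05 * rng.normal(size=(n_bands, n_bands))

# Transfer samples measured on both instruments.
primary_std = rng.normal(size=(30, n_bands))
secondary_std = primary_std @ distortion
F = fit_direct_standardization(secondary_std, primary_std)

# New samples measured only on the secondary instrument.
primary_new = rng.normal(size=(5, n_bands))       # ground truth, normally unknown
secondary_new = primary_new @ distortion
transferred = secondary_new @ F

print("max error before transfer:", np.abs(secondary_new - primary_new).max())
print("max error after transfer: ", np.abs(transferred - primary_new).max())
```

Real instruments differ in noisier and often nonlinear ways, and real spectra have far more bands than transfer samples, so practical use typically needs regularization or a piecewise variant. The sketch only shows the shape of the problem that shared protocols and shared libraries aim to reduce.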
From an economic perspective, the initial investment in spectroscopic instruments is high, ranging from approximately USD 150,000 to 500,000 depending on the instrument type, software, optical configuration, and features [168]. For example, NIR spectrometers are available as portable models for field studies and benchtop models for laboratory use, with costs ranging from USD 5000 to 25,000 and USD 60,000 to 100,000, respectively [169]. Nevertheless, their running costs tend to be modest compared with wet-chemistry methods because the measurements are non-destructive, minimizing consumables and waste-disposal costs [170]. For instance, studies showed that replacing the primary laboratory method with NIR in the quality control process can recoup the NIR investment within 11 months by reducing labor, chemical, and analysis costs [171]. This suggests that, despite the high upfront investment, spectroscopy provides long-term economic benefits and is a reliable alternative to wet-chemistry methods.
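The payback reasoning above can be made concrete with a simple calculation: the payback period is the investment divided by the monthly saving, where the saving is the per-sample cost difference multiplied by the monthly sample throughput. The sketch below is only a back-of-the-envelope aid. The function name and every input value are hypothetical and were chosen to reproduce an 11-month payback; they are not figures from the cited studies.

```python
def payback_months(investment_usd, wet_cost_per_sample, nir_cost_per_sample, samples_per_month):
    """Months needed for per-sample savings to cover the instrument investment."""
    monthly_saving = (wet_cost_per_sample - nir_cost_per_sample) * samples_per_month
    if monthly_saving <= 0:
        raise ValueError("NIR must be cheaper per sample for the investment to pay back")
    return investment_usd / monthly_saving

# Hypothetical inputs: a USD 66,000 benchtop instrument, USD 25 per sample by
# wet chemistry, USD 5 per sample by NIR, and 300 samples analysed per month.
months = payback_months(66_000, 25, 5, 300)
print(f"estimated payback period: {months:.1f} months")
```

The result is most sensitive to sample throughput: a laboratory that analyses few samples per month recovers the investment far more slowly than a high-volume quality-control laboratory.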

8. Conclusions and Perspective

Spectroscopy coupled with machine learning, chemometrics, and spectral pre-treatment offers a powerful approach for estimating soil attributes, which is crucial for maintaining soil fertility and health. Traditional methods, including wet extraction and elemental analyzers, measure soil properties with higher accuracy; however, they require chemical reagents and longer analysis times. Spectroscopy is an alternative that saves time (up to 12-fold) and cost by estimating multiple soil properties from a single scan. These advantages make it more suitable for large-scale soil assessment and precision agriculture. However, the successful application of spectroscopic methods in on-site soil analysis presents notable challenges. Prediction accuracy varies significantly with factors such as vegetation cover, soil moisture, particle size, soil heterogeneity, soil dataset, calibration models, preprocessing methods, and instrument types. Some studies showed that data fusion approaches achieve higher prediction accuracy for soil properties, whereas others report lower accuracy. Other studies showed that prediction accuracy can be improved by using larger datasets. Furthermore, standardizing the dataset for each spectroscopy method is necessary to accurately predict soil properties and ensure data reproducibility.
Portable NIR and emerging portable MIR spectrometers are available for on-site soil analysis, indicating that on-site soil assessment can be achieved with reasonable accuracy. However, high initial investment costs continue to hinder large-scale implementation in farming practice. Although miniaturized spectrometers can overcome this limitation by providing an affordable alternative, their limited spectral range and resolution often compromise prediction accuracy, which needs to be addressed. Future research should focus on advanced sensor technologies and their integration with data-driven technologies. Integrating portable spectroscopy with remote sensing, Internet of Things (IoT)-based cloud sensors, blockchain, digital twins, and proximal sensing technologies such as Veris-like systems (e.g., the MSP3 soil scanner, USA) would enhance data integrity, transparency, and real-time field applicability, supporting instant decision-making tools for farmers and sustainable agriculture.

Author Contributions

Conceptualization, writing—original draft, writing—review and editing, G.D.V., S.J., H.J.J. and W.N.; writing—original draft, writing—review and editing, J.-J.Y., J.-H.P., J.-H.S., S.H.K. (Seong Heon Kim), K.K., A.R. and S.H.K. (So Hui Kim); validation, supervision, funding acquisition, S.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Rural Development Administration (RDA), South Korea, grant numbers PJ015558 and PJ017283.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study, as it is a literature review. Data sharing is not applicable to this article.

Acknowledgments

The authors gratefully acknowledge the Rural Development Administration (RDA) for providing facilities for this work.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
IR: Infrared
UV-Vis: Ultraviolet-Visible
NIR: Near-infrared
Vis-NIR: Visible-Near-infrared
MIR: Mid-infrared
NMR: Nuclear magnetic resonance
IS: Imaging spectra
LS: Laboratory Vis-NIR spectra
FT-NIR: Fourier transform near-infrared
FT-IR: Fourier transform infrared
ATR: Attenuated total reflectance
SDGs: Sustainable development goals
nm: Nanometer
cm: Centimeter
N: Nitrogen
P: Phosphorus
K: Potassium
TK: Total potassium
avl. N: Available nitrogen
avl. P: Available phosphorus
avl. K: Available potassium
Ca: Calcium
Mg: Magnesium
Fe: Iron
Mn: Manganese
Pb: Lead
OM: Organic matter
OC: Organic carbon
SOC: Soil organic carbon
SOM: Soil organic matter
TOC: Total organic carbon
SIC: Soil inorganic carbon
IC: Inorganic carbon
CEC: Cation exchange capacity
EC: Electrical conductivity
TN: Total nitrogen
TC: Total carbon
TS: Total sulfur
CaCO3: Calcium carbonate
NO3: Nitrate
SNV: Standard normal variate
SG: Savitzky–Golay
MSC: Multiplicative scatter correction
EPO: External parameter orthogonalization
PLSR: Partial least squares regression
MBL: Memory-based learning
PCR: Principal component regression
MLR: Multiple linear regression
SVM: Support vector machine
SVMR: Support vector machine regression
SVR: Support vector regression
SMOTE: Synthetic minority oversampling technique
KNN: K-nearest neighbors
RF: Random forest
MARS: Multivariate adaptive regression splines
DrSeq-ANN: Dropout sequential artificial neural network
ANNs: Artificial neural networks
CNNs: Convolutional neural networks
LSTM: Long short-term memory
LOOCV: Leave-one-out-cross-validation
OPA: Outer product analysis
MOA: Model output averaging
OBC: Optimal band combination
CWT: Continuous wavelet transform
LASSO: Least absolute shrinkage and selection operator
GRA: Granger–Ramanathan averaging
FW: Fast wetting
SW: Slow wetting
MB: Mechanical breakdown
R2: Coefficient of determination
CV: Coefficient of variation
SEP: Standard error of prediction
RPD: Ratio to performance deviation
MSE: Mean squared error
MAE: Mean absolute error
RMSE: Root mean squared error
MEMS: Micro-electromechanical systems
IoT: Internet of Things

References

  1. Fang, K.; Kou, D.; Wang, G.; Chen, L.; Ding, J.; Li, F.; Yang, G.; Qin, S.; Liu, L.; Zhang, Q.; et al. Decreased Soil Cation Exchange Capacity Across Northern China’s Grasslands Over the Last Three Decades. J. Geophys. Res. Biogeosci. 2017, 122, 3088–3097. [Google Scholar] [CrossRef]
  2. Freidberg, S. Assembled but Unrehearsed: Corporate Food Power and the ‘Dance’ of Supply Chain Sustainability. J. Peasant Stud. 2020, 47, 383–400. [Google Scholar] [CrossRef]
  3. Handayani, I.P.; Hale, C. Healthy Soils for Productivity and Sustainable Development in Agriculture. IOP Conf. Ser. Earth Environ. Sci. 2022, 1018, 012038. [Google Scholar] [CrossRef]
  4. Lončarić, Z.; Karalić, K.; Popović, B.; Rastija, D.; Vukobratović, M. Total and Plant Available Micronutrients in Acidic and Calcareous Soils in Croatia. Cereal Res. Commun. 2008, 36, 331–334. Available online: https://www.croris.hr/crosbi/publikacija/prilog-casopis/150100 (accessed on 5 January 2025).
  5. Soil Degradation: The Silent Global Crisis|Heinrich Böll Stiftung|Brussels Office—European Union. Available online: https://eu.boell.org/en/SoilAtlas-soil-degradation (accessed on 18 November 2025).
  6. Soil Health: Key to Achieving the Sustainable Development Goals. Available online: https://jeas.agropublishers.com/2023/08/soil-health-key-to-achieving-sustainable-development-goals/ (accessed on 18 November 2025).
  7. Jeon, S.H.; Jang, H.J.; Ng, W.; Minasny, B.; Kim, S.H.; Shim, J.H.; Roh, A.; Kwon, S.i.; Yun, J.J. Predicting Soil Properties for Fertiliser Recommendation in South Korea Using MIR Spectroscopy. Geoderma Reg. 2024, 39, e00901. [Google Scholar] [CrossRef]
  8. Evangelista, S.J.; Francos, N.; Sharififar, A.; Ng, W.; Minasny, B.; McBratney, A.B. Advancing Soil Security with Soil Spectroscopy: The Efficient Estimation of Indicators. Soil Secur. 2025, 21, 100211. [Google Scholar] [CrossRef]
  9. Li, S.; Shen, X.; Shen, X.; Cheng, J.; Xu, D.; Makar, R.S.; Guo, Y.; Hu, B.; Chen, S.; Hong, Y.; et al. Improving the Accuracy of Soil Classification by Using Vis–NIR, MIR, and Their Spectra Fusion. Remote Sens. 2025, 17, 1524. [Google Scholar] [CrossRef]
  10. Shin, S.K.; Lee, S.J.; Park, J.H. Prediction of Soil Properties Using Vis-NIR Spectroscopy Combined with Machine Learning: A Review. Sensors 2025, 25, 5045. [Google Scholar] [CrossRef]
  11. Vyavahare, G.D.; Yun, J.-J.; Park, J.-H.; Shim, J.-H.; Kim, S.H.; Kim, K.; Roh, A.; Jang, H.J.; Jeon, S. Evaluating the Performance of NIR Spectroscopy in Predicting Soil Properties: A Comparative Study. Appl. Sci. 2025, 15, 13240. [Google Scholar] [CrossRef]
  12. Swan, T.; Jang, H.J.; Huang, Y.-C.; Fidelis, C.; Yinil, D.; Bala, B.; Das, B.S.; Field, D. Comparative Analysis of Vis-NIR and MIR Spectroscopy for Predicting Soil Properties and Identifying Minerals at Smallholder Cocoa Farms across Papua New Guinea. Soil Adv. 2026, 5, 100094. [Google Scholar] [CrossRef]
  13. Engelmann, L.; Bierl, R.; Kirchhoff, M.; Ries, J.B. Application of Mid-Infrared Spectroscopy for Soil Analysis in Calcareous Argania Spinosa Forests in Morocco. Geoderma Reg. 2025, 42, e00964. [Google Scholar] [CrossRef]
  14. Viscarra Rossel, R.A.; McGlynn, R.N.; McBratney, A.B. Determining the Composition of Mineral-Organic Mixes Using UV–Vis–NIR Diffuse Reflectance Spectroscopy. Geoderma 2006, 137, 70–82. [Google Scholar] [CrossRef]
  15. Genot, V.; Bock, L.; Dardenne, P.; Colinet, G. L’intérêt de La Spectroscopie Proche Infrarouge En Analyse de Terre (Synthèse Bibliographique). Biotechnol. Agron. Soc. Environ. 2014, 18, 247–261. [Google Scholar]
  16. Stenberg, B.; Viscarra Rossel, R.A.; Mouazen, A.M.; Wetterlind, J. Visible and Near Infrared Spectroscopy in Soil Science. Adv. Agron. 2010, 107, 163–215. [Google Scholar] [CrossRef]
  17. Linsler, D.; Sawallisch, A.; Höper, H.; Schmidt, H.; Vohland, M.; Ludwig, B. Near-Infrared Spectroscopy for Determination of Soil Organic C, Microbial Biomass C and C and N Fractions in a Heterogeneous Sample of German Arable Surface Soils. Arch. Agron. Soil Sci. 2017, 63, 1499–1509. [Google Scholar] [CrossRef]
  18. Lelago, A.; Bibiso, M. Performance of Mid Infrared Spectroscopy to Predict Nutrients for Agricultural Soils in Selected Areas of Ethiopia. Heliyon 2022, 8, e09050. [Google Scholar] [CrossRef]
  19. Ng, W.; Minasny, B.; Jeon, S.H.; McBratney, A. Mid-Infrared Spectroscopy for Accurate Measurement of an Extensive Set of Soil Properties for Assessing Soil Functions. Soil Secur. 2022, 6, 100043. [Google Scholar] [CrossRef]
  20. Nath, D.; Laik, R.; Meena, V.S.; Kumari, V.; Singh, S.K.; Pramanick, B.; Sattar, A. Strategies to Admittance Soil Quality Using Mid-Infrared (Mid-IR) Spectroscopy an Alternate Tool for Conventional Lab Analysis: A Global Perspective. Environ. Chall. 2022, 7, 100469. [Google Scholar] [CrossRef]
  21. Piccini, C.; Metzger, K.; Debaene, G.; Stenberg, B.; Götzinger, S.; Borůvka, L.; Sandén, T.; Bragazza, L.; Liebisch, F. In-Field Soil Spectroscopy in Vis–NIR Range for Fast and Reliable Soil Analysis: A Review. Eur. J. Soil Sci. 2024, 75, e13481. [Google Scholar] [CrossRef]
  22. Penner, M.H. Basic Principles of Spectroscopy. In Food Analysis; Springer: Berlin/Heidelberg, Germany, 2017; pp. 79–88. [Google Scholar] [CrossRef]
  23. Agnello, S. Spectroscopy for Materials Characterization; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2021; pp. 1–467. [Google Scholar] [CrossRef]
  24. Ozaki, Y. Infrared Spectroscopy—Mid-Infrared, Near-Infrared, and Far-Infrared/Terahertz Spectroscopy. Anal. Sci. 2021, 37, 1193–1212. [Google Scholar] [CrossRef]
  25. An, D.; Zhang, L.; Liu, Z.; Liu, J.; Wei, Y. Advances in Infrared Spectroscopy and Hyperspectral Imaging Combined with Artificial Intelligence for the Detection of Cereals Quality. Crit. Rev. Food Sci. Nutr. 2023, 63, 9766–9796. [Google Scholar] [CrossRef]
  26. Hammes, G.G. Spectroscopy for the Biological Sciences. Available online: https://books.google.co.kr/books?hl=ko&lr=&id=glECXyfF4dcC&oi=fnd&pg=PR5&dq=Applications+of+Spectroscopy+in+Science.&ots=ucjhwgfL6a&sig=mvyQhREYNIVvZ4HA5CvKiZEwNbY#v=onepage&q=ApplicationsofSpectroscopyinScience.&f=false (accessed on 14 January 2025).
  27. Janik, L.J.; Merry, R.H.; Skjemstad, J.O. Can Mid Infrared Diffuse Reflectance Analysis Replace Soil Extractions? Aust. J. Exp. Agric. 1998, 38, 681–696. [Google Scholar] [CrossRef]
  28. Du, C.; Zhou, J. Evaluation of Soil Fertility Using Infrared Spectroscopy: A Review. Environ. Chem. Lett. 2009, 7, 97–113. [Google Scholar] [CrossRef]
  29. van der Marel, H.W.; Beutelspacher, H. Atlas of Infrared Spectroscopy of Clay Minerals and Their Admixtures; Elsevier Scientific Publishing Company: Amsterdam, The Netherlands, 1976. [Google Scholar]
  30. Coates, J. Interpretation of Infrared Spectra, A Practical Approach. In Encyclopedia of Analytical Chemistry; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2000. [Google Scholar] [CrossRef]
  31. Ozaki, Y.; Huck, C.W.; Beć, K.B. Near-IR Spectroscopy and Its Applications. In Molecular and Laser Spectroscopy. Advances and Applications; Gupta, V.P., Ed.; Elsevier: San Diego, CA, USA, 2018; pp. 11–38. [Google Scholar] [CrossRef]
  32. Madari, B.E.; Reeves, J.B.; Machado, P.L.O.A.; Guimarães, C.M.; Torres, E.; McCarty, G.W. Mid- and near-Infrared Spectroscopic Assessment of Soil Compositional Parameters and Structural Indices in Two Ferralsols. Geoderma 2006, 136, 245–259. [Google Scholar] [CrossRef]
  33. Türker-Kaya, S.; Huck, C.W. A Review of Mid-Infrared and Near-Infrared Imaging: Principles, Concepts and Applications in Plant Tissue Analysis. Molecules 2017, 22, 168. [Google Scholar] [CrossRef] [PubMed]
  34. Clark, R.N.; King, T.V.V.; Klejwa, M.; Swayze, G.A.; Vergo, N. High Spectral Resolution Reflectance Spectroscopy of Minerals. J. Geophys. Res. Solid Earth 1990, 95, 12653–12680. [Google Scholar] [CrossRef]
  35. Ng, W.; Malone, B.; Minasny, B.; Jeon, S. Near and Mid Infrared Soil Spectroscopy. In Reference Module in Earth Systems and Environmental Sciences; Elsevier: Amsterdam, The Netherlands, 2023. [Google Scholar] [CrossRef]
  36. Madejová, J.; Pálková, H. Review of the Application of Infrared Spectroscopy in Studies of Acid-Treated Clay Minerals. Clays Clay Miner. 2024, 72, e30. [Google Scholar] [CrossRef]
  37. Shi, Z.; Yin, J.; Li, B.; Sun, F.; Miao, T.; Cao, Y.; Shi, Z.; Chen, S.; Hu, B.; Ji, W. Comparison of Depth-Specific Prediction of Soil Properties: MIR vs. Vis-NIR Spectroscopy. Sensors 2023, 23, 5967. [Google Scholar] [CrossRef]
  38. Gozukara, G.; Hartemink, A.E.; Huang, J.; Demattê, J.A.M. Prediction Accuracy of PXRF, MIR, and Vis-NIR Spectra for Soil Properties—A Review. Soil Sci. Soc. Am. J. 2025, 89, e70028. [Google Scholar] [CrossRef]
  39. Van Groenigen, J.W.; Mutters, C.S.; Horwath, W.R.; Van Kessel, C. NIR and DRIFT-MIR Spectrometry of Soils for Predicting Soil and Crop Parameters in a Flooded Field. Plant Soil 2003, 250, 155–165. [Google Scholar] [CrossRef]
  40. Knadel, M.; Stenberg, B.; Deng, F.; Thomsen, A.; Greve, M.H. Comparing Predictive Abilities of Three Visible-Near Infrared Spectrophotometers for Soil Organic Carbon and Clay Determination. J. Near Infrared Spectrosc. 2013, 21, 67–80. [Google Scholar] [CrossRef]
  41. Gomez, C.; Adeline, K.; Bacha, S.; Driessen, B.; Gorretta, N.; Lagacherie, P.; Roger, J.M.; Briottet, X. Sensitivity of Clay Content Prediction to Spectral Configuration of VNIR/SWIR Imaging Data, from Multispectral to Hyperspectral Scenarios. Available online: http://www.enmap.org/ (accessed on 21 August 2025).
  42. Viscarra Rossel, R.A.; Behrens, T.; Ben-Dor, E.; Brown, D.J.; Demattê, J.A.M.; Shepherd, K.D.; Shi, Z.; Stenberg, B.; Stevens, A.; Adamchuk, V.; et al. A Global Spectral Library to Characterize the World’s Soil. Earth-Sci. Rev. 2016, 155, 198–230. [Google Scholar] [CrossRef]
  43. Ahmadi, A.; Emami, M.; Daccache, A.; He, L. Soil Properties Prediction for Precision Agriculture Using Visible and Near-Infrared Spectroscopy: A Systematic Review and Meta-Analysis. Agronomy 2021, 11, 433. [Google Scholar] [CrossRef]
  44. Reeves, J.; McCarty, G.; Mimmo, T. The Potential of Diffuse Reflectance Spectroscopy for the Determination of Carbon Inventories in Soils. Environ. Pollut. 2002, 116, S277–S284. [Google Scholar] [CrossRef]
  45. Barra, I.; El Moatassem, T.; Kebede, F. Soil Particle Size Thresholds in Soil Spectroscopy and Its Effect on the Multivariate Models for the Analysis of Soil Properties. Sensors 2023, 23, 9171. [Google Scholar] [CrossRef]
  46. Stumpe, B.; Weihermüller, L.; Marschner, B. Sample Preparation and Selection for Qualitative and Quantitative Analyses of Soil Organic Carbon with Mid-Infrared Reflectance Spectroscopy. Eur. J. Soil Sci. 2011, 62, 849–862. [Google Scholar] [CrossRef]
  47. Janik, L.J.; Soriano-Disla, J.M.; Forrester, S.T.; McLaughlin, M.J. Moisture Effects on Diffuse Reflection Infrared Spectra of Contrasting Minerals and Soils: A Mechanistic Interpretation. Vib. Spectrosc. 2016, 86, 244–252. [Google Scholar] [CrossRef]
  48. Silvero, N.E.Q.; Di Raimo, L.A.D.L.; Pereira, G.S.; de Magalhães, L.P.; da Silva Terra, F.; Dassan, M.A.A.; Salazar, D.F.U.; Demattê, J.A.M. Effects of Water, Organic Matter, and Iron Forms in Mid-IR Spectra of Soils: Assessments from Laboratory to Satellite-Simulated Data. Geoderma 2020, 375, 114480. [Google Scholar] [CrossRef]
  49. Måren, I.E.; Karki, S.; Prajapati, C.; Yadav, R.K.; Shrestha, B.B. Facing North or South: Does Slope Aspect Impact Forest Stand Characteristics and Soil Properties in a Semiarid Trans-Himalayan Valley? J. Arid Environ. 2015, 121, 112–123. [Google Scholar] [CrossRef]
  50. Sztabkowski, K.; Jonczak, J. Parent Material Origin as a Factor Influencing the Development and Properties of Brunic Arenosols in a Young Glacial Landscape. CATENA 2025, 258, 109320. [Google Scholar] [CrossRef]
  51. Barré, P.; Durand, H.; Chenu, C.; Meunier, P.; Montagne, D.; Castel, G.; Billiou, D.; Soucémarianadin, L.; Cécillon, L. Geological Control of Soil Organic Carbon and Nitrogen Stocks at the Landscape Scale. Geoderma 2017, 285, 50–56. [Google Scholar] [CrossRef]
  52. Mao, X.; Van Zwieten, L.; Zhang, M.; Qiu, Z.; Yao, Y.; Wang, H. Soil Parent Material Controls Organic Matter Stocks and Retention Patterns in Subtropical China. J. Soils Sediments 2020, 20, 2426–2438. [Google Scholar] [CrossRef]
  53. Angst, G.; Messinger, J.; Greiner, M.; Häusler, W.; Hertel, D.; Kirfel, K.; Kögel-Knabner, I.; Leuschner, C.; Rethemeyer, J.; Mueller, C.W. Soil Organic Carbon Stocks in Topsoil and Subsoil Controlled by Parent Material, Carbon Input in the Rhizosphere, and Microbial-Derived Compounds. Soil Biol. Biochem. 2018, 122, 19–30. [Google Scholar] [CrossRef]
  54. Xu, S.; Shi, X.; Wang, M.; Zhao, Y. Effects of Subsetting by Parent Materials on Prediction of Soil Organic Matter Content in a Hilly Area Using Vis–NIR Spectroscopy. PLoS ONE 2016, 11, e0151536. [Google Scholar] [CrossRef]
  55. Richter, A.; Schöning, I.; Kahl, T.; Bauhus, J.; Ruess, L. Regional Environmental Conditions Shape Microbial Community Structure Stronger than Local Forest Management Intensity. For. Ecol. Manag. 2018, 409, 250–259. [Google Scholar] [CrossRef]
  56. Xiao, R.; Man, X.; Duan, B. Carbon and Nitrogen Stocks in Three Types of Larix Gmelinii Forests in Daxing’an Mountains, Northeast China. Forests 2020, 11, 305. [Google Scholar] [CrossRef]
  57. Spohn, M.; Klaus, K.; Wanek, W.; Richter, A. Microbial Carbon Use Efficiency and Biomass Turnover Times Depending on Soil Depth—Implications for Carbon Cycling. Soil Biol. Biochem. 2016, 96, 74–81. [Google Scholar] [CrossRef]
  58. Bargali, K.; Manral, V.; Padalia, K.; Bargali, S.S.; Upadhyay, V.P. Effect of Vegetation Type and Season on Microbial Biomass Carbon in Central Himalayan Forest Soils, India. Catena 2018, 171, 125–135. [Google Scholar] [CrossRef]
  59. Wang, Y.; Bao, H.; Kavana, D.J.; Li, Y.; Li, X.; Yan, L.; Xu, W.; Yu, B. Effects of Vegetation Types and Soil Properties on Regional Soil Carbon and Nitrogen in Salinized Reservoir Wetland, Northeast China. Plants 2023, 12, 3767. [Google Scholar] [CrossRef]
  60. Sun, Y.; Cai, W.; Shao, X. Chemometrics: An Excavator in Temperature-Dependent Near-Infrared Spectroscopy. Molecules 2022, 27, 452. [Google Scholar] [CrossRef] [PubMed]
  61. Ramadan, A.; Abatzoglou, N.; Gosselin, R. Addressing Temperature Variations of Miniaturized NIR Spectrometers: Advancing Quantitative Models for Pharmaceutical Analysis. J. Pharm. Biomed. Anal. 2025, 264, 116959. [Google Scholar] [CrossRef] [PubMed]
  62. Mcguirk, S.L.; Cairns, I.H. Relationships between Soil Moisture and Visible–NIR Soil Reflectance: A Review Presenting New Analyses and Data to Fill the Gaps. Geotechnics 2024, 4, 78–108. [Google Scholar] [CrossRef]
  63. Zhang, Z.; Ding, J.; Wang, J.; Ge, X. Prediction of Soil Organic Matter in Northwestern China Using Fractional-Order Derivative Spectroscopy and Modified Normalized Difference Indices. Catena 2020, 185, 104257. [Google Scholar] [CrossRef]
  64. Jiang, Q.; Chen, Y.; Guo, L.; Fei, T.; Qi, K. Estimating Soil Organic Carbon of Cropland Soil at Different Levels of Soil Moisture Using VIS-NIR Spectroscopy. Remote Sens. 2016, 8, 755. [Google Scholar] [CrossRef]
  65. Xu, X.; Xie, L.; Ying, Y. Factors Influencing near Infrared Spectroscopy Analysis of Agro-Products: A Review. Front. Agric. Sci. Eng. 2019, 6, 105–115. [Google Scholar] [CrossRef]
  66. Eslamifar, M.; Tavakoli, H.; Thiessen, E.; Kock, R.; Correa, J.; Hartung, E. Effective Spectral Pre-Processing Methods Enhance Accuracy of Soil Property Prediction by NIR Spectroscopy. Discov. Appl. Sci. 2025, 7, 896. [Google Scholar] [CrossRef]
  67. Pal, A.; Dubey, S.K.; Goel, S.; Kalita, P.K. Portable Sensors in Precision Agriculture: Assessing Advances and Challenges in Soil Nutrient Determination. TrAC Trends Anal. Chem. 2024, 180, 117981. [Google Scholar] [CrossRef]
  68. Nocita, M.; Stevens, A.; Noon, C.; Van Wesemael, B. Prediction of Soil Organic Carbon for Different Levels of Soil Moisture Using Vis-NIR Spectroscopy. Geoderma 2013, 199, 37–42. [Google Scholar] [CrossRef]
  69. Minasny, B.; McBratney, A.B.; Bellon-Maurel, V.; Roger, J.M.; Gobrecht, A.; Ferrand, L.; Joalland, S. Removing the Effect of Soil Moisture from NIR Diffuse Reflectance Spectra for the Prediction of Soil Organic Carbon. Geoderma 2011, 167–168, 118–124. [Google Scholar] [CrossRef]
  70. Advanced Preprocessing: Noise, Offset, and Baseline Filtering—Eigenvector Research Documentation Wiki. Available online: https://wiki.eigenvector.com/index.php?title=Advanced_Preprocessing:_Noise,_Offset,_and_Baseline_Filtering (accessed on 18 October 2024).
  71. Levillain, P.; Fompeydie, D. Derivative spectrophotometry principles, advantages and limitations, applications. Analysis 1986, 14, 1–20. [Google Scholar]
  72. Arakaki, L.S.L.; Burns, D.H. Multispectral Analysis for Quantitative Measurements of Myoglobin Oxygen Fractional Saturation in the Presence of Hemoglobin Interference. Appl. Spectrosc. 1992, 46, 1919–1928. [Google Scholar] [CrossRef]
  73. Ozaki, Y.; Mizuno, A.; Sato, H.; Kawauchi, K.; Muraishi, S. Biomedical Application of Near-Infrared Fourier Transform Raman Spectroscopy. Part I: The 1064-Nm Excited Raman Spectra of Blood and Met Hemoglobin. Appl. Spectrosc. 1992, 46, 533–536. [Google Scholar] [CrossRef]
  74. Mokari, A.; Guo, S.; Bocklitz, T. Exploring the Steps of Infrared (IR) Spectral Analysis: Pre-Processing, (Classical) Data Modelling, and Deep Learning. Molecules 2023, 28, 6886. [Google Scholar] [CrossRef]
  75. Rinnan, Å.; van den Berg, F.; Engelsen, S.B. Review of the Most Common Pre-Processing Techniques for near-Infrared Spectra. TrAC Trends Anal. Chem. 2009, 28, 1201–1222. [Google Scholar] [CrossRef]
  76. Kumar, M.; Suman, S.; Pugazhendi, S.; Dhamodharan, K.; Venkatesan, K.A. Orthogonal Signal Correction Assisted Multivariate Regression Approach for the Estimation of Uranium and Acidity in PUREX Process Streams. Talanta 2024, 280, 126673. [Google Scholar] [CrossRef]
  77. Mazdeyasna, S.; Arefin, M.S.; Fales, A.; Leavesley, S.J.; Pfefer, T.J.; Wang, Q. Evaluating Normalization Methods for Robust Spectral Performance Assessments of Hyperspectral Imaging Cameras. Biosensors 2025, 15, 20. [Google Scholar] [CrossRef] [PubMed]
  78. Advanced Preprocessing: Variable Centering—Eigenvector Research Documentation Wiki. Available online: https://wiki.eigenvector.com/index.php?title=Advanced_Preprocessing:_Variable_Centering (accessed on 18 October 2024).
  79. Padarian, J.; Minasny, B.; McBratney, A.B. Machine Learning and Soil Sciences: A Review Aided by Machine Learning Tools. SOIL 2020, 6, 35–52. [Google Scholar] [CrossRef]
  80. Li, R.; Yin, B.; Cong, Y.; Du, Z. Simultaneous Prediction of Soil Properties Using Multi_CNN Model. Sensors 2020, 20, 6271. [Google Scholar] [CrossRef] [PubMed]
  81. Liu, Y.; Shen, L.; Zhu, X.; Xie, Y.; He, S. Spectral Data-Driven Prediction of Soil Properties Using LSTM-CNN-Attention Model. Appl. Sci. 2024, 14, 11687. [Google Scholar] [CrossRef]
  82. Masri, D.; Woon, W.L.; Aung, Z. Soil Property Prediction: An Extreme Learning Machine Approach. Neural Inf. Process. 2015, 9490, 18–27. [Google Scholar] [CrossRef]
  83. Bonaccorso, G. Machine Learning Algorithms Reference Guide for Popular Algorithms for Data Science and Machine Learning; Packt Publishing: Birmingham, UK, 2017. [Google Scholar]
  84. Kamilaris, A.; Prenafeta-Boldú, F.X. Deep Learning in Agriculture: A Survey. Comput. Electron. Agric. 2018, 147, 70–90. [Google Scholar] [CrossRef]
  85. Johnson, J.M.; Khoshgoftaar, T.M. Survey on Deep Learning with Class Imbalance. J. Big Data 2019, 6, 27. [Google Scholar] [CrossRef]
  86. Aydın, Y.; Işıkdağ, Ü.; Bekdaş, G.; Nigdeli, S.M.; Geem, Z.W. Use of Machine Learning Techniques in Soil Classification. Sustainability 2023, 15, 2374. [Google Scholar] [CrossRef]
  87. Mallah, S.; Khaki, B.D.; Davatgar, N.; Scholten, T.; Amirian-Chakan, A.; Emadi, M.; Kerry, R.; Mosavi, A.H.; Taghizadeh-Mehrjardi, R. Predicting Soil Textural Classes Using Random Forest Models: Learning from Imbalanced Dataset. Agronomy 2022, 12, 2613. [Google Scholar] [CrossRef]
  88. Neyestani, M.; Sarmadian, F.; Jafari, A.; Keshavarzi, A.; Sharififar, A. Digital Mapping of Soil Classes Using Spatial Extrapolation with Imbalanced Data. Geoderma Reg. 2021, 26, e00422. [Google Scholar] [CrossRef]
  89. Bhagat, M.; Bakariya, B. A Comprehensive Review of Cross-Validation Techniques in Machine Learning. Int. J. Sci. Technol. 2025, 16, 1–4. [Google Scholar] [CrossRef]
  90. Berrar, D. Cross-Validation. Encycl. Bioinform. Comput. Biol. 2019, 1, 542–545. [Google Scholar] [CrossRef]
  91. Wani, F.; Rizvi, S.; Sharma, M.; Bhat, M. A Study on Cross Validation for Model Selection and Estimation. Int. J. Agric. Sci. 2018, 14, 165–172.
  92. James, G.; Witten, D.; Hastie, T.; Tibshirani, R. An Introduction to Statistical Learning; Springer Texts in Statistics; Springer: New York, NY, USA, 2021; ISBN 978-1-0716-1417-4.
  93. Kuhn, M.; Johnson, K. Applied Predictive Modeling; Springer: New York, NY, USA, 2013; pp. 1–600.
  94. Powers, D.M.W. Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness and Correlation. Int. J. Mach. Learn. Technol. 2011, 2, 37–63.
  95. Different Metrices in Machine Learning for Measuring Performance of Classification Algorithms|by Sachinsoni|Medium. Available online: https://medium.com/%40sachinsoni600517/different-metrices-in-machine-learning-for-measuring-performance-of-classification-algorithms-509e55c0a451 (accessed on 28 April 2025).
  96. Soriano-Disla, J.M.; Janik, L.J.; Viscarra Rossel, R.A.; MacDonald, L.M.; McLaughlin, M.J. The Performance of Visible, near-, and Mid-Infrared Reflectance Spectroscopy for Prediction of Soil Physical, Chemical, and Biological Properties. Appl. Spectrosc. Rev. 2014, 49, 139–186.
  97. Davies, A.M.C.; Fearn, T. Back to Basics: Calibration Statistics. Available online: https://www.spectroscopyeurope.com/td-column/back-basics-calibration-statistics (accessed on 28 April 2025).
  98. Miloš, B.; Bensa, A. Prediction of Soil Organic Carbon Using VIS-NIR Spectroscopy: Application to Red Mediterranean Soils from Croatia. Eurasian J. Soil Sci. 2017, 6, 365–373.
  99. Bellon-Maurel, V.; McBratney, A. Near-Infrared (NIR) and Mid-Infrared (MIR) Spectroscopic Techniques for Assessing the Amount of Carbon Stock in Soils—Critical Review and Research Perspectives. Soil Biol. Biochem. 2011, 43, 1398–1410.
  100. Seema; Ghosh, A.K.; Das, B.S.; Reddy, N. Application of VIS-NIR Spectroscopy for Estimation of Soil Organic Carbon Using Different Spectral Preprocessing Techniques and Multivariate Methods in the Middle Indo-Gangetic Plains of India. Geoderma Reg. 2020, 23, e00349.
  101. Olatunde, K.A. Estimation of Soil Organic Carbon Using Chemometrics: A Comparison between Mid-Infrared and Visible near Infrared Diffuse Reflectance Spectroscopy. West Afr. J. Appl. Ecol. 2021, 29, 1–11.
  102. Miloš, B.; Bensa, A.; Japundžić-Palenkić, B. Evaluation of Vis-NIR Preprocessing Combined with PLS Regression for Estimation Soil Organic Carbon, Cation Exchange Capacity and Clay from Eastern Croatia. Geoderma Reg. 2022, 30, e00558.
  103. Gholizadeh, A.; Viscarra Rossel, R.A.; Saberioon, M.; Borůvka, L.; Kratina, J.; Pavlů, L. National-Scale Spectroscopic Assessment of Soil Organic Carbon in Forests of the Czech Republic. Geoderma 2021, 385, 114832.
  104. Ramírez, P.B.; Calderón, F.J.; Jastrow, J.D.; Ping, C.L.; Matamala, R. Applying NIR and MIR Spectroscopy for C and Soil Property Prediction in Northern Cold-Region Ecosystems. Which Approach Works Better? Geoderma Reg. 2023, 32, e00617.
  105. Zhao, D.; Arshad, M.; Li, N.; Triantafilis, J. Predicting Soil Physical and Chemical Properties Using Vis-NIR in Australian Cotton Areas. Catena 2021, 196, 104938.
  106. Clingensmith, C.M.; Grunwald, S. Predicting Soil Properties and Interpreting Vis-NIR Models from across Continental United States. Sensors 2022, 22, 3187.
  107. Carvalho, J.K.; Moura-Bueno, J.M.; Ramon, R.; Almeida, T.F.; Naibo, G.; Martins, A.P.; Santos, L.S.; Gianello, C.; Tiecher, T. Combining Different Pre-Processing and Multivariate Methods for Prediction of Soil Organic Matter by near Infrared Spectroscopy (NIRS) in Southern Brazil. Geoderma Reg. 2022, 29, e00530.
  108. Singha, C.; Swain, K.C.; Sahoo, S.; Govind, A. Prediction of Soil Nutrients through PLSR and SVMR Models by VIs-NIR Reflectance Spectroscopy. Egypt. J. Remote Sens. Space Sci. 2023, 26, 901–918.
  109. El-Sayed, M.A.; Abd-Elazem, A.H.; Moursy, A.R.A.; Mohamed, E.S.; Kucher, D.E.; Fadl, M.E. Integration Vis-NIR Spectroscopy and Artificial Intelligence to Predict Some Soil Parameters in Arid Region: A Case Study of Wadi Elkobaneyya, South Egypt. Agronomy 2023, 13, 935.
  110. Lucena, P.G.C.; Aquino, R.V.S.; Sousa, J.E.S.; Souza Júnior, V.S.; Pacheco Filho, J.G.A.; Pereira, C.F. Mineral and Particle-Size Chemometric Classification Using Handheld near-Infrared Instruments for Soil in Northeast Brazil. Geoderma Reg. 2024, 38, e00819.
  111. Sabetizade, M.; Gorji, M.; Roudier, P.; Zolfaghari, A.A.; Keshavarzi, A. Combination of MIR Spectroscopy and Environmental Covariates to Predict Soil Organic Carbon in a Semi-Arid Region. Catena 2021, 196, 104844.
  112. Jang, H.J.; Dobarco, M.R.; Minasny, B.; Campusano, J.P.; McBratney, A. Assessing Human Impacts on Soil Organic Carbon Change in the Lower Namoi Valley, Australia. Anthropocene 2023, 43, 100393.
  113. Metzger, K.; Zhang, C.; Ward, M.; Daly, K. Mid-Infrared Spectroscopy as an Alternative to Laboratory Extraction for the Determination of Lime Requirement in Tillage Soils. Geoderma 2020, 364, 114171.
  114. Hati, K.M.; Sinha, N.K.; Mohanty, M.; Jha, P.; Londhe, S.; Sila, A.; Towett, E.; Chaudhary, R.S.; Jayaraman, S.; Coumar, M.V.; et al. Mid-Infrared Reflectance Spectroscopy for Estimation of Soil Properties of Alfisols from Eastern India. Sustainability 2022, 14, 4883.
  115. Mammadov, E.; Denk, M.; Mamedov, A.I.; Glaesser, C. Predicting Soil Properties for Agricultural Land in the Caucasus Mountains Using Mid-Infrared Spectroscopy. Land 2024, 13, 154.
  116. Sanderman, J.; Savage, K.; Dangal, S.R.S. Mid-Infrared Spectroscopy for Prediction of Soil Health Indicators in the United States. Soil Sci. Soc. Am. J. 2020, 84, 251–261.
  117. Shi, L.; O’Rourke, S.; de Santana, F.B.; Daly, K. Prediction of Soil Bulk Density in Agricultural Soils Using Mid-Infrared Spectroscopy. Geoderma 2023, 434, 116487.
  118. Sherif, F. Developing Spectral Libraries Using Mid Infrared Spectroscopy to Determine Key Soil Properties and Soil Health. Master's Thesis, Michigan State University, East Lansing, MI, USA, 2023; pp. 1–24.
  119. Li, H.; Wang, J.; Zhang, J.; Liu, T.; Acquah, G.E.; Yuan, H. Combining Variable Selection and Multiple Linear Regression for Soil Organic Matter and Total Nitrogen Estimation by DRIFT-MIR Spectroscopy. Agronomy 2022, 12, 638.
  120. Jakkan, D.A.; Ghare, P.; Sakode, C. Multi-Parameter Soil Property Prediction Incorporating Mid-Infrared Spectroscopy and Dropout Sequential Artificial Neural Network. Water Air Soil Pollut. 2023, 234, 694.
  121. Nyawasha, R.W.; Wadoux, A.M.J.C.; Todoroff, P.; Chikowo, R.; Falconnier, G.N.; Lagorsse, M.; Corbeels, M.; Cardinael, R. Multivariate Regional Deep Learning Prediction of Soil Properties from Near-Infrared, Mid-Infrared and Their Combined Spectra. Geoderma Reg. 2024, 37, e00805.
  122. Li, X.; Pan, W.; Li, D.; Gao, W.; Zeng, R.; Zheng, G.; Cai, K.; Zeng, Y.; Jiang, C. Can Fusion of Vis-NIR and MIR Spectra at Three Levels Improve the Prediction Accuracy of Soil Nutrients? Geoderma 2024, 441, 116754.
  123. Hong, Y.; Munnaf, M.A.; Guerrero, A.; Chen, S.; Liu, Y.; Shi, Z.; Mouazen, A.M. Fusion of Visible-to-near-Infrared and Mid-Infrared Spectroscopy to Estimate Soil Organic Carbon. Soil Tillage Res. 2022, 217, 105284.
  124. Karray, E.; Elmannai, H.; Toumi, E.; Gharbia, M.H.; Meshoul, S.; Aichi, H.; Ben Rabah, Z. Evaluating the Potentials of PLSR and SVR Models for Soil Properties Prediction Using Field Imaging, Laboratory VNIR Spectroscopy and Their Combination. Comput. Model. Eng. Sci. 2023, 136, 1399–1425.
  125. Xu, H.; Xu, D.; Chen, S.; Ma, W.; Shi, Z. Rapid Determination of Soil Class Based on Visible-Near Infrared, Mid-Infrared Spectroscopy and Data Fusion. Remote Sens. 2020, 12, 1512.
  126. Afriyie, E.; Verdoodt, A.; Mouazen, A.M. Data Fusion of Visible Near-Infrared and Mid-Infrared Spectroscopy for Rapid Estimation of Soil Aggregate Stability Indices. Comput. Electron. Agric. 2021, 187, 106229.
  127. Ng, W.K.P.; Maxfield, P.J.; Crew, A.P.; Teixeira, D.L.; Bevan, T.; Bell, M.J. Comparison of Soil Organic Carbon Measurement Methods. Agronomy 2025, 15, 1826.
  128. Semella, S.; Hutengs, C.; Seidel, M.; Ulrich, M.; Schneider, B.; Ortner, M.; Thiele-Bruhn, S.; Ludwig, B.; Vohland, M. Accuracy and Reproducibility of Laboratory Diffuse Reflectance Measurements with Portable VNIR and MIR Spectrometers for Predictive Soil Organic Carbon Modeling. Sensors 2022, 22, 2749.
  129. Smola, A.J.; Schölkopf, B. A Tutorial on Support Vector Regression. Stat. Comput. 2004, 14, 199–222.
  130. Chinilin, A.V.; Vindeker, G.V.; Savin, I.Y. Vis-NIR Spectroscopy for Soil Organic Carbon Assessment: A Meta-Analysis. Eurasian Soil Sci. 2023, 56, 1605–1617.
  131. Soil Spectroscopy TRAINING MATERIAL A Primer on Soil Analysis Using Visible and near-Infrared (Vis-NIR) and Mid-Infrared (MIR) Spectroscopy|Semantic Scholar. Available online: https://www.semanticscholar.org/paper/Soil-spectroscopy-TRAINING-MATERIAL-A-primer-on-and/ab023fb3360551fdff0433f193253f3d5ef06b14 (accessed on 18 November 2025).
  132. Viscarra Rossel, R.A.; Walvoort, D.J.J.; McBratney, A.B.; Janik, L.J.; Skjemstad, J.O. Visible, near Infrared, Mid Infrared or Combined Diffuse Reflectance Spectroscopy for Simultaneous Assessment of Various Soil Properties. Geoderma 2006, 131, 59–75.
  133. Vohland, M.; Ludwig, M.; Thiele-Bruhn, S.; Ludwig, B. Determination of Soil Properties with Visible to Near- and Mid-Infrared Spectroscopy: Effects of Spectral Variable Selection. Geoderma 2014, 223–225, 88–96.
  134. Janik, L.J.; Simpson, S.L.; Farrell, M.; Mosley, L.M. Rapid and Portable Mid-Infrared Analysis of Wet Sediment Samples by a Novel “Filter-Press” Attenuated Total Reflectance Method. Environ. Earth Sci. 2024, 83, 55.
  135. Munnaf, M.A.; Haesaert, G.; Van Meirvenne, M.; Mouazen, A.M. Site-Specific Seeding Using Multi-Sensor and Data Fusion Techniques: A Review. Adv. Agron. 2020, 161, 241–323.
  136. Ludwig, B.; Greenberg, I.; Vohland, M.; Michel, K. Optimised Use of Data Fusion and Memory-Based Learning with an Austrian Soil Library for Predictions with Infrared Data. Eur. J. Soil Sci. 2023, 74, e13394.
  137. Castanedo, F. A Review of Data Fusion Techniques. Sci. World J. 2013, 2013, 1–19.
  138. Johnson, J.M.; Vandamme, E.; Senthilkumar, K.; Sila, A.; Shepherd, K.D.; Saito, K. Near-Infrared, Mid-Infrared or Combined Diffuse Reflectance Spectroscopy for Assessing Soil Fertility in Rice Fields in Sub-Saharan Africa. Geoderma 2019, 354, 113840.
  139. Ng, W.; Minasny, B.; Montazerolghaem, M.; Padarian, J.; Ferguson, R.; Bailey, S.; McBratney, A.B. Convolutional Neural Network for Simultaneous Prediction of Several Soil Properties Using Visible/near-Infrared, Mid-Infrared, and Their Combined Spectra. Geoderma 2019, 352, 251–267.
  140. Izaurralde, R.C.; Rice, C.W.; Wielopolski, L.; Ebinger, M.H.; Reeves, J.B.; Thomson, A.M.; Harris, R.; Francis, B.; Mitra, S.; Rappaport, A.G.; et al. Evaluation of Three Field-Based Methods for Quantifying Soil Carbon. PLoS ONE 2013, 8, e55560.
  141. Hutengs, C.; Ludwig, B.; Jung, A.; Eisele, A.; Vohland, M. Comparison of Portable and Bench-Top Spectrometers for Mid-Infrared Diffuse Reflectance Measurements of Soils. Sensors 2018, 18, 993.
  142. Kuang, B.; Mahmood, H.S.; Quraishi, M.Z.; Hoogmoed, W.B.; Mouazen, A.M.; van Henten, E.J. Sensing Soil Properties in the Laboratory, In Situ, and On-Line: A Review. Adv. Agron. 2012, 114, 155–223.
  143. Breure, T.S.; Prout, J.M.; Haefele, S.M.; Milne, A.E.; Hannam, J.A.; Moreno-Rojas, S.; Corstanje, R. Comparing the Effect of Different Sample Conditions and Spectral Libraries on the Prediction Accuracy of Soil Properties from Near- and Mid-Infrared Spectra at the Field-Scale. Soil Tillage Res. 2022, 215, 105196.
  144. Tekin, Y.; Tumsavas, Z.; Mouazen, A.M. Effect of Moisture Content on Prediction of Organic Carbon and pH Using Visible and Near-Infrared Spectroscopy. Soil Sci. Soc. Am. J. 2012, 76, 188–198.
  145. Wijewardane, N.K.; Ge, Y.; Sanderman, J.; Ferguson, R. Fine Grinding Is Needed to Maintain the High Accuracy of Mid-Infrared Diffuse Reflectance Spectroscopy for Soil Property Estimation. Soil Sci. Soc. Am. J. 2021, 85, 263–272.
  146. Wetterlind, J.; Stenberg, B.; Viscarra Rossel, R.A. Soil Analysis Using Visible and Near Infrared Spectroscopy. Methods Mol. Biol. 2013, 953, 95–107.
  147. Forrester, S.T.; Janik, L.J.; Soriano-Disla, J.M.; Mason, S.; Burkitt, L.; Moody, P.; Gourley, C.J.P.; McLaughlin, M.J. Use of Handheld Mid-Infrared Spectroscopy and Partial Least-Squares Regression for the Prediction of the Phosphorus Buffering Index in Australian Soils. Soil Res. 2015, 53, 67–80.
  148. Ji, W.; Adamchuk, V.I.; Biswas, A.; Dhawale, N.M.; Sudarsan, B.; Zhang, Y.; Viscarra Rossel, R.A.; Shi, Z. Assessment of Soil Properties in Situ Using a Prototype Portable MIR Spectrometer in Two Agricultural Fields. Biosyst. Eng. 2016, 152, 14–27.
  149. Zhang, Y.; Biswas, A.; Ji, W.; Adamchuk, V.I. Depth-Specific Prediction of Soil Properties In Situ Using Vis-NIR Spectroscopy. Soil Sci. Soc. Am. J. 2017, 81, 993–1004.
  150. Silva, F.H.C.A.; Wijewardane, N.K.; Cox, M.S.; Zhang, X. Assessment of Different VisNIR and MIR Spectroscopic Techniques and the Potential of Calibration Transfer between MIR Laboratory and Portable Instruments to Estimate Soil Properties. Soil Tillage Res. 2025, 251, 106555.
  151. Piekarczyk, J.; Kazmierowski, C.; Krolewicz, S.; Cierniewski, J. Effects of Soil Surface Roughness on Soil Reflectance Measured in Laboratory and Outdoor Conditions. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2016, 9, 827–834.
  152. Mouazen, A.M.; Al-Asadi, R.A. Influence of Soil Moisture Content on Assessment of Bulk Density with Combined Frequency Domain Reflectometry and Visible and near Infrared Spectroscopy under Semi Field Conditions. Soil Tillage Res. 2018, 176, 95–103.
  153. Soriano-Disla, J.M.; Janik, L.J.; McLaughlin, M.J. Assessment of Cyanide Contamination in Soils with a Handheld Mid-Infrared Spectrometer. Talanta 2018, 178, 400–409.
  154. Greenberg, I.; Seidel, M.; Vohland, M.; Ludwig, B. Performance of Field-Scale Lab vs in Situ Visible/near- and Mid-Infrared Spectroscopy for Estimation of Soil Properties. Eur. J. Soil Sci. 2022, 73, e13180.
  155. Ji, W.; Viscarra Rossel, R.A.; Shi, Z. Improved Estimates of Organic Carbon Using Proximally Sensed Vis-NIR Spectra Corrected by Piecewise Direct Standardization. Eur. J. Soil Sci. 2015, 66, 670–678.
  156. Roudier, P.; Hedley, C.B.; Lobsey, C.R.; Viscarra Rossel, R.A.; Leroux, C. Evaluation of Two Methods to Eliminate the Effect of Water from Soil Vis–NIR Spectra for Predictions of Organic Carbon. Geoderma 2017, 296, 98–107.
  157. Zhou, P.; Yang, W.; Li, M.; Wang, W. A New Coupled Elimination Method of Soil Moisture and Particle Size Interferences on Predicting Soil Total Nitrogen Concentration through Discrete NIR Spectral Band Data. Remote Sens. 2021, 13, 762.
  158. Debaene, G.; Bartmiński, P.; Siłuch, M. In Situ VIS-NIR Spectroscopy for a Basic and Rapid Soil Investigation. Sensors 2023, 23, 5495.
  159. Hutengs, C.; Seidel, M.; Oertel, F.; Ludwig, B.; Vohland, M. In Situ and Laboratory Soil Spectroscopy with Portable Visible-to-near-Infrared and Mid-Infrared Instruments for the Assessment of Organic Carbon in Soils. Geoderma 2019, 355, 113900.
  160. Priori, S.; Mzid, N.; Pascucci, S.; Pignatti, S.; Casa, R. Performance of a Portable FT-NIR MEMS Spectrometer to Predict Soil Features. Soil Syst. 2022, 6, 66.
  161. Gebbers, R.; Adamchuk, V.I. Precision Agriculture and Food Security. Science 2010, 327, 828–831.
  162. Shining the Light on Spectroscopy for Sustainable Soil Management|Global Soil Partnership|Food and Agriculture Organization of the United Nations. Available online: https://www.fao.org/global-soil-partnership/resources/highlights/detail/en/c/1505320/?utm_source=chatgpt.com (accessed on 18 December 2025).
  163. NeoSpectra Micro Spectral FT-IR Sensor—Photonic Solutions. Available online: https://photonicsolutions.co.uk/products/neospectra-micro-spectral-ft-ir-sensor/ (accessed on 8 July 2025).
  164. Samsung Patents a Smartphone-Embedded IR Spectrometer. Could Be on the Galaxy S11—Gizmochina. Available online: https://www.gizmochina.com/2019/10/01/samsung-patents-a-smartphone-embedded-ir-spectrometer-could-be-on-the-galaxy-s11/ (accessed on 8 July 2025).
  165. Salazar, O.; Benvenuto, A.; Fajardo, M.; Fuentes, J.P.; Nájera, F.; Celedón, A.; Pfeiffer, M.; Renwick, L.L.R.; Seguel, O.; Tapia, Y.; et al. Evaluation of a Miniaturized Portable NIR Spectrometer for the Prediction of Soil Properties in Mediterranean Central Chile. Geoderma Reg. 2023, 34, e00675.
  166. Sorenson, P.T.; Bulmer, D.; Peak, D. Evaluation of Two Miniaturized FT-NIR Spectrometers for Rapid Soil Property Analysis. Soil Sci. Soc. Am. J. 2024, 88, 126–135.
  167. Tang, J.; Wang, Q.; Liu, D.; Li, J.; Zhang, R.; Zhang, M.; Sun, J. A Novel Approach to Spectral Moisture Interference Correction for Nitrogen and Soil Organic Matter Inversion in Native Black Soils: Bayesian-Optimized Dynamic Moisture Mitigation. Ecol. Inform. 2025, 90, 103240.
  168. Process Spectroscopy Market Growth Outlook & Segment Analysis 2024–2034. Available online: https://www.emergenresearch.com/industry-report/process-spectroscopy-market?utm_source=chatgpt.com (accessed on 19 November 2025).
  169. Process Spectroscopy Market Share & Opportunities 2025–2032. Available online: https://www.coherentmarketinsights.com/market-insight/process-spectroscopy-market-4358?utm_source=chatgpt.com (accessed on 19 November 2025).
  170. Choosing the Right Analytical Technology: NIR vs. Wet Chemistry. Available online: https://www.bluesunscientific.com/post/choosing-between-nir-and-wet-chemistry-a-lab-manager-s-guide?utm_source=chatgpt.com (accessed on 19 November 2025).
  171. Boost Efficiency in the QC Laboratory: How NIRS Helps Reduce Costs up to 90%|Metrohm. Available online: https://www.metrohm.com/en/applications/whitepaper/wp-054.html (accessed on 19 November 2025).
Figure 1. Principal components of infrared spectroscopy. Original figure prepared by Govind Dnyandev Vyavahare using icons taken from Flaticon (https://www.flaticon.com/free-icons/biology, accessed on 5 January 2025).
Figure 2. Schematic illustration of preprocessing steps for IR spectrum analysis: (1) Raw spectra, (2) Spectra trimming (<1000 cm−1 and >3995 cm−1), (3) Spectra smoothing by the Savitzky–Golay filter, (4) Baseline correction by standard normal variate (SNV). Original figure prepared by Sangho Jeon.
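The four steps named in the Figure 2 caption can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `preprocess_spectra`, the Savitzky–Golay window length (11) and polynomial order (2), and the synthetic spectra are all hypothetical; only the trimming limits (1000 and 3995 cm−1) come from the caption.

```python
import numpy as np
from scipy.signal import savgol_filter


def preprocess_spectra(wavenumbers, spectra, lo=1000.0, hi=3995.0,
                       window_length=11, polyorder=2):
    """Trim, smooth (Savitzky-Golay), then SNV-scale each spectrum (one per row)."""
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    spectra = np.asarray(spectra, dtype=float)

    # (2) Trimming: drop bands below `lo` and above `hi` (cm^-1).
    keep = (wavenumbers >= lo) & (wavenumbers <= hi)
    wn, x = wavenumbers[keep], spectra[:, keep]

    # (3) Smoothing along the spectral axis; window and order are assumed values.
    x = savgol_filter(x, window_length=window_length, polyorder=polyorder, axis=1)

    # (4) SNV: centre and scale every spectrum by its own mean and standard deviation.
    x = (x - x.mean(axis=1, keepdims=True)) / x.std(axis=1, ddof=1, keepdims=True)
    return wn, x


# (1) Raw spectra: synthetic data standing in for measured MIR spectra.
rng = np.random.default_rng(0)
wavenumbers = np.arange(600.0, 4002.0, 2.0)
raw = 0.5 + 0.1 * np.sin(wavenumbers / 300.0) + rng.normal(0, 0.01, (5, wavenumbers.size))

wn, snv = preprocess_spectra(wavenumbers, raw)
```

After SNV each row has zero mean and unit standard deviation, which is why the step is often used to reduce multiplicative scatter and baseline offsets between samples.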
Figure 3. The trend of algorithms used in spectroscopy-assisted soil analysis over the past decade. Data were retrieved from the Scopus database via relevant keywords, and the figure was prepared by Govind Dnyandev Vyavahare.
Figure 4. Application of NIR and MIR spectroscopy in the soil field. Data were retrieved from the Scopus database via relevant keywords, and the figure was prepared by Govind Dnyandev Vyavahare.
Figure 5. Data fusion methods used in various articles for soil analysis over the last decade. Data were retrieved from the Scopus database via relevant keywords, and the figure was prepared by Govind Dnyandev Vyavahare.
Table 1. Different soil properties predicted by Vis-NIR, NIR, and MIR spectroscopy.
| Technique | Wavelength Range | Multivariate Calibration | Preprocessing | Sample Size; Sample Depth (cm) | Predicted Properties | References |
|---|---|---|---|---|---|---|
| Vis-NIR | 350–2500 nm | PLSR | UN-P | 280; 0–15 | SOC b | [100] |
| Vis-NIR | 400–2500 nm | PLSR | Log10(1/R) | 5310 | SOC a | [101] |
| MIR | 2500–16,660 nm | PLSR | Log10(1/R) | 5310 | SOC a | [101] |
| Vis-NIR | 350–2500 nm | PLSR | Savitzky–Golay first derivative | 132; 0–25 | SOC a, CEC b, clay a | [102] |
| Vis-NIR | 350–2500 nm | SVM-radial basis kernel | Log10(1/R), Savitzky–Golay | 5400; 2–10 and 10–40 | SOC b | [103] |
| NIR | 4000–10,000 cm−1 | PLSR | UN-P | 119; NR | TN b, TOC a, CEC a, clay a, pH c, bulk density c | [104] |
| MIR | 400–4000 cm−1 | PLSR, RF | Savitzky–Golay first derivative and 5 points smoothing | 119; NR | TN a, TOC a, CEC b, clay a (RF), pH b, bulk density c | [104] |
| MIR | 400–4000 cm−1 | PLSR | UN-P | 119; NR | TN a, TOC a, CEC b (109), clay a, pH c, bulk density c (74) | [104] |
| Vis-NIR | 350–2500 nm | Cubist | Savitzky–Golay, SNV | 30–120 | Sand c, silt b, clay b, pH c, CEC c | [105] |
| Vis-NIR | 350–2500 nm | RF, Cubist | Savitzky–Golay, Continuum removal by subtraction and division (CR-S and CR-D) | 14,000; 0–30 | SOC a (RF, Cubist), TN a (Cubist), TS a (Cubist), CEC a (Cubist), clay a (Cubist), sand a (Cubist), pH a (Cubist), exchangeable Ca a (Cubist) | [106] |
| NIR | 1200–2400 nm | SVM | SNV | 2388; 0–20 | SOM b | [107] |
| Vis-NIR | 350–2500 nm | PLSR, Support vector machine regression model (SVMR) | Savitzky–Golay smoothing with the first derivative, SNV | 200; 0–30 | OC a (SVMR), available N c (PLSR), P b (PLSR), K b (PLSR), EC c (SVMR), texture c (PLSR) | [108] |
| Vis-NIR | 350–2500 nm | RF, ANN | MSC | 96; 0–5 | pH a (RF), CaCO3 a (RF), EC a (ANN) | [109] |
| NIR | 900–1700 nm | SVM | Savitzky–Golay first derivative, MSC, SNV | 176; NR | Mineral NR, particle size NR | [110] |
| MIR | 600–7500 cm−1 | Cubist | Savitzky–Golay first derivative | 151; 0–5 and 5–15 | SOC a | [111] |
| MIR | 400–4000 cm−1 | Cubist | Trimming, noise-to-signal ratio, Savitzky–Golay filtering, SNV | 15180 | SOC a | [112] |
| MIR | 450–4000 cm−1 | PLSR | Trim, smoothing, SNV | 655; 5–10 | Lime b | [113] |
| MIR | 500–4000 cm−1 | PLSR | Savitzky–Golay first derivative, MSC, SNV, detrending | 336; 0–15 | SOC b, pH b, sand b, silt b, clay b | [114] |
| MIR | 650–5000 cm−1 | PLSR | First derivatives, MSC | 114; 0–15 | pH a, SOC a, Ca a, Mg a (MSC), CaCO3 a | [115] |
| MIR | NR | Memory-based learning | NR | 1000; NR | Bulk density a, texture a, pH a, CEC a, SOC a, TN a, EC a | [116] |
| MIR | 600–7498 cm−1 | Memory-based learning | Savitzky–Golay, SNV | 500; NR | Elements a,b,c, TC a, TN b, pH a, CEC a, clay a, sand a | [19] |
| MIR | 600–4000 cm−1 | SVM | Savitzky–Golay first derivative | 67150 | Bulk density a | [117] |
| MIR | 400–6000 cm−1 | RF | First derivative | NR; NR | pH a, TN a, TC a, base cations a, CEC a | [118] |
| DRIFT-MIR | 400–4000 cm−1 | MLR-sCARS | MSV, SNV | 510; 0–20 | SOM b, TN a | [119] |
| MIR | 500–4100 nm | Dropout sequential artificial neural network | Logarithmic derivative | NR; NR | TN a, OC b, K a, P a | [120] |
| MIR | 650–4000 cm−1 | PLSR-univariate | Savitzky–Golay, SNV | 228; 0–20 and 20–40 | TC a, TN a, sand b, clay a | [121] |
| Vis-NIR-MIR | 350–2500 nm and 650–4000 cm−1 | PLSR, SVM | UN-P | 5010–20 | TN a, TK a, avl. N c (SVM) | [122] |
| Vis-NIR-MIR | 400–2600 nm and 650–4000 cm−1 | PLSR | Savitzky–Golay smoothing | 11115–25 | SOC c | [123] |
| Vis-NIR-lab. spectra | 350–2500 nm | SVR | Log10(1/R), Savitzky–Golay (SG) smoothing filter (SF), Gap-Segment-Derivative (GSD), Detrend Normalization (DT) | 309; 0–30 | SOM b | [124] |
| Vis-NIR-MIR | 350–2500 nm and 650–4000 cm−1 | SVM | Savitzky–Golay | 571; NR | Soil class (61.1%) | [125] |
| Vis-NIR-MIR | 350–1700 nm and 650–4000 cm−1 | PLSR | Standardization, moving average, first derivative with Savitzky–Golay smoothing | NR; 0–5 | Soil aggregate stability a | [126] |
Note: a: R² > 0.8 (good prediction); b: 0.6 < R² < 0.8 (acceptable prediction); c: R² < 0.6 (poor prediction). UN-P: unprocessed; Log10(1/R): transformation of reflectance to absorbance (R: reflectance); NR: not reported.
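The accuracy classes defined in the note can be applied programmatically. The sketch below computes R² as 1 − SS_res/SS_tot and maps it to the a/b/c labels. The function names and example values are hypothetical, and because the note leaves R² values of exactly 0.6 and 0.8 unassigned, placing them in class b is an assumption. Individual studies in Table 1 may also define R² differently (for example, as a squared correlation coefficient).

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def prediction_class(r2):
    """Map R2 to the Table 1 labels; boundary values go to 'b' (assumption)."""
    if r2 > 0.8:
        return "a"  # good prediction
    if r2 >= 0.6:
        return "b"  # acceptable prediction
    return "c"      # poor prediction


# Hypothetical measured vs. predicted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(observed, predicted)
label = prediction_class(r2)
```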
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Vyavahare, G. D., Yun, J.-J., Park, J.-H., Shim, J.-H., Kim, S. H., Kim, K., Roh, A., Kim, S. H., Jang, H. J., Ng, W., & Jeon, S. (2026). Applications and Challenges of Visible-Near-Infrared and Mid-Infrared Spectroscopy in Soil Analysis: Chemometric Approaches and Data Fusion. Agriculture, 16(1), 135. https://doi.org/10.3390/agriculture16010135