1. Introduction
The increasing use of analytical information in agriculture, whether for assessing plant nutritional status, soil fertility, or the quality of fertilizers and soil amendments, has led to a growing demand for agronomic analysis services. However, the methods currently used in agronomic laboratories, particularly for determining soil texture and organic matter, involve time-consuming and expensive procedures that generate significant amounts of chemical waste, which harms human health and the environment [
1].
Texture is one of soil’s most important physical properties, defined as the proportional distribution of clay, sand, and silt particles within a soil mass. The specific characteristics of clay soils are primarily due to the clay material. Clay particles can either cluster together into small groups (flocculated clay) or become dispersed (deflocculated clay). These particles typically have a high cation exchange capacity, allowing them to bond with various chemical elements such as calcium, sodium, potassium, and magnesium (base exchange). This bonding helps retain plant nutrients, making clay soils richer in nutrients [
2]. Soil organic matter is vital for productive soils due to its impact on physical, chemical, and biological processes, such as soil aggregation, nutrient cycling, water retention, disease suppression, pH buffering, and cation exchange capacity [
3]. These attributes are crucial for agronomy as they can significantly influence crop yields [
3].
Therefore, the development of analytical methods that are sustainable and safe, in line with the principles of Green Analytical Chemistry [
4], is strategic for meeting the qualitative and quantitative expectations of the production sector in the determination of texture and organic matter.
In this context, molecular spectroscopy, particularly near-infrared spectroscopy (NIRS) combined with chemometric methods, has emerged as a powerful tool for developing innovative methodologies for soil analysis [
5]. Whether using benchtop spectrophotometers, portable devices or remote sensing [
6,
7,
8], NIRS has demonstrated promising results in estimating soil attributes [
5,
9].
A systematic review and meta-analysis of 115 studies from 30 countries evaluated the accuracy of vis–NIR spectroscopy for predicting soil attributes in the context of precision agriculture [
5]. The findings demonstrated that this technique is a feasible, rapid, and cost-effective alternative to conventional methods, yielding acceptable accuracies for variables such as organic carbon, nitrogen, moisture, salinity, and texture. However, model performance was shown to depend on various factors, including soil type, the type of instrumentation used (laboratory vs. field), spectral range, calibration method, spectral preprocessing, and the division of datasets into calibration and validation sets. This wide range of influencing conditions underscores the importance of tailoring vis–NIR models to local conditions and sampling-specific characteristics.
The prediction of soil texture classes at different depths using reflectance data spanning from the visible (vis) to the mid-infrared (MIR) spectrum has shown that spectral bands associated with iron oxides (~535, 679, 750, and 880 nm) and kaolinite (~2200 nm) are particularly important for class discrimination [
8]. Spectral differentiation across soil depths suggests that both mineral composition and physical structure play a significant role in the predictive capability of the models, supporting the use of spectral stratification as an effective approach in regional-scale applications. Other studies [
7] have also highlighted that variables such as depth, soil type, and drainage conditions directly affect model robustness.
The comparative evaluation of benchtop and portable instruments is another relevant aspect in the development of analytical methodologies for agronomic applications. Studies have shown that portable MIR spectrometers, despite having a narrower spectral range, often outperform miniaturized vis–NIR instruments in terms of prediction accuracy, particularly for organic carbon and texture [
6]. These findings emphasize the need to select the appropriate spectrometer type according to the target attribute and operational context, given that spectral variability can significantly influence predictive outcomes. Nevertheless, it is important to consider that the acquisition and maintenance of MIR instruments are considerably more expensive than those of NIR devices.
In addition, the implementation of a more rigorous metrological framework—accounting for predictive uncertainty and adopting appropriate statistical tools to avoid systematic errors—has been recommended as essential for evaluating and reporting the quality of vis–NIR calibration models [
9].
Thus, this study aims to contribute to the development of a more sustainable analytical process for soil characterization by proposing a regionalized approach based on near-infrared spectroscopy. The methodology employs benchtop and portable spectrophotometers, combined with multivariate calibration, to directly assess soil texture and organic matter without the need for chemical reagents or destructive sample preparation. Partial least squares (PLS) regression models were constructed using calibration samples and applied to independent samples collected from a neighboring field. The models were evaluated for both individual (PLS1) and simultaneous (PLS2) prediction of clay, silt, sand, and organic matter. Additionally, synergy interval PLS (siPLS) algorithms were tested in the PLS1 models to enhance performance by selecting the most informative spectral variables. This approach supports the principles of white analytical chemistry, offering an efficient, clean, and regionally adaptable tool for environmental and agricultural applications.
2. Materials and Methods
2.1. Location of the Study and Sampling Area
This study was conducted in two sampling stages in General Câmara, Rio Grande do Sul (RS), Brazil, where soybeans and corn are grown. The first sampling was in November 2022: 21 soil samples were collected, georeferenced, and distributed over 372 ha. These samples were used to develop calibration models for determining texture and organic matter. The developed calibration models were applied to a second set of soil samples (test samples) collected in December 2023 in the same geographic region (
Figure 1). This temporal separation aimed to evaluate the robustness and practical applicability of the models over time, simulating real-world use in local agricultural contexts.
According to the geomorphological and climatic characteristics of the state of RS, this agricultural area lies in a physiographic region called the “crystalline shield.” The soil is classified as a typical dystrophic red argisol, according to the Brazilian Soil Classification System, characterized by deep, well-drained profiles with a reddish color, a clay loam to clay texture with gravel, and a porous structure, developed from granite in gently undulating to undulating relief. These soils are strongly acidic, with low base saturation, low sum of bases, and low organic matter content. They typically present a sequence of A, B, and C horizons, the latter containing partially decomposed, yellowish parent material [
10].
All samples were collected by simple sampling at a depth of 0–20 cm, using an auger, according to the methodology described by the Brazilian Agricultural Research Corporation [
1].
2.2. Sample Preparation
The samples were placed in 300 cm³ cardboard boxes and dried in an oven (model MA 037, Marconi) with air circulation for at least 24 h at 45 to 60 °C. After drying, the samples were ground in a hammer mill (model M1040, Marconi Equip. Ltda, Piracicaba, SP, Brazil) with a 1 mm sieve, and a fraction of approximately 100 g, with moisture content below 1.5%, was stored in 150 mL polypropylene jars.
2.3. Determination of Texture and Organic Matter Using the Reference Method
In the analysis of soil samples, the Bouyoucos method was used to determine texture, and an adapted version of the Walkley–Black method was applied for the determination of organic matter [
1]. This methodology was carried out at the Analytical Center of the University of Santa Cruz do Sul, affiliated with the Soil Analysis Quality Control Program coordinated by the Brazilian Soil Science Society (
https://rolas.cnpt.embrapa.br/publico/pNumAmostrasAnalisadas (accessed on 10 June 2025)). The Bouyoucos method, or the densimeter method, determines soil texture by measuring the concentrations of sand, silt, and clay particles. The basic principle of this method involves the dispersion of soil particles in an aqueous suspension using sodium hydroxide (1.0 mol L⁻¹, Labsynth®, Diadema, SP, Brazil) as the dispersant. This method is based on Stokes’ law, which establishes a relationship between particle size and rate of sedimentation. In the procedure used in this study, the clay fraction is determined directly by the densimeter, while the silt and sand fractions are determined by granulometric analysis [
1].
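The dependence of sedimentation rate on particle size that underlies the densimeter method can be illustrated with a minimal sketch of Stokes' law, v = (ρp − ρf) g d² / (18 μ). The particle density, fluid density, viscosity, and the 2 µm and 50 µm size limits used here are assumed illustrative values, not parameters reported in this study.

```python
def stokes_settling_velocity(diameter_m, particle_density=2650.0,
                             fluid_density=1000.0, viscosity=1.0e-3,
                             gravity=9.81):
    """Terminal settling velocity (m/s) of a small sphere in a viscous fluid.

    Stokes' law: v = (rho_p - rho_f) * g * d**2 / (18 * mu).
    Default densities (kg/m3) and viscosity (Pa.s) are assumed values for
    quartz-like particles in water near 20 degrees C.
    """
    return ((particle_density - fluid_density) * gravity * diameter_m ** 2
            / (18.0 * viscosity))


def settling_time_s(depth_m, diameter_m):
    """Time (s) for a particle of the given diameter to fall depth_m."""
    return depth_m / stokes_settling_velocity(diameter_m)


# Assumed size limits: 2 um (clay/silt) and 50 um (silt/sand).
v_clay = stokes_settling_velocity(2e-6)
v_silt = stokes_settling_velocity(50e-6)
print(f"2 um particle:  {v_clay:.2e} m/s")
print(f"50 um particle: {v_silt:.2e} m/s")
print(f"hours for a 2 um particle to settle 10 cm: "
      f"{settling_time_s(0.10, 2e-6) / 3600:.1f}")
```

Because velocity scales with the square of the diameter, coarse particles leave the suspension long before clay does, which is why a density reading taken after a suitable settling time reflects mainly the clay fraction.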
The determination of organic matter concentration was based on the Walkley–Black method, which involves the oxidation of organic matter by a sulfochromic solution (containing 2.25 g of potassium dichromate (Labsynth®, Diadema, SP, Brazil) and 4.17 mL of concentrated sulfuric acid (Labsynth®, Diadema, SP, Brazil)). In this process, potassium dichromate is reduced by organic matter compounds. The amount of unreduced dichromate is then determined by spectrophotometry at 645 nm, and the analytical curve was constructed using reference samples with known organic matter contents, ensuring traceability and reliability of the results [
1].
2.4. Near-Infrared Spectral Acquisition Using Benchtop and Portable Devices
Two NIRS approaches were applied to analyze dried and ground soil samples: one using a benchtop spectrophotometer and another using a custom-built portable device. For both approaches, the samples were placed in customized holders produced via 3D printing (3D Pro, GTMAX3D Equip. Ltda., Americana, SP, Brazil), and the infrared spectra were acquired in absorbance mode, in triplicate, under standardized conditions. The spectral data obtained were later used for developing calibration and prediction models for soil texture and organic matter content. Below, the specific setup and operational parameters for each device are described.
In the benchtop configuration, the infrared spectra were acquired using a Spectrum 400 spectrophotometer (Perkin Elmer, Shelton, CT, USA) equipped with a Near-Infrared Reflectance Accessory (NIRA). Spectra were recorded in the 1250 to 2500 nm range, with 2 nm resolution, using 16 scans per sample. A 3D-printed sample holder (3.4 mL volume) made of PLA was used to position the soil directly onto the sapphire window of the NIRA, improving sample handling and analytical throughput (
Figure 2). Each sample was scanned in triplicate, resulting in a total of 63 spectra for calibration and 30 for prediction.
The second method involved a portable NIR spectrophotometer built around a DLP NIRscan Nano EVM module (Texas Instruments®, Dallas, TX, USA). This device acquired spectra in the 900 to 1700 nm range, with a resolution of 4 nm and 8 scans per sample, in absorbance mode. A protective casing and interchangeable sampling drawers (2 and 5 mL) were also 3D-printed in PLA to support field use (
Figure 3), according to da Silva et al. [
11]. In this study, only the 2 mL drawer was used. The spectrophotometer was powered and controlled via a Bluetooth® connection to a smartphone using the Nanometrix app v.1.0.4 [
12], which handled both data acquisition and storage.
2.5. Development of Multivariate Calibration Models
PLS calibration models were developed in SOLO + MIA software version 8.7.1 (Eigenvector Research, Inc., Manson, WA, USA) using the SIMPLS algorithm in two configurations: PLS1 and PLS2. PLS1 was applied for the individual determination of clay, sand, silt, and organic matter, that is, with only one dependent variable at a time, whereas PLS2 was applied for the simultaneous quantification of all dependent variables.
To develop the calibration models for both methodologies, different preprocessing tools and their combinations were used, namely standard normal variate (SNV), multiplicative scatter correction (MSC), and first and second derivatives (width = 15, polynomial order = 2) computed with a Savitzky–Golay smoothing filter.
MSC and SNV were employed as scatter correction strategies to mitigate the physical variability caused by dispersion while compensating for baseline shifts between samples. MSC adjusts for multiplicative and additive effects, enhancing the spectral data by aligning it with a reference spectrum. On the other hand, SNV transformation standardizes the spectra to zero mean and unit variance, effectively correcting for scatter effects and improving comparability between samples. The first derivative method primarily removes the baseline, whereas the second derivative corrects both the baseline and any linear trends in the data. A Savitzky–Golay (SG) smoothing filter was applied before the derivative calculation to improve the quality of the derivative spectra. This smoothing technique reduces the adverse impact on the signal-to-noise ratio that conventional finite-difference derivatives would cause by fitting successive polynomial regressions to the spectral data, thus preserving important spectral features while minimizing noise [
13,
14].
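The three preprocessing families described above can be sketched in a few lines of standard-library Python. This is an illustrative sketch and not the implementation in the software used in this study. It assumes that MSC uses the mean spectrum as its reference, that SNV uses the sample standard deviation, and that the derivative drops the edge points the window cannot cover. For a symmetric window and a polynomial of order 1 or 2, the Savitzky–Golay first-derivative weights reduce to i / Σi², which is what `sg_first_derivative` uses.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by itself."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(x - m) / s for x in spectrum]


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Each spectrum is regressed on the reference (x = a + b * ref) and
    corrected as (x - a) / b. The mean spectrum is the assumed default.
    """
    n_vars = len(spectra[0])
    if reference is None:
        reference = [mean(s[j] for s in spectra) for j in range(n_vars)]
    ref_mean = mean(reference)
    sxx = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        slope = sum((r - ref_mean) * (x - s_mean)
                    for r, x in zip(reference, s)) / sxx
        intercept = s_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in s])
    return corrected


def sg_first_derivative(spectrum, width=15):
    """Savitzky-Golay first derivative (polynomial order 1 or 2).

    Returned per unit of point spacing; edge points are dropped.
    """
    half = width // 2
    norm = sum(i * i for i in range(-half, half + 1))
    return [sum(i * spectrum[j + i] for i in range(-half, half + 1)) / norm
            for j in range(half, len(spectrum) - half)]


# Hypothetical spectra: the second is the first with extra offset and gain.
raw = [[0.10 + 0.01 * j for j in range(20)],
       [0.30 + 0.02 * j for j in range(20)]]
print(snv(raw[0])[:3])
print(msc(raw)[1][:3])
print(sg_first_derivative(raw[0]))
```

In this toy case both SNV and MSC map the two spectra onto the same curve, since they differ only by an additive and a multiplicative term.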
The number of latent variables (LVs) for the regression models was selected based on the lowest root mean square error of cross-validation (RMSECV), and the best preprocessing strategy was selected based on the lowest root mean square error of prediction (RMSEP), both obtained from the simultaneous analysis of the analytes (PLS2). Cross-validation was performed with the Venetian blinds method (10 splits and blind thickness = 1).
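As a rough illustration of the cross-validation scheme, the sketch below assigns samples to Venetian-blind folds and accumulates an RMSECV. The `fit_predict` callback and the `mean_model` placeholder are hypothetical stand-ins for a PLS model with a given number of latent variables, and the fold assignment follows the usual definition of the scheme rather than the internals of the software used here.

```python
def venetian_blind_folds(n_samples, n_splits=10, thickness=1):
    """Assign sample indices to folds: with thickness 1, every n_splits-th
    sample (starting at 0, 1, 2, ...) forms one held-out set."""
    folds = [[] for _ in range(n_splits)]
    for i in range(n_samples):
        folds[(i // thickness) % n_splits].append(i)
    return folds


def rmsecv(y, fit_predict, n_splits=10, thickness=1):
    """Root mean square error of cross-validation.

    fit_predict(train_idx, test_idx) must return one prediction per
    index in test_idx, using only the training samples for fitting.
    """
    squared_errors = []
    for test_idx in venetian_blind_folds(len(y), n_splits, thickness):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        predictions = fit_predict(train_idx, test_idx)
        squared_errors.extend((p - y[i]) ** 2
                              for p, i in zip(predictions, test_idx))
    return (sum(squared_errors) / len(y)) ** 0.5


# Hypothetical reference values and a placeholder "model" that predicts
# the training mean; a real use would fit PLS with k latent variables.
y_demo = [float(v) for v in range(10)]


def mean_model(train_idx, test_idx):
    train_mean = sum(y_demo[i] for i in train_idx) / len(train_idx)
    return [train_mean for _ in test_idx]


print([len(f) for f in venetian_blind_folds(63)])
print(rmsecv(y_demo, mean_model))
```

Repeating the `rmsecv` call for each candidate number of latent variables and keeping the minimum corresponds to the selection rule described above.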
With the preprocessing strategy defined, the PLS1 and PLS2 regression models were developed. PLS1 models were built both with the entire infrared spectrum and with synergy interval PLS (siPLS1) as a variable selection strategy. For the development of regression models using siPLS1, the spectra acquired by both instruments were divided into 32 intervals using the automatic interval combination mode [
15].
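The interval bookkeeping behind siPLS can be sketched as follows. The split into 32 intervals comes from the text; the number of spectral variables (626) and the limit of two combined intervals are hypothetical values chosen for illustration, and the exact search performed by the automatic interval combination mode is not reproduced here.

```python
from itertools import combinations


def split_into_intervals(n_variables, n_intervals=32):
    """Split variable indices 0..n_variables-1 into contiguous intervals
    of nearly equal size (the first intervals take any remainder)."""
    base, extra = divmod(n_variables, n_intervals)
    intervals, start = [], 0
    for k in range(n_intervals):
        size = base + (1 if k < extra else 0)
        intervals.append(range(start, start + size))
        start += size
    return intervals


def interval_combinations(intervals, max_combined=2):
    """Yield (interval ids, column indices) for every combination of up
    to max_combined intervals; each is one candidate variable subset."""
    for r in range(1, max_combined + 1):
        for combo in combinations(range(len(intervals)), r):
            yield combo, [j for k in combo for j in intervals[k]]


# Hypothetical spectrum length; the real number of points depends on
# the instrument settings.
intervals = split_into_intervals(626)
candidates = list(interval_combinations(intervals, max_combined=2))
print(len(intervals), len(candidates))
```

Each candidate column subset would then be used to fit a PLS1 model, and the subset with the lowest cross-validation error retained.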
The coefficients of determination for calibration (R²CAL), cross-validation (R²CV), and prediction (R²PRED), as well as the root mean square errors of calibration (RMSEC), cross-validation (RMSECV), and prediction (RMSEP), along with the relative predictive determinant (RPD), were calculated to evaluate the models [
16].
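For reference, these figures of merit can be computed as in the minimal sketch below. It assumes the common definitions: RMSE as the root of the mean squared residual without a degrees-of-freedom correction, R² as one minus the ratio of residual to total sum of squares, and RPD as the standard deviation of the reference values divided by the RMSE. The software used in this study may apply slightly different conventions, and the numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def rmse(y_true, y_pred):
    """Root mean square error (RMSEC, RMSECV, or RMSEP, depending on
    which sample set and which predictions are passed in)."""
    return sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rpd(y_true, y_pred):
    """Standard deviation of the reference values divided by the RMSE."""
    return stdev(y_true) / rmse(y_true, y_pred)


# Hypothetical reference and predicted clay contents (%).
reference = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
print(rmse(reference, predicted),
      r_squared(reference, predicted),
      rpd(reference, predicted))
```

Because RPD scales the error by the spread of the reference values, a test set with a narrow concentration range yields a low RPD even when the absolute error is small.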
The performance of the estimation models for each soil property was evaluated based on RPD values. Models with an RPD below 1.0 are considered very poor, those between 1.0 and 1.4 as poor, 1.4 to 1.8 as regular, 1.8 to 2.0 as good, 2.0 to 2.5 as very good, and above 2.5 as excellent [
17,
18].
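The classification above maps directly onto a small lookup function. The handling of values that fall exactly on a boundary is an assumption: each lower bound is treated as inclusive, so that an RPD of 2.0 is rated "very good", and 2.5 itself stays in "very good" because only values above 2.5 are described as excellent.

```python
def classify_rpd(rpd):
    """Return the qualitative rating for an RPD value.

    Lower bounds are treated as inclusive (assumption); 2.5 itself is
    rated "very good" since only values above 2.5 count as excellent.
    """
    if rpd < 1.0:
        return "very poor"
    if rpd < 1.4:
        return "poor"
    if rpd < 1.8:
        return "regular"
    if rpd < 2.0:
        return "good"
    if rpd <= 2.5:
        return "very good"
    return "excellent"


for value in (0.8, 1.2, 1.5, 1.8, 2.0, 2.1, 2.6):
    print(value, classify_rpd(value))
```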
2.6. Evaluation of Analytical Performance and Sustainability
A one-way ANOVA was applied at the 95% confidence level using Past® software version 2.17c to compare the results obtained by the proposed methodologies with those of the reference method.
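The quantity at the core of this comparison is the F statistic, the ratio of between-group to within-group mean squares. The sketch below computes it from scratch for hypothetical data; converting F into a p-value requires the F distribution, which a statistics package would supply, and this is not the routine used by the software named above.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical results (%) for the same samples by three methods.
reference_method = [1.0, 2.0, 3.0]
benchtop_nir = [2.0, 3.0, 4.0]
portable_nir = [3.0, 4.0, 5.0]
print(one_way_anova_f(reference_method, benchtop_nir, portable_nir))
```

An F value below the critical value for the given degrees of freedom at α = 0.05 corresponds to p > 0.05, that is, no significant difference between methods.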
Additionally, the developed methods were compared against reference methods considering the 12 principles of white analytical chemistry. This comparison focused on the sustainability of the analytical methods, considering the coherence and synergy of analytical, ecological, and practical attributes [
19,
20].
The RGBfast model [
20] was employed as a user-friendly adaptation of the Red–Green–Blue framework to evaluate the sustainability and efficiency of analytical methods. In this model, the color red represents analytical attributes, including (1) trueness (R1), defined as the mean absolute difference between recovery values (%) and 100%, and (2) precision (R2), calculated as the mean relative standard deviation across a set of predictive sample results. The R3 value, representing the limit of detection (LOD), was estimated by multiplying the RMSEC value by 3. For the greenness assessment, ChlorTox was used [
20]. The G2 and B2 criteria address the green (linked to the estimated electricity consumption due to the carbon footprint from electricity production and supply) and the blue (representing the cost intensity) attributes. The sixth criterion, sample throughput (B1), is a blue parameter that measures the maximum number of analyses or measurements a method can perform during 24 h of continuous operation.
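The three red criteria follow directly from their definitions and can be sketched as below. The input values are hypothetical, and the conversion of these raw quantities into the percentage scores reported by the RGBfast model is not reproduced here.

```python
from statistics import mean, stdev


def trueness_r1(recoveries_percent):
    """R1: mean absolute difference between recovery values (%) and 100%."""
    return mean(abs(r - 100.0) for r in recoveries_percent)


def precision_r2(replicate_sets):
    """R2: mean relative standard deviation (%) over the replicate
    results of each prediction sample."""
    return mean(100.0 * stdev(reps) / mean(reps) for reps in replicate_sets)


def lod_r3(rmsec):
    """R3: limit of detection estimated as 3 x RMSEC."""
    return 3.0 * rmsec


# Hypothetical recoveries (%), triplicate predictions, and RMSEC.
print(trueness_r1([95.0, 105.0, 102.0]))
print(precision_r2([[10.0, 10.0, 10.0], [9.0, 10.0, 11.0]]))
print(lod_r3(1.5))
```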
4. Discussion
Among the four analytes investigated in this study, clay quantification performed best in the proposed methodologies. Using the benchtop spectrophotometer, the PLS1 model achieved an RMSEP of 2.1%, while the portable spectrophotometer achieved an RMSEP of 2.0%. These values resulted in RPDs of 2.0 and 2.1, respectively, indicating very good predictive performance (RPD ≥ 2) [
16].
These results were slightly better than those obtained when clay was quantified simultaneously with the other analytes (PLS2 model), which had RMSEP and RPD values of 2.3% and 1.8 for the benchtop method and 2.4% and 1.8 for the portable method. The siPLS models improved the calibration performance but did not enhance the predictive performance.
These results are better than those reported in other studies on Brazilian soil samples, which found an error of 5.1% (RMSEP) using the cubist regression model combined with visible, near-infrared, and shortwave infrared spectral regions (Vis-NIR-SWIR) [
8]. Similarly, they are superior to those presented by Fernández-Martínez and colleagues, who obtained R
2CAL, RMSEP, and RPD values of 0.62, 3.1%, and 1.8, respectively, for clay using a benchtop NIR spectrophotometer [
18].
Furthermore, when comparing the results of the reference methodology with those obtained from the benchtop and portable methods proposed for analyzing the test sample (
Figure 8), no significant differences were observed between the results (
p > 0.05). This performance was consistent across all developed models except for the siPLS model applied to the portable spectrophotometer.
For sand quantification, the performance of the PLS1 and siPLS1 models was lower than when the analytes were quantified simultaneously (PLS2), with RPD values of 1.3 and 0.8 for the benchtop and portable methods, respectively. As shown in
Figure 8, for the analysis of samples using the portable method, only the PLS2 model produced results for sand that were consistent with the reference methodology. Both the PLS1 and PLS2 models yielded results compatible with the reference methodology (
p > 0.05) for the benchtop method.
For silt quantification, the narrow silt content range of 4% to 8% in the test set contributed to the low predictive performance of the models (RPD < 1.0). Only when using the benchtop method with the PLS1 and PLS2 models were results obtained that were equivalent to those of the reference method (
p > 0.05). It is important to highlight that the reference method also carries error, as the silt content is determined indirectly based on the concentrations of clay and sand [
1]. This results in a higher error fraction in silt quantification than other soil texture components [
8].
In the determination of organic matter, the best results were obtained using the benchtop regression model (siPLS), with RMSEP = 0.2% and RPD = 1.5, utilizing the four selected infrared spectral regions: 1697.2–1771.2 nm; 1896.8–1989.6 nm; 2149.6–2269.6 nm; and 2336.4–2405 nm.
The selected spectral regions comprise areas between 1910 and 1930 nm, indicating the presence of water, polysaccharides, and carboxylic acids, and the region from 2225 to 2400 nm attributed to different organic compounds such as polysaccharides and aliphatic organic compounds [
24]. The contribution of the region between 2160 and 2180 nm was attributed to nitrogen-containing organic compounds, such as amines and amides [
25].
The portable methodology showed better predictive performance with the PLS1 model, achieving RMSEP = 0.3% and RPD = 1.2 for organic matter determination. However, the simultaneous determination (PLS2) yielded similar results (RMSEP = 0.3% and RPD = 1.1). As with clay, no significant differences were observed between the reference method and the proposed benchtop and portable methods (p > 0.05), with consistent performance across all models except for the siPLS model applied to the portable spectrophotometer.
In PLS1, a separate set of scores and loading vectors is calculated for each constituent of interest. In this case, the separate set of scores and loading vectors are specifically tuned for each constituent and, therefore, should give more accurate predictions than PLS2 [
26]. However, applying PLS2 algorithms requires only one calibration model to quantify all analytes.
Nevertheless, when examining the performance of the PLS2 models developed for the four analytes investigated, no significant differences (p > 0.05) were observed in the clay, sand, silt, and organic matter results. The PLS2 models showed higher RPD values for sand quantification than the PLS1 and siPLS1 models.
This performance of the simultaneous determination models can be attributed to the strong correlation between analyte concentrations in the sample. PLS2 models can capture this interdependence by extracting latent components that better explain the variations across all response variables simultaneously. This can lead to more accurate predictions than multiple independent PLS1 models for each variable [
27]. Indeed, the PLS2 models demonstrated better performance in quantifying sand, which is strongly correlated with clay concentration (Pearson’s correlation coefficient = −0.9622), as shown in
Table 1.
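A correlation of this kind is computed as the covariance of the two variables divided by the product of their standard deviations, as in the sketch below. The clay and sand values are hypothetical and are not the data behind the coefficient reported above.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    x_bar, y_bar = mean(x), mean(y)
    cov = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    var_x = sum((a - x_bar) ** 2 for a in x)
    var_y = sum((b - y_bar) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical contents (%): silt is nearly constant, so sand falls
# almost one-for-one as clay rises.
clay = [18.0, 24.0, 30.0, 35.0, 41.0]
sand = [76.0, 69.0, 64.0, 58.0, 53.0]
print(pearson_r(clay, sand))
```

Since clay, silt, and sand sum to 100%, a narrow silt range forces clay and sand to vary in opposite directions, which is consistent with the strong negative coefficient.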
To provide a more comprehensive assessment of the methods, white analytical chemistry metrics were employed to compare analytical performance, environmental impact, and practicality. A detailed analysis of these results is presented in the
Supplementary Materials (Table S1).
The benchtop and portable infrared methods exhibited a remarkable similarity in analytical performance compared to reference methods based on red criteria [
20]. Both sets of methods yielded comparable percentage values, ranging from 46% to 62% for the benchtop and portable methods and 48% to 59% for the reference methods.
However, when considering green criteria, which evaluate the environmental sustainability of analytical procedures based on aspects such as the elimination of or reduction in hazardous reagents, minimization of waste, lower energy consumption, and reduced sample preparation steps [
19,
20], near-infrared methods performed significantly better than the reference method, achieving results of 78% to 82%, compared to 30% to 40% for the reference methods. This improvement is mainly due to the complete elimination of chemical reagents and the reduced generation of residues, aligning with the principles of Green Analytical Chemistry [
27].
Considering the blue criterion, which focuses on practical aspects related to analytical productivity, such as speed of analysis, cost efficiency, portability, and operational simplicity [
19,
20], the results obtained in this study show that near-infrared methods and the reference method performed differently when evaluating physical and chemical parameters. Specifically for texture, the benchtop and portable near-infrared methods demonstrated higher efficiency, with 54% and 56%. In contrast, the reference method obtained lower results, 30% to 33%. This difference indicates the advantages of near-infrared methods regarding practicality and analytical frequency for texture determination.
Nevertheless, when assessing organic matter, the reference method presented a value of 57% for the blue criterion, outperforming near-infrared methods, which obtained a value of 41%. This is due to the analytical frequency of each methodology since an automated sample introduction system is used for the reference method, while manual processes were considered for the near-infrared methods. Thus, if the near-infrared methods are automated, they can present an analytical frequency up to 10 times higher than that of manual processes.
The analysis of the combination of red, green, and blue colors, resulting in white, enabled the evaluation of the adherence of the analytical methods to the principles of white analytical chemistry. The findings of this study demonstrate that near-infrared methods exhibited greater concordance with these principles, with values for the white color falling within the range of 57% to 64%. In contrast, reference methods yielded values between 37% and 43%. This difference indicates that near-infrared methods combined with chemometric methods are more promising for determining soil texture and organic matter, as they present advantages in terms of sustainability, practicality, and reduced environmental impact when compared to reference methods.
5. Conclusions
This study enabled the development of a sustainable and clean analytical process using NIRS combined with chemometrics, allowing for the creation of individual (PLS1 and siPLS1) and simultaneous (PLS2) regression models to maximize predictive performance for texture and organic matter in soil samples.
Benchtop and portable NIR spectrophotometer methods have proven satisfactory for directly determining soil texture and organic matter, except for silt quantification using the portable method, which yielded less satisfactory results. The use of portable equipment, in particular, reinforces the field-deployable and decentralized potential of the method, contributing to greener and more accessible soil analysis practices. The portable equipment used in this study stands out due to its significant accessibility compared to the benchtop equipment, which may facilitate the adoption of this methodology in agronomic laboratories.
Furthermore, both methodologies are classified as white analytical chemistry methods, characterized by their analytical performance, functionality, and sustainability. Notably, they allow for the quantification of these analytes without the use of chemical reagents and the generation of chemical waste, supporting their classification as environmentally friendly tools aligned with the principles of green and white analytical chemistry.
In comparing the calibration models, it was observed that while PLS1 models allow for the development of more specific models and siPLS models offer additional refinement through variable selection, PLS2 models for simultaneous determination showed an equivalent performance.