1. Introduction
Soil is a critical component of the global carbon cycle, functioning as both a reservoir and regulator of carbon storage and release, thereby playing a fundamental role in maintaining ecological balance [1]. Furthermore, accurate soil data are essential for improving agricultural productivity and ensuring long-term ecological sustainability [2]. Therefore, analyzing the content and spatial distribution of soil properties is crucial for understanding the carbon cycle and optimizing agricultural production. Soil properties exhibit significant spatial heterogeneity. Although traditional laboratory measurement techniques can provide highly accurate estimates of soil properties, they are limited in their capacity to comprehensively capture spatial distribution patterns and dynamic changes due to high costs and time-consuming procedures [3]. In recent years, Visible-NIR spectroscopy has gained widespread use in soil analysis and digital soil mapping owing to its rapid, cost-effective nature and the absence of hazardous chemicals in the process [4,5,6,7]. However, despite the notable advantages of Visible-NIR spectroscopy in predicting soil properties, its accuracy can be affected by external factors such as soil moisture, spatial heterogeneity, and land-use variations [8,9]. Research suggests that the primary source of error in predicting soil properties from Visible-NIR spectra lies in the modeling process that links spectral data to target soil parameters [5]. Therefore, developing an accurate and reliable prediction model based on Visible-NIR spectroscopy is essential for improving soil property estimations.
In recent decades, mathematical modeling techniques have been widely applied to predict soil properties using spectroscopy, yielding consistent and efficient results [10]. Traditionally, most studies have focused on linear models such as principal component regression (PCR), multiple linear regression (MLR), and partial least squares regression (PLSR) [11,12,13]. Xie et al. [14] employed stepwise multiple linear regression, PCR, and PLSR to develop and evaluate the optimal prediction model for salinized soils in northern Shandong Province, China. However, the relationship between Visible-NIR spectra and soil properties is often complex and nonlinear [15,16], primarily due to significant heterogeneity in soil composition and the overlapping spectral reflectance of individual soil components [17,18]. Machine learning techniques excel at modeling nonlinear relationships and managing large numbers of features and complex data structures effectively [19]. Consequently, machine learning models such as support vector machines (SVMs) and random forest (RF) have been increasingly adopted to address the nonlinear relationships between soil properties and Visible-NIR spectra [20,21]. De Santana et al. [20] compared the predictive performance of PLSR with that of support vector machine regression for estimating soil organic matter. The results indicated that the support vector machine method demonstrated greater generalization capability and higher predictive accuracy.
In recent years, the growing scale of available data and the ongoing optimization of intelligent algorithms have driven the transition from machine learning to deep learning. Deep learning not only excels at uncovering complex nonlinear relationships between spectral data and soil properties but also has been shown to outperform traditional machine learning methods in predicting and mapping soil properties [22]. In 2015, Veres et al. [23] were the first to apply deep learning techniques to the spectral estimation of soil properties. Kawamura et al. [24] compared a Convolutional Neural Network (CNN) model with PLSR and RF methods to evaluate their predictive ability in estimating soil phosphorus content. The results indicated that the deep learning model outperformed traditional machine learning methods in both accuracy and robustness. Hosseinpour-Zarnaq et al. [25] developed a CNN model using Vis-NIR spectral data from the LUCAS topsoil dataset, which significantly outperformed the PLSR model, particularly in predicting key soil properties such as organic carbon and calcium carbonate, achieving ratio of performance to deviation (RPD) values of 4.02 and 3.89, respectively. Although CNN models effectively learn local and abstract features from raw spectral data, they are limited in capturing the dependencies that arise from the sequential nature of spectral data [26]. Recurrent Neural Networks (RNNs) are specifically designed for sequential data such as time series, feeding the output of each step back in as input to the next, while Long Short-Term Memory (LSTM) networks excel at capturing long-term dependencies and leveraging correlations within spectral data treated as a sequence [27,28]. Singh and Kasana [29] developed a hybrid framework that employed Principal Component Analysis (PCA) and Locality Preserving Projections (LPPs) for dimensionality reduction, combined with RNN variants such as LSTM and Gated Recurrent Unit (GRU). This approach outperformed CNN models in capturing both short-term and long-term dependencies within the LUCAS hyperspectral dataset. Miao et al. [30] applied an LSTM-CNN model to predict soil organic matter using the Hebei Soil Spectral Library (HSSL), achieving high accuracy (R² = 0.96, RMSE = 1.66 g·kg⁻¹) by effectively extracting both spatial and temporal features from the spectral data.
However, the aforementioned methods treat all spectral information equally, which can adversely affect the model’s predictive accuracy when redundant or irrelevant information is incorporated [31]. Therefore, it is crucial to prioritize meaningful features while suppressing irrelevant ones to improve the model’s predictive accuracy. The attention mechanism serves as a resource allocation strategy that selects the most relevant information for the current task from a large pool of data, thereby enhancing the model’s ability to learn and represent critical features [32]. Zhao et al. [31] proposed the SECNN-E attention network for estimating soil organic carbon content, which effectively manages complex soil spectral data and mitigates the impact of redundant features on predictive accuracy. This approach facilitates the selection and learning of more meaningful features.
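To make the idea of attention as a resource allocation strategy concrete, the following is a minimal sketch of generic softmax attention over per-band features. It is not the SECNN-E network of Zhao et al. [31], nor the attention module used in this study; the function names, the feature values, and the relevance scores are all hypothetical, and in a trained network the scores would be learned rather than fixed by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features, scores):
    """Turn relevance scores into weights that sum to 1, then reweight the features."""
    weights = softmax(scores)
    weighted = [w * f for w, f in zip(weights, features)]
    return weights, weighted


# Hypothetical features for four spectral bands and hypothetical relevance
# scores (assumed values for illustration only, not data from this study).
band_features = [0.42, 0.40, 0.75, 0.10]
band_scores = [0.1, 0.1, 2.0, -1.0]

weights, weighted = attend(band_features, band_scores)
for i, (w, v) in enumerate(zip(weights, weighted)):
    print(f"band {i}: weight={w:.3f}, weighted feature={v:.3f}")
```

In this sketch, the band with the highest score receives the largest weight, while bands with low scores are suppressed rather than discarded, which is the behavior the paragraph above describes as prioritizing meaningful features.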
Although some studies have proposed hybrid deep learning methods that capture various aspects of soil spectral data, such as local features and temporal dependencies, these elements are often processed independently, constraining the models’ ability to holistically integrate spatial and temporal information. To address this limitation, we propose an LSTM-CNN model enhanced with an attention mechanism for soil property prediction. The model prioritizes sensitive spectral bands (specific regions of the spectrum that exhibit strong and consistent correlations with soil properties) by assigning them higher weights. This approach is intended to capture the nonlinear spatial and temporal relationships between spectral data and soil properties. In this study, a comprehensive topsoil dataset is employed, and deep learning models are leveraged as robust and precise tools for data mining. Furthermore, we objectively evaluate this approach by benchmarking its performance against traditional machine learning models and established methods reported in the literature. Finally, we explore the model’s applicability and advantages in predicting various soil properties.
The key contributions of this study are the following:
A novel LSTM-CNN-Attention model is developed for predicting soil properties from hyperspectral data;
The model integrates temporal and spatial feature extraction with attention mechanisms to improve predictive accuracy;
The proposed model outperforms not only traditional machine learning models but also previous deep learning approaches.
4. Discussion
A broader comparison with other models explored in recent studies on soil property prediction highlights further advantages of the proposed LSTM-CNN-Attention model. While traditional machine learning methods such as PLSR, SVR, and RF are effective in certain contexts, they struggle to capture the nonlinear and complex relationships present in soil spectral data. Recent deep learning models, including CNN-GRU and CNN-LSTM, have improved predictive performance by leveraging temporal dependencies and spatial features; however, the absence of attention mechanisms limits their ability to emphasize critical features. Zhao et al. [31] demonstrated that integrating attention mechanisms enhances feature discrimination by focusing on relevant bands in hyperspectral data. Similarly, Feng et al. [48] employed a spatial attention mechanism to extract contextual information from multi-channel data for soil property prediction. In line with these findings, the attention mechanism in the proposed model optimizes feature extraction while minimizing the impact of redundant data. Moreover, by integrating LSTM, CNN, and attention modules, the model capitalizes on the unique strengths of each component. In comparison with PCA-LSTM, the proposed model avoids the information loss associated with dimensionality reduction and fully utilizes CNN to extract meaningful spatial patterns. The high R² and RPD values observed in the ablation studies further demonstrate that this integrated architecture significantly enhances prediction accuracy across multiple soil properties.
To further assess the predictive performance of the proposed LSTM-CNN-Attention model for soil property prediction, we compared it with the S-AlexNet model from Hosseinpour-Zarnaq et al. [25]. As presented in Table 7, the proposed model outperforms S-AlexNet across all soil properties, achieving an R² of 0.949 for OC, surpassing the 0.94 reported for S-AlexNet. Similarly, the RPD for CaCO₃ reaches 5.377, compared with 3.89 for S-AlexNet, demonstrating enhanced reliability. The superior performance of the proposed model arises from the integration of the attention mechanism, which highlights key features, and the LSTM component, which captures temporal dependencies in soil spectral data. This combined framework enables the model to effectively learn evolving patterns, delivering more precise and consistent predictions than S-AlexNet, which lacks these components.
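For readers less familiar with the metrics used in this comparison, the following is a minimal sketch of how R², RMSE, and RPD are commonly computed. It assumes that RPD is the sample standard deviation of the observed values divided by the RMSE, which is a widespread convention but may differ in detail from the exact definition used in this study. The function name and the example values are illustrative and are not data from Table 7.

```python
import math


def r2_rmse_rpd(observed, predicted):
    """Return (R^2, RMSE, RPD) for paired observed and predicted values.

    Assumption: RPD = sample standard deviation of the observations / RMSE.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    sd_obs = math.sqrt(ss_tot / (n - 1))
    return r2, rmse, sd_obs / rmse


# Illustrative values only (not measurements from this study).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]

r2, rmse, rpd = r2_rmse_rpd(observed, predicted)
print(f"R2={r2:.3f}, RMSE={rmse:.3f}, RPD={rpd:.2f}")
```

Because RPD scales the prediction error by the spread of the observations, it allows comparison across properties measured in different units, which is why it is reported alongside R² for properties such as OC and CaCO₃.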
The LSTM-CNN-Attention model has demonstrated high efficacy in predicting soil properties; however, its intricate architecture significantly prolongs training times due to the intensive computational requirements of its LSTM, CNN, and Attention components. This trade-off between model performance and efficiency is a well-documented challenge when working with large datasets. To mitigate this issue, future research will explore more efficient alternatives, such as substituting LSTM with GRU, streamlining the Attention mechanism, or implementing parallel processing techniques to enhance training speed.
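As a rough illustration of why replacing LSTM with GRU may reduce cost, the sketch below counts the trainable parameters of a single recurrent layer under the textbook formulation, in which each gate has one input weight matrix, one recurrent weight matrix, and one bias vector. The layer sizes are hypothetical and do not correspond to the architecture in this study, and specific deep learning libraries may add extra bias terms that change the totals slightly.

```python
def recurrent_params(input_dim, hidden_dim, n_gates):
    """Parameter count for one recurrent layer with n_gates gate-like blocks.

    Each block holds an input weight matrix (hidden_dim x input_dim), a
    recurrent weight matrix (hidden_dim x hidden_dim), and one bias vector.
    """
    per_gate = hidden_dim * (input_dim + hidden_dim) + hidden_dim
    return n_gates * per_gate


# Hypothetical layer sizes chosen for illustration only.
INPUT_DIM, HIDDEN_DIM = 64, 128

lstm_params = recurrent_params(INPUT_DIM, HIDDEN_DIM, n_gates=4)  # input, forget, output, candidate
gru_params = recurrent_params(INPUT_DIM, HIDDEN_DIM, n_gates=3)   # update, reset, candidate

print(f"LSTM: {lstm_params} parameters")
print(f"GRU:  {gru_params} parameters ({gru_params / lstm_params:.0%} of LSTM)")
```

Under these assumptions a GRU layer has three quarters of the parameters of an LSTM layer of the same size. Parameter count is only a proxy for training time, so any actual speed-up would need to be measured on the target hardware and dataset.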
5. Conclusions
In this paper, we proposed an LSTM-CNN-Attention model for predicting soil properties from hyperspectral data, comparing its performance with those of both traditional machine learning models and advanced deep learning frameworks. This model integrates the temporal learning capability of LSTM, the feature extraction power of CNN, and the feature enhancement ability of attention mechanisms, achieving superior accuracy and robustness in predicting key soil properties such as OC, N, CaCO₃, and pH(H₂O). The proposed framework outperforms existing methods, including S-AlexNet by Hosseinpour-Zarnaq et al. [25], with consistently higher R² and RPD metrics, demonstrating its effectiveness in modeling the complex nonlinear relationships within spectral data.
A key contribution of this study lies in the integration of the attention mechanism, which enables the model to selectively emphasize relevant spectral features while mitigating the impact of noise and redundant data. This design enhances predictive performance across multiple soil properties, with R² values consistently exceeding 0.9 and RPD values surpassing 3. Comparisons with previous studies further underscore the limitations of models that rely on dimensionality reduction techniques, such as PCA-LSTM, or that lack attention mechanisms, highlighting the superiority of the proposed approach.
While the results validate the potential of the LSTM-CNN-Attention model, challenges remain in enhancing computational efficiency and scalability. Future work will focus on optimizing the model’s architecture and exploring methods to accelerate training. Additionally, the application of this model will be extended to field-collected soil data, accounting for environmental factors such as weather, light intensity, and moisture, which can introduce variability in soil properties. To enhance the model’s adaptability for practical field applications, particular attention will be given to addressing the noise and variability commonly encountered in field-collected data. Advanced preprocessing techniques, domain adaptation strategies, and robust feature selection methods will be investigated to ensure that the model retains its predictive accuracy in real-world scenarios. These enhancements are intended to broaden the model’s applicability to sustainable agricultural practices, with the ultimate goal of facilitating soil health monitoring, efficient resource management, and informed decision making across diverse agricultural environments.