Spectral Data-Driven Prediction of Soil Properties Using LSTM-CNN-Attention Model

Liu, Yiqiang; Shen, Luming; Zhu, Xinghui; Xie, Yangfan; He, Shaofang

doi:10.3390/app142411687


by Yiqiang Liu ¹, Luming Shen ¹,*, Xinghui Zhu ², Yangfan Xie ¹ and Shaofang He ¹,*

¹ College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
² Continuous Education College, Hunan Agricultural University, Changsha 410128, China
* Authors to whom correspondence should be addressed.

Appl. Sci. 2024, 14(24), 11687; https://doi.org/10.3390/app142411687

Submission received: 10 November 2024 / Revised: 10 December 2024 / Accepted: 11 December 2024 / Published: 14 December 2024

(This article belongs to the Topic Recent Progress and Applications in Quantitative Remote Sensing)


Abstract

Accurate prediction of soil properties is essential for sustainable land management and precision agriculture. This study presents an LSTM-CNN-Attention model that integrates temporal and spatial feature extraction with attention mechanisms to improve predictive accuracy. Using the LUCAS soil dataset, the model analyzes spectral data to estimate key soil properties, including organic carbon (OC), nitrogen (N), calcium carbonate (CaCO₃), and pH (in H₂O). The Long Short-Term Memory (LSTM) component captures temporal dependencies, the Convolutional Neural Network (CNN) extracts spatial features, and the attention mechanism highlights critical information within the data. Experimental results show that the proposed model achieves excellent prediction performance, with coefficient of determination (R²) values of 0.949 (OC), 0.916 (N), 0.943 (CaCO₃), and 0.926 (pH), and corresponding ratio of percent deviation (RPD) values of 3.940, 3.737, 5.377, and 3.352. Both R² and RPD values exceed those of traditional machine learning models, such as partial least squares regression (PLSR), support vector machine regression (SVR), and random forest (RF), as well as deep learning models such as CNN-LSTM and Gated Recurrent Unit (GRU). The proposed model also outperforms S-AlexNet. These findings indicate that capturing both temporal and spatial patterns can substantially improve the accuracy and reliability of soil property predictions.

Keywords: LUCAS; hyperspectral data; Vis-NIR spectroscopy; soil properties prediction; deep learning; attention mechanism

1. Introduction

Soil is a critical component of the global carbon cycle, functioning as both a reservoir and regulator of carbon storage and release, thereby playing a fundamental role in maintaining ecological balance [1]. Furthermore, accurate soil data are essential for improving agricultural productivity and ensuring long-term ecological sustainability [2]. Therefore, analyzing the content and spatial distribution of soil properties is crucial for understanding the carbon cycle and optimizing agricultural production. Soil properties exhibit significant spatial heterogeneity. Although traditional laboratory measurement techniques can provide highly accurate estimates of soil properties, they are limited in their capacity to comprehensively capture spatial distribution patterns and dynamic changes due to high costs and time-consuming procedures [3]. In recent years, Visible-NIR spectroscopy has gained widespread use in soil analysis and digital soil mapping owing to its rapid, cost-effective nature and the absence of hazardous chemicals in the process [4,5,6,7]. However, despite the notable advantages of Visible-NIR spectroscopy in predicting soil properties, its accuracy can be affected by external factors such as soil moisture, spatial heterogeneity, and land-use variations [8,9]. Research suggests that the primary source of error in predicting soil properties from Visible-NIR spectra lies in the modeling process that links spectral data to target soil parameters [5]. Therefore, developing an accurate and reliable prediction model based on Visible-NIR spectroscopy is essential for improving soil property estimations.

In recent decades, mathematical modeling techniques have been widely applied to predict soil properties using spectroscopy, yielding consistent and efficient results [10]. Traditionally, most studies have focused on linear models such as principal component regression (PCR), multiple linear regression (MLR), and partial least squares regression (PLSR) [11,12,13]. Xie et al. [14] employed stepwise multiple linear regression, PCR, and PLSR to develop and evaluate the optimal prediction model for salinized soils in northern Shandong Province, China. However, the relationship between Visible-NIR spectroscopy and soil properties is often complex and nonlinear [15,16], primarily due to significant heterogeneity in soil composition and the overlapping spectral reflectance of individual soil components [17,18]. Machine learning techniques excel at modeling nonlinear relationships and at handling large numbers of features and complex data structures [19]. Consequently, machine learning models such as support vector machines (SVMs) and random forest (RF) have been increasingly adopted to address the nonlinear relationships between soil properties and Visible-NIR spectra [20,21]. De Santana et al. [20] compared the predictive performance of PLSR with that of support vector machine regression for estimating soil organic matter. The results indicated that the support vector machine method demonstrated greater generalization capability and higher predictive accuracy.

In recent years, growing data volumes and the ongoing optimization of intelligent algorithms have driven the transition from machine learning to deep learning. Deep learning not only excels at uncovering complex nonlinear relationships between spectral data and soil properties but has also been shown to outperform traditional machine learning methods in predicting and mapping soil properties [22]. In 2015, Veres et al. [23] were the first to apply deep learning techniques to the spectral estimation of soil properties. Kawamura et al. [24] compared a Convolutional Neural Network (CNN) model with PLSR and RF methods to evaluate their predictive ability in estimating soil phosphorus content. The results indicated that the deep learning model outperformed traditional machine learning methods in both accuracy and robustness. Hosseinpour-Zarnaq et al. [25] developed a CNN model using Vis-NIR spectral data from the LUCAS topsoil dataset, which significantly outperformed the PLSR model, particularly in predicting key soil properties such as organic carbon and calcium carbonate, for which it achieved higher ratio of percent deviation (RPD) values of 4.02 and 3.89, respectively. Although CNN models effectively learn local and abstract features from raw spectral data, they are limited in capturing the sequential dependencies inherent in spectral data [26]. Recurrent Neural Networks (RNNs) are specifically designed for time series data, feeding the output of each step back into the input, while Long Short-Term Memory (LSTM) networks excel at capturing long-term dependencies and leveraging correlations within time series spectral data [27,28]. Singh and Kasana [29] developed a hybrid framework that employed Principal Component Analysis (PCA) and Locality Preserving Projections (LPPs) for dimensionality reduction, combined with RNN variants such as LSTM and Gated Recurrent Unit (GRU). This approach outperformed CNN models in capturing both short-term and long-term dependencies within the LUCAS hyperspectral dataset. Miao et al. [30] applied an LSTM-CNN model to predict soil organic matter using the Hebei Soil Spectral Library (HSSL), achieving high accuracy (R² = 0.96, RMSE = 1.66 g·kg⁻¹) by effectively extracting both spatial and temporal features from the spectral data.

However, the aforementioned methods treat all spectral information equally, which can adversely affect the model’s predictive accuracy when redundant or irrelevant information is incorporated [31]. Therefore, it is crucial to prioritize meaningful features while suppressing irrelevant ones to improve the model’s predictive accuracy. The attention mechanism serves as a resource allocation strategy that selects the most relevant information for the current task from a large pool of data, thereby enhancing the model’s ability to learn and represent critical features [32]. Zhao et al. [31] proposed the SECNN-E attention network for estimating soil organic carbon content, which effectively manages complex soil spectral data and mitigates the impact of redundant features on predictive accuracy. This approach facilitates the selection and learning of more meaningful features.

Although some studies have proposed hybrid deep learning methods that capture various aspects of soil spectral data, such as local features and temporal dependencies, these elements are often processed independently, constraining the models’ ability to holistically integrate spatial and temporal information. To address this limitation, we propose an LSTM-CNN model enhanced with an attention mechanism for soil property prediction. The model prioritizes sensitive spectral bands—specific regions of the spectrum that exhibit strong and consistent correlations with soil properties—by assigning weights. This approach effectively captures the nonlinear spatial and temporal relationships between spectral data and soil properties. In this study, a comprehensive topsoil dataset was employed, and deep learning models were leveraged as robust and precise tools for data mining. Furthermore, we will objectively evaluate this approach by benchmarking its performance against traditional machine learning models and established methods reported in the literature. Finally, we will explore the model’s applicability and advantages in predicting various soil properties.

The key contributions of this study are the following:

A novel LSTM-CNN-Attention model is developed for predicting soil properties from hyperspectral data;
The model integrates temporal and spatial feature extraction with attention mechanisms to improve predictive accuracy;
The proposed model outperforms not only traditional machine learning models but also previous deep learning approaches.

2. Materials and Methods

2.1. LUCAS Soil Database

The Land Use and Coverage Area frame Survey (LUCAS), a project conducted by Eurostat, periodically surveys land use, land cover, and their changes over time across the European Union (EU) [33]. LUCAS encompasses 27 EU member states, along with the UK, and is carried out every 3 years. This project provides real-time data on soil physical and chemical properties, along with Global Positioning System (GPS) coordinates, across the EU region. The observation points are mapped, as illustrated in Figure 1. Soil samples were collected using a composite sampling process, with five subsamples taken at each site. The central subsample was obtained from the LUCAS point, while the other four were collected 2 m away in a cross pattern [34]. At each site, approximately 0.5 kg of topsoil samples (0–20 cm) were collected following standard protocols. The samples were subsequently sent to the laboratory for analysis, which included measurements of organic carbon (g·kg⁻¹), pH (measured in H₂O), calcium carbonate (g·kg⁻¹), total nitrogen (g·kg⁻¹), and spectral data.

2.2. Splitting of Soil Samples

In this study, the LUCAS 2015 topsoil dataset (0–20 cm), comprising 21,859 samples, was used as the reference data. Since spectral data were available for only 21,782 samples, all soil samples with spectral data, including both organic and mineral soils, were utilized without accounting for additional factors such as land use classification or soil type. First, the Latin Hypercube Sampling (LHS) method of Minasny and McBratney [35] was employed to divide the dataset into a training set (85%, 18,516 samples) and a test set (15%, 3266 samples). A stratified sampling strategy was applied by dividing the variable range into equal intervals and randomly selecting a value from each interval. This approach minimizes sampling bias and ensures representative coverage of the dataset’s variability. The training set was then divided equally into six parts for six-fold cross-validation. Each fold was used in turn as the validation set, while the remaining five folds were merged to form the training set, so that the process was repeated six times. The training set was used to learn model parameters, while the validation set helped to prevent overfitting and optimize hyperparameter combinations. Because of the computational cost, and because this study focuses primarily on evaluating the model’s effectiveness, hyperparameters were tuned manually on the validation set. In the final testing phase, the test set obtained through data partitioning was used exclusively for model evaluation. This test set remained independent of the cross-validation process and was not involved in model training or validation, thereby providing an unbiased assessment of model performance.
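The partitioning described above can be sketched as follows. This is a minimal illustration only: it uses a plain random shuffle as a stand-in for Latin Hypercube Sampling, which is not reproduced here, and the function name `holdout_and_kfold`, the `seed` argument, and the interleaved fold assignment are assumptions rather than details taken from the study's code.

```python
import random


def holdout_and_kfold(n_samples, n_test, k=6, seed=0):
    """Return (test_idx, folds); folds is a list of (train_idx, val_idx) pairs."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)  # random shuffle: a stand-in for LHS, not LHS itself

    # Hold out the test set first; it never takes part in cross-validation.
    test_idx = indices[:n_test]
    dev_idx = indices[n_test:]

    # Split the remaining samples into k parts of (nearly) equal size.
    parts = [dev_idx[i::k] for i in range(k)]

    folds = []
    for i in range(k):
        val_idx = parts[i]
        # Merge the other k - 1 parts into the training indices for this fold.
        train_idx = [j for m, part in enumerate(parts) if m != i for j in part]
        folds.append((train_idx, val_idx))
    return test_idx, folds


# Sample counts reported in the text: 21,782 spectra, 3266 of them held out.
test_idx, folds = holdout_and_kfold(n_samples=21782, n_test=3266, k=6)
```

With the reported counts, the 18,516 training samples divide evenly into six folds of 3086 samples, so each cross-validation round trains on 15,430 samples and validates on 3086.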

Figure 1. Sampling points of European Union.

2.3. Spectra Measurement

Visible-near infrared (Vis-NIR) absorption spectra of the samples were measured using a FOSS XDS Rapid Content Analyzer (FOSS NIRSystems, Hillerød, Denmark) across a wavelength range of 400–2499.5 nm, with a spectral resolution of 0.5 nm, covering a total of 4200 wavelengths. Each sample was scanned twice, with the sample container rotated between scans to ensure that each measurement captured a different area of the soil sample. This rotation minimized the effects of sample heterogeneity, ensuring more representative spectral measurements. In this study, the average of the two spectra was used for each sample to further reduce noise and improve measurement reliability.

The spectral data in this study are expressed in wavelengths rather than in wavenumbers, the convention commonly used in Fourier transform infrared (FTIR) spectroscopy. Wavelength units are standard practice in Vis-NIR spectroscopy and are consistent with the instrumentation and methodologies applied in this study. FTIR users should note, however, that a grid sampled uniformly in wavelength is not uniform in energy or wavenumber, so the spectral resolution varies across the range when expressed in those units. Despite this, the wavelength-based representation integrates directly with Vis-NIR models and remains compatible with the analytical framework of this research.

2.4. Data Preprocessing

Stevens et al. [36] identified artifacts in the 400–500 nm spectral range of the instrument. Therefore, spectral data with bands between 400 and 499.5 nm were removed. Numerous studies have demonstrated that smoothing and first-order derivative transformations of spectra can effectively reduce noise, enhance the correlation between spectral data and soil properties, and improve model predictive accuracy [37,38]. First, we applied a Savitzky–Golay (SG) smoothing filter with a window size of 101 points and third-order polynomials, followed by a first-order derivative (D1) transformation of the spectral data. Subsequently, the data obtained from the D1 transformation underwent Standard Normal Variate (SNV) transformation [39].

To further optimize computational efficiency and reduce redundant information, the spectral data were down-sampled to 1 nm intervals, reducing the total number of spectral bands to 2000 (Figure 2). This approach balances the need to simplify the dataset while preserving essential spectral features critical for accurate soil property predictions. Although the resampling is not linear in terms of energy, the choice of 1 nm intervals ensures that key information is retained for a reliable model performance. This strategy reflects a trade-off between reducing data complexity and maintaining the integrity of predictive features necessary for robust model outputs.
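The preprocessing chain of this section can be sketched in NumPy as follows. Several choices here are assumptions made for illustration: smoothing and differentiation are combined into a single Savitzky–Golay derivative filter (the study may have applied them as two steps), the spectrum edges are padded by repeating the end values, and down-sampling keeps every second band rather than averaging neighbours. The function names `savgol_first_derivative` and `preprocess` are hypothetical.

```python
import numpy as np


def savgol_first_derivative(spectra, window=101, polyorder=3, delta=0.5):
    """Savitzky-Golay first derivative along the band axis of a 2-D array."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Least-squares polynomial fit inside the window: row 1 of the
    # pseudo-inverse gives the weights of the linear coefficient, i.e. the
    # first derivative at the window centre (per sample, hence / delta).
    design = np.vander(offsets, polyorder + 1, increasing=True)
    weights = np.linalg.pinv(design)[1] / delta
    # Assumption: pad by repeating the edge values so the length is preserved.
    padded = np.pad(spectra, ((0, 0), (half, half)), mode="edge")
    # np.convolve flips its kernel, so reverse the weights to correlate.
    return np.stack(
        [np.convolve(row, weights[::-1], mode="valid") for row in padded]
    )


def preprocess(spectra, wavelengths):
    """Band removal -> SG first derivative -> SNV -> 1 nm down-sampling."""
    keep = wavelengths >= 500.0  # drop the 400-499.5 nm artifact region
    spectra, wavelengths = spectra[:, keep], wavelengths[keep]

    d1 = savgol_first_derivative(spectra, window=101, polyorder=3, delta=0.5)

    # Standard Normal Variate: centre and scale each spectrum individually.
    # (A spectrum with zero variance would need special handling.)
    snv = (d1 - d1.mean(axis=1, keepdims=True)) / d1.std(axis=1, keepdims=True)

    # Assumption: keep every second band to go from 0.5 nm to 1 nm spacing.
    return snv[:, ::2], wavelengths[::2]


# 400-2499.5 nm at 0.5 nm spacing gives the 4200 bands stated in Section 2.3.
wavelengths = np.arange(400.0, 2500.0, 0.5)
demo = np.random.default_rng(0).random((3, wavelengths.size))
processed, kept_wavelengths = preprocess(demo, wavelengths)
```

Removing the 200 bands below 500 nm leaves 4000 bands, and keeping every second one yields the 2000 bands used as model input.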

To improve the model’s training convergence rate and generalization, soil property values were normalized using MinMax scaling, adjusting the target values to a range of 0 to 1. This label normalization ensures consistency in target values, enhancing model performance. During performance evaluation, the estimated soil property values were inverse-transformed to their original scale for accurate comparison.
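A minimal sketch of the label normalization and its inverse is given below. The class name `TargetScaler` is hypothetical, and fitting the minimum and maximum on the training targets only is an assumption; the text does not state which subset the scaling statistics were computed from.

```python
import numpy as np


class TargetScaler:
    """MinMax scaling of one soil property to [0, 1], with an inverse."""

    def fit(self, y):
        # Assumption: statistics come from the training targets only.
        self.y_min = float(np.min(y))
        self.y_max = float(np.max(y))
        return self

    def transform(self, y):
        y = np.asarray(y, dtype=float)
        return (y - self.y_min) / (self.y_max - self.y_min)

    def inverse_transform(self, y_scaled):
        y_scaled = np.asarray(y_scaled, dtype=float)
        return y_scaled * (self.y_max - self.y_min) + self.y_min


# Illustrative values only, not taken from the LUCAS data.
scaler = TargetScaler().fit([2.0, 4.0, 10.0])
scaled = scaler.transform([2.0, 4.0, 10.0])
restored = scaler.inverse_transform(scaled)
```

Predictions made in the scaled space are passed through `inverse_transform` before the evaluation metrics are computed, so that errors are reported in the original units.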

2.5. Model and Methodology

2.5.1. Long Short-Term Memory Neural Network (LSTM)

The Long Short-Term Memory (LSTM) model is an enhanced version of Recurrent Neural Networks (RNNs), specifically designed to address the vanishing and exploding gradient problems that occur when training on long sequences [40]. These issues hinder the learning of long-term dependencies in traditional RNNs. LSTM introduces memory cells with gating mechanisms that selectively retain or discard information across time steps [41].

The LSTM model’s key components are three gates that regulate the flow of information as follows:

The forget gate determines which parts of the previous memory cell state should be discarded.
The input gate controls the incorporation of new information into the cell state.
The output gate determines which portion of the cell state is passed as the hidden state.

These gates enable the model to maintain and update long-term memory, effectively addressing the limitations of standard RNNs.

The operations of the LSTM model are governed by the following set of equations:

$\begin{matrix} f_{t} = \sigma (W_{f} \cdot [h_{t - 1}, x_{t}] + b_{f}) \end{matrix}$

(1)

$\begin{matrix} i_{t} = \sigma (W_{i} \cdot [h_{t - 1}, x_{t}] + b_{i}) \end{matrix}$

(2)

$\begin{matrix} {\tilde{C}}_{t} = \tanh (W_{C} \cdot [h_{t - 1}, x_{t}] + b_{C}) \end{matrix}$

(3)

$\begin{matrix} C_{t} = f_{t} \times C_{t - 1} + i_{t} \times {\tilde{C}}_{t} \end{matrix}$

(4)

$\begin{matrix} o_{t} = \sigma (W_{o} \cdot [h_{t - 1}, x_{t}] + b_{o}) \end{matrix}$

(5)

$\begin{matrix} h_{t} = o_{t} \times \tanh (C_{t}) \end{matrix}$

(6)

In these equations, $f_{t}$, $i_{t}$, and $o_{t}$ correspond to the forget gate, input gate, and output gate, respectively. $\sigma$ and tanh denote the sigmoid and hyperbolic tangent activation functions, respectively, while W and b represent the weight matrix and bias term, respectively. The intermediate cell state and the long-term cell state are denoted by ${\tilde{C}}_{t}$ and $C_{t}$, respectively. Finally, $t - 1$ and $t$ refer to the previous and current time steps, respectively, and $x_{t}$ and $h_{t}$ denote the input and output at the current time step.

As illustrated in Figure 3, the LSTM model consists of three gates, each regulating different aspects of information flow within the memory cell. The forget gate $f_{t}$ and input gate $i_{t}$ update the cell state $C_{t}$, while the output gate $o_{t}$ governs the generation of the hidden state $h_{t}$. This gated mechanism enables the LSTM to effectively capture long-term dependencies by selectively filtering information at each time step.

By employing this architecture, the LSTM model is highly effective for tasks involving sequential data modeling, as it retains important patterns across extended sequences. In this study, the time step refers to the spectral band interval, rather than the time interval used in traditional time series analysis. Each spectral band is treated as part of a sequence, enabling the model to capture global dependencies between the bands.
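To make Equations (1)–(6) concrete, the sketch below implements a single LSTM update in NumPy and steps it along a spectrum, treating each band as one time step. It is an illustration under stated assumptions rather than the study's implementation, which uses PyTorch: the function names, the parameter dictionary layout, the random initialization, and the hidden size are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM update following Equations (1)-(6)."""
    concat = np.concatenate([h_prev, x_t])                     # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ concat + params["b_f"])      # Eq. (1)
    i_t = sigmoid(params["W_i"] @ concat + params["b_i"])      # Eq. (2)
    c_tilde = np.tanh(params["W_C"] @ concat + params["b_C"])  # Eq. (3)
    c_t = f_t * c_prev + i_t * c_tilde                         # Eq. (4)
    o_t = sigmoid(params["W_o"] @ concat + params["b_o"])      # Eq. (5)
    h_t = o_t * np.tanh(c_t)                                   # Eq. (6)
    return h_t, c_t


def run_lstm(sequence, hidden_size, seed=0):
    """Step through a (n_bands, n_features) sequence; return all hidden states."""
    rng = np.random.default_rng(seed)
    input_size = sequence.shape[1]
    params = {}
    for gate in ("f", "i", "C", "o"):
        # Hypothetical initialization; trained weights would be used in practice.
        params[f"W_{gate}"] = rng.normal(
            0.0, 0.1, (hidden_size, hidden_size + input_size)
        )
        params[f"b_{gate}"] = np.zeros(hidden_size)

    h = np.zeros(hidden_size)
    c = np.zeros(hidden_size)
    outputs = []
    for x_t in sequence:  # one iteration per spectral band
        h, c = lstm_step(x_t, h, c, params)
        outputs.append(h)
    return np.stack(outputs)


# A toy "spectrum" of 50 bands with one feature per band.
hidden_states = run_lstm(np.linspace(-1.0, 1.0, 50).reshape(50, 1), hidden_size=8)
```

The products in Equations (4) and (6) are element-wise, which is why the code uses `*` rather than a matrix product for the gate operations.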

2.5.2. One-Dimensional Convolutional Neural Network (1D-CNN)

CNNs are widely used in deep learning due to their ability to capture spatial patterns in data, particularly in high-dimensional datasets such as hyperspectral imagery. CNN architectures typically consist of an input layer and multiple convolutional and pooling layers, followed by fully connected layers, and concluding with an output layer [42]. In this study, we utilized a one-dimensional convolutional neural network (1D-CNN) specifically designed to process one-dimensional spectral data related to soil properties. The 1D-CNN architecture begins with an input layer that receives the spectral data, followed by multiple hidden layers comprising convolutional and pooling operations. Convolutional layers apply filters to extract relevant features from the input data, with each filter scanning across the spectral data to detect local patterns. Pooling layers subsequently reduce dimensionality, condensing the extracted features to maintain computational efficiency while preserving essential information.

In this study, the architecture consists of five convolutional layers, with the first two layers followed by an average pooling layer and the final layer followed by an adaptive average pooling layer. The convolutional layers employ 1D filters to detect features within the spectral data, processing information across multiple wavelengths. To introduce nonlinearity and enhance the model’s ability to capture complex relationships, a Rectified Linear Unit (ReLU) activation function is applied after each layer [43]. Additionally, batch normalization is applied after each convolutional layer to stabilize the training process and accelerate convergence by normalizing activations within each mini-batch [44]. The final output is generated by a fully connected (FC) layer that integrates the features extracted by the convolutional layers. To prevent overfitting during training, a dropout layer with a rate of 0.02 was added to the FC layer. Dropout randomly disables certain neurons during training, encouraging the model to learn more robust features [45]. This architecture allows the model to efficiently capture local characteristics of the spectral data and gradually aggregate them into global information, leading to accurate predictions of soil properties. The ability of CNNs to learn hierarchical representations makes them particularly well suited for the complex, high-dimensional nature of soil spectral data, thereby enhancing predictive performance.
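The building block described above (1D convolution, batch normalization, ReLU, average pooling) can be sketched as a forward pass in plain NumPy. This is not the study's PyTorch code: the ordering of batch normalization and ReLU, the kernel size, the channel count, and the pooling width are assumptions chosen for illustration, and the learnable scale and shift of batch normalization are omitted.

```python
import numpy as np


def conv1d(x, kernels, bias):
    """'Valid' 1D cross-correlation.

    x: (batch, c_in, length); kernels: (c_out, c_in, k); bias: (c_out,).
    """
    k = kernels.shape[2]
    # windows has shape (batch, c_in, length - k + 1, k)
    windows = np.lib.stride_tricks.sliding_window_view(x, k, axis=2)
    return np.einsum("bclk,ock->bol", windows, kernels) + bias[None, :, None]


def batch_norm(x, eps=1e-5):
    # Normalize each channel over the mini-batch and the band axis.
    mean = x.mean(axis=(0, 2), keepdims=True)
    var = x.var(axis=(0, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def relu(x):
    return np.maximum(x, 0.0)


def avg_pool1d(x, size):
    # Drop any trailing bands that do not fill a complete pooling window.
    length = (x.shape[2] // size) * size
    trimmed = x[:, :, :length]
    return trimmed.reshape(x.shape[0], x.shape[1], -1, size).mean(axis=3)


def conv_block(x, kernels, bias, pool=2):
    """Assumed order: convolution -> batch norm -> ReLU -> average pooling."""
    return avg_pool1d(relu(batch_norm(conv1d(x, kernels, bias))), pool)


rng = np.random.default_rng(0)
spectra = rng.normal(size=(4, 1, 2000))   # 4 spectra, 1 channel, 2000 bands
kernels = rng.normal(size=(8, 1, 5))      # hypothetical: 8 filters of width 5
features = conv_block(spectra, kernels, np.zeros(8), pool=2)
```

Each filter slides along the band axis and responds to a local pattern, while pooling halves the length, which is how stacked blocks move from local characteristics toward a global summary of the spectrum.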

2.5.3. Self-Attention Mechanism

When processing large datasets, the attention mechanism plays a pivotal role in selectively emphasizing the most relevant information. It functions by selecting the most suitable input from multiple alternatives based on observed environmental data [46]. The self-attention mechanism, as illustrated in Figure 4, is a computational process that assigns adaptive weights to elements of the input data based on their relevance to the task. By dynamically identifying and prioritizing important features, this mechanism enables the model to focus on the most critical spectral bands, thereby enhancing the accuracy of soil property prediction.

The self-attention mechanism operates through the following steps:

Projection of Inputs: The input data matrix ( $X \in R^{N \times D_{x}}$ ) is first transformed into the following three separate representations:

$\begin{matrix} Q = X W_{q} \end{matrix}$

(7)

$\begin{matrix} K = X W_{k} \end{matrix}$

(8)

$\begin{matrix} V = X W_{v} \end{matrix}$

(9)

where Q, K, and V represent the Query, Key, and Value matrices, respectively. $W_{q} \in R^{D_{x} \times D_{k}}$ , $W_{k} \in R^{D_{x} \times D_{k}}$ , and $W_{v} \in R^{D_{x} \times D_{v}}$ are learnable weight matrices. N denotes the sequence length, and $D_{x}$ , $D_{k}$ , and $D_{v}$ represent the dimensions of the input, key, and value spaces.
Similarity Calculation: The similarity between Query and Key is measured using the following scaled dot product:

$\begin{matrix} Score (Q, K) = \frac{Q K^{T}}{\sqrt{D_{k}}} \end{matrix}$

(10)

This calculation measures the compatibility between each Query and Key element. The scaling factor $\sqrt{D_{k}}$ normalizes the dot product to stabilize gradient updates during training.
Attention Weights: The computed similarity scores are passed through a softmax function to transform them into the following probability distribution:

$\begin{matrix} A = Softmax (\frac{Q K^{T}}{\sqrt{D_{k}}}) \end{matrix}$

(11)

The resulting weights determine the importance of each Key in relation to a given Query. These weights are subsequently applied to the Value matrix.
Weighted Aggregation: The attention weights are applied to the Value matrix to compute the following final attention-enhanced output:

$\begin{matrix} Attention (Q, K, V) = Softmax (\frac{Q K^{T}}{\sqrt{D_{k}}}) V \end{matrix}$

(12)

This process ensures that features most relevant to the predictive task receive higher weights, effectively highlighting key spectral bands while suppressing less informative or noisy data. By incorporating the self-attention mechanism, the model dynamically adapts to the unique characteristics of the input data, enabling more accurate and interpretable predictions.
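Equations (7)–(12) translate almost line for line into NumPy. The sketch below is illustrative only: the dimensions and the random projection matrices are hypothetical stand-ins for learned weights, and the function names are not taken from the study's code.

```python
import numpy as np


def softmax(scores):
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention; returns (output, attention weights)."""
    Q = X @ W_q                                  # Eq. (7)
    K = X @ W_k                                  # Eq. (8)
    V = X @ W_v                                  # Eq. (9)
    scores = Q @ K.T / np.sqrt(K.shape[-1])      # Eq. (10), D_k = K.shape[-1]
    A = softmax(scores)                          # Eq. (11)
    return A @ V, A                              # Eq. (12)


# Hypothetical sizes: N = 6 sequence positions, D_x = 4, D_k = 3, D_v = 5.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
output, weights = self_attention(
    X,
    rng.normal(size=(4, 3)),
    rng.normal(size=(4, 3)),
    rng.normal(size=(4, 5)),
)
```

Each row of `weights` sums to one, so every output position is a convex combination of the Value vectors, weighted by how well its Query matches each Key.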

2.5.4. Proposed Model and Prediction Workflow

The LSTM-CNN-Attention model integrates temporal, spatial, and feature-specific information from spectral data to predict soil properties with high accuracy. Figure 5 illustrates the model architecture, while Figure 6 provides a detailed flowchart of the prediction workflow.

The model workflow begins with the input of spectral data. Sequential dependencies inherent in soil spectral features are captured by the LSTM layer, which is adept at extracting long-term dependencies and temporal patterns. This layer generates context-aware feature representations, effectively summarizing sequential information across wavelengths. These sequential representations are refined by the attention mechanism, which dynamically assigns weights to different features. By leveraging the self-attention mechanism, the model derives Query (Q), Key (K), and Value (V) matrices and calculates a scaled dot product to measure feature importance. This operation is followed by softmax normalization, which transforms the computed scores into a probability distribution. The attention mechanism enables the model to prioritize the most informative spectral bands, focusing on regions critical for predicting specific soil properties. The weighted feature representations are then processed by the convolutional layer, which extracts spatial patterns and localized correlations through multiple convolutional filters. These filters operate at varying scales to recognize complex spatial relationships in the spectral data. Subsequently, a fully connected linear layer aggregates the refined features and generates predictions for soil properties. During training, the model’s predictions are compared against ground truth values, and its parameters are optimized using gradient descent to minimize error.

The prediction workflow, as illustrated in Figure 6, systematically outlines the steps involved. Spectral data undergo preprocessing to remove noise and artifacts, followed by partitioning into training, validation, and test sets. During the training phase, the model is initialized and iteratively optimized using mini-batch gradient descent. Hyperparameter tuning is conducted based on validation set performance, and training continues until convergence criteria are satisfied or a maximum number of epochs is reached. The trained model is then evaluated on the test set using metrics such as R² and root mean square error (RMSE) to ensure robust and reliable predictions.

The integration of LSTM, attention, and CNN components offers a comprehensive approach to feature extraction and soil property prediction. The LSTM layer effectively captures sequential dependencies, the attention mechanism highlights key spectral features, and the CNN layer enhances spatial feature extraction. This synergistic design ensures superior performance compared with traditional methods, making it particularly well suited for soil property prediction tasks.

The entire framework is trained in an end-to-end manner, ensuring that the parameters of the LSTM, Attention, and CNN are jointly optimized, thereby enhancing both predictive performance and generalization. The parameters for each model component are presented in Table 1.

2.6. Model Evaluation

The performance of the models was evaluated using the coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE), and ratio of percent deviation (RPD). These metrics were calculated using the following equations:

$\begin{matrix} R^{2} = 1 - \frac{\sum_{i = 1}^{n} {({\hat{y}}_{i} - y_{i})}^{2}}{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}} \end{matrix}$

(13)

$\begin{matrix} \mathrm{RMSE} = \sqrt{\frac{\sum_{i = 1}^{n} {({\hat{y}}_{i} - y_{i})}^{2}}{n}} \end{matrix}$

(14)

$\begin{matrix} \mathrm{MAE} = \frac{\sum_{i = 1}^{n} | {\hat{y}}_{i} - y_{i} |}{n} \end{matrix}$

(15)

$\begin{matrix} \mathrm{SD} = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}} \end{matrix}$

(16)

$\begin{matrix} \mathrm{RPD} = \frac{\mathrm{SD}}{\mathrm{RMSE}} \end{matrix}$

(17)

where $y_{i}$, ${\hat{y}}_{i}$, $\bar{y}$, and n represent the measured value, predicted value, mean of the measured values, and number of samples, respectively. The R² metric measures the goodness of fit between predicted and actual values, with values closer to 1 indicating higher accuracy. RMSE quantifies the average deviation between predicted and actual values, while MAE captures the average magnitude of errors. Lower values for both RMSE and MAE reflect better model performance.

RPD is the ratio of the standard deviation (SD) of the measured data to the prediction error, taken here as the RMSE (Equation (17)), with larger RPD values indicating better predictive performance of the model. According to Cao et al. [47], models can be categorized based on their RPD values, as shown in Table 2.
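The four metrics can be computed directly from Equations (13)–(17), as in the minimal sketch below. The function name `regression_metrics` and the example values are illustrative and are not taken from the study; note that SD uses the 1/n form of Equation (16) rather than the 1/(n − 1) sample form.

```python
import math


def regression_metrics(y_true, y_pred):
    """R2, RMSE, MAE and RPD as defined in Equations (13)-(17)."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)

    rmse = math.sqrt(ss_res / n)                                # Eq. (14)
    mae = sum(abs(p - t) for t, p in zip(y_true, y_pred)) / n   # Eq. (15)
    sd = math.sqrt(ss_tot / n)                                  # Eq. (16)
    return {
        "R2": 1.0 - ss_res / ss_tot,                            # Eq. (13)
        "RMSE": rmse,
        "MAE": mae,
        "RPD": sd / rmse,                                       # Eq. (17)
    }


# Toy values for illustration only.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

The inputs should be on the original scale of the soil property, that is, after the inverse of the MinMax label scaling has been applied.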

2.7. Experimental Setup

The experiments were run on a machine equipped with an Intel® Xeon® Gold 6330 CPU (Intel, Santa Clara, CA, USA) and an NVIDIA® Tesla V100 GPU with 32 GB of graphics memory (NVIDIA, Santa Clara, CA, USA). The operating system was Ubuntu 22.04, the programming language was Python 3.10.12, and the deep learning framework was PyTorch 2.0.1 (GPU build). All the code in this paper is publicly available on GitHub at https://github.com/liuyiqiang123/LSTM-CNN-Attention-Model_Python (accessed on 10 December 2024).

3. Results

3.1. Descriptive Statistical Analysis

To ensure that dataset partitioning did not introduce bias and that the test set accurately represented the diversity of the entire dataset, a detailed statistical and graphical analysis was performed. Table 3 summarizes the descriptive statistics of soil properties for the total dataset, training set, and test set. Metrics such as mean, standard deviation, skewness, and coefficient of variation (CV) reveal minimal differences between subsets, confirming that the LHS method effectively preserved the dataset’s statistical characteristics.

Figure 7 displays Kernel Density Estimation (KDE) plots for organic carbon (OC), nitrogen (N), calcium carbonate (CaCO₃), and pH. The overlapping density curves for the total dataset, training set, and test set indicate consistent distributions, demonstrating that the test set adequately represents the variability of soil properties without introducing sampling bias. Figure 8 illustrates the results of Principal Component Analysis (PCA) on the spectral data, showing density plots for PC1 and PC2 across all subsets. The alignment of density curves for the total dataset, training set, and test set confirms that the partitioning method preserved the diversity of spectral features, ensuring representativeness in soil property prediction.

The combined results of descriptive statistics and graphical analyses validate the use of LHS for dataset partitioning. The training and test sets are consistent with the total dataset in terms of both soil properties and spectral features, ensuring unbiased and reliable model evaluations.

3.2. Impact of Spectrum Preprocessing on Model Performance

To assess the effect of removing the 400–499.5 nm spectral band during preprocessing, a comparative experiment was conducted using four models: LSTM-CNN-Attention, CNN, PLSR, and RF. The results, presented in Table 4, indicate that removing this spectral range consistently improved prediction performance across all evaluated soil properties. For instance, the R² of the LSTM-CNN-Attention model for OC increased from 0.930 to 0.949, accompanied by a reduction in RMSE from 19.592 to 16.687. Similar performance enhancements were observed for N, CaCO₃, and pH predictions. These findings suggest that excluding the 400–499.5 nm band reduces noise and artifacts in the data, thereby improving the models’ generalization capabilities and predictive accuracy.
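
The band-removal step itself amounts to masking the first wavelengths of each spectrum. A minimal sketch follows; the helper names are hypothetical, the absorbance values are placeholders, and the 400–2499.5 nm grid at 0.5 nm spacing is assumed here rather than taken from this section.

```python
def build_wavelengths(start_nm=400.0, stop_nm=2499.5, step_nm=0.5):
    """Wavelength grid in nm (range and step are assumptions of this sketch)."""
    count = int(round((stop_nm - start_nm) / step_nm)) + 1
    return [start_nm + i * step_nm for i in range(count)]

def drop_band(wavelengths, spectrum, low_nm=400.0, high_nm=499.5):
    """Remove every point whose wavelength lies in [low_nm, high_nm]."""
    kept = [(w, a) for w, a in zip(wavelengths, spectrum)
            if not (low_nm <= w <= high_nm)]
    return [w for w, _ in kept], [a for _, a in kept]

wavelengths = build_wavelengths()
spectrum = [0.5] * len(wavelengths)  # hypothetical flat absorbance curve
kept_w, kept_a = drop_band(wavelengths, spectrum)

print(len(wavelengths), len(kept_w), kept_w[0])  # → 4200 4000 500.0
```

Under these assumptions, 200 of 4200 points are discarded and the retained spectrum starts at 500 nm.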

3.3. Evaluation Results of LSTM-CNN-Attention

This section presents the predictive performance of key soil properties, including OC, N, CaCO₃, and pH in H₂O. To assess the efficacy of the LSTM-CNN-Attention model, ablation experiments were conducted by systematically removing individual components (LSTM, CNN, and Attention) to evaluate their contributions to the model’s overall performance. The results are detailed in Table 5, with performance metrics comprising R², RMSE, MAE, and RPD. The findings indicate that the proposed LSTM-CNN-Attention model outperforms all other variations across all evaluated metrics and soil properties. In particular, the model demonstrates excellent predictive accuracy, achieving R² values above 0.9 for all soil properties, including 0.949 for OC, 0.916 for N, 0.943 for CaCO₃, and 0.926 for pH. Moreover, the model exhibits strong reliability, with RPD values exceeding 3.0, including the highest RPD of 5.377 for CaCO₃. These results highlight the model’s ability to deliver both accurate and reliable predictions of soil properties.
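
For readers who want to recompute the four metrics on their own predictions, a minimal sketch is given below. It assumes the common definitions, with RPD taken as the sample standard deviation of the observed values divided by the RMSE; the paper's Section 2.6 remains the authoritative definition, and the example values are hypothetical.

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute R², RMSE, MAE and RPD for paired observations and predictions."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    r2 = 1.0 - ss_res / ss_tot
    # RPD: sample SD of the observations over RMSE (assumed definition).
    sd = math.sqrt(ss_tot / (n - 1))
    rpd = sd / rmse
    return {"R2": r2, "RMSE": rmse, "MAE": mae, "RPD": rpd}

# Hypothetical observed and predicted values.
y_true = [10.0, 20.0, 30.0, 40.0, 50.0]
y_pred = [12.0, 18.0, 33.0, 39.0, 48.0]
metrics = regression_metrics(y_true, y_pred)
print({name: round(value, 3) for name, value in metrics.items()})
```

Higher R² and RPD and lower RMSE and MAE indicate better predictions, which is the reading applied to Table 5.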

The ablation experiments underscore the contribution of each component within the proposed model. As presented in Table 5, removing any module reduced predictive performance, affirming the importance of integrating the LSTM, CNN, and Attention mechanisms. Among the reduced variants, the standalone CNN combined relatively high computational efficiency with accuracy that approached, but did not match, that of the full LSTM-CNN-Attention model (for example, an R² of 0.913 versus 0.949 for OC). In scenarios with limited computational resources, the CNN model can therefore serve as a practical baseline: its capacity to extract spatial features efficiently offers a balance between predictive accuracy and resource constraints in spectral data prediction tasks. Additionally, the scatter plots in Figure 9 further validate the model’s predictive capability, with predicted values closely aligning with the ground truth for OC, N, CaCO₃, and pH(H₂O). These findings highlight the robustness of the proposed framework in capturing both temporal and spatial patterns, reinforcing its effectiveness as a reliable tool for soil property prediction.
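
The contribution of each module can be read off Table 5 as a difference in R² between variants. The sketch below transcribes the R² rows of Table 5 and computes three such differences; the labels `attention`, `lstm`, and `cnn` are names chosen here for illustration, and attributing a difference to a single module is a simplification because the modules interact.

```python
# R² values transcribed from Table 5, one row per soil property.
r2_ablation = {
    "OC":    {"LSTM": 0.769, "LSTM-Attention": 0.863, "CNN": 0.913,
              "LSTM-CNN": 0.926, "LSTM-CNN-Attention": 0.949},
    "N":     {"LSTM": 0.723, "LSTM-Attention": 0.769, "CNN": 0.829,
              "LSTM-CNN": 0.883, "LSTM-CNN-Attention": 0.916},
    "CaCO3": {"LSTM": 0.611, "LSTM-Attention": 0.811, "CNN": 0.889,
              "LSTM-CNN": 0.915, "LSTM-CNN-Attention": 0.943},
    "pH":    {"LSTM": 0.284, "LSTM-Attention": 0.409, "CNN": 0.888,
              "LSTM-CNN": 0.906, "LSTM-CNN-Attention": 0.926},
}

gains = {}
for prop, row in r2_ablation.items():
    gains[prop] = {
        # Adding attention on top of LSTM-CNN.
        "attention": row["LSTM-CNN-Attention"] - row["LSTM-CNN"],
        # Adding LSTM to a CNN-only model.
        "lstm": row["LSTM-CNN"] - row["CNN"],
        # Adding CNN to an LSTM-only model.
        "cnn": row["LSTM-CNN"] - row["LSTM"],
    }
    print(prop, {name: round(value, 3) for name, value in gains[prop].items()})
```

On these numbers, adding the CNN to an LSTM-only model produces the largest R² change for every property (0.622 for pH), while attention adds between 0.020 and 0.033.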

3.4. Comparison of Performance with Other Models

To further validate the advantages of the LSTM-CNN-Attention model for soil property forecasting, we selected traditional machine learning models (PLSR [13], SVR [20], RF [21]) and advanced deep learning models (PCA-LSTM [26], CNN-LSTM [30], CNN-GRU [18]) for comparative analysis. To ensure fairness, all models were trained using the same input data and preprocessing pipeline. The performance metrics for all prediction models on the testing dataset are presented in Table 6. It is evident from the table that the proposed LSTM-CNN-Attention model surpasses both traditional machine learning and deep learning models across all key soil properties. Among the machine learning models, RF outperforms PLSR and SVR, with an average R² that is 4.69% and 2.35% higher in relative terms, respectively. However, these traditional models struggle to capture the complex nonlinear relationships inherent in soil data. In contrast, deep learning models such as CNN-LSTM and CNN-GRU demonstrate superior performance by effectively learning temporal dependencies and capturing spatial patterns. The average R² values of CNN-LSTM and CNN-GRU exceed that of the best-performing traditional model, RF, by 1.84% and 4.11% in relative terms, respectively. Nevertheless, both still underperform relative to the proposed model. Notably, the PCA-LSTM model exhibits suboptimal performance, likely due to information loss introduced by PCA, which limits its ability to capture essential features. Additionally, the absence of CNN further restricts the model’s ability to extract local patterns, resulting in reduced prediction accuracy.
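
The averaged comparisons quoted above can be recomputed directly from the R² rows of Table 6. The sketch below does so; the helper `relative_gain` is a name introduced here. The quoted figures (4.69, 2.35, 1.84, 4.11) match the relative difference in mean R², not the absolute difference in percentage points.

```python
# R² values from Table 6, ordered as OC, N, CaCO3, pH(H2O).
r2_table6 = {
    "Proposed": [0.949, 0.916, 0.943, 0.926],
    "PCA-LSTM": [0.890, 0.813, 0.895, 0.728],
    "CNN-LSTM": [0.923, 0.878, 0.919, 0.869],
    "CNN-GRU":  [0.936, 0.904, 0.935, 0.894],
    "PLSR":     [0.879, 0.828, 0.841, 0.818],
    "SVR":      [0.897, 0.849, 0.876, 0.821],
    "RF":       [0.918, 0.849, 0.919, 0.838],
}

mean_r2 = {model: sum(vals) / len(vals) for model, vals in r2_table6.items()}

def relative_gain(model, baseline):
    """Relative difference in mean R², in percent of the baseline's mean R²."""
    return 100.0 * (mean_r2[model] - mean_r2[baseline]) / mean_r2[baseline]

for model, baseline in [("RF", "PLSR"), ("RF", "SVR"),
                        ("CNN-LSTM", "RF"), ("CNN-GRU", "RF")]:
    absolute_pp = 100.0 * (mean_r2[model] - mean_r2[baseline])
    print(f"{model} vs {baseline}: relative {relative_gain(model, baseline):.2f}%, "
          f"absolute {absolute_pp:.2f} percentage points")
```

For RF versus PLSR, for instance, the mean R² values are 0.881 and 0.8415, a relative gain of 4.69% and an absolute gap of 3.95 percentage points.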

The residual plots in Figure 10 further demonstrate the effectiveness of the proposed model, showing smaller and more concentrated residuals across all soil properties compared with the other models. The residuals of the proposed model are tightly distributed around 0, emphasizing its robustness and reliability in forecasting soil properties. Additionally, Figure 11 presents the line plots for R² and RPD values across all models. In panel (a), the proposed model consistently achieves the highest R² scores, underscoring its ability to capture complex relationships within soil data. In panel (b), the model maintains superior RPD values across all properties, with a peak value for CaCO₃, demonstrating its effectiveness in managing variability and delivering accurate predictions.

These results confirm the significant advantages of the LSTM-CNN-Attention model, which integrates temporal, spatial, and attention mechanisms to outperform both traditional machine learning models and other deep learning architectures. This model provides robust and accurate predictions for soil properties, establishing a new benchmark in soil property forecasting.

4. Discussion

A broader comparison with other models explored in recent studies on soil property prediction highlights further advantages of the proposed LSTM-CNN-Attention model. While traditional machine learning methods such as PLSR, SVR, and RF are effective in certain contexts, they struggle to capture the nonlinear and complex relationships present in soil spectral data. Recent deep learning models, including CNN-GRU and CNN-LSTM, have improved predictive performance by leveraging temporal dependencies and spatial features; however, the absence of attention mechanisms limits their ability to emphasize critical features. Zhao et al. [31] demonstrated that integrating attention mechanisms enhances feature discrimination by focusing on relevant bands in hyperspectral data. Similarly, Feng et al. [48] employed a spatial attention mechanism to extract contextual information from multi-channel data for soil property prediction. In line with these findings, the attention mechanism in the proposed model optimizes feature extraction while minimizing the impact of redundant data. Moreover, by integrating LSTM, CNN, and attention modules, the model capitalizes on the unique strengths of each component. In comparison with PCA-LSTM, the proposed model avoids the information loss associated with dimensionality reduction and fully utilizes CNN to extract meaningful spatial patterns. The high R² and RPD values observed in the ablation studies further demonstrate that this integrated architecture significantly enhances prediction accuracy across multiple soil properties.

To further assess the predictive performance of the proposed LSTM-CNN-Attention model for soil property prediction, we compared it with the S-AlexNet model from Hosseinpour-Zarnaq et al. [25]. As presented in Table 7, the proposed model outperforms S-AlexNet on R², RMSE, and MAE for all four soil properties; for example, it achieves an R² of 0.949 for OC, compared with the 0.94 reported for S-AlexNet. It also attains higher RPD values for N, CaCO₃, and pH, with the RPD for CaCO₃ reaching 5.377 compared with 3.89; the one exception is the RPD for OC, where S-AlexNet reports 4.02 against 3.940 for the proposed model. The improvement is attributed to the integration of the attention mechanism, which highlights key features, and the LSTM component, which captures temporal dependencies in soil spectral data. This combined framework enables the model to effectively learn evolving patterns, delivering more precise and consistent predictions than S-AlexNet, which lacks these components.
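
A metric-by-metric check of Table 7 can be sketched as follows. The dictionaries transcribe the table; the variable names are chosen here, and the S-AlexNet figures are taken as reported in [25] without re-running that model.

```python
# Values transcribed from Table 7.
proposed = {
    "OC":    {"R2": 0.949, "RMSE": 16.687, "MAE": 8.599,  "RPD": 3.940},
    "N":     {"R2": 0.916, "RMSE": 0.993,  "MAE": 0.580,  "RPD": 3.737},
    "CaCO3": {"R2": 0.943, "RMSE": 33.370, "MAE": 12.909, "RPD": 5.377},
    "pH":    {"R2": 0.926, "RMSE": 0.364,  "MAE": 0.265,  "RPD": 3.352},
}
s_alexnet = {
    "OC":    {"R2": 0.94, "RMSE": 17.04, "MAE": 9.02,  "RPD": 4.02},
    "N":     {"R2": 0.89, "RMSE": 1.21,  "MAE": 0.70,  "RPD": 3.02},
    "CaCO3": {"R2": 0.93, "RMSE": 34.19, "MAE": 13.52, "RPD": 3.89},
    "pH":    {"R2": 0.87, "RMSE": 0.48,  "MAE": 0.37,  "RPD": 2.16},
}

# R² and RPD improve upwards; RMSE and MAE improve downwards.
higher_is_better = {"R2": True, "RMSE": False, "MAE": False, "RPD": True}

exceptions = []
for prop, row in proposed.items():
    for metric, ours in row.items():
        theirs = s_alexnet[prop][metric]
        better = ours > theirs if higher_is_better[metric] else ours < theirs
        if not better:
            exceptions.append((prop, metric))

print(exceptions)  # → [('OC', 'RPD')]
```

The proposed model is better on 15 of the 16 entries; the RPD for OC is the single entry where the reported S-AlexNet value is higher.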

The LSTM-CNN-Attention model has demonstrated high efficacy in predicting soil properties; however, its intricate architecture significantly prolongs training times due to the intensive computational requirements of its LSTM, CNN, and Attention components. This trade-off between model performance and efficiency is a well-documented challenge when working with large datasets. To mitigate this issue, future research will explore more efficient alternatives, such as substituting LSTM with GRU, streamlining the Attention mechanism, or implementing parallel processing techniques to enhance training speed.
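
One way to see why a GRU is a lighter substitute is to count recurrent parameters: an LSTM layer has four gate-like weight blocks and a GRU has three. The sketch below uses the textbook count with one bias vector per block; the hidden size of 16 comes from Table 1, while the input size of 1 is a hypothetical choice. Frameworks such as PyTorch keep two bias vectors per block, which changes the totals slightly but not the 3:4 ratio.

```python
def recurrent_params(input_size, hidden_size, gates):
    """Parameters of one recurrent layer: gates * (h * (h + d) + h)."""
    weights = hidden_size * (hidden_size + input_size)  # input and recurrent weights
    biases = hidden_size                                # one bias vector per block
    return gates * (weights + biases)

hidden = 16      # LSTM units from Table 1
input_size = 1   # hypothetical: one absorbance value per step

lstm_params = recurrent_params(input_size, hidden, gates=4)
gru_params = recurrent_params(input_size, hidden, gates=3)

print(lstm_params, gru_params, gru_params / lstm_params)  # → 1152 864 0.75
```

A 25% reduction in recurrent parameters does not by itself guarantee a proportional reduction in training time, so the expected speed-up would need to be measured.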

5. Conclusions

In this paper, we proposed an LSTM-CNN-Attention model for predicting soil properties from hyperspectral data, comparing its performance with those of both traditional machine learning models and advanced deep learning frameworks. This model integrates the temporal learning capability of LSTM, the feature extraction power of CNN, and the feature enhancement ability of attention mechanisms, achieving superior accuracy and robustness in predicting key soil properties such as OC, N, CaCO₃, and pH(H₂O). The proposed framework outperforms existing methods, including S-AlexNet by Hosseinpour-Zarnaq et al. [25], with a higher R² for all four properties and a higher RPD for all properties except OC, demonstrating its effectiveness in modeling the complex nonlinear relationships within spectral data.

A key contribution of this study lies in the integration of the attention mechanism, which enables the model to selectively emphasize relevant spectral features while mitigating the impact of noise and redundant data. This design enhances predictive performance across multiple soil properties, with R² values consistently exceeding 0.9 and RPD values surpassing 3. Comparisons with previous studies further underscore the limitations of models that rely on dimensionality reduction techniques, such as PCA-LSTM, or that lack attention mechanisms, highlighting the superiority of the proposed approach.

While the results validate the potential of the LSTM-CNN-Attention model, challenges remain in computational efficiency and scalability. Future work will focus on optimizing the model’s architecture and exploring methods to accelerate training. The model will also be applied to field-collected soil data, where environmental factors such as weather, light intensity, and moisture introduce noise and variability. To retain predictive accuracy under these real-world conditions, advanced preprocessing techniques, domain adaptation strategies, and robust feature selection methods will be investigated. These enhancements are intended to strengthen the model’s utility for soil health monitoring, efficient resource management, and informed decision making across diverse agricultural environments.

Author Contributions

Conceptualization, Y.L.; methodology, Y.L.; software, Y.L.; validation, S.H.; formal analysis, X.Z.; investigation, L.S.; resources, L.S.; data curation, Y.X.; writing—original draft preparation, Y.L.; writing—review and editing, S.H.; visualization, Y.X.; supervision, L.S.; project administration, X.Z.; funding acquisition, S.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Foundation of Hunan Province (grant number 2023JJ30304) and the Scientific Research Program of the Hunan Province Department of Education (grant number 23A0197).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available at the European Soil Data Centre (ESDAC) at https://esdac.jrc.ec.europa.eu/resource-type/datasets, accessed on 10 December 2024.

Acknowledgments

The LUCAS 2015 topsoil dataset used in this work was made available by the European Commission through the European Soil Data Centre managed by the Joint Research Centre (JRC), http://esdac.jrc.ec.europa.eu/, accessed on 10 December 2024.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Wang, S.; Guan, K.; Zhang, C.; Lee, D.; Margenot, A.J.; Ge, Y.; Peng, J.; Zhou, W.; Zhou, Q.; Huang, Y. Using Soil Library Hyperspectral Reflectance and Machine Learning to Predict Soil Organic Carbon: Assessing Potential of Airborne and Spaceborne Optical Soil Sensing. Remote Sens. Environ. 2022, 271, 112914.
2. Denton, O.; Aduramigba-Modupe, V.; Ojo, A.; Adeoyolanu, O.; Are, K.; Adelana, A.; Oyedele, A.; Adetayo, A.; Oke, A. Assessment of Spatial Variability and Mapping of Soil Properties for Sustainable Agricultural Production Using Geographic Information System Techniques (GIS). Cogent Food Agric. 2017, 3, 1279366.
3. Zhang, X.; Huang, B. Prediction of Soil Salinity with Soil-Reflected Spectra: A Comparison of Two Regression Methods. Sci. Rep. 2019, 9, 5067.
4. Wang, Y.; Huang, T.; Liu, J.; Lin, Z.; Li, S.; Wang, R.; Ge, Y. Soil pH Value, Organic Matter and Macronutrients Contents Prediction Using Optical Diffuse Reflectance Spectroscopy. Comput. Electron. Agric. 2015, 111, 69–77.
5. Viscarra Rossel, R.A.; Behrens, T.; Ben-Dor, E.; Chabrillat, S.; Demattê, J.A.M.; Ge, Y.; Gomez, C.; Guerrero, C.; Peng, Y.; Ramirez-Lopez, L.; et al. Diffuse Reflectance Spectroscopy for Estimating Soil Properties: A Technology for the 21st Century. Eur. J. Soil Sci. 2022, 73, e13271.
6. Yang, M.; Xu, D.; Chen, S.; Li, H.; Shi, Z. Evaluation of Machine Learning Approaches to Predict Soil Organic Matter and pH Using Vis-NIR Spectra. Sensors 2019, 19, 263.
7. Chen, S.; Arrouays, D.; Leatitia Mulder, V.; Poggio, L.; Minasny, B.; Roudier, P.; Libohova, Z.; Lagacherie, P.; Shi, Z.; Hannam, J.; et al. Digital Mapping of GlobalSoilMap Soil Properties at a Broad Scale: A Review. Geoderma 2022, 409, 115567.
8. Seidel, M.; Vohland, M.; Greenberg, I.; Ludwig, B.; Ortner, M.; Thiele-Bruhn, S.; Hutengs, C. Soil Moisture Effects on Predictive VNIR and MIR Modeling of Soil Organic Carbon and Clay Content. Geoderma 2022, 427, 116103.
9. Goydaragh, M.G.; Taghizadeh-Mehrjardi, R.; Jafarzadeh, A.A.; Triantafilis, J.; Lado, M. Using Environmental Variables and Fourier Transform Infrared Spectroscopy to Predict Soil Organic Carbon. CATENA 2021, 202, 105280.
10. Zhao, X.; Zhao, D.; Wang, J.; Triantafilis, J. Soil Organic Carbon (SOC) Prediction in Australian Sugarcane Fields Using Vis–NIR Spectroscopy with Different Model Setting Approaches. Geoderma Reg. 2022, 30, e00566.
11. Ribeiro, S.G.; Teixeira, A.D.S.; De Oliveira, M.R.R.; Costa, M.C.G.; Araújo, I.C.D.S.; Moreira, L.C.J.; Lopes, F.B. Soil Organic Carbon Content Prediction Using Soil-Reflected Spectra: A Comparison of Two Regression Methods. Remote Sens. 2021, 13, 4752.
12. Dotto, A.C.; Dalmolin, R.S.D.; Ten Caten, A.; Grunwald, S. A Systematic Study on the Application of Scatter-Corrective and Spectral-Derivative Preprocessing for Multivariate Prediction of Soil Organic Carbon by Vis-NIR Spectra. Geoderma 2018, 314, 262–274.
13. Tavakoli, H.; Correa, J.; Sabetizade, M.; Vogel, S. Predicting Key Soil Properties from Vis-NIR Spectra by Applying Dual-Wavelength Indices Transformations and Stacking Machine Learning Approaches. Soil Tillage Res. 2023, 229, 105684.
14. Xie, S.; Li, Y.; Wang, X.; Liu, Z.; Ma, K.; Ding, L. Research on Estimation Models of the Spectral Characteristics of Soil Organic Matter Based on the Soil Particle Size. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2021, 260, 119963.
15. Vohland, M.; Besold, J.; Hill, J.; Fründ, H.C. Comparing Different Multivariate Calibration Methods for the Determination of Soil Organic Carbon Pools with Visible to near Infrared Spectroscopy. Geoderma 2011, 166, 198–205.
16. Knox, N.; Grunwald, S.; McDowell, M.; Bruland, G.; Myers, D.; Harris, W. Modelling Soil Carbon Fractions with Visible Near-Infrared (VNIR) and Mid-Infrared (MIR) Spectroscopy. Geoderma 2015, 239–240, 229–239.
17. Chen, Y.; Wang, Z. Quantitative Analysis Modeling of Infrared Spectroscopy Based on Ensemble Convolutional Neural Networks. Chemom. Intell. Lab. Syst. 2018, 181, 1–10.
18. Yang, J.; Wang, X.; Wang, R.; Wang, H. Combination of Convolutional Neural Networks and Recurrent Neural Networks for Predicting Soil Properties Using Vis–NIR Spectroscopy. Geoderma 2020, 380, 114616.
19. Padarian, J.; Minasny, B.; McBratney, A.B. Machine Learning and Soil Sciences: A Review Aided by Machine Learning Tools. Soil 2020, 6, 35–52.
20. De Santana, F.B.; Otani, S.K.; De Souza, A.M.; Poppi, R.J. Comparison of PLS and SVM Models for Soil Organic Matter and Particle Size Using Vis-NIR Spectral Libraries. Geoderma Reg. 2021, 27, e00436.
21. Munnaf, M.A.; Mouazen, A.M. Removal of External Influences from On-Line Vis-NIR Spectra for Predicting Soil Organic Carbon Using Machine Learning. CATENA 2022, 211, 106015.
22. Taghizadeh-Mehrjardi, R.; Mahdianpari, M.; Mohammadimanesh, F.; Behrens, T.; Toomanian, N.; Scholten, T.; Schmidt, K. Multi-Task Convolutional Neural Networks Outperformed Random Forest for Mapping Soil Particle Size Fractions in Central Iran. Geoderma 2020, 376, 114552.
23. Veres, M.; Lacey, G.; Taylor, G.W. Deep Learning Architectures for Soil Property Prediction. In Proceedings of the 2015 12th Conference on Computer and Robot Vision, Halifax, NS, Canada, 3–5 June 2015; pp. 8–15.
24. Kawamura, K.; Nishigaki, T.; Andriamananjara, A.; Rakotonindrina, H.; Tsujimoto, Y.; Moritsuka, N.; Rabenarivo, M.; Razafimbelo, T. Using a One-Dimensional Convolutional Neural Network on Visible and Near-Infrared Spectroscopy to Improve Soil Phosphorus Prediction in Madagascar. Remote Sens. 2021, 13, 1519.
25. Hosseinpour-Zarnaq, M.; Omid, M.; Sarmadian, F.; Ghasemi-Mobtaker, H. A CNN Model for Predicting Soil Properties Using VIS–NIR Spectral Data. Environ. Earth Sci. 2023, 82, 382.
26. Singh, S.; Kasana, S.S. Estimation of Soil Properties from the EU Spectral Library Using Long Short-Term Memory Networks. Geoderma Reg. 2019, 18, e00233.
27. Syed, S.N.; Lazaridis, P.I.; Khan, F.A.; Ahmed, Q.Z.; Hafeez, M.; Ivanov, A.; Poulkov, V.; Zaharis, Z.D. Deep Neural Networks for Spectrum Sensing: A Review. IEEE Access 2023, 11, 89591–89615.
28. Kumar, A.; Gaur, N.; Chakravarty, S.; Alsharif, M.H.; Uthansakul, P.; Uthansakul, M. Analysis of Spectrum Sensing Using Deep Learning Algorithms: CNNs and RNNs. Ain Shams Eng. J. 2024, 15, 102505.
29. Singh, S.; Kasana, S.S. Quantitative Estimation of Soil Properties Using Hybrid Features and RNN Variants. Chemosphere 2022, 287, 131889.
30. Miao, T.; Ji, W.; Li, B.; Zhu, X.; Yin, J.; Yang, J.; Huang, Y.; Cao, Y.; Yao, D.; Kong, X. Advanced Soil Organic Matter Prediction with a Regional Soil NIR Spectral Library Using Long Short-Term Memory–Convolutional Neural Networks: A Case Study. Remote Sens. 2024, 16, 1256.
31. Zhao, W.; Wu, Z.; Yin, Z.; Li, D. Attention-Based CNN Ensemble for Soil Organic Carbon Content Estimation with Spectral Data. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
32. Zhang, J.; Wei, F.; Feng, F.; Wang, C. Spatial–Spectral Feature Refinement for Hyperspectral Image Classification Based on Attention-Dense 3D-2D-CNN. Sensors 2020, 20, 5191.
33. European Commission, Joint Research Centre; Jones, A.; Fernández-Ugalde, O.; Scarpa, S. LUCAS 2015 Topsoil Survey: Presentation of Dataset and Results; Publications Office of the European Union: Luxembourg, 2020.
34. Institute for Environment and Sustainability (Joint Research Centre); Jones, A.; Montanarella, L.; Tóth, G. LUCAS Topsoil Survey: Methodology, Data and Results; Publications Office of the European Union: Luxembourg, 2013.
35. Minasny, B.; McBratney, A.B. A Conditioned Latin Hypercube Method for Sampling in the Presence of Ancillary Information. Comput. Geosci. 2006, 32, 1378–1388.
36. Stevens, A.; Nocita, M.; Tóth, G.; Montanarella, L.; Van Wesemael, B. Prediction of Soil Organic Carbon at the European Scale by Visible and Near InfraRed Reflectance Spectroscopy. PLoS ONE 2013, 8, e66409.
37. Vašát, R.; Kodešová, R.; Klement, A.; Borůvka, L. Simple but Efficient Signal Pre-Processing in Soil Organic Carbon Spectroscopic Estimation. Geoderma 2017, 298, 46–53.
38. Wang, Y.; Yang, S.; Yan, X.; Yang, C.; Feng, M.; Xiao, L.; Song, X.; Zhang, M.; Shafiq, F.; Sun, H.; et al. Evaluation of Data Pre-Processing and Regression Models for Precise Estimation of Soil Organic Carbon Using Vis–NIR Spectroscopy. J. Soils Sediments 2023, 23, 634–645.
39. Barnes, R.J.; Dhanoa, M.S.; Lister, S.J. Standard Normal Variate Transformation and De-Trending of Near-Infrared Diffuse Reflectance Spectra. Appl. Spectrosc. 1989, 43, 772–777.
40. Van Houdt, G.; Mosquera, C.; Nápoles, G. A Review on the Long Short-Term Memory Model. Artif. Intell. Rev. 2020, 53, 5929–5955.
41. Gers, F.A.; Schmidhuber, J.; Cummins, F. Learning to Forget: Continual Prediction with LSTM. Neural Comput. 2000, 12, 2451–2471.
42. Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.; Farhan, L. Review of Deep Learning: Concepts, CNN Architectures, Challenges, Applications, Future Directions. J. Big Data 2021, 8, 53.
43. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2015, arXiv:1409.1556.
44. Yamashita, R.; Nishio, M.; Do, R.K.G.; Togashi, K. Convolutional Neural Networks: An Overview and Application in Radiology. Insights Imaging 2018, 9, 611–629.
45. Abdar, M.; Pourpanah, F.; Hussain, S.; Rezazadegan, D.; Liu, L.; Ghavamzadeh, M.; Fieguth, P.; Cao, X.; Khosravi, A.; Acharya, U.R.; et al. A Review of Uncertainty Quantification in Deep Learning: Techniques, Applications and Challenges. Inf. Fusion 2021, 76, 243–297.
46. Tang, J.; Li, Y.; Ding, M.; Liu, H.; Yang, D.; Wu, X. An Ionospheric TEC Forecasting Model Based on a CNN-LSTM-Attention Mechanism Neural Network. Remote Sens. 2022, 14, 2433.
47. Cao, L.; Sun, M.; Yang, Z.; Jiang, D.; Yin, D.; Duan, Y. A Novel Transformer-CNN Approach for Predicting Soil Properties from LUCAS Vis-NIR Spectral Data. Agronomy 2024, 14, 1998.
48. Feng, G.; Li, Z.; Zhang, J.; Wang, M. Multi-Scale Spatial Attention-Based Multi-Channel 2D Convolutional Network for Soil Property Prediction. Sensors 2024, 24, 4728.

Figure 2. Initial absorbance spectra and preprocessed spectral curves for mineral soil samples from the LUCAS 2015 topsoil database: (a) shows the original spectra, and (b) displays the preprocessed spectra. Both figures present the 5th, 16th, 50th, 84th, and 95th percentiles to illustrate the variability within the dataset.

Figure 3. Diagram of the LSTM model structure featuring the forget gate, input gate, and output gate.

Figure 4. Self-attention mechanism.

Figure 5. The framework of the proposed LSTM-CNN-Attention model.

Figure 6. The flowchart of soil property prediction with the LSTM-CNN-Attention method.

Figure 7. KDE plots of soil properties for the total dataset, training set, and test set: (a) OC, (b) N, (c) CaCO₃, and (d) pH(H₂O).

Figure 8. KDE plots of PCA-transformed spectral data for the total dataset, training set, and test set: (a) PC1 and (b) PC2.

Figure 9. Actual vs. predicted values of the proposed framework: (a) OC, (b) N, (c) CaCO₃, and (d) pH(H₂O).

Figure 10. Residual comparison: (a) OC, (b) N, (c) CaCO₃, and (d) pH(H₂O).

Figure 11. Line charts of (a) R² and (b) RPD for the proposed and other models.

Table 1. Model framework parameters.

Parameters	Setting
Number of LSTM layers	2
LSTM units/hidden neurons	16
Cov1D * filters	64/128/256/384/512
Cov1D kernels	3
Cov1D strides	2
Dropout	0.02
Activation	ReLU
Dense layer parameters	128/1
Epoch	100
Learning rate	0.0005
Optimizer	Adam
Batch size	128
Loss function	Mean absolute error

* Cov1D: 1D convolution.
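
The settings in Table 1 can be collected in a small configuration object, which also makes it easy to trace how a stride of 2 shortens the sequence through the convolutional stack. The sketch below is illustrative only: the class and function names are hypothetical, the five filter counts are read as five successive Cov1D layers, and the input length of 4000 and the absence of padding are assumptions rather than details given in the table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelConfig:
    """Hyperparameters transcribed from Table 1."""
    lstm_layers: int = 2
    lstm_hidden: int = 16
    conv_filters: tuple = (64, 128, 256, 384, 512)
    conv_kernel: int = 3
    conv_stride: int = 2
    dropout: float = 0.02
    dense_units: tuple = (128, 1)
    epochs: int = 100
    learning_rate: float = 0.0005
    batch_size: int = 128

def conv1d_out_len(length, kernel, stride, padding=0):
    """Output length of a 1D convolution (dilation of 1)."""
    return (length + 2 * padding - kernel) // stride + 1

cfg = ModelConfig()
length = 4000  # hypothetical input length; padding of 0 is also assumed
lengths = []
for _ in cfg.conv_filters:
    length = conv1d_out_len(length, cfg.conv_kernel, cfg.conv_stride)
    lengths.append(length)

print(lengths)  # → [1999, 999, 499, 249, 124]
```

Under these assumptions each layer roughly halves the sequence, so five layers reduce it by a factor of about 32.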

Table 2. Classification tiers of the performance-to-deviation ratio (RPD).

RPD	Meaning	Level
RPD > 3	Excellent Model	A
2.5 ≤ RPD < 3.0	Good Model	B
2.0 ≤ RPD ≤ 2.5	Approximate Model	C
RPD < 2	Unsatisfactory Model	D
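
The tiers in Table 2 translate into a simple lookup. In the sketch below the function name is hypothetical, and two boundary cases that the table leaves ambiguous are resolved by assumption: an RPD of exactly 3.0 is assigned to level A and an RPD of exactly 2.5 to level B.

```python
def rpd_level(rpd):
    """Map an RPD value to the A-D levels of Table 2."""
    if rpd >= 3.0:   # table lists RPD > 3; exactly 3.0 is placed in A by assumption
        return "A"
    if rpd >= 2.5:   # 2.5 appears in both B and C; B is chosen by assumption
        return "B"
    if rpd >= 2.0:
        return "C"
    return "D"

# RPD values taken from Table 5 as examples.
examples = {"CaCO3, full model": 5.377, "CaCO3, CNN": 2.997,
            "N, CNN": 2.415, "pH, LSTM": 1.181}
for label, value in examples.items():
    print(f"{label}: RPD={value} -> level {rpd_level(value)}")
```

Applied to Table 5, the full model falls in level A for all four properties, since its lowest RPD is 3.352.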

Table 3. Summary statistics of soil properties for LUCAS.

Properties	Set	N	Min	Max	Q25	Median	Q75	Mean	Std	Skewness	CV * (%)
OC (g·kg⁻¹)	Total	21,782	0.10	560.20	12.50	20.40	38.60	43.24	76.62	4.36	177.18
	Training	18,516	0.10	560.20	12.50	20.30	38.30	42.81	75.99	4.41	177.51
	Test	3266	0.20	555.50	12.60	21.20	39.80	45.71	80.08	4.11	175.19
N (g·kg⁻¹)	Total	21,782	0.00	38.50	1.30	2.00	3.30	3.10	3.67	3.91	118.32
	Training	18,516	0.00	37.60	1.30	2.00	3.30	3.08	3.64	3.91	118.09
	Test	3266	0.00	38.50	1.30	2.00	3.40	3.23	3.85	3.87	119.32
CaCO₃ (g·kg⁻¹)	Total	21,782	0.00	976.00	0.00	1.00	19.00	57.39	135.46	2.91	236.05
	Training	18,516	0.00	976.00	0.00	1.00	20.00	57.83	135.76	3.87	234.73
	Test	3266	0.00	962.00	0.00	0.00	16.00	54.85	133.79	3.11	243.90
pH(H₂O)	Total	21,782	3.17	10.37	4.92	6.07	7.45	6.13	1.35	0.01	21.95
	Training	18,516	3.17	10.37	4.93	6.08	7.45	6.14	1.35	0.00	21.92
	Test	3266	3.51	9.07	4.87	6.03	7.46	6.10	1.35	0.03	22.14

* CV: coefficient of variation.

Table 4. Performance metrics of different models for soil property prediction with and without the 400–499.5 nm spectral band.

Properties	Model	400–499.5 nm band retained				400–499.5 nm band removed
Properties	Model	R²	RMSE	MAE	RPD	R²	RMSE	MAE	RPD
OC	LSTM-CNN-Attention	0.930	19.592	9.34	3.748	0.949	16.687	8.599	3.940
	CNN	0.897	22.758	12.517	3.262	0.913	21.970	11.020	3.379
	PLSR	0.868	26.798	14.843	2.771	0.879	25.816	13.941	2.876
	RF	0.904	22.481	11.059	3.303	0.918	21.281	10.752	3.489
N	LSTM-CNN-Attention	0.905	1.005	0.624	3.692	0.916	0.993	0.580	3.737
	CNN	0.825	1.437	0.928	2.385	0.829	1.419	0.777	2.415
	PLSR	0.812	1.705	1.027	2.010	0.828	1.502	0.946	2.282
	RF	0.848	1.331	0.779	2.574	0.849	1.331	0.777	2.574
CaCO₃	LSTM-CNN-Attention	0.934	34.641	13.123	5.180	0.943	33.370	12.990	5.377
	CNN	0.886	46.981	18.842	2.984	0.889	46.770	18.420	2.997
	PLSR	0.832	48.820	24.953	2.871	0.841	48.603	24.409	2.884
	RF	0.910	40.123	18.164	3.494	0.919	39.850	18.092	3.518
pH(H₂O)	LSTM-CNN-Attention	0.923	0.370	0.266	2.702	0.926	0.364	0.265	3.352
	CNN	0.876	0.495	0.369	2.385	0.888	0.448	0.343	2.985
	PLSR	0.806	0.628	0.419	2.129	0.818	0.571	0.384	2.342
	RF	0.836	0.560	0.381	2.388	0.838	0.558	0.379	2.397

Table 5. Ablation experiment results for soil property forecasting.

Properties	Metrics	LSTM	LSTM-Attention	CNN	LSTM-CNN	LSTM-CNN-Attention
OC	R²	0.769	0.863	0.913	0.926	0.949
	RMSE	35.670	27.490	21.970	20.160	16.687
	MAE	17.950	14.850	11.020	10.820	8.599
	RPD	2.081	2.701	3.379	3.683	3.940
N	R²	0.723	0.769	0.829	0.883	0.916
	RMSE	1.805	1.648	1.419	1.174	0.993
	MAE	1.087	1.014	0.777	0.704	0.580
	RPD	1.899	2.079	2.415	2.919	3.737
CaCO₃	R²	0.611	0.811	0.889	0.915	0.943
	RMSE	87.400	60.910	46.770	40.780	33.370
	MAE	35.830	24.610	18.420	15.920	12.909
	RPD	1.604	2.301	2.997	3.437	5.377
pH(H₂O)	R²	0.284	0.409	0.888	0.906	0.926
	RMSE	1.132	1.028	0.448	0.410	0.364
	MAE	0.922	0.786	0.343	0.308	0.265
	RPD	1.181	1.301	2.985	3.262	3.352

Table 6. Performance comparison of the proposed model with other state-of-the-art models.

Properties	Metrics	Proposed	PCA-LSTM	CNN-LSTM	CNN-GRU	PLSR	SVR	RF
OC	R²	0.949	0.890	0.923	0.936	0.879	0.897	0.918
	RMSE	16.687	24.619	20.535	18.779	25.816	23.795	21.281
	MAE	8.599	13.605	10.820	9.597	13.941	14.209	10.752
	RPD	3.940	2.670	3.202	3.501	2.876	3.120	3.489
N	R²	0.916	0.813	0.878	0.904	0.828	0.849	0.849
	RMSE	0.993	1.483	1.199	1.062	1.502	1.332	1.331
	MAE	0.580	0.915	0.716	0.642	0.946	0.880	0.777
	RPD	3.737	2.503	3.095	3.496	2.282	2.572	2.574
CaCO₃	R²	0.943	0.895	0.919	0.935	0.841	0.876	0.919
	RMSE	33.370	45.146	39.973	35.845	48.603	46.497	39.850
	MAE	12.909	20.076	16.627	14.468	24.409	21.430	18.092
	RPD	5.377	3.105	4.489	5.006	2.884	3.015	3.518
pH(H₂O)	R²	0.926	0.728	0.869	0.894	0.818	0.821	0.838
	RMSE	0.364	0.698	0.484	0.436	0.571	0.564	0.558
	MAE	0.265	0.510	0.371	0.315	0.384	0.382	0.379
	RPD	3.352	1.747	2.519	2.798	2.342	2.371	2.397

Table 7. Comparison of the predictive performance between the proposed LSTM-CNN-Attention model and the S-AlexNet model from Hosseinpour-Zarnaq et al. [25].

Properties	LSTM-CNN-Attention				Hosseinpour-Zarnaq et al. (2023) [25]
Properties	R²	RMSE	MAE	RPD	R²	RMSE	MAE	RPD
OC	0.949	16.687	8.599	3.940	0.94	17.04	9.02	4.02
N	0.916	0.993	0.580	3.737	0.89	1.21	0.70	3.02
CaCO₃	0.943	33.370	12.909	5.377	0.93	34.19	13.52	3.89
pH (H₂O)	0.926	0.364	0.265	3.352	0.87	0.48	0.37	2.16

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
