A Methodological Study on Improving the Accuracy of Soil Organic Matter Mapping in Mountainous Areas Based on Geo-Positional Transformer-CNN: A Case Study of Longshan County, Hunan Province, China

Shen, Luming; Xie, Yangfan; Deng, Yangjun; Feng, Yujie; Zhou, Qing; Xie, Hongxia

doi:10.3390/app15148060



by Luming Shen ¹, Yangfan Xie ¹, Yangjun Deng ¹, Yujie Feng ¹, Qing Zhou ² and Hongxia Xie ²,*

¹ College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China

² College of Resources, Hunan Agricultural University, Changsha 410128, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(14), 8060; https://doi.org/10.3390/app15148060

Submission received: 11 June 2025 / Revised: 13 July 2025 / Accepted: 16 July 2025 / Published: 20 July 2025

(This article belongs to the Section Agricultural Science and Technology)


Abstract

The accurate prediction of soil organic matter (SOM) content is essential for promoting sustainable soil management and addressing global climate change. SOM spatial prediction remains challenging, especially in mountainous areas, because SOM is shaped by multiple factors such as topography and climate. The main novelty of this study lies in proposing a geographic positional encoding mechanism that embeds geographic location information into the feature representation of a Transformer model. The encoder structure is further modified to enhance spatial awareness, resulting in the development of the Geo-Positional Transformer (GPTransformer). Furthermore, this model is integrated with a 1D-CNN to form a dual-branch neural network called the Geo-Positional Transformer-CNN (GPTransCNN). This study collected 1490 topsoil samples (0–20 cm) from cultivated land in Longshan County to develop a predictive model for mapping the spatial distribution of SOM across the entire cultivated area. Different models were comprehensively evaluated through ten-fold cross-validation, ablation experiments, and uncertainty analysis. The results show that GPTransCNN has the best performance, with an R² improvement of approximately 43% over the Transformer, 19% over the GPTransformer, and 15% over the 1D-CNN. This study demonstrates that by incorporating geographic positional information, GPTransCNN effectively combines the global modeling capabilities of the GPTransformer with the local feature extraction strengths of the 1D-CNN, which can improve the accuracy of SOM mapping in mountainous areas. This approach provides data support for sustainable soil management and decision-making in response to global climate change.

Keywords: digital soil mapping; soil organic matter; cultivated land; deep learning; environmental modeling; uncertainty

1. Introduction

Soil organic matter (SOM) is a key indicator for evaluating soil quality and fertility levels [1]. Soil quality is not only a primary determinant of crop productivity but also a crucial factor influencing farm resilience and the environmental quality of agricultural ecosystems [2,3]. At the global scale, SOM plays a central role in the carbon cycle, storing approximately three times as much carbon as the atmosphere or terrestrial vegetation. It is critical in influencing soil fertility, water quality, erosion resistance, and climate change feedback mechanisms [4]. By enhancing soil carbon sequestration, increasing SOM content can improve the soil carbon sink capacity, contributing to the mitigation of greenhouse gas emissions and alleviating the adverse impacts of climate change [5,6]. Exploring the spatial distribution of SOM and achieving accurate SOM prediction are crucial for promoting sustainable land management, tackling climate change, and improving agricultural productivity [7].

Traditional soil mapping relies on soil experts conducting field surveys to mentally construct a soil-landscape model, followed by manual delineation based on topographic maps, aerial photographs, or satellite imagery [8]. This approach is not only time-consuming and labor-intensive but also limited in accuracy. With the advancement of geographic information technologies and statistical modeling, digital soil mapping (DSM) has emerged as a more efficient, data-driven mapping method [9,10]. The introduction of the scorpan-SSPFe framework (soil spatial prediction function with spatially autocorrelated errors) has provided a theoretical foundation for the development of DSM and laid the groundwork for the spatial modeling of soil properties [11]. In recent years, the rapid development of remote sensing technologies has enabled DSM to integrate multi-source environmental information (e.g., topography, vegetation, hydrology), thereby improving the spatial accuracy of soil maps [12,13,14,15]. Particularly for SOM prediction, recent studies have employed machine learning methods such as Random Forest [16] and XGBoost [17] in combination with multi-source remote sensing data to enhance prediction accuracy.

As a prominent branch of machine learning, deep learning offers superior capabilities in nonlinear modeling and spatial feature extraction. It has demonstrated improved predictive performance over traditional approaches in the estimation of SOM [18,19,20]. The spatial variation in soil properties exhibits scale-dependent patterns [21], with SOM distribution being driven by multi-scale spatial factors [22]. These include large-scale geological and climatic conditions as well as small-scale topographic features and biological processes [23]. Such a multi-scale spatial structure necessitates models that can effectively capture both global and local features. The Transformer model is effective in capturing long-range dependencies and is suitable for extracting global spatial features, which helps reveal the relationship between large-scale environmental factors and the distribution of SOM [24,25,26]. Convolutional neural networks (CNNs) are better at capturing local spatial features and can effectively detect fine-scale variations [27,28]. According to Tobler’s first law of geography, everything in space is related, and things that are closer in space are more strongly related. The positional encoding used in conventional Transformer models encodes the order of elements in a sequence rather than actual geographic location [29], which limits the capacity and generalization of these models in geospatial prediction tasks. Incorporating geographic location information into the model can make better use of spatial context and improve prediction accuracy [30].

To address the above limitations, this study proposes a geographic positional encoding mechanism that embeds geographic location information into the feature representation of the Transformer model and modifies its encoder structure to build an enhanced model with spatial awareness named Geo-Positional Transformer (GPTransformer). Based on this, a dual-branch neural network model called Geo-Positional Transformer-CNN (GPTransCNN) is further developed by integrating the improved geographic positional encoding with a multi-scale feature extraction mechanism. This model combines the strength of the GPTransformer branch in capturing large-scale global features with the advantage of the one-dimensional convolutional neural network (1D-CNN) branch in extracting fine-grained local features. It is designed to improve the prediction accuracy of SOM spatial distribution and provide stronger technical support for sustainable land management and responses to global climate change.

2. Materials and Methods

2.1. Study Area

The study area was the cultivated land in Longshan County. Longshan County (longitude 109°13′ to 109°48′ and latitude 28°46′ to 29°38′) is located in the northwest of Hunan Province, China. It borders Sangzhi County and Yongshun County to the east, Laifeng County of Hubei Province and Youyang County of Chongqing Municipality to the west, Baojing County to the south across the Youshui River, and Xuanen County of Hubei Province to the north. Its total area is 3131 square kilometers (Figure 1). The terrain is high in the north and low in the south, steep in the east and gentle in the west. Elevation ranges from 218 m to 1725 m, and the landscape is mainly mountainous. The region has a subtropical continental humid monsoon climate, with an average annual temperature of approximately 15.8 °C, a maximum of 39.5 °C, and a minimum of −6 °C. The annual precipitation is about 1400 mm, the frost-free period lasts 270 to 280 days, and the annual sunshine duration ranges from 924 to 1246 h. Influenced by the topography, the region exhibits distinct climatic gradients and pronounced microclimatic features. (This information was obtained from the official website of the local government: https://www.xxls.gov.cn, accessed on 30 October 2022.)

2.2. Soil Sampling

The cultivated soil samples used in this study were obtained from the cultivated land quality assessment project in Longshan County. Outliers were identified and removed using the three-standard-deviation method, which excludes data points whose values fall outside the range of the mean plus or minus three times the standard deviation. This process resulted in a final dataset of 1490 surface soil samples (0–20 cm) containing geographic location information and SOC content. The measurement of SOC was conducted using the potassium dichromate (K₂Cr₂O₇) oxidation method [31], a classical chemical oxidation technique widely applied in SOC analysis. Based on the commonly used conversion factor (SOM ≈ SOC × 1.724), the measured SOC values were converted to SOM content, which served as the response variable for model training and validation.
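The preprocessing described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names and the sample values are hypothetical, and the use of the sample (rather than population) standard deviation is an assumption, since the text does not say which was used.

```python
from statistics import mean, stdev

SOC_TO_SOM = 1.724  # conversion factor quoted in the text


def remove_outliers_3sigma(values):
    """Keep values within mean +/- 3 standard deviations.

    Assumption: the sample standard deviation is used.
    """
    mu = mean(values)
    sigma = stdev(values)
    lower, upper = mu - 3 * sigma, mu + 3 * sigma
    return [v for v in values if lower <= v <= upper]


def soc_to_som(soc_values):
    """Convert SOC to SOM with the fixed factor SOM ~= SOC * 1.724."""
    return [soc * SOC_TO_SOM for soc in soc_values]


# Hypothetical SOC values (g/kg) with one obvious outlier, for illustration only.
soc = [10.0 + 0.5 * (i % 9) for i in range(30)] + [95.0]
cleaned = remove_outliers_3sigma(soc)
som = soc_to_som(cleaned)
```

Because the conversion is a constant multiplication, applying the three-standard-deviation rule to SOC or to the converted SOM values selects the same samples.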

2.3. Environmental Covariates

The spatial distribution of SOC is influenced by multiple environmental factors, among which topography and climate are the key drivers shaping its spatial patterns [32]. Topographic heterogeneity leads to the spatial redistribution of water and heat, thereby altering local conditions such as moisture, temperature, and solar radiation. These changes contribute to the formation of microclimatic systems with spatial variation and affect the type and distribution of vegetation [33]. As a major source of SOC accumulation, vegetation plays a critical role in shaping the spatial pattern of SOC. In turn, the accumulation of SOC further influences the content of SOM [34]. Climate change also plays a critical role in regulating the distribution of SOM. Increased precipitation raises soil moisture content, alleviates drought stress, and enhances plant root uptake of water and nutrients, thereby promoting vegetation growth [35]. As plant biomass increases, the input of root exudates and aboveground litter also rises, which in turn enhances the soil’s capacity to accumulate SOC [36]. Meanwhile, rising temperatures stimulate microbial metabolic activity and enzymatic expression, accelerating the decomposition rate of SOM [37]. When warming and increased precipitation occur simultaneously, higher soil moisture may partially mitigate drought stress induced by elevated temperatures and support continued plant growth. However, temperature-induced microbial decomposition processes may dominate SOM turnover, intensifying SOM loss and weakening the soil carbon sink function [38].

Topography and climate jointly regulate the dynamic balance among water and heat conditions, organic matter input from vegetation, and microbial decomposition, playing a key role in shaping the spatial heterogeneity of SOM [39]. Based on this mechanistic understanding, this study incorporates topographic and bioclimatic factors as environmental covariates to comprehensively characterize the influence of terrain and climate on SOM distribution and to enhance the model’s predictive performance and spatial generalization in complex landscapes.

Topographic factors were derived from the Shuttle Radar Topography Mission 1 (SRTM1) digital elevation model (DEM) provided by the United States Geological Survey, with a spatial resolution of 30 m (data source: https://earthexplorer.usgs.gov). Using the Terrain Analysis module in SAGA GIS with default parameter settings [40], 16 terrain-derived variables were calculated using the DEM. Together with the original elevation, a total of 17 topographic covariates were constructed. Detailed information is provided in Table 1.

Bioclimatic factors were selected from the WorldClim global climate database [41], from which 19 variables, representing long-term average climate conditions for the period 1970 to 2000, were extracted. The original spatial resolution was 1000 m. To ensure spatial consistency with the topographic factors, all climate variables were resampled to a resolution of 30 m using bilinear interpolation. Detailed information is provided in Table 2.
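To make the resampling step concrete, the following sketch shows bilinear interpolation at a single fractional grid position. It is an illustration under stated assumptions: the function name and the 2 × 2 grid are hypothetical, coordinates are expressed in units of coarse cells, and a real workflow would rely on a GIS or raster library rather than hand-written code.

```python
def bilinear_sample(grid, row, col):
    """Bilinearly interpolate `grid` (list of rows, at least 2 x 2)
    at a fractional (row, col) position given in coarse-cell units."""
    n_rows, n_cols = len(grid), len(grid[0])
    # Clamp the query position to the extent of the grid.
    row = min(max(row, 0.0), n_rows - 1.0)
    col = min(max(col, 0.0), n_cols - 1.0)
    # Upper-left corner of the 2 x 2 neighbourhood used for interpolation.
    r0 = min(int(row), n_rows - 2)
    c0 = min(int(col), n_cols - 2)
    fr, fc = row - r0, col - c0
    top = grid[r0][c0] * (1 - fc) + grid[r0][c0 + 1] * fc
    bottom = grid[r0 + 1][c0] * (1 - fc) + grid[r0 + 1][c0 + 1] * fc
    return top * (1 - fr) + bottom * fr


# Hypothetical coarse climate grid (e.g., four neighbouring 1000 m cells).
coarse = [[10.0, 20.0],
          [30.0, 40.0]]
centre_value = bilinear_sample(coarse, 0.5, 0.5)
```

Each 30 m output cell would be filled by evaluating such a function at the position of its centre within the 1000 m grid.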

2.4. Dataset Construction

In this study, the 1490 soil samples were divided into a training set and a test set at a ratio of 8:2, resulting in 1192 samples for training and 298 samples for testing. To ensure the fairness of the experiments, samples in both the training and test sets were evenly distributed across the study area, thereby avoiding training bias caused by spatial clustering. The spatial distributions of the training and test samples are shown in Figure 2.
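The sample counts follow directly from the 8:2 ratio, as the sketch below shows. It uses a plain seeded random shuffle as a stand-in: the text states that both sets were evenly distributed across the study area but does not describe how that was achieved, so the shuffling strategy, the function name, and the seed are assumptions.

```python
import random


def train_test_split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = train_test_split_indices(1490)
print(len(train_idx), len(test_idx))
```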

Based on the geographic coordinates of each soil sample, corresponding values were extracted from topographic and bioclimatic environmental covariate data stored in raster format. These values were combined with the measured SOM values to construct the dataset used in this study. After model training, the spatial prediction of SOM was performed across the entire cultivated land area based on the raster data of environmental covariates within the study region.

2.5. Geo-Positional Transformer-CNN

To address the limitations of traditional models in capturing the spatial heterogeneity of SOM and modeling geographic positional information, this study proposes a dual-branch SOM prediction model named Geo-Positional Transformer-CNN (GPTransCNN). The model consists of two branches: the Geo-Positional Transformer (GPTransformer) and a one-dimensional convolutional neural network (1D-CNN). GPTransCNN is a supervised regression model designed to predict continuous SOM values based on geographic positional information and environmental covariates.

2.5.1. GPTransformer Branch

As shown in Figure 3, the GPTransformer branch is built upon a modified Transformer encoder architecture with the integration of a geographic positional encoding mechanism. At large spatial scales, SOM is typically influenced by a combination of macro-environmental factors such as climate and topography, resulting in pronounced spatial heterogeneity [42]. By embedding geographic positional information into the input features, this branch enhances the model’s ability to perceive global spatial heterogeneity, thereby enabling more effective capture of large-scale SOM distribution patterns and the spatial correlations among environmental covariates.

The GPTransformer branch consists of one Input Embedding module, one GeoPositional Encoding module, five Transformer encoder modules, one LayerNorm layer, and one fully connected layer.

The Input Embedding module projects the original input environmental covariates, including topographic and bioclimatic factors, into a unified high-dimensional feature space, ensuring consistent representation across features of different sources and scales. This enhances the model’s ability to capture complex nonlinear relationships among the input variables.

The GeoPositional Encoding module replaces the traditional positional encoding mechanism by incorporating geographic location information into the feature variables. Specifically, the latitude values $θ \in R^{B \times L}$ (where B denotes the batch size and L is the sequence length) are first expanded into a three-dimensional tensor $θ^{'} \in R^{B \times L \times 1}$. This tensor is combined with a fixed frequency term $P \in R^{M / 2}$ (where M is the feature mapping dimension of the input X) via the Kronecker product, resulting in $\tilde{θ} \in R^{B \times L \times M / 2}$. The latitude feature $\tilde{θ}$ is then transformed using sine and cosine functions to generate two sets of encoding components. These components are concatenated to produce the final latitude encoding $E_{lat} \in R^{B \times L \times M}$, as defined by the following equations:

\tilde{θ} = θ^{'} \otimes P

(1)

{\tilde{θ}}_{\sin} = \sin (\tilde{θ})

(2)

{\tilde{θ}}_{\cos} = \cos (\tilde{θ})

(3)

E_{lat} = Concat ({\tilde{θ}}_{\sin}, {\tilde{θ}}_{\cos})

(4)

E = X \oplus E_{lat} \oplus E_{lon}

(5)

where ⊗ denotes the Kronecker product, ⊕ denotes element-wise addition, and P is a learnable frequency modulation vector (initialized to all ones in this study to reduce model complexity). Longitude encoding is performed in the same manner, yielding the longitude encoding $E_{lon}$. Subsequently, $E_{lat}$ and $E_{lon}$ are added to the feature vector X to obtain the final feature representation E embedded with geographic positional information, which is then used as the input for the subsequent model.
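Equations (1) to (5) can be traced with a small pure-Python sketch. It is illustrative only: the function names are hypothetical, nested lists stand in for tensors, and the conversion of coordinates from degrees to radians is an assumption, since the text does not state the unit in which latitude and longitude enter the sine and cosine functions.

```python
import math


def geo_positional_encoding(coords, P):
    """Encode coordinates of shape [B][L] into shape [B][L][M], M = 2 * len(P).

    For each coordinate theta: scale by every entry of P (Eq. 1), apply
    sine and cosine (Eqs. 2-3), then concatenate the two halves (Eq. 4).
    """
    encoded = []
    for sequence in coords:
        seq_enc = []
        for theta in sequence:
            scaled = [theta * p for p in P]
            seq_enc.append([math.sin(s) for s in scaled] +
                           [math.cos(s) for s in scaled])
        encoded.append(seq_enc)
    return encoded


def add_encodings(X, E_lat, E_lon):
    """Element-wise sum X + E_lat + E_lon over [B][L][M] nested lists (Eq. 5)."""
    return [[[x + a + b for x, a, b in zip(xv, av, bv)]
             for xv, av, bv in zip(xs, lat_s, lon_s)]
            for xs, lat_s, lon_s in zip(X, E_lat, E_lon)]


# Hypothetical toy input: B = 2 samples, L = 3 positions, M = 4 features.
M = 4
P = [1.0] * (M // 2)  # all-ones initialization, as described in the text
lat = [[math.radians(29.0)] * 3, [math.radians(29.3)] * 3]    # assumed radians
lon = [[math.radians(109.4)] * 3, [math.radians(109.6)] * 3]
X = [[[0.0] * M for _ in range(3)] for _ in range(2)]

E_lat = geo_positional_encoding(lat, P)
E_lon = geo_positional_encoding(lon, P)
E = add_encodings(X, E_lat, E_lon)
```

With P set to all ones, every sine component of a position is identical, as is every cosine component; the frequencies differentiate only once P is updated during training.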

The Transformer encoder module adopts the encoder architecture of the Transformer to extract global features from the input data [43]. Based on the standard architecture, appropriate modifications were made: the LayerNorm layers were moved before the Multi-Head Attention and Feed Forward layers, and the LayerNorm layer before the Multi-Head Attention in the first encoder block was removed. Previous studies have shown that removing the LayerNorm layer from the first encoder block can improve the convergence speed and representation capability of the model when the input consists of non-sequential features [44].
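One way to read this modification is as the following pair of update rules. This is a sketch, assuming that the residual connections of the standard encoder are retained; the symbols $x_{\ell}$ and $x'_{\ell}$ (input and intermediate state of block $\ell$), MHA (Multi-Head Attention), and FFN (Feed Forward) are notation introduced here for illustration.

```latex
% Encoder blocks \ell = 2, \dots, 5: LayerNorm precedes both sub-layers
x'_{\ell} = x_{\ell} + \mathrm{MHA}(\mathrm{LayerNorm}(x_{\ell})), \qquad
x_{\ell + 1} = x'_{\ell} + \mathrm{FFN}(\mathrm{LayerNorm}(x'_{\ell}))

% First encoder block (\ell = 1, with x_{1} = E): no LayerNorm before attention
x'_{1} = x_{1} + \mathrm{MHA}(x_{1}), \qquad
x_{2} = x'_{1} + \mathrm{FFN}(\mathrm{LayerNorm}(x'_{1}))
```

Under this reading, the output of the last block is not normalized inside the block itself, which is consistent with the separate LayerNorm layer listed after the five encoder modules.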

The Multi-Head Attention layer is the most essential component of the Transformer encoder module, and its structure is illustrated in Figure 4. First, the queries (Q), keys (K), and values (V) are linearly projected into spaces of dimensions $d_{k}$, $d_{k}$, and $d_{v}$, respectively, where $d_{k}$ denotes the dimensionality of the query and key vectors, and $d_{v}$ denotes the dimensionality of the value vectors. These projections are then repeated h times to generate h distinct projected versions (h represents the number of heads in the Multi-Head Attention mechanism). Each projection performs a Scaled Dot-Product Attention operation in parallel, yielding an output of dimension $d_{v}$. The outputs from all attention heads are then concatenated and passed through a linear transformation matrix $W^{O}$ to obtain the final output representation. The computation is defined as follows:

{head}_{i} = Attention (Q W_{i}^{Q}, K W_{i}^{K}, V W_{i}^{V}) (i = 1, 2, \dots, h)

(6)

MultiHead (Q, K, V) = Concat ({head}_{1}, \dots, {head}_{h}) W^{O}

(7)

The dimensions of the projection matrices are as follows: $W_{i}^{Q} \in R^{d_{model} \times d_{k}}$, $W_{i}^{K} \in R^{d_{model} \times d_{k}}$, $W_{i}^{V} \in R^{d_{model} \times d_{v}}$, and $W^{O} \in R^{h d_{v} \times d_{model}}$.

The input of Scaled Dot-Product Attention consists of queries and keys of dimension $d_{k}$, and values of dimension $d_{v}$ ($d_{k} = d_{v} = d_{model} / h = 32 / 16 = 2$). First, the dot product between the query vector and all key vectors is computed and then scaled by dividing by $\sqrt{d_{k}}$. A softmax function is subsequently applied to obtain attention weights over the value vectors. Finally, the output is obtained by performing a weighted sum of the value vectors based on these attention weights. The computation is formulated as follows:

Attention (Q, K, V) = softmax (\frac{Q K^{⊤}}{\sqrt{d_{k}}}) V

(8)
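Equation (8) for a single attention head can be written out directly. The sketch below is a plain-Python illustration with hypothetical toy inputs; it uses the per-head dimension of 2 derived above and omits the learned projection matrices of Equations (6) and (7).

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for 2-D nested lists."""
    d_k = len(K[0])
    scale = math.sqrt(d_k)
    output = []
    for q in Q:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in K]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        output.append([sum(w * v[j] for w, v in zip(weights, V))
                       for j in range(len(V[0]))])
    return output


d_model, h = 32, 16
d_k = d_model // h  # per-head dimension, 2 in this configuration

# Hypothetical toy input: a zero query attends equally to both keys.
Q = [[0.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
attended = scaled_dot_product_attention(Q, K, V)
```

Because the zero query yields identical scores for both keys, the attention weights are uniform and the output is the mean of the two value vectors.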

The LayerNorm layer is a commonly used normalization technique in deep neural networks, which normalizes across the feature dimensions of each individual sample. This normalization approach effectively mitigates the problem of internal covariate shift, thereby improving the stability of model training, accelerating convergence, and enhancing overall performance.

The fully connected layer performs the weighted integration of multidimensional features, enabling the exploration of interactions among different features and generating output with the same dimensionality as the target variable. This layer not only accomplishes the final mapping from features to output but also plays a vital role in information compression, feature fusion, and nonlinear representation. It serves as one of the core structures in the output stage of the model.

2.5.2. The 1D-CNN Branch

The structure of the 1D-CNN branch, as illustrated in Figure 5, is designed to capture variations in environmental covariates at a small scale, thereby extracting local features closely related to SOM. In soil systems, factors such as terrain undulation and the topographic wetness index often exhibit spatial heterogeneity at the micro-scale [45], which influences the distribution of SOM. Through one-dimensional convolutional operations, this branch is capable of capturing high-frequency, localized variation patterns, thereby enhancing the model’s ability to identify small-scale spatial distribution features of SOM and its spatial correlations with environmental covariates [46].

The 1D-CNN branch consists of four convolutional blocks, one flatten layer, and two fully connected layers. Each convolutional block includes a one-dimensional convolution module, a BatchNorm layer, and a ReLU activation function. The model parameters are detailed in Table 3. The 1D convolution modules are used to extract local features. The BatchNorm layers standardize the features to accelerate model convergence and reduce the risk of overfitting. The ReLU activation functions introduce nonlinearity, enhancing the model’s representational capacity. The flatten layer converts the multi-channel feature tensor output by the convolutional blocks into a one-dimensional vector, so that it can be passed to the fully connected layers for feature fusion and regression prediction. The fully connected layers are employed to integrate features and complete the mapping from the feature space to the output space.
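The core operation of a convolutional block can be illustrated with a single-channel sketch. It is deliberately simplified: the kernel weights and the input vector are hypothetical, the BatchNorm layer is omitted, and the actual kernel sizes and channel counts are those listed in Table 3, which are not reproduced here.

```python
def conv1d_valid(signal, kernel):
    """Slide `kernel` over `signal` without padding.

    This is cross-correlation, the operation most deep learning
    libraries implement under the name convolution.
    """
    width = len(kernel)
    return [
        sum(signal[start + j] * kernel[j] for j in range(width))
        for start in range(len(signal) - width + 1)
    ]


def relu(values):
    """Set negative activations to zero."""
    return [max(0.0, v) for v in values]


# Hypothetical covariate vector and a difference-like kernel of width 3.
covariates = [1.0, 2.0, 4.0, 3.0, 0.0]
kernel = [-1.0, 0.0, 1.0]
features = relu(conv1d_valid(covariates, kernel))
```

Each output value depends only on three neighbouring inputs, which is why small kernels respond to local variation in the input vector rather than to its overall pattern.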

2.5.3. GPTransCNN

As illustrated in Figure 6, GPTransCNN is designed to incorporate geographic positional encoding while effectively integrating both global and local features of environmental covariates to capture their relationship with SOM and improve prediction accuracy. Hyperparameter optimization was conducted using the Optuna framework [47], and the optimal hyperparameter configuration was as follows: the input feature dimension of the GPTransformer branch was set to $d_{model} = 32$, the number of attention heads to $h = 16$, and the number of encoder modules to 5. The batch size was set to 32, and the model was trained using the Adam optimizer [48] with a learning rate of 0.001. The mean squared error (MSE) loss was used as the loss function. The training process was run for a maximum of 500 epochs, with early stopping applied to prevent overfitting [49].

First, the environmental covariates are simultaneously fed into both the GPTransformer branch and the 1D-CNN branch, where each branch extracts a feature vector of shape (batch, 1). These feature vectors are then concatenated along the feature dimension via a Concat Layer, resulting in a fused feature vector of shape (batch, 2). This fused vector is then passed through a fully connected layer and mapped to a feature vector of shape (batch, 64). A ReLU activation function is subsequently applied to introduce nonlinearity. To alleviate overfitting, a Dropout Layer is employed after the activation function, randomly suppressing 20% of neuron outputs. Finally, the model outputs the predicted SOM content through another fully connected layer.
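The shapes in this fusion stage can be followed with a small forward-pass sketch. It is not the authors' implementation: the weights are random placeholders, all function names are hypothetical, the branch outputs are made-up numbers, and the rescaling of kept activations during dropout (inverted dropout) is an assumption.

```python
import random


def linear(x, weights, bias):
    """Fully connected layer. x: [batch][n_in], weights: [n_out][n_in]."""
    return [[sum(w * v for w, v in zip(row_w, sample)) + b
             for row_w, b in zip(weights, bias)]
            for sample in x]


def relu(x):
    return [[max(0.0, v) for v in row] for row in x]


def dropout(x, rate, training, rng):
    """Zero a fraction `rate` of activations during training only."""
    if not training:
        return x
    keep = 1.0 - rate
    return [[v / keep if rng.random() < keep else 0.0 for v in row] for row in x]


def init_linear(n_in, n_out, rng):
    """Random placeholder weights and zero bias."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def fusion_head(gpt_out, cnn_out, params, training=False, rng=None):
    # Concatenate (batch, 1) and (batch, 1) into (batch, 2).
    fused = [a + b for a, b in zip(gpt_out, cnn_out)]
    hidden = relu(linear(fused, *params["fc1"]))      # (batch, 64)
    hidden = dropout(hidden, 0.2, training, rng)      # 20% dropout
    return linear(hidden, *params["fc2"])             # (batch, 1)


rng = random.Random(0)
params = {"fc1": init_linear(2, 64, rng), "fc2": init_linear(64, 1, rng)}
gpt_out = [[0.3], [0.1], [0.5], [0.2]]   # hypothetical branch outputs
cnn_out = [[0.4], [0.2], [0.1], [0.6]]
pred = fusion_head(gpt_out, cnn_out, params)
```

Since each branch contributes a single scalar per sample, the fusion layers operate on only two inputs; the 64-unit hidden layer gives the model room to combine them nonlinearly.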

2.6. Model Evaluation

To evaluate the performance differences among the GPTransformer, 1D-CNN, and GPTransCNN, comparative experiments were conducted using the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE) [50,51] as evaluation metrics. The formulas are as follows:

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}

(9)

R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}

(10)

MAE = \frac{1}{n} \sum_{i = 1}^{n} | y_{i} - {\hat{y}}_{i} |

(11)

R² quantifies the proportion of variance in the data explained by the model, with values closer to 1 indicating a superior fit. RMSE and MAE assess the prediction errors, where lower values correspond to higher predictive accuracy. In Equations (9) to (11), $y_{i}$ is the actual SOM content, ${\hat{y}}_{i}$ is the predicted SOM content, $\bar{y}$ is the mean of the actual SOM content, and n is the total number of samples.
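Equations (9) to (11) translate directly into code. The sketch below uses a handful of hypothetical SOM values (g/kg) purely to show the arithmetic; the function names are illustrative.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error, Eq. (9)."""
    n = len(y_true)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination, Eq. (10)."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def mae(y_true, y_pred):
    """Mean absolute error, Eq. (11)."""
    n = len(y_true)
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / n


# Hypothetical observed and predicted SOM values (g/kg).
observed = [20.0, 25.0, 30.0, 35.0]
predicted = [22.0, 24.0, 29.0, 37.0]
print(rmse(observed, predicted), r2(observed, predicted), mae(observed, predicted))
```

For these toy values the residual sum of squares is 10 and the total sum of squares is 125, giving R² = 0.92, RMSE ≈ 1.58 g/kg, and MAE = 1.5 g/kg.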

3. Results

3.1. Descriptive Statistics of Soil Organic Matter

Summary statistics of SOM content in this study are presented in Table 4. SOM values range from 4.9 to 63.7 g/kg, with a kurtosis of 4.6163, indicating a relatively concentrated distribution. The skewness is 0.8877, indicating a right-skewed distribution: most samples fall in the low-to-moderate range, while a small number of samples with high SOM values form a long upper tail. The coefficient of variation of 31.73% shows that there is a degree of variability and spatial heterogeneity in SOM content among the sampling sites.

3.2. Prediction Results of Soil Organic Matter Content

The experimental results are summarized in Table 5, where the GPTransCNN demonstrates the best performance in predicting SOM. To further explore the differences among the predictions of these models, scatter plots and violin plots were generated, as shown in Figure 7.

The predictions from the GPTransformer are relatively concentrated, with an R² of 0.3643, representing the weakest performance among the three models. Examination of the scatter plot reveals that its predicted values primarily cluster within the range of 15–30 g/kg, a pattern consistent with the distribution depicted in the violin plot, indicating a lack of sensitivity to extreme values. The spatial distribution of SOM is not only governed by regional-scale climate and topographic patterns but also influenced by local variations in microtopography, moisture redistribution, and vegetation inputs [52]. The GPTransformer effectively captures global spatial heterogeneity and the overall relationships among covariates, facilitating the understanding of large-scale SOM distribution trends, but its limited ability to extract local details constrains prediction accuracy in areas with strong spatial heterogeneity.

The predictive performance of the 1D-CNN surpasses that of the GPTransformer, achieving an R² of 0.3765, which ranks it in the middle among the three models. Although the scatter plot reveals deviations between predicted and observed SOM values in the 0–25 g/kg and high-value ranges, the violin plot indicates that the overall distribution of the predicted values aligns more closely with the measured data. This improvement is largely attributed to the use of small convolutional kernels, which enable the 1D-CNN to more effectively capture the influence of local environmental factors (such as terrain undulation, precipitation, and temperature) on SOM [53]. As a result, the model is well suited to handling SOM data with high spatial heterogeneity and small-scale variability, making it more effective in capturing the micro-scale patterns of the SOM distribution.

GPTransCNN achieved the best predictive performance among the three models, with an R² of 0.4329, indicating the strongest fitting capability. While some discrepancies between predicted and observed values remain, the scatter plot demonstrates an improved overall fit. Moreover, the distribution of predicted SOM values in the violin plot most closely matches the observed distribution, further confirming the model’s accuracy. These results suggest that GPTransCNN effectively integrates the large-scale spatial awareness of the GPTransformer with the local feature extraction capabilities of the 1D-CNN, thereby enhancing the model’s ability to capture the spatial heterogeneity of SOM and improving predictive accuracy.

As shown in Figure 7, all three models exhibit larger prediction errors for samples with higher SOM values. The main reason is the limited number of high-value samples, which leads to insufficient training in this range and subsequently affects the prediction accuracy. To further ensure the reliability and rigor of the experimental results, this study employed ten-fold cross-validation, a method that divides the dataset into ten equal parts. In each iteration, nine parts are used for training and the remaining one is used for testing. This process is repeated ten times so that each part is used once as the test set. This approach helps mitigate the randomness introduced by data partitioning and gives a more reliable estimate of model generalizability. The results are shown in Table 6. The validation results confirmed that GPTransCNN consistently outperformed the other models, with an R² of 0.4260, which is 32.01% higher than GPTransformer and 6.55% higher than 1D-CNN. These findings further demonstrate that the dual-branch architecture of GPTransCNN, which integrates the global and local representations of environmental covariates, more effectively captures the nonlinear relationships between these covariates and SOM, thereby improving predictive accuracy.
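The ten-fold procedure can be sketched as an index generator. This is an illustration under assumptions: the folds are formed by a seeded random shuffle, the function name is hypothetical, and the procedure is applied to all 1490 samples, which the text implies but does not state explicitly.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Every k-th shuffled index goes to the same fold.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(1490, k=10, seed=0))
for train, test in splits[:2]:
    print(len(train), len(test))
```

With 1490 samples, each fold holds 149 test samples and 1341 training samples, and every sample appears in exactly one test fold.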

3.3. Spatial Distribution of Soil Organic Matter Content

The study area is primarily mountainous, with large undulations and complex terrain, resulting in differences in soil water and heat conditions and vegetation types between different slope positions and orientations, thereby exacerbating the spatial heterogeneity of SOM within cultivated land. To predict the spatial distribution of SOM within the cultivated land of the study area, environmental features across the entire region were used as model inputs. The trained GPTransformer, 1D-CNN, and GPTransCNN models were employed to generate corresponding SOM spatial distribution maps. As shown in Figure 8, SOM concentrations are predominantly in the range of 20–30 g/kg, with higher values mainly distributed in the northeastern and southwestern parts. This spatial differentiation pattern is likely driven by the combined influence of topography, land use types, and human activities.

The SOM spatial distribution maps predicted by different models exhibit certain differences in spatial patterns. From the perspective of the overall distribution pattern, GPTransformer focuses on extracting global features, effectively capturing the overall spatial trends and macro-level distribution patterns of SOM across the study area. This model demonstrates particular strengths in responding to large-scale topographic variation and the overarching influence of climatic factors. However, its limited capacity to capture local details during the modeling process results in relatively smooth spatial predictions, lacking the expression of fine-scale spatial variability. In contrast, 1D-CNN focuses on extracting local features and is particularly sensitive to the spatial variability of environmental covariates at a small scale, demonstrating stronger capability in detail recognition. The predicted spatial distribution maps of SOM generated by 1D-CNN clearly reveal more refined spatial variations, effectively reflecting the spatial heterogeneity of SOM under varying microtopographic conditions. GPTransCNN not only retains the global trend modeling capability of GPTransformer but also enhances the local feature extraction capability through 1D-CNN. Its prediction results are able to delineate the spatial boundaries between high- and low-SOM regions, while simultaneously capturing the effects of topographic undulations and climatic variations on SOM distribution, demonstrating the best spatial prediction ability of the three models. The enlarged view of the local area in the upper right corner of Figure 8 supports the same conclusion.

4. Discussion

4.1. Ablation Study

To evaluate the contribution of each structural component of GPTransCNN to overall performance, this study conducted ablation experiments. By systematically removing or modifying individual model components, the impact of each part on the model’s predictive performance was quantitatively assessed, thereby providing a scientific basis for model optimization and mechanistic interpretation [54]. Given that GPTransCNN adopts a dual-branch neural network architecture composed of a GPTransformer branch and a 1D-CNN branch, structural modifications and combinations were designed separately for each branch.

The GPTransformer branch consists of three structural variants: Transformer (with original positional encoding), NPTransformer (without positional encoding), and GPTransformer (with geographic positional encoding). Transformer serves as the baseline model, while NPTransformer is designed to assess the impact of removing positional encoding on model performance. GPTransformer modifies the positional encoding mechanism of Transformer by incorporating geographic location information to enhance the model’s sensitivity to spatial features. The 1D-CNN branch adopts a fixed architecture composed of a one-dimensional convolutional neural network with four consecutive convolutional blocks.

Based on the combinations of the aforementioned GPTransformer and 1D-CNN branches, three dual-branch comparative models were constructed: TransCNN (comprising Transformer and 1D-CNN), NPTransCNN (comprising NPTransformer and 1D-CNN), and GPTransCNN (comprising GPTransformer and 1D-CNN), with the latter being the model proposed in this study. In total, seven models were evaluated for comparative validation. As shown in Table 7, GPTransCNN demonstrated the best performance, with an R² improvement of 43.39% over Transformer, 44.49% over NPTransformer, 18.83% over GPTransformer, 14.98% over 1D-CNN, 7.26% over TransCNN, and 7.29% over NPTransCNN. To ensure the robustness and reliability of the experimental results, this study conducted ten-fold cross-validation. As presented in Table 8, GPTransCNN consistently achieved the highest predictive performance.
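The percentages reported for the ablation study read most naturally as relative improvements in R². The following minimal sketch works through that arithmetic under the assumption that the relative form (R²_new − R²_ref) / R²_ref was used. The function names are illustrative, and the back-calculated baseline values are only what the reported percentages would imply under that assumption; they are not values taken from Table 7.

```python
def relative_improvement(r2_new: float, r2_ref: float) -> float:
    """Relative R² gain of r2_new over r2_ref, in percent."""
    return (r2_new - r2_ref) / r2_ref * 100.0


def implied_baseline(r2_new: float, improvement_pct: float) -> float:
    """Invert relative_improvement: the reference R² implied by a reported gain."""
    return r2_new / (1.0 + improvement_pct / 100.0)


R2_GPTRANSCNN = 0.4329  # R² of GPTransCNN as reported in Section 4.3

# Relative gains (percent) reported in the ablation study.
reported_gains = {
    "Transformer": 43.39,
    "NPTransformer": 44.49,
    "GPTransformer": 18.83,
    "1D-CNN": 14.98,
    "TransCNN": 7.26,
    "NPTransCNN": 7.29,
}

# Baseline R² values implied by the gains (an assumption-based estimate only).
implied = {
    name: implied_baseline(R2_GPTRANSCNN, gain)
    for name, gain in reported_gains.items()
}

for name, r2 in implied.items():
    print(f"{name:14s} implied R2 = {r2:.3f}")
```

Read this way, the two models without geographic positional encoding would sit close together and the dual-branch variants would sit closest to GPTransCNN, which matches the ordering described in the text.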

The results of the comparison reveal that Transformer and NPTransformer exhibit similar performance, indicating that the original positional encoding offers limited effectiveness in the geospatial context of this study. This finding is further corroborated by the comparable outcomes observed between TransCNN and NPTransCNN. In contrast, GPTransformer, which incorporates geographic positional encoding, achieves higher model fitting accuracy, suggesting that embedding geographic information enhances the model’s capacity to interpret the spatial heterogeneity of environmental covariates. Furthermore, models with dual-branch architectures outperform their single-branch counterparts. Among them, GPTransCNN delivers the best performance, further validating that the integration of the GPTransformer’s global modeling capability with the 1D-CNN’s fine-grained local feature extraction substantially improves the model’s ability to capture complex spatial heterogeneity and enhances prediction stability.

Through systematic ablation experiments, this study validated the effectiveness of two core architectural designs. First, the introduction of geographic positional encoding not only enhanced the model’s spatial awareness and its ability to capture spatial distribution patterns but also improved its generalization performance under complex terrain conditions. Second, the dual-branch architecture, by integrating spatial features across multiple scales, provides a more robust and accurate approach for predicting SOM in highly heterogeneous environments, offering important practical insights for digital soil mapping in mountainous terrains.

4.2. Uncertainty Quantification

Given that deep learning models are inherently black-box in nature and subject to uncertainty, their outputs may be influenced by factors such as model parameters, training procedures, and inputs. Predictive uncertainty refers to the spatial variability in the difference between predicted and true SOM values across the study area, and its distribution is used to assess the model’s generalization stability. To quantify the predictive uncertainty of the model, this study adopted the Monte Carlo Dropout method [55]. During the testing phase, the Dropout mechanism is kept active, and forward propagation is performed 100 times, each time generating a set of prediction results. The prediction error is calculated by subtracting the actual values from the predicted values across all 100 iterations. The standard deviation of these errors for each sample is then computed and used as a quantitative indicator of uncertainty. This uncertainty metric is subsequently applied to the entire cultivated area to produce a spatial uncertainty map, as shown in Figure 9. In the map, color intensity represents the magnitude of predictive uncertainty at different spatial locations, with darker colors indicating higher uncertainty. This metric effectively captures the spatial variation in predictive uncertainty and serves as a basis for evaluating the robustness and reliability of the model across different regions.
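The Monte Carlo Dropout procedure described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: a tiny two-layer network with random weights stands in for the trained model, the dropout rate of 0.2 is an arbitrary assumption, and all names (`forward_with_dropout`, `mc_dropout_uncertainty`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "trained" network: random weights, purely for illustration.
n_features, n_hidden = 8, 32
W1 = rng.normal(size=(n_features, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_hidden, 1))
b2 = np.zeros(1)


def forward_with_dropout(x, p_drop, rng):
    """One stochastic forward pass with dropout left active (inverted dropout)."""
    h = np.maximum(x @ W1 + b1, 0.0)       # ReLU hidden layer
    keep = rng.random(h.shape) >= p_drop   # Bernoulli keep-mask
    h = h * keep / (1.0 - p_drop)          # rescale so the expectation is unchanged
    return (h @ W2 + b2).ravel()


def mc_dropout_uncertainty(x, y_true, n_passes=100, p_drop=0.2, rng=rng):
    """Per-sample standard deviation of the prediction error over n_passes."""
    preds = np.stack(
        [forward_with_dropout(x, p_drop, rng) for _ in range(n_passes)]
    )
    errors = preds - y_true                # predicted minus actual, (n_passes, n_samples)
    return errors.std(axis=0), preds


x = rng.normal(size=(5, n_features))       # 5 synthetic samples
y_true = rng.normal(size=5)                # synthetic "observed" values
uncertainty, preds = mc_dropout_uncertainty(x, y_true)
print(uncertainty)
```

Because the observed value is constant for a given sample, the standard deviation of the errors equals the standard deviation of the predictions themselves. This is presumably what allows the metric to be mapped over grid cells that have no observed SOM value.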

As shown in Figure 9, the predictive uncertainty of GPTransformer exhibits relatively large fluctuations, indicating a certain degree of instability. The 1D-CNN model shows a more stable spatial distribution of predictive uncertainty, with smaller variation. Among the three models, GPTransCNN demonstrates the least spatial fluctuation in uncertainty, reflecting the most stable overall performance. To further investigate the spatial distribution of predictive uncertainty, this study introduced the Terrain Ruggedness Index (TRI) for auxiliary analysis. The TRI is an important indicator used to measure terrain heterogeneity and complexity. It quantifies the elevation differences between a grid cell and its neighboring cells, thereby reflecting the structural complexity and heterogeneity of the terrain. Higher TRI values indicate rougher surfaces and more complex terrain structures [56].
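As an illustration of how such an index can be computed, the sketch below uses one common formulation of the TRI: the square root of the summed squared elevation differences between a cell and its eight neighbours. The exact variant and software settings used in this study are not stated here, so the formula, the edge handling, and the function name are assumptions of the example.

```python
import numpy as np


def terrain_ruggedness_index(dem: np.ndarray) -> np.ndarray:
    """TRI per cell: sqrt of summed squared elevation differences to the 8 neighbours.

    Edges are handled by replicating the border cells (an assumption of this sketch).
    """
    dem = np.asarray(dem, dtype=float)
    padded = np.pad(dem, 1, mode="edge")
    rows, cols = dem.shape
    sq_sum = np.zeros_like(dem)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the centre cell itself
            neighbour = padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
            sq_sum += (neighbour - dem) ** 2
    return np.sqrt(sq_sum)


flat = np.full((4, 4), 100.0)   # perfectly flat surface
spike = flat.copy()
spike[1, 1] = 110.0             # a single 10 m bump

tri_flat = terrain_ruggedness_index(flat)
tri_spike = terrain_ruggedness_index(spike)
print(tri_spike.round(2))
```

A flat surface gives a TRI of zero everywhere, while the isolated bump raises the index both at the bump and in the cells that touch it, which is the behaviour expected of a roughness measure.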

As shown in Figure 10, the spatial distribution of predictive uncertainty for the three models, along with the TRI distribution, is presented. For ease of comparison, a representative spatial region was selected and uniformly applied across all maps. This region was enlarged and displayed in the upper-right corner of each subfigure to provide a clearer visual comparison of both the differences in predictive uncertainty among the models and the correspondence between model uncertainty and TRI distribution. In areas with high TRI values, the predictive uncertainty of GPTransformer increases, indicating its limited capacity to model local features under complex terrain conditions. The 1D-CNN model shows relatively low uncertainty in these areas, indicating its ability to extract local features and adapt to terrain variation. GPTransCNN shows the lowest predictive uncertainty in high-TRI regions, further confirming that its hybrid structure effectively balances global trend modeling and local detail extraction. This enables GPTransCNN to achieve greater stability and robustness under complex terrain conditions.
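The correspondence between uncertainty and TRI is assessed visually in Figure 10. One possible quantitative complement, sketched below on synthetic data, is to average the uncertainty within equal-count TRI classes and to compute a rank correlation. This is not an analysis performed in the study; the data, the number of classes, and the function names are all assumptions made for illustration.

```python
import numpy as np


def mean_by_quantile_class(tri, uncertainty, n_classes=4):
    """Mean uncertainty within equal-count TRI classes (lowest TRI class first)."""
    tri = np.ravel(tri)
    uncertainty = np.ravel(uncertainty)
    edges = np.quantile(tri, np.linspace(0.0, 1.0, n_classes + 1))
    # Interior edges only, so every value falls into a class index 0..n_classes-1.
    classes = np.digitize(tri, edges[1:-1])
    return np.array([uncertainty[classes == k].mean() for k in range(n_classes)])


def spearman_rho(a, b):
    """Spearman rank correlation (no tie correction; adequate for continuous data)."""
    rank_a = np.argsort(np.argsort(np.ravel(a)))
    rank_b = np.argsort(np.argsort(np.ravel(b)))
    return np.corrcoef(rank_a, rank_b)[0, 1]


# Synthetic example: uncertainty that grows with ruggedness, plus noise.
rng = np.random.default_rng(42)
tri = rng.gamma(shape=2.0, scale=5.0, size=2000)
uncertainty = 0.5 + 0.05 * tri + rng.normal(scale=0.2, size=tri.size)

class_means = mean_by_quantile_class(tri, uncertainty)
rho = spearman_rho(tri, uncertainty)
print(class_means.round(3), round(float(rho), 3))
```

Applied to the per-cell uncertainty of each model, a flatter profile across TRI classes would indicate a model whose stability depends less on terrain complexity.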

4.3. Comparison with Similar Studies

To better evaluate the predictive performance of the proposed model, this study conducted a comparative analysis with existing research. For example, one study used up to 74 covariates to model and predict SOC. These covariates included bioclimatic data such as temperature and precipitation, soil and biome distribution maps, vegetation indices, and various topographic factors derived from a DEM. Four machine learning methods were applied, namely Random Forest (RF), Cubist, Generalized Linear Model Boosting (GLMBoost), and Support Vector Machine (SVM). The results showed that the RF model achieved the best performance, with an R² of 0.32 [57].

Another study constructed a high-dimensional feature set consisting of 130 environmental covariates, including 2 climatic variables, 8 topographic variables, 110 phenological variables (11 × 10 years), and 10 EVI time-series variables (1 × 10 years). The CNN, Long Short-Term Memory (LSTM), and hybrid CNN-LSTM models were used for SOC prediction. The results indicated that the CNN-LSTM model achieved the highest performance, with an R² of approximately 0.35 [58].

It is important to note that none of the above studies explicitly incorporated geographic location information to account for its influence on the distribution of covariates and the spatial variability of SOC. In contrast, the GPTransCNN model proposed in this study introduces a geo-positional encoding mechanism. This allows the model to perceive both spatial structural patterns and environmental variable information, thereby improving the prediction accuracy of SOM, with an R² of 0.4329. Considering the approximate relationship between SOM and SOC ($\mathrm{SOM} \approx \mathrm{SOC} \times 1.724$), the superior performance of GPTransCNN in SOM modeling indirectly reflects its strong generalization capability and practical potential in SOC prediction tasks.

4.4. Limitations and Outlook

Longshan County features a topography that is high in the north and low in the south, steep in the east and gentle in the west. Influenced by intense erosion and karst processes, the region exhibits a highly complex geomorphological structure. Such terrain complexity contributes to the spatial heterogeneity of SOC [59], which has been confirmed through uncertainty analysis as a significant factor affecting the uncertainty in SOM prediction. The Wuling Mountains traverse the study area from the northeast to the southwest, resulting in a dominant slope aspect pattern oriented from the northwest to the southeast. Variations in slope aspect alter the amount of incoming solar radiation, which in turn affects the soil’s thermal and moisture regimes as well as the microclimatic conditions. These differences influence plant growth and lead to spatial variation in vegetation types [60]. Since vegetation type is a critical factor in SOM accumulation and decomposition, its spatial variability further intensifies the heterogeneity of SOM [61]. The combined effects of complex terrain and microclimatic variations significantly increase the spatial variability of SOM across the study area, thereby posing greater challenges for accurate prediction.

Accurately predicting SOM is crucial for promoting sustainable land management and addressing global climate change. Complex terrain and microclimatic variation produce complex soil landscape patterns at the regional scale, and the resulting high spatial heterogeneity makes SOM more difficult to predict. Future research could incorporate higher-resolution spatial sampling data and additional environmental covariates such as vegetation, cultivated plants, human activities, erosion, land use types, soil and culture, and soil parent material [62], in order to obtain more comprehensive information. This would enhance the model’s ability to understand and predict the spatial distribution patterns of SOM, thereby providing stronger technical support for sustainable land management and climate change mitigation.

5. Conclusions

The main novelty of this study is the integration of GPTransformer, which is built on an improved geographic positional encoding, with 1D-CNN to construct a dual-branch neural network, GPTransCNN. The experimental results demonstrate that GPTransCNN achieves the best prediction performance and maintains strong generalization ability under ten-fold cross-validation. Ablation experiments were conducted to dissect the model architecture, revealing that GPTransCNN outperformed all other variants across multiple metrics. It provided the most refined predictions of SOM spatial distribution, achieving an R² improvement of 7.26% to 44.49% over the other models in the ablation study. Uncertainty analysis further demonstrated that GPTransCNN exhibited the lowest predictive uncertainty, indicating the highest stability and robustness. These experimental results further validate that the proposed GPTransformer incorporating geographic location information, combined with the dual-branch design integrating the local feature extraction capability of 1D-CNN, can more accurately characterize the nonlinear relationships between environmental features and SOM, as well as the spatial heterogeneity of SOM.

The SOM spatial distribution map predicted by GPTransCNN provides essential support for analyzing SOM variability and identifying spatial patterns within the study area. It also offers data support for optimizing agricultural production, improving cultivated land quality, and facilitating refined land management. The model shows good accuracy and stability and has potential for wider practical application. In future work, this model can be extended to more complex terrain regions to enable precise SOM spatial pattern prediction, thereby better supporting sustainable land management and responses to global climate change.

Author Contributions

L.S.: Conceptualization, validation, writing—review and editing, and funding acquisition. Y.X.: project administration, writing—original draft, and methodology. Y.D.: visualization, data curation, and conceptualization. Y.F.: formal analysis, investigation, and visualization. Q.Z.: software, data curation, investigation, and resources. H.X.: writing—review and editing, validation, supervision, resources, and funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Foundation of Hunan Province (No. 2025JJ80038), the National Natural Science Foundation of China (No. 62401203), the Key Research and Development Program of Hunan Province (No. 2023NK2026), and the Postgraduate Scientific Research Innovation Project of Hunan Agricultural University (No. 2024XKC058).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

We are grateful to the many students and staff who participated in land cultivation projects and soil sample collection but are not listed as co-authors. All individuals included in this section have consented to the acknowledgment.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Wood, S.A.; Tirfessa, D.; Baudron, F. Soil organic matter underlies crop nutritional quality and productivity in smallholder agriculture. Agric. Ecosyst. Environ. 2018, 266, 100–108.
2. Kik, M.; Claassen, G.; Meuwissen, M.; Ros, G.; Smit, A.; Saatkamp, H. Economic optimization of sustainable soil management: A Dutch case study. Agron. Sustain. Dev. 2024, 44, 48.
3. Löbmann, M.T.; Maring, L.; Prokop, G.; Brils, J.; Bender, J.; Bispo, A.; Helming, K. Systems knowledge for sustainable soil and land management. Sci. Total. Environ. 2022, 822, 153389.
4. Schmidt, M.W.; Torn, M.S.; Abiven, S.; Dittmar, T.; Guggenberger, G.; Janssens, I.A.; Kleber, M.; Kögel-Knabner, I.; Lehmann, J.; Manning, D.A.; et al. Persistence of soil organic matter as an ecosystem property. Nature 2011, 478, 49–56.
5. Chalchissa, F.B.; Kuris, B.K. Modelling soil organic carbon dynamics under extreme climate and land use and land cover changes in Western Oromia Regional state, Ethiopia. J. Environ. Manag. 2024, 350, 119598.
6. Jerray, A.; Rumpel, C.; Le Roux, X.; Massad, R.S.; Chabbi, A. N2O emissions from cropland and grassland management systems are determined by soil organic matter quality and soil physical parameters rather than carbon stock and denitrifier abundances. Soil Biol. Biochem. 2024, 190, 109274.
7. Triantakonstantis, D.; Karakostas, A. Soil Organic Carbon Monitoring and Modelling via Machine Learning Methods Using Soil and Remote Sensing Data. Agriculture 2025, 15, 910.
8. Bui, E.N. Soil survey as a knowledge system. Geoderma 2004, 120, 17–26.
9. Hu, B.; Geng, Y.; Shi, K.; Xie, M.; Ni, H.; Zhu, Q.; Qiu, Y.; Zhang, Y.; Bourennane, H. Fine-resolution baseline maps of soil nutrients in farmland of Jiangxi Province using digital soil mapping and interpretable machine learning. Catena 2025, 249, 108635.
10. Robb, C.; Aitkenhead, M.; Coull, M.; MacFarlane, F.; Matthews, K. Soil Property, Carbon Stock and Peat Extent Mapping at 10 m Resolution in Scotland Using Digital Soil Mapping Techniques. Eur. J. Soil Sci. 2025, 76, e70123.
11. McBratney, A.B.; Santos, M.M.; Minasny, B. On digital soil mapping. Geoderma 2003, 117, 3–52.
12. Stumpf, F.; Behrens, T.; Schmidt, K.; Keller, A. Exploiting Soil and Remote Sensing Data Archives for 3D Mapping of Multiple Soil Properties at the Swiss National Scale. Remote. Sens. 2024, 16, 2712.
13. Taghizadeh-Mehrjardi, R.; Schmidt, K.; Amirian-Chakan, A.; Rentschler, T.; Zeraatpisheh, M.; Sarmadian, F.; Valavi, R.; Davatgar, N.; Behrens, T.; Scholten, T. Improving the spatial prediction of soil organic carbon content in two contrasting climatic regions by stacking machine learning models and rescanning covariate space. Remote. Sens. 2020, 12, 1095.
14. Guo, L.; Fu, P.; Shi, T.; Chen, Y.; Zeng, C.; Zhang, H.; Wang, S. Exploring influence factors in mapping soil organic carbon on low-relief agricultural lands using time series of remote sensing data. Soil Tillage Res. 2021, 210, 104982.
15. Guo, L.; Fu, P.; Shi, T.; Chen, Y.; Zhang, H.; Meng, R.; Wang, S. Mapping field-scale soil organic carbon with unmanned aircraft system-acquired time series multispectral images. Soil Tillage Res. 2020, 196, 104477.
16. Luo, C.; Wang, Y.; Zhang, X.; Zhang, W.; Liu, H. Spatial prediction of soil organic matter content using multiyear synthetic images and partitioning algorithms. Catena 2022, 211, 106023.
17. Andrade, R.; Silva, S.H.G.; Weindorf, D.C.; Chakraborty, S.; Faria, W.M.; Mesquita, L.F.; Guilherme, L.R.G.; Curi, N. Assessing models for prediction of some soil chemical properties from portable X-ray fluorescence (pXRF) spectrometry data in Brazilian Coastal Plains. Geoderma 2020, 357, 113957.
18. Hong, Y.; Chen, S.; Hu, B.; Wang, N.; Xue, J.; Zhuo, Z.; Yang, Y.; Chen, Y.; Peng, J.; Liu, Y.; et al. Spectral fusion modeling for soil organic carbon by a parallel input-convolutional neural network. Geoderma 2023, 437, 116584.
19. Hong, Y.; Chen, Y.; Chen, S.; Shen, R.; Hu, B.; Peng, J.; Wang, N.; Guo, L.; Zhuo, Z.; Yang, Y.; et al. Data mining of urban soil spectral library for estimating organic carbon. Geoderma 2022, 426, 116102.
20. Deng, Y.; Xiao, L.; Shi, Y. Enhanced Hyperspectral Forest Soil Organic Matter Prediction Using a Black-Winged Kite Algorithm-Optimized Convolutional Neural Network and Support Vector Machine. Appl. Sci. 2025, 15, 503.
21. Feng, B.; Yang, H.; Ren, Y.; Zheng, S.; Feng, G.; Huang, Y. Study on Change of Landscape Pattern Characteristics of Comprehensive Land Improvement Based on Optimal Spatial Scale. Land 2025, 14, 135.
22. Liu, X.; Wang, M.; Liu, Z.; Li, X.; Ji, X.; Wang, F. Spatial and temporal evolution of soil organic matter and its response to dynamic factors in the Southern part of Black Soil Region of Northeast China. Soil Tillage Res. 2025, 248, 106475.
23. Doetterl, S.; Berhe, A.A.; Heckman, K.; Lawrence, C.; Schnecker, J.; Vargas, R.; Vogel, C.; Wagai, R. A landscape-scale view of soil organic matter dynamics. Nat. Rev. Earth Environ. 2025, 6, 1–15.
24. Kakhani, N.; Rangzan, M.; Jamali, A.; Attarchi, S.; Alavipanah, S.K.; Mommert, M.; Tziolas, N.; Scholten, T. SSL-SoilNet: A Hybrid Transformer-Based Framework with Self-Supervised Learning for Large-Scale Soil Organic Carbon Prediction. IEEE Trans. Geosci. Remote. Sens. 2024, 62, 4509915.
25. Tresson, P.; Dumont, M.; Jaeger, M.; Borne, F.; Boivin, S.; Marie-Louise, L.; François, J.; Boukcim, H.; Goëau, H. Self-supervised learning of Vision Transformers for digital soil mapping using visual data. Geoderma 2024, 450, 117056.
26. Wang, Y.; Zha, Y. Comparison of transformer, LSTM and coupled algorithms for soil moisture prediction in shallow-groundwater-level areas with interpretability analysis. Agric. Water Manag. 2024, 305, 109120.
27. Tziolas, N.; Tsakiridis, N.; Heiden, U.; van Wesemael, B. Soil organic carbon mapping utilizing convolutional neural networks and Earth observation data, a case study in Bavaria state Germany. Geoderma 2024, 444, 116867.
28. Dong, Z.; Yao, L.; Bao, Y.; Zhang, J.; Yao, F.; Bai, L.; Zheng, P. Prediction of soil organic carbon content in complex vegetation areas based on CNN-LSTM model. Land 2024, 13, 915.
29. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. arXiv 2017, arXiv:1706.03762.
30. Sekulić, A.; Kilibarda, M.; Heuvelink, G.; Nikolić, M.; Bajat, B. Random forest spatial interpolation. Remote. Sens. 2020, 12, 1687.
31. Hernández, T.D.B.; Slater, B.K.; Shaffer, J.M.; Basta, N. Comparison of methods for determining organic carbon content of urban soils in Central Ohio. Geoderma Reg. 2023, 34, e00680.
32. Chen, L.; He, Z.; Du, J.; Yang, J.; Zhu, X. Patterns and environmental controls of soil organic carbon and total nitrogen in alpine ecosystems of northwestern China. Catena 2016, 137, 37–43.
33. Ma, S.; Qiao, Y.; Wang, L.; Zhang, J. Terrain gradient variations in ecosystem services of different vegetation types in mountainous regions: Vegetation resource conservation and sustainable development. For. Ecol. Manag. 2021, 482, 118856.
34. Zhu, M.; Feng, Q.; Qin, Y.; Cao, J.; Zhang, M.; Liu, W.; Deo, R.C.; Zhang, C.; Li, R.; Li, B. The role of topography in shaping the spatial patterns of soil organic carbon. Catena 2019, 176, 296–305.
35. Seleiman, M.F.; Al-Suhaibani, N.; Ali, N.; Akmal, M.; Alotaibi, M.; Refay, Y.; Dindaroglu, T.; Abdul-Wajid, H.H.; Battaglia, M.L. Drought stress impacts on plants and different approaches to alleviate its adverse effects. Plants 2021, 10, 259.
36. Villarino, S.H.; Pinto, P.; Jackson, R.B.; Piñeiro, G. Plant rhizodeposition: A key factor for soil organic matter formation in stable fractions. Sci. Adv. 2021, 7, eabd3176.
37. Beattie, G.A.; Edlund, A.; Esiobu, N.; Gilbert, J.; Nicolaisen, M.H.; Jansson, J.K.; Jensen, P.; Keiluweit, M.; Lennon, J.T.; Martiny, J.; et al. Soil microbiome interventions for carbon sequestration and climate mitigation. mSystems 2025, 10, e01129–24.
38. Chen, Q.; Niu, B.; Hu, Y.; Luo, T.; Zhang, G. Warming and increased precipitation indirectly affect the composition and turnover of labile-fraction soil organic matter by directly affecting vegetation and microorganisms. Sci. Total. Environ. 2020, 714, 136787.
39. Zhang, Y.; An, C.; Zhang, W.; Zheng, L.; Zhang, Y.; Lu, C.; Liu, L. Drivers of mountain soil organic carbon stock dynamics: A review. J. Soils Sediments 2023, 23, 64–76.
40. Conrad, O.; Bechtel, B.; Bock, M.; Dietrich, H.; Fischer, E.; Gerlitz, L.; Wehberg, J.; Wichmann, V.; Böhner, J. System for automated geoscientific analyses (SAGA) v. 2.1.4. Geosci. Model Dev. 2015, 8, 1991–2007.
41. Fick, S.E.; Hijmans, R.J. WorldClim 2: New 1-km spatial resolution climate surfaces for global land areas. Int. J. Climatol. 2017, 37, 4302–4315.
42. Zhang, Z.; Ding, J.; Zhu, C.; Chen, X.; Wang, J.; Han, L.; Ma, X.; Xu, D. Bivariate empirical mode decomposition of the spatial variation in the soil organic matter content: A case study from NW China. Catena 2021, 206, 105572.
43. Du, J.; Zhang, Y.; Wang, P.; Tansey, K.; Liu, J.; Zhang, S. Enhancing Winter Wheat Yield Estimation With a CNN-Transformer Hybrid Framework Utilizing Multiple Remotely Sensed Parameters. IEEE Trans. Geosci. Remote. Sens. 2025, 63, 4405213.
44. Gorishniy, Y.; Rubachev, I.; Khrulkov, V.; Babenko, A. Revisiting deep learning models for tabular data. Adv. Neural Inf. Process. Syst. 2021, 34, 18932–18943.
45. Yang, L.; Cai, Y.; Zhang, L.; Guo, M.; Li, A.; Zhou, C. A deep learning method to predict soil organic carbon content at a regional scale using satellite-based phenology variables. Int. J. Appl. Earth Obs. Geoinf. 2021, 102, 102428.
46. Tsakiridis, N.L.; Keramaris, K.D.; Theocharis, J.B.; Zalidis, G.C. Simultaneous prediction of soil properties from VNIR-SWIR spectra using a localized multi-channel 1-D convolutional neural network. Geoderma 2020, 367, 114208.
47. Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A next-generation hyperparameter optimization framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 2623–2631.
48. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, CA, USA, 7–9 May 2015; pp. 104–118.
49. Prechelt, L. Early stopping-but when? In Neural Networks: Tricks of the Trade; Springer: Berlin/Heidelberg, Germany, 2002; pp. 55–69.
50. Tziachris, P.; Aschonitis, V.; Chatzistathis, T.; Papadopoulou, M. Assessment of spatial hybrid methods for predicting soil organic matter using DEM derivatives and soil parameters. Catena 2019, 174, 206–216.
51. Zhang, M.; Liu, H.; Zhang, M.; Yang, H.; Jin, Y.; Han, Y.; Tang, H.; Zhang, X.; Zhang, X. Mapping soil organic matter and analyzing the prediction accuracy of typical cropland soil types on the Northern Songnen Plain. Remote. Sens. 2021, 13, 5162.
52. Li, Y.; Henrion, M.; Moore, A.; Lambot, S.; Opfergelt, S.; Vanacker, V.; Jonard, F.; Van Oost, K. Factors controlling peat soil thickness and carbon storage in temperate peatlands based on UAV high-resolution remote sensing. Geoderma 2024, 449, 117009.
53. Zhang, F.; Liu, Y.; Wu, S.; Liu, J.; Luo, Y.; Ma, Y.; Pan, X. Prediction and spatial–temporal changes of soil organic matter in the Huanghuaihai Plain by combining legacy and recent data. Geoderma 2024, 450, 117031.
54. Biedenkapp, A.; Lindauer, M.; Eggensperger, K.; Hutter, F.; Fawcett, C.; Hoos, H. Efficient parameter importance analysis via ablation with surrogates. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 31.
55. Gal, Y.; Ghahramani, Z. Dropout as a bayesian approximation: Representing model uncertainty in deep learning. In Proceedings of the International Conference on Machine Learning, PMLR, New York City, NY, USA, 19–24 June 2016; pp. 1050–1059.
56. Maxwell, A.E.; Shobe, C.M. Land-surface parameters for spatial predictive mapping and modeling. Earth-Sci. Rev. 2022, 226, 103944.
57. Gomes, L.C.; Faria, R.M.; de Souza, E.; Veloso, G.V.; Schaefer, C.E.G.; Fernandes Filho, E.I. Modelling and mapping soil organic carbon stocks in Brazil. Geoderma 2019, 340, 337–350.
58. Zhang, L.; Cai, Y.; Huang, H.; Li, A.; Yang, L.; Zhou, C. A CNN-LSTM model for soil organic carbon content prediction with long time series of MODIS-based phenological variables. Remote. Sens. 2022, 14, 4441.
59. Liu, W.; Jiang, Y.; Yang, Q.; Yang, H.; Li, Y.; Li, Z.; Mao, W.; Luo, Y.; Wang, X.; Tan, Z. Spatial distribution and stability mechanisms of soil organic carbon in a tropical montane rainforest. Ecol. Indic. 2021, 129, 107965.
60. Jiquan, C.; Kyaw, T.P.U.; Malcolm, N.; Jerry, F.F. The contributions of microclimatic information in advancing ecosystem science. Agric. For. Meteorol. 2024, 355, 110105.
61. Lv, X.; Jia, G.; Yu, X.; Niu, L. Vegetation and topographic factors affecting SOM, SOC, and N contents in a mountainous watershed in north China. Forests 2022, 13, 742.
62. Fan, M.; Lal, R.; Zhang, H.; Margenot, A.J.; Wu, J.; Wu, P.; Zhang, L.; Yao, J.; Chen, F.; Gao, C. Variability and determinants of soil organic matter under different land uses and soil types in eastern China. Soil Tillage Res. 2020, 198, 104544.

Figure 1. The distribution of soil samples and cultivated land.

Figure 2. The distribution of train and test samples.

Figure 3. The architecture of the GPTransformer. Path a denotes the route through the first Transformer Encoding Block where the initial LayerNorm layer is bypassed, after which the processing continues along path b for the remaining layers; ⊕ denotes element-wise addition.

Figure 4. The architecture of Multi-Head Attention, consisting of 16 Scaled Dot-Product Attention layers.

Figure 5. The architecture of the 1D-CNN.

Figure 6. The architecture of the GPTransCNN. Path a denotes the route through the first Transformer Encoding Block where the initial LayerNorm layer is bypassed, after which the processing continues along path b for the remaining layers; ⊕ denotes element-wise addition; $C_{i}$ in the input represents the i-th environmental covariate.

Figure 7. Scatter plots and violin plots of soil organic matter in the test set.

Figure 8. Spatial distribution of soil organic matter.

Figure 9. Spatial distribution of model predictive uncertainty.

Figure 10. Spatial distributions of model predictive uncertainty and Terrain Ruggedness Index, with selected area highlighted and enlarged for detailed comparison.

Table 1. Description of topographic covariates.

Indices	Description	Spatial Resolution
Aspect (Asp)	The angle between the projection of the downslope direction (i.e., surface normal) on the horizontal plane and true north.	30 m
Analytical Hillshading Index (AHI)	Terrain shadow distribution calculated based on solar azimuth and elevation at a specific time.	30 m
Channel Network Distance (CND)	The distance from a specific point on the terrain to the nearest stream or river in the channel network.	30 m
Channel Network Base Level (CNBL)	A hierarchical classification of the channel network, typically based on flow volume, catchment area, and channel width.	30 m
Convergence Index (CI)	Indicates the tendency of water flow to converge on the terrain surface.	30 m
Profile Curvature (PrCu)	Measures the convexity or concavity of the terrain surface along the direction of the steepest slope.	30 m
Plan Curvature (PlCu)	Measures the curvature of the terrain surface in the direction perpendicular to the steepest slope.	30 m
Slope (Slo)	The overall steepness of the terrain at a given point.	30 m
Total Catchment Area (TCaAr)	The area from which all surface water flows to a common outlet point (e.g., a river mouth).	30 m
Topographic Wetness Index (TWI)	An index used to assess the influence of topography on soil moisture distribution.	30 m
Relative Slope Position (RSP)	Describes the relative position of a point within specific terrain features (e.g., valley or ridge).	30 m
LS-Factor (LSF)	Evaluates the potential impact of topography on soil erosion.	30 m
Valley Depth (VD)	The vertical distance from the valley bottom to the highest points on either side.	30 m
Terrain Undulation (TU)	The elevation difference between the highest and lowest points within a defined area (e.g., an 11 × 11 window).	30 m
Topographic Position Index (TPI)	Assesses the relative position of a point compared to the average elevation of its surroundings.	30 m
Terrain Ruggedness Index (TRI)	Quantifies the ruggedness or roughness of the terrain surface.	30 m
Elevation (Ele)	The vertical distance of a point on the terrain surface relative to a reference level.	30 m

Table 2. Description of bioclimatic covariates.

Indices	Description	Spatial Resolution
BIO1	Annual Mean Temperature	30 m
BIO2	Mean Diurnal Range (Mean of monthly (max temp - min temp))	30 m
BIO3	Isothermality (BIO2/BIO7) (×100)	30 m
BIO4	Temperature Seasonality (standard deviation ×100)	30 m
BIO5	Max Temperature of Warmest Month	30 m
BIO6	Min Temperature of Coldest Month	30 m
BIO7	Temperature Annual Range (BIO5-BIO6)	30 m
BIO8	Mean Temperature of Wettest Quarter	30 m
BIO9	Mean Temperature of Driest Quarter	30 m
BIO10	Mean Temperature of Warmest Quarter	30 m
BIO11	Mean Temperature of Coldest Quarter	30 m
BIO12	Annual Precipitation	30 m
BIO13	Precipitation of Wettest Month	30 m
BIO14	Precipitation of Driest Month	30 m
BIO15	Precipitation Seasonality (Coefficient of Variation)	30 m
BIO16	Precipitation of Wettest Quarter	30 m
BIO17	Precipitation of Driest Quarter	30 m
BIO18	Precipitation of Warmest Quarter	30 m
BIO19	Precipitation of Coldest Quarter	30 m

Table 3. Parameters of each layer in the 1D-CNN model. Note: Shapes are denoted as (batch size, channels, length) for Conv1d and BatchNorm, and (batch size, feature dimension) for Flatten and FC.

Layer	Filter Size	Stride	Activation	Shape
Conv1d_1	3	1	-	(32, 16, 34)
BatchNorm_1	-	-	ReLU	(32, 16, 34)
Conv1d_2	5	1	-	(32, 32, 30)
BatchNorm_2	-	-	ReLU	(32, 32, 30)
Conv1d_3	7	1	-	(32, 64, 24)
BatchNorm_3	-	-	ReLU	(32, 64, 24)
Conv1d_4	11	1	-	(32, 128, 14)
BatchNorm_4	-	-	ReLU	(32, 128, 14)
Flatten	-	-	-	(32, 1792)
FC_1	-	-	ReLU	(32, 32)
FC_2	-	-	-	(32, 1)
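
The Shape column of Table 3 can be reproduced from the filter sizes alone. The sketch below is a minimal check, not the authors' code. It assumes an input length of 36 (the 17 topographic plus 19 bioclimatic covariates of Tables 1 and 2), no padding, and a dilation of 1; none of these is stated in the table, but together they match the listed shapes. The function names are hypothetical.

```python
# Minimal sketch: trace the tensor shapes listed in Table 3.
# Assumptions (not stated in the table): input length 36, no padding, dilation 1.

def conv1d_out_len(length, kernel, stride=1, padding=0):
    """Output length of a 1D convolution with dilation 1."""
    return (length + 2 * padding - kernel) // stride + 1


def trace_shapes(batch, in_len, layers):
    """Return (batch, channels, length) after each (channels, kernel) layer."""
    shapes = []
    length = in_len
    for channels, kernel in layers:
        length = conv1d_out_len(length, kernel)
        shapes.append((batch, channels, length))
    return shapes


# (output channels, filter size) for Conv1d_1 .. Conv1d_4, taken from Table 3
LAYERS = [(16, 3), (32, 5), (64, 7), (128, 11)]

shapes = trace_shapes(batch=32, in_len=36, layers=LAYERS)
flat = shapes[-1][1] * shapes[-1][2]  # channels * length after Flatten

for name, shape in zip(["Conv1d_1", "Conv1d_2", "Conv1d_3", "Conv1d_4"], shapes):
    print(name, shape)
print("Flatten", (32, flat))
```

BatchNorm and ReLU leave the shape unchanged, so only the convolutions need to be traced.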

Table 4. Statistical analysis of soil organic matter.

	Min (g/kg)	Max (g/kg)	Mean (g/kg)	Standard Deviation (g/kg)	Kurtosis	Skewness	CV (%)
SOM	4.9	63.7	25.4820	8.0814	4.6163	0.8877	31.71
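
As a quick consistency check on Table 4, the coefficient of variation follows from the reported mean and standard deviation, assuming the usual definition CV = SD / mean × 100:

```python
# Minimal sketch: recompute the CV in Table 4 from the reported mean and SD.
# Assumes the standard definition CV (%) = SD / mean * 100.
mean_som = 25.4820  # g/kg, from Table 4
sd_som = 8.0814     # g/kg, from Table 4

cv_percent = sd_som / mean_som * 100
print(round(cv_percent, 2))
```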

Table 5. Model performance.

Model	R²	RMSE (g/kg)	MAE (g/kg)
GPTransformer	0.3643	6.5685	5.0842
1D-CNN	0.3765	6.5050	4.7157
GPTransCNN	0.4329	6.2037	4.4975
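
For reference, the three metrics in Table 5 can be computed as in the sketch below. It uses the standard definitions of R², RMSE, and MAE. The function names and the four sample values are made up for illustration and are not data from the study.

```python
# Minimal sketch of the three evaluation metrics (standard definitions).
# The observed/predicted values below are illustrative, not study data.
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target (g/kg for SOM)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error, in the units of the target (g/kg for SOM)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


observed = [20.0, 25.0, 30.0, 35.0]   # hypothetical SOM values, g/kg
predicted = [22.0, 24.0, 29.0, 33.0]  # hypothetical predictions, g/kg

print(r2_score(observed, predicted), rmse(observed, predicted), mae(observed, predicted))
```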

Table 6. Results of ten-fold cross-validation.

Model	R²	RMSE (g/kg)	MAE (g/kg)
GPTransformer	0.3227	6.5835	5.0062
1D-CNN	0.3998	6.2091	4.4844
GPTransCNN	0.4260	6.0631	4.4334

Table 7. Results of ablation study.

Model	R²	RMSE (g/kg)	MAE (g/kg)
Transformer	0.3019	6.8830	5.0743
NPTransformer	0.2996	6.8946	5.1991
GPTransformer	0.3643	6.5685	5.0842
1D-CNN	0.3765	6.5050	4.7157
TransCNN	0.4036	6.3620	4.7035
NPTransCNN	0.4035	6.3625	4.6679
GPTransCNN	0.4329	6.2037	4.4975
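
The relative R² improvements quoted in the abstract (approximately 43%, 19%, and 15%) can be recovered from Table 7. The sketch below assumes the improvement is computed as (R²_GPTransCNN − R²_baseline) / R²_baseline × 100; the helper name is hypothetical.

```python
# Minimal sketch: relative R² gain of GPTransCNN over each baseline in Table 7.
# Assumes improvement (%) = (r2_model - r2_baseline) / r2_baseline * 100.
R2 = {
    "Transformer": 0.3019,
    "GPTransformer": 0.3643,
    "1D-CNN": 0.3765,
    "GPTransCNN": 0.4329,
}


def relative_gain(r2_model, r2_baseline):
    """Percentage improvement of r2_model over r2_baseline."""
    return (r2_model - r2_baseline) / r2_baseline * 100


gains = {
    name: relative_gain(R2["GPTransCNN"], value)
    for name, value in R2.items()
    if name != "GPTransCNN"
}

for name, gain in gains.items():
    print(f"{name}: {gain:.1f}%")
```

These are relative gains on a modest baseline; in absolute terms, R² rises by 0.131, 0.069, and 0.056, respectively.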

Table 8. Results of ten-fold cross-validated ablation analysis.

Model	R²	RMSE (g/kg)	MAE (g/kg)
Transformer	0.2963	6.7240	5.0769
NPTransformer	0.3096	6.6560	4.9973
GPTransformer	0.3227	6.5835	5.0062
1D-CNN	0.3998	6.2091	4.4844
TransCNN	0.4151	6.1289	4.4499
NPTransCNN	0.4135	6.1249	4.4925
GPTransCNN	0.4260	6.0631	4.4334

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
