Research on Soil Pore Segmentation of CT Images Based on MMLFR-UNet Hybrid Network

Qin, Changfeng; Zhang, Jie; Duan, Yu; Li, Chenyang; Dong, Shanzhi; Mu, Feng; Chi, Chengquan; Han, Ying

doi:10.3390/agronomy15051170


by Changfeng Qin ¹, Jie Zhang ¹, Yu Duan ¹, Chenyang Li ¹, Shanzhi Dong ¹, Feng Mu ¹, Chengquan Chi ¹,* and Ying Han ²,³,⁴,*

¹ School of Information Science and Technology, Hainan Normal University, Haikou 571158, China

² College of Geography and Environmental Science, Hainan Normal University, Haikou 571158, China

³ Key Laboratory of Earth Surface Processes and Environmental Change of Tropical Islands, Haikou 571158, China

⁴ Chengmai Meiting Agroforestry Complex Ecosystem Hainan Observation and Research Station, Chengmai 571900, China

* Authors to whom correspondence should be addressed.

Agronomy 2025, 15(5), 1170; https://doi.org/10.3390/agronomy15051170

Submission received: 10 April 2025 / Revised: 8 May 2025 / Accepted: 9 May 2025 / Published: 11 May 2025

(This article belongs to the Section Precision and Digital Agriculture)


Abstract

Accurate segmentation of soil pore structure is crucial for studying soil water migration, nutrient cycling, and gas exchange. However, CT images of complex soil environments are low in contrast and high in noise, so traditional segmentation methods show obvious deficiencies in accuracy and robustness. This paper proposes a hybrid model (MMLFR-UNet) that combines a Multi-Modal Low-Frequency Reconstruction algorithm (MMLFR) with UNet. MMLFR enhances the expression of key features by extracting low-frequency image signals and suppressing noise through multi-scale spectral decomposition, whereas UNet excels at restoring segmentation detail and handling complex boundaries by virtue of its encoder-decoder structure and skip connections. An undisturbed soil column, classified as Ferralsols (FAO/UNESCO), was collected in Hainan Province, China; CT scans were used to acquire high-resolution images, and preprocessing operations such as fixed-layer sampling, cropping, and enhancement produced high-quality datasets suitable for deep learning. The results show that MMLFR-UNet outperforms UNet and traditional methods (e.g., Otsu and Fuzzy C-Means (FCM)) in terms of Intersection over Union (IoU), Dice Similarity Coefficient (DSC), Pixel Accuracy (PA), and boundary similarity. Notably, the model exhibits exceptional robustness and precision in segmentation tasks involving complex pore structures and low-contrast images.

Keywords: 2Dvmd; UNet; image low-frequency signal; soil pore segmentation; CT

1. Introduction

Soil is a porous medium formed by the geometric arrangement of solid particles (mineral and organic matter), water, air, and microorganisms, and its pore structure consists of the interconnected void spaces between the solid phases. The characteristics of soil pores, including their geometry, spatial configuration, and connectivity, largely determine soil gas permeability and water-retention properties [1,2,3,4]. Soil pore structure is also a key factor in maintaining essential soil biogeochemical and biophysical processes [5,6]. Quantitative characterization of soil pore structure is therefore necessary for understanding the mechanisms of water and gas transport, which in turn is essential for accurate modeling and prediction of physical and chemical processes in soils [7].

Conventional soil physics has historically relied on empirical or physically-based models based on Representative Elementary Volume (REV) assumptions to achieve such quantitative characterizations [8,9]. These models are frequently employed to estimate parameters such as porosity, permeability, and specific surface area [10], often utilizing classical flow equations like Darcy’s law and presuming a homogeneous medium at a defined spatial scale. Nevertheless, REV-based approaches are constrained in their capacity to capture the microstructural characteristics, connectivity, and heterogeneity of complex soil pore networks. With the advent of computed tomography (CT) imaging, it is now possible to visualize the internal three-dimensional pore architecture of soil samples at micrometer-scale resolution, allowing direct analysis of spatial variability and structural complexity. This capability enhances our understanding of fluid dynamics in soil systems and provides essential data for investigating microbial community distributions and plant–root interactions [11,12,13]. As a non-destructive and high-resolution technique, X-ray CT has become a powerful tool in the field of soil science, enabling the visualization of pore structures in both two-dimensional (2D) slices and three-dimensional (3D) reconstructions, and supporting accurate quantification of their morphological characteristics.

Despite the advancement of CT imaging, accurately segmenting complex soil pore structures from CT images remains challenging [14]. Traditional image segmentation methods—such as thresholding, clustering, and morphology-based techniques—have been widely applied, yet each faces specific limitations when dealing with low contrast, noise, and overlapping textures [15,16,17,18,19]. Thresholding methods such as Otsu have been observed to be sensitive to grayscale variations and noise, resulting in inaccurate delineation in complex scenarios [20,21]. Clustering methods such as Fuzzy C-Means have been shown to be more robust to fuzzy edges [22]. However, they have been observed to struggle with grayscale overlaps in multi-scale structures [23,24]. Morphological methods have been demonstrated to enhance geometrical features; however, they are prone to the loss of detail in pores that exhibit irregular shapes [25]. Overall, these traditional methods lack robustness in the face of soil heterogeneity and imaging artifacts, resulting in segmentation outputs that fall short of practical application demands [26,27].

To address these challenges, deep learning-based models have gained traction in recent years. UNet, a widely used encoder-decoder architecture with skip connections, has demonstrated excellent performance in medical and soil image segmentation by capturing multi-scale spatial features and preserving fine boundary information [28,29,30,31,32,33,34]. Despite its strong performance in soil pore segmentation, UNet still shows shortcomings when dealing with complex soil structures [35]. Specifically, it is less effective at segmenting small pores and edge regions: when the pore boundary is blurred and difficult to recognize, or the pore interior has complex morphology, UNet often fails to output accurate segmentation results. In addition, during feature extraction UNet repeatedly passes on low-resolution information, which may reduce segmentation accuracy on high-noise and low-contrast images. Like other neural networks, UNet also requires a large number of high-quality labeled Mask images, and its inputs usually need to be normalized, in order to achieve good generalization [36,37]. Manual annotation remains labor-intensive and time-consuming, further constraining practical deployment [38,39].

To compensate for these shortcomings of UNet, we propose the Multi-Modal Low-Frequency Reconstruction algorithm (MMLFR). MMLFR is an improvement on 2Dvmd, whose frequency-band decomposition capability allows it to handle noise and non-homogeneity in images and thus improve the resolution of image details. For example, a defogging technique based on 2Dvmd and the dark-channel prior has been proposed to address the low-contrast problem in CT images [40]. In preprocessing, MMLFR decomposes images into multiple frequency bands for targeted enhancement, thus improving overall segmentation. The reconstructed image is input to the UNet network, where the encoder downsamples it and extracts features that are passed to the decoder; the decoder then upsamples layer by layer using transposed convolution, which not only improves prediction efficiency but also better preserves the important edge features in the soil image.

In this study, we propose a hybrid segmentation model, MMLFR-UNet, which integrates Multi-Modal Low-Frequency Reconstruction (MMLFR) based on two-dimensional variational mode decomposition (2Dvmd) with the UNet architecture to enable precise segmentation of heterogeneous soil CT images. The objectives of this study are as follows: (1) to compare the performance of MMLFR-UNet with the baseline UNet and with traditional pore identification methods (e.g., Otsu and Fuzzy C-Means) through both quantitative and qualitative experiments; and (2) to provide technical support for the subsequent quantitative characterization of soil pore structures.

2. Materials and Methods

2.1. Soil Sampling

Soil sampling [5,41,42] was conducted at the Chengmai Meiting Agroforestry Complex Ecosystem Hainan Observation and Research Station, Hainan Province, China (19°49′ N, 110°5′ E). Two undisturbed cylindrical soil columns (10 cm in diameter and 10 cm in height) were randomly collected from a well-structured A horizon of Ferralsols (FAO/UNESCO), which correspond to Oxisols (USDA). Hainan Island, situated in the tropics, features a warm, humid climate with abundant rainfall and deep, fertile, red-weathered soils ideal for tropical agriculture. The bulk density at a depth of 0 to 20 cm is 1.37 g cm⁻³. The soil texture is classified as silt loam, composed of 29.33% sand, 44.17% silt, and 26.50% clay. The chemical properties of the soil samples were as follows: pH (5.67), soil organic matter (11.13 g·kg⁻¹), total phosphorus (0.58 g·kg⁻¹), and total nitrogen (0.89 g·kg⁻¹). To analyze these soil properties in depth, we performed CT scans on the collected soil cores, obtaining single-channel grayscale images across three planes (xy, yz, and zx). The selected dataset includes 6450 slices along the Z-axis, each with dimensions of 3000 × 3000 pixels, forming a three-dimensional volume of [3000, 3000, 6450]. These high-resolution images enable detailed observation of soil structures, providing robust foundational data for subsequent research.

2.2. Multi-Modal Low-Frequency Reconstruction (MMLFR)

The Multi-Modal Low-Frequency Reconstruction algorithm (MMLFR) is based on two-dimensional variational mode decomposition (2Dvmd), which can be traced back to the variational mode decomposition (vmd) proposed by Dragomiretskiy in 2014 [43]. The vmd method decomposes a one-dimensional signal into a set of intrinsic mode functions (IMFs), each of which is highly concentrated within a specific frequency range. 2Dvmd extends this idea to two-dimensional images, decomposing the input image into multiple two-dimensional IMFs, with each mode representing different frequency components or features [44]. The original image can thus be expressed as a set of modal sub-images with local features, providing multi-scale features for subsequent analysis or model training and thereby improving the efficiency of image processing and understanding.

The Multi-Modal Low-Frequency Reconstruction algorithm proposed in this paper decomposes the input image into several modes, measures the proportion of low-frequency energy in each, and recombines the low-frequency modes to obtain a low-noise image. For each mode, the frequency-domain representation is first obtained by a two-dimensional Fourier transform. The low-frequency energy is then calculated within a selected central region of the spectrum, and the ratio of this low-frequency energy to the total energy is used as a measure of the proportion of low-frequency components.

The overall workflow of the proposed MMLFR-UNet is depicted in Figure 1, outlining the methodology used in this study. Initially, raw CT soil images undergo conventional preprocessing techniques to normalize the data and reduce basic noise. Subsequent to this, the Multi-Modal Low-Frequency Reconstruction (MMLFR) algorithm is implemented in order to decompose each image into multiple sub-modalities, with the purpose of capturing distinct frequency components and enhancing the representation of structural details. Thereafter, an FFT-based frequency domain metric is utilized to identify and suppress noise components automatically. Subsequently, a selection of sub-modes is recombined in order to reconstruct a denoised version of the input image, thereby enhancing critical features while minimizing interference. The reconstructed image is then fed into a UNet-based encoder-decoder architecture. During the testing phase, the trained MMLFR-UNet model generates segmentation predictions, which are then compared to ground truth labels and benchmarked against traditional methods. In order to perform a quantitative evaluation of performance, the standard metrics are adopted, including Intersection over Union (IoU), Pixel Accuracy (PA), Dice Similarity Coefficient (DSC), and Boundary Similarity (Boundary F1-score). The effectiveness and robustness of the proposed approach are collectively validated by these metrics [45].

During the iterative process, the center frequency of each mode and its frequency-domain distribution are updated through the bandwidth penalty term and the Lagrange multiplier until the convergence condition is satisfied. Finally, the low-frequency modes obtained from the inverse transformation are recombined to synthesize a new low-noise image. The pseudo-code of the algorithm is as follows (Algorithm 1):

Algorithm 1: Multi-Modal Low-Frequency Reconstruction Algorithm (MMLFR)

Input: Original image $f(x, y)$, number of modes $K$, bandwidth parameters $\{α_{k}\}$, low-frequency energy threshold $γ$

Output: Reconstructed low-frequency image $f_{low}(x, y)$

Initialize modes $\{u_{k}^{0}(x, y)\}$, center frequencies $\{ω_{k}^{0}\}$, Lagrange multipliers $\{λ_{k}^{0}\}$, iteration counter n ← 0
Repeat until convergence:
    n ← n + 1
    for each mode k = 1 → K do
        Update frequency-domain mode $U_{k}$:
            Solve $\arg\min_{U_{k}} (α_{k} ‖∇U_{k}‖^{2} + ‖F(f) - \sum_{i=1}^{K} U_{i}^{n-1}‖^{2} + λ_{k}^{n-1} ‖U_{k}‖^{2})$
        Update center frequency $ω_{k}$:
            $ω_{k}$ ← $\sum_{ξ, η} (ξ^{2} + η^{2}) |U_{k}^{n}(ξ, η)|^{2}$ / $\sum_{ξ, η} |U_{k}^{n}(ξ, η)|^{2}$
        Convert to spatial domain:
            $u_{k}(x, y)$ ← inverse_Fourier($U_{k}(ξ, η)$)
        Update Lagrange multipliers $λ_{k}$:
            $λ_{k}$ ← $λ_{k} + τ (f - \sum_{i} u_{i}(x, y))$
    end for
    Check convergence:
    if $\sum_{k} ‖u_{k}^{n} - u_{k}^{n-1}‖^{2} / ‖u_{k}^{n-1}‖^{2} < ε$ then
        break
    end if
end repeat
for each mode k = 1 → K do
    Compute frequency-domain $U_{k}(ξ, η)$ ← Fourier($u_{k}(x, y)$)
    Apply FFT shift: $U_{k,shift}$ ← fftshift($U_{k}$)
    Compute low-frequency energy $E_{low,k}$ ← $\sum |U_{k,shift}(u, v)|^{2}$ within low-frequency window
    Compute total energy $E_{total,k}$ ← $\sum |U_{k,shift}(u, v)|^{2}$ over entire frequency domain
    Compute low-frequency energy ratio $R_{low,k}$ ← $E_{low,k}$ / $E_{total,k}$
end for
Combine selected low-frequency modes:
    $f_{low}(x, y)$ ← $\sum u_{k}(x, y)$, for all $u_{k}$ with $R_{low,k}$ > γ
return $f_{low}(x, y)$
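The mode-selection stage at the end of Algorithm 1 can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names, the square shape of the low-frequency window, and its half-width of 8 bins are assumptions, since the text does not state the window size. The 2Dvmd decomposition itself is not reproduced; two synthetic modes stand in for its output.

```python
import numpy as np


def low_frequency_ratio(mode, half_width):
    """Share of spectral energy inside a centred square window (R_low,k)."""
    spectrum = np.fft.fftshift(np.fft.fft2(mode))    # zero frequency moved to the centre
    energy = np.abs(spectrum) ** 2
    cy, cx = mode.shape[0] // 2, mode.shape[1] // 2  # position of the DC bin after fftshift
    window = energy[cy - half_width:cy + half_width + 1,
                    cx - half_width:cx + half_width + 1]
    total = energy.sum()
    return float(window.sum() / total) if total > 0 else 0.0


def reconstruct_low_frequency(modes, gamma=0.3, half_width=8):
    """Sum the modes whose low-frequency energy ratio exceeds gamma."""
    ratios = [low_frequency_ratio(m, half_width) for m in modes]
    kept = [m for m, r in zip(modes, ratios) if r > gamma]
    if not kept:
        return np.zeros_like(modes[0]), ratios
    return np.sum(kept, axis=0), ratios


# Synthetic stand-ins for two modes: one smooth, one oscillating at the Nyquist rate.
n = 64
y, x = np.mgrid[0:n, 0:n]
smooth = np.cos(2 * np.pi * x / n)
checker = (-1.0) ** (x + y)

f_low, ratios = reconstruct_low_frequency([smooth, checker], gamma=0.3)
print([round(r, 3) for r in ratios])
```

With these inputs the smooth mode is kept and the checkerboard mode is discarded, so `f_low` equals `smooth`. The threshold 0.3 is the value reported later in the text for the hybrid model.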

2.3. UNet Network Fundamentals

UNet is a classical convolutional neural network (CNN) architecture originally designed for biomedical image segmentation [46,47]. Its symmetric encoder-decoder U-shaped structure enables the network to efficiently integrate local spatial information and global semantic features, thus improving segmentation accuracy and effectiveness.

In the encoder, multiple 3 × 3 convolutional layers extract local features, followed by ReLU activation to introduce non-linearity. A series of 2 × 2 max-pooling operations progressively reduces spatial resolution while retaining essential semantic information, improving global context understanding and reducing computational cost.

The decoder restores resolution through 2 × 2 transposed convolutions. Skip connections link high-resolution features from the encoder to the corresponding decoder layers, allowing better detail preservation and more accurate segmentation. A final convolutional layer reduces the feature channels to match the number of target classes, producing the final segmentation output. This symmetric architecture, combined with skip connections, enables UNet to deliver high-performance segmentation across domains such as medical imaging and remote sensing.

2.4. MMLFR-UNet Hybrid Model

This paper proposes MMLFR-UNet, a hybrid segmentation model that combines the Multi-Modal Low-Frequency Reconstruction algorithm (MMLFR) with UNet to achieve more efficient and accurate image segmentation. By combining the multi-scale feature extraction capability of 2Dvmd with the segmentation capability of the UNet framework, the model draws on the advantages of both for image segmentation in complex scenes. Its overall structure is shown in Figure 2, which illustrates the complete process from feature extraction to image segmentation and reconstruction.

In this model, MMLFR first decomposes each input image into five intrinsic mode functions (IMFs) using 2Dvmd. These IMFs capture different frequency components, and a low-frequency energy threshold (set to 0.3) is applied to preserve meaningful information while suppressing noise. The selected IMFs are Fourier-transformed and merged to form a denoised and enhanced image. This processed image is then fed into a UNet network consisting of four encoding and four decoding layers, built from 3 × 3 convolutions with ReLU activation and using bilinear upsampling in the decoder. Skip connections are used to preserve spatial detail. The training process is optimized using RMSprop, BCEWithLogitsLoss, and a ReduceLROnPlateau scheduler. The complete training-related parameter configuration is summarized in Table 1.
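The segmentation stage described above can be sketched in PyTorch as follows. This is a hedged illustration rather than the authors' network: the class names `DoubleConv` and `SmallUNet`, the channel widths, `padding=1`, the learning rate, and the scheduler settings are all assumptions (the actual values are those of Table 1, which is not reproduced here). Only the overall shape follows the text: four encoder and four decoder stages, 3 × 3 convolutions with ReLU, bilinear upsampling, skip connections, RMSprop, BCEWithLogitsLoss, and ReduceLROnPlateau.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DoubleConv(nn.Module):
    """Two 3x3 convolutions, each followed by ReLU (padding=1 is an assumption)."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)


class SmallUNet(nn.Module):
    """Four encoder and four decoder stages with bilinear upsampling."""

    def __init__(self, in_ch=1, n_classes=1, base=16):
        super().__init__()
        widths = [base * 2 ** i for i in range(5)]  # hypothetical widths: 16 ... 256
        self.inc = DoubleConv(in_ch, widths[0])
        self.down = nn.ModuleList(
            [DoubleConv(widths[i], widths[i + 1]) for i in range(4)]
        )
        self.up = nn.ModuleList(
            [DoubleConv(widths[i + 1] + widths[i], widths[i])
             for i in reversed(range(4))]
        )
        self.head = nn.Conv2d(widths[0], n_classes, kernel_size=1)

    def forward(self, x):
        skips = []
        x = self.inc(x)
        for stage in self.down:
            skips.append(x)                      # keep high-resolution features
            x = stage(F.max_pool2d(x, 2))        # 2x2 max-pooling, then convolutions
        for stage in self.up:
            skip = skips.pop()
            # Resize to the skip's size so odd inputs such as 283 x 283 still align.
            x = F.interpolate(x, size=skip.shape[-2:], mode="bilinear",
                              align_corners=False)
            x = stage(torch.cat([skip, x], dim=1))
        return self.head(x)                      # raw logits, no sigmoid


torch.manual_seed(0)
model = SmallUNet()
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-4)  # placeholder lr
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=5)            # placeholder settings

images = torch.rand(2, 1, 283, 283)                  # stand-in for MMLFR outputs
masks = (torch.rand(2, 1, 283, 283) > 0.5).float()   # stand-in for Mask labels

logits = model(images)
loss = criterion(logits, masks)
optimizer.zero_grad()
loss.backward()
optimizer.step()
scheduler.step(loss.item())
print(tuple(logits.shape))
```

Because 283 is odd, four rounds of pooling give feature maps of 141, 70, 35, and 17 pixels per side, so the sketch resizes each decoder feature map to the size of its skip connection instead of simply doubling it.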

2.5. Experimental Environment and Computational Efficiency

To evaluate the computational efficiency and practicality of the proposed MMLFR-UNet model, all experiments were conducted on a workstation running Windows 10, configured with Python 3.10.13, PyTorch 2.1.1+cu118, and an NVIDIA RTX 4090 GPU with 24 GB of VRAM. When applied to high-resolution grayscale images of 1700 × 1700, the MMLFR preprocessing step took approximately 260.2 s and consumed 926.1 MB of memory, reflecting considerable computational demand. In contrast, processing 283 × 283 images required only 6.84 s with a peak memory usage of 25.71 MB, demonstrating a substantial improvement in efficiency. Therefore, to balance computational cost and segmentation performance, we selected 283 × 283 as the standard image size for subsequent experiments. For the UNet training stage, using this image size and training for 50 epochs, the total training time was 365.7 s, with a peak GPU memory usage of 2019.34 MB. These results indicate that while the proposed model introduces moderate computational overhead, it remains practical for large-scale soil image segmentation tasks.
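The figures above can be checked with a short calculation. The sketch below uses only the numbers reported in the text; the variable names are illustrative.

```python
# Figures reported in the text for the MMLFR preprocessing step.
large = {"side": 1700, "seconds": 260.2, "memory_mb": 926.1}
small = {"side": 283, "seconds": 6.84, "memory_mb": 25.71}

pixel_ratio = large["side"] ** 2 / small["side"] ** 2
time_ratio = large["seconds"] / small["seconds"]
memory_ratio = large["memory_mb"] / small["memory_mb"]

print(f"pixels: {pixel_ratio:.2f}x")   # 36.08x
print(f"time:   {time_ratio:.2f}x")    # 38.04x
print(f"memory: {memory_ratio:.2f}x")  # 36.02x
```

Both the runtime and the memory of the preprocessing step therefore grow roughly in proportion to the number of pixels, which is consistent with the choice of the smaller image size.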

2.6. Evaluation Indicators

To comprehensively assess the segmentation performance of different methods on soil CT images, this study adopts four commonly used metrics: Intersection over Union (IoU), Pixel Accuracy (PA), Dice Similarity Coefficient (DSC), and Boundary F1-score (BF1). These indicators evaluate segmentation quality from multiple dimensions, including region overlap, pixel-wise classification, structural similarity, and boundary delineation accuracy. The formulas for these metrics are as follows:

$$\mathrm{IoU} = \frac{TP}{TP + FP + FN} \qquad (1)$$

$$\mathrm{PA} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (2)$$

$$\mathrm{Dice} = \frac{2 \times TP}{2 \times TP + FP + FN} \qquad (3)$$

$$\mathrm{BF1} = \frac{2 \times (\mathrm{Boundary\ Precision} \times \mathrm{Boundary\ Recall})}{\mathrm{Boundary\ Precision} + \mathrm{Boundary\ Recall}} \qquad (4)$$

In Equations (1)–(3), TP denotes the number of correctly predicted pore pixels, FP the number of solid pixels misclassified as pores, FN the number of pore pixels misclassified as solid, and TN the number of correctly identified solid pixels. In Equation (4), Boundary Precision is defined as the proportion of correctly predicted boundary pixels among all predicted boundary pixels, whereas Boundary Recall is the proportion of true boundary pixels that were correctly detected.
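As a worked example of Equations (1)–(3), the sketch below computes the three confusion-matrix metrics for a toy 4 × 4 mask in which 1 marks a pore pixel. The function name and the toy arrays are illustrative. BF1 is left out because the text does not state how boundary pixels are extracted or matched.

```python
import numpy as np


def segmentation_metrics(pred, truth):
    """IoU, PA and Dice for binary masks where True/1 marks pore pixels."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.sum(pred & truth))     # pore predicted as pore
    tn = int(np.sum(~pred & ~truth))   # solid predicted as solid
    fp = int(np.sum(pred & ~truth))    # solid predicted as pore
    fn = int(np.sum(~pred & truth))    # pore predicted as solid
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 1.0
    pa = (tp + tn) / (tp + tn + fp + fn)
    dice = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 1.0
    return {"IoU": iou, "PA": pa, "Dice": dice}


truth = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [0, 0, 0, 0],
                  [0, 0, 0, 0]])
pred = np.array([[1, 1, 1, 0],
                 [1, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])

scores = segmentation_metrics(pred, truth)
print(scores)  # {'IoU': 0.6, 'PA': 0.875, 'Dice': 0.75}
```

Here TP = 3, FP = 1, FN = 1, and TN = 11, so IoU = 3/5, PA = 14/16, and Dice = 6/8. The example also shows why PA alone can flatter a result: the large solid background raises it even though a quarter of the pore pixels are missed.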

3. Results

3.1. Data Preprocessing and Annotation Strategy

Since the object of this study is soil image segmentation, the image in the Z-axis direction is selected for analysis. Figure 3a shows the soil slices in the Z-axis direction. Due to minor variations between adjacent slices in the Z-axis (6450 slices in total), fixed-interval sampling was employed to reduce data redundancy and prevent model overfitting. This selective sampling minimizes repetitive features, as illustrated in Figure 3b.

The original images underwent a series of preprocessing steps, including cropping, selection, segmentation, and contrast enhancement, to generate suitable input data for the hybrid model. The preprocessing workflow is shown in Figure 4.

Figure 4a shows the original soil image (resolution: 3000 × 3000) acquired by CT scanning. Because the image contains irrelevant information, such as the container wall, it was first cropped to the inscribed square region of the container to eliminate external interference, yielding Figure 4b (resolution: 2450 × 2450). To further improve data processing efficiency and accelerate network training, Figure 4b was then divided into multiple subregions, which were processed in a 1-in-16 manner, and the target region containing pore structure was selected as Figure 4c (resolution: 283 × 283). Subsequently, to ensure the clarity of the pore information in later analysis, traditional image processing techniques, including contrast optimization, were applied to Figure 4c to obtain a clearer result (Figure 4d). This process provides high-quality input data for the subsequent deep learning segmentation model: it improves the efficiency of network training and reduces redundant information while retaining as much as possible of the key features of the soil pore space, ensuring the validity and reliability of the data.
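The first two preprocessing steps, fixed-interval slice sampling and cropping to the central square, can be sketched as below. The sampling step of 10 and the assumption that the crop is centred are placeholders, since the text gives neither the interval nor the crop offsets; the function names are illustrative. The later split into sub-regions is not sketched because its grid is not specified.

```python
import numpy as np


def fixed_interval_indices(n_slices, step):
    """Indices of the Z-axis slices kept by fixed-interval sampling."""
    return list(range(0, n_slices, step))


def center_crop(image, size):
    """Cut a centred size x size square out of a 2D image."""
    height, width = image.shape
    top = (height - size) // 2
    left = (width - size) // 2
    return image[top:top + size, left:left + size]


kept = fixed_interval_indices(6450, step=10)       # step=10 is a placeholder
slice_xy = np.zeros((3000, 3000), dtype=np.uint8)  # stand-in for one CT slice
cropped = center_crop(slice_xy, 2450)

print(len(kept), cropped.shape)  # 645 (2450, 2450)
```

Under the centred-crop assumption, 275 pixels are removed from each side of the 3000 × 3000 slice.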

To produce high-quality datasets for model training and testing, a hybrid annotation workflow combining automatic processing with manual labeling was developed, significantly enhancing annotation efficiency and accuracy. Figure 5 illustrates the overall annotation procedure.

Starting with preprocessed images in Figure 5a, a step-by-step approach was adopted to handle complex structures and blurred boundaries within soil images:

Binarization: The image underwent binarization, extracting key structural features to simplify subsequent segmentation tasks and produce a clear binary image (Figure 5b).
Fuzzy C-Means (FCM): The binary image was further refined using the FCM algorithm, generating preliminary clustering results (Figure 5c). This method effectively addresses fuzzy boundaries and intricate backgrounds, preserving critical features.
Morphological operations: Morphological techniques, including dilation, erosion, opening, and closing operations, were applied to the clustered image, reducing noise and optimizing boundary details to highlight pore structures (Figure 5d).
Manual secondary calibration and inverse color processing: Based on the images generated by the automated steps above, the annotators carried out secondary calibration and refinement of the key regions to ensure the accuracy and completeness of the segmentation results. Finally, after inverse color processing, the final dataset Mask label was generated (Figure 5e).
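The two automatic building blocks named in the steps above, Otsu-style binarization and Fuzzy C-Means clustering, can be sketched in NumPy as follows. This is a generic textbook version of each method applied to gray values, not the authors' annotation tool: the function names, the fuzzifier `m=2.0`, the iteration count, and the synthetic data are assumptions. Morphological cleaning and manual correction are not shown.

```python
import numpy as np


def otsu_threshold(values, bins=256):
    """Threshold that maximizes the between-class variance of the histogram."""
    hist, edges = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    omega = np.cumsum(p)               # weight of the class up to each bin
    mu = np.cumsum(p * centers)        # cumulative mean up to each bin
    denom = omega * (1.0 - omega)
    between = np.zeros_like(denom)
    valid = denom > 0
    between[valid] = (mu[-1] * omega[valid] - mu[valid]) ** 2 / denom[valid]
    k = int(np.argmax(between))
    return edges[k + 1]                # right edge of the last "dark" bin


def fuzzy_c_means(values, n_clusters=2, m=2.0, n_iter=50):
    """Fuzzy C-Means on gray values; returns centers and hard labels."""
    v = np.asarray(values, dtype=float).ravel()
    centers = np.linspace(v.min(), v.max(), n_clusters)
    for _ in range(n_iter):
        dist = np.abs(v[:, None] - centers[None, :]) + 1e-12
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)   # fuzzy memberships
        um = u ** m
        centers = (um * v[:, None]).sum(axis=0) / um.sum(axis=0)
    return centers, u.argmax(axis=1)


# Synthetic gray values: 500 dark samples followed by 500 bright samples.
rng = np.random.default_rng(0)
values = np.concatenate([rng.normal(0.2, 0.03, 500),
                         rng.normal(0.8, 0.03, 500)])

t = otsu_threshold(values)
binary = values >= t                   # True for the brighter class
centers, labels = fuzzy_c_means(values)
inverted = ~binary                     # inverse color step on the binary mask
```

On well-separated data such as this, both methods recover the same two groups. They differ on real CT slices, where gray levels of pores and solids overlap, which is why the workflow adds morphological cleaning and manual correction afterwards.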

This semi-automated annotation process effectively balances efficiency and quality. On one hand, automated tools handle repetitive tasks across large-scale datasets (16 × 6450 images), significantly reducing manual workload. On the other hand, manual review and precise refinement of critical regions mitigate potential issues such as missed or incorrect annotations common in fully automated methods.

3.2. Segmentation Framework: MMLFR-UNet

3.2.1. Frequency-Based Feature Enhancement (MMLFR)

In the MMLFR-UNet hybrid model, input soil images are initially decomposed using the Multi-Modal Low-Frequency Reconstruction (MMLFR) algorithm. This step effectively isolates crucial image features via frequency separation and reduces noise interference, laying a robust foundation for subsequent feature extraction and segmentation.

As shown in Figure 6, MMLFR decomposes preprocessed images via 2Dvmd into five intrinsic mode functions (IMF1~IMF5), each representing distinct frequency components. In the example shown in the figure, IMF1, IMF2, and IMF5 mainly capture low-frequency structures, while IMF3 and IMF4 contain high-frequency details and boundaries but also introduce noise, potentially affecting segmentation accuracy.

In the subsequent feature reconstruction step, the model selectively integrates modes that preserve essential information while minimizing noise. Specifically, it computes the low-frequency energy ratio to automatically identify IMF1, IMF2, and IMF5, which are then recombined into a high-quality sub-image. This operation significantly reduces noise and emphasizes critical soil features such as pore boundaries and internal structures.

This reassembly process is vital to the entire workflow, elevating feature representation quality and ensuring greater reliability and precision in UNet’s input. This adaptive approach is particularly effective in complex soil imaging scenarios, enhancing model robustness under low contrast or high noise conditions, and laying a solid foundation for accurate segmentation results.

3.2.2. MMLFR-UNet Image Segmentation

The MMLFR-UNet hybrid model demonstrates notable advantages in soil image segmentation tasks. By integrating the Multi-Modal Low-Frequency Reconstruction (MMLFR) method with the UNet architecture, the model effectively extracts critical features and accurately segments complex soil structures. Figure 7 visually illustrates the segmentation performance using five test images, clearly highlighting the model’s capability and effectiveness.

From the experimental results in Figure 7, it can be seen that the first row is the original soil image, which contains rich pore structures and solid regions, but is difficult to segment directly due to noise interference and low contrast. After MMLFR processing (second row), the model decomposes and reorganizes the multi-frequency components in the image, successfully extracting multi-scale soil features and enhancing the recognizability of pore structures, while significantly reducing the influence of noise. The resulting feature set lays a solid foundation for the subsequent training and segmentation of UNet.

The third row is the real label Mask, which serves as a benchmark for the segmentation task, clearly labeling the pore area and background part of the soil. By comparing the predicted results of the MMLFR-UNet model in the fourth row, it can be found that the segmentation results generated by the model are highly consistent with the real labels. The model is able to accurately segment the pore regions and retain detailed features such as the continuity of the boundary and the morphological information of the small pores in the complex background, which indicates that the model performs well in capturing high-resolution features.

From the segmentation results of the five sets of test samples, it can be observed that the model performs with high robustness on different soil images. The MMLFR-UNet model can stably output high-quality segmentation results for both samples with simple structures (e.g., z_3010_5_6) and samples with more textures and complex boundaries (e.g., z_3500_1_2 and z_3510_6_2), which indicates its good adaptability to diverse soil structures.

3.2.3. Visual Comparison of Segmentation Results Across Models

In order to demonstrate more intuitively the performance difference between various types of segmentation models in soil pore structure identification, we list the representative five sets of sample segmentation results in Figure 8, covering the performance of current mainstream algorithms and the method proposed in this paper.

Figure 8 presents the segmentation results on five test samples, revealing distinct performance differences:

1. Soil Masks (first column):

The Mask provides clearly labeled pore areas with well-defined boundaries, serving as a primary reference for evaluating segmentation results.

2. MMLFR-UNet (second column):

The proposed MMLFR-UNet segmentation results match well with the Mask and can effectively capture complex pore shapes and boundaries. For example, in the z_3510_6_2 sample, the model accurately segments the pore space while suppressing the background noise, showing strong robustness and accuracy.

3. UNet (third column):

Although performance is generally stable, UNet struggles with intricate pore structures. For sample z_3500_5_6, certain edges appear blurred, and smaller pores are missed. It also proves less effective at controlling background noise compared to MMLFR-UNet.

4. Otsu (fourth column):

As a classic threshold-based method, Otsu works reasonably well on simpler images (e.g., z_3010_1_2), but in complex samples (z_3510_3_4 and z_3500_5_6), it cannot adapt to varying grayscale levels, often causing incomplete segmentation or blurred boundaries.

5. Fuzzy C-Means (fifth column):

FCM has some advantages in dealing with fuzzy boundaries, but its effect is unstable for pore structures with rich details in soil images. For example, in sample z_3500_4_5, the pore region is clearly mis-segmented, and part of the background noise is not effectively suppressed.

Figure 9 presents detailed segmentation results for sample z_3510_6_2 using different methods, with comparisons to the real label (Mask). The whole figure contains three parts: (1) original and MMLFR enhanced images (a, b); (2) Mask image (c); and (3) the comparison results of the four segmentation methods (d–g).

Top left of Figure 9: original image (a) and MMLFR-processed result (b).

The original image (a) exhibits significant noise and blurred pore boundaries, posing a challenge for direct segmentation. After applying MMLFR (b), multi-scale features become clearer, and background noise is effectively suppressed, providing high-quality input for subsequent segmentation tasks.

Top right of Figure 9: Soil Mask (c).

The Mask accurately labels pore areas and clearly defines boundary details, serving as the evaluation standard. The MMLFR-UNet results closely match this reference in terms of pore shape and boundaries.

Lower part of Figure 9: Comparative segmentation analysis (d–g).

The lower rows show the local segmentation details for z_3510_6_2 under four methods (MMLFR-UNet, UNet, Otsu, and FCM), using both overall (d1,e1,f1,g1) and zoomed-in (d2,e2,f2,g2) views that focus on pore boundaries, morphology, and small pore segmentation. MMLFR-UNet (d1,d2) achieves the best overall and detailed performance, closely matching true pore shapes. The zoomed view (d2) displays precise small pore boundaries with clear details and intact morphology, free from merging errors. In contrast, UNet (e1,e2) exhibits suboptimal performance on small pores, resulting in blurred or incomplete edges. Otsu (g1,g2), relying on a global threshold, struggles with complex boundaries and fine structures, causing incomplete small pore delineation and broken edges. FCM (f1,f2) tends to over-segment and distort boundaries, mistakenly classifying parts of the background as pores.

3.2.4. Quantitative Evaluation of Segmentation Performance

To provide a comprehensive evaluation of the segmentation performance of the proposed MMLFR-UNet, four widely recognized metrics were adopted: Intersection over Union (IoU), Pixel Accuracy (PA), Dice Similarity Coefficient (DSC), and Boundary F1-score (BF1). These metrics facilitate a multidimensional evaluation of segmentation outcomes, encompassing aspects such as spatial overlap, pixel-wise classification, structural similarity, and boundary precision.

As shown in Figure 10, we compared the segmentation performance of the four methods on these metrics. Intersection over Union (IoU, Figure 10a) evaluates how well the predicted segmentation overlaps with the ground truth; Pixel Accuracy (PA, Figure 10b) reflects the overall correctness of pixel classification; Dice Similarity Coefficient (DSC, Figure 10c) measures the similarity between predicted and actual regions; and Boundary F1-score (BF1, Figure 10d) focuses specifically on the precision and recall of boundary detection.
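As a minimal sketch of the three region-based metrics, the code below computes IoU, DSC, and PA from a predicted binary mask and its ground-truth Mask using their standard confusion-matrix definitions. The function name and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np

def region_metrics(pred, mask):
    """IoU, Dice and Pixel Accuracy for two binary masks (True = pore)."""
    pred = np.asarray(pred, dtype=bool)
    mask = np.asarray(mask, dtype=bool)
    tp = np.count_nonzero(pred & mask)     # pore predicted, pore in Mask
    fp = np.count_nonzero(pred & ~mask)    # pore predicted, background in Mask
    fn = np.count_nonzero(~pred & mask)    # background predicted, pore in Mask
    tn = np.count_nonzero(~pred & ~mask)   # background in both
    union = tp + fp + fn
    iou = tp / union if union else 1.0     # assumed convention for empty masks
    dice = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    pa = (tp + tn) / pred.size
    return {"iou": iou, "dice": dice, "pa": pa}

# Toy example: a 2x2 pore predicted one column to the right of the true pore.
mask = np.zeros((4, 4), dtype=bool)
mask[0:2, 0:2] = True
pred = np.zeros((4, 4), dtype=bool)
pred[0:2, 1:3] = True
scores = region_metrics(pred, mask)   # iou = 1/3, dice = 0.5, pa = 0.75
```

In the toy example PA stays at 0.75 even though only half of the pore is recovered, because background pixels dominate the count; this is why PA is best read alongside IoU and DSC.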

The results indicate that MMLFR-UNet consistently outperforms the other methods across all evaluation metrics, particularly excelling in challenging scenarios involving complex pore boundaries and high noise levels. It maintains superior scores in both IoU and Dice, capturing fine structural details while preserving morphological consistency. In contrast, UNet, though relatively effective, exhibits limitations in resolving smaller pore regions. Traditional methods, such as Otsu and FCM, are more susceptible to noise and often produce incomplete or fragmented segmentations, leading to substantially lower scores. Furthermore, MMLFR-UNet achieves high PA values even in low-contrast, high-noise images, underscoring its robustness. Traditional methods, on the other hand, show noticeable performance degradation under similar conditions due to their sensitivity to grayscale variability and noise artifacts. The Boundary F1-score further validates MMLFR-UNet’s effectiveness in capturing fine boundary contours. Unlike Otsu and FCM, which often blur or over-segment boundaries, MMLFR-UNet maintains stable and precise edge delineation, making it particularly suitable for segmenting intricate pore structures.
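One common way to compute a Boundary F1-score is sketched below. It assumes that a boundary pixel is a pore pixel with at least one non-pore 4-neighbour and that two boundary pixels match when they lie within a tolerance of `tol` pixels (square neighbourhood). The tolerance, the helper names, and these conventions are illustrative assumptions, not the exact definition used in this study.

```python
import numpy as np

def boundary(mask: np.ndarray) -> np.ndarray:
    """Pore pixels with at least one background 4-neighbour (image border counts as background)."""
    m = np.pad(mask, 1, constant_values=False)
    interior = (m[1:-1, 1:-1] & m[:-2, 1:-1] & m[2:, 1:-1]
                & m[1:-1, :-2] & m[1:-1, 2:])
    return mask & ~interior

def dilate(mask: np.ndarray, r: int) -> np.ndarray:
    """Binary dilation with a (2r+1) x (2r+1) square, built from shifted copies."""
    h, w = mask.shape
    p = np.pad(mask, r, constant_values=False)
    out = np.zeros_like(mask)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out |= p[dy:dy + h, dx:dx + w]
    return out

def boundary_f1(pred, mask, tol: int = 1) -> float:
    """F1 of boundary precision and recall under a pixel tolerance (assumed convention)."""
    bp = boundary(np.asarray(pred, dtype=bool))
    bm = boundary(np.asarray(mask, dtype=bool))
    n_bp, n_bm = np.count_nonzero(bp), np.count_nonzero(bm)
    if n_bp == 0 and n_bm == 0:
        return 1.0
    precision = np.count_nonzero(bp & dilate(bm, tol)) / n_bp if n_bp else 0.0
    recall = np.count_nonzero(bm & dilate(bp, tol)) / n_bm if n_bm else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# Toy example: a 3x3 pore and the same pore shifted by one column.
mask = np.zeros((8, 8), dtype=bool)
mask[2:5, 2:5] = True
pred = np.zeros((8, 8), dtype=bool)
pred[2:5, 3:6] = True
strict = boundary_f1(pred, mask, tol=0)    # 0.5
tolerant = boundary_f1(pred, mask, tol=1)  # 1.0
```

The one-pixel shift halves the score at zero tolerance, which illustrates why boundary scores are typically much lower than region scores for small, thin pores.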

Table 2 compares the average performance of the four segmentation methods across 20 test images in terms of Boundary F1-score, Dice Similarity Coefficient, IoU, and Pixel Accuracy. MMLFR-UNet achieves the highest score on every metric. For Boundary F1-score, MMLFR-UNet achieves 0.5236, surpassing UNet (0.4929) and substantially outperforming the traditional methods Otsu (0.1351) and FCM (0.2956), highlighting its strong capability in boundary handling. In terms of Dice and F1-score, MMLFR-UNet attains values of 0.8714 and 0.8751, respectively, higher than UNet (0.8692 and 0.8707), indicating better region consistency and accuracy. For IoU and Pixel Accuracy, MMLFR-UNet scores 0.7790 and 0.9883, respectively, against 0.7718 and 0.9872 for UNet, underscoring its precise identification of pore regions and overall segmentation accuracy. In contrast, the traditional methods Otsu and FCM score markedly lower, struggling with complex boundaries and multi-scale structures.
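The Dice and IoU columns of Table 2 can be cross-checked with a short calculation. For a single image the two metrics are linked by IoU = DSC / (2 − DSC); because this mapping is convex, the mean IoU over several images is at least the value implied by the mean Dice (Jensen's inequality). The sketch below assumes that the Table 2 entries are means of per-image scores; the helper name is hypothetical.

```python
# (mean Dice, mean IoU) per method, copied from Table 2.
TABLE2 = {
    "MMLFR-UNet": (0.8714, 0.7790),
    "UNet": (0.8692, 0.7718),
    "Otsu": (0.6569, 0.5069),
    "FCM": (0.3966, 0.3420),
}

def iou_from_dice(dice: float) -> float:
    """Single-image identity IoU = DSC / (2 - DSC)."""
    return dice / (2.0 - dice)

for method, (dice, reported_iou) in TABLE2.items():
    implied = iou_from_dice(dice)
    # The gap grows with the image-to-image spread of the scores.
    print(f"{method:11s} implied IoU {implied:.4f}  reported IoU {reported_iou:.4f}  "
          f"gap {reported_iou - implied:+.4f}")
```

Under this assumption every reported IoU lies at or above the value implied by the mean Dice, as expected, and the gap is largest for FCM, which would be consistent with its scores varying most between images.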

4. Discussion

4.1. Comparative Performance with Recent Segmentation Models

To evaluate the performance and applicability of the proposed MMLFR-UNet model, we conducted a comparative analysis with several recent advanced segmentation frameworks. These include the UNet-VAE model by Han et al. (2024) [48], the MFHSformer model by Bai et al. (2025) [49], and the ACFTransUNet by Song et al. (2024) [50]. The comparison below outlines the strengths and weaknesses of our method relative to these state-of-the-art techniques.

On 2D soil CT datasets, MMLFR-UNet achieved a Pixel Accuracy of 98.83%, a Dice coefficient of 0.8714, an IoU of 0.7790, and a Boundary F1-score of 0.5236. These results reflect the model’s superior ability to identify small pores, preserve boundary integrity, and resist noise interference—key challenges in soil image segmentation.

In comparison with the work of Han et al. (2024), who developed a UNet-VAE model achieving an average accuracy of 93.83% across four pore types, MMLFR-UNet offers higher overall accuracy and better adaptability to small, irregular pores. However, the UNet-VAE may offer enhanced performance in the context of multi-class segmentation, a domain not directly addressed by our current binary segmentation framework. In 2025, Bai et al. presented MFHSformer, a transformer-based architecture that achieved a reported F1-score of 84.51% and an accuracy of 99.40%. While the model demonstrates excellent precision, it is computationally intensive and may require larger datasets for optimal performance. Conversely, MMLFR-UNet demonstrates superior boundary preservation in 2D CT images and requires a reduced amount of training data. Song et al. (2024) proposed ACFTransUNet for multi-category 3D pore segmentation and reported an average accuracy of 94.12%. While the model demonstrates excellent capabilities for volumetric analysis, MMLFR-UNet is specifically tailored for 2D binary segmentation and achieves higher pixel-level accuracy with fewer parameters and lower computational cost.

4.2. Applicability and Limitations of MMLFR-UNet

The MMLFR-UNet model demonstrates strong adaptability in small-sample learning scenarios, making it particularly effective in applications where annotated datasets are limited. Its integration of 2Dvmd-based low-frequency reconstruction significantly enhances structural feature preservation, allowing accurate segmentation of small pores and complex boundaries even under conditions of low contrast and high noise. These characteristics make the model highly suitable for soil CT image analysis, where such challenges are prevalent. However, the MMLFR preprocessing—especially the Fourier decomposition and recombination stages—introduces additional computational overhead compared to traditional workflows. Furthermore, while the current work focuses on 2D binary segmentation, the generalizability of the model to 3D volumetric data and multi-class segmentation tasks remains to be further validated through expanded experimentation.
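The idea of keeping low-frequency content while discarding high-frequency noise can be illustrated with a plain Fourier low-pass filter. This is a deliberately simplified stand-in and not the 2Dvmd-based MMLFR algorithm: the radial cutoff, its default of 0.3 (expressed as a fraction of the per-axis Nyquist frequency), and the function name are assumptions chosen only for illustration.

```python
import numpy as np

def lowpass_reconstruct(image, cutoff: float = 0.3) -> np.ndarray:
    """Rebuild an image from Fourier components below a radial cutoff.

    Simplified illustration only; the study uses 2Dvmd, not a hard low-pass mask.
    """
    img = np.asarray(image, dtype=float)
    fy = np.fft.fftfreq(img.shape[0])[:, None]   # cycles per pixel, rows
    fx = np.fft.fftfreq(img.shape[1])[None, :]   # cycles per pixel, columns
    radius = np.sqrt(fx ** 2 + fy ** 2) / 0.5    # 1.0 = Nyquist along one axis
    keep = radius <= cutoff                      # binary low-pass mask
    spectrum = np.fft.fft2(img)                  # decomposition
    return np.real(np.fft.ifft2(spectrum * keep))  # recombination

# Toy example: constant background plus pixel-level checkerboard noise.
yy, xx = np.indices((8, 8))
noisy = 10.0 + np.where((yy + xx) % 2 == 0, 1.0, -1.0)
smooth = lowpass_reconstruct(noisy, cutoff=0.3)   # checkerboard is removed
```

Even this simplified version requires a forward and an inverse 2D transform per slice, which gives a sense of the extra preprocessing cost noted above; a variational decomposition into several modes adds iterations on top of that.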

5. Conclusions

This study proposes MMLFR-UNet, a hybrid segmentation model tailored for the challenging task of soil pore segmentation in CT images. By integrating Multi-Modal Low-Frequency Reconstruction (MMLFR) based on two-dimensional variational mode decomposition (2Dvmd) with the classic UNet framework, the model effectively captures subtle pore structures and complex boundaries in heterogeneous soil samples. Soil CT images are often characterized by low contrast, high noise, and multi-scale features—conditions that pose significant challenges for traditional segmentation algorithms and frequently lead to suboptimal performance. MMLFR-UNet addresses these challenges by extracting noise-robust, low-frequency features prior to deep learning-based segmentation, significantly enhancing accuracy and structural fidelity.

The model is particularly suited for small-sample machine learning scenarios where labeled CT images are limited. On a test set of 20 soil CT images, MMLFR-UNet consistently outperformed the baseline UNet and the classical Otsu and Fuzzy C-Means methods in terms of IoU, Dice, Pixel Accuracy, and Boundary F1-score. It proved especially effective in delineating small pores and preserving edge morphology, which are crucial for downstream applications such as modeling soil water flow, gas diffusion, and biological habitat analysis.

While the incorporation of Fourier-based preprocessing increases computational overhead, the resulting improvement in segmentation accuracy justifies the trade-off. In future work, we plan to extend this approach to 3D pore segmentation and multi-class tasks, thereby enhancing its generalizability and making it a valuable tool for broader soil structure studies.

Author Contributions

Conceptualization, C.Q. and C.C.; methodology, C.Q. and J.Z.; software, C.Q., J.Z. and Y.D.; validation, C.Q., Y.D., C.L., S.D., F.M., C.C. and Y.H.; formal analysis, C.Q.; investigation, C.L., S.D., F.M. and Y.H.; resources, Y.H.; data curation, C.Q., J.Z., Y.D., C.L., S.D. and F.M.; writing—original draft preparation, C.Q., C.C. and Y.H.; writing—review and editing, C.C. and Y.H.; visualization, C.Q.; supervision, C.C. and Y.H.; project administration, C.C. and Y.H.; funding acquisition, C.C. and Y.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Hainan Provincial Natural Science Foundation of China under Grants 622RC669, 322RC659, and 422RC662.

Data Availability Statement

All the data that support the findings of this study are available in the paper. Data will be made available on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

MMLFR	Multi-Modal Low-Frequency Reconstruction
2Dvmd	Two-dimensional Variational Mode Decomposition
IoU	Intersection over Union
DSC	Dice Similarity Coefficient
FCM	Fuzzy C-Means
IMF	Intrinsic Mode Function

References

1. Falconer, R.E.; Houston, A.N.; Otten, W.; Baveye, P.C. Emergent Behavior of Soil Fungal Dynamics: Influence of Soil Architecture and Water Distribution. Soil Sci. 2012, 177, 111–119.
2. Martínez, F.S.J.; Martín, M.A.; Caniego, F.J.; Tuller, M.; Guber, A.; Pachepsky, Y.; García-Gutiérrez, C. Multifractal Analysis of Discretized X-Ray Ct Images for the Characterization of Soil Macropore Structures. Geoderma 2010, 156, 32–42.
3. Mzezewa, J. Impacts of Converting Native Grassland into Arable Land and an Avocado Orchard on Soil Hydraulic Properties at an Experimental Farm in South Africa. Agronomy 2025, 15, 1039.
4. Li, Q.; Qian, Y.; Wang, Y.; Peng, X. The Relation between Soil Moisture Phase Transitions and Soil Pore Structure under Freeze–Thaw Cycling. Agronomy 2024, 14, 1608.
5. Han, Q.; Zhao, Y.; Liu, L.; Chen, Y.; Zhao, Y. A Simplified Convolutional Network for Soil Pore Identification Based on Computed Tomography Imagery. Soil Sci. Soc. Am. J. 2019, 83, 1309–1318.
6. da Silva, I.F.S.; de Carvalho Araújo, A.; de Almeida, J.D.S.; de Paiva, A.C.; Silva, A.C.; Roehl, D. Soil Structure Analysis with Attention: A Deep-Learning-Based Method for 3D Pore Segmentation and Characterization. AgriEngineering 2025, 7, 27.
7. Jia, Y.; Feng, Y.; Zhang, X.; Sun, X. Multifractal Analysis of Temporal Variation in Soil Pore Distribution. Agronomy 2024, 15, 37.
8. Gerke, K.M.; Karsanina, M.V. How Pore Structure Non-Stationarity Compromises Flow Properties Representativity (Rev) for Soil Samples: Pore-Scale Modelling and Stationarity Analysis. Eur. J. Soil Sci. 2021, 72, 527–545.
9. Borges, J.A.R.; Pires, L.F.; Cássaro, F.A.M.; Roque, W.L.; Heck, R.J.; Rosa, J.A.; Wolf, F.G. X-Ray Microtomography Analysis of Representative Elementary Volume (Rev) of Soil Morphological and Geometrical Properties. Soil Tillage Res. 2018, 182, 112–122.
10. Liu, Y.; Jeng, D.-S. Pore Structure of Grain-Size Fractal Granular Material. Materials 2019, 12, 2053.
11. Yu, Q.; Wang, J.; Tang, H.; Zhang, J.; Zhang, W.; Liu, L.; Wang, N. Application of Improved Unet and EnlightenGAN for Segmentation and Reconstruction of in Situ Roots. Plant Phenomics 2023, 5, 0066.
12. Mairhofer, S.; Zappala, S.; Tracy, S.R.; Sturrock, C.; Bennett, M.; Mooney, S.J.; Pridmore, T. Rootrak: Automated Recovery of Three-Dimensional Plant Root Architecture in Soil from X-Ray Microcomputed Tomography Images Using Visual Tracking. Plant Physiol. 2012, 158, 561–569.
13. Smith, A.G.; Petersen, J.; Selvan, R.; Rasmussen, C.R. Segmentation of Roots in Soil with U-Net. Plant Methods 2020, 16, 13.
14. Fang, H.; Zhang, N.; Yu, Z.; Li, D.; Peng, X.; Zhou, H. Micro-Ct Analysis of Pore Structure in Upland Red Soil under Different Long-Term Fertilization Regimes. Agronomy 2024, 14, 2668.
15. Schlüter, S.; Weller, U.; Vogel, H.-J. Segmentation of X-Ray Microtomography Images of Soil Using Gradient Masks. Comput. Geosci. 2010, 36, 1246–1251.
16. Abrosimov, K.N.; Gerke, K.M.; Semenkov, I.N.; Korost, D.V. Otsu’s Algorithm in the Segmentation of Pore Space in Soils Based on Tomographic Data. Eurasian Soil Sci. 2021, 54, 560–571.
17. Ojeda-Magaña, B.; Quintanilla-Domínguez, J.; Ruelas, R.; Tarquis, A.M.; Gómez-Barba, L.; Andina, D. Identification of Pore Spaces in 3D Ct Soil Images Using Pfcm Partitional Clustering. Geoderma 2014, 217, 90–101.
18. Sofou, A.; Evangelopoulos, G.; Maragos, P. Soil Image Segmentation and Texture Analysis: A Computer Vision Approach. IEEE Geosci. Remote Sens. Lett. 2005, 2, 394–398.
19. Fu, Y.; Zhao, Y.; Zhao, Y.; Han, Q. Semi-Supervised Segmentation of Multi-Scale Soil Pores Based on a Novel Receptive Field Structure. Comput. Electron. Agric. 2023, 212, 108071.
20. Thompson, M.L.; Singh, P.; Corak, S.; Straszheim, W.E. Cautionary Notes for the Automated Analysis of Soil Pore-Space Images. Geoderma 1992, 53, 399–415.
21. Bauer, B.; Cai, X.; Peth, S.; Schladitz, K.; Steidl, G. Variational-Based Segmentation of Bio-Pores in Tomographic Images. Comput. Geosci. 2017, 98, 1–8.
22. Han, Q.; Zhao, Y.; Zhao, Y.; Liu, K.; Pang, M. Soil Pore Segmentation of Computed Tomography Images Based on Fully Convolutional Network. J. Agric. Eng. 2019, 35, 128–133.
23. Cortina-Januchs, M.G.; Quintanilla-Dominguez, J.; Vega-Corona, A.; Tarquis, A.M.; Andina, D. Detection of Pore Space in Ct Soil Images Using Artificial Neural Networks. Biogeosciences 2011, 8, 279–288.
24. Wang, W.; Kravchenko, A.N.; Smucker, A.J.; Rivers, M.L. Comparison of Image Segmentation Methods in Simulated 2D and 3D Microtomographic Images of Soil Aggregates. Geoderma 2011, 162, 231–241.
25. Minaee, S.; Boykov, Y.; Porikli, F.; Plaza, A.; Kehtarnavaz, N.; Terzopoulos, D. Image Segmentation Using Deep Learning: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 3523–3542.
26. Fu, Y.; Huang, Z.; Zhao, Y.; Xi, B.; Zhao, Y.; Han, Q. A Weakly Supervised Soil Pore Segmentation Method Based on Traditional Segmentation Algorithm. Catena 2025, 249, 108660.
27. Schlüter, S.; Sheppard, A.; Brown, K.; Wildenschild, D. Image Processing of Multiphase Images Obtained Via X-Ray Microtomography: A Review. Water Resour. Res. 2014, 50, 3615–3639.
28. Cao, H.; Wang, Y.; Chen, J.; Jiang, D.; Zhang, X.; Tian, Q.; Wang, M. Swin-Unet: Unet-Like Pure Transformer for Medical Image Segmentation. In Proceedings of the European Conference on Computer Vision 2022, Tel Aviv, Israel, 23–27 October 2022.
29. Krithika Alias AnbuDevi, M.; Suganthi, K. Review of Semantic Segmentation of Medical Images Using Modified Architectures of Unet. Diagnostics 2022, 12, 3064.
30. Dai, Q.; Lee, Y.H.; Sun, H.-H.; Ow, G.; Yusof, M.L.M.; Yucel, A.C. Dmrf-Unet: A Two-Stage Deep Learning Scheme for Gpr Data Inversion under Heterogeneous Soil Conditions. IEEE Trans. Antennas Propag. 2022, 70, 6313–6328.
31. Xu, J.-J.; Zhang, H.; Tang, C.-S.; Yang, Y.; Li, L.; Wang, D.-L.; Liu, B.; Shi, B. Soil Desiccation Crack Recognition: New Paradigm and Field Application. J. Geophys. Res. Mach. Learn. Comput. 2024, 1, e2024JH000347.
32. Bai, H.; Liu, L.; Han, Q.; Zhao, Y.; Zhao, Y. A Novel Unet Segmentation Method Based on Deep Learning for Preferential Flow in Soil. Soil Tillage Res. 2023, 233, 105792.
33. Liu, L.; Han, Q.; Zhao, Y.; Zhao, Y. A Novel Method Combining U-Net with Lstm for Three-Dimensional Soil Pore Segmentation Based on Computed Tomography Images. Appl. Sci. 2024, 14, 3352.
34. Huang, H.; Lin, L.; Tong, R.; Hu, H.; Zhang, Q.; Iwamoto, Y.; Han, X.; Chen, Y.-W.; Wu, J. Unet 3+: A Full-Scale Connected Unet for Medical Image Segmentation. In Proceedings of the ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020, Barcelona, Spain, 4–8 May 2020.
35. Liu, F.; Wang, L. Unet-Based Model for Crack Detection Integrating Visual Explanations. Constr. Build. Mater. 2022, 322, 126265.
36. Seo, H.; Huang, C.; Bassenne, M.; Xiao, R.; Xing, L. Modified U-Net (Mu-Net) with Incorporation of Object-Dependent High Level Features for Improved Liver and Liver-Tumor Segmentation in Ct Images. IEEE Trans. Med. Imaging 2019, 39, 1316–1325.
37. Dyson, J.; Mancini, A.; Frontoni, E.; Zingaretti, P. Deep Learning for Soil and Crop Segmentation from Remotely Sensed Data. Remote Sens. 2019, 11, 1859.
38. Zhou, X.; Shi, P. Unet-Like Transformer for 1d Soil Stratification Using Cone Penetration Test and Borehole Data. Eng. Geol. 2024, 343, 107795.
39. Yang, Y.; Feng, C.; Wang, R. Automatic Segmentation Model Combining U-Net and Level Set Method for Medical Images. Expert Syst. Appl. 2020, 153, 113419.
40. Qin, C.; Gu, X. A Single Image Dehazing Method Based on Decomposition Strategy. J. Syst. Eng. Electron. 2022, 33, 279–293.
41. Lu, R.K. Analytical Methods of Soil and Agricultural Chemistry; Chinese Agriculture Science and Technology: Beijing, China, 1999.
42. Han, Q.; Lei, L.; Zhao, Y.; Zhao, Y. A Neighborhood Median Weighted Fuzzy C-Means Method for Soil Pore Identification. Pedosphere 2021, 31, 746–760.
43. Dragomiretskiy, K.; Zosso, D. Two-Dimensional Variational Mode Decomposition. In Proceedings of the Energy Minimization Methods in Computer Vision and Pattern Recognition 2015, Hong Kong, China, 13–16 January 2015.
44. Qin, K.; Zhang, Y.; Li, J. Application of Variational Mode Decomposition Method in Fourier Transform Profilometry. In Proceedings of the 9th International Symposium on Advanced Optical Manufacturing and Testing Technologies: Optical Test, Measurement Technology, and Equipment, Chengdu, China, 26–29 June 2018.
45. Lavrukhin, E.V.; Gerke, K.M.; Romanenko, K.A.; Abrosimov, K.N.; Karsanina, M.V. Assessing the Fidelity of Neural Network-Based Segmentation of Soil Xct Images Based on Pore-Scale Modelling of Saturated Flow Properties. Soil Tillage Res. 2021, 209, 104942.
46. Rashmi, R.; Girisha, S. A Modified U-Net for Semantic Segmentation of Liver and Liver Tumors from Ct Scans. In Proceedings of the International Conference on Computation of Artificial Intelligence & Machine Learning 2024, Jaipur, India, 18–19 January 2024.
47. Li, X.; Chen, H.; Qi, X.; Dou, Q.; Fu, C.-W.; Heng, P.-A. H-Denseunet: Hybrid Densely Connected Unet for Liver and Tumor Segmentation from Ct Volumes. IEEE Trans. Med. Imaging 2018, 37, 2663–2674.
48. Han, Q.; Song, M.; Xi, B.; Zhao, Y.; Zhao, Y. Three-Dimensional Segmentation Method of Soil Multi-Category Pores Based on Improved Unet-Vae Network. Trans. Chin. Soc. Agric. Eng. 2024, 40, 81–89.
49. Bai, H.; Han, Q.; Zhao, Y.; Zhao, Y. Mfhsformer: Hierarchical Sparse Transformer Based on Multi-Feature Fusion for Soil Pore Segmentation. Expert Syst. Appl. 2025, 272, 126789.
50. Song, M.; Zhao, Y.; Zhao, Y.; Han, Q. Acftransunet: A New Multi-Category Soil Pores 3d Segmentation Model Combining Transformer and Cnn with Concentrated-Fusion Attention. Comput. Electron. Agric. 2024, 225, 109312.

Figure 1. Framework for MMLFR-UNet.

Figure 2. Structure of MMLFR-UNet.

Figure 3. Data preview plots: (a) soil slices in the Z-axis direction; and (b) data sampling process.

Figure 4. Preprocessing workflow: (a) original soil slice; (b) inscribed image after removing container walls; (c) high-resolution subregion image following subdivision; and (d) preliminary enhancement after standardizing the image size.

Figure 5. Workflow of Mask calibration: (a) traditional image enhancement result; (b) binarization of image (a) to generate a clear binary image; (c) FCM processing of image (b) to further refine the region division; (d) morphological transformation of image (c) to further denoise and optimize boundary features; and (e) Mask label obtained from image (d) by manual secondary calibration and color inversion.

Figure 6. MMLFR schematic diagram.

Figure 7. Test data effect display diagram.

Figure 8. Comparison of the results of the contrasting methods.

Figure 9. Individual examples (z_3510_6_2). (a) Original image; (b) MMLFR image; (c) Mask image (red boxes are detail comparison areas); (d) MMLFR-UNet; (e) UNet; (f) Otsu; and (g) FCM.

Figure 10. Evaluation metrics for the four segmentation methods. (a) IoU; (b) Pixel Accuracy; (c) Dice Similarity Coefficient; and (d) Boundary_F1.

Table 1. Training parameters configuration.

Parameter Name	Value/Setting	Parameter Name	Value/Setting
alpha	5000	epochs	200
tau	0.25	batch_size	64
K	5	Learning Rate	0.0001
DC	1	scale	0.5
init	1	Initialization	Kaiming Normal
tol	K × 10⁻⁶	optimizer	RMSprop
eps	2.2204 × 10⁻¹⁶	loss	BCEWithLogitsLoss
low_freq_threshold	0.3	LR scheduler	ReduceLROnPlateau

Table 2. Comparison of metrics for different methods.

Methods	Boundary_F1	DICE	IOU	Pixel_Accuracy
MMLFR-UNet	0.5236	0.8714	0.7790	0.9883
UNet	0.4929	0.8692	0.7718	0.9872
Otsu	0.1351	0.6569	0.5069	0.9597
FCM	0.2956	0.3966	0.3420	0.5617

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
