Author Contributions
Conceptualization, S.Z., X.C., G.M. and R.T.; methodology, S.Z., X.C. and R.T.; software, S.Z.; validation, S.Z., X.C., G.M. and R.T.; formal analysis, G.M.; investigation, G.M.; resources, X.C. and R.T.; data curation, G.M.; writing—original draft preparation, S.Z. and X.C.; writing—review and editing, R.T.; visualization, S.Z., X.C., G.M. and R.T.; supervision, G.M. and R.T.; project administration, X.C.; funding acquisition, S.Z. All authors have read and agreed to the published version of the manuscript.
Figure 1. BM-UNet framework architecture. The Mamba encoder extracts hierarchical features, MAF handles irregular morphologies, HBD generates boundary maps, BGA provides boundary-aware refinement, and MFDB reconstructs segmentation masks. Junction points (●) indicate feature distribution nodes where identical feature tensors are routed to multiple processing branches without additional computational overhead.
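To make the Figure 1 data flow concrete, the following minimal PyTorch sketch wires placeholder blocks in the order the caption describes. Every layer is a stand-in (plain convolutions), and the class name `BMUNetFlow` is hypothetical; this is not the released implementation.

```python
import torch
import torch.nn as nn

class BMUNetFlow(nn.Module):
    """Data-flow sketch of Figure 1 with placeholder blocks (assumed wiring)."""
    def __init__(self, ch=64):
        super().__init__()
        self.encoder = nn.Conv2d(3, ch, 3, padding=1)  # stands in for the Mamba encoder
        self.maf = nn.Conv2d(ch, ch, 3, padding=1)     # stands in for MAF
        self.hbd = nn.Conv2d(ch, 1, 1)                 # stands in for HBD (boundary map)
        self.bga = nn.Conv2d(ch, ch, 1)                # stands in for BGA
        self.mfdb = nn.Conv2d(ch, 1, 1)                # stands in for the MFDB decoder

    def forward(self, x):
        f = self.encoder(x)             # hierarchical features
        f = self.maf(f)                 # morphology-adaptive refinement
        b = torch.sigmoid(self.hbd(f))  # explicit boundary prediction
        f = f + self.bga(f) * b         # boundary-aware refinement (junction: f reused)
        return self.mfdb(f), b          # segmentation mask + boundary map
```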
Figure 2. MAF architecture. The module processes input features through three parallel pathways: deformable convolutions for irregular shape adaptation, horizontal–vertical convolutions (1 × 5, 5 × 1) for elongated structure capture, and atrous dilated convolutions (d = 2) for multi-scale context. Context gating with Hard-sigmoid activation provides adaptive feature selection, with residual connections preserving spatial details.
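A minimal PyTorch sketch of the three Figure 2 pathways and the gating step; channel counts, the offset predictor, and the fusion layer are assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class MAFSketch(nn.Module):
    """Sketch of the Figure 2 pathways; layer sizes are assumptions."""
    def __init__(self, ch):
        super().__init__()
        self.offset = nn.Conv2d(ch, 2 * 3 * 3, 3, padding=1)  # offsets for a 3x3 deformable kernel
        self.deform = DeformConv2d(ch, ch, 3, padding=1)      # irregular-shape pathway
        self.hv = nn.Sequential(                              # elongated-structure pathway
            nn.Conv2d(ch, ch, (1, 5), padding=(0, 2)),
            nn.Conv2d(ch, ch, (5, 1), padding=(2, 0)),
        )
        self.atrous = nn.Conv2d(ch, ch, 3, padding=2, dilation=2)  # multi-scale context (d = 2)
        self.gate = nn.Sequential(                            # context gating, Hard-sigmoid
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(3 * ch, 3 * ch, 1),
            nn.Hardsigmoid(),
        )
        self.fuse = nn.Conv2d(3 * ch, ch, 1)

    def forward(self, x):
        paths = torch.cat(
            [self.deform(x, self.offset(x)), self.hv(x), self.atrous(x)], dim=1
        )
        gated = paths * self.gate(paths)  # adaptive per-channel pathway selection
        return x + self.fuse(gated)       # residual connection preserves spatial detail
```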
Figure 3. HBD architecture. The module combines fine-scale edge features, enhanced with fixed Sobel gradient operators and learnable enhancement coefficients, with high-level semantic features processed through Mamba sequence modeling. Cross-scale fusion via upsampling integrates enhanced edge features with context-aware representations to generate explicit boundary predictions for camouflaged structure detection.
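The Sobel-based edge enhancement and cross-scale fusion of Figure 3 can be sketched as below. The Mamba sequence-modeling branch is replaced by a 1 × 1 convolution placeholder, and the single scalar `alpha` stands in for the learnable enhancement coefficients.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HBDSketch(nn.Module):
    """Sketch of Figure 3: fixed Sobel gradients with a learnable gain,
    fused with upsampled semantic features."""
    def __init__(self, ch_edge, ch_sem):
        super().__init__()
        sobel_x = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        kernel = torch.stack([sobel_x, sobel_x.t()]).unsqueeze(1)        # (2, 1, 3, 3)
        self.register_buffer("sobel", kernel.repeat(ch_edge, 1, 1, 1))   # fixed, not trained
        self.alpha = nn.Parameter(torch.ones(1))       # learnable enhancement coefficient
        self.semantic = nn.Conv2d(ch_sem, ch_edge, 1)  # stand-in for Mamba sequence modeling
        self.head = nn.Conv2d(2 * ch_edge, 1, 1)       # boundary prediction

    def forward(self, fine, coarse):
        grads = F.conv2d(fine, self.sobel, padding=1, groups=fine.shape[1])  # per-channel Sobel
        edges = fine + self.alpha * (grads[:, 0::2] + grads[:, 1::2])        # enhanced edge features
        sem = F.interpolate(self.semantic(coarse), size=fine.shape[-2:],
                            mode="bilinear", align_corners=False)            # cross-scale upsampling
        return torch.sigmoid(self.head(torch.cat([edges, sem], dim=1)))      # explicit boundary map
```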
Figure 4. BGA architecture. The module leverages boundary predictions from HBD to generate attention weights for multi-scale feature enhancement. Boundary attention weights guide feature refinement through interpolation and convolution operations, creating boundary-aware representations for precise edge delineation.
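A sketch of the Figure 4 mechanism, assuming the boundary map from HBD is resized to each feature scale and used as a multiplicative attention weight; the `1 + attn` residual form is an assumption that keeps non-boundary features intact.

```python
import torch.nn as nn
import torch.nn.functional as F

class BGASketch(nn.Module):
    """Sketch of Figure 4: boundary-guided attention over multi-scale features."""
    def __init__(self, channels):
        super().__init__()
        self.refine = nn.ModuleList(nn.Conv2d(c, c, 3, padding=1) for c in channels)

    def forward(self, feats, boundary):
        out = []
        for f, conv in zip(feats, self.refine):
            attn = F.interpolate(boundary, size=f.shape[-2:], mode="bilinear",
                                 align_corners=False)  # match this feature scale
            out.append(conv(f * (1 + attn)))           # emphasize boundary regions, keep interior
        return out
```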
Figure 5. MFDB architecture. The block combines Mamba’s global sequence modeling pathway (BM-UNet-reshape) with multi-scale MLP local processing using parallel depthwise convolutions (1 × 1, 3 × 3, 5 × 5, 7 × 7). Features are concatenated and processed through ReLU6, dropout, and 1 × 1 convolution, with residual connections ensuring stable gradient flow for boundary-aware reconstruction.
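The local branch of Figure 5 maps naturally onto parallel depthwise convolutions. In this sketch the Mamba global pathway is a 1 × 1 convolution placeholder, and the dropout rate is assumed.

```python
import torch
import torch.nn as nn

class MFDBSketch(nn.Module):
    """Sketch of the Figure 5 block; the Mamba global pathway is a placeholder."""
    def __init__(self, ch, p_drop=0.1):
        super().__init__()
        self.global_path = nn.Conv2d(ch, ch, 1)  # stand-in for Mamba sequence modeling
        self.dw = nn.ModuleList(                 # parallel depthwise convolutions
            nn.Conv2d(ch, ch, k, padding=k // 2, groups=ch) for k in (1, 3, 5, 7)
        )
        self.proj = nn.Sequential(               # ReLU6 -> dropout -> 1x1 conv, per the caption
            nn.ReLU6(inplace=True),
            nn.Dropout2d(p_drop),
            nn.Conv2d(4 * ch, ch, 1),
        )

    def forward(self, x):
        local = torch.cat([dw(x) for dw in self.dw], dim=1)  # multi-scale local processing
        return x + self.global_path(x) + self.proj(local)    # residual for stable gradients
```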
Figure 6. Dataset visualization demonstrating camouflage challenges. Top row: original images across pathological conditions (Normal, 56Nx, DN, NEP25, HuBMAP-1, and HuBMAP-2). Bottom row: colored overlays highlighting boundary degradation progression.
Figure 7. KPIs2024 performance matrix visualization. Heatmap showing mIoU (%) across all methods and pathological conditions, with color intensity indicating segmentation accuracy.
Figure 8. Pixel-wise classification ROC curves for different methods on the KPIs2024 test set. BM-UNet achieves the highest AUC (0.948), demonstrating superior discriminative capability across all operating points. The curves show True Positive Rate vs. False Positive Rate for pixel-wise glomerular classification.
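For reference, the ROC curves and AUC values in Figure 8 follow the standard pixel-wise recipe sketched below with scikit-learn. The arrays here are synthetic stand-ins for a model's per-pixel foreground probabilities and the binary ground-truth mask.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Synthetic stand-ins: gt is the flattened binary mask, prob the per-pixel score.
rng = np.random.default_rng(0)
gt = rng.integers(0, 2, 10_000)
prob = np.clip(gt * 0.7 + rng.normal(0.3, 0.25, 10_000), 0, 1)

fpr, tpr, _ = roc_curve(gt, prob)   # TPR vs. FPR over all thresholds
print(f"AUC = {auc(fpr, tpr):.3f}")
```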
Figure 9. Multi-dimensional performance assessment. (a) Speed–accuracy trade-off; (b) computational cost vs. performance; (c) model size vs. performance; and (d) overall efficiency ranking.
Figure 10. Qualitative segmentation comparison on the KPIs2024 dataset. Columns show the original image, ground truth, and results from DeepLabV3+, Swin-UNet, SegFormer, SINetV2, BiRefNet, and BM-UNet. In the ground truth column, the glomerular regions are highlighted in green, while in all model prediction columns, the segmented glomeruli are shown in white against a black background.
Figure 11. Cross-dataset validation on HuBMAP samples. Segmentation results comparing different methods on human kidney tissue images. In the ground truth column, the glomerular regions are highlighted in green, while in all model prediction columns, the segmented glomeruli are shown in white against a black background.
Figure 12. Comprehensive error analysis visualization. Color-coded error maps: green (TP), red (FP), blue (FN), and gray (TN), with quantitative metrics for each method.
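The Figure 12 color coding reduces to a per-pixel confusion-matrix lookup; a small NumPy helper using the same colors (green TP, red FP, blue FN, gray TN) is sketched here.

```python
import numpy as np

def error_map(pred, gt):
    """Color-code pixel outcomes as in Figure 12 (pred and gt are binary 2D arrays)."""
    rgb = np.zeros((*gt.shape, 3), dtype=np.uint8)
    rgb[(pred == 1) & (gt == 1)] = (0, 255, 0)      # true positive  -> green
    rgb[(pred == 1) & (gt == 0)] = (255, 0, 0)      # false positive -> red
    rgb[(pred == 0) & (gt == 1)] = (0, 0, 255)      # false negative -> blue
    rgb[(pred == 0) & (gt == 0)] = (128, 128, 128)  # true negative  -> gray
    return rgb
```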
Figure 13. HuBMAP error analysis. Cross-dataset error validation showing BM-UNet’s superior performance on human tissue samples.
Figure 14. Boundary overlay analysis. Original histopathological images with segmentation boundary overlays (green contours) across different methods.
Figure 15. Progressive ablation study visualization. Top row: segmentation mask evolution; bottom row: corresponding error analysis through module integration.
Figure 16. Progressive module integration analysis demonstrating systematic performance improvements through component addition, with boundary-related modules providing the largest contributions for camouflaged structure segmentation.
Table 1. Loss function weight optimization analysis.
| λe | Seg mIoU (%) | Boundary mIoU (%) | Training Behavior |
|---|---|---|---|
| 0.1 | 93.72 | 89.84 | Boundary under-optimized |
| 0.3 | 93.64 | 90.67 | Slight improvement |
| 0.5 | 93.51 | 91.34 | Optimal balance |
| 0.7 | 93.28 | 91.52 | Segmentation degradation |
| 0.9 | 92.94 | 91.43 | Over-emphasis on boundaries |
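Table 1 sweeps the weight λe of the boundary term in the training objective. A hedged sketch of the assumed form, L = L_seg + λe · L_edge, with binary cross-entropy as a placeholder for both terms (the paper's exact loss components may differ):

```python
import torch.nn as nn

# Assumed joint objective behind Table 1; both loss terms are placeholders.
seg_loss_fn = nn.BCEWithLogitsLoss()
edge_loss_fn = nn.BCEWithLogitsLoss()
lambda_e = 0.5  # the best balance found in Table 1

def total_loss(seg_logits, seg_gt, edge_logits, edge_gt):
    """Segmentation term plus lambda_e-weighted boundary term."""
    return seg_loss_fn(seg_logits, seg_gt) + lambda_e * edge_loss_fn(edge_logits, edge_gt)
```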
Table 2. Dataset preprocessing and augmentation pipeline.
| Processing Step | Parameters | Application | Dataset Size Impact |
|---|---|---|---|
| Image Resizing | 512 × 512 pixels | All images | No change |
| Random Horizontal Flip | Probability p = 0.5 | Training only | Online augmentation |
| Random Vertical Flip | Probability p = 0.5 | Training only | Online augmentation |
| Random Cropping | Dynamic crop after resizing | Training only | Online augmentation |
| ImageNet Normalization | μ = [0.485, 0.456, 0.406]; σ = [0.229, 0.224, 0.225] | All images | No change |
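The Table 2 pipeline corresponds to a standard torchvision composition such as the sketch below. The crop parameters are hypothetical since the table only specifies a dynamic crop, and for segmentation the same geometric transforms must be applied jointly to the image and its mask.

```python
from torchvision import transforms

# Training-time pipeline mirroring Table 2 (crop parameters are assumptions).
train_tf = transforms.Compose([
    transforms.Resize((512, 512)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),
    transforms.RandomResizedCrop((512, 512), scale=(0.8, 1.0)),  # hypothetical "dynamic crop"
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```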
Table 3. KPIs2024 dataset composition and characteristics.
| Condition | Resolution | Training | Test | Total | Disease Model | Staining |
|---|---|---|---|---|---|---|
| Normal | 512 × 512 | 980 | 327 | 1307 | Healthy control | PAS |
| 56Nx | 512 × 512 | 979 | 327 | 1306 | 5/6 nephrectomy | PAS |
| DN | 512 × 512 | 975 | 325 | 1300 | Diabetic model | PAS |
| NEP25 | 512 × 512 | 976 | 325 | 1301 | Podocyte injury | PAS |
| Total | 512 × 512 | 3910 | 1304 | 5214 | 4 models | PAS |
Table 4. HuBMAP dataset specifications.
| Split | Resolution | Count | Content | Annotation |
|---|---|---|---|---|
| Training | 512 × 512 | 2576 | Human kidney | Expert |
| Test | 512 × 512 | 1104 | Human kidney | Expert |
| Total | 512 × 512 | 3680 | PAS-stained | Manual |
Table 5. Quantitative performance comparison (mIoU, %) on the KPIs2024 dataset with statistical significance analysis.
| Model Variant | Architecture | 56Nx | DN | NEP25 | Normal | Avg mIoU | Std | p-Value † |
|---|---|---|---|---|---|---|---|---|
| U-Net | CNN | 85.12 ± 0.48 | 88.34 ± 0.42 | 80.56 ± 0.54 | 91.78 ± 0.35 | 86.45 | 4.82 | <0.001 *** |
| PSPNet | CNN | 85.89 ± 0.46 | 89.72 ± 0.41 | 81.93 ± 0.52 | 92.35 ± 0.33 | 87.47 | 4.51 | <0.001 *** |
| DeepLabV3+ | CNN | 86.19 ± 0.47 | 91.70 ± 0.39 | 81.45 ± 0.51 | 93.42 ± 0.31 | 88.19 | 5.45 | <0.001 *** |
| UNet3+ | CNN | 88.32 ± 0.42 | 90.71 ± 0.40 | 83.73 ± 0.49 | 93.81 ± 0.30 | 89.14 | 4.25 | <0.001 *** |
| ConvNeXt | CNN | 88.45 ± 0.41 | 91.23 ± 0.38 | 85.79 ± 0.47 | 93.28 ± 0.29 | 89.69 | 3.12 | <0.001 *** |
| SegFormer | Transformer | 88.77 ± 0.39 | 92.77 ± 0.37 | 87.45 ± 0.44 | 94.73 ± 0.27 | 90.93 | 3.40 | 0.008 ** |
| Swin-UNet | Transformer | 87.34 ± 0.45 | 88.89 ± 0.43 | 81.72 ± 0.53 | 92.72 ± 0.32 | 87.67 | 4.56 | <0.001 *** |
| SINetV2 | COD | 89.13 ± 0.38 | 93.53 ± 0.35 | 86.57 ± 0.43 | 94.04 ± 0.26 | 90.82 | 3.59 | 0.011 * |
| PFNet | COD | 87.27 ± 0.43 | 91.85 ± 0.39 | 84.84 ± 0.48 | 93.17 ± 0.30 | 89.78 | 3.97 | <0.001 *** |
| ZoomNet | COD | 86.34 ± 0.47 | 86.14 ± 0.50 | 82.81 ± 0.52 | 91.35 ± 0.34 | 86.66 | 3.70 | <0.001 *** |
| BiRefNet | COD | 88.81 ± 0.37 | 92.82 ± 0.34 | 88.89 ± 0.42 | 93.87 ± 0.25 | 91.10 | 2.24 | 0.026 * |
| BGNet | COD | 89.23 ± 0.36 | 93.67 ± 0.33 | 89.49 ± 0.40 | 94.11 ± 0.24 | 91.63 | 2.47 | 0.042 * |
| CM-UNet | Mamba | 90.29 ± 0.33 | 93.81 ± 0.31 | 90.87 ± 0.38 | 94.34 ± 0.23 | 92.33 | 1.72 | 0.049 * |
| BM-UNet (ours) | COD-Mamba | 92.43 ± 0.29 | 94.02 ± 0.28 | 92.28 ± 0.35 | 95.32 ± 0.21 | 93.51 | 1.34 | - |
Table 6. Cross-dataset validation on the HuBMAP dataset with statistical significance.
| Model Variant | Architecture | mPA (%) | mIoU (%) | mDice (%) | p-Value † |
|---|---|---|---|---|---|
| U-Net | CNN | 96.12 ± 0.13 | 93.57 ± 0.18 | 96.71 ± 0.11 | <0.001 *** |
| UNet3+ | CNN | 96.48 ± 0.12 | 93.68 ± 0.17 | 96.77 ± 0.10 | <0.001 *** |
| PSPNet | CNN | 95.52 ± 0.15 | 93.12 ± 0.20 | 96.49 ± 0.12 | <0.001 *** |
| DeepLabV3+ | CNN | 96.25 ± 0.13 | 93.45 ± 0.18 | 96.72 ± 0.11 | <0.001 *** |
| ConvNeXt | CNN | 96.78 ± 0.11 | 93.85 ± 0.16 | 96.81 ± 0.09 | 0.034 * |
| SegFormer | Transformer | 96.55 ± 0.12 | 93.89 ± 0.16 | 96.89 ± 0.09 | 0.046 * |
| Swin-UNet | Transformer | 95.23 ± 0.16 | 92.43 ± 0.21 | 96.15 ± 0.13 | <0.001 *** |
| BM-UNet (ours) | COD-Mamba | 97.12 ± 0.10 | 94.32 ± 0.15 | 97.19 ± 0.08 | - |
Table 7. Computational efficiency analysis comparing processing speed and resource requirements.
| Method | FLOPs (G) | Params (M) | FPS |
|---|---|---|---|
| DeepLabV3+ | 46.94 | 27.23 | 237.4 |
| Swin-UNet | 30.92 | 27.15 | 121.1 |
| UNet3+ | 179.08 | 31.03 | 82.8 |
| SegFormer | 56.71 | 27.35 | 94.3 |
| SINetV2 | 26.11 | 24.96 | 96.2 |
| BiRefNet | 37.40 | 39.29 | 38.9 |
| CM-UNet | 33.05 | 30.07 | 119.7 |
| BM-UNet (ours) | 43.31 | 32.42 | 113.7 |
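The FPS column in Table 7 can be reproduced with a warm-up-then-time loop of the kind sketched here; batch size 1 and a 512 × 512 input are assumptions about the measurement protocol.

```python
import time
import torch

@torch.no_grad()
def measure_fps(model, n_iters=100, size=512, device="cuda"):
    """Assumed FPS protocol: batch size 1, GPU timing with warm-up and sync."""
    model.eval().to(device)
    x = torch.randn(1, 3, size, size, device=device)
    for _ in range(10):           # warm-up to exclude kernel compilation
        model(x)
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    for _ in range(n_iters):
        model(x)
    torch.cuda.synchronize()      # wait for all queued GPU work before stopping the clock
    return n_iters / (time.perf_counter() - t0)
```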
Table 8. Clinical deployment validation across different hardware platforms.
| Hardware Platform | Peak Memory Usage (GB) | Total Processing Time (s) | Avg. Time per Image (ms) | Throughput (Images/s) |
|---|---|---|---|---|
| NVIDIA RTX 4090 (24 GB) | 2.3 | 1.76 | 8.8 | 113.6 |
| NVIDIA RTX 3060 (12 GB) | 2.3 | 3.08 | 15.4 | 64.9 |
| NVIDIA RTX 2070 (8 GB) | 2.3 | 4.12 | 20.6 | 48.5 |
| NVIDIA GTX 1660 Ti (6 GB) | 2.3 | 8.62 | 43.1 | 23.2 |
Table 9. Comprehensive ablation study demonstrating the contribution of each BM-UNet component across multiple evaluation metrics (all values in %).
| Method | 56Nx mIoU | 56Nx mDice | 56Nx mPA | DN mIoU | DN mDice | DN mPA | NEP25 mIoU | NEP25 mDice | NEP25 mPA | Normal mIoU | Normal mDice | Normal mPA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Baseline (Mamba only) | 87.8 | 90.2 | 92.5 | 89.51 | 91.8 | 93.8 | 87.47 | 89.9 | 92.1 | 90.94 | 93.1 | 94.2 |
| +MAF | 88.9 | 91.3 | 93.4 | 90.54 | 92.9 | 94.6 | 88.96 | 91.4 | 93.2 | 91.59 | 93.9 | 94.8 |
| +MAF + HBD | 90.3 | 92.8 | 94.7 | 91.64 | 94.1 | 95.3 | 90.25 | 92.7 | 94.5 | 93.49 | 95.2 | 96.1 |
| +MAF + HBD + BGA | 91.5 | 93.9 | 95.4 | 92.99 | 95.3 | 96.1 | 91.50 | 93.8 | 95.3 | 94.09 | 95.7 | 96.5 |
| BM-UNet (Full + MFDB) | 92.43 | 94.7 | 96.1 | 94.02 | 96.2 | 96.8 | 92.28 | 94.6 | 96.0 | 95.32 | 96.9 | 97.2 |