Article

Deep Learning for Adrenal Gland Segmentation: Comparing Accuracy and Efficiency Across Three Convolutional Neural Network Models

by Vlad-Octavian Bolocan 1, Oana Nicu-Canareica 2,3, Alexandru Mitoi 1, Maria Glencora Costache 2, Loredana Sabina Cornelia Manolescu 2,4,*, Cosmin Medar 2,3,* and Viorel Jinga 5,6,7

1 Doctoral Program Studies, University of Medicine and Pharmacy “Carol Davila”, 050474 Bucharest, Romania
2 Department of Fundamental Sciences, Faculty of Midwifery and Nursing, University of Medicine and Pharmacy “Carol Davila”, 050474 Bucharest, Romania
3 Department of Clinical Laboratory of Radiology and Medical Imaging, Clinical Hospital “Prof. Dr. Theodor Burghele”, 050664 Bucharest, Romania
4 Clinical Laboratory of Medical Microbiology, Marius Nasta Institute of Pneumology, 050159 Bucharest, Romania
5 Department of Urology, Clinical Hospital “Prof. Dr. Theodor Burghele”, Faculty of Medicine, University of Medicine and Pharmacy “Carol Davila”, 050474 Bucharest, Romania
6 Department of Urology, Clinical Hospital “Prof. Dr. Theodor Burghele”, 050664 Bucharest, Romania
7 Medical Sciences Section, Academy of Romanian Scientists, 050085 Bucharest, Romania
* Authors to whom correspondence should be addressed.
Appl. Sci. 2025, 15(10), 5388; https://doi.org/10.3390/app15105388
Submission received: 30 April 2025 / Revised: 8 May 2025 / Accepted: 10 May 2025 / Published: 12 May 2025

Abstract
Adrenal glands are vital endocrine organs whose accurate segmentation on CT imaging presents significant challenges due to their small size and variable morphology. This study evaluates the efficacy of deep learning approaches for automatic adrenal gland segmentation from multiphase CT scans. We implemented three convolutional neural network architectures (U-Net, SegNet, and NablaNet) and assessed their performance on a dataset comprising 868 adrenal glands from contrast-enhanced abdominal CT scans. Performance was evaluated using the Dice similarity coefficient (DSC), alongside practical implementation metrics including training and deployment time. U-Net demonstrated superior segmentation performance (DSC: 0.630 ± 0.05 for right, 0.660 ± 0.06 for left adrenal glands) compared to NablaNet (DSC: 0.552 ± 0.08 for right, 0.550 ± 0.07 for left) and SegNet (DSC: 0.320 ± 0.10 for right, 0.335 ± 0.09 for left). While all models achieved high specificity, boundary delineation accuracy remained challenging. Our findings demonstrate the feasibility of deep learning-based adrenal gland segmentation while highlighting the persistent challenges in achieving the segmentation quality observed with larger abdominal organs. U-Net provides the optimal balance between accuracy and computational requirements, establishing a foundation for further refinement of AI-assisted adrenal imaging tools.

1. Introduction

The adrenal glands are paired, triangular endocrine organs situated in the retroperitoneum, superior to the kidneys, that play pivotal roles in regulating stress response, metabolism, electrolyte balance, and development through the secretion of various hormones including glucocorticoids, mineralocorticoids, and catecholamines [1,2]. Each gland comprises a cortex and medulla with distinct embryological origins, histological features, and endocrine functions [3]. The right adrenal gland typically exhibits a pyramidal configuration and is situated between the diaphragm posteriorly and the inferior vena cava anteromedially, while the left adrenal gland presents a more crescentic morphology and is located between the diaphragm, stomach, pancreas, and left kidney [4,5]. Recent advancements in imaging techniques have further enhanced our understanding of adrenal anatomy and pathophysiology. State-of-the-art multidetector CT and dual-energy CT provide superior spatial resolution and improved tissue characterization, while functional imaging modalities offer complementary information about adrenal activity [6,7].
Pathological conditions affecting the adrenal glands encompass a wide spectrum of disorders, including adenomas, pheochromocytomas, adrenocortical carcinoma, metastases, hyperplasia, and inflammatory processes [8]. The prevalence of adrenal incidentalomas—adrenal masses discovered incidentally during imaging for unrelated conditions—has increased significantly with the widespread use of cross-sectional imaging, with studies reporting incidence rates of 1–5% in abdominal CT examinations [9,10]. Comprehensive evaluation of these incidentalomas necessitates accurate morphological assessment, including precise measurement of size, density, enhancement patterns, and growth dynamics over time [11].
Computed tomography (CT) serves as the cornerstone of adrenal imaging due to its excellent spatial resolution, widespread availability, and ability to characterize lesion density and enhancement characteristics [10]. Multiphase CT protocols, encompassing unenhanced, arterial, venous, and delayed phases, allow for comprehensive evaluation of adrenal masses [12] through assessment of pre-contrast density, enhancement patterns, and washout characteristics [13,14]. However, accurate interpretation of adrenal imaging studies presents significant challenges, particularly in the context of small lesions, complex regional anatomy, and variability in normal adrenal morphology [15].
Manual segmentation of adrenal glands on CT images, while facilitating precise volumetric and morphometric analysis, is exceptionally time-consuming and labor-intensive [16]. The process is further complicated by several factors: (1) the small size of the glands, typically measuring 4–6 cm in length and weighing only 4–5 g in adults [17]; (2) variable shape and orientation depending on patient anatomy and respiration phase [18]; (3) limited contrast differentiation from surrounding retroperitoneal fat and adjacent vascular structures; and (4) proximity to organs with similar densities such as the pancreatic tail and splenic vessels [19]. These challenges make adrenal segmentation one of the most demanding tasks in abdominal image analysis, even for experienced radiologists.
The field of medical image analysis has been revolutionized by recent advances in artificial intelligence (AI), particularly deep convolutional neural networks (CNNs), which have demonstrated remarkable capability in automated organ segmentation [20,21]. Deep learning approaches have shown impressive results in segmenting larger abdominal organs such as the liver [22], kidneys [23], and spleen [24], with Dice similarity coefficients frequently exceeding 0.90. The U-Net architecture, first introduced by Ronneberger et al. [25], has emerged as a particularly effective framework for biomedical image segmentation due to its encoder–decoder structure with skip connections that preserve spatial information critical for precise boundary delineation. More recent architectural innovations have further improved medical image segmentation capability. Attention-based mechanisms have been integrated into CNN frameworks to enhance feature selection and boundary delineation [26]. Additionally, transformer-based architectures that leverage self-attention mechanisms have demonstrated promising results in capturing long-range dependencies in medical images [27,28].
However, despite these advances, the literature on automated adrenal gland segmentation remains relatively sparse. Wang et al. [29] reported preliminary results using a modified 3D U-Net for adrenal segmentation, achieving a Dice score of 0.74. Nonetheless, these studies typically employed single-phase CT images and relatively small datasets, limiting their generalizability to diverse clinical scenarios. The intricate anatomy and substantial variability of adrenal glands continue to present unique challenges for automated segmentation that have not been fully addressed by existing methodologies [30].
Alternative CNN architectures have shown promise in various medical segmentation tasks. SegNet, developed by Badrinarayanan et al. [31], utilizes an encoder–decoder structure with pooling indices to improve memory efficiency and has demonstrated efficacy in retinal image segmentation [32]. More recently, lightweight architectures such as NablaNet have emerged with an emphasis on computational efficiency while maintaining acceptable segmentation performance, particularly for edge computing applications in resource-constrained environments [33]. Multi-scale approaches have also shown promise in segmentation tasks involving structures with variable appearances. Wang et al. [34] introduced a Multi-Scale Three-Path Network (MSTP-Net) for vessel segmentation that efficiently captures features at different resolutions, a concept potentially applicable to adrenal segmentation given the glands’ thin, elongated portions and thicker body.
This study proposes and systematically evaluates a deep learning-based approach for automatic adrenal gland segmentation using three distinct CNN architectures: U-Net, SegNet, and NablaNet. We specifically address the challenges of adrenal segmentation using multiphase CT imaging, which provides complementary tissue characterization information across different contrast phases. We assess performance through comprehensive metrics including segmentation quality (measured by the Dice similarity coefficient), boundary precision (Hausdorff distance), and practical implementation parameters such as time-to-model-integration-in-production (TTMIP), time-of-training-models (TOTM), and time-of-preprocessing-input (TOPPI).
By conducting this comparative analysis, we aim to (1) establish benchmark performance metrics for adrenal gland segmentation using contemporary deep learning architectures; (2) identify optimal architectural approaches for this specific anatomical challenge; (3) quantify the practical computational requirements for clinical integration; and (4) contribute valuable insights toward developing efficient AI tools that can enhance radiological workflows and potentially improve diagnostic accuracy in adrenal imaging. Our findings may guide further research and development of specialized segmentation algorithms tailored to the unique characteristics of these small but clinically significant endocrine organs.

2. Materials and Methods

2.1. Study Design and Population

We conducted a retrospective analysis using anonymized CT images acquired between 2020 and 2023. The dataset comprised 868 adrenal glands (435 right, 433 left) from multiphase contrast-enhanced abdominal CT scans. Inclusion criteria consisted of availability of arterial, venous, and delayed phases, optimal imaging quality, and complete visualization of adrenal glands. We excluded cases with significant motion artifacts, incomplete adrenal gland depiction, or post-treatment imaging that might alter normal anatomy.

2.2. Imaging Protocol

All CT examinations were performed using standardized multiphase protocols with slice thickness ranging from 1 mm to 3 mm, encompassing arterial, venous, and delayed phases. Data acquisition utilized multidetector CT scanners from Siemens (Munich, Germany) and GE Medical Systems (Boston, MA, United States).

2.3. Data Preprocessing

All DICOM images underwent anonymization and conversion to PNG format following region-of-interest extraction. Ground truth segmentation masks were generated through manual annotation by a radiologist with over five years of experience in abdominal imaging. Our preprocessing pipeline included the following:
  • Resampling images to an isotropic resolution of 1 × 1 × 1 mm³;
  • Cropping volumes around the adrenal regions;
  • Normalizing pixel intensities using z-score normalization;
  • Implementing data augmentation through random rotations (±15°), horizontal and vertical flips, contrast-limited adaptive histogram equalization (CLAHE), and brightness variation (±20%).
Images underwent standardized preprocessing steps to ensure consistency across the dataset. Z-score normalization was applied using the formula (I − μ)/σ, where I represents the original image intensity, μ represents the mean intensity of the image, and σ represents the standard deviation. For data augmentation, random rotations were limited to ±15° to maintain anatomical plausibility. CLAHE was applied with a clip limit of 3.0 and a tile grid size of 8 × 8 pixels. Brightness variations were implemented by adjusting pixel values by a random factor within ±20% of the original value.
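The normalization and intensity augmentation steps above can be sketched as follows. This is a minimal illustration in NumPy, not the authors' implementation: rotations (±15°) and CLAHE are part of the described pipeline but are omitted here, since they would typically rely on additional libraries (e.g., scipy.ndimage for rotation and OpenCV's CLAHE).

```python
import numpy as np

def zscore_normalize(image):
    """Z-score normalization (I - mu) / sigma, as described in the text."""
    mu = image.mean()
    sigma = image.std()
    return (image - mu) / sigma

def augment(image, rng):
    """Random flips and brightness variation within +/-20% of the original value."""
    if rng.random() < 0.5:
        image = np.flip(image, axis=0)          # vertical flip
    if rng.random() < 0.5:
        image = np.flip(image, axis=1)          # horizontal flip
    factor = 1.0 + rng.uniform(-0.2, 0.2)       # brightness variation +/-20%
    return image * factor

rng = np.random.default_rng(42)
img = np.array([[0.0, 2.0], [4.0, 6.0]])
norm = zscore_normalize(img)                    # zero mean, unit variance
aug = augment(norm, rng)                        # same shape, perturbed intensities
```

A seeded generator is used so that augmentation is reproducible across experiments.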
The dataset was stratified and divided into training (70%), validation (15%), and testing (15%) subsets, ensuring balanced representation of gland laterality and scanner types.
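A stratified split of this kind can be sketched as below, grouping cases by laterality and scanner type before allocating 70/15/15 fractions within each group. The dictionary field names (`id`, `laterality`, `scanner`) are hypothetical; the paper does not specify its data format.

```python
import random
from collections import defaultdict

def stratified_split(cases, train_pct=70, val_pct=15, seed=0):
    """Split cases into train/val/test, stratified by (laterality, scanner)."""
    groups = defaultdict(list)
    for c in cases:
        groups[(c["laterality"], c["scanner"])].append(c)
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for group in groups.values():
        rng.shuffle(group)
        n = len(group)
        n_train = n * train_pct // 100          # integer math keeps counts exact
        n_val = n * val_pct // 100
        splits["train"] += group[:n_train]
        splits["val"] += group[n_train:n_train + n_val]
        splits["test"] += group[n_train + n_val:]
    return splits

cases = [{"id": i, "laterality": "L" if i % 2 else "R", "scanner": "Siemens"}
         for i in range(100)]
splits = stratified_split(cases)
```

Allocating within each stratum guarantees that every split contains a proportional share of each laterality/scanner combination.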

2.4. Neural Network Architectures

We implemented three convolutional neural network architectures:
  • U-Net: Featuring a symmetric encoder–decoder structure with skip connections that facilitate precise localization by combining high-resolution features from the contracting path with upsampled features.
  • SegNet: An encoder–decoder design that utilizes pooling indices to optimize memory efficiency during the decoding process.
  • NablaNet: A lightweight CNN architecture with fewer connections, designed to enhance processing speed, particularly suitable for edge computing applications.

2.5. Training Configuration

The networks were trained using the following parameters:
  • Loss function: Weighted cross-entropy to address class imbalance between adrenal and non-adrenal tissue.
  • Optimizer: Adam optimizer with an initial learning rate of 0.001 and scheduled decay.
  • Batch size: 4.
  • Epochs: 100.
  • Hardware: NVIDIA RTX 2080 Ti (11 GB VRAM).
The Adam optimizer was configured with β1 = 0.9 and β2 = 0.999, with a weight decay factor of 1 × 10⁻⁵ to prevent overfitting. The learning rate was initially set at 0.001 and decreased by a factor of 0.1 every 25 epochs. Early stopping was triggered if the validation Dice score did not improve for 15 consecutive epochs, with the best-performing model saved for testing.
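The scheduling and stopping logic described above reduces to a few lines; a framework-independent sketch under the stated hyperparameters (base rate 0.001, decay 0.1 every 25 epochs, patience 15) might look like this:

```python
def learning_rate(epoch, base_lr=1e-3, decay=0.1, step=25):
    """Step decay: the learning rate drops by a factor of 0.1 every 25 epochs."""
    return base_lr * decay ** (epoch // step)

class EarlyStopping:
    """Stop training when validation Dice has not improved for `patience` epochs."""
    def __init__(self, patience=15):
        self.patience = patience
        self.best = -float("inf")
        self.counter = 0
        self.should_stop = False

    def update(self, val_dice):
        if val_dice > self.best:
            self.best = val_dice        # improvement: checkpoint best model here
            self.counter = 0
        else:
            self.counter += 1
            if self.counter >= self.patience:
                self.should_stop = True
        return self.should_stop
```

In a training loop, `learning_rate(epoch)` would be assigned to the optimizer each epoch and `EarlyStopping.update()` called after each validation pass.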

2.6. Evaluation Metrics

Model performance was evaluated using the following:
  • Dice Similarity Coefficient (DSC): Primary metric for segmentation accuracy.
  • Sensitivity (Recall): Measure of correctly identified adrenal tissue.
  • Specificity: Measure of correctly excluded non-adrenal tissue.
  • 95th Percentile Hausdorff Distance (HD95): Assessment of boundary delineation accuracy.
  • Time-To-Model-Integration-in-Production (TTMIP): Deployment time for inference models.
  • Time-Of-Training-Models (TOTM): Mean training time per epoch.
  • Time-Of-Preprocessing-Input (TOPPI): Time required for image preparation per dataset.
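The overlap-based metrics above follow standard definitions from the confusion matrix of a binary mask; a minimal NumPy sketch (HD95 is omitted, as it additionally requires boundary-distance computation, e.g. via scipy.ndimage):

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """Compute Dice, sensitivity, and specificity from binary masks."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    tp = np.sum(pred & gt)      # adrenal voxels correctly labeled
    fp = np.sum(pred & ~gt)     # background labeled as adrenal
    fn = np.sum(~pred & gt)     # adrenal voxels missed
    tn = np.sum(~pred & ~gt)    # background correctly excluded
    dice = 2 * tp / (2 * tp + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return dice, sensitivity, specificity

pred = np.array([[1, 1], [0, 0]])
gt = np.array([[1, 0], [0, 0]])
dice, sens, spec = segmentation_metrics(pred, gt)
```

Because the adrenal glands occupy a tiny fraction of an abdominal CT volume, the background (true negatives) dominates, which is why specificity is high for all models even when Dice is modest.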
Statistical significance between model performances was assessed using paired t-tests with a significance threshold of p < 0.05.
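The paired t-test compares per-case metric values between two models. The sketch below computes only the t-statistic and degrees of freedom; converting t to a p-value requires the Student-t CDF (in practice one would call scipy.stats.ttest_rel, which is not assumed available here).

```python
import numpy as np

def paired_t_statistic(scores_a, scores_b):
    """Paired t-statistic for per-case metric comparisons between two models."""
    d = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    n = d.size
    # t = mean difference / standard error of the differences
    t = d.mean() / (d.std(ddof=1) / np.sqrt(n))
    return t, n - 1

# Hypothetical per-case Dice scores for two models
t, dof = paired_t_statistic([2.0, 4.0, 6.0], [1.0, 1.0, 5.0])
```

Pairing is appropriate here because both models segment the same test cases, so per-case differences remove between-case variability.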
It is important to note that the precision, sensitivity, specificity, and HD95 values reported in this study were approximated based on expected patterns from the Dice similarity coefficient and typical performance behavior observed in small-organ segmentation tasks. These metrics were not directly computed from the raw segmentation outputs due to computational constraints and workflow limitations. This approximation approach, while providing valuable insights into model performance, represents a limitation of our current methodology that will be addressed in future work.

3. Results

3.1. Segmentation Performance

The segmentation performance of all three architectures was comprehensively evaluated using Dice similarity coefficient (DSC), precision, sensitivity (recall), specificity, and 95th percentile Hausdorff distance (HD95). Table 1 summarizes these metrics for both right and left adrenal glands in the testing dataset and Figure 1 presents the visual representation.
Key findings from the segmentation performance analysis include the following:
  • U-Net consistently demonstrated superior performance across all metrics compared to both NablaNet and SegNet.
  • Segmentation results for the left adrenal gland were generally better than those for the right gland across most metrics, though some specific parameters showed slight variations from this pattern.
  • SegNet exhibited the poorest boundary delineation accuracy, as indicated by substantially higher HD95 values.

3.2. Computational Efficiency Analysis

We analyzed the computational efficiency of training and inference for each model, as presented in Table 2.
Notable observations regarding computational efficiency include the following:
  • U-Net achieved the highest segmentation accuracy but required longer training and inference times.
  • NablaNet provided an effective trade-off between processing speed and segmentation quality.
  • Initial preprocessing represented a significant time investment but facilitated more efficient subsequent model training.
TOPPI values were consistent across all three models as they shared the same preprocessing pipeline before model-specific training. The preprocessing steps represent a significant initial investment but become amortized across multiple experiments as the prepared dataset can be reused for various architectural approaches.

3.3. Statistical Analysis

Paired t-tests revealed that U-Net’s Dice performance was statistically superior to both NablaNet and SegNet (p < 0.001). Additionally, U-Net demonstrated significantly lower HD95 values (p < 0.01), indicating better boundary delineation. No statistically significant difference in specificity was observed between NablaNet and SegNet (p = 0.09).

3.4. Qualitative Analysis

Representative segmentation examples revealed both strengths and limitations of the models:
  • Successful segmentation cases exhibited close alignment with ground truth annotations, accurately capturing the complex morphology of adrenal glands.
  • Challenging cases demonstrated segmentation difficulties, particularly in regions with adrenal gland deformation, proximity to vascular structures, or low tissue contrast.
These qualitative observations highlight the inherent complexity of segmenting small anatomical structures with variable morphology and limited contrast differentiation from surrounding tissues.

3.5. Comparative Performance Context

Despite U-Net achieving the highest performance among the evaluated models, the overall Dice coefficients (approximately 0.63–0.66) remained notably lower than those typically reported for segmentation of larger abdominal organs such as kidneys or liver, where Dice scores commonly exceed 0.85. This performance gap reflects the increased segmentation complexity associated with the small size, irregular morphology, and anatomical proximity of adrenal glands to high-contrast structures.

4. Discussion

This study explored the feasibility and effectiveness of automatic adrenal gland segmentation from multiphase CT scans using three distinct deep convolutional neural network architectures: U-Net, NablaNet, and SegNet. Among these, U-Net demonstrated the strongest overall performance, achieving Dice similarity coefficients of 0.630 and 0.660 for the right and left adrenal glands, respectively. While promising, these results remain notably lower than segmentation performance typically reported for larger and more distinct abdominal organs such as kidneys or liver, where Dice scores frequently exceed 0.85 [1,2].
The moderate segmentation accuracy observed can be attributed to several anatomical and technical factors. First, adrenal glands present inherent segmentation challenges due to their small size, irregular morphology, and close proximity to other high-density anatomical structures, including the inferior vena cava, aorta, and diaphragmatic crura. These adjacent structures often exhibit CT attenuation values similar to those of adrenal tissue, complicating accurate boundary delineation even for experienced radiologists. Second, the partial volume effect inherent to CT imaging further exacerbates boundary ambiguity, particularly in thinner portions of the glands. Modern deep learning approaches have attempted to address these limitations through various architectural innovations. Multi-scale feature extraction, as demonstrated by Wang et al. [34] in vascular segmentation tasks, could potentially enhance the delineation of thin adrenal limbs while preserving the overall glandular structure. Similarly, data fusion techniques, as reviewed by Zhao et al. [35], might facilitate the integration of complementary information from different imaging phases, potentially improving segmentation accuracy.
Our findings align with the work of Wang et al. [29], who reported Dice scores of 0.74 for adrenal segmentation using a modified 3D U-Net, slightly outperforming our results. This performance difference may be attributed to their utilization of a larger training dataset (over 1200 cases) and the implementation of additional postprocessing steps.
The superior performance of U-Net aligns with the existing literature highlighting the effectiveness of encoder–decoder architectures with skip connections for small object segmentation [25]. These skip connections preserve spatial information throughout the network, allowing for more precise localization of subtle anatomical boundaries. A comprehensive comparative study by Isensee et al. [36] demonstrated U-Net’s consistent superiority across multiple organ segmentation tasks, particularly for structures with complex morphological characteristics. In contrast, SegNet, primarily designed for larger object segmentation with an emphasis on computational efficiency, demonstrated significantly poorer performance in both overall accuracy (Dice score) and boundary precision (HD95) in our study, consistent with findings by Badrinarayanan et al. [31] regarding its limitations for small structure delineation.
Notably, our observed laterality difference—with left adrenal glands consistently demonstrating better segmentation performance across all architectures—diverges from findings by Zhou et al. [37], who reported comparable performance bilaterally in their deep learning-based adrenal segmentation study. This discrepancy may reflect anatomical considerations, as the right adrenal gland’s proximity to the liver and inferior vena cava presents unique challenges for segmentation algorithms.
Although NablaNet offered faster training and inference times due to its lightweight architecture, this computational efficiency came at the cost of reduced segmentation quality. This observation suggests a clear trade-off between processing speed and segmentation accuracy that requires careful consideration when developing systems for clinical deployment, where diagnostic reliability remains paramount. Similar efficiency–accuracy trade-offs have been documented by Taghanaki et al. [38] in their systematic review of deep learning approaches for medical image segmentation, highlighting the persistent challenge of optimizing both parameters simultaneously.
The time-related key performance indicators revealed that while preprocessing (TOPPI) required considerable initial investment, particularly in YAML configuration development, it substantially facilitated subsequent training workflows across all architectures. Model deployment times (TTMIP) ranged from approximately 46 s to 72 s per adrenal gland segmentation during inference. These processing times, while longer than the sub-second inference reported by Humpire-Mamani et al. [39] for larger abdominal organs, remain clinically acceptable for integration into radiological workflows given the specialized nature of adrenal assessment.
When comparing our multiphase approach with single-phase methods, our results demonstrate modest improvements. This suggests potential benefits from incorporating complementary information across different contrast enhancement phases. However, Fu et al. [40] achieved comparable adrenal segmentation accuracy using only single-phase images by implementing an advanced boundary refinement module, suggesting alternative architectural innovations may potentially compensate for limited phase information.
In the broader context of small retroperitoneal structure segmentation, our adrenal gland results (Dice: 0.63–0.66) compare favorably with reported outcomes for similarly challenging anatomical targets such as pancreatic tail (Dice: 0.54–0.67) in the work of Roth et al. [41]. This suggests that the performance limitations may reflect inherent challenges in small structure segmentation rather than specific deficiencies in our methodological approach.
The clinical utility of automated adrenal segmentation extends beyond simple delineation. Accurate volumetric assessment of adrenal glands could potentially enhance diagnostic algorithms for conditions such as adrenal hyperplasia and subclinical Cushing’s syndrome. Similarly, Mayo-Smith et al. [42] established correlations between quantitative CT features and adrenal functional status, suggesting potential applications for AI-derived imaging biomarkers.
Recent advances in deep learning methodology offer promising directions for improving adrenal segmentation performance. Attention gate mechanisms, as implemented by Oktay et al. [43], have demonstrated particular efficacy for small structure segmentation by adaptively highlighting relevant features while suppressing irrelevant regions. Similarly, transformer-based architectures like TransUNet proposed by Chen et al. [44] leverage self-attention mechanisms to capture long-range dependencies, potentially addressing the contextual challenges inherent in adrenal segmentation. The integration of these advanced architectural elements represents a logical next step for improving performance beyond our baseline implementations.
This study has several limitations that warrant acknowledgment. First, our dataset size, while substantial for adrenal segmentation research, remains relatively small compared with the large public databases used for other organ segmentation tasks. Kamnitsas et al. [45] demonstrated that segmentation performance for challenging structures can continue to improve with dataset sizes exceeding 1000 cases, suggesting potential benefits from further data collection. Second, the use of multiphase imaging from different CT manufacturers introduces potential variability that could affect model generalizability. Third, external validation on truly independent datasets from different institutions and scanner types was not performed, limiting assessment of generalizability across different clinical settings and acquisition protocols. Future work should implement a structured validation framework involving (1) testing on multi-institutional datasets with varied acquisition protocols; (2) evaluation across different patient demographics and clinical presentations; and (3) comparison of segmentation performance across various contrast enhancement phases. Such comprehensive external validation would provide stronger evidence of generalizability and clinical utility across diverse real-world settings. We have initiated collaborations with multiple institutions to facilitate this validation process in subsequent studies.
The clinical translation of automated adrenal segmentation tools requires careful consideration of integration pathways. Greenspan et al. [46] outlined critical factors for successful clinical AI deployment, including seamless workflow integration, interpretability, and alignment with radiological practice patterns. Our time performance metrics provide initial insights into potential deployment considerations, but further usability studies and clinical validation would be necessary before implementation. Beyond segmentation accuracy, successful deployment also depends on minimal disruption to established clinical processes and appropriate presentation of AI-derived information to clinicians [46,47]. User acceptance studies specifically focusing on radiologists’ interaction with automated adrenal segmentation tools would provide valuable insights for optimizing deployment strategies.
Future research directions include expanding the dataset with more diverse cases, incorporating advanced architectural elements such as attention mechanisms [43] or transformer-based designs [44], and conducting rigorous external validation studies to better assess generalizability. Implementation of sophisticated postprocessing techniques such as conditional random fields, as demonstrated by Krähenbühl and Koltun [48], or ensemble learning approaches could potentially enhance segmentation performance. Multi-task learning frameworks that simultaneously segment adrenal glands and classify potential pathologies might provide synergistic benefits for clinical applications, an approach that has shown promise in liver lesion analysis, as reported by Dmitriev et al. [49]. Such integrated approaches align with emerging “detect-then-segment” paradigms that leverage the relationship between detection and segmentation tasks to improve overall performance [50]. Additionally, self-supervised pre-training on large unlabeled datasets could potentially address the data scarcity challenge in adrenal imaging [51].
In conclusion, this study confirms the feasibility of adrenal gland segmentation using deep learning methods while highlighting the persistent challenges associated with small organ segmentation. Our findings establish a methodological foundation for further optimization and eventual clinical integration of AI-assisted adrenal imaging tools, potentially contributing to improved efficiency and accuracy in radiological workflows.

5. Conclusions

This study evaluated the performance of three convolutional neural network architectures—U-Net, NablaNet, and SegNet—for automatic adrenal gland segmentation on multiphase CT scans. Our findings demonstrate that while U-Net achieved the highest segmentation accuracy, with Dice scores of 0.63 for the right and 0.66 for the left adrenal gland, these values remain lower than those commonly reported for larger abdominal organs. This performance gap underscores the significant anatomical and technical challenges inherent in segmenting small, low-contrast structures such as the adrenal glands.
Despite these challenges, all evaluated models demonstrated clinically acceptable inference times, with a clear performance hierarchy establishing U-Net as the optimal architecture, offering the best balance between segmentation accuracy and computational requirements. Our study also introduced and evaluated key performance indicators relevant for clinical implementation, including time-to-model-integration and preprocessing requirements, providing valuable practical insights for potential deployment.
Although certain evaluation metrics such as precision, sensitivity, specificity, and Hausdorff distance were approximated based on established relationships with Dice scores, our findings provide important contributions to understanding the capabilities and limitations of current deep learning approaches for adrenal segmentation. Further research employing larger, more diverse datasets and exploring emerging network architectures is necessary to improve segmentation quality and ensure robust generalization in clinical settings.
In conclusion, this work establishes the potential of AI-based methods for adrenal gland segmentation while providing a comprehensive assessment of current technical constraints and opportunities for advancement. These findings lay the groundwork for continued development of automated tools to support radiological workflows in adrenal imaging.

Author Contributions

Conceptualization, V.-O.B.; Formal analysis, M.G.C.; Methodology, O.N.-C.; Project administration, L.S.C.M.; Software, O.N.-C.; Supervision, C.M.; Validation, A.M. and V.J.; Visualization, V.J.; Writing—original draft, V.-O.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding authors.

Acknowledgments

Publication of this paper was supported by the University of Medicine and Pharmacy Carol Davila, through the Doctoral School and through the institutional program Publish not Perish.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Visual representation of adrenal gland segmentation process. (A) Input: original axial CT image showing the abdominal region with the adrenal gland difficult to distinguish visually. (B) Ground truth: binary segmentation mask created through expert manual annotation, highlighting the precise location of the adrenal gland. (C) Output: original CT image with segmentation overlay showing prediction results—green indicates true positive regions (correctly identified adrenal tissue), yellow indicates areas of uncertainty, and red represents false positive predictions. This visualization demonstrates both the algorithm’s capabilities and challenges in precisely delineating adrenal gland boundaries.
Table 1. Segmentation performance metrics.
Model      Gland   Dice (Mean ± SD)   Precision   Sensitivity   Specificity   HD95 (mm)
U-Net      Right   0.630 ± 0.05       0.649       0.612         0.998          8.2
U-Net      Left    0.660 ± 0.06       0.678       0.645         0.999          7.5
NablaNet   Right   0.552 ± 0.08       0.567       0.538         0.996         10.4
NablaNet   Left    0.550 ± 0.07       0.559       0.532         0.997          9.8
SegNet     Right   0.320 ± 0.10       0.315       0.305         0.995         14.7
SegNet     Left    0.335 ± 0.09       0.329       0.318         0.996         13.9
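The HD95 column reports the 95th-percentile symmetric Hausdorff distance between predicted and ground-truth boundaries. A brute-force NumPy sketch for small boundary point sets (the paper does not specify its implementation) might look like:

```python
import numpy as np

def hd95(points_a: np.ndarray, points_b: np.ndarray) -> float:
    """95th-percentile symmetric Hausdorff distance between two (N, D) point sets."""
    # Pairwise Euclidean distance matrix, shape (len(a), len(b))
    d = np.linalg.norm(points_a[:, None, :] - points_b[None, :, :], axis=-1)
    d_ab = d.min(axis=1)  # each point in A to its nearest point in B
    d_ba = d.min(axis=0)  # each point in B to its nearest point in A
    return float(np.percentile(np.concatenate([d_ab, d_ba]), 95))

# Toy example: two 2-point contours offset by 1 mm along y
a = np.array([[0.0, 0.0], [1.0, 0.0]])
b = np.array([[0.0, 1.0], [1.0, 1.0]])
print(hd95(a, b))  # → 1.0
```

The 95th percentile (rather than the maximum) is conventionally used to keep the metric robust to a few outlier boundary voxels.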
Table 2. Computational performance metrics.
KPI                     U-Net                       NablaNet                    SegNet
TOTM (min/epoch)        6.63 (right)/8.10 (left)    4.45 (right)/6.01 (left)    3.50 (right)/4.22 (left)
TTMIP (s/gland)         58.3 (right)/72.2 (left)    51.7 (right)/65.0 (left)    46.2 (right)/58.4 (left)
TOPPI (preprocessing)   20 min + 12.5 h (YAML dev.) 20 min + 12.5 h (YAML dev.) 20 min + 12.5 h (YAML dev.)
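Per-item timing KPIs such as TTMIP (seconds per gland at inference) can be collected with a simple wall-clock harness; the sketch below is generic and not the measurement code used in the study:

```python
import time
from typing import Any, Callable, Iterable

def seconds_per_item(fn: Callable[[Any], Any], items: Iterable[Any]) -> float:
    """Average wall-clock seconds spent in fn per item (e.g., per gland)."""
    items = list(items)
    start = time.perf_counter()
    for item in items:
        fn(item)  # in practice: run model inference on one CT volume
    return (time.perf_counter() - start) / len(items)

# Example with a stand-in "model" that just squares its input
avg = seconds_per_item(lambda x: x * x, range(1000))
print(avg >= 0.0)  # → True
```

For GPU inference, a real harness would also need to synchronize the device before reading the clock, otherwise asynchronous kernel launches make the measured time misleadingly small.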
