Article

AirwaySeekNet: Fine-Grained Segmentation and Completion of Peripheral Pulmonary Airways with Dynamic Reliability-Aware Supervision

1
Innovation Research Center for Medical Robotics, School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, Macau, China
2
Hanglok-Tech Research Institution, Hengqin, Zhuhai 519000, China
3
Center of Interventional Radiology & Vascular Surgery, Department of Radiology, Zhongda Hospital, Medical School, Southeast University, Nanjing 210009, China
*
Author to whom correspondence should be addressed.
Submission received: 14 December 2025 / Revised: 16 January 2026 / Accepted: 23 January 2026 / Published: 26 January 2026

Abstract

Accurate segmentation of the airway tree is crucial for the diagnosis and intervention of pulmonary disease; however, delineating small peripheral airways remains challenging. The small size and complex branching of distal airways, combined with the limitations of CT imaging (partial volume effects, noise), often lead to missed bronchial segments. To address these challenges, we propose AirwaySeekNet, a dual-decoder neural network. The model introduces a Voxel-Selective Supervision (VSS) mechanism, a dynamic reliability-aware strategy that focuses training on uncertain voxels, mitigating annotation bias and enhancing fine-branch detection. We further incorporate a Signed Distance Field (SDF) loss to enforce tubular shape constraints, improving the boundary delineation and connectivity of the airway tree. In experiments on a pig CT dataset, AirwaySeekNet outperformed state-of-the-art models, achieving higher topological completeness and finer branch detection: the tree-length detected rate (TD) increased by 5.55% and the branch detected rate (BD) by 8.14%. It maintained high overall segmentation accuracy (Dice), with only a minor increase in false positives from the exploration of the smallest bronchi. Overall, AirwaySeekNet markedly improves airway segmentation accuracy and topology preservation, providing a more complete and reliable mapping of the bronchial tree for clinical applications.

1. Introduction

Airway segmentation plays a pivotal role in pulmonary interventional procedures by enabling the precise extraction of morphological metrics—such as the airway wall thickness and luminal diameter variations—which serve as critical biomarkers for the diagnosis and assessment of disease severity in conditions including chronic obstructive pulmonary disease (COPD), asthma, and bronchiectasis [1,2,3]. In the context of surgical planning and real-time navigation, accurate delineation of the bronchial tree allows clinicians to localize anatomical landmarks with confidence, thereby markedly improving the procedural success rates and patient safety [4,5]. Furthermore, the quantitative analysis of airway parameters facilitates the longitudinal monitoring of disease progression and the design of personalized therapeutic regimens, rendering airway segmentation an indispensable tool in modern interventional pulmonology.
ATM’22 (Airway Tree Modeling 2022) was an open challenge held in conjunction with MICCAI 2022 that aimed to provide a large-scale, multi-site, and multi-domain benchmark and a unified evaluation platform for pulmonary airway (airway tree) segmentation and modeling [6]. The challenge released 500 expert-annotated chest CT scans, sourced from multiple institutions, including cases with COVID-19-related lesions. The results from the challenge indicate that deep-learning methods that explicitly preserve topology and/or specifically handle small airway branches achieve better performance in detecting and maintaining airway tree continuity. However, the majority of airway segmentation algorithms presented in ATM’22 are limited to the delineation of airways no smaller than the third generation (as defined by the standard tracheobronchial branching scheme), leaving finer and more peripheral bronchi beyond their current reach.
To date, no segmentation methods have been specifically developed for the small airways. Small airways are defined as those with a diameter of <2 mm, typically beginning at approximately the eighth generation of branching and comprising portions of both the conducting airways (responsible for air transport) and the respiratory bronchioles (involved in gas exchange) [7]. Often referred to as the “quiet zone” of the lungs, these distal airways may exhibit functional impairment in the early stages of disease that remains clinically silent, undetectable in terms of symptoms and conventional pulmonary function tests.
The small size and intricate branching of the small airways render their segmentation on CT images particularly challenging; however, their accurate delineation holds clinical importance in the diagnosis and management of pulmonary disease. In chronic obstructive pulmonary disease (COPD), small-airway involvement—quantified via impulse oscillometry (IOS)—has been reported in 60–74% of patients [8]. Likewise, in asthma, small-airway pathology affects approximately 50–60% of individuals with the disease [9]. Beyond COPD and asthma, lesions of the small airways also contribute to the pathogenesis and progression of other respiratory disorders—such as pulmonary fibrosis, bronchiectasis, and idiopathic interstitial pneumonias—and are associated with sustained declines in lung function and poorer clinical outcomes.
Current mainstream algorithms remain incapable of reliably segmenting the fine distal airways. The luminal diameters of these bronchi are comparable to the spatial resolution (slice thickness) of CT imaging, producing pronounced partial volume effects [10], and are further obscured by the elevated noise levels inherent in low-dose CT scans [11]. Additionally, the irregular variability in airway wall thickness [10] and the intricate highly branched topology of the bronchial tree [11] exacerbate segmentation uncertainty. Challenges such as extreme class imbalance [12], vanishing gradients and feature attenuation, and the failure to preserve airway connectivity (leading to discontinuities or omissions) further hinder fine airway delineation. This inability to accurately segment the small airways can directly result in missed lesions, thereby undermining the accuracy of pulmonary pathology assessment and the precision of subsequent interventional planning.
To address these prevailing challenges in airway segmentation, we propose a synergistically optimized framework for airway delineation, designed to enhance the detection of delicate distal bronchial structures.
The framework focuses on explicit segmentation: the explicit branch leverages a multi-decoder architecture to enrich feature representation. Collectively, this approach aspires to yield a comprehensive airway model that achieves high segmentation sensitivity.
Building upon this framework, we introduce a Voxel-Selective Supervision (VSS) mechanism to mitigate supervisory uncertainty arising from terminal airway omissions and class imbalance. VSS dynamically quantifies voxel-wise uncertainty via cross-entropy to establish a hierarchy of easy and difficult samples and employs a spatial-distance-based weighting strategy, thereby directing the model to prioritize discriminative features along the delicate bronchial margins. During early training, the network focuses on stable low-uncertainty voxels and progressively shifts attention toward more uncertain regions, effectively guiding the model to “actively explore” potential airway branches—even when such predictions manifest as apparent false positives under incomplete annotations.
We conducted a systematic evaluation of our proposed method on publicly available pulmonary airway segmentation datasets, using current state-of-the-art approaches as baseline comparators. The experiments not only confirm the superiority of our synergistic optimization framework in preserving semantic fidelity and structural continuity, but also underscore the pivotal role of the VSS mechanism in enhancing the recognition of fine bronchial branches.
In summary, the proposed approach delivers a harmonized advancement across architectural design, training strategy, and empirical validation, offering an effective and broadly applicable solution for high-precision airway modeling.

2. Related Works

Accurate CT-based airway segmentation is critically important for the early detection and quantification of airway-centric pathologies (such as COPD, asthma, and bronchiectasis), where involvement of the small airways often precedes overt clinical symptoms and conventional pulmonary function deficits. Consequently, current research efforts have converged on three principal challenges: preserving topology, reliably detecting fine peripheral bronchi, and ensuring robust performance across diverse pathological presentations. The 2009 EXACT’09 Airway Segmentation Challenge first illuminated these difficulties [13]: among fifteen automated algorithms, the best achieved only 74% completeness, while most averaged around 62%, with distal bronchi particularly prone to omission and fragmentation. This seminal benchmark galvanized the community to pursue methods capable of faithfully reconstructing both major and minor airway branches without sacrificing connectivity.
The ATM’22 Airway Tree Modeling Challenge further propelled advances in the field by benchmarking automated methods on a diverse set of pathological CT scans, thereby driving improvements in both small-airway detection and topological fidelity [6]. The ATM’22 dataset comprised 500 CT volumes encompassing complex pathologies—such as COPD, pulmonary nodules, and fibrosis—and was evaluated using metrics including the total airway length coverage (tree-length detected rate; TD), branch detection rate (branch detected rate; BD), volumetric Dice coefficient, false-positive rate (FPR), and the topology error rate. The top-performing DTPDT (Differentiable Topology–Preserved Distance Transform) algorithm achieved approximately 91% TD and 90% BD [14], whereas the average results for the remaining participants were 78% TD, 75% BD, and an FPR of around 10%.

2.1. CNN-Based Airway Segmentation

The advent of deep convolutional neural networks (CNNs) revolutionized medical image segmentation. Ronneberger et al. (2015) introduced the U-Net architecture [15]—a symmetric encoder–decoder network with skip connections—that demonstrated precise segmentation even when training data are scarce. Its three-dimensional extensions, including 3D U-Net [16] and V-Net [17], further extended this paradigm to volumetric data, enabling end-to-end learning of complex anatomical structures with high fidelity.
Specifically for airway segmentation, Charbonnier et al. (2017) trained a 3D CNN to detect and correct “leakage” [18] artifacts in baseline airway masks. By teaching the network to recognize and excise false positives that penetrated lung parenchyma, they markedly increased the specificity while preserving the overall branch detection sensitivity.
To enhance the detection of fine bronchial branches, multi-scale and cascade designs have been proposed. In 2019, Qin and colleagues developed AirwayNet, which incorporates voxel-connectivity awareness into a 3D CNN to explicitly encourage the prediction of a continuous airway tree [19]; by penalizing disconnected predictions, AirwayNet achieved improved topology preservation and rescued otherwise overlooked distal branches.
In parallel, Zhao et al. employed a two-stage strategy: a 2D CNN first generated slice-wise airway probability maps, which were then integrated by a 3D CNN and refined through a linear-programming-based path-tracing algorithm to enforce connectivity [20]. This hybrid 2D–3D cascade not only yielded more coherent tree structures but also enabled generation-level classification of bronchial branches, highlighting the power of staged architectures combined with global optimization.
Meanwhile, Wang et al. (2019) introduced a spatial fully connected network that leverages a radial-distance loss to directly learn the tubular geometry of airways. By predicting a distance transform rather than a binary mask, their approach inherently preserves the cylindrical morphology and continuity of airway segments, offering another effective avenue for topology-aware segmentation [21].
In addition, recent studies have begun to focus on AI-generated annotations and pseudo-label learning strategies to alleviate the high cost and inconsistency of pixel-level manual annotation. Previous work has shown that model-generated pseudo labels can significantly reduce reliance on manual labeling while maintaining or even improving segmentation performance, thereby enhancing annotation efficiency and scalability. Such advances are expected to substantially improve the efficiency of image segmentation workflows in future research [22,23,24].

2.2. Attention Mechanisms and Feature Recalibration

The integration of attention mechanisms and feature recalibration techniques into convolutional neural networks marked another significant breakthrough in airway segmentation. Qin et al. (2020) proposed AirwayNet-SE, a model that fuses multi-scale contextual information with a squeeze-and-excitation (SE) attention module to enhance the saliency of subtle airway structures during segmentation [25]. In a related work, the same group introduced an attention distillation strategy to impart “fine-bronchus sensitivity” to the network, explicitly guiding it to amplify responses to small-caliber airway pathways [26]. These attention-based approaches recalibrate feature maps and steer the network’s focus toward faint peripheral branches, significantly improving the detection sensitivity under resolution-limited conditions.
Similarly, Zhou et al. (2021) developed a multi-scale context-enhanced U-Net that aggregates coarse-to-fine semantic features, enabling more complete reconstruction of airway trees even in cases with complex pathological alterations [27]. In a complementary direction, Nadeem et al. (2021) proposed a novel “freeze-and-grow” framework that combines deep learning with classical region-growing algorithms [28]. Their method first uses a CNN to “freeze” correctly segmented regions—preventing leakage—and then iteratively grows additional branches in an alternating CNN–propagation manner, striking a compelling balance between false positives and false negatives.
These innovations—spanning 3D network architectures, cascade designs, and attention-enhanced modules—have collectively introduced a qualitative leap in airway segmentation. Modern approaches can now delineate dozens of bronchial generations from standard chest CT scans, far surpassing early algorithms that reliably captured only the third or fourth generations.

2.3. Strategies for Addressing Class Imbalance

With the advancement of deep learning, there has been a growing demand for loss functions tailored to the extreme class imbalance inherent in airway segmentation, wherein small distal airways are vastly outnumbered by the surrounding background. Originally proposed for dense object detection, the focal loss introduced in Lin et al. (2017) [29] was later adapted for segmentation tasks to address this issue. By down-weighting the contribution of well-classified voxels and emphasizing the misclassified ones, focal loss effectively guides the model to focus on under-segmented fine-scale airway structures.
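As a brief illustration of the down-weighting idea described above, the following is a minimal NumPy sketch of a binary focal loss of the form $-\alpha_t (1 - p_t)^{\gamma} \log p_t$. The function name and the default values of `gamma` and `alpha` are illustrative assumptions and are not taken from the methods reviewed here.

```python
import numpy as np


def binary_focal_loss(prob_fg, target, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over all voxels.

    prob_fg : predicted airway (foreground) probability per voxel.
    target  : binary labels of the same shape (1 = airway, 0 = background).
    gamma, alpha : illustrative defaults, not values from the paper.
    """
    prob_fg = np.clip(np.asarray(prob_fg, dtype=float), eps, 1.0 - eps)
    target = np.asarray(target, dtype=float)
    # p_t is the probability the model assigns to the true class of each voxel.
    p_t = np.where(target == 1, prob_fg, 1.0 - prob_fg)
    alpha_t = np.where(target == 1, alpha, 1.0 - alpha)
    # (1 - p_t)^gamma shrinks the loss of voxels that are already well classified.
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))


# A confidently correct airway voxel contributes far less than a missed one.
easy = binary_focal_loss([0.9], [1])
hard = binary_focal_loss([0.1], [1])
```

With `gamma = 0` and `alpha = 0.5`, the expression reduces to half of the ordinary cross-entropy, which makes the effect of the modulating factor easy to isolate.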
Building on this principle, Zheng et al. (2020) introduced the Generalized Unified Loss (GUL) [12], which dynamically balances gradient contributions from both large and small airways through distance-based weighting. This strategy mitigates the optimizer’s tendency to favor easily segmentable large bronchi and instead promotes learning from smaller more challenging branches. By eliminating scale-dependent gradient disparities, the GUL improves peripheral airway detection without compromising the accuracy of central airway segmentation.
Overall, loss function design remains an active and critical area of research, as achieving branch-complete segmentation requires the network to learn from extremely sparse positive signals associated with the smallest airways. This challenge necessitates carefully crafted objective functions that can amplify weak supervision signals and ensure that even the most distal branches are faithfully captured during training.

2.4. Topology Preservation of the Bronchial Tree

A long-standing challenge in airway segmentation is preserving the topology of the bronchial tree, ensuring that all predicted branches remain connected to the trunk and to one another in a coherent structure. One prominent approach to address this issue involves the use of centerline prediction or multi-task learning frameworks that jointly infer airway masks and skeletons. By simultaneously predicting the airway centerline and segmentation mask, neural networks are encouraged to produce continuous tubular structures. For instance, Selvan et al. (2020) integrated a mean-field conditional random field with a graph neural network in the post-processing stage to “snap” disconnected components into a connected graph, leveraging learned graph optimization techniques to enhance connectivity [30].
Another effective strategy is the incorporation of topology-preserving loss functions. Shit et al. (2021) proposed the centerline Dice (clDice) loss, which directly rewards topologically correct predictions by measuring the overlap between predicted and ground-truth skeletons [31]. This differentiable skeletal objective ensures that omissions of fine branches contribute to the loss, thereby encouraging the network to pursue even the most distal airways. Similarly, the radial distance loss introduced by Wang et al. (2019) provides a topology-aware supervision signal by regressing the distance from airway boundaries to their centerlines, enabling the model to learn and preserve tubular continuity implicitly [21].
Zheng et al. [32] further introduced a Local Imbalance-based Weighting scheme and a Backpropagation-based Weight Enhancement strategy to reinforce topological completeness during training. To explicitly suppress disconnection errors, Nan et al. [33] and Yu et al. [34] incorporated discontinuity-sensitive regularization terms that penalize fragmented predictions. Given the global influence of structural diversity on distance maps [35,36], Zhang et al. [14] and Yu et al. [34] proposed convolutional distance transforms (CDT) and geodesic distance transforms, respectively, to avoid topological fragmentation.
Collectively, state-of-the-art airway segmentation methods now move beyond voxel-wise accuracy to explicitly optimize for anatomical plausibility and structural connectivity through skeleton-guided learning, graph-based optimization, and topology-aware loss formulations, minimizing branch discontinuities and yielding highly coherent airway tree reconstructions.

3. Methods and Materials

In summary, two key challenges persist in pulmonary airway segmentation. First, mainstream medical-imaging datasets typically omit annotations for the finest airways (branching depth beyond the eighth generation, diameter <2 mm) [7], thereby impeding models from learning accurate representations of these delicate anatomical branches. Second, conventional encoder–decoder architectures tend to fracture or distort the topology of minute tubular structures during their downsampling–upsampling operations.
This study proposes a collaboratively optimized framework for pulmonary airway segmentation, designed to address the principal challenges in delineating fine bronchial structures. The segmentation module employs a dual-decoder architecture and introduces a voxel-selection supervision strategy to intensify the model’s sensitivity to minute airway branches. This encourages the exploration of unannotated fine bronchi, which may manifest in the evaluation metrics as controlled false positives. Such false positives do not necessarily reflect erroneous model predictions. Previous studies have reported that airway segmentation algorithms often detect anatomically plausible peripheral branches that are missing in the reference annotations, and thus are penalized as false positives when evaluated against an incomplete ground truth. These observations suggest that such detected but unannotated structures are more likely due to the incompleteness of the annotation set rather than true algorithmic errors [5,37].

3.1. Segmentation Module

This section presents a detailed account of the segmentation model’s network architecture, the Voxel-Selective Supervision strategy, and the combined loss function. The overall framework is illustrated in Figure 1.
In our segmentation network, the original CT volume $X \in \mathbb{R}^{H \times W \times D}$ is first cropped into a sub-volume of size $h \times w \times d$, which is then encoded into multi-scale features by a grouped-convolutional encoder and subsequently decoded by two parallel decoders: one with conventional upsampling convolutions and the other with dynamic serpentine convolutions, producing segmentation probability maps and distance maps, respectively:
$$P_{seg}^{1}, P_{sdf}^{1} = \mathrm{Dec}_1\big(\mathrm{Enc}(\mathrm{Crop}(i))\big), \qquad P_{seg}^{2}, P_{sdf}^{2} = \mathrm{Dec}_2\big(\mathrm{Enc}(\mathrm{Crop}(i))\big),$$
where $P_{seg}^{1}, P_{sdf}^{1} \in \mathbb{R}^{h \times w \times c}$ represent the outputs of the upsampling decoder $\mathrm{Dec}_1(\cdot)$, and $P_{seg}^{2}, P_{sdf}^{2} \in \mathbb{R}^{h \times w \times c}$ represent the outputs of the serpentine-convolution decoder $\mathrm{Dec}_2(\cdot)$.
The two decoders share the encoder’s feature representations but follow independent upsampling paths to produce distinct segmentation probability maps and distance-transform maps. A consistency loss $L_{con}$ enforces alignment between their feature spaces. The annotated airway mask is skeletonized to derive the ground-truth distance-transform map (DTM), which is then compared against the decoder’s predicted DTM via the regression loss $L_{dtm}$. Finally, the segmentation probability maps are supervised against the manual labels using the voxel-selective supervision strategy, yielding the loss $L_{dec}^{sup}$.
$$S = \mathrm{Skeletonize}(Y) \in \{0, 1\}^{h \times w \times d}$$
$$D = \mathrm{DT}(S) \in \mathbb{R}^{h \times w \times d}$$
Here, $Y \in \{0, 1\}^{h \times w \times d}$ denotes the ground-truth annotation, $\mathrm{Skeletonize}(\cdot)$ the skeletonization operator, and $\mathrm{DT}(\cdot)$ the distance-transform operation.
The supervision loss $L_{dec}^{sup}$ and the regression loss $L_{dtm}$ are defined as follows:
$$L_{dec}^{sup} = L_{dec1}^{sup}(P^{1}, Y) + L_{dec2}^{sup}(P^{2}, Y),$$
$$L_{dtm} = L_{dtm1}(P^{1}, D) + L_{dtm2}(P^{2}, D).$$
The challenge of accurately segmenting airways arises in part from gradient erosion and voxel neighborhood dilation during training [12] and is exacerbated by missing annotations in these fine branches. Training with imperfect labels introduces erroneous supervision, leading to suboptimal solutions and limiting the effectiveness of topology-enhancement strategies.
To mitigate the gradient erosion in shallow layers, we supply complementary gradient flows. To address the model’s insensitivity caused by unannotated terminal bronchi, we propose a voxel-selective supervision strategy. By quantifying voxel uncertainty via cross-entropy, we dynamically select the subset of voxels to supervise and adaptively balance easy and difficult samples during training, initially focusing on confidently labeled voxels and then gradually incorporating high-uncertainty (i.e., under-labeled) voxels to correct omitted regions. Concurrently, we apply spatial distance-based weighting to assign higher importance to fine airways adjacent to the background, thereby enhancing the boundary sensitivity. Cross-entropy-based selection of high-uncertainty voxels forces the model to explore potential airway structures—even if these predictions would be considered false positives under existing annotations—while the distance weights emphasize boundary voxels in the loss, encouraging airway extensions. To avoid early-stage noise, difficult samples are introduced later in training, enabling the model to complete missing branches in under-annotated regions. Consequently, the resulting false positives correspond to plausible extensions of the airway tree rather than erroneous predictions, improving the fine-branch segmentation accuracy and preserving topological coherence.
To prevent erroneous predictions on unselected voxels, we employ a dual-decoder architecture to enforce consistency between outputs. Inspired by WingsNet [12], we partition the convolutional blocks of Decoder 1 into distinct groups and apply auxiliary supervision at the group level rather than on individual blocks. Within each group, multi-scale features are aggregated via grouped convolutions and a feature pyramid: the output of each convolutional block is first passed through a lightweight 1 × 1 × 1 convolution, then upsampled, and finally concatenated to form the group’s combined representation.
In the segmentation of tubular structures (such as blood vessels or road networks), standard deformable convolutions can suffer from excessive offset magnitudes that displace the kernel away from the target. To address this, a mechanism is required to constrain offset learning so that adjustments follow the trajectory of the tubular structure. Accordingly, we integrate Dynamic Snake Convolution (DSConv) [38] into $\mathrm{Dec}_2$. DSConv is a specialized convolutional operation for tubular segmentation whose core idea is to iteratively accumulate and refine offsets, deforming the kernel to conform precisely to the target’s geometric contours.
In the dual-decoder architecture, $\mathrm{Dec}_1$ employs a group-supervision mechanism: the network is partitioned into multiple groups, and a group-wise feature pyramid is constructed, wherein lightweight 1 × 1 × 1 convolutions fuse multi-scale features. This design mitigates gradient erosion and explosion in shallow layers while preserving the deep network’s representational power, thereby effectively addressing the class imbalance. $\mathrm{Dec}_2$ integrates DSConv, which iteratively constrains the kernel offsets so that the receptive field extends continuously along the tubular target’s geometric trajectory, enhancing the local adaptability to curved slender structures.
The proposed encoder–dual-decoder architecture not only guarantees gradient stability and the semantic coherence of multi-scale features at the global level but also precisely captures low-contrast high-curvature details of airway branches at the local level. Built upon an efficient inference paradigm, this framework markedly enhances both the accuracy and topological continuity of fine airway segmentation, providing an effective solution for pulmonary airway delineation.

3.2. Voxel-Selective Supervision

Fully supervised training paradigms can yield strong performance in airway-segmentation tasks; however, missing annotations of peripheral bronchi introduce bias into the supervision signal, undermining the model’s capacity to discern small airway branches and to preserve the topology of the airway tree. To overcome this, we propose a dynamic voxel-selection strategy based on the cross-entropy between prediction probabilities and ground-truth labels, which enables progressive optimization by concentrating on difficult regions. As illustrated in Figure 2, this strategy employs a threefold weighting mechanism to bolster the segmentation of fine branches.
First, the shortest Euclidean distance from each voxel to the background is computed to obtain the weight map $W$ (Weight 1), which enhances the model’s sensitivity to fine branches and improves its ability to discriminate airway boundaries.
$$W(i,j,k) = \begin{cases} 1 - m\,\dfrac{d(i,j,k)}{d_{\max}}, & \text{if } Y(i,j,k) = 1 \\ 1, & \text{if } Y(i,j,k) = 0, \end{cases}$$
where $W(i,j,k)$ denotes the weight assigned to voxel $(i,j,k)$, $d(i,j,k)$ represents its shortest Euclidean distance to the background, and $d_{\max}$ is the maximum distance within the sample. To prevent $W(i,j,k)$ from becoming zero, the scaling factor is set to $m = 0.99$.
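The distance-based weight can be sketched in a few lines of NumPy. The example below is only an illustration under stated assumptions: the function name is hypothetical, and the distance to the background is computed by brute force over all voxel pairs, which is practical only for tiny volumes (a dedicated Euclidean distance transform would be used on real CT crops).

```python
import numpy as np


def distance_weight_map(mask, m=0.99):
    """Weight map: 1 - m * d / d_max on airway voxels, 1 on background.

    mask : binary array (1 = airway, 0 = background), any dimensionality.
    m    : scaling factor keeping the weights strictly positive.
    """
    mask = np.asarray(mask).astype(bool)
    weights = np.ones(mask.shape, dtype=float)
    fg = np.argwhere(mask)    # coordinates of airway voxels (row-major order)
    bg = np.argwhere(~mask)   # coordinates of background voxels
    if len(fg) == 0 or len(bg) == 0:
        return weights
    # Brute-force shortest Euclidean distance from each airway voxel to background.
    diff = fg[:, None, :] - bg[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)
    # Boolean assignment follows the same row-major order as np.argwhere.
    weights[mask] = 1.0 - m * d / d.max()
    return weights


# Toy "airway": a run of five foreground voxels in a 1 x 1 x 7 volume.
toy = np.zeros((1, 1, 7), dtype=int)
toy[0, 0, 1:6] = 1
W = distance_weight_map(toy)
```

In this toy case the voxels next to the background keep a weight of about 0.67, while the central voxel, which is farthest from the background, drops to 0.01. Thin branches, whose voxels are all close to the background, therefore retain weights near 1.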
Subsequently, the voxels to be supervised are determined based on the cross-entropy values computed between the predicted probability map from the decoder and the ground-truth labels:
$$V(i,j,k) = -\sum_{c=1}^{C} Y(i,j,k,c)\,\log P(i,j,k,c),$$
where $V(i,j,k)$ denotes the cross-entropy value at voxel location $(i,j,k)$, and $C$ represents the total number of classes. $Y(i,j,k,c) \in \{0, 1\}$ and $P(i,j,k,c)$ denote the ground-truth label and predicted probability for class $c$ at voxel $(i,j,k)$, respectively. The value $V(i,j,k)$ captures the uncertainty between the predicted probabilities and the ground truth. Higher values of $V(i,j,k)$ indicate a larger discrepancy between prediction and annotation, typically corresponding to missing labels in fine branches or ambiguous boundary regions.
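For concreteness, the voxel-wise cross-entropy map can be computed as in the following NumPy sketch. It assumes the standard sign convention (a leading minus, so that larger values mean larger disagreement) and a channels-last layout; the function name and the clipping constant `eps` are illustrative choices.

```python
import numpy as np


def voxel_cross_entropy(probs, onehot, eps=1e-7):
    """Per-voxel cross-entropy: V = -sum_c Y * log(P).

    probs, onehot : arrays of shape (..., C) with class probabilities and
                    one-hot ground-truth labels (channels-last is an assumption).
    """
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0)
    onehot = np.asarray(onehot, dtype=float)
    return -(onehot * np.log(probs)).sum(axis=-1)


# Two voxels labelled as class 0: one predicted confidently, one mostly missed.
probs = np.array([[0.9, 0.1], [0.2, 0.8]])
labels = np.array([[1, 0], [1, 0]])
V = voxel_cross_entropy(probs, labels)
```

The second voxel, where the prediction disagrees with the annotation, receives the larger value and is therefore treated as the more uncertain sample.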
Based on this, we introduce a voxel-level selection strategy to obtain $D_{rel}$, which determines the subset of voxels in the spatial domain $D$ to be supervised by the loss function. During training, a dynamic threshold $\rho(t)$ is used to control the proportion of hard samples selected in each epoch, allowing the model to focus on confidently labeled voxels in the early stages and progressively incorporate more challenging samples as training advances:
$$D_{rel} = \underset{D' :\, |D'| \geq \rho(t)\,|D|}{\arg\min} \sum_{(i,j,k) \in D'} V(i,j,k),$$
$$\rho(t) = 1 - \min\left(\frac{t}{80}\,\tau,\ \tau\right),$$
where $t$ denotes the current training epoch, and $\tau$ represents the proportion of selected voxel samples. Convolutional neural networks tend to prioritize learning from easily identifiable voxel samples in the early stages of training, while later they may overfit to mislabeled or missing annotations caused by low contrast [39,40]. To mitigate this, the threshold $\rho(t)$ is initially set close to 1 and gradually decreases with the training progression. In addition, $\tau$ is chosen as an empirical value between 0 and 1 to control the introduction of harder samples. Specifically, as training proceeds, more difficult samples (e.g., those with missing annotations or low contrast) are progressively introduced to ensure that the model can handle these challenging parts. Early in training, the model focuses on easily recognizable voxels, while with further training it begins to process more difficult examples, thereby improving robustness to fine details and complex structures. Through validation experiments, we adjust $\tau$ to ensure a balanced learning of easy and hard samples during training.
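The schedule and the selection step can be sketched as follows. This is one possible reading rather than the authors' implementation: it assumes that $\rho(t)$ decays linearly from 1 to $1 - \tau$ over the first 80 epochs, and that the selected set consists of the $\lceil \rho(t)\,|D| \rceil$ voxels with the lowest cross-entropy. The function names and the value $\tau = 0.2$ used in the example are hypothetical.

```python
import numpy as np


def rho(t, tau, ramp_epochs=80):
    """Fraction of voxels kept at epoch t: 1 - min(t / ramp_epochs * tau, tau)."""
    return 1.0 - min(t / ramp_epochs * tau, tau)


def select_reliable_voxels(V, t, tau):
    """Boolean mask of the ceil(rho(t) * |D|) voxels with the smallest V.

    Assumption: the arg-min in the selection rule is read as
    'keep the lowest cross-entropy voxels'.
    """
    V = np.asarray(V, dtype=float)
    # Small tolerance guards against floating-point error before the ceiling.
    k = int(np.ceil(rho(t, tau) * V.size - 1e-9))
    k = max(1, min(k, V.size))
    order = np.argsort(V, axis=None, kind="stable")
    selected = np.zeros(V.size, dtype=bool)
    selected[order[:k]] = True
    return selected.reshape(V.shape)


# Ten voxels with distinct cross-entropy values; tau = 0.2 is illustrative.
V_demo = np.array([5.0, 1.0, 9.0, 3.0, 7.0, 0.0, 8.0, 2.0, 6.0, 4.0])
early = select_reliable_voxels(V_demo, t=0, tau=0.2)    # rho = 1.0, all voxels kept
late = select_reliable_voxels(V_demo, t=80, tau=0.2)    # rho = 0.8, two voxels dropped
```

Under this reading, the two voxels with the highest cross-entropy (values 8 and 9) are excluded from supervision once the schedule has reached its floor, so the network is free to predict airway there without being penalized by a possibly incomplete label.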
Finally, for the selected voxel set $D_{rel}$, supervision is applied using a weighted cross-entropy loss:
$$L_{sup} = -\frac{1}{|D_{rel}|} \sum_{(i,j,k) \in D_{rel}} \left( W(i,j,k) \sum_{c=1}^{C} Y(i,j,k,c)\,\log P(i,j,k,c) \right),$$
where W denotes the spatially varying weights assigned to each voxel. The cross-entropy-based voxel-selective supervision strategy enables the model to dynamically adjust its focus throughout training. This not only alleviates the overfitting caused by incomplete annotations but also enhances the model’s sensitivity to fine airway structures through the incorporation of geometric priors. However, because cross-entropy as an uncertainty metric cannot rigorously distinguish between truly missing airways and regions of noise or artifacts, the exploration mechanism of the VSS module relies on empirical assumptions and may result in incorrect selections. In future work, we plan to incorporate more sophisticated uncertainty modeling techniques to enhance this discriminative capability.
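Putting the pieces together, the weighted loss over the selected voxels might be computed as in the sketch below. It assumes a leading minus sign so that the loss is non-negative, a channels-last probability array, and a precomputed boolean selection mask; all names are illustrative.

```python
import numpy as np


def selective_weighted_ce(probs, onehot, weights, selected, eps=1e-7):
    """Weighted cross-entropy averaged over the selected voxels only.

    probs, onehot : arrays of shape (..., C).
    weights       : per-voxel weights W, shape (...).
    selected      : boolean mask of supervised voxels, shape (...).
    """
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0)
    ce = -(np.asarray(onehot, dtype=float) * np.log(probs)).sum(axis=-1)
    selected = np.asarray(selected, dtype=bool)
    n = int(selected.sum())
    if n == 0:
        return 0.0  # nothing selected, nothing to supervise
    weighted = np.asarray(weights, dtype=float) * ce
    return float(weighted[selected].sum() / n)


# Three voxels labelled as class 0; the last (most uncertain) one is not selected.
probs = np.array([[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]])
labels = np.array([[1, 0], [1, 0], [1, 0]])
w = np.array([1.0, 0.5, 1.0])
sel = np.array([True, True, False])
loss = selective_weighted_ce(probs, labels, w, sel)
```

Because the third voxel is excluded, its large cross-entropy does not enter the average, which is how unselected regions are shielded from potentially incomplete labels.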

3.3. Combined Loss

Although the cross-entropy-based voxel-selective supervision enhances the prediction accuracy in challenging regions, it supervises only a subset of selected voxels, leaving unsupervised areas potentially susceptible to prediction bias. To address this limitation, we introduce a dual-decoder architecture and impose consistency constraints on the dual-decoder through feature alignment using the Kullback–Leibler (KL) divergence, thereby improving the overall robustness of the predictions.
$$ L_{\mathrm{con}} = \frac{1}{|D|} \sum_{(i,j,k) \in D} \sum_{c=1}^{C} p_1(i,j,k,c) \log \frac{p_1(i,j,k,c)}{p_2(i,j,k,c)}, $$
where p_1(i, j, k, c) and p_2(i, j, k, c) denote the class-c probabilities at voxel location (i, j, k) predicted by the first and second decoders, respectively, and D denotes the set of voxels over which the consistency term is evaluated.
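The consistency term can be sketched in a few lines of pure Python, treating each decoder output as a list of per-voxel class distributions. The function name `kl_consistency` and the clamping constant `eps` are illustrative choices, not details from the paper.

```python
import math


def kl_consistency(p1, p2, eps=1e-12):
    """Mean over voxels of KL(p1 || p2), summed over classes."""
    total = 0.0
    for v1, v2 in zip(p1, p2):
        for a, b in zip(v1, v2):
            if a > 0.0:  # the term 0 * log(0 / b) is taken as 0
                total += a * math.log(a / max(b, eps))
    return total / len(p1)


# Two voxels, two classes: the decoders disagree on the first voxel only.
p1 = [[0.5, 0.5], [0.9, 0.1]]
p2 = [[0.25, 0.75], [0.9, 0.1]]
loss = kl_consistency(p1, p2)
```

The divergence is zero when both decoders agree and is asymmetric in its arguments, so swapping `p1` and `p2` generally gives a different value.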
Distance Transformation (DTM) quantifies the geometric relationship between each voxel and the airway surface, significantly enhancing the model’s ability to recognize complex airway structures. The Signed Distance Field (SDF) assigns a specific distance value to each voxel x within the pulmonary airway:
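$$ D(x) = \begin{cases} +\inf_{z \in S} \lVert x - z \rVert_2, & \text{if } x \in O \\ 0, & \text{if } x \in S \\ -\inf_{z \in S} \lVert x - z \rVert_2, & \text{if } x \in I \end{cases}, $$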
where O denotes the airway exterior, S the airway surface, and I the airway interior. To enhance the model’s sensitivity to airway boundaries, we adopt an exponentially weighted distance regression loss inspired by [41]:
$$ \phi(D(x)) = \exp\big(-\delta \cdot D(x)\big), $$
where δ 1 is a scaling factor that causes the weighting function to exponentially concentrate the model’s attention on the airway boundary regions, i.e., where the assigned weight for D ( x ) is maximal. The value of the parameter δ was determined through empirical tuning on the validation set, with the aim of increasing the model’s focus on the boundary regions of tubular structures by assigning greater weight to errors at these boundaries in the loss function. The distance loss function is defined as follows:
$$ L_{D} = \frac{1}{N} \sum_{x \in \Omega} \phi\big(D_{GT}(x)\big) \cdot \left\lVert D_{pred}(x) - D_{GT}(x) \right\rVert^{2}, $$
where N denotes the total number of voxels within the domain Ω, D_GT represents the ground-truth signed distance values, and D_pred denotes the predicted signed distance values from the model. By modulating the weighting function ϕ, boundary voxels receive higher gradient weights compared to interior regions, facilitating the model’s ability to address challenges such as blurred boundaries in low-contrast areas and discontinuities in fine branches. This promotes segmentation results that more accurately preserve the tubular anatomical characteristics of the airway.
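A minimal sketch of the exponentially weighted distance regression is given below. It assumes that the weight uses the magnitude of the ground-truth signed distance, so that it peaks on the airway surface as the text describes, and that the error is squared; the names `boundary_weight` and `weighted_distance_loss` are hypothetical.

```python
import math


def boundary_weight(d_gt, delta):
    # ASSUMPTION: |d_gt| is used so the weight is largest at the surface (d_gt = 0).
    return math.exp(-delta * abs(d_gt))


def weighted_distance_loss(d_pred, d_gt, delta=1.0):
    """Mean of weight * squared error between predicted and ground-truth SDF values."""
    n = len(d_gt)
    return sum(
        boundary_weight(g, delta) * (p - g) ** 2 for p, g in zip(d_pred, d_gt)
    ) / n


# One voxel on the surface, one inside (negative), one outside (positive).
d_gt = [0.0, -2.0, 3.0]
d_pred = [1.0, -1.0, 4.0]  # every prediction is off by exactly 1
loss = weighted_distance_loss(d_pred, d_gt, delta=1.0)
```

Although all three voxels carry the same absolute error, the surface voxel dominates the loss, which is the intended emphasis on boundary regions.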
Ultimately, the composite loss function L_total is formulated as a weighted sum of the segmentation loss, the DTM regression loss, and the KL consistency loss, mathematically expressed as
$$ L_{\mathrm{total}} = \lambda_1 L_{\mathrm{con}} + \lambda_2 L_{\mathrm{dtm}} + \lambda_3 L_{\mathrm{decsup}}, $$
where L_con is the consistency loss, L_dtm is the SDF regression loss, L_decsup is the segmentation loss, and λ_1, λ_2, and λ_3 are hyperparameters that balance the contributions of these three terms. Their specific values were likewise determined through empirical tuning on the validation set to achieve a reasonable trade-off among the individual loss components. Through this design, the model is enabled to actively explore sparsely annotated regions while maintaining the geometric continuity of complex branching structures.

3.4. Implementation Details

The joint operational mechanism and synergy between VSS, the dual decoders, and the SDF constraint during training are presented in Algorithm 1. VSS dynamically selects and weights hard voxel samples to ensure that the model progressively corrects overlooked fine branches throughout the training process. The dual-decoder architecture employs a consistency loss to reinforce agreement between the outputs of the two decoders, thereby enhancing the robustness of the model. Meanwhile, the SDF constraint introduces a weighted distance regression loss that increases the model’s sensitivity to airway boundaries, leading to improved segmentation accuracy for fine airway structures.
Algorithm 1 Joint Training: VSS, Dual-Decoder, and SDF Constraints
Input:
    X: CT volume image
    Y: ground truth label
    D: distance transform
Output:
    θ: Optimized model parameters
Step 1: Input Processing and Encoding
    sub_vol ← Crop(X)
    features ← Encoder(sub_vol)
Step 2: Dual-Decoder Decoding
    P_1, SDF_1 ← Decoder_1(features)
    P_2, SDF_2 ← Decoder_2(features)
Step 3: Voxel Selection via VSS
    Compute voxel uncertainty based on cross-entropy V(i, j, k)
    Compute dynamic threshold ρ(t)
    Select voxels D_rel based on uncertainty
    Compute spatial weights W (using distance transform)
Step 4: Loss Calculation
    Segmentation loss with VSS-weighting: L_seg
    SDF regression loss: L_sdf
    Consistency loss: L_con
Step 5: Total Loss and Optimization
    Combine all losses: L_total = λ_1 L_con + λ_2 L_sdf + λ_3 L_seg
    Backpropagate and update parameters: θ ← θ − η · ∇_θ L_total
return θ: Updated model parameters

4. Experiments and Results

4.1. Dataset Description

In this study, we utilized an internally collected dataset comprising 53 pig CT scans along with corresponding manual annotations. The dataset was divided into training, validation, and test sets in a ratio of 41:6:6, ensuring balanced distributions of anatomical structures and image quality across subsets.
Scanning equipment and parameters: Imaging was performed using an X-ray computed tomography (CT) scanner (SOMATOM go.Fit; Siemens Healthineers, Shanghai, China).
Imaging parameters: Tube voltage (kVp): 110; Slice thickness: 1 mm; Pixel spacing: [0.78471875, 0.78471875]; Image dimensions: 512 × 512.
Sedation and anesthesia protocol: Initial sedation was achieved by intramuscular injection of Zoletil (3 mg/kg) and Atropine Sulfate (0.08 mg/kg). Once the animal was calm, the surgical area was prepared and an intravenous catheter was placed in the marginal ear vein for slow infusion of physiological saline and propofol until the animal reached a state suitable for endotracheal intubation. Depending on the animal’s physiological condition, vasoactive drugs such as epinephrine, norepinephrine, or dopamine were administered as needed to maintain stable physiological parameters. After successful induction and intubation, mechanical ventilation was maintained with set respiratory parameters under general anesthesia. Intraoperative electrocardiographic monitoring was conducted via limb leads, invasive arterial blood pressure was monitored through an arterial line placed via central venous catheterization, and oxygen saturation was continuously measured using a tongue pulse oximeter.
Ventilator: WATO EX-20VET (Shenzhen Mindray Bio-medical Electronics Co., Ltd., Shenzhen, China).
Monitoring equipment: Veterinary monitor model im8vet (EDAN Co., Ltd., Shenzhen, China), used for real-time monitoring of vital signs, oxygen saturation, electrocardiography, and other key physiological metrics.

4.2. Data Preprocessing

During data preprocessing, three cropping strategies were adopted to enhance the diversity of the training samples. First, sliding-window cropping was applied around the trachea region to ensure adequate coverage of the target structure. Second, based on the annotated airway diameters, small airway regions were identified and subjected to additional cropping, thereby increasing the proportion of small-airway samples and improving the model’s sensitivity to such structures. Finally, random cropping was performed over the entire volume to introduce further variability. These three cropping strategies were integrated in the training set with a ratio of 7:2:1.
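One simple way to realize a 7:2:1 mix is to draw the cropping strategy for each training patch with matching weights. The sketch below assumes per-patch random sampling, which is only one possible reading of the ratio; the strategy labels and the function `sample_crop_strategies` are hypothetical.

```python
import random

# Hypothetical labels for the three cropping strategies described above.
CROP_STRATEGIES = ["trachea_sliding_window", "small_airway", "random"]
CROP_RATIO = [7, 2, 1]


def sample_crop_strategies(n, seed=0):
    """Draw n strategy labels with probabilities proportional to 7:2:1."""
    rng = random.Random(seed)  # local generator keeps the draw reproducible
    return rng.choices(CROP_STRATEGIES, weights=CROP_RATIO, k=n)


plan = sample_crop_strategies(10000)
share = {s: plan.count(s) / len(plan) for s in CROP_STRATEGIES}
```

An alternative with the same expected mix is to pre-generate a fixed number of patches per strategy; sampling is shown here only because it is compact.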
To comprehensively evaluate the segmentation performance, the following metrics were adopted:
$$ \mathrm{Dice} = \frac{2 \cdot |P \cap G|}{|P| + |G|}, $$
where P denotes the predicted segmentation, and G denotes the ground truth. A higher Dice score indicates a larger overlap between the prediction and the ground truth.
$$ \mathrm{Sensitivity} = \frac{TP}{TP + FN} \times 100\%, $$
where TP and FN represent the number of true positives and false negatives, respectively. Sensitivity measures the proportion of correctly detected airway voxels relative to all ground-truth airway voxels.
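The two overlap metrics defined so far can be computed directly from flattened binary masks, as in the following sketch. The function name and the conventions for empty masks are assumptions made for illustration.

```python
def dice_and_sensitivity(pred, gt):
    """pred, gt: equal-length sequences of 0/1 voxel labels (flattened masks)."""
    tp = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 1)
    fn = sum(1 for p, g in zip(pred, gt) if p == 0 and g == 1)
    pred_size, gt_size = sum(pred), sum(gt)
    # ASSUMED conventions: Dice = 1 if both masks are empty, sensitivity = 0 if gt is empty.
    dice = 2 * tp / (pred_size + gt_size) if (pred_size + gt_size) else 1.0
    sensitivity = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    return dice, sensitivity


pred = [1, 1, 0, 0, 1]
gt = [1, 0, 1, 0, 1]
dice, sens = dice_and_sensitivity(pred, gt)  # 2 true positives, 1 false negative
```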
$$ \mathrm{TD} = \frac{1}{N} \sum_{i=1}^{N} \left\lVert t_i^{(p)} - t_i^{(g)} \right\rVert_2, $$
where t_i^(p) and t_i^(g) denote the coordinates of the i-th predicted and ground-truth terminal points, respectively, and N is the total number of terminal points.
$$ \mathrm{BD} = \frac{1}{M} \sum_{j=1}^{M} \min_{k} \left\lVert b_j^{(p)} - b_k^{(g)} \right\rVert_2, $$
where b_j^(p) denotes the j-th sampled point on the predicted branches, b_k^(g) denotes sampled points on the ground-truth branches, and M is the total number of sampled points. In addition, TD (tree length detected rate) measures the proportion of the predicted airway centerline length that is correctly detected relative to the total centerline length of the ground truth airway tree, and BD (branch detected rate) measures the proportion of correctly detected airway branches relative to the total number of branches in the ground truth. A predicted branch is considered correctly detected only if at least 80% of its centerline voxels lie within the corresponding ground truth branch [6].
$$ \mathrm{Leakages} = \frac{\mathrm{Length}_{\mathrm{false\_branches}}}{\mathrm{Length}_{\mathrm{all\_branches}}} \times 100\%, $$
where Length_false_branches denotes the total length of non-anatomical false branches predicted by the model. This metric reflects the proportion of spurious airway connections in the segmentation results. Leakages refers to the erroneous extension of the segmentation into adjacent non-airway regions in areas where airway boundaries are blurred, leading to predicted airways that do not reflect true anatomical structures and reducing overall segmentation accuracy. This phenomenon is particularly common in low-contrast peripheral bronchi. Clinically, such leakage can negatively impact airway structure visualization, navigation, and quantitative analysis, thereby reducing the reliability of automated airway models in diagnostic and intraoperative planning applications [6].
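The rate-style topology metrics (TD and BD as percentages, plus leakages) can be sketched as follows. The sketch assumes that ground-truth centerline voxels are already extracted and grouped by branch, that each ground-truth centerline voxel is tested against the predicted mask, and that length is approximated by voxel count; the input format and the function names `td_bd` and `leakage_rate` are hypothetical.

```python
def td_bd(gt_branches, pred_mask, branch_threshold=0.8):
    """Return (TD %, BD %) from ground-truth centerlines and a predicted mask.

    gt_branches: dict mapping a branch id to its list of centerline voxel coordinates
    pred_mask:   set of voxel coordinates predicted as airway
    """
    total_len = detected_len = detected_branches = 0
    for voxels in gt_branches.values():
        hits = sum(1 for v in voxels if v in pred_mask)
        total_len += len(voxels)
        detected_len += hits
        # A branch counts as detected when at least 80% of its centerline is covered.
        if voxels and hits / len(voxels) >= branch_threshold:
            detected_branches += 1
    td = 100.0 * detected_len / total_len
    bd = 100.0 * detected_branches / len(gt_branches)
    return td, bd


def leakage_rate(false_branch_length, all_branch_length):
    """Leakages (%) as the share of branch length that is spurious."""
    return 100.0 * false_branch_length / all_branch_length


gt_branches = {
    "trachea": [(0, 0, z) for z in range(5)],
    "left_main": [(1, 0, z) for z in range(5)],
}
# Four of five trachea voxels and two of five left-main voxels are covered.
pred_mask = {(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, 3), (1, 0, 0), (1, 0, 1)}
td, bd = td_bd(gt_branches, pred_mask)
```

In this toy case six of ten centerline voxels are covered, and only the trachea reaches the 80% threshold, so half of the branches count as detected even though more than half of the tree length is recovered.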

4.3. Comparative Evaluation with State-of-the-Art Models

To validate the effectiveness of the proposed AirwaySeekNet model for lung airway segmentation, we compared it with several state-of-the-art 3D medical image segmentation models on the same test set, including WingsNet [12], UNet3D [16], VNet3D [17], VoxResNet [42], AttentionUNet [43], CoTAttentionUNet3D [44], FuzzyAttentionUNet3D [33], and DSCNet [38]. The evaluation metrics covered Dice, sensitivity, TD, BD, and leakages. The experimental results are presented in Table 1. Compared with DSCNet, AirwaySeekNet increased the TD metric by 5.55% and the BD metric by 8.14%. The qualitative segmentation results for each model are illustrated in Figure 3.
To further assess the generalization ability and effectiveness of the proposed method, we conducted comparative experiments on the publicly available Binary Airway Segmentation (BAS) dataset [25]. Our method was quantitatively compared with several representative approaches, including those of Juarez et al. [45], AirwayNet [19], FRNet [26], and WingsNet [12]. The BAS dataset comprises 90 CT cases with pixel spacing and slice thickness both below 1 mm, of which 70 cases are from the LIDC [46] dataset and 20 cases are from the training set of the EXACT’09 Challenge [13]. The data were randomly split into training, validation, and test sets at a ratio of 5:2:2.
The results in Table 2 indicate that the proposed method achieves a sensitivity of 93.79%, demonstrating its improved ability to detect airway structures, especially small and peripheral branches. For the topology-oriented metrics reflecting airway tree completeness and connectivity, the Tree length detected rate (TD) and Branch detected rate (BD) of the proposed method reach 91.97% and 89.88%, respectively, outperforming the other four methods. Although the Dice score is comparable with other approaches, the consistent improvements observed across multiple metrics indicate that the proposed method provides stable and reliable segmentation performance, thereby validating its effectiveness and robustness across different data distributions. In addition, experimental evaluation shows that the segmentation module achieved an average inference time of approximately 4.2842 s per single CT volume input.
As shown in the results, most methods perform similarly in terms of Dice and sensitivity. However, AirwaySeekNet significantly outperforms all comparative models in TD and BD, indicating its superiority in capturing complex bronchial bifurcations and maintaining topological connectivity. At the same time, AirwaySeekNet exhibits relatively higher leakages, reflecting the model’s tendency to explore smaller airways, which may lead to some false positives. This aligns with the presence of mislabeling in the small branch regions of the dataset. Overall, AirwaySeekNet ensures that the Dice remains unaffected while substantially improving the topological and branching preservation, making it more suitable for clinical applications.

4.4. Hardware and Software Configuration

A consistent hardware and software environment was maintained throughout all experiments: the same GPU (NVIDIA A100 with 80 GB memory), the same version of the deep learning framework (PyTorch v2.4.0), and the same operating system version (Ubuntu 24.04). Visualization was carried out on a machine running Windows 11 with an Intel Core Ultra 7 155H processor and 32 GB of RAM. The Visualization Toolkit (VTK) was employed for three-dimensional rendering and visualization, and the Medical Imaging Interaction Toolkit (MITK) was used for loading, viewing, and annotating medical images.

4.5. Tubular Constraints Based on SDF Loss

To further enhance the structural consistency of small airway regions, we incorporated SDF loss into the Weighted Entropy Select (WES) loss framework. The SDF loss applies topological constraints to the predicted results through the distance field, better preserving the tubular structure. The loss function includes two hyperparameters to balance the weight of different constraint terms.
The experimental results are presented in Table 3. After introducing the SDF loss (where the SDF loss is defined as λ · ϕ(D(x)) + WES; δ is a parameter within ϕ(D(x)), and λ denotes the weight of the SDF loss), the model achieved a significant improvement of 3.64% in TD and 6.21% in BD, while the Dice score remained essentially stable. The segmentation results became more consistent with the true tubular morphology of the airways. When the weight λ of the SDF loss was further increased, TD and BD reached their optimal values, with additional improvements of 4.20% and 5.36%, respectively. However, the leakages also rose sharply by 6.06% compared to the results before introducing the SDF loss. These findings indicate that the SDF loss effectively enhances the structural preservation of fine bronchial branches, although an excessively strong constraint may introduce more false positives. Therefore, the appropriate tuning of hyperparameters is essential to achieve an optimal balance between topology preservation and leakages.

4.6. Dual Decoder Architecture and WES Loss Introduction Strategies

AirwaySeekNet adopts a dual-decoder architecture. Initially, we followed the WingsNet design, but its use of a conventional convolutional decoder proved inadequate for the complex, highly variable tubular geometry encountered in lung airway segmentation, thereby limiting the exploration and recognition of small airways. To remedy this, we replaced one of the decoders with a Snake Convolution decoder to strengthen the network’s capacity for modeling tubular features. During training, we investigated two annealing strategies: inter-epoch annealing, which operates across epochs, and inner-epoch annealing, which operates within the iterations of an epoch. Both strategies use the dynamic threshold ρ(t) as the proportional parameter for annealing. This threshold dynamically modulates the learning trajectory, helping to avoid entrapment in local optima and thereby facilitating more effective model convergence. The experimental results are shown in Table 4.
As shown in the table, under the WingsNet dual-decoder structure, the model trained with the epoch strategy achieved Dice, TD, BD, and leakages of 95.90%, 85.32%, 80.48%, and 3.61%, respectively. In contrast, the inter strategy achieved 95.91%, 86.95%, 82.11%, and 3.45%, respectively. These results indicate that introducing WES loss using the inter strategy leads to higher TD and BD values and lower leakages compared with the epoch strategy. After incorporating the Snake Convolution, the model using the epoch strategy achieved 94.87%, 90.44%, 88.14%, and 7.91% for the Dice, TD, BD, and leakages, respectively, while the inter strategy yielded 95.04%, 91.37%, 88.69%, and 7.89%. The TD and BD further improved, whereas the Dice slightly decreased, and the leakages increased noticeably compared with the model without Snake Convolution. This suggests that the Snake Convolution enhances the model’s ability to capture fine bronchial branches but also introduces more false positives. Moreover, we observed that the introduction of WES loss not only achieves superior performance but also leads to faster convergence, demonstrating that this strategy provides a more advantageous training approach in practice.

4.7. Summary

Based on the above experimental results, the following conclusions can be drawn. Overall, AirwaySeekNet significantly outperforms existing methods in terms of topological and branching preservation, enabling a more complete recovery of lung airway structures. The introduction of SDF loss further constrains the model’s structural consistency, aligning the results more closely with tubular features, although a reasonable trade-off between topology preservation and false positives must be made when tuning hyperparameters. The dual-decoder architecture combined with Snake Convolution strengthens the model’s ability to capture tubular features, while the inter introduction of WES loss offers superior performance and faster convergence compared to the epoch introduction. In summary, AirwaySeekNet significantly enhances the topology preservation while maintaining stable Dice performance, demonstrating its unique advantages and potential for lung airway segmentation tasks.

5. Discussion and Conclusions

In conclusion, this work presents AirwaySeekNet, a comprehensive solution for fine-grained airway segmentation and completion that specifically targets the distal bronchi beyond the eighth generation. The proposed framework is built on a dual-decoder architecture with a dedicated explicit segmentation branch, yielding an airway model with high segmentation fidelity and full anatomical continuity. The key to this design is the integration of Voxel-Selective Supervision, a dynamic reliability-aware training strategy that addresses class imbalance and incomplete annotations by gradually focusing the learning process on uncertain, hard-to-segment voxels. This mechanism effectively guides the network to “actively explore” potential airway branches that might be missing in the ground truth, without being misled by early false positives. Together, the dual-decoder architecture and VSS strategy constitute the core technical contributions of AirwaySeekNet, collectively geared toward capturing the smallest airway structures with high confidence.
Advancing Topological Continuity in Airway Segmentation: Topology preservation is critically important in airway segmentation due to the complex tree structure of the bronchial network and its relevance in clinical applications. Unlike voxel-level overlap metrics, topological connectivity directly reflects whether the segmented airway tree is continuously reconstructed, which is essential for clinical tasks such as bronchoscopy navigation and quantitative airway assessment. Public benchmarks, such as the Multi-site, Multi-domain Airway Tree Modeling (ATM’22) challenge, establish tree length detected rate (TD) and branch detected rate (BD) as key metrics for evaluating the completeness and connectivity of airway segmentation, emphasizing the need to maintain structural continuity across different data sources and scanning protocols [6].
The experimental results show that AirwaySeekNet substantially improves both the tree length detected rate (TD) and the branch detected rate (BD) compared to existing approaches, indicating its enhanced ability to reconstruct a more complete airway network. Importantly, these topological gains do not come at the expense of voxel overlap accuracy. The Dice similarity coefficients remain comparable to those of other methods, as shown in Table 1. This balance is significant in practical terms, demonstrating that the model can extend segmentation into peripheral branches while preserving the structural integrity of central airway segments.
While a modest increase in leakage is observed during topology enhancement, this behavior primarily reflects the network’s aggressive exploration of fine or previously unannotated airway segments. By employing dynamic supervision and carefully tuned loss weighting, AirwaySeekNet achieves a favorable trade-off between retrieving true airway structures and limiting excessive over-segmentation.
Future work: In future work, we will focus on integrating segmentation and shape-based optimization within a unified end-to-end training and inference framework to further enhance topological continuity and overall robustness. Additionally, although the current method has been validated on porcine airway data as a preparatory step for surgical robot integration, the next stage involves full deployment on the robotic platform, integration with surgical control software, and in vivo animal experiments to further assess performance in real interventional contexts.
Under the current hardware configuration, the model’s inference time is satisfactory and suitable for preoperative segmentation, as there is sufficient time to complete segmentation once CT images are obtained. An AI workstation equipped with an NVIDIA GeForce RTX 4090 has been deployed alongside our interventional surgical robot system, demonstrating technical feasibility. However, given the high cost and limited suitability of the RTX 4090 for embedded deployment, we will explore more optimized hardware integration solutions. To improve applicability in low-resource environments, we have already carried out research on model inference with lower-performance GPUs and CPUs toward more lightweight and practical implementations [47,48].

Author Contributions

P.C. led the conceptualization, methodology, resources, software development, validation, visualization, and the preparation of the original draft and shared equally in project administration and writing—review and editing. D.Z. contributed equally to conceptualization and formal analysis, led project administration, supervision, and writing—review and editing. J.Z. contributed equally to formal analysis and methodology, provided resources, and participated in writing—review and editing. X.W. contributed equally to methodology, software development, and validation. J.X. provided supporting formal analysis and contributed equally to methodology and software. C.L. provided support for formal analysis. T.H. contributed to investigation in a supporting role. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Fundo para o Desenvolvimento das Ciências e da Tecnologia Macao grant number (0159/2024/AMJ), National Natural Science Foundation of China grant number (No. 82402410), and National Key Research and Development Program of China grant number (2024YFE0201700).

Institutional Review Board Statement

The use of animal data in this study was approved by the Institutional Animal Care and Use Committee (IACUC). IACUC Statement 1: Committee Name: Silver Snake (GUANG ZHOU) Medical Science & Technology Co., Ltd. and IACUC; Institution: Zhongda Hospital Southeast University; Approval ID/Number: SS-2024-HLXG 2; Approval Date: 10 May 2024. IACUC Statement 2: Committee Name: Silver Snake (GUANG ZHOU) Medical Science & Technology Co., Ltd. and IACUC; Institution: Hanglok-Tech Co., Ltd.; Approval ID/Number: SS-2024-HLXG 1; Approval Date: 10 May 2024. IACUC Statement 3: Committee Name: Silver Snake (GUANG ZHOU) Medical Science & Technology Co., Ltd. and IACUC; Institution: Hanglok-Tech Co., Ltd.; Approval ID/Number: SS-2024-HL CT 1; Approval Date: 20 June 2024.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this study contain commercially sensitive information and cannot be made publicly available upon publication. However, the data that support the findings of this study are available upon reasonable request from the authors.

Acknowledgments

The authors acknowledge support from Fundo para o Desenvolvimento das Ciências e da Tecnologia Macao under Grant 0159/2024/AMJ, the National Natural Science Foundation of China (No. 82402410), and the National Key Research and Development Program (2024YFE0201700) and gratefully acknowledge the GPU computing resources (1 × NVIDIA A100 GPU/80G, 1 × NVIDIA GeForce RTX 4090 24G) provided by the School of Computer Science and Engineering, the Faculty of Innovation Engineering, Macau University of Science and Technology.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Diaz, A.A.; Estépar, R.S.J.; Washko, G.R. Computed tomographic airway morphology in chronic obstructive pulmonary disease. Remodeling or innate anatomy? Ann. Am. Thorac. Soc. 2016, 13, 4–9.
  2. Eddy, R.L.; Svenningsen, S.; Kirby, M.; Knipping, D.; McCormack, D.G.; Licskai, C.; Nair, P.; Parraga, G. Is computed tomography airway count related to asthma severity and airway structure and function? Am. J. Respir. Crit. Care Med. 2020, 201, 923–933.
  3. Díaz, A.A.; Nardelli, P.; Wang, W.; San José Estépar, R.; Yen, A.; Kligerman, S.; Maselli, D.J.; Dolliver, W.R.; Tsao, A.; Orejas, J.L.; et al. Artificial intelligence–based CT assessment of bronchiectasis: The COPDGene Study. Radiology 2022, 307, e221109.
  4. Ravikumar, N.; Ho, E.; Wagh, A.; Murgu, S. Advanced imaging for robotic bronchoscopy: A review. Diagnostics 2023, 13, 990.
  5. Garcia-Uceda, A.; Selvan, R.; Saghir, Z.; Tiddens, H.A.; de Bruijne, M. Automatic airway segmentation from computed tomography using robust and efficient 3-D convolutional neural networks. Sci. Rep. 2021, 11, 16001.
  6. Zhang, M.; Wu, Y.; Zhang, H.; Qin, Y.; Zheng, H.; Tang, W.; Arnold, C.; Pei, C.; Yu, P.; Nan, Y.; et al. Multi-site, multi-domain airway tree modeling. Med. Image Anal. 2023, 90, 102957.
  7. McNulty, W.; Usmani, O.S. Techniques of assessing small airways dysfunction. Eur. Clin. Respir. J. 2014, 1, 25898.
  8. Liwsrisakun, C.; Chaiwong, W.; Pothirat, C. Comparative assessment of small airway dysfunction by impulse oscillometry and spirometry in chronic obstructive pulmonary disease and asthma with and without fixed airflow obstruction. Front. Med. 2023, 10, 1181188.
  9. Usmani, O.S.; Singh, D.; Spinola, M.; Bizzi, A.; Barnes, P.J. The prevalence of small airways disease in adult asthma: A systematic literature review. Respir. Med. 2016, 116, 19–27.
  10. San José Estépar, R.; Reilly, J.J.; Silverman, E.K.; Washko, G.R. Three-dimensional airway measurements and algorithms. Proc. Am. Thorac. Soc. 2008, 5, 905–909.
  11. Tschirren, J.; Hoffman, E.A.; McLennan, G.; Sonka, M. Intrathoracic airway trees: Segmentation and airway morphology analysis from low-dose CT scans. IEEE Trans. Med. Imaging 2005, 24, 1529–1539.
  12. Zheng, H.; Qin, Y.; Gu, Y.; Xie, F.; Yang, J.; Sun, J.; Yang, G.Z. Alleviating class-wise gradient imbalance for pulmonary airway segmentation. IEEE Trans. Med. Imaging 2021, 40, 2452–2462.
  13. Lo, P.; Van Ginneken, B.; Reinhardt, J.M.; Yavarna, T.; De Jong, P.A.; Irving, B.; Fetita, C.; Ortner, M.; Pinho, R.; Sijbers, J.; et al. Extraction of airways from CT (EXACT’09). IEEE Trans. Med. Imaging 2012, 31, 2093–2107.
  14. Zhang, M.; Yang, G.Z.; Gu, Y. Differentiable topology-preserved distance transform for pulmonary airway segmentation. arXiv 2022, arXiv:2209.08355.
  15. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, 5–9 October 2015; Springer: Berlin/Heidelberg, Germany, 2015; pp. 234–241.
  16. Çiçek, Ö.; Abdulkadir, A.; Lienkamp, S.S.; Brox, T.; Ronneberger, O. 3D U-Net: Learning dense volumetric segmentation from sparse annotation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Athens, Greece, 17–21 October 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 424–432.
  17. Milletari, F.; Navab, N.; Ahmadi, S.A. V-net: Fully convolutional neural networks for volumetric medical image segmentation. In Proceedings of the 2016 Fourth International Conference on 3D Vision (3DV), Stanford, CA, USA, 25–28 October 2016; IEEE: New York, NY, USA, 2016; pp. 565–571.
  18. Charbonnier, J.P.; Van Rikxoort, E.M.; Setio, A.A.; Schaefer-Prokop, C.M.; van Ginneken, B.; Ciompi, F. Improving airway segmentation in computed tomography using leak detection with convolutional networks. Med. Image Anal. 2017, 36, 52–60.
  19. Qin, Y.; Chen, M.; Zheng, H.; Gu, Y.; Shen, M.; Yang, J.; Huang, X.; Zhu, Y.M.; Yang, G.Z. Airwaynet: A voxel-connectivity aware approach for accurate airway segmentation using convolutional neural networks. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Shenzhen, China, 13–17 October 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 212–220.
  20. Zhao, T.; Yin, Z.; Wang, J.; Gao, D.; Chen, Y.; Mao, Y. Bronchus segmentation and classification by neural networks and linear programming. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Shenzhen, China, 13–17 October 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 230–239.
  21. Wang, C.; Hayashi, Y.; Oda, M.; Itoh, H.; Kitasaka, T.; Frangi, A.F.; Mori, K. Tubular structure segmentation using spatial fully connected network with radial distance loss for 3D medical images. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Shenzhen, China, 13–17 October 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 348–356.
  22. Song, Y.; Liu, Y.; Lin, Z.; Zhou, J.; Li, D.; Zhou, T.; Leung, M.F. Learning from AI-generated annotations for medical image segmentation. IEEE Trans. Consum. Electron. 2024, 71, 1473–1481.
  23. Häkkinen, I.; Melekhov, I.; Englesson, E.; Azizpour, H.; Kannala, J. Medical Image Segmentation with SAM-Generated Annotations. In Proceedings of the European Conference on Computer Vision, Milan, Italy, 29 September–4 October 2024; Springer: Berlin/Heidelberg, Germany, 2024; pp. 51–62.
  24. Wang, L.; Guo, D.; Wang, G.; Zhang, S. Annotation-efficient learning for medical image segmentation based on noisy pseudo labels and adversarial learning. IEEE Trans. Med. Imaging 2020, 40, 2795–2807.
  25. Qin, Y.; Gu, Y.; Zheng, H.; Chen, M.; Yang, J.; Zhu, Y.M. AirwayNet-SE: A simple-yet-effective approach to improve airway segmentation using context scale fusion. In Proceedings of the 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI), Iowa City, IA, USA, 3–7 April 2020; IEEE: New York, NY, USA, 2020; pp. 809–813.
  26. Qin, Y.; Zheng, H.; Gu, Y.; Huang, X.; Yang, J.; Wang, L.; Zhu, Y.M. Learning bronchiole-sensitive airway segmentation CNNs by feature recalibration and attention distillation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Lima, Peru, 4–8 October 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 221–231.
  27. Zhou, K.; Chen, N.; Xu, X.; Wang, Z.; Guo, J.; Liu, L.; Yi, Z. Automatic airway tree segmentation based on multi-scale context information. Int. J. Comput. Assist. Radiol. Surg. 2021, 16, 219–230.
  28. Nadeem, S.A.; Hoffman, E.A.; Sieren, J.C.; Comellas, A.P.; Bhatt, S.P.; Barjaktarevic, I.Z.; Abtin, F.; Saha, P.K. A CT-based automated algorithm for airway segmentation using freeze-and-grow propagation and deep learning. IEEE Trans. Med. Imaging 2020, 40, 405–418.
  29. Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988.
  30. Selvan, R.; Kipf, T.; Welling, M.; Juarez, A.G.U.; Pedersen, J.H.; Petersen, J.; de Bruijne, M. Graph refinement based airway extraction using mean-field networks and graph neural networks. Med. Image Anal. 2020, 64, 101751.
  31. Shit, S.; Paetzold, J.C.; Sekuboyina, A.; Ezhov, I.; Unger, A.; Zhylka, A.; Pluim, J.P.; Bauer, U.; Menze, B.H. clDice-a novel topology-preserving loss function for tubular structure segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 16560–16569.
  32. Zheng, H.; Qin, Y.; Gu, Y.; Xie, F.; Sun, J.; Yang, J.; Yang, G.Z. Refined local-imbalance-based weight for airway segmentation in CT. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Strasbourg, France, 27 September–1 October 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 410–419. [Google Scholar]
  33. Nan, Y.; Del Ser, J.; Tang, Z.; Tang, P.; Xing, X.; Fang, Y.; Herrera, F.; Pedrycz, W.; Walsh, S.; Yang, G. Fuzzy attention neural network to tackle discontinuity in airway segmentation. IEEE Trans. Neural Netw. Learn. Syst. 2023, 35, 7391–7404. [Google Scholar] [CrossRef] [PubMed]
  34. Yu, W.; Zheng, H.; Zhang, M.; Zhang, H.; Sun, J.; Yang, J. Break: Bronchi reconstruction by geodesic transformation and skeleton embedding. In Proceedings of the 2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI), Kolkata, India, 28–31 March 2022; IEEE: New York, NY, USA, 2022; pp. 1–5. [Google Scholar]
  35. Wang, Y.; Wei, X.; Liu, F.; Chen, J.; Zhou, Y.; Shen, W.; Fishman, E.K.; Yuille, A.L. Deep distance transform for tubular structure segmentation in CT scans. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 3833–3842. [Google Scholar]
  36. Ma, J.; Wei, Z.; Zhang, Y.; Wang, Y.; Lv, R.; Zhu, C.; Gaoxiang, C.; Liu, J.; Peng, C.; Wang, L.; et al. How distance transform maps boost segmentation CNNs: An empirical study. In Proceedings of the Medical Imaging with Deep Learning, Montreal, QC, Canada, 6–8 July 2020; PMLR: Cambridge, MA, USA, 2020; pp. 479–492. [Google Scholar]
  37. Bian, Z.; Charbonnier, J.P.; Liu, J.; Zhao, D.; Lynch, D.A.; van Ginneken, B. Small airway segmentation in thoracic computed tomography scans: A machine learning approach. Phys. Med. Biol. 2018, 63, 155024. [Google Scholar] [CrossRef]
  38. Qi, Y.; He, Y.; Qi, X.; Zhang, Y.; Yang, G. Dynamic snake convolution based on topological geometric constraints for tubular structure segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 1–6 October 2023; pp. 6070–6079. [Google Scholar]
  39. Arpit, D.; Jastrzębski, S.; Ballas, N.; Krueger, D.; Bengio, E.; Kanwal, M.S.; Maharaj, T.; Fischer, A.; Courville, A.; Bengio, Y.; et al. A closer look at memorization in deep networks. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; PMLR: Cambridge, MA, USA, 2017; pp. 233–242. [Google Scholar]
  40. Fang, C.; Wang, Q.; Cheng, L.; Gao, Z.; Pan, C.; Cao, Z.; Zheng, Z.; Zhang, D. Reliable mutual distillation for medical image segmentation under imperfect annotations. IEEE Trans. Med. Imaging 2023, 42, 1720–1734. [Google Scholar] [CrossRef] [PubMed]
  41. Zhang, M.; Zhang, H.; You, X.; Yang, G.Z.; Gu, Y. Implicit Representation Embraces Challenging Attributes of Pulmonary Airway Tree Structures. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Marrakesh, Morocco, 6–10 October 2024; Springer: Berlin/Heidelberg, Germany, 2024; pp. 546–556. [Google Scholar]
  42. Chen, H.; Dou, Q.; Yu, L.; Qin, J.; Heng, P.A. VoxResNet: Deep voxelwise residual networks for brain segmentation from 3D MR images. NeuroImage 2018, 170, 446–455. [Google Scholar] [CrossRef]
  43. Oktay, O.; Schlemper, J.; Folgoc, L.L.; Lee, M.; Heinrich, M.; Misawa, K.; Mori, K.; McDonagh, S.; Hammerla, N.Y.; Kainz, B.; et al. Attention U-Net: Learning where to look for the pancreas. arXiv 2018, arXiv:1804.03999. [Google Scholar] [CrossRef]
  44. Islam, M.; Vibashan, V.; Jose, V.J.M.; Wijethilake, N.; Utkarsh, U.; Ren, H. Brain tumor segmentation and survival prediction using 3D attention UNet. In Proceedings of the International MICCAI Brainlesion Workshop, Shenzhen, China, 17 October 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 262–272. [Google Scholar]
  45. Garcia-Uceda Juarez, A.; Tiddens, H.A.; de Bruijne, M. Automatic airway segmentation in chest CT using convolutional neural networks. In Proceedings of the International Workshop on Reconstruction and Analysis of Moving Body Organs, Granada, Spain, 16–20 September 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 238–250. [Google Scholar]
  46. Armato, S.G., III; McLennan, G.; Bidaut, L.; McNitt-Gray, M.F.; Meyer, C.R.; Reeves, A.P.; Zhao, B.; Aberle, D.R.; Henschke, C.I.; Hoffman, E.A.; et al. The lung image database consortium (LIDC) and image database resource initiative (IDRI): A completed reference database of lung nodules on CT scans. Med. Phys. 2011, 38, 915–931. [Google Scholar] [CrossRef] [PubMed]
  47. Lyu, P.; Xiong, J.; Fang, W.; Zhang, W.; Wang, C.; Zhu, J. Advancing multi-organ and pan-cancer segmentation in abdominal CT scans through scale-aware and self-attentive modulation. In MICCAI Challenge on Fast and Low-Resource Semi-Supervised Abdominal Organ Segmentation; Springer: Berlin/Heidelberg, Germany, 2023; pp. 84–101. [Google Scholar]
  48. Xiong, J.; Lyu, P.; Lin, T.; Song, K.; Wang, C.; Zhu, J. A highly efficient segmentation method for abdominal multi-organs on laptop. In MICCAI Challenge on Fast and Low-Resource Semi-supervised Abdominal Organ Segmentation; Springer: Berlin/Heidelberg, Germany, 2024; pp. 116–131. [Google Scholar]
Figure 1. Overview of the proposed dual-decoder segmentation framework. The input CT volume $X$ is cropped into sub-volumes and encoded by the grouped-convolutional encoder $Enc(\cdot)$. The encoded features are decoded along two parallel branches: the upsampling decoder $Dec_1(\cdot)$ produces the probability map $P_1$ and the distance map $D_1$, while the serpentine convolution decoder $Dec_2(\cdot)$ outputs $P_2$ and $D_2$. The segmentation supervision loss $L_{dec}^{sup}$ supervises $P_1$ and $P_2$ against the manual annotation $Y$, while the regression loss $L_{dtm}$ aligns $D_1$ and $D_2$ with the ground-truth distance map $D$. A consistency loss $L_{con}$ regularizes the feature spaces of the two decoders.
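The way the three loss terms named in the Figure 1 caption might be combined can be sketched as follows. This is only an illustration under stated assumptions: plain binary cross-entropy stands in for the segmentation supervision loss, mean squared error is assumed for both the distance regression and the feature consistency terms, and the weights `w_dtm` and `w_con` are hypothetical names that do not come from the paper.

```python
import math


def bce(probs, labels, eps=1e-7):
    """Mean binary cross-entropy; a stand-in for the supervision loss (assumption)."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


def mse(a, b):
    """Mean squared error; assumed form of the regression and consistency terms."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def total_loss(p1, p2, d1, d2, f1, f2, y, d, w_dtm=1.0, w_con=1.0):
    """Combine the three terms for two decoders.

    p1, p2: probability maps of the two decoders (flattened)
    d1, d2: predicted distance maps; d: ground-truth distance map
    f1, f2: decoder features compared by the consistency term
    w_dtm, w_con: hypothetical weighting factors
    """
    sup = bce(p1, y) + bce(p2, y)      # both decoders supervised by Y
    dtm = mse(d1, d) + mse(d2, d)      # both distance maps regressed to D
    con = mse(f1, f2)                  # agreement between decoder features
    return sup + w_dtm * dtm + w_con * con


if __name__ == "__main__":
    y = [1, 0]
    d = [0.0, 0.0]
    print(total_loss([0.5, 0.5], [0.5, 0.5], d, d, [0.2], [0.2], y, d))
```

When both distance maps match $D$ and the two feature vectors agree, only the supervision term contributes, which makes the role of each term easy to check in isolation.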
Figure 2. The label is used to initialize two weights: $w_1$, derived from neighborhood density, and $w_2$, derived from the distance transform. Their sum forms the final weight $w = w_1 + w_2$. The label $Y$ and the predicted probability map $P$ are compared by voxel-wise cross-entropy, and the per-voxel losses are then sorted and selected. The resulting selection mask is multiplied by $w$ to obtain the weighted cross-entropy loss. This cross-entropy-based voxel-selective supervision excludes voxel positions whose supervision is erroneous.
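The pipeline in the Figure 2 caption (per-voxel cross-entropy, sorting, selection, then weighting by $w = w_1 + w_2$) can be sketched in a few lines. The caption does not state which voxels survive the selection, so this sketch assumes that the highest-loss voxels are the ones dropped; the `keep_ratio` parameter and the function name are hypothetical, and the weights are taken as given inputs rather than computed from the label.

```python
import math


def voxel_selective_wce(probs, labels, w1, w2, keep_ratio=0.9, eps=1e-7):
    """Weighted cross-entropy over a selected subset of voxels.

    All inputs are flat sequences with one entry per voxel.
    keep_ratio (hypothetical): fraction of voxels, ranked by ascending
    cross-entropy, that remains supervised.
    """
    n = len(probs)
    # Per-voxel binary cross-entropy between label Y and probability P.
    ce = []
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        ce.append(-(y * math.log(p) + (1 - y) * math.log(1.0 - p)))
    # Sort voxel indices by loss and keep the lowest-loss fraction (assumption).
    order = sorted(range(n), key=lambda i: ce[i])
    n_keep = max(1, int(round(keep_ratio * n)))
    mask = [0.0] * n
    for i in order[:n_keep]:
        mask[i] = 1.0
    # Final weight w = w1 + w2, applied only where the mask is 1.
    weighted = [mask[i] * (w1[i] + w2[i]) * ce[i] for i in range(n)]
    return sum(weighted) / sum(mask), mask


if __name__ == "__main__":
    probs = [0.9, 0.8, 0.1, 0.05]
    labels = [1, 1, 0, 1]          # the last voxel disagrees strongly with P
    loss, mask = voxel_selective_wce(probs, labels, [1.0] * 4, [0.0] * 4, keep_ratio=0.75)
    print(loss, mask)
```

In the small example, the fourth voxel has by far the largest cross-entropy, so it is masked out and contributes nothing to the loss.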
Figure 3. Comparison of airway segmentation results between the ground truth, AirwaySeekNet (Ours), and other methods on our internal dataset. Red regions indicate true positives (TP), green regions indicate false positives (FP), and blue regions indicate false negatives (FN).
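The TP/FP/FN color coding in Figure 3 corresponds to a voxel-wise comparison of a binary prediction with the ground truth. The sketch below shows that comparison on flattened masks, together with the standard Dice and sensitivity formulas that use the same counts; the function names are illustrative and are not taken from the paper.

```python
def confusion_labels(pred, truth):
    """Label each voxel as TP, FP, FN, or TN from two binary masks."""
    out = []
    for p, t in zip(pred, truth):
        if p and t:
            out.append("TP")  # red in Figure 3
        elif p:
            out.append("FP")  # green in Figure 3
        elif t:
            out.append("FN")  # blue in Figure 3
        else:
            out.append("TN")  # background; no color is listed in the caption
    return out


def dice_and_sensitivity(labels):
    """Standard Dice = 2TP / (2TP + FP + FN) and sensitivity = TP / (TP + FN)."""
    tp = labels.count("TP")
    fp = labels.count("FP")
    fn = labels.count("FN")
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 0.0
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    return dice, sens


if __name__ == "__main__":
    pred = [1, 1, 0, 0, 1]
    truth = [1, 0, 1, 0, 1]
    labels = confusion_labels(pred, truth)
    print(labels, dice_and_sensitivity(labels))
```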
Table 1. Comparison of different models for lung airway segmentation.

| Method | Dice (%) | Sensitivity (%) | TD (%) | BD (%) | Leakages (%) |
|---|---|---|---|---|---|
| WingsNet [12] | 95.81 | 97.18 | 85.80 | 80.38 | 4.99 |
| UNet3D [16] | 96.28 | 95.29 | 83.22 | 75.78 | 2.65 |
| VNet [17] | 96.14 | 95.82 | 84.58 | 77.15 | 3.51 |
| VoxResNet [42] | 96.09 | 95.52 | 83.98 | 76.31 | 3.28 |
| AttentionUNet [43] | 94.50 | 92.21 | 76.51 | 67.55 | 2.78 |
| CoTAttention_UNet3D [44] | 95.51 | 95.49 | 82.33 | 76.57 | 4.48 |
| FuzzyAttention_3DUNet [33] | 95.65 | 95.88 | 84.46 | 78.75 | 4.63 |
| DSCNet [38] | 95.97 | 96.50 | 85.82 | 80.55 | 4.63 |
| AirwaySeekNet (Ours) | 95.04 | 97.66 | 91.37 | 88.69 | 7.90 |
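The TD and BD gains quoted in the abstract (5.55 and 8.14) can be reproduced from Table 1 as absolute differences, in percentage points, between AirwaySeekNet and the best-scoring baseline on each metric. The short check below copies the TD, BD, and leakage columns from the table; the variable names are illustrative.

```python
# (TD %, BD %, Leakages %) copied from Table 1.
baselines = {
    "WingsNet": (85.80, 80.38, 4.99),
    "UNet3D": (83.22, 75.78, 2.65),
    "VNet": (84.58, 77.15, 3.51),
    "VoxResNet": (83.98, 76.31, 3.28),
    "AttentionUNet": (76.51, 67.55, 2.78),
    "CoTAttention_UNet3D": (82.33, 76.57, 4.48),
    "FuzzyAttention_3DUNet": (84.46, 78.75, 4.63),
    "DSCNet": (85.82, 80.55, 4.63),
}
ours = (91.37, 88.69, 7.90)  # AirwaySeekNet row

# Best baseline per metric (higher is better for TD and BD).
best_td_name = max(baselines, key=lambda k: baselines[k][0])
best_bd_name = max(baselines, key=lambda k: baselines[k][1])

td_gain = round(ours[0] - baselines[best_td_name][0], 2)
bd_gain = round(ours[1] - baselines[best_bd_name][1], 2)
# Leakage change relative to the same baseline that leads on TD.
leak_change = round(ours[2] - baselines[best_td_name][2], 2)

print(best_td_name, td_gain)   # TD gain in percentage points
print(best_bd_name, bd_gain)   # BD gain in percentage points
print(leak_change)             # leakage increase in percentage points
```

The same comparison shows the trade-off noted in the abstract: the leakage rate of AirwaySeekNet is higher than that of the baseline it is compared against.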
Table 2. Performance comparison of airway segmentation methods on the BAS dataset.

| Method | Dice (%) | Sensitivity (%) | TD (%) | BD (%) |
|---|---|---|---|---|
| Juarez et al. [45] | 92.25 | 92.68 | 83.93 | 81.69 |
| AirwayNet [19] | 93.67 | 92.54 | 83.62 | 81.25 |
| FRNet [26] | 92.44 | 92.97 | 90.78 | 86.86 |
| WingsNet [12] | 92.26 | 93.21 | 91.84 | 89.72 |
| AirwaySeekNet (Ours) | 92.33 | 93.79 | 91.97 | 89.88 |
Table 3. Experimental results when the SDF loss is added to the weighted entropy select (WES) loss.

| Loss | Loss Parameters | Dice (%) | Sensitivity (%) | TD (%) | BD (%) | Leakages (%) |
|---|---|---|---|---|---|---|
| WES Loss | / | 95.96 | 95.25 | 83.44 | 76.93 | 3.26 |
| WES Loss + SDF Loss | λ = 1, δ = 1 | 95.93 | 95.97 | 87.08 | 83.14 | 4.10 |
| WES Loss + SDF Loss | λ = 100, δ = 10 | 94.44 | 97.74 | 91.28 | 88.50 | 9.32 |
Table 4. Experimental results with dual decoder architecture and WES loss introduction strategies.

| Dual Decoder Structure | Annealing Strategies | Dice (%) | Sensitivity (%) | TD (%) | BD (%) | Leakages (%) |
|---|---|---|---|---|---|---|
| Conv and Conv | Inter-epoch | 95.90 | 95.46 | 85.32 | 80.48 | 3.61 |
| Conv and Conv | Inner-epoch | 95.91 | 95.34 | 86.95 | 82.11 | 3.45 |
| Conv and SE-Conv | Inter-epoch | 94.87 | 97.34 | 90.44 | 88.14 | 7.91 |
| Conv and SE-Conv | Inner-epoch | 95.04 | 97.66 | 91.37 | 88.69 | 7.89 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

MDPI and ACS Style

Chen, P.; Zhu, J.; Wang, X.; Xiong, J.; Li, C.; Han, T.; Zhang, D. AirwaySeekNet: Fine-Grained Segmentation and Completion of Peripheral Pulmonary Airways with Dynamic Reliability-Aware Supervision. AI 2026, 7, 40. https://doi.org/10.3390/ai7020040
