1. Introduction
A difficult airway denotes the clinical challenges faced by anesthesiologists during mask ventilation or endotracheal intubation [1]. The reported incidence of difficult mask ventilation ranges from 1.4% to 5.0%, whereas that of difficult endotracheal intubation varies between 1.9% and 10% [2,3,4]. Such unpredictable events increase the risks of brain injury and death [5,6] and require specialized skills and complex procedures. Consequently, preoperative identification of patients at risk of a difficult airway is essential to reduce complications and anesthesia-related mortality [7].
Despite decades of study, predicting a difficult airway remains challenging in routine practice. Bedside screening tools such as the Mallampati score, thyromental distance, and related tests are constrained by subjectivity and substantial interobserver variability, which can lead to misclassification. Moreover, these tools capture only coarse surrogates of airway anatomy and often fail to model the complex contextual and sequential dependencies that determine intubation difficulty. AI systems based on facial photographs or other external markers also face practical barriers: they are susceptible to imaging artifacts and generalize poorly across acquisition protocols, and therefore struggle to translate across clinical settings. These limitations motivate more objective imaging-based assessment. Ultrasound is a noninvasive bedside modality that directly visualizes internal airway structures, such as the trachea, the epiglottis, and the tongue base, and it provides more reproducible anatomic information than external inspection. Laryngeal ultrasound enables dynamic evaluation during quiet breathing or phonation and offers repeatability without ionizing radiation. However, it also presents distinct challenges, including acoustic shadowing caused by air and cartilage, dependence on the operator and on probe orientation, and heterogeneity across scanners and protocols, compounded by typically small or weakly labeled datasets. Accordingly, there is a strong need for ultrasound-based deep learning frameworks that extract discriminative features directly from internal airway images and that are designed for robustness and clinical utility in difficult airway assessment. Moreover, owing to subjectivity and contextual variability, the predictive performance of the modified LEMON criteria and the Simplified Airway Risk Index (SARI) remains unstable, with limited consistency across protocols and populations [8,9].
With the rapid global expansion of Artificial Intelligence (AI) applications, its auxiliary role in medical practice has increasingly become a focus of academic and clinical attention. Deep learning, a core technology of AI [10], offers substantial advantages by identifying patterns in data that are difficult for human experts to detect [11]. Deep learning has made significant progress in recent years, especially in the detection and diagnosis of clinical conditions including lung cancer [12], breast cancer [13], diabetic retinopathy [14], stroke sequelae [15], and early Alzheimer’s disease. Patients at high risk of difficult intubation usually present anatomical airway abnormalities, which anesthesiologists identify by visual observation. AI-based technologies, however, can identify such visual cues objectively, accurately, and reliably, and can handle subtle differences. To date, several studies have attempted to develop AI-based image systems for the identification and management of difficult airways, covering methods such as attention mechanisms combined with manual Mallampati scoring [16], deep learning models based on facial images [17], and fully automated semi-supervised deep learning methods [18]. Nevertheless, current AI approaches to difficult airway management still face multiple obstacles, including algorithmic obsolescence, variability in imaging protocol adherence, and suboptimal performance in clinical prediction tasks.
This study sought to apply advanced deep learning technology through the development of AdaDenseNet-LUC, a specialized AI model designed to correlate ultrasound images of surgical patients with the real-world challenges of intubation. The proposed AdaDenseNet-LUC framework aims to provide doctors with a more precise and reliable preoperative assessment of intubation difficulty by leveraging its adaptive attention mechanism and deep feature extraction capabilities. This innovative approach is expected to optimize the surgical process, reduce intubation-related risks, and ultimately improve patient outcomes through enhanced prediction accuracy. By implementing AdaDenseNet-LUC in clinical practice, we anticipate significant improvements in patient rescue success rates and overall surgical safety.
6. Experimental Results
Hyperparameters such as learning rate, weight decay, and batch size were determined based on prior studies and empirical exploration. To ensure robustness and reduce variance caused by random data splitting, all results were reported as the mean performance over 5-fold cross-validation. Visualization outputs, including ROC curves and accuracy progression plots for both training and testing sets, were generated as illustrated in Figure 8 and Figure 9.
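For reproducibility, the cross-validated evaluation protocol can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function and argument names are our own, and the training routine itself is abstracted behind a callback.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def five_fold_mean_auc(images, labels, train_eval_fn, seed=42):
    """Stratified 5-fold cross-validation; returns per-fold AUCs and their mean.

    train_eval_fn(train_idx, test_idx) is assumed to train the model on the
    training indices and return the AUC on the held-out test indices.
    """
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    fold_aucs = [train_eval_fn(tr, te) for tr, te in skf.split(images, labels)]
    return fold_aucs, float(np.mean(fold_aucs))
```

In practice, `train_eval_fn` would train AdaDenseNet-LUC on the training indices and score the held-out fold; stratification keeps the simple/difficult class ratio comparable across folds, matching the near-equal splits in Table 1.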
In the transfer learning stage, the convolutional layers of the pretrained DenseNet backbone up to the third dense block were frozen to retain generic feature representations, while the subsequent dense block and the LSTM-based attention module were fine-tuned on the target dataset. A lower learning rate of 0.0001 was applied to the unfrozen layers to ensure stable adaptation, whereas the newly added fully connected classification layer was trained with a higher learning rate of 0.001 to accelerate convergence. The LSTM hidden layer was set to a size of 256, allowing the model to capture rich temporal dependencies in the sequential features. The final fully connected classification layer (MLP) had an input size of 256 (matching the LSTM hidden size) and an output size of 2, representing the two classification categories. The activation function used in the fully connected layer was softmax, ensuring probabilistic classification across the two classes. This fine-tuning scheme allowed us to effectively balance the preservation of pretrained knowledge with the adaptation to task-specific features.
The ablation study in Table 2 evaluates the impact of different model components, with all numbers representing the average results from 5-fold cross-validation. The DenseNet-only model achieved a modest AUC of 0.866. Adding the SE Block improved the AUC to 0.886, highlighting its contribution to feature refinement. The LSTM attention-only model performed poorly, with an AUC of 0.686, showing the importance of combining components. The full AdaDenseNet-LUC model, incorporating DenseNet, the SE Block, and LSTM attention, achieved the best performance with an AUC of 0.914, demonstrating the effectiveness of combining all components.
The ablation study results presented in Table 2 provide valuable insights into the model’s behavior. The “LSTM Attention Only” model performs significantly worse (AUC = 0.686). This degradation can be attributed to the fact that the LSTM module, when used in isolation, cannot fully capture the spatial dependencies or contextual information necessary for effective classification. In contrast, when integrated with the DenseNet backbone and the attention mechanism, the LSTM benefits from dynamic weight allocation, allowing it to focus on the most diagnostically relevant features across different spatial regions. This fusion enhances the model’s ability to capture long-range dependencies and to emphasize key areas such as the trachea and epiglottis, leading to improved performance.
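For concreteness, the SE Block component evaluated in Table 2 corresponds to standard squeeze-and-excitation channel reweighting (Figure 5). Below is a minimal sketch; the module name and the reduction ratio of 16 are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: global-average-pool each channel ("squeeze"),
    pass through a bottleneck MLP with sigmoid gating ("excitation"), and
    reweight the input feature maps channel-wise."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                    # x: (B, C, H, W)
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))      # squeeze to (B, C), then excite
        return x * w.view(b, c, 1, 1)        # channel-wise reweighting
```

In the full model, such a block would sit after a dense block so that channels carrying diagnostically relevant ultrasound features are amplified before the attention stage.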
In our experiments, we evaluated the model at the patient level, extracting one ultrasound frame per specified angle for each patient, ensuring that the data used for prediction represents the most relevant anatomical information. The results are therefore reported at the patient level, with a single prediction per patient. In this study, DenseNet is used as the core architecture for deep feature extraction. To highlight the advantages of the improved DenseNet for deep feature extraction in this task, we compare it with Vgg16 [27], ResNet [28], AlexNet [29], MobileNet [30], EfficientNet [31], EfficientNetV2 [32], DenseNet [21], Transformer [33], and Mamba [34]. Specifically, transfer learning is applied to adapt each network pre-trained on ImageNet to our dataset for retraining, with all networks trained using the same parameter configuration. The results of the 5-fold cross-validation experiment are presented in Table 3 and Table 4. From Table 4, it is evident that the average AUC of our model surpasses that of the other models, reaching 0.914.
The experimental results therefore show that the improved DenseNet exhibits better deep feature extraction ability than other common network architectures in this task [35,36,37], especially in terms of AUC. This demonstrates the effectiveness of DenseNet’s efficient feature transfer and deeper feature learning for this problem. In addition, the improved model not only achieves high accuracy but also demonstrates better stability and convergence during training, further verifying its potential in medical imaging applications. We therefore believe that the model based on the improved DenseNet is highly competitive and practical for tracheal ultrasound image classification. Future work will aim to further optimize the network architecture and expand the datasets to enhance the performance and generalization capability of the models in real-world applications.
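The per-fold metrics reported in Tables 2 and 3 (AUC, accuracy, sensitivity, specificity, F1 score, and MCC) can be computed from fold predictions as follows. This is a generic sketch: the 0.5 decision threshold is our assumption, and the difficult-airway class is treated as positive.

```python
import numpy as np
from sklearn.metrics import (confusion_matrix, f1_score,
                             matthews_corrcoef, roc_auc_score)

def fold_metrics(y_true, y_score, threshold=0.5):
    """Compute one fold's metrics from true labels and predicted probabilities."""
    y_pred = (np.asarray(y_score) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "AUC": roc_auc_score(y_true, y_score),
        "Accuracy": (tp + tn) / (tp + tn + fp + fn),
        "Sensitivity": tp / (tp + fn),   # true-positive rate (difficult airway)
        "Specificity": tn / (tn + fp),   # true-negative rate (simple airway)
        "F1 Score": f1_score(y_true, y_pred),
        "MCC": matthews_corrcoef(y_true, y_pred),
    }
```

Averaging these dictionaries across the five folds yields the summary rows reported in the tables.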
We also compared the accuracy of existing clinical prediction indicators with the best-performing artificial intelligence models in difficult airway prediction. The data in Table 7 clearly show that the method proposed in this study outperforms on multiple indicators. In terms of sensitivity, our approach attained 86.0%, a substantial advantage over indicators such as the inter-incisor gap (44.4%) and teeth condition (24.1%). This suggests that our approach is more sensitive in detecting potential difficult intubation scenarios. In terms of specificity, our method performed exceptionally well, reaching 88.6%, far exceeding indicators such as Mallampati grading (52.7%), though slightly lower than the teeth condition criterion (89.9%), indicating that laryngeal ultrasound image classification can more accurately exclude non-difficult intubation cases and reduce misjudgments. In terms of AUC, a comprehensive measure of accuracy and discrimination, the proposed method achieved a high value of 0.940, somewhat better than facial image classification (0.864) and markedly higher than indicators such as the thyromental distance (0.587). This clearly illustrates that the method presented in this paper offers higher accuracy and reliability in predicting difficult intubation, thereby providing a more valuable predictive tool for clinical practice.
We assess statistical significance using paired two-sided t-tests on fold-wise AUROCs computed on identical splits. For each model, we report the mean AUROC with a 95% t-interval across the 5 folds, and we test the per-fold AUC difference (Model − AdaDenseNet-LUC) to obtain the t statistic and p-value. As summarized in Table 6, AdaDenseNet-LUC shows significant improvements over DenseNet (t = 3.328, p = 0.029), EfficientNetV2 (t = 3.580, p = 0.023), and Transformer (t = 2.915, p = 0.043), while differences against the other baselines are not statistically significant (p > 0.05). Given n = 5, we also report Cohen’s d and 95% CIs to quantify effect sizes and uncertainty.
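The fold-wise statistical comparison can be reproduced with a short script; the function name is illustrative:

```python
import numpy as np
from scipy import stats

def compare_fold_aucs(model_aucs, reference_aucs):
    """Paired two-sided t-test on fold-wise AUCs, plus Cohen's d of the
    per-fold difference and a 95% t-interval for its mean (n = 5 folds)."""
    d = np.asarray(model_aucs) - np.asarray(reference_aucs)
    t, p = stats.ttest_rel(model_aucs, reference_aucs)
    cohen_d = d.mean() / d.std(ddof=1)
    half = stats.t.ppf(0.975, len(d) - 1) * d.std(ddof=1) / np.sqrt(len(d))
    return {"t": t, "p": p, "cohen_d": cohen_d,
            "ci95": (d.mean() - half, d.mean() + half)}
```

As a sanity check, applying this to the per-fold AUCs in Table 4 for AdaDenseNet-LUC versus DenseNet yields t ≈ 3.33 with p < 0.05, consistent with Table 6.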
Figure 10 presents the classification outcomes for two representative examples from the test set, with the proposed model predicting the class for each instance. Grad-CAM visualization techniques indicate that the model’s attention is mainly focused on the tracheal region. However, we acknowledge that structures such as the epiglottis, tongue, and hyoid bone play significant roles in airway assessment, particularly when evaluating the potential for obstruction or difficulty in intubation. While the trachea’s visibility is essential for identifying difficult airways, these additional anatomical features must also be considered in clinical evaluations. Grad-CAM works by visualizing which regions of the image contribute most to the model’s predictions. In the case of simple airways, Grad-CAM typically highlights clearer, more defined anatomical structures, such as the central airway passages, which are easier for the model to recognize. In contrast, for difficult airways, Grad-CAM focuses more on complex features, such as irregularities, narrow passages, or areas with occlusions, which are more challenging for the model to differentiate. These differences in the Grad-CAM visualizations reflect the model’s sensitivity to variations in airway structures, helping us understand how the model distinguishes between simple and difficult airways based on anatomical features. Future work could expand the model to incorporate these additional structures more effectively, thus improving the accuracy and clinical utility of the approach in predicting difficult intubation scenarios.
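The Grad-CAM maps in Figure 10 follow the standard recipe: gradients of the target class score are spatially average-pooled into channel weights for a chosen convolutional layer's activations, and the weighted sum is rectified. The sketch below is a generic implementation rather than the paper's exact pipeline; the layer choice and max-normalization are our assumptions.

```python
import torch
import torch.nn.functional as F

def grad_cam(model, feature_layer, image, class_idx):
    """Minimal Grad-CAM: weight the chosen layer's activation maps by the
    spatially averaged gradients of the target class score, then ReLU."""
    acts, grads = [], []
    h1 = feature_layer.register_forward_hook(lambda m, i, o: acts.append(o))
    h2 = feature_layer.register_full_backward_hook(
        lambda m, gi, go: grads.append(go[0]))
    try:
        model.zero_grad()
        score = model(image)[0, class_idx]   # scalar score for the target class
        score.backward()
    finally:
        h1.remove()
        h2.remove()
    weights = grads[0].mean(dim=(2, 3), keepdim=True)  # global-avg-pooled grads
    cam = F.relu((weights * acts[0]).sum(dim=1))       # (1, H, W) saliency map
    return cam / (cam.max() + 1e-8)                    # normalize to [0, 1]
```

In our setting, `feature_layer` would be the final dense block of the backbone, and the resulting map is upsampled to the ultrasound frame for overlay.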
Author Contributions
Conceptualization, C.L. and H.L.; methodology, C.L.; software, C.L.; validation, C.L. and H.L.; formal analysis, C.L.; investigation, C.L.; resources, H.L.; data curation, C.L.; writing—original draft preparation, C.L.; writing—review and editing, C.L. and H.L.; visualization, C.L.; supervision, H.L.; project administration, H.L.; funding acquisition, H.L. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Natural Science Foundation of China, Grant Number 62273189.
Institutional Review Board Statement
This study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board of The Affiliated Hospital of Qingdao University.
Informed Consent Statement
Patient consent was waived by the Institutional Review Board due to the retrospective nature of this study and anonymization of the data.
Data Availability Statement
The clinical data used in this study are not publicly available due to patient privacy and institutional ethics regulations. De-identified data may be available from the corresponding author on reasonable request and with approval from the institutional review board (IRB).
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Apfelbaum, J.L.; Hagberg, C.A.; Caplan, R.A.; Blitt, C.D.; Connis, R.T.; Nickinovich, D.G.; Benumof, J.L.; Berry, F.A.; Bode, R.H.; Cheney, F.W.; et al. Practice guidelines for management of the difficult airway: An updated report by the American Society of Anesthesiologists Task Force on Management of the Difficult Airway. Anesthesiology 2013, 118, 251–270. [Google Scholar] [PubMed]
- Nørskov, A.K.; Rosenstock, C.V.; Wetterslev, J.; Astrup, G.; Afshari, A.; Lundstrøm, L.H. Diagnostic accuracy of anaesthesiologists’ prediction of difficult airway management in daily clinical practice: A cohort study of 188 064 patients registered in the Danish Anaesthesia Database. Anaesthesia 2015, 70, 272–281. [Google Scholar] [CrossRef]
- Langeron, O.; Masso, E.; Huraux, C.; Guggiari, M.; Bianchi, A.; Coriat, P.; Riou, B. Prediction of difficult mask ventilation. Anesthesiology 2000, 92, 1229–1236. [Google Scholar] [CrossRef]
- Levitan, R.M.; Heitz, J.W.; Sweeney, M.; Cooper, R.M. The complexities of tracheal intubation with direct laryngoscopy and alternative intubation devices. Ann. Emerg. Med. 2011, 57, 240–247. [Google Scholar] [CrossRef] [PubMed]
- Cook, T.M. Major complications of airway management in the UK: Results of the Fourth National Audit Project of the Royal College of Anaesthetists and the Difficult Airway Society. Part 1: Anaesthesia. Br. J. Anaesth. 2011, 106, 617–631. [Google Scholar] [CrossRef]
- Cook, T.M.; MacDougall-Davis, S.R. Complications and failure of airway management. Br. J. Anaesth. 2012, 109, i68–i85. [Google Scholar] [CrossRef] [PubMed]
- Heidegger, T. Management of the difficult airway. N. Engl. J. Med. 2021, 384, 1836–1847. [Google Scholar] [CrossRef]
- Hagiwara, Y.; Watase, H.; Okamoto, H.; Goto, T.; Hasegawa, K. Japanese Emergency Medicine Network Investigators. Prospective validation of the modified LEMON criteria to predict difficult intubation in the ED. Am. J. Emerg. Med. 2015, 33, 1492–1496. [Google Scholar] [CrossRef]
- Nørskov, A.K.; Wetterslev, J.; Rosenstock, C.V.; Afshari, A.; Astrup, G.; Jakobsen, J.C.; Thomsen, J.L.; Bøttger, M.; Ellekvist, M.; Schousboe, B.M.B.; et al. Effects of using the simplified airway risk index vs usual airway assessment on unanticipated difficult tracheal intubation: A cluster randomized trial with 64,273 participants. BJA Br. J. Anaesth. 2016, 116, 680–689. [Google Scholar] [CrossRef]
- Vinisha, A.; Boda, R. DeepBrainTumorNet: An effective framework of heuristic-aided brain tumour detection and classification system using residual Attention-Multiscale Dilated inception network. Biomed. Signal Process. Control 2025, 100, 107180. [Google Scholar] [CrossRef]
- Lu, M.Y.; Chen, T.Y.; Williamson, D.F.K.; Zhao, M.; Shady, M.; Lipkova, J.; Mahmood, F. AI-based pathology predicts origins for cancers of unknown primary. Nature 2021, 594, 106–110. [Google Scholar] [CrossRef] [PubMed]
- Murugesan, M.; Kaliannan, K.; Balraj, S.; Singaram, K.; Kaliannan, T.; Albert, J.R. A hybrid deep learning model for effective segmentation and classification of lung nodules from CT images. J. Intell. Fuzzy Syst. 2022, 42, 2667–2679. [Google Scholar] [CrossRef]
- Han, Y.; Chen, W.; Heidari, A.A.; Chen, H.; Zhang, X. A solution to the stagnation of multi-verse optimization: An efficient method for breast cancer pathologic images segmentation. Biomed. Signal Process. Control 2023, 86, 105208. [Google Scholar] [CrossRef]
- Huang, S.; Li, J.; Xiao, Y.; Shen, N.; Xu, T. RTNet: Relation transformer network for diabetic retinopathy multi-lesion segmentation. IEEE Trans. Med. Imaging 2022, 41, 1596–1607. [Google Scholar] [CrossRef] [PubMed]
- Murray, N.M.; Unberath, M.; Hager, G.D.; Hui, F.K. Artificial intelligence to diagnose ischemic stroke and identify large vessel occlusions: A systematic review. J. Neurointerv. Surg. 2020, 12, 156–164. [Google Scholar] [CrossRef]
- Zhang, F.; Xu, Y.; Zhou, Z.; Zhang, H.; Yang, K. Critical element prediction of tracheal intubation difficulty: Automatic Mallampati classification by jointly using handcrafted and attention-based deep features. Comput. Biol. Med. 2022, 150, 106182. [Google Scholar] [CrossRef]
- Hayasaka, T.; Kawano, K.; Kurihara, K.; Suzuki, H.; Nakane, M.; Kawamae, K. Creation of an artificial intelligence model for intubation difficulty classification by deep learning (convolutional neural network) using face images: An observational study. J. Intensive Care 2021, 9, 38. [Google Scholar] [CrossRef]
- Wang, G.; Li, C.; Tang, F.; Wang, Y.; Wu, S.; Zhi, H.; Zhang, F.; Wang, M.; Zhang, J. A fully-automatic semi-supervised deep learning model for difficult airway assessment. Heliyon 2023, 9, e15629. [Google Scholar]
- Wong, T.-T.; Yeh, P.-Y. Reliable accuracy estimates from k-fold cross validation. IEEE Trans. Knowl. Data Eng. 2019, 32, 1586–1594. [Google Scholar] [CrossRef]
- Lu, J.; Behbood, V.; Hao, P.; Zuo, H.; Xue, S.; Zhang, G. Transfer learning using computational intelligence: A survey. Knowl.-Based Syst. 2015, 80, 14–23. [Google Scholar] [CrossRef]
- Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708. [Google Scholar]
- Zhou, T.; Ye, X.; Lu, H.; Zheng, X.; Qiu, S.; Liu, Y. Dense convolutional network and its application in medical image analysis. BioMed Res. Int. 2022, 2022, 2384830. [Google Scholar] [CrossRef]
- Li, Z.; Sun, N.; Gao, H.; Qin, N.; Li, Z. Adaptive subtraction based on U-Net for removing seismic multiples. IEEE Trans. Geosci. Remote Sens. 2021, 59, 9796–9812. [Google Scholar] [CrossRef]
- Li, X.; Wu, J.; Lin, Z.; Liu, H.; Zha, H. Recurrent squeeze-and-excitation context aggregation net for single image deraining. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 254–269. [Google Scholar]
- Zou, L.; Xia, L.; Ding, Z.; Song, J.; Liu, W.; Yin, D. Reinforcement learning to optimize long-term user engagement in recommender systems. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 2810–2818. [Google Scholar]
- Deng, Z.; Jiang, Z.; Lan, R.; Huang, W.; Luo, X. Image captioning using DenseNet network and adaptive attention. Signal Process. Image Commun. 2020, 85, 115836. [Google Scholar] [CrossRef]
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105. [Google Scholar] [CrossRef]
- Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861. [Google Scholar] [CrossRef]
- Tan, M.; Le, Q. EfficientNet: Rethinking model scaling for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 6105–6114. [Google Scholar]
- Tan, M.; Le, Q. EfficientNetV2: Smaller models and faster training. In Proceedings of the International Conference on Machine Learning, Virtual, 18–24 July 2021; pp. 10096–10106. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008. [Google Scholar]
- Gu, A.; Dao, T. Mamba: Linear-Time Sequence Modeling with Selective State Spaces. arXiv 2023, arXiv:2312.00752. [Google Scholar] [CrossRef]
- Seo, S.-H.; Lee, J.-G.; Yu, S.-B.; Kim, D.-S.; Ryu, S.-J.; Kim, K.-H. Predictors of difficult intubation defined by the intubation difficulty scale (IDS): Predictive value of 7 airway assessment factors. Korean J. Anesthesiol. 2012, 63, 491. [Google Scholar] [CrossRef]
- Eberhart, L.H.J.; Arndt, C.; Cierpka, T.; Schwanekamp, J.; Wulf, H.; Putzke, C. The reliability and validity of the upper lip bite test compared with the Mallampati classification to predict difficult laryngoscopy: An external prospective evaluation. Anesth. Analg. 2005, 101, 284–289. [Google Scholar] [CrossRef]
- Safavi, M.; Honarmand, A.; Zare, N. A comparison of the ratio of patient’s height to thyromental distance with the modified Mallampati and the upper lip bite test in predicting difficult laryngoscopy. Saudi J. Anaesth. 2011, 5, 258–263. [Google Scholar] [CrossRef] [PubMed]
- Warnakulasuriya, S.; Chen, T.H.H. Areca nut and oral cancer: Evidence from studies conducted in humans. J. Dent. Res. 2022, 101, 1139–1146. [Google Scholar] [CrossRef] [PubMed]
Figure 1.
Laryngeal ultrasound image of surgical patients.
Figure 2.
Five-fold cross-validation.
Figure 3.
AdaDenseNet-LUC network architecture diagram.
Figure 4.
The structure of DenseNet.
Figure 5.
The squeeze and excitation steps combined.
Figure 6.
In an LSTM unit, there are three gates: input gate, forget gate, and output gate.
Figure 7.
Demonstration of LSTM adaptive attention mechanism.
Figure 8.
Receiver Operating Characteristic (ROC) Curve of the proposed model at 70 epochs.
Figure 9.
Training and testing accuracy progression across 70 epochs.
Figure 10.
Grad-CAM visualization of the two types of samples.
Table 1.
Training data augmentation and five-fold cross-validation for both test data and training data.
| Dataset | Test (Simple) | Test (Difficult) | Training (Simple) | Training (Difficult) |
|---|---|---|---|---|
| Dataset1 | 150 | 128 | 599 | 514 |
| Dataset2 | 150 | 128 | 599 | 514 |
| Dataset3 | 149 | 129 | 600 | 513 |
| Dataset4 | 150 | 128 | 599 | 514 |
| Dataset5 | 149 | 129 | 600 | 513 |
Table 2.
Ablation study results.
| Model | AUC | Accuracy | Sensitivity | Specificity | F1 Score |
|---|---|---|---|---|---|
| DenseNet Only | 0.866 | 0.792 | 0.782 | 0.796 | 0.764 |
| SE Block Only | 0.886 | 0.800 | 0.788 | 0.808 | 0.776 |
| LSTM Attention Only | 0.686 | 0.658 | 0.638 | 0.676 | 0.618 |
| AdaDenseNet-LUC | 0.914 | 0.822 | 0.812 | 0.830 | 0.800 |
Table 3.
The model prediction performance of five-fold cross-validation.
| Dataset | AUC | Accuracy | Sensitivity | Specificity | F1 Score | MCC |
|---|---|---|---|---|---|---|
| Dataset1 | 0.91 | 0.80 | 0.80 | 0.81 | 0.78 | 0.60 |
| Dataset2 | 0.90 | 0.82 | 0.83 | 0.82 | 0.80 | 0.65 |
| Dataset3 | 0.94 | 0.87 | 0.86 | 0.89 | 0.86 | 0.75 |
| Dataset4 | 0.93 | 0.83 | 0.80 | 0.86 | 0.82 | 0.66 |
| Dataset5 | 0.89 | 0.79 | 0.77 | 0.77 | 0.74 | 0.58 |
Table 4.
Per-fold and average AUC of different models under five-fold cross-validation.
| Model | Dataset1 | Dataset2 | Dataset3 | Dataset4 | Dataset5 | Average |
|---|---|---|---|---|---|---|
| Vgg16 | 0.75 | 0.90 | 0.57 | 0.94 | 0.94 | 0.825 |
| ResNet | 0.95 | 0.90 | 0.78 | 0.95 | 0.92 | 0.880 |
| AlexNet | 0.70 | 0.85 | 0.68 | 0.78 | 0.92 | 0.793 |
| MobileNet | 0.90 | 0.87 | 0.84 | 0.85 | 0.93 | 0.885 |
| EfficientNet | 0.92 | 0.69 | 0.57 | 0.39 | 0.82 | 0.678 |
| EfficientNetV2 | 0.89 | 0.50 | 0.46 | 0.47 | 0.68 | 0.604 |
| DenseNet | 0.88 | 0.85 | 0.76 | 0.86 | 0.76 | 0.822 |
| Transformer | 0.88 | 0.85 | 0.93 | 0.86 | 0.88 | 0.886 |
| Mamba | 0.90 | 0.89 | 0.78 | 0.92 | 0.77 | 0.857 |
| AdaDenseNet-LUC | 0.91 | 0.90 | 0.94 | 0.93 | 0.89 | 0.914 |
Table 5.
Model performance comparison with AUROC, sensitivity, and specificity across different architectures.
| Model | Avg AUC | AUC CI (95%) | Avg Sen | Sen CI (95%) | Avg Spe | Spe CI (95%) |
|---|---|---|---|---|---|---|
| DenseNet | 0.822 | (0.751, 0.893) | 0.782 | (0.752, 0.811) | 0.788 | (0.749, 0.826) |
| EfficientNet | 0.678 | (0.419, 0.936) | 0.636 | (0.448, 0.823) | 0.664 | (0.410, 0.917) |
| EfficientNetV2 | 0.600 | (0.370, 0.829) | 0.544 | (0.332, 0.755) | 0.624 | (0.439, 0.808) |
| MobileNet | 0.878 | (0.832, 0.920) | 0.792 | (0.752, 0.831) | 0.804 | (0.770, 0.837) |
| AlexNet | 0.786 | (0.661, 0.913) | 0.716 | (0.556, 0.875) | 0.748 | (0.603, 0.892) |
| ResNet | 0.900 | (0.812, 0.987) | 0.800 | (0.753, 0.847) | 0.814 | (0.763, 0.864) |
| Vgg16 | 0.820 | (0.621, 1.019) | 0.778 | (0.623, 0.933) | 0.816 | (0.673, 0.958) |
| Transformer | 0.886 | (0.841, 0.918) | 0.782 | (0.746, 0.817) | 0.805 | (0.767, 0.840) |
| Mamba | 0.857 | (0.763, 0.940) | 0.786 | (0.757, 0.816) | 0.801 | (0.751, 0.848) |
| AdaDenseNet-LUC | 0.914 | (0.888, 0.939) | 0.812 | (0.769, 0.854) | 0.830 | (0.772, 0.887) |
Table 6.
Statistical comparison of AUC values across models using five-fold cross-validation.
| Model | t Statistic | p-Value |
|---|---|---|
| DenseNet | 3.328 | 0.029 |
| EfficientNet | 2.360 | 0.077 |
| EfficientNetV2 | 3.580 | 0.023 |
| MobileNet | 1.438 | 0.223 |
| AlexNet | 2.425 | 0.072 |
| ResNet | 0.377 | 0.725 |
| Vgg16 | 1.208 | 0.293 |
| Transformer | 2.915 | 0.043 |
| Mamba | 1.909 | 0.128 |
| AdaDenseNet-LUC | – | – |
Table 7.
Performance of the proposed method compared with baseline approaches, reporting the best-performing results for consistency with prior studies.
| Indicator | Sensitivity (%) | Specificity (%) | AUC |
|---|---|---|---|
| Mallampati Classification (MPC) (1/2/3/4) | 79.6 | 52.7 | 0.673 |
| Inter-Incisor Gap (IIG) (cm) | 44.4 | 75.0 | 0.633 |
| Head and Neck Mobility (HNM) | 53.7 | 77.7 | 0.670 |
| Thyromental Distance (TMD) (cm) | 53.7 | 58.1 | 0.587 |
| Horizontal Length of Mandible (HLM) (cm) | 48.1 | 64.9 | 0.558 |
| Teeth Condition (BT) (Normal/Mild/Severe) | 24.1 | 89.9 | 0.572 |
| Upper Lip Bite Test (ULBT) (1/2/3) | 48.1 | 70.3 | 0.607 |
| Facial Image Classification | 81.8 | 83.3 | 0.864 |
| Laryngeal Ultrasound Image Classification | 86.0 | 88.6 | 0.940 |