Search Results (283)

Search Parameters:
Keywords = face diagnosis images

23 pages, 6490 KiB  
Article
LISA-YOLO: A Symmetry-Guided Lightweight Small Object Detection Framework for Thyroid Ultrasound Images
by Guoqing Fu, Guanghua Gu, Wen Liu and Hao Fu
Symmetry 2025, 17(8), 1249; https://doi.org/10.3390/sym17081249 - 6 Aug 2025
Abstract
Non-invasive ultrasound diagnosis, combined with deep learning, is frequently used for detecting thyroid diseases. However, real-time detection on portable devices faces limitations due to constrained computational resources, and existing models often lack sufficient capability for small object detection of thyroid nodules. To address these limitations, this paper proposes an improved lightweight small object detection network framework called LISA-YOLO, which enhances the lightweight multi-scale collaborative fusion algorithm. The proposed framework exploits the inherent symmetrical characteristics of ultrasound images and the symmetrical architecture of the detection network to better capture and represent features of thyroid nodules. Specifically, an improved depthwise separable convolution algorithm replaces traditional convolution to construct a lightweight network (DG-FNet). Through symmetrical cross-scale fusion operations via FPN, detection accuracy is maintained while reducing computational overhead. Additionally, an improved bidirectional feature network (IMS F-NET) fully integrates the semantic and detailed information of high- and low-level features symmetrically, enhancing the representation capability for multi-scale features and improving the accuracy of small object detection. Finally, a collaborative attention mechanism (SAF-NET) uses a dual-channel and spatial attention mechanism to adaptively calibrate channel and spatial weights in a symmetric manner, effectively suppressing background noise and enabling the model to focus on small target areas in thyroid ultrasound images. Extensive experiments on two image datasets demonstrate that the proposed method achieves improvements of 2.3% in F1 score, 4.5% in mAP, and 9.0% in FPS, while maintaining only 2.6 M parameters and reducing GFLOPs from 6.1 to 5.8. The proposed framework provides significant advancements in lightweight real-time detection and demonstrates the important role of symmetry in enhancing the performance of ultrasound-based thyroid diagnosis. Full article
(This article belongs to the Section Computer)
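
As a concrete illustration of the lightweight substitution this abstract describes: a depthwise separable block factorizes a k×k convolution into a per-channel spatial convolution plus a 1×1 pointwise channel mix, cutting parameters from k²·Cin·Cout to k²·Cin + Cin·Cout. The PyTorch sketch below shows only the generic block, not the paper's improved DG-FNet variant; layer choices are assumptions.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Generic depthwise separable block (illustrative, not DG-FNet)."""
    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1):
        super().__init__()
        # Per-channel spatial filtering (groups=in_ch)...
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size, stride,
                                   padding=kernel_size // 2, groups=in_ch, bias=False)
        # ...followed by a 1x1 pointwise channel mix.
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

block = DepthwiseSeparableConv(64, 128)
full = nn.Conv2d(64, 128, 3, padding=1, bias=False)
print(sum(p.numel() for p in block.parameters()))  # 9,024 parameters
print(sum(p.numel() for p in full.parameters()))   # 73,728 for the full 3x3 conv
```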

24 pages, 3788 KiB  
Review
Advances in Photoacoustic Imaging of Breast Cancer
by Yang Wu, Keer Huang, Guoxiong Chen and Li Lin
Sensors 2025, 25(15), 4812; https://doi.org/10.3390/s25154812 - 5 Aug 2025
Abstract
Breast cancer is the leading cause of cancer-related mortality among women worldwide, and early screening is critical for improving patient survival. Medical imaging plays a central role in breast cancer screening, diagnosis, and treatment monitoring. However, conventional imaging modalities—including mammography, ultrasound, and magnetic resonance imaging—face limitations such as low diagnostic specificity, relatively slow imaging speed, ionizing radiation exposure, and dependence on exogenous contrast agents. Photoacoustic imaging (PAI), a novel hybrid imaging technique that combines optical contrast with ultrasonic spatial resolution, has shown great promise in addressing these challenges. By revealing anatomical, functional, and molecular features of the breast tumor microenvironment, PAI offers high spatial resolution, rapid imaging, and minimal operator dependence. This review outlines the fundamental principles of PAI and systematically examines recent advances in its application to breast cancer screening, diagnosis, and therapeutic evaluation. Furthermore, we discuss the translational potential of PAI as an emerging breast imaging modality, complementing existing clinical techniques. Full article
(This article belongs to the Special Issue Optical Imaging for Medical Applications)

13 pages, 3685 KiB  
Article
A Controlled Variation Approach for Example-Based Explainable AI in Colorectal Polyp Classification
by Miguel Filipe Fontes, Alexandre Henrique Neto, João Dallyson Almeida and António Trigueiros Cunha
Appl. Sci. 2025, 15(15), 8467; https://doi.org/10.3390/app15158467 - 30 Jul 2025
Viewed by 213
Abstract
Medical imaging is vital for diagnosing and treating colorectal cancer (CRC), a leading cause of mortality. Classifying colorectal polyps and CRC precursors remains challenging due to operator variability and expertise dependence. Deep learning (DL) models show promise in polyp classification but face adoption barriers due to their ‘black box’ nature, limiting interpretability. This study presents an example-based explainable artificial intelligence (XAI) approach using Pix2Pix to generate synthetic polyp images with controlled size variations and LIME to explain classifier predictions visually. EfficientNet and Vision Transformer (ViT) were trained on datasets of real and synthetic images, achieving strong baseline accuracies of 94% and 96%, respectively. Image quality was assessed using PSNR (18.04), SSIM (0.64), and FID (123.32), while classifier robustness was evaluated across polyp sizes. Results show that Pix2Pix effectively controls image attributes like polyp size despite limitations in visual fidelity. LIME integration revealed classifier vulnerabilities, underscoring the value of complementary XAI techniques. This combination enhances DL model interpretability and deepens understanding of model behaviour. The findings contribute to developing explainable AI tools for polyp classification and CRC diagnosis. Future work will improve synthetic image quality and refine XAI methodologies for broader clinical use. Full article
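
For readers unfamiliar with the LIME side of this pipeline, the snippet below shows the standard lime_image API for explaining an image classifier's prediction. `polyp_image` and `model_predict` are hypothetical stand-ins for the study's data and trained EfficientNet/ViT classifier; the dummy probability function merely makes the sketch self-contained.

```python
import numpy as np
from lime import lime_image
from skimage.segmentation import mark_boundaries

rng = np.random.default_rng(0)
polyp_image = rng.integers(0, 255, size=(224, 224, 3)).astype(np.float64)  # stand-in image

def model_predict(images):
    """Hypothetical wrapper around the trained classifier: takes a batch of
    HxWx3 arrays, returns (N, num_classes) class probabilities."""
    return np.tile([0.7, 0.3], (len(images), 1))  # dummy probabilities

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    polyp_image, model_predict,
    top_labels=2, hide_color=0,
    num_samples=200)               # perturbed samples used to fit the local surrogate
temp, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5, hide_rest=False)
overlay = mark_boundaries(temp / 255.0, mask)  # highlights regions driving the prediction
```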

18 pages, 9470 KiB  
Article
DCS-ST for Classification of Breast Cancer Histopathology Images with Limited Annotations
by Suxing Liu and Byungwon Min
Appl. Sci. 2025, 15(15), 8457; https://doi.org/10.3390/app15158457 - 30 Jul 2025
Viewed by 274
Abstract
Accurate classification of breast cancer histopathology images is critical for early diagnosis and treatment planning. Yet, conventional deep learning models face significant challenges under limited annotation scenarios due to their reliance on large-scale labeled datasets. To address this, we propose Dynamic Cross-Scale Swin Transformer (DCS-ST), a robust and efficient framework tailored for histopathology image classification with scarce annotations. Specifically, DCS-ST integrates a dynamic window predictor and a cross-scale attention module to enhance multi-scale feature representation and interaction while employing a semi-supervised learning strategy based on pseudo-labeling and denoising to exploit unlabeled data effectively. This design enables the model to adaptively attend to diverse tissue structures and pathological patterns while maintaining classification stability. Extensive experiments on three public datasets—BreakHis, Mini-DDSM, and ICIAR2018—demonstrate that DCS-ST consistently outperforms existing state-of-the-art methods across various magnifications and classification tasks, achieving superior quantitative results and reliable visual classification. Furthermore, empirical evaluations validate its strong generalization capability and practical potential for real-world weakly-supervised medical image analysis. Full article
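
A minimal sketch of the pseudo-labeling recipe this abstract invokes, assuming a generic PyTorch classifier: confident predictions on unlabeled images are kept as training targets, and low-confidence ones are discarded as a crude form of denoising. DCS-ST's dynamic-window and cross-scale attention components are not reproduced here.

```python
import torch
import torch.nn.functional as F

def pseudo_label_step(model, labeled_batch, unlabeled_batch, optimizer,
                      threshold=0.95, lam=1.0):
    """One semi-supervised step: supervised loss on labeled data plus a
    confidence-filtered pseudo-label loss on unlabeled data."""
    x_l, y_l = labeled_batch
    x_u = unlabeled_batch

    loss_sup = F.cross_entropy(model(x_l), y_l)

    with torch.no_grad():
        probs = F.softmax(model(x_u), dim=1)
        conf, pseudo = probs.max(dim=1)
        keep = (conf > threshold).float()   # drop low-confidence (noisy) labels

    per_sample = F.cross_entropy(model(x_u), pseudo, reduction="none")
    loss_unsup = (per_sample * keep).mean()

    loss = loss_sup + lam * loss_unsup
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```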

35 pages, 4940 KiB  
Article
A Novel Lightweight Facial Expression Recognition Network Based on Deep Shallow Network Fusion and Attention Mechanism
by Qiaohe Yang, Yueshun He, Hongmao Chen, Youyong Wu and Zhihua Rao
Algorithms 2025, 18(8), 473; https://doi.org/10.3390/a18080473 - 30 Jul 2025
Viewed by 334
Abstract
Facial expression recognition (FER) is a critical research direction in artificial intelligence, which is widely used in intelligent interaction, medical diagnosis, security monitoring, and other domains. These applications highlight its considerable practical value and social significance. Facial expression recognition models often need to run efficiently on mobile or edge devices, so research on lightweight facial expression recognition is particularly important. However, the feature extraction and classification methods of the lightweight convolutional neural network expression recognition algorithms in common use are not fully optimized for the characteristics of facial expression images and fail to make full use of the feature information these images contain. To address the lack of facial expression recognition models that are both lightweight and effectively optimized for expression-specific feature extraction, this study proposes a novel network design tailored to the characteristics of facial expressions. In this paper, we take the backbone architecture of the MobileNet V2 network as a reference and design LightExNet, a lightweight convolutional neural network based on deep–shallow layer fusion, an attention mechanism, and a joint loss function tailored to the characteristics of facial expression features. In the network architecture of LightExNet, firstly, deep and shallow features are fused in order to fully extract the shallow features in the original image, reduce the loss of information, alleviate the vanishing-gradient problem as the number of convolutional layers increases, and achieve the effect of multi-scale feature fusion. The MobileNet V2 architecture has also been streamlined to seamlessly integrate deep and shallow networks. Secondly, drawing on the intrinsic characteristics of facial expressions, a new channel and spatial attention mechanism is proposed to capture as much feature information as possible from the different expression regions for encoding, thereby effectively improving recognition accuracy. Finally, an improved center loss function is superimposed to further improve facial expression classification accuracy, and corresponding measures are taken to significantly reduce the computational cost of the joint loss function. LightExNet is tested on three mainstream facial expression datasets: FER2013, CK+, and RAF-DB. The experimental results show that LightExNet has 3.27 M parameters and 298.27 M FLOPs, and its accuracy on the three datasets is 69.17%, 97.37%, and 85.97%, respectively. Its comprehensive performance is better than that of current mainstream lightweight expression recognition algorithms such as MobileNet V2, IE-DBN, Self-Cure Net, Improved MobileViT, MFN, Ada-CM, and Parallel CNN (convolutional neural network). Experimental results confirm that LightExNet effectively improves recognition accuracy and computational efficiency while reducing energy consumption and enhancing deployment flexibility. These advantages underscore its strong potential for real-world applications in lightweight facial expression recognition. Full article
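
The combined channel-plus-spatial attention LightExNet describes follows a well-known template (CBAM-style). The sketch below is that generic template in PyTorch, not the authors' exact mechanism; the reduction ratio and 7×7 spatial kernel are conventional assumptions.

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    """CBAM-style channel-then-spatial attention (illustrative only)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        # Channel attention: pooled descriptors -> shared MLP -> sigmoid gate.
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        # Spatial attention: channel-pooled maps -> 7x7 conv -> sigmoid gate.
        s = torch.cat([x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))
```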

21 pages, 5527 KiB  
Article
SGNet: A Structure-Guided Network with Dual-Domain Boundary Enhancement and Semantic Fusion for Skin Lesion Segmentation
by Haijiao Yun, Qingyu Du, Ziqing Han, Mingjing Li, Le Yang, Xinyang Liu, Chao Wang and Weitian Ma
Sensors 2025, 25(15), 4652; https://doi.org/10.3390/s25154652 - 27 Jul 2025
Viewed by 327
Abstract
Segmentation of skin lesions in dermoscopic images is critical for the accurate diagnosis of skin cancers, particularly malignant melanoma, yet it is hindered by irregular lesion shapes, blurred boundaries, low contrast, and artifacts such as hair interference. Conventional deep learning methods, typically based on UNet or Transformer architectures, often fail to fully exploit lesion features and incur high computational costs, compromising precise lesion delineation. To overcome these challenges, we propose SGNet, a structure-guided network integrating a hybrid CNN–Mamba framework for robust skin lesion segmentation. SGNet employs the Visual Mamba (VMamba) encoder to efficiently extract multi-scale features, followed by the Dual-Domain Boundary Enhancer (DDBE), which refines boundary representations and suppresses noise through spatial and frequency-domain processing. The Semantic-Texture Fusion Unit (STFU) adaptively integrates low-level texture with high-level semantic features, while the Structure-Aware Guidance Module (SAGM) generates coarse segmentation maps to provide global structural guidance. The Guided Multi-Scale Refiner (GMSR) further optimizes boundary details through a multi-scale semantic attention mechanism. Comprehensive experiments on the ISIC2017, ISIC2018, and PH2 datasets demonstrate SGNet’s superior performance, with average improvements of 3.30% in mean Intersection over Union (mIoU) and 1.77% in Dice Similarity Coefficient (DSC) compared to state-of-the-art methods. Ablation studies confirm the effectiveness of each component, highlighting SGNet’s exceptional accuracy and robust generalization for computer-aided dermatological diagnosis. Full article
(This article belongs to the Section Biomedical Sensors)
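
For reference, the two metrics this abstract reports can be computed from binary masks as follows; this is a generic implementation, not the paper's evaluation code.

```python
import numpy as np

def iou_and_dice(pred, target, eps=1e-7):
    """Binary-mask IoU and Dice Similarity Coefficient.
    pred/target: boolean NumPy arrays of the same shape."""
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    iou = inter / (union + eps)
    dice = 2 * inter / (pred.sum() + target.sum() + eps)
    return iou, dice

# Dice and IoU are monotonically related: dice = 2*iou / (1 + iou),
# which is why segmentation papers usually report both moving together.
```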

17 pages, 6870 KiB  
Article
Edge- and Color–Texture-Aware Bag-of-Local-Features Model for Accurate and Interpretable Skin Lesion Diagnosis
by Dichao Liu and Kenji Suzuki
Diagnostics 2025, 15(15), 1883; https://doi.org/10.3390/diagnostics15151883 - 27 Jul 2025
Viewed by 387
Abstract
Background/Objectives: Deep models have achieved remarkable progress in the diagnosis of skin lesions but face two significant drawbacks. First, they cannot effectively explain the basis of their predictions. Although attention visualization tools like Grad-CAM can create heatmaps using deep features, these features often have large receptive fields, resulting in poor spatial alignment with the input image. Second, the design of most deep models neglects interpretable traditional visual features inspired by clinical experience, such as color–texture and edge features. This study aims to propose a novel approach integrating deep learning with traditional visual features to handle these limitations. Methods: We introduce the edge- and color–texture-aware bag-of-local-features model (ECT-BoFM), which limits the receptive field of deep features to a small size and incorporates edge and color–texture information from traditional features. A non-rigid reconstruction strategy ensures that traditional features enhance rather than constrain the model’s performance. Results: Experiments on the ISIC 2018 and 2019 datasets demonstrated that ECT-BoFM yields precise heatmaps and achieves high diagnostic performance, outperforming state-of-the-art methods. Furthermore, training models using only a small number of the most predictive patches identified by ECT-BoFM achieved diagnostic performance comparable to that obtained using full images, demonstrating its efficiency in exploring key clues. Conclusions: ECT-BoFM successfully combines deep learning and traditional visual features, addressing the interpretability and diagnostic accuracy challenges of existing methods. ECT-BoFM provides an interpretable and accurate framework for skin lesion diagnosis, advancing the integration of AI in dermatological research and clinical applications. Full article
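
The bag-of-local-features idea underlying this paper (restricting receptive fields so per-patch evidence stays spatially aligned with the input) can be summarized as a patch-wise classifier whose logits are averaged. Below is a hedged, BagNet-style sketch of that idea, not the authors' ECT-BoFM.

```python
import torch
import torch.nn as nn

class BagOfLocalFeaturesHead(nn.Module):
    """BagNet-style head: a 1x1 conv turns each small-receptive-field feature
    vector into class logits; averaging them yields the image prediction, and
    the per-location logit map doubles as a spatially faithful heatmap."""
    def __init__(self, feat_channels, num_classes):
        super().__init__()
        self.patch_classifier = nn.Conv2d(feat_channels, num_classes, kernel_size=1)

    def forward(self, feats):                      # feats: (B, C, H, W) local features
        logit_map = self.patch_classifier(feats)   # (B, num_classes, H, W)
        logits = logit_map.mean(dim=(2, 3))        # image-level prediction
        return logits, logit_map                   # logit_map serves as the heatmap
```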

17 pages, 840 KiB  
Article
Developing a Consensus-Based POCUS Protocol for Critically Ill Patients During Pandemics: A Modified Delphi Study
by Hyuksool Kwon, Jin Hee Lee, Dongbum Suh, Kyoung Min You and PULSE Group
Medicina 2025, 61(8), 1319; https://doi.org/10.3390/medicina61081319 - 22 Jul 2025
Viewed by 177
Abstract
Background and Objectives: During pandemics, emergency departments face the challenge of managing critically ill patients with limited resources. Point-of-Care Ultrasound (POCUS) has emerged as a crucial diagnostic tool in such scenarios. This study aimed to develop a standardized POCUS protocol using expert consensus via a modified Delphi survey to guide physicians in managing these patients more effectively. Materials and Methods: A committee of emergency imaging experts and board-certified emergency physicians identified essential elements of POCUS in the treatment of patients under investigation (PUI) with shock, sepsis, or other life-threatening diseases. A modified Delphi survey was conducted among 39 emergency imaging experts who were members of the Korean Society of Emergency Medicine. The survey included three rounds of expert feedback and revisions, leading to the development of a POCUS protocol for critically ill patients during a pandemic. Results: The developed POCUS protocol emphasizes the use of POCUS-echocardiography and POCUS-lung ultrasound for the evaluation of cardiac and respiratory function, respectively. The protocol also provides guidance on when to consider additional tests or imaging based on POCUS findings. The Delphi survey results indicated general consensus on the inclusion of POCUS-echocardiography and POCUS-lung ultrasound within the protocol, although there were some disagreements regarding specific elements. Conclusions: Effective clinical practice aids emergency physicians in determining appropriate POCUS strategies for differential diagnosis between life-threatening diseases. Future studies should investigate the effectiveness and feasibility of the protocol in actual clinical scenarios, including its impact on patient outcomes, resource utilization, and workflow efficiency in emergency departments. Full article
(This article belongs to the Section Intensive Care/Anesthesiology)

22 pages, 2514 KiB  
Article
High-Accuracy Recognition Method for Diseased Chicken Feces Based on Image and Text Information Fusion
by Duanli Yang, Zishang Tian, Jianzhong Xi, Hui Chen, Erdong Sun and Lianzeng Wang
Animals 2025, 15(15), 2158; https://doi.org/10.3390/ani15152158 - 22 Jul 2025
Viewed by 322
Abstract
Poultry feces, a critical biomarker for health assessment, requires timely and accurate pathological identification for food safety. Conventional visual-only methods face limitations due to environmental sensitivity and high visual similarity among feces from different diseases. To address this, we propose MMCD (Multimodal Chicken-feces Diagnosis), a ResNet50-based multimodal fusion model leveraging semantic complementarity between images and descriptive text to enhance diagnostic precision. Key innovations include the following: (1) Integrating MASA (Manhattan self-attention) and DSconv (depthwise separable convolution) into the backbone network to mitigate feature confusion. (2) Utilizing a pre-trained BERT to extract textual semantic features, reducing annotation dependency and cost. (3) Designing a lightweight Gated Cross-Attention (GCA) module for dynamic multimodal fusion, achieving a 41% parameter reduction versus cross-modal transformers. Experiments demonstrate that MMCD significantly outperforms single-modal baselines in Accuracy (+8.69%), Recall (+8.72%), Precision (+8.67%), and F1 score (+8.72%). It surpasses simple feature concatenation by 2.51–2.82% and reduces parameters by 7.5 M and computations by 1.62 GFLOPs versus the base ResNet50. This work validates multimodal fusion’s efficacy in pathological fecal detection, providing a theoretical and technical foundation for agricultural health monitoring systems. Full article
(This article belongs to the Section Animal Welfare)
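
One plausible reading of the gated cross-attention (GCA) fusion described above, sketched in PyTorch: image tokens attend to BERT text tokens, and a learned sigmoid gate controls how much textual evidence is mixed in. Dimensions and the gating form are assumptions, not the authors' published code.

```python
import torch
import torch.nn as nn

class GatedCrossAttention(nn.Module):
    """Lightweight gated cross-attention for image-text fusion (illustrative)."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.gate = nn.Linear(dim, dim)

    def forward(self, img_tokens, txt_tokens):
        # img_tokens: (B, N, D) visual features; txt_tokens: (B, M, D) BERT features
        attended, _ = self.attn(query=img_tokens, key=txt_tokens, value=txt_tokens)
        g = torch.sigmoid(self.gate(img_tokens))   # per-feature gate in [0, 1]
        return img_tokens + g * attended           # gated residual fusion
```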

9 pages, 401 KiB  
Proceeding Paper
Integrating Machine Learning with Medical Imaging for Human Disease Diagnosis: A Survey
by Anass Roman, Chaymae Taib, Ilham Dhaiouir and Haimoudi El Khatir
Comput. Sci. Math. Forum 2025, 10(1), 12; https://doi.org/10.3390/cmsf2025010012 - 7 Jul 2025
Viewed by 298
Abstract
Machine learning is revolutionizing healthcare by enhancing diagnosis and treatment personalization. This study explores ML applications in medical imaging, analyzing data from X-rays, CT, MRI, and ultrasound for early disease detection. It reviews key ML models, including SVM, ANN, RF, CNN, and other methods, demonstrating their effectiveness in detecting cancers such as lung and prostate cancer and other diseases. Despite their accuracy, these methods face challenges such as a reliance on large datasets and significant computational requirements. This study highlights the need for further research to integrate ML into clinical practice, addressing its limitations and unlocking new opportunities for improved patient care. Full article

10 pages, 1531 KiB  
Case Report
A Rare Case of Cerebral Amyloidoma Mimicking Thalamic Glioma in a Rheumatoid Arthritis Patient
by Elyaa Saleh, Nour Abdelaziz, Malaak Ramahi, Antonia Loukousia, Theodossios Birbilis and Dimitrios Kanakis
Pathophysiology 2025, 32(3), 31; https://doi.org/10.3390/pathophysiology32030031 - 1 Jul 2025
Viewed by 360
Abstract
Amyloidosis, often referred to as “the great imitator”, is a condition characterized by the abnormal deposition of amyloid proteins in various tissues, potentially leading to organ dysfunction. When these deposits localize in the brain, they can disrupt neurological function and present with diverse clinical manifestations, making diagnosis particularly challenging. Cerebral amyloidosis is a rare entity that frequently mimics other neurological disorders, often resulting in significant delays in recognition and management. This case highlights the diagnostic challenge posed by cerebral amyloidosis and underscores its unique presentation. We present the case of a 76-year-old male with a history of rheumatoid arthritis (RA) who developed progressive right-sided weakness over several months. Three years prior, he experienced numbness on the right side of his face and upper limb. Initial imaging identified a small lesion in the left thalamic region, which was originally diagnosed as a glioma. However, due to the worsening of his clinical symptoms, further evaluation was warranted. Subsequent imaging revealed lesion growth, prompting a biopsy that ultimately confirmed the diagnosis of intracerebral amyloidoma. This case underscores the necessity of considering amyloidosis in the differential diagnosis of atypical neurological deficits, particularly in patients with systemic inflammatory conditions such as RA. The initial presentation of hemiparesis resembling a stroke, coupled with non-specific imaging findings and a prior misdiagnosis of glioma, highlights the complexity of cerebral amyloidosis. Only through brain biopsy was the definitive diagnosis established, emphasizing the need for improved diagnostic modalities to facilitate early detection. Further subtyping of amyloidosis, however, requires mass spectrometry-based proteomics or immunohistochemistry to accurately identify the specific amyloid protein involved. Clinicians should maintain a high index of suspicion for cerebral amyloidosis in patients with RA who present with progressive neurological deficits and atypical brain lesions. Early recognition and accurate diagnosis are essential to guiding appropriate management and improving patient outcomes. Full article
(This article belongs to the Section Systemic Pathophysiology)

18 pages, 2503 KiB  
Article
Defect Identification and Diagnosis for Distribution Network Electrical Equipment Based on Fused Image and Voiceprint Joint Perception
by An Chen, Junle Liu, Silin Liu, Jinchao Fan and Bin Liao
Energies 2025, 18(13), 3451; https://doi.org/10.3390/en18133451 - 30 Jun 2025
Viewed by 234
Abstract
As the scale of distribution networks expands, existing defect identification methods face numerous challenges, including limitations in single-modal feature identification, insufficient cross-modal information fusion, and the lack of a multi-stage feedback mechanism. To address these issues, we first propose a joint perception of image and voiceprint features based on bidirectional coupled attention, which enhances deep interaction across modalities and overcomes the shortcomings of traditional methods in cross-modal fusion. Secondly, a defect identification and diagnosis method for distribution network electrical equipment based on two-stage convolutional neural networks (CNNs) is introduced, which makes the network pay more attention to typical and frequent defects and enhances defect diagnosis accuracy and robustness. The proposed algorithm is compared with two baseline algorithms. Baseline 1 is a long short-term memory (LSTM)-based algorithm that performs separate feature extraction and processing for image and voiceprint signals without coupling the features of the two modalities, and Baseline 2 is a traditional CNN algorithm that uses classical convolutional layers for feature learning and classification through pooling and fully connected layers. Compared with the two baselines, simulation results demonstrate that the proposed method improves accuracy by 12.1% and 33.7%, recall by 12.5% and 33.1%, and diagnosis efficiency by 22.92% and 60.42%. Full article
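
The bidirectional coupling this abstract describes, with each modality attending to the other before fusion, might look like the following sketch; module names and token shapes are illustrative assumptions rather than the paper's design.

```python
import torch
import torch.nn as nn

class BidirectionalCoupledAttention(nn.Module):
    """Two coupled cross-attention passes: image features attend to voiceprint
    features and vice versa, so each modality is refined by the other."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.img_to_voice = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.voice_to_img = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, img, voice):     # (B, N, D) and (B, M, D) token sequences
        img_refined, _ = self.img_to_voice(img, voice, voice)
        voice_refined, _ = self.voice_to_img(voice, img, img)
        # Residual updates keep each modality's own evidence in play.
        return img + img_refined, voice + voice_refined
```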

16 pages, 2054 KiB  
Article
Transformer-Based Detection and Clinical Evaluation System for Torsional Nystagmus
by Ju-Hyuck Han, Yong-Suk Kim, Jong Bin Lee, Hantai Kim, Jong-Yeup Kim and Yongseok Cho
Sensors 2025, 25(13), 4039; https://doi.org/10.3390/s25134039 - 28 Jun 2025
Viewed by 344
Abstract
Motivation: Benign paroxysmal positional vertigo (BPPV) is characterized by torsional nystagmus induced by changes in head position, where accurate quantitative assessment of subtle torsional eye movements is essential for precise diagnosis. Conventional videonystagmography (VNG) techniques face challenges in accurately capturing the rotational components of pupil movements, and existing automated methods typically exhibit limited performance in identifying torsional nystagmus. Methodology: The objective of this study was to develop an automated system capable of accurately and quantitatively detecting torsional nystagmus. We introduce the Torsion Transformer model, designed to directly estimate torsion angles from iris images. This model employs a self-supervised learning framework comprising two main components: a Decoder module, which learns rotational transformations from image data, and a Finder module, which subsequently estimates the torsion angle. The resulting torsion angle data, represented as time-series, are then analyzed using a 1-dimensional convolutional neural network (1D-CNN) classifier to detect the presence of nystagmus. The performance of the proposed method was evaluated using video recordings from 127 patients diagnosed with BPPV. Findings: Our Torsion Transformer model demonstrated robust performance, achieving a sensitivity of 89.99%, specificity of 86.36%, an F1-score of 88.82%, and an area under the receiver operating characteristic curve (AUROC) of 87.93%. These results indicate that the proposed model effectively quantifies torsional nystagmus, with performance levels comparable to established methods for detecting horizontal and vertical nystagmus. Thus, the Torsion Transformer shows considerable promise as a clinical decision support tool in the diagnosis of BPPV. Key Findings: Technical performance improvement in torsional nystagmus detection; System to support clinical decision-making for healthcare professionals. Full article
(This article belongs to the Section Biomedical Sensors)
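
The final classification stage described here, a 1D-CNN over the torsion-angle time series, admits a compact generic form; all layer sizes below are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class TorsionSeriesClassifier(nn.Module):
    """Minimal 1D-CNN over a torsion-angle series (input: B x 1 x T),
    predicting nystagmus present/absent. A generic stand-in only."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Linear(32, num_classes)

    def forward(self, x):              # x: (B, 1, T) torsion angles over time
        return self.head(self.features(x).squeeze(-1))
```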

27 pages, 92544 KiB  
Article
Analysis of Gearbox Bearing Fault Diagnosis Method Based on 2D Image Transformation and 2D-RoPE Encoding
by Xudong Luo, Minghui Wang and Zhijie Zhang
Appl. Sci. 2025, 15(13), 7260; https://doi.org/10.3390/app15137260 - 27 Jun 2025
Viewed by 316
Abstract
The stability of gearbox bearings is crucial to the operational efficiency and safety of industrial equipment, as their faults can lead to downtime, economic losses, and safety risks. Traditional models face difficulties in handling complex industrial time-series data due to insufficient feature extraction capabilities and poor training stability. Although transformers show advantages in fault diagnosis, their ability to model local dependencies is limited. To improve feature extraction from time-series data and enhance model robustness, this paper proposes an innovative method based on the ViT. Time-series data were converted into two-dimensional images using polar coordinate transformation and Gramian matrices to enhance classification stability. A lightweight front-end encoder and depthwise feature extractor, combined with multi-scale depthwise separable convolution modules, were designed to enhance fine-grained features, while two-dimensional rotary position encoding preserved temporal information and captured temporal dependencies. The constructed RoPE-DWTrans model implemented a unified feature extraction process, significantly improving cross-dataset adaptability and model performance. Experimental results demonstrated that the RoPE-DWTrans model achieved excellent classification performance on the combined MCC5 and HUST gearbox datasets. In the fault category diagnosis task, classification accuracy reached 0.953, with precision at 0.959, recall at 0.973, and an F1 score of 0.961; in the fault category and severity diagnosis task, classification accuracy reached 0.923, with precision at 0.932, recall at 0.928, and an F1 score of 0.928. Compared with existing methods, the proposed model showed significant advantages in robustness and generalization ability, validating its effectiveness and application potential in industrial fault diagnosis. Full article
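
The polar-coordinate/Gramian-matrix transformation mentioned in this abstract is, in its standard form, the Gramian Angular Field. A minimal NumPy version is shown below, with a toy sinusoid standing in for real bearing vibration data.

```python
import numpy as np

def gramian_angular_field(x):
    """Gramian Angular (summation) Field: rescale a 1-D signal to [-1, 1],
    map samples to polar angles, and form GASF[i, j] = cos(phi_i + phi_j) --
    the standard route from a time series to a 2-D image."""
    x = np.asarray(x, dtype=float)
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1      # rescale to [-1, 1]
    phi = np.arccos(np.clip(x, -1.0, 1.0))               # polar angle per sample
    return np.cos(phi[:, None] + phi[None, :])           # (T, T) image

vibration = np.sin(np.linspace(0, 20 * np.pi, 256))      # toy bearing signal
img = gramian_angular_field(vibration)                   # fed to the 2-D model
```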

22 pages, 1359 KiB  
Article
A Meta-Learning-Based Ensemble Model for Explainable Alzheimer’s Disease Diagnosis
by Fatima Hasan Al-bakri, Wan Mohd Yaakob Wan Bejuri, Mohamed Nasser Al-Andoli, Raja Rina Raja Ikram, Hui Min Khor, Zulkifli Tahir and The Alzheimer’s Disease Neuroimaging Initiative
Diagnostics 2025, 15(13), 1642; https://doi.org/10.3390/diagnostics15131642 - 27 Jun 2025
Viewed by 592
Abstract
Background/Objectives: Artificial intelligence (AI) models for Alzheimer’s disease (AD) diagnosis often face the challenge of limited explainability, hindering their clinical adoption. Previous studies have relied on full-scale MRI, which increases unnecessary features, creating a “black-box” problem in current XAI models. Methods: This study proposes an explainable ensemble-based diagnostic framework trained on both clinical data and mid-slice axial MRI from the ADNI and OASIS datasets. The methodology involves training an ensemble model that integrates Random Forest, Support Vector Machine, XGBoost, and Gradient Boosting classifiers, with meta-logistic regression used for the final decision. The core contribution lies in the exclusive use of mid-slice MRI images, which highlight the lateral ventricles, thus improving the transparency and clinical relevance of the decision-making process. Our mid-slice approach minimizes unnecessary features and enhances model explainability by design. Results: We achieved state-of-the-art diagnostic accuracy: 99% on OASIS and 97.61% on ADNI using clinical data alone; 99.38% on OASIS and 98.62% on ADNI using only mid-slice MRI; and 99% accuracy when combining both modalities. The findings demonstrated significant progress in diagnostic transparency, as the algorithm consistently linked predictions to observed structural changes in the dilated lateral ventricles of the brain, which serve as a clinically reliable biomarker for AD and can be easily verified by medical professionals. Conclusions: This research presents a step toward more transparent AI-driven diagnostics, bridging the gap between accuracy and explainability in XAI. Full article
(This article belongs to the Special Issue Explainable Machine Learning in Clinical Diagnostics)
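
The ensemble this abstract describes maps directly onto scikit-learn's stacking API. The sketch below wires up the four named base learners under a logistic-regression meta-learner; default hyperparameters stand in for the authors' settings, and the xgboost package is assumed to be installed.

```python
from sklearn.ensemble import (StackingClassifier, RandomForestClassifier,
                              GradientBoostingClassifier)
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier

# Base learners named in the abstract, stacked under a logistic-regression
# meta-learner that sees their cross-validated class probabilities.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200)),
        ("svm", SVC(probability=True)),
        ("xgb", XGBClassifier(eval_metric="logloss")),
        ("gb", GradientBoostingClassifier()),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    stack_method="predict_proba",
    cv=5,
)
# Usage: stack.fit(X_train, y_train); stack.predict(X_test)
```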
