MDPI - Publisher of Open Access Journals

37 pages, 9067 KB

Open Access Review

Hybrid Quantum–Classical Architectures in Medical Imaging: A Taxonomy-Based Survey of COVID-19 Models

by Seyedeh Aram Salehi, Hanieh Naderi, Seyyed Amir Asghari, Javad Chaharlang and Yvon Savaria

Quantum Rep. 2026, 8(2), 54; https://doi.org/10.3390/quantum8020054 - 12 Jun 2026

Viewed by 269

This paper reviews hybrid quantum–classical (HQC) architectures for COVID-19-related respiratory medical-image analysis. To address the heterogeneity of existing studies, we propose an architecture-centric taxonomy based on the functional role and placement of the quantum module. Reviewed models are grouped into three archetypes: Archetype A, where quantum circuits act as patch-level quanvolutional preprocessors; Archetype B, where classical feature extractors are coupled with quantum classifier heads; and Archetype C, where quantum circuits generate intermediate features for downstream classical classifiers. Ten peer-reviewed journal studies were selected through a PRISMA-inspired search and analyzed across architecture, diagnostic performance, quantum resource reporting, validation rigor, computational scalability, and deployment feasibility. The review shows that HQC models often report promising binary COVID-19 screening results on CT or chest X-ray images, but multiclass respiratory classification remains less stable. Key limitations include simulator-dominated evaluation, limited external validation, unclear patient-wise splitting, incomplete reporting of qubit counts, circuit depth, and shots, and insufficient comparison with strong classical baselines. Overall, current HQC models should be viewed as exploratory quantum-augmented classical pipelines rather than clinically validated diagnostic systems. No conclusive task-level quantum advantage has yet been demonstrated for COVID-19 medical imaging. Future progress requires standardized benchmarking, transparent quantum-resource reporting, patient-wise and multi-center validation, hardware-aware evaluation, and interpretable hybrid designs compatible with NISQ-era constraints. Full article

(This article belongs to the Section Quantum Computing and Information Processing)

23 pages, 4574 KB

Open Access Article

LLaMA-XR: A Novel Framework for Radiology Report Generation Using LLaMA and QLoRA Fine Tuning

by Md. Zihad Bin Jahangir, Muhammad Ashad Kabir, Sumaiya Akter, Israt Jahan and Minh Chau

Bioengineering 2026, 13(5), 493; https://doi.org/10.3390/bioengineering13050493 - 23 Apr 2026

Viewed by 1063

Abstract

Background: The goal of automated radiology report generation is to help radiologists in their task of creating descriptive reports from chest radiographs. However, the process of creating coherent and contextually accurate reports has been challenging, mainly due to the intricacies of medical language and the need to correlate visual data with textual descriptions. Methods: This study presents LLaMA-XR, a novel framework that integrates the Meta LLaMA 3.1 large language model with DenseNet-121-based image embeddings and Quantized Low-Rank Adaptation (QLoRA) fine-tuning. Results: Experiments conducted on the IU X-ray dataset demonstrate that LLaMA-XR outperforms a range of state-of-the-art methods. It achieves a ROUGE-L score of 0.433 and a METEOR score of 0.336, establishing new performance benchmarks in the domain. Conclusions: These results underscore LLaMA-XR’s potential as an effective artificial intelligence system for automated radiology reporting, offering enhanced performance. Full article

(This article belongs to the Special Issue AI-Driven Imaging and Analysis for Biomedical Applications)

25 pages, 2531 KB

Open Access Article

FedIHRAS: A Privacy-Preserving Federated Learning Framework for Multi-Institutional Collaborative Radiological Analysis with Integrated Explainability and Automated Clinical Reporting

by André Luiz Marques Serrano, Gabriel Arquelau Pimenta Rodrigues, Guilherme Dantas Bispo, Vinícius Pereira Gonçalves, Geraldo Pereira Rocha Filho, Maria Gabriela Mendonça Peixoto, Rodrigo Bonacin and Rodolfo Ipolito Meneguette

Biomedicines 2026, 14(3), 713; https://doi.org/10.3390/biomedicines14030713 - 19 Mar 2026

Viewed by 675

Abstract

Background/Objectives: Federated learning has emerged as a promising paradigm for enabling collaborative artificial intelligence in healthcare while preserving data privacy. However, most existing frameworks focus on isolated tasks and lack integrated pipelines that combine classification, segmentation, explainability, and automated clinical reporting. Methods: This study proposes FedIHRAS, a privacy-preserving federated learning framework designed for multi-institutional radiological analysis. The system integrates multi-task deep learning modules, including pathology classification using a modified ResNet-50 backbone, anatomical segmentation, explainability through Grad-CAM, and automated report generation supported by semantic aggregation using SNOMED CT. The framework employs confidence-weighted aggregation, differential privacy mechanisms, and secure aggregation protocols to ensure privacy and robustness across heterogeneous institutional datasets. Results: Experimental evaluation was conducted across four large-scale chest X-ray datasets representing simulated institutional nodes, totaling approximately 874,000 images. FedIHRAS achieved high diagnostic performance with strong cross-institutional generalization and demonstrated improved robustness under non-IID data distributions. Additional experiments showed favorable communication efficiency, effective privacy–utility trade-offs, and strong agreement with expert radiologist assessments. Conclusion: The proposed FedIHRAS framework demonstrates that federated learning can support scalable, privacy-preserving, and clinically meaningful radiological AI systems. By integrating multi-task learning, explainability, and automated reporting within a unified federated architecture, the framework addresses key limitations of existing approaches and contributes to the development of collaborative AI in healthcare. Full article

(This article belongs to the Special Issue Imaging Technology for Human Diseases)

29 pages, 5858 KB

Open Access Article

MRID: Modeling Radiological Image Differences for Disease Progression Reasoning via Multi-Task Self-Supervision

by Yongtao Hao, Pandong Wang, Yanming Chen and Haifeng Zhao

Electronics 2026, 15(5), 997; https://doi.org/10.3390/electronics15050997 - 27 Feb 2026

Viewed by 487

Abstract

Automated radiology report generation has become a prominent research topic in medical multimodal learning. However, most existing approaches primarily focus on single-image interpretation and rarely address the task of tracking disease progression across longitudinal chest X-rays. This task presents two major challenges: accurately localizing pathological changes between temporally paired images, and effectively translating visual difference representations into clinically meaningful textual descriptions. To address these challenges, we propose MRID (Modeling Radiological Image Differences for Disease Progression Reasoning), a multi-task self-supervised framework that follows a pretraining–finetuning paradigm. MRID leverages multiple complementary self-supervised objectives to jointly achieve (1) intra-modal spatial alignment of organs and pathological regions across image pairs, and (2) cross-modal semantic alignment between visual difference representations and radiology report embeddings. Furthermore, we introduce a simple yet effective data augmentation strategy to alleviate the imbalance of disease progression categories. Extensive experiments conducted on the Longitudinal-MIMIC and MS-CXR-T datasets demonstrate that MRID effectively captures fine-grained disease progression patterns. In addition, the proposed framework achieves competitive performance on single-image radiology report generation, further highlighting its strong capability in modeling chest X-ray semantics. Full article

(This article belongs to the Special Issue AI-Driven Medical Image/Video Processing)

22 pages, 1944 KB

Open Access Article

Automated Radiological Report Generation from Breast Ultrasound Images Using Vision and Language Transformers

by Shaheen Khatoon and Azhar Mahmood

J. Imaging 2026, 12(2), 68; https://doi.org/10.3390/jimaging12020068 - 6 Feb 2026

Viewed by 1451

Abstract

Breast ultrasound imaging is widely used for the detection and characterization of breast abnormalities; however, generating detailed and consistent radiological reports remains a labor-intensive and subjective process. Recent advances in deep learning have demonstrated the potential of automated report generation systems to support clinical workflows, yet most existing approaches focus on chest X-ray imaging and rely on convolutional–recurrent architectures with limited capacity to model long-range dependencies and complex clinical semantics. In this work, we propose a multimodal Transformer-based framework for automatic breast ultrasound report generation that integrates visual and textual information through cross-attention mechanisms. The proposed architecture employs a Vision Transformer (ViT) to extract rich spatial and morphological features from ultrasound images. For textual embedding, pretrained language models (BERT, BioBERT, and GPT-2) are implemented in various encoder–decoder configurations to leverage both general linguistic knowledge and domain-specific biomedical semantics. A multimodal Transformer decoder is implemented to autoregressively generate diagnostic reports by jointly attending to visual features and contextualized textual embeddings. We conducted an extensive quantitative evaluation using standard report generation metrics, including BLEU, ROUGE-L, METEOR, and CIDEr, to assess lexical accuracy, semantic alignment, and clinical relevance. Experimental results demonstrate that BioBERT-based models consistently outperform general domain counterparts in clinical specificity, while GPT-2-based decoders improve linguistic fluency. Full article

(This article belongs to the Section AI in Imaging)

33 pages, 4219 KB

Open Access Review

Recent Progress in Deep Learning for Chest X-Ray Report Generation

by Mounir Salhi and Moulay A. Akhloufi

BioMedInformatics 2026, 6(1), 3; https://doi.org/10.3390/biomedinformatics6010003 - 9 Jan 2026

Viewed by 4493

Abstract

Chest X-ray radiology report generation is a challenging task that involves techniques from medical natural language processing and computer vision. This paper provides a comprehensive overview of recent progress. The annotation protocols, structure, linguistic characteristics, and size of the main public datasets are presented and compared. Understanding their properties is necessary for benchmarking and generalization. Both clinically oriented and natural language generation metrics are included in the model evaluation strategies to assess their performance. Their respective strengths and limitations are discussed in the context of radiology applications. Recent deep learning approaches for report generation and their different architectures are also reviewed. Common trends such as instruction tuning and the integration of clinical knowledge are also considered. Recent works show that current models still have limited factual accuracy, with a score of 72% reported with expert evaluations, and poor performance on rare pathologies and lateral views. The most important challenges are the limited dataset diversity, weak cross-institution generalization, and the lack of clinically validated benchmarks for evaluating factual reliability. Finally, we discuss open challenges related to data quality, clinical factuality, and interpretability. This review aims to support researchers by synthesizing the current literature and identifying key directions for developing more clinically reliable report generation systems. Full article

20 pages, 7543 KB

Open Access Article

Contrastive Learning with Feature Space Interpolation for Retrieval-Based Chest X-Ray Report Generation

by Zahid Ur Rahman, Gwanghyun Yu, Lee Jin and Jin Young Kim

Appl. Sci. 2026, 16(1), 470; https://doi.org/10.3390/app16010470 - 1 Jan 2026

Viewed by 1175

Abstract

Automated radiology report generation from chest X-rays presents a critical challenge in medical imaging. Traditional image-captioning models struggle with clinical specificity and rare pathologies. Recently, contrastive vision language learning has emerged as a robust alternative that learns joint visual–textual representations. However, applying contrastive learning (CL) to radiology remains challenging due to severe data scarcity. Prior work has employed input space augmentation, but these approaches incur computational overhead and risk distorting diagnostic features. This work presents CL with feature space interpolation for retrieval (CLFIR), a novel CL framework operating on learned embeddings. The method generates interpolated pairs in the feature embedding space by mixing original and shuffled embeddings in batches using a mixing coefficient $\lambda \sim U(0.85, 0.99)$. This approach increases batch diversity via synthetic samples, addressing the limitations of CL on medical data while preserving diagnostic integrity. Extensive experiments demonstrate state-of-the-art performance across critical clinical validation tasks. For report generation, CLFIR achieves BLEU-1/ROUGE/METEOR scores of 0.51/0.40/0.26 (Indiana University [IU] X-ray) and 0.45/0.34/0.22 (MIMIC-CXR). Moreover, CLFIR excels at image-to-text retrieval with R@1 scores of 4.14% (IU X-ray) and 24.3% (MIMIC-CXR) and achieves 0.65 accuracy in zero-shot classification on the CheXpert5×200 dataset, surpassing the established vision-language models. Full article
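The feature-space mixing described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `interpolate_batch`, the use of plain Python lists for embeddings, and the choice to draw a single mixing coefficient per batch are all assumptions made for illustration. The abstract only states that original and shuffled embeddings are mixed with a coefficient drawn from U(0.85, 0.99).

```python
import random


def interpolate_batch(embeddings, low=0.85, high=0.99, rng=None):
    """Mix each embedding with a partner from a shuffled copy of the batch.

    mixed_i = lam * e_i + (1 - lam) * e_partner(i), with lam ~ U(low, high).
    Hypothetical helper; one lam per batch is an assumption.
    """
    if rng is None:
        rng = random.Random()
    partners = list(embeddings)
    rng.shuffle(partners)            # shuffled copy of the batch
    lam = rng.uniform(low, high)     # mixing coefficient, close to 1
    mixed = [
        [lam * a + (1.0 - lam) * b for a, b in zip(original, partner)]
        for original, partner in zip(embeddings, partners)
    ]
    return mixed, lam


if __name__ == "__main__":
    batch = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
    mixed, lam = interpolate_batch(batch, rng=random.Random(0))
    print(lam, mixed)
```

Because the coefficient stays close to 1, each synthetic embedding remains near its original, which is consistent with the abstract's stated aim of adding batch diversity without distorting diagnostic features.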

(This article belongs to the Topic Artificial Intelligence and Big Data in Biomedical Engineering)

21 pages, 4464 KB

Open Access Article

Chest X-Ray Medical Report Generation Using a CNN—Transformer Model with Maximum Attention

by Mei-Hua Hsih, Shih-Po Lin and Chen-Chiung Hsieh

Electronics 2025, 14(20), 4123; https://doi.org/10.3390/electronics14204123 - 21 Oct 2025

Viewed by 2511

Abstract

Medical imaging, particularly chest X-rays, plays a vital role in radiological diagnosis. However, interpreting these images and generating detailed diagnostic reports is a time-consuming task for clinicians. To address this challenge, this study proposes an automated image captioning framework for chest X-ray images, aiming to reduce clinical workload and enhance diagnostic efficiency. The proposed approach employs convolutional neural networks (CNNs) for visual feature extraction and a modified Transformer architecture—referred to as the Medical Transformer—for structured report generation. Three CNN models, namely InceptionV3, ResNet152V2, and Inception–ResNetV2, were evaluated as feature extractors. The attention mechanisms, Bahdanau, Luong, and scaled dot product, were activated by ReLU or Tanh functions to identify the optimal configuration, i.e., the maximum attention is used. Experiments were conducted using the Indiana University Chest X-ray dataset, which contains 7466 images paired with corresponding diagnostic reports. The proposed approach employs image augmentation to accommodate input variability, utilizes Inception–ResNetV2 for feature extraction, and integrates the Medical Transformer with maximum attention mechanisms to achieve optimal performance in medical report generation. Evaluation metrics include BLEU (BLEU-1 to BLEU-4 scores of 0.720, 0.669, 0.648, and 0.600, respectively), METEOR (0.741), and BERTScore (F_BERT = 0.787), demonstrating superior performance compared to baseline models and the state of the art. These results validate the effectiveness of the proposed Medical Transformer framework in generating accurate and clinically relevant medical image captions. Full article

(This article belongs to the Special Issue Digital Signal and Image Processing for Multimedia Technology)

17 pages, 1310 KB

Open Access Article

IHRAS: Automated Medical Report Generation from Chest X-Rays via Classification, Segmentation, and LLMs

by Gabriel Arquelau Pimenta Rodrigues, André Luiz Marques Serrano, Guilherme Dantas Bispo, Geraldo Pereira Rocha Filho, Vinícius Pereira Gonçalves and Rodolfo Ipolito Meneguette

Bioengineering 2025, 12(8), 795; https://doi.org/10.3390/bioengineering12080795 - 24 Jul 2025

Cited by 3 | Viewed by 4136

Abstract

The growing demand for accurate and efficient Chest X-Ray (CXR) interpretation has prompted the development of AI-driven systems to alleviate radiologist workload and reduce diagnostic variability. This paper introduces the Intelligent Humanized Radiology Analysis System (IHRAS), a modular framework that automates the end-to-end process of CXR analysis and report generation. IHRAS integrates four core components: (i) deep convolutional neural networks for multi-label classification of 14 thoracic conditions; (ii) Grad-CAM for spatial visualization of pathologies; (iii) SAR-Net for anatomical segmentation; and (iv) a large language model (DeepSeek-R1) guided by the CRISPE prompt engineering framework to generate structured diagnostic reports using SNOMED CT terminology. Evaluated on the NIH ChestX-ray dataset, IHRAS demonstrates consistent diagnostic performance across diverse demographic and clinical subgroups, and produces high-fidelity, clinically relevant radiological reports with strong faithfulness, relevancy, and alignment scores. The system offers a transparent and scalable solution to support radiological workflows while highlighting the importance of interpretability and standardization in clinical Artificial Intelligence applications. Full article

(This article belongs to the Special Issue AI Advancements in Healthcare: Medical Imaging and Sensing Technologies)

13 pages, 1566 KB

Open Access Article

Turkish Chest X-Ray Report Generation Model Using the Swin Enhanced Yield Transformer (Model-SEY) Framework

by Murat Ucan, Buket Kaya and Mehmet Kaya

Diagnostics 2025, 15(14), 1805; https://doi.org/10.3390/diagnostics15141805 - 17 Jul 2025

Cited by 1 | Viewed by 1785

Abstract

Background/Objectives: Extracting meaningful medical information from chest X-ray images and transcribing it into text is a complex task that requires a high level of expertise and directly affects clinical decision-making processes. Automatic reporting systems for this field in Turkish represent an important gap in scientific research, as they have not been sufficiently addressed in the existing literature. Methods: A deep learning-based approach called Model-SEY was developed with the aim of automatically generating Turkish medical reports from chest X-ray images. The Swin Transformer structure was used in the encoder part of the model to extract image features, while the text generation process was carried out using the cosmosGPT architecture, which was adapted specifically for the Turkish language. Results: With the permission of the ethics committee, a new dataset was created using image–report pairs obtained from Elazıg Fethi Sekin City Hospital and the Indiana University Chest X-Ray dataset, and experiments were conducted on this new dataset. In the tests conducted within the scope of the study, scores of 0.6412, 0.5335, 0.4395, 0.3716, and 0.2240 were obtained in the BLEU-1, BLEU-2, BLEU-3, BLEU-4, and ROUGE word overlap evaluation metrics, respectively. Conclusions: Quantitative and qualitative analyses of medical reports autonomously generated by the proposed model have shown that they are meaningful and consistent. The proposed model is one of the first studies in the field of autonomous reporting using deep learning architectures specific to the Turkish language, representing an important step forward in this field. It will also reduce potential human errors during diagnosis by supporting doctors in their decision-making. Full article

(This article belongs to the Special Issue Artificial Intelligence for Health and Medicine)

17 pages, 1532 KB

Open Access Article

RADAI: A Deep Learning-Based Classification of Lung Abnormalities in Chest X-Rays

by Hanan Aljuaid, Hessa Albalahad, Walaa Alshuaibi, Shahad Almutairi, Tahani Hamad Aljohani, Nazar Hussain and Farah Mohammad

Diagnostics 2025, 15(13), 1728; https://doi.org/10.3390/diagnostics15131728 - 7 Jul 2025

Cited by 4 | Viewed by 2902

Abstract

Background: Chest X-rays are rapidly gaining prominence as a prevalent diagnostic tool, as recognized by the World Health Organization (WHO). However, interpreting chest X-rays can be demanding and time-consuming, even for experienced radiologists, leading to potential misinterpretations and delays in treatment. Method: The purpose of this research is the development of the RadAI model, which can accurately detect four types of lung abnormalities in chest X-rays and generate a report on each identified abnormality. Deep learning algorithms, particularly convolutional neural networks (CNNs), have demonstrated remarkable potential in automating medical image analysis, including chest X-rays. This work addresses the challenge of chest X-ray interpretation by fine-tuning the following three advanced deep learning models: Feature-selective and Spatial Receptive Fields Network (FSRFNet50), ResNext50, and ResNet50. These models are compared based on accuracy, precision, recall, and F1-score. Results: The outstanding performance of RadAI shows its potential to assist radiologists in interpreting the detected chest abnormalities accurately. Conclusions: RadAI is beneficial in enhancing the accuracy and efficiency of chest X-ray interpretation, ultimately supporting the timely and reliable diagnosis of lung abnormalities. Full article

(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)

28 pages, 4804 KB

Open Access Article

Towards Automatic Detection of Pneumothorax in Emergency Care with Deep Learning Using Multi-Source Chest X-ray Data

by Santiago Ibañez Caturla, Juan de Dios Berná Mestre and Oscar Martinez Mozos

Future Internet 2025, 17(7), 292; https://doi.org/10.3390/fi17070292 - 29 Jun 2025

Viewed by 3246

Abstract

Pneumothorax is a potentially life-threatening condition defined as the collapse of the lung due to air leakage into the chest cavity. Delays in the diagnosis of pneumothorax can lead to severe complications and even mortality. A significant challenge in pneumothorax diagnosis is the shortage of radiologists, resulting in the absence of written reports in plain X-rays and, consequently, impacting patient care. In this paper, we propose an automatic triage system for pneumothorax detection in X-ray images based on deep learning. We address this problem from the perspective of multi-source domain adaptation where different datasets available on the Internet are used for training and testing. In particular, we use datasets which contain chest X-ray images corresponding to different conditions (including pneumothorax). A convolutional neural network (CNN) with an EfficientNet architecture is trained and optimized to identify radiographic signs of pneumothorax using those public datasets. We present the results using cross-dataset validation, demonstrating the robustness and generalization capabilities of our multi-source solution across different datasets. The experimental results demonstrate the model’s potential to assist clinicians in prioritizing and correctly detecting urgent cases of pneumothorax using different integrated deployment strategies. Full article

(This article belongs to the Special Issue Artificial Intelligence-Enabled Smart Healthcare)

22 pages, 1899 KB

Open Access Article

GIT-CXR: End-to-End Transformer for Chest X-Ray Report Generation

by Iustin Sîrbu, Iulia-Renata Sîrbu, Jasmina Bogojeska and Traian Rebedea

Information 2025, 16(7), 524; https://doi.org/10.3390/info16070524 - 23 Jun 2025

Cited by 6 | Viewed by 2862

Abstract

Medical imaging is crucial for diagnosing, monitoring, and treating medical conditions. The medical reports of radiology images are the primary medium through which medical professionals can attest to their findings, but their writing is time-consuming and requires specialized clinical expertise. Therefore, the automated generation of radiography reports has the potential to improve and standardize patient care and significantly reduce the workload of clinicians. Through our work, we have designed and evaluated an end-to-end transformer-based method to generate accurate and factually complete radiology reports for X-ray images. Additionally, we are the first to introduce curriculum learning for end-to-end transformers in medical imaging and demonstrate its impact in obtaining improved performance. The experiments were conducted using the MIMIC-CXR-JPG database, the largest available chest X-ray dataset. The results obtained are comparable with the current state of the art on the natural language generation (NLG) metrics BLEU and ROUGE-L, while setting new state-of-the-art results on the F1 examples-averaged, F1-macro, and F1-micro metrics for clinical accuracy and on the METEOR metric widely used for NLG. Full article

(This article belongs to the Section Information Applications)

14 pages, 723 KB

Open Access Article

RMPT: Reinforced Memory-Driven Pure Transformer for Automatic Chest X-Ray Report Generation

by Caijie Qin, Yize Xiong, Weibin Chen and Yong Li

Mathematics 2025, 13(9), 1492; https://doi.org/10.3390/math13091492 - 30 Apr 2025

Cited by 2 | Viewed by 1271

Abstract

Automatic generation of chest X-ray reports, designed to produce clinically precise descriptions from chest X-ray images, is gaining significant research attention because of its vast potential in clinical applications. Despite considerable recent progress, current models typically adhere to a CNN–Transformer-based framework, which still fails to enhance the perceptual field during image feature extraction. To solve this problem, we propose the Reinforced Memory-driven Pure Transformer (RMPT), a novel Transformer–Transformer-based model. In implementation, our RMPT employs the Swin Transformer to extract visual features from given X-ray images; it has a larger perceptual field and can better model the relationships between different regions. Furthermore, we adopt a memory-driven Transformer (MemTrans) to effectively model similar patterns in different reports, which helps the model generate long reports. Finally, we present an innovative training approach leveraging Reinforcement Learning (RL) that efficiently steers the model to focus on challenging samples, consequently improving its comprehensive performance across both straightforward and complex situations. Experimental results on the IU X-ray dataset show that our proposed RMPT achieves superior performance on various Natural Language Generation (NLG) evaluation metrics. Further ablation study results demonstrate that our RMPT model achieves a 10.5% overall performance improvement compared to the base model. Full article

17 pages, 1944 KB

Open Access Article

Pediatric Pneumonia Recognition Using an Improved DenseNet201 Model with Multi-Scale Convolutions and Mish Activation Function

by Petra Radočaj, Dorijan Radočaj and Goran Martinović

Algorithms 2025, 18(2), 98; https://doi.org/10.3390/a18020098 - 10 Feb 2025

Cited by 7 | Viewed by 2466

Abstract

Pediatric pneumonia remains a significant global health issue, particularly in low- and middle-income countries, where it contributes substantially to mortality in children under five. This study introduces a deep learning model for pediatric pneumonia diagnosis from chest X-rays that surpasses the performance of state-of-the-art methods reported in the recent literature. Using a DenseNet201 architecture with a Mish activation function and multi-scale convolutions, the model was trained on a dataset of 5856 chest X-ray images, achieving high performance: 0.9642 accuracy, 0.9580 precision, 0.9506 sensitivity, 0.9542 F1 score, and 0.9507 specificity. These results demonstrate a significant advancement in diagnostic precision and efficiency within this domain. By achieving the highest accuracy and F1 score compared to other recent work using the same dataset, our approach offers a tangible improvement for resource-constrained environments where access to specialists and sophisticated equipment is limited. While the need for high-quality datasets and adequate computational resources remains a general consideration for deep learning applications, our model’s demonstrably superior performance establishes a new benchmark and offers the delivery of more timely and precise diagnoses, with the potential to significantly enhance patient outcomes. Full article
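As a rough consistency check on the figures quoted in this abstract, the F1 score can be recomputed from the reported precision and sensitivity (recall) using the standard harmonic-mean definition. This is only a sketch: the abstract does not say how the metrics were averaged across classes, so treating them as a single precision/recall pair is an assumption, and the helper name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    return 2.0 * precision * recall / (precision + recall)


# Values as quoted in the abstract; sensitivity is used as recall.
reported_precision = 0.9580
reported_recall = 0.9506
reported_f1 = 0.9542

f1 = f1_score(reported_precision, reported_recall)
print(f"recomputed F1 = {f1:.4f}, reported F1 = {reported_f1:.4f}")
```

Under that assumption the recomputed value agrees with the reported F1 score to within the rounding of the four-decimal inputs.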

(This article belongs to the Special Issue Machine Learning in Medical Signal and Image Processing (3rd Edition))
