Search Results (42)

Search Parameters:
Keywords = chest X-ray report generation

37 pages, 9067 KB  
Review
Hybrid Quantum–Classical Architectures in Medical Imaging: A Taxonomy-Based Survey of COVID-19 Models
by Seyedeh Aram Salehi, Hanieh Naderi, Seyyed Amir Asghari, Javad Chaharlang and Yvon Savaria
Quantum Rep. 2026, 8(2), 54; https://doi.org/10.3390/quantum8020054 - 12 Jun 2026
Viewed by 269
Abstract
This paper reviews hybrid quantum–classical (HQC) architectures for COVID-19-related respiratory medical-image analysis. To address the heterogeneity of existing studies, we propose an architecture-centric taxonomy based on the functional role and placement of the quantum module. Reviewed models are grouped into three archetypes: Archetype A, where quantum circuits act as patch-level quanvolutional preprocessors; Archetype B, where classical feature extractors are coupled with quantum classifier heads; and Archetype C, where quantum circuits generate intermediate features for downstream classical classifiers. Ten peer-reviewed journal studies were selected through a PRISMA-inspired search and analyzed across architecture, diagnostic performance, quantum resource reporting, validation rigor, computational scalability, and deployment feasibility. The review shows that HQC models often report promising binary COVID-19 screening results on CT or chest X-ray images, but multiclass respiratory classification remains less stable. Key limitations include simulator-dominated evaluation, limited external validation, unclear patient-wise splitting, incomplete reporting of qubit counts, circuit depth, and shots, and insufficient comparison with strong classical baselines. Overall, current HQC models should be viewed as exploratory quantum-augmented classical pipelines rather than clinically validated diagnostic systems. No conclusive task-level quantum advantage has yet been demonstrated for COVID-19 medical imaging. Future progress requires standardized benchmarking, transparent quantum-resource reporting, patient-wise and multi-center validation, hardware-aware evaluation, and interpretable hybrid designs compatible with NISQ-era constraints. Full article
(This article belongs to the Section Quantum Computing and Information Processing)

23 pages, 4574 KB  
Article
LLaMA-XR: A Novel Framework for Radiology Report Generation Using LLaMA and QLoRA Fine Tuning
by Md. Zihad Bin Jahangir, Muhammad Ashad Kabir, Sumaiya Akter, Israt Jahan and Minh Chau
Bioengineering 2026, 13(5), 493; https://doi.org/10.3390/bioengineering13050493 - 23 Apr 2026
Viewed by 1063
Abstract
Background: The goal of automated radiology report generation is to help radiologists create descriptive reports from chest radiographs. However, creating coherent and contextually accurate reports has been challenging, mainly due to the intricacies of medical language and the need to correlate visual data with textual descriptions. Methods: This study presents LLaMA-XR, a novel framework that integrates the Meta LLaMA 3.1 large language model with DenseNet-121-based image embeddings and Quantized Low-Rank Adaptation (QLoRA) fine-tuning. Results: Experiments conducted on the IU X-ray dataset demonstrate that LLaMA-XR outperforms a range of state-of-the-art methods. It achieves a ROUGE-L score of 0.433 and a METEOR score of 0.336, establishing new performance benchmarks in the domain. Conclusions: These results underscore LLaMA-XR’s potential as an effective artificial intelligence system for automated radiology reporting, offering enhanced performance.
(This article belongs to the Special Issue AI-Driven Imaging and Analysis for Biomedical Applications)

25 pages, 2531 KB  
Article
FedIHRAS: A Privacy-Preserving Federated Learning Framework for Multi-Institutional Collaborative Radiological Analysis with Integrated Explainability and Automated Clinical Reporting
by André Luiz Marques Serrano, Gabriel Arquelau Pimenta Rodrigues, Guilherme Dantas Bispo, Vinícius Pereira Gonçalves, Geraldo Pereira Rocha Filho, Maria Gabriela Mendonça Peixoto, Rodrigo Bonacin and Rodolfo Ipolito Meneguette
Biomedicines 2026, 14(3), 713; https://doi.org/10.3390/biomedicines14030713 - 19 Mar 2026
Viewed by 675
Abstract
Background/Objectives: Federated learning has emerged as a promising paradigm for enabling collaborative artificial intelligence in healthcare while preserving data privacy. However, most existing frameworks focus on isolated tasks and lack integrated pipelines that combine classification, segmentation, explainability, and automated clinical reporting. Methods: This study proposes FedIHRAS, a privacy-preserving federated learning framework designed for multi-institutional radiological analysis. The system integrates multi-task deep learning modules, including pathology classification using a modified ResNet-50 backbone, anatomical segmentation, explainability through Grad-CAM, and automated report generation supported by semantic aggregation using SNOMED CT. The framework employs confidence-weighted aggregation, differential privacy mechanisms, and secure aggregation protocols to ensure privacy and robustness across heterogeneous institutional datasets. Results: Experimental evaluation was conducted across four large-scale chest X-ray datasets representing simulated institutional nodes, totaling approximately 874,000 images. FedIHRAS achieved high diagnostic performance with strong cross-institutional generalization and demonstrated improved robustness under non-IID data distributions. Additional experiments showed favorable communication efficiency, effective privacy–utility trade-offs, and strong agreement with expert radiologist assessments. Conclusion: The proposed FedIHRAS framework demonstrates that federated learning can support scalable, privacy-preserving, and clinically meaningful radiological AI systems. By integrating multi-task learning, explainability, and automated reporting within a unified federated architecture, the framework addresses key limitations of existing approaches and contributes to the development of collaborative AI in healthcare. Full article
(This article belongs to the Special Issue Imaging Technology for Human Diseases)

29 pages, 5858 KB  
Article
MRID: Modeling Radiological Image Differences for Disease Progression Reasoning via Multi-Task Self-Supervision
by Yongtao Hao, Pandong Wang, Yanming Chen and Haifeng Zhao
Electronics 2026, 15(5), 997; https://doi.org/10.3390/electronics15050997 - 27 Feb 2026
Viewed by 487
Abstract
Automated radiology report generation has become a prominent research topic in medical multimodal learning. However, most existing approaches primarily focus on single-image interpretation and rarely address the task of tracking disease progression across longitudinal chest X-rays. This task presents two major challenges: accurately localizing pathological changes between temporally paired images, and effectively translating visual difference representations into clinically meaningful textual descriptions. To address these challenges, we propose MRID (Modeling Radiological Image Differences for Disease Progression Reasoning), a multi-task self-supervised framework that follows a pretraining–finetuning paradigm. MRID leverages multiple complementary self-supervised objectives to jointly achieve (1) intra-modal spatial alignment of organs and pathological regions across image pairs, and (2) cross-modal semantic alignment between visual difference representations and radiology report embeddings. Furthermore, we introduce a simple yet effective data augmentation strategy to alleviate the imbalance of disease progression categories. Extensive experiments conducted on the Longitudinal-MIMIC and MS-CXR-T datasets demonstrate that MRID effectively captures fine-grained disease progression patterns. In addition, the proposed framework achieves competitive performance on single-image radiology report generation, further highlighting its strong capability in modeling chest X-ray semantics. Full article
(This article belongs to the Special Issue AI-Driven Medical Image/Video Processing)

22 pages, 1944 KB  
Article
Automated Radiological Report Generation from Breast Ultrasound Images Using Vision and Language Transformers
by Shaheen Khatoon and Azhar Mahmood
J. Imaging 2026, 12(2), 68; https://doi.org/10.3390/jimaging12020068 - 6 Feb 2026
Viewed by 1451
Abstract
Breast ultrasound imaging is widely used for the detection and characterization of breast abnormalities; however, generating detailed and consistent radiological reports remains a labor-intensive and subjective process. Recent advances in deep learning have demonstrated the potential of automated report generation systems to support clinical workflows, yet most existing approaches focus on chest X-ray imaging and rely on convolutional–recurrent architectures with limited capacity to model long-range dependencies and complex clinical semantics. In this work, we propose a multimodal Transformer-based framework for automatic breast ultrasound report generation that integrates visual and textual information through cross-attention mechanisms. The proposed architecture employs a Vision Transformer (ViT) to extract rich spatial and morphological features from ultrasound images. For textual embedding, pretrained language models (BERT, BioBERT, and GPT-2) are implemented in various encoder–decoder configurations to leverage both general linguistic knowledge and domain-specific biomedical semantics. A multimodal Transformer decoder is implemented to autoregressively generate diagnostic reports by jointly attending to visual features and contextualized textual embeddings. We conducted an extensive quantitative evaluation using standard report generation metrics, including BLEU, ROUGE-L, METEOR, and CIDEr, to assess lexical accuracy, semantic alignment, and clinical relevance. Experimental results demonstrate that BioBERT-based models consistently outperform general domain counterparts in clinical specificity, while GPT-2-based decoders improve linguistic fluency. Full article
(This article belongs to the Section AI in Imaging)

33 pages, 4219 KB  
Review
Recent Progress in Deep Learning for Chest X-Ray Report Generation
by Mounir Salhi and Moulay A. Akhloufi
BioMedInformatics 2026, 6(1), 3; https://doi.org/10.3390/biomedinformatics6010003 - 9 Jan 2026
Viewed by 4493
Abstract
Chest X-ray radiology report generation is a challenging task that involves techniques from medical natural language processing and computer vision. This paper provides a comprehensive overview of recent progress. The annotation protocols, structure, linguistic characteristics, and size of the main public datasets are presented and compared. Understanding their properties is necessary for benchmarking and generalization. Both clinically oriented and natural language generation metrics are included in the model evaluation strategies to assess their performance. Their respective strengths and limitations are discussed in the context of radiology applications. Recent deep learning approaches for report generation and their different architectures are also reviewed. Common trends such as instruction tuning and the integration of clinical knowledge are also considered. Recent works show that current models still have limited factual accuracy, with a score of 72% reported with expert evaluations, and poor performance on rare pathologies and lateral views. The most important challenges are the limited dataset diversity, weak cross-institution generalization, and the lack of clinically validated benchmarks for evaluating factual reliability. Finally, we discuss open challenges related to data quality, clinical factuality, and interpretability. This review aims to support researchers by synthesizing the current literature and identifying key directions for developing more clinically reliable report generation systems. Full article

20 pages, 7543 KB  
Article
Contrastive Learning with Feature Space Interpolation for Retrieval-Based Chest X-Ray Report Generation
by Zahid Ur Rahman, Gwanghyun Yu, Lee Jin and Jin Young Kim
Appl. Sci. 2026, 16(1), 470; https://doi.org/10.3390/app16010470 - 1 Jan 2026
Viewed by 1175
Abstract
Automated radiology report generation from chest X-rays presents a critical challenge in medical imaging. Traditional image-captioning models struggle with clinical specificity and rare pathologies. Recently, contrastive vision-language learning has emerged as a robust alternative that learns joint visual–textual representations. However, applying contrastive learning (CL) to radiology remains challenging due to severe data scarcity. Prior work has employed input space augmentation, but these approaches incur computational overhead and risk distorting diagnostic features. This work presents CL with feature space interpolation for retrieval (CLFIR), a novel CL framework operating on learned embeddings. The method generates interpolated pairs in the feature embedding space by mixing original and shuffled embeddings in batches using a mixing coefficient λ ∼ U(0.85, 0.99). This approach increases batch diversity via synthetic samples, addressing the limitations of CL on medical data while preserving diagnostic integrity. Extensive experiments demonstrate state-of-the-art performance across critical clinical validation tasks. For report generation, CLFIR achieves BLEU-1/ROUGE/METEOR scores of 0.51/0.40/0.26 (Indiana University [IU] X-ray) and 0.45/0.34/0.22 (MIMIC-CXR). Moreover, CLFIR excels at image-to-text retrieval with R@1 scores of 4.14% (IU X-ray) and 24.3% (MIMIC-CXR) and achieves 0.65 accuracy in zero-shot classification on the CheXpert 5×200 dataset, surpassing established vision-language models.
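
The interpolation step described in the CLFIR abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the convex-combination form λ·z + (1 − λ)·z_shuffled, the per-sample draw of λ, and the name `interpolate_batch` are assumptions made here; only the mixing of original with shuffled batch embeddings and the range U(0.85, 0.99) come from the abstract.

```python
import random


def interpolate_batch(embeddings, low=0.85, high=0.99, rng=None):
    """Mix each embedding with a partner drawn from a shuffled copy of the batch.

    Hypothetical helper: `embeddings` is a list of equal-length float lists.
    """
    rng = rng or random.Random()
    # Shuffle the batch indices so every sample gets a partner from the same batch.
    perm = list(range(len(embeddings)))
    rng.shuffle(perm)
    mixed = []
    for i, j in enumerate(perm):
        # Assumed: one coefficient per sample, drawn from U(low, high).
        lam = rng.uniform(low, high)
        # Assumed convex combination; lam close to 1 keeps the result near the original.
        mixed.append(
            [lam * a + (1.0 - lam) * b for a, b in zip(embeddings[i], embeddings[j])]
        )
    return mixed


# Toy batch of three 2-dimensional embeddings.
batch = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = interpolate_batch(batch, rng=random.Random(0))
```

Because λ stays close to 1, each synthetic embedding remains near its original, which is consistent with the abstract's stated aim of increasing batch diversity while preserving diagnostic features.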

21 pages, 4464 KB  
Article
Chest X-Ray Medical Report Generation Using a CNN—Transformer Model with Maximum Attention
by Mei-Hua Hsih, Shih-Po Lin and Chen-Chiung Hsieh
Electronics 2025, 14(20), 4123; https://doi.org/10.3390/electronics14204123 - 21 Oct 2025
Viewed by 2511
Abstract
Medical imaging, particularly chest X-rays, plays a vital role in radiological diagnosis. However, interpreting these images and generating detailed diagnostic reports is a time-consuming task for clinicians. To address this challenge, this study proposes an automated image captioning framework for chest X-ray images, aiming to reduce clinical workload and enhance diagnostic efficiency. The proposed approach employs convolutional neural networks (CNNs) for visual feature extraction and a modified Transformer architecture, referred to as the Medical Transformer, for structured report generation. Three CNN models, namely InceptionV3, ResNet152V2, and Inception–ResNetV2, were evaluated as feature extractors. Three attention mechanisms (Bahdanau, Luong, and scaled dot product) were activated by ReLU or Tanh functions to identify the optimal configuration, which is referred to as maximum attention. Experiments were conducted using the Indiana University Chest X-ray dataset, which contains 7466 images paired with corresponding diagnostic reports. The proposed approach employs image augmentation to accommodate input variability, utilizes Inception–ResNetV2 for feature extraction, and integrates the Medical Transformer with maximum attention mechanisms to achieve optimal performance in medical report generation. Evaluation metrics include BLEU (BLEU-1 to BLEU-4 scores of 0.720, 0.669, 0.648, and 0.600, respectively), METEOR (0.741), and BERTScore (F_BERT = 0.787), demonstrating superior performance compared to baseline models and the state of the art. These results validate the effectiveness of the proposed Medical Transformer framework in generating accurate and clinically relevant medical image captions.
(This article belongs to the Special Issue Digital Signal and Image Processing for Multimedia Technology)

17 pages, 1310 KB  
Article
IHRAS: Automated Medical Report Generation from Chest X-Rays via Classification, Segmentation, and LLMs
by Gabriel Arquelau Pimenta Rodrigues, André Luiz Marques Serrano, Guilherme Dantas Bispo, Geraldo Pereira Rocha Filho, Vinícius Pereira Gonçalves and Rodolfo Ipolito Meneguette
Bioengineering 2025, 12(8), 795; https://doi.org/10.3390/bioengineering12080795 - 24 Jul 2025
Cited by 3 | Viewed by 4136
Abstract
The growing demand for accurate and efficient Chest X-Ray (CXR) interpretation has prompted the development of AI-driven systems to alleviate radiologist workload and reduce diagnostic variability. This paper introduces the Intelligent Humanized Radiology Analysis System (IHRAS), a modular framework that automates the end-to-end process of CXR analysis and report generation. IHRAS integrates four core components: (i) deep convolutional neural networks for multi-label classification of 14 thoracic conditions; (ii) Grad-CAM for spatial visualization of pathologies; (iii) SAR-Net for anatomical segmentation; and (iv) a large language model (DeepSeek-R1) guided by the CRISPE prompt engineering framework to generate structured diagnostic reports using SNOMED CT terminology. Evaluated on the NIH ChestX-ray dataset, IHRAS demonstrates consistent diagnostic performance across diverse demographic and clinical subgroups, and produces high-fidelity, clinically relevant radiological reports with strong faithfulness, relevancy, and alignment scores. The system offers a transparent and scalable solution to support radiological workflows while highlighting the importance of interpretability and standardization in clinical Artificial Intelligence applications. Full article

13 pages, 1566 KB  
Article
Turkish Chest X-Ray Report Generation Model Using the Swin Enhanced Yield Transformer (Model-SEY) Framework
by Murat Ucan, Buket Kaya and Mehmet Kaya
Diagnostics 2025, 15(14), 1805; https://doi.org/10.3390/diagnostics15141805 - 17 Jul 2025
Cited by 1 | Viewed by 1785
Abstract
Background/Objectives: Extracting meaningful medical information from chest X-ray images and transcribing it into text is a complex task that requires a high level of expertise and directly affects clinical decision-making processes. Automatic reporting systems for this field in Turkish represent an important gap in scientific research, as they have not been sufficiently addressed in the existing literature. Methods: A deep learning-based approach called Model-SEY was developed with the aim of automatically generating Turkish medical reports from chest X-ray images. The Swin Transformer structure was used in the encoder part of the model to extract image features, while the text generation process was carried out using the cosmosGPT architecture, which was adapted specifically for the Turkish language. Results: With the permission of the ethics committee, a new dataset was created using image–report pairs obtained from Elazıg Fethi Sekin City Hospital and Indiana University Chest X-Ray dataset and experiments were conducted on this new dataset. In the tests conducted within the scope of the study, scores of 0.6412, 0.5335, 0.4395, 0.4395, 0.3716, and 0.2240 were obtained in BLEU-1, BLEU-2, BLEU-3, BLEU-4, and ROUGE word overlap evaluation metrics, respectively. Conclusions: Quantitative and qualitative analyses of medical reports autonomously generated by the proposed model have shown that they are meaningful and consistent. The proposed model is one of the first studies in the field of autonomous reporting using deep learning architectures specific to the Turkish language, representing an important step forward in this field. It will also reduce potential human errors during diagnosis by supporting doctors in their decision-making. Full article
(This article belongs to the Special Issue Artificial Intelligence for Health and Medicine)

17 pages, 1532 KB  
Article
RADAI: A Deep Learning-Based Classification of Lung Abnormalities in Chest X-Rays
by Hanan Aljuaid, Hessa Albalahad, Walaa Alshuaibi, Shahad Almutairi, Tahani Hamad Aljohani, Nazar Hussain and Farah Mohammad
Diagnostics 2025, 15(13), 1728; https://doi.org/10.3390/diagnostics15131728 - 7 Jul 2025
Cited by 4 | Viewed by 2902
Abstract
Background: Chest X-rays are rapidly gaining prominence as a prevalent diagnostic tool, as recognized by the World Health Organization (WHO). However, interpreting chest X-rays can be demanding and time-consuming, even for experienced radiologists, leading to potential misinterpretations and delays in treatment. Method: The purpose of this research is the development of a RadAI model. The RadAI model can accurately detect four types of lung abnormalities in chest X-rays and generate a report on each identified abnormality. Moreover, deep learning algorithms, particularly convolutional neural networks (CNNs), have demonstrated remarkable potential in automating medical image analysis, including chest X-rays. This work addresses the challenge of chest X-ray interpretation by fine tuning the following three advanced deep learning models: Feature-selective and Spatial Receptive Fields Network (FSRFNet50), ResNext50, and ResNet50. These models are compared based on accuracy, precision, recall, and F1-score. Results: The outstanding performance of RadAI shows its potential to assist radiologists to interpret the detected chest abnormalities accurately. Conclusions: RadAI is beneficial in enhancing the accuracy and efficiency of chest X-ray interpretation, ultimately supporting the timely and reliable diagnosis of lung abnormalities. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)

28 pages, 4804 KB  
Article
Towards Automatic Detection of Pneumothorax in Emergency Care with Deep Learning Using Multi-Source Chest X-ray Data
by Santiago Ibañez Caturla, Juan de Dios Berná Mestre and Oscar Martinez Mozos
Future Internet 2025, 17(7), 292; https://doi.org/10.3390/fi17070292 - 29 Jun 2025
Viewed by 3246
Abstract
Pneumothorax is a potentially life-threatening condition defined as the collapse of the lung due to air leakage into the chest cavity. Delays in the diagnosis of pneumothorax can lead to severe complications and even mortality. A significant challenge in pneumothorax diagnosis is the shortage of radiologists, resulting in the absence of written reports in plain X-rays and, consequently, impacting patient care. In this paper, we propose an automatic triage system for pneumothorax detection in X-ray images based on deep learning. We address this problem from the perspective of multi-source domain adaptation where different datasets available on the Internet are used for training and testing. In particular, we use datasets which contain chest X-ray images corresponding to different conditions (including pneumothorax). A convolutional neural network (CNN) with an EfficientNet architecture is trained and optimized to identify radiographic signs of pneumothorax using those public datasets. We present the results using cross-dataset validation, demonstrating the robustness and generalization capabilities of our multi-source solution across different datasets. The experimental results demonstrate the model’s potential to assist clinicians in prioritizing and correctly detecting urgent cases of pneumothorax using different integrated deployment strategies. Full article
(This article belongs to the Special Issue Artificial Intelligence-Enabled Smart Healthcare)

22 pages, 1899 KB  
Article
GIT-CXR: End-to-End Transformer for Chest X-Ray Report Generation
by Iustin Sîrbu, Iulia-Renata Sîrbu, Jasmina Bogojeska and Traian Rebedea
Information 2025, 16(7), 524; https://doi.org/10.3390/info16070524 - 23 Jun 2025
Cited by 6 | Viewed by 2862
Abstract
Medical imaging is crucial for diagnosing, monitoring, and treating medical conditions. The medical reports of radiology images are the primary medium through which medical professionals can attest to their findings, but their writing is time-consuming and requires specialized clinical expertise. Therefore, the automated generation of radiography reports has the potential to improve and standardize patient care and significantly reduce the workload of clinicians. Through our work, we have designed and evaluated an end-to-end transformer-based method to generate accurate and factually complete radiology reports for X-ray images. Additionally, we are the first to introduce curriculum learning for end-to-end transformers in medical imaging and demonstrate its impact in obtaining improved performance. The experiments were conducted using the MIMIC-CXR-JPG database, the largest available chest X-ray dataset. The results obtained are comparable with the current state of the art on the natural language generation (NLG) metrics BLEU and ROUGE-L, while setting new state-of-the-art results on the F1 examples-averaged, F1-macro, and F1-micro metrics for clinical accuracy and on the METEOR metric widely used for NLG.
(This article belongs to the Section Information Applications)

14 pages, 723 KB  
Article
RMPT: Reinforced Memory-Driven Pure Transformer for Automatic Chest X-Ray Report Generation
by Caijie Qin, Yize Xiong, Weibin Chen and Yong Li
Mathematics 2025, 13(9), 1492; https://doi.org/10.3390/math13091492 - 30 Apr 2025
Cited by 2 | Viewed by 1271
Abstract
Automatic generation of chest X-ray reports, designed to produce clinically precise descriptions from chest X-ray images, is gaining significant research attention because of its vast potential in clinical applications. Despite considerable recent progress, current models typically adhere to a CNN–Transformer-based framework, which still fails to enlarge the receptive field during image feature extraction. To solve this problem, we propose the Reinforced Memory-driven Pure Transformer (RMPT), a novel Transformer–Transformer-based model. Our RMPT employs the Swin Transformer to extract visual features from given X-ray images; its larger receptive field better models the relationships between different regions. Furthermore, we adopt a memory-driven Transformer (MemTrans) to effectively model similar patterns in different reports, which helps the model generate long reports. Finally, we present an innovative training approach leveraging Reinforcement Learning (RL) that efficiently steers the model to focus on challenging samples, consequently improving its overall performance across both straightforward and complex cases. Experimental results on the IU X-ray dataset show that our proposed RMPT achieves superior performance on various Natural Language Generation (NLG) evaluation metrics. Further ablation study results demonstrate that our RMPT model achieves a 10.5% overall performance improvement compared to the base model.

17 pages, 1944 KB  
Article
Pediatric Pneumonia Recognition Using an Improved DenseNet201 Model with Multi-Scale Convolutions and Mish Activation Function
by Petra Radočaj, Dorijan Radočaj and Goran Martinović
Algorithms 2025, 18(2), 98; https://doi.org/10.3390/a18020098 - 10 Feb 2025
Cited by 7 | Viewed by 2466
Abstract
Pediatric pneumonia remains a significant global health issue, particularly in low- and middle-income countries, where it contributes substantially to mortality in children under five. This study introduces a deep learning model for pediatric pneumonia diagnosis from chest X-rays that surpasses the performance of state-of-the-art methods reported in the recent literature. Using a DenseNet201 architecture with a Mish activation function and multi-scale convolutions, the model was trained on a dataset of 5856 chest X-ray images, achieving high performance: 0.9642 accuracy, 0.9580 precision, 0.9506 sensitivity, 0.9542 F1 score, and 0.9507 specificity. These results demonstrate a significant advancement in diagnostic precision and efficiency within this domain. By achieving the highest accuracy and F1 score compared to other recent work using the same dataset, our approach offers a tangible improvement for resource-constrained environments where access to specialists and sophisticated equipment is limited. While the need for high-quality datasets and adequate computational resources remains a general consideration for deep learning applications, our model’s demonstrably superior performance establishes a new benchmark and offers the delivery of more timely and precise diagnoses, with the potential to significantly enhance patient outcomes. Full article
(This article belongs to the Special Issue Machine Learning in Medical Signal and Image Processing (3rd Edition))
