MDPI - Publisher of Open Access Journals

18 pages, 5695 KB

Open Access Article

Machine-to-Machine Visual Dialoguing with ChatGPT for Enriched Textual Image Description

by Riccardo Ricci, Yakoub Bazi and Farid Melgani

Remote Sens. 2024, 16(3), 441; https://doi.org/10.3390/rs16030441 - 23 Jan 2024

Cited by 13 | Viewed by 4034

Image captioning is a technique that enables the automatic extraction of natural language descriptions about the contents of an image. On the one hand, information in the form of natural language can enhance accessibility by reducing the expertise required to process, analyze, and exploit remote sensing images, while on the other, it provides a direct and general form of communication. However, image captioning is usually restricted to a single sentence, which barely describes the rich semantic information that typically characterizes remote sensing (RS) images. In this paper, we aim to move one step forward by proposing a captioning system that, mimicking human behavior, adopts dialogue as a tool to explore and dig for information, leading to more detailed and comprehensive descriptions of RS scenes. The system relies on a questions–answers scheme fed by a query image and summarizes the dialogue content with ChatGPT. Experiments carried out on two benchmark remote sensing datasets confirm the potential of such an approach in the context of semantic information mining. Strengths and weaknesses are highlighted and discussed, as well as some possible future developments.

(This article belongs to the Topic Deep Learning and Transformers’ Methods Applied to Remotely Captured Data)
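The question-and-answer scheme described in the abstract above can be pictured as a simple loop: one component asks a question, another answers it from the image, and a final step condenses the exchange into a description. The Python sketch below is only an illustration of that control flow. The function `describe_by_dialogue` and the `ask`, `answer`, and `summarize` callables are hypothetical names introduced here; the abstract does not say which models play these roles beyond the use of ChatGPT for summarization, and the toy stand-ins replace any real model calls.

```python
from typing import Callable, Dict, List, Tuple

Turn = Tuple[str, str]  # (question, answer)


def describe_by_dialogue(
    image: object,
    ask: Callable[[List[Turn]], str],
    answer: Callable[[object, str], str],
    summarize: Callable[[List[Turn]], str],
    max_turns: int = 5,
) -> str:
    """Run a fixed number of question/answer turns, then summarize them.

    All three callables are hypothetical placeholders: in a real system
    `ask` and `summarize` might wrap a language model and `answer` a
    visual question answering model.
    """
    history: List[Turn] = []
    for _ in range(max_turns):
        question = ask(history)            # next question given the dialogue so far
        reply = answer(image, question)    # answer grounded in the query image
        history.append((question, reply))
    return summarize(history)


# Toy stand-ins so the sketch runs without any model or network access.
TOY_QUESTIONS = [
    "What is in the scene?",
    "How many buildings are there?",
    "Is there water?",
]


def toy_ask(history: List[Turn]) -> str:
    # Cycle through a fixed question list based on how many turns have passed.
    return TOY_QUESTIONS[len(history) % len(TOY_QUESTIONS)]


def toy_answer(image: Dict[str, str], question: str) -> str:
    # The "image" here is just a lookup table of known answers.
    return image.get(question, "unknown")


def toy_summarize(history: List[Turn]) -> str:
    # Concatenate the turns; a real summarizer would rewrite them as prose.
    return " ".join(f"{q} {a}." for q, a in history)


toy_image = {
    "What is in the scene?": "an airport",
    "How many buildings are there?": "three",
}
summary = describe_by_dialogue(toy_image, toy_ask, toy_answer, toy_summarize, max_turns=3)
print(summary)
```

Passing the components in as callables keeps the loop independent of any particular model, so a different questioner, answerer, or summarizer can be swapped in without changing the control flow.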


16 pages, 2036 KB

Open Access Article

Goal-Driven Visual Question Generation from Radiology Images

by Mourad Sarrouti, Asma Ben Abacha and Dina Demner-Fushman

Information 2021, 12(8), 334; https://doi.org/10.3390/info12080334 - 20 Aug 2021

Cited by 10 | Viewed by 4585

Abstract

Visual Question Generation (VQG) from images is a rising research topic in both natural language processing and computer vision. Although there have been some recent efforts towards generating questions from images in the open domain, the VQG task in the medical domain has not been well studied so far due to the lack of labeled data. In this paper, we introduce a goal-driven VQG approach for radiology images called VQGRaD that generates questions targeting specific image aspects such as modality and abnormality. In particular, we study generating natural language questions based on the visual content of the image and on additional information such as the image caption and the question category. VQGRaD encodes the dense vectors of different inputs into two latent spaces, which allows generating, for a specific question category, relevant questions about the images, with or without their captions. We also explore the impact of domain knowledge incorporation (e.g., medical entities and semantic types) and data augmentation techniques on visual question generation in the medical domain. Experiments performed on the VQA-RAD dataset of clinical visual questions showed that VQGRaD achieves a BLEU score of 61.86% and outperforms strong baselines. We also performed a blinded human evaluation of the grammaticality, fluency, and relevance of the generated questions. The human evaluation demonstrated the better quality of VQGRaD outputs and showed that incorporating medical entities improves the quality of the generated questions. Using the test data and evaluation process of the ImageCLEF 2020 VQA-Med challenge, we found that using the proposed data augmentation technique to generate new training samples through different kinds of transformations can mitigate the lack of data, avoid overfitting, and bring a substantial improvement in medical VQG.

(This article belongs to the Special Issue Neural Natural Language Generation)
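The abstract above reports its main result as a BLEU score. As a rough illustration of what that metric measures, the Python sketch below computes a simplified sentence-level BLEU: the geometric mean of clipped n-gram precisions for n = 1 to 4, multiplied by a brevity penalty. It is not the evaluation script used in the paper; the function names `ngrams` and `bleu`, the whitespace tokenization, the lack of smoothing, and the example questions are all assumptions made for this sketch.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate: str, references: List[str], max_n: int = 4) -> float:
    """Simplified sentence-level BLEU in [0, 1], without smoothing."""
    cand = candidate.split()
    refs = [r.split() for r in references]
    if not cand or not refs:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        total = sum(cand_counts.values())
        # For each n-gram, the highest count seen in any single reference.
        max_ref: Counter = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        # Clip candidate counts so repeated n-grams are not over-rewarded.
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing: any empty precision gives a zero score
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty against the reference closest in length (shorter on ties).
    ref_len = min((abs(len(r) - len(cand)), len(r)) for r in refs)[1]
    bp = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


reference = "what modality is shown in this image"
exact = bleu(reference, [reference])
partial = bleu("what modality is shown in the image", [reference])
unrelated = bleu("is there a fracture", [reference])
print(exact, partial, unrelated)
```

Scores from this function lie between 0 and 1; a figure quoted as a percentage corresponds to the same kind of score multiplied by 100, usually aggregated over a whole test set rather than computed per sentence.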


Search Results (2)
