Peer-Review Record

Diversity Learning Based on Multi-Latent Space for Medical Image Visual Question Generation

Sensors 2023, 23(3), 1057; https://doi.org/10.3390/s23031057

by He Zhu ¹, Ren Togo ², Takahiro Ogawa ² and Miki Haseyama ²,*

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Submission received: 14 November 2022 / Revised: 7 January 2023 / Accepted: 14 January 2023 / Published: 17 January 2023

(This article belongs to the Section Biomedical Sensors)

Round 1

Reviewer 1 Report

The authors present a system that generates questions from medical images. The system is composed of an image feature extraction network and a multi-layer perceptron that encodes the category of the question. The features are used to discover a plurality of latent spaces, from which an LSTM model can generate the question associated with the image and category. The key contribution of the paper is the design of a multi-term weighted cost function that takes into account the capacity of the latent space to represent the data, a regularization term to prevent over-fitting, a term to encourage diversity between latent spaces, and a term measuring the similarity of the questions generated from the latent space to the reference questions.
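As a rough illustration of the kind of multi-term weighted cost function summarized above, the four terms could be combined as a weighted sum. This is only a sketch under stated assumptions, not the authors' implementation: the function name, the argument names, the default weights, and the use of plain scalar terms are all hypothetical.

```python
def combined_loss(reconstruction, regularization, diversity, question_similarity,
                  weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of four scalar loss terms.

    All names and the default weights are hypothetical placeholders;
    the paper's actual terms and weighting are not reproduced here.
    """
    terms = (reconstruction, regularization, diversity, question_similarity)
    if len(weights) != len(terms):
        raise ValueError("expected one weight per loss term")
    # Multiply each term by its weight and add the results.
    return sum(w * t for w, t in zip(weights, terms))
```

In a sketch like this, setting a weight to zero switches the corresponding term off, which is one simple way to run an ablation over the individual terms.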

The paper is well written, and the experimental section is convincing.

The main criticism of the work concerns its conception. It is hard for this reviewer to understand the potential utility or applicability of this work in any scenario. What is the usefulness of generating the questions shown in Figure 4?

The authors claim in lines 76-78 that VQA and VQG systems “provide systems with a clearer understanding of their illness to ease the tension among patients and possible misunderstanding about physicians”. Such a claim may be true for VQA; however, it is very unclear how a VQG system such as the one proposed would serve that purpose.

Some minor typos in the paper:

Line 42. The acronym VQG is first used without being defined.

Line 82. The acronyms VQA and VQG are used without being defined.

Line 83. It is unclear what the authors mean by “microscopic disease”.

Section 3.3. “question” should be capitalized.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

This study proposes a novel visual question generation model based on diversity learning with a multi-latent space for generating informative question sets from medical images. The research is significant and valuable. However, the content and organization of the paper can be improved.

1. The abstract should generally cover the research background and purpose, the methods, the results, and the importance and potential impact of the research. The word count should be kept to about 250.

2. Keywords should be separated by commas; no "and" is needed to connect them.

3. How is the paper structured? Please add a paragraph to introduce it at the end of the introduction.

4. Most of the cited references are simply listed in the paper. I suggest that the authors relate these references to their own work, for example by explaining how particular studies influenced it.

The following references are suggested:

https://doi.org/10.7717/peerj-cs.353

https://doi.org/10.1016/j.patcog.2021.108153

https://doi.org/10.3390/machines10010042

https://doi.org/10.1109/ACCESS.2021.3074937

https://doi.org/10.3390/s22062387

https://doi.org/10.3390/s22249733

5. How do the authors ensure the fairness and effectiveness of the quantitative evaluation?

6. How do the authors ensure the richness of the final question set?

7. I suggest moving Figure 6 to the Appendix.

8. What are the limitations of this method?

9. What are the further research topics and directions? What else can be improved?

10. The conclusion of the paper needs to be improved. I suggest that the authors add comparisons with previous work and discuss the advantages and disadvantages of their method.

11. Please typeset the paper carefully, especially the figures and tables.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

The authors' manuscript has undergone substantial revision, its quality has improved considerably, and I recommend that it be considered for publication.
