Peer-Review Record

Realistic Image Generation from Text by Using BERT-Based Embedding

Electronics 2022, 11(5), 764; https://doi.org/10.3390/electronics11050764

by Sanghyuck Na¹, Mirae Do², Kyeonah Yu² and Juntae Kim¹,*

Reviewer 1: Jiawei Chen

Reviewer 2: Anonymous

Reviewer 3: Anonymous


Submission received: 23 November 2021 / Revised: 7 February 2022 / Accepted: 11 February 2022 / Published: 2 March 2022

(This article belongs to the Special Issue Advanced Application of Machine Learning and Meta-Learning in Image and Text Analysis)

Round 1

Reviewer 1 Report

In this paper, the authors propose a new text-to-image generation model using pre-trained BERT. The topic is quite interesting. I have the following comments:

Do the authors plan to release the code and the datasets if the paper is published?
Is it possible to deploy the proposed model on an edge device? If so, how would acceleration be implemented?

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

1. Section 2 could be improved by citing more recent papers:
- RpBERT: A Text-image Relation Propagation-based BERT Model for Multimodal NER, 2021, The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)
- Correlation-Guided Representation for Multi-Label Text Classification, Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21)
- arXiv:2001.07966v2 [cs.CV] 23 Jan 2020
- arXiv:2003.12137v1 [cs.CV] 26 Mar 2020
- Review: https://doi.org/10.1016/j.inffus.2021.07.009
- Other applications: https://doi.org/10.3390/s21010133 and https://doi.org/10.1038/s43856-021-00008-0
2. Algorithms 1-3 look like low-resolution screenshots. The rendering quality of the text should be improved.
3. Rearrange Equations (3) and (4). The equation number should be placed at the end of the row.
4. Check for spelling and grammar errors, e.g., "set. . In " (line 189).

5. You should explain your methods and their execution in more detail: for example, which software was used for what purpose and why, and how the technical details were implemented. However, this could be addressed by citing suitable references. It is not necessary to rewrite the entire section.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 3 Report

I have a number of comments.

1. It is necessary to supplement the references with recent work on realistic image generation from text.

2. It is known that there are 3 types of contextualized word embeddings from BERT using transfer learning. Show how your approach differs.

3. In this work you need context-averaged pre-trained embeddings. This embedding approach gives the average value for each word.

4. What is the novelty in using StackGAN, given that this model has long been used for text-to-photorealistic-image synthesis? In addition, it would be better to use StackGAN v2 for such tasks to obtain high-resolution photorealistic images.

5. In addition to comparing the different GAN models listed in Table 2 by IS score and FID score, you need to use the MSE and SSIM metrics.

6. It is necessary to compare your model with models such as AttnGAN, BigGAN, and InfoGAN.

7. The article should show that algorithms such as RAKE, NPL, LexRank, and others are used for effective analysis and compression of the input text.
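
For reference, the two pixel-level metrics requested in comment 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the function names `mse` and `global_ssim` are hypothetical, images are represented as flat lists of grayscale values, and SSIM is computed over the whole image as a single window instead of the sliding local windows used in standard implementations.

```python
def mse(x, y):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def global_ssim(x, y, data_range=255.0):
    """SSIM over the whole image treated as one window (a simplification).

    data_range is the assumed dynamic range of the pixel values.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    # Stabilising constants from the usual SSIM definition (K1=0.01, K2=0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Illustrative values only: a tiny "reference" and "generated" image.
reference = [52, 55, 61, 59]
generated = [54, 50, 65, 60]
print(mse(reference, generated), global_ssim(reference, generated))
```

Both metrics compare a generated image against a specific reference image pixel by pixel, whereas IS and FID compare distributions of images.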

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 3 Report

Overall I'm happy with the responses to my comments other than Q5. The changes and additions have significantly improved the scientific and practical weight of the article.
