Article
Peer-Review Record

Semi-Supervised FaceGAN for Face-Age Progression and Regression with Synthesized Paired Images

Electronics 2020, 9(4), 603; https://doi.org/10.3390/electronics9040603
by Quang T. M. Pham 1, Janghoon Yang 2 and Jitae Shin 1,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 16 March 2020 / Revised: 29 March 2020 / Accepted: 30 March 2020 / Published: 1 April 2020
(This article belongs to the Section Artificial Intelligence)

Round 1

Reviewer 1 Report

The manuscript:

“Semi-supervised FaceGAN for face-age progression and regression with synthesized paired images”, by Q. T. M. Pham, J. Yang and J. Shin (Ref. No.: electronics-760728),

has been improved and may be recommended for publication. In particular, the authors answered all questions, added the required material, included some numerical data, and provided missing citations. However, a few minor corrections are desirable before publication:

1) Key numerical results obtained in this study should be shown in the Abstract

2) The sentence: “The proposed model is distinct from the existing works in the following way”, should be cited.

3) The sentence: “Overall, the contributions of this paper are: …”, should start a new paragraph on the next line with a left indent.

4) The sentence: “However, z cannot capture all the information from the input images because of dimensionality reduction [35]”. What is dimensionality reduction in this context? This sentence should be briefly clarified.

5) The sentence: “The FG-NET includes 1002 images from 82 subjects (from 0 to 69 years old)”. An age of 69 is below the average life expectancy. Why does the dataset only go up to 69 years old?

6) The manuscript shows only advantages of the proposed method. Disadvantages of the proposed method for face-age progression/regression and the possible ways to resolve them should also be briefly discussed.

The manuscript may be recommended for publication after minor revision.

Author Response

Please refer to the attached file for our response.

Author Response File: Author Response.pdf

Reviewer 2 Report

The authors have addressed all previous reviewer comments made in the manuscript electronics-695863. The reviewer considers that the work has been significantly improved and it is suitable for publication in its current state.

Previous comments made on the manuscript “electronics-695863”:

According to the authors, the contributions of the work are:

"1. We proposed a novel framework for age progression and regression including two GAN models. By using additional GAN, we can train the model with semi-supervised approach by synthesized paired images, which avoids the limitation of the dataset.

2. We introduced a new way of training that separate the aging feature and the identity features so that we can guide our model to learn them better and provide more realistic images. With our proposed method, we can use Unet-based model as Generator, which can overcome the bottleneck limitation of auto-encoder. It helps our model to produce more detailed image.”

Unfortunately, the reviewer considers that the scope and content of the work are insufficient for publication in the Electronics journal, since they are similar to those of the two publications used in the validation, [3] and [9], both of which are conference publications.

[3] Zhang, Z.; Song, Y.; Qi, H. Age Progression/Regression by Conditional Adversarial Autoencoder. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

[9] Wang, Z.; Tang, X.; Luo, W.; Gao, S. Face Aging with Identity-Preserved Conditional Generative Adversarial Networks. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

Work improvement points:

The authors justify the use of computer-synthesized images in validation by the lack of databases with images that track how people's faces change with age. However, the reviewer considers that a scientifically adequate validation must be supported by a comparison with real images at different ages, not images synthesized digitally according to a model. Therefore, the model must be validated with a set of real images showing the same people at different ages, which should not have been used in model training. In the reviewer's opinion, this is the main deficiency of the work.

Section “3. Method” must be described in more detail, with more technical implementation details. The description of the models should allow their exact reproduction in other studies, to enable comparisons. In the reviewer's opinion, this is another major deficiency of the work.

In Figure 5, it is not indicated which image is the original.

The importance of Figure 6 is not clear. Its meaning and justification should be described more fully in the text. Currently, the text states the following: “To verify the domination problem of reconstruction, we conduct an experiment by using the baseline CAAE model with Unet as the generator. As we expected, the reconstruction loss decreases to 0 quickly and the model could not learn anything. It can be seen from the Fig. 6, if we replace the auto-encoder generator with Unet, all the output images look the same as the input ones.”

Table 1 must be explained and analyzed in more detail.

The writing and grammar of the work should be reviewed in its entirety to correct mistakes.

A more up-to-date literature review should be carried out to highlight the importance of the work with respect to current publications.

Author Response

All authors truly appreciate the reviewer's constructive comments and suggestions in the first-round review.

 

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

According to the authors, the contributions of the work are:

"1. We proposed a novel framework for age progression and regression including two GAN models. By using additional GAN, we can train the model with semi-supervised approach by synthesized paired images, which avoids the limitation of the dataset.

2. We introduced a new way of training that separate the aging feature and the identity features so that we can guide our model to learn them better and provide more realistic images. With our proposed method, we can use Unet-based model as Generator, which can overcome the bottleneck limitation of auto-encoder. It helps our model to produce more detailed image.”

Unfortunately, the reviewer considers that the scope and content of the work are insufficient for publication in the Electronics journal, since they are similar to those of the two publications used in the validation, [3] and [9], both of which are conference publications.

[3] Zhang, Z.; Song, Y.; Qi, H. Age Progression/Regression by Conditional Adversarial Autoencoder. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

[9] Wang, Z.; Tang, X.; Luo, W.; Gao, S. Face Aging with Identity-Preserved Conditional Generative Adversarial Networks. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

Work improvement points:

The authors justify the use of computer-synthesized images in validation by the lack of databases with images that track how people's faces change with age. However, the reviewer considers that a scientifically adequate validation must be supported by a comparison with real images at different ages, not images synthesized digitally according to a model. Therefore, the model must be validated with a set of real images showing the same people at different ages, which should not have been used in model training. In the reviewer's opinion, this is the main deficiency of the work.

Section “3. Method” must be described in more detail, with more technical implementation details. The description of the models should allow their exact reproduction in other studies, to enable comparisons. In the reviewer's opinion, this is another major deficiency of the work.

In Figure 5, it is not indicated which image is the original.

The importance of Figure 6 is not clear. Its meaning and justification should be described more fully in the text. Currently, the text states the following: “To verify the domination problem of reconstruction, we conduct an experiment by using the baseline CAAE model with Unet as the generator. As we expected, the reconstruction loss decreases to 0 quickly and the model could not learn anything. It can be seen from the Fig. 6, if we replace the auto-encoder generator with Unet, all the output images look the same as the input ones.”

Table 1 must be explained and analyzed in more detail.

The writing and grammar of the work should be reviewed in its entirety to correct mistakes.

A more up-to-date literature review should be carried out to highlight the importance of the work with respect to current publications.

Reviewer 2 Report

The paper introduces a new network called semi-supervised FaceGAN (ss-FaceGAN). This network addresses the face-age progression and regression problem using pairs of synthesized images at the target age and at the age of the face in the real data. Compared with the other two networks, the new network improves the output quality.

In Section 3.1, mark the input image I1, the target information t2, and the output image I2 in Figure 2. In Figure 2, indicate which is the main discriminator and which is the additional discriminator. Explain some parameters in Formulas (1)-(9). Explain some parameters in Figure 4. In Table 1, explain what the score is.

Reviewer 3 Report

The manuscript:

“Semi-supervised FaceGAN for face-age progression and regression with synthesized paired images”, by Q. T. M. Pham, J. Yang and J. Shin (Ref. No.: electronics-695863-peer-review-v1),

contains interesting material. However, it is not well organized and should be considerably elaborated. In particular, the aging process is not simple and many factors should be taken into account. For example, the aging process depends on the social status, ethnicity, environmental conditions, gender, and health conditions of a person. It is not clear how the proposed semi-supervised FaceGAN model accounts for all these conditions. Furthermore, the limitations of the FaceGAN model are not clear. The authors should explicitly state the drawbacks of their model and indicate the methods and algorithms to resolve these drawbacks.

Overall, the English is acceptable. However, it needs some minor amendments. The manuscript needs some citations, as mentioned below.

Apart from this, the following should be taken into consideration:

ABSTRACT

The Abstract should reflect the key results obtained in this study. Therefore, the key quantitative results showing the improvement in image processing reliability should be shown.

INTRODUCTION

1) The sentence: “Both tasks are important tasks because they can be applied in many domains”, should be cited.

2) The sentence: “By this setting, our model can overcome the limitation of dataset and provide the high quality results”. Is this model based on extrapolation that enables one to predict how facial features change in the future and/or the past? If yes, this should be briefly discussed at the end of the Introduction.

RELATED WORKS

1) The sentence should be corrected as: “…of the generator while the generator tries to confuse the discriminator by generating images that make …”.

2) The sentence: “Although previous methods introduced several techniques to overcome the limitation of dataset …”. A brief description of how to overcome this limitation of datasets should be provided.

METHOD

1) The sentence: “However, z cannot capture all the information from the input images”. Why can the encoded vector z not capture all the information from the input images? This should be clarified and cited.

2) The sentence: “In addition, the rescontruction loss they used also brings several issues”. This sentence is not clear and grammatically incorrect. Is it the reconstruction loss?

3) Equation (1) should be cited.

4) The sentence: “This conflict also results in blurry output images”. What is a blurry image? Wouldn’t it be a distorted (or non-realistic) image? Do the color schemes of the faces count as a factor that plays a role in the current results obtained by applying image processing techniques?

5) The sentence: “To solve the problem of the image quality degradation in output, we apply Unet architecture to replace the auto-encoder”, should be cited.

6) The sentence: “This semi-supervised learning method helps to overcome not only the problem of original reconstruction loss but also the trade-off for identity preserving and aging translation learning”. What is the reconstruction loss in this learning method? How does the reconstruction loss vary with aging progression or regression?

7) The sentence: “… face image of the same person in accordance with a target age”. As mentioned above, the target age may be vague in some sense, as the aging process differs between people. It depends on the social status, ethnicity, gender, and health condition of a person. This subject matter should also be discussed. It is not clear how the proposed model can overcome this issue.

8) Equations (2)-(10) should be cited.

9) The sentence: “We use Adam optimizer with …”. What is “Adam optimizer”? This sentence should be cited.

10) The sentence: “Even though CAAE provides larger confidence score for the group of 0-10 and over 50 years old quantatively …”. Why is CAAE better in these two age groups? The spelling of “quantitatively” should be fixed.

CONCLUSION

The key quantitative results should be shown in the Conclusion.

The manuscript requires a major mandatory revision.

 
