Face Identification Using Data Augmentation Based on the Combination of DCGANs and Basic Manipulations
Round 1
Reviewer 1 Report
I suggest simplifying it or better explaining it with realistic examples. This paper has the potential to be accepted, but some important points have to be clarified or fixed before we can proceed and a positive decision can be taken.
Comments for author File: Comments.pdf
Author Response
1st comment:
Limiting of the reference and citation:
- We reduced the number of references. We removed some references in the Related Works Section (image data augmentation techniques and face recognition techniques). In particular, we removed references to papers whose authors use the same technique, and we now illustrate each technique with one or two pertinent examples. We hope that we understood this comment correctly, as we hesitated between "references are limited and references should be added" and "please limit the number of references; some of them should be deleted".
2nd comment:
The paper is lacking strong implications.
- Our work explores the performance of different data augmentation techniques for face recognition. The main contribution of our work is to combine generative methods and basic manipulations to improve face recognition performance. The images are first generated by a DCGAN, in which we use the Wasserstein loss in place of the standard DCGAN cross-entropy loss to address the problem of DCGAN training instability. Then, we apply basic manipulations to the images produced by the generative approach. FaceNet and SVM are applied for feature extraction and face recognition, respectively. The experiments show that the combination of generative and basic approaches performs better than the other tested techniques.
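For illustration, the basic-manipulations step applied to DCGAN-generated images could be sketched as follows. This is a minimal NumPy sketch with hypothetical helper names, not our actual implementation, and it covers only a few of the transformations mentioned (translation, rotation, brightness change, a simple kernel filter):

```python
import numpy as np

def translate(img, dx, dy):
    # Shift the image by (dx, dy) pixels, filling vacated pixels with zeros.
    out = np.zeros_like(img)
    h, w = img.shape[:2]
    out[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)] = \
        img[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
    return out

def adjust_brightness(img, delta):
    # Add a constant offset and clip to the valid 8-bit range.
    return np.clip(img.astype(np.int16) + delta, 0, 255).astype(np.uint8)

def mean_filter(img, k=3):
    # Simple k x k box blur as an example of a kernel filter (grayscale input).
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    for i in range(k):
        for j in range(k):
            out += padded[i:i + img.shape[0], j:j + img.shape[1]]
    return (out / (k * k)).astype(img.dtype)

def augment(img):
    # Each DCGAN-generated image yields several basic-manipulation variants.
    return [translate(img, 2, 0),
            np.rot90(img),
            adjust_brightness(img, 30),
            mean_filter(img)]
```

In the pipeline described above, such variants would be produced for every generated face and added to the training set before feature extraction.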
3rd comment:
Brief about the studies
- Image augmentation methods can, in general, be classified as traditional or generative. Traditional data augmentation methods involve geometric transformations, random cropping, kernel filters, color-space augmentations and noise injection. Generative models are able to generate new training data, resulting in more efficient classification models. Generative approaches include methods such as Neural Style Transfer (NST) and Generative Adversarial Networks (GANs); variational auto-encoders are also used for augmentation and can improve the quality of the samples produced by a GAN. In our work, we combine generative methods (DCGANs) and basic manipulations (translation, rotation, brightness changes, filter operations, etc.) for data augmentation. Our results show that the combination of generative and basic approaches performs better than the other tested techniques.
- Traditional face recognition techniques relied on hand-crafted features, such as edges and texture descriptors, combined with machine learning techniques such as principal component analysis, linear discriminant analysis or support vector machines. Recently, these traditional methods have been superseded by deep learning methods based on convolutional neural networks (CNNs). In our work, we use both FaceNet and an SVM for face recognition: we remove the last fully connected layer of the CNN and plug in an SVM, an efficient supervised learning algorithm for classification, to create a combined architecture for the facial recognition task. Experimental results show that FaceNet + SVM performs better than the CNN alone.
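The embedding-plus-SVM step can be sketched as follows. This is an illustrative sketch assuming scikit-learn is available; the embeddings here are synthetic stand-ins (in the real pipeline, FaceNet maps each face image to a 128-dimensional embedding), and the cluster parameters are invented for the example:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in for FaceNet output: one well-separated 128-d cluster per identity.
n_ids, per_id, dim = 5, 20, 128
centers = rng.normal(size=(n_ids, dim)) * 5.0
X = np.vstack([c + rng.normal(scale=0.3, size=(per_id, dim)) for c in centers])
y = np.repeat(np.arange(n_ids), per_id)

# A linear SVM replaces the softmax classification layer: it is trained on the
# fixed embeddings rather than end-to-end with the network.
clf = SVC(kernel="linear").fit(X, y)
print("training accuracy:", clf.score(X, y))
```

The design point is that the network is used only as a feature extractor, so the classifier can be retrained cheaply when new identities are added.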
4th comment:
Add a comparative analysis on exercise accuracies of Table and Experimental
Results on Comparison of Models.
- Experimental results show that our method, based on DCGANs and basic manipulations for data augmentation and FaceNet + SVM for face recognition, has more advantages than the PCA, TRPCA and LBPH methods when using only a small number of samples.
- Moreover, our proposed approach achieves better results than the work of Pei et al., which is based only on basic data augmentation techniques, including geometric transformations, brightness changes and filtering, with a CNN for face recognition.
- The experimental evaluation demonstrates that a significantly higher accuracy can be obtained by combining DCGANs and basic manipulations for data augmentation than by using only basic manipulations.
- Furthermore, the results show that our proposed approach using DCGANs with basic manipulations for data augmentation achieves higher accuracy than our previous work based on DCGANs alone as the data augmentation technique.
5th comment:
Conclusion needs to be rewritten.
- We rewrote the conclusion.
6th comment:
How do identified the Risk with preferences and reduced of categories and more
clear explain at unclear and inherent algorithm.
- Although the combination of generative and basic approaches demonstrates a good increase in accuracy, the basic approach is not far behind in terms of performance and requires less time and fewer hardware resources.
- Improving the quality of DCGAN-generated samples and evaluating their effectiveness on a broad range of datasets is an important area for future work.
- The DCGAN can be optimized by adjusting parameters such as the batch size, learning rate and momentum to generate more realistic and diverse face samples and thereby improve the accuracy of the results.
- The future work can include the use of the Wasserstein loss with gradient penalty to improve the quality of the generated images.
Author Response File: Author Response.pdf
Reviewer 2 Report
In this paper, the authors propose an approach that combines generative methods and basic manipulations for image data augmentations and the FaceNet model with Support Vector Machine (SVM) for face recognition. Overall, the topic of this paper is convincing and the problem is hot.
Deep Convolutional Generative Adversarial Net has been used for many years, please give us more explanations about why the model is novel by consulting with the book https://www.deeplearningbook.org/ or “Neural Networks and Learning Machines” by Haykin, S.O..
Face Identification based on tensor methods has been very popular recently, could the authors make some comparisons with Cai, Shuting, et al. "Tensor robust principal component analysis via non-convex low rank approximation." Applied Sciences 9.7 (2019)?
Author Response
Covering Letter
1st comment:
Deep Convolutional Generative Adversarial Net has been used for many years,
please give us more explanations about why the model is novel by consulting with
the book https://www.deeplearningbook.org/ or “Neural Networks and Learning
Machines” by Haykin, S.O..
- Compared to the GAN models described in "https://www.deeplearningbook.org/" and "Neural Networks and Learning Machines", our model uses the Wasserstein loss, which computes the average critic score for real and fake images, instead of the standard DCGAN cross-entropy loss, in order to address DCGAN training instability, mode collapse and vanishing gradients.
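To make the difference concrete, the two discriminator objectives can be sketched as follows. This is a simplified NumPy sketch, not our training code; in practice these losses are computed inside the training framework, and the WGAN critic additionally requires a Lipschitz constraint (e.g. weight clipping or a gradient penalty):

```python
import numpy as np

def bce_discriminator_loss(real_logits, fake_logits):
    # Standard DCGAN loss: binary cross-entropy on sigmoid outputs,
    # with real images labelled 1 and generated images labelled 0.
    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))
    return -(np.mean(np.log(sigmoid(real_logits))) +
             np.mean(np.log(1.0 - sigmoid(fake_logits))))

def wasserstein_critic_loss(real_scores, fake_scores):
    # WGAN critic loss: the critic outputs an unbounded score and is trained
    # to widen the mean score gap between real and generated images
    # (here written as the quantity to minimise). No sigmoid is applied,
    # so the gradient does not saturate when the two distributions are far apart.
    return np.mean(fake_scores) - np.mean(real_scores)
```

The absence of the saturating sigmoid/log terms in the critic loss is what mitigates the vanishing-gradient behaviour mentioned above.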
2nd comment:
Face Identification based on tensor methods has been very popular recently,
could the authors make some comparisons with Cai, Shuting, et al. "Tensor
robust principal component analysis via non-convex low rank approximation."
Applied Sciences 9.7 (2019)?
- We added a brief review of the different tensor methods used for face recognition in Section 2.2 (Face Recognition Techniques).
- In Tables 2, 3 and 5, we compare our model, based on the combination of FaceNet and SVM for face recognition, with the TRPCA algorithm proposed by Cai et al., using the LFW, VGGFace2 and ChokePoint datasets. The results show that our model outperforms the TRPCA algorithm.
Author Response File: Author Response.pdf
Reviewer 3 Report
The paper proposes an overview of face identification methods. The authors' own data augmentation method for small datasets is proposed. A very good literature review is given. The results are illustrated well.
The following suggestions should improve the paper:
1) Introduction - please add the main contribution
2) Conclusions - please add the limitation of the proposed data augmentation approach
3) Tables 1-5 - why only accuracy is used? What about unbalanced data?
Author Response
1st comment:
Introduction - please add the main contribution
- We add the main contributions of the paper in the Introduction Section.
The main contributions of this paper can be summarized as follows:
• We propose a novel data augmentation technique in which DCGAN and basic manipulations are combined. We use the Wasserstein loss to replace the minimax loss in DCGAN to address the problem of DCGAN training instability. We show that our model improves face recognition performance on the LFW dataset, the VGG dataset and the ChokePoint video database.
• We demonstrate the benefits of the proposed augmentation strategy for face recognition by comparing our approach with only basic manipulations and with only the generative approach.
• We show that using an SVM instead of a softmax function in the FaceNet model may improve face recognition accuracy compared to the other tested techniques.
2nd comment:
Conclusions - please add the limitation of the proposed data augmentation approach
- Although the combination of generative and basic approaches demonstrates a good increase in accuracy, the basic approach is not far behind in terms of performance and requires less time and fewer hardware resources.
- Improving the quality of DCGAN-generated samples and evaluating their effectiveness on a broad range of datasets is an important area for future work.
- The DCGAN can be optimized by adjusting parameters such as the batch size, learning rate and momentum to generate more realistic and diverse face samples and thereby improve the accuracy of the results.
- The future work can include the use of the Wasserstein loss with gradient penalty to improve the quality of the generated images.
3rd comment:
Tables 1-5 - why only accuracy is used? What about unbalanced data?
- In our case, accuracy is a useful measure because we have used balanced datasets.
As future work, we will test our model with unbalanced data.
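To illustrate why accuracy alone is only reliable on balanced data, the following sketch (illustrative only, with synthetic labels) contrasts plain accuracy with balanced accuracy, the mean of per-class recalls:

```python
import numpy as np

def accuracy(y_true, y_pred):
    # Fraction of samples predicted correctly, regardless of class.
    return np.mean(y_true == y_pred)

def balanced_accuracy(y_true, y_pred):
    # Mean of per-class recalls: each class contributes equally,
    # no matter how many samples it has.
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))

# 90 samples of class 0 and 10 of class 1; a classifier that always
# predicts class 0 looks strong on accuracy but not on balanced accuracy.
y_true = np.array([0] * 90 + [1] * 10)
y_pred = np.zeros(100, dtype=int)
print(accuracy(y_true, y_pred))           # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

On the balanced datasets used in the paper the two measures coincide, which is why reporting accuracy is sufficient there.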
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
In several instances I also suggested to cite more relevant and recent literature. Furthermore I made additional suggestions for more in-depth analyses of the data.
Comments for author File: Comments.doc
Author Response
1st comment:
In several instances I also suggested to cite more relevant and recent literature.
- We added some recently published articles:
- Duan, Q.; Zhang, L. Look more into occlusion: Realistic face frontalization and recognition with BoostGAN. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 214-228.
- Anzar, S. M.; Amrutha, T. Efficient wavelet-based scale invariant feature transform for partial face recognition. AIP Conference Proceedings, AIP Publishing LLC, 2020, 2222, p. 030017.
- Torfi, A.; Shirvani, R.; Keneshloo, Y.; Fox, E. Natural language processing advancements by deep learning: A survey. arXiv preprint arXiv:2003.01200, 2020.
- Li, B.; Wu, F.; Lim, S.; Weinberger, K. On feature normalization and data augmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 12383-12392.
2nd comment:
Furthermore I made additional suggestions for more in-depth analyses of the data.
- Done.
Author Response File: Author Response.pdf
Reviewer 3 Report
Thank you and good luck
Author Response
We would like to thank the reviewer for his thoughtful comments and effort towards improving
our work.
1st comment:
In several instances I also suggested to cite more relevant and recent literature.
Furthermore I made additional suggestions for more in-depth analyses of the data.
- We added some recently published articles:
- Duan, Q.; Zhang, L. Look more into occlusion: Realistic face frontalization and recognition with BoostGAN. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 214-228.
- Anzar, S. M.; Amrutha, T. Efficient wavelet-based scale invariant feature transform for partial face recognition. AIP Conference Proceedings, AIP Publishing LLC, 2020, 2222, p. 030017.
- Torfi, A.; Shirvani, R.; Keneshloo, Y.; Fox, E. Natural language processing advancements by deep learning: A survey. arXiv preprint arXiv:2003.01200, 2020.
- Li, B.; Wu, F.; Lim, S.; Weinberger, K. On feature normalization and data augmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 12383-12392.
2nd comment:
Once again, check all the equations, theorems, lemmas and remarks.
- We checked all the equations, theorems and remarks.
Author Response File: Author Response.pdf