Article
Peer-Review Record

Analysis of Score-Level Fusion Rules for Deepfake Detection

Appl. Sci. 2022, 12(15), 7365; https://doi.org/10.3390/app12157365
by Sara Concas 1,†, Simone Maurizio La Cava 1,†, Giulia Orrù 1, Carlo Cuccu 1, Jie Gao 2, Xiaoyi Feng 2, Gian Luca Marcialis 1,*,† and Fabio Roli 3
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 24 June 2022 / Revised: 11 July 2022 / Accepted: 12 July 2022 / Published: 22 July 2022
(This article belongs to the Special Issue Application of Biometrics Technology in Security)

Round 1

Reviewer 1 Report

This is a well-written article that documents a comprehensive study of fusion methods for deepfake detection. Although the paper does not propose any new techniques, models, or algorithms, the results of the designed experimental studies are useful for researchers in related areas.

It is interesting to observe that most ensemble models significantly outperform most single models in intra-dataset testing, but not in cross-dataset testing. I wonder whether the authors have any insight into this.

Author Response

We thank the referee for the review. Concerning the different gap between the single models and the fusion methods in intra-dataset and cross-dataset testing, we believe that the gain from fusion is simply commensurate with the performance of the individual classifiers, since fusion improves on them in both test scenarios.

In fact, comparing the best single model with the best fusion method in each scenario, for example in terms of AUC, we note that:

  • In the intra-dataset scenario, the best single model reached an AUC of 0.959, while fusion reached 0.984.
  • In the cross-dataset scenario, fusion reached an AUC of 0.711 starting from a maximum single-model value of 0.697. The increment is smaller than in the intra-dataset scenario, but it starts from a lower performance baseline for all the underlying single detectors whose complementarity we are trying to exploit.
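
The comparison in the two points above can be reproduced with a few lines of arithmetic. The sketch below uses only the four AUC values quoted in this response; the helper name `auc_gain` and the choice to report a relative gain alongside the absolute one are illustrative assumptions, not part of the manuscript.

```python
# AUC values quoted in the author response (best single model vs. best fusion).
intra_best_single, intra_best_fusion = 0.959, 0.984
cross_best_single, cross_best_fusion = 0.697, 0.711


def auc_gain(single, fused):
    """Return (absolute gain, gain relative to the single-model AUC).

    Hypothetical helper, introduced here only for illustration.
    """
    absolute = fused - single
    relative = absolute / single
    return absolute, relative


intra_abs, intra_rel = auc_gain(intra_best_single, intra_best_fusion)
cross_abs, cross_rel = auc_gain(cross_best_single, cross_best_fusion)

print(f"intra-dataset: +{intra_abs:.3f} AUC ({intra_rel:.1%} relative)")
print(f"cross-dataset: +{cross_abs:.3f} AUC ({cross_rel:.1%} relative)")
```

Reporting the gain relative to the single-model baseline is one simple way to compare the two scenarios on a similar footing; other normalizations are possible and may lead to a different reading.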

This is reinforced by the fact that even the methods based on the weighted average and on machine learning models, which provide the best results in both test scenarios, are trained on scores obtained from the same data on which the individual models were trained (except the ResNet). Therefore, the weights and parameters estimated on the training set may not be optimal for never-seen-before data such as those analyzed in the cross-dataset scenario. This is quite surprising, because the usual approach is to use an independent data set for this purpose. In any case, the reviewer's comment is an excellent starting point for a future study on the matter, and we thank her/him once more.
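
To make the point about trained fusion rules concrete, here is a minimal sketch of a weighted-average score fusion whose weights are estimated from training scores. It is not the manuscript's implementation: the function names, the rule of weighting each detector by its training AUC, and the tiny hand-made score matrix are all assumptions chosen for illustration.

```python
import numpy as np


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    labels = np.asarray(labels).astype(bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]
    correct = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(correct / (len(pos) * len(neg)))


def fit_weights_by_auc(labels, scores):
    """Hypothetical weighting rule: one weight per detector, equal to its training AUC."""
    scores = np.asarray(scores, dtype=float)
    return np.array([auc(labels, scores[:, j]) for j in range(scores.shape[1])])


def weighted_average_fusion(scores, weights):
    """Fuse an (n_samples, n_detectors) score matrix with normalized weights."""
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return scores @ (weights / weights.sum())


# Toy training scores (made up for illustration): 4 samples, 2 detectors, label 1 = fake.
train_labels = [1, 1, 0, 0]
train_scores = [[0.9, 0.6], [0.8, 0.4], [0.2, 0.5], [0.1, 0.3]]

weights = fit_weights_by_auc(train_labels, train_scores)
fused = weighted_average_fusion(train_scores, weights)
print("weights:", weights)
print("fused training AUC:", auc(train_labels, fused))
```

Because the weights are fitted on the same distribution the detectors were trained on, nothing guarantees they remain suitable when the score distributions shift, which is the situation in the cross-dataset scenario.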

Reviewer 2 Report

The authors exploit the complementarity of four different individual classifiers and evaluate which fusion rules are best suited to increase the generalization capacity of modern deepfake detection systems.

The work has relevant merit and makes useful contributions.

However, some minor points could be addressed, such as:

- Improve the quality or color of Fig. 3, because the label inside the yellow area is not legible. The same applies to Fig. 9, which is very small; it could be split into two figures.

Overall, the paper is well written.

Author Response

We thank the referee for the useful comments and suggestions. We improved the overall quality of the cited figures. In particular:

  • We changed the color of Fig. 3 and increased the font weight of the text to make the figure more legible.
  • Similarly, we divided Fig. 9 into two figures (Fig. 9 and Fig. 10 in the revised manuscript) to increase the size of the represented images, in which we also enlarged and bolded the text to make it easier to read.