Peer-Review Record

Out-of-Distribution (OOD) Detection Based on Deep Learning: A Review

Electronics 2022, 11(21), 3500; https://doi.org/10.3390/electronics11213500
by Peng Cui 1,2 and Jinjia Wang 1,2,*
Reviewer 1:
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 20 September 2022 / Revised: 25 October 2022 / Accepted: 27 October 2022 / Published: 28 October 2022

Round 1

Reviewer 1 Report

The paper is fine; a thorough check reveals that it is nearly ready to be published.

Author Response

Thank you very much for your recognition of the paper.

Reviewer 2 Report

The review paper is well-organized in general, but the writing can be improved and some of the explanations/introductions are not clear to the reviewer. 

I have the following concerns about this paper:

 

1. “Out of Distribution (OOD) detection separates ID (In Distribution) data and OOD data from input data through a pretraining model.”  – I think “pretraining” is a confusing word here. A pretrained model typically means a model created by someone else to solve a similar problem and I don’t think it fits the context here.

2. “Most studies use specific data. Therefore, according to the data type, it is divided into supervised, semisupervised, and unsupervised.” A review paper is expected not only to give high-level overviews of the categorization but also to include concise and correct introductions to it. The authors merely state “according to the data type”; why does the data type determine whether a method is supervised, semi-supervised, or unsupervised? Googling “data type” shows “data type is a set of possible values and a set of allowed operations on it. A data type tells the compiler or interpreter how the programmer intends to use the data”; is that what the authors intend? I guess not. Although everybody in the current ML community knows the difference among supervised, semi-supervised, and unsupervised learning, the paper should always be clear and correct about its definitions.

3. Lines 33-35: “In these supervised methods, model based OOD detection methods usually rely on the scoring function of statistical data from the penultimate layer or output layer of the neural network”. “Scoring function of statistical data” does not seem correct to me. Why does data have a scoring function? (A sketch of the kind of output-layer score presumably meant appears after this list.)

4. Line 50: “They hope that the model has good generalization”. This should be “has good generalization capability” or “has a small generalization error”.

5. Line 64: “Hawkins et al. [15] defined an outlier”. This sentence is confusing and needs rephrasing.

6. Lines 65-68: It would be better if the authors could elaborate on why PCA could speed up the development of anomaly detection.

7. Line 113: “The Area Under the Receiver Operating Characteristic curve (AUROC) represents the probability that an ID sample will get a higher detection score than an OOD sample”. This statement can be misleading. Firstly, what does “higher detection score” mean? Is a higher score more likely to be OOD, or more likely to be ID? Secondly, this is not a rigorous description of the AUROC metric. It should be the “probability that the model ranks a random positive example more highly than a random negative example”. (A numerical check of this pairwise-ranking reading appears after this list.)

8. Section 4.4, performance comparison: in the figures, AUPR IN and AUPR OUT are not explained. Please be more specific about how they are defined. Lines 120-123 are not clear in explaining the two metrics and need rephrasing. (A sketch of the conventional definitions appears after this list.)

9. Section 4.4, performance comparison: any idea or explanation of the performance differences among the methods? What contributed to the good performance of OODL?

10. Lines 436-437: “but OOD features are not obvious in high dimensional space, and most methods are for point features. The detection of conditional features and group features are problems that need to be solved.” What are point features, conditional features, and group features?
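Regarding comment 3: one common example of a score computed from a network's output layer is the maximum softmax probability (MSP) baseline. The following is a minimal sketch, assuming a generic PyTorch classifier; the toy linear model, random batch, and 0.5 threshold are illustrative assumptions, not the manuscript's method.

```python
import torch
import torch.nn.functional as F

def msp_score(logits: torch.Tensor) -> torch.Tensor:
    """Maximum softmax probability: higher values suggest a more ID-like input."""
    return F.softmax(logits, dim=-1).max(dim=-1).values

# Toy stand-in for a trained classifier's output layer: 32 features -> 10 classes.
model = torch.nn.Linear(32, 10)
x = torch.randn(4, 32)               # a batch of 4 hypothetical feature vectors
scores = msp_score(model(x))
is_ood = scores < 0.5                # the 0.5 threshold is an illustrative choice
print(scores, is_ood)
```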
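Regarding comment 7: the pairwise-ranking reading of AUROC can be checked numerically. Assuming the common convention that ID is the positive class (an assumption; the quoted sentence does not state one), AUROC equals the fraction of (ID, OOD) pairs in which the ID sample receives the higher score:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
id_scores = rng.normal(1.0, 1.0, 1000)    # synthetic scores for ID (positive) samples
ood_scores = rng.normal(0.0, 1.0, 1000)   # synthetic scores for OOD (negative) samples

y_true = np.concatenate([np.ones(1000), np.zeros(1000)])
y_score = np.concatenate([id_scores, ood_scores])

# Fraction of (ID, OOD) pairs where the ID sample ranks higher.
pairwise = (id_scores[:, None] > ood_scores[None, :]).mean()
print(roc_auc_score(y_true, y_score), pairwise)   # the two values coincide (no ties here)
```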
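Regarding comment 8: in much of the OOD-detection literature, AUPR-In treats ID as the positive class and AUPR-Out treats OOD as the positive class with the score negated; whether the manuscript follows this convention is exactly what the comment asks the authors to state. A minimal sketch of that conventional definition, reusing the synthetic scores above:

```python
import numpy as np
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(1.0, 1.0, 1000),    # ID samples tend to score high
                         rng.normal(0.0, 1.0, 1000)])   # OOD samples tend to score low
is_id = np.concatenate([np.ones(1000), np.zeros(1000)])

aupr_in = average_precision_score(is_id, scores)         # ID as the positive class
aupr_out = average_precision_score(1 - is_id, -scores)   # OOD as positive, score flipped
print(aupr_in, aupr_out)
```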

 

I would suggest that the authors carefully address these issues and also proof-read and revise the paper.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 3 Report

KEYWORDS: more keywords, and in alphabetical order, please.

Row 225: this equation is badly declared. What is "p,c"? What is theta?

Rows 252-253: these two equations are badly declared. What is "f"? And "N"? And epsilon?

Table 1 is awful. Please redo it and lay it out horizontally.

Row 286: this equation is badly declared. What is "Qi"? What is epsilon?

Row 304: this equation is badly declared. What is "f"? What is "T"?

Figures 3, 4, 5, and 6 are awful. Please redo them and lay them out horizontally.

Row 353: this equation is badly declared. What is alpha? What is "Dm"? What is beta?

Table 2 is awful. Please redo it and lay it out horizontally.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report

I thank the authors for addressing my questions. I still have the following concerns and questions:

1. To my previous comment (2), the authors updated: "Most studies use specific data. Therefore, according to the number of labeled data, it is divided into supervised, semisupervised, and unsupervised. In the supervised methods, all the training data are labeled. In unsupervised methods, all the training data are not labeled. In the semisupervised method, part of the training data are labeled and the other part are not labeled." 

I think it should not be the number of labeled data, but the availability of the labels.

 

2. To my previous comment (5), the authors updated the sentence to "Hawkins et al. [15] redefined the outlier."

This sentence still does not make sense to me. What is it trying to convey to the audience? How did this citation redefine the outlier? I also don’t see how this sentence fits into the paragraph. Brevity may be a good thing but using 3 words to describe a citation is too brief.

 

3. To my previous comment (6), the authors changed “PCA …speed up the development of the anomaly detection” to "PCA is a dimensionality reduction method, which is often used to reduce the dimensionality of high-dimensional data sets. It will convert a large set of variables into a smaller set of variables, while retaining most of the information in the set. The problems of difficulty in anomaly detection and model over fitting are solved."

 

Thanks for the clarification. So, PCA does not speed up the development of anomaly detection, but rather improves the calculation speed of some anomaly detection algorithms. (A minimal numerical sketch of the dimensionality reduction PCA performs appears at the end of this report.)

And the authors say “..anomaly detection has entered a relatively slow development process. The main reason is that no model could recognize data that do not meet the expected pattern until Svante et al. [16] proposed PCA”

So, if PCA only speeds up the calculation, why does it solve the problem of "no model could recognize data that do not meet the expected pattern"? In addition, please elaborate on what "no model could recognize data that do not meet the expected pattern" means.

 

So, in Section 2.1, the authors claim PCA is the turning point for OOD detection research: before PCA, development was slow; after PCA, "The problems of difficulty in anomaly detection and model over fitting are solved". This claim is problematic to me and needs further evidence or revision. There are many subsequent works on OOD detection using CNNs, GANs, etc. Did they all use PCA? If not, why is PCA the turning point?

 

Also, "The problems of difficulty in anomaly detection and model over fitting are solved" is confusing. Please elaborate on which problems are solved and how they are solved. Simply saying "the problems are solved" doesn't sound right to me.

 

So I suggest the authors do a more thorough literature review and reorganize/rewrite Section 2.1 regarding the development background of OOD detection. This would greatly improve the quality of the manuscript as a review paper.
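As context for the PCA discussion above, here is a minimal numerical sketch of the behaviour the authors' revised text describes (converting many variables into fewer while retaining most of the information). The synthetic low-rank data is an illustrative assumption, not data from the manuscript:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic data: 500 samples with 50 features that actually lie near a 5-D subspace.
latent = rng.normal(size=(500, 5))
mixing = rng.normal(size=(5, 50))
X = latent @ mixing + 0.1 * rng.normal(size=(500, 50))

pca = PCA(n_components=0.95)   # keep as many components as needed for 95% of the variance
Z = pca.fit_transform(X)
print(Z.shape, round(pca.explained_variance_ratio_.sum(), 3))   # roughly (500, 5)
```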

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 3 Report

Thank you for the changes to the manuscript.

My only remark now is that the names of variables and parameters appear as superscripts; they should instead be on the same line as the text.

Author Response

I apologize for my carelessness. All errors have been corrected.

Special thanks for your comments and suggestions; we hope you find our responses and corrections satisfactory.

We sincerely appreciate the editors' and reviewers' careful work and hope that the corrections will meet with approval.

Round 3

Reviewer 2 Report

Thanks to the authors for addressing my questions. I think my concerns have all been properly resolved, and I recommend the publication of the manuscript.
