Article
Peer-Review Record

F-ACCUMUL: A Protocol Fingerprint and Accumulative Payload Length Sample-Based Tor-Snowflake Traffic-Identifying Framework

Appl. Sci. 2023, 13(1), 622; https://doi.org/10.3390/app13010622
by Junqiang Chen 1,2,3, Guang Cheng 1,2,3,* and Hantao Mei 1,2,3
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 30 November 2022 / Revised: 20 December 2022 / Accepted: 29 December 2022 / Published: 2 January 2023

Round 1

Reviewer 1 Report

Section 1, the introduction mostly covers the background on Tor and the snowflake PT protocol, but it does not touch on the problem that you are trying to solve. Why is this problem important and hard, and what are the technical challenges involved? This should be made clear in the introduction.

 

Section 2, the related work section only describes some works on Tor traffic analysis and de-anonymization. The authors should discuss in more depth whether and why these works can or cannot tackle the problem, the tradeoffs in their design, and how they compare with your proposed method.

 

Section 2, “and there is no research on the identification of hidden service traffic in Tor-snowflake scenarios.” I believe so, but can the existing works on other PT protocols be applied to Tor-snowflake scenarios? A discussion here would be welcome.

 

Section 3, the background section introduces the details of the snowflake protocol, but how are these related to your algorithm design? More specifically, which features of the snowflake protocol are leveraged to make your algorithm work where others cannot? Otherwise, the background part is not interesting to read.

 

Section 4, the introduction to the Tor-Snowflake Traffic Identification Framework is more engineering-focused. I can get a sense of how the components work together, but I don’t see anything new here.

 

Section 4.2, many works use website fingerprinting, so the method of Snowflake identification based on the DTLS handshake fingerprint does not feel new to me.



Line 419, “we experimentally verify that the best computational efficiency and recognition accuracy can be achieved at n=300,m=40”. Why? How did you perform the fine-tuning process, and how can you make sure this works for all traffic scenarios?

 

Section 5.1, it seems that you collected data from simulated Tor machines. How can you ensure that the website-visiting patterns are similar to those in a real-world environment?

Have you tried a real-world dataset such as CAIDA?

 

Line 502, Figure 7’s caption does not make sense.

 

Section 5.3, have you tried the techniques in [1], such as n hot learning?




[1] Cherubin, G., Jansen, R. and Troncoso, C., 2022. Online Website Fingerprinting: Evaluating Website Fingerprinting Attacks on Tor in the Real World. In 31st USENIX Security Symposium (USENIX Security 22) (pp. 753-770).




Grammar

 

Line 15, differ from -> differing from xxx

Line 20, identify hidden service -> services

 

I did not list all the issues; please proofread the manuscript.

 

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

  • Dear Authors! Here is my review:
  • Manuscript is clear, relevant for the field and presented in a well-structured manner.
  • Cited references are mostly recent publications (within the last 5 years) and are relevant, except the following: 1, 6, 9, 14, 15, 23, 25, 26, 27, 28 (they are relevant but outdated).
  • Manuscript's experimental design is appropriate to test the hypothesis.
  • Manuscript’s results are reproducible based on the details given in the methods section.
  • About figures: a caption should be added to Figure 7 (it is missing, I guess), and the figure itself should have better scale descriptions.
  • "As seen from the data in the table, our proposed method based on linear interpolation sampling of the accumulative message payload length in the DTLS data transmission phase has good results in identifying HS traffic in the Tor-snowflake scenario when building models using RF and KNN." Please give a more detailed description of what you define as "good results".
  • In my opinion, Figure 3 can be removed from the text because it is obvious.
  • Conclusions are consistent with the evidence and arguments presented.
  • Data availability statements are adequate.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The authors have revised the manuscript on a point-by-point basis in response to my previous comments. The revisions look good. The quality of this paper has been improved.
