Open Access Article

Peer-Review Record

Designing Trojan Detectors in Neural Networks Using Interactive Simulations

Appl. Sci. 2021, 11(4), 1865; https://doi.org/10.3390/app11041865

by Peter Bajcsy¹,*, Nicholas J. Schaub² and Michael Majurski¹

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Reviewer 3: Anonymous

Submission received: 30 December 2020 / Revised: 11 February 2021 / Accepted: 14 February 2021 / Published: 20 February 2021

(This article belongs to the Special Issue Machine Learning for Cybersecurity Threats, Challenges, and Opportunities)

Round 1

Reviewer 1 Report

Although it is not a field that I usually handle, the methodology outlined in the article is quite well described so that the process can be followed. The objective of each of the calculations and simulations carried out is well explained. The figures that accompany the text are especially descriptive, further improving the result. It was an interesting read.

Author Response

Thank you very much for the comments.

Reviewer 2 Report

There is a lot of useful information in this paper. They need to remove 'we' and write in the third person. The references are suitable and current in supporting the work. There is little justification of the method and no conclusion or brief analysis. Many diagrams too small to read and the equations do not all seem to be necessary.

I do not see that this needs more than moderate changes and presentation. I enjoyed reading this research and the subject is very important.

Author Response

Thank you very much for the comments.

“They need to remove 'we' and write in the third person.” This comment was addressed by converting the sentences into the third person or by using the passive voice. Some sentences remained intact to convey the personal statements of the authors to the readers.

Example of the third person:

Based on the sensitivity values shown in Figure 6 (0.1 for data regeneration and 0.5 for re-training), one could infer that the trojan T9 is likely in both classes …

Example of the passive voice:

For the last case, {\color{red} it is assumed} that all models were properly trained, and class labels are not assigned at random.

Example of intact sentences:

However, we are not able to make any inferences about the number of regions from Figure~\ref{fig:10} (right) other than that the complexity of modeling class P or N in the case of T8 is more inefficient than modeling class P and N in the case of T9 by comparing the deltas of modified KL divergence values.

In the following section, we show how to use the modified KL convergence to detect the presence of trojans in a network.

Therefore, we did not compare the inefficiency across the two trojans.

“There is little justification of the method and no conclusion or brief analysis.”

Section 5, “Discussion about Trojan Detection”, is devoted to a brief analysis, and the first paragraph of Section 6, “Summary and Future Work”, presents a summary of the work to conclude the paper.

“Many diagrams too small to read and the equations do not all seem to be necessary.”

We increased the font size of the text and numerical values, made the text and numerical values bold, and sharpened the pictures by 50% for Figures 1 through 8.

All figures in the appendices were updated the same way as the figures in the main manuscript.

There are four equations in the main text: KL divergence, modified KL divergence, delta between KL divergence and modified KL divergence, and the trojan detection equations. All other equations and derivations are in the appendices. We could not identify any of the four equations in the main text that could be removed while presenting a clear understanding of the mathematical operations to the journal readers.

Reviewer 3 Report

The authors present an approach to designing Trojan detectors in neural networks. Interactive simulations are used for this purpose.

In my opinion, the approach appears to be novel and interesting. Its readability is clear and fluent. Also, the empirical results are promising. Notwithstanding, the manuscript still requires the following enhancements.

A specific toy example should be added in the Introduction to preliminarily introduce the unfamiliar reader with Trojans in NNs, their challenges and implications.
The novelty of each authors’ contribution to the presented approach must be more extensively and insightfully highlighted. This would be useful to appreciate even more in detail how authors’ research differs from the reviewed related works.
The theoretical and practical implications of authors’ research must be clearly highlighted and discussed.
The authors’ assumption TwT NN models “demonstrate higher efficiency / utilization” than TwoT NN models has to be more extensively elaborated.
An explicit section must be devoted to the discussion of the implications of the observed results, with a particular focus on general guidelines on the detection of Trojans and the design of their detectors.
An insight into how the results of Trojan detection would change with the number of classes should be provided.

Finally, as a minor issue, I would suggest the authors to more clearly reword the concept of “implementation approach” (occurring at line 224 and at line 230).

Author Response

Thank you very much for the comments.

“A specific toy example should be added in the Introduction to preliminarily introduce the unfamiliar reader with Trojans in NNs, their challenges and implications.”
1. We added a sentence in the introduction and created an entire appendix to provide a toy example motivating our work.
“The novelty of each authors’ contribution to the presented approach must be more extensively and insightfully highlighted. This would be useful to appreciate even more in detail how authors’ research differs from the reviewed related works. The theoretical and practical implications of authors’ research must be clearly highlighted and discussed.”
1. We added the following paragraphs right after listing the novelties of the work: “{\color{red} The novelties can be described as follows. First, the authors conceived the concept of interactive neural network calculator in which (a) operands are 2D data and neural networks, (b) memory operations follow the operations provided by standard calculators (MC, MR, M+, M-, MS, AVG), (c) NN and data operators are applicable functions to design, parametrize, train, infer, and analyze (inefficiency, sensitivity) NN-based models, and (d) display of NN, data, and results can be delivered in scrollable views of web browsers. Second, the authors designed a modified KL divergence measurement of NN states based on the parallels with information theory and based on computational cost considerations. Finally, the authors devised a methodology for trojan detection by investigating the simulations of multiple types of embedded trojans. These and other simulations can be used for educational and research purposes as they contribute to advancing explainable AI concepts by the AI community.}”
2. The comparison to previous work is stated in the Related Work section.
“The authors’ assumption TwT NN models “demonstrate higher efficiency / utilization” than TwoT NN models has to be more extensively elaborated.”
1. We added a few sentences explaining the reasoning behind the assumption/hypothesis: “However, our goal is to explore the hypothesis that NN models trained with trojans will demonstrate higher efficiency/utilization of NN than NN models trained without trojan. {\color{red} This hypothesis can be explained by the observations that encoding n predicted classes plus trojan will likely require a model with higher modeling capacity than encoding n predicted classes. One can illustrate this observation on the last layer of fully connected layers. If the last layer consists of one node, then the node output can discriminate only two classes. In order to discriminate/predict more than two classes, one must increase the modeling capacity to more nodes per layer.}”
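The capacity argument in the response above can be illustrated with a small counting sketch. It assumes, purely for illustration, that each last-layer node contributes one on/off decision and that trojan behavior is counted as one additional outcome to encode; the function name `min_binary_nodes` is hypothetical and does not come from the manuscript.

```python
def min_binary_nodes(num_outcomes: int) -> int:
    """Smallest number of on/off nodes whose joint states can label
    `num_outcomes` distinct outcomes (k nodes give at most 2**k states)."""
    if num_outcomes < 1:
        raise ValueError("num_outcomes must be at least 1")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, using integers only.
    return (num_outcomes - 1).bit_length()


# Two predicted classes fit in a single on/off node.
nodes_without_trojan = min_binary_nodes(2)
# Assumption: the trojan is counted as one extra outcome to encode.
nodes_with_trojan = min_binary_nodes(2 + 1)
```

Under these simplifying assumptions, two classes need one node while two classes plus one extra outcome need two, which is the direction of the effect the hypothesis describes; it says nothing about how large the effect is in trained models.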
“An explicit section must be devoted to the discussion of the implications of the observed results, with a particular focus on general guidelines on the detection of Trojans and the design of their detectors.”
1. The discussion is provided in Section 5, “Discussion about Trojan Detection”. The problem of designing trojan detectors is a relatively new and active area of research. According to our survey statistics, derived from the publications listed at https://github.com/usnistgov/trojai-literature in 2020, only two publications related to trojan detection had appeared on arXiv before 2017. We are therefore not able to make general statements about guidelines for detecting trojans or designing trojan detectors.

“An insight into how the results of Trojan detection would change with the number of classes should be provided.”
1. We extended the discussion section under “\underline{Complexity of trojan problems:}” to highlight the complexity of trojan detection for NN models with large numbers of predicted classes: “{\color{red} Finally, the number of classes goes from two to hundreds or thousands. } Given such an increase of problem complexities and without knowing the characteristics of trojan embedding, the number and selection of provided training data points per class become the key to detecting trojans. {\color{red} In addition, for NN models predicting large numbers of classes, the combinatorial complexity of triggered classes and targeted classes is much higher than for NN models predicting two classes.}”
2. We added Appendix entitled “Example of Trojan Problem”. The appendix clarifies that the trojan embedding depends on the number of classes.
3. We are currently running experiments on actual NN models to understand the dependency of trojan detectors on the number of classes encoded in an NN model. We do not have the insights yet, and the simulation framework is limited to two-class problems (hence we would not want to make wrong predictions).
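The growth in combinatorial complexity mentioned in item 1 can be made concrete with a short count. The sketch assumes the simplest case, in which a trojan redirects exactly one triggered class to one different targeted class; the function name `trigger_target_pairs` is hypothetical, and real trojans may involve several classes at once.

```python
def trigger_target_pairs(num_classes: int) -> int:
    """Number of ordered (triggered class, targeted class) pairs
    in which the two classes differ."""
    if num_classes < 2:
        return 0
    return num_classes * (num_classes - 1)


pairs_two_classes = trigger_target_pairs(2)        # P->N and N->P
pairs_thousand_classes = trigger_target_pairs(1000)
```

Even under this single-pair assumption the count grows quadratically, from 2 pairs for a two-class model to 999000 pairs for a 1000-class model.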
“Finally, as a minor issue, I would suggest the authors to more clearly reword the concept of “implementation approach” (occurring at line 224 and at line 230).”
1. We have rewritten the paragraphs describing the implementation approach.
  1. “{\color{red} To minimize the memory requirements in our implementation, histogram bins are created and stored in memory only for states that occur when each training data point passes through the neural network. This implementation leads to the worst-case memory requirement scenario to be $npts * 10 * 100$ bytes. }”
  2. “To eliminate the alignment computation in our implementation, {\color{red} the KL divergence definition is modified according to Equation \ref{eq:02}. The computation of modified KL divergence $\widehat{D_{KL}}$ requires only collecting non-zero occurring states and calculating their histogram at the cost of approximating the originally defined KL divergence. The derivation of Equation \ref{eq:02} with its approximation step can be found in Appendix \ref{appendix:derivation}. }”
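The two implementation points above (storing bins only for states that occur, and computing a divergence from that histogram) can be sketched as follows. This is not the authors' Equation 2, which is not reproduced in this record; it is the standard KL divergence of the observed state distribution against a uniform reference over all possible states, chosen here as an assumption. States with zero count contribute nothing to the sum, so only occurring states need a bin. The names `state_histogram` and `kl_to_uniform`, and the encoding of a layer state as a tuple of 0/1 node outputs, are hypothetical.

```python
import math
from collections import Counter


def state_histogram(states):
    """Create bins only for states that actually occur in the data."""
    return Counter(states)


def kl_to_uniform(hist, num_possible_states):
    """KL divergence (in bits) between the observed state distribution
    and a uniform distribution over `num_possible_states` states.
    Unobserved states have probability 0 and add 0 to the sum."""
    total = sum(hist.values())
    q = 1.0 / num_possible_states  # uniform reference probability
    divergence = 0.0
    for count in hist.values():
        p = count / total
        divergence += p * math.log2(p / q)
    return divergence


# A layer with 2 binary nodes has 4 possible states; only 2 are observed.
observed = [(0, 1), (0, 1), (1, 0), (1, 0)]
hist = state_histogram(observed)
divergence_bits = kl_to_uniform(hist, num_possible_states=4)
```

In this toy example half of the possible states never occur, and the divergence from the uniform reference is 1 bit; a layer that used all four states equally often would give 0.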

Round 2

Reviewer 3 Report

I could not find the authors’ online responses to my concerns in the revised pdf manuscript. Please, ensure that the submitted manuscript is actually the revised one and that the latter includes all of your responses cited in the online report. The use of a marked pdf revised manuscript would be useful to ensure that all of your replies have been added to the manuscript.

Furthermore, having read the authors’ online responses, I object that their answers to my first two questions are not satisfying at all. They simply replied by pointing me to parts of their manuscript, which in their opinion should address my concerns. However, such concerns arose when I read their contribution at the time of its original submission and, unfortunately, are still present after their revision.

I recapitulate my first two questions below, since I am confident that the authors will take them into account.

The authors must develop and add a new toy example, even in the form of a Figure, which is functional to introduce the interested and unfamiliar reader to the wide theme of Trojans in NNs, their challenges and implications. Unfortunately, the appendix cannot be considered a toy example for the unfamiliar reader, since its contents are targeted to a more informed reader.
I am aware that “the comparison to previous work is stated in the Related Work section”. My previous concern is that the authors must explain the originality of their individual contribution, so that the publication of their manuscript is actually justified with respect to all of the previous work.

Please, answer the above concerns and add all answers in the online responses to the revised pdf manuscript.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 3

Reviewer 3 Report

My concerns were addressed. The article was improved.

As a minor issue please notice that:

“trained” should be “train” at line 479
“ to defined a class B inthe” needs to be rephrased at line 485.
