Article
Peer-Review Record

Classification of Obfuscation Techniques in LLVM IR: Machine Learning on Vector Representations

Mach. Learn. Knowl. Extr. 2025, 7(4), 125; https://doi.org/10.3390/make7040125
by Sebastian Raubitzek 1, Patrick Felbauer 2, Kevin Mallinger 1 and Sebastian Schrittwieser 2,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 25 August 2025 / Revised: 9 October 2025 / Accepted: 14 October 2025 / Published: 22 October 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

I have some major points the authors should address in their revision:
1. Enforce program-level splits and report leave-program-out results. Also test robustness across compiler levels (train on O0–O2, test on O3, and vice versa).
2. Demonstrate generalizability. Add an open-set/OOD evaluation (held-out transformation families or a different obfuscator) and support an "unknown" rejection option with calibrated thresholds.
3. Strengthen baselines and ablations: compare IR2Vec against simple IR features (opcode/CFG statistics) and ablate the vector dimensionality.
4. Improve class design and first-stage detection. Report a two-stage pipeline (Obfuscated? Y/N -> Which class) and address the low recall for non-obfuscated samples.
5. Quantify sensitivity to pass ordering and explain recurrent confusions using feature attributions or class-centroid similarity.
6. Release a reproducible package. Share code and data.
7. Clarify novelty vs prior work.
8. Add scalability evidence on a realistic corpus.
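
As an illustration of the program-level splitting requested in point 1, the following is a minimal sketch of a leave-program-out split. It is not the authors' code. It assumes each sample carries an identifier of the source program it was derived from; the function name `leave_program_out_splits` and the example program names are hypothetical.

```python
from collections import defaultdict


def leave_program_out_splits(program_ids):
    """Yield (held_out_program, train_indices, test_indices), one fold per program.

    program_ids[i] names the source program of sample i (an assumed labelling).
    All obfuscated variants of a program land on the same side of the split,
    so a classifier cannot score well merely by recognising the program.
    """
    by_program = defaultdict(list)
    for idx, pid in enumerate(program_ids):
        by_program[pid].append(idx)

    for held_out in sorted(by_program):
        test_idx = by_program[held_out]
        train_idx = sorted(
            i
            for pid, idxs in by_program.items()
            if pid != held_out
            for i in idxs
        )
        yield held_out, train_idx, test_idx


# Hypothetical example: two obfuscated variants for each of three programs.
program_ids = ["sort", "sort", "cat", "cat", "ls", "ls"]
folds = list(leave_program_out_splits(program_ids))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

The same idea extends to the compiler-level robustness test in point 1 by grouping on the optimization level instead of the program name.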

Comments on the Quality of English Language

The manuscript is generally understandable, but it needs a careful copy-edit to fix scattered typos, article/preposition usage, inconsistent capitalization of technical terms, and minor formatting issues in references/URLs.

Author Response

file attached

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The paper “Classification of Obfuscation Techniques in LLVM IR: Machine Learning on Vector Representations” proposes a machine learning framework for identifying code obfuscations applied at the LLVM intermediate representation level. The method introduces two main components: the use of IR2Vec embeddings to capture structural and semantic properties of obfuscated IR code, and the application of ensemble classifiers (CatBoost and ExtraTrees) to recognize both single and layered obfuscations. Furthermore, a comprehensive dataset is constructed using Tigress obfuscations on handcrafted programs and GNU Coreutils. Experiments on this dataset demonstrate that the approach achieves over 90% classification accuracy, effectively distinguishing diverse obfuscation types and combinations.

  1. In the dataset construction, since the data is primarily derived from Tigress, could this single-source origin introduce bias, leading the classifiers to learn Tigress-specific artifacts rather than more generalizable patterns of obfuscation?
  2. The description of IR2Vec is overly brief and lacks sufficient detail. It would be beneficial to elaborate on the training principle of IR2Vec and explain how it captures instruction-level and semantic relationships, as well as to include a schematic illustration showing the transformation process from code snippets to IR2Vec embeddings.
  3. A more detailed description of the hardware and software environment is needed in the experimental setup, including specifications of the CPU/GPU configurations and the training parameters used for IR2Vec.
  4. Only CatBoost and ExtraTrees are compared, while baseline methods and ablation studies are lacking.
  5. The experimental results are primarily reported in terms of accuracy, without reflecting statistical confidence. Please consider including measures such as statistical significance tests or error bars to demonstrate the stability of the results.
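
As an illustration of the error bars requested in point 5, the following is a minimal sketch of a percentile bootstrap confidence interval for classification accuracy. It is one possible approach under stated assumptions, not the method used in the paper. The function name `bootstrap_accuracy_ci`, its parameters, and the example labels are hypothetical.

```python
import random


def bootstrap_accuracy_ci(y_true, y_pred, n_resamples=2000, alpha=0.05, seed=0):
    """Return (accuracy, lower, upper) using a percentile bootstrap.

    Test samples are resampled with replacement; the interval is taken from
    the empirical alpha/2 and 1 - alpha/2 quantiles of the resampled accuracies.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    correct = [int(t == p) for t, p in zip(y_true, y_pred)]
    n = len(correct)
    point = sum(correct) / n

    resampled = sorted(
        sum(rng.choices(correct, k=n)) / n for _ in range(n_resamples)
    )
    lower = resampled[int((alpha / 2) * n_resamples)]
    upper = resampled[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return point, lower, upper


# Hypothetical labels: 8 of 10 predictions are correct.
y_true = ["flatten", "virtualize", "none", "flatten", "opaque",
          "none", "virtualize", "opaque", "flatten", "none"]
y_pred = ["flatten", "virtualize", "flatten", "flatten", "opaque",
          "none", "virtualize", "opaque", "flatten", "opaque"]
point, lower, upper = bootstrap_accuracy_ci(y_true, y_pred)
print(f"accuracy = {point:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

With a test set this small the interval is wide, which is exactly the kind of instability an accuracy figure alone does not reveal.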

Author Response

file attached

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The authors have only partially answered the questions. However, they have improved the article enough for it to be published.

Reviewer 2 Report

Comments and Suggestions for Authors

I have no further comments.
