This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

A Trustworthy Dataset for APT Intelligence with an Auto-Annotation Framework

1 College of Computer Science, Beijing Information Science and Technology University, Beijing 102206, China
2 Institute of Intelligent Information Processing, Beijing Information Science and Technology University, Beijing 102206, China
* Author to whom correspondence should be addressed.
Electronics 2025, 14(16), 3251; https://doi.org/10.3390/electronics14163251
Submission received: 10 July 2025 / Revised: 7 August 2025 / Accepted: 13 August 2025 / Published: 15 August 2025
(This article belongs to the Special Issue Advances in Information Processing and Network Security)

Abstract

Advanced Persistent Threats (APTs) pose significant cybersecurity challenges due to their multi-stage complexity. Knowledge graphs (KGs) effectively model APT attack processes through node-link architectures; however, the scarcity of high-quality, annotated datasets limits research progress. The primary challenge lies in balancing annotation cost and quality, particularly due to the lack of quality assessment methods for graph annotation data. This study addresses these issues by extending existing APT ontology definitions and developing a dynamic, trustworthy annotation framework for APT knowledge graphs. The framework introduces a self-verification mechanism utilizing large language model (LLM) annotation consistency and establishes a comprehensive graph data metric system for problem localization in annotated data. This metric system, based on structural properties, logical consistency, and APT attack chain characteristics, comprehensively evaluates annotation quality across representation, syntax semantics, and topological structure. Experimental results show that this framework significantly reduces annotation costs while maintaining quality. Using this framework, we constructed LAPTKG, a reliable dataset containing over 10,000 entities and relations. Baseline evaluations show substantial improvements in entity and relation extraction performance after metric correction, validating the framework’s effectiveness in reliable APT knowledge graph dataset construction.
Keywords: entity relationship dataset; automated annotation framework; graph data evaluation; threat intelligence; APT

Share and Cite

MDPI and ACS Style

Qi, R.; Xiang, G.; Zhang, Y.; Yang, Q.; Cheng, M.; Zhang, H.; Ma, M.; Sun, L.; Ma, Z. A Trustworthy Dataset for APT Intelligence with an Auto-Annotation Framework. Electronics 2025, 14, 3251. https://doi.org/10.3390/electronics14163251

AMA Style

Qi R, Xiang G, Zhang Y, Yang Q, Cheng M, Zhang H, Ma M, Sun L, Ma Z. A Trustworthy Dataset for APT Intelligence with an Auto-Annotation Framework. Electronics. 2025; 14(16):3251. https://doi.org/10.3390/electronics14163251

Chicago/Turabian Style

Qi, Rui, Ga Xiang, Yangsen Zhang, Qunsheng Yang, Mingyue Cheng, Haoyang Zhang, Mingming Ma, Lu Sun, and Zhixing Ma. 2025. "A Trustworthy Dataset for APT Intelligence with an Auto-Annotation Framework" Electronics 14, no. 16: 3251. https://doi.org/10.3390/electronics14163251

APA Style

Qi, R., Xiang, G., Zhang, Y., Yang, Q., Cheng, M., Zhang, H., Ma, M., Sun, L., & Ma, Z. (2025). A Trustworthy Dataset for APT Intelligence with an Auto-Annotation Framework. Electronics, 14(16), 3251. https://doi.org/10.3390/electronics14163251
