Malicious codes, such as advanced persistent threat (APT) attacks, do not operate immediately after infecting the system, but after receiving commands from the attacker’s command and control (C&C) server. The system infected by the malicious code tries to communicate with the C&C server through the IP address or domain address of the C&C server. If the IP address or domain address is hard-coded inside the malicious code, it can analyze the malicious code to obtain the address and block access to the C&C server through security policy. In order to circumvent this address blocking technique, domain generation algorithms are included in the malware to dynamically generate domain addresses. The domain generation algorithm (DGA) generates domains randomly, so it is very difficult to identify and block malicious domains. Therefore, this paper effectively detects and classifies unknown DGA domains. We extract features that are effective for TextCNN-based label prediction, and add additional domain knowledge-based features to improve our model for detecting and classifying DGA-generated malicious domains. The proposed model achieved 99.19% accuracy for DGA classification and 88.77% accuracy for DGA class classification. We expect that the proposed model can be applied to effectively detect and block DGA-generated domains.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited