Article

A Classifier with Unknown Pattern Recognition for Domain Name System Tunneling Detection in Dynamic Networks

School of Computer and Computing Science, Hangzhou City University, Hangzhou 310015, China
*
Author to whom correspondence should be addressed.
Electronics 2026, 15(3), 709; https://doi.org/10.3390/electronics15030709
Submission received: 12 January 2026 / Revised: 1 February 2026 / Accepted: 5 February 2026 / Published: 6 February 2026
(This article belongs to the Special Issue AI for Cybersecurity and Emerging Technologies for Secure Systems)

Abstract

Domain Name System (DNS) tunneling, a stealthy attack that exploits DNS infrastructure, poses critical threats to dynamic networks and continues to evolve with emerging attack patterns. This study aims to accurately classify multi-pattern legitimate and malicious traffic and to identify previously unseen attack patterns. We focus on two core research questions: how to accurately classify known-pattern DNS queries, and how to reliably identify unknown-pattern samples. Our objective is to develop an unsupervised classification approach that integrates multi-pattern adaptation with the recognition of unknown patterns. We formalize the task as Emerging Pattern Classification and propose the Medium Neighbors Forest, a forest-based model that uses the "medium neighbor" mechanism and clustering to identify unknown patterns. Experiments verify that the proposed model effectively identifies unseen patterns, offering a new perspective on DNS tunneling detection.

1. Introduction

The Domain Name System (DNS), a core component of internet infrastructure that maps domain names to IP addresses, is increasingly exploited for malicious purposes. DNS tunneling, a technique that embeds covert data into DNS queries and responses, enables attackers to establish stealthy communication channels, facilitating data exfiltration, captive portal evasion, and impersonation attacks [1].
Figure 1 illustrates the working mechanism of this cyberattack: ➀ A virus-infected computer automatically sends a DNS query to the local DNS resolver, for example, "secretdata.tun.malicious.com". ➁ Since the local DNS server has not resolved this domain name before, it forwards the request to the Internet Service Provider (ISP) DNS server. ➂ Due to the essential nature of DNS services, port 53 (the default port for DNS) is almost always allowed in firewall policies, enabling unimpeded transmission of DNS queries. ➃ "malicious.com" is a domain name pre-registered by the attacker, and its authoritative DNS server is under the attacker's control. Since the query ends with "malicious.com", the request is ultimately routed to this attacker-controlled authoritative DNS server. ➄ Upon receiving the request, the attacker-controlled DNS server does not resolve the domain name to a legitimate IP address; instead, it extracts the encoded "secretdata" and forwards it to the attacker's personal computer. ➅–➉ represent the transmission path through which the infected computer receives the DNS response. In particular, the response typically does not contain a valid IP address; instead, it carries data or control commands for further manipulation of the infected computer. At this point, a stable covert communication channel is established between the infected computer and the attacker's device via the DNS resolution infrastructure. From the perspective of the DNS protocol, communication between the attacker and the infected computer appears to be a normal domain name query and resolution, making it difficult to detect. According to a report by the Global Cyber Alliance (GCA), blocking illegal DNS requests could prevent billions of dollars in global economic losses, highlighting the critical need for effective DNS tunneling detection mechanisms before queries are forwarded to public DNS resolvers.
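To make step ➀ concrete, the sketch below shows how a tunneling client might pack covert data into a query name; the helper function is our own illustration, not taken from any specific tool. DNS labels are limited to 63 bytes and are case-insensitive, which is why base32 encoding is a common choice:

```python
import base64

def encode_query(payload: bytes, domain: str = "tun.malicious.com") -> str:
    """Illustrative only: pack covert data into a DNS query name.

    DNS labels are case-insensitive and at most 63 bytes long, so
    tunneling tools typically base32-encode the payload and chunk it.
    """
    encoded = base64.b32encode(payload).decode().rstrip("=").lower()
    # Split the encoded payload into labels of at most 63 characters.
    labels = [encoded[i:i + 63] for i in range(0, len(encoded), 63)]
    return ".".join(labels + [domain])

query = encode_query(b"secretdata")
```

The resulting query resolves through ordinary DNS infrastructure, yet its leading labels carry the exfiltrated bytes.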
From a machine learning perspective, DNS tunneling detection is typically framed as a binary classification task (safe vs. unsafe queries). However, this task faces unique challenges that distinguish it from conventional binary classification:
  • Diversity: For unsafe samples in the training dataset, the types of data to be transmitted and the encryption methods used are highly diverse, resulting in no unified pattern. Meanwhile, legitimate DNS traffic in the training dataset also exhibits significant variability—for example, traffic generated by accessing video websites differs substantially from that generated by instant messaging applications.
  • Novel patterns: During the testing phase, DNS queries with patterns not observed during training (i.e., unknown patterns) will inevitably emerge. Conventional classifiers cannot accurately predict the categories of these samples, requiring independent consideration of such unknown queries.
  • Class imbalance: In datasets collected from live networks, the vast majority of samples are legitimate DNS requests generated by normal user activities, whereas malicious DNS requests generated by DNS tunneling account for only a small proportion.
  • Feature engineering complexity: A large amount of information can be extracted from each DNS query, including the source IP address, the destination IP address, the domain name, and the timestamp. However, this raw data cannot be directly input into machine learning models, making feature engineering a critical and challenging step.
To address these issues, we propose a forest-based classifier with an unknown pattern identification function, specifically designed to handle the unique characteristics of DNS tunneling detection.
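To illustrate the feature-engineering challenge mentioned above, the snippet below derives a few lexical features commonly used in the literature (query length, label counts, digit ratio, character entropy) from a raw domain name; this exact feature set is our own minimal example, not the one used in this paper:

```python
import math
from collections import Counter

def query_features(domain: str) -> dict:
    """Derive simple lexical features from a DNS query name."""
    labels = domain.split(".")
    name = "".join(labels)
    counts = Counter(name)
    total = len(name)
    # Shannon entropy of the character distribution: encoded payloads
    # tend to look far more random than human-chosen domain names.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return {
        "length": len(domain),
        "num_labels": len(labels),
        "max_label_len": max(len(label) for label in labels),
        "digit_ratio": sum(ch.isdigit() for ch in name) / total,
        "entropy": entropy,
    }

feats = query_features("secretdata.tun.malicious.com")
```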

1.1. Problem Formulation

To formally describe the DNS tunneling detection task with unknown pattern classification, we first define the following key concepts:
  • Known Pattern (KP) Samples: DNS query samples whose patterns have been observed in the training dataset, including both legitimate and known malicious patterns.
  • Unknown Pattern (UP) Samples: DNS query samples whose patterns have not been observed in the training dataset, primarily consisting of newly emerging malicious DNS tunneling patterns.
Let $X = \{x_1, x_2, \ldots, x_n\} \in \mathbb{R}^{n \times d}$ denote the training dataset, where $n$ is the number of training samples and $d$ is the number of features. Let $\hat{X} = \{\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_m\} \in \mathbb{R}^{m \times d}$ denote the test dataset, where $m$ is the number of test samples. Let $Y = [y_1, y_2, \ldots, y_n]$, where $y_i \in \mathbb{R}^C$ is the class label of the $i$-th sample $x_i$, and $C$ is the number of known classes. Let $P = [p_1, p_2, \ldots, p_n]$, where $p_i$ denotes the pattern (subclass) of sample $x_i$. In the context of DNS tunneling detection, each pattern corresponds to a specific traffic behavior mode. Each training sample $x_i \in X$ is associated with a latent pattern label $p_i \in \{1, 2, \ldots, K\}$, where $K$ is the total number of patterns in the training dataset. Each test sample $\hat{x}_j$ is associated with an implicit pattern label $\hat{p}_j$. If $\hat{p}_j \in \{1, 2, \ldots, K\}$, then $\hat{x}_j$ is regarded as a KP sample; otherwise, it is regarded as a UP sample. There is no predefined one-to-one correspondence between pattern labels $p_i$ and class labels $y_i$. Instead, a single class may contain multiple distinct patterns, i.e.,
$$|\{p_i \mid y_i = k\}| \geq 1, \quad \forall k \in \{1, 2, \ldots, C\}.$$
Unlike open-set classification problems, DNS tunneling detection is essentially a binary classification problem (legitimate vs. malicious queries), and the training data typically include both legitimate and malicious requests. Therefore, samples with unseen class labels $y_i \notin \{1, \ldots, C\}$ are unlikely to appear in this scenario. This characteristic has two consequences: (1) a sample whose pattern, rather than its label, has not been observed during training should be recognized as the Unknown class; (2) the traditional method of directly mapping samples to category labels is no longer feasible. For example, suppose animals form one class with multiple subclasses, such as fish, tigers, and birds, and plants form another class with multiple subclasses, such as rice, cotton, and maize. If the test sample is an elephant, it is more reasonable to expect the model to identify it as unknown rather than as an animal, because during training the model cannot possibly know what an elephant is.
We refer to the classification problem involving UP samples as "Emerging Pattern Classification (EPC)". It is worth noting that although DNS tunneling detection is a binary classification problem, the method proposed in this paper is fully applicable to multi-class classification.
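The EPC setting can be made concrete with a toy dataset in which class labels and latent pattern labels are decoupled; all names and numbers below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two classes (0 = legitimate, 1 = malicious), each covering several
# latent patterns; each pattern is a cluster centre in feature space.
train_patterns = {0: 0, 1: 0, 2: 1, 3: 1}   # pattern id -> class label

def sample(pattern: int, n: int) -> np.ndarray:
    centre = np.full(2, 3.0 * pattern)       # toy pattern centre
    return centre + rng.normal(size=(n, 2))

X = np.vstack([sample(p, 50) for p in train_patterns])
y = np.concatenate([[c] * 50 for c in train_patterns.values()])

# A test sample drawn from pattern 5 may still be "malicious-like",
# but its pattern was never seen in training, so under the EPC
# formulation it must be reported as Unknown rather than classified.
x_test_pattern = 5
is_up_sample = x_test_pattern not in train_patterns
```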

1.2. Motivation and Contributions

Because each class in the DNS tunneling detection scenario may contain multiple patterns, it is impossible to learn a single rule that describes all samples within a class. To accurately classify KP samples, we first determine the pattern to which a test sample belongs and then predict its class label from the training samples of that pattern. However, the pattern labels of training samples and the total number of patterns in the training dataset are both unknown. Therefore, pattern matching for KP test samples must be performed using an unsupervised model. In this paper, we construct a forest model in a fully unsupervised manner. During the testing phase, each test sample falls into a leaf node of each tree in the forest. For KP samples, the class label of the test sample is inferred from the class labels of the training samples in the leaf node. For a test sample $\hat{x}_i$ falling into leaf node A, an intuitive method to determine whether its pattern has been observed in the training phase is to set a distance threshold for each leaf node: if the distance between the test sample and the center of the leaf node exceeds the threshold, the sample is a UP sample; otherwise, it is a KP sample. However, determining an appropriate threshold for each leaf node is challenging because the range of possible threshold values is infinite.
To address this issue, we first define the concept of "medium neighbor". The probability that a test sample $\hat{x}_i$ is identified as a UP sample is then calculated based on the number of its medium neighbors. Specifically, let $n_1$ be the number of medium neighbors of $\hat{x}_i$ in leaf node A, and $n_2$ the number of training samples in leaf node A. The probability of $\hat{x}_i$ being a UP sample is computed based on the ratio of $n_1$ to $n_2$. Since our algorithm identifies unknown samples by counting their medium neighbors, we refer to it as MNForest (Medium Neighbors Forest). Figure 2 illustrates the workflow of MNForest: (a) The training dataset contains multiple classes (distinguished by different shapes), and each class contains multiple patterns (distinguished by different colors). (b) The training samples are split into leaf nodes in an unsupervised manner using the proposed tree construction method. The model expects that samples in each leaf share the same label, while samples with the same label can appear in multiple leaves. (c) For a test sample $\hat{x}_i$ (denoted by the red circle), its medium neighbors in the leaf node are counted (denoted by the black circle). The probability of $\hat{x}_i$ being a UP sample is calculated based on the number of medium neighbors, and the final classification result is determined by comparing this probability with a predefined threshold. Detailed descriptions of the key components are provided in Section 3.
We explicitly formulate the following scientific research questions:
  • How can the DNS tunneling detection task involving unknown pattern recognition be formalized so as to accurately characterize the dynamic emergence of DNS query patterns in real-world networks?
  • How can we design an efficient classifier that not only handles multi-pattern classes in DNS tunneling detection but also effectively identifies unknown-pattern samples?
  • Can the proposed model outperform state-of-the-art baselines in both known-pattern classification accuracy and unknown-pattern detection rate, especially under class imbalance, when tested on real DNS traffic across diverse network scenarios?
The structure of this paper aligns with the above research questions, with each section addressing specific objectives to support the conclusions: (1) Section 1 (Introduction): Elaborates on the background and challenges of DNS tunneling, and formalizes the task as the Emerging Pattern Classification (EPC) problem. (2) Section 2 (Related Work): Reviews existing research on open-set classification and DNS tunneling detection, analyzes their limitations, and justifies the innovation of EPC and MNForest. (3) Section 3 (Our Model): Focuses on Research Question 2 and details the design of MNForest. (4) Section 4 (Experiments on Public Datasets): Evaluates MNForest under SENC/EPC scenarios on public datasets. (5) Section 5 (DNS Tunneling Detection Experiments): Validates MNForest on real DNS traffic across diverse network scenarios. (6) Section 6 (Conclusions): Summarizes findings and outlines future research directions.

2. Related Work

The work most relevant to this paper is Open Set Recognition (OSR), and many OSR methods have been proposed [2,3]. To facilitate OSR work, Geng et al. [4] defined four sample types: Known Known Classes (KKCs), Known Unknown Classes (KUCs), Unknown Known Classes (UKCs), and Unknown Unknown Classes (UUCs). In DNS tunneling detection scenarios, samples typically lack side information, so we consider only KKC and UUC samples. Among the many OSR works, the approach that follows the Classification Under Streaming Emerging New Classes (SENC) [5] framework is closest to our scenario because it focuses solely on KKCs and UUCs. In the SENC framework, Júnior et al. [6] proposed MINAS-BR, an unsupervised streaming classifier that integrates Binary Relevance transformation with a novelty-detection mechanism to adaptively handle both concept drift and new-class emergence. Kanagaraj et al. [7] proposed MuEMNL and MuEMNLHD, based on clustering techniques, to address situations in which multiple newer class labels exist. In addition, Yu et al. [8] introduced the OSR problem into partial label learning and proposed a framework that integrates novel class detection, effective classification, and efficient model updating. Din et al. [9] introduced a semi-supervised pseudo-label learning architecture that leverages synchronous pseudo-label extraction and a knowledge base, achieving high accuracy on known classes and promptly detecting new ones, with model ensembling further enhancing performance. To reduce the computational complexity of classification, Zhang et al. [10] proposed KNNENS, which can not only efficiently detect new classes but also does not require the ground truth labels of new classes when updating the model. For privacy-sensitive scenarios, Mu et al. [11] proposed FLIK, a one-shot federated learning framework that unifies novel class detection and known class classification using an isolation-based specification kernel, enabling efficient global model aggregation in a single communication round.
Despite these advances, many SENC frameworks rely on predefined thresholds to distinguish between known and novel classes. To overcome this limitation, researchers have integrated Generative Adversarial Networks (GANs) into SENC frameworks [12,13]. Wang et al. [14] proposed EMC-GAN, which uses a generator to model implicit feature distributions and a redesigned discriminator to classify known classes and detect novel ones, significantly improving the recognition accuracy of emerging classes. Cao et al. [15] proposed EEOF, an online ensemble framework that addresses the problems of historical data dependence and performance degradation in class-imbalanced scenarios. This framework enhances class discrimination (especially between emerging and known classes) through an error-feedback balancing mechanism and stabilizes performance using a confidence-triggered fallback strategy. In addition, many clustering-based methods [16,17] have been proposed for unknown-sample detection. However, these methods typically do not involve classification tasks and are therefore not discussed in detail here.
Although the SENC problem appears similar to the EPC problem addressed in this paper, most SENC models cannot achieve satisfactory performance in DNS tunneling detection scenarios. The key difference between SENC and EPC lies in the pattern distribution within classes: in SENC scenarios, samples belonging to the same class are assumed to follow a unified pattern during training, but unknown patterns may emerge for known classes in testing, so models only learn a single rule for each class. In contrast, in EPC scenarios (e.g., DNS tunneling detection), samples from the same class inherently exhibit multiple distinct patterns even in the training set, making it impossible to model them with a single rule. Therefore, this paper uses a clustering-based approach [18,19,20] to adaptively learn patterns in the data.

3. Our Model

To address the DNS tunneling detection problem, which is a typical case of the EPC problem, we propose a forest-based classifier with an unknown pattern identification function, termed MNForest. MNForest can accurately classify KP samples and identify UP samples as new classes. In this section, we detail the tree construction method, prediction mechanism, and computational complexity of MNForest.

3.1. Tree Construction (MNTree)

MNForest consists of multiple completely random trees (MNTrees). Each MNTree is constructed in an unsupervised manner using k-sums-x clustering [21,22,23]. The detailed construction process is shown in Algorithm 1:
  • Initialize a root node containing all training samples X.
  • If the number of samples in the current node is less than a predefined threshold ρ (minimum internal node size), the node becomes a leaf node, and the construction process terminates.
  • Otherwise, use k-sums-x clustering to split the samples in the current node into two clusters $X_1$ and $X_2$. (We set k = 2 in k-sums-x for node splitting to ensure balanced tree growth and reduce computational complexity.)
  • Recursively construct left and right child nodes using X 1 and X 2 , respectively.
  • Return the current node, which stores pointers to the left and right child nodes.
Algorithm 1 MNTree
Require: Dataset $X$, threshold $\rho$
Ensure: Node A of the tree
1: Initialize node A containing $X$
2: if $|X| < \rho$ then
3:     return Node A
4: else
5:     Split $X$ into $X_1$ and $X_2$ using k-sums-x // Only the data $X$ and the number of groups are required by k-sums-x.
6:     $node_1$ ← MNTree($X_1$, $\rho$)
7:     $node_2$ ← MNTree($X_2$, $\rho$)
8:     return Node A with attributes $node_1$, $node_2$
9: end if
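A minimal Python sketch of Algorithm 1, with a plain 2-means split standing in for k-sums-x (which, as noted above, likewise needs only the data and the number of groups); parameter values are illustrative:

```python
import numpy as np

def two_means(X, rng, iters=20):
    """Plain 2-means split (a stand-in for the paper's k-sums-x)."""
    centres = X[rng.choice(len(X), size=2, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centres[None], axis=2)
        labels = d.argmin(axis=1)
        for c in range(2):
            if (labels == c).any():
                centres[c] = X[labels == c].mean(axis=0)
    return labels

class MNTree:
    """Completely random tree built by recursive unsupervised 2-way
    clustering, following the structure of Algorithm 1."""
    def __init__(self, X, idx, rho=20, rng=None):
        rng = rng or np.random.default_rng(0)
        self.idx = idx                 # indices of training samples in node
        self.left = self.right = None
        if len(idx) < rho:
            return                     # leaf: node too small to split
        labels = two_means(X[idx], rng)
        if labels.min() == labels.max():
            return                     # degenerate split -> leaf
        self.left = MNTree(X, idx[labels == 0], rho, rng)
        self.right = MNTree(X, idx[labels == 1], rho, rng)

    def is_leaf(self):
        return self.left is None

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 4)), rng.normal(6, 1, (100, 4))])
tree = MNTree(X, np.arange(len(X)), rho=20, rng=rng)
```

Because the split is fully unsupervised, the leaves approximate patterns rather than classes, exactly as the tree construction above intends.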

3.2. Prediction Mechanism

3.2.1. Medium Neighbor

For a test sample $\hat{x}_j$ falling into leaf node A of a tree, let $x_o$ denote the training sample in leaf node A that is nearest to $\hat{x}_j$ ($x_o$ is called the medium of $\hat{x}_j$). Let $r$ denote the distance between $x_o$ and $\hat{x}_j$. Then, the medium neighbors of $\hat{x}_j$ are defined as the samples in leaf node A whose distance to $x_o$ is less than $r$. As shown in Figure 3, the black circle denotes the range of medium neighbors, and the samples within this range are the medium neighbors of $\hat{x}_j$.
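The definition above translates directly into code; the leaf samples and the Euclidean metric below are illustrative:

```python
import numpy as np

def medium_neighbors(x_test: np.ndarray, leaf: np.ndarray) -> np.ndarray:
    """Return indices (into `leaf`) of the medium neighbors of x_test.

    The medium x_o is the leaf sample nearest to x_test; the medium
    neighbors are all leaf samples strictly closer to x_o than x_test is.
    """
    d_test = np.linalg.norm(leaf - x_test, axis=1)
    o = int(np.argmin(d_test))             # index of the medium x_o
    r = d_test[o]                          # distance from x_test to x_o
    d_to_medium = np.linalg.norm(leaf - leaf[o], axis=1)
    return np.nonzero(d_to_medium < r)[0]  # x_o itself is included (distance 0)

leaf = np.array([[0.0, 0.0], [0.2, 0.0], [0.5, 0.0], [3.0, 0.0]])
neighbors = medium_neighbors(np.array([1.0, 0.0]), leaf)
```

Here the medium is the point at (0.5, 0), with $r = 0.5$; only the points within that radius of the medium count as medium neighbors.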

3.2.2. Probability Calculation of UP Samples

For each tree in the forest, after determining the leaf node A that $\hat{x}_j$ falls into, we calculate the probability $\hat{p}_t(\hat{x}_j)$ that $\hat{x}_j$ is a UP sample using the number of its medium neighbors. This idea comes from density-based clustering algorithms [24,25]. To begin with, we define two quantities, $N_a$ and $N_b$, as follows:
$$N_a = \sum_{x_i \in X_{leaf},\, y_i = y_o} \chi\big(x_i \in Med(\hat{x}_j)\big), \qquad N_b = \sum_{x_i \in X_{leaf},\, y_i \neq y_o} \chi\big(x_i \in Med(\hat{x}_j)\big)$$
$N_a$ represents the number of medium neighbors with the same label as $x_o$ in the leaf node. The larger this value, the greater the distance between the test sample and the medium $x_o$, and therefore the weaker the leaf node's ability to classify the test sample, which increases the probability of the sample being inferred as a UP sample.
$N_b$ represents the number of medium neighbors whose labels differ from that of $x_o$. The larger this value, the less compact the leaf node, and the lower the reliability of the leaf node's label inference for the test sample, which likewise increases the probability of the sample being inferred as a UP sample.
Thus, we have the following:
$$\hat{p}_t(\hat{x}_j) = 1 - \left(1 - \frac{N_a}{|X_{leaf}|}\right)\left(1 - \frac{N_b}{|Med(\hat{x}_j)|}\right)$$
Figure 3 illustrates two cases for calculating $\hat{p}_t(\hat{x}_j)$. In this figure, the red point $\hat{x}_j$ denotes the test sample, and the blue point $x_o$ is the medium of $\hat{x}_j$. Yellow points denote training samples with a different label from $x_o$, and green points denote training samples with the same label as $x_o$. According to Equation (2), we have
$$\hat{p}_t(\hat{x}_j) = \frac{23}{56} \ (\text{case 1}), \qquad \hat{p}_t(\hat{x}_j) = \frac{19}{28} \ (\text{case 2}).$$
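Equation (2) can be checked numerically. The counts below are hypothetical but chosen to reproduce the two fractions above; exact arithmetic via `fractions.Fraction` avoids rounding:

```python
from fractions import Fraction

def up_probability(n_a: int, n_b: int, leaf_size: int, med_size: int) -> Fraction:
    """Per-tree probability that a test sample is a UP sample (Eq. 2)."""
    return 1 - (1 - Fraction(n_a, leaf_size)) * (1 - Fraction(n_b, med_size))

# Hypothetical counts: a leaf with 14 training samples where the test
# sample has 4 medium neighbors, 3 sharing the medium's label (N_a = 3)
# and 1 not (N_b = 1).
p_case1 = up_probability(n_a=3, n_b=1, leaf_size=14, med_size=4)
# A looser case: 10 medium neighbors, half with a different label.
p_case2 = up_probability(n_a=5, n_b=5, leaf_size=14, med_size=10)
```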

3.2.3. Class Probability Calculation for KP Samples

In the $t$-th tree, if $\hat{x}_j$ is determined to be a KP sample (based on $\hat{p}_t(\hat{x}_j)$), the probability $\hat{p}_t^k(\hat{x}_j)$ that it belongs to the $k$-th class is calculated as the ratio of the number of training samples of the $k$-th class in leaf node A to the total number of training samples in leaf node A.
If the distance from the test sample $\hat{x}_j$ to $x_o$ is large, the medium neighbor set $Med(\hat{x}_j)$ in the leaf node will be large, which makes the utility term $\hat{u}_t^k(\hat{x}_j)$ smaller. Therefore, our formula (4) is implicitly related to the distance between $\hat{x}_j$ and $x_o$ (the training sample in the leaf node nearest to $\hat{x}_j$).
$$\hat{u}_t^k(\hat{x}_j) = \frac{\sum_{x_i \in X_{leaf}} \chi(y_i = k)}{|X_{leaf}|} \left(1 - \frac{|Med(\hat{x}_j)|}{|X_{leaf}|}\right)$$
$$\hat{p}_t^k(\hat{x}_j) = \frac{\hat{u}_t^k(\hat{x}_j)}{\sum_{k'=1}^{C} \hat{u}_t^{k'}(\hat{x}_j)}$$
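A short sketch of the utility and class-probability computations above; the leaf labels and medium-neighbor count are illustrative:

```python
from fractions import Fraction
from collections import Counter

def class_probabilities(leaf_labels, med_size, classes):
    """Per-class probabilities for a KP sample in one tree."""
    n = len(leaf_labels)
    counts = Counter(leaf_labels)
    # Utility: class frequency in the leaf, damped as the medium-neighbor
    # set grows (i.e., as the test sample drifts away from the leaf).
    u = {k: Fraction(counts[k], n) * (1 - Fraction(med_size, n))
         for k in classes}
    total = sum(u.values())
    return {k: u[k] / total for k in classes}

# A leaf with three samples of class 0 and one of class 1, and a
# test sample with two medium neighbors.
probs = class_probabilities([0, 0, 0, 1], med_size=2, classes=[0, 1])
```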

3.2.4. Integrated Prediction Result

The final probability that $\hat{x}_j$ is a UP sample is the average of $\hat{p}_t(\hat{x}_j)$ across all trees in the forest:
$$\hat{P}(\hat{x}_j) = \frac{1}{T} \sum_{t=1}^{T} \hat{p}_t(\hat{x}_j)$$
where $T$ is the number of trees in MNForest. The final probability that $\hat{x}_j$ belongs to the $k$-th class is the average of $\hat{p}_t^k(\hat{x}_j)$ across all trees:
$$P^k(\hat{x}_j) = \frac{1}{T} \sum_{t=1}^{T} \hat{p}_t^k(\hat{x}_j)$$
By comparing $\hat{P}(\hat{x}_j)$ with a predefined threshold $\theta$, we obtain the final classification result:
$$\hat{y}_j = \begin{cases} \arg\max_{k=1,\ldots,C} P^k(\hat{x}_j), & \hat{P}(\hat{x}_j) < \theta \\ \text{unknown}, & \hat{P}(\hat{x}_j) \geq \theta \end{cases}$$
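The forest-level aggregation and thresholding can be sketched as follows (the per-tree probabilities fed in are illustrative):

```python
import numpy as np

def forest_predict(up_probs, class_probs, theta=0.5):
    """Combine per-tree outputs into a final label.

    up_probs:    length-T sequence of per-tree UP probabilities
    class_probs: T x C array of per-tree class probabilities
    Returns a class index, or "unknown" for a UP sample.
    """
    P_up = float(np.mean(up_probs))         # average UP probability (Eq. 5)
    P_class = np.mean(class_probs, axis=0)  # average class probabilities (Eq. 6)
    if P_up >= theta:                       # threshold rule (Eq. 7)
        return "unknown"
    return int(np.argmax(P_class))

# Three trees, two classes: a confident KP case.
label = forest_predict([0.2, 0.3, 0.1],
                       [[0.7, 0.3], [0.6, 0.4], [0.8, 0.2]])
```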

3.3. Prediction and Unknown Pattern Identification

For a test sample $\hat{x}_j$, the prediction process involves traversing each tree in MNForest to determine the leaf node it falls into, calculating the probability that it is a UP sample, and then integrating the results from all trees to obtain the final classification label. The detailed process is shown in Algorithm 2.
Algorithm 2 Predict
Require: Model MNForest, thresholds $\rho$ and $\theta$, test data $\hat{X} \in \mathbb{R}^{m \times d}$
Ensure: Updated MNForest, predicted labels $\hat{y}$
1: for $j = 1, \ldots, m$ do
2:     for $t = 1, \ldots, T$ do
3:         Find leaf node A for $\hat{x}_j$ in the $t$-th tree by a standard top-down traversal strategy
4:         Compute $\hat{p}_t(\hat{x}_j)$ via Equation (2)
5:         Compute $\hat{p}_t^k(\hat{x}_j)$ via Equation (4)
6:     end for
7:     Find label $\hat{y}_j$ via Equation (7)
8: end for

3.4. Time and Space Complexity

Training Phase: Given a training dataset $X$ with $n$ samples and $d$ features, each k-sums-x clustering step requires $O(tnd)$ time, where $t$ is the average number of iterations until convergence. For each tree, if the samples are split into approximately balanced clusters at each node, the number of split levels is $O(\log n)$; in the worst case (e.g., one cluster contains only one sample and the other contains the remaining samples), it is $O(n)$. Thus, the best-case time complexity for training MNForest (with $T$ trees) is $O(Ttdn\log n)$, and the worst-case time complexity is $O(Ttdn^2)$.
Testing Phase: For a single test sample, finding the corresponding leaf node in one tree requires $O(d\log n)$ time in the best case and $O(dn)$ in the worst case. Calculating the probability of the sample being a UP sample and its class probabilities requires $O(\rho d)$ time per tree, where $\rho$ is the minimum internal node size. Thus, the best-case time complexity for testing a single sample is $O(Td(\log n + \rho))$, and the worst-case time complexity is $O(Td(n + \rho))$.
Space Complexity: The space required by MNForest includes storing the training dataset $X$, the indices of training samples in each leaf node, and the centroids of the child clusters in each internal node. Thus, the space complexity is $O(nd + Tld + Tn)$, where $l$ is the average number of nodes per tree.
Overall, MNForest exhibits near-linear time complexity with respect to the number of samples in practical scenarios (where k-sums-x converges quickly and clusters are approximately balanced), making it suitable for large-scale DNS traffic detection tasks. Additionally, its memory-efficient design ensures that it can be deployed on resource-constrained network devices.

4. Experiments on Public Datasets

To verify the performance of MNForest in solving the EPC problem, we conduct experiments on four public datasets. We compare MNForest with state-of-the-art open-set classification models and evaluate their performance in both SENC and EPC scenarios. This section details the datasets, baselines, experimental configurations, and results.

4.1. Datasets

We use four widely used public datasets covering handwritten digit recognition and face recognition tasks. For each dataset, we design multiple experimental cases by dividing the known and unknown classes to fully simulate the characteristics of the SENC and EPC scenarios. All data used in this experiment are annotated with real labels, and all samples under the same label follow the same pattern. For the sake of convenience, the real classes are referred to as patterns in this study. The detailed dataset information and case settings are as follows:
  • USPS: A handwritten digit dataset containing 9298 samples of digits 0–9. It is employed and divided into three experimental cases, as detailed below:
    The first case focuses on the SENC scenario. Specifically, the known classes are composed of samples from patterns 1, 3, 4, 5, 6, 8, while the unknown classes are composed of samples from patterns 0, 2, 7, 9.
    The second case focuses on the balanced EPC scenario. In this case, the dataset is categorized into two classes: the first class is formed by samples from patterns 1, 3, 4, and the second class is constituted by samples from patterns 5, 6, 8. The unknown classes still contain samples from the patterns 0, 2, 7, 9. Notably, the number of samples contributed by each pattern is approximately equal.
    The third case focuses on the imbalanced EPC scenario. It shares similarities with the second case in terms of the two-class division and the corresponding patterns involved in each class. The key difference lies in the sample size distribution: the number of samples from patterns 1, 5 is about 10 times that of samples from patterns 3, 4, 6, 8.
  • Digits: A handwritten digit dataset containing 1797 samples of digits 0–9. Consistent with the USPS dataset, it is divided into three experimental cases to simulate SENC and EPC scenarios, as follows:
    The first case focuses on the SENC scenario. Specifically, the known classes are composed of samples from patterns 2, 4, 6, 8, 0, while the unknown classes are composed of samples from patterns 1, 3, 5, 7, 9.
    The second case focuses on the balanced EPC scenario. The dataset is categorized into two classes here: the first class consists of samples from patterns 1, 2, 3, 4, and the second class is formed by samples from patterns 5, 6, 7, 8. The unknown classes remain samples from patterns 0, 9, with the number of samples contributed by each pattern being approximately equal.
    The third case focuses on the imbalanced EPC scenario. It has the same two-class division and pattern composition as the second case, but differs in sample size distribution: the number of samples from patterns 1, 5 is about 10 times that of samples from patterns 2, 3, 4, 6, 7, 8.
  • JAFFE: A face emotion dataset containing 213 images of 10 Japanese female subjects, with 7 emotion categories (happy, sad, angry, fear, surprise, disgust, neutral), which are regarded as patterns. Three experimental cases are designed for this dataset, detailed as follows:
    The first case focuses on the SENC scenario. Specifically, the known classes are composed of samples from patterns happy, sad, angry, surprise, neutral, while the unknown classes are composed of samples from patterns fear, disgust.
    The second case focuses on the balanced EPC scenario. The dataset is divided into two classes: the first class consists of samples from the patterns happy, sad, angry, and the second class consists of samples from the patterns surprise, neutral. The unknown classes are samples from patterns fear, disgust, with each pattern contributing approximately the same number of samples.
    The third case focuses on the imbalanced EPC scenario. It adopts the same two-class division and pattern composition as the second case, but with an unbalanced sample size distribution: the number of samples from patterns happy, surprise is about 10 times that of samples from patterns sad, angry, neutral.
  • Pix10P: A face recognition dataset containing 1000 images of 10 subjects (100 images per subject), where each subject corresponds to a unique pattern. Three experimental cases are designed for this dataset, as detailed below:
    The first case focuses on the SENC scenario. Specifically, the known classes are composed of samples from patterns Subject 1, Subject 2, Subject 3, Subject 4, Subject 5, Subject 6, Subject 7, Subject 8, while the unknown classes are composed of samples from patterns Subject 9, Subject 10.
    The second case focuses on the balanced EPC scenario. The dataset is categorized into two classes: the first class consists of samples from patterns Subject 1, Subject 2, Subject 3, Subject 4, and the second class is formed by samples from patterns Subject 5, Subject 6, Subject 7, Subject 8. The unknown classes are samples from patterns Subject 9, Subject 10, with each pattern contributing roughly the same number of samples.
    The third case focuses on the imbalanced EPC scenario. It shares the same two-class division and pattern composition as the second case, but with an unbalanced sample size distribution: the number of samples from patterns Subject 1, Subject 5 is about 10 times that of samples from patterns Subject 2, Subject 3, Subject 4, Subject 6, Subject 7, Subject 8.

4.2. Baselines

We compare MNForest with five state-of-the-art open-set classification models that support unknown sample identification, with their parameter configurations detailed as follows:
  • SEEN [26]: A semi-supervised streaming learning model that can handle emerging new labels. The trade-off parameter γ is set to 0.1.
  • EVM [27]: An extreme value machine-based open-set classification model that models the distribution of known classes using extreme value theory. The number of samples used to construct extrema is 50, the number of extreme vectors is 4, and the margin scaling multiplier is 0.5.
  • SENCForest [28]: A forest-based model designed for classification under streaming emerging new classes. The number of trees T is 100, and the minimum internal node size ρ is 20 (a hyperparameter; in our experiments the algorithm usually performs well at ρ = 20, so we chose this value). For fairness, SENCForest and our proposed algorithm share the same values for their common parameters.
  • ASG [29]: An open-category classification model that generates adversarial samples to expand the decision boundary of known classes. The number of generated adversarial samples s 1 is 10, and the number of original samples used for generating adversarial samples s 2 is 50.
  • KNNENS [10]: A K-nearest neighbor ensemble model that can identify emerging classes without ground-truth labels. The number of nearest neighbors k is 10, and the number of ensemble members is 50.
  • MNForest (Ours): The number of trees T is 100, the minimum internal node size ρ is 20, and the threshold θ for unknown sample identification is 0.5. Only T (shared with SENCForest) and θ were set empirically; the remaining hyperparameters of all methods were optimized via grid search so that each model performs at its best.
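The grid search used for hyperparameter tuning can be sketched as follows. This is a minimal illustration, not the paper's tuning code: `train_eval` is a hypothetical callback that trains a model with the given parameters and returns a validation score, and the parameter names in the usage example merely mirror the forest hyperparameters above.

```python
from itertools import product

def grid_search(train_eval, grid):
    """Exhaustively try every combination in `grid` and keep the one
    with the best validation score (higher is better)."""
    best_params, best_score = None, float("-inf")
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        score = train_eval(params)  # train with `params`, score on validation data
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

For example, tuning a forest over T ∈ {50, 100, 200} and ρ ∈ {10, 20, 40} would call `grid_search` with `grid = {"T": [50, 100, 200], "rho": [10, 20, 40]}` and a callback that fits on the training split and scores on a held-out split.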
We split the KP samples of each dataset and case into a training set (80%) and a test set (20%), and put all UP samples into the test set. UP samples are those drawn from the patterns designated as unknown; they never appear in the training set and must be identified at test time.
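The split described above can be sketched in a few lines of NumPy. The function name `make_split` and the label −1 for UP samples are our own conventions for illustration, not from the paper.

```python
import numpy as np

def make_split(kp_X, kp_y, up_X, train_frac=0.8, seed=0):
    """80/20 split of KP samples; the test set additionally receives
    every UP sample, labelled -1 (our convention for "unknown")."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(kp_X))
    n_train = int(train_frac * len(kp_X))
    X_train = kp_X[idx[:n_train]]
    y_train = kp_y[idx[:n_train]]
    # Test set: the held-out 20% of KP samples plus all UP samples.
    X_test = np.vstack([kp_X[idx[n_train:]], up_X])
    y_test = np.concatenate([kp_y[idx[n_train:]], np.full(len(up_X), -1)])
    return X_train, y_train, X_test, y_test
```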
Note that traditional DNS tunneling detection models typically operate in a closed-set manner and cannot handle unknown pattern samples, so they are not included in the baselines. All experiments are conducted on a Dell Precision 3660 Tower Workstation (Dell Inc., Round Rock, TX, USA) with an Intel Core i5-13600 CPU and 64 GB RAM. The software environment includes Python 3.8 and Numpy 1.17.

4.3. Metrics

We use four key evaluation metrics to comprehensively assess model performance: two targeted metrics (one for KP classification, one for UP identification) and two overall F1 metrics computed across all classes. The metrics are as follows:
  • Accuracy (ACC): Accuracy of classifying KP samples into correct known classes, reflecting the known pattern classification capability.
  • UP-Detection Rate (UDR): Ratio of correctly identified UP samples to total UP samples, reflecting the unknown pattern identification capability.
  • Micro-F1 (MIF): Harmonic mean of overall precision and recall, emphasizing performance on majority classes.
  • Macro-F1 (MAF): Average F1-score of each class (including unknown class), emphasizing performance on minority classes.
All metrics range from 0 to 1, with higher values indicating better performance.
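These four metrics can be computed as in the sketch below. Two assumptions are ours, not the paper's: UP samples carry the label −1, and micro-F1 is computed as overall accuracy, which it reduces to in single-label multiclass evaluation (every misclassified sample contributes exactly one false positive and one false negative).

```python
import numpy as np

def evaluate(y_true, y_pred, unknown_label=-1):
    """ACC over KP samples, UDR over UP samples, and micro-/macro-F1
    over all classes (the unknown class counts as one class)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    kp = y_true != unknown_label          # known-pattern samples
    acc = float((y_pred[kp] == y_true[kp]).mean())
    udr = float((y_pred[~kp] == unknown_label).mean())
    f1s = []
    for c in np.unique(np.concatenate([y_true, y_pred])):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0)
    maf = float(np.mean(f1s))             # macro-F1: unweighted mean over classes
    mif = float((y_pred == y_true).mean())  # micro-F1 = overall accuracy here
    return acc, udr, mif, maf
```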

4.4. Experimental Results

Table 1, Table 2, Table 3 and Table 4 show the performance of all models on the four datasets in the SENC and EPC scenarios. The key observations are as follows:
  • MNForest outperforms baselines in most cases: In the EPC scenario, MNForest achieves the highest micro-F1 and macro-F1 scores on most dataset configurations. For example, on the Pix10P dataset (case 3), MNForest achieves a micro-F1 score of 0.917, 6.7 percentage points higher than the second-best model (SENCForest, 0.850). On the Digits dataset (case 2), MNForest achieves a micro-F1 score of 0.978, 9.3 percentage points higher than the second-best model (SENCForest, 0.885). Similar advantages are observed on the JAFFE and USPS datasets.
  • Forest-based models perform better than other types of models: MNForest and SENCForest (both forest-based models) generally outperform EVM, ASG and SEEN models. This is because forest-based models can better capture the multi-pattern characteristics within classes by leveraging ensemble learning, which is consistent with the characteristics of the EPC problem.
  • MNForest has better unknown pattern identification capability: Compared with SENCForest, MNForest achieves higher scores on datasets with more complex pattern distributions. This is because MNForest uses the medium neighbor mechanism to accurately identify unknown patterns, while SENCForest relies on predefined distance thresholds, which are less adaptive to complex pattern distributions.
  • Performance on imbalanced datasets: On datasets with significant pattern imbalance, MNForest still maintains high macro-F1 scores, indicating its strong ability to handle class-imbalanced scenarios, which is critical for DNS tunneling detection (where malicious samples are rare).

5. DNS Tunneling Detection Experiments

To verify the practical applicability of MNForest in DNS tunneling detection, we conduct experiments on real-world DNS traffic datasets covering multiple network scenarios. This section details the dataset collection, scenario settings, and results.

5.1. Dataset Collection and Preprocessing

We collect DNS traffic data from four different real-world network scenarios over a period of 30 days:
  • Enterprise Network: DNS traffic from the internal network of a medium-sized enterprise (500+ employees), including legitimate traffic (e.g., web browsing, email, enterprise application access) and simulated DNS tunneling traffic (generated by tools such as DNScat2, iodine, and DNSExfiltrator). The total number of DNS queries in the dataset is 18,000, of which 17,460 are legitimate samples, accounting for 97% of the total dataset, and 540 are malicious DNS tunneling samples, accounting for 3%.
  • Campus Network: DNS traffic from a university campus network (10,000+ users), including traffic from student dormitories, teaching buildings, and research labs. The dataset includes legitimate traffic (e.g., video streaming, online learning, social media) and real malicious DNS tunneling traffic captured by the campus network security system. The total number of DNS queries is 105,000, with 101,325 legitimate samples (96.5% of the total) and 3675 malicious DNS tunneling samples (3.5%).
  • Public Wi-Fi Network: DNS traffic from a public Wi-Fi network in a shopping mall (5000+ daily users), including legitimate traffic (e.g., mobile app usage, web browsing) and DNS tunneling traffic used to bypass captive portals. The dataset contains 63,000 DNS queries, with 61,236 legitimate samples (97.2% of the total) and 1764 malicious DNS tunneling samples (2.8%).
  • Data Center Network: DNS traffic from a cloud data center (hosting 200+ enterprise applications), including legitimate traffic (e.g., application service discovery, database access) and advanced persistent threat (APT) DNS tunneling traffic (simulated using custom encoding methods). The total number of DNS queries is 12,000, with 11,760 legitimate samples (98% of the total) and 240 malicious APT DNS tunneling samples (2%).
Each collected query is stored as a string of 128 fields, most of which are not useful for identifying DNS tunneling; only 10 fields are used, as detailed in Table 5. Once the queries are obtained, three operators are applied to extract features:
  • Data Cleaning: Remove duplicate DNS queries, invalid queries (e.g., empty domain names), and queries with incomplete information (e.g., missing timestamp or IP address).
  • Group: Queries with the same primary domain name, h-time, and sip are assigned to the same group, where the primary domain name is obtained by tldextract (https://github.com/john-kurkowski/tldextract (accessed on 25 September 2025)), h-time is the hour extracted from the timestamp, and sip is obtained by

    $$ \mathrm{sip} = \begin{cases} \text{destination ip}, & \text{if source port} = 53 \\ \text{source ip}, & \text{otherwise} \end{cases} $$

    Port 53 is mainly used for domain name resolution, so the endpoint using it is the server side of the DNS exchange.
  • Feature: Let q = [ q 1 , … , q n ] denote a group, where n is the number of queries and q i is the i-th query in the group; 41 customised statistics are extracted from each group. These features are described in Table 6.
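The grouping and feature operators above can be illustrated with a small sketch. The dict-style field names (`domain`, `timestamp`, `source_port`, `source_ip`, `destination_ip`) are illustrative assumptions, and `primary_domain` is a naive stand-in for tldextract's registered-domain extraction, which the paper actually uses.

```python
import numpy as np

def sip(q):
    """Server-side IP of the flow: the DNS server end uses port 53."""
    return q["destination_ip"] if q["source_port"] == 53 else q["source_ip"]

def primary_domain(name):
    # Naive two-label stand-in; tldextract's registered_domain also
    # handles multi-part suffixes such as .co.uk correctly.
    return ".".join(name.rstrip(".").split(".")[-2:])

def group_key(q):
    """Queries sharing (primary domain, h-time, sip) form one group."""
    h_time = q["timestamp"] // 3600  # hour extracted from the timestamp
    return (primary_domain(q["domain"]), h_time, sip(q))

def f(a):
    """The five statistics behind each multivalued feature f(a)."""
    a = np.asarray(a, dtype=float)
    return [a.min(), a.max(), a.mean(), np.median(a), a.std()]
```

A group's 41 features are then assembled by applying f to per-query quantities such as request length, reply length, and DNS TTL (features 3–27 and 31–40 in Table 6).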

5.2. Scenario-Specific Performance Analysis

To comprehensively evaluate the performance of MNForest in practical DNS tunneling detection scenarios, we adopt three core evaluation metrics for binary classification tasks: Precision (PRE), Recall (REC), and F1-score (F1). The detailed experimental results of MNForest and baselines in four real-world scenarios are shown in Table 7. Scenario-specific performance analysis is as follows:
  • Enterprise Network: The enterprise network has strict access control policies, so the number of unknown-pattern, legitimate queries is relatively small, which helps improve detection accuracy. Specifically, the Precision (0.954) and Recall (0.937) of the proposed algorithm are both the highest among all models. MNForest’s F1-score (0.945) is 3 percentage points higher than that of the second-best baseline (SENCForest, 0.915), indicating that MNForest has a strong detection capability for DNS tunneling traffic in enterprise networks.
  • Campus Network: The campus network has a large number of users and diverse traffic types, resulting in more unknown-pattern legitimate queries. MNForest continues to maintain excellent performance in the binary classification task, with the highest Recall (0.932) and F1-score (0.928). Specifically, its Recall is 6.0 percentage points higher than that of the second-best baseline (SENCForest, 0.872), and its F1-score is 4.4 percentage points higher than SENCForest’s (0.884).
  • Public Wi-Fi Network: DNS tunneling traffic in this scenario is mainly used to bypass captive portals, which have unique encoding patterns. MNForest can effectively identify these patterns in the binary classification task, achieving the highest Precision (0.957) and Recall (0.924) among all models. The high precision indicates that MNForest has a strong ability to identify the unique patterns of portal-bypassing DNS tunneling traffic, reducing the misjudgment of legitimate Wi-Fi access traffic in binary classification.
  • Data Center Network: MNForest’s medium neighbor mechanism can effectively capture the subtle differences between APT traffic and legitimate traffic in the binary classification task, achieving the highest Recall (0.918) and F1-score (0.915). This shows that MNForest has significant advantages in detecting low-volume, custom-encoded APT DNS tunneling traffic in binary classification, which is difficult for traditional classifiers to capture.

6. Conclusions

DNS tunneling poses persistent threats to dynamic networks with evolving attack patterns, where traditional closed-set models fail to handle multi-pattern classes and unknown samples. This paper makes three contributions. (1) The DNS tunneling detection task with unknown pattern recognition is formalized as the Emerging Pattern Classification (EPC) problem, in which every sample is either a Known Pattern (KP) or an Unknown Pattern (UP) sample. (2) MNForest, an unsupervised forest-based classifier, is proposed; it is constructed via k-sums-x clustering and leverages a “medium neighbor” mechanism to handle multi-pattern classes and identify unknown samples. (3) Extensive experiments on public and real-world datasets confirm that MNForest outperforms state-of-the-art baselines in KP classification accuracy, UP detection rate, and robustness to class imbalance, validated across enterprise, campus, public Wi-Fi, and data center networks. The MNForest classifier and its medium-neighbor mechanism, which together address the multi-pattern and unknown-sample challenges, constitute the paper’s main contribution; the central technical difficulty was designing the probability formula that decides whether a sample is a UP sample.

Author Contributions

Conceptualization, S.P. and H.D.; methodology, S.P.; software, H.D.; formal analysis, S.P.; resources, Z.Z.; data curation, H.D.; writing—original draft preparation, H.D.; writing—review and editing, S.P.; supervision, Z.Z.; project administration, Z.Z.; funding acquisition, S.P. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 62406274, in part by the Natural Science Foundation of Zhejiang Province, China, under Grant LQN25F020029, and in part by the National Innovation and Entrepreneurship Training Program under Grant 202513021008.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available from the corresponding author upon request due to the privacy and security constraints associated with real-world network traffic data.

Conflicts of Interest

The authors declare no conflicts of interest.

Correction Statement

This article has been republished with a minor correction to the Funding statement. This change does not affect the scientific content of the article.

References

  1. Kuang, S.; Wang, W.; Ye, Y.; Peng, L. HHBT: DNS tunneling Detection via Hybrid Hierarchical Bidirectional Transformer. Comput. Netw. 2025, 275, 111919. [Google Scholar] [CrossRef]
  2. Wang, Z.; Xu, Q.; Yang, Z.; He, Y.; Cao, X.; Huang, Q. Openauc: Towards auc-oriented open-set recognition. Adv. Neural Inf. Process. Syst. 2022, 35, 25033–25045. [Google Scholar]
  3. Cevikalp, H.; Uzun, B.; Salk, Y.; Saribas, H.; Köpüklü, O. From anomaly detection to open set recognition: Bridging the gap. Pattern Recognit. 2023, 138, 109385. [Google Scholar] [CrossRef]
  4. Geng, C.; Huang, S.; Chen, S. Recent Advances in Open Set Recognition: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 3614–3631. [Google Scholar] [CrossRef]
  5. Zhou, Z.H. Open-environment machine learning. Natl. Sci. Rev. 2022, 9, nwac123. [Google Scholar] [CrossRef]
  6. Júnior, J.D.C.; Faria, E.R.; Silva, J.A.; Gama, J.; Cerri, R. Novelty detection for multi-label stream classification under extreme verification latency. Appl. Soft Comput. 2023, 141, 110265. [Google Scholar] [CrossRef]
  7. Kanagaraj, K.Y.S.S.; Nallappan, M. Methods for Predicting the Rise of the New Labels from a High-Dimensional Data Stream. Int. J. Intell. Eng. Syst. 2023, 16, 339–349. [Google Scholar] [CrossRef]
  8. Yu, X.R.; Wang, D.B.; Zhang, M.L. Partial label learning with emerging new labels. Mach. Learn. 2024, 113, 1549–1565. [Google Scholar] [CrossRef]
  9. Din, S.U.; Yang, Q.; Shao, J.; Mawuli, C.B.; Ullah, A.; Ali, W. Synchronization-based semi-supervised data streams classification with label evolution and extreme verification delay. Inf. Sci. 2024, 678, 120933. [Google Scholar] [CrossRef]
  10. Zhang, J.; Wang, T.; Ng, W.W.; Pedrycz, W. KNNENS: A k-nearest neighbor ensemble-based method for incremental learning under data stream with emerging new classes. IEEE Trans. Neural Netw. Learn. Syst. 2022, 34, 9520–9527. [Google Scholar] [CrossRef]
  11. Mu, X.; Zeng, G.; Huang, Z. Federated Learning with Emerging New Class: A Solution Using Isolation-Based Specification. In Proceedings of the International Conference on Database Systems for Advanced Applications, Tianjin, China, 17–20 April 2023; Springer: Cham, Switzerland, 2023; pp. 719–734. [Google Scholar]
  12. Wang, Y.; Vinogradov, A. Multi-classification generative adversarial network for streaming data with emerging new classes: Method and its application to condition monitoring. TechRxiv 2022. [Google Scholar] [CrossRef]
  13. Wang, Y.; Wang, Q.; Vinogradov, A. Ensembled Multi-classification Generative Adversarial Network for Condition Monitoring in Streaming Data with Emerging New Classes. In Olympiad in Engineering Science; Springer: Berlin/Heidelberg, Germany, 2023; pp. 45–57. [Google Scholar]
  14. Wang, Y.; Wang, Q.; Vinogradov, A. Semi-supervised deep architecture for classification in streaming data with emerging new classes: Application in condition monitoring. TechRxiv 2023. [Google Scholar] [CrossRef]
  15. Cao, Z.; Zhang, S.; Lin, C.T. Online Ensemble of Ensemble OVA Framework for Class Evolution with Dominant Emerging Classes. In Proceedings of the 2023 IEEE International Conference on Data Mining (ICDM), Shanghai, China, 1–4 December 2023; IEEE: New York, NY, USA, 2023; pp. 968–973. [Google Scholar]
  16. Henrydoss, J.; Cruz, S.; Li, C.; Günther, M.; Boult, T.E. Enhancing Open-Set Recognition Using Clustering-Based Extreme Value Machine (C-EVM). In Proceedings of the International Conference on Big Data (BigData), Honolulu, HI, USA, 18–20 September 2020; IEEE: New York, NY, USA, 2020. [Google Scholar]
  17. Fontanel, D.; Cermelli, F.; Mancini, M.; Buló, S.R.; Ricci, E.; Caputo, B. Boosting Deep Open World Recognition by Clustering. IEEE Robot. Autom. Lett. 2020, 5, 5985–5992. [Google Scholar] [CrossRef]
  18. Pei, S.; Sun, Y.; Nie, F.; Jiang, X.; Zheng, Z. Adaptive Graph K-Means. Pattern Recognit. 2025, 161, 111226. [Google Scholar] [CrossRef]
  19. Pei, S.; Sun, Y.; Lin, Z.; Nie, F.; Lu, J.; Jiang, X.; Zhang, C.; Zheng, Z. Concave Cut: Analyzing the Role of Concave Functions in Clustering. Pattern Recognit. 2025, 174, 112950. [Google Scholar] [CrossRef]
  20. Pei, S.; Nie, F.; Wang, R.; Li, X. A rank-constrained clustering algorithm with adaptive embedding. In Proceedings of the ICASSP 2021—2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Virtual, 6–11 June 2021; IEEE: New York, NY, USA, 2021; pp. 2845–2849. [Google Scholar]
  21. Pei, S.; Chen, H.; Nie, F.; Wang, R.; Li, X. Centerless clustering. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 167–181. [Google Scholar] [CrossRef]
  22. Pei, S.; Nie, F.; Wang, R.; Li, X. Efficient clustering based on a unified view of k-means and ratio-cut. Adv. Neural Inf. Process. Syst. 2020, 33, 14855–14866. [Google Scholar]
  23. Wang, H.; Du, F.; Pei, S.; Zheng, Z. Semi-supervised learning with centerless clustering. In Proceedings of the 2025 IEEE Cyber Science and Technology Congress (CyberSciTech), Hakodate, Japan, 21–24 October 2025; pp. 85–90. [Google Scholar]
  24. Pei, S.; Zhang, Y.; Wang, R.; Nie, F. A portable clustering algorithm based on compact neighbors for face tagging. Neural Netw. 2022, 154, 508–520. [Google Scholar] [CrossRef] [PubMed]
  25. Pei, S.; Nie, F.; Wang, R.; Li, X. An efficient density-based clustering algorithm for face groping. Neurocomputing 2021, 462, 331–343. [Google Scholar] [CrossRef]
  26. Zhu, Y.N.; Li, Y.F. Semi-Supervised Streaming Learning with Emerging New Labels. Proc. AAAI Conf. Artif. Intell. 2020, 34, 7015–7022. [Google Scholar] [CrossRef]
  27. Rudd, E.M.; Jain, L.P.; Scheirer, W.J.; Boult, T.E. The Extreme Value Machine. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 762–768. [Google Scholar] [CrossRef] [PubMed]
  28. Mu, X.; Ting, K.M.; Zhou, Z.H. Classification Under Streaming Emerging New Classes: A Solution Using Completely-Random Trees. IEEE Trans. Knowl. Data Eng. 2017, 29, 1605–1618. [Google Scholar] [CrossRef]
  29. Yu, Y.; Qu, W.Y.; Li, N.; Guo, Z. Open-Category Classification by Adversarial Sample Generation. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, IJCAI’17, Melbourne, Australia, 19–25 August 2017; AAAI Press: Washington, DC, USA, 2017; pp. 3357–3363. [Google Scholar]
Figure 1. The schematic diagram of the DNS tunneling (created in Inkscape 1.4).
Figure 2. Workflow of our model (created in Inkscape 1.4).
Figure 3. Two cases for calculating p ^ t ( x ^ j ) (created in Inkscape 1.4).
Table 1. Experimental results of USPS dataset.

Case                Metric  SEEN   EVM    SENCForest  ASG    KNNENS  MNForest
USPS Case 1 (SENC)  ACC     0.862  0.891  0.913       0.902  0.895   0.928
                    UDR     0.783  0.816  0.839       0.828  0.822   0.857
                    MIF     0.856  0.882  0.901       0.893  0.887   0.915
                    MAF     0.829  0.855  0.874       0.866  0.860   0.888
USPS Case 2 (EPC)   ACC     0.825  0.860  0.892       0.878  0.866   0.924
                    UDR     0.771  0.805  0.832       0.821  0.815   0.853
                    MIF     0.823  0.857  0.889       0.875  0.863   0.921
                    MAF     0.795  0.831  0.862       0.848  0.835   0.903
USPS Case 3 (EPC)   ACC     0.789  0.815  0.848       0.840  0.832   0.880
                    UDR     0.721  0.756  0.789       0.778  0.769   0.825
                    MIF     0.786  0.812  0.845       0.837  0.829   0.876
                    MAF     0.652  0.698  0.734       0.716  0.701   0.793
Table 2. Experimental results of digits dataset.

Case                  Metric  SEEN   EVM    SENCForest  ASG    KNNENS  MNForest
Digits Case 1 (SENC)  ACC     0.845  0.878  0.905       0.896  0.889   0.923
                      UDR     0.768  0.802  0.827       0.819  0.811   0.849
                      MIF     0.839  0.871  0.898       0.889  0.882   0.917
                      MAF     0.812  0.846  0.869       0.861  0.853   0.885
Digits Case 2 (EPC)   ACC     0.813  0.856  0.888       0.874  0.862   0.926
                      UDR     0.756  0.798  0.823       0.815  0.807   0.847
                      MIF     0.810  0.852  0.885       0.871  0.859   0.978
                      MAF     0.783  0.825  0.857       0.843  0.831   0.901
Digits Case 3 (EPC)   ACC     0.775  0.808  0.841       0.833  0.824   0.876
                      UDR     0.712  0.748  0.782       0.771  0.763   0.819
                      MIF     0.772  0.805  0.838       0.830  0.821   0.873
                      MAF     0.645  0.691  0.728       0.710  0.695   0.789
Table 3. Experimental results of JAFFE dataset.

Case                 Metric  SEEN   EVM    SENCForest  ASG    KNNENS  MNForest
JAFFE Case 1 (SENC)  ACC     0.832  0.865  0.893       0.884  0.876   0.915
                     UDR     0.753  0.789  0.816       0.808  0.800   0.838
                     MIF     0.826  0.859  0.887       0.878  0.870   0.909
                     MAF     0.798  0.832  0.856       0.848  0.840   0.876
JAFFE Case 2 (EPC)   ACC     0.801  0.842  0.875       0.861  0.850   0.912
                     UDR     0.741  0.783  0.811       0.803  0.794   0.835
                     MIF     0.798  0.838  0.872       0.858  0.847   0.913
                     MAF     0.770  0.812  0.844       0.830  0.819   0.895
JAFFE Case 3 (EPC)   ACC     0.762  0.796  0.830       0.821  0.812   0.868
                     UDR     0.703  0.739  0.773       0.762  0.754   0.807
                     MIF     0.759  0.793  0.827       0.818  0.809   0.865
                     MAF     0.638  0.684  0.721       0.703  0.688   0.776
Table 4. Experimental results of Pix10P dataset.

Case                  Metric  SEEN   EVM    SENCForest  ASG    KNNENS  MNForest
Pix10P Case 1 (SENC)  ACC     0.856  0.889  0.916       0.907  0.899   0.932
                      UDR     0.775  0.811  0.838       0.829  0.821   0.856
                      MIF     0.850  0.883  0.910       0.901  0.893   0.924
                      MAF     0.823  0.857  0.879       0.871  0.863   0.892
Pix10P Case 2 (EPC)   ACC     0.824  0.867  0.899       0.885  0.873   0.928
                      UDR     0.762  0.804  0.831       0.822  0.814   0.852
                      MIF     0.821  0.863  0.896       0.882  0.870   0.922
                      MAF     0.796  0.838  0.868       0.854  0.842   0.904
Pix10P Case 3 (EPC)   ACC     0.786  0.820  0.853       0.845  0.836   0.885
                      UDR     0.725  0.761  0.794       0.783  0.775   0.828
                      MIF     0.783  0.817  0.850       0.842  0.833   0.917
                      MAF     0.658  0.704  0.741       0.723  0.708   0.798
Table 5. Description of query.

NO.  Name            Example
2    source ip       162.105.129.122
3    destination ip  205.251.198.179
4    source port     20,018
11   start time      1,501,605,396
54   domain          pixel.redditmedia.com
55   rr type         0001;0000;0000;0000;0000
58   DNS TTL         300
59   reply ipv4      151.101.73.140
87   request length  37
89   reply length    88
Table 6. Description of features.

NO.    Description
1      The number of queries
2      The number of distinct subdomain names
3–7    f(a), where a_i is the request length of q_i
8–12   f(a), where a_i is the reply length of q_i
13–17  f(a), where a_i is the DNS TTL of q_i
18–22  f(a), where a_i is the number of tags of q_i
23–27  f(a), where a_i is the length of q_i
28     Frequency of rrtypes containing 0001
29     Frequency of valid reply IPv4
30     Frequency of distinct valid reply IPv4
31–35  f(a), where a_i is the number of digits in the domain name of q_i
36–40  f(a), where a_i is the number of characters that are neither digits nor letters in q_i
41     Σ a_i / Σ b_i, where a_i and b_i represent the reply length and request length of q_i
Notes: f(a) is a multivalued function; five statistics of a are computed: minimum, maximum, mean, median, and standard deviation. q_i is the i-th query in the group.
Table 7. Experimental results of real-world DNS traffic datasets.

Scenario      Metric  SEEN   EVM    SENCForest  ASG    KNNENS  MNForest
Enterprise    PRE     0.892  0.913  0.928       0.919  0.915   0.954
              REC     0.867  0.885  0.903       0.895  0.890   0.937
              F1      0.879  0.898  0.915       0.907  0.902   0.945
Campus        PRE     0.851  0.876  0.897       0.886  0.879   0.926
              REC     0.822  0.848  0.872       0.861  0.854   0.932
              F1      0.836  0.861  0.884       0.873  0.866   0.928
Public Wi-Fi  PRE     0.898  0.921  0.935       0.927  0.922   0.957
              REC     0.873  0.895  0.910       0.902  0.896   0.924
              F1      0.885  0.907  0.922       0.914  0.909   0.949
Data Center   PRE     0.836  0.864  0.886       0.875  0.868   0.913
              REC     0.801  0.827  0.852       0.840  0.833   0.918
              F1      0.818  0.845  0.868       0.857  0.850   0.915