Article

An Effective Approach for Stepping-Stone Intrusion Detection Resistant to Intruders’ Chaff-Perturbation via Packet Crossover

1
TSYS School of Computer Science, Columbus State University, Columbus, GA 31907, USA
2
Department of Computer Science, Illinois Institute of Technology, Chicago, IL 60616, USA
*
Author to whom correspondence should be addressed.
Electronics 2023, 12(18), 3855; https://doi.org/10.3390/electronics12183855
Submission received: 1 August 2023 / Revised: 8 September 2023 / Accepted: 9 September 2023 / Published: 12 September 2023
(This article belongs to the Section Networks)

Abstract

Today’s intruders usually send attack commands to a target system through several stepping-stone hosts in order to reduce the probability of being caught. With stepping-stone intrusion (SSI), the intruder’s identity is very difficult to discover because it is concealed by a long interactive connection chain of hosts. An effective approach to SSI detection (SSID) is to determine how many connections a connection chain contains; this type of method is called network-based SSID. Most existing network-based SSID methods work only for network traffic that has not been manipulated by the intruder: they either fail to resist intruders’ chaff-perturbation or have very limited capability to resist session manipulation. This paper develops a novel network-based SSID algorithm that resists intruders’ chaff-perturbation by using packet crossover. The proposed approach is simple and easy to implement because the number of packet crossovers can be easily computed. The algorithm is verified by technical proofs as well as network experiments. Our experimental results show that the proposed SSID algorithm works effectively in resisting intruders’ chaff-perturbation up to a chaff rate of 50%.

1. Introduction

Using SSI, an attacker builds a chain of stepping-stone machines (see Figure 1 with five connections), uses SSH or telnet to log in to these stepping-stones in turn, and then launches the attack [1]. In Figure 1, Host A serves as the attack host, and Host V represents the victim system. The intruder, sitting in front of the attack host, remotely logs in to the stepping-stone hosts S1, S2, S3, and S4 in turn, and finally to the victim Host V. To detect SSI, any stepping-stone host between the attacker and the victim could be employed as the sensor machine, on which a packet sniffer program (e.g., TCPdump) runs to capture network traffic. In Figure 1, we assume that Host S2 serves as the sensor host for SSI detection (SSID). The upstream sub-chain is the part of the chain from the intruder machine A to the sensor machine S2, and the downstream sub-chain is the other part of the chain, from S2 to the victim machine V.
The goal of SSID is to decide if a stepping-stone is employed by a hacker for a malicious intrusion [2,3,4]. If a connection leaving the sensor machine matches with one of connections arriving at the sensor, then it is likely that the communication session is a malicious intrusion [3,5,6]. SSIs are very hard to detect as the intruder is concealed by a long TCP/IP connection chain of stepping-stone machines. For typical data communication between a server and a client, every interactive connection between them is independent of one another even if the connections might be relayed. Due to such independence, it is extremely difficult to determine the SSI attack origin while the victim machine V is accessed via several stepping-stone hosts.
Today, intruders tend to launch cyberattacks with session manipulation techniques such as chaff attack. Chaff attack is a hacking technique that adds intruder-created packets to a communication session in order to modify not only the packets’ round-trip times (RTTs), but also the total number of packets within a certain period of time. Most known SSID algorithms could easily be defeated by chaff attack. The chaff-attacking technique is widely used in attacks such as man-in-the-middle, DoS, DDoS, or SSI attacks.
One type of SSID approach is to determine whether a connection leaving the sensor host matches with one of connections arriving at the sensor [2,7,8]. This type of SSID approach only uses the sensor host for detection, thus is called the host-based SSID. It is well-known that most Web applications as well as cloud computing applications usually employ stepping-stone hosts to gain access to a remote server such as a database server. Therefore, high false-positive errors are likely unavoidable when a host-based SSID approach is used for detection [4,9,10].
For the purpose of decreasing such errors incurred by host-based SSID methods, another type of SSID method was proposed that estimates the length of a connection chain [11,12]. This type of SSID method is called network-based SSID. That is, it focuses on estimating the number of stepping-stone machines contained in the chain; the number of connections in the chain is this number plus one (see Figure 1) [13,14,15]. If an attacker uses two or more stepping-stone hosts to launch an attack, then the number of connections from the attacker host to the victim is at least three. The rationale of network-based SSID approaches is discussed in detail in Section 4.2.
Most existing network-based SSID methods work only for network traffic whose sessions have not been manipulated by intruders (see the discussion of related work in Section 2). These known SSID algorithms either fail to resist intruders’ chaff-perturbation manipulation or have very limited capability to resist an attacker’s session manipulation. The motivations of this paper are listed below:
  • To evade detection, most intruders today use chaff-perturbation manipulation to launch SSI attacks.
  • There is a pressing need to develop effective solutions to the critical problems with the existing SSID methods proposed so far.
  • The proposed solution to resolve the aforementioned issues with the existing SSID methods should be simple and easy to implement.
With these motivations, this paper develops a novel network-based SSID algorithm that is resistant to intruders’ chaff-perturbation by using packet crossover. The main contributions of this paper are summarized below:
(1) Our SSID algorithm developed in this paper is simple and easy to implement as the number of packet crossovers can be easily computed.
(2) Our SSID algorithm developed in this paper is network-based and generates very low false-positive errors.
(3) Our SSID algorithm developed in this paper is verified through technical proofs as well as network experiments; according to our experimental results, the proposed SSID algorithm works effectively in resisting chaff attacks by hackers up to a chaff rate of 50%.
(4) To the best of our knowledge, our SSID algorithm developed in this paper is the first network-based approach that can effectively detect SSI when it is present as well as resist intruders’ chaff-perturbation up to a chaff rate of 50%.
The rest of this paper is organized as follows. Related work is presented in Section 2. In Section 3, we give a proposition that asserts the relationship between the downstream sub-chain length and the packet crossover ratio. The design and implementation of our proposed SSID algorithm are described in Section 4. In Section 5, we analyze the resistance of the proposed SSID algorithm to chaff attacks. In Section 6, we present and analyze the results of our network experiments. Finally, the conclusion and a discussion of future research directions are given in Section 7.

2. Related Work

In this section, we give a review of related work in SSID. Let us begin with the existing host-based SSID approaches. In 1995, S. Staniford-Chen et al. [1] proposed the first SSID method by comparing the actual contents of packets to determine whether a relayed pair of connections is present at the sensor host. Ref. [1] claimed that it is likely a malicious intrusion if such a pair of connections exists. However, the SSID approach proposed in [1] does not work if the network traffic is encrypted.
To overcome this problem, a time-based thumbprint approach was developed by Y. Zhang et al. [4] for SSID by comparing the thumbprints created based on the timestamps of network packets captured from the outgoing and incoming connections of the sensor host. Since packets’ timestamps are not encrypted, the time-based thumbprint method proposed in [4] works effectively for encrypted network traffic. A similar solution to solve the problems with S. Staniford-Chen’s SSID method was developed by K. Yoda et al. [16] by analyzing the deviation between two consecutive connections within a connection chain. This SSID method does not require any information in the packets’ contents; thus, it also works when the network traffic is encrypted.
However, none of the above-mentioned SSID methods are resistant to intruders’ session manipulation using chaff-perturbation and/or time-jittering. Research findings obtained by D. L. Donoho et al. [3] show that intruders’ capabilities of manipulating communication sessions are limited, and they would not be able to evade detection by camouflaging the communication sessions.
Another SSID method that does not require information about packets’ contents was developed by A. Blum et al. [2] by counting the number of packets in the outgoing and incoming connections, respectively. Ref. [2] claims that if a pair of relayed connections is present, then the difference between these two numbers of packets (in the outgoing and incoming connections) must be upper bounded. Another host-based SSID method developed by T. He et al. [12] was claimed to be resistant to intruders’ chaff attacks proportional to the network traffic size. More specifically, they claim that if ∆ is an upper bound of the packet delay, for an intruder to evade SSID, the intruder has to chaff at least n/(1 + λ∆) packets, where n is the total number of normal packets before any meaningless packet is chaffed.
Next, we give a thorough review of all the existing network-based SSID algorithms that perform intrusion detection by estimating the number of connections contained in a chain. The first network-based SSID method was proposed by Yung et al. [14] in 2002. This approach computed a connection chain length by calculating the ratio of the Send-Echo Round-Trip Time (RTT) over the Send-Ack RTT. According to Yung’s method, a Send-Echo RTT reflects the number of connections contained in the downstream sub-chain, whereas a Send-Ack RTT represents the length of the one-hop connection from the sensor host to its adjacent machine on the downstream side. The problem with Yung’s SSID algorithm is that it produced very high false-negative errors because the acknowledgement packets were used in the chain length estimation. The issues of Yung’s SSID algorithm were discovered and described in [11] by J. Yang et al.
The work [11], proposed in 2004, was the second network-based SSID algorithm. It used step functions to estimate the number of connections contained in a connection chain. This paper resolved the issues of Yung’s SSID algorithm in [14] by setting up the connection chain in a different way. With this improvement, each Echo packet could be matched with a corresponding Send [11]. Thus, the false-negative errors for SSID were significantly reduced in [11], compared to Yung’s SSID algorithm proposed in [14]. Unfortunately, the step-function method proposed in [11] performed effectively only in a local area network.
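The ratio idea behind Yung’s method can be illustrated with a minimal Python sketch. The function name, the sample RTT values, and the rounding rule are assumptions made here for illustration; they are not taken from [14].

```python
def estimate_downstream_length(send_echo_rtt_ms: float, send_ack_rtt_ms: float) -> int:
    """Estimate the number of downstream connections as the ratio of the
    Send-Echo RTT to the Send-Ack RTT.

    Rounding to the nearest integer and clamping to at least one
    connection are assumptions of this sketch.
    """
    if send_ack_rtt_ms <= 0:
        raise ValueError("Send-Ack RTT must be positive")
    return max(1, round(send_echo_rtt_ms / send_ack_rtt_ms))


# Hypothetical measurements: the Send-Echo RTT is three times the one-hop RTT.
print(estimate_downstream_length(120.0, 40.0))
```

The sketch also makes the weakness noted above visible: the estimate depends entirely on the Send-Ack RTT, so any distortion of acknowledgement timing shifts the estimated length.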
To overcome these issues existing in [11], an SSID approach using a data mining method was proposed by Yang et al. in [13]. The packets’ round-trip times were computed by using a data mining algorithm—the maximum–minimum distance clustering algorithm. With this method, every Send packet was accurately matched through looking at all the possible Echoes for this Send. The number of clusters output by the maximum–minimum distance clustering algorithm decides the connection chain length. However, the SSID method proposed in [13] must capture a huge number of TCP packets, which does not make the processing and analysis of the captured packets efficient. Thus, the detection approach developed in [13] is inefficient, taking the packets’ processing time into consideration.
To overcome the issue with the SSID method introduced in [13], a network-based SSID algorithm using packet crossover was proposed by Wang et al. [15]. The number of connections in the downstream sub-chain was calculated by analyzing the packet crossover ratios in [15]. The work [15] also verified that, when the packet crossover ratio gets larger and larger, so does the downstream connection chain length. However, the SSID algorithm proposed using packet crossover in [15] was not resistant to intruders’ chaff-perturbation.
A recent work [17] proposed a method that may obtain context-free properties for installing an anomaly-based NIDS (network intrusion detection system) using a machine learning model.

3. Relationship between the Downstream Sub-Chain Length and Packet Crossover Ratio

The following proposition asserts the relationship between the downstream sub-chain length and the obtained packet crossover ratio. This proposition was proved in our prior work [15].
Proposition 1.
For a given connection chain, the downstream sub-chain length strictly increases with the obtained packet crossover ratio.
Our experimental results show that Proposition 1 holds when the network traffic contains no chaffed meaningless packets. The experimental results will be presented in Section 6. More importantly, the proposition remains true when intruders’ chaff-perturbation is present at a chaff rate of up to 50%. We collected 10 datasets in total, and the error rate remains 0% when meaningless packets are chaffed into the network traffic at a rate of 10%, 20%, 30%, 40%, or 50%. This proposition plays an important role in designing and analyzing the SSID algorithm proposed in Section 4 below.

4. SSID Algorithm Design and Implementation

In this section, we design an innovative network-based SSID algorithm for network traffic in any LAN or WAN, with or without chaffed meaningless packets, by calculating a threshold of packet crossover ratios. Our SSID algorithm not only works effectively to detect SSI in any LAN or WAN, but also resists chaff attacks initiated by hackers. The design and analysis of our SSID algorithm are based on Proposition 1 stated above. We begin with the security model, followed by a brief discussion of the rationale behind SSID algorithms that calculate a connection chain length.

4.1. The Security Model

The main goal of this research is to determine whether a computer (the sensor) is employed by an intruder for SSI by analyzing the network traffic captured at the sensor host from both its incoming and outgoing connections. So far, most of the existing work on SSID has focused on deciding whether or not there is a relayed pair among the sensor’s leaving and arriving connections. Unfortunately, these known SSID approaches can be easily defeated by hackers’ chaff attacks. It is well known that hackers are able to quickly create chaff packets and send them into a communication session that is under attack in order to evade detection. Thus, there is a pressing need for a novel SSID method that defeats hackers’ chaff attacks. We assume in this paper that hackers are not able to chaff meaningless packets into two or more computers in a connection chain.
It will be very difficult for an intruder to inject Echo packets into an active attacking communication session initiated by the intruder. Therefore, we assume that the intruder can only inject meaningless Send packets into a communication session. The meaningless chaff packets are sent into the arriving connection of the sensor host and could be observed and captured from the sensor host. Let us use Figure 2 below to explain the details.
In Figure 2, the intruder is assumed to use four stepping-stone hosts (H1, H2, H3, and H4) to launch the SSI attack on the victim host by using the tool OpenSSH. We assume that the host H2 is selected as the sensor machine. If the hacker sends chaffed packets into the arriving connection of the sensor machine H2, the source and destination IP addresses of these hacker-created packets are the IP addresses of H1 and the victim host, respectively. The destination port of these packets will be TCP port 22 for the OpenSSH server running on the victim host. These chaffed meaningless packets can only be observed and captured at the sensor host H2, not at the computer H3 or H4. Clearly, these chaffed meaningless packets cannot be forwarded to the connection leaving the sensor machine H2. Therefore, they cannot be captured at the host H3 or H4. The chaffed packets are not allowed to reach the victim machine: they are fake packets and will be dropped by the sensor machine.

4.2. The Rationale of Network-Based SSID Algorithms

It is an effective approach to detect SSI by calculating the number of stepping-stone machines utilized in a connection chain. Today, due to the use of emerging technologies such as Web services and/or cloud computing, many legitimate applications may use two stepping-stone machines to access a server remotely. For example, the architecture “client browser -> application server -> Web services (or cloud services) -> remote database server” is widely used in today’s IT industry. Therefore, in this paper, we use the assumption that legitimate applications may use two stepping-stone machines to access a remote server, but rarely use three or more stepping-stones to do so.
Thus, based on the above assumption, SSI is likely present if a downstream sub-chain contains at least three connections, which makes the whole chain at least four connections long, as the upstream sub-chain has at least one connection.
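The counting rule used in this rationale can be written down as a short Python sketch. The function and constant names are hypothetical; only the "+1" relation between stepping-stones and connections and the threshold of three downstream connections come from the text.

```python
# Threshold stated in the text: three or more downstream connections are suspicious.
MIN_SUSPICIOUS_DOWNSTREAM = 3


def connections_in_chain(num_stepping_stones: int) -> int:
    """A chain through k stepping-stones consists of k + 1 connections."""
    return num_stepping_stones + 1


def is_suspicious(downstream_connections: int) -> bool:
    """Flag a downstream sub-chain that is at least three connections long."""
    return downstream_connections >= MIN_SUSPICIOUS_DOWNSTREAM


# A legitimate application using two stepping-stones spans three connections in total.
print(connections_in_chain(2))
# Seen from the first stepping-stone, only two of them are downstream.
print(is_suspicious(2), is_suspicious(3))
```

Under this rule, a sensor placed on the first stepping-stone of the legitimate "client browser -> application server -> Web services -> remote database server" architecture sees only two downstream connections and is not flagged.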

4.3. SSID Algorithm Design

Pick any host in the network as the sensor host S1. Our proposed SSID algorithm determines whether the sensor host S1 is used by an intruder for SSI. The algorithm, which works by calculating packet crossover ratios, is described below in Algorithm 1:
Algorithm 1: SSID Algorithm using Packet Crossover
Input: None
Output: SSI detected or not
Begin:
  1. Set up a connection chain A→S1→S2→S3→V of length four, where the hosts S1, S2, and S3 are the stepping-stones (S1 serves as the sensor), host A the attacker, and host V the victim. The length of the downstream sub-chain from S1 to V is three.
  2. Some standard Linux commands (such as ls, dir, mkdir, etc.) are entered into a terminal in the attacker host A for a couple of minutes, and at the same time all the packets are captured at the sensor S1 from the connection S1→S2 in the chain. In total, 10 datasets will be captured. Then, we use the Packet Crossover Ratio algorithm (Algorithm 1 of [15]) to calculate the packet crossover ratio for each dataset of the above-captured packets.
  3. Calculate the intrusion threshold crossover ratio, which is the average packet crossover ratio among the 10 datasets captured at Step 2.
  4. To perform SSID, at the same time, we also use host S1 as the sensor and observe one of its outgoing links. We then determine whether this outgoing link from the sensor S1 is used by an intruder for a malicious SSI. We capture 10 datasets at the sensor S1 from this outgoing connection and calculate the average packet crossover ratio over all the 10 captured datasets using the Packet Crossover Ratio algorithm (Algorithm 1 of [15]).
  5. If the average packet crossover ratio obtained at Step 4 is greater than or equal to the intrusion threshold crossover ratio obtained at Step 3, it is most likely that this outgoing link is used by a hacker for malicious SSI.
  6. Repeat Steps 4 and 5 for every outgoing link from the sensor host S1 (except for the connection S1→S2 in the chain created in Step 1) to see whether it is used by a hacker for malicious SSI.
End
Here are some comments about Step 5 of Algorithm 1 above. If the intrusion threshold crossover ratio obtained at Step 3 is larger than the average packet crossover ratio obtained at Step 4, then, according to Proposition 1, the number of connections in the downstream sub-chain is less than three. However, because we do not know the number of connections in the upstream sub-chain, no conclusion can be drawn in such a case; further analysis of the upstream sub-chain length is required.
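Steps 3 through 5 of Algorithm 1 reduce to two averages and one comparison. The following Python sketch illustrates that logic under the assumption that the per-dataset packet crossover ratios have already been computed (for example by Algorithm 1 of [15]). The function names and the sample ratios are hypothetical and are not the values reported in Section 6.

```python
from statistics import mean


def intrusion_threshold(baseline_ratios):
    """Step 3: average crossover ratio over the datasets captured from the
    reference chain whose downstream sub-chain has three connections."""
    return mean(baseline_ratios)


def detect_ssi(observed_ratios, threshold):
    """Steps 4 and 5: average the ratios observed on one outgoing link and
    report SSI when that average reaches the threshold."""
    return mean(observed_ratios) >= threshold


# Hypothetical crossover ratios, for illustration only.
baseline = [0.5, 1.0]
threshold = intrusion_threshold(baseline)
print(detect_ssi([0.9, 1.1], threshold))  # link whose average reaches the threshold
print(detect_ssi([0.2, 0.4], threshold))  # link whose average stays below it
```

A result of False is inconclusive rather than a clean bill of health, because the upstream sub-chain length remains unknown.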

4.4. Implementation of the Proposed SSID Algorithm

In this section, we present the implementation of our SSID algorithm proposed in Section 4.3. The implementation does not depend on any particular programming language. Without loss of generality, we used the Java language to implement the algorithm. The implementation is composed of the following steps:
(1) We set up a connection chain A→S1→S2→S3→V of length four using OpenSSH, where the hosts S1, S2, and S3 are the stepping-stones, host A the attacker, and host V the victim. S1 is used as the sensor host and all the network packets are captured from the sensor. The first two hosts A and S1 are local computers located on the campus of Columbus State University, Columbus, GA, USA, and the last three hosts S2, S3, and V are Amazon AWS servers located in different regions.
(2) Some standard Linux commands (such as ls, dir, mkdir, etc.) are executed in host A for a couple of minutes while packets are captured using the tool TCPdump from the outgoing connection of sensor S1. We collect 10 (or more) datasets of network packets.
(3) We write a Java program to perform offline chaffing by randomly injecting meaningless packets into each of the captured dataset files. More specifically, we select the chaff rates λ = 0%, 10%, 20%, 30%, 40%, and 50%. For example, when the chaff rate λ = 10%, 100 meaningless packets are randomly injected into every 1000 packets of each captured dataset.
(4) We write another Java program to implement the Packet Crossover Ratio algorithm (Algorithm 1 of [15]), which calculates the packet crossover ratio for a given input file of packets. The input file of this Java program is a captured dataset file with meaningless packets chaffed at a specific chaff rate λ (produced at Step 3 above). The program outputs the packet crossover ratio for the given input file.
(5) Starting with the chaff rate λ = 0%, we run the Java program created at Step 4 on each captured dataset file chaffed at rate λ (produced at Step 3) and obtain the packet crossover ratio for that dataset file. With the chaff rate λ, we then perform the following five steps:
  (a) Compute the intrusion threshold crossover ratio by averaging the packet crossover ratios obtained at Step 5 over the above 10 datasets chaffed at rate λ.
  (b) To perform SSID, at the same time, we randomly select an outgoing link leaving the sensor host S1 (except for the connection S1→S2 in the chain created in Step 1) and observe this outgoing link from S1. We then determine whether this outgoing link from S1 is used by an intruder for a malicious SSI. We capture 10 datasets of network packets at the sensor S1 from this outgoing connection.
  (c) We run the Java program created at Step 4 above to calculate the packet crossover ratio for each of the 10 datasets of network packets captured at Step 5(b) above. Then, we calculate the average packet crossover ratio over all the 10 datasets captured at Step 5(b) above.
  (d) Output the message “SSI detected” if the average packet crossover ratio obtained at Step 5(c) is greater than or equal to the intrusion threshold crossover ratio obtained at Step 5(a).
  (e) Repeat Steps 5(b) through 5(d) above for every outgoing link from the sensor host S1 (except for the connection S1→S2 in the chain created in Step 1) to see whether it is used by a hacker for malicious SSI.
(6) Increase the chaff rate λ by 10% and repeat Step 5 above until λ > 50%.
If the SSID algorithm outputs the message “SSI detected”, it is most likely that stepping-stone intrusion is present in the network.
This completes the description for the implementation of our proposed SSID algorithm.
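The offline chaffing of Step 3 can be sketched as follows. This is an illustrative Python sketch, not the authors’ Java program: the packet representation (a pair of packet kind and chaff flag), the function name, and the choice of uniformly random insertion positions are all assumptions. Only Send packets are injected, in line with the security model of Section 4.1.

```python
import random


def inject_chaff(packets, chaff_rate, seed=None):
    """Return a copy of `packets` with meaningless Send packets inserted at
    random positions.

    `packets` is a list of (kind, is_chaff) pairs, where kind is "S" (Send)
    or "E" (Echo). The number of injected packets is chaff_rate times the
    number of captured packets, e.g. 100 per 1000 packets at a 10% rate.
    """
    rng = random.Random(seed)
    num_chaff = round(chaff_rate * len(packets))
    chaffed = list(packets)
    for _ in range(num_chaff):
        position = rng.randint(0, len(chaffed))  # inclusive bounds; len() appends
        chaffed.insert(position, ("S", True))    # chaff is always a Send packet
    return chaffed


# Hypothetical capture of 1000 packets, alternating Send and Echo.
captured = [("S", False), ("E", False)] * 500
chaffed = inject_chaff(captured, 0.10, seed=1)
print(len(captured), len(chaffed))  # prints "1000 1100"
```

Because the original packets are never reordered or removed, filtering out the flagged chaff recovers the captured dataset exactly, which makes it easy to compare ratios with and without chaff on the same capture.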

5. Resistance Analysis to Intruders’ Chaff Attacks

Our proposed SSID algorithm using packet crossover was described in Section 4. In this section, we analyze its resistance to network traffic with chaffed meaningless packets by intruders. The following theorem asserts that our proposed Algorithm 1 for SSID in Section 4 is resistant to intruders’ chaff attacks if the chaff rate is not high.
Theorem 1.
The proposed Algorithm 1 for SSID using packet crossover ratios in Section 4 is resistant to intruders’ chaff attacks if the chaff rate is not high.
Proof.
According to Proposition 1 above, for a given connection chain, the downstream sub-chain length strictly increases with the obtained packet crossover ratio.
We claim that the packet crossover ratios observed at the sensor host are almost the same for network traffic without chaff and for network traffic with chaff at a chaff rate that is not high.
According to the algorithm for calculating a packet crossover ratio presented in Section III of [15], the packet crossover ratio is defined as the quotient of two values: the numerator is the total number of packet crossovers, whereas the denominator is the total number of Echoes and Sends.
As assumed in Section 4.1, no Echo packets can be chaffed, as it is very difficult for intruders to inject Echo packets into active connections. For network traffic with chaff at a chaff rate that is not high, the total number of Send packets will increase. Clearly, the total number of packet crossovers will also increase as the total number of Send packets increases. Therefore, both the numerator and the denominator of the packet crossover ratio increase. However, the total number of chaffed Send packets (the additional Send packets) will be limited as the chaff rate is not high.
Thus, for network traffic with or without chaff, the change in the packet crossover ratio is negligible when the chaff rate is not high. This result is also verified through the network experiments in Section 6 below.
This completes the proof of Theorem 1. □
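The argument of the proof can be made explicit with a short calculation. The symbols below are notation introduced here for illustration and do not appear in [15]: C is the number of packet crossovers, S and E are the numbers of Send and Echo packets without chaff, k is the number of chaffed Send packets, and ΔC is the number of additional crossovers that the chaff creates.

```latex
r_0 = \frac{C}{S + E},
\qquad
r_\lambda = \frac{C + \Delta C}{S + E + k},
\qquad
r_\lambda - r_0 = \frac{(S + E)\,\Delta C - k\,C}{(S + E)\,(S + E + k)} .
```

The difference vanishes exactly when ΔC/k = C/(S + E), that is, when the chaffed Sends create crossovers at the same rate as the original traffic. How far the ratio moves therefore depends on how ΔC compares with k·C/(S + E), which is the quantity the proof argues stays limited when the chaff rate is not high.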
In this paper, we proposed a novel network-based detection algorithm for SSI by calculating the packet crossovers. To defeat our proposed SSID algorithm, intruders can only apply one (or both) of the following methods to evade detection: (1) chaff perturbation and/or (2) time jittering.
In the above Theorem 1, we have proved that our proposed Algorithm 1 for SSID in Section 4 is resistant to intruders’ chaff attacks if the chaff rate is not high.
Algorithm 1 for SSID using packet crossover ratios is also resistant to time-jittering attacks. With time-jittering attacks, only the timestamps of some network packets are modified; neither the total number of Send packets nor the total number of Echo packets increases. In addition, the total number of packet crossovers remains unchanged as the order of the packets remains the same.
Therefore, the above Algorithm 1 for SSID developed in this paper is resistant to any session manipulations by intruders including both chaff-perturbation and time-jittering attacks.

6. Network Experimental Results and Analysis

In this section, we present and analyze the results of the network experiments that were designed to verify the following:
(1) The correctness of Proposition 1 stated in Section 3;
(2) The correctness of the proposed SSID algorithm stated in Section 4.
To set up the network environment for the experiment, six AWS servers distributed in different regions and two computers in our computer lab were used to establish the connection chain. All of the systems used in our network experiment run Ubuntu. A long chain was established by using OpenSSH to connect all the machines in the chain from the attacker machine H1 to the victim target H8 (refer to Figure 3). Host H1 is a local PC at Columbus State University in Georgia, USA, with the IP address 168.27.2.101. From H1, we remotely accessed another local host H2, also located at Columbus State University, with the IP address 168.27.2.103. The chain was then extended hop by hop, each time using the previous host as a stepping-stone: from H2 to an AWS server H3 (running in Virginia, IP address 54.175.200.189), from H3 to H4 (London, 35.178.87.47), from H4 to H5 (Virginia, 3.87.217.13), from H5 to H6 (Tokyo, 54.65.202.87), from H6 to H7 (Paris, 15.188.87.227), and finally from H7 to the victim target H8 (an AWS server running in Virginia, 54.86.84.197). This completes the setup of the connection chain.
The network packets were captured using TCPdump from the downstream connection at every selected sensor host in the chain. For example, when Host 3 is chosen as the sensor, then the packets are captured from the connection between Hosts 3 and 4.
Our first experiment was conducted for network traffic without any chaffed meaningless packets. At every sensor host, as soon as TCPdump was ready to capture network packets, some standard Linux commands (such as ps, ls, dir, etc.) were entered into a terminal in the attacker machine H1 for a couple of minutes, and all the packets from the indicated connection were captured at each of the five sensor hosts (H3 through H7) in the chain. In total, ten datasets were captured at each of the five sensor hosts. Then, we used our packet crossover ratio algorithm [15] to calculate the packet crossover ratio based on the packets captured at each specific sensor host. Finally, we computed the average packet crossover ratio among the 10 datasets.
In Table 1, column 1 (# of Conn) represents the number of connections in the downstream sub-chain, and “DS-1” stands for dataset #1. Columns 2 through 11 show the packet crossover ratios calculated for each dataset from DS-1 to DS-10, with the number of connections specified in column 1. For each of the 10 datasets (columns 2 through 11 of Table 1), the downstream sub-chain length strictly increases with the packet crossover ratio obtained at the sensor host. Therefore, our experimental results completely support Proposition 1 for network traffic without chaff: it holds for each of the 10 datasets, that is, the error rate is 0%.
In Table 1, the last column is the average packet crossover ratios among the 10 datasets. In this experiment, the intrusion threshold crossover ratio is 0.78, which is the average packet crossover ratio derived from a downstream sub-chain of three connections over the ten datasets. According to our proposed SSID algorithm, for a given outgoing link of the sensor host, if the average packet crossover ratio obtained at Step 4 of the algorithm over 10 datasets is at least 0.78, then it is highly suspicious that this outgoing link is used by a hacker for malicious SSI.
Our second experiment was conducted on network traffic with meaningless packets chaffed at a rate of 10%. We performed the same steps as in the first experiment for network traffic without chaffed packets, and again used our packet crossover ratio algorithm to calculate the packet crossover ratio from the packets captured with a 10% chaff rate at each sensor host. The results follow the same pattern as those of the first experiment.
In Table 2, the columns have the same meanings as in Table 1. For each of the 10 datasets, the packet crossover ratio observed at the sensor host strictly increases with the length of the downstream sub-chain. Therefore, our experimental results also fully support Proposition 1 for network traffic with a 10% chaff rate.
In Table 2, the last column again gives the average packet crossover ratio over the 10 datasets. In this experiment, the intrusion threshold crossover ratio is 8.54, which is the average packet crossover ratio derived from a downstream sub-chain of three connections over the ten datasets. According to our proposed SSID algorithm, for a given outgoing link of the sensor host, if the average packet crossover ratio obtained at Step 4 of the algorithm is at least 8.54, then this outgoing link is highly suspected of being used by a hacker for malicious SSI.
Similarly, our third through sixth experiments were conducted on network traffic with meaningless packets chaffed at rates of 20%, 30%, 40%, and 50%, respectively (results shown in Table 3, Table 4, Table 5 and Table 6, respectively). For each of these experiments, we performed the same steps as in the first experiment and used our packet crossover ratio algorithm to calculate the packet crossover ratio from the packets captured at the given chaff rate. The results follow the same pattern as those of the first experiment. In Table 3, Table 4, Table 5 and Table 6, the columns have the same meanings as in Table 1. For each of the 10 datasets in each of these four experiments, the packet crossover ratio observed at the sensor host strictly increases with the length of the downstream sub-chain. Our experimental results therefore also fully support Proposition 1 for each of the 10 datasets for network traffic with a chaff rate of 20%, 30%, 40%, and 50%.
The last column of Table 3, Table 4, Table 5 and Table 6 again gives the average packet crossover ratio over the 10 datasets. In the third through sixth experiments, the intrusion threshold crossover ratios are 15.69, 22.10, 27.99, and 33.44, respectively, each derived from the corresponding downstream sub-chain of three connections over the ten datasets.
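Because the reported thresholds differ greatly between chaff rates, an averaged ratio is only meaningful when compared with the threshold for the matching rate. The sketch below collects the six reported thresholds in a lookup table. The names `THRESHOLDS` and `exceeds_threshold` are hypothetical, and the sketch assumes the chaff rate of the observed traffic is known, which the text above does not address.

```python
# Intrusion threshold crossover ratios reported for each chaff rate (percent),
# i.e., the three-connection AVG entries of Tables 1 through 6.
THRESHOLDS = {0: 0.78, 10: 8.54, 20: 15.69, 30: 22.10, 40: 27.99, 50: 33.44}


def exceeds_threshold(avg_ratio, chaff_rate):
    """Compare an averaged crossover ratio with the threshold for a chaff rate.

    Assumption: chaff_rate is one of the six rates used in the experiments.
    """
    if chaff_rate not in THRESHOLDS:
        raise ValueError(f"no threshold reported for a chaff rate of {chaff_rate}%")
    return avg_ratio >= THRESHOLDS[chaff_rate]


# Averages from Table 6 (50% chaff): two and four downstream connections.
print(exceeds_threshold(33.18, 50))
print(exceeds_threshold(33.63, 50))

# The one-connection average at 10% chaff (7.80, Table 2) is below the 10%
# threshold but far above the no-chaff threshold, so the rate must match.
print(exceeds_threshold(7.80, 10))
print(exceeds_threshold(7.80, 0))
```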

7. Conclusions and Future Work

In this paper, we developed a novel network-based SSID approach, based on packet crossover, that is resistant to intruders’ chaff manipulation. To the best of our knowledge, this is the first network-based SSID approach that resists intruders’ session manipulation such as chaff perturbation. Since packet crossovers can be easily calculated, the proposed SSID method is efficient and easy to implement. Like prior network-based SSID approaches, it also produces very few false positives. According to our experimental results, the proposed SSID algorithm resists intruders’ chaff perturbation up to a 50% chaff rate. In our experiments, the error rate is zero for LAN or WAN network traffic with a chaff rate of at most 50%.
As a direction for future research, one may develop SSID algorithms that remain resistant to chaff manipulation when an intruder chaffs meaningless packets at two or more hosts in a connection chain.

Author Contributions

L.W.: SSID design, writing, analysis, supervision, and project administration; J.Y.: validation, analysis, investigation, supervision, and project administration; J.K.: collection and analysis of network packets; P.-J.W.: supervision and assistance in analyzing the algorithm correction and validation. All authors have read and agreed to the published version of the manuscript.

Funding

This work of Lixin Wang and Jianhua Yang is supported by the National Security Agency NCAE-C Research Grant (H98230-20-1-0293) with Columbus State University, Georgia, USA.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Staniford-Chen, S.; Heberlein, L.T. Holding Intruders Accountable on the Internet. In Proceedings of the IEEE Symposium on Security and Privacy, Oakland, CA, USA, 8–10 May 1995; pp. 39–49.
  2. Blum, A.; Song, D.; Venkataraman, A.S. Detection of Interactive Stepping-Stones: Algorithms and Confidence Bounds. In Proceedings of the International Symposium on Recent Advance in Intrusion Detection (RAID), Sophia Antipolis, France, 15–17 September 2004; pp. 20–35.
  3. Donoho, D.L.; Flesia, A.G.; Shankar, U.; Paxson, V.; Coit, J.; Staniford, S. Multiscale stepping-stone detection: Detecting pairs of jittered interactive streams by exploiting maximum tolerable delay. In Proceedings of the 5th International Symposium on Recent Advances in Intrusion Detection (RAID), Zurich, Switzerland, 16–18 October 2002.
  4. Zhang, Y.; Paxson, V. Detecting Stepping-Stones. In Proceedings of the 9th USENIX Security Symposium, Denver, CO, USA, 14–17 August 2000; pp. 67–81.
  5. Mathew, B. UNIX security: Threats and solutions. In Proceedings of the 1995 System Administration, Networking, and Security Conference, Washington, DC, USA, 10–13 October 1995.
  6. Bhattacherjee, D. Stepping-Stone Detection for Tracing Attack Sources in Software-Defined Networks. Master’s Thesis, Aalto University, Espoo, Finland, 2016.
  7. Paxson, V.; Floyd, S. Wide-area Traffic: The Failure of Poisson Modeling. IEEE/ACM Trans. Netw. 1995, 3, 226–244.
  8. Wang, X.; Reeves, D. Robust correlation of encrypted attack traffic through stepping-stones by flow watermarking. IEEE Trans. Dependable Secur. Comput. 2011, 8, 434–449.
  9. Phaal, P.; Panchen, S.; McKee, N. InMon Corporation’s sFlow: A Method for Monitoring Traffic in Switched and Routed Networks; RFC 3176; IETF: Fremont, CA, USA, 2001.
  10. Chen, Y.; Wang, S. A Novel Network Flow Watermark Embedding Model for Efficient Detection of Stepping-stone Intrusion Based on Entropy. In Proceedings of the International Conference on e-Learning, e-Business, Enterprise Information Systems, and e-Government (EEE), WorldComp, Las Vegas, NV, USA, 25–28 July 2016.
  11. Yang, J.; Huang, S.-H.S. A Real-Time Algorithm to Detect Long Connection Chains of Interactive Terminal Sessions. In Proceedings of the 3rd ACM International Conference on Information Security (Infosecu’04), Shanghai, China, 14–16 November 2004; pp. 198–203.
  12. He, T.; Tong, L. Detecting encrypted stepping-stone connections. IEEE Trans. Signal Process. 2007, 55, 1612–1623.
  13. Yang, J.; Huang, S.-H.S. Mining TCP/IP Packets to Detect Stepping-Stone Intrusion. Comput. Secur. 2007, 26, 479–484.
  14. Yung, K.H. Detecting Long Connecting Chains of Interactive Terminal Sessions. In Proceedings of the 5th International Symposium on Recent Advances in Intrusion Detection (RAID), Zurich, Switzerland, 16–18 October 2002; pp. 1–16.
  15. Wang, L.; Yang, J.; Lee, A.; Wan, P.-J. Stepping-Stone Intrusion Detection via Estimating Numbers of Upstream and Downstream Connections using Packet Crossover. J. Wirel. Mob. Netw. Ubiquitous Comput. Dependable Appl. (JoWUA) 2022, 13, 24–39.
  16. Yoda, K.; Etoh, H. Finding Connection Chain for Tracing Intruders. In Proceedings of the 6th European Symposium on Research in Computer Security, Toulouse, France, 4–6 October 2000; Volume 1985, pp. 31–42.
  17. Figueiredo, J.; Serrão, C.; de Almeida, A.M. Deep Learning Model Transposition for Network Intrusion Detection Systems. Electronics 2023, 12, 293.
Figure 1. A sample of a connection chain with five connections.
Figure 2. A connection chain with an intruder, four stepping-stones, and a victim.
Figure 3. A chain of seven connections with host H3 through to host H7 serving as sensors, respectively, used to capture network traffic from the downstream connection.
Table 1. Packet crossover ratios without chaff. The intrusion threshold crossover ratio is 0.78.

| # of Conn | DS-1 | DS-2 | DS-3 | DS-4 | DS-5 | DS-6 | DS-7 | DS-8 | DS-9 | DS-10 | AVG |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.01 | 0.05 | 0.04 | 0.13 | 0.01 | 0.03 | 0.03 | 0.03 | 0.01 | 0.04 | 0.04 |
| 2 | 0.47 | 0.53 | 0.45 | 0.79 | 0.46 | 0.46 | 0.51 | 0.44 | 0.51 | 0.59 | 0.52 |
| 3 | 0.69 | 0.77 | 0.75 | 1.16 | 0.72 | 0.66 | 0.75 | 0.68 | 0.75 | 0.86 | 0.78 |
| 4 | 0.78 | 0.94 | 0.92 | 1.36 | 0.95 | 0.78 | 0.91 | 0.76 | 0.87 | 1.07 | 0.93 |
| 5 | 0.87 | 1.06 | 1.03 | 1.53 | 1.06 | 0.86 | 1.06 | 0.86 | 0.99 | 1.22 | 1.05 |
Table 2. Packet crossover ratios with a 10% chaff rate. The intrusion threshold crossover ratio is 8.54.

| # of Conn | DS-1 | DS-2 | DS-3 | DS-4 | DS-5 | DS-6 | DS-7 | DS-8 | DS-9 | DS-10 | AVG |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 11.46 | 6.57 | 7.66 | 8.41 | 8.13 | 6.68 | 7.89 | 6.08 | 6.46 | 8.69 | 7.80 |
| 2 | 11.89 | 7.02 | 8.03 | 9.11 | 8.55 | 7.11 | 8.37 | 6.43 | 6.92 | 9.26 | 8.27 |
| 3 | 12.20 | 7.27 | 8.33 | 9.43 | 8.87 | 7.34 | 8.64 | 6.7 | 7.14 | 9.51 | 8.54 |
| 4 | 12.23 | 7.46 | 8.52 | 9.68 | 9.06 | 7.43 | 8.79 | 6.86 | 7.33 | 9.76 | 8.71 |
| 5 | 12.32 | 7.59 | 8.63 | 9.83 | 9.19 | 7.49 | 8.94 | 6.87 | 7.44 | 9.9 | 8.82 |
Table 3. Packet crossover ratios with a 20% chaff rate. The intrusion threshold crossover ratio is 15.69.

| # of Conn | DS-1 | DS-2 | DS-3 | DS-4 | DS-5 | DS-6 | DS-7 | DS-8 | DS-9 | DS-10 | AVG |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 21.91 | 12.53 | 14.58 | 16.05 | 15.57 | 12.82 | 15.13 | 11.55 | 12.38 | 16.66 | 14.92 |
| 2 | 22.41 | 13.03 | 15.03 | 16.65 | 16.02 | 13.27 | 15.69 | 12.10 | 12.89 | 17.20 | 15.43 |
| 3 | 22.66 | 13.31 | 15.31 | 17.09 | 16.23 | 13.45 | 15.88 | 12.31 | 13.18 | 17.51 | 15.69 |
| 4 | 22.76 | 13.46 | 15.59 | 17.29 | 16.54 | 13.47 | 16.04 | 12.34 | 13.26 | 17.74 | 15.85 |
| 5 | 22.83 | 13.54 | 15.72 | 17.50 | 16.74 | 13.73 | 16.19 | 12.44 | 13.38 | 17.88 | 16.00 |
Table 4. Packet crossover ratios with a 30% chaff rate. The intrusion threshold crossover ratio is 22.10.

| # of Conn | DS-1 | DS-2 | DS-3 | DS-4 | DS-5 | DS-6 | DS-7 | DS-8 | DS-9 | DS-10 | AVG |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 31.37 | 17.93 | 20.89 | 22.88 | 22.21 | 18.30 | 21.64 | 16.56 | 17.68 | 23.81 | 21.33 |
| 2 | 31.90 | 18.36 | 21.31 | 23.64 | 22.68 | 18.77 | 22.11 | 16.96 | 18.25 | 24.37 | 21.84 |
| 3 | 32.16 | 18.66 | 21.58 | 23.92 | 23.03 | 18.97 | 22.35 | 17.22 | 18.43 | 24.64 | 22.10 |
| 4 | 32.24 | 18.85 | 21.94 | 24.16 | 23.21 | 19.06 | 22.52 | 17.38 | 18.59 | 24.87 | 22.28 |
| 5 | 32.36 | 18.96 | 22.03 | 24.32 | 23.35 | 19.19 | 22.66 | 17.46 | 18.71 | 25.04 | 22.41 |
Table 5. Packet crossover ratios with a 40% chaff rate. The intrusion threshold crossover ratio is 27.99.

| # of Conn | DS-1 | DS-2 | DS-3 | DS-4 | DS-5 | DS-6 | DS-7 | DS-8 | DS-9 | DS-10 | AVG |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 40.06 | 22.82 | 26.69 | 29.22 | 28.33 | 23.35 | 27.58 | 21.15 | 22.58 | 30.37 | 27.22 |
| 2 | 40.55 | 23.30 | 27.12 | 29.95 | 28.82 | 23.88 | 28.12 | 21.62 | 23.11 | 30.98 | 27.75 |
| 3 | 40.86 | 23.60 | 27.34 | 30.22 | 29.16 | 24.01 | 28.30 | 21.78 | 23.41 | 31.25 | 27.99 |
| 4 | 40.94 | 23.79 | 27.69 | 30.51 | 29.37 | 24.12 | 28.46 | 21.99 | 23.53 | 31.46 | 28.19 |
| 5 | 41.00 | 23.92 | 27.82 | 30.64 | 29.51 | 24.23 | 28.67 | 22.04 | 23.60 | 31.72 | 28.32 |
Table 6. Packet crossover ratios with a 50% chaff rate. The intrusion threshold crossover ratio is 33.44.

| # of Conn | DS-1 | DS-2 | DS-3 | DS-4 | DS-5 | DS-6 | DS-7 | DS-8 | DS-9 | DS-10 | AVG |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 48.05 | 27.45 | 31.97 | 35.08 | 34.04 | 28.08 | 33.03 | 25.37 | 27.07 | 36.42 | 32.66 |
| 2 | 48.55 | 27.90 | 32.43 | 35.78 | 34.57 | 28.54 | 33.55 | 25.75 | 27.61 | 37.07 | 33.18 |
| 3 | 48.82 | 28.16 | 32.72 | 36.03 | 34.72 | 28.76 | 33.82 | 26.04 | 28.01 | 37.33 | 33.44 |
| 4 | 48.94 | 28.30 | 33.05 | 36.29 | 35.07 | 28.77 | 34.09 | 26.16 | 28.04 | 37.61 | 33.63 |
| 5 | 49.01 | 28.43 | 33.22 | 36.50 | 35.21 | 29.04 | 34.24 | 26.23 | 28.20 | 37.66 | 33.77 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, L.; Yang, J.; Kim, J.; Wan, P.-J. An Effective Approach for Stepping-Stone Intrusion Detection Resistant to Intruders’ Chaff-Perturbation via Packet Crossover. Electronics 2023, 12, 3855. https://doi.org/10.3390/electronics12183855
