1. Introduction
To avoid being detected or captured, most intruders exploit compromised computer hosts to attack their intended victims. We call these compromised hosts stepping-stones [1]. Most attackers establish a long connection chain with more than three stepping-stones to better protect themselves when launching attacks. The more stepping-stones used, the harder it is to capture the attacker. Such attacks are called stepping-stone intrusions.
Stepping-stone intruders are especially hard to track or capture because they launch their attacks through a long TCP/IP session. An early approach to detecting such an intrusion, developed in 2000, was to decide whether a host is used as a stepping-stone or to estimate the number of compromised hosts. Many real-world applications use stepping-stones legitimately; thus, flagging an intrusion merely because a host has been used as a stepping-stone may produce a false positive. Estimating the number of compromised hosts reduces the likelihood of a false positive since there are almost no legitimate uses for a connection chain with more than three hosts. Nevertheless, because of its simplicity and reliability, many methods are still being proposed that detect stepping-stone intrusion by determining whether a host is used as a stepping-stone.
The first method to detect stepping-stone intrusion by determining whether a host is used as a stepping-stone was described by S. Staniford-Chen and L. T. Heberlein in 1995 [2]. This method was created even before the stepping-stone concept was formally proposed. In [2], the contents of the network traffic on an incoming connection and an outgoing connection are compared. If the two connections carry the same traffic contents, the two sessions are treated as relayed. Such a relayed connection pair indicates that the host is used as a stepping-stone. However, this initial algorithm cannot be applied to encrypted network sessions, which have been widely used since 2000.
Y. Zhang and V. Paxson developed a time-based thumbprint method [1] to detect stepping-stone intrusion. This method compares the time thumbprint of an incoming connection with that of an outgoing connection. It does not require viewing the contents of a packet and is therefore unaffected by encryption. Instead of using the contents of a series of packets, the time-based thumbprint compares the ON and OFF time gaps between the packets collected from an incoming connection with those collected from an outgoing connection. When an interactive network connection is monitored, there are periods during which no traffic flows through the session; such a period is called an ‘OFF’ time gap. Conversely, a period during which packets are flowing through the connection is called an ‘ON’ gap. Monitoring a network connection continuously yields an ON–OFF gap sequence, which uniquely identifies the interactive network connection. Detecting stepping-stone intrusion therefore reduces to comparing two such sequences, without any need to view packet contents, so this approach [1] can be applied to an encrypted interactive network connection. K. Yoda and H. Etoh [3] also proposed a similar idea to detect stepping-stone intrusion in 2000. Rather than using a time-based thumbprint, their algorithm computes the deviation between two interactive network sessions. A small deviation indicates a high likelihood that the two sessions are relayed, and a relayed session pair indicates a high possibility that the host is used as a stepping-stone. The deviation [3] can be calculated using only the header information of TCP/IP packets. Since the header is not encrypted, this method is also applicable to an encrypted network session.
We found that the primary issue with the above approaches is that the time-based thumbprint and the deviation between two network sessions can be garbled by intruders’ session manipulation, such as chaff perturbation and/or time-jittering. Chaff perturbation is a concealment method in which attackers insert trivial packets into a regular TCP/IP connection to make two relayed sessions appear unrelated. Chaff perturbation is discussed in detail in Section 3.
Research by D. L. Donoho et al. [4] reveals that attackers cannot disguise their network sessions without limit: their ability to manipulate a live interactive session is capped. A. Blum, D. Song, and S. Venkataraman developed an algorithm [5] that uses TCP/IP packet counts to detect stepping-stone intrusion by checking the difference in the number of packets traveling over two connections. If the two network sessions are relayed, that difference in packet count is bounded with high probability. This approach tends to fail against attackers’ chaff attacks because the number of chaff packets necessary to evade detection is relatively small.
T. He and L. Tong [6] developed an approach called Detect Bounded Memory Chaff to detect stepping-stone intrusion in 2007. They claimed that their algorithm can resist intruders’ chaff evasion proportional to the size of the network traffic. Their proof and experimental results show that, assuming the packet delay is bounded by $\Delta$, in every $n$ network packets an attacker needs to inject at least $n/(1 + \lambda\Delta)$ packets into the connection to evade detection, where $\lambda$ is the parameter of a Poisson distribution. The smaller $\Delta$ is, the better the algorithm resists intruders’ chaff attack. However, a higher false positive detection error might be introduced.
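The bound quoted above can be evaluated numerically. The following is a minimal sketch, not the algorithm of [6]: the function name `min_chaff_packets`, the rounding up to a whole packet, and the units (λ in packets per second, Δ in seconds) are assumptions made only for illustration.

```python
import math


def min_chaff_packets(n: int, lam: float, delta: float) -> int:
    """Evaluate n / (1 + lam * delta), rounded up to a whole packet.

    n     -- number of packets in the window
    lam   -- Poisson rate parameter (assumed unit: packets per second)
    delta -- bound on the packet delay (assumed unit: seconds)
    """
    if n < 0 or lam < 0 or delta < 0:
        raise ValueError("n, lam and delta must be non-negative")
    return math.ceil(n / (1 + lam * delta))


if __name__ == "__main__":
    # Illustrative values only: shrinking delta raises the chaff an attacker needs.
    for delta in (2.0, 0.5, 0.0):
        print(delta, min_chaff_packets(1000, 2.0, delta))
```

With these illustrative values, a smaller delay bound forces the attacker to inject more chaff per 1000 packets, which matches the trend described in the text.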
K. H. Yung [7] developed an algorithm to detect stepping-stone intrusion by roughly estimating the length of a connection chain built by attackers. The longer a connection chain, the higher the possibility that the chain was built by an attacker, because it does not make sense to use a long connection chain to access a host if a shorter one exists. K. H. Yung’s approach uses packet round-trip time (RTT) and needs to match network traffic to estimate the length of a connection chain. Unfortunately, the packet matching approach in [7] cannot make correct packet matches for a long connection chain. J. Yang developed approaches to match network traffic [8] that work for a long connection chain.
In 2021, Dr. L. Wang et al. proposed using k-means clustering to mine network packets and estimate the length of a connection chain. This approach overcomes a key issue of the existing approaches for SSID (Stepping-Stone Intrusion Detection) that estimate the length of a connection chain, namely that a large number of TCP packets must be captured and processed to make an effective detection. The algorithm proposed in [9] can accurately determine the length of a connection chain with a smaller number of packets. However, this method loses effectiveness if there are many outliers in the packets’ RTTs.
In [10], Dr. L. Wang et al. proposed an approach that uses packet crossover to estimate the downstream length of a connection chain. Packet crossover is a phenomenon in which a Send (request) packet meets the Echo (response) packet of a previous Send packet while traveling along the connection chain between a client host and a server host. The paper proved that the length of a downstream connection chain strictly increases as the ratio of packet crossover increases. Dr. L. Wang et al. also proposed a framework to test the resistance of stepping-stone intrusion detection algorithms against time-jittering session attacks [11].
In 2021, Dr. J. Yang et al. proposed an algorithm that uses encrypted packets to detect stepping-stone intrusion [12]. To apply this approach to the encrypted packets of a host’s incoming and outgoing connections, Dr. J. Yang et al. assumed that the two connections use the same encryption algorithm, which may not always be true. Dr. J. Yang improved the algorithm published in [12] to make it accurate regardless of whether the encryption algorithms of the incoming and outgoing connections are the same. The initial result was published in the proceedings of the 36th International Conference on Advanced Information Networking and Applications, Sydney, Australia, April 2022 [13].
In this paper, we propose a new approach that exploits packet cross-matching and models the RTTs of network packets as a random walk to detect stepping-stone intrusion and to resist intruders’ chaff evasion attack at a higher rate than existing approaches. This paper is an extended version of the conference paper [14]. The idea was initially motivated by the approach proposed in [15], which uses an RTT-based one-dimensional random walk to detect stepping-stone intrusions. In this paper, we make three significant improvements over the conference paper [14]. The first is to adopt a new packet matching algorithm published in 2021 [16]. The second is to make the cross-matching algorithm more robust and efficient by using the ratio between the number of RTT changes and the total number of RTTs, instead of an absolute number of RTTs, as the critical factor in determining whether an intrusion is occurring. The third is to provide more experimental results, obtained on Amazon AWS cloud infrastructure, to verify the improved performance of the new algorithm, especially its capacity to resist intruders’ chaff evasion attack.
In [14], the experiments were conducted over a short connection of three hosts. In each connection, we collected two types of packets: Send and Echo. We placed the collected Send packets into an array called the Send stream. Similarly, an Echo stream contained all the Echo packets from a connection. The idea of cross-matching is to match TCP/IP packets not only between the streams of the same connection but also between the streams of different connections. The difference between the number of RTTs matched from the same connections and the number of RTTs matched from different connections are the two dimensions in the random walk.
The rest of this paper is organized as follows. Section 2 describes a way to model and match computer network traffic. Section 3 discusses the stepping-stone chaff evasion attack. Section 4 presents the RTT-based random walk. Section 5 describes packet cross-matching. Section 6 presents the detection algorithm using a two-dimensional random walk based on packet cross-matching. Section 7 analyzes how this algorithm can detect stepping-stone intrusion and resist intruders’ chaff attacks. In Section 8, we present experimental results to justify the performance of the proposed algorithm. In Section 9, we conclude the paper and discuss some future work.
  2. Modelling and Matching Network Traffic
Network communication traffic consists of the interactions between the requests from a client and the responses from the corresponding server. To simplify our description and clarify the packet matching process, we model network communication as Send, Echo, and Ack packets in this research.
  2.1. Send and Echo Definition
A TCP/IP packet has a one-byte Flag field in its header with six bits in the order “UAPRSF” occupying bits 10 to 15. These bits indicate the packet type, and two other bits (bits 8 and 9) are used for TCP congestion control. If bit 11 (“A”) is set, the packet is an Acknowledgement packet. If bit 12 (“P”) is set, the packet is being used for the ‘Push’ function, which pushes the buffered data to the receiving application. A packet with its “P” flag set must carry a data payload. A packet with only its “A” flag set signals to the opposite side that the data sent out has been received correctly, and it does not carry any data payload. A packet with both the “A” and “P” flags set plays two roles: it carries a data payload and also acknowledges the opposite side.
In a client–server communication model, we assume that Host A (client) communicates with Host B (server) via a TCP connection. Hosts A and B can each act as both a sender and a receiver. If a TCP packet has the “P” flag set and is sent from the client to the server, the packet is defined as a Send packet; conversely, if it is sent from the server to the client, it is defined as an Echo packet. Similarly, an Ack is a TCP packet sent from either the server or the client with the “A” flag set. Intuitively, a Send packet is a request made by a client, and an Echo packet is a server’s response to that request. In general, a request fits into one packet, but the response to one request may be too large to fit into one packet. A Send or an Echo packet may also act as an acknowledgment packet.
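The Send/Echo/Ack definitions above can be sketched as a small classifier. This is an illustrative sketch only: the function name `classify_packet`, the `from_client` argument, and the "Other" label are assumptions, and the masks are the standard TCP flag-byte values for PSH (0x08) and ACK (0x10).

```python
# Standard TCP flag-byte masks (low six bits are U A P R S F).
PSH = 0x08
ACK = 0x10


def classify_packet(flags: int, from_client: bool) -> str:
    """Label a TCP packet as Send, Echo, Ack, or Other.

    flags       -- the TCP flag byte of the packet
    from_client -- True if the packet travels from client to server
    """
    if flags & PSH:
        # A pushed packet carries data: a request from the client side,
        # a response from the server side. It may also carry an ACK.
        return "Send" if from_client else "Echo"
    if flags & ACK:
        # Pure acknowledgment in either direction, no payload.
        return "Ack"
    return "Other"  # e.g. a bare SYN; hypothetical label, not from the text


if __name__ == "__main__":
    print(classify_packet(PSH | ACK, from_client=True))   # Send
    print(classify_packet(PSH | ACK, from_client=False))  # Echo
    print(classify_packet(ACK, from_client=True))         # Ack
```

Checking the “P” flag first reflects the remark that a Send or an Echo packet may also act as an acknowledgment.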
  2.2. Packet Matching and RTT
A Send, also called a request packet, from the client side may cause one or more Echo packets at the server side. Packet matching is the process of finding the corresponding Echo packet for a given Send. The reason to study packet matching is that the time gap between two matched packets reflects the length of the connection chain between a client and the server. We call this gap the Round-Trip Time (RTT) between a client and its corresponding server. Notably, the RTT here is defined over a connection chain between two communicating hosts that may span multiple hosts (the stepping-stones).
If a Send packet causes only one Echo packet, it is trivial to match the two packets and compute the RTT between them. If a Send packet causes multiple Echo packets, the Send packet can be matched with the first Echo packet to compute the RTT. The difficulty arises when multiple Sends each result in multiple Echo packets: is it still possible to match the Sends with the Echoes correctly? Packet matching approaches are discussed in Section 5, “Packet Cross-Matching”.
  2.3. RTT Distribution
Matching a Send packet with its Echo packets is not trivial, especially when multiple Send packets are echoed by multiple Echo packets. Since our goal in matching TCP packets is to calculate the RTT between a matched Send and Echo, it is not necessary to match a Send packet with all its Echoes. Instead, we just compute the RTT of the first matched packets. J. Yang et al. [8] proposed an algorithm to find RTTs without directly matching the packets. This algorithm takes advantage of the RTT distribution to obtain the RTTs of TCP packets.
A packet’s RTT can be treated as the sum of four network delays: processing delay, queuing delay, transmission delay, and propagation delay [17]. Though there may be many hops/connections between a source host and its destination, for simplicity we can model the delay of a network connection as a queue. If we use $T(t)$ to represent the RTT in a TCP/IP connection, it can be expressed as Equation (1):

$$T(t) = T_0 + \Delta T(t) \qquad (1)$$

where $T_0$ is a constant and $\Delta T(t)$ is a variation. The constant part is primarily from propagation delay, and the varying part $\Delta T(t)$ is from network queuing delay [18]. We apply the M/M/1 queuing model to the RTT queue [19] and obtain the distribution of $\Delta T(t)$. This distribution is modeled in Equation (2):

$$P(\Delta T(t) \le x) = 1 - e^{-\mu_i (1 - \rho_i) x} \qquad (2)$$

where $\mu_i$ is the service rate on link $i$ and $\rho_i$ is the corresponding utilization factor.
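As a numerical illustration of the M/M/1 model, the sketch below evaluates an exponential delay distribution for a single link. It assumes the standard M/M/1 sojourn-time form $1 - e^{-\mu(1-\rho)x}$; the function name `rtt_variation_cdf` and the parameter values are hypothetical and are not taken from the paper.

```python
import math


def rtt_variation_cdf(x: float, mu: float, rho: float) -> float:
    """Probability that the queuing delay on one link is at most x.

    Assumes the standard M/M/1 sojourn-time distribution:
        P(delay <= x) = 1 - exp(-mu * (1 - rho) * x)
    mu  -- service rate of the link (packets per unit time)
    rho -- utilization factor, 0 <= rho < 1
    """
    if mu <= 0 or not 0 <= rho < 1:
        raise ValueError("need mu > 0 and 0 <= rho < 1")
    if x < 0:
        return 0.0
    return 1.0 - math.exp(-mu * (1.0 - rho) * x)


if __name__ == "__main__":
    # Hypothetical link: 10 packets per ms service rate, 50% utilization.
    for x in (0.1, 0.2, 0.5, 1.0):
        print(x, round(rtt_variation_cdf(x, mu=10.0, rho=0.5), 4))
```

Under this assumption, a more heavily utilized link (larger ρ) shifts probability mass toward longer delays, which is why the varying part of the RTT is attributed to queuing.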
From Equation (2), we know that the variation of RTTs over a single network connection can be modeled as an exponential distribution. Because of the cumulative delay across all the hosts between the client and the server, however, it is hard to model the variation of RTTs over a TCP connection chain as an exponential distribution. From the research results in [8], we found that the occurrences of the RTT variation $\Delta T(t)$ for a TCP/IP connection chain can be modeled as a Poisson distribution [17,18]. We adopt a Poisson distribution to model the variation of RTTs in this paper.
If the variation of RTTs follows a Poisson distribution, a data mining-clustering approach can be used to match network traffic [8]. One important feature of the Poisson distribution tells us that more than 90% of the RTTs are distributed around its mean within one standard deviation distance. This can be described by the inequality $|RTT - \mu| \le \sigma$. Here, $\mu$ is the mean value and $\sigma$ is the standard deviation of the Poisson distribution.
  3. Chaff Attack Definition and Implementation
  3.1. Chaff Definition
A chaff attack is a widespread attack used by intruders in recent years. Many existing SSID approaches can be defeated by chaff attacks. In this section, we explain the concept of a chaff attack and how it lowers the performance of stepping-stone intrusion detection algorithms and potentially renders them useless. The chaffing process injects packets into a live TCP/IP interactive session. It exploits a packet forging or spoofing technique and interferes with an established live TCP session. The chaff technique is commonly used in man-in-the-middle, denial-of-service, and/or stepping-stone intrusion attacks. Chaffing can be performed through the following procedure: create a raw socket; make an Ethernet, IP, and TCP header; obtain empty data for the stepping-stone intrusion; assemble the header and data to generate a chaff packet; and deliver the chaff packet through the raw socket. Some existing tools can make chaff packets easily. These tools include Packit, hping, and Ettercap. In the following section, we discuss the tool Packit in detail.
  3.2. Packit
One popular tool to make and inject TCP/IP packets into a network connection is Packit [20]. Using this tool, intruders can make essentially any type of packet, including ARP, IP, TCP, UDP, and ICMP packets and Ethernet headers. Typically, attackers need to manually define TCP and Ethernet header values to evade SSID systems. This tool allows intruders to specify the injected network traffic information, including the type of packet, the total number of packets, the number of seconds to wait between each injected packet, the injection rate, the network interface to be injected into, the data payload, and the length of the packets to inject. Here is an example to show how to inject packets into a network connection. We could insert 70 TCP/IP packets at a rate of 100 packets per second from 172.24.210.19 on port 521 to 164.27.191.25 on port 80 with the SYN and RST flags set. We assume the sequence number is 432,719,088 and the source Ethernet address is DD:44:A0:4B:8D:11. The corresponding Packit command would be “packit -s 172.24.210.19 -d 164.27.191.25 -S 521 -D 80 -F SR -q 432719088 -c 70 -b 10 -e DD:44:A0:4B:8D:11”.
  3.3. Chaff Tool Developed Using C#
We developed a C# program to make and inject TCP/IP packets. We first validate the source and destination ports entered by the user. The easiest way to validate a port is to check that it lies between 1 and 65,535 (port 0 is excluded).
A TCP packet can be built from the Datalink layer up to the Application layer. At the Datalink layer, the MAC (Media Access Control) address of the gateway is obtained so that packets can be sent outside of a LAN (Local Area Network). We also need to confirm that the MAC address of an active network interface is obtained and valid. Next, we construct the packet at the Network layer in IPv4 format. From the user’s input, it is trivial to get the source and destination IP addresses. We can obtain the header checksum based on the given header information. We assume no fragmentation is required. If the identification number of a packet is not given by the user, it is empty by default. The transport-layer protocol field is set automatically. All other fields, including “Options”, “TTL”, and “TypeOfService”, are set according to the user’s input. However, the TTL value must be validated: a legal value cannot be more than 255.
We set up the Transport layer after completing the Network layer. In the Transport layer, we use the user’s input to set the source and destination port numbers, respectively. The Transport layer header checksum is set automatically. The user can choose the packet sequence and acknowledgment numbers; otherwise, the system sets the default sequence number to 100 and the default acknowledgment number to 50. The TCP packet flags, including U, R, S, P, A, and F, are decided by user input. If no user input is detected, the available window size is set to 100 by default. Finally, we need to set up the Application layer to make the payload. In this layer, the user decides the message and/or data to send across the network. This is the payload part of a packet. In our system, a textbox is provided for the user to enter the message. After completing all the above steps, we come to the last step, building the packet. We use the “PacketBuilder” class to assemble the above parts of the packet into a cohesive whole. We also need to select a network device to set up the interface through which packets are sent.
In the last stage, we use the communicator we built to send the packet over the network. We use the method in the “PacketBuilder” class to build our packet, then send the packet out using the “SendPacket” method of the “PacketCommunicator” class. This concludes our discussion of how to chaff a network connection with C#.
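The input checks and defaults described in this section can be summarized in a short sketch. It is written in Python rather than C# and is not the authors’ tool: the class name `TcpFieldInput`, its field names, and the lower TTL bound of 0 are assumptions, while the port range, the TTL limit of 255, and the defaults (sequence number 100, acknowledgment number 50, window size 100) come from the text. The sketch only validates values; it does not build or send any packet.

```python
from dataclasses import dataclass
from typing import List

MAX_PORT = 65535
MAX_TTL = 255


def is_valid_port(port: int) -> bool:
    """A port is valid if it lies between 1 and 65,535 (0 is excluded)."""
    return 1 <= port <= MAX_PORT


def is_valid_ttl(ttl: int) -> bool:
    """TTL is an 8-bit field, so it cannot exceed 255 (lower bound 0 assumed)."""
    return 0 <= ttl <= MAX_TTL


@dataclass
class TcpFieldInput:
    """User-supplied header values with the defaults described in the text."""
    src_port: int
    dst_port: int
    ttl: int
    seq: int = 100     # default sequence number
    ack: int = 50      # default acknowledgment number
    window: int = 100  # default window size

    def errors(self) -> List[str]:
        """Return a list of validation problems (empty if the input is fine)."""
        problems = []
        if not is_valid_port(self.src_port):
            problems.append("src_port out of range")
        if not is_valid_port(self.dst_port):
            problems.append("dst_port out of range")
        if not is_valid_ttl(self.ttl):
            problems.append("ttl out of range")
        return problems


if __name__ == "__main__":
    print(TcpFieldInput(src_port=521, dst_port=80, ttl=64).errors())
    print(TcpFieldInput(src_port=0, dst_port=80, ttl=300).errors())
```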
  3.4. Chaff Effect on RTT-Based Random-Walk
From Section 3.2 and Section 3.3, we know that it is not hard for attackers to chaff a network connection. This chaff attack can defeat most SSID algorithms. We examine one detection method, the ON–OFF time thumbprint [1], as an example to demonstrate how detection is defeated. We monitor a host and collect five Send and Echo packets from the incoming connection and the outgoing connection of the host, respectively. We assume the following packet sequences are obtained: Pin = {27, 52, 61, 80, 92} and Pout = {27, 52, 62, 81, 92}. Each value in a sequence represents the timestamp of a captured packet, either a Send or an Echo. We obtain two ON–OFF thumbprint sequences, Tin = {25, 9, 19, 12} and Tout = {25, 10, 19, 11}, where “25, 19, 11, 12” represent the “OFF” times of the connections and “9, 10” represent the “ON” times. It is trivial to conclude that the two thumbprints match closely, so the two connections are relayed. Suppose we chaff the outgoing connection with five TCP packets using Packit so that Pout = {27, 37, 42, 52, 57, 62, 81, 109}. The corresponding outgoing time thumbprint becomes Tout = {35 (ON), 45 (OFF)}, assuming the threshold is 10. After the chaff packets are sent over the outgoing connection, it is hard to tell whether the two connections are still relayed. The chaff attack thus uses packet injection to evade an SSID system. The algorithm proposed in this paper can resist intruders’ chaff evasion attack.
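The gap sequences of the unchaffed example can be reproduced with a few lines. This sketch is only an illustration of the gap computation, not the thumbprint algorithm of [1]; the function names are hypothetical, and the rule that a gap of at most the threshold counts as “ON” is inferred from the example, where a gap of 10 is labeled ON with a threshold of 10.

```python
from typing import List, Tuple


def time_gaps(timestamps: List[int]) -> List[int]:
    """Differences between consecutive packet timestamps."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def label_gaps(gaps: List[int], threshold: int = 10) -> List[Tuple[int, str]]:
    """Label each gap ON (gap <= threshold) or OFF (gap > threshold)."""
    return [(g, "ON" if g <= threshold else "OFF") for g in gaps]


if __name__ == "__main__":
    p_in = [27, 52, 61, 80, 92]
    p_out = [27, 52, 62, 81, 92]
    print(time_gaps(p_in))                # [25, 9, 19, 12]
    print(time_gaps(p_out))               # [25, 10, 19, 11]
    print(label_gaps(time_gaps(p_out)))
```

The two gap sequences differ by at most one time unit in each position, which is why the unchaffed connections are judged to be relayed.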
  5. Packet Cross-Matching
  5.1. Packet Matching
The methods that utilize packet matching to detect stepping-stone intrusion not only have a greatly reduced false positive error but also resist intruders’ evasion through traffic manipulation. Packet matching is the process of pairing the requests sent over a TCP connection by the sender with the corresponding responses from the receiver. The RTT of the matched packets reflects the length of the connection chain from sender to receiver. The longer a connection chain is, the higher the probability that the stepping-stone intrusion is detected.
Packet matching techniques have been extensively studied since 2002 [7]. The easiest way to match TCP packets is by using packet sequence numbers. Each TCP packet has a sequence number in its header that gives the number of the first byte in its payload; a TCP payload is numbered in bytes. Together with the payload length, which the recipient derives from the total length field of the IP header, the sequence number identifies the last byte carried by the packet. If a packet is sent out with sequence number 100 and length 20, the packet carries 20 bytes of data numbered from 100 to 119. If the packet is correctly received, the Echo packet’s acknowledgment number will be 120, which indicates that all 20 bytes have been correctly received. Examining a Send packet’s sequence number and an Echo packet’s acknowledgment number can therefore determine whether the two packets are matched. This is only reliable in a local area network, where each request is received, processed, and echoed in time. The Internet is far more complex than a local area network for packet matching.
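The sequence-number rule above reduces to a one-line check. The sketch below is a minimal illustration with hypothetical function names; the modulo handles wraparound of the 32-bit TCP sequence space, which the example in the text does not need but real traffic may.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def expected_ack(send_seq: int, payload_len: int) -> int:
    """Acknowledgment number that confirms every byte of the Send packet."""
    return (send_seq + payload_len) % SEQ_SPACE


def is_matched(send_seq: int, payload_len: int, echo_ack: int) -> bool:
    """True if the Echo acknowledges exactly the bytes carried by the Send."""
    return echo_ack == expected_ack(send_seq, payload_len)


if __name__ == "__main__":
    # The example from the text: sequence number 100, 20 bytes of payload.
    print(expected_ack(100, 20))     # 120
    print(is_matched(100, 20, 120))  # True
    print(is_matched(100, 20, 119))  # False
```

As the text notes, this exact-match test is only dependable on a local area network; cumulative acknowledgment and retransmission on the Internet break the one-to-one correspondence.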
On the Internet, packets are matched by using RTTs rather than packet sequence and acknowledgment numbers. The time gap between matched packets reflects the length of an extended connection chain. If we know a packet’s RTT, we can obtain the matched packet pair using that RTT. The Clustering-Partitioning data mining algorithm [8] is an approach that matches TCP/IP packets by taking advantage of the distribution of packets’ RTTs.
As we have discussed, the purpose of matching TCP/IP network traffic is to compute their RTTs. The RTT of a network packet can represent the length of a connection chain. Using RTTs helps us predict the length of a connection chain, and the number of hosts compromised as stepping-stones in that chain. Matching packets is not trivial, especially in the context of the Internet. There are many challenges in matching TCP/IP packets over the Internet.
Matching packets on the Internet is harder than in a local area network. The primary reason lies in the design of the TCP/IP protocol. DARPA designed the TCP/IP protocol and first used it in ARPANET in the 1970s [21]. Security was not an important concern of the design at that time; most design effort was focused on network communication efficiency. Knowing how TCP/IP works is crucial for understanding packet matching and its challenges. Before moving to packet matching, we first discuss the design issues of the TCP/IP protocol.
TCP/IP is a protocol suite containing many different protocols. TCP is the most important one in the protocol family. TCP is a connection-oriented protocol, and it regulates how to deliver messages reliably from a source host to its destination. TCP only defines the communication rules between two nodes; each node can be either a host or a router. TCP supports many useful features to make network communication more efficient. These features include pipelining, the three-way handshake, cumulative acknowledgment, congestion control, and flow control. A pipelined protocol can result in the packet crossover issue: because pipelining allows a request to be sent out before the acknowledgment of its previous request arrives, a request that is still on its way may be passed by the response to its previous request. A pipelined protocol achieves higher network utilization, but it complicates packet matching because of packet crossover. Cumulative acknowledgment can also complicate packet matching, mainly because multiple requests received by a recipient can be acknowledged and echoed in a single response packet. Another factor affecting TCP packet matching is packet resending. To implement reliable delivery, TCP checks each received packet for errors and for out-of-order arrival. If any such error or incorrect ordering is detected, the packet must be resent by the sender. In addition, if the sender does not receive the acknowledgment within a predetermined time after a request packet is sent out, the request packet is resent automatically. Resending a packet that was in fact already received results in duplicates of the same packet at the receiver side. All these factors make TCP packet matching a complex process.
Another factor affecting packet matching is packet loss. A Send, Echo, or Ack may be lost during network communication for many unknown reasons. As mentioned before, a lost Send can be resent. Because of the cumulative acknowledgment feature of TCP, a lost acknowledgment or Echo packet might be ignored if the sender-side timer does not expire, whereas a lost request packet must be resent after its timer expires. Because matching TCP packets from Internet communication is so difficult, packets’ RTTs can instead be estimated using the RTT distribution. As discussed before, a data mining and clustering approach [8] can match Send and Echo packets in a network connection chain by making use of the RTT distribution.
  5.2. Packet Cross-Matching
As shown in Figure 1, host $h_i$ is used as a stepping-stone, with an incoming connection and a relayed outgoing connection. We use $S_i$ and $E_i$ to represent the Send and Echo packets collected from the incoming connection, respectively, and $S_j$ and $E_j$ for those collected from the outgoing connection. The connection chain from the intruder’s host to $h_i$ is called the upstream connection chain of $h_i$, and the one from $h_i$ to the victim’s host is called the downstream connection chain. We assume the host immediately before $h_i$ along the upstream link is $h_{i-1}$, and the one immediately after $h_i$ along the downstream link is $h_{i+1}$.
Traditionally, we only apply packet matching to the packets collected from either the incoming connection or the outgoing connection; that is, we match the packets $S_j$ with $E_j$, or $S_i$ with $E_i$. Since $S_i$ can be chaffed from the host $h_{i-1}$ and $E_i$ can be chaffed from the host $h_i$, carefully inserted chaff packets will affect the number of RTTs and thereby defeat the RTT-based random-walk detection approach. Similarly, $E_j$ can be chaffed from the downstream host $h_{i+1}$, and $S_j$ can be chaffed from the host $h_i$. To deal with intruders’ chaff manipulation and make connection manipulation harder, we propose the Packet Cross-Matching algorithm.
As shown in Figure 1, the previous packet matching approaches focus on matching Send and Echo packets within either the incoming connection or the outgoing connection; that is, matching $S_i$ with $E_i$ as well as $S_j$ with $E_j$. If either $S_i$/$S_j$ or $E_i$/$E_j$ are chaffed, the corresponding matching results may be affected. Packet Cross-Matching additionally matches the Sends in the incoming connection with the Echoes in the outgoing connection, and the Send packets in the outgoing connection with the Echo packets in the incoming connection. Suppose Send and Echo packets are collected from the incoming and outgoing connections of a host and are put into four queues, $S_i$, $E_i$, $S_j$, and $E_j$, respectively.
Applying Cross-Matching, the Send packets in $S_i$ can match not only with the Echo packets in $E_i$ but also with the packets in $E_j$. Similarly, the Sends in $S_j$ can match with the Echoes in $E_j$ as well as with the packets in $E_i$. In this paper, we elect to use the improved clustering-partitioning approach proposed in [16] to match network traffic. We can obtain four RTT queues from a monitored host, one for each of the Send/Echo pairings ($S_i$, $E_i$), ($S_i$, $E_j$), ($S_j$, $E_j$), and ($S_j$, $E_i$).
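The four pairings can be illustrated with a deliberately simplified sketch. It does not implement the clustering-partitioning matcher of [16]; instead it pairs each Send timestamp with the first unused Echo timestamp at or after it, which assumes no packet crossover, loss, or chaff. All names and the sample timestamps are hypothetical.

```python
from typing import Dict, List, Tuple


def first_echo_rtts(sends: List[int], echoes: List[int]) -> List[int]:
    """Naive matching: each Send takes the first unused Echo at or after it.

    Both lists must be sorted timestamps. This is a stand-in for a real
    matcher and is only valid when requests and responses do not overlap.
    """
    rtts = []
    j = 0
    for s in sends:
        # Skip Echoes that precede this Send (extra Echoes of earlier Sends).
        while j < len(echoes) and echoes[j] < s:
            j += 1
        if j == len(echoes):
            break  # no Echo left for this or any later Send
        rtts.append(echoes[j] - s)
        j += 1
    return rtts


def cross_match(s_i: List[int], e_i: List[int],
                s_j: List[int], e_j: List[int]) -> Dict[Tuple[str, str], List[int]]:
    """Build the four RTT queues: two same-connection, two cross-connection."""
    return {
        ("S_i", "E_i"): first_echo_rtts(s_i, e_i),  # incoming only
        ("S_i", "E_j"): first_echo_rtts(s_i, e_j),  # cross
        ("S_j", "E_j"): first_echo_rtts(s_j, e_j),  # outgoing only
        ("S_j", "E_i"): first_echo_rtts(s_j, e_i),  # cross
    }


if __name__ == "__main__":
    # Hypothetical timestamps (ms) for a relayed pair of connections.
    queues = cross_match(s_i=[0, 100], e_i=[40, 140], s_j=[5, 105], e_j=[35, 135])
    for pairing, rtts in queues.items():
        print(pairing, rtts)
```

In this toy relayed pair, the cross-connection RTTs stay stable and lie between the outgoing-only and incoming-only RTTs, which is the kind of consistency that cross-matching is meant to exploit.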
  6. Detection Algorithm
  6.1. Modelling the Problem
Our research goal is to determine whether a host is used for stepping-stone intrusion from the network traffic on its incoming and outgoing connections. As we discussed, most existing research focuses on determining whether there exists a relayed incoming and outgoing connection pair. Unfortunately, chaff perturbation can easily defeat those approaches, even though some researchers simply assumed that chaff perturbation only happens in one stream (either the Send packet stream or the Echo packet stream) of a connection. We know, however, that with the help of manipulation tools, intruders can easily inject meaningless packets into any stream of any connection concurrently. Therefore, it is necessary to develop a novel approach to defeat intruders’ evasion attacks, especially chaff perturbation. In this research, we assume that intruders can only inject chaff into each host in a connection chain independently. In other words, intruders cannot coordinate chaff injections on two or more consecutive hosts in a connection chain.
As we discussed in Section III (Stepping-stone Chaff Attack), an intruder can inject meaningless Send and Echo packets into a connection by specifying the source and destination IP addresses and port numbers. The injected packets can be observed only at the source host and the destination host of the injection, not at any other stepping-stone. This can be explained via the scenario shown in Figure 2.
In Figure 2, we assume that an intruder makes a connection chain using OpenSSH from the Intruder’s host to the Victim’s host through the stepping-stone hosts h1, h2, h3, and h4. Host h2 is used as a sensor, where a detection program collects packets and decides whether an intrusion exists. If the intruder injects packets at the outgoing connection of the sensor (host h2) by specifying a source port at h2, the IP address of h2, the destination IP address of the Victim host, and port 22 as the SSH server port, the injected packets can be captured at the sensor and at the Victim host, but not at hosts h3 and h4. In other words, injected packets do not travel along the connection chain. If injected packets could be processed and replied to, the echoed packets would return to the sensor directly, not via the connection between h2 and h3. If the injected packets are matched, their RTTs may not be similar to those from the connection chain, so the packet-matching approach can easily filter out the injected packets.
Injected packets are not allowed to arrive at the Victim host since they are meaningless and cannot be executed. Most intruders just inject packets from a port at a sensor but have the chaff packets dropped immediately at the destination host. Intruders can inject packets into the packet streams Si and Ei, as well as Sj and Ej, as shown in Figure 1. This would complicate stepping-stone intrusion detection and defeat most existing detection methods. We propose the Cross-Matching RTT-based random walk SSID algorithm to defeat this type of chaff perturbation attack.
  6.2. Detection Algorithm
As shown in Figure 1, the host hi plays the role of stepping-stone. The host’s incoming and outgoing connections are denoted as Cin and Cout, respectively. The packets collected from Cin can be divided into the Send packet stream Si and the Echo stream Ei. Similarly, Sj and Ej are the two packet streams of the outgoing connection. The incoming and outgoing connections of a stepping-stone host should be relayed. However, intruders’ chaff manipulation may break this relationship.
In our previous study, we matched Send and Echo packets within the same connection to detect stepping-stone intrusion, and found that this approach is easily defeated by intruders’ chaff manipulation. Our recent study has shown that cross-matching, that is, matching packets not only within the same connection but also across connections (for example, packets from the incoming connection with packets from the outgoing connection), resists intruders’ chaff attacks better.
As Figure 1 shows, the host hi is monitored, and Send and Echo packets from its incoming connection Cin and outgoing connection Cout are collected, respectively. We put the collected Sends and Echoes in four queues: the Send queue Si and the Echo queue Ei from Cin, and Sj and Ej from Cout.
        
We match the Sends in queue Si with the Echoes in queues Ei and Ej, respectively, using the packet-matching algorithm proposed in [16]. This gives us the matched queues RTTii and RTTij. Similarly, queue RTTji is obtained from the packet matching between queues Sj and Ei, and queue RTTjj from the packet matching between Sj and Ej. We denote the numbers of elements in RTTii, RTTij, RTTji, and RTTjj as Nii, Nij, Nji, and Njj, respectively. It is trivial to compute the differences between these numbers.
        
The above equation can determine not only if there exists a stepping-stone intrusion, but also if chaff attack exists, and which session is chaffed. If  is always bounded, and is close to zero, this indicates a successful detection of stepping-stone intrusion. If  is always larger than zero, it tells us that the session Ej is being chaffed from its downstream adjacent host. Similarly, if  is always larger than zero, we detect a stepping-stone intrusion, and know that the session Sj is chaffed. If  is always larger than zero, we conclude that the session Si is chaffed. Similarly, if  is always larger than zero, we understand that the session Ei is chaffed.
By summarizing the above ideas, we come up with an algorithm that not only detects stepping-stone intrusion, but also resists intruders’ chaff perturbation attacks. For convenience, we name the algorithm “Cross-Matching RTT-based Random Walk”, abbreviated as CMRW. The details of the CMRW algorithm are shown in Algorithm 1.
		 
| Algorithm 1: CMRW Algorithm—Cross-matching RTT-based random walk detection algorithm. | 
| Input: Threshold α (between 0 and 1) and collected TCP packet streams: Ei, Si, Ej, Sj
 Output: Stepping-stone intrusion detected or not
 Begin:
 Step 1: Call the Clustering-Partitioning Packet Match algorithm [16] to match Si with Ei,
 Si with Ej, Sj with Ei, and Sj with Ej, respectively.
 Step 2: From the above packet matching, obtain the numbers of RTTs from the
 four packet match pairs: Nii, Nij, Nji, and Njj.
 Step 3: Compute the differences among Nii, Nij, Nji, and Njj as the following:
 
 , and compute the ratio  between Δ (the differences between the different
 amount of RTTs) and N (the amount of RTTs) as the following:
 
 Step 4:
 If
 Then, Stepping-stone intrusion, OR
 If
 Then, Stepping-stone intrusion and possibly Si and Ei, or Sj and Ej are chaffed, OR
 If
 Then, Stepping-stone intrusion, and possibly (Sj, Si and Ei), or (Si, Sj and Ej)
 are chaffed
 End:
 | 
In the above algorithm, a typical value of α in our experiments is 3%. Randomly chaffing any single packet stream does not usually affect the number of RTTs, because of the packet-matching algorithm. Our experimental results show that even chaffing both the Send and Echo streams of a connection concurrently, or chaffing the Send streams of both connections concurrently, may still not affect the number of RTTs. In all of these situations, the CMRW algorithm can detect stepping-stone intrusion, with or without packet chaffing.
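The decision step can be sketched in Python under explicit assumptions. The exact difference and ratio formulas of Step 3 and the conditions of Step 4 are not reproduced here, so this sketch substitutes a simple rule based on the pairwise agreement of the four RTT counts: two counts "agree" when their relative difference is at most α. The function names, the relative-difference formula, and the returned labels are all hypothetical, and the example counts are illustrative rather than measured.

```python
def rel_diff(a, b):
    """Relative difference between two RTT counts (0.0 when both are zero)."""
    top = max(a, b)
    return abs(a - b) / top if top else 0.0


def cmrw_decision(n_ii, n_ij, n_ji, n_jj, alpha=0.03):
    """Assumed decision rule, NOT the exact Step 4 of Algorithm 1.

    alpha is the threshold on the relative difference between counts.
    """
    def close(a, b):
        return rel_diff(a, b) <= alpha

    # Counts agree for each Send queue, whichever Echo queue is used.
    same_per_send = close(n_ii, n_ij) and close(n_ji, n_jj)
    # Counts agree for each Echo queue, whichever Send queue is used.
    same_per_echo = close(n_ii, n_ji) and close(n_ij, n_jj)

    if same_per_send and same_per_echo:
        return "stepping-stone"
    if same_per_send:
        return "stepping-stone, Send stream possibly chaffed"
    if same_per_echo:
        return "stepping-stone, Echo stream possibly chaffed"
    return "inconclusive"
```

For instance, `cmrw_decision(105, 105, 105, 105)` falls into the first branch, while counts that agree only within each Send queue fall into the second.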
  7. Resistance Analysis to Chaff Attack
In Section 6, the CMRW algorithm for detecting stepping-stone intrusion was presented. In this section, we analyze its resistance to chaff attacks. Before the analysis, we prove a theorem that shows the relationship between injected packets and the number of RTTs obtained from a network connection.
Theorem 1:  If a network connection is chaffed with either Send or Echo packets, then the number of RTTs of the packets from the connection cannot be decreased compared with the un-chaffed connection.
 Proof.  We collected the packets from a network connection and put them in a packet sequence, such as 
, where 
 and 
 are assumed two adjacent Sends. There may be other non-Send packets between 
 and 
. Packets 
 and 
 are two adjacent Echo packets. It is possible that there are some other types of packets in between the two Sends, but no Echoes. It is also assumed that 
 matches with 
 and 
 matches with 
. Therefore, the RTTs between the matched Sends and Echoes are calculated as 
 and 
. If packet Send 
 is injected in between 
 and 
, and also assuming 
 is either close to 
 or close to 
, from the packet matching algorithm [
16], the RTTs between 
 and 
 or 
 can be filtered out. Thus, in this case, the amount of RTTs will remain unchanged even under chaff perturbation. Otherwise, it cannot match with either 
 or 
, and cannot match with any Echoes before 
 or after 
, because if so, there will be a contradiction. This tells us if a packet is injected between two adjacent Sends, it will not contribute to the amount of RTTs. The only exception is that if an additional Echo is injected into between 
 and 
, such as 
, the two injections may or may not match each other randomly. If they happen to match each other, it will contribute to the amount of RTTs. Otherwise, it does not affect anything on SSID. However, the possibility that two injected packets match with each other is low. Therefore, we draw our conclusion that is “injected Sends or Echoes cannot decrease the amount of RTTs in a network connection”. □
 Consider a relayed pair of incoming and outgoing connections at a host, and their four streams: a Send stream and an Echo stream from each of the incoming and outgoing connections. Depending on which stream is chaffed, there are four different cases in total.
Case 1:  Figure 3 shows the first case, in which the Send stream Si of the incoming connection is chaffed. This Send packet stream cannot be chaffed at host hi. The chaff packets must be injected at the outgoing connection of the adjacent upstream host of hi, and the injections must be removed from the outgoing connection of the host hi; otherwise the two Send streams would be relayed. Based on algorithm CMRW and Theorem 1, the four differences are all larger than zero.
Case 2:  Figure 4 shows the second case, in which the Send stream Sj of the outgoing connection is chaffed. In this case, the chaff packets can be injected into the outgoing connection of the host hi, and the injected packets head to the adjacent downstream host of hi. Based on algorithm CMRW and Theorem 1, the four differences are all larger than zero.
Case 3:  Figure 5 shows the third case, in which the Echo stream Ei of the incoming connection is chaffed. In this case, the chaff packets can be injected at the incoming connection of the host hi, and the injections go to the adjacent upstream host of hi. Based on algorithm CMRW and Theorem 1, the four differences are all larger than zero.
Case 4:  Figure 6 shows the last case, in which the Echo stream Ej of the outgoing connection is chaffed. In this case, the chaff packets must be injected from the incoming connection of the adjacent downstream host of hi, and the injected packets must be removed from the incoming connection of the host hi; otherwise the two Echo streams would be relayed. Based on algorithm CMRW and Theorem 1, the four differences are all larger than zero.
The above analysis shows that CMRW can resist a chaff attack regardless of which stream is chaffed, unlike the previous detection algorithm, which can resist chaff attacks only to a certain degree and only when the Send stream of the incoming connection is chaffed.
Another important feature of CMRW algorithm is that it can maintain a high stepping-stone detection rate, even when the session is manipulated by chaff perturbation in different connections.
  8. Experimental Results and Analysis
We designed an experiment to test the performance of the proposed cross-matching algorithm. In this experiment, a connection chain of 8 hosts, CCT30→AWS1→AWS2→AWS3→AWS4→AWS5→AWS6→AWS7 (7 connections), was established using SSH. We used CCT35 to record network traffic at every incoming and outgoing connection of each host from AWS1 to AWS6. CCT30 was used as the attacker’s host, and AWS7 was used as the victim. CCT30 and CCT35 are located in the computing lab of Columbus State University, GA, USA. AWS1 through AWS7 are servers from Amazon Cloud Services. Table 1 gives the AWS servers’ IP addresses and geographic locations.
We executed three different scripts to simulate three attackers to generate network traffic at the hosts CCT30 through AWS7. Each of the three attackers’ scripts was repeated ten times.
 
Attacker 1 script:
pwd
whoami
sudo su
ls
cd /etc
ls -a
scp -p shadow attacker_username@attacker_IP:/home/seed/Documents
exit
 
Attacker 2 script:
whoami
pwd
cd /home/seed/Documents
ls
nano text_file.txt
//paste a large text and save it
ls
cat hello.txt
exit
 
Attacker 3 script:
whoami
pwd
cd /home/seed/Documents
ls
nano hello.txt
//enter a few sentences and save the text file
ls
cat hello.txt
exit
 
We used the following tcpdump command to collect network traffic at the incoming connection of each sensor host ten times,
sudo tcpdump -nn -tt "(dst port 22 && dst xxx.xxx.xxx.xxx) || (src xxx.xxx.xxx.xxx && src port 22)" >AWS1_IN_AttackX_TestX.txt
and the command below was used for the outgoing connection of each sensor host ten times.
sudo tcpdump -nn -tt "(dst port 22 && src xxx.xxx.xxx.xxx) || (dst xxx.xxx.xxx.xxx && src port 22)" >AWS1_OUT_AttackX_TestX.txt
In both cases, “xxx.xxx.xxx.xxx” referred to the hosts’ own IP address, and X referred to a test number from 1 to 10.
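As an illustration of how such capture files might be split into Send and Echo queues, the following Python sketch parses the text output of `tcpdump -nn -tt`. It assumes the usual one-line-per-packet IPv4 format (`<epoch> IP <src>.<port> > <dst>.<port>: ... length <n>`), treats packets destined to port 22 as Sends and packets originating from port 22 as Echoes, and optionally skips zero-length packets such as pure ACKs. The function name `split_send_echo` and the `skip_empty` option are assumptions of this sketch, not part of the experiment described here.

```python
import re

# Timestamp, source address.port and destination address.port of one line.
LINE_RE = re.compile(r"^(\d+\.\d+) IP (\S+)\.(\d+) > (\S+)\.(\d+):")
LEN_RE = re.compile(r"length (\d+)")


def split_send_echo(lines, ssh_port=22, skip_empty=True):
    """Return (send_timestamps, echo_timestamps) from tcpdump text lines."""
    sends, echoes = [], []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not an IPv4 packet line in the expected format
        ts = float(m.group(1))
        src_port, dst_port = int(m.group(3)), int(m.group(5))
        n = LEN_RE.search(line)
        if skip_empty and n and int(n.group(1)) == 0:
            continue  # assumption: payload-free packets are ignored
        if dst_port == ssh_port:
            sends.append(ts)   # towards the SSH server: Send
        elif src_port == ssh_port:
            echoes.append(ts)  # from the SSH server: Echo
    return sends, echoes
```

The function can be applied to an open capture file directly, since iterating over a text file yields its lines.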
We only captured Send and Echo packets with the above tcpdump commands. Since the experimental results obtained from the different servers are similar, we only present the results obtained from AWS1. For each test, we obtained one file storing the Sends and Echoes collected from the incoming connection of AWS1, called AWS1-IN-TestX, and one file, AWS1-OUT-TestX, storing the Send and Echo packets collected from the outgoing connection. We divided the Send and Echo packets into two separate files. So, from AWS1-IN-Test1 we obtained S-IN-Test1 containing all the Send packets and E-IN-Test1 containing all the Echo packets. Similarly, from AWS1-OUT-Test1, we obtained S-OUT-Test1 and E-OUT-Test1. We call the packet-matching algorithm [16] to match S-IN-Test1 with E-IN-Test1 (denoted as Sin_Ein), S-IN-Test1 with E-OUT-Test1 (denoted as Sin_Eo), S-OUT-Test1 with E-IN-Test1 (denoted as So_Ein), and S-OUT-Test1 with E-OUT-Test1 (denoted as So_Eo).
Table 2 shows the packet matching results for 10 tests at AWS1. The number inside each parenthesis represents the number of Send packets collected from either the incoming connection or the outgoing connection of AWS1. The number before the parenthesis represents the number of matched Send and Echo pairs. The results from Table 2 show that, without any manipulation, all the numbers in the four columns are equal in each test. According to our proposed algorithm, this indicates that a stepping-stone intrusion is detected.
The reliability of a detection algorithm in resisting intruders’ session manipulation, especially chaff-perturbation, is a key factor in measuring the algorithm’s performance. Chaff-perturbation may happen at either the incoming or the outgoing connection of a host. Chaffing an outgoing connection of a host is equivalent to chaffing the incoming connection of its adjacent host. In this paper, we chaff the incoming connection of AWS1 with a chaff rate of 10%, 20%, …, up to 100%. Due to space limitations, we only show the resistance of the proposed algorithm under chaff rates of 10%, 50%, and 100%. Some terms used below need explanation. Chaff Send means injecting Send packets randomly into the packets collected from a connection. Chaff Echo means injecting Echo packets randomly into the packets collected from a connection. Chaff Send and Echo means injecting both Send and Echo packets randomly into the collected packet stream.
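Random chaff injection into a collected stream can be sketched as follows. This is a minimal illustration that assumes the chaff rate is the ratio of injected packets to original packets and that chaff timestamps are drawn uniformly over the time span of the original stream. The function name `inject_chaff` and both assumptions are hypothetical and may differ from the procedure actually used in the experiments.

```python
import random


def inject_chaff(timestamps, chaff_rate, seed=None):
    """Return a new sorted stream with random chaff timestamps added.

    chaff_rate = 0.1 adds one chaff packet per ten original packets;
    chaff_rate = 1.0 doubles the number of packets (assumed definition).
    """
    rng = random.Random(seed)
    if not timestamps:
        return []
    n_chaff = round(chaff_rate * len(timestamps))
    lo, hi = min(timestamps), max(timestamps)
    # Chaff arrival times are uniform over the span of the original stream.
    chaff = [rng.uniform(lo, hi) for _ in range(n_chaff)]
    return sorted(list(timestamps) + chaff)
```

Applying the function to a Send queue corresponds to Chaff Send, applying it to an Echo queue corresponds to Chaff Echo, and applying it to both corresponds to Chaff Send and Echo.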
As shown in Table 2, we use Test01 as an example to explain Send packet insertion. The first column of Test01 shows that 106 Send packets were collected, and 105 of them were matched with corresponding Echo packets collected from the same connection. The second, third, and fourth columns of Test01 show the same packet matching results, but for different packet matching combinations. The second column shows the matching results between the Send packets from the incoming connection of AWS1 and the Echo packets from the outgoing connection. The third and fourth columns show the matching results between the Send packets from the outgoing connection and the Echo packets from the incoming and outgoing connections, respectively. We first chaff Send packets into the incoming connection of AWS1 with chaff rates of 10%, 20%, 30%, …, up to 100%. If we chaff the Send packets in the incoming connection with a 10% chaff rate, the matching rates in the first and second columns of Test01 are affected. The more Send packets are chaffed, the more severely the packet matching rate is affected. This means chaff-perturbation may potentially defeat the detection algorithm. Due to space limits, Table 3 only shows the packet cross-matching results under chaff rates of 10%, 50%, and 100%. It clearly shows that even at a 100% chaff rate, a stepping-stone intrusion can be detected using the cross-matching and random-walk model. We also found that the first and second columns share the same matching rate regardless of the chaff rate. A similar conclusion can be drawn for the third and fourth columns.
We chaffed Echo packets into the incoming connection of AWS1 with chaff rates of 10%, 20%, 30%, …, up to 100%. The packet cross-matching results with Echo chaff rates of 10%, 50%, and 100% are shown in Table 4. It shows that even though the packet matching rates are not equal across the four columns, the first and third columns remain the same, as do the second and fourth columns. The results in Table 4 tell us that even under 100% chaff-perturbation, the proposed algorithm is still able to detect stepping-stone intrusion.
We chaffed both Send and Echo packets into the incoming connection of AWS1 with chaff rates of 10%, 20%, 30%, …, up to 100%. Table 5 shows the packet cross-matching results with both Send and Echo chaff rates of 10%, 50%, and 100%. The packet matching results show that even under severe chaff-perturbation, such as 100% for both Send and Echo packets, the difference in the packet matching rate between the first and third columns follows random-walk behavior, and the matching rates of the second and fourth columns remain the same, with three exceptions (see the results indicated in red) in which they are still very close. The results show that the proposed algorithm can resist intruders’ chaff-perturbation up to 100% even when both the Send and Echo streams are chaffed.