Detecting Mixed-Type Intrusion in High Adaptability Using Artificial Immune System and Parallelized Automata

Abstract: This study applies an artificial immune system and parallelized finite-state machines to construct an intrusion detection algorithm for spotting hidden threats in massive numbers of packets. Most existing intrusion detection methods do not focus on adaptability to mixed and changing attacks, which results in a low detection rate for new and mixed-type attacks. The characteristics of artificial immunity and state transition make it possible to address attacks with evolving patterns and to track anomalies across nonconsecutive packets. The proposed immune algorithm is highly efficient because of a selection step in multi-island migration. Results show that the algorithm can effectively detect mixed-type attacks and obtains an overall accuracy of 95.9% on the testing data.


Introduction
Intrusion detection systems (IDSs) aim to identify and isolate all types of intrusion inside a computer or communication system [1]. Anomaly detection and misuse detection are the most popular approaches in the field. Existing systems produce a high false alarm rate when multiple anomalies are interlaced. Despite the sophisticated algorithms developed over the years, security administrators still need an effective method for discovering mixed-type attacks.
Mixed-type attacks refer to attack patterns in which suspicious packets are interlaced with normal or other abnormal packets. Hackers often attack by sending abnormal packets, and some attacks can be identified from a single packet [1]. However, attackers might mix in useless packets or a large number of normal packets in order to deceive automatic detection algorithms [1]. In that case, analyzing a single packet fails to identify the attack, and the contents of subsequent packets have to be analyzed through a specialized method [2]. This study is not restricted to a single packet or a single analytical moment: we use finite-state machines (FSMs) to associate multi-packet attacks with internal tracking states. A state in the FSM can signal an attack. We apply parallelized FSMs to capture multiple suspicious attacks at the same time. Subsequently, an artificial immune system (AIS) is used to learn the conditions of state transition from packet information and to generate a complete state transition diagram.
The immune algorithm, or AIS, imitates the biological immune process. It has distinctive learning and memory features and a superior capability for identifying unknown attack modes in system activity logs [3,4]. When a virus enters an organism, the immune system performs a chain of reactions to kill the foreign invader. AIS possesses specificity, discreteness, adaptability, and learning and memory functions; hence, it can detect more new attacks than conventional methods and retain effective protection [5][6][7][8]. AIS is widely applied in various disciplines, particularly in the information technology sector. However, most previous applications in intrusion detection are restricted to static model analysis and lack adaptability to external environments. Although recent deep learning with neural networks produces excellent results in IDS [9], few works concentrate on unknown types of attack. An immune system can adapt well to unknown attacks through its massive learning and adaptive capability [10,11]. Additionally, an FSM is a mathematical model of computation with a finite number of states, exactly one of which is current at any given time, an initial state, and inputs that trigger state transitions with predefined transition probabilities [3,12]. The advantage of this model is that it tracks events effectively in a directed graph. Therefore, this model can be used to discern multiple attacks among massive mixed packets.

Literature Review
Software and hardware devices should be able to detect network behavior and provide relevant information to IDS. These data reports about computer network systems contain audit records in the operating system and header files for network packets [13,14].
IDS is normally divided into anomaly, misuse, and hybrid detection [1,15]. Anomaly detection raises an alert for anything that deviates from normal actions, whereas misuse detection matches activities against registered abnormal patterns. Hybrid detection is a mixture of the two, in which each complements the other to improve performance [2,16].
Bharati and Kumar [3] used statistical methods on an audit log as a data source. They also used normal packet data as a benchmark to establish a baseline and then compared it with the test material. Once the baseline establishes standard values, anything that deviates from the baseline is considered abnormal. This method calculates the time to process large amounts of data; however, setting a baseline that distinguishes normal from abnormal data effectively is difficult. Al-Khaleefa et al. [17] simulated the operation of the human brain, trained the weights of intermediate nodes with infinite term memory, and classified the packets into normal or abnormal patterns. This method is complex and time-consuming, and security administrators cannot obtain clear information about the attacks from the trained node weights.
Bradley and Tyrrel [12] applied an immune algorithm and a single FSM to monitor an electronic hardware system for potential operation failures. Bharati and Kumar [3] integrated state transition models and statistical methods into rule-based analysis methods and established nominal behavior based on frequent system calls, resource use cases, and file access patterns. Rule-based methods compare test data against the rules and identify abnormal access or use of resources. However, such methods cannot recognize new attacks. Fu et al. [18] classified the test data and then used string matching and FSMs to construct a high-speed IoT network IDS that improves the efficiency and increases the speed and quantity of network packet inspection. However, the inspection is limited to a single packet, and the correlation between packets cannot be considered fully. Hwang et al. [19] analyzed network audit logs to detect abnormal behavior using a data mining approach that combines association rules and frequent episodes.

Description of Simple Attack
This study uses the evolutionary mechanism of artificial immunology to estimate the transition probabilities of finite automata: the FSM distinguishes attacks, and the AIS learns the transition probabilities used in detecting suspects. The probabilities are coded into the antibodies of the AIS, and the coding scheme of the antibodies is based on the operation of the automata. The number of possible attack states, an arbitrary number representing the limiting capacity of the learning system, is preset to meet the requirements of the experiment. An orthogonal array is used in the experimental design to reduce the number of trials needed for fine-tuning the parameters for antibody optimization. The process is shown in Figure 1. The completed IDS classifies data packets into normal, known attack, undetermined (no fit to the output of the classifier), and conflict (fit all types) packets. The system performance is assessed to reflect its capability and stability.

Figure 1. Procedures for the intrusion detection experiment. A dataset is divided into training and testing sets. The training set is used to adjust the transition probabilities of the FSMs via AIS according to known attack types. The testing set is used to calculate the classification accuracy of the trained FSMs by pretending that the attack types are unknown.

Mixed-Type Attacks
An intrusion attack consists of a chain of associated steps, and multiple packets form a complete attack process, which may be mixed with normal packets at irregular intervals. For example, two warezmaster attacks [1] (Table 1) may be separated by one normal packet (Figure 2). Such mixing could fool some detection algorithms. In the experiment, the correlation among nonconsecutive attack packets is captured by the FSM, which describes the attack process as a chain of state transitions.

Table 1.
No.     duration  protocol  service  flag  src bytes  dst bytes  other features  label
…       …         …         …        SF    8334       0          0 0 0 0         warezmaster
76,635  1         tcp       smtp     SF    1640       344        0 0 0 0         normal
76,636  0         tcp       http     SF    307        354        0 0 0 0         normal
76,637  0         tcp       http     SF    219        5014       0 0 0 0         normal
76,638  0         tcp       http     SF    212        3902       0 0 0 0         normal
76,639  0         tcp       http     SF    347        5320       0 0 0 0         normal
76,640  0         tcp       http     SF    327        365        0 0 0 0         normal
76,641  0         tcp       http     SF    296        7129       0 0 0 0         normal

Figure 2. An illustration explaining the mechanism of state transition for detecting warezmaster among irrelevant packets. At the beginning, a suspicious attack is detected, f(service = ftp, flag = SF, source bytes <72), but the attack activity cannot yet be confirmed. Therefore, state M1 is entered, and the next state will be determined by further information. If sufficient evidence is observed, f(service = ftp data, flag = SF, source bytes >8334), state A1 is entered and the attack is captured completely.
In the state transition diagram designed for attack incidents (Figure 2), suppose that the state transition probability can be represented by a function of three parameters, i.e., P(Si|Sj) = f(X, Y, Z), where X, Y, and Z denote service, flag, and src_bytes (source bytes), respectively. When the system begins to receive packets, it is in the initial state S0, and it stays there while the packets are normal or do not satisfy the first state transition function of an attack. For example, when the 76,632nd packet is received, its parameters are service = ftp, flag = SF, and src_bytes = 72, so the state transfers to M1. Two cases can follow. In the first case, only normal packets continue to arrive, and the machine is kept in state M1 for at most a time t (measured in number of packets in this study), where t can be set by experienced users. The attacker might have stopped further network activity for some reason, or a network packet might have been lost. Hence, in order to limit memory consumption, the FSM resource is released if no additional intrusive packets are received within t, and the system returns to state S0 for further detection. In the second case, which the illustration shows, a later packet has the parameters service = ftp_data, flag = SF, and src_bytes > 8334 (values set manually for the illustration), and the state transfers to A1. After the output function is calculated, the warezmaster attack can be determined.
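The S0, M1, A1 walk-through above can be sketched as a small state machine. This is only an illustration under stated assumptions: the class and function names are hypothetical, the transition conditions are hard-coded rather than learned by the AIS, and the comparisons are written inclusively (`<= 72`, `>= 8334`) because the text and the Figure 2 caption state the bounds slightly differently.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    service: str
    flag: str
    src_bytes: int


def is_suspicious(p: Packet) -> bool:
    # Illustrative S0 -> M1 condition (assumed inclusive bound).
    return p.service == "ftp" and p.flag == "SF" and p.src_bytes <= 72


def is_confirming(p: Packet) -> bool:
    # Illustrative M1 -> A1 condition (assumed inclusive bound).
    return p.service == "ftp_data" and p.flag == "SF" and p.src_bytes >= 8334


class WarezmasterFSM:
    """Tracks one suspected attack; falls back to S0 after `lifetime` packets."""

    def __init__(self, lifetime: int):
        self.lifetime = lifetime  # the time t, counted in packets
        self.state = "S0"
        self.age = 0

    def step(self, p: Packet) -> str:
        if self.state == "S0":
            if is_suspicious(p):
                self.state = "M1"
                self.age = 0
        elif self.state == "M1":
            if is_confirming(p):
                self.state = "A1"  # attack captured
            else:
                self.age += 1
                if self.age >= self.lifetime:
                    self.state = "S0"  # release the tracker
        return self.state


fsm = WarezmasterFSM(lifetime=3)
for packet in [Packet("ftp", "SF", 72),
               Packet("smtp", "SF", 1640),
               Packet("ftp_data", "SF", 8334)]:
    print(fsm.step(packet))
```

The normal smtp packet in the middle does not reset the machine, which is the point of keeping the intermediate state M1.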

Build Intrusion Detection Model Using FSM
In this study, the system model is constructed using FSMs. As shown in Figure 3, there are three kinds of states: the initial state, possible attack states, and attack states. Lines in the graph represent state transitions. The red lines denote attacks that can be determined from a single packet. The black lines denote that a malicious attack cannot be determined from a single packet and will be decided only after sufficient information has arrived. Therefore, we use intermediate states (M1-M5) to denote that the next state will be determined by other packets. The automaton starts from the initial state S0, transits to a suspected state M1-M5 (the number of such states depends on the system resources), and finally reaches an attack state (A1: normal, A2: attack). S0 can reach any suspected or attack state (antibodies 1-7), whereas each suspected state can conclude in either attack state (antibodies 8-17), for a total of 17 antibodies (Table 2).
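The count of 17 antibodies follows directly from the transition structure just described. A minimal sketch that enumerates the transitions is given below; the function name and the ordering of transitions within each group are assumptions made for illustration.

```python
def enumerate_transitions(n_intermediate=5, finals=("A1", "A2")):
    """List the (source, target) transitions, one per antibody."""
    mids = [f"M{i}" for i in range(1, n_intermediate + 1)]
    # S0 can reach every suspected state and every final state.
    from_s0 = [("S0", target) for target in mids + list(finals)]
    # Each suspected state can conclude in either final state.
    from_mid = [(mid, final) for mid in mids for final in finals]
    return from_s0 + from_mid


transitions = enumerate_transitions()
for number, (source, target) in enumerate(transitions, start=1):
    print(f"antibody {number:2d}: {source} -> {target}")
```

With five intermediate states this gives 7 transitions from S0 plus 5 x 2 transitions from M1-M5, i.e., 17 in total.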

Figure 3. State and state transitions for the FSM of intrusion detection. Some attacks are easy to capture; the red lines therefore represent direct transitions to final results. Other attacks are elaborate, and the final decision has to wait for subsequent information. Each internal state M1-M5 corresponds to one suspicious attack. The number of states depends on the system resources.

The detection of suspicious attacks through an intermediate state is shown in Figure 4. The dashed lines in the FSMs represent possible future actions; they are not part of the standard notation and serve only an explanatory purpose. When one possible attack state is detected, the system requests another state machine so that multiple attacks can be detected simultaneously. When the state machine detects suspicious packets compliant with M2, the system enters a suspicious attack state and continues subsequent detection while verifying whether an attack is occurring.
When a state machine is in a suspicious attack state, it also counts its lifetime t (in number of packets). In this manner, the system load is reduced, and keeping excessive or overdue state machines active is avoided. If no associated packet within time t proves the occurrence of an attack, the state machine is abandoned to protect system resources.
The process is as follows.
(b) When a suspicious attack is detected, a state machine tracking M2 is spawned to detect the potential attack by continuing to monitor for consecutive anomalies.
(c) Upon detection of another potential attack, a new state machine M3 is spawned to keep tracking the consecutive suspicious pattern.
(d) M2 is abandoned after examining t packets to avoid system overload.
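The spawn-and-abandon steps above can be sketched as a pool of trackers that run side by side. The code below is a simplified sketch, not the authors' implementation: the rule table, the `Tracker` class, and the packet fields are hypothetical, and each rule is a hand-written pair of predicates instead of a learned transition function.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Packet = Dict[str, object]
Rule = Tuple[Callable[[Packet], bool], Callable[[Packet], bool]]


@dataclass
class Tracker:
    name: str                            # intermediate state being tracked
    confirm: Callable[[Packet], bool]    # condition that proves the attack
    started_at: int                      # index of the packet that spawned it
    age: int = 0                         # packets examined without confirmation


def detect(packets: List[Packet], rules: Dict[str, Rule], lifetime: int):
    """Return (state name, start index, confirmation index) for each attack."""
    active: List[Tracker] = []
    alerts = []
    for index, packet in enumerate(packets):
        survivors = []
        for tracker in active:
            if tracker.confirm(packet):
                alerts.append((tracker.name, tracker.started_at, index))
                continue
            tracker.age += 1
            if tracker.age < lifetime:   # otherwise abandon after t packets
                survivors.append(tracker)
        active = survivors
        tracked = {t.name for t in active}
        for name, (suspect, confirm) in rules.items():
            if name not in tracked and suspect(packet):
                active.append(Tracker(name, confirm, index))  # spawn a machine
    return alerts


# Hypothetical rules: (suspicion condition, confirmation condition).
rules = {
    "M2": (lambda p: p["service"] == "ftp", lambda p: p["service"] == "ftp_data"),
    "M3": (lambda p: p["service"] == "telnet", lambda p: p["service"] == "login"),
}
stream = [{"service": s} for s in
          ["ftp", "http", "telnet", "http", "login", "ftp_data"]]
print(detect(stream, rules, lifetime=10))
print(detect(stream, rules, lifetime=3))
```

With a lifetime of 10 packets both interleaved attacks are reported; with a lifetime of 3 the M2 tracker is abandoned before its confirming packet arrives, so only the M3 attack is reported.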
The study extracts 41 features from the provided packet information, including categorical and quantitative values. We code the categorical features into antibodies with Arabic numerals (Table 3), whereas the quantitative values are converted to fuzzy numbers [20] (Figure 5). Assume that the content of a certain packet is V1 = 1, V2 = H, V3 = http, and V4 = icmp. Then, the encoding of the categorical features is as shown in Figure 6.

The state transitions are assigned to antibodies according to the arrangement in Table 3. Antibodies 1-7 represent the transitions starting from state S0, whereas antibodies 8-17 represent the remaining transitions, which end in attack states A1-A2. Our algorithm is shown in Figure 7. The affinity, given in Equation (1), is calculated as the fitness of an antibody to a pathogen. For example, two antibodies have affinity values of 3 and 4 with respect to a probable attack A, as shown in Table 4 [21]. The unit of evolution is the antibody set. Therefore, the actual affinity in Equation (1) adds up the affinities of all antibodies in an antibody set. The algorithm keeps high-affinity antibody sets in memory.
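To make the encoding and the set-level affinity concrete, the following sketch encodes a packet as a tuple of integers and scores antibodies against it. Several parts are assumptions, since Table 3, Figure 5, and Equation (1) are not reproduced here: the code tables are hypothetical, a crisp three-level bucket stands in for the fuzzy numbers, and affinity is taken to be the number of matching positions.

```python
# Hypothetical code tables; the actual codes are those of Table 3.
PROTOCOL_CODES = {"tcp": 1, "udp": 2, "icmp": 3}
SERVICE_CODES = {"http": 1, "ftp": 2, "ftp_data": 3, "smtp": 4}
FLAG_CODES = {"SF": 1, "REJ": 2, "S0": 3}


def bucket(value, low, high):
    """Crisp stand-in for a fuzzy level: 0 = low, 1 = medium, 2 = high."""
    if value < low:
        return 0
    if value < high:
        return 1
    return 2


def encode(packet):
    """Encode a packet as integers (categorical codes plus one level)."""
    return (
        PROTOCOL_CODES[packet["protocol"]],
        SERVICE_CODES[packet["service"]],
        FLAG_CODES[packet["flag"]],
        bucket(packet["src_bytes"], low=100, high=5000),  # assumed cut points
    )


def affinity(antibody, antigen):
    """Assumed affinity: number of positions where antibody and packet agree."""
    return sum(1 for a, g in zip(antibody, antigen) if a == g)


def set_affinity(antibody_set, antigen):
    """Affinity of an antibody set: the sum over all of its antibodies."""
    return sum(affinity(antibody, antigen) for antibody in antibody_set)


antigen = encode({"protocol": "tcp", "service": "ftp_data",
                  "flag": "SF", "src_bytes": 8334})
antibody_a = (1, 3, 1, 0)   # agrees in three positions
antibody_b = (1, 3, 1, 2)   # agrees in four positions
print(affinity(antibody_a, antigen), affinity(antibody_b, antigen))
print(set_affinity([antibody_a, antibody_b], antigen))
```

The two antibodies score 3 and 4, matching the magnitudes of the example in the text, and the set they form scores 7.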

Antibody Generation and Migration
In the current illustrative design, each learning unit, i.e., an antibody set, contains 17 antibodies. If intermediate states have 10, the antibody set will contain 22 antibodies. Affinity is calculated for each antibody set. For example, in Figure 8, n antibody combinations exist; that is, an initial random population generates n recombinations at a time. Every antibody set contains 17 antibodies, and each antibody is trained on different data without interference. Affinity is calculated on the basis of the whole antibody set; thus, the 17 antibodies are arranged in a serial sequence. The study adopts multiple random populations at one time (Figure 9) to simulate island genetic algorithm (IGA) migration, in which each population evolves independently on its own island [22]. IGA involves parallel evolution on each island to avoid getting stuck in a local optimum and to provide opportunities to find the best solution. The island structure is shown in Figure 9. Many studies have indicated that IGA outperforms the conventional genetic algorithm [22]. The best solution is obtained through continual migration, in which good antibody sets from one island are exchanged with, and substituted for, part of the antibody sets in other populations over time, as shown by the arrows in Figure 7. Migration occurs according to a predefined probability (the migration rate). The size of a slowly evolving population decreases, and the freed capacity is passed to populations with better parameters, which enlarges the populations with better evolution parameters and helps find the best solution effectively.
After the state machine modeling of the system is completed, an affinity test of each packet against each antibody is performed. If the affinity between a packet and an antibody is greater than a pre-calculated lower bound, the packet is preliminarily determined to be a suspicious attack. However, the final conclusion is drawn only after the affinities of this packet with the other antibodies have been obtained.

Figure 9. Antibody set migration between island populations. The best solution is obtained through continual migration, in which good antibody sets from one island are exchanged with, and substituted for, part of the antibody sets in other populations over time. Migration occurs according to a predefined probability (the migration rate).
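One migration step of an island model can be sketched as follows. This is a minimal sketch under stated assumptions: the ring topology, the replace-the-worst policy, and the function name `migrate` are illustrative choices not taken from the paper, and the resizing of slowly evolving populations described above is not modeled.

```python
import copy
import random


def migrate(islands, affinity, migration_rate, rng):
    """With probability `migration_rate`, each island sends a copy of its best
    member to the next island in a ring, replacing that island's worst member."""
    # Snapshot the best member of every island before anything is replaced.
    bests = [max(island, key=affinity) for island in islands]
    for i, best in enumerate(bests):
        if rng.random() < migration_rate:
            target = islands[(i + 1) % len(islands)]
            worst = min(range(len(target)), key=lambda k: affinity(target[k]))
            target[worst] = copy.deepcopy(best)


# Toy populations: each member is represented by its affinity value.
islands = [[1, 2, 3], [10, 20, 30], [5, 6, 7]]
migrate(islands, affinity=lambda member: member,
        migration_rate=1.0, rng=random.Random(0))
print(islands)
```

In a full algorithm this step would run between generations, so a slowly converging island can still receive, and later improve on, a good antibody set found elsewhere.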

Experimental Design
To compare the performance of our algorithm with existing studies, we use the most widely adopted dataset, that of the Third International Knowledge Discovery and Data Mining Tools Competition (KDD Cup) [23]. The data used in the experiment were randomly divided into training and testing sets. The data contain 41 features in three categories, namely TCP header, content, and traffic features. Sample packets of the KDD Cup data are shown in Table 5.
Among the 23 types of attack in the KDD Cup data, this study selected nine common attack types plus normal activity as training data, and each type contained 10,000 randomly selected packets. For testing, five additional types of attack were selected from the KDD Cup data. These novel data were used to test the stability and adaptability of our method.
The algorithm initially used the parameter settings in Table 6 and created 50 populations with 100 k antibodies in each population. The convergence of the evolution is shown in Figure 10. The changes in the best affinity of five selected populations over the 1500 generations are shown in different colors in Figure 10a (the thick red line is the best affinity over all populations). The affinity has largely converged after the 1000th generation. The changes in the best affinity of the antibodies are shown in Figure 10b, and the affinity changes among the populations are shown in Figure 10c. Evolution is fast before the 500th generation and becomes moderate afterward. A slowly converging population may also produce the best-affinity antibody because of the antibody exchange in the IGA migration mechanism. After 1000 generations, the affinities of the populations become nearly indistinguishable.

Table 5. KDD Cup packet data. Data source: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
duration  protocol  service  flag  src bytes  dst bytes  other features  label
0         udp       private  SF    105        146        0 0 0 0         snmpgetattack
0         udp       private  SF    105        146        0 0 0 0         snmpgetattack
0         udp       private  SF    105        146        0 0 0 0         snmpgetattack
0         udp       domain   SF    29         0          0 0 0 0         normal
0         tcp       private  SF    105        146        0 0 0 0         normal
0         udp       private  SF    105        146        0 0 0 0         snmpgetattack
0         tcp       http     SF    223        185        0 0 0 0         normal
0         udp       private  SF    105        146        0 0 0 0         snmpgetattack
0         tcp       http     SF    230        260        0 0 0 0         normal
Evolution stopped at the 1500th generation.
Threshold = 1/selection pressure (SP).
Antibody value = antibody affinity/maximum affinity in a population at a generation.
For example, an antibody is selected only if its antibody value surpasses the threshold (1/1.7 = 0.588). With an antibody affinity of 1000 and a maximum affinity of 3000 in the population at that generation, the antibody value is 1000/3000 = 0.333 < 0.588; thus, this antibody is not selected.
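The selection rule in this example can be written in a few lines. The function name is hypothetical; the arithmetic follows the threshold and antibody value definitions given above.

```python
def is_selected(antibody_affinity, max_affinity, selection_pressure):
    """Select an antibody when its normalized affinity exceeds 1/SP."""
    threshold = 1.0 / selection_pressure
    antibody_value = antibody_affinity / max_affinity
    return antibody_value > threshold


# Worked example from the text: SP = 1.7, affinity 1000, maximum affinity 3000.
print(round(1 / 1.7, 3))             # threshold
print(round(1000 / 3000, 3))         # antibody value
print(is_selected(1000, 3000, 1.7))  # value is below the threshold
```

A larger selection pressure lowers the threshold, so more antibodies of the population pass the selection step.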

Optimization of Metaparameters
The lower bound of affinity matching must be calculated for each antibody to perform anomaly matching. Only packets whose affinity is greater than the lower bound are treated as attacks.
Two types of antibody are taken as examples in Figure 11. Figure 11a shows an antibody that identifies normal packets, and Figure 11b shows an antibody that detects attack packets. The red lines are the lower bounds of the antibodies. When the affinity between a packet and an antibody is greater than the lower bound (i.e., the affinity lies to the right of the red line), the antibody receives the packet. If the packet is received only by the antibody for normal packets, it is determined to be a normal packet; if the packet is received only by the antibody for attack packets, it is determined to be an attack packet. If the packet is rejected by both the normal and the attack antibodies, it is an unknown packet. If the packet is received by both types of antibody, it is a conflict packet. The details are illustrated in Table 7.

Table 7. Illustration of the determination ranges from Figure 11.
Antibodies to be normal packets    Antibodies to be attack packets
Less than A    Greater than B      Less than C    Greater than D      Determination
yes            -                   yes            -                   Unknown/Suspicious
-              yes                 -              yes                 Conflict
yes            -                   -              yes                 Attack
-              yes                 yes            -                   Normal

To find the best-fitting lower bound of each antibody (i.e., the best position of the red line in Figure 11), each experimental factor takes one of three levels: 0.5, 1.0, and 1.5 standard deviations. The five antibodies in the center state adopt one common level, whereas the other 12 antibodies each have their own level. Hence, there are 13 experimental factors in total, with 3^13 = 1,594,323 possible combinations. This study adopted the Taguchi method with an orthogonal array to reduce the number of tests needed to obtain the desired result and to determine the best level of each experimental factor [24]. The experiment needs 13 best levels in total; thus, the L27(3^13) orthogonal array was adopted.
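An orthogonal array with 27 runs, 13 three-level factors, and pairwise balance can be generated from linear forms over the integers modulo 3. The sketch below uses this standard construction; its column order may differ from the L27 table the authors used, and the function names are hypothetical. Levels 0, 1, and 2 are mapped to the 0.5, 1.0, and 1.5 standard deviation settings named in the text.

```python
from collections import Counter
from itertools import combinations, product

LEVELS = (0.5, 1.0, 1.5)  # lower-bound offsets, in standard deviations


def l27_array():
    """Build a 27-run array with 13 three-level columns (values 0, 1, 2)."""
    # One column per nonzero coefficient vector whose first nonzero entry is 1;
    # there are (3**3 - 1) // 2 = 13 such vectors.
    coefficients = [c for c in product(range(3), repeat=3)
                    if any(c) and next(x for x in c if x) == 1]
    return [[sum(a * b for a, b in zip(c, point)) % 3 for c in coefficients]
            for point in product(range(3), repeat=3)]


def is_orthogonal(rows):
    """Check that every pair of columns shows each level pair equally often."""
    n_columns = len(rows[0])
    for i, j in combinations(range(n_columns), 2):
        counts = Counter((row[i], row[j]) for row in rows)
        if len(counts) != 9 or set(counts.values()) != {len(rows) // 9}:
            return False
    return True


rows = l27_array()
print(len(rows), len(rows[0]), is_orthogonal(rows))
print([LEVELS[level] for level in rows[5]])  # factor settings of one test run
```

Each of the 27 rows is one test: it assigns a level to all 13 factors, so 27 runs stand in for the 1,594,323 full-factorial combinations.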
The calculation of the lower bound is shown in Equation (2). In each test, the lower bound of each antibody was calculated according to the level combination and then compared with the training data to obtain the value D of the test (Equation (3)). Finally, the best parameter of each antibody was calculated from D. With only 27 tests, the experiment could identify the best combination of factors and levels according to the signal-to-noise ratio (S/N) expressed in Equation (4).
After the IDS modeling and the level optimization of each antibody by Equation (4), the model is compared with the test data. In the experiment, the ten chosen packet types from the original training data are first evaluated together (left half of Table 8). They are then screened through the second-stage identification, which leaves the normal packets and the attacks that can be detected from a single packet (right half of Table 8), in order to verify the identification capability of both direct detection and second-stage detection. Table 8 shows that the accuracy of the model on compound packets (packets involving both the first and second identifications), 0.984, is very close to that on single packets, 0.989. The model can therefore detect compound attack packets effectively, and it attains high accuracy and precision for both single-packet and compound-packet attacks.

Experimental Results
After training, five additional types of attack in the original data were regarded as new attacks for the testing phase, with 548 cases in total. The experimental results are shown in Tables 8-10. As shown in Tables 8 and 9, the performance on single-type and mixed-type attacks is high. When new and unknown attack patterns are added, the performance remains good, as shown in Table 10. Packets determined to be unknown or conflict are treated as attacks to avoid missed detections. The overall detection accuracy was 95.9%. This study achieved a sufficient recall rate for single and mixed-type intrusive packets. Although our algorithm has a different goal from that of the KDD (Knowledge Discovery and Data Mining) competition, a comparison on single-type attacks is still of interest. The winner of the KDD Cup and a recent study had overall accuracies of 92.7% and 96.5%, respectively [25], compared with 95.9% in this study. Given the massive number of unknown variants produced by intruders, a small difference in accuracy may not be meaningful in a performance comparison. For the other performance metrics, we have a lower recall rate but a high precision, i.e., fewer type I errors. A type I error here means labeling a malicious packet as normal. Detection problems always face a trade-off between false alarms and missed detections. We think a good security administrator cannot easily allow suspicious activities to exist under security supervision. Therefore, the decision trade-off leans toward high precision instead of high recall. Thus, the proposed algorithm contributes to detecting attacks in the presence of mixed-type and new attacks.
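The determination rule, together with the policy of treating unknown and conflict packets as attacks, can be sketched as below. The function names and the numeric bounds in the usage lines are hypothetical; in the study the lower bounds come from the Taguchi optimization of each antibody.

```python
def determine(normal_affinity, attack_affinity, normal_bound, attack_bound):
    """Classify a packet from its affinities to a normal and an attack antibody."""
    received_by_normal = normal_affinity > normal_bound
    received_by_attack = attack_affinity > attack_bound
    if received_by_normal and received_by_attack:
        return "conflict"
    if received_by_normal:
        return "normal"
    if received_by_attack:
        return "attack"
    return "unknown"


def raises_alarm(determination):
    """Unknown and conflict packets are reported as attacks as well."""
    return determination != "normal"


# Hypothetical affinities and bounds, for illustration only.
for normal_aff, attack_aff in [(0.9, 0.1), (0.2, 0.8), (0.9, 0.8), (0.2, 0.1)]:
    label = determine(normal_aff, attack_aff, normal_bound=0.5, attack_bound=0.5)
    print(label, raises_alarm(label))
```

Only packets received exclusively by the normal antibody pass without an alarm, which reflects the cautious stance toward suspicious activity described above.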

Conclusions
This study is the first to define multiple possible states using FSM theory and to use the evolutionary property of AIS to establish the transition functions between the states. It analyzes nonconsecutive packets to capture attacks that cannot be found in a single packet. Furthermore, the detection of new intrusions is not a superficial direct analysis; rather, this study uses intermediate states to extract new attacks that are similar to known attacks, which gives the algorithm an identification capability for new intrusions and makes its detection stable. This study attributes unidentifiable and conflicting packets to one type, which provides a reference for analysis by the administrators of user organizations, and the high stability relieves administrators of the workload caused by excessive erroneous alerts.

Conflicts of Interest:
The authors have no conflicts of interest to declare.