Fast Packet Inspection for End-To-End Encryption

So-Yeon Kim; Sun-Woo Yun; Eun-Young Lee; So-Hyeon Bae; Il-Gu Lee

doi:10.3390/electronics9111937

¹ Department of Convergence Security Engineering, Sungshin University, Seoul 02844, Korea

² Department of Future Convergence Technology Engineering, Sungshin University, Seoul 02844, Korea

* Author to whom correspondence should be addressed.

Electronics 2020, 9(11), 1937; https://doi.org/10.3390/electronics9111937

This article belongs to the Special Issue Security, Privacy and Trustworthiness of Wireless Communications and Networks

Abstract

With the recent development and popularization of various network technologies, communicating with people at any time, and from any location, using high-speed internet, has become easily accessible. At the same time, eavesdropping, data interception, personal data leakage, and distribution of malware during the information transfer process have become easier than ever. Recently, to respond to such threats, end-to-end encryption (E2EE) technology has been widely implemented in commercial network services as a popular information security system. However, with the use of E2EE technology, it is difficult to check whether an encrypted packet is malicious in an information security system. A number of studies have been previously conducted on deep packet inspection (DPI) through trustable information security systems. However, the E2EE is not maintained when conducting a DPI, which requires a long inspection time. Thus, in this study, a fast packet inspection (FPI) and its frame structure for quickly detecting known malware patterns while maintaining E2EE are proposed. Based on the simulation results, the proposed FPI allows for inspecting packets approximately 14.4 and 5.3 times faster, respectively, when the inspection coverage is 20% and 100%, as compared with a DPI method under a simulation environment in which the payload length is set to 640 bytes.

Keywords: end-to-end encryption (E2EE); packet inspection; integrity; malware detection; security; confidentiality

1. Introduction

Due to the commercialization of smart electronic devices in recent years, numerous services and mobile applications that collect, process, and transfer personal and financial information have become popular [1]. In particular, according to the 2019 digital media convergence (DMC) report, the usage of networking applications was found to be high because it was reported that one smart device has at least one social networking service (SNS) and one messaging application on average [2]. This trend is expected to continue to expand owing to an environment in which non-face-to-face activities are expected to be promoted as a response to COVID-19. As a result, information exchanges through networks, such as in smart working, remote meetings, and remote classes, are expected to drastically expand [3]. However, security issues such as eavesdropping, personal data leakage, privacy infringement, and malware distribution continuously occur during the process of data transmission over a network [4]. In recent years, to respond to such security threats, the end-to-end encryption (E2EE) technique has become an essential part of both web and network applications [5,6].

E2EE is a communication scheme in which security is achieved by storing the encryption key used for communication on a personal device rather than on the server, thus making it impossible for attackers to eavesdrop on the client and server communication [4]. Although the encrypted link environment of E2EE has the advantage of securing the confidentiality of messages transmitted to and received from the server, it is difficult for the user to verify the data integrity when the data are altered [7]. Additionally, E2EE can itself be exploited as an attack vector because it hides the malicious behavior of the attacker: it is difficult to detect the presence of malicious traffic in the traffic inspection stage during data transmission [8]. However, studies on methods for effectively inspecting encrypted packets for malware in an E2EE network environment are still lacking.

A deep packet inspection (DPI) technique, which is used to identify malicious network packets, is a form of packet filtering that inspects not only the packet headers but also the data payloads [9]. A DPI can be used to evaluate a wide variety of applications in a large-scale dynamic network environment; however, there is the possibility of privacy infringement during the inspection process, and a decrease in processing speed and efficiency may result from unnecessary inspections [10]. In addition, it is difficult to apply a DPI to an E2EE environment that does not feature a decryption process during the data transmission [11]. The Blindbox DPI method, which was proposed to address such limitations and apply a DPI to encrypted traffic, is a system that addresses the tension between network security and DPI middlebox functions, and supports middleboxes for scaling and DPI filtering on encrypted traffic. However, there is still a limitation, in that it only supports attack rules in the HTTP application layer [12].

To address the issues of the existing techniques, this paper proposes a fast packet inspection (FPI) method, comprising a frame structure and a transmission mechanism, that can guarantee data integrity in an E2EE environment while effectively detecting malware.

The FPI method was designed to conduct a cyclic redundancy check (CRC) inspection for an integrity verification by using the hash values (H_P) of the plain text data payload and the hash values (H_E) of the encrypted data payload, and then conducting a malware hash list comparison to detect malicious codes. An FPI enhances the efficiency of the decryption process by performing a comparison using bitmaps (H_bm) generated through hash values of the key components. The use of FPI allows for quickly detecting data alteration attacks and known malware that may be found during the encryption/decryption and transmission processes in an E2EE environment.

According to the performance assessment results of this study, under the same network environment conditions, the FPI method was shown to improve the packet transmission rate by at least 6 times that of the DPI method based on a 160-byte payload length, and up to 18.9 times that based on a 960-byte payload length.

The rest of this paper is structured as follows. Section 2 introduces previous related studies. Section 3 presents the packet frame structure of an FPI as well as the transmission mechanism, which supports an FPI in an E2EE environment. Section 4 describes the simulator implementation and operations of both the proposed method and an existing method, and then compares the processing times according to the payload length. In addition, the proposed method is compared with the existing method and evaluated in terms of the following four aspects based on the simulation results: inspection coverage, average transmission time for a specific payload length, retention of E2EE, and the complexity of establishing the environment. Finally, Section 5 provides some concluding remarks and describes the significance of this study along with the scope for future research.

2. Related Works

2.1. End-To-End Encryption (E2EE)

An E2EE involves decrypting the data with an encryption key shared by the communicating devices once the packets arrive at the destination, rather than performing decryption with relay servers or relays [13]. Some popular implementations of E2EE include Pretty Good Privacy, which is an email E2EE protocol, and Off-the-Record, which is an encryption protocol for instant messaging conversations. These protocols use a combination of public key and private key methods for secure key exchanges between the transmitting source and receiving destination [14]. Owing to a recent surveillance incident with KakaoTalk, the interest in and necessity of enhancing the security of messaging applications have both increased, and the E2EE technique has become an essential element in the messaging applications for enhancing confidentiality and privacy [15]. Some examples of this include the one-on-one secret chat functions in KakaoTalk and Telegram. Further, Zoom Communications, which offers a video communications application and has grown into a global company with a recent increase in its non-face-to-face online service demands, is at work implementing E2EE into its application so as to address existing security vulnerabilities [16].

In addition to mobile services, it is expected that the E2EE protocol will be an essential factor for Internet of Things (IoT) devices in the future, as an increasing number of devices begin demanding the collection of personal and sensitive data. A study [17] proposing an E2EE protocol for IoT provided authentication features using E2EE and digital signatures.

To secure the data integrity in an E2EE environment, Park et al. [18] implemented digital signatures and verification to the existing E2EE environment for detecting data forgery and alteration, and malware that may be found during data transmission. With this implementation, the data and files become encrypted when the sender sends them, and the resulting values are hashed again and then digitally signed. Subsequently, the encrypted data and digital signatures are transparently transmitted to the receiver through a relay server; the received encrypted data are then decrypted and the resulting hash values are compared with the received values. If the two hash values do not match, the message is deemed to have been altered during the communication, and the communication session is terminated. Because this method uses a data decryption process at the receiving end before the inspection, there is a possibility of being exposed to various threats during the decryption stage.

Machine learning-based research is also being conducted on processing encrypted packets. MIMETIC [19] classifies encrypted traffic through multimodal deep learning. With the multimodal deep learning architectures, the accuracy is 79.98% for FB/FBM, 89.49% for Android, and 89.14% for iOS. These accuracies are higher than those of the single-modality deep learning architectures.

2.2. Deep Packet Inspection (DPI)

The DPI technique enables routers and switches to analyze not only the headers of the packets transmitted over the network, but also the payloads containing the data content [20]. A DPI is capable of opening the packets to be examined through all seven layers of the open system interconnection (OSI) model. It enables an accurate traffic analysis and can be used for various purposes such as virus and malware prevention, traffic management, harmful content prevention, and personalized advertisement recommendations [21]. The most fundamental way to inspect the packet payload is by quickly detecting and extracting specific patterns [22]. However, in this case, a separate training process is required to generate patterns for pattern detection. Further, the data processing may face delays, or the data may become exposed to security vulnerabilities during the process of analyzing all the data after decrypting the payload prior to comparing the patterns [12]. Such performance degradation and security vulnerabilities are causing security compromises such as eavesdropping, data leakage, data forgery and alteration, denial-of-service (DoS), and resource consumption attacks. In fact, there was a case in Korea in which the Korean National Intelligence Service received permission to apply communication restriction measures from the government for the purpose of packet eavesdropping through a DPI [23].

Because a DPI is known to consume a large volume of memory space and CPU resources during the packet payload filtering process, numerous studies have been actively conducted to address such issues during the filtering process [24]. A method using quotient-based Cuckoo, which employs a combination of a quotient filter (QF) and a cuckoo filter (CF), was proposed as an alternative to a bloom filter (BF), QF, and CF [25]. This method uses two hash functions, which enables the minimizing of the computational overhead by reducing the inspection time by up to 77% compared to CF, and up to 98% compared to the BF and QF methods [25]. However, it poses the limitation that a delay may occur owing to additional decryption and encryption processes added for the packet processing when the method is applied to an E2EE environment.

DPI techniques are also used to increase the traffic classification accuracy and speed. In [26], the authors proposed a method that achieves a high classification speed while maintaining reasonably accurate results by combining multiple application-layer classifiers, including machine learning algorithms and DPI. However, this method can generate overhead because two classifiers are executed for the initial few packets.

In [27], chaining fast classification stages (port-based and machine learning-based) ahead of DPI was proposed to speed up DPI traffic classification. This method classifies network traffic 45% faster than nDPIng, a state-of-the-art DPI classifier, with comparable classification performance. Furthermore, it allows for a relatively more privacy-friendly approach compared with full DPI, by limiting the extraction of features from the payload of the packets to cases in which classification has not been possible with other privacy-preserving means. However, it remains exposed to the existing problems of DPI whenever the DPI-based stage of the chain is required.

Further, a previous study proposed a software-defined, network-based, integrated security switch by combining the features of a firewall, intrusion prevention system/intrusion detection system, and network admission control based on high-speed network processes; however, this method still faces the insufficient feasibility and design verification of the software and hardware [28].

3. Proposed Mechanism

3.1. FPI Structure and Operation Mechanism

In this study, an FPI mechanism and an advanced network frame structure are proposed for efficient packet inspection in an E2EE environment.

Figure 1 illustrates the system architecture of the proposed E2EE FPI technique. The FPI consists of a sender node, a relay server including an information security system (RS/ISS node), and a receiver node. The end nodes employ a transceiver capable of transmitting and receiving data in one device. In Figure 1, the sender and receiver are illustrated separately to clearly describe the structure and mechanism of the proposed method by simplifying the system architecture. In the figure, the sender represents a node that transmits data and the receiver represents a node that receives the data. Further, it is assumed that the communicating devices alternate between the sender and receiver roles to communicate with each other.

Figure 1. System architecture.

The sender node is designed to generate and transmit packets. Subsequently, the RS/ISS node checks for packet errors through the CRC, compares H_P and H_bm with the malware hash list to check for the presence of malicious code, and relays the packets. The CRC is an error-detecting code commonly used in digital networks to detect errors caused by channel noise or collision. In this work, the CRC is applied to reduce the processing latency of comparing the received hashmaps with the known malicious patterns. If errors in the hashmaps are not checked using the CRC, the latency increases because all the hashmaps have to be compared even for packets corrupted by channel noise. On the other hand, if the CRC is used, the latency can be reduced because the receiver can quickly discard the erroneous hashmaps. Finally, the receiver node is designed to confirm the packet validity and conduct a comparative inspection through the data receiving process.

The sender node operates in two steps. First, the hash values H_P are generated by hashing the plain text (P) of the data payload, the hash values H_E are generated by hashing the encrypted data payload (E), and the hash values H_bm are generated by hashing specific components (C) selected from the payload and adding them in the form of hash bitmaps. Second, the CRC code is derived using the hash values from the first step. The RS/ISS node operates in two steps as well: an integrity verification step using the CRC, and an inspection step in which the presence of malicious code is checked by comparing the received hash list (H_bm, H_P) with the malware hash list.
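The two sender steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `FpiHeader` container, the function names, the representation of H_bm as a list of SHA-256 digests, the use of CRC-32, and the field order H_bm, H_P, H_E inside the CRC input are all hypothetical choices. The ciphertext is taken as an input so that the sketch does not depend on a particular AES library.

```python
import hashlib
import zlib
from dataclasses import dataclass
from typing import List


@dataclass
class FpiHeader:
    """Hypothetical container for the four extension-header fields."""
    h_bm: List[bytes]  # one SHA-256 digest per selected component C_i (assumed layout)
    h_p: bytes         # SHA-256 of the plaintext payload P
    h_e: bytes         # SHA-256 of the encrypted payload E
    crc: int           # CRC-32 over H_bm || H_P || H_E (assumed field order)


def header_crc(h_bm: List[bytes], h_p: bytes, h_e: bytes) -> int:
    """CRC-32 over the concatenated hash fields."""
    return zlib.crc32(b"".join(h_bm) + h_p + h_e)


def build_header(plaintext: bytes, ciphertext: bytes,
                 components: List[bytes]) -> FpiHeader:
    """Step 1: hash P, E and each component. Step 2: derive the CRC."""
    h_bm = [hashlib.sha256(c).digest() for c in components]
    h_p = hashlib.sha256(plaintext).digest()
    h_e = hashlib.sha256(ciphertext).digest()
    return FpiHeader(h_bm, h_p, h_e, header_crc(h_bm, h_p, h_e))
```

Because the CRC covers only the hash fields, a relay can validate the header without touching the encrypted payload.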

Similar to the RS/ISS node, the receiver node is composed of an integrity verification step and an inspection step. In addition, it features a decryption step to inspect the plain text.

Malicious packets can be blocked in advance by inspecting the reliability and security of the packets using the information fields of H_bm, H_P and H_E. As such, because the FPI involves analyzing the packets using irreversible hash values inserted in the headers rather than directly inspecting the payloads of the packets, it is capable of solving problems in which payloads are exposed to the other party or third parties in addition to securing a stable transmission speed and security when compared to existing methods.

3.2. FPI Packet Frame Structure

Figure 2 illustrates the frame structure of an FPI. An FPI features an extended packet structure that remains compatible with existing packets by using the same packet frame header. Malicious behaviors can be detected by inserting the hash values of the key components in the new extended header. By inserting the CRC code at the same time, errors introduced into the extension header fields by channel noise can be detected. Further, the FPI includes an FPI field in the existing frame header, labeled the basic header, to check if the frame uses the new frame format. The basic header is designed to maintain compatibility with existing devices, and the FPI field employs a reserved field. The FPI field allows for checking the presence of the new extension header, which adds four fields: H_bm, H_P, H_E, and CRC. Detailed descriptions of each field are as follows.

Figure 2. Fast packet inspection (FPI) frame structure.

In the H_P field, hash values H_P are included after hashing the plain text data payloads. The H_E field contains hash values H_E derived by hashing the encrypted data payloads. The H_P and H_E fields are used to verify the integrity of the data payloads.

In the H_bm field, hash values of components from C₁ to C_n, selected from the payloads, are included as bitmaps. In the process of comparing malicious hash lists, if the signature being matched is part of the payload rather than the entire payload, the detection is made by utilizing H_bm, which carries the hash of a specific component, C, of the payload in the form of a hash bitmap. In this study, the identification and fragment offset existing in the network layer for the IP protocols, and the URL and referer existing in the application layer for the HTTP protocols, were selected as the key components. However, other additional information that can be used as malicious traffic patterns, such as the source address, can also be defined and used depending on the use case.

The identification field in the IP protocol enables reassembly when a fragmented datagram is transmitted. The fragment offset field indicates the offset of a particular fragment relative to the beginning of the original unfragmented datagram. When the first fragment is too small to contain the entire transport header, DoS attacks may disrupt the firewalls or receivers. In such cases, the identification and fragment offset fields can be used to detect the attacks [29]. The Referer header field in the HTTP protocol identifies the source URL from which the request was obtained when the client sends a request to the server [30]. Since the HTTP Referer field is used in the process of concealing malicious websites [30], the malicious websites can be detected by checking if the domain of the Referer header field is present in the blacklist of malicious websites.
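As an illustration of the Referer check, the sketch below hashes the domain of a Referer URL and looks it up in a blacklist that is itself stored as digests, so that only hash values are compared. The blacklist entry `malicious.example`, the function names, and the choice to hash the lower-cased host name are assumptions made for this example, not details given in the paper.

```python
import hashlib
from urllib.parse import urlsplit


def domain_digest(url: str) -> bytes:
    """SHA-256 of the host name of a URL (urlsplit lower-cases the host)."""
    host = urlsplit(url).hostname or ""
    return hashlib.sha256(host.encode("utf-8")).digest()


# Hypothetical blacklist, stored as digests of known malicious domains.
BLACKLIST = {hashlib.sha256(b"malicious.example").digest()}


def referer_is_blacklisted(referer: str) -> bool:
    """True if the Referer's domain hash appears in the blacklist."""
    return domain_digest(referer) in BLACKLIST
```

A sender following this scheme could place `domain_digest(referer)` among the H_bm components so that a relay holding only the digest blacklist can perform the same lookup.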

In the CRC field, CRC values for an integrity inspection are entered through the validity confirmation of H_bm, H_P and H_E. The CRC is a widely used technique for detecting errors found in data resulting from noise occurring through the message transmission process during communication. The CRC codes are typically composed of 32 check bits attached at the end of the transmission data. The CRC bits are generated through a binary division of data bits using a predefined divisor agreed upon by the communicating devices. The generated check bits are encoded into the data before sending the message over the network. Subsequently, the receiver decodes the incoming data by conducting the same operation using the divisor, and checks if the message is error-free based on the resulting value.
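The binary division described above can be made concrete with a bit-by-bit CRC-32 routine. The sketch assumes the common reflected CRC-32 parameters (generator 0x04C11DB7, initial value and final XOR of 0xFFFFFFFF), which are the ones used by Python's `zlib.crc32`; the paper does not state which CRC polynomial its simulation uses.

```python
import zlib

# Bit-reversed form of the CRC-32 generator polynomial 0x04C11DB7.
POLY_REFLECTED = 0xEDB88320


def crc32_bitwise(data: bytes) -> int:
    """Compute CRC-32 by processing the data one bit at a time."""
    crc = 0xFFFFFFFF                      # initial register value
    for byte in data:
        crc ^= byte                       # fold the next byte into the register
        for _ in range(8):
            if crc & 1:                   # low bit set: subtract (XOR) the divisor
                crc = (crc >> 1) ^ POLY_REFLECTED
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF               # final XOR


# The receiver repeats the same computation and compares the result.
check = crc32_bitwise(b"123456789")
same_as_zlib = check == zlib.crc32(b"123456789")
```

In practice a table-driven or library implementation such as `zlib.crc32` would be used; the bitwise form is shown only to expose the division step.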

3.3. FPI Packet Transport Mechanism

Figure 3 shows a flow chart of an FPI. The packet transporting steps of the FPI are as follows.

Figure 3. FPI flowchart.

The sender node is designed to generate hash bitmaps (H_bm) by acquiring hash values of predefined key inspection components in the data payloads, and then generates hash values (H_P and H_E) both before and after the data payload encryption. Subsequently, CRCs are generated and encoded to the corresponding fields of the header to generate packets, and the packets are then delivered to the RS/ISS node.

In the RS/ISS node, the H_bm, H_P and H_E values incoming from the sender node are used to regenerate the CRC values, and the generated CRC values are then compared with the CRC values of the packets received from the sender node. If the CRC values do not match, the packets are immediately discarded. However, once the CRC comparison shows validity, the hash list is compared with the malware hash list. If the hash values match the malware hash list, the packets are blocked; otherwise, the packets are delivered to the receiver node.
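The RS/ISS decision logic can be sketched as a single function. The function name, the string return values, and the CRC input order (H_bm, then H_P, then H_E) are assumptions for illustration; the malware hash list is modeled as a Python set of digests.

```python
import zlib
from typing import List, Set


def relay_inspect(h_bm: List[bytes], h_p: bytes, h_e: bytes,
                  crc: int, malware_hashes: Set[bytes]) -> str:
    """Return 'discard', 'block' or 'forward' for a received FPI header."""
    # Integrity step: regenerate the CRC from the hash fields and compare.
    if zlib.crc32(b"".join(h_bm) + h_p + h_e) != crc:
        return "discard"
    # Inspection step: compare the hash list (H_bm, H_P) with the malware list.
    if h_p in malware_hashes or any(h in malware_hashes for h in h_bm):
        return "block"
    return "forward"
```

Note that the function never receives the payload itself: the relay decides using only the header fields, which is what allows E2EE to be kept.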

The receiver node is designed to generate CRC values based on the H_bm, H_P and H_E values of the packets transmitted from the RS/ISS node, and compares the generated CRC with the CRC received from the RS/ISS node. If the newly generated CRC values do not match the CRC values transmitted from the RS/ISS node, the packets are discarded; otherwise, the hash list is compared with the malware hash list. If the hash list matches the malware hash list, the packets are discarded; otherwise, the hash values of the encrypted data payloads and the hash values, H_E, received from the RS/ISS node are compared to check if they match. If the hash values match, the data payloads are restored to extract the hash values. The extracted hash values are then compared with the H_P values received from the RS/ISS node to check if both values are the same. Once the values are determined to be the same, the key component values, C, are compared with the received components of H_bm in the decrypted payloads. If the components match, the received data payloads are determined to be normal and can be accepted.
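The chain of receiver checks can be sketched as below. This is an illustrative ordering of the steps described above, not the authors' code: `decrypt` and `select_components` are hypothetical callables supplied by the caller (the former standing in for AES decryption with the shared key, the latter returning the key components C of a plaintext), and returning `None` stands in for discarding the packet.

```python
import hashlib
import zlib
from typing import Callable, List, Optional, Set


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def receive(ciphertext: bytes, h_bm: List[bytes], h_p: bytes, h_e: bytes,
            crc: int, malware_hashes: Set[bytes],
            decrypt: Callable[[bytes], bytes],
            select_components: Callable[[bytes], List[bytes]]) -> Optional[bytes]:
    """Return the accepted plaintext, or None if any check fails."""
    # 1. Regenerate the CRC from the received hash fields.
    if zlib.crc32(b"".join(h_bm) + h_p + h_e) != crc:
        return None
    # 2. Compare the hash list with the malware hash list.
    if h_p in malware_hashes or any(h in malware_hashes for h in h_bm):
        return None
    # 3. The hash of the encrypted payload must equal H_E.
    if sha256(ciphertext) != h_e:
        return None
    # 4. Decrypt, then the hash of the plaintext must equal H_P.
    plaintext = decrypt(ciphertext)
    if sha256(plaintext) != h_p:
        return None
    # 5. The hashes of the key components must equal the entries of H_bm.
    if [sha256(c) for c in select_components(plaintext)] != h_bm:
        return None
    return plaintext
```

The cheap checks come first, so a corrupted or known-malicious packet is rejected before any decryption work is done.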

4. Evaluation

4.1. Experimental Conditions and Environment

Packet inspection simulations were conducted to compare and evaluate the packet inspection time of the proposed FPI technique and the existing DPI technique. The FPI and DPI methods were modeled based on Python to compare the packet inspection time through the packet transmission simulation.

The hypothetical conditions for the experiment of this study are as follows. First, it is assumed that both the proposed FPI and the existing DPI models apply the key exchanges in advance, and that they are in the process of performing a packet transmission and inspection. The FPI model is executed by adding the values of H_bm, H_P, H_E, and CRC to the extension header. In addition, the existing DPI model is executed by analyzing the payloads after decrypting the encrypted payload data from the RS/ISS node, and then transmitting the message to the receiving end after encrypting the data again. Further, with the DPI model, it is assumed that the inspection is conducted only in the router nearest to the receiving end rather than in all routers that are involved in the network communication process between the two nodes. During the inspection process, advanced encryption standard (AES)-128 cipher block chaining was used for the data payload encryption and decryption processes of both models.

The threat conditions for this experiment are as follows. Man-in-the-middle attacks, eavesdropping, and personal data leakage can occur. If E2EE is not applied, privacy is likely to be violated because packets are decrypted at the RS/ISS node. In addition, there is a risk of data leakage and eavesdropping in the process of transmitting information. An attacker may also tamper with a packet and disguise it as a normal packet.

4.2. Experiment Procedures

For the experiment, the FPI and DPI comprised the sender, RS/ISS and receiver nodes.

Figure 4a illustrates the packet transmission and inspection processes of an FPI. First, in the sender node, a random data payload of a predefined length is generated, and specific component values from C₁ to C_n are then selected from the generated payload data to hash the data through the secure hash algorithm (SHA)-256 where the hashed values are set as H_bm. The specific components were used by dividing the payloads into 10-byte groups. For example, it is assumed that the first component uses the first 10 bytes of the payload, i.e., bytes 1–10, and the second component uses the next 10 bytes, i.e., bytes 11–20. The generated data payload is added to the data payload field after encrypting the data through the AES method. Both H_P and H_E use SHA-256 to hash the data payload and the encrypted data payload, respectively. Subsequently, the hashed values are entered into the corresponding fields. The CRC values are generated based on the values of H_bm, H_P and H_E. Finally, the derived CRC values are entered into the CRC field and the packet is transmitted to the RS/ISS node.
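The component selection described above (bytes 1–10, bytes 11–20, and so on) can be sketched as follows. The helper names are hypothetical, and the sketch assumes the components are taken consecutively from the start of the payload, as in the example given in the text.

```python
import hashlib
from typing import List

COMPONENT_SIZE = 10  # bytes per component, as in the simulation description


def split_components(payload: bytes, count: int) -> List[bytes]:
    """Return the first `count` consecutive 10-byte groups of the payload."""
    groups = [payload[i:i + COMPONENT_SIZE]
              for i in range(0, len(payload), COMPONENT_SIZE)]
    return groups[:count]


def hash_components(payload: bytes, count: int) -> List[bytes]:
    """SHA-256 digest of each selected component (the entries of H_bm)."""
    return [hashlib.sha256(c).digest() for c in split_components(payload, count)]
```

All payload lengths used in the experiment (160 to 960 bytes) are multiples of 10, so no partial trailing group arises there.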

Figure 4. Packet transmission process: (a) is for FPI; (b) is for Deep Packet Inspection (DPI).

The RS/ISS node checks the validity of the CRC and compares and inspects the hash list with the malware hash list to conduct packet filtering. During this process, if the received packet hash list equals the malware hash list or if the CRC values turn out to be invalid, the packet is discarded during the filtering. The malware hash list is constructed by generating 10 data entries of a predefined length and is set using malware hash values for the comparison process.

The receiver node verifies the validity of the CRC and compares the received hash list with the malware hash list. Subsequently, the node checks whether the H_E, H_P, and H_bm values received from the RS/ISS node match the values it computes itself from the AES-encrypted data payload, the decrypted payload, and the specific components, respectively. The FPI was designed to accept the data payload if all three field hash values are found to match.

Figure 4b illustrates the packet transmission and inspection processes of a DPI. In the DPI model, after generating a random data payload of a predefined length in the sender node, the payload is encrypted with AES and the resulting values are specified in the data payload field. Subsequently, the packet is transmitted to the RS/ISS node. The RS/ISS node first decrypts the received packet through the AES, then compares the resulting data payload of the packet with the malware list. The packet is discarded if the values match. Otherwise, the payload is encrypted with AES again to route the packet to the receiver node.

The receiver node also decrypts the received packet to compare the data payload received from the RS/ISS node with the malware list. The packet is discarded if the values match; otherwise, the data payload is accepted for use.

The transmission times of the FPI and DPI models were evaluated by comparing the time taken for the entire process, i.e., the time it takes for the sender node to generate packets for transmission and the receiver node to finish inspecting the packets and determine if the payload is valid for use.

4.3. Experimental Results

Figure 5 shows a graph comparing the packet transmission time of the proposed FPI model and the existing DPI model according to the payload length. The x-axis of the graph indicates the length of the packet payload transmitted by the sender node, and the y-axis indicates the average packet transmission time. Each transmission round involves the sender node transmitting payloads from 160 to 960 bytes with an increment of 160 bytes, and the receiver node decrypting the transmitted payloads into plain text and conducting all required inspections. The average transmission times were calculated by conducting 10,000 rounds for each payload length.
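A measurement loop of this kind might look like the sketch below. It is an assumption-laden outline rather than the authors' simulator: `stand_in_inspection` is a placeholder for the full sender-to-receiver pipeline, the random payload is generated outside the timed region by choice, and `time.perf_counter` is used as the clock.

```python
import hashlib
import os
import time
from typing import Callable, Dict

PAYLOAD_LENGTHS = range(160, 961, 160)  # 160, 320, ..., 960 bytes


def stand_in_inspection(payload: bytes) -> bytes:
    """Placeholder workload; a real run would execute the FPI or DPI model."""
    return hashlib.sha256(payload).digest()


def average_time_ms(process: Callable[[bytes], object],
                    length: int, rounds: int) -> float:
    """Mean wall-clock time of `process` over `rounds` random payloads, in ms."""
    total = 0.0
    for _ in range(rounds):
        payload = os.urandom(length)
        start = time.perf_counter()
        process(payload)
        total += time.perf_counter() - start
    return total / rounds * 1000.0


def sweep(process: Callable[[bytes], object],
          rounds: int = 10_000) -> Dict[int, float]:
    """Average time per payload length, mirroring the 10,000-round setup."""
    return {n: average_time_ms(process, n, rounds) for n in PAYLOAD_LENGTHS}
```

Calling `sweep(stand_in_inspection)` yields one average per payload length; absolute values depend on the machine and are not comparable with the figures reported in the paper.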

Figure 5. Comparison of packet transmission times of FPI and DPI.

As a result, the proposed FPI method was shown to provide a shorter overall transmission time than the DPI. When the payload length was 160 bytes, the FPI took approximately 0.661 ms, whereas the DPI took approximately 4 ms, suggesting that the latency of the DPI packet transmission and inspection processes was approximately six times greater than that of the FPI. Further, when the payload length was 960 bytes, the FPI took approximately 1.271 ms, whereas the DPI took approximately 24.052 ms, indicating that the DPI packet transmission and inspection processes faced 18.9 times greater latency compared to that of the FPI. Accordingly, it can be inferred that the FPI supports a faster packet transmission than the DPI, and as the payload length increases, the difference in transmission time also increases linearly. Further, the simulation results showed that the FPI model, which successfully maintains the security level by inspecting the packets in an encrypted state, is a more stable and time-efficient mechanism than the DPI model, which directly compares the data payload through the encryption and decryption processes using the actual data in the RS/ISS node.
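The reported ratios can be checked with a short calculation. The sketch reads the 960-byte FPI time as 1.271 ms, which is the value consistent with the stated 18.9-fold ratio; the dictionary names are illustrative only.

```python
# Average times in milliseconds, taken from the reported results.
dpi_ms = {160: 4.0, 960: 24.052}
fpi_ms = {160: 0.661, 960: 1.271}  # 960-byte value read as 1.271 ms (assumption)

# Ratio of DPI latency to FPI latency for each payload length.
ratios = {n: dpi_ms[n] / fpi_ms[n] for n in dpi_ms}

for n, r in ratios.items():
    print(f"{n} bytes: DPI/FPI = {r:.1f}")
```

The two ratios come out at roughly 6 and 18.9, matching the figures quoted for the 160-byte and 960-byte payloads.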

Figure 6 shows a comparison of the average packet transmission time of the proposed FPI model according to the number of components. The x-axis denotes the length of the packet payload transmitted by the sender node, and the y-axis denotes the average transmission time according to payload length. Each transmission round denotes the sender node transmitting payloads from 160 to 960 bytes with an increment of 160 bytes, and the receiver node decrypting the transmitted payloads into plain text and performing all of the required inspections. The average transmission times were calculated by conducting 10,000 rounds for each payload length.

Figure 6. Packet transmission time comparison of FPI according to the number of components.

FPI1 used packets with H_P and H_E, FPI2 used packets with H_P, H_E, and one component of H_bm, and FPI3 used packets with H_P, H_E, and two components of H_bm. In addition, H_P denotes the hash values of the payload and H_E represents the hash values derived by hashing the encrypted payload using SHA-256. H_P and H_E are 32-byte values. The components of H_bm were assumed to be 10-byte fragments obtained by dividing the payload. Accordingly, 64, 96, and 128 bytes were encoded into the extension headers of FPI1, FPI2, and FPI3, respectively, for the simulation.
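The header sizes quoted above follow from the 32-byte SHA-256 digest length. The sketch assumes that each H_bm component contributes one full digest and that the CRC field is not counted, which is the reading consistent with the 64, 96, and 128-byte figures; the function name is hypothetical.

```python
import hashlib

DIGEST_BYTES = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def extension_hash_bytes(n_components: int) -> int:
    """Bytes occupied by H_P, H_E and n component hashes (CRC field excluded)."""
    return DIGEST_BYTES * (2 + n_components)


sizes = {name: extension_hash_bytes(n)
         for name, n in [("FPI1", 0), ("FPI2", 1), ("FPI3", 2)]}
```

Under these assumptions each additional component adds a fixed 32 bytes to the extension header, regardless of the payload length.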

The results show that the transmission time increases in proportion to the number of components (H_bm). However, the difference was minimal, with the maximum and minimum differences being 0.064 and 0.018 ms, respectively. Further, compared with the DPI results in Figure 5, even when 290 components (H_bm) were added to the extension header for the 800-byte payload length, the FPI took a shorter time than the DPI when inspecting a packet of the same length. Because adding more components (H_bm) can be interpreted as improving the security level, the simulation indicates that the FPI is a suitable mechanism capable of delivering high security relative to its transmission latency. In the future, we expect to scale the extension header to enable a quick inspection of various components.

Table 1 compares the DPI and FPI methods on four evaluation criteria. The inspection coverage is the ratio of the inspected components to the entire payload. The transmission time and inspection coverage were analyzed for a 640-byte payload. For example, 10% inspection coverage means that 10% of the total payload is subject to inspection, whereas 100% inspection coverage means that the entire payload is divided into multiple components, each converted into a hash value and stored in the form of a hashmap, so that the whole payload is inspected. Comparing the transmission times across coverage levels, the FPI was approximately 15.4 times quicker than the DPI at 10% inspection coverage, approximately 14.4 times quicker at 20%, and approximately 5.3 times quicker at 100%. Although the transmission time increased with the inspection coverage, the FPI remained quicker than the DPI even when the entire payload was inspected. Based on this observation, we conclude that the FPI mechanism is more efficient than the DPI mechanism in terms of packet transmission time.
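The speed-up factors are simply the ratio of the DPI time to the FPI time at each coverage level. The sketch below recomputes them from the rounded millisecond values listed in Table 1; the helper name `speedup` is hypothetical. Because the table entries are rounded, ratios recomputed this way need not coincide exactly with the factors quoted in the text.

```python
# Average transmission times (ms) from Table 1, 640-byte payload.
DPI_MS = 15.74
FPI_MS = {10: 1.06, 20: 1.09, 100: 2.88}  # inspection coverage (%) -> time (ms)


def speedup(dpi_ms: float, fpi_ms: float) -> float:
    """How many times faster the FPI is than the DPI at one coverage level."""
    return dpi_ms / fpi_ms


# Ratio per coverage level, rounded to one decimal place.
speedups = {cov: round(speedup(DPI_MS, t), 1) for cov, t in FPI_MS.items()}
for cov, ratio in speedups.items():
    print(f"coverage {cov:>3}%: FPI is about {ratio} times faster than DPI")
```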

Table 1. Comparison of DPI and FPI.

The E2EE retention criterion indicates whether E2EE can be fully maintained while data are transmitted. The FPI method detects and prevents malicious behaviors while maintaining E2EE by loading irreversible hash values and hash bitmaps (H_bm, H_P, and H_E) as information fields into the extension header, without having to decrypt the encrypted payload. By contrast, E2EE cannot be applied with the DPI method because it requires decrypting the payload at the RS/ISS node to detect malicious behaviors. Further, because the DPI decrypts the payload into plain text, the payload data may be exposed to attackers if the DPI is conducted at an RS/ISS node in which various security vulnerabilities are present.
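To illustrate how a relay could act on these header fields without touching the plaintext, the following minimal sketch checks H_E against the received ciphertext and looks up each H_bm component in a set of known-bad digests. The function name `inspect_without_decrypting`, the blacklist representation, the verdict strings, and the sample fragments are all assumptions made for this example; they are not taken from the simulation code.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Return the 32-byte SHA-256 digest of data."""
    return hashlib.sha256(data).digest()


def inspect_without_decrypting(ciphertext: bytes, h_e: bytes,
                               h_bm: list, blacklist: set) -> str:
    """Return 'drop' or 'forward' using only header fields and the ciphertext."""
    # Integrity check: H_E must equal the digest of the encrypted payload.
    if sha256(ciphertext) != h_e:
        return "drop"
    # Malware check: reject if any H_bm component matches a known-bad digest.
    if any(component in blacklist for component in h_bm):
        return "drop"
    return "forward"


bad_fragment = b"MALWARE_10"            # hypothetical 10-byte malware pattern
blacklist = {sha256(bad_fragment)}      # digests of known malware fragments
ciphertext = b"\x8f" * 160              # opaque bytes standing in for AES output
h_e = sha256(ciphertext)

verdicts = [
    # Clean fragment, intact ciphertext.
    inspect_without_decrypting(ciphertext, h_e, [sha256(b"hello worl")], blacklist),
    # Fragment whose digest is on the blacklist.
    inspect_without_decrypting(ciphertext, h_e, [sha256(bad_fragment)], blacklist),
    # Ciphertext modified in transit, so H_E no longer matches.
    inspect_without_decrypting(ciphertext + b"\x00", h_e, [sha256(b"hello worl")], blacklist),
]
print(verdicts)  # ['forward', 'drop', 'drop']
```

The relay in this sketch never holds a decryption key: every decision is derived from the ciphertext and the hash fields alone.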

The complexity of establishing the environment was also compared for each model. The DPI is a payload-based inspection method; because it inspects the actual data, extracting information is difficult. In addition, it requires substantial processing resources and incurs a high maintenance cost for storing the payload; hence, the DPI model is used under certain constraints [31]. Further, the throughput and cost of a DPI engine may vary significantly depending on the hardware platform and matching algorithms used, and factors such as scalability and flexibility may also differ depending on the structure of the pattern-matching unit [32]. By contrast, the FPI uses only the header part of the original packet frame structure and does not require existing network devices to be adjusted according to protocols and functions. Further, the cost of device replacement and data management is low because the computation is simplified by using only hash values. Accordingly, the FPI method can be considered more cost-effective in terms of the overall environment to be established.

In summary, the FPI method showed quicker overall packet transmission than the DPI method in the simulation using AES encryption, which is widely applied in E2EE environments. This result indicates that the FPI method performs more efficiently than the DPI method, which requires decryption, inspection, re-encryption, and transmission at the RS/ISS node. Furthermore, because the FPI supports adding the desired number of H_bm components, which determines the security level of the inspection, it can be concluded that the FPI method is a more effective inspection technique than the DPI method in terms of both transmission time and security.

5. Conclusions

In this work, we focused on resolving the latency problem, which is a major weakness of DPI. We proposed an FPI model, comprising a packet frame structure and a transport mechanism, that supports data integrity and malware detection while maintaining E2EE. Both approaches were modeled separately, and inspection simulations were conducted to compare the inspection speed and coverage of the proposed FPI method with those of the existing DPI method. Because the DPI method requires encryption, decryption, inspection, re-encryption, and transmission processes at the RS/ISS node, the proposed FPI method was found to provide a quicker transmission time while maintaining E2EE, so the efficiency gain grows as the number of packets increases. In addition, the FPI can be scaled flexibly by adjusting the scope of the components to be inspected according to the given network environment and its requirements. In summary, the proposed FPI method can be regarded as a packet inspection method suitable for network environments that must simultaneously satisfy security, transmission speed, and scalability requirements. Further, the proposed method is expected to be used as a part of advanced network security technology in the future.

The FPI is inherently less accurate than the DPI. Using more hashmaps improves the detection accuracy, but at the cost of increased processing latency and a greater risk of information leakage. In addition, because the FPI relies on blacklist-based matching of identical hashes, malicious nodes can easily bypass the detection logic. With a whitelist-based approach, by contrast, it would be practically impossible to produce a matching hash within a given turnaround time. In future work, we plan to implement and evaluate an adaptive FPI/DPI scheme and a whitelist-based FPI scheme.
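The blacklist limitation can be made concrete with a small sketch. It assumes exact-match lookup of SHA-256 digests, and the fragment values and helper names are hypothetical. A single changed byte produces an unrelated digest, so a mutated fragment slips past a blacklist, whereas a whitelist rejects anything whose digest is not already approved.

```python
import hashlib


def digest(fragment: bytes) -> bytes:
    """Return the SHA-256 digest used for exact-match lookup."""
    return hashlib.sha256(fragment).digest()


known_bad = b"EVILCODE01"   # hypothetical fragment already on the blacklist
mutated = b"EVILCODE02"     # same fragment with one byte changed by the attacker
approved = b"hello worl"    # hypothetical fragment on the whitelist

blacklist = {digest(known_bad)}
whitelist = {digest(approved)}


def blocked_by_blacklist(fragment: bytes) -> bool:
    """Blacklist policy: block only fragments whose digest is listed."""
    return digest(fragment) in blacklist


def allowed_by_whitelist(fragment: bytes) -> bool:
    """Whitelist policy: allow only fragments whose digest is listed."""
    return digest(fragment) in whitelist


results = {
    "blacklist blocks known_bad": blocked_by_blacklist(known_bad),
    "blacklist blocks mutated": blocked_by_blacklist(mutated),
    "whitelist allows mutated": allowed_by_whitelist(mutated),
    "whitelist allows approved": allowed_by_whitelist(approved),
}
print(results)
```

In this sketch the mutated fragment evades the blacklist but is still rejected by the whitelist, because passing the whitelist would require finding an input whose digest collides with an approved one.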

Author Contributions

Conceptualization, S.-Y.K. and I.-G.L.; methodology, S.-Y.K., S.-W.Y. and E.-Y.L.; software, S.-W.Y. and E.-Y.L.; validation, S.-Y.K. and I.-G.L.; investigation, S.-H.B.; resources, S.-W.Y.; visualization, S.-H.B.; writing—original draft preparation, S.-Y.K.; writing—review and editing, I.-G.L.; supervision, I.-G.L.; project administration, I.-G.L.; funding acquisition, I.-G.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partly supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2017R1C1B5074695 and No. 2020R1F1A1061107) and Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (No. 2020-0089).

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

E2EE	End-to-End Encryption
FPI	Fast Packet Inspection
DPI	Deep Packet Inspection
CRC	Cyclic Redundancy Check
RS/ISS	Relay Server including an Information Security System

References

1. Wang, D.; Cheng, H.; He, D.; Wang, P. On the Challenges in Designing Identity-Based Privacy-Preserving Authentication Schemes for Mobile Devices. IEEE Syst. J. 2018, 12, 916–925.
2. DMC MEDIA. 2019 Mobile Messenager App Usage Behavior by DMC REPORT. Available online: https://www.dmcreport.co.kr/ (accessed on 7 May 2019).
3. Triyason, T.; Tassanaviboon, A.; Kanthamanon, P. Hybrid Classroom: Designing for the New Normal after COVID-19 Pandemic. In Proceedings of the 11th International Conference on Advances in Information Technology (IAIT2020), Bangkok, Thailand, 1–3 July 2020; ACM: New York, NY, USA, 2020; Volume 30, pp. 1–8.
4. Endeley, R.E. End-to-End Encryption in Messaging Services and National Security—Case of WhatsApp Messenger. J. Inf. Secur. 2018, 9, 95–99.
5. Rösler, P.; Mainka, C.; Schwenk, J. More is Less: On the End-to-End Security of Group Chats in Signal, WhatsApp, and Threema. In Proceedings of the 2018 IEEE European Symposium on Security and Privacy (EuroS&P), London, UK, 24–26 April 2018; pp. 415–429.
6. Cohn-Gordon, K.; Cremers, C.; Dowling, B.; Garratt, L.; Stebila, D. A Formal Security Analysis of the Signal Messaging Protocol. In Proceedings of the 2018 IEEE European Symposium on Security and Privacy (EuroS&P), London, UK, 24–26 April 2018; pp. 451–466.
7. Karbasi, A.H.; Shahpasand, S. A post-quantum end-to-end encryption over smart contract-based blockchain for defeating man-in-the-middle and interception attacks. Peer-to-Peer Netw. Appl. 2020, 13, 1–19.
8. Wang, W.; Zhu, M.; Wang, J.; Zeng, X.; Yang, Z. End-to-end encrypted traffic classification with onedimensional convolution neural networks. In Proceedings of the IEEE International Conference on Intelligence and Security Informatics (ISI), Beijing, China, 22–24 July 2017; pp. 43–48.
9. Amaral, P.; Dinis, J.; Pinto, P.; Bernardo, L.; Tavares, J.; Mamede, H.S. Machine learning in software defined networks: Data collection and traffic classification. In Proceedings of the IEEE 24th International Conference on Network Protocols (ICNP), Singapore, 8–11 November 2016; pp. 1–5.
10. Ren, H.; Li, H.; Liu, D.; Xu, G.; Cheng, N.; Shen, X.S. Privacy-preserving Efficient Verifiable Deep Packet Inspection for Cloud-assisted Middlebox. IEEE Trans. Cloud Comput. 2020.
11. Garcia, J.; Korhonen, T.; Andersson, R.; Västlund, F. Towards Video Flow Classification at a Million Encrypted Flows Per Second. In Proceedings of the IEEE 32nd International Conference on Advanced Information Networking and Applications (AINA), Krakow, Poland, 16–18 May 2018; pp. 358–365.
12. Sherry, J.; Lan, C.; Popa, R.A.; Ratnasamy, S. Blindbox: Deep packet inspection over encrypted traffic. In Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, London, UK, 17–21 August 2015; pp. 213–226.
13. Nabeel, M. The Many Faces of End-to-End Encryption and Their Security Analysis. In Proceedings of the IEEE International Conference on Edge Computing (EDGE), Honolulu, HI, USA, 25–30 June 2017; pp. 252–259.
14. Shirvanian, M.; Saxena, N.; George, J.J. On the pitfalls of end-to-end encrypted communications: A study of remote key-fingerprint verification. In Proceedings of the 33rd Annual Computer Security Applications Conference, Orlando, FL, USA, 4–8 December 2017; pp. 499–511.
15. Espinoza, A.M.; Tolley, W.J.; Crandall, J.R.; Crete-Nishihata, M.; Hilts, A. Alice and bob, who the FOCI are they?: Analysis of end-to-end encryption in the LINE messaging application. In Proceedings of the 7th USENIX Workshop on Free and Open Communications on the Internet (FOCI 17), Vancouver, BC, Canada, 14 August 2017.
16. Zoom. E2E Encryption for Zoom Meetings. Available online: https://blog.zoom.us/end-to-end-encryption-update/ (accessed on 17 June 2020).
17. Kumar, S.; Hu, Y.; Andersen, M.P.; Popa, R.A.; Culler, D.E. JEDI: Many-to-Many End-to-End Encryption and Key Delegation for IoT. In Proceedings of the 28th USENIX Security Symposium (USENIX Security 19), Santa Clara, CA, USA, 14–16 August 2019; pp. 1519–1536.
18. Park, S.-Y.; Park, S.-M.; Park, U.-S.; Kim, H.-G. Messenger Program with End-to-end Encryption and Digital Signature. JKICS 2016, 305–306.
19. Aceto, G.; Ciuonzo, D.; Montieri, A.; Pescapè, A. MIMETIC: Mobile encrypted traffic classification using multimodal deep learning. Comput. Netw. 2019, 165, 106944.
20. Xu, C.; Chen, S.; Su, J.; Yiu, S.M.; Hui, L.C.K. A Survey on Regular Expression Matching for Deep Packet Inspection: Applications, Algorithms, and Hardware Platforms. IEEE Commun. Surv. Tutor. 2016, 18, 2991–3029.
21. Elagin, V.S.; Goldshtein, B.S.; Onufrienko, A.V.; Zarubin, A.A.; Savelieva, A.A. The efficiency of the DPI system for identifying traffic and providing the quality of OTT services. In Proceedings of the 2018 Systems of Signals Generating and Processing in the Field of on Board Communications, Moscow, Russia, 14–15 March 2018; pp. 1–5.
22. Yu, F.; Katz, R.H.; Lakshman, T.V. Gigabit rate packet pattern-matching using TCAM. In Proceedings of the 12th IEEE International Conference on Network Protocols, Berlin, Germany, 8 October 2004; pp. 174–183.
23. Jung, J.-W. Communication Equipment Related Information Leakage Risk Analysis and Preparation. Available online: https://intelligence.na.go.kr:444/intelligence/reference/reference01.do?mode=view&articleNo=663003&article.offset=0&articleLimit=10 (accessed on 9 April 2019).
24. Al-hisnawi, M.; Ahmadi, M. Deep Packet Inspection Using Quotient Filter. IEEE Commun. Lett. 2016, 20, 2217–2220.
25. Al-hisnawi, P.M.; Ahmadi, M. QCF for deep packet inspection. IET Netw. 2018, 7, 346–352.
26. Li, Y.; Li, J. MultiClassifier: A combination of DPI and ML for application-layer classification in SDN. In Proceedings of the 2014 2nd International Conference on Systems and Informatics (ICSAI 2014), Shanghai, China, 15–17 November 2014; pp. 682–686.
27. Doroud, H.; Aceto, G.; Donato, W.-D.; Jarchlo, E.A.; Lopez, A.M.; Guerrero, C.D. Speeding-Up DPI Traffic Classification with Chaining. In Proceedings of the 2018 IEEE Global Communications Conference, Abu Dhabi, UAE, 9–13 December 2018; pp. 1–6.
28. Kim, J.-W.; Choi, J.-I. Software-Defined Networking (SDN) Based Integrated Security Switch Design. IEIE 2019, 103, 552–553.
29. John, W.; Olovsson, T. Detection of malicious traffic on back-bone links via packet header analysis. Campus-Wide Inf. Syst. 2008, 25, 342–358.
30. Mansoori, M.; Hirose, Y.; Welch, I.; Choo, K.R. Empirical Analysis of Impact of HTTP Referer on Malicious Website Behaviour and Delivery. In Proceedings of the IEEE 30th International Conference on Advanced Information Networking and Applications (AINA), Crans-Montana, Switzerland, 23–25 March 2016; pp. 941–948.
31. Gong, C.; Sarac, K. IP traceback based on packet marking and logging. In Proceedings of the IEEE International Conference on Communications, Seoul, Korea, 16–20 May 2005; Volume 2, pp. 1043–1047.
32. Lin, P.-C.; Lin, Y.-D.; Lee, T.-H.; Lai, Y.-C. Using String Matching for Deep Packet Inspection. IEEE Comput. 2008, 41, 23–28.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Category	DPI	FPI	FPI	FPI
Inspection coverage	100%	10%	20%	100%
Average transmission time (ms)	15.74	1.06	1.09	2.88
E2EE retention	No	Yes	Yes	Yes
Complexity of establishing environment	Complex [31]	Simple	Simple	Simple