Comparison of Hash Functions for Network Traffic Acquisition Using a Hardware-Accelerated Probe

Korona, Mateusz; Szumełda, Paweł; Rawski, Mariusz; Janicki, Artur

doi:10.3390/electronics11111688


Faculty of Electronics and Information Technology, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland

* Author to whom correspondence should be addressed.

Electronics 2022, 11(11), 1688; https://doi.org/10.3390/electronics11111688

Submission received: 29 April 2022 / Revised: 19 May 2022 / Accepted: 23 May 2022 / Published: 25 May 2022

(This article belongs to the Special Issue Cybersecurity and Data Science)


Abstract

In this article we address the problem of efficient and secure monitoring of computer network traffic. We proposed, implemented, and tested a hardware-accelerated implementation of a network probe, using the DE5-Net FPGA development platform. We showed that even when using a cryptographic SHA-3 hash function, the probe uses less than 17% of the available FPGA resources, offering a throughput of over 20 Gbit/s. We have also researched the problem of choosing an optimal hash function to be used in a network probe for addressing network flows in a flow cache. In our work we compared five 32-bit hash functions, including two cryptographic ones: SHA-1 and SHA-3. We ran a series of experiments with various hash functions, using traffic replayed from the CICIDS 2017 dataset. We showed that SHA-1 and SHA-3 provide flow distributions as uniform as the ones offered by the modified Vermont hash function proposed in 2008 (i.e., with low means and standard deviations of the bucket occupation), yet assuring higher security against potential attacks on a network probe.

Keywords:

traffic analysis; network probe; hash function; SHA-3; FPGA

1. Introduction

At present, society is witnessing an unparalleled pace of technological development and global expansion of the Internet. An increasing number of ventures rely on network connectivity, both in the public sector and in business. Entities connected to the Internet range from those used for leisure purposes to elements of critical infrastructure, such as industrial process control or transportation management systems. In the background, a new technology paradigm known as the Internet of Things (IoT) is evolving, which consists of objects that collect, process, and exchange data via diverse networks, often operating without direct human supervision [1]. This automation is one of the reasons why people are already surrounded by massive numbers of IoT devices; it is estimated that about 75 billion IoT devices will be connected to the network by 2025 [2].

In parallel, computer networks enable criminal activities named cybercrimes [3]. Constantly, new cybercrime types are being developed [4]. Some methods were previously associated only with mafia and now are a threat in the virtual world. This includes extortion using distributed denial of service (DDoS) attacks or ransomware—software that encrypts user data for ransom. According to the NETSCOUT Threat Intelligence Report [5], 9.7 million DDoS attacks were encountered in 2021. As Cybersecurity Ventures estimates [6], global cybercrime costs will grow yearly by 15%, reaching 10.5 trillion US dollars annually by 2025. Even though general awareness of various cybersecurity threats is increasing, as is the overall level of safety, constant effort to improve countermeasures is required. The growing number of targets, new attack vectors, and the fact that malware constantly evolves do not make this an easy task. It is estimated that over 450,000 new malicious programs and potentially unwanted applications (PUA) are registered every day [7].

In response to numerous network threats, various cybersecurity methods have been proposed. The first safeguards of a network are firewalls and intrusion detection/prevention systems (ID/PS), whose task is to analyze incoming traffic and intercept packets when a malicious signature is detected. Collecting IP traffic information for network monitoring is a common practice of network operators and researchers. To build a coarse-grained understanding of network traffic, the concept of network flows is used. It records traffic statistics in the form of flow records. Each record contains important information about a flow, such as its source and destination Internet Protocol (IP) addresses, start and end timestamps, types of service, and application ports, along with the volume of packets or bytes, etc. IP packets are assigned into flows based on their characteristics, such as source or destination address, protocol type carried, and protocol port numbers (for TCP and UDP) that can be referred to as flow keys. As a result of the analysis procedure, which often incorporates the most cutting-edge approaches, including machine learning [8,9,10], disallowed flows can be eliminated.

Flow-based network monitoring is today the most widespread technology, and NetFlow [11,12,13] is a widely used tool in network measurement and analysis. It is now gradually evolving into one of the most important means of ensuring network cybersecurity.

Performance of NetFlow monitoring tools has been identified as a crucial factor in network security allowing for the application of immediate countermeasures. It has been widely addressed, including the possibility for its hardware acceleration [14,15,16,17,18]. However, it is important to note that also the monitoring device itself can be a target of a specialized cyberattack [19], especially when the assailant has appropriate knowledge and is willing to spend their resources and time for initial reconnaissance. Crossfire [20] is an example of such a sophisticated attack (in comparison to the brute-force DDoS attack), tailored to a targeted enterprise, that can isolate a target area by flooding carefully selected network links.

NetFlow-like tools face great challenges when both the speed and complexity of the network traffic increase. To keep up with the multigigabit speed of network traffic, especially on high-bandwidth backbone links, NetFlow probes incorporate advanced techniques to efficiently store and manipulate flow records [21]. A fast local memory inside the probe, known as flow cache, is used to store the active flows. The flow cache is organized in a data structure called a flow table, which consists of a list of flow records, one for each active flow.

Efficiently processing incoming packets and accessing the stored flow records based on the flow key of the current packet often requires sophisticated data structures that vastly reduce computational complexity. Hash-based data structures are commonly proposed for this purpose as a solution allowing high-speed packet processing. Such data structures are usually coupled with a hashing function that maps a flow key to a flow cache location. Unfortunately, applying a perfect hashing function that maps each flow key to a distinct flow cache location is not possible in practice. Thus, it is crucial to select a hashing function that maps only a small number of flow keys onto the same flow cache location, a so-called hash bucket. If the number of collisions is sufficiently small, then hash tables work quite well and give O(1) search times. To ensure optimal utilization of the hash table and reduce the vulnerability of a NetFlow probe to cyberattacks, the hash function needs to be carefully chosen. If it is not, malicious traffic may be able to create collisions that degenerate the hash table into linked lists with worst-case lookup times of O(n) and greatly reduce the performance of the flow cache modules.
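The effect of collisions on lookup cost can be illustrated with a minimal sketch of a chained hash table. The helper names (build_table, lookup_cost) and the two toy hash functions are assumptions made for this example only; they do not come from any of the probes discussed here.

```python
from collections import defaultdict

def build_table(keys, hash_fn, n_buckets):
    """Place every key in the bucket chosen by hash_fn; return the bucket lists."""
    buckets = defaultdict(list)
    for key in keys:
        buckets[hash_fn(key) % n_buckets].append(key)
    return buckets

def lookup_cost(buckets, hash_fn, n_buckets, key):
    """Number of key comparisons needed to find `key` in its bucket."""
    chain = buckets[hash_fn(key) % n_buckets]
    return chain.index(key) + 1

keys = list(range(1000))
n_buckets = 1024

def spread_fn(k):
    return k      # every key lands in its own bucket

def collide_fn(k):
    return 0      # adversarial case: every key lands in bucket 0

spread = build_table(keys, spread_fn, n_buckets)
collided = build_table(keys, collide_fn, n_buckets)

worst_spread = max(lookup_cost(spread, spread_fn, n_buckets, k) for k in keys)
worst_collided = max(lookup_cost(collided, collide_fn, n_buckets, k) for k in keys)
print(worst_spread, worst_collided)  # 1 1000
```

With well-spread keys the worst lookup needs a single comparison, whereas the fully collided table behaves like one list of n entries.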

In [19], the authors evaluated the resilience of hash functions used in the software-based NetFlow probes nProbe and Vermont. Theoretical analysis and real attacks proposed by the authors show how easily flow monitors can be overloaded if the hash algorithm has not been carefully chosen. The paper also presents a hash function that seems to offer protection against hash collision attacks and computes fast enough to be deployed in high-speed flow meters.

The obvious countermeasure against hash collision-based attacks (hash flooding or HashDoS) is a hash function for which collisions cannot easily be created. Cryptographic hash functions would provide such a feature; however, they are computationally expensive, which makes them difficult to use efficiently in NetFlow probes. Implementing such network monitoring elements under rigorous throughput requirements may be challenging. Hardware acceleration of their crucial functions can be an aid here. Still, to the best of our knowledge, there is a lack of publications discussing hardware-accelerated network probes for network traffic analysis with dedicated hash functions that would be resilient to targeted attacks.

Our article aims to fill this gap. In this work, we propose a hardware-accelerated network probe that accelerates extraction of network packet characteristics and calculation of the hash identifier. In addition, we describe the application of the cryptographic hash functions SHA-1 and SHA-3 to map a flow key to a flow cache location. The efficiency of our approach is compared with the solutions discussed in [19].

Our article is organized as follows: First, in Section 2 we present the concept of a hardware-accelerated network probe and review different hashing algorithms. Next, in Section 3 we describe the experiments conducted. Their results are presented in Section 4, followed by discussion and conclusions in Section 5.

2. Materials and Methods

In this section, we outline the concept of a hardware-accelerated network probe (Section 2.1). Different hash algorithms that can produce hash table keys are discussed in Section 2.2. Details of hardware implementations and functional verification of the design are described in Section 2.3.

2.1. Hardware-Accelerated Network Probe

A network probe is a tool which acquires parameters from network traffic for traffic-analysis purposes. In this work, we used a hardware-accelerated version of the software network probe proposed in [22], which is also briefly presented here. The block diagram of the probe is presented in Figure 1. The network probe processes the traffic data in the following steps:

capture network packets from a specific interface,
analyze packets in chosen network stack layers,
extract flow key and other features from the current packet,
compute the hash value from the flow key,
create or update a network flow record in the active flow cache,
export inactive flows to the expired flow table,
calculate flow parameters for the expired flows,
store flow parameters in the output dataset.

The traffic captured from a network interface is analyzed and then a network flow record is created or an existing one is updated in the flow cache. Packet headers are analyzed in terms of second, third, and fourth ISO/OSI Reference Model layers. Assignment of new packets to flows is based on a hash function of the header parameters, which is calculated using the IP source address, the IP destination address, the source port number, the destination port number, and information on the transport layer protocol.
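As an illustration of how the five header fields can form a single flow key, the following sketch packs them into 13 bytes (104 bits). The function name pack_flow_key, the field order, and the use of network byte order are assumptions made for this example; the text does not fix them at this point.

```python
import ipaddress
import struct

def pack_flow_key(src_ip, dst_ip, protocol, src_port, dst_port):
    """Pack the 5-tuple into 13 bytes (104 bits) in network byte order.
    The field order is an assumption made for this example."""
    return struct.pack(
        "!IIBHH",
        int(ipaddress.IPv4Address(src_ip)),   # 32-bit source address
        int(ipaddress.IPv4Address(dst_ip)),   # 32-bit destination address
        protocol,                             # 8-bit transport protocol number
        src_port,                             # 16-bit source port
        dst_port,                             # 16-bit destination port
    )

key = pack_flow_key("192.168.1.10", "10.0.0.1", 6, 49152, 443)
print(len(key) * 8, key.hex())
```

Any hash function that accepts a byte string can then be applied to the packed key.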

Considering the transport layer protocols, the conditions for classifying the stream as ended are RST or FIN flags in the case of TCP, and reaching a predefined inactivity time in the case of UDP. The flows considered as ended are statistically analyzed and their parameters extracted, as described in the next section. Expired flows are dumped to a file.

Captured packets are processed starting with the second ISO/OSI layer. From the data link layer, information about the timestamp and the packet length is fetched. The Ether_type field contains information about the higher-layer protocol used, which is, in the network probe’s case, IPv4. After receiving the IP header, it is possible to decode the source and destination IP addresses, along with the transport layer protocol. Knowing the values of the headers of transport layer protocols, it is possible to decode the recipient’s port, and the TCP flags, if applicable.

Current flows are stored in flow caches organized in buckets. For every incoming packet, a hash of the flow key is calculated and then checked against the existing flow keys in the appropriate bucket. If the hash does not exist, a new flow record is created in the given bucket, with parameters such as: source and destination IP addresses, source and destination port numbers, first packet timestamp, and transport layer protocol. If the hash already exists, the existing flow is updated. The packet count value is incremented, TCP flags are updated (if applicable), and a new timestamp and the packet size are added to the list.
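The create-or-update step described above might look roughly as follows in software. This is only a sketch: the class and field names are hypothetical, the built-in hash() stands in for the probe's hash function, and accumulating TCP flags with a bitwise OR is an assumption.

```python
from dataclasses import dataclass, field

@dataclass
class FlowRecord:
    key: tuple                      # (src_ip, dst_ip, protocol, src_port, dst_port)
    first_ts: float                 # timestamp of the first packet
    packet_count: int = 0
    tcp_flags: int = 0              # OR of all flags seen so far (assumption)
    timestamps: list = field(default_factory=list)
    sizes: list = field(default_factory=list)

class FlowCache:
    def __init__(self, n_buckets, hash_fn):
        self.n_buckets = n_buckets
        self.hash_fn = hash_fn
        self.buckets = [[] for _ in range(n_buckets)]

    def update(self, key, ts, size, tcp_flags=0):
        """Find the flow in its bucket, or create it, then record the packet."""
        bucket = self.buckets[self.hash_fn(key) % self.n_buckets]
        for rec in bucket:
            if rec.key == key:
                break
        else:
            rec = FlowRecord(key=key, first_ts=ts)
            bucket.append(rec)
        rec.packet_count += 1
        rec.tcp_flags |= tcp_flags
        rec.timestamps.append(ts)
        rec.sizes.append(size)
        return rec

cache = FlowCache(n_buckets=16, hash_fn=hash)   # hash() is only a stand-in
k = ("10.0.0.1", "10.0.0.2", 6, 1234, 80)
cache.update(k, ts=0.0, size=60, tcp_flags=0x02)           # SYN creates the record
rec = cache.update(k, ts=0.1, size=1500, tcp_flags=0x10)   # ACK updates it
print(rec.packet_count, hex(rec.tcp_flags), rec.sizes)
```

The search inside a bucket compares full flow keys, which is why long buckets directly translate into slower packet processing.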

In the case of the TCP protocol, the appearance of a FIN or RST flag means the end of the flow. Then, some of the flow’s parameters are updated. Furthermore, the flow is moved from the active flows map to the expired flows list. Post-processing of the parameters consists of converting source and destination IP addresses to ASCII format; marking last timestamp; and calculating the flow’s duration and total byte count, and its statistical parameters.

UDP flows are checked periodically by an application thread that iterates through the active flows cache. The arrival time of the last packet in a flow is compared with the arrival time of the last packet on the network adapter, and if the difference exceeds a predefined value (set in our case to 10 s), the flow is moved from the current flows cache to the expired flows list.
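The inactivity check can be sketched as below. The function name expire_idle_flows and the simplified data layout (a dictionary from flow key to the timestamp of its last packet) are assumptions made for illustration; only the 10 s timeout is taken from the text.

```python
INACTIVITY_TIMEOUT_S = 10.0  # value stated in the text

def expire_idle_flows(active, expired, now, timeout=INACTIVITY_TIMEOUT_S):
    """Move flows whose last packet is older than `timeout` to `expired`.
    `active` maps flow key -> timestamp of the flow's last packet."""
    idle_keys = [k for k, last_seen in active.items() if now - last_seen > timeout]
    for key in idle_keys:
        expired.append((key, active.pop(key)))
    return expired

active = {"flow_a": 100.0, "flow_b": 108.5}
expired = expire_idle_flows(active, [], now=111.0)
print(expired, active)
```

Here `now` plays the role of the arrival time of the most recent packet on the network adapter.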

2.2. Hash Functions

Hashing is an extremely useful technique widely used to construct fast lookup methods to be able to quickly assign received packets to their corresponding flows. The hash functions used for mapping flow keys to hash values need to be chosen carefully to ensure optimal utilization of the hash table. Intuitively, a hash function is a function that maps every item to a hash value in a fashion that is somehow random. The most obvious model for a hash function is that it is fully random. Unfortunately, it is almost always impractical to construct fully random hash functions, as the space required to store such a function is essentially the same as that required to encode an arbitrary function as a lookup table [23]. Thus, the hashing applied is usually a compromise between the randomness properties that are desired in a hash function and the computational resources needed to store and evaluate such a function.

Hash functions utilized in network monitoring devices should have the following features:

good performance: hash calculation cannot become a bottleneck in the network monitor;
uniform distribution: when this condition is fulfilled, buckets of the hash table which stores data describing monitored flows are randomly selected for traffic that is not manipulated, and none of them is likely to contain a long list of flows (or to overflow);
collision resistance: when the hash function has this feature, it is extremely hard for an attacker to forge two packets with different flow characteristics that will end in the same hash table bucket, a situation that might eventually lead to bucket overflow.

Report [19] discusses hash algorithms used in two popular monitoring tools: nProbe [24] and Vermont [25]. Its authors identified some flaws in both algorithms and proposed a modified version of Vermont. They also suggest that cryptographic hash functions might be best for such an application, if their implementations meet performance demands.

The network probe implements all three algorithms from [19] in hardware. In addition, two cryptographic hash functions were implemented—the cryptographically broken but still widely used SHA-1 and the state-of-the-art SHA-3. All of the algorithms are described in following subsections.

For the proposed network probe, a hash width of 32 bits was considered. If the result of a given algorithm was wider, this was reduced accordingly to 32 bits. The network probe considers source IP address, destination IP address, protocol, and protocol (TCP/UDP) source/destination port numbers as flow keys.

2.2.1. Sum Modulo 32—nProbe

The nProbe [24] monitoring tool utilizes simple sum modulo as its hash algorithm. For the proposed network probe, the calculation is presented as Equation (1):

h = (\mathit{srcIP} + \mathit{dstIP} + \mathit{protocol} + \mathit{srcPort} + \mathit{dstPort}) \bmod 32

(1)

This algorithm is very simple; however, as the authors of [19] point out, after testing it with a captured network packet trace, it does not have a perfectly uniform distribution—a number of buckets contain considerably more entries than others. Another drawback is relative ease of generating collisions, because an attacker can freely manipulate the values of the flow keys provided that their sum is constant.
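The ease of generating collisions can be shown with a short sketch. It assumes that the sum wraps around at the 32-bit hash width (i.e., is taken modulo 2^32), which is one possible reading of Equation (1); the function name nprobe_hash is hypothetical.

```python
MASK32 = 0xFFFFFFFF

def nprobe_hash(src_ip, dst_ip, protocol, src_port, dst_port):
    """Sum of the flow keys truncated to 32 bits (assumed reading of Equation (1))."""
    return (src_ip + dst_ip + protocol + src_port + dst_port) & MASK32

a = nprobe_hash(0xC0A8010A, 0x0A000001, 6, 49152, 443)
# Move one unit from the source port to the destination port: the sum is unchanged.
b = nprobe_hash(0xC0A8010A, 0x0A000001, 6, 49151, 444)
# Swapping source and destination addresses also keeps the sum.
c = nprobe_hash(0x0A000001, 0xC0A8010A, 6, 49152, 443)
print(hex(a), a == b == c)
```

Three different flows end up with the same hash value, without any search effort on the attacker's side.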

2.2.2. Nested CRC-32—Vermont

Cyclic redundancy checks or cyclic redundancy codes (CRC) have been utilized for error detection in computing for a long time. A digest is calculated from transmitted data and is appended to the frame. The same algorithm is applied to data upon frame reception, and when the result is the same as the code calculated by the transmitter, it means that the received packet is correct.

The actual algorithm can be described mathematically as polynomial division of binary data being interpreted as polynomial over GF(2) (every bit is a polynomial coefficient—zero or one) by generator polynomial G(x). The remainder of that division is treated as a check sequence, which is appended to the transmitted frame [26].

The CRC-32 implementation used in the proposed network probe is based on IEEE 802.3 [27] polynomial. Implementation parameters, according to [26], are presented in Table 1.
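The polynomial division can be written as a bit-serial loop. The sketch below assumes the parameters conventionally used with the IEEE 802.3 polynomial (reflected input and output, initial value 0xFFFFFFFF, final XOR 0xFFFFFFFF); whether these match Table 1 exactly is an assumption. The function name crc32_bitwise is hypothetical, and Python's zlib.crc32 serves as a cross-check.

```python
import zlib

POLY_REFLECTED = 0xEDB88320  # bit-reversed form of the generator 0x04C11DB7

def crc32_bitwise(data, crc=0):
    """Bit-serial CRC-32; `crc` is a running value, as in zlib.crc32."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; XOR in the generator when that bit is 1
            # (subtraction over GF(2) is XOR).
            crc = (crc >> 1) ^ (POLY_REFLECTED if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

print(hex(crc32_bitwise(b"123456789")))                         # 0xcbf43926
print(crc32_bitwise(b"123456789") == zlib.crc32(b"123456789"))  # True
```

A hardware implementation would typically unroll the inner loop into XOR logic that consumes several bits per clock cycle.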

Vermont [25] is built on nested CRC-32 invocations. The algorithm starts with a given initial seed, and Figure 2 presents how CRC-32 is invoked five times to include flow keys in the hash calculation. The result of the preceding CRC-32 function is utilized as seed for the next one.

The authors of [19] found that Vermont is computationally efficient and offers roughly uniform distribution; however, they also proved that an attacker is still able to create hash collisions on purpose.
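A software sketch of the nested scheme is given below. It uses zlib.crc32 with its running-value argument as the seed input; the field order, byte order, and initial seed are assumptions, since they are defined by Figure 2 and by the Vermont sources rather than by this sketch. The function name vermont_hash is hypothetical.

```python
import struct
import zlib

def vermont_hash(src_ip, dst_ip, protocol, src_port, dst_port, seed=0):
    """Five chained CRC-32 calls; each result seeds the next call.
    Field order, byte order, and seed value are assumptions."""
    h = seed
    for fmt, value in (("!I", src_ip), ("!I", dst_ip), ("!B", protocol),
                       ("!H", src_port), ("!H", dst_port)):
        h = zlib.crc32(struct.pack(fmt, value), h)
    return h

print(hex(vermont_hash(0xC0A8010A, 0x0A000001, 6, 49152, 443)))
```

Because every stage is linear over GF(2), an attacker who knows the seed can still compute colliding flow keys, which is the weakness pointed out in [19].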

2.2.3. Nested CRC-32 with w Constants—Modified Vermont

Report [19] showed that the CRC-based Vermont algorithm does not protect network monitoring devices from targeted collision attacks. The goal of its authors was to design a function that does not have this flaw, but that offers the same statistical qualities. The result of their research is a modified Vermont algorithm, presented in Figure 3.

To ensure that an attacker cannot create collisions in a simple way, a unique secret random value (w(i), initialized during network monitor activation) is added to every flow key before CRC-32 calculation. This significantly increases the cost of a targeted attack, but does not prevent it, since the CRC-32 scheme is still used.
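The idea can be sketched as follows. The sketch assumes that "added" means addition modulo the width of each flow key and that the constants are drawn once at start-up; both points, as well as the names make_secrets and modified_vermont_hash, are assumptions for illustration and may differ from the scheme in Figure 3.

```python
import secrets
import struct
import zlib

# (struct format, width in bits) for src IP, dst IP, protocol, src port, dst port
FIELDS = (("!I", 32), ("!I", 32), ("!B", 8), ("!H", 16), ("!H", 16))

def make_secrets():
    """Draw one secret constant w(i) per flow key at start-up."""
    return [secrets.randbits(bits) for _, bits in FIELDS]

def modified_vermont_hash(flow_key, w, seed=0):
    """Add w[i] to the i-th flow key (modulo the key width), then chain CRC-32."""
    h = seed
    for (fmt, bits), value, w_i in zip(FIELDS, flow_key, w):
        masked = (value + w_i) % (1 << bits)
        h = zlib.crc32(struct.pack(fmt, masked), h)
    return h

w = make_secrets()
key = (0xC0A8010A, 0x0A000001, 6, 49152, 443)
print(hex(modified_vermont_hash(key, w)))
```

Without knowledge of the w constants, an attacker cannot predict which CRC-32 inputs a given flow produces, although the underlying function is unchanged.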

2.2.4. SHA-1

SHA-1 is a cryptographic hash function created in 1995, described in [28,29]. In its life cycle, it is currently marked as deprecated, because it is prone to a variety of attacks. In 2015, a group of researchers was able to find a freestart collision, where the SHA-1 initialization vector was chosen by themselves [30], but soon the full SHA-1 algorithm was also cracked [31,32,33].

An organized crime syndicate in possession of tens of thousands of dollars can create an SHA-1 collision in about two months, and for instance, forge an SSL certificate. That is the reason famous brands such as Microsoft, Google, and Mozilla abandoned the SHA-1 algorithm; however, it still may be useful in real-time applications such as network monitoring.

The SHA-1 function produces a 160-bit hash. It is capable of hashing messages as long as 2^{64} - 1 bits, which are divided into 512-bit blocks processed one by one.

The first step of the algorithm is padding, because the length of the message must be a multiple of 512 bits. During this process, the information about message length is encoded in 64 bits (hence the message length limit). This number is concatenated with exactly one “1” bit and an appropriate number of “0” bits, so when the padding bit string is appended to the message, the total length is a multiple of 512 bits. The temporary value of the hash is stored in five 32-bit variables H, initialized as in Listing 1.

Listing 1. Initial values of H variables in SHA-1 algorithm.

H_0(0) = 0x67452301

H_1(0) = 0xEFCDAB89

H_2(0) = 0x98BADCFE

H_3(0) = 0x10325476

H_4(0) = 0xC3D2E1F0

Every block of the message is processed through 80 rounds according to the scheme in Figure 4.

Variables A to E are assigned values of corresponding H registers from the previous block or H(0) for the first block. The W array is generated: the first 16 words are 32-bit chunks of the processed block and subsequent words (16 ≤ i ≤ 79) are calculated with Equation (2), where ROTL^{1} denotes a left rotation of a 32-bit word by one bit.

W(i) = \mathrm{ROTL}^{1}(W(i - 3) \oplus W(i - 8) \oplus W(i - 14) \oplus W(i - 16))

(2)

Function F and the value of variable K depend on the current round number as in Equations (3) and (4).

F(i) = \begin{cases} (B \mathbin{\&} C) \mid ((\sim B) \mathbin{\&} D) & \text{for } 0 \le i \le 19 \\ B \oplus C \oplus D & \text{for } 20 \le i \le 39 \\ (B \mathbin{\&} C) \mid (B \mathbin{\&} D) \mid (C \mathbin{\&} D) & \text{for } 40 \le i \le 59 \\ B \oplus C \oplus D & \text{for } 60 \le i \le 79 \end{cases}

(3)

K(i) = \begin{cases} \mathtt{0x5A827999} & \text{for } 0 \le i \le 19 \\ \mathtt{0x6ED9EBA1} & \text{for } 20 \le i \le 39 \\ \mathtt{0x8F1BBCDC} & \text{for } 40 \le i \le 59 \\ \mathtt{0xCA62C1D6} & \text{for } 60 \le i \le 79 \end{cases}

(4)

After 80 rounds for the given block, the H registers are updated as in Listing 2. When all blocks of the message are processed, the hash can be read as a concatenation of H variables.

Listing 2. Update of H variables when block was processed in the SHA-1 algorithm.

H_0(i) = H_0(i − 1) + A

H_1(i) = H_1(i − 1) + B

H_2(i) = H_2(i − 1) + C

H_3(i) = H_3(i − 1) + D

H_4(i) = H_4(i − 1) + E
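The steps above (padding, message schedule, 80 rounds, and the final update of the H variables) can be combined into a compact software sketch for messages that fit into a single block. The function name sha1_single_block is hypothetical; the round structure and the one-bit left rotation in the message schedule follow FIPS 180-4, and hashlib is used only to cross-check the result.

```python
import hashlib
import struct

def rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def sha1_single_block(message):
    """SHA-1 for messages that fit one 512-bit block (at most 55 bytes)."""
    assert len(message) <= 55
    # Padding: one "1" bit, zeros, then the 64-bit message length in bits.
    block = message + b"\x80" + b"\x00" * (55 - len(message))
    block += struct.pack(">Q", 8 * len(message))
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    # Message schedule: 16 words from the block, 64 derived words.
    w = list(struct.unpack(">16I", block))
    for i in range(16, 80):
        w.append(rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    a, b, c, d, e = h
    for i in range(80):
        if i < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif i < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif i < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        temp = (rotl(a, 5) + f + e + k + w[i]) & 0xFFFFFFFF
        a, b, c, d, e = temp, a, rotl(b, 30), c, d
    # Update of the H variables (additions are modulo 2^32).
    h = [(x + y) & 0xFFFFFFFF for x, y in zip(h, (a, b, c, d, e))]
    return struct.pack(">5I", *h)

print(sha1_single_block(b"abc") == hashlib.sha1(b"abc").digest())  # True
```

A 104-bit flow key is well below the 55-byte limit, so the single-block case is the only one a flow hash needs.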

In the proposed network probe, SHA-1 is applied to a 104-bit string that consists of 32-bit IP source and destination addresses, 8-bit IP protocol information, and 16-bit source and destination ports of TCP/UDP. The 160-bit hash is reduced to a single 32-bit word by XORing (⊕) all H registers together.
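The folding step can be sketched with Python's hashlib. The function name sha1_flow_hash is hypothetical, and the input is assumed to be the 13-byte (104-bit) flow key in whatever field order the probe uses.

```python
import hashlib
import struct

def sha1_flow_hash(flow_key_bytes):
    """XOR the five 32-bit words of the SHA-1 digest into one 32-bit value."""
    h0, h1, h2, h3, h4 = struct.unpack(">5I", hashlib.sha1(flow_key_bytes).digest())
    return h0 ^ h1 ^ h2 ^ h3 ^ h4

example_key = bytes.fromhex("c0a8010a0a00000106c00001bb")  # arbitrary 13-byte key
print(hex(sha1_flow_hash(example_key)))
```

XOR folding keeps every digest bit involved in the result, in contrast to simple truncation.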

2.2.5. SHA-3

SHA-3 [34] is the newest hash standard issued by NIST. Unlike previous SHA algorithms, it is based on sponge construction [35] instead of the Merkle–Damgård structure [36]. SHA-3 is in fact a slightly modified Keccak algorithm [37], the winner of the NIST contest. SHA-3, like SHA-2, is capable of four hash length generations: 224, 256, 384, and 512 bits, depending on the underlying sponge construction configuration.

Keccak has an internal state which is a b-bit string S; this can also be presented as a three-dimensional array (named A, Figure 5) with mapping as in Equation (5). For SHA-3, b = 1600 and two more helper variables are derived from this value: w = b/25 = 64 and l = \log_{2}(w) = 6.

A[x, y, z] = S[w(5y + x) + z]

(5)
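The mapping of Equation (5) can be checked with a few lines of code. The helper names state_index and lane are hypothetical; the state is modeled as a flat list of bits.

```python
B = 1600            # state width in bits
W = B // 25         # lane width: 64
L = W.bit_length() - 1   # log2(w) = 6

def state_index(x, y, z, w=W):
    """Index into the flat bit string S for state array element A[x, y, z]."""
    return w * (5 * y + x) + z

def lane(S, x, y, w=W):
    """Extract the w bits of lane (x, y) from the flat state S (a list of bits)."""
    start = state_index(x, y, 0, w)
    return S[start:start + w]

S = [0] * B
S[state_index(2, 3, 5)] = 1           # set bit z = 5 of lane (2, 3)
print(state_index(2, 3, 0), lane(S, 2, 3).index(1))
```

Each lane occupies 64 consecutive bits of S, which is why lanes map naturally onto 64-bit words in software and onto 64-bit registers in hardware.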

In Figure 5:

the color green marks an example column of the state array (x = 1, z = 0),
the color red marks an example row of the state array (y = 0, z = 0),
the color blue marks an example lane of the state array (x = 2, y = 3),
and the color yellow marks an example slice of the state array (z = 3).

An SHA-3 round consists of five step mappings denoted θ, ρ, π, χ, and ι (Equation (6)). Each of those mappings takes state array A as an input and returns an updated state array A′. The ι mapping also takes the round index i_{r} as an argument.

\mathrm{Rnd}(A, i_{r}) = \iota(\chi(\pi(\rho(\theta(A)))), i_{r})

(6)

A detailed explanation of every step mapping can be found in [34], and the descriptions below will give a brief idea of how each of these works.

The effect of θ is to XOR (⊕) each bit in the state with the parities of two columns in the array. The ρ operation rotates the bits of each lane, i.e., it shifts the z coordinate of every bit by an offset (modulo the lane size) that depends on the fixed x and y coordinates of the lane. The π operation rearranges the positions of lanes in every state array slice. In the χ operation, each bit of the state array is XORed (⊕) with a non-linear function of two other bits in its row. The effect of the ι operation is to modify some of the bits in Lane(0,0) (the exact center of the state array slice) in a way that depends on the round index i_{r}. Lane(0,0) is XORed (⊕) with a w-bit string, where most of the bits are “0”, but a selected few are the result of the rc(x) transformation dependent on the round index i_{r}.

Before the message is fed into the sponge construction, a two-bit suffix “01” is appended to its end. It supports domain separation and allows us to distinguish the SHA-3 hash function from other algorithms. Now the message must be padded so its length is a multiple of rate (r) parameter, which essentially is the SHA-3 block width. SHA-3 utilizes a pad10*1 padding scheme, which generates a bit string starting and ending with “1” and filled with an appropriate number of 0s (hence the asterisk, which in regular expression notation indicates zero or more).
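The padding rule can be sketched at bit level. The function names are hypothetical; the number of zero bits follows the pad10*1 definition in FIPS 202, and the rate of 576 bits used in the example corresponds to the 512-bit SHA-3 variant.

```python
def pad10star1(rate, msg_len):
    """Return the padding bits 1, 0...0, 1 that extend msg_len to a multiple of rate."""
    zeros = (-msg_len - 2) % rate
    return [1] + [0] * zeros + [1]

def sha3_padded_length(key_bits, rate):
    """Message length after appending the "01" suffix and the pad10*1 padding."""
    with_suffix = key_bits + 2
    return with_suffix + len(pad10star1(rate, with_suffix))

# A 104-bit flow key with rate r = 576 fits into exactly one block.
print(sha3_padded_length(104, 576))  # 576
```

The padding always adds at least two bits, so a message that is one bit short of a block boundary spills into a second block.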

Figure 6 presents the SHA-3 sponge construction’s principle of operation. At the beginning, the SHA-3 state is initialized with a 1600-bit (b = 1600) string of zeros. In the phase called absorption, the padded message is divided into series of r-bit blocks and XORed (⊕) into a state vector. Then f transformation, which consists of 24 SHA-3 rounds, is applied to the state. This process is repeated until the whole message is absorbed. In the second stage, the actual hash is squeezed from the sponge. For all SHA-3 hash lengths, the hash can be obtained without applying the f transformation again—an appropriate number of bits is taken directly from the state vector as r is always greater than the hash length (Table 2). Variable c is the capacity of the sponge, and for SHA-3 it is double the hash length (c = 2d). As variables r and c satisfy relation r + c = b, the selection of capacity determines the block width of the SHA-3 algorithm.
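The relation between hash length, capacity, and rate can be tabulated directly from c = 2d and r + c = b. The helper name sponge_params is hypothetical.

```python
B = 1600  # state width in bits

def sponge_params(d, b=B):
    """Return (capacity, rate) in bits for a SHA-3 digest length d."""
    c = 2 * d
    return c, b - c

for d in (224, 256, 384, 512):
    c, r = sponge_params(d)
    print(f"d={d:3d}  c={c:4d}  r={r:4d}")
```

In every case the rate exceeds the digest length, so a single squeezing step is enough to read the hash.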

In the network probe, SHA-3 is applied to a 104-bit string that consists of 32-bit IP source and destination addresses, 8-bit IP protocol information, and 16-bit source and destination ports for TCP/UDP. The SHA-3 digest is trimmed to the 32 most significant bits, which are considered the flow hash.
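A software counterpart of this step might look as follows. The sketch assumes the 512-bit SHA-3 variant and interprets the "32 most significant bits" as the first four bytes of the digest read as a big-endian integer; both choices, and the function name sha3_flow_hash, are assumptions for illustration.

```python
import hashlib

def sha3_flow_hash(flow_key_bytes):
    """First 32 bits of the SHA3-512 digest, read as a big-endian integer."""
    digest = hashlib.sha3_512(flow_key_bytes).digest()
    return int.from_bytes(digest[:4], "big")

example_key = bytes.fromhex("c0a8010a0a00000106c00001bb")  # arbitrary 13-byte key
print(hex(sha3_flow_hash(example_key)))
```

A value computed this way can serve as a reference when checking a hardware implementation, provided the bit ordering conventions of both sides are aligned.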

2.3. Implementation and Verification

2.3.1. Implementation

The proposed network probe hardware accelerator was implemented with the hardware description language Verilog [38]. The accelerator’s top module is depicted in Figure 7. It has a 128-bit data path with two AXI4-Stream interfaces [39], Slave and Master, used for data flow. Packets are processed sequentially, and their order is not changed. Block netprobe_top consists of two submodules that implement the two main functions of the accelerator:

netprobe_parser_top, where IP packet parsing and extraction of flow keys along with some other parameters (e.g., payload length, TCP flags) is performed,
netprobe_hash_top, where calculation of the 32-bit hash over flow keys extracted from the IP packet header is carried out.

Figure 8 presents the structure of the packet parser module. The first block in the data path is a protocol filter, responsible for dropping IP packets that contain a protocol other than TCP or UDP. Packets that pass this protocol check are distributed in a round-robin manner between two parallel parser engines which extract flow keys and other information from the packet header.

These modules were parallelized to avoid empty cycles on the Master interface caused by an unfavorable header structure of the processed IP packets, e.g., an IP header length (IHL), and hence a TCP header offset, that places the TCP port and TCP flag fields in different packet beats for the 128-bit data path width. Parser engines process the IP packet header, extract flow keys and the rest of the features, and forward data in an internal format (two beats in a 128-bit data path). A placeholder for the hash is included, although it is calculated later in the pipeline.

Module netprobe_hash_top is a block that wraps hash engines. It is parameterized with a HASH_ALGORITHM variable, which selects an appropriate algorithm submodule to be instantiated (Table 3).

The module netprobe_hash_top also has a set of strap ports used for modified Vermont and SHA-3 algorithm configuration as in Table 4. In the network probe hardware accelerator, w constant straps were tied off to random integers and a 512-bit hash was selected for the SHA-3 algorithm.

The nProbe hash algorithm (for HASH_ALGORITHM = 1) was implemented as a simple 32-bit adder, whose inputs are flow keys extracted from the internal packet format and left-padded with zeros to 32-bit width if necessary.

The Vermont hash algorithm (for HASH_ALGORITHM = 2) was implemented as a 5-stage pipeline, similarly to the diagram in Figure 2. Internal packet data are registered in parallel to CRC-32 logic, and at every stage an appropriate flow key is selected to be included in the hash.

The modified Vermont hash algorithm (for HASH_ALGORITHM = 3) was realized in a similar manner to regular Vermont. Flow keys are obfuscated with w constants before being used in CRC-32 calculations, as in Figure 3.

In the case of SHA-1 (for HASH_ALGORITHM = 4), concatenation of all flow keys forms a 104-bit word, which is considered input to the hash function. The length of the input word is less than 512 bits, which means that SHA-1 transformation (80 rounds) must be applied only to a single block. This makes pipelined algorithm implementation possible, as backpressure towards subsequent packets is not necessary.

Figure 9 presents an example of such a pipeline. Data with extracted flow keys are constantly fed to the input, and multiple packets are processed simultaneously. Since the internal packet format requires two cycles to be transmitted in a 128-bit data path, where only the first cycle carries valid flow keys, a valid hash is obtained at the final stage of the pipeline only for the first beat of this packet.

In regular SHA-1 implementation, the hash pipeline would have 80 stages—one per SHA-1 round. It is possible to reduce the number of stages by unfolding the algorithm loop and implementing two rounds between stage registers. This approach, however, leads to critical path extension of circuits and as a result decreases maximum clock frequency. The solution to this problem was proposed in [40], where the authors described a method with the SHA-1 algorithm loop unfolding using additional variables. This technique allows us to perform two algorithm rounds within one clock cycle and reduces the required number of stages by half. It was incorporated in the network probe hardware accelerator SHA-1 implementation; therefore, its pipeline had 40 stages.

For SHA-3 (for HASH_ALGORITHM = 5), as previously, concatenation of all flow keys creates a 104-bit input word. Again, this is less than the SHA-3 block length, so the approach illustrated in Figure 9 can be applied once more. The SHA-3 pipeline in the proposed network probe hardware accelerator has 24 stages, one per SHA-3 round.

In all cases, a 32-bit flow hash is inserted into the initial placeholder of the output accelerator packet.

2.3.2. Functional Verification

Functional verification of the proposed network probe hardware accelerator was conducted using cocotb—an open source, Python-based testbench environment for VHDL/Verilog RTL [41]. It adopts the same concepts of constrained random verification as industry-standard UVM [42]; however, it is implemented in Python rather than SystemVerilog. This enables swift and productive construction of the verification environment, as Python scripting is simple, and additionally, a huge library of existing code is available (e.g., packet generation libraries and cryptographic algorithm implementations).

Figure 10 presents the structure of the cocotb-based verification environment. The DUT (Design Under Test, here netprobe_top) was instantiated as the top level in the simulator and was surrounded by verification environment components such as drivers, monitors, and a scoreboard, which were extended from the infrastructure provided by cocotb. Ports of the tested module were stimulated directly from the Python function acting as a test case.

At the beginning, a number of transaction objects that mimic IP packets were created and randomized. The goal was to cover a broad space of possible network traffic, so multiple packet parameters were changed: packet length, addresses, encapsulated protocol, etc. These objects were passed to an AXI4-Stream driver, which transmitted them onto the Slave interface of the netprobe_top module. Both Slave and Master interfaces were watched by AXI4-Stream monitors, which were able to transform waveforms into transaction objects. Initial packets and those processed by DUT were fed to the scoreboard component. The DUT behavior model was applied to the stimulus packets there, and the result was compared with transactions processed by the netprobe_top module itself. They must be the same, and when this condition is not fulfilled, an error is reported.
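The scoreboard flow described above can be sketched in plain Python. This is a simplified stand-in, not the cocotb testbench used for the accelerator: it uses no cocotb classes, the `Packet` fields and the CRC-based `reference_model` are hypothetical, and the DUT is replaced by ordinary functions so that the example runs without a simulator.

```python
import random
import struct
import zlib
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical transaction object mimicking an IP packet."""
    src_ip: int
    dst_ip: int
    protocol: int
    src_port: int
    dst_port: int
    length: int


def random_packet(rng):
    """Randomize addresses, protocol, ports, and packet length."""
    return Packet(rng.getrandbits(32), rng.getrandbits(32),
                  rng.choice([1, 6, 17]), rng.getrandbits(16),
                  rng.getrandbits(16), rng.randint(64, 1500))


def reference_model(pkt):
    """Stand-in behaviour model: a 32-bit hash of the flow keys (CRC-32 here)."""
    key = struct.pack(">IIBHH", pkt.src_ip, pkt.dst_ip, pkt.protocol,
                      pkt.src_port, pkt.dst_port)
    return zlib.crc32(key)


class Scoreboard:
    """Compares results observed at the output with the model's predictions."""

    def __init__(self, model):
        self.model = model
        self.expected = deque()
        self.errors = 0

    def add_stimulus(self, pkt):
        self.expected.append(self.model(pkt))

    def add_result(self, observed):
        if observed != self.expected.popleft():
            self.errors += 1  # a real testbench would report the mismatch


def run(dut, n_packets=100, seed=1):
    rng = random.Random(seed)
    board = Scoreboard(reference_model)
    for _ in range(n_packets):
        pkt = random_packet(rng)
        board.add_stimulus(pkt)      # what the input monitor would capture
        board.add_result(dut(pkt))   # what the output monitor would capture
    return board.errors


good_errors = run(reference_model)
bad_errors = run(lambda pkt: reference_model(pkt) ^ 1)  # deliberately faulty DUT
```

A faulty design is flagged on every packet, whereas a design that matches the model produces no errors.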

Figure 11 is a screen capture from a simulation of the netprobe_top module configured with the SHA-3 algorithm. The selected SHA-3 hash length was 512 bits (strap_hash_length equals 2’d3). The goal of the executed test case was to check the performance of the design. Signal axis_m_tready of the accelerator’s Master interface was tied high, which indicates no backpressure. The DUT was flooded with a number of short IP packets; signal axis_s_tvalid went high at Cursor 1. After 32 clock cycles (the latency of the SHA-3 configuration), the first result packets were presented on the Master interface (Cursor 2, where axis_m_tvalid goes high). Checks implemented in the testbench verified whether the axis_s_tready signal ever went low. Module netprobe_top does not introduce backpressure on its own, and even in these harsh conditions, the DUT behaved as expected.

2.3.3. Synthesis Results

Synthesis of the network probe hardware accelerator was performed for Intel Stratix V GX FPGA (5SGXEA7N2F45C2), an element of the Terasic DE5-Net development kit [43], using Intel Quartus Prime 18.1 software.

Table 5 summarizes the synthesis results of the netprobe_top module for a range of hash algorithms. Since nProbe, Vermont, and modified Vermont are based on simple hashing schemes that use basic types of calculations (addition modulo 32 or CRC), hardware implementation of these algorithms requires few hardware resources (less than 1% of the resources available in the FPGA used in the experiment). Although the SHA-1 and SHA-3 cryptographic functions are far more computationally expensive, the proposed implementation requires few enough resources to be used efficiently as a part of the hardware NetFlow probe. Even though SHA-1 and SHA-3 were optimized for performance rather than area, the probe with the most complex algorithm, SHA-3, utilized only 16.44% of resources, leaving enough of them to implement other functionalities of the NetFlow probe [17]. It is no surprise that implementations of straightforward hash algorithms (such as nProbe, Vermont, or modified Vermont) can sustain multigigabit throughput, but the implementations of the cryptographic functions (SHA-1, SHA-3) match this as well. All investigated hash algorithms offer throughput over 20 Gbit/s.
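The throughput and latency columns of Table 5 are consistent with a simple relation to the maximum clock frequency. The sketch below assumes that one 128-bit word of the data path is accepted per clock cycle, so that throughput equals F_max times 128 bits, and that latency in nanoseconds is the cycle count divided by F_max. The relation is inferred from the table values rather than stated in the text.

```python
DATA_PATH_BITS = 128  # width of the accelerator data path

# name: (F_max in MHz, latency in clock cycles), copied from Table 5
TABLE5 = {
    "nProbe": (206.48, 9),
    "Vermont": (207.51, 13),
    "modified Vermont": (197.63, 13),
    "SHA-1": (201.61, 48),
    "SHA-3": (175.04, 32),
}


def throughput_gbit_s(fmax_mhz, width_bits=DATA_PATH_BITS):
    """Throughput if one data-path word is processed every clock cycle."""
    return fmax_mhz * 1e6 * width_bits / 1e9


def latency_ns(cycles, fmax_mhz):
    """Pipeline latency in nanoseconds at the given clock frequency."""
    return cycles / fmax_mhz * 1e3


derived = {
    name: (throughput_gbit_s(fmax), latency_ns(cycles, fmax))
    for name, (fmax, cycles) in TABLE5.items()
}

for name, (gbps, ns) in derived.items():
    print(f"{name:17s} {gbps:6.2f} Gbit/s  {ns:7.2f} ns")
```

Under this assumption the derived values agree with all five rows of Table 5 to within rounding, which also explains why SHA-3, with the lowest F_max, has the lowest throughput despite having fewer stages than SHA-1.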

It has been assumed that cryptographic hash functions such as SHA are computationally too expensive for efficient use in a flow monitor. The high bandwidth and low latency of the hardware accelerator based on the SHA-1 and SHA-3 functions enable the construction of a network probe working in real time, even when it is flooded with the smallest IP packets.

It is worth mentioning that the low percentage of logic utilization allows for further design optimization and parallelization [44]. Using such techniques, it should even be possible to reach a 100 Gbit/s bandwidth limit.

3. Experiments

In our experiments, we wanted to verify the following research hypotheses:

It is possible to realize a network traffic probe with a cryptographic hash function, working in a real-time regime.
Cryptographic hash functions SHA-1 and SHA-3 provide comparable distribution of flows to the reference methods.

In the experiments, the NetFlow probe was supplied with selected traffic, and the distribution of flow records in the flow cache buckets was analyzed. We conducted tests for five hardware-accelerated probes implementing different hash functions. Each probe was supplied with three different types of network traffic to analyze the impact of traffic type on flow record distribution over buckets in the flow cache.

3.1. Experimental Testbed

Verification and performance tests of the NetFlow probe hardware-accelerator designs were carried out using a dedicated testbed. The hardware part of the probe was implemented in the DE5-Net FPGA development platform. A general-purpose PC containing 10 Gbps Ethernet interfaces (Intel 82599 10 Gigabit Ethernet card) was connected to the DE5-Net kit. The Ethernet connectivity between the DE5-Net FPGA platform and the PC was established by means of multi-mode fiber optics, with SFP+ transceivers. The PC was used as a traffic generator running the tcpreplay tool and as a network monitor implementing the software part of the NetFlow probe.

3.2. Network Traffic Used

In our experiments, we used the CICIDS 2017 dataset [45]. It contains the traffic captured during five days of activity in a simulated network. Both pcap and bidirectional flow formats have been published. These datasets cover various kinds of attack, such as botnets, (D)DoS, web application attacks, and SSH brute-force attempts. In total, 2,830,540 flows were collected over five days (from Monday to Friday).

In our experiments, we used the Monday, Wednesday, and Friday traffic. The traffic collected on Monday contained 496,943 flows, purely with benign network communication. The Wednesday traffic embraced 452,601 flows, which, apart from normal traffic, contained traffic captured during DoS, Heartbleed, slowloris, Slowhttptest, Hulk, and GoldenEye attacks. The Friday subset was the most numerous—it contained 792,487 flows with normal traffic and traffic with registered DDoS attacks, botnet communication, and various port scan attacks.

3.3. Metrics

The hardware-accelerated network flow probe was modified so that the flow cache stores records for all flows during a test session, i.e., flow records for terminated or expired flows were not removed from the flow cache buckets. This allowed measurement of such values as:

minimal number of flow records in a nonempty bucket (hereinafter named as Min),
maximal number of flow records in a bucket (Max),
mean number of flow records in a bucket (Mean),
standard deviation (SD) of flow records in a bucket,
number of nonempty buckets (Buckets).

This gave us an overview of the distribution of flow records over all buckets in the flow cache for a given hash computation scheme and for the selected traffic type.
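The metrics listed above can be computed from the sequence of hash values assigned to the recorded flows. The sketch below is a minimal software illustration, not the measurement code of the probe. The function name, the optional `n_buckets` argument (which decides whether empty buckets enter the statistics), and the use of the population standard deviation are assumptions.

```python
from collections import Counter
from statistics import mean, pstdev


def bucket_stats(flow_hashes, n_buckets=None):
    """Bucket-occupation statistics for one hash value per flow.

    If n_buckets is given, empty buckets are counted with occupation 0;
    otherwise only nonempty buckets contribute to Min, Max, Mean, and SD.
    """
    counts = Counter(flow_hashes)        # bucket index -> number of flow records
    occupation = list(counts.values())
    if n_buckets is not None:
        occupation += [0] * (n_buckets - len(counts))
    return {
        "Min": min(occupation),
        "Max": max(occupation),
        "Mean": mean(occupation),
        "SD": pstdev(occupation),
        "Buckets": len(counts),          # number of nonempty buckets
    }


# Six flows hashed into a toy cache of four buckets
hashes = [0, 0, 1, 3, 3, 3]
nonempty_only = bucket_stats(hashes)
all_buckets = bucket_stats(hashes, n_buckets=4)
```

Padding with zeros is practical for a cache of $2^{16}$ buckets; for $2^{32}$ buckets, only the nonempty buckets would normally be enumerated.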

4. Results

In our experiments, each hash function was used in 16-bit and 32-bit versions, which organized the flow cache into $2^{16}$ and $2^{32}$ buckets, respectively. Every probe was supplied with three types of traffic from the CICIDS 2017 dataset labeled Normal (Monday), Normal + attacks (Wednesday), and Normal + attacks (Friday) (see Section 3.2). For each traffic type, the number of flows it contains is given as N.

The results for hardware-accelerated probes using 16-bit hash functions are presented in Table 6. For every traffic type supplied to the probe, the Min, Max, Mean, and SD metrics proposed in Section 3.3 were recorded. As can be seen, all hash functions except Mod32 yielded similar statistics over flow cache buckets. The Mod32 function achieved noticeably worse Max and SD values than the rest of the hash functions, while the Mean value, which depends only on the number of flows and the number of buckets, was the same for all functions. We observed, however, that the statistics for all functions were not much affected by anomalous traffic (DoS attacks, botnet communication, port scan attacks); see the results for the Wednesday and Friday traffic. The Normal + attacks (Friday) traffic generated larger values of the recorded parameters for all functions than the other two traffic types. However, this can be explained by the fact that it contains many more flows than the other two traffic types used. A graphical presentation of the distribution of flow records over the flow cache buckets for a hash function based on a simple modulo operation (Mod32), modified Vermont, and the SHA-3 cryptographic function is shown in Figure 12. It can be seen that the distribution produced by the simple modulo hash function is far from uniform. The modified Vermont hash function and that based on the cryptographic SHA-3 function offer much better distributions.
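The values in Table 6 can be compared with an idealized baseline in which every flow is assigned to one of the $2^{16}$ buckets independently and uniformly at random. Under that assumption, which is a textbook model and not a result of the experiments, the occupation of a bucket follows a binomial distribution with mean N/M and standard deviation sqrt(N·p·(1 − p)), where p = 1/M. The sketch below evaluates both for the three traffic types.

```python
import math

N_BUCKETS_16 = 2 ** 16

# Number of flows per traffic type, as given in Section 3.2
FLOWS = {"Monday": 496_943, "Wednesday": 452_601, "Friday": 792_487}


def ideal_mean_sd(n_flows, n_buckets):
    """Mean and SD of bucket occupation for ideal uniform random hashing."""
    p = 1.0 / n_buckets
    return n_flows * p, math.sqrt(n_flows * p * (1.0 - p))


baseline = {day: ideal_mean_sd(n, N_BUCKETS_16) for day, n in FLOWS.items()}

for day, (mu, sd) in baseline.items():
    print(f"{day:9s} mean = {mu:5.2f}  SD = {sd:4.2f}")
```

The baseline means (about 7.58, 6.91, and 12.09) match the Mean column for every hash function, and the baseline standard deviations (about 2.75, 2.63, and 3.48) are close to the SD values reported for Vermont, modified Vermont, SHA-1, and SHA-3, whereas the Mod32 values are roughly twice as large.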

A more precise overview is given in Table 7, where the results for the 32-bit version of hash functions are presented. Such a hash size greatly increases the flow cache capacity (up to $2^{32}$ buckets). In this case, in addition to the metrics used in Table 6, the number of nonempty buckets is also given (Buckets column). Again, all hash functions, except Mod32, showed similar distribution over flow cache buckets, which was not affected by typical anomalous traffic. The Mod32 results significantly deviate from those obtained for the rest of the hash functions. It is worth noting that for Vermont, modified Vermont, and the two SHA hash functions, the mean value of flow records in a bucket was 1, and the number of nonempty buckets was almost equal to the number of all flows present in the traffic. This indicates that these functions put almost every flow record in a separate bucket, offering almost uniform distribution of flow records over flow cache buckets for normal traffic and typical anomalous traffic.
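How close "almost equal" can be expected to be follows from the same idealized model of uniform random hashing (an assumption used here for comparison, not a claim about any of the tested functions). For N flows and M buckets, the expected number of nonempty buckets is M·(1 − (1 − 1/M)^N), which for N much smaller than M is roughly N − N²/(2M). The sketch evaluates it in a numerically stable way.

```python
import math

N_BUCKETS_32 = 2 ** 32


def expected_nonempty(n_flows, n_buckets):
    """Expected number of nonempty buckets under ideal uniform random hashing.

    Equals n_buckets * (1 - (1 - 1/n_buckets) ** n_flows); log1p and expm1
    avoid the loss of precision that the direct formula suffers for large
    n_buckets.
    """
    return -n_buckets * math.expm1(n_flows * math.log1p(-1.0 / n_buckets))


monday = expected_nonempty(496_943, N_BUCKETS_32)
wednesday = expected_nonempty(452_601, N_BUCKETS_32)
friday = expected_nonempty(792_487, N_BUCKETS_32)
```

For the Monday traffic this gives about 496,914 nonempty buckets, i.e., roughly 29 flows sharing a bucket purely by chance; the Vermont and SHA-3 entries of Table 7 for that traffic (496,919 and 496,922) are of the same order.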

5. Discussion and Conclusions

A proper view of the statistics and the dynamics of a network is of great importance, since it enables us to detect network attacks. Thus, network monitors using the network flow concept are an important part of modern cybersecurity defense. As such, these devices themselves may be the targets of cyberattacks. One of the possible weak points of NetFlow probes is a network flow cache, which is usually implemented as a hash table. Due to the limited size of a hash table, it is inevitable that, sooner or later, two different flows will be mapped to the same hash bucket. It is essential that the hash function used for calculating the hash keys offers a uniform distribution of NetFlow records over available buckets, so that the lengths of all bucket lists would be almost equal. This makes it possible to use a reasonably sized hash data structure to make the flow lookup fast, because of minimal list lengths. The experiments conducted during this research show that even a relatively simple hash function may guarantee such characteristics.

However, nowadays, when components of cybersecurity systems themselves may be targets of a cyberattack, a no less important feature of such systems is their resistance to attacks. In the case of a NetFlow probe, it should be impossible for an attacker to create directed collisions in the hash function. If an attacker is able to fabricate network traffic in such a way as to lead to a large number of collisions in the hash function, some buckets of the hash table may overflow, causing malfunction of the probe.
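A directed collision is easy to picture for a purely additive hash. The sketch below uses a simplified additive function as a stand-in (it is an assumption made for illustration and is not claimed to be the exact nProbe scheme evaluated in this work) and fabricates many distinct flows that all fall into the same bucket by keeping the sum of the two port numbers constant. For comparison, the same flows are hashed with SHA-3 truncated to 32 bits, where the truncation rule and the field order are again assumptions.

```python
import hashlib
import struct


def additive_hash32(src_ip, dst_ip, protocol, src_port, dst_port):
    """Simplified additive hash of the flow keys (illustrative stand-in)."""
    return (src_ip + dst_ip + protocol + src_port + dst_port) & 0xFFFFFFFF


def sha3_hash32(src_ip, dst_ip, protocol, src_port, dst_port):
    """First 32 bits of SHA3-256 over the packed flow keys (assumed truncation)."""
    key = struct.pack(">IIBHH", src_ip, dst_ip, protocol, src_port, dst_port)
    return int.from_bytes(hashlib.sha3_256(key).digest()[:4], "big")


def fabricate_colliding_flows(src_ip, dst_ip, protocol, port_sum, count):
    """Distinct flows whose port numbers always add up to port_sum."""
    return [(src_ip, dst_ip, protocol, sp, port_sum - sp)
            for sp in range(1024, 1024 + count)]


flows = fabricate_colliding_flows(0x0A000001, 0x0A000002, 17, 60_000, 1000)
additive_buckets = {additive_hash32(*f) for f in flows}
sha3_buckets = {sha3_hash32(*f) for f in flows}
```

All fabricated flows land in a single bucket of the additive hash, while the truncated SHA-3 values spread them over essentially as many buckets as there are flows; building such a set for a cryptographic function would require a search rather than simple arithmetic.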

The results from Section 4 show that only very simple hash functions (i.e., Mod32) are susceptible to common malicious traffic, such as DDoS or port scan attacks. More complex methods, such as Vermont, based on CRC32, offer relatively uniform distribution of flow records over flow cache buckets for normal traffic, and typical anomalous traffic. However, as demonstrated in [19], it is possible to prepare a targeted attack exploiting a vulnerability of the implemented hash function.

Thus, it is crucial to select a hashing function that maps only a small number of flow keys onto the same flow cache location. A hash function should therefore compute hash keys that are uniformly distributed, and it should be impossible for an attacker to create directed collisions. At the same time, the hash function must be fast so that it does not become a bottleneck of the NetFlow probe.

The obvious countermeasure against hash collision-based attacks is the application of cryptographic hash functions, for which collisions cannot be created easily. The results presented in Section 4 prove that the use of the cryptographic functions SHA-1 and SHA-3 offers comparable distribution of flows in the flow cache to the dedicated methods (Vermont, modified Vermont) used as reference. The advantage of implementing a hash function based on cryptographic functions in a NetFlow probe is that it is very difficult (or even impossible) to prepare a targeted attack on such a probe by fabricating network traffic to overflow flow cache buckets through systematically creating packets that lead to hash collisions.

Cryptographic functions, however, have not usually been candidates for hash functions in NetFlow probes, since they are considered to be computationally too expensive for efficient use in flow monitoring. Our concept presented in Section 2.3 shows that it was possible to implement a hardware-accelerated network flow probe employing a cryptographic hash function that offered sufficient performance to construct a network probe working in real-time with multigigabit traffic, even when it was flooded with the smallest IP packets. Relatively low hardware resource utilization makes it possible to reach a 100 Gbit/s bandwidth limit by applying hardware-specific design optimization and parallelization.

It has to be emphasized that most available traffic datasets contain traffic with a relatively small number of flows. The set CICIDS 2017 used in our experiment contains, in total, 2,830,540 flows. Taking into account the fact that the flow cache of a probe that uses a 32-bit hash function contains $2^{32}$ buckets, the flow records fill only a small fraction of the flow cache. The use of datasets with significantly larger numbers of flows with normal and anomalous traffic might give a better view of possible differences in distribution of flow records over flow cache buckets for the evaluated hash functions. Such an approach, and the application of customized traffic containing flows intentionally constructed to produce hash collisions (which may not be a trivial task for some hash functions), could be the subject of future work.
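A quick calculation makes this fraction concrete, assuming the best case in which every flow of the whole dataset occupies a bucket of its own:

$$\frac{2{,}830{,}540}{2^{32}} = \frac{2{,}830{,}540}{4{,}294{,}967{,}296} \approx 6.6 \times 10^{-4}$$

That is, at most about 0.066% of the buckets can be occupied, and the three daily subsets used in the experiments fill even less.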

To conclude, we can state that the resistance of cryptographic hash functions to collisions and the multigigabit efficiency of a hardware-accelerated implementation of hash computation allow the creation of an effective monitoring solution for modern cybersecurity systems while delivering a high level of resilience to targeted attacks.

Author Contributions

Conceptualization, M.K., M.R. and A.J.; methodology, M.K. and M.R.; software, M.K. and P.S.; validation, M.K. and P.S.; formal analysis, M.K. and M.R.; investigation, M.K., P.S., M.R. and A.J.; resources, M.K., P.S. and M.R.; data curation, P.S.; writing—original draft preparation, M.K., P.S., M.R. and A.J.; writing—review and editing, M.K., M.R. and A.J.; visualization, M.K.; supervision, M.R. and A.J.; project administration, A.J.; funding acquisition, A.J. All authors have read and agreed to the published version of the manuscript.

Funding

The study has been supported by the SIMARGL Project—Secure Intelligent Methods for Advanced RecoGnition of malware and stegomalware, with the support of the European Commission and the Horizon 2020 Program, under grant agreement number 833042. The publication was funded by the statutory activity subsidy from the Polish Ministry of Education and Science.

Conflicts of Interest

The authors declare no conflict of interest.

References

Al-Garadi, M.A.; Mohamed, A.; Al-Ali, A.K.; Du, X.; Ali, I.; Guizani, M. A Survey of Machine and Deep Learning Methods for Internet of Things (IoT) Security. IEEE Commun. Surv. Tutor. 2020, 22, 1646–1685.
Fizza, K.; Banerjee, A.; Mitra, K.; Jayaraman, P.P.; Ranjan, R.; Patel, P.; Georgakopoulos, D. QoE in IoT: A vision, survey and future directions. Discov. Internet Things 2021, 1, 4.
Federal Bureau of Investigations. The Cyber Threat. Available online: https://www.fbi.gov/investigate/cyber (accessed on 1 April 2022).
Caviglione, L.; Choraś, M.; Corona, I.; Janicki, A.; Mazurczyk, W.; Pawlicki, M.; Wasielewska, K. Tight Arms Race: Overview of Current Malware Threats and Trends in Their Detection. IEEE Access 2021, 9, 5371–5396.
NETSCOUT. NETSCOUT Threat Intelligence Report. Available online: https://www.netscout.com/threatreport (accessed on 1 April 2022).
Morgan, S. Cybercrime To Cost The World $10.5 Trillion Annually By 2025. Special Report: Cyberwarfare In The C-Suite. 2020. Available online: https://cybersecurityventures.com/hackerpocalypse-cybercrime-report-2016/ (accessed on 1 April 2022).
AV-TEST Institute. Malware Statistics. 2022. Available online: https://www.av-test.org/en/statistics/malware/ (accessed on 1 April 2022).
Wagner, C.; François, J.; State, R.; Engel, T. Machine Learning Approach for IP-Flow Record Anomaly Detection. In Proceedings of the NETWORKING 2011, Valencia, Spain, 9–13 May 2011; Domingo-Pascual, J., Manzoni, P., Palazzo, S., Pont, A., Scoglio, C., Eds.; Springer: Berlin/Heidelberg, Germany, 2011; pp. 28–39.
Iglesias, F.; Ferreira, D.C.; Vormayr, G.; Bachl, M.; Zseby, T. NTARC: A Data Model for the Systematic Review of Network Traffic Analysis Research. Appl. Sci. 2020, 10, 4307.
Krupski, J.; Graniszewski, W.; Iwanowski, M. Data Transformation Schemes for CNN-Based Network Traffic Analysis: A Survey. Electronics 2021, 10, 2042.
Hofstede, R.; Čeleda, P.; Trammell, B.; Drago, I.; Sadre, R.; Sperotto, A.; Pras, A. Flow Monitoring Explained: From Packet Capture to Data Analysis With NetFlow and IPFIX. IEEE Commun. Surv. Tutor. 2014, 16, 2037–2064.
Spognardi, A.; Villani, A.; Vitali, D.; Mancini, L.V.; Battistoni, R. Large-Scale Traffic Anomaly Detection: Analysis of Real Netflow Datasets. In E-Business and Telecommunications; Obaidat, M.S., Filipe, J., Eds.; Springer: Berlin/Heidelberg, Germany, 2014; pp. 192–208.
van der Steeg, D.; Hofstede, R.; Sperotto, A.; Pras, A. Real-time DDoS attack detection for Cisco IOS using NetFlow. In Proceedings of the 2015 IFIP/IEEE International Symposium on Integrated Network Management (IM), Ottawa, ON, Canada, 11–15 May 2015; pp. 972–977.
Zadnik, M.; Pecenka, T.; Korenek, J. Netflow probe intended for high-speed networks. In Proceedings of the International Conference on Field Programmable Logic and Applications, Tampere, Finland, 24–26 August 2005; pp. 695–698.
Novotný, J.; Čeleda, P.; Žádník, M. Hardware-Accelerated Framework for Security in High-Speed Networks. In Information Assurance for Emerging and Future Military Systems; NATO Science and Technology Organization: Brussels, Belgium, 2008.
Forconesi, M.; Sutter, G.; Lopez-Buedo, S.; Aracil, J. Accurate and flexible flow-based monitoring for high-speed networks. In Proceedings of the 2013 23rd International Conference on Field programmable Logic and Applications, Porto, Portugal, 2–4 September 2013; pp. 1–4.
Trzepiński, M.; Skowron, K.; Korona, M.; Rawski, M. FPGA Implementation of Memory Management for Multigigabit Traffic Monitoring. In Man–Machine Interactions 5; Gruca, A., Czachórski, T., Harezlak, K., Kozielski, S., Piotrowska, A., Eds.; Springer: Berlin/Heidelberg, Germany, 2018; pp. 555–565.
Sonchack, J.; Michel, O.; Aviv, A.J.; Keller, E.; Smith, J.M. Scaling Hardware Accelerated Network Monitoring to Concurrent and Dynamic Queries With *Flow. In Proceedings of the 2018 USENIX Annual Technical Conference (USENIX ATC 18), USENIX Association, Boston, MA, USA, 11–13 July 2018; pp. 823–835.
Eckhoff, D.; Limmer, T.; Dressler, F. Hash tables for efficient flow monitoring: Vulnerabilities and countermeasures. In Proceedings of the 2009 IEEE 34th Conference on Local Computer Networks 2009, Zurich, Switzerland, 20–23 October 2009; pp. 1087–1094.
Kang, M.S.; Lee, S.B.; Gligor, V.D. The Crossfire Attack. In Proceedings of the 2013 IEEE Symposium on Security and Privacy, Berkeley, CA, USA, 19–22 May 2013; pp. 127–141.
Zhao, Z.; Shi, X.; Wang, Z.; Li, Q.; Zhang, H.; Yin, X. Efficient and Accurate Flow Record Collection With HashFlow. IEEE Trans. Parallel Distrib. Syst. 2022, 33, 1069–1083.
Szumełda, P.; Orzechowski, N.; Rawski, M.; Janicki, A. VHS-22—A Very Heterogeneous Set of Network Traffic Data for Threat Detection. In Proceedings of the European Interdisciplinary Cybersecurity Conference (EICC 2022), Barcelona, Spain, 15–16 June 2022.
Kirsch, A.; Mitzenmacher, M.; Varghese, G. Hash-Based Techniques for High-Speed Packet Processing. In Algorithms for Next Generation Networks; Springer: Berlin/Heidelberg, Germany, 2010.
Deri, L. nProbe: An Open Source NetFlow Probe for Gigabit Networks. In Proceedings of the TERENA Networking Conference 2003, Zagreb, Croatia, 21 May 2003.
Lampert, R.T.; Sommer, C.; Münz, G.; Dressler, F. Vermont—A Versatile Monitoring Toolkit for IPFIX and PSAMP. In Proceedings of the IEEE/IST Workshop on Monitoring, Attack Detection and Mitigation (MonAM 2006), Tübingen, Germany, 28–29 September 2006.
Williams, R.N. A Painless Guide to CRC Error Detection Algorithms. Available online: http://ross.net/crc/download/crc_v3.txt (accessed on 24 April 2022).
IEEE Std 802.3-2018; IEEE Standard for Ethernet. Revision of IEEE Std 802.3-2015. IEEE: Piscataway, NJ, USA, 2018; pp. 1–5600.
Dang, Q. Secure Hash Standard (SHS); National Institute of Standards and Technology: Gaithersburg, MD, USA, 2015.
Eastlake, D.E., 3rd; Jones, P. US Secure Hash Algorithm 1 (SHA1); RFC 3174. Available online: https://www.rfc-editor.org/info/rfc3174 (accessed on 27 April 2022).
Stevens, M.; Karpman, P.; Peyrin, T. Freestart Collision for Full SHA-1. In Proceedings of the Annual International Conference on the Theory and Applications of Cryptographic Techniques, Vienna, Austria, 8–12 May 2016; pp. 459–483.
Stevens, M.; Bursztein, E.; Karpman, P.; Albertini, A.; Markov, Y. The First Collision for Full SHA-1. In Proceedings of the Advances in Cryptology—CRYPTO 2017, Santa Barbara, CA, USA, 20–24 August 2017; pp. 570–596.
Leurent, G.; Peyrin, T. From Collisions to Chosen-Prefix Collisions—Application to Full SHA-1. Cryptology ePrint Archive, Report 2019/459. 2019. Available online: https://ia.cr/2019/459 (accessed on 27 April 2022).
Leurent, G.; Peyrin, T. SHA-1 Is a Shambles—First Chosen-Prefix Collision on SHA-1 and Application to the PGP Web of Trust. Cryptology ePrint Archive, Report 2020/014. 2020. Available online: https://ia.cr/2020/014 (accessed on 27 April 2022).
Dworkin, M. SHA-3 Standard: Permutation-Based Hash and Extendable-Output Functions; National Institute of Standards and Technology: Gaithersburg, MD, USA, 2015.
Bertoni, G.; Daemen, J.; Peeters, M. Cryptographic Sponge Functions; Citeseer: University Park, PA, USA, 2011.
Merkle, R.C. Secrecy, Authentication, and Public Key Systems. Ph.D. Thesis, Stanford University, Stanford, CA, USA, 1979.
Bertoni, G.; Daemen, J.; Peeters, M.; Van Assche, G. Keccak. In Advances in Cryptology–EUROCRYPT 2013; Johansson, T., Nguyen, P.Q., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2013; Volume 7881.
IEEE Std 1364-2005; IEEE Standard for Verilog Hardware Description Language. Revision of IEEE Std 1364-2001. IEEE: Piscataway, NJ, USA, 2006; pp. 1–590.
ARM. AMBA^® 4 AXI4-Stream Protocol Version 1.0 (ARM IHI 0051A). 2010. Available online: https://documentation-service.arm.com/static/60d5e2510320e92fa40b4788 (accessed on 27 April 2022).
Lee, E.H.; Kim, S.M.; Lee, J.H.; Cho, K. Design of a High Speed SHA-1 Architecture Using Unfolded Pipeline for Biomedical Applications. In Proceedings of the International Multi-Conference on Society, Cybernetics and Informatics (IMSCI 2009), Orlando, FL, USA, 10–13 July 2009.
Various. Cocotb’s Documentation. Available online: https://docs.cocotb.org/en/stable (accessed on 24 April 2022).
Accellera. Universal Verification Methodology. Available online: https://www.accellera.org/community/uvm (accessed on 24 April 2022).
Terasic. DE5-Net FPGA Development Kit. User Manual; Terasic: Hsinchu, Taiwan, 2018.
Korona, M.; Skowron, K.; Trzepiński, M.; Rawski, M. High-performance FPGA architecture for data streams processing on example of IPsec gateway. Int. J. Electron. Telecommun. 2018, 64, 351–356.
Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization. In Proceedings of the 4th International Conference on Information Systems Security and Privacy (ICISSP 2018), Funchal, Portugal, 22–24 January 2018.

Figure 1. Block diagram of hardware-accelerated network probe.

Figure 2. Illustration of Vermont hashing function.

Figure 3. Illustration of modified Vermont hashing function, based on [19].

Figure 4. SHA-1 algorithm round scheme.

Figure 5. SHA-3 state as three-dimensional array A.

Figure 6. Sponge construction, which is the basis of SHA-3.

Figure 7. Block scheme of the network probe hardware accelerator’s top module—netprobe_top.

Figure 8. Block scheme of the network probe hardware accelerator packet parser module—netprobe_parser_top.

Figure 9. Pipelined hash algorithm implementation in the network probe hardware accelerator.

Figure 10. cocotb-based verification environment of netprobe_top module.

Figure 11. Simulation of netprobe_top module with the SHA-3 hash algorithm using the cocotb-based verification environment.

Figure 12. Visualization of bucket occupation for $2^{16}$ buckets. (a) Mod32; (b) modified Vermont; (c) SHA-3.

Table 1. The network probe hardware accelerator CRC-32’s implementation parameters, following [26].

Parameter	Value
Polynomial	$x^{32} + x^{26} + x^{23} + x^{22} + x^{16} + x^{12} + x^{11} + x^{10} + x^{8} + x^{7} + x^{5} + x^{4} + x^{2} + x^{1} + 1$ (0x04C11DB7)
Data width	32
Initial value	0xFFFFFFFF
Reflect input	True
Reflect output	True
Final XOR	0xFFFFFFFF

Table 2. Capacity c and rate r of SHA-3 algorithms in relation to hash length.

Hash Length d	Capacity c = 2d	Rate r = b − c
224	448	1152
256	512	1088
384	768	832
512	1024	576

Table 3. Values of HASH_ALGORITHM parameter for netprobe_hash_top module configuration.

HASH_ALGORITHM Value	Hash Algorithm
0	None—hash is not appended
1	Sum modulo 32—nProbe
2	Nested CRC-32—Vermont
3	Nested CRC-32 with w constants—modified Vermont
4	SHA-1
5	SHA-3

Table 4. Module netprobe_hash_top strap ports for algorithm configuration.

Strap Input Port	Width	Description
strap_w_src_addr	32	32-bit constant w for modified Vermont algorithm to sum with Source IP Address key
strap_w_dst_addr	32	32-bit constant w for modified Vermont algorithm to sum with Destination IP Address key
strap_w_protocol	32	32-bit constant w for modified Vermont algorithm to sum with IP Protocol key
strap_w_src_port	32	32-bit constant w for modified Vermont algorithm to sum with Source Port key
strap_w_dst_port	32	32-bit constant w for modified Vermont algorithm to sum with Destination Port key
strap_hash_length	2	Selection of SHA-3 hash length: 2'd0, 2'd1, 2'd2, 2'd3 mean 224, 256, 384, 512 bits, respectively.

Table 5. Module netprobe_top implementation results in Stratix V FPGA.

Hash Algorithm	Logic Utilization	Maximum Clock Frequency $F_{\max}$ (MHz)	Throughput (Gbit/s)	Latency (Clock Cycles/ns)
nProbe	0.56%	206.48	26.43	9/43.59
Vermont	0.69%	207.51	26.56	13/62.65
modified Vermont	0.79%	197.63	25.30	13/65.78
SHA-1	6.17%	201.61	25.81	48/238.08
SHA-3	16.44%	175.04	22.41	32/182.82

Table 6. Statistics of bucket occupation for various hash functions, $2^{16}$ buckets used.

Hash Function	Normal (Monday)				Normal + Attacks (Wednesday)				Normal + Attacks (Friday)
	N = 496,943				N = 452,601				N = 792,487
	Min	Max	Mean	SD	Min	Max	Mean	SD	Min	Max	Mean	SD
Mod32	0	38	7.58	5.45	0	34	6.91	5.18	0	52	12.09	7.62
Vermont	0	21	7.58	2.66	0	21	6.91	2.62	2	27	12.09	3.38
Modified Vermont	0	19	7.58	2.63	0	19	6.91	2.47	1	27	12.09	3.32
SHA-1	0	21	7.58	2.76	0	21	6.91	2.62	1	32	12.09	3.47
SHA-3	0	20	7.58	2.75	0	22	6.91	2.62	1	29	12.09	3.47

Table 7. Statistics of bucket occupation for various hash functions, $2^{32}$ buckets used.

Hash	Normal (Monday)					Normal + Attacks (Wednesday)					Normal + Attacks (Friday)
	N = 496,943					N = 452,601					N = 792,487
	Min	Max	Mean	SD	Buckets	Min	Max	Mean	SD	Buckets	Min	Max	Mean	SD	Buckets
Mod32	1	22	2.91	2.43	170,626	1	22	2.93	2.28	154,662	1	32	3.98	3.65	199,241
Vermont	1	2	1	0.01	496,919	1	2	1	0.01	452,577	1	2	1	0.01	792,465
Mod. Verm.	1	2	1	0.01	452,572	1	2	1	0.01	452,572	1	2	1	0.01	792,386
SHA-1	1	2	1	0.01	452,567	1	2	1	0.01	452,567	1	2	1	0.01	792,423
SHA-3	1	2	1	0.01	496,922	1	2	1	0.01	452,583	1	2	1	0.01	792,420

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
