Not All Seeds Are Important: Fuzzing Guided by Untouched Edges

Xie, Chen; Jia, Peng; Yang, Pin; Hu, Chi; Kuang, Hongbo; Ye, Genzuo; Hong, Xuanquan

doi:10.3390/app132413172


by Chen Xie ¹, Peng Jia ¹,*, Pin Yang ¹, Chi Hu ², Hongbo Kuang ¹, Genzuo Ye ¹ and Xuanquan Hong ¹

¹ School of Cyber Science and Engineering, Sichuan University, Chengdu 610207, China

² China Academy of Engineering Physics (CAEP), Mianyang 621900, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(24), 13172; https://doi.org/10.3390/app132413172

Submission received: 24 October 2023 / Revised: 5 December 2023 / Accepted: 5 December 2023 / Published: 12 December 2023

(This article belongs to the Special Issue Advances in Cybersecurity: Challenges and Solutions)


Abstract: Coverage-guided greybox fuzzing (CGF) has become the mainstream technology in the field of vulnerability mining and has proven effective. Seed scheduling, the process of selecting seeds from the seed pool for subsequent fuzzing iterations, is a critical component of CGF. While many seed scheduling strategies have been proposed in academia, they all focus on the explored regions within programs. In response to the inefficiencies of traditional seed scheduling strategies, which often allocate resources to ineffective seeds, we introduce a novel seed scheduling strategy guided by untouched edges. The strategy generates the optimal seed set according to the information on the untouched edges. We also present a new instrumentation method to capture unexplored areas and guide the fuzzing process toward them. We implemented the prototype UntouchFuzz on top of American Fuzzy Lop (AFL) and evaluated it against the most advanced seed scheduling strategies. Our results demonstrate that UntouchFuzz achieves improvements in code coverage and in the number of unique vulnerabilities found. Furthermore, the proposed method was ported to the fuzzer MOpt, which further demonstrates its scalability. In particular, 13 vulnerabilities were found in open-source projects, 7 of which have been assigned CVEs.

Keywords: vulnerability mining; greybox fuzzing; seed scheduling

1. Introduction

Fuzzing is a prevalent and effective automated software testing method that has already found numerous vulnerabilities in real-world applications [1,2,3,4,5,6,7]. Fuzzers efficiently explore the input space of the program under test, operating at nearly raw execution speeds, with the aim of identifying specific inputs that can provoke program crashes or anomalous behaviors. However, the input space of most real-world programs is so large that it is difficult to fully explore. Moreover, vulnerabilities are sparse in an application, with only certain specific inputs capable of triggering vulnerabilities [8].

American Fuzzy Lop (AFL) [9], one of the most popular and widely used coverage-guided greybox fuzzers in both academia and industry, is an efficient fuzzing tool for file applications and has already discovered many high-risk vulnerabilities across various projects. AFL employs a mutation-based fuzzing approach by mutating the binary data of the seed file to find test cases that improve coverage or trigger crashes. Research [10] indicates that the performance of mutation-based fuzzers depends on seed scheduling, essentially determining the prioritization of which seed to mutate.

The main challenge in seed scheduling is to determine which seeds in the corpus are more likely to explore new code space in the program when mutated. From the perspective of code coverage, the main role of seed scheduling is to prioritize those seeds that are more promising to trigger new code coverage after being mutated. AFL, for example, utilizes a greedy algorithm to maintain an optimal seed set that covers all explored edges. However, seeds related to validation check edges are also added to the seed set by mistake, and fuzzing these seeds will waste a lot of computational overhead [11].

In the field of seed scheduling, many scholars have conducted in-depth research. Classic seed scheduling strategies mainly guide seed scheduling and prioritization through the distribution of edge coverage or path coverage throughout the fuzzing process. For example, AFLFast [12] gives priority to those seeds likely to trigger low-frequency paths. FairFuzz [13] prioritizes those triggering rare edges, while EcoFuzz [14] prioritizes those based on the seed’s self-transition probability. However, the distribution of coverage information mentioned above is related to the control flow graph (CFG) of the program. Consequently, the performance of such fuzzers can vary across programs with different CFGs.

To decouple the strong correlation between seed scheduling strategies and target programs, SLIME [15] classifies seeds by constructing multiple property queues, employs the upper confidence bound variance (UCB-V) algorithm to select the optimal seed queue, and then fuzzes the seeds in that queue. Furthermore, Alphuzz [16] considers seed interdependencies and uses a Monte Carlo tree search approach for seed scheduling.

However, the existing methods mentioned above neglect to focus on the unexplored regions within the control flow graph (CFG) of the program. For instance, consider a seed s whose execution path does not contain any untouched edges, but a coverage-guided fuzzer still marks s as a favored seed. On one hand, this seed is likely to cover branches related to validation checks, while those checks are often hard to solve. On the other hand, when this seed initially joins the seed queue, it contains untouched edges. However, as fuzzing proceeds, other seeds may explore these untouched edges, resulting in a seed execution path without untouched edges. Our insight is that if a seed does not contain any untouched edges, there is little sense in prioritizing mutations on it.

In response to the aforementioned challenges, we propose a seed scheduling strategy based on untouched edges extracted from underlying CFG. Unlike traditional seed scheduling strategies that pay more attention to explored regions, our approach gives precedence to seeds that incorporate untouched edges, which are then subjected to fuzzing as a priority. We instrument the target program to collect data on edge coverage and untouched edge information. Subsequently, we revise the untouched edge coverage information for all seeds within the queue. Ultimately, we select an optimal minimal subset of seeds that covers all untouched edges from the seed pool. Furthermore, our scheduling strategy allocates more energy to seeds with more low-frequency untouched edges in the queue. Our insight is that low-frequency untouched edges imply a high probability of being explored, while high-frequency untouched edges are likely to be hard-to-solve edges. Therefore, we enhance the effectiveness of seed scheduling by allocating extra energy to seeds associated with more low-frequency untouched edges, further encouraging in-depth exploration of these areas.

The main contributions of this paper are summarized as follows:

(1): We propose a new seed scheduling strategy that efficiently selects seeds based on unexplored regions within program execution paths. This approach prevents wasting resources on ineffective seeds, a common issue in traditional scheduling methods.
(2): We applied the new seed scheduling strategy to a new greybox fuzzing tool named UntouchFuzz. To our knowledge, UntouchFuzz is the first fuzzer to utilize unexplored area information as the basis for seed scheduling. The source code is available at https://github.com/bladchan/untouchFuzz.git (accessed on 22 October 2023).
(3): We evaluated UntouchFuzz on 12 programs, demonstrating its effectiveness when compared to four AFL-based seed schedulers.

The rest of the paper is organized as follows. Section 2 discusses the background. Section 3 illustrates our motivation with an example. Section 4 shows our design of UntouchFuzz and its technical details. Implementation details are listed in Section 5. Section 6 shows the evaluation results. Section 7 discusses several limitations of our implementation. Section 8 concludes this paper.

2. Background

2.1. Techniques

In this section, we provide the background on coverage-guided greybox fuzzing and focus on the coverage acquisition method and seed scheduling in the classical fuzzer AFL.

2.1.1. Coverage-Guided Greybox Fuzzing

Coverage-guided greybox fuzzing continuously generates test cases by employing coverage feedback loops and preserves seeds that yield new coverage. Specifically, coverage-guided greybox fuzzing includes four main stages [17]:

(1): Seed Scheduling: Effective seeds are chosen from a pool of seeds based on a scheduling strategy, where effectiveness refers to the fact that new code can be explored more easily by mutating that seed.
(2): Energy Scheduling: Appropriate mutation counts (energy) are assigned based on the attributes of the chosen effective seeds.
(3): Seed Mutation: Within the allocated energy, various mutation operations are performed on the selected seeds to generate new test cases.
(4): Seed Selection/saving: Each generated test case is executed on the target program, and seeds are evaluated based on corresponding coverage information. If a test case enhances the coverage of the target program, it is chosen as a new seed. Through this feedback loop, the coverage of the target program under test continues to increase and it is more likely to generate test cases that trigger new bugs.
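The four stages above can be sketched as a toy loop. This is only an illustration under simplifying assumptions, not AFL's implementation: `toy_target` is a hypothetical stand-in for an instrumented program, seeds are picked uniformly, every seed receives the same constant energy, and the only mutation is overwriting one random byte.

```python
import random

def toy_target(data: bytes) -> frozenset:
    """Hypothetical target: returns the set of 'edges' an input covers."""
    edges = {"entry"}
    if len(data) > 0 and data[0] == ord("F"):
        edges.add("magic0")
        if len(data) > 1 and data[1] == ord("U"):
            edges.add("magic1")
    return frozenset(edges)

def fuzz(initial_seeds, iterations, rng):
    """Toy coverage-guided loop; initial seeds are assumed to be non-empty."""
    queue = list(initial_seeds)
    global_coverage = set()
    for seed in queue:
        global_coverage |= toy_target(seed)
    for _ in range(iterations):
        seed = rng.choice(queue)              # (1) seed scheduling (uniform here)
        energy = 16                           # (2) energy scheduling (constant here)
        for _ in range(energy):
            data = bytearray(seed)            # (3) seed mutation: overwrite one byte
            data[rng.randrange(len(data))] = rng.randrange(256)
            covered = toy_target(bytes(data))
            if not covered <= global_coverage:    # (4) keep coverage-increasing inputs
                global_coverage |= covered
                queue.append(bytes(data))
    return queue, global_coverage

queue, coverage = fuzz([b"AA"], iterations=50, rng=random.Random(0))
```

Real schedulers differ precisely in how they replace the uniform choice in stage (1) and the constant in stage (2).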

2.1.2. Lightweight Instrumentation

The key idea of coverage-guided greybox fuzzing lies in the acquisition of coverage data, as the coverage feedback mechanism propels the entire fuzzing process forward. AFL, as one of the state-of-the-art coverage-guided fuzzing tools, employs a lightweight instrumentation technique to capture transitions between program basic blocks [18]. AFL assigns a random ID to each basic block in the CFG. The transition between two basic blocks, i.e., the edge, is defined as in Equation (1). In particular, $\mathit{edge}_{i}$ represents the ID of the edge, $\mathit{prev}_{bb}$ refers to the ID of the previous basic block, and $\mathit{cur}_{bb}$ refers to the ID of the basic block where the current transition occurs. Note that $\mathit{prev}_{bb}$ is shifted one bit to the right in order to distinguish different transition orders between two basic blocks.

$$\mathit{edge}_{i} = (\mathit{prev}_{bb} \gg 1) \oplus \mathit{cur}_{bb} \tag{1}$$

AFL maintains a default 64 KB bitmap. Each byte in the bitmap is utilized to log the transition count associated with the byte’s index in the bitmap, as illustrated in Figure 1. In this way, AFL can effectively track and count the edges covered by different inputs in the program.
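A minimal sketch of Equation (1) and the bitmap update follows. The block IDs are arbitrary illustrative values, the function names are hypothetical, and the sketch assumes the previous location starts at 0 and that each bitmap byte is an 8-bit counter that wraps around.

```python
MAP_SIZE = 1 << 16  # 64 KB bitmap, one byte per edge index

def edge_index(prev_bb: int, cur_bb: int) -> int:
    """Equation (1): shift the previous block ID right by one bit, then XOR."""
    return ((prev_bb >> 1) ^ cur_bb) % MAP_SIZE

def trace(path, bitmap=None):
    """Replay a sequence of basic-block IDs and count the edges it takes."""
    bitmap = bytearray(MAP_SIZE) if bitmap is None else bitmap
    prev = 0  # assumption: no previous block before the first one
    for cur in path:
        idx = edge_index(prev, cur)
        bitmap[idx] = (bitmap[idx] + 1) & 0xFF  # 8-bit hit counter
        prev = cur
    return bitmap

a, b = 0xABCD, 0x1234
# The shift makes A->B and B->A map to different indices.
forward, backward = edge_index(a, b), edge_index(b, a)
bitmap = trace([a, b, a, b])  # takes A->B twice and B->A once
```

Without the shift, both directions would XOR to the same index, and a self-loop would always map to index 0.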

2.1.3. Seed Scheduling

AFL employs a genetic algorithm to preserve test cases that cover new edges or hit new edge counts and utilizes a greedy algorithm to generate a favored seed subset from the seed queue, as described in Algorithm 1.

Algorithm 1 Generating the favored seed subset

In the initial step, the algorithm maintains an array, $\mathit{temp}_{v}$, to keep a record of edges currently covered by favored seeds. Then, in line 3, the algorithm iterates through the highest-scoring seed of each edge. Notably, the highest-scoring seeds are dynamically maintained by AFL during the fuzzing process based on criteria such as seed size and execution speed.

Subsequently, in lines 4–7, the algorithm determines whether the edge has already been covered by other seeds within the favored seed set. If the edge remains uncovered, the algorithm updates the edge coverage information in $\mathit{temp}_{v}$. Finally, in line 8, the seed is added to the favored seed set.

Once this favored seed set is obtained, AFL prioritizes these seeds and allocates more energy for mutation. It is important to acknowledge that AFL assumes that if a seed triggers new edges, fuzzing that seed will likely trigger more edges. However, this assumption has certain limitations when dealing with unexplored regions, as we will discuss in detail in Section 3.
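The greedy selection described for Algorithm 1 can be sketched as follows. This is a simplified model rather than AFL's code: `top_rated` and `seed_edges` are hypothetical dictionaries standing in for AFL's per-edge best-seed table and per-seed coverage traces, and a Python set plays the role of the `temp_v` array. The sketch assumes that the best seed recorded for an edge actually covers that edge.

```python
def favored_subset(top_rated, seed_edges):
    """Greedy cover of all explored edges.

    top_rated:  edge index -> name of the best seed covering that edge
    seed_edges: seed name  -> set of edge indices the seed covers
    """
    covered = set()   # edges already covered by favored seeds (role of temp_v)
    favored = []
    for edge in sorted(top_rated):
        if edge in covered:
            continue                      # some favored seed already covers it
        seed = top_rated[edge]
        covered |= seed_edges[seed]       # record everything this seed covers
        favored.append(seed)              # mark the seed as favored
    return favored

seed_edges = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4}}
top_rated = {1: "A", 2: "A", 3: "B", 4: "C"}
favored = favored_subset(top_rated, seed_edges)
```

In this example seed B is never favored, because edge 3 is already covered once seed A has been selected for edge 1.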

2.2. Related Work

In this section, we discuss the closely related works.

Coverage-guided greybox fuzzing. Coverage-guided greybox fuzzing is one of the most effective techniques for finding vulnerabilities and bugs, garnering significant attention from both academia and industry. Coverage-based greybox fuzzers typically adopt the coverage information to guide different program path explorations.

Since a coverage guidance engine is a key component for the greybox fuzzers, much effort has been devoted to improving their coverage. For example, REDQUEEN [19], GREYONE [20] and PATA [21] employ lightweight taint analysis to penetrate some paths protected by magic bytes comparisons. Driller [22], T-Fuzz [23] and QSYM [3] incorporate symbolic execution engines to delve into deeper program codes. Angora [24] adopts a gradient descent technique to resolve path constraints to break some hard comparisons. MemFuzz [25], MemLock [26], and ovAFLow [27] augment evolutionary fuzzing by additionally leveraging information about memory accesses or memory consumption performed by the target program. CollAFL [28] proposes a coverage-sensitive fuzzing approach to mitigate path collisions. Furthermore, AFLGo [29], Hawkeye [30], Beacon [31], and SelectFuzz [32] utilize alternative metrics for directing fuzzing toward user-specified target sites in the program.

Seed scheduling. In this paper, we focus on improving the seed scheduling component in a fuzzer. With the seed set, seed scheduling is essential for addressing two key issues: (1) which seed to select for the next round and (2) the time budget for the selected seed. In practice, instead of time budget, most fuzzers optimize the number of mutations performed on the selected seeds, i.e., energy scheduling.

AFLFast [12] models path transitions as Markov chain [33], efficiently guiding fuzzing to explore undiscovered path transitions. FairFuzz [13] leverages a targeted mutation strategy to prioritize the exploration of rare branches. EcoFuzz [14] adopts the Variant of the Adversarial Multi-armed Bandit Model (VAMAB) model to prioritize and allocate more energy to the seeds with lower self-transition probabilities. The insight behind it is that the low self-transition probability indicates the high probability of discovering new paths after mutating. Alphuzz [16] models the seed scheduling problem as a Monte Carlo tree search (MCTS) problem. Its key observation is that the relationships among seeds are valuable for seed scheduling. SLIME [15] prioritizes seeds based on reward estimated by a customized upper confidence bound variance-aware (UCB-V) algorithm on different property seed queues, adaptively allocating energy to the seed with different properties.

3. Motivating Example

We use a program’s control flow graph in Figure 2 to illustrate our motivation. The initial seed A is considered to be a quality seed, i.e., the seed is capable of executing the main logic codes of the program. Figure 2a shows the execution path of seed A, with red circles denoting the basic blocks covered by the seed. When the coverage-guided greybox fuzzer mutates seed A, it can easily produce seed B and seed C. Both of these seeds cover edges associated with validation checks at the control flow graph level, i.e., BB1→BB10 and BB2→BB10.

The appearance of these two new edges is interpreted by the fuzzer as an increase in edge coverage, leading to their inclusion in the seed queue. Furthermore, according to the previous description in Section 2.1.3, AFL marks both seed B and seed C as favored seeds during seed scheduling since they cover new edges. However, when attempting to mutate these seeds, they encounter challenges in producing descendant seeds that explore deeper code areas.

We further assume that after mutating seed A, the fuzzer generates not only seed B and seed C but also seed D. Seed D’s execution path is shown in Figure 2d. The difference between seed A and seed D lies in the fact that seed D explores basic block BB6, an unexplored area of seed A’s execution path. In this case, both seeds are effective for the exploration of basic block BB5. However, if seed E, which covers basic block BB5, is produced later, seed A and seed D become ineffective, as they contribute little to exploring basic blocks BB8 and BB9. Nevertheless, AFL still regards seed A and seed D as favored because they both cover new edges.

As discussed in Section 2.1.3, AFL’s seed scheduling algorithm selects favored seeds based on covered edges. However, it lacks tracking information for unexplored areas at the CFG level. AFL’s perspective hinges on the assumption that if a seed covers new edges, it is more likely to explore unexplored regions. This perspective relies on a crucial precondition: the new edges must be adjacent to unexplored areas; otherwise, these edges are likely related to validation checks or have already been explored by other seeds. Particularly in the context of expansive and complex programs, AFL expends substantial fuzzing resources on ineffective seeds, impeding the prioritization of genuinely effective seeds and slowing down the convergence of the fuzzing process. Therefore, this paper addresses the issue in AFL’s seed scheduling algorithm by introducing a new mechanism to track unexplored regions.

4. Design of UntouchFuzz

4.1. Overview

Figure 3 shows the overview of UntouchFuzz. In comparison to traditional coverage-guided greybox fuzzers, UntouchFuzz introduces an additional bitmap for tracking untouched edges. The untouched edge instrumentation mechanism updates this bitmap during program execution. Seed scheduling is then performed based on the information from this bitmap, prioritizing the mutation of the favored seeds generated by the scheduler.

To further explain, UntouchFuzz starts with an initial corpus as a seed set. By using the seed scheduling mechanism, the fuzzer selects a seed from the seed set for mutations. The number of mutations is determined by the energy scheduling mechanism, based on seed attributes. The fuzzer then executes the AFL-instrumented program with mutated test cases. If the program causes a crash, the test case is preserved on the local disk. If it covers new edges, the test case is preserved as a seed and added to the seed queue. Meanwhile, the coverage-increasing seed is provided to the program instrumented for untouched edge tracking, collecting information on untouched edges to guide the next seed scheduling process.

Furthermore, UntouchFuzz allocates more energy to seeds with more low-frequency untouched edges in the seed queue. This allocation aims to encourage these seeds to make more attempts at breaking through these low-frequency untouched edges. In the following sections, we will discuss the methods for collecting information on untouched edges in Section 4.2, introduce the seed scheduling algorithm based on untouched edges in Section 4.3, and outline slight improvements to energy scheduling in Section 4.4.

4.2. Untouched Edges Tracking

As mentioned in Section 2.1.2, AFL employs a lightweight instrumentation technique to track program edge coverage. In essence, AFL’s instrumentation assigns a random ID to each basic block within the target program’s CFG. During program execution, the instrumentation codes calculate the corresponding edge index based on Equation (1) and use this index to update the coverage bitmap.

To ensure minimal impact on program execution speed, AFL utilizes a lightweight XOR operation for computing coverage indices. Similarly, the instrumentation codes responsible for gathering untouched edge information should also be lightweight to minimize disruption to program execution.

We introduce our instrumentation approach with a practical example. Figure 4 provides a snippet of branches within the program’s CFG. The hexadecimal values in green boxes represent random IDs allocated by the AFL instrumentation for each basic block. According to Equation (1), the edge index for the transition from basic block BB1 to BB2 is calculated as $(\mathtt{0xabcd} \gg 1) \oplus \mathtt{0x1234} = \mathtt{0x47d2}$, while the edge index for the transition from basic block BB1 to BB3 is calculated as $(\mathtt{0xabcd} \gg 1) \oplus \mathtt{0x5678} = \mathtt{0x039e}$.

Suppose a particular seed triggers a transition from basic block BB1 to BB2, leaving the transition from BB1 to BB3 unexplored. In this scenario, we label edge $\mathtt{0x039e}$ as an untouched edge within the execution path of the seed. A straightforward method for capturing untouched edges is to insert instrumentation codes within basic block BB2. Such codes update a byte of the untouched edge bitmap by using $\mathtt{0x039e}$ as a static index pre-allocated during compilation. This approach works efficiently when a basic block has only one predecessor, but confusion arises when a basic block has multiple incoming edges.

Consider the seed’s execution path: BB0→BB1→BB2→BB1. Basic block BB1 has two incoming edges: $\mathtt{0xe693}$ from BB0 and $\mathtt{0xa2d7}$ from BB2. The corresponding unexplored edges are $\mathtt{0xb6a5}$ and $\mathtt{0xb2a1}$. Employing the aforementioned instrumentation codes, distinguishing between these two untouched edges becomes a challenging endeavor.

Upon analyzing the above example, it becomes evident that static pre-allocation of untouched edge IDs is impractical when a basic block has multiple predecessors. To address this challenge, we propose a dynamic method for obtaining untouched edge IDs based on the properties of XOR operations. Specifically, we employ a global variable named “__afl_bb_ids” to maintain the XOR value of the IDs of the two basic blocks at a branch transition point. When a transition between basic blocks occurs, we perform an XOR operation on the ID of the transitioned-to basic block with “__afl_bb_ids” to retrieve the ID of the other unexplored basic block. Furthermore, we define the equation for calculating untouched edge IDs as follows:

$$\mathit{edge}_{untouch} = (\mathit{prev}_{bb} \gg 1) \oplus (\texttt{\_\_afl\_bb\_ids} \oplus \mathit{cur}_{bb}) \tag{2}$$

$$\text{where } \texttt{\_\_afl\_bb\_ids} = \mathit{cur}_{bb} \oplus \mathit{untouch}_{bb} \tag{3}$$

Illustrated with the control flow graph in Figure 4, consider the branch transition point within basic block BB1, which leads to two transitions to basic blocks BB2 and BB3. At BB1, we update the value of “__afl_bb_ids” by performing an XOR operation on the IDs of BB2 and BB3, resulting in $\mathtt{0x1234} \oplus \mathtt{0x5678} = \mathtt{0x444c}$. When a transition occurs from basic block BB1 to BB2, we can recover the ID of the unexplored basic block BB3 by using “__afl_bb_ids”: $\texttt{\_\_afl\_bb\_ids} \oplus \mathit{ID}_{BB2} = \mathtt{0x444c} \oplus \mathtt{0x1234} = \mathtt{0x5678}$. Subsequently, we calculate the untouched edge ID between BB1 and BB3 using Equation (1): $(\mathtt{0xabcd} \gg 1) \oplus \mathtt{0x5678} = \mathtt{0x039e}$. The calculated value is then utilized to update the bitmap information for the untouched edges.
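The arithmetic of Equations (2) and (3) can be checked with a few lines of Python, using the block IDs from Figure 4. The function name `untouched_edge_index` is a hypothetical helper for illustration, not an identifier from UntouchFuzz.

```python
def untouched_edge_index(prev_bb: int, cur_bb: int, bb_ids: int) -> int:
    """Recover the sibling block from the stored XOR, then apply Equation (1)."""
    untouched_bb = bb_ids ^ cur_bb          # Equation (3) solved for the sibling
    return (prev_bb >> 1) ^ untouched_bb    # Equation (2)

BB1, BB2, BB3 = 0xABCD, 0x1234, 0x5678

# Value stored in __afl_bb_ids at the branch in BB1.
bb_ids = BB2 ^ BB3

# Taking BB1->BB2 leaves BB1->BB3 untouched, and vice versa.
took_bb2 = untouched_edge_index(BB1, BB2, bb_ids)
took_bb3 = untouched_edge_index(BB1, BB3, bb_ids)
```

Because XOR is its own inverse, one stored value is enough to recover either successor from the other, whichever branch is taken.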

Algorithm 2 delineates the process of instrumenting untouched edges. Initially, in lines 2–6, the algorithm employs AFL’s native edge coverage-based instrumentation, concurrently capturing the random IDs allocated to each basic block. Subsequently, in lines 8–27, the algorithm performs the instrumentation for untouched edges.

Algorithm 2 Instrumentation for untouched edges

To elucidate further, lines 10–12 of the algorithm determine whether “untouch_inst1” is invoked. “untouch_inst1” inserts instrumentation codes at the beginning of each basic block. The primary function of these codes is to fetch the value of the global variable “__afl_bb_ids” within the program. The codes then calculate the ID of the untouched edge according to Equation (2) and use this ID to update the untouched edge bitmap “__afl_untouch_ptr”.

Moving to lines 15–24, the algorithm traverses the two successor basic blocks of the current basic block, performing an XOR operation on the IDs allocated to these two basic blocks. Finally, in line 25, the function “untouch_inst2” is invoked. The primary aim of the inserted code is to update the value of the global variable “__afl_bb_ids” according to the control flow graph, setting it to the outcome of the XOR operation mentioned earlier. At this point, all steps of the untouched edge instrumentation algorithm have been completed.
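To make the interplay of the two instrumentation points concrete, the following sketch replays an execution path in Python instead of instrumenting a real program. It rests on several assumptions that are not taken from Algorithm 2: the CFG is given as a hypothetical `successors` dictionary, both bitmaps store hit flags rather than counters, and a block without exactly two successors resets the stored XOR value to `None` so that no untouched edge is recorded for the next transition.

```python
MAP_SIZE = 1 << 16

def run(path, successors):
    """Replay a path of block IDs and fill both bitmaps.

    successors: block ID -> tuple of successor block IDs (hypothetical CFG model)
    """
    edge_map = bytearray(MAP_SIZE)      # AFL's edge coverage bitmap
    untouch_map = bytearray(MAP_SIZE)   # models the untouched edge bitmap
    prev_loc = 0                        # previous block ID, already shifted
    bb_ids = None                       # models __afl_bb_ids
    for cur in path:
        # Native AFL instrumentation: Equation (1).
        edge_map[(prev_loc ^ cur) % MAP_SIZE] = 1
        # Role of untouch_inst1 (block entry): Equation (2).
        if bb_ids is not None:
            untouch_map[(prev_loc ^ bb_ids ^ cur) % MAP_SIZE] = 1
        # Role of untouch_inst2 (at the branch): store the XOR of both successors.
        succ = successors.get(cur, ())
        bb_ids = succ[0] ^ succ[1] if len(succ) == 2 else None
        prev_loc = cur >> 1
    return edge_map, untouch_map

successors = {0xABCD: (0x1234, 0x5678)}   # BB1 branches to BB2 or BB3
edge_map, untouch_map = run([0xABCD, 0x1234], successors)
```

With this path, the touched edge BB1→BB2 lands in the coverage bitmap and the sibling edge BB1→BB3 lands in the untouched bitmap.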

At the low assembly level, a basic block always has exactly two successor basic blocks. However, when we move up to higher-level compiler intermediate languages, it is not guaranteed that a basic block always has exactly two successor basic blocks, especially in programs that contain switch statements. Specific solutions to this issue will be provided in Section 5. Additionally, it is worth noting that the proposed instrumentation approach still suffers from edge index collisions, where different edges in the CFG are assigned the same index value [28].

4.3. Seed Scheduling Based on Untouched Edges

In the preceding Section 4.2, we introduced the method for obtaining untouched edge information. In this section, we delve into seed scheduling based on the collected untouched edges. Similar to AFL, we maintain an array, ‘untouch_top_rated’, with a size of MAP_SIZE to store the most favored seed for each untouched edge. We also employ a greedy algorithm, specifically the minimum covering set algorithm [34], to generate an optimal seed set that contains all currently untouched edges.

Algorithm 3 describes the seed scheduling algorithm based on untouched edges. Initially, in line 1, the algorithm feeds the newly added seed, denoted as ‘s’, to the instrumented program ‘$P'$’ with untouched edges. Subsequently, it acquires the edge coverage bitmap and the untouched edge bitmap of that seed. Following this, in lines 2–27, the algorithm iterates through each index value in the bitmap.

Algorithm 3 Seed scheduling based on untouched edges

Specifically, in lines 3–10, the algorithm first checks whether the edge has been both marked as a touched edge and an untouched edge in the two bitmaps. If this condition is true, it suggests that the edge is likely in a loop structure. Due to the repeated edges covered in loops, it is possible that edges untouched in a previous iteration of the loop are now covered in the current iteration. To mitigate this effect, the algorithm sets the value of the untouched edge bitmap corresponding to this edge’s index to 0.

The algorithm also maintains a global array called ‘virgin_untouch’. The indices of this array correspond to edge IDs, and the array values indicate the status of the respective untouched edges: 0 denotes an untouched edge that has not been covered by the execution path of any seed, 1 denotes that a seed’s execution path includes this untouched edge, and 255 denotes an untouched edge that has been covered by the execution path of another seed, i.e., it has been “explored”. If the current edge is covered by seed ‘s’, and the corresponding value in the ‘virgin_untouch’ array is not 0, the algorithm updates the value to 255.

Moving on to lines 11–13, if the edge is an untouched edge for the current seed ‘s’ and has not been “explored” by other seeds in history, the algorithm updates the value in the ‘virgin_untouch’ array to 1. In lines 14–17, if the corresponding ‘virgin_untouch’ value for this edge is 255, indicating that the edge is no longer untouched, the algorithm sets the ‘untouch_top_rated’ value for this edge to NULL. The ‘untouch_top_rated’ array maintains the best seed for each untouched edge.

In lines 18–26 of the algorithm, the current seed ‘s’ is compared with the best seed recorded for this untouched edge. We continue to use AFL’s default metric, which is the seed’s execution time multiplied by its file size. Then, at line 28, the algorithm invokes the coverMinSet() function, which generates a minimal seed set containing all untouched edges found, following the logic described in Algorithm 1.

In the end, the algorithm outputs the optimal seed set based on untouched edges. UntouchFuzz prioritizes testing seeds from this selected seed set.
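The bookkeeping described in this section can be sketched as follows. The sketch mirrors only the prose description above, not the UntouchFuzz source: `Seed`, `update_untouch`, and `cover_min_set` are hypothetical names, dictionaries replace the MAP_SIZE arrays, each seed's bitmaps are modeled as sets of edge indices, and a lower execution time × size product is assumed to be better.

```python
from dataclasses import dataclass

@dataclass
class Seed:
    name: str
    exec_us: int       # execution time
    size: int          # file size in bytes
    touched: set       # edge indices set in the coverage bitmap
    untouched: set     # edge indices set in the untouched edge bitmap

    @property
    def score(self) -> int:
        return self.exec_us * self.size   # lower is better (assumption)

def update_untouch(seed, virgin_untouch, untouch_top_rated):
    # Loop case: an edge marked in both bitmaps is not really untouched.
    seed.untouched -= seed.touched
    # A previously untouched edge covered by this seed becomes "explored" (255).
    for edge in seed.touched:
        if virgin_untouch.get(edge, 0) != 0:
            virgin_untouch[edge] = 255
    # Edges untouched by this seed and not yet explored are marked 1.
    for edge in seed.untouched:
        if virgin_untouch.get(edge, 0) != 255:
            virgin_untouch[edge] = 1
    # Explored edges no longer need a best seed.
    for edge, state in virgin_untouch.items():
        if state == 255:
            untouch_top_rated.pop(edge, None)
    # Keep the best-scoring seed for each still-untouched edge.
    for edge in seed.untouched:
        if virgin_untouch[edge] != 1:
            continue
        best = untouch_top_rated.get(edge)
        if best is None or seed.score < best.score:
            untouch_top_rated[edge] = seed

def cover_min_set(untouch_top_rated):
    """Greedy cover over untouched edges, same idea as Algorithm 1."""
    covered, favored = set(), []
    for edge in sorted(untouch_top_rated):
        if edge in covered:
            continue
        seed = untouch_top_rated[edge]
        covered |= seed.untouched
        favored.append(seed.name)
    return favored
```

As a usage example, feeding in `Seed("A", 100, 10, {1, 2}, {10, 20})`, then `Seed("B", 50, 10, {1, 10}, {30})`, then `Seed("C", 10, 10, {1, 2, 20}, {30})` leaves only seed C favored: seed A's untouched edges 10 and 20 have both been explored by then, and seed C scores better than seed B for edge 30.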

4.4. Energy Scheduling Optimization

The goal of energy scheduling is to allocate energy efficiently to the chosen seeds for optimal mutations. It is essential to strike the right balance in energy allocation. Allocating excessive energy can lead to a significant waste of fuzzing resources on a single seed. Conversely, insufficient energy allocation may underutilize a seed’s potential to explore new paths, as discussed in reference [14].

In this paper, for each untouched edge, we count the number of seeds in the seed set whose execution paths include it. We then sort the untouched edges by this count in ascending order and select the top $\beta\%$ (40% by default in this paper) as rare untouched edges. Next, for each rare untouched edge $i$ of the current seed, we calculate the difference between its seed count $\mathit{rare}_{s}[i]$ and the maximum seed count $\mathit{max}_{s}$ among all rare untouched edges, and then compute the average distance $\mathit{dist}_{s}$ according to Equation (4).

$$\mathit{dist}_{s} = \frac{\sum_{i}^{n} (\mathit{max}_{s} - \mathit{rare}_{s}[i])}{n} \tag{4}$$

Assuming the original energy allocated to the seed was $p$, it is now assigned a new energy of $(1 + \alpha \times \mathit{dist}_{s}) \times p$, where $\alpha$ defaults to 0.3. In our insight, if a certain untouched edge appears frequently in the seed set, it implies a high probability that the selected seed contains this untouched edge and suggests that this particular untouched edge is likely to be difficult to explore. Therefore, if a seed includes many high-frequency untouched edges, it should be allocated less energy. Conversely, if the seed contains numerous low-frequency untouched edges, it should be allocated more energy.
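A small sketch of Equation (4) and the energy adjustment is given below. Several details are assumptions made for illustration, since the text does not fix them: n is taken to be the number of rare untouched edges on the current seed's path, the β% cutoff is rounded up, ties are broken by edge name, and a seed without any rare untouched edge keeps its original energy. The function and parameter names are hypothetical.

```python
def rare_untouched_edges(edge_seed_counts, beta_percent=40):
    """Pick the beta% of untouched edges that appear in the fewest seeds."""
    ranked = sorted(edge_seed_counts, key=lambda e: (edge_seed_counts[e], e))
    keep = (len(ranked) * beta_percent + 99) // 100   # ceiling (assumption)
    return ranked[:keep]

def boosted_energy(p, seed_untouched, edge_seed_counts, alpha=0.3, beta_percent=40):
    """Return (1 + alpha * dist_s) * p, with dist_s from Equation (4)."""
    rare = rare_untouched_edges(edge_seed_counts, beta_percent)
    mine = [e for e in rare if e in seed_untouched]   # rare edges of this seed
    if not mine:
        return p                                      # nothing to boost
    max_s = max(edge_seed_counts[e] for e in rare)
    dist = sum(max_s - edge_seed_counts[e] for e in mine) / len(mine)
    return (1 + alpha * dist) * p

# Untouched edge -> number of seeds whose path contains it.
counts = {"e1": 1, "e2": 2, "e3": 4, "e4": 4, "e5": 4}
rare = rare_untouched_edges(counts)            # the two rarest edges
energy = boosted_energy(100, {"e1"}, counts)   # dist_s = (2 - 1) / 1 = 1
```

Note that the multiplier never drops below 1, so a seed is boosted or left unchanged but never penalized.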

To elaborate on our insight, Figure 5 illustrates it through an example. Assume that fuzzing has run for a while and that the seed set contains four seeds, with two global untouched edges: BB1→BB10 and BB5→BB8. According to the seed scheduling algorithm outlined in Section 4.3, seed B is identified as a favored seed due to the presence of untouched edge BB1→BB10 in its execution path and its superior seed attributes. Similarly, seed D is designated as a favored seed because of the inclusion of the new untouched edge BB5→BB8 in its execution path.

Subsequently, the fuzzing process prioritizes the mutation of seed B and seed D. As previously mentioned, the untouched edge BB1→BB10 appears in the execution paths of all four seeds, indicating it is not a rare untouched edge. In contrast, BB5→BB8 is considered as a rare untouched edge since it appears only in the execution path of seed D. Consequently, we can make a reasonable conjecture that the untouched edge BB1→BB10 is likely associated with a hard-to-solve constraint, whereas BB5→BB8 is likely to represent a more manageable constraint. To increase the likelihood of covering edge BB5→BB8, we allocate more energy to the mutation of seed D based on the previously outlined energy allocation mechanism. It is essential to note that the provided example is simplified for explanatory purposes, while the real-world program will be complex.

However, in AFL, when new seeds are discovered, the fuzzer doubles the energy allocated to the seeds. Hence, we do not intentionally reduce the energy but instead provide seeds with a higher initial energy value, aiming to fully unleash their potential to discover new paths.

5. Implementation

We implemented a prototype of UntouchFuzz on top of AFL 2.57b, comprising two key components: instrumentation based on untouched edges and the main fuzzing loop. Next, we discuss a few important implementation details.

For the instrumentation, we employed the LLVM framework [35] to instrument the target program’s code and collect information about untouched edges. However, LLVM IR contains SwitchInst instructions, and a basic block that ends in a SwitchInst can have more than two successor basic blocks. To address this, we utilized a pass available in the AFL++ [6] instrumentation tools that splits switch statements. By applying this pass, all SwitchInst instructions in the target program’s LLVM IR are transformed into if…else… structures, converting basic blocks that originally had multiple successor basic blocks into blocks with only two successors. This enables us to effectively implement the instrumentation method described in Section 4.2.

Before entering the main fuzzing loop, UntouchFuzz launches a second fork server for the version of the program instrumented for untouched edges. Because this additional instrumentation might reduce the program’s execution speed, we run the untouched edge-instrumented version only when seeds are added to the queue. In all other scenarios, we run AFL’s natively instrumented version of the program. To keep the two instrumentations consistent, we also applied the aforementioned switch-statement splitting pass to AFL’s native instrumentation.

Regarding the fuzzing main loop, we introduced an update_untouch_score() function to maintain the best seed for each untouched edge. Additionally, we made modifications to the cull_queue() and fuzz_one() functions in AFL to implement seed scheduling and energy allocation. Other logic in the fuzzing main loop remained unchanged.
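To make the roles of these functions more concrete, the following Python sketch mimics how a best seed per untouched edge could be maintained and how favored seeds could then be selected greedily. It is a simplified model, not the C code of UntouchFuzz: the Seed fields, the integer edge IDs, and the use of execution time multiplied by input length as the seed score (borrowed from AFL’s favored-seed heuristic) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Seed:
    name: str
    exec_us: int          # execution time in microseconds
    length: int           # input size in bytes
    untouched: frozenset  # IDs of untouched edges associated with this seed


def update_untouch_score(seed, top_rated):
    """Make `seed` the best seed of each of its untouched edges if it is cheaper."""
    score = seed.exec_us * seed.length
    changed = False
    for edge in seed.untouched:
        best = top_rated.get(edge)
        if best is not None and best.exec_us * best.length <= score:
            continue  # the current best seed is at least as cheap
        top_rated[edge] = seed
        changed = True
    return changed


def cull_queue(top_rated, global_untouched):
    """Greedily pick favored seeds until every untouched edge is represented."""
    remaining = set(global_untouched)
    favored = []
    for edge in sorted(global_untouched):
        if edge in remaining and edge in top_rated:
            seed = top_rated[edge]
            remaining -= seed.untouched  # edges this seed already accounts for
            favored.append(seed.name)
    return favored
```

With four seeds as in Figure 5 (edge 1 standing for BB1→BB10 and edge 2 for BB5→BB8), a cheap seed B and a seed D that alone is associated with edge 2 would both end up in the favored list.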

6. Evaluation

In this section, we evaluate the effectiveness of UntouchFuzz and answer the following questions:

RQ1: How effective is UntouchFuzz at improving coverage faster when compared with other seed scheduling strategies?
RQ2: Can UntouchFuzz discover more unique crashes with respect to other seed scheduling strategies?
RQ3: How does the seed scheduling based on untouched edges perform in other fuzzers?
RQ4: Can UntouchFuzz detect new vulnerabilities in real-world programs?

6.1. Experiment Settings

(1) Baseline Seed Scheduling Strategies: We compared the proposed seed scheduling method with four existing strategies, which are based on the minimum coverage set, rare paths, new paths, and same-prefix coverage, respectively: native AFL [9], AFLFast [12], EcoFuzz [14], and Alphuzz [16]. It is important to note that these five tools differ only in their seed scheduling mechanisms, while all other components remain the same. FairFuzz [13] was not compared due to its custom mutator, which aims to obtain mutation byte masks. When fuzzing large seeds, the custom mutator might lead to starvation of subsequent seeds. Conversely, if seeds are small, the additional coverage gained from deterministic mutation might be unfair compared to fuzzers without deterministic mutation. Previous work [36] removed this custom mutation phase in experiments, but this deviates from FairFuzz’s original intent. Therefore, we did not select FairFuzz for comparison.

Additionally, the multi-property queue seed scheduling method SLIME [15] was not compared because it requires additional instrumentation of the target program. This instrumentation affects program execution speed and differs significantly from the instrumentation used by the aforementioned tools. The tool we implemented can share the same instrumented program with these tools, ensuring that the comparison is not affected by instrumentation differences.

(2) Benchmark Programs: We selected 12 real-world binary programs for testing based on their popularity, testing frequency, and diversity of categories. As shown in Table 1, these 12 binary programs include popular binary utilities (such as readelf), image parsing and processing libraries (such as libjpeg-turbo, exiv2), audio parsing tools (such as mp3gain), document processing libraries (such as xpdf, libxml2), and network packet parsing tools (such as tcpdump). Since certain vulnerabilities (e.g., buffer overflows) do not necessarily crash the program, we apply AddressSanitizer (ASAN) [37] to capture memory errors.

(3) Initial Corpus: The initial seed corpus comes from the datasets provided by MOpt [38] and UNIFUZZ [39], with some contributions from the open-source community.

(4) Experimental Environment: The experiments were conducted on a server with a 56-core Intel Xeon CPU and 128 GB of memory, running Ubuntu 22.04. Deterministic mutations were disabled as they are less effective than AFL’s havoc mutation strategy [40]. To reduce the impact of randomness, we ran each benchmark program for 24 h, repeating the process 10 times and taking the arithmetic mean as the final result [41].

(5) Experimental Metrics: We evaluated the proposed method against four fuzzers with different seed scheduling strategies, considering edge coverage, edge coverage over time, and the number of unique crashes. Additionally, the proposed guided mechanism for untouched edges was transplanted into MOpt to assess its performance across different fuzzers. MOpt was chosen due to its focus on optimizing mutation operators while keeping the seed scheduling mechanism unaltered.

6.2. RQ1: Code Coverage Improving

Code coverage is a crucial metric for evaluating the performance of fuzzing techniques [17]. In general, the more code a fuzzer can cover in the target program, the higher the probability of discovering hidden vulnerabilities. As explained in Section 2.1.2, AFL [9] employs a 64KB bitmap to collect coverage information, with each byte in the bitmap representing the number of hits for a particular edge ID. AFL maps program branches to the bitmap by using a hash function. If a branch is explored, the byte at the index corresponding to its edge ID in the bitmap is updated. AFL maintains a simplified bitmap in real time and stores it on local disk, which allows us to assess code coverage based on this simplified bitmap.
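The bitmap update described above can be modeled with a short Python sketch. It follows the edge ID scheme documented for AFL (the current block ID XORed with the previous block ID shifted right by one bit); the block IDs used in the usage note and the helper names trace_edges and covered_edges are illustrative assumptions, and the real implementation is injected into the target as compiled instrumentation rather than written in Python.

```python
MAP_SIZE = 1 << 16  # 64 KB bitmap, one byte per edge ID


def trace_edges(block_ids, bitmap=None):
    """Replay a sequence of basic-block IDs and update an AFL-style hit-count bitmap."""
    bitmap = bytearray(MAP_SIZE) if bitmap is None else bitmap
    prev = 0
    for cur in block_ids:
        idx = (cur ^ prev) % MAP_SIZE              # edge ID for prev -> cur
        bitmap[idx] = (bitmap[idx] + 1) & 0xFF     # 8-bit hit counter wraps around
        prev = cur >> 1                            # shift keeps A->B distinct from B->A
    return bitmap


def covered_edges(bitmap):
    """Number of bitmap entries that were hit at least once."""
    return sum(1 for b in bitmap if b)
```

For instance, replaying the block sequences [0x1234, 0x2345] and [0x2345, 0x1234] yields different bitmaps, which shows why the shift of the previous block ID preserves edge direction.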

Table 2 presents a comparison of UntouchFuzz with four other state-of-the-art fuzzers in terms of edge coverage. The data in the table represent the arithmetic mean edge coverage across ten fuzzing runs. The results in Table 2 demonstrate that UntouchFuzz outperforms EcoFuzz and Alphuzz in edge coverage and slightly surpasses AFL and AFLFast. UntouchFuzz matches or exceeds the coverage of the four baseline seed scheduling strategies on 11 of the 12 programs tested (all except mujs, where AFLFast achieves the highest coverage; on mp42aac, UntouchFuzz ties with AFL). Overall, the proposed method is effective in improving coverage. While the improvements over AFL and AFLFast (2.16% and 3.31% in total edge coverage) are relatively modest, these results are consistent with findings from previous research [40], which suggest that differences among fuzzers are minimized when only the havoc strategy is used. Nonetheless, our seed scheduling strategy still influences the direction of fuzzing evolution by concentrating mutation energy on more effective seeds, thus enhancing overall program coverage.
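As a quick check of the percentages quoted above, the relative improvement of UntouchFuzz over each baseline can be recomputed from the totals in Table 2. The snippet below is a small worked calculation; the helper name relative_gain is introduced here only for illustration, and the totals are copied from the Total row of Table 2.

```python
# Total edge coverage per fuzzer, copied from the "Total" row of Table 2.
baseline_totals = {
    "AFL": 100_614,
    "AFLFast": 99_501,
    "EcoFuzz": 91_197,
    "Alphuzz": 97_619,
}
untouchfuzz_total = 102_791


def relative_gain(ours, baseline):
    """Improvement of `ours` over `baseline`, as a percentage of the baseline."""
    return (ours - baseline) / baseline * 100.0


for name, total in baseline_totals.items():
    print(f"{name}: {relative_gain(untouchfuzz_total, total):.2f}%")
```

The computed values agree with the magnitudes of the percentages listed in the last row of Table 2.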

Figure 6 shows the evolution of edge coverage over time, with samples taken every hour. Evidently, on the seven target programs, exiv2, pdftotext, tcpdump, tiffcp, readelf, nm-new, and bsdtar, UntouchFuzz has a significantly higher edge coverage growth rate than the other four baseline seed scheduling strategies. However, for the remaining five target programs, the fuzzers converge quickly due to their smaller scale and lower complexity, negating the advantage demonstrated by UntouchFuzz.

6.3. RQ2: Unique Crashes

To further validate the effectiveness of the proposed method in vulnerability discovery, we conducted a statistical analysis of unique crashes. Table 3 presents a comparison of UntouchFuzz with four baseline fuzzers on the number of unique crashes, where the data in the table represent the total number of unique crashes discovered over ten rounds of testing. Note that djpeg is not listed in Table 3: we used the latest version of the program in our experiments and did not find any valid crashes.

As shown in the experimental results in Table 3, UntouchFuzz outperforms the other four baseline seed scheduling strategies in total unique crash discoveries, discovering the highest number of unique crashes on six of the programs under test. However, on the remaining five target programs, UntouchFuzz does not achieve the best results, but the difference between it and the best-performing approach is not significant. We attribute this to the randomness of mutation and differences in fuzzing evolution.

It is essential to note that the experimental results in Table 3 are directly taken from the unique crash metric of AFL-based fuzzers. However, the number of unique crashes may not accurately reflect the actual number of unique vulnerabilities because there is often a many-to-one relationship between them, which means that multiple crashes may correspond to a single vulnerability. For example, suppose crash one’s triggering path is A→B→D, crash two’s triggering path is A→C→D, and both crashes occur at the same location in basic block D. From AFL’s perspective, since the triggering paths of these two crashes are different, they are both considered unique crashes. However, from a root cause analysis perspective, both crashes are due to the code in basic block D, and thus, these two crashes should be categorized as the same vulnerability. Moreover, the differences between these two crash paths are likely related to changes in the input, where bytes in the input can affect which basic blocks are executed after basic block A. To obtain a more accurate count of unique vulnerabilities, we deduplicated the unique crashes based on the function call stack information provided by ASAN, using the top three functions of each stack as the key and removing duplicate crashes. Table 4 presents the results after crash deduplication.
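The deduplication step can be approximated by the following Python sketch, which groups ASAN reports by the top three function names of the first stack trace. It is a minimal sketch and not the script used in the experiments: the helper names top_frames and deduplicate are hypothetical, and the regular expression assumes symbolized frames of the usual form "#0 0x4f5e2a in function file:line" (function names containing spaces, such as some C++ signatures, would be truncated at the first space).

```python
import re

# Matches a symbolized ASAN frame, e.g. "    #0 0x4f5e2a in parse_block /src/d.c:10:5".
FRAME_RE = re.compile(r"^\s*#(\d+)\s+0x[0-9a-fA-F]+\s+in\s+(\S+)")


def top_frames(asan_report, depth=3):
    """Return the first `depth` function names of the first stack trace in a report."""
    frames = []
    for line in asan_report.splitlines():
        match = FRAME_RE.match(line)
        if not match:
            if frames:
                break  # the first stack trace has ended
            continue
        frames.append(match.group(2))
        if len(frames) == depth:
            break
    return tuple(frames)


def deduplicate(reports, depth=3):
    """Map each distinct top-of-stack key to the first crash that produced it."""
    unique = {}
    for crash_name, report in reports.items():
        unique.setdefault(top_frames(report, depth), crash_name)
    return unique
```

Two crashes that reach the same faulting function through the same three innermost callers are thus counted once, even if AFL stored them as separate unique crashes.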

The data in Table 4 reveal that various seed scheduling strategies exhibit distinct performances across different programs, with UntouchFuzz achieving the highest number of unique vulnerabilities in six programs. Overall, UntouchFuzz outperforms the other four baseline seed scheduling strategies based on the total number of discovered vulnerabilities. Furthermore, compared to EcoFuzz and Alphuzz, UntouchFuzz demonstrates a more stable performance across the programs.

6.4. RQ3: Scalability

To assess the scalability of the proposed approach, we integrated our method into the MOpt fuzzer, naming the modified tool “MOpt-u.” Table 5 provides a comparison of edge coverage between UntouchFuzz, AFL, MOpt-u, and the original MOpt fuzzer. The results in Table 5 demonstrate that UntouchFuzz and MOpt-u match or outperform their original, unmodified base fuzzers in terms of edge coverage across all 12 benchmark programs (UntouchFuzz ties with AFL on mp42aac). This further substantiates the capability of the untouched edge-guided mechanism to enhance the performance of the original fuzzers.

6.5. RQ4: New Vulnerabilities

We used UntouchFuzz to find new vulnerabilities in open-source projects on GitHub. We reported these vulnerabilities to the respective projects. The details are in Table 6, confirming the effectiveness of UntouchFuzz in real-world scenarios.

7. Discussion

In this section, we discuss several limitations of our current implementation:

(1): Our method is implemented on top of AFL and not on AFL++. We chose not to implement it on AFL++ because AFL++ already integrates the seed scheduling mechanism from AFLFast and other advanced technologies, so porting the untouched edge guidance mechanism into AFL++ would have been complex. We plan to implement our approach in AFL++ in future work. Additionally, it is important to note that our method may not apply to base fuzzers outside of the AFL family, such as libFuzzer [42] or honggfuzz [1].
(2): Coverage bitmap collisions are a common issue in AFL-based fuzzers. AFL-based fuzzers typically use a fixed-size 64 KB bitmap to collect coverage information, which can be adjusted via configuration. The fixed bitmap size might result in different edges being assigned the same edge ID, causing coverage bitmap collisions. Prior work has attempted to mitigate this issue by modifying the instrumentation [28,43,44]. However, as target programs grow in size, expanding the bitmap to resolve collisions might not lead to significant performance improvements.
(3): We conducted fuzzing tests on 12 mainstream benchmark programs, running each program 10 times for 24 h per test. The experimental results demonstrate that, under the given initial corpus conditions, our tool effectively selects better seeds for fuzzing and guides the fuzzer toward maximizing program coverage. However, differences between the initial corpus and the target program can impact the performance of coverage-guided greybox fuzzers [10,15,45], causing variations in the evolutionary process and resulting in different outcomes.
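To give a rough sense of the collision issue raised in limitation (2), the expected number of colliding edges can be estimated under the simplifying assumption that edge IDs are spread uniformly at random over the bitmap. This assumption is only an approximation of AFL’s hashing, and the function name expected_collisions is introduced here for illustration.

```python
def expected_collisions(num_edges, map_size=1 << 16):
    """Expected number of edges that share a bitmap slot with an earlier edge.

    Assumes every edge ID is drawn uniformly at random from `map_size` slots,
    so the expected number of occupied slots is m * (1 - (1 - 1/m) ** n).
    """
    occupied = map_size * (1.0 - (1.0 - 1.0 / map_size) ** num_edges)
    return num_edges - occupied


# Compare the default 64 KB bitmap with a four times larger one.
for edges in (5_000, 20_000, 50_000):
    small = expected_collisions(edges)
    large = expected_collisions(edges, map_size=1 << 18)
    print(f"{edges} edges: {small:.0f} collisions at 64 KB, {large:.0f} at 256 KB")
```

Under this model, enlarging the bitmap lowers the expected number of collisions, at the cost of a larger map that has to be scanned after every execution.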

8. Conclusions

In this paper, we observed that existing seed scheduling methods neglect the unexplored regions within the program’s control flow graph. In response to this issue, we presented UntouchFuzz, a greybox fuzzer guided by untouched edges. We developed a lightweight instrumentation technique to track untouched edges. Furthermore, we designed a seed scheduling strategy based on untouched edges, inspired by the minimal coverage set algorithm. The strategy prioritizes a set of seeds that together include all untouched edges. Additionally, we made minor adjustments to the energy scheduler to align it with the new seed scheduling method. In the evaluation, UntouchFuzz outperformed the other fuzzers in code coverage and in the number of vulnerabilities found, supporting the effectiveness of the untouched edge guidance mechanism proposed in this paper. To foster future research in this area, we have made our fuzzer open source. Future research could combine symbolic execution techniques with the untouched edge guidance mechanism for better fuzzing results. We plan to implement our mechanism in AFL++ and to investigate the influence of bitmap collisions on our mechanism.

Author Contributions

Conceptualization, P.J.; data curation, C.X.; formal analysis, G.Y.; funding acquisition, P.J.; investigation, H.K.; methodology, C.X.; software, C.X.; supervision, P.Y. and C.H.; validation, X.H.; visualization, C.X.; writing—original draft, C.X.; writing—review and editing, C.X. and P.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Key R&D projects of China, grant number 2021YFB3101803. This research was supported in part by the National Natural Science Foundation of China under Grant U2133208 and the Sichuan Youth Science and Technology Innovation Team under Grant 2022JDTD0014.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The source code of the fuzzer proposed in this research is available at https://github.com/bladchan/untouchFuzz.git (accessed on 22 October 2023).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Swiecki, R.; Gröbert, F. Honggfuzz. 2016. Available online: http://code.google.com/p/honggfuzz (accessed on 2 October 2023).
2. Schumilo, S.; Aschermann, C.; Gawlik, R.; Schinzel, S.; Holz, T. kAFL: Hardware-Assisted feedback fuzzing for OS kernels. In Proceedings of the 26th USENIX Security Symposium (USENIX Security 17), Vancouver, BC, Canada, 16–18 August 2017; pp. 167–182.
3. Yun, I.; Lee, S.; Xu, M.; Jang, Y.; Kim, T. QSYM: A practical concolic execution engine tailored for hybrid fuzzing. In Proceedings of the 27th USENIX Security Symposium (USENIX Security 18), Baltimore, MD, USA, 15–17 August 2018; pp. 745–761.
4. Pham, V.T.; Böhme, M.; Santosa, A.E.; Căciulescu, A.R.; Roychoudhury, A. Smart greybox fuzzing. IEEE Trans. Softw. Eng. 2019, 47, 1980–1997.
5. Zheng, Y.; Davanian, A.; Yin, H.; Song, C.; Zhu, H.; Sun, L. FIRM-AFL: High-Throughput greybox fuzzing of IoT firmware via augmented process emulation. In Proceedings of the 28th USENIX Security Symposium (USENIX Security 19), Santa Clara, CA, USA, 14–16 August 2019; pp. 1099–1114.
6. Fioraldi, A.; Maier, D.; Eißfeldt, H.; Heuse, M. AFL++: Combining incremental steps of fuzzing research. In Proceedings of the 14th USENIX Workshop on Offensive Technologies (WOOT 20), Boston, MA, USA, 11 August 2020.
7. Pham, V.T.; Böhme, M.; Roychoudhury, A. AFLNet: A greybox fuzzer for network protocols. In Proceedings of the 2020 IEEE 13th International Conference on Software Testing, Validation and Verification (ICST), Porto, Portugal, 24–28 October 2020; pp. 460–465.
8. Zhu, X.; Wen, S.; Camtepe, S.; Xiang, Y. Fuzzing: A survey for roadmap. ACM Comput. Surv. (CSUR) 2022, 54, 1–36.
9. Zalewski, M. American Fuzzy Lop (AFL) Fuzzer. 2017. Available online: http://lcamtuf.coredump.cx/afl/technical_details.txt (accessed on 2 October 2023).
10. Herrera, A.; Gunadi, H.; Magrath, S.; Norrish, M.; Payer, M.; Hosking, A.L. Seed selection for successful fuzzing. In Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis, Virtual, 11–17 July 2021; pp. 230–243.
11. Zhang, K.; Xiao, X.; Zhu, X.; Sun, R.; Xue, M.; Wen, S. Path transitions tell more: Optimizing fuzzing schedules via runtime program states. In Proceedings of the 44th International Conference on Software Engineering, Pittsburgh, PA, USA, 21–29 May 2022; pp. 1658–1668.
12. Böhme, M.; Pham, V.T.; Roychoudhury, A. Coverage-based greybox fuzzing as markov chain. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 24–28 October 2016; pp. 1032–1043.
13. Lemieux, C.; Sen, K. Fairfuzz: A targeted mutation strategy for increasing greybox fuzz testing coverage. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, Montpellier, France, 3–7 September 2018; pp. 475–485.
14. Yue, T.; Wang, P.; Tang, Y.; Wang, E.; Yu, B.; Lu, K.; Zhou, X. EcoFuzz: Adaptive Energy-Saving greybox fuzzing as a variant of the adversarial Multi-Armed bandit. In Proceedings of the 29th USENIX Security Symposium (USENIX Security 20), Virtual, 12–14 August 2020; pp. 2307–2324.
15. Lyu, C.; Liang, H.; Ji, S.; Zhang, X.; Zhao, B.; Han, M.; Li, Y.; Wang, Z.; Wang, W.; Beyah, R. SLIME: Program-sensitive energy allocation for fuzzing. In Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, Virtual, 18–22 July 2022; pp. 365–377.
16. Zhao, Y.; Wang, X.; Zhao, L.; Cheng, Y.; Yin, H. Alphuzz: Monte carlo search on seed-mutation tree for coverage-guided fuzzing. In Proceedings of the 38th Annual Computer Security Applications Conference, Austin, TX, USA, 5–9 December 2022; pp. 534–547.
17. Wang, J.; Duan, Y.; Song, W.; Yin, H.; Song, C. Be sensitive and collaborative: Analyzing impact of coverage metrics in greybox fuzzing. In Proceedings of the 22nd International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2019), Beijing, China, 23–25 September 2019; pp. 1–15.
18. Wang, M.; Liang, J.; Zhou, C.; Jiang, Y.; Wang, R.; Sun, C.; Sun, J. RIFF: Reduced Instruction Footprint for Coverage-Guided Fuzzing. In Proceedings of the 2021 USENIX Annual Technical Conference (USENIX ATC 21), Virtual, 14–16 July 2021; pp. 147–159.
19. Aschermann, C.; Schumilo, S.; Blazytko, T.; Gawlik, R.; Holz, T. REDQUEEN: Fuzzing with Input-to-State Correspondence. In Proceedings of the NDSS, San Diego, CA, USA, 24–27 February 2019; Volume 19, pp. 1–15.
20. Gan, S.; Zhang, C.; Chen, P.; Zhao, B.; Qin, X.; Wu, D.; Chen, Z. GREYONE: Data flow sensitive fuzzing. In Proceedings of the 29th USENIX Security Symposium (USENIX Security 20), Virtual, 12–14 August 2020; pp. 2577–2594.
21. Liang, J.; Wang, M.; Zhou, C.; Wu, Z.; Jiang, Y.; Liu, J.; Liu, Z.; Sun, J. Pata: Fuzzing with path aware taint analysis. In Proceedings of the 2022 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 23–26 May 2022; pp. 1–17.
22. Stephens, N.; Grosen, J.; Salls, C.; Dutcher, A.; Wang, R.; Corbetta, J.; Shoshitaishvili, Y.; Kruegel, C.; Vigna, G. Driller: Augmenting fuzzing through selective symbolic execution. In Proceedings of the NDSS, San Diego, CA, USA, 21–24 February 2016; Volume 16, pp. 1–16.
23. Peng, H.; Shoshitaishvili, Y.; Payer, M. T-Fuzz: Fuzzing by program transformation. In Proceedings of the 2018 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 21–23 May 2018; pp. 697–710.
24. Chen, P.; Chen, H. Angora: Efficient fuzzing by principled search. In Proceedings of the 2018 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 21–23 May 2018; pp. 711–725.
25. Coppik, N.; Schwahn, O.; Suri, N. Memfuzz: Using memory accesses to guide fuzzing. In Proceedings of the 2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST), Xi’an, China, 22–27 April 2019; pp. 48–58.
26. Wen, C.; Wang, H.; Li, Y.; Qin, S.; Liu, Y.; Xu, Z.; Chen, H.; Xie, X.; Pu, G.; Liu, T. Memlock: Memory usage guided fuzzing. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, Seoul, Republic of Korea, 27 June–19 July 2020; pp. 765–777.
27. Zhang, G.; Wang, P.F.; Yue, T.; Kong, X.D.; Zhou, X.; Lu, K. ovAFLow: Detecting Memory Corruption Bugs with Fuzzing-Based Taint Inference. J. Comput. Sci. Technol. 2022, 37, 405–422.
28. Gan, S.; Zhang, C.; Qin, X.; Tu, X.; Li, K.; Pei, Z.; Chen, Z. Collafl: Path sensitive fuzzing. In Proceedings of the 2018 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 21–23 May 2018; pp. 679–696.
29. Böhme, M.; Pham, V.T.; Nguyen, M.D.; Roychoudhury, A. Directed greybox fuzzing. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, TX, USA, 30 October–3 November 2017; pp. 2329–2344.
30. Chen, H.; Xue, Y.; Li, Y.; Chen, B.; Xie, X.; Wu, X.; Liu, Y. Hawkeye: Towards a desired directed grey-box fuzzer. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, Toronto, ON, Canada, 15–19 October 2018; pp. 2095–2108.
31. Huang, H.; Guo, Y.; Shi, Q.; Yao, P.; Wu, R.; Zhang, C. Beacon: Directed grey-box fuzzing with provable path pruning. In Proceedings of the 2022 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 23–26 May 2022; pp. 36–50.
32. Luo, C.; Meng, W.; Li, P. Selectfuzz: Efficient directed fuzzing with selective path exploration. In Proceedings of the 2023 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 22–25 May 2023; pp. 2693–2707.
33. Norris, J.R. Markov Chains; Cambridge University Press: Cambridge, UK, 1998.
34. Rebert, A.; Cha, S.K.; Avgerinos, T.; Foote, J.; Warren, D.; Grieco, G.; Brumley, D. Optimizing seed selection for fuzzing. In Proceedings of the 23rd USENIX Security Symposium (USENIX Security 14), San Diego, CA, USA, 20–22 August 2014; pp. 861–875.
35. Lattner, C.; Adve, V. LLVM: A compilation framework for lifelong program analysis & transformation. In Proceedings of the International Symposium on Code Generation and Optimization, CGO 2004, Palo Alto, CA, USA, 20–24 March 2004; pp. 75–86.
36. She, D.; Shah, A.; Jana, S. Effective seed scheduling for fuzzing with graph centrality analysis. In Proceedings of the 2022 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 23–26 May 2022; pp. 2194–2211.
37. Serebryany, K.; Bruening, D.; Potapenko, A.; Vyukov, D. AddressSanitizer: A fast address sanity checker. In Proceedings of the 2012 USENIX Annual Technical Conference (USENIX ATC 12), Boston, MA, USA, 13–15 June 2012; pp. 309–318.
38. Lyu, C.; Ji, S.; Zhang, C.; Li, Y.; Lee, W.H.; Song, Y.; Beyah, R. MOPT: Optimized mutation scheduling for fuzzers. In Proceedings of the 28th USENIX Security Symposium (USENIX Security 19), Santa Clara, CA, USA, 14–16 August 2019; pp. 1949–1966.
39. Li, Y.; Ji, S.; Chen, Y.; Liang, S.; Lee, W.H.; Chen, Y.; Lyu, C.; Wu, C.; Beyah, R.; Cheng, P.; et al. UNIFUZZ: A Holistic and Pragmatic Metrics-Driven Platform for Evaluating Fuzzers. In Proceedings of the 30th USENIX Security Symposium (USENIX Security 21), Vancouver, BC, Canada, 11–13 August 2021; pp. 2777–2794.
40. Wu, M.; Jiang, L.; Xiang, J.; Huang, Y.; Cui, H.; Zhang, L.; Zhang, Y. One fuzzing strategy to rule them all. In Proceedings of the 44th International Conference on Software Engineering, Pittsburgh, PA, USA, 21–29 May 2022; pp. 1634–1645.
41. Klees, G.; Ruef, A.; Cooper, B.; Wei, S.; Hicks, M. Evaluating fuzz testing. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, Toronto, ON, Canada, 15–19 October 2018; pp. 2123–2138.
42. Serebryany, K. Continuous fuzzing with libfuzzer and addresssanitizer. In Proceedings of the 2016 IEEE Cybersecurity Development (SecDev), Boston, MA, USA, 3–4 November 2016; p. 157.
43. Ahmed, A.; Hiser, J.D.; Nguyen-Tuong, A.; Davidson, J.W.; Skadron, K. BigMap: Future-proofing Fuzzers with Efficient Large Maps. In Proceedings of the 2021 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Taipei, Taiwan, 21–24 June 2021; pp. 531–542.
44. Hsu, C.C.; Wu, C.Y.; Hsiao, H.C.; Huang, S.K. Instrim: Lightweight instrumentation for coverage-guided fuzzing. In Proceedings of the Symposium on Network and Distributed System Security (NDSS), Workshop on Binary Analysis Research, San Diego, CA, USA, 18–21 February 2018; p. 40.
45. Lee, M.; Cha, S.; Oh, H. Learning Seed-Adaptive Mutation Strategies for Greybox Fuzzing. In Proceedings of the 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), Melbourne, Australia, 15–16 May 2023; pp. 384–396.

Figure 1. Details of AFL edge coverage collection.

Figure 2. The CFGs of the motivating example.

Figure 3. Overview of UntouchFuzz.

Figure 4. Example of an AFL-instrumented program’s CFG.

Figure 5. Example of our insight.

Figure 6. Edge coverage over time in five fuzzers.

Table 1. Twelve real-world programs for evaluation.

Program	Library and Version	Input Type	Commands
djpeg	libjpeg-turbo-2.1.91	jpg	@@
exiv2	exiv2-0.26	jpg	@@ /dev/null
pdftotext	xpdf-4.0.0	pdf	@@
tcpdump	tcpdump-4.8.1	bin	-e -vv -nr @@
mp3gain	mp3gain-1.5.2	mp3	@@
mp42aac	Bento4-1.5.1-628	mp4	@@ /dev/null
tiffcp	libtiff-3.9.7	tiff	-i -E l -H 10 -V 10 -S 8:4 -R 270 @@ ./output.tif
readelf	binutils-2.28	elf	-a @@
nm-new	binutils-2.28	elf	-A -a -l -S -s --special-syms --synthetic --with-symbol-versions -D @@
xmllint	libxml-2.98	xml	@@
bsdtar	libarchive-3.2.0	tar	-xf @@ /dev/null
mujs	SQLite-3.8.9	text(js)	@@

Table 2. Arithmetic mean edge coverage comparison.

	Default	RarePath	NewPath	SamePrefix	UntouchedEdge
Fuzzer	AFL	AFLFast	EcoFuzz	Alphuzz	UntouchFuzz
djpeg	3998	3850	3646	3613	4101
exiv2	12,152	12,104	11,310	11,958	12,304
pdftotext	15,160	15,120	11,408	14,510	15,280
tcpdump	18,208	17,252	16,134	17,640	18,556
mp3gain	1375	1376	1364	1370	1382
mp42aac	3263	3230	3178	3197	3263
tiffcp	5841	5801	5405	5692	6659
readelf	10,336	10,503	9100	10,029	10,647
nm-new	5455	5463	5530	5185	5601
xmllint	10,406	10,332	10,121	10,172	10,470
bsdtar	5250	5227	5109	5169	5318
mujs	9170	9243	8892	9084	9210
Total	100,614	99,501	91,197	97,619	102,791
	(−2.16%)	(−3.31%)	(−12.71%)	(−5.30%)

Table 3. Comparison on unique crashes.

	Default	RarePath	NewPath	SamePrefix	UntouchedEdge
Fuzzer	AFL	AFLFast	EcoFuzz	Alphuzz	UntouchFuzz
exiv2	503	476	301	529	546
pdftotext	3725	3424	122	2310	3937
tcpdump	1917	2008	1738	2090	2088
mp3gain	1230	1219	986	1128	1226
mp42aac	502	480	262	243	467
tiffcp	3495	3553	2317	3235	3642
readelf	19	2	1	5	83
nm-new	1550	1450	1435	641	1662
xmllint	4331	4377	3313	4063	4251
bsdtar	341	559	978	76	603
mujs	471	524	71	423	469
Total	18,084	18,072	11,524	14,743	18,974
	(−4.92%)	(−4.99%)	(−64.65%)	(−28.70%)

Table 4. Results after crash deduplication.

	Default	RarePath	NewPath	SamePrefix	UntouchedEdge
Fuzzer	AFL	AFLFast	EcoFuzz	Alphuzz	UntouchFuzz
exiv2	102	96	59	99	105
pdftotext	159	166	25	100	176
tcpdump	515	488	435	571	519
mp3gain	65	66	68	61	67
mp42aac	22	26	28	24	24
tiffcp	381	406	318	336	418
readelf	3	2	1	2	6
nm-new	165	156	160	92	158
xmllint	17	24	15	19	19
bsdtar	37	37	31	12	39
mujs	20	19	19	10	25
Total	1486	1486	1159	1326	1556
	(−4.71%)	(−4.71%)	(−34.25%)	(−17.35%)

Table 5. Comparison between AFL, MOpt and UntouchFuzz, MOpt-u.

Fuzzer	AFL	UntouchFuzz	MOpt	MOpt-u
djpeg	3998	4101	4073	4397
exiv2	12,152	12,304	12,404	12,480
pdftotext	15,160	15,280	15,322	15,341
tcpdump	18,208	18,556	18,467	18,734
mp3gain	1375	1382	1378	1380
mp42aac	3263	3263	3352	3409
tiffcp	5841	6659	5949	6132
readelf	10,336	10,647	11,128	11,252
nm-new	5455	5601	5573	5674
xmllint	10,406	10,470	10,245	10,408
bsdtar	5250	5318	5270	5273
mujs	9170	9210	9121	9157
Total	100,614	102,791	102,282	103,637
	(−2.16%)		(−1.31%)

Table 6. New vulnerabilities found by UntouchFuzz.

Project	Version	CVE/Issue	Type	Fixed Status
LIEF	v0.12.1	CVE-2022-40922	segmentation fault	✓
LIEF	v0.12.1	CVE-2022-40923	segmentation fault	✓
LIEF	v0.12.1	CVE-2022-43171	heap buffer overflow	✓
LIEF	v0.12.1	CVE-2022-43172	segmentation fault	✓
LIEF	v0.12.1	github-issue-785	allocator oom	✓
PcapPlusPlus	v22.11	CVE-2023-31991	heap buffer overflow	✓
libfyaml	v0.7.12	CVE-2023-31992	use after free	✓
libfyaml	v0.7.12	CVE-2023-31993	stack buffer overflow	✓
libfyaml	v0.7.12	github-issue-56	stack buffer overflow	✓
sxmlc	v4.5.2	github-issue-24	segmentation fault	✓
sxmlc	v4.5.2	github-issue-25	segmentation fault	✓
configor	v0.9.18	github-issue-97	infinite loop	✓
toml11	v3.7.1	github-issue-199	heap buffer overflow	✓

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
