Not All Seeds Are Important: Fuzzing Guided by Untouched Edges

Abstract: Coverage-guided greybox fuzzing (CGF) has become the mainstream technique for vulnerability mining and has proven effective. Seed scheduling, the process of selecting seeds from the seed pool for subsequent fuzzing iterations, is a critical component of CGF. Although many seed scheduling strategies have been proposed in academia, they all focus on the explored regions of a program. As a result, traditional strategies often allocate resources to ineffective seeds. In response, we introduce a novel seed scheduling strategy guided by untouched edges. The strategy generates the optimal seed set according to information on the untouched edges. We also present a new instrumentation method that captures unexplored areas and guides the fuzzing process toward them. We implemented the prototype UntouchFuzz on top of American Fuzzy Lop (AFL) and evaluated it against the most advanced seed scheduling strategies. Our results demonstrate that UntouchFuzz improves on them in both code coverage and the number of unique vulnerabilities found. Furthermore, we transplanted the proposed method into the fuzzer MOpt, which further demonstrates the scalability of the method. In particular, 13 vulnerabilities were found in open-source projects, 7 of which have been assigned CVEs.


Introduction
Fuzzing is a prevalent and effective automated software testing method that has already found numerous vulnerabilities in real-world applications [1][2][3][4][5][6][7]. Fuzzers efficiently explore the input space of the program under test, operating at nearly raw execution speeds, with the aim of identifying specific inputs that can provoke program crashes or anomalous behaviors. However, the input space of most real-world programs is so large that it is difficult to fully explore. Moreover, vulnerabilities are sparse in an application, with only certain specific inputs capable of triggering them [8].
American Fuzzy Lop (AFL) [9], one of the most popular and widely used coverage-guided greybox fuzzers in both academia and industry, is an efficient fuzzing tool for file applications and has already discovered many high-risk vulnerabilities across various projects. AFL employs a mutation-based fuzzing approach, mutating the binary data of the seed file to find test cases that improve coverage or trigger crashes. Research [10] indicates that the performance of mutation-based fuzzers depends on seed scheduling, which essentially determines which seed to mutate next.
The main challenge in seed scheduling is to determine which seeds in the corpus are more likely to explore new code space in the program when mutated. From the perspective of code coverage, the main role of seed scheduling is to prioritize the seeds that are most promising for triggering new code coverage after being mutated. AFL, for example, utilizes a greedy algorithm to maintain an optimal seed set that covers all explored edges. However, seeds related to validation check edges are also added to the seed set by mistake, and fuzzing these seeds wastes considerable computational resources [11].
Seed scheduling has been studied in depth. Classic seed scheduling strategies mainly guide seed scheduling and prioritization through the distribution of edge coverage or path coverage throughout the fuzzing process. For example, AFLFast [12] gives priority to seeds likely to trigger low-frequency paths. FairFuzz [13] prioritizes seeds that trigger rare edges, while EcoFuzz [14] prioritizes seeds based on their self-transition probability. However, the distribution of coverage information mentioned above is related to the control flow graph (CFG) of the program. Consequently, the performance of such fuzzers can vary across programs with different CFGs.
To decouple seed scheduling strategies from target programs, SLIME [15] classifies seeds by constructing multiple property queues, employs the upper confidence bound variance-aware (UCB-V) algorithm to select the optimal seed queue, and then fuzzes the seeds in that queue. Furthermore, Alphuzz [16] considers seed interdependencies and uses a Monte Carlo tree search approach for seed scheduling.
However, the existing methods mentioned above neglect the unexplored regions within the control flow graph (CFG) of the program. For instance, consider a seed s whose execution path does not contain any untouched edges, but a coverage-guided fuzzer still marks s as a favored seed. On one hand, this seed is likely to cover branches related to validation checks, and those checks are often hard to solve. On the other hand, when this seed initially joins the seed queue, it contains untouched edges. However, as fuzzing proceeds, other seeds may explore these untouched edges, resulting in a seed execution path without untouched edges. Our insight is that if a seed does not contain any untouched edges, there is little sense in prioritizing mutations on it.
In response to the aforementioned challenges, we propose a seed scheduling strategy based on untouched edges extracted from the underlying CFG. Unlike traditional seed scheduling strategies that pay more attention to explored regions, our approach gives precedence to seeds that incorporate untouched edges, which are then fuzzed as a priority. We instrument the target program to collect data on edge coverage and untouched edge information. Subsequently, we revise the untouched edge coverage information for all seeds within the queue. Ultimately, we select from the seed pool an optimal minimal subset of seeds that covers all untouched edges. Furthermore, our scheduling strategy allocates more energy to seeds with more low-frequency untouched edges in the queue. Our insight is that low-frequency untouched edges imply a high probability of being explored, while high-frequency untouched edges are likely to be hard-to-solve edges. Therefore, we enhance the effectiveness of seed scheduling by allocating extra energy to seeds associated with more low-frequency untouched edges, further encouraging in-depth exploration of these areas.
The main contributions of this paper are summarized as follows: (1) We propose a new seed scheduling strategy that efficiently selects seeds based on unexplored regions within program execution paths. This approach prevents wasting resources on ineffective seeds, a common issue in traditional scheduling methods. (2) We applied the new seed scheduling strategy to a new greybox fuzzing tool named UntouchFuzz. To our knowledge, UntouchFuzz is the first fuzzer to utilize unexplored area information as the basis for seed scheduling. The source code is available at https://github.com/bladchan/untouchFuzz.git (accessed on 22 October 2023). (3) We evaluated UntouchFuzz on 12 programs, demonstrating its effectiveness when compared to four AFL-based seed schedulers.
The rest of the paper is organized as follows. Section 2 discusses the background. Section 3 illustrates our motivation with an example. Section 4 shows our design of UntouchFuzz and its technical details. Implementation details are listed in Section 5. Section 6 shows the evaluation results. Section 7 discusses several limitations of our implementation. Section 8 concludes this paper.

Techniques
In this section, we provide the background on coverage-guided greybox fuzzing and focus on the coverage acquisition method and seed scheduling in the classical fuzzer AFL.

Coverage-Guided Greybox Fuzzing
Coverage-guided greybox fuzzing continuously generates test cases by employing coverage feedback loops and preserves seeds that yield new coverage. Specifically, coverage-guided greybox fuzzing includes four main stages [17]: (1) Seed Scheduling: Effective seeds are chosen from a pool of seeds based on a scheduling strategy, where effectiveness refers to the fact that new code can be explored more easily by mutating that seed.

Lightweight Instrumentation
The key idea of coverage-guided greybox fuzzing lies in coverage data acquisition, as the coverage feedback mechanism propels the entire fuzzing process forward. AFL, as one of the state-of-the-art coverage-guided fuzzing tools, employs a lightweight instrumentation technique to capture transitions between program basic blocks [18]. AFL assigns a random ID to each basic block in the CFG. The transition between two basic blocks, i.e., the edge, is defined as in Equation (1):
$$edge_i = (prev_{bb} \gg 1) \oplus cur_{bb} \quad (1)$$
In particular, $edge_i$ represents the ID of the edge, $prev_{bb}$ refers to the ID of the previous basic block, and $cur_{bb}$ refers to the ID of the basic block where the current transition occurs. Note that $prev_{bb}$ is shifted one bit to the right in order to distinguish different transition orders between two basic blocks.
AFL maintains a default 64 KB bitmap. Each byte in the bitmap logs the transition count of the edge whose ID equals the byte's index, as illustrated in Figure 1. In this way, AFL can effectively track and count the edges covered by different inputs in the program.
In the initial step, AFL's favored-seed selection algorithm (Algorithm 1) maintains an array, temp_v, to keep a record of the edges currently covered by favored seeds. Then, in line 3, the algorithm iterates through the highest-scoring seed of each edge. Notably, the highest-scoring seeds are dynamically maintained by AFL during the fuzzing process based on criteria such as seed size and execution speed.
Subsequently, in lines 4-7, the algorithm determines whether the edge has already been covered by other seeds within the favored seed set. If the edge remains uncovered, the algorithm updates the edge coverage information in temp_v. Finally, in line 8, the seed is added to the favored seed set.
Once this favored seed set is obtained, AFL prioritizes these seeds and allocates more energy for mutation.It is important to acknowledge that AFL assumes that if a seed triggers new edges, fuzzing that seed will likely trigger more edges.However, this assumption has certain limitations when dealing with unexplored regions, as we will discuss in detail in Section 3.
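The greedy selection described above can be sketched as follows. This is a minimal illustration rather than AFL's actual code: the `Seed` class, the `select_favored` function, and the use of a Python set in place of the temp_v bit array are assumptions made for readability, and the score is taken to be execution time multiplied by input length (lower is better).

```python
from dataclasses import dataclass


@dataclass
class Seed:
    name: str
    edges: set        # edge indices hit by this seed
    exec_us: int      # execution time in microseconds
    length: int       # input size in bytes
    favored: bool = False


def select_favored(seeds):
    """Greedy cover: pick the best-scoring seed of each still-uncovered edge."""
    # top_rated[edge] is the cheapest seed (exec_us * length) that covers the edge.
    top_rated = {}
    for s in seeds:
        for e in s.edges:
            best = top_rated.get(e)
            if best is None or s.exec_us * s.length < best.exec_us * best.length:
                top_rated[e] = s

    uncovered = set(top_rated)   # plays the role of temp_v
    favored = []
    for e in sorted(top_rated):
        if e in uncovered:
            s = top_rated[e]
            uncovered -= s.edges  # every edge of the chosen seed is now covered
            s.favored = True
            favored.append(s)
    return favored


seeds = [
    Seed("A", {1, 2, 3}, exec_us=10, length=10),
    Seed("B", {3, 4}, exec_us=5, length=10),
    Seed("C", {4}, exec_us=100, length=100),
]
print([s.name for s in select_favored(seeds)])  # ['A', 'B']
```

Seed C is never favored here because edge 4 is already represented by the cheaper seed B; note that the selection looks only at covered edges, not at what lies beyond them.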

Related Work
In this section, we discuss the closely related works.
Coverage-guided greybox fuzzing. Coverage-guided greybox fuzzing is one of the most effective techniques for finding vulnerabilities and bugs, garnering significant attention from both academia and industry. Coverage-based greybox fuzzers typically use coverage information to guide the exploration of different program paths.
Since a coverage guidance engine is a key component for the greybox fuzzers, much effort has been devoted to improving their coverage.For example, REDQUEEN [19], GREYONE [20] and PATA [21] employ lightweight taint analysis to penetrate some paths protected by magic bytes comparisons.Driller [22], T-Fuzz [23] and QSYM [3] incorporate symbolic execution engines to delve into deeper program codes.Angora [24] adopts a gradient descent technique to resolve path constraints to break some hard comparisons.MemFuzz [25], MemLock [26], and ovAFLow [27] augment evolutionary fuzzing by additionally leveraging information about memory accesses or memory consumption performed by the target program.CollAFL [28] proposes a coverage-sensitive fuzzing approach to mitigate path collisions.Furthermore, AFLGo [29], Hawkeye [30], Beacon [31], and SelectFuzz [32] utilize alternative metrics for directing fuzzing toward user-specified target sites in the program.
Seed scheduling. In this paper, we focus on improving the seed scheduling component in a fuzzer. Given the seed set, seed scheduling must address two key issues: (1) which seed to select for the next round and (2) the time budget for the selected seed. In practice, instead of a time budget, most fuzzers optimize the number of mutations performed on the selected seeds, i.e., energy scheduling.
AFLFast [12] models path transitions as a Markov chain [33], efficiently guiding fuzzing to explore undiscovered path transitions. FairFuzz [13] leverages a targeted mutation strategy to prioritize the exploration of rare branches. EcoFuzz [14] adopts the Variant of the Adversarial Multi-armed Bandit (VAMAB) model to prioritize and allocate more energy to the seeds with lower self-transition probabilities. The insight behind it is that a low self-transition probability indicates a high probability of discovering new paths after mutation. Alphuzz [16] models the seed scheduling problem as a Monte Carlo tree search (MCTS) problem. Its key observation is that the relationships among seeds are valuable for seed scheduling. SLIME [15] prioritizes seeds based on the reward estimated by a customized upper confidence bound variance-aware (UCB-V) algorithm over different property seed queues, adaptively allocating energy to seeds with different properties.

Motivating Example
We use a program's control flow graph in Figure 2 to illustrate our motivation. The initial seed A is considered to be a quality seed, i.e., the seed is capable of executing the main logic of the program. Figure 2a shows the execution path of seed A, with red circles denoting the basic blocks covered by the seed. When the coverage-guided greybox fuzzer mutates seed A, it can easily produce seed B and seed C. Both of these seeds cover edges associated with validation checks at the control flow graph level, i.e., BB1→BB10 and BB2→BB10.
The appearance of these two new edges is interpreted by the fuzzer as an increase in edge coverage, leading to the inclusion of both seeds in the seed queue. Furthermore, according to the previous description in Section 2.1.3, AFL marks both seed B and seed C as favored seeds during seed scheduling since they cover new edges. However, when the fuzzer mutates these seeds, they rarely produce descendant seeds that explore deeper code areas.
We further assume that after mutating seed A, the fuzzer generates not only seed B and seed C but also seed D. Seed D's execution path is shown in Figure 2d. The difference between seed A and seed D lies in the fact that seed D explores basic block BB6, an unexplored area of seed A's execution path. In this case, both seeds are effective for the exploration of basic block BB5. However, if seed E, which covers basic block BB5, is produced later, seed A and seed D become ineffective as they make limited contributions to exploring basic blocks BB8 and BB9. Nevertheless, AFL still regards seed A and seed D as favored because they both cover new edges.
Figure 2. The CFGs of the motivating example.
As discussed in Section 2.1.3, AFL's seed scheduling algorithm selects favored seeds based on covered edges. However, it lacks tracking information for unexplored areas at the CFG level. AFL's perspective hinges on the assumption that if a seed covers new edges, it is more likely to explore unexplored regions. This perspective relies on a crucial precondition: the new edges must be adjacent to unexplored areas; otherwise, these edges are likely related to validation checks or lead only to regions that other seeds have already explored. Particularly in the context of expansive and complex programs, AFL expends substantial fuzzing resources on ineffective seeds, impeding the prioritization of genuinely effective seeds and slowing down the convergence of the fuzzing process. Therefore, this paper addresses the issue in AFL's seed scheduling algorithm by introducing a new mechanism to track unexplored regions.

Overview
Figure 3 shows the overview of UntouchFuzz. In comparison to traditional coverage-guided greybox fuzzers, UntouchFuzz introduces an additional bitmap for tracking untouched edges. The untouched edge instrumentation mechanism updates this bitmap during program execution. Seed scheduling is then performed based on the information from this bitmap, prioritizing the mutation of the favored seeds generated by the scheduler.
To further explain, UntouchFuzz starts with an initial corpus as a seed set. By using the seed scheduling mechanism, the fuzzer selects a seed from the seed set for mutation. The number of mutations is determined by the energy scheduling mechanism, based on seed attributes. The fuzzer then executes the AFL-instrumented program with mutated test cases. If a test case crashes the program, it is preserved on the local disk. If it covers new edges, the test case is preserved as a seed and added to the seed queue. Meanwhile, the coverage-increasing seed is provided to the program instrumented for untouched edge tracking, collecting information on untouched edges to guide the next seed scheduling process. Furthermore, UntouchFuzz allocates more energy to seeds with more low-frequency untouched edges in the seed queue. This allocation aims to encourage these seeds to make more attempts at breaking through these low-frequency untouched edges. In the following sections, we discuss the methods for collecting information on untouched edges in Section 4.2, introduce the seed scheduling algorithm based on untouched edges in Section 4.3, and outline slight improvements to energy scheduling in Section 4.4.

Untouched Edges Tracking
As mentioned in Section 2.1.2, AFL employs a lightweight instrumentation technique to track program edge coverage. In essence, AFL's instrumentation assigns a random ID to each basic block within the target program's CFG. During program execution, the instrumentation code calculates the corresponding edge index based on Equation (1) and uses this index to update the coverage bitmap.
To ensure minimal impact on program execution speed, AFL utilizes a lightweight XOR operation for computing coverage indices. Similarly, the instrumentation code responsible for gathering untouched edge information should also be lightweight to minimize disruption to program execution.
We introduce our instrumentation approach with a practical example. Figure 4 provides a snippet of branches within the program's CFG. The hexadecimal values in green boxes represent random IDs allocated by the AFL instrumentation for each basic block. According to Equation (1), the edge index for the transition from basic block BB1 to BB2 is calculated as (0xabcd ≫ 1) ⊕ 0x1234 = 0x47d2, while the edge index for the transition from basic block BB1 to BB3 is calculated as (0xabcd ≫ 1) ⊕ 0x5678 = 0x039e.
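The arithmetic above can be checked with a few lines of Python. This is only a sketch of the index computation, shift the previous block ID right by one bit and XOR it with the current block ID; the function name `edge_index` and the `MAP_SIZE` constant (the 64 KB default bitmap size) are illustrative choices and not AFL identifiers taken from this paper.

```python
MAP_SIZE = 1 << 16  # AFL's default 64 KB bitmap


def edge_index(prev_bb: int, cur_bb: int) -> int:
    """Edge ID for the transition prev_bb -> cur_bb."""
    return ((prev_bb >> 1) ^ cur_bb) % MAP_SIZE


# Basic block IDs from the Figure 4 example.
BB1, BB2, BB3 = 0xABCD, 0x1234, 0x5678

print(hex(edge_index(BB1, BB2)))  # 0x47d2
print(hex(edge_index(BB1, BB3)))  # 0x39e
# The shift makes the index direction-sensitive: BB2 -> BB1 differs from BB1 -> BB2.
print(hex(edge_index(BB2, BB1)))  # 0xa2d7
```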
Suppose a particular seed triggers a transition from basic block BB1 to BB2, leaving the transition from BB1 to BB3 unexplored. In this scenario, we label edge 0x039e as an untouched edge within the execution path of the seed. A straightforward method for capturing untouched edges is to insert instrumentation code within basic block BB2. Such code updates a byte of the untouched edge bitmap by using 0x039e as a static index pre-allocated during compilation. This approach works efficiently when a basic block has only one predecessor, but confusion arises when a basic block has multiple incoming edges.
Consider the seed's execution path: BB0→BB1→BB2→BB1. Basic block BB1 has two incoming edges: 0xe693 from BB0 and 0xa2d7 from BB2. The corresponding unexplored edges are 0xb6a5 and 0xb2a1. With the aforementioned instrumentation code, these two untouched edges cannot be distinguished, because a single static index cannot represent both. This example shows that static pre-allocation of untouched edge IDs is impractical when a basic block has multiple predecessors. To address this challenge, we propose a dynamic method for obtaining untouched edge IDs based on the properties of XOR operations. Specifically, we employ a global variable named "__afl_bb_ids" to maintain the XOR value of the IDs of the two successor basic blocks at a branch transition point. When a transition between basic blocks occurs, we perform an XOR operation on the ID of the transitioned-to basic block with "__afl_bb_ids" to retrieve the ID of the other, unexplored basic block. Furthermore, we define the equation for calculating untouched edge IDs as follows:
$$untouch\_edge_i = (prev_{bb} \gg 1) \oplus (\_\_afl\_bb\_ids \oplus cur_{bb}) \quad (2)$$
where $\_\_afl\_bb\_ids = cur_{bb} \oplus untouch_{bb}$.
Illustrated with the control flow graph in Figure 4, consider the branch transition point within basic block BB1, which leads to two transitions to basic blocks BB2 and BB3. At BB1, we update the value of "__afl_bb_ids" by performing an XOR operation on the IDs of BB2 and BB3, resulting in 0x1234 ⊕ 0x5678 = 0x444c. When a transition occurs from basic block BB1 to BB2, we can recover the ID of the unexplored basic block BB3 by using "__afl_bb_ids": __afl_bb_ids ⊕ ID_BB2 = 0x444c ⊕ 0x1234 = 0x5678. Subsequently, we calculate the untouched edge ID between BB1 and BB3 using Equation (1): (0xabcd ≫ 1) ⊕ 0x5678 = 0x039e. The calculated value is then used to update the bitmap information for the untouched edges.
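The XOR recovery can be illustrated with a short sketch. It models only the arithmetic, not the injected LLVM code: the function names `branch_point` and `untouched_edge` are hypothetical, and `bb_ids` stands in for the global variable described above.

```python
def branch_point(succ_a: int, succ_b: int) -> int:
    """Value stored at a two-way branch: XOR of both successor IDs."""
    return succ_a ^ succ_b


def untouched_edge(prev_bb: int, cur_bb: int, bb_ids: int) -> int:
    """ID of the edge from prev_bb to the successor that was NOT taken."""
    untouch_bb = bb_ids ^ cur_bb          # XOR cancels cur_bb, leaving the sibling
    return (prev_bb >> 1) ^ untouch_bb    # same edge formula, applied to the sibling


BB1, BB2, BB3 = 0xABCD, 0x1234, 0x5678

bb_ids = branch_point(BB2, BB3)
print(hex(bb_ids))                             # 0x444c
print(hex(untouched_edge(BB1, BB2, bb_ids)))   # 0x39e  (BB1 -> BB3 left untouched)
print(hex(untouched_edge(BB1, BB3, bb_ids)))   # 0x47d2 (BB1 -> BB2 left untouched)
```

Because the stored value is symmetric in the two successors, the same instrumentation at the start of either successor yields the sibling's edge, regardless of which predecessor the execution arrived from.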
Algorithm 2 delineates the process of instrumenting untouched edges. Initially, in lines 2-6, the algorithm employs AFL's native edge coverage-based instrumentation, concurrently capturing the random IDs allocated to each basic block. Subsequently, in lines 8-27, the algorithm instruments the program for untouched edges.
To elucidate further, lines 10-12 of the algorithm determine whether "untouch_inst1" is invoked. "untouch_inst1" inserts instrumentation code at the beginning of each basic block. The primary function of this code is to fetch the value of the global variable "__afl_bb_ids" within the program. It subsequently calculates the ID of the untouched edge per Equation (2) and uses this ID to update the untouched edge bitmap "__afl_untouch_ptr". Moving to lines 15-24, the algorithm traverses the two successor basic blocks of the current basic block, performing an XOR operation on the IDs allocated to these two basic blocks. Finally, in line 25, the function "untouch_inst2" is invoked. The primary aim of this code is to update the value of the global variable "__afl_bb_ids" according to the control flow graph, setting it to the outcome of the XOR operation mentioned earlier. At this point, all steps of the untouched edge instrumentation algorithm have been completed.

Algorithm 2 Instrumentation for untouched edges
At the assembly level, a basic block that ends in a conditional branch has exactly two successor basic blocks. However, at the level of a compiler's intermediate representation, a branching basic block is not guaranteed to have exactly two successors, especially in programs that contain switch statements. Specific solutions to this issue are provided in Section 5. Additionally, it is worth noting that the proposed instrumentation approach still suffers from edge index collisions, where different edges in the CFG are assigned the same index value [28].

Seed Scheduling Based on Untouched Edges
In Section 4.2, we introduced the method for obtaining untouched edge information. In this section, we delve into seed scheduling based on the collected untouched edges. Similar to AFL, we maintain an array, 'untouch_top_rated', with a size of MAP_SIZE to store the most favored seed for each untouched edge. We also employ a greedy algorithm, specifically the minimum covering set algorithm [34], to generate an optimal seed set that contains all currently untouched edges. Algorithm 3 describes the seed scheduling algorithm based on untouched edges. Initially, in line 1, the algorithm feeds the newly added seed, denoted as 's', to the program 'P′' instrumented for untouched edges. Subsequently, it acquires the edge coverage bitmap and the untouched edge bitmap of that seed. Following this, in lines 2-27, the algorithm iterates through each index value in the bitmap. Specifically, in lines 3-10, the algorithm first checks whether the edge has been marked both as a touched edge and as an untouched edge in the two bitmaps. If this condition is true, it suggests that the edge is likely in a loop structure. Because edges in loops are executed repeatedly, an edge left untouched in a previous iteration of the loop may be covered in the current iteration. To mitigate this effect, the algorithm sets the value of the untouched edge bitmap corresponding to this edge's index to 0.
The algorithm also maintains a global array called 'virgin_untouch'. The indices of this array correspond to edge IDs, and the array values indicate the status of the respective untouched edges: 0 denotes an untouched edge that has not been covered by the execution path of any seed, 1 denotes that a seed's execution path includes this untouched edge, and 255 denotes an untouched edge that has been covered by the execution path of another seed, i.e., it has been "explored". If the current edge is covered by seed 's', and the corresponding value in the 'virgin_untouch' array is not 0, the algorithm updates the value to 255.
Moving on to lines 11-13, if the edge is an untouched edge for the current seed 's' and has not been "explored" by other seeds in history, the algorithm updates the value in the 'virgin_untouch' array to 1. In lines 14-17, if the corresponding 'virgin_untouch' value for this edge is 255, indicating that the edge is no longer untouched, the algorithm sets the 'untouch_top_rated' value for this edge to NULL. The 'untouch_top_rated' array maintains the best seed for each untouched edge.
In lines 18-26 of the algorithm, the metric of the current seed 's' is compared with that of the best seed for this untouched edge. We continue to use AFL's default metric, which is the seed's execution time multiplied by its file size. Then, at line 28, the algorithm invokes the coverMinSet() function, which generates a minimal seed set containing all untouched edges found, following the logic described in Algorithm 1.
In the end, the algorithm outputs the optimal seed set based on untouched edges.UntouchFuzz prioritizes testing seeds from this selected seed set.
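The bookkeeping described in this section can be sketched as below. This is a simplified model under stated assumptions, not the authors' implementation: the class and field names are hypothetical, bitmaps are represented as Python sets of edge indices, `cost` stands for execution time multiplied by file size, and the `covered` set is an addition of this sketch that records every edge touched so far so that an edge explored earlier is never re-registered as untouched.

```python
from dataclasses import dataclass, field

MAP_SIZE = 1 << 16
UNSEEN, PENDING, EXPLORED = 0, 1, 255   # the three 'virgin_untouch' states


@dataclass
class Seed:
    name: str
    cost: int                                   # exec time * file size; lower is better
    untouched: set = field(default_factory=set)


class UntouchScheduler:
    def __init__(self):
        self.virgin_untouch = bytearray(MAP_SIZE)
        self.covered = set()            # assumption: global record of touched edges
        self.untouch_top_rated = {}     # untouched edge -> best seed

    def add_seed(self, seed, touched, untouched):
        touched = set(touched)
        # An edge both touched and untouched in one run (e.g., in a loop) counts as touched.
        seed.untouched = set(untouched) - touched
        self.covered |= touched
        for e in touched:
            if self.virgin_untouch[e] != UNSEEN:
                self.virgin_untouch[e] = EXPLORED
                self.untouch_top_rated.pop(e, None)
        for e in seed.untouched:
            if e in self.covered:       # already explored by some seed
                self.virgin_untouch[e] = EXPLORED
                continue
            self.virgin_untouch[e] = PENDING
            best = self.untouch_top_rated.get(e)
            if best is None or seed.cost < best.cost:
                self.untouch_top_rated[e] = seed

    def favored(self):
        """Greedy cover over the edges that are still untouched."""
        remaining = set(self.untouch_top_rated)
        chosen = []
        for e in sorted(self.untouch_top_rated):
            if e in remaining:
                seed = self.untouch_top_rated[e]
                remaining -= seed.untouched
                chosen.append(seed.name)
        return chosen


sched = UntouchScheduler()
sched.add_seed(Seed("A", cost=100), touched={1, 2}, untouched={10, 20})
print(sched.favored())   # ['A']
sched.add_seed(Seed("E", cost=50), touched={1, 10}, untouched={30})
sched.add_seed(Seed("F", cost=10), touched={20, 5}, untouched={5, 30})
print(sched.favored())   # ['F']
```

In the usage example, seed A stops being favored once seeds E and F have explored both of its untouched edges, which is the behavior the motivating example asks for.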

Energy Scheduling Optimization
The goal of energy scheduling is to allocate energy efficiently to the chosen seeds for optimal mutations. It is essential to strike the right balance in energy allocation. Allocating excessive energy can lead to a significant waste of fuzzing resources on a single seed. Conversely, insufficient energy allocation may underutilize a seed's potential to explore new paths, as discussed in [14].
In this paper, we first count, for each untouched edge, the number of seeds in the seed set whose execution paths include it. Following this, we sort the untouched edges by this seed count in ascending order and select the top β% (40% is the default in this paper) as rare untouched edges. Then, we calculate the difference between the seed count of each of the current seed's rare untouched edges, $rare_s[i]$, and the maximum seed count among all rare untouched edges, $max_s$, and then calculate the average distance value $dist_s$ based on Equation (4).
Assuming the original energy allocated to the seed was p, it is now assigned a new energy of $(1 + \alpha \times dist_s) \times p$, where α defaults to 0.3. Our insight is that if a certain untouched edge appears frequently in the seed set, there is a high probability that the selected seed contains this untouched edge, which suggests that this particular untouched edge is likely to be difficult to explore. Therefore, if a seed includes many high-frequency untouched edges, it should be allocated less energy. Conversely, if the seed contains numerous low-frequency untouched edges, it should be allocated more energy.
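The energy adjustment can be sketched as follows. Equation (4) is not reproduced in this text, so the normalization used for the distance below, the mean of (max_s - rare_s[i]) / max_s over the seed's rare untouched edges, is an assumption of this sketch, as are the function names and the rounding up when taking the top β fraction. Only the final formula, (1 + α × dist_s) × p with α = 0.3 and β = 40%, comes from the description above.

```python
import math


def rare_edges(seed_count, beta=0.4):
    """Untouched edges with the fewest containing seeds (top beta fraction).

    seed_count maps an untouched edge to the number of seeds whose path contains it.
    """
    ordered = sorted(seed_count, key=seed_count.get)   # ascending seed count
    k = max(1, math.ceil(len(ordered) * beta))         # assumption: round up
    return ordered[:k]


def boosted_energy(p, seed_untouched, seed_count, alpha=0.3, beta=0.4):
    """New energy (1 + alpha * dist_s) * p for a seed with base energy p."""
    rare = rare_edges(seed_count, beta)
    mine = [e for e in rare if e in seed_untouched]
    if not mine:
        return p                                       # no rare edges: no boost
    max_s = max(seed_count[e] for e in rare)
    # Assumed normalization: rarer edges are farther from max_s, so they boost more.
    dist_s = sum((max_s - seed_count[e]) / max_s for e in mine) / len(mine)
    return (1 + alpha * dist_s) * p


counts = {"e1": 1, "e2": 2, "e3": 4, "e4": 4, "e5": 4}
print(rare_edges(counts))                          # ['e1', 'e2']
print(boosted_energy(100, {"e1", "e3"}, counts))   # about 115
print(boosted_energy(100, {"e3"}, counts))         # 100, no rare untouched edge
```

Under this assumed normalization the boost is bounded by a factor of 1 + α, and energy is never reduced below the base value p.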
To elaborate on our insight, Figure 5 illustrates it through an example. Assume that fuzzing has run for a while and the seed set contains four seeds, with two global untouched edges: BB1→BB10 and BB5→BB8. According to the seed scheduling algorithm outlined in Section 4.3, seed B is identified as a favored seed due to the presence of untouched edge BB1→BB10 in its execution path and its superior seed attributes. Similarly, seed D is designated as a favored seed because of the inclusion of the new untouched edge BB5→BB8 in its execution path.
Subsequently, the fuzzing process prioritizes the mutation of seed B and seed D. As previously mentioned, the untouched edge BB1→BB10 appears in the execution paths of all four seeds, indicating it is not a rare untouched edge. In contrast, BB5→BB8 is considered a rare untouched edge since it appears only in the execution path of seed D. Consequently, we can make a reasonable conjecture that the untouched edge BB1→BB10 is likely associated with a hard-to-solve constraint, whereas BB5→BB8 is likely to represent a more manageable constraint. To increase the likelihood of covering edge BB5→BB8, we allocate more energy to the mutation of seed D based on the previously outlined energy allocation mechanism. Note that this example is simplified for explanatory purposes; real-world programs are considerably more complex.
However, in AFL, when new seeds are discovered, the fuzzer doubles the energy allocated to them. Hence, we do not intentionally reduce the energy of any seed but instead provide seeds with a higher initial energy value, aiming to fully unleash their potential to discover new paths.

Implementation
We implemented a prototype of UntouchFuzz on top of AFL 2.57b, comprising two key components: instrumentation based on untouched edges and the main fuzzing loop. Next, we discuss a few important implementation details.
For the instrumentation, we employed the LLVM framework [35] to instrument the target program's code, collecting information about untouched edges. However, LLVM IR contains SwitchInst instructions, so basic blocks ending in a SwitchInst can have more than two successor basic blocks. To address this, we utilized a pass available in the AFL++ [6] instrumentation tools that splits switch statements. By applying this pass, all SwitchInst instructions in the target program's LLVM IR are transformed into if...else structures, converting basic blocks that originally had multiple successor basic blocks into those with only two successor basic blocks. This enables us to effectively implement the instrumentation method described in Section 4.2.
Before entering the main fuzzing loop, UntouchFuzz launches the program instrumented for untouched edges using a second fork server. Here, we considered that the untouched edge instrumentation might impact the program's execution speed. Therefore, we only run the untouched edge-instrumented version of the program when seeds are added to the queue. In other scenarios, we run AFL's native instrumented version of the program. However, it is important to note that to maintain instrumentation consistency, we also applied the aforementioned switch-statement splitting pass to AFL's native instrumentation.
Regarding the main fuzzing loop, we introduced an update_untouch_score() function to maintain the best seed for each untouched edge. Additionally, we modified the cull_queue() and fuzz_one() functions in AFL to implement seed scheduling and energy allocation. Other logic in the main fuzzing loop remained unchanged.

Evaluation
In this section, we evaluate the effectiveness of UntouchFuzz and answer the following questions:
• RQ1: How effective is UntouchFuzz at improving coverage faster when compared with other seed scheduling strategies?
• RQ2: Can UntouchFuzz discover more unique crashes than other seed scheduling strategies?
• RQ3: How does the seed scheduling based on untouched edges perform in other fuzzers?
• RQ4: Can UntouchFuzz detect new vulnerabilities in real-world programs?

Experiment Settings
(1) Baseline Seed Scheduling Strategies: We compared the seed scheduling method proposed in this paper with the strategies of native AFL [9], AFLFast [12], EcoFuzz [14], and Alphuzz [16], which are based on the minimum coverage set, rare paths, new paths, and same-prefix coverage, respectively. It is important to note that these five tools differ only in their seed scheduling mechanisms while all other components remain the same. FairFuzz [13] was not compared due to its custom mutator, which aims to obtain mutation byte masks. When fuzzing large seeds, the custom mutator might lead to starvation of subsequent seeds. Conversely, if seeds are small, the additional coverage gained from deterministic mutation might be unfair compared to fuzzers without deterministic mutation. Previous work [36] removed this custom mutation phase in experiments, but this deviated from FairFuzz's original intent. Therefore, we did not select FairFuzz for comparison.
Additionally, the multi-property queue seed scheduling method SLIME [15] was not compared because it requires additional instrumentation of the target program. This instrumentation affects program execution speed and differs significantly from the instrumentation used by the aforementioned tools. The tool we implemented can share the same instrumented program with these tools, ensuring that the comparison is not affected by instrumentation differences.
(2) Benchmark Programs: We selected 12 real-world binary programs for testing based on their popularity, testing frequency, and diversity of categories. As shown in Table 1, these 12 binary programs include popular binary utilities (such as readelf), image parsing and processing libraries (such as libjpeg-turbo, exiv2), audio parsing tools (such as mp3gain), document processing libraries (such as xpdf, libxml2), and network packet parsing tools (such as tcpdump). Since certain vulnerabilities (e.g., buffer overflows) do not necessarily crash the program, we apply AddressSanitizer (ASAN) [37] to capture memory errors.
Table 1.Twelve real-world programs for evaluation.
(4) Experimental Environment: The experiments were conducted on a server with a 56-core Intel Xeon CPU, 128 GB of memory, and running Ubuntu 22.04.Deterministic mutations were disabled as it is less effective compared to AFL's havoc mutation strategy [40].To reduce the impact of randomness, we ran each benchmark program for 24 h, repeating the process 10 times and taking the arithmetic mean as the final result [41].
(5) Experimental Metrics: We evaluated the proposed method against four fuzzers with different seed scheduling strategies, considering edge coverage, edge coverage over time, and the number of unique crashes.Additionally, the proposed guided mechanism for untouched edges was transplanted into MOpt to assess its performance across different fuzzers.MOpt was chosen due to its focus on optimizing mutation operators while keeping the seed scheduling mechanism unaltered.

RQ1: Code Coverage Improvement
Code coverage is a crucial metric for evaluating the performance of fuzzing techniques [17]. In general, the more code a fuzzer can cover in the target program, the higher the probability of discovering hidden vulnerabilities. As explained in Section 2.1.2, AFL [9] employs a 64 KB bitmap to collect coverage information, with each byte in the bitmap representing the number of hits for a particular edge ID. AFL maps program branches to the bitmap by using a hash function. If a branch is explored, the byte at the index corresponding to its edge ID in the bitmap is updated. AFL maintains a simplified bitmap in real time and stores it on the local disk, which allows us to assess code coverage based on this simplified bitmap.
Table 2 presents a comparison of UntouchFuzz with four other state-of-the-art fuzzers in terms of edge coverage. The data in the table represent the arithmetic mean of edge coverage across ten fuzzing runs. The results in Table 2 demonstrate that UntouchFuzz outperforms EcoFuzz and Alphuzz in edge coverage and slightly surpasses AFL and AFLFast. UntouchFuzz achieves better coverage than the other four baseline seed scheduling strategies in 11 of the 12 programs tested (all except mujs). Overall, the proposed method is effective in improving coverage. While the improvements over AFL and AFLFast (2.16% and 3.31%) are relatively modest, these results are consistent with findings from previous research [40], which suggest that differences among fuzzers are minimized when only the havoc strategy is used. Nonetheless, our seed scheduling strategy still influences the direction of fuzzing evolution by concentrating mutation energy on more effective seeds, thus enhancing overall program coverage.
Figure 6 shows the evolution of edge coverage over time, with samples taken every hour. On seven target programs (exiv2, pdftotext, tcpdump, tiffcp, readelf, nmnew, and bsdtar), UntouchFuzz has a significantly higher edge coverage growth rate than the other four baseline seed scheduling strategies. For the remaining five target programs, however, the fuzzers converge quickly because of the programs' smaller scale and lower complexity, which negates the advantage demonstrated by UntouchFuzz.
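The edge-ID mapping described above can be sketched in a few lines of Python. The sketch follows the scheme documented for AFL's native instrumentation (the edge index is the current block ID XORed with the previous block ID shifted right by one); the function names and the block IDs passed in are illustrative assumptions rather than code taken from AFL or UntouchFuzz.

```python
MAP_SIZE = 1 << 16  # 64 KB bitmap, one byte (hit counter) per edge ID


def trace_edges(block_ids, bitmap=None):
    """Replay a sequence of basic-block IDs into an AFL-style hit-count bitmap."""
    if bitmap is None:
        bitmap = bytearray(MAP_SIZE)
    prev = 0
    for cur in block_ids:
        idx = (cur ^ prev) % MAP_SIZE             # edge ID for prev -> cur
        bitmap[idx] = (bitmap[idx] + 1) & 0xFF    # 8-bit counter wraps around
        prev = cur >> 1                           # shift keeps A->B distinct from B->A
    return bitmap


def count_edges(bitmap):
    """Edge coverage: the number of bitmap entries hit at least once."""
    return sum(1 for hits in bitmap if hits)
```

Counting the nonzero entries of a stored bitmap in this way gives an edge-coverage figure of the kind reported in this section, subject to the hash collisions that a fixed-size map allows.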

RQ2: Unique Crashes
To further validate the effectiveness of the proposed method in vulnerability discovery, we conducted a statistical analysis of unique crashes. Table 3 presents a comparison of UntouchFuzz with four baseline fuzzers on the number of unique crashes, where the data in the table represent the total number of unique crashes discovered over ten rounds of testing. Note that we used the latest version of the djpeg program in our experiments and did not find any valid crashes in it, so its results are not listed in Table 3.
As shown in Table 3, UntouchFuzz outperforms the other four baseline seed scheduling strategies in total unique crash discoveries and finds the highest number of unique crashes on six of the programs under test. On the remaining five target programs, UntouchFuzz does not achieve the best results, but the difference between it and the best-performing approach is not significant. We attribute this to the randomness of mutation and differences in fuzzing evolution.
It is essential to note that the experimental results in Table 3 are taken directly from the unique crash metric of AFL-based fuzzers. However, the number of unique crashes may not accurately reflect the actual number of unique vulnerabilities, because there is often a many-to-one relationship between them: multiple crashes may correspond to a single vulnerability. For example, suppose the triggering path of crash one is A→B→D, the triggering path of crash two is A→C→D, and both crashes occur at the same location in basic block D. From AFL's perspective, since the triggering paths of these two crashes differ, both are considered unique crashes. From a root cause analysis perspective, however, both crashes are caused by code in basic block D, and thus they should be categorized as the same vulnerability. Moreover, the difference between the two crash paths is likely related to changes in the input, where bytes in the input affect which basic blocks execute after basic block A.
To obtain a more accurate count of unique vulnerabilities, we deduplicated the unique crashes based on the function call stack information provided by ASAN, selecting the top three functions and removing duplicate crashes. Table 4 presents the results after crash deduplication. The data in Table 4 reveal that the seed scheduling strategies perform differently across programs, with UntouchFuzz achieving the highest number of unique vulnerabilities in six programs. Overall, UntouchFuzz outperforms the other four baseline seed scheduling strategies in the total number of discovered vulnerabilities. Furthermore, compared to EcoFuzz and Alphuzz, UntouchFuzz demonstrates more stable performance across the programs.
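The deduplication step can be sketched as follows. This is an illustrative Python sketch, not the script used in the experiments: it assumes the usual ASAN frame layout ("#N 0xADDR in function location"), keys each crash on the top three function names of the first stack trace in the report, and uses hypothetical helper names. Function names containing spaces, such as some demangled C++ symbols, would be truncated by this simple pattern.

```python
import re

# Matches an ASAN stack frame such as:
#     #0 0x4f5a10 in parse_block /src/p.c:10:3
FRAME_RE = re.compile(r"^\s*#(\d+)\s+0x[0-9a-fA-F]+\s+in\s+(\S+)")


def top_frames(report, depth=3):
    """Return the first `depth` function names of the first stack trace."""
    frames = []
    for line in report.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append(match.group(2))
            if len(frames) == depth:
                break
        elif frames:
            break  # the first stack trace ended; ignore allocation/free traces
    return tuple(frames)


def deduplicate(reports, depth=3):
    """Map each distinct top-of-stack signature to the first crash showing it."""
    unique = {}
    for crash_id, report in reports.items():
        unique.setdefault(top_frames(report, depth), crash_id)
    return unique
```

Under this scheme, the two crashes in the A→B→D and A→C→D example would collapse into one entry as long as their top three frames match.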

RQ3: Scalability
To assess the scalability of the proposed approach, we integrated our method into the MOpt fuzzer and named the modified tool "MOpt-u". Table 5 compares the edge coverage of UntouchFuzz, AFL, MOpt-u, and the original MOpt fuzzer. The results in Table 5 demonstrate that UntouchFuzz and MOpt-u outperform the original, unmodified fuzzers in edge coverage across all 12 benchmark programs. This further substantiates the capability of the untouched-edge guidance mechanism to enhance the performance of the original fuzzers.

RQ4: New Vulnerabilities
We used UntouchFuzz to search for new vulnerabilities in open-source projects on GitHub and reported the vulnerabilities we found to the respective projects. Table 6 lists the details, which confirm the effectiveness of UntouchFuzz in real-world scenarios.

Discussion
In this section, we discuss several limitations of our current implementation:
(1) Our method is implemented on top of AFL and not on AFL++. We chose not to implement it on AFL++ because AFL++ already integrates the seed scheduling mechanism from AFLFast and other advanced technologies, so porting the untouched-edge guidance mechanism into AFL++ would have been complex. We plan to port our approach to AFL++ in future work. Additionally, it is important to note that our method may not apply to base fuzzers outside the AFL family, such as libFuzzer [42] or honggfuzz [1].
(2) Coverage bitmap collisions are a common issue in AFL-based fuzzers. AFL-based fuzzers typically use a fixed-size 64 KB bitmap to collect coverage information, and the size can be adjusted via configuration. A fixed bitmap size might result in different edges being assigned the same edge ID, causing coverage bitmap collisions. Prior work has attempted to mitigate this issue by modifying the instrumentation [28,43,44]. However, as target programs grow in size, expanding the bitmap to resolve collisions might not lead to significant performance improvements.
(3) We conducted fuzzing tests on 12 mainstream benchmark programs, running each program 10 times for 24 h per test. The experimental results demonstrate that, under the given initial corpus conditions, our tool effectively selects better seeds for fuzzing and guides the fuzzer toward maximizing program coverage. However, differences in the initial corpus and the target program can impact the performance of coverage-guided greybox fuzzers [10,15,45], causing variations in the evolutionary process and resulting in different outcomes.
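To give a rough sense of the scale of the bitmap collisions mentioned in limitation (2), the following Python sketch estimates the expected fraction of edges that share a bitmap slot with another edge. It assumes that edge IDs are spread uniformly at random over the map, which is an idealization of AFL's hashing rather than a measurement of any benchmark program; the function name is hypothetical.

```python
def expected_collision_rate(num_edges, map_size=1 << 16):
    """Expected fraction of edges hidden by sharing a slot with another edge.

    Assumes each edge ID is drawn uniformly at random from `map_size` slots.
    """
    if num_edges <= 0:
        return 0.0
    # Expected number of distinct slots hit after `num_edges` uniform draws.
    occupied = map_size * (1.0 - (1.0 - 1.0 / map_size) ** num_edges)
    return 1.0 - occupied / num_edges
```

Under this assumption, the rate grows with the number of edges for a fixed map size and shrinks as the map is enlarged, which is the trade-off behind the bitmap-size adjustments discussed above.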

Conclusions
In this paper, we observed that existing seed scheduling methods neglect the unexplored regions within the program's control flow graph. In response to this issue, we presented UntouchFuzz, a greybox fuzzer guided by untouched edges. We developed a lightweight instrumentation technique to track untouched edges. Furthermore, we designed a seed scheduling strategy based on untouched edges, inspired by the minimal coverage set algorithm. The strategy prioritizes a set of seeds that together cover all untouched edges. Additionally, we made minor adjustments to the energy scheduler to align it with the new seed scheduling method. In the evaluation, UntouchFuzz outperformed the other fuzzers in code coverage and the number of vulnerabilities found, which further supports the effectiveness of the untouched-edge guidance mechanism proposed in this paper. To foster future research in this area, we have made our fuzzer open source. Future research could combine symbolic execution techniques with the untouched-edge guidance mechanism for better fuzzing results. We plan to port our mechanism to AFL++ and to investigate the influence of bitmap collisions on it.

Figure 5. Example of our insight.

Figure 6. Edge coverage over time in five fuzzers.
(2) Energy Scheduling: Appropriate mutation counts (energy) are assigned based on the attributes of the chosen effective seeds. (3) Seed Mutation: Within the allocated energy, various mutation operations are performed on the selected seeds to generate new test cases. (4) Seed Selection/Saving: Each generated test case is executed on the target program, and seeds are evaluated based on the corresponding coverage information. If a test case enhances the coverage of the target program, it is chosen as a new seed. Through this feedback loop, the coverage of the target program under test continues to increase, and the fuzzer becomes more likely to generate test cases that trigger new bugs.

Table 2. Arithmetic mean edge coverage comparison.

Table 3. Comparison on unique crashes.

Table 6. New vulnerabilities found by UntouchFuzz.