MooFuzz: Many-Objective Optimization Seed Schedule for Fuzzer

Abstract: Coverage-based Greybox Fuzzing (CGF) is a practical and effective solution for finding bugs and vulnerabilities in software. A key challenge of CGF is how to select conducive seeds and allocate energy accurately. To address this problem, we propose a novel many-objective optimization solution, MooFuzz, which identifies different states of the seed pool and continuously gathers information about seeds to guide seed scheduling and energy allocation. First, MooFuzz marks risk at dangerous positions in the source code. Second, it automatically updates the collected information, including the path risk, the path frequency, and the mutation information. Next, MooFuzz classifies the seed pool into three states and adopts different objectives to select seeds. Finally, we design an energy recovery mechanism to monitor energy usage in the fuzzing process and reduce energy consumption. We implement our fuzzing framework and evaluate it on seven real-world programs. The experimental results show that MooFuzz outperforms other state-of-the-art fuzzers, including AFL, AFLFast, FairFuzz, and PerfFuzz, in terms of path discovery and bug detection.


Introduction
Fuzzing is a popular and effective software testing technology for detecting bugs and vulnerabilities. In the past few years, it has gained widespread usage in mainstream software companies (such as Google [1][2][3], Microsoft [4], and Adobe [5]) and has found thousands of vulnerabilities.
Coverage-based Greybox Fuzzing (CGF) [6,7] is one of the most popular methods of fuzzing. It is based on the guidance that increasing code coverage usually leads to better crash detection. By using lightweight instrumentation, CGF automatically generates a large number of inputs to feed target programs, and continuously collects coverage information as feedback to guide fuzzing.
Inspired by the impressive achievements of CGF, many researchers have conducted studies and developed their own fuzzers from different perspectives [8][9][10]. AFLFast [11] assigns more energy to the low-frequency paths based on the Markov chain model. AFLGo [12], a directed grey-box fuzzer, is implemented to generate inputs to reach given sets of target program locations. FairFuzz [13] identifies rare branches in the program and adjusts mutation strategies to increase coverage. MOPT [14] leverages a mutation schedule based on particle swarm optimization (PSO) to accelerate the convergence speed. EcoFuzz [15] improves the power schedule for discovering new paths using a variant of the adversarial multi-armed bandit model. PerfFuzz [16] generates pathological inputs to detect algorithm complexity vulnerabilities. MemLock [17] utilizes memory consumption information to guide seed selection to trigger the weakness of memory corruption.
However, most previous approaches mainly leverage a single selection criterion to select seeds. While these approaches are simple and easy to use in solving specific problems,
• We propose the path risk measurement method to assist seed schedule in Exploration State.
• We use many-objective optimization to model CGF, classify three different states of the seed pool, and put forth different selection criteria that enhance fuzzer performance.
• We propose an energy allocation and monitoring mechanism to improve the power schedule.
• We implement our framework as MooFuzz and evaluate its effectiveness on a series of popular real-world applications. MooFuzz substantially outperforms the other fuzzers.
The rest of this paper is organized as follows. Section 2 introduces the background and related work of many-objective optimization and CGF. Section 3 shows the design of MooFuzz. Section 4 presents the evaluation, and Section 5 concludes the paper.

Background and Related Work
In this section, we introduce the background of many-objective optimization and CGF and discuss related work.
Many-objective optimization problems optimize several objectives simultaneously to obtain the maximum benefit. Mathematically, a minimization problem is expressed as the minimization of objective functions, that is, making all objective values as small as possible. In this paper, we use the minimization model to carry out seed schedule. A minimization problem is defined as follows.
$$\min F(x) = (f_1(x), f_2(x), \ldots, f_m(x))^{T}, \quad \text{s.t. } x \in X$$
where $F(x)$ is the objective vector, $f_i(x)$ is the $i$-th objective to be minimized, $x = (x_1, \ldots, x_n)$ is a vector of $n$ decision variables, $X$ is an $n$-dimensional decision space, and $m$ denotes the number of objectives to be optimized.
Definition 1 (Pareto Dominance [57]). Given any two decision vectors $x, y \in X$ of a minimization problem with $m$ objectives, if $f_i(x) \le f_i(y)$ for all $i = 1, 2, \ldots, m$ and $f_i(x) < f_i(y)$ for at least one $i$, then $x$ dominates $y$, which is denoted as $x \prec y$.
Definition 2 (Pareto Optimal [57]). Assuming that $x^* \in X$, if there is no solution $x \in X$ satisfying $x \prec x^*$, then $x^*$ is a Pareto optimal solution.
Definition 3 (Pareto Optimal Set [57]). All the Pareto optimal solutions constitute the Pareto optimal set (PS).
Definition 4 (Pareto Front [57]). All the objective vectors of the solutions in the Pareto optimal set constitute the Pareto front (PF).
Figure 1 shows a solution distribution in a two-dimensional objective space, where all points represent solutions. For a minimization problem, point A is smaller than point C in both objectives, so there is a dominance relationship between them: point C is dominated by point A. For points A and B in Figure 1, point A is greater than point B on the $f_2$ axis but less than point B on the $f_1$ axis, so there is no dominance relationship between point A and point B.
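The dominance relation and the Pareto front can be illustrated with a short sketch. The coordinates assigned to A, B, and C below are hypothetical values chosen only to mirror the relationships described for Figure 1; they are not taken from the figure.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(x: Sequence[float], y: Sequence[float]) -> bool:
    """Return True if x Pareto-dominates y under minimization."""
    no_worse = all(a <= b for a, b in zip(x, y))
    strictly_better = any(a < b for a, b in zip(x, y))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (f1, f2) values: A beats C on both axes, while A and B
# each win on one axis only.
A, B, C = (1.0, 3.0), (2.0, 1.0), (3.0, 4.0)
front = pareto_front([A, B, C])  # A and B remain, C is dominated
```

With these values, A and B are mutually non-dominated and together form the front, while C is excluded because A dominates it.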

Coverage-Based Greybox Fuzzing
CGF is an evolutionary algorithm that includes two stages: the static analysis stage and the fuzzing loop stage. In the static analysis stage, it performs compile-time or dynamic binary instrumentation to obtain the instrumented target program. In the fuzzing loop stage, CGF takes a set of initial seeds provided by the user as inputs and maintains a seed queue stored in the seed pool. CGF first selects a saved seed from the seed queue and mutates it with mutation strategies to generate a new input. Next, the target program is executed with the new input, and lightweight instrumentation gathers coverage information. If the new input causes a crash, it is marked and added to the crash set. If the new input leads to new coverage, CGF judges the new input to be interesting and adds it to the seed pool. Algorithm 1 shows the workflow of CGF in the fuzzing loop stage.
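The fuzzing loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not AFL's implementation: the target is modeled as a hypothetical Python callable that returns the set of covered edge IDs and a crash flag, seed selection is uniformly random, and mutation is a single bit flip.

```python
import random
from typing import Callable, List, Set, Tuple

# Hypothetical target model: returns (covered edge IDs, crashed?).
Program = Callable[[bytes], Tuple[Set[int], bool]]


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Flip one random bit; a stand-in for real mutation strategies."""
    if not seed:
        return bytes([rng.randrange(256)])
    data = bytearray(seed)
    data[rng.randrange(len(data))] ^= 1 << rng.randrange(8)
    return bytes(data)


def fuzz(program: Program, seeds: List[bytes], iterations: int,
         rng: random.Random) -> Tuple[List[bytes], List[bytes]]:
    queue = list(seeds)          # seed queue Q
    crashes: List[bytes] = []    # crash set C
    seen: Set[int] = set()
    for s in queue:              # coverage of the initial seeds
        seen |= program(s)[0]
    for _ in range(iterations):
        seed = rng.choice(queue)             # placeholder for SeedsSelect(Q)
        candidate = mutate(seed, rng)
        edges, crashed = program(candidate)
        if crashed:
            crashes.append(candidate)        # mark crashing input
        elif not edges <= seen:              # new coverage: keep as a seed
            queue.append(candidate)
            seen |= edges
    return queue, crashes


def toy_program(data: bytes) -> Tuple[Set[int], bool]:
    """Toy target: crashes when byte 0 has its top bit set and byte 1 is 0xFF."""
    edges = {0}
    if data and data[0] & 0x80:
        edges.add(1)
        if len(data) > 1 and data[1] == 0xFF:
            return edges | {2}, True
    return edges, False
```

A call such as `fuzz(toy_program, [b"\x00\x00"], 500, random.Random(0))` returns the final queue and the crash set; a real fuzzer would replace the random choice with a seed schedule and bound the number of mutations per seed with a power schedule.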

Algorithm 1: Coverage-based Greybox Fuzzing
Input: a set of initial seeds, an instrumented target program P
Output: a seed queue Q, a crash set C
Q ← seeds
C ← ∅
Procedure fuzzing process
while TRUE do
    S ← SeedsSelect(Q)
Code instrumentation inserts code fragments at compile time, which is useful for path tracing and testing during the fuzzing process. AFL [7] is a greybox fuzzer that uses edge (branch) coverage as feedback. Before the fuzzing loop stage, AFL first uses afl-gcc or afl-clang as instrumentation commands to trace edge coverage. AFL maintains a 64 KB shared bitmap, Bitmap, to record edge coverage information, including whether each edge has been visited and its hit count. AFL assigns a random number to each basic block in the program and applies XOR and right-shift operations to the current and previous basic blocks to identify each edge. Each edge is used as an offset into Bitmap, and the stored value is the hit count.
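The specific formula for coverage calculation is as follows [9].
cur_location = <COMPILE_TIME_RANDOM>;
shared_mem[cur_location ^ prev_location]++;
prev_location = cur_location >> 1;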
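The edge-indexing scheme can be made concrete with a small sketch. The class name `EdgeTracer` and the block IDs are hypothetical; the XOR, the right shift, and the 64 KB map size follow the description above, and the hit counter is modeled as a wrapping 8-bit value, which is an assumption of this sketch.

```python
MAP_SIZE = 1 << 16  # 64 KB bitmap, one byte per edge slot


class EdgeTracer:
    """Records edge hit counts the way the text describes for AFL."""

    def __init__(self) -> None:
        self.bitmap = bytearray(MAP_SIZE)
        self.prev_location = 0

    def visit(self, cur_location: int) -> int:
        """Register entry into a basic block and return the edge index."""
        index = (cur_location ^ self.prev_location) % MAP_SIZE
        # Assumed 8-bit counter that wraps on overflow.
        self.bitmap[index] = (self.bitmap[index] + 1) & 0xFF
        # The shift makes the edge A->B distinct from B->A.
        self.prev_location = cur_location >> 1
        return index


# Hypothetical compile-time random IDs for two basic blocks.
BLOCK_A, BLOCK_B = 0x1234, 0x5678

forward = EdgeTracer()
forward.visit(BLOCK_A)
edge_ab = forward.visit(BLOCK_B)   # index of edge A -> B

backward = EdgeTracer()
backward.visit(BLOCK_B)
edge_ba = backward.visit(BLOCK_A)  # index of edge B -> A
```

Because the previous location is shifted before the XOR, the two directions between the same pair of blocks map to different bitmap offsets.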

Seed Schedule
Seed schedule refers to selecting seeds from the seed pool for future mutation. A good seed schedule scheme speeds up path discovery and bug detection. AFL [7] gives priority to seeds that are unfuzzed (not yet selected for mutation) and favored (among all seeds passing through an edge, the seed with the smallest product of seed length and execution time). AFLGo [12] preferentially selects seeds closer to the target location for directed fuzzing. VUzzer [8] prioritizes seeds with deeper paths, so it may detect bugs deep in the code. SlowFuzz [58] preferentially selects seeds that generate more resource consumption to trigger algorithm complexity vulnerabilities. To discover memory consumption bugs, MemLock [17] preferentially selects seed inputs that generate more memory consumption. UAFL [59] preferentially selects seeds that execute operation sequences violating typestate properties to uncover use-after-free (UAF) vulnerabilities.

Mutation Strategy
The mutation strategy determines where and how to mutate the selected seed. Different fuzzers use different mutation strategies. AFL has two mutation stages: the deterministic stage and the indeterministic stage.
The deterministic stage. The deterministic stage is used the first time a seed is fuzzed. This stage includes the mutation operators bitflip, byteflip, arithmetic addition/subtraction, interesting values, and dictionary.
The indeterministic stage. After completing the deterministic stage, seeds enter the indeterministic stage, which includes havoc and splice. In this stage, AFL randomly selects a sequence of mutation operators and applies them at random locations to mutate the seed.
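The difference between the two stages can be sketched with one deterministic operator and a simplified havoc step. This is an illustrative sketch, not AFL's code: the operator set, the stacking limit `max_stack`, the arithmetic range, and the list of interesting byte values are assumptions made for the example.

```python
import random
from typing import Iterator


def walking_bitflips(seed: bytes) -> Iterator[bytes]:
    """Deterministic stage example: flip every bit once, in order."""
    for bit in range(len(seed) * 8):
        data = bytearray(seed)
        data[bit >> 3] ^= 0x80 >> (bit & 7)
        yield bytes(data)


def havoc(seed: bytes, rng: random.Random, max_stack: int = 4) -> bytes:
    """Indeterministic stage example: stack random operators at random positions."""
    data = bytearray(seed)
    for _ in range(rng.randint(1, max_stack)):
        if not data:
            break
        pos = rng.randrange(len(data))
        op = rng.choice(("bitflip", "arith", "interesting"))
        if op == "bitflip":
            data[pos] ^= 1 << rng.randrange(8)
        elif op == "arith":
            data[pos] = (data[pos] + rng.randint(-35, 35)) & 0xFF
        else:
            data[pos] = rng.choice((0x00, 0x01, 0x7F, 0x80, 0xFF))
    return bytes(data)
```

The deterministic operator produces exactly eight variants per input byte, which is why its cost grows with seed length, whereas the havoc step produces one variant per call and its total cost is set by the energy assigned to the seed.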
There are many studies on mutation strategies for fuzzer. VUzzer [8] leverages data flow and control flow features to infer the critical regions of the input for mutation. GREYONE [60] uses a fuzzing-driven taint inference to infer taint variables for mutation. Superion [61] deploys mutation strategies to fuzz programs that process structured inputs. MOPT [14] uses particle swarm optimization algorithm to optimize mutation operators.

Power Schedule
Power schedule aims to allocate energy to each seed during the fuzzing process, which determines the number of mutations applied to the seed. Reasonable energy allocation can effectively improve the discovery of new paths. If a seed is allocated too much energy, the mutation of other seeds is affected. Conversely, if a seed is allocated too little energy, new path discovery and potential bug detection suffer.
AFL has two power schedule methods based on the different mutation stages. In the deterministic stage, the energy of a seed is related to its length: the longer the seed, the more energy is consumed. In the indeterministic stage, the energy allocation depends on the running time, the number of edges, the average file size, the number of cycles, and other factors.
Recent research shows that power schedule is critical for fuzzers. AFLFast [11] allocates more energy to low-frequency paths to explore more paths. EcoFuzz [15] uses reinforcement learning to model power schedule as an adversarial multi-armed bandit problem, which enables adaptive energy saving. However, these approaches do not consider the path risk or the effectiveness of energy allocation.

The Design of MooFuzz
To address the problems mentioned in the previous sections, we propose a many-objective optimization fuzzer, MooFuzz, as shown in Figure 2. The main components of MooFuzz are the static analyzer, the feedback collector, the seed scheduler, and the power scheduler. The static analyzer marks risk edges and records the risk value of each edge by scanning the source code, and then inserts code fragments that update the edge risk value while the program runs. The feedback collector records and updates related information after program execution to guide the seed schedule. The seed scheduler adopts different many-objective optimization schedules based on the state of the seed pool to select seeds. The power scheduler assigns energy based on feedback information and monitors energy usage.

Static Analyzer
A common observation is that code locations that call dangerous functions are more likely to trigger vulnerabilities. For example, the function malloc dynamically allocates memory in C. Although it allocates memory space automatically, improper use may cause problems such as overflow, heap exhaustion, and use-after-free. The function write attempts to write n bytes from the buffer pointed to by buf into the file associated with an open file descriptor. However, if the programmer does not control the number of bytes written to buf, there is a risk of an out-of-bounds memory read. Therefore, the static analyzer of MooFuzz labels the edges that reach potentially dangerous functions as risk edges. In this paper, MooFuzz uses the functions in Table 1 as dangerous functions [62], covering memory allocation, memory recovery, memory operation, string operation, and file I/O operation. Users can also define their own dangerous functions and add them to the static analyzer.
Algorithm 2 shows the basic idea of MooFuzz instrumentation. The set of well-known potentially dangerous functions is fixed before static analysis. The static analyzer identifies them by traversing the source code and instruments the source code at the corresponding edge positions without running the program. MooFuzz uses a pointer, danger_trace, to record the hit counts of the risk edges in shared memory after each program run. Specifically, MooFuzz first obtains the information of each basic block of the program, then identifies each call instruction and judges whether any of them calls a dangerous function (Lines 1-7). If so, the hit counts are updated and stored in the memory pointed to by danger_trace (Lines 8-11).

Algorithm 2: Code instrumentation
Input: the program P, a set of dangerous functions DF, a pointer variable danger_trace
Output: the instrumented program P
1 MAP_SIZE = 2^16
2 for basic_block in P do
3     bool risk = false
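The risk-marking idea can be sketched on a toy program representation. This is a simplified model, not the LLVM-based instrumentation: basic blocks are hypothetical integer IDs mapped to the names of the functions they call, risk is tracked per block rather than per edge, and the set `DANGEROUS` is an illustrative subset (malloc and write appear in the text; the other names are assumed examples of the listed categories, not the contents of Table 1).

```python
from typing import Dict, List, Set

MAP_SIZE = 1 << 16

# Illustrative subset only; the real list is user-extensible.
DANGEROUS: Set[str] = {"malloc", "free", "memcpy", "strcpy", "write"}


def mark_risk_blocks(blocks: Dict[int, List[str]],
                     dangerous: Set[str]) -> Set[int]:
    """Static step: find blocks containing a call to a dangerous function."""
    return {bid for bid, calls in blocks.items()
            if any(name in dangerous for name in calls)}


def record_hits(trace: List[int], risky: Set[int]) -> List[int]:
    """Runtime step: count hits of risky blocks, as danger_trace would."""
    danger_trace = [0] * MAP_SIZE
    for bid in trace:
        if bid in risky:
            danger_trace[bid % MAP_SIZE] += 1
    return danger_trace


# Hypothetical program with three basic blocks.
blocks = {1: ["printf"], 2: ["malloc"], 3: ["strlen", "write"]}
risky = mark_risk_blocks(blocks, DANGEROUS)
hits = record_hits([1, 2, 3, 2], risky)
```

The static step runs once before fuzzing, while the counting step stands in for the inserted code fragments that execute on every run of the target.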

Feedback Collector
The feedback collector continuously updates seed information to assist the seed schedule. Each run of the instrumented program updates a series of runtime information for the seeds. Algorithm 3 shows how the feedback collector updates this information. It takes the seed queue Q and the pointer variables as inputs and outputs the seed queue Q with new information. The new information includes the number of times the seed has been selected, the path frequency, the path risk, and the mutation information. Specifically, MooFuzz selects a seed s by using the seed scheduler (see Section 3.3) and updates the number of times it has been selected (Lines 1-3). Then, it uses a mutation strategy to generate a new test case s' and executes the target program with s' (Lines 4-5). Next, two pointer variables, danger_bits and edge_r, are used to update the edge risk (Line 6). Here, danger_bits is obtained from the pointer variable danger_trace, and edge_r records the risk of each edge. At the beginning, each edge corresponding to a dangerous function has the maximum value, while all other edges have a value of zero. Next, if the mutated test case produces new coverage, MooFuzz calculates the path risk value (Lines 7-8). MooFuzz then traverses each seed in the seed pool and determines whether its path is the same as the current path. If so, the frequency information of that seed is updated (Lines 9-11). Finally, if the path of s' is identical to the path of s, the mutation information is updated (Lines 12-13).
We discuss how to update different information separately as follows.
The path risk mainly refers to the ability of seeds to detect dangerous locations, which determines the number and speed of bug discovery. Before discussing the path risk, we first give the definition of edge risk update and then that of path risk update.
The edge risk update. Given an edge e_i and the corresponding hit count danger_bits[e_i], the edge risk edge_r[e_i] is updated as follows.
where danger_edge is the set of edges corresponding to dangerous functions.

Algorithm 3: Information update
Input: a seed queue Q, a pointer variable danger_bits, a pointer variable edge_r, the instrumented program P
Output: a seed queue Q with new information
The path risk update. Given a seed s and the risk values of all edges covered by s, the path risk of seed s, s.risk, is calculated as follows.
The path frequency indicates the ability of the seed to discover a new path (the larger the value, the higher the path frequency). As fuzzing proceeds, some paths in the program become high-frequency paths and others remain low-frequency paths. Generally, after the program has run for a while, seeds that cover low-frequency paths have a higher probability of discovering new paths than seeds that cover high-frequency paths.
The path frequency update. Given a seed s and its path p_s, if there is a seed s' in the seed pool whose path p_s' is the same as p_s, we add one to the path frequency of seed s', that is, s'.fre = s'.fre + 1 if p_s = p_s'.
The mutation information indicates the mutation ability of a seed. For each seed that has not been fuzzed, the mutation effectiveness is set to 0, indicating that the seed has the best mutation validity. The mutation ability of seeds being fuzzed is evaluated continuously, and seeds with high mutation ability (the smaller the value, the better) obtain priority.
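The path frequency update described above can be sketched as a pass over the seed pool. The `Seed` class and the representation of a path as a frozen set of edge IDs are assumptions made for illustration; only the field name `fre` follows the text.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass
class Seed:
    path: FrozenSet[int]   # assumed path representation: set of edge IDs
    fre: int = 0           # path frequency


def update_path_frequency(pool: List[Seed],
                          current_path: FrozenSet[int]) -> None:
    """Add one to the frequency of every pool seed on the executed path."""
    for seed in pool:
        if seed.path == current_path:
            seed.fre += 1


# Hypothetical pool with two seeds on different paths.
pool = [Seed(frozenset({1, 2})), Seed(frozenset({1, 3}))]
for executed in (frozenset({1, 2}), frozenset({1, 2}), frozenset({4})):
    update_path_frequency(pool, executed)
```

After these three executions, the first seed has frequency 2 and the second has frequency 0, so the second seed lies on the lower-frequency path.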
The mutation information update. Given a seed s and its mutation strategy M, if the path of seed s is the same as that of the seed s' generated by mutating s, the mutation information of seed s, s.mta, is calculated as follows.

Seed Scheduler
The seed scheduler is responsible for seed selection. To prioritize seeds effectively, we propose a many-objective optimization seed schedule scheme.
Before the seed schedule, MooFuzz divides the seed pool into three states according to seed attributes.
Exploration State. In this state, the seed pool contains seeds that are both unfuzzed and favored. Exploration State means that the seed pool is in an excellent state and maintains the diversity of seeds.
Search State. In this state, the favored seeds have been fuzzed, but there are still unfuzzed seeds. Search State means that there is a risk of the seed pool becoming completely fuzzed, so it is necessary to concentrate on finding more paths.
Assessment State. In this state, all seeds have been fuzzed. It is very difficult to identify a priority seed, but the fuzzed seeds have produced a lot of information that can serve as a reference. In addition, MooFuzz monitors the state while in Assessment State. Once the state changes, the seed set of the current state is discarded and the seed schedule is performed in the new state.
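The three states can be expressed as a small classification routine. The `Seed` fields `fuzzed` and `favored` and the helper `candidates` are hypothetical names used for this sketch; the decision rules follow the state definitions above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class PoolState(Enum):
    EXPLORATION = "exploration"
    SEARCH = "search"
    ASSESSMENT = "assessment"


@dataclass
class Seed:
    name: str
    fuzzed: bool
    favored: bool


def classify_pool(pool: List[Seed]) -> PoolState:
    """Map the seed pool to one of the three states."""
    if any(not s.fuzzed and s.favored for s in pool):
        return PoolState.EXPLORATION      # unfuzzed and favored seeds exist
    if any(not s.fuzzed for s in pool):
        return PoolState.SEARCH           # only unfavored seeds are unfuzzed
    return PoolState.ASSESSMENT           # everything has been fuzzed


def candidates(pool: List[Seed], state: PoolState) -> List[Seed]:
    """Seeds eligible for selection in the given state."""
    if state is PoolState.EXPLORATION:
        return [s for s in pool if not s.fuzzed and s.favored]
    if state is PoolState.SEARCH:
        return [s for s in pool if not s.fuzzed]
    return list(pool)
```

Re-running `classify_pool` after every change to the pool gives the state monitoring mentioned for Assessment State: as soon as a new, unfuzzed seed appears, the result is no longer `ASSESSMENT`.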
For these three states, MooFuzz uses different selection criteria based on bug detection, path discovery, and seed evaluation. MooFuzz constructs different objective functions based on different states.
As discussed above, MooFuzz obtains the risk value of a seed, which indicates its path risk, before the seed is added to the seed pool. According to previous research [8], seeds with deeper execution paths may be more capable of detecting bugs. Therefore, MooFuzz uses path risk r and path depth d as objectives for seed selection. To reduce the energy consumption of seeds and speed up bug discovery, MooFuzz also takes the length l of the seed data and the execution time t of the seed as objectives. In Exploration State, MooFuzz uses the following objective functions to select seeds that are unfuzzed and favored.
Search State indicates that all the favored seeds in the current seed pool have been fuzzed and that unfuzzed seeds remain. In this state, seed selection focuses mainly on path discovery. The frequency information of the seeds grows as fuzzing proceeds, and seeds that pass through low-frequency paths have greater potential to discover new paths. MooFuzz therefore regards path frequency e and path depth d as criteria for seed selection. Meanwhile, MooFuzz uses l and t, described above, to balance energy consumption. In Search State, MooFuzz uses the following objective functions to select seeds that have not been fuzzed.
Assessment State means that all seeds in the current seed pool have been fuzzed. MooFuzz obtains the information of each seed, including the path frequency e, the number of times n that the seed has been selected, the seed path depth d, and the mutation information m, and adds them to the objective functions as selection criteria. Note that this state does not use the length and execution time of the seed to balance energy consumption, because it is very difficult to generate new seeds in this state. In addition, once new seeds are generated, MooFuzz leaves Assessment State and enters another state. In Assessment State, MooFuzz uses the following objective functions to select seeds from the seed pool.
After establishing objective functions for the different seed pool states, MooFuzz models seed schedule as a minimization problem and selects the optimal seed set. Algorithm 4 completes the seed schedule by using non-dominated sorting [19]. Its input is the seed set S that satisfies the state conditions, and a set CF stores the optimal seed set. Initially, CF is empty, and the first seed s_1 in S is added to CF. Each seed s_i in S is compared for dominance with the seeds s_j in CF (Lines 1-9). If s_j dominates s_i (each attribute value of s_j is less than that of s_i), s_i is discarded and the next seed is compared. If s_i dominates s_j, s_j is removed from CF. After these comparisons, if no seed in CF dominates s_i, s_i is added to CF (Lines 10-11). When the loop completes, the optimal seed set is stored in CF, and MooFuzz extracts each seed in it for fuzzing (Lines 12-13).
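The selection procedure described for Algorithm 4 can be sketched as follows. Each seed is represented here by a hypothetical tuple of objective values that are already oriented for minimization; how the paper combines r, d, l, t, e, n, and m into such tuples is not reproduced in this sketch.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, ...]


def dominates(x: Sequence[float], y: Sequence[float]) -> bool:
    """x dominates y: no worse in every objective, better in at least one."""
    return (all(a <= b for a, b in zip(x, y))
            and any(a < b for a, b in zip(x, y)))


def select_nondominated(seeds: List[Objectives]) -> List[Objectives]:
    """Build the candidate front CF by pairwise dominance comparison."""
    cf: List[Objectives] = []
    for s_i in seeds:
        is_dominated = False
        survivors: List[Objectives] = []
        for s_j in cf:
            if dominates(s_j, s_i):
                is_dominated = True      # s_i is discarded
                break
            if not dominates(s_i, s_j):
                survivors.append(s_j)    # s_j stays in CF
        if not is_dominated:
            cf = survivors + [s_i]       # dominated members are dropped
    return cf


# Hypothetical two-objective seeds.
selected = select_nondominated([(3, 3), (1, 4), (2, 2), (4, 1), (2, 5)])
```

In this example (3, 3) is removed once (2, 2) arrives, and (2, 5) is never admitted because (1, 4) dominates it, leaving (1, 4), (2, 2), and (4, 1) as the set to fuzz.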

Algorithm 4: Seed schedule
Input: the seed set S satisfying conditions in different states
Output: a series of optimal seeds s
1 CF ← ∅
2 for s_i in S do
3     bool isdominated = false

Power Scheduler
The purpose of power schedule is to assign reasonable energy to each seed involved in mutation. A high-quality seed deserves more chances to mutate and should be assigned more energy in the fuzzing process.
Existing coverage-based fuzzers (such as AFL [7]) usually calculate the energy for the selected seeds as follows [18],
energy(i) = allocate_energy(q_i)    (12)
where i is the seed and q_i is the quality of the seed, which depends on the execution time, branch edge coverage, creation time, and so on.
Algorithm 5 is the seed power schedule algorithm. MooFuzz sets up different energy distribution methods for different seed pool states. It also uses an energy monitoring mechanism that monitors the execution of target programs and reduces unnecessary energy consumption.
In our experiments, we find that the amount of energy in the deterministic stage is mainly related to the length of the seed. The deterministic stage is a relatively fine-grained mutation, but as the number of candidate seeds in the seed pool increases, it slows down path discovery. Thus, in Algorithm 5 we open the deterministic stage to seeds that cause crashes after mutation (Lines 1-2). In the indeterministic stage, MooFuzz checks the state of the current seed. If it belongs to Search State, MooFuzz uses the frequency information to set the energy. If it belongs to Assessment State, both the frequency and the mutation information are considered to set the energy (Lines 3-6).
After energy allocation, a monitoring mechanism tracks the mutation of seeds (Lines 7-14). When a seed has consumed 75% of its allocated energy, MooFuzz inspects the mutation of the current seed and records the ratio between the average energy consumption of the current seed per new path covered and that of all seeds per new path covered. If the ratio is lower than threshold_1, MooFuzz withdraws the remaining energy. If the ratio is higher than threshold_2, the mutation information is updated. Here, threshold_1 is equal to 0.9 and threshold_2 is equal to 1.3.
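The monitoring rule above can be sketched as a decision function. The function and enum names are hypothetical, and the ratio is taken as an input rather than computed, because this sketch does not assume a particular formula for it; the 75% checkpoint and the two thresholds are the values stated in the text.

```python
from enum import Enum

THRESHOLD_1 = 0.9   # below this ratio, energy is withdrawn
THRESHOLD_2 = 1.3   # above this ratio, mutation information is updated
CHECKPOINT = 0.75   # fraction of allocated energy that triggers monitoring


class Decision(Enum):
    CONTINUE = "continue"
    WITHDRAW = "withdraw"
    UPDATE_MUTATION_INFO = "update_mutation_info"


def monitor(allocated: int, used: int, ratio: float) -> Decision:
    """Decide what to do with a seed's remaining energy.

    `ratio` is the seed's per-new-path energy figure relative to the
    same figure over all seeds, supplied by the caller.
    """
    if used < CHECKPOINT * allocated:
        return Decision.CONTINUE              # checkpoint not reached yet
    if ratio < THRESHOLD_1:
        return Decision.WITHDRAW              # stop spending on this seed
    if ratio > THRESHOLD_2:
        return Decision.UPDATE_MUTATION_INFO  # record mutation ability
    return Decision.CONTINUE
```

For a seed allocated 100 mutations, nothing happens during the first 74; from the 75th onward, a ratio of 0.5 withdraws the remaining energy, a ratio of 1.5 triggers the mutation information update, and a ratio of 1.0 lets fuzzing continue unchanged.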

Algorithm 5: Power schedule
Input: a seed s, the number of all seeds in the seed pool total_seed, the total energy consumed in the fuzzing process total_energy, the number of new seeds generated by the current seed mutation cur_seed
Output: the energy of seed s, s.energy
1 if seed s causes crashes after mutation then

Evaluation
MooFuzz is built on top of AFL-2.52b [7]. The implementation adds C/C++ code to AFL. The instrumentation components, which mark danger edges during static analysis, are implemented on the LLVM framework [63]. Through these experiments, the following research questions are tackled:
RQ1: How capable is MooFuzz in crash detection?
RQ2: How effective is the code coverage of MooFuzz?
RQ3: How capable is MooFuzz in identifying real-world vulnerabilities?
1. AFL is currently one of the most common coverage-based greybox fuzzers in the community.
2. AFLFast is a variant of AFL with a better power schedule.
3. FairFuzz is also an extension of AFL. It optimizes inputs that hit rare branches.
4. PerfFuzz improves the instrumentation components to generate pathological inputs.

Benchmark. To evaluate MooFuzz, we choose seven real-world open source Linux applications as the benchmark. Jasper [64] is a software toolkit for processing image data that provides a way to represent images and facilitates the manipulation of image data. LibSass [65] is a C/C++ port of the Sass engine. Exiv2 [66] is a C++ library and a command line utility to read, write, delete, and modify Exif, IPTC, XMP, and ICC image metadata. Libming [67] is a library for generating Macromedia Flash files, written in C, and includes useful utilities for working with Flash files. OpenJPEG [68] is an open source JPEG 2000 codec written in C. Bento4 [69] is a C++ class library designed to read and write ISO-MP4 files. GNU Binutils [70] is a collection of binary tools. Table 2 shows the target applications and their fuzzing configurations.
Table 2. Target applications and their fuzzing configurations (columns: Program, Command Line, Project Version).
Performance Metrics. Crashes, paths, and vulnerabilities are chosen as metrics in this section. For code coverage, we use the number of seeds in the queue as an indicator and use the tool afl-cov [71] to measure line coverage and function coverage. For vulnerability detection, we use AddressSanitizer [72] directly.
Experiment Environment. All experiments are conducted on a server configured with two Xeon E5-2680 v4 processors (56 logical cores in total) and 32 GB RAM. The server runs Ubuntu 18.04. For the same application, the initial seed set is the same across fuzzers. We fuzz each application for 24 h (on a single logical core) and repeat each run 5 times to reduce randomness. Across all experiments, we use 42 logical cores and leave 14 logical cores for other processes to keep the workload stable.

Unique Crashes Evaluation (RQ1)
A direct way to evaluate the effectiveness of MooFuzz is to measure the number of crashes and the speed at which they are triggered, since more crashes may expose more bugs. We run each application on 5 different fuzzers to compare the number of unique crashes and the speed of discovery. Figure 3 shows the growth trends of unique crash discovery for the different fuzzers. From these results, we make the following observations. First, different fuzzers have different capabilities on different application programs. For example, PerfFuzz finds zero crashes in openjpeg within 24 h, but it triggers the most crashes among all fuzzers on exiv2. This shows that the seed selection criteria affect the number of crashes.
Second, seed schedule and power schedule affect the efficiency of crash discovery. The experimental results show that MooFuzz outperforms AFL in the speed of crash discovery and takes only about 10 h to trigger most of the unique crashes. AFL has no path risk measurement or energy monitoring, so it spends a lot of time on ineffective mutation operators.
Third, MooFuzz finds more crashes than other state-of-the-art fuzzers. The statistical results are shown in Table 3. We count the number of crashes found in each application by each fuzzer within 24 h, as well as the total number of crashes found by each fuzzer. Table 3 shows that, except for exiv2, MooFuzz triggers more crashes than the other fuzzers. For example, on jasper MooFuzz triggers 182 crashes within 24 h, whereas AFL triggers only 118. In total, MooFuzz triggers 818 crashes in the benchmark applications, an improvement of 46%, 32%, 34%, and 153% over the state-of-the-art fuzzers AFL [7], AFLFast [11], FairFuzz [13], and PerfFuzz [16], respectively. Overall, MooFuzz significantly outperforms the other fuzzers in terms of the speed and number of unique crashes.

Coverage Evaluation (RQ2)
Code coverage is an effective way to evaluate fuzzers. The experiment measures coverage at the level of source code lines, functions, and paths. Table 4 shows the lines and functions covered by the different fuzzers. In total, the line coverage and function coverage of MooFuzz are better than those of AFL, AFLFast, FairFuzz, and PerfFuzz. Figure 4 shows the growth trends of path discovery for the five fuzzers after fuzzing the applications for 24 h. Except for cxxfilt, MooFuzz ranks first among all fuzzers in the number of paths discovered. For example, it finds about 6000 paths in openjpeg, whereas the other four fuzzers find only about 3600 paths. It finds about 1200 paths after fuzzing jasper for 24 h, while the other fuzzers find only about 500 to 700 paths. Although MooFuzz discovers fewer paths than FairFuzz and AFLFast on cxxfilt, it triggers the most crashes among all fuzzers. In terms of the speed of path discovery, MooFuzz is significantly faster than the other fuzzers. Overall, MooFuzz outperforms the other fuzzers in terms of line, function, and path coverage.

Vulnerability Evaluation (RQ3)
MooFuzz tests old versions of the applications and analyzes the related vulnerabilities to evaluate its ability in vulnerability detection. Table 5 shows the real vulnerabilities identified by MooFuzz together with their IDs. MooFuzz is able to find vulnerabilities related to stack overflow, heap overflow, null pointer dereference, and memory leaks.
Vulnerability analysis. We use a vulnerability in a real-world application to analyze the effectiveness of our approach, as shown in Figure 5. This is a code snippet from openjpeg [68] that contains a heap-buffer-overflow vulnerability (i.e., CVE-2020-8112).
In Figure 5, the main function contains a conditional statement (Lines 1-9). In MooFuzz, the seed s_t satisfies the condition and enters the true branch to execute the function opj_tcd_decode_tile(...), whereas the seed s_n enters the false branch and executes other code that does not contain dangerous functions. As malloc is a dangerous function used in opj_tcd_decode_tile(...), risks might emerge when this function is called. Therefore, MooFuzz preferentially selects seed s_t for mutation. In this case, malloc(l_data_size) is called, and l_data_size comes from an unsigned operation in the function opj_tcd_decode_tile. Then, the function opj_t1_clbl_decode_processor is called later in the program flow, where the allocated memory is modified through two variables, cblk_h and cblk_w. Both variables are obtained through signed operations, which causes an integer overflow that makes cblk_h * cblk_w > l_data_size. MooFuzz easily satisfies these conditions through mutation, so the heap-buffer-overflow occurs.

Discussion
We enhance fuzzing from the perspectives of vulnerabilities and coverage. Although more coverage may trigger more vulnerabilities, not all coverage is equal [62]. Based on our observation of the fuzzing process, we define the path risk and prioritize seeds that consume less energy while executing high-risk paths, to maximize the improvement of fuzzing. Meanwhile, we use different objectives for seed optimization and energy allocation, which improves the efficiency of fuzzing in a limited time.
In the algorithm design of the power schedule, we use two thresholds to judge the energy usage of the current seed. These thresholds are fixed, and there is an opportunity to adjust them adaptively, for example according to the progress of the fuzzing process. In our evaluation, our method improves the probability of triggering vulnerabilities, but it may not be effective for vulnerabilities that require complex conditions, such as deeply nested conditions. Although we use a variety of open source benchmarks to evaluate MooFuzz, it may not be effective for programs that require inputs with a specific grammar (such as XML). However, the prototype we develop, MooFuzz, is a completely dynamic prototype. It can integrate static analysis techniques like symbolic execution to generate test cases that satisfy specific conditions and thereby improve fuzzing.

Conclusions and Further Work
In this paper, a many-objective optimization model is built for seed schedule. Considering the three states of the seed pool, we use different objective functions to select seeds from the perspectives of bug detection, path discovery, and seed evaluation. At the same time, an energy recovery mechanism is designed to monitor energy usage during the fuzzing process. We implement a prototype, MooFuzz, on top of AFL and evaluate it on seven real-world programs. The experimental results show that MooFuzz is more effective than state-of-the-art fuzzers in path discovery and bug detection.
In the future, we plan to use MooFuzz to fuzz the latest versions of the applications to assist testers. In our next study, we will consider optimizing the power schedule through multi-information feedback on the basis of MooFuzz, so that it can monitor energy consumption according to the current progress of the program and automatically set and adjust energy. We also plan to start from seed mutation and propose a new decision model that determines the effective region of a seed and selects an effective mutation strategy.