Fractional Encoding of At-Most-K Constraints on SAT

Abstract: The satisfiability problem (SAT) in propositional logic determines if there is an assignment of values that makes a given propositional formula true. Recently, fast SAT solvers have been developed, and SAT encoding research has gained attention. This enables various real-world problems to be transformed into SAT and solved, realizing a solution to the original problems. We propose a new encoding method, Fractional Encoding, which focuses on the At-Most-K constraints, a bottleneck of computational complexity, and reduces the scale of logical expressions by dividing target variables. Furthermore, we confirm that Fractional Encoding outperforms existing methods in terms of the number of generated clauses and required auxiliary variables. Hence, it enables the efficient solving of real-world problems like planning and hardware verification.


Introduction
The satisfiability problem (SAT) in propositional logic is the determination of whether there exists an assignment of values that makes a given propositional formula true. Recently, very fast SAT solvers have been developed, and the study of SAT encoding has attracted attention. Problems such as planning, hardware verification, software verification, and scheduling are transformed into SAT and solved using a SAT solver, realizing a solution to the original problem [1,2]. In recent years, there has been active research into SAT encoding for optimal Clifford circuits [3,4], expanding the application range of SAT.
Real-world problems encoded into SAT are composed of various constraints, among which the At-Most-K constraints often become the bottleneck of computational complexity [2]. As a recent application, Timm et al. used SAT for the verification of multi-agent systems [5]. Such applications need an efficient encoding method, especially for At-Most-K constraints with large K. To date, various encoding methods for At-Most-K constraints have been proposed. Frisch et al. proposed Binary Encoding, which encodes efficiently by assigning a unique bit string to each target variable [6,7]. Sinz introduced Counter Encoding, which is modeled on sequential counting circuits [8]. These methods still leave room for improvement in suppressing the scale of logical expressions. Therefore, in this study, we propose a new method, Fractional Encoding, which suppresses the scale of logical expressions by dividing the target variables. The performance of the encoding is evaluated by the number of clauses generated and the number of auxiliary variables required. The evaluation confirms that the proposed Fractional Encoding performs better than existing methods on both measures.
The starting point of this study is the preliminary report [9], in which the idea of Fractional Encoding is presented. There is also related research [10] that is based on the same idea and proposes Approximate Encoding of At-Most-K constraints. While Fractional Encoding in this paper has fine-tuning variables, described later, Approximate Encoding does not; therefore, Approximate Encoding cannot cover all possible solutions for At-Most-K constraints.

Satisfiability Problem (SAT)
"Solving SAT" means determining the satisfiability of a propositional logic formula. In other words, it determines whether there exists an assignment (model) of truth values to the propositional variables in the formula such that the formula becomes true. A program that solves SAT is called a SAT solver, and it determines whether a given SAT instance is satisfiable or unsatisfiable. There are various SAT solvers, such as the well-known MiniSat [11] and CaDiCaL [12], which has achieved excellent results in recent competitions. In most SAT solvers, if the instance is satisfiable, a concrete assignment is returned; if it is unsatisfiable, the solver reports that no such assignment exists.
The logical expressions encoded in SAT are in conjunctive normal form (CNF). A CNF formula is a conjunction of clauses, each of which is a disjunction of literals, and it is the input format of a SAT solver [13].
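To make the CNF representation concrete, the following minimal sketch stores a formula as a list of clauses, each a list of signed integers (a positive integer for a variable, a negative one for its negation), and checks satisfiability by exhaustive search. The integer convention and the function names `is_satisfied` and `brute_force_sat` are assumptions of this sketch; an actual SAT solver such as those cited above uses far more efficient search.

```python
from itertools import product

def is_satisfied(cnf, assignment):
    """True if every clause contains at least one literal made true by `assignment`."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )

def brute_force_sat(cnf, num_vars):
    """Return a satisfying assignment (dict: variable -> bool), or None if none exists."""
    for values in product([False, True], repeat=num_vars):
        assignment = {v + 1: values[v] for v in range(num_vars)}
        if is_satisfied(cnf, assignment):
            return assignment
    return None

# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
cnf = [[1, -2], [2, 3], [-1, -3]]
model = brute_force_sat(cnf, 3)

# x1 AND NOT x1 has no model
unsat = brute_force_sat([[1], [-1]], 1)
```

Exhaustive search visits up to 2^n assignments, so it is only useful for checking small encodings by hand.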

At-Most-K Constraints
A SAT instance is composed of various constraints (logical expressions). The At-Most-K constraint is the constraint that no more than K of the target variables can be true. As the simplest example, Pairwise Encoding of at most one true variable among three target variables $(x_1, x_2, x_3)$ is $(\lnot x_1 \lor \lnot x_2) \land (\lnot x_1 \lor \lnot x_3) \land (\lnot x_2 \lor \lnot x_3)$. Although Pairwise Encoding is simple to implement, it requires $O(n^2)$ clauses for n target variables, which becomes a huge expression when n is large. Therefore, various encoding methods have been proposed to suppress the number of clauses by introducing auxiliary variables. Typical encoding methods that use auxiliary variables include Binary Encoding and Counter Encoding. The two methods are described below.
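As an illustration of why the pairwise approach grows quickly, the sketch below generates the naive clauses for a general At-Most-K constraint: one clause per subset of K+1 variables, stating that at least one of them is false. The function name `at_most_k_pairwise` and the signed-integer literal convention are assumptions of this sketch.

```python
from itertools import combinations
from math import comb

def at_most_k_pairwise(variables, k):
    """One clause per (k+1)-subset: at least one variable in the subset is false."""
    return [[-v for v in subset] for subset in combinations(variables, k + 1)]

# At most one of x1, x2, x3: the three pairwise clauses
amo3 = at_most_k_pairwise([1, 2, 3], 1)
print(amo3)  # [[-1, -2], [-1, -3], [-2, -3]]

# The clause count is the binomial coefficient C(n, k+1)
n, k = 8, 4
print(len(at_most_k_pairwise(range(1, n + 1), k)), comb(n, k + 1))  # 56 56
```

No auxiliary variables are needed, which is the one advantage of this encoding; the price is the binomial number of clauses.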

Binary Encoding
Binary Encoding was originally introduced by Frisch et al. [6,7]. The encoding introduces new variables $B_1, \dots, B_{\lceil \log_2 n \rceil}$. It then associates with each $x_i$ a unique bit string $s_i \in \{1, 0\}^{\lceil \log_2 n \rceil}$. The binary encoding of the At-Most-One constraint is
$\bigwedge_{i=1}^{n} \bigwedge_{j=1}^{\lceil \log_2 n \rceil} (\lnot x_i \lor \phi(i, j))$ (1)
where $\phi(i, j)$ denotes $B_j$ if the jth bit of $s_i$ is 1 and otherwise denotes $\lnot B_j$. The binary encoding can be extended to the At-Most-K constraint. As before, associate with each $x_i$ a unique bit string $s_i \in \{1, 0\}^{\lceil \log_2 n \rceil}$. The encoding introduces new variables $B_{g,j}$ ($1 \le g \le K$, $1 \le j \le \lceil \log_2 n \rceil$), which are essentially K copies of the previous B variables. In the binary encoding of the At-Most-K constraint, Formula (2), $\phi(i, g, j)$ denotes $B_{g,j}$ if the jth bit of $s_i$ is 1 and otherwise denotes $\lnot B_{g,j}$.
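The At-Most-One case of Binary Encoding can be sketched as follows. Each target variable implies its own bit pattern on the auxiliary bits, so two true target variables would demand two different patterns at once. The variable numbering (targets 1..n, auxiliary bits n+1..n+m), the use of the binary representation of i-1 as the bit string of $x_i$, and the function names are assumptions of this sketch; the brute-force check is only meant for small n.

```python
from itertools import product

def at_most_one_binary(n):
    """Return (clauses, total_vars) for At-Most-One over x_1..x_n.

    Auxiliary bits B_1..B_m are numbered n+1..n+m, with m = ceil(log2 n).
    """
    m = max(1, (n - 1).bit_length())      # number of bits needed for n strings
    clauses = []
    for i in range(n):                    # x_{i+1} is assigned the bits of i
        for j in range(m):
            b = n + 1 + j
            phi = b if (i >> j) & 1 else -b
            clauses.append([-(i + 1), phi])   # x_{i+1} implies phi
    return clauses, n + m

def projects_to_at_most_one(n):
    """Check: an x-assignment is extendable to a model iff at most one x is true."""
    clauses, total = at_most_one_binary(n)
    for xs in product([False, True], repeat=n):
        feasible = any(
            all(any((xs + bs)[abs(l) - 1] == (l > 0) for l in c) for c in clauses)
            for bs in product([False, True], repeat=total - n)
        )
        if feasible != (sum(xs) <= 1):
            return False
    return True

clauses, total = at_most_one_binary(5)
print(len(clauses), total)            # 15 8
print(projects_to_at_most_one(5))     # True
```

For n = 5 this uses 15 binary clauses and 3 auxiliary variables, compared with 10 pairwise clauses; the advantage appears as n grows, since the clause count is n times the number of bits rather than quadratic.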

Counter Encoding
Sinz introduced an encoding that works by encoding a circuit that sequentially counts the number of $x_i$ that are true [8]. For each $1 \le i \le n$ there is a register whose value is constrained to contain the number of $x_1, \dots, x_i$ that are true. Each register maintains its count in base one (unary) and hence uses K bits to count to K. Thus, the encoding introduces the new variables $R_{i,j}$, $1 \le i \le n$, $1 \le j \le K$, where each $R_{i,j}$ represents the jth bit of register i. The clauses of the encoding are given by Formulas (3) to (7).
Formula (3) states that if $x_i$ is true then the first bit of register i must be true. Formula (4) ensures that in the first register only the first bit can be true. Formulas (5) and (6) together constrain each register i ($1 < i < n$) to contain the value of the previous register plus $x_i$. Finally, (7) asserts that there cannot be an overflow on any register, as it would indicate that more than K variables are true.
The encoding method proposed in this study, Fractional Encoding, also suppresses the number of clauses better than Pairwise Encoding by introducing auxiliary variables. The remainder of this paper is organized as follows: Section 2 presents the method. Section 3 presents the results. Additionally, Section 4 contains a discussion, and concluding remarks are provided in Section 5.
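The five clause families described above can be sketched in code. The version below follows the commonly used sequential-counter formulation with registers 1..n-1; the exact index ranges, the variable numbering, and the function names are assumptions of this sketch rather than a transcription of Formulas (3) to (7). The comments map each loop to the role the text assigns to the corresponding formula.

```python
from itertools import product

def at_most_k_counter(n, k):
    """Sequential-counter At-Most-k over x_1..x_n. Returns (clauses, total_vars)."""
    assert n >= 2 and k >= 1

    def r(i, j):
        # R_{i,j}: j-th bit of register i, numbered after the n target variables
        return n + (i - 1) * k + j

    clauses = []
    for i in range(1, n):                      # role of (3): x_i implies R_{i,1}
        clauses.append([-i, r(i, 1)])
    for j in range(2, k + 1):                  # role of (4): first register holds at most 1
        clauses.append([-r(1, j)])
    for i in range(2, n):
        for j in range(1, k + 1):              # role of (5): counts never decrease
            clauses.append([-r(i - 1, j), r(i, j)])
        for j in range(2, k + 1):              # role of (6): x_i increments the count
            clauses.append([-i, -r(i - 1, j - 1), r(i, j)])
    for i in range(2, n + 1):                  # role of (7): no overflow past k
        clauses.append([-i, -r(i - 1, k)])
    return clauses, n + (n - 1) * k

def realizes_at_most_k(n, k):
    """Brute-force check that the projection onto x_1..x_n is exactly 'at most k'."""
    clauses, total = at_most_k_counter(n, k)
    for xs in product([False, True], repeat=n):
        feasible = any(
            all(any((xs + rs)[abs(l) - 1] == (l > 0) for l in c) for c in clauses)
            for rs in product([False, True], repeat=total - n)
        )
        if feasible != (sum(xs) <= k):
            return False
    return True

print(realizes_at_most_k(5, 2))  # True
```

The number of clauses and auxiliary variables both grow as O(nK), which is the baseline that the proposed method is compared against.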

Methods
In this study, we propose an encoding method for At-Most-K constraints that distributes the computational complexity by splitting the set of target variables into multiple parts. The proposed method is called Fractional Encoding because it is based on the concept that K is the numerator and the number of target variables, n, is the denominator. Hereafter, the constraint that at most K of n target variables are true is denoted as AtMost K/n (a set of target variables). The overall flow of the research is shown in Figure 1.

Splitting Target Variables
By simply splitting the set of target variables into m subsets and applying Pairwise Encoding to each subset, the number of target variables handled by each constraint is reduced from n to n/m. In the following example, a set of 8 target variables is split into two subsets.
AtMost 4/8$(x_1, \dots, x_8)$ → AtMost 2/4$(x_1, \dots, x_4)$ ∧ AtMost 2/4$(x_5, \dots, x_8)$ (8)
The number of clauses in Pairwise Encoding is $\binom{n}{K+1}$ and can be reduced from 56 to 8:
$\binom{8}{5} = 56$ → $\binom{4}{3} + \binom{4}{3} = 8$ (9)
However, this split excludes assignments such as $\{x_1, x_2, x_3, x_5\}$, in which 3 variables in the set $x_1, \dots, x_4$ and 1 variable in the set $x_5, \dots, x_8$ are true, even though only 4 variables are true in total. As a result, a simple splitting of the set greatly reduces the number of variable combinations that satisfy the constraint (the solution space). Figure 2 illustrates this phenomenon in the "split target vars" section. It shows the coverage when simply splitting the target variables, the coverage after introducing the auxiliary variables, and the coverage after introducing the fine-tuning variables, which will be discussed later.
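The effect of naive splitting can be reproduced with a small enumeration. The sketch below builds the pairwise clauses for AtMost 4/8 and for two independent AtMost 2/4 halves, then compares clause counts and the number of satisfying assignments. The helper names are assumptions of this sketch.

```python
from itertools import combinations, product

def pairwise(variables, k):
    """Naive At-Most-k: every (k+1)-subset must contain a false variable."""
    return [[-v for v in subset] for subset in combinations(variables, k + 1)]

def holds(clauses, values):
    """values[v-1] is the truth value of variable v."""
    return all(any(values[abs(l) - 1] == (l > 0) for l in c) for c in clauses)

def count_models(clauses, n):
    return sum(holds(clauses, vals) for vals in product([False, True], repeat=n))

whole = pairwise(range(1, 9), 4)                              # AtMost 4/8
split = pairwise(range(1, 5), 2) + pairwise(range(5, 9), 2)   # two AtMost 2/4

# The assignment with x1, x2, x3, x5 true and all others false
lost = tuple(v in {1, 2, 3, 5} for v in range(1, 9))

print(len(whole), len(split))                          # 56 8
print(holds(whole, lost), holds(split, lost))          # True False
print(count_models(whole, 8), count_models(split, 8))  # 163 121
```

Of the 163 assignments allowed by AtMost 4/8, the split version keeps only 121, which is the shrinkage of the solution space that the auxiliary and fine-tuning variables are intended to recover.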

Fractional Encoding
Fractional Encoding realizes At-Most-K constraints by propagating the fraction from the upper layer to the lower layer, as in the tree structure shown in Figure 3. Propagation is performed using auxiliary variables ($g_{i,j}$) to dynamically determine the At-Most-K constraints to be applied to the split target variables. This prevents the solution space from decreasing. As an example, the splitting of target variables for the AtMost 8/16$(x_1, \dots, x_{16})$ constraint is shown below.
The variables belonging to $G_1, \dots, G_3$ become auxiliary variables that propagate the ratio of the number of variables that can be true to the number of target variables, as shown on the right in Figure 3, to realize (10) and (11). In addition, since the number of variables that become true at the bottom layer is controlled by propagating the ratio set at the top layer down to the bottom layer, there is a fraction (top layer K/n) that serves as the base. When 2/4 is used as the base, only patterns of the form $(2 \times 2^m)/(4 \times 2^m)$ can be generated, such as At-Most-2/4, At-Most-4/8, and At-Most-8/16.
Figure 3. AtMost 8/16$(x_1, \dots, x_{16})$ using Fractional Encoding. The bottom layer serves as the target variables, and the other layers play the role of auxiliary variables. Each layer has groups $G_i$ containing four variables each, the number of which depends on the number of layers. The number of layers is $\log_2 K$, and the number of groups $G_i$ is calculated as $\frac{1}{2} K \log_2 K$. In addition, each group is assigned a fine-tuning variable $F_i$ (to be explained below) to determine its state.
The encoding procedure based on 2/4 is shown below.

1. Ratio setting at the top layer: In the formula below, the At-Most-2/4 constraint is applied to the top layer to set the base 2/4 ratio (at most half of the target variables are true).
2. Introduction of fine-tuning variables: The fine-tuning variables $F_i = \{f_{i,1}, f_{i,2}\}$ increase or decrease the number of variables that can be true among the groups $G_i$ of the same layer. Each group has one of three states (plus, minus, const), which is uniquely encoded using its two fine-tuning variables. Depending on the state of each group, the number of variables that can be true is increased or decreased as shown in Figure 4.

3. Adding logical expressions to propagate the ratio to the lower layer: The following equation adds a constraint that propagates the ratio from the upper to the lower layer. In addition, $p$ ($p = \frac{1}{2} K \log_2 K - 1$) indicates the number of Local-Propagations, which will be discussed later. The Exactly-K constraint (to be explained below) in the upper layer counts the auxiliary variables that become true, and the At-Most-(2K+z) constraint ($z = -1, 0, 1$) is added in the lower layer.
The Exactly-K constraint is a constraint that requires exactly K variables to be true. It is expressed by the At-Most-K constraint and the At-Least-K constraint, which requires that at least K variables are true, as shown below.
The leaf variables (target variables) generated by (16), (17), (18), and (19) propagate the proportions set at the top layer. Also, when Exactly-0($g_{i,j}$) holds, the number of possible true values in group $G_{2i+j}$ cannot be further reduced, and when Exactly-2($g_{i,j}$) holds, the number of possible true values in group $G_{2i+j}$ cannot be further increased, so we add the following equation.

4. Adding constraints for plus/minus offsetting at each layer: By applying At-Most-$2^m$ to the fine-tuning variables in each layer, the pluses and minuses in the same layer are forced to balance out. This allows increasing or decreasing the number of variables that can be true among the groups G in the same layer. In the equation below, q represents the number of layers.
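The Exactly-K building block used in step 3 is the conjunction of an At-Most-K and an At-Least-K constraint. The sketch below expresses both with naive subset clauses, which is reasonable for the four-variable groups used here; whether the original formulas use this exact clause form is an assumption, as are the function names.

```python
from itertools import combinations, product

def at_most_k(variables, k):
    """Every (k+1)-subset must contain a false variable."""
    return [[-v for v in s] for s in combinations(variables, k + 1)]

def at_least_k(variables, k):
    """Every (n-k+1)-subset must contain a true variable."""
    n = len(variables)
    return [list(s) for s in combinations(variables, n - k + 1)]

def exactly_k(variables, k):
    return at_most_k(variables, k) + at_least_k(variables, k)

def models(clauses, n):
    """All assignments (tuples of bools) over variables 1..n that satisfy the clauses."""
    return [
        vals for vals in product([False, True], repeat=n)
        if all(any(vals[abs(l) - 1] == (l > 0) for l in c) for c in clauses)
    ]

group = [1, 2, 3, 4]                     # one group of four variables
for k in range(5):
    sols = models(exactly_k(group, k), 4)
    print(k, len(sols), all(sum(s) == k for s in sols))
```

For a group of four variables, Exactly-2 needs 4 + 4 = 8 clauses, so the per-group cost stays constant regardless of the overall n.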

Pattern Extension by Fixing Variables
Since Fractional Encoding generates At-Most-K constraints from the base fraction, it is difficult to handle an arbitrary number of target variables. For example, for At-Most-3/5 there is no suitable base fraction (top layer K/n). Therefore, we extend the patterns that can be generated by fixing the values of some of the target variables. Fixing true/false means adding clauses that make certain target variables true or false, as shown below.

Fix $x_i$ as true: $x_i$ (24)
Fix $x_i$ as false: $\lnot x_i$ (25)
As shown below, when c target variables are fixed as true, K and n each decrease by c, and when c target variables are fixed as false, only n decreases by c. Figure 5 shows an example in which three variables are fixed.
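The arithmetic of fixing can be checked on a small case. The sketch below derives At-Most-3/5 from At-Most-4/8 by fixing one variable as true (K and n drop by 1) and two as false (n drops by 2 more). A naive pairwise encoding stands in for the base constraint here, purely as an assumption to keep the example self-contained; the choice of which variables to fix and the function names are likewise illustrative.

```python
from itertools import combinations, product

def at_most_k(variables, k):
    """Stand-in base encoding: every (k+1)-subset must contain a false variable."""
    return [[-v for v in s] for s in combinations(variables, k + 1)]

def fix(base_clauses, fixed_true, fixed_false):
    """Add unit clauses x_i for fixed-true and NOT x_i for fixed-false variables."""
    return base_clauses + [[v] for v in fixed_true] + [[-v] for v in fixed_false]

def models(clauses, n):
    return [
        vals for vals in product([False, True], repeat=n)
        if all(any(vals[abs(l) - 1] == (l > 0) for l in c) for c in clauses)
    ]

base = at_most_k(range(1, 9), 4)                        # AtMost 4/8
derived = fix(base, fixed_true=[6], fixed_false=[7, 8])

# Project the models onto the five free variables x1..x5
free = {m[:5] for m in models(derived, 8)}
expected = {xs for xs in product([False, True], repeat=5) if sum(xs) <= 3}
print(free == expected, len(free))                      # True 26
```

The three fixed variables still appear in the encoding, so the derived constraint carries the size of the 4/8 base; this is the overhead that the later discussion of pattern extension refers to.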

Analysis
We discuss the number of clauses of the At-Most-K constraint generated by Fractional Encoding. Hereafter, the propagation from $g_{i,j}$ to $G_{2i+j}$ shown in Figure 6, which is central to the clause count, is called Local-Propagation. By recursive expansion, the number of clauses in the At-Most-K/n constraint can be expressed as 47 times the number of Local-Propagations required to generate the constraint. The number of Local-Propagations depends on the number of layers ($r = \log_2 K$) and is calculated to be $O(K \log_2 K)$ as follows. The first term is the number of clauses (Local-Propagations) in (28); the second term is the number of clauses (Local-Propagations) that are added when (28) is applied recursively.
Two types of auxiliary variables are required for Fractional Encoding. The first type propagates the ratio down to the bottom variables (the target variables); each group G contains four such variables. The second type is the fine-tuning variables, two for each group G. The calculation of the number of auxiliary variables is shown below: 6 (= 2 + 4) variables are needed for each group G, as shown in the first term. Since the fine-tuning variables are not needed for the top-layer groups $G_T$, they are subtracted in the second term. In addition, the variables in the lowest groups $G_B$ are target variables and are subtracted in the third term.
Since $G$, $G_T$, and $G_B$ can be computed as $\frac{1}{2} K \log_2 K$, $\frac{1}{2} K$, and $\frac{1}{4} K (\log_2 K + 1)$, respectively, the number of auxiliary variables becomes D, as shown below. Table 1 shows a comparison with conventional methods regarding the order of complexity of the number of clauses and auxiliary variables.
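As a worked check of the count described above, the three terms can be combined directly. The derivation below takes the stated group counts at face value and reads the three terms as six variables per group, minus two fine-tuning variables per top-layer group, minus four target variables per bottom-layer group; this reading is an assumption, and the resulting closed form should be compared against the original expression for D.

```latex
\begin{aligned}
D &= 6G - 2G_T - 4G_B \\
  &= 6 \cdot \tfrac{1}{2} K \log_2 K \;-\; 2 \cdot \tfrac{1}{2} K \;-\; 4 \cdot \tfrac{1}{4} K (\log_2 K + 1) \\
  &= 3K \log_2 K - K - K \log_2 K - K \\
  &= 2K \log_2 K - 2K .
\end{aligned}
```

Under these assumptions the number of auxiliary variables is $O(K \log_2 K)$; for example, K = 8 gives D = 2 · 8 · 3 − 16 = 32.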

Justification of the Proposed Method
In order to verify the validity of the proposed method, two verifications are conducted. The first is to verify that the At-Most-K constraint is realized by the proposed method. The At-Most-K constraint limits the number of variables that can be true to at most K. Therefore, if the formula generated by the proposed method becomes unsatisfiable only when more than K target variables are fixed to be true, the proposed method is correct. Clauses that fix more than K target variables to be true were added to the generated At-Most-K constraints, and the result was passed to the SAT solver. The formula was unsatisfiable only when more than K target variables were fixed as true, which confirms that the proposed method is correct.
Second, the solution space of the generated At-Most-K constraints was verified: Fractional Encoding admits solutions that are lost when the target variables are split and Pairwise Encoding is applied, as described in the "Splitting Target Variables" section. Therefore, the proposed method can prevent the solution space from decreasing.
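The verification procedure described above can be sketched as a small test harness: for every subset of target variables, add unit clauses fixing that subset to true and check that the formula is satisfiable exactly when the subset has at most K elements. The exhaustive `satisfiable` function stands in for a real SAT solver, and the pairwise and split encodings stand in for the encoding under test; all names here are assumptions of this sketch.

```python
from itertools import combinations, product

def satisfiable(clauses, num_vars):
    """Exhaustive stand-in for a SAT solver (small instances only)."""
    for vals in product([False, True], repeat=num_vars):
        if all(any(vals[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def verify_at_most_k(clauses, n, num_vars, k):
    """Fix each subset S of the n targets to true; expect SAT iff |S| <= k."""
    for size in range(n + 1):
        for subset in combinations(range(1, n + 1), size):
            units = [[v] for v in subset]          # clauses fixing S to true
            if satisfiable(units + clauses, num_vars) != (size <= k):
                return False
    return True

def pairwise(variables, k):
    return [[-v for v in s] for s in combinations(variables, k + 1)]

good = pairwise(range(1, 9), 4)                               # AtMost 4/8
split = pairwise(range(1, 5), 2) + pairwise(range(5, 9), 2)   # naive split

print(verify_at_most_k(good, 8, 8, 4))    # True
print(verify_at_most_k(split, 8, 8, 4))   # False
```

The naive split fails the check because fixing, for example, three variables of the first half to true is already unsatisfiable, so the same harness also detects a shrunken solution space. Encodings with auxiliary variables can be checked by passing a larger `num_vars`.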

Comparison with Conventional Methods
In order to compare 2/4-based Fractional Encoding with conventional methods, we examined the number of clauses (Figure 7), the number of auxiliary variables (Figure 8), and the total number of literals (Figure 9). These figures directly reflect the differences in the orders of complexity of each method shown in Table 1.

Discussion
In comparison with the conventional methods, it was found that the size of the logical formula can be reduced as the number of target variables increases. This is considered to be because, when the number of target variables is small, the logical formulas required for Local-Propagation account for a large proportion of the total. When the number of target variables is large, the share of logical expressions due to Local-Propagation becomes negligible, and the size of the logical expressions can be reduced compared to conventional methods. However, Fractional Encoding is inferior to conventional methods in terms of generating flexible At-Most-K constraints. Fractional Encoding requires searching for the base fraction (the top layer K/n of the At-Most-K constraints) as described in the section "Fractional Encoding", and if it cannot be found, the modifications described in the section "Pattern Extension by Fixing Variables" must be made. The correction ability of Fractional Encoding has not yet been verified, and it remains necessary to verify whether it can outperform existing methods even when the overhead of "Pattern Extension by Fixing Variables" is incurred.
In the near future, in an effort to generalize Fractional Encoding, we plan to examine methods for finding base fractions and verify the correction ability of Fractional Encoding.

Conclusions
In this study, the Fractional Encoding method is proposed as a means of reducing the size of the logical expression of the At-Most-K constraint. Fractional Encoding reduces the size of logical expressions by splitting the set of target variables and using Pairwise Encoding dynamically for each set. However, since simply splitting the set significantly reduces the number of possible variable combinations, we dynamically determined the At-Most-K constraints for the split set using auxiliary variables.
Comparison with conventional methods shows that the size of the logic expression can be reduced when the number of target variables increases. However, when the number of target variables is small or when it is difficult to find the base fraction (Top Layer K/n) necessary for the propagation of the At-Most-K constraints to be generated, the scale of the generated logic formulas is significantly inferior to that of conventional methods.

Conflicts of Interest:
The authors declare no conflict of interest.