An Extension of the Brouwer–Zimmermann Algorithm for Calculating the Minimum Weight of a Linear Code

Abstract: A modification of the Brouwer–Zimmermann algorithm for calculating the minimum weight of a linear code over a finite field is presented. The aim is to reduce the number of codewords that have to be considered. The reduction is significant in cases where the length of the code is not divisible by its dimension. The proposed algorithm can also be used to find all codewords of weight less than a given constant. The algorithm is implemented in the software package QEXTNEWEDITION.


Introduction
In 1997, Alexander Vardy proved that computing the minimum weight of a general binary linear code is an NP-hard problem, and that the corresponding decision problem is NP-complete [1]. The popular practical algorithms are based on a maximum number of generator matrices $G_1, G_2, \ldots, G_s$ with disjoint sets of systematic coordinates. Among them are the Brouwer-Zimmermann algorithm and its various modifications for cyclic codes, quasi-cyclic codes, divisible codes, etc. (see [2–5]). Such algorithms are implemented in the software package MAGMA [6].
The basic idea is that after taking all linear combinations of $1, 2, \ldots, l$ rows of all matrices, all codewords with weight $\le w$ (where $w$ depends on $l$) will have been generated, provided there are any. Additionally, if the lightest codeword generated so far has weight $w + 1$, then the minimum distance $d(C)$ is equal to $w + 1$. The problem we consider here is which matrices among $G_1, G_2, \ldots, G_s$ to use and which linear combinations of their rows to take so that the number of generated codewords is minimal. We propose a new modification of the Brouwer-Zimmermann algorithm that generates fewer codewords than the algorithm implemented in MAGMA. The MAGMA algorithm is described by M. Grassl in [4] and is implemented in the function MINIMUMWEIGHT(C). The advantages of the proposed approach are most visible for codes whose length $n$ is close to but less than $2k$, where $k$ is the dimension. This can be seen in the table of experimental results. In some cases, the number of required vector operations in our algorithm is many times smaller than in the algorithms we know.
In Section 2, we present some important properties of the minimum weight of a linear code and a short description of the Brouwer-Zimmermann algorithm. We give the theoretical basis of our method in Section 3. Section 4 is devoted to the new algorithm that we propose. The first difference between our algorithm and the BZ algorithm is that the BZ algorithm uses overlapping information sets, whereas we partition the set of coordinate positions into disjoint sets called systematic sets (full and reduced). For each of the considered sets, the algorithm constructs a corresponding generator matrix of the code (this holds for both algorithms). Another important difference is in the number of linear combinations of rows of the generator matrices needed to calculate the minimum weight. If the length $n$ is not divisible by the dimension $k$, our algorithm needs fewer codewords to obtain the value of $d(C)$. We list and explain some computational results in Section 5.

Preliminaries
Let $\mathbb{F}_q$ be a finite field with $q$ elements and $\mathbb{F}_q^n$ be the $n$-dimensional vector space over $\mathbb{F}_q$. A linear $[n, k]_q$ code $C$ is a $k$-dimensional subspace of the vector space $\mathbb{F}_q^n$. A matrix whose rows form a basis of $C$ is called a generator matrix of this code.
The (Hamming) distance $d(x, y)$ between two vectors $x, y \in \mathbb{F}_q^n$ is the number of coordinate positions in which they differ, and the (Hamming) weight $\mathrm{wt}(x)$ of a vector $x \in \mathbb{F}_q^n$ is the number of its nonzero coordinates. The minimum distance of a linear code is the smallest distance between two different codewords, and the minimum weight is the smallest weight among all nonzero codewords of the code. If $C$ is a linear code, then its minimum weight and minimum distance are equal. An $[n, k, d]_q$ code is a linear $[n, k]_q$ code with minimum distance $d$. Clearly $d \le n$, but there are much better upper bounds for the minimum weight. In our algorithm, we use the Singleton bound, namely $d \le n - k + 1$.
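The definitions above can be checked by brute force on a small code. The following is a minimal sketch, assuming a prime field (arithmetic modulo $q$) and a generator matrix of full rank; the function names and the choice of the binary $[7, 4, 3]$ Hamming code are illustrative only.

```python
from itertools import product

def hamming_weight(v):
    """Number of nonzero coordinates of v."""
    return sum(1 for x in v if x != 0)

def hamming_distance(x, y):
    """Number of positions in which x and y differ."""
    return sum(1 for a, b in zip(x, y) if a != b)

def codewords(G, q):
    """Yield all q^k codewords uG over the prime field F_q."""
    k, n = len(G), len(G[0])
    for u in product(range(q), repeat=k):
        yield tuple(sum(u[i] * G[i][j] for i in range(k)) % q for j in range(n))

def minimum_weight(G, q):
    """Smallest weight among the nonzero codewords."""
    return min(hamming_weight(c) for c in codewords(G, q) if any(c))

def minimum_distance(G, q):
    """Smallest distance between two different codewords (G of full rank)."""
    words = list(codewords(G, q))
    return min(hamming_distance(x, y)
               for i, x in enumerate(words) for y in words[i + 1:])

# Generator matrix (I | A) of the binary [7, 4, 3] Hamming code.
HAMMING_7_4 = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

d_weight = minimum_weight(HAMMING_7_4, 2)
d_distance = minimum_distance(HAMMING_7_4, 2)
singleton = len(HAMMING_7_4[0]) - len(HAMMING_7_4) + 1
```

Here the minimum weight and the minimum distance coincide, and both stay below the Singleton bound $n - k + 1 = 4$. The enumeration visits all $q^k$ codewords, which is exactly the cost that the algorithms discussed below try to avoid.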
The most widely used algorithm for computing the minimum weight of a linear $[n, k]_q$ code was designed by A. Brouwer and subsequently extended by K.-H. Zimmermann. The literature about this problem refers to it as the Brouwer-Zimmermann algorithm, abbreviated as the BZ algorithm (see [2,4] for its description). The BZ algorithm was outlined in [5], where the authors proposed an extension that consists of a good (sometimes best possible) sequence of information sets for a given code. We propose another extension to the BZ algorithm, which is related to the linear combinations needed to compute the minimum weight of the given code.
The main idea of the BZ algorithm is to enumerate the codewords in such a way that one not only obtains an upper bound on the minimum weight of the code via the minimum of the weights of the words that have been encountered, but also a lower bound on the minimum weight. For this, the concept of information sets is needed [4].
Definition 1 (Information Set [4]). Let $C$ be a linear $[n, k, d]_q$ code of length $n$ and dimension $k$ over $\mathbb{F}_q$. A subset $T \subseteq \{1, 2, \ldots, n\}$ of size $|T| = k$ is called an information set if the corresponding columns in a generator matrix $G$ of $C$ are linearly independent. Then there also exists a systematic generator matrix $G_T$ of $C$ such that the columns of $G_T$ specified by $T$ form an identity matrix.
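Whether a given set of positions is an information set can be decided by a rank computation. The sketch below assumes a binary code and uses 0-based column indices (the text counts positions from 1); the helper names are illustrative.

```python
def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def is_information_set(G, T):
    """True if the k columns of the binary matrix G indexed by T are independent."""
    k = len(G)
    if len(T) != k:
        return False
    # Pack the selected columns of every row into one integer.
    sub = [sum(row[c] << b for b, c in enumerate(T)) for row in G]
    return rank_gf2(sub) == k

# Binary [7, 4, 3] Hamming code, generator matrix (I | A).
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

systematic_positions = is_information_set(G, [0, 1, 2, 3])
mixed_positions = is_information_set(G, [0, 4, 5, 6])
bad_positions = is_information_set(G, [1, 2, 3, 6])
```

The last set fails because the first row of $G$ vanishes on all four chosen positions, so the selected columns cannot be linearly independent.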
The BZ algorithm uses a family of information sets $T_1, \ldots, T_l$ for the code $C$ such that $T_1 \cup \cdots \cup T_l = \{1, 2, \ldots, n\}$. These information sets are not necessarily disjoint, and therefore a sequence $r_1, \ldots, r_l$ of nonnegative integers, called relative ranks, is defined, where $r_i = |T_i \setminus (T_1 \cup \cdots \cup T_{i-1})|$. The methods used in [2,4] for constructing the sets $T_i$ and their corresponding systematic matrices $G_i$ produce them in such a way that the sequence of relative ranks is non-increasing.
The Brouwer-Zimmermann algorithm proceeds as follows. In the initial step, the upper bound on the minimum weight is $\overline{d} = n - k + 1$ (the Singleton bound), and the lower bound is $\underline{d} = 1$. Each step depends on integers $w$ and $j$: the algorithm enumerates all codewords $uG_j$ such that $\mathrm{wt}(u) = w$. During this process, if a codeword $x$ with $\mathrm{wt}(x) < \overline{d}$ is generated, then the algorithm updates the upper bound to $\overline{d} = \mathrm{wt}(x)$. The value of $\underline{d}$ is also updated (see [4,5] for more details). The algorithm then tests whether $\underline{d} \ge \overline{d}$, and if so, it terminates with the conclusion that $d(C) = \overline{d}$.

Systematic Sets
We define the term systematic set, which is similar to the information set, but the difference is in the size of the set.

Definition 2. Let $C$ be a linear $[n, k]_q$ code with a generator matrix $G$. A subset $T \subseteq \{1, 2, \ldots, n\}$ of size $|T| \le k$ is called a systematic set for $C$ if the corresponding columns in $G$ are linearly independent. If $|T| = k$, then $T$ is a full systematic set; otherwise it is a reduced systematic set.
The terms information set and full systematic set coincide. If $T$ is a systematic set, then $C$ has a generator matrix $G_T$ such that the columns corresponding to the set $T$ form a submatrix of $G_T$ which is equivalent to the identity matrix $I_{|T|}$ extended with $k - |T|$ zero rows.
The following notation is very important for this research. Let $T_1, \ldots, T_s$ be disjoint systematic sets for the linear $[n, k]_q$ code $C$ such that $T_1 \cup \cdots \cup T_s = \{1, \ldots, n\}$. Suppose that $T_1, \ldots, T_t$, $t \le s$, are full systematic sets, and the other $s - t$ systematic sets are reduced. Denote by $U_i^{(a_i)} = \{v \in C : v \ne 0,\ \mathrm{wt}(v|_{T_i}) \le a_i\}$ the set of nonzero codewords whose restriction $v|_{T_i}$ to $T_i$ has weight at most $a_i$, for any $a_i \ge 0$. These sets of codewords underlie our approach and make it different from known methods. We use a union of such sets (denoted as usual by $\bigcup U_i^{(a_i)}$) and a multiset sum (denoted by $\biguplus U_i^{(a_i)}$), which may contain some codewords repeated several times.
Consider first the case when $n = tk$ for an integer $t$ and $C$ has $t$ disjoint information sets $T_1, \ldots, T_t$, so $s = t$. Examples of such codes are self-dual codes and $t$-CIS (complementary information set) codes. Obviously, the minimum distance of such a code is at least $t$. This is because each codeword corresponding to a non-trivial linear combination of rows of any generator matrix has at least one nonzero coordinate in each information set. If we compute the weights of all rows of $G_1$, we will determine all codewords with weight $t$, but not those with weight $t + 1$, because the code may contain a codeword with two nonzero coordinates in $T_1$. If we continue by computing the weights of all rows of $G_2$, we will determine all codewords with weight $t + 1$ (if any), etc.
The following proposition is basic for both algorithms (Algorithm 1 and the BZ algorithm) in the case where all systematic sets are full. In this case, $U_i^{(a_i)}$ consists of the linear combinations of up to $a_i$ rows of the matrix $G_i$.
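For full systematic sets, the counting argument can be tried on a toy example. The sketch below assumes the binary self-dual $[8, 4, 4]$ extended Hamming code, which has two disjoint information sets ($t = 2$), and uses the bound of Theorem 1 with $r = 0$: combinations of up to $a_i$ rows of $G_i$ cover every codeword of weight at most $a_1 + a_2 + t - 1$. Positions are 0-based and all names are illustrative.

```python
from itertools import combinations

def systematic_form(G, T):
    """Row-reduce the binary matrix G so that the columns in T form an identity."""
    rows = [row[:] for row in G]
    for pivot, col in enumerate(T):
        # Raises StopIteration if T is not an information set.
        src = next(i for i in range(pivot, len(rows)) if rows[i][col])
        rows[pivot], rows[src] = rows[src], rows[pivot]
        for i in range(len(rows)):
            if i != pivot and rows[i][col]:
                rows[i] = [x ^ y for x, y in zip(rows[i], rows[pivot])]
    return rows

def row_sums(G, a):
    """Yield all sums of 1, 2, ..., a distinct rows of the binary matrix G."""
    n = len(G[0])
    for size in range(1, a + 1):
        for idx in combinations(range(len(G)), size):
            word = [0] * n
            for i in idx:
                word = [x ^ y for x, y in zip(word, G[i])]
            yield tuple(word)

# Extended Hamming code: G = (I | A) with A = J - I, so A is invertible.
G = [
    [1, 0, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1, 1, 1, 0],
]
T1, T2 = [0, 1, 2, 3], [4, 5, 6, 7]
G1, G2 = systematic_form(G, T1), systematic_form(G, T2)

a = (1, 1)                      # rows of G1 and rows of G2 only
generated = list(row_sums(G1, a[0])) + list(row_sums(G2, a[1]))
covered = sum(a) + 2 - 1        # every codeword of weight <= 3 is among them
upper = min(sum(v) for v in generated)

# If the lightest generated word is no heavier than covered + 1, it is minimal.
min_weight = upper if upper <= covered + 1 else None
```

Only 8 of the 15 nonzero codewords are generated, yet the minimum weight is settled: no generated word has weight at most 3, and a word of weight 4 was found.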
We would like to mention here that a similar technique has been applied in the study of zero-divisor graph structure in [7].
Now consider the general case where not all systematic sets are full. Let $\{1, \ldots, n\} = T_1 \cup \cdots \cup T_t \cup T_{t+1} \cup \cdots \cup T_s$, where $T_1, \ldots, T_t$ are full systematic sets for the code $C$; $T_{t+1}, \ldots, T_s$ are reduced systematic sets; and $T_i \cap T_j = \emptyset$ for $1 \le i < j \le s$. Without loss of generality we can suppose that $|T_1| = \cdots = |T_t| = k > |T_{t+1}| \ge \cdots \ge |T_s|$.
Theorem 1. Suppose that the systematic sets $T_1, \ldots, T_t$ are full, and the other $s - t$ sets are reduced. Let $r \le s - t$ and let $a_1, \ldots, a_{t+r}$ be nonnegative integers such that $a_i \le k$ and $a_1 + \cdots + a_{t+r} \ge 1$. Then the set $U = U_1^{(a_1)} \cup \cdots \cup U_{t+r}^{(a_{t+r})}$ contains all codewords with weight $w \le a_1 + a_2 + \cdots + a_{t+r} + t + r - 1$.
Proof. The proof is similar to the proof of Proposition 1. Let $m = a_1 + a_2 + \cdots + a_{t+r} + t + r - 1 \ge t + r$ and let $v \in C$ be a codeword of weight $w \le m$. Let $b_i = \mathrm{wt}(v|_{T_i})$, where $v|_{T_i}$ is the restriction of $v$ to the systematic set $T_i$, $i = 1, \ldots, s$. Then $w = b_1 + \cdots + b_s$ and $b_i > 0$ for all $i = 1, \ldots, t$. Suppose that $b_i \ge a_i + 1$ for all $i = 1, \ldots, t + r$. Then $m \ge w \ge a_1 + a_2 + \cdots + a_{t+r} + t + r + b_{t+r+1} + \cdots + b_s = m + 1 + b_{t+r+1} + \cdots + b_s \ge m + 1$, which is not possible. Hence, there is at least one index $i$, $1 \le i \le t + r$, such that $b_i \le a_i$, and therefore $v \in U_i^{(a_i)} \subseteq U$.
The sizes of the sets $U_i^{(a_i)}$ show why we use the parameter $r$ in Theorem 1 and then in the algorithm. Let $G_1, \ldots, G_s$ be the generator matrices that correspond to the sets $T_1, \ldots, T_s$, respectively. If the matrix $G_i$ is not required in a current step of the algorithm for $1 \le i \le t$, then we take $a_i = 0$, and then $U_i^{(0)} = \emptyset$. However, if $t < i \le s$, then the set $U_i^{(0)}$ contains $q^{k - |T_i|} - 1 \ge q - 1$ codewords. Therefore, when the matrix $G_i$ is not required, we reduce the number of the considered nonnegative integers $a_i$.
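The cardinalities behind this remark can be written out explicitly. The following is a sketch under two assumptions: $U_i^{(a)}$ is read as the set of nonzero codewords $v$ with $\mathrm{wt}(v|_{T_i}) \le a$, and $G_i$ carries an identity matrix on $T_i$ with $k - |T_i|$ rows vanishing there. The symbol $k_i = |T_i|$ is introduced only for this illustration.

```latex
% Full systematic set (|T_i| = k): nonzero combinations of at most a rows of G_i.
|U_i^{(a)}| = \sum_{j=1}^{a} \binom{k}{j} (q-1)^j , \qquad |U_i^{(0)}| = 0 .

% Reduced systematic set (k_i = |T_i| < k): the k - k_i rows that vanish on T_i
% may be combined freely, and the zero codeword is excluded.
|U_i^{(a)}| = q^{\,k-k_i} \sum_{j=0}^{a} \binom{k_i}{j} (q-1)^j - 1 , \qquad
|U_i^{(0)}| = q^{\,k-k_i} - 1 \ge q - 1 .
```

In this reading, a reduced set is never free of charge: even with $a_i = 0$ it contributes at least $q - 1$ codewords, whereas a full set with $a_i = 0$ contributes none.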

The Algorithm
We are looking for the minimum weight of the linear $[n, k]_q$ code $C$. Let the set of coordinate positions $\{1, \ldots, n\}$ for the code $C$ be partitioned into $s$ systematic sets $T_1, \ldots, T_s$ such that $|T_1| = \cdots = |T_t| = k > |T_{t+1}| \ge \cdots \ge |T_s|$, $t \le s$.
The algorithm uses an integer $\delta$ which increases by one, starting from the integer $t$, until it reaches $d(C) - 1$ or $d(C)$. For a particular value of $\delta$, we use successively $r = 0, 1, \ldots, s - t$ of the reduced systematic sets. For a fixed $r$, let $\delta = a_1 + \cdots + a_{t+r} + t + r - 1$ for some nonnegative integers $a_1, \ldots, a_{t+r}$. We define multisets $\Omega_\delta$ recursively in the following way:
• Let $\Omega_m = U_1^{(a_1)} \uplus \cdots \uplus U_{t+r}^{(a_{t+r})}$, where $a_1, \ldots, a_{t+r}$ are nonnegative integers such that $m = a_1 + \cdots + a_{t+r} + t + r - 1$, $0 \le r \le s - t$.
• Take $\delta = m + 1$. The nonnegative integers $r, a_1, \ldots, a_{t+r}$ are the same as above, and $j = l$ is the smallest value of the parameter $j$, $1 \le j \le t + r + 1$, for which the number of codewords in the resulting multiset is minimal (equal to $S_{m+1}$). If $l \le t + r$, then $\Omega_{m+1}$ is obtained from $\Omega_m$ by replacing $U_l^{(a_l)}$ with $U_l^{(a_l + 1)}$. If $r < s - t$ and $l = t + r + 1$, then $\Omega_{m+1} = \Omega_m \uplus U_{t+r+1}^{(0)}$.
According to Theorem 1, all codewords in $C$ of weight $\le \delta$ belong to the set $\Omega_\delta$. If the code does not contain nonzero codewords of weight $\le \delta$, then $\delta$ increases by one, and this value is taken as a lower bound $lb$ for the minimum weight of the code. The lightest codeword in $\Omega_\delta$ defines an upper bound $ub$ for the minimum weight of $C$. The exact value of $d(C)$ is obtained when $lb + 1 = ub$, or when a codeword of weight $lb = \delta$ is obtained during the generation of $\Omega_\delta$.
The reason we take $\Omega_\delta$ this way is to generate as few codewords as possible while making sure we get a codeword of minimum weight. The summands $a_1, \ldots, a_{t+r}$ in the expression $\delta = a_1 + \cdots + a_{t+r} + t + r - 1$ are chosen so that the number of codewords in the set $\Omega_\delta$ is as small as possible.
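The choice of the summands can be simulated without touching any code, because only the sizes of the systematic sets matter. The sketch below is one reading of the recursion, not the implementation in QEXTNEWEDITION: it assumes that $U_i^{(a)}$ is the set of nonzero codewords with $\mathrm{wt}(v|_{T_i}) \le a$, that the starting point is $\Omega_t = U_1^{(1)}$, and that each step picks the smallest index giving the fewest codewords in total. The names `size_u`, `total_size` and `greedy_schedule` are hypothetical.

```python
from math import comb

def size_u(k, set_size, a, q=2):
    """Assumed |U_i^(a)|: nonzero codewords of weight <= a on a systematic set."""
    free_rows = k - set_size  # rows of G_i that vanish on T_i (0 for a full set)
    light = sum(comb(set_size, j) * (q - 1) ** j for j in range(a + 1))
    return q ** free_rows * light - 1

def total_size(k, set_sizes, a, q=2):
    """Number of codewords in the multiset sum of the sets selected by a."""
    return sum(size_u(k, set_sizes[i], a_i, q) for i, a_i in enumerate(a))

def greedy_schedule(k, set_sizes, delta_max, q=2):
    """Map delta -> (a_1..a_{t+r}, |Omega_delta|); full sets must come first."""
    t = sum(1 for size in set_sizes if size == k)
    a = [1] + [0] * (t - 1)          # assumed start: rows of G_1 only
    delta = t
    schedule = {delta: (tuple(a), total_size(k, set_sizes, a, q))}
    while delta < delta_max:
        # Candidate j <= t + r: raise a_j by one.
        candidates = [a[:j] + [a[j] + 1] + a[j + 1:] for j in range(len(a))]
        # Candidate j = t + r + 1: bring in the next reduced set with a = 0.
        if len(a) < len(set_sizes):
            candidates.append(a + [0])
        # min() keeps the first minimum, i.e. the smallest index j.
        a = min(candidates, key=lambda c: total_size(k, set_sizes, c, q))
        delta += 1
        schedule[delta] = (tuple(a), total_size(k, set_sizes, a, q))
    return schedule

# Parameters of a binary [115, 60] code with |T_1| = 60 and |T_2| = 55.
schedule = greedy_schedule(60, [60, 55], 12)
for delta, (a, size) in schedule.items():
    print(delta, a, size)
```

Under these assumptions the reduced set is brought in already at $\delta = 2$, since its 31 codewords with $a_2 = 0$ are far fewer than the pairs of rows of $G_1$, and the two parameters then grow alternately.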
The pseudocode is presented as Algorithm 1. The correctness of the algorithm follows from Theorem 1. If all considered systematic sets are full (s = t), our algorithm is similar to the other variants of the Brouwer-Zimmermann algorithm [4]. The number of elementary operations performed by the algorithm depends on k, d, and the sizes of the systematic sets. For given n, k, q, and sizes of the disjoint systematic sets, it is theoretically possible to determine both the minimum and maximum numbers of codewords to be generated to prove that the minimum weight of the corresponding code is equal to d.
If $\Omega_{d-1} = U_1^{(a_1)} \uplus \cdots \uplus U_{t+r}^{(a_{t+r})}$, then the minimum number of generated codewords is $|\Omega_{d-1}| = \sum_{i=1}^{t+r} |U_i^{(a_i)}|$ (1), and it is attained when there is a codeword $v \in \Omega_{d-1}$ of weight $d$. If all vectors in $\Omega_{d-1}$ have weights larger than $d$, then the algorithm generates the set $\Omega_d$. The experimental results show that the upper bound $ub$ reaches the minimum weight at an early stage of the algorithm, and this can be used to solve some problems faster, for example, when we want to prove that the minimum weight is $\le d$ for a given integer $d$.
In the while-loop, to construct $\Omega_\delta$ we use its subset $\Omega_{\delta-1}$ because its codewords have already been generated. To get each vector in $\Omega_\delta$ with only one vector addition, we generate some codewords more than once, and formula (1) actually gives the number of these additions. More details on this process are given in [8] for prime fields and in [9] for composite fields. Moreover, instead of the sets U  (1)). Then we generate the multiset $\Omega_\delta$ as a multiset sum of the sets $U_i^{(a_i)}$ in the same way as explained at the beginning of this section.
The whole algorithm consists of several subproblems. The time complexity of finding an optimal solution for constructing systematic sets is given in detail in [5]. The complexity of the rest is difficult to estimate because it depends on two parameters that are not known at the beginning. The first is the number and cardinality of the systematic sets, which depend on the structure of the code. The second is the minimum weight we are looking for. While the problem with the second parameter can be solved by considering the corresponding decision problem [5], the first parameter is difficult to estimate. The time complexity of the algorithm for generating the codewords in a set $U_i^{(a)}$ is given in [8].
The same approach can also be used to obtain the set of codewords with minimum weight, and the set of codewords with a given weight $w > d$. This is necessary, for example, in equivalence tests and in finding automorphism groups of linear codes. This approach is more effective than the one given in [8]. The procedure is the following:

2. Generate the set $U_1^{(a_1)}$ and save its vectors with weight $w$.
3. For $2 \le i \le t + r$, generate the set $U_i^{(a_i)}$ and save the codewords $v \in U_i^{(a_i)}$ with weight $w$ for which $\mathrm{wt}(v|_{T_j}) > a_j$ for $j = 1, \ldots, i - 1$.
With this procedure we obtain all codewords of weight $w$ without repetitions. If we want to have a maximal set of nonproportional codewords of weight $w$, we apply the same procedure, but using the sets $U_i^{(a_i)}$ for $i = 1, \ldots, s$.
The following example illustrates Algorithm 1. In fact, the code itself is not important to see how the algorithm works; the parameters and the sizes of the systematic sets are sufficient.
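To make the collection procedure for codewords of a given weight concrete, here is a minimal sketch on a toy code chosen only for this illustration: the binary $[8, 4, 4]$ extended Hamming code with two disjoint full systematic sets. It assumes that a codeword found in $U_i^{(a_i)}$ is kept only if it does not already belong to an earlier set, that is, only if $\mathrm{wt}(v|_{T_j}) > a_j$ for all $j < i$. Positions are 0-based and the function names are illustrative.

```python
from itertools import combinations

# Systematic generator matrices of the code for T_1 = {0..3} and T_2 = {4..7}.
G1 = [
    [1, 0, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1, 1, 1, 0],
]
G2 = [
    [0, 1, 1, 1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 0, 0, 0, 0, 1],
]
SETS = [[0, 1, 2, 3], [4, 5, 6, 7]]

def row_sums(G, a):
    """Yield all sums of 1, 2, ..., a distinct rows of the binary matrix G."""
    n = len(G[0])
    for size in range(1, a + 1):
        for idx in combinations(range(len(G)), size):
            word = [0] * n
            for i in idx:
                word = [x ^ y for x, y in zip(word, G[i])]
            yield tuple(word)

def restricted_weight(v, positions):
    """Weight of the restriction of the binary vector v to the given positions."""
    return sum(v[p] for p in positions)

def codewords_of_weight(w, matrices, sets, a):
    """Collect every codeword of weight w exactly once."""
    found = []
    for i, (G, a_i) in enumerate(zip(matrices, a)):
        for v in row_sums(G, a_i):
            if sum(v) != w:
                continue
            # Skip v if an earlier set U_j already contained it.
            if any(restricted_weight(v, sets[j]) <= a[j] for j in range(i)):
                continue
            found.append(v)
    return found

# a_1 + a_2 + t - 1 = 2 + 1 + 2 - 1 = 4, so all words of weight 4 are covered.
words = codewords_of_weight(4, [G1, G2], SETS, a=(2, 1))
```

Ten of the words come from combinations of one or two rows of `G1`; the four rows of `G2` have weight 3 on $T_1$, so they are new and are kept.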

Computational Results
For a given $[n, k, d]_q$ linear code $C$, we computed the number of codewords that have to be generated to prove that $d(C) = d$ by the BZ algorithm and by our algorithm. For the partitioning of the set of coordinate positions $\{1, \ldots, n\}$ we used the same strategy as the BZ algorithm. The difference is that the BZ algorithm uses overlapping information sets. As described in [4], for a general linear code with systematic generator matrix $G_1 = (I \mid A_1)$, the rank of the matrix $A_1$ can be less than $k$, which implies that there is no information set $T_2$ with $T_1 \cap T_2 = \emptyset$. In this situation, we can obtain a reduced systematic set $T_2$ of size $|T_2| = \operatorname{rank} A_1$ (this reduced systematic set is called a partial information set in [4]). Our algorithm uses this reduced systematic set, unlike the BZ algorithm, which uses an information set $I_2 \supseteq T_2$ obtained from $T_2$ and $k - r_2$ elements of $T_1$.
We performed experiments with some $[n, k, d]_q$ codes using MAGMA and our program included in the package QEXTNEWEDITION. We used MAGMA V2.25-2 via the online Magma Calculator, run in a virtual machine on an Intel Xeon Processor E3-1220, 3.10 GHz. Our implementation was executed on an Intel Core i7-6700HQ 2.60 GHz processor. We present the number of vectors enumerated by both programs in Table 1. Moreover, we give the times for computing the minimum weights of the corresponding codes with MAGMA and QEXTNEWEDITION. To compare the algorithms, only the numbers of generated codewords are important, because we ran the programs on different computers.
In the case when $C$ has disjoint full systematic sets and $n = tk$, our algorithm and the BZ algorithm are similar. Therefore, in this case the numbers of generated codewords were almost the same, but there was a difference when $n = (t - 1)k + r$ for $1 \le r < k$. Take, for example, the $[115, 60, 13]_2$ code in Table 1. To calculate the minimum weight of this code, MAGMA uses two overlapping information sets $I_1$ and $I_2$, their corresponding generator matrices $G_{I_1}$ and $G_{I_2}$, and generates $2 \sum_{i=1}^{8} \binom{60}{i} = 6001753644$ codewords. Our algorithm operates with two systematic sets $T_1$ and $T_2$ with $|T_1| = 60$ and $|T_2| = 55$. As we see in this example, when we know the minimum distance in advance, we can compute the number of codewords to be generated by a formula before running the algorithm.
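The MAGMA count quoted for the $[115, 60, 13]_2$ code can be checked directly: two generator matrices, each contributing all combinations of 1 to 8 of its 60 rows. A short sketch (the variable names are illustrative):

```python
from math import comb

k = 60           # dimension of the code
max_rows = 8     # combinations of up to 8 rows are enumerated
matrices = 2     # two overlapping information sets

per_matrix = sum(comb(k, i) for i in range(1, max_rows + 1))
magma_count = matrices * per_matrix
print(magma_count)
```

The largest term, $\binom{60}{8}$, accounts for roughly 85% of the total, which is why avoiding the last enumeration level matters so much.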

Conclusions
In conclusion, we would like to add that Algorithm 1 is a competitive version of the classical Brouwer-Zimmermann algorithm. This approach can be further developed for special types of codes, such as cyclic, quasi-cyclic, and divisible codes. The algorithm is implemented in the software package QEXTNEWEDITION [10].
At the time of preparing this paper, three modules of QEXTNEWEDITION are publicly available and can be freely downloaded. These are the programs GENERATION, for classifying linear codes over small finite fields; LCEQUIVALENCE, designed to obtain the inequivalent codes in a set of linear codes over a finite field with $q < 65$ elements and to compute their automorphism groups; and WDHV, which calculates the weight distribution of a linear code. As a stand-alone program, the implementation of the presented algorithm is not finalized. The current LINUX or WINDOWS version will be sent upon request by the authors.
We would like to mention some open problems related to the algorithms for computing the minimum weight of a linear code. The standard way to find the minimum distance is through the weight spectrum of the code. In practice, the Brouwer-Zimmermann type algorithms are effective for codes with small numbers of disjoint systematic sets. One of the open questions is to determine for which parameters each of the two approaches is more effective.
Our experimental results show that when a matrix with a reduced systematic set is used, the proposed algorithm is more efficient than the one implemented in MAGMA. The question arises as to whether this is true for all such cases.
A parallel implementation of the Brouwer-Zimmermann algorithm is presented in [11]. The codes that the authors considered are very suitable for our algorithm. Therefore, the natural question arises about the parallel implementation of Algorithm 1.
Data Availability Statement: The current LINUX or WINDOWS version of the software will be sent upon request by the authors.