Article

Two-Level Regular Designs for Baseline Parameterization

1
School of Statistics and Data Science, Qufu Normal University, Qufu 273165, China
2
Zaozhuang No.2 Middle School, Zaozhuang 277000, China
*
Author to whom correspondence should be addressed.
Entropy 2025, 27(7), 706; https://doi.org/10.3390/e27070706
Submission received: 31 March 2025 / Revised: 18 June 2025 / Accepted: 27 June 2025 / Published: 30 June 2025
(This article belongs to the Special Issue Number Theoretic Methods in Statistics: Theory and Applications)

Abstract

This paper considers two-level regular fractional factorial designs for baseline parameterization. Some new results that reveal relationships between the K-values and word length pattern are developed. The new results help find two-level regular fractional factorial designs that are likely to be optimal under the K-aberration criterion. Illustrative examples are included to demonstrate this point.

1. Introduction

The application of experimental design in information theory is primarily reflected in optimizing experimental schemes to maximize information acquisition efficiency while minimizing uncertainty. The core concepts of information theory, such as entropy, provide quantitative tools for experimental design. For example, in the field of communications, experimental design can optimize channel coding schemes, thereby improving transmission reliability. Additionally, in machine learning feature selection, experimental design based on information entropy can identify the most discriminative feature subsets, reducing model complexity and enhancing classification accuracy. The integration of experimental design and information theory enables a systematic approach to solving key challenges in information acquisition, processing, and optimization, offering theoretical guidance for engineering and scientific experiments. Bose explored the interplay between statistical experimental design and information-theoretic concepts, discussing how to construct designs that maximize informational yield from data [1]. His work established parallels between statistical efficiency (particularly variance minimization in parameter estimates) and information-theoretic principles.
Factorial designs have a wide application in various fields. The baseline parameterization (BP) model and the orthogonal parameterization (OP) model are two of the most used models in the analysis of experimental data. The BP model is a linear model based on baseline constraints while the OP model is based on zero-sum constraints. There are numerous research findings under the OP model. For details, please refer to [2,3] and the references therein. BP is quite a natural option for modeling the experiments with each factor having a null state or baseline level. For example, in toxicology experiments [4], each binary factor represents the presence or absence of a particular toxin. Scientists consider the absence of all toxins to be the natural reference for the possible presence of some toxins. In such experiments, the status of the absence of a toxin can be naturally regarded as a baseline level. Another practical example is introduced by Glonek and Solomon [5] in a leukemic mice experiment. The authors presented two binary components that stand for sample and time, respectively. A natural baseline level is the condition of the non-leukemic line for one factor and time zero for another factor. For a broader interpretation of BP, one is referred to [6,7].
In recent years, choosing efficient designs under BP has attracted considerable attention from researchers. D-optimality and A-optimality are two commonly used efficiency criteria for selecting optimal experimental designs under the main effect model. From an information perspective, they aim to maximize the information content of the experimental data or minimize the variance of parameter estimates. Mukerjee and Tang [7] highlighted the A-optimality of two-level orthogonal arrays for BP when the interaction effects are all absent. Mukerjee and Huda [8] applied approximate theory together with discretization procedures to find designs that have high A-efficiencies and are robust to model misspecification. Liu et al. [9] employed the D-optimality criterion to find designs under BP that are both efficient and robust.
For situations where the interaction effects are present, the experimenter is usually concerned about the bias caused by interaction effects in estimating the main effects. Karunanayaka and Tang [10] proposed adding runs to one-factor-at-a-time designs to generate compromise designs that are competitive under both the efficiency criterion and the bias criterion. Mukerjee and Tang [7] proposed an optimality criterion, the K-aberration criterion, which quantifies the bias for two-level designs. Under the K-aberration criterion, a few works on designs for BP emerged. Li et al. [11] proposed an efficient incomplete search algorithm for finding nearly optimal designs and tabulated some 20-run (nearly) optimal two-level designs. Miller and Tang [12] focused on identifying efficient two-level regular designs by bridging the K-values and word length pattern. Mukerjee and Tang [13] developed certain rank conditions which, in conjunction with the idea of the minimum moment aberration and recursive set, help alleviate the burden of finding optimal two-level regular designs. Lin and Yang [14] considered finding multistratum designs for BP by using the coordinate-exchange algorithm. Li et al. [15] further proposed a theoretical construction method for compromise designs. Chen et al. [16] considered the situations where some two-factor interactions are also of interest in addition to the main effects and proposed an algorithm for searching optimal designs under their proposed minimum aberration criterion. Sun and Tang [17] investigated the linear relationship between the effects under BP and those under OP and explored its applications to design construction under BP in terms of estimability, optimality, and robustness. Yan and Zhao [18,19] put forward a minimum aberration criterion for three-level designs for BP and found some optimal designs using their proposed construction algorithm.
This paper focuses on two-level regular designs under the K-aberration. Though the two-level nonregular designs may outperform the regular ones in some cases, there is a compelling reason for considering two-level regular designs as has been justified in [13]: the results on two-level regular designs serve as a benchmark for evaluating further work on nonregular ones that have to be compared. Given the importance of the two-level regular designs for BP, we endeavor to make further progress on bridging the K-values and word length pattern and analytically calculate the quantities for two-level regular designs with a higher resolution based on the results in [13]. Such new progress has the following advantages: (i) it can help screen out the candidate designs that are not likely to be K-aberration optimal and thus can yield a further simplification of the search algorithm in [13]; (ii) it is capable of finding two-level regular designs that have better or even optimal K-aberration characteristics than those identified by the results in [13]. Illustrative examples are given to demonstrate these points.
The rest of this paper is organized as follows. Section 2 introduces some necessary notation and elementary knowledge of BP and the word length pattern. Section 3 presents the main results of this paper. Applications of the theoretical results are included in Section 4. The concluding remarks are given in Section 5.

2. Preliminaries

Consider an experiment with $n$ factors, each at two levels. A full design includes $2^n$ runs that correspond to the $2^n$ level combinations of the $n$ factors. Suppose only a $2^{-m}$ fraction of the full design can be carried out for economic reasons. This paper considers the regular fraction of the full design. Let $2^{n-m}$ denote a two-level regular fractional factorial design $D$ with $N = 2^{n-m}$ runs and $n$ columns, with each column at two levels 0 and 1; such a design can be obtained as follows. Let $q = n - m$. Define the $N \times (N - 1)$ matrix
$$H_q = (\mathbf{1}, \mathbf{2}, \mathbf{12}, \mathbf{3}, \mathbf{13}, \ldots, \mathbf{q}, \mathbf{1q}, \ldots, \mathbf{12\cdots q}), \tag{1}$$
with columns arranged in Yates order, where
$$\mathbf{1} = (0, 1, 0, 1, \ldots, 0, 1, 0, 1)'_{2^q}, \quad \mathbf{2} = (0, 0, 1, 1, \ldots, 0, 0, 1, 1)'_{2^q}, \quad \ldots, \quad \mathbf{q} = (0, \ldots, 0, 1, \ldots, 1)'_{2^q}$$
are $q$ independent columns, and the other columns are obtained by taking the component-wise sum (modulo 2) of the independent ones; for example, $\mathbf{12} = (0, 1, 1, 0, \ldots, 0, 1, 1, 0)'_{2^q}$. A regular $2^{n-m}$ design $D$ can be obtained by selecting $n$ columns of $H_q$ such that $q$ are independent and the other $m$ $(= n - q)$ columns are component-wise sums (modulo 2) of the $q$ independent ones. Clearly, the design $D$ is an $N \times n$ submatrix of $H_q$.
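The construction of $H_q$ can be sketched in Python as follows. This is a minimal illustration, assuming NumPy is available; the function name `yates_matrix` and the string labels for the columns are hypothetical and are not part of the paper.

```python
import numpy as np

def yates_matrix(q):
    """Return (H, labels) for the N x (N - 1) matrix H_q, N = 2**q, in Yates order."""
    N = 2 ** q
    runs = np.arange(N)
    # Independent column i (1-based) is bit i-1 of the run index, so that
    # column 1 reads 0,1,0,1,... and column q reads 0,...,0,1,...,1.
    basic = [(runs >> i) & 1 for i in range(q)]
    columns, labels = [], []
    for code in range(1, N):
        # Bit i of `code` says whether independent column i+1 enters the sum.
        members = [i for i in range(q) if (code >> i) & 1]
        col = np.zeros(N, dtype=runs.dtype)
        for i in members:
            col = col ^ basic[i]  # component-wise sum modulo 2
        columns.append(col)
        labels.append("".join(str(i + 1) for i in members))
    return np.column_stack(columns), labels

H, labels = yates_matrix(3)
print(labels)    # column labels in Yates order
print(H[:, 2])   # the column labelled 12

# A regular 2^(4-1) design: three independent columns plus their sum.
D = H[:, [labels.index(name) for name in ("1", "2", "3", "123")]]
print(D.shape)
```

With this indexing, the independent column $g_i$ sits in position $2^{i-1}$ (counting from 1), which is the convention used for the examples in Section 4. The concatenated labels are only unambiguous for $q \le 9$.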
For a regular $2^{n-m}$ design $D$, we regard it as a set whose elements are the $n$ columns of $D$ and still denote the set by $D$ without causing confusion. Let $\omega_s = \{g_1, \ldots, g_s\}$ denote a subset of $s$ columns of $D$ and $\Omega_s(D)$ denote the collection of all the possible $\omega_s$, where $s \le n$. For ease of presentation, $\Omega_s$ is sometimes used instead of $\Omega_s(D)$. Then for any $\omega \in \Omega_s(D)$, $\omega$ is also an $N \times s$ submatrix of $D$. Note that each row of $\omega$ is an $s$-tuple with entries of 0 or 1; we call it a binary $s$-tuple. Let $\alpha(\omega)$ be the number of times that the $s$-tuple $11\cdots1$ occurs in the submatrix $\omega$. For a regular $2^{n-m}$ design $D$, let $K_s$ denote the total bias in the main effect estimates caused by all the $s$-factor interaction effects. Mukerjee and Tang [7] proved that
$$K_s = (4/N^2)(s T_{1s} + T_{2s}), \tag{2}$$
where
$$T_{1s} = \sum_{\omega_s \in \Omega_s(D)} (\alpha(\omega_s))^2, \qquad T_{2s} = \sum_{\omega_{s+1} \in \Omega_{s+1}(D)} \; \sum_{\omega_s \in \Omega_s(\omega_{s+1})} (2\alpha(\omega_{s+1}) - \alpha(\omega_s))^2, \tag{3}$$
and $\Omega_s(\omega_{s+1})$ denotes the collection of all the possible $\omega_s \subset \omega_{s+1}$ for a given $\omega_{s+1} \in \Omega_{s+1}(D)$. Formula (2) applies to any two-level design, not just the regular ones. For more details of the K-aberration, one is referred to [7]. A two-level orthogonal array is K-aberration optimal if it sequentially minimizes the following sequence:
$$(K_2, K_3, \ldots, K_m) \tag{4}$$
among all the two-level designs, given the run size and the number of columns.
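Formulas (2) and (3) can be evaluated by direct enumeration for small designs. The sketch below is a brute-force illustration only, assuming NumPy and a 0/1 design matrix with runs as rows; the names `alpha` and `k_value` are hypothetical, and the cost grows combinatorially with $n$, which is exactly why closed-form links to the word length pattern are useful.

```python
from itertools import combinations, product
import numpy as np

def alpha(D, cols):
    """Number of rows of D that equal 1 in every column listed in cols."""
    return int(np.all(D[:, list(cols)] == 1, axis=1).sum())

def k_value(D, s):
    """K_s = (4 / N^2) (s * T_1s + T_2s), computed from the definitions (2)-(3)."""
    N, n = D.shape
    t1 = sum(alpha(D, w) ** 2 for w in combinations(range(n), s))
    t2 = 0
    for w_big in combinations(range(n), s + 1):
        a_big = alpha(D, w_big)
        # Inner sum over the s-column subsets of the (s+1)-column subset.
        t2 += sum((2 * a_big - alpha(D, w)) ** 2 for w in combinations(w_big, s))
    return 4.0 * (s * t1 + t2) / N ** 2

# Two 8-run designs with four columns: g4 = g1+g2+g3 (mod 2) and its complement.
base = np.array(list(product([0, 1], repeat=3)))
g4 = base.sum(axis=1) % 2
D_even = np.column_stack([base, g4])       # g1+g2+g3+g4 = 0_N
D_odd = np.column_stack([base, 1 - g4])    # g1+g2+g3+g4 = 1_N

print([k_value(D_even, s) for s in (2, 3, 4)])
print([k_value(D_odd, s) for s in (2, 3, 4)])
```

The two toy designs share $K_2$ and $K_3$ and differ only in $K_4$, which illustrates that switching the symbols of a column can change the K-values under BP.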
The word length pattern (WLP) is a concept proposed for the designs under OP; for the original definition of WLP under OP, one is referred to [20]. With a slight modification, such a concept can also be applied to the two-level regular designs under BP as follows. For any regular $2^{n-m}$ baseline design $D$, denote $\psi(\omega_k)$ as the sum of the components of the vector $g_1 + \cdots + g_k$ (modulo 2), where $\omega_k = \{g_1, \ldots, g_k\} \in \Omega_k(D)$. Define
$$J_k(\omega_k) = |2\psi(\omega_k) - N|.$$
If the $k$ columns in $\omega_k$ satisfy $g_1 + \cdots + g_k$ (modulo 2) $= 1_N$ or $0_N$, where $1_N$ and $0_N$ denote the $N$-vectors with all components being 1 or 0, respectively, then we have $\psi(\omega_k) = N$ or 0, which leads to $J_k(\omega_k) = N$, and the $k$ columns correspond to a defining word. If the $k$ columns in $\omega_k$ do not correspond to a defining word, then these columns must be independent of each other. This means that all of the binary $k$-tuples appear equally often as rows in $\omega_k$, which leads to $J_k(\omega_k) = 0$. Let $A_k(D) = \sum_{\omega_k \in \Omega_k(D)} J_k(\omega_k)/N$; then $A_k(D)$ is the number of defining words of length $k$ of the regular design $D$. For a given two-level regular design, we call the corresponding sequence
$$(A_3, A_4, A_5, \ldots, A_n) \tag{5}$$
its word length pattern and $t$ its resolution, if the first nonzero element in Sequence (5) is $A_t$, where $3 \le t \le n$.
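The word length pattern can be computed directly from the definition of $J_k$. The following sketch assumes NumPy and a 0/1 design matrix; the function name `word_length_pattern` and the 16-run example are illustrative choices, not taken from the paper.

```python
from itertools import combinations, product
import numpy as np

def word_length_pattern(D):
    """Return {k: A_k} for k = 3..n, with A_k = sum of J_k(omega_k) / N."""
    N, n = D.shape
    pattern = {}
    for k in range(3, n + 1):
        total_j = 0
        for w in combinations(range(n), k):
            # psi = number of ones in g_1 + ... + g_k (modulo 2)
            psi = int((D[:, list(w)].sum(axis=1) % 2).sum())
            total_j += abs(2 * psi - N)
        pattern[k] = total_j // N
    return pattern

# A 16-run, six-column regular design: g5 = g1+g2+g3 and g6 = g1+g2+g4 (mod 2).
base = np.array(list(product([0, 1], repeat=4)))
g5 = (base[:, 0] + base[:, 1] + base[:, 2]) % 2
g6 = (base[:, 0] + base[:, 1] + base[:, 3]) % 2
D = np.column_stack([base, g5, g6])

wlp = word_length_pattern(D)
resolution = min(k for k, a in wlp.items() if a > 0)
print(wlp, resolution)
```

Because $J_k$ uses an absolute value, the pattern is unchanged when a column is replaced by its complement, so the WLP alone cannot separate BP designs that share the same OP design.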
With the primary knowledge above, the next section builds connections between Sequences (4) and (5).

3. Main Results

Note that both Sequences (4) and (5) are closely related to the column collections $\omega_s$ of a regular $2^{n-m}$ design. Recalling the definition of $T_{1s}$ in (3), when an $s$-column collection $\omega_s$ contains no defining word, all of the binary $s$-tuples appear exactly $N/2^s$ times in $\omega_s$, which means that such an $\omega_s$ has $\alpha(\omega_s) = N/2^s$ and contributes $(N/2^s)^2$ to $T_{1s}$. When an $s$-column collection $\omega_s$ contains some defining words, i.e., the columns in $\omega_s$ form some defining words, the analysis of the contribution of such an $\omega_s$ to $T_{1s}$ becomes complex. Similar analyses are also required when considering $T_{2s}$. Therefore, it is necessary to investigate the possible cases of a given collection $\omega_s$ containing defining words, so as to calculate $K_s$. Suppose a regular $2^{n-m}$ design has a resolution of $t$. Lemma 1 presents the maximum number of defining words in each of its $(t+2)$-column collections $\omega_{t+2}$.
Lemma 1.
Suppose $D$ is a regular $2^{n-m}$ design of resolution $t$ and $\omega_{t+2} \in \Omega_{t+2}(D)$, where $t + 2 \le n$. Then
(i) $\omega_{t+2}$ contains at most two independent defining words for $t = 3, 4$;
(ii) $\omega_{t+2}$ contains at most one defining word for $t \ge 5$.
Proof. 
(i) For $t = 3$, denote $\omega_5 = \{g_1, g_2, g_3, g_4, g_5\}$. If $\omega_5$ contains three independent defining words $W_1$, $W_2$, and $W_3$, then $W_1$, $W_2$, and $W_3$ generate another four defining words $W_1W_2$, $W_1W_3$, $W_2W_3$, and $W_1W_2W_3$. These seven defining words contain at least 21 letters (columns) since each defining word contains at least $t$ $(= 3)$ letters. Note that a letter appears at most four times among the seven defining words. Then the seven defining words contain no more than 20 letters since there are only five columns in $\omega_5$. This contradiction shows the validity of (i) for $t = 3$. For $t = 4$, the proof is similar.
(ii) If $\omega_{t+2}$ contains two defining words $W_1$ and $W_2$ with lengths $t_1$ and $t_2$, respectively, then $W_1$ and $W_2$ have at least $t_1 + t_2 - (t + 2)$ common letters. Since $t_1 \ge t$ and $t_2 \ge t$, the length of the defining word $W_1W_2$ is at most $(t_1 + t_2) - 2[t_1 + t_2 - (t + 2)] \le 4$, which contradicts $t \ge 5$. This completes the proof of (ii).    □
Before proceeding to the main results of this section, we first introduce a lemma that is a refinement of the results from [12,21].
Lemma 2.
Denote $g_1, g_2, \ldots, g_i, g_{i+1}, \ldots, g_j$ as $j$ columns from a regular $2^{n-m}$ design $D$ of resolution $t$, where $3 \le t \le i < j \le n$. Suppose, among these $j$ columns, only the first $i$ correspond to a defining word, i.e., $g_1 + g_2 + \cdots + g_i$ (modulo 2) $= 0_N$ or $1_N$. Then, we have the following:
(i) The rows in the $i$-column matrix $(g_1, g_2, \ldots, g_i)$ must consist of $N/2^{i-1}$ copies of a half replicate of the full $2^i$ factorial design;
(ii) Furthermore, in the matrix $(g_1, g_2, \ldots, g_i, g_{i+1}, \ldots, g_j)$, for the $N/2^{i-1}$ copies of each distinct row of the matrix $(g_1, g_2, \ldots, g_i)$, all the $2^{j-i}$ distinct rows of the matrix $(g_{i+1}, \ldots, g_j)$ appear equally often, $N/2^{j-1}$ times each.
In (i) of Lemma 2, the half replicate of the full $2^i$ factorial design that the $i$-column matrix $(g_1, g_2, \ldots, g_i)$ contains depends on whether $g_1 + g_2 + \cdots + g_i = 0_N$ or $1_N$ (modulo 2). This is addressed in detail in Remark 1.
Remark 1.
In Lemma 2, if $g_1 + g_2 + \cdots + g_i = 0_N$ (modulo 2), then all the possible $i$-tuples that contain an even number of ones appear $N/2^{i-1}$ times in the $i$-column matrix $(g_1, g_2, \ldots, g_i)$. If $g_1 + g_2 + \cdots + g_i = 1_N$ (modulo 2), then all the possible $i$-tuples that contain an odd number of ones appear $N/2^{i-1}$ times in the $i$-column matrix $(g_1, g_2, \ldots, g_i)$.
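Lemma 2 (i) and Remark 1 can be checked numerically on a small case. The sketch below is only an illustration under assumed settings ($N = 16$, $i = 4$, so each admissible row should appear $N/2^{i-1} = 2$ times); the helper name `row_counts` is hypothetical.

```python
from collections import Counter
from itertools import product
import numpy as np

# N = 16 runs from four independent columns; only the first three enter the word.
base = np.array(list(product([0, 1], repeat=4)))
g1, g2, g3 = base[:, 0], base[:, 1], base[:, 2]
word_sum = (g1 + g2 + g3) % 2

def row_counts(g4):
    """Count how often each distinct row of (g1, g2, g3, g4) occurs."""
    rows = np.column_stack([g1, g2, g3, g4])
    return Counter(tuple(int(x) for x in r) for r in rows)

even = row_counts(word_sum)       # g1+g2+g3+g4 = 0_N: even number of ones per row
odd = row_counts(1 - word_sum)    # g1+g2+g3+g4 = 1_N: odd number of ones per row

print(sorted(even.items()))
print(sorted(odd.items()))
```

In particular, the all-ones row $(1,1,1,1)$ occurs only in the first case, which is what drives the difference between $\phi = 0_N$ and $\phi = 1_N$ words of even length in the proofs that follow.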
For a defining word $W$, let $\phi(W)$ denote the vector generated by taking component-wise sums (modulo 2) of the columns in the defining word. Then, $\phi = 0_N$ or $1_N$. Denote $A_i^0$ and $A_i^1$ as the numbers of length-$i$ defining words with $\phi = 0_N$ and $1_N$, respectively, where $i = 3, 4, \ldots, n$. Clearly, $A_i^0 + A_i^1 = A_i$. Denote $A_4^{0,0}$ as the number of pairs of length-four defining words that have two common columns and $\phi = 0_N$, and $A_4^{1,1}$ as the number of pairs of length-four defining words that have two common columns and $\phi = 1_N$. Theorem 1 builds the bridge between $K_{t+1}$ and the WLP for $t = 4$.
Theorem 1.
Suppose $D$ is a regular $2^{n-m}$ design with resolution $t = 4$; then
$$K_5 = (1/16)^2 \left[ 5\binom{n}{5} + (n-4)(2n-15)A_4 + 20(n-4)A_4^0 - 4A_4^{0,0} - 8A_4^{1,1} \right] + (1/16)^2 \left[ 5(n-6)A_5 + 20A_5^1 + 6A_6 \right]. \tag{6}$$
Proof. 
We calculate $T_{15}$ and $T_{25}$ separately. For $T_{15}$, suppose $\omega_5 = \{g_1, g_2, g_3, g_4, g_5\} \in \Omega_5(D)$. Since $D$ has resolution 4, $\omega_5$ contains at most one defining word. There are five scenarios as follows:
(a1) $\omega_5$ contains one length-four defining word $W$ with $\phi(W) = 0_N$;
(a2) $\omega_5$ contains one length-four defining word $W$ with $\phi(W) = 1_N$;
(a3) $\omega_5$ contains one length-five defining word $W$ with $\phi(W) = 0_N$;
(a4) $\omega_5$ contains one length-five defining word $W$ with $\phi(W) = 1_N$;
(a5) The five columns in $\omega_5$ are independent of each other.
For (a1), suppose the defining word is $g_1 + g_2 + g_3 + g_4 = 0_N$ (modulo 2) without loss of generality. According to Lemma 2 (i) and Remark 1, the rows that consist of four ones appear $N/2^3$ times in the four-column matrix $(g_1, g_2, g_3, g_4)$. Therefore, by Lemma 2 (ii), the rows consisting entirely of ones appear $N/2^4$ times in the five-column matrix $(g_1, g_2, g_3, g_4, g_5)$, i.e., $\alpha(\omega_5) = N/16$.
For (a2), suppose the defining word is $g_1 + g_2 + g_3 + g_4 = 1_N$ (modulo 2) without loss of generality. According to Lemma 2 (i) and Remark 1, the four-column matrix $(g_1, g_2, g_3, g_4)$ contains only rows that consist of an odd number of ones. This implies that none of the rows in the matrix $(g_1, g_2, g_3, g_4, g_5)$ consists entirely of ones, i.e., $\alpha(\omega_5) = 0$.
With similar arguments to (a1) and (a2), we can obtain $\alpha(\omega_5) = 0$ and $\alpha(\omega_5) = N/16$ for (a3) and (a4), respectively. It is obvious that $\alpha(\omega_5) = N/32$ for (a5). The numbers of $\omega_5$'s belonging to (a1)–(a5) are $(n-4)A_4^0$, $(n-4)A_4^1$, $A_5^0$, $A_5^1$, and $\binom{n}{5} - (n-4)A_4^0 - (n-4)A_4^1 - A_5^0 - A_5^1$, respectively. The analysis above yields
$$T_{15} = (N/16)^2 (n-4)A_4^0 + (N/16)^2 A_5^1 + (N/32)^2 \left[ \binom{n}{5} - (n-4)A_4 - A_5 \right]. \tag{7}$$
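The values of $\alpha(\omega_5)$ in scenarios (a1)–(a5) can be confirmed on a small example. The sketch below assumes $N = 32$ runs generated by five independent columns and appends sum columns (or their complements) to create each scenario; the variable names are hypothetical, and the example is a numerical check rather than part of the proof.

```python
from itertools import product
import numpy as np

G = np.array(list(product([0, 1], repeat=5)))   # independent columns g1..g5
N = G.shape[0]                                  # N = 32
g = [G[:, j] for j in range(5)]
three_sum = (g[0] + g[1] + g[2]) % 2            # completes a length-four word
four_sum = (g[0] + g[1] + g[2] + g[3]) % 2      # completes a length-five word

def alpha(*cols):
    """Number of runs in which every listed column equals 1."""
    return int(np.all(np.column_stack(cols) == 1, axis=1).sum())

cases = {
    "a1": alpha(g[0], g[1], g[2], three_sum, g[3]),      # length 4, phi = 0_N
    "a2": alpha(g[0], g[1], g[2], 1 - three_sum, g[3]),  # length 4, phi = 1_N
    "a3": alpha(g[0], g[1], g[2], g[3], four_sum),       # length 5, phi = 0_N
    "a4": alpha(g[0], g[1], g[2], g[3], 1 - four_sum),   # length 5, phi = 1_N
    "a5": alpha(*g),                                     # five independent columns
}
print(N, cases)
```

The counts agree with $N/16$, $0$, $0$, $N/16$, and $N/32$ for (a1) through (a5), respectively, when $N = 32$.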
Now, we consider calculating $T_{25}$. According to Lemma 1 (i) for $t = 4$, there are nine possibilities for $\omega_6 \in \Omega_6(D)$:
(b1) $\omega_6$ contains two independent defining words $W_1$ and $W_2$, which generate the third defining word $W_3 = W_1W_2$. Each of the three defining words has a length of four; one has $\phi = 0_N$ and the other two have $\phi = 1_N$;
(b2) $\omega_6$ contains two independent defining words $W_1$ and $W_2$, which generate the third defining word $W_3 = W_1W_2$. Each of the three defining words has a length of four and $\phi = 0_N$;
(b3) $\omega_6$ contains only one defining word with a length of four and $\phi = 0_N$;
(b4) $\omega_6$ contains only one defining word with a length of four and $\phi = 1_N$;
(b5) $\omega_6$ contains only one defining word with a length of five and $\phi = 0_N$;
(b6) $\omega_6$ contains only one defining word with a length of five and $\phi = 1_N$;
(b7) $\omega_6$ contains only one defining word with a length of six and $\phi = 0_N$;
(b8) $\omega_6$ contains only one defining word with a length of six and $\phi = 1_N$;
(b9) The six columns in $\omega_6$ are independent of each other.
For (b1)–(b9), denote $\omega_5$ as any five-column subset of $\omega_6$, i.e., $\omega_5 \in \Omega_5(\omega_6)$. Now we proceed to investigate the values of $2\alpha(\omega_6) - \alpha(\omega_5)$ and the number of $\omega_6$'s in each of the cases (b1)–(b9).
For (b1), since there is a length-four defining word, say $W_1$, with $\phi(W_1) = 1_N$ in $\omega_6$, none of the rows in the matrix consisting of the columns involved in $W_1$ consists entirely of ones, and thus $\alpha(\omega_6) = 0$. With careful checking, each $\omega_5 \in \Omega_5(\omega_6)$ must contain only one defining word, say $W$, which has a length of four with either $\phi(W) = 0_N$ or $1_N$. For the $\omega_5$'s of the former case, we have $\alpha(\omega_5) = N/2^4$ and there are two such $\omega_5$'s in $\Omega_5(\omega_6)$. For the $\omega_5$'s of the latter case, we have $\alpha(\omega_5) = 0$ and there are four such $\omega_5$'s in $\Omega_5(\omega_6)$. Note that the $\omega_6$ in (b1) contains a pair of length-four defining words that have two columns in common and $\phi = 1_N$. Recalling the meaning of $A_4^{1,1}$, we conclude that the number of $\omega_6$'s belonging to (b1) is $A_4^{1,1}$.
For (b2), since $\omega_6$ contains three length-four defining words, each of which has $\phi = 0_N$, we have $\alpha(\omega_6) = N/2^5$ according to Lemma 2 (ii). Note that each $\omega_5 \in \Omega_5(\omega_6)$ must contain one of these three defining words. From Lemma 2 (ii), we have $\alpha(\omega_5) = N/2^4$ for each $\omega_5 \in \Omega_5(\omega_6)$ and there are six such $\omega_5$'s in $\Omega_5(\omega_6)$. Note that there are three pairs of length-four defining words in $\omega_6$ and each pair has two columns in common. Thus, there are $A_4^{0,0}/3$ $\omega_6$'s in total in (b2).
For (b3), it is easy to obtain that $\alpha(\omega_6) = N/2^5$. Each $\omega_5 \in \Omega_5(\omega_6)$ contains either a length-four defining word with $\phi = 0_N$ or five independent columns. For the $\omega_5$'s of the former case, we have $\alpha(\omega_5) = N/2^4$ and there are two such $\omega_5$'s in $\Omega_5(\omega_6)$. For the $\omega_5$'s of the latter case, we have $\alpha(\omega_5) = N/2^5$ and there are four such $\omega_5$'s in $\Omega_5(\omega_6)$. Now we investigate the number of $\omega_6$'s that belong to (b3). The four columns of each of the $A_4^0$ defining words joined with any two of the remaining $n - 4$ columns of $D$ induce an $\omega_6 \in \Omega_6(D)$. Notably, the three defining words in each $\omega_6$ of case (b2) induce exactly the $\omega_6$ itself. Similarly, the $\omega_6$ in case (b1) can be induced by the length-four defining word with $\phi = 0_N$ in it. Therefore, the number of $\omega_6$'s in case (b3) is $\binom{n-4}{2}A_4^0 - A_4^{0,0} - 2A_4^{1,1}$.
For (b4), we have $\alpha(\omega_6) = 0$ as there is a length-four defining word with $\phi = 1_N$. With a similar analysis to (b3), among the six $\omega_5$'s in $\Omega_5(\omega_6)$, two of them have $\alpha(\omega_5) = 0$ and four of them have $\alpha(\omega_5) = N/2^5$, depending on whether the $\omega_5$ contains a length-four defining word with $\phi = 1_N$ or not. The four columns of each of the $A_4^1$ defining words joined with any two of the remaining $n - 4$ columns of $D$ induce an $\omega_6$ in $\Omega_6(D)$. One thing to note is that the two length-four defining words with $\phi = 1_N$ in each $\omega_6$ of case (b1) induce exactly the $\omega_6$ itself. The number of $\omega_6$'s belonging to (b4) is $\binom{n-4}{2}A_4^1 - 2A_4^{1,1}$.
For cases (b5)–(b8), the results on the values of the $\alpha(\omega_6)$'s, the number of $\omega_6$'s belonging to each case, the values of the $\alpha(\omega_5)$'s, and the number of $\omega_5$'s in each $\Omega_5(\omega_6)$ that share the same value of $\alpha(\omega_5)$ are straightforward. These results are summarized in Table 1 along with those for cases (b1)–(b4), where the notation $n_{\omega_5}$ represents the number of $\alpha(\omega_5)$'s of each value. Note that each $\omega_6$ in (b9) has $\alpha(\omega_6) = N/2^6$ and $\alpha(\omega_5) = N/2^5$ for each $\omega_5 \in \Omega_5(\omega_6)$; this results in $2\alpha(\omega_6) - \alpha(\omega_5) = 0$. Therefore, there is no need to consider case (b9) when calculating $T_{25}$.
From Table 1, we obtain
$$T_{25} = (N/32)^2 \left[ 4\binom{n-4}{2}A_4 - 4A_4^{0,0} - 8A_4^{1,1} + 5(n-5)A_5 + 6A_6 \right]. \tag{8}$$
With Equations (7) and (8), we obtain
$$K_5 = (4/N^2)(5T_{15} + T_{25}) = (1/16)^2 \left[ 5\binom{n}{5} + (n-4)(2n-15)A_4 + 20(n-4)A_4^0 - 4A_4^{0,0} - 8A_4^{1,1} \right] + (1/16)^2 \left[ 5(n-6)A_5 + 20A_5^1 + 6A_6 \right].$$
This completes the proof.    □
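The final substitution in the proof of Theorem 1 can be spelled out term by term. The worked derivation below only rearranges Equations (7) and (8) as stated; it introduces no new quantities.

```latex
\begin{align*}
\frac{4}{N^2}\left(\frac{N}{16}\right)^2 &= \frac{4}{16^2},
\qquad
\frac{4}{N^2}\left(\frac{N}{32}\right)^2 = \frac{1}{16^2}, \\[4pt]
K_5 = \frac{4}{N^2}\left(5T_{15} + T_{25}\right)
 &= \frac{1}{16^2}\Big\{ 20(n-4)A_4^0 + 20A_5^1
    + 5\Big[\tbinom{n}{5} - (n-4)A_4 - A_5\Big] \\
 &\qquad\quad + 4\tbinom{n-4}{2}A_4 - 4A_4^{0,0} - 8A_4^{1,1}
    + 5(n-5)A_5 + 6A_6 \Big\}, \\[4pt]
4\tbinom{n-4}{2} - 5(n-4) &= 2(n-4)(n-5) - 5(n-4) = (n-4)(2n-15), \\
5(n-5) - 5 &= 5(n-6).
\end{align*}
```

Collecting the coefficients of $A_4$ and $A_5$ in this way gives the two bracketed groups of Equation (6).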
Theorem 2 below builds the relationship between $K_{t+1}$ and the WLP for $t \ge 5$.
Theorem 2.
Suppose $D$ is a regular $2^{n-m}$ design with resolution $t \ge 5$; then
$$K_{t+1} = (1/2^{2t}) \left\{ 4(t+1)(n-t)A_t^1 + 4(t+1)A_{t+1}^0 + \left[ t\binom{n-t}{2} - (t+1)(n-t) \right] A_t \right\} + (1/2^{2t}) \left\{ (t+1)(n-t-2)A_{t+1} + (t+2)A_{t+2} + (t+1)\binom{n}{t+1} \right\}$$
for an odd $t \ge 5$, and
$$K_{t+1} = (1/2^{2t}) \left\{ 4(t+1)(n-t)A_t^0 + 4(t+1)A_{t+1}^1 + \left[ t\binom{n-t}{2} - (t+1)(n-t) \right] A_t \right\} + (1/2^{2t}) \left\{ (t+1)(n-t-2)A_{t+1} + (t+2)A_{t+2} + (t+1)\binom{n}{t+1} \right\}$$
for an even $t \ge 5$.
Proof. 
To calculate $T_{1s}$ in (3) for $s = t + 1$, consider five possibilities for $\omega_{t+1} \in \Omega_{t+1}(D)$:
(c1) $\omega_{t+1}$ contains only one defining word with length $t$ and $\phi = 0_N$;
(c2) $\omega_{t+1}$ contains only one defining word with length $t$ and $\phi = 1_N$;
(c3) $\omega_{t+1}$ contains only one defining word with length $t + 1$ and $\phi = 0_N$;
(c4) $\omega_{t+1}$ contains only one defining word with length $t + 1$ and $\phi = 1_N$;
(c5) $\omega_{t+1}$ consists of $t + 1$ columns that are independent of each other.
With similar analyses to Theorem 1, we have Table 2 and Table 3 for calculating $T_{1(t+1)}$ for an odd and even $t \ge 5$, respectively. With Table 2 and Table 3, we obtain that
$$T_{1(t+1)} = (N/2^{t+1})^2 \left[ 4(n-t)A_t^1 + 4A_{t+1}^0 + \binom{n}{t+1} - (n-t)A_t - A_{t+1} \right] \tag{9}$$
for an odd $t \ge 5$, and
$$T_{1(t+1)} = (N/2^{t+1})^2 \left[ 4(n-t)A_t^0 + 4A_{t+1}^1 + \binom{n}{t+1} - (n-t)A_t - A_{t+1} \right] \tag{10}$$
for an even $t \ge 5$.
Considering $T_{2(t+1)}$, there are seven possibilities for $\omega_{t+2} \in \Omega_{t+2}(D)$:
(d1) $\omega_{t+2}$ contains only one defining word and its length is $t$ with $\phi = 0_N$;
(d2) $\omega_{t+2}$ contains only one defining word and its length is $t$ with $\phi = 1_N$;
(d3) $\omega_{t+2}$ contains only one defining word and its length is $t + 1$ with $\phi = 0_N$;
(d4) $\omega_{t+2}$ contains only one defining word and its length is $t + 1$ with $\phi = 1_N$;
(d5) $\omega_{t+2}$ contains only one defining word and its length is $t + 2$ with $\phi = 0_N$;
(d6) $\omega_{t+2}$ contains only one defining word and its length is $t + 2$ with $\phi = 1_N$;
(d7) $\omega_{t+2}$ consists of $t + 2$ columns that are independent of each other.
With similar analyses to Theorem 1, we have Table 4 and Table 5 for calculating $T_{2(t+1)}$ for an odd and even $t \ge 5$, respectively. Note that each $\omega_{t+2}$ in (d7) has $\alpha(\omega_{t+2}) = N/2^{t+2}$ and $\alpha(\omega_{t+1}) = N/2^{t+1}$ for each $\omega_{t+1} \in \Omega_{t+1}(\omega_{t+2})$; this results in $2\alpha(\omega_{t+2}) - \alpha(\omega_{t+1}) = 0$. Therefore, there is no need to consider case (d7) when calculating $T_{2(t+1)}$.
With Table 4 and Table 5, we obtain that
$$T_{2(t+1)} = (N/2^{t+1})^2 \left[ t\binom{n-t}{2}A_t + (t+1)(n-t-1)A_{t+1} + (t+2)A_{t+2} \right] \tag{11}$$
for both an odd and even $t \ge 5$.
With Equations (9)–(11), we have
$$K_{t+1} = (4/N^2)\left[ (t+1)T_{1(t+1)} + T_{2(t+1)} \right] = (1/2^{2t}) \left\{ 4(t+1)(n-t)A_t^1 + 4(t+1)A_{t+1}^0 + \left[ t\binom{n-t}{2} - (t+1)(n-t) \right] A_t \right\} + (1/2^{2t}) \left\{ (t+1)(n-t-2)A_{t+1} + (t+2)A_{t+2} + (t+1)\binom{n}{t+1} \right\}$$
for an odd $t \ge 5$, and
$$K_{t+1} = (4/N^2)\left[ (t+1)T_{1(t+1)} + T_{2(t+1)} \right] = (1/2^{2t}) \left\{ 4(t+1)(n-t)A_t^0 + 4(t+1)A_{t+1}^1 + \left[ t\binom{n-t}{2} - (t+1)(n-t) \right] A_t \right\} + (1/2^{2t}) \left\{ (t+1)(n-t-2)A_{t+1} + (t+2)A_{t+2} + (t+1)\binom{n}{t+1} \right\}$$
for an even $t \ge 5$. This completes the proof.    □
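The combination step at the end of the proof of Theorem 2 rests on one prefactor identity and two coefficient simplifications. They are written out below as a worked check of the algebra, using only the expressions for $T_{1(t+1)}$ and $T_{2(t+1)}$ given in the proof.

```latex
\begin{align*}
\frac{4}{N^2}\left(\frac{N}{2^{t+1}}\right)^2 &= \frac{4}{2^{2t+2}} = \frac{1}{2^{2t}}, \\[4pt]
\text{coefficient of } A_t:\quad
 & t\tbinom{n-t}{2} - (t+1)(n-t), \\
\text{coefficient of } A_{t+1}:\quad
 & (t+1)(n-t-1) - (t+1) = (t+1)(n-t-2), \\
\text{constant term}:\quad
 & (t+1)\tbinom{n}{t+1}.
\end{align*}
```

The negative parts of the $A_t$ and $A_{t+1}$ coefficients come from the term $\binom{n}{t+1} - (n-t)A_t - A_{t+1}$ in $T_{1(t+1)}$ after multiplication by $t + 1$.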
Remark 2.
Theorems 1 and 2 establish relationships between the K-aberration and WLP that are further developments based on the work in [12]. Theorems 1 and 2 help narrow down the search for optimal regular $2^{n-m}$ designs. Moreover, in some situations, Theorems 1 and 2 are capable of identifying the optimal ones. This point will be demonstrated in Section 4.

4. Applications

It is worth noting that the concept of isomorphism for the designs under BP is different from that under OP. Under OP, two designs are called isomorphic if one can be obtained from the other by column-permuting, row-permuting, or symbol-switching. However, the symbols of the two-level designs are not interchangeable under BP. Hence, two designs are called isomorphic under BP if one can be obtained from the other by column-permuting or row-permuting. Hereafter, we use the terms OP regular $2^{n-m}$ designs and BP regular $2^{n-m}$ designs to distinguish the two settings. Clearly, switching symbols of some columns of OP regular $2^{n-m}$ designs may result in nonisomorphic BP designs. In the following, we illustrate how to find BP regular $2^{n-m}$ designs that have desirable K-aberration characteristics by using the catalogs of nonisomorphic OP regular $2^{n-m}$ designs displayed in [22].
Consider finding desirable BP regular $2^{16-10}$ designs under K-aberration. By checking the catalogs of nonisomorphic OP designs displayed in [22], all the OP regular $2^{16-10}$ designs have a resolution of either $t = 3$ or 4. According to Theorem 1 in [12], any OP regular $2^{16-10}$ design with a resolution of $t = 4$ has a smaller $K_2$ than those with a resolution of $t = 3$, noting that $K_2 = 2\binom{n}{2}(1/2^{4-2}) = 60$ for $t = 4$ and $K_2 = (1/2^{2t-4})\left[(t-1)\binom{n}{t-1} + tA_t\right] = 60 + 3A_3/4$ for $t = 3$. Among the OP regular $2^{16-10}$ designs of resolution $t = 4$, the designs with the minimum $A_4$ have a smaller $K_3$ according to Theorem 1 (b) in [12]. According to [22], the unique OP regular $2^{16-10}$ design with the minimum $A_4$, denoted as $D_0$, is determined by the following ten independent defining words: $g_1g_2g_3g_4g_5g_7$, $g_1g_2g_3g_6g_8$, $g_1g_4g_6g_9$, $g_1g_2g_5g_6g_{10}$, $g_1g_3g_4g_{11}$, $g_1g_3g_5g_{12}$, $g_1g_2g_4g_{13}$, $g_3g_5g_6g_{14}$, $g_2g_4g_5g_6g_{15}$, and $g_2g_3g_5g_{16}$, where $g_1, \ldots, g_6$ are the 1st, 2nd, $2^2$th, $\ldots$, $2^5$th columns of the matrix $H_q$ in (1) with $q = 6$. Whether the $\phi$'s of these ten defining words equal $0_N$ or $1_N$ determines $2^{10}$ BP regular designs that may have different K-aberration performances. According to Theorem 2 of [12], among the $2^{10}$ BP designs, those with the minimum $A_4^0$ have a smaller $K_4$. For example, the following two BP regular $2^{16-10}$ designs $D_1$ and $D_2$ have the minimum $A_4^0 = 17$, which results in the minimum $K_4 = 153.8906$:
$D_1$: $\phi(g_1g_2g_3g_4g_5g_7) = 1_N$, $\phi(g_1g_2g_3g_6g_8) = 0_N$, $\phi(g_1g_4g_6g_9) = 1_N$, $\phi(g_1g_2g_5g_6g_{10}) = 1_N$, $\phi(g_1g_3g_4g_{11}) = 1_N$, $\phi(g_1g_3g_5g_{12}) = 0_N$, $\phi(g_1g_2g_4g_{13}) = 0_N$, $\phi(g_3g_5g_6g_{14}) = 1_N$, $\phi(g_2g_4g_5g_6g_{15}) = 0_N$, $\phi(g_2g_3g_5g_{16}) = 1_N$.
$D_2$: $\phi(g_1g_2g_3g_4g_5g_7) = 1_N$, $\phi(g_1g_2g_3g_6g_8) = 0_N$, $\phi(g_1g_4g_6g_9) = 1_N$, $\phi(g_1g_2g_5g_6g_{10}) = 1_N$, $\phi(g_1g_3g_4g_{11}) = 1_N$, $\phi(g_1g_3g_5g_{12}) = 0_N$, $\phi(g_1g_2g_4g_{13}) = 0_N$, $\phi(g_3g_5g_6g_{14}) = 0_N$, $\phi(g_2g_4g_5g_6g_{15}) = 0_N$, $\phi(g_2g_3g_5g_{16}) = 1_N$.
Although $D_1$ and $D_2$ have the same value of $K_4$, they can be discriminated with respect to $K_5$ by applying Theorem 1. $D_2$ has the same $A_4^{0,0}$ and $A_4^{1,1}$ as $D_1$ but a smaller $A_5^1$. This means that $D_2$ has a smaller $K_5$ than $D_1$ according to Theorem 1. As a confirmation, we calculate the $K_s$ values of the previously stated $2^{10}$ BP regular $2^{16-10}$ designs, and it transpires that $D_2$ is one of the K-aberration optimal BP regular $2^{16-10}$ designs.
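The screening over the $2^{10}$ choices of $\phi$ can be sketched without building any design matrix. The code below is a minimal illustration that relies on one observation: the $\phi$ of a product of defining words is the modulo-2 sum of the $\phi$'s of its factors, because shared columns cancel. The function names are hypothetical, the generators of $D_0$ and the $\phi$ assignments of $D_1$ and $D_2$ are transcribed from the text above, and the values reported in the text are not asserted by the sketch.

```python
def word_table(generators):
    """Map each nonzero generator bitmask to the column set of the product word."""
    table = {}
    for mask in range(1, 2 ** len(generators)):
        cols = set()
        for i, gen in enumerate(generators):
            if (mask >> i) & 1:
                cols ^= set(gen)  # symmetric difference: shared columns cancel
        table[mask] = frozenset(cols)
    return table

def a0_count(masks, phi_mask):
    """Number of the given words whose phi is 0_N under the assignment phi_mask."""
    return sum(1 for m in masks if bin(m & phi_mask).count("1") % 2 == 0)

def screen_a0(generators, length):
    """Return (minimum A_length^0, list of phi bitmasks attaining it)."""
    masks = [m for m, cols in word_table(generators).items() if len(cols) == length]
    best, winners = None, []
    for phi_mask in range(2 ** len(generators)):  # bit i = phi of generator i
        value = a0_count(masks, phi_mask)
        if best is None or value < best:
            best, winners = value, [phi_mask]
        elif value == best:
            winners.append(phi_mask)
    return best, winners

def to_mask(phis):
    return sum(bit << i for i, bit in enumerate(phis))

# Generators of D_0 as column-index sets, in the order listed in the text.
D0_GENERATORS = [
    {1, 2, 3, 4, 5, 7}, {1, 2, 3, 6, 8}, {1, 4, 6, 9}, {1, 2, 5, 6, 10},
    {1, 3, 4, 11}, {1, 3, 5, 12}, {1, 2, 4, 13}, {3, 5, 6, 14},
    {2, 4, 5, 6, 15}, {2, 3, 5, 16},
]
PHI_D1 = (1, 0, 1, 1, 1, 0, 0, 1, 0, 1)
PHI_D2 = (1, 0, 1, 1, 1, 0, 0, 0, 0, 1)

four_words = [m for m, c in word_table(D0_GENERATORS).items() if len(c) == 4]
print(a0_count(four_words, to_mask(PHI_D1)), a0_count(four_words, to_mask(PHI_D2)))
best, winners = screen_a0(D0_GENERATORS, 4)
print(best, len(winners))
```

Designs that tie on $A_4^0$ can then be compared on the additional counts entering Theorem 1, instead of recomputing every $K_s$ from the definition.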
Here is an example of the application of Theorem 2. Consider finding BP regular $2^{10-3}$ designs that have desirable K-aberration characteristics. By checking the catalogs of nonisomorphic OP regular $2^{10-3}$ designs provided in [22], we only need to consider the OP regular $2^{10-3}$ design, denoted as $D_3$, determined by the independent defining words $g_1g_2g_3g_4g_5g_8$, $g_1g_2g_4g_6g_9$, and $g_1g_2g_3g_6g_7g_{10}$, since it has $A_4 = 0$ and the minimum $A_5$ among all the nonisomorphic regular $2^{10-3}$ designs, where $g_i$ is the $2^{i-1}$th column of the matrix $H_q$ in (1) with $q = 7$, $i = 1, \ldots, 7$. There are $2^3$ BP regular $2^{10-3}$ designs associated with $D_3$ depending on whether the previously mentioned three defining words are equal to $0_N$ or $1_N$. Among these regular BP $2^{10-3}$ designs, those with the minimum $A_5^0$ have the minimum $K_5$ according to Theorem 2 of [12]. For example, the regular BP $2^{10-3}$ design, denoted as $D_4$, which is determined by the defining words $g_1g_2g_3g_4g_5g_8 = 0_N$, $g_1g_2g_4g_6g_9 = 0_N$, and $g_1g_2g_3g_6g_7g_{10} = 0_N$, has the minimum $A_5^0 = 3$ and hence the minimum $K_5 = 5.227$. At the same time, $D_4$ has the minimum $A_6^0 = 3$, which indicates that $D_4$ has the minimum $K_6$ according to Theorem 2. As a confirmation, we calculate the $K_s$ values of all the $2^3$ BP regular designs and find that $D_4$ is one of the K-aberration optimal BP regular $2^{10-3}$ designs.
The two examples above show that Theorems 1 and 2 can help to filter or select designs. Take Theorem 1 as an example. When experimenters need to compare $2^{n-m}$ designs with resolution 4, they can first calculate $A_4$, $A_4^0$, $A_4^{0,0}$, $A_4^{1,1}$, $A_5$, $A_5^1$, and $A_6$ from the defining words of the designs and then obtain $K_5$ from Theorem 1. In the first example, this saves time compared with calculating $K_5$ directly for each of the $2^{10}$ designs.
The results of Theorems 1 and 2 establish a relation between the K-values and the word length pattern of a design. To help practitioners in other fields apply the method, we provide the following algorithm (Algorithm 1) based on Theorem 1. The algorithm can be directly extended if Theorem 2 is required.
Algorithm 1: For a given n and m, consider a $2^{n-m}$ design D with resolution 4.
Step 1. List all the defining words of D.
Step 2. Calculate $A_4$, $A_4^0$, $A_4^{0,0}$, $A_4^{1,1}$, $A_5$, $A_5^1$, and $A_6$ according to the defining words.
Step 3. Calculate $K_5$ directly using (6).
For details on how to obtain the defining words of D, we refer the reader to [20].
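Steps 1 and 2 of Algorithm 1 can be partly sketched in code. The Python snippet below is a minimal illustration, not the authors' implementation. It assumes that a design is given by its independent defining words together with their values ($0_N$ coded as 0, $1_N$ coded as 1), that the product of two words is the symmetric difference of their letters with values added modulo 2, and that $A_t^0$ and $A_t^1$ count the defining words of length t equal to $0_N$ and $1_N$, respectively. The function names `defining_words` and `split_wlp` are hypothetical. The quantities $A_4^{0,0}$ and $A_4^{1,1}$ and formula (6) are not covered here.

```python
from collections import Counter
from itertools import combinations


def defining_words(generators):
    """Step 1: list all non-identity words generated by the independent words.

    `generators` is a list of (letters, value) pairs, where `letters` is the
    set of factor indices in a word and `value` is 0 or 1 (assumed coding of
    0_N and 1_N).
    """
    words = []
    for r in range(1, len(generators) + 1):
        for subset in combinations(generators, r):
            letters, value = frozenset(), 0
            for gen_letters, gen_value in subset:
                letters = letters ^ frozenset(gen_letters)  # symmetric difference
                value ^= gen_value  # values add modulo 2
            words.append((letters, value))
    return words


def split_wlp(generators):
    """Step 2 (partial): count words by (length, value).

    The returned Counter maps (t, 0) to A_t^0 and (t, 1) to A_t^1 under the
    assumptions stated above; missing keys count as zero.
    """
    counts = Counter()
    for letters, value in defining_words(generators):
        counts[(len(letters), value)] += 1
    return counts


# Design D_4 from the second example: all three independent words equal 0_N.
d4 = [
    ({1, 2, 3, 4, 5, 8}, 0),
    ({1, 2, 4, 6, 9}, 0),
    ({1, 2, 3, 6, 7, 10}, 0),
]
wlp = split_wlp(d4)
print(wlp[(4, 0)] + wlp[(4, 1)])  # A_4 = 0
print(wlp[(5, 0)])                # A_5^0 = 3
print(wlp[(6, 0)])                # A_6^0 = 3
```

For $D_4$ the counts agree with the values $A_4 = 0$, $A_5^0 = 3$, and $A_6^0 = 3$ quoted in the example. Enumerating all $2^m - 1$ words is feasible for the values of m considered here; for larger m a more economical enumeration would be needed.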

5. Concluding Remarks

In experiments where each factor has a null state or baseline level, the BP model has a natural interpretation, so finding the optimal fractional factorial designs under BP becomes important. However, the number of nonisomorphic designs under BP is much larger than that under OP, which makes finding the optimal designs difficult. Together with the results in [12], the present work helps narrow down the search for the optimal regular $2^{n-m}$ designs by bridging the K-values and the WLP. Nonregular designs are commonly used in various experiments because of the flexibility of their run sizes. Therefore, finding the optimal nonregular designs under BP deserves further study. As with regular designs, however, finding the optimal nonregular designs under BP is difficult, and an algorithm that reduces the set of candidate designs would be very useful.

Author Contributions

Conceptualization, S.Z.; methodology, S.Z. and M.Q.; writing—original draft preparation, M.Q.; writing—review and editing, S.Z.; funding acquisition, S.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 12171277.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Bose, R.C. On some connections between the design of experiments and information theory. Bull. Inst. Internat. Statist. 1961, 38, 257–271.
  2. Cheng, C.-S.; Tang, B. Theory of Nonregular Factorial Designs; CRC Press: Boca Raton, FL, USA, 2025.
  3. Dey, A.; Mukerjee, R. Fractional Factorial Plans; Wiley: New York, NY, USA, 2009.
  4. Kerr, K.F. Efficient $2^k$ factorial designs for blocks of size 2 with microarray applications. J. Qual. Technol. 2006, 38, 309–318.
  5. Glonek, G.F.V.; Solomon, P.J. Factorial and time course designs for cDNA microarray experiments. Biostatistics 2004, 5, 89–111.
  6. Banerjee, T.; Mukerjee, R. Optimal factorial designs for cDNA microarray experiments. Ann. Appl. Statist. 2008, 2, 366–385.
  7. Mukerjee, R.; Tang, B. Optimal fractions of two-level factorials under a baseline parameterization. Biometrika 2012, 99, 71–84.
  8. Mukerjee, R.; Huda, S. Approximate theory-aided robust efficient factorial fractions under baseline parametrization. Ann. Inst. Statist. Math. 2016, 68, 787–803.
  9. Liu, Y.; Ren, M.; Zhao, S.L. Robust and efficient factorial designs under baseline parametrization. Commun. Statist. Theory Methods 2025, 54, 1868–1879.
  10. Karunanayaka, R.C.; Tang, B. Compromise designs under baseline parameterization. J. Statist. Plann. Inference 2017, 190, 32–38.
  11. Li, P.; Miller, A.; Tang, B. Algorithmic search for baseline minimum aberration designs. J. Statist. Plann. Inference 2014, 149, 172–182.
  12. Miller, A.; Tang, B. Using regular fractions of two-level designs to find baseline designs. Statist. Sinica 2016, 26, 745–759.
  13. Mukerjee, R.; Tang, B. Optimal two-level regular designs under baseline parametrization via cosets and minimum moment aberration. Statist. Sinica 2016, 26, 1001–1019.
  14. Lin, C.Y.; Yang, P. Robust multistratum baseline design. Comput. Statist. Data Anal. 2018, 118, 98–111.
  15. Li, W.; Liu, M.Q.; Tang, B. A systematic construction of compromise designs under baseline parameterization. J. Statist. Plann. Inference 2022, 219, 33–42.
  16. Chen, A.; Sun, C.Y.; Tang, B. Selecting baseline designs using a minimum aberration criterion when some two-factor interactions are important. Statist. Theory Relat. Fields 2021, 5, 95–101.
  17. Sun, C.Y.; Tang, B. Relationship between orthogonal and baseline parameterizations and its application to design constructions. Statist. Sinica 2022, 32, 239–250.
  18. Yan, Z.H.; Zhao, S.L. Optimal fractions of three-level factorials under a baseline parameterization. Statist. Probab. Lett. 2023, 202, 109902.
  19. Yan, Z.H.; Zhao, S.L. Optimal s-level fractional factorial designs under baseline parameterization. J. Statist. Plann. Inference 2025, 236, 106242.
  20. Fries, A.; Hunter, W.G. Minimum aberration $2^{k-p}$ designs. Technometrics 1980, 22, 601–608.
  21. Deng, L.-Y.; Tang, B. Generalized resolution and minimum aberration criteria for Plackett-Burman and other nonregular factorial designs. Statist. Sinica 1999, 9, 1071–1082.
  22. Xu, H. Algorithmic Construction of Efficient Fractional Factorial Designs with Large Run Sizes. Available online: http://www.stat.ucla.edu/~hqxu/pub/ffd2r/ (accessed on 30 March 2025).
Table 1. $\alpha(\omega_6)$ and $\alpha(\omega_5)$ for calculating $T_{25}$ in Theorem 1.

| Scenario | $\alpha(\omega_6)$ | $\alpha(\omega_5) : n_{\omega_5}$ | $n_{\omega_6}$ |
| --- | --- | --- | --- |
| (b1) | $0$ | $N/16 : 2$ and $0 : 4$ | $A_4^{1,1}$ |
| (b2) | $N/32$ | $N/16 : 6$ | $A_4^{0,0}/3$ |
| (b3) | $N/32$ | $N/16 : 2$ and $N/32 : 4$ | $\binom{n-4}{2} A_4^0 - A_4^{0,0} - 2A_4^{1,1}$ |
| (b4) | $0$ | $0 : 2$ and $N/32 : 4$ | $\binom{n-4}{2} A_4^1 - 2A_4^{1,1}$ |
| (b5) | $0$ | $N/32 : 5$ and $0 : 1$ | $(n-5) A_5^0$ |
| (b6) | $N/32$ | $N/32 : 5$ and $N/16 : 1$ | $(n-5) A_5^1$ |
| (b7) | $N/32$ | $N/32 : 6$ | $A_6^0$ |
| (b8) | $0$ | $N/32 : 6$ | $A_6^1$ |

$n_{\omega_5}$: the number of $\alpha(\omega_5)$s of each value for $\omega_6$. $n_{\omega_6}$: the number of $\omega_6$s in (b1)–(b8).
Table 2. $\alpha(\omega_{t+1})$ for calculating $T_{1(t+1)}$ for an odd $t \geq 5$ in Theorem 2.

| Scenario | $\alpha(\omega_{t+1})$ | $n_{\omega_{t+1}}$ |
| --- | --- | --- |
| (c1) | $0$ | $(n-t) A_t^0$ |
| (c2) | $N/2^t$ | $(n-t) A_t^1$ |
| (c3) | $N/2^t$ | $A_{t+1}^0$ |
| (c4) | $0$ | $A_{t+1}^1$ |
| (c5) | $N/2^{t+1}$ | $\binom{n}{t+1} - (n-t) A_t - A_{t+1}$ |

$n_{\omega_{t+1}}$: the number of $\omega_{t+1}$s in (c1)–(c5).
Table 3. $\alpha(\omega_{t+1})$ for calculating $T_{1(t+1)}$ for an even $t \geq 5$ in Theorem 2.

| Scenario | $\alpha(\omega_{t+1})$ | $n_{\omega_{t+1}}$ |
| --- | --- | --- |
| (c1) | $N/2^t$ | $(n-t) A_t^0$ |
| (c2) | $0$ | $(n-t) A_t^1$ |
| (c3) | $0$ | $A_{t+1}^0$ |
| (c4) | $N/2^t$ | $A_{t+1}^1$ |
| (c5) | $N/2^{t+1}$ | $\binom{n}{t+1} - (n-t) A_t - A_{t+1}$ |

$n_{\omega_{t+1}}$: the number of $\omega_{t+1}$s in (c1)–(c5).
Table 4. $\alpha(\omega_{t+2})$ and $\alpha(\omega_{t+1})$ for calculating $T_{2(t+1)}$ for an odd $t \geq 5$ in Theorem 2.

| Scenario | $\alpha(\omega_{t+2})$ | $\alpha(\omega_{t+1}) : n_{\omega_{t+1}}$ | $n_{\omega_{t+2}}$ |
| --- | --- | --- | --- |
| (d1) | $0$ | $N/2^{t+1} : t$ and $0 : 2$ | $\binom{n-t}{2} A_t^0$ |
| (d2) | $N/2^{t+1}$ | $N/2^{t+1} : t$ and $N/2^t : 2$ | $\binom{n-t}{2} A_t^1$ |
| (d3) | $N/2^{t+1}$ | $N/2^{t+1} : (t+1)$ and $N/2^t : 1$ | $(n-t-1) A_{t+1}^0$ |
| (d4) | $0$ | $N/2^{t+1} : (t+1)$ and $0 : 1$ | $(n-t-1) A_{t+1}^1$ |
| (d5) | $0$ | $N/2^{t+1} : (t+2)$ | $A_{t+2}^0$ |
| (d6) | $N/2^{t+1}$ | $N/2^{t+1} : (t+2)$ | $A_{t+2}^1$ |

$n_{\omega_{t+1}}$: the number of $\alpha(\omega_{t+1})$s of each value for $\omega_{t+2}$. $n_{\omega_{t+2}}$: the number of $\omega_{t+2}$s in (d1)–(d6).
Table 5. $\alpha(\omega_{t+2})$ and $\alpha(\omega_{t+1})$ for calculating $T_{2(t+1)}$ for an even $t \geq 5$ in Theorem 2.

| Scenario | $\alpha(\omega_{t+2})$ | $\alpha(\omega_{t+1}) : n_{\omega_{t+1}}$ | $n_{\omega_{t+2}}$ |
| --- | --- | --- | --- |
| (d1) | $N/2^{t+1}$ | $N/2^t : 2$ and $N/2^{t+1} : t$ | $\binom{n-t}{2} A_t^0$ |
| (d2) | $0$ | $N/2^{t+1} : t$ and $0 : 2$ | $\binom{n-t}{2} A_t^1$ |
| (d3) | $0$ | $N/2^{t+1} : (t+1)$ and $0 : 1$ | $(n-t-1) A_{t+1}^0$ |
| (d4) | $N/2^{t+1}$ | $N/2^{t+1} : (t+1)$ and $N/2^t : 1$ | $(n-t-1) A_{t+1}^1$ |
| (d5) | $N/2^{t+1}$ | $N/2^{t+1} : (t+2)$ | $A_{t+2}^0$ |
| (d6) | $0$ | $N/2^{t+1} : (t+2)$ | $A_{t+2}^1$ |

$n_{\omega_{t+1}}$: the number of $\alpha(\omega_{t+1})$s of each value for $\omega_{t+2}$. $n_{\omega_{t+2}}$: the number of $\omega_{t+2}$s in (d1)–(d6).

Share and Cite

MDPI and ACS Style

Zhao, S.; Qin, M. Two-Level Regular Designs for Baseline Parameterization. Entropy 2025, 27, 706. https://doi.org/10.3390/e27070706
