Evaluating Popular Statistical Properties of Incomplete Block Designs: A MATLAB Program Approach

: Evaluating the statistical properties of a semi-Latin square, and in general, an incomplete block design, is vital in determining the usefulness of the design for experimentation. Improving the procedures for obtaining these statistical properties has been the subject of some research studies and software developments. Many available statistical software that evaluate incomplete block designs do so at the level of analysis of variance but not for the popular A- , D -, E-, and MV -efﬁciency properties of these designs to determine their adequacy for experimentation. This study presents a program written in the MATLAB environment using MATLAB codes and syntaxes which is capable of computing the A- , D -, E-, and MV -efﬁciency properties of any ( n × n ) / k semi-Latin square and any incomplete block design via their incidence matrices, where N is the number of rows and columns and k is the number of plots. The only input required for the program to compute the four efﬁciency criteria is the incidence matrix of the incomplete block design. The incidence matrix is the binary representation of an incomplete block design. The program automatically generates the efﬁciency values of the design once the incidence matrix has been provided, as shown in the examples.


Introduction
An (n × n)/k semi-Latin square, according to Bedford and Whitaker [1], Bailey and Royle [2] and Soicher [3], has N 2 cells (called blocks) in a square array, where each block has k plots, and there are Nk treatments which are allocated to plots such that each treatment occurs once in each row and once in each column (see also References [2,3]). Bailey [4] listed some methods for the construction of semi-Latin squares, such as the inflation method, interleaving method, Trojan method, superposition method, etc. Software such as the nauty package by McKay [5], GRAPE package by Soicher [6] and Traces package by McKay and Piperno [7] aid in the enumeration of semi-Latin squares up to isomorphism classification by exploring the graphical properties of the semi-Latin squares. Bailey and Royle [2] pointed out that results from graph theory can be applied to problems of incomplete block designs and the quotient block design of a semi-Latin square is said to be a regulargraph design.
A semi-Latin square could be referred to as an incomplete block design with the usual parameters, t, b, r, k and λ, where b is the number of blocks, r is the treatment replication while λ is the number of times each pair of treatments appears together in the design [2]. The most important statistical property of a block design is the efficiency, and there are many measures of efficiency of a block design, including the popular A-, D-, E-, and MVefficiencies. According to John [8], incomplete block designs with equal block sizes often differ in the precision with which treatment effects are estimated, and efficiency factors of the incomplete block designs provide useful measures of these differences.
The efficiency factors, in particular, the pairwise and canonical efficiency factors, of incomplete block designs have been the subject of extensive research in the area of statistical analysis of incomplete block designs [8][9][10]. Most statistical efficiency measures of the incomplete block designs are derived from their pairwise/canonical efficiency factors. Bailey and Royle [2] and Morgan [11] stated that the A-, Dand E-efficiency measures are obtained from the canonical efficiency factors, while the MV-efficiency measure is obtained from the component of the pairwise efficiency factors. The canonical efficiency factors are the non-zero eigenvalues of the information matrix, while the pairwise efficiency factors are obtained from the variances of the treatment contrasts.
The earliest programs for incomplete block designs were built to analyze the isomorphism and automorphism properties of the (n × n)/k semi-Latin squares. Such programs include McKay's nauty package and the program by Bailey and Royle [2], which runs on SparcStation ELC. The nauty package uses the graph properties of semi-Latin squares to obtain the isomorphism and automorphism classifications but does not evaluate the statistical properties of the semi-Latin squares, such as the optimality and efficiency properties. The nauty package is not easily accessible and runs only on a number of systems. The Bailey and Royle program was designed for only (6 × 6)/2 semi-Latin squares and also uses the orthogonal one-factorizations of the regular graph designs for the construction of the semi-Latin squares on SparcStation ELC, and each graph takes about one minute of CPU time. GRAPE software was developed by Soicher [6] in 1993, while the shipped version was developed in 2019. The GRAPE software runs partly on GAP environment and is primarily designed for constructing and analyzing graphs related to finite geometry, groups and designs. The other component runs on McKay's nauty package. According to Soicher [12], the GRAPE algorithm performs more slowly and the computer time is not exact. The MATLAB program presented here utilizes simple matrix algebra, which is already part of the MATLAB codes and syntaxes, to obtain the four popular statistical properties, A-, D-, E-, and MV-efficiencies, of any incomplete block design in about one second of computer time.
The MATLAB R2019a software has an in-built statistics toolbox which is a collection of MATLAB codes and syntaxes for statistical data analysis. The statistics toolbox could analyze incomplete block designs up to analysis of variance tests of hypothesis but does not contain the required codes to obtain the A-, D-, E-, and MV-efficiencies measures of incomplete block designs. In this study, we provide a MATLAB program for the computation of the A-, D-, E-, and MV-efficiencies of incomplete block designs from their canonical and pairwise efficiency factors. The program, presented in the Appendix A, is capable of accurately computing the A-, D-, E-, and MV-efficiencies measures of any semi-Latin square, and in general, all incomplete block designs. The program requires only input of the binary equireplicate incidence matrix of the block.

Algebraic Basis for the Program
Given a semi-Latin square, Ξ, with t = nk treatments arranged in b = n 2 blocks, the incidence matrix, N, emanating from Ξ or that of any other t-treatment incomplete block design, Σ, in b blocks, is a t × b matrix of 1's and 0's. The t × t concurrence matrix of the incomplete block design is a product of N and its transpose, N . The concurrence matrix, NN , is then transformed to the information, C = R − 1 k NN , of order t, where R = rI, r is the treatment replication in N and I is an identity matrix, also of order t. The product of the eigenvalues of C and the reciprocal of r gives the t − 1 canonical efficiency factors, ω i , i = 1, 2, . . ., t − 1, of Ξ or Σ, with one of the values of ω i being equal to zero. Conversely, multiply C by the reciprocal of r to normalize the information matrix to obtain C * = I − 1 rk NN . Then, the canonical efficiency factors, ω i , are the t − 1 non-zero eigenvalues of C * .
The A-efficiency measure of Ξ or Σ is the harmonic mean of the canonical efficiency factors and is given by The D-efficiency measure is the geometric , while the E-efficiency is the minimum of the canonical efficiency factors, given by E = min(ω i ).
The MV-efficiency measure is obtained from the components of the generalized inverse of C. Let C inv be the Moore-Penrose generalized inverse of C and δ i , i = 1, 2, . . ., t, the i-th treatment effect, withδ i the least squares estimator of δ i . Then, the treatment contrast is δ i −δ j , i = j, while the variance from intra-block analysis of the incomplete block design is var δ i −δ j = ξ ii + ξ jj − 2ξ ij σ 2 , where ξ ij is the (i,j)-th element of C inv . The MV-efficiency criterion is given as max var δ i −δ j : see, for example, Reference [11].

Theoretical ComputatioN of the EfficieNcy Criteria
Given an incomplete block design, Σ(t, b, r, k λ), the t × b incidence matrix, N, has its (i, j) entry as the number of times treatment i occurs in block j. The column sums of N are all equal to the block size, k, and the sum of the i-th row, r i , is the i-th treatment replication (that is, the number of times treatment i occurs overall). If r i is the same for all the treatments, then Σ(t, b, r, k λ) is called an equireplicate design. The matrix, N, is made binary by identifying treatment positions in each block by '1' and non-treatment positions by '0'. Let Ξ ∈ Σ(t, b, r, k, λ) be the (6 × 6)/2 semi-Latin square, ∆ 0 , of Bailey and Royle [2]. The design, ∆ 0 , is displayed in Figure 1. The input matrix required by this MATLAB program to compute the efficiency criteria of ∆ 0 is the 12 × 36 incidence matrix of ∆ 0 , given by N = This concurrence matrix can be obtained from the program by entering the command, 'NNP', in the command window. To obtain the information matrix, C, the treatment replication, r = 3, is used to multiply the identity matrix of order 12 (which is available in MATLAB as 'eye(12)') to give the R matrix of order 12. The concurrence matrix multiplied by the reciprocal of k, the block size, is subtracted from R to give C = Mathematics 2021, 9, 1281 4 of 12 which is also of order 12. Dividing C by r, the normalized information matrix is obtained as The canonical efficiency factors of Ξ ∈ Σ(t, b, r, k, λ) are the non-zero eigenvalues of   The minimum of the variances of the elementary contrasts for each treatment is 0.4314. Hence, MV is the maximum of the minimum contrasts, MV = 0.4314.

DescriptioN of the Program
The MATLAB software by Mathworks is designed to run on C, C++, and MATLAB programming languages. The MATLAB program presented in this work requires inputting only the incidence matrix of the incomplete block design as a matrix to obtain the results of the four efficiency criteria as output. Paste the program on the MATLAB Command Window from where the program will request for the input of the incidence matrix through the command prompt, 'N = '. Paste the incidence matrix of the desired incomplete block design at the command prompt and the results of the A-, D-, E-, and MV-efficiencies would be displayed immediately on the Command Window. Alternatively, input the incidence matrix row-by-row, with rows separated by semi-colons in a bracket, then press the 'Enter' key on the computer keyboard to display the results. Another method of running the program is through the MATLAB M-file console. Copy and paste the program on the M-file console for easy editing. After inputting the incidence matrix either as a matrix or row-by-row, press the 'Run' button on the M-file toolbar to display the results of the four efficiency values on the Command Window.
Generally, by providing input of the incidence matrix, the program automatically estimates the basic parameters of an incomplete block design: the number of treatments, t, number of blocks, b, block size, k, and number of treatment replications, r. These parameters are vital in the computation of the information matrix, normalized information matrix, etc. Like the Command Window of the MATLAB software, the MATLAB program presented here is interactive. By interactive, we mean the ability of the program to accept input of the incidence matrix of an incomplete block design using the computer keyboard by typing the incidence matrix row-by-row and separating the rows with semi-colons or importing already prepared incidence matrix from other environments, such as Excel spreadsheet, Notepad, etc., and pasting the matrix on the program's input command prompt on the Command Window or the M-file console.
We now highlight some of the advantages of the MATLAB program in the evaluation of incomplete block designs. The program utilizes computer time of about one second and occupies a small disc space of 911 bytes. These are obvious advantages when compared to Bailey and Royle's [2] program for evaluating only (6 × 6)/2 semi-Latin squares, which takes about one minute of computer time and could only be executed on SparcStation ELC and the GRAPE package, which runs more slowly without definite computer time. The program is also designed to accept only the incidence matrix as its input (instead of multiple inputs), from which the parameters of the incomplete block designs are estimated by the program. The program computes the exact values of the four efficiency measures for any incomplete block design and does not give any approximate values for any of the efficiency measures. For instance, in the first illustrative example in Section 3.1, the program computed the exact values of some of those efficiency measures of the seven semi-Latin squares provided by Bailey and Royle [2] as approximate values.

Results
The MATLAB program, presented in the Appendix A, for the computation of the A-, D-, E-, and MV-efficiencies of incomplete block designs, including semi-Latin squares, completes the evaluation of any semi-Latin square or any other incomplete block design from the incidence matrix in less than one second of computing time on an HP Intel (R) Core (TM)i5-4200U CPU with 4.0 GB RAM in a 64-bit Operating System running on Windows. The program size is 911 bytes.
In the subsequent subsection, we provide some illustrative examples of the performances of the MATLAB program in evaluating semi-Latin squares.

S/N The Design Efficiency Values by Bailey and Royle [2]
Efficiency Values from Our MATLAB Program

S/N The Design Efficiency Values by Bailey and Royle [2]
Efficiency Values from Our MATLAB Program  From Table 1, the results of the MATLAB program show very high levels of accuracy in the computations of the A-, D-, E-, and MV-efficiencies of the (6 × 6)/2 semi-Latin squares. That is, the values of the four efficiency measures in Bailey and Royle [2] and from the MATLAB program presented here are exactly the same for most of the seven (6 × 6)/2 semi-Latin squares, except the approximate values. Additionally, the MATLAB program computed the exact values of the efficiency measures for some of the (6 × 6)/2 semi-Latin squares for which Bailey and Royle [2] could provide only the approximate values.

EvaluatioN of (4 × 4)/4 semi-LatiN Square (Chigbu)
The second illustrative example for the use of the MATLAB program is in the evaluation of the three optimal (4 × 4)/4 semi-Latin squares whose statistical properties were studied by Chigbu [13]. The MATLAB program was used to obtain the values of the A-, D-, E-, and MV-efficiency measures of the three optimal (4 × 4)/4 semi-Latin squares. The three semi-Latin squares are Ω 1 , Ω 2 , and Ω 3 , and are presented in Figures 8-10, while the results of their A-, D-, E-, and MV-efficiency measures are displayed in Table 2.  [2].

EvaluatioN of  
semi-LatiN square (Chigbu [13]) The second illustrative example for the use of the MATLAB program is in the evaluation of the three optimal   4 4 4 semi-Latin squares whose statistical properties were studied by Chigbu [13].  Table 2.   semi-LatiN square (Chigbu [13]) The second illustrative example for the use of the MATLAB program is in the evaluation of the three optimal   4 4 4 semi-Latin squares whose statistical properties were studied by Chigbu [13].  Table 2.       Table 2 reveals that the A-, Dand E-efficiency values of Ω 2 and Ω 3 are exactly the same, while the three (4 × 4)/4 semi-Latin squares have exactly the same E-efficiency values. Slight differences exist between the values of Aand D-efficiencies of Ω 1 and Ω 2 , which are attributable to what Soicher [3] referred to as mixture of manual and machine computations.
In general, the MATLAB program uses less than one second of computing time to compute and display the A-, D-, E-, and MV-efficiencies of any incomplete block design with a very high level of accuracy, as indicated in the illustrative examples.