Article

Multiplication Algorithms for Approximate Optimal Distributions with Cost Constraints

Lianyan Fu, Faming Ma, Zhuoxi Yu and Zhichuan Zhu
1 School of Mathematics and Statistics, Liaoning University, Shenyang 110036, China
2 School of Economics, Liaoning University, Shenyang 110036, China
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
Mathematics 2023, 11(8), 1963; https://doi.org/10.3390/math11081963
Submission received: 6 March 2023 / Revised: 18 April 2023 / Accepted: 19 April 2023 / Published: 21 April 2023

Abstract

In this paper, we study the D- and A-optimal assignment problems for regression models with experimental cost constraints. To solve these two problems, we establish extended D-optimal (E_D-optimal) and A-optimal (E_A-optimal) criteria and propose two multiplicative algorithms for obtaining the corresponding optimal designs. In addition, we prove the convergence of the E_D-optimal algorithm and state conjectures about some properties of the E_A-optimal algorithm. Compared with the classical D- and A-optimal algorithms, the E_D- and E_A-optimal algorithms consider not only the accuracy of parameter estimation, but also the experimental cost constraint. The proposed methods work well in the numerical examples.

1. Introduction

For a given regression model, one can develop different experimental protocols with various regression design methods. In order to compare the merits of various programs, many criteria have been developed. Due to their good statistical properties, the D- and A-optimal criteria (Kiefer [1]) are widely used in experimental design in various fields, such as in clinical investigations, biological experiments, agricultural experiments, automatic control, mechanical engineering, etc. Kiefer and Wolfowitz [2] made seminal contributions to the field, as they were the first to extend the concept of the discrete design of experiments to the measurement space of continuous design, thus facilitating mathematical research and the application of optimization theory. Elfving [3] and Dette [4] also gave the basic knowledge and details of optimal design theory, based on which Silvey et al. [5,6,7] further developed the related theory of optimal design. In general, optimal design is also known as optimal allocation in the literature [8]. With the extensive and in-depth development of optimal design, multiplication algorithms for optimal design under different optimization criteria were proposed. For instance, Wynn [9] first proposed an algorithm for D-optimal design, which is now known as the W-algorithm. Fedorov [10] proposed the V-algorithm and proved its convergence. However, the above algorithms suffered from slow convergence and overly cumbersome computations. Atwood [11] proposed three modifications to the V-algorithm, which improved its convergence speed and accuracy. John and Draper [12] further improved Atwood’s first suggestion and increased the efficiency of the algorithm. Silvey et al. [13] presented a multiplicative algorithm for computing D-optimal designs on a finite design, and this was a straightforward iterative algorithm with monotonic convergence. Kiefer and Wolfowitz [2] introduced the general equivalence theorem and proposed an algorithm for solving optimization problems. Yu [14] suggested the cocktail method, which combined the well-known vertex direction method (VDM) with a multiplicative algorithm to compute D-optimal approximation designs. Martin-Martin et al. [15] applied a multiplicative algorithm to construct marginal and conditional restricted designs when certain factors were known in advance, i.e., they were given and not controlled. Harman and Pronzato [16] proposed a class of multiplicative algorithms for computing D-optimal designs with non-optimal support points removed. Gao et al. [8] constructed a series of multiplicative algorithms for finding the D- and A-optimal designs of regression models. Castro et al. [17] introduced a new method aimed at computing approximate D- and A-optimal designs for multivariate polynomial regression on compact (semi-algebraic) design spaces. Harman et al. [18] constructed a series of randomized exchange algorithms (REX) to obtain the D- and A-optimal approximation designs for regression models. Duan et al. [19] proposed a simple multiplicative approach to generating sequential experimental designs while focusing on D- and A-optimal criteria. Duan et al. [20] described two multiplicative algorithms for obtaining approximate D- and A-optimal designs for multi-dimensional linear regression on a large variety of design spaces.
Although a great deal of research has been conducted on optimal design and the algorithms for achieving it, little consideration has been given to the cost of experimentation [21,22,23,24,25]. The experimental cost refers to the various raw materials consumed in an experiment [26,27,28], and it is an essential factor that cannot be neglected. First, to take this factor into account, we construct two weighted objective functions and establish extended D-optimal (E_D-optimal) and A-optimal (E_A-optimal) criteria. For the E_D-optimal criterion, the objective function is the difference between the logarithm of the determinant of the information matrix in the classical D-optimal criterion and the weighted average experimental cost. A design ŵ is referred to as an E_D-optimal design if it maximizes this objective function over the class {w : w_i ≥ 0 and Σ_{i=1}^k w_i = 1} of all designs. For the E_A-optimal criterion, the objective function is the sum of the logarithm of the trace of the inverse of the information matrix in the classical A-optimal criterion and the weighted average experimental cost. A design ŵ is referred to as an E_A-optimal design if it minimizes this objective function over the same class of designs. Compared with the classical D- and A-optimal designs, the extended D- and A-optimal designs strike a balance between maximizing the determinant of the information matrix and minimizing the experimental cost. Second, to solve the two optimization problems, we derive the general equivalence conditions for E_D- and E_A-optimality from the Kuhn–Tucker conditions. On this basis, we propose multiplicative algorithms for computing D- and A-optimal designs with cost constraints in the regression model and provide proof of the monotonic convergence of the E_D-optimal algorithm. Although we do not prove the corresponding properties of the E_A-optimal algorithm, the simulation results strongly support its validity and reliability. Finally, we give conclusions and indicate possible research directions.
In this study, we extend the classical D- and A-optimal algorithms to D- and A-optimal algorithms with cost constraints. Compared with the classical D- and A-optimal algorithms, the E_D- and E_A-optimal algorithms consider not only the accuracy of parameter estimation, but also the experimental cost constraint. The organization of this article is as follows. Section 2 presents the model and the extended optimality criteria. Section 3 gives the multiplication algorithms for determining the E_D- and E_A-optimal assignments and describes the algorithm-related properties. Section 4 presents simulations that show the good performance of the proposed algorithms. Section 5 presents the conclusions. Appendix A contains the proofs of the main results.

2. Model and Optimal Criteria

2.1. Model and Symbols

Consider the generalized regression model
Y(x) = \beta^\top f(x) + \varepsilon, \quad x \in X,
where f ( x ) is a known regressor associated with x , β is a vector of parameters, X is the design space, and ε is an error term with a mean of 0 and variance of σ 2 . For different independent trials, the errors are independent. The p-dimensional vector x is taken from the design space X . The experiment aims to obtain the regression model by selecting k mutually independent design points x 1 , x 2 , , x k and estimating the unknown parameters β .
Suppose that the total number of trials is N, the numbers of trials repeated at the various design points are n_1, n_2, …, n_k, and n_1 + ⋯ + n_k = N. Further, we have w_1, w_2, …, w_k as the proportions assigned to the design points x_1, x_2, …, x_k, with w_i = n_i / N (i = 1, 2, …, k) and Σ_{i=1}^k w_i = 1. Therefore, the information matrix is denoted as H(w) = \sum_{i=1}^{k} w_i x_i x_i^\top.
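To make the notation concrete, the short sketch below computes the information matrix H(w) from the design points and weights. It is an illustrative NumPy fragment rather than the authors' implementation (which, per the Data Availability Statement, is written in R); the function name information_matrix is our own.

```python
import numpy as np

def information_matrix(X, w):
    """Information matrix H(w) = sum_i w_i * x_i x_i^T.

    X is a (k, p) array whose rows are the design points x_i,
    and w is a length-k vector of nonnegative weights summing to 1.
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float)
    return (X * w[:, None]).T @ X  # equals X^T diag(w) X
```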

2.2. Optimal Criteria

In the context of generalized regression models, the attainment of reduced variance in parameter estimates and a concomitant decrease in the cost of the experimental procedure typically results in enhanced regression model accuracy and increased economic efficiency of the experiment.

2.2.1. E_D-Optimal Criterion

In the classical D-optimal design, a design ŵ is referred to as optimal if it maximizes log|H(w)| (|·| denotes the determinant of a matrix) subject to the constraints w_i ≥ 0 and Σ_{i=1}^k w_i = 1. However, the classical D-optimal design considers only the accuracy of parameter estimation and does not take into account the cost of the experiment. Therefore, we propose an objective function in the form of the difference between the logarithm of the determinant of the information matrix in the classical D-optimal criterion and the weighted average experimental cost, i.e.,
T(w) = \log |H(w)| - \sum_{i=1}^{k} w_i c_i,
where H(w) is the information matrix [8] and Σ_{i=1}^k w_i c_i denotes the cost of the experiment [27], with c_i being the cost of a single trial at the design point x_i. An increase in the determinant of the information matrix is positively associated with a reduction in the volume of the Wald-type joint confidence region of the model parameters, thereby contributing to higher estimation accuracy. Furthermore, a decrease in trial cost leads to more efficient utilization of various raw materials, thereby enhancing the economic efficacy of the experiment. Formally, a design ŵ is referred to as an E_D-optimal design if it maximizes the objective function (2) in the class {w : w_i ≥ 0 and Σ_{i=1}^k w_i = 1} of all designs. That is,
\hat{w} = \arg\max \left\{ T(w) : w_i \ge 0 \ \text{and} \ \sum_{i=1}^{k} w_i = 1 \right\}.
Compared with the classical D-optimal design, the extended D-optimal design strikes a balance between maximizing the determinant of the information matrix and minimizing the experimental cost.
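As an illustration of the criterion in Equation (2), the sketch below evaluates T(w) for given design points, weights, and costs. It assumes H(w) is nonsingular and reuses the information_matrix helper introduced above; both names are ours, not the authors'.

```python
import numpy as np

def ed_objective(X, w, c):
    """E_D objective T(w) = log|H(w)| - sum_i w_i c_i (Equation (2))."""
    H = information_matrix(X, w)
    sign, logdet = np.linalg.slogdet(H)   # assumes sign > 0, i.e., H(w) is nonsingular
    return logdet - float(np.dot(w, c))
```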

2.2.2. E_A-Optimal Criterion

In the classical A-optimal design, a design ŵ is referred to as optimal if it minimizes log(trace(H^{-1}(w))) subject to the constraints w_i ≥ 0 and Σ_{i=1}^k w_i = 1. Similarly, to take the experimental cost into account, we give a new objective function in the form of the sum of the logarithm of the trace of the inverse of the information matrix in the classical A-optimal criterion and the weighted average experimental cost, i.e.,
G(w) = \log\left( \operatorname{trace}\left( H^{-1}(w) \right) \right) + \sum_{i=1}^{k} w_i c_i,
where log(trace(H^{-1}(w))) is the criterion function of the classical A-optimal criterion [8] and Σ_{i=1}^k w_i c_i is the experimental cost [27]. If the information matrix is nonsingular, then the trace of its inverse reflects the magnitude of the sum of the variances of the components of the maximum likelihood estimate of the parameter β. A smaller trace of the inverse information matrix together with a smaller experimental cost implies that the sum of the variances of the estimated components is smaller and that the experiment is more economically efficient. Formally, a design ŵ is referred to as an E_A-optimal design if it minimizes the objective function (4) in the class {w : w_i ≥ 0 and Σ_{i=1}^k w_i = 1} of all designs. That is,
\hat{w} = \arg\min \left\{ G(w) : w_i \ge 0 \ \text{and} \ \sum_{i=1}^{k} w_i = 1 \right\}.
Compared with the classical A-optimal design, the extended A-optimal design strikes a balance between minimizing the trace of the inverse matrix of the information matrix and minimizing the experimental cost.
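For completeness, the corresponding evaluation of the objective in Equation (4) can be sketched in the same style; again this is an illustrative fragment with assumed names, not the authors' R code, and it presumes H(w) is nonsingular.

```python
import numpy as np

def ea_objective(X, w, c):
    """E_A objective G(w) = log(trace(H^{-1}(w))) + sum_i w_i c_i (Equation (4))."""
    H = information_matrix(X, w)          # assumes H(w) is nonsingular
    return float(np.log(np.trace(np.linalg.inv(H))) + np.dot(w, c))
```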

3. Algorithms for Optimal Distributions

In this section, we present the iterative algorithm under the E_D-optimal criterion and prove its monotonic convergence. Meanwhile, we propose an iterative algorithm under the E_A-optimal criterion and state conjectures about its monotonic convergence. The corresponding proofs of the theorems and lemmas are given in Appendix A.

3.1. Algorithm for E_D-Optimal Distribution

To find the E_D-optimal design, we obtain the general equivalence condition from the Kuhn–Tucker conditions, and the result is shown in Theorem 1.
Theorem 1. 
ŵ is an E_D-optimal assignment if and only if it satisfies the following conditions:
x_i^\top H^{-1}(\hat{w}) x_i + \sum_{j=1}^{k} \hat{w}_j c_j = p + c_i \quad \text{for } \hat{w}_i \ne 0,
x_i^\top H^{-1}(\hat{w}) x_i + \sum_{j=1}^{k} \hat{w}_j c_j \le p + c_i \quad \text{for } \hat{w}_i = 0.
Note that when the costs c_i are all equal, Equation (6) in Theorem 1 reduces to the classical approximate D-optimal equivalence theorem (Kiefer and Wolfowitz [29]). Based on this, we propose a multiplicative algorithm for calculating the E_D-optimal allocation, as shown below.
In Algorithm 1, m is the number of iterations, k is the number of design points, and p is the number of experimental variables. In general, we set k ≥ p, and the parameter ζ is usually set to a very small value, which determines the criterion for stopping the algorithm. In Section 4, we set ζ = 0.0001. Compared to the W- and V-algorithms, our proposed algorithm has the advantage of running quickly and simplifying the computational process because it involves only simple matrix operations. Furthermore, Algorithm 1 remains practical when the sample size N is exceedingly large, as it relies only on the weights w and the design points x. In contrast, the computational process of the VDM algorithm depends not only on the sample size N but also on the interactions between design points, which reduces its efficiency. Our proposed algorithm is also a multiplicative algorithm, so it inherits excellent properties such as simplicity, efficiency, and monotonic convergence.
Algorithm 1: Algorithm for E_D-optimal assignment
Input:
Design points x_1, …, x_k in the design space X, costs c_1, …, c_k, and stopping parameter ζ.
Output:
E_D-optimal assignment ŵ.
1:
Set the initial value of w to w^{(0)} = (1/k, \ldots, 1/k).
2:
For m \ge 1 and i = 1, \ldots, k,
w_i^{(m)} = \frac{w_i^{(m-1)} \left[ x_i^\top H^{-1}(w^{(m-1)}) x_i + \sum_{j=1}^{k} w_j^{(m-1)} c_j \right]}{p + c_i},
H(w^{(m-1)}) = w_1^{(m-1)} x_1 x_1^\top + w_2^{(m-1)} x_2 x_2^\top + \cdots + w_k^{(m-1)} x_k x_k^\top.
3:
Repeat step 2 until \max_i |w_i^{(m)} - w_i^{(m-1)}| < ζ.
4:
Output the E_D-optimal assignment ŵ.
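For readers who prefer code, the following NumPy sketch mirrors the steps of Algorithm 1. It is a minimal illustration under the stated assumptions (k ≥ p design points so that H(w) stays nonsingular); the authors' own implementation is in R, and the function name ed_optimal_weights and the max_iter safeguard are ours.

```python
import numpy as np

def ed_optimal_weights(X, c, zeta=1e-4, max_iter=10_000):
    """Multiplicative algorithm for the E_D-optimal assignment (sketch of Algorithm 1).

    X : (k, p) array of design points, c : length-k array of costs,
    zeta : stopping parameter.  Returns the estimated optimal weights.
    """
    X = np.asarray(X, dtype=float)
    c = np.asarray(c, dtype=float)
    k, p = X.shape
    w = np.full(k, 1.0 / k)                       # step 1: w^(0) = (1/k, ..., 1/k)
    for _ in range(max_iter):                     # safeguard against non-termination
        H = (X * w[:, None]).T @ X                # H(w^(m-1)) = sum_i w_i x_i x_i^T
        Hinv = np.linalg.inv(H)
        d = np.einsum('ij,jl,il->i', X, Hinv, X)  # d_i = x_i^T H^{-1} x_i
        w_new = w * (d + w @ c) / (p + c)         # step 2: multiplicative update
        if np.max(np.abs(w_new - w)) < zeta:      # step 3: stopping rule
            return w_new                          # step 4: output E_D-optimal weights
        w = w_new
    return w
```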
Later in this section, we prove that the proposed algorithm converges to E_D-optimality. To obtain the convergence of the proposed algorithm, we first demonstrate that the sequence {T(w^(m))} is monotonically convergent, and the result is given in Theorem 2. Second, based on the theoretical results for the sequence {T(w^(m))}, we further show that the sequence {w^(m)} converges to the E_D-optimal allocation, and the conclusion is given in Theorem 3.
Theorem 2. 
T(w^{(m)}) - T(w^{(m-1)}) = \log |H(w^{(m)})| - \log |H(w^{(m-1)})| \ge 0 \ \text{and} \ \| w^{(m)} - w^{(m-1)} \| \to 0.
Theorem 2 shows that as the number of iterations increases, the sequence {T(w^(m))} monotonically increases and the distance between successive iterates w^(m) and w^(m−1) tends to zero. Note that Appendix A presents proof that the sequence {T(w^(m))} converges. Based on the content of Theorem 2, we can also obtain Theorem 3.
Theorem 3. 
The sequence {w^(m)} obtained with the proposed algorithm converges to E_D-optimality.
In the proof of Theorem 3, we first show that the sequence {w^(m)} converges and subsequently prove that its limit is E_D-optimal.
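The equivalence conditions (6) and (7) also provide a convenient numerical check of E_D-optimality for the weights returned by Algorithm 1. The helper below is our own illustration; the tolerance tol and the function name are assumptions, not part of the paper.

```python
import numpy as np

def check_ed_optimality(X, c, w_hat, tol=1e-3):
    """Verify conditions (6)-(7) of Theorem 1 up to a numerical tolerance."""
    X = np.asarray(X, dtype=float)
    c = np.asarray(c, dtype=float)
    w_hat = np.asarray(w_hat, dtype=float)
    p = X.shape[1]
    H = (X * w_hat[:, None]).T @ X
    Hinv = np.linalg.inv(H)
    lhs = np.einsum('ij,jl,il->i', X, Hinv, X) + w_hat @ c  # x_i^T H^{-1} x_i + sum_j w_j c_j
    rhs = p + c
    support = w_hat > tol
    equality_on_support = np.all(np.abs(lhs[support] - rhs[support]) < tol)
    inequality_off_support = np.all(lhs[~support] <= rhs[~support] + tol)
    return equality_on_support and inequality_off_support
```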

3.2. Algorithm for E_A-Optimal Distribution

Similarly, according to the Kuhn–Tucker conditions, we can obtain the general equivalence condition for E_A-optimality, as shown in Theorem 4.
Theorem 4. 
ŵ is an E_A-optimal assignment if and only if it satisfies the following conditions:
\frac{\operatorname{trace}\left( H^{-1}(\hat{w}) x_i x_i^\top H^{-1}(\hat{w}) \right)}{\operatorname{trace}\left( H^{-1}(\hat{w}) \right)} + \sum_{j=1}^{k} \hat{w}_j c_j = 1 + c_i \quad \text{for } \hat{w}_i \ne 0,
\frac{\operatorname{trace}\left( H^{-1}(\hat{w}) x_i x_i^\top H^{-1}(\hat{w}) \right)}{\operatorname{trace}\left( H^{-1}(\hat{w}) \right)} + \sum_{j=1}^{k} \hat{w}_j c_j \le 1 + c_i \quad \text{for } \hat{w}_i = 0.
Based on Equation (8), we propose a multiplicative algorithm to calculate the E_A-optimal design, and Algorithm 2 is shown below.
Algorithm 2: Algorithm for E_A-optimal distribution
Input: 
Design points x_1, …, x_k in the design space X, costs c_1, …, c_k, and stopping parameter ζ.
Output:
E_A-optimal assignment ŵ.
1:
Set the initial value of w to w^{(0)} = (1/k, \ldots, 1/k).
2:
For m \ge 1 and i = 1, \ldots, k,
w_i^{(m)} = \frac{w_i^{(m-1)} \left[ \operatorname{trace}\left( H^{-1}(w^{(m-1)}) x_i x_i^\top H^{-1}(w^{(m-1)}) \right) / \operatorname{trace}\left( H^{-1}(w^{(m-1)}) \right) + \sum_{j=1}^{k} w_j^{(m-1)} c_j \right]}{1 + c_i},
H(w^{(m-1)}) = w_1^{(m-1)} x_1 x_1^\top + w_2^{(m-1)} x_2 x_2^\top + \cdots + w_k^{(m-1)} x_k x_k^\top.
3:
Repeat step 2 until \max_i |w_i^{(m)} - w_i^{(m-1)}| < ζ.
4:
Output the E_A-optimal assignment ŵ.
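Analogously to the E_D case, the sketch below mirrors Algorithm 2; only the multiplicative update changes, using the identity trace(H^{-1}(w) x_i x_i^T H^{-1}(w)) = ||H^{-1}(w) x_i||^2 for symmetric H. Again this is an illustrative NumPy fragment with assumed names (ea_optimal_weights, max_iter), not the authors' R code.

```python
import numpy as np

def ea_optimal_weights(X, c, zeta=1e-4, max_iter=10_000):
    """Multiplicative algorithm for the E_A-optimal assignment (sketch of Algorithm 2)."""
    X = np.asarray(X, dtype=float)
    c = np.asarray(c, dtype=float)
    k = X.shape[0]
    w = np.full(k, 1.0 / k)                        # step 1: uniform initial weights
    for _ in range(max_iter):
        H = (X * w[:, None]).T @ X
        Hinv = np.linalg.inv(H)
        # trace(H^{-1} x_i x_i^T H^{-1}) = ||H^{-1} x_i||^2 because H is symmetric
        num = np.sum((X @ Hinv) ** 2, axis=1) / np.trace(Hinv)
        w_new = w * (num + w @ c) / (1.0 + c)      # step 2: multiplicative update
        if np.max(np.abs(w_new - w)) < zeta:       # step 3: stopping rule
            return w_new
        w = w_new
    return w
```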
Although the theoretical basis of the E_A-optimal algorithm is similar to that of the E_D-optimal algorithm, we do not obtain conclusions analogous to those of Theorem 2 under the E_A-optimal criterion. In Section 4, we describe extensive simulation studies that were performed to illustrate the E_A-optimal algorithm's properties. In all cases, the algorithm converged and successfully found the optimal allocation, so the algorithm is feasible.
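In the same spirit as the E_D check, the conditions (8) and (9) of Theorem 4 can be verified numerically on the weights returned by Algorithm 2, which is how the convergence observed in the simulations can be confirmed case by case. The helper below is our own sketch, with an assumed tolerance tol.

```python
import numpy as np

def check_ea_optimality(X, c, w_hat, tol=1e-3):
    """Verify conditions (8)-(9) of Theorem 4 up to a numerical tolerance."""
    X = np.asarray(X, dtype=float)
    c = np.asarray(c, dtype=float)
    w_hat = np.asarray(w_hat, dtype=float)
    H = (X * w_hat[:, None]).T @ X
    Hinv = np.linalg.inv(H)
    lhs = np.sum((X @ Hinv) ** 2, axis=1) / np.trace(Hinv) + w_hat @ c
    rhs = 1.0 + c
    support = w_hat > tol
    return (np.all(np.abs(lhs[support] - rhs[support]) < tol)
            and np.all(lhs[~support] <= rhs[~support] + tol))
```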

4. Numerical Illustrations

In this section, we describe some simulations that were performed to illustrate some of the excellent properties of the recommended algorithms. We calculated the E_D- and E_A-optimal allocations, as well as the values of the corresponding objective functions. In addition, we provide the convergence rates and the numbers of iterations of the algorithms. The simulation results showed that the proposed algorithms ran quickly and estimated the optimal weights accurately. The algorithms are easy to program and are suitable for a variety of situations.
Initially, we generated a set of design points, denoted by x_1, x_2, …, x_k, which were mutually independent and followed a uniform distribution over the interval [−1, 1], i.e., U(−1, 1). Furthermore, we specified the initial values of the weights w_1, w_2, …, w_k, and the corresponding experimental costs c_1, c_2, …, c_k were independently drawn from a uniform distribution over the range [0, 1], i.e., U(0, 1). Taking Table A1 as an illustrative example, subject to the constraints p = 5, k = 8, and p < k, we obtained the values of x_1, x_2, …, x_8 and the associated trial costs c_1, c_2, …, c_8. Without loss of generality, we set the initial weights w_1 = 0.125, w_2 = 0.125, …, w_8 = 0.125.
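The data-generating step just described can be reproduced with a few lines of NumPy; the sketch below reuses the ed_optimal_weights and ed_objective fragments introduced earlier. The random seed is arbitrary, so the resulting numbers will not match Table A1 exactly.

```python
import numpy as np

rng = np.random.default_rng(2023)            # arbitrary seed for reproducibility
p, k = 5, 8                                  # as in the Table A1 example (p < k)
X = rng.uniform(-1.0, 1.0, size=(k, p))      # design points x_i ~ U(-1, 1)
c = rng.uniform(0.0, 1.0, size=k)            # experimental costs c_i ~ U(0, 1)

w_hat = ed_optimal_weights(X, c, zeta=1e-4)  # initial weights 1/k = 0.125 are set internally
print(w_hat.sum())                           # the weights sum to 1
print(ed_objective(X, w_hat, c))             # value of the objective function (2)
```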
Second, in accordance with the proposed Algorithm 1, the stopping criterion was set to max_i |w_i^(m) − w_i^(m−1)| < 0.0001 for i = 1, …, k, and we obtained the E_D-optimal allocations and the values of the objective function. Table A1 shows the results of the simulation. In Table A2, keeping the value of k constant, by adjusting the number of experimental variables p, we obtained the corresponding computational results. As seen in Table A1 and Table A2, the E_D-optimal algorithm converged, and the sum of the weights was 1 in each case, which means that the algorithm can accurately estimate the optimal weights. Table A5 shows the average number of iterations and the average time (in seconds) for 50 simulation cycles, with standard deviations in parentheses, indicating that the proposed E_D-optimal algorithm converged quickly. When the value of k was fixed, the number of iterations and the time decreased as p increased.
Finally, similarly to the above computational procedure, we found the E_A-optimal assignments, the corresponding objective function values, and the average number of iterations and time for 50 simulations. Taking Table A3 as an example, when p = 5 and k = 8, we first generated the corresponding design points, costs, and initial weights. According to Algorithm 2, we obtained the E_A-optimal weights and the objective function values. It can be seen from Table A3 that the proposed Algorithm 2 converged and the sum of the weights was 1 in each case, indicating the superiority of the E_A-optimal algorithm for parameter estimation. Moreover, when p was kept constant, the value of the objective function (4) became smaller as k increased; according to the E_A-optimal criterion, the estimation of the parameters is then more accurate. Thus, this also supports the conjecture mentioned above. As shown in Table A4, when k remained constant, the value of the objective function (4) became larger as p increased. The results in Table A6 show the average number of iterations and the average elapsed time (standard deviation in parentheses) for 50 simulations of the proposed Algorithm 2, indicating that the algorithm ran rapidly. In addition, when the value of p was kept constant, the number of iterations and the time increased with k.

5. Conclusions

In this article, we studied the extended D- and A-optimal assignment problems in a regression model, i.e., the D- and A-optimal allocation problems with cost constraints. To obtain optimal designs, we extended the classical D- and A-optimal algorithms to D- and A-optimal algorithms with cost constraints. Compared with the classical D- and A-optimal designs, the extended D- and A-optimal designs strike a balance between maximizing the precision of parameter estimation and minimizing the experimental cost. For the E_D-optimal algorithm, the objective function is the difference between the logarithm of the determinant of the information matrix in the classical D-optimal criterion and the weighted average experimental cost; we provided proof of the monotonicity and convergence of the algorithm. For the E_A-optimal algorithm, the objective function is the sum of the logarithm of the trace of the inverse of the information matrix in the classical A-optimal criterion and the weighted average experimental cost; although we did not prove the corresponding properties of this algorithm, the simulation results show that it performs well. It is worth noting that while this study focuses on D- and A-optimal designs for linear models, the ideas behind the proposed algorithms can be generalized to other optimality criteria and to nonlinear models.

Author Contributions

Conceptualization, L.F. and F.M.; methodology, L.F.; software, F.M.; validation, F.M., Z.Y., and Z.Z.; formal analysis, L.F.; writing—original draft preparation, L.F., F.M., Z.Y., and Z.Z.; writing—review and editing, L.F., F.M., Z.Y., and Z.Z.; visualization, L.F., F.M., Z.Y., and Z.Z.; supervision, L.F., F.M., Z.Y., and Z.Z.; funding acquisition, L.F., F.M., Z.Y., and Z.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Social Science Funds of China (grant number: 20BTJ056).

Data Availability Statement

These algorithms were implemented in R, and the programs are available from the authors upon request.

Acknowledgments

The authors thank the editor, associate editors, and reviewers for their positive and constructive comments on this paper.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study, in the collection, analyses, or interpretation of data, in the writing of the manuscript, or in the decision to publish the results.

Appendix A

Proof of Theorem 1. 
According to the Kuhn–Tucker conditions (Kuhn and Tucker [30]), ŵ maximizes the objective function (2) subject to the constraints if and only if
0 \ge \sum_{i=1}^{k} \frac{\partial T}{\partial w_i} (w_i - \hat{w}_i)
= \sum_{i=1}^{k} \left[ \frac{\partial \log |H(w)|}{\partial w_i} \bigg|_{w = \hat{w}} - c_i \right] (w_i - \hat{w}_i)
= \sum_{i=1}^{k} \left[ \operatorname{trace}\left( x_i x_i^\top H^{-1}(\hat{w}) \right) - c_i \right] (w_i - \hat{w}_i)
= \sum_{i=1}^{k} \left[ x_i^\top H^{-1}(\hat{w}) x_i - c_i \right] (w_i - \hat{w}_i)
= \sum_{i=1}^{k} \left[ x_i^\top H^{-1}(\hat{w}) x_i - c_i \right] w_i - \sum_{i=1}^{k} \hat{w}_i x_i^\top H^{-1}(\hat{w}) x_i + \sum_{i=1}^{k} \hat{w}_i c_i
= \sum_{i=1}^{k} \left[ x_i^\top H^{-1}(\hat{w}) x_i - c_i \right] w_i - \operatorname{trace}\left( \sum_{i=1}^{k} \hat{w}_i x_i x_i^\top H^{-1}(\hat{w}) \right) + \sum_{i=1}^{k} \hat{w}_i c_i
= \sum_{i=1}^{k} \left[ x_i^\top H^{-1}(\hat{w}) x_i - c_i \right] w_i - p + \sum_{i=1}^{k} \hat{w}_i c_i,
for all w with w_i ≥ 0 and Σ_{i=1}^k w_i = 1, which implies (6) and (7). □
To prove the monotonicity of the sequence { T ( w ( m ) ) } , we give Lemmas A1–A3.
Lemma A1. 
For m \ge 1, w_1^{(m)} c_1 + w_2^{(m)} c_2 + \cdots + w_k^{(m)} c_k = w_1^{(m-1)} c_1 + w_2^{(m-1)} c_2 + \cdots + w_k^{(m-1)} c_k.
Proof. 
According to the updating formula in step 2 of Algorithm 1, for m \ge 1 we have
w_1^{(m)} c_1 + w_2^{(m)} c_2 + \cdots + w_k^{(m)} c_k = \sum_{i=1}^{k} w_i^{(m-1)} x_i^\top H^{-1}(w^{(m-1)}) x_i + \sum_{i=1}^{k} w_i^{(m-1)} c_i - p.
Further, we have
\sum_{i=1}^{k} w_i^{(m-1)} x_i^\top H^{-1}(w^{(m-1)}) x_i = \sum_{i=1}^{k} w_i^{(m-1)} \operatorname{trace}\left( x_i x_i^\top H^{-1}(w^{(m-1)}) \right) = \operatorname{trace}\left( \sum_{i=1}^{k} w_i^{(m-1)} x_i x_i^\top H^{-1}(w^{(m-1)}) \right) = \operatorname{trace}\left( H(w^{(m-1)}) H^{-1}(w^{(m-1)}) \right) = p.
Therefore,
w_1^{(m)} c_1 + w_2^{(m)} c_2 + \cdots + w_k^{(m)} c_k = p + \sum_{i=1}^{k} w_i^{(m-1)} c_i - p = w_1^{(m-1)} c_1 + w_2^{(m-1)} c_2 + \cdots + w_k^{(m-1)} c_k. □
Lemma A2. 
Let H(w_1, \ldots, w_k) = \sum_{i=1}^{k} w_i x_i x_i^\top, with w_i \ge 0, where each x_i x_i^\top is a nonnegative definite matrix on X and |H(w_1, \ldots, w_k)| \ne 0; then,
\log |H(w_1^*, \ldots, w_k^*)| - \log |H(w_1, \ldots, w_k)| \ge \sum_{i=1}^{k} w_i \operatorname{trace}\left( H^{-1}(w_1, \ldots, w_k)\, x_i x_i^\top \right) \log \frac{w_i^*}{w_i}
for w_1^* \ge 0, \ldots, w_k^* \ge 0.
Proof. 
See Gao et al. [8]. □
Lemma A3. 
Suppose that w^* = (w_1^*, \ldots, w_k^*) and w = (w_1, \ldots, w_k), which satisfy \sum_{i=1}^{k} w_i^* = \sum_{i=1}^{k} w_i, are two nonnegative vectors in R^k; then,
\sum_{i=1}^{k} |w_i^* - w_i| \le \left[ 2 \sum_{i=1}^{k} w_i^* \log \frac{w_i^*}{w_i} \right]^{1/2}.
Proof. 
See Kullback [31]. □
Proof of Theorem 2. 
T(w^{(m)}) - T(w^{(m-1)}) = \log |H(w^{(m)})| - \sum_{i=1}^{k} w_i^{(m)} c_i - \log |H(w^{(m-1)})| + \sum_{i=1}^{k} w_i^{(m-1)} c_i
= \log |H(w^{(m)})| - \log |H(w^{(m-1)})| \quad (\text{by Lemma A1})
\ge \sum_{i=1}^{k} w_i^{(m-1)} x_i^\top H^{-1}(w^{(m-1)}) x_i \log \frac{w_i^{(m)}}{w_i^{(m-1)}} \quad (\text{by Lemma A2})
= \sum_{i=1}^{k} \left[ w_i^{(m)} (p + c_i) - w_i^{(m-1)} \sum_{j=1}^{k} w_j^{(m-1)} c_j \right] \log \frac{w_i^{(m)}}{w_i^{(m-1)}}
= \sum_{i=1}^{k} t_i w_i^{(m)} \log \frac{t_i w_i^{(m)}}{t_i w_i^{(m-1)}} + s \sum_{i=1}^{k} w_i^{(m-1)} \log \frac{w_i^{(m-1)}}{w_i^{(m)}}
\ge \frac{1}{2} \left( \sum_{i=1}^{k} t_i |w_i^{(m)} - w_i^{(m-1)}| \right)^2 + \frac{s}{2} \left( \sum_{i=1}^{k} |w_i^{(m)} - w_i^{(m-1)}| \right)^2 \quad (\text{by Lemma A3})
\ge \frac{1}{2} (\eta^2 + s) \left( \sum_{i=1}^{k} |w_i^{(m)} - w_i^{(m-1)}| \right)^2 \ge 0,
where t_i = p + c_i, s = \sum_{i=1}^{k} w_i^{(m-1)} c_i, and \eta = \min\{ t_i \} for i = 1, \ldots, k. □
To obtain the convergence of the sequence { T ( w ( m ) ) } , we give Lemma A4.
Lemma A4. 
Suppose that H \in R^{p \times p} is a nonnegative definite matrix; then,
|H| \le \prod_{i=1}^{p} h_{ii},
where h_{ij} is the (i, j)-th element of H.
Proof. 
See Anderson [32]. □
By Lemma A4, T(w) = \log |H(w)| - \sum_{i=1}^{k} w_i c_i \le \log |H(w)| \le \sum_{i=1}^{p} \log h_{ii};
that is, the sequence { T ( w ( m ) ) } is uniformly bounded and monotonically increasing, so it is convergent. We have
0 = \lim_{m \to \infty} \left[ T(w^{(m)}) - T(w^{(m-1)}) \right] = \lim_{m \to \infty} \left[ \log |H(w^{(m)})| - \log |H(w^{(m-1)})| \right] \ge \lim_{m \to \infty} \frac{1}{2} (\eta^2 + s) \left( \sum_{i=1}^{k} |w_i^{(m)} - w_i^{(m-1)}| \right)^2 \ge 0,
which implies \| w^{(m)} - w^{(m-1)} \| \to 0.
Proof of Theorem 3. 
The sequence { w ( m ) } is convergent, and Gao et al. [8] gave the proof. Let w ^ = lim m w ( m ) ; then,
\hat{w}_i = \lim_{m \to \infty} w_i^{(m)} = \lim_{m \to \infty} \frac{w_i^{(m-1)} \left[ x_i^\top H^{-1}(w^{(m-1)}) x_i + \sum_{j=1}^{k} w_j^{(m-1)} c_j \right]}{p + c_i} = \frac{\hat{w}_i \left[ x_i^\top H^{-1}(\hat{w}) x_i + \sum_{j=1}^{k} \hat{w}_j c_j \right]}{p + c_i},
which implies
x_i^\top H^{-1}(\hat{w}) x_i + \sum_{j=1}^{k} \hat{w}_j c_j = p + c_i \quad \text{for } \hat{w}_i \ne 0
and
x_i^\top H^{-1}(\hat{w}) x_i + \sum_{j=1}^{k} \hat{w}_j c_j \le p + c_i \quad \text{for } \hat{w}_i = 0.
So, the sequence {w^(m)} converges to E_D-optimality. □
Proof of Theorem 4. 
According to the Kuhn–Tucker conditions (Kuhn and Tucker [30]), ŵ minimizes the objective function (4) subject to the constraints if and only if
0 \le \sum_{i=1}^{k} \frac{\partial G}{\partial w_i} (w_i - \hat{w}_i)
= \sum_{i=1}^{k} \left[ \frac{\partial \log\left( \operatorname{trace}\left( H^{-1}(w) \right) \right)}{\partial w_i} \bigg|_{w = \hat{w}} + c_i \right] (w_i - \hat{w}_i)
= -\sum_{i=1}^{k} \left[ \frac{\operatorname{trace}\left( H^{-1}(\hat{w}) x_i x_i^\top H^{-1}(\hat{w}) \right)}{\operatorname{trace}\left( H^{-1}(\hat{w}) \right)} - c_i \right] w_i + \frac{1}{\operatorname{trace}\left( H^{-1}(\hat{w}) \right)} \sum_{i=1}^{k} \operatorname{trace}\left( H^{-1}(\hat{w})\, \hat{w}_i x_i x_i^\top H^{-1}(\hat{w}) \right) - \sum_{i=1}^{k} \hat{w}_i c_i
= -\sum_{i=1}^{k} \left[ \frac{\operatorname{trace}\left( H^{-1}(\hat{w}) x_i x_i^\top H^{-1}(\hat{w}) \right)}{\operatorname{trace}\left( H^{-1}(\hat{w}) \right)} - c_i \right] w_i + \frac{1}{\operatorname{trace}\left( H^{-1}(\hat{w}) \right)} \operatorname{trace}\left( H^{-1}(\hat{w}) \sum_{i=1}^{k} \hat{w}_i x_i x_i^\top H^{-1}(\hat{w}) \right) - \sum_{i=1}^{k} \hat{w}_i c_i
= -\sum_{i=1}^{k} \left[ \frac{\operatorname{trace}\left( H^{-1}(\hat{w}) x_i x_i^\top H^{-1}(\hat{w}) \right)}{\operatorname{trace}\left( H^{-1}(\hat{w}) \right)} - c_i \right] w_i + 1 - \sum_{i=1}^{k} \hat{w}_i c_i
for all w with w_i ≥ 0 and Σ_{i=1}^k w_i = 1, which implies (8) and (9). □
Table A1. Design points, costs, E_D-optimal weights, and values of objective function (2) when p = 5.
Design Points (k = 8) | Costs | Weights | Design Points (k = 12) | Costs | Weights
(0.92,0.99,−0.77,−0.20,−0.03) | 0.97 | 0.0831 | (0.60,0.46,0.98,−0.07,0.48) | 0.32 | 0.0039
(−0.56,0.58,0.58,0.05,−0.07) | 0.30 | 0.1428 | (−0.86,0.04,−0.34,0.89,0.44) | 0.61 | 0.1021
(0.45,−0.64,−0.46,0.34,−0.95) | 0.43 | 0.1486 | (−0.42,−0.76,0.99,−0.57,0.80) | 0.84 | 0.1537
(−0.75,−0.96,−0.52,0.29,−0.25) | 0.40 | 0.1300 | (0.76,0.56,0.86,−0.57,0.75) | 0.66 | 0.1753
(0.75,0.28,0.99,−0.29,0.78) | 0.38 | 0.0929 | (0.26,0.32,−0.79,0.78,0.17) | 0.18 | 0.0052
(0.87,0.32,−0.32,0.99,−0.89) | 0.43 | 0.1814 | (0.39,−0.48,0.55,−0.24,0.29) | 0.76 | 0.0000
(−0.54,−0.68,−0.05,0.38,−0.98) | 0.86 | 0.0813 | (0.95,−0.99,−0.45,0.11,−0.73) | 0.45 | 0.1670
(−0.64,0.88,−0.69,−0.69,0.11) | 0.65 | 0.1399 | (−0.55,−0.26,−0.19,−0.95,0.89) | 0.21 | 0.2027
 | | | (−0.35,−0.36,−0.28,0.33,0.85) | 0.47 | 0.0714
 | | | (−0.19,−0.48,0.64,−0.90,0.09) | 0.59 | 0.0000
 | | | (0.77,0.04,−0.92,0.95,−0.31) | 0.32 | 0.1186
 | | | (−0.75,−0.06,0.05,−0.82,0.03) | 0.84 | 0.0000
T(w) (k = 8): −7.2778; T(w) (k = 12): −5.8847
When p = 5 and k = 8 or k = 12, we first generated the design points, costs, and initial weights. According to Algorithm 1, we obtained the E_D-optimal weights and the corresponding objective function values.
Table A2. Design points, costs, E_D-optimal weights, and values of objective function (2) when k = 10.
Design Points (p = 3) | Costs | Weights | Design Points (p = 6) | Costs | Weights
(−0.86,0.31,0.75) | 0.16 | 0.0002 | (0.23,−0.30,−0.35,−0.98,−0.59,−0.16) | 0.47 | 0.1556
(−0.44,−0.44,−0.65) | 0.04 | 0.0000 | (−0.54,−0.22,−0.45,0.15,0.30,0.35) | 0.97 | 0.1362
(0.17,−0.94,0.80) | 0.04 | 0.0927 | (0.60,−0.49,−0.94,0.21,−0.35,−0.31) | 0.76 | 0.0851
(−0.43,−0.87,0.70) | 0.55 | 0.1759 | (−0.82,1.00,0.04,0.67,−0.66,0.91) | 0.83 | 0.1298
(−0.80,0.96,−0.68) | 0.86 | 0.1610 | (−0.71,0.61,−0.67,0.65,0.14,0.85) | 0.74 | 0.0815
(−0.69,0.30,−0.73) | 0.27 | 0.0000 | (0.45,−0.70,−0.70,0.69,−0.59,−0.54) | 0.36 | 0.1175
(0.26,0.24,−0.29) | 0.28 | 0.0000 | (−0.22,−0.67,0.88,0.50,−0.39,0.60) | 0.52 | 0.1643
(0.21,−0.15,−0.80) | 0.36 | 0.0000 | (−0.33,0.25,−0.17,0.49,0.48,−0.04) | 0.44 | 0.0001
(0.36,0.62,0.81) | 0.10 | 0.3208 | (0.29,−0.41,−0.21,−0.92,0.30,0.23) | 0.41 | 0.1296
(−0.93,0.63,0.82) | 0.61 | 0.2494 | (0.41,−0.93,0.01,−0.36,−0.39,−0.16) | 0.69 | 0.0001
T(w) (p = 3): −2.508; T(w) (p = 6): −10.2531
When k = 10 and p = 3 or p = 6, we first generated the design points, costs, and initial weights. According to Algorithm 1, we obtained the E_D-optimal weights and the corresponding objective function values.
Table A3. Design points, costs, E_A-optimal weights, and values of objective function (4) when p = 5.
Design Points (k = 8) | Costs | Weights | Design Points (k = 12) | Costs | Weights
(−0.99,−0.95,0.64,−0.47,0.56) | 0.40 | 0.0020 | (0.14,0.45,0.11,0.63,0.00) | 0.02 | 0.0000
(−0.42,−0.54,0.19,0.47,−0.05) | 0.92 | 0.0000 | (−0.12,−1.00,0.24,0.31,0.75) | 0.27 | 0.0000
(0.01,−0.13,0.39,0.00,0.60) | 0.05 | 0.3003 | (−0.03,0.85,−0.56,−0.84,−0.63) | 0.96 | 0.0046
(−0.88,−0.25,−0.77,0.48,0.94) | 0.05 | 0.1719 | (0.07,−0.75,0.52,−0.26,0.55) | 0.44 | 0.0000
(0.51,0.46,−0.85,0.98,0.20) | 0.49 | 0.1488 | (0.45,−0.17,0.59,−0.90,0.07) | 0.24 | 0.0017
(−0.68,−0.19,0.47,0.10,−0.65) | 0.83 | 0.1320 | (−0.83,0.80,0.17,0.27,0.21) | 0.61 | 0.0496
(0.81,0.97,0.01,−0.77,−0.81) | 0.23 | 0.0909 | (−0.88,−0.30,0.94,−0.15,0.33) | 0.73 | 0.1503
(0.67,−0.78,0.10,0.04,−0.20) | 0.42 | 0.1540 | (−0.53,−0.91,−0.20,−0.69,−0.20) | 0.01 | 0.2335
 | | | (0.96,0.66,0.87,−0.72,−0.27) | 0.63 | 0.0338
 | | | (0.85,0.38,0.62,0.56,−0.90) | 0.40 | 0.1730
 | | | (−0.51,0.41,−0.60,0.51,−0.80) | 0.07 | 0.1868
 | | | (−0.40,0.94,0.25,−0.91,−0.73) | 0.03 | 0.1665
G(ŵ) (k = 8): 3.7949; G(ŵ) (k = 12): 3.0554
When p = 5 and k = 8 or k = 12, we first generated the design points, costs, and initial weights. According to Algorithm 2, we obtained the E_A-optimal weights and the corresponding objective function values.
Table A4. Design points, costs, E_A-optimal weights, and values of objective function (4) when k = 10.
Design Points (p = 3) | Costs | Weights | Design Points (p = 6) | Costs | Weights
(−0.80,−0.97,−0.26) | 0.22 | 0.2756 | (−0.86,−0.88,−0.47,0.76,−0.99,0.90) | 0.30 | 0.1015
(0.14,−0.67,−0.50) | 0.86 | 0.0000 | (0.00,0.47,−0.65,0.47,0.95,0.99) | 0.02 | 0.1511
(0.19,0.66,−0.21) | 0.81 | 0.0000 | (0.72,0.33,0.91,−0.35,0.44,0.59) | 0.34 | 0.0932
(0.46,−0.30,0.28) | 0.98 | 0.0000 | (0.77,0.99,−0.14,0.10,−0.87,0.46) | 0.29 | 0.1367
(0.32,−0.28,−0.02) | 0.58 | 0.0000 | (0.29,−0.70,−0.31,−0.45,−0.18,0.35) | 0.24 | 0.2404
(1.00,−0.54,−0.74) | 0.20 | 0.3158 | (−0.33,0.78,−0.39,−0.25,0.26,−0.24) | 0.75 | 0.0468
(0.59,0.83,0.57) | 0.15 | 0.0676 | (−0.29,−0.01,−0.64,−0.04,−0.77,0.71) | 0.58 | 0.0000
(0.32,−0.08,0.77) | 0.33 | 0.0007 | (−0.32,0.80,−0.51,−0.17,−0.47,−0.63) | 0.55 | 0.0756
(0.77,−0.05,−0.17) | 0.00 | 0.0000 | (0.75,0.83,−0.32,0.59,−0.22,0.97) | 0.70 | 0.0015
(0.17,−0.51,0.83) | 0.70 | 0.3403 | (−0.34,0.28,0.34,−0.54,−0.47,0.41) | 0.96 | 0.1533
G(ŵ) (p = 3): 2.2659; G(ŵ) (p = 6): 3.6571
When k = 10 and p = 3 or p = 6, we first generated the design points, costs, and initial weights. According to Algorithm 2, we obtained the E_A-optimal weights and the corresponding objective function values.
Table A5. The average number of iterations and average elapsed time of the E_D-optimal algorithm.
k | p | Average No. of Iterations (s.d.) | Average Elapsed Time in Sec. (s.d.)
10 | 4 | 76.6 (46.0) | 0.024 (0.015)
 | 5 | 52.7 (27.8) | 0.019 (0.011)
 | 8 | 21.6 (11.1) | 0.008 (0.009)
20 | 4 | 93.8 (55.5) | 0.079 (0.054)
 | 5 | 71.0 (28.6) | 0.062 (0.033)
 | 8 | 43.4 (9.0) | 0.038 (0.011)
 | 10 | 35.1 (7.2) | 0.031 (0.011)
 | 15 | 18.3 (6.8) | 0.017 (0.010)
30 | 4 | 99.8 (58.5) | 0.127 (0.079)
 | 5 | 70.5 (23.0) | 0.096 (0.037)
 | 8 | 53.7 (13.6) | 0.065 (0.015)
 | 10 | 39.1 (6.5) | 0.050 (0.011)
 | 15 | 27.4 (3.0) | 0.040 (0.011)
 | 20 | 18.6 (3.4) | 0.033 (0.010)
 | 25 | 9.0 (4.1) | 0.022 (0.013)
40 | 4 | 104.7 (53.2) | 0.179 (0.092)
 | 5 | 84.5 (34.7) | 0.137 (0.060)
 | 8 | 60.5 (16.4) | 0.107 (0.028)
 | 10 | 45.3 (7.3) | 0.078 (0.014)
 | 15 | 31.0 (2.7) | 0.062 (0.010)
 | 20 | 23.2 (1.5) | 0.056 (0.009)
 | 25 | 18.3 (2.8) | 0.050 (0.010)
 | 30 | 10.7 (3.6) | 0.038 (0.013)
The table gives the average number of iterations and the average running time in seconds for 50 simulations of the proposed Algorithm 1, with the standard deviation in parentheses.
Table A6. The average number of iterations and average elapsed time of the E_A-optimal algorithm.
k | p | Average No. of Iterations (s.d.) | Average Elapsed Time in Sec. (s.d.)
10 | 4 | 52.4 (24.2) | 0.052 (0.031)
 | 5 | 47.0 (22.2) | 0.044 (0.022)
 | 8 | 17.7 (7.5) | 0.019 (0.012)
20 | 4 | 91.2 (52.6) | 0.144 (0.079)
 | 5 | 81.3 (41.3) | 0.154 (0.065)
 | 8 | 42.2 (10.5) | 0.088 (0.026)
 | 10 | 30.5 (7.5) | 0.062 (0.017)
 | 15 | 15.2 (6.3) | 0.030 (0.014)
30 | 4 | 94.6 (48.6) | 0.270 (0.145)
 | 5 | 86.8 (30.8) | 0.252 (0.102)
 | 8 | 55.4 (15.0) | 0.164 (0.041)
 | 10 | 40.8 (7.5) | 0.124 (0.029)
 | 15 | 24.6 (4.2) | 0.084 (0.018)
 | 20 | 14.3 (3.8) | 0.064 (0.019)
 | 25 | 9.0 (1.9) | 0.046 (0.014)
40 | 4 | 112.7 (52.4) | 0.425 (0.186)
 | 5 | 89.8 (38.2) | 0.379 (0.143)
 | 8 | 56.5 (12.1) | 0.231 (0.061)
 | 10 | 46.1 (8.7) | 0.19 (0.04)
 | 15 | 27.8 (4.0) | 0.135 (0.021)
 | 20 | 19.3 (2.6) | 0.112 (0.018)
 | 25 | 12.6 (3.1) | 0.093 (0.023)
 | 30 | 8.7 (2.1) | 0.082 (0.021)
The table gives the average number of iterations and the average running time in seconds for 50 simulations of the proposed Algorithm 2, with the standard deviation in parentheses.

References

1. Kiefer, J. General equivalence theory for optimum designs (approximate theory). Ann. Stat. 1974, 2, 849–879.
2. Kiefer, J.; Wolfowitz, J. Optimum designs in regression problems. Ann. Math. Stat. 1959, 30, 271–294.
3. Elfving, G. Optimum allocation in linear regression theory. Ann. Math. Stat. 1952, 23, 255–262.
4. Dette, H.; Studden, W.J. Canonical Moments and Optimal Design—First Application. In The Theory of Canonical Moments with Applications in Statistics, Probability, and Analysis; John Wiley & Sons: New York, NY, USA, 1997; pp. 128–159.
5. Silvey, S.D. Optimal Design; Chapman and Hall: New York, NY, USA, 1980.
6. Box, G.E.P.; Draper, N.R. Empirical Model-Building and Response Surfaces; John Wiley & Sons: New York, NY, USA, 1987; pp. 229–231.
7. Gilmour, S.G.; Trinca, L.A. Optimum design of experiments for statistical inference. J. R. Stat. Soc. Ser. C Appl. Stat. 2012, 61, 345–401.
8. Gao, W.; Chan, P.S.; Ng, H.K.T.; Lu, X. Efficient computational algorithm for optimal allocation in regression models. J. Comput. Appl. Math. 2014, 261, 118–126.
9. Wynn, H.P. The sequential generation of D-optimum experimental designs. Ann. Math. Stat. 1970, 41, 1655–1664.
10. Fedorov, V.V. Continuous Optimal Designs (Statistical Methods). In Theory of Optimal Experiments; Academic Press: New York, NY, USA, 1972; pp. 64–153.
11. Atwood, C.L. Sequences converging to D-optimal designs of experiments. Ann. Stat. 1973, 1, 342–352.
12. St. John, R.C.; Draper, N.R. D-optimality for regression designs: A review. Technometrics 1975, 17, 15–23.
13. Silvey, S.D.; Titterington, D.M.; Torsney, B. An algorithm for optimal designs on a finite design space. Commun. Stat. Theory Methods 1978, 7, 1379–1389.
14. Yu, Y. D-optimal designs via a cocktail algorithm. Stat. Comput. 2011, 21, 475–481.
15. Martín-Martín, R.; Torsney, B.; López-Fidalgo, J. Construction of marginally and conditionally restricted designs using multiplicative algorithms. Comput. Stat. Data Anal. 2007, 51, 5547–5561.
16. Harman, R.; Pronzato, L. Improvements on removing nonoptimal support points in D-optimum design algorithms. Stat. Probab. Lett. 2007, 77, 90–94.
17. De Castro, Y.; Gamboa, F.; Henrion, D.; Hess, R.; Lasserre, J.-B. Approximate optimal designs for multivariate polynomial regression. Ann. Stat. 2019, 47, 127–155.
18. Harman, R.; Filová, L.; Richtárik, P. A randomized exchange algorithm for computing optimal approximate designs of experiments. J. Am. Stat. Assoc. 2020, 115, 348–361.
19. Duan, J.; Gao, W.; Ng, H.K.T. Efficient computational algorithm for optimal continuous experimental designs. J. Comput. Appl. Math. 2019, 350, 98–113.
20. Duan, J.; Gao, W.; Ma, Y.; Ng, H.K.T. Efficient computational algorithms for approximate optimal designs. J. Stat. Comput. Simul. 2022, 92, 764–793.
21. Yang, M.; Biedermann, S.; Tang, E. On optimal designs for nonlinear models: A general and efficient algorithm. J. Am. Stat. Assoc. 2013, 108, 1411–1420.
22. Jones, B.; Allen-Moyer, K.; Goos, P. A-optimal versus D-optimal design of screening experiments. J. Qual. Technol. 2021, 53, 369–382.
23. Torsney, B.; Martín-Martín, R. Multiplicative algorithms for computing optimum designs. J. Stat. Plan. Inference 2009, 139, 3947–3961.
24. Yu, Y. Monotonic convergence of a general algorithm for computing optimal designs. Ann. Stat. 2010, 38, 1593–1606.
25. Goudarzi, M.; Khazaei, S.; Jafari, H. D-optimal designs for linear mixed model with random effects of Dirichlet process. Commun. Stat. Simul. Comput. 2021, 1–10.
26. Harman, R.; Bachratá, A.; Filová, L. Construction of efficient experimental designs under multiple resource constraints. Appl. Stoch. Models Bus. Ind. 2016, 32, 3–17.
27. Harman, R.; Benková, E. Barycentric algorithm for computing D-optimal size- and cost-constrained designs of experiments. Metrika 2017, 80, 201–225.
28. Coetzer, R.; Haines, L.M. The construction of D- and I-optimal designs for mixture experiments with linear constraints on the components. Chemometr. Intell. Lab. Syst. 2017, 171, 112–124.
29. Kiefer, J.; Wolfowitz, J. The equivalence of two extremum problems. Can. J. Math. 1960, 12, 363–366.
30. Kuhn, H.W.; Tucker, A.W. Nonlinear programming. In Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability; Neyman, J., Ed.; University of California Press: Berkeley, CA, USA, 1951; pp. 481–492.
31. Kullback, S. A lower bound for discrimination information in terms of variation (Corresp.). IEEE Trans. Inf. Theory 1967, 13, 126–127.
32. Anderson, T.W. An Introduction to Multivariate Statistical Analysis, 3rd ed.; John Wiley & Sons: New York, NY, USA, 2003.
