Appendix A.1. Numerically Searching for an Optimum and Gradients Calculation
The optimization algorithms in MSC Nastran belong to a family of methods generally referred to as ‘gradient-based’, since they use function gradients in addition to function values in order to assist in the numerical search for an optimum. The numerical search process can be summarized as follows: for a given point in the design space, we determine the gradients of the objective function and its constraints, and use this information to determine a search direction. We then proceed in this direction as far as possible, after which we investigate to see if we are at an optimum point. If we are not, we repeat the process until we can make no further improvement in our objective without violating any of the constraints.
The first step in a numerical search procedure is determining the direction in which to search. The situation may be somewhat complicated if the current design is infeasible (with one or more violated constraints) or if one or more constraints are critical. For an infeasible design, we are outside of one of the fences, to use a hill analogy. In the case of a critical design, we are standing immediately adjacent to a fence. In general, we need to know at least the gradient of our objective function and perhaps some of the constraint functions as well. The process of taking small steps in each of the design variable directions (supposing we are not restricted by the fences for this step) corresponds exactly to the mathematical concept of a first-forward finite difference approximation of a derivative. For a single independent variable, the first-forward difference is given by
$$\frac{\partial f}{\partial x} \approx \frac{f(x + \Delta x) - f(x)}{\Delta x} \tag{A1}$$
where the quantity $\Delta x$ represents the small step taken in the direction of $x$. For most practical design tasks, we are concerned with a vector of design variables $\mathbf{X} = [x_1, x_2, \ldots, x_n]^T$. The resultant vector of partial derivatives, or gradient, of the function can be written as
$$\nabla f(\mathbf{X}) = \left[ \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right]^T \tag{A2}$$
where each partial derivative is a single component of this $n$-dimensional vector.
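The componentwise first-forward difference described above can be sketched in a few lines. This is an illustrative implementation, not MSC Nastran's internal sensitivity code; the function name and default step size are assumptions.

```python
import numpy as np

def forward_diff_gradient(f, x, dx=1e-6):
    """First-forward finite-difference approximation of the gradient of f at x."""
    x = np.asarray(x, dtype=float)
    f0 = f(x)
    grad = np.zeros_like(x)
    for i in range(x.size):
        x_step = x.copy()
        x_step[i] += dx          # small step in the i-th design variable direction
        grad[i] = (f(x_step) - f0) / dx
    return grad
```

For example, for $f(\mathbf{X}) = x_1^2 + 3x_2$ at the point $(1, 0)$, the routine returns approximately $[2, 3]$, matching the analytic gradient to within the truncation error of the forward difference.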
Physically, the gradient vector points uphill, or in the direction of increasing objective function. If we want to minimize the objective function, we will actually move in a direction opposite to that of the gradient. The steepest descent algorithm searches in the direction defined by the negative of the objective function gradient, or
$$\mathbf{S} = -\nabla F(\mathbf{X}) \tag{A3}$$
Proceeding in this direction reduces the function value most rapidly; $\mathbf{S}$ is referred to as the search vector.
MSC Nastran uses the steepest descent direction only when none of the constraints are critical or violated; even then, it is only used as the starting point for other, more efficient search algorithms. The difficulty in practice stems from the fact that, although the direction of steepest descent is usually an appropriate starting direction, subsequent search directions often fail to improve the objective function significantly. In MSC Nastran, more efficient methods, which can be generalized for the cases of active and/or violated constraints, are usually used.
Once a search direction has been determined, we proceed 'downhill' until we collide with a fence, or until we reach the lowest point along our current path. It is important to note that this requires us to take a number of steps in this given direction, which is equivalent to a number of function evaluations in numerical optimization. For a search direction $\mathbf{S}$ and a vector of design variables $\mathbf{X}$, the new design at the conclusion of our search in this direction can be written as
$$\mathbf{X}_{\text{new}} = \mathbf{X}_{\text{old}} + \alpha^* \mathbf{S} \tag{A4}$$
where $\mathbf{X}_{\text{old}}$ is the initial vector of design variables, $\mathbf{S}$ is the search vector, and $\alpha^*$ is the value of the search parameter $\alpha$ that yields the optimal design in the direction defined by $\mathbf{S}$. Equation (A4) represents a one-dimensional search, since the update on $\mathbf{X}$ depends only on the single scalar parameter $\alpha$. This relation allows us to update a potentially huge number of design variables by varying the single parameter $\alpha$. When we can no longer proceed in this search direction, we have the value of $\alpha$ that represents the move required to reach the best design possible for this particular direction. This value is defined as $\alpha^*$. The new objective and constraints can now be expressed as
$$F(\mathbf{X}_{\text{new}}) = F(\mathbf{X}_{\text{old}} + \alpha^* \mathbf{S}), \qquad g_j(\mathbf{X}_{\text{new}}) = g_j(\mathbf{X}_{\text{old}} + \alpha^* \mathbf{S}) \tag{A5}$$
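The one-dimensional search over the single parameter $\alpha$ can be sketched with a golden-section search. This is a minimal unconstrained stand-in (no 'fences') under assumed interface and bracket choices, not the bounded, constrained line search an optimizer such as MSC Nastran performs.

```python
import numpy as np

def line_search(f, x_old, s, alpha_max=1.0, tol=1e-6):
    """Golden-section search for alpha* along direction s.

    Minimizes phi(alpha) = f(x_old + alpha * s) for alpha in [0, alpha_max],
    then returns the updated design x_new = x_old + alpha* * s.
    """
    phi = lambda a: f(x_old + a * s)
    inv_gr = (np.sqrt(5.0) - 1.0) / 2.0      # inverse golden ratio, about 0.618
    lo, hi = 0.0, alpha_max
    a = hi - inv_gr * (hi - lo)
    b = lo + inv_gr * (hi - lo)
    while hi - lo > tol:
        if phi(a) < phi(b):                  # minimum lies in [lo, b]
            hi, b = b, a
            a = hi - inv_gr * (hi - lo)
        else:                                # minimum lies in [a, hi]
            lo, a = a, b
            b = lo + inv_gr * (hi - lo)
    alpha_star = 0.5 * (lo + hi)
    return x_old + alpha_star * s, alpha_star
```

For $f(\mathbf{X}) = x_1^2 + x_2^2$ starting from $(1, 1)$ with $\mathbf{S} = (-1, -1)$, the search finds $\alpha^* \approx 1$ and lands at the origin, the best design along that direction.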
From this new point in the design space, we can again compute the gradients and establish another search direction based on this information. Again, we will proceed in this new direction until no further improvement can be made, repeating the process if necessary. At a certain point, we will not be able to establish a search direction that can yield an improved design. We may be at the bottom of the hill, or we may have proceeded as far as possible without crossing over a fence. In the numerical search algorithm, it is necessary to have some formal definition of an optimum. Any trial design can then be measured against this criterion to see if it is met, and if an optimum has been found. This required definition is provided by the Kuhn–Tucker conditions.
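The repeated cycle of gradient evaluation, search-direction selection, and one-dimensional search can be assembled into a compact unconstrained steepest-descent loop. This is a pedagogical sketch of the process described above, with forward-difference gradients, $\mathbf{S} = -\nabla F$, and a crude backtracking step in place of a full line search; it is not the algorithm MSC Nastran actually uses, and all names and tolerances are illustrative.

```python
import numpy as np

def steepest_descent(f, x0, dx=1e-6, tol=1e-5, max_iter=200):
    """Iterate: gradient -> search direction -> 1-D search, until stationary."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        f0 = f(x)
        # forward-difference gradient, one small step per design variable
        grad = np.array([(f(x + dx * e) - f0) / dx for e in np.eye(x.size)])
        if np.linalg.norm(grad) < tol:       # stationary: no improving direction exists
            break
        s = -grad                            # steepest descent search vector
        alpha = 1.0
        while f(x + alpha * s) >= f0 and alpha > 1e-12:
            alpha *= 0.5                     # backtrack until the step improves f
        x = x + alpha * s                    # one-dimensional design update
    return x
```

Minimizing $(x_1 - 1)^2 + (x_2 + 2)^2$ from the origin drives the design to $(1, -2)$, where the vanishing gradient terminates the search.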
Figure A1 shows a two design variable space with constraints $g_1(\mathbf{X})$ and $g_2(\mathbf{X})$ and the objective function $F(\mathbf{X})$. The constraint boundaries are those curves for which the constraint values are both zero. A few contours of constant objective are shown as well; these can be thought of as contour lines drawn along constant elevations of the hill. The optimum point in this example is the point which lies at the intersection of the two constraints. This location is shown as $\mathbf{X}^*$.
Figure A1.
Kuhn–Tucker condition at a constrained optimum.
If we compute the gradients of the objective and the two active constraints at the optimum, we see that they all point off in roughly different directions (it should be remembered that function gradients point in the direction of increasing function values). For this situation, a constrained optimum, the Kuhn–Tucker conditions state that the vector sum of the gradients of the objective and all active constraints must be equal to zero, given an appropriate choice of multiplying factors. These factors are called the Lagrange multipliers. Constraints which are not active at the proposed optimum are not included in the vector summation.
Figure A2 shows this to be the case, where $\lambda_1$ and $\lambda_2$ are the values of the Lagrange multipliers that enable the zero vector sum condition
$$\nabla F(\mathbf{X}^*) + \lambda_1 \nabla g_1(\mathbf{X}^*) + \lambda_2 \nabla g_2(\mathbf{X}^*) = \mathbf{0}$$
to be met. We can convince ourselves that this condition could not be met for any other point in the neighboring design space.
Figure A2.
Graphical interpretation of Kuhn–Tucker conditions.
The Kuhn–Tucker conditions are useful even if there are no active constraints at the optimum. In this case, only the objective function gradient is considered, and this is identically equal to zero; any finite move in any direction will not decrease the objective function. A zero objective function gradient indicates a stationary condition. Not only are the Kuhn–Tucker conditions useful in determining whether we have achieved an optimal design, but they are also physically intuitive. The optimizer tests the Kuhn–Tucker conditions in connection with the search direction determination algorithm.
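A numerical test of the Kuhn–Tucker conditions can be sketched as follows: solve for the Lagrange multipliers that best cancel the objective gradient against the active constraint gradients, then check that the residual vanishes and that every multiplier is non-negative. This is an illustrative check under assumed interfaces (gradients supplied as NumPy arrays), not the optimizer's internal test.

```python
import numpy as np

def kuhn_tucker_check(grad_f, active_grads, tol=1e-8):
    """Test grad_f + sum_j lambda_j * active_grads[j] = 0 with lambda_j >= 0.

    Returns (satisfied, lambdas). With no active constraints, the test
    reduces to requiring a zero objective gradient (a stationary point).
    """
    grad_f = np.asarray(grad_f, dtype=float)
    if not active_grads:
        return bool(np.linalg.norm(grad_f) < tol), np.array([])
    A = np.column_stack(active_grads)          # columns: active constraint gradients
    lam, *_ = np.linalg.lstsq(A, -grad_f, rcond=None)
    residual = grad_f + A @ lam                # should vanish at a constrained optimum
    ok = bool(np.linalg.norm(residual) < tol and np.all(lam >= -tol))
    return ok, lam
```

For instance, with $\nabla F = (-1, -1)$ and active constraint gradients $(1, 0)$ and $(0, 1)$, the multipliers $\lambda_1 = \lambda_2 = 1$ close the vector sum and the check passes; a negative multiplier or a nonzero residual would flag the point as non-optimal.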