1. Introduction
The traditional conjugate gradient method is mainly based on the steepest-descent method or the Newton method [1]. The steepest-descent method can reach the immediate neighborhood of the optimal point; however, its searching ability deteriorates as the distance between the iterative and the optimal points becomes small [2]. The Newton method is also referred to as a gradient-based optimization method [3]. Compared with the steepest-descent method, the Newton method converges even faster. Unfortunately, if the starting point is too far away from the optimal point, the Newton iteration may fail to converge. The traditional conjugate gradient method (CGM) combines the above two methods and thus takes advantage of both. However, the feasibility of this method is still limited because the objective function must be cast into a sum-of-squared-differences form [4]; hence, it is not suitable for multi-goal optimization.
Cheng and Chang [5] proposed a simplified conjugate gradient method (SCGM). The SCGM method is a local optimization scheme suitable for searching for optimal parameters within a bounded range. With this method, the step sizes corresponding to the designed parameters are fixed during the iteration, and the sensitivity of the objective function to perturbations of the designed parameters is evaluated by direct numerical differentiation. In this manner, the objective function can be defined flexibly and is not limited to a sum-of-squared-differences form. Besides, the SCGM method features a simplified mathematical formulation, and hence it has been widely applied in the optimization of various engineering devices, such as fuel cells [6], thermoelectric coolers [7], micro reformers [8], and so on. However, one major disadvantage is that the SCGM method may converge slowly because the step size is fixed.
To further improve the SCGM method, an efficient method named the variable-step simplified conjugate gradient method (VSCGM) is proposed in this study. This method is a further modified version of the SCGM method. In the VSCGM method, the step size of the iteration is varied and adjusted automatically, and the adjustment depends on a gain function and the ratio of search directions. The VSCGM method greatly accelerates convergence while the form of the objective function can still be defined flexibly.
In this study, a direct solution provider is built based on a deep learning neural network algorithm, which learns to map inputs to outputs given a training dataset of examples. The training process involves finding a set of weights in the network that solves the specific problems accurately enough. The first step towards the artificial neuron was taken by McCulloch and Pitts in 1943, inspired by neurobiology [9,10]. Rosenblatt [11] used a probabilistic model for information storage and organization to simulate the perception and learning ability of the brain. Later, DARPA employed a layered method to explore this new terrain, and the development of the DARPA neural network model is described in [12]. Rumelhart, McClelland, and the PDP Research Group [13] assumed that the mind consists of a great number of elementary units connected in a neural network. Recently, many neural models have been proposed by related researchers. Rumelhart, Hinton, and Williams [14,15] presented a backpropagation model that can efficiently calculate the gradient of the loss function with respect to the weights of the network, making it feasible to train multi-layer networks with gradient methods by updating the weights to minimize the loss. Munakata [16] introduced the fundamentals of the backpropagation model and stated that a neural network is composed of artificial neurons and interconnections; such a network can be viewed as a graph, with neurons as nodes and interconnections as edges. A general review of the backpropagation model is given by Goodfellow, Bengio, and Courville [17]. In machine learning, especially deep learning, the backpropagation model is widely used for training feedforward neural networks with supervised learning. Among the known backpropagation-based methods, the Levenberg–Marquardt method [18] appears to be the fastest for training moderate-sized feedforward neural networks. Thus, in the present study, the neural network algorithm is developed based on this method [19].
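As a concrete illustration, the Levenberg–Marquardt update w ← w − (JacᵀJac + μI)⁻¹Jacᵀr (Jac being the Jacobian of the residuals r) can be sketched on a toy curve-fitting problem. The model and data below are illustrative assumptions, not the engine dataset or the training code of this study:

```python
import numpy as np

# Sketch of the Levenberg-Marquardt update on a toy exponential fit:
#   w <- w - (Jac^T Jac + mu*I)^(-1) Jac^T r
def levenberg_marquardt(residual, jacobian, w, mu=1e-2, iters=60):
    for _ in range(iters):
        r = residual(w)
        Jac = jacobian(w)
        step = np.linalg.solve(Jac.T @ Jac + mu * np.eye(len(w)), Jac.T @ r)
        w_trial = w - step
        # Accept the step and relax the damping if the error decreased;
        # otherwise increase the damping (more gradient-descent-like step).
        if np.sum(residual(w_trial) ** 2) < np.sum(r ** 2):
            w, mu = w_trial, mu * 0.5
        else:
            mu *= 2.0
    return w

x = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(1.5 * x)                            # noise-free target data
residual = lambda w: w[0] * np.exp(w[1] * x) - y     # r(w)
jacobian = lambda w: np.stack(                       # dr_i/dw_j
    [np.exp(w[1] * x), w[0] * x * np.exp(w[1] * x)], axis=1)

w_fit = levenberg_marquardt(residual, jacobian, np.array([1.0, 1.0]))
print(w_fit)  # close to the true parameters [2.0, 1.5]
```

The damping parameter μ blends between a Gauss–Newton step (small μ) and a steepest-descent step (large μ), which is what makes the method fast yet stable for moderate-sized networks.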
On the other hand, the Stirling engine is an external combustion engine that can be operated at a low temperature difference between the high- and low-temperature thermal reservoirs and is compatible with a variety of heat sources, such as solar energy, geothermal energy, and industrial waste heat. Besides, the engine features low noise, high efficiency, and safe operation; therefore, the Stirling engine can serve as an alternative power machine to mitigate global warming and reduce the usage of fossil fuels. In terms of mechanical structure, Stirling engines are divided into three configurations: alpha-, beta-, and gamma-type [20]. Among them, the gamma-type Stirling engine is the most popular configuration for exploiting low-temperature thermal energy. Recently, Cheng, Le, and Huang [21] developed a computational fluid dynamics (CFD) module of a low-temperature-differential gamma-type Stirling engine that might be used for recycling waste heat at 423 to 700 K. They used the CFD module to investigate the effects of the geometrical and operating parameters on the indicated power output and the thermal efficiency of the engine. However, in their parametric analysis, when the effects of a parameter were investigated to find its optimal value, all other parameters were fixed at prescribed values; the optimization is therefore referred to as one-parameter optimization. To improve the performance of a real engine, one-parameter optimization is not practical, and the optimizer should be capable of dealing with multi-parameter optimization.
As a test case for the present approach, optimization of the low-temperature-differential gamma-type Stirling engine was attempted. The dataset for training the neural network was prepared from the CFD computational results obtained with a module similar to that of Cheng, Le, and Huang [21]. Given the input geometrical parameters, the trained neural network then serves as the direct solution provider and outputs the indicated power output and the thermal efficiency of the engine.
Meanwhile, an objective function is defined in terms of the indicated power output and the thermal efficiency of the engine and is calculated by the direct solution provider. With the help of the neural network algorithm, the VSCGM method is employed to iteratively adjust multiple parameters until the objective function is minimized and the optimal group of parameters is obtained.
2. Optimization Methods
2.1. CGM Method
With the traditional conjugate gradient method (CGM), the objective function (J) to be minimized is typically defined as the sum of squares of the differences

J = \sum_{i=1}^{I} (v_i - \hat{v}_i)^2    (1)

where v_i and \hat{v}_i are the iterative and the compared quantities, respectively, and I is the number of input data. The gradient of the objective function with respect to the designed parameters, {X_j | j = 1, 2, ..., k}, where k is the number of designed parameters, is then expressed in terms of the sensitivity coefficients \partial v_i / \partial X_j as

\partial J / \partial X_j = \sum_{i=1}^{I} 2 (v_i - \hat{v}_i) (\partial v_i / \partial X_j)    (2)

The conjugate gradient coefficient (\gamma_j) is expressed as the ratio of the gradients of the objective function at two consecutive iteration steps:

\gamma_j^{(n)} = [ (\partial J / \partial X_j)^{(n)} ]^2 / [ (\partial J / \partial X_j)^{(n-1)} ]^2    (3)

where n is the index of the iteration step. Next, the search direction \pi_j^{(n)} is calculated as a linear combination of the current gradient of the objective function and the previous search direction weighted by the conjugate gradient coefficient:

\pi_j^{(n)} = (\partial J / \partial X_j)^{(n)} + \gamma_j^{(n)} \pi_j^{(n-1)}    (4)

The designed parameter is then updated as follows:

X_j^{(n+1)} = X_j^{(n)} - \beta_j \pi_j^{(n)}    (5)
In the CGM method, the step sizes {β_j | j = 1, 2, ..., k} corresponding to the different designed parameters {X_j | j = 1, 2, ..., k} are different; they are obtained by solving a set of simultaneous linear algebraic equations. Meanwhile, the sensitivity coefficient ∂v_i/∂X_j associated with each designed parameter is calculated from the solution of a partial differential equation (PDE). Thus, if there are k designed parameters, k PDEs must be solved for the k sensitivity coefficients. These sensitivity coefficients are introduced into (2) to determine the gradient of the objective function ∂J/∂X_j.
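The iteration of Eqs. (1)–(5) can be sketched on a small algebraic example in which the sensitivity coefficients are known analytically. This is an illustrative linear model, not a PDE problem, and a fixed step size stands in for the optimal step sizes of the full CGM:

```python
import numpy as np

# Sketch of the CGM iteration on a linear model v_i = X1*t_i + X2, whose
# sensitivity coefficients dv_i/dX_j are known in closed form (in the full
# CGM they come from PDE solutions, and the step sizes are solved for).
t = np.linspace(-1.0, 1.0, 11)
vhat = 3.0 * t + 0.5                         # compared quantities

def model(X):
    return X[0] * t + X[1]                   # iterative quantities v_i

S = np.stack([t, np.ones_like(t)], axis=1)   # sensitivities dv_i/dX_j

X = np.array([0.0, 0.0])
pi = np.zeros(2)                             # search direction
grad_prev = None
beta = 0.02                                  # step size (fixed for simplicity)
for n in range(30):
    r = model(X) - vhat                      # v_i - vhat_i
    grad = 2.0 * S.T @ r                     # gradient of J, Eq. (2)
    gamma = (grad / grad_prev) ** 2 if grad_prev is not None else np.zeros(2)
    pi = grad + gamma * pi                   # conjugated direction, Eqs. (3)-(4)
    X = X - beta * pi                        # parameter update, Eq. (5)
    grad_prev = grad
print(X)  # approaches the target parameters [3.0, 0.5]
```

The conjugate term γ_j π_j^(n−1) reuses the previous search direction, which is what distinguishes the scheme from plain steepest descent.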
2.2. SCGM Method
In the SCGM method proposed by Cheng and Chang [5], the step sizes corresponding to the designed parameters are fixed during the iteration. That is,

\beta_j^{(n)} = \beta_j^{(1)} = \mathrm{constant}    (6)

The task of the sensitivity analysis is to evaluate the sensitivity of the objective function J to the designed parameters, which is represented by the gradient of the objective function \partial J / \partial X_j and is evaluated by direct numerical differentiation as

\partial J / \partial X_j \approx \Delta J / \Delta X_j = [ J(X_j + \Delta X_j) - J(X_j) ] / \Delta X_j    (7)
The perturbations ΔX_j in the designed parameters can be specified readily by trial and error. In this study, these perturbations range between 0.0001 and 0.001, depending on the complexity of the problem, and their values are generally fixed during the optimization process.
In this manner, it is not necessary to solve any equations for the sensitivity coefficients or the step sizes, and hence the tedious computation process is greatly simplified. Furthermore, the objective function is not limited to the sum-of-squared-differences form. However, one major disadvantage of the SCGM method is that the fixed step size may slow down convergence. Under these circumstances, the present study develops a novel optimization method, named the variable-step simplified conjugate gradient method (VSCGM), which is described in the following section.
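A minimal sketch of the SCGM iteration follows: the gradient is obtained by direct numerical differentiation as in Eq. (7), and the step size stays fixed. The quadratic objective below is illustrative only, not the engine objective:

```python
import numpy as np

# Sketch of the SCGM iteration: fixed step size, Eq. (6), and direct
# numerical differentiation of the objective, Eq. (7). The objective J
# here is an arbitrary illustrative function, not a sum of squares.
def J(X):
    return (X[0] - 1.0) ** 2 + 2.0 * (X[1] + 0.5) ** 2 + 1.0

def scgm(J, X, beta=0.05, dX=1e-4, iters=40):
    pi = np.zeros_like(X)                   # search direction
    grad_prev = None
    for _ in range(iters):
        # Eq. (7): one small perturbation per designed parameter
        grad = np.array([(J(X + dX * e) - J(X)) / dX
                         for e in np.eye(len(X))])
        gamma = (grad / grad_prev) ** 2 if grad_prev is not None \
            else np.zeros_like(X)
        pi = grad + gamma * pi              # conjugated search direction
        X = X - beta * pi                   # fixed-step update
        grad_prev = grad
    return X

X_opt = scgm(J, np.array([3.0, 2.0]))
print(X_opt)  # approaches the minimizer [1.0, -0.5]
```

Because the gradient needs only objective-function evaluations, the objective can be defined freely; the price is the fixed step size β, which is exactly what the VSCGM method relaxes.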
2.3. VSCGM Method
The VSCGM method is a further modified version of the SCGM method that reduces the time needed to reach convergence while the form of the objective function can still be defined flexibly. Finding an optimal design follows almost the same problem-solving process as the SCGM method.
In the VSCGM method, the gradient of the objective function \partial J / \partial X_j is calculated by the same direct numerical differentiation as described in (7); however, since the distance between the iterative and the optimal points gradually decreases during the iterations, the values of the step sizes and of the perturbations of the designed parameters used in the sensitivity analysis are varied. For this purpose, the perturbation in the designed parameters \Delta X_j^{(n)} is determined at each iteration as

\Delta X_j^{(n)} = ( \beta_j^{(n)} / \beta_j^{(1)} ) \Delta X_j^{(1)}    (8)

where \Delta X_j^{(1)} and \beta_j^{(1)} are the initial values of the perturbations and the step sizes, respectively, in the first iteration. It is noted that as the iterative point approaches the optimal point, the step sizes \beta_j^{(n)} should be reduced toward their minimum such that the iteration does not jump over the optimal point. Thus, the step sizes are expressed as

\beta_j^{(n)} = G_j^{(n)} \beta_j^{(1)}    (9)

where

G_j^{(n)} = G_{j,min} + ( G_{j,Max} - G_{j,min} ) ( R_j^{(n)} - R_{j,min} ) / ( R_{j,Max} - R_{j,min} )    (10)

and

R_j^{(n)} = | \pi_j^{(n)} / \pi_j^{(n-1)} |, \quad R_{j,min} \le R_j^{(n)} \le R_{j,Max}    (11)
The variable step size is automatically adjusted through (9) to (11) in terms of the gain function (G_j^{(n)}) and the ratio of search directions (R_j^{(n)}). The gain function is calculated by linear interpolation between two extreme values, G_{j,min} and G_{j,Max}, which need to be appropriately specified by the user. In this study, the minimum and maximum gain function values are assigned to be 1.0 and 3.0, respectively. Through the gain function, the influence of the ratio of search directions is introduced into the adjustment of the step sizes. Note that the step sizes can be enlarged or reduced depending on the magnitudes of the gain function and the ratio of search directions. In this way, when the iterative point is still far away from the optimal point, the step sizes can be increased to speed up the search. On the contrary, the step sizes are decreased once the iterative point enters the immediate neighborhood of the optimal point. However, to avoid a steep rise or drop in the step sizes, the upper and lower bounds of the ratio of search directions (R_{j,Max}, R_{j,min}) must be prescribed properly.
Through the automatic adjustment of the variable step size, the optimal design is reached more efficiently and accurately.
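The step-size adjustment can be sketched as follows. The linear-interpolation and clipping forms are assumptions reconstructed from the description above, and the bounds R_MIN and R_MAX are illustrative values, not those of the study:

```python
import numpy as np

# Sketch of the VSCGM step-size adaptation: the gain function is linearly
# interpolated between user-specified extremes from the (bounded) ratio of
# successive search directions. Functional forms are assumptions here.
G_MIN, G_MAX = 1.0, 3.0        # gain-function extremes used in the study
R_MIN, R_MAX = 0.1, 10.0       # illustrative bounds on the direction ratio

def vscgm_step(beta_init, pi, pi_prev):
    # Ratio of search directions, clipped to its prescribed bounds
    R = np.clip(np.abs(pi / pi_prev), R_MIN, R_MAX)
    # Gain function: linear interpolation between G_MIN and G_MAX
    G = G_MIN + (G_MAX - G_MIN) * (R - R_MIN) / (R_MAX - R_MIN)
    return G * beta_init       # variable step size beta_j^(n)

beta0 = np.array([0.05, 0.05])
pi_prev = np.array([1.0, 1.0])
pi = np.array([4.0, 0.2])      # growing vs. shrinking search directions
beta_new = vscgm_step(beta0, pi, pi_prev)
print(beta_new)  # larger step for the first parameter than for the second
```

A growing search direction (far from the optimum) yields a gain near G_MAX and an enlarged step, while a shrinking direction (near the optimum) keeps the gain near G_MIN, which matches the behavior described above.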
Figure 1 shows the comparison in the solution process between VSCGM and SCGM with computation flow charts, and the difference between the two methods can be seen.
4. Results and Discussion
The trained neural network serves as the direct solution provider, which provides the thermal efficiency and the indicated power output of the engine for the VSCGM method. The objective function is then determined, and the designed parameters are updated by the VSCGM method. With the updated designed parameters, the neural network again provides the indicated power output and the thermal efficiency for the VSCGM method. The iteration continues until the objective function reaches a minimum value. The VSCGM method can be applied to multi-goal and multi-parameter optimization. The objective function of the optimization is calculated in terms of the indicated power output (\dot{W}) and the thermal efficiency (\eta) as

J = 1 / ( M \dot{W} + N \eta )
On the right-hand side of the above equation, the denominator of the fraction includes two terms. The first term represents the magnitude of indicated power output, and the second term represents the magnitude of thermal efficiency. M and N are two positive weighting coefficients that are specified by the users depending on individual applications. In the present optimization, the values of M and N are assigned to be 1.0 and 8.8, respectively.
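As a quick consistency check, assuming the objective takes the reciprocal form J = 1/(M·Ẇ + N·η) suggested by the description, with η entering in percent, the reported minimum can be reproduced from the tabulated optimum:

```python
# Check of the reported optimum against the assumed reciprocal objective
# J = 1/(M*W + N*eta), with eta in percent.
M, N = 1.0, 8.8                      # weighting coefficients from the study

def objective(power_w, eta_pct):
    return 1.0 / (M * power_w + N * eta_pct)

J_opt = objective(327.980, 17.399)   # optimal design values from Table 5
print(J_opt)  # close to the reported minimum of 0.002078
```

The agreement with the reported minimum objective function value of 0.002078 supports reading the two denominator terms as the power output in watts and the efficiency in percent.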
It is important to mention that, among the eight parameters, some have a monotonic relationship with the performance of the engine; for example, as the charged pressure or the heating temperature increases, so do the indicated power output and the thermal efficiency. Such parameters are called monotonically related parameters. The parameters can then be categorized into two groups: monotonically and non-monotonically related parameters. In this study, the optimization task focuses on the group of non-monotonically related parameters. Thus, five non-monotonically related parameters are selected as the designed parameters: rotation speed, phase angle, piston diameter, displacer stroke, and porosity. Note that the designed parameters are varied around the values given in Table 2 for the baseline case.
Table 5 lists the initial values and the bounded ranges of the five designed parameters, together with the indicated power output, the thermal efficiency, and the minimum objective function before and after the optimization. A comparison between the initial and optimal designs shows that the indicated power output is increased from 161.616 to 327.980 W and the thermal efficiency from 16.532% to 17.399%, while the objective function reaches a minimum value of 0.002078. That is, the indicated power output and the thermal efficiency are elevated by 102.93% and 5.24%, respectively, so the optimization improves the performance of the engine remarkably. Meanwhile, the values of the designed parameters of the optimal design are determined within the bounded ranges efficiently; these values provide useful guidance for designers in making their design decisions.
The increase in performance achieved by the optimization depends mainly on the collected dataset (Table 3), the fixed parameters (Table 2), and the lower and upper bounds of the designed parameters (Table 5). The dataset shown in Table 3 covers a wider range for the indicated power output but a narrower range for the thermal efficiency. Therefore, it is expected that the increase in the indicated power output will be greater than that in the thermal efficiency.
It is necessary to compare the convergence speeds of the VSCGM and SCGM methods. Both methods start from the baseline case and use the same trained neural network.
Figure 8 shows the variations in the objective function, indicated power output, and thermal efficiency with the five-parameter optimization process.
All three quantities initially change more rapidly with VSCGM than with SCGM. The quantities then change less rapidly and more linearly, and eventually the objective function approaches a minimum value. It is seen that the number of iterations needed is approximately 2700 with the SCGM method, whereas with the VSCGM method it is only about 1700. The rapid initial change is caused by the automatic adjustment of the step size: in general, a larger step size is produced by Equation (9) in the very initial stage, so that the search quickly proceeds to the area adjacent to the optimal point; the step size is then reduced so that the iteration approaches the optimal point smoothly. As a result, the VSCGM method accelerates convergence significantly.
Meanwhile, note that the approach is general; it is not limited to the optimization of the low-temperature-differential gamma-type Stirling engine. Moreover, the dataset can also be prepared from experiments rather than from numerical computation alone.
It is also necessary to know whether the optimization approach leads to a unique optimal point for different initial guesses. To test the uniqueness of the optimal design, in addition to the baseline case (Initial guess 1), three additional sets of initial guesses (Initial guesses 2 to 4) are taken into account.
Figure 9 depicts the optimization processes starting from these four initial guesses in the coordinates (Dp, θph, ϕ). It is found that even though the four initial points are separated in the coordinates, the four optimization processes still approach the same optimal point. This implies that the optimization method is robust and the obtained optimal design is independent of the initial point for this case. It is noted that the present approach is not limited to the present group of designed parameters; when necessary, more designed parameters may readily be included.
The VSCGM method originally presented in this paper is a novel optimization method that can be coupled with a neural network training algorithm for optimization. The approach has been shown to be efficient and robust, and it is not limited to Stirling engine optimization; it can readily be applied to optimize other energy devices.
5. Conclusions
The present study develops a variable-step simplified conjugate gradient method (VSCGM) and incorporates this method with a neural network training algorithm to optimize a low-temperature-differential gamma-type Stirling engine. The VSCGM method is a further modified version of the existing SCGM method which introduces the concept of variable step size into the optimization process. A comparison between the VSCGM and the SCGM methods shows that the number of iterations needed is approximately 2700 with the SCGM method, whereas only 1700 with the VSCGM. The VSCGM method can accelerate convergence significantly. On the other hand, the neural network training algorithm is based on the Levenberg–Marquardt method with supervised learning. The three-dimensional CFD simulation results are used as the dataset for training the neural network.
A comparison between the initial and optimal designs shows that using this approach, the indicated power output can be elevated from 161.616 to 327.980 W, and the thermal efficiency from 16.532% to 17.399%.
Meanwhile, four different initial points are adopted to test the robustness of the approach. It is found that the approach is robust, and the obtained optimal combination of the designed parameters is independent of the initial guess for this case.