In this chapter, the mixer’s structure is optimized using a joint optimization algorithm combining SSA-CNN-LSTM and the WOA. Initially, the optimization objectives were determined, and a sample library was established using OLH. Subsequently, an SSA-CNN-LSTM prediction model was constructed and compared with CNN-LSTM and LSTM prediction models. Finally, the WOA was employed to optimize the constructed prediction model to obtain the optimal structural parameters.
4.3. Construction and Validation of the SSA-CNN-LSTM Model
This section employs OLH to extract a certain number of samples, followed by the construction of an SSA-CNN-LSTM prediction model. The model is evaluated using Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Root Mean Square Error (RMSE) as performance metrics. To demonstrate the superiority of the SSA-CNN-LSTM model, comparative analyses are conducted with CNN-LSTM and LSTM models.
4.3.1. Establishment of Training Samples
Before constructing the model, multiple sets of samples must be obtained to serve as training data, ensuring they are uniformly distributed within the design space. This paper selects three optimization variables. Manually assigning samples often leads to an uneven distribution, resulting in training data that do not adequately represent the sample space. Therefore, uniform sampling and Latin Hypercube Sampling (LHS) are commonly used.
Uniform sampling considers only the spacing of the samples and does not guarantee that the resulting design is orderly and comparable. Scholars later proposed the LHS method, which divides the sample space into equal-probability strata and randomly draws one sample from each. Compared with orthogonal experiments, LHS grades the level values more flexibly and allows the number of experiments to be controlled manually; it is efficient and stable, but the sample points are not necessarily evenly distributed within the design space, and as the number of levels increases some regions of the design space may be missed. To better cover the sampling space, OLH was developed as an improvement to LHS. OLH is a stochastic multidimensional stratified sampling approach that divides the probability distribution of each experimental factor into N non-overlapping sub-regions according to the factor's value range and samples independently and with equal probability within each sub-region, enabling uniform, random, and orthogonal sampling within the design space of the experimental factors. This allows substantial model information to be acquired with relatively few sampling points [29].
Figure 9 shows the result of drawing 16 OLH samples at two levels, demonstrating that this method fills the sampling space more effectively.
Figure 10 shows the sampling results of LHS.
The training samples should be at least ten times the number of input variables. To ensure credible results, this paper uses OLH to draw 60 sample points from the sample space, as depicted in
Figure 11. Based on these samples, 60 mixer models are constructed in 3D drawing software, resulting in 60 CFD computational models. The numerical simulation method is identical to the one used for the prototype mixer discussed earlier, producing 60 sets of data for mixing unevenness corresponding to 60 sets of parameters. A sample database is established, as shown in
Table 4.
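For illustration, the following is a minimal sketch of how such a space-filling sample plan could be generated; SciPy's Latin Hypercube sampler with its "random-cd" space-filling optimizer is used here as a stand-in for the OLH scheme described above, and the variable bounds are placeholder values rather than the ranges used in this study.

```python
# Draw a space-filling sample plan for the three design variables
# (inlet diameter, outlet diameter, impingement angle).
import numpy as np
from scipy.stats import qmc

n_samples = 60                       # 60 design points, as in this chapter
lower = [5.0, 25.0, 90.0]            # assumed lower bounds: D_in, D_out, angle
upper = [10.0, 40.0, 150.0]          # assumed upper bounds

sampler = qmc.LatinHypercube(d=3, optimization="random-cd", seed=0)
unit_samples = sampler.random(n=n_samples)            # points in [0, 1)^3
design_points = qmc.scale(unit_samples, lower, upper)

print(design_points[:5])             # first five candidate mixer geometries
```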
4.3.2. Construction of the SSA-CNN-LSTM Model
In the SSA-CNN-LSTM model, the SSA adjusts the hyperparameters of the CNN-LSTM network to enhance its performance. The network structure has three main components: a CNN for extracting spatial features, an LSTM for learning time-series information, and the SSA for optimizing hyperparameters (e.g., the number of convolutional kernels and LSTM units). Input data pass through the CNN to extract spatial features and then into the LSTM to learn time-series information, while the SSA optimizes the hyperparameters through iterative searching and updating. The optimized model can then be used for prediction tasks.
The application of the SSA to optimize the CNN-LSTM model involves the following steps (a simplified code sketch follows the list):
- (1)
Data preprocessing: data labeling, dataset division, data normalization, and data format conversion;
- (2)
SSA population initialization: setting the initial size of the sparrow population (n), the maximum number of iterations (N), the proportion of discoverers (PD), the number of sparrows perceiving danger (SD), the safety value (ST), and the alert value (R2);
- (3)
Computing fitness values; updating the positions of the discoverers, the joiners (scroungers), and the scouts (the sparrows aware of danger); and updating the position of the optimal individual;
- (4)
Feeding the data into the CNN, passing it through convolutional layers, batch normalization layers, activation layers, and average pooling layers;
- (5)
Feeding the CNN output into the LSTM neural network, then passing it through the fully connected layer and Softmax layer;
- (6)
Outputting the results.
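As a rough illustration of steps (2) and (3), the following is a simplified sketch of the SSA search loop applied to a generic fitness function; the position-update rules are abbreviated relative to the full algorithm, and all names, constants, and bounds are illustrative assumptions.

```python
# Simplified Sparrow Search Algorithm (SSA) driving a generic fitness function.
import numpy as np

def ssa_minimize(fitness, lb, ub, n=50, max_iter=50, pd=0.2, sd=0.1, st=0.8):
    dim = len(lb)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    X = lb + np.random.rand(n, dim) * (ub - lb)          # initial population
    F = np.array([fitness(x) for x in X])
    best_x, best_f = X[np.argmin(F)].copy(), float(F.min())
    n_disc, n_scout = int(n * pd), int(n * sd)

    for t in range(1, max_iter + 1):
        order = np.argsort(F)
        X, F = X[order], F[order]
        worst = X[-1].copy()

        r2 = np.random.rand()                            # alarm value
        for i in range(n_disc):                          # discoverers
            if r2 < st:
                X[i] *= np.exp(-i / (np.random.rand() * max_iter + 1e-12))
            else:
                X[i] += np.random.randn(dim)
        for i in range(n_disc, n):                       # joiners (scroungers)
            if i > n // 2:
                X[i] = np.random.randn(dim) * np.exp((worst - X[i]) / (i ** 2))
            else:
                X[i] = X[0] + np.abs(X[i] - X[0]) * np.random.choice([-1, 1], dim)
        for i in np.random.choice(n, n_scout, replace=False):   # scouts
            X[i] = best_x + np.random.randn(dim) * np.abs(X[i] - best_x)

        X = np.clip(X, lb, ub)
        F = np.array([fitness(x) for x in X])
        if F.min() < best_f:                             # keep the global best
            best_x, best_f = X[np.argmin(F)].copy(), float(F.min())

    return best_x, best_f
```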
Although the SSA has global search capability and adaptability, it is sensitive to the choice of initial parameters and must be tuned for the specific problem. When optimizing the CNN-LSTM model, carefully choosing appropriate optimization objectives and fitness functions ensures that the SSA delivers a significant performance gain. The parameters optimized by the SSA mainly include the number of neurons in the LSTM layer, the convolution kernel size, the number of convolutional layers, the number of neurons in the fully connected layer, and the learning rate. The schematic diagram of the SSA-CNN-LSTM model is shown in
Figure 12.
- (1)
Number of neurons in the LSTM layer: The LSTM layer is an important part of the CNN-LSTM model, and the choice of the number of neurons directly affects the performance of the model. Through the SSA algorithm, the optimal number of neurons in the LSTM layer can be dynamically searched, thereby improving the model’s prediction accuracy.
- (2)
Convolution kernel size: The convolution operation in the CNN-LSTM model is very important for extracting spatial features of the data, and the choice of convolution kernel size will affect feature extraction. The SSA can optimize the size of the convolution kernel to further improve the accuracy of the model.
- (3)
Number of convolutional layers: The deeper the convolutional layers in the CNN-LSTM model, the greater the computational cost of forward propagation, but the model's representational capacity, and typically its accuracy, also increase. By optimizing the number of convolutional layers, the SSA can enhance model accuracy while keeping the computational cost within an acceptable range.
- (4)
Number of neurons in the fully connected layer: The number of neurons in the fully connected layer determines the output dimension and prediction ability of the model. By optimizing the number of neurons in the fully connected layer, the SSA can improve the accuracy and generalization ability of the model.
- (5)
Learning rate: The CNN-LSTM model requires a learning rate to be specified during training, and a learning rate that is too high or too low harms the convergence and accuracy of the model. The SSA can search for the optimal learning rate to improve the efficiency and accuracy of model training; a sketch showing how the optimized quantities are decoded into the network is given below. Combining the above module construction steps, the flowchart of the SSA-CNN-LSTM model is shown in
Figure 13.
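To show how a candidate sparrow position maps onto the network, the sketch below decodes a three-element position vector into the quantities actually optimized in the experiments of Section 4.3.4 (LSTM units, initial learning rate, L2 coefficient), builds a small Keras CNN-LSTM following the layer order listed above, trains it, and returns the validation RMSE as the fitness. The layer widths, kernel size, epochs, and batch size are assumptions, not values from the paper.

```python
# Hypothetical fitness function for the SSA hyperparameter search.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

def build_cnn_lstm(lstm_units, learning_rate, l2_coef, n_features=3):
    model = models.Sequential([
        tf.keras.Input(shape=(n_features, 1)),
        layers.Conv1D(16, kernel_size=2, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.AveragePooling1D(pool_size=1),   # trivial pool: only 3 input steps
        layers.LSTM(int(lstm_units),
                    kernel_regularizer=regularizers.l2(float(l2_coef))),
        layers.Dense(1),                        # predicted mixing unevenness
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(float(learning_rate)),
                  loss="mse")
    return model

def fitness(position, x_train, y_train, x_val, y_val):
    lstm_units, learning_rate, l2_coef = position
    model = build_cnn_lstm(lstm_units, learning_rate, l2_coef)
    model.fit(x_train, y_train, epochs=100, batch_size=8, verbose=0)
    pred = model.predict(x_val, verbose=0).ravel()
    return float(np.sqrt(np.mean((np.ravel(y_val) - pred) ** 2)))  # validation RMSE
```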
4.3.3. Benchmark Methods and Evaluation Criteria
In the comparative study of models, this paper adopts the following two machine learning models:
- (1)
LSTM Model
LSTM is a type of Recurrent Neural Network (RNN) whose gated cell structure (input, forget, and output gates acting on a cell state) allows it to retain information for longer than a standard RNN. The gates regulate the flow of information through the activation functions applied within the LSTM cell. The LSTM model in this paper consists of four consecutive pairs of LSTM and Dropout layers; the Dropout layers reduce dependency on previous data, and the final output is obtained through a fully connected layer.
- (2)
CNN-LSTM Model
The CNN-LSTM model can achieve multi-feature prediction: the CNN is used for feature extraction and fusion, and the extracted features are then mapped into sequence vectors that serve as input to the LSTM.
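A minimal Keras sketch of the LSTM benchmark described above (four LSTM/Dropout pairs followed by a fully connected output) might look as follows; the layer width and dropout rate are assumed values.

```python
# LSTM benchmark: four LSTM/Dropout pairs and a fully connected output layer.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_lstm_baseline(n_steps=3, n_features=1):
    model = models.Sequential([tf.keras.Input(shape=(n_steps, n_features))])
    for i in range(4):
        # return sequences between stacked LSTM layers, but not before Dense
        model.add(layers.LSTM(64, return_sequences=(i < 3)))
        model.add(layers.Dropout(0.2))
    model.add(layers.Dense(1))
    model.compile(optimizer="adam", loss="mse")
    return model
```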
To evaluate the effectiveness and applicability of the prediction models on the dataset, three common evaluation metrics are used to quantify the precision and errors of predictions, namely MAE, MAPE, and RMSE.
The calculation for MAE is shown below; it represents the average of the absolute errors and reflects the actual magnitude of the prediction errors:

$$\mathrm{MAE} = \frac{1}{z}\sum_{i=1}^{z}\left| y_i - \hat{y}_i \right|$$

The calculation for MAPE is as follows, representing the fitting effect and precision of the prediction model:

$$\mathrm{MAPE} = \frac{100\%}{z}\sum_{i=1}^{z}\left| \frac{y_i - \hat{y}_i}{y_i} \right|$$

RMSE measures the deviation between the actual and the predicted values, with its calculation shown below:

$$\mathrm{RMSE} = \sqrt{\frac{1}{z}\sum_{i=1}^{z}\left( y_i - \hat{y}_i \right)^{2}}$$

where $z$ represents the number of samples, $y_i$ is the actual value, and $\hat{y}_i$ is the predicted value. These three indicators represent different aspects of the predictive performance, and the smaller their values, the better the performance.
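These three metrics are straightforward to compute; a minimal NumPy sketch (assuming the actual values are nonzero so that MAPE is defined) is given below.

```python
# Minimal NumPy implementations of the three evaluation metrics.
import numpy as np

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def mape(y_true, y_pred):
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```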
4.3.4. Experimental Parameter Settings
In this section, the SSA-CNN-LSTM model is employed to predict the mixing unevenness. The model optimizes the CNN-LSTM network using the SSA, whose parameters include the number of iterations (M), the population size (pop), the proportion of discoverers (p_percent), the proportion of sparrows aware of danger (SD), the safety threshold (ST), the fitness function (fn), and the dimension of the parameters to be optimized (dim). These settings are described below, and an illustrative configuration sketch follows the list.
- (1)
Number of iterations (M): This parameter determines the computation time and accuracy of the algorithm. Typically, the more iterations, the longer the computation time and the higher the accuracy. Here, it is set to 50.
- (2)
Population size (pop): This refers to the number of sparrows used in the computation; there is no explicit rule for its size, and it is usually chosen according to the specific problem. For general optimization problems, 50 sparrows are sufficient to solve most cases, while particularly difficult or specialized problems may require a larger population. In general, a smaller population is more likely to fall into local optima, whereas a larger population converges more slowly. Here, the population size is set to 50.
- (3)
Proportion of discoverers (p_percent): The percentage of discoverers in the population, set here to 20%.
- (4)
Proportion of sparrows aware of danger (SD): set to 10%.
- (5)
Safety threshold (ST): set to 0.8.
- (6)
Fitness function (fn): Determines the fitness of a sparrow’s position.
- (7)
Parameter dimension (dim): Indicates the number of parameters to be optimized. Here, optimization is applied to three parameters of the LSTM layer, which are the optimal number of neurons, the optimal initial learning rate, and the optimal L2 regularization coefficient, making the dimension 3.
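Collecting the settings above, an illustrative configuration might look like the following; the search bounds for the three optimized quantities are assumptions, since the exact ranges are not stated in the text.

```python
# Illustrative collection of the SSA settings listed above.
ssa_config = {
    "max_iter": 50,          # M: number of iterations
    "pop": 50,               # population size
    "p_percent": 0.2,        # proportion of discoverers
    "sd_percent": 0.1,       # proportion of sparrows aware of danger
    "st": 0.8,               # safety threshold
    "dim": 3,                # LSTM units, initial learning rate, L2 coefficient
    "lb": [10, 1e-4, 1e-5],  # assumed lower bounds
    "ub": [200, 1e-2, 1e-2], # assumed upper bounds
}
```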
4.3.5. Comparative Analysis
Five sixths of the data from the chosen sample database were used as training data, with the remaining serving as the test set. The predictive performance and prediction errors of various algorithms are shown in
Figure 14 and
Figure 15, respectively. It can be observed from the figures that SSA-CNN-LSTM has the smallest prediction error, not exceeding 2%, followed by CNN-LSTM, with LSTM showing the poorest predictive performance. In the experiment, the input variables used for prediction were the inlet diameter, outlet diameter, and angle of attack. The SSA optimized three parameters of the CNN-LSTM model: the optimal number of neurons was 75, the optimal initial learning rate 0.006258, and the optimal L2 regularization coefficient 0.0004562. A comparison of the actual and predicted values shows that SSA-CNN-LSTM fits the data better than the baseline models.
The prediction results of the above three algorithms were statistically analyzed, and as shown in
Table 5 and
Figure 16, SSA-CNN-LSTM achieved the best results according to the evaluation metrics RMSE, MAE, and MAPE. MAPE was reduced from 12.5% to 7%, a relative improvement of about 44%. RMSE and MAE also decreased, indicating that the model's predictive performance improved following optimization.
4.4. Joint Optimization Design of SSA-CNN-LSTM and the WOA
The WOA is a relatively novel optimization algorithm proposed in recent years. In 2016, Mirjalili et al. [
30] from Griffith University in Australia introduced the WOA, inspired by the unique predatory behavior of humpback whales observed in the ocean. As an emerging swarm intelligence optimization algorithm, the WOA is characterized by its few parameters, simple structure, and high flexibility. Specifically, the advantages are as follows:
- (1)
The algorithm's optimization mechanism is controlled primarily by a random probability (Pa), a threshold coefficient (A), and a coefficient (C), resulting in few control parameters.
- (2)
The WOA is divided into three main optimization mechanisms, with a relatively simple formula model that is easy to implement.
- (3)
The WOA offers high flexibility, as it can switch between optimization mechanisms using the random probability (Pa) and the threshold coefficient (A).
As a result, the WOA can be widely applied in various engineering fields [
31]. For example, Li et al. [
32] designed a WOA-based optimization algorithm for evaluating bipolar transistor models. In the field of intelligent healthcare, Hassan et al. [
33] proposed a hybrid algorithm based on the WOA for diagnosing equipment faults. He et al. [
34] developed an improved WOA for adaptively searching for optimal parameters in stochastic resonance systems. El-Fergany et al. [
35] used the WOA to enhance the accuracy of fuel cell models.
The overall optimization process of the WOA mainly consists of three parts: Encircling Prey, the Bubble-Net Attacking Technique, and Searching for Prey. Working together, these three mechanisms give the algorithm good global search capability along with fast and accurate local convergence, so that it converges quickly while avoiding local optima and can reliably approach the global optimum, fully exploiting the performance of the optimization algorithm.
The optimization process of the WOA is as follows: First, initialize the WOA parameter values and input the structural parameters to be optimized. At the beginning of each iteration, check if the maximum number of iterations has been reached. If it has, return the current best structural parameter values and their corresponding fitness values. If the maximum number of iterations has not been reached, update the WOA parameters (A, a, C, and Pa).
The search strategy is then determined by the random probability (Pa). If Pa < 0.5, the value of |A| is checked: if |A| < 1, the agent's position is updated by encircling the current best solution, whereas if |A| ≥ 1, a random search agent is selected and the position is updated using the search-for-prey (exploration) mechanism. If Pa ≥ 0.5, the agent's position is updated using the spiral (bubble-net) strategy. The new fitness value of each search agent is then calculated, and the structural parameters are updated based on these fitness values. This process repeats until the maximum number of iterations is reached, at which point the algorithm returns the best structural parameter values and their corresponding fitness values.
Based on the above algorithm steps, the flowchart of the WOA is illustrated as shown in
Figure 17.
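A compact sketch of these update rules for a minimization problem is given below; `fitness` is any callable, and the agent count, iteration count, and spiral constant b are common textbook defaults rather than values from this study.

```python
# Compact Whale Optimization Algorithm (WOA) sketch for minimization.
import numpy as np

def woa_minimize(fitness, lb, ub, n_agents=30, max_iter=100, b=1.0):
    dim = len(lb)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    X = lb + np.random.rand(n_agents, dim) * (ub - lb)
    F = np.array([fitness(x) for x in X])
    best, best_f = X[np.argmin(F)].copy(), float(F.min())

    for t in range(max_iter):
        a = 2.0 - 2.0 * t / max_iter                  # decreases linearly 2 -> 0
        for i in range(n_agents):
            A = 2.0 * a * np.random.rand() - a
            C = 2.0 * np.random.rand()
            p = np.random.rand()
            if p < 0.5:
                if abs(A) < 1:                        # encircling prey (exploitation)
                    X[i] = best - A * np.abs(C * best - X[i])
                else:                                 # searching for prey (exploration)
                    rand_agent = X[np.random.randint(n_agents)]
                    X[i] = rand_agent - A * np.abs(C * rand_agent - X[i])
            else:                                     # bubble-net spiral attack
                l = np.random.uniform(-1.0, 1.0)
                X[i] = np.abs(best - X[i]) * np.exp(b * l) * np.cos(2 * np.pi * l) + best
            X[i] = np.clip(X[i], lb, ub)
            f = fitness(X[i])
            if f < best_f:                            # keep the global best
                best, best_f = X[i].copy(), f
    return best, best_f
```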
By combining the SSA-CNN-LSTM model with the WOA, structural parameters can be optimized rapidly. Using the structural model determined earlier, an SSA-CNN-LSTM prediction model is built from the sample library to establish a functional relationship between the structural parameters and their corresponding fitness values, so that inputting parameter values yields fitness values as outputs. After the optimization range of the parameter values is determined, the parameters of the optimization algorithm are initialized. The SSA-CNN-LSTM model evaluates the fitness of each individual in each generation, enabling selection and iteration until the best parameter values are obtained and the optimal solution with its corresponding fitness value is output. Compared with traditional optimization methods, the combined algorithm is more efficient, achieving higher optimization accuracy and better results in a shorter time. The combined optimization process of SSA-CNN-LSTM and the WOA is illustrated in
Figure 18.
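In code form, the coupling between the two components reduces to using the trained surrogate as the WOA's fitness function, as sketched below; `surrogate` stands for the trained SSA-CNN-LSTM model, `woa_minimize` for a WOA routine such as the sketch in the previous subsection, and the parameter bounds are assumed ranges, not values from the paper.

```python
# The trained surrogate supplies the fitness (predicted mixing unevenness)
# that the WOA minimizes over the three structural parameters.
import numpy as np

def structural_fitness(params, surrogate):
    # params = [inlet diameter, outlet diameter, impingement angle]
    x = np.asarray(params, float).reshape(1, -1, 1)      # (1, 3, 1) model input
    return float(surrogate.predict(x, verbose=0).ravel()[0])

# Example usage with the woa_minimize sketch above:
# best_params, best_unevenness = woa_minimize(
#     lambda p: structural_fitness(p, surrogate),
#     lb=[5.0, 25.0, 90.0], ub=[10.0, 40.0, 150.0],      # assumed ranges
#     n_agents=30, max_iter=100)
```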
The prediction output of the trained model is used as the individual fitness value, eliminating individuals with inferior fitness values in favor of superior ones. The WOA is utilized to globally optimize the search for extrema in the prediction model. To demonstrate the superior capabilities of the WOA, it is compared with traditional genetic algorithms and particle swarm optimization.
The Particle Swarm Optimization (PSO) algorithm initializes a swarm of particles within the feasible solution space, characterizing each particle by its position, velocity, and fitness. As the particles move, they update their positions using the individual best position (Pbest, the best position the particle itself has experienced) and the global best position (Gbest, the best position found by any particle in the swarm). Pbest and Gbest are continuously updated by comparing fitness values [
36].
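For reference, a standard inertia-weight form of the PSO velocity and position updates (the text does not specify which PSO variant was used) is

$$v_i^{t+1} = \omega v_i^{t} + c_1 r_1 \left( P_{best,i} - x_i^{t} \right) + c_2 r_2 \left( G_{best} - x_i^{t} \right), \qquad x_i^{t+1} = x_i^{t} + v_i^{t+1}$$

where $\omega$ is the inertia weight, $c_1$ and $c_2$ are acceleration coefficients, and $r_1$, $r_2$ are uniform random numbers in $[0, 1]$.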
The Genetic Algorithm (GA) [
37] follows Darwin’s theory of survival of the fittest and simulates biological gene inheritance. The GA calculates a fitness function to evaluate the quality of individuals in the population, classifying them based on these values. Inferior individuals are eliminated, while superior ones survive. This iterative process ensures that the best individuals are passed on to subsequent generations, eventually leading to a population containing the optimal individuals [
38,
39].
The optimization performance is highlighted by comparing optimization times and the optimization effects of the best fitness values. The results are shown in
Figure 19. From the figure, it can be seen that the GA requires 303 iterations to reach the optimal parameter values and PSO requires 41 iterations, whereas the WOA needs only 17 iterations to obtain the best optimization solution. At the same time, compared with the GA and PSO, the WOA yields better optimization results. As stated earlier, the lower the mixing unevenness, the better the performance. Here, combined with
Figure 4,
Figure 5,
Figure 6,
Figure 7,
Figure 8,
Figure 9,
Figure 10,
Figure 11 and
Figure 12, it can be observed that the best solutions obtained by the GA, PSO, and WOA are 0.039, 0.037, and 0.03, respectively. The comparison shows that the WOA searches faster and achieves better optimization results with fewer iterations. After optimization by the WOA, the best structural parameters are an inlet diameter of 7, an outlet diameter of 31.6, and an impingement angle of 119.5°.