Adaptive Latin Hypercube Sampling for a Surrogate-Based Optimization with Artificial Neural Network

A significant number of sample points are often required for surrogate-based optimization when utilizing process simulations to cover the entire system space. This necessity is particularly pronounced in complex simulations or high-dimensional physical experiments, where a large number of sample points is essential. In this study, we have developed an adaptive Latin hypercube sampling (LHS) method that generates additional sample points from areas with the highest output deviations to optimize the required number of samples. The surrogate model used for the optimization problem is artificial neural networks (ANNs). The standard for measuring solution accuracy is the percent error of the optimal solution. The outcomes of the proposed algorithm were compared to those of random sampling for validation. As case studies, we chose three different chemical processes to illustrate problems of varying complexity and numbers of variables. The findings indicate that for all case studies, the proposed LHS optimization algorithm required fewer sample points than random sampling to achieve optimal solutions of similar quality. To extend the application of this methodology, we recommend further applying it to fields beyond chemical engineering and higher-dimensional problems.


Introduction
For many engineering problems, simulation is an essential method to understand the behavior of real systems accurately. In numerous research projects, simulation is employed to interpret complex models, reducing the time and cost associated with real-world process operations. A surrogate model, also known as a meta-model, can be created using a set of input-output data obtained from simulations or experiments. In this context, the simulation is considered a 'black box,' where the functions and underlying assumptions of the simulation are unknown except for the process output [1].
A popular surrogate model is Response Surface Methodology (RSM), which is evaluated through the regression of input-output data using low-order polynomials. In recent years, RSM has been applied in several chemical processes to study the effects of independent factors [2-8]. It has also been used in treatment processes to improve removal efficiency and reduce operating costs [2]. However, for complex processes, low-order polynomials have limitations in capturing highly nonlinear behavior [9]. Generally, RSM provides local optimization results. Therefore, global surrogate models such as Kriging interpolation and artificial neural networks (ANNs) have been proposed to overcome these limitations. Kriging interpolation offers an accurate model that provides better global approximations compared to low-order polynomial models. It predicts output values at specific inputs that match the simulated output values [10]. Kriging interpolation fits a spatial correlation function to a set of input-output pairs, and the data points are used in linear interpolation [11]. However, it is important to note that this surrogate model may have limitations due to available support software. Several surrogate models have been used in the past years, including artificial neural networks (ANNs), radial basis function (RBF), and random forest (RF) [12]. ANNs are a powerful tool for various applications, such as approximating nonlinear systems, process modeling, and optimization [13]. They can fit many complex nonlinear functions and apply to numerous processes in chemical engineering [14,15]. ANNs have been coupled with other techniques such as the genetic algorithm [16,17] and simulated annealing [17,18] for application in engineering processes. ANNs are commonly employed due to their simplicity and ability to handle challenging nonlinear mappings [19]. These mathematical models have been used in both laboratory [20,21] and industry [14,22,23]. In 2020, Tahkola et al. developed a surrogate model for electrical machine torque using grid sampling combined with ANNs, and the results demonstrated the effectiveness of this sampling approach in modeling torque behavior [24]. Many studies have shown that ANNs can provide accurate models for various engineering disciplines [25-30]. Therefore, ANNs were employed to develop accurate models for the chemical process case studies in this work.
To build a model that describes the relationship between inputs and outputs, a sampling technique is required to decide where the responses are evaluated. The design of experiment (DoE) approach is therefore needed to select the number and locations of sample points. Modern DoE techniques can be categorized into two types, static and adaptive [31]. Factorial design, Latin hypercube sampling, the Monte Carlo method and orthogonal array are classified as static DoEs. Factorial design is widely used because it is the simplest form of space-filling design, covering most of the considered regions of the variables [32,33]. The first modern design of experiment is Monte Carlo or random sampling, which generates sample points in the design space using random numbers. This technique requires a large number of sample points to represent the entire problem space. Subsequently, Quasi-Monte Carlo was introduced to achieve a uniform distribution of sample points across the design space by using low-discrepancy sequences. Three popular quasi-random low-discrepancy sequences are the Halton, Hammersley, and Sobol sequences. However, these sampling techniques, including random sampling, demand a large sample size to achieve a highly accurate model, which results in high costs and time consumption during real process operations. Latin hypercube sampling was therefore studied to address these challenges. Latin hypercube sampling exhibits a beneficial property known as optimum non-collapsing. However, generating Latin hypercube sample points presents challenges, particularly in terms of achieving uniform distribution across the design space. Additionally, in high-dimensional problems, a large number of sample points is needed to adequately cover the design space.
Therefore, this work introduced adaptive Latin hypercube sampling to employ the necessary sample points for representing the design space in simulation-based optimization. The algorithm's objective is to generate the fewest sample points while achieving the best feasible optimal solution. The algorithm started with a small number of sample points from Latin hypercube sampling. Additional sample points were then generated and sequentially added to an original dataset. Artificial neural networks (ANNs) were employed as a surrogate model. The optimal results obtained from the proposed algorithm were compared with those obtained from random sampling. Three chemical processes were used as case studies to represent problems with varying complexity and numbers of factors. Three replications of each case study were conducted to assess the consistency of the method.
This paper is organized as follows: Section 2 provides reviews of Latin hypercube sampling, random sampling, and artificial neural networks. Section 3 presents the proposed adaptive Latin hypercube sampling for surrogate-based optimization. Section 4 introduces details of three different case studies: (1) ammonia production from syngas, (2) methanol production via CO2 hydrogenation, and (3) CO2 capture using the Rectisol process, which represent three different types of surrogate-based optimization problems. Section 5 offers results, discussion, recommendations, and future work. Finally, Section 6 presents the conclusion.

Sampling Technique and Surrogate Modeling
This section provides information on sampling techniques and surrogate models. Section 2.1 describes Latin hypercube sampling, Section 2.2 provides information on random sampling, and Section 2.3 describes the artificial neural network.

Latin Hypercube Sampling
Latin hypercube sampling was first proposed in 1979 by M.D. McKay et al. [34]. It is a method for generating samples that cover the entire sample space without any replications. The technique involves dividing each dimension of the sample space into N equally likely intervals for N sample points. The fundamental principle of Latin hypercube sampling is to ensure that each stratum is sampled exactly once, thereby guaranteeing a comprehensive representation of the entire sample space. For instance, in Latin hypercube sampling (LHS) involving two factors and ten sample points, the range of each factor, typically from 0 to 1, is partitioned into ten equal intervals. Each sample point is then randomly chosen from one of these intervals, creating a distinctive and stratified LHS dataset. Each interval accommodates only one sample point, which is randomly paired with other variables. This LHS dataset is shown in Figure 1a.
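The stratification described above can be sketched in a few lines of Python. This is a minimal illustration on the unit hypercube, not the implementation used in this work; the function name `latin_hypercube` and its arguments are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def latin_hypercube(n_points, n_factors, rng=None):
    """Return an (n_points, n_factors) LHS design on the unit hypercube."""
    rng = np.random.default_rng(rng)
    # Lower edges of the N equal-width intervals: 0, 1/N, ..., (N-1)/N.
    edges = np.arange(n_points) / n_points
    design = np.empty((n_points, n_factors))
    for k in range(n_factors):
        # One uniform draw inside every interval of factor k ...
        values = edges + rng.uniform(0.0, 1.0 / n_points, size=n_points)
        # ... then shuffle so intervals are paired randomly across factors.
        design[:, k] = rng.permutation(values)
    return design

# Two factors and ten sample points, as in the example in the text.
X = latin_hypercube(n_points=10, n_factors=2, rng=0)
# Interval index (0 to 9) occupied by each point, per factor.
strata = np.floor(X * 10).astype(int)
```

Each column of `strata` is a permutation of 0 to 9, which is the "one sample per interval" property. The shuffle is what pairs the intervals of different factors at random.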
One advantage of Latin hypercube sampling (LHS) is that by selecting input points independently, it ensures that the sampled points are spread across the entire parameter space without clustering in any particular region. This reduces the risk of overemphasizing certain areas of the space, which can occur with other random sampling methods. Moreover, LHS ensures that once a point is sampled, it will not be selected again. This feature minimizes redundancy in the samples, allowing the system to focus on exploring new regions of the parameter space, leading to different output results. The illustration of Latin hypercube sampling is shown in Figure 1. However, the worst-case scenario for Latin hypercube sampling may result in poorly sampled spaces, as depicted in Figure 1b, which may not adequately represent the entire domain space. Therefore, in any field that uses Latin hypercube sampling as a sampling technique for optimization problems, it is essential to ensure that samples cover the entire parameter space (space filling).
To address the primary challenge in Latin hypercube sampling, researchers have explored various criteria to generate well-filling Latin hypercube designs, aiming to enhance their performance. Table 1 summarizes the criteria used in Latin hypercube sampling. The two most common criteria used in Latin hypercube sampling (LHS) involve maximizing the minimum inter-point distance between pairs of sample points and minimizing the correlation between pairs of columns in the sample matrix [35]. By maximizing the minimum distance, LHS avoids the problem of points clustering in certain regions of the parameter space. Clustering can lead to bias in estimation and might cause important regions of the space to be overlooked. By minimizing the correlation between pairs of columns in the LHS matrix, it ensures that the selected variables or factors are as orthogonal as possible. This means that the variables are as independent as possible, and each factor has less influence on the others. This makes it easier to identify the individual impact of each variable on the system's output. However, when dealing with a large number of sample points or when integrating them with the optimization of a specific case study, these algorithms may require more computational time and become complex.
In order to reduce computational time, the adaptive Latin hypercube with sequential design is an intriguing choice for case studies involving process simulation. Adaptive sampling strategies typically begin with a small sample size and then sequentially add points based on specific criteria. In 2003, Wang proposed an adaptive response surface methodology model using inherited Latin hypercube sampling points [50]. This sampling approach involves adding new sample points in underrepresented areas. Subsequently, in 2018, Chang applied this technique instead of central composite design to calculate the failure probability of a complex turbine blade structure [51]. The results demonstrated the effectiveness of this approach when combined with Gaussian process regression for structural reliability analysis. Another variation of adaptive Latin hypercube sampling, which initially identifies the region of interest, was introduced by Roussouly [52]. After marking the location and sampling the first point, the region is subdivided to add two more sample points. Finally, two additional sample points are introduced using another two-point Latin hypercube. This technique offers advantages for reliability analysis by focusing sampling efforts solely on the areas of interest. Zhi-zhao Liu and colleagues proposed two general extension algorithms for Latin hypercube sampling: basic general extension and general extension based on the greedy algorithm [53]. These algorithms aim to preserve the majority of the original Latin hypercube points. While the general extension approach can be time consuming, the general extension based on the greedy algorithm reduces computational time, though it may lead to the removal of some of the original points. In 2016, a rigorous Latin hypercube sampling method was coupled with a sophisticated algorithm to ensure the retention of all initial points [54].
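The two common space-filling criteria mentioned in this section, the minimum inter-point distance and the correlation between columns, can be computed directly from a design matrix. The sketch below is illustrative only: the function names are hypothetical, and the "diagonal" design is simply one assumed way in which a valid Latin hypercube can fill the space poorly.

```python
import numpy as np

def min_interpoint_distance(design):
    """Smallest Euclidean distance between any two rows of the design."""
    diff = design[:, None, :] - design[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(len(design), k=1)  # each pair counted once
    return dist[upper].min()

def max_abs_column_correlation(design):
    """Largest absolute Pearson correlation between two distinct columns."""
    corr = np.corrcoef(design, rowvar=False)
    off_diag = corr[~np.eye(corr.shape[0], dtype=bool)]
    return np.abs(off_diag).max()

# Two valid five-point Latin hypercubes in two factors (illustrative values).
diagonal = np.column_stack([np.arange(5), np.arange(5)]) / 5 + 0.1
spread = np.array([[0.1, 0.5], [0.3, 0.9], [0.5, 0.3], [0.7, 0.7], [0.9, 0.1]])

d_diag, d_spread = min_interpoint_distance(diagonal), min_interpoint_distance(spread)
c_diag, c_spread = max_abs_column_correlation(diagonal), max_abs_column_correlation(spread)
```

Under both criteria the spread design is preferred: its closest pair of points is farther apart and its columns are less correlated than in the diagonal design, whose two factors are perfectly correlated.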

Random Sampling
Monte Carlo or random sampling, proposed by Metropolis and Ulam in 1949 [55], was the first modern sampling design for computer experiments. Random sampling requires a large number of sample points (N) to achieve a high level of accuracy in surrogate modeling, which increases both the computational time and the effort of data collection. Figure 2 illustrates random sampling.

Artificial Neural Networks (ANNs)
Artificial neural networks are mathematical expressions consisting of interconnected processing units known as neurons. These computer systems mimic the biological neural networks found in animals. The parameters of the network include weights and biases. Each neuron operates by using a transfer function. Feedforward neural networks, which form a directed graph, are widely used. The primary structure of artificial neural networks includes an input layer, a hidden layer, and an output layer.
In this research, a three-layer feedforward neural network was used to train the dataset due to its mathematical simplicity. Figure 3 illustrates the schematic of the feedforward neural network with a single hidden layer. Available training algorithms include the Levenberg-Marquardt, Bayesian regularization, and scaled conjugate gradient techniques. However, Levenberg-Marquardt requires less time than the others [56]. The input for the ANNs consists of a dataset of operational and design parameters, which was generated using the original Latin hypercube sampling. The output of the ANNs is the predicted cost corresponding to each process. In terms of neural network hyperparameters in this research, a single hidden layer with eight neurons was employed in the ANN network. The sigmoid function is used to model the relationship between the input and the hidden layer. It is smooth and differentiable everywhere, making it suitable for gradient-based optimization methods such as backpropagation. The smoothness of activation functions ensures that small changes in weights and biases result in continuous changes in the output, which is crucial for efficient training. The mathematical equations corresponding to the weights and biases of ANNs used to estimate the output (y) are shown in Equations (1) and (2).
where k is the number of decision variables, r is the number of neurons in a hidden layer, and W1 and W2 are input weights of the hidden layer and output layer, respectively. The parameters B1 and B2 are biases of the hidden and the output layers, respectively.
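As an illustration of how the weights and biases defined above combine into a prediction, the following sketch evaluates a single-hidden-layer network. It assumes a logistic sigmoid in the hidden layer and a linear output neuron, which is one common arrangement and may differ in detail from Equations (1) and (2); the function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid (assumed form of the hidden-layer transfer function)."""
    return 1.0 / (1.0 + np.exp(-z))

def ann_predict(x, W1, B1, W2, B2):
    """Forward pass of a single-hidden-layer network.

    x: (k,) inputs, W1: (r, k), B1: (r,), W2: (r,), B2: scalar.
    """
    hidden = sigmoid(W1 @ x + B1)    # r hidden-neuron activations
    return float(W2 @ hidden + B2)   # linear output neuron (assumption)

k, r = 3, 8                          # three decision variables, eight neurons
x = np.array([0.2, -0.5, 0.9])
# With zero hidden weights every activation is sigmoid(0) = 0.5.
y = ann_predict(x, np.zeros((r, k)), np.zeros(r), np.ones(r), 1.0)
```

Once trained, the same closed-form expression can be handed to a nonlinear solver as the objective function, which is how the surrogate is used in the methodology that follows.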

Methodology
The methodology of this work is divided into three sections. In Section 3.1, we introduce the adaptive Latin hypercube sampling approach for surrogate-based optimization, which focuses on solving optimization problems using a small set of sample points. The algorithm's optimal solutions are validated through outcomes obtained via random sampling. Section 3.2 covers the details of adaptive LHS in the proposed surrogate-based optimization, while Section 3.3 offers detailed insights into optimization using random sampling. Adaptive Latin hypercube sampling is a sequential sampling method that increases the number of sample points incrementally until it meets the algorithm's criteria. In contrast, random sampling is a one-time operation that generates the entire set of sample points at once without the option to add more. In this work, if the number of sample points in random sampling cannot adequately represent the entire surface, the algorithm will generate a new initial dataset with twice the number of sample points as in the previous set.

Proposed Adaptive LHS for Surrogate-Based Optimization Algorithm
Figure 4 represents the proposed adaptive Latin hypercube sampling for surrogate-based optimization. At the beginning of the algorithm, ranges of factors that have effects on the output value were assigned. Then, the first dataset of x_kn (with the desired number of sample points) was generated using Latin hypercube sampling, where k = 1, 2, 3, ..., K and n = 1, 2, 3, ..., N_0 (K is the number of decision variables, and N_0 is the initial number of sample points). The initial number of sample points was set at seven times the number of factors. The corresponding output values, y_n, were obtained from process simulation combined with an economic analysis. The sets of input and output were normalized in the range of −1 to 1 for ANNs training using MATLAB with a three-layer design (input layer, hidden layer, and output layer).
For ANNs training, the Levenberg-Marquardt algorithm was used. The dataset was split into 70% for training, 15% for validation, and 15% for testing. The criteria used to determine the accuracy of the ANN model are the mean square error (MSE) [57] and the R-squared (R²) value. The training loop for ANNs was terminated when the MSE value of the ANN model remained at a minimum for 5 consecutive iterations, and the R² value of the model was greater than or equal to 0.995. The weights and biases of the most recent ANN model were then used as the objective function for the optimization problem to determine the optimal solution (x_i,opt and ŷ_opt). The optimal operating conditions (x_i,opt) were then input into the process simulation to obtain the corresponding output value (y_opt). The accuracy of the obtained optimal solution is deemed acceptable if the percent error between the predicted output value (obtained from the ANN model) and the actual output value (obtained from process simulation), as shown in Equation (3), is within a preset value of 1 percent. If the percent error is less than one, the optimization algorithm is terminated, and the most recent optimal solution is reported as the optimal solution of the problem.
If the percent error value exceeds one, three additional data points are incorporated into the existing dataset. The first additional data point is the most recent optimal solution (x_i,opt, y_opt), while the other two data points are generated using the adaptive Latin hypercube sampling method (as described in Section 3.2). The updated dataset was then normalized, and these steps were repeated until the percent error in the output falls below one, ultimately yielding the optimal solution.
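The loop described in this section can be outlined as follows. This is a sketch under stated assumptions, not the MATLAB implementation used in this work: the percent error is assumed to be the absolute gap between the predicted and simulated values relative to the simulated value, and `simulate`, `fit`, `optimize`, and `propose_two` are hypothetical callables standing in for the process simulation, ANN training, the nonlinear solver, and the adaptive LHS step. The toy problem replaces the ANN with a quadratic fit so that the example runs on its own.

```python
import numpy as np

def percent_error(y_pred, y_sim):
    """Assumed form of Equation (3): gap relative to the simulated value, in %."""
    return abs(y_pred - y_sim) / abs(y_sim) * 100.0

def adaptive_loop(X, y, simulate, fit, optimize, propose_two, tol=1.0, max_iter=50):
    """Repeat fit -> optimize -> check until the percent error is below tol."""
    for _ in range(max_iter):
        model = fit(X, y)                  # stand-in for ANN training
        x_opt, y_hat = optimize(model)     # surrogate optimum and its prediction
        y_opt = simulate(x_opt)            # simulated output at that optimum
        err = percent_error(y_hat, y_opt)
        if err < tol:
            return x_opt, y_opt, err, len(X)
        extra = propose_two(model, X, y)   # two adaptive LHS points
        extra_y = np.array([simulate(x) for x in extra])
        # Three new rows per iteration: the optimum plus the two new points.
        X = np.vstack([X, x_opt[None, :], extra])
        y = np.concatenate([y, [y_opt], extra_y])
    raise RuntimeError("tolerance not reached within max_iter iterations")

# Toy one-factor problem (hypothetical): the "simulation" is a known quadratic.
simulate = lambda x: (x[0] - 0.3) ** 2 + 1.0
fit = lambda X, y: np.polyfit(X[:, 0], y, 2)
grid = np.linspace(-1.0, 1.0, 2001)

def optimize(coeffs):
    pred = np.polyval(coeffs, grid)
    i = int(np.argmin(pred))
    return np.array([grid[i]]), float(pred[i])

rng = np.random.default_rng(0)
propose_two = lambda model, X, y: rng.uniform(-1.0, 1.0, size=(2, X.shape[1]))
X0 = np.linspace(-1.0, 1.0, 7)[:, None]    # seven points for one factor
y0 = np.array([simulate(x) for x in X0])
x_opt, y_opt, err, n_used = adaptive_loop(X0, y0, simulate, fit, optimize, propose_two)
```

Passing the simulation and surrogate as callables keeps the termination logic separate from the modeling choices, so the same loop could wrap an ANN and a process simulator without change.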

Adaptive Latin Hypercube Sampling: Addition of Sample Points
In the step involving the addition of sample points in adaptive LHS, two additional sample points are generated during each iteration. To determine these sample points, the deviation from their respective actual values for all (N_0 + 3i) predicted outputs is calculated for each sample point in the dataset, where i represents the number of iterations. The sample point that exhibits the highest deviation, denoted as (x_n, y_n,max deviation), is selected. Subsequently, the intervals corresponding to the sample point with the highest deviation (x_n, y_n,max deviation) are evenly divided into two intervals. Two additional sample points are then randomly selected from each of these intervals. Figure 5 illustrates an example of generating two additional sample points from the highest deviation output sample point. This example showcases the generation of sample points for two factors, which are denoted as x_1 and x_2. The highest deviation sample point (x_n, y_n,max deviation) is represented as a red dot. The intervals corresponding to this sample point are within the ranges of 0.4 to 0.6 for factor 1 and 0.2 to 0.4 for factor 2. Both factors are evenly divided into two intervals: 0.4 to 0.5 and 0.5 to 0.6 for factor 1, and 0.2 to 0.3 and 0.3 to 0.4 for factor 2 (as indicated by dashed lines). Next, one sample point is randomly selected from each interval of the factors. A total of two additional sample points are obtained and represented as blue dots.
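The point-addition step can be sketched as a small two-point Latin hypercube inside the cell of the worst-predicted sample. The sketch assumes inputs scaled to [0, 1), equal-width intervals, and the absolute difference between predicted and simulated outputs as the deviation measure; `refine_worst_point` and the sample values are hypothetical.

```python
import numpy as np

def refine_worst_point(X, y_sim, y_pred, n_intervals, rng=None):
    """Return two new points, shape (2, n_factors), near the worst prediction."""
    rng = np.random.default_rng(rng)
    worst = int(np.argmax(np.abs(y_pred - y_sim)))  # highest-deviation sample
    width = 1.0 / n_intervals
    lower = np.floor(X[worst] / width) * width      # lower edge of its cell
    half = width / 2.0
    new_points = np.empty((2, X.shape[1]))
    for k in range(X.shape[1]):
        # One draw from each half of factor k's interval ...
        values = np.array([lower[k] + rng.uniform(0.0, half),
                           lower[k] + half + rng.uniform(0.0, half)])
        # ... paired at random with the halves of the other factors.
        new_points[:, k] = rng.permutation(values)
    return new_points

# Five intervals per factor, so the cell of the third point is
# 0.4 to 0.6 for factor 1 and 0.2 to 0.4 for factor 2, as in the example.
X = np.array([[0.1, 0.9], [0.3, 0.5], [0.47, 0.31], [0.7, 0.1], [0.9, 0.7]])
y_sim = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_pred = np.array([1.01, 2.0, 3.5, 4.02, 5.0])
new_points = refine_worst_point(X, y_sim, y_pred, n_intervals=5, rng=0)
```

For each factor, one of the two new values falls in the lower half of the interval and the other in the upper half, matching the dashed-line subdivision described for Figure 5.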

Verification of the Optimal Solution Using Random Sampling Technique
Figure 6 illustrates the simulation-optimization algorithm using the random sampling technique. The algorithm commenced with the same input ranges and factors as the proposed simulation-optimization method employing adaptive LHS. Initially, 50 sample points were used. Output data were collected, and ANNs were trained using the same stopping criterion as in the proposed method. The resulting ANN model was then subjected to optimization using a nonlinear solver to obtain the optimal solution. The accuracy of the optimal solution was evaluated using the same criteria as in the proposed algorithm. The algorithm terminated and reported the optimal solution if the percent error of the solution was less than one. If not, additional points were generated and added to the original samples, doubling the recent sample size (N × 2). These additional points were then processed through the simulation, and the steps were repeated until the determined optimal solution achieved a percent error of less than one.
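The doubling schedule of the random sampling benchmark can be sketched in the same style. As before, this is an illustration under assumptions rather than the implementation used in this work: inputs are drawn uniformly on [−1, 1], the percent error is taken relative to the simulated value, and `simulate`, `fit`, and `optimize` are hypothetical stand-ins, with a quadratic fit replacing the ANN in the toy problem.

```python
import numpy as np

def random_sampling_loop(n_factors, simulate, fit, optimize,
                         n_start=50, tol=1.0, max_doublings=8, rng=None):
    """Fit and optimize on random samples, doubling N until the error is below tol."""
    rng = np.random.default_rng(rng)
    X = rng.uniform(-1.0, 1.0, size=(n_start, n_factors))
    y = np.array([simulate(x) for x in X])
    for _ in range(max_doublings + 1):
        model = fit(X, y)                  # stand-in for ANN training
        x_opt, y_hat = optimize(model)     # surrogate optimum and its prediction
        y_opt = simulate(x_opt)
        err = abs(y_hat - y_opt) / abs(y_opt) * 100.0  # assumed percent error
        if err < tol:
            return x_opt, y_opt, err, len(X)
        # Add as many new random points as already exist, so N becomes 2N.
        extra = rng.uniform(-1.0, 1.0, size=(len(X), n_factors))
        X = np.vstack([X, extra])
        y = np.concatenate([y, [simulate(x) for x in extra]])
    raise RuntimeError("tolerance not reached within the allowed doublings")

# Toy one-factor problem (hypothetical): the "simulation" is a known quadratic.
simulate = lambda x: (x[0] - 0.3) ** 2 + 1.0
fit = lambda X, y: np.polyfit(X[:, 0], y, 2)
grid = np.linspace(-1.0, 1.0, 2001)

def optimize(coeffs):
    pred = np.polyval(coeffs, grid)
    i = int(np.argmin(pred))
    return np.array([grid[i]]), float(pred[i])

x_opt, y_opt, err, n_used = random_sampling_loop(1, simulate, fit, optimize, rng=0)
```

Because each failed check doubles the dataset, the sample count grows as 50, 100, 200, and so on, in contrast to the three points added per iteration by the adaptive method.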

Case Study
To illustrate the varying complexity and number of factors in the problem, this section examines three distinct chemical processes as case studies. Case Study I focuses on ammonia production from syngas, which involves three factors. Case Study II explores methanol production via carbon dioxide hydrogenation, which deals with four factors. Case Study III addresses CO2 absorption by methanol using the Rectisol process, which involves five factors. The details of each case study are provided below.

Case Study 1: Ammonia Production from Syngas
The first case study addresses an optimization problem with three decision variables. Figure 7 depicts the process simulation of ammonia production based on the Haber process. The process involves feeds of 31,900 kg per hour of natural gas at 35 bar and 25 °C as well as 95,800 kg per hour of high-pressure steam at 150 bar. Further details on the process simulation and conditions can be found in the work of Janosovsky [58]. Ammonia production comprises two main parts: the production of syngas and the production of ammonia (Equation (16)). The syngas production includes steps such as steam reforming (Equations (4)-(7)), air reforming (Equations (9)-(12)), high and low shift conversion (Equation (13)), CO2 removal, and methanation (Equations (14) and (15)). The reactions for each step are as follows.

Steam reforming (Reformer unit)
Air reforming (Combuster unit)
Water gas shift reaction (HTC and LTC units)
Methanation (Methanizer unit)
Ammonia production (PFR-100, PFR-101 and PFR-102 units)

For the process optimization, the three decision variables are the reformer temperature, the combuster temperature, and the low-temperature conversion reactor temperature. The ranges of these decision variables are shown in Table 2. The temperatures of the high-temperature conversion reactor, methanizer, and ammonia reactor were fixed at 450 °C, 340 °C, and 450 °C, respectively. A two-stage compressor was used to increase the pressure of the stream up to 150 bar before entering the ammonia synthesis loop. The ammonia synthesis system was modeled using three plug flow reactors in series with three quenching sections. The final product, ammonia, was separated by a separation unit at 45 °C with ammonia content exceeding 98 mol%. More details for replicating the process simulation are provided in Supplementary Information S1.1.

Case Study 2: Methanol Production via Carbon Dioxide Hydrogenation
The second case study represents an optimization problem with four decision variables. Figure 8 depicts the process simulation of methanol production via carbon dioxide hydrogenation with a recycling process. The decision variables were (1) the pressure of the equilibrium reactor, (2) the temperature of the equilibrium reactor, (3) the temperature of the steam entering a separator, and (4) the recycle ratio. This simulation used the Peng-Robinson thermodynamics package. The feed stream of the process consisted of 1000 kmol per hour of carbon dioxide and 3000 kmol per hour of hydrogen at 20 bar and 40 °C. The process specification was a methanol product purity of 99.5% by mole. The range of each decision variable is shown in Table 3. Details of the economic evaluation assumptions for this process can be found in the work of Borisut and Nuchitprasittichai [57]. Additional information for reproducing the process simulation is provided in Supplementary Information S1.2.

Case Study 3: CO2 Absorption by Methanol via the Rectisol Process
The third case study represents an optimization problem with five decision variables. Figure 9 illustrates the process of CO2 absorption by methanol via the Rectisol process. The process consists of three main sections: water removal, absorption, and CO2/methanol separation. The syngas feed contains mole fractions of 0.2462 CO2, 0.0002 NH3, 0.0044 CO, 0.0050 Ar, 0.4148 N2, 0.3186 H2, 0.0035 H2O, and 0.0073 CH4 at 17.24 bar and 18.30 °C [59]. A small amount of methanol was mixed with the syngas feed and sent to the first separator to separate water from the feed gas. Subsequently, the syngas was fed to an absorber column, where CO2 gas was captured by chilled methanol added at the top of the column. The rich methanol then passed through a three-stage separator and was fed into a stripper column to separate CO2 from methanol. The lean methanol leaving the stripper was mixed with 40 kmol per hour of makeup methanol and recycled to the absorber. The vapor product from the stripper, primarily containing CO2, was combined with the other CO2 product streams from the second and third separators. Details for replicating the process simulation are provided in Supplementary Information S1.

In the optimization of this process, the decision variables included the lean methanol temperature, the third-stage separator pressure, the stripper reflux ratio, the stripper inlet temperature, and the distillation reflux ratio. The ranges of these decision variables are detailed in Table 4.


Results and Discussion
The results and discussion are presented in four sections. In Section 5.1, we describe the optimal results obtained from random sampling, which started with an initial number of 50 sample points. Section 5.2 discusses the convergence of the proposed adaptive Latin hypercube sampling (LHS) algorithm. In Section 5.3, we provide a comparison between the optimal results obtained from the random sampling technique and those obtained from the proposed adaptive sampling method. Section 5.4 includes recommendations for future work. The dataset for all three case studies can be found in Supplementary Information S2.

The Results of Monte Carlo or Random Sampling
Monte Carlo (random) sampling was used to verify the optimal results obtained from the proposed adaptive LHS optimization algorithm. The number of sample points for random sampling started at 50 and was doubled until the solution satisfied the criteria. Table 5 shows the results obtained from the random sampling technique at each iteration for all three case studies. The criterion for the ANN model is that the R² value of the model has to be greater than or equal to 0.995. The criterion for the optimal solution is that the percent error of the optimal cost has to be less than one percent (1%). The results showed that for Case Study I (ammonia production from syngas) and Case Study II (methanol production via carbon dioxide hydrogenation), 100 sample points were required to meet the criteria. The obtained minimum ammonia production cost was USD 495.87 for Case Study I, and the minimum methanol production cost was USD 495.87 for Case Study II. For Case Study III (CO2 absorption by methanol via the Rectisol process), the optimization problem required 400 sample points to meet the criteria. The minimum CO2 capture cost was USD 43.40.
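As a small illustration of the model acceptance criterion, the following sketch computes the coefficient of determination and checks it against the 0.995 threshold. The function names and the example values are hypothetical, and the text does not state which data subset the R² was evaluated on, so treat this as a generic check rather than the exact procedure of the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def model_accepted(y_true, y_pred, threshold=0.995):
    """True if the surrogate meets the R² >= threshold criterion."""
    return r_squared(y_true, y_pred) >= threshold


# Illustrative values only (not data from the case studies).
simulated = [1.0, 2.0, 3.0, 4.0]
close_fit = [1.05, 1.95, 3.05, 3.95]   # small residuals
loose_fit = [1.1, 1.9, 3.1, 3.9]       # residuals twice as large

print(model_accepted(simulated, close_fit))
print(model_accepted(simulated, loose_fit))
```

Doubling the residuals quadruples the residual sum of squares, which is enough to move this toy model from accepted to rejected; the threshold is therefore fairly strict for data with a narrow output range.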

The Convergence of the Proposed Adaptive LHS Optimization Algorithm
Figure 10 illustrates the convergence of the proposed algorithm for all three case studies. In the figure, the triangle, black circle, and white circle symbols represent the actual cost (obtained from the process simulation), the predicted cost (obtained from the optimization problem), and the percent error, respectively. The percent error between actual and predicted costs is depicted as a dashed line. The proposed algorithm terminates when the percent error of the optimal cost is less than one percent (1%).
For Case Studies I and III (Figure 10a,c), the outcomes exhibited a similar pattern, where the percent error decreased as more iterations were added. There were a few points where the percent error rose sharply before the algorithm converged. This occurred because the sample points for ANN training, testing, and validation were randomly chosen from the dataset, resulting in an insufficient number of data points before the algorithm converged. For Case Study II, as shown in Figure 10b, the percent error of the optimal solution decreased as the number of iterations (or sample points) increased until it eventually became less than one percent. In all three case studies, each of which featured a different number of independent parameters, the proposed method consistently converged to optimal results that met the necessary criteria. Details on the ANNs, including the weights and biases, for all three case studies can be found in Supplementary Information S3.
Considering the scale of the case studies, the first case study, which included three studied factors, reached convergence after four iterations. Case Studies II and III, which had four and five studied factors, respectively, converged after six iterations. The results showed that the algorithm required more iterations to converge when dealing with higher-dimensional problems.

Comparison of Optimal Solutions between Proposed Sampling and Random Sampling
This section compares the outcomes of the proposed adaptive LHS optimization algorithm with those of random sampling to verify the results. Additionally, the number of sample points required to obtain the optimal solution is compared in this section. The number of sample points used in constructing the ANN model is crucial. Having a sufficient number of sample points to represent the relationship between independent and dependent variables allows the mathematical model to accurately depict the system, enabling the solver to find the true optimal solutions for the optimization problem.
Table 6 provides a comparison of the optimal results obtained from random sampling and the proposed algorithm. Three replications of the proposed algorithm were conducted to ensure consistency. For Case Study I (ammonia production from syngas), the optimal operating conditions for all three variables obtained from the proposed algorithm matched those determined by random sampling. The optimal operating conditions were temperatures of 900 °C for the reformer, 1400 °C for the combuster, and 160 °C for the low-temperature conversion reactor, resulting in a minimum production cost of USD 495.87 per ton of generated ammonia, which was consistent with the results from random sampling. Notably, the proposed technique required only 38 sample points to find the optimal solution for this problem with three decision variables compared to the 100 sample points needed by random sampling.
The optimal operating conditions for Case Study II (methanol production from carbon dioxide hydrogenation) were determined using the proposed algorithm for three replications, and they matched those determined by random sampling, with the exception of the temperature of the steam entering the separator (x3) in the third replication. In the third replication, the temperature of the steam entering the separator (x3) was 76 °C, which was slightly different from the temperature obtained in the other replications (80 °C). A slight decrease in the temperature of the stream entering the separator resulted in an increase in the methanol production cost. This change did not have a significant impact in terms of model prediction error. Furthermore, the third replication used only 40 sample points to represent the entire surface, while the other two replications used 46 sample points. This indicates that the model with 40 sample points found a local optimum. The optimal operating conditions for this case study were a pressure of 70 bar for the equilibrium reactor, a temperature of 190 °C for the reactor, a temperature of 80 °C for the steam entering a separator, and a recycling ratio of 1. The lowest cost to generate methanol was USD 942.45 per tonne of methanol. While random sampling required 100 sample points to find the optimal solution for this problem with four decision variables, the proposed adaptive LHS optimization approach only required 46 sample points.
For Case Study III (carbon dioxide absorption by methanol via the Rectisol process), the values of the optimal operating conditions were identical to those obtained from random sampling except for the value of the lean methanol temperature (x1). The results for the lean methanol temperature (x1) from three replications varied from −26.70 to −20.0 °C, which was slightly different from the value obtained from random sampling (−29.0 °C). The lowest CO2 capture costs identified by the proposed algorithm for three replications ranged from USD 43.25 to USD 45.91 per ton of CO2 captured, which is consistent with the lowest cost identified by random sampling (USD 43.40 per ton of CO2 captured). The optimal operating parameters for this case study included a lean methanol temperature of −26.70 °C, a third-stage separator pressure of 1.28 bar, a stripper reflux ratio of 5, a stripper input temperature of 40 °C, and a distillation reflux ratio of 1. The minimum cost for CO2 capture was USD 43.19 per tonne of CO2. The proposed technique required only 53 sample points for this problem with five decision variables, while random sampling needed 400 sample points to achieve the same level of ANN model accuracy and obtain the optimal solutions.
Based on the results of the three different case studies, adaptive Latin hypercube sampling required fewer sample points than random sampling to accurately represent the entire surface using ANN models.The model accuracy and optimal conditions achieved using adaptive sampling were comparable to random sampling but required fewer sample points.
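To put the reported sample counts side by side, the short calculation below expresses the saving of the adaptive approach as a percentage of the random sampling budget. The counts are the ones quoted in this section (38, 46, and 53 points against 100, 100, and 400); the helper name `reduction_percent` is only illustrative.

```python
# Sample points reported in this section: (adaptive LHS, random sampling).
reported = {
    "Case Study I": (38, 100),
    "Case Study II": (46, 100),
    "Case Study III": (53, 400),
}


def reduction_percent(n_adaptive, n_random):
    """Percentage of simulation runs saved relative to random sampling."""
    return 100.0 * (n_random - n_adaptive) / n_random


for name, (n_adaptive, n_random) in reported.items():
    saving = reduction_percent(n_adaptive, n_random)
    print(f"{name}: {n_adaptive} vs {n_random} points, {saving:.1f}% fewer")
```

The savings are 62%, 54%, and roughly 87% for the three-, four-, and five-variable problems. Part of the large gap in Case Study III comes from the doubling schedule itself, since random sampling can only stop at 50, 100, 200, or 400 points.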

Recommendation for Future Work
To assess the performance of the current algorithm and establish its generalizability, it was compared to the work in [60], which introduced incremental Latin hypercube sampling (LHS) for ANN-based optimization. In [60], the sample size began at ten times the number of decision variables and increased iteratively by one-third of the current number of sample points, which was rounded up to the nearest tenth. In contrast, the present algorithm started with a sample size of seven times the number of decision variables, with three additional sample points added in each iteration. Both algorithms employed the same exit criterion (percent error less than 1%). The current algorithm required fewer additional sample points in each iteration and resulted in fewer sample points to obtain the optimal solution. However, it is important to note that the previous work applied its method to more complex problems involving six and seven decision variables. To comprehensively compare the performance of both methods, it is recommended to apply the current algorithm to optimization problems with higher dimensionality.
For future work, it is recommended to compare the performance of this adaptive LHS approach with other adaptive sampling techniques or advanced surrogate model techniques. Implementing Bayesian optimization to enhance the optimization method is suggested. Performance tests that incorporate various adaptive sampling criteria, such as distance, space-filling metrics, gradient information, and prediction uncertainty, can be applied to the algorithm. Furthermore, the results have demonstrated that as the number of samples increased, the solution converged to the true optimal solution. To enhance the performance of the algorithm, convergence guarantees can be further implemented. The application of the proposed algorithm should be extended to different fields of study, such as engineering design and materials discovery. Additionally, it is essential to study the limitations of ANNs, particularly in terms of their consistency during the training process.
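The two growth schedules can be compared numerically with a short sketch. The function names are hypothetical, and the rounding rule for the method in [60] is an assumption: the increment (one-third of the current size) is taken here to be rounded up to the next multiple of ten, which is one reading of the description above.

```python
def present_schedule(n_vars, n_iterations):
    """Present algorithm: start at 7 * n_vars, add 3 points per iteration."""
    return [7 * n_vars + 3 * k for k in range(n_iterations + 1)]


def reference_schedule(n_vars, n_iterations):
    """Schedule as described for [60]: start at 10 * n_vars, grow by one-third.

    Assumption: the increment is rounded up to the next multiple of ten.
    """
    sizes = [10 * n_vars]
    for _ in range(n_iterations):
        n = sizes[-1]
        increment = -(-n // 30) * 10  # integer form of ceil(n / 3 / 10) * 10
        sizes.append(n + increment)
    return sizes


# Six decision variables, three iterations of each schedule.
print(present_schedule(6, 3))
print(reference_schedule(6, 3))
```

Under this reading, the present schedule grows linearly while the reference schedule grows roughly geometrically, so the gap widens with every iteration. As a consistency check, `present_schedule(4, 6)` and `present_schedule(5, 6)` end at 46 and 53 points, matching the counts reported for Case Studies II and III after six iterations.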

Conclusions
This paper proposes an adaptive Latin hypercube sampling method for simulation-based optimization problems. The algorithm uses the deviation of the output as a criterion for generating new sample points. After each iteration, three additional sample points are added to the existing dataset. The first additional sample point is the optimal solution from the previous optimization dataset, while the other two additional data points are selected from a grid area with the maximum deviation in output. Artificial neural networks (ANNs) were employed as the surrogate model for the optimization problem. The accuracy of the ANN model was assessed using the mean square error (MSE) and R-squared (R²) values. The criterion for determining the accuracy of the optimal solution was that the percent error should be less than one percent. The results demonstrate that the proposed algorithm is capable of obtaining optimal solutions that are similar to the random sampling approach but require fewer sample points.
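The point-selection rule summarized above can be sketched as follows. This is a simplified illustration under several assumptions that are not taken from the paper: the grid is built from equal-width bins in each dimension, the "deviation" of a cell is measured as the spread (maximum minus minimum) of the outputs sampled in it, and the two new points are drawn uniformly inside the selected cell rather than by a Latin hypercube construction. The names `cell_index` and `next_points` are hypothetical.

```python
import random
from collections import defaultdict


def cell_index(x, bounds, bins):
    """Grid cell (one bin index per dimension) that contains point x."""
    idx = []
    for value, (lo, hi) in zip(x, bounds):
        k = int((value - lo) / (hi - lo) * bins)
        idx.append(min(k, bins - 1))  # keep the upper bound inside the last bin
    return tuple(idx)


def next_points(X, y, x_opt, bounds, bins=2, rng=None):
    """Return three new points: the previous optimum plus two points
    from the grid cell whose sampled outputs deviate the most."""
    rng = rng or random.Random(0)
    outputs_by_cell = defaultdict(list)
    for x, out in zip(X, y):
        outputs_by_cell[cell_index(x, bounds, bins)].append(out)
    # Assumption: deviation of a cell = spread of the outputs observed in it.
    worst = max(outputs_by_cell,
                key=lambda c: max(outputs_by_cell[c]) - min(outputs_by_cell[c]))
    new_points = []
    for _ in range(2):
        point = []
        for k, (lo, hi) in zip(worst, bounds):
            width = (hi - lo) / bins
            point.append(rng.uniform(lo + k * width, lo + (k + 1) * width))
        new_points.append(point)
    return [list(x_opt)] + new_points


# One-dimensional toy data: outputs vary most in the upper half of the range.
X = [[0.1], [0.4], [0.6], [0.9]]
y = [1.0, 1.1, 2.0, 9.0]
points = next_points(X, y, x_opt=[0.3], bounds=[(0.0, 1.0)], bins=2)
print(points)
```

In the toy data the outputs in the upper half of the range differ by 7.0 while those in the lower half differ by 0.1, so both new points fall in the upper half and the previous optimum is re-simulated as the first point.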

Figure 1. Illustration of Latin hypercube sampling for N = 10 of (a) good filling design and (b) poor filling design.

illustrates random sampling.

Figure 3. The schematic of feedforward neural networks.

Figure 4. Proposed adaptive Latin hypercube sampling for surrogate-based optimization.

Figure 5. Additional sample points generation using maximum deviation: (a) original Latin hypercube sampling and (b) adaptive Latin hypercube sampling.

Figure 7. Process simulation of ammonia production from syngas.

Figure 8. Process simulation of methanol production via carbon dioxide hydrogenation.

Figure 10. The convergence of the proposed algorithm for all three case studies: (a) ammonia production from syngas, (b) the production of methanol from carbon dioxide hydrogenation, (c) carbon dioxide absorption by methanol via Rectisol process.

Table 1. The approach for constructing optimal Latin hypercube sampling.

Table 2. Range of decision variables of Case Study 1.

Table 3. Range of decision variables of Case Study 2.
Decision Variables: Range
Pressure of the equilibrium reactor (bar): 50 to 70
Temperature of the equilibrium reactor (°C): 190 to 210
Temperature of the steam entering a separator (…): …

Table 4. Range of decision variables of Case Study 3.

Table 5. Result of random sampling of three case studies.

Table 6. Comparison of the results of proposed adaptive Latin hypercube sampling and random sampling.