On Solving the Problem of Finding Kinetic Parameters of Catalytic Isomerization of the Pentane-Hexane Fraction Using a Parallel Global Search Algorithm

: This article is devoted to the problem of developing a kinetic model of a complex chemical reaction using a parallel optimization method. The design of the kinetic model consists of ﬁnding the kinetic parameters of the reaction, which cannot be calculated analytically, and since the chemical reaction involves many stages, the optimization problem is multiextremal. As a chemical reaction, the process of catalytic isomerization of the pentane-hexane fraction is considered, which is now important due to the switch of the oil reﬁning industry to the production of gasoline corresponding to the Euro-5 standard. On the basis of known industrial data on the concentrations of reaction components and the temperature at the outlet of the third reactor, the activation energies and pre-exponential factors of each reaction stage were calculated. To solve the optimization problem, the authors developed a parallel global search algorithm and a program based on Lipschitz optimization. The kinetic parameters found made it possible to develop a mathematical model of the process, which is in good agreement with industrial data. The developed mathematical model in future works will make it possible to study the dynamics of the gas–liquid ﬂow in the reactor unit, taking into account diffusion and heat exchange processes through the catalyst layer.


Introduction
Due to the transition of the domestic oil refining industry to the production of gasoline conforming to the Euro-5 standard, reducing the content of aromatic hydrocarbons is an urgent task, in particular benzene in motor fuel, while maintaining the octane value. In connection with the transition of the oil refining industry to the production of motor gasoline conforming to Euro-5 and Euro-6 standards, the reduction of aromatic hydrocarbons in motor fuel and, mainly, benzene, becomes an urgent task. At the same time, when removing aromatic hydrocarbons, which are high-octane components, it is necessary to preserve the octane number of gasoline. It is known that hydrocarbons with a branched structure have a higher octane number than hydrocarbons with a linear structure. For example, in n-pentane, the octane number according to the research method is 62 points and in iso-pentane it is 93 points. On an industrial scale, catalytic isomerization units of light paraffins have been introduced to produce hydrocarbons of an isomeric structure. The catalytic isomerization of light paraffins, in turn, makes it possible to obtain a high-octane component of automobile gasoline with a minimum content of aromatic hydrocarbons. The main advantage of this process is the production of high-octane gasoline, while aromatic hydrocarbons are practically not formed. Straight-run gasoline fractions with boiling limits of 62-70°C containing mainly n-pentane and n-hexane, pentane-hexane fractions from gas condensate or from gas fractionation plants, as well as catalytic reforming raffinates, are used as feedstock for catalytic isomerization. Modern bifunctional catalysts are used to carry out the process. In order to prevent coke deposition, the catalytic isomerization process is carried out in a hydrogen-containing gas environment under pressure [1][2][3].
Computer information technologies are widely used in the modeling of production processes and today it is difficult to imagine research activities without the use of modern means of constructing and using models. Moreover, special attention is currently being paid to the development of a detailed kinetic model of complex industrial processes [4][5][6][7]. Based on these models, it is possible to significantly increase the efficiency of the process while utilizing minimal energy and material resources [8,9]. The main advantage of such methods is the absence of additional costs for the construction of pilot plants. The developed model should describe the process within a wide range of operating conditions, have high flexibility and allow for changes and calculations of the composition of the reaction mixture, etc.
The reactor unit of the catalytic isomerization unit of the pentane-hexane fraction is the object of research in this work. The production process was carried out in a continuous way. The reactor unit consisted of three consecutive reactors. The raw material was a light gasoline fraction after hydrotreating. The catalyst of the process was a bifunctional catalyst of the SI-2 brand. The total mass of the catalyst in the reactor unit was 27 tons (9 tons in each reactor). The process was carried out at a constant pressure of 3.2 MPa. For calculations, data for three days of operation of an industrial installation were taken. Based on chromatographic analyses and mass flow rates for each day, the component compositions of the reaction mixture were formed both at the entrance to the reactor unit and at the exit from the reactor unit.
In [10] we already reviewed this process, but in order to simplify the model, the scheme of chemical transformations, in particular hydrocracking reactions, have been revised. Thus, the most probable hydrocracking reactions, based on the carbonium-ion mechanism, were formed. The new transformation scheme implies the recalculation of the kinetic parameters of the reaction, which form a mathematical model of the process.
In [10], a genetic algorithm was used to calculate kinetic parameters, and the calculation time of kinetic parameters was about 10 h. Therefore, in order to develop a mathematical model of a new transformation scheme, the question of choosing a more effective optimization method arises.
In the complex problems of chemical kinetics, kinetic constants of reactions cannot be found analytically; they can only be found using numerical methods (see, for example, [11]). In this case, the quality criterion (objective function) is set in the form of a certain algorithm, which includes the stage of numerical modeling, which, in turn, makes it computationally intensive. Furthermore, in the inverse problems of chemical kinetics, the objective function can be substantially multiextremal, i.e., having many local extremes along the global one. Gradient-based optimization methods will not provide a good result for such problems. On the one hand, such methods converge only to a local solution. On the other hand, in the considered problem, the gradient of the objective function is not known analytically and its numerical estimate will be a time-consuming operation.
Computational algorithms for solving such multiextremal problems (global optimization methods) differ vastly from local optimization techniques since they require the study of the entire search domain (see, for example, [12,13]). These methods can be divided into two classes: metaheuristic and deterministic. Metaheuristic algorithms are usually based on simulations of processes occurring in the wild. Examples of metaheuristic algorithms are simulated annealing and evolutionary computing, etc. (see, for example, [14,15]). These algorithms are included in many software libraries for solving global optimization problems (for example, MATLAB Global Optimization Toolbox, Scipy.Optimize, etc.). Due to the operation speed and low requirements for computing resources (in particular, memory), metaheuristic algorithms have become widespread for solving practical problems. However, the solution to the problem found by such an algorithm is, in general, local, and can be located far from the global optimum [16].
It is also possible to construct deterministic methods of global search, which are different from brute force algorithms and metaheuristic algorithms. Such methods are associated with the presence and consideration of a priori assumptions about the problem properties. One of the natural assumptions about the problem is that changing the values of an objective function is limited by changes of its argument. In this case, the function satisfies the Lipschitz condition, and the problem itself is called the Lipschitz global optimization problem. The central direction of the development of modern methods of Lipschitz optimization is the development of parallel algorithms. It is assumed that the values of the objective function are determined as a result of solving some other task and require a costly computational experiment. On many benchmarks, Lipschitz global optimization methods outperform metaheuristic algorithms in terms of the number of correctly solved problems [17,18].
This paper reflects the results of applying parallel methods of Lipschitz optimization developed in [19][20][21] to the solution of inverse problems of chemical kinetics. The main part of the paper is structured as follows.
Section 2 provides a description of the mathematical model of the chemical reaction under study. Section 3 discusses the standard formulation of the Lipschitz global optimization problem and the algorithm for solving it. Section 4 outlines the issues related to the technical aspects of the implementation of the parallel algorithm. The physico-chemical interpretation of the results of numerical calculations using the developed parallel algorithm is presented in Section 5.
The mathematical model of the chemical process is a non-linear system of differential equations [10,22]: with initial values at τ = 0, x i (0) = x 0 i , where ν ij are stoichiometric coefficients of stages of the chemical reaction; J is the number of stages (J = 48), x i is the molar flow rate of the i-th component of reaction, kmol/h; I is the number of components (I = 17); w j is the rate of the j-th stage of reaction, kmol/(h·kg cat.), E j is the activation energy of the reaction stage, J/mol; R is the universal gas constant (R = 8.31 J/(mol·K), T is the temperature, K; k j are pre-exponential factors, kmol/(h·kg cat.); τ is conditioned contact time, (kg cat.); F is the total molar flow rate, kmol/h. The mathematical model involves parameters such as the raw material flow rate F in the reactor and temperature T, but these two parameters also depend on τ; therefore it is necessary to write equations on them because the technological process is non-isothermal: with initial condition: at τ = 0, T(0) = T 0 , where T is the temperature, K, ∆H i f ,T is the enthalpy of the i-th component formation at temperature T, J/mol, C T p i is the specific heat capacity of the i-th component at temperature T, (J/mol·K), F is the mole discharge of the flow, kmol/h. Table 1. Chemical transformations of the catalytic isomerization of the pentane-hexane fraction.

Reaction Stages
Reaction Stages  Thus, when solving the inverse kinetic problem, the direct problem is repeatedly solved, which includes the numerical integration of Equations (1), (4) and (5). After finding the numerical values of the molar flow rate x i , temperature T and total molar flow rate F, these values are compared with the corresponding industrial data obtained during the three days of operation of the catalytic isomerization unit [10], which is a minimization task: where I is the number of observed parameters (I = 19, of which 17 parameters are x i , one parameter is T and one parameter is F), y exp i are the industrial data, y calc i are the numerical values of concentrations of reaction substances obtained by solving a direct kinetic problem.
As a result of solving the inverse problem, the kinetic parameters k 0 j and E j included in the Arrhenius Equation (2) will be obtained. Since the number of reaction stages is 48, the kinetic model will contain 96 kinetic parameters (activation energy E j and pre-exponential multiplier k 0 j for each stage); however, the authors decided to fix some of the parameters as already known for chemical reasons, so the number of optimized parameters was reduced to 24.

Statement of the Global Optimization Problem
As stated above, from a mathematical point of view, the inverse problem of chemical kinetics can be considered a Lipschitz global optimization problem. This fact has has the following justification. We solve the ODE system (1)-(5) to compute the value of the objective function (6). The right-hand side of the ODE system is continuous functions with bounded derivatives. Thus, the system solution will also be continuous and bounded at the bounded period of time. Therefore, the objective function (6) will satisfy the Lipschitz condition, but the Lipschitz constant will be a priori unknown.
In general, the problem of the specified class can be formulated as follows: where a, b ∈ R N are boundaries of the search domain, and the objective function φ(y) satisfies the Lipschitz condition, The criterion φ(y) is assumed to be multiextremal and is given in the form of some computational algorithm. This assumption is typical for many applied optimization problems (see, e.g., [23,24]). The input of this algorithm is supplied with a parameter vector, and the output is the corresponding function value. Such functions are often referred to as "black-box" functions. It is also assumed that each calculation of the function value at the search domain point (hereinafter also referred to as trial) is a laborious operation and requires significant computational resources. As noted in the introduction, this statement of the problem fully corresponds to the inverse problem of chemical kinetics.
The Lipschitz condition (9) can be used to construct estimates of the function global minimum, which can serve as a foundation for designing global optimization algorithms and proving their convergence conditions (see, for example, [25]).
One of the main difficulties in solving Lipschitz global optimization problems is the socalled "curse of dimensionality"-the exponential increase in the number of trials required to solve a problem with a given accuracy, accompanied by the increasing dimensionality of the problem. It is possible to smooth this effect while maintaining the accuracy of the solution by constructing significantly uneven coverages of the search domain with trial points.
Known examples of such methods are the DIvide RECTangles (DIRECT) algorithm [26], the simplicial partitions algorithm [12], the diagonal partitions algorithm [13]. These techniques are based on adaptive partitioning of the search domain into a finite number of subdomains at each iteration. Then, the need for further subdivision for a more detailed study of these subdomains is estimated (using the accumulated information about the objective function). A fundamentally different approach to solving a multidimensional problem (7) is its reduction to one or more one-dimensional problems followed by the use of efficient methods of one-dimensional optimization. Such a reduction may be accomplished, for example, by a recursive optimization scheme [27] or by Peano curves [28]. The latter approach is used in this work.
Using a continuous single-valued correspondence (Peano curve) y(x) of the real axis segment [0, 1] onto hyperinterval D from (8), a multidimensional problem (7) can be reduced to a one-dimensional one: It is known [25,28] that the reduced one-dimensional function φ(y(x)) will no longer be Lipschitzian, but will satisfy the Hölder condition: where the Hölder constant H is associated with the Lipschitz constant L by the formula H = 2L √ N + 3. Specific methods for the numerical construction of this kind of mapping (called evolvents) are discussed in [25,28]. Here, we note that evolvent will approximate the theoretical Peano-Hilbert curve with an accuracy of 2 −m , where m is the parameter that sets the approximation accuracy. Examples of evolvent for different dimensions are shown in Figure 1. Thus, under this approach, the search trial at some point x k ∈ [0, 1] will consist of constructing the image y k = y(x k ) first, and only then of calculating the function value z k = φ(y k ). We will call the three values, {x k , y k , z k }, the trial result.

Asynchronous Parallel Algorithm Used to Determine the Global Minimum of a Function
The parallel algorithm used in this study is based on the information-statistical approach to the development of global optimization methods, the theoretical foundations of which are set out in [25].
We have implemented an asynchronous paralleling scheme of the "master/worker" type. The master process performs a global search algorithm, in which the search information is accumulated. Then the Lipschitz constants for the objective function are estimated on this basis, and the points of new trials are calculated and distributed over the work processes. Worker-processes receive points from the master, carry out new trials in them, and send the results of the trials to the master.
Let us assume that at each iteration, the master calculates one new trial point and transfers it for processing to the worker. At the same time, a trial performed by a worker is a much more labor-intensive operation than the selection of a new point by a master, which eliminates the worker downtime. In this case, the total number of trials performed by each worker will depend on the labor intensity of the specific trial and cannot be estimated in advance.
When describing a parallel algorithm, let us assume that there is one process master and p workers. The superscript will indicate the serial number of the trial performed, and the subscript will be used to number the trials in ascending order of the coordinate.
The initial iteration is carried out according to a special rule. The master process (let us assume that this is the process No 0) initiates p parallel trials at p different points of the search region, two of which are boundary points, and the rest are arbitrary internal points (for example, uniformly distributed). Thus, the initial trials are carried out at the points {y(x 1 ), y(x 2 ), . . . , y(x p )}, where x 1 = 0, x p = 1, x i ∈ (0, 1), i = 2, . . . , p − 1.
Let us suppose that the master process has obtained the results of k trials, and at the current time the worker processes are conducting trials at the points {y(x k+1 ), y(x k+2 ), . . . , y(x k+p )}.
Let one of the processes (without limiting the commonality, we can assume that this is the process No 1), at some point in time, complete the trial at the point y(x k+1 ). The remaining processes are at the stage of conducting their trials, i.e., at these points multiple trials have already been initiated, but have not yet been finalized.
Then the No 1 process forwards to the master process the trial results, i.e., the three values {x k+1 , y k+1 , z k+1 }. The master saves the results in its information base and selects a new trial point x k+p+1 for the process No 1 in accordance with the rules described below.
Step 1. Renumber the points of the set which contains all the points at which the trials have either been carried out or are being carried out by the lower index in ascending order, i.e., 0 = x 1 < x 2 < · · · < x x+p = 1.
Step 2. Calculate values If the value M is 0, assign M = 1. The values z i = φ(y(x i )) at x i ∈ I k are undefined because the trials at x i ∈ I k are not finished yet.
Step 3. For each interval ( called the characteristic of this interval. Here, ∆ i = (x i − x i−1 ) 1/N , and r > 1 is the reliability parameter of the method.
Step 4. Select the interval [x t−1 , x t ] with the highest characteristic value, i.e., Step 5. Calculate the point x k+p+1 ∈ (x t−1 , x t ) according to the formula: Immediately after calculating the point x k+p+1 of the next trial, the master appends it to the set I k and forwards it to the worker, which starts the trial at this point.
The master stops the algorithm if either the condition ∆ t < ε (where t is the number of the interval with the maximum characteristic from Step 4) or the condition k > K max (where k is the number of trials performed) is met. The real number ε > 0 and the integer K max ≥ 1 are the algorithm parameters which correspond to the accuracy of the solution search and the maximum number of trials.
The sequential global search algorithm, on the basis of which the proposed asynchronous algorithm was implemented, allows the following interpretation [29]. Using the points of previously performed trials, it is possible to construct a minorant of the objective function, for which the value −R(i), where R(i) is the value of the characteristics of the interval (x i−1 , x i ), will be the value of the minorant at the point of its minimum at the interval (x i−1 , x i ), 1 ≤ i ≤ k + p. In accordance with Step 4 of the algorithm, a new trial is carried out in the interval with the lowest value of the minorant. In accordance with Step 5, the point of the new trial coincides with the minorant minimum point.
The theory of convergence and various modifications of serial and parallel algorithms are detailed in the monographs [13,25]. It should be noted that the asynchronous parallelization scheme described here, in contrast to the synchronous schemes used earlier in solving a number of applied problems (see, e.g., [24]), provides a full load of all the processes involved in solving problems with different labor intensities of trials at different points in the search domain.

Features of the Software Implementation of the Parallel Algorithm
In our implementation of the global search algorithm, MPI is used to organize parallel calculations. A process with the rank 0 acts as a master, processes with a rank other than 0 act as calculators (workers) (see the description of the algorithm in the previous section).
The principle of process interaction (Figure 2, link 1) is as follows. The first step in the master process is to select p points for trials. The selected trial points are sent to the calculators and the master enters the stand-by state for receiving messages. Messages from the calculators contain the trial result. The obtained information is used to select the point of the next trial. The new point is sent as a message to the calculator. If the stopping condition is met in the algorithm, instead of the trial point, the master sends a shutdown message to the calculator. Each process that performs the role of a calculator creates an instance of the task when it starts, and then goes into a stand-by mode for messages from the master process. When a message containing a trial point is received, a computational experiment is performed. The results of the experiment are sent to the master. After the experiment, the process itself continues to wait for new messages. When a shutdown message is received, the resources used by the task are released and the main process is terminated.
When implementing the asynchronous scheme, we assumed that the start time for calculating the objective function values is different for each particular process. Our implementation takes into account the fact that the trial point as well as the calculation results represent a small amount of data that can be sent relatively quickly over the network. As a result, basic point-to-point MPI message forwarding operations were used to interact between the master and the calculators (Figure 2, link 1).
When conducting computational experiments (see next section), the solution of the direct problem of chemical kinetics was performed using MATLAB. In the initial version of the problem, each set of parameters required starting the MATLAB process, transferring the trial point, as well as obtaining the results of computational experiments. A number of additional settings have been made to reduce computing overhead. First, the source code for the computational experiment was compiled into an executable application. Second, the approach in which MATLAB received a trial point through input stream and returned the results of the computational experiment through output stream was implemented. This approach allowed each calculator to run a process with the compiled MATLAB code once, intercept input and output streams, and then perform calculations without waiting for new processes to start.

Numerical Experiments
The asynchronous parallel global search algorithm was implemented on C++; the software was built on Intel MPI and Intel C++ compiler, version 2017.0.1 (Intel Corporation, academic licence, https://www.intel.com, accessed on 1 September 2022). The objective function of the problem to be solved was implemented in MATLAB 2019b (The Math-Works, Inc., campus-wide academic license, https://matlab.mathworks.com, accessed on 1 September 2022). In order to check the stability of the developed parallel program, it was launched on three different supercomputers: Lomonosov-2 (Moscow State University), Lobachevsky (University of Nizhni Novgorod), and MVS-10P (Joint Supercomputer Center of the RAS). The main part of the experiments was carried out on the Lomonosov-2 supercomputer with the following computer nodes characteristics: Intel Haswell-EP E5-2697v3 CPU (processor frequency 2.6 GHz, 14 cores), 64 GB of RAM, CentOS 7 operating system.
To solve the optimization problem, a combination of the parallel global search algorithm and a local optimization method was used. In the first stage of the search, the parallel asynchronous algorithm described in Section 3 was used. The core parallel global search algorithm is available on GitHub, https://github.com/UNN-ITMM-Software/gsa_nlp_ solver (accessed on 1 September 2022). One hundred and one processes were involved in the launch, of which one process was a master and the remaining 100 were workers (calculators). The values of the method parameters were set as follows: the maximum number of trials K max = 3000, the search accuracy ε = 10 −3 , the reliability parameter r = 3, the evolvent accuracy parameter m = 12. At the second stage, the solution was refined using the Hooke-Jeeves pattern search method (see, e.g., [30]) with an accuracy of ε = 10 −4 and a limit on the number of trials K max = 500. The total time for solving the problem was 8109 s and an optimum point was found with the objective function value φ * = 0.002115. Thus, using the developed algorithm, the inverse problem of chemical kinetics was solved and the kinetic parameters of the reaction were found, which are given in Table A1.

Discussion
After the development of the kinetic reaction model, it is necessary to check the adequacy of the model; this procedure can be carried out using a direct kinetic problem. Substituting the found kinetic values k 0 j and E j into the system (1)-(5), the direct problem of chemical kinetics was solved and the concentrations of reaction substances x i and temperature T were found (Figures 3 and 4). Figures 3 and 4 show graphs of changes in the concentrations of individual components of the reaction and temperature changes in the reactor unit. It should be noted that in the figures the vertical lines separating the graphs characterize three consecutive reactors (I, II, III). The initial data on the concentrations of the components were known only at the entrance to reactor I and at the exit from reactor III. The concentrations of components at the outlet of reactor III are marked with a marker in the figures.    Figure 3a shows that the concentrations of n-pentane and n-hexane are continuously decreasing. This is mainly due to the transformation of the alkanes with a normal (linear) structure into the corresponding isomers, which is reflected in Figure 3b. With the transformation of n-pentane, everything is natural, since only one isomer is known. In the case of n-hexane, everything is somewhat different: first, the conversion of n-hexane into isomers 2-methylpentane and 3-methylpentane is carried out and the latter are further converted into 2,2-dimethylbutane and 2,3-dimethylbutane.
In Figure 4a, it can be seen that the concentration of cyclopentane in reactor I gradually decreases to zero. The decrease in cyclopentane concentration can be explained by the hydrogenation reaction to n-pentane. According to Figure 3a, this is probably the reason for the increase in the concentration of n-pentane at the beginning of reactor I. Additionally, this figure shows a sharp decrease in the concentration of benzene, presumably due to hydrogenation reactions. Figure 4b shows a graph of temperature changes in reactors I-III. According to the technological scheme, refrigeration equipment for cooling the reaction mixture is installed after reactor II; therefore, a sharp decrease in temperature is observed when the catalyst is 18 tones. A gradual increase in temperature in reactors III here indicates the presence of isomerization reactions that occur with a slight exothermic effect.
It is worth noting that the composition of the products at the outlet of the last reactor and the temperature differences in all reactors are close to the industrial data. Thus, these facts may indicate the adequacy of the model we have developed.

Conclusions
To summarize, the problem of developing a kinetic model for the process of catalytic isomerization of the pentane-hexane fraction using a parallel optimization method was considered. Due to the switch of the oil refining industry to the production of gasoline with the Euro-5 standard, this process is becoming important. On the basis of known industrial data on the concentrations of reaction components and the temperature at the outlet of the third reactor, the inverse problem of chemical kinetics was solved, and the activation energies and pre-exponential factors of each reaction stage were calculated. To solve the optimization problem, the authors have developed a parallel global search algorithm and a program based on Lipschitz optimization. The kinetic parameters found made it possible to develop a mathematical model of the process, which is in good agreement with industrial data.
The future plan is to study the dynamics of the gas-liquid flow in the reactor unit for the process of catalytic isomerization of the pentane-hexane fraction, taking into account diffusion and heat exchange processes through the catalyst layer.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
The Table A1 is a numerically calculated kinetic parameters of a chemical reaction. Table A1. Kinetic parameters of the catalytic isomerisation reaction of the pentane-hexane fraction, according to the scheme of chemical transformations (Table 1).