Studying the Efficiency of Parallelization in Optimal Control of Multistage Chemical Reactions

: In this paper, we investigate the problem of optimal control of complex multistage chemical reactions, which is considered a nonlinear global constrained optimization problem. This class of problems is computationally expensive due to the inclusion of multiple parameters and requires parallel computing systems and algorithms to obtain a solution within a reasonable time. However, the efficiency of parallel algorithms can differ depending on the architecture of the computing system. One available approach to deal with this is the development of specialized optimization algorithms that consider not only problem-specific features but also peculiarities of a computing system in which the algorithms are launched. In this work, we developed a novel parallel population algorithm based on the mind evolutionary computation method. This algorithm is designed for desktop girds and works in synchronous and asynchronous modes. The algorithm and its software implementation were used to solve the problem of the catalytic reforming of gasoline and to study the parallelization efficiency. Results of the numerical experiments are presented in this paper.


Introduction
Many practical optimal control problems obtaining the optimal solution can be insufficient because they cannot be applied owing to various physical restrictions [1,2]. Additionally, modern systems that require any type of control are complex and usually described by high-dimensional systems of ordinary differential equations (ODEs) that have only numeric solutions [3,4]. As a result, achieving optimal control is impossible without robust numerical solvers [5].
In this paper, investigate an optimal control problem as a global constrained optimization task. A similar approach was used in [6,7] to control a wheeled robot and in [8] for gas allocation control, as well as for some other applications [9][10][11]. This type of problem belongs to a class of numeric methods that can obtain an approximate solution to the optimal control problem [12,13].
In this paper, we consider a complex chemical reaction involving the catalytic reforming of gasoline [14]. The specified process is of significant practical importance, as it produces commercial gasoline for daily use; therefore the reaction must be controlled to increase output of the target product and to reduce the output of undesired components. However, this process, like other complex multistage chemical reactions [15][16][17], involves many internal stages and multiple intermediate complexes, which makes the optimization task computationally expensive. Thus, in order to obtain a solution within a reasonable time, parallel computing systems are required [18]. However, the design of such a parallel optimization algorithm is a multifaceted task [19]. The algorithm should take into account not only problem-specific features but also peculiarities of the computing system in which the algorithms are launched. In other words, an algorithm must be designed that is capable of both localizing suboptimal solutions within a reasonable number of iterations [20] and efficiently utilizing all available computing resources [18].
In this study, we consider the emerging class of loosely coupled computing systems, in particular, grid systems comprising heterogeneous personal computers (desktop grids), which are widely used for scientific computations [21,22]. When developing an optimization algorithm for this class of systems, communication expenses between computing nodes must be minimized. It is possible to achieve this task either with a peculiar optimization algorithm [23,24] or with a specific parallelization technique [22].
We propose a new parallel algorithm based on the mind evolutionary computation (MEC) algorithm [25] to solve an optimal control problem. The classical MEC algorithm appeared to be successful in solving real-world global optimization problems and to be suitable for parallelization [1,4,25], like similar population-based algorithms [24]. The proposed method takes the architecture of the desktop grid into account by minimizing the number of information exchanges between computing nodes and can work both in synchronous and asynchronous modes. In order to ensure the feasibility of solutions, the algorithm includes several strategies that consider various constraints imposed on the control parameters. For example, it helps to control the speed of change in control temperature or the change in component concentration, as well as the smoothness of the control function. These constraints are important for practical implementation of the obtained control strategies [26,27].
The remainder of this paper is organized as follows. Section 2 is devoted to problem formulation. In Section 3, the model of the gasoline catalytic reforming is described, and the problem of optimal control is formulated and transformed into a nonlinear global optimization problem. In Section 4, we propose synchronous and asynchronous parallel mind evolutionary computation algorithms. In Section 5, the results of numerical experiments are presented and analyzed both from the parallelization and chemical perspectives. In Section 6, we present our conclusions, summarize the study, and suggest directions for further work.

Problem Formulation
In this paper, we consider a deterministic global constrained nonlinear minimization problem: Here, Φ( ) is the scalar objective function, Φ( * ) = Φ * is the required minimal value, = ( 1 , 2 , … , ) is the -dimensional vector of variables, is the -dimensional arithmetical space, and is the constrained search domain. Feasible domain is determined with inequality constraints Most chemical reactions within a class of problems under consideration are described by systems of ODEs [3,4]. However, it is difficult to find such a control function-( ), where is the reaction time-that can provide the required output characteristics for a reaction, for instance, the maximum of the target product and/or the minimum of undesired substances.
In this work, the optimal control problem was transformed into a global optimization problem in the following manner. The integration interval [ ; ] is discretized so that the length of one section [ ; +1 ] meets the restrictions imposed by the experimental unit. The values of ( ), are the components of vector = ( 1 … ). A piecewise linear function is used for approximation of the function ( ). The objective function is transformed into a function that minimizes the difference between current and required behavior:

Optimal Control of the Catalytic Reformation of Gasoline
In this paper, we study a non-isothermal industrial reaction of the catalytic reformation of gasoline over a bimetallic catalyst ( − /Al 2 O 3 ). The output of such a reaction is commercial gasoline for daily use.
During the catalytic reformation of gasoline, aromatic hydrocarbons are formed, which increase the octane number of the gasoline, which is the main purpose of the process. In addition, reformation is also used for the separate production of aromatic hydrocarbons, which are used in petrochemical processes. The octane number of gasoline determines its class and therefore its price.
Catalytic reforming is one of the main sources of aromatics in gasoline, along with catalytic cracking of vacuum gas oil and low-temperature catalytic isomerization of the pentane-hexane fraction. According to current standards, the main requirement is to reduce the proportion of benzol and aromatics in the gasoline composition. Therefore, one of the crucial tasks associated with the catalytic reforming of gasoline is to reduce the amount of aromatic hydrocarbons and benzol with minimal changes in the octane number [14].
The problem of aromatic hydrocarbon and benzol content, is solved as follows: prefractionation of raw materials (i.e., benzol-forming components are removed) or removal of excess aromatic hydrocarbons from commercial gasoline. In the first case, raw materials deteriorate, and productivity decreases [28]. In the latter case, the cost of the product increases. A possible solution to the above problem is the optimization of the reactor unit itself, which requires a detailed kinetic model of the process based on the fundamental laws of chemical transformations.
The target reactions of catalytic reforming are the reactions of formation of high-octane components: dehydrocyclization, dehydrogenation of naphthenes, isomerization of naphthenic, and dehydrogenation of naphthenic hydrocarbons [29][30][31]. Reactions that result in the splitting of a molecule into several smaller molecules are undesirable, as they form gases, which reduce the yield of the target reformate product. Target reactions have a total endothermic effect of about 200 kJ, and side reactions are exothermic.
The use of adiabatic reactors in the process affects the non-thermal nature of catalytic reforming. Therefore, in the kinetic model, it is necessary to take into account the change in temperature during the reaction in each adiabatic reactor.
In the kinetic model, individual hydrocarbons are represented as 37 groups, including normal paraffins ( ), isoparaffins ( ), five-membered naphthenes ( ), six-membered naphthenes ( ), and aromatic hydrocarbons ( ), where is the number of carbon atoms in the molecular structure, and hydrogen. When developing a model for the catalytic reforming of gasoline, it is necessary to take into account the change in the number of molecules due to chemical transformations based on the scheme of transformations and grouping of individual components [16]. Because adiabatic reactors are used for reactions with heat absorption, the temperature of the mixture decreases along the catalyst bed (up to 80 °C), leading to a decrease in reaction rates. In the reforming model, it is necessary to describe the change in temperature over the catalyst bed.
Thus, the mathematical description of the non-isothermal reaction of the catalytic reforming of gasoline is expressed as (4)- (7). The mathematical model consists of 40 differential equations (group component equations, temperature change and moles) with initial data, with the following initial conditions: (0) = 0 . = ∑ =1 , = 1, . . , ; (4) where is the concentration of reaction reagents (mol/L); is the conditional contact time (kg·min/mol); is the number of stages; is the number of substances; is the coefficients of the stoichiometric matrix; is the rate of the -th stage (1/min); and − are the rate constants of direct and inverse reactions, respectively (1/min); is the negative elements of the matrix ( ); is the positive elements of the matrix ( ); 0 and − 0 are the pre-exponential factors (1/min); + and − are the activation energies of the direct and inverse reactions, respectively (kcal/mol); is the gas contact (2 cal/(mol·K)); is the temperature (K); * is the duration of the reaction (min); ∆ ( ) is the formation enthalpy of the -th component at temperature (J/mol); ( ) is the specific thermal capacity of the -th component at temperature (J/(mol·K)); , , , , and are the coefficients of thermal capacity's temperature dependance of the -th component; and is the mole discharge of the flow (mol/min). The main challenges associated with the catalytic reforming of gasoline are restrictions on the content of benzol and the amount of aromatic hydrocarbons in the final reformate. As a result, a restriction is imposed on the introduction of the product of the process into commercial gasoline in terms of the content of these components. However, the main purpose of the catalytic reforming process is to increase the octane number (ON) of the gasoline. Therefore, along with environmental criteria, the necessary criterion of the optimal control task is to maintain the high value of the octane number of the mixture, which depends on the composition of the product.
The reactor block of the catalytic reforming process consists of three adiabatic reactors, each of which receives a mixture heated to the required temperature. The reactor block is represented by a cascade of successive adiabatic displacement reactors, 1, 2, and 3, filled with a catalyst. Before entering each of the reactors, raw materials are heated in ovens. Due to the strong endothermic effect of the reactions, as the reaction mixture passes through the catalyst layer, the temperature of the reaction mixture decreases, which negatively affects the rates of chemical reactions. In industry, the temperature at the inlet to each of the three reactors is kept within the range of 480-500 °C [32]. In this work, the following initial temperatures were used: The optimality criterion for the catalytic reforming of gasoline is based on the octane number of the reformate-the higher, the better.
where is the octane number of the -th component. An increase in the octane number of gasoline during reformation is achieved due by the complete conversion of naphthenic (cyclic) hydrocarbons into arenes (aromatic hydrocarbons) with a high octane number, as well as by the partial conversion of paraffins into naphthenic hydrocarbons, which, in turn, are converted into arenes. However, according to environmental requirements, the contents of aromatic hydrocarbons ( ) and benzol ( ) are limited to 40% and 2%, respectively. We treat these requirements as constraints in our optimization problem.
Another criterion with respect to the optimality of the catalytic reforming of gasoline is the yield of the target product, i.e., the reformate ( ( )), which is calculated as the product of the process minus cracking gases. We use this criterion for reference only.
The resulting objective function and the constraints are presented below: In this reaction, the temperature cannot be used as a control parameter, as it changes in each reactor. However, extra heating or cooling can be added to each reactor. In the model, this can be achieved by adding an extra summand to the equation that describes the change in temperature. In other words, we can control the first derivative ( ′ ( )) or the speed of extra heating or cooling of each reactor. The working temperature range for the catalytic reforming of gasoline is approximately ( ) ∈ [400 ℃; 500 ℃]; considering the duration of this reaction and the initial temperatures for every reactor, the following constraints were determined: ′ ( ) ∈ [−0.002; 0.002]. The piecewise linear function was selected for approximation of the function ′ ( ) on the interval ∈ [0; 89321] seconds. The reasonable time step differs slightly for each reactor and equals 240 seconds on average, thus the dimension of the optimization problem is = 378.

Parallel Mind Evolutionary Computation Algorithms
In this work, we propose a novel parallel optimization algorithm based on the classical mind evolutionary computation algorithm [25]. The original MEC algorithm (simple MEC, SMEC) simulates some aspects of human behavior. An individual ( ) is considered an intelligent agent that operates in a group ( ) composed of analogous individuals. During the evolution process, every individual is affected by other individuals within the group. In order to achieve a high position within the group, an individual has to learn from the most successful individuals in the group, whereas groups should follow the same logic in intergroup competition.
The algorithm is composed of three main stages: initialization of groups, similar taxis, and dissimilation. Operations of similar taxis and dissimilation are repeated iteratively, while the best obtained value of an objective function (Φ( )) changes. When the best obtained value stops changing, the winner of the best group among a set of leading groups is selected as a solution to the optimization problem [33].
The SMEC algorithm can be interpreted as a multipopulation problem. A multipopulation consists of independent subpopulations with different instances of the SMEC algorithm. Each subpopulation is composed of leading groups ( = ( 1 , 2 , … , | | )) and lagging groups ( = ( 1 , 2 , … , | | )). The number of individuals within each group is set to be the same and equals | |.
Every subpopulation has its own communication environment called a local blackboard, denoted as , ∈ [1: ], where is the number of subpopulations. In addition, the whole multipopulation has a general global blackboard, i.e., . A multipopulation version of the SMEC algorithm can be used to decompose a problem and map it onto computing nodes of the loosely coupled system. In such a case, each subpopulation or group of subpopulations evolves independently on separate computational nodes. On the other hand, the absence of communication between nodes reduces the efficiency of optimization. In order to avoid this, different dynamic adaptation strategies should be added to the algorithm [34].
In this work, we incorporated the following changes into the multipopulation SMEC algorithm: a step adaptation strategy [35], projection onto a search domain technique to handle constraints [36], and a smoothing method to producing control strategies that can be put into practice [17]. For the purpose of parallelization, the master-slave paradigm was used [18]. The description of the proposed parallel algorithm is presented below, and a flow chart diagram is displayed in Figure 1.

Initialization of groups within the search domain ( ) and problem decomposition.
This stage is performed on the master CPU. a. Generate a given number ( ) of groups ( , ∈ [1: ]); is the free parameter of the algorithm; b. Generate a random vector ( ,1 ). In this work, an sequence was used [37].
Here, | | is the number of individuals in each group; f.
Calculate the scores of all individuals in every subpopulation and put them on the corresponding local blackboards ( , ∈ [1: ]); g. Create leading and lagging groups based on the obtained scores. The ratio between leading and lagging groups is determined by the free parameter . In this work, the number of lagging and leading groups is the same ( = 0.5), corresponding to a situation in which there is a balance between exploration and exploitation. 2. The modified similar-taxis stage is launched independently on every parallel computing node and works with a subset of / groups determined during the initialization stage. We describe the process for one computing node. a. Take information on the current best individual ( , , ∈ [1: | |]) of the group ( ) from the blackboard ( ); b. The value of parameter used to generate new agents decreases depending on the number of iterations: Here, ̂ is the threshold number of iterations; when ≥̂, the standard deviation ( ) begins to decrease; 0 is the initial value of the standard deviation; is the speed parameter (the recommended value = 0.2); and is the tolerance used to identify the stagnation. c. Check for constraint violations: i. If any component of vector is outside of domain , it is projected back on to the boundary of domain [24]; and ii.
The difference between any two nearby components of vector is modified in order not to overcome the specified limit from physical experiments. This procedure is performed in random order [3]. d. Create new leading groups ( = ( 1 , 2 , … , | | )) and lagging groups ( = ( 1 , 2 , … , | | )) around current best individuals (̃, ) using Formula (10); e. Put information on the new winner on the corresponding local blackboard ( ) and the global blackboard ( ) available to the master. 3. The dissimilation stage is also performed on every parallel computing node.
a. Read the scores of all groups (Φ , Φ , ∈ [1: | |], ∈ [1: | |]) from the global blackboard ( ); b. Compare all the scores. If the score of any leading group ( ) is less than score of any lagging group ( ), than the lagging group becomes a leading groups, and the first group becomes a lagging group. If score of a lagging group ( ) is lower than the scores of all leading groups for consecutive iterations, then it is removed from the population; c. Each removed group is replaced with a new group via an initialization procedure. 4. For this study, the maximum allowed value of the objective function's evaluation was used as the termination criterion [36]. The synchronous approach was used, as the master node waits for the results or termination messages (via time out) from all computing nodes before composing the solution. Without any preliminary problem analysis and/or load-balancing procedure, the synchronous approach can be inefficient; some computational nodes can sit idle, whereas others still operate. As we focused on parallelization for desktop grids, dynamic load balancing would be impractical, as our goal was to minimize communication between the nodes.
Considering these aspects, we propose asynchronous parallel modification of the MEC algorithm. The idea of asynchronous learning was inspired by the [38], in which it was used for deep evolutionary reinforcement learning for large-scale problems. The idea is to update the evolutionary information every time a computing node finishes optimization.
The asynchronous update is performed by the master CPU after every new message received from the nodes and can be described as follows. a. Read the scores of all groups (Φ , Φ , ∈ [1: | |], ∈ [1: | |]) from the global blackboard ( ); b. Compare all the scores. If score of any leading group ( ) is less than the score of any lagging group ( ), then the lagging group becomes a leading group, and the first group becomes a lagging group. If the score of a lagging group ( ) is lower than scores of all leading groups for consecutive iterations, then it is removed from the population; c. Each removed group is replaced with a new group with via the initialization procedure; d. If a lagging group ( ) was deleted while being processed on another computing node, its results are kept in order to form a new group as soon as another group is removed in order to maintain a constant number of groups and individuals and prevent information loss.

Computational Experiments and Analysis
Both synchronous parallel MEC (sPMEC) and asynchronous parallel MEC (aPMEC) algorithms were implemented using a combination of Python and Wolfram Mathematica. In addition, the sequential version of the modified algorithm was implemented for the efficiency study. Finally, we implemented another parallel version of the algorithm in which instead of problem decomposition, parallel computing nodes were utilized only for evaluation of the objective function (every individual is evaluated at the available node), whereas the general logic remained sequential. We refer to this algorithm as oPMEC (objective PMEC). It will be used only at the master node (multicore CPU), as it requires many information exchanges between nodes.
The specified problem of optimal control of the catalytic reforming of gasoline was used to compare the efficiency of the proposed parallelization techniques. For the experiments, a computational limit of 32,000 objective function evaluations was set for all four algorithms. The backward differentiation formula solver in Wolfram Mathematica was utilized to solve systems of ODEs. Computations for sPMEC and aPMEC were performed using a desktop grid composed of sixteen personal computers connected physically via a local network.
The following free-parameter values of the algorithm were utilized: the number of groups ( = 32), the number of individuals in each group (| | = 15), the removing frequency ( = 30 iterations), and the initial value of the standard deviation ( 0 = 0.0001).

Study of the Parallelization Efficiency
A sequential experiment with one node was performed at the master node only, which is a personal computer with four CPU cores. For the specified computational limit, the sequential experiment took 159,217 seconds. In other words, one objective function evaluation takes approximately 5 s.
Parallelization with the oPMEC algorithm was also performed at the master node with four CPU cores ( = 4). In this work, we used the parallelization speedup (S) and the efficiency of parallelization (E) as two criteria for comparison of the algorithms [7]. The speedup is calculated as the ratio of program execution time on one node ( 1 ) to the execution time on N nodes ( ). The efficiency of the parallelization is calculated as the speedup ( ) divided by the number of nodes ( ). Because this type of parallelization does not involve any algorithm modification, its speedup and efficiency are poor ( Figure  2); the exact values are presented in Table 1.  We also analyzed the sPMEC algorithm (Figure 3), which achieved relatively good speedup and efficiency when the number of nodes was low, although it scales up poorly. The parallelization efficiency is only 60% when the number of nodes is N = 16 (Table 2), mainly due to the absence of load balancing as a result of the loosely coupled architecture. As a result, some computing nodes took up to 75% more time to finish computations than others.   Finally, we analyzed the aPMEC algorithm ( Figure 4). This algorithm scales up better than the synchronous algorithm, resulting an efficiency of more than 80% when N = 16 ( Table 3). Asynchronous evolution helped to avoid unnecessarily lengthy computation in the beginning of the process and resulted in high efficiency, as there were few steps performed at the master node.

Analysis from the Chemical Perspective
The introduction of temperature control throughout the process means determining the cooling regime to achieve the optimum according to the given criterion. In Figures 5  and 6, the vertical lines indicate the boundaries of the process by reactors (the reactor block of catalytic reforming consists of three reactors). The red dashed curves represent the scenario without ′ ( ), which is used as a baseline. The cooling mode for the existing technological process shows the need for constant heat removal in the first and second reactors, as well as at the beginning of the third reactor; then, a decrease in cooling occurs, which is associated with the criterion of the optimal control problem to reduce the yield of aromatics. The authors of [28,29] showed that in the first and second reactors, as well as at the beginning of the third reactor, there is a constant accumulation of aromatic hydrocarbons at an elevated temperature; then, the content of aromatics decreases somewhat. Possible means of consumption include alkylation, hydrocyclization, and cracking reactions of alkyl substituents, which predominate in the third reactor. The task of temperature control enables a reduction in the yield of aromatics to 36%, which is more than 20% less than without control ( Table 4). The calculation is conducted as follows: 0.47 − 0.24 × 0.47 = 0.47 − 0.11 = 0.36; the calculation is conducted in a similar manner for all other rows in Table 4. Furthermore, the yield of benzene was reduced from 3 to 2%. Such a reduction in the yield of aromatic hydrocarbons and benzene allows the process product to comply with environmental restrictions. The value of ON decreases slightly, but it is not crucial for the quality of the commercial gasoline, as it also contains products of isomerization of the pentane-hexane fraction, which compensate for the drop. In turn, the aromatic hydrocarbons are generated only at this stage, and their decrease significantly improves the ecological properties of the gasoline. Furthermore, the yield of the reformate, i.e., the reaction product, increases from 86 to 93.  In general, the control of the thermal regime in reactors plays an important role, as the reaction rates change, which can lead to a change in the material balance and properties of the obtained products. Technological examples include when a heat exchanger and a refrigeration equipment are used in catalytic reforming units [39]. In the presented work, the optimal control of the catalytic reforming of gasoline is calculated, which makes it possible to provide specific technological recommendations with respect to operating conditions with heat removal in each reactor of the cascade throughout the entire process.

Conclusions
In this paper, we present a new parallel asynchronous MEC algorithm designed for optimal control of multistage catalytic reactions using desktop grids. Instead of a traditional evolution paradigm, the aPMEC algorithm operates in the asynchronous parallel mode, so learning is not required across the entire population before subsequent steps. Once local optimization is performed for any group, the master process can immediately perform another step of the algorithm. As a result, it rearranges computations to avoid down time in the beginning of the evolution process. The algorithm is also capable of finding feasible solutions by taking practical constraints into account.
The proposed algorithm and its software implementation were used to obtain feasible control for the catalytic reforming of gasoline. The results of the conducted numerical experiments and the obtained control are presented in this paper. The algorithm enabled improvement of the reaction characteristics. The yield of the reformate, i.e., the reaction target product, was increased from 86 to 93, and all environmental requirements were satisfied: the content of aromatic hydrocarbons was maintained under 40%, and that of benzol was maintained under 2%, from initial contents of 47% and 3%, respectively. Additionally, the parallelization efficiency was studied for the aPMEC algorithm and its synchronous variant. The asynchronous approach improved the parallelization efficiency for 16 computing nodes from 60% to 82%.
Future research will be devoted to investigating the performance of the proposed algorithm using hybridization with local search methods and other parallel architectures. Another possible research direction involves the concept of landscape analysis [24] and its importance for parallelization, as the results obtained during the landscape analysis stage can be used not only to tune the optimization algorithm but also for initial load balancing.