Article

An Enhanced Logic-Based Bender’s Decomposition Algorithm with Proximity Principle for Simulator-Based Distillation Process Optimization

1 School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
2 College of Smart Energy, Shanghai Jiao Tong University, Shanghai 200240, China
* Authors to whom correspondence should be addressed.
Processes 2025, 13(4), 977; https://doi.org/10.3390/pr13040977
Submission received: 22 February 2025 / Revised: 16 March 2025 / Accepted: 19 March 2025 / Published: 25 March 2025
(This article belongs to the Section Separation Processes)

Abstract

The optimization of distillation processes is particularly challenging due to the presence of nonlinear equations and integer variables, resulting in complex mixed-integer nonlinear programming (MINLP) problems. This work introduces an enhanced optimization algorithm, the logic-based proximity principle Bender’s decomposition (LB-PBD), to address non-convex MINLP problems in simulator-based distillation optimization. The key innovation, the proximity principle, improves lower bound predictions by prioritizing information from the closest known integer solutions. Additionally, the integration of a multi-start points strategy and a delayed convergence strategy ensures the algorithm achieves global optimality while avoiding premature convergence. The effectiveness of the proposed LB-PBD is validated through three case studies. Numerical experiments demonstrate that the proximity principle gives the original algorithm a superior ability to navigate local optima, making LB-PBD more versatile than a traditional deterministic algorithm (the logic-based outer approximation algorithm) and a stochastic algorithm (the adaptive superstructure-differential evolution algorithm). In a single-column distillation case, LB-PBD achieves high accuracy. In an extractive distillation case, the algorithm successfully optimizes the separation of a near-azeotropic mixture, reducing energy consumption and improving product recovery compared to previous solutions. These results highlight LB-PBD as a robust and effective tool for solving non-convex MINLP problems, particularly in simulator-based distillation process optimization.


1. Introduction

Distillation is one of the most widely used industrial separation methods, often preferred by engineers due to its established process and simplicity. It is estimated that distillation accounts for 90–95% [1] of liquid separations in industry. However, the high energy consumption associated with distillation has been a long-standing criticism. It is estimated that distillation operations contribute to 10–15% [2] of global energy consumption. To mitigate this, engineers strive to minimize energy consumption during the early stages of design. However, typical distillation processes involve numerous integer variables and nonlinear equations, which make optimization a challenging mixed-integer nonlinear programming (MINLP) problem.
Most distillation designs rely on process simulators, such as Aspen Plus and PRO/II. These simulators commonly employ the sequential modular (SM) method as their default solving algorithm due to its stability. However, SM algorithms cannot provide direct derivative information for continuous variables, leading to the treatment of the problem as a derivative-free optimization (DFO) problem [3]. DFO methods iteratively generate new prediction points based on known samples of independent and dependent variables. Classic DFO approaches include differential evolution [4,5], particle swarm optimization [6,7], and genetic algorithms (GA) [8,9], along with various improved versions [10]. DFO is particularly powerful in multi-objective optimization, as it can identify Pareto frontiers [11].
However, DFO methods face challenges, such as the lack of a stationary-point criterion like the Karush–Kuhn–Tucker (KKT) conditions. Additionally, the setting of termination conditions and hyperparameters is highly problem-dependent, making it difficult to guarantee both the accuracy and the speed of solving an unknown problem with a fixed set of parameters.
An alternative approach involves using equation-oriented (EO) [12,13] or perturbation methods [14] to derive continuous variable derivatives and mathematical programming for integer variables, constituting deterministic algorithms. Famous deterministic algorithms include branch and bound [15], generalized Bender’s decomposition (GBD) [16], outer approximation (OA) [17], and extended cutting plane algorithms (ECP) [18]. Nevertheless, many deterministic algorithms were initially designed for convex programming, which poses challenges in avoiding local optima.
Many solutions have been proposed by researchers to prevent the algorithm’s solution from being limited to local optima. In nonlinear programming (NLP), the multi-start strategy [19] is highly effective: increasing the number of starting points enhances the likelihood of finding global optima. Similarly, a hot-start method [20] was proposed that solves multiple NLP subproblems before the iterations begin, and Su et al. [21] extended this by incorporating hot starts into each iteration, allowing parallel computation without significantly increasing optimization time. The n-step restart strategy, or segmented strategy, frequently used in NLP, can also improve deterministic algorithms in MINLP. Franke [22] modified the GBD strategy by discarding Lagrangian functions after convergence and re-establishing the MILP from the optimal integer solution. Similarly, Javaloyes-Antón et al. [23] improved ECP by intermittently relaxing and modifying constraints. Bergamini et al. [24] employed the McCormick envelope for piecewise linearization, enhancing the probability of finding global optima, an approach also adopted by solvers such as Gurobi [25]. These techniques can be extended to other deterministic algorithms.
In order to reduce the dependence of deterministic algorithm solutions on the initial point, Chia et al. [26] provided starting points for OA using GA, while Liñán et al. [27] proposed hybrid algorithms that combine stochastic (Differential Evolution with Tabu List) and deterministic methods (a discrete-steepest descent algorithm with variable bounding). These hybrid algorithms narrow the search domain of deterministic methods, improving global optima discovery. However, a fundamental issue still exists: it cannot be determined in advance whether the program within the narrowed search space is convex. The solution quality of hybrid algorithms therefore still depends on the deterministic algorithm, because the final solution is obtained by a deterministic algorithm in order to satisfy the mathematical criteria for optimality.
When examining existing convex deterministic algorithms, a key limitation of most of them in non-convex programming is their reliance on gradient information (or difference information for integer variables) from all known points to construct constraint conditions and derive the lower bound of the original problem. For convex functions, as shown in Figure 1a, this lower bound prediction (represented by the red line) is accurate. However, for non-convex functions, as shown in Figure 1b, this prediction no longer serves as a lower bound but instead becomes an “upper bound prediction”. This occurs because the resulting program behaves like a min–max problem, ultimately leading to premature convergence of the algorithm.
To address this limitation, if the lower bound of an unknown point is constrained solely by information from its nearest known point (illustrated by the blue line in Figure 1), the method maintains an accurate lower bound prediction for convex functions while improving lower bound accuracy for non-convex functions. This concept, the so-called proximity principle, is discussed in detail in Section 2.
The first deterministic algorithm applied in simulator-based distillation optimization is the logic-based outer approximation algorithm (LB-OA), a variant of OA [14,28]. This method transforms the mixed-integer linear programming (MILP) problem into a mixed binary linear programming (MBLP) problem using logical variables, allowing a single inequality to contain information from multiple integer combinations. Additionally, generalized disjunctive programming (GDP) [29], which addresses disjunctive terms in models, can be applied using big-M or hull reformulations. The use of penalty functions to incorporate constraints into the objective function [30] further improves robustness, making deterministic algorithms widely applicable in simulator-based optimization. However, in simulator-based distillation optimization it is challenging to decouple integer variables from continuous ones, so the MINLP to be solved does not conform to the assumptions of the LB-OA method, and the global optimality of the final solution cannot be guaranteed.
On the other hand, Liñán et al. [31] proposed a Bender’s decomposition (BD) framework that excludes continuous variables from the MILP master problem and is therefore more appropriate for simulator-based distillation optimization. Compared to GBD, BD does not rely on Lagrange multipliers, which are difficult to provide reliably in NLP due to numerical issues. Thus, BD proves to be more suitable in this context.
The organization of this paper is as follows: Section 2 introduces the proximity principle and the proposed improved algorithms. Section 3 demonstrates the effectiveness of the improvements through a numerical case and two simulation-based optimization case studies. The conclusions are presented in Section 4.

2. Materials and Methods

2.1. Problem Statement and Logic-Based Bender’s Decomposition Algorithm

A process simulator-based distillation optimization problem can be formulated as a mixed-integer nonlinear programming (MINLP) problem, represented as follows:
$(x, y) = \arg\min f_{MINLP} = F(x, y, u)$ (1)
Subject to the following:
$U(x, y, u) = 0$ (2)
$G(x, y, u) = \left[ g_m(x, y, u) \right] = \left[ g_1(x, y, u), \ldots, g_{\dim(G)}(x, y, u) \right]^{\mathrm{T}} \le 0$ (3)
$x^{lb} \le x \le x^{ub}, \quad y^{lb} = \left[ y_p^{lb} \right] \le y = \left[ y_p \right] \le y^{ub} = \left[ y_p^{ub} \right]$
$x \in \mathbb{R}^{\dim(x)}, \quad y \in \mathbb{Z}^{\dim(y)}, \quad m \in N_1, \quad p \in N_2$
$N_1 = \{ 1, 2, \ldots, \dim(G) \}, \quad N_2 = \{ 1, 2, \ldots, \dim(y) \}$
where the input variables of the simulator consist of $x$ and $y$. $x$ is a series of continuous variables, such as the reflux ratio of the distillation column and the temperature of the heat exchanger. $y$ represents a series of integer variables $y_p$, such as the number of trays in the distillation column. $u$ is the output of the simulator, such as the heat duty of a heat exchanger or the top temperature of a distillation column. $F(x, y, u)$ is the objective function of the design, which can be an energy consumption target or a CO2 emission target; the most commonly used is the total annualized cost (TAC). $U(x, y, u)$ represents a series of implicit equation systems that describe the calculation process of the simulator; as far as we know, the steady-state calculation process does not include inequality constraints. $G(x, y, u)$ is a series of constraint functions $g_m(x, y, u)$, such as purity constraints, temperature constraints, and flow constraints. Note that equality constraints can be expressed through two inequality constraints.
The logic-based Bender’s decomposition algorithm (LB-BD) decomposes the MINLP into a nonlinear programming (NLP) subproblem, Formulas (4)–(6):
$x = \arg\min f_{NLP} = f(x, y^*, u)$ (4)
Subject to the following:
$U(x, y^*, u) = 0$ (5)
$G(x, y^*, u) \le 0$ (6)
$x^{lb} \le x \le x^{ub}, \quad x \in \mathbb{R}^{\dim(x)}$
and a mixed-integer linear programming (MILP) master problem, Formulas (7)–(15):
$y = \arg\min f_{MILP} = \alpha + M \sum_{m \in N_1} \beta_m$ (7)
Subject to the following:
$K_n(y, y_n^*, n_p, B_{p,q}, q) \le \alpha, \quad n \in N_3$ (8)
$W_{m,n}(y, y_n^*, n_p, B_{p,q}, q) \le \beta_m, \quad m \in N_1, \; n \in N_3$ (9)
$y_p = \sum_{q=1}^{y_p^{ub} - y_p^{lb} + 1} \left( y_p^{lb} + q - 1 \right) B_{p,q}$ (10)
$\sum_{q=1}^{y_p^{ub} - y_p^{lb} + 1} B_{p,q} = 1$ (11)
$\sum_{p \in N_2} B_{p,\, y_{p,n}^* - y_p^{lb} + 1} \le \dim(y) - 1, \quad n \in N_3$ (12)
$y^{lb} \le y \le y^{ub}, \quad y \in \mathbb{Z}^{\dim(y)}$
$n_w = \left[ n_{1,w}, \ldots, n_{\dim(y),w} \right]^{\mathrm{T}}, \quad n_{w,w} = 1, \quad n_{p,w} = 0 \; (p \ne w)$
$\alpha, \beta_m \in \mathbb{R}, \quad \beta_m \ge 0, \quad B_{p,q} \in \{0, 1\}, \quad m \in N_1, \quad p, w \in N_2, \quad n \in N_3$
$N_1 = \{ 1, 2, \ldots, \dim(G) \}, \quad N_2 = \{ 1, 2, \ldots, \dim(y) \}, \quad N_3 = \{ 1, 2, \ldots, \text{number of known points } y_n^* \text{ in the MILP} \}$
where $\alpha$ is the relaxation variable for the objective function $f$, and $\beta_m$ is the relaxation variable for the $m$-th constraint function $g_m$. For every known point $y_n^*$, there is a corresponding set of Formulas (8) and (9). $B_{p,q}$ is a 0–1 variable; when it equals 1, it indicates that the $p$-th integer variable takes the value $(y_p^{lb} + q - 1)$, and otherwise it does not. $[B_{p,q}]$ corresponds one-to-one with $y$ through Formulas (10) and (11), and the number of such sets equals the number of integer variables. Formula (12) excludes the situations $y = y_n^*$. $n_p$ is a unit vector. $M$ is a sufficiently large number used to reject infeasible solutions.
$K_n(y, y_n^*, n_p, B_{p,q}, q)$ ($W_{m,n}(y, y_n^*, n_p, B_{p,q}, q)$) is the lower bound estimate of the objective function (constraint function) based on the known points $(x_n^*, y_n^*)$ (solutions of the NLP subproblem), using linear extrapolation, written as follows:
$K_n(y, y_n^*, n_p, B_{p,q}, q) = f_n(y_n^*) + \sum_{p=1}^{\dim(y)} \left[ \sum_{q=1}^{y_{p,n}^* - y_p^{lb}} \left( f(y_n^* - n_p) - f_n(y_n^*) \right) \left( y_{p,n}^* - \left( y_p^{lb} + q - 1 \right) \right) B_{p,q} + \sum_{q = y_{p,n}^* - y_p^{lb} + 2}^{y_p^{ub} - y_p^{lb} + 1} \left( f(y_n^* + n_p) - f_n(y_n^*) \right) \left( \left( y_p^{lb} + q - 1 \right) - y_{p,n}^* \right) B_{p,q} \right]$ (13)
$W_{m,n}(y, y_n^*, n_p, B_{p,q}, q) = g_{m,n}(y_n^*) + \sum_{p=1}^{\dim(y)} \left[ \sum_{q=1}^{y_{p,n}^* - y_p^{lb}} \left( g_m(y_n^* - n_p) - g_{m,n}(y_n^*) \right) \left( y_{p,n}^* - \left( y_p^{lb} + q - 1 \right) \right) B_{p,q} + \sum_{q = y_{p,n}^* - y_p^{lb} + 2}^{y_p^{ub} - y_p^{lb} + 1} \left( g_m(y_n^* + n_p) - g_{m,n}(y_n^*) \right) \left( \left( y_p^{lb} + q - 1 \right) - y_{p,n}^* \right) B_{p,q} \right]$ (14)
In the above formulas, the summations obey the following convention:
$\text{if } a > b, \text{ then } \sum_{i=a}^{b} x = 0, \quad a, b \in \mathbb{N}$ (15)
where $y_n^* - n_p$ and $y_n^* + n_p$ are the neighbors of $y_n^*$. $f(y_n^* - n_p)$ and $g_m(y_n^* - n_p)$ are the solutions of the NLP subproblem (Formulas (4)–(6)) with the fixed integer combination $y_n^* - n_p$, and the same applies to $y_n^* + n_p$ and $y_n^*$.
An example is presented to help readers understand Formula (13), assuming that the objective function $f(y)$ is a typical quadratic function, written as follows:
$f(y) = y^2, \quad -4 \le y \le 4, \; y \in \mathbb{Z}$ (16)
The real function values are represented by the black solid dots in Figure 2, assuming that the known point is $y_n^* = 0$, as indicated by the red circle in Figure 2.
Because the dimension of $y$ is 1, the upper limit of $p$ is 1, $y_1^{lb} = -4$, $y_1^{ub} = 4$, and $n_p$ has only one vector, $n_1 = 1$. Then, $f(y_n^* - n_p) = f(-1) = 1$ and $f(y_n^* + n_p) = f(1) = 1$. There are nine variables $B_{p,q}$, each corresponding one-to-one with a value of $y$ according to Formulas (10) and (11). For example, when $B_{1,3} = 1$ and all other $B_{p,q}$ are 0, $y = -2$. Due to the limitation of Formula (12), $B_{1,5} = 1$ cannot be part of the MILP solution. In summary, Formula (13) can be simplified as follows, as indicated by the red dashed line in Figure 2:
$K_n(y, y_n^*, n_p, B_{p,q}, q) = K_n(B_{p,q}) = 4B_{1,1} + 3B_{1,2} + 2B_{1,3} + B_{1,4} + B_{1,6} + 2B_{1,7} + 3B_{1,8} + 4B_{1,9}$ (17)
A similar process can also be applied to Formula (14). Obviously, this linear estimation splits each dimension $y_p$ into positive and negative directions, with the drawback of requiring two additional NLP solutions ($f(-1)$ and $f(1)$). However, the resulting linear lower bound estimate is more accurate than the traditional two-point linear lower bound estimate.
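To make the construction of Formula (13) concrete, the short script below rebuilds the coefficients of Formula (17) for the quadratic example above. It is a minimal sketch (plain Python, no MILP solver), and the function and variable names are ours rather than part of the original implementation.

```python
# Minimal sketch: build the LB-BD linear lower-bound coefficients of Formula (13)
# for the scalar example f(y) = y^2 on -4 <= y <= 4 with known point y* = 0.
def f(y):
    return y ** 2

y_lb, y_ub, y_star = -4, 4, 0
f_star = f(y_star)
f_minus = f(y_star - 1)   # f(y* - n_p), one extra NLP solve in the real algorithm
f_plus = f(y_star + 1)    # f(y* + n_p)

coeff = {}                # coeff[q] multiplies the 0-1 variable B_{1,q}
for q in range(1, y_ub - y_lb + 2):
    y_q = y_lb + q - 1                        # value of y selected by B_{1,q} (Formula (10))
    if y_q < y_star:                          # left branch of Formula (13)
        coeff[q] = f_star + (f_minus - f_star) * (y_star - y_q)
    elif y_q > y_star:                        # right branch of Formula (13)
        coeff[q] = f_star + (f_plus - f_star) * (y_q - y_star)
    # y_q == y_star (here q = 5) is excluded from the MILP by Formula (12)

print(coeff)
# -> {1: 4, 2: 3, 3: 2, 4: 1, 6: 1, 7: 2, 8: 3, 9: 4}, i.e., the coefficients in Formula (17)
```

In the master problem, $K_n$ is then $\sum_q \text{coeff}[q]\, B_{1,q}$ together with $\sum_q B_{1,q} = 1$, so selecting any integer $y$ activates exactly the extrapolated bound at that point.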
Theorem 1. 
(Convergence of LB-BD): If the MILP master problem is a convex program and $f_{MILP} \ge f_{NLP}$, the algorithm converges to the global optimal solution, with the optimal objective value being $\min f_{NLP}$.
Proof of Theorem 1. 
The proof of this theorem is presented in the Supporting Information S1. □
Inference 1. 
If the MILP master problem is a convex program, then as the number of iterations increases, the objective function value of the MILP shows a monotonically increasing trend.
Proof of Inference 1. 
The proof of this inference is presented in the Supporting Information S1. □

2.2. Proximity Principle and Logic-Based Proximity Principle Bender’s Decomposition Algorithm

The NLP subproblem can reach a globally optimal solution through a multi-start strategy; its expression is the same as Formulas (4)–(6). The so-called proximity principle aims to keep the MILP solution from becoming stuck in local optima by controlling when Formulas (8) and (9) take effect, with its GDP form expressed as follows:
$y = \arg\min f_{MILP} = \alpha + M \sum_{m \in N_1} \beta_m$ (18)
Subject to the following:
$\begin{bmatrix} z_n \\ K_n(y, y_n^*, n_p, B_{p,q}, q) \le \alpha \\ W_{m,n}(y, y_n^*, n_p, B_{p,q}, q) \le \beta_m \end{bmatrix} \vee \begin{bmatrix} \neg z_n \\ -\infty \le \alpha \\ -\infty \le \beta_m \end{bmatrix}, \quad n \in N_3$ (19)
$\Omega_{\text{Proximity principle}}(z_n) = \text{True}$ (20)
$y_p = \sum_{q=1}^{y_p^{ub} - y_p^{lb} + 1} \left( y_p^{lb} + q - 1 \right) B_{p,q}$ (21)
$\sum_{q=1}^{y_p^{ub} - y_p^{lb} + 1} B_{p,q} = 1$ (22)
$\sum_{p \in N_2} B_{p,\, y_{p,n}^* - y_p^{lb} + 1} \le \dim(y) - 1, \quad n \in N_3$ (23)
$y^{lb} \le y \le y^{ub}, \quad y \in \mathbb{Z}^{\dim(y)}$
$\alpha, \beta_m \in \mathbb{R}, \quad \beta_m \ge 0, \quad B_{p,q} \in \{0, 1\}, \quad z_n \in \{\text{False}, \text{True}\}, \quad m \in N_1, \quad p \in N_2, \quad n \in N_3$
Formulas (18)–(23) and Formulas (7)–(15) differ only in sub-Formulas (8) and (9) versus sub-Formulas (19) and (20). Formula (19) defines a set of disjunctions corresponding to the conditional lower bound estimates of the different known points, for which Formulas (8), (9), (13) and (14) are defined. Each disjunction involves only two terms, corresponding to whether or not the lower bound estimate from a known point $y_n^*$ is selected ($z_n$). The so-called proximity principle defines the interrelationship between the $z_n$, expressed as Formula (20), which restricts $z_n = \text{True}$ to only the one or few known points with the smallest Euclidean distance, while all other $z_n = \text{False}$. Specifically, the big-M method is used to reformulate Formulas (19) and (20) as follows:
$y = \arg\min f_{MILP} = \alpha + M \sum_{m \in N_1} \beta_m$ (24)
Subject to the following:
$K_n(y, y_n^*, n_p, B_{p,q}, q) \le \alpha + M_1 (1 - z_n)$ (25)
$W_{m,n}(y, y_n^*, n_p, B_{p,q}, q) \le \beta_m + M_1 (1 - z_n)$ (26)
$(z_{n_1} - z_{n_2}) M_2 + \sum_{p \in N_2} \left( \frac{y_p - y_{p,n_1}^*}{y_p^{ub} - y_p^{lb}} \right)^2 \le \sum_{p \in N_2} \left( \frac{y_p - y_{p,n_2}^*}{y_p^{ub} - y_p^{lb}} \right)^2 + M_2$ (27)
$\sum_{n \in N_3} z_n = K$ (28)
$y_p = \sum_{q=1}^{y_p^{ub} - y_p^{lb} + 1} \left( y_p^{lb} + q - 1 \right) B_{p,q}$ (29)
$\sum_{q=1}^{y_p^{ub} - y_p^{lb} + 1} B_{p,q} = 1$ (30)
$\sum_{p \in N_2} B_{p,\, y_{p,n}^* - y_p^{lb} + 1} \le \dim(y) - 1, \quad n \in N_3$ (31)
$y^{lb} = \left[ y_p^{lb} \right] \le y = \left[ y_1, \ldots, y_{\dim(y)} \right]^{\mathrm{T}} \le y^{ub} = \left[ y_p^{ub} \right], \quad y \in \mathbb{Z}^{\dim(y)}$
$\alpha, \beta_m \in \mathbb{R}, \quad \beta_m \ge 0, \quad B_{p,q}, z_n \in \{0, 1\}, \quad m \in N_1, \quad p \in N_2, \quad n, n_1, n_2 \in N_3$
where $M_1$ and $M_2$ are sufficiently large numbers, and $M_2$ only needs to be of the order of the dimension of $y$. $z_n$ is a binary variable, and $z_n = 1$ indicates that the corresponding constraints (8) and (9) generated from $y_n^*$ are active; otherwise, those constraints have no effect. The value of $z_n$ is limited by Formula (27). Euclidean distance is used to measure the distance between discrete points in Formula (27) in order to keep the master problem linear: although Formula (27) contains quadratic terms, they are completely eliminated by addition and subtraction after expansion, so the master problem remains an MILP. If the Euclidean distance from the unknown integer combination $y$ to the $n_1$-th known integer combination $y_{n_1}^*$ is less than or equal to the Euclidean distance to the $n_2$-th known integer combination $y_{n_2}^*$, any combination of $z_{n_1}$ and $z_{n_2}$ is allowed. However, when the Euclidean distance to $y_{n_1}^*$ is greater than the Euclidean distance to $y_{n_2}^*$, $z_{n_1} = 1$ forces $z_{n_2} = 1$; that is, when there are known integer combinations closer to the unknown combination, information from known combinations farther away is not given priority. The number of known integer combinations whose information is used is limited by Formula (28): when $K$ in Formula (28) equals the number of known points, LB-PBD is equivalent to LB-BD. Generally, $K$ is set to 1 to ensure high accuracy; this is discussed further in Case 2 of Section 3. The meanings of the other variables remain the same as in Formulas (7)–(15).
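The following sketch illustrates the effect of Formulas (27) and (28) outside of any MILP solver: for a fixed candidate $y$ it computes the normalized squared Euclidean distances to the known integer combinations and activates only the $K$ nearest cuts. The function names and the brute-force distance sort are ours; in the actual master problem the same selection is made implicitly by the big-M constraints.

```python
# Illustrative sketch of the proximity principle (Formulas (27)-(28)):
# for a candidate integer combination y, only the cuts of the K nearest
# known combinations y_n* are allowed to constrain alpha and beta_m.
def normalized_sq_distance(y, y_known, y_lb, y_ub):
    return sum(((yi - yk) / (ub - lb)) ** 2
               for yi, yk, lb, ub in zip(y, y_known, y_lb, y_ub))

def active_cuts(y, known_points, y_lb, y_ub, K=1):
    d = [normalized_sq_distance(y, yk, y_lb, y_ub) for yk in known_points]
    order = sorted(range(len(known_points)), key=lambda n: d[n])
    z = [0] * len(known_points)
    for n in order[:K]:          # Formula (28): exactly K flags z_n are set to 1
        z[n] = 1
    return z

# Same configuration as the worked example in Appendix A:
known = [(10, 10), (10, 20), (20, 10), (20, 20)]
print(active_cuts((30, 30), known, y_lb=(0, 0), y_ub=(30, 30), K=1))
# -> [0, 0, 0, 1]: only the cut built from (20, 20) bounds alpha
```

Setting K larger than 1 simply activates the cuts of the K nearest known combinations, which is the behavior examined in Case 2 of Section 3.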
Theorem 2. 
With the same known points $y_n^*$, the $f_{MILP}$ solved by Formulas (24)–(30) must be less than or equal to the $f_{MILP}$ solved by Formulas (7)–(11).
Proof of Theorem 2. 
The proof of this theorem is presented in the Supporting Information S1. □
Inference 2. 
For cases where the global optimal solution of the MILP master problem lies within a local convex domain, as long as each local convex domain contains at least one known integer combination, LB-PBD is guaranteed to find the global optimal solution.
Proof of Inference 2. 
The proof of this inference is presented in the Supporting Information S1. □

2.3. Delayed Convergence Strategy and Multi-Start Points Strategy

Because LB-PBD only changes the way Formulas (8) and (9) take effect, Theorem 1 still applies to LB-PBD. However, in reality, programming problems are not always convex, and even the neighborhood of the global optimal solution may be concave, which can cause LB-PBD to converge prematurely. If the real problem is a completely concave function, the $f_{MILP}$ constructed by LB-PBD is an envelope hyperplane of the original function, which is always greater than the original function at any position $y$. Assuming no convergence condition is set, the objective function $f_{MILP}$ solved by the MILP will decrease monotonically as the number of generations increases (readers can follow the proof process of Inference 1 to verify this):
$f_{MILP, n} \ge f_{MILP, n+1}$ (32)
Note that this is completely opposite to the properties exhibited by convex functions (according to Inference 1), which is as follows:
$f_{MILP, n} \le f_{MILP, n+1}$ (33)
Therefore, the convergence criterion can be changed so that the algorithm can cross the concave region or find the optimal solution within the concave region. One direct way is the delayed convergence strategy, which requires $i$ consecutive generations to satisfy the following:
$f_{MILP} \ge f_{NLP}$ (34)
Only then is the algorithm considered to have converged. Obviously, $i$ is a problem-dependent hyperparameter. If it is known in advance that the program to be solved is convex, $i$ can be set to 1. If it is known in advance that the global optimal solution lies within a locally convex domain but some concave domains exist, an additional requirement can be added that $f_{MILP}$ increases monotonically over these $i$ generations.
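A minimal sketch of the delayed convergence test follows; the counter and variable names are ours. The termination decision is exactly the one described above: stop only after $i$ consecutive generations in which the MILP lower bound is no smaller than the best NLP objective.

```python
# Sketch of the delayed convergence strategy: require Formula (34) to hold
# for i consecutive generations before declaring convergence.
def has_converged(f_milp_history, f_nlp_best, i):
    """f_milp_history: MILP lower bounds per generation (most recent last)."""
    if len(f_milp_history) < i:
        return False
    return all(f_milp >= f_nlp_best for f_milp in f_milp_history[-i:])

# With i = 1 this reduces to the classical Benders-type test f_MILP >= f_NLP.
```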
According to Inference 2, when multiple local optima exist, LB-PBD requires sufficient information within the search scope. Therefore, a multi-start points strategy is also indispensable; in other words, the accuracy of the algorithm improves as more start points are used. The number of start points $h$ is likewise a problem-dependent hyperparameter.
The influence of the hyperparameters $i$ and $h$ of the two strategies is discussed in Case 1 and Case 2 of Section 3.

2.4. Solver and Simulator Configuration

The LB-PBD algorithm is described in Table 1, and the flow chart of LB-PBD applied to simulator-based optimization is shown in Figure 3.
The algorithm starts with the calculation of the NLP subproblems of $h$ random initial points (multi-start points strategy) and their neighborhood points. The NLP is solved by FMINCON in MATLAB 2021b. ASPEN PLUS V8.8 is used as the simulator, manipulated through the ActiveX interface, and data transmission is handled by MATLAB. The derivatives are obtained through the Open Object Model Framework (OOMF) [12]. GUROBI 11.0 then solves the MILP master problem, giving the minimum lower bound $f_{MILP}$ and the next generation of fixed integer variables $y_k^*$. The algorithm then iterates between NLP and MILP. After each MILP solution, $f_{MILP}$ is compared with the known minimum NLP objective function $f_{NLP}$ until $i$ successive generations (delayed convergence strategy) satisfy $f_{MILP} \ge f_{NLP}$, and the algorithm terminates. The optimal solution is the independent variable corresponding to $f_{NLP}$.
The results of all NLP solutions are recorded to avoid wasting time on repeated solving, and the MATLAB programming environment provides the communication platform between all software. All programs run on a workstation with a 16-core W-2295 CPU, 64 GB of memory, and an A4000 GPU. The specific parameters used by each solver, the method of obtaining derivatives in the NLP subproblem, and the binary interaction parameters used by the simulator can be found in Supporting Information S2.
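For readers who want to reproduce the simulator link in Python rather than MATLAB, the sketch below shows one common way to drive Aspen Plus through its COM/ActiveX automation interface using pywin32. This is not the implementation used in this work: the backup file name, block name, and variable node paths are hypothetical placeholders that depend on the specific flowsheet, so treat it as an assumption-laden illustration.

```python
# Hedged sketch: evaluating one sample point through Aspen Plus via COM (ActiveX).
# Assumes Windows, a registered Aspen Plus COM server, and pywin32; the file name,
# block name "COL1", and node paths are hypothetical placeholders.
import win32com.client as win32

aspen = win32.Dispatch("Apwn.Document")          # Aspen Plus automation object
aspen.InitFromArchive2(r"C:\models\column.bkp")  # open a backup file (placeholder path)
aspen.Visible = False

def evaluate(reflux_ratio, n_stages):
    """Set the continuous/integer inputs, run the flowsheet, return outputs."""
    aspen.Tree.FindNode(r"\Data\Blocks\COL1\Input\BASIS_RR").Value = reflux_ratio
    aspen.Tree.FindNode(r"\Data\Blocks\COL1\Input\NSTAGE").Value = int(n_stages)
    aspen.Engine.Run2()                           # sequential-modular solve
    duty = aspen.Tree.FindNode(r"\Data\Blocks\COL1\Output\REB_DUTY").Value
    purity = aspen.Tree.FindNode(r"\Data\Streams\DIST\Output\MOLEFRAC\MIXED\METHANOL").Value
    return duty, purity

# Example call (requires a real model file):
# duty, purity = evaluate(2.0, 40)
```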

3. Results and Discussion

This section presents three cases to illustrate LB-PBD. Case 1 illustrates the effectiveness of the proximity principle through two different non-convex MINLP numerical examples while also demonstrating the usable range of $M_1$ in Formulas (25) and (26) and the role of the hyperparameters of the multi-start points strategy and the delayed convergence strategy. Case 2 aims to further illustrate the advantages of LB-PBD in simulator-based optimization and to determine hyperparameters for Case 3. The comparison with LB-BD aims to illustrate the influence of $K$ in Formula (28).
It is worth noting that in simulator-based distillation optimization, the CPU time of the NLP has always been the main component, because the simulator takes several seconds per run. Therefore, the number of simulator calls is used to evaluate the algorithm’s computation time; a good algorithm should obtain high accuracy while requiring a low number of simulator activations.
Case 3 demonstrates the process of optimizing an extractive distillation using LB-PBD, indicating that the MILP has relatively low time consumption and that the algorithm has the potential to be applied in simulator-based optimization.

3.1. Case 1—Numerical Experiment

To demonstrate the algorithm’s ability to navigate local optima and saddle points, two test functions have been selected:
$\min: \; f_1(x_1, x_2, y_1, y_2) = -\sum_{i=1}^{2} \left( 1 + \frac{x_i - 15}{100} \right) \sin\left( \frac{\pi x_i}{10} \right) - \sum_{i=1}^{2} \left( 1 + \frac{y_i - 15}{100} \right) \sin\left( \frac{\pi y_i}{10} \right)$ (35)
$0 \le x_i \le 30, \; x_i \in \mathbb{R}, \quad 0 \le y_i \le 30, \; y_i \in \mathbb{Z}, \quad i \in \{1, 2\}$
With the best solution $x_1 = x_2 = 25.0920$ and $y_1 = y_2 = 25$.
$\min: \; f_2(x_1, x_2, y_1, y_2) = \sum_{i=1}^{2} \left( y_i^2 - 40 \cos\left( \frac{\pi}{2} y_i \right) \right) - \sum_{i=1}^{2} \left( x_i - 10 y_i \right) \sin\left( x_i - 10 y_i \right)$ (36)
$-500 \le x_i \le 500, \; x_i \in \mathbb{R}, \quad -20 \le y_i \le 20, \; y_i \in \mathbb{Z}, \quad i \in \{1, 2\}$
With the best solution $x_1 = x_2 = 500$ and $y_1 = y_2 = 4$.
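As a quick numerical sanity check of the form of $f_1$ reconstructed above (the exact rendering of the amplitude term is our reading of the original typesetting, so treat it as an assumption), the snippet below evaluates the $y$-part of $f_1$ over the integer grid and confirms that $y_1 = y_2 = 25$ minimizes it, consistent with the reported best solution.

```python
import math

# y-part of the reconstructed test function f1 (the x-part has the same form);
# the (1 + (v - 15)/100) amplitude factor is our reading of the original formula.
def f1_term(v):
    return -(1 + (v - 15) / 100) * math.sin(math.pi * v / 10)

best_y = min(range(0, 31), key=f1_term)   # minimize over one integer dimension
print(best_y, f1_term(best_y))            # -> 25, approximately -1.1
```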
Figure 4 illustrates the NLP results of the two test functions under all integer combinations. The blue dots in the graph represent the corresponding optimal objective function values under each integer combination; the bluer the surface color, the smaller the value. The multi-start strategy is applied to the NLP to prevent the solutions from being trapped in local optima for a fixed integer combination. The number of multi-start points is set to 10 for Formula (35) and 200 for Formula (36), which ensures the global optimality of the solutions.
Notably, $f_1(y_1, y_2)$ has four locally optimal solutions that lie far apart, with objective function values differing by less than 1%, while $f_2(y_1, y_2)$ has three local optima near the globally optimal integer combination, also with objective function values differing by less than 1%.
In the discussion of algorithm performance below, a solution is considered globally optimal if its final values in each dimension differ from the global optimum by less than 0.1%. To avoid skewed conclusions from random occurrences, each experiment is repeated 100 times, and the average result is taken. The algorithms compared include LB-BD, LB-PBD, and the deterministic benchmark algorithm LB-OA [14], as well as Synchronously Population-Distributed Differential evolution (SPDDE) [32], a stochastic algorithm benchmark.
Firstly, in order to test the approximate usable range of $M_1$ in Formulas (25) and (26), the above test functions are taken as examples. With the difference between the maximum and minimum of each function rounded up to $10^n$ ($n$ a positive integer) as the benchmark, $M_1$ is set to $10^{n+m}$ ($m$ also a positive integer), the hyperparameter $h$ of the multi-start points strategy is set to 8, and the hyperparameter $i$ of the delayed convergence strategy is set to 3. The accuracy as a function of $m$ is shown in Figure 5, with specific data available in Table S4 of Supporting Information S3. It can be seen that $m = 2$ no longer affects the normal solution of the MILP (the fluctuations are caused by run-to-run randomness), while an excessively large $m$ makes Formulas (25) and (26) ill-conditioned, not only increasing solution time but also possibly reducing algorithm accuracy.
Secondly, the test results of the different algorithms are shown in Figure 6, with specific data available in Tables S2 and S3 of Supporting Information S3, where LB-BD and LB-PBD use different hyperparameters. For both test functions, the accuracy of LB-PBD increases as the hyperparameters of both strategies increase; in contrast, the accuracy of LB-BD decreases continuously as the hyperparameters of both strategies increase for test function $f_1$ and is very low for test function $f_2$. This validates our conclusion from Figure 1 and Theorem 2.
Because $f_1$ is a typical function in which continuous and discrete variables can be separated, LB-OA exhibits very high accuracy. However, because the two kinds of variables are mixed in $f_2$, its accuracy is completely lost. The stochastic algorithm SPDDE is unable to extricate itself from local optima and exhibits poor accuracy on $f_1$ because its four locally optimal solutions are far apart. On the other hand, $f_2$ has four locally optimal solutions that are close in distance, which helps SPDDE converge to the global optimal solution as the population shrinks. However, this requires setting appropriate hyperparameters such as the population size; otherwise, the accuracy is still poor (see the results in Table S3 of Supporting Information S3). Overall, for unknown MINLP test functions, LB-PBD with reasonable hyperparameters is a good choice.
A complete example of the calculation process of Formula (35) is described in Appendix A to help readers understand the working principle of LB-PBD.

3.2. Case 2—Single Column Case of Methanol Distillation

This case focuses on the optimization of a single distillation column, specifically separating a ternary mixture of methanol, ethanol, and water (with a molar flow ratio of 1:1:1) with a flow rate of 300 kmol/h, shown in Figure 7, where all independent variables are marked. The objective is to achieve a methanol purity greater than 99.9 mol% at the top of the column, while the methanol purity at the bottom should be less than 0.1 mol%. The minimum TAC and the corresponding design parameters of the distillation column are calculated. The TAC calculation formula is provided in Table S5, and the upper and lower bounds of each independent variable are listed in Table S6.
The global optimal solution for all integer combinations is represented in Figure 8, based on a grid of 19,881 points. The black areas indicate regions where the separation requirements are not met, while the bluer areas correspond to lower TAC values. Figure 8 reveals that, as expected, increasing the number of trays allows for higher pressure, which in turn reduces the amount of steam required within the column, lowering the heat duty. However, contrary to the ideal case, as the number of trays increases, the reflux ratio first decreases and then increases. This occurs because the pressure drop of the column becomes more significant with more trays, leading to greater exergy losses and a more pronounced remixing effect. Consequently, the concentration of ethanol (the light non-key component) at the top of the column increases, requiring a higher reflux ratio to mitigate this. This phenomenon means that the TAC no longer exhibits convexity for columns with a large number of trays, so the LB-OA algorithm may fail to find the global optimal solution.
This case uses LB-PBD and LB-BD for testing and comparison. Each test is repeated 30 times, and the average results are taken. The detailed test results are presented in Table S7 of Supporting Information S4. From the data, we can see that LB-PBD, which applies the delayed convergence strategy and the multi-start points strategy, maintains high accuracy. Its MILP time consumption is also within 1 min. In most cases, the global optimal solution can be found within 50 iterations and 200 or fewer NLP subproblems, for a problem with nearly 20,000 integer combinations.
The data in Table 2 and Table 3 are excerpted from Table S7, and the trends shown in the tables can also be found on other parameters.
Firstly, according to Table 2, as the hyperparameter $h$ of the multi-start points strategy increases, the number of known points for each MILP master problem increases, and the MILP time consumption therefore increases. However, the number of iterations decreases, so the resulting increase in the number of NLP subproblems actually solved is smaller than the increase implied by the additional start points. Because the NLP time consumption dominates the actual optimization process, the multi-start points strategy can improve the accuracy of the algorithm at a lower time cost than the traditional strategy of independently executing the optimization multiple times. It is also worth noting that the multi-start points strategy concentrates the NLP solutions, which is beneficial for deploying parallel computing.
Secondly, according to Table 3, the following conclusion is verified: as the value of $K$ in Formula (28) increases ($K = 3$ is chosen because only 3 points are needed to define a unique convex hull in two-dimensional space), the performance of LB-PBD tends towards that of LB-BD. This trend weakens as the number of known points increases (through the multi-start points strategy or the delayed convergence strategy), because more known points make the program convex within most of the convex hull, which helps the algorithm quickly find local optimal solutions. Therefore, if the program is predicted to be convex in a specific area, $K = n + 1$ (where $n$ is the dimension of the discrete independent variables) can be set to speed up the algorithm. Conversely, when no information about the program is available, $K = 1$ should be used to ensure high accuracy.

3.3. Case 3—Extractive Distillation Case of Cyclohexane/Cyclohexene

This case is adapted from part of a published work [33] on the separation of a near-azeotropic system composed of cyclohexane (ANE) and cyclohexene (ENE) using extractive distillation. Dimethylacetamide (DMAC) is employed as the entrainer, with its physical properties consistent with the reported properties [33]. A detailed description of the process is provided in the Supporting Information S5, with the search bounds listed in Table S8. The TAC calculation formula is referenced from Table S5.
The hyperparameter i of the delayed convergence strategy is set to 5, the hyperparameter h of the multi-start points strategy is set to 15, and the hyperparameter K in Formula (28) is set to 1. Figure 9a illustrates the time consumption per generation, where each generation includes one set of NLP subproblems and one MILP master problem. Figure 9b depicts the optimizing process of Case 3, while Figure 9c presents the optimal solution found by the algorithm.
Firstly, it is important to note that we cannot conclusively state that the solution obtained is the global optimum, as it has not been validated through exhaustive enumeration, and NLP algorithms cannot guarantee a globally optimal solution. However, as discussed in Section 2, the results sufficiently demonstrate that the solution is “good enough” for practical purposes. Moreover, the lower bound curve obtained by the MILP in Figure 9b has multiple obvious inflection points, indicating the existence of multiple local optima in the process, which verifies the algorithm’s ability to cross local optima. Additionally, for comparison, this case was also calculated using SPDDE, and the results are shown in Supporting Information S5. Unfortunately, for a similar time consumption, the solution obtained by SPDDE is worse than that obtained by LB-PBD, even worse than the highest upper bound encountered by LB-PBD.
Secondly, compared to the results from article [33], our optimization yields a different operational strategy for the extraction column. In that work, the column used a high reflux ratio, resulting in ENE being the main impurity at the top of the column. In contrast, by utilizing the NLP search algorithm, we obtained a solution with a lower reflux ratio, which makes DMAC (the heavier entrainer) the main impurity at the top, thereby reducing energy consumption and increasing the recovery rate of ENE.
An experiment was designed to explain this phenomenon: the molar purity of ANE was held at 0.999 by adjusting the reflux ratio of the extraction column, the molar purity of ENE was held at 0.999 by adjusting the distillate rate at the top of the extraction column, and the number of trays in the stripping section of the extraction column was varied. Other operating parameters were kept consistent with Figure 9c. The trend of the reflux ratio versus the impurity composition at the top of the extraction column is shown in Figure 10a, and the trend of the heat duty at the bottom of the extraction column is shown in Figure 10b. When the number of trays in the stripping section is small, increasing it improves the removal of light components, and ANE is driven to the top of the column first, which allows the top of the column to meet the ANE purity requirement with a smaller reflux ratio. As the number of stripping trays increases further, ENE is also driven to the top of the column, so a larger reflux ratio is needed to remove the excess ENE. Regardless of whether the number of trays is large or small, the higher reflux ratios used in the extraction column correspond to ENE being the main component of the impurities at the top of the column and to a higher heat duty, whereas at smaller reflux ratios there is more DMAC in the impurities and a lower heat duty.
This phenomenon can be explained as follows: when the reflux ratio is low, the rectifying section cannot effectively remove the heavy entrainer DMAC near the top of the column, leading to a higher concentration of entrainers in the impurities. As the reflux ratio increases, the separation ability of the rectifying section improves, reducing the entrainer content. However, too much ANE, the primary component in the reflux, dilutes the DMAC, weakening the entrainer’s ability to enhance the relative volatility of the ANE/ENE mixture. Consequently, more ENE enters the top of the column at higher reflux ratios.
Lastly, Figure 9a shows that, in simulation-based optimization, the increase in solving time due to the complexity of the MILP is negligible when there are few iterations. This suggests that LB-PBD is not only effective but also suitable for simple industrial applications because of its manageable computational requirements. However, the MILP time consumption grows with the number of iterations, which greatly increases the overall cost of the algorithm when it is applied to high-dimensional, more complex problems.

4. Conclusions

This work introduces the proximity principle for the MILP master problem in the decomposition algorithm. This principle imposes disjunction constraints on the use of different known integer combination information, ensuring that only the integer combination information with the closest Euclidean distance is used when evaluating the lower bound of unknown combinations.
Applying the proximity principle to the LB-BD algorithm yields LB-PBD. Mathematical analysis and three case studies prove that the proximity principle effectively improves the accuracy of the original algorithm. Combined with the multi-start points strategy and the delayed convergence strategy, LB-PBD has global optimal convergence for non-convex MINLP.
By employing basis transformation, the Euclidean distance between unknown and known integer combinations is further reduced, improving the accuracy of lower bound predictions. Specifically, the time consumption analysis of Case 3 indicates that the algorithm has the potential to be applied in simulator-based optimization.
However, the significant time consumption of solving the NLP subproblems continues to hinder further advancement of LB-PBD. Meanwhile, Case 3 shows that the MILP solving time increases with the number of iterations, which hinders the application of the algorithm to high-dimensional problems. To address these issues, future research should focus on integrating stochastic algorithms to reduce the search space of the deterministic algorithm, thereby curbing excessive time consumption and enhancing the scalability of the approach for higher-dimensional problems.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/pr13040977/s1: Section S1: Theorems and their proofs; Section S2: Specific parameters in solvers and simulators; Section S3: Information of case 1; Section S4: Information of case 2; Section S5—Information of case 3; Figure S1: Process flow diagram of SPDDE for case 3; Figure S2: The iteration diagram of SPDDE for case 3; Table S1: The binary interaction parameters used by simulators; Table S2: Algorithm numerical test function (1) results; Table S3: Algorithm numerical test function (2) results; Table S4: Big-M test results; Table S5: Calculation formula for total annualized cost (TAC); Table S6: The upper and lower bounds of the independent variable in Case 2; Table S7: Test results of different algorithms; Table S8: The upper and lower bounds of the independent variable in Case 3 [34,35,36,37,38,39].

Author Contributions

C.T.: conceptualization (lead), formal analysis (lead), investigation (equal), methodology (lead), software (equal), validation (equal), visualization (lead), and writing—original draft (lead); X.Z.: investigation (equal), software (equal), supervision (equal), validation (equal), and writing—review and editing (equal); Y.L.: software (equal); J.S.: project administration (lead), resources (lead), supervision (equal), validation (equal), and writing—review and editing (equal). All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article. The code that supports the findings of this study is available from the corresponding author upon reasonable request.

Acknowledgments

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. We acknowledge the use of ChatGPT-4o for language enhancement and polishing of the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
Greek symbols
$\alpha$  The relaxation variable for the objective function
$\beta$  The relaxation variable for the constraint function
Roman letters
ANE  Cyclohexane
$B$  Integer variable selection logic-based flag in MILP
$B$  Bottoms rate in distillation column
$D$  Distillate rate in distillation column
$D_S$  Makeup rate of extractant
$\dim(y)$  The number of dimensions of $y$
DMAC  Dimethylacetamide
ENE  Cyclohexene
$F(x, y, u)$  Objective function
$G(x, y, u) \le 0$  Constraint inequality system for programming
$g(x, y, u) \le 0$  Constraint inequality for programming
$h$  Hyperparameter of the multi-start points strategy
$i$  Hyperparameter of the delayed convergence strategy
LB-OA  Logic-based outer approximation algorithm
LB-BD  Logic-based Bender’s decomposition
LB-PBD  Logic-based proximity principle Bender’s decomposition
$M$  A sufficiently large number
MILP  Mixed-integer linear programming
MINLP  Mixed-integer nonlinear programming
$n$  Neighborhood vector
$N$  Number of trays for a specific section
$N_1$  $\{1, 2, \ldots, \text{number of constraints}\}$
$N_2$  $\{1, 2, \ldots, \dim(y)\}$
$N_3$  $\{1, 2, \ldots, \text{number of known integer combinations in MILP}\}$
NLP  Nonlinear programming
$P$  Condenser pressure in distillation column
$R$  Reflux ratio in distillation column
SPDDE  Synchronously Population-Distributed Differential evolution
TAC  Total annualized cost
$T_E$  Outlet temperature of heat exchanger
$U(x, y, u) = 0$  Equation system of the simulator
$u$  Dependent variable vector in the simulator
$x$  Independent variable vector with continuous values
$y$  Independent variable vector with integer values
$y$  Independent variable with integer value
$z$  Selection flag for known integer combinations in MILP
Superscripts
$lb$  Lower bound value
$ub$  Upper bound value
$*$  A fixed variable for the current programming
Subscripts
$m$  The index of the constraint function
$n$  The index of the known integer combinations in MILP
$p$  The index of elements in an integer vector
$q$  The index of elements in a set of logic-based flags

Appendix A

Take $f_1$ in Case 1 as an example, with $h = 4$ (the hyperparameter of the multi-start points strategy). In Figure A1, the red dots represent the known points $y_n^*$ in Formulas (24)–(31), the orange dots represent their neighborhood points, and all of these points are evaluated using Formulas (4)–(6). The blue pentagram represents the global optimal solution, the solid lines are contour lines of the function to be optimized (the bluer the line, the smaller the value), and the red dashed lines are the dividing lines between regions.
Firstly, assume that the initial four integer combinations are (10, 10), (10, 20), (20, 10), and (20, 20); these points and their corresponding neighboring points are solved by fixing them in Formulas (4)–(6). As shown in Figure A1a, the search area is divided into four regions based on the distance to the known points. Within each region, due to the constraints on $z_n$ in Formulas (25)–(28), only the known point within that region is used to generate Formulas (8) and (9). Taking the unknown point (30, 30) as an example, its Formulas (27) and (28) simplify to the following:
$(z_1 - z_2) M_2 + \left( \tfrac{30 - 10}{30 - 0} \right)^2 + \left( \tfrac{30 - 10}{30 - 0} \right)^2 \le \left( \tfrac{30 - 10}{30 - 0} \right)^2 + \left( \tfrac{30 - 20}{30 - 0} \right)^2 + M_2$
$(z_2 - z_1) M_2 + \left( \tfrac{30 - 10}{30 - 0} \right)^2 + \left( \tfrac{30 - 20}{30 - 0} \right)^2 \le \left( \tfrac{30 - 10}{30 - 0} \right)^2 + \left( \tfrac{30 - 10}{30 - 0} \right)^2 + M_2$
$(z_1 - z_3) M_2 + \left( \tfrac{30 - 10}{30 - 0} \right)^2 + \left( \tfrac{30 - 10}{30 - 0} \right)^2 \le \left( \tfrac{30 - 20}{30 - 0} \right)^2 + \left( \tfrac{30 - 10}{30 - 0} \right)^2 + M_2$
$(z_3 - z_1) M_2 + \left( \tfrac{30 - 20}{30 - 0} \right)^2 + \left( \tfrac{30 - 10}{30 - 0} \right)^2 \le \left( \tfrac{30 - 10}{30 - 0} \right)^2 + \left( \tfrac{30 - 10}{30 - 0} \right)^2 + M_2$
$(z_1 - z_4) M_2 + \left( \tfrac{30 - 10}{30 - 0} \right)^2 + \left( \tfrac{30 - 10}{30 - 0} \right)^2 \le \left( \tfrac{30 - 20}{30 - 0} \right)^2 + \left( \tfrac{30 - 20}{30 - 0} \right)^2 + M_2$
$(z_4 - z_1) M_2 + \left( \tfrac{30 - 20}{30 - 0} \right)^2 + \left( \tfrac{30 - 20}{30 - 0} \right)^2 \le \left( \tfrac{30 - 10}{30 - 0} \right)^2 + \left( \tfrac{30 - 10}{30 - 0} \right)^2 + M_2$
$(z_2 - z_3) M_2 + \left( \tfrac{30 - 10}{30 - 0} \right)^2 + \left( \tfrac{30 - 20}{30 - 0} \right)^2 \le \left( \tfrac{30 - 20}{30 - 0} \right)^2 + \left( \tfrac{30 - 10}{30 - 0} \right)^2 + M_2$
$(z_3 - z_2) M_2 + \left( \tfrac{30 - 20}{30 - 0} \right)^2 + \left( \tfrac{30 - 10}{30 - 0} \right)^2 \le \left( \tfrac{30 - 10}{30 - 0} \right)^2 + \left( \tfrac{30 - 20}{30 - 0} \right)^2 + M_2$
$(z_2 - z_4) M_2 + \left( \tfrac{30 - 10}{30 - 0} \right)^2 + \left( \tfrac{30 - 20}{30 - 0} \right)^2 \le \left( \tfrac{30 - 20}{30 - 0} \right)^2 + \left( \tfrac{30 - 20}{30 - 0} \right)^2 + M_2$
$(z_4 - z_2) M_2 + \left( \tfrac{30 - 20}{30 - 0} \right)^2 + \left( \tfrac{30 - 20}{30 - 0} \right)^2 \le \left( \tfrac{30 - 10}{30 - 0} \right)^2 + \left( \tfrac{30 - 20}{30 - 0} \right)^2 + M_2$
$(z_3 - z_4) M_2 + \left( \tfrac{30 - 20}{30 - 0} \right)^2 + \left( \tfrac{30 - 10}{30 - 0} \right)^2 \le \left( \tfrac{30 - 20}{30 - 0} \right)^2 + \left( \tfrac{30 - 20}{30 - 0} \right)^2 + M_2$
$(z_4 - z_3) M_2 + \left( \tfrac{30 - 20}{30 - 0} \right)^2 + \left( \tfrac{30 - 20}{30 - 0} \right)^2 \le \left( \tfrac{30 - 20}{30 - 0} \right)^2 + \left( \tfrac{30 - 10}{30 - 0} \right)^2 + M_2$
$z_1 + z_2 + z_3 + z_4 = 1$
There is a unique solution $z_1 = z_2 = z_3 = 0$, $z_4 = 1$. Substituting these values of $z_n$ into Formulas (25)–(28) shows that only the $K_n(y, y_n^*, n_p, B_{p,q}, q)$ generated by point (20, 20) constrains $\alpha$; the cuts from the other known points are satisfied for any value because of $M_1$ (inactive constraints).
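To double-check that the twelve big-M constraints above admit only $z_4 = 1$, one can enumerate the four assignments allowed by the cardinality constraint and test each inequality directly. The brute-force check below is a sketch ($M_2 = 10$ is an arbitrary sufficiently large value for this two-dimensional example).

```python
from itertools import product

# Normalized squared distances from the point y = (30, 30) to the four known points.
known = [(10, 10), (10, 20), (20, 10), (20, 20)]
d = [((30 - a) / 30) ** 2 + ((30 - b) / 30) ** 2 for a, b in known]

M2 = 10  # any value on the order of dim(y) or larger works here
feasible = []
for z in product([0, 1], repeat=4):
    if sum(z) != 1:                      # Formula (28) with K = 1
        continue
    ok = all((z[i] - z[j]) * M2 + d[i] <= d[j] + M2
             for i in range(4) for j in range(4) if i != j)   # the twelve pairwise cuts
    if ok:
        feasible.append(z)

print(feasible)   # -> [(0, 0, 0, 1)], i.e., only the cut from (20, 20) may be active
```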
Therefore, the MILP solution falls on point (30, 30). This point and its neighboring points are then solved by fixing them in Formulas (4)–(6), as shown in Figure A1b, resulting in five regions, each of which only trusts the linear extrapolation of the known point within it. By analogy, Figure A1c–f show the results of the next four iterations. Clearly, Figure A1f has found the global optimal solution. However, due to the regional partitioning, the $f_{MILP}$ of Formulas (24)–(31) is still less than the $f_{NLP}$ of all known points. The algorithm therefore continues, and after three more iterations, Figure A1g is obtained. At this iteration, $f_{MILP}$ begins to be greater than $f_{NLP}$. The algorithm could terminate here, but if the delayed convergence strategy is used, continuing to solve yields Figure A1h. Afterwards, processes similar to Figure A1c–f are repeated until the convergence condition is reached.
In this process, it can be clearly seen that as long as there is a known point within each local convex domain, the lower bound of the corresponding local optimal solution will be considered by LB-PBD. Therefore, the algorithm has global convergence for non-convex MINLP.
Figure A1. Solution process diagram of LB-PBD. (a) The result of Multi starting point strategy. (b) The 1st iteration result (c) The 2nd iteration result (d) The 3rd iteration result (e) The 4th iteration result (f) The 5th iteration result (g) The 8th iteration result (h) The 9th iteration result.

References

  1. Sorensen, E. Principles of Binary Distillation. Distillation: Fundamentals and Principals; Elsevier: Amsterdam, The Netherlands, 2014; pp. 145–185. [Google Scholar] [CrossRef]
  2. Sholl, D.S.; Lively, R.P. Seven chemical separations to change the world. Nature 2016, 532, 435–437. [Google Scholar] [CrossRef]
  3. Boukouvala, F.; Misener, R.; Floudas, C.A. Global optimization advances in Mixed-Integer Nonlinear Programming, MINLP, and Constrained Derivative-Free Optimization, CDFO. Eur. J. Oper. Res. 2016, 252, 701–727. [Google Scholar] [CrossRef]
  4. Storn, R.; Price, K. Differential evolution—A simple and efficient heuristic for global optimization over continuous spaces. J. Glo. Opt. 1997, 11, 341–359. [Google Scholar] [CrossRef]
  5. Hu, Z.; Li, P.; Liu, Y. Enhancing the Performance of Evolutionary Algorithm by Differential Evolution for Optimizing Distillation Sequence. Molecules 2022, 27, 3802. [Google Scholar] [CrossRef] [PubMed]
  6. Clerc, M.; Kennedy, J. The particle swarm—Explosion, stability, and convergence in a multidimensional complex space. IEEE Trans. Evol. Comp. 2002, 6, 58–73. [Google Scholar] [CrossRef]
  7. Liang, M.; Song, J.; Zhao, K.; Jia, S.; Qian, X.; Yuan, X. Optimization of dividing wall columns based on online Kriging model and improved particle swarm optimization algorithm. Comp. Chem. Eng. 2022, 166, 107978. [Google Scholar] [CrossRef]
  8. Holland, J.H. Adaptation in Natural and Artificial Systems; MIT Press: Cambridge, UK, 1976. [Google Scholar] [CrossRef]
  9. Zhang, D.; Zeng, S.; Li, Z.; Yang, M.; Feng, X. Simulation-assisted design of a distillation train with simultaneous column and sequence optimization. Comp. Chem. Eng. 2022, 164, 107907. [Google Scholar] [CrossRef]
  10. Zhang, X.; Jin, L.; Cui, C.; Sun, J. A self-adaptive multi-objective dynamic differential evolution algorithm and its application in chemical engineering. Appl. Soft Comp. 2021, 106, 107317. [Google Scholar] [CrossRef]
  11. Xu, Z.; Ding, Y.; Ye, Q.; Li, J.; Wu, H.; Pan, J. Investigation of side-stream extractive distillation for separating multi-azeotropes mixture based on multi-objective optimization. Chem. Eng. Proce. Proc. Inten. 2024, 200, 109793. [Google Scholar] [CrossRef]
  12. Aspen OOMF—Script Language Reference Manual; AspenTech: Knowledge Base; Aspen Technology, Inc.: Bedford, MA, USA, 2011.
  13. López, C.A.M.; Telen, D.; Nimmegeers, P.; Cabianca, L.; Logist, F.; Van Impe, J. A process simulator interface for multiobjective optimization of chemical processes. Comp. Chem. Eng. 2018, 109, 119–137. [Google Scholar] [CrossRef]
  14. Milán-Yañez, D.; Caballero, J.A.; Grossmann, I.E. Rigorous design of distillation columns: Integration of disjunctive programming and process simulators. Ind. Eng. Chem. Res. 2005, 44, 6760–6775. [Google Scholar] [CrossRef]
  15. Lawler, E.L.; Wood, D.E. Branch-and-Bound Methods: A Survey. Oper. Res. 1966, 14, 699–719. [Google Scholar] [CrossRef]
  16. Geoffrion, A.M. Generalized Benders decomposition. J. Optim. Theory Appl. 1972, 10, 237–260. [Google Scholar] [CrossRef]
  17. Duran, M.A.; Grossmann, I.E. An outer-approximation algorithm for a class of mixed-integer nonlinear programs. Math. Prog. 1986, 36, 307–339. [Google Scholar] [CrossRef]
  18. Westerlund, T.; Pettersson, F. An extended cutting plane method for solving convex MINLP problems. Comp. Chem. Eng. 1995, 19, 131–136. [Google Scholar] [CrossRef]
  19. Ugray, Z.; Lasdon, L.; Plummer, J.; Glover, F.; Kelly, J.; Martí, R. Scatter Search and Local NLP Solvers: A Multistart Framework for Global Optimization. INFORMS J. Comput. 2007, 19, 328–340. [Google Scholar] [CrossRef]
  20. Trespalacios, F.; Grossmann, I.E. Review of Mixed-Integer Nonlinear and Generalized Disjunctive Programming Methods. Chem. Ingen. Tech. 2014, 86, 991–1012. [Google Scholar] [CrossRef]
  21. Su, L.J.; Grossmann, I.E.; Tang, L.X. Computational strategies for improved MINLP algorithms. Comp. Chem. Eng. 2015, 75, 40–48. [Google Scholar] [CrossRef]
  22. Franke, M.B. Mixed-integer optimization of distillation sequences with Aspen Plus: A practical approach. Comp. Chem. Eng. 2019, 131, 106583. [Google Scholar] [CrossRef]
  23. Javaloyes-Antón, J.; Kronqvist, J.; Caballero, J.A. Simulation-based optimization of distillation processes using an extended cutting plane algorithm. Comp. Chem. Eng. 2022, 159, 107655. [Google Scholar] [CrossRef]
  24. Bergamini, M.L.; Aguirre, P.; Grossmann, I.E. Logic-based outer approximation for globally optimal synthesis of process networks. Comp. Chem. Eng. 2005, 29, 1914–1933. [Google Scholar] [CrossRef]
  25. Gurobi Optimization, LLC. Gurobi Optimizer Reference Manual. 2024. Available online: https://www.gurobi.com (accessed on 22 February 2025).
  26. Chia, D.N.; Duanmu, F.; Sorensen, E. Optimal Design of Distillation Columns Using a Combined Optimization Approach. Comp. Aided Chem. Eng. 2021, 50, 153–158. [Google Scholar] [CrossRef]
  27. Liñán, D.A.; Contreras-Zarazúa, G.; Sanchez-Ramírez, E.; Segovia-Hernández, J.G.; Ricardez-Sandoval, L.A. A hybrid deterministic-stochastic algorithm for the optimal design of process flowsheets with ordered discrete decisions. Comp. Chem. Eng. 2024, 180, 108501. [Google Scholar] [CrossRef]
  28. Caballero, J.A. Logic hybrid simulation-optimization algorithm for distillation design. Comp. Chem. Eng. 2015, 72, 284–299. [Google Scholar] [CrossRef]
  29. Trespalacios, F.; Grossmann, I.E. Lagrangean relaxation of the hull-reformulation of linear generalized disjunctive programs and its use in disjunctive branch and bound. Euro. J. Oper. Res. 2016, 253, 314–327. [Google Scholar] [CrossRef]
  30. Corbetta, M.; Grossmann, I.E.; Manenti, F. Process simulator-based optimization of biorefinery downstream processes under the Generalized Disjunctive Programming framework. Comp. Chem. Eng. 2016, 88, 73–85. [Google Scholar] [CrossRef]
  31. Liñán, D.A.; Ricardez-Sandoval, L.A. A Benders decomposition framework for the optimization of disjunctive superstructures with ordered discrete decisions. AICHE J. 2023, 69, 18008. [Google Scholar] [CrossRef]
  32. Lyu, H.; Cui, C.; Zhang, X.; Sun, J. Population-distributed stochastic optimization for distillation processes: Implementation and distribution strategy. Chem. Eng. Res. Des. 2021, 168, 357–368. [Google Scholar] [CrossRef]
  33. Lyu, H.; Li, S.; Yu, X.; Sun, J. Superstructure modeling and stochastic optimization of side-stream extractive distillation processes for the industrial separation of benzene/cyclohexane/cyclohexene. Sep. Purif. Technol. 2021, 257, 117907. [Google Scholar] [CrossRef]
  34. Duran, M.A.; Grossmann, I.E. A mixed-integer nonlinear programming algorithm for process systems synthesis. AICHE J. 1986, 32, 592–606. [Google Scholar] [CrossRef]
  35. Rathore, R.N.S.; Vanwormer, K.A.; Powers, G.J. Synthesis of distillation systems with energy integration. AICHE J. 1974, 20, 940–950. [Google Scholar] [CrossRef]
  36. Luyben, W.L. Plantwide Dynamic Simulators in Chemical Processing and Control, 1st ed.; CRC Press: Boca Raton, FL, USA, 2002. [Google Scholar] [CrossRef]
  37. Seider, W.D.; Lewin, D.R.; Seader, J.D.; Widagdo, S.; Gani, R.; Ng, K.M. Product and Process Design Principles: Synthesis, Analysis and Evaluation, 4th ed.; John Wiley & Sons: New York, NY, USA, 2019. [Google Scholar]
  38. Luyben, W.L. Principles and Case Studies of Simultaneous Design, 1st ed.; John Wiley & Sons: New York, NY, USA, 2011. [Google Scholar] [CrossRef]
  39. Luyben, W.L. Capital cost of compressors for conceptual design. Chem. Eng. Proce. Proce. Inten. 2018, 126, 206–209. [Google Scholar] [CrossRef]
Figure 1. Illustration of different lower bound prediction methods in different functions. (a) A convex function; (b) a non-convex function.
Figure 2. An explanation case of Formula (13).
Figure 3. Architecture of logic-based proximity principle Bender’s decomposition algorithm.
Figure 4. Graph of the NLP results under all integer combinations. (a) Test function $f_1(x_1, x_2, y_1, y_2)$; (b) test function $f_2(x_1, x_2, y_1, y_2)$.
Figure 5. Big-M test results.
Figure 6. Comparison of average accuracy.
Figure 7. Process flow diagram of Case 2.
Figure 8. TAC for different integer combinations.
Figure 9. The test results of Case 3: (a) the time consumption result; (b) the process of LB-PBD optimization; (c) the optimal solution found by LB-PBD.
Figure 10. Sensitivity analysis of the number of trays in the stripping section of an extracting column: (a) the result of reflux ratio and composition of impurities; (b) the result of heat duty.
Table 1. Pseudo-code of LB-PBD.
The logic-based proximity principle Bender’s decomposition algorithm for mixed-integer nonlinear programming.
S1. Initialization (multi-start points strategy with hyperparameter h )
Initialize $h$ fixed integer combinations $y_n^*$ and their neighborhoods $(y_n^* - n_p)$ and $(y_n^* + n_p)$; solve the nonlinear programming subproblem (Formulas (4)–(6)) for each fixed integer combination to obtain the objective function values $f_n(y_n^*)$, $f_n(y_n^* - n_p)$, and $f_n(y_n^* + n_p)$, the constraint function values $g_{m,n}(y_n^*)$, $g_{m,n}(y_n^* - n_p)$, and $g_{m,n}(y_n^* + n_p)$, and the best objective function value $f_{NLP}$. Set $k = 1$ and $kk = 0$.
S2. Mixed-integer linear programming master problem
Solve the mixed-integer linear programming master problem (Formulas (13)–(15) and (24)–(31)) with all solutions of the nonlinear programming subproblem (Formulas (4)–(6)). The solution is $y_k^*$ with objective function value $f_{MILP}$. If $f_{MILP} > f_{NLP}$, set $kk = kk + 1$ and continue with S4; otherwise, set $kk = 0$ and continue with S3.
S3. Nonlinear programming subproblem
Solve the nonlinear programming subproblem (Formulas (4)–(6)) for the fixed integer combination $y_k^*$ and its neighborhoods $(y_k^* - n_p)$ and $(y_k^* + n_p)$ to obtain the objective function values $f_k(y_k^*)$, $f_k(y_k^* - n_p)$, and $f_k(y_k^* + n_p)$, the constraint function values $g_{m,k}(y_k^*)$, $g_{m,k}(y_k^* - n_p)$, and $g_{m,k}(y_k^* + n_p)$, and the optimal objective function value $f_{NLP,k}$. If $f_{NLP,k} \le f_{NLP}$, set $f_{NLP} = f_{NLP,k}$. Then set $k = k + 1$ and continue with S2.
S4. Convergence judgment (delayed convergence strategy with hyperparameter i )
If $kk \ge i$, end the algorithm; the global optimal solution is the independent variable $(x, y)$ corresponding to $f_{NLP}$. Otherwise, set $kk = kk + 1$ and continue with S3.
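To illustrate how steps S1–S4 interact, the runnable toy below applies the same control flow to a one-dimensional, purely integer version of the problem (the $y$-part of Formula (35)). The brute-force scan over candidate integers stands in for the GUROBI MILP master problem, and direct function evaluation stands in for the simulator-backed NLP subproblem; both substitutions, and all names, are ours and are only meant to mirror the pseudo-code, not to reproduce the production implementation.

```python
import math

# Toy LB-PBD loop on the y-part of Formula (35): brute force replaces the MILP
# master problem, direct evaluation replaces the NLP subproblem.
Y_LB, Y_UB = 0, 30

def g(y):                                    # "NLP subproblem" for a fixed integer y
    return -(1 + (y - 15) / 100) * math.sin(math.pi * y / 10)

def solve_group(y, values):
    """S1/S3: evaluate a fixed integer combination and its in-range neighbors."""
    for v in (y - 1, y, y + 1):
        if Y_LB <= v <= Y_UB and v not in values:
            values[v] = g(v)

def lower_bound(y, center, values):
    """Formula (13) in one dimension: linear extrapolation from the nearest center."""
    step = -1 if y < center else 1
    slope = values[center + step] - values[center]
    return values[center] + slope * abs(y - center)

def lb_pbd_toy(start_centers, i_delay=2):
    values, centers = {}, list(start_centers)
    for c in centers:                        # S1: multi-start initialization
        solve_group(c, values)
    kk = 0
    while True:
        # S2: "MILP master problem" with the proximity principle (K = 1):
        # every candidate integer is bounded only by its nearest known center.
        candidates = [y for y in range(Y_LB, Y_UB + 1) if y not in centers]
        bounds = {y: lower_bound(y, min(centers, key=lambda c: abs(c - y)), values)
                  for y in candidates}
        y_next = min(bounds, key=bounds.get)
        f_milp, f_best = bounds[y_next], min(values.values())
        # S4: delayed convergence test (Formula (34) for i_delay consecutive generations)
        kk = kk + 1 if f_milp >= f_best else 0
        if kk >= i_delay:
            return min(values, key=values.get), f_best
        solve_group(y_next, values)          # S3: NLP subproblem at the new combination
        centers.append(y_next)

print(lb_pbd_toy([8, 22]))                   # -> (25, approximately -1.1), the global optimum
```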
Table 2. Excerpt results of LB-PBD (hyperparameters i = 5 and K = 1).
Hyperparameter h | Average CPU Consumption (Excluding NLP Consumption)/s | Average Number of Iterations | Average Number of NLP Solved | Theoretical Increase in Number of NLP Solved
1 | 39.70 | 32.9 | 164.4 | -
5 | 42.23 | 30.3 | 171.7 | (5 − 1) × 5
15 | 60.63 | 26.5 | 202.7 | (15 − 1) × 5
Table 3. Comparison of average accuracy.
Hyperparameters | LB-PBD (K = 1) | LB-PBD (K = 3) | LB-BD
i = 1, h = 5 | 28/30 | 9/30 | 0/30
i = 1, h = 15 | 30/30 | 20/30 | 0/30
i = 3, h = 5 | 30/30 | 22/30 | 0/30
i = 3, h = 15 | 30/30 | 30/30 | 0/30
i = 5, h = 5 | 30/30 | 28/30 | 0/30
i = 5, h = 15 | 30/30 | 30/30 | 0/30