Article

A Regularized Physics-Informed Neural Network to Support Data-Driven Nonlinear Constrained Optimization

by Diego Armando Perez-Rosero *, Andrés Marino Álvarez-Meza and Cesar German Castellanos-Dominguez
Signal Processing and Recognition Group, Universidad Nacional de Colombia, Manizales 170003, Colombia
* Author to whom correspondence should be addressed.
Computers 2024, 13(7), 176; https://doi.org/10.3390/computers13070176
Submission received: 17 June 2024 / Revised: 12 July 2024 / Accepted: 16 July 2024 / Published: 18 July 2024
(This article belongs to the Special Issue Machine Learning Applications in Pattern Recognition)

Abstract: Nonlinear optimization (NOPT) is a meaningful tool for solving complex tasks in fields like engineering, economics, and operations research, among others. However, NOPT struggles with data variability and noisy input measurements, which lead to incorrect solutions. Furthermore, nonlinear constraints may result in outcomes that are either infeasible or suboptimal, as in nonconvex optimization. This paper introduces a novel regularized physics-informed neural network (RPINN) framework as a new NOPT tool for both supervised and unsupervised data-driven scenarios. The contribution of our RPINN is threefold: First, by using custom activation functions and regularization penalties within an artificial neural network (ANN), RPINN can handle data variability and noisy inputs. Second, it employs physics principles to construct the network architecture, computing the optimization variables from network weights and learned features. Third, it uses automatic differentiation for training, which makes the system scalable and cuts computation time through batch-based back-propagation. Test results on both supervised and unsupervised NOPT tasks show that our RPINN provides solutions that are competitive with state-of-the-art solvers. In turn, the robustness of RPINN against noisy input measurements makes it particularly valuable in environments with fluctuating information. Specifically, we test a uniform mixture model and a gas-powered system as NOPT scenarios. Overall, RPINN's ANN-based foundation offers significant flexibility and scalability.

1. Introduction

Optimization approaches have emerged as tools for solving complex problems across various disciplines. Unlike traditional linear models, nonlinear optimization (NOPT) methods are capable of incorporating the intricate and interdependent relationships inherent in real-world scenarios [1]. These techniques are particularly valuable in fields such as engineering, economics, and operations research, where they enable the formulation and solution of models that more accurately reflect the underlying dynamics [2]. By leveraging advanced algorithms and computational solutions, NOPT facilitates improved decision-making and implementation, thereby enhancing efficiency and effectiveness in tackling multifaceted challenges. As research and technology continue to evolve, the significance of these methods in achieving optimal outcomes in diverse applications becomes increasingly evident [3,4]. Nonetheless, NOPT poses salient issues: First, data variability and noisy input measurements yield erroneous and fluctuating solutions. Second, nonlinear constraints greatly complicate the task of achieving optimal outputs [5]. Moreover, system scalability must be considered.
Data variability and noisy samples, in particular, are known to degrade the accuracy of stochastic measurements and increase errors in NOPT [6]. The presence of unwanted effects in the data not only reduces solution quality but also complicates the computation, making it harder to choose suitable optimization parameters [7]. Such instability greatly impedes the optimization process, rendering the algorithm vulnerable to external effects and significantly reducing its overall efficiency [8]. The intricacies of nonlinear constraints can result in outcomes that are either infeasible or suboptimal [9]. NOPT may then converge slowly, with a tendency to become trapped in local minima, which is problematic when both speed and accuracy are crucial [10]. Hence, optimization techniques become impractical for large-scale applications [11], and as the number of variables increases, scalability becomes a significant hindrance, underscoring the pressing need for specialized software and more processing time [12]. Consequently, it is important to handle large optimization problems, reduce runtime, and simplify the inherent complexity of noisy inputs and nonlinear constraints [11]. Indeed, many NOPT tasks are NP-hard, making it difficult to find an exact solution for large instances because no polynomial-time algorithm is known that solves them exactly or without introducing errors into the final output [13]. Additionally, some NOPT tasks pose nonconvex nonlinear programming (NLP) issues, which are especially challenging because they involve many nonconvex and integer functions [14].
Typically, mathematical programming or other classical techniques solve NOPT. These methods are capable of effectively handling nonlinearities and discontinuities [9]. Customized strategies are also implemented to refine the iterative search [15]. Gradient-based techniques, mostly based on descent methods, have also shown they can deal with problems like nonlinear and convex constraints [16]. Similarly, decomposition methods simplify complexity by segmenting the optimization into more manageable subproblems [17]. Additionally, search approaches and metaheuristics are crucial for maintaining a proper balance between exploration and exploitation [18], which enhances efficiency in finding optimal outputs. However, conventional methods often converge on solutions that may not be useful, especially in stochastic and noisy environments with high uncertainty and intrinsic data variability, which can reduce their accuracy [19].
Nowadays, artificial neural networks (ANNs) employ supervised learning to tackle nonlinear and stochastic problems through regression tasks. These networks are trained to find complex patterns and make accurate predictions even when there is a lot of uncertainty using data-driven strategies [20]. Commonly, ANN-based approaches employ automatic differentiation (AD), a computational technique used to evaluate the derivatives of functions efficiently and accurately. Unlike numerical alternatives, which can suffer from precision issues, or symbolic differentiation, which can be computationally expensive, AD works by breaking down functions into elementary operations for which derivatives are known and applying the chain rule systematically [21]. This process ensures that the derivative calculations are exact to machine precision and enables the calculation of loss function gradients with respect to network parameters, which is essential for gradient-based optimization algorithms like back-propagation.
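As a minimal illustration of the AD mechanism just described (our own sketch, not code from the cited works), TensorFlow's GradientTape records elementary operations and applies the chain rule to deliver exact derivatives, including loss gradients with respect to network parameters:

```python
import tensorflow as tf

# AD decomposes f(x) = sin(x^2) into elementary ops and applies the
# chain rule, so df/dx = 2*x*cos(x^2) is exact to machine precision.
x = tf.Variable(1.5)
with tf.GradientTape() as tape:
    f = tf.sin(x ** 2)
df_dx = tape.gradient(f, x)  # equals 2 * 1.5 * cos(1.5**2)

# The same mechanism yields loss gradients w.r.t. network parameters,
# which back-propagation consumes during gradient-based training.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(3,))])
z = tf.random.normal((8, 3))  # batch of inputs
y = tf.random.normal((8, 1))  # batch of targets
with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(y - model(z)))
grads = tape.gradient(loss, model.trainable_variables)
```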
Recently, physics-informed neural networks (PINNs) have emerged as an effective ANN-based optimization technique. Designed to align training with relevant physical principles, they have proven successful in various NOPT applications [22]. Commonly, the Karush–Kuhn–Tucker (KKT) criteria are used to represent constraints and integrate them into the network's cost function during supervised training [23]. Additionally, a novel approach for integrating constraints using Runge–Kutta (RK) methods in unsupervised training has been proposed in [24]. Nevertheless, deploying these networks is challenging, particularly with respect to defining suitable loss functions, selecting hyperparameters, and keeping computations efficient while training complex systems [25]. Also, although PINNs have remarkable capabilities, their ability to generalize to nonlinear optimization problems is limited [26].
In this paper, we present a novel regularized PINN framework, termed RPINN, as a NOPT optimization tool for both supervised and unsupervised data-driven scenarios. As a result, we deal with three key NOPT issues. We first address data variability and noisy input measurements by appropriately adapting custom activation and regularization penalties within an ANN scheme. Second, we effectively integrate nonlinear constraints into the network architecture, adhering to the principles of model physics. Specifically, we utilize the network weights and/or learned features within a functional composition framework to determine the NOPT variables. Third, our ANN-based strategy employs AD training, which favors system scalability and computational time through batch-based back-propagation. Experimental results from both supervised and unsupervised data-driven NOPT tasks confirm that our proposal is robust and competitive against state-of-the-art optimization approaches. The primary advantage of our proposal lies in its stability against noisy input measurements, making it a particularly valuable solution in contexts with fluctuating information. Furthermore, because RPINN is based on ANN, it offers flexibility in terms of the network architecture.
The agenda for this paper is as follows: Section 2 summarizes the related work. Section 3 describes the materials and methods. Section 4 presents the NOPT scenarios used to test RPINN. Section 5 and Section 6 depict the experiments and discuss the results. Lastly, Section 7 outlines the conclusions and future work.

2. Related Work

Some studies have shown that mathematical programming has become a crucial tool in numerical optimization. A notable example is the analysis by [9], which employs a sequential linear programming algorithm to address nonlinearities and discontinuities. In this context, the simplex method proves essential, being a classic technique effective for solving linear programming problems through iterative adjustments of solutions within a feasible set [27]. Similarly, the study by [15] explores a solution via quadratic programming (QP). Mixed-integer programming (MIP), on the other hand, is an optimization strategy that uses both integer and continuous variables. It is widely used to solve difficult problems [28], with the branch-and-cut (BC) algorithm employed to find the best solution [29]. Furthermore, second-order cone programming (SOCP) facilitates effective solutions for problems involving linear and quadratic constraints [30]. Newer studies, like [31], look into semidefinite programming (SDP), and the work in [32] uses convexification techniques. Likewise, exponential programming (EXP) models NOPT objectives and constraints through exponential functions [33]. Additionally, power cone programming (PCP) is considered for modeling product and square relationships [34]. Yet, these classical methods face challenges such as scalability, computation time, convergence, and practical precision, underscoring their inherent complexity and limitations. Furthermore, the use of relaxations or approximations affects the optimization accuracy [31].
On the other hand, the efficiency and precision of gradient methods in identifying optimal solutions highlight their relevance for practical optimization tasks. The work in [35] uses the Dai–Liao conjugate gradient method and hyperplane projections for global convergence when solving nonlinear equations. In addition, ref. [36] faces the nonconvex issue based on a set of starting points. Moreover, nonlinear decomposition using linear programming (LP) and gradient descent has also been proposed [37]. Further, the work in [38] examines Newton-based search to deal with convergence issues in poorly conditioned systems. Also, the semismooth Newton technique is applied for optimization in Hilbert spaces [39]. For noisy problems, the authors in [40] use piecewise polynomial interpolation and box reformulations, along with an interior-point (IP) method. The authors in [41] tackle similar problems with integrated penalty techniques. Overall, gradient methods are effective at solving NOPT tasks, but they can struggle to converge and are computationally expensive in noisy and nonlinear settings [42]. Also, choosing a suitable learning rate can be challenging, and they run the risk of getting stuck in local minima [43]. As seen in [44], it is also important to ensure that at least first-order differentiability is maintained when using techniques like the conjugate gradient, IP, and Newton-based approaches.
Of note, most of the available optimization solvers are based on the classical approaches mentioned above. Among them, Clarabel stands out for its versatility in optimizing a wide variety of problems, although it still faces significant challenges in areas such as MIP [45]. Gurobi is renowned for its proficiency in MIP due to its extensive range of techniques, including simplex and IP methods; however, as proprietary software, its use may be restricted in settings that require licensing flexibility [46]. Mosek is efficient with the IP approach, but its support for MIP is relatively limited, and its aptitude for NLP remains under debate, which could be a hindrance for developers who prefer open-source solutions [47]. Xpress specializes in solving MIP, offering conditional support for NLP, but is a closed-license alternative [48]. In turn, SCS, leveraging its open-source status, promotes adaptability and collaborative development, although its limitations in NLP reduce its effectiveness in certain optimization areas [49]. IPOPT excels at solving NLP problems, and its open access allows for flexibility [50].
Now, in this multifaceted optimization environment, the integration of tools such as MATPOWER, GEKKO, and CVXPY significantly expands the available options. MATPOWER is essential for solving energy system issues and supports solvers like Gurobi, Xpress, and IPOPT for linear, mixed-integer, and nonlinear programming [51,52,53]. GEKKO specializes in dynamic systems and nonlinear models, offering a holistic and open-source Python platform [54,55]. CVXPY is an open-source modeling language for convex optimization problems embedded in Python; it allows problems to be expressed naturally, mirroring the mathematical formulation rather than the restrictive standard form required by solvers [56,57]. Table 1 summarizes the mentioned solvers.
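For instance, the CVXPY modeling style described above lets a constrained least-squares problem be written almost verbatim from its mathematical form; the toy problem below is our own illustration rather than an example from the cited works:

```python
import cvxpy as cp
import numpy as np

# Minimize ||A x - b||_2^2 subject to simplex constraints, stated
# in notation that mirrors the mathematical formulation.
np.random.seed(0)
A, b = np.random.randn(20, 5), np.random.randn(20)
x = cp.Variable(5)
problem = cp.Problem(cp.Minimize(cp.sum_squares(A @ x - b)),
                     [x >= 0, cp.sum(x) == 1])
problem.solve()  # CVXPY selects a suitable conic solver automatically
print(x.value)
```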
Recently, ANNs have positioned themselves as fundamental tools in optimization by incorporating deep learning techniques, effectively addressing the complexity and nonlinearities of various problems. Conventional ANNs employ supervised learning to tackle nonlinear and stochastic problems through regression tasks. To this end, historical data or solutions precomputed by specialized NOPT tools are used to train these networks [59]. This approach enables ANNs to learn complex patterns and make accurate predictions even under significant uncertainty [20]. Typically, ANN-based approaches utilize AD, a computational method for efficiently and accurately evaluating function derivatives. Instead of numerical or symbolic differentiation, which can have issues with accuracy and require a lot of computing power, AD breaks functions down into simple operations whose derivatives are known and uses the chain rule consistently [21]. Thereby, AD ensures machine-level accuracy in derivative calculations and simplifies the determination of loss function gradients in relation to network parameters, enabling the use of gradient-based search with back-propagation. The work in [60] combines quasi-Newton methods and ANNs for NOPT. Furthermore, the authors in [59] utilize deep learning to solve optimal flow problems. Similarly, the work in [61] introduces an integrated training technique that, while effective, requires larger neural networks and presents challenges in generalization. Concurrently, ref. [62] uses elastic layers and incremental training as optimization-based solvers. Furthermore, the method by [63] combines convex relaxation with graph neural networks.
PINNs have recently emerged as a powerful optimization tool. These training approaches have proven effective in various NOPT applications, integrating relevant physical principles within ANNs [22]. The KKT criteria are applied to formulate constraints that are incorporated into an ANN cost function during supervised training [23]. In [64], a PINN framework is detailed that imposes penalties for constraint violations in the loss function. The study in [65] proposes a loss function that combines errors from differential and algebraic states with normative equation violations. Additionally, a novel strategy has been proposed to include constraints in unsupervised training using an RK-based technique [24]. Nevertheless, complete approaches based on ANNs and PINNs face challenges such as optimality degradation. In response, advanced alternatives like [66] have emerged, integrating system constraints into the cost function and applying penalties for violations. Furthermore, ref. [67] introduces an algorithm to address nonlinear problems modeled by partial differential equations with noisy data through Bayesian physics-informed neural networks (B-PINNs). Additionally, ref. [68] proposes a parametric differential equation-based approach holding functional connections to enhance the robustness and accuracy of PINNs. In turn, ref. [69] presents a truncated Fourier decomposition, termed Modal-PINNs, to optimize the reconstruction of periodic signals. However, these alternatives often lack adequate precision, generalization capability, and scalability [23]. Finally, supervised data are usually required, complicating their application in various NOPT scenarios.

3. Materials and Methods

3.1. Nonlinear Optimization Fundamentals (NOPT)

Let $\mathbf{x} \in \mathbb{R}^P$ be a vector of $P$ variables. The conventional NOPT problem can be summarized as follows:

$$\min_{\mathbf{x}} \; \varrho(\mathbf{x}) \quad \text{s.t.} \quad \boldsymbol{\xi}_{\min} \leq \mathbf{x} \leq \boldsymbol{\xi}_{\max}, \quad \mathbf{h}_L(\mathbf{x}) \leq \mathbf{0}, \quad \mathbf{h}_N(\mathbf{x}) \leq \mathbf{0}; \tag{1}$$

where the objective function $\varrho: \mathbb{R}^P \to \mathbb{R}$ is real-valued, the bound constraints are given by $\boldsymbol{\xi}_{\min}, \boldsymbol{\xi}_{\max} \in \mathbb{R}^P$, and the linear and nonlinear constraints are described by $\mathbf{h}_L: \mathbb{R}^P \to \mathbb{R}^{C_L}$ and $\mathbf{h}_N: \mathbb{R}^P \to \mathbb{R}^{C_N}$, with $C_L, C_N \in \mathbb{N}$.
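To make Equation (1) concrete, the following sketch solves a small hypothetical instance (our own toy objective and constraints, not a problem from the paper) with bound, linear, and nonlinear constraints using SciPy:

```python
import numpy as np
from scipy.optimize import minimize, LinearConstraint, NonlinearConstraint

# Objective rho(x): a simple nonconvex function of P = 2 variables.
rho = lambda x: (x[0] - 1.0) ** 2 + np.sin(3.0 * x[1])

# Bound constraints xi_min <= x <= xi_max.
bounds = [(-2.0, 2.0), (-2.0, 2.0)]
# Linear constraint h_L(x) <= 0, here x0 + x1 - 1 <= 0.
h_L = LinearConstraint([[1.0, 1.0]], -np.inf, 1.0)
# Nonlinear constraint h_N(x) <= 0, here x0**2 + x1**2 - 2 <= 0.
h_N = NonlinearConstraint(lambda x: x[0] ** 2 + x[1] ** 2, -np.inf, 2.0)

res = minimize(rho, x0=np.zeros(2), bounds=bounds,
               constraints=[h_L, h_N], method="trust-constr")
print(res.x, res.fun)
```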
Figure 1 depicts the main pipeline of the classical approaches for NOPT. First, it includes the physical system’s parameters, constraints, limits, and the objective function to be optimized. Second, starting from an initial point, the optimization algorithm iterates until convergence. Of note, the number of iterations, the level of improvement, and the objective function thresholding are the relevant stopping criteria to return the final output.

3.2. Regularized Physics-Informed Neural Network (RPINN)

Let $\{y_r \in \mathcal{Y}, \mathbf{z}_r \in \mathcal{Z}\}_{r=1}^{R}$ be an input–output set holding $R$ samples. Our data-driven RPINN approach couples the optimization problem in Equation (1) as a penalty-based loss with bounded constraints on both network weights and learned features, as follows:

$$\begin{aligned} \min_{\tilde{X}, \tilde{Z}} \;\; & \frac{\lambda_L}{R} \sum_{r=1}^{R} \mathcal{L}\big(y_r, \tilde{f}(\mathbf{z}_r|\tilde{X}, \tilde{Z})\big) + \sum_{i=1}^{C_L} \frac{\lambda_{L_i}}{R} \sum_{r=1}^{R} \tilde{h}_{L_i}\big(y_r, \tilde{f}(\mathbf{z}_r|\tilde{X}, \tilde{Z})\big) + \sum_{j=1}^{C_N} \frac{\lambda_{N_j}}{R} \sum_{r=1}^{R} \tilde{h}_{N_j}\big(y_r, \tilde{f}(\mathbf{z}_r|\tilde{X}, \tilde{Z})\big) \\ \text{s.t.} \;\; & \lambda_L + \sum_{i=1}^{C_L} \lambda_{L_i} + \sum_{j=1}^{C_N} \lambda_{N_j} = 1 \\ & \boldsymbol{\zeta}_{\min} \leq \tilde{X} \leq \boldsymbol{\zeta}_{\max} \\ & \boldsymbol{\psi}_{\min} \leq \tilde{f}(\mathbf{z}_r|\tilde{X}, \tilde{Z}) \leq \boldsymbol{\psi}_{\max}, \; \forall r \in R; \end{aligned} \tag{2}$$

where $\tilde{f}: \mathcal{Z} \to \mathcal{Y}$ is an ANN-based mapping function, $\mathcal{L}: \mathcal{Y} \times \mathcal{Y} \to \mathbb{R}$ is a given loss, $\tilde{X}$ holds the network parameters, and $\tilde{Z}$ gathers the features learned along the layers. Also, $\tilde{h}_{L_i}(\cdot,\cdot)$ and $\tilde{h}_{N_j}(\cdot,\cdot)$ are the $i$-th linear and $j$-th nonlinear penalty functions enforcing the NOPT constraints, weighted by the regularization terms $\lambda_L, \lambda_{L_i}, \lambda_{N_j} \in [0, 1]$, with $i \in \{1, 2, \ldots, C_L\}$ and $j \in \{1, 2, \ldots, C_N\}$. Furthermore, $\boldsymbol{\zeta}_{\min}$ and $\boldsymbol{\zeta}_{\max}$ collect the network parameter limits, and $\boldsymbol{\psi}_{\min}$ and $\boldsymbol{\psi}_{\max}$ capture the network output and feature bounds.
For a given input $\mathbf{z} \in \mathcal{Z}$, our deep learning-based function with $\hat{L}$ layers yields:

$$\tilde{f}(\mathbf{z}|\tilde{X}, \tilde{Z}) = (f_{\hat{L}} \circ \cdots \circ f_1 \,|\, \tilde{X}, \tilde{Z})(\mathbf{z}), \qquad \tilde{\mathbf{z}}_l = f_l(\tilde{\mathbf{z}}_{l-1}|\tilde{\mathbf{x}}_l, b_l) = \nu_l\big(\tilde{\mathbf{x}}_l^\top \tilde{\mathbf{z}}_{l-1} + b_l\big). \tag{3}$$

In the $l$-th layer of Equation (3), where $l \in \{1, 2, \ldots, \hat{L}\}$, the weights and bias are $\tilde{\mathbf{x}}_l, b_l \in \tilde{X}$, the learned feature vector is $\tilde{\mathbf{z}}_l \in \tilde{Z}$, and $\nu_l(\cdot)$ is a nonlinear activation function that handles both the network representation and the customized bounds needed to fulfill the limit constraints in Equation (2). Furthermore, the RPINN optimization problem can be solved via gradient descent with AD and back-propagation [70].
It is worth noting that our baseline RPINN studies a supervised scenario for simplicity; by adapting its regularized loss, an unsupervised extension is easily achieved. Figure 2 depicts the main RPINN sketch.
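A minimal sketch of the penalty-based loss in Equation (2), assuming a single linear and a single nonlinear constraint with placeholder penalty functions of our own choosing:

```python
import tensorflow as tf

def rpinn_loss(y, y_hat, lambdas, h_L, h_N):
    # Weighted data-fit term plus constraint penalties (cf. Equation (2));
    # lambdas = (lam_L, lam_L1, lam_N1) are assumed to sum to one.
    lam_L, lam_L1, lam_N1 = lambdas
    data_term = lam_L * tf.reduce_mean(tf.square(y - y_hat))
    return (data_term
            + lam_L1 * tf.reduce_mean(h_L(y, y_hat))
            + lam_N1 * tf.reduce_mean(h_N(y, y_hat)))

# Placeholder penalties: violations of y_hat >= 0 (linear) and of
# ||y_hat||_2^2 <= 1 (nonlinear), both enforced softly via ReLU.
h_L = lambda y, y_hat: tf.nn.relu(-y_hat)
h_N = lambda y, y_hat: tf.nn.relu(tf.reduce_sum(tf.square(y_hat), -1) - 1.0)

y = tf.constant([[0.2, 0.8]])
y_hat = tf.constant([[0.3, 0.9]])
loss = rpinn_loss(y, y_hat, (0.8, 0.1, 0.1), h_L, h_N)
```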

4. Tested Scenarios for NOPT Using RPINN

We study two main datasets to test our RPINN as a data-driven NOPT approach: (i) a constrained uniform mixture model with nonlinear loss and supervised target, and (ii) a constrained flow and pressure gas-powered system optimization with unsupervised loss. Below, we provide a detailed description of each experiment.

4.1. Supervised Constrained Optimization: Uniform Mixture Model

This task comprises a linear and bound-constrained optimization of a nonlinear cost [71]:

$$\min_{\mathbf{x}} \; \sum_{r=1}^{R} \big\| y_r - \mathbf{x}^\top \mathbf{z}_r \big\|_2^2 \quad \text{s.t.} \quad \mathbf{0} \leq \mathbf{x} \leq \mathbf{1}, \; \mathbf{x}^\top \mathbf{1} = 1; \tag{4}$$

where $y_r \in \mathbb{R}_+$ is the $r$-th target output, $\mathbf{x} \in \mathbb{R}^P$ denotes the mixing coefficients, and $\mathbf{z}_r \in \mathbb{R}^P$ holds random samples whose $p$-th element is drawn from a uniform distribution as $z_{rp} \sim \mathcal{U}(z \,|\, p-1, p)$. $\mathbf{0}$ and $\mathbf{1}$ are all-zero and all-one vectors of proper size. Figure 3 depicts the uniform mixture model task.
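For illustration, the input–output pairs of Equation (4) can be generated as follows (our sketch; the sample count and dimensionality match the set-up later used in Section 5.2):

```python
import numpy as np

rng = np.random.default_rng(42)
R, P = 500, 5
# The p-th entry of each sample is uniform on [p-1, p], p = 1..P.
Z = rng.uniform(low=np.arange(P), high=np.arange(P) + 1, size=(R, P))
# Ground-truth mixing coefficients on the simplex: 0 <= x <= 1, sum = 1.
x_true = rng.random(P)
x_true /= x_true.sum()
y = Z @ x_true  # targets y_r = x^T z_r (noise can be added on top)
```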
The optimization problem in Equation (4) can be solved through our RPINN as follows:

$$\min_{\tilde{X}} \; \frac{\lambda_L}{R} \sum_{r=1}^{R} \mathcal{L}_H\big(y_r, \tilde{f}(\mathbf{z}_r|\tilde{X}); \epsilon\big) + \lambda_{L_1} \big\| \tilde{\mathbf{x}}_{\hat{L}} \big\|_1 \quad \text{s.t.} \quad \lambda_L + \lambda_{L_1} = 1, \; \lambda_L, \lambda_{L_1} \in [0, 1], \; \mathbf{0} \leq \tilde{\mathbf{x}}_{\hat{L}} \leq \mathbf{1}. \tag{5}$$
For concrete testing and to mitigate noisy samples, a Huber-based loss is used in Equation (5):

$$\mathcal{L}_H\big(y, \tilde{f}(\mathbf{z}|\tilde{X}); \epsilon\big) = \begin{cases} \frac{1}{2}\big(y - \tilde{f}(\mathbf{z}|\tilde{X})\big)^2 & \big|y - \tilde{f}(\mathbf{z}|\tilde{X})\big| \leq \epsilon \\ \epsilon \cdot \big(\big|y - \tilde{f}(\mathbf{z}|\tilde{X})\big| - \frac{1}{2}\epsilon\big) & \big|y - \tilde{f}(\mathbf{z}|\tilde{X})\big| > \epsilon, \end{cases} \tag{6}$$

where $\epsilon \in \mathbb{R}_+$. Next, we fix a scaled exponential linear unit (SELU) activation for the network function composition as follows:

$$\mathrm{SELU}(x) = \begin{cases} \theta x & x > 0 \\ \theta \vartheta \cdot (e^x - 1) & x \leq 0, \end{cases} \tag{7}$$

where $\theta, \vartheta \in \mathbb{R}$. Then, to fulfill the NOPT limit restrictions in Equation (4), the RPINN weights at the output layer $\hat{L}$, $\tilde{\mathbf{x}}_{\hat{L}}$, hold an $\ell_1$-based max constraint.
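In Keras terms, the pieces above can be assembled as follows; the constraint class is our hedged reading of the $\ell_1$-based max constraint, projecting the output-layer weights onto the set $\{\mathbf{0} \leq \tilde{\mathbf{x}} \leq \mathbf{1}, \|\tilde{\mathbf{x}}\|_1 \leq 1\}$:

```python
import tensorflow as tf

class L1BoxConstraint(tf.keras.constraints.Constraint):
    # Hypothetical projection enforcing 0 <= w <= 1 and ||w||_1 <= 1,
    # our interpretation of the l1-based max constraint in Equation (5).
    def __call__(self, w):
        w = tf.clip_by_value(w, 0.0, 1.0)          # box constraint
        norm = tf.reduce_sum(tf.abs(w))            # l1 norm
        return tf.where(norm > 1.0, w / norm, w)   # rescale if violated

huber = tf.keras.losses.Huber(delta=1.0)  # delta plays the role of epsilon

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="selu", input_shape=(5,)),
    tf.keras.layers.Dense(1, use_bias=False,
                          kernel_constraint=L1BoxConstraint()),
])
model.compile(optimizer="adam", loss=huber)
```

Note that the built-in Keras SELU uses fixed values of $\theta$ and $\vartheta$; making them trainable would require a custom activation layer.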

4.2. Unsupervised Constrained Optimization: Gas-Powered System

We study a gas-powered system as a function of flow and pressure. For this purpose, a synthetic network of eight nodes is used as detailed in [72] and illustrated in Figure 4.
In particular, the NOPT problem is written as:

$$\begin{aligned} \min_{\mathbf{x}, \boldsymbol{\pi}} \;\; & \mathbf{x}^\top \mathbf{a} \\ \text{s.t.} \;\; & \mathbf{B}\mathbf{x} = \mathbf{z} \\ & x_q = \mathrm{sgn}\big(\pi_{w(q)}^2 - \pi_{w'(q)}^2\big)\, k_q \sqrt{\big|\pi_{w(q)}^2 - \pi_{w'(q)}^2\big|}, \; \forall q \in Q; \; w(q), w'(q) \in W \\ & \beta_{\min}(n, n') \leq \frac{\pi_{n'}}{\pi_n} \leq \beta_{\max}(n, n'), \; \forall n, n' \in V \\ & \boldsymbol{\gamma}_{\min} \leq \boldsymbol{\pi} \leq \boldsymbol{\gamma}_{\max} \\ & \boldsymbol{\delta}_{\min} \leq \mathbf{x} \leq \boldsymbol{\delta}_{\max}, \end{aligned} \tag{8}$$

where $\mathbf{a} \in \mathbb{R}^P$ represents the gas transport costs for the $P$ flows in $\mathbf{x} \in \mathbb{R}^P$. The incidence matrix $\mathbf{B} \in \mathbb{R}^{W \times P}$ encodes the gas network structure, with $W$ nodes and $\mathbf{z} \in \mathbb{R}^W$ the input gas demand. The first equality constraint encodes the linear flow and gas demand equilibrium along the network nodes. Next, the node pressures are stored in $\boldsymbol{\pi} \in \mathbb{R}^W$. In turn, the $q$-th flow $x_q \in \mathbf{x}$ is selected according to the network structure in $\mathbf{B}$ to fulfill the Weymouth equality with constant $k_q \in \mathbb{R}$ and $Q \leq P$ [53]. Then, the functions $w(q)$ and $w'(q)$ extract the related pressures $\pi_{w(q)}, \pi_{w'(q)} \in \boldsymbol{\pi}$ for such a Weymouth-based physical constraint. Furthermore, $\pi_n, \pi_{n'} \in \boldsymbol{\pi}$ denote the inlet and outlet pressures that must fulfill the system compression ratio, with $V$ components ($n, n' \in \{1, 2, \ldots, V\}$, $V \leq W$) and compression factor limits $\beta_{\min}(n, n'), \beta_{\max}(n, n') \in \mathbb{R}_+$. Also, $\boldsymbol{\gamma}_{\min}, \boldsymbol{\gamma}_{\max} \in \mathbb{R}^W$ and $\boldsymbol{\delta}_{\min}, \boldsymbol{\delta}_{\max} \in \mathbb{R}^P$ are the minimum and maximum pressure and flow limits, respectively.
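For illustration, the Weymouth relation and node balance can be evaluated directly (a toy two-pipe, three-node line network of our own construction, not the eight-node case study):

```python
import numpy as np

# Toy network: 3 nodes in a line, 2 pipes; B is the node-pipe
# incidence matrix (+1 leaving, -1 entering each node).
B = np.array([[ 1.0,  0.0],
              [-1.0,  1.0],
              [ 0.0, -1.0]])
k = np.array([0.8, 1.1])            # Weymouth constants k_q
pi = np.array([60.0, 55.0, 50.0])   # node pressures

# Weymouth flow on pipe q between inlet w(q) and outlet w'(q):
# x_q = sgn(pi_w^2 - pi_w'^2) * k_q * sqrt(|pi_w^2 - pi_w'^2|).
inlet, outlet = np.array([0, 1]), np.array([1, 2])
d2 = pi[inlet] ** 2 - pi[outlet] ** 2
x = np.sign(d2) * k * np.sqrt(np.abs(d2))

# Node balance B x = z: the net demand these pressures can serve.
z = B @ x
print(x, z)
```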
Now, let $\{\mathbf{z}_r \in \mathbb{R}^W\}_{r=1}^{R}$ be an unsupervised input set holding the required gas demand for $R$ observations. Our RPINN solution of Equation (8) is as follows:

$$\begin{aligned} \min_{\{\tilde{\mathbf{z}}_r, \tilde{\boldsymbol{\pi}}_r\}_{r=1}^{R}} \;\; & \frac{\lambda_L}{R} \sum_{r=1}^{R} \tilde{\mathbf{z}}_r^\top \mathbf{a} + \frac{\lambda_{L_1}}{R} \sum_{r=1}^{R} \tilde{h}_{L_1}(\mathbf{B}\tilde{\mathbf{z}}_r, \mathbf{z}_r; \epsilon_{L_1}) + \frac{\lambda_{N_1}}{R} \sum_{r=1}^{R} \tilde{h}_{N_1}(\tilde{\mathbf{z}}_r, \tilde{\boldsymbol{\pi}}_r; \mathbf{B}, \epsilon_{N_1}) + \frac{\lambda_{N_2}}{R} \sum_{r=1}^{R} \tilde{h}_{N_2}(\tilde{\boldsymbol{\pi}}_r; \mathbf{B}, \boldsymbol{\beta}, \epsilon_{N_2}) \\ \text{s.t.} \;\; & \lambda_L + \lambda_{L_1} + \lambda_{N_1} + \lambda_{N_2} = 1 \\ & \tilde{f} = \{\tilde{f}', \tilde{f}''\}, \; \tilde{\mathbf{z}}_r = \tilde{f}'(\mathbf{z}_r|\tilde{X}, \tilde{Z}), \; \tilde{\boldsymbol{\pi}}_r = \tilde{f}''(\mathbf{z}_r|\tilde{X}, \tilde{Z}) \\ & \boldsymbol{\gamma}_{\min} \leq \tilde{\boldsymbol{\pi}}_r \leq \boldsymbol{\gamma}_{\max} \\ & \boldsymbol{\delta}_{\min} \leq \tilde{\mathbf{z}}_r \leq \boldsymbol{\delta}_{\max}, \; \forall r \in R. \end{aligned} \tag{9}$$

Given the $r$-th gas demand vector $\mathbf{z}_r \in \mathbb{R}^W$, $\tilde{\mathbf{z}}_r \in \mathbb{R}^P$ predicts the flow vector through the subnetwork $\tilde{f}'$, and $\tilde{\boldsymbol{\pi}}_r \in \mathbb{R}^W$ the corresponding pressure vector through $\tilde{f}''$. Moreover:
$$\begin{aligned} \tilde{h}_{L_1}(\mathbf{B}\tilde{\mathbf{z}}_r, \mathbf{z}_r; \epsilon_{L_1}) &= \mathcal{L}_H(\mathbf{B}\tilde{\mathbf{z}}_r, \mathbf{z}_r; \epsilon_{L_1}) \\ \tilde{h}_{N_1}(\tilde{\mathbf{z}}_r, \tilde{\boldsymbol{\pi}}_r; \mathbf{B}, \epsilon_{N_1}) &= \frac{1}{Q} \sum_{q=1}^{Q} \mathcal{L}_H\big(\tilde{z}_{rq}, \varphi_q(\tilde{\boldsymbol{\pi}}_r; \mathbf{B}); \epsilon_{N_1}\big) \\ \tilde{h}_{N_2}(\tilde{\boldsymbol{\pi}}_r; \mathbf{B}, \boldsymbol{\beta}, \epsilon_{N_2}) &= \frac{1}{V^2} \sum_{n, n' \in V} \tilde{\mathcal{L}}_H\big(\tilde{\pi}_{rn}, \tilde{\pi}_{rn'}; \mathbf{B}, \boldsymbol{\beta}, \epsilon_{N_2}\big), \end{aligned} \tag{10}$$
where the notation $\mathcal{L}_H(\cdot, \cdot; \epsilon_\cdot)$ stands for a Huber-based penalty (see Equation (6)), and $\boldsymbol{\varphi}(\tilde{\boldsymbol{\pi}}_r; \mathbf{B}) \in \mathbb{R}^Q$ holds elements:

$$\varphi_q(\tilde{\boldsymbol{\pi}}_r; \mathbf{B}) = \mathrm{sgn}\big(\tilde{\pi}_{r w(q)}^2 - \tilde{\pi}_{r w'(q)}^2\big)\, k_q \sqrt{\big|\tilde{\pi}_{r w(q)}^2 - \tilde{\pi}_{r w'(q)}^2\big|}, \; \forall q \in Q, \tag{11}$$
and:

$$\tilde{\mathcal{L}}_H(\tilde{\pi}_{rn}, \tilde{\pi}_{rn'}; \mathbf{B}, \boldsymbol{\beta}, \epsilon_{N_2}) = \begin{cases} 0 & \beta_{\min}(n, n') \leq \dfrac{\tilde{\pi}_{rn'}}{\tilde{\pi}_{rn}} \leq \beta_{\max}(n, n') \\ \left( \dfrac{\tilde{\pi}_{rn'}}{\tilde{\pi}_{rn}} - 0.5\big(\beta_{\min}(n, n') + \beta_{\max}(n, n')\big) \right)^2 & \text{otherwise}. \end{cases} \tag{12}$$
It is worth mentioning that the custom penalty in Equation (10) aims to deal with noisy inputs while preserving the NOPT limits and constraints. In particular, $\tilde{\mathcal{L}}_H(\cdot, \cdot; \mathbf{B}, \boldsymbol{\beta}, \epsilon_{N_2})$ penalizes pressure ratios that are far from the middle of the compression factor range given by $\beta_{\min}(n, n'), \beta_{\max}(n, n') \in \boldsymbol{\beta}$. Finally, a scaled sigmoid function $\tilde{\sigma}(\cdot) \in [u_{\min}, u_{\max}]$ addresses the predicted flow and pressure limits in Equation (9) as:

$$\tilde{\sigma}(x) = \alpha \frac{1}{1 + e^{-x}} + \iota, \tag{13}$$

where $\alpha, \iota \in \mathbb{R}$.
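A sketch of the compression-ratio penalty in Equation (12) and the scaled sigmoid in Equation (13); parameterizing $\alpha$ and $\iota$ from the target interval $[u_{\min}, u_{\max}]$ is our assumption about how the scaling is set:

```python
import tensorflow as tf

def compression_penalty(ratio, beta_min, beta_max):
    # Zero inside [beta_min, beta_max]; squared distance to the
    # interval midpoint outside of it (cf. Equation (12)).
    mid = 0.5 * (beta_min + beta_max)
    inside = (ratio >= beta_min) & (ratio <= beta_max)
    return tf.where(inside, tf.zeros_like(ratio), tf.square(ratio - mid))

def scaled_sigmoid(x, u_min, u_max):
    # sigma_tilde(x) = alpha / (1 + e^{-x}) + iota, with alpha =
    # u_max - u_min and iota = u_min so the output spans [u_min, u_max].
    return (u_max - u_min) * tf.sigmoid(x) + u_min

ratios = tf.constant([0.9, 1.2, 1.8])
print(compression_penalty(ratios, beta_min=1.0, beta_max=1.5))
```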

5. Experimental Set-Up

The scenarios in Section 4 are used to test our RPINN in both supervised and unsupervised settings, assessing sample variability, noisy input measurements, and nonlinear constraints.

5.1. Deep Learning Architectures

To address the uniform mixture model NOPT (supervised constrained optimization), our RPINN consists of two dense layers as shown in Figure 5 and Table 2.
Next, as seen in Figure 6 and Table 3, a wide ANN architecture is proposed for our RPINN-based gas-powered system scenario. The sketch focuses on the essential variables (flows and pressures), adapting the network to the unique characteristics of the gas network. To achieve this, our model incorporates blocks of dense layers designed to map the input data, as well as batch normalization layers that help stabilize and normalize the features and gradients during back-propagation. Additionally, it includes custom layers named custom dense, bounded dense, source switching, and unsupply gas switching, which we design to encode the source behavior of the system, manage unmet demand, and delineate the system boundaries.
As seen, a shallow and straightforward architecture suffices for the uniform mixture model, a comparatively simple NOPT task with a fixed structure. In addition, to mitigate overfitting while accommodating numerous constraints and a linear loss in the gas-powered system, we implement a shallow and wide network. Nevertheless, our RPINN approach is adaptable in terms of network architecture, enabling the implementation of more complex schemes as needed.

5.2. Training Details and Method Comparison

To evaluate the effectiveness of our methodology in addressing optimization problems, we use the mean absolute percentage error (MAPE) as the primary performance measure across all experiments, defined as:

$$\mathrm{MAPE}(\tilde{y}_r, \hat{y}_r) = \frac{100}{R} \sum_{r=1}^{R} \left| \frac{\tilde{y}_r - \hat{y}_r}{\tilde{y}_r} \right| \; [\%], \tag{14}$$

where $\tilde{y}_r, \hat{y}_r \in \mathbb{R}$ stand for the $r$-th target and predicted values, $\mathrm{MAPE}(\cdot, \cdot) \in [0, 100]\,[\%]$, and $|\cdot|$ is the absolute value operator.
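Equivalently, in NumPy (our sketch, assuming nonzero targets):

```python
import numpy as np

def mape(y_true, y_pred):
    # Mean absolute percentage error as in Equation (14), in percent.
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))
```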
Now, for the uniform mixture model, we generate 500 samples, each composed of five variables. We train our RPINN architectures on a total of 400 samples, allocating 30% for the validation phase, and use the remaining 100 samples to evaluate the model's performance. To assess how well the NOPT performs with noisy inputs, we add white Gaussian noise to the model output while keeping the signal-to-noise ratio (SNR) within the set $\{-1, 3, 5\}$. Further, for the gas-powered system, we define three distinct scenarios to evaluate the network's capacity under varying demand conditions. This process yields a total of 20,000 samples, of which 30% is designated for testing. We produce 320 samples using GEKKO v1.0.6 to compare the model's performance with IPOPT v3.12 [73].
We implement RPINN using Python 3.10.12 and the TensorFlow API 2.15.0 on Google Colaboratory. For training in the supervised constrained optimization, we fix 600 epochs, a batch size of 32 samples, an Adam optimizer, and a learning rate of $1 \times 10^{-3}$. Likewise, the unsupervised constrained NOPT scenario uses a batch size of 256 and an Adamax optimizer, with an initial learning rate of $1 \times 10^{-2}$ and a decreasing schedule. The regularization hyperparameters, namely $\lambda_\cdot$ in Equation (2), are experimentally fixed within the range $[0, 1]$. Since IPOPT excels at solving NOPT problems and is openly accessible, we adopt it as the comparison method [50]. Our codes and studied datasets are publicly available at https://github.com/UN-GCPDS/python-gcpds.optimization (accessed on 1 March 2024).
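The training configuration above maps onto standard Keras components; the exponential form of the decreasing schedule below is our assumption, since only a decreasing schedule is stated:

```python
import tensorflow as tf

# Supervised scenario: Adam with a fixed learning rate of 1e-3.
sup_opt = tf.keras.optimizers.Adam(learning_rate=1e-3)

# Unsupervised scenario: Adamax starting at 1e-2 with a decreasing
# schedule; the exponential decay parameters here are placeholders.
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-2, decay_steps=1000, decay_rate=0.9)
unsup_opt = tf.keras.optimizers.Adamax(learning_rate=schedule)

# Training then proceeds with, e.g., 600 epochs and batch size 32:
# model.compile(optimizer=sup_opt, loss="huber")
# model.fit(Z_train, y_train, epochs=600, batch_size=32,
#           validation_split=0.3)
```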

6. Results and Discussion

6.1. Supervised Constrained Optimization Results

As shown in Figure 7 (left) and Figure 8, for noise-free data on the uniform mixture model scenario, both our proposal and the IPOPT solution exhibit similar results. The similarity stems from the fact that the problem defined in Equation (4) is convex. Next, for noisy inputs, our RPINN, based on the Huber loss function, shows greater robustness against data variability and noise. In fact, the Huber function applies the $\ell_1$-norm for errors exceeding a defined threshold, reducing sensitivity to extreme values, while for smaller errors it uses the $\ell_2$-norm, ensuring accuracy by penalizing small deviations. In contrast, the classical IPOPT technique uses an objective function based on the $\ell_2$-norm, which is sensitive to outliers because it heavily penalizes large deviations. The weight distributions support this hypothesis: noise-free data lead to similarly accurate predictions for both RPINN and IPOPT, whereas for noisy inputs our proposal regularizes the network weights, yielding concentrated values that capture the main output dynamics, and outperforms IPOPT regarding the MAPE for all considered SNR values.

6.2. Unsupervised Constrained Optimization Results

Figure 9 depicts our RPINN regularized penalty illustration for the gas-powered system NOPT. We adopt a standard variant of the Huber loss for the node balance and Weymouth constraints. As shown, the threshold $\epsilon_\cdot$ governs the transition between the $\ell_1$ and $\ell_2$ norms. Regarding the compression ratio limit constraint, its inequality behavior makes it essential to alter the penalty structure. This enhancement stabilizes the transition between the $\ell_2$ and $\ell_1$ norms at zero, based on the distance to the central value of the required range. Furthermore, it is crucial to correctly integrate these cost functions into our RPINN. The right plot in Figure 9 shows the evolution of the Weymouth (blue), compression ratio (orange), and compression factor (green) constraint penalties. The resulting loss shows a decreasing trend, indicating that the Huber-based approach can handle the physical limitations of the gas-powered NOPT.
In turn, we design three evaluation scenarios in comparison with the IPOPT framework to validate the performance of the regularization functions in data generation. In the first scenario, data remain below the source's maximum capacity. In the second scenario, 50 percent of the samples exceed this capacity, while in the third, nearly 100 percent of the data surpass it. Figure 10 shows that even though IPOPT achieves a lower MAPE, its precision (variance) fluctuates considerably across iterations, indicating that conventional NOPT methods are not robust to data variability and nonlinear constraints. In contrast, our RPINN achieves an acceptable MAPE with low variability across experiments due to its ANN-based regularized strategy. In fact, both approaches attain similar costs and adhere to the compression ratio constraints. In the first two cases, traditional solutions satisfy the Weymouth equation better than ours, but in the third case our proposal prevails because it is more stable and less affected by outliers, thanks to the Huber-based penalty.
Finally, to verify that our RPINN model can properly handle the limits in Equation (9), we examine the outputs and behavior of the flow and pressure prediction layers, as shown in Figure 11. The analyzed parameters, including injection and pipe flows as well as pipeline pressures, remain within acceptable limits. This behavior is attributed to the custom activation in Equation (13), which ensures a smooth and steady transition between the established ranges.

6.3. Computational Cost Results

Figure 12 shows the training and prediction times required by RPINN compared to IPOPT. Our model needs more processing time during the training phase because it performs both forward and backward passes in each iteration within an ANN-based framework. However, in the prediction phase, RPINN outperforms IPOPT, yielding significantly shorter prediction times, since our approach only requires forward passes once the weights are trained. These results demonstrate the capability of RPINN to generate fast and accurate predictions for NOPT solutions, not only reducing processing times but also narrowing interquartile ranges.

6.4. Limitations

The RPINN framework, while innovative and effective in addressing many challenges of NOPT, has several limitations that need to be considered. One significant limitation is the complexity involved in defining appropriate loss functions and selecting optimal hyperparameters, which can make the implementation process cumbersome. Additionally, extremely high levels of noise or complex nonlinear constraints can hinder the performance of RPINN, despite its robustness against data variability and noisy inputs. Although AD has improved the model’s scalability, it may still face challenges when applied to very large-scale problems due to computational resource limitations.
Furthermore, integrating precise physical principles into the network architecture can be intricate and may not always generalize well across different types of NOPT problems. Current trends in PINNs emphasize improving these models’ generalization capabilities and computational efficiency [74]. To better solve the problems of scalability and accuracy, researchers are focusing on hybrid approaches that mix PINNs with other advanced optimization methods, like metaheuristics and gradient-based methods. The latter indicates a growing recognition of the need for more flexible and adaptive frameworks that can handle a broader range of NOPT scenarios.

7. Conclusions

We introduced a novel regularized physics-informed neural network (RPINN) framework that presents a significant advancement in addressing the challenges associated with nonlinear constrained optimization. By integrating custom activation functions and regularization penalties within an ANN architecture, RPINN effectively handles data variability and noisy inputs. The incorporation of physics principles into the network architecture allows the optimization variables to be computed from network weights and learned features, leading to competitive performance compared to state-of-the-art solvers. Furthermore, the use of automatic differentiation for training enhances scalability and reduces computation time, making RPINN a robust solution for various NOPT tasks. Experimental results covered two scenarios involving supervised and unsupervised datasets.
The uniform mixture model experiments (supervised constrained NOPT) show that RPINN copes well with data variability and noisy samples. For noise-free data, both RPINN and the IPOPT solver achieve similar results due to the convex nature of the problem. Still, in scenarios with noisy inputs, RPINN significantly outperforms IPOPT. Leveraging the Huber loss function, the RPINN framework shows greater robustness against noise by effectively regularizing the network weights. This results in more accurate and stable output predictions compared to IPOPT, which relies on an $\ell_2$-norm-based objective function and is more sensitive to outliers. The RPINN weight distributions are concentrated, indicating that the model captures the main output dynamics even in the presence of noise, as evidenced by the lower mean absolute percentage error across all signal-to-noise ratio values.
Then, the results of the gas-powered system (unsupervised constrained optimization) highlight the capability of the RPINN framework to effectively manage complex, nonlinear constraints under varying gas demand conditions. Compared to the IPOPT framework, RPINN shows consistent performance with low variability in the mean absolute percentage error, especially when the gas demand exceeds the source's maximum capacity. While IPOPT shows a lower MAPE in terms of node balance and Weymouth constraints, its precision fluctuates significantly with data variability. In contrast, RPINN maintains stable performance, ensuring compliance with physical constraints such as the Weymouth equation and compression ratio limits. The custom penalty functions within RPINN facilitate this stability, proving particularly valuable where traditional methods struggle with outliers and extreme values. Overall, RPINN offers a robust, scalable solution with reduced prediction times.
As future work, the authors plan to include Bayesian hyperparameter optimization for RPINN fine-tuning [75]. We will also investigate normalized and information-theoretic learning-based losses as ways to deal with noisy inputs and complicated constraints [76,77]. Finally, Bayesian PINNs and graph neural networks will be coupled with our RPINN to enhance representation learning [67,78].

Author Contributions

Conceptualization, D.A.P.-R., A.M.Á.-M. and C.G.C.-D.; data curation, D.A.P.-R.; methodology, D.A.P.-R., A.M.Á.-M. and C.G.C.-D.; project administration, A.M.Á.-M.; supervision, A.M.Á.-M. and C.G.C.-D.; resources, D.A.P.-R. and A.M.Á.-M. All authors have read and agreed to the published version of the manuscript.

Funding

This work was developed under grants provided by the projects: “Desarrollo de una herramienta para la planeación a largo plazo de la operación del sistema de transporte de gas natural en Colombia” (Minciencias-contrato 184-2021) and “Sistema prototipo de visión por computador utilizando aprendizaje profundo como soporte al monitoreo de zonas urbanas desde unidades aéreas no tripuladas-HERMES 55261” (Universidad Nacional de Colombia).

Data Availability Statement

The publicly available dataset analyzed in this study can be found at https://github.com/UN-GCPDS/python-gcpds.optimization (accessed on 1 March 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Stanimirović, P.S.; Ivanov, B.; Ma, H.; Mosić, D. A survey of gradient methods for solving nonlinear optimization. Electron. Res. Arch. 2020, 28, 1573–1624. [Google Scholar] [CrossRef]
  2. Abdulkadirov, R.; Lyakhov, P.; Nagornov, N. Survey of optimization algorithms in modern neural networks. Mathematics 2023, 11, 2466. [Google Scholar] [CrossRef]
  3. Chen, Q.; Zuo, L.; Wu, C.; Bu, Y.; Lu, Y.; Huang, Y.; Chen, F. Short-term supply reliability assessment of a gas pipeline system under demand variations. Reliab. Eng. Syst. Saf. 2020, 202, 107004. [Google Scholar] [CrossRef]
  4. Yu, W.; Huang, W.; Wen, Y.; Li, Y.; Liu, H.; Wen, K.; Gong, J.; Lu, Y. An integrated gas supply reliability evaluation method of the large-scale and complex natural gas pipeline network based on demand-side analysis. Reliab. Eng. Syst. Saf. 2021, 212, 107651. [Google Scholar] [CrossRef]
  5. Kohjitani, H.; Koda, S.; Himeno, Y.; Makiyama, T.; Yamamoto, Y.; Yoshinaga, D.; Wuriyanghai, Y.; Kashiwa, A.; Toyoda, F.; Zhang, Y.; et al. Gradient-based parameter optimization method to determine membrane ionic current composition in human induced pluripotent stem cell-derived cardiomyocytes. Sci. Rep. 2022, 12, 19110. [Google Scholar] [CrossRef] [PubMed]
  6. Shcherbakova, G.; Krylov, V.; Qianqi, W.; Rusyn, B.; Sachenko, A.; Bykovyy, P.; Zahorodnia, D.; Kopania, L. Optimization methods on the wavelet transformation base for technical diagnostic information systems. In Proceedings of the 2021 11th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS), Cracow, Poland, 22–25 September 2021; Volume 2, pp. 767–773. [Google Scholar]
  7. Weiner, A.; Semaan, R. Backpropagation and gradient descent for an optimized dynamic mode decomposition. arXiv 2023, arXiv:2312.12928. [Google Scholar]
  8. Han, M.; Du, Z.; Yuen, K.F.; Zhu, H.; Li, Y.; Yuan, Q. Walrus optimizer: A novel nature-inspired metaheuristic algorithm. Expert Syst. Appl. 2024, 239, 122413. [Google Scholar] [CrossRef]
  9. Mhanna, S.; Mancarella, P. An exact sequential linear programming algorithm for the optimal power flow problem. IEEE Trans. Power Syst. 2021, 37, 666–679. [Google Scholar] [CrossRef]
  10. Chang, H.; Chen, Q.; Lin, R.; Shi, Y.; Xie, L.; Su, H. Controlling Pressure of Gas Pipeline Network Based on Mixed Proximal Policy Optimization. In Proceedings of the 2022 China Automation Congress (CAC), Xiamen, China, 25–27 November 2022; pp. 4642–4647. [Google Scholar]
  11. Wang, G.; Zhao, W.; Qiu, R.; Liao, Q.; Lin, Z.; Wang, C.; Zhang, H. Operational optimization of large-scale thermal constrained natural gas pipeline networks: A novel iterative decomposition approach. Energy 2023, 282, 128856. [Google Scholar] [CrossRef]
  12. Montoya, O.; Gil-González, W.; Hernández, J.C.; Giral-Ramírez, D.A.; Medina-Quesada, A. A mixed-integer nonlinear programming model for optimal reconfiguration of DC distribution feeders. Energies 2020, 13, 4440. [Google Scholar] [CrossRef]
  13. Robuschi, N.; Zeile, C.; Sager, S.; Braghin, F. Multiphase mixed-integer nonlinear optimal control of hybrid electric vehicles. Automatica 2021, 123, 109325. [Google Scholar] [CrossRef]
  14. Arya, A.K.; Jain, R.; Yadav, S.; Bisht, S.; Gautam, S. Recent trends in gas pipeline optimization. Mater. Today Proc. 2022, 57, 1455–1461. [Google Scholar] [CrossRef]
  15. Sadat, S.A.; Sahraei-Ardakani, M. Customized sequential quadratic programming for solving large-scale ac optimal power flow. In Proceedings of the 2021 North American Power Symposium (NAPS), College Station, TX, USA, 14–16 November 2021; pp. 1–6. [Google Scholar]
  16. Awwal, A.M.; Kumam, P.; Abubakar, A.B. A modified conjugate gradient method for monotone nonlinear equations with convex constraints. Appl. Numer. Math. 2019, 145, 507–520. [Google Scholar] [CrossRef]
  17. Gao, H.; Li, Z. A benders decomposition based algorithm for steady-state dispatch problem in an integrated electricity-gas system. IEEE Trans. Power Syst. 2021, 36, 3817–3820. [Google Scholar] [CrossRef]
  18. Wang, Y.; Gao, S.; Zhou, M.; Yu, Y. A multi-layered gravitational search algorithm for function optimization and real-world problems. IEEE/CAA J. Autom. Sin. 2020, 8, 94–109. [Google Scholar] [CrossRef]
  19. Pillutla, K.; Roulet, V.; Kakade, S.M.; Harchaoui, Z. Modified Gauss-Newton Algorithms under Noise. In Proceedings of the 2023 IEEE Statistical Signal Processing Workshop (SSP), Hanoi, Vietnam, 2–5 July 2023; pp. 51–55. [Google Scholar] [CrossRef]
  20. Jamii, J.; Trabelsi, M.; Mansouri, M.; Mimouni, M.F.; Shatanawi, W. Non-Linear Programming-Based Energy Management for a Wind Farm Coupled with Pumped Hydro Storage System. Sustainability 2022, 14, 11287. [Google Scholar] [CrossRef]
  21. Baydin, A.G.; Pearlmutter, B.A.; Radul, A.A.; Siskind, J.M. Automatic differentiation in machine learning: A survey. J. Mach. Learn. Res. 2018, 18, 1–43. [Google Scholar]
  22. Pan, X.; Chen, M.; Zhao, T.; Low, S.H. DeepOPF: A Feasibility-Optimized Deep Neural Network Approach for AC Optimal Power Flow Problems. IEEE Syst. J. 2023, 17, 673–683. [Google Scholar] [CrossRef]
  23. Nellikkath, R.; Chatzivasileiadis, S. Physics-informed neural networks for ac optimal power flow. Electr. Power Syst. Res. 2022, 212, 108412. [Google Scholar] [CrossRef]
  24. Huang, B.; Wang, J. Applications of Physics-Informed Neural Networks in Power Systems—A Review. IEEE Trans. Power Syst. 2023, 38, 572–588. [Google Scholar] [CrossRef]
  25. Stiasny, J.; Chevalier, S.; Chatzivasileiadis, S. Learning without data: Physics-informed neural networks for fast time-domain simulation. In Proceedings of the 2021 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), Aachen, Germany, 25–28 October 2021; pp. 438–443. [Google Scholar]
  26. Strelow, E.L.; Gerisch, A.; Lang, J.; Pfetsch, M.E. Physics informed neural networks: A case study for gas transport problems. J. Comput. Phys. 2023, 481, 112041. [Google Scholar] [CrossRef]
  27. Applegate, D.; Diaz, M.; Hinder, O.; Lu, H.; Lubin, M.; O’Donoghue, B.; Schudy, W. Practical Large-Scale Linear Programming using Primal-Dual Hybrid Gradient. In Proceedings of the Advances in Neural Information Processing Systems; Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., Vaughan, J.W., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2021; Volume 34, pp. 20243–20257. [Google Scholar]
  28. Zhao, Z.; Liu, S.; Zhou, M.; Abusorrah, A. Dual-objective mixed integer linear program and memetic algorithm for an industrial group scheduling problem. IEEE/CAA J. Autom. Sin. 2020, 8, 1199–1209. [Google Scholar] [CrossRef]
  29. Vo, T.Q.T.; Baiou, M.; Nguyen, V.H.; Weng, P. Improving Subtour Elimination Constraint Generation in Branch-and-Cut Algorithms for the TSP with Machine Learning. In Proceedings of the Learning and Intelligent Optimization; Sellmann, M., Tierney, K., Eds.; Springer International Publishing: Cham, Switzerland, 2023; pp. 537–551. [Google Scholar]
  30. Sun, Y.; Zhang, B.; Ge, L.; Sidorov, D.; Wang, J.; Xu, Z. Day-ahead optimization schedule for gas-electric integrated energy system based on second-order cone programming. CSEE J. Power Energy Syst. 2020, 6, 142–151. [Google Scholar]
  31. Lin, Y.; Zhang, X.; Wang, J.; Shi, D.; Bian, D. Voltage Stability Constrained Optimal Power Flow for Unbalanced Distribution System Based on Semidefinite Programming. J. Mod. Power Syst. Clean Energy 2022, 10, 1614–1624. [Google Scholar] [CrossRef]
  32. Chowdhury, M.M.U.T.; Kamalasadan, S. A new second-order cone programming model for voltage control of power distribution system with inverter-based distributed generation. IEEE Trans. Ind. Appl. 2021, 57, 6559–6567. [Google Scholar] [CrossRef]
  33. Asgharieh Ahari, S.; Kocuk, B. A mixed-integer exponential cone programming formulation for feature subset selection in logistic regression. EURO J. Comput. Optim. 2023, 11, 100069. [Google Scholar] [CrossRef]
  34. Kumar, J.; Rahaman, O. Lower bound limit analysis using power cone programming for solving stability problems in rock mechanics for generalized Hoek–Brown criterion. Rock Mech. Rock Eng. 2020, 53, 3237–3252. [Google Scholar] [CrossRef]
  35. Abubakar, A.B.; Kumam, P. A descent Dai-Liao conjugate gradient method for nonlinear equations. Numer. Algorithms 2019, 81, 197–210. [Google Scholar] [CrossRef]
  36. Chen, J.; Wang, L.; Wang, C.; Yao, B.; Tian, Y.; Wu, Y.S. Automatic fracture optimization for shale gas reservoirs based on gradient descent method and reservoir simulation. Adv. Geo-Energy Res. 2021, 5, 191–201. [Google Scholar] [CrossRef]
  37. Mahapatra, D.; Rajan, V. Multi-task learning with user preferences: Gradient descent with controlled ascent in pareto optimization. In Proceedings of the International Conference on Machine Learning, PMLR, Online conference, 13–18 July 2020; pp. 6597–6607. [Google Scholar]
  38. Karimi, M.; Shahriari, A.; Aghamohammadi, M.; Marzooghi, H.; Terzija, V. Application of Newton-based load flow methods for determining steady-state condition of well and ill-conditioned power systems: A review. Int. J. Electr. Power Energy Syst. 2019, 113, 298–309. [Google Scholar] [CrossRef]
  39. Mannel, F.; Rund, A. A hybrid semismooth quasi-Newton method for nonsmooth optimal control with PDEs. Optim. Eng. 2021, 22, 2087–2125. [Google Scholar] [CrossRef]
  40. Pinheiro, R.B.; Balbo, A.R.; Cabana, T.G.; Nepomuceno, L. Solving Nonsmooth and Discontinuous Optimal Power Flow problems via interior-point lp-penalty approach. Comput. Oper. Res. 2022, 138, 105607. [Google Scholar] [CrossRef]
  41. Delgado, J.A.; Baptista, E.C.; Balbo, A.R.; Soler, E.M.; Silva, D.N.; Martins, A.C.; Nepomuceno, L. A primal–dual penalty-interior-point method for solving the reactive optimal power flow problem with discrete control variables. Int. J. Electr. Power Energy Syst. 2022, 138, 107917. [Google Scholar] [CrossRef]
  42. Liu, B.; Yang, Q.; Zhang, H.; Wu, H. An interior-point solver for AC optimal power flow considering variable impedance-based FACTS devices. IEEE Access 2021, 9, 154460–154470. [Google Scholar] [CrossRef]
  43. Haji, S.H.; Abdulazeez, A.M. Comparison of optimization techniques based on gradient descent algorithm: A review. PalArch’s J. Archaeol. Egypt/Egyptol. 2021, 18, 2715–2743. [Google Scholar]
  44. Ibrahim, I.A.; Hossain, M.J. Low voltage distribution networks modeling and unbalanced (optimal) power flow: A comprehensive review. IEEE Access 2021, 9, 143026–143084. [Google Scholar] [CrossRef]
  45. Goulart, P.; Chen, Y. Clarabel Documentation. 2024. Available online: https://oxfordcontrol.github.io/ClarabelDocs/stable/ (accessed on 12 June 2024).
  46. Gurobi Optimization. 2024. Available online: https://www.gurobi.com/ (accessed on 12 June 2024).
  47. MOSEK. 2024. Available online: https://www.mosek.com/ (accessed on 12 June 2024).
  48. Xpress Optimization. 2024. Available online: https://www.fico.com/en/products/fico-xpress-optimization (accessed on 12 June 2024).
  49. O’Donoghue, B. Operator Splitting for a Homogeneous Embedding of the Linear Complementarity Problem. SIAM J. Optim. 2021, 31, 1999–2023. [Google Scholar] [CrossRef]
  50. Ipopt Deprecated Features. 2024. Available online: https://coin-or.github.io/Ipopt/deprecated.html (accessed on 12 June 2024).
  51. Zimmerman, R.D.; Murillo-Sánchez, C.E. MATPOWER User’s Manual; Zenodo: Tempe, AZ, USA, 2020. [Google Scholar] [CrossRef]
  52. Wang, H.; Murillo-Sanchez, C.E.; Zimmerman, R.D.; Thomas, R.J. On Computational Issues of Market-Based Optimal Power Flow. IEEE Trans. Power Syst. 2007, 22, 1185–1193. [Google Scholar] [CrossRef]
  53. García-Marín, S.; González-Vanegas, W.; Murillo-Sánchez, C. MPNG: A MATPOWER-Based Tool for Optimal Power and Natural Gas Flow Analyses. IEEE Trans. Power Syst. 2022, 39, 5455–5464. [Google Scholar] [CrossRef]
  54. Beal, L.; Hill, D.; Martin, R.; Hedengren, J. GEKKO Optimization Suite. Processes 2018, 6, 106. [Google Scholar] [CrossRef]
  55. Mugel, S.; Kuchkovsky, C.; Sanchez, E.; Fernandez-Lorenzo, S.; Luis-Hita, J.; Lizaso, E.; Orus, R. Dynamic portfolio optimization with real datasets using quantum processors and quantum-inspired tensor networks. Phys. Rev. Res. 2022, 4, 013006. [Google Scholar] [CrossRef]
  56. Diamond, S.; Boyd, S. CVXPY: A Python-embedded modeling language for convex optimization. J. Mach. Learn. Res. 2016, 17, 1–5. [Google Scholar]
  57. Agrawal, A.; Boyd, S. Disciplined quasiconvex programming. arXiv 2020, arXiv:1905.00562. [Google Scholar] [CrossRef]
  58. O’Donoghue, B.; Chu, E.; Parikh, N.; Boyd, S. Conic Optimization via Operator Splitting and Homogeneous Self-Dual Embedding. J. Optim. Theory Appl. 2016, 169, 1042–1068. [Google Scholar] [CrossRef]
  59. Pan, X.; Zhao, T.; Chen, M.; Zhang, S. DeepOPF: A Deep Neural Network Approach for Security-Constrained DC Optimal Power Flow. IEEE Trans. Power Syst. 2021, 36, 1725–1735. [Google Scholar] [CrossRef]
  60. Baker, K. A learning-boosted quasi-newton method for ac optimal power flow. arXiv 2020, arXiv:2007.06074. [Google Scholar]
  61. Zhou, M.; Chen, M.; Low, S.H. DeepOPF-FT: One Deep Neural Network for Multiple AC-OPF Problems With Flexible Topology. IEEE Trans. Power Syst. 2023, 38, 964–967. [Google Scholar] [CrossRef]
  62. Liang, H.; Zhao, C. DeepOPF-U: A Unified Deep Neural Network to Solve AC Optimal Power Flow in Multiple Networks. arXiv 2023, arXiv:2309.12849. [Google Scholar]
  63. Falconer, T.; Mones, L. Leveraging Power Grid Topology in Machine Learning Assisted Optimal Power Flow. IEEE Trans. Power Syst. 2023, 38, 2234–2246. [Google Scholar] [CrossRef]
  64. Misyris, G.S.; Venzke, A.; Chatzivasileiadis, S. Physics-informed neural networks for power systems. In Proceedings of the 2020 IEEE Power & Energy Society General Meeting (PESGM), Montreal, QC, Canada, 2–6 August 2020; pp. 1–5. [Google Scholar]
  65. Misyris, G.S.; Stiasny, J.; Chatzivasileiadis, S. Capturing power system dynamics by physics-informed neural networks and optimization. In Proceedings of the 2021 60th IEEE Conference on Decision and Control (CDC), Austin, TX, USA, 14–17 December 2021; pp. 4418–4423. [Google Scholar]
  66. Habib, A.; Yildirim, U. Developing a physics-informed and physics-penalized neural network model for preliminary design of multi-stage friction pendulum bearings. Eng. Appl. Artif. Intell. 2022, 113, 104953. [Google Scholar] [CrossRef]
  67. Yang, L.; Meng, X.; Karniadakis, G.E. B-PINNs: Bayesian physics-informed neural networks for forward and inverse PDE problems with noisy data. J. Comput. Phys. 2021, 425, 109913. [Google Scholar] [CrossRef]
  68. Schiassi, E.; De Florio, M.; D’Ambrosio, A.; Mortari, D.; Furfaro, R. Physics-informed neural networks and functional interpolation for data-driven parameters discovery of epidemiological compartmental models. Mathematics 2021, 9, 2069. [Google Scholar] [CrossRef]
  69. Raynaud, G.; Houde, S.; Gosselin, F.P. ModalPINN: An extension of physics-informed Neural Networks with enforced truncated Fourier decomposition for periodic flow reconstruction using a limited number of imperfect sensors. J. Comput. Phys. 2022, 464, 111271. [Google Scholar] [CrossRef]
  70. Murphy, K.P. Probabilistic Machine Learning: An Introduction; MIT Press: Cambridge, MA, USA, 2022. [Google Scholar]
  71. González-Vanegas, W.; Álvarez Meza, A.; Hernández-Muriel, J.; Orozco-Gutiérrez, Á. AKL-ABC: An Automatic Approximate Bayesian Computation Approach Based on Kernel Learning. Entropy 2019, 21, 932. [Google Scholar] [CrossRef]
  72. García-Marín, S.; González-Vanegas, W.; Murillo-Sánchez, C. MPNG: MATPOWER-Natural Gas. 2019. Available online: https://github.com/MATPOWER/mpng (accessed on 12 June 2024).
  73. Owerko, D.; Gama, F.; Ribeiro, A. Unsupervised optimal power flow using graph neural networks. arXiv 2022, arXiv:2210.09277. [Google Scholar]
  74. Mustajab, A.H.; Lyu, H.; Rizvi, Z.; Wuttke, F. Physics-Informed Neural Networks for High-Frequency and Multi-Scale Problems Using Transfer Learning. Appl. Sci. 2024, 14, 3204. [Google Scholar] [CrossRef]
  75. Eleftheriadis, P.; Leva, S.; Ogliari, E. Bayesian hyperparameter optimization of stacked bidirectional long short-term memory neural network for the state of charge estimation. Sustain. Energy Grids Netw. 2023, 36, 101160. [Google Scholar] [CrossRef]
  76. Ma, X.; Huang, H.; Wang, Y.; Romano, S.; Erfani, S.; Bailey, J. Normalized loss functions for deep learning with noisy labels. In Proceedings of the International Conference on Machine Learning, PMLR, Online Meeting, 13–18 July 2020; pp. 6543–6553. [Google Scholar]
  77. Jeon, H.J.; Van Roy, B. An Information-Theoretic Framework for Deep Learning. Adv. Neural Inf. Process. Syst. 2022, 35, 3279–3291. [Google Scholar]
  78. Thangamuthu, A.; Kumar, G.; Bishnoi, S.; Bhattoo, R.; Krishnan, N.; Ranu, S. Unravelling the performance of physics-informed graph neural networks for dynamical systems. Adv. Neural Inf. Process. Syst. 2022, 35, 3691–3702. [Google Scholar]
Figure 1. Classical optimization pipeline for NOPT.
Figure 2. Main sketch of the regularized physics-informed neural network (RPINN) for data-driven nonlinear constrained optimization.
Figure 3. Uniform mixture model optimization. (Left): weighted uniform probabilities. (Right): visual representation of the mixing results.
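For reference, the weighted uniform mixture in Figure 3 can be evaluated in a few lines of NumPy. The following is a minimal sketch, assuming K uniform components U(a_k, b_k) combined with nonnegative weights that sum to one; the component supports and weights below are illustrative, not those of the experiment.

import numpy as np

def uniform_mixture_pdf(x, bounds, w):
    # Density of a weighted uniform mixture: f(x) = sum_k w_k * U(a_k, b_k)(x).
    x = np.asarray(x)[:, None]              # evaluation points, shape (N, 1)
    a, b = np.asarray(bounds).T             # component supports, each shape (K,)
    comp = ((x >= a) & (x <= b)) / (b - a)  # U(a_k, b_k) densities, shape (N, K)
    return comp @ w                         # mix with simplex weights

# Illustrative example: three overlapping components.
bounds = [(0.0, 2.0), (1.0, 3.0), (2.0, 5.0)]
w = np.array([0.5, 0.3, 0.2])               # w_k >= 0 and sum_k w_k = 1
grid = np.linspace(-1.0, 6.0, 8)
print(uniform_mixture_pdf(grid, bounds, w))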
Figure 4. Optimizing gas-powered systems. An eight-node gas network is studied: points depict the numbered nodes, arrows indicate flow direction, and trapezoids represent the pressure compressors.
Figure 5. RPINN pipeline for the uniform mixture model-based NOPT.
Figure 6. RPINN pipeline for the gas-powered system-based NOPT.
Figure 7. RPINN uniform mixture model-based NOPT results. First row: SNR = 1. Second row: SNR = 3. Third row: noise-free. Left: output prediction. Right: weight distribution. Green: target. Red: noisy target. Black: RPINN. Blue: IPOPT.
Figure 8. Uniform mixture model MAPE results. Left: output error. Right: weights error. (N): noise-free. (−1), (3), and (5) indicate the SNR value.
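The MAPE scores reported in Figures 8, 10, and 11 follow the standard definition of the mean absolute percentage error. A minimal sketch, assuming the usual formulation with a small guard against division by zero (the paper's exact implementation may differ):

import numpy as np

def mape(y_true, y_pred, eps=1e-8):
    # Mean absolute percentage error (%), guarded against division by zero.
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs(y_true - y_pred) / np.maximum(np.abs(y_true), eps))

print(mape([1.0, 2.0, 4.0], [1.1, 1.9, 4.2]))  # ≈ 6.67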
Figure 9. Gas-powered system regularized loss illustration. Left: node balance and Weymouth penalties based on the conventional Huber loss. Middle: compression factor limit constraint using our Huber-based enhancement (see Equation (10)). Right: gas-powered system custom penalty evolution (blue: Weymouth equality constraint; orange: compression ratio limit constraint; green: compression factor constraint).
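As a reference for the penalties illustrated in Figure 9, the sketch below shows how equality-constraint residuals and bound violations can be penalized with the conventional Huber loss in TensorFlow. It is a minimal illustration only: the Huber-based enhancement of Equation (10) is not reproduced here, and the residual names and weighting factors are assumptions.

import tensorflow as tf

huber = tf.keras.losses.Huber(delta=1.0)  # conventional Huber loss

def equality_penalty(residual):
    # Penalize deviations of an equality-constraint residual r(x) = 0.
    return huber(tf.zeros_like(residual), residual)

def bound_penalty(value, lower, upper):
    # Penalize only the amount by which a quantity leaves [lower, upper].
    violation = tf.nn.relu(lower - value) + tf.nn.relu(value - upper)
    return huber(tf.zeros_like(violation), violation)

# Illustrative composition of the regularized loss (lam1, lam2 are assumed weights):
# total = cost + lam1 * equality_penalty(weymouth_residual) \
#              + lam2 * bound_penalty(compression_ratio, 1.0, ratio_max)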
Figure 10. Gas-powered system objective cost and constraint compliance MAPE results. Upper left: node balance. Upper right: Weymouth constraint. Bottom left: compression ratio constraint. Bottom right: cost difference (objective function) between RPINN and IPOPT.
Figure 11. Gas-powered system bound-constraint MAPE results. Stars denote the defined limits for each of the sources, compressors, pipelines, and pressures. The number on the x-axis indicates the node to which the information belongs. MMSCFD: million standard cubic feet per day. psia: pounds per square inch absolute.
Figure 12. RPINN vs. IPOPT computational cost results. The graph compares solution times for the test data between the classical technique (IPOPT, in blue) and our strategy (RPINN, in green). On the left, the training times are shown, while on the right, the prediction times are displayed.
Table 1. State-of-the-art solvers for optimization. (*) Except mixed-integer SDP. (**) Features available with the licensed version only.
Solver | LP | QP | SOCP | SDP | EXP | PCP | MIP | NLP | Strategy | Open Source | Software
Clarabel [45] | ✓ | ✓ | ✓ | x | ✓ | ✓ | x | x | IP | ✓ | CVXPY 1.5
Gurobi [46] | ✓ | ✓ | ✓ | x | x | x | ✓ | x | IP, Simplex, BC | x | MATPOWER 8.0, CVXPY 1.5
Mosek [47] | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ * | x | IP | x | MATPOWER 8.0, CVXPY 1.5
Xpress [48] | ✓ | ✓ | ✓ | x | x | x | ✓ | ✓ ** | IP, Simplex, BC | x | CVXPY 1.5
SCS [49,58] | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | x | x | IP | ✓ | CVXPY 1.5
IPOPT [50] | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | IP | ✓ | MATPOWER 8.0, GEKKO 1.0.3
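Most of the solvers in Table 1 are reachable through CVXPY's common interface, so switching back-ends amounts to changing the solver argument; the commercial back-ends (Gurobi, Mosek, Xpress) additionally require a license and their Python packages. A minimal sketch on a toy quadratic program (the problem itself is illustrative):

import cvxpy as cp

# Toy QP: project the all-ones point onto the probability simplex.
x = cp.Variable(3)
prob = cp.Problem(cp.Minimize(cp.sum_squares(x - 1)),
                  [cp.sum(x) == 1, x >= 0])

prob.solve(solver=cp.CLARABEL)  # open-source interior-point conic solver
print("Clarabel:", prob.value, x.value)

prob.solve(solver=cp.SCS)       # another open-source conic solver
print("SCS:", prob.value, x.value)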
Table 2. RPINN details for the uniform mixture model-based NOPT. R̃: batch size for AD-based back-propagation. Param. #: number of trainable parameters. Total # of parameters: 30 (0.12 KB).
Layer Name | Type | Output Shape | Param. # | Memory Size
Input | InputLayer | (R̃, 5) | 0 | 0 KB
Dense_1 | Dense (SELU) | (R̃, 5) | 25 | 0.1 KB
Dense_2 | Dense (SELU, l1-max-constraint) | (R̃, 1) | 5 | 0.02 KB
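A minimal Keras sketch consistent with Table 2 is given below. It assumes bias-free dense layers, which reproduces the 25 + 5 = 30 trainable parameters, and implements the l1-max constraint as a simple rescaling projection; the paper's exact constraint implementation may differ.

import tensorflow as tf

class L1MaxConstraint(tf.keras.constraints.Constraint):
    # Rescale the kernel so its l1 norm never exceeds max_value (assumed form).
    def __init__(self, max_value=1.0):
        self.max_value = max_value

    def __call__(self, w):
        norm = tf.reduce_sum(tf.abs(w))
        return w * (self.max_value / tf.maximum(norm, self.max_value))

model = tf.keras.Sequential([
    tf.keras.Input(shape=(5,)),
    tf.keras.layers.Dense(5, activation="selu", use_bias=False),    # 25 params
    tf.keras.layers.Dense(1, activation="selu", use_bias=False,
                          kernel_constraint=L1MaxConstraint(1.0)),  # 5 params
])
model.summary()  # Total params: 30, matching Table 2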
Table 3. RPINN architecture details for the gas-powered system NOPT. R̃: batch size for AD-based back-propagation. Source switching, unsupply gas switching, custom dense, and bounded dense stand for specific switching, limited, and scaled layers, as explained in Section 4.2. Param. #: number of trainable parameters. Total # of parameters: 12,707 (49.67 KB).
Layer Name | Type | Output Shape | Param. # | Memory Size
Input | InputLayer | (R̃, 8) | 0 | 0 KB
Dense_1 | Dense (SELU) | (R̃, 236) | 2124 | 8.3 KB
Dense_2 | Dense (SELU) | (R̃, 8) | 1896 | 7.41 KB
Source switching | CustomDense | (R̃, 1) | 1 | 4 B
BatchNormalization_1 | BatchNormalization | (R̃, 236) | 944 | 3.69 KB
BatchNormalization_2 | BatchNormalization | (R̃, 8) | 32 | 0.12 KB
Partial flows | BoundedDense | (R̃, 50) | 2274 | 8.88 KB
Unsupply gas switching | CustomDense | (R̃, 8) | 0 | 0 KB
Flow prediction | Concatenate | (R̃, 59) | 0 | 0 KB
Dense_3 | Dense (SELU) | (R̃, 236) | 2124 | 8.3 KB
BatchNormalization_3 | BatchNormalization | (R̃, 236) | 944 | 3.69 KB
Pressure prediction | BoundedDense | (R̃, 8) | 1896 | 7.41 KB
Node balance | CustomDense | (R̃, 8) | 472 | 1.84 KB
Weymouth | CustomDense | (R̃, 14) | 0 | 0 KB
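Among the custom layers in Table 3, the bounded dense layers can be read as dense layers whose outputs are squashed and affinely rescaled into physical box constraints, so predictions such as pressures and partial flows respect their bounds by construction. The sketch below is one plausible realization under that assumption, not the paper's exact code, and the bound values are illustrative; note that a Dense(8) sublayer on 236 input features yields the 1896 parameters reported for the pressure-prediction layer.

import tensorflow as tf

class BoundedDense(tf.keras.layers.Layer):
    # Dense layer whose outputs are squashed into [lower, upper] via a
    # sigmoid followed by an affine rescaling (assumed form of the
    # "scaled" layers in Table 3).
    def __init__(self, units, lower, upper, **kwargs):
        super().__init__(**kwargs)
        self.dense = tf.keras.layers.Dense(units)
        self.lower = tf.constant(lower, dtype=tf.float32)
        self.upper = tf.constant(upper, dtype=tf.float32)

    def call(self, x):
        return self.lower + (self.upper - self.lower) * tf.sigmoid(self.dense(x))

# e.g., pressures bounded within illustrative limits [200, 800] psia at 8 nodes:
pressure_layer = BoundedDense(8, lower=200.0, upper=800.0)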
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
