In this section, we present some fundamental sensitivity results from the literature and then use them in a path-following scheme for obtaining fast approximate solutions to the NLP.
3.1. Sensitivity Properties of NLP
The dynamic optimization Problem (2) can be cast as a general parametric NLP problem:
where $\chi \in \mathbb{R}^{n_\chi}$ are the decision variables (which generally include the state variables and the control inputs, so that $n_\chi = n_x + n_u$) and $\mathbf{p} \in \mathbb{R}^{n_p}$ is the parameter, which is typically the initial state variable $\mathbf{x}_k$. In addition, $F:\, \mathbb{R}^{n_\chi} \times \mathbb{R}^{n_p} \to \mathbb{R}$ is the scalar objective function; $c:\, \mathbb{R}^{n_\chi} \times \mathbb{R}^{n_p} \to \mathbb{R}^{n_c}$ denotes the equality constraints; and finally, $g:\, \mathbb{R}^{n_\chi} \times \mathbb{R}^{n_p} \to \mathbb{R}^{n_g}$ denotes the inequality constraints. The instances of Problem (3) that are solved at each sample time differ only in the parameter $\mathbf{p}$.
The Lagrangian function of this problem is defined as:
and the KKT (Karush–Kuhn–Tucker) conditions are:
For the KKT conditions to be necessary conditions of optimality, we require a constraint qualification (CQ) to hold. In this paper, we will assume that the linear independence constraint qualification (LICQ) holds:
Definition 1 (LICQ). Given a vector $\mathbf{p}$ and a point $\chi$, the LICQ holds at $\chi$ if the set of vectors $\{\nabla_{\chi} c_i(\chi,\mathbf{p})\}_{i \in \{1,\dots,n_c\}} \cup \{\nabla_{\chi} g_i(\chi,\mathbf{p})\}_{i:\, g_i(\chi,\mathbf{p}) = 0}$ is linearly independent.
The LICQ implies that the multipliers $(\lambda, \mu)$ satisfying the KKT conditions are unique. If, additionally, a suitable second-order condition holds, then the KKT conditions guarantee a unique local minimum. A suitable second-order condition states that the Hessian matrix has to be positive definite on a set of appropriate directions, specified in the following definition:
Definition 2 (SSOSC). The strong second-order sufficient condition (SSOSC) holds at $\chi$ with multipliers $\lambda$ and $\mu$ if $\mathbf{d}^T \nabla_{\chi}^2 \mathcal{L}(\chi,\mathbf{p},\lambda,\mu)\, \mathbf{d} > 0$ for all $\mathbf{d} \neq 0$ such that $\nabla_{\chi} c(\chi,\mathbf{p})^T \mathbf{d} = 0$ and $\nabla_{\chi} g_i(\chi,\mathbf{p})^T \mathbf{d} = 0$ for all $i$ such that $g_i(\chi,\mathbf{p}) = 0$ and $\mu_i > 0$.
For a given $\mathbf{p}$, denote the solution to (3) by $\chi^*(\mathbf{p}), \lambda^*(\mathbf{p}), \mu^*(\mathbf{p})$; if no confusion is possible, we omit the argument and write simply $\chi^*, \lambda^*, \mu^*$. We are interested in knowing how the solution changes with a perturbation in the parameter $\mathbf{p}$. Before we state a first sensitivity result, we define another important concept:
Definition 3 (SC). Given a vector $\mathbf{p}$ and a solution $\chi^*$ with multiplier vectors $\lambda^*$ and $\mu^*$, strict complementarity (SC) holds if $\mu_i^* - g_i(\chi^*,\mathbf{p}) > 0$ for each $i = 1,\dots,n_g$.
Now, we are ready to state the result below, given by Fiacco [25].
Theorem 1 (Implicit function theorem applied to optimality conditions). Let $\chi^*(\mathbf{p})$ be a KKT point that satisfies (5), and assume that LICQ, SSOSC and SC hold at $\chi^*$. Further, let the functions $F, c, g$ be at least $(k+1)$-times differentiable in $\chi$ and $k$-times differentiable in $\mathbf{p}$. Then:
(i) $\chi^*$ is an isolated minimizer, and the associated multipliers $\lambda$ and $\mu$ are unique;
(ii) for $\mathbf{p}$ in a neighborhood of $\mathbf{p}_0$, the set of active constraints remains unchanged;
(iii) for $\mathbf{p}$ in a neighborhood of $\mathbf{p}_0$, there exists a $k$-times differentiable function $\sigma(\mathbf{p}) = \left[\begin{array}{ccc} \chi^*(\mathbf{p})^T & \lambda^*(\mathbf{p})^T & \mu^*(\mathbf{p})^T \end{array}\right]$ that corresponds to a locally unique minimum of (3).
Using this result, the sensitivity of the optimal solution $\chi^*, \lambda^*, \mu^*$ in a small neighborhood of $\mathbf{p}_0$ can be computed by solving a system of linear equations that arises from applying the implicit function theorem to the KKT conditions of (3):
Here, the constraint gradients with subscript $A$ (as in $g_A$) indicate that we only include the vectors and components of the Jacobian corresponding to the active inequality constraints at $\chi$, i.e., $i \in A$ if $g_i(\chi,\mathbf{p}_0) = 0$. Denoting the solution of the equation above as $\left[\begin{array}{ccc} \nabla_{\mathbf{p}}\chi & \nabla_{\mathbf{p}}\lambda & \nabla_{\mathbf{p}}\mu \end{array}\right]^T$, for small $\Delta\mathbf{p}$, we obtain a good estimate:
of the solution to the NLP Problem (3) at the parameter value $\mathbf{p}_0 + \Delta\mathbf{p}$. This approach was applied by Zavala and Biegler [10].
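To make the sensitivity computation concrete, the sketch below builds the KKT sensitivity linear system for a small equality-constrained parametric NLP and uses its solution for the first-order update. The problem, its data, and all names are our own illustrative assumptions, not taken from the paper; the toy problem is chosen so the exact solution path is known.

```python
import numpy as np

# Toy parametric NLP (illustrative, not from the paper):
#   min_x  (x1 - p)^2 + x2^2   s.t.   x1 + x2 = 1
# Exact solution: x1*(p) = (1 + p)/2, x2*(p) = (1 - p)/2,
# so dx1/dp = 1/2 and dx2/dp = -1/2.

p0 = 0.3
x_star = np.array([(1 + p0) / 2, (1 - p0) / 2])   # primal solution at p0
lam_star = -2 * (x_star[0] - p0)                  # multiplier from stationarity

# KKT sensitivity system (no active inequalities in this toy problem):
#   [ H   A^T ] [ dx/dp   ]   [ -d/dp grad_x L ]
#   [ A   0   ] [ dlam/dp ] = [ -d/dp c        ]
H = 2 * np.eye(2)                 # Hessian of the Lagrangian
A = np.array([[1.0, 1.0]])        # Jacobian of the equality constraint
rhs = np.array([2.0, 0.0, 0.0])   # -d/dp grad_x L = (2, 0), -d/dp c = 0

K = np.block([[H, A.T], [A, np.zeros((1, 1))]])
sens = np.linalg.solve(K, rhs)    # (dx1/dp, dx2/dp, dlam/dp)

# First-order estimate of the solution at the perturbed parameter
dp = 0.1
x_pred = x_star + sens[:2] * dp
print(sens[:2])                   # -> [ 0.5 -0.5]
```

Since this toy solution path is affine in $p$, the first-order estimate `x_pred` here reproduces the exact perturbed solution; in a general NLP it is accurate only to first order.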
If $\Delta \mathbf{p}$ becomes large, the approximate solution may no longer be accurate enough, because the SC assumption implies that the active set cannot change. While that is usually true for small perturbations, a large $\Delta \mathbf{p}$ may very well induce active-set changes.
It can be seen that the sensitivity system corresponds to the stationarity conditions of a particular QP. This is not coincidental. It can be shown that for $\Delta\mathbf{p}$ small enough, the set $\{i:\ \mu(\bar{\mathbf{p}})_i > 0\}$ is constant for $\bar{\mathbf{p}} = \mathbf{p}_0 + \Delta\mathbf{p}$. Thus, we can form a QP wherein we are potentially moving off of weakly-active constraints while staying on the strongly-active ones. The primal-dual solution of this QP is in fact the directional derivative of the primal-dual solution path $\chi^*(\mathbf{p}), \lambda^*(\mathbf{p}), \mu^*(\mathbf{p})$.
Theorem 2. Let $F, c, g$ be twice continuously differentiable in $\mathbf{p}$ and $\chi$ near $(\chi^*, \mathbf{p}_0)$, and let the LICQ and SSOSC hold at $(\chi^*, \mathbf{p}_0)$. Then, the solution $(\chi^*(\mathbf{p}), \lambda^*(\mathbf{p}), \mu^*(\mathbf{p}))$ is Lipschitz continuous in a neighborhood of $(\chi^*, \lambda^*, \mu^*, \mathbf{p}_0)$ and directionally differentiable.
Moreover, the directional derivative uniquely solves the following quadratic problem:
where $K_+ = \{j \in \mathbb{Z}:\ \mu_j > 0\}$ is the strongly-active set and $K_0 = \{j \in \mathbb{Z}:\ \mu_j = 0 \ \mathrm{and}\ g_j(\chi^*, \mathbf{p}_0) = 0\}$ denotes the weakly-active set.
Proof. See [26] (Sections 5.1 and 5.2) and [27] (Proposition 3.4.1). ☐
The theorem above gives the solution of the perturbed NLP (3) by solving a QP problem. Note that regardless of the inertia of the Lagrangian Hessian, if the SSOSC holds, the Hessian is positive definite on the nullspace of the equality constraints, and thus, the QP is convex with an easily obtainable finite global minimizer. In [28], it is noted that since the solution to this QP is the directional derivative of the primal-dual solution of the NLP, it is a predictor step: a tangential first-order estimate of the change in the solution subject to a change in the parameter. We refer to the QP (10) as a pure-predictor. Note that obtaining the sensitivity via (10) instead of (6) has the advantage that changes in the active set can be accounted for correctly, and strict complementarity (SC) is not required. On the other hand, when SC does hold, (6) and (10) are equivalent.
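The convexity argument above can be checked numerically: even when the Lagrangian Hessian is indefinite, its restriction to the constraint nullspace may be positive definite. A minimal sketch with hypothetical data (the Hessian and constraint gradient below are chosen purely for illustration):

```python
import numpy as np

# Hypothetical data: an indefinite Lagrangian Hessian and one linearized
# (strongly-active) constraint gradient.
H = np.diag([2.0, -2.0])            # eigenvalues 2 and -2: indefinite
A = np.array([[0.0, 1.0]])          # constraint Jacobian

# Nullspace basis Z of A: feasible directions d with A d = 0, i.e. d2 = 0.
Z = np.array([[1.0], [0.0]])

reduced_H = Z.T @ H @ Z             # Hessian restricted to the nullspace
print(np.linalg.eigvals(H))         # one negative eigenvalue
print(reduced_H)                    # [[2.]]: positive definite on the nullspace
```

On the nullspace the curvature is positive, so a QP constrained to this subspace has a finite, unique minimizer despite the indefinite full Hessian.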
3.2. Path-Following Based on Sensitivity Properties
Equation (6) and the QP (10) describe the change in the optimal solution for small perturbations. They cannot be guaranteed to reproduce the optimal solution accurately for larger perturbations, because of curvature in the solution path and active-set changes that happen further away from the linearization point. One approach to handle such cases is to divide the overall perturbation into several smaller intervals and to iteratively use the sensitivity to track the path of optimal solutions.
The general idea of a path-following method is to reach the solution of the problem at a final parameter value $\mathbf{p}_f$ by tracing a sequence of solutions $(\chi_k, \lambda_k, \mu_k)$ for a series of parameter values $\mathbf{p}(t_k) = (1 - t_k)\,\mathbf{p}_0 + t_k\,\mathbf{p}_f$ with $0 = t_0 < t_1 < \dots < t_k < \dots < t_N = 1$. The new direction is found by evaluating the sensitivity at the current point. This is similar to Euler integration of ordinary differential equations.
However, just as when integrating differential equations with an Euler method, a path-following algorithm that is only based on the sensitivity calculated by the pure-predictor QP may fail to track the solution accurately enough and may lead to poor solutions. To address this problem, a common approach is to include elements that are similar to a Newton step, which force the path-following algorithm towards the true solution. It has been found that such a corrector element can be easily included in a QP that is very similar to the predictor QP (10). Consider approximating (3) by a QP, linearizing with respect to both $\chi$ and $\mathbf{p}$, but again enforcing the strongly-active constraints as equalities, as we expect them to remain strongly active at the perturbed NLP:
In our NMPC problem $\mathcal{P}_{nmpc}$, the parameter $\mathbf{p}$ corresponds to the current “initial” state, $\mathbf{x}_k$. Moreover, the cost function is independent of $\mathbf{p}$, so that $\nabla_{\mathbf{p}} F = 0$. Since the parameter enters the constraints linearly, $\nabla_{\mathbf{p}} c$ and $\nabla_{\mathbf{p}} g$ are constant. With these facts, the above QP simplifies to:
We denote the QP formulation (12) as the predictor-corrector. We note that this QP is similar to the QP proposed in the real-time iteration scheme [15]. However, it is not quite the same, as we enforce the strongly-active constraints as equality constraints in the QP. As explained in [28], this particular QP estimates, through its predictor component, how the NLP solution changes with the parameter, and, acting as a corrector, refines the estimate by more closely satisfying the KKT conditions at the new parameter.
The predictor-corrector QP (12) is well suited for use in a path-following algorithm, where the optimal solution path is tracked from $\mathbf{p}_0$ to a final value $\mathbf{p}_f$ along a sequence of parameter points $\mathbf{p}(t_k) = (1 - t_k)\,\mathbf{p}_0 + t_k\,\mathbf{p}_f$ with $0 = t_0 < t_1 < \dots < t_k < \dots < t_N = 1$. At each point $\mathbf{p}(t_k)$, the QP is solved and the primal-dual solution updated as:
where $\Delta\chi$ is obtained from the primal solution of QP (12), and $\Delta\lambda$ and $\Delta\mu$ correspond to the Lagrange multipliers of QP (12).
Changes in the active set along the path are detected by the QP as follows. If a constraint becomes inactive at some point along the path, the corresponding multiplier $\mu_j$ first drops to zero, i.e., the constraint is added to the weakly-active set $K_0$. Since it is then no longer included as an equality constraint, the next QP solution can move away from the constraint. Similarly, if a new constraint $g_j$ becomes active along the path, it will make the corresponding linearized inequality constraint in the QP active and will be tracked further along the path.
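This active-set mechanism can be mimicked in a scalar example. The sketch below (our own one-dimensional problem, not from the paper) enforces the bound as an equality while its multiplier is positive and releases it once the multiplier would turn negative, so the constraint leaves the active set as the parameter crosses zero:

```python
import numpy as np

# One-dimensional illustration (illustrative problem, not from the paper):
#   min_x 0.5*(x - p)^2   s.t.   x >= 0
# Exact solution: x*(p) = max(p, 0), mu*(p) = max(-p, 0);
# the constraint leaves the active set as p crosses zero.

def qp_step(x, mu, p_new, tol=1e-9):
    """One predictor-corrector QP step: min_d 0.5*d^2 + (x - p_new)*d
    s.t. x + d >= 0, with the bound enforced as an equality while the
    constraint is strongly active (mu > tol)."""
    d_free = p_new - x                # unconstrained minimizer of the QP
    if mu > tol:                      # strongly active: try to stay on x + d = 0
        d = -x
        mu_new = d + x - p_new        # equality multiplier; here equal to -p_new
        if mu_new < 0:                # multiplier turns negative: the constraint
            d = max(d_free, -x)       # has entered K0 and is released
            mu_new = d + x - p_new
    else:                             # inactive/weakly active: plain inequality
        d = max(d_free, -x)
        mu_new = d + x - p_new        # zero if inactive, positive at the bound
    return x + d, mu_new

x, mu = 0.0, 1.0                      # NLP solution at p = -1
for p_new in np.arange(-0.75, 1.01, 0.25):
    x, mu = qp_step(x, mu, p_new)
print(x, mu)                          # tracks x*(1) = 1, mu*(1) = 0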
The resulting path-following algorithm is summarized with its main steps in Algorithm 2, and we are now in the position to apply it in the advanced-step NMPC setting described in Section 2.2. In particular, the path-following algorithm is used to find a fast approximation of the optimal NLP solution corresponding to the newly available state measurement, which is done by following the optimal solution path from the predicted state to the measured state.
Algorithm 2: Path-following algorithm.

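As an illustration of the main loop of such a path-following scheme, the following sketch traces the solution of a toy equality-constrained problem from $\mathbf{p}_0$ to $\mathbf{p}_f$ with fixed steps. The problem data, names, and step-size choice are our own illustrative assumptions; for this quadratic toy problem each predictor-corrector step reduces to a single KKT linear solve, whereas a real implementation would solve QP (12) with a QP solver and adapt the step length.

```python
import numpy as np

# Toy parametric problem (illustrative, chosen so every step is a linear solve):
#   min_x 0.5*||x||^2 - p.T @ x   s.t.   x1 + x2 = 1
# Stationarity: x - p + lam*a = 0 with a = (1, 1).
a = np.array([1.0, 1.0])

def predictor_corrector_step(x, lam, p_next):
    """Predictor-corrector step at parameter p_next: for this
    equality-constrained quadratic problem it is one KKT linear solve."""
    H = np.eye(2)                                      # Lagrangian Hessian
    K = np.block([[H, a.reshape(2, 1)],
                  [a.reshape(1, 2), np.zeros((1, 1))]])
    # negative KKT residuals at the current point, evaluated at p_next
    r = np.concatenate([-(x - p_next + lam * a), [-(a @ x - 1.0)]])
    step = np.linalg.solve(K, r)
    return x + step[:2], lam + step[2]

p0, pf = np.array([0.0, 0.0]), np.array([2.0, -1.0])
x, lam = np.array([0.5, 0.5]), -0.5                    # NLP solution at p0
for t in np.linspace(0.25, 1.0, 4):                    # fixed steps t_k
    p_t = (1 - t) * p0 + t * pf                        # point on the homotopy
    x, lam = predictor_corrector_step(x, lam, p_t)
print(x, lam)                                          # approx (2, -1) and 0
```

Only the solution obtained at the final homotopy point $t_N = 1$ would be used to extract the control input, mirroring the asNMPC update described above.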
3.3. Discussion of the Path-Following asNMPC Approach
In this section, we discuss some characteristics of the path-following asNMPC approach presented in this paper. We also present a small example to demonstrate the effect of including the strongly-active constraints as equality constraints in the QP.
A reader who is familiar with the real-time iteration scheme [15] will have realized that the QPs (12) that are solved in our path-following algorithm are similar to the ones proposed and solved in the real-time iteration scheme. However, there are some fundamental differences between the standard real-time iteration scheme as described in [15] and the asNMPC with a path-following approach.
This work is set in the advancedstep NMPC framework, i.e., at every time step, the full NLP is solved for a predicted state. When the new measurement becomes available, the precomputed NLP solution is updated by tracking the optimal solution curve from the predicted initial state to the new measured or estimated state. Any numerical homotopy algorithm can be used to update the NLP solution, and we have presented a suitable one in this paper. Note that the solution of the last QP along the path corresponds to the updated NLP solution, and only the inputs computed in this last QP will be injected into the plant.
The situation is quite different in the real-time iteration (RTI) scheme described in [15]. Here, the NLP is not solved at all during the MPC sampling times. Instead, at each sampling time, a single QP is solved, and the computed input is applied to the plant. This requires very fast sampling times, and if the QP fails to track the true solution due to very large disturbances, measures similar to those in the advanced-step NMPC procedure (i.e., solving the full NLP) must be performed to get the controller “on track” again. Note also that in RTI the input computed from every QP is applied to the plant, whereas in our path-following asNMPC only the input computed in the last QP along the homotopy is injected.
Finally, in the QPs of the previously published real-time iteration schemes [15], all inequality constraints are linearized and included as QP inequality constraints. Our approach in this paper, however, distinguishes between strongly- and weakly-active inequality constraints. Strongly-active inequalities are included as linearized equality constraints in the QP, while weakly-active constraints are linearized and added as inequality constraints to the QP. This ensures that the true solution path is tracked more accurately also when the full Hessian of the optimization problem becomes nonconvex. We illustrate this in the small example below.
Example 1. Consider the following parametric “NLP”:
for which we have plotted the constraints at $t = 0$ in Figure 1a. The feasible region lies in between the parabola and the horizontal line. Changing the parameter $t$ from zero to one moves the lower constraint up from $x_2 = -2$ to $x_2 = -1$.
The objective gradient is $\nabla F(x) = (2x_1, -2x_2)$, and the Hessian of the objective is always indefinite, $H = \left(\begin{array}{cc} 2 & 0 \\ 0 & -2 \end{array}\right)$. The constraint gradients are $\nabla g(x) = \left(\begin{array}{cc} 0 & -1 \\ 2x_1 & 1 \end{array}\right)$. For $t \in [0, 1]$, a (local) primal solution is given by $x^*(t) = (0, t-2)$. The first constraint is active, the second constraint is inactive, and the dual solution is $\lambda^*(t) = (-2x_2, 0)$. At $t = 0$, we thus have the optimal primal solution $x^* = (0, -2)$ and the optimal multiplier $\lambda^* = (4, 0)$.
We consider starting from an approximate solution at the point $\widehat{x}(t=0) = (1, -2)$ with dual variables $\widehat{\lambda}(t=0) = (4, 0)$, such that the first constraint is strongly active, while the second one remains inactive. The linearized constraints for this point are shown in Figure 1b. Now, consider a change $\Delta t = 1$, going from $t = 0$ to $t = 1$. The pure-predictor QP (10) has the form (recalling that we enforce the strongly-active constraint as an equality):
This QP is convex with a unique solution $\Delta x = (0, 1)$, resulting in the subsequent point $\widehat{x}(t=1) = (1, -1)$. The predictor-corrector QP (12), which includes a linear term in the objective that acts as a corrector, is given for this case as:
Again, this QP is convex with a unique primal solution $\Delta x = (-1, 1)$. The step computed by this predictor-corrector QP moves the update to the true optimal solution $\widehat{x}(t=1) = (0, -1) = x^*(t=1)$.
Now, consider a third QP, which is the predictor-corrector QP (12), but without enforcing the strongly-active constraints as equalities. That is, all constraints are included in the QP as inequalities, as they were in the original NLP (16):
This QP is nonconvex and unbounded; we can decrease the objective arbitrarily by setting $\Delta x = (1.5 - 0.5r, r)$ and letting the scalar $r \geq 1$ go to infinity. Although there is a local minimizer at $\Delta x = (-1, 1)$, a QP solver that behaves “optimally” should find the unbounded “solution”.
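The unboundedness phenomenon is easy to reproduce numerically. The sketch below uses hypothetical QP data with the same structure as in the example (an indefinite Hessian whose negative-curvature direction is blocked only by an inequality); the numbers are our own illustrative assumptions and do not reproduce the exact QP of the example:

```python
import numpy as np

# Hypothetical QP data (illustrative): indefinite Hessian, linear term.
H = np.diag([2.0, -2.0])
gvec = np.array([2.0, 4.0])

def qp_obj(d):
    return 0.5 * d @ H @ d + gvec @ d

# Feasible ray along the negative-curvature direction d = (0, r), r >= 1:
# the objective is -r**2 + 4*r, which decreases without bound.
vals = [qp_obj(np.array([0.0, r])) for r in (1.0, 10.0, 100.0)]
print(vals)                            # -> [3.0, -60.0, -9600.0]

# Enforcing the strongly-active constraint as the equality d2 = 1 instead
# restricts the QP to a convex problem in d1, minimized at d = (-1, 1).
print(qp_obj(np.array([-1.0, 1.0])))   # -> 2.0
```

With the equality enforced, the negative-curvature direction is removed from the feasible set and the QP has a finite, unique minimizer, which is the point of distinguishing strongly-active constraints.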
This last approach cannot be expected to work reliably if the full Hessian of the optimization problem may become nonconvex, which can easily be the case when optimizing economic objective functions. We note, however, that if the Hessian $\nabla_{xx}\mathcal{L}$ is positive definite, it is not necessary to enforce the strongly-active constraints as equality constraints in the predictor-corrector QP (12).