ROM-Based Inexact Subdivision Methods for PDE-Constrained Multiobjective Optimization

The goal in multiobjective optimization is to determine the so-called Pareto set. Our optimization problem is governed by a parameter-dependent semi-linear elliptic partial differential equation (PDE). To solve it, we use a gradient-based set-oriented numerical method. The numerical solution of the PDE by standard discretization methods usually leads to high computational effort. To overcome this difficulty, reduced-order modeling (ROM) is developed utilizing the reduced basis method. These model simplifications cause inexactness in the gradients. For that reason, an additional descent condition is proposed. Applying a modified subdivision algorithm, numerical experiments illustrate the efficiency of our solution approach.


Introduction
Multiobjective optimization plays an important role in many applications, e.g., in industry, medicine, or engineering. Examples are the minimization of costs with simultaneous quality optimization in production, or the minimization of CO₂ emissions in energy generation with simultaneous cost minimization. These problems lead to multiobjective optimization problems (MOPs), where we want to achieve an optimal compromise with respect to all given objectives at the same time. Normally, the different objectives are contradictory, so that there exists an infinite number of optimal compromises. The set of these compromises is called the Pareto set. The goal is to approximate the Pareto set in an efficient way, which turns out to be more expensive than solving a single-objective optimization problem.
As multiobjective optimization problems are of great importance, there exist several algorithms to solve them. Among the most popular methods are scalarization methods, which transform MOPs into scalar problems. For example, in the weighted sum method [1][2][3][4], convex combinations of the original objectives are optimized. Another popular approach is to use non-deterministic methods like evolutionary algorithms, cf., e.g., [5]. Furthermore, as multiobjective problems are generalizations of scalar problems, some solution methods can be generalized from the scalar to the multiobjective case [6][7][8].
In addition to the classical methods above, there are set-based strategies for the solution of MOPs. Continuation methods [9][10][11] use the fact that the Pareto set is typically (the projection of) a smooth manifold. Subdivision methods [12][13][14][15] use tools from the area of dynamical systems to generate a covering of the Pareto set via hypercubes. However, especially when the objective functions and their gradients are expensive to evaluate, e.g., because an underlying PDE has to be solved for every evaluation, the computational time of these methods can quickly become very large. In the presence of PDE constraints, surrogate models offer a promising tool to reduce the computational effort significantly [16]. Examples are dimensional reduction techniques such as the Reduced Basis (RB) method [17,18]. In an offline phase, a low-dimensional surrogate model of the PDE is constructed by using, e.g., the greedy algorithm, cf. [17]. In the online phase, only the RB model is used to solve the PDE, which saves a lot of computing time.
In this article, we combine an extension of the set-oriented method presented in [12] based on inexact gradient evaluations of the objective functions with an RB approach and a discrete empirical interpolation method (DEIM) [19,20] for semi-linear elliptic PDEs. In order to deal with the inexactness introduced by the surrogate model, we combine the first-order optimality conditions for multiobjective optimization problems with error estimates for the RB-DEIM method and derive an additional condition for the descent direction [9] to get a tight superset of the Pareto set. This approach allows us to better control the quality of the result by controlling the errors for the objective functions independently. In order to obtain an even tighter superset of the Pareto set, we update these error estimates in our subdivision algorithm after each iteration step.
The article is organized as follows. In Section 2, we recall the basic concepts of multiobjective optimization problems and review results on descent directions with exact and inexact gradients. Furthermore, we develop a set-oriented method to solve these problems, where only inexact gradient information is utilized. In Section 3, the PDE-constrained multiobjective optimization problem and the underlying semi-linear PDE are introduced. Subsequently, we show how reduced-order modeling can be applied efficiently. In Section 4, numerical results concerning both the subdivision and the modified algorithm are presented. Finally, we give a conclusion and discuss possible future work in Section 5.

A Set-Oriented Method for Multiobjective Optimization with Inexact Objective Gradients
In this section, we briefly recall the basic concepts of multiobjective optimization. Furthermore, we develop a set-oriented method to solve these problems, where only inexact gradient information is utilized.

Multiobjective Optimization
Let µ_a, µ_b ∈ R^m with µ_a ≤ µ_b be arbitrary. We define the convex and compact parameter set P_ad = {µ ∈ R^m | µ_a ≤ µ ≤ µ_b} together with a given objective Ĵ = (Ĵ_1, ..., Ĵ_k) : P_ad → R^k. Compared to scalar optimization, we do not have a natural total order on R^k for k > 1. Therefore, we cannot expect that there is a single point in P_ad that minimizes all objectives Ĵ_i simultaneously. For this reason, we make use of the following definition.

Definition 1. (a)
A point μ ∈ P_ad is called (globally) Pareto optimal if there is no µ ∈ P_ad satisfying Ĵ(µ) ≤ Ĵ(μ) componentwise with Ĵ(µ) ≠ Ĵ(μ). In that case, we call μ a Pareto point. (b) The set of all Pareto points in P_ad is called the Pareto set and is denoted by P. If Ĵ is continuously differentiable on an open set containing P_ad, then there exists a first-order necessary condition for Pareto optimality. To formulate this condition, we define the convex, closed, and bounded set ∆_k = {α = (α_i)_{1≤i≤k} ∈ R^k | α_i ≥ 0 for i = 1, ..., k and ∑_{i=1}^k α_i = 1}. Further, the row vector ∇Ĵ_i(µ) ∈ R^{1×m} stands for the gradient of the i-th objective Ĵ_i with i ∈ {1, ..., k}.

Definition 2.
If for a given μ ∈ P_ad there exists an ᾱ = (ᾱ_i)_{1≤i≤k} ∈ ∆_k with DĴ(μ)^T ᾱ = ∑_{i=1}^k ᾱ_i ∇Ĵ_i(μ)^T = 0, then we call μ Pareto critical, where DĴ(µ) ∈ R^{k×m} denotes the Jacobian matrix of Ĵ at µ. The set of all Pareto critical points is called the Pareto critical set, denoted by P_c. Now, we recall the first-order necessary optimality condition for (1).
Theorem 1. If μ ∈ P_ad is Pareto optimal, then μ is Pareto critical.

Proof. The claim follows from ([1], Theorem 3.25) and the specific choice of P_ad.
see also [21]. (b) Throughout the paper, we only calculate the Pareto critical points in the interior of P_ad and make use of (3). The idea is to choose P_ad sufficiently large so that we get P_c ⊂ int(P_ad). (c) Due to Theorem 1, we have P ⊂ P_c provided P_c ⊂ int(P_ad) holds true. ♦

Descent Direction with Exact Gradients
Next, we introduce the notion of a descent direction for the vector-valued objective function Ĵ at a non-Pareto-critical point µ ∉ P_c. From now on, we assume that Ĵ : P_ad → R^k is continuously differentiable (on an open set containing P_ad).

Definition 3. The vector v ∈ R^m is a descent direction for Ĵ in µ ∈ P_ad if we have ∇Ĵ_i(µ)v ≤ 0 for all i ∈ {1, ..., k} and if there is at least one i ∈ {1, ..., k} with ∇Ĵ_i(µ)v < 0.
One way to compute a descent direction is to solve a constrained quadratic optimization problem in R^k, as shown in the following theorem. For a proof we refer to the work in [8]. A similar result was shown in [22].

Theorem 2. For given µ ∈ int(P_ad), let ᾱ = ᾱ(µ) ∈ ∆_k be a (global) solution of the convex constrained quadratic minimization problem

min_{α ∈ ∆_k} ‖DĴ(µ)^T α‖₂².   (4)

Then, we have either DĴ(µ)^T ᾱ = 0, in which case µ is Pareto critical, or v = −DĴ(µ)^T ᾱ is a descent direction for Ĵ in µ. Combined with a backtracking Armijo line search, the descent direction from Theorem 2 can be used to construct the steepest descent method in Algorithm 1.
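As an illustration, the quadratic subproblem (4) can be solved with any QP solver over the simplex ∆_k. The following minimal Python/NumPy sketch (our own illustration; the experiments in Section 4 use Matlab's fmincon instead) applies projected-gradient iterations to (4) and returns the direction v = −DĴ(µ)^T ᾱ from Theorem 2; the function names and the fixed iteration count are our own choices:

```python
import numpy as np

def project_simplex(v):
    # Euclidean projection of v onto the standard simplex {a >= 0, sum(a) = 1}
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u + (1.0 - css) / idx > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1.0)
    return np.maximum(v + theta, 0.0)

def steepest_descent_direction(grads, iters=500):
    """Solve min_{alpha in Delta_k} ||DJ(mu)^T alpha||_2^2 by projected
    gradient and return the common descent direction v = -DJ(mu)^T alpha."""
    G = np.atleast_2d(np.asarray(grads, float))  # k x m, rows are the gradients
    Q = G @ G.T                                  # Gram matrix of the gradients
    step = 1.0 / (2.0 * np.linalg.norm(Q, 2) + 1e-15)
    alpha = np.full(G.shape[0], 1.0 / G.shape[0])
    for _ in range(iters):
        alpha = project_simplex(alpha - step * (2.0 * Q @ alpha))
    return -G.T @ alpha, alpha
```

For two objectives with gradients (2, 0) and (0, 1), the minimizer is ᾱ = (0.2, 0.8), giving v = (−0.4, −0.8), and one checks ∇Ĵ_i(µ)v < 0 for both objectives, so v is indeed a common descent direction.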

Descent Direction with Inexact Gradients
Suppose that we have continuously differentiable approximations Ĵ′ = (Ĵ′_i)_{1≤i≤k} of the objective function Ĵ satisfying

max_{µ ∈ P_ad} ‖∇Ĵ_i(µ) − ∇Ĵ′_i(µ)‖₂ ≤ ε_i, i = 1, ..., k,   (5)

for given tolerances ε_i ≥ 0. By P ⊂ P_ad we denote the Pareto set for Ĵ and by P′ ⊂ P_ad the Pareto set for Ĵ′. If we write P_c, we mean the Pareto-critical set for Ĵ, and P′_c is the Pareto-critical set for Ĵ′. Note that, in general, we have neither P ⊂ P′ nor P′ ⊂ P. In this section, our goal is to compute an approximation of P based on the approximation Ĵ′ of the objective function and the error bounds ε_i. We begin by investigating the relationship between the KKT conditions of the original objective function and its approximation.
Based on estimate (6), we define two approximation sets for the Pareto-critical set P_c of Ĵ.

Definition 4. Let us introduce the two sets

P_1 = {µ ∈ P_ad | min_{α ∈ ∆_k} ‖DĴ′(µ)^T α‖₂ ≤ max_{1≤i≤k} ε_i},
P_2 = {µ ∈ P_ad | min_{α ∈ ∆_k} (‖DĴ′(µ)^T α‖₂ − ∑_{i=1}^k α_i ε_i) ≤ 0}.

Lemma 2. It holds that P_c ⊂ P_2, P′_c ⊂ P_2, and P_2 ⊂ P_1.

Proof. From Lemma 1, it follows that P_c ⊂ P_2. Next, assume that μ ∈ P′_c is a Pareto-critical point of Ĵ′. Then there exists ᾱ ∈ ∆_k with DĴ′(μ)^T ᾱ = 0. This implies

min_{α ∈ ∆_k} (‖DĴ′(μ)^T α‖₂ − ∑_{i=1}^k α_i ε_i) ≤ ‖DĴ′(μ)^T ᾱ‖₂ − ∑_{i=1}^k ᾱ_i ε_i = −∑_{i=1}^k ᾱ_i ε_i ≤ 0.

Therefore, we have μ ∈ P_2. Finally, let μ ∈ P_2. Then there exists ᾱ ∈ ∆_k with ‖DĴ′(μ)^T ᾱ‖₂ ≤ ∑_{i=1}^k ᾱ_i ε_i ≤ max_{1≤i≤k} ε_i, so μ ∈ P_1.

Our goal is to compute the set P_2 via a descent method like Algorithm 1. To this end, the following theorem presents a modified version of the descent direction (4), which additionally takes the error bounds ε_i into account.
Theorem 3. Let ε = (ε_j)_{1≤j≤k} with ε ≥ 0 and µ ∈ int(P_ad) be given. Assume that α_ε is a minimizer of the quadratic problem (7). Then, either µ ∈ P_2 holds, or v_ε = −DĴ′(µ)^T α_ε is a descent direction for Ĵ in µ.

Proof. The Lagrange function for (7) carries multipliers for the simplex constraints on α. As α_ε is a minimizer of (7), we get Lagrange multipliers satisfying the first-order system (8). If we multiply (8) with α_ε^T from the left, we obtain an expression for the multiplier λ. First case: λ ≤ 0. Then µ ∈ P_2 holds and we are done.
Second case: λ > 0. Then µ ∉ P_2 holds. In this case, we show that v_ε = −DĴ′(µ)^T α_ε is a descent direction. If we can show that w^T v_ε < 0 holds for every w ∈ K(µ), we know that v_ε is a descent direction for every objective function Ĵ_i in µ. Choose w ∈ K(µ). Then, there exists an α_w ∈ ∆_k with w = DĴ(µ)^T α_w, and using (8) we obtain w^T v_ε < 0. Therefore, v_ε is a descent direction in µ for every objective function Ĵ_j, j = 1, ..., k.
The descent direction v_ε from the previous theorem will be referred to as the modified descent direction. Based on this direction, we can now construct a descent method for the computation of P_2, which is shown in Algorithm 2.

Remark 3. (a)
If Algorithm 2 terminates after l iteration steps, then µ^l is contained in P_2.
(b) Assume that Algorithm 2 does not terminate after a finite number of iteration steps. Then, every accumulation point μ of the sequence {µ^l}_{l∈N} generated by Algorithm 2 is in the set P_2. A proof based on ([7], Theorem 1) can be found in ([23], Theorem 5.3.5). (c) Note that the tolerance ε is constant for all l throughout Algorithm 2. In Section 2.5, we will adapt ε in each iteration. ♦

Algorithm 2: Descent method with inexact gradients.
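Since the listing of Algorithm 2 is not reproduced above, the following Python sketch only illustrates its overall structure under our own simplifying assumptions: `direction` stands for the modified descent direction v_ε of Theorem 3 (computed, e.g., from problem (7)), the loop stops as soon as this direction practically vanishes, and the Armijo constants are illustrative choices:

```python
import numpy as np

def descent_method(mu0, J, direction, eps_stop=1e-6, beta=0.5, c=1e-4, maxit=100):
    """Backtracking descent loop in the spirit of Algorithm 2: iterate
    mu <- mu + t*v until the supplied direction v is shorter than eps_stop."""
    mu = np.asarray(mu0, float)
    for _ in range(maxit):
        v = direction(mu)
        if np.linalg.norm(v) <= eps_stop:
            break                       # mu is accepted as (inexactly) critical
        t = 1.0
        # Armijo backtracking: require sufficient decrease in every objective
        while np.max(J(mu + t * v) - J(mu)) > -c * t * np.dot(v, v):
            t *= beta
            if t < 1e-12:
                break
        mu = mu + t * v
    return mu
```

With a single objective J(µ) = ‖µ‖² and the exact direction v = −∇J(µ), the loop contracts to the minimizer at the origin, which mimics the behavior of the inexact variant when ε = 0.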

Subdivision Algorithm
As mentioned in the introduction, there exist set-based solution methods for MOPs which globally approximate the Pareto set via sets (instead of a finite number of points).
Here, we will consider the subdivision algorithm [12,13,15], which computes an approximation of the Pareto set as a covering by hypercubes (or boxes). The idea is to start with a large box containing the Pareto set, which is then iteratively subdivided into smaller boxes, while boxes that do not contain part of the Pareto set are eliminated.
There are essentially two versions of the subdivision scheme. One is gradient-free and, thus, particularly useful when the evaluation of gradients is computationally expensive. We refer to the work in [12], where this variant is utilized to numerically realize a reduced-order approach for a PDE-constrained multiobjective optimization problem. The other one is directly based on a dynamical-systems approach and utilizes gradient information in a similar way to memetic algorithms, see [8]. Here, we will generalize the latter to the case of inexact gradients.
For a stepsize t_l > 0, a descent step of the optimization procedure reads

a(µ^l) = µ^l + t_l v^l resp. a_ε(µ^l) = µ^l + t_l v_ε^l,

where v^l and v_ε^l are the descent directions given by Theorems 2 and 3, respectively, with the choice µ = µ^l. Depending on the descent step that we use, we either want to compute the Pareto-critical set P_c or the superset P_2 or P_1 of P_c. As these sets are the sets of fixed points of their respective descent steps, we want to find the subset A_P ⊂ int(P_ad) satisfying a(A_P) = A_P or a_ε(A_P) = A_P.
To generate the set A_P, we will use a subdivision method. This method produces an outer approximation of the set A_P in the form of a nested sequence of sets B_0, B_1, ... ⊂ P(P_ad), where P(P_ad) denotes the power set of P_ad and each B_l refines B_{l−1} in the sense that

⋃_{B ∈ B_l} B ⊆ ⋃_{B ∈ B_{l−1}} B

holds, and B_l consists of finitely many subsets B covering A_P for all l ∈ N. For each set B_l, we define a diameter through diam(B_l) = max_{B ∈ B_l} diam(B). Algorithm 3 shows the subdivision method with both the classical descent direction (based on Theorem 2) and our modified descent direction (based on Theorem 3).
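A minimal Python sketch of the subdivision loop, assuming boxes are stored as (center, radius) pairs and the map `step` realizes a (or a_ε); the fixed sampling offsets and the alternation of subdivision axes are our own simplifications:

```python
import numpy as np

def subdivide(boxes, axis):
    # halve every box (center c, radius r) along the given coordinate axis
    out = []
    for c, r in boxes:
        rn = r.copy()
        rn[axis] /= 2.0
        for s in (-1.0, 1.0):
            cn = c.copy()
            cn[axis] += s * rn[axis]
            out.append((cn, rn))
    return out

def select(boxes, step, offsets):
    # keep every box that is hit by the image of some sample point
    keep = set()
    for c, r in boxes:
        for o in offsets:
            q = step(c + o * r)        # one descent step from the sample point
            for j, (cj, rj) in enumerate(boxes):
                if np.all(np.abs(q - cj) <= rj):
                    keep.add(j)
    return [boxes[j] for j in sorted(keep)]

def subdivision(box0, step, iterations, dim):
    boxes = [box0]
    offsets = np.vstack([np.zeros(dim), 0.5 * np.eye(dim), -0.5 * np.eye(dim)])
    for l in range(iterations):
        boxes = subdivide(boxes, l % dim)   # subdivision step
        boxes = select(boxes, step, offsets)  # selection step
    return boxes
```

As a sanity check, if `step` maps every point directly to the minimizer of a scalar objective (so A_P is a single point), the surviving boxes shrink onto that point while all other boxes are discarded.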

Algorithm 3: Subdivision algorithm.
Require: B_0 ⊂ P(P_ad), a finite collection of subsets of P_ad with ⋃_{B∈B_0} B = P_ad; θ ∈ (0, 1); l = 0; in the setting of inexact gradients, ε_1, ..., ε_k.
1 while a_(ε)(⋃_{B∈B_l} B) ≠ ⋃_{B∈B_l} B do

Remark 4. In order to realize the subdivision algorithm numerically, we choose a similar way as described in ([13], Remark 2.4). Instead of working explicitly with the centers and radii of the boxes, these are stored within a binary tree in the subdivision step, whereby the memory requirement is noticeably reduced. The selection step is implemented using a certain number of sample points in each box. These sample points are chosen either on an a priori defined grid or randomly within the boxes. Afterwards, a_(ε) is evaluated at these points. For more details, we refer the reader to ([24], Section 5). ♦

Modified Subdivision Algorithm for Inexact Gradients
In Algorithm 2, we utilize the same error bounds ε = (ε_1, ..., ε_k) with ε_i ≥ 0, i = 1, ..., k, in each iteration step l. Note that the larger ε is, the greater the difference between P_c and P_2 or P_1 becomes. In the algorithm, we produce an outer approximation of the set A_P with a nested sequence of sets {B_l}_{l∈N} and write B̄_l = ⋃_{B∈B_l} B ⊆ P_ad.
As B̄_l ⊂ P_ad holds, the maximal gradient errors over B̄_l are bounded by the errors over all of P_ad, i.e., by the ε_i from (5). Now we modify Algorithm 3 by utilizing the descent directions introduced in Theorem 3 and update ε after every iteration step l to generate a better approximation of the set A_P.
For updating ε, we use the formula

ε_i^l = max_{µ ∈ B̄_l} ‖∇Ĵ_i(µ) − ∇Ĵ′_i(µ)‖₂, i = 1, ..., k,   (9)

and set ε^l = (ε_i^l)_{1≤i≤k}. Due to the nested choice of the box coverings, we have ε_i^{l+1} ≤ ε_i^l for i = 1, ..., k and l ∈ N. In iteration step l, we generate the descent direction by computing a minimizer of (7) with ε replaced by ε^l. Then, we set the resulting descent step a_ε^l accordingly. Typically, the set P_2 is far smaller than the admissible set P_ad. Therefore, we expect that the error bounds in (9) become significantly smaller than the ones from (5) and that we obtain better results with the modified function a_ε^l instead of a_ε.
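The update (9) can be sketched as follows, assuming `grad_fine` and `grad_reduced` each return one gradient row per objective at a parameter point; sampling the boxes at fixed offsets is a stand-in for the precomputed offline errors described in Section 4:

```python
import numpy as np

def update_eps(boxes, grad_fine, grad_reduced, offsets):
    """Tighten the error bounds on the current covering: for each objective i,
    eps_i^l = max over sample points mu of the boxes of
    || grad J_i_fine(mu) - grad J_i_reduced(mu) ||_2."""
    eps = None
    for c, r in boxes:
        for o in offsets:
            mu = c + o * r
            # one 2-norm error per objective (rows of the Jacobians)
            e = np.linalg.norm(grad_fine(mu) - grad_reduced(mu), axis=1)
            eps = e if eps is None else np.maximum(eps, e)
    return eps
```

Because the coverings are nested, re-running this maximization on the surviving boxes can only shrink the bounds, which is exactly the monotonicity ε_i^{l+1} ≤ ε_i^l used above.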

Multiobjective Optimization of a Semi-Linear Elliptic PDE
In this section, we introduce a multiobjective parameter optimization problem governed by a semi-linear elliptic PDE. Further, we show how reduced-order modeling can be applied efficiently.
subject to the elliptic boundary value problem (11b), whose weak formulation in the space V is tested with functions ϕ ∈ V; see ([25], p. 133) for instance.

The Parameter Dependent Semi-Linear Elliptic PDE
In this subsection, we study the state equation (11b). First, we define the nonlinear operator appearing in its weak formulation. Then, we define a weak solution to the state equation (11b).
Proposition 1. For a fixed parameter µ ∈ R^m, there exists a unique solution y ∈ V to (11b). This solution is even continuous on Ω̄, and its C(Ω̄)-norm is bounded by a constant c_∞.

Remark 5. We define the state space Y = V ∩ C(Ω̄), which is a Banach space endowed with the natural norm. Motivated by Proposition 1, we define the parameter-to-state mapping S : P_ad → Y as follows: for a given parameter µ ∈ P_ad, the function y = S(µ) ∈ Y is the solution to (11b). It follows by standard arguments that S is continuously Fréchet-differentiable, see ([23], Sections 2 and 4). ♦

Reduced Formulation and Adjoint Approach
Utilizing the parameter-to-state mapping S, we define the reduced cost functional Ĵ(µ) = J(S(µ), µ). If μ ∈ P_ad is a locally optimal solution to (13), then the pair (S(μ), μ) is a locally optimal solution to (11). Conversely, if (S(μ), μ) ∈ Y × P_ad solves (11) locally, then the parameter μ is a locally optimal solution to (13).
To apply the subdivision algorithm, the reduced objective function Ĵ has to be Fréchet-differentiable, which follows immediately from the fact that the parameter-to-state operator S is Fréchet-differentiable, cf. Remark 5. The gradient of Ĵ can be expressed by introducing adjoint variables. For that purpose, we define the operators B_i. Next, we define the adjoint variables.

Definition 6. Let a parameter µ ∈ P_ad be given and y(µ) = S(µ) ∈ Y. For every i ∈ {1, ..., k − 1}, we call the solution p_i(µ) ∈ V to (15) the adjoint variable associated with the objective Ĵ_i.

Finite Element (FE) Galerkin Discretization
Let us briefly introduce a standard finite element (FE) method based on a triangulation of the spatial domain Ω. Here, we utilize piecewise linear FE ansatz functions ϕ_1, ..., ϕ_N ∈ V, which are linearly independent. We define the finite-dimensional subspace V^N = span{ϕ_1, ..., ϕ_N} ⊂ V, supplied with the same topology as V.
Next, we replace (12) by an FE Galerkin scheme: for each µ ∈ P_ad, the FE solution y^N(µ) ∈ V^N is determined by testing with the basis functions ϕ_1, ..., ϕ_N. It follows by the same arguments as for (12) that the FE problem (16) has a unique solution y^N = y^N(µ) for every µ ∈ P_ad. Therefore, the parameter-to-state FE mapping S^N : P_ad → V^N, µ ↦ S^N(µ) = y^N(µ), is well defined. Inserting the basis expansion (17) into (16), we can express (16) as a nonlinear algebraic system. For that purpose, we introduce the N × N matrices K, M, and Q, the corresponding N-vectors, and the nonlinearity function H : R^N → R^N. Inserting (17) into (16), we end up with the nonlinear system (18).

Remark 7. The difficulty of (18) lies in the fact that we cannot assemble H(y^N) efficiently in terms of a matrix-vector multiplication. Therefore, we use mass lumping and approximate H(y^N) ≈ M̃ (y^N)³ with (y^N)³ = ((y_i^N)³)_{1≤i≤N} and the lumped mass matrix M̃ defined by M̃_ii = ∑_{j=1}^N M_ij and M̃_ij = 0 for i ≠ j. Further details can be found in [23,29]. Let us refer to [30], where mass lumping is utilized in optimal control. ♦

Now, the FE objectives are given accordingly.
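The lumping trick of Remark 7 takes only a few lines of NumPy (a sketch; `M` stands for any consistent mass matrix, and the diagonal lumping rule is the row-sum variant stated above):

```python
import numpy as np

def lumped_mass(M):
    # lumped mass matrix: diagonal matrix of the row sums of M
    return np.diag(M.sum(axis=1))

def cubic_term(M, y):
    # mass-lumped evaluation of the nonlinearity H(y) ~ M_lumped @ y**3:
    # a cheap diagonal scaling instead of assembling the full nonlinear term
    return M.sum(axis=1) * y**3
```

Because M̃ is diagonal, evaluating the cubic term costs O(N) instead of a full matrix assembly, which is the point of the remark.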

Reduced-Order Modelling (ROM)
To generate the Pareto-critical set of the MOP (11), we need to evaluate the reduced objectives Ĵ_i, i = 1, ..., k, and their gradients many times. Consequently, the state and adjoint equations have to be solved numerically very often, so the use of ROM is a suitable option. In this paper, we will use the Reduced Basis (RB) method. The main idea is to construct a low-dimensional (i.e., ℓ ≪ N) subspace V^ℓ of the FE space V^N spanned by FE solutions of the state and adjoint equations for appropriately chosen parameters µ ∈ P_ad. Here, this strategy is realized by greedy algorithms. We refer to the works in [17,18,31] for a general explanation and to ([23], Section 3.2) for our specific problem (11). This is an iterative procedure where, in each iteration, FE solutions of the state and adjoint equations at a specific parameter value are added to the basis. An essential ingredient of greedy algorithms is the choice of an error indicator η(µ). Here, we use the maximal true error between the FE and the ROM gradients, i.e., we set η(µ) = max_{1≤i≤k} ‖∇Ĵ_i^N(µ) − ∇Ĵ_i^ℓ(µ)‖₂. Our subdivision scheme is based on gradient information. To be able to generate P_1 or P_2 for the approximation of P, we have to ensure that the approximated objective function Ĵ^ℓ based on the ROM model satisfies the inequality in (5). The idea is to generate a new basis element in every step until the maximal error η(µ) on a discrete training set S_train, which approximates P_ad sufficiently well, is smaller than a tolerance ε_tol. For more details, see Algorithm 4.
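A generic weak-greedy loop can be sketched as follows. As a stand-in for the gradient-based indicator η(µ) used in the paper, this illustration measures the projection error of the snapshot itself; `snapshot` is a placeholder for the FE solve at a parameter:

```python
import numpy as np

def weak_greedy(train, snapshot, eps_tol, max_basis=50):
    """Weak greedy basis generation: repeatedly add the (orthonormalized)
    snapshot at the worst-approximated training parameter until the
    error indicator drops below eps_tol."""
    n = len(snapshot(train[0]))
    Psi = np.zeros((n, 0))
    while Psi.shape[1] < max_basis:
        residual = lambda mu: snapshot(mu) - Psi @ (Psi.T @ snapshot(mu))
        errs = [np.linalg.norm(residual(mu)) for mu in train]
        j = int(np.argmax(errs))        # worst-approximated parameter
        if errs[j] <= eps_tol:
            break                       # indicator small enough everywhere
        r = residual(train[j])
        Psi = np.column_stack([Psi, r / np.linalg.norm(r)])  # Gram-Schmidt
    return Psi
```

The returned matrix Psi plays the role of the coefficient matrix Ψ below: its orthonormal columns are the RB basis vectors expressed in the FE basis.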
As V^ℓ is a subspace of V, we endow V^ℓ with the V-topology. Since ψ_j ∈ V^N (1 ≤ j ≤ ℓ), there exists a coefficient matrix Ψ ∈ R^{N×ℓ} such that the RB basis functions can be expanded in the FE basis; cf. (21). Now, we replace (16) by an RB Galerkin scheme: for each µ ∈ P_ad, the RB solution y^ℓ(µ) ∈ V^ℓ is determined by testing with ψ_1, ..., ψ_ℓ. We suppose that (22) has a unique solution y^ℓ = y^ℓ(µ) for every µ ∈ P_ad. Therefore, the parameter-to-state RB mapping S^ℓ : P_ad → V^ℓ, µ ↦ S^ℓ(µ) = y^ℓ(µ), is well defined. Inserting (21) and y^ℓ(µ) = ∑_{i=1}^ℓ y_i^ℓ(µ) ψ_i into (22), we derive the nonlinear algebraic system (cf. (18)) with the ℓ × ℓ matrices K^ℓ = Ψ^T KΨ, M^ℓ = Ψ^T MΨ, and Q^ℓ = Ψ^T QΨ, the ℓ-vectors involving (Ψy^ℓ)³ = (((Ψy^ℓ)_i)³)_{1≤i≤N}, and the ℓ × N matrix M̃^ℓ = Ψ^T M̃. However, in the RB Galerkin scheme the evaluation of the nonlinearity is still as costly as in the FE case. Here, discrete empirical interpolation (DEIM) is applied, cf. [19,20]. We skip the detailed description here and refer the reader to ([23], Section 3.2). ♦
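Combining the Galerkin projection with the mass-lumped cubic term, the reduced nonlinear system can be solved by Newton's method. The following sketch is our own illustration with dense NumPy matrices and a zero initial guess; in the full method the nonlinearity would additionally be treated by DEIM:

```python
import numpy as np

def solve_rb_state(K, M_lumped, f, Psi, tol=1e-12, maxit=50):
    """Newton's method for the projected nonlinear system (cf. (18)):
    Psi^T ( K Psi y + M_lumped (Psi y)^3 - f ) = 0,
    with the cubic term evaluated through the lumped mass matrix."""
    y = np.zeros(Psi.shape[1])              # reduced coefficient vector
    for _ in range(maxit):
        Y = Psi @ y                          # lift to the FE space
        res = Psi.T @ (K @ Y + M_lumped @ Y**3 - f)
        if np.linalg.norm(res) < tol:
            break
        # Jacobian of the projected residual with respect to y
        jac = Psi.T @ (K + M_lumped @ np.diag(3.0 * Y**2)) @ Psi
        y -= np.linalg.solve(jac, res)
    return y
```

With Ψ equal to the identity this reduces to the FE Newton iteration; with fewer columns it solves the same equations restricted to the RB subspace.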

Convergence Analysis
We prove the convergence of the RB solution with mass lumping to the weak solutions of the state and adjoint equations.

Remark 9.
As the FE space is finite-dimensional, the error in the state and adjoint equations tends to zero as the dimension of the RB space increases. Therefore, the RB solution converges to the FE solution as the accuracy of the greedy algorithm increases. We skip a detailed description of the proof here and refer the reader to ([23], Section 3.2). ♦

Theorem 4. Let (S_j)_j ⊂ P_ad with S_j ⊆ S_{j+1} be a growing sequence of training sets, and choose a monotone sequence (ε_j)_j ⊂ R_{>0} satisfying ε_j → 0 for j → ∞. Then, the RB solutions converge to the weak solution of the state equation.

Proof. Let µ ∈ P_ad be an arbitrary parameter. By the triangle inequality, the error between the weak solution and the RB solution is bounded by the sum of the FE error and the RB error, with N > ℓ. From Theorem A1 (see Appendix A), we infer that the first summand converges to zero for increasing N. For the second summand, we refer the reader to ([23], Section 3.2.5), where the convergence of the RB solution to the FE solution is proved. For the adjoint equation, we can argue in the same way, and the claim follows.

Numerical Experiments
In this section, we use our algorithm to solve multiobjective optimization problems with PDE constraints and interpret the numerical results. All computations were executed on a computer with a 2.9 GHz Intel Core i7 CPU, 8 GB of RAM, and an Intel HD Graphics 4000 GPU with 1536 MB of memory. The algorithms were implemented in Matlab R2017b. For the subdivision method, we used the implementation from https://math.uni-paderborn.de/en/ag/chair-of-applied-mathematics/research/software.
In this example, we numerically investigate the application of the modified subdivision algorithm presented in Section 2.5 to the PDE-constrained multiobjective optimization problem, using the RB-DEIM solver from Section 3.5. For the underlying PDE, we set d = 2 and Ω = (0, 1)² with elements x = (x_1, x_2), which leads to the PDE (25). In Figure 1, the corresponding solutions of the state equation are shown for three values of µ. In [23], we have already observed that the error between the FE and RB-DEIM solutions of the state and adjoint equations decreases as the FE grids get finer. We skip the detailed description here and refer the reader to ([23], Section 5). Notice that y(x) = x_1² + x_2² solves (25) for µ = (0, 0). For the FE solver, we used linear finite elements with ∆H_max = 0.04, resulting in 762 degrees of freedom. We choose two objective functions Ĵ_1 and Ĵ_2 with µ_d = (1, 1). For the desired state y_d, we take the FE solution for µ = 0, i.e., y_d = y^N(0). The associated FE objectives are given in terms of the adjoint FE solution p^N(µ) = ∑_{j=1}^N p_j^N(µ) ϕ_j to (15), the coefficient vector p^N(µ) = (p_j^N(µ))_{1≤j≤N}, and F_i = (∫_Ω ϕ_j ξ_i dx)_{1≤j≤N}. The associated RB objective functions have the analogous form with y_{d,ℓ} = Ψ^T y_d. In [23], we have already observed good agreement between the approximated Pareto-critical sets computed with the FE and RB-DEIM solvers if the error between the gradients is sufficiently small. However, we cannot always guarantee this agreement. Thus, we will instead use the supersets P_1 and P_2 from Section 2.3 for the approximation of P (and P_c), which is only based on the reduced objective function Ĵ^ℓ and the error bounds ε.
To generate P_1, we compute the common steepest descent direction for all components of Ĵ^ℓ. Similarly to Algorithm 1, we first calculate α^l as a solution of (4) for µ = µ^l; this yields the descent direction v^l = −DĴ^ℓ(µ^l)^T α^l. To generate P_2, we use Algorithm 2.
To save computational time during the modified subdivision algorithm, we calculate the gradient errors ‖∇Ĵ_i^N(µ) − ∇Ĵ_i^ℓ(µ)‖₂, i = 1, ..., k, before the algorithm starts (i.e., in an offline phase) on a training set P_train, which approximates P_ad, and store these errors. During the subdivision algorithm, we use them to generate the new ε_i^{l+1} faster and without calculating the FE solution again. To see a significant difference between the modified subdivision algorithm and the subdivision algorithm, as well as between P_2 and P_1, we need a large error between the gradients. To achieve this, we choose the coarse training set S_train = {(−2 + 0.51k, −2 + 0.427j) | (k, j) ∈ {0, ..., 7} × {0, ..., 9}} with |S_train| = 80. For the greedy algorithm, we choose the tolerance ε_tol = 0.06 and the true error η(µ) := max_{1≤i≤k} ‖∇Ĵ_i^N(µ) − ∇Ĵ_i^ℓ(µ)‖₂ as error indicator; for the DEIM algorithm, we choose the tolerance ε = 0.5. With these settings, we generate an RB basis with six elements and a DEIM basis with 18 elements. This leads to estimates for the maximal errors between the gradients of Ĵ^N and Ĵ^ℓ. To generate these estimates, we chose a training set P_train ⊂ P_ad with 3105 equally distributed test points and evaluated the error at these points. Now, we test the subdivision and the modified subdivision algorithm with the following settings. To compute the descent direction, we use the Matlab function fmincon to solve (4) or (7). The algorithm stops when the box size is small enough, which in our case is after 25 iteration steps. In every step, we halve the boxes. We choose five sample points per box during the first five iteration steps, four sample points in the next five iteration steps, three sample points for the iteration steps ten to 14, two sample points for the next five steps, and one sample point for the last five iteration steps, see Remark 4.
Figure 2 shows the approximated Pareto sets generated by the FE solver after 20 and 25 iteration steps. The results of the subdivision algorithm with the RB-DEIM solver are shown in Figure 3. The modified subdivision algorithm with the RB-DEIM solver leads to the results plotted in Figure 4. The runtime, the number of boxes, and the number of function and gradient evaluations needed in iteration steps 10, 15, 20, and 25 are shown in Tables 1-5. The total runtime, the number of function and gradient evaluations needed for the different methods, and the speed-up are shown in Table 6. The greedy algorithm and DEIM together take 14.06 s in the offline phase. Ensuring the error bound ε (cf. (5)) for the gradients on the training set P_train takes 151.13 s. It follows from Figures 3 and 4 that P_2 is significantly smaller than P_1, for the subdivision algorithm as well as for the modified subdivision algorithm. Therefore, we obtain a much better approximation of P_c if we choose P_2 instead of P_1. If we compare the two subdivision algorithms, we notice that the modified subdivision algorithm yields a better approximation of P_c than the subdivision algorithm.
The reason for this is that in the modified subdivision algorithm we have a monotonically decreasing sequence ε^l. In Figure 5, the error between ∇Ĵ_1^N and ∇Ĵ_1^ℓ is plotted on the parameter set P_ad. The black markers are some points of the Pareto-critical set P_c. It turns out that the difference between the two gradients is significantly smaller near P_c than in other regions of P_ad. Due to this, the sequence ε^l in the modified subdivision algorithm decreases noticeably, so it is useful to update ε after each iteration step. As we have already mentioned, P_2 is the better approximation of P_c; if P_2 is generated by the modified subdivision algorithm, the result is even better. Comparing Figures 2b and 4b, there is no significant difference between the two sets. Therefore, the modified subdivision algorithm yields a good approximation P_2 of P_c, although the error ε_1 is not small. Regarding the computational time (cf. Tables 1-6), we notice that in the first 20 iterations the four RB-based methods take almost the same time. In these iteration steps, the FE solver takes between 19 and 30 times longer per iteration step than the four RB-based methods. Only in the subsequent iteration steps does a significant difference between the four RB-based methods appear. For these iteration steps, the FE solver takes between 3.9 and 30.8 times as much time per iteration step as each of the four methods. We notice that with the subdivision algorithm, the computation of P_1 takes around twice as much time as the computation of P_2. The main reason for this is the much larger number of function and gradient evaluations for P_1, which in turn can be attributed to the significantly larger number of boxes for P_1. If we use the modified subdivision algorithm, the computation of P_2 takes slightly more time than the computation of P_1. The main reason is again the larger number of function and gradient evaluations. Unlike the previous case, this cannot be attributed to the number of boxes, but rather to the calculation of the modified descent direction, which requires more function and gradient evaluations. Nevertheless, the difference in the computational time is marginal (a factor of about 1.04) for the modified subdivision algorithm. For P_1, we notice another behavior: here, the generation of P_1 with the modified subdivision algorithm is two times faster than with the subdivision algorithm. This is mainly because of the larger number of function evaluations of the subdivision algorithm, which is due to its larger number of boxes. The FE solver takes approximately 23.6 times as much time as the computation of P_1 and approximately 22.6 times as much time as the computation of P_2 with the modified subdivision algorithm. This is another advantage of the modified subdivision algorithm: as it yields a better approximation, we have fewer boxes in the iteration steps and, therefore, fewer function and gradient evaluations than with the subdivision algorithm. Finally, we conclude that the modified subdivision algorithm works better and faster than the subdivision algorithm. As P_2 is a tighter approximation of P_c, it is better to generate P_2 rather than P_1.

Conclusions
In this article, we presented a way to solve multiobjective parameter optimization problems for semi-linear elliptic PDEs by combining an RB approach and DEIM with a set-oriented method based on gradient evaluations. To deal with the error introduced by the surrogate model, we derived an additional condition for the descent direction, which allows us to consider the errors for the objective functions independently and to derive a superset P_2 of the Pareto-critical set P_c. To obtain an even tighter superset, we update these error bounds in our subdivision algorithm after each iteration step. To summarize the numerical results, we first investigated the influence of the error bounds for the gradients of the objective functions. By individually adapting the components of the error bounds, we obtained a tighter covering of the Pareto-critical set. When we additionally adjusted the error bounds in each iteration step, the result became even tighter and almost coincided with the exact solution of the MOP (the solution with the FE solver and the steepest descent method). Furthermore, we compared the computational time for each method. The FE solver needed between 11.9 and 23.6 times more time than the four different RB-based methods presented in this work. For future work, it could be interesting to improve the results in ([23], Section 3.2.4) and to develop an efficient a posteriori error estimator for the errors in the objectives and their gradients, cf., e.g., [32][33][34]. These error bounds could then be used in a weak greedy algorithm and, beyond that, for the error bounds needed in the subdivision algorithm.

Algorithm 1 (steps 4-6): Calculate α^l as a solution of (4) for µ = µ^l; set v^l = −DĴ(µ^l)^T α^l; end.

Remark 2. (a) If Algorithm 1 terminates after a finite number l of iterations, then µ^l is a Pareto critical point. (b) Assume that Algorithm 1 does not stop after a finite number of iterations. Then, every accumulation point μ of the sequence {µ^l}_{l∈N} generated by Algorithm 1 is a Pareto critical point. A proof based on ([7], Theorem 1) can be found in ([23], Theorem 5.2.5). ♦

and y_{d,1}, ..., y_{d,k−1} ∈ H. Moreover, b, c, and d are non-negative constants with c > 0. As Ω is a bounded connected open set with smooth boundary, it is known that V is a Hilbert space endowed with the inner product ⟨ϕ, ψ⟩_V = ∫_Ω ∇ϕ · ∇ψ dx + ∫_Γ ϕψ ds for ϕ, ψ ∈ V and the induced norm ‖ϕ‖_V = ⟨ϕ, ϕ⟩_V^{1/2}.

Figure 2. Pareto-critical set P_c in iteration step (a) 20 and (b) 25 for the FE solver.

Figure 3. (a) P_1 and (b) P_2 generated with the subdivision algorithm.

Figure 4. (a) P_1 and (b) P_2 generated with the modified subdivision algorithm.

Table 1. Performance of the subdivision algorithm with the FE solver and the steepest descent method.

Table 2. Subdivision algorithm with the steepest descent method.

Table 3. Subdivision algorithm with the modified descent direction.

Table 4. Modified subdivision algorithm with the steepest descent method.

Table 5. Modified subdivision algorithm with the modified descent direction.

Table 6. Total runtime, number of function and gradient evaluations for the different methods, and the speed-up.