Numerical Computation of Lightly Multi-Objective Robust Optimal Solutions by Means of Generalized Cell Mapping

Abstract: In this paper, we present a novel algorithm for the computation of lightly robust optimal solutions for multi-objective optimization problems. To this end, we adapt the generalized cell mapping, originally designed for the global analysis of dynamical systems, to the current context. This is the first time that a set-based method has been developed for such kinds of problems. We demonstrate the strength of the novel algorithm on several benchmark problems as well as on one feedback control design problem in which the objectives are given by the peak time, the overshoot, and the absolute tracking error for a linear control system with a control time delay. The numerical results indicate that the new algorithm is well suited for the reliable treatment of low dimensional problems.


Introduction
In many real-world engineering problems, one is faced with the situation that several objectives have to be optimized concurrently, leading to a multi-objective optimization problem (MOP). The typical goal for such problems is to identify the set of optimal solutions (the so-called Pareto set) and its image in objective space, the Pareto front. However, in practice, the decision-maker may not always be interested in the best solutions, in particular if these solutions are sensitive to perturbations [1,2]. In such cases, there exists an additional challenge: one has to search not only for solutions with a good performance but also for solutions that can be implemented, leading to the so-called robust multi-objective optimization problem (RMOP) [3]. In this context, the notion of robustness is not uniquely defined, since it relies on the information at hand about the given problem as well as on the preferences of the decision-maker. Consequently, there exist multiple definitions of robustness for different scenarios [3–6]. The interested reader is referred to [7] for a survey of the different definitions of robustness.
Recently, lightly robust multi-objective optimal solutions were proposed [7]. These solutions are often good candidates for the decision-maker since they are both reliable and yield good performance. In this case, a solution is considered to be feasible if it is "close enough" to an optimal solution. Then, the "most reliable" solutions are chosen with respect to the set-based minmax robust efficiency [3]. Thus, lightly robust optimal solutions yield a performance similar to that of optimal ones while being more reliable.

Background
In the following, the basic concepts to understand the sequel are presented.

Multi-Objective Optimization
This work considers continuous multi-objective optimization problems of the form

min_{x∈Q} F(x),    (1)

where F is defined as the vector of the objective functions F : Q → R^k, F(x) = (f_1(x), . . . , f_k(x))^T. Further, each objective f_i : Q → R is assumed to be continuously differentiable; we stress, however, that in practice continuity is sufficient. In multi-objective optimization, optimality is defined by the concept of dominance. Definition 1. (a) Let v, w ∈ R^k. Then the vector v is less than w (v <_p w) if v_i < w_i for all i ∈ {1, . . . , k}. The relation ≤_p is defined analogously.
(b) y ∈ Q is dominated by a point x ∈ Q (x ≺ y) with respect to (1) if F(x) ≤_p F(y) and F(x) ≠ F(y); otherwise, y is called non-dominated by x. (c) x ∈ Q is a Pareto point if there is no y ∈ Q that dominates x. (d) x ∈ Q is weakly Pareto optimal if there exists no y ∈ Q such that F(y) <_p F(x).
The set P Q of Pareto points within Q is called the Pareto set and its image F(P Q ) the Pareto front. Typically, both sets form (k − 1)-dimensional objects [17].
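The dominance relation of Definition 1 can be sketched in a few lines of Python; this is an illustrative helper for exposition, not part of the paper's Matlab/C implementation:

```python
import numpy as np

def dominates(fx, fy):
    """True if objective vector fx dominates fy, i.e., fx <=_p fy and fx != fy."""
    fx, fy = np.asarray(fx, dtype=float), np.asarray(fy, dtype=float)
    return bool(np.all(fx <= fy) and np.any(fx < fy))

def pareto_filter(F):
    """Indices of the non-dominated rows of a list of objective vectors F."""
    return [i for i, fi in enumerate(F)
            if not any(dominates(fj, fi) for j, fj in enumerate(F) if j != i)]
```

For instance, for the objective vectors (1, 2), (2, 1), and (2, 2), the first two are mutually non-dominated while the third is dominated by (1, 2).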
In some cases it is worth looking for solutions that are 'close' in objective space to the optimal solutions, for instance as backup solutions; these form the so-called set of nearly optimal solutions. The notion of (−ε)-dominance can be used to define this set. Definition 3 ([18]). Denote by P_{Q,ε} the set of points in Q ⊂ R^n that are not (−ε)-dominated by any other point in Q, i.e., P_{Q,ε} := {x ∈ Q | ∄ y ∈ Q : y ≺_{−ε} x}.
Thus, in the remainder of this work, the aim will be to compute the set P_{Q,ε}. To the best of the authors' knowledge, there exist only a few algorithms that aim to find approximations of P_{Q,ε} [19–22]. Most of these approaches use the archiver ArchiveUpdateP_{Q,ε} as the key component to maintain a representation of the set of interest.
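As an illustration, a (−ε)-dominance test and a brute-force computation of P_{Q,ε} over a finite candidate set could look as follows; the exact form of the relation (F(y) + ε ≤_p F(x), strict in at least one component) is our reading of Definition 3, and the sketch is for exposition only:

```python
import numpy as np

def eps_dominates(fy, fx, eps):
    """True if y (-eps)-dominates x: F(y) + eps <=_p F(x), with strict
    improvement in at least one component (assumed form of Definition 3)."""
    fy, fx, eps = (np.asarray(a, dtype=float) for a in (fy, fx, eps))
    return bool(np.all(fy + eps <= fx) and np.any(fy + eps < fx))

def nearly_optimal(F, eps):
    """Indices of points not (-eps)-dominated by any other point: a
    brute-force approximation of P_{Q,eps} over a finite candidate set."""
    return [i for i, fi in enumerate(F)
            if not any(eps_dominates(fj, fi, eps) for j, fj in enumerate(F) if j != i)]
```

Note that enlarging ε enlarges the set: points whose deterioration stays within ε of some better point are no longer discarded.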

Uncertain Multi-Objective Optimization
Next, an uncertain multi-objective optimization problem (UMOP) is defined. Here, it is assumed that uncertainties in the problem formulation are given as scenarios from a known uncertainty set U ⊆ R^m, and that F : Q × U → R^k. In particular, this work focuses on uncertainties in the decision variables. According to [1], these are referred to as Type B uncertainties modeled in a deterministic way. Thus, the following definitions are adapted to this context. Definition 4. An UMOP P(U) = (P(δ), δ ∈ U) is defined as the family of parameterized problems P(δ): min_{x∈Q} F(x, δ), where F : Q × U → R^k and Q ⊆ R^n.
Note that it is not clear what a solution to such a family of problems is. In the following, a concept of robustness is introduced [3].
For a given feasible solution x, the worst case of the objective vector is interpreted as a set, namely the set of efficient solutions of the multi-objective problem of maximizing the objective function over the uncertainty set. Formally, this can be written as follows:

Definition 5 (Robust efficiency (re) [3]). Given an UMOP P(U), a feasible solution x̄ ∈ Q is called set-based minmax robust efficient if there is no x ∈ Q \ {x̄} such that F_U(x) ⊆ F_U(x̄) − R^k_≥, where F_U(x) := {F(x, ξ) : ξ ∈ U} and R^k_≥ represents the dominance cone.

The robust counterpart of an uncertain multi-objective optimization problem is the problem of identifying all x ∈ Q which are re. Thus, the robust counterpart problem can be stated as

min_{x∈Q} sup_{δ∈U} F(x, δ),

where sup_{δ∈U} F(x, δ) is defined as the set of efficient solutions of the multi-objective optimization problem max_{δ∈U} F(x, δ).

One of the main criticisms of the previous definition is that it can be over-conservative since it considers the worst case. Thus, solutions with a poor performance in terms of their objective functions could be selected. As a possible remedy, the authors of [7] extended the notion of lightly robust solutions to the multi-objective context. In this case, given a nominal scenario δ̂ ∈ U, let Q_δ(δ̂) be the set of efficient solutions of P(δ̂). For each efficient solution x̄ ∈ Q_δ(δ̂) of P(δ̂) and some given ε ∈ R^k with ε ≥ 0, the authors define the uncertain multi-objective optimization problem LR(x̄, ε, U) := (LR(x̄, ε, δ), δ ∈ U) as the family of parameterized, deterministic multi-objective optimization problems

LR(x̄, ε, δ): min_{x∈Q} F(x, δ) subject to F(x, δ̂) ≤_p F(x̄, δ̂) + ε.

Definition 6 (Lightly robust efficiency (lre)). Given an uncertain multi-objective optimization problem P(U) with nominal scenario δ̂ ∈ U and some ε ∈ R^k_≥. Then a solution x̃ ∈ Q is called lightly robust efficient for P(U) w.r.t. ε if it is set-based minmax robust efficient for LR(x̄, ε, U) for some x̄ ∈ Q_δ(δ̂).
Thus, the robust counterpart of this uncertain multi-objective optimization problem is the problem of identifying all x ∈ Q which are lre.

Cell Mapping Techniques
Cell mapping techniques were first introduced in [9] for the global analysis of nonlinear dynamical systems. They transform the classical point-to-point dynamics into a cell-to-cell mapping by discretizing both the phase space and the integration time. The phase space discretization restricts the method to a small number of variables (say, n < 10), but the global analysis offers in turn much more information than other methods. The cell mapping techniques are hence particularly advantageous for the thorough investigation of low dimensional problems. Such problems occur, for instance, when optimizing the control gains in optimal control [11,12,23–28]. In the context of multi-objective optimization, the benefit is in particular the extended set of options that can be offered to the decision-maker (DM) after analyzing the model. In [29], the authors adapted the simple cell mapping (SCM) to the multi-objective context. The proposed algorithm is capable of computing the Pareto set/front and the set of approximate solutions. The method is particularly advantageous if there exist several possibilities to obtain the same optimal or nearly optimal performance. It is important to note that the relevant information about all these sets of interest is available after one single run of the algorithm (together with an ex-post analysis of the obtained data). In GCM, unlike in SCM, where each cell has exactly one image, a cell z is allowed to have several image cells, the successors of z. Each of the image cells is assigned a fraction of the total transition probability, which is called the transition probability with respect to z.
The transition probabilities can be grouped into a transition probability matrix P of order N_c × N_c, where N_c is the total number of cells. Then the evolution of the system is completely described by

p(n + 1) = P · p(n),

where p is a probability vector of dimension N_c that represents the probability function of the state. This generalized cell mapping formulation leads to an absorbing Markov chain [30]. In the following, some concepts that are useful for this work are presented (see [9] for more details).
A Markov chain is absorbing if it has at least one absorbing state, and it is possible to go to an absorbing state from every state (not necessarily in one step). Two types of cells can be distinguished: A periodic cell i is a cell that is visited infinitely often once it has been visited. In our work, the focus is on periodic cells of period 1, i.e., P ii = 1. These kinds of cells correspond to the local optima candidates.
A transient cell is by definition a cell that is not periodic. For an absorbing Markov chain, the system will leave the transient cells with probability one and will settle on an absorbing (periodic) cell.
To consider an arbitrary absorbing Markov chain, renumber the states so that the transient states come first. If there are r absorbing states and ts transient states (N_c = r + ts), the transition matrix has the following canonical form:

P = [ Q  R
      0  I ],

where Q is a ts × ts matrix, R is a nonzero ts × r matrix, 0 is an r × ts zero matrix, and I is the r × r identity matrix. The matrix Q gathers the probabilities of transitioning from one transient state to another, whereas the matrix R describes the probabilities of transitioning from transient states to absorbing states.
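Extracting Q and R from a given transition matrix can be sketched as follows; the row-stochastic convention (states as rows) and the test P_ii = 1 for absorbing cells are our assumptions for this illustration:

```python
import numpy as np

def canonical_form(P):
    """Reorder a row-stochastic transition matrix so the transient states come
    first and return (Q, R, order). Absorbing states are those with P[i][i] == 1."""
    P = np.asarray(P, dtype=float)
    absorbing = [i for i in range(len(P)) if P[i, i] == 1.0]
    transient = [i for i in range(len(P)) if i not in absorbing]
    order = transient + absorbing
    Pc = P[np.ix_(order, order)]          # permuted matrix in canonical form
    ts = len(transient)
    Q = Pc[:ts, :ts]                      # transient -> transient probabilities
    R = Pc[:ts, ts:]                      # transient -> absorbing probabilities
    return Q, R, order
```

For a 3-state chain in which only state 1 is absorbing, the reordering places states 0 and 2 first and yields a 2 × 2 block Q and a 2 × 1 block R.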
For an absorbing Markov chain, the matrix I − Q has an inverse N = (I − Q)^{−1}, which is called the fundamental matrix of the Markov chain. The (i, j)-entry n_{ij} of the matrix N is the expected number of times the chain is in state s_j, given that it starts in state s_i. The initial state is counted if i = j.
The absorbing probability is defined as the probability of being absorbed in the absorbing state j when starting from the transient state i; this is the (i, j)-entry of the matrix B = NR. In terms of cell mapping, the set of all cells i ∈ {1, . . . , ts} with B_{i,j} ≠ 0 is called the basin of attraction of state j, and an absorbing cell within that basin is called the attractor. Table 1 describes the nomenclature of relevant variables used in the manuscript.
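These quantities can be checked numerically on a small hypothetical chain (two transient and two absorbing states, row-stochastic convention):

```python
import numpy as np

def absorption(Q, R):
    """Fundamental matrix N = (I - Q)^-1 and absorbing probabilities B = N R
    of an absorbing Markov chain given in canonical form."""
    Q = np.asarray(Q, dtype=float)
    N = np.linalg.inv(np.eye(len(Q)) - Q)
    B = N @ np.asarray(R, dtype=float)
    return N, B

def basin_of_attraction(B, j, tol=1e-12):
    """Transient cells with a nonzero probability of absorption in state j."""
    return [i for i in range(len(B)) if B[i, j] > tol]
```

Each row of B sums to 1: a transient cell is absorbed somewhere with probability one, matching the statement above about transient cells.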

Proposed Algorithm
This section presents the algorithm for the computation of lightly robust optimal solutions.

General Framework
In the following, the general procedure to compute lightly robust optimal solutions is presented (Algorithm 1). First, the algorithm runs GCM to compute the canonical matrix (line 2 of Algorithm 1). The sub-matrix I contains the periodic cells (candidate optimal solutions of the nominal MOP). Next, these solutions are the starting points of a backward search for nearly optimal solutions (line 3 of Algorithm 1). Then, the cells containing the set of nearly optimal solutions are subdivided, and the process is repeated for a given number of iterations. After that, the algorithm computes the worst case for each cell found in the previous step by solving max_{δ∈U} F(x + δ) for x ∈ P_{Q,ε} (line 6 of Algorithm 1). Finally, the algorithm uses an archiver to filter the best sets of worst cases (line 7 of Algorithm 1). The next sections give details on how to perform each of the steps.

Algorithm 1 GCM for Multi-objective Light Robust Optimal Solutions
Require: F: objective function, δ ∈ R^n: error, lb ∈ R^n and ub ∈ R^n: lower and upper bounds, respectively, N_0 ∈ R^n: cells per dimension, s_0: initial set of cells, iter: number of subdivision steps
Ensure: LR: set of lightly robust solutions
1: for l = 0, . . . , iter do
2:   [P_l, s̄_l] ← GCM(F, s_l, lb, ub, N_l)
3:   P_{Q,ε,l} ← BackwardSearch(P_l, s̄_l)
4:   [s_{l+1}, N_{l+1}] ← Subdivision(P_{Q,ε,l})
5: end for
6: WC ← WorstCases(P_{Q,ε,iter}, P_iter, δ)
7: LR ← ArchiverPre(WC)
8: Return LR

Generalized Cell Mapping for Multi-Objective Optimization
In order to use GCM in the context of multi-objective optimization, one has to define the dynamical system to be used. In the following, the dynamical system used is described; it is based on Pareto dominance. In this case, a given cell s_i ∈ S maps to those neighbors N_e(s_i) that are better according to at least one objective function, i.e., those cells that dominate s_i (Equation (11)). If no neighbor yields an improvement, then the cell belongs to a periodic group, i.e., it is a candidate to be a local optimum (Equation (12)). Then, for each neighbor, the algorithm assigns a probability proportional to the improvement in terms of the objective functions (Equation (13)).
Now that a suitable dynamical system has been defined, it is possible to apply GCM to the multi-objective context. Algorithm 2 shows the key elements to compute the global properties of the MOP at hand. For each cell z, F(z) is compared to the objective values of its neighbors N_e(z). Next, a probability of passing into those cells is assigned, proportional to their function values. If there is no better neighbor cell, the transition probability is divided by the number of neighbors with the same function value and assigned to them. Note that dominated neighbor cells always gain a transition probability of 0. One of the advantages of GCM is its global nature: the method can escape saddle points since it analyzes the whole search space to look for promising directions.
Algorithm 2 (fragment): 5: apply Equation (9) and rearrange s into s̄; 6: Return P, s̄.
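The probability assignment can be sketched as follows; the concrete improvement measure (sum of the objective decreases) is our own assumption, since the text only states that probabilities are proportional to the improvement:

```python
import numpy as np

def transition_probs(f_z, f_neighbors):
    """Transition probabilities from cell z to its neighbors, proportional to
    the improvement of the objective values. The improvement measure (sum of
    objective decreases) is an assumption; dominated or non-improving
    neighbors get 0. Returns None if no neighbor dominates z (periodic cell)."""
    f_z = np.asarray(f_z, dtype=float)
    improv = np.zeros(len(f_neighbors))
    for i, f_n in enumerate(f_neighbors):
        f_n = np.asarray(f_n, dtype=float)
        if np.all(f_n <= f_z) and np.any(f_n < f_z):   # neighbor dominates z
            improv[i] = np.sum(f_z - f_n)
    total = improv.sum()
    if total == 0.0:
        return None                                    # periodic cell of period 1
    return improv / total
```

For a cell with objective values (2, 2), a neighbor at (1, 1) receives four times the probability of a neighbor at (1.5, 2), and a non-dominating neighbor receives 0.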

Computing Approximate Solutions with Backward Search
After one run of the GCM algorithm, information on the global dynamics of the system has been gathered, and the set of interest can be approximated in a post-processing step. For the problem at hand, the archiving technique ArchiveUpdateP_{Q,ε} [20,22] is used.
The integration of both algorithms is as follows: Algorithm 3 first updates the archive with the periodic cells discovered by GCM and then continues with the rest of the periodic motion by inverting the cell mappings. First, a queue is generated with the periodic cells, and until the queue is empty the algorithm searches for nearly optimal solutions. The algorithm takes advantage of the fact that GCM has already encoded the mappings in the canonical matrix. Thus, it is possible to exploit that information to perform a breadth-first search in which new cells are enqueued if they are accepted by the archiver. Note that if a cell s is not accepted by the archiver, neither will be the cells that map to s, since by construction these cells are dominated by s. Thus, Algorithm 3 computes the set of nearly optimal solutions without testing all cells in the search space.
Algorithm 3 (fragment): 7: Q.enqueue(c ∩ A); 8: end while; 9: Return A.
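The breadth-first search over the inverted mapping can be sketched as follows; pre, accept, and the integer cell labels are hypothetical stand-ins for the canonical-matrix data and the archiver:

```python
from collections import deque

def backward_search(pre, periodic, accept):
    """Breadth-first search over the inverted cell mapping. pre[c] lists the
    cells that map into c, periodic seeds the queue, and accept(c) plays the
    role of the archiver test."""
    archive = set(c for c in periodic if accept(c))
    queue = deque(archive)
    while queue:
        c = queue.popleft()
        for p in pre.get(c, []):
            # cells mapping into a rejected cell are never enqueued
            if p not in archive and accept(p):
                archive.add(p)
                queue.append(p)
    return archive
```

Rejected cells prune their whole pre-image, which is exactly why not all cells in the search space need to be tested.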

Subdivision
When using GCM, the whole search space is analyzed. Thus, one commonly needs to find a compromise between computational effort and precision. A promising approach is to use GCM first to compute a coarse picture of the promising regions. Afterward, with this preliminary result, the algorithm focuses on these regions to perform a fine search. Algorithm 4 shows the steps to perform the subdivision of the set of cells that contain nearly optimal solutions.

Algorithm 4 Subdivision
Require: P̄_{Q,ε}: approximation of P_{Q,ε}, l: subdivision level
Ensure: B_l: new collection of cells
To realize the subdivision, multi-level partitions of P̄_{Q,ε} as described in [15] are considered. An n-dimensional cell B (or box) can be expressed as

B(c, r) := {x ∈ R^n : |x_i − c_i| ≤ r_i for all i = 1, . . . , n},

where c ∈ R^n denotes the center and r ∈ R^n the box size, respectively. Every cell B can be subdivided with respect to the j-th coordinate. This division leads to two cells B^−(c^−, r̂) and B^+(c^+, r̂), where

r̂_i = r_i for i ≠ j,  r̂_j = r_j / 2,
c^±_i = c_i for i ≠ j,  c^±_j = c_j ± r_j / 2.

Denote by P(P̄_{Q,ε}, d), d ∈ N, the set of cells obtained after d subdivision steps starting with B(c_0, r_0), where in each step i = 1, . . . , d the cells are subdivided with respect to the j_i-th coordinate and j_i is varied cyclically, that is, j_i = ((i − 1) mod n) + 1. Note that for every point y ∈ Q and every subdivision step d there exists exactly one cell B = B(y, d) ∈ P(P̄_{Q,ε}, d) with center c and radius r such that c_i − r_i ≤ y_i < c_i + r_i for all i = 1, . . . , n. Thus, every set of solutions leads to a (unique) set of cell collections.
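One subdivision step of a box along coordinate j, following the bisection rule above, can be written as a short sketch:

```python
def subdivide(c, r, j):
    """Split the box B(c, r) along coordinate j into B-(c-, r_hat) and
    B+(c+, r_hat), halving the radius in that coordinate."""
    r_hat = list(r)
    r_hat[j] = r[j] / 2.0
    c_minus, c_plus = list(c), list(c)
    c_minus[j] = c[j] - r[j] / 2.0
    c_plus[j] = c[j] + r[j] / 2.0
    return (c_minus, r_hat), (c_plus, r_hat)
```

Splitting the unit box centered at the origin along the first coordinate yields two half-width boxes centered at ∓0.5.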

Compute the Worst Cases
Once the algorithm has computed a suitable representation of the set of nearly optimal solutions, it can search for the worst cases for each solution in P_{Q,ε}. As before, the information provided by GCM allows the set of worst cases to be computed by post-processing the data. For a given cell s, the algorithm finds the 2 δ_i/h_i neighbors in each dimension i = 1, . . . , n, where h is the cell size and δ is the uncertainty. Then, the algorithm computes the set of worst cases. Note that this can be done by transposing the matrix P and then looking for those cells that do not have any image in Q̄ = {x̄ ∈ R^n | x_i − δ_i ≤ x̄_i ≤ x_i + δ_i, i = 1, . . . , n}. Algorithm 5 shows the procedure to compute the worst cases.

Algorithm 5 Computation of worst cases
Require: P̄_{Q,ε}: approximation of P_{Q,ε}, p: probability matrix
Ensure: WC: set of worst cases
1: for all cell ∈ P̄_{Q,ε} do
2:   Select the neighbors of cell (neighbors)
3:   Compute the maxima over the neighbors (WC)
4: end for
5: Return WC
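A brute-force version of this post-processing step could look as follows; the neighborhood sets and per-cell objective values are hypothetical placeholders for the GCM data:

```python
import numpy as np

def worst_cases(F_cells, neighborhoods):
    """For each cell i, keep the non-dominated points of the *maximization* of
    F over the cells reachable within the uncertainty. F_cells maps a cell
    index to its objective vector; neighborhoods maps i to its delta-neighbors."""
    wc = {}
    for i, nbrs in neighborhoods.items():
        vals = np.array([F_cells[j] for j in nbrs + [i]], dtype=float)
        keep = [k for k in range(len(vals))
                if not any(np.all(vals[m] >= vals[k]) and np.any(vals[m] > vals[k])
                           for m in range(len(vals)))]
        wc[i] = vals[keep]
    return wc
```

Since the worst case is itself a set (Definition 5), each cell is mapped to the efficient points of the maximization problem over its neighborhood, not to a single value.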

Compute Best Worst Cases
Finally, the solutions have to be filtered to keep the best worst cases. Algorithm 6 extends the archiver ArchiveUpdateP_Q to handle families of solution sets. In this case, both P and A_0 are families of sets. Note that line 3 of the algorithm uses set-based dominance instead of classical Pareto dominance.

Computational Complexity
In this section, the computational time complexity of each of the presented algorithms is discussed with respect to the number of cells to process. The complexity of the archiver is O(max_i(δ_i/h_i) · N_c²), i = 1, . . . , n.
From the above discussion it follows that the total time complexity of the algorithm to compute lr solutions is O(max_i(δ_i/h_i) · N_c²). It is also important to notice that the total number of cells is given by ∏_{i=1}^n N_i. If one would like to maintain the precision, the number of cells required increases exponentially with the number of dimensions. Thus, the complexity with respect to the number of dimensions is O(N^n). This is the main reason why GCM is restricted to low dimensional problems.
Algorithm 6 (fragment):
6:  for all a ∈ A do
7:    if p ≺_{−ε} a then
8:      A := A \ {a}
9:    end if
10: end for
11: end for
12: return A
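The exponential growth of the cell count can be illustrated directly (N cells per dimension in n dimensions):

```python
def total_cells(N, n):
    """Total number of cells for a grid with N cells per dimension
    in n dimensions: N ** n, i.e., exponential in n."""
    return N ** n
```

With N = 30, moving from two to three dimensions already raises the cell count from 900 to 27,000, and every further dimension multiplies it by another factor of 30.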

Numerical Results
In the following, the experimental design used to validate the performance of the novel algorithm is presented. The algorithms were implemented in Matlab® R2012a and C (connected through mex files). All executions were performed on a desktop computer with an Intel® Core™ i7-2600K 3.40 GHz processor and 4 GB of RAM. Here, four academic test problems were used. These problems were proposed for multi-objective optimization without uncertainty; however, they present interesting features for light robust optimization. Deb99 [31] is a bi-objective optimization problem whose global front is highly sensitive to perturbations since it has a small basin of attraction. This problem was modified to move the local front closer to the global one. Equation (18) shows the definition of the problem.
Two-on-one [32] was proposed to test the capability of an EMOA to compute equivalent Pareto sets. In Sym-part [33], the Pareto set is formed by nine connected components that map to the same region of the Pareto front. SSW [34] is an MOP whose Pareto set falls into four connected components; due to symmetries of the model, two of these components (the two outer curves on the boundary of the domain) map to the same region of the Pareto front.
Figure 1 shows the results of the novel algorithm on Deb99 at each step of the algorithm. First, Figure 1a shows the GCM; there it is possible to see that the Pareto set is located at x_2 = 0.2. Figure 1b shows the results after the application of the BackwardSearch algorithm. In this case, there are two regions of interest located around x_2 = 0.2 and x_2 = 0.6; these regions correspond to the local Pareto sets. Then, once the set P_{Q,ε} has been computed, the next step is to compute the worst cases for each x ∈ P_{Q,ε}. Figure 1c shows the computation of worst cases. Here, it is possible to observe a curve connecting [1, 2] and [0.1, 10] that did not appear in the set of nearly optimal solutions; this curve corresponds to the worst case of the global Pareto front. Finally, Figure 1d shows the lightly robust optimal solutions after the application of ArchiverPre. There, the global Pareto front is dominated (in the re sense) by the local front. Thus, the lightly robust solutions are those in the local front located at x_2 = 0.6.
Now, the results for all the problems are shown. Table 2 shows the parameters used to perform the experiments: ε denotes the allowed deterioration from the Pareto optimal solutions, δ represents the uncertainty, and N is the number of cells used per dimension. In this experimental study, the ∆_2 indicator [35–37] is used to measure the distance of the best solutions found by the algorithm to the real solution in decision space.
Note that this measure can be interpreted as the deviation of the solution found by the algorithm from the real solution. Figures 2 and 3 show the GCM approximation of the set of lightly robust optimal solutions as well as their respective worst-case images. To date, no other method exists that deals with this problem class. In order to put the results into perspective, the following alternative archive-based approach is used for comparison: a number of uniform random samples is produced; next, ArchiveUpdateP_{Q,ε} is applied to obtain the approximate solutions; then, for each solution, the algorithm samples in a neighborhood defined by the uncertainty and finds the set of worst cases; finally, the best worst cases are computed. To have a fair comparison, the inner sampling was set to 100 solutions and the rest of the budget was used for sampling in the whole search space. Table 3 shows the ∆_2 values in decision space between the real solution and the approximation set found. From the results, it is possible to observe that in Deb99 the solutions on the nominal global front are dominated by those on the local front in terms of lre. In the cases of Two-on-one and Sym-part, the nominal global front is the lightly robust front. Finally, in SSW one of the connected components that was optimal for the nominal MOP is now dominated in terms of lre. As can be seen, in all cases the GCM approach is superior to the archive-based approach, and the results differ by several orders of magnitude.

Application to Optimal Control
Next, a second-order oscillator subject to a proportional-integral-derivative (PID) control [12] is considered. Until now, this problem had only been studied without uncertainty.
where ω_n = 5, ζ = 0.01, r(t) is a step input, and k_p, k_i, and k_d are the PID control gains. The MOP takes the control gains k = (k_p, k_i, k_d)^T as design parameters. The design space for the parameters was chosen as follows: The multi-objective optimization problem to design the control gain k is defined as the minimization of (M_p, t_p, e_IAE), where M_p stands for the overshoot of the response to a step reference input, t_p is the corresponding peak time, and e_IAE is the integrated absolute tracking error

e_IAE = ∫_0^{T_ss} |r(t) − y(t)| dt,

where r(t) is the reference input and T_ss is the time when the response is close to the steady state. The closed-loop response of the system for each design trial is computed with the help of closed-form solutions. The integrated absolute tracking error e_IAE is calculated with T_ss = 20 s. In this case, the error was considered to be δ = [0.4, 0.29, 0.01]^T, which corresponds to a 1% error, and ε = [0.1, 0.1, 0.1]^T. GCM was executed with an initial grid of N = [30, 18, 8]^T and 3 subdivision steps. Figure 4 shows the Pareto optimal solutions and Figure 5 the nearly optimal solutions. Figure 6 shows the approximation of the lightly robust optimal solutions and their worst-case image found by GCM, as well as the Pareto optimal solutions of the nominal problem. Note that the optimal solutions and the lightly robust optimal solutions have the same structure; however, ∆_2(P_Q, P_LR) = 1.0525. The running time was 709.13 s.
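The three objectives can be reproduced with a simple simulation; the paper uses closed-form solutions, so the explicit-Euler scheme, step size, and unit-step reference below are our own assumptions for illustration:

```python
import numpy as np

def step_response_metrics(kp, ki, kd, wn=5.0, zeta=0.01, Tss=20.0, dt=1e-3):
    """Closed-loop step response of y'' + 2*zeta*wn*y' + wn^2*y = wn^2*u with
    the PID law u = kp*e + ki*int(e) + kd*de/dt (explicit Euler, r(t) = 1).
    Returns (overshoot Mp, peak time tp, integrated absolute error e_IAE)."""
    n = int(Tss / dt)
    y, yd, ie = 0.0, 0.0, 0.0
    e_prev, iae = 1.0, 0.0
    ys = np.empty(n)
    for k in range(n):
        e = 1.0 - y                      # tracking error for r(t) = 1
        ie += e * dt                     # integral term
        de = (e - e_prev) / dt           # derivative term
        u = kp * e + ki * ie + kd * de
        ydd = wn**2 * u - 2.0 * zeta * wn * yd - wn**2 * y
        yd += ydd * dt
        y += yd * dt
        e_prev = e
        iae += abs(e) * dt
        ys[k] = y
    k_peak = int(np.argmax(ys))
    Mp = max(ys[k_peak] - 1.0, 0.0)      # overshoot over the unit reference
    tp = (k_peak + 1) * dt               # peak time
    return Mp, tp, iae
```

Evaluating such a routine at the gains selected in [12] gives one point of the objective space (M_p, t_p, e_IAE); the exact values differ from the closed-form computation by the discretization error of the Euler scheme.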
Finally, the response of the system is studied. In [12], the vector x_pq = [40.0, 2.8796, 1.9792]^T was selected after the decision-making process. In this case, the closest solution in the set of lightly robust optimal solutions is analyzed. This solution is x_lr = [40.5880, 2.7059, 1.9118]^T, with a Euclidean distance of 0.6168 in decision space and 0.9023 in objective space to x_pq. Figure 7 shows the response of the system for both solutions. The result shows the trade-off between optimality and robustness. It is possible to observe that x_lr deteriorates by almost 1% in terms of overshoot. However, this solution is more robust than the one selected in [12], and thus it would be more desirable in practice given the uncertainty.

Conclusions and Future Work
In this paper, we investigated cell mapping techniques for the numerical treatment of uncertain multi-objective optimization problems in terms of lre. We adapted the cell mapping techniques to the given context by considering dynamical systems derived from descent methods and argued that the resulting algorithm is particularly beneficial for the thorough investigation of small problems. In particular, the new algorithm is capable of detecting solutions that have almost the same performance as the optimal solutions but that are more reliable. This gives the decision-maker solutions that are less susceptible to uncertainties and thus are good candidates for implementation. The main advantage of the novel algorithm is that lr solutions can be computed with the same effort, in terms of function evaluations, as computing optimal solutions.
Though the results presented in this work are very promising, some points have to be addressed in order to make the algorithm applicable to a broader class of problems. First of all, the main drawback of the cell mapping techniques is that they are restricted to low dimensional problems. Note, however, that the algorithm is highly parallelizable, since its core is the mapping of each cell, which can be realized with small effort. We thus expect that the use of massive parallelism, realized e.g., via GPUs, will lead to applicability to higher dimensional problems by adapting ideas presented in [38]. Moreover, it would be interesting to study stochastic model predictive control, since such problems are inherently uncertain multi-objective optimization problems. Finally, there exist several definitions and kinds of uncertainty. In this work, we focused on lre and on uncertainty in the decision variables, such as the one found in manufacturing tolerance errors. Thus, it would be interesting to extend the algorithm to the treatment of other kinds and definitions of uncertainty.

Conflicts of Interest:
The authors declare no conflict of interest.