Adaptive reconstruction of imperfectly-observed monotone functions, with applications to uncertainty quantification

Motivated by the desire to numerically calculate rigorous upper and lower bounds on deviation probabilities over large classes of probability distributions, we present an adaptive algorithm for the reconstruction of increasing real-valued functions. While this problem is similar to the classical statistical problem of isotonic regression, the optimisation setting alters several characteristics of the problem and opens natural algorithmic possibilities. We present our algorithm, establish sufficient conditions for convergence of the reconstruction to the ground truth, and apply the method to synthetic test cases and a real-world example of uncertainty quantification for aerodynamic design.


Introduction
This paper considers the problem of adaptively reconstructing a monotonically increasing function F † from imperfect pointwise observations of this function. In the statistical literature, the problem of estimating a monotone function is commonly known as isotonic regression, and it assumed that the observed data consist of noisy pointwise evaluations of F † . However, we consider this problem under assumptions that differ from the standard formulation, and these differences motivate our algorithmic approach to the problem. To be concrete, our two motivating examples are that (1.1) is the cumulative distribution function (CDF) of a known real-valued function g of a random variable Ξ with known distribution µ, or that is the supremum of a family of such CDFs over some class A. We assume that we have access to a numerical optimisation routine that can, for each x and some given numerical parameters q (e.g. the number of iterations or other convergence tolerance parameters), produce a numerical estimate or L. Bonnet, J.-L. Akian,É. Savin, and T. J. Sullivan observation G(x, q) of F † (x); furthermore, we assume that G(x, q) ≤ F † (x) is always true, i.e. the numerical optimisation routine always under-estimates the true optimum value, and that the positive error F † (x) − G(x, q) can be controlled to some extent through the choice of the optimisation parameters q, but remains essentially influenced by randomness in the optimisation algorithm for each x. The assumption G(x, q) ≤ F † (x) is for example coherent with either Equation (1.1), which may be approached by increasing the number of samples (say q) in a Monte Carlo simulation, or Equation (1.2), which is a supremum over a set that may be explored only partially by the algorithm. A single observation G(x, q) yields some limited information about F † (x); a key limitation is that one may not even know a priori how accurate G(x, q) is. Naturally, one may repeatedly evaluate G at x, perhaps with different values of the optimisation parameters q, in order to more accurately estimate F † (x). However, a key observation is that a suite of observations G(x i , q i ), i = 1, . . . , I, contains much more information than simply estimates of F † (x i ), i = 1, . . . , I, and this information can and must be used. For example, suppose that the values (G(x i , q i )) I i=1 are not increasing, e.g. because Such a suite of observations would be inconsistent with the axiomatic requirement that F † is an increasing function. In particular, while the observation at x i may be relatively good or bad on its own merits, the observation G(x i , q i ) at x i , which violates monotonicity, is in some sense "useless" as it gives no better lower bound on F † (x i ) than the observation at x i does. The observation at x i is thus a good candidate for repetition with more stringent optimisation parameters q -and this is not something that could have been known without comparing it to the rest of the data set. The purpose of this article is to leverage this and similar observations to define an algorithm for the reconstruction of the function F † , repeating old observations of insufficient quality and introducing new ones as necessary. The principal parameter in the algorithm is an "exchange rate" E that quantifies the degree to which the algorithm prefers to have a few high-quality evaluations versus many poor-quality evaluations. Our approach is slightly different from classical isotonic (or monotonic) regression, which is understood as the least-squares fitting of an increasing function to a set of points in the plane. The latter problem is uniquely solvable and its solution can be constructed by the pool adjacent violators algorithm (PAVA) extensively studied in Barlow et al. (1972). This algorithm consists of exploring the data set from left to right until the monotonicity condition is violated, and replacing the corresponding observations by their average while back-averaging to the left if needed to maintain monotonicity. Extensions to the PAVA have been developed by de Leeuw et al. (2009) to consider non least-squares loss functions and repeated observations, by Tibshirani et al. (2011) to consider "nearly-isotonic" or "nearly-convex" fits, and by Jordan et al. (2019) to consider general loss functions and partially ordered data sets. Useful references on isotonic regression also include Robertson et al. (1988) and Groeneboom and Jongbloed (2014).
The remainder of this paper is structured as follows. Section 2 presents the problem description and notation, after which the proposed adaptive algorithm for the reconstruction of F † is presented in Section 3. We demonstrate the convergence properties of the algorithm in Section 3.2 and study its performance on several analytically tractable test cases in Section 4. Section 5 details the application of the algorithm to a challenging problem of the form Equation (1.2) drawn from aerodynamic design. Some closing remarks are given in Section 6.

Notation and problem description
In the following, the "ground truth" response function that we wish to reconstruct is denoted F † : [a, b] → R and has inputs x ∈ [a, b] ⊂ R. It is assumed that F † is monotonically increasing and non-constant on [a, b]. In contrast, G : [a, b] × R + → R denotes the numerical process used to obtain an imperfect pointwise observation y of F † (x) at some point x ∈ [a, b] for some numerical parameter q ∈ R + . Here, on a heuristic level, q > 0 stands for the "quality" of the noisy evaluation G(x, q).
The main aim of this paper is to show the effectiveness of the proposed algorithm for the adaptive reconstruction of F † , which could be continuous or not, from imperfect pointwise observations G(x i , q i ) of F † , where we are free to choose x i+1 and q i+1 adaptively based upon x j , q j , and G(x j , q j ) for j ≤ i First, we associate with I imperfect pointwise observations {x i , y i : ⊂ R + which we will call qualities. The quality q i quantifies the confidence we have in Possible ground truth functions between two consecutive points x 1 and x 2 . The ground truth function must lie in the area formed by these two points.
x 1 x 2 Interpolation function (b) Right-continuous piecewise constant interpolation function. the pointwise observation y i of F † (x i ) using the numerical process G(x i , q i ). The higher this value, the greater the confidence. We divide this quality as the product of two different numbers c i and r i , q i = c i × r i , with the following definitions: • Consistency c i ∈ {0, 1}: This describes the fact that two successive points must be monotonically consistent with respect to each other. That is, when one takes two input values x 2 > x 1 , one should have y 2 ≥ y 1 as y must be monotonically increasing. There is no consistency associated with the very first data point as it does not have any predecessor. • Reliability r i ∈ R + : This describes how confident we are about the numerical value. Typically, it will be related to some error estimator if one is available, or the choice of optimisation parameters. It is expected that the higher the reliability, the closer the pointwise observation is to the true value, in average. Typically, if the observation y i+1 = G(x i+1 , q i+1 ) is consistent with regard to the observation y i = G(x i , q i ) where x i+1 > x i , the quality q i+1 associated with y i+1 will be equal to q i+1 = r i+1 ∈ R * + since c i+1 = 1 in this case. If the value is not consistent, we have q i+1 = r i+1 × c i+1 = 0. Finally, if x = a there is no notion of consistency as there is no point preceding it. Thereby, the quality associated with this point is only equal to its reliability.
Moreover, we associate to these pointwise observations a notion of area, illustrated in Figure 2.1 and defined as follows. Consider two consecutive points x i and x i+1 with their respective observations y i and y i+1 , the area a i for these two points is (2.1) Thus, we can define a vector a = {a i } I−1 i=1 which contains all the computed areas for the whole dataset. In addition, we can assure that if we take two points x 1 and x 2 > x 1 with y 1 = F † (x 1 ) and y 2 = F † (x 2 )namely, the error at these point is equal to zero, the graph of ground truth function F † must lie in the rectangular area spanned by the two points (x 1 , F † (x 1 )) and (x 2 , F † (x 2 )).
To adopt a conservative point of view, we choose as the approximating function F of F † a piecewise constant interpolation function, say: where 1 I denotes the indicator function of the interval I. We do not want this interpolation function to overestimate the true function F † as one knows that the numerical estimate in our case always underestimates the ground truth function F † (x). See Figure 2.1 for an illustration of this choice, which can be viewed as a worst-case approach. Indeed, this chosen interpolation function is the worst possible function underestimating F † given two points x 1 and x 2 and their respective observations y 1 and y 2 .

Reconstruction algorithms
The reconstruction algorithm that we propose, Algorithm 1, is driven to produce a sequences of reconstructions that converges to F † by following a principle of area minimisation: we associate to the discrete data set {x i , y i } I i=1 ⊂ [a, b] × R a natural notion of area (2.1) as explained above, and seek to drive this area towards zero. The motivation behind this objective is in Proposition 3.6 which states that the area converges to 0 as more points are added to the data set. However, the objective of minimising the area is complicated by the fact that evaluations of F † are imperfect. Therefore, a key user-defined parameter in the algorithm is E ∈ (0, ∞), which can be thought of as an "exchange rate" that quantifies to what extent the algorithm prefers to redo poor-quality evaluations of the target function versus driving the area measure to zero.

Algorithm
The main algorithm is organized as follows, starting from I (0) ≥ 2 points and a dataset that is assumed to be consistent at the initial step n = 0. It goes through N iterations, where N is either fixed a priori, or obtained a posteriori once a stopping criterion is met. Note that q new stands for the quality of a newly generated observation y new for any new point x new introduced by the algorithm. The latter is driven by the user-defined "exchange rate" E as explained just above. At each step n, the algorithm computes the weighted area WA (n) as the minimum of the quality times the sum of the areas of the data points: is the area computed by Equation (2.1) at step n (see also Equation (3.5)), and I (n) is the number of data points. Then it is divided into two parts according to the value of WA (n) compared to E.
• If WA (n) < E, then the algorithm aims at increasing the quality q (n) − of the worst data point (the one with the lowest quality) with index i i } at step n. It stores the corresponding old value y old , searches for a new value y new by improving successively the quality of this very point, and stops when y new > y old .
• If WA (n) ≥ E, then the algorithm aims at driving the total area A (n) to zero. In that respect, it identifies the biggest rectangle a (n) and its index i and adds a new point x new at the middle of this biggest rectangle. Then, it computes a new data value y new = G(x new , q new ) with a new quality q new . In both cases, the numerical parameters q new (for example a number of iterations, or the size of a sampling set or a population) are arbitrary and any value can be chosen in practice each time a new point x new is added to the dataset. They can be increased arbitrarily as well each time such a new point has to be improved. Indeed, the numerical parameters q of the optimisation routine we have access to can be increased as much as desired, and increasing them will improve the estimates G(x, q) of the true function F † (x) uniformly in x; see Assumption 3.1. The algorithm then verifies the consistency of the dataset by checking the quality of each point. If there is any inconsistent point, the algorithm computes a new value until obtaining consistency by improving successively the corresponding reliability. This is achieved in a finite number of steps starting from an inconsistent point and exploring the dataset from the left to the right.
Finally, the algorithm updates the quality vector {q , the worst quality q Algorithm 1: Adaptive algorithm to reconstruct a monotonically increasing function F † Input:

Initialization:
Get the worst quality point and its index: Compute the area of each pair of data points: a i ). Get the biggest rectangle and its index: Define the weighted area at step n = 0 as Introduce a new point at the middle of the biggest rectangle: , . . . , x (n) I (n) ); Compute the new value ynew = G(xnew, qnew); end Verify consistency of the pointwise observations {y by checking their quality. If there are not consistent, recompute them until obtaining consistency and then update the quality vector; Compute the new quality vector {q

Proof of convergence
We denote by I (n) the number of data points, and {x i=1 the positions of the data points, the observations given by the optimization algorithm at these positions, and the qualities associated with the optimization algorithm at the step n of Algorithm 1. For each i = 1, . . . , I (n) − 1, we define a, b] and the vector containing all rectangle areas {a (3.5) The pointwise observation y i ) is thus associated to the quality q (n) i ∈ R + , which quantifies the confidence we have in this observation as outlined in the problem description in Section 2. This number can represent the inverse error achieved by the optimization algorithm, for example, or the number of iterations, or the number of individuals in a population, or any other numerical parameter pertaining to this optimization process. The higher it is, the closer the observation is to the true target value. Therefore we consider the following assumption on the numerical process G.
Moreover, we can guarantee that: ( 3.6) That is, the optimisation algorithm will always underestimate the true value F † (x). In this way, one can model the relationship between the numerical estimate G and the true value F † as: where is a positive random variable. These assumptions imply some robustness and stability of the algorithm we use.
In the following, we will assume that I (0) ≥ 2. That is, we have at least two data points at the beginning of the reconstruction algorithm. Also among these points, we have one point at x = a and another one at x = b. Moreover, we will assume that the initial dataset is consistent. Since Algorithm 1 recomputes the inconsistent points at all steps, we can also consider in the following that any new numerical observation is actually consistent. Also, we need to guarantee that the weighted area WA (n) will permanently oscillate about E as the iteration step n is increasing; this is the purpose of Assumption 3.3 below as shown in the subsequent Proposition 3.5. From these properties it will then be shown that Algorithm 1 is convergent, as stated in Theorem 3.8.
Assumption 3.2. Any new numerical value obtained by Algorithm 1 is consistent.
Within Assumption 3.2 all points have a consistency of 1, and therefore q = r > 0 the reliability. Besides, one has G(x i+1 for all points i and steps n. We finally define the sequence of piecewise constant reconstruction functions F (n) as follows.
Definition 3.4. For each x ∈ [a, b], we define the reconstructing function F (n) at step n as: which are such that E + ∪ E − = N and E + ∩ E − = ∅. In order to prove the convergence (in a sense to be given) of Algorithm 1, we first need to establish the following intermediate results, Proposition 3.5, Proposition 3.6, and Proposition 3.7. They clarify the behaviour of the sequence WA (n) when points are added to the dataset and the largest area a (n) + is divided into four parts at each iteration step n; see Figure 3.1.
Proof. Let us assume that E + is finite: ∃N such that ∀n ≥ N , n ∈ E − . Therefore we are in the situation WA (n) < E, the minimum quality q (n) − of the data goes to infinity, and the total area A (n) is modified although the evaluation points {x and their number I (n) are unchanged; thus they are independent of n. Repeating this step yields since F † is monotonically increasing and non-constant on [a, b], and Assumption 3.1 is used. Consequently WA (n) → +∞ as n → ∞, that is WA (n) ≥ E ∀n ≥ N 1 for some N 1 , which is a contradiction.
The set E + is therefore of the form Let us introduce the strictly increasing application ϕ : N → N such that ϕ(p) is the p th element of E + (in increasing order), and m k , n k = ϕ( p k + 1, p k+1 ). p is the counter of the elements of E + , and n is the corresponding iteration number.
Proof. Let us assume that E − is finite: ∃N such that ∀n ≥ N , n ∈ E + . Therefore we are in the situation WA (n) ≥ E > 0, and ϕ(n) has the form ϕ(n) = n − n 0 , n ≥ N for some n 0 ∈ N. From Proposition 3.6: thus A (n) → 0 and WA (n) → 0 as n → ∞ since q (n) − is kept unchanged, which is a contradiction. We now provide three results on the convergence of Algorithm 1. As is to be expected, the algorithm can only be shown to converge uniformly when the target response function F † is sufficiently smooth; otherwise, the convergence is at best pointwise or in mean.
Theorem 3.8 (Algorithm convergence). Assume that F † is strictly increasing. Then, for any choice of E > 0, Algorithm 1 is convergent in the following senses: • Proof. Let E > 0. We know from Propositions 3.5 and 3.7 that WA (n) will oscillate about E in the iterating process as n → ∞, while lim n→∞ q (n) − = +∞ from Assumption 3.3. Furthermore, let Assuming for example that for some j, s ) is never divided in two in the iteration process and is thus independent of n, it turns out that a (n) j → (x j+1 − x j )(F † (x j+1 ) − F † (x j )) > 0 as n → ∞, which is impossible because A (n) goes to 0 as n → ∞ from Proposition 3.6. Therefore there exists some m ∈ N * (depending on n) such that ∆ (n+m) ≤ 1 2 ∆ (n) ; also the sequence ∆ (n) is decreasing, hence ). Then: is continuous at x, the second term on the right hand side above goes to 0 as n → ∞. However, if F † is continuous everywhere on [a, b], it is in addition uniformly continuous on [a, b] by Heine's theorem, and the second term goes to 0 as n → ∞ uniformly on [a, b]. Finally, invoking Assumption 3.1, the first term on the right hand side above also tends to 0 as n → ∞. This completes the proof.
Proposition 3.9 (Convergence in mean). Let F † : [a, b] → R be piecewise continuous. Then Algorithm 1 is convergent in mean in the sense that Proof. We can check that the sequence F (n) is monotone. Indeed, if WA (n) < E, then by construction we have . However, if WA (n) > E, then consistency implies that . The claim now follows from the monotone convergence theorem and the fact that F (0) is integrable.

Test cases
To show the effectiveness of Algorithm 1, we try it on two cases, in which F † is a continuous function and a discontinuous function respectively. For both cases, the error between the numerical estimate and the ground truth function is modelled as a random variable following a Log-normal distribution. That is, As we have access to the ground truth function and for validation purpose, the quality value associated to a numerical point is the inverse of the relative error. Moreover, we assume that the initial points are consistent.
For illustrative purposes, we set the parameter E = 15 for the examples considered below.

F † is a continuous function
First, consider the function F † ∈ C 0 ([1, 2], [1, 2]) defined as follows: The target function F † and the reconstructions F (n) obtained through the algorithm for several values of the step n are shown on Figure 4.1. For each n, the reconstruction F (n) is increasing and the initial points are consistent. The ∞-norm and 1-norm of the error appear to converge to zero with approximate rates −0.512 and −0.534 respectively.

F † is a discontinuous function
Now, consider the discontinuous function F † defined as follows:  Here, F † is piecewise continuous on [1, 3 2 ] and ] 3 2 , 2]. In this case, one can apply Proposition 3.9. The target function F † and the reconstructions F (n) obtained through the algorithm for several values of the step n are shown on Figure 4.2. Observe that the approximation quality, as measured by the ∞-norm of the error F † − F (n) , quite rapidly saturates and does not converge to zero. This is to be expected for this discontinuous target F † , since closeness of two functions in the supremum norm mandates that they have approximately the same discontinuities in exactly the same places. The 1-norm error, in contrast, appears to converge at the rate −0.561.
Regarding computational cost, the number of calls to the numerical model is lower when F † is continuous than when it is discontinuous. For both examples above and for the same number of data points, the number of evaluations of the numerical model (analytical formula in the present case) in the discontinuous case is about six times higher than the number of evaluations in the continuous case. This is because the algorithm typically adds more points near discontinuities and the effort of making them consistent increases the number of calls to the model.

Influence of the user-defined parameter E
We consider the case in which F † is discontinuous, as in Section 4.2. We will show the influence of the choice of the parameter E on the reconstruction function F (n) .

Case E 1
Let us consider the case E = 10 −4 1. This choice corresponds to the case where one wishes to split over redo the worst quality point. This can be seen on Figure 4.3 where the worst quality is almost constant over 100 steps while the sum of areas strongly decreases; see Figure 4.3(e) and Figure 4.3(f) respectively. At each step, the algorithm is adding a new point by splitting the biggest rectangle. One can note on Figure 4.3(f) that the minimum of the quality is not constant. It means that when the  algorithm added a new data point, the point with the worst quality was not consistent any more and had to be recomputed. In summary, in this case, we obtain more points but with lower quality values.

Case E 1
We now consider the case E = 10 4 1. This choice corresponds to the case where one wishes to redo the worst quality point over split. This can be seen on Figure 4.4 where the sum of areas stays more or less the same over 100 steps while the minimum of the quality surges; see Figure 4.4(f) and Figure 4.4(e) respectively. There is no new point. The algorithm is only redoing the worst quality point to improve it. To sum up, we obtain fewer points with higher quality values.

Optimal uncertainty quantification
In the optimal uncertainty quantification paradigm proposed by  and further developed by, e.g.,  and Han et al. (2015), upper and lower bounds on the performance of an incompletely-specified system are calculated via optimisation problems. More concretely, one is interested in the probability that a system, whose output is a function g † : X → R of inputs Ξ distributed according to a probability measure µ † on an input space X , satisfies g † (Ξ) ≤ x, where x is a specified performance threshold value. We emphasise that although we focus on a scalar performance measure, the input Ξ may be a multivariate random variable.
In practice, µ † and g † are not known exactly; rather, it is known only that (µ † , g † ) ∈ A for some admissible subset A of the product space of all probability measures on X with the set of all real-valued  The inequality is, by definition, the tightest possible bound on the quantity of interest P Ξ∼µ † [g † (Ξ) ≤ x] that is compatible with the information used to specify A. Thus, the optimal UQ perspective enriches the principles of worst-and best-case design to account for distributional and functional uncertainty. We concentrate our attention hereafter, without loss of generality, on the least upper bound P A (x).
Remark 5.1. The main focus of this paper is the dependency of P A (x) on x. In practice, an underlying task is, for any individual x, reducing the calculation of P A (x) to a tractable finite-dimensional optimisation problem. Central enabling results here are the reduction theorems of Owhadi et al. (2013, Section 4), which, loosely speaking, say that if, for each g, {µ | (µ, g) ∈ A} is specified by a system of m equality or inequality constraints on expected values of arbitrary test functions under µ, then for the determination of P A (x) it is sufficient to consider only distributions µ that are convex combinations of at most m + 1 point masses; the optimisation variables are then the m independent weights and m + 1 locations in X of these point masses. If µ factors as a product of distributions (i.e. Ξ is a vector with independent components), then this reduction theorem applies componentwise.
As a function of the performance threshold x, P A (x) is an increasing function, and so it is potentially advantageous to determine P A (x) jointly for a wide range of x values using the algorithm developed above. Indeed, determining P A (x) for many values of x, rather than just one value, is desirable for multiple reasons:  1. Since numerical optimisation to determine P A (x) may be affected by errors, computing several values of P A (x) could lead to validate their consistency as the function x → P A (x) must be increasing; 2. The function P A (x) can be discontinuous. Thus, by computing several values of P A (x), one can highlight potential discontinuities and can identify key threshold values of x → P A (x).

Test case
For the application of Algorithm 1 to OUQ, we study the robust shape optimization of the twodimensional RAE2822 airfoil (Cook et al., 1979, Appendix A6) using ONERA's CFD software elsA (Cambier et al., 2013). The following example is taken from Dumont et al. (2019). The shape of the original RAE2822 is altered using four bumps located at four different locations: 5%, 20%, 40%, and 60% of the way along the chord c (see Figure 5.1). These bumps are characterised by B-splines functions.
The lift-to-drag ratio C l C d of the RAE2822 wing profile (see Figure 5.2) at Reynolds Number Re = 6.5 · 10 6 , Mach number M ∞ = 0.729 and angle of attack α = 2.31 • is chosen as the performance function g † with inputs Ξ = (Ξ 1 , Ξ 2 , Ξ 3 , Ξ 4 ), where (Ξ i ) i=1...4 is the amplitude of each bump. They will be considered as random variables over their respective range given in Table 5.1.
The corresponding flow values are the ones described in test case #6 together with the wall interferences corrections formulas given in Garner et al. (1966, Chapter 6) and in Haase et al. (1993, Section 5.1). Moreover, we will assume that (Ξ i ) i=1...4 are mutually independent. An ordinary Kriging procedure has been chosen to build a metamodel (or response surface) of g † , which is identified with the actual response function g † in the subsequent analysis. A tensorised grid of 9 equidistributed abscissas for each parameter is used. The model is then based on N = 9 4 = 6561 observations. In that respect, a Gaussian  kernel has been chosen, where Ξ = (Ξ 1 , Ξ 2 , Ξ 3 , Ξ 4 ) and Ξ = (Ξ 1 , Ξ 2 , Ξ 3 , Ξ 4 ) are inputs of the function g † , and where γ = (γ 1 , γ 2 , γ 3 , γ 4 ) are the parameters of the kernel. These parameters are chosen to minimize the variance between the ground truth data defined by the N observations and their Kriging metamodel g † . The responce surfaces in the (Ξ 1 , Ξ 3 ) plan for two values of (Ξ 2 , Ξ 4 ) are shown on Figure 5.3. One seeks to determine P A (x) := sup µ∈A P Ξ∼µ [g † (Ξ) ≤ x], where the admissible set A is defined as follows: (5.1) A priori, finding P A (x) is not computationally tractable because it requires a search over a infinitedimensional space of probability measures defined by A. Nevertheless, as described briefly in Remark 5.1, it has been shown in  that this optimisation problem can be reduced to a finitedimensional one, where now the probability measures are products of finite convex combinations of Dirac masses.
The optimisation problem of determining P A (x) for each chosen x was solved using the Differential Evolution algorithm of Storn and Price (1997) within the mystic optimisation framework (McKerns et al., 2011). Ten iterations of Algorithm 1 have been performed using E = 1 × 10 4 . The evolution of P A (x) as function of the iteration count, n, is shown on Figure 5.4. At n = 0 -see Figure 5.4(a) -two consistent points are present at x = 57.51 and x = 67.51. At this step, WA (0) = 35289. As WA (0) ≥ E, at next step n = 1, the algorithm adds a new point at the middle of the biggest rectangle -see Figure 5.4(b) and Figure 5.5(b). After n = 10 steps, eight points are now present in total with a minimum quality increasing from 5000 to 11667 and with a total area decreasing from 7.05 to 0.84; see Figure 5.5(a) and Figure 5.5(b) respectively.
The number of iterations in this complex numerical experiment has been limited to 10 because obtaining new or improved data points consistent throughout the optimization algorithm may take up to two days (wall-clock time on a personal computer equipped with an Intel Core i5-6300HQ processor with 4 cores and 6 MB cache memory) for one single point. This running time is increased further for data points of higher quality. Nevertheless, this experiment shows that the proposed algorithm can be used for real-world examples in an industrial context.

Concluding remarks
In this paper we have developed an algorithm to reconstruct a monotonically increasing function such as the cumulative distribution function of a real-valued random variable, or the least upper bound of the performance criterion of a system as a function of its performance threshold. In particular, this latter setting has relevance to the optimal uncertainty quantification (OUQ) framework of  we have in mind for applications to real-world incompletely specified systems. The algorithm uses imperfect pointwise evaluations of the target function, subject to partially controllable one-sided errors, to direct further evaluations either at new sites in the function's domain or to improve the quality of evaluations at already-evaluated sites. It allows for some flexibility at targeting either strategy through a user-defined "exchange rate" parameter, yielding an approximation of the target function with few highquality points or alternatively more lower-quality points. We have studied its convergence properties and have applied it to several examples: known target functions that are either continuous and discontinuous, and a performance function for aerodynamic design of a well-documented standard profile in the OUQ setting.
Algorithm 1 is reminiscent of the classical PAVA approach to isotonic regression that applies to statistical inference with order restrictions. Examples of its use can be found in shape constrained or parametric density problems as illustrated in e.g. Groeneboom and Jongbloed (2014). Possible improvements and extensions of our algorithm include weighting the areas a (n) i as they are summed up to form the total weighted area WA (n) driving the iterative process, in order to optimally enforce both the addition of "steps" s (n) i in the reconstruction function F (n) of Definition 3.4, and the improvement of their "heights" y (n) i . This could be achieved considering for example the following alternative definition i