Predicting Long-Term Stability of Precise Oscillators under Influence of Frequency Drift

High-performance oscillators, atomic clocks for instance, are important in modern industries, finance and scientific research. In this paper, the authors study the estimation and prediction of long-term stability based on convex optimization techniques and compressive sensing. To take frequency drift into account, its influence on Allan and modified Allan variances is formulated. Meanwhile, expressions for the expectation and variance of discrete-time Hadamard variance are derived. Methods that reduce the computational complexity of these expressions are also introduced. Tests against GPS precise clock data show that the method can correctly predict one-week frequency stability from 14-day measured data.


Introduction
Timing technology is important in modern finance [1], industries and scientific research [2]. High frequency trading, real-time navigation and the verification of relativistic effects require accurate and high-resolute time and/or frequency information. Timing information is given by counting the periodic signals of referenced oscillators. Meanwhile, frequencies of the timing signal are multiples of the referenced oscillator period. A time-scale is accurate only if the participant oscillators produce frequencies consistent with their nominal values or are stable enough to be predictable. Furthermore, high resolution requires, in turn, a short period of oscillator output signal frequencies.
Unfortunately, no high-performance oscillator produces constant and high-resolution signals.
The difference between an oscillator's output signal from its nominal value can be divided into deterministic and random parts. The oscillator's random behavior is well-documented by a class of noise processes called power-law noise (PLN) [3]. While the random variations are defined in the frequency domain, it is often measured in the time domain by a class of structure functions and referred to as the frequency stability of the oscillator. For example, Allan (AVAR), modified Allan (MVAR) and Hadamard variance (HVAR) are commonly-used methods. These statistics can be improved by using a 'total approach' [4]. Recently, Thêo- [5] and parabolic variances [6] were also proposed. The authors proposed an oscillator noise analysis method called stochastic ONA [7]. The method predicts long-term frequency stabilities using convex optimization techniques. Specifically, the confidence regions of long-term Hadamard variances (HVAR) predicted from 14-day GPS precise clock data include HVAR estimated from 168-day measured data and are smaller than those estimated from 84-day time derivations.
On the other hand, distinctions between deterministic and random behavior are blurry [8]. It is often difficult to differentiate drift from frequency noises [9]. A main drawback of stochastic ONA is its requirement for drift-free input variances. For example, cesium frequency standards are conventionally believed to be free from drift [10]. However, analysis of historical data and current practice show that the performance of TAI (International Atomic Time) improved when taking the frequency drifts of participant cesium clocks into account [11].
This paper studies the estimation of oscillator stability under the influence of frequency drift. In Section 2, the basic concepts and methods of time domain stability have been reviewed. Although time domain stability is related to the frequency domain, discrete sampling has different impacts on the two. The influence of discrete sampling on both domains has also been reviewed in this section. In Section 3, we introduce a method called stochastic ONA, which extends the oscillator noise analysis problem to the prediction of long-term stability. In the following section, we describe methods to compute coefficient matrices used in stochastic ONA. We also introduce a method that greatly reduces the computational complexity of Walter's characterization of AVAR and MVAR. From these works, we can then predict long-term frequency contaminated by deterministic linear frequency drift. The proposed model is tested against GPS precise clock data in Section 5. The one-week AVAR, MVAR and HVAR predicted by stochastic ONA from 14-day measured data are consistent with those estimated from 84-day data. In addition, the fifteen-day variances predicted by stochastic ONA have more compact confidence regions than those estimated from 42-60-day data.

Review of Time Domain Stability
It is well documented that high performance oscillators are influenced by power-law noises (PLN). PLN processes are conventionally defined by their power spectral densities (PSD): where S y ( f ) is the PSD of oscillator fractional frequency y(t), S x ( f ) PSD of time deviations x(t), f (Fourier) frequency and h α noise intensity coefficient, α = α 1 , α 2 , · · · , α N h . Often, α = 2 (white phase modulation, WHPM), 1 (flicker PM, FLPM), 0 (white frequency modulation, WHFM), −1 (flicker FM, FLFM), −2 (random walk FM, RWFM) [12]. However in Global Positioning System (GPS) master control station (MCS) clock prediction, α = 2, 0, −2 and −4 (random run FM, RRFM), and the h α coefficients are replaced by q i [4]: However, PSD measured from an oscillator signal is not used solely in practice. Since the PSD estimates are "noisy" [13], time domain statistics are often used instead. For instance, AVAR [13]: MVAR: and HVAR:σ A time-domain variance σ 2 k (τ) can be related to its PSD: Here, τ = mτ 0 is the averaging time, τ 0 sampling period and H k ( f ) the transfer function of σ 2 k (τ) defined in [12]: Here, subscript k is used as a generic form of different variances. The majority of measured data nowadays are digital. σ 2 k (τ) estimated from finite data may have different values for Equation (5). The former is usually denoted asσ 2 k (τ), and it can be viewed as a realization of sample variance variable Σ k (τ) [14]. The distribution function F Σ k (τ) ≤σ 2 k (τ) (F σ 2 k (τ) for short) of random variable Σ k (τ) can be formulated as: for arbitrary positive real numberσ 2 k (τ), where e is Euler's number, Γ(·) the Gamma function, EDF k (τ) the equivalence degrees of freedom (EDF): where varianceσ 2 k (τ) is estimated from x[t] and x[t] the discrete sampling of time deviations x(t), then Φ k (α i , τ) does not equal Equation (6). Instead of PSD, Kasdin shows that it is the symmetric two-time autocorrelation function: which directly samples the continuous-time function [15]. For example, instead of Equation (1), Walter shows that PSD measured from discrete sampled time deviations x[t] relates to S y ( f ) in the following way [16]: The autocorrelation function of PLN processes has the following asymptotic form when t >> τ [15]: for α = −1, and: for α = −1, where: To derive an expression for the discrete autocorrelation function, the deviation of Brownian motion is often replaced by the discrete Wiener process in time and frequency metrology [15,17]. If, in addition, the noise process is wide sense stationary, R x (t, τ) can be recast as [15]: Here, τ = mτ 0 . From Equations (12)- (14), Walter derives Φ k (α i , τ): and Var σ 2 k (τ) : for Allan variance (AVAR) [16]. It should be noted that variances estimated from discrete sampled data may be distorted when the averaging time τ = mτ 0 is near sampling period τ 0 . The distortions are caused by alias and measurement noise [15]. Equations (15) and (16) do not take the distortions into account. Furthermore, the influences of frequency drift are not included in these equations. As a development of Walter's work, we will formulate the effect of deterministic linear frequency drift in AVAR estimates and derive Φ k (α i , τ) and Var σ 2 k (τ) for HVAR in Section 4. On the other hand, ifσ 2 k (τ) is a measurement of the time-domain variance σ 2 k (τ), we can estimate the h α coefficients fromσ 2 k (τ). It can be formed as a least square problem: and called oscillator noise analysis. Suppose there are M different input variances. Therefore, the coefficient matrix Φ can be divided into M blocks: Φ k (α, τ) is defined in Equation (9). Φ can be formed by using the method we proposed in Section 4. Similarly, the column vector of input variances σ can be partitioned into M blocks: the k-th block: is comprised ofσ 2 k (m k τ 0 ), m = 1, · · · , m k estimated from time deviations x[i], i = 1, 2, · · · , N. h is a column vector of noise intensity coefficients and W the weight matrix. The works in [18,19] give different ways to compute W.

Stochastic ONA
We extended the oscillator noise analysis problem to the prediction of long-term stability. Since the likelihood (or conditional probability) of: is zero, we estimate a 1 − 2ε confidence region of σ 2 k (τ) instead. This extension is realized by using convex optimization techniques (Appendix A.2), and we call it stochastic ONA [7].
The basic idea of stochastic ONA is: and: is the chi-square distribution function defined in Equation (7), ε > 0, Clearly, when ε is small enough. The matrices B(ε) and B(1 − ε) are obtained by substituting Φ k (α, τ) in the coefficient matrix Φ with B k (α, τ, ε) and B k (α, τ, 1 − ε). This can be cast into the following optimization problem: where the N h -dimensional column vector of the noise intensity coefficients h subject to: In practice, it is not always easy to find such an ε. In addition, Equation (21) does not model the uncertainty of input variances completely. For example, neither the correlations among the different averaging time, nor those among different variances calculated from the same underlying time series are taken into account. We therein prescribe a lower bound ε l as a threshold. Equation (21) will be replaced by an alternative model if stochastic ONA fails to find an ε ≥ ε l that holds for the inequalities. While different variances estimated from the same time series are correlated, they contain independent pieces of information. It is difficult to formulate the correlations of structure functions precisely. It is even more difficult to solve stochastic programming under complex probabilistic constraints. When unformulated information only has a strong impact on a small population of the whole input variances, they can be treat as 'violations' using the techniques of compressive sensing. The auxiliary variables µ and ν are used as indicators for the violations. Specifically, the ∑ M i=1 m i -dimensional non-negative vector variable µ and ν are defined such that: for any norm · of µ and ν. We choose the 1 -norm (see Appendix A.1 for details): where ν i is the i-th component of ν, and |ν i | returns the absolute value of ν i . The probabilistic fact that only a minority of input variances violate Equation (21) can be formulated using the property of 1 : the minimum of the 1 norm is approximately sparse when we have more variables than problem data. Therein, the optimization problem can be formulated as: where optimization variables (h, µ, ν) subject to: We adjust the values of input variance according to the result of Equation (23). Consequently, we can find that h holds for Equation (21) with the adjusted variances. Suppose (h * , µ * , ν * ) is the optimum of Equation (23). We label an input varianceσ 2 k (τ) as an 'outlier' when: An outlier will be adjusted in the following way: where 0 < ψ < 1. We set ψ = 0.5 in this article. By then, we can either minimize or maximize the values of: under the restriction of Equation (21). Here, k is not necessarily any of k 1 , k 2 , · · · , k M . Neither τ should be smaller than Nτ 0 , where τ 0 is the sampling interval and N number of time deviations. If we denote the minimum and maximum of Equation (25) as σ 2 k (τ) andσ 2 k (τ), respectively, σ 2 k (τ),σ 2 k (τ) can be approximately considered as an 1 − 2ε confidence region of σ 2 k . This is the predictive model used in stochastic ONA.

Models for Discrete-Time Variances
In this section, we introduce a way to compute coefficient matrices Φ, B(ε) and B(1 − ε) used in stochastic ONA. Because B(ε) and B(1 − ε) can be computed from Φ and the inverse of the chi-square distribution function (7). Equation (7) is well-defined if the degrees of freedom EDF k (τ) is known. EDF k (τ), in turn, can be determined by Φ and Var σ 2 k (τ) . Specifically, we: (i) formulate the influence of deterministic linear frequency drift on Allan (AVAR) and modified Allan (MVAR) variance; (ii) derive expressions for Φ k (α, τ) and Var σ 2 k (τ) of discrete-time Hadamard variance (HVAR). Computing the values of Φ k (α, τ) and Var σ 2 k (τ) for discrete-time AVAR, MVAR and HVAR is a daunting task, since we need to compute the gammafunctions O(mN) (O mN 2 for MVAR) times. We reduce the computation of gamma functions to three times per f α noise in the end of this section.

Drift Model
To formulate the influences of deterministic linear frequency drift on AVAR and MVAR, we first assume the oscillator output signal to be contaminated by drift. Suppose its time derivations x[i] can be separated as: x are discrete sampling of the continuous-time signals x(t) and x (t), respectively. We also denote the AVAR and MVAR estimated from and: Since x [i] is unknown, AVAR and MVAR can only be measured from x[i]. We denote them aŝ σ 2 y (x, m) and Modσ 2 y (x, m), respectively. It can be shown, from Equations (2) and (3), that: and: In other words, the drift-free AVAR and MVAR can be divided from the influence of drift theoretically.
To predict long-term stability, the sign of a makes no difference. We can therein treat a 2 as a component of h. The column vector h of a rubidium frequency standard, for instance, is: where h α , α = 2, 1, 0, −1, −2 and −4 are noise intensity coefficients of white and flicker PM, white, flicker, random walk and random run FM, respectively. Accordingly, the m-th row of Φ k and B k (ε) can be cast as: for AVAR, respectively. Here, we use the subscript α = 2, 1, 0, −1, −2 or −4 to indicate the dominant PLN process; other PLN processes will be ignored in the computation of the inverse chi-square distribution function. While Φ AVAR (α, mτ 0 ) is given in Equation (15) and Var σ 2 AVAR (τ) in Equation (16) explicitly, it is a daunting task to compute Φ k and B k (ε) from these equations. As we show at the end of this section, the computation can be greatly shortened by taking the properties of the gamma function into account. Especially, we can simplify Equation (15) in the case of GPS MCS clock prediction: On the other hand, the simplified expression for Var σ 2 AVAR (τ) depends on the ratio of m to N. When m ≤ N/4, for α = 0, and: for α = −2. Simplified expressions for Var σ 2 y (mτ 0 ) , m > N/4, will be given in Appendix B.

Hadamard Variance
To differentiate the influence of frequency drift from the random behavior of an oscillator, for example RWFM or RRFM, we can combine AVAR with some statistics, which are convergent for RRFM and free from drift. We choose HVAR among those statistics. In order to form coefficient matrices Φ, B(ε) and B(1 − ε), we derive here Φ k (α, τ) and Var σ 2 k (τ) of discrete-time HVAR. Since discrete-time PSD is not a direct sampling of the corresponding continuous-time PSD, the discrete-time HVAR is not a discrete sampling of the continuous function defined by Equation (5). On the other hand, the discrete-time symmetric two-time autocorrelation function is a direct sampling of its continuous counterpart (10). If we can recast the continuous-time HVAR as a combination of autocorrelation functions, the discrete-time HVAR can be derived from directly sampling the autocorrelation function. Equivalently, if the discrete-time HVAR can be expanded as a combination of autocorrelation functions, an explicit expression of the variance can be derived by replacing the autocorrelation functions with Equations (12)- (14).
From Equations (4) and (10), By substituting autocorrelation functions in the equation above with Equations (12) and (13), we attain:  In addition, t ≥ τ 0 , the sampling interval τ 0 ranges from several minutes to days, and HVAR is approximately independent of t (We assume here that the random behaviors of an oscillator is unchanged. Otherwise, HVAR is either divergent or changes with time t). Hence, we replace the symmetric two-time autocorrelation function in Equation (35) with Equation (14). The expression for Φ k (α, τ) of discrete-time HVAR is therein derived: autocorrelation functions in the equation above with Equations (12) and (13). Likewise, we expand Var σ 2 z (τ) with the symmetric two-time autocorrelation functions: by assuming the third order differences of x(t), m fixed, are normally distributed. It is easy to see that Equation (37) holds for α > −5 after substituting the autocorrelation functions in the above equation with Equations (12) and (13). Furthermore, Var σ 2 z (τ) is approximately independent of the parameter t. Therefore, we replace the autocorrelation functions with Equation (14). Var σ 2 k (τ) of HVAR is therein cast as By then, the coefficient matrices Φ, B(ε) and B(1 − ε) in stochastic ONA can be constructed in the following way: where Φ AVAR is defined in Equation (28), B AVAR (ε) and B AVAR (1 − ε) in Equation (29). The m-th rows of Φ HVAR and B HVAR (ε) are defined as: and:  respectively. When the time series contains random run FM, h −4 = 0. While AVAR does not converge for α = −4 PLN, the noise process has little influence on short-term AVAR estimated from real data. In such a case, the inconsistency between Equations (29) and (40) will be treated as 'violations' by the optimization problem (23). The unformulated influence of α = −4 PLN in Equation (29) will be smoothed out by Equation (24). In GPS MCS clock prediction, only PLN of α = 2, 0, −2 and −4 are considered. In such a case, the flicker noise components in Φ, B(ε) and B(1 − ε) should be removed. Furthermore, the remaining components can be computed using the following simplified expression: If, in addition, m ≤ N/6, for α = 2, for α = 0, for α = −2, and: Var σ 2 z (τ) = for α = −4. Expressions of Var σ 2 z (τ) for m > N/6 will be given in Appendix B.

Quick Computation of Discrete-Time Variances
Although for Equations (36) (46) From the properties of the gamma function, we recast Equations (15), (16), (36) and (38) as functions of b Γ . For instance, and: It is obvious that: On the other hand, for any 0 ≤ j ≤ N − 1, there exists such that: for some −2 ≤ i ≤ 2. Hence, the auxiliary parameter b Γ is both sufficient and necessary in the computation of Equations (15), (16), (36) and (38).
To calculate the values of b Γ , we start by searching for the least positive i 0 such that: and: Then, we compute the value of b Γ (i 0 ): Other components of b Γ can be estimated recursively: Given the value of b Γ (i): By using the auxiliary vector b Γ , we reduce the calculations of gamma functions to three times per f α noise.

Results and Discussion
To test the method proposed in this article, we predict τ = 15-day frequency stabilities of GPS onboard clocks. The predictions are made based on 14-day GPS precise clock data provided by IGS (International GNSS Service). The IGS timescale is selected as the reference clock. For comparison, we also estimate variances from 42-60-day measured data. It should be noted that the method discussed in this paper assumes power-law processes and deterministic frequency drift being the major sources of time-series data. Analysis of all thirty-two satellites shows that the method fails when period behaviors have a strong influence on input variances. In this section, only the predictions of GPS SVN. 45 and 41 satellite rubidium clock frequency stabilities are chosen as representative. This is because:

•
To test the modified stochastic ONA method, a strong presentation of deterministic frequency drift should be seen. In the real data test of [7], frequency drift does not have significant influence on the behaviors of some onboard rubidium frequency standards within 168 days. Such data cannot test the capability of stochastic ONA in predicting drift contaminated stabilities.

•
The oscillator should be somehow well-modeled. If the dominant variation of the frequency standard has not been model and it has major influence on frequency stability estimates, stochastic ONA will not function properly. For instance, stochastic ONA fails to predict the long-term AVAR and HVAR from MJD.52437.0-52451.0 GPS SVN.36 (PRN.06) precise clock data. AVAR and HVAR estimated from the data imply strong periodic behaviors.

SVN.45
The prediction of GPS SVN.45 onboard clock long-term stability shows how stochastic ONA behaves when it cannot distinguish RWFM from frequency drift. As shown in Figure 1, we use (i) AVAR, (ii) MVAR and (iii) HVAR estimated from 84-day GPS SVN.45 rubidium precise clock data ('-') as the reference. Although a linear frequency drift was removed before the estimation, the variances are not free from drift. The predictions ('-·-') made by stochastic ONA are based on variances estimated from the first 14 days of the time deviations ('· · · '). Since we set ε = 0.025 in the computation, the predicted confidence interval of long-term variance can be seen to have a 95%-confidence level. For comparison, we also estimate (i) AVAR and (ii) MVAR from the first 42-day ('--') and (iii) HVAR from the first 60-day ('--') measured data. By assuming RWFM (α = −2) as the dominant noise, 95%-confidence regions of there variances (' -•-') are computed in the following way:   It can be seen from Figure 1 that 0.1-1-day AVAR and HVAR estimated from 14-day measured data are less than the referenced variances. On the other hand, τ > 1-day AVAR and HVAR estimated from 14 days are much larger than the reference. This will be interpreted as a weaker FLFM and stronger RWFM by the conventional oscillator noise analyzer. Stochastic ONA attributes the fluctuations to RWFM, because RWFM has much larger confidence regions at long-term averaging time. For example, the 1152-th (τ = 4/day) rows of Φ y and B y (ε = 0.025) have the following approximate values: respectively. Consequently, the predicted lower and upper bounds of long-term variance are almost parallel to the references. Stochastic ONA cannot find any h that holds for Equation (21), so it has to make a trade-off between 0.1-1-day and τ > 1-day input variances: while the former lead to a smaller RWFM noise level, the latter indicate a larger one. Although the predicted confidence regions are greater than those estimated from 42-day measured data for averaging time of τ ≤ 10 days, they are consistent with the referenced variances. However, the confidence regions of variances estimated from 42-day data, by contrast, do not encompass all referenced variances. For instance, τ = 1-and 2-day referenced AVAR and MVAR, and τ = 2∼7-day HVAR are not included in the confidence regions calculated from 42-day clock data.

SVN.41
When stochastic ONA does find an h that holds for Equation (21), it predicts long-term variances with narrow confidence regions. This can be seen in Figure 2. The input variances ('· · · ') of stochastic ONA are estimated from GPS SVN.41 rubidium clock time deviations from MJD.52018.0-MJD.52032.0. Stochastic ONA predicts 2-day ≤ τ ≤ 15-day AVAR, MVAR and HVAR ('-·-') based on these variances with ε = 0.025. The referenced (i) AVAR and (ii) MVAR are measured from 84-day ('-'); and (iii) HVAR 160-day ('-') time deviations of the same clock. For comparison, we also estimate (i) AVAR from 42-day ('--'), (ii) MVAR from 45-day ('--') and (iii) HVAR from 60-day ('--') measured data. Their 95%-confidence regions (' -•-') are computed by assuming RWFM as the dominant noise. In Figure 2, only the τ ≤ 7-day referenced variances are included in the confidence intervals predicted by stochastic ONA. This phenomenon can be explained by the behavior of input variances. By comparing Figure 2 with Figure 1, it is easy to see that the 0.1-day ≤ τ ≤ 1-day frequency stabilities of the GPS SVN.41 onboard clock do not vibrate so fiercely as GPS SVN.45's. On the other hand, the former has a tail tens of times smaller than the referenced. As discussed in the previous subsection, stochastic ONA tends to attribute the fluctuations at long-term averaging time to RWFM. Stochastic ONA finds an h that holds for Equation (21). In this case, stochastic ONA fits the input variances using the inequality-bounded least-square model (17). However, the least square criterion (17) is designed for Gaussian distributions. In addition, the existence of some 1 − 2ε confidence regions holding for input variances does not mean no violation to the theoretical 1 − 2ε confidence intervals. Consequently, as shown in Figure 2, stochastic ONA underestimates the influence of frequency drift and overestimates the noise level of RWFM. Despite their compactness, only τ ≤ one-week referenced variances are included in the predicted regions. On the other hand, all referenced variances of the SVN.45 satellite clock are included in the regions predicted without using the least-square criterion.

Conclusions
In this article, we discussed the prediction of long-term stability with the presentation of deterministic linear frequency drift. The fundamental theory of time stability analysis and the influences of discrete sampling are first revisited. Based on these theories, we construct a method called stochastic ONA. Stochastic ONA extends the capability of conventional oscillator noise analysis to the prediction of long-term frequency stability. By then, we introduce methods to model long-term variances contaminated by frequency drift. Specifically, we: (i) formulated the influence of frequency drift on Allan (AVAR) and modified Allan (MVAR) variances; (ii) derived expressions for discrete Hadamard variance (HVAR); (iii) simplified the formulations for the case of GPS MCS (master control station) clock prediction; and (iv) introduce a method that reduces the computational complexity of Walter's characterization of AVAR and MVAR.
To test stochastic ONA and the model, we predict τ ≤ 15-day AVAR, MVAR and HVAR based on 14-day GPS precise clock data. Due to limited space, we choose the result of GPS SVN 45 and 41 as representatives: • For the SVN.45 satellite clock, stochastic ONA cannot find a set of noise intensity coefficients for which Equation (21) holds for input variances. In such a case, stochastic ONA predicts long-term stabilities based on Equation (23). The criterion (23) takes the probability distributions of input variances into account and produces robust results. All the referenced variances are included in the predicted confidence regions. On the other hand, τ = 1-and 2-day referenced AVAR and MVAR, and τ = 2∼7-day HVAR are not included in the 95%-confidence regions estimated from 42-day clock data. • For the SVN.41 onboard clock, stochastic ONA does find noise intensity coefficient values that hold for the probabilistic constraints (21). In this case, it predicts long-term stability of the satellite frequency standard based on the least square criterion (17). Despite the compactness of predicted confidence intervals, only τ ≤ 7-day referenced variances are included in these regions. Specifically, τ > 7-day referenced AVAR and MVAR are greater than the predicted ones, while the τ > 7-day referenced HVAR is smaller than the predictions. This suggests an overestimation of RWFM noise level and underestimation of frequency drift. Nevertheless, the inconsistency may be interpreted as the inappropriate power-law model used in this paper.
In summary, the method introduced in this paper can predict long-term stability superimposed with influences of frequency drift. Criterion (23) takes the probability distributions of input variances into account by assuming the majority of input variances within their 1 − 2ε confidence regions. The predictions made therein have large uncertainty, but are robust. In contrast, the least square criterion (17) assumes non-existing symmetric distributions of input variances, which reduce both the uncertainty and robustness of the result. We should find an alternative for the least square criterion in a future study. they are convex sets. In addition, the object functions defined in Equations (20), (23) and (25) are convex since they are either linear or quadratic with the semi-positive Hessian.
If we denote the object functions of Equations (20), (23) and (25) as f 0 (s), where s = h or (h, µ, ν), and the inequality constraints as: where l = 3 ∑ M j=1 m j or 6 ∑ M j=1 m j . The (Lagrange) dual problem of Equation (20), (23) and (25) is defined as: where λ is an l-dimensional real column vector, L(λ) dual function of the original problem: the infimum function inf (s)∈D (r(s, λ)) returns the maximum value of A, A is the set of all real values less than r(s, λ) for s ∈ D and fixed λ. r(s, λ) is the corresponding Lagrangian, Then, L(λ) ≤ f 0 (s). If, in addition, all inequalities f i (s) ≤ 0, i = 1, ..., l, are strict, and we denote the optimum of f 0 (s) and L(λ) as s * and λ * , respectively, f 0 (s * ) = L(λ * ).
Since the object function f 0 (s) defined by Equations (20), (23) This is the famous Karush-Kuhn-Tucker (KKT) conditions. To solve the inequality constraints, we replace f i (s) with the interior barriers: where log is the natural logarithm function and η an arbitrary positive real number. These barriers will be used as 'penalties' in the cost function. Specifically, we substitute the original object function f 0 (s) by: Since the inequality constraints become strict after taking the interior barriers, the KKT conditions are both sufficient and necessary for optima of Equation (A5). We therein solve Equation (A5) in the following Newtonian framework: • (s 0 , λ 0 ) is an arbitrary initial point of the original and dual problem • for k = 0, 1, · · · • solve the Newton step ∆s k , ∆λ k from: F (s) = f 1 (s) f 2 (s) · · · f l (s), 1 is an l-dimensional column vector whose elements are one, diag(λ) returns a square diagonal matrix with the elements of vector λ on the main diagonal, and the operator ∇ 2 returns the Hessian matrix of the operand at its right-hand side; for example, for f 0 (h, s) defined in Equation (17), the i-th row and j-th column component of ∇ 2 f 0 (s) is: is the i-th row and j-th column component of Φ T QΦ, λ i the i-th component of λ, DF (s) the derivative matrix of F (s), whose i-th row j-th column component is: • set h k+1 , s k+1 , λ k+1 ← h k , s k , λ k + ψ ∆h k , ∆s k , ∆λ k choosing β so that f i (s) < 0, i = 1, · · · , m 1 , and λ ≥ 0 [22], where 0 < ψ ≤ 1, ψ can be determined using a backtracking line search described in [23]. When ψ = 1, the primal-dual interior-point algorithm is based on the pure Newton method.
If the difference between f 0 (s k ) and r(s k , λ k ) grows when k increases, from [24], we label Equation (A5) as 'infeasible', and change the model according to the description of Section 3. Otherwise, we increase t in Equation (A5) and use the result of the previous iteration as the initial point (s 0 , λ 0 ) in th current iteration.