A Spectral Acceleration Approach for the Spherical Harmonics Discrete Ordinate Method

A spectral acceleration approach for the spherical harmonics discrete ordinate method (SHDOM) is designed. This approach combines the correlated k-distribution method and some dimensionality reduction techniques applied on the optical parameters of an atmospheric system. The dimensionality reduction techniques used in this study are the linear embedding methods: principal component analysis, locality pursuit embedding, locality preserving projection, and locally embedded analysis. Through a numerical analysis, it is shown that relative to the correlated k-distribution method, PCA in conjunction with a second-order of scattering approximation yields an acceleration factor of 12. This implies that SHDOM equipped with this acceleration approach is efficient enough to perform spectral integration of radiance fields in inhomogeneous multi-dimensional media.


Introduction
The retrieval of trace gas products from UV/VIS spectrometers is strongly affected by the presence of clouds. For the forward-model simulation of satellite measurements from instruments with high spatial resolution, it is important to account for the sub-pixel cloud inhomogeneity, or at the least, to assess this effect on the radiances at the top of the atmosphere, and so, on the retrieval results. For the new generation of atmospheric composition UV-VIS-NIR sensors, such as Sentinel-5P, and Sentinel-4 and −5, fast and accurate models accounting for the cloud inhomogeneities are crucial.
The spherical harmonics discrete ordinate method (SHDOM) developed by Evans [1,2] is one of the most efficient and widely used multi-dimensional deterministic (explicit) methods in the atmospheric sciences. SHDOM adopts (i) both spherical harmonics and discrete ordinates representation of the radiance field during different parts of the solution algorithm, and (ii) a sort of successive order of scattering solution method (Picard iteration [3]). The spherical harmonics are employed for computing the source function including the scattering integral, while the discrete ordinates are used to integrate the radiative transfer equation spatially. Specifically, the radiative transfer equation is solved iteratively by (i) transforming the source function from spherical harmonics to discrete ordinates, (ii) integrating the product of the source function and transmission along discrete ordinates, (iii) transforming the discrete ordinate radiances to spherical harmonics, and (iv) calculating the source function from the radiance in the spherical harmonics space. To achieve higher accuracy with a limited amount of memory, an adaptive grid is implemented. By this technique, regions where the source function is changing more rapidly have a higher density of grid points.
The computation of the radiative transfer through an atmosphere consisting of gas molecules is a demanding task because it requires an accurate modeling of the spectral absorption and scattering signatures of the gases plus additional particulate material under consideration. SHDOM uses a correlated k-distribution approach [4] for the integration over a spectral band. Note that in the version of March 1996, the parameterization of Fu and Liou [5] is the basis of the broadband integration of the gaseous line absorption, while longwave and shortwave Rapid Radiative Transfer Model (RRTM)-based k-distribution programs were added in 2003.
In addition to the k-distribution method, dimensionality reduction techniques applied on the optical properties of an atmospheric system become a standard broadband acceleration tool for one-dimensional radiative transfer [6][7][8]. Natraj et al. [9,10] designed a radiative transfer model based on the principal component analysis, while Efremenko et al. [11] generalized this model to include local linear embedding methods, as for example, locality pursuit embedding, locality preserving projection, and locally embedded analysis.
The aim of this paper was two-fold. First, to extend the applicability range of dimensionality reduction techniques to multi-dimensional radiative transfer, and second, to design a spectral acceleration approach for SHDOM that combines the correlated k-distribution method with some dimensionality reduction techniques. Although in the literature these methods are used separately, we apply them together. In this way, we expect to increase the computation speed of the k-distribution-based SHDOM.

The Spectral Acceleration Approach
Let us consider the solar radiative transfer in a rectangular prism of lengths L x , L y and L z as shown in Figure 1. The top and bottom faces of the prism are denoted by S t and S b , respectively, while S 1x (x = 0), S 2x (x = L x ), S 1y (y = 0), and S 2y (y = L y ) correspond to the lateral faces. The boundary-value problem for the total radiance at point r in direction Ω consists of the inhomogeneous differential equation the top-of-atmosphere boundary condition and the Lambertian surface boundary condition At the horizontal boundaries, periodic boundary conditions are assumed, i.e., I(λ, r 1x , Ω) = I(λ, r 2x , Ω), I(λ, r 1y , Ω) = I(λ, r 2y , Ω), for r ix ∈ S ix and r iy ∈ S iy with i = 1, 2. In Equations (1)-(4), λ is the wavelength, σ ext (λ, r) and σ sct (λ, r) the extinction and scattering coefficients, respectively, P(λ, ·) the phase function, A(λ) the surface albedo, Ω 0 = (−µ 0 , ϕ 0 ) with µ 0 > 0, the solar direction, F 0 (λ) the solar flux, r 2x = r 1x + L x i and r 2y = r 1y + L y j with (i, j, k) being the Cartesian unit vectors. Moreover, Ω + and Ω − stand for an upward and a downward direction, respectively, Ω is the unit sphere, while Ω + and Ω − denote the upper and lower unit hemispheres, respectively, An instrument placed at point r m measures the radiance at the top of the atmosphere in direction Ω 0 m = (µ 0 m , ϕ 0 m ), µ 0 m > 0. Denoting by g(λ) the slit function of the instrument, s the slit width, and S tm the footprint of the instrument on the top face S t , the spectral signal measured by the instrument at wavelength λ m in the spectral interval [λ min , λ max ] is given by where is the signal integrated over the field of view of the instrument at wavelength λ , Ω m (r t ) the unit vector along the line connecting the points r t ∈ S tm and r m , i.e., Ω m (r t ) = (r m − r t )/|r m − r t | (see Figure 1), and and A tm the characteristic function and the area of the instrument footprint, respectively. Assuming that the distance from the top of the atmosphere to the instrument is large, we approximate I(λ , r, Ω m (r)) ≈ I(λ , r, Ω 0 m ) in Equation (6). For an atmosphere consisting of gas molecules and a cloud, we suppose that 1. the optical coefficients of the gas molecules depend on the altitude level and the wavelength, and 2. the optical coefficients of the cloud depend on the spatial coordinates but not on the wavelength.
The extinction coefficient is then computed as where σ cloud ext (r) is the extinction coefficient in the cloud, σ mol sct (λ, z) the molecular scattering coefficient due to Rayleigh scattering, σ gas abs (λ, z) the gas absorption coefficient, and (x, y, z) the Cartesian coordinates of point r. For the scattering coefficient, we have a similar representation, namely, where ω cloud is the single-scattering albedo of the cloud. The goal of our analysis is to design an approach for accelerating the computation of the spectral signal I(λ m ) at a set of W m spectral points {λ mw } W m w=1 characterizing the instrument. For this purpose, we combine the correlated k-distribution method with some dimensionality reduction techniques.

Correlated k-Distribution Method
The correlated k-distribution method [4,12] is based on grouping spectral intervals according to absorption coefficient strength. In this section we provide a description of the correlated k-distribution method, which will enable us to connect this method with dimensionality reduction techniques.
Let {λ k } W k=1 be a discrete set of W equally spaced wavelengths in the interval where λ the discretization step. The forward-model spectral set {λ k } W k=1 is finer than the measurement spectral set {λ mw } W m w=1 ; in practice, W is chosen as a multiple of W m , i.e., W = cW m with say, c being an integer greater than 20. The convolution integral giving the expression of the spectral signal measured by the instrument can be approximated by As gas absorption has stronger spectral variation than molecular and particulate scattering, we write formally (at a given altitude level z) I FOV (λ) = I FOV (σ gas abs (λ)). The accurate technique for computing the integral in Equation (10) is a line-by-line (LBL) calculation of the gas absorption coefficient versus wavelength. Contrarily, the correlated k-distribution method is based on the fact that for a homogeneous atmosphere, the transmission within a spectral interval is independent of the LBL variation of σ gas abs with respect to the wavelength, but rather depends only on the distribution of σ gas abs within the spectral interval [13]. In this regard, let F = F(σ gas abs,k ) be the cumulative density function of σ gas abs (λ) in the spectral interval [λ k − λ/2, λ k + λ/2], and σ gas abs,k (F) the quantile function (i.e., the inverse function of the cumulative distribution function). The spectral signal measured by the instrument can then be computed as (11) where {F l , l } N q l=1 is a set of N q quadrature points and weights in the interval [0, 1]. The σ gas abs,k (F l ) can be computed by inverting the cumulative density functions of the LBL gas absorption coefficients, or in the case of the "exponential sum fitting of transmittance" method [14], by solving a nonlinear least squares problem.
In the final step, we define the sets of wavelengths and weights {λ w } W w=1 and {ω w } W w=1 with W = W N q , respectively, through the relations λ w = λ k and ω w = λ l , where w = l + (k − 1)N q for k = 1, . . . , W, l = 1, . . . , N q . Accordingly, we set σ gas abs (λ w ) = σ gas abs,k (F l ). Note that {λ w } W w=1 contains W groups of N q identical wavelengths. By this construction, Equation (11) becomes Note that Equation (12) is a quadrature rule for the convolution integral (10) in the case of the correlated k-distribution method.
The computation of the spectral signal by means of the correlated k-distribution method requires the computation of the absorption coefficients σ gas abs (λ w , z) at the set of wavelengths {λ w } W w=1 , and so, W monochromatic radiative transfer calculations. For applications with c ≥ 20 and N q ≥ 4, W = W N q = cW m N q can be in the order of hundreds, and a further acceleration is needed. This can be achieved by applying dimensionality reduction techniques on the optical parameters.

Dimensionality Reduction of Atmospheric Optical Parameters
Following Ref. [11], the dimensionality reduction problem of the optical parameters can be formulated as follows. For each wavelength λ w , we define an N-dimensional vector x w by where σ gas abs,k and σ mol sct,k are the optical coefficients in the kth level, N = 2N z + 1, and N z is the number of altitude levels. For simplicity, the surface albedo A was not included in x w (we assumed that the variations of A are negligible over the width of a molecular absorption band). Thus, the wavelength variability of the optical parameters is encapsulated in the vector x w ∈ R N , and there is a one-to-one correspondence between the discrete set of wavelengths {λ w } W w=1 , and the discrete set of optical x w is the sample mean of the data. The goal of a dimensionality reduction technique is to find an M-dimensional subspace (M < N) spanned by a set of linear independent vectors {a k } M k=1 , such that the centered data x w − x belong to this subspace; that is, In Equation (14), the matrix A = [a k ] M k=1 ∈ R N×M , encapsulating the column vectors a k , represents the inverse transform from the low-dimensional space to the high-dimensional space, and y wk is the kth component of y w ∈ R M . The vector y w is given by the forward transform from the high-dimensional space to the low-dimensional space, where An approximate model for estimating the integrated signal of the instrument at wavelength λ is of the form where I a FOV (λ) is the integrated signal computed by an approximate radiative transfer model, and f (λ) a correction factor. The correction factor f (λ) can be calculated efficiently and accurately by means of dimensionality reduction techniques applied on the optical parameters. Let us assume that the scalar function f (x w ) given by Equation (16) is not too nonlinear in x w . The computational process is organized as follows.

1.
For (cf. Equation (14)) f (x w ) is approximated by a second-order Taylor expansion around x; that is, where ∇ f and ∇ 2 f are the gradient and the Hessian of f , respectively.

2.
For a k = ||a k || a k and in view of Equation (17), the second and third terms in Equation (18) are written as and respectively. 3.
The mixed directional derivatives in Equation (20) are neglected, while the first-and second-order directional derivatives are approximated by means of central differences; that is, and respectively.

4.
For h = ||a k ||, Equations (18)-(22) yield the computational formula: From Equation (23), it is apparent that the computation of f (x w ) requires 2M + 1 calls of the exact and approximate models, while the computation of I FOV (λ w ) requires, in addition, W calls of the approximate model. In this regard and taking into account that M W, we are led to a substantial reduction of the computational time.
The most widely used dimensionality reduction techniques are the linear embedding methods which are summarized below.

1.
The principal component analysis (PCA) [15] performs a dimensionality reduction by projecting the original N-dimensional data on the M-dimensional subspace spanned by the dominant singular vectors of the data covariance matrix.

2.
The locality pursuit embedding (LPE) [16] performs a principal component analysis on local nearest neighbor patches to reveal the tangent space structure on the M-dimensional subspace.

3.
The locality preserving projection (LPP) [17] solves a variational problem that optimally preserves the neighborhood structure of the data set.

4.
The locally embedded analysis (LEA) [18] uses an embedding strategy based on a linear eigenspace analysis to minimize the local reconstruction error.
It should be pointed out that PCA preserves only the global structure of the data, and may fail to preserve the local structure if the data lie on a nonlinear manifold. In contrast, LPE, LPP, and LEA are local linear approaches, which optimally preserve local neighborhood information (the local structure of the data) in a certain sense.
The PCA, LPE, LPP, and LEA methods are illustrated in Algorithms A1-A4 of Appendix A. The outputs of these algorithms are the linear mapping A and the vectors of parameters y w , w = 1, . . . , W. Note that in the PCA language, the column vectors a k , k = 1, . . . , M of A are the so-called empirical orthogonal functions, while the vectors of parameters y w , w = 1, . . . , W are the principal components.
A last problem which has to be solved is the choice of an approximate model. Two special features of SHDOM facilitate this choice, namely, the use (i) of spherical harmonics and discrete ordinates to represent the radiance field, and (ii) of a sort of successive order of scattering solution method (Picard iteration). In a first approximate model, we choose 2 zenith discrete angles per hemisphere to represent the radiance field; analogous to the one-dimensional radiative transfer, the resulting method is called the four-stream approximation. In a second approximate model, we consider only 2 steps of the method of Picard iteration; the resulting method is called the second-order of scattering approximation.
In summary, the spectral acceleration approach includes two steps. The first step is a pre-processing step, consisting in the computation of the sets of wavelengths {λ w } W w=1 , the absorption coefficients σ gas abs (λ w , z), and the weighting factors {ω w } W w=1 in the framework of the correlated k-distribution method. The second step is a computational step, consisting of the calculation of 1.

Numerical Analysis
Before presenting some numerical results, we describe an implementation of SHDOM that is devoted to the retrieval of atmospheric trace gas concentrations from space-borne spectral measurements of radiation reflected through the Earth's atmosphere.

SHDOM Implementation
For an atmosphere consisting of gas molecules, we have to consider a domain of analysis with a large vertical extent, especially when dealing with trace retrievals. The reason is that gas molecules have a vertical concentration profile up to several tens of kilometers. Unfortunately, for large problems, SHDOM is limited by the amount of memory available for calculation, in particular on smaller computers. To handle this problem, in our implementation of SHDOM, we divide the domain of analysis into a number of subdomains along the vertical direction, and for each subdomain, we store the radiance and source function in separate files. The optical properties and the direct solar beam are calculated at each point on the base grid of the entire domain of analysis, while (i) the radiative transfer equation integration along each discrete ordinate during the solution iterations and (ii) the integration along the viewing directions for the intensity output are done successively moving from one subdomain to the neighboring one. Each subdomain has its own grid obtained by the cell splitting (adaptive grid) procedure. At the boundaries of a subdomain, the radiances are calculated not only at the grid points of the subdomain itself, but also at the grid points corresponding to the boundaries of the neighboring subdomains ( Figure 2). This is done by integrating the radiative transfer equation and by applying the standard interpolation approach (the entering and exiting values of the extinction and extinction-source function product are computed using bilinear interpolation of the four gridpoint values of the faces pierced by a ray). To ensure the continuity condition for the radiance field, the boundary values of the radiance are passed between the neighboring subdomains. The data are transferred to and from files.
This implementation of SHDOM does not give exactly the same results as the standard implementation because of (i) the splitting decision for the cells situated at the boundaries of a subdomain, and (ii) the differences between the rays along which the radiative transfer equation is integrated (a ray is traced backward until the transmission falls below some minimum specified value only inside a subdomain). However, as compared to the overall accuracy of the algorithm, these errors are small. On the other hand, this implementation increases the computation time by the times which are needed (i) to write and read data to and from files, and (ii) to compute the radiances at the additional grid points on the boundaries of each subdomain. It should be pointed out that the number of subdomains is set by the user and depends on the problem to be solved.
The code can be reformulated as a parallel code by using the Message Passing Interface (MPI). In this context, it should be pointed out that a parallel implementation of SHDOM, but in which the domain of analysis is divided along the horizontal directions, has been described in Ref. [19]. However, in the case of horizontal splitting, the transfer of data between different neighboring subdomains is more challenging for the implementation.

Numerical Results
The computational domain is a rectangular prism of lengths L x = L y = 15 km and L z = 40 km. The discretization steps along the horizontal directions are x = y = 0.5 km; thus, the corresponding numbers of base grid points are N x = N y = 30. Along the vertical direction, the domain is discretized into 8 subdomains; the corresponding altitude levels and discretization steps are listed in Table 1. The cloud is homogeneous in the vertical direction and is placed between 3 and 4 km (second subdomain). The cloud extinction field is modeled as where σ 0 = 6 km −1 and f (x, y) is the indicator function taking the values f (x, y) = 1 inside the cloud and f (x, y) = 0 inside the clear sky region. The indicator function f (x, y) is generated by a two-dimensional broken cloud model [20] with a cloud fraction of about 0.4. In order to avoid abrupt changes of the extinction field in the horizontal plane, σ cloud ext (x, y) is smoothed at the boundary of a cloudy region (by linear interpolation along two discretization steps in each direction). To simplify the analysis, a Henyey-Greenstein phase function with an asymmetry parameter g = 0.8 is considered, while the single-scattering albedo is chosen as ω cloud = 0.99. The number of discrete zenith and azimuth angles are N µ = 16 and N ϕ = 2N µ , respectively, the solar and instrument zenith angles are θ 0 = 60 • and θ 0 m = 30 • , respectively, the relative azimuth angle is ∆ϕ = 0, and a Lambertian reflecting surface with the surface albedo A = 0.2 is assumed. The footprint of the detector is considered to be a square of length 2a = 10∆x centered at x 0 = y 0 = L x /2, and z 0 = L z . The calculations are performed for a mid-latitude summer atmosphere [21] and a wavelength-dependent slit function corresponding to the TROPOspheric Monitoring Instrument (TROPOMI). SHDOM is run by using an adaptive grid with a splitting accuracy of 10 −4 . The solution accuracy is 10 −4 , and the simulations are performed for periodic boundary conditions by using the delta-M approximation along with the untruncated phase function single-scattering solution (TMS correction of Nakajima and Tanaka [22]). The dimensionality reduction parameter M, which determines the approximation accuracy of the linear embedding methods, is chosen as M = 4. The simulations were performed on a computer Intel Core i5-3340M CPU 2.70GHz with 8 GB RAM. Table 1. Discretization of the domain of analysis along the vertical direction.

Oxygen A-Band Test Problem
In our first example, we consider, in addition to the scattering and absorption by a cloud, molecular Rayleigh scattering, and the absorption in the Oxygen A-band. The measurement spectral grid contains 107 spectral points between 758 nm and 771 nm, while the number of spectral points for the correlated k-distribution method is W = 936. For this test, a two-dimensional slice at y = 7 km is selected from a three-dimensional broken-cloud field (Figure 3). Figure 4 illustrates the spectral signal computed by using SHDOM with the correlated k-distribution method. Using these results as a reference, we show in Figure 5 the relative differences in the spectral signal corresponding to the linear embedding methods, and in Table 2, the RMS of the relative difference and the computational time in hours, minutes, and seconds. The computation time of SHDOM with the correlated k-distribution method was 36 minutes and 28 s. Several conclusions can be formulated:

1.
For the dimensionality reduction parameter M = 4, the relative differences are smaller than 1.25 × 10 −3 over the entire spectral domain, while the RMS values are smaller than 6.4 × 10 −4 .

2.
The computation time corresponding to the second-order of scattering approximation is smaller than that corresponding to the four-stream approximation.

3.
For the second-order of scattering approximation, the fastest linear embedding method is PCA and the most accurate is LPP.

4.
Relative to the correlated k-distribution method, the acceleration factor is of about 7-12.    , and water vapor is considered. The measurement spectral grid contains 119 spectral points between 425 nm and 450 nm, while the number of spectral points for the correlated k-distribution method is W = 268. Note that in the correlated k-distribution method, the auxiliary gases are treated as weak absorbers meaning that for these gases, the average values of the LBL cross sections are used in the calculation. The test problem is three-dimensional, and the indicator function of the broken cloud is illustrated in Figure 6. In Table 3, we indicate the amount of memory required for storing the radiance and source function for each subdomain. As a result that the domain of analysis is not excessively large, the total amount of disk memory is 3.168 GB. As a result, we infer that SHDOM can be also run without splitting the domain of analysis. Taking these results as a reference, we found that the relative errors due to the domain splitting procedure are smaller than 10 −3 . The spectral signal computed by using SHDOM with the correlated k-distribution method is shown in Figure 7, the relative differences in the spectral signal corresponding to the linear embedding methods are plotted in Figure 8, and finally, the RMS of the relative difference and the computational time in hours, minutes, and seconds are given in Table 4. For this test example, the computation time of SHDOM with the correlated k-distribution method was 12 h, 18 min, and 32 s. As in the first test example, we see that 1.
for the dimensionality reduction parameter M = 4, the relative differences are smaller than 5 × 10 −4 over the entire spectral domain, while the RMS values are smaller than 2.2 × 10 −4 ; 2.
the second-order of scattering approximation is faster than the four-stream approximation; 3.
for the second-order of scattering approximation, the fastest linear embedding method is PCA and the most accurate is LPE; 4.
the acceleration factor is about 9-12. Figure 6. Indicator function f (x i , y j ) with x i = i∆x and y j = j∆y for the NO 2 -test problem.

Conclusions
A spectral acceleration approach for SHDOM has been designed. This approach combines the correlated k-distribution method and the linear embedding methods PCA, LPE, LPP, and LEA.
We have performed a numerical analysis of these techniques for a two-dimensional atmosphere in the Oxygen A-band, and a three-dimensional atmosphere consisting of nitrogen, ozone, oxygen dimer, and water vapor in a spectral interval ranging from 425 nm and 450 nm. The main conclusions of our analysis are as follows.

1.
SHDOM with the correlated k-distribution and linear embedding methods has a sufficiently high accuracy. The relative differences in the spectral signal are smaller than 1.25 × 10 −3 (over the entire spectral domain) in the case of a two-dimensional atmosphere, and 5.0 × 10 −4 in the case of a three-dimensional atmosphere. In three of the four cases, the most accurate linear embedding method is LPE.

2.
The linear embedding methods based on the second-order of scattering approximation are faster than those based on the four-stream approximation. Specifically, in the case of a two-dimensional atmosphere, the computation time of the linear embedding methods is on average about 4 min and 30 s for the four-stream approximation, and 3 min and 20 s for the second-order of scattering approximation, while in the case of a three-dimensional atmospheres, the corresponding times are 1 h and 21 min for the four-stream approximation, and 1 h and 2 min for the second-order of scattering approximation. The fastest linear embedding methods is PCA followed by LPE. For the test examples considered in our numerical analysis, PCA in conjunction with a second-order of scattering approximation yields an acceleration factor of 12 relative to the correlated k-distribution method.
In conclusion, from the point of view of accuracy and efficiency, it appears that LPE based on the second-order of scattering approximation is the most suitable acceleration method. Note that the accuracy and efficiency of the linear embedding methods are determined by the dimensionality reduction parameter M (as the accuracy and the computation time increase with M, an optimal value for M should be a compromise between accuracy and efficiency).
SHDOM equipped with the spectral acceleration approach can be used to analyze the impact of cloud inhomogeneities on trace gas retrievals [23]. In particular, for a specific cloudy scene, a spectral signal can be simulated by SHDOM and then included in a one-dimensional retrieval algorithm to derive for example, the total column amount of a trace gas. By comparing the retrieved total column with the true value, a bias due to cloud effects can be deduced [24,25], and possible correction strategies can be explored. Going one step further, a multi-dimensional trace gas retrieval algorithm can be designed. In this case, the spectral acceleration approach should be adapted to a linearized version of SHDOM (based on a forward or a forward-adjoint approach as described in Ref. [26]). This is the topic of a companion paper.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
Step 1. Stack all centered data x w ← x w − x, w = 1, . . . , W, into the columns of a matrix X ∈ R N×W , i.e., X = [x w ] W w=1 .
Step 2. For each w = 1, . . . W, consider the sets X (w) = {x (w) N(j) } k j=1 containing the k nearest neighbors to x w with respect to the Euclidean distance.
Step 3. Calculate the elements of the sparse matrix W ∈ R W×W as follows: (i) for w = 1, . . . W, calculate the elements G ij of the matrix G ∈ R k×k as G ij = (x w − x (w) N(j) ), and (ii) set W wN(j) = w j with w j being the elements of the column vector w = G −1 1/(1 T G −1 1) ∈ R k .
Step 4. Calculate the elements D v of the diagonal matrix D = diag[D v ] W v=1 ∈ R W×W as D v = ∑ w W wv .
Step 5. As in Algorithm A3, solve the generalized eigenvalue problem Az k = λ k Bz k , for A = X(D − W) T (D − W)X T ∈ R N×N and B = X(D T D + W T W)X T ∈ R N×N , and let C = VΛV T .