A Correlated Multi-Pixel Inversion Approach for Aerosol Remote Sensing

Feng Xu; David J. Diner; Oleg Dubovik; Yoav Schechner

doi:10.3390/rs11070746

¹ Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA 91109, USA

² Laboratoire d’Optique Atmosphérique, CNRS/Université Lille-1, 59655 Villeneuve d’Ascq, France

³ Viterbi Faculty of Electrical Engineering, Technion - Israel Institute of Technology, Haifa 32000, Israel

* Author to whom correspondence should be addressed.

Remote Sens. 2019, 11(7), 746; https://doi.org/10.3390/rs11070746

This article belongs to the Special Issue Advances of Remote Sensing Inversion

Abstract

Aerosol retrieval algorithms used in conjunction with remote sensing are subject to ill-posedness. To mitigate non-uniqueness, extra constraints (in addition to observations) are valuable for stabilizing the inversion process. This paper focuses on the imposition of an empirical correlation constraint on the retrieved aerosol parameters. This constraint reflects the empirical dependency between different aerosol parameters, thereby reducing the number of degrees of freedom and enabling accelerated computation of the radiation fields associated with neighboring pixels. A cross-pixel constraint that capitalizes on the smooth spatial variations of aerosol properties was built into the original multi-pixel inversion approach. Here, the spatial smoothness condition is imposed on principal components (PCs) of the aerosol model, and on the corresponding PC weights, where the PCs are used to characterize departures from the mean. Mutual orthogonality and unit length of the PC vectors, as well as zero sum of the PC weights also impose stabilizing constraints on the retrieval. Capitalizing on the dependencies among aerosol parameters and the mutual orthogonality of PCs, a perturbation-based radiative transfer computation scheme is developed. It uses a few dominant PCs to capture the difference in the radiation fields across an imaged area. The approach is tested using 27 observations acquired by the Airborne Multiangle SpectroPolarimetric Imager (AirMSPI) during multiple NASA field campaigns and validated using collocated AERONET observations. In particular, aerosol optical depth, single scattering albedo, aerosol size, and refractive index are compared with AERONET aerosol reference data. Retrieval uncertainty is formulated by accounting for both instrumental errors and the effects of multiple types of constraints.

Keywords: correlated aerosol inversion; radiative transfer; multiangle radiometry; polarimetry

1. Introduction

Aerosol retrievals performed on remotely-sensed radiometric and polarimetric imagery are subject to non-uniqueness of the solutions and a large computational burden. The former is caused by insufficient information in the observations and the latter is associated with the high dimensionality of the parameter space. To stabilize the inversion process, optimization-based inversions are often informed by extra constraints that supplement the observations. Exploration and application of these constraints are primary tasks in designing a reliable inversion algorithm. A particularly effective physical constraint for remote sensing applications is the so-called smoothness condition (see [1,2,3,4,5]), which takes advantage of the general observation that some quantities (such as ozone vertical distribution and atmospheric temperature profile) vary smoothly in certain dimensions. For aerosols, their loading and properties (e.g., size distribution or spectral real and imaginary refractive index) tend to vary smoothly in the spatial (horizontal and vertical) dimensions, while the underlying surface reflectance tends to vary smoothly in the temporal dimension. Imposition of this type of constraint leads to the “multi-pixel” inversion approach [6]. In this case, many pixels are stably inverted at one time and the dimensionality of the retrieval space is proportional to the number of inverted pixels.

Additional constraints can further reduce the large dimensionality of the parameter space, with the expected benefits of higher stability and accelerated retrievals. In this study, we introduce a “correlation constraint” on the aerosol inversions. This constraint takes advantage of empirical dependencies between different sets of aerosol parameters. The cross-correlation coefficient between a pair of aerosol fields quantifies the degree of correlation—as long as this coefficient is not zero, the aerosol variables are not independent of each other, i.e., they are correlated to some extent. However, development of a sophisticated physical model to parameterize these correlations is intractable due to the complexity of the real physical world. On the other hand, empirical methods, such as principal component analysis (PCA) or empirical orthogonal function (EOF) analysis, can be used to capture phenomenological correlations between aerosol fields. By implementing PCA over a training dataset, the original fields can be transformed to superpositions of a set of mutually orthogonal vectors, or principal components (PCs). Then, the original aerosol fields can be approximately reconstructed using a linear combination of a certain number of PCs, with accuracy dependent on the number of PCs employed.

Figure 1 and Figure 2 show a PC analysis of AERONET Level 2 aerosol inversions within a circular domain of diameter 2000 km around the Fresno, California, and Namibe, Angola, sites, respectively. The aerosol fields include total volume concentration, fraction of fine mode particles, effective radii of fine and coarse mode particles, spectral real and imaginary parts of refractive index, and volume concentration of the spherical particles. To generate Figure 1, a total of 691 retrievals from 60 AERONET stations within a circular domain of 2000 km around the AERONET site in Fresno, California, at longitude −119.787° and latitude 36.738° for the years 1994–2018 were used in the PC analysis. For Figure 2, a total of 780 retrievals from 18 AERONET stations within a circular domain of 2000 km around the AERONET site in Namibe, Angola, at longitude 12.178° and latitude −15.159° for the years 2000–2017 were used for the PC analysis. Although there are 13 parameters and the aerosol fields vary both temporally and spatially, a few PCs dominate the variations about the mean. For these two cases, 85–90% of the variance in the aerosol fields is captured by the first four PCs. Corresponding to the AERONET analysis in Figure 1 (Fresno domain), Table 1 lists the mutual correlation coefficients among all 13 parameters. Many different types of parameters show correlation coefficients exceeding ~0.3, and among related parameters, such as real and imaginary part of the refractive index, even stronger correlation is observed. While non-absorbing and weakly absorbing particles are dominant in the Fresno domain (Figure 1), absorbing (smoke) aerosols prevail in the Namibe domain (Figure 2). Nevertheless, mutual correlation among aerosol parameters is found in the latter domain as well, as indicated in Table 2. Moreover, comparison of Table 1 and Table 2 shows dependence of correlation on specific sets of parameters and regions. 
For example, the correlation between total aerosol concentration and spherical particle concentration in the 2000 km Fresno region is weaker than that in the Namibe region. The correlation between fine mode aerosol fraction and the real part of the refractive index, however, is stronger in the Fresno region.

Figure 1. (a) Top left panel: AERONET inversion of aerosol fields and properties (in natural logarithm space) including total volume concentration ($C_{v, tot}$), fraction of fine mode aerosols ($f_{fine}$), effective radii of fine ($r_{eff, fine}$) and coarse ($r_{eff, coarse}$) mode aerosols, column effective real part of refractive indices ($n_{r, 1 - 4}$) at 0.439, 0.675, 0.870, and 1.018 μm, respectively, imaginary part of refractive indices ($n_{i, 1 - 4}$), and spherical particle volume concentration ($C_{v, sphere}$). Although no legend is shown, each color is associated with an independent set of retrieval parameters. Top right panel: spatial and temporal mean of all retrievals and the first four principal components (PCs). Bottom left panel: percentage variance of the aerosol fields captured by PCs, indicating that 41%, 62%, 77%, and 85% of variance is captured by the first one, two, three, and four PCs, respectively. Bottom right panel: Aerosol optical depth (AOD) and single scattering albedo (SSA) at 440 nm that are reported in the AERONET retrievals analyzed here. (b) Regression of the derived aerosol properties from four PCs against input AERONET values. The scatter plot is colored by the density of the points. Each panel corresponds to a specific aerosol property indicated in the title.

Figure 2. Same as Figure 1, but the analysis was performed for a circular domain of 2000 km around the AERONET site in Namibe, Angola. The first four PCs capture 56%, 74%, 83%, and 90% of the variance of the aerosol fields.

Table 1. Correlation coefficient of AERONET retrieved aerosol properties analyzed in Figure 1 (for 2000 km circular domain around Fresno, California). The correlation is calculated for all parameters in logarithmic space.

Table 2. Correlation coefficient of AERONET retrieved aerosol properties analyzed in Figure 2 (for 2000 km circular domain around Namibe, Angola).

This empirical correlation among aerosol parameters motivates reduction of the dimensionality of the retrieval parameter space and acceleration of the radiative transfer (RT) computations. Benefiting from the empirical aerosol correlations and orthogonality of PCs, increased computational efficiency is achieved by developing a PC-based fast multi-pixel RT computation scheme, as described in Section 4. A straightforward way to reduce dimensionality of the retrieval space is to first derive PCs from a training dataset, such that the retrieval problem reduces to determination of the pixel-dependent PC weights. To account for possible errors in the precomputed PCs, the correlation-based inversion should allow adjustment of the elements of the PC vectors. This is not straightforward, as the vector elements vary significantly in magnitude. A methodology for handling this issue is discussed in Section 3.

PCA has been used to significantly accelerate hyperspectral RT computations [7,8,9]. Following a determination of representative components from a training dataset of trace-gas and solar-induced chlorophyll fluorescence, the hyperspectral inversions of these quantities are sped up by retrieving PC weights [10,11,12]. In another retrieval application, Multi-angle Imaging SpectroRadiometer (MISR) operational aerosol retrieval over land uses spatial contrasts to derive PCs of the surface related contribution to the top-of-the-atmosphere (TOA) radiances [13,14]. The similarity in the angular shapes of surface bidirectional reflectance factors among MISR’s four spectral bands in the visible and near-infrared [15] is an example of a prior empirical constraint used to stabilize the aerosol retrievals. In contrast to MISR’s application of PCA to the surface boundary, this paper applies PCA to the aerosol parameters. An optimization approach is developed, which allows both PC vectors and PC weights to be retrieved.

Our retrieval approach is referred to as the “correlated multi-pixel inversion”, where “multi-pixel” reflects that our approach simultaneously uses multiple pixels within image-based measurements (as in the original multi-pixel inversion [6]), while “correlated” captures the extra correlation constraint. In contrast to most optimization approaches that retrieve the individual aerosol fields (see reviews in [16,17]) from multi-spectral radiometric or polarimetric remote sensing observations, we retrieve PC weights and PC vectors of those fields that have spectral, spatial, or temporal correlation with each other, with the purpose of reducing parameter space and improving algorithm efficiency. The individual fields are then constructed from the PC weights and PC vectors. The smoothness constraints on those fields, as implemented in the original “multi-pixel algorithm” used with airborne data at JPL [18,19], are adapted to the current correlated multi-pixel inversion approach. Both “multi-pixel” and “correlated multi-pixel” retrieval approaches utilize a one-dimensional (1D) code based on the independent pixel approximation [20] to ensure forward modeling efficiency.

This paper is organized as follows. Following an algorithm overview in Section 2, the correlated multi-pixel retrieval algorithm and error analysis are formulated in Section 3. A faster forward vector (polarized) RT model for a coupled atmosphere-surface system is presented in Section 4. In Section 5, retrieval tests are performed using 27 datasets acquired by the Airborne Multiangle SpectroPolarimetric Imager (AirMSPI) during multiple NASA field campaigns. A summary is presented in Section 6.

2. General Structure of the Algorithm

The expected advantages of correlated multi-pixel inversion in PC space are two-fold. First, it reduces the number of parameters to be retrieved and utilizes observed correlations as empirical constraints to improve the retrieval efficiency and accuracy. Specifically, the correlated parameter space reduces from $N_{corr} \times N_{pixel}$ to $(N_{PC} + 1) \times N_{corr} + N_{PC} \times N_{pixel}$, where $N_{pixel}$, $N_{corr}$, and $N_{PC}$ are the number of image pixels in the scene, the number of correlated aerosol parameters per pixel, and the number of PC vectors, respectively. Reduction of the parameter space in conjunction with the imposition of proper constraints is expected to mitigate ill-posedness. Moreover, establishment of the PC vectors using ground-based measurements, climatology, or other data sources enables characterization of aerosol properties for which the remote-sensing observations may have insufficient sensitivity (such as aerosol chemical composition or vertical profile).
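As a rough illustration of this count, the short Python sketch below evaluates both expressions. The function names are hypothetical; 13 parameters and 4 PCs echo the AERONET analysis in Section 1, while the 100-pixel scene is an assumed value chosen only for the arithmetic.

```python
def n_params_regular(n_corr: int, n_pixel: int) -> int:
    """Unknowns when every pixel carries its own set of correlated parameters."""
    return n_corr * n_pixel


def n_params_pc(n_corr: int, n_pixel: int, n_pc: int) -> int:
    """Unknowns in PC space: one mean vector and n_pc PC vectors shared by all
    pixels, plus n_pc weights per pixel."""
    return (n_pc + 1) * n_corr + n_pc * n_pixel


# 13 parameters and 4 PCs follow the AERONET analysis; 100 pixels is an assumption.
regular = n_params_regular(n_corr=13, n_pixel=100)
reduced = n_params_pc(n_corr=13, n_pixel=100, n_pc=4)
print(regular, reduced)
```

For very small scenes the shared mean and PC vectors dominate, and the PC-space count can exceed the regular one (for instance, 5 pixels with the same settings), so the reduction pays off as the number of pixels grows.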

An overview of the correlated multi-pixel approach and its mapping to different sections of this paper is shown in Figure 3. Symbols and abbreviations used in this work are found in Appendix A (Table A1). A training dataset of correlated parameters is used to derive the spatial and temporal mean of correlated parameters over a targeted area, along with PCs of the correlated departures from the mean. Together with the uncorrelated retrieval parameters, a state vector is initialized with elements specified in Section 3.1. In addition to observational constraints provided by the remote-sensing observations used in the retrievals (Section 3.2.1), retrieval convergence and robustness are improved by imposing additional constraints. For correlated fields, the additional constraints include: (a) the spectral variation of certain correlated fields (e.g., aerosol optical properties); (b) the spatial variation of certain correlated fields across neighboring image pixels; (c) mutual orthogonality and unit length of the PC vectors; and (d) zero sum of the PC weights. Formalisms for imposing these constraints in PC space are provided in Sections 3.2.2–3.2.7 and Appendix C. For any uncorrelated field (such as surface reflectance), constraints on the spectral, spatial, or temporal variations of the associated parameters are handled in the same way as formulated in the original multi-pixel inversion approach [6] and repeated in Section 3.2.3 and Appendix B. Combining observational and a priori constraints, a system of equations is established at each iteration and then solved to increment the solution (Section 3.2.8) in an inversion process. Error analysis for all retrieved properties is discussed in Section 3.3.

Figure 3. General structure of the correlated multi-pixel inversion approach. The interpretation of symbols used in the figure can be found in Table A1 of Appendix A.

To preserve generality of the approach, we use the symbol $x_{corr}$ in the notations below to denote a column vector containing an arbitrary set of parameters that have correlation with each other. In the specific implementation of the correlated multi-pixel inversion for AirMSPI aerosol remote sensing (Section 5), we specify the correlation to be among aerosol parameters only, while the surface parameters are uncorrelated with each other or with aerosol properties (though in principle these assumptions could be relaxed [21,22,23]). Utilization of the aerosol correlations enables development of a fast PC-RT model for TOA radiance and polarization calculation over a group of pixels (Section 4).

As the retrieval output, the correlated aerosol fields for pixel p ($x_{corr, p}$) are constructed from the pixel-resolved PC weights in vector form ($w_{p}$), the spatially and temporally effective PC matrix (v), and the spatial and temporal mean vector ${\bar{x}}_{corr}$, namely

x_{corr, p} = {\bar{x}}_{corr} + v \times w_{p}

(1)

where ${\bar{x}}_{corr}$ is a single column vector and the matrix v is made up of $N_{PC}$ column vectors consisting of correlated aerosol fields, with $N_{PC}$ denoting the number of retrieved principal components. By integrating all PCs into a matrix, we have

v = [\begin{matrix} v_{1} & v_{2} & \dots & v_{N_{PC}} \end{matrix}]

(2)

where each PC is composed of $N_{corr}$ elements, i.e.,

v_{k} = [v_{k} (1); v_{k} (2); \dots; v_{k} (N_{corr})]

(3)

where “;” indicates that the elements $v_{k} (1)$, $v_{k} (2)$, …, and $v_{k} (N_{corr})$ are vertically arranged into a column vector. We can further define a PC weight matrix w that consists of $N_{pixel}$ columns, namely,

w = [\begin{matrix} w_{1} & w_{2} & \dots & w_{N_{pixel}} \end{matrix}]

(4)

where the vector containing the PC weights for the p-th pixel is

w_{p} = [w_{p} (1); w_{p} (2); \dots; w_{p} (N_{PC})]

(5)

The quantities (${\bar{x}}_{corr}$, v, w) are contained in the solution vector x from the last iteration of an inversion process.
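The construction in Equations (1)–(5) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the training set is synthetic, the function names `fit_pcs` and `reconstruct` are hypothetical, and the PCs are obtained here from a singular value decomposition of the mean-removed training data, which is one common way to carry out PCA and not necessarily the procedure used for the figures above.

```python
import numpy as np


def fit_pcs(training: np.ndarray, n_pc: int):
    """Return the mean vector and the leading n_pc PCs of a training set.

    training has shape (n_samples, n_corr). The returned PC matrix v has
    shape (n_corr, n_pc) and orthonormal columns.
    """
    mean = training.mean(axis=0)
    # SVD of the mean-removed data: the rows of vt are the PCs.
    _, _, vt = np.linalg.svd(training - mean, full_matrices=False)
    return mean, vt[:n_pc].T


def reconstruct(mean: np.ndarray, v: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Equation (1) for all pixels: column p of the result is mean + v @ w[:, p]."""
    return mean[:, None] + v @ w


rng = np.random.default_rng(0)
# Synthetic stand-in for a training set: 200 samples of 13 correlated
# parameters whose departures from the mean span only 4 directions.
latent = rng.normal(size=(200, 4))
mixing = rng.normal(size=(4, 13))
training = 1.0 + latent @ mixing

mean, v = fit_pcs(training, n_pc=4)
w = v.T @ (training - mean).T      # weights by projection, shape (n_pc, n_samples)
x_corr = reconstruct(mean, v, w)   # shape (n_corr, n_samples)
```

Because the synthetic departures have rank four, four PCs reproduce the training fields to round-off; with real data the reconstruction is approximate and improves with the number of PCs retained. The projected weights also sum to zero over the samples, since the mean has been removed.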

3. Inversion

3.1. State Vector

Equations (1)–(5) demonstrate the basic operations for constructing correlated fields from PC weights and PC vectors. We define a state vector $x_{state}$ that includes both correlated and uncorrelated fields. The retrieval parameters are arranged in the following order,

x_{state} = [{\bar{x}}_{corr}; v_{k = 1}; v_{k = 2}; \dots; v_{k = N_{PC}}; w_{p = 1}; x_{p = 1, uncorr}; w_{p = 2}; x_{p = 2, uncorr}; \dots; w_{p = N_{pixel}}; x_{p = N_{pixel}, uncorr}]

(6)

where the mean of aerosol fields takes the form of a column vector

{\bar{x}}_{corr} = [{\bar{x}}_{corr, 1}; {\bar{x}}_{corr, 2}; \dots; {\bar{x}}_{corr, N_{TP, corr}}]

(7)

and $N_{TP, corr}$ is the total number of types of parameters correlated with others, where each type may have a subset of values. For example, the real and imaginary refractive indices are two types of parameters correlated with each other, and each of them has a subset of values as a function of wavelength. For the j-th type of correlated parameter with $L_{corr} (j)$ elements,

{\bar{x}}_{corr, j} = [{\bar{x}}_{corr, j} (1); {\bar{x}}_{corr, j} (2); \dots; {\bar{x}}_{corr, j} (L_{corr} (j))]

(8)

Similarly, an arbitrary ($k^{th}$) PC vector composed of $N_{TP, corr}$ types of correlated parameters can be arranged into the following column vector,

v_{k} = [v_{k, 1}; v_{k, 2}; \dots; v_{k, N_{TP, corr}}]

(9)

where for the j-th type of correlated parameter with $L_{corr} (j)$ elements,

v_{k, j} = [v_{k, j} (1); v_{k, j} (2); \dots; v_{k, j} (L_{corr} (j))]

(10)

The PC weight vector for pixel p is

w_{p} = [w_{p} (1); w_{p} (2); \dots; w_{p} (N_{PC})]

(11)

For the state vector with $N_{TP, uncorr}$ types of uncorrelated parameters, we have for an arbitrary pixel p,

x_{p, uncorr} = [x_{p, uncorr, 1}; x_{p, uncorr, 2}; \dots; x_{p, uncorr, N_{TP, uncorr}}]

(12)

where the $n_{TP, uncorr}$-th type of uncorrelated parameter has $L_{uncorr} (n_{TP, uncorr})$ elements, namely,

x_{p, uncorr, n_{TP, uncorr}} = [x_{p, uncorr, n_{TP, uncorr}} (1); x_{p, uncorr, n_{TP, uncorr}} (2); \dots; x_{p, uncorr, n_{TP, uncorr}} (L_{uncorr} (n_{TP, uncorr}))]

(13)
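The ordering of Equation (6) can be made concrete with a small packing routine. The sketch below is illustrative only: the function names and array layouts are assumptions, and all pixels are taken to carry the same number of uncorrelated parameters.

```python
import numpy as np


def pack_state(x_mean, v, w, x_uncorr):
    """Arrange the unknowns in the order of Equation (6).

    x_mean:   (n_corr,)            mean of the correlated parameters
    v:        (n_corr, n_pc)       PC vectors stored as columns
    w:        (n_pc, n_pixel)      PC weights, one column per pixel
    x_uncorr: (n_uncorr, n_pixel)  uncorrelated parameters, one column per pixel
    """
    shared = [x_mean] + [v[:, k] for k in range(v.shape[1])]
    per_pixel = [np.concatenate([w[:, p], x_uncorr[:, p]])
                 for p in range(w.shape[1])]
    return np.concatenate(shared + per_pixel)


def unpack_state(x_state, n_corr, n_pc, n_pixel, n_uncorr):
    """Inverse of pack_state: recover (x_mean, v, w, x_uncorr)."""
    n_shared = (n_pc + 1) * n_corr
    x_mean = x_state[:n_corr]
    # PC vectors were stored one after another, so reshape row-wise, then transpose.
    v = x_state[n_corr:n_shared].reshape(n_pc, n_corr).T
    # Each pixel contributes one block: its PC weights, then its uncorrelated parameters.
    block = x_state[n_shared:].reshape(n_pixel, n_pc + n_uncorr)
    w = block[:, :n_pc].T
    x_uncorr = block[:, n_pc:].T
    return x_mean, v, w, x_uncorr
```

Keeping the shared quantities (mean and PCs) at the front and the per-pixel quantities in contiguous blocks mirrors the column layout of the observational Jacobian introduced later in Section 3.2.1.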

3.2. Constraints

Reduction of parameter space is critical for ensuring inversion efficiency and mitigating ill-posedness of large-scale optimization. Moreover, imposition of various types of constraints is important for stabilizing the retrievals and improving accuracy. The correlated multi-pixel inversion algorithm described here uses a total of ten types of constraints:

The first type of constraint, formulated in Section 3.2.1, consists of the observations provided directly by the remote sensing instrument(s).
The second type of constraint consists of a priori values for the retrieval parameters, and is described in Section 3.2.2.
For the uncorrelated fields (such as surface reflection properties), the pixel-resolved properties $x_{uncorr}$ can be subjected to both across-pixel and within-pixel constraints. Imposition of these types of constraints has been incorporated into the original multi-pixel inversion [6] and is repeated in Section 3.2.3 as well as in Appendix B as the third type of constraint.
When a set of parameters (e.g., aerosol properties) are correlated with each other, their mean can be subjected to smoothness constraints. This is referred to as the fourth type of constraint in Section 3.2.3.
Transformation of smooth variations of aerosol properties from regular aerosol parameters into the PC space forms the fifth type of constraint, discussed in Section 3.2.4.
In PC space, the smoothness constraints can be applied to the PC weights w and vectors v separately. Application to across-pixel weights w, discussed in Section 3.2.3, forms the sixth type of constraint.
Similarly, application of the smoothness constraints to certain types of parameters within a PC vector $v_{k}$, also discussed in Section 3.2.3, forms the seventh type of constraint. Although it appears that the sixth and seventh types of constraints, applied to “w” and “v” separately, are redundant with the fifth type of constraint, which ensures a smooth variation of the overall correlated field constructed from PCs, they are helpful when poor initial guesses of “w” and “v” are provided by a training dataset.
The eighth type of constraint, formulated in Section 3.2.5, imposes a zero sum of the PC weights.
The ninth type of constraint, formulated in Section 3.2.6, imposes mutual orthogonality of the PC vectors.
The tenth type of constraint, formulated in Section 3.2.7, imposes unit norm on all PC vectors.

Even with this set of constraints, numerical errors are unavoidable during the iterations. Hence, a post-correction is implemented after each iteration by reapplying PC analysis to the updated PCs. This way the intrinsic properties of PCs are strictly preserved.
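One possible reading of this post-correction step is sketched below. It is an assumption, not a description of the authors' code: the departures from the mean are rebuilt from the current PCs and weights, any residual pixel mean is folded back into the mean vector, and PCA (via SVD) is reapplied so that the PCs are again orthonormal and the weights again sum to zero. The function name `recondition_pcs` is hypothetical, and the scene is assumed to contain at least as many pixels as PCs.

```python
import numpy as np


def recondition_pcs(x_mean, v, w):
    """Re-orthonormalize PCs after an iteration without changing the fields.

    x_mean: (n_corr,), v: (n_corr, n_pc), w: (n_pc, n_pixel).
    Returns (x_mean_new, v_new, w_new) such that
    x_mean_new[:, None] + v_new @ w_new equals x_mean[:, None] + v @ w.
    """
    dev = v @ w                   # departures from the mean, (n_corr, n_pixel)
    shift = dev.mean(axis=1)      # nonzero when the weights do not sum to zero
    u, s, vt = np.linalg.svd(dev - shift[:, None], full_matrices=False)
    n_pc = v.shape[1]
    v_new = u[:, :n_pc]                   # orthonormal columns
    w_new = s[:n_pc, None] * vt[:n_pc]    # weights with zero sum over pixels
    return x_mean + shift, v_new, w_new
```

The centered departures have rank at most $N_{PC}$, so truncating the SVD at $N_{PC}$ components loses nothing; only the representation changes, not the reconstructed aerosol fields.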

With the ten types of constraints (M = 10) introduced above, we now describe the statistical inversion of multi-source (or multi-constraint) data. It involves solving the following system of equations (see [24] for Equations (14)–(20) below):

f_{i}^{*} = f_{i} (x) + Δ f_{i}^{*}, 1 \leq i \leq M

(14)

where $f_{i}^{*}$ denotes the $i^{th}$ type of constraint, $Δ f_{i}^{*}$ is the error associated with this type of constraint, and x = $x_{state}$, as defined in Equation (6). Formally, the statistical independence of different sources of constraints means that the covariance matrix of the joint constraint $f^{*} = [f_{1}^{*}; f_{2}^{*}; \dots; f_{M}^{*}]$ has the following structure

C_{f^{*}} = [\begin{matrix} C_{1} & 0 & \dots & 0 \\ 0 & C_{2} & \dots & 0 \\ \dots & \dots & ⋱ & \dots \\ 0 & 0 & \dots & C_{M} \end{matrix}]

(15)

where $C_{i}$ indicates the covariance matrix of the i-th constraint ($f_{i}^{*}$). Following the expressions of $f_{i}^{*}$ and $C_{i}$, the probability distribution function (PDF) of the joint data (1 ≤ i ≤ M) can be derived by multiplying the PDFs of data from all M sources, namely,

P (f (x) | f^{*}) = \prod_{i = 1}^{M} P (f_{i} (x) | f_{i}^{*}) \sim \exp \left\{- \frac{1}{2} \sum_{i = 1}^{M} {[f_{i} (x) - f_{i}^{*}]}^{T} {(C_{i})}^{- 1} [f_{i} (x) - f_{i}^{*}] \right\}

(16)

Further introducing the weight matrix (W) for multiple (M) types of constraints, the objective cost function to be minimized has a quadratic form, namely,

Ψ_{total} (x) = \sum_{i = 1}^{M} γ_{i} Ψ_{i} (x)

(17)

where

Ψ_{i} (x) = \frac{1}{2} {[f_{i} (x) - f_{i}^{*}]}^{T} W_{i}^{- 1} [f_{i} (x) - f_{i}^{*}]

(18)

W_{i} = \frac{1}{ε_{i}^{2}} C_{i}

(19)

γ_{i} = \frac{ε_{1}^{2}}{ε_{i}^{2}}

(20)

In the above equations, $ε_{i}^{2}$ is the first diagonal element of $C_{i}$ (i.e., $ε_{i}^{2} = {(C_{i})}_{11}$) and the Lagrange factor $γ_{i}$ weights the contribution of each type of constraint with respect to the first one ($γ_{1}$ = 1).
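Equations (17)–(20) translate directly into a few lines of NumPy. The sketch below is illustrative: the function names are hypothetical, each constraint is represented by its residual $f_{i} (x) - f_{i}^{*}$ and its covariance matrix $C_{i}$, and the covariance matrices are assumed to be symmetric positive definite.

```python
import numpy as np


def lagrange_factors(cov_list):
    """gamma_i = eps_1^2 / eps_i^2, with eps_i^2 the first diagonal element of C_i."""
    eps2 = np.array([c[0, 0] for c in cov_list])
    return eps2[0] / eps2


def total_cost(residuals, cov_list):
    """Sum over constraints of gamma_i * 0.5 * r_i^T W_i^{-1} r_i, W_i = C_i / eps_i^2."""
    gammas = lagrange_factors(cov_list)
    cost = 0.0
    for r, c, g in zip(residuals, cov_list, gammas):
        w = c / c[0, 0]                         # weight matrix of Equation (19)
        cost += g * 0.5 * (r @ np.linalg.solve(w, r))
    return cost
```

Since $γ_{i} W_{i}^{- 1} = ε_{1}^{2} C_{i}^{- 1}$, the scaled cost equals $ε_{1}^{2}$ times the sum in the exponent of Equation (16) without its minus sign, so minimizing one is equivalent to maximizing the joint PDF.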

Minimization of $Ψ_{total} (x)$ in Equation (17) means that its gradient with respect to the solution x approaches zero, such that

\nabla Ψ_{total} (x) = \sum_{i = 1}^{M} γ_{i} \nabla Ψ_{i} (x) = 0

(21)

which can be ensured by requiring the gradient of each component to approach zero, namely

\nabla Ψ_{i} (x) = K_{i}^{T} W_{i}^{- 1} (f_{i} (x) - f_{i}^{*}) = 0

(22)

where $K_{i}$ is the Jacobian matrix containing the derivatives of the i-th type of constraint with respect to the retrieval parameters. To solve the above equation iteratively, we replace x by x − Δx in Equation (22) and substitute the first-order expansion

f_{i} (x - Δ x) = f_{i} (x) - K_{i} Δ x

(23)

into it. This results in

\nabla Ψ_{i} (x - Δ x) = K_{i}^{T} W_{i}^{- 1} (f_{i} (x) - K_{i} Δ x - f_{i}^{*}) = 0

(24)

or equivalently,

(K_{i}^{T} W_{i}^{- 1} K_{i}) Δ x = K_{i}^{T} W_{i}^{- 1} (f_{i} (x) - f_{i}^{*}) = \nabla Ψ_{i} (x)

(25)

The Jacobian matrix $K_{i}$ in Equations (22)–(25) consists of the derivatives of the l-th observational or a priori data with respect to the n-th unknown,

K_{i, (l, n)} = {\frac{\partial f_{i, l}}{\partial x_{n}} |}_{x}

(26)

More explicit evaluation of $f_{i} (x)$, $f_{i}^{*}$, and the K and W matrices for all ten types of constraints is discussed in the following subsections.
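To show how Equations (21)–(25) fit together, the sketch below accumulates the per-constraint terms of Equation (25), weighted by $γ_{i}$ as in Equation (21), into a single linear system for the increment Δx. Forming one summed system in this way is an assumption that follows from Equation (21); the function name is hypothetical, and the weight matrices are assumed to be symmetric.

```python
import numpy as np


def gauss_newton_step(jacobians, weights, residuals, gammas):
    """Solve (sum_i g_i K_i^T W_i^-1 K_i) dx = sum_i g_i K_i^T W_i^-1 r_i.

    residuals[i] holds r_i = f_i(x) - f_i^*. The updated solution is x - dx.
    """
    n = jacobians[0].shape[1]
    lhs = np.zeros((n, n))
    rhs = np.zeros(n)
    for k, w, r, g in zip(jacobians, weights, residuals, gammas):
        winv_k = np.linalg.solve(w, k)     # W_i^-1 K_i
        lhs += g * (k.T @ winv_k)
        rhs += g * (winv_k.T @ r)          # equals K_i^T W_i^-1 r_i for symmetric W_i
    return np.linalg.solve(lhs, rhs)
```

When every $f_{i}$ is linear in x, a single step from any starting point reaches the minimizer of the total cost; for the nonlinear radiative transfer model the step is repeated, with the Jacobians re-evaluated at each iteration.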

3.2.1. Observational Constraints (i = 1)

Assuming a total of $N_{pixel}$ pixels, where each pixel has Z observations, then after arranging all pixel data into a column vector we have

f_{i = 1}^{*} (x) = {[f_{1, p_{1}}^{*}; f_{1, p_{2}}^{*}; \dots; f_{1, p_{N_{pixel}}}^{*}]}_{N_{f} \times 1} = [y_{1}; y_{2}; \dots; y_{N_{f}}]

(27)

where $N_{f}$ = $N_{pixel}$ × Z.

The parameters of the atmosphere-surface model are adjusted so that the model prediction of radiance and polarization, $f_{1} (x)$, fits the observational constraints $f_{1}^{*} (x)$. The calculation of $f_{1} (x)$ is introduced in Section 4. The Jacobian matrix $K_{1}$ consists of first-order partial derivatives with respect to correlated and uncorrelated parameters in the vicinity of x, namely,

K_{1} = [\begin{matrix} K_{1, {\bar{x}}_{corr}} & K_{1, v} & K_{1, w} (p_{1}) & K_{1, uncorr} (p_{1}) & 0 & 0 & \dots & 0 & 0 \\ K_{1, {\bar{x}}_{corr}} & K_{1, v} & 0 & 0 & K_{1, w} (p_{2}) & K_{1, uncorr} (p_{2}) & \dots & 0 & 0 \\ \dots & \dots & \dots & \dots & \dots & \dots & ⋱ & \dots & \dots \\ K_{1, {\bar{x}}_{corr}} & K_{1, v} & 0 & 0 & 0 & 0 & \dots & K_{1, w} (N_{pixel}) & K_{1, uncorr} (N_{pixel}) \end{matrix}]

(28)

where $K_{1, {\bar{x}}_{corr}}$ is the Jacobian matrix containing derivatives of observations with respect to the spatial and temporal mean of the correlated parameters (total number is $N_{corr}$). Variation of ${\bar{x}}_{corr}$ impacts observations in all pixels. Then $K_{1, {\bar{x}}_{corr}}$ is evaluated by

K_{1, {\bar{x}}_{corr}} = [\begin{matrix} \frac{\partial y_{1}}{\partial {\bar{x}}_{corr} (1)} & \frac{\partial y_{1}}{\partial {\bar{x}}_{corr} (2)} & \dots & \frac{\partial y_{1}}{\partial {\bar{x}}_{corr} (N_{corr})} \\ \frac{\partial y_{2}}{\partial {\bar{x}}_{corr} (1)} & \frac{\partial y_{2}}{\partial {\bar{x}}_{corr} (2)} & \dots & \frac{\partial y_{2}}{\partial {\bar{x}}_{corr} (N_{corr})} \\ \dots & \dots & ⋱ & \dots \\ \frac{\partial y_{N_{f}}}{\partial {\bar{x}}_{corr} (1)} & \frac{\partial y_{N_{f}}}{\partial {\bar{x}}_{corr} (2)} & \dots & \frac{\partial y_{N_{f}}}{\partial {\bar{x}}_{corr} (N_{corr})} \end{matrix}]

(29)

The matrix $K_{1, v}$ in Equation (28) is the Jacobian matrix containing derivatives of observations with respect to all PC elements. It is evaluated as follows:

K_{1, v} = [\begin{matrix} K_{1, v} (k = 1) & K_{1, v} (k = 2) & \dots & K_{1, v} (k = N_{PC}) \end{matrix}]

(30)

with

K_{1, v} (k) = [\begin{matrix} \frac{\partial y_{1}}{\partial v_{k} (1)} & \frac{\partial y_{1}}{\partial v_{k} (2)} & \dots & \frac{\partial y_{1}}{\partial v_{k} (N_{corr})} \\ \frac{\partial y_{2}}{\partial v_{k} (1)} & \frac{\partial y_{2}}{\partial v_{k} (2)} & \dots & \frac{\partial y_{2}}{\partial v_{k} (N_{corr})} \\ \dots & \dots & ⋱ & \dots \\ \frac{\partial y_{N_{f}}}{\partial v_{k} (1)} & \frac{\partial y_{N_{f}}}{\partial v_{k} (2)} & \dots & \frac{\partial y_{N_{f}}}{\partial v_{k} (N_{corr})} \end{matrix}]

(31)

The matrix $K_{1, w}$ in Equation (28) is the Jacobian matrix containing derivatives of observations with respect to pixel-resolved PC weights and is evaluated as follows:

K_{1, w} (p) = [\begin{matrix} \frac{\partial y_{p, 1}}{\partial w_{p} (1)} & \frac{\partial y_{p, 1}}{\partial w_{p} (2)} & \dots & \frac{\partial y_{p, 1}}{\partial w_{p} (N_{PC})} \\ \frac{\partial y_{p, 2}}{\partial w_{p} (1)} & \frac{\partial y_{p, 2}}{\partial w_{p} (2)} & \dots & \frac{\partial y_{p, 2}}{\partial w_{p} (N_{PC})} \\ \dots & \dots & ⋱ & \dots \\ \frac{\partial y_{p, Z}}{\partial w_{p} (1)} & \frac{\partial y_{p, Z}}{\partial w_{p} (2)} & \dots & \frac{\partial y_{p, Z}}{\partial w_{p} (N_{PC})} \end{matrix}]

(32)

Moreover, the derivatives with respect to pixel-resolved uncorrelated parameters are evaluated by,

K_{1, uncorr} (p) = [\begin{matrix} \frac{\partial y_{p, 1}}{\partial x_{p, uncorr} (1)} & \frac{\partial y_{p, 1}}{\partial x_{p, uncorr} (2)} & \dots & \frac{\partial y_{p, 1}}{\partial x_{p, uncorr} (L_{uncorr} (1) + L_{uncorr} (2) + \dots + L_{uncorr} (N_{TP, uncorr}))} \\ \frac{\partial y_{p, 2}}{\partial x_{p, uncorr} (1)} & \frac{\partial y_{p, 2}}{\partial x_{p, uncorr} (2)} & \dots & \frac{\partial y_{p, 2}}{\partial x_{p, uncorr} (L_{uncorr} (1) + L_{uncorr} (2) + \dots + L_{uncorr} (N_{TP, uncorr}))} \\ \dots & \dots & ⋱ & \dots \\ \frac{\partial y_{p, Z}}{\partial x_{p, uncorr} (1)} & \frac{\partial y_{p, Z}}{\partial x_{p, uncorr} (2)} & \dots & \frac{\partial y_{p, Z}}{\partial x_{p, uncorr} (L_{uncorr} (1) + L_{uncorr} (2) + \dots + L_{uncorr} (N_{TP, uncorr}))} \end{matrix}]

(33)

The weight matrix for the first constraint (i.e., the observations) assembles the weight sub-matrices of all pixels, namely,

W_{1} = [\begin{matrix} W_{1} (1) & 0 & \dots & 0 \\ 0 & W_{1} (2) & \dots & 0 \\ \dots & \dots & ⋱ & \dots \\ 0 & 0 & \dots & W_{1} (N_{pixel}) \end{matrix}]

(34)
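The block structure of Equations (28) and (34) can be assembled as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, every pixel has the same number of observations Z and the same number of uncorrelated parameters, and the per-pixel derivative blocks are taken as given (in practice they come from the forward model of Section 4).

```python
import numpy as np


def assemble_k1(k_mean, k_v, k_w, k_uncorr):
    """Build K_1 of Equation (28) from per-pixel blocks.

    k_mean[p]:   (Z, n_corr)         rows of K_{1, x_mean} belonging to pixel p
    k_v[p]:      (Z, n_pc * n_corr)  rows of K_{1, v} belonging to pixel p
    k_w[p]:      (Z, n_pc)           K_{1, w}(p)
    k_uncorr[p]: (Z, n_uncorr)       K_{1, uncorr}(p)
    """
    n_pixel = len(k_w)
    rows = []
    for p in range(n_pixel):
        row = [k_mean[p], k_v[p]]          # shared unknowns affect every pixel
        for q in range(n_pixel):
            if q == p:
                row += [k_w[p], k_uncorr[p]]
            else:                          # pixel p is insensitive to pixel q's unknowns
                row += [np.zeros_like(k_w[q]), np.zeros_like(k_uncorr[q])]
        rows.append(row)
    return np.block(rows)


def assemble_w1(w_blocks):
    """Block-diagonal W_1 of Equation (34) from per-pixel square blocks."""
    n = sum(b.shape[0] for b in w_blocks)
    out = np.zeros((n, n))
    i = 0
    for b in w_blocks:
        m = b.shape[0]
        out[i:i + m, i:i + m] = b
        i += m
    return out
```

The two dense leading column groups are what couple the pixels to one another through the shared mean and PC vectors; the remaining columns are block diagonal, which a sparse solver could exploit for large scenes.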

3.2.2. A Priori Constraints (i = 2)

The a priori constraint is constructed in the same way as in a previous study [24]. Namely, Equation (14) becomes

f_{i = 2}^{*} = x^{\text{a priori}} = x + Δ x^{\text{a priori}}

(35)

Then in Equations (21)–(25), $K_{i = 2}$ = I (identity matrix) and $W_{i = 2} = \frac{1}{ε_{a *}^{2}} C_{a *}$. More explicitly, $W_{i = 2}$ can be constructed from the estimated range of each parameter relative to the first one,

W_{i = 2} = [\begin{matrix} 1 & 0 & \dots & 0 \\ 0 & \frac{{(x_{2, \max} - x_{2, \min})}^{2}}{{(x_{1, \max} - x_{1, \min})}^{2}} & \dots & 0 \\ \dots & \dots & ⋱ & \dots \\ 0 & 0 & \dots & \frac{{(x_{N, \max} - x_{N, \min})}^{2}}{{(x_{1, \max} - x_{1, \min})}^{2}} \end{matrix}]

(36)
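Equation (36) amounts to a diagonal matrix of squared parameter ranges normalized by the first one. A minimal sketch follows; the function name and the example bounds are assumptions.

```python
import numpy as np


def apriori_weight_matrix(x_min, x_max):
    """Diagonal W_2 of Equation (36): squared ranges relative to the first parameter."""
    range_sq = (np.asarray(x_max, dtype=float) - np.asarray(x_min, dtype=float)) ** 2
    return np.diag(range_sq / range_sq[0])


# Hypothetical bounds for three parameters with ranges 2, 1, and 6.
w2 = apriori_weight_matrix(x_min=[0.0, 1.0, 0.0], x_max=[2.0, 2.0, 6.0])
```

A parameter with a wide admissible range receives a large entry in $W_{i = 2}$ and hence a small entry in $W_{i = 2}^{- 1}$, so its departure from the a priori value is penalized only weakly.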

3.2.3. Smoothness Constraints in Regular Parameter Space (i = 3, 4, 6, and 7)

The formulation of these types of constraints is similar, so we discuss them together. The third type of constraint reflects the smooth variation of an uncorrelated type of parameter (such as the variation of parameter of the surface bidirectional reflectance distribution function with wavelength). The fourth type of constraint reflects the smooth variation of certain type of parameter in the mean field (e.g., aerosol refractive index as a function of wavelength). The sixth type of constraint reflects the smooth variation of PC weights from pixel to pixel. The seventh equation reflects the smooth variation of certain type of parameters residing in a PC vector (such as the deviation of refractive index from the mean). The strengths of the sixth and seventh types of constraints depends on the rank of PC. Putting all four of these types of constraints together, we have

{\begin{cases} f_{3}^{*} = 0^{*} = S_{3, m} x_{uncorr} + Δ_{{g (x}_{uncorr})}^{*} \\ f_{4}^{*} = 0^{*} = S_{4, m} \bar{x} + Δ_{{g (\bar{x}}_{corr})}^{*} \\ f_{6}^{*} = 0^{*} = S_{6, m} w_{state} + Δ_{g (w)}^{*} \\ f_{7}^{*} = 0^{*} = S_{7, m} v_{state} + Δ_{g (v)}^{*} \end{cases}

(37)

where

w_{state}

and

v_{state}

are column vectors containing only PC weights and vectors, respectively, and are extracted from the overall state vector expressed in Equation (6), and

S_{i, m}

is the m-th order differentiation matrix for the i-th type of constraint.

When a retrieval parameter varies smoothly as a function of some variable z, it is assumed to be locally approximated by a smooth function g(z), such as a constant, a line, a parabola, etc. With a polynomial form, the m-th derivative approaches zero [6], such that

S_{i, m} (z) z = \frac{d^{m} g (z)}{d z^{m}} = 0 \Rightarrow {\begin{cases} g_{m = 1} (z) = const \\ g_{m = 2} (z) = A z + B \\ g_{m = 3} (z) = A z^{2} + B z + C \end{cases}

(38)

For a discretized grid of the variable z, the explicit form of

S_{i, m} z

, is,

S_{i, m} (z_{j}) z_{j} = {\begin{cases} \frac{d g}{d z} \approx \frac{Δ^{1} g}{Δ_{1} (z)} = \frac{g (z_{j + 1}) - g (z_{j})}{Δ_{1} (z_{j})}, for m = 1 \\ \frac{d^{m} g}{d z^{m}} \approx \frac{Δ^{m} g}{Δ_{m} (z)} = \frac{Δ^{m - 1} g (z_{j + 1}) / Δ_{m - 1} (z_{j + 1}) - Δ^{m - 1} g (z_{j}) / Δ_{m - 1} (z_{j})}{[Δ_{m - 1} (z_{j}) + Δ_{m - 1} (z_{j + 1})] / 2}, for m \geq 2 \end{cases}

(39)

Taking the orders of difference m = 1 and 2 as examples, we have

{\begin{cases} Δ_{m = 1} (z_{j}) = z_{j + 1} - z_{j} \\ Δ_{m = 2} (z_{j}) = [Δ_{1} (z_{j}) + Δ_{1} (z_{j + 1})] / 2 \end{cases}

(40)

Application of the above equation to L discretized grid points z_j (namely, 1 ≤ j ≤ L) leads to

f_{i} (x) = S_{i, m} x

(41)

so that by invoking Equation (26),

K_{i} = S_{i, m}

(42)

where the matrix

S_{i, m}

is evaluated by,

S_{i, m = 1} = [\begin{matrix} \frac{1}{Δ_{1} (1)} & - \frac{1}{Δ_{1} (1)} & 0 & \dots & 0 \\ 0 & \frac{1}{Δ_{1} (2)} & - \frac{1}{Δ_{1} (2)} & \dots & 0 \\ 0 & 0 & \dots & ⋱ & 0 \\ 0 & 0 & \dots & \frac{1}{Δ_{1} (L - 1)} & - \frac{1}{Δ_{1} (L - 1)} \end{matrix}]

(43)

and

S_{i, m = 2} = [\begin{matrix} \frac{2}{Δ_{1} (1) [Δ_{1} (1) + Δ_{1} (2)]} & \frac{- 2}{Δ_{1} (1) Δ_{1} (2)} & \frac{2}{Δ_{1} (2) [Δ_{1} (1) + Δ_{1} (2)]} & 0 & 0 & \dots & 0 \\ 0 & \frac{2}{Δ_{1} (2) [Δ_{1} (2) + Δ_{1} (3)]} & \frac{- 2}{Δ_{1} (2) Δ_{1} (3)} & \frac{2}{Δ_{1} (3) [Δ_{1} (2) + Δ_{1} (3)]} & \dots & 0 \\ \dots & \dots & \dots & \dots & \dots & ⋱ & \dots \\ 0 & 0 & \dots & \frac{2}{Δ_{1} (L - 1) [Δ_{1} (L - 1) + Δ_{1} (L)]} & \frac{- 2}{Δ_{1} (L - 1) Δ_{1} (L)} & \dots & \frac{2}{Δ_{1} (L) [Δ_{1} (L - 1) + Δ_{1} (L)]} \end{matrix}]

(44)

The same principle applied to higher orders of difference (m > 2) ensures a smooth curve with

\frac{d^{m} g (z)}{d z^{m}} = 0

. Substitution of Equations (41), (42), and

f_{i}^{*} = 0^{*}

into Equation (18) gives,

Ψ_{i} (x) = \frac{1}{2} x^{T} S_{i, m}^{T} W_{i, m}^{- 1} S_{i, m} x

(45)

where the weighting matrix W has the following diagonal terms,

{W_{i, m}}_{j j} = \frac{1}{Δ_{m} (z_{j})}

(46)

and

Δ_{m} (z_{j})

is specified in Equation (40) for m = 1 and 2, and can be generalized to an arbitrary higher order.

Substitution of Equations (41), (42), and

f_{i}^{*} = 0^{*}

into Equation (26) gives

(S_{i, m}^{T} W_{i, m}^{- 1} S_{i, m}) Δ x = S_{i, m}^{T} W_{i, m}^{- 1} (S_{i, m} x)

(47)

Further defining

Ω_{i} = S_{i, m}^{T} W_{i, m}^{- 1} S_{i, m}

(48)

we have

Ω_{i} Δ x = Ω_{i} x

(49)

Note that for equidistant pixels on an imaged grid, the weighting matrix W is an identity matrix (namely W = I). To distinguish the smoothness constraints imposed on different types of parameters, we denote,

Ω_{i} = {\begin{cases} Ω_{uncorr}, & i = 3 \\ Ω_{corr}^{\bar{x}}, & i = 4 \\ Ω_{corr}^{w}, & i = 6 \\ Ω_{corr}^{v}, & i = 7 \end{cases}

(50)
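The construction of the differentiation and smoothness matrices can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the example grid are hypothetical, the second-difference matrix is built with L − 2 rows (one per interior grid point, an assumption about the index convention), and W is taken to be diagonal as in Equation (46).

```python
def first_difference_matrix(z):
    """S_{i,m=1} as in Equation (43): row j holds 1/D1(j) and -1/D1(j)."""
    n = len(z)
    rows = []
    for j in range(n - 1):
        d1 = z[j + 1] - z[j]
        row = [0.0] * n
        row[j], row[j + 1] = 1.0 / d1, -1.0 / d1
        rows.append(row)
    return rows

def second_difference_matrix(z):
    """S_{i,m=2} with the three-point stencil of Equation (44)."""
    n = len(z)
    rows = []
    for j in range(n - 2):
        a = z[j + 1] - z[j]        # D1(j)
        b = z[j + 2] - z[j + 1]    # D1(j + 1)
        row = [0.0] * n
        row[j] = 2.0 / (a * (a + b))
        row[j + 1] = -2.0 / (a * b)
        row[j + 2] = 2.0 / (b * (a + b))
        rows.append(row)
    return rows

def smoothness_matrix(s, w_diag):
    """Omega = S^T W^{-1} S (Equation (48)) for a diagonal W."""
    n = len(s[0])
    return [[sum(s[r][a] * s[r][b] / w_diag[r] for r in range(len(s)))
             for b in range(n)] for a in range(n)]

def matvec(matrix, vector):
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

z = [0.0, 1.0, 3.0, 4.0]                      # hypothetical non-uniform grid
s2 = second_difference_matrix(z)
# Equations (40) and (46): W_jj = 1 / Delta_2(z_j), Delta_2 = mean of two spacings
w2 = [1.0 / (0.5 * (z[j + 2] - z[j])) for j in range(len(z) - 2)]
omega = smoothness_matrix(s2, w2)
line = [2.0 * zj + 1.0 for zj in z]           # g(z) = A z + B
print(matvec(s2, line))                       # second differences of a line vanish
```

A straight line is not penalized by the second-order constraint, and a constant is not penalized by the first-order constraint, in agreement with Equation (38).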

3.2.4. Smoothness Constraints in PC Space (i = 5)

The fifth type of constraint reflects the smooth variation of correlated parameters in different dimensions. However, unlike the smoothness constraint directly imposed on relevant uncorrelated parameters, it has to be transformed to the PC vectors and weights. For simplicity, we put all PC weights into a single-column vector

w_{state} = [\begin{matrix} w_{k = 1}; & w_{k = 2}; & \dots; & w_{k = N_{PC}} \end{matrix}]

and all PC elements for the j-th correlated parameter into another single-column vector

v_{state} = [\begin{matrix} v_{k = 1}; & v_{k = 2}; & \dots; & v_{k = N_{PC}} \end{matrix}]

. The smooth variation of the j-th correlated aerosol parameter is ensured by,

f_{i = 5}^{*} = 0^{*} = S_{i, m} x_{wv} + Δ_{g ([v; w])}^{*}

(51)

where

x_{wv} = [\begin{array}{l} v_{state} \\ w_{state} \end{array}]

. To evaluate the differentiation matrix

S_{i, m}

, we recall that a pixel-resolved correlated parameter is constructed from PCs via Equation (1). An arbitrary correlated parameter

x_{corr, p} (j)

associated with p-th pixel is derived as,

x_{corr, p} (j) = {\bar{x}}_{corr, j} + \sum_{k = 1}^{N_{PC}} w_{p} (k) v_{k} (j) = {\bar{x}}_{corr, j} + [w_{p} (1) v_{1} (j) + w_{p} (2) v_{2} (j) + \dots + w_{p} (N_{PC}) v_{N_{PC}} (j)]

(52)

(a) across-pixel smoothness

The across-pixel variation of a correlated parameter is contributed by pixel-resolved PC weights, namely,

Δ x_{corr, p} = v \times (Δ w_{p})

(53)

Taking the first order of difference (m = 1) in the across-pixel variation of the field as an example, Equation (39) becomes,

S_{m = 1} (z_{j}) z_{j} = \frac{d g}{d z} \approx \frac{Δ^{1} g}{Δ_{1} z} \approx \frac{\sum_{k = 1}^{N_{PC}} {v_{k} (j) \times [w_{j + 1} (k) - w_{j} (k)]}}{Δ_{1} (z_{j})}

(54)

where z measures the inter-pixel relative distance. As an example, we take the simplest case of a single PC (k = 1), three pixels (

p_{1}

,

p_{2}

,

p_{3}

) associated with PC weights [

w_{p 1}

,

w_{p 2}

,

w_{p 3}

], and two correlated parameters associated with PC elements

v_{k = 1}

(1) and

v_{k = 1}

(2). The matrix

S_{i, m}

is constructed in the following way to ensure smooth variation of the correlated parameter across equidistant (e.g., unity spacing) pixels via

f_{i = 5} (x_{wv}) = S_{i, m} x_{wv}

:

f_{i = 5} (x_{wv}) = S_{i, m} x_{wv} = {\frac{1}{2} [\begin{matrix} w_{p 1} (k = 1) - w_{p 2} (k = 1) & 0 & v_{k = 1} (1) & - v_{k = 1} (1) & 0 \\ 0 & w_{p 1} (k = 1) - w_{p 2} (k = 1) & v_{k = 1} (2) & - v_{k = 1} (2) & 0 \\ w_{p 2} (k = 1) - w_{p 3} (k = 1) & 0 & 0 & v_{k = 1} (1) & - v_{k = 1} (1) \\ 0 & w_{p 2} (k = 1) - w_{p 3} (k = 1) & 0 & v_{k = 1} (2) & - v_{k = 1} (2) \end{matrix}]} \times [\begin{matrix} v_{k = 1} (1) \\ v_{k = 1} (2) \\ w_{p 1} (k = 1) \\ w_{p 2} (k = 1) \\ w_{p 3} (k = 1) \end{matrix}]

(55)
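The structure of Equation (55) can be checked numerically. The sketch below, which is an illustration under the stated assumptions (one PC, two correlated parameters, three unit-spaced pixels, hypothetical values), assembles the matrix row by row and confirms that each entry of the product equals v(j) × [w_p − w_{p+1}].

```python
def across_pixel_residual(v, w):
    """Evaluate f_{i=5} = S x_wv of Equation (55) for one PC and unit spacing.

    v: PC elements [v(1), v(2)]; w: PC weights [w_p1, w_p2, w_p3].
    """
    x_wv = v + w  # state ordering [v(1), v(2), w_p1, w_p2, w_p3]
    s = []
    for p in range(len(w) - 1):          # neighboring pixel pairs
        for j in range(len(v)):          # correlated parameters
            row = [0.0] * len(x_wv)
            row[j] = w[p] - w[p + 1]
            row[len(v) + p] = v[j]
            row[len(v) + p + 1] = -v[j]
            s.append([0.5 * e for e in row])
    return [sum(a * b for a, b in zip(row, x_wv)) for row in s]

v = [0.6, 0.8]            # hypothetical PC elements
w = [1.0, 0.5, -1.5]      # hypothetical PC weights of three pixels
residual = across_pixel_residual(v, w)
expected = [v[j] * (w[p] - w[p + 1]) for p in range(2) for j in range(2)]
print(residual)
print(expected)
```

The factor 1/2 compensates for the fact that each product v(j) w_p appears twice, once through the v columns and once through the w columns.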

(b) within-pixel smoothness

Within-pixel variation of a parameter (such as refractive index) describes its dependence on a non-spatial variable, e.g., wavelength. For an arbitrary pixel p, the within-pixel variation of a correlated parameter is contributed by PC vectors, namely,

Δ x_{corr, p} = (Δ v) \times w_{p}

(56)

Taking the first order of difference (m = 1) as an example, the differentiation matrix

S_{i, m}

is constructed in the following way to ensure smooth variation of the correlated parameter x(j) as a function of the variable z (for example, z is wavelength when x(j) is the aerosol refractive index),

S_{i, m = 1} (z_{j}) z_{j} = \frac{d g}{d z} \approx \frac{Δ^{m} g}{Δ_{m} z} \approx \frac{\sum_{k = 1}^{N_{PC}} {w_{p} (k) \times [v_{k} (j + 1) - v_{k} (j)]}}{Δ_{1} z_{j}}

(57)

Taking the simplest case of a single PC (k = 1), three parameters associated with PC elements

v_{k = 1}

(1),

v_{k = 1}

(2) and

v_{k = 1}

(3) as an example,

f_{i = 5} (x_{wv}) = S_{i, m = 1} x_{wv} = {\frac{1}{2} [\begin{matrix} \frac{w_{p} (k = 1)}{Δ_{1} (z_{1})} & - \frac{w_{p} (k = 1)}{Δ_{1} (z_{1})} & 0 & \frac{v_{k = 1} (1) - v_{k = 1} (2)}{Δ_{1} (z_{1})} \\ 0 & \frac{w_{p} (k = 1)}{Δ_{1} (z_{2})} & - \frac{w_{p} (k = 1)}{Δ_{1} (z_{2})} & \frac{v_{k = 1} (2) - v_{k = 1} (3)}{Δ_{1} (z_{2})} \end{matrix}]} \times [\begin{matrix} v_{k = 1} (1) \\ v_{k = 1} (2) \\ v_{k = 1} (3) \\ w_{p} (k = 1) \end{matrix}]

(58)

Invoking

f_{i}^{*} = 0^{*}

, the cost function in the form of Equation (45) is derived for both across-pixel and within-pixel smoothness constraints. Further, substitution of

f_{i = 5} (x_{wv}) = S_{i, m = 1} x_{wv}

into Equation (26) gives the expression for Jacobian matrix elements,

K_{i, (l, n)} = {\frac{\partial {(S_{i, m} x_{wv})}_{l}}{\partial x_{wv, n}} |}_{x_{wv}}

(59)

in which the matrix

S_{i, m}

is a function of x (see Equation (57)), therefore

K_{i} \neq S_{i, m}

. Intuitively, one can think of

f_{i = 5} (x_{wv}) = S_{i, m = 1} x_{wv}

as a model prediction to fit the “observation”

f_{i}^{*}

, which is a zero vector. Therefore, the above Jacobian matrix (Equation (59)) is evaluated by the finite difference method, in the same way as for the observational constraints.
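A minimal sketch of such a finite-difference evaluation is given below for the across-pixel example of Equation (55). The central-difference scheme, the step size, and all names and values are assumptions made for illustration. For this bilinear constraint the Jacobian turns out to be twice the matrix S evaluated at the same state, which illustrates why K differs from S when S depends on the state.

```python
def f5(x_wv):
    """Across-pixel constraint of Equation (55): one PC, two parameters,
    three unit-spaced pixels; x_wv = [v(1), v(2), w_p1, w_p2, w_p3]."""
    v, w = x_wv[:2], x_wv[2:]
    return [v[j] * (w[p] - w[p + 1]) for p in range(2) for j in range(2)]

def finite_difference_jacobian(func, x, step=1e-6):
    """Central-difference estimate of K[l][n] = d func(x)[l] / d x[n]."""
    base = func(x)
    jac = [[0.0] * len(x) for _ in base]
    for n in range(len(x)):
        xp, xm = list(x), list(x)
        xp[n] += step
        xm[n] -= step
        fp, fm = func(xp), func(xm)
        for l in range(len(base)):
            jac[l][n] = (fp[l] - fm[l]) / (2.0 * step)
    return jac

x_wv = [0.6, 0.8, 1.0, 0.5, -1.5]     # hypothetical state
K = finite_difference_jacobian(f5, x_wv)
# First row of S in Equation (55), evaluated at the same state
s_first_row = [0.5 * (x_wv[2] - x_wv[3]), 0.0, 0.5 * x_wv[0], -0.5 * x_wv[0], 0.0]
print(K[0])
print(s_first_row)
```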

Invoking

f_{i} (x_{wv}) = S_{i, m} x_{wv}

and

f_{i}^{*} = 0^{*}

, Equation (25) becomes

(K_{i}^{T} W_{i}^{- 1} K_{i}) Δ x_{wv} = K_{i}^{T} W_{i}^{- 1} S_{i, m} x_{wv}

(60)

Further defining

{\begin{cases} Ω_{corr, 1}^{wv} = K_{i}^{T} W_{i}^{- 1} K_{i} \\ Ω_{corr, 2}^{wv} = K_{i}^{T} W_{i}^{- 1} S_{i, m} \end{cases}

(61)

we have

Ω_{corr, 1} Δ x_{wv} = Ω_{corr, 2} x_{wv}

(62)

Note that if

S_{i, m}

is independent of x (the cases i = 3, 4, 6, and 7), then K can be analytically evaluated from Equation (59), namely,

K_{i} = S_{i, m}

(63)

so that

Ω_{corr, 1}^{wv} = Ω_{corr, 2}^{wv} = S_{i, m}^{T} W_{i}^{- 1} S_{i, m}

(64)

In this manner, the static smoothness matrices for the cases i = 3, 4, 6, and 7 (Section 3.2.3) are recovered.

Based on the above demonstration, the evaluation of

S_{i, m}

,

K_{i}

, and

Ω_{i}

matrices is generalized from the case of regular (non-correlated) retrieval parameters to the case of correlated parameters represented by PCs in Appendix C. The

Ω

matrix derived above is accounted for later to solve for an increment of the PC terms during optimization.

3.2.5. Zero-sum Constraint on PC Weights (i = 8)

An intrinsic property of pixel-resolved PC weights is that they sum to zero for an arbitrary set of PC vectors. This constraint is imposed by multiplying the PC weights w in the state vector by a matrix O, such that,

f_{i = 8}^{*} = 0^{*} = f_{i = 8} (w_{state}) + Δ_{O}^{*}

(65)

where

w_{state}

is a column vector defined in Section 3.2.4 and contains the weights of all PCs, and

f_{i = 8} (w_{state})

is expressed as

f_{8} (w_{state}) = O w_{state}

(66)

where

O = {[\begin{matrix} \overset{⇀}{1} & \overset{⇀}{0} & \dots & \overset{⇀}{0} \\ \overset{⇀}{0} & \overset{⇀}{1} & \dots & \overset{⇀}{0} \\ \dots & \dots & ⋱ & \overset{⇀}{0} \\ \overset{⇀}{0} & \overset{⇀}{0} & \dots & \overset{⇀}{1} \end{matrix}]}_{N_{PC} \times (N_{PC} \times N_{pixel})}, \overset{⇀}{1} = {[1, 1, \dots, 1]}_{1 \times N_{pixel}}

(67)

Further substituting

f_{8}^{*} = 0^{*}

and

f_{8} (w_{state}) = O w_{state}

into Equation (18), the cost function has the following quadratic form,

Ψ_{i} (w_{state}) = \frac{1}{2} (w_{state}^{T} O^{T} O w_{state})

(68)

Substitution of Equation (66) into Equation (26) gives the evaluation of Jacobian matrix elements,

K_{i, (l, n)} = {\frac{\partial {(O w_{state})}_{l}}{\partial w_{state, n}} |}_{w_{state}} = O_{l, n}

(69)

Further defining a zero-sum matrix to be,

Ω_{O} = Ω_{i} = O^{T} O

(70)

and invoking

f_{8} (w_{state}) = O w_{state}

and

f_{8}^{*} = 0^{*}

, Equation (25) becomes,

Ω_{O} Δ w_{state} = Ω_{O} w_{state}

(71)
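The O matrix of Equation (67) is simple to assemble. The sketch below is illustrative only; it assumes that w_state stacks the weights PC by PC (all pixels of the first PC, then all pixels of the second PC, and so on), and the function names and numbers are hypothetical.

```python
def zero_sum_matrix(n_pc, n_pixel):
    """O of Equation (67): row k sums the n_pixel weights of the k-th PC."""
    return [[1.0 if k * n_pixel <= c < (k + 1) * n_pixel else 0.0
             for c in range(n_pc * n_pixel)] for k in range(n_pc)]

def matvec(m, x):
    return [sum(a * b for a, b in zip(row, x)) for row in m]

# Two PCs and three pixels: w_state = [w_{k=1}; w_{k=2}]
w_state = [1.0, -0.25, -0.75, 0.5, 0.5, -1.0]
O = zero_sum_matrix(n_pc=2, n_pixel=3)
print(matvec(O, w_state))  # [0.0, 0.0]
```

Weights that already satisfy the zero-sum property give a vanishing constraint vector, so the corresponding cost term of Equation (68) is zero.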

3.2.6. Mutual Orthogonality Constraint among PC Vectors (i = 9)

An intrinsic property of PC vectors is that they are mutually orthogonal. This forms the orthogonality constraint imposed on all pairs of PC vectors via

f_{i}

(i = 9), namely

f_{i = 9}^{*} = 0^{*} = f_{i = 9} (v_{state}) + Δ_{Γ}^{*}

(72)

where

v_{state}

is a column vector defined in Section 3.2.4 and contains all PC vectors, and

f_{i = 9} (v_{state})

is expressed as

f_{9} (v_{state}) = Γ v_{state}

(73)

where,

Γ = [\begin{matrix} v_{k = 2}^{T} & v_{k = 1}^{T} & \overset{⇀}{0} & \overset{⇀}{0} & \dots & \overset{⇀}{0} & \overset{⇀}{0} \\ v_{k = 3}^{T} & \overset{⇀}{0} & v_{k = 1}^{T} & \overset{⇀}{0} & \dots & \overset{⇀}{0} & \overset{⇀}{0} \\ \dots & \dots & \dots & \dots & \dots & \dots & \dots \\ v_{k = N_{PC}}^{T} & \overset{⇀}{0} & \overset{⇀}{0} & \overset{⇀}{0} & \dots & \overset{⇀}{0} & v_{k = 1}^{T} \\ \overset{⇀}{0} & v_{k = 3}^{T} & v_{k = 2}^{T} & \overset{⇀}{0} & \dots & \overset{⇀}{0} & \overset{⇀}{0} \\ \overset{⇀}{0} & v_{k = 4}^{T} & \overset{⇀}{0} & v_{k = 2}^{T} & \dots & \overset{⇀}{0} & \overset{⇀}{0} \\ \dots & \dots & \dots & \dots & \dots & \dots & \dots \\ \overset{⇀}{0} & v_{k = N_{PC}}^{T} & \overset{⇀}{0} & \overset{⇀}{0} & \dots & \overset{⇀}{0} & v_{k = 2}^{T} \\ \dots & \dots & \dots & \dots & \dots & \dots & \dots \\ \overset{⇀}{0} & \overset{⇀}{0} & \overset{⇀}{0} & \overset{⇀}{0} & \dots & v_{k = N_{PC}}^{T} & v_{k = N_{PC} - 1}^{T} \end{matrix}]

(74)

Further substituting

f_{9}^{*} = 0^{*} = {{[\begin{matrix} 0 & 0 & \dots & 0 \end{matrix}]}^{T}}_{{\frac{1}{2} [N_{PC} \times (N_{PC} - 1)]} \times 1}

and

f_{9} (v_{state}) = Γ v_{state}

into Equation (18), the cost function is derived as,

Ψ_{i} (v) = \frac{1}{2} v_{state}^{T} Γ^{T} Γ v_{state}

(75)

Substitution of Equation (73) into Equation (26) gives the evaluation of Jacobian matrix elements,

K_{i, (l, n)} = {\frac{\partial {(Γ v_{state})}_{l}}{\partial v_{state, n}} |}_{v_{state}} = Γ_{l, n}

(76)

Further defining an orthogonality matrix to be,

Ω_{Γ} = Ω_{i} = Γ^{T} Γ

(77)

and invoking

f_{9} (v_{state}) = Γ v_{state}

and

f_{9}^{*} = 0^{*}

, Equation (25) becomes,

Ω_{Γ} Δ v_{state} = Ω_{Γ} v_{state}

(78)
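A minimal sketch of how the Γ matrix can be assembled is given below. It follows the pattern of the leading rows of Equation (74): the row for the PC pair (a, b) carries v_b^T in block a and v_a^T in block b, so that the corresponding entry of Γ v_state is twice the inner product of the two vectors. Function names and example vectors are hypothetical.

```python
from itertools import combinations

def orthogonality_matrix(pcs):
    """Gamma following Equation (74): one row per PC pair (a, b), a < b,
    holding v_b^T in block a and v_a^T in block b."""
    n_pc, n_corr = len(pcs), len(pcs[0])
    rows = []
    for a, b in combinations(range(n_pc), 2):
        row = [0.0] * (n_pc * n_corr)
        row[a * n_corr:(a + 1) * n_corr] = pcs[b]
        row[b * n_corr:(b + 1) * n_corr] = pcs[a]
        rows.append(row)
    return rows

# Three hypothetical, mutually orthogonal PC vectors
pcs = [[1.0, 0.0, 0.0], [0.0, 0.6, 0.8], [0.0, -0.8, 0.6]]
v_state = [e for v in pcs for e in v]
gamma_matrix = orthogonality_matrix(pcs)
residual = [sum(a * b for a, b in zip(row, v_state)) for row in gamma_matrix]
print(len(gamma_matrix))  # N_PC (N_PC - 1) / 2 = 3 rows
print(residual)           # vanishes for orthogonal PCs
```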

3.2.7. Unity-norm Constraint on PC Vectors (i = 10)

Another intrinsic property of each PC vector is that its norm equals unity (i.e.,

{‖ v_{k} ‖}^{2} = v_{k}^{T} v_{k} = 1

). This forms the unity constraint imposed on all PC vectors via

f_{i}

(i = 10), such that

f_{i = 10}^{*} = 1^{*} = f_{i = 10} (v_{state}) + Δ_{U}^{*}

(79)

where

f_{10} (v_{state}) = U v_{state}

(80)

and U is expressed as,

U = {[\begin{matrix} v_{k = 1}^{T} & \overset{⇀}{0} & \dots & \overset{⇀}{0} \\ \overset{⇀}{0} & v_{k = 2}^{T} & \dots & \overset{⇀}{0} \\ \dots & \dots & ⋱ & \overset{⇀}{0} \\ \overset{⇀}{0} & \overset{⇀}{0} & \dots & v_{k = N_{PC}}^{T} \end{matrix}]}_{N_{PC} \times (N_{PC} \times N_{corr})}

(81)

Then substituting Equation (80) and

f_{10}^{*} = 1^{*} = {{[\begin{matrix} 1 & 1 & \dots & 1 \end{matrix}]}^{T}}_{N_{PC} \times 1}

into Equation (18), the cost function is derived as,

Ψ_{i} (v_{state}) = \frac{1}{2} (v_{state}^{T} U^{T} U v_{state} - N_{PC})

(82)

Moreover, substitution of Equation (80) into Equation (26) gives the evaluation of Jacobian matrix elements:

K_{i, (l, n)} = {\frac{\partial {(U v_{state})}_{l}}{\partial v_{state, n}} |}_{v_{state}}

(83)

Further defining a unity constraining matrix to be,

{\begin{cases} Ω_{U, 1} = K_{i}^{T} K_{i} \\ Ω_{U, 2} = K_{i}^{T} U \end{cases}

(84)

and invoking

f_{10} (v_{state}) = U v_{state}

and

f_{10}^{*} = 1^{*}

, Equation (25) becomes equivalent to

Ω_{U, 1} Δ v_{state} = Ω_{U, 2} v_{state} - K_{i}^{T} f_{i}^{*}

(85)
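The U matrix of Equation (81) can be sketched in the same spirit. In the illustration below (hypothetical names and values), U v_state returns the squared norm of each PC vector, which is compared against the target vector of ones.

```python
def unity_matrix(pcs):
    """U of Equation (81): block-diagonal arrangement of the rows v_k^T."""
    n_pc, n_corr = len(pcs), len(pcs[0])
    rows = []
    for k, v in enumerate(pcs):
        row = [0.0] * (n_pc * n_corr)
        row[k * n_corr:(k + 1) * n_corr] = v
        rows.append(row)
    return rows

# The first hypothetical vector has unit norm, the second does not.
pcs = [[0.6, 0.8], [2.0, 0.0]]
v_state = [e for v in pcs for e in v]
U = unity_matrix(pcs)
squared_norms = [sum(a * b for a, b in zip(row, v_state)) for row in U]
misfit = [n - 1.0 for n in squared_norms]   # f_10(v_state) - f_10^*
print(misfit)
```

Only the second vector contributes to the misfit, so only its elements are pushed toward unit norm by this constraint.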

3.2.8. Construction of Overall Equation System

Accounting for the ten types of constraints specified above, the solution that minimizes Equation (17) is approached iteratively. The solution from iteration q is incremented as follows,

x_{q + 1} = x_{q} - Δ x_{q}

(86)

where

Δ x_{q}

is obtained by accounting for both the observational and a priori constraints derived in Sections 3.2.1–3.2.7,

A_{q} \times Δ x_{q} = \nabla Ψ_{total} (x_{q})

(87)

In explicit form, we have

A_{q} = K_{1, q}^{T} W_{1}^{- 1} K_{1, q} + γ Ω_{total, 1} + γ_{a} W_{a}^{- 1}

(88)

\nabla Ψ_{total} (x_{q}) = K_{1, q}^{T} W_{1}^{- 1} [f_{1} (x_{q}) - f_{1}^{*}] + γ Ω_{total, 2} x_{q} + γ_{a} W_{a}^{- 1} (x_{q} - x^{a p r i o r i}) + [\begin{matrix} 0^{*} \\ - γ_{U} K_{10, q}^{T} f_{10}^{*} \\ 0^{*} \end{matrix}]

(89)

where “

γ Ω_{total}

” incorporates i = 3–10 types of constraints, namely,

{\begin{cases} γ Ω_{total, 1} = γ_{uncorr} Ω_{uncorr}^{Ra} + γ_{corr}^{\bar{x}} Ω_{corr}^{\bar{x}, Ra} + γ_{corr}^{wv} Ω_{corr, 1}^{wv, Ra} + γ_{corr}^{w} Ω_{corr}^{w, Ra} + γ_{corr}^{v} Ω_{corr}^{v, Ra} + γ_{O} Ω_{O}^{Ra} + γ_{Γ} Ω_{Γ}^{Ra} + γ_{U} Ω_{U, 1}^{Ra} \\ γ Ω_{total, 2} = γ_{uncorr} Ω_{uncorr}^{Ra} + γ_{corr}^{\bar{x}} Ω_{corr}^{\bar{x}, Ra} + γ_{corr}^{wv} Ω_{corr, 2}^{wv, Ra} + γ_{corr}^{w} Ω_{corr}^{w, Ra} + γ_{corr}^{v} Ω_{corr}^{v, Ra} + γ_{O} Ω_{O}^{Ra} + γ_{Γ} Ω_{Γ}^{Ra} + γ_{U} Ω_{U, 2}^{Ra} \end{cases}

(90)

where the eight terms on the right-hand-side of the above equation are equal to

Ω_{i}

with i = 3–10, respectively, and the multipliers

γ_{\dots}

control the strength of these constraints. The smoothness matrices

γ_{uncorr} Ω_{uncorr}^{Ra}

,

γ_{corr}^{\bar{x}} Ω_{corr}^{\bar{x}, Ra}

,

γ_{O} Ω_{O}^{Ra}

,

γ_{Γ} Ω_{Γ}^{Ra}

,

γ_{U} Ω_{U}^{Ra}

,

γ_{corr}^{wv} Ω_{corr}^{wv, Ra}

, and

γ_{corr}^{w / v} Ω_{corr}^{w / v, Ra}

act on uncorrelated fields, multi-pixel mean of correlated fields, PC weights, PC vectors, PC vectors, combined PC weights and vectors, and separate PC weights and vectors, respectively. They are essentially equal to

γ_{uncorr} Ω_{uncorr}

,

γ_{corr}^{\bar{x}} Ω_{corr}^{\bar{x}}

,

γ_{O} Ω_{O}

,

γ_{Γ} Ω_{Γ}

,

γ_{U} Ω_{U}

,

γ_{corr}^{wv} Ω_{corr}^{wv}

, and

γ_{corr}^{w / v} Ω_{corr}^{w / v}

, respectively, as evaluated in Sections 3.2.3–3.2.7. However, the elements of these matrices are rearranged into the same dimension so that they can be added to each other in assembling

γ Ω_{total}

in Equation (90). In performing the rearrangement, specific locations of relevant parameters in the state vector (see Equation (6)) have to be accounted for, and zero values have to be filled in to accommodate the parameters not subjected to a specific constraint. For the uncorrelated parameters and fields, the explicit forms of

γ_{uncorr} Ω_{uncorr}

as well as

γ_{corr}^{\bar{x}} Ω_{corr}^{\bar{x}}

are given in Appendix B. For the correlated parameters and fields, the smoothness constraints imposed on PC vector and weight retrieval incorporate two types: the coupled type (

γ_{corr}^{wv} Ω_{corr}^{wv}

), which acts on combined PC weights and vectors, and the decoupled type (

γ_{corr}^{w / v} Ω_{corr}^{w / v}

), which acts on PC weights and vectors separately. Based on the principles demonstrated above for deriving

Ω_{i}

for i = 5, a comprehensive evaluation of

γ_{corr}^{wv} Ω_{corr}^{wv}

is provided in Appendix C. Comprehensive evaluations of

γ_{corr}^{w} Ω_{corr}^{w}

and

γ_{corr}^{v} Ω_{corr}^{v}

are provided in Appendix D. The imposition of

γ_{corr}^{w} Ω_{corr}^{w}

and

γ_{corr}^{v} Ω_{corr}^{v}

is effective when the pixel-resolved PC weights themselves, or the PC elements associated with a type of parameter, exhibit a certain smooth behavior (usually appearing in low-order PCs). Our retrieval tests indicate that a combined use of

γ_{corr}^{wv} Ω_{corr}^{wv}

and

γ_{corr}^{w / v} Ω_{corr}^{w / v}

stabilizes the inversion of PCs.

Ideally, a retrieval is deemed successful when the minimization of the cost function is achieved, such that

Ψ_{total} \leq (N_{f} + N_{c} + N_{a^{*}} - N_{a}) ε_{f}^{2}

(91)

where N_f, N_c, N_a, and N_a* are the total number of observations, total number of constraints imposed on retrieval (including smoothness constraints in different dimensions, zero-sum constraint over PC weights and orthogonality and unity constraints over PC vectors), total number of retrieval parameters, and total number of a priori estimates of parameters, respectively; and

ε_{f}^{2}

is the expected variance due to measurement errors. In practice, forward RT modeling error and other unmodeled effects can impede realization of the required cost function minimization. Therefore, the retrieval is also terminated when the relative difference between the fitting residuals of two successive iterations drops below a user-specified threshold value,

ε_{c}^{2}

, namely,

\frac{| Ψ_{total} (x_{q + 1}) - Ψ_{total} (x_{q}) |}{Ψ_{total} (x_{q})} \leq ε_{c}^{2} .

(92)
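The iteration of Equations (86) and (87) with the stopping rule of Equation (92) can be sketched on a toy problem. Everything in the example below is an assumption made for illustration: the three-element forward model, its analytic Jacobian, unit observation weights (W_1 = I), a single first-difference smoothness matrix, the multiplier value, and the omission of the a priori and PC-specific terms.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            m[r] = [mr - factor * mc for mr, mc in zip(m[r], m[c])]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def forward(x):                      # toy stand-in for the RT forward model
    return [x[0] + x[1], x[0] - x[1], x[0] * x[1]]

def jacobian(x):                     # analytic Jacobian of the toy model
    return [[1.0, 1.0], [1.0, -1.0], [x[1], x[0]]]

OMEGA = [[1.0, -1.0], [-1.0, 1.0]]   # S^T S for the first difference S = [1, -1]
GAMMA = 0.01                         # hypothetical Lagrange multiplier

def cost_and_system(x, y):
    """Return Psi, its gradient, and the matrix A (cf. Equations (88), (89))."""
    r = [f - t for f, t in zip(forward(x), y)]
    k = jacobian(x)
    ox = [sum(o * xi for o, xi in zip(row, x)) for row in OMEGA]
    psi = 0.5 * sum(ri * ri for ri in r)
    psi += 0.5 * GAMMA * sum(xi * oi for xi, oi in zip(x, ox))
    grad = [sum(k[l][n] * r[l] for l in range(3)) + GAMMA * ox[n] for n in range(2)]
    a = [[sum(k[l][i] * k[l][j] for l in range(3)) + GAMMA * OMEGA[i][j]
          for j in range(2)] for i in range(2)]
    return psi, grad, a

def retrieve(x, y, eps_c_sq=1e-12, max_iter=50):
    psi, grad, a = cost_and_system(x, y)
    for _ in range(max_iter):
        dx = solve(a, grad)                              # Equation (87)
        x = [xi - di for xi, di in zip(x, dx)]           # Equation (86)
        new_psi, grad, a = cost_and_system(x, y)
        converged = abs(new_psi - psi) <= eps_c_sq * psi  # Equation (92)
        psi = new_psi
        if converged:
            break
    return x, psi

y_obs = forward([1.0, 2.0])          # noise-free synthetic observations
x_hat, psi_hat = retrieve([0.5, 1.5], y_obs)
print(x_hat, psi_hat)
```

Because of the smoothness term, the retrieved state is pulled slightly away from the state used to generate the observations; the size of this shift is controlled by the multiplier.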

3.2.9. Determination of Lagrange Multipliers

The Lagrange multipliers reflect the strength of smoothness constraints for a given parameter to be retrieved. Each multiplier is defined as,

γ_{i} = ε_{i = 1}^{2} / ε_{i}^{2}

(93)

where

ε_{i}^{2}

are the first diagonal elements of the covariance matrices corresponding to the i-th type of constraints. To estimate

ε_{i}^{2}

for a given parameter to be retrieved (f = x), which is a function of z, the most unsmooth known solution

f^{us} (z)

over the target domain is used, as suggested by Dubovik and King [25], namely,

ε^{2} = \int_{z_{\min}}^{z_{\max}} {(\frac{d^{m} [f^{us} (z)]}{d z^{m}})}^{2} d z

(94)

where

z_{\min}

and

z_{\max}

specify the lower and upper bounds of z. For i = 3, 4, and 5,

γ_{i}

is evaluated using the least smooth known solution of the correlated parameter x over the target domain. For i = 6 and 7,

γ_{i}

is evaluated using the least smooth known solution of PC weights and PC vectors, respectively, which are derived from a training dataset over the target domain.
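As a hedged illustration of Equations (93) and (94), the sketch below approximates the integral of the squared m-th derivative on a uniform grid with finite differences and forms the corresponding multiplier. The test function z², the grid, and the measurement variance are hypothetical; the integral is approximated as the mean squared derivative times the interval length, which is one of several reasonable quadrature choices.

```python
def unsmoothness(f_values, z_min, z_max, m):
    """Approximate Equation (94) for samples of f on a uniform grid."""
    n = len(f_values)
    h = (z_max - z_min) / (n - 1)
    d = list(f_values)
    for _ in range(m):                          # m-th finite difference
        d = [(b - a) / h for a, b in zip(d, d[1:])]
    mean_sq = sum(v * v for v in d) / len(d)
    return mean_sq * (z_max - z_min)

n_grid = 101
z_grid = [j / (n_grid - 1) for j in range(n_grid)]
f_us = [zj * zj for zj in z_grid]               # hypothetical "most unsmooth" solution
eps_sq_smooth = unsmoothness(f_us, 0.0, 1.0, m=2)   # exact value: 4
eps_sq_obs = 0.01                               # hypothetical measurement variance
gamma = eps_sq_obs / eps_sq_smooth              # Equation (93)
print(eps_sq_smooth, gamma)
```

A less smooth reference solution gives a larger ε² and therefore a smaller multiplier, that is, a weaker smoothness constraint.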

In practical implementation of our algorithm, the Lagrange multipliers are modified in the following way:

γ_{i}^{Final} = \frac{N_{f}}{N_{i}} \frac{{\tilde{ε}}_{f}^{2}}{ε_{f}^{2}} γ_{i}

(95)

There are two differences between

γ_{i}^{Final}

and

γ_{i}

:

1. The multipliers “

N_{f}

/

N_{i}

” are introduced to account for possible redundancy of the measured and a priori data. Considering that

ε_{i}^{2}

is the variance of the error in a single measured or estimated a priori value, the total variance increases in proportion to N if we have N values of a similar kind. Introducing this coefficient ensures that, when there are several kinds of data, the data type with fewer values is given weight comparable to that of the data type with a greater number of available values.

2. The multiplier

{\tilde{ε}}_{f}^{2}

/

ε_{f}^{2}

is introduced with

{\tilde{ε}}_{f}^{2}

estimated as the dynamic fitting residual during iterations:

{\tilde{ε}}_{f}^{2} (x_{q}) \approx \frac{Ψ_{total} (x_{q})}{(N_{f} + N_{c} + N_{a^{*}} - N_{a}) ε_{f}^{2}}

(96)

With the multiplier

{\tilde{ε}}_{f}^{2}

/

ε_{f}^{2}

, the fitting residual is used as an estimate of the measurement error variance. As a result, the contribution of the a priori term is strongest during the first few iterations, and its influence decreases as the retrieval progresses. This is done to ensure mostly monotonic convergence, as in the Levenberg-Marquardt procedure [26,27]. However, the Levenberg-Marquardt approach does not specify a particular scheme for introducing these terms; rather, it relies on the implementer’s intuition. Our algorithm assumes that the fitting errors in the initial iterations are dominated by model linearization errors rather than random measurement errors, because at each iterative step the full forward model is replaced by its linear approximation. These “errors of linearization” decrease as convergence toward the final solution progresses and eventually practically disappear, so that

{\tilde{ε}}_{f}^{2}

becomes equal to

ε_{f}^{2}

. As a result of this adjustment of the Lagrange multiplier, the non-linear iteration becomes significantly more monotonic. The Lagrange multipliers that control the strength of the a priori (i = 2), zero-sum (i = 8), orthogonality (i = 9), and unity (i = 10) constraints are dynamically updated using Equations (95) and (96). As the solution approaches the truth, the constraining effect of the smoothness constraints, exerted via the Lagrange multipliers, decreases during the optimization process.

3.3. Retrieval Error Estimate

Instrumental errors, forward modeling errors, and physical constraints imposed on the retrievals need to be taken into account when evaluating retrieval errors. Instrumental errors include random and systematic effects, such as absolute bias, band-to-band error, and camera-to-camera error. If all of these error sources are well-characterized, their error propagation to the retrievals can be derived in a comprehensive way [19]. If not, then a practical method is to use the square root of the diagonal terms of the following covariance matrix,

Δ x_{j} = \sqrt{{(C_{Δ x})}_{j j}}

(97)

with

C_{Δ x} = (Δ x_{syst}) \times {(Δ x_{syst})}^{T} + C_{Δ x, rand}

(98)

where the covariance matrix

C_{Δ x, rand}

accounts for the contributions by random errors in the measurement (

ε_{rand}

),

C_{Δ x, rand} = {[A (x^{true})]}^{- 1} ε_{rand}^{2}

(99)

and systematic error

Δ x_{syst}

is estimated by

Δ x_{syst} = {[A (x^{true})]}^{- 1} \nabla Ψ_{total} (x^{true})

(100)

where A and

\nabla Ψ_{total}

are calculated by substituting

x^{true}

into Equations (88) and (89), respectively.
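Equations (97)–(100) can be illustrated with a two-parameter example. The sketch below is not the operational implementation: the matrix A, the gradient, and the random error variance are hypothetical numbers, and a closed-form 2 × 2 inverse is used for brevity.

```python
def invert_2x2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def retrieval_uncertainty(a, grad, eps_rand_sq):
    """Return (sigma, covariance) following Equations (97)-(100)."""
    a_inv = invert_2x2(a)
    # Equation (100): systematic error from the residual gradient
    dx_syst = [sum(a_inv[i][j] * grad[j] for j in range(2)) for i in range(2)]
    # Equations (98) and (99): systematic outer product plus random part
    cov = [[dx_syst[i] * dx_syst[j] + a_inv[i][j] * eps_rand_sq
            for j in range(2)] for i in range(2)]
    # Equation (97): square root of the diagonal terms
    return [cov[i][i] ** 0.5 for i in range(2)], cov

A = [[4.0, 0.0], [0.0, 1.0]]        # hypothetical A evaluated at the retrieved state
grad_psi = [0.4, 0.1]               # hypothetical gradient of the total cost
sigma, cov = retrieval_uncertainty(A, grad_psi, eps_rand_sq=0.04)
print(sigma)
```

In this example both parameters have the same systematic error of 0.1, while the random contribution is larger for the second parameter because the corresponding diagonal element of A is smaller.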

As the retrieved solution is the closest estimate of the “true” solution

x^{true}

, we adopt

x^{true}

=

x^{retrieved}

for uncertainty estimation. Uncertainties of uncorrelated parameters are available from Equation (97). Uncertainties of pixel-resolved correlated parameters rely on the propagation of the retrieval uncertainties of the PCs and PC weights, so that,

{(Δ x_{p, corr}^{retrieved})}^{2} = {(Δ {\bar{x}}_{corr}^{retrieved})}^{2} + \sum_{k = 1}^{N_{PC}} {{[w_{p}^{retrieved} (k)]}^{2} {(Δ v_{k}^{retrieved})}^{2} + {[Δ w_{p}^{retrieved} (k)]}^{2} {(v_{k}^{retrieved})}^{2}}

(101)

To estimate errors for functions of the retrieved parameters (namely

f (x_{p}^{retrieved})

, e.g., aerosol single-scattering albedo (SSA)), the following chain rule is applied,

Δ f (x_{p}^{retrieved}) = \sqrt{K_{2}^{T} (x_{p}^{retrieved}) C_{Δ x_{p}^{retrieved}} K_{2} (x_{p}^{retrieved})}

(102)

where

K_{2}

denotes the Jacobian array containing derivatives of

f (x_{p}^{retrieved})

with respect to

x_{p}^{retrieved}

.
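The two propagation rules, Equations (101) and (102), can be written compactly as follows. This is a sketch with hypothetical inputs; variances are passed in directly, and K_2 is represented as a plain list of derivatives.

```python
def correlated_parameter_variance(var_mean, weights, var_weights, pcs_j, var_pcs_j):
    """Equation (101) for one correlated parameter j in one pixel.

    weights[k] = w_p(k), pcs_j[k] = v_k(j); the var_* arguments are the
    squared uncertainties of the corresponding quantities.
    """
    return var_mean + sum(w * w * dv + dw * v * v
                          for w, dw, v, dv in zip(weights, var_weights, pcs_j, var_pcs_j))

def derived_quantity_sigma(k2, cov):
    """Equation (102): sqrt(K2^T C K2) for a scalar function of the state."""
    n = len(k2)
    return sum(k2[i] * cov[i][j] * k2[j] for i in range(n) for j in range(n)) ** 0.5

var_x = correlated_parameter_variance(
    var_mean=0.01, weights=[2.0, -1.0], var_weights=[0.04, 0.01],
    pcs_j=[0.5, 0.5], var_pcs_j=[0.0025, 0.01])
sigma_f = derived_quantity_sigma([1.0, 2.0], [[0.04, 0.0], [0.0, 0.01]])
print(var_x, sigma_f)
```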

When

x^{a p r i o r i}

is not available, we assume

x^{a p r i o r i}

=

x_{q}

so that the a priori term on the right-hand-side of Equation (89) for

\nabla Ψ_{total}

is neglected. Indeed, instrument errors are already contained in the observation vector. Therefore, by implementing Equations (97) and (98), it is possible that errors are double-counted in the case that they bias the solution in the same direction as the modeling errors, resulting in a conservative error estimate.

Most error estimates involving Jacobians assume that the calculation is representative of the whole solution space and that the retrieval error is linear with measurement error. These two assumptions can be problematic in situations where model or observation errors are large. In these cases, closure tests using synthetic data with combined random and systematic errors to obtain improved error estimates are recommended.

3.4. Retrieval Options

An a priori estimate or first guess of the PCs can be obtained by running PCA on a climatology database, field measurements, or model output. For PCs in whose contents confidence is low, simultaneous retrieval of PC weights and PC vectors has to be performed. For high-confidence PCs, a two-step retrieval can be used to speed up the retrieval. That is, we first run the PC weight retrieval using the predetermined PCs and then relax the PCs during the retrieval to refine the solution. This relaxation is one way to capture unexpected variance that exceeds that represented by the predetermined PCs. In the extreme situation where the PCs are not representative of the actual aerosol properties, the scene model can be reproduced by using N PCs, where N equals the number of aerosol parameters. In the absence of confidence in an a priori set of PCs, retrieval efficiency is still desired. In this case, one can run the correlated multi-pixel inversion with 2–3 PC vectors to capture the major variation of the correlated fields and then (a) relax more PCs into the retrieval to capture unexpected variance, or (b) return to the original multi-pixel retrieval to refine the solution. These different options have been built into our algorithm. In addition, to stabilize the retrieval of correlated fields varying significantly in magnitude (even in logarithmic space), we built into the algorithm an option of retrieving scaled PC vectors and weights. In this case, the representation of correlated fields by PCs for the p-th pixel (namely Equation (1)) is modified as,

x_{corr, p} = {\bar{x}}_{corr} + c \circ (v \times w_{p})

(103)

where

\circ

in the above equation denotes an operation of element-wise multiplication, and the constant column vector c has

N_{corr}

elements and is introduced here to balance or weight the contribution of all types of retrieval parameters in the PC analysis. It is calculated from (a) the standard deviation of a correlated field x varying in spatial or temporal scales (

σ_{spatio - temporal, x}

), and (b) its uncertainty estimate (

σ_{e, x}

), namely,

c_{x} = \sqrt{σ_{spatio - temporal, x}^{2} + σ_{e, x}^{2}}

(104)

Such a scaled PC analysis is used to analyze the AERONET retrievals in Figure 1 and Figure 2. To accommodate the constant column vector, the smoothness matrices for the correlated fields and Lagrange multipliers need a slight modification, which is straightforward and is not discussed here.
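The scaled representation of Equations (103) and (104) is sketched below. The function names, the layout pcs[k][j] = v_k(j), and all numerical values are assumptions made for this illustration.

```python
import math

def scaling_vector(sigma_spatiotemporal, sigma_error):
    """Equation (104), applied element by element to the correlated fields."""
    return [math.sqrt(s * s + e * e)
            for s, e in zip(sigma_spatiotemporal, sigma_error)]

def reconstruct_scaled(mean, c, pcs, weights):
    """Equation (103): mean + c o (v x w_p), with pcs[k][j] = v_k(j)."""
    return [m + cj * sum(w * v[j] for w, v in zip(weights, pcs))
            for j, (m, cj) in enumerate(zip(mean, c))]

c = scaling_vector([3.0, 0.6], [4.0, 0.8])      # approximately [5.0, 1.0]
x_p = reconstruct_scaled(mean=[10.0, 1.0], c=c, pcs=[[0.6, 0.8]], weights=[2.0])
print(c, x_p)
```

With c equal to a vector of ones, the expression reduces to the unscaled PC representation.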

4. Radiative Transfer in a Coupled Atmosphere-Surface System

A fast radiative transfer model for a coupled atmosphere-surface (CAS) system plays a critical role in ensuring inversion efficiency. Where correlation among aerosol parameters exists, this section formulates the development of such a model for simulating TOA radiance for a group of pixels simultaneously. Figure 4 shows a CAS system consisting of a land surface and an aerosol/air-molecule mixed layer. The Markov chain model [28] is adopted for modeling RT in the atmospheric part of the CAS system. Coupling of atmospheric radiation and surface reflection is performed by use of the adding method [29]. As outlined in Figure 5, the principal-component based radiative transfer (PC-RT) model includes two steps: (1) calculating aerosol light scattering properties (aerosol optical depth, single scattering albedo, and phase matrix), and the reflection and transmission matrices of the atmosphere for a group of pixels by utilizing correlations in aerosol properties (Section 4.1.1); (2) coupling reflection/transmission matrices of the atmosphere with surface reflection pixel-by-pixel to model the TOA observations (Section 4.1.2). For retrieval use, evaluation of the Jacobian matrix that contains the derivatives of TOA observations with respect to all aerosol and surface parameters is described in Section 4.2.

Figure 4. Depiction of the coupled atmosphere-surface (CAS) system model. The sun illuminates the top-of-atmosphere with solar zenith angle

θ_{0}

and azimuthal plane

ϕ_{0}

. The sensor views the atmosphere at view zenith angle

θ_{v}

and azimuthal angle

ϕ_{v}

. A Gaussian vertical distribution profile is assumed for aerosols. The Markov chain model is used for computing polarized RT in the atmosphere. It is then coupled with surface reflection using an adding method.

Figure 5. The scheme of PC-based forward radiative transfer modeling of remote sensing observations from an airborne or spaceborne sensor. The interpretation of symbols used in the figure can be found in Table A1 of Appendix A.

To further ensure modeling efficiency, the correlated aerosol microphysical properties derived from their PCs are input to a light scattering database [30] for determination of aerosol scattering properties using interpolation. The modeled scattering properties include aerosol optical depth (AOD), single scattering albedo (SSA), and phase matrix. To model surface reflection, the Rahman-Pinty-Verstraete (RPV) model [31] is used to calculate the unpolarized surface bidirectional reflectance distribution function (BRDF). The parameters of the unpolarized part of the surface model include spectral weights and a few angular shape parameters. A modified microfacet model [32] is used to calculate the polarized surface BRDF (pBRDF). The pBRDF model parameters include spectral weight, shadowing width, and slope variance of the microfacets. Combination of the two components gives the full surface reflection model used in AirMSPI aerosol retrievals over land [19]. With the AOD and single scattering properties, a fast RT model that utilizes the correlation in aerosol fields and orthogonality of PC vectors is used to model the radiometric and polarimetric observations.

4.1. Fast Multi-Pixel Polarized RT Modeling in the Atmosphere

At a given iteration step, aerosol optical depth (AOD) and light scattering properties, including SSA and phase matrices, are established for all layers. Then, along with the parameters of the surface reflection, RT modeling is performed to evaluate the TOA radiation field. The basic multi-pixel algorithm, which does not take advantage of correlations between aerosol parameters, runs the RT pixel by pixel. Making use of correlation in aerosol fields and orthogonality of PC vectors, the next section describes a fast way to compute radiation in the atmosphere, which is then coupled with surface reflection to derive the TOA radiation field.

4.1.1. Fast Multiple-Pixel Radiative Transfer Modeling Utilizing Correlation

The spatial and spectral variation of aerosol fields, such as volume concentration as a function of aerosol size, aerosol refractive index, vertical distribution profile, and volume concentration of spherical particles, are captured by a few (

N_{PC}

) dominant PC vectors. Taking advantage of orthogonality of the PCs, the quantities (Y), including total AOD (

τ_{aer, tot}

), absorption AOD (

τ_{aer, abs}

), and reflection and transmission matrices (R and T) for the above-surface atmosphere, can be expanded as Taylor series for multi-variable vector-valued functions about the multi-pixel mean state, along the PC directions

v_{k}

. Adopting a second-order approximation (justified later in this section) and the finite difference method to calculate derivatives, we have,

Y (x_{p}) \approx Y (\bar{x}) + \sum_{k = 1}^{N_{PC}} [\frac{Y (\bar{x} + δ_{s} \times v_{k}) - Y (\bar{x} - δ_{s} \times v_{k})}{2 δ_{s}} w_{p} (k) + \frac{Y (\bar{x} + δ_{s} \times v_{k}) - 2 Y (\bar{x}) + Y (\bar{x} - δ_{s} \times v_{k})}{2 δ_{s}^{2}} w_{p}^{2} (k)]

(105)

where

\bar{x}

is the state vector containing the multi-pixel mean aerosol properties and

δ_{s}

is the scale factor that perturbs a PC vector. The scale factor is empirically determined by accounting for: (a) the accuracy of the two Jacobian related terms on the right-hand side of the above equation; and (b) the sufficiency of expanding the radiation fields to the second order and using a finite number of PCs to capture the variation of radiation across the imaging area. For an AirMSPI imaging area of 10 km by 10 km, an empirical value of 0.1 is adopted for the scale factor in the above equation.
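The expansion in Equation (105) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `pc_expansion`, `shifted`, and `toy_forward` are hypothetical, the forward model is a scalar toy function standing in for a full RT run, and each pixel state is assumed to be the mean plus a weighted sum of PC vectors.

```python
from typing import Callable, List, Sequence


def shifted(x_mean: Sequence[float], v: Sequence[float], scale: float) -> List[float]:
    """Return x_mean + scale * v, element by element."""
    return [xm + scale * vi for xm, vi in zip(x_mean, v)]


def pc_expansion(
    forward: Callable[[Sequence[float]], float],
    x_mean: Sequence[float],
    pcs: Sequence[Sequence[float]],
    weights: Sequence[Sequence[float]],
    delta_s: float = 0.1,
) -> List[float]:
    """Approximate forward(x_p) for every pixel p following Equation (105).

    Only 2 * len(pcs) + 1 calls to `forward` are made, whatever the pixel count.
    """
    y0 = forward(x_mean)
    terms = []
    for v in pcs:
        y_plus = forward(shifted(x_mean, v, +delta_s))
        y_minus = forward(shifted(x_mean, v, -delta_s))
        first = (y_plus - y_minus) / (2.0 * delta_s)
        half_second = (y_plus - 2.0 * y0 + y_minus) / (2.0 * delta_s ** 2)
        terms.append((first, half_second))
    return [
        y0 + sum(d1 * w + d2 * w * w for (d1, d2), w in zip(terms, w_p))
        for w_p in weights
    ]


def toy_forward(x: Sequence[float]) -> float:
    """Hypothetical stand-in for an RT run (not a physical model)."""
    return 1.0 + 2.0 * x[0] + 3.0 * x[1] ** 2


# Two pixels described by two PC weights each.
fields = pc_expansion(
    toy_forward, [0.5, 0.2], [[1.0, 0.0], [0.0, 1.0]], [[0.3, -0.1], [0.0, 0.0]]
)
```

Because the toy function is quadratic along each PC direction, the second-order expansion reproduces it exactly here; for a real RT model the truncation error depends on the scale factor and on how much of the variance the retained PCs capture.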

The efficiency gain of implementing the above PC based RT modeling is obvious: only (2 ×

N_{PC}

+1) RT runs are necessary for evaluating the radiance fields for all pixels, far fewer than the one run per pixel required by the pixel-by-pixel approach. If the variation of aerosol fields across an imaging area is extremely high, we use an alternative method built into the algorithm, which applies PCA to the difference in the RT fields evaluated by RT runs with low and high numbers of streams (“streams” represent the quadrature points for integrating the contribution of light from different directions to an event of scattering in a new direction), namely,

Y (x_{p}) - Y^{LS} (x_{p}) \approx [Y^{HS} (\bar{x}) - Y^{LS} (\bar{x})] + \sum_{k = 1}^{N_{PC}} [\frac{Δ Y (\bar{x} + δ_{s} \times v_{k}) - Δ Y (\bar{x} - δ_{s} \times v_{k})}{2 δ_{s}} w_{p} (k) + \frac{Δ Y (\bar{x} + δ_{s} \times v_{k}) - 2 Δ Y (\bar{x}) + Δ Y (\bar{x} - δ_{s} \times v_{k})}{2 δ_{s}^{2}} w_{p}^{2} (k)]

(106)

where “HS” and “LS” denote high and low stream-based RT runs, respectively, and

Δ Y = Y^{HS} - Y^{LS}

. In this case, RT runs with low order streams are performed for all pixels and then corrected by Equation (106). This methodology was proposed for a hyperspectral RT simulation [8] and adopted here for the multi-pixel RT simulation. Compared to directly applying PCA to RT via Equation (105), implementation of Equation (106) increases computational cost but captures more variance in the image-scale radiation fields and achieves higher modeling accuracy.
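The low-stream/high-stream correction of Equation (106) might be sketched as follows. This is an illustrative outline under stated assumptions: `rt_low` and `rt_high` are hypothetical callables standing in for low-stream and high-stream RT runs, the fields are scalars, and each pixel state is taken as the mean plus a weighted sum of PC vectors.

```python
from typing import Callable, List, Sequence


def corrected_fields(
    rt_low: Callable[[Sequence[float]], float],
    rt_high: Callable[[Sequence[float]], float],
    x_mean: Sequence[float],
    pcs: Sequence[Sequence[float]],
    weights: Sequence[Sequence[float]],
    delta_s: float = 0.1,
) -> List[float]:
    """Low-stream run per pixel plus a PC-expanded high-minus-low correction."""

    def delta_y(x: Sequence[float]) -> float:
        return rt_high(x) - rt_low(x)

    def shifted(v: Sequence[float], scale: float) -> List[float]:
        return [xm + scale * vi for xm, vi in zip(x_mean, v)]

    d0 = delta_y(x_mean)
    terms = []
    for v in pcs:
        d_plus = delta_y(shifted(v, +delta_s))
        d_minus = delta_y(shifted(v, -delta_s))
        terms.append((
            (d_plus - d_minus) / (2.0 * delta_s),
            (d_plus - 2.0 * d0 + d_minus) / (2.0 * delta_s ** 2),
        ))

    results = []
    for w_p in weights:
        # Pixel state reconstructed from the mean and the PC weights.
        x_p = [
            xm + sum(w * v[i] for w, v in zip(w_p, pcs))
            for i, xm in enumerate(x_mean)
        ]
        correction = d0 + sum(d1 * w + d2 * w * w for (d1, d2), w in zip(terms, w_p))
        results.append(rt_low(x_p) + correction)
    return results
```

In this sketch the expensive high-stream model is called only 2 × N_PC + 1 times, while the cheaper low-stream model is called once per pixel plus the same 2 × N_PC + 1 times inside the difference term, which mirrors the cost trade-off described above.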

Note that the above evaluation assumes the same view and azimuthal angles for all pixels across the whole image, which is not strictly correct, as view angle varies pixel-by-pixel. To account for such an effect, the R and T matrices are calculated for a few view and azimuthal angular grids for each viewed image. Interpolation is then used to obtain the reflection and transmission matrices for an arbitrary pixel before coupling them with the surface reflection (as we assume no correlation between aerosol and surface reflection properties in this paper). Coupling of atmospheric radiation and surface reflection to get the TOA radiance for fitting observation is formulated in the next section.

Using three PCs (

N_{PC}

= 3), Figure 6 shows a comparison of PC based RT computation of radiance (expressed as bidirectional reflectance factor, BRF) and degree of linear polarization (DOLP) for 30 super-pixels of an AirMSPI image against the RT computation obtained without using PCs. Here, BRF is defined as

π I_{meas} d_{ES}^{2} / μ_{0} E_{0}

, where

I_{meas}

is the measured radiance,

d_{ES}

is the Earth-Sun distance,

μ_{0}

is the cosine of the solar zenith angle, and E_{0} is the exo-atmospheric solar irradiance. A super-pixel has a resolution of 1 km and is generated by aggregating 100 by 100 original AirMSPI pixels with 10-m resolution. Running the RT model over super-pixels can mitigate the errors from the independent pixel approximation [20]. For a variation of aerosol loading up to 100%, the errors in TOA BRF and DOLP computed by the PC-based RT are found to be less than ~0.5% and ~0.0025, respectively. These values are smaller than typical instrument errors (e.g., ~4% for radiance and 0.005 for DOLP, which are the instrument requirements for AirMSPI [33]).
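As a small worked illustration of the BRF definition above, the following sketch converts a radiance to a BRF. The function name and argument names are hypothetical, and it is assumed that the Earth-Sun distance is expressed in astronomical units and that the solar irradiance is referenced to 1 AU in units consistent with the radiance.

```python
import math


def brf(
    radiance: float,
    earth_sun_distance_au: float,
    solar_zenith_deg: float,
    solar_irradiance: float,
) -> float:
    """BRF = pi * I_meas * d_ES^2 / (mu_0 * E_0)."""
    mu0 = math.cos(math.radians(solar_zenith_deg))
    return math.pi * radiance * earth_sun_distance_au ** 2 / (mu0 * solar_irradiance)


# Made-up numbers for illustration only.
example = brf(radiance=100.0, earth_sun_distance_au=1.0,
              solar_zenith_deg=60.0, solar_irradiance=1000.0)
```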

Figure 6. (a) Pixel-resolved spectral aerosol optical depth (AOD) used in simulating the radiance and degree of linear polarization (DOLP) in (b). The AODs for seven Airborne Multiangle SpectroPolarimetric Imager (AirMSPI) spectral bands are plotted in different colors: pink (355 nm), purple (380 nm), dark blue (445 nm), light blue (470 nm), green (555 nm), red (660 nm), and brown (865 nm). (b) The principal component based RT (PC-RT) simulation of radiance using three PCs (upper left panel) and DOLP (lower left panel) at AirMSPI’s nine viewing angles within (−66°, +66°) range around nadir, and the relative error of PC based RT simulation of radiance (top right) and absolute error of DOLP (bottom right) as compared to direct RT computation pixel-by-pixel. Errors are estimated by 100 × (Y_PC-RT − Y_Direct-RT)/Y_Direct-RT for Y = radiance and by (Y_PC-RT − Y_Direct-RT) for Y = DOLP. The viewing geometries are adopted from one of the scenes acquired by AirMSPI during the ACEPOL field campaign. The unpolarized and polarized surface reflectance is calculated from the retrieved BRDF and pBRDF parameters, respectively.

Note that the PC-RT model formulated here assumes that aerosol properties are correlated. It is possible, however, that some aerosol fields are not highly correlated. In this case, one can use more PCs to capture the variance. In the worst case scenario of no correlation, the RT calculation defaults to the pixel-by-pixel approach.

4.1.2. Coupling Atmospheric Radiation with Surface Reflection

Using the deterministic Markov chain RT model for a plane-parallel atmosphere ([28], or other RT models), we calculate two sets of reflection and transmission matrices for the atmosphere, (

R_{atmos}

,

T_{atmos}

) and (

R_{atmos}^{*}

,

T_{atmos}^{*}

), which result from illumination at the top and bottom of the atmosphere, respectively. We denote the surface reflection matrix by

R_{surf}

, which consists of a BRDF component (

R_{surf, BRDF}

) and a pBRDF component (

R_{surf, pBRDF}

). In accordance with the adding method [29], two matrices Q and S are defined to account for the repeated interaction between the surface and the atmosphere,

S = \sum_{n = 1}^{N_{\max}} Q_{n}

(107)

Q_{n} = Q_{1} Q_{n - 1}

(108)

Q_{1} = R_{atmos}^{*} R_{surf}

(109)

Then the matrix for downwelling diffuse light at the atmosphere-land interface is given by

D = T_{atmos} + S \exp (- \frac{τ_{atmos}}{μ_{0}}) + S T_{atmos}

(110)

where

τ_{atmos}

is the optical thickness of the whole atmosphere, as contributed by all atmospheric constituents, and the diffuse upwelling light from the surface is calculated as

U = R_{surf} \exp (- \frac{τ_{atmos}}{μ_{0}}) + R_{surf} D

(111)

The reflection matrix of the full CAS is,

R_{CAS} = R_{atmos} + \exp (- \frac{τ_{atmos}}{μ}) U + T_{atmos}^{*} U

(112)
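The coupling steps in Equations (107)–(112) can be illustrated with a scalar analogue. This is a simplified sketch, not the matrix implementation used in the retrieval: all quantities are treated as plain numbers (so matrix products and angular integration reduce to multiplication), the function name `couple_surface` and the truncation order `n_max` are assumptions, and the final step uses the standard adding-method form in which the atmospheric reflection enters additively.

```python
import math


def couple_surface(
    r_atm: float,
    t_atm: float,
    r_atm_star: float,
    t_atm_star: float,
    r_surf: float,
    tau: float,
    mu0: float,
    mu: float,
    n_max: int = 50,
) -> float:
    """Scalar analogue of the adding method for an atmosphere over a surface."""
    q1 = r_atm_star * r_surf                  # Eq. (109)
    q_n, s = q1, 0.0
    for _ in range(n_max):                    # Eqs. (107)-(108)
        s += q_n
        q_n = q1 * q_n
    direct_down = math.exp(-tau / mu0)        # attenuated direct solar beam
    d = t_atm + s * direct_down + s * t_atm   # Eq. (110)
    u = r_surf * direct_down + r_surf * d     # Eq. (111)
    # Atmospheric reflection plus directly and diffusely transmitted upwelling light.
    return r_atm + math.exp(-tau / mu) * u + t_atm_star * u
```

A useful sanity check of this form is the black-surface limit: with a surface reflection of zero, U vanishes and the coupled reflection reduces to the reflection of the atmosphere alone.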

For simplicity in describing the conceptual scheme, the superscript “m” that denotes Fourier series order is not shown in the above expressions. In actuality, the TOA radiation fields for the CAS system are reconstructed from all orders of Fourier terms, namely,

{BRF}_{TOA} = π \sum_{m = 0}^{\infty} (2 - δ_{0 m}) R_{CAS, 11}^{(m)} \cos m (ϕ_{v} - ϕ_{0})

(113)

{qBRF}_{TOA} = π \sum_{m = 0}^{\infty} (2 - δ_{0 m}) R_{CAS, 21}^{(m)} \cos m (ϕ_{v} - ϕ_{0})

(114)

{uBRF}_{TOA} = π \sum_{m = 0}^{\infty} (2 - δ_{0 m}) R_{CAS, 31}^{(m)} \cos m (ϕ_{v} - ϕ_{0})

(115)

{vBRF}_{TOA} = π \sum_{m = 0}^{\infty} (2 - δ_{0 m}) R_{CAS, 41}^{(m)} \cos m (ϕ_{v} - ϕ_{0})

(116)

where

(ϕ_{v} - ϕ_{0})

is the relative azimuth angle between the view and illumination directions. Equations (113)–(116) describe TOA BRFs for total, linearly polarized, and circularly polarized radiation. When DOLP is used to fit the observations, it is calculated as

DOLP \approx \sqrt{{qBRF}_{TOA}^{2} + {uBRF}_{TOA}^{2}} / {BRF}_{TOA}

, where we neglect the minor contribution of circular polarization related term (

{vBRF}_{TOA}

) when it is excluded from polarimetric measurements.
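The Fourier reconstruction and the DOLP formula can be sketched as below. This is a minimal illustration: the function names are hypothetical, the infinite series is truncated at the number of coefficients supplied (an assumption, since the truncation order is not stated here), and the cosine form is used exactly as written in Equations (113)–(116).

```python
import math
from typing import Sequence


def fourier_sum(coeffs: Sequence[float], rel_azimuth_deg: float) -> float:
    """pi * sum_m (2 - delta_0m) * R^(m) * cos(m * (phi_v - phi_0))."""
    phi = math.radians(rel_azimuth_deg)
    total = 0.0
    for m, r_m in enumerate(coeffs):
        factor = 1.0 if m == 0 else 2.0       # (2 - delta_{0m})
        total += factor * r_m * math.cos(m * phi)
    return math.pi * total


def dolp(brf_toa: float, q_brf_toa: float, u_brf_toa: float) -> float:
    """Degree of linear polarization, neglecting the circular term."""
    return math.hypot(q_brf_toa, u_brf_toa) / brf_toa
```

For example, `dolp(0.2, 0.003, 0.004)` evaluates the ratio of a linearly polarized magnitude of 0.005 to a total BRF of 0.2.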

4.2. Jacobian Evaluation

In an iterative way, the optimization algorithm adjusts the state vector that parameterizes aerosol and surface properties to approach the solution. At each iterative step, RT calculations are performed to obtain both the modeled radiation fields and their Jacobians, which describe how the radiation fields vary in response to the small perturbation of state vector components and determine the direction and step size to move for the iterative solution to converge. For an observation vector composed of a series of N_f measurements (

f_{1}^{*} = [y_{1}; y_{2}; \dots; y_{N_{f}}]

) and a state vector of

N_{a}

components (

x = [x_{1}; x_{2}; \dots; x_{N_{a}}]

), the Jacobian matrix has the following structure,

K = {[\begin{matrix} \frac{\partial y_{1}}{\partial x_{1}} & \frac{\partial y_{1}}{\partial x_{2}} & \dots & \frac{\partial y_{1}}{\partial x_{N_{a}}} \\ \frac{\partial y_{2}}{\partial x_{1}} & \frac{\partial y_{2}}{\partial x_{2}} & \dots & \frac{\partial y_{2}}{\partial x_{N_{a}}} \\ \dots & \dots & ⋱ & \dots \\ \frac{\partial y_{N_{f}}}{\partial x_{1}} & \frac{\partial y_{N_{f}}}{\partial x_{2}} & \dots & \frac{\partial y_{N_{f}}}{\partial x_{N_{a}}} \end{matrix}]}_{N_{f} \times N_{a}}

(117)

Each matrix element is evaluated by use of the finite difference method. Namely, the derivative of the i-th modeled datum with respect to the n-th state vector component is evaluated by,

\frac{\partial y_{i}}{\partial x_{n}} = \frac{y_{i} (x_{n} + Δ x_{n}) - y_{i} (x_{n} - Δ x_{n})}{2 Δ x_{n}}

(118)

The above equation applies to the calculation of the derivatives of

τ_{aer, tot}

,

τ_{aer, abs}

, R, and T with respect to the mean aerosol properties and pixel-resolved PC weights. Then, without applying the finite difference method again, the derivative of quantity Y = {

τ_{aer, tot}

,

τ_{aer, abs}

, R, T} at pixel p with respect to an element of the k-th PC vector can be derived quickly from the derivative with respect to the mean and the PC weight associated with the p-th pixel, namely,

\frac{\partial Y_{p}}{\partial v_{k} (i)} \approx w_{p} (k) \frac{\partial Y_{p}}{\partial \bar{x} (i)}

(119)

Such a strategy further improves the Jacobian evaluation efficiency when multiple PCs are retrieved.
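The relationship in Equation (119) can be checked numerically with the central difference of Equation (118). The sketch below is illustrative only: the helper names are hypothetical, the forward model is any scalar function supplied by the caller, and the pixel state is assumed to be the mean plus a weighted sum of PC vectors, in which case Equation (119) follows from the chain rule.

```python
from typing import Callable, List, Sequence


def central_diff(f: Callable[[float], float], x: float, step: float) -> float:
    """Central finite difference, as in Equation (118)."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


def pixel_state(x_mean, pcs, w_p) -> List[float]:
    """x_p(i) = x_mean(i) + sum_k w_p(k) * v_k(i)."""
    return [
        xm + sum(w * v[i] for w, v in zip(w_p, pcs))
        for i, xm in enumerate(x_mean)
    ]


def d_wrt_mean(forward, x_mean, pcs, w_p, i: int, step: float = 1e-5) -> float:
    """Derivative of the pixel field with respect to the i-th mean property."""
    def f(value: float) -> float:
        x = list(x_mean)
        x[i] = value
        return forward(pixel_state(x, pcs, w_p))
    return central_diff(f, x_mean[i], step)


def d_wrt_pc_element(forward, x_mean, pcs, w_p, k: int, i: int, step: float = 1e-5) -> float:
    """Brute-force derivative with respect to element i of the k-th PC vector."""
    def f(value: float) -> float:
        new_pcs = [list(v) for v in pcs]
        new_pcs[k][i] = value
        return forward(pixel_state(x_mean, new_pcs, w_p))
    return central_diff(f, pcs[k][i], step)
```

Comparing `d_wrt_pc_element(...)` with `w_p[k] * d_wrt_mean(...)` for a smooth test function gives matching values, which is why the second finite-difference pass over PC elements can be skipped.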

5. Inversion of Aerosol and Surface Properties

Section 3 provided a general algorithm formulation to retrieve correlated and uncorrelated parameters. This section describes practical specifics in using AirMSPI data to retrieve aerosol and surface properties. In doing so, we assume correlations in aerosol properties, no correlation in surface properties, and no correlation between aerosol and surface properties.

The first guesses of PCs, PC weights, and mean aerosol fields are derived from a training dataset, which contains all historical AirMSPI retrievals for selected scenes with the criteria of well-calibrated measurements, clear-sky conditions, and collocated ground observations. The aerosol fields include spectrally dependent real and imaginary parts of the aerosol refractive index, volume concentrations of multiple aerosol size components, nonspherical particle fraction, and Gaussian profile-based parameterization of aerosol layer height and standard deviation [19]. Constrained by the information content in polarimetric observations, we retrieve aerosol properties by assuming a single layer, and an “effective” set of aerosol properties for the single layer that fits the polarimetric observations is derived.

With these assumptions, the correlated aerosol parameters include volume concentration as a function of five size components (

C_{v_{1 - N_{S C}}}

with

N_{SC}

= 5 as adopted in [19]), real and imaginary parts of refractive index (

n_{r}

and

n_{i}

), Gaussian distribution-based vertical profile parameterized by central height

h_{a}

and standard deviation

s_{a}

, and the volume fraction of spherical particles (

f_{v, sphere}

). Then, the parameter spaces described at the beginning of Section 3 are specified as follows,

{\bar{x}}_{corr} = \log {[\overset{{\bar{x}}_{corr, 1}^{T}}{\overset{︷}{{\bar{C}}_{v 1} \dots {\bar{C}}_{v_{N_{SC}}}}}, \overset{{\bar{x}}_{corr, 2}^{T}}{\overset{︷}{{\bar{n}}_{r, λ 1} \dots {\bar{n}}_{r, λ 7}}}, \overset{{\bar{x}}_{corr, 3}^{T}}{\overset{︷}{{\bar{n}}_{i, λ 1} \dots {\bar{n}}_{i, λ 7}}}, \overset{{\bar{x}}_{corr, 4}}{\overset{︷}{{\bar{h}}_{a}}}, \overset{{\bar{x}}_{corr, 5}}{\overset{︷}{{\bar{s}}_{a}}}, \overset{{\bar{x}}_{corr, 6}}{\overset{︷}{{\bar{f}}_{v, sphere}}}]}^{T}

(120)

v = [v_{1} v_{2} \dots v_{N_{PC}}]

(121)

with

v_{k} = {[\overset{v_{k, 1}^{T}}{\overset{︷}{v_{k, C_{v 1}} \dots v_{k, C_{v 5}}}}, \overset{v_{k, 2}^{T}}{\overset{︷}{v_{k, n_{r, λ 1}} \dots v_{k, n_{r, λ 7}}}}, \overset{v_{k, 3}^{T}}{\overset{︷}{v_{k, n_{i, λ 1}} \dots v_{k, n_{i, λ 7}}}}, \overset{v_{k, 4}}{\overset{︷}{v_{k, h_{a}}}}, \overset{v_{k, 5}}{\overset{︷}{v_{k, s_{a}}}}, \overset{v_{k, 6}}{\overset{︷}{v_{k, f_{v, sphere}}}}]}^{T}

(122)

w = [w_{1} w_{2} \dots w_{N_{pixel}}], with w_{p} = [\begin{matrix} w_{p} (1); & w_{p} (2); & \dots; & w_{p} (N_{PC}) \end{matrix}]

(123)

x_{p, uncorr} = [x_{p, uncorr, 1}; x_{p, uncorr, 2}; \dots; x_{p, uncorr, N_{TP, uncorr}}]

(124)

with

N_{TP, uncorr}

= 6 types of uncorrelated parameters, including the BRDF spectral weight

a_{λ}

(j = 1), anisotropy parameter

k_{λ}

(j = 2), anisotropy parameter

g_{λ}

(j = 3), pBRDF weight

ε_{λ}

(j = 4), shadowing width

k_{γ}

(j = 5), and slope variance

σ_{s}^{2}

(j = 6), namely,

x_{p, uncorr, j = 1} = [a_{p} (λ_{1}); a_{p} (λ_{2}); \dots; a_{p} (λ_{7})]

(125)

x_{p, uncorr, j = 2} = [k_{p} (λ_{1}); k_{p} (λ_{2}); \dots; k_{p} (λ_{7})]

(126)

x_{p, uncorr, j = 3} = [g_{p} (λ_{1}); g_{p} (λ_{2}); \dots; g_{p} (λ_{7})]

(127)

x_{p, uncorr, j = 4} = [ε_{p} (λ_{1}); ε_{p} (λ_{2}); \dots; ε_{p} (λ_{7})]

(128)

x_{p, uncorr, j = 5} = [k_{γ, p}]

(129)

x_{p, uncorr, j = 6} = [σ_{s, p}^{2}]

(130)

In Equation (120), the natural logarithm is used to ensure non-negativity of the solution after dynamic updates during the iterative optimization process. Though not noted here, the angular shape parameter “g” of the RPV model, which can take negative values, needs to be offset by a constant before being transformed into logarithmic space.
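The effect of the logarithmic transformation can be illustrated with a short sketch. The function names and the offset value of 1.0 used for the RPV parameter g are hypothetical choices for illustration; the actual offset constant is not specified here.

```python
import math


def to_log_space(value: float, offset: float = 0.0) -> float:
    """Map a parameter (plus an optional offset) to logarithmic space."""
    return math.log(value + offset)


def from_log_space(log_value: float, offset: float = 0.0) -> float:
    """Map back to physical space; value + offset is always positive."""
    return math.exp(log_value) - offset


# An arbitrarily large negative update in log space still yields a positive AOD-like value.
updated = from_log_space(to_log_space(0.3) - 5.0)

# A negative g is representable once an (assumed) offset of 1.0 is applied.
g_roundtrip = from_log_space(to_log_space(-0.2, offset=1.0), offset=1.0)
```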

An overview of the algorithm flow for the correlated multi-pixel inversion of aerosol properties and surface reflection is shown in Figure 7. As part of the state vector, the PCs of the correlated aerosol microphysical properties are derived from a training dataset from climatology or other sources. As another part of the state vector, uncorrelated parameters (such as surface reflection properties) are initialized to be static [19]. The inversion is stabilized by applying a priori constraints on smooth variations of certain aerosol and surface properties in spatial and spectral dimensions. Iterations repeat until convergence is achieved. For the retrieval tests in this study, it takes five to seven iterations for the solution to converge.

Figure 7. Algorithm flowchart for correlated multi-pixel retrieval of aerosol and land surface reflection properties. The interpretation of symbols used in the figure can be found in Table A1 of Appendix A.

As demonstrated in earlier POLDER and AirMSPI retrievals [6,19], imposition of temporal constraints on the variation of surface reflectance can improve aerosol and surface retrievals. In this case, multiple scenes acquired from revisits of a target have to be used. This functionality is not turned on in the following retrieval tests to simplify the demonstration.

5.1. AirMSPI Datasets

The retrieval algorithm is designed to retrieve column aerosol and surface reflection properties from observations by AirMSPI, which flies on NASA’s ER-2 aircraft at an altitude of 20 km and operates in eight spectral channels: 355, 380, 445, 470*, 555, 660*, 865*, and 935 nm, with the asterisk denoting polarimetric bands, in which the Stokes parameters Q and U are measured in addition to radiance I. Images of each targeted area were obtained at 9 viewing angles: 0° (nadir), ±29°, ±48°, ±59°, and ±66° in AirMSPI’s step-and-stare mode. At nadir, the imaged area covers a 10 km × 11 km region and the data are mapped to a 10-m spatial grid. Without using the water-vapor influenced band (935 nm), a total of 117 signals per pixel are used, which include 63 radiances (transformed to logarithmic space in retrieval) at nine angles and seven spectral bands, and 27 signals of q = Q/I and another 27 signals of u = U/I in the three polarimetric bands. Retrievals for all pixels of a surface area viewed from all 9 angles are performed simultaneously. The measurement errors are adopted as 4% for radiance (to account for angle-to-angle and pixel-to-pixel uncertainties) and 0.005 for DOLP.
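The count of 117 signals per pixel follows directly from the band and angle configuration described above, as the short tally below shows. The variable names are illustrative.

```python
N_ANGLES = 9
RADIANCE_BANDS_NM = [355, 380, 445, 470, 555, 660, 865]   # 935 nm excluded
POLARIMETRIC_BANDS_NM = [470, 660, 865]

n_radiance = N_ANGLES * len(RADIANCE_BANDS_NM)      # radiances I
n_q = N_ANGLES * len(POLARIMETRIC_BANDS_NM)         # q = Q/I
n_u = N_ANGLES * len(POLARIMETRIC_BANDS_NM)         # u = U/I
n_total = n_radiance + n_q + n_u

print(n_radiance, n_q, n_u, n_total)  # 63 27 27 117
```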

A wide range of atmospheric conditions and terrestrial environments have been covered by AirMSPI during more than a hundred flights in several airborne campaigns. In this paper, we use AirMSPI data from the Polarimeter Definition Experiment (PODEX) (January to February 2013), Studies of Emissions, Atmospheric Composition, Clouds, and Climate Coupling by Regional Surveys (SEAC⁴RS) (August to September 2013), CalWater-2 (January to March 2015), and Imaging Polarimetric Assessment and Characterization of Tropospheric Particulate Matter (ImPACT-PM) (July 2016) campaigns. From these, 27 AirMSPI step-and-stare data collection sequences were identified as cloud-free and collocated with AERONET sun photometers for retrieval validation. Locations of these AERONET sites and AirMSPI/AERONET measurement times can be found in a previous study [19]. To control the strength of the multiple types of constraints, the initial values of the Lagrange multipliers for the two PCs used in our retrieval are provided in Table 3, Table 4 and Table 5. Table 3 and Table 4 are used to construct the smoothness constraints over the correlated fields built from the combined set of PCs and PC weights, whereas Table 5 is used to construct the smoothness constraints over the PC weights and PC vectors separately. Table 3, Table 4 and Table 5 also list the first guesses of the relevant state vector components and the order of difference for imposing the smoothness constraints on these components.

Table 3. Initial guess of image-effective (multi-pixel mean) aerosol parameters and uncorrelated pixel-resolved surface parameters, and the order of difference and Lagrange multipliers for imposing within-pixel smoothness constraints.

Table 4. Initial guess and the order of difference and Lagrange multipliers for imposing within-pixel and across-pixel smoothness constraints on the correlated aerosol fields through the first two PCs.

Table 5. The order of difference and Lagrange multipliers for imposing within-PC constraints on the first two PC vectors and for imposing across-pixel constraint on the PC weights.

5.2. Retrieval Validation against AERONET Products

The retrieved AirMSPI AOD, SSA, size distribution, and refractive index are validated against AERONET Level 1.5 aerosol products. We choose the Level 1.5 AERONET product (cloud-screened and quality controlled), as it reports more fields to validate our retrievals. Though the Level 2.0 AERONET product (quality-assured) has higher confidence, the SSA, refractive index, and aerosol size distribution fields were not generated for all 27 test cases. However, for the fields reported by both versions, negligible differences were observed. As a first check, a retrieval is performed using AirMSPI observations acquired on September 9, 2013, over the AERONET Baskin, Louisiana, site, which is located at longitude = −91.738° and latitude = 32.282°. The left image in Figure 8a shows nadir radiance using the spectral band combination of 445, 555, and 660 nm, while the right image displays DOLP at 470, 660, and 865 nm. The retrievals are performed over the area viewed in common at all 9 AirMSPI view angles, outlined by the yellow box. Figure 8b shows TOA BRF at 445, 555, and 660 nm in the left, middle, and right panels, respectively, for the retrieval area with spatial resolution ~1.0 km. Figure 8c shows maps of retrieved AOD, SSA, and surface albedo (A_surf) at 555 nm in the left, middle, and right panels, respectively. The retrieved AOD, SSA, and volume weighted aerosol size distribution for the atmosphere above the super-pixel closest to the Baskin site are compared to the AERONET reference data in the left, middle, and right panels of Figure 8d. Good agreement (quantified below) is obtained for all of these quantities, except for the coarse particle size distribution, likely due to the lack of bands longward of 1000 nm in AirMSPI. Generally, the difference between AirMSPI retrievals of AOD, SSA, and size distribution and AERONET reference data is within their retrieval uncertainties.
The AirMSPI uncertainties plotted in Figure 8d are estimated as the root mean square of the retrieval uncertainties of these aerosol quantities and the standard deviation of their variations over the whole image. The AERONET uncertainties consist of two parts: temporal variation within the ±~1-h window centered on the AirMSPI nadir overpass time, and aerosol measurement and retrieval error [34]. The temporally closest AERONET reference data were identified for comparison with the AirMSPI retrieval at the spatially closest pixel. To account for airmass change during the measurements, the ±~1-h window centered on the AirMSPI nadir overpass time is used to calculate the AERONET uncertainty from temporal variation.

Figure 8. (a) High resolution AirMSPI nadir imagery of Baskin, Louisiana, acquired on September 9, 2013. The left image is of radiance at 445, 555, and 660 nm. The location of the Baskin AERONET site is marked. The right image displays DOLP in the three polarimetric bands (470, 660, and 865 nm). The yellow box indicates the area viewed at all 9 AirMSPI view angles, and where data were used for retrieval algorithm testing. (b) Lower-resolution imagery (~1.0 km) of the retrieval area after pixel aggregation. The left, middle, and right panels are images of BRF at 445, 555, and 660 nm, respectively. (c) Retrieved AOD, SSA, and surface albedo (A) maps at 555 nm in the left, middle, and right panels, respectively. (d) The AirMSPI retrieved AOD, SSA, and volume weighted aerosol size distribution at the pixel closest to the Baskin AERONET site, compared to the AERONET-derived values.

Figure 9 shows a comparison of retrieved pixel-resolved AODs, SSAs, and aerosol size distributions (dV/dln(r), in μm³/μm²) from the correlated multi-pixel inversion with those retrieved using the original multi-pixel algorithm adapted for AirMSPI [19]. The AOD results for seven AirMSPI spectral bands are plotted in different colors: pink (355 nm), purple (380 nm), dark blue (445 nm), light blue (470 nm), green (555 nm), red (660 nm), and brown (865 nm). Linear regression is performed to obtain slope a, intercept b, as well as the coefficient of determination R². The mean absolute difference (MAD) between the two sets of results is also calculated to measure the overall deviation. Strong correlation and low bias (R² ≥ 0.88, a ~ 0.90, b ≤ 0.05, and MAD < 0.02) are observed. It can also be observed that a variation of aerosol loading by ~30% around the mean (namely in the range 0.28 ≤

{AOD}_{445 n m}

≤ 0.40 with mean value 0.32) across the retrieval area is captured by the correlated multi-pixel inversion. Implementation of our approach using several datasets with even higher (~90%) variation of aerosol loading over several smoke scenes acquired by AirMSPI during the recent Aerosol Characterization from Polarimeter and Lidar (ACEPOL) campaign showed the algorithm to be capable of capturing this variation. The regression in Figure 9b shows correlations and low bias of SSAs (

R^{2}

> 0.40, a > 0.60, b < 0.030, and MAD ≤ 0.004) from the two inversions as well. Figure 9c shows basic consistency in the retrieved aerosol size distributions: both algorithms find the peaks of fine and coarse mode aerosol size to be around ~0.15 μm and ~2 μm, respectively. Due to the lower sensitivity of AirMSPI’s longest wavelength (865 nm) to coarse mode aerosols, some differences in coarse mode aerosol size can be observed. This indicates the impact of insufficient observational information about certain aerosol properties. Comparisons of pixel-scale AOD, SSA, and size distribution for other retrieval cases show a similar quality of agreement.
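The regression statistics quoted in this comparison (slope a, intercept b, R², and MAD) can be computed as in the sketch below. It assumes ordinary least squares with the reference retrieval on the x axis, which is one common reading of "linear regression" and not necessarily the exact procedure used here; the function name is hypothetical.

```python
from typing import Dict, Sequence


def regression_stats(x: Sequence[float], y: Sequence[float]) -> Dict[str, float]:
    """Ordinary least-squares fit y = a * x + b, with R^2 and mean absolute difference."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy * sxy / (sxx * syy)
    mad = sum(abs(yi - xi) for xi, yi in zip(x, y)) / n
    return {"a": slope, "b": intercept, "R2": r2, "MAD": mad}


# Made-up values for illustration only (not retrieval results).
reference = [0.1, 0.2, 0.3, 0.4]
candidate = [0.9 * r + 0.02 for r in reference]
stats = regression_stats(reference, candidate)
```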

Figure 9. (a) Regression of pixel-scale AOD retrieved from correlated multi-pixel inversion (CMPI) against the previous retrievals using multi-pixel inversion (MPI, [19]). The AirMSPI dataset used for retrieval is the same as in Figure 8 over the AERONET Baskin site. The results for seven AirMSPI spectral bands are plotted in different colors: pink (355 nm), purple (380 nm), dark blue (445 nm), light blue (470 nm), green (555 nm), red (660 nm), and brown (865 nm). (b) The same as Figure 9a but for SSA. (c) Comparison of image-mean volume-weighted aerosol size distribution retrieved from correlated multi-pixel inversion and multi-pixel inversion.

Figure 10 and Figure 11 compare AirMSPI and AERONET retrievals of AOD and SSA, respectively. Since the temporal variation of aerosol loading and properties is not constant, non-symmetric AERONET error bars can be observed in Figure 10 and Figure 11. Where no AERONET reference data are available before or after the AirMSPI measurement, only the AERONET measurement/retrieval error is reported. Figure 12 and Figure 13 compare real and imaginary parts of aerosol refractive index, respectively. The AERONET spectral aerosol products were linearly interpolated in wavelength to match the AirMSPI band centers. Comparisons of fine and coarse mode effective radii are shown in Figure 14. To facilitate the comparison of aerosol size, an effective radius was calculated for fine and coarse mode aerosols from AirMSPI retrievals using Equation (35) in a previous study [19]. For AOD, linear regression is performed. Values of regression related parameters (a, b, R²) and MAD are indicated in all panels. The AOD regression shows spectral-mean values of 0.91 for the coefficient of determination, 0.93 for the slope, and 0.03 for the intercept, reflecting high retrieval quality. While SSA and refractive index in Figure 11, Figure 12 and Figure 13 show relatively larger differences between the AirMSPI and AERONET retrievals, the differences are generally within their respective uncertainties, which in turn depend on AirMSPI and AERONET observation errors and the sensitivities of the respective retrieval algorithms. Figure 14 shows a maximum difference of 30% between AirMSPI and AERONET retrieved fine mode aerosol size, whereas larger differences (up to 80%) are observed in coarse mode aerosol size. As noted above, shortwave infrared spectral bands, which AirMSPI lacks, are necessary to constrain the coarse mode aerosol size.

Figure 10. Regression of AirMSPI retrieved aerosol optical depth (AOD) against AERONET measured values. The upper three panels are for 355, 445, and 470 nm, and the lower three panels are for 555, 660, and 865 nm. Linear interpolation is used to obtain AERONET AOD values at the AirMSPI wavelengths. The AERONET uncertainties are from the ±~1-h window around the time of AirMSPI overflight plus measurement uncertainties (0.01), while the AirMSPI uncertainties are the root mean square of the retrieval uncertainties and standard deviation of pixel-resolved AODs over the whole image. Linear regression analysis yields values of slope a, intercept b, coefficient of determination R², and mean absolute difference (MAD). Values of each are indicated in all panels.

Figure 11. Comparison of AirMSPI retrieved single scattering albedo (SSA) against AERONET retrievals. The upper left and right panels are for 445 and 555 nm and the lower left and right panels are for 660 and 865 nm. Linear interpolation is used to obtain AERONET SSA values at the AirMSPI wavelengths. The AirMSPI errors are computed from statistics obtained over the whole image plus the errors evaluated using the method in Section 3.3. Values of MAD are shown.

Figure 12. Same as Figure 11 but for the real part of aerosol refractive index.

Figure 13. Same as Figure 11 but for the imaginary part of aerosol refractive index.

Figure 14. Same as Figure 11 but for the effective radii of fine and coarse mode aerosols.

Table 6 summarizes MAD in several key aerosol properties from correlated multi-pixel inversion approach (this paper) and from the original multi-pixel inversion approach adapted to AirMSPI [19]. These properties include AOD, SSA, real and imaginary parts of refractive index (

n_{r}

,

n_{i}

), and effective radii of fine and coarse particles (

r_{eff, fine}

,

r_{eff, coarse}

). The bias is evaluated by taking the mean of the absolute difference between AirMSPI and AERONET retrievals at collocated pixels. The correlated multi-pixel inversion has a slightly higher MAD than the original multi-pixel retrieval, namely by ~0.01, 0.015, 0.001, 0.002, and 0.05 for AOD and SSA in visible,

n_{r}

,

n_{i}

,

r_{eff, fine}

, and

r_{eff, coarse}

, respectively. The deviations of both correlated and original multi-pixel retrievals from AERONET reference data are mostly within the retrieval uncertainties from our algorithms (see Section 3.3 and Section 4.2 of the previous study [19]) and those estimated for AERONET, as observed in Figure 10, Figure 11, Figure 12, Figure 13 and Figure 14 of this work, as well as Figure 4, Figure 5, Figure 6, Figure 7 and Figure 8 in the previous study [19].

Table 6. Mean absolute difference (MAD) in key aerosol properties from the correlated multi-pixel inversion approach using two PCs (this paper) and from the original multi-pixel inversion approach adapted to AirMSPI [19]. The MAD is calculated by taking the mean of the absolute difference between AirMSPI and AERONET retrievals at collocated pixels. A total of 27 AirMSPI datasets are used. Interpolation is used to calculate some AERONET parameters at central wavelengths of the AirMSPI spectral bands. The coefficients of determination (R²) obtained by regression against AERONET reference AOD are given in columns 3 and 5. Because the ranges of the non-AOD parameters are small and the sample size is limited, a reliable regression analysis could not be established for them, and R² is reported only for AOD in the table.

Note that, to enhance retrieval efficiency, two PCs are adopted to perform the retrieval and achieve the retrieval quality in Table 6. Including more PCs in our retrieval would capture more of the spatial variation of aerosol properties across the imaged area and improve accuracy, but at the cost of decreased retrieval efficiency. In correlated multi-pixel inversion, retrieval efficiency is gained in two ways: (a) reduction of the parameter space by retrieving PCs of correlated fields; and (b) use of a PC-based RT model. For a case study with 22 correlated aerosol parameters and 30 (assumed) uncorrelated surface parameters per super-pixel, the correlated multi-pixel inversion saves 50% of the CPU time by using two PCs. The speed gain can be greater if we further capitalize on the correlation in surface parameters, as well as the correlation between aerosol and surface parameters. It is anticipated that correlated multi-pixel inversion will save 90% of the CPU time if an image of 100 × 100 pixels is retrieved simultaneously and all 52 parameters are correlated with each other.
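The parameter-space reduction in item (a) can be illustrated with a rough count of unknowns. The sketch below is only a back-of-envelope illustration: it assumes that the multi-pixel mean field and the PC vectors are shared by all pixels, while the PC weights and the uncorrelated surface parameters are retrieved per pixel. The function name `count_unknowns` and its arguments are hypothetical. The count is not a timing model and does not reproduce the CPU savings quoted above, which also depend on the PC-based RT model.

```python
def count_unknowns(n_pixel, n_corr=22, n_uncorr=30, n_pc=2):
    """Count retrieval unknowns for two state-vector layouts (illustrative only)."""
    # Original multi-pixel layout: every parameter is retrieved in every pixel.
    original = n_pixel * (n_corr + n_uncorr)
    # Correlated layout (assumed): one shared mean field and n_pc shared PC
    # vectors, plus n_pc weights and the uncorrelated parameters per pixel.
    shared = n_corr + n_pc * n_corr
    correlated = shared + n_pixel * (n_pc + n_uncorr)
    return original, correlated


if __name__ == "__main__":
    for n in (1, 100, 10_000):
        original, correlated = count_unknowns(n)
        print(n, original, correlated, round(correlated / original, 3))
```

Under these assumptions the shared mean field and PC vectors are a fixed overhead, so the relative reduction grows with the number of pixels and with the number of parameters that are treated as correlated.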

6. Summary and Outlook

Without utilizing correlations among aerosol parameters, optimization-based retrievals are confronted with a high-dimensional parameter space. However, certain types of aerosols or certain combinations of aerosol fields generally prevail within a targeted area, and consequently some aerosol parameters are correlated with each other (in other words, have high linear dependency, as captured by PC analysis). Due to the lack of accurate physical models, however, it is hard to describe the underlying physical processes and to quantify accurately the correlations between all parameters. To mitigate the influence of model assumptions, a priori information about aerosol correlation informed by ground or other types of measurements is helpful. This motivates our development of a PC-based aerosol inversion approach that improves inversion stability and efficiency by reducing the number of retrieval parameters. The algorithm makes use of multiple types of constraints, including across-pixel smoothness constraints transformed so that they are imposed on the PCs and PC weights, a zero-sum condition on the PC weights, and mutual orthogonality and unit norm conditions on the PC vectors. Because the smoothness constraints are applied to the PCs instead of the individual aerosol parameters, the regularization remains faithful to both the aerosol correlations and the smooth spatial and spectral variation within the scene. To accelerate the multi-pixel retrieval, a PC-based RT model is developed, which capitalizes on the aerosol correlations and mutual orthogonality of the PC vectors. The retrieval methodology was tested by comparing aerosol retrievals from 27 AirMSPI datasets acquired between 2012 and 2016 with collocated aerosol reference data reported by AERONET. Mean absolute differences between AirMSPI and AERONET retrievals are found to be ~0.029 and 0.038 for AOD and SSA, respectively.
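The PC representation and its stabilizing conditions can be illustrated with a small numerical sketch. It assumes that the correlated fields of pixel p are written as the multi-pixel mean plus a weighted sum of PC vectors, as described in the abstract and Appendix C. All numbers, and the names `reconstruct` and `dot`, are hypothetical and chosen only so that the conditions hold exactly.

```python
def dot(a, b):
    """Inner product of two equally long sequences."""
    return sum(x * y for x, y in zip(a, b))


def reconstruct(mean, pcs, weights):
    """Per-pixel fields x_p = mean + sum_k w_p[k] * v_k (assumed form)."""
    return [
        [m + sum(w[k] * pcs[k][i] for k in range(len(pcs)))
         for i, m in enumerate(mean)]
        for w in weights
    ]


# Hypothetical mean of three correlated fields.
mean = [0.20, 0.95, 1.45]
# Two PC vectors with unit length and mutual orthogonality.
pcs = [[0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]
# PC weights for three pixels; each PC's weights sum to zero over the pixels.
weights = [[0.05, -0.01], [-0.02, 0.03], [-0.03, -0.02]]

fields = reconstruct(mean, pcs, weights)
# With zero-sum weights, the scene average of the fields equals the mean,
# so the PCs describe departures from the mean only.
scene_mean = [sum(f[i] for f in fields) / len(fields) for i in range(len(mean))]
```

The zero-sum condition keeps the mean field and the PC contribution from trading off against each other, which is one way these constraints remove degrees of freedom from the retrieval.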

The correlated multi-pixel inversion established in this paper relies on prior estimates of aerosol properties to generate initial guesses for the PC vectors and to calculate the a priori covariance. Potential sources of such information include AERONET climatologies, chemical transport model results, or satellite-based aerosol inversion output obtained, for example, from the MISR operational aerosol retrieval [13,14] and the near-real time POLDER aerosol retrieval [35]. Though not implemented in this study, it would be interesting to further impose on the retrieval the correlations in surface properties and the correlations between aerosol and surface properties. To derive a priori correlations in surface properties, PC analysis can be performed over a surface reflectance dataset, such as the one based on the Multi-Angle Implementation of Atmospheric Correction (MAIAC) algorithm [36,37], the ASTER spectral library [38], or the U.S. Geological Survey (USGS) digital spectral library [39]. To impose the correlation between aerosol and surface, however, one might encounter a complex surface condition over a targeted area (e.g., with co-existence of multiple types of surfaces such as soil, water, snow, forest, etc.). Under such circumstances, surface pixel classification can be performed first to identify surface types, either from an existing surface climatology or from various indices for vegetation, snow, soil, water, etc. based on their spectral differences. Then one can assign surface-type-dependent sets of PCs to pixels, each set containing a small number of PCs. Finally, these surface-type-dependent sets of PCs are retrieved simultaneously with the imposition of the constraints regarding smooth variations of relevant aerosol/surface properties in spatial, spectral and/or temporal directions. To account for dependent sets of PCs in a state vector, some modifications are necessary in formulating the smoothness constraints. Such a strategy is expected to be more efficient than directly relaxing more PCs in the retrieval to capture strong spatial variations of surface properties.

While the correlated multi-pixel approach developed here allows retrieval of both PC vectors and PC weights, another way to capture regionally limited variability in aerosol type is to use a traditional lookup table (LUT) based aerosol retrieval. This approach is implemented in some operational aerosol retrievals, e.g., those employed by the Multi-angle Imaging SpectroRadiometer (MISR) [13,14] and the Moderate Resolution Imaging Spectroradiometer (MODIS) [40], and has extremely high retrieval efficiency. With reliable estimates of aerosol properties from other sources as noted above, a “smart” LUT can be generated, which then serves as the basis for a reliable set of PC vectors. If there is high confidence in the representativeness of these PCs, the retrieval could be confined to the PC weights only, which would be faster than the combined inversion of PC vectors and PC weights. Such an approach would compensate for a weakness of the traditional LUT approach, namely that aerosol mixtures are confined to a discretized aerosol parameter space. Further testing of these ideas is planned using multi-angle satellite observations from MISR [41].

Author Contributions

F.X. formulated the correlation-based multi-pixel inversion approach and the radiative transfer model for fast multi-pixel polarized radiance modeling, and tested the approach using AirMSPI data. D.J.D. and O.D. reviewed the whole formalism and provided editorial changes. Y.S. provided discussions on imposing a correlation constraint to improve aerosol retrieval, as well as editorial suggestions.

Funding

Feng Xu and David J. Diner’s work was performed at the Jet Propulsion Laboratory (JPL), California Institute of Technology under contract with the National Aeronautics and Space Administration. Oleg Dubovik is supported by the CaPPA (Chemical and Physical Properties of the Atmosphere) project funded by the French National Research Agency through PIA (Programme d’Investissement d’Avenir) program under contract (ANR-11-LABX-0005-01), the Hauts-de-France Regional Council, and the European Funds for Regional Economic Development. Yoav Schechner is a Landau Fellow supported by the Taub Foundation. His research, supported by the US-Israel Binational Science Foundation (BSF grant 2016325), is partly conducted in the Ollendorff Minerva Center. Minerva is funded through the BMBF.

Acknowledgments

We are grateful to Carol J. Bruegge, Michael J. Garay, Gerard van Harten, Veljko M. Jovanovic, Olga V. Kalashnikova, Brian E. Rheingans, Felix C. Seidel, Irina N. Tkatcheva, and Mika Tosca of JPL for their support of AirMSPI data acquisition and calibration in multiple NASA field campaigns. We also thank Xu Liu at NASA Langley Research Center and Vijay Natraj at JPL for helpful discussions on principal component analysis to accelerate radiative transfer modeling and atmospheric remote sensing.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Symbols and Abbreviations

Table A1. List of symbols and abbreviations.

Symbol/Abbreviation	Description
a	Slope of a regression line
$\bar{a}$	Mean slope of a set of regression lines
$a_{λ}$	Spectral weight of surface BRDF
A	Coefficient for a basic smooth function shape, e.g., “g(z) = Az ² + Bz + C” for a parabola
A	Fisher matrix
AERONET	Aerosol Robotic Network
AirMSPI	Airborne Multiangle SpectroPolarimetric Imager
AOD	Aerosol optical depth
ASTER	Advanced Spaceborne Thermal Emission and Reflection Radiometer
A_surf	Surface albedo
b	Intercept of a regression line
$\bar{b}$	Mean intercept of a set of regression lines
B	Coefficient for a basic smooth function shape, e.g., “g(z) = Az ² + Bz + C” for a parabola
BRDF	Bidirectional reflectance distribution function
BRF	Bidirectional reflectance factor (BRF) associated with Stokes vector component I
c	Constant column vector that weights correlated fields in PC analysis
C	Coefficient for a basic smooth function shape, e.g., “g(z) = Az ² + Bz + C” for a parabola
$C_{i}$	Covariance matrix of i-th type of constraint
$C_{Δ x, rand}$	Covariance matrix of random errors in the measurements
C_v,sphere	Column volume concentration of spherical aerosols
C_v,tot	Total column volume concentration of aerosols
d	Differentiation array
$d_{ES}$	Earth-Sun distance
DOLP	Degree of linear polarization
dV/dln(r)	Volume weighted aerosol size distribution
$E_{0}$	Exo-atmospheric solar irradiance
$f_{i}^{*}$	Column vector of i-th type of constraint
$f_{i}$	Column vector that contains model prediction to fit i-th type of constraint
$Δ f_{i}^{*}$	Column vector that contains the errors of i-th type of constraint
f_fine	Fine mode aerosol fraction
f_v,sphere	Volume fraction of spherical aerosols
g(z)	Smooth function with variable z
$g_{λ}$	Spectral anisotropy parameter of surface BRDF
GRASP	Generalized Retrieval of Aerosol and Surface Properties
h	Cartesian coordinate in the direction h
h_a	Central height of aerosol vertical profile (constrained by Gaussian profile)
I	First Stokes vector component
I	Identity matrix
I_meas	Measured radiance
$k_{λ}$	Spectral anisotropy parameter of surface BRDF
$k_{γ}$	Shadowing width of polarized BRDF
$K$	Jacobian matrix
$K_{i}$	The Jacobian matrix associated with i-th type of constraint
L(j)	Length of j-th type of correlated parameters (fields)
m	Order of difference in constructing smoothness matrix
M	Number of types of constraints imposed on retrieval
MAD	Mean absolute difference
MAIAC	Multi-Angle Implementation of Atmospheric Correction
MISR	Multi-angle Imaging SpectroRadiometer
MODIS	Moderate Resolution Imaging Spectroradiometer
n_{r, j}	Real part of refractive index at j-th wavelength
n_{i, j}	Imaginary part of refractive index at j-th wavelength
$N_{a}$	Total number of retrieval parameters
$N_{a^{*}}$	Total number of a priori estimates of parameters
$N_{c}$	Total number of constraints imposed on retrieval
$N_{corr}$	Number of correlated parameters (fields)
$N_{f}$	Total number of observations (all pixels are accounted)
$N_{i}$	Total number of i-th type of constraint
$N_{PC}$	Total number of principal components
$N_{pixel}$	Total number of pixels
$N_{SC}$	Number of aerosol size components
$N_{TP, uncorr}$	Total number of types of uncorrelated parameters
O	Constraining matrix that reflects zero-sum of PC weights
pBRDF	Polarized BRDF
P	Probability distribution function (PDF)
PC	Principal component
PCA	Principal component analysis
POLDER	Polarization and Directionality of Earth’s Reflectance
q	Ratio of Stokes components Q and I
q	Iterative step during optimization
qBRF_TOA	Top-of-atmosphere BRF associated with Stokes vector component Q
Q	Second Stokes vector component
r	Radius of aerosol
r_eff,coarse	Effective radius of coarse mode aerosols
r_eff,fine	Effective radius of fine mode aerosols
R	Reflection matrix
R²	Coefficient of determination
$R_{atmos}$	Reflection matrix for atmosphere associated with light illumination from top of the atmosphere
$R_{atmos}^{*}$	Reflection matrix for atmosphere associated with light illumination from bottom of the atmosphere
$R_{CAS}$	Reflection matrix for the coupled atmosphere-surface system (CAS)
RPV	Rahman-Pinty-Verstraete (surface BRDF model)
$R_{surf}$	Surface reflection matrix
$R_{surf, BRDF}$	Depolarizing part of surface reflection matrix
$R_{surf, pBRDF}$	Polarizing part of surface reflection matrix
RT	Radiative transfer
s_a	Standard deviation of aerosol vertical profile (constrained by Gaussian profile)
$S_{i, m}$	Differentiation matrix of m-th order for i-th type of constraint
SSA	Single scattering albedo
t	Temporal coordinate
T	Transmission matrix
$T_{atmos}$	Transmission matrix for atmosphere associated with light illumination from top of the atmosphere
$T_{atmos}^{*}$	Transmission matrix for atmosphere associated with light illumination from bottom of the atmosphere
TOA	Top-of-atmosphere
u	Cartesian coordinate in the direction u
u	Ratio of Stokes components U and I
uBRF_TOA	Top-of-atmosphere BRF associated with Stokes vector component U
U	Third Stokes vector component
U	Constraining matrix that reflects the unit length of a PC
USGS	U.S. Geological Survey
v	Cartesian coordinate in the direction v
$v$	PC matrix whose N_PC columns are the PC vectors
vBRF_TOA	Top-of-atmosphere BRF associated with Stokes vector component V
$v_{k}$	The k-th PC vector
$v_{state}$	Column vector containing all PC vectors
w	PC weight matrix containing N_pixel column vectors containing PC weights
$w_{i}$	Weight matrix for i-th type of constraint
$w_{p}$	Column vector containing PC weights for p-th pixel
$w_{state}$	Column vector containing all PC weights
$x$	Column state vector including all retrieval parameters
$x^{a\,priori}$	A priori estimate of the state vector
${\bar{x}}_{corr}$	Column vector containing spatial and temporal mean of correlated parameters (fields)
$x_{corr, p}$	Column vector containing correlated parameters (fields) for p-th pixel
$x_{q, aer}$	The vector consisting of correlated aerosol properties – calculated from the solution at q-th iteration
$x_{q, surf}$	The vector consisting of uncorrelated surface reflection properties – contained in the solution at q-th iteration
$x^{retrieved}$	Retrieved column state vector
$Δ x_{syst}$	Systematic error in retrieval
$x_{uncorr, p}$	Column vector containing uncorrelated parameters (fields) for p-th pixel
$x^{true}$	Column state vector associated with true solution
$x_{wv}$	Column vector including PC weights and vectors
${(Δ x)}_{j}$	The retrieval error in j-th parameter
$y_{i}$	i-th observational signal
$Y^{HS}$	Output of RT calculation with high stream approximation
$Y^{LS}$	Output of RT calculation with low stream approximation
z	Variable of a smooth function
z_min	Lower bound of z
z_max	Upper bound of z
Z	Number of observations per pixel
$δ$	Kronecker delta
$δ_{s}$	Scale factor that perturbs a PC vector
$ε_{rand}$	Random error in measurements
$ε_{λ}$	Spectral weight of pBRDF
$ε_{i}^{2}$	First diagonal element of C_i
$ε_{c}^{2}$	User-specified threshold value to diagnose the convergence of optimization
$ε_{f}^{2}$	Expected variance due to measurement errors
$θ_{0}$	Solar zenith angle
$θ_{v}$	View zenith angle
$λ$	Wavelength
$μ_{0}$	Cosine of solar zenith angle
$γ_{i}$	Lagrange factor for i-th type of constraint
$ϕ_{0}$	Solar azimuthal angle
$Ψ_{i}$	Objective cost function for i-th type of constraint
$Ψ_{total}$	Overall objective cost function
$\nabla Ψ_{i}$	Gradient of the objective cost function for i-th type of constraint
$\nabla Ψ_{total}$	Gradient of the overall objective cost function
$Ω_{i}$	The smoothness matrix associated with i-th type of constraints
$Ω_{\dots}^{Ra}$	The rearranged smoothness matrix from $Ω_{\dots}$
$Ω_{uncorr}$	The smoothness matrix imposed on uncorrelated parameters (fields)
$Ω_{corr}^{\bar{x}}$	The smoothness matrix imposed on spatial and temporal mean of correlated parameters (fields)
$Ω_{corr}^{w}$	Smoothness matrix imposed on pixel resolved PC weights
$Ω_{corr}^{v}$	Smoothness matrix imposed on a PC vector
$Ω_{corr}^{wv}$	Smoothness matrix imposed on correlated parameters (fields)
$σ_{spatio - temporal, x}$	Standard deviation of a correlated field x
$σ_{e, x}$	Uncertainty estimate of a correlated field x
$σ_{s}$	Slope variance of polarized BRDF
$τ_{aer, tot}$	Total aerosol optical depth
$τ_{aer, abs}$	Total absorption aerosol optical depth
$τ_{atmos}$	Atmospheric optical depth
$Γ$	Constraining matrix that reflects the mutual orthogonality in PCs

Appendix B. Smoothness Matrix to Constrain Uncorrelated Parameter Retrieval

To explain the construction of the smoothness matrix for PC vectors and weights, we start by describing the smoothness matrix used in the original multi-pixel algorithm formulated by Dubovik et al. [6]. This appendix forms the basis for the extension to the PC-based smoothness matrices described in Appendix C and Appendix D. Two major classes of constraints are imposed on the PC retrieval: across-pixel (spatial) constraints and within-pixel (e.g., spectral) constraints. The following matrix incorporates both across-pixel and within-pixel constraints for a set of uncorrelated aerosol parameters:

γ_{uncorr} Ω_{uncorr} = γ_{uncorr, Δ} Ω_{uncorr, Δ} + (\begin{matrix} γ_{uncorr, ⋄} Ω_{uncorr, ⋄, 1} & 0 & \dots & 0 \\ 0 & γ_{uncorr, ⋄} Ω_{uncorr, ⋄, 2} & \dots & 0 \\ \dots & \dots & ⋱ & \dots \\ 0 & 0 & \dots & γ_{uncorr, ⋄} Ω_{uncorr, ⋄, N_{pixel}} \end{matrix})

(A1)

where $γ_{uncorr, Δ} Ω_{uncorr, Δ}$ is a block matrix that includes across-pixel smoothness constraints, $γ_{uncorr, ⋄} Ω_{uncorr, ⋄, p}$ is a block matrix that includes within-pixel smoothness constraints over the parameters associated with the p-th pixel, and 0 is the zero block matrix. The pixel-resolved matrices $γ_{uncorr, ⋄} Ω_{uncorr, ⋄, p}$ do not interfere with each other, so they are arranged along the diagonal of the large matrix on the right-hand side of the above equation. To facilitate the use of $γ_{uncorr} Ω_{uncorr}$ calculated via Equation (A1), the uncorrelated parameters are grouped together in the order $x_{uncorr} = [x (t_{1}); x (t_{2}); \dots; x (t_{N_{t}})]$, where $x (t_{j}) = [x (v_{1}; t_{j}); x (v_{2}; t_{j}); \dots; x (v_{N_{v}}; t_{j})]$, $x (v_{i}; t_{j}) = [x (u_{1}; v_{i}; t_{j}); x (u_{2}; v_{i}; t_{j}); \dots; x (u_{N_{u}}; v_{i}; t_{j})]$, and $x (u_{k}; v_{i}; t_{j})$ is a vector that contains the uncorrelated parameters for the pixel $(u_{k}, v_{i})$ observed at temporal point $t_{j}$. Evaluations of $γ_{uncorr, Δ} Ω_{uncorr, Δ}$ and $γ_{uncorr, ⋄} Ω_{uncorr, ⋄, p}$ are discussed in Appendix B.1 and Appendix B.2, respectively.

Appendix B.1. Across-Pixel Smoothness Matrix

The across-pixel constraints that ensure smooth variation of a parameter in space and time are expressed by,

γ_{uncorr, Δ} Ω_{uncorr, Δ} = γ_{uncorr, u} Ω_{uncorr, u} + γ_{uncorr, v} Ω_{uncorr, v} + γ_{uncorr, t} Ω_{uncorr, t}

(A2)

To simplify the explanation, consider an image of $N_{u}$ pixels along the horizontal spatial dimension u, two pixels along the horizontal spatial dimension v ($N_{v}$ = 2), and two successive measurements in time ($N_{t}$ = 2) of some parameter x. The construction of a smoothness matrix along the vertical direction is neglected for simplicity. In this case, the single-column state vector is arranged as,

x_{uncorr} = [\begin{matrix} x (u_{1}; v_{1}; t_{1}) \\ x (u_{2}; v_{1}; t_{1}) \\ x (u_{3}; v_{1}; t_{1}) \\ ⋮ \\ x (u_{N_{u}}; v_{1}; t_{1}) \\ x (u_{1}; v_{2}; t_{1}) \\ x (u_{2}; v_{2}; t_{1}) \\ x (u_{3}; v_{2}; t_{1}) \\ ⋮ \\ x (u_{N_{u}}; v_{2}; t_{1}) \\ x (u_{1}; v_{1}; t_{2}) \\ x (u_{2}; v_{1}; t_{2}) \\ x (u_{3}; v_{1}; t_{2}) \\ ⋮ \\ x (u_{N_{u}}; v_{1}; t_{2}) \\ x (u_{1}; v_{2}; t_{2}) \\ x (u_{2}; v_{2}; t_{2}) \\ x (u_{3}; v_{2}; t_{2}) \\ ⋮ \\ x (u_{N_{u}}; v_{2}; t_{2}) \end{matrix}]

(A3)
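The ordering in Equation (A3), with u varying fastest, then v, then t, corresponds to a simple flat index. The helper below is a minimal sketch of that mapping with 0-based indices; the name `flat_index` is hypothetical and x is taken to be a single scalar parameter per pixel.

```python
def flat_index(k, i, j, n_u, n_v):
    """0-based position of x(u_k; v_i; t_j) when u varies fastest, then v, then t."""
    return k + n_u * (i + n_v * j)


# Enumerate the state vector in the order of Equation (A3) for a small case.
n_u, n_v, n_t = 4, 2, 2
order = [(k, i, j) for j in range(n_t) for i in range(n_v) for k in range(n_u)]
positions = [flat_index(k, i, j, n_u, n_v) for (k, i, j) in order]
```

With this layout, neighbors in u are adjacent entries, neighbors in v are `n_u` entries apart, and neighbors in t are `n_u * n_v` entries apart, which determines where the differentiation coefficients sit in the matrices that follow.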

The corresponding smoothness matrix constraining the horizontal and temporal variation of x is given by,

γ_{uncorr, u / v / t} Ω_{uncorr, u / v / t} = {[diag (\sqrt{γ_{uncorr, u / v / t}} S_{uncorr, u / v / t}^{(m_{uncorr, u / v / t})})]}^{T} [diag (\sqrt{γ_{uncorr, u / v / t}} S_{uncorr, u / v / t}^{(m_{uncorr, u / v / t})})]

(A4)

where

γ_{uncorr, u / v / t}

controls the strength of smoothness constraint, which varies for different types of parameters. Specifically, the component that ensures the smooth variation of x in the direction u is expressed as,

\sqrt{γ_{uncorr, u}} S_{uncorr, u}^{(m_{uncorr, u})} = \sqrt{γ_{uncorr, u}} [\begin{matrix} s_{uncorr, u}^{(m_{uncorr, u})} & 0 \\ 0 & s_{uncorr, u}^{(m_{uncorr, u})} \end{matrix}]

(A5)

where

s_{uncorr, u}^{(m_{uncorr, u})} = [\begin{matrix} d_{1}^{(m_{uncorr, u})} (1) & d_{1}^{(m_{uncorr, u})} (2) & \dots & d_{1}^{(m_{uncorr, u})} (m_{uncorr, u} + 1) & 0 & \dots & 0 \\ 0 & d_{2}^{(m_{uncorr, u})} (1) & d_{2}^{(m_{uncorr, u})} (2) & \dots & d_{2}^{(m_{uncorr, u})} (m_{uncorr, u} + 1) & \dots & 0 \\ \dots & \dots & \dots & \dots & \dots & \dots & \dots \\ 0 & 0 & \dots & d_{N_{u} - m_{uncorr, u}}^{(m_{uncorr, u})} (1) & d_{N_{u} - m_{uncorr, u}}^{(m_{uncorr, u})} (2) & \dots & d_{N_{u} - m_{uncorr, u}}^{(m_{uncorr, u})} (m_{uncorr, u} + 1) \end{matrix}]

(A6)

As an example, the differentiation array for the first order of difference in the above equation is,

d_{j}^{(m_{uncorr, u} = 1)} = [\begin{matrix} \frac{1}{Δ_{1} (j)}, & - \frac{1}{Δ_{1} (j)} \end{matrix}]

(A7)

where

Δ_{1} (j)

accounts for the distance between neighboring pixels in the direction u and is evaluated by Equation (40), and for the second order of difference,

d_{j}^{(m_{uncorr, u} = 2)} = [\begin{matrix} \frac{2}{Δ_{1} (j) [Δ_{1} (j) + Δ_{1} (j + 1)]}, & - \frac{2}{Δ_{1} (j) Δ_{1} (j + 1)}, & \frac{2}{Δ_{1} (j + 1) [Δ_{1} (j) + Δ_{1} (j + 1)]} \end{matrix}]

(A8)
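Equations (A6) to (A8) can be sketched numerically as follows. The code assumes that $Δ_{1} (j)$ is simply the spacing between coordinates j and j + 1 (Equation (40) is not reproduced here), sets the Lagrange factor to 1, and uses hypothetical helper names. It is an illustration of the banded structure, not the implementation used in the retrieval.

```python
def diff_array(delta, j, order):
    """Differentiation array d_j for first or second differences, cf. (A7), (A8)."""
    if order == 1:
        return [1.0 / delta[j], -1.0 / delta[j]]
    if order == 2:
        a, b = delta[j], delta[j + 1]
        return [2.0 / (a * (a + b)), -2.0 / (a * b), 2.0 / (b * (a + b))]
    raise ValueError("only first and second differences are sketched here")


def banded_matrix(coords, order):
    """Rows of s as in (A6): one shifted differentiation array per row."""
    n = len(coords)
    # Assumption: delta[j] is the distance between neighboring coordinates.
    delta = [coords[j + 1] - coords[j] for j in range(n - 1)]
    rows = []
    for j in range(n - order):
        row = [0.0] * n
        row[j:j + order + 1] = diff_array(delta, j, order)
        rows.append(row)
    return rows


def apply(matrix, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(c * v for c, v in zip(row, x)) for row in matrix]


def gram(s):
    """Smoothness matrix s^T s with the Lagrange factor set to 1."""
    n = len(s[0])
    return [[sum(r[a] * r[b] for r in s) for b in range(n)] for a in range(n)]


coords = [0.0, 1.0, 3.0, 4.5]             # unevenly spaced pixel positions
linear = [2.0 * c + 1.0 for c in coords]  # a field that varies linearly in u
first = apply(banded_matrix(coords, 1), linear)
second = apply(banded_matrix(coords, 2), linear)
omega = gram(banded_matrix(coords, 2))
```

For a linearly varying field the first differences are constant and the second differences vanish, so a second-order constraint leaves linear trends unpenalized while a first-order constraint penalizes any departure from a constant.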

Moreover, the smoothness matrix in the direction v is,

\sqrt{γ_{uncorr, v}} S_{uncorr, v}^{(m_{uncorr, v})} = \sqrt{γ_{uncorr, v}} [\begin{matrix} s_{uncorr, v}^{(m_{uncorr, v})} & 0 \\ 0 & s_{uncorr, v}^{(m_{uncorr, v})} \end{matrix}]

(A9)

where

s_{uncorr, v}^{(m_{uncorr, v} = 1)} = [\begin{matrix} d_{1}^{(m_{uncorr, v})} (1) & 0 & \dots & 0 & d_{1}^{(m_{uncorr, v})} (2) & 0 & \dots & 0 \\ 0 & d_{1}^{(m_{uncorr, v})} (1) & \dots & 0 & 0 & d_{1}^{(m_{uncorr, v})} (2) & \dots & 0 \\ \dots & \dots & ⋱ & \dots & \dots & \dots & ⋱ & \dots \\ 0 & 0 & \dots & d_{1}^{(m_{uncorr, v})} (1) & 0 & 0 & \dots & d_{1}^{(m_{uncorr, v})} (2) \end{matrix}]

(A10)

where we assume

m_{uncorr, v} = 1

and the calculation of the differentiation array d accounts for the distance between neighboring pixels in the direction v and is evaluated by Equation (40).
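The structure of Equation (A10) follows from the ordering of Equation (A3): pixels $(u_{k}, v_{1})$ and $(u_{k}, v_{2})$ are $N_{u}$ entries apart in the state vector. The sketch below builds the first-order matrix for one time step, assuming a uniform pixel spacing `delta_v` in the direction v; the function name and the uniform spacing are illustrative assumptions.

```python
def s_v_first_order(n_u, delta_v):
    """First-order difference matrix between rows v_1 and v_2, cf. (A10)."""
    d1, d2 = 1.0 / delta_v, -1.0 / delta_v  # differentiation array as in (A7)
    rows = []
    for k in range(n_u):
        row = [0.0] * (2 * n_u)
        row[k] = d1          # coefficient for pixel (u_k, v_1)
        row[n_u + k] = d2    # coefficient for pixel (u_k, v_2)
        rows.append(row)
    return rows


# Three pixels along u; the field is 0.5 higher in row v_2 than in row v_1.
x = [1.0, 2.0, 3.0, 1.5, 2.5, 3.5]
s_v = s_v_first_order(3, 0.5)
diffs = [sum(c * v for c, v in zip(row, x)) for row in s_v]
```

Each row couples exactly two entries of the state vector, which is why the matrix consists of two diagonal bands offset by `n_u` columns.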

The smoothness matrix in the direction t is,

s_{uncorr, t}^{(m_{uncorr, t} = 1)} = [\begin{matrix} d_{1}^{(m_{uncorr, t})} (1) & 0 & \dots & 0 & d_{1}^{(m_{uncorr, t})} (2) & 0 & \dots & 0 \\ 0 & d_{1}^{(m_{uncorr, t})} (1) & \dots & 0 & 0 & d_{1}^{(m_{uncorr, t})} (2) & \dots & 0 \\ \dots & \dots & ⋱ & \dots & \dots & \dots & ⋱ & \dots \\ 0 & 0 & \dots & d_{1}^{(m_{uncorr, t})} (1) & 0 & 0 & \dots & d_{1}^{(m_{uncorr, t})} (2) \end{matrix}]

(A11)

where we assume

m_{uncorr, t} = 1

and the calculation of the differentiation array d accounts for the temporal gap between successive measurements and is evaluated by Equation (40).

Note that the above formalism applies to all uncorrelated but smoothly varying parameters. The incorporation of these smoothness matrices for different parameters into an overall matrix

γ_{uncorr, Δ} Ω_{uncorr, Δ}

is designed to account for the locations of these parameters in a retrieval state vector.

Appendix B.2. Within-Pixel Smoothness Matrix

Certain types of parameters, such as the spectral weight of the microfacet model-based pBRDF function and the spectral shape parameters in the RPV model-based BRDF function, vary smoothly with wavelength. The smooth variation of such within-pixel parameters is ensured by

γ_{uncorr, ⋄} Ω_{uncorr, ⋄} = [\begin{matrix} {(γ_{uncorr, ⋄} Ω_{uncorr, ⋄})}_{p 1} & 0 & \dots & 0 \\ 0 & {(γ_{uncorr, ⋄} Ω_{uncorr, ⋄})}_{p 2} & \dots & 0 \\ \dots & \dots & ⋱ & \dots \\ 0 & 0 & \dots & {(γ_{uncorr, ⋄} Ω_{uncorr, ⋄})}_{N_{pixel}} \end{matrix}]

(A12)

where for any pixel p,

{(γ_{uncorr, ⋄} Ω_{uncorr, ⋄})}_{p} = {[\sqrt{γ_{uncorr, ⋄}} S_{uncorr, ⋄}^{(m_{uncorr, ⋄})}]}_{p}^{T} {[W^{(m_{uncorr, ⋄})}]}^{- 1} {[\sqrt{γ_{uncorr, ⋄}} S_{uncorr, ⋄}^{(m_{uncorr, ⋄})}]}_{p}

(A13)

where

{[\sqrt{γ_{uncorr, ⋄}} S_{uncorr, ⋄}^{(m_{uncorr, ⋄})}]}_{p} = [\begin{matrix} C_{p}^{(m_{uncorr, ⋄} (1))} (1) & 0 & \dots & 0 \\ 0 & C_{p}^{(m_{uncorr, ⋄} (2))} (2) & \dots & 0 \\ \dots & \dots & ⋱ & \dots \\ 0 & 0 & \dots & C_{p}^{(m_{uncorr, ⋄} (N_{TP, uncorr}))} (N_{TP, uncorr}) \end{matrix}]

(A14)

with the following submatrix for j-th type of uncorrelated parameters,

C_{p}^{(m_{uncorr, ⋄} (j))} (j) = \sqrt{γ_{uncorr, ⋄} (j)} [\begin{matrix} d_{1}^{(m_{uncorr, ⋄} (j))} (1) & d_{1}^{(m_{uncorr, ⋄} (j))} (2) & \dots & d_{1}^{(m_{uncorr, ⋄} (j))} (m_{uncorr, ⋄} (j) + 1) & 0 & \dots & 0 \\ 0 & d_{2}^{(m_{uncorr, ⋄} (j))} (1) & d_{2}^{(m_{uncorr, ⋄} (j))} (2) & \dots & d_{2}^{(m_{uncorr, ⋄} (j))} (m_{uncorr, ⋄} (j) + 1) & \dots & 0 \\ \dots & \dots & \dots & \dots & \dots & ⋱ & \dots \\ 0 & 0 & \dots & d_{TP, uncorr (j) - m_{uncorr, ⋄} (j)}^{(m_{uncorr, ⋄} (j))} (1) & d_{TP, uncorr (j) - m_{uncorr, ⋄} (j)}^{(m_{uncorr, ⋄} (j))} (2) & \dots & d_{TP, uncorr (j) - m_{uncorr, ⋄} (j)}^{(m_{uncorr, ⋄} (j))} (m_{uncorr, ⋄} (j) + 1) \end{matrix}]

(A15)

where we assume the j-th parameter belongs to the $[{TP}_{uncorr} (j)]$-th type of uncorrelated parameters. The calculation of the differentiation array d accounts for the distance between neighboring wavelengths and proceeds as expressed in Equations (A7) and (A8).

The weight matrix in Equation (A13) is expressed as,

W^{(m_{uncorr, ⋄})} = [\begin{matrix} W^{(m_{uncorr, ⋄} (1))} (1) & 0 & 0 & \dots & 0 \\ 0 & W^{(m_{uncorr, ⋄} (2))} (2) & 0 & \dots & 0 \\ 0 & 0 & W^{(m_{uncorr, ⋄} (3))} (3) & \dots & 0 \\ \dots & \dots & \dots & ⋱ & \dots \\ 0 & 0 & 0 & \dots & W^{(m_{uncorr, ⋄} (N_{TP, uncorr}))} (N_{TP, uncorr}) \end{matrix}]

(A16)

where

W^{(m_{uncorr, ⋄} (i))} (i) = [\begin{matrix} \frac{1}{Δ_{m_{uncorr, ⋄} (i)} (1)} & 0 & \dots & 0 \\ 0 & \frac{1}{Δ_{m_{uncorr, ⋄} (i)} (2)} & \dots & 0 \\ \dots & \dots & ⋱ & \dots \\ 0 & 0 & \dots & \frac{1}{Δ_{m_{uncorr, ⋄} (i)} (L (i) - m_{uncorr, ⋄} (i))} \end{matrix}]

(A17)

where

Δ_{m} (j)

is evaluated by Equation (40).
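The weighted product in Equation (A13) can be sketched for a single type of spectrally smooth parameter. The code assumes first-order differences and takes $Δ_{1} (j)$ to be the spacing between neighboring wavelengths, so that the inverse of the weight matrix in Equation (A17) is a diagonal matrix of spacings. The wavelength values and function names are illustrative only.

```python
def within_pixel_omega(wavelengths, gamma=1.0):
    """gamma * S^T W^{-1} S for first-order differences, cf. (A13) and (A17)."""
    n = len(wavelengths)
    delta = [wavelengths[j + 1] - wavelengths[j] for j in range(n - 1)]
    omega = [[0.0] * n for _ in range(n)]
    for j, dj in enumerate(delta):
        d = (1.0 / dj, -1.0 / dj)  # first-difference array as in (A7)
        w_inv = dj                 # inverse of the (A17) diagonal weight 1/delta
        for a in range(2):
            for b in range(2):
                omega[j + a][j + b] += gamma * d[a] * w_inv * d[b]
    return omega


def quad_form(omega, x):
    """Penalty x^T Omega x contributed by the smoothness constraint."""
    n = len(x)
    return sum(x[a] * omega[a][b] * x[b] for a in range(n) for b in range(n))


# Illustrative wavelengths in micrometers and a parameter with slope 3 per micrometer.
wavelengths = [0.355, 0.380, 0.445, 0.470, 0.555, 0.660, 0.865]
x = [3.0 * lam for lam in wavelengths]
penalty = quad_form(within_pixel_omega(wavelengths), x)
```

Under these assumptions the penalty approximates the integral of the squared spectral derivative, so for a linear spectral dependence it equals the squared slope times the spectral range, regardless of how unevenly the bands are spaced.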

To generate $γ_{corr}^{\bar{x}} Ω_{corr}^{\bar{x}}$ (see Equation (50) for $Ω_{corr}^{\bar{x}}$), the smoothness matrix of the correlated fields in the column vector $\bar{x}$ containing the multi-pixel mean fields is evaluated in a way similar to that formulated for the within-pixel constraint.

Appendix C. Smoothness Matrix for Correlated Parameters

As shown in Appendix B, the original multi-pixel inversion algorithm imposes smoothness constraints directly on the aerosol fields. In the correlated multi-pixel inversion, the correlated parameters are not directly retrieved; the retrieval involves the PC vectors, PC weights, and multi-pixel mean fields. Therefore, the smoothness constraints must be imposed on PCs and PC weights. Like the implementation for uncorrelated parameter retrievals, the overall smoothness matrix includes both across-pixel (

γ_{corr, Δ}^{wv} Ω_{corr, Δ}^{wv}

) and within-pixel constraints (

γ_{corr, ⋄}^{wv} Ω_{corr, ⋄}^{wv}

), namely,

\begin{cases} γ_{corr}^{wv} Ω_{corr, 1}^{wv} = γ_{corr, Δ}^{wv} Ω_{corr, Δ, 1}^{wv} + γ_{corr, ⋄}^{wv} Ω_{corr, ⋄, 1}^{wv} \\ γ_{corr}^{wv} Ω_{corr, 2}^{wv} = γ_{corr, Δ}^{wv} Ω_{corr, Δ, 2}^{wv} + γ_{corr, ⋄}^{wv} Ω_{corr, ⋄, 2}^{wv} \end{cases}

(A18)

The explicit forms of $γ_{corr, Δ}^{wv} Ω_{corr, Δ}^{wv}$ and $γ_{corr, ⋄}^{wv} Ω_{corr, ⋄}^{wv}$ are given in Appendix C.1 and Appendix C.2, respectively.

Appendix C.1. Across-Pixel Smoothness Constraints

To simplify the discussion, the formalism given in this and the next section assumes that the smoothness constraint applied to a parameter x occurs in the dimension u. It can be easily extended to enable construction of smoothness matrices in other spatial dimensions.

To constrain across-pixel variation of a certain parameter, the

γ_{corr, Δ}^{wv} Ω_{corr, Δ}^{wv}

matrix in Equation (A18) is given by

\begin{cases} γ_{corr, Δ}^{wv} Ω_{corr, Δ, 1}^{wv} = \sum_{i = 1}^{N_{corr}} {[\sqrt{γ_{corr, Δ}^{wv}} \frac{d (S_{wv, i}^{(m_{corr, Δ}^{wv})} x_{i})}{d x_{i}}]}^{T} [\sqrt{γ_{corr, Δ}^{wv}} \frac{d (S_{wv, i}^{(m_{corr, Δ}^{wv})} x_{i})}{d x_{i}}] \\ γ_{corr, Δ}^{wv} Ω_{corr, Δ, 2}^{wv} = \sum_{i = 1}^{N_{corr}} {[\sqrt{γ_{corr, Δ}^{wv}} \frac{d (S_{wv, i}^{(m_{corr, Δ}^{wv})} x_{i})}{d x_{i}}]}^{T} [\sqrt{γ_{corr, Δ}^{wv}} S_{wv, i}^{(m_{corr, Δ}^{wv})}] \end{cases}

(A19)

where the column vector $x_{i}$ has the same length as the state vector and contains the PC weights of all pixels and the i-th elements of all PC vectors (all remaining elements of $x_{i}$ are zero), and $\sqrt{γ_{corr, Δ}^{wv}} S_{wv, i}^{(m_{corr, Δ}^{wv})}$ includes two matrix components B and C that account for the contributions by PCs and PC weights, respectively, in the following form,

\sqrt{γ_{corr, Δ}^{wv}} S_{wv, i}^{(m_{corr, Δ}^{wv})} = \begin{array}{l} [\begin{matrix} 0 & B_{w, p 1}^{(m_{corr, Δ}^{wv})} (i) \\ 0 & B_{w, p 2}^{(m_{corr, Δ}^{wv})} (i) \\ \dots & \dots \\ 0 & B_{w, N_{pixel} - m_{corr, Δ}^{wv}}^{(m_{corr, Δ}^{wv})} (i) \end{matrix} \begin{matrix} C_{p 1}^{(m_{corr, Δ}^{wv})} (1, i) & 0_{uncorr, p 1} & C_{p 1}^{(m_{corr, Δ}^{wv})} (2, i) & 0_{uncorr, p 2} & \dots & \dots & C_{p 1}^{(m_{corr, Δ}^{wv})} (m_{corr, Δ}^{wv} + 1, i) & 0_{uncorr, p 1 + m_{corr, Δ}^{wv}} & 0 & 0 & 0 \\ 0 & 0 & C_{p 2}^{(m_{corr, Δ}^{wv})} (1, i) & 0_{uncorr, p 2} & C_{p 2}^{(m_{corr, Δ}^{wv})} (2, i) & 0_{uncorr, p 3} & \dots & C_{p 2}^{(m_{corr, Δ}^{wv})} (m_{corr, Δ}^{wv} + 1, i) & 0_{uncorr, p 2 + m_{corr, Δ}^{wv}} & 0 & 0 \\ \dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots \\ 0 & \dots & \dots & 0 & C_{N_{pixel} - m_{corr, Δ}^{wv}}^{(m_{corr, Δ}^{wv})} (1, i) & 0_{uncorr, N_{pixel} - m_{corr, Δ}^{wv}} & C_{N_{pixel} - m_{corr, Δ}^{wv}}^{(m_{corr, Δ}^{wv})} (2, i) & 0_{uncorr, N_{pixel} - m_{corr, Δ}^{wv} + 1} & \dots & C_{N_{pixel} - m_{corr, Δ}^{wv}}^{(m_{corr, Δ}^{wv})} (m_{corr, Δ}^{wv} + 1, i) & 0_{uncorr, N_{pixel}} \end{matrix}] \end{array}

(A20)

where the first column of zero matrices accommodates the multi-pixel mean of correlated parameters, and the explicit form of the B matrix is expressed as,

B_{w, p}^{(m_{corr, Δ}^{wv})} (i) = [\begin{matrix} B_{p}^{(m_{corr, Δ}^{wv} (i))} (i, w_{\dots} (1)) & B_{p}^{(m_{corr, Δ}^{wv} (i))} (i, w_{\dots} (2)) & \dots & B_{p}^{(m_{corr, Δ}^{wv} (i))} (i, w_{\dots} (N_{PC})) \end{matrix}]

(A21)

with

B_{p}^{(m_{corr, Δ}^{wv} (i))} (i, w_{\dots} (k)) = \sqrt{γ_{corr, Δ}^{wv} (i)} [\begin{matrix} \sum_{m = 0}^{m_{corr, Δ}^{wv} (i)} d_{p}^{(m_{corr, Δ}^{wv} (i))} (m + 1) w_{p + m} (k) & 0 & \dots & 0 \\ 0 & \sum_{m = 0}^{m_{corr, Δ}^{wv} (i)} d_{p}^{(m_{corr, Δ}^{wv} (i))} (m + 1) w_{p + m} (k) & \dots & 0 \\ \dots & \dots & ⋱ & \dots \\ 0 & 0 & \dots & \sum_{m = 0}^{m_{corr, Δ}^{wv} (i)} d_{p}^{(m_{corr, Δ}^{wv} (i))} (m + 1) w_{p + m} (k) \end{matrix}]

(A22)

where i indexes the i-th correlated parameter.

The matrix C is a null matrix except for its i-th row, which contains fill-in values, namely

C_{p}^{(m_{corr, Δ}^{wv})} (m^{'}, i) = \sqrt{γ_{corr, Δ}^{wv} (i)} d_{p}^{(m_{corr, Δ}^{wv} (i))} (m^{'}) [\begin{matrix} 0 & 0 & \dots & 0 \\ \dots & \dots & \dots & \dots \\ v_{1} (i) & v_{2} (i) & \dots & v_{N_{PC}} (i) \\ \dots & \dots & ⋱ & \dots \\ 0 & 0 & \dots & 0 \end{matrix}]

(A23)

Moreover, Equation (59) is implemented to evaluate

d (S_{wv, i}^{(m_{corr, Δ}^{wv})} x_{i}) / d x_{i}

in Equation (A19).
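
The B and C blocks in Equations (A21)–(A23) can be read as the derivatives of a finite difference, taken across neighboring pixels, of the field reconstructed from the PCs. The following Python sketch illustrates this for a single correlated parameter i. It is an illustration under stated assumptions rather than the implementation used here: the function name `across_pixel_terms`, the array layout, and the first-order coefficients `d = [-1, 1]` are assumptions (the arrays actually used are those of Equations (A7) and (A8)), and the scaling by \sqrt{γ} is omitted.

```python
import numpy as np

def across_pixel_terms(W, V, i, d):
    """Across-pixel residuals and derivative blocks for correlated parameter i.

    W: (N_pixel, N_PC) PC weights, one row per pixel (assumed layout).
    V: (N_param, N_PC) PC vectors stored as columns (assumed layout).
    d: difference coefficients of length m + 1 (assumed uniform pixel spacing).
    """
    n_pixel, n_pc = W.shape
    m = len(d) - 1
    n_rows = n_pixel - m
    field = W @ V[i]                        # sum_k w_p(k) v_k(i) for every pixel p
    residual = np.zeros(n_rows)
    B = np.zeros((n_rows, n_pc))            # d residual_p / d v_k(i), cf. Equation (A22)
    C = np.zeros((n_rows, n_pixel, n_pc))   # d residual_p / d w_q(k), cf. Equation (A23)
    for p in range(n_rows):
        for s in range(m + 1):
            residual[p] += d[s] * field[p + s]
            B[p] += d[s] * W[p + s]
            C[p, p + s] = d[s] * V[i]
    return residual, B, C

# Hypothetical sizes: 6 pixels, 4 correlated parameters, 2 PCs.
rng = np.random.default_rng(0)
W = rng.normal(size=(6, 2))
V = rng.normal(size=(4, 2))
d = np.array([-1.0, 1.0])                   # first-order difference (assumed form)
residual, B, C = across_pixel_terms(W, V, i=2, d=d)
```

Because the reconstructed field is bilinear in the weights and the PC vectors, perturbing either argument by a small amount and differencing the residuals should reproduce the corresponding block.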

Appendix C.2. Within-Pixel Smoothness Matrix

The overall within-pixel smoothness matrix

γ_{corr, ⋄}^{wv} Ω_{corr, ⋄}^{wv}

in Equation (A18) is expressed as,

{\begin{cases} γ_{corr, ⋄}^{wv} Ω_{⋄, 1} = \sum_{i = 1}^{N_{TP, corr}} {[\sqrt{γ_{corr, ⋄}^{wv}} \frac{d (S_{wv}^{(m_{corr, ⋄}^{wv})} x_{i})}{d x_{i}}]}^{T} {[W^{(m_{corr, ⋄}^{wv})}]}^{- 1} [\sqrt{γ_{corr, ⋄}^{wv}} \frac{d (S_{wv}^{(m_{corr, ⋄}^{wv})} x_{i})}{d x_{i}}] \\ γ_{corr, ⋄}^{wv} Ω_{⋄, 2} = \sum_{i = 1}^{N_{TP, corr}} {[\sqrt{γ_{corr, ⋄}^{wv}} \frac{d (S_{wv}^{(m_{corr, ⋄}^{wv})} x_{i})}{d x_{i}}]}^{T} {[W^{(m_{corr, ⋄}^{wv})}]}^{- 1} [\sqrt{γ_{corr, ⋄}^{wv}} S_{wv}^{(m_{corr, ⋄}^{wv})}] \end{cases}

(A24)

where the column vector

x_{i}

has the same length as the state vector and contains the all-pixel PC weights and the i-th type of correlated fields included in all PC vectors (all remaining elements of

x_{i}

are zero), and the weight matrix W has a structure similar to that of Equations (A16) and (A17) (but for the i-th type of correlated parameters, and expanded to account for the number of pixels). To account for the smooth variation of a given type of correlated parameter, the contributions of PC vectors and weights are coupled in the following way,

\sqrt{γ_{corr, ⋄}^{wv}} S_{wv}^{(m_{corr, ⋄}^{wv})} = [\begin{matrix} \sqrt{γ_{corr, ⋄}^{wv}} S_{v}^{(m_{corr, ⋄}^{wv})} & \sqrt{γ_{corr, ⋄}^{wv}} S_{w}^{(m_{corr, ⋄}^{wv})} & 0 & \dots & 0 \\ \sqrt{γ_{corr, ⋄}^{wv}} S_{v}^{(m_{corr, ⋄}^{wv})} & 0 & \sqrt{γ_{corr, ⋄}^{wv}} S_{w}^{(m_{corr, ⋄}^{wv})} & \dots & 0 \\ \dots & \dots & \dots & ⋱ & \dots \\ \sqrt{γ_{corr, ⋄}^{wv}} S_{v}^{(m_{corr, ⋄}^{wv})} & 0 & \dots & \dots & \sqrt{γ_{corr, ⋄}^{wv}} S_{w}^{(m_{corr, ⋄}^{wv})} \end{matrix}]

(A25)

where

\sqrt{γ_{corr, ⋄}^{wv}} S_{v}^{(m_{corr, ⋄}^{wv})} = [\begin{matrix} \sqrt{γ_{corr, ⋄}^{wv}} S_{v}^{(m_{corr, ⋄}^{wv}, k = 1)} & \sqrt{γ_{corr, ⋄}^{wv}} S_{v}^{(m_{corr, ⋄}^{wv}, k = 2)} & \dots & \sqrt{γ_{corr, ⋄}^{wv}} S_{v}^{(m_{corr, ⋄}^{wv}, k = N_{PC})} \end{matrix}]

(A26)

with

\sqrt{γ_{corr, ⋄}^{wv}} S_{v}^{(m_{corr, ⋄}^{wv}, k)} = [\begin{matrix} \sqrt{γ_{corr, ⋄}^{wv} (1)} S_{v}^{(m_{corr, ⋄}^{wv} (1), k)} (1) & 0 & \dots & 0 \\ 0 & \sqrt{γ_{corr, ⋄}^{wv} (2)} S_{v}^{(m_{corr, ⋄}^{wv} (2), k)} (2) & \dots & 0 \\ \dots & \dots & ⋱ & \dots \\ 0 & 0 & \dots & \sqrt{γ_{corr, ⋄}^{wv} (N_{TP, corr})} S_{v}^{(m_{corr, ⋄}^{wv} (N_{TP, corr}), k)} (N_{TP, corr}) \end{matrix}]

(A27)

S_{v}^{(m_{corr, ⋄}^{wv}, k)} (j) = w_{p} (k) \times [\begin{matrix} d_{1}^{(m_{corr, ⋄}^{wv})} (1) & d_{1}^{(m_{corr, ⋄}^{wv})} (2) & \dots & d_{1}^{(m_{corr, ⋄}^{wv})} (m_{corr, ⋄}^{wv} + 1) & 0 & \dots & 0 & 0 & 0 & 0 & 0 \\ 0 & d_{2}^{(m_{corr, ⋄}^{wv})} (1) & d_{2}^{(m_{corr, ⋄}^{wv})} (2) & \dots & d_{2}^{(m_{corr, ⋄}^{wv})} (m_{corr, ⋄}^{wv} + 1) & \dots & 0 & 0 & 0 & 0 & 0 \\ \dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots \\ 0 & 0 & 0 & \dots & \dots & \dots & \dots & d_{L (j) - m_{corr, ⋄}^{wv}}^{(m_{corr, ⋄}^{wv})} (1) & d_{L (j) - m_{corr, ⋄}^{wv}}^{(m_{corr, ⋄}^{wv})} (2) & \dots & d_{L (j) - m_{corr, ⋄}^{wv}}^{(m_{corr, ⋄}^{wv})} (m_{corr, ⋄}^{wv} + 1) \end{matrix}]

(A28)

In Equation (A25),

\sqrt{γ_{corr, ⋄}^{wv}} S_{w}^{(m_{corr, ⋄}^{wv})}

is expressed as,

\sqrt{γ_{corr, ⋄}^{wv}} S_{w}^{(m_{corr, ⋄}^{wv})} = [\begin{matrix} \sqrt{γ_{corr, ⋄}^{wv}} S_{w_{1}}^{(m_{corr, ⋄}^{wv})} & \sqrt{γ_{corr, ⋄}^{wv}} S_{w_{2}}^{(m_{corr, ⋄}^{wv})} & \dots & \sqrt{γ_{corr, ⋄}^{wv}} S_{w_{N_{pixel}}}^{(m_{corr, ⋄}^{wv})} \end{matrix}]

(A29)

with

\sqrt{γ_{corr, ⋄}^{wv}} S_{w_{p}}^{(m_{corr, ⋄}^{wv})} = [\begin{matrix} \sqrt{γ_{corr, ⋄}^{wv} (1)} S_{w_{p}}^{(m_{corr, ⋄}^{wv} (1), k = 1)} (1) & \sqrt{γ_{corr, ⋄}^{wv} (1)} S_{w_{p}}^{(m_{corr, ⋄}^{wv} (1), k = 2)} (1) & \dots & \sqrt{γ_{corr, ⋄}^{wv} (1)} S_{w_{p}}^{(m_{corr, ⋄}^{wv} (1), k = N_{PC})} (1) & 0_{uncorr, p} \\ \sqrt{γ_{corr, ⋄}^{wv} (2)} S_{w_{p}}^{(m_{corr, ⋄}^{wv} (2), k = 1)} (2) & \sqrt{γ_{corr, ⋄}^{wv} (2)} S_{w_{p}}^{(m_{corr, ⋄}^{wv} (2), k = 2)} (2) & \dots & \sqrt{γ_{corr, ⋄}^{wv} (2)} S_{w_{p}}^{(m_{corr, ⋄}^{wv} (2), k = N_{PC})} (2) & 0_{uncorr, p} \\ \dots & \dots & \dots & \dots & \dots \\ \sqrt{γ_{corr, ⋄}^{wv} (N_{TP, corr})} S_{w_{p}}^{(m_{corr, ⋄}^{wv} (N_{TP, corr}), k = 1)} (N_{TP, corr}) & \sqrt{γ_{corr, ⋄}^{wv} (N_{TP, corr})} S_{w_{p}}^{(m_{corr, ⋄}^{wv} (N_{TP, corr}), k = 2)} (N_{TP, corr}) & \dots & \sqrt{γ_{corr, ⋄}^{wv} (N_{TP, corr})} S_{w_{p}}^{(m_{corr, ⋄}^{wv} (N_{TP, corr}), k = N_{PC})} (N_{TP, corr}) & 0_{uncorr, p} \end{matrix}]

(A30)

with the column vector for the j-th type of correlated parameter given by

\sqrt{γ_{corr, ⋄}^{wv} (j)} S_{w}^{(m_{corr, ⋄}^{wv} (j), k)} (j) = \sqrt{γ_{corr, ⋄}^{wv} (j)} \times [\begin{matrix} \sum_{i = 1}^{m_{corr, ⋄}^{wv} (j) + 1} d_{1}^{(m_{corr, ⋄}^{wv} (j))} (i) \times v_{k} ([\sum_{n = 1}^{j - 1} L (n)] + i) \\ \sum_{i = 1}^{m_{corr, ⋄}^{wv} (j) + 1} d_{2}^{(m_{corr, ⋄}^{wv} (j))} (i) \times v_{k} ([\sum_{n = 1}^{j - 1} L (n)] + i) \\ ⋮ \\ \sum_{i = 1}^{m_{corr, ⋄}^{wv} (j) + 1} d_{L (j) - m_{corr, ⋄}^{wv}}^{(m_{corr, ⋄}^{wv} (j))} (i) \times v_{k} ([\sum_{n = 1}^{j - 1} L (n)] + i) \end{matrix}]

(A31)

Moreover, Equation (59) is implemented to evaluate

d (S_{wv}^{(m_{corr, ⋄}^{wv})} x_{i}) / d x_{i}

in Equation (A24).
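
The within-pixel blocks of Equations (A28) and (A31) act on the segment of each PC vector that belongs to one parameter type j, located through the offset \sum_{n<j} L(n). The following Python sketch shows that slicing and the two blocks for one pixel. It is a sketch under stated assumptions: the function name `within_pixel_blocks`, the 0-based indexing, the segment lengths, and the use of `np.diff` to generate the banded difference matrix are assumptions, and the \sqrt{γ} factor and the weight matrix W are left out.

```python
import numpy as np

def within_pixel_blocks(V, w, lengths, j, order):
    """Difference blocks for parameter type j (0-based) of one pixel.

    V: (sum(lengths), N_PC) stacked PC vectors, one column per PC (assumed layout).
    w: (N_PC,) PC weights of the pixel.
    lengths: number of elements L(n) of each parameter type.
    """
    offset = sum(lengths[:j])                          # sum_{n<j} L(n)
    segment = V[offset:offset + lengths[j]]            # type-j elements of every PC
    D = np.diff(np.eye(lengths[j]), n=order, axis=0)   # banded matrix, cf. Equation (A28)
    S_v = [w[k] * D for k in range(len(w))]            # block acting on the k-th PC vector
    S_w = D @ segment                                  # column k acts on weight w(k), cf. (A31)
    return segment, D, S_v, S_w

# Hypothetical layout: 5 size components, 2 profile parameters,
# 7 spectral values each for n_r and n_i, and 1 sphericity parameter.
lengths = [5, 1, 1, 7, 7, 1]
rng = np.random.default_rng(0)
V = rng.normal(size=(sum(lengths), 2))
w = rng.normal(size=2)
segment, D, S_v, S_w = within_pixel_blocks(V, w, lengths, j=4, order=2)
residual = D @ (segment @ w)   # spectral differences of the PC-reconstructed departure
```

Under these assumptions the same residual is obtained either as `S_w @ w` or as the sum over k of `S_v[k] @ segment[:, k]`, which is why both blocks appear side by side in Equation (A25).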

Appendix D. Decoupled Smoothness Constraints

Smooth variations are often directly observed in PC weights and elements in a PC vector associated with a certain type of parameter. Direct imposition of smoothness helps stabilize the first few iterations when a priori information about the PC vectors and weights is insufficient. In an integrated form, the across-pixel and within-PC smoothness matrix is expressed as

γ_{corr}^{w / v} Ω_{corr}^{w / v} = γ_{corr}^{w} Ω_{corr}^{w} + γ_{corr}^{v} Ω_{corr}^{v}

(A32)

where

γ_{corr}^{w} Ω_{corr}^{w}

and

γ_{corr}^{v} Ω_{corr}^{v}

represent the matrix form of smoothness constraints imposed on PC weights and PC vectors, respectively. The explicit forms of

γ_{corr}^{w} Ω_{corr}^{w}

and

γ_{corr}^{v} Ω_{corr}^{v}

are given in Appendix D.1 and Appendix D.2, respectively.

Appendix D.1. Across-Pixel Smoothness Constraints on PC Weights

The smoothness constraint imposed directly on the PC weights is expressed as

γ_{corr}^{w} Ω_{corr}^{w} = \sum_{k = 1}^{N_{PC}} {[\sqrt{γ_{corr, k}^{w}} S_{w}^{(m_{corr, k}^{w})} (k)]}^{T} [\sqrt{γ_{corr, k}^{w}} S_{w}^{(m_{corr, k}^{w})} (k)]

(A33)

where

\begin{array}{l} \sqrt{γ_{corr, k}^{w}} S_{w}^{(m_{corr, k}^{w})} (k) = \\ [\begin{matrix} 0 & 0 \\ 0 & 0 \\ \dots & \dots \\ 0 & 0 \end{matrix} \begin{matrix} C_{p 1} (1, k) & 0_{uncorr, p 1} & C_{p 1} (2, k) & 0_{uncorr, p 2} & \dots & \dots & C_{p 1} (m_{corr, k}^{w} + 1, k) & 0_{uncorr, p 1 + m_{corr, k}^{w}} & 0 & \dots & 0 \\ 0 & 0 & C_{p 2} (1, k) & 0_{uncorr, p 2} & C_{p 2} (2, k) & 0_{uncorr, p 3} & \dots & C_{p 2} (m_{corr, k}^{w} + 1, k) & 0_{uncorr, p 2 + m_{corr, k}^{w}} & \dots & 0 \\ \dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots \\ 0 & \dots & \dots & 0 & C_{N_{pixel} - m_{corr, k}^{w}} (1, k) & 0_{uncorr, N_{pixel} - m_{corr, k}^{w}} & C_{N_{pixel} - m_{corr, k}^{w}} (2, k) & 0_{uncorr, N_{pixel} - m_{corr, k}^{w} + 1} & \dots & C_{N_{pixel} - m_{corr, k}^{w}} (m_{corr, k}^{w} + 1, k) & 0_{uncorr, N_{pixel}} \end{matrix}] \end{array}

(A34)

where the first two columns of zero matrices accommodate the multi-pixel mean of correlated parameters and the PC vectors, and C_p(m′, k) is calculated by

C_{p} (m^{'}, k) = d_{k, p}^{(m_{corr, k}^{w})} (m^{'}) \sqrt{γ_{corr, k}^{w}} {[\begin{matrix} 0 & 0 & \dots & 0 \\ \dots & \dots & \dots & \dots \\ δ (k, 1) & δ (k, 2) & \dots & δ (k, N_{PC}) \\ \dots & \dots & ⋱ & \dots \\ 0 & 0 & \dots & 0 \end{matrix}]}_{N_{pixel} \times N_{PC}}

(A35)

where the delta function

δ (k, k^{'})

= 1 for k = k′ and 0 for k ≠ k′, and the differentiation arrays are expressed as in Equations (A7) and (A8), except that an extra subscript “k” is added to the differentiation arrays d so that smoothness constraints of different strengths and orders of difference can be imposed on a specific (k-th) principal component.
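
As an illustration of Equations (A33)–(A35), the following Python sketch assembles the smoothness matrix for the PC weights alone. It is a minimal sketch under stated assumptions, not the implementation used here: the function names are hypothetical, the pixels are assumed to be uniformly spaced along one dimension, the weights are assumed to be stored pixel by pixel, the difference coefficients come from `np.diff` rather than from Equations (A7) and (A8), and the columns reserved for the mean, the PC vectors, and the uncorrelated parameters are omitted. The orders and multipliers in the example are the first (second) PC values listed in Table 5.

```python
import numpy as np

def difference_matrix(n, order):
    """(n - order) x n finite-difference operator for uniformly spaced samples."""
    return np.diff(np.eye(n), n=order, axis=0)

def weight_smoothness_matrix(n_pixel, orders, gammas):
    """Sum over PCs of (sqrt(gamma_k) S_k)^T (sqrt(gamma_k) S_k), cf. Equation (A33)."""
    n_pc = len(orders)
    omega = np.zeros((n_pixel * n_pc, n_pixel * n_pc))
    for k, (m, gamma) in enumerate(zip(orders, gammas)):
        select_k = np.zeros((1, n_pc))
        select_k[0, k] = 1.0                      # row of delta(k, k'), cf. Equation (A35)
        S_k = np.sqrt(gamma) * np.kron(difference_matrix(n_pixel, m), select_k)
        omega += S_k.T @ S_k
    return omega

# Hypothetical scene of 8 pixels and two PCs.
omega_w = weight_smoothness_matrix(8, orders=[1, 1], gammas=[0.05, 0.005])
weights = np.random.default_rng(0).normal(size=(8, 2))   # W[p, k]
x = weights.ravel()                                       # pixel-major state ordering
penalty = x @ omega_w @ x
```

With this ordering the quadratic form reduces to the γ-weighted sum of squared differences of each PC weight across pixels, so weights that are identical in all pixels incur no penalty.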

Appendix D.2. Within-PC Smoothness Constraints

The matrix form of the within-PC smoothness constraints imposed directly upon a PC is given by

γ_{corr}^{v} Ω_{corr}^{v} = \sum_{k = 1}^{N_{PC}} {[\sqrt{γ_{corr, k}^{v}} S_{v}^{(m_{corr, k}^{v})} (k)]}^{T} {[W^{(m_{corr, k}^{v})}]}^{- 1} [\sqrt{γ_{corr, k}^{v}} S_{v}^{(m_{corr, k}^{v})} (k)]

(A36)

where

{[\sqrt{γ_{corr, k}^{v}} S_{v}^{(m_{corr, k}^{v})} (k)]}_{v} = [\begin{matrix} 0 & T^{(m_{corr, k}^{v})} (1) & T^{(m_{corr, k}^{v})} (2) & \dots & T^{(m_{corr, k}^{v})} (N_{PC}) & 0 & 0 \end{matrix}]

(A37)

where the first zero matrix accommodates the multi-pixel mean of correlated parameters, the last two zero matrices accommodate the PC weights and uncorrelated parameters, and

T^{(m_{corr, k}^{v})} (k)

is evaluated by,

T^{(m_{corr, k}^{v})} (k) = [\begin{matrix} C_{}^{(m_{corr, k}^{v} (1), k)} (1) & 0 & \dots & 0 \\ 0 & C_{}^{(m_{corr, k}^{v} (2), k)} (2) & \dots & 0 \\ \dots & \dots & ⋱ & \dots \\ 0 & 0 & \dots & C_{}^{(m_{corr, k}^{v} (N_{TP, corr}), k)} (N_{TP, corr}) \end{matrix}]

(A38)

where

C^{(m_{corr, k}^{v} (j), k)} (j) = \sqrt{γ_{corr, k}^{v} (j)} [\begin{matrix} d_{k, 1}^{(m_{corr, k}^{v} (j))} (1) & d_{k, 1}^{(m_{corr, k}^{v} (j))} (2) & \dots & d_{k, 1}^{(m_{corr, k}^{v} (j))} (m_{corr, k}^{v} (j) + 1) & 0 & \dots & 0 \\ 0 & d_{k, 2}^{(m_{corr, k}^{v} (j))} (1) & d_{k, 2}^{(m_{corr, k}^{v} (j))} (2) & \dots & d_{k, 2}^{(m_{corr, k}^{v} (j))} (m_{corr, k}^{v} (j) + 1) & \dots & 0 \\ \dots & \dots & \dots & \dots & \dots & ⋱ & \dots \\ 0 & 0 & \dots & d_{k, L (j) - m_{corr, k}^{v} (j)}^{(m_{corr, k}^{v} (j))} (1) & d_{k, L (j) - m_{corr, k}^{v} (j)}^{(m_{corr, k}^{v} (j))} (2) & \dots & d_{k, L (j) - m_{corr, k}^{v} (j)}^{(m_{corr, k}^{v} (j))} (m_{corr, k}^{v} (j) + 1) \end{matrix}]

(A39)

where d depends on the type of correlated parameters and is expressed in Equations (A7) and (A8). It can also vary from one principal component to another. The weight matrix

W^{(m_{corr, k}^{v})}

in Equation (A36) is evaluated in the same way as for within-pixel constraints for uncorrelated parameters (see Equation (A16)).
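
The block-diagonal structure of Equations (A38) and (A39) can be sketched in Python for a single PC vector as follows. This is an illustration under stated assumptions: the function name `within_pc_operator` and the segment lengths are hypothetical, the difference coefficients are generated with `np.diff` instead of Equations (A7) and (A8), parameter types without a spectral constraint simply contribute no rows, and the weight matrix W of Equation (A36) is replaced by the identity. The orders and multipliers for n_r and n_i are the first-PC values listed in Table 5.

```python
import numpy as np

def within_pc_operator(lengths, orders, gammas):
    """Stack sqrt(gamma(j)) * D(j) blocks along the diagonal, cf. Equations (A38)-(A39).

    lengths: number of elements L(j) of each parameter type in the PC vector.
    orders, gammas: difference order and multiplier per type (None = no constraint).
    At least one parameter type must carry a constraint.
    """
    total = sum(lengths)
    rows = []
    offset = 0
    for L, m, gamma in zip(lengths, orders, gammas):
        if m is not None:
            block = np.zeros((L - m, total))
            block[:, offset:offset + L] = np.sqrt(gamma) * np.diff(np.eye(L), n=m, axis=0)
            rows.append(block)
        offset += L
    return np.vstack(rows)

# Hypothetical layout: 5 size components, h_a, s_a, n_r and n_i at 7 bands, f_sphere.
lengths = [5, 1, 1, 7, 7, 1]
orders = [None, None, None, 1, 2, None]
gammas = [None, None, None, 0.01, 0.001, None]
T = within_pc_operator(lengths, orders, gammas)
omega_v = T.T @ T            # identity used in place of the inverse weight matrix
```

In this form a PC vector whose n_r segment is spectrally flat and whose n_i segment varies linearly with band index lies in the null space of `omega_v`, while any spectral roughness in those segments is penalized.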

References

1. Phillips, D.L. A technique for the numerical solution of certain integral equations of the first kind. J. ACM 1962, 9, 84–97.
2. Twomey, S. On the numerical solution of Fredholm integral equations of the first kind by the inversion of the linear system produced by quadrature. J. ACM 1963, 10, 97–101.
3. Twomey, S. Introduction to the Mathematics of Inversion in Remote Sensing and Indirect Measurements; Elsevier: New York, NY, USA, 1977.
4. Tikhonov, A.N. On the solution of incorrectly stated problems and a method of regularization. Dokl. Akad. Nauk SSSR 1963, 151, 501–504.
5. Tikhonov, A.N.; Arsenin, V.Y. Solution of Ill-Posed Problems; Wiley: New York, NY, USA, 1977.
6. Dubovik, O.; Herman, M.; Holdak, A.; Lapyonok, T.; Tanré, D.; Deuzé, J.L.; Ducos, F.; Sinyuk, A.; Lopatin, A. Statistically optimized inversion algorithm for enhanced retrieval of aerosol properties from spectral multi-angle polarimetric satellite observations. Atmos. Meas. Tech. 2011, 4, 975–1018.
7. Liu, X.; Smith, W.L.; Zhou, D.K.; Larar, A. Principal component-based radiative transfer model for hyperspectral sensors: Theoretical concept. Appl. Opt. 2006, 45, 201–209.
8. Natraj, V.; Spurr, R.J.D. A fast linearized pseudo-spherical two orders of scattering model to account for polarization in vertically inhomogeneous scattering-absorbing media. J. Quant. Spectrosc. Radiat. Transf. 2007, 107, 263–293.
9. Spurr, R.J.D.; Natraj, V.; Lerot, C.; van Roozendael, M.; Loyola, D. Linearization of the principal component analysis method for radiative transfer acceleration: Application to retrieval algorithms and sensitivity studies. J. Quant. Spectrosc. Radiat. Transf. 2013, 125, 1–17.
10. Liu, X.; Zhou, D.K.; Larar, A.M.; Smith, W.L.; Mango, S.A. Case-study of a principal-component-based radiative transfer forward model and retrieval algorithm using EAQUATE data. Q. J. R. Meteorol. Soc. 2007, 133, 243–256.
11. Liu, X.; Zhou, D.K.; Larar, A.M.; Smith, W.L.; Schluessel, P.; Newman, S.M.; Taylor, J.P.; Wu, W. Retrieval of atmospheric profiles and cloud properties from IASI spectra using super-channels. Atmos. Chem. Phys. 2009, 9, 9121–9142.
12. Liu, X.; Liu, L.; Zhang, S.; Zhou, X. New spectral fitting method for full-spectrum solar-induced chlorophyll fluorescence retrieval based on principal components analysis. Remote Sens. 2015, 7, 10626–10645.
13. Martonchik, J.V.; Diner, D.J.; Kahn, R.A.; Verstraete, M.M.; Pinty, B.; Gordon, H.R.; Ackerman, T.P. Techniques for the retrieval of aerosol properties over land and ocean using multiangle data. IEEE Trans. Geosci. Remote Sens. 1998, 36, 1212–1227.
14. Martonchik, J.V.; Kahn, R.A.; Diner, D.J. Retrieval of aerosol properties over land using MISR observations. In Satellite Aerosol Remote Sensing over Land; Kokhanovsky, A., de Leeuw, G., Eds.; Springer: Berlin, Germany, 2009.
15. Diner, D.J.; Martonchik, J.V.; Kahn, R.A.; Pinty, B.; Gobron, N.; Nelson, D.L.; Holben, B.N. Using angular and spectral shape similarity constraints to improve MISR aerosol and surface retrievals over land. Remote Sens. Environ. 2005, 94, 155–171.
16. Dubovik, O.; Li, Z.; Mishchenko, M.I.; Tanré, D.; Karol, Y.; Bojkov, B.; Cairns, B.; Diner, D.J.; Espinosa, W.R.; Goloub, P.; et al. Polarimetric remote sensing of atmospheric aerosols: Instruments, methodologies, results, and perspectives. J. Quant. Spectrosc. Radiat. Transf. 2019, 224, 474–511.
17. Kokhanovsky, A.A. The modern aerosol retrieval algorithms based on the simultaneous measurements of the intensity and polarization of reflected solar light: A review. Front. Environ. Sci. 2015, 3, 4.
18. Xu, F.; Dubovik, O.; Zhai, P.-W.; Diner, D.J.; Kalashnikova, O.V.; Seidel, F.C.; Litvinov, P.; Bovchaliuk, A.; Garay, M.J.; van Harten, G.; et al. Joint retrieval of aerosol and water-leaving radiance from multispectral, multiangular and polarimetric measurements over ocean. Atmos. Meas. Tech. 2016, 9, 2877–2907.
19. Xu, F.; van Harten, G.; Diner, D.J.; Kalashnikova, O.V.; Seidel, F.C.; Bruegge, C.J.; Dubovik, O. Coupled retrieval of aerosol properties and land surface reflection using the Airborne Multiangle SpectroPolarimetric Imager. J. Geophys. Res. Atmos. 2017, 122, 7004–7026.
20. Cahalan, R.F.; Ridgway, W.; Wiscombe, W.J.; Bell, T.L.; Snider, J.B. The albedo of fractal stratocumulus clouds. J. Atmos. Sci. 1994, 51, 2434–2455.
21. Hou, W.; Wang, J.; Xu, X.; Reid, J.S.; Han, D. An algorithm for hyperspectral remote sensing of aerosols: 1. Development of theoretical framework. J. Quant. Spectrosc. Radiat. Transf. 2016, 178, 400–415.
22. Hou, W.; Wang, J.; Xu, X.; Reid, J.S. An algorithm for hyperspectral remote sensing of aerosols: 2. Information content analysis for aerosol parameters and principal components of surface spectra. J. Quant. Spectrosc. Radiat. Transf. 2017, 192, 14–29.
23. Dubovik, O. (University of Lille); Xu, F. (Jet Propulsion Laboratory). Personal communication, 2018.
24. Dubovik, O. Optimization of numerical inversion in photopolarimetric remote sensing. In Photopolarimetry in Remote Sensing; Videen, G., Yatskiv, Y., Mishchenko, M., Eds.; Kluwer Academic Publishers: Dordrecht, The Netherlands, 2004; pp. 65–106.
25. Dubovik, O.; King, M.D. A flexible inversion algorithm for retrieval of aerosol optical properties from Sun and sky radiance measurements. J. Geophys. Res. 2000, 105, 673–696.
26. Levenberg, K. A method for the solution of certain non-linear problems in Least Squares. Quart. Appl. Math. 1944, 2, 164–168.
27. Marquardt, D. An algorithm for least-squares estimation of nonlinear parameters. SIAM J. Appl. Math. 1963, 11, 431–441.
28. Xu, F.; Davis, A.B.; West, R.A.; Esposito, L.W. Markov chain formalism for polarized light transfer in plane-parallel atmospheres, with numerical comparison to the Monte Carlo method. Opt. Express 2010, 19, 946–967.
29. Lacis, A.A.; Hansen, J.E. A parameterization for the absorption of solar radiation in the Earth’s atmosphere. J. Atmos. Sci. 1974, 31, 118–133.
30. Dubovik, O.; Sinyuk, A.; Lapyonok, T.; Holben, B.N.; Mishchenko, M.; Yang, P.; Eck, T.F.; Volten, H.; Muñoz, O.; Veihelmann, B.; et al. Application of spheroid models to account for aerosol particle nonsphericity in remote sensing of desert dust. J. Geophys. Res. 2006, 111, D11208.
31. Rahman, H.; Pinty, B.; Verstraete, M.M. Coupled surface-atmosphere reflectance (CSAR) model 2. Semiempirical surface model usable with NOAA advanced very high resolution radiometer data. J. Geophys. Res. 1993, 98, 20791–20801.
32. Litvinov, P.; Hasekamp, O.; Cairns, B. Models for surface reflection of radiance and polarized radiance: Comparison with airborne multi-angle photopolarimetric measurements and implications for modeling top-of-atmosphere measurements. Remote Sens. Environ. 2011, 115, 781–792.
33. Van Harten, G.; Diner, D.J.; Daugherty, B.J.S.; Rheingans, B.E.; Bull, M.A.; Seidel, F.C.; Chipman, R.A.; Cairns, B.; Wasilewski, A.P.; Knobelspiesse, K.D. Calibration and validation of airborne multiangle spectropolarimetric imager (AirMSPI) polarization measurements. Appl. Opt. 2018, 57, 4499–4513.
34. Dubovik, O.; Smirnov, A.; Holben, B.N.; King, M.D.; Kaufman, Y.J.; Eck, T.F.; Slutsker, I. Accuracy assessment of aerosol optical properties retrieval from AERONET sun and sky radiance measurements. J. Geophys. Res. 2000, 105, 9791–9806.
35. Dubovik, O.; Lapyonok, T.; Litvinov, P.; Herman, M.; Fuertes, D.; Ducos, F.; Lopatin, A.; Chaikovsky, A.; Torres, B.; Derimian, Y.; et al. GRASP: A versatile algorithm for characterizing the atmosphere. SPIE Newsroom 2014, 25.
36. Lyapustin, A.; Martonchik, J.; Wang, Y.; Laszlo, I.; Korkin, S. Multiangle implementation of atmospheric correction (MAIAC): 1. Radiative transfer basis and look-up tables. J. Geophys. Res. Atmos. 2011, 116, D03210.
37. Lyapustin, A.; Wang, Y.; Laszlo, I.; Kahn, R.; Korkin, S.; Remer, L.; Levy, R.; Reid, J.S. Multiangle implementation of atmospheric correction (MAIAC): 2. Aerosol algorithm. J. Geophys. Res. Atmos. 2011, 116, D03211.
38. Baldridge, A.M.; Hook, S.J.; Grove, C.I.; Rivera, G. The ASTER spectral library version 2.0. Remote Sens. Environ. 2009, 113, 711–715.
39. Clark, R.N.; Swayze, G.A.; Wise, R.; Livo, E.; Hoefen, T.; Kokaly, R.; Sutley, S.J. USGS digital spectral library splib06a. USA Geol. Surv. Digit. Data Ser. 2007, 231, 2007.
40. Levy, R.C.; Remer, L.A.; Mattoo, S.; Vermote, E.F.; Kaufman, Y.J. Second-generation operational algorithm: Retrieval of aerosol properties over land from inversion of moderate resolution imaging spectroradiometer spectral reflectance. J. Geophys. Res. 2007, 112, D13211.
41. Diner, D.J.; Beckert, J.C.; Reilly, T.H.; Bruegge, C.J.; Conel, J.E.; Kahn, R.; Martonchik, J.V.; Ackerman, T.P.; Davies, R.; Gerstl, S.A.W.; et al. Multiangle imaging spectro-radiometer (MISR) description and experiment overview. IEEE Trans. Geosci. Remote Sens. 1998, 36, 1072–1087.

Figure 1. (a) Top left panel: AERONET inversion of aerosol fields and properties (in natural logarithm space) including total volume concentration (

C_{v, tot}

), fraction of fine mode aerosols (

f_{fine}

), effective radii of fine (

r_{eff, fine}

) and coarse (

r_{eff, coarse}

) mode aerosols, column effective real part of refractive indices (

n_{r, 1 - 4}

) at 0.439, 0.675, 0.870, and 1.018 μm, respectively, imaginary part of refractive indices (

n_{i, 1 - 4}

), and spherical particle volume concentration (

C_{v, sphere}

). Although no legend is shown, each color is associated with an independent set of retrieval parameters. Top right panel: spatial and temporal mean of all retrievals and the first four principal components (PCs). Bottom left panel: percentage variance of the aerosol fields captured by PCs, indicating that 41%, 62%, 77%, and 85% of the variance is captured by the first one, two, three, and four PCs, respectively. Bottom right panel: aerosol optical depth (AOD) and single scattering albedo (SSA) at 440 nm reported in the AERONET retrievals analyzed here. (b) Regression of the aerosol properties derived from four PCs against the input AERONET values. The scatter plot is colored by the density of the points. Each panel corresponds to the aerosol property indicated in its title.

Figure 2. Same as Figure 1, but with the analysis performed for a circular domain of 2000 km around the AERONET site in Namibe, Angola. The first four PCs capture 56%, 74%, 83%, and 90% of the variance of the aerosol fields.

Figure 3. General structure of the correlated multi-pixel inversion approach. The interpretation of symbols used in the figure can be found in Table A1 of Appendix A.

Figure 4. Depiction of the coupled atmosphere-surface (CAS) system model. The sun illuminates the top-of-atmosphere with solar zenith angle

θ_{0}

and azimuthal plane

ϕ_{0}

. The sensor views the atmosphere at view zenith angle

θ_{v}

and azimuthal angle

ϕ_{v}

. A Gaussian vertical distribution profile is assumed for aerosols. The Markov chain model is used for computing polarized RT in the atmosphere. It is then coupled with surface reflection using an adding method.

Figure 5. The scheme of PC-based forward radiative transfer modeling of remote sensing observations from an airborne or spaceborne sensor. The interpretation of symbols used in the figure can be found in Table A1 of Appendix A.

Figure 6. (a) Pixel-resolved spectral aerosol optical depth (AOD) used in simulating the radiance and degree of linear polarization (DOLP) in (b). The AODs for seven Airborne Multiangle SpectroPolarimetric Imager (AirMSPI) spectral bands are plotted in different colors: pink (355 nm), purple (380 nm), dark blue (445 nm), light blue (470 nm), green (555 nm), red (660 nm), and brown (865 nm). (b) The principal-component-based RT (PC-RT) simulation of radiance using three PCs (upper left panel) and DOLP (lower left panel) at AirMSPI’s nine viewing angles within the (−66°, +66°) range around nadir, and the relative error of the PC-based RT simulation of radiance (top right) and the absolute error of DOLP (bottom right) compared to direct pixel-by-pixel RT computation. Errors are estimated by 100 × (Y_PC-RT − Y_Direct-RT)/Y_Direct-RT for Y = radiance and by (Y_PC-RT − Y_Direct-RT) for Y = DOLP. The viewing geometries are adopted from one of the scenes acquired by AirMSPI during the ACEPOL field campaign. The unpolarized and polarized surface reflectances are calculated from the retrieved BRDF and pBRDF parameters, respectively.

Figure 7. Algorithm flowchart for correlated multi-pixel retrieval of aerosol and land surface reflection properties. The interpretation of symbols used in the figure can be found in Table A1 of Appendix A.

Figure 8. (a) High resolution AirMSPI nadir imagery of Baskin, Louisiana, acquired on September 9, 2013. The left image is of radiance at 445, 555, and 660 nm. The location of the Baskin AERONET site is marked. The right image displays DOLP in the three polarimetric bands (470, 660, and 865 nm). The yellow box indicates the area viewed at all 9 AirMSPI view angles, and where data were used for retrieval algorithm testing. (b) Lower-resolution imagery (~1.0 km) of the retrieval area after pixel aggregation. The left, middle, and right panels are images of BRF at 445, 555, and 660 nm, respectively. (c) Retrieved AOD, SSA, and surface albedo (A) maps at 555 nm in the left, middle, and right panels, respectively. (d) The AirMSPI retrieved AOD, SSA, and volume weighted aerosol size distribution at the pixel closest to the Baskin AERONET site, compared to the AERONET-derived values.

Figure 9. (a) Regression of pixel-scale AOD retrieved from correlated multi-pixel inversion (CMPI) against the previous retrievals using multi-pixel inversion (MPI, [19]). The AirMSPI dataset used for retrieval is the same as in Figure 8 over the AERONET Baskin site. The results for seven AirMSPI spectral bands are plotted in different colors: pink (355 nm), purple (380 nm), dark blue (445 nm), light blue (470 nm), green (555 nm), red (660 nm), and brown (865 nm). (b) The same as Figure 9a but for SSA. (c) Comparison of image-mean volume-weighted aerosol size distribution retrieved from correlated multi-pixel inversion and multi-pixel inversion.

Figure 10. Regression of AirMSPI retrieved aerosol optical depth (AOD) against AERONET measured values. The upper three panels are for 355, 445, and 470 nm, and the lower three panels are for 555, 660, and 865 nm. Linear interpolation is used to obtain AERONET AOD values at the AirMSPI wavelengths. The AERONET uncertainties are from the ±~1-h window around the time of AirMSPI overflight plus measurement uncertainties (0.01), while the AirMSPI uncertainties are the root mean square of the retrieval uncertainties and standard deviation of pixel-resolved AODs over the whole image. Linear regression analysis yields values of slope a, intercept b, coefficient of determination R², and mean absolute difference (MAD). Values of each are indicated in all panels.

Figure 11. Comparison of AirMSPI retrieved single scattering albedo (SSA) against AERONET retrievals. The upper left and right panels are for 445 and 555 nm and the lower left and right panels are for 660 and 865 nm. Linear interpolation is used to obtain AERONET SSA values at the AirMSPI wavelengths. The AirMSPI errors are computed from statistics obtained over the whole image plus the errors evaluated using the method in Section 3.3. Values of MAD are shown.

Figure 12. Same as Figure 11 but for the real part of aerosol refractive index.

Figure 13. Same as Figure 11 but for the imaginary part of aerosol refractive index.

Figure 14. Same as Figure 11 but for the effective radii of fine and coarse mode aerosols.

Table 1. Correlation coefficients of AERONET-retrieved aerosol properties analyzed in Figure 1 (for a 2000 km circular domain around Fresno, California). The correlation is calculated for all parameters in logarithmic space.

	C_v,tot	f_fine	r_eff,fine	r_eff,coarse	n_r,1	n_r,2	n_r,3	n_r,4	n_i,1	n_i,2	n_i,3	n_i,4	C_v,sphere
C_v,tot	1.00	−0.34	−0.41	0.08	−0.12	−0.01	0.01	0.02	−0.01	−0.16	−0.23	−0.25	0.06
f_fine	-	1.00	0.18	0.30	−0.34	−0.36	−0.34	−0.35	0.22	0.32	0.31	0.30	0.67
r_eff,fine	-	-	1.00	−0.01	−0.17	−0.35	−0.41	−0.43	−0.16	−0.08	−0.01	−0.01	−0.06
r_eff,coarse	-	-	-	1.00	−0.01	−0.01	−0.01	−0.01	0.25	0.23	0.21	0.20	0.25
n_r,1	-	-	-	-	1.00	0.94	0.88	0.84	0.33	0.31	0.28	0.28	−0.29
n_r,2	-	-	-	-	-	1.00	0.98	0.96	0.28	0.22	0.18	0.17	−0.28
n_r,3	-	-	-	-	-	-	1.00	0.99	0.24	0.16	0.12	0.11	−0.26
n_r,4	-	-	-	-	-	-	-	1.00	0.21	0.12	0.08	0.08	−0.28
n_i,1	-	-	-	-	-	-	-	-	1.00	0.95	0.90	0.87	0.23
n_i,2	-	-	-	-	-	-	-	-	-	1.00	0.98	0.96	0.29
n_i,3	-	-	-	-	-	-	-	-	-	-	1.00	0.99	0.25
n_i,4	-	-	-	-	-	-	-	-	-	-	-	1.00	0.23
C_v,sphere	-	-	-	-	-	-	-	-	-	-	-	-	1.00

Table 2. Correlation coefficients of AERONET-retrieved aerosol properties analyzed in Figure 2 (for a 2000 km circular domain around Namibe, Angola).

	C_v,tot	f_fine	r_eff,fine	r_eff,coarse	n_r,1	n_r,2	n_r,3	n_r,4	n_i,1	n_i,2	n_i,3	n_i,4	C_v,sphere
C_v,tot	1.00	−0.32	−0.37	−0.05	−0.43	−0.41	−0.42	−0.40	−0.52	−0.54	−0.54	−0.54	0.39
f_fine	-	1.00	0.23	0.42	−0.12	−0.06	0.02	0.09	0.62	0.59	0.57	0.56	0.45
r_eff,fine	-	-	1.00	−0.07	0.20	0.18	0.17	0.14	0.26	0.28	0.27	0.26	−0.01
r_eff,coarse	-	-	-	1.00	0.14	0.19	0.25	0.33	0.36	0.41	0.41	0.42	0.26
n_r,1	-	-	-	-	1.00	0.97	0.92	0.86	0.51	0.58	0.58	0.58	−0.31
n_r,2	-	-	-	-	-	1.00	0.98	0.94	0.57	0.61	0.61	0.61	−0.28
n_r,3	-	-	-	-	-	-	1.00	0.98	0.61	0.63	0.63	0.62	−0.22
n_r,4	-	-	-	-	-	-	-	1.00	0.62	0.63	0.63	0.62	−0.17
n_i,1	-	-	-	-	-	-	-	-	1.00	0.96	0.94	0.92	0.19
n_i,2	-	-	-	-	-	-	-	-	-	1.00	0.99	0.99	0.12
n_i,3	-	-	-	-	-	-	-	-	-	-	1.00	1.00	0.09
n_i,4	-	-	-	-	-	-	-	-	-	-	-	1.00	0.08
C_v,sphere	-	-	-	-	-	-	-	-	-	-	-	-	1.00

Table 3. Initial guess of image-effective (multi-pixel mean) aerosol parameters and uncorrelated pixel-resolved surface parameters, and the order of difference and Lagrange multipliers for imposing within-pixel smoothness constraints.

	Range	First Guess	Order of Finite Difference for Spectral Smoothness Constraints $(m_{uncorr, ⋄})$	Lagrange Regularization Factor $(γ_{uncorr, ⋄})$
Aerosol parameters (scene-averaged)
Volume concentration of size components (C_{v, 1-5}, μm³/μm²)	[1.0 × 10⁻⁶,5]	0.002	-	-
Central height of aerosol distribution profile (h_a, km)	[0.05,10]	1	-	-
Standard deviation of aerosol distribution profile (s_a)	[0.5,2.5]	0.75	-	-
Real part of refractive index (n_r(λ))	[1.33,1.60]	1.50	1	0.1
Imaginary part of refractive index (n_i(λ))	[5.0 × 10⁻⁷,5.0 × 10⁻¹]	0.005	2	0.01
Spherical particle volume fraction (f_v,sphere)	[0.5,1.0]	0.95	-	-
Surface parameters (pixel-resolved)
BRDF spectral weight (a_λ)	[0,0.7]	0.015–0.1	3	0.1
Anisotropy parameter (k_λ)	[0,1]	0.6	1	0.5
Anisotropy parameter (g_λ)	[−1,1]	0.1	1	0.5
pBRDF weight (ε_λ)	[0,10]	0.01	-	-
Shadowing width (k_γ)	[0,1]	0.75	1	0.1
Slope variance (σ_s )	[0.05,0.5]	0.075	-	-

Table 4. Initial guess and the order of difference and Lagrange multipliers for imposing within-pixel and across-pixel smoothness constraints on the correlated aerosol fields through the first two PCs.

	Initial Guess (PC 1)	Range (PC 1)	Initial Guess (PC 2)	Range (PC 2)	Order of Finite Difference for Spectral Smoothness Constraints $(m_{corr, ⋄}^{wv})$	Lagrange Multiplier $(γ_{corr, ⋄}^{wv})$	Order of Finite Difference for Spatial Smoothness Constraints $(m_{corr, Δ}^{wv})$	Lagrange Multiplier $(γ_{corr, Δ}^{wv})$
log(C_{v, 1-5})	0.1–0.6	[−0.75,+0.75]	−3 × 10⁻¹–7 × 10⁻¹	[−1,+1]	-	-	1	1
log(h_a)	≈−1 × 10⁻¹	[−0.9,+0.9]	≈5 × 10⁻¹	[−1,+1]	-	-	1	0.01
log(s_a)	≈−2 × 10⁻²	[−0.4,+0.4]	≈7 × 10⁻²	[−0.4,+0.4]	-	-	1	0.01
log(n_r(λ))	≈3 × 10⁻³	[−0.1,+0.1]	≈5 × 10⁻³	[−0.1,+0.1]	1	0.1	1	10
log(n_i(λ))	≈−2 × 10⁻²	[−0.1,+0.1]	≈2 × 10⁻²	[−0.1,+0.1]	2	0.01	1	1
log(f_v,sphere)	≈6 × 10⁻³	[−0.05,+0.05]	≈4 × 10⁻³	[−0.05,+0.05]	-	-	1	0.1

Table 5. The order of difference and Lagrange multipliers for imposing within-PC constraints on the first two PC vectors and for imposing across-pixel constraints on the PC weights.

	Initial Guess	Range (in log-space)	Order of Finite Difference for Spectral Smoothness Constraints on First (Second) PC Vectors $(m_{corr, ⋄}^{v})$	Lagrange Multiplier on First (Second) PC Vectors $(γ_{corr, ⋄}^{v})$	Order of Finite Difference for Spatial Smoothness Constraints on First (Second) PC Weights $(m_{corr, Δ}^{w})$	Lagrange Multiplier on PC Weights $(γ_{corr, Δ}^{w})$
log(C_{v, 1-5})	−0.3–0.7	[−1,+1]	-	-	-	-
log(h_a)	≈5 × 10⁻¹	[−1,+1]	-	-	-	-
log(s_a)	≈7 × 10⁻²	[−0.4,+0.4]	-	-	-	-
log(n_r(λ))	≈5 × 10⁻³	[−0.1,+0.1]	1(1)	0.01(0.001)	-	-
log(n_i(λ))	≈2.5 × 10⁻²	[−0.01,+0.01]	2(2)	0.001(0.00001)	-	-
log(f_v,sphere)	≈4 × 10⁻³	[−0.05,+0.05]	-	-	1(1)	0.05(0.005)
w_p	0	[−10,10]	-	-	1(1)	0.05(0.005)

Table 6. Mean absolute difference (MAD) in key aerosol properties from the correlated multi-pixel inversion approach using two PCs (this paper) and from the original multi-pixel inversion approach adapted to AirMSPI [19]. The MAD is the mean of the absolute difference between AirMSPI and AERONET retrievals at collocated pixels. A total of 27 AirMSPI datasets are used. Some AERONET parameters are interpolated to the central wavelengths of the AirMSPI spectral bands. The coefficients of determination (R²) obtained by regression against the AERONET reference AOD are given in columns 3 and 5. For the non-AOD parameters, the value ranges are too narrow and the sample size too limited to establish a reliable regression analysis, so R² is reported only for AOD.

Parameters	MAD (Correlated Multi-pixel Inversion)	R² (Correlated Multi-pixel Inversion)	MAD (Original Multi-pixel Inversion)	R² (Original Multi-pixel Inversion)
AOD_{355 nm}	0.058	0.917	0.040	0.925
AOD_{445 nm}	0.035	0.927	0.024	0.955
AOD_{470 nm}	0.030	0.933	0.021	0.960
AOD_{555 nm}	0.020	0.933	0.016	0.960
AOD_{660 nm}	0.016	0.922	0.013	0.959
AOD_{865 nm}	0.015	0.815	0.014	0.851
SSA_{445 nm}	0.035	-	0.030	-
SSA_{555 nm}	0.036	-	0.030	-
SSA_{660 nm}	0.040	-	0.032	-
SSA_{865 nm}	0.041	-	0.035	-
n_{r,445 nm}	0.052	-	0.039	-
n_{r,555 nm}	0.051	-	0.039	-
n_{r,660 nm}	0.046	-	0.036	-
n_{r,865 nm}	0.038	-	0.037	-
n_{i,445 nm}	0.004	-	0.004	-
n_{i,555 nm}	0.004	-	0.004	-
n_{i,660 nm}	0.005	-	0.004	-
n_{i,865 nm}	0.005	-	0.005	-
r_eff,fine (μm)	0.024	-	0.022	-
r_eff,coarse (μm)	1.050	-	0.993	-
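
The two statistics in the table can be sketched as follows. This is a minimal illustration that assumes an ordinary linear least-squares regression of the retrieved values against the reference values (the regression type is an assumption, as it is not restated here); the function names and the AOD numbers are made up for illustration and are not data from the paper.

```python
import numpy as np

def mean_absolute_difference(retrieved, reference):
    """Mean of |retrieved - reference| over collocated samples."""
    retrieved = np.asarray(retrieved, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.mean(np.abs(retrieved - reference)))

def r_squared(retrieved, reference):
    """Coefficient of determination of a linear fit of retrieved vs. reference."""
    retrieved = np.asarray(retrieved, dtype=float)
    reference = np.asarray(reference, dtype=float)
    slope, intercept = np.polyfit(reference, retrieved, 1)
    residual = retrieved - (slope * reference + intercept)
    ss_res = np.sum(residual ** 2)
    ss_tot = np.sum((retrieved - retrieved.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Made-up AOD values for illustration only; not data from the paper.
aeronet_aod = np.array([0.05, 0.10, 0.20, 0.40])
airmspi_aod = np.array([0.07, 0.09, 0.23, 0.38])

mad = mean_absolute_difference(airmspi_aod, aeronet_aod)
r2 = r_squared(airmspi_aod, aeronet_aod)
```

With only a handful of samples spanning a narrow range, `ss_tot` becomes small and R² becomes unstable, which is consistent with the caption's choice to report R² for AOD only.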

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
