Multi-View Polarimetric Scattering Cloud Tomography and Retrieval of Droplet Size

Tomography aims to recover a three-dimensional (3D) density map of a medium or an object. In medical imaging, it is extensively used for diagnostics via X-ray computed tomography (CT). Optical diffusion tomography is an alternative to X-ray CT that uses multiply scattered light to deliver coarse density maps for soft tissues. We define and derive tomography of cloud droplet distributions via passive remote sensing. We use multi-view polarimetric images to fit a 3D polarized radiative transfer (RT) forward model. Our motivation is 3D volumetric probing of vertically-developed convectively-driven clouds that are ill-served by current methods in operational passive remote sensing. These techniques are based on strictly 1D RT modeling and applied to a single cloudy pixel, where cloud geometry is assumed to be that of a plane-parallel slab. Incident unpolarized sunlight, once scattered by cloud-droplets, changes its polarization state according to droplet size. Therefore, polarimetric measurements in the rainbow and glory angular regions can be used to infer the droplet size distribution. This work defines and derives a framework for a full 3D tomography of cloud droplets for both their mass concentration in space and their distribution across a range of sizes. This 3D retrieval of key microphysical properties is made tractable by our novel approach that involves a restructuring and differentiation of an open-source polarized 3D RT code to accommodate a special two-step optimization technique. Physically-realistic synthetic clouds are used to demonstrate the methodology with rigorous uncertainty quantification.


Why polarized light?
There is an additional caveat in common retrievals, which rely on SWIR absorption [9]. In addition to absorption, light undergoes multiple scattering in clouds. Multiple scattering diminishes sensitivity to droplet microphysics. High sensitivity to microphysics is embedded in single-scattering events. It is thus beneficial to pick-up single-scatter signals, out of the strong multiply-scattered background radiance. Polarization signals of scattered light are dominated by single-scattering events, and are thus highly sensitive to the type and size specifications of scatters. Thus in recent years, there is growing interest in polarimetric imagers for remote sensing of clouds and aerosols [10,11,12,13,14,15]. In turn, increased interest in polarimetric sensing capabilities has led to the development of 1D and 3D polarized (or "vector") RT codes [16,17] with an aim of improving retrieval algorithms. Motivated by the CloudCT mission formulation-only the first of many to come in innovative passive cloud remote sensing-we develop herein a novel framework for 3D remote sensing of cloud properties using multi-view polarimetric measurements.

Why passive tomography?
From its etymology, the word "tomography" means a slice-by-slice recovery of an object's 3D internal structure using 2D projections of cumulative density. In the computer age, this task is termed Computed Tomography (CT) [18]. Common medical CT approaches are transmission-based X-ray CT or single-photon emission computed tomography (SPECT). There, 2D projections represent straight line-of-sight (LOS) integrals of the local X-ray opacity or nuclear marker density, respectively. In both imaging modalities, the inverse problem of recovering the medium content is linear [19].
Biomedical imaging also involves CT modalities which are not based on linear projections. A prime example is Optical Diffusion Tomography (ODT) [20,21,22], which uses non-ionizing near-infrared light. It is worth noting the work by Che et al. [23] which departs from physics-based approaches into the realm of machine-learning.
In ODT, a patient's organ is surrounded by a large number of point sources of pulsed isotropic near-infrared irradiation and a large number of time-resolving omnidirectional sensors. The organ transmits radiation diffusely, with very little absorption. Anomalous 3D absorbing or vacuous regions can be detected and assayed using nonlinear inverse diffusion spatiotemporal analysis, that relies on very high orders of scattering. The detected radiance is blurred, yielding limited 3D spatial resolution. However, ODT can yield sufficient diagnostic information, using non-ionising radiation.
Medical CT modalities generally use active radiation. Active methods are also used for local atmospheric sensing or scatterers by radar and lidar. There, a transmitter and receiver are generally collocated and signals are based on backscattering and time-resolved two-way transmission. Probing is solved per LOS using methods which are computationally relatively simple. However, the technology is expensive, horizontal sampling is generally very limited, and irradiance decays fast from the transmitter. Passive sensing is less expensive, uses minimal power, and can image wide swaths of Earth. Thus global coverage mandates passive imaging from space. Consequently, this paper focuses on derivation of 3D passive tomography of scatterer fields.
Passive remote sensing does not benefit from pulsed sources for echo-location. It should rely on multi-angular data. Linear CT models (analogous to medical Xray CT and SPECT) were used to study gas emission and absorption in 3D plumes in the vicinity of pollution sources [24,25] or volcanoes [26,27]. There, Rayleigh-scattered sunlight was transmitted through the gas to a spectrometer on a platform flying around the plume. Following the vision of Werner et al. [28], Huang et al. [29,30] used scanning microwave radiometers to reconstruct 2D slices of particle density in clouds based on its impact on local emissivity.
Linear CT was also adapted by Garay et al. [31] to characterize a smoke plume over water emanating from a coastal wild fire. There, the signal is sunlight scattered to space and detected by the Multi-angle Imaging Spectro-Radiometer (MISR) sensor [32] at nine viewing angles. The analysis in [31] yields the direct transmission through the plume per LOS, from which linear CT analysis yields the plume density without using solar radiometers under the plume.
In general, however, retrieving atmospheric scatterer fields in 3D requires a full forward model of scattering in 3D. The model satisfies neither a direct transmission model of linear CT, nor the diffusion limit of ODT. In passive imaging of scatterers, the light source irradiating the atmosphere is the sun: uncontrolled, steady and mono-directional. Aides et al. [33] formulated CT based on single-scattered light. Their forward model is based on sets of broken-ray paths, where light changes direction once from the sun to a sensor.
All the above atmospheric tomography methods assumed the medium to be optically thin enough for direct and oncescattered radiation to dominate the measured radiance. We depart radically from this assumption, drawing inspiration from the success of active ODT, though necessarily with a different forward model. We formulate an inverse 3D RT problem for cloud tomography utilizing multi-view multi-spectral polarimetric images. In contrast to linear CT, the image formation model is nonlinear in the microphysical and density variables. Our approach seeks an optimal fit of droplet microphysical parameters. This is based on a computational 3D polarized RT forward model, the vector Spherical Harmonics Discrete Ordinates Method (vSHDOM) [34,35]. To this effect, we generalize our demonstrated iterative inversion approach [36,37,38] to take advantage of polarimetric measurements.

Outline
In the next section, we cover basic cloud droplet optics using Mie scattering theory and the fundamentals of polarized 3D RT. The latter yields radiance which has a clear decomposition into single-and multiply-scattered light. This decomposition supports the solution to the inverse problem at hand. We then lay out our 3D cloud tomography method where we target three basic microphysical properties, volumetrically. Necessary but tedious mathematical details are presented in the Appendix. Subsequently, the new 3D cloud tomographic capability is demonstrated on realistic synthetic clouds from a Large Eddy Simulation (LES) that provide ground truth for unambiguous retrieval error quantification. We conclude with a summary of our results and an outline of future developments, mostly looking toward CloudCT and other future space-based uses.

Background
This section describes bulk microphysical parameterization of scattering media, the polarimetric radiative transfer image formation (forward) model and the relation between them. The section also describes the coordinate systems in use (per-scatterer, imager and Earth frames). We further decompose the polarized radiance into single-scattered Normalized Gamma-distribution. The effective radius and variance dictate the centroid and width of the size-distribution. The limit of very low v e approaches a mono-disperse distribution.
[Center] Log-polar plot of the Mie phase-function p 11 induced by a single water sphere of radius r.
[Right] Log-polar plot of the effective phase-function s s p 11 r /σ s induced by a small volume that includes particles of different sizes. and high-order scattered components. These foundations are used in subsequent sections, to formulate tomographic recovery.

Scatterer microphysical properties
In the lower atmosphere, cloud particles are droplets of liquid water that are very nearly spherical, having radius r. They are however polydisperse, with a droplet size distribution denoted n(r). For most remote-sensing purposes, n(r) is parameterized using an effective radius in µm and a dimensionless variance [39]: A commonly used parametric size distribution, having empirical support [39] is the Gamma-distribution (Fig. 2): where we require v e < 1/2.
is a normalization constant and is the droplet number concentration. Let ρ w be the density of liquid water. An important cloud characteristic is the water mass density or Liquid Water Content (LWC) per unit volume: It is expressed as LWC = 4 /3 πρ w r 3 e (1 − v e )(1 − 2v e ) for the Gamma distribution in (2).

Polarized light
A light wave is associated with orthogonal components of a random electric wave, E 1 (t) and E 2 (t), where t is time. The components' direction unit vectors are respectivelyÊ 1 andÊ 2 . The wave propagates in direction ω =Ê 1 ×Ê 2 . It is convenient to define the polarized light state in terms of the Stokes [39] vector I = (I, Q, U, V ) . Each component of I expresses temporal expectation: where i = √ −1. Unpolarized intensity is I. The degrees of polarization (DOP) and linear polarization (DoLP) are respectively defined as the ratios √ Q 2 +U 2 +V 2 /I, √ Q 2 +U 2 /I. The angle of linear polarization (AoLP) is 1 /2 tan −1 (U/Q).

Single scattering of polarized light
Light interaction with a single particle is described by the total extinction cross-section s t (r, λ), decomposed into scattering and absorption cross-sections, respectively: s t (r, λ) = s s (r, λ) + s a (r, λ).
Define size-weighted average over some function a(r) by 3 Note that we use here an approximation, commonly used in multi-spectral remote sensing, of a single rendering with spectrally-averaged optical properties. The material optical properties can furthermore be approximated, in the absence of molecular absorption, by using a single wavelength for each spectral band. This is valid if wavelength dependencies within a spectral band are weak, a condition met when narrow bands are considered. Macroscopic optical cross-sections are then expressed as weighted averages 4 Throughout the text, dependency on λ is generally omitted for simplicity; however, it is used at specific points as needed.
Scattering, as a fraction of the overall interaction [40], is expressed by the dimensionless single scattering albedo The extinction coefficient (or optical density) is denoted by β. Following Eqs. [3,4,8], β = N σ t is expressed in terms of the LWC as [41] β = LWC Here,σ t is the mass extinction coefficient (in units of m 2 /g).
Let ω and ω be the unitary incident and scattered ray direction vectors respectively in Fig. 2. Single-scattering geometry is defined by the local coordinate system of the incoming beam's electric fields. As stated above, the electric field of incoming light is decomposed into components along orthogonal directions. We set them as The scattering angle is θ = cos −1 (ω·ω ). The angular redistribution of singly-scattering light from a sphere of is defined by the 4×4 dimensionless Mueller matrix P s (θ, r). The macroscopic phase matrix is the size-weighted average For spherical (or just randomly-oriented) particles, the phase-matrix P(θ) takes the following symmetric form [39] where p 11 is the (unpolarized) scattering phase-function. In single-scattering of unpolarized incident sunlight, the DoLP of scattered light amounts to the ratio |p 21 |/p 11 . cloudbow glory Figure 3: Normalized phase matrix element −p Mie 12 /p Mie 11 around the cloud-bow and glory regions. For highly disperse droplet distributions (large v e ) the secondary lobes of the cloud-bow (θ ∼ 140 • ) and glory (θ ∼ 180 • ) diminish. The main cloud-bow peak is slightly sensitive to λ or v e . The side-lobe angles are more sensitive to λ and r e . The side-lobe amplitude is sensitive to v e . This cloud-bow signal is helpful for retrievals of r e . [Right plot] Solid lines indicate monochromatic light. Dashed lines indicate spectral averaging over a 100 nm bandwidth, which is more than double any of the spectral bands considered further on.

Rayleigh scattering
The Rayleigh model describes light scattering by particles much smaller than the wavelength. The Rayleigh phase matrix takes the following form [42] The single-scattering DoLP due to air molecules is then According to (15) a maximum DoLP is attained at single-scattering angle θ = 90 • .

Mie scattering
Mie theory describes how light interacts with a spherical particle of size comparable to λ [43]. Denote µ= cos θ. Mie scattering is defined in terms of complex-valued amplitude scattering functions 5 S 1 (µ), S 2 (µ), which correspond to scattering of the E 1 , E 2 electric field components. Scattering of the Stokes vector I is described by the phase matrix P Mie (µ), which is fully defined by six matrix components: Here, is a normalization constant, set to satisfy 1 Mie scattering due to water droplets is peaked at specific angles. For a single droplet or monodisperse material, P Mie has sharp scattering lobes at angles that depend on the droplet's r /λ ratio. A macroscopic voxel contains droplets in a range of radii r, smoothing the scattering lobes. The smoothing effect depends on v e (Fig. 3) and, to a far lesser extent, the spectral bandwidth (Fig. 3). Two angular domains that stand out for remote-sensing purposes are the cloud-bow (θ ∈ [135 • , 155 • ]) and glory (θ ∈ [175 • , 180 • ]). Both domains have peaks that are sensitive to the droplet microphysical parameters, and are significantly polarized (i.e., peaks are visible in the p Mie 12 component). The latter fact renders these peaks distinguishable in the presence of a multiply-scattered signal component.  18)). Integration yields the partially polarized (vector) light field I ( (17)). Here I(x k , ω k ) is a pixel measurement at the TOA and I Single is the single-scattered contribution from x [Right] Ray tracing of a line-integral over a discretized voxel field h[g] (zero-order interpolation).

Multiple scattering of polarized light
The Radiative Transfer Equation (RTE) [42] describes multiple scattering interactions of monochromatic partially polarized light within a medium. Transmittance between two points An atmospheric domain Ω has boundary ∂Ω. The intersection of ∂Ω with a ray originating at point x in direction −ω ( Fig. 4) is denoted x 0 (x, ω). Denote the Stokes vector field as I (x, ω). Then I(x 0 , ω) is the Stokes vector of radiation which propagates in direction ω at boundary point x 0 (x, ω). The non-emissive forward RT model [42] couples I (x, ω) to a vector source field J (x, ω) ( Fig. 4) by Equations [17][18] are solved numerically, either directly with an explicit solver [35] or indirectly using a Monte-Carlo path tracer [44]. We use vSHDOM [35] to simulate scattered Stokes components of a realistic atmosphere, having both Mie and Rayleigh scattering due to water droplets and air molecules.
Multiple scattering interactions are defined using two coordinate systems. Local scatterer coordinates are set by (Ê 1 ,Ê 2 ). Stokes measurements in satellites, however, are defined in Meridional coordinates. Letẑ denote the zenith direction vector at every point on Earth. In meridian coordinates, the electric field components are defined by direction vectorsm Each pixel-scale Stokes measurement is described by a coordinate system defined bym 1 andm 2 . The transformation between the two coordinate systems amounts to a multiplication of I by a Mueller rotation matrix.
Sampling I (x, ω) at the location of each camera and direction of each camera pixel yields the measured Stokes vector. A measurement k is done at the camera position x k , LOS direction ω k , and wavelength λ k (Fig. 4). Thus, Eqs. [17][18] yield the pixel measurement model

Single-scattering separation
It is often convenient to separate the single-scattering contribution from the rest of the radiance field [45]. The solar irradiance at the top of the atmosphere (TOA) is F Sun . It is unoplarized, thus corresponds to a Stokes vector F Sun = (F Sun , 0, 0, 0) . The Sun is modeled as an ideal directional source with direction ω Sun . A solar ray heading to point x intersects the TOA at point x Sun . The solar transmittance is given by T (x Sun →x). Let δ denote Dirac's delta. Thus, I can be written as a sum of the diffuse component I d , and direct solar component: Inserting [21] into [18] yields where Consider Fig. 4. Denote a broken-ray path of direct sunlight which undergoes single scattering at x , then reaches the camera: x Sun →x →x k .
(24) It projects in direction ω k to pixel at x k , thus contributing to the measurement I(x k , ω k ). Using Eqs. [17,22], the single-scattered contribution from x is Thus, the entire single-scattered signal accumulates contributions along the LOS

Ray tracing
Ray tracing computes a function over a straight line through a 3D domain. A common operation is path-integration (e.g. Eqs. [16,17]). Let h(x) be a continuous field. Define a grid of discrete points For zero-order interpolation (i.e., voxel grid), (27) where g (x 1 →x 2 ) is the intersection of the path with voxel g (Fig. 4). For voxel indices g that do not intersect the path sectionCloud Tomography So far, we described the forward (image-formation) model, i.e., how images are formed, given cloud properties. In this work, we formulate a novel inverse tomographic problem of recovering the unknown cloud microphysical properties, volumetrically. In voxel g, the vector of unknown parameters is The unknown microphysical parameters are concatenated to a vector of length 3N grid Neglecting circular polarization, each pixel measures a Stokes vector, y I = y I , y Q , y U at N λ wavelengths. Let N views and N pix denote the number of view points and camera pixels. The total number of Stokes measurements is thus N meas =N λ N views N pix . The measurement vector of length 3N meas is expressed as In this section, we formulate the use of measurements y (multi-view, multi-pixel, multi-spectral, polarimetric measurements) for tomographic retrieval of Θ (3D volumetric cloud density and microphysics). It is worth mentioning at this point that Stokes components are not measured directly. Rather, they are computationally retrieved from measurements of different polarization states (see Appendix for the AirMSPI measurement model).

Polarimetric information
To make an initial assessment of the sensitivity of polarimetric measurements, we simulate a simple homogeneous cubic cloud (Fig. 5), parameterized by two microphysical parameters: (LWC, r e ). Back-scattered Stokes measurements are taken at the TOA for angles and wavelengths sampled by the Airborne Multi-angle Spectro-Polarimetric Imager (AirMSPI) [14].
where we hold v e constant. Equations [31][32][33] are 2D manifolds. Figure 6 plots the cost manifolds for different solar azimuth angles, φ 0 . While there is an ambiguity between LWC and r e when relying on D I , there are better defined minima for D Q and D U . This indicates that polarization measurements carry valuable information.

Inverse problem formulation
Denote I Θ as the image formation model. Tomography can be formulated as minimization of a data-fit function. We preformΘ Here Σ is related to the co-variance of the measurement noise. For brevity, we omit the subscript Θ but remember that Assuming noise in different pixels, wavelengths and angles is uncorrelated, (34) degenerates tô The matrix R depends on the particular sensor technology. Description of R, tailored to the AirMSPI sensor, is detailed in the Appendix.  Figure 6: Logarithm of the 2D cost manifolds for a 2-parameter homogeneous cubic cloud (Fig. 5). Each column of plots corresponds to the cost of the different Stokes components in Eqs. [31][32][33]. Each row of plots corresponds to a different Solar azimuth angle φ 0 .
We solve (36) by a gradient-based approach. The gradient with respect to the unknown parameters Θ is The term ∇ Θ I[k] is the Jacobian of the sensing model. Equation [37] is used to formulate an update rule for an iterative optimization algorithm where b denotes the iteration index and χ b is a scalar. We use L-BFGS [46] for numerical optimization that, in particular, determines adaptively the value of χ b . One approach to computing the gradient ∇ Θ D is the Adjoint RTE [47,48]. Due to the recursive nature of the RTE, computing the gradient through the exact Jacobian ∇ Θ I[k] is computationally expensive. In the following sections, we derive a method to make the computation of the gradient tractable and efficient. We do that by approximating the Jacobian ∇ Θ I in a tractable way, using a two-step iterative algorithm [36,38].

Iterative solution approach
We formulate an iterative algorithm which alternates between two steps (See the diagram in Fig. 7). Starting with an initial guess, Θ 0 , Step 1 uses vSHDOM to compute the forward (recursive) RT equations. This renders synthetic images according to the multi-view geometry, spectral bands and spatial samples of the cameras. Keeping I d fixed, Step 2 efficiently computes an approximate gradient with respect to Θ. The approximate gradient is fed into an L-BFGS step to update the current estimate Θ b .
Step 1: RTE Forward Model The first step in the estimation approach is running the forward model in Eqs. [17][18] using a numerical RTE solver. This requires transforming microphysical to optical properties at every voxel (g) and spectral band (λ): Implementing (39) using Eqs. [8][9][10][11][12] during each optimization iteration can be time-consuming. Therefore, define grids r e ∈ r min e , ... Step 2: Approximate Jacobian Computation  (20) depends on optical properties (β, , P), which themselves depend on the sought microphysics. The Jacobian at voxel g is expressed by applying the chain-rule to (20). For example, the derivative with respect to the effective radius is

The forward vRTE model in
Analogously, replacing r e in (40) with LWC or v e yields the respective microphysical derivatives. We proceed by expressing the derivatives ∂{β, , P}/∂{LWC, r e , v e }. Afterwards, we expand and combine the derivatives ∂I/∂{β, , P} to express (40).

∂P ∂LWC
= 0, ∂P ∂r e = P(r e +ε re , v e )−P(r e , v e ) ε re , where v e derivatives are computed analogously to the r e derivatives. Using the shorthand notation ∂ g ≡{ ∂ ∂LWC[g] , ∂ ∂re[g] , ∂ ∂ve[g] }, the overall Jacobian is given by a sum of terms The full expression for each term in Eq. [44] is given in the Appendix. For example, Let us focus on the term This Jacobian term stands out, because it is only term which requires computing the derivative of I. This derivative is computationally expensive because I is computed recursively through the RTE [Eqs. [17][18]. In principle, a change in the microphysics of one voxel can recursively affect the radiance at every other voxel. We decompose ∂ g I using the diffuse-direct decomposition of (21) At the core our approach for computational efficiency is the assumption that the diffuse light I d is less sensitive to slight changes in the microphysical properties of any single voxel g. Rather, I d is impacted mainly by bulk changes to the over-all volume. Thus, we approximate (46) by keeping I d independent of Θ for a single iteration of the gradient computation, i.e., This bypasses the complexity of recursively computing ∂ g I d .
It is important to note that at every iteration, the Jacobian ∇ Θ I[k] still is impacted by I d . This is because I d affects I through Eq. [21], and I appears in the terms A 1 , . . . , A 6 . As the estimated medium properties evolve through iterations, so does I d (in Step 1, above). We just assume during Step 2 that ∂ g I d is negligible compared to other terms in Eq. [44].
Contrary to I d , the single-scattered component is highly sensitive to changes in the micro-physical properties of a single voxel. We therefore include an exact treatment of single-scattering in the gradient computation (in the Appendix). This is the essence of our numerical optimization approach. It enables tackling multiple-scattering tomography, in practice. Simulation results presented in the following section rely on additional numerical considerations (e.g., initialization, preconditioning, convergence criteria), which are all described in the accompanying Appendix.

Simulations
As mentioned, real data of simultaneous spaceborne multi-angular polarimetric images of clouds does not yet exist, but a mission to supply this data is in the works. Therefore, we use careful simulations to test the approach. We simulate an atmosphere with molecular Rayleigh scattering and liquid water clouds. Rayleigh scattering is taken from the AFGL database [49] for a summer mid-latitude atmosphere. Mie tables are pre-computed for r e ∈ [4,25] µm and v e = 0.1 with N re = 100. The surface is Lambertian with a water-like albedo of 0.05. For realistic complexity, a Large Eddy Simulation (LES) model [50] was used to generate a cloud field. Each voxel is of size 20×20×40 m 3 . The LES outputs [50] are clouds with 3D variable LWC and 1D (vertically) variable r e . A typical value [51] of v e = 0.1 was chosen. Consequently, the present recovery demonstrations recover LWC and r e on their respective native LES grid.
On the other hand, v e = 0.1 is excluded from the unknowns.
From the generated cloud field, two isolated cloudy regions are taken for reconstruction: 1. Scene A: An atmospheric domain of dimensions 0.64×0.72×20 km 3 with an isolated cloud (see synthetic AirMSPI nadir view in Fig. 8). 2. Scene B: An atmospheric domain of dimensions 2.42×2.1×8 km 3 with several clouds of varying optical thickness (see synthetic AirMSPI nadir view in Fig. 9).
Synthetic measurements rendered with the spatial resolution and angular sampling of AirMSPI [14], namely, 10 m pixels and 9 viewing angles: ±70.  Single scattering albedos for these wavelengths are all within 10 −4 of unity. In other words, and in sharp contrast with the operational Nakajima-King [9] bi-spectral non-tomographic retrieval, absorption by droplets plays no role in this demonstration of tomography of cloud microphysics. The measurements are synthesized with realistic noise, according to the AirMSPI data acquisition model (see Appendix).
Qualitative volumetric results of the recovered LWC for Scene A are shown in Fig. 10. Scatter plot of the recovered LWC and the recovery results of r e for Scene A are given in Fig. 12. Analogous plots for Scene B recovery results are given in the Appendix.
Multi-angular tomographic retrieval enables vertical resolution of the droplet effective radius. By contrast, a homogeneous droplet radius is typically retrieved by mono-angular observations fitted to a plane-parallel homogeneous cloud model. The retrieval errors of droplet radii in the demonstrations above are significantly smaller than retrieval errors of a homogeneous droplet radius. The latter can easily exceed 50% in similar conditions to our study i.e, shallow cumuli and illumination conditions (see e.g. [53]).

Summary & Outlook
We derive tomography of cloud microphysics based on multi-view/multi-spectral polarimetric measurements of scattered sunlight. This novel type of tomography uses, for the first time, 3D polarized RT as the image formation model. We define a model-fitting error function and compute approximate gradients of this function to make the recovery tractable. Demonstration are done on synthetic 3D clouds, based on a Large Eddy Simulation with the effective radius assumed to vary only vertically.
Future work will address the extent to which polarimetric measurements penetrate the cloud and the relation between r e in the outer shell and r e in the cloud core, as defined by Forster et al. [54]. Furthermore, we will relax the fixed v e assumption that was used in the demonstrations, and thus assess full microphysical retrieval capabilities of polarization measurements. A thorough discussion on these assumptions and their applicability to real-world clouds is given in the Appendix. Moreover, future plans include experimental demonstration and use, while the CloudCT formation orbits.
Lastly, we note that our atmospheric tomography approach herein can be adapted to aerosols, including dense plumes of wild fire smoke, volcanic ash, and dust. Research is ongoing [54] about such adaptation for satellite data as can be obtained from the multi-view imaging from MISR on Terra and a SWIR view from the collocated MODIS, as well as in the planned CloudCT [8].

A Jacobian Derivation
In Eq. [44] of the main text, the Jacobian is written as a sum of six terms In this section we expand and describe each of these terms. Using Eqs. [16] and [28], the transmittance derivative is Then, Note that I (x, ω) and J (x, ω) are computed in Step 1 and are therefor ready for use when computing A 1 , A 2 , A 3 , A 5 and A 6 . Furthermore, g (x 0 →x k ) =0 for any voxel that is not on the LOS of pixel k. Therefore, the terms A 1 , A 2 , A 3 , A 5 , A 6 are computed using a single path tracing x k →x 0 .
We now give special attention to A 4 in Eq. [55]. Using the diffuse-direct decomposition of (21), we decompose (55) as The first term in (58) is based on ∂ g I d , ie., a derivative of the diffuse (high order scattering) component. Herein lies a recursive complexity. In principle, a differential change in the microphysics of one voxel can recursively affect the radiance at every other voxel, and this affects all the pixels. To make calculations numerically efficient, we approximate (58). The approximation assumes that relative to other components in the Jacobian, I d is less sensitive to a differential changes in the microphysical properties at voxel g. Thus, (58) is approximated by keeping I d independent of Θ for a single iteration of the gradient computation, i.e, The second term in (58) is based on differentiation of the direct component. This is straight-forward to compute using (51). Consequently, using Eq. [59] and the definition of I Single (x Sun →x →x k ) in (25), the term A 4 in (58) is approximated by  The term g (x Sun →x ) in (60) contributes to voxels outside of the LOS. The integral inÃ 4 is computed with a broken-ray [55] path x k →x →x Sun , as illustrated in Fig. 4.

B Measurement Noise
The inverse problem defined in the main text is formulated in terms of measured Stokes vectors [Eq. 30]. However, Stokes vectors are not measured directly. Rather, they are derived from intensity measurements taken through filters. The raw intensity measurements are noisy. Noise is dominated by Poisson photon noise, which is independent across different raw measurements. However, the estimation of Stokes components from independent intensity measurements yields noise which is correlated across the components of the Stokes vector, per-pixel. In this section, we describe the synthesis model we employ to generate realistic noise in simulations. Our synthesis is based on the AirMSPI [56] sensor model. Furthermore, we derive the expression for R, which we use in the recovery process (Eq. [36] in the main text).
AirMSPI measures a modulated intensity signal at N sub =23 subframes. Define a normalized frame which spans the unitless integration time interval ψ ∈ [−0.5, 0.5]. Denote the temporal center and span of each subframe as ψ l and ∆ψ = 1/N sub , respectively (Fig. 13). Based on the sensing process described in Ref. [56], define the following modulation function, whose parameters are given in Table 1: with Here J 0 , J 2 are the Bessel functions of the first kind of order 0 and 2, respectively. Denote by ξ(λ) a wavelengthdependent ratio, which is drawn from quantum efficiencies and spectral bandwidths 6 of each AirMSPI band (Table 1). Using simulated Stokes vectors derived by vSHDOM, AirMSPI measurements are synthesised as passing through two polarization analyzing filters [56]. As defined in Eq. [20] The gain G is chosen to let the maximum signal at each camera view (i.e. maximum over pixels, wavelengths and subframe measurements) reach the maximum full-well depth of 200,000 electrons, consistent with AirMSPI specifications. (68) synthesizes raw AirMSPI signals including noise (Fig. 8) The vectors y I [k] form the data for tomographic analysis.
Our tomographic analysis takes into account the noise properties, including noise correlation. As we now show, the measurement model [69] yields correlated noise of different Stokes components. Thus, R −1 ((36)) is not diagonal. Denote the diagonal co-variance matrix of the photo-electron readings by C −1 e =diag e . Let I 46×46 denote the Identity matrix. The signal is generally dominated by unpolarized multiply-scattered background light. Relative to it, the magnitude of the modulated polarization signal is small. Thus, per pixel k, the diagonal matrix C −1 e [k] is approximately constant with a global weight Using Eqs. [69,70] for each pixel, the Stokes co-variance matrix is A maximum-likelihood estimator corresponding to a Poisson process should have a weight α[k] ∝ 1/ e 1 , to account for higher photon noise in brighter pixels. In simulations, however, we found that α[k] = 1 worked better. This is perhaps due to richer information carried by denser cloud regions, i.e. brighter pixels. Overall the expression we minimize in (36) isΘ i.e. R −1 = M M.

C Numerical considerations
In this section we describe numerical considerations that stabilize the recovery.

C.1 Hyper-parameters
Our code requires the choice of hyper-parameters for rendering with vSHDOM [57] in Step 1 and optimization with scipy L-BFGS [46,58] in Step 2. Table 2 summarizes the numerical parameters used in our simulations.

C.2 Preconditioning
Multivariate optimization can suffer from ill-conditioning due to different scales of the sought variables. This is expected when recovered variables represent different physical quantities with different units and orders of magnitude. A preconditioning of the update rule in (38) takes the following form where we apply a diagonal scaling matrix Π (Jacobi preconditioner) to scale the different physical variables (LWC, r e ). Thus, Π takes the form Π = diag Π LWC , Π re , ...., Π LWC , Π re . (74) In our tests, we use Π LWC = 15 and Π re = 0.01 to scale the parameters to a similar magnitude and closer to unity upon initialization.

C.3 Initialization
The recovery is initialized by the estimation of a cloud voxel mask, which bounds the cloud 3D shape. The 3D shape bound of the cloud is estimated using Space-Carving [59]. Space-carving is a geometric approach to estimate a bound to 3D shape via multi-view images. The following steps are preformed in our space-carving algorithm D Qualitative Results: Scene B Qualitative volumetric results of the recovered LWC for Scene B are shown in Fig. 14. A scatter plot of the recovered LWC and the recovery results of r e for Scene B are given in Fig. 15.

E Spatial Variation of The Effective Radius
In nature, generally the droplet effective radius r e and variance v e vary in 3D. However, operational remote sensing algorithms, which rely on 1D RT and plane-parallel cloud models, retrieve a single value for r e (and for v e ), for each cloudy pixel. This occurs both in bi-spectral [9,60] and polarimetric [11,61] techniques. In these approaches, it is always uncertain which portion of the cloud the retrieved quantity corresponds to, because light penetrates into the cloud and simultaneously scatters from different depths inside it. In polarization analysis of plane-parallel cloud models, it is often assumed that the retrieved microphysical parameters correspond approximately to an optical depth of unity. At any rate, this uncertainty complicates the interpretation of retrieved values in studies which rely on them.
The mathematical approach of the paper is formulated for 3D variation of all the required fields: LWC, r e , v e . As Fig. 3 in the main paper shows, polarization is sensitive to r e of any voxel which scatters sunlight towards the camera. Moreover, the formulation explicitly models and seeks spatially varying microphysics, using multi-angular data. We confidently anticipate the same sensitivity to v e . The demonstrations in the simulations used a representation in which r e varies vertically, not horizontally. This is more general than the operational methods mentioned above, yet more degenerate than full 3D heterogeneity. We now discuss the implication of such a representation.
Textbook cloud physics (e.g., [51]) is based on the mental picture of a parcel of moist air containing a certain number of cloud condensation nuclei that is ascending vertically in the buoyancy-driven part of the convective cycle. Since temperature and pressure are strongly stratified environmental quantities, moist adiabatic thermodynamics thus predict a vertically-varying droplet size distribution, at least in the so-called "convective core" of the cloud. For the present study, this restriction of microphysical variability to the vertical dimension only applies to both r e and v e .
There is compelling evidence that the horizontal variability r e is indeed small over a cloud scale. This evidence comes from in-situ aircraft observations of shallow cumulus [62,63,64], modelling studies [65] and theory [66]. However, there are also select observations of monsoonal clouds [67] and theoretical arguments [66] that suggest there is a sharp gradient in the droplet effective radius in the very outer shell of the clouds. If this is the case, then a representation having vertical-only variation of r e loses validity at the outer shell. This may cause bias in retrievals based on polarimetry. The reason is that polarization signals are dominated by single-scattering, which is most likely to occur at shallow depth in the cloud.
The value of v e can also vary significantly across different environmental conditions. This is seen in research flights including in-situ measurements [68,69,70,71,72]. Moreover, in LES simulations of shallow cumulus clouds with bin microphysics, v e might range from 0.01 to 0.26 [73]. The core of a cloud tends to have a low effective variance as condensation is the dominant process there [74,73]. Cloud edges, in contrast, experience also evaporation and entrainment mixing, as the cloud is diluted by environmental air [75]. This tends to increase v e . If the cloud has precipitation, spatial variability of v e increases [76].
These points show that, on the one hand, the approximations in the demonstrations are often reasonable. On the other hand, it is indeed worth representing cloud microphysical parameters as functions in 3D, then retrieving them in tomography, to push the frontier of cloud physics research. Retrieving a large number of degrees of freedom can be managed better by using more information from diverse sources. One option is to include additional sources of measurements, e.g., by using a combination of the AirMSPI [14] and Research Spectro-Polarimeter (RSP) [77] airborne instruments. Another option is to introduce tailored regularization schemes, which mathematically express the natural trends of horizontal variability mentioned above. The 3D tomographic approach presented in the paper is a significant enabler for probing such questions. It offers more flexibility than current operational analyses, which are largely based on 1D RT and bulk retrieved values for a whole cloud.