Machine-Learning Based Analysis of Liquid Water Path Adjustments to Aerosol Perturbations in Marine Boundary Layer Clouds Using Satellite Observations

: Changes in marine boundary layer cloud (MBLC) radiative properties in response to aerosol perturbations are largely responsible for uncertainties in future climate predictions. In particular, the relationship between the cloud droplet number concentration ( N d , a proxy for aerosol) and the cloud liquid water path (LWP) remains challenging to quantify from observations. In this study, satellite observations from multiple polar-orbiting platforms for 2006–2011 are used in combination with atmospheric reanalysis data in a regional machine learning model to predict changes in LWP in MBLCs in the Southeast Atlantic. The impact of predictor variables on the model output is analysed using Shapley values as a technique of explainable machine learning. Within the machine learning model, precipitation fraction, cloud top height, and N d are identiﬁed as important cloud state predictors for LWP, with dynamical proxies and sea surface temperature (SST) being the most important environmental predictors. A positive nonlinear relationship between LWP and N d is found, with a weaker sensitivity at high cloud droplet concentrations. This relationship is found to be dependent on other predictors in the model: N d –LWP sensitivity is higher in precipitating clouds and decreases with increasing SSTs.


Introduction
Marine boundary layer clouds (MBLC) make up a large part of global cloud cover as they are persistently present over more than 20% of the Earth's oceans in the annual mean [1]. This is especially the case for the tropical and subtropical oceans off the west coasts of the continents, where semi-permanent stratocumulus sheets can cover more than 50% of the surface annually [1,2]. By reflecting solar radiation to a much greater degree than the ocean beneath, while only having a minor effect on the outgoing longwave radiation, these clouds play an important role in Earth's energy budget by exerting a large net cooling effect [1,3]. Therefore, even a comparatively small increase in the albedo of MBLC could offset part of the global warming due to increasing concentrations of greenhouse gases [4].
Stratocumulus cloud properties and their radiative characteristics, such as cloud albedo, horizontal and vertical extent, lifetime and precipitation susceptibility, are dependent on environmental conditions. Aerosols in their role as cloud condensation nuclei (CCN) affect the cloud albedo via changes in the cloud droplet number concentration (N d ), also known as the Twomey effect [5]. Subsequent cloud adjustments to aerosol perturbations may lead to changes in cloud fraction and in the liquid water path (LWP), further altering the radiative properties of the cloud. The magnitude and sign of the radiative forcing due to these aerosol-cloud interactions remain among the largest uncertainties in projections of future climate [6][7][8]. While there is a large body of research on the Twomey effect from the last decades, e.g., [5,[9][10][11][12][13][14][15][16][17], this is much less the case for LWP adjustments, where disagreement between observations and models is large and even the sign of the aerosol effect on LWP is unclear [18]. As LWP is the main controlling factor of liquid-cloud albedo [18], it is therefore important to better understand the effect of aerosols on LWP to ultimately improve climate model predictions.
Using the N d -LWP relationship as a measure for the LWP adjustment of clouds to an aerosol perturbation, [18] outline several counteracting pathways of how aerosols can impact the LWP in MBLC. Small cloud droplets in situations with elevated N d can suppress the formation of precipitation, prolonging the lifetime of the cloud and increasing LWP [19]. Preliminary results of the CLoud-Aerosol-Radiation Interaction and Forcing: Year 2017 (CLARIFY-2017) and indications from the NASA ObseRvations of Aerosols above CLouds and their intEractionS (ORACLES) campaigns support this with findings of decreased drizzle formation in polluted (high N d ) clouds [20][21][22]. A second pathway that is hypothesized to increase LWP is the process of warm cloud invigoration, where the condensation and LWP build-up under low-N d conditions is limited by the small droplet surface area, which is then increased following an increase in N d [23]. On the other hand, the entrainment of relatively dry air at the cloud top leads to a decrease in LWP with increasing N d by way of faster evaporation of smaller cloud droplets [24][25][26]. The resulting evaporative cooling induces a (self-amplifying) positive feedback that enhances the entrainment-evaporation process, further decreasing LWP [27][28][29]. The extent to which these processes affect clouds under different meteorological conditions and cloud states is still unclear and previous estimates of the relationship between aerosols and LWP span from positive [23,[30][31][32][33][34] to negative [24,25,29,[35][36][37] with some studies showing a bidirectional relationship [38,39] as summarized in in Gryspeerdt et al. [18]. The recent study by Gryspeerdt et al. [18] found a possible explanation for these varying results in the non-linear relationship of N d and LWP, where the N d -LWP relationship is positive at low N d and is reversed in high N d situations. A parameter that was found to influence this relationship is the state of the cloud (precipitating/non-precipitating). Precipitating clouds were shown to display an increase in LWP with increasing N d in the past [29,33]. SST is another important controlling factor for the N d -LWP relationship [40] and radiative properties [41] in MBLCs. Higher SST leads to a deeper boundary layer with decreased stability and thus supports entrainment drying at the cloud top, thereby accelerating the evaporation at the cloud top.
Given the non-linear nature of the N d -LWP relationship, and its potential dependence on meteorological factors and the cloud state, machine learning methods are ideally suited to analyze these processes in observational data sets. While machine learning techniques are becoming a popular tool in Earth sciences [42], and have been used to study aerosolcloud interactions before [43][44][45], their potential to study LWP adjustment processes has not yet been explored.
The goal of this study is to improve the understanding of the N d -LWP relationship in MBLC of the Southeast Atlantic region, specifically focusing on its dependency on meteorological conditions and cloud state. To this end, a large data set of polar-orbiting satellite observations and reanalysis data are analyzed in a machine learning framework.

Materials and Methods
This study is conducted for a 10°by 10°region in the Southeast Atlantic (0°E-10°E and 10°S-20°S), characterised by a high annual coverage of MBLC largely in the form of stratocumulus clouds [2]. The Southeast Atlantic has recently been the focus of multiple large aircraft campaigns to better understand interactions of stratocumulus clouds with the seasonally occurring biomass burning aerosol layer above the cloud deck [20,21,46]. Here, the focus is not explicitly on the effects of biomass burning aerosols on clouds, but rather on the LWP adjustment of stratocumulus clouds to aerosol perturbations, using N d as a mediating variable [18]. A combination of observation data from multiple satellite sensors and reanalysis model output is used to create a data set for the period of July 2006-April 2011, which is then utilized to train a machine learning algorithm to predict LWP. An overview of the variables used in this work is shown in Table 1.

Data
Observation data on cloud properties are taken from the CALIPSO-CloudSat-CERES-MODIS Merged Release B1 (C3M) product. The C3M data set aggregates observations from multiple sensors (Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO), Cloudsat, Clouds and the Earth's Radiant Energy System (CERES) and Moderate Resolution Imaging Spectroradiometer (MODIS)) on CERES footprints with a resolution of ∼20 km [47]. In the C3M data set, each CERES footprint represents an individual observation with data from all four instruments collocated. To account for the higher resolution and the vertical profiles provided by the original CALIPSO and CloudSat products, C3M contains a maximum of 16 cloud groups and a maximum of six cloud layers per CERES footprint. Individual cloud groups (i.e., clouds within the same CERES Footprint that are separated by clear sky areas) are distinguished as seen from above while the cloud layers are distinguished vertically. In order to filter for low-level clouds and to exclude the influence of additional cloud layers above, only observations of single-layer clouds with a cloud top height (CTH) below 3 km [48] as detected by CALIPSO are used. CTH is defined as the median of all cloud groups in a CERES footprint. To inform the machine-learning model about possible precipitation, the CloudSat precipitation flag is used to calculate the precipitation fraction (PF). The PF is defined as the number of cloud groups where precipitation is detected by CloudSat (precipitation classes can either be liquid, solid or drizzle), divided by the total number of cloud groups for each CERES footprint. The large majority (>99%) of precipitating clouds in this data set are classified as drizzle with no instances of solid precipitation detected.
The cloud-droplet number concentration (N d ) is calculated using MODIS retrievals of the effective cloud-droplet radius (r e ), the cloud optical depth (τ c ), the cloud-top temperature and the cloud-top pressure according to Grosvenor et al. [49]: with k = 0.8, f ad = 0.66 and Q ext = 2. The overall uncertainty in the calculated N d due to the cumulative uncertainties in the retrievals of r e and τ c are estimated to amount to around 78% [49].
Meteorological reanalysis data are taken from the ERA5 dataset provided by the European Centre for Medium-Range Weather Forecasts (ECMWF) at a 0.25°× 0.25°resolution on an hourly basis [50,51]. Data for air temperature, relative humidity (RH), vertical velocity, u and v wind components, sea-surface temperature (SST) and mean sea-level pressure (MSL) are collocated with the C3M data. For each CERES footprint the cloud-base height (CBH) from CALIPSO is used to select the nearest pressure level below the cloud for vertically resolved ERA5 variables (temperature, RH and winds). Additionally, CTH is used to select RH and winds at the nearest pressure level above the cloud. Finally, the estimated inversion strength is calculated according to Wood and Bretherton [52] using the 2 m air temperature and the temperature at 700 hPa from ERA5 and assuming a surface pressure of 1010 hPa.
The liquid-water path (LWP) is obtained from the Level-2B precipitation product Version 3 of the Advanced Microwave Scanning Radiometer-Earth Observing System (AMSR-E) sensor aboard the Aqua satellite. This data set is independent of the MODIS-derived N d and is utilized to eliminate the potential risk of introducing a pseudo-relationship through correlated errors in N d and LWP retrievals. The data have a resolution of 5 km across track and 10 km along track [53]. To achieve a comparable spatial resolution to the CERES footprints, the five pixels nearest to the center of each CERES footprint are selected to calculate the mean LWP for that footprint. One drawback of the AMSR-E LWP is that for low LWP cases, cloud water cannot be separated from rainwater (mostly drizzle in this data set), meaning that in such situations, AMSR-E LWP is biased high when compared to other products [54].
Spatial gradients are inherent in the LWP and many predictors in the data of the Southeast Atlantic that are used here. However, these are not necessarily directly linked to the N d -LWP relationship. To ensure that the model is not primarily exploiting the spatial gradients to predict the LWP, a secondary data set is created by removing the spatial gradients ( Figure S1). The results based on this anomalous data set are shown in Figures S2-S5 in the Supplementary.

Methods
Gradient-boosting regression trees (GBRTs) are used to model the LWP using the set of 12 predictor variables (model features) described in Table 1. GBRTs are a robust machinelearning technique that features the advantages of tree-based methods, which allow for the use of different data types (e.g., categorial or numerical data) and do not make/need prior assumptions concerning the distribution of the data. They are capable of representing and quantifying complex non-linear relationships, while considering interactive effects between the predictors [55]. GBRTs have been successfully applied to study aerosols and clouds in the past (e.g., [44,[56][57][58]). As a result of uncertainties in the retrieval of r e and τ c , unrealistic values for N d may be calculated. Therefore, only observations where N d is within percentiles 1-99 are used for the analysis to remove extreme values in N d , yielding N = 29,901 observations. To evaluate the model, the data set is randomly split into training (70%, N = 20,931) and test data sets (30%, N = 8970). To find the best set of hyperparameters, multiple instances of the model are first run with manually selected parameters to find a grid of the most suitable settings. In a second step, the final hyperparameters are chosen by applying a grid search approach (Table 2). In order to interpret the model predictions and to analyse the N d -LWP relationship, an explainable machine learning tool, the Shapley additive explanation (SHAP) values, are used. SHAP values quantify the contributions of each model feature to each individual (local) model prediction [59,60]. An example for a single set of observations is provided in Figure 1. SHAP values retain local accuracy, so that each individual model prediction is equal to the sum of the SHAP values of all features and the mean model prediction. Following from this, individual SHAP values can be positive (increase in the model prediction due to the specific feature value) or negative (decrease in the model prediction), and are calculated for each individual feature and each individual model prediction. Figure 1 shows that the observed N d = 10.041 contributes a SHAP value of −8.41 to the local prediction. Since SHAP values are provided in units of the predictand, this translates to a decrease in the LWP prediction by −8.41 gm −2 . Accordingly, the mean absolute SHAP value of a feature directly indicates the strength of the influence of this feature on the model prediction.
Interactive effects between the predictors are quantified as the change in the contribution of a feature to the model prediction depending on the presence/absence of a second feature (SHAP interaction values). For additional information on the theoretical background and technical details of SHAP values, the reader is referred to Lundberg et al. [59,60]. SHAP values have seen use in the field of medical science [61] and more recently also in the environmental sciences [62].   Figure 2a shows the correlation between N d and LWP at the spatial scale of the C3M data, summarizing all data points in 1°× 1°grid boxes in the study domain. Mostly positive correlations between N d and LWP are found in the data sets analyzed here. This is particularly the case for low N d values (<30, Figure 2b), where the average weighted correlation (weighted by number of observations per pixel) is 0.14. However, when considering only observations with N d > 30, the correlation decreases overall (average weighted correlation: 0.05, Figure 2c), with negative correlations apparent in southwestern parts of the study domain. To account for this non-linearity in the N d -LWP relationship, and to analyze its dependence on meteorological factors and cloud state, a machine learning model is used in the following. This approach allows for the estimation of sensitivities of LWP to perturbations in N d , including interaction effects with secondary parameters, that can be compared to other studies. A comparison of the model based on the data set with spatial gradients removed ( Figures S2-S5), and the model based on the observations, reveals no significant differences with respect to the SHAP values and interactive effects. This indicates that the model performance is not negatively impacted by the spatial gradients present in the data set. Therefore, only the model based on the untreated observations is discussed in the following. The GBRT model is able to explain 70% of the variability in the LWP (R 2 of the independent test data = 0.70). An overview of the importance and the contributions of each predictor to the model-predicted LWP is shown in Figure 3. The predictors are sorted by their mean absolute SHAP value as a measure of importance with respect to the prediction of LWP. PF, CTH, MSL and N d are identified as the most important variables. The fact that cloud state variables are among the most important features in the model is to be expected, given that they are the result of the prevailing environmental conditions, but also more directly related to the predicted LWP. As such, the importance of the two groups (cloud state variables and environmental conditions) should not be directly compared. In Figure 3, each data point is represented by a dot for each of the predictors with the associated SHAP value (contribution to the predicted LWP) on the horizontal axis, while the normalized predictor value is indicated by the coloring (ranging from the 5th to the 95th percentile), where blue shows a below-average feature value, and red an above-average feature value. A positive sensitivity of LWP to the three most important features PF, CTH and MSL is apparent. This relationship for PF is to be expected, as only clouds with a sufficiently high LWP form precipitation, but may also be amplified by the inability of AMSR-E to distinguish between cloud water and rain water for low LWP values, meaning that the LWP may contain drizzle/precipitation water in cases where PF > 0. Similarly, a higher CTH in stratocumulus clouds is linked to a deeper boundary layer and thicker MBLCs [43,63]. The importance of MSL for the model predictions of LWP underscores the role of dynamics for MBLCs in the Southeast Atlantic, as suggested by [57,64]. Furthermore, Figure 3 suggests an overall positive N d -LWP relationship that is in agreement with the positive correlations shown in Figure 2. In the following, this N d -LWP relationship and its dependence on meteorological factors and cloud state is explored in detail.  The main effects of N d on LWP are shown in Figure 4. Main effects are equivalent to the fraction of the SHAP values that are solely attributable to a single feature with all interaction effects removed. A positive relationship of N d and LWP is found for low N d values. The relationship is much weaker for higher N d , where the predicted LWP is less sensitive to a further increase in droplet concentration, underscoring the nonlinearity of the N d -LWP relationship. This is partly in agreement with [18], who also found a positive sensitivity of LWP to N d at lower N d . However, the negative N d -LWP relationship at higher N d values found in Gryspeerdt et al. [18] is not apparent in the data set analyzed here. A potential cause for this difference could be the differences in spatial resolution and specific cloud filters in both studies, which have been shown to lead to systematic biases [49]. In particular, the spatial and temporal aggregation of a data set can have a significant impact on the statistical relationship between N d and LWP, and even lead to a change in its sign when using Level 3-type data [65], as in Gryspeerdt et al. [18]. Another aspect could be the different method used to calculate the N d , as Gryspeerdt et al. [18] use the method from Bennartz and Rausch [66], whereas here, we use the method presented in Grosvenor et al. [49]. In the study domain, however, the retrieved N d agrees closely for both methods [49]. It should be noted that this finding of a nonlinear N d -LWP relationship that saturates at higher cloud droplet concentrations agrees well with previous modeling work done in the Southeast Pacific with a regionally-nested configuration of the Met Office Unified Model, that has found a very similar relationship [32].  Figure 5 shows the influence of precipitation fraction on the N d -LWP relationship. The SHAP values for N d are shown in Figure 5a, including interactive effects (in contrast to Figure 4). The observed PF that corresponds to each data point is indicated by color, increasing from blue to red. Overall, a higher variability in N d SHAP values is found when interaction effects are taken into account, compared to only the main effects of N d (Figure 4). This can largely be attributed to the importance of the interaction effects between N d and PF (Figure 5b). The positive slope of the interaction effects of PF on the N d -LWP relationship suggests that the model actively uses the information of PF to improve the model performance. This indicates that the strength of the positive N d -LWP relationship, which is assumed to be related to precipitation suppression, is amplified in conditions that already partially develop drizzle. Preliminary results from the CLARIFY-2017 insitu measurements support this finding with higher LWP and a lower amount of drizzle formation in clouds with elevated N d [20]. A second microphysical process that could explain the amplified positive N d -LWP relationship in cases of drizzle could be related to the simultaneous removal of cloud water and droplets. Since drizzle acts as a sink for both N d and LWP at the same time, this could increase the strength of their positive relationship [67]. The dependency of the LWP response to aerosol/increasing droplet concentrations on precipitation is also in agreement with previous studies that find a positive relationship between N d and LWP in precipitating conditions [29,33]. The lower panels of Figure 5 show a comparison of the N d -LWP sensitivity for nonprecipitating (m noprecip , Figure 5c) and precipitating clouds (m precip , Figure 5d) as the slope of the linear regression for N d SHAP values. While the finding of a stronger N d -LWP relationship at low N d and a weakening at higher N d is consistent across both subsets, precipitating clouds show a markedly higher sensitivity of LWP to changes in N d (m precip = 0.108) compared to non-precipitating clouds (m noprecip = 0.069). Figure 6a shows the influence of N d on LWP predictions, with color showing SST (same as in Figure 5a for PF). The differences in LWP sensitivity for low (blue dots) and high (red dots) SST suggest only a minor direct influence of SST on the N d -LWP relationship. The details of this influence are shown in Figure 6b as SHAP interactive effects. High (low) values for SST weaken (enhance) the sensitivity of LWP to changes in N d . This effect seems to be more pronounced in situations with a lower amount of cloud droplets. Higher SST conditions are associated with increased surface fluxes, a deeper and less stable marine boundary layer, and an increased moisture difference between the marine boundary layer and the troposphere [68,69]. In these conditions, the evaporation-entrainment process may be facilitated [40], leading to a less positive N d -LWP relationship. This finding is in good agreement with recent studies by Zhou et al. [41] and Zhang et al. [40]. It has to be noted, that Figure 6b shows the direct impact of SST on the N d -LWP relationship in MBLC. However, since SSTs are known to be drivers for other meteorological factors and the cloud state variables as described above, they may also have an indirect impact on the N d -LWP relationship exerted through these features in the model. Compared to other studies, this approach has the potential to separate the direct influence of SST from the indirect effects exerted through changes in secondary features like CTH, LTS and RH, as these are included in the model.

Conclusions
In this work, a machine learning model was trained on observation data and reanalysis model output representing important parameters in cloud development to predict the LWP of MBLC in the Southeast Atlantic. The goal of this study was to improve the understanding of how changes in N d affect the LWP of MBLC and how this relationship is influenced by meteorological factors and cloud state. The main findings of the study are that

•
Within the machine-learning model, the most important cloud state parameters for the prediction of LWP are PF, CTH, and N d , while the most important environmental predictors are MSL, v bc and SST. The machine-learning model is able to explain 70% of the observed variability in LWP (R 2 = 0.70). • Overall, a nonlinear but positive sensitivity of LWP to changes in N d is found, with a positive relationship at low N d values, which saturates at higher N d values. Unlike findings in a previous global study [18], the N d -LWP relationship at higher N d is not negative in the data set used here for the Southeast Atlantic. • Marked differences are found in the sensitivity of LWP to changes in N d for precipitating and non-precipitating cloud groups. The stronger sensitivity is likely due to an amplified importance of precipitation suppression in situations that already develop some drizzle. • Changes in SST show a direct influence on the N d -LWP relationship, with a decreased sensitivity of LWP to N d at higher SSTs. This may be attributed to increased evaporation-entrainment and deeper clouds due to the lower stability at higher SSTs.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10 .3390/atmos13040586/s1, Figure S1: Calculation of the spatial anomalies, Figures S2-S5: Results of the GBRT model based on the data set with the spatial gradients removed.