Relation between Convective Rainfall Properties and Antecedent Soil Moisture Heterogeneity Conditions in North Africa

Recent observational studies have demonstrated the relevance of soil moisture heterogeneity and the associated thermally-induced circulation on deep convection and rainfall triggering. However, whether this dynamical mechanism further influences rainfall properties—such as rain volume or timing—has yet to be confirmed by observational data. Here, we analyze 10 years of satellite-based sub-daily soil moisture and precipitation records and explore the potential of strong spatial gradients in morning soil moisture to influence the properties of afternoon rainfall in the North African region, at the 100-km scale. We find that the convective rain systems that form over locally drier soils and anomalously strong soil moisture gradients have a tendency to initiate earlier in the afternoon; they also yield lower volumes of rain, weaker intensity and lower spatial variability. The strongest sensitivity to antecedent soil conditions is identified for the timing of the rain onset; it is found to be correlated with the magnitude of the soil moisture gradient. Further analysis shows that the early initiation of rainfall over dry soils and strong surface gradients yet requires the presence of a very moist boundary layer on that day. Our findings agree well with the expected effects of thermally-induced circulation on rainfall properties suggested by theoretical studies and point to the potential of locally drier and heterogeneous soils to influence convective rainfall development. The systematic nature of the identified effect of soil moisture state on the onset time of rainstorms in the region is of particular relevance and may help foster research on rainfall predictability.


Introduction
Soil moisture can influence rainfall at a range of spatial and temporal scales: via direct recycling of moisture within the same region (>1000 km, interannual) [1,2] and advection of moisture caused by modification of regional circulation patterns (100-1000 km, weekly) [3,4] or indirectly, through modification of thermodynamic characteristics of the boundary layer (10-100 km, daily) [5][6][7]. An accurate representation of the soil moisture state and soil moisture-rainfall feedback is important for seasonal forecasting [8][9][10][11], as well as for reliable prediction of weather and climate extremes [12,13]. A large uncertainty in the feedback remains within the indirect path of soil moisture-rainfall coupling at convective scales, i.e., (sub-)daily and 10-100 km. Multiple modeling studies and limited observational evidence agree on the mechanisms governing the effect of soil moisture on deep convection triggering [14,15]. However, the influence of soil moisture state on rainfall and its properties, such as volume or timing, remains less studied and is expected to be more subtle [16]. In fact, observational evidence of the impact of soil moisture on rain properties is still lacking.
Mechanistic understanding of the soil moisture-precipitation coupling is often reduced to understanding how soil moisture affects moist convection triggering or, in other words, cloud formation. In this regard, multiple studies based on conceptual boundary layer column models and homogeneous soil moisture conditions suggest that both dry and wet soils can favor convective initiation, with their net effect depending mainly on the initial state of the lower atmosphere [5,17-21]. In essence, a dry soil advantage (negative coupling) will be observed if adding heat to the given boundary layer is more energetically efficient for the triggering of convection. Conversely, a wet soil advantage and positive coupling will be present if adding moisture to the lower atmosphere is more likely to lead to moist convection initiation [22]. On the other hand, model simulations with more realistic, heterogeneous soil moisture conditions emphasize the role of a dynamical mechanism that leads to a preferential triggering of convection over spatially drier soils (negative coupling); see, e.g., [15,23-26]. In this case, thermal contrasts between drier and wetter surfaces form circulation patterns, with higher updrafts and moisture convergence favoring rain occurrence over the dry patch [27]. The thermally-induced circulation has proven to be an important mechanism for convection initiation, even under unfavorable atmospheric conditions [14,28,29]. Analyses of multi-year satellite data have further demonstrated the climatological relevance of this mechanism for convection triggering in soil moisture-limited regimes [30] and particularly in the Western Sahel [28].
While most studies to date have focused on the relationship between soil moisture and cloud formation, similar physical mechanisms are expected to govern the relationship between soil moisture and convective rainfall. However, the complexity of processes underlying the evolution of rain systems suggests that variations in the strength and even the sign of soil moisture-precipitation coupling may exist [31][32][33][34]. It is generally believed that drier soils with spatial contrasts in soil moisture are more relevant for the evolution of weaker rain systems at the early stage of their development [14,35]. Mature systems with well-developed cold pools, on the contrary, will likely intensify over wetter areas [36,37]. Despite this complexity, observational analyses of rain event climatology suggest that the dynamical mechanism likely dominates over thermodynamical processes (dry or wet soil advantage) in determining soil moisture impact on rainfall formation and that the sign of this coupling is, therefore, predominantly negative [38,39]. This signal was found to be the most robust in the Sahel [29,40,41]. However, whether this dominance reflects a significant contribution of dry and heterogeneous soils to the properties of rainstorms in this semi-arid region has not yet been clarified. This is the main research question that we address in the present study.
Theory suggests that when triggered over wet surfaces, moist convection leads to higher rain amounts and shows later triggering times than when triggering occurs over dry soils [42,43]. In this case, a positive correlation between soil moisture and rainfall totals might be expected regardless of the atmospheric state [44]. At the same time, thermally-induced circulation is also expected to influence the timing, intensity, organization and therefore distribution of rainfall [23,25,32,45]. A recent analysis showed that deep convection forms 1-4 h earlier in the presence of surface heterogeneity, with the timing of the initiation being correlated with the magnitude of the sensible heat flux [26]. Furthermore, the strength of the spatial soil moisture gradient was found to be positively correlated with the amount of rainfall under benign atmospheric conditions [46]. Hence, the expectation is that a stronger soil moisture gradient with drier soil may lead to an earlier precipitation onset and more rain.
In the present study, we explore the evidence of these model-based coupling relationships in the climatological record of convective rainfall events. We focus on the North African domain, a large part of which is represented by the Sahel zone. We analyze 10 years of sub-daily satellite-based estimates of soil moisture and precipitation during the monsoon months of June-September (JJAS) and explore the relationship between morning soil moisture state (magnitude and spatial gradient) and six characteristics of afternoon convective rainfall: rain area, amount, maximum intensity, spatial heterogeneity, onset time and time of maximum intensity. We apply standard statistical methods to understand whether afternoon rainfall properties are sensitive to preceding soil moisture heterogeneity conditions (Section 5.1) and whether a relationship between surface heterogeneity and selected rainfall characteristics exists (Section 5.2). Then, we explore the consistency of the obtained results with the physical mechanisms by analyzing the variability in boundary layer conditions (Section 5.3). The paper proceeds with a description of the study domain (Section 2), observational datasets (Section 3) and methods (Section 4). The results are presented in Section 5, in which we also discuss the consistency of the main findings with the existing knowledge of model-based theoretical studies and corresponding physical mechanisms. Conclusions are given in Section 6.

Study Domain
The study domain extends from the tropical forest in the south to the Sahara Desert in the north (5-20° N, 20° W-40° E) and encompasses the entire Sahel (a well-known hot spot of land-atmosphere interactions) [47][48][49] (Figure 1a). Most of the annual rainfall in the region occurs during the monsoon months of June-September and is of convective origin [50]. Large-scale atmospheric structures like the African Easterly Jet and the African Easterly Waves largely modulate convective activity over the region [51,52]. Yet, the evidence supporting a significant role of the surface state in the triggering of deep precipitating convection is steadily growing [16]. Modeling and observational studies have suggested that the variability in soil moisture induced by rainfall exerts strong control on the magnitude of surface turbulent heat and moisture fluxes in the region [49,53,54]. The partitioning of the surface fluxes is in turn expected to influence the state and evolution of the daytime boundary layer [7] and, hence, affect the development of new rainfall [40,55]. Relatively dense vegetation in the south of the domain and the emergent vegetation at the end of the monsoon season obscure the correlation between soil moisture and surface fluxes and, therefore, reduce the effect of surface state on rainfall [53]. The wet monsoon months of July and August are also known to reduce surface influence on convection development [7,28]. In order to preserve a large sample of rainfall events in the present study, we do not aim to correct for the influence of vegetation and monsoon dynamics. The latter is expected to mainly reduce the magnitude of identified relationships, but is unlikely to change the dominant coupling signal; see, e.g., [28].

Satellite-Based Soil Moisture and Precipitation Datasets
Surface soil moisture data are obtained from Level 3 estimates of the Land Parameter Retrieval Model (LPRM, [56]) based on C-band observations from the Advanced Microwave Scanning Radiometer for Earth Observing System (AMSR-E, [57]). Data are available at 0.25° horizontal resolution. The AMSR-E unit is carried on board the Aqua satellite following a Sun-synchronous orbit with typically one overpass per pixel per day, at either 13:30 or 01:30 Local Solar Time (LST) over the Equator. In order to capture the soil moisture state shortly before afternoon deep convection onset, only data from the daytime overpass at 13:30 LST are used. The daytime overpass estimates, however, are known to have higher biases because of the greater temperature differences between surface and canopy [57]. Despite that, the AMSR-E product was found to be accurate in capturing rain-related soil moisture variability at the gauge level in the Sahel [58] and proved to perform better than other products over sparsely-vegetated and desert areas [59]. Due to the choice of the datasets and the definition of soil moisture-precipitation coupling used in the present study, the experimental setup largely replicates the one introduced by [38]. In addition, we apply the same soil moisture quality mask as in [38], i.e., we exclude pixels in which (i) the vegetation optical depth exceeds 0.8 (dense vegetation), (ii) the percentage of open water is more than 5% based on static maps from the 1-km Global Land Cover 2000 dataset (available online at http://forobs.jrc.ec.europa.eu/products/glc2000/products.php) or (iii) the temporal correlation between precipitation and subsequent soil moisture is poor. The latter condition is used to reduce the number of pixels covered with wetlands; see the Supplementary Materials in [38].
To define convective rainfall events and rain properties, we use three-hourly precipitation data from the Tropical Rainfall Measuring Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA) product [60], also at 0.25° horizontal resolution. The TMPA algorithm derives precipitation estimates using multiple passive microwave sensors (including the TRMM radar), infrared geostationary sensors and monthly rain gauge data. For the present study, TRMM-3B42 Version 7 is used [61]. To ensure similar solar forcing across longitude, three-hourly precipitation time-series are adjusted to LST by taking the closest three-hourly UTC time step. In comparisons of existing precipitation products at daily scales in West Africa, the TRMM-3B42 product has been shown to be closest to field measurements, while alternative datasets, such as the Climate Prediction Center Morphing Method (CMORPH, [62]) and Precipitation Estimation from Remotely-Sensed Information using Artificial Neural Networks (PERSIANN, [63]), showed a systematic overestimation [64][65][66]. The choice of TRMM-3B42 is justified in Appendix A. Both the soil moisture and precipitation data are available for the study period June-September, 2002-2011.
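The adjustment of three-hourly UTC steps to LST described above can be illustrated with a minimal sketch. It assumes that local solar time leads UTC by one hour per 15° of eastward longitude; the function name `nearest_utc_step` and the half-up tie-breaking rule are illustrative choices, not taken from the study.

```python
import math


def nearest_utc_step(lst_hour: float, lon_deg: float, step: int = 3) -> int:
    """Return the 3-hourly UTC step (0, 3, ..., 21) closest to a local solar time.

    Assumption: LST = UTC + lon/15 h (east positive); ties round upwards.
    """
    # Convert the requested local solar time to UTC and wrap into [0, 24).
    utc_hour = (lst_hour - lon_deg / 15.0) % 24
    # Round to the nearest multiple of `step`; 24 wraps back to 0.
    return int(math.floor(utc_hour / step + 0.5) * step) % 24


if __name__ == "__main__":
    # 15 LST at the western edge, the centre and the eastern edge of the domain.
    for lon in (-20.0, 0.0, 40.0):
        print(lon, nearest_utc_step(15, lon))
```

Under this convention, a pixel at 30° E reads its 15 LST value from the 12 UTC step, while a pixel on the Greenwich meridian reads it from the 15 UTC step.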

Atmospheric Profile Data from ERA-Interim
Atmospheric stability and moisture conditions in the boundary layer on rain event days are estimated using six-hourly records of temperature and specific humidity profiles, surface pressure and geopotential height from the European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA-Interim) database [67]. To assess the moisture content of the boundary layer at various heights, we calculate the relative humidity averaged within the 850-600 hPa layer (%), the humidity index (°C) as the sum of the dew point depressions at 50 and 150 hPa above the ground level following [22], the Total Precipitable Water (TPW, mm) as the total column-integrated water vapor based on the mixing ratio and the Lifting Condensation Level (LCL, hPa) for a parcel of air lifted from the surface. The stability of the atmospheric profile is evaluated based on the Convective Triggering Potential measure (CTP, J kg^-1) calculated after [22] and the Convective Available Potential Energy (CAPE, J). In addition, six-hourly estimates of Boundary Layer Height (BLH, hPa, originally in m) are obtained from the ERA-Interim database. All parameters are calculated prior to the rain, at 12 UTC. All model fields are re-scaled to 0.25° horizontal resolution using bi-linear interpolation. The characteristics of the datasets used in the study are summarized in Table 1.

Definition of Afternoon Convective Rainfall Event
The definition of the afternoon convective rainfall event and preceding soil moisture heterogeneity conditions largely follows the methodology suggested by [38] and extended by [39] and [41]. Following their method, we define a convective rain event location Lmax as a location where the rainfall accumulated between 15 and 21 LST exceeds a threshold of 6 mm (Figure 1b). The assumption that rain events are of the convective type in the study is reasonable: most (>90%) afternoon rainfall during the monsoon season in this region is known to be of convective origin [50,68]. Next, the location(s) of afternoon accumulated precipitation minima, Lmin, are identified within a 5 × 5-pixel box (1.25° × 1.25°) centered at Lmax. In order to better isolate the influence of soil moisture on rainfall, we exclude rain events if (i) morning (6-15 LST) accumulated precipitation in the box exceeds 1 mm, (ii) the elevation height difference within the box exceeds 300 m based on 1-Arc-Minute Relief Model data ETOPO1 [69] or (iii) two identified event boxes overlap. In the latter case, the box with the higher value of accumulated rainfall is preserved. For the full method details, the reader is referred to [41]. In the present study, we additionally limit the events to those in which at least two Lmin locations show zero rain, in order to partly restrict the scale of rainfall variability to the scale of soil moisture heterogeneity, i.e., to 50-100 km. This also reduces the number of events identified within propagating squall lines or large organized convective systems. It does not, however, fully restrict the sample to those events that evolve locally; this remains an inherent limitation of the current method.
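The event criteria listed above can be sketched for a single 5 × 5-pixel box as follows. This is a minimal illustration under stated assumptions, not the implementation of [38] or [41]: the function names are hypothetical, the morning-rain criterion is assumed to apply to any pixel of the box, the relief criterion is taken as the maximum minus the minimum elevation, and the overlap rule (iii) is omitted because it needs the full list of candidate boxes.

```python
def find_lmin(box):
    """Return (row, col) indices of all pixels holding the box minimum."""
    lowest = min(min(row) for row in box)
    return [(r, c) for r, row in enumerate(box)
            for c, value in enumerate(row) if value == lowest]


def is_candidate_event(afternoon_box, morning_box, elevation_box,
                       rain_threshold=6.0, morning_limit=1.0,
                       relief_limit=300.0, min_dry_lmin=2):
    """Check one 5x5 box centred on a candidate Lmax against the criteria.

    afternoon_box: rain accumulated between 15 and 21 LST (mm)
    morning_box:   rain accumulated between 6 and 15 LST (mm)
    elevation_box: terrain height (m)
    """
    # Lmax: afternoon accumulation at the centre pixel must exceed 6 mm.
    if afternoon_box[2][2] <= rain_threshold:
        return False
    # (i) Assumption: no pixel in the box may exceed 1 mm of morning rain.
    if max(max(row) for row in morning_box) > morning_limit:
        return False
    # (ii) Assumption: relief is the max-min elevation difference in the box.
    heights = [h for row in elevation_box for h in row]
    if max(heights) - min(heights) > relief_limit:
        return False
    # Additional criterion of this study: at least two Lmin pixels with zero rain.
    lmin = find_lmin(afternoon_box)
    r0, c0 = lmin[0]
    return afternoon_box[r0][c0] == 0.0 and len(lmin) >= min_dry_lmin


if __name__ == "__main__":
    rain = [[0.0] * 5 for _ in range(5)]
    rain[2][2] = 8.0
    dry = [[0.0] * 5 for _ in range(5)]
    flat = [[350.0] * 5 for _ in range(5)]
    print(is_candidate_event(rain, dry, flat))
```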
Following this entire filtering procedure, a total of 15,790 rainfall events is obtained for the 10 years of the study period. For every identified Afternoon Rain (AR) event, we then estimate: AR area, accumulations (also referred to as AR volume), maximum intensity, spatial heterogeneity, time of maximum AR intensity and time of first rainfall (onset). The definition and value range of every parameter are given in Table 2. The geographical distribution of identified AR event occurrence in the domain is found to be consistent with the typical distribution of intense Mesoscale Convective Systems (MCS) over the region [70] (Appendix Figure A3).

Measure of Local Soil Moisture Heterogeneity
For every identified AR event, we then estimate antecedent soil moisture heterogeneity conditions at 13:30 LST, i.e., prior to the event. The heterogeneity measure is represented by the spatial soil moisture gradient, Se, calculated as the mean difference in soil moisture anomaly, S', between Lmax and its surrounding Lmin locations, normalized by the distance (in meters) between them (Figure 1b,c). Note that the final gradient value is scaled per 100 m to increase its interpretability. The value of the soil moisture temporal anomaly, S', is calculated as a departure from the seasonal expectation. The latter is calculated as a linear average within a 21-day moving window centered on the particular day and across the entire multi-year record. As in [38], the year of the event is excluded from the calculation of the seasonal expectation.
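The anomaly and gradient definitions above can be sketched as follows. The data layout (a dictionary keyed by year and day of year for one pixel), the function names and the sign convention (Lmax minus Lmin, so that a drier Lmax gives a negative gradient) are assumptions made for illustration; the sign convention is chosen to agree with the interpretation of negative values given in the text.

```python
def seasonal_expectation(series, year, doy, half_window=10):
    """Mean soil moisture in a 21-day window centred on `doy`, over all
    years except the event year. `series` maps (year, doy) -> soil moisture."""
    values = [v for (y, d), v in series.items()
              if y != year and abs(d - doy) <= half_window]
    return sum(values) / len(values) if values else None


def anomaly(series, year, doy):
    """Soil moisture anomaly S' as a departure from the seasonal expectation."""
    expected = seasonal_expectation(series, year, doy)
    if expected is None or (year, doy) not in series:
        return None
    return series[(year, doy)] - expected


def soil_moisture_gradient(anom_lmax, anom_lmin, dist_m, per_m=100.0):
    """Gradient Se: mean of (S'_Lmax - S'_Lmin,i) / d_i, scaled per 100 m.

    anom_lmin and dist_m hold one entry per Lmin location (distance in meters).
    """
    terms = [(anom_lmax - a) / d * per_m for a, d in zip(anom_lmin, dist_m)]
    return sum(terms) / len(terms)


if __name__ == "__main__":
    pixel = {(2002, 200): 0.10, (2003, 200): 0.20, (2004, 200): 0.30}
    s_lmax = anomaly(pixel, 2002, 200)
    print(s_lmax, soil_moisture_gradient(s_lmax, [0.05, 0.05], [50000.0, 50000.0]))
```

Excluding the event year from the climatology, as in [38], prevents the event itself from contributing to its own baseline.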
To assess how unexpected the observed soil moisture gradient value is, we define a control sample, S_cntr, which represents the distribution of climatological soil moisture gradients in the same Lmax and Lmin locations, in the same calendar month, but in the non-event years. Unlike in previous studies, we calculate the departure from the typical conditions, ∆S, not for an aggregate of multiple rain events, but for every single rain event separately (Figure 1c). Finally, since the climatological variability of rainfall and soil moisture differs significantly between the dry northern and wet southern latitudes of the North African region, ∆S is normalized. In order to minimize the zonal dependency of the soil moisture heterogeneity measure, we divide ∆S for every AR event by σ(S_cntr), the standard deviation of the corresponding climatological sample of soil moisture gradients. We refer to this final metric as the local heterogeneity, LocS = ∆S/σ(S_cntr). A negative value of LocS means that in the location of maximum rain (Lmax), soils were more anomalously dry than in the neighboring location(s) where it did not rain (Lmin). This behavior is representative of a negative coupling and suggests the relevance of the dynamical (mesoscale) mechanism for rainfall development. Events with a positive LocS value correspond to cases in which soils were more anomalously wet in the location of maximum rain (Lmax) than in the neighboring location(s) where it did not rain (Lmin). This indicates a positive coupling and a lesser role of mesoscale circulation dynamics. The magnitude of LocS indicates how atypically strong the observed spatial gradient of soil moisture anomalies, Se, was in the morning of the rain event day, compared to the expected gradient for that specific location. It should also be noted that a soil moisture gradient estimated from the seasonal soil moisture anomaly in our setup corresponds well to a spatial gradient calculated in the same locations using absolute soil moisture values (R = 0.64, p < 0.001; see Appendix Figure A4). This should allow a more accurate interpretation of a spatial gradient in physical terms.
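The normalization described above can be written as a short sketch. The text states that ∆S is divided by the standard deviation of the control sample; the use of the control-sample mean as the "typical" gradient and of the sample (n - 1) standard deviation are assumptions made here, and the function name is hypothetical.

```python
from statistics import mean, stdev


def local_heterogeneity(se_event, se_control):
    """LocS = (Se - typical control gradient) / sigma(control gradients).

    se_event:   gradient Se on the morning of the rain event
    se_control: gradients at the same Lmax/Lmin locations, same calendar
                month, non-event years (the control sample)
    """
    # Assumption: the typical gradient is the mean of the control sample.
    delta_s = se_event - mean(se_control)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return delta_s / stdev(se_control)


if __name__ == "__main__":
    control = [-1.0, 0.0, 1.0]
    # A gradient two control standard deviations below typical conditions.
    print(local_heterogeneity(-2.0, control))
```

A strongly negative result corresponds to the case discussed in the text: an atypically dry Lmax relative to its non-raining neighbors.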

Statistical Test for Differences in Rain Characteristics
The sensitivity of the six AR properties to preceding conditions of soil moisture heterogeneity is evaluated using the Anderson-Darling two-sample test (A-D test), applicable to continuous and discrete distributions. For that, the complete AR event sample is stratified into deciles based on the LocS value. This ensures an equal number of events (∼1500) in every sub-sample and allows the separation of soil moisture heterogeneity conditions by the sign and strength of the soil moisture gradient. The A-D test is then applied to every AR property at every LocS value range separately to test the hypothesis that the distribution of an AR property for that specific LocS range (Empirical Distribution 1) significantly differs from the total distribution of this AR property (Empirical Distribution 2). From the A-D test, we store and plot its D-statistic (i.e., the weighted average squared difference between the two empirical distribution functions); see Figure 2. The two distributions are concluded to be significantly different if the p-value corresponding to a particular D-statistic is below 0.01 (gray-shaded bars in Figure 2).
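The stratification and test statistic described above can be sketched as follows. This is an illustration under strong assumptions rather than the procedure used in the study: the statistic is the tie-free (Pettitt) form of the two-sample Anderson-Darling statistic, it returns no p-value, and it is not suited to tied data such as three-hourly onset times or to comparing a decile against a total sample that contains it. A tie-aware routine, for example `anderson_ksamp` in SciPy, would be needed for such data. The function names are hypothetical.

```python
from bisect import bisect_right


def split_into_deciles(events, key):
    """Sort events by `key` (e.g. LocS) and split them into 10 equal groups."""
    ordered = sorted(events, key=key)
    n = len(ordered)
    return [ordered[i * n // 10:(i + 1) * n // 10] for i in range(10)]


def anderson_darling_2sample(x, y):
    """Two-sample A-D statistic, tie-free form:

    A2 = 1/(n*m) * sum_{i=1}^{N-1} (M_i*N - n*i)^2 / (i*(N - i)),
    where N = n + m and M_i counts the x-values not exceeding the i-th
    smallest value of the pooled sample.
    """
    n, m = len(x), len(y)
    total = n + m
    xs = sorted(x)
    pooled = sorted(list(x) + list(y))
    stat = 0.0
    for i in range(1, total):
        m_i = bisect_right(xs, pooled[i - 1])
        stat += (m_i * total - n * i) ** 2 / (i * (total - i))
    return stat / (n * m)


if __name__ == "__main__":
    # Well-mixed samples give a small statistic, separated samples a large one.
    print(anderson_darling_2sample([1, 3, 5, 7], [2, 4, 6, 8]))
    print(anderson_darling_2sample([1, 2, 3, 4], [5, 6, 7, 8]))
```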

Sensitivity of Afternoon Rain Properties to Preceding Soil Moisture Heterogeneity State
The results of the A-D test applied to every AR property at every decile range of soil moisture heterogeneity conditions (LocS) are summarized in Figure 2. Interestingly, the largest significant (p < 0.01) differences between the frequency distribution of any AR property and its expected (total) distribution are confined to the range of extreme negative LocS values (Range #1). This range corresponds to the AR events that formed over atypically strong negative soil moisture gradients and are associated with locally drier soils in the location of maximum rain (Lmax). In contrast, no other range of LocS values shows significant differences for any AR property statistics. This analysis was repeated with three alternative statistical tests (Kolmogorov-Smirnov, Chi-squared and Mann-Whitney) and led to qualitatively equivalent results (not shown).
The above finding suggests that strong negative soil moisture gradients have the potential to influence the properties of afternoon rainstorms in the region. Since negative soil moisture gradients in the LocS Range #1 are found to be over 1.5 standard deviations larger than usual (see x-axis in Figure 2), these gradients might be expected to produce equivalently strong spatial contrasts in surface heat and moisture fluxes. Following theoretical knowledge, these contrasts in turn can lead to the formation of local mesoscale (thermally-induced) circulations, which can influence storm development [15,27,28]. The potential relevance of this mechanism is further discussed in Section 5.3.

Relationship between Rainfall Properties and Strong Negative Soil Moisture Gradients
To explore the relationship between rain properties and preceding soil moisture state, we first address the question: 'Which rain conditions are more likely to occur given the antecedent soil moisture heterogeneity?'. To do so, the results in Figure 2 are further scrutinized by analyzing the corresponding Difference Histograms (HD) for each AR property at every LocS range. Results in Figure 3 (first row) indicate that the probability distribution of each AR property within the LocS Range #1 reveals a clear and systematic shift from its expected (total) distribution. This shift points to a higher probability of observing an AR system with smaller area and volume, weaker maximum intensity and lower spatial variability if the rain system developed over a strong negative soil moisture gradient. The most pronounced differences are observed in the timing of the events. Figure 3 (Columns 5-6) indicates a strong preference for rainfall to start earlier (at 15 LST), as well as to show earlier maximum intensity, when the soil moisture gradients are strong and negative (i.e., Range #1). Consistent with the A-D test results, any other differences in probability frequencies (of any AR property for any other heterogeneity range) are negligible and non-significant.
These results do not prove any causal relationship between soil moisture conditions and AR properties, but the observed systematic shift in the frequency of rain properties complies well with existing theoretical knowledge. Drier soils are known to trigger deep convection and rainfall earlier, especially when the vertical motion induced by turbulent buoyancy changes is amplified by the mesoscale circulation patterns [26,42]. On the other hand, the absence of moisture sources from the dry surface will likely erode the moist static energy faster and therefore tend to suppress convective cloud formation [43,44,71]. An estimate of the local afternoon rain maximum in our study implies that deep convection developed, or at least intensified locally, above Lmax. It is therefore reasonable to suggest that rain systems that evolved over relatively dry soils (strong negative soil moisture gradients) would not be sustained for long and would not grow into large rain complexes, as corroborated by the higher number of smaller and weaker AR events in the LocS Range #1. Because of the three-hourly temporal resolution of the TRMM-3B42 dataset, results for the two AR time parameters are likely to reflect similar information. Likewise, AR volume and maximum intensity are not fully independent of one another. Considering that the properties of convective rain systems are generally closely correlated [72,73], it is plausible to suggest that the size of AR might be a primary factor in the probability distribution shifts of other related parameters. In this view, a more detailed analysis considering the complete rain system life-cycle would be beneficial in the future.
As a next step, we assess the geographical patterns of the relationship between antecedent locally drier soils with strong spatial gradients and AR properties. For that, Spearman's rank correlations are calculated in every 5° × 5° grid box between each AR property and the value of the soil moisture gradient within the LocS Range #1. Figure 4 shows that the time of the AR onset and the time of maximum intensity are positively correlated with the magnitude of the underlying negative gradient, particularly in the eastern and western parts of the study domain. Significant correlation (p < 0.05) is found in 36% and 39% of the domain 5° × 5° grid boxes, respectively (Figure 4e-f). In those regions, the presence of a stronger negative soil moisture gradient will likely lead to an earlier precipitation onset time and possibly an earlier time of maximum precipitation intensity. This relationship agrees with the findings from multiple model-based studies [25,26] and supports a potential role of mesoscale circulation patterns in AR development in the region.
Figure 3. As in Figure 2, the histogram difference between two probability density functions is shown: one histogram corresponding to the sample of that particular AR property in the particular LocS range; the second one resulting from the total sample of that AR property independent of the LocS range. Unlike in Figure 2, the values of the rain properties are separated into bins and shown in actual units (from left to right): rain area (%), volume (mm/9 h), maximum intensity (mm/h), spatial heterogeneity (mm/9 h), time of maximum intensity (LST) and onset time (LST).
Following theoretical knowledge, it is the strength of the updraft over the dry soil patch that efficiently forces moister air from the surroundings up to (and beyond) the condensation level, triggering convection [17]. Hence, a stronger soil moisture gradient with potentially more vigorous circulation and a higher buoyancy flux may lead to a faster onset of precipitation. In contrast, turbulent updrafts over soils that are homogeneously dry and thus do not instigate mesoscale circulation are indicated to be less efficient in initiating convection [23]. In our sample, soil moisture gradients are found to be positively and significantly correlated with soil moisture anomalies in the Lmax location (R = 0.47, p < 0.001) and uncorrelated with the mean soil moisture anomaly in Lmin (R = −0.06, p < 0.001). The latter observation suggests that the soil drying in Lmax largely determines the strength of the spatial soil moisture gradient and thereby supports the potential for a greater buoyancy flux and more vigorous circulation when the gradient is strong. The generally low soil moisture content in the Lmax location of the events from the LocS Range #1 also supports the above assumptions (Appendix Figure A4). Yet, accurate surface flux estimates at high temporal resolution would be required to draw firm conclusions. Finally, in contrast to rainfall triggering, precipitation volume and perhaps also area are expected to be more closely correlated with the total available moisture amount than with the strength of turbulent updrafts [23,46,71]. This, together with the complexity of factors involved in precipitation development, would explain the lack of correlation for the other AR parameters.
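The rank correlations per 5° × 5° grid box used in this section can be sketched as follows. The event tuple layout `(lat, lon, gradient, onset_hour)`, the function names and the minimum sample size per box are hypothetical choices for illustration; significance testing is left out.

```python
import math
from collections import defaultdict


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return None  # undefined when one variable is constant
    return cov / math.sqrt(var_x * var_y)


def rho_by_box(events, size=5.0, min_events=3):
    """Correlate gradient with onset time separately in every grid box."""
    boxes = defaultdict(list)
    for lat, lon, gradient, onset_hour in events:
        boxes[(math.floor(lat / size), math.floor(lon / size))].append(
            (gradient, onset_hour))
    return {box: spearman_rho([g for g, _ in pairs], [t for _, t in pairs])
            for box, pairs in boxes.items() if len(pairs) >= min_events}


if __name__ == "__main__":
    sample = [(11.0, 1.0, -3.0, 15), (12.0, 2.0, -2.0, 18), (13.0, 3.0, -1.0, 21)]
    print(rho_by_box(sample))
```

With this sign convention, a positive coefficient means that more strongly negative gradients coincide with earlier onset times.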

Atmospheric Moisture on Days with Early AR Onset
The occurrence of AR events over relatively dry soils within the LocS Range #1 implies that enough moisture must be present in the boundary layer in order to trigger and sustain deep convection [74,75], which is not often the case in the semi-arid Sahel environment [76,77]. To explore the state of the lower atmosphere on days with the earlier AR onset time over locally drier soils with strong heterogeneity, we analyze a set of boundary layer parameters, which we calculate from ERA-Interim temperature and humidity profile data for every heterogeneity range. To take into account the strong variation in climatological regimes along the meridional transect [78], we confine our analysis to five narrow domains (Figure 1a). These domains each cover one degree of latitude and span from 8° W to 12° E. The restriction of the domain location towards West Africa is intended to exclude regions with extensive orography (see Figure 1) and to cover about one time zone. The results of this analysis are summarized in Figures 5 and 6.
First, we assess the co-variation of a corresponding boundary layer parameter at 12 LST with the value of soil moisture in the afternoon overpass of AMSR-E (13:30 LST) over the Lmax location; this is done for all AR events that fall within each heterogeneity range. The scatter plots obtained for one domain (11-12° N, 8° W-12° E) are shown in Figure 5 and are separated by the AR events with the earliest onset time (at 15 LST, blue-colored dots) and later onset times (gray-colored dots). The main feature of the identified co-variability between the boundary layer parameters and soil moisture is the presence of a relatively strong (non-)linear relationship for the heterogeneity Ranges #2-10 and the absence of such a relationship for the extreme negative Range #1. This can be seen in the observed range of Spearman's rank correlation coefficients estimated for every plot (red text and regression line; see Figure 5). The reason for this difference is a unique combination of the driest soil conditions and the wettest boundary layers where the majority of the AR events with the earliest onset time occur in the LocS Range #1 (Figure 5, left column). The clearest evidence of this relationship is found in all moisture-related parameters, independent of the height of the considered atmospheric layer. These are the mean Relative Humidity between 850 and 600 hPa (RH), Total column Precipitable Water (TPW) and the Humidity Index (HI). Consistent with the higher moisture amount, lower pressure levels of the Lifting Condensation Level (LCL) and Boundary Layer Height (BLH) are observed on days with the early AR onset. On the contrary, the two instability measures, expressed by the Convective Triggering Potential (CTP) and Convective Available Potential Energy (CAPE), do not show any notable distinctions between the days with the earlier and later AR onset.
To demonstrate the consistency of this unique relationship across the five selected domains, we analyze Joint Probability Distributions (JPD) of soil moisture and boundary layer conditions on days with early and late AR onset. Figure 6 shows the difference in the JPDs of soil moisture and RH (a), LCL (b) and BLH (c) calculated between the AR events with an early (15 LST) versus those with a late (i.e., 18-24 LST) onset time. This is shown for the locally drier and heterogeneous conditions (LocS Range #1) and averaged across all five domains. Differences in JPDs demonstrate that AR events with an earlier onset have a higher probability to occur over the driest soils and the wettest boundary layers in heterogeneous landscapes (Figure 6a). The higher probability of a lower pressure level of LCL and BLH at 12 LST also supports the existence of a very moist and shallow boundary layer on those days (Figure 6b,c). The presence of a strong negative soil moisture gradient appears as a key ingredient to generate early AR events in these conditions. This is seen in the distribution of the red crosses that mark the events with the 10% strongest negative soil moisture gradients (Figure 6a-c, red crosses).
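The JPD difference used here can be sketched as the difference of two normalized two-dimensional histograms computed on a common grid. The following is a minimal illustration under stated assumptions: the function names, bin edges and synthetic samples are hypothetical, and the averaging across the five domains is omitted.

```python
import numpy as np


def joint_probability(x, y, x_edges, y_edges):
    """Normalized 2-D histogram (joint probability) of x and y."""
    counts, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges])
    total = counts.sum()
    return counts / total if total > 0 else counts


def jpd_difference(sm_a, bl_a, sm_b, bl_b, sm_edges, bl_edges):
    """JPD of sample A minus JPD of sample B on the same bin grid.

    Positive cells mark conditions under which A-type events (e.g., early
    onset) are relatively more probable than B-type events (e.g., late onset).
    """
    jpd_a = joint_probability(sm_a, bl_a, sm_edges, bl_edges)
    jpd_b = joint_probability(sm_b, bl_b, sm_edges, bl_edges)
    return jpd_a - jpd_b


# Synthetic samples (assumption: illustrative values only, not study data).
rng = np.random.default_rng(1)
sm_early = rng.uniform(0.02, 0.10, size=300)  # early events: dry soils ...
rh_early = rng.uniform(70, 95, size=300)      # ... and moist boundary layers
sm_late = rng.uniform(0.02, 0.35, size=900)   # late events: no preference
rh_late = rng.uniform(30, 95, size=900)

sm_edges = np.linspace(0.0, 0.4, 9)    # soil moisture bins of width 0.05
rh_edges = np.linspace(20.0, 100.0, 9)  # relative humidity bins of 10 %

diff = jpd_difference(sm_early, rh_early, sm_late, rh_late, sm_edges, rh_edges)
print(diff.shape)            # rows: soil moisture bins, columns: RH bins
print(diff[:2, 5:].sum())    # dry-soil / moist-boundary-layer corner
```

Because both histograms are normalized before subtraction, the cells of the difference sum to zero, so the result depends on the relative preference of the two groups and not on their unequal sample sizes.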
These differences in surface and boundary layer moisture conditions between AR events with an earlier and later onset time are only present in the extreme LocS Range #1; under any other heterogeneity state (LocS #2-10), the less frequent rain events with the early onset do not cluster over a particular soil moisture range nor anomalous boundary layer conditions (Figure 5). To demonstrate the consistency of the latter observation across the domains, we also assess the difference in the JPDs of absolute soil moisture and boundary layer conditions, but now between early AR events that fall over extremely heterogeneous landscapes (Range #1) versus those that fall in less heterogeneous states (Ranges #2-10). Figure 6d-f supports our hypothesis: the joint probability frequency of soil moisture and RH (d), LCL (e) and BLH (f) for early afternoon events within the extreme negative heterogeneity range clearly deviates from the positive relationship inherent to Ranges #2-10, i.e., when drier soils correspond to drier boundary layers. The results again reiterate the preference of early afternoon events to occur when the boundary layer is very moist around noon and soils are dry and heterogeneous. Despite the consistency in the identified relationships, differences exist among the five latitudinal domains. The signal is found to be less prominent for the two driest (13-14°N and 14-15°N) regions. This is partly related to the weaker soil moisture gradients and smaller range of soil moisture conditions, but may also be affected by the overall reduced number of AR events at those latitudes (Figures A5 and A6). The results obtained for the time of maximum rain intensity were the same and are not discussed here. Likewise, the analyses were repeated for the early morning (06 LST) boundary layer state, showing less clear, but consistent results (not shown).
To summarize, the analysis of boundary layer conditions at 12 LST suggests that the early onset of convective rainfall over dry soils is mostly possible when the atmosphere is very moist and a strong spatial gradient in soil moisture is present. This result is consistent with findings of several large-eddy simulation case-studies over West Africa [75][76][77]. All of these studies showed that deep convection initiation in the presence of thermally-induced circulations was only possible when higher moisture volumes were brought by the monsoon flow into the region. Similar relationships have been found over the central U.S., and hence, are not expected to be limited to the Sahel region [33,71]. The results of our study further corroborate this knowledge by demonstrating the systematic nature of this process-based relationship over the climatology of AR events.

Conclusions
Our study provides the first observational evidence of the impact of soil moisture heterogeneity and locally drier soils on the properties of convective rainfall in the semi-arid region of North Africa during the monsoon season. Results obtained using 10 years of satellite data of precipitation and soil moisture, combined with ERA-Interim atmospheric reanalysis data, agree well with the expectations provided by existing model-based theoretical studies. In particular, the identified sensitivity of the rainfall onset time to the presence of locally drier soils and anomalously heterogeneous soil moisture conditions is in agreement with the high efficiency of thermally-induced (local and mesoscale) circulations to trigger convection earlier, under favorable boundary layer states [25,26,42,43]. The positive correlation revealed between the magnitude of soil moisture gradients and time of the AR onset in our study agrees well with the dependency of the initiation time of deep convection on the intensity of the updraft over the dry patch anomaly as identified by [26]. It should be noted, however, that the horizontal scale of 100 km considered in the present study is larger than the typical scale of 10 km over which the effect of mesoscale circulation on rainfall triggering is usually recognized. Yet, the possibility of thermal circulations to form over spatial soil moisture contrasts at a 100-km scale and modulate afternoon convection was also demonstrated by some case studies over the region [14,40]. Therefore, built on the methodology of [38], our findings further corroborate their initial hypothesis of the dynamical mechanism underlying the sensitivity of AR to strong negative soil moisture gradients in the North African region.
Though we did not find any direct relationship between the magnitude of the underlying soil moisture gradient and AR volume, size, intensity or heterogeneity, the systematic shift identified in their probability distributions towards smaller and weaker systems above the locally drier and heterogeneous soils is consistent with the general tendency of dry soils to suppress existing deep convection [43,79,80]. In addition, as mentioned earlier, the impact of soil moisture heterogeneity on AR amount, and possibly AR size, is more likely to occur via the contribution of moisture added by convergence to the total moisture amount than via increased strength of updrafts [46]. Given this complexity of factors and non-linear relationships involved in the rainfall evolution, quantification of soil moisture impact on rainfall amount has proven to be notoriously difficult in the past [81][82][83][84] and will likely remain so.
The role of humidity distribution within the lower atmosphere in the triggering of rainfall is more explicit. Analyses of the atmospheric state in our study demonstrate that a moist boundary layer is a necessary condition for early rainfall triggering over dry soils with strong heterogeneity. This result confirms previous theoretical findings. An earlier study demonstrated that a humid atmosphere that lacks sufficient thermal forcing from the land surface is not able to produce or sustain convection [71], while a more recent one showed that dry soils are unable to initiate rainfall if the atmosphere is too dry [75]. Our results illustrate the combined mechanism of these effects and further suggest that the combination of the wettest boundary layer and strong soil moisture heterogeneity might be key for early triggering and sustaining of rainfall over relatively dry soils, which would otherwise possibly not occur. The relevance of this mechanism over other regions of the world is foreseen [74].
One of the main limitations of our study is the implicit assumption that the identified rain systems evolved locally. Though conditioning the number of Lmin locations in our setup should reduce the amount of non-locally developed or at least intensified rain events, tracking single rain systems would be a valuable improvement to the method in the future. Another limitation concerns our current inability to prove the robustness of the identified soil moisture-rainfall coupling relationships with respect to the input precipitation dataset. Sensitivity analysis performed with the CMORPH-v1.0 precipitation product emphasized the strong dependence of rainfall properties at the event scale on the type of retrieval algorithm (Appendix A). The results obtained with the less numerous rain events from CMORPH-v1.0 data were found to be statistically insignificant, yet qualitatively similar to those of TRMM-3B42. This gave us more confidence that the identified relationships are not dataset dependent. Yet, the potential of current precipitation products to be used for process-understanding analyses at the event scale should be explored further.
To conclude, our findings underline the influence of soil moisture heterogeneity and the associated mesoscale circulations on the triggering of rainstorms in the North African region. As such, they highlight the importance of accurately representing soil moisture dynamics to enhance rainfall predictability in the region. As pointed out in previous studies, the correct simulation of rain timing remains a key challenge [85,86] and so does the accurate modeling of mesoscale circulation patterns [87].
to not only a distinct number of events, but also to distinct event locations. In this way, only 15% of all rain events had identical geographical locations of Lmax between the TRMM-3B42 and CMORPH-v1.0 products ([38], Supplementary).
Despite the insufficient number of AR events identified with CMORPH-v1.0 data, qualitatively similar results were obtained in the correlation test between the LocS range value and AR onset time using CMORPH-v1.0 events (Figure A2b), which gives us more confidence that the identified relationships are not dataset dependent. However, the regression coefficient was noisier and hence insignificant in the CMORPH-v1.0 case (not shown). The latter calls for a more extensive sensitivity analysis of the identified soil moisture-precipitation coupling relationships to the choice of input datasets in the future. In summary, the differences between the TRMM-3B42 and CMORPH-v1.0 products discussed above do not necessarily indicate the superior performance of TRMM-3B42 over CMORPH-v1.0 at the event scale. These results emphasize the need for a better evaluation of existing precipitation products at the event scale.

Figure 1. Study area and summary of the methodology. (a) Study domain in North Africa; the color-scale indicates elevation (m); black lines mark major rivers; and the orography mask is shown in red shading. The main orographic features in the region are the Air Mountains (AM), Darfur Mountains (DM), Ethiopian Highlands (EH), Cameroon Mountains (CM), Jos Plateau (JP) and Guinea Highlands (GH). The five smaller domains marked with black rectangles indicate the areas selected for the analyses of atmospheric conditions over the relatively flat West African region. (b) Schematic representation of afternoon rainfall event geometry. (c) Generalized example illustrating the calculation of the deviation, ∆S, of the observed spatial soil moisture gradient, Se, from the expected value of climatological soil moisture gradients, Scntr, for an individual rainfall event.

Figure 2. Sensitivity of afternoon rain properties to preceding soil moisture heterogeneity conditions.D-statistic of the Anderson-Darling (A-D) test, i.e., weighted average squared difference between the two selected Empirical Distribution Functions (EDF), for every LocS range (x-axis) and six rain properties: (a) rain area, (b) volume, (c) spatial heterogeneity, (d) maximum intensity, (e) time of maximum intensity and (f) onset time.The first EDF results from the sample of the specific AR property in the particular LocS range, while the second EDF results from the total sample of that AR property independent of the LocS range.Differences significant at the 99% level (p-value < 0.01) are shaded in gray.In the case of the two discrete AR time parameters (e,f), the calculation of the A-D statistic for every LocS range was repeated 100 times for every 30 randomly-selected event values.The two inset plots in (e) and (f) show the spread of the estimated p-values for every sample and every LocS range.The number of events in every LocS range is the same, i.e., ∼1500, because they are binned per decile.The same qualitative result was obtained for the two discrete distributions using the Chi-squared non-parametric test for a complete 1500 sample size (not shown).
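The repeated subsampling applied to the two discrete time parameters can be sketched as follows. This is a hedged illustration and not the study's code: the function name, the synthetic onset-time samples and the choice to draw 30 values from both the range sample and the total sample are assumptions, since the caption does not specify these details. The sketch uses the k-sample Anderson-Darling test from SciPy, which returns a normalized statistic that is not necessarily scaled like the D-statistic plotted in the figure.

```python
import warnings

import numpy as np
from scipy.stats import anderson_ksamp


def subsampled_ad_statistic(range_sample, full_sample,
                            n_draw=30, n_repeat=100, seed=0):
    """Repeat a two-sample Anderson-Darling test on small random subsamples.

    Assumption: n_draw values are drawn without replacement from BOTH
    samples in every repetition. Returns an array of n_repeat statistics.
    """
    rng = np.random.default_rng(seed)
    stats = np.empty(n_repeat)
    for i in range(n_repeat):
        sub_range = rng.choice(range_sample, size=n_draw, replace=False)
        sub_full = rng.choice(full_sample, size=n_draw, replace=False)
        with warnings.catch_warnings():
            # SciPy warns when the p-value falls outside its tabulated
            # range; only the statistic is used here.
            warnings.simplefilter("ignore")
            stats[i] = anderson_ksamp([sub_range, sub_full]).statistic
    return stats


# Synthetic 3-hourly onset times in LST (assumption: illustrative only).
rng = np.random.default_rng(42)
onset_hours = np.array([15, 18, 21, 24])
full_onset = rng.choice(onset_hours, size=15000, p=[0.25, 0.35, 0.25, 0.15])
range1_onset = rng.choice(onset_hours, size=1500, p=[0.70, 0.20, 0.07, 0.03])

shifted = subsampled_ad_statistic(range1_onset, full_onset)
baseline = subsampled_ad_statistic(full_onset, full_onset, seed=1)
print(shifted.mean(), baseline.mean())
```

Comparing the spread of the statistic for a given range against a baseline drawn from the total sample alone gives a rough sense of how far the range departs from the overall distribution.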

Figure 3. Preference of each rain property over antecedent soil moisture heterogeneity conditions. As in Figure 2, the histogram difference between two probability density functions is shown: one histogram corresponding to the sample of that particular AR property in the particular LocS range; the second one resulting from the total sample of that AR property independent of the LocS range. Unlike in Figure 2, the values of the rain properties are separated into bins and shown in actual units (from left to right): rain area (%), volume (mm/9 h), maximum intensity (mm/h), spatial heterogeneity (mm/9 h), time of maximum intensity (LST) and onset time (LST).

Figure 4. Spatial distribution of the correlations between the rain properties and soil moisture heterogeneity.Spearman's rank correlation coefficient between the soil moisture gradient, Se, and corresponding AR property value is estimated for each 5 × 5 degree grid box.Only the extreme negative LocS values (Range #1) are considered.Boxes with significant correlation (p-value < 0.05) are marked with a cross.Grid boxes where the number of events is below 20 are masked out.

Figure 5. Co-variability of antecedent soil moisture and boundary layer conditions on rain event days. Scatter plots between selected boundary layer parameters at 12 LST and soil moisture at 13:30 LST measured in the AR event Lmax location. The calculations are shown for one of the five African domains (11-12°N, 8°W-12°E) and for each of the 10 heterogeneity ranges. The blue points indicate AR events with an earlier onset time, i.e., at 15 LST. All the other AR events are indicated by gray dots. Boundary layer parameters shown are (from top to bottom) mean Relative Humidity between 850 and 600 hPa (RH), Total column Precipitable Water (TPW), the Humidity Index (HI), Lifting Condensation Level (LCL), Boundary Layer Height (BLH), Convective Triggering Potential (CTP) and Convective Available Potential Energy (CAPE).

Figure 6. Preference of early onset AR events to occur over particular antecedent soil moisture and boundary layer conditions. (a-c) Difference in the Joint Probability Distribution (JPD) of absolute soil moisture (at 13:30 LST) and each boundary layer variable (at 12 LST) between AR events with an earlier onset time (i.e., 15 LST) and a later onset time (i.e., 18-24 LST). The plots refer to the events occurring over extremely heterogeneous and dry soils (LocS Range #1) only. Accordingly, red shading indicates a higher probability of early onset AR events to occur over dry soil and moist boundary layer conditions as compared to AR events with later onset time (and vice versa for gray shading). (d-f) Difference in the JPD of absolute soil moisture (at 13:30 LST) and each boundary layer variable (at 12 LST) between early AR events that fall over extremely heterogeneous landscapes (Range #1) and those that fall in less heterogeneous states (Ranges #2-10). Accordingly, red shading indicates a higher probability of early onset AR events from the LocS Range #1 to occur over dry soil and moist boundary layer conditions as compared to early onset AR events from Ranges #2-10. All plots show the mean relationship over the five small domains illustrated in Figure 1a. Red crosses place the lowest 10th percentile (i.e., the strongest) negative soil moisture gradients from all five domains in the corresponding parameter space. The selected boundary layer parameters from left to right are: Relative Humidity (RH, %) (averaged between 850 and 600 hPa), Lifting Condensation Level (LCL, hPa) and Boundary Layer Height (BLH, hPa).

Figure A1. Correspondence of the number of rain pixels per day in TRMM-3B42 and CMORPH-v1.0 with Morning accumulated Rain (MR) or Afternoon accumulated Rain (AR) exceeding a threshold of 1 mm/9 h (a,b) or 0 mm/9 h (c,d).

Figure A2. (a) Probability histogram of AR system size estimated for the TRMM-3B42 and CMORPH-v1.0 datasets. The AR size parameter here is calculated as a percentage of AR pixels within an event box of 9 × 9 pixels; the 9 × 9 deg box is commensurate with the 90th percentile of all possible meso-scale convective system size values, i.e., 200,000 km² [72], and therefore should be well suited for capturing the range of possible AR areas. (b) Spearman's rank correlation between the soil moisture gradient and the onset time of afternoon rainfall calculated for each of the ten LocS ranges for either the TRMM-3B42 (red) or CMORPH-v1.0 (blue) dataset. Correlation is first estimated in every 5 × 5 degree grid box as in Figure 4 and then averaged across the domain.

Figure A3. Number of identified afternoon rain events aggregated over 1° × 1° grid boxes (gray shading) and applied orography mask at the 0.25° horizontal resolution (red dots).

Figure A4.Correspondence between the LocS value and the spatial gradient in absolute soil moisture estimated in the same Lmax and Lmin locations.Color shading indicates the magnitude of absolute soil moisture in the Lmax location only.The value of Spearman's rank correlation coefficient calculated between the LocS value and the spatial gradient is given.

Figure A5. Same as Figure 6a-c, but for every African domain separately. Note that pressure levels on the y-axis of the LCL and BLH plots are in reverse order.

Figure A6. Same as Figure 6d-f, but for every African domain separately. Note that pressure levels on the y-axis of the LCL and BLH plots are in reverse order.

Table 1. Characteristics of datasets used in the study. The common period 2002-2011 (JJAS) and horizontal resolution of 0.25° were imposed on the products. The CMORPH precipitation data are only used for additional sensitivity tests in Appendix A. For all calculations, the volumes of precipitation parameters are scaled to mm per 3 h.

Table 2. Definitions of Afternoon Rain (AR) properties. The minimum-maximum range of every AR property within our final sample of 15,790 events estimated based on the TRMM-3B42 precipitation dataset is also given.