1. Introduction
Monitoring of rainfall is essential for famine early warning and for the management of agricultural risk [1,2]. In many parts of the world the gauge station network is sparse, and hence monitoring cannot rely on ground-based measurements. An alternative is to use remotely sensed data, such as satellite-based rainfall estimates (i.e., rainfall that is estimated based on calibrated satellite imagery; for example, [3,4,5]). Although they both provide estimates of rainfall, satellite and ground-based methods are fundamentally different. Their agreement varies spatially and temporally, depending on the meteorological regime, satellite rainfall estimation methodology, and density of the gauge network [6,7,8,9].
The skill of remotely sensed rainfall estimates is generally improved by aggregating in time and space [
10,
11,
12]. The more rainfall is aggregated, however, the less representative it becomes of local meteorological conditions. When deciding on a scale for aggregation, improvement in skill must be balanced against loss of “representativity” [
13]. It is important to note, however, that not all applications that use rainfall data depend on instantaneous knowledge of local conditions. Indeed, applications based on temporally and spatially aggregated rainfall may be naturally suited to satellite-based rainfall monitoring. Agricultural losses, for example, are linked more closely to cumulative rainfall than to instantaneous rainfall at a point.
Assessments of the usefulness of remotely sensed rainfall thus need to account for the context in which the data are being used. Climate-related risk to agriculture is related both to variability in the weather and to the interaction between meteorological and land-surface conditions. The suitability of proxies, such as remotely sensed rainfall, depends, moreover, not only on skill in representing spatial and temporal variability, but also on the relative importance of this variability for the application in question.
In this paper, we illustrate these issues through a case study of the use of remotely sensed rainfall for weather index insurance (WII)—a form of drought insurance increasingly used in regions such as Africa and India. Unlike traditional insurance, which compensates proven loss, WII pays out in the event that a weather index is breached [
14,
15]. Lack of agricultural insurance in Africa increases the vulnerability of farmers to climate-related risk. WII has the potential to be cheap to administer and transparent to operate. In principle, WII provides a means of insuring smallholder farmers throughout Africa against drought. In a data-sparse region, such as Africa, expanding WII schemes requires the use of remotely sensed environmental data, including rainfall. Design of indices that maximise the skill of satellite-based rainfall estimates for the monitoring of agricultural drought is thus a pressing concern [
16]. However, assessment of skill is challenging in regions, such as Africa, where the ground station network is sparse. And yet, these are the regions most dependent on remotely sensed rainfall [
9]. Our case study presents a novel application of a well-established method for inferring uncertainties inherent in spatially and temporally aggregated remotely sensed rainfall [
17,
18,
19].
A key issue for WII is the mismatch between insured weather-based indices and agricultural losses. This can lead either to unfair payouts, or to uncompensated losses. Such mismatches are termed basis risk. Basis risk can result from losses that are not connected with variability in the weather. Cotton yield may, for example, be reduced by pests and diseases [20]. Another aspect of basis risk stems from complexities in the progression from below-expected rainfall (meteorological drought) to deficit in root zone soil moisture (agricultural drought). Complexities arise because below-expected rainfall is not necessarily a precursor to soil moisture deficit and, conversely, soil moisture deficits can occur even when rainfall is near normal [21,22]. Furthermore, if remotely sensed data are used, lack of skill in the estimation methodology can lead to errors in the indices themselves, and hence to increased basis risk. When designing indices based on remotely sensed rainfall, it is thus necessary to consider the propagation of uncertainty from errors in rainfall estimation to complexities in the link between rainfall and drought.
In this paper, we focus on cotton in Zambia, an economically important cash crop with highly variable yield (Figure 1), which is already insured using remotely sensed rainfall. It should be noted that cotton is sensitive not only to rainfall, and that sensitivity to other meteorological parameters may increase basis risk for WII schemes that are based only on rainfall. For example, laboratory-based studies have demonstrated that cotton yield is adversely affected by heat stress, with optimal temperatures being about 28 °C and severe damage ensuing when temperatures exceed ~35 °C [23,24]. Nevertheless, field studies carried out in Africa suggest that, while variability in temperature has a significant impact, rainfall variability is the dominant environmental driver of variation in yield [25].
In this study, we first use a land-surface model to describe the progression from meteorological to agricultural drought in the region. We then consider how aggregation in space and time affects the capacity of satellite-based rainfall estimates to represent local rainfall. The final part of the analysis draws these threads of analysis into a discussion of how the aggregation of rainfall relates to basis risk. The paper closes with a brief account of the bearing that the methodologies presented have on weather index design.
2. Data and Methods
This study combines land-surface model integrations with analyses of satellite imagery and survey of agricultural loss data. The land surface model chosen was the Joint UK Land Environment Simulator (JULES), the land surface model of the UK Met Office. JULES, driven with remotely sensed rainfall and reanalysis data, is used to investigate the development of drought and the scales of land-atmosphere interactions. Analysis of ensembles of equally likely rainfall is used to quantify algorithmic uncertainty in the rainfall estimates, and thus to quantify the effect of aggregation in space and time on skill. The following sections describe JULES, the satellite-based rainfall estimation methodology (Tropical Applications of Meteorology using SATellite and ground based data (TAMSAT)), and the agricultural loss data.
2.1. Representation of Agricultural Drought Using the Joint UK Land Environment Simulator (JULES)
JULES is a process-based land-surface model [26,27]. When coupled to one of the Hadley Centre atmosphere models, it forms the land-surface scheme of the Hadley Centre climate models. Being a process-based model, JULES can be used to investigate the links between climatic and other environmental factors and the condition of the land surface [28,29]. In Zambia, where there are no long-term observations of root zone soil moisture, output from models such as JULES can provide an indication of the nature of the links between the climate and the water available to plants. In this study, JULES is used to infer the expected strength of the link between rainfall and agricultural drought in Zambia. The following summarizes the features of JULES that are of greatest relevance to this study.
JULES divides the land-surface into nine surface types: broadleaf trees, needle leaf trees, C3 grass, C4 grass, shrubs, urban, inland water, bare soil, and ice. The land-surface types are tiled to represent sub-grid heterogeneity [
30]. Surface fluxes of moisture and heat are calculated for each tile, and the state of the grid box is then represented by the aggregation of the tile fluxes. JULES can be run either at a point or over a grid (distributed JULES). It is important to note that the formulation of distributed JULES used for this study does not include lateral transfer of heat or moisture.
JULES includes a multi-layer representation of soil. Each soil layer is described by a set of hydraulic and thermal properties (Table 3 in [
26]). In practice, these quantities are derived by applying pedotransfer functions to maps of soil texture [
31]. In distributed JULES, the soil hydraulic and thermal properties are allowed to vary from one grid point to another. Although it is possible to vary the soil properties with depth, for this study, they were assumed to be constant. The calculation of soil moisture is described in detail in the published descriptions of JULES [
26,
27]. The following summarizes the key processes. The soil moisture in the top layer of soil is related to the amount of water that reaches the soil, the maximum rate of infiltration and the rate of evaporation. For non-vegetated surfaces, the amount of water reaching the soil is equivalent to the precipitation. For vegetated surfaces, the amount of water reaching the soil is the canopy throughfall. The calculation of canopy throughfall in JULES accounts for precipitation, the water held in the canopy, and the rate of evaporation from the canopy water store (Equations (46) and (47) in [
26]). Once water reaches the surface, it either infiltrates, or runs off. The gridbox runoff rate is given by Equation (48) in [
26] and derived in [
32]. The infiltration is then calculated from the water balance.
In JULES, water can move vertically through the soil layers, and can drain from the lowest layer. The fluxes of water between layers are calculated using the finite difference formulation of the Richards equation:
$$\frac{dM_k}{dt} = W_{k-1} - W_k - E_k - R_k \qquad (1)$$
where $M_k$ is the moisture content of soil layer $k$, $W_{k-1}$ and $W_k$ are the diffusive fluxes of water flowing into and out of soil layer $k$, $E_k$ is the moisture extracted by plant roots from the lower layers, or evaporated from the top layer, and $R_k$ is the lateral runoff. In this study, the soil column is split into four layers (depths 0.1, 0.25, 0.65, and 2 metres). Note that the lateral runoff ($R_k$) is set to zero.
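The layer water balance described above can be illustrated with a minimal sketch. This is not the JULES solver: the fluxes between layers are supplied as inputs here rather than computed from moisture gradients, the explicit time step is an assumption made for clarity, and the function and argument names are hypothetical.

```python
def step_soil_moisture(moisture, infiltration, fluxes_down, extraction, dt):
    """Advance the moisture store of each soil layer by one explicit time step.

    moisture     -- moisture store of each layer, top to bottom (kg m-2)
    infiltration -- water entering the top layer (kg m-2 s-1)
    fluxes_down  -- flux leaving the base of each layer (kg m-2 s-1);
                    the last entry is drainage from the lowest layer
    extraction   -- root uptake (lower layers) or evaporation (top layer)
                    for each layer (kg m-2 s-1)
    dt           -- time step (s)
    """
    updated = []
    for k, store in enumerate(moisture):
        # Water enters the top layer by infiltration and every other
        # layer through the base of the layer above it.
        flux_in = infiltration if k == 0 else fluxes_down[k - 1]
        flux_out = fluxes_down[k]
        lateral_runoff = 0.0  # set to zero, as in this study
        tendency = flux_in - flux_out - extraction[k] - lateral_runoff
        updated.append(store + dt * tendency)
    return updated
```

With lateral runoff set to zero, the change in total column moisture over a step equals infiltration minus drainage minus total extraction, which is a convenient check on any implementation of this balance.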
The water availability to plants (beta) is a function of the soil moisture concentration (see above), the soil moisture concentration at the critical point and the soil moisture concentration at the wilting point. The critical point is the soil moisture concentration below which plants start to be affected by water stress. When soil moisture is lower than the wilting point, plants do not grow or transpire. The critical and wilting points depend on soil texture [
26].
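The formulation of beta is given in Equation (2):
$$\beta = \begin{cases} 1 & \theta \ge \theta_c \\ \dfrac{\theta - \theta_w}{\theta_c - \theta_w} & \theta_w < \theta < \theta_c \\ 0 & \theta \le \theta_w \end{cases} \qquad (2)$$
where $\theta$ is the soil moisture concentration, $\theta_w$ is the soil moisture concentration at the wilting point and $\theta_c$ is the soil moisture concentration at the critical point. Beta ranges from 0 to 1. When beta is 1, modelled plant growth is not affected by water stress.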
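The dependence of beta on soil moisture described above can be sketched as a small function. The linear ramp between the wilting and critical points is assumed here, and the function name and example values are hypothetical rather than taken from the study.

```python
def water_availability(theta, theta_wilt, theta_crit):
    """Return beta, the water availability factor, in the range 0 to 1.

    theta      -- soil moisture concentration
    theta_wilt -- soil moisture concentration at the wilting point
    theta_crit -- soil moisture concentration at the critical point
    """
    if theta_crit <= theta_wilt:
        raise ValueError("critical point must exceed wilting point")
    if theta >= theta_crit:
        return 1.0  # no water stress
    if theta <= theta_wilt:
        return 0.0  # plants neither grow nor transpire
    # Assumed linear ramp between the wilting and critical points.
    return (theta - theta_wilt) / (theta_crit - theta_wilt)
```

For illustrative values of 0.15 and 0.35 for the wilting and critical points, a soil moisture concentration of 0.25 lies halfway between them and gives a beta of about 0.5.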
Beta is directly related to the water stress on plants, and can thus be used to infer the degree of agricultural drought as the season progresses. The link between beta and rainfall, moreover, reflects the link between agricultural and meteorological drought.
The results shown in Section 3 derive from integrations of JULES carried out for 1983–2012 at 0.5° horizontal resolution over the illustrated horizontal domain. JULES was forced with three-hourly gridded time series of radiation, precipitation, temperature, humidity, wind speed, and surface pressure. All variables apart from precipitation were extracted from the WFDEI (WATCH Forcing Data based on ERA-Interim) forcing dataset [33]; the precipitation data are TAMSAT rainfall ensembles (see Section 2.2). The land cover surface type percentages and the soil properties at each grid point were provided as part of the WFDEI dataset [33].
2.2. TAMSAT and TAMSAT Rainfall Ensembles
There are a number of African rainfall datasets available at high resolution and for a sufficient time period for use in WII [
4,
34]. This study uses one such dataset, TARCAT, which is the historical product, based on the TAMSAT method. We focus on TARCAT because the underlying method has been shown to have good skill for Zambia [
35], and the dataset is already used in WII schemes for this region.
2.2.1. The TAMSAT Method
The TAMSAT algorithm uses Meteosat thermal infrared (TIR) imagery to determine the Cold Cloud Duration (CCD) parameter, which is defined as the length of time for which each pixel is colder than a predetermined threshold temperature [
3,
36]. CCD is then used as a proxy for rainfall [
11,
12,
35]. Such an approach is used in place of the instantaneous brightness temperature measurements often used in rainfall estimation [
37,
38], because TIR-only rainfall estimates are most skillful when aggregated [
10]. Although it is valid for convective rainfall events (
i.e., the longer a cloud top is below the threshold temperature, the greater one would expect the rainfall amount to be), the indirect relationship does not hold for warm rain processes where the cloud top temperature is less representative of the rainfall on the ground. As such, the TAMSAT algorithm is suited for much of tropical Africa, which is dominated by convective rainfall. Given the heterogeneous nature of the African rainfall climate, CCD fields are regionally calibrated assuming a linear relationship between CCD and rainfall for each calendar month using historic gauge measurements, ensuring the resulting estimates reflect the expected local conditions [
17,
18,
19]. The TAMSAT method has shown high levels of skill across Africa [
8,
10,
35,
39,
40,
41,
42].
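The two steps of the method, deriving CCD from TIR imagery and converting it to rainfall through a linear calibration, can be sketched for a single pixel as follows. This is an illustration under stated assumptions rather than the operational TAMSAT code: the threshold temperature, image interval, and calibration coefficients are hypothetical, and treating zero CCD as zero rainfall is an assumption of this sketch.

```python
def cold_cloud_duration(brightness_temps_k, threshold_k, slot_hours):
    """Hours for which one pixel is colder than the threshold temperature.

    brightness_temps_k -- TIR brightness temperature for each image slot (K)
    threshold_k        -- threshold temperature (K), regionally calibrated
    slot_hours         -- time represented by each image slot (hours)
    """
    return sum(slot_hours for temp in brightness_temps_k if temp < threshold_k)


def rainfall_from_ccd(ccd_hours, intercept, slope):
    """Linear CCD-rainfall calibration for one region and calendar month.

    intercept and slope would be fitted to historic gauge measurements;
    the values used in any example call are hypothetical.
    """
    if ccd_hours <= 0:
        return 0.0  # assumption: no cold cloud implies no rainfall
    return max(0.0, intercept + slope * ccd_hours)


# Example: five half-hourly slots, three of which are colder than 233 K.
ccd = cold_cloud_duration([250.0, 230.0, 225.0, 240.0, 220.0], 233.0, 0.5)
estimate_mm = rainfall_from_ccd(ccd, intercept=2.0, slope=4.0)
```

Because the calibration is fitted separately for each region and calendar month, the same CCD can correspond to different rainfall amounts in different places and seasons.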
2.2.2. TAMSAT Rainfall Ensembles
The TAMSAT rainfall ensemble algorithm is an extension of the standard TAMSAT rainfall estimation methodology. The standard TAMSAT method derives deterministic rainfall estimations from CCD (
Section 2.2.1). In reality, however, a given CCD is associated with a range of rainfall amounts. The TAMSAT rainfall ensemble algorithm generates multiple equally likely realizations of rainfall, for a given CCD field. Rainfall ensembles thus contain information on the inherent uncertainty in satellite estimates of rainfall [
17,
18,
19]. In regions with intermittent gauge measurements, the methodology provides a means of quantifying the uncertainty in satellite-based rainfall estimates. Specifically the ensemble range at a pixel, or over a region is linked to the uncertainty in rainfall estimation. In this study, rainfall ensembles are used to explore how spatial and temporal aggregation affect uncertainty; and how uncertainty in rainfall is propagated to uncertainty in soil moisture and hence in agricultural drought.
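The way in which ensemble range can be tracked under temporal aggregation is sketched below. The ensemble is represented simply as a list of daily rainfall series for one pixel, one per member; the function names and the toy values are hypothetical and are not output from the TAMSAT ensemble algorithm.

```python
def aggregate_in_time(daily_series, window):
    """Sum a daily series over consecutive, non-overlapping windows."""
    last_start = len(daily_series) - window + 1
    return [sum(daily_series[i:i + window]) for i in range(0, last_start, window)]


def ensemble_range(ensemble, window):
    """Range across members (max minus min) for each aggregation period.

    ensemble -- list of daily rainfall series, one per ensemble member (mm)
    window   -- aggregation period (days)
    """
    aggregated = [aggregate_in_time(member, window) for member in ensemble]
    return [max(values) - min(values) for values in zip(*aggregated)]


# Toy three-member ensemble in which members disagree on daily timing
# but agree on two-day totals.
toy_ensemble = [
    [0.0, 10.0, 5.0, 0.0],
    [10.0, 0.0, 0.0, 5.0],
    [5.0, 5.0, 2.0, 3.0],
]
daily_range = ensemble_range(toy_ensemble, 1)    # [10.0, 10.0, 5.0, 5.0]
two_day_range = ensemble_range(toy_ensemble, 2)  # [0.0, 0.0]
```

In this constructed example the range collapses to zero at two days, which exaggerates the effect; with real ensembles the range would be expected to narrow rather than vanish as the aggregation period grows.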
During the calibration stage, the relationship between CCD and rainfall can be characterized probabilistically to determine probability distributions of both rainfall occurrence and amount [
18]. Using this information, it is possible to generate an ensemble of rainfall fields by randomly sampling from the probability distributions. Carrying out this process for each pixel independently, however, would result in unrealistic, spatially uncorrelated fields. To overcome this, spatially independent “seed” pixels are chosen from the observed CCD field and the influence of the surrounding pixels on each seed pixel’s probabilities is calculated using a geostatistical process known as sequential simulation (SS) [
19]. SS is performed in two stages: (1) delineation of regions of rain and no rain, and (2) assignment of a rainfall amount to the rainy pixels. This process samples from each pixel’s occurrence and rainfall amount probability distributions (in a manner designed to preserve spatial correlations) and continues until all pixels in the domain have been considered. The entire process is repeated many times, producing a set of spatially coherent, equally likely rainfall scenarios that are consistent with the observed CCD field and the climatological CCD-rainfall relationship.
In this study we ran a 50-member ensemble implemented over the region shown in
Figure 2. The calibration was carried out against measurements from 36 rain gauges, distributed throughout Zambia.
2.3. Crop Yield Loss Data
Data on agricultural yield losses were gathered at the 38 locations shown on the maps included in
Figure 2. The loss percentage (actual/expected yield) is calculated using a combination of different sources of data, both quantitative and qualitative. Multiple sources of data are used, because there is no single reliable source of loss data for all sites. The reliability and credibility of the source of loss data, moreover, varies by location.
The sources of loss data are:
Information on farmer experience collected via semi-structured interviews. Inevitably these focus on the “worst years” that farmers remember, and there is a bias towards recent years,
Feedback from field staff of distribution channels, including agricultural extension agencies and suppliers of agricultural inputs, and
A simple yield stress model, relating rainfall deviation and multiplicative crop yield factors to calculated yield deviations for cotton for different historical years (based on FAO crop yield stress factors), supplemented by occurrence of droughts reported in the scientific literature.
Different credibility weights are assigned to each of these sources. For example, the early part of the record (before 1995) depends more strongly on the yield stress model, while the later data incorporate more information from farmer interviews.
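One way in which credibility weights could be used to combine the sources is sketched below. The text does not specify how the weights are applied, so the weighted mean, the source names, and the numerical values are all assumptions made for illustration.

```python
def weighted_loss(estimates, weights):
    """Credibility-weighted mean of the loss estimates that are available.

    estimates -- mapping of source name to loss fraction, or None if the
                 source has no information for this site and year
    weights   -- mapping of source name to credibility weight
    """
    available = {src: val for src, val in estimates.items() if val is not None}
    total_weight = sum(weights[src] for src in available)
    if total_weight <= 0:
        raise ValueError("no credible source available")
    # Weights are renormalised over the sources that reported a value.
    return sum(weights[src] * val for src, val in available.items()) / total_weight


# Hypothetical early-record year: no interview data, so the yield stress
# model dominates the combined estimate.
loss = weighted_loss(
    {"interviews": None, "field_staff": 0.4, "yield_model": 0.6},
    {"interviews": 0.5, "field_staff": 0.2, "yield_model": 0.6},
)
```

Renormalising over the available sources means that a missing source does not pull the combined estimate towards zero.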
4. Discussion
WII is one of various instruments for reducing vulnerability to drought [
14]. However, schemes that do not pay out during bad years, or pay out inappropriately during good years, may do more harm than good [
44]. Essential to successful WII is an index that reflects the agricultural conditions experienced by the policy holder—
i.e., low basis risk. Basis risk will be high if crop yield is strongly affected by factors that are not well correlated with the insured index. For example, if crops are insured on rainfall, but variability in yield is driven by pests and diseases, basis risk will be high. Low correlation between the rainfall index and field scale soil moisture will also increase basis risk.
In regions, such as Africa, where observational coverage is poor, inaccurate estimation of the insured index can worsen mismatch with agricultural losses. If ground-based observations are used, policy holders must live close enough to the station for the index to represent the meteorological conditions they are experiencing [
45]. If remotely sensed data are used, the index must be skillfully represented by the dataset. In the case of remotely sensed rainfall, the skill with which an index is captured depends to a large extent on the degree of spatial and temporal aggregation. This is because aggregation in space and time generally increases accuracy [
10].
Design of appropriate indices is especially challenging in regions that lack data for validation. In this study, errors in remotely sensed rainfall were assessed using a method designed to represent the inherent algorithmic uncertainty in the estimation process [
17,
18,
19]. This approach has the advantage of not requiring dense coverage of ground-based data for evaluation. In the case study presented here, it was shown that, for Zambia, accuracy is significantly enhanced when rainfall is cumulated over 5 days or more, or over large spatial domains (>100 km) (
Figure 6 and
Figure 7). Skill resulting from spatial aggregation must, however, be balanced against loss of representativity of local conditions. For Zambia, rainfall is fairly homogeneous, and
Figure 8 shows that even up to ~150 km, aggregated rainfall skilfully captures local conditions. While these specific findings are applicable only to this case study, the methods of analysis are applicable anywhere.
In the absence of
in situ root zone soil moisture measurements, the inherent link between rainfall and water stress on plants was assessed using a process-based land surface model.
Figure 3,
Figure 4 and
Figure 5 show that at the peak of the rainy season, there is some correlation between plant water stress and rainfall, but that this relationship varies considerably in both space and time. Such variability may explain some of the mismatches between rainfall and agricultural losses shown in
Figure 1.
Over the last few years, several pilot WII schemes have been successfully implemented for cotton in southern Zambia, based on TAMSAT rainfall data. As such schemes are extended within Zambia and beyond, it is critical that insured indices continue to be rigorously evaluated. This study has presented a suite of methods that can be used for this purpose, even in the absence of dense networks of station observations.