Farmer Perception, Recollection, and Remote Sensing in Weather Index Insurance: An Ethiopia Case Study

A challenge in addressing climate risk in developing countries is that many regions have extremely limited formal data sets, so for these regions, people must rely on technologies like remote sensing for solutions. However, this means the necessary formal weather data to design and validate remote sensing solutions do not exist. Therefore, many projects use farmers’ reported perceptions and recollections of climate risk events, such as drought. However, if these are used to design risk management interventions such as insurance, there may be biases and limitations which could potentially lead to a problematic product. To better understand the value and validity of farmer perceptions, this paper explores two related questions: (1) Is there evidence that farmers reporting data have any information about actual drought events, and (2) is there evidence that it is valuable to address recollection and perception issues when using farmer-reported data? We investigated these questions by analyzing index insurance, in which remote sensing products trigger payments to farmers during loss years. Our case study is perhaps the largest participatory farmer remote sensing insurance project in Ethiopia. We tested the cross-consistency of farmer-reported seasonal vulnerabilities against the years reported as droughts by independent satellite data sources. We found evidence that farmer-reported events are independently reflected in multiple remote sensing datasets, suggesting that there is legitimate information in farmer reporting. Repeated community-based meetings over time and aggregating independent village reports over space lead to improved predictions, suggesting that it may be important to utilize methods to address potential biases.


Introduction
The increasing climate hazards and extreme weather events that affect food production and agricultural income have been the target of risk reduction strategies for smallholder farmers in the developing world [1]. A key challenge in these risk reduction strategies is that there is very little information from in-situ observations for the design of solutions in the lowest income regions, where tools may be needed the most. In these situations, farmer recollections may be the most important source of information. However, there are several well-established and often interrelated fields of study that flag potential issues in farmer reporting.
The economics literature highlights strategic responses and the costs of information asymmetries. Farmers are likely to be biased in their reporting to negotiate for higher payouts, or to be reluctant to reveal valuable proprietary information that could be used against them in negotiations for loans, labor, or rent [2][3][4]. There is a substantial body of literature on the biases in reporting due to gender issues, when important information known by women is not represented, or is suppressed by gender dynamics in the reporting process [5][6][7]. In the psychology and behavioral economics literature, cognitive and social biases such as present bias, loss aversion, and social desirability bias are highlighted [8,9], while related work focuses on the cognitive challenges of recalling the events, identifying biases such as telescoping, heaping, recall delay, and anchoring [10][11][12][13]. The more time that passes between the harvest and an interview, the less accurate farmer reporting is [10]. There is a large body of literature on reporting errors in surveys. Bound et al. [14] reviewed the existing literature on recall errors using statistics in numerous sectors, including healthcare, labor, crime, and motor vehicle accidents. One general finding is that the longer it takes to recall an event, the greater the bias will be. It is important to test the validity of information on farmer reporting because of these many sources of potential biases in perceptions, recollections, and reporting.
We are not aware of any quantitative study that has scientifically evaluated the quality of farmers' perception and recollection of long-term weather impacts (30+ years) on agriculture. There is a growing interest reflected in the literature, particularly in the risk management sector, in farmers' perception and recollection of climate variability. The majority of the literature surrounding farmers' perception and recollection of weather information has focused on farmers' attitudes towards temperature and precipitation variability and measured how different characteristics affect their adaptation strategies [15][16][17]. Allahyari et al. (2016) [15] summarize fourteen related studies conducted since 2008 that focused on understanding the factors that shape farmers' perception, such as their farm size and household income. While a majority of the studies measured farmers' perception against national data sources (e.g., meteorological department information and national statistics), there was no systematic measurement of the accuracy and biases of their recollection of historic weather events.
Without individually addressing the broad set of specific possible biases and issues, we ask two related questions: (1) Is there evidence that farmers reporting data have any information about actual drought events, and (2) is there evidence that it is valuable to specifically address recollection and perception issues when using farmer-reported data? An index insurance project is perhaps the best setting to explore these questions. We utilized the case study of an index insurance project and relevant remotely sensed data sources to test these two questions; our goal was not to design an index insurance product. Index insurance is part of a suite of risk management strategies, and allows contract holders to reduce covariate risk that affects all members of a community or region simultaneously [18]. The contract is designed around an index based on satellite-derived datasets as a proxy for expected agricultural loss [19,20]. These insurance products use semiautomated, objective, remote sensing datasets to trigger payouts instead of insurance adjusters or direct assessment of loss, which enables fast payouts and low transaction costs, making insurance more affordable in the developing world [21]. This opens up new markets for insurance in the developing world, where claim-based insurance is too costly to develop due to extensive monitoring, administration, and claim verification. However, for index insurance to be sustainable, an increase in cooperation between the insurance industry and the remote sensing community is required [18].
Over the past decade, index insurance pilots have been developed to implement insurance contracts using satellite-derived rainfall data, vegetation index data, evapotranspiration, and area-yield estimates over large geographic areas [22,23]. The focus of many of these programs has been to leverage available datasets to provide the appropriate measure of weather impacts across large areas in order to scale up small pilot studies to reach more farmers [18]. Index insurance is more accurate and effective over homogeneous areas with a similar vulnerability to weather shocks, and when used effectively, can augment agricultural production during good years by increasing investment in productive resources, as well as protecting these assets during bad ones [24]. If insurance programs are to successfully help to manage weather-related risk, the satellite estimates of rainfall must reflect the perceived loss due to rainfall deficits. However, since remote sensing is often utilized because other data sources are not available, there are extremely limited ground data to validate remote sensing, or to design a product effectively linking remotely sensed estimates to farmer loss.
Index insurance faces many challenges. Despite significant weather-related losses, there are many who are concerned by the projects' "disappointing demand" [25,26]. There is a great deal of variation, with other insurance projects having high levels of demand, exceeding insurance demand in the United States and exceeding project logistical capacity [19,23]. Authors attribute low demand to issues such as an absence of confidence in financial institutions, gaps in complementary risk management and production tools, and inability to pay the premiums, with debates focusing on issues related to lack of farmer understanding and ineffective targeting of actual losses [27,28]. In order to effectively develop appropriate insurance coverage both from the perspective of scientific analysis and the farmer, it is imperative that issues in farmer perceptions and recollections be addressed and understood.
Clearly, to be effective, index insurance products need to be designed around an index that is closely related to the insured's actual loss experience, whether that be income from sales of agricultural production or assets such as livestock or savings. The mismatch between losses on the ground and insurance payouts, termed 'basis risk', is one of the greatest challenges for index insurance [18,23]. Farmers' understanding of the specifications of the index is clearly important. When farmers understand the index behind an insurance product and are aware of which years would have paid out before they sign up, it is argued that there is not only lower dissatisfaction [29], but also an increase in demand [23]. Again, the purpose of this paper is to test for issues in farmer recollection utilizing remote sensing in an index insurance setting. It is not to develop an index insurance product or to minimize basis risk.
In rain-fed agriculture, the hydrologic cycle is driven by rainfall, providing moisture to the root zone of crops. Since different satellite products measure different parts of the hydrologic cycle, there is potential to test farmer recollections by comparing with compounding evidence from several sources. Essentially, this consists of using satellite products independently in order to test recollections against different components of the hydrologic cycle [30]. Different satellite datasets have strengths and limitations based on their sensor technology and algorithmic development, which results in challenges in the development and calibration of remote sensing data for operational use, including index design.
An important consideration in index design is that datasets that measure components of the hydrologic cycle may not be the only relevant variables indicating key production problems farmers face. Low yields recorded might not correspond to a drought, but may, among other factors, be driven by the end of an agricultural subsidy program. There are potential challenges if a comprehensive dataset is utilized because it is available rather than because of its relevance. For instance, although coffee production statistics are often available historically, many coffee producers are not as vulnerable to quantity changes in coffee; rather their losses are driven by changes in quality, for which datasets are much less available and reliable [31]. Therefore, if a yield dataset alone is used for index insurance design, but it is not related to the targeted loss, this may cause serious inconsistencies.
Agricultural models are often utilized, but these may also be misspecified, missing key local features. Models may have inaccurate assumptions about sowing timing, varieties, phenology, or the availability of labor or inputs. An inaccuracy of just two weeks can lead to a dramatically different identification of which years are the most damaging droughts. Models may also fail to reflect the gaps in farmers' production portfolios that make them vulnerable to risk.
Farmer recollection data are often the primary source of information available for the design of the insurance [32]. Consequently, these projects are vulnerable to the limitations and biases of farmers' recollection and the sensitivities of perceptions when reporting. Given the nature of index insurance, it is a valuable environment for our case study focusing on the challenges in farmer recollections, perceptions, and reporting.
There are alternate approaches to obtaining information from farmers. Our work here is most closely related to community-based observing networks (CBON), in which farm-level expertise is utilized to frame locally-defined poor rain ('bad') year data which are representative of socially-mediated vulnerability [33] to improve the relationship of the index to the problems faced by the community and to strengthen the agency of the farmers [34]. We are therefore more focused on the issues relevant to community-based observing networks, as opposed to the issues in the citizen scientist literature, for which local citizens are trained to reliably and accurately read and report sensor data (for example, rainfall measurements).

Case Study
We studied the R4 Rural Initiative partnership between the UN World Food Programme (WFP) and Oxfam America. The R4 index insurance project in Ethiopia is one of the largest and longest-running developing country index insurance programs targeting individuals, with tens of thousands of farmers and nearly one hundred villages. R4 refers to the four risk management strategies integrated through the project to strengthen farmers' food and income security: improved resource management (risk reduction), insurance (risk transfer), microcredit (prudent risk taking), and savings (risk reserves) [35]. The focus of the R4 program, part of a broader 'risk reduction' strategy, is to improve the resilience and food security of vulnerable rural households facing increasing climate risks [1] through index insurance, along with efficient allocation of the resources already in a community (e.g., forests, water, health facilities, schools, land, and livestock). In Ethiopia, this program builds on the success of the Horn of Africa Risk Transfer for Adaptation (HARITA) initiative [36]. Our current research extends work on the application of satellite datasets for index insurance design [32], to utilize remote sensing as a tool to study farmers' perception of loss.
Impact assessments conducted over the past few years on index insurance programs in East Africa showed significantly better outcomes for communities and individual households who participated in insurance programs. Janzen and Carter [37] reported a 22-36% reduction in the sale of assets after a severe drought in Kenya, as well as an average reduction of 27-36% in the probability that insured households reduce meals, relative to their uninsured counterparts, during the post-drought period. The study also showed that insured households were 45-50% less dependent on food aid and less dependent on other forms of assistance. Mude [38] reported that households with livestock insurance experienced a 25% reduced likelihood of significantly limiting their nutritional intake, a 25% reduction in distress sales of livestock assets, and a 33% reduction in their reliance on food aid. Madajewicz et al. [36] also found that insured farmers, on average, increased their savings and the number of oxen they owned relative to uninsured farmers for the particular index insurance project we studied.
Insurance contract and index design in R4 used a participatory approach that integrated local farmers' and experts' knowledge into the design process in the first year as a 'dry run'. This included economic risk simulations (or games) to help farmers to understand the contract and to inform design [23]. Once an index was designed that was as closely associated as possible with known losses, economic research games/exercises were completed at the farmer and institutional level and a capacity building program was pursued [23]. In the second year, the program provided insurance for several thousand farmers with further refinement of the insurance contracts. In subsequent years, a significant expansion of the number of communities was achieved by involving more communities in the dry-run exercise every year, and by building on positive sentiment in participating communities.
Drought is the focus of the R4 insurance product. Although lack of rainfall is just one of many causes of low crop yields that a farmer might experience, including insect damage [39], heat stress [40], and inadequate nutrient support [41], it is often the dominant cause of yield reduction [42][43][44] and is consequently often the focus of index insurance, even in an environment with multiple risks [45]. By focusing index insurance on dominant drought-related risk, it is hoped that publicly accessible data products (such as remote sensing) are of sufficient accuracy to trigger appropriate payouts. Other types of yield losses are likely to require a traditional insurance adjuster, leading to much higher premium levels [46]. Therefore, by covering the bulk of a dominant risk to farmers, remote-sensing-based index insurance projects focusing on drought have value in many applications, including R4, particularly when complementary non-insurance risk management is simultaneously introduced for other perils.

R4 Rainfall Index Design and Threshold
Scalable index insurance programs require reliable historical rainfall data records across space. Therefore, the index used in the Ethiopia R4 program is based on the satellite-derived African Rainfall Climatology Version 2 (ARC2) dataset [47]. The index design parameters include a trigger, exit, and cap. The trigger establishes the amount of rainfall within a certain time period below which the insurance contract will pay, and is set through analysis of 30+ years of satellite-derived rainfall estimates for past growing seasons, along with farmers' planting criteria and community knowledge of drought return frequency. The R4 project employs fractional payouts, i.e., partial payment between the trigger (above which there are no payouts) and an exit value (below which all contract holders receive 100% of the value of the policy). The payout between the trigger and the exit precipitation values follows a linear function, and each value is defined each year in each community according to the needs and interests of the community [48]. The cap is designed to account for an excess of rainfall during the defined window so that an index is not susceptible to short bursts of heavy rainfall that may not be useful to the crop insured. This allows for payouts to occur when rainfall totals are sufficient, but rainfall amount is poorly distributed within the season.
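The trigger, exit, and cap logic described above can be illustrated with a minimal sketch. The function names, the millimeter values, and the assumption that the cap is applied to each sub-period of the window (for example, each 10-day period) are ours for illustration; they are not taken from the R4 contracts.

```python
def capped_total(period_rainfall_mm, cap_mm):
    """Sum rainfall over a window, counting at most cap_mm per sub-period.

    Assumption: the cap is applied per sub-period so that a short burst of
    heavy rain cannot mask an otherwise dry window.
    """
    return sum(min(r, cap_mm) for r in period_rainfall_mm)


def fractional_payout(rainfall_mm, trigger_mm, exit_mm):
    """Return the payout as a fraction (0.0 to 1.0) of the policy value."""
    if exit_mm >= trigger_mm:
        raise ValueError("exit must be below trigger")
    if rainfall_mm >= trigger_mm:
        return 0.0  # at or above the trigger: no payout
    if rainfall_mm <= exit_mm:
        return 1.0  # at or below the exit: full payout
    # Linear interpolation between trigger (0%) and exit (100%).
    return (trigger_mm - rainfall_mm) / (trigger_mm - exit_mm)


# Hypothetical window: a single heavy burst surrounded by dry periods.
window = [10.0, 80.0, 5.0]
index_value = capped_total(window, cap_mm=30.0)
payout = fractional_payout(index_value, trigger_mm=100.0, exit_mm=40.0)
```

In this hypothetical window the uncapped total is 95 mm, but the capped index is 45 mm, so a payout is still triggered even though most of the rain fell in one burst.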
Drought conditions can occur without being captured by the satellite rainfall data product. ARC2 and other satellite-derived rainfall products use gauge data as part of the retrieval [49,50], which makes the accuracy of rainfall products highly related to the amount of data reported and used in the algorithm [51]. Data products also have increasing deviations from observed rainfall in coastal and mountainous regions, since orographic effects are not included in the data product [52]. Spatial resolution is another important factor for capturing differences in rainfall across the landscape, particularly in regions with monsoonal rainfall [53,54]. One known issue with ARC2 relates to its approach for integrating rain gauge data. If there is a change in the number of gauges reporting over time in a region, this may lead to biases in the rainfall estimates over time [55].
These factors have an important implication for index design: remote sensing products cannot be effectively applied unless their estimates and observations can be properly corroborated using other sources of data. There is often a tradeoff between fine spatial resolution data products and temporal resolution, or the length of time it takes for the sensor to pass over the same area. Data products with coarser spatial resolution usually have more frequent observations, which make them particularly useful for index insurance that requires near real-time updates [56]. The acceptance of satellite data by farmers is often determined by the accuracy, the monitoring capacity, and the limitations of the data, products, and model. It is also imperative for the farmers to self-evaluate the results with real-world observations and experiences, and for researchers to communicate individual strengths, limitations, and value of the products [57].

Methodological Framework Objective
In order to test if there is evidence that farmers reporting data have any information about actual drought events and if it is valuable to specifically address recollection and perception issues when using farmer-reported data, we examined the cross-consistency of farmer-reported seasonal vulnerabilities against the years reported as having anomalously low rainfall in independent satellite data sources. We used logistic regressions to test how well remote sensing estimates can predict farmer recollection of historical drought years in Ethiopia and rainfall deficits observed via satellite. We complemented our regressions with the Heidke skill score, Peirce skill score, and the equitable threat score methods of forecast verification to see which part of the season agrees with farmer-derived bad year data. We explored whether potential bias in farmer reporting was filtered through aggregation by discussing regressions across temporal aggregation, i.e., repeated discussions, and spatial aggregation, i.e., using information from nearby villages.
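As an illustration of the forecast verification scores named above, the following sketch computes them from a 2x2 contingency table using their standard textbook definitions. Treating the satellite-flagged dry years as the "forecast" and the farmer bad years as the "observation" is our assumption for the example, and the yearly flags are made-up values, not data from this study.

```python
def contingency_counts(satellite_dry, farmer_bad):
    """Count hits, false alarms, misses, and correct negatives for paired yearly flags."""
    hits = false_alarms = misses = correct_negatives = 0
    for sat, farm in zip(satellite_dry, farmer_bad):
        if sat and farm:
            hits += 1
        elif sat:
            false_alarms += 1
        elif farm:
            misses += 1
        else:
            correct_negatives += 1
    return hits, false_alarms, misses, correct_negatives


def skill_scores(a, b, c, d):
    """Standard 2x2 scores: a=hits, b=false alarms, c=misses, d=correct negatives.

    Degenerate tables (e.g., no observed events) would divide by zero and
    are not handled in this sketch.
    """
    n = a + b + c + d
    heidke = 2.0 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d))
    peirce = a / (a + c) - b / (b + d)  # hit rate minus false alarm rate
    random_hits = (a + b) * (a + c) / n  # hits expected by chance
    ets = (a - random_hits) / (a + b + c - random_hits)
    return {"heidke": heidke, "peirce": peirce, "ets": ets}


# Hypothetical ten-year record for one village (1 = dry / bad year).
satellite_dry = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
farmer_bad = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
scores = skill_scores(*contingency_counts(satellite_dry, farmer_bad))
```

All three scores equal zero for a forecast with no skill relative to chance and one for perfect agreement, which makes them convenient for comparing different parts of the season.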

Data Information
We explored a variety of satellite-derived datasets that are commonly used in the weather insurance industry and added a new soil moisture product that we believe has potential for informing crop production due to weather conditions. Our goal was to have simple proxies from largely independent sensors that provide different windows into the hydrologic cycle: from initial rainfall to soil moisture, to the evapotranspiration process, to landscape level vegetative response.
To perform a conservative test we utilized simple, commonly available forms of these products, as opposed to formulations highly optimized for yield estimation or index insurance purposes. There are limitations in using these satellite products, as the information in coarse resolution images in the visible spectrum is particularly susceptible to aerosol- and water vapor-related interference [56], and vulnerable to cloud and cloud shadow interference [58]. Sensors might also over- or underestimate the volume of rainfall for a particular time and place. According to a study done in Ethiopia by Ayehu et al. [59], Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) sometimes overestimated the frequency of rainfall by 31% and ARC2 underestimated rainfall observed by rainfall gauges by 24%. However, the estimation skill of CHIRPS was less affected by variation in elevation compared to other satellite products in Ethiopia. If these different sources provide compounding evidence about farmer recollection, we can have higher confidence than we might have through a single source, which might have specific errors or biases. However, to minimize basis risk, it is important to utilize more sophisticated optimized products if these sources are to be utilized in index insurance projects. It is also important to keep in mind that the performance of any specific product is likely to be biased downwards in our purposefully conservative test.

Farmer Perception Data
Similar to other studies, as part of the R4 index insurance design process, information on dry periods was collected using group meetings and linking climatic anomalies to events of importance in the communities [60]. This approach is referred to as a "participatory design process", which the International Research Institute (IRI) employs when it collects data and in the application of projects in the developing world. The data used from this suite of activities are referred to as "farmer bad years", or historical years in which farmers representing a particular community recall significant drought-related production losses for that community (not for a specific farmer). The participatory design process consisted of multiple visits to communities within a geographic area, and often multiple visits to the same community. The meetings with farmers that provided the "farmer bad years" for the analysis in this study were conducted over a series of years in 81 villages that participate in the R4 index insurance program in Tigray, Ethiopia. These meetings were designed by local and international staff with experience in farmer reporting issues, and therefore included mechanisms to help address anticipated recollection and bias issues. The meetings involved formal mechanisms to improve representation across gender, ethnicity, age, wealth, land owners, laborers, and across agricultural crop and production types. Formal game exercises were played using drought events to assist with recall. The years of local, regional, and national events were noted and used to help people to recall the years of historical droughts. The farmer representatives were divided into groups. Farmers in each group were asked to recall and discuss the worst eight years for the community in terms of drought, going back to the beginning of the satellite rainfall data record, 1983 (see Table 1). The groups each reported the years they identified and reconciled differences together to arrive at a single list. We used the community perceptions of drought generated for the R4 project to provide a unique opportunity to understand how this related to other sources of information. This analysis used 81 total villages, out of which 21 were visited multiple times, allowing for temporal comparisons. Figure 1 shows the locations of each of the years that have representative farmer perception data.

African Rainfall Climatology Version 2 (ARC2) Rainfall Estimates
Many indexes used in R4 projects in Africa are based on the NOAA Climate Prediction Center (CPC) African Rainfall Climatology Version 2 (ARC2) dataset, which blends satellite and rain gauge data to create rainfall estimates extending back to 1983 [47]. The ARC2 rainfall estimate data have been used as an input in crop yield estimation and forecasting models [61,62] as well as in monitoring activities of the Famine Early Warning Systems Network (FEWS NET). ARC2 data have been used to design the indexes in the R4 Ethiopia project; therefore we used ARC2 rainfall estimates as the baseline independent variable for the models in this analysis. ARC2 rainfall estimates were summed for two targeted windows per growing season, an "early" window and a "late" window.

Climate Hazards Group InfraRed Precipitation with Station Data (CHIRPS)
Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) is a 35+ year quasi-global record of rainfall, merging satellite and station data and a long-term climatology [49], and is a product that has been shown to capture rainfall variability and outperform other datasets [63,64]. CHIRPS has been vital in the ability of many IRI-associated index insurance projects to expand spatial coverage for satellite-based rainfall estimates. Spanning 50°S-50°N (and all longitudes), starting in 1981 to near-present, CHIRPS incorporates 0.05° resolution satellite imagery with in-situ station data to create gridded rainfall time series for trend analysis and seasonal drought monitoring. A study validating new satellite rainfall products in the Upper Blue Nile Basin, Ethiopia, found that CHIRPS is less affected by topography than other satellite estimates. Although there is a tendency for CHIRPS to show small amounts of rain during the dry times, the values are small and do not meaningfully affect any sort of seasonal total. The study's findings justified the use of the CHIRPS product for various operational applications in this region in Ethiopia [59]. CHIRPS is also used for remote sensing rainfall estimates important in index insurance design, most notably because it is widely available outside of Africa, beyond the coverage of the ARC2 and TAMSAT products, which are used for operational monitoring [65].

The Atmosphere-Land Exchange Inverse (ALEXI)
The atmosphere-land exchange inverse (ALEXI) model partitions measurements of bulk radiometric temperature into plant and soil temperatures in order to make separate estimates of soil evaporation and plant transpiration [66]. Soil estimates infer information about the top few centimeters of soil, and plant transpiration includes water in the entire root zone region. Since 2000, ALEXI has been running daily, estimating evapotranspiration (ET) over the continental United States. Collaborator Anderson has been running the model over Africa from 2007 to the present and assessing its performance [67,68]. Drought index-insurance designs in Malawi, Tanzania, and Kenya have used relative evapotranspiration datasets as they were more closely related to the phenology of crop development for the crops they were covering than the precipitation datasets [69]. We included ALEXI because it is freely available and more distinct from our other sensors than many of the other evapotranspiration datasets commonly utilized in insurance. Our goal was to increase its independence from other remote sensing products in our analysis, as opposed to minimizing basis risk, which would be important in actual insurance design.

Moderate-Resolution Imaging Spectroradiometer (MODIS), Normalized Difference Vegetation Index (NDVI), Enhanced Vegetation Index (EVI)
The moderate-resolution imaging spectroradiometer (MODIS) sensor provides consistent spatial and temporal estimates of vegetation canopy greenness, a composite property of leaf area, chlorophyll, and canopy structure, from February 2000 to the present at 250 m resolution. MODIS data can be used to model photosynthetically active radiation and have been shown to be related to the amount of biomass on the ground [70] and productivity of crops [71]. The normalized difference vegetation index (NDVI) data are related to the amount of healthy or dense vegetation, whereas if the vegetation is unhealthy or sparse, the vegetation reflects more visible light and less near infrared light, resulting in lower NDVI values [72]. The enhanced vegetation index (EVI) was developed to improve the sensitivity of the signal over dense vegetation regions and reduce canopy-soil variations and atmosphere influences [73]. NDVI is one of the most common products utilized in index insurance. For example, Klisch and Atzberger [74] described the operational use of NDVI-derived drought indicators for triggering payouts of disaster contingency funds in Kenya. By measuring spatial and temporal aggregations, they presented a fully operational processing chain for more accurate mapping of drought occurrence, extent, and strength, based on MODIS NDVI data.
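For readers less familiar with these indices, the sketch below writes out the standard NDVI definition and the commonly published MODIS EVI formulation as functions of surface reflectance. The reflectance values are hypothetical and chosen only to show that sparse or stressed vegetation, which reflects more red and less near infrared light, yields a lower NDVI; this is not the processing chain used in this study.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    denominator = nir + red
    if denominator == 0:
        return float("nan")  # undefined when both bands are zero
    return (nir - red) / denominator


def evi(nir, red, blue):
    """Enhanced vegetation index using the standard MODIS coefficients
    (gain 2.5, aerosol terms 6 and 7.5, canopy background adjustment 1)."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


# Hypothetical reflectances (unitless, 0 to 1).
healthy_ndvi = ndvi(nir=0.50, red=0.10)   # dense, healthy canopy
stressed_ndvi = ndvi(nir=0.30, red=0.20)  # sparse or drought-stressed canopy
healthy_evi = evi(nir=0.50, red=0.10, blue=0.05)
```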
Climate Change Initiative (CCI) of the European Space Agency (ESA)-ESA CCI Surface Soil Moisture
Satellite-derived soil moisture helps to close a gap between precipitation anomalies and the response of vegetation [75]. Microwave soil moisture observations are particularly advantageous, as they are largely independent from weather conditions (e.g., clouds). Despite limitations over complex topography and dense vegetation cover, satellite-derived soil moisture could become a valuable dataset in index insurance to strengthen the overall drought narrative [30] or to validate satellite-derived rainfall estimates [76]. Within the Climate Change Initiative (CCI) of the European Space Agency (ESA), more than ten radar and radiometer sensors are combined. The dataset used in this study is version 04.2, which offers daily global surface soil moisture estimates at a spatial resolution of 0.25 degrees (roughly 28 kilometers at the equator) from 1978 to 2016 [77,78], with large gaps before 1992 [79]. In regions with relatively low vegetation cover, the radiometer data are prioritized. The active (radar) product is given more weight in regions with moderate vegetation cover [80]. Regions with very dense vegetation, such as tropical forests, need to be masked, because neither sensor type performs well. So far, the dataset has been updated roughly every year, but an operational version [81] will soon be available via the Climate Data Store of the European Space Agency (https://cds.climate.copernicus.eu/#!/home).

Methodology
Due to the limitations in both the farmer recollection data and remote sensing rainfall estimates, our approach was to test the cross-consistency of farmer-reported seasonal vulnerabilities against the years reported as droughts in independent satellite data sources. We used logistic regressions to test the extent to which remote sensing estimates, such as rainfall deficits observed via satellite, can predict farmer recollection of historical drought years in 81 communities in Ethiopia (gathered through farmer meetings). This strategy compared the prediction quality for alternate processing strategies of farmer recollection data. If there is evidence of a drought in the biophysical-based remote sensing estimates and that event is independently identified through farmer recollections, there is evidence that the event reported is not spurious. It is important to note that this approach is vulnerable to errors when farmers recall legitimate loss events that are not reflected in satellite datasets, which may be due to inaccuracies in a satellite estimate. Because of this, we also explored the use of multiple satellite information sources, which may yield a more robust prediction of farmer recollection.
We complemented our regressions with Heidke skill score (HSS) tables to map the seasonal timing of historical drought events. The Heidke skill score is used to evaluate the strength of forecasts, using a straightforward hit/miss equation [82]. In addition to the HSS, we provided further forecast verification using the Peirce skill score (PSS) and the equitable threat score (ETS). The Peirce skill score measures the difference between probability of detection and false detection, essentially measuring the ability of the forecast to differentiate between whether or not an event occurred [82]. The equitable threat score measures the fraction of observed and/or forecast events that were correctly predicted, adjusted for hits associated with random chance [82]. All the scores were calculated for the same bad year dataset groups in order to see, out of the entire satellite rainfall record, which parts of the season agreed with the farmer-derived bad year data the most.
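As an illustration of the three verification scores, the sketch below computes them from a 2 × 2 contingency table using the standard textbook definitions (hits a, false alarms b, misses c, correct negatives d). It is a minimal sketch, not the code used in the study; the function names and the example series are hypothetical, with "forecast" standing for a satellite-flagged dry year and "observed" for a farmer-reported bad year.

```python
def contingency(forecast, observed):
    """Count hits (a), false alarms (b), misses (c), correct negatives (d)."""
    a = b = c = d = 0
    for f, o in zip(forecast, observed):
        if f and o:
            a += 1
        elif f and not o:
            b += 1
        elif not f and o:
            c += 1
        else:
            d += 1
    return a, b, c, d


def _ratio(num, den):
    """Return num/den, or NaN when the score is undefined (zero denominator)."""
    return num / den if den else float("nan")


def skill_scores(a, b, c, d):
    """Return (HSS, PSS, ETS) for one contingency table."""
    n = a + b + c + d
    hss = _ratio(2 * (a * d - b * c), (a + c) * (c + d) + (a + b) * (b + d))
    # PSS = probability of detection minus probability of false detection.
    pss = _ratio(a, a + c) - _ratio(b, b + d)
    # Hits expected by random chance, used to adjust the threat score.
    a_random = _ratio((a + b) * (a + c), n)
    ets = _ratio(a - a_random, a + b + c - a_random)
    return hss, pss, ets


# Hypothetical eight-year example (1 = dry year flagged / bad year reported).
satellite_dry = [1, 1, 0, 0, 1, 0, 0, 0]
farmer_bad = [1, 0, 0, 0, 1, 1, 0, 0]
table = contingency(satellite_dry, farmer_bad)
hss, pss, ets = skill_scores(*table)
```

All three scores equal 1 for a perfect match and are close to 0 when agreement is no better than chance, which is why they can be compared across seasonal windows.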
We utilized a handful of key criteria to assess the quality of the farmer-reported information as we explored alternatives to understand if there were issues, and what approaches may help address these issues. The basic strategy was to compare the reported loss years against the anomalies in the remote sensing data, during the vulnerable seasonal timing that is produced by the focus group process. To the extent that the improvements in predictions are seen across independent remote sensing sources, the evidence was increased that issues in perceptions and farmer reporting were being addressed. It is important to recognize that legitimate bad years that the remote sensing fails to identify will lower the quality of the fit, even if the farmer recollections are correct.
We therefore looked at improvements in quality of fit, recognizing that there will be biases and gaps in the remote sensing skill. Criteria that are important in the diagnostics included the quality of fit, the significance of the fit, and whether the signs of the parameters were consistent with the physical processes believed to drive the relationships (for example, if lower rainfall is associated with drought). The central strategy relied on evidence across multiple independent sources that drought events have occurred. For those events, we had higher confidence of their existence, and those were the subset of events that we utilized to explore issues in perception and recollection.
We began with a baseline exploration of an initial visit to 21 villages (this was a subset of the 81 villages initially visited, for which we have follow-up data for comparison). We then compared that visit to the reporting from farmers from those same villages a few years later, highlighting the reporting issues and exploring the potential benefits of utilizing multiple village visits over time. We compared the multiple visits strategy to a direct increase in sample size, and then explored the issues related to spatial aggregation of farmer recollection reporting. Finally, we looked at multiple satellite-derived variables in one model to understand if there was evidence for farmer reports of bad years across multiple, largely independent, remote sensing sources of droughts, as reported from the farmer process.

Retrieval of Drought Events from Satellite Information
All datasets were downloaded using the IRI Data Library by querying the pixel containing the latitude/longitude of the village centroid in Tigray, for the early and late total rainfall data. Conveniently, the villages and their associated agricultural regions are approximately the size of the rainfall pixels. For vegetation sensing, the strategy was to interpret the landscape's response to the arrival of (or lack of) rainfall necessary for crops to grow. When we compared vegetation remote sensing, i.e., EVI, ET, and soil moisture, with the satellite rainfall estimates, we looked at the month following the rainfall window of interest, aggregated across the landscape of a village. The lag time was intended to better capture the vegetation response ("green-up" or "brown-out", etc.) to the rainfall experienced during the essential periods for crop growth [32]. The time period used for this case study varied between satellite sources because of temporal gaps in the satellite data. The time period used was therefore 1983 to 2016 for ARC2 and CHIRPS, and 2000 to 2016 for NDVI, ET, and soil moisture. For NDVI and EVI, the native pixels were averaged to the size of an ARC2 rainfall pixel (10 km × 10 km), to reflect the village-level vegetative response. For ET and soil moisture, we utilized a resampled 25 km × 25 km dataset from Enenkel et al. [30].

Temporal Aggregation
We began with initial diagnostics, first investigating our baseline observation of 21 villages in 2010. The participatory process identified the early window of June-July and the late window of August-September as the vulnerable times for crops. These aligned with two categories of crops, short season and long season. These time periods roughly aligned with the sowing/establishment of the long season crops and the flowering/filling timing of both the long season and short season crops (which were not vulnerable to drought at the beginning of the season, because they are typically planted later). The bad years reported from the village discussion process are presented in Figure 2. Figure 3 presents the Heidke skill score, Peirce skill score, and equitable threat score of the seasonal timing of ARC2 data in predicting bad years reported by farmers for an array of dekadal "windows". The Heidke skill, Peirce skill, and equitable threat plots (Figures 3, 5 and 7) were used in order to show how temporal aggregation affects the agreement between rainfall during seasonal windows and farmer-perceived bad years. Areas with darker green indicate an increased number of "hits", whereas purple indicates an increased number of misses. Plots are arranged with the starting dekad, or 10-day period, on the y-axis and the ending dekad on the x-axis. Although skill overall is relatively low, the skill is highest in one of the windows reported by farmers, the initial June-July timing, suggesting that there may be some information in the farmer-reported bad years and farmer-reported seasonal timing. However, the later window, which was described to be important for all crops, does not show recognizable skill in the Heidke, Peirce, and equitable threat maps. This emphasis on early vulnerability is not consistent with agronomic models. It is also not consistent with more crops being vulnerable in the later part of the season, nor is it consistent with farmer and agronomist reports that following a bad start of the season, many farmers maintain good yields by delaying planting or replanting.
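The window maps described above can be approximated by scanning every start/end dekad pair and scoring the agreement between window rainfall and reported bad years. The sketch below is a minimal illustration under stated assumptions: the rule for flagging a satellite "dry" year (the n_dry lowest window totals) is an assumption, as the thresholding used for the figures is not restated here, and the function names and rainfall values are hypothetical.

```python
def heidke(forecast, observed):
    """Heidke skill score from two equal-length boolean sequences."""
    a = sum(1 for f, o in zip(forecast, observed) if f and o)
    b = sum(1 for f, o in zip(forecast, observed) if f and not o)
    c = sum(1 for f, o in zip(forecast, observed) if not f and o)
    d = sum(1 for f, o in zip(forecast, observed) if not f and not o)
    den = (a + c) * (c + d) + (a + b) * (b + d)
    return 2 * (a * d - b * c) / den if den else float("nan")


def window_skill(rain_by_year, bad_years, n_dry):
    """Score every (start, end) dekad window, both indices inclusive.

    rain_by_year: {year: [rainfall per dekad]}; bad_years: set of years.
    A year is flagged dry if it is among the n_dry lowest window totals
    (an assumed rule for this sketch).
    """
    years = sorted(rain_by_year)
    n_dekads = len(rain_by_year[years[0]])
    observed = [y in bad_years for y in years]
    scores = {}
    for start in range(n_dekads):
        for end in range(start, n_dekads):
            totals = {y: sum(rain_by_year[y][start:end + 1]) for y in years}
            driest = set(sorted(years, key=lambda y: totals[y])[:n_dry])
            forecast = [y in driest for y in years]
            scores[(start, end)] = heidke(forecast, observed)
    return scores


# Hypothetical data: four dekads per season, six years.
rain = {
    2000: [30, 30, 10, 10],
    2001: [30, 30, 40, 40],
    2002: [5, 5, 40, 40],
    2003: [30, 30, 40, 40],
    2004: [5, 5, 40, 40],
    2005: [30, 30, 12, 12],
}
scores = window_skill(rain, bad_years={2002, 2004}, n_dry=2)
```

In this toy example the reported bad years coincide with dry early dekads, so the early window scores highest and the late window scores below zero; plotting the scores with start dekad on one axis and end dekad on the other gives a map of the same layout as the figures.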
Follow-up participatory farmer discussion processes were performed in the years following the 2010 initial meetings. Figure 4 presents the years reported in the second round of farmer village reporting visits. The years reported are substantially different from the initial visits. The Heidke map (Figure 5) indicates that the seasonality that predicts farmer bad years from the second round is not the early window, but instead appears to be more prominent in the late summer window, again with low levels of skill. This suggests that substantially different information may have been reported in follow-up meetings, which may have focused more on years in which the late window was more important. It is important to note that the difference in years reported may not be entirely driven by farmer recollection differences, but instead by reporting process changes. In the initial visit, discussions were catalyzed around the past 15 years, while in follow-up visits, a thirty-year timeframe was utilized. Many farmer-reported data issues may be project- and reporting-process-driven. While both may contribute to issues in the same final dataset, many project-implementation-related issues may need to be addressed through different approaches than purely farmer perception and recollection issues.
In Figure 6, we present the bad year reporting from the combined meetings. The associated maps in Figure 7 hint at skill in both the early and late windows reported by farmers. Therefore, it may be that there was indeed information in the discussion process, but that incomplete information was reported in each visit, and importantly, design or policy choices based on one round of visits could have led to misguided decisions. These initial diagnostics suggest that there may be useful information about seasonality and drought years in farmer discussions, but that there are potential bias issues, which were attenuated through the mechanism of repeated discussions.
We performed a more formal exploration through a logistic regression of farmer-reported loss years as predicted by ARC2 rainfall (filtered utilizing the dekadal cap described in Section 1.2 R4 Rainfall Index Design and Threshold) during the two windows that were the outcomes of the village meeting processes. Table 2 presents the results of this regression. In this regression, an observation is a year in a village (21 villages × 33 years). In each regression, there is a significant prediction of bad years based on satellite rainfall, suggesting that there is information in farmer reporting. In each of the regressions, the sign of the rainfall in both windows was consistent with physical processes (less rainfall is associated with drought), suggesting a possible relationship with actual processes. The size and significance of the parameters were generally consistent with the Heidke skill, Peirce skill, and equitable threat maps. In the first visit, early rainfall drove prediction, which was significant and higher in magnitude. In the follow-up visits, late season rainfall drove the prediction in the significance and magnitude of the parameter. When the years were aggregated, the parameters were of the same magnitude and both were significant, while the fit improved substantially. This suggests that although there may be useful information in the recollections, they may include biases that may be attenuated through the strategy of multiple visits.
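The structure of this regression can be sketched as follows: each observation is one village-year, the outcome is whether farmers reported a bad year, and the predictors are early- and late-window rainfall. This is a minimal sketch on synthetic data, not the estimation behind Table 2; the fitting routine (plain gradient ascent), the standardized rainfall anomalies, and the coefficients used to simulate the data are all assumptions made for illustration.

```python
import math
import random


def fit_logit(X, y, lr=0.5, n_iter=1000):
    """Fit a logistic regression by batch gradient ascent on the log-likelihood.

    X: list of feature rows (no intercept column); y: list of 0/1 outcomes.
    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    k = len(X[0])
    n = len(X)
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for row, target in zip(X, y):
            z = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
            p = 1.0 / (1.0 + math.exp(-z))
            err = target - p
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


# Synthetic village-years (21 villages x 33 years), simulated so that lower
# early or late rainfall raises the probability of a reported bad year.
random.seed(0)
X, y = [], []
for _ in range(21 * 33):
    early, late = random.gauss(0, 1), random.gauss(0, 1)
    p_bad = 1.0 / (1.0 + math.exp(-(-1.5 - 1.0 * early - 1.0 * late)))
    X.append([early, late])
    y.append(1 if random.random() < p_bad else 0)

beta = fit_logit(X, y)  # [intercept, early coefficient, late coefficient]
```

Negative fitted coefficients on both windows correspond to the sign check described above: less rainfall is associated with a higher probability of a farmer-reported bad year. A statistics package would additionally report standard errors and significance levels, which this sketch omits.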

Diagnostics Based on Increasing Sample Size
An additional regression was performed to understand if the improvement was simply due to including additional observations. In 2010, additional villages were visited beyond the 21 villages, for a total of 81 initial visits. Table 3 presents the results of a regression of the full 81 villages. As with the combined initial and follow-up dataset, both the early and late parameters were significant, and of signs consistent with physical processes, although the importance of the early season was stronger in this regression, with the parameter several times larger than that of the later window. This suggests that although increased investments in data collection may be of value, the improvements may not be purely based on statistical processes. The strategy of 42 visits (21 initial and 21 follow-up) led to similar significance and results as the approximately twice as large sample of 81 visits, which may have had a bias towards the beginning of the season. Therefore, it is possible that farmer perception issues (and project dynamics) may lead to statistical biases and processes that might be more effectively addressed through strategies such as aggregation or differencing over time than by simply relying on the law of large numbers.

Spatial Aggregation
To the extent that nearby villages experience similar loss years, it may be possible to utilize nearby villages to help address issues in farmer reporting. However, if climate or livelihoods are substantially different, nearby villages may truly face different loss years, and aggregating over space to filter noise from farmer information may filter out important loss years. To inform more sophisticated strategies of spatial smoothing, we performed several diagnostics to see if there was value in utilizing spatial strategies in identifying and addressing noise and bias in farmer recollection, asking if the signal is lost when aggregation occurs. We performed this test on the repeat visit analysis of the 21 villages, to explore how spatial processes might be used to extend filters for reporting noise once the time filters of repeated visits had been applied.
We averaged both farmer and satellite rainfall estimates over higher levels of administrative units. We tested if the logistic regression found significant parameters suggesting prediction of farmer loss years at higher levels of spatial aggregation. This exercise explored the balance between the increase in signal from averaging and the reduction in the number of observations as the 21 villages were aggregated into the 11 woredas and three zones. Since the number of observations is the number of years multiplied by the number of locations, N dropped dramatically with each aggregation. The spatial distributions and political boundaries of the aggregated woredas and zones can be seen in Figure 8. If significance remains once bad years are aggregated, even with the corresponding low N, it suggests that there may be noise or biases in farmer reporting addressed by analyses including nearby villages. Table 4 presents the results of the regressions. The parameters are of roughly the same order of magnitude for both early and late rainfall across all scales. Significance is lost for the early window after the first level of aggregation, but the second window significance remains across all regressions, even for the single average over the 33-year period, maintaining significance at the one percent level for all of the prior regressions. This diagnostic suggests that while statistical power may be lost quickly as aggregation occurs, there is very likely a benefit of strategies that explore spatial filtering of farmer recollections in this particular case study. At the highest level of aggregation, there is still significant evidence that farmers were reporting loss years that were consistent with independent satellite rainfall estimates. It is, therefore, worthwhile to explore strategies that might better preserve sample size than this worst-case scenario in filtering bias and noise from farmer reporting.
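The aggregation step can be sketched as a grouped average of village-year records to a higher administrative unit. This is a minimal illustration under stated assumptions: the village and woreda identifiers and all values are hypothetical, and how the averaged farmer reports were coded for the regressions in Table 4 is not restated here, so the sketch simply returns the share of villages reporting a bad year.

```python
from collections import defaultdict


def aggregate(records, unit_of):
    """Average village-year records up to a higher administrative unit.

    records: iterable of (village, year, rainfall, bad_year_flag).
    unit_of: {village: unit}, e.g., village -> woreda or village -> zone.
    Returns {(unit, year): (mean rainfall, share of villages reporting bad)}.
    """
    groups = defaultdict(list)
    for village, year, rain, bad in records:
        groups[(unit_of[village], year)].append((rain, bad))
    result = {}
    for key, rows in groups.items():
        mean_rain = sum(r for r, _ in rows) / len(rows)
        share_bad = sum(b for _, b in rows) / len(rows)
        result[key] = (mean_rain, share_bad)
    return result


# Hypothetical records: three villages, two years, two woredas.
records = [
    ("v1", 2009, 100.0, 1), ("v2", 2009, 140.0, 0), ("v3", 2009, 90.0, 1),
    ("v1", 2010, 200.0, 0), ("v2", 2010, 220.0, 0), ("v3", 2010, 180.0, 0),
]
woreda_of = {"v1": "wA", "v2": "wA", "v3": "wB"}
by_woreda = aggregate(records, woreda_of)
```

Here six village-year observations collapse to four woreda-year observations, which mirrors the trade-off described above: averaging may reduce reporting noise, but N falls with each level of aggregation.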

Extremes Diagnostic
It is important to check if the index-insurance payout is driven by rainfall levels that fall outside of the extreme negative anomalies of loss years. We therefore performed a diagnostic, for which the rainfall estimates were restricted to approximately the lowest 20th percentile, reflecting the constraints of an insurance product. We utilized the payment formula from the index product to provide a sense of how insurance design constraints impact the relationship. Since lower rainfall years yield higher "payouts", the signs should be positive. Table 5 illustrates the regression results, which suggest that some of the significance in the early window was from rainfall variation not in extremely low rainfall years, but that significance and parameters are relatively robust when restricting to low rainfall anomalies, particularly for the later window.
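To illustrate how a payout rule restricts the analysis to low rainfall, the sketch below uses a generic linear trigger/exit payout of the kind common in index insurance. The product's actual payment formula is defined elsewhere in the paper and is not reproduced here, so the functional form, the nearest-rank percentile rule, the function names, and all values are assumptions for illustration only.

```python
def percentile_trigger(history, pct=20):
    """Nearest-rank percentile of historical window rainfall (assumed rule)."""
    ordered = sorted(history)
    rank = -(-pct * len(ordered) // 100)  # integer ceiling of pct% of n
    return ordered[max(0, rank - 1)]


def payout_fraction(rain, trigger, exit_level):
    """Linear payout: 0 at or above the trigger, 1 at or below the exit."""
    if rain >= trigger:
        return 0.0
    if rain <= exit_level:
        return 1.0
    return (trigger - rain) / (trigger - exit_level)


# Hypothetical ten-year rainfall history for one window (mm).
history = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
trigger = percentile_trigger(history, pct=20)
payouts = [payout_fraction(r, trigger, exit_level=10) for r in (5, 15, 20, 60)]
```

Because the payout is zero for every year above the trigger, regressing reported bad years on payouts rather than on rainfall uses only variation in the driest years, and a positive coefficient indicates that larger payouts coincide with reported bad years.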

Broader Suite of Remote Sensing Sources
As a final exploration, we investigated if the farmer recollections could be predicted independently by a broader suite of remote sensing data sources, through a range of independent sources targeting different stages of the crop water balance process [30]. This was not an assessment of the quality of the data source for insurance, but instead a test of whether there was evidence of information in farmer reporting that was not dependent on a single source of satellite information, particularly since the remote sensing data sources often spanned different years. Applied in modified, optimized forms for index insurance applications, these datasets often exhibited stronger relationships with farmer losses.
We did not perform this optimization because of potential endogeneity with farmer recollections, which would be detrimental to our test of whether remote sensing independently predicted loss events. Table 6 presents a set of regressions using multiple remote sensing proxies for the combined first and second visit farmer reporting. The ARC2 rainfall product shows that lower rainfall in both the early and late windows significantly predicted farmer recollections. CHIRPS, an alternate satellite rainfall estimate, showed similar results with less significance, but similar magnitudes and signs. The evapotranspiration data were not significant and of mixed signs. Both of the vegetative data sources (lagged by one month from the farmer windows) showed significance for the late window, with lower VI values significantly predicting farmer recollections of loss years. It is not surprising that early season vegetative indexes do not have explanatory power, as the leaf cover is not established until later in the season. Finally, lower soil moisture, as estimated through passive and active microwave, also predicted farmer-recalled loss years. The range of independent satellite proxies, each targeting a different stage of the agricultural water balance process, each with its own strengths and limitations, generally suggests that there was actual information in the farmer recollections, as a set of compounding evidence.

Discussion
Farmers' perception and recollection are important sources of information for the responsible application of remote sensing in climate risk programs throughout the world. We do see evidence that farmer recollections can be corroborated with remote sensing information, suggesting that recollections do contain valuable information. Our case study expanded the methodological framework of past research, which used historical trends based on the meteorological department's available information and other national statistics to corroborate farmers' perception [10,15-17,60], by using remote sensing information. This methodology could be applied in regions that do not have robust historical climate information.
We performed several diagnostics to explore the potential existence of biases in farmers' recollections. In this work, we did not necessarily treat the differences between farmer-perceived droughts and hydrological droughts, as categorized by remote sensing sources, as bias. Instead, we focused on testing whether there was evidence of a relationship, and on uncovering inconsistencies in farmer reporting, perhaps due to biases. First, the strategy of multiple visits might be more effective than just relying on sampling large numbers for statistical significance. The larger sample of 81 visits showed a bias towards the beginning of the season, whereas the repeated visits (42 visits: 21 initial and 21 follow-up) had similar significance and even showed less bias than the larger sample. It is possible that farmer perception issues and project dynamics are responsible for statistical biases and processes that can be addressed by multiple visits rather than by relying on a large sample. Second, our case study showed that there is a benefit in spatial filtering through spatial aggregation. Although statistical power may be lost when spatial aggregation occurs, our analysis suggested that even at the highest level of aggregation, from village level to the entirety of Tigray, there is still significant evidence that farmers were reporting on bad years that matched independent satellite rainfall estimates. Because full aggregation is the worst case for sample size, it may be valuable to explore strategies that preserve more of the sample while still filtering bias and noise from farmer reporting. However, spatial aggregation should be carefully reviewed in other contexts, because meteorological variability across locations could lead to different results. Lastly, a careful cross-comparison of independent satellite proxies, each of which assesses a different stage of the agricultural water balance process, suggested that there is information in the farmers' recollections of bad years. It is important to note that while memory is an imperfect reconstruction of past information, it is a useful source of information in regions with little or no in-situ observation data.
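As an illustration of the kind of comparison discussed above, the following minimal Python sketch aggregates village-level reports into regional bad years and scores them against satellite-flagged dry years using the Heidke skill score, Peirce skill score, and equitable threat score. The village names, years, rainfall totals, the 50% agreement threshold, the rule of flagging the k driest years, and all function names are hypothetical and chosen only for illustration; this is not the processing chain used in the study.

```python
from collections import Counter


def aggregate_bad_years(village_reports, min_share=0.5):
    """Return years reported as bad by at least `min_share` of villages.

    `min_share` is an assumed agreement threshold, not a value from the study.
    """
    counts = Counter(
        year for years in village_reports.values() for year in set(years)
    )
    n_villages = len(village_reports)
    return {year for year, c in counts.items() if c / n_villages >= min_share}


def driest_years(seasonal_total, k):
    """Flag the k years with the lowest seasonal total (ties broken by year)."""
    ranked = sorted(seasonal_total, key=lambda y: (seasonal_total[y], y))
    return set(ranked[:k])


def contingency_table(reported, flagged, years):
    """Farmer reports act as the reference; satellite flags are evaluated.

    Returns (hits, false_alarms, misses, correct_negatives).
    """
    years = list(years)
    a = sum(1 for y in years if y in reported and y in flagged)
    b = sum(1 for y in years if y not in reported and y in flagged)
    c = sum(1 for y in years if y in reported and y not in flagged)
    d = len(years) - a - b - c
    return a, b, c, d


def skill_scores(a, b, c, d):
    """Heidke, Peirce, and equitable threat scores from a 2x2 table."""
    nan = float("nan")
    n = a + b + c + d
    hss_den = (a + c) * (c + d) + (a + b) * (b + d)
    hss = 2.0 * (a * d - b * c) / hss_den if hss_den else nan
    pss = a / (a + c) - b / (b + d) if (a + c) and (b + d) else nan
    a_random = (a + b) * (a + c) / n if n else nan  # hits expected by chance
    ets_den = a + b + c - a_random
    ets = (a - a_random) / ets_den if n and ets_den else nan
    return hss, pss, ets


# Hypothetical inputs for illustration only.
village_reports = {
    "village_A": [2002, 2004, 2009],
    "village_B": [2002, 2004, 2008],
    "village_C": [2002, 2005, 2009],
}
seasonal_rain_mm = {
    2000: 520, 2001: 480, 2002: 310, 2003: 450, 2004: 330,
    2005: 410, 2006: 500, 2007: 470, 2008: 350, 2009: 390,
}

regional_bad = aggregate_bad_years(village_reports, min_share=0.5)
flagged = driest_years(seasonal_rain_mm, k=len(regional_bad))
table = contingency_table(regional_bad, flagged, sorted(seasonal_rain_mm))
hss, pss, ets = skill_scores(*table)
print("regional bad years:", sorted(regional_bad))
print("satellite-flagged years:", sorted(flagged))
print("HSS=%.3f PSS=%.3f ETS=%.3f" % (hss, pss, ets))
```

Raising `min_share` corresponds to a stricter spatial filter: fewer years survive aggregation, which reduces noise from single-village reports but also shrinks the sample available for testing.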
Although our case study is not focused on the specific types of biases or characteristics that shape perception, our findings are consistent with the potential existence of biases that are noted in the literature. Strategies to address the specific issues raised in the literature are likely to be valuable. Although many studies have found that the more time has passed before farmers are asked to recall an event, the larger the recall error [9,10,14,15], strategies such as spatial and temporal aggregation may be effective in mitigating recall error. It is also worth noting that not all agricultural bad years are caused by drought; therefore, reporting mechanisms should take this potential discrepancy into consideration. The phenomenon of over- or underreporting events, known as the telescoping effect [10], was evident in the variance of bad years recalled during the initial and second visits (Figures 2 and 4). However, the changes in responses may have been due to how the questions were framed [12]. During the initial meeting with farmers, discussions of "bad" years were focused on the previous fifteen years, while during the second visit, a thirty-year time frame was utilized. Therefore, it is valuable to address not only reporting biases, but also potential issues arising from changes in project survey and discussion instruments. These instruments could be designed to reduce anchoring, heaping [10], and other factors that shape perception and response.
The design process of weather index insurance needs to account for farmers' errors in reporting. Our exercise was a conservative test of farmer recollection using relatively simple remote sensing products; thus, if the satellite data products are to be used in index insurance, it is worthwhile to further address the sources of error, bias, and noise in the satellite products themselves. More robust Earth observation information is needed to create proxies for historical and potential future drought impacts, which could reduce basis risk. While much of index insurance uses precipitation data, there have been recent breakthroughs in the use of soil moisture data, potentially making it another option for triggering payouts. Current studies are focusing on the potential of ESI and soil moisture to close sensitive knowledge gaps between atmospheric moisture supply and the response of the land surface [83]. Soil moisture can help to mitigate some of the challenges presented by other Earth observation data that are limited by cloud cover and poor atmospheric conditions, since soil moisture estimates are largely independent of weather conditions. This could result in a better match between calculated payouts/credit repayment levels and the actual needs of smallholder farmers [83].
Moreover, if satellite information is to be used as an 'objective' proxy for triggering payouts, it is imperative that there are filters in place to reduce noise and uncertainty. Future research should refine not only bias filtering, but also the filtering of atmospheric noise from cloud contamination and other water-vapor-related challenges. Additionally, basis risk reduction benefits may arise from noise reduction, spatial and temporal filtering of the remote sensing information, and possible integration of multiple sensors.

Conclusions
We studied information content and potential biases in farmer recollection by comparing the content against empirical satellite observations. The key findings of our study are evidence that (1) farmer-reported events are reflected in multiple remote sensing datasets, which suggests that farmer-reported data include information about actual drought events, and (2) repeated meetings over time and, to some extent, aggregating independent village reports over space lead to improved predictions. This suggests that it may not be appropriate to treat farmer recollection data with the same approaches that would be appropriate for sensor data, and that mechanisms to address reporting and recollection biases may be important.
Given the increased focus on the coproduction of knowledge [84], specifically in the research and application of climate change adaptation, it is important to understand the contributions of, and biases in, the information provided by stakeholders. Additionally, the process of coproduction of knowledge requires effective discussion support tools, which could help to reduce errors in the reporting process. Discussion support tools should include mechanisms that help to address potential biases in order to strengthen the significance of the information from stakeholders. Specifically, in a cross-disciplinary context, the present methodological framework could be valuable for the development of discussion tools in participatory processes. This could address errors in how information is gathered and support a systematic approach to gathering it through multiple visits or spatial aggregation.
The results from our case study support the careful use of farmer reporting in insurance design. Given the paucity of available validation data, responsible use of farmer reporting will become increasingly important if remote sensing is to be relied upon at larger scales. There are many efforts to bring interventions like index insurance to massive scale. Scaling up is an important goal, since the potential of agricultural insurance programs can only be realized by reaching a large enough portion of the poor rural community such that the impact of covariate shocks on incomes and asset retention is limited and the financial structure of affected communities is protected [21]. This will require that participatory methodologies be improved so that time-intensive and costly exercises can be performed more efficiently. The role of technology in mitigating some of the challenges in participatory processes is currently being explored. It is crucial for farmers to be able to evaluate the results of remote sensing for themselves against real observations and experiences, and for researchers to study and communicate the particular strengths, limitations, and value of using remote sensing in insurance [57]. However, this will require deeper analysis of subjective data to better understand larger datasets of farmers' perception and recollection. Current research efforts are investigating databases in which farmers could enter text that captures their observations and experiences. Our findings could serve as a starting point for future reporting approaches that can be built into sentiment analysis to help detect, extract, and classify subjective information [85]. These filtering systems should consider temporal and spatial aggregation as a way to address potential farmer recall error, and could be valuable for specifically addressing other potential sources of bias (i.e., psychological and recollection processes), challenges due to economic incentives, and gender representation.
As the demand for user-driven strategies grows, it is imperative that stakeholders' (e.g., farmers') information is not only represented, but also verified to be as accurate as possible. This is especially relevant in the design of weather index insurance, as it requires agreement across multiple information sources, such as remote sensing tools and farmer reporting.

Figure 1. Tigray, Ethiopia land cover with locations of field visits. Source: European Space Agency (ESA) Climate Change Initiative (CCI) land cover team 20 m Africa land cover S2 product 2016 derived from Sentinel-2A observations.
Remote Sens. 2018, 10, x FOR PEER REVIEW 11 of 25 purple indicates an increased number of misses. Plots are arranged with the starting dekad, or 10-day period, on the y-axis and the ending dekad on the x-axis.

Figure 2. Histogram of farmer-reported "bad years" for 21 villages in Tigray; initial visit.

Figure 3. Heidke skill score, Peirce skill score, and equitable threat score for a variety of African Rainfall Climatology Version 2 (ARC2) rainfall accumulation intervals over the year. Higher skill scores reflect that the accumulation interval aligns better with farmer-identified payout years. The start and end dekads (on the y- and x-axis, respectively) indicate the 10-day period during the calendar that the accumulation calculation starts or ends. The lower-right portion of the graphic captures accumulation over the full calendar year; moving up from the lower right indicates a later start of accumulation, while moving to the left on the graphic represents an earlier end to the accumulation interval. Farmer-reported "bad years" on the initial visit to 21 villages in Tigray, Ethiopia, served as the truth data.
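
The accumulation-interval scan described in this caption can be sketched as follows. This is a minimal illustration, assuming 36 dekads per calendar year indexed from 0, synthetic rainfall values, and a rule that flags the k driest years for each window (with k set to the number of farmer-reported bad years). None of these choices, nor the function names, are taken from the study itself, and only the Heidke skill score is computed for brevity.

```python
DEKADS_PER_YEAR = 36  # assumed: three 10-day periods per month, indexed 0..35


def heidke(reported, flagged, years):
    """Heidke skill score with farmer-reported years as the reference."""
    a = sum(1 for y in years if y in reported and y in flagged)      # hits
    b = sum(1 for y in years if y not in reported and y in flagged)  # false alarms
    c = sum(1 for y in years if y in reported and y not in flagged)  # misses
    d = len(years) - a - b - c                                       # correct negatives
    den = (a + c) * (c + d) + (a + b) * (b + d)
    return 2.0 * (a * d - b * c) / den if den else float("nan")


def scan_windows(rain, reported):
    """Score every (start, end) dekad window, with start <= end.

    `rain` maps year -> list of 36 dekadal totals. For each window the k
    driest years are flagged, where k is the number of reported bad years.
    """
    years = sorted(rain)
    k = len(reported)
    scores = {}
    for start in range(DEKADS_PER_YEAR):
        for end in range(start, DEKADS_PER_YEAR):
            totals = {y: sum(rain[y][start:end + 1]) for y in years}
            ranked = sorted(years, key=lambda y: (totals[y], y))
            scores[(start, end)] = heidke(reported, set(ranked[:k]), years)
    return scores


def make_synthetic_rain():
    """Toy data: two years with a dry spell in dekads 18-23 and a wet start."""
    rain = {}
    for i, year in enumerate(range(2000, 2008)):
        series = [10.0 + 0.1 * i] * DEKADS_PER_YEAR
        if year in (2002, 2005):
            series[0:6] = [20.0] * 6   # wetter early dekads mask the annual total
            series[18:24] = [2.0] * 6  # dry spell in mid-season
        rain[year] = series
    return rain


rain = make_synthetic_rain()
reported_bad_years = {2002, 2005}  # hypothetical farmer reports
scores = scan_windows(rain, reported_bad_years)

# Ignore undefined scores (NaN is the only value not equal to itself).
valid = {w: s for w, s in scores.items() if s == s}
best_window = max(valid, key=valid.get)
print("best window (start, end):", best_window, "HSS:", valid[best_window])
print("full-year window HSS:", scores[(0, DEKADS_PER_YEAR - 1)])
print("dekads 18-23 HSS:", scores[(18, 23)])
```

In this toy example the window covering dekads 18 to 23 recovers the reported years, while the full-year total does not, which mirrors how a grid of skill scores can highlight the part of the season that matters most to farmers.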

follow-up meetings, which may have focused more on years in which the late window was more important.

Figure 4. Histogram of farmer-reported "bad years" for 21 villages in Tigray; second visit.

Figure 5. Heidke skill score, Peirce skill score, and equitable threat score for a variety of ARC2 rainfall accumulation intervals over the year. Higher skill scores reflect that the accumulation interval aligns better with farmer-identified payout years. The start and end dekads (on the y- and x-axis, respectively) indicate the 10-day period during the calendar that the accumulation calculation starts or ends. The lower-right portion of the graphic captures accumulation over the full calendar year; moving up from the lower right indicates a later start of accumulation, while moving to the left on the graphic represents an earlier end to the accumulation interval. Farmer-reported "bad years" on the second visit to 21 villages in Tigray, Ethiopia, served as the truth data.

Figure 7. Heidke skill score, Peirce skill score, and equitable threat score for a variety of ARC2 rainfall accumulation intervals over the year. Higher skill scores reflect that the accumulation interval aligns better with farmer-identified payout years. The start and end dekads (on the y- and x-axis, respectively) indicate the 10-day period during the calendar that the accumulation calculation starts or ends. The lower-right portion of the graphic captures accumulation over the full calendar year; moving up from the lower right indicates a later start of accumulation, while moving to the left on the graphic represents an earlier end to the accumulation interval. Farmer-reported "bad years" in the combined visits (first and second visit) to 21 villages in Tigray, Ethiopia, served as the truth data.

Figure 8. Spatial distribution of the villages visited, and three zones where observations were recorded.

Table 1. A description of dataset characteristics used for the study.
Dataset | Description | Period | Spatial resolution
Moderate-resolution imaging spectroradiometer (MODIS) enhanced vegetation index (EVI) | Vegetation greenness index from MODIS Terra | 2000-2016 | 250 m resampled to 10 km
Atmosphere-land exchange inverse (ALEXI) model evapotranspiration (ET) | ET used in ALEXI model | 2000-2013 | 4 km
Climate Change Initiative (CCI) of the European Space Agency (ESA) soil moisture | Satellite-derived surface soil moisture estimates provided via the Climate Change Initiative of the European Space Agency | 2000-2016 | 0.25 degree

Table 2. Village level results aggregating first and second visits (21 villages).

Table 3. Village level results from first visit (81 villages).

Table 6. All datasets, village bad years.