Linking Agricultural Index Insurance with Factors That Influence Maize Yield in Rain-Fed Smallholder Farming Systems

Weather extremes pose substantial threats to food security in areas where the main source of livelihood is rain-fed crop production. In most of these areas, agricultural index insurance (AII) is recognized as being capable of securitizing food production by providing safety nets against weather-induced crop losses. Unfortunately, however, AII does not indemnify farmers for non-weather-related crop losses. This study investigates how this gap can be filled by exploring strategies through which AII can be linked with non-weather factors that influence crop production. We do this by using an improvised variable ranking methodology to identify these factors in the O.R. Tambo District Municipality, South Africa. Results show that key agrometeorological variables comprising surface moisture content, growing degree-days, and precipitation influence maize yield even under optimal weather conditions, while seed variety, fertilizer application rate, soil pH, and ownership of machinery play an equally important role. This finding is important because it demonstrates that although AII focuses more on weather elements, there are non-weather variables that may expose farmers to production risk even under optimal weather conditions. As such, linking AII with critical non-weather, yield-determining factors can be a better risk management strategy.


Introduction
Smallholder farming around the world contributes substantially to economic growth and food security, especially in rural areas [1,2]. However, the increasing occurrence of weather shocks threatens agriculture, especially in Sub-Saharan Africa (SSA) where 95% of farmland is rain-fed [3][4][5]. In the past 10 to 15 years, attempts were made to support farmers through agricultural index insurance (AII), which acts as a safety net against the adverse effects of weather-induced crop failures [6][7][8]. Recent studies show that insurance encourages farmers to take risks and make more investments in productive inputs. In Bangladesh, for example, purchasing insurance led to the expansion of agricultural land and more investment in fertilizer, labor, irrigation, and pesticides [9]. In Kenya, an uptake of insurance was significantly associated with the increased use of fertilizer and expenditures on seeds by 50% and 65%, respectively, and a corresponding increase in maize yields by 60% [10].
This shows that farmers will allocate their resources in a manner that maximizes returns if they are assured of financial compensation for losses arising from factors beyond

Study Area
The O.R. Tambo District Municipality (ORTDM) is in the northeastern part of South Africa's Eastern Cape Province (Figure 1).
The ORTDM is sub-divided into five local municipalities covering an area of 12,096 km². It is the second poorest of seven district municipalities in the Eastern Cape Province [23]. About 94% of the population in ORTDM are rural dwellers whose main sources of livelihood include livestock farming, rain-fed maize production, and government social grants [24]. This investigation focused on three of ORTDM's local municipalities whose identification was guided by maize yield records obtained from the Department of Agriculture, Land Reform and Rural Development (DALRRD). These records consist of GPS locations of maize fields and contact details of farmers. The maize fields included small plots of one hectare or slightly more and outfield collective farms (≤80 ha) in which individual farmers cultivate specific plots. We selected maize partly because it is the primary staple food crop that is widely produced under rain-fed conditions in SSA [25][26][27], including South Africa, where white maize is destined for human consumption and yellow maize for animal feed [28].
Sustainability 2021, 13, 5176
The selected farms were evenly distributed in a landscape that is characterized by (a) low-lying densely vegetated areas along the Wild Coast where elevation ranges from 5 m to 500 m, (b) gentle-to-moderate-sloping grasslands in the interior, and (c) savannas and forests in the northern areas where elevation extends up to 1500 m. The area has a warm oceanic climate, which includes the humid sub-tropical climate of the northeastern peripheries and the semi-arid climates of the southwestern parts [29]. Mean annual rainfall ranges from 900 mm to 1300 mm, with summer minimum and maximum temperatures of 14-19 °C and 14-27 °C, respectively [30]. The soils are largely dominated by sandy loams, sandy clay loams, and clays that are yellow to black in color and slightly acidic [31,32]. In the past, the farmers used to begin planting maize in the first dekad of October, but planting now begins in mid-November, often extending up to late-January. The harvest season is usually from June to August.

Socioeconomic and Agronomic Data
Baseline socio-economic and agronomic information was solicited through a pilot survey and a follow-up semi-structured interview with the farmers (Table 1). Questions related to intercropping, usage of manure, and other inputs were omitted because the farmers were practicing monoculture and using chemical fertilizers provided by DALRRD and Grain South Africa (GrainSA). Information about income was also omitted because the farmers were reluctant to disclose their off-farm sources of livelihood and their annual and monthly incomes.

Soil Data
The maize fields have gentle to flat slopes and homogenous vegetation, which allowed us to collect composite soil samples. The soils were collected from ground level to a depth of 30 cm using a soil auger. Thereafter, all the soil samples were taken to South Africa's Agricultural Research Council (ARC) laboratory for chemical and physical analyses of the parameters listed in Table 1.

Meteorological Data
Daily precipitation and maximum and minimum temperature records were obtained from the ARC's agro-climate databank, which continuously received data from seven automatic weather stations distributed across the study area. These data were used to compute accumulated growing degree days (GDD) and total precipitation. GDD are used as an agrometeorological index to model the rate at which crops develop from one stage to another in their lifecycle [33]. The number of GDD is recognized as a more accurate estimate of plant physiological development than calendar days because a crop plant develops when the temperature is above a specific base temperature and below a certain upper threshold [33]. For maize, the lower limit/base temperature ($T_{base}$) is 10 °C and the upper limit is 30 °C. Thus, GDD were calculated using Equation (1):

$$\mathrm{GDD} = \frac{T_{max} + T_{min}}{2} - T_{base} \quad (1)$$

where $T_{min}$ and $T_{max}$ are the daily minimum and maximum temperatures, respectively.
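As a rough illustration of the GDD calculation described above, the following Python sketch applies the standard averaging method to daily temperature records and sums the result over a period. It assumes that daily temperatures are clamped to the 10 °C and 30 °C limits before averaging; the text states the limits but not how they were applied, so this treatment is an assumption. The function names and the three-day record are hypothetical.

```python
from typing import Iterable, Tuple

T_BASE = 10.0   # base temperature for maize in deg C (from the text)
T_UPPER = 30.0  # upper temperature limit for maize in deg C (from the text)


def daily_gdd(t_min: float, t_max: float,
              t_base: float = T_BASE, t_upper: float = T_UPPER) -> float:
    """Growing degree days for one day using the averaging method.

    Assumption: both temperatures are clamped to [t_base, t_upper]
    before averaging, so a day never contributes a negative value.
    """
    t_min_c = min(max(t_min, t_base), t_upper)
    t_max_c = min(max(t_max, t_base), t_upper)
    return (t_max_c + t_min_c) / 2.0 - t_base


def accumulated_gdd(days: Iterable[Tuple[float, float]]) -> float:
    """Sum daily GDD over an iterable of (t_min, t_max) pairs."""
    return sum(daily_gdd(t_min, t_max) for t_min, t_max in days)


# Hypothetical three-day record of (T_min, T_max) in deg C.
sample_days = [(14.0, 27.0), (8.0, 22.0), (19.0, 33.0)]
total_gdd = accumulated_gdd(sample_days)
```

In this sketch the second day's minimum is raised to the base temperature and the third day's maximum is lowered to the upper limit before averaging, which is one common convention among several.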

Remote Sensing Indices
A Garmin Montana 650 GPS was used to geo-locate all maize fields during the farm surveys. Atmospherically corrected Sentinel-2 images covering three growing seasons (November 2017 to June 2018, November 2018 to June 2019, and November 2019 to June 2020) were downloaded from the European Space Agency's Copernicus Hub. We used these images to compute time-series maps of the Normalized Difference Vegetation Index (NDVI), the Two-band Enhanced Vegetation Index (EVI2), and the Moisture Stress Index (MSI). We selected NDVI because of its established ability to provide reliable results in modelling vegetation dynamics and crop yield [34][35][36][37]. EVI2, which is also recognized as useful for estimating crop yield, is more effective than NDVI when vegetation density is high [38][39][40][41]. MSI was selected due to its sensitivity to leaf and soil water content, which are correlated with grain yield [42,43]. These indices are calculated as follows:

$$\mathrm{NDVI} = \frac{NIR - RED}{NIR + RED} \quad (2)$$

$$\mathrm{EVI2} = 2.5 \times \frac{NIR - RED}{NIR + 2.4\,RED + 1} \quad (3)$$

$$\mathrm{MSI} = \frac{SWIR}{NIR} \quad (4)$$

where RED is the red band of Sentinel-2, and NIR and SWIR are the near and shortwave infrared bands, respectively. We initially regressed multi-temporal NDVI and EVI2 against grain yield to identify the period within which these two indices were best related to yield. After observing that the peak values of NDVI and EVI2 and the seasonal average of MSI were closely related to yield, we used these summaries as predictors. These processes were finalized by converting all the categorical independent variables listed in Table 1 to dummy variables.
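The index calculations and seasonal summaries described above can be sketched in a few lines of Python. This is a minimal illustration that uses the standard definitions of NDVI, EVI2, and MSI on surface reflectance scaled to the range 0 to 1. The reflectance values are invented for one hypothetical field on three image dates, and the text does not state which Sentinel-2 bands supplied RED, NIR, and SWIR.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def evi2(nir: float, red: float) -> float:
    """Two-band Enhanced Vegetation Index."""
    return 2.5 * (nir - red) / (nir + 2.4 * red + 1.0)


def msi(swir: float, nir: float) -> float:
    """Moisture Stress Index; higher values indicate more water stress."""
    return swir / nir


# Hypothetical (red, nir, swir) reflectance for one field on three dates.
observations = [
    (0.10, 0.30, 0.25),
    (0.05, 0.45, 0.18),
    (0.08, 0.40, 0.20),
]

ndvi_series = [ndvi(nir, red) for red, nir, _ in observations]
evi2_series = [evi2(nir, red) for red, nir, _ in observations]
msi_series = [msi(swir, nir) for _, nir, swir in observations]

# Seasonal summaries of the kind used as predictors in the text:
# peak NDVI and EVI2, and the seasonal mean of MSI.
peak_ndvi = max(ndvi_series)
peak_evi2 = max(evi2_series)
mean_msi = sum(msi_series) / len(msi_series)
```

For real imagery the same functions would be applied per pixel and then averaged within each field boundary; a guard against a zero denominator would also be needed for masked or no-data pixels.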

Maize Yield Data
Yield surveys were conducted at the beginning of the harvest season in 2018, 2019, and 2020. Prior to this period, we interviewed the farmers to ascertain whether they or the DALRRD apply any methods to estimate yield. About 80% of the farmers reported that although they get inputs from DALRRD and GrainSA, no one conducts yield surveys on their farms. Therefore, we conducted yield surveys by employing the objective yield survey method that is used by South Africa's Crop Estimates Committee. Detailed information about this method is provided in FAO's 2016 report on crop yield forecasting [44].

Variable Importance
To rank the independent variables, we used percent increase in mean squared error (%IncMSE), which is a variable importance measure embedded in the random forest (RF) regression algorithm [45]. We chose this method because it is a model-based approach with the ability to order independent variables according to their relative importance. RF is an ensemble learning technique that works by constructing a number of decision trees and computing the mean prediction of the individual trees. RF trains each decision tree on a different sample of the training set, where sampling is performed with replacement. The permutation importance of variable $x_j$ is calculated in the $k$th tree according to Equation (5) [46], where $y_i^{(k)}$ is the observation of the dependent variable in tree $k$, $\hat{y}_i^{(k)}$ is the prediction by tree $k$, and $\hat{y}_{j,i}^{(k)}$ is the prediction by tree $k$ when the $j$th variable is permuted. $n_{OOB}$ is the number of samples in the out-of-bag (OOB) data of each tree in the forest, that is, the observations left out of that tree's bootstrap sample. In order to identify the most important variables, we trained different RF models sequentially, removing the least important variables until we achieved a model with the optimum number of variables and the lowest root mean squared error (RMSE = 656.62 kg/ha). We also used the rfPermute function in RStudio to compute p values for the important variables [47].
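The idea behind permutation-based importance can be illustrated with a short Python sketch. For a single predictor, the values of one variable are shuffled and the resulting mean squared error is compared with the error on the unshuffled data; a large increase means the model relies on that variable. This is a simplified illustration, not the procedure of the randomForest or rfPermute packages: it uses a hand-written stand-in for one tree, scores it on a small hypothetical sample in place of true out-of-bag data, and skips the averaging and scaling across trees. All names and numbers are hypothetical.

```python
import random
from statistics import mean


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return mean((a - b) ** 2 for a, b in zip(y_true, y_pred))


def permutation_mse_increase(predict, X, y, j, rng):
    """Increase in MSE after shuffling column j of X.

    predict: callable mapping one row (a list) to a prediction.
    X, y:    evaluation rows and observed values (stand-ins for OOB data).
    """
    baseline = mse(y, [predict(row) for row in X])
    shuffled = [row[j] for row in X]
    rng.shuffle(shuffled)
    X_perm = [row[:j] + [value] + row[j + 1:] for row, value in zip(X, shuffled)]
    permuted = mse(y, [predict(row) for row in X_perm])
    return permuted - baseline


def toy_tree(row):
    """Hypothetical stand-in for one tree: uses column 0 only."""
    return 3000.0 + 900.0 * row[0]


# Columns: [seed-variety dummy, machinery-ownership dummy] (hypothetical).
X = [[1, 0], [0, 1], [1, 1], [0, 0], [1, 0], [0, 1]]
y = [3950.0, 2980.0, 3900.0, 3050.0, 3870.0, 3010.0]  # yield in kg/ha

rng = random.Random(42)
inc_seed = permutation_mse_increase(toy_tree, X, y, 0, rng)
inc_machinery = permutation_mse_increase(toy_tree, X, y, 1, rng)
```

Because the stand-in model ignores the second column, shuffling it leaves the error unchanged, whereas shuffling the first column can only raise the error on this sample. The sequential elimination described in the text would repeat such a ranking after dropping the lowest-ranked variable and refitting.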

Partial Dependence Plots
The study used partial dependence plots (PDPs) to assess the relationships between the independent variables and yield. PDPs were developed by Friedman [48] as a way to interpret complicated regression models (or black boxes). PDPs are derived by using Equations (6) and (7), where $x_s$ is the variable for which partial dependence is assessed, $x_c$ represents the other independent variables in the model $\hat{f}$, and $\hat{f}_{x_s}$ is calculated from a set of training data; where x
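The averaging idea behind a partial dependence plot can be sketched as follows. For each value on a grid of the variable of interest, that value is substituted into every training row, the model is evaluated, and the predictions are averaged. This is a minimal illustration under stated assumptions: the linear `toy_model` stands in for a fitted random forest, and the rows of precipitation and MSI values are invented.

```python
def partial_dependence(predict, X, s, grid):
    """Estimate partial dependence of the prediction on feature s.

    For each grid value, feature s is overwritten in every row of X,
    and the predictions are averaged over all rows.
    """
    curve = []
    for value in grid:
        preds = [predict(row[:s] + [value] + row[s + 1:]) for row in X]
        curve.append(sum(preds) / len(preds))
    return curve


def toy_model(row):
    """Hypothetical fitted model: yield rises with rain, falls with MSI."""
    precipitation_mm, moisture_stress = row
    return 1000.0 + 2.0 * precipitation_mm - 500.0 * moisture_stress


# Hypothetical training rows: [precipitation in mm, MSI].
X_train = [[400.0, 0.6], [500.0, 0.8], [600.0, 1.0]]

# Partial dependence of predicted yield on MSI (feature index 1).
msi_grid = [0.6, 1.0]
pd_msi = partial_dependence(toy_model, X_train, 1, msi_grid)
```

Plotting `pd_msi` against `msi_grid` gives the partial dependence curve for MSI. With this toy model the curve falls as MSI rises, which is the direction of the relationship reported for Figure 3b.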

Results
The results are presented in the form of descriptive statistics of factors that influence maize yield (Section 3.1) and rankings of these factors (Section 3.2).
Table 2 shows descriptive statistics of factors that influence maize yield in ORTDM. Over the three-year period covered by this study, the farms produced between 367.10 and 7449.13 kg/ha of maize with a mean of 3259.16 kg/ha. Average yields in 2018, 2019, and 2020 were 2946.40, 3067, and 3509 kg/ha, respectively. Fifty-four percent (54%) of the farmers planted Pioneer PHB3356BR, while 20% and 26% planted Monsanto 7674BR and Pan-14 seeds, respectively (hereafter referred to as Pioneer, Monsanto, and Pannar). Moisture stress ranged between 0.60 and 1.09 with a mean of 0.81 (MSI values typically range between 0 and >3). Most farmers applied either 150 or 200 kg/ha of the same NPK fertilizer, which was provided by the DALRRD in partnership with Grain South Africa (GrainSA); only one farmer used a fertilizer application rate of 100 kg/ha. Accumulated GDD ranged between 948.64 and 2013.16 with a mean of 1531.95, while precipitation ranged between 110.15 and 709.50 mm with a mean of 484.30 mm. Soil pH ranged between 4.89 and 6.57 with a mean of 5.55. Fifty-nine percent (59%) of the farmers used hired machinery, while 41% used their own machinery.
Figure 2 shows the variables that %IncMSE ranked as the most important. Maize yield was highly dependent on seed variety, surface water content as measured by MSI, fertilizer application rate, and GDD (p < 0.01). Yield was also dependent on precipitation, soil pH, and ownership of machinery (p < 0.05). Figure 3 shows how the independent variables were associated with maize yield.

Results of Variable Importance Analysis
Seed variety was the most important variable, with the Monsanto seed (1.00) producing higher yields than the other types (0.00, Figure 3a). Surface water content was the second most important variable, with Figure 3b showing that yield decreased as MSI increased. The third most important factor was the fertilizer application rate: Figure 3c shows that 100 and 150 kg/ha of fertilizer produced lower maize yields than 200 kg/ha. The fourth most important factor was GDD: Figure 3d shows that yield generally increased with the amount of accumulated GDD. The fifth most important factor was total precipitation. Maize yields did not show any significant response to low precipitation; however, we observed a drastic increase in yield as a function of precipitation above 600 mm (Figure 3e). The sixth most important factor was soil pH: Figure 3f shows that farms with soil pH above 5.0 produced higher maize yields. Lastly, Figure 3g shows that yield was higher among farmers who owned machinery (1.00) as compared to those who hired machinery (0.00). There were correlations between the independent variables, as shown in Table 3.
Some of the independent variables with lower importance scores were significantly correlated with the highly important variables (Table 3). MSI correlated with EVI2, while GDD correlated strongly with planting date. Ownership of machinery correlated with hiring of machinery, male farmers, collective farms, and individually owned farms. Soil pH correlated with Ca, Mg, K, and Na.

Discussion
This study investigated factors affecting maize yield in the smallholder farming systems (SFS) of ORTDM, South Africa. The purpose of this investigation was to identify critical factors that influence maize production and inputs, which AII and credit could provide or assist farmers to invest in. Over the 3-year period between 2018 and 2020, maize yields ranged between 367.10 and 7449.13 kg/ha with an average of 3259.16 kg/ha (Table 2). This reveals that most farmers produced below the national average, which often ranges from 4700 to more than 7000 kg/ha [49][50][51]. Other studies also reported that maize yields in ORTDM and the Eastern Cape Province at large are low and below potential [52][53][54]. The most important input influencing yield was seed variety. The farmers were receiving extension services and input recommendations from DALRRD and GrainSA (a private association of grain farmers). GrainSA recommended 200 kg/ha of fertilizer and supplied the farmers with a Monsanto seed variety, while DALRRD recommended 150 kg/ha of fertilizer and supplied the other farmers with Pannar and Pioneer seed varieties.
Although 200 kg/ha of fertilizer was associated with higher yields, the fertilizer application rates were less than what is generally recommended (>250 kg/ha) for maize in many parts of South Africa [55][56][57]. The low usage of fertilizer among South Africa's smallholder farmers is partly due to the perception that the widely recommended fertilizer application rate is unrealistic, risky, expensive, and meant for resource-rich farmers [52,58]. Therefore, the reason Pannar and Pioneer seed varieties produced lower yields compared to the Monsanto seed variety could be due to the lower fertilizer application rate. Smallholder farmers also have limited experience with new seed varieties. In 2019, the majority of farmers complained about the Pioneer seed variety, which was new to them, stating that it reached premature senescence.
The second most important factor affecting maize yield was moisture stress, which was significantly associated with EVI2. While MSI is sensitive to surface water content, EVI2 is sensitive to plant chlorophyll content, which influences the rate of photosynthesis and crop yield [59]. Interestingly, MSI was more important than total precipitation. Recent studies on AII are exploring the feasibility of using soil moisture rather than rainfall indices [60][61][62][63]. The motivation behind this is that surface water content, rather than precipitation, is a better indicator of water availability to plants and a better measure of agricultural drought. Although there is evidence that remotely sensed MSI can estimate soil moisture and vegetation water content [64,65], to our knowledge, no study has investigated the utility of MSI for AII in Africa.
The most important edaphic factor was soil pH, which was associated with base saturation (Table 3). In areas like ORTDM, where soils are largely acidic, an AII bundle would need to include lime to help neutralize soil pH, improve nutrient availability, minimize production risk, and enhance crop productivity. Lastly, ownership of machinery was associated with males who cultivate maize in individually owned farms. This shows that cooperatives and female-owned farms are more vulnerable to production risk. Farmers without equipment tend to cultivate smaller areas, delay application of agronomic inputs, and lose portions of their harvest, whereas equipped farmers produce higher yields because of timely operations and improved labor productivity [66]. Encouraging farmers and government to invest in affordable implements instead of focusing on large machines, which are not only expensive but also unsuitable for farmers with small fields, could improve access to equipment [67]. Alternatively, insurance could unlock credit for farmers so that they get the money needed to hire agricultural equipment on time.
The factors influencing maize yield in ORTDM are some of the common factors affecting crop yields in Africa's SFS. However, the exact nature and importance of these factors may vary from one place to another depending on socioeconomic and environmental conditions. Studies in other localities could provide more insight on some of the most important factors that need to be addressed in SFS to minimize production risk and to improve agricultural productivity. More research is also needed to find efficient ways by which AII and agricultural inputs can be systematically packaged into comprehensive risk management portfolios. Although linking AII with factors that influence crop yield may attract farmers to take up insurance, more work still needs to be carried out to reduce basis risk, which is one of the reasons why insurance uptake remains low [68]. In the efforts to address basis risk, future research could test the performance of MSI and other surface moisture indices in the design of AII.

Conclusions
ORTDM experienced no weather shocks over the three seasons covered in this study; therefore, weather conditions were suitable for maize cultivation. The below-average yields demonstrate that maize production in ORTDM could plummet even further in the event of a moderate weather shock (e.g., a mild drought). Low yields under optimal weather conditions also show that non-weather variables play a significant role in maize production in this area. For instance, the importance of agrometeorological factors on yield was largely associated with planning and planting dates. In countries like South Africa where farmers delay planting because of the late delivery of inputs and weather risks, linking AII with inputs, advisories, and credit would ensure that farmers use the appropriate seed varieties and fertilizers and have timely access to machinery. This approach has worked well in other countries like India and Kenya, for example, where insured farmers also get advisory services and weather information via mobile phones [69,70]. Accomplishing this requires a strategic synchronization of efforts in order to minimize production risks by acknowledging the influence of non-weather elements on crop yields. Future research could focus on how these non-weather elements can be incorporated in the design and packaging of AII. Lastly, we recommend further exploration of surface moisture indices like MSI, which could potentially reduce basis risk. The methodology used in this study can be applied in other areas as well as for other crops in identifying yield-determining factors that can be bundled or linked with AII.