Next Article in Journal
Novel Remote Sensing Index of Electron Transport Rate Predicts Primary Production and Crop Health in L. sativa and Z. mays
Previous Article in Journal
Deep Open-Set Domain Adaptation for Cross-Scene Classification based on Adversarial Learning and Pareto Ranking
Order Article Reprints
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:

Estimating Yields of Household Fields in Rural Subsistence Farming Systems to Study Food Security in Burkina Faso

Remote Sensing Solutions (RSS) GmbH, Dingolfingerstr. 9, 81673 Munich, Germany
Heidelberg Institute of Global Health (HIGH), Faculty of Medicine, Heidelberg University, Im Neuenheimer Feld 130.3, 69120 Heidelberg, Germany
Centre de Recherche en Santé de Nouna (CRSN), Institut National de Santé Publique (INSP), Rue Namory Keita, PO Box 02, Nouna, Boucle du Mouhoun, Burkina Faso
Institut Universitaire de Formations Initiale et Continue (IUFIC), Université Ouaga II, 12 BP 417 Ouagadougou 12, Burkina Faso
Institute of Computer Science, University of Osnabrück, Wachsbleiche 27, 49090 Osnabrück, Germany
Author to whom correspondence should be addressed.
Shared first authorship.
Shared last authorship.
Remote Sens. 2020, 12(11), 1717;
Received: 6 April 2020 / Revised: 10 May 2020 / Accepted: 22 May 2020 / Published: 27 May 2020


Climate change has an increasing impact on food security and child nutrition, particularly among rural smallholder farmers in sub-Saharan Africa. Their limited resources and rainfall dependent farming practices make them sensitive to climate change-related effects. Data and research linking yield, human health, and nutrition are scarce but can provide a basis for adaptation and risk management strategies. In support of studies on child undernutrition in Burkina Faso, this study analyzed the potential of remote sensing-based yield estimates at household level. Multi-temporal Sentinel-2 data from the growing season 2018 were used to model yield of household fields (median 1.4 hectares (ha), min 0.01 ha, max 12.6 ha) for the five most prominent crops in the Nouna Health and Demographic Surveillance (HDSS) area in Burkina Faso. Based on monthly metrics of vegetation indices (VIs) and in-situ harvest measurements from an extensive field survey, yield prediction models for different crops of high dietary importance (millet, sorghum, maize, and beans) were successfully generated producing R² between 0.4 and 0.54 (adj. R² between 0.32 and 0.5). The models were spatially applied and resulted in a yield estimation map at household level, enabling predictions of up to 2 months prior to harvest. The map links yield on a 10-m spatial resolution to households and consequently can display potential food insecurity. The results highlight the potential for satellite imagery to provide yield predictions of smallholder fields and are discussed in the context of health-related studies such as child undernutrition and food security in rural Africa under climate change.

Graphical Abstract

1. Introduction

Climate change presents an increasing risk for household food security and child nutrition [1,2]. Climate variability and extremes affect food security in multiple ways [1]. Lower crop yields are the most prominent drivers [2,3] and will be felt particularly by rural smallholder farmers in sub-Saharan Africa, who rely on rainfed agricultural yield for subsistence and minor economic returns [4]. Nevertheless, little is known about their farming practices, needs, and developments, which make them specifically susceptible to impacts caused by climate change, population growth, changing dietary patterns, and land and water scarcity. Data on yield, food security at household level, climate, and weather can significantly support decision-makers and project implementers in defining the best adaptation and risk management strategies in rainfed agricultural contexts [5,6].
While long-term climate change impacts food production through gradual warming, derangement of rainfall patterns, and reduced micronutrients in food crops due to increased atmospheric CO2 concentration, short-term changes are already observable today [1]. These short-term changes are mainly an increase in extreme weather events. Crop yields are influenced by biotic and abiotic factors (e.g., temperature, rainfall, soil type, insects, and management practices) that combined determine crop growth and crop vitality during the whole growing period [7]. In our study area, North West Burkina Faso, this translates to a subtle increase in mini-droughts [8] and strong or torrential rainfalls (>40 mm/day). Both of these extremes are detrimental to plant growth, crop yields, and thus, household food security and child nutrition [9].
Climate models project an increase in rainfall variability with more mini-droughts and considerable variations in rainfall seasons [10]. Further, non-conservative models project about 2–3 degrees Celsius (°C) of additional warming by 2100 for Burkina Faso [11]. Both an increase in the frequency and intensity of heatwaves can have detrimental effects on yields [12]. Rowhani et al. [13] showed that in Tanzania, by 2050, the modelled seasonal temperature increase of 2 °C would reduce average maize, sorghum, and rice yields by 13%, 8.8%, and 7.6%, respectively. Aside from direct effects resulting from higher temperatures, there is the indirect effect of greater evapotranspiration, which reduces crops’ water availability. Increased heat has another impact on food production, caused by the physiological limits of the human body to work in hot and humid environments. Their work productivity as farmers has been shown to be substantially reduced [14,15].
While epidemiological studies on children’s nutritional status, its multiple determinants, and assessments of crop yields and projections under different climate scenarios are abundant, there is a scarcity of studies linking yield of subsistence farming households to household food security [16,17]. However, the meteorological uncertainty and agricultural dependency make the households highly vulnerable to food insecurity, child stunting, and wasting, which are reliable indicators for household well-being [18,19]. A cohort study from 2018 with 514 children between 8 and 59 months of age in the Nouna Health and Demographic Surveillance System (HDSS) showed a prevalence of wasting at 7% (acute undernutrition), and of stunting at 26% (chronic undernutrition) [20,21]. One rare example that combines this information is from Belesova et al. [22]. The authors showed links between food crop yield in their year of birth with subsequent survival of children below 5 years of age and used long-term Mid-Upper-Arm Circumference (MUAC) data from the Nouna HDSS.
Satellite remote sensing analyses have shown to be effective for measuring yield-relevant crop parameters, e.g., based on the leaf area indices [23], nitrogen status [24], and biomass [25] can reflect the crop growth heterogeneity within fields [26]. Back in 1993, Groten [27] published a paper on crop monitoring and crop yield forecasting in Burkina Faso using the normalized difference vegetation index (NDVI). Most crop types show a correlation between vegetation indices (VIs) and yield, which allows the use of multi-temporal satellite data to estimate yields [28,29,30]. Sorgho et al. [7] underlined the potential of high-resolution data for qualitative yield assessments in household fields and state that quantitative yield estimates require denser time series as well as sufficient reference data. The cost-free Copernicus data, namely from the Sentinel satellites (cf. Section 2.2.4), offer new monitoring possibilities through (i) increased availability of Earth Observation (EO) data, with (ii) improved spatial resolutions, (iii) more extensive coverage, and (iv) shorter revisiting cycles. The EO-data thus allows the detailed monitoring of crop parameters over the growing period. These new developments permit moving assessments on food security from a national level with a spatial resolution of 250 m for West Africa [31] or 30 m for Burkina Faso [32] to the household level with a spatial resolution of 10 m.
Most studies on remote sensing-based yield estimates and predictions were developed in and for large, more industrialized farming systems. The used methodologies range from simulations, such as the Agricultural Production Systems sIMulator (APSIM) [33,34,35] or the Environmental Policy Integrated Climate (EPIC) model [28], to regression models based on a variety of VIs or the leaf area index (LAI) [36,37,38]. Schwalbert et al. [39] used Sentinel-2 data to predict maize yields of fields ranging from 20 to 150 hectare (ha) on the basis of a stepwise-regression procedure with multiple VIs as input. Another example is a study by Skakun et al. [40], who used VI metrics from Sentinel-2 and Landsat data for a Gaussian mixture model for yield assessments of large winter wheat fields in the Ukraine (30 m spatial resolution). However, few examples of studies on yield estimation in very small spatial units such as household fields with remote sensing data exist. Jain et al. [35] used high-resolution SkySat data and modelled yield simulations for multiple years with crop samples from smallholder wheat fields. Lambert et al. [41] apply a similar method as presented, equally based on Sentinel-2 imagery. However, they do not create the link to nutrition on the household level. Jin et al. [33] analyzed yield heterogeneity on smallholder fields in Kenya with a linear regression model, however, for only one crop type (maize). They tested several data sources such as SkySat, RapidEye, and Sentinel-2 data and achieved the best model performance with yield measurements aggregated at district level. Linking yield predictions of household fields to undernutrition on the household level is a novel approach. While Noromiarilanto et al. [17] assessed food self-sufficiency in smallholder farming systems of south-western Madagascar using remote sensing, household socio-demographic, and food consumption data, so far only Shively et al. [42] and Johnson and Brown [43] merged remotely sensed data and health aspects. The latter was addressed using the nutrition status (stunting and wasting) and survival of children aged below 5 years in Nepal and West Africa (Benin, Burkina Faso, Guinea, and Mali).
The present study aimed to quantify crop yields of household fields, which are defined on the basis of our field samples with household fields presenting a median of 1.4 ha. The fields regarded carry sorghum, millet, beans, maize, and peanuts. The research is conducted using high-resolution Sentinel-2 data, which, for the first time, allows the derivation of crop patterns at household level. The study investigated (i) whether Sentinel-2 time series can be used to quantify and predict yields of food crops at household level for a study area in Burkina Faso, (ii) which crop types are suitable for yield estimations using a multiple linear stepwise regression model, and (iii) the crucial aspects of field data, which are fundamental inputs for a remote sensing based yield estimation approach. We discuss the results in the context of health-related studies such as on child undernutrition and food security in times of a changing climate with a focus on rural Africa.

2. Material and Methods

2.1. Study Area

The study area is located in rural sub-Saharan Africa, in the western part of Burkina Faso, in the Kossi province. The area covered by the Nouna HDSS (central coordinates: 12.74°N, 3.89°W), encompassing about 1756 km², represents the focus area for this project, counting 59 villages. Fourteen villages were selected based on a connected study on child undernutrition [20]. The study region is shown in Figure 1, displaying the study area with three exemplary cut-outs (named A through C) showing some surveyed fields as well as corresponding household locations. Cut-out (A) in Figure 1 displays multiple fields in the northern part of the HDSS area with no households in close proximity (1 km), while the cut-out (Β) shows an example for fields of different sizes around a few households (in 0.5 km proximity) towards the center of the HDSS area. Cut-out (C) in Figure 1 mainly displays smaller fields with households directly attached in the southern HDSS area. The range of surveyed field sizes is given in the results section.
The area is climatically characterized by a distinct short rainy season lasting from June to October [32]. Within the Nouna HDSS area, substantial temporal and spatial meteorological variations have been documented, implying consequences for the local community, which is dominated by smallholder farmers [44]. These smallholder households get more than 80% of their income from agricultural activities, making them highly dependent on their yearly yield [6]. The Nouna HDSS area, as most of Burkina Faso, is dominated by rainfed agriculture [32] with the dominant crop types being sorghum, millet, beans, peanuts, maize, sesame, fonio, rice, and cotton [45,46,47]. The agricultural season starts around May–June with sowing, the fields are maintained from July through September, and the season ends in October–December, when the fields are harvested [46]. Since a lack of electricity and financial means is highly characteristic in Burkina Faso in general and in the study area specifically, irrigation is rare, and the dependency on rainfall and manual labor high [44].

2.2. Data

2.2.1. Field Data

The authors developed a comprehensive field protocol (see supplementary) that allowed the collection of reference crop data in a systematic and reproducible manner. This method is also applied by the Ministry of Agriculture and Hydraulic Installations in Burkina Faso, whose 2008 agricultural report was used as guidance and adapted for our needs [48]. Two enumerators from the local Agricultural Service in Nouna were trained in conducting the project survey according to the defined protocol. The field assessment was carried out from September to November 2018 in four steps: (i) global positioning system (GPS)-based mapping of field boundaries using Garmin eTrex 10 GPS handheld devices and collection of crop type data; (ii) farmers survey on agricultural practices; (iii) weighing of a sample of crop yields for each crop type; and (iv) recall of the farmers after the harvest as an assessment of general crop conditions and anomalies. Based on the GPS information, crop types, yield, and other information from the fields were gathered and combined with remotely sensed data.

2.2.2. Field Sampling Approach

Nine crop types were identified as central crops for the region: millet, sorghum, maize, peanuts, beans, cotton, sesame, fonio, and rice, which cover more than 90% of the cultivated area [18]. The farmers and fields were selected based on a previous survey conducted in the Nouna HDSS [20], which provided information on crop types planted by the respective household. The study only regarded single-crop fields and excluded intercropped or mixed fields, since the yield models depend on maximally pure crop-wise remotely sensed reflectance. For all nine crop types, GPS information was taken to represent the whole agricultural diversity, while five crop types were selected for yield estimations due to their dietary importance for children; namely millet, sorghum, maize, peanuts, and beans. To draw statistically significant conclusions, we aimed for at least 30 yield samples of each crop type.
Following the identification of the fields, GPS tracks of field borders (i) were taken together with the farmer by walking along the perimeter of the field with a handheld GPS device. The data was stored in a track-log format and provided field size noted down on a survey sheet. All collected information was coded with the household ID and coordinates (longitude and latitude) of GPS points. Additionally, from each field, one picture was taken from ’above head perspective’ and three photos from different angles at breast height that allowed for a visual inspection and documentation of field characteristics.
Trees and bushes on fields are common in the Nouna HDSS area and can have a negative effect on the harvest in its surroundings as well as the remote sensing method applied in this study (edge effect [36]). As a result, each GPS recording of the field boundaries was post-edited in a Geographic Information System (QGIS 3.6), manually excluding all trees and bushes generously from the field boundaries using very high-resolution images from Google Earth.

2.2.3. Harvest Measurements

Crop yield measurements were taken from 25 m² squares. This approach was perceived as the most feasible and sufficiently reliable for the study area compared to other methods [49]. To locate a random sample location in the field, a set approach that includes the circumference of the field based on the GPS information, the size of the field in meters, random numbers given by the farmer, and two number tables were used. Once the point location was defined, the 25 m² square was placed. Three pickets connected by a ribbon with 7.07 m between each other for the diagonal and 5 m for the sides of the square allowed creating the required 25 m² square. This setting method assured equally sized squares throughout all 245 surveyed fields. GPS coordinates were taken from the center of the squares. Following the harvest, the crops were dried, and weight measurements were taken from the pre-set 25 m² squares of 213 fields (iii). For 32 field squares, measurements were not completed or validated.
In addition to a plausibility assessment, harvest squares set too close to a fields’ boundary or near large, shadow-casting trees were excluded. This step led to a total of 17 harvest measurements being omitted from originally 213 sampled fields, constituting a total of 196 harvest measurements for further analyses. Specifically, the variable set includes 31 harvest measurements for beans, 31 for maize, 32 for peanuts, 45 for millet, and 57 for sorghum fields.
The farmer notified the field workers of the impending harvest. This ensured the field square being harvested simultaneously with the remaining field and its yield being stored separately. The crops were then dried on the farmer’s property for 3–10 days. Afterwards, the crops were weighed—millet, sorghum, maize, and beans without stems and leaves, and peanuts with the shell. The Salter Model 235 6S was used for weighing the yield (kg). The apparatus was hung on a tree, and the sample crops were put in a lightweight plastic bag for weighing. The crop was returned to the farmer after weighing. The final farmers’ recall provided insights on weather-related harvest losses, fertilizer application, and usage of pesticides.
Between August and September 2018, agricultural information from 517 farmers in the Nouna HDSS was obtained. Accordingly, a household tilled between one and 16 grain-fields with 55% owning two to four grain-fields. The reported fields range from 0.2 to 20 ha, with the median at 1.4 ha. A total of 86% of the farmers reported owning 3 ha or less of total area per crop. Regarding the crop distribution, 88% of the households tend to at least one sorghum field, followed in descending order by sesame (86%), maize (80%), millet (77%), peanuts (48%), cotton (25%), rice (27%), and fonio (14%). For a better insight into field management methods, farmers recall (ii and iv) revealed that 57% of the farmers did not use pesticides, although 13% mentioned struggling with Striga, a root-parasitic plant. More than half of the farmers also applied either a chemical or an organic fertilizer (56%).

2.2.4. Remote Sensing Data

The study is based on remote sensing data derived from the Sentinel-2 satellite constellation, which is part of the Copernicus Programme and is comprised of the Sentinel-2A and 2B satellites (hereafter referred to as Sentinel-2). The multi-spectral images are composed of 13 bands, providing spectral information in the visible, red-edge, near-infrared (NIR), and short-wave infrared (SWIR) electromagnetic spectrum. Sentinel-2′s orbital swath width is 290 km with a revisiting frequency of only 5 days over the equator, which increases the probability of regularly acquiring cloud-free images. Ten spectral bands provide valuable data for vegetation monitoring, of which four have a spatial resolution of 10 m and six of 20 m [50].
For this project, all available Sentinel-2 images for the growing season 2018 were used (March 9 through December 29, 2018) resulting in a total of 188 images for the four tiles covering the study area. Each image, acquired at Level-1C, was pre-processed, including an atmospheric correction utilizing Sen2Cor version 2.4.0, before further processing.
Multi-spectral remotely sensed data, in particular different derived VIs, can represent biomass and crop conditions [51]. A multitude of VIs were tested in preliminary steps, finalizing with the choice of three VIs that use different ranges of the electromagnet spectrum (vis, RedEdge, NIR, and SWIR) and reflect relevant vegetation characteristics reliably (photosynthetic activity, chlorophyll absorption, and water content, etc.). Kross et al. [52] showed the relevance of multiple VIs using different wavelengths. The chosen VIs are the most commonly used VIs, the NDVI [53], and two additional VIs, especially suitable for yield estimation [41], namely the normalized difference red edge index (NDRE) and the normalized difference water index (NDWI). The NDVI is a well-known VI for remotely sensed vegetation analyses based on the reflectance values in the NIR and red band [44], while the NDRE uses the red-edge spectral band instead of the NIR. Sharma et al. [54] show that the NDRE offers better differentiation when regarding dense crops and crops in the final growth stages. In these cases, NDVI values (e.g., for areas with a leaf area index (LAI) > 3) are saturated, and the NDRE has proven to be more sensitive and applicable [54]. The NDWI, which uses the NIR and SWIR band, is especially sensitive to the water content of vegetation (moisture) that interacts with solar radiation [53]. The NDWI’s values are positive (≤ 1) for green vegetation and usually negative (≥ −1) for soil, which has to be kept in mind for sparsely covered crop stands. All indices have a range between −1 and 1.
The calculation of the respective index can be derived from the following equation. All VIs were calculated at 10 m spatial resolution, by resampling the RedEdge and the SWIR band each from 20 to 10 m resolution. For the calculation based on Sentinel-2 data, band 4 represents red, band 8 NIR, band 5 RedEdge, and band 11 SWIR.
Normalized Difference Vegetation Index           N D V I = N I R R e d N I R + R e d             [55]              (1)
Normalized Difference Red Edge Index         N D R E = N I R R e d E d g e N I R + R e d E d g e             [56]              (2)
Normalized Difference Water Index         N D W I = N I R S W I R N I R + S W I R             [53]              (3)
For each VI, the maximal monthly values per pixel were extracted from all available images per month by analyzing the time series pixelwise and creating a synthetic image of the maximal monthly index values [36]. The outcome of this multi-temporal metric calculation is 10 images, March through December, per VI (namely NDVI, NDRE, and NDWI). As an input for the regression analysis and the multiple linear regression (MLR) yield model development, monthly maximal VI-values (30) are derived from the pixel under the respective sampled harvest squares, which are then used as input variables (30) for the yield modelling.

2.3. Modelling Methodology

Numerous studies on remote sensing yield estimation exist. Besides the methodology, the size of the surveyed fields varies, typically large and homogenous fields, as do the number and type of crops analyzed. Jin et al. [33] analyze yield heterogeneity on smallholder fields in Kenya of one crop, namely maize, with a linear regression model. The common denominator throughout all related studies is the studies’ base; the correlation of yield to biomass [57], which also presents the theoretical core of the presented research.
The statistical yield modelling approach of this study comprises three main steps: the first step is an explorative analysis of the data to optimize the model parametrization and verify all relevant assumptions for implementing an MLR. The second step is the yield model itself, while the third step encompasses the yield model application for yield estimation on the surveyed fields beyond the sampled squares.

2.3.1. Model Parametrization

The presented methodology is based on the assumption of a direct relationship between biomass/vegetation conditions over the growing period that can be described by VIs and harvest yield, the monthly correlation of the different VIs, and the measured harvest amounts. Simple scatterplots with linear regression models were used to confirm a linear relationship. With regard to individual predictor variables, a simple linear regression is sufficient. However, when more than one predictor is involved, an MLR is more adequate. An MLR model is calculated using
Y i = β 0 + β 1 x i 1 + β 2 x i 2 + + β p x i p + ε
where for every individual i, Y is the model response, here yield, β o is the intercept, β p is the regression coefficient of the respective x, which are the distinct predictor variables, here the VIs, and ε   is the residual error. It has to be kept in mind that additional model bias is added when a linear relationship between variables is assumed since this almost always represents an approximation of reality [51].
There are multiple preemptive assumptions required for implementing an MLR. The main aspects are a linear relationship between the predictor and outcome variable (i), no outliers (ii), and no multicollinearity within the data (iii) and finally independence of the residuals and their normal distribution (iv) [58]. The MLR model developed in this study aims at modelling the relationship between a set of predictors (multi-temporal VIs) and observations (yield measurements).
Several methods exist to validate normality within or adjust the data to normality (iv) [58]. Normality was tested with the Kolmogorov–Smirnov (K-S) test, which provides statistical parameters to define the goodness of fit [59]. DeGroot and Schervish [58] also state that defining independence of residuals or absolute normality is especially difficult for small sample sizes, such as present for this study. Brownlee [60] states that data’s normal distribution might not be exposed until a significant increase in the sample size [60]
The often-encountered issue regarding regressions is multicollinearity (iii), meaning a high correlation between one or several predictors negatively affects the model stability and quality and therefore needs to be reduced to a minimum [61]. To detect possibly correlated predictors, the Pearson correlation coefficient (PCC), which divides the cross-correlation of two random variables ( E ( xy ) )   with the product of the standard deviation (σ), is calculated for each predictor pair and visualized in a matrix [62]:
ρ ( x , y ) = E ( x y ) σ x σ y , 0 ρ ( x , y ) 1  
Dormann et al. [63] show that collinearity inflates the variance of estimated regression coefficients when the PCC is >0.7. It is suggested to exclude variables surpassing this threshold. Another indicator, which measures the severity of multicollinearity (iii), is the condition index. It represents the “square root of the ratio of each eigenvalue to the smallest eigenvalue of X” [63]. Belsley [64] introduced a diagnostic test for determining degrading collinearity (DCL). It states that all condition indices greater than 30 have a collinear relation that is degrading to the model. Further, for each condition index larger than 30, two variables contribute to its degrading collinear relation if their variances, values in the corresponding row, are both greater than 0.5. Consequently, to reduce multicollinearity, the highest condition index value needs to be below 30 by excluding one of the variables with variances greater than 0.5 for the corresponding condition index [65,66]. These two adaptions necessitate an iterative process removing highly correlated predictors from the predictor set to finally receive data in accordance with MLR prerequisites.
As a general rule for iteratively excluding one variable from a pair that shows high correlations, NDVI values were prioritized over NDRE, due to its scientific dominance. Moreover, months later in the growing season were prioritized over earlier time steps, due to the assumed higher relevance to yield. Once one variable was excluded, the correlation matrix and the condition indices were re-calculated, and the multicollinearity re-evaluated. This process is repeated for every predictor set per crop type, resulting in an insignificantly correlated set of predictor variables per crop for further modelling steps.

2.3.2. Model Validation

Opposed to simple linear regressions, MLRs necessitate a variable selection (explanatory variables), resulting in a reduced number of significant model predictors. There are three main approaches to selecting the final model predictors: forward selection, backward selection, and mixed selection. For the presented study, the mixed selection with a significance level at 0.05 was chosen. This approach essentially considers zero predictors for the initial situation and predictors are added incrementally. Predictors can be removed if their addition negatively changes the p-value. Predictors are added until all model predictors provide sufficiently low p-values, and all remaining predictors have a high p-value (not desirable) when added [67].
There are several popular statistical parameters to validate and choose the best model fit. As shown in multiple studies [33,34,68] the adjusted R² and the root mean square error (RMSE) are well equipped to validate and serve as a basis to choose the optimal model. In this study, the best model fit is based on the adjusted R² and the ANOVA results (F-Score, RMSE). For the classical R², a high value (0 ≤ R² ≤ 1) indicates a small test error in the model. The calculation induces an increased R² with every added variable. Opposed to R², the adjusted R² accounts for noise added by irrelevant additional variables and therefore does not automatically improve with increasing variable numbers. The RMSE can be interpreted more precisely when compared to the overall value range (min-max) of the input variables (here: harvest field measurements). A higher range, which represents high intra-variability, can justify a higher RMSE. The ANOVA performs an analysis of variance using an F-test and offers insights on the significance of the established model [67]. The calculated significance needs to be below the defined significance level, in this case, 0.05, for the model to be deemed fit to describe the data.
Besides the significance level, another crucial aspect for the prediction accuracy is the relationship of the number of observations (n), in this case, the harvest measurements, and the number of predictors (p), here the VIs. If n is not much larger than p, the model tends to overfit. N >> p invalidates this issue and is therefore preferable. While a multitude of predictors is desirable for developing a maximally well-fitted model, their number in proportion to the number of observations needs to be taken into consideration as well [67].
The general model outputs are the defining parameters, their coefficients, and the intercept. These make up a generalized prediction formula. This formula is then applied, using the remote sensing indices time series with a spatial resolution of 10 m as inputs, to quantitatively predict yield for all surveyed household-related fields in the study area. In the last step, these yield estimations were then used to quantitatively estimate the amount of yield per crop type per household, creating a yield map, which creates the link between remotely sensed yields and health-related issues per household and individual. All MLR related analyses and modelling were implemented in the programming language R.
All subjects gave their informed consent for inclusion before they participated in the study. The study was conducted in accordance with the most recent version of the ethical principles of the Declaration of Helsinki of 1975, which applies to national and international regulatory requirements. Ethical approval was obtained both from the Heidelberg University Ethical Committee (S-180/2017) and the Nouna Health Research Center Ethical Committee.

3. Results

3.1. MLR Requirements

Figure 2 shows crop-wise NDVI developments over time, representing the phenological development of each crop. The 10 months on the x-axis (March through December) reflect the growing season. Looking more closely at beans (top left), the increase of the NDVI values starting in June coincides with sowing and the peak in September with the maximum development of beans’ stand, before being harvested starting in October.
The primary assumption of the presented research on crop yield modelling focuses on the relation of harvested yield and VIs. Solely regarding the linear relationship of measured yield and the VI NDVI, monthly analyses showed a correlation with R² ranging from 0.01 to 0.36. The low values represent a weaker correlation. This can be explained because the translation is indirect, and the degree of correlation is also dependent, e.g., on the absolute biomass, the soil coverage, and other environmental aspects [41,69,70]. The explorative statistical analysis showed that correlation increased with progressing time and, in most cases, reached its maximum shortly before harvest. These initial findings underlined the promoted theory of a relationship between the harvested yield and the VIs and paved the way for the more complex MLR model for yield estimation.
Another assumption for an MLR is the necessity of a normal distribution of the data’s residuals (cf. Section 2.3.1, iv), which was tested with probability plots and the K-S normality test. All analyzed crops except for millet, for which the p-value of 0.044 is too low, showed p-values clearly above the threshold of 0.05 (confidence interval at 95%). Following arguments listed in Section 2.3.1, the MLR prerequisite of normality was assumed for further processing steps.
The next step involved analyzing the data for multicollinearity and autocorrelation (cf. Section 2.3.1, iii). In an iterative process, the final predictors for the MLR Model were chosen by excluding predictors showing autocorrelation or multicollinearity in compliance with the thresholds suggested by Dormann et al. [63]. The initial PCC matrices, including all 30 predictors, showed very high correlations, of up to 0.98, especially between NDVI and NDRE values.
The final set of predictors per crop met all requirements for running the regression model. All correlated variables were iteratively excluded, omitting PCCs above 0.7, reducing the condition index to below 30 (cf. Section 2.3.1), and the maximal variance decomposition to below 0.85. Based on the final crop-type predictor sets, the MLR models were run.

3.2. MLR Model and Yield Estimation

For each crop type and their specific predictor variable set, an MLR model was developed. The results vary depending on the crop type.
From the five crops analyzed, beans outperformed the other crops with an R² of 0.54 (adj. R² = 0.50), while maize (R² = 0.46, adj. R² = 0.40) and sorghum (R² = 0.44, adj. R² = 0.40) follow shortly behind. Millet presents a slightly lower R² of 0.40 (adj. R² = 0.32), equally concluding a highly suitable model fit [71]. The model significance of < 0.001 for these crops justifies the model. However, with an R² of 0.12 (adj. R² = 0.10), a weak model significance of 0.0477, and considering Cohen’s thresholds, the model results for peanuts are not applicable for further usage [71]. Comparing the resulting RMSE to the crop-specific value range, the results are good except for peanuts. Table 1 contains the final model predictors and further relevant parameters for assessing the model’s quality.
Aside from the good model fits for estimating yields of beans, maize, sorghum, and millet, the relevant predictor months are of interest. The latest predictor month defines the earliest time at which yield predictions are possible for each crop type. The model outputs show that maize yield could be predicted as early as August, bean yields as of October, and forecasts for sorghum and millet are not possible until November (harvest month). Maize yield can be predicted early and could thus act as an indicator crop for the overall yield situation prior to harvest. These results can have a substantial positive impact on early identification of yield gaps and support early mitigation measures. The model was applied to the VI-metrics using the equations from Table 2.
Figure 3 contains a representative subset of the yield map resulting from the model application. It shows sorghum, millet, maize, and beans fields. Each pixel of 10 m resolution contains the estimated amount of yield in kilograms per square meter (kg·25 m²). Additionally, the total amount of yield in kilograms is given for each field.
In combining the yield amount per field and the corresponding household from the survey, it is possible to quantitatively estimate the amount of yield per crop type per household. For the first time, this allows the link between the remotely sensed model output, food security, and health-related issues per household and individual. For every surveyed household, statistics on their predicted yield per crop type can be generated. In rural subsistence farming systems, where people live from what they grow, harvest deficits translate to household food insecurity and even child undernutrition [10]. Hence, the quantitative yield estimate from Figure 3 can be translated to crucial socio-economic information, such as crop-specific calorie estimations per household. The quantitative yield estimation is therefore a predictor and direct determinant for potential food deficits and thus food insecurity.
Table 3 contains anonymized information on the estimated amount of yield per crop type for exemplary households and includes the model output for fields recorded in the field study. Additionally, each households’ average yield per crop type is given in kg·25 m², which is a field size and field number-independent value of productivity. The mean yield, therefore, underlines the differences between households and also the variability and reliability of crops.

4. Discussion

Our results highlight the potential of freely available satellite imagery for yield estimations and predictions of smallholder household fields at 10 m spatial resolution. Based on monthly aggregated VIs calculated for the entire growing period and in-situ harvest measurements, five crop-dependent yield models were generated with good model fits for most crops and then applied. While the presented regression-based yield modelling approach is no novelty, this approach demonstrates the potential of earth observation data for modelling yields of a multitude of crops growing on household fields in rural West Africa for the first time. This was possible through an extensive field data survey that provided reference yield samples from different crop types, which is crucial for establishing the relationship between remotely sensed crop parameters and actual yield.
However, even with regard to the positive study results, the model output is highly dependent on the input variables, and possibilities for optimization should be discussed. The essential aspects of this study were the identification of potential improvements regarding remote sensing modelling and field data collection, particularly the potential improvements for model parameter optimization, reduction of uncertainties inherent to the reference data, and contextualization of the results. These aspects are all discussed in more detail below.

4.1. Yield Model Results and Limitations

When comparing the analyzed and modelled crops, sorghum, millet, beans, and maize produced good model fits, the latter two proving to be the most reliable. Several factors can influence the model output, such as the project region, sample, field size, or the data basis. These must be considered when comparing results with other studies. Depending on the crop type, the model outputs delivered exceeding results or ones comparable to those of similar studies conducted. Comparable results were presented by Lambert et al. [36], who reported R² values between 0.3 and 0.6 for yield estimations in Mali for smallholder farms. Morel et al. [72] focus on sugarcane yield estimation on smallholder farms resulting in R² between 0.21 and 0.53, depending on the underlying data. In regard to the weaker model results for peanuts, we assume that the correlation of biomass to the final yield is not as linear as it proved to be for grains. Peanuts showed weak correlations in the initial analyses of VIs over the measured yield. This could be caused by the special geocarpy plant reproduction of peanut plants, which develops the peanut below ground.
The analysis of the results was an integral part of this study in order to identify potential improvements in the yield modelling approach. This analysis included an explorative statistical analysis of the input data, a consideration of uncertainties, and an identification of potential error sources. We want to discuss three central aspects that may explain the variation in results for the specific case in the Nouna HDSS area: (i) remote sensing input data, (ii) pests, and (iii) yield measurements.
From the remote sensing point of view (i), an influence on the correlation of biomass to measured yield can be caused by weeds and trees. On the one hand, this negatively influences the yield by competing with the crops for nutrients and space. However, on the other hand, it results in positive biomass measurements in the remotely sensed data (e.g., saturated vegetation signal from trees). The satellite does not distinguish between food crops or weeds. To what extent weeds influence the data is difficult to quantify and underlines the necessity of contextual information gathered in addition to field samples. Regarding the effect of trees and their canopies in the data, they were excluded manually from the field boundaries to minimize their impact on the reflectance signal and can be neglected here.
The satellite data constituting the input data for this study were monthly composites of VIs derived from all available Sentinel-2 images. Since these input data were exactly the same for all crop yield models, the satellite data cannot explain the variability of the model results. However, a potential improvement from the data side could be a higher temporal resolution of the input data, such as bi-weekly VI composites instead of monthly VI composites. This could enhance the identification of the crop-specific phenological key features, such as onset, peak, or end of the crop season. In 2018, this was not possible for the whole growing season since cloud-free observations are required in order to achieve full area coverage. Only monthly composites yielded valid observations in all timesteps for the whole study area since main parts of the growing period lie within the rainy season, resulting in only a very limited number of cloud-free observations per month.
Future studies of the authors will focus on incorporating multi-annual analyses, which will result in more robust yield models and balance out annual variabilities. It is perceivable that including multi-annual harvest information, VIs, and potentially corresponding weather data, would result in an even better model fit. In addition, to scale up the yield predictions to a larger area, the authors suggest to also conduct a crop classification in order to apply the prediction to all cultivated fields in the area. Two machine learning approaches could, therefore, be combined, by first applying a crop classification using ensemble learning (such as random forest) and then applying the MLR models within the crop-specific masks. The models’ application could then be spatially expanded due to the variety of input data and adapted to more extensive or different project areas.
The second factor that can cause variability in the model results are pests (ii) since they can skew the correlation of biomass to measured yield. This is particularly true for pests that negatively affect the growing fruit or seed, but less the crop stand itself. In these cases, the values of the VIs indicate high amounts of biomass and, consequently, predict high yields, while, in reality, the development of the fruits has been impeded. Concerning the present study, less than half of the farmers used pesticides but reported to struggle with the root-parasitic plant Striga, which was found throughout the study area. A closer assessment of weed or insect infestations and fungal diseases can highly contribute to improved crop yield evaluations and is recommended for similar studies.
Reference data collection, particularly yield measurements (iii), constitutes the basis for remote sensing-based yield estimates and holds a high potential for data variation and uncertainty due to various factors, which Jin et al. [33] found to be true as well. Already minor impacts such as animals walking through the sample square, or a variation in the crop sample’s moisture at the time of weighing may have a negative impact on the correlation of the remotely sensed and field data. Regarding the latter, our data showed that discrepancies between households and within crops, concerning the elapsed time from harvest to drying to weighing, exist. This study was realized in close cooperation with local farmers, who dried the samples in their homes as they traditionally do. Farmers reported a varying drying period of a minimum of 3 and a maximum of 10 days. This is a crucial aspect since the humidity of the yield profoundly influences the yield’s weight. Moreover, with elapsed time between harvest and weighing, the probability of disturbances, e.g., post-harvest losses, increases. According to a study by Grolleaud [2], post-harvest losses vary strongly depending on the crop type (i). He stated that while beans, maize, and sorghum present comparably low harvest losses on the field (≈4%), millet is reported to be more vulnerable to negative impacts on the field and its transport causing up to 15% post-harvest losses. Between harvest and weighing, the yield is dried and stored, during which time further losses can occur and which is challenging to control considering the number of smallholder farmers involved in this extensive survey. Generally speaking, remotely sensed data cannot encompass post-harvest losses, nor are they accounted for in the model, which can provide an explanation for the small discrepancies between and within crop types for the model fit. A possibility to overcome this limitation would be to dry and weigh all samples in a laboratory setting, which was impossible in this rural study area due to a lack of facilities. Another helpful measurement would be the weighing of the biomass at the time of harvest in the sample squares [17]. This could provide an additional variable to assess the relationship between biomass and the remote sensing parameters and could help identify irregularities in yield data stemming from factors influencing the sample in the drying period.

4.2. Linking Smallholder Yield, Child Undernutrition, and Weather Variability

A significant obstacle for connecting health and nutrition data to agricultural data is the need for “locally specific, dated, and geolocated datasets that can be linked quantitatively” [15]. Analyzing crop yields at the household plot level is challenging due to the lack of information on the geographic areas in which crops are cultivated (total area and crop types). This is especially relevant in subsistence agriculture, such as practiced in sub-Saharan Africa. Non-uniform plots, intercropping, and a mix of crops with different seasonalities (sequential cropping patterns) further challenge crop analyses [32,45]. Intercropping is recommended because this agricultural practice decreases the likelihood of soil erosion, improves soil quality, increases soil water retention, and prevents crop diseases. However, identifying intercropping of household fields at 10 m spatial resolution remains a challenge in remote sensing. Previous attempts focused on large and homogenous fields, applying data with a spatial resolution of 250 m [5]. In regard to crop yield estimation of inter-cropping fields, only relative yield estimates seem possible, since discrimination of yield per crop type for quantitative estimates is highly challenging.
The 245 fields regarded in this study ranged from 0.01 to 12.6 ha, while 74% of these are smaller than 1 ha. Moreover, no cadastral information on land use is available for Burkina Faso, which is dominated by small structures and minimal presence of irrigation. These factors make for an agricultural landscape highly dependent on environmental variables and with significant inter- and intra-field variability, meaning inhomogeneous growing conditions in the area and within the fields.
Brown et al. [16] tried linking human health and nutrition to remote sensing data on a purely methodological basis. Accordingly, isolated analyses on environmental change are not purposeful but need to incorporate ground-based local parameters. Nevertheless, at the household level, the link between climate variables, weather, e.g., heat and flooding, and agricultural output in the context of food insecurity, notably child undernutrition, has hardly been studied [7]. The Demographic and Health Survey (DHS) started to collect geographic information in the mid-1990s, creating a basis for publicly available health and nutrition data linked to geolocations. In the meantime, more national and international surveys, such as the World Bank’s Living Standards Measurement Surveys, provide this publicly accessible data. Subsequently, the connection of environmental information to health and nutrition using multivariate regression methods becomes a possibility [18,43,73]. Due to the Nouna HDSS existing since the 1990s [74], continuous and reliable health data at the household and even the individual level are available. With the knowledge of, e.g., household wealth, expenditure, and migration patterns, changes over time can be observed. Regular yield information is essential information for a population highly dependent on agricultural output for survival and income and will support researchers and thus policymakers to understand population changes and risks. Assuming that cultivated fields can be linked to the respective households, individual yield outputs can be generated to analyze the effects of nutrition on the health of individual families.
A further application can be seen in relating food crop yields to rainfall or temperature variability in the respective growing season, thus exploring the impact of these dimensions of climate change on yields and further on child undernutrition. The methodological advances lie in the extremely high resolution of the household with its children’s nutritional status and the food crop yields of the respective household. Such data can further be “fed” into climate impact models, e.g., with highly aggregated data [10].
The results of the present study can support studies regarding nutritional aspects to link household food security characteristics to the respective dietary status. Especially the link of yield data to individual households is essential for such work. Future studies could make use of the provided yield assessments for health studies at the household level, especially considering that these are available up to 2 months before harvesting.

4.3. Potential and Implications for Policymakers and Development Organizations

This research provides evidence for a novel method relevant for researchers and policymakers since it allows the small scale (10 m spatial resolution) estimation and prediction of yields at the household level. These can be used to evaluate the impact of climatic factors (rainfall variability, droughts, etc.) or farming management variations (fertilizer, insecticides, crop varieties) on households’ economic and nutritional health status. Thus, this research provides strong political arguments for governments and non-governmental stakeholders active in the humanitarian and developmental sectors. Results from the presented yield modelling approach can be used at low costs for planning agricultural interventions, supporting preparedness and prevention measures, and can improve local understanding of yield gaps and losses. One of the major strengths of our approach is the spatial resolution. It enables the link between socioeconomics, nutrition, and yield data on the household level and thus fits the spatial scale that is required for early-onset risk identification of undernutrition of individuals. Subsequently, it portrays subsistence food production and the vulnerability of smallholder farmers to a lack or loss of yield as one of the central determinants of child undernutrition.
The identification of agriculture areas and their respective yields on a larger scale (upscaling) becomes especially important in the context of (mal) nourishment since crop diversity directly relates to dietary diversification, which again is crucial for human nutrition. At this point, the model was only applied to the surveyed fields. With an underlying agricultural crop map, the yield estimation and prediction can be applied to all cultivated fields in the area. This would benefit the countries as population growth poses another challenge to policymakers, who need to assure constant access and sufficient production of agricultural products to ensure self-sufficiency on a national scale and independence of international market price fluctuations. Knauer et al. [10] reported a 91% expansion of agricultural land in Burkina Faso between 2001 and 2014 and accordingly reported an agricultural area coverage of 22%. This shows the country’s potential to feed its fast-growing population (3% annual population increase) and assure labor in a country where 90% work in the agricultural sector.
Nevertheless, not only the quantity but also the quality of the use of arable land needs to be monitored, wherefore remote sensing is a promising surveillance system for land use, crop coverage, and identification of agricultural potential. Low-resolution satellite data with extensive area coverage to register crop growth and vegetation stress are operationally used to assess (non-crop specific) yield anomalies on a regional level (e.g., the Famine Early Warning System Network by USAID (FEWS-NET) or the Food Security (FOODSEC) and Monitoring Agricultural ResourceS (MARS) by the European Commission [75]). However, particularly in spatially small-scaled subsistence farming, there are a wide range of farming practices and available resources (e.g., fertilizers, irrigation, etc.). A direct link between low-resolution satellite information and household food security is, thus, not possible.
Remote sensing data and derived information products can be used to acquire up-to-date data by utilizing high spatial and temporal coverage of inaccessible areas and by identifying environmental constraints [44]. Remote sensing data is still rarely quantified on livelihood status, let alone the nutritional status of children. Nelson et al. [76] related remote sensing to household-level expenditure to explain poverty patterns in Uganda. However, research has been limited in linking agricultural practices to food security, health, and climate change. We believe that remote sensing can explain some factors leading to child undernutrition patterns and provide reliable data even before an event actually happens. Such patterns would be of high value for policymakers and international organizations to act on and prevent food insecurity and humanitarian disasters.
Although forecasting food insecurity is not a new field, e.g., drought forecasts, we propose forecasting yield shortages on the household level. This is a novelty and has not been implemented before. Our results showed great potential to predict yields for maize as early as August, 2 months before harvest, beans in October, and sorghum and millet in November, each a month prior or during harvest (estimation). On this basis, a near-real-time warning system on the political level could be implemented with countermeasures on crop shortage before households, and especially children, suffer from malnutrition.

5. Conclusions

This research successfully established yield assessment models for different main food crops cultivated in household fields. Crucial information in the context of food insecurity can be derived from these and provide a new and valuable link to child undernutrition at the household level. Given that satellite imagery is freely accessible, the developed approach is particularly useful to low-income countries profoundly affected by climate change-related issues. Future studies should incorporate detailed rainfall and temperature data into the prediction model, which would support contextual accuracy and spatial upscaling. In light of climate change issues, especially in sub-Saharan Africa, an analysis of this link between yield estimates/predictions and household food security becomes particularly crucial and interesting.
We proved that remote sensing can be used as a proxy for harvest yield assessments at small spatial scales. While this approach was developed specifically for household plot-level analysis, it can be aggregated to the village or larger administrative units for further research. The findings of the study can benefit a range of stakeholders, such as by supporting the agricultural self-sufficiency of subsistence farmers through assisting local farmer cooperatives, by promoting governmental efforts towards the sustainable development goal 2—zero hunger—and by assisting institutions working on food security promotion as well as food emergency response. We recommend additional applied and participatory research in order to establish sound recommendations for the improvement of cropland management and food self-sufficiency, as well as to increase the willingness of local farmers to adopt new strategies that may mitigate climate impacts.

Supplementary Materials

The following are available online at, Document: Field Assessment Protocol.

Author Contributions

Conceptualization, R.S. (Rainer Sauerborn) and J.F.; methodology (remote sensing), J.F., K.-J.S. and I.G.K.; methodology (survey), I.M., R.S. (Rainer Sauerborn), and I.T.; validation, I.G.K., and I.M.; investigation, J.F., I.G.K., I.T., R.S. (Raissa Sorgho) and I.M.; resources, A.S., I.T. and S.S.; data curation, K.-J.S. and I.G.K.; field data collection, I.M., I.T., S.S. and R.S. (Raissa Sorgho); writing—original draft preparation, I.G.K., I.M., J.F. and R.S. (Rainer Sauerborn); writing—review and editing, I.T., S.S., R.S. (Raissa Sorgho), K.-J.S. and A.S.; visualization, I.G.K.; project administration, R.S. (Rainer Sauerborn), J.F., A.S. and I.M.; funding acquisition, J.F. and R.S. (Rainer Sauerborn). I.G.K. and I.M. share the first authorship, and J.F. and R.S. (Rainer Sauerborn) share the last authorship, mirroring the interdisciplinary and shared contribution to this research. All authors have read and approved of the submitted version. All authors agree to be personally accountable for their contributions and for ensuring that questions related to the accuracy or integrity of any part of the work, even ones in which they were not personally involved, are appropriately investigated, resolved, and documented in the literature. All authors have read and agreed to the published version of the manuscript.


This research was funded by own funds from the Heidelberg Institute of Global Health (HIGH) and RSS—Remote Sensing Solutions GmbH, and the fieldwork was supported through a PhD fellowship provided to IM by the Baden-Württemberg Landesgraduiertenförderung Program (LGF).


We would like to express our gratitude to the people of the Nouna HDSS area for their time and support in conducting this research. We also thank the local Agricultural Service and its members for their input and support during and beyond the data collection.

Conflicts of Interest

The authors declare no conflict of interest.


  1. Myers, S.S.; Smith, M.R.; Guth, S.; Golden, C.D.; Vaitla, B.; Mueller, N.D.; Dangour, A.D.; Huybers, P. Climate Change and Global Food Systems: Potential Impacts on Food Security and Undernutrition. Ann. Rev. Public Health 2017, 38, 259–277. [Google Scholar] [CrossRef]
  2. Grolleaud, M. Post-harvest Losses: Discovering the full story: Overview of the Phenomenon of Losses During the Post-harvest System. Available online: (accessed on 31 January 2020).
  3. FAO; IFAD; UNICEF; WFP; WHO. The State of Food Security and Nutrition in the World 2019. Safeguarding against Economic Slowdowns and Downturns; FAO: Rome, Italy, 2019. [Google Scholar]
  4. Morton, J.F. The impact of climate change on smallholder and subsistence agriculture. Proc. Natl. Acad. Sci. USA 2007, 104, 19680–19685. [Google Scholar] [CrossRef][Green Version]
  5. Bégué, A.; Arvor, D.; Bellon, B.; Betbeder, J.; Abelleyra, D.D.; Ferraz, P.D.R.; Lebourgeois, V.; Lelong, C.; Simões, M.R.; Verón, S. Remote Sensing and Cropping Practices: A Review. Remote Sens. 2018, 10, 99. [Google Scholar] [CrossRef][Green Version]
  6. Waha, K.; Zipf, B.; Kurukulasuriya, P.; Hassan, R.M. An agricultural survey for more than 9500 African households. Sci. Data 2016, 3, 160020. [Google Scholar] [CrossRef] [PubMed][Green Version]
  7. Sorgho, R.; Franke, J.; Simboro, S.; Barteit, S.; Phalkey, R.; Sauerborn, R. Linking Weather Data, Satellite Imagery and Field Observations to Household Food Production and Child Undernutrition: An Exploratory Study in Burkina Faso. Univers. J. Public Health 2017, 5, 256–270. [Google Scholar] [CrossRef][Green Version]
  8. Longueville, F.D.; Hountondji, Y.-C.; Kindo, I.; Gemenne, F.; Ozer, P. Long-term analysis of rainfall and temperature data in Burkina Faso (1950-2013). Int. J. Climatol. 2016, 36, 4393–4405. [Google Scholar] [CrossRef]
  9. IPCC. Climate Change 2014: Synthesis Report. Contribution of Working Groups I, II and III to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change; IPCC: Geneva, Switzerland, 2014. [Google Scholar]
  10. Belesova, K.; Gornott, C.; Milner, J.; Sié, A.; Sauerborn, R.; Wilkinson, P. Mortality impact of low annual crop yields in a subsistence farming population of Burkina Faso under the current and a 1.5 °C warmer climate in 2100. Sci. Total Environ. 2019, 691, 538–548. [Google Scholar] [CrossRef]
  11. Hondula, D.M.; Rocklöv, J.; Sankoh, O.A. Past, present, and future climate at select INDEPTH member Health and Demographic Surveillance Systems in Africa and Asia. Glob. Health Action 2012, 5, 74–86. [Google Scholar] [CrossRef] [PubMed]
  12. Lobell, D.B.; Bänziger, M.; Magorokosho, C.; Vivek, B. Nonlinear heat effects on African maize as evidenced by historical yield trials. Nat. Clim. Chang. 2011, 1, 42–45. [Google Scholar] [CrossRef]
  13. Rowhani, P.; Lobell, D.B.; Linderman, M.; Ramankutty, N. Climate variability and crop production in Tanzania. Agric. For. Meteorol. 2011, 151, 449–460. [Google Scholar] [CrossRef]
  14. Kjellstrom, T.; Briggs, D.; Freyberg, C.; Lemke, B.; Otto, M.; Hyatt, O. Heat, Human Performance, and Occupational Health: A Key Issue for the Assessment of Global Climate Change Impacts. Ann. Rev. Public Health 2016, 37, 97–112. [Google Scholar] [CrossRef][Green Version]
  15. Sahu, S.; Sett, M.; Kjellstrom, T. Heat exposure, cardiovascular stress and work productivity in rice harvesters in India: Implications for a climate change future. Ind. Health 2013, 51, 424–431. [Google Scholar] [CrossRef][Green Version]
  16. Brown, M.E.; Grace, K.; Shively, G.; Johnson, K.B.; Carroll, M. Using satellite remote sensing and household survey data to assess human health and nutrition response to environmental change. Popul. Environ. 2014, 36, 48–72. [Google Scholar] [CrossRef][Green Version]
  17. Noromiarilanto, F.; Brinkmann, K.; Faramalala, M.H.; Buerkert, A. Assessment of food self-sufficiency in smallholder farming systems of south-western Madagascar using survey and remote sensing data. Agric. Syst. 2016, 149, 139–149. [Google Scholar] [CrossRef]
  18. Grace, K.; Brown, M.; McNally, A. Examining the link between food prices and food insecurity: A multi-level analysis of maize price and birthweight in Kenya. Food Policy 2014, 46, 56–65. [Google Scholar] [CrossRef]
  19. Phalkey, R.K.; Aranda-Jan, C.; Marx, S.; Höfle, B.; Sauerborn, R. Systematic review of current efforts to quantify the impacts of climate change on undernutrition. Proc. Natl. Acad. Sci. USA 2015, 112, E4522–E4529. [Google Scholar] [CrossRef][Green Version]
  20. Mank, I.; Vandormael, A.; Traoré, I.; Ouédraogo, A.; Sauerborn, R.; Danquah, I. Dietary habits associated with growth development among children aged <5 years in the Nouna Health and Demographic Surveillance System, Burkina Faso. Nutr. J. (under review).
  21. Beiersmann, C.; Bountogo, M.; Tiendrébeogo, J.; Louis, V.R.; Gabrysch, S.; Yé, M.; Jahn, A.; Müller, O. Malnutrition in young children of rural Burkina Faso: Comparison of survey data from 1999 with 2009. Trop. Med. Int. Health 2012, 17, 715–721. [Google Scholar] [CrossRef]
  22. Belesova, K.; Gasparrini, A.; Sié, A.; Sauerborn, R.; Wilkinson, P. Annual Crop-Yield Variation, Child Survival, and Nutrition Among Subsistence Farmers in Burkina Faso. Am. J. Epidemiol. 2018, 187, 242–250. [Google Scholar] [CrossRef]
  23. Zheng, G.; Moskal, L.M. Retrieving Leaf Area Index (LAI) Using Remote Sensing: Theories, Methods and Sensors. Sensors (Basel) 2009, 9, 2719–2745. [Google Scholar] [CrossRef][Green Version]
  24. Tilling, A.K.; O’Leary, G.J.; Ferwerda, J.G.; Jones, S.D.; Fitzgerald, G.J.; Rodriguez, D.; Belford, R. Remote sensing of nitrogen and water stress in wheat. Field Crops Res. 2007, 104, 77–85. [Google Scholar] [CrossRef]
  25. Lu, D. The potential and challenge of remote sensing-based biomass estimation. Int. J. Remote Sens. 2006, 27, 1297–1328. [Google Scholar] [CrossRef]
  26. Voss, K.; Franke, J.; Mewes, T.; Menz, G.; Kühbauch, W. Remote Sensing for Precision Crop Protection—A Matter of Scale. In Precision Crop Protection—The Challenge and Use of Heterogeneity; Oerke, E.-C., Gerhards, R., Menz, G., Sikora, R.A., Eds.; Springer: Dordrecht, The Netherlands, 2010; pp. 101–118. ISBN 978-90-481-9276-2. [Google Scholar]
  27. Groten, S.M.E. NDVI—Crop monitoring and early yield assessment of Burkina Faso. Int. J. Remote Sens. 1993, 14, 1495–1515. [Google Scholar] [CrossRef]
  28. Doraiswamy, P.C.; Moulin, S.; Cook, P.W.; Stern, A. Crop Yield Assessment from Remote Sensing. Photogramm. Eng. Remote Sens. 2003, 69, 665–674. [Google Scholar] [CrossRef]
  29. Bolton, D.K.; Friedl, M.A. Forecasting crop yield using remotely sensed vegetation indices and crop phenology metrics. Agric. For. Meteorol. 2013, 173, 74–84. [Google Scholar] [CrossRef]
  30. Huang, J.; Gómez-Dans, J.L.; Huang, H.; Ma, H.; Wu, Q.; Lewis, P.E.; Liang, S.; Chen, Z.; Xue, J.-H.; Wu, Y.; et al. Assimilation of remote sensing into crop growth models: Current status and perspectives. Agric.For. Meteorol. 2019, 276–277, 107609. [Google Scholar] [CrossRef]
  31. Gessner, U.; Machwitz, M.; Esch, T.; Tillack, A.; Naeimi, V.; Kuenzer, C.; Dech, S. Multi-sensor mapping of West African land cover using MODIS, ASAR and TanDEM-X/TerraSAR-X data. Remote Sens. Environ. 2015, 164, 282–297. [Google Scholar] [CrossRef]
  32. Knauer, K.; Gessner, U.; Fensholt, R.; Forkuor, G.; Kuenzer, C. Monitoring Agricultural Expansion in Burkina Faso over 14 Years with 30 m Resolution Time Series: The Role of Population Growth and Implications for the Environment. Remote Sens. 2017, 9, 132. [Google Scholar] [CrossRef][Green Version]
  33. Jin, Z.; Azzari, G.; Burke, M.; Aston, S.; Lobell, D. Mapping Smallholder Yield Heterogeneity at Multiple Scales in Eastern Africa. Remote Sens. 2017, 9, 931. [Google Scholar] [CrossRef][Green Version]
  34. Burke, M.; Lobell, D.B. Satellite-based assessment of yield variation and its determinants in smallholder African systems. Proc. Natl. Acad. Sci. USA 2017, 114, 2189–2194. [Google Scholar] [CrossRef][Green Version]
  35. Jain, M.; Srivastava, A.; Balwinder, S.; Joon, R.; McDonald, A.; Royal, K.; Lisaius, M.; Lobell, D. Mapping Smallholder Wheat Yields and Sowing Dates Using Micro-Satellite Data. Remote Sens. 2016, 8, 860. [Google Scholar] [CrossRef][Green Version]
  36. Lambert, M.-J.; Traoré, P.C.S.; Blaes, X.; Baret, P.; Defourny, P. Estimating smallholder crops production at village level from Sentinel-2 time series in Mali’s cotton belt. Remote Sens. Environ. 2018, 216, 647–657. [Google Scholar] [CrossRef]
  37. Meroni, M.; Marinho, E.; Sghaier, N.; Verstrate, M.; Leo, O. Remote Sensing Based Yield Estimation in a Stochastic Framework—Case Study of Durum Wheat in Tunisia. Remote Sens. 2013, 5, 539–557. [Google Scholar] [CrossRef][Green Version]
  38. Mkhabela, M.S.; Bullock, P.; Raj, S.; Wang, S.; Yang, Y. Crop yield forecasting on the Canadian Prairies using MODIS NDVI data. Agric. For. Meteorol. 2011, 151, 385–393. [Google Scholar] [CrossRef]
  39. Schwalbert, R.A.; Amado, T.J.C.; Nieto, L.; Varela, S.; Corassa, G.M.; Horbe, T.A.N.; Rice, C.W.; Peralta, N.R.; Ciampitti, I.A. Forecasting maize yield at field scale based on high-resolution satellite imagery. Biosyst. Eng. 2018, 171, 179–192. [Google Scholar] [CrossRef]
  40. Skakun, S.; Franch, B.; Vermote, E.; Roger, J.-C.; Justice, C.; Masek, J.; Murphy, E. Winter Wheat Yield Assessment Using Landsat 8 and Sentinel-2 Data. In IGARSS 2018—2018 IEEE International Geoscience and Remote Sens. Symposium; IEEE: Valencia, Spain, 2018; pp. 5964–5967. ISBN 978-1-5386-7150-4. [Google Scholar]
  41. Lambert, M.-J.; Blaes, X.; Traore, P.S.; Defourny, P. Estimate yield at parcel level from S2 time serie in sub-Saharan smallholder farming systems. IEEE 2017, 1–7. [Google Scholar] [CrossRef]
  42. Shively, G.; Sununtnasuk, C.; Brown, M. Environmental variability and child growth in Nepal. Health Place 2015, 35, 37–51. [Google Scholar] [CrossRef] [PubMed]
  43. Johnson, K.; Brown, M.E. Environmental risk factors and child nutritional status and survival in a context of climate variability and change. Appl. Geogr. 2014, 54, 209–221. [Google Scholar] [CrossRef]
  44. Imran, M.; Stein, A.; Zurita-Milla, R. Investigating rural poverty and marginality in Burkina Faso using remote sensing-based products. Int. J. Appl. Earth Obs. Geoinf. 2014, 26, 322–334. [Google Scholar] [CrossRef]
  45. Belesova, K.; Gasparrini, A.; Sié, A.; Sauerborn, R.; Wilkinson, P. Household cereal crop harvest and children’s nutritional status in rural Burkina Faso. Environ. Health 2017, 16, 65. [Google Scholar] [CrossRef][Green Version]
  46. Dixon, S.; Holt, J. Livelihood Zoning and Profiling Report: Burkina Faso. A Special Report by the Famine Early Warning Systems Network (FEWS NET); USAID: Washington, DC, USA, 2010.
  47. Sarr, B.; Atta, S.; Ly, M.; Salack, S.; David, G.A.; Ourback, T.; Subsol, S. Adapting to climate variability and change in smallholder farming communities: A case study from Burkina Faso, Chad and Niger. J. Agric. Ext. Rural Dev. 2015, 7, 16–27. [Google Scholar] [CrossRef]
  48. Direction de la Prospective et des Statistiques Agricoles et Alimentaires (DPSAA). Rapport General du Module Pluvial. Phase 2°; Ministère de l’Agriculture et de l’Hydraulique: Ouagadougou, Burkina Faso, 2011. [Google Scholar]
  49. Fermont, A.; Benson, T. Estimating Yield of Food Crops Grown by Smallholder Farmers. A Review in the Uganda Context; IFPRI: Washington, DC, USA; Kampala, Uganda, 2011; p. 01097. [Google Scholar]
  50. ESA. Sentinel-2-Missions. 2020. Available online: (accessed on 13 January 2020).
  51. Guerini Filho, M.; Kuplich, T.M.; Quadros, F.L.F.D. Estimating natural grassland biomass by vegetation indices using Sentinel 2 remote sensing data. Int. J. Remote Sens. 2020, 41, 2861–2876. [Google Scholar] [CrossRef]
  52. Kross, A.; McNairn, H.; Lapen, D.; Sunohara, M.; Champagne, C. Assessment of RapidEye vegetation indices for estimation of leaf area index and biomass in corn and soybean crops. Int. J. Appl. Earth Obs. Geoinf. 2015, 34, 235–248. [Google Scholar] [CrossRef][Green Version]
  53. Gao, B.-C. NDWI—A normalized difference water index for remote sensing of vegetation liquid water from space. Remote Sens. of Environ. 1996, 58, 257–266. [Google Scholar] [CrossRef]
  54. Sharma, L.K.; Bu, H.; Denton, A.; Franzen, D.W. Active-Optical Sensors Using Red NDVI Compared to Red Edge NDVI for Prediction of Corn Grain Yield in North Dakota, U.S.A. Sensors (Basel) 2015, 15, 27832–27853. [Google Scholar] [CrossRef]
  55. Rouse, J.; Haas, R.; Schell, J.; Deering, D. Monitoring Vegetation Systems in the Great Plains with ERTS. In Proceedings of the Third Earth Resources Technology Satellite-1 Symposium, Washington, DC, USA, 10–14 December 1973. [Google Scholar]
  56. Barnes, E.; Clarke, T.R.; Richards, S.E.; Colaizzi, P.; Haberland, J.; Kostrzewski, M.; Waller, P.; Choi, C.; Riley, E.; Thompson, T.L. Coincident detection of crop water stress, nitrogen status, and canopy density using ground based multispectral data. In Proceedings of the Fifth International Conference on Precision Agriculture, Bloomington, MN, USA, 16–19 July 2000. [Google Scholar]
  57. Bastiaanssen, W.G.M.; Ali, S. A new crop yield forecasting model based on satellite measurements applied across the Indus Basin, Pakistan. Agric. Ecosyst. Environ. 2003, 94, 321–340. [Google Scholar] [CrossRef]
  58. DeGroot, M.H.; Schervish, M.J. Probability and Statistics; Pearson New International, Ed.; Pearson: Harlow, Essex, UK, 2014; ISBN 1292025042. [Google Scholar]
  59. Chakravarti, I.M.; Laha, R.G.; Roy, J. Handbook of Methods of Applied Statistics; John Wiley & Sons: Hoboken, NJ, USA, 1967; Volume I, pp. 392–394. [Google Scholar]
  60. Brownlee, J. How to Transform Data to Better Fit the Normal Distribution. 2019. Available online: (accessed on 13 February 2020).
  61. Ziegel, E.R.; Rawlings, J.O.; Pantula, S.G.; Dickey, D.A. Applied Regression Analysis: A Research Tool. Technometrics 1999, 41, 82. [Google Scholar] [CrossRef]
  62. Cohen, I.; Huang, Y.; Chen, J.; Benesty, J. Noise Reduction in Speech Processing; Springer: Berlin/Heidelberg, Germany, 2009; ISBN 978-3-642-00295-3. [Google Scholar]
  63. Dormann, C.F.; Elith, J.; Bacher, S.; Buchmann, C.; Carl, G.; Carré, G.; Marquéz, J.R.G.; Gruber, B.; Lafourcade, B.; Leitão, P.J.; et al. Collinearity: A review of methods to deal with it and a simulation study evaluating their performance. Ecography 2013, 36, 27–46. [Google Scholar] [CrossRef]
  64. Belsley, D.A.; Kuh, K.; Welsch, R.E. Regression diagnostics: Identifying influential data and sources of collinearity. J. Appl. Econ. 1989, 4, 97–99. [Google Scholar] [CrossRef]
  65. Belsley, D.A. Conditioning Diagnostics. Collinearity and Weak Data in Regression; Wiley: New York, NY, USA, 1991; ISBN 0471528897. [Google Scholar]
  66. Douglass, D.H.; Clader, B.D.; Christy, J.R.; Michaels, P.J.; Belsley, D.A. Test for harmful collinearity among predictor variables used in modeling global temperature. Clim. Res. 2003, 24, 15–18. [Google Scholar] [CrossRef]
  67. James, G.; Witten, D.; Hastie, T.; Tibshirani, R. An Introduction to Statistical Learning. With Applications in R; Corrected at 8th printing edition; Springer: New York, NY, USA, 2017; ISBN 9781461471387. [Google Scholar]
  68. Abdul-Wahab, S.A.; Bakheit, C.S.; Al-Alawi, S.M. Principal component and multiple regression analysis in modelling of ground-level ozone and factors affecting its concentrations. Environ. Model. Softw. 2005, 20, 1263–1271. [Google Scholar] [CrossRef]
  69. Carfagna, E.; Gallego, F.J. Using Remote Sensing for Agricultural Statistics. Int. Stat. Rev. 2005, 73, 389–404. [Google Scholar] [CrossRef][Green Version]
  70. Rasmussen, M.S. Developing simple, operational, consistent NDVI-vegetation models by applying environmental and climatic information. Part II: Crop yield assessment. Int. J. Remote Sens. 1998, 19, 119–139. [Google Scholar] [CrossRef]
  71. Cohen, J. Statistical Power Analysis for the Behavioral Sciences, 2nd ed.; Erlbaum: Hillsdale, NJ, USA, 1988; ISBN 1299630081. [Google Scholar]
  72. Morel, J.; Todoroff, P.; Bégué, A.; Bury, A.; Martiné, J.-F.; Petit, M. Toward a Satellite-Based System of Sugarcane Yield Estimation and Forecasting in Smallholder Farming Conditions: A Case Study on Reunion Island. Remote Sens. 2014, 6, 6620–6635. [Google Scholar] [CrossRef][Green Version]
  73. Bauer, J.M.; Mburu, S. Effects of drought on child health in Marsabit District, Northern Kenya. Econ. Hum. Biol. 2017, 24, 74–79. [Google Scholar] [CrossRef] [PubMed]
  74. Sié, A.; Louis, V.R.; Gbangou, A.; Müller, O.; Niamba, L.; Stieglbauer, G.; Yé, M.; Kouyaté, B.; Sauerborn, R.; Becher, H. The Health and Demographic Surveillance System (HDSS) in Nouna, Burkina Faso, 1993–2007. Glob. Health Action 2010, 3. [Google Scholar] [CrossRef][Green Version]
  75. Rembold, F.; Atzberger, C.; Savin, I.; Rojas, O. Using Low Resolution Satellite Imagery for Yield Prediction and Yield Anomaly Detection. Remote Sens. 2013, 5, 1704–1733. [Google Scholar] [CrossRef][Green Version]
  76. Nelson, A.D.; Rogers, D.J.; Robinson, T.P. Poverty Mapping in Uganda. Extrapolating Household Expenditure Data Using Environmental Data and Regression Techniques; Food and Agriculture Organization of the United Nations (FAO): Rome, Italy, 2012; ISBN 9251072116. [Google Scholar]
Figure 1. Study area overview with cutouts A, B, and C of surveyed fields and individual households (Environmental Systems Research Institute (ESRI) Basemaps).
Figure 1. Study area overview with cutouts A, B, and C of surveyed fields and individual households (Environmental Systems Research Institute (ESRI) Basemaps).
Remotesensing 12 01717 g001
Figure 2. Exemplary annual normalized difference vegetation index (NDVI) development for all surveyed fields per crop type. NDVI values on the y-axis and respective months (starting with March) on the x-axis.
Figure 2. Exemplary annual normalized difference vegetation index (NDVI) development for all surveyed fields per crop type. NDVI values on the y-axis and respective months (starting with March) on the x-axis.
Remotesensing 12 01717 g002
Figure 3. Subset of the study area with modelled yield amounts per crop-type (kg·25 m²) and the total amount of yield per field. Locator shows Kossi province in Burkina Faso. Basemap: ESRI.
Figure 3. Subset of the study area with modelled yield amounts per crop-type (kg·25 m²) and the total amount of yield per field. Locator shows Kossi province in Burkina Faso. Basemap: ESRI.
Remotesensing 12 01717 g003
Table 1. Multiple linear regression (MLR) model outputs and model validation parameters.
Table 1. Multiple linear regression (MLR) model outputs and model validation parameters.
Predictors Used
(from Avail)
Predictor Months July, August, OctoberJune, AugustMarch, July, September, NovemberMay, July, September, NovemberSeptember
Correlation Coefficient (R)0.740.680.660.630.35
Coefficient of Determination (R²)0.540.460.440.400.12
Adjusted R²0.500.400.400.320.10
RMSE (kg·25 m²)0.942.381.241.521.13
Harvest Value Range (kg·25 m²)6.313.38.794.6
Model Significance0.00010.00040.0000.00110.0477
Table 2. MLR model equations per crop type.
Table 2. MLR model equations per crop type.
Crop TypeModel Equation
Beans−5.145 + 3.348 × NDVIJul + 6.019 × NDVIAug + 9515 × NDWIOct
Maize11.666 − 21.440 × NDVIJun + 43.144 × NDWIJun + 9.830 × NDVIAug
Sorghum−0.8368 − 12.3063 × NDWIMar + 5.4679 × NDWIJul + 5.6015 × NDVISep − 4.1599 × NDWINov
Millet5.260 + 26.152 × NDVIMay + 8.951 × NDWIMay + 5.528 × NDWIJul − 3.599 × NDVISep − 8.258 × NDVINov
Table 3. Household-specific yield quantification for three exemplary households (A, B, and C).
Table 3. Household-specific yield quantification for three exemplary households (A, B, and C).
No. Fields113111113131
Est. total yield (kg)4541678104176502277711731665102112881167
Mean yield (kg 25 m²)

Share and Cite

MDPI and ACS Style

Karst, I.G.; Mank, I.; Traoré, I.; Sorgho, R.; Stückemann, K.-J.; Simboro, S.; Sié, A.; Franke, J.; Sauerborn, R. Estimating Yields of Household Fields in Rural Subsistence Farming Systems to Study Food Security in Burkina Faso. Remote Sens. 2020, 12, 1717.

AMA Style

Karst IG, Mank I, Traoré I, Sorgho R, Stückemann K-J, Simboro S, Sié A, Franke J, Sauerborn R. Estimating Yields of Household Fields in Rural Subsistence Farming Systems to Study Food Security in Burkina Faso. Remote Sensing. 2020; 12(11):1717.

Chicago/Turabian Style

Karst, Isabel G., Isabel Mank, Issouf Traoré, Raissa Sorgho, Kim-Jana Stückemann, Séraphin Simboro, Ali Sié, Jonas Franke, and Rainer Sauerborn. 2020. "Estimating Yields of Household Fields in Rural Subsistence Farming Systems to Study Food Security in Burkina Faso" Remote Sensing 12, no. 11: 1717.

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop