Downscaling Regional Crop Yields to Local Scale Using Remote Sensing

: Local-scale crop yield datasets are not readily available in most of the developing world. Local-scale crop yield datasets are of great use for risk transfer and risk management in agriculture. In this article, we present a simple method for disaggregation of district-level production statistics over crop pixels by using a remote sensing approach. We also quantiﬁed the error in the disaggregated statistics to ascertain its usefulness for crop insurance purposes. The methodology development was attempted in Parbhani district of Maharashtra state with wheat and sorghum crops in the winter season. The methodology uses the ratio of Enhanced Vegetation Index (EVI) of pixel to total EVI of the crop pixels in that district corresponding to the growth phase of the crop. It resulted in the generation of crop yield maps at the 500 m resolution pixel (grid) level. The methodology was repeated to generate time-series maps of crop yield. In general, there was a good correspondence between disaggregated crop yield and sub-district level crop yields with a correlation coe ﬃ cient of 0.9.


Introduction
Crop production statistics are vital to agricultural planning and policy design and play an important role in the planning and allocation of inputs for the development of the agricultural sector. In an agrarian country, like India, where agriculture is still the main source of livelihood, crop production statistics are even more important. In India, crop area statistics and crop yield statistics are reported using different methods based on the state institutional mechanism. The crop production statistics are being collected at different scales. These statistics are often available at the district scale and to some extent at the sub-district scale. The district/sub-district scale is probably good for the planning of agricultural inputs where the majority of these transactions happen, and also for setting up the minimum support price and for national/international selling/procurement decisions. Recently, the crop insurance schemes were launched in India at the village scale and it required village level time-series of yields [1]. The major challenge in implementation is the absence of historical yield series at the village level. In order to undertake effective implementation of a crop insurance scheme with creditability, there is a need to produce a consistent long time-series of historical crop yield data at the local scale (such as village), which may be useful for fixing key insurance metrics like threshold yield and indemnity level. There have been only some piecemeal studies in India on the use of technologies for yield estimation and insurance contract design. The evidence base of these technologies, under diverse agro-ecological regions of the country, is also limited to support implementation in crop insurance

Study District and Crops
The study was undertaken for the Parbhani district ( Figure 1) of Maharashtra State of India. The Parbhani district is spread over ~650,000 ha with 9 sub-districts (tehsils/taluks) and 848 villages (https://parbhani.gov.in/village_list/ accessed on 25 January 2020). The Parbhani district is slightly larger in terms of area compared to the average Indian district area of ~450,000 ha. This district is in a hot semi-arid ecological region with an average annual rainfall of 830 mm (http://maharain.gov.in/ accessed on 25 January 2020). In Parbhani, cotton, sorghum, soybean, and green gram are the key crops in the rainy season (kharif) and wheat and sorghum in the winter season. In this study, we attempted the disaggregation of winter (rabi) crops of Jowar (sorghum) and wheat. The district level crop statistics for the Parbhani district for 1999-2000 to 2010-2011 period were downloaded from the website of Department of Agriculture and Cooperation, Ministry of Agriculture and Farmers' Welfare, Govt. of India (https://aps.dac.gov.in/APY/Index.htm accessed on 25 January 2020). Owing to an overlap of the agricultural calendar (June to May in next year) and the crop area, production statistics are often referred to with the current and next calendar year (e.g., [2001][2002] for agriculture year starting in 2001). The sub-district level (tehsil/taluk) level crop area and yield statistics were collected from the Office of Commissionerate of Agriculture, Maharashtra State, Pune. The subdistrict level statistics were used for cross validating the accuracy/error of disaggregation of district level crop statistics.

Satellite Datasets
In order to meet the requirement of generating a long time-series of yield statistics, we used the Terra-MODIS (Moderate Resolution Imaging Spectrometer) derived 500 m Vegetation Indices (VIs) image product for the 2000-2001 to 2013-2014 period (14 years) at a frequency of 16 days in this study. A total of 322 VI image products pertaining to the study area (MODIS Granule h25v7) were downloaded through the USGS Global Visualization Viewer (http://glovis.usgs.gov/). MODIS VI Image Product: Global MOD13A1 (version 5) image products are generated and provided every 16 days at a 500-m spatial resolution as a gridded level-3 product in the Sinusoidal projection. Each file contains four reflectance band images (blue, red, near infrared, and mid infrared) and two vegetation indices images (NDVI and EVI). Global MODIS vegetation indices are designed to provide consistent

Satellite Datasets
In order to meet the requirement of generating a long time-series of yield statistics, we used the Terra-MODIS (Moderate Resolution Imaging Spectrometer) derived 500 m Vegetation Indices (VIs) image product for the 2000-2001 to 2013-2014 period (14 years) at a frequency of 16 days in this study. A total of 322 VI image products pertaining to the study area (MODIS Granule h25v7) were downloaded through the USGS Global Visualization Viewer (http://glovis.usgs.gov/). MODIS VI Image Product: Global MOD13A1 (version 5) image products are generated and provided every 16 days at a 500-m spatial resolution as a gridded level-3 product in the Sinusoidal projection. Each file contains four reflectance band images (blue, red, near infrared, and mid infrared) and two vegetation indices images (NDVI and EVI). Global MODIS vegetation indices are designed to provide consistent spatial and temporal comparisons of vegetation conditions. Blue, red, and near-infrared reflectance, centred Agriculture 2020, 10, 58 4 of 14 at 469, 645, and 858-nanometers, respectively, are used to determine the MODIS daily vegetation indices. The daily vegetation index images are composited for 16 days to produce a maximum value composite (MVC) VI image for the 16-day period. This study also used the Enhanced Vegetation Index (EVI) times-series images for classification of crop types and disaggregation of crop yields. EVI is an 'optimised' index designed to enhance the vegetation signal with improved sensitivity in high biomass regions. EVI is more responsive to canopy structural variations, including leaf area index (LAI), canopy type, plant physiognomy, and canopy architecture [31]. It thus supports improved vegetation monitoring through a de-coupling of the canopy background signal and a reduction in atmosphere influences.

Methodology
The overall methodology followed in this study is depicted as a flow chart in Figure 2. The steps shown in the first column are those related to the acquisition and pre-processing of satellite data. The steps shown in the second column are those related to the identification of pixels of the crop of interest in a season in the study districts. The steps shown in the third column are those related to finding a weighting criterion using satellite derived index to distribute district crop statistics over the identified crop pixels. Then, the distributed crop statistics are re-aggregated at the sub-district, i.e., the tehsil/taluk level, and compared with the reported values for an error analysis of the disaggregated values. Based on this methodology, a time-series of disaggregated yield statistics were generated for the 2000-2001 to 2013-2014 period. The methodology is described for these three columns in subsequent sections.  [32]. It thus supports improved vegetation monitoring through a de-coupling of the canopy background signal and a reduction in atmosphere influences.

Methodology
The overall methodology followed in this study is depicted as a flow chart in Figure 2. The steps shown in the first column are those related to the acquisition and pre-processing of satellite data. The steps shown in the second column are those related to the identification of pixels of the crop of interest in a season in the study districts. The steps shown in the third column are those related to finding a weighting criterion using satellite derived index to distribute district crop statistics over the identified crop pixels. Then, the distributed crop statistics are re-aggregated at the sub-district, i.e., the tehsil/taluk level, and compared with the reported values for an error analysis of the disaggregated values. Based on this methodology, a time-series of disaggregated yield statistics were generated for the 2000-2001 to 2013-2014 period. The methodology is described for these three columns in subsequent sections. The overall methodology flowchart showing the steps followed in this study. The steps on the left describe data pre-processing, the middle column describes crop mapping, and the column on the right describes yield downscaling.

Pre-Processing of Satellite Data
The pre-processing of time-series of MODIS image granules was carried out in ENVI™ + IDL™ V4.8 Image Processing software. Each MODIS Granule in HDF-EOS format was imported into native The overall methodology flowchart showing the steps followed in this study. The steps on the left describe data pre-processing, the middle column describes crop mapping, and the column on the right describes yield downscaling.

Pre-Processing of Satellite Data
The pre-processing of time-series of MODIS image granules was carried out in ENVI™ + IDL™ V4.8 Image Processing software. Each MODIS Granule in HDF-EOS format was imported into native binary format of Image Processing Software, and among the scientific datasets present in the granule, only the EVI image dataset was extracted. All the 322 extracted EVI images in original sinusoidal Agriculture 2020, 10, 58 5 of 14 projection were stacked together in a time-series starting from Julian day 129 of 2000 (9-May-2000) to Julian day 113 of 2014 (23-April-2014). The Julian day of an EVI image refers to the starting day of the 16-day period. The stacked images were reprojected from Sinusoidal projection to Lambert Conformal Conic (LCC) projection with WGS-84 datum and a pixel size of 500 m using nearest neighbour resampling method. The stacked EVI images were subsetted, corresponding to the geographical extent of the study district. The original image stores scaled EVI values in integer. By applying the scale factor for recalibration, the images were converted into true EVI values in float.

EVI Time-Series Filtering
Although the 16-day maximum value compositing of EVI is implemented to minimise the noise due to cloud contamination, it has been reported that the VI signal is still contaminated by significant residual noise due to atmospheric variability and bi-directional effects [32,33]. Such residual noise is more prominent under Indian conditions, especially during the rainy (kharif) season, when cloudy conditions persist for months together. These noises greatly undermine the basic assumption that the changes in VI are related to changes in vegetation conditions, thus affecting the monitoring of land cover and terrestrial ecosystems [34,35]. For this reason, a number of methods for reducing noise and constructing high-quality VI time-series data sets for further analysis have been formulated, applied, and evaluated.
Among the different methods proposed in literature, we adopted an algorithm-based Savitzky-Golay filter, to efficiently reduce the contamination in the VI time-series that is caused primarily by cloud contamination and atmospheric variability. The algorithm is mainly based on the approach proposed by Chen et al., 2004. This method is based on two assumptions: (1) that the VI data from a satellite sensor is primarily related to vegetation changes and follows annual cycle of growth and decline; and (2) that clouds and poor atmospheric conditions mostly reduce VI values, requiring that sudden drops in VI, which are not compatible with the gradual process of vegetation change, be regarded as noise and removed. This algorithm makes the data approach the upper VI envelope and to best fit the VI variations over the season through an iterative process. At the heart of the algorithm is the Savitzky-Golay filter, which is a weighted moving average filter with weighting given as a polynomial of a certain degree (3rd degree in this case). The weight coefficients, when applied to a signal, perform a polynomial least squares fit within the filter window. This polynomial is designed to preserve higher moments within the data and to reduce the bias introduced by the filter. We cross-checked the filter (Savitzky-Golay) output for inconsistencies on a sample basis. Please see Figure 3, where we highlight the raw and filtered EVI profile over 14 years.
The blue coloured profile shows noises in terms of sudden dips in data, while the filtered red colour profile fits the outer EVI envelope, is smoothened, and preserves the periods of growth and decline well. Classification with sixteen classes was extracted and reprojected. The global land use layer was subsetted, corresponding to the bounds of the study area (Figure 4a). The grids with land use class of Cropland (class = 12) was retained and other land-use classes were masked out. The croplands mask was overlaid with a study area boundary to generate the cropland mask of the study area. The croplands land use mask for the study area is shown in Figure 4b.

Crop Area Identification
Crop area estimation methodology and its efficacy depends on various operational factors, such as land configuration, field shape, crop type, cropping pattern, available skills, and resources [36]. Here, we used the Hierarchical Decision Rule method for crop area identification based on the growth profile of EVI. The scientific rationale for separability of different landuse/crop classes using hierarchical decision rules is based on using the a-priori knowledge of the temporal spectral growth profiles of the various land covers including crops [37]. MODIS Vegetation Image Product was used in the study, and the EVI values ranged from −2000 to +10,000 with a scale factor of 0.0001. In the first step, the pixels, which were masked out during filtering, were labelled as "Non-crop". In the next step, the current fallow pixels with an EVI below 0.2 during November, as well as Early February periods, were identified. The current fallow represents the agricultural land where sowing has not been done in the current season. Then, the pixels with wheat crop were identified based on the decision that the EVI profile shows a peak anywhere between January and February. Using the decision rule classification above, the winter (rabi) season crops were classified for each crop season of this study.

Land Cover Image
The global MODIS/Terra + Aqua Land Cover Type (MCD12Q1 product for h25v7 granule) for the year 2012 period, available from http://reverb.echo.nasa.gov/ at a resolution of 500 m in sinusoidal projection and HDF-EOS format, was downloaded. The first layer in the IGBP Land Cover Type Classification with sixteen classes was extracted and reprojected. The global land use layer was subsetted, corresponding to the bounds of the study area (Figure 4a). The grids with land use class of Cropland (class = 12) was retained and other land-use classes were masked out. The croplands Agriculture 2020, 10, 58 7 of 14 mask was overlaid with a study area boundary to generate the cropland mask of the study area. The croplands land use mask for the study area is shown in Figure 4b.

Disaggregation of District Yield Statistics
Weighing criterion: The basic premise we adopted in this study to disaggregate district statistics over the crop pixels of interest was that the integrated EVI for the growth period of the crop is directly related to the crop yield. The higher the value of integrated EVI, the higher the crop yield will be. In order to scale this relation, a weighting factor (F) was calculated as the ratio of "Crop pixel integrated EVI" to the "Total integrated EVI over all the Crop pixels in the district". (1) where, EVIpixel is the integrated EVI value of the pixel of the crop of interest, EVIdistrict is the sum of EVIpixel over all the crop pixels in that district, i.e., (2) where, the total number of pixels of the crop of interest in the district is n. So, an F value map for crop pixel was calculated.
As the EVIdistrict was an integrated value, we correspondingly disaggregated district wise crop production statistics over the crop pixels, instead of district yield statistics. For a crop pixel, the F value was multiplied by the district production statistics and divided by the pixel area (i.e., 25 ha), to arrive at the yield value of the crop at the pixel level in kg/ha.
where, Pdistrict is the district crop production in (kg), and A is the area of one pixel in (ha).

Crop Mapping
The decision tree approach was allied to generate crop maps for each year. Figure 5

Disaggregation of District Yield Statistics
Weighing criterion: The basic premise we adopted in this study to disaggregate district statistics over the crop pixels of interest was that the integrated EVI for the growth period of the crop is directly related to the crop yield. The higher the value of integrated EVI, the higher the crop yield will be. In order to scale this relation, a weighting factor (F) was calculated as the ratio of "Crop pixel integrated EVI" to the "Total integrated EVI over all the Crop pixels in the district".
where, EVI pixel is the integrated EVI value of the pixel of the crop of interest, EVI district is the sum of EVI pixel over all the crop pixels in that district, i.e., where, the total number of pixels of the crop of interest in the district is n. So, an F value map for crop pixel was calculated. As the EVI district was an integrated value, we correspondingly disaggregated district wise crop production statistics over the crop pixels, instead of district yield statistics. For a crop pixel, the F value was multiplied by the district production statistics and divided by the pixel area (i.e., 25 ha), to arrive at the yield value of the crop at the pixel level in kg/ha.
where, P district is the district crop production in (kg), and A is the area of one pixel in (ha).

Crop Mapping
The decision tree approach was allied to generate crop maps for each year. Figure  In order to verify the classification accuracy, the wheat and rabi sorghum crop area was aggregated at the sub-district (tehsil/taluk) level and compared with the statistics reported by the State Department of Agriculture. The comparison for the year 2010-2011 is shown in Figure 6. The error in the remote sensing (RS) estimated area of wheat as compared to values reported by the State Department of Agriculture (SDA) varied between a −17.6% underestimation for the Parbhani subdistrict (tehsil/taluk) to a 12.0% overestimation in the Selu sub-district (tehsil/taluk). Overall, at the district level, the error was a 3.6% underestimation. The error in remote sensing estimated that the rabi sorghum area at the sub-district (tehsil/taluk) level varied between −11.0% in the Manwat subdistrict (tehsil/taluk) to +20% in the Pathri sub-district (tehsil/taluk), with an overall 1.0% overestimation error at the district level.  In order to verify the classification accuracy, the wheat and rabi sorghum crop area was aggregated at the sub-district (tehsil/taluk) level and compared with the statistics reported by the State Department of Agriculture. The comparison for the year 2010-2011 is shown in Figure 6. The error in the remote sensing (RS) estimated area of wheat as compared to values reported by the State Department of Agriculture (SDA) varied between a −17.6% underestimation for the Parbhani sub-district (tehsil/taluk) to a 12.0% overestimation in the Selu sub-district (tehsil/taluk). Overall, at the district level, the error was a 3.6% underestimation. The error in remote sensing estimated that the rabi sorghum area at the sub-district (tehsil/taluk) level varied between −11.0% in the Manwat sub-district (tehsil/taluk) to +20% in the Pathri sub-district (tehsil/taluk), with an overall 1.0% overestimation error at the district level. error in the remote sensing (RS) estimated area of wheat as compared to values reported by the State Department of Agriculture (SDA) varied between a −17.6% underestimation for the Parbhani subdistrict (tehsil/taluk) to a 12.0% overestimation in the Selu sub-district (tehsil/taluk). Overall, at the district level, the error was a 3.6% underestimation. The error in remote sensing estimated that the rabi sorghum area at the sub-district (tehsil/taluk) level varied between −11.0% in the Manwat subdistrict (tehsil/taluk) to +20% in the Pathri sub-district (tehsil/taluk), with an overall 1.0% overestimation error at the district level.

Disaggregated Crop Yield Map
Using the disaggregation criterion, the rabi sorghum and wheat yield maps were generated for the crop pixels for each crop year of the study. For example, Figure 7 shows the rabi sorghum disaggregated yield map. For the disaggregation, the district production statistics for 2010-2011 were used, and the EVI was integrated at pixels for the period between 3-January-2011 and 30-January-2011. The pixel level yield generated varied between 694 and 1447 kg/ha for the year 2010-2011, and the EVI was integrated at pixels for the period between 1-January-2011 and 2-February-2011. The pixel level wheat yield generated by disaggregation varied between 900 and 2400 kg/ha for the year 2010-2011. To validate the disaggregation methodology and analyse the extent of error, the disaggregated pixel level yield values were aggregated to the sub-district (tehsil/taluk) level and then compared with the values reported by the State Department of Agriculture (SDA) at the sub-district (tehsil/taluk) level. The comparison of estimated and reported rabi sorghum and wheat yield is shown in Figure 8. In general, there was good correspondence between the two with a correlation coefficient of 0.9 for both crops. The methodology underestimated the sorghum yield in sub-districts (tehsils/taluks) having higher yield values, while it overestimated the yield in sub-districts (tehsils/taluks) with lower yield values.

Disaggregated Crop Yield Map
Using the disaggregation criterion, the rabi sorghum and wheat yield maps were generated for the crop pixels for each crop year of the study. For example, Figure 7 shows the rabi sorghum disaggregated yield map. For the disaggregation, the district production statistics for 2010-2011 were used, and the EVI was integrated at pixels for the period between 3-January-2011 and 30-January-2011. The pixel level yield generated varied between 694 and 1447 kg/ha for the year 2010-2011, and the EVI was integrated at pixels for the period between 1-January-2011 and 2-February-2011. The pixel level wheat yield generated by disaggregation varied between 900 and 2400 kg/ha for the year 2010-2011. To validate the disaggregation methodology and analyse the extent of error, the disaggregated pixel level yield values were aggregated to the sub-district (tehsil/taluk) level and then compared with the values reported by the State Department of Agriculture (SDA) at the sub-district (tehsil/taluk) level. The comparison of estimated and reported rabi sorghum and wheat yield is shown in Figure 8. In general, there was good correspondence between the two with a correlation coefficient of 0.9 for both crops. The methodology underestimated the sorghum yield in sub-districts (tehsils/taluks) having higher yield values, while it overestimated the yield in sub-districts (tehsils/taluks) with lower yield values. At the sub-district (tehsil/taluk) level, the error between the estimated and reported sorghum yield varied between −22.0% in the Selu sub-district (tehsil/taluk) to +15.8% in the Pathri sub-district (tehsil/taluk) (Figure 8). Overall, the average root mean square error was 167.87 kg/ha, which is about 15.4% of the average district yield. For wheat, at the sub-district (tehsil/taluk) level, the error between the estimated and reported yield varied between −16.0% in the Selu sub-district (tehsil/taluk) to +9.0% in the Manwat sub-district (tehsil/taluk). Overall, the average root mean square error was 178.9 kg/ha,

Discussion
Use of remote sensing data for disaggregating existing crop production statistics to the local scale can be highly useful to agricultural planning and implementation of risk management strategies. The remote sensing-based methods provide quick crop yield estimates (downscaled or inseason) covering a vast geographical area. The in-season estimates can be achieved by minor modifications in the methodology described here. A few minor modifications would make this method a predictor model; possible modifications would be the use of vegetation indices at a crop growth stage where yield-vegetation and crop identification are highly correlated.
High resolution yield surface can add substantial value to risk management and planning process in agriculture. These yield surfaces can also give insight to commodity markets and agricultural input planning.
With the space data revolution and availability of more and more datasets with increased spatial and temporal coverages, the accuracy has been substantially increased while costs have come down significantly. Still, there are challenges in obtaining estimates in the rainy season because of cloud cover.
Use of disaggregated crop yields in insurance schemes: Globally, crop insurance is the preferred option for risk transfer. The demand for improved insurance products is increasing. Several countries now operate insurance schemes at the village level; however; unavailability of crop yield and area statistics at this level remains the key bottleneck in implementing or designing such schemes. In India, governments provide a huge subsidy on crop insurance premiums, and the new crop insurance scheme in India is being implemented with the village as the insurance unit [1,39]. The scheme parameters, like threshold yield and loss assessment, are calculated on an area-yield-based approach. Under the previous yield index insurance schemes, the insurance unit was typically at the sub-district level. In the new scheme, the insurance product design and price is based on historical yield values at the village level. A major challenge in implementing a new scheme at the village level is the absence of historical yield series. So, in order to undertake effective implementation of crop insurance schemes with creditability, there is a need to produce a consistent long time-series of historical crop At the sub-district (tehsil/taluk) level, the error between the estimated and reported sorghum yield varied between −22.0% in the Selu sub-district (tehsil/taluk) to +15.8% in the Pathri sub-district (tehsil/taluk) ( Figure 8). Overall, the average root mean square error was 167.87 kg/ha, which is about 15.4% of the average district yield. For wheat, at the sub-district (tehsil/taluk) level, the error between the estimated and reported yield varied between −16.0% in the Selu sub-district (tehsil/taluk) to +9.0% in the Manwat sub-district (tehsil/taluk). Overall, the average root mean square error was 178.9 kg/ha, which is about 10.5% of the average district yield.

Discussion
Use of remote sensing data for disaggregating existing crop production statistics to the local scale can be highly useful to agricultural planning and implementation of risk management strategies. The remote sensing-based methods provide quick crop yield estimates (downscaled or in-season) covering a vast geographical area. The in-season estimates can be achieved by minor modifications in the methodology described here. A few minor modifications would make this method a predictor model; possible modifications would be the use of vegetation indices at a crop growth stage where yield-vegetation and crop identification are highly correlated.
High resolution yield surface can add substantial value to risk management and planning process in agriculture. These yield surfaces can also give insight to commodity markets and agricultural input planning.
With the space data revolution and availability of more and more datasets with increased spatial and temporal coverages, the accuracy has been substantially increased while costs have come down significantly. Still, there are challenges in obtaining estimates in the rainy season because of cloud cover.
Use of disaggregated crop yields in insurance schemes: Globally, crop insurance is the preferred option for risk transfer. The demand for improved insurance products is increasing. Several countries now operate insurance schemes at the village level; however; unavailability of crop yield and area statistics at this level remains the key bottleneck in implementing or designing such schemes. In India, governments provide a huge subsidy on crop insurance premiums, and the new crop insurance scheme in India is being implemented with the village as the insurance unit [1,38]. The scheme parameters, like threshold yield and loss assessment, are calculated on an area-yield-based approach. Under the previous yield index insurance schemes, the insurance unit was typically at the sub-district level. In the new scheme, the insurance product design and price is based on historical yield values at the village level. A major challenge in implementing a new scheme at the village level is the absence of historical yield series. So, in order to undertake effective implementation of crop insurance schemes with creditability, there is a need to produce a consistent long time-series of historical crop yield data at the local scale (such as village), which may be useful for fixing indemnity limits, threshold yields, premium rates at the village panchayat, or other equivalent units. This methodology is the first step towards getting a historical time-series of yield at the village level using district-level production data and remote sensing techniques. Figure 9 shows coefficient of variation of sorghum yield from 1999-2000 to 2013-2014.
Agriculture 2020, 10, x FOR PEER REVIEW 11 of 14 yield data at the local scale (such as village), which may be useful for fixing indemnity limits, threshold yields, premium rates at the village panchayat, or other equivalent units. This methodology is the first step towards getting a historical time-series of yield at the village level using district-level production data and remote sensing techniques. Figure 9 shows coefficient of variation of sorghum yield from 1999-2000 to 2013-2014. Limitations of the study: This study presented a simple method for downscaling the district level statistics to the sub-district or village or pixel level using a remote sensing approach. Overall, the average root mean square error was 167.87 kg/ha and 178.9 kg/ha, which is about 15.4% and 10.5% of the average district yield for sorghum and wheat, respectively. The error can be attributed to several sources, such as the accuracy of the generated crop mask/decision tree approach and to the data resolution issues considering prevalent small holding farming in the region. Although, the study also validates the approach with the sub-district level using reported statistics. We understand that the study was done over a small geographic area and lacks the validation with the ground truth measurements of crop yield and area. It is likely that the performance of this approach might not be similar for other crops, especially in the rainy season where the availability of quality remote sensing Limitations of the study: This study presented a simple method for downscaling the district level statistics to the sub-district or village or pixel level using a remote sensing approach. Overall, the average root mean square error was 167.87 kg/ha and 178.9 kg/ha, which is about 15.4% and 10.5% of the average district yield for sorghum and wheat, respectively. The error can be attributed to several sources, such as the accuracy of the generated crop mask/decision tree approach and to the data resolution issues considering prevalent small holding farming in the region. Although, the study also validates the approach with the sub-district level using reported statistics. We understand that the study was done over a small geographic area and lacks the validation with the ground truth measurements of crop yield and area. It is likely that the performance of this approach might not be similar for other crops, especially in the rainy season where the availability of quality remote sensing data is low.

Conclusions
The study was undertaken to develop and validate a methodology of disaggregation of district level crop yield statistics at the pixel level (i.e., local scale) using satellite remote sensing images. The study procured and pre-processed 500 m MODIS EVI times-series images for 2000-2001 to 2013-2014 crop years for the Parbhani district of Maharashtra State in India. The Savitzky-Golay based filtering technique was adopted in this study for the removal of residual noise in EVI time-series images. A hierarchical decision rule-based image classification technique was adopted for identification and mapping of crop pixels of interest. In the absence of specific ground-truth, the crop growth profile derived phenology events were the basis for fixing the rules during classification.
A simple method was developed and applied for disaggregation of crop district production statistics over crop pixels by using the ratio of EVI of pixel to total EVI of the crop pixels in that district corresponding to the growth phase of crops. It resulted in the generation of crop yield maps at the 500 m resolution pixel (grid) level. The methodology was repeated for all the 14 years to generate time-series of crop yield map. In general, there was good correspondence between disaggregated crop yield and tehsil crop yields with a correlation coefficient of 0.9 for rabi crops in Parbhani (wheat and sorghum). At the sub-district (tehsil/taluk) level, the error between estimated and reported sorghum yield varied between −22.0% in the Selu sub-district (tehsil/taluk) to +15.8% in the Pathri sub-district (tehsil/taluk), and for wheat, −16.0% in the Selu sub-district (tehsil/taluk) to +9.0% in the Manwat sub-district (tehsil/taluk). Overall, the average root mean square error was 178.9 kg/ha, which is about 10.5% of the average district wheat yield. The local scale yield estimates are of immense use in agricultural planning and implementation of risk management strategies. Future research should be targeted at improving these estimates in the rainy season where cloudy conditions pose challenges.
Author Contributions: P.B.S.; V.K.S. and P.K.A. designed the research, V.K.S. and P.B.S. developed methodology and software and analysed the results; V.K.S. and P.B.S. wrote the first draft of the article, which was improved with substantial input from P.K.A. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by CGIAR Fund Donors and through bilateral funding agreements.