Validation of Automatically Generated Global and Regional Cropland Data Sets: The Case of Tanzania

There is a need to validate existing global cropland maps since they are used for different purposes including agricultural monitoring and assessment. In this paper we validate three recent global products (ESA-CCI, GlobeLand30, FROM-GC) and one regional product (Tanzania Land Cover 2010 Scheme II) using a validation data set that was collected by students through the Geo-Wiki tool. The ultimate aim was to understand the usefulness of these products for agricultural monitoring. Data were collected wall-to-wall for Kilosa district and for a sample across Tanzania. The results show that the amount and spatial extent of cropland differ considerably between the products, ranging from 8% to 42% for Tanzania, with similar values for Kilosa district. The agreement of the validation data with the four products varied between 36% and 54% and highlighted that cropland is overestimated by ESA-CCI and underestimated by FROM-GC. The validation data were also analyzed for consistency between the student interpreters and compared with a sample interpreted by five experts for quality assurance. Regarding consistency between the students, there was more than 80% agreement if a difference of one cropland category was tolerated (e.g., between low and medium cropland), while most of the confusion with the experts was also within one category difference. In addition to the validation of current cropland products, the data set collected by the students also has potential value as a training set for improving future cropland products.


Introduction
To ensure global food security, cropland is regularly monitored by initiatives such as GEOGLAM (Group on Earth Observation's Global Agricultural Monitoring) [1], CropWatch [2], and the MARS (Monitoring Agricultural Resources) unit of the Joint Research Centre of the European Commission, among others [3]. Agricultural monitoring relies heavily on the use of Earth Observation, e.g., from yield and production estimation to identification of cropping patterns [4], which requires baseline information on cropland as a key input [5]. Many different products now exist and must be validated in order to understand whether they are fit for purpose for agricultural monitoring. In the past, medium to coarse resolution imagery from sensors such as AVHRR, SPOT-VGT, MERIS, and MODIS has been used extensively to map land cover, e.g., [6][7][8], and cropland, e.g., [1,9,10]. These products are mostly generated using a top-down approach, employing automated or semi-automated classification techniques and a training data sample collected from field data, interpretation of satellite or aerial imagery, or both. However, when these products are compared, there are often large spatial disagreements between them, particularly in the cropland class [11], which has led to the production
In total, the students visually interpreted 25,943 VHR satellite images spanning the period 2005 to 2014. The distribution of images across the year by percentage is shown in Figure 1. A total of 15,383 images covered the Kilosa district wall-to-wall while the rest were randomly distributed across the rest of Tanzania (Figure 2). The campaign used a customized offline branch of the Geo-Wiki application, as shown in Figure 3, to visually interpret 1 km × 1 km VHR images. These RGB images, comprising a mosaic of mainly WorldView-2 imagery, were provided by Digital Globe through their viewing service, which is a Web Map Service (WMS).
A resolution of 1 km was chosen because enough detail is visible in the imagery to distinguish between cropland and non-cropland. Although 300 m would also have been possible, this would have increased the sample size by almost 10 times without much additional detail becoming visible. Hence 1 km is a compromise between detail and efficiency in data collection. Data collection at 30 m would have required too many samples.
Remote Sens. 2017, 9,815 4 of 23
In Figure 3, the image displayed on the left of the screen shows the area to be classified; below the image are the image date and a scale (in meters and feet) to help determine the field size. Each user entered a personal unique identification name or code. The images were organized into work packages, and the number of images per work package was indicated at the bottom so that progress could be monitored. For each image, the students were asked to determine the:
Cropland is defined according to the FAO definition of arable and permanent crops, including fallow land of less than five years [31]. The thresholds between low, medium, and high cropland and woody extent were roughly set at one-third (between low and medium) and two-thirds (between medium and high) to aid in visual interpretation.
If the image was difficult to classify or if there were clouds present, they could then skip the grid square and move on to the next one.
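The one-third and two-thirds thresholds described above can be sketched as a small lookup. This is only an illustration: the function name `extent_category`, the treatment of exactly zero cover as "none", and the handling of values that fall exactly on a threshold are assumptions rather than details given in the study, where the students assigned categories visually and not from measured fractions.

```python
def extent_category(fraction):
    """Map a cover fraction in [0, 1] to a coarse extent category.

    Assumed boundaries: 0 is "none", below one-third is "low",
    below two-thirds is "medium", everything else is "high".
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    if fraction == 0.0:
        return "none"
    if fraction < 1.0 / 3.0:
        return "low"
    if fraction < 2.0 / 3.0:
        return "medium"
    return "high"


# Hypothetical cover fractions, not data from the campaign.
for value in (0.0, 0.2, 0.5, 0.9):
    print(value, extent_category(value))
```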
The campaign was completed when each grid square was classified by each student, and where possible, at least twice, allowing the majority of images to have more than one evaluation by different participants in order to cross check the outputs. The result was a wall-to-wall cropland and woodland extent map for Kilosa. All the data sets used in the study are shown in Table 1.
To determine the variation across students in the validation data set, three metrics were used for comparison, i.e., the minimum, the median, and the maximum cropland values across all students for the same image. Table 2 shows averages per image for the lowest, mean, and highest values for cropland and woodland extent (in percent) from 22,190 images across Tanzania where more than one student evaluated the image. Moreover, for any given image, the average standard deviation between students was around 15% for cropland extent and 14% for woody extent.
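The per-image variation metrics described for the student evaluations (minimum, median, maximum, and standard deviation, averaged over all images with more than one evaluation) could be computed along the following lines. This is a minimal sketch: the function names, the dictionary layout, and the choice of the population standard deviation are assumptions, and the example ratings are made up rather than taken from Table 2.

```python
from statistics import mean, median, pstdev


def per_image_summary(ratings):
    """Summarise the extent values (in percent) given by several students to one image."""
    if len(ratings) < 2:
        raise ValueError("at least two evaluations are needed")
    return {
        "min": min(ratings),
        "median": median(ratings),
        "max": max(ratings),
        "sd": pstdev(ratings),  # population SD; a sample SD is an equally plausible choice
    }


def campaign_averages(ratings_by_image):
    """Average each metric over all images evaluated by more than one student."""
    summaries = [
        per_image_summary(ratings)
        for ratings in ratings_by_image.values()
        if len(ratings) >= 2
    ]
    return {key: mean(s[key] for s in summaries) for key in ("min", "median", "max", "sd")}


# Hypothetical ratings keyed by image identifier.
example = {"img_a": [0, 50], "img_b": [50, 100], "img_c": [20]}
print(campaign_averages(example))
```

Images with a single evaluation (here `img_c`) are skipped, mirroring the restriction to images evaluated by more than one student.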
Prior to comparison, some pre-processing of the data sets was required. The first step involved reclassifying the four land cover products so that the corresponding cropland classes were set to 1 and all other classes were set to 0. Table 3 shows the different cropland classes in the four data sets, their corresponding definitions and LCCS (Land Cover Classification System) labels, codes and levels. Additionally, Table 4 provides the LCCS labels and aggregated classes for each of the data sets used in the study for comparison purposes. There were two cropland classes in the ESA-CCI, one cultivated land class in GlobeLand30, four in FROM-GC and two in the RCMRD data set. Note that the mosaic classes in ESA-CCI were not considered.
* LCCS labels, codes and levels were taken from the ESA CCI Manual [32] except for FROM-GC, which were taken from Gong et al. [28]. † Greenhouse agriculture is hard to define under the LCCS system [28]. The class Crop-Greenhouse is included in this analysis as the overall crop proportion in the FROM-GC data set is very low.
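The reclassification step (cropland classes set to 1, all other classes set to 0) can be illustrated with plain Python lists. The helper `reclassify_to_binary` and the class codes below are hypothetical placeholders; the actual codes for each product should be taken from Table 3 and the product legends.

```python
def reclassify_to_binary(grid, cropland_codes):
    """Return a grid of the same shape with 1 for cropland classes and 0 otherwise."""
    codes = set(cropland_codes)
    return [[1 if value in codes else 0 for value in row] for row in grid]


# Illustrative class codes only (assumed, not copied from the study's Table 3).
EXAMPLE_CROPLAND_CODES = {10, 20}

# A tiny made-up land cover tile; every code not listed above becomes 0.
tile = [
    [10, 30, 50],
    [20, 10, 210],
]
binary = reclassify_to_binary(tile, EXAMPLE_CROPLAND_CODES)
print(binary)
```

Leaving a code out of the set is how a class (for example a mosaic class) would be excluded from the cropland mask in this sketch.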
The validation data set produced by the students contains four possible classes, i.e., 1 = no cropland, 2 = low cropland, 3 = medium cropland, and 4 = high cropland. These values were linearly transformed into values between 0 and 1.
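The linear transformation of the four validation classes onto the range 0 to 1 is not spelled out in the text. One plausible reading, sketched below, maps class 1 to 0 and class 4 to 1 with equal steps in between; the function names and this particular mapping are assumptions.

```python
def class_to_proportion(cls, lowest=1, highest=4):
    """Linearly rescale a class code in [lowest, highest] to a value in [0, 1]."""
    if not lowest <= cls <= highest:
        raise ValueError("class code outside the expected range")
    return (cls - lowest) / (highest - lowest)


def mean_proportion(classes):
    """Average the rescaled values of several student evaluations of one image."""
    values = [class_to_proportion(c) for c in classes]
    return sum(values) / len(values)


# Hypothetical evaluations of one image by two students: low and medium cropland.
print(mean_proportion([2, 3]))
```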
The information from each of these four products was aggregated to match the grid of the validation data set using ArcGIS zonal statistics to produce cropland proportions in each 1 km grid cell. The values from these four products were then extracted for each of the 25,943 1 km × 1 km images interpreted by the students. From this sample, 22,190 had at least two interpretations from different students and hence averages could be calculated across all evaluations. The resulting histograms are shown in Figure 4. Both the validation data and the ESA-CCI show a range of cropland proportions while the other data sets are heavily skewed towards lower proportions of cropland.
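The aggregation of a binary cropland mask to cropland proportions per coarser grid cell can be sketched as a block mean. This is not the ArcGIS zonal statistics tool used in the study; it is a simplified stand-in that assumes the fine grid is perfectly aligned with the coarse grid, that each coarse cell covers exactly `block` × `block` fine pixels, and that there are no missing values. The function name and example grid are hypothetical.

```python
def cropland_proportion(binary_grid, block):
    """Average a 0/1 grid over non-overlapping block x block windows."""
    rows = len(binary_grid)
    cols = len(binary_grid[0])
    if rows % block or cols % block:
        raise ValueError("grid dimensions must be multiples of the block size")
    result = []
    for r0 in range(0, rows, block):
        out_row = []
        for c0 in range(0, cols, block):
            # Count the cropland pixels inside the current window.
            total = sum(
                binary_grid[r][c]
                for r in range(r0, r0 + block)
                for c in range(c0, c0 + block)
            )
            out_row.append(total / (block * block))
        result.append(out_row)
    return result


# Made-up 4 x 4 mask aggregated to a 2 x 2 grid of proportions.
mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
print(cropland_proportion(mask, 2))
```

In practice the products have different native resolutions and projections, so a real workflow would need resampling and NoData handling that this sketch leaves out.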
Table 4. LCCS labels and their correspondence in each of the data sets used in the study. A check mark (✓) indicates correspondence with the LCCS label while a cross (✘) denotes absence.
Columns: LCCS-Label; Aggregated Class; ESA-CCI; GlobeLand30; FROM-GC; RCMRD.
The validation data set produced by the students contains four possible classes, i.e., 1 = no cropland, 2 = low cropland, 3 = medium cropland, and 4 = high cropland. These values were linearly transformed into values between 0 and 1.
The information from each of these four products was aggregated to match the grid of the validation data set using ArcGIS zonal statistics to produce cropland proportions in each 1 km grid cell. The values from these four products were then extracted for each of the 25,943 1 × 1 km images interpreted by the students. From this sample, 22,190 had at least two interpretations from two different students and hence averages could be calculated across all evaluations. The resulting histograms are shown in Figure 4. Both the validation data and the ESA-CCI show a range of cropland proportions while the other data sets are heavily skewed towards lower proportions of cropland. * LCCS labels, codes and levels were taken from the ESA CCI Manual [32] except for FROM-GC, which were taken from Gong et al. [28]. † Greenhouse agriculture is hard to define under the LCCS system [28]. The class Crop-Greenhouse is included in this analysis as the overall crop proportion in the FROM-GC data set is very low. Table 4. LCCS labels and their correspondence in each of the data sets used in the study. A check mark (✓) indicates correspondence with the LCCS label while an (✘) denotes absence.

LCCS-Label
Aggregated Class

ESA-CCI GlobeLand30 FROM-GC RCMRD
The validation data set produced by the students contains four possible classes, i.e., 1 = no cropland, 2 = low cropland, 3 = medium cropland, and 4 = high cropland. These values were linearly transformed into values between 0 and 1.
The information from each of these four products was aggregated to match the grid of the validation data set using ArcGIS zonal statistics to produce cropland proportions in each 1 km grid cell. The values from these four products were then extracted for each of the 25,943 1 × 1 km images interpreted by the students. From this sample, 22,190 had at least two interpretations from two different students and hence averages could be calculated across all evaluations. The resulting histograms are shown in Figure 4. Both the validation data and the ESA-CCI show a range of cropland proportions while the other data sets are heavily skewed towards lower proportions of cropland. * LCCS labels, codes and levels were taken from the ESA CCI Manual [32] except for FROM-GC, which were taken from Gong et al. [28]. † Greenhouse agriculture is hard to define under the LCCS system [28]. The class Crop-Greenhouse is included in this analysis as the overall crop proportion in the FROM-GC data set is very low.
The validation data set produced by the students contains four possible classes, i.e., 1 = no cropland, 2 = low cropland, 3 = medium cropland, and 4 = high cropland. These values were linearly transformed into values between 0 and 1.
The information from each of these four products was aggregated to match the grid of the validation data set using ArcGIS zonal statistics to produce cropland proportions in each 1 km grid cell. The values from these four products were then extracted for each of the 25,943 1 × 1 km images interpreted by the students. From this sample, 22,190 had at least two interpretations from two different students and hence averages could be calculated across all evaluations. The resulting histograms are shown in Figure 4. Both the validation data and the ESA-CCI show a range of cropland proportions while the other data sets are heavily skewed towards lower proportions of cropland. * LCCS labels, codes and levels were taken from the ESA CCI Manual [32] except for FROMwhich were taken from Gong et al. [28]. † Greenhouse agriculture is hard to define under the LC system [28]. The class Crop-Greenhouse is included in this analysis as the overall crop proportio the FROM-GC data set is very low. Table 4. LCCS labels and their correspondence in each of the data sets used in the study. A ch mark (✓) indicates correspondence with the LCCS label while an (✘) denotes absence.

LCCS-Label
The validation data set produced by the students contains four possible classes, i.e. cropland, 2 = low cropland, 3 = medium cropland, and 4 = high cropland. These values were transformed into values between 0 and 1.
The information from each of these four products was aggregated to match the gri validation data set using ArcGIS zonal statistics to produce cropland proportions in each 1 cell. The values from these four products were then extracted for each of the 25,943 1 × 1 km interpreted by the students. From this sample, 22,190 had at least two interpretations fr different students and hence averages could be calculated across all evaluations. The r histograms are shown in Figure 4. Both the validation data and the ESA-CCI show a r cropland proportions while the other data sets are heavily skewed towards lower propor cropland.  [32] which were taken from Gong et al. [28]. † Greenhouse agriculture is hard to system [28]. The class Crop-Greenhouse is included in this analysis as the ove the FROM-GC data set is very low. The validation data set produced by the students contains four po cropland, 2 = low cropland, 3 = medium cropland, and 4 = high cropland. T transformed into values between 0 and 1.
The information from each of these four products was aggregated validation data set using ArcGIS zonal statistics to produce cropland prop cell. The values from these four products were then extracted for each of t interpreted by the students. From this sample, 22,190 had at least two different students and hence averages could be calculated across all e histograms are shown in Figure 4. Both the validation data and the E cropland proportions while the other data sets are heavily skewed towa cropland. * LCCS labels, codes and levels were taken from the ESA CCI Manual [32] except for FROM-GC, which were taken from Gong et al. [28]. † Greenhouse agriculture is hard to define under the LCCS system [28]. The class Crop-Greenhouse is included in this analysis as the overall crop proportion in the FROM-GC data set is very low.
The validation data set produced by the students contains four possible classes, i.e., 1 = no cropland, 2 = low cropland, 3 = medium cropland, and 4 = high cropland. These values were linearly transformed into values between 0 and 1.
The information from each of these four products was aggregated to match the grid of the validation data set using ArcGIS zonal statistics to produce cropland proportions in each 1 km grid cell. The values from these four products were then extracted for each of the 25,943 1 × 1 km images interpreted by the students. From this sample, 22,190 had at least two interpretations from two different students and hence averages could be calculated across all evaluations. The resulting histograms are shown in Figure 4. Both the validation data and the ESA-CCI show a range of cropland proportions while the other data sets are heavily skewed towards lower proportions of cropland. * LCCS labels, codes and levels were taken from the ESA CCI Manual [32] except for FROM-GC, which were taken from Gong et al. [28]. † Greenhouse agriculture is hard to define under the LCCS system [28]. The class Crop-Greenhouse is included in this analysis as the overall crop proportion in the FROM-GC data set is very low. The validation data set produced by the students contains four possible classes, i.e., 1 = no cropland, 2 = low cropland, 3 = medium cropland, and 4 = high cropland. These values were linearly transformed into values between 0 and 1.
The information from each of these four products was aggregated to match the grid of the validation data set using ArcGIS zonal statistics to produce cropland proportions in each 1 km grid cell. The values from these four products were then extracted for each of the 25,943 1 × 1 km images interpreted by the students. From this sample, 22,190 had at least two interpretations from two different students and hence averages could be calculated across all evaluations. The resulting histograms are shown in Figure 4. Both the validation data and the ESA-CCI show a range of cropland proportions while the other data sets are heavily skewed towards lower proportions of cropland. * LCCS labels, codes and levels were taken from the ESA CCI Manual [32] except for FROM-GC, which were taken from Gong et al. [28]. † Greenhouse agriculture is hard to define under the LCCS system [28]. The class Crop-Greenhouse is included in this analysis as the overall crop proportion in the FROM-GC data set is very low. The validation data set produced by the students contains four possible classes, i.e., 1 = no cropland, 2 = low cropland, 3 = medium cropland, and 4 = high cropland. These values were linearly transformed into values between 0 and 1.
The information from each of these four products was aggregated to match the grid of the validation data set using ArcGIS zonal statistics to produce cropland proportions in each 1 km grid cell. The values from these four products were then extracted for each of the 25,943 1 × 1 km images interpreted by the students. From this sample, 22,190 had at least two interpretations from two different students and hence averages could be calculated across all evaluations. The resulting histograms are shown in Figure 4. Both the validation data and the ESA-CCI show a range of cropland proportions while the other data sets are heavily skewed towards lower proportions of cropland. * LCCS labels, codes and levels were taken from the ESA CCI Manual [32] except for FROMwhich were taken from Gong et al. [28]. † Greenhouse agriculture is hard to define under the LC system [28]. The class Crop-Greenhouse is included in this analysis as the overall crop proportio the FROM-GC data set is very low. The validation data set produced by the students contains four possible classes, i.e. cropland, 2 = low cropland, 3 = medium cropland, and 4 = high cropland. These values were transformed into values between 0 and 1.
The information from each of these four products was aggregated to match the gri validation data set using ArcGIS zonal statistics to produce cropland proportions in each 1 cell. The values from these four products were then extracted for each of the 25,943 1 × 1 km interpreted by the students. From this sample, 22,190 had at least two interpretations fr different students and hence averages could be calculated across all evaluations. The r histograms are shown in Figure 4. Both the validation data and the ESA-CCI show a r cropland proportions while the other data sets are heavily skewed towards lower propor cropland.  [32] which were taken from Gong et al. [28]. † Greenhouse agriculture is hard to system [28]. The class Crop-Greenhouse is included in this analysis as the ove the FROM-GC data set is very low. The validation data set produced by the students contains four po cropland, 2 = low cropland, 3 = medium cropland, and 4 = high cropland. T transformed into values between 0 and 1.
The information from each of these four products was aggregated validation data set using ArcGIS zonal statistics to produce cropland prop cell. The values from these four products were then extracted for each of t interpreted by the students. From this sample, 22,190 had at least two different students and hence averages could be calculated across all e histograms are shown in Figure 4. Both the validation data and the E cropland proportions while the other data sets are heavily skewed towa cropland. * LCCS labels, codes and levels were taken from the ESA CCI Manual [32] except for FROM-GC, which were taken from Gong et al. [28]. † Greenhouse agriculture is hard to define under the LCCS system [28]. The class Crop-Greenhouse is included in this analysis as the overall crop proportion in the FROM-GC data set is very low. The validation data set produced by the students contains four possible classes, i.e., 1 = no cropland, 2 = low cropland, 3 = medium cropland, and 4 = high cropland. These values were linearly transformed into values between 0 and 1.
The information from each of these four products was aggregated to match the grid of the validation data set using ArcGIS zonal statistics to produce cropland proportions in each 1 km grid cell. The values from these four products were then extracted for each of the 25,943 1 × 1 km images interpreted by the students. From this sample, 22,190 had at least two interpretations from two different students and hence averages could be calculated across all evaluations. The resulting histograms are shown in Figure 4. Both the validation data and the ESA-CCI show a range of cropland proportions while the other data sets are heavily skewed towards lower proportions of cropland. * LCCS labels, codes and levels were taken from the ESA CCI Manual [32] except for FROM-GC, which were taken from Gong et al. [28]. † Greenhouse agriculture is hard to define under the LCCS system [28]. The class Crop-Greenhouse is included in this analysis as the overall crop proportion in the FROM-GC data set is very low. The validation data set produced by the students contains four possible classes, i.e., 1 = no cropland, 2 = low cropland, 3 = medium cropland, and 4 = high cropland. These values were linearly transformed into values between 0 and 1.
Table 4. LCCS labels and their correspondence in each of the data sets used in the study. A check mark (✓) indicates correspondence with the LCCS label while an (✘) denotes absence.
Column headings: LCCS-Label, Aggregated Class, ESA-CCI, GlobeLand30, FROM-GC, RCMRD.
* LCCS labels, codes and levels were taken from the ESA CCI Manual [32] except for FROM-GC, which were taken from Gong et al. [28]. † Greenhouse agriculture is hard to define under the LCCS system [28]. The class Crop-Greenhouse is included in this analysis as the overall crop proportion in the FROM-GC data set is very low.

Statistical Comparison of Overall Cropland
A generalized linear model with a binomial distribution and a logit link with a Laplace estimation was used to compare the proportion of cropland generated by each data set. The SAS GLIMMIX procedure was employed to test the model in the Kilosa district and for all of Tanzania.

Assessment of the Validation Data Collected by the Students
The validation data are also subject to some uncertainties due to differences in interpretation between individuals. Disagreement between interpreters is reported and discussed, since for most of the images, more than one observer evaluated the amount of cropland present. The disagreement was calculated as the difference between the minimum and maximum values that an image was assigned by the students; 0 denotes no difference up to a maximum value of 3 for the highest level of disagreement.
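As an illustration of this disagreement measure, the sketch below computes the difference between the maximum and minimum class codes for each image and the share of images at each disagreement level. The function names and the example scenes are hypothetical.

```python
from collections import Counter

def disagreement(codes):
    """Range of class codes (1..4) given to one image: 0 (none) to 3 (maximum)."""
    return max(codes) - min(codes)

def disagreement_shares(scenes):
    """Share of multiply-interpreted scenes at each disagreement level 0..3."""
    levels = Counter(
        disagreement(codes) for codes in scenes.values() if len(codes) >= 2
    )
    total = sum(levels.values())
    return {level: levels[level] / total for level in range(4)}

# Hypothetical scenes: scene id -> class codes assigned by different students.
scenes = {
    "scene_a": [1, 1],
    "scene_b": [2, 3],
    "scene_c": [1, 4, 2],
    "scene_d": [3],  # single interpretation: skipped
    "scene_e": [4, 4],
}

shares = disagreement_shares(scenes)
# Share of scenes where students agree or differ by only one category.
within_one_level = shares[0] + shares[1]
print(shares, within_one_level)
```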

Spatial Comparison of Cropland
Confusion matrices were produced that compare the validation data set against the four other data sets in four different classes/categories. The four categories were chosen based on the average amount of cropland in each 1 × 1 km scene that was reported by each data set, where <10% cropland was classified as "no cropland", between 10% and 33% cropland was classified as "low" cropland, between 33% and 66% cropland was classified as "medium" cropland and higher than 66% was classified as "high" cropland. Note that 10% was used as a threshold for "no cropland" to reduce the noise that may occur due to the aggregation of the data sets to 1 km, particularly for data sets at a 30 m resolution.
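The categorization and the resulting confusion matrix can be sketched as below. This is only an illustration under stated assumptions: the handling of values that fall exactly on a threshold is not specified in the text, the overall accuracy computed here is the plain unweighted share of matching scenes (not the weighted accuracy reported in the tables), and all names and example values are hypothetical.

```python
LABELS = ("no", "low", "medium", "high")

def cropland_category(proportion):
    """Assign a cropland proportion (0..1) to one of the four categories.

    Thresholds of 10%, 33% and 66% follow the text; which class receives a
    value lying exactly on a boundary is an assumption.
    """
    if proportion < 0.10:
        return "no"
    if proportion < 0.33:
        return "low"
    if proportion <= 0.66:
        return "medium"
    return "high"

def confusion_matrix(reference, mapped):
    """Count scenes by (reference category, mapped category)."""
    matrix = {ref: {mp: 0 for mp in LABELS} for ref in LABELS}
    for ref_value, map_value in zip(reference, mapped):
        matrix[cropland_category(ref_value)][cropland_category(map_value)] += 1
    return matrix

def overall_accuracy(matrix):
    """Unweighted share of scenes on the diagonal of the matrix."""
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(matrix[label][label] for label in LABELS)
    return correct / total

# Hypothetical per-scene cropland proportions.
students = [0.0, 0.2, 0.5, 0.9, 0.05]
product = [0.05, 0.5, 0.5, 0.7, 0.4]

matrix = confusion_matrix(students, product)
print(matrix)
print(overall_accuracy(matrix))
```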
Additionally, maps showing agreement between the validation data set and the four other data sets were produced for the Kilosa district. For these maps, a threshold of 50% cropland in a given scene was used to differentiate between "cropland" and "no cropland". Finally, a map showing agreement between all data sets using the same definition of cropland/no cropland (50% threshold) was also produced. In all of these figures and tables, 22,190 frames were used to ensure that at least two students evaluated the same scene.
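The binary agreement used for these maps can be sketched as follows. The 50% threshold comes from the text; treating a value of exactly 50% as cropland, the function names, and the example scenes are assumptions made for illustration.

```python
def is_cropland(proportion, threshold=0.5):
    """Binary cropland flag; a value exactly at the threshold counts as cropland (assumed)."""
    return proportion >= threshold

def agreement_summary(scenes, threshold=0.5):
    """Return (share of scenes where all data sets agree,
    share of those agreeing scenes that are 'no cropland')."""
    agreeing = 0
    agreeing_no_cropland = 0
    for proportions in scenes.values():
        flags = {is_cropland(p, threshold) for p in proportions}
        if len(flags) == 1:  # every data set gives the same binary label
            agreeing += 1
            if flags == {False}:
                agreeing_no_cropland += 1
    share_agreeing = agreeing / len(scenes)
    share_no_cropland = agreeing_no_cropland / agreeing if agreeing else 0.0
    return share_agreeing, share_no_cropland

# Hypothetical scenes: cropland proportions from the validation data set
# followed by the four map products.
scenes = {
    "scene_a": [0.0, 0.1, 0.0, 0.0, 0.2],
    "scene_b": [0.8, 0.9, 0.6, 0.7, 0.55],
    "scene_c": [0.4, 0.9, 0.1, 0.0, 0.3],
    "scene_d": [0.1, 0.3, 0.0, 0.0, 0.0],
}

share_agreeing, share_no_cropland = agreement_summary(scenes)
print(share_agreeing, share_no_cropland)
```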

Validation of Disagreeing Areas in the Kilosa District
Following the methodology of Fritz et al. [12], a random sample of 60 scenes in the Kilosa district was selected where all the global/regional products disagreed. The sample is 1% of the total number of disagreeing scenes, and since the average cropland variance across all four data sets is 0.86, this sample is deemed to be representative for the whole district with a confidence interval of approximately 15%. Three members of IIASA staff and two members of UMD staff with a background in remote sensing or geospatial sciences and considerable experience in image classification were chosen as experts. Each expert classified the 60 scenes using the Geo-Wiki interface. The results were then compared against all of the global products and the validation data set collected by the students as a further independent verification of the different data sets.

Cropland Comparison across the Different Data Sets
The validation data set collected by the students is presented in the results as the average (median), minimum (min), and maximum (max) cropland as observed by the students. The comparison of cropland values from different sources in Tanzania and Kilosa follows the same order, i.e., ESA-CCI > Validation data set (max and median) > GlobeLand30 > RCMRD > Validation data set (min) > FROM-GC. The highest value shown by the ESA-CCI data set is around 44% cropland in Kilosa, with FROM-GC showing the lowest value also in Kilosa, ca. 2%. Figure 5 shows the adjusted medians for the generalized linear mixed models (Section 2.

The ESA-CCI data set shows 16% more cropland than the median of the Validation data set across Tanzania and is over 20% higher in Kilosa. The median of the Validation data set shows around 6% more cropland (ca. 26%) than GlobeLand30, and three times more cropland than the FROM-GC data set across Tanzania. These differences are similar for Kilosa but the median of the Validation data set is 11 times higher than FROM-GC.
For comparison, the FAO statistics have been added for Tanzania to Figure 5. The 2008 figure for area under cropland is based on data reported in an official country publication according to FAOSTAT but the figures for 2009 to 2014 are manual estimations from FAO. The closest match of these figures with the data sets is GlobeLand30 and RCMRD and they fall in between the minimum and the median estimates from the Validation data set.

The Disagreement within the Validation Data Set Collected by the Students
Since the large majority of scenes had at least two interpretations from different people, several statistics could be obtained with regards to cropland coverage, with some of these being shown in Figure 5. Additionally, Figure 6 shows the disagreement between students spatially for each scene covering the Kilosa district. From the total number of scenes, 48% show complete agreement, 33% disagree by one level of cropland category, e.g., one user mentions low cropland and the other says no cropland or medium cropland. Additionally, 15% disagreed by two levels of cropland category, and complete disagreement (i.e., across three levels) occurred 4% of the time.
Remote Sens. 2017, 9, 815

Figure 6. Cropland disagreement between the student interpreters for the Kilosa district, Tanzania.

Tables 5-8 show confusion matrices with levels of cropland as detected by the students compared to each of the other data sets across Tanzania whereas Tables 9-12 show the same information specifically for the Kilosa district.

Table 5. Confusion matrix showing number of images classified as no cropland (<10%), low (10-33.3%), medium (33.3-66.6%) and high (>66.6%) cropland by the students compared to the ESA-CCI data set across Tanzania (N = 22,190). A total of 3745 scenes have only one interpretation and are not compared. The classification uses the average values from each data set. Users and producers (weighted) accuracy with 95% confidence intervals are shown. The overall weighted accuracy is 0.46 ± 0.01.
Column headings (ESA-CCI): No, Low, Mid, High, User Acc.

Finally, Figure 11 shows where all of the sources agree or disagree in the Kilosa district, with 52% of the scenes agreeing; of those, 98% show no cropland. A frame is classified as cropland when it has an average of 50% cropland or more.

Figure 11. Overall agreement across all cropland data sets in the Kilosa district where a given scene is considered as cropland if it has an average of more than 50% cropland (N = 12,373). A total of 2160 scenes have only one student interpretation and are shown as blank spaces.

Expert Verification of the Disagreement
Out of a total of 9817 frames covering Tanzania (omitting Kilosa), 53% agree (5182 frames). From those, 92% are classified as no cropland. A similar proportion of agreeing frames is preserved in Kilosa, where 51% of frames (6355 frames, shown in green, Figure 11) agree; 98% of those have no cropland.
In the disagreeing areas in Kilosa (6018 frames, shown in red, Figure 11), the proportions of frames classified as cropland are as follows: ESA-CCI 82%, validation data from the students 39%, GlobeLand30 34%, RCMRD 31%, and FROM-GC 3%. With the exception of FROM-GC, which shows a much larger share of cropland (22%) in the rest of the country, these proportions are in line with the rest of the country, where the proportions of frames classified as cropland in disagreeing areas (4635 frames) are as follows: ESA-CCI 70%, validation data from the students 43%, GlobeLand30 38%, and RCMRD 34%.
A confusion table comparing the student validations and the expert verification of a subsample of 60 disagreeing areas is shown in Table 13. The cropland distribution in the 60 evaluated locations is shown in Figure 12. The overall agreement between the two groups is 0.60, which is lower than the agreement among the students, but this is to be expected since the disagreeing areas were harder to interpret. On a positive note, there was no confusion between the students and the experts in extreme cases, i.e., where the students said no cropland and the experts said high cropland, or vice versa. Most of the confusion occurs within one class, e.g., between medium cropland (students) and high cropland (experts), while there are a few examples of omission errors in identifying cropland by both the experts and the students.

Table 13. Confusion matrix showing the number of images classified as no cropland, low, medium, and high cropland by the students compared to the expert classification on 60 locations where disagreement between the products was high. User's and producer's accuracies with 95% confidence intervals are shown. The classification uses the average values from each data set. The overall accuracy is 0.60 ± 0.12.

Column headings (Expert Evaluations): No, Low, Mid, High, User Acc.

Discussion
The results show that there are considerable differences in the amount of cropland detected in the different data sets, with the most cropland found in the ESA-CCI product and the least in the FROM-GC product. As shown in Tables 3 and 4, all products have comparable cropland classes; only FROM-GC has some minor differences in the definition. In addition to including classes such as Crop-Bare and Crop-Greenhouse, FROM-GC does not include either rain-fed or irrigated tree crops. This might be one of the reasons why its total cropland is so low compared to the other products. The validation data collected by the students are close to the ESA-CCI product only when considering the maximum values of the student validations. Note that we have not even considered the mosaic cropland classes, which would raise the amount of cropland found in ESA-CCI even more. The ESA-CCI product is produced using five years of satellite imagery (2008 to 2012), so it actually matches the dates of the VHR imagery used in the validation, which vary between 2005 and 2014, better. However, the majority of the VHR imagery used was from 2010 and later. FAO figures for Tanzania show that cropland increased by around 3.7% from 2008 to 2010. This number is small compared to the size of the differences between the different cropland products but could result in some underestimation in the validation data collected by the students. Similarly, cropland increased after 2010 (by 16.7% between 2010 and 2012, remaining stable thereafter according to FAO statistics), so there may be some overestimation in the validation data collected by the students. Even so, the ESA-CCI shows more cropland than the validation data set, which itself may contain some overestimation as outlined above; hence the ESA-CCI clearly overestimates cropland.
A visual inspection of areas mapped as cropland in the ESA-CCI against VHR Google Earth imagery shows that these areas are mostly grassland or shrubland that has been misclassified as cropland, as shown in the examples in Figure 13.

The validation data set collected by the students is closest to the GlobeLand30 and RCMRD data sets when considering the median of the student interpretations. Both of these data sets have a considerable amount of manual checking built into their classification workflows, so they are not the product of automatic classification algorithms alone. Fully automated approaches have led to either overestimation (ESA-CCI) or underestimation (FROM-GC) of cropland. As shown by Yu et al. [15], FROM-GC underestimates cropland by around 37% compared to FAO statistics in the region that they refer to as Middle Africa, which would include Tanzania. The reasons that the authors provide for this underestimation include the difficulty of obtaining cloud-free images in tropical areas; small field sizes, which may lead to spectral mixing at 30 m resolution; and misclassification errors due to insufficient training samples, e.g., paddy rice being misclassified as water. The slight differences in cropland definitions highlighted above may also be contributing to this underestimation.
Comparing the data sets to FAO statistics for Tanzania, the closest data sets are GlobeLand30 and the RCMRD product, while the official statistics fall somewhere between the minimum and the median of the validation data set collected by the students. Looking at the spatial agreement between these two data sets and the validation data set collected by the students for Kilosa, the patterns are quite similar in that the main cropland areas are in agreement. The student validation data set omits cropland at the fringes of the existing cropland areas, while there is evidence of cropland in the middle of Kilosa that is not picked up by either GlobeLand30 or RCMRD, which may indicate areas of cropland expansion identified in images after 2010. Hence either of these data sets has the potential to be used for agricultural monitoring purposes.
Another potential source of error could come from comparing data sets with different resolutions, which required aggregation of the data to match the 1 km grid cells of the validation data set. The aggregation may have introduced some artifacts, e.g., 300 m grid cells from ESA-CCI may have been split to fit the 1 km grid cells. Although the validation data set collected by the students is not error free, as clearly evidenced by the comparison between the students and the experts, such a data set does provide a valuable type of reality check for products derived from a top down remote sensing approach.
In terms of the robustness of the validation data, the analysis showed that in more than 80% of cases the user interpretations at the same location either agreed or differed by only one level of cropland category. The agreement with experts was lower, i.e., 60%, but this result was expected since the sample of 60 locations was chosen from areas of disagreement, which were therefore more difficult to interpret. However, most of the confusion was similarly within one level of cropland category and not at the extremes, e.g., where the students said no cropland and the experts said high cropland or vice versa. Although one level may seem like a large difference, cropland is not always easy to interpret from VHR satellite images because of the lack of color in some images, the presence of cloud cover, the lack of clarity of features in the images, and images that lie on the thresholds between low/medium and medium/high cropland. There were only a few examples where either the students or the experts omitted cropland, which indicates that converting the data to binary cropland/non-cropland may produce a data set that can be used with reasonably high confidence, since most of the confusion occurs in the amount of cropland identified. Considering that the exercise was undertaken wall-to-wall, it would be possible to derive a sufficiently large validation sample from the data set collected by the students.
Another possible use for such a data set would be in training future classifiers, where evidence of small fields can be readily recognized from the images. This is further backed up by a comparison of the student validation data with the other data sets. The ESA-CCI was missing cropland 38% of the time compared to the student validation data, while these numbers are higher for the other products, i.e., 44% for GlobeLand30 and the RCMRD layer, and 55% for FROM-GC when considering Tanzania as a whole; similar results were obtained for Kilosa. However, there were also instances where the student validation data set showed no cropland but cropland was recorded in the other data sets. In most cases, the confusion is between no cropland and low cropland; the exception is ESA-CCI, where there is confusion in both the low and high cropland classes. Figure 11 provides an example of a hybrid map for Kilosa that can act as a cropland uncertainty layer. Thus, where all the products agree on cropland, certainty is high that cropland exists at that location. Such products can provide users with more confidence, especially given the large variation in total cropland between the five different products. The individual disagreement layers can also provide guidance to the producers of the different products regarding where they could potentially sample for additional training data.
In terms of the technology used to collect the data, the offline Geo-Wiki client has real advantages over the regular Geo-Wiki tool. First, the images were obtained directly from Digital Globe, so the precise dates of the imagery are known. This information is more difficult to obtain from Google Earth, while Bing imagery is only available for one date, so image interpretation in the regular Geo-Wiki tool is very much driven by the availability of the imagery. This problem is controlled for when using images directly from the provider. Second, no internet connection is required and the client is lightweight and fast, so it can be brought to locations where there are volunteers in the country itself, which also draws upon the expertise of locals while teaching them about image interpretation. The disadvantages are that the image quality is sometimes poor or the images are cloud covered, and there is no possibility to zoom into the image as would be possible using an application like Google Earth. This may also have affected the quality of the resulting interpretations collected by the students, as mentioned above. An additional consideration is the use of visually interpreted VHR imagery as a gold standard compared to classified Landsat or MERIS-scale imagery. It is certainly expected that VHR imagery provides more accurate results, but future work should test this by using field data, possibly including unmanned aerial vehicle (UAV) imagery and WorldView-3 satellite imagery with 30 cm spatial resolution.

Conclusions
In Africa, food security is highly dependent on smallholder, subsistence agriculture for supporting the majority of its people. Therefore, for national and global agricultural monitoring, spatially explicit data on the distribution of cropland is essential for assessing the risk and extent of crop failure early in the growing season and before harvest. Yet the results of this comparison have shown that despite the increase in map products available, there is still considerable disagreement between them, both in the total amount of cropland and in the spatial distribution. However, of the four data sets considered here, both GlobeLand30 and the RCMRD data sets could be useful for agricultural monitoring purposes.
The validation data collected by the students, although requiring more effort compared to automated top down approaches, can provide a source of training data for improving future cropland products, and where sufficiently large in amount, can be sampled for validation. Disagreement layers between the different products can also provide map producers with information about where additional training samples should be collected.