Downscaling Pesticide Use Data to the Crop Field Level in California Using Landsat Satellite Imagery: Paraquat Case Study

Exposure to pesticides has been associated with increased risk of many adverse health effects. To understand the relationships between pesticide exposure and health outcomes, epidemiologists need information on where pesticides are applied in the environment. California maintains one of the most comprehensive pesticide use reporting systems in the world, yet the data are only recorded at a coarse geographic scale of approximately 2.6 km² in area. A method is presented that uses Landsat image time series to downscale California pesticide use data to the crop field level. The approach is demonstrated using paraquat applied to vineyard and cotton fields.


Introduction
Exposure to pesticides has been associated with increased risk of adverse health effects such as cancer [1-5], birth defects [6], and Parkinson's disease [7,8]. Populations living near agricultural fields are at higher risk of exposure due to their proximity to areas where pesticides are frequently applied [9-11]. To understand the relationships between exposure and health outcomes, epidemiologists need accurate, unbiased exposure estimates, as study participants rarely, if ever, know what, when, or where pesticides were applied to crops grown near their residence. Direct measurements of pesticide exposure are often expensive, difficult, or impossible to obtain, especially for retrospective studies where data are collected only after study enrollment (i.e., post-diagnosis); hence, epidemiologists are using GIS technology to estimate pesticide exposures.

Although California maintains one of the most comprehensive pesticide use reporting systems in the world (the California Pesticide Use Reporting (CPUR) system) [12], a major limitation of the data is that they are recorded at a coarse geographic scale of approximately 2.6 km² in area ("Section" in the US Public Land Survey; approximately 1 mile by 1 mile), which is not adequate for residential-level exposure assessment [10,13,14]. The CPUR database is the primary source of information on where, when, and how pesticides are used in California. Information is recorded on the type of chemical applied, the date of application, and the crop the chemical was applied to (e.g., cotton, tomato), among other attributes.
Pesticide use density near individual residences is usually estimated by taking a proportion of the total amount of a specific pesticide applied to crops within the entire Section [10,13], an approach that has been found to result in substantial misclassification errors [13]. Improved estimates are possible by linking pesticide use to specific crop fields using land use maps developed by the California Department of Water Resources (CDWR) [10,11,13,15]. However, CDWR maps are produced only once every 7-10 years for a single county [16]. Since producers frequently rotate crops from year to year, these maps cannot merely be duplicated for intervening years. In addition, factors such as climate, soil conditions, cultivation, and irrigation practices allow for double- and triple-crops to be grown on a field within a single year in California [17,18]. Because CDWR maps are developed from field observations at only one time during the growing season, multi-cropped fields are frequently misclassified [19].
Landsat satellite images represent a vast resource for characterizing agricultural lands at a relatively high spatial resolution (30 m). Yet Landsat data have had only limited use in pesticide exposure studies [9,20-22], likely due to the previous high cost of the imagery and the lack of effective and efficient methods for linking the imagery to pesticide use data. The US Geological Survey (USGS) is now providing Landsat imagery free of charge, eliminating one of the major barriers to using these data [23]. In a previous study, Maxwell et al. [19] explored the use of Landsat image time series to characterize crop management practices in the southern Central Valley of California. This research builds on the foundations presented in that paper. The objective of this paper is to present an approach for using Landsat satellite imagery to downscale CPUR data from the Section level to the crop field level. The concept is demonstrated using paraquat applications in western Fresno County, California.

Overview
Paraquat is a highly toxic herbicide and exposure has been linked to many acute and chronic health effects such as respiratory problems, kidney damage, and Parkinson's disease [8,24]. In California, paraquat is widely used to control weeds in crop fields such as orchards and vineyards, and to defoliate green vegetation on crops (primarily cotton) prior to harvest. During the years 1991 through 2009, paraquat was applied to between 423,000 and 736,000 hectares annually in California.
The year 1994 was selected to demonstrate the methodology since a CDWR map was available for Fresno County to use in validating the results. Four Public Land Survey Sections (Section 1 in Range 17 East, Townships 14, 16, 17, and 18) were selected in western Fresno County, California (Figure 1). Six Sections were originally selected in an evenly spaced north-south transect however, two Sections
Identifying pesticide use at the crop field level involves the integration of CPUR data, Landsat image time series data, and a crop signature library (Figure 2). The Landsat image time series is first used to identify crop field boundaries and acquire measurements of the vegetation over the growing season for each crop field within the Section. The specific crop field(s) are then identified by comparing the crop phenological measures for each field to an existing crop phenological signature library [19] to determine the closest match. A principal components analysis (PCA) transformation was applied to the image time series and the first three components were used to enhance field separation. A segmentation algorithm was applied to obtain initial field unit boundaries using Definiens eCognition software. Segmentation parameters were set to scale 10, shape 0.2, and compactness 1.0. The field unit boundary delineates a group of contiguous pixels that have similar phenological characteristics. Manual corrections were made to obtain the final field unit boundaries.
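The per-field vegetation measurements used in this workflow are NDVI time series. The following is a minimal sketch, not the processing chain used in the study, of how such a series might be derived from red and near-infrared reflectance for the pixels inside one field unit. The function names, the image dates, the reflectance values, and the choice of the per-field median as the summary statistic are all assumptions made for illustration.

```python
from statistics import median


def ndvi(nir, red):
    """Standard NDVI: (NIR - Red) / (NIR + Red); 0.0 if both bands are zero."""
    total = nir + red
    return (nir - red) / total if total else 0.0


def field_ndvi_series(pixels_by_date):
    """Summarize one field unit as the median pixel NDVI per image date.

    Using the median is an assumption; the source does not state how
    pixel values were aggregated within a field unit.
    """
    return {
        date: median(ndvi(nir, red) for nir, red in pixels)
        for date, pixels in pixels_by_date.items()
    }


# Hypothetical (NIR, red) reflectance pairs for the pixels of one field unit.
pixels_by_date = {
    "1994-04-15": [(0.30, 0.10), (0.32, 0.08), (0.28, 0.12)],
    "1994-07-20": [(0.45, 0.05), (0.40, 0.10), (0.35, 0.15)],
}

series = field_ndvi_series(pixels_by_date)
```

Each field unit then carries one NDVI value per image date, which is the form needed for comparison against library signatures.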
Classification was performed by comparing 20 samples selected from the crop signature library (collected in a previous study [19]) to signatures for each of the crop fields identified in the Section. NDVI values for the Landsat image dates were matched to the corresponding NDVI values for time periods in the crop signature library in the classification. A distance measure (median value of the sum of differences squared for all 20 samples in the crop library) was used to identify the closest match. An existing crop map obtained from the CDWR was used to validate the classification [16].
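The distance measure can be sketched as follows. This is an illustrative reading of the description above rather than the code used in the study: for each library sample, the squared NDVI differences are summed over the matched dates, and the median of those sums across the samples for a crop is taken as the crop's difference value. The function names and all NDVI values are hypothetical, and only three samples per crop are shown instead of 20 to keep the example short.

```python
from statistics import median


def sum_squared_diff(field_ndvi, sample_ndvi):
    """Sum of squared NDVI differences over date-matched observations."""
    if len(field_ndvi) != len(sample_ndvi):
        raise ValueError("field and library sample must cover the same dates")
    return sum((f - s) ** 2 for f, s in zip(field_ndvi, sample_ndvi))


def median_difference(field_ndvi, samples):
    """Median of the per-sample sums of squared differences for one crop."""
    return median(sum_squared_diff(field_ndvi, s) for s in samples)


def closest_crop(field_ndvi, library):
    """Return (crop, difference) for the crop with the smallest median difference."""
    scores = {crop: median_difference(field_ndvi, samples)
              for crop, samples in library.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Hypothetical library: NDVI values at five date-matched time periods.
library = {
    "cotton": [
        [0.1, 0.2, 0.7, 0.8, 0.3],
        [0.1, 0.3, 0.6, 0.8, 0.2],
        [0.2, 0.2, 0.7, 0.7, 0.3],
    ],
    "vineyard": [
        [0.3, 0.5, 0.6, 0.6, 0.5],
        [0.3, 0.4, 0.6, 0.5, 0.4],
        [0.2, 0.5, 0.5, 0.6, 0.5],
    ],
}

field = [0.1, 0.2, 0.7, 0.8, 0.3]  # hypothetical field unit NDVI series
crop, difference = closest_crop(field, library)
```

Taking the median rather than the minimum or mean makes the score less sensitive to a single atypical library sample, which is one plausible reason for the choice.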

Section 14S17E01
CPUR records indicated that paraquat was applied to two vineyard fields in Section number 14S17E01. Seventeen unique field units were identified in the segmentation of the Landsat image time series (Figure 3). Some field units appeared to have multiple fields within the boundary (Figure 3, field at the top noted with a white "V"). The fields were treated as one unit due to their similar NDVI time series patterns. The majority of fields identified as vineyards (5 out of 6) had median difference values less than 0.09 (Table 2). One vineyard field (Figure 3; vineyard-12; blue dashed line) had a median difference value of 0.17 due to the very low NDVI values (<0.3) during the first half of the growing season, possibly because this was a newly planted vineyard. Additional signatures need to be added to the library in this case to characterize newly planted vineyards. Identification of the specific two fields that were sprayed with paraquat was not possible in this example because more than two fields were identified as vineyards within the Section.

Section 16S17E01
CPUR records indicated that paraquat was applied to one cotton field in Section number 16S17E01. The segmentation process resulted in 12 individual field units (Figure 4). The field identified with a white letter "C" on the image, and as "field crop-6" in the time series graph in Figure 4, had the closest match to cotton signatures in the library (median difference = 0.34) (Table 2). The CDWR map labeled this field with a general class code ("field crop"), possibly because the crop had not matured to the point of identification at the time of the CDWR field visit. The NDVI time series pattern for "field crop-6" is clearly the only feasible match to typical cotton phenological patterns (see [19] and Figures 5 and 6 below for other cotton signature patterns). Identification of the specific field where paraquat was applied is possible in this case because only one field was sprayed with paraquat and the NDVI time series pattern matched successfully to the crop type (cotton) in the CPUR.

Section 17S17E01
CPUR data listed one paraquat application to a cotton field in Section 17S17E01. Three unique field units were delineated using the Landsat imagery (Figure 5). The field identified with a white letter "C" in Figure 5 was the closest match (median difference = 0.10) to cotton signatures in the library, with the other two fields resulting in much higher differences (median differences of 1.30 and 2.84) (Table 2). The CDWR map labeled this field as cotton, confirming the correct identification. Identification of the specific field where paraquat was applied was again successful.

Section 18S17E01
CPUR listed four paraquat applications to cotton fields in Section 18S17E01. Two records were likely applications to the same field because the planted area was exactly the same (59.5 ha) and the area treated at each application summed to approximately the same total field area (57.4 ha). Hence, the CPUR indicates that three cotton fields of approximately equal size (59.5 ha, 61.9 ha, and 62.7 ha) should be grown within this Section. Four crop field units were delineated using the Landsat imagery, of which only two fields (Figure 6, white letter "C") had NDVI time series values close to cotton phenological signatures (median difference ≤ 0.09) (Table 2). The other two fields had distinctly different phenological patterns (Figure 6, red and blue lines on graph) and had relatively high differences (median differences ≥ 1.52) when compared to the cotton signatures in the library. The CDWR map confirmed that the two fields identified as cotton were classified correctly and that the other two fields were not cotton (labeled as safflower and tomatoes on the CDWR map). Thus, the CPUR data were likely in error, in that only two fields had paraquat applications. This example demonstrates the potential of using Landsat imagery to identify possible errors in CPUR data. Pesticide exposure estimates would have been substantially overestimated using the original CPUR data. Again, the two fields where paraquat was applied were successfully identified, although it was not possible to identify which specific field linked to which paraquat record. In this case, the pounds of paraquat applied to each field were approximately equal (0.86 and 0.88), so using an average rate applied over both fields is possible. If the application rates were significantly different between the two fields, over- or underestimation of exposure would result.
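The reasoning used to flag the suspected CPUR error can be expressed as a simple consistency check. The sketch below is a hedged illustration, not part of the original analysis: records with identical planted areas are assumed to refer to the same field, and the resulting expected field count is compared with the number of field units matched to the crop from the imagery. The record layout, key names, and field labels are hypothetical; the planted-area values are those reported in the text.

```python
# CPUR-style application records for one Section (hypothetical layout).
records = [
    {"crop": "cotton", "planted_area": 59.5},
    {"crop": "cotton", "planted_area": 59.5},  # same planted area: likely same field
    {"crop": "cotton", "planted_area": 61.9},
    {"crop": "cotton", "planted_area": 62.7},
]


def expected_field_count(records, crop):
    """Count distinct planted areas, treating identical areas as one field."""
    return len({r["planted_area"] for r in records if r["crop"] == crop})


def reconcile(records, crop, matched_fields):
    """Compare fields implied by the records with fields matched from imagery."""
    expected = expected_field_count(records, crop)
    matched = len(matched_fields)
    return {
        "expected": expected,
        "matched": matched,
        "discrepancy": expected != matched,
    }


# Two field units matched cotton signatures (labels are hypothetical).
result = reconcile(records, "cotton", ["cotton-1", "cotton-2"])
```

A flagged discrepancy does not say which source is wrong; in the case above, the independent CDWR map was what supported the imagery over the CPUR records.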

Summary
The four case studies discussed above were presented to demonstrate the general concept for how Landsat satellite imagery could be used to downscale Section-level CPUR data to the crop field level. Landsat was shown to be useful for identifying the specific field or group of fields where paraquat was applied and also for identifying suspected errors in the CPUR data (Section 18S17E01). In the cases presented, cotton and vineyards were distinguishable from the other crops grown in the Section, yet these case studies only represent a few of the possible crop combinations. The results are encouraging; however, further research is needed to test the methodology on other cropping regions in California and on other types of pesticides.
The successful application of this technique will depend on several factors. Availability of an adequate number of cloud-free Landsat images spanning the growing season is important to characterize the crop field vegetation phenology. A comprehensive crop signature library is also essential to ensure that a wide range of crop conditions are represented. The crop signatures in the library were collected as part of an earlier study in Fresno County, California during the year 2000, and the library currently contains fewer than 30 samples for each crop type [19]. The crop signature library should be expanded to include new signatures for different years and crop growing regions in California.
The specific field will be difficult to determine if multiple fields of the same crop type are grown within the Section and the pesticide was applied to only one of the fields. For example, in Section 14S17E01, several vineyard fields were grown within the Section, yet paraquat was applied to only a portion of two fields. In this case, the method could not identify the specific field, yet it refined the pesticide use to a smaller region within the Section, allowing for some improvement in the spatial accuracy of pesticide use. Sections 16S17E01, 17S17E01, and 18S17E01 demonstrate cases where substantial improvements could be made to paraquat exposure estimates by incorporating Landsat imagery. Using previous methods for estimating exposure (i.e., averaging paraquat over the entire Section) would have resulted in low estimates for residents living in Sections 16S17E01 and 17S17E01. Yet exposures would be much higher for residents living near the actual field, such as in the southeast corner of Section 17S17E01, as compared with residents living further from the field (northwest corner).
Determining the boundaries of the crop fields required some manual processing, and a few of the fields could not be definitively identified (Section 14S17E01). A range of segmentation parameters was evaluated in an attempt to automate the generation of field boundaries. The process resulted in either too many boundaries (sub-field polygons) or not enough boundaries (multiple fields within one polygon), a problem also reported in other studies [29,30]. All of the case studies required manual editing to add or delete vector lines. Although this was a fairly quick process for this limited number of Sections, it could take a significant amount of time in a large study spanning hundreds of Sections and multiple years. In addition, it may not be feasible to delineate some individual field units even visually, as in the first case study (14S17E01). Orchard and vineyard crops tend to be grown on small parcels, making it difficult to delineate specific field boundaries with the spatial resolution of Landsat imagery (30 m). Aerial photographs or a combination of aerial photographs and Landsat image time series would be more useful for identifying field boundaries for smaller crop fields [29].
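The effect of averaging over an entire Section can be illustrated with a small calculation. This is only a sketch under stated assumptions: the Section is taken as one square mile (about 259 ha, consistent with the approximately 2.6 km² quoted earlier), while the applied amount and the field area are hypothetical round numbers, not values from the CPUR records.

```python
SECTION_AREA_HA = 259.0  # one square mile, roughly 2.59 km^2


def use_density(amount_applied, area_ha):
    """Pesticide use density: amount applied per hectare of the reference area."""
    return amount_applied / area_ha


amount_applied = 100.0  # hypothetical total amount applied to a single field
field_area_ha = 60.0    # hypothetical area of that field

# Section-level method: the amount is spread over the whole Section.
section_density = use_density(amount_applied, SECTION_AREA_HA)

# Field-level method: the amount is assigned to the identified field only.
field_density = use_density(amount_applied, field_area_ha)

# How much the Section average understates density at the treated field.
ratio = field_density / section_density
```

With these illustrative numbers the field-level density is a little more than four times the Section average, while the density for the untreated remainder of the Section drops to zero.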

Figure 2. System diagram for integrating Landsat image time series, Section-level pesticide use data, and a crop signature library to map pesticide use at the field-level.

Figure 3. Landsat time series color composite image (left, PCA123) and Normalized Difference Vegetation Index (NDVI) time series plots (right) for selected crop fields within Section 14S17E01. Paraquat was applied to two vineyard fields. Five of the six field units (identified as a white "V" on the image and blue lines in the graph) identified as vineyard on the California Department of Water Resources (CDWR) map had NDVI time series closely matching vineyard library signatures (median ≤ 0.09). One field unit identified as vineyard on the CDWR map appeared to be newly planted (white "V" inside of white box). The difference value was greater for this field (0.17) because the library did not contain signatures for newly planted vineyards. Identification of the specific two fields that were sprayed with paraquat was not possible in this example because more than two fields were identified as vineyards within the Section.

Figure 4. Landsat time series color composite image (PCA123) and NDVI time series plots for selected crop fields within Section 16S17E01. Paraquat was applied to one cotton field. The field with the closest match to cotton is identified as a white "C" on the image to the left and black line in the graph on the right. Identification of the specific field sprayed with paraquat was possible in this example.

Figure 5. Landsat time series color composite image (left, PCA123) and NDVI time series plots (right) for all crop fields within Section 17S17E01. Paraquat was applied to one cotton field. Only one field had NDVI time series values similar to cotton (identified as a white "C" on the image to the left and black line in the graph on the right). The specific field where paraquat was applied was correctly identified in this example.

Figure 6. Landsat time series color composite image (left, PCA123) and NDVI time series plots (right) for all crop fields within Section 18S17E01. Paraquat was applied to two cotton fields. Two fields had NDVI time series values similar to cotton (identified as a white letter "C" on the image and black and purple dashed lines in the graph). Both fields closely matched signatures in the library (median difference ≤ 0.09). The specific fields where paraquat was applied were identifiable in this case. Determining which paraquat application was applied to which cotton field was not possible.

Table 2. Field identification results for each of the study Sections. Results are sorted from lowest to highest median difference value.