Microbial Source Tracking in a Watershed Dominated by Swine

The high concentration of swine production in southeastern North Carolina generates public health concerns regarding the potential transport of pathogens from these production systems to nearby surface waters. Antibiotic resistance analysis (ARA), a microbial source tracking (MST) tool, was used to identify sources of E. coli in a segment of Six Runs Creek in Sampson County, North Carolina. Among 52 water samples, fecal coliform (FC) counts averaged 272.1 ± 181.6 CFU/100 mL. Comparisons of isolates from water samples to an ARA library with an average rate of correct classification (ARCC) of 94.3% indicated that an average of 64% and 27.1% of 1,961 isolates from Six Runs Creek were associated with lagoon effluent and cattle manure, respectively. The potential for aerosol transport of bacteria during lagoon spray events, as well as the potential for wildlife to serve as a vehicle of transport for bacteria from fields and lagoons to nearby surface waters, should be investigated further.


Introduction
Surface water can be polluted with fecal bacteria from both point and nonpoint pollutant sources within agricultural and urban systems [1,2]. In agricultural systems that involve livestock and poultry production in the United States, waste on a dry weight basis exceeds 200 million tons per year [3]. Livestock manure contains more than 150 pathogens, including Campylobacter spp., Salmonella spp. and Escherichia coli O157:H7, that account for over 90% of food and waterborne diseases in humans [4]. Typically, lagoon systems are used for swine waste nutrient management. However, appreciable levels of bacteria still persist because lagoons were not designed to control bacteria [5]. Consequently, the land application of lagoon effluent gives rise to concern because it has the potential to introduce many microorganisms to the soil surface, some of which may reach surface water via runoff. Fecal indicator bacteria (FIB) can be quantified in any given waterbody; however, the host source of the fecal bacteria cannot be determined by counts alone [6]. Previous methods for determining the sources of fecal pollution involved monitoring increases in nutrient concentration against background levels and the use of ratios of fecal coliforms to fecal streptococci [7]. Microbial source tracking (MST) technologies have emerged as tools for determining the sources of fecal pollution in environmental waters. These technologies typically involve genotypic and phenotypic evaluations of bacteria and rest on the assumption that specific markers or phenotypically expressed traits are similar among bacteria from the same animal host, and can therefore be used to discriminate among host sources of bacteria from environmental matrices [8]. Genotypic approaches that have been investigated include repetitive element based polymerase chain reaction (rep-PCR) fingerprinting [9-12], ribotyping [13], amplified fragment length polymorphism [14], pulsed-field gel electrophoresis [15], denaturing gradient gel electrophoresis [16], and detection of source specific marker genes [17,18]. Phenotypic investigations have included carbon source utilization profiles (CUP) [19], biochemical fingerprinting [20], and antibiotic resistance profiles [21-23].
Antibiotic resistance analysis (ARA), one of the earliest MST methods developed, has been tested in a variety of watershed environments [22-25] and is still used in fecal source tracking studies [21,26-29]. The ARA approach is based on the premise that fecal indicator bacteria (FIB) from hosts exposed to antibiotics will develop resistance to those antibiotics, and on the hypothesis that this selective pressure provides a mechanism for discriminating among fecal indicator bacteria from a variety of hosts [30-35]. A library or database composed of antibiotic resistance profiles (ARP) of FIB from known host sources is developed and serves as a point of reference for identifying unknown source isolates from water sources. Two method comparison studies reported accuracy issues for numerous source tracking methods, including ARA. However, both studies also described approaches that could compensate for potential problems [35,36]. Such approaches include cross-validation analysis of known-source isolates as well as the removal of clonal isolates to ensure a reliable library.
Performance success [21,28,37] coupled with cost effectiveness [38] made ARA a good choice for determining the sources of fecal pollution in a small watershed with few major contributors of fecal pollution. This paper describes the use of ARA to determine sources of E. coli in a segment of Six Runs Creek, located in a small Sampson County, North Carolina, watershed with a high concentration of swine production.

Study Site
The study site (Figure 1) is located in a 275 ha watershed along the upper reach of Six Runs Creek, which flows in a southerly direction in eastern Sampson County, N.C., USA. Four commercial swine operations with 23 swine houses are located within this watershed. The swine operation designated as the study site is approximately 18 km north of Clinton, N.C., USA. Six Runs Creek flows in a channel adjacent to waste application field site one (WAF1) of the study site, and the segment adjacent to waste application field site two (WAF2) is impounded by two beaver dams and forms an elongated pond (Figure 1). The study site consisted of a single swine operation that had a standing herd of 4,400 finishing animals. WAF2 and the associated riparian system located on the west side of Six Runs Creek were the primary focus of evaluation in this study. The width of the forested riparian buffer ranges from 41 m to 87 m. WAF2 (1.8 ha) was cropped with coastal Bermuda grass that had been grazed (100 feeder calves), with an occasional cutting of hay removed. On average, WAF2 received lagoon liquid three to four times during the warm season and once in late fall and early spring.
Most of the soils in the study area are well drained, and soil types were derived from soil survey maps of Sampson County, N.C., USA [39]. Soil in WAF2 is classified as a Wagram series (loamy, kaolinitic, thermic Arenic Kandiudults). Soils in the riparian zone are classified as a Marvyn series (fine-loamy, kaolinitic, thermic Typic Kanhapludults) and Blanton (loamy, siliceous, semiactive, thermic Grossarenic Paleudults).

Host-origin Isolates
Fecal samples from known hosts in the watershed were collected throughout the project to build the library of host-origin ARPs for source identification of E. coli in Six Runs Creek. Lagoon liquid was collected from the lagoon adjacent to WAF2, and fresh swine feces were collected from the facility houses. Cattle and dog feces were collected from WAF2. Bird and deer feces were collected from the riparian area near Six Runs Creek. Feces from area nutria, beaver, and raccoon were collected from the intestines of trapped animals.
All samples were placed on ice in coolers, transported to the laboratory and assayed within 6 hours of collection. Known source samples were diluted and plated on m-FC agar (Becton Dickinson, Cockeysville, MD, USA) to obtain fecal coliform isolates. The fecal coliforms were assigned as E. coli by confirmation as 4-methylumbelliferyl-β-D-glucuronide (MUG)-positive with Colilert broth (IDEXX Laboratories, Westbrook, ME, USA). A subset of the host-origin isolates was also identified by API 20E patterns (bioMérieux SA, France). Details for collection, transport, dilutions, and plating of fecal samples have been previously described [32]. A total of 1,937 E. coli isolates (10-12 isolates/fecal sample) from the nine known hosts were used to develop the database.

Water Sample Isolates
Water samples were collected monthly, when conditions permitted, between October 2007 and December 2008. Samples were collected as single grab samples from the center of the creek at five Six Runs Creek sites. Sampling sites consisted of upstream, midstream and downstream sites in relation to WAF2. All water samples were placed on ice and processed within six hours of collection. All water samples were filtered, and the membrane filter was incubated on m-FC agar (Becton Dickinson, Cockeysville, MD, USA) at 44.5 °C for 24 h for the detection and enumeration of fecal coliforms. Prior to antibiotic resistance analysis, randomly selected fecal coliforms from the water samples were assigned as E. coli using the same procedure described in the section on host-origin isolates.

Antibiotic Resistance Analysis (ARA)
Antibiotic resistance analysis (ARA) was performed on 1,937 host-origin E. coli isolates and 1,961 E. coli isolates from 52 surface water samples. Thirty-eight concentrations of nine antibiotics were used to determine ARPs of the isolates (Table 1). The antibiotics and concentrations were selected based on previous ARA studies and their common use in human and veterinary medicine [40]. An isolate was considered to be resistant to a given concentration of antibiotic if growth comparable to the control plate (no antibiotic) was observed. Observations were converted to binary data, with "1" representing growth on a given antibiotic concentration and "0" representing no growth. Any isolates that failed to grow on the control plates were excluded from the analysis. The details of the ARA procedure have been described and are the same as those used in the method comparison studies [23,32,36].
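The binary conversion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `encode_arp`, the isolate labels, and the four-plate growth readings are hypothetical, and a real profile in this study would have 38 entries, one per antibiotic concentration in Table 1.

```python
from typing import Dict, Optional, Sequence, Tuple


def encode_arp(grew: Sequence[bool], control_grew: bool) -> Optional[Tuple[int, ...]]:
    """Convert plate readings into a binary antibiotic resistance profile.

    grew[i] is True when growth on antibiotic plate i was comparable to
    the control plate. Isolates that did not grow on the control plate
    are excluded (returned as None).
    """
    if not control_grew:
        return None
    return tuple(1 if g else 0 for g in grew)


# Hypothetical readings: (growth on each antibiotic plate, growth on control)
readings = {
    "iso_A": ([True, True, False, False], True),
    "iso_B": ([False, False, False, False], True),
    "iso_C": ([True, False, False, False], False),  # control failed
}

profiles: Dict[str, Tuple[int, ...]] = {}
for name, (grew, control_grew) in readings.items():
    arp = encode_arp(grew, control_grew)
    if arp is not None:
        profiles[name] = arp
```

Storing each profile as a tuple makes it hashable, which is convenient when duplicate profiles need to be detected later.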

Host-Origin Library
Data were analyzed with SAS-JMP statistical software (v. 5.0.1, SAS Inst., Cary, NC, USA). ARA patterns were evaluated by discriminant analysis (DA, with covariance pooled and not pooled) and cluster analysis (to produce a dendrogram for visualizing the degree of overlap). Cluster analysis is the technique of grouping together data that share similar values across a number of variables. The distance graph feature associated with cluster analysis plotted the isolates as points and demonstrated whether source patterns were clustered about a central location or formed multiple clusters around different locations. The host-origin library was developed, and clonal isolates (duplicate ARPs) were identified and removed. Classification ties were assigned a source depending on where the isolate was observed in dendrograms [41]. Two additional approaches were used to develop a stringent host-origin library and to obtain reliable source identification of unknown source isolates. The first was the application of an 80% threshold criterion for correct classification to the library: all isolates below the 80% correct classification certainty (based on posterior probabilities from discriminant analysis) were excluded from the library. The second was to calculate the average frequency of misclassification (AFM) for each source category and to use this average to develop a minimum detectable percentage (MDP) for deciding the significance of hosts contributing minor sources of E. coli in water samples [22,34].
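The two library refinement steps (removal of duplicate ARPs and the 80% certainty threshold) can be illustrated with a short Python sketch. It rests on assumptions that the text does not spell out: duplicates are collapsed within each source category, and the posterior probabilities are supplied by a separately run discriminant analysis. The helper names and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple

Isolate = Tuple[str, Tuple[int, ...]]  # (source category, binary ARP)


def remove_clones(library: Sequence[Isolate]) -> List[Isolate]:
    """Keep one isolate per (source, ARP) pair.

    Assumption: a duplicate ARP counts as clonal only within the same
    source category.
    """
    seen = set()
    unique: List[Isolate] = []
    for source, arp in library:
        key = (source, tuple(arp))
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def apply_threshold(
    isolates: Sequence[Isolate],
    posteriors: Sequence[float],
    threshold: float = 0.80,
) -> List[Isolate]:
    """Keep isolates whose posterior probability of belonging to their
    true source meets the threshold. The posteriors are assumed to come
    from a discriminant analysis run elsewhere."""
    return [iso for iso, p in zip(isolates, posteriors) if p >= threshold]


# Hypothetical toy library with one duplicated cattle profile
library = [
    ("cattle", (1, 0, 1)),
    ("cattle", (1, 0, 1)),
    ("cattle", (1, 1, 1)),
    ("swine", (1, 0, 1)),
]
unique = remove_clones(library)

# Hypothetical posterior probabilities for the three unique isolates
refined = apply_threshold(unique, [0.95, 0.60, 0.80])
```

In this toy example the duplicate cattle profile is dropped first, and the isolate with a posterior probability of 0.60 is then excluded by the threshold.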

Calculation of ARCC, AFM and MDP
The average rate of correct classification (ARCC) was calculated by adding the percentage of isolates correctly classified from each source category and dividing by the total number of source categories. The average frequency of misclassification (AFM) was calculated by adding the percentage of isolates incorrectly classified from each source category and dividing by the total number of source categories [22,41].
The AFM can be used to estimate the likelihood that an isolate that is not from category X will in fact be classified into category X, and therefore can provide the basis for a significance cut-off (minimum detectable percentage, MDP) when predicting the source of isolates from water samples or unknown sources [33,34].
For example, the AFM for the ARA library was 0.68% ± 0.92% (SD). Given that the library had nine source categories, nine multiplied by the standard deviation (0.92) added to the AFM of 0.68% produced an MDP of approximately 9%. With nine source categories, the probability of an isolate being assigned to any category by chance alone was 11%. However, 9% was taken as a more stringent lower limit for considering any one source category to be a significant contributor of fecal pollution [23]. Therefore, for example, birds would be considered a significant contributor to the indicator bacteria in a water sample only if 9% or more of the isolates were classified into the "bird" source category. Ultimately, when classifying isolates of unknown origin (water samples), source categories identified at percentages below the MDP were considered negligible contributing sources.
Cross-validation analysis via the hold-out method was used to determine the representativeness of the library. Isolates were removed from the library one at a time. Each removed isolate was then classified based on the library comprised of the remaining isolates, and the ARCC for these removed isolates was calculated [34].
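The MDP arithmetic described above can be reproduced with a short Python sketch. The AFM (0.68%), standard deviation (0.92) and number of source categories (nine) are taken from the text; the function names and the water-sample allocations at the end are hypothetical and serve only to show how the cut-off would be applied.

```python
from typing import Dict, Sequence


def arcc(cc_rates: Sequence[float]) -> float:
    """Average rate of correct classification: mean of the per-category
    correct classification percentages."""
    return sum(cc_rates) / len(cc_rates)


def minimum_detectable_percentage(afm: float, sd: float, n_categories: int) -> float:
    """MDP as described in the text: AFM plus the number of source
    categories multiplied by the standard deviation."""
    return afm + n_categories * sd


def significant_sources(allocations: Dict[str, float], mdp: float) -> Dict[str, float]:
    """Keep only source categories allocated at or above the MDP."""
    return {src: pct for src, pct in allocations.items() if pct >= mdp}


n_categories = 9
mdp_value = minimum_detectable_percentage(afm=0.68, sd=0.92, n_categories=n_categories)
chance = 100.0 / n_categories  # assignment by chance alone, in percent

# Hypothetical allocation of water isolates (percent per category)
allocations = {"lagoon": 60.0, "cattle": 30.0, "bird": 4.0, "deer": 6.0}
significant = significant_sources(allocations, mdp=9.0)

print(f"MDP = {mdp_value:.2f}%, chance = {chance:.1f}%")
```

The computed value of 8.96% rounds to the 9% cut-off, which sits below the 11.1% expected from chance assignment among nine categories.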
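The hold-out procedure can be sketched as a leave-one-out loop. Note the substitution: the study classified isolates by discriminant analysis in JMP, whereas this sketch uses a nearest-neighbour rule on Hamming distance purely as a self-contained stand-in classifier. The toy library and all function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Isolate = Tuple[str, Tuple[int, ...]]  # (source category, binary ARP)


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of antibiotic concentrations at which two ARPs differ."""
    return sum(x != y for x, y in zip(a, b))


def classify(arp: Sequence[int], library: Sequence[Isolate]) -> str:
    """Stand-in classifier (NOT discriminant analysis): return the source
    of the nearest library isolate; ties go to the earliest entry."""
    return min(library, key=lambda item: hamming(arp, item[1]))[0]


def leave_one_out_rates(library: List[Isolate]) -> Dict[str, float]:
    """Remove each isolate in turn, classify it against the remaining
    library, and report percent correctly classified per source."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for i, (source, arp) in enumerate(library):
        rest = library[:i] + library[i + 1:]
        total[source] += 1
        if classify(arp, rest) == source:
            correct[source] += 1
    return {src: 100.0 * correct[src] / total[src] for src in total}


# Hypothetical toy library of unique ARPs
toy_library: List[Isolate] = [
    ("cattle", (1, 1, 0, 0)),
    ("cattle", (1, 1, 1, 0)),
    ("swine", (0, 0, 1, 1)),
    ("swine", (0, 0, 0, 1)),
]
rates = leave_one_out_rates(toy_library)
cv_arcc = sum(rates.values()) / len(rates)
```

Comparing the per-source rates from this loop with the rates obtained when every isolate stays in the library gives a rough indication of how well the library generalizes to isolates it has not seen.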

Fecal Coliform Monitoring
The levels of fecal coliforms (FC) in surface water averaged 272.1 ± 181.6 CFU/100 mL over the course of the study, which exceeded the 200 CFU/100 mL maximum standard for recreational water [42]. This standard was exceeded for almost three quarters of the study, with the exception of surface water samples collected in January, March and June of 2008 (66.8 ± 37.3 to 90.0 ± 98.0 CFU/100 mL). The highest average density of FC (537.3 ± 1098.0 CFU/100 mL) was detected in water samples collected following a rainfall event in July 2008.

Host Origin Library
The removal of clonal isolates (those isolates with duplicate ARPs) resulted in a loss of 30%-92% of isolates from the individual source categories, reducing the total number of E. coli isolates from 1,937 to 948. The 948 isolates with unique ARPs were subjected to discriminant analysis, producing an average rate of correct classification (ARCC) of 85%, with individual source correct classification (CC) rates ranging from 70%-100% (Table 2). The application of an 80% correct classification certainty threshold resulted in a loss of up to 60% of isolates from the individual source categories. The refined library of 470 isolates had an ARCC of 94.3%, with individual source CC rates ranging from 75%-100%. The average frequency of misclassification (AFM) for the library was used to calculate the MDP of 9% for the study site, which represented a stringent lower limit for considering any one source category to be a significant contributor of fecal pollution [22] (Table 2).
The library, composed of 470 unique ARPs to which an 80% correct classification threshold criterion was applied, was subjected to cross-validation analysis by the hold-out method. The cross-validation ARCCs for lagoon (93%), cattle (95%) and swine (86%) were only 2%-5% lower than the CC rates for these categories listed in Table 3. The cross-validation ARCC for deer, bird and dog was 100% for each. The cross-validation ARCCs for raccoon (71%), nutria (79%) and beaver (94%) were only 4%-9% lower than the CC rates for these categories listed in Table 3.

Host Source Identification of E. coli from Six Runs Creek
Lagoon (50%-59%) and cattle (17%-32%) represented the predominant host sources of E. coli isolates in Six Runs Creek at all sampling locations (Figures 2-4). Isolates identified as the remaining host source categories were below the 9% MDP, with the exception of isolates from the three sampling sites adjacent to WAF2 identified to the combined wildlife category (11%; Figure 4). Among all E. coli from Six Runs Creek, the percentage identified to the lagoon liquid category (>60%) was significantly greater than that of every other host source category. The percentage identified as cattle (>25%) was significantly greater than the percentages identified as swine manure and all wildlife categories (Figure 5).

Seasonal Host Source Allocations
The proportion of E. coli isolates from surface water adjacent to WAF2 identified to the lagoon liquid category was significantly greater than that of all other host source categories during the summer (64.7%) and fall (74.1%; Figure 6). However, cattle (40.9%) and lagoon liquid (34.5%) identifications were not significantly different from each other during the winter. Nor was the percentage of isolates identified to the combined wildlife category (19.3%) different from the percentage of isolates identified as lagoon liquid during the winter (Figure 6). All of the E. coli allocations to the individual wildlife categories (0.0%-7.9%) and swine manure (0.6%-5.3%) were below the 9% MDP for all seasons (Figure 6).
An evaluation of the seasonal characteristics for the host source identifications that occurred above the 9% MDP indicated that the percentage of isolates identified as lagoon liquid in spring (5%) was significantly lower than the percentage identified as lagoon liquid during the other seasons (34%-74%; Figure 7). A significantly greater percentage of isolates was identified as cattle during the winter and spring (40%-82%) than in summer and fall (12%-21%). There were no significant seasonal differences in the combined wildlife category.

Discussion
On average, the levels of fecal coliforms (FC) in surface water from Six Runs Creek were 272.1 ± 181.6 CFU/100 mL, which exceeded the maximum standard for recreational water (200 CFU/100 mL). The elevated counts in surface water may reflect the slow turnover of water caused by the beaver dam, as compared to a true free flowing stream. The resulting ponding may contribute to an accumulation of nutrients, thus providing an energy source for the microorganisms to multiply. Fincher et al. [43] reported fecal coliform counts of 373 CFU/100 mL and 1,470 CFU/100 mL in rural and urban sections of an impaired stream, respectively. Considering that the study site in the present study lies within a watershed that includes four swine operations with 23 swine houses, average FC counts were not exceptionally high when compared to other rural and urban watersheds.
Antibiotic resistance analysis (ARA) was employed to identify the sources of fecal pollution in the watershed. Antibiotic resistance analysis studies frequently involve large known-source libraries consisting of about 1,000 to 6,000 isolates [44]. However, many of the strains examined with these libraries are isolated from the same sample or source material, and hence the libraries may be biased due to the presence of multiple replications (clones) of the same bacterial genotype within a single animal host source [11]. A representative known source library composed of unique antibiotic resistance patterns (ARP) among the host source categories, with reliable correct classification (CC) rates, is important for unknown source identification. The high rate of correct classification obtained with our database (94.3%) may be attributed to the removal of clonal isolates (those isolates with identical ARPs) and the application of an 80% correct classification threshold criterion to the non-clonal isolates. Furthermore, the cross-validation testing indicated the library was representative, with a difference of 9% or less between the ARCC of the validation analysis and the CC rates of the individual sources in the library. Graves et al. [21] reported a known source library ARCC of 95.4%, Hagedorn et al. [25] achieved an ARCC of 87%, and Wiggins et al. [45] reported a known source library ARCC of 74%. The ARCC obtained in our study was similar to that found by Graves et al. [21]. However, one must note that nearly all of the earlier studies did not employ stringent measures for testing library representativeness.
Evaluation of surface water isolates by ARA indicated that lagoon liquid was the predominant source of E. coli in water samples from above the bridge (63.7%), below the bridge (54.2%) and surface water locations adjacent to the application field (66.4%). Olivas and Faulkner [37] identified 48% of E. coli isolates as livestock. Livestock as a predominant source of E. coli in surface water has been a common observation. Studies have reported that 47%-52% of E. coli isolates were from livestock sources [21,30,37]. According to Carroll et al. [26], a higher percentage of isolates identified as human (21%-87%) was associated with surface water near residential developments that relied on onsite wastewater treatment systems. Higher percentages of isolates identified as livestock and other animal sources were associated with surface water in less developed areas [26]. Land use patterns and the geographical setting of the present study indicated that the dominant contributor of E. coli would be livestock sources (swine and cattle). However, flocks of birds were often observed both within the application field and at the water source during some sampling trips. The presence of the beaver dam in the creek indicates that there is a high population of beavers, and hence a likely source of bacterial contamination. In the present study, most of the land within the watershed is dedicated to agricultural purposes; fecal loading due to failing onsite wastewater systems was very unlikely due to the distances between homes and surface waters.
Seasonal evaluations of E. coli isolates from surface water implicated cattle as the predominant source in winter (40.9%) and spring (52.8%). Cattle were fenced in the pasture and had no direct access to the creek at our study site. However, the isolates identified as cattle from the segment of Six Runs Creek evaluated in the present study may indicate that cattle isolates from other areas in the watershed were able to persist and be transported in stream currents. E. coli isolates identified as lagoon liquid were predominant in summer (64.7%) and fall (74.1%), which may reflect increased spray events during these seasons. Warmer months tend to increase wildlife activity, and there may have been transport of E. coli from the lagoon to the stream by wildlife (birds, turtles). One must also consider the possibility of aerosol transport of E. coli in lagoon liquid during spray events. Results from other studies are similar to our findings except for the possible influence of human sources. Olivas and Faulkner [37] reported that 56% of E. coli isolates were of human origin during fall. However, source identifications of E. coli as livestock and human sources were similar in winter (39% and 35%), spring (40% and 35%) and summer (43% and 37%) [37]. Graves et al. [21] reported cattle as the predominant source of E. coli isolates during warm months (60%) and cool months (53%). Booth et al. [30] reported livestock as the predominant source of E. coli isolates in summer (48.9%) and winter (44.4%).
ARA results from this study were not unexpected given the land use parameters of the watershed, thus reinforcing the usefulness of ARA as a tool for identifying sources of fecal pollution in a watershed with few major contributors of fecal pollution. The study further implies that future investigations should address alternate modes of bacterial transport when best management practices for waste management are in place and followed.

Conclusions
Over the course of the study, effective swine waste management in an area with four large swine operations resulted in average fecal coliform counts in surface water that were not exceedingly high. Among isolates evaluated by antibiotic resistance analysis (ARA), livestock were considered a major source. However, in some instances unknown source isolates grouped into a combined wildlife category were not significantly different from the livestock sources. Considering that livestock did not have direct access to the stream and that the position of the riparian buffer would prevent surface runoff during lagoon spray events, one must consider other means of bacterial transport. Aerosol transport of E. coli to nearby surface waters during spray events, as well as the potential for wildlife to transport bacteria from fields and lagoons to nearby surface waters, should be investigated further as potential sources of lagoon and livestock isolates in Six Runs Creek.

Figure 1. Map of the study site. WAF1 = Waste application field site one; WAF2 = Waste application field site two.

Figure 2. Host source identification of E. coli isolates (n = 102) from the Six Runs Creek sampling site above the bridge.

Figure 3. Host source identification of E. coli isolates (n = 144) from the Six Runs Creek sampling site below the bridge.

Figure 4. Host source identification of E. coli isolates (n = 1,715) from three Six Runs Creek sampling sites adjacent to waste application field two (WAF2).

Figure 5. Host source identification of E. coli isolates (n = 1,961) from the surface waters of Six Runs Creek.

Figure 6. Seasonal host source identification of E. coli isolates from three Six Runs Creek sampling sites adjacent to waste application field two (WAF2).

Figure 7. [...] above the 9% MDP. Bars within each source category with the same letters are not significantly different (Tukey-Kramer (HSD) test p<0.

Table 1. Antibiotic concentrations used for antibiotic resistance analysis (ARA) of E. coli isolates.

Table 2. Discriminant analysis of unique antibiotic resistance patterns (ARPs) used to classify E. coli from nine known sources into source categories.

Table 3. Discriminant analysis of unique antibiotic resistance patterns (ARPs) with the application of an 80% correct classification certainty criterion used to classify E. coli from nine known sources into source categories.