1. Introduction
Corn earworm,
Helicoverpa zea (Boddie) (Lepidoptera: Noctuidae), is considered one of the most damaging insect pests in North America. The species is highly polyphagous and has been recorded feeding on more than 120 species of plants, including many economically important crops such as maize, sorghum, tomato, and cotton [
1,
2,
3,
4].
Helicoverpa zea is restricted to the Western hemisphere and was the only major
Helicoverpa pest species present in this region until 2013, when Old World bollworm,
H. armigera (Hübner), was recorded in Brazil [
5,
6].
Helicoverpa armigera is considered native to Europe and Asia and is also present throughout Africa, Australia, and Oceania [
7]. Larvae of this species have an even broader host range than those of
H. zea, with over 180 species of host plants recorded, including many specialty crops [
4]. This broad host range has allowed
H. armigera to spread rapidly throughout much of South America and the Caribbean [
8]. The first U.S. interception was recorded in Puerto Rico [
9], raising concern that the species would soon be detected within the continental U.S. This concern was realized in 2015 when three individuals of
H. armigera were recorded in Florida [
10]. Subsequent surveys in Florida have shown, however, that the species has so far failed to become established. Nevertheless, it is likely that
H. armigera will continue to increase its geographic range in the New World given that adults are highly mobile and capable of flying 20–40 km in one day [
11]. Human-mediated introductions from the movement of passengers and trade goods may also result in
H. armigera increasing its geographic range [
12,
13] as the species is often intercepted at U.S. ports of entry [
14]. Based on habitat and host suitability,
H. armigera could become established in many states in the continental U.S. [
15], where it would substantially threaten important agricultural crops [
16,
17]. Preventing this species from increasing its geographic range and further establishing itself in the Western hemisphere is of utmost importance.
The most important factors for initiating a rapid response to an incursion of
H. armigera are prompt detection and ongoing screening.
Helicoverpa armigera and
H. zea are nearly identical morphologically, which makes morphological identification impossible without genitalic dissections. The time constraints and technical expertise required for identification by genitalic dissection [
7,
18,
19] can result in
H. armigera remaining undetected in areas where
H. zea is present (i.e., most of the Western hemisphere). Molecular methods are therefore increasingly used to distinguish between
H. zea and
H. armigera.
As the range of
H. armigera expands across South America, the Caribbean, and likely into North America [
6,
13,
16,
20], it is essential to have multiple means of screening many individuals in a short period of time. Such screening efforts provide useful data for targeted responses to the spread of
H. armigera as well as detailed information in understanding the range expansion dynamics for this economically important pest. The use of ddPCR (droplet digital PCR) to detect very small amounts of target DNA among large quantities of non-target DNA has been demonstrated in numerous applications (e.g., [
21,
22,
23]) including the detection of
H. armigera in bulk samples [
24]. While the use of ddPCR is an attractive method for pest detection in bulk samples due to its accuracy, tolerance to PCR inhibitors, and method of direct standard-free quantification [
25], it is costly, and the systems are not widely available. Conversely, real-time PCR, because of the relatively low cost of machine ownership and operation, is frequently utilized for detection of pest species of many types (e.g., [
26,
27,
28]). Unfortunately, the real-time PCR-based methods for detecting
H. armigera in bulk samples do not perform well with real-world-sized samples, which routinely include hundreds of specimens [
24,
29]. As such, an accurate DNA-based identification method that can be employed across either ddPCR or real-time PCR is needed to provide flexibility and compatibility across labs and instruments in screening bulk samples for detection of
H. armigera.
Given the need for a more flexible and sensitive method to screen bulk samples, we sought to optimize a newly designed assay for accurate and repeatable detection of H. armigera in bulk samples so that the efficiency of real-time PCR more closely matches that of ddPCR. Improvements over previously designed assays were sought through changes to primer and probe design, DNA extraction method, and assay output interpretation. Effects of different treatments were measured through increases in assay sensitivity and compared within each assay, across assays, and to previously developed assays where applicable. Since the current standard is to screen bulk samples via ddPCR followed by real-time PCR of individuals in bulk samples positive for H. armigera, the process is bottlenecked by the lack of ddPCR systems available to identifiers. An optimized real-time PCR assay for use with bulk samples, as presented here, could vastly improve early detection of an invasion of H. armigera.
2. Materials and Methods
2.1. Origin and Type of Insect Material
The DNA for this study was extracted from the legs of adult specimens of
H. armigera and
H. zea.
Helicoverpa zea specimens were collected in the summer of 2016 in Colorado and Minnesota and in the autumn of 2019 in Florida. Samples were collected using UNI-Trap Multi-Color (Alpha Scents, Inc., Canby, OR) bucket traps baited with
H. armigera pheromone lure formulated and produced by the USDA Forest Pest Methods Laboratory in Buzzards Bay, MA. After collection, samples were stored dry in paper envelopes at 4 °C prior to use. All specimens were identified to genus visually and confirmed to be
H. zea by ddPCR using the method outlined by Zink et al. [
24]. Specimens of
H. armigera were acquired from lab colonies at the USDA-APHIS Forest Pest Methods Laboratory, Cape Cod, Massachusetts, established from Spanish specimens, and the Max Planck Institute of Chemical Ecology in Jena, Germany, established using
H. armigera from Queensland, Australia. Additional
H. armigera were collected from various locations throughout South Africa (
Supplementary Table S1). After collection,
H. armigera samples were preserved in 100% ethanol in individual microcentrifuge tubes and stored at −80 °C until use.
2.2. DNA Extraction from Individual Specimens
For the samples used in optimization steps, DNA was extracted from two legs of individual specimens of H. armigera or H. zea using the Qiagen DNeasy Blood & Tissue Kit (Qiagen, Valencia, California) following manufacturer’s instructions. DNA concentrations were determined using a Nanodrop 2000 (Thermo Fisher Scientific, Waltham, MA, USA). The samples used for testing assay sensitivity were extracted using four legs from individual specimens of H. armigera using the Lucigen MasterPure Complete DNA and RNA Purification Kit (LGC Biosearch Technologies, Novato, CA, USA) following manufacturer’s instructions with an additional 1-h incubation at −20 °C before DNA precipitation to increase total DNA yield. DNA concentrations were determined using a Qubit Fluorometer with the dsDNA HS Assay Kit (Thermo Fisher Scientific) following manufacturer’s instructions. DNA was stored at −20 °C until use and then archived at −80 °C.
2.3. Bulk DNA Extraction
To identify the best bulk DNA isolation method for species-specific real-time PCR detection, we evaluated adjustments to the formulation and method of squish buffer DNA extraction [
30]. The original squish buffer concentrations of 10 mM Tris-HCl, 1 mM EDTA, 25 mM NaCl, and pH 8.2 as well as the modified formulation of Perera et al. [
29], wherein EDTA and NaCl concentrations were reduced by 50%, were tested. In addition, we compared EDTA and NaCl concentrations of 2×, 5×, and 10× the concentration of the original Gloor et al. [
30] formulation. In all tests, Proteinase K was excluded, which differs from Gloor et al. [
30].
Samples were prepared by adding 1 leg of H. armigera to increasing numbers of H. zea legs along with several 2.3 mm zirconia/silica beads in 1.5 mL microcentrifuge tubes. The samples were pulverized for 2 min on high speed in a mini-beadbeater (Biospec Products, Bartlesville, OK, USA). After grinding to a fine powder, 10 µL/leg of squish buffer was added to the tube, which was then incubated overnight at 80 °C (56 °C equivalent) with shaking at 500 rpm in a dry bath Thermomixer FP (Eppendorf AG, Hamburg, Germany). After incubation, the samples were spun down at 2152, 8609, or 16,873× g for 10 min in an Eppendorf 5418 bench-top microcentrifuge (Eppendorf AG) to pellet debris.
The best performing method, a modified squish buffer with 125 mM NaCl, 5 mM EDTA, and 10 mM Tris-HCl with a centrifugation step at 8609× g for ten minutes, was repeated for eight ratio extractions (H. armigera: H. zea) from 1:0 to 1:500.
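The buffer formulations compared above can be tabulated with a short script. This is only an illustrative sketch: the function names are hypothetical, and it assumes that Tris-HCl was held at 10 mM in every variant (the text states this only for the best-performing 5× formulation) and that buffer volume scales at 10 µL per leg as described.

```python
# Original squish buffer concentrations (mM) as given in the text.
ORIGINAL_MM = {"Tris-HCl": 10.0, "EDTA": 1.0, "NaCl": 25.0}


def scaled_buffer(factor):
    """Scale EDTA and NaCl by `factor`.

    Assumption: Tris-HCl stays at 10 mM in all variants.
    """
    return {
        "Tris-HCl": ORIGINAL_MM["Tris-HCl"],
        "EDTA": ORIGINAL_MM["EDTA"] * factor,
        "NaCl": ORIGINAL_MM["NaCl"] * factor,
    }


def buffer_volume_ul(n_legs, ul_per_leg=10.0):
    """Total squish buffer volume (µL) for a bulk sample of n_legs legs."""
    return n_legs * ul_per_leg


if __name__ == "__main__":
    # 0.5x corresponds to the reduced-salt formulation; 2x, 5x, 10x to the increased ones.
    for factor in (0.5, 1, 2, 5, 10):
        print(factor, scaled_buffer(factor))
    # One H. armigera leg plus 50 H. zea legs.
    print(buffer_volume_ul(51), "µL")
```

Under these assumptions, the 5× variant reproduces the best-performing formulation reported above (125 mM NaCl, 5 mM EDTA, 10 mM Tris-HCl).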
A set of 59 samples containing one leg of
H. armigera and 50 legs of
H. zea were extracted using the best-performing modified squish buffer method. The
H. armigera legs used were collected from disparate geographic locations as described above and were of varying quality to account for DNA degradation that occurs in trap samples (
Supplementary Table S1). Because the squish buffer DNA extractions lack the purification that takes place in a column extraction, a 100 µL aliquot from each of these extractions was further purified using AMPure XP paramagnetic beads following the manufacturer’s workflow (Beckman Coulter, Danaher Corporation, Brea, CA, USA).
2.4. Primer and Probe Design
A new set of primers and probes was designed in the same region as other successful molecular assays which had used a portion of ITS1 and the 5′-flanking 18S rDNA region to differentiate
H. armigera from sister species
H. zea and relatives [
29,
31]. Alignments of sequences from
H. armigera and related species (
H. zea,
H. assulta,
Chloridea subflexa, and
C. virescens), as well as intra-genome tandem rDNA repeats between
H. zea and
H. armigera from whole genome alignments (conducted
post hoc), were utilized to find and later confirm consistent differences between samples. Primers and probes were designed using Geneious v8.1.9 (
https://www.geneious.com). The program Primer 3 v2.3.7 [
32] was used to calculate Tm with the SantaLucia [
33] method, and OligoCalc [
34] was employed to test for self-annealing and hairpin formation. Any primers and probes that were found to have poor structural qualities (e.g., many self-annealing sites) were not tested or were repositioned over the variable sites to exclude predicted structural faults. Primers and probes were synthesized by IDT (Integrated DNA Technologies, Coralville, IA, USA). Initial real-time PCR experiments were performed with 2 µL of DNA extracted from
H. armigera only,
H. zea only, and a few ratio samples (1:10, 1:50 and 1:100) in a 20 µL amplification reaction containing 500 nM of each forward and reverse primer, 200 nM probe, 2× iTaq Universal Probes Supermix (Bio-Rad Laboratories Inc., Hercules, CA, USA), and water. After initial denaturation at 95 °C for 3 min, 40 cycles of amplification with a 15 s denaturation step at 95 °C and 1 min anneal and extension step at 60 °C were performed on a Bio-Rad CFX96 Real-Time PCR system (Bio-Rad Laboratories Inc.). After data capture, the amplification of unique products was verified in CFX Manager v3.1 (Bio-Rad Laboratories), and the primer pair with the lowest Cq value was selected (
Table 1).
2.5. Real-Time PCR
The optimal concentration of the primers and probe and the optimal annealing temperature were determined with additional tests including varying the primer concentration from 125 nM to 875 nM and the probe concentration from 40 nM to 320 nM in each reaction and an annealing temperature gradient from 50 to 60 °C. The optimized real-time PCR reaction used iTaq Universal Probes Supermix (Bio-Rad Laboratories Inc., Hercules, CA, USA) with 500 nM of each primer and 160 nM probe, 2 µL DNA of varying concentration, and water to complete the dilution. In select assays, an 18S control probe and primer set were also used following the concentrations and reagents outlined in Barr et al. ([
35];
Table 1). Real-time PCR was done on a Bio-Rad CFX96 Real-Time PCR system (Bio-Rad Laboratories Inc.). The following thermocycling protocol was used: 95 °C for 3 min followed by 40 cycles of 95 °C for 15 s, 1 min at 56 °C, and data capture. Data were visualized and analyzed in CFX Manager v3.1 (Bio-Rad Laboratories). Baseline threshold was set to “auto calculated” for initial testing. However, in a limited number of negative control samples (
H. zea only and NTC), the diagnostic probe exhibited background amplification with an end relative fluorescence unit (RFU) of less than 50.00. Based on this, the baseline threshold setting for the diagnostic probe was changed to “user defined” and the cutoff value adjusted to 100.00 RFU for all runs.
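The effect of a fixed 100 RFU threshold can be illustrated with a simplified Cq calculation. The sketch below is not the algorithm used by CFX Manager; it assumes baseline-subtracted fluorescence values, one per cycle, and linearly interpolates the first crossing of the threshold. The function name and the example curves are hypothetical.

```python
def cq_from_curve(rfu_by_cycle, threshold=100.0):
    """Return the fractional cycle at which the curve first crosses `threshold`.

    rfu_by_cycle[0] is the reading for cycle 1. Returns None when the curve
    never reaches the threshold (for example, background amplification that
    ends below 50 RFU).
    """
    if not rfu_by_cycle:
        return None
    if rfu_by_cycle[0] >= threshold:
        return 1.0
    for i in range(1, len(rfu_by_cycle)):
        prev, curr = rfu_by_cycle[i - 1], rfu_by_cycle[i]
        if prev < threshold <= curr:
            # prev belongs to cycle i and curr to cycle i + 1.
            return i + (threshold - prev) / (curr - prev)
    return None


if __name__ == "__main__":
    background = [2.0, 5.0, 20.0, 45.0, 48.0]      # made-up negative control
    amplified = [0.0, 50.0, 150.0, 900.0, 2500.0]  # made-up positive well
    print(cq_from_curve(background))
    print(cq_from_curve(amplified))
```

With a cutoff of 100 RFU, a negative control whose end fluorescence stays under 50 RFU yields no Cq, which is the behavior the user-defined threshold was chosen to achieve.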
2.6. Real-Time PCR Sensitivity Analyses
To determine the sensitivity of the assay, we ran six technical replicates of serial dilutions of purified
H. armigera DNA with a range of concentrations from 40 ng/µL to 4 × 10⁻⁶ ng/µL. The Cq results were adjusted by a logarithmic regression at each DNA dilution for the diagnostic and control probes according to the model:
$$y_i = \beta_0 + \beta_1 \log(X_i) + e_i$$
where:
yi = Cq observed value referring to the i-th dilution;
β0 = intercept;
β1 = slope;
Xi = the i-th dilution associated to the observed value yi;
ei = residual associated to the yi observation.
The analyses were carried out in R [
36] using the nls2 package [
37].
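As an illustration of this kind of standard-curve fit, the following sketch regresses Cq on log10 of DNA concentration by ordinary least squares and derives the amplification efficiency from the slope using the standard relation E = 10^(−1/slope) − 1. It is not the authors' R/nls2 code. The 10-fold dilution steps and the Cq values are synthetic assumptions chosen only to exercise the functions.

```python
import math


def fit_standard_curve(concentrations, cq_values):
    """Least-squares fit of Cq = intercept + slope * log10(concentration)."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def amplification_efficiency(slope):
    """Efficiency implied by a standard-curve slope (1.0 means doubling per cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


if __name__ == "__main__":
    # Assumed 10-fold series spanning 40 ng/µL down to 4e-6 ng/µL.
    concentrations = [40 * 10 ** -k for k in range(8)]
    # Synthetic Cq values lying on a line with perfect-doubling slope; not study data.
    ideal_slope = -1.0 / math.log10(2)
    cq_values = [18.0 + ideal_slope * math.log10(c) for c in concentrations]
    intercept, slope = fit_standard_curve(concentrations, cq_values)
    print(round(intercept, 3), round(slope, 3), round(amplification_efficiency(slope), 3))
```

A slope near −3.32 corresponds to roughly 100% efficiency; with real replicate data the residuals around the fitted line would also be inspected.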
2.7. Real-Time PCR Comparative Analyses
The statistical significance within real-time PCR between the RFU amplification values of the purified and non-purified samples (use of bead purification factor levels; j) and the presence or absence of the 18S control probe (use of control factor levels; i) were evaluated by analysis of variance (ANOVA). Once the ANOVA assumptions were verified, the RFU value was modeled by the expression below:
$$y_{ijk} = \mu + w_k + \alpha_i + \beta_j + \alpha\beta_{ij} + e_{ijk}$$
where:
yijk = RFU observed value referring to the k-th bulk sample of combination of the i-th level of use of 18S control factor with the j-th level of use of bead purification factor;
µ = intercept;
wk = effect of k-th bulk samples in the observed value yijk;
αi = effect of i-th level of the use of control factor in the observed value yijk;
βj = effect of j-th level of the use of bead purification factor in the observed value yijk;
αβij = effect of the interaction of the i-th level of the use of control factor as the j-th level of the use of bead purification factor;
eijk = residual associated with the yijk observation.
The analyses were carried out in R [
36]. The means were compared by Tukey test using the agricolae package [
38].
2.8. ddPCR
Primer and probe sets were also tested using ddPCR following the protocols outlined in Zink et al. [
24] for EvaGreen reactions and Zink et al. [
39] for probe-based reactions using the QX200 Droplet Digital PCR System (Bio-Rad Laboratories Inc.). The primer set was tested both with the probe and with EvaGreen intercalating dye (
Table 1). The probe-based assay was carried out using 10 µL 2× ddPCR Supermix for Probes (no dUTP), 500 nM of each primer, 200 nM probe, 2 µL of DNA of varying concentration, and 5.6 µL water to complete the dilution of the master mix. After droplet generation, the following thermocycling protocol was used: 95 °C for 10 min followed by 40 cycles of 95 °C for 30 s, 56 °C for 30 s, and 72 °C for 1 min, concluding with 98 °C for 10 min and an infinite hold at 4 °C. During thermal cycling for ddPCR, the ramp rate between all steps was fixed at 2 °C/s.
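The reagent volumes for the probe-based reaction follow from C1V1 = C2V2. The sketch below is a worked check rather than the published protocol: it assumes a 20 µL total reaction and 10 µM working stocks for primers and probe, neither of which is stated in the text. Under those assumptions the remaining water volume comes out at the 5.6 µL given above.

```python
def stock_volume_ul(final_nm, stock_um, reaction_ul):
    """Stock volume (µL) needed to reach final_nm (nM) from a stock_um (µM) stock."""
    return final_nm * reaction_ul / (stock_um * 1000.0)


def probe_ddpcr_mix(reaction_ul=20.0, stock_um=10.0):
    """Per-reaction volumes (µL) for the probe-based ddPCR assay.

    Assumptions: 20 µL reaction and 10 µM primer/probe stocks (hypothetical).
    """
    primer = stock_volume_ul(500.0, stock_um, reaction_ul)  # 500 nM each primer
    probe = stock_volume_ul(200.0, stock_um, reaction_ul)   # 200 nM probe
    mix = {
        "supermix_2x": reaction_ul / 2.0,
        "forward_primer": primer,
        "reverse_primer": primer,
        "probe": probe,
        "dna": 2.0,
    }
    # Water makes up the remainder of the reaction volume.
    mix["water"] = reaction_ul - sum(mix.values())
    return mix


if __name__ == "__main__":
    for reagent, volume in probe_ddpcr_mix().items():
        print(f"{reagent}: {volume:.1f} µL")
```

Multiplying each volume by the number of wells, plus a pipetting overage, gives a master mix; the overage fraction is a lab-specific choice.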
The fully optimized EvaGreen assay was carried out using 10 µL EvaGreen Supermix (Bio-Rad Laboratories Inc.), 200 nM of each primer, 1–2 µL of DNA of varying concentration, and 7–8 µL of water to complete the dilution of the supermix. The following optimized thermocycling protocol was used for EvaGreen reactions: 95 °C for 5 min followed by 40 cycles of 95 °C for 30 s, 56 °C for 30 s, and 72 °C for 1 min, concluding with 4 °C for 5 min, 90 °C for 5 min, and an infinite hold at 4 °C. The ramp rate between steps was fixed at 2 °C/s.
After droplets were read, data were processed using ‘definetherain’ [
40], and additional analyses were carried out in QuantaSoft v1.7.4.0917 (Bio-Rad Laboratories). The false positive rate (FPR) and limit of detection (LoD) for the primers were determined for the EvaGreen-based assay only. Forty-four replicates of bulk extractions containing
H. zea specimens only were run, and the FPR and LoD were determined using look-up tables provided by Bio-Rad as modified from Armbruster and Pry [
41].
4. Discussion
The use of ddPCR and real-time PCR has become mainstream for detection of target DNA from several different sample types [
45]. Because each of these methods has advantages over the other and requires independent optimization, we developed, optimized, and compared assays for each platform. As part of the development and testing steps, the two methods were compared to help guide users as to which approach is most appropriate for their screening needs.
Given the reduced cost, ease of sample prep, wide availability, and the broad dynamic range associated with real-time PCR and because ddPCR is currently the only sensitive and specific method available for screening bulk samples of
Helicoverpa, we focused on improving the sensitivity of detection for bulk samples by real-time PCR. Because ddPCR is generally more precise than real-time PCR when using complex and/or contaminated samples [
46,
47], our efforts were intended to enhance the real-time assay such that the sensitivity was closer to that realized with ddPCR. By using an increased salt concentration for the bulk extraction, secondary bead purification, and running the diagnostic assay in simplex, a significant increase in precision was observed over running the assay without these steps (
Figure 2) as well as over previous attempts to develop a real-time PCR assay for use with bulk samples [
24,
29]. This is most evident in the 59 replicates of 1
H. armigera: 50
H. zea which were run under different control and purification conditions. While we observed that running the diagnostic and control probes separately resulted in no false negative results, we realized that this is an idealized situation that would not be applicable to routine procedures. In a real-world trap screening, running two separate assays to obtain diagnostic and control results is unrealistic as it uses twice the resources. Additionally, in a situation in which the diagnostic probe returned a negative result and the control probe returned a positive result, it would be impossible to know whether PCR had occurred in the well with the diagnostic probe. Because of this, it would be more logical to split the trap sample into two or more small batches consisting of fewer than 50 moths each and run them with both probes simultaneously. This would ensure that the concentration of
H. armigera DNA is relatively higher in any positive samples, increasing the probability of a correct diagnosis and allowing the control to work as intended. Our final recommendation is to bead purify all DNA extractions for use with this assay.
The most important aspect of these results is that the real-time PCR assay and the ddPCR assay returned positive results for the same samples. The two assays have the same sensitivity when used with purified DNA and can be used to detect 4 × 10⁻⁵ ng/µL of DNA per reaction at the lowest threshold (
Figure 4). Both assays are also sensitive enough to detect a single
H. armigera leg among 500
H. zea legs when fresh specimens are screened. The correlation between positive results for both assays shows that real-time PCR can be used successfully to screen for
H. armigera in bulk samples using this assay, broadening the utility of this method for use by identifiers in labs that do not have access to ddPCR systems. Additionally, when the 59 1:50 ratio samples were run using real-time PCR, the two samples that returned false negatives also did so when using ddPCR (
Figure 6). This emphasizes the need to collect and properly store field samples promptly in order to preserve DNA of the highest quality possible.
The design of primers and a probe for this assay was optimized for the greatest number of fixed nucleotide differences between H. zea and H. armigera to ensure specificity in bulk samples. The largest difference between H. zea and H. armigera ITS1 sequences is a 30 bp deletion in H. armigera (5′ ACCACTATGCGCATGCATATATTGCATCGC). The forward primer was designed to span this deletion such that complete priming on H. zea by the forward primer was negated. The probe sequence immediately follows the forward primer and is designed to include two SNVs and an AA indel between the species. Like the probe, the reverse primer incorporates two SNVs and an AC indel separating the species.
In addition to high levels of lineage specific nucleotide sequence differences created from the rapid evolution in ITS sequences, increased sensitivity is also an advantageous feature of rDNA-based PCR assays due to the presence of high tandem copy numbers of rDNA in the genome [
48]. Using a HiRise [
49] assembled genome that was not available at the time of assay design [
50] for
H. zea x
H. armigera, 54
H. armigera ITS1 sites (108 in diploid somatic cells), and 105
H. zea ITS1 sites (210 in diploid somatic cells) from distinct rDNA repeats were identified (
Supplementary File S1). All 54
H. armigera ITS1 sites had 100% identity with each other and the primers and probe designed herein. In addition to the 54 distinct copies with exact matches for the primers and probe, four ITS1 copies were found that contained SNVs or indels (one copy with an indel in the forward priming site, one copy with one indel in the forward, probe, and reverse sequences, and two copies with one SNV each in the reverse sequence) in one or more of the priming sites. Because the birth and death cycle of rDNA is so rapid, the difference in rDNA copies is generally limited [
51,
52]. That said, given the large number of rDNA copies in a genome, it is common to find mutations between copies (ribotypes) at any given time especially in the non-functional ITS regions [
53].
One of the most regularly cited problems associated with ddPCR assays is the production of aberrant droplets outside the expected range making analyses and detection calls more difficult [
54]. Given this, our optimization steps were conducted to improve droplet separation and reduce rain and other out of range droplets. As with real-time PCR data, outputs (preferably sigmoidal curves in the case of real-time PCR and clearly separated droplets in the case of ddPCR) should always be visualized to ensure that positive calls are not being made from artifactual results [
55,
56]. Case in point, when the probe designed for real-time PCR was used in the ddPCR assay, we observed a tight and distinct cluster of positive droplets between the positive and negative droplet clusters (
Figure 3a). We tried several optimization steps to eliminate or reduce the presence of these droplets including varying primer and probe concentration, as well as the annealing temperature, number of cycles, ramp rate, and implementing a touch-down-like thermocycling program. While the amplitude of the cluster could be reduced, it could not be eliminated entirely (
Figure 3b).
The multi-peak distribution of droplets was found to be exclusive to the presence of the probe and may have been caused in part by incomplete or inefficient probe binding to ITS1 copies with a single nucleotide difference in the probe site [
57]. When the
H. armigera ITS1 copies are compared, the differential in ∆G (between perfect and next best matches) increases from −31.4 to −19.8 kcal/mol when the ITS1 copy with an indel in the probe binding site is considered. This suggests that increases in the frequency of minor ITS1 ribotypes may reduce the efficiency of probe binding and contribute to double banding. That said, despite the double banding in some samples, sufficient separation was noted between the double bands and the negatives such that positive calls could still be made after inspecting droplet amplitude. Furthermore, the samples that exhibited this droplet pattern were from colony samples of
H. armigera and may not be representative of genotypes found widely in nature. No matter the reason for the multi-peak distribution found in some samples using the probe, the unusual pattern was eliminated in the EvaGreen ddPCR assay. While it is assumed that the off-target probe binding in the reaction is also present in the real-time PCR assay, the way in which the data are processed (a snapshot of total fluorescence in the reaction at each moment of data capture) makes real-time assays less susceptible to these effects, which are only evident due to the partitioning of DNA present in ddPCR. The advantages in specificity and sensitivity make rDNA loci superb diagnostic regions for species identification (even in the case of hybrids as shown in
Supplementary File S1), but if possible, all ribotypes within a genome should be compared to improve reaction efficiency by avoiding intragenomic polymorphisms in priming sites.
The two assays described here were intended to improve the availability of rapid, sensitive detection methods for
H. armigera. Since most identifiers rely on real-time PCR, barcoding, or genitalic dissection of individual specimens, any increased capacity for bulking samples with a widely available technology like real-time PCR is an important advancement for phytosanitary screening. Even our recommendation of bulking fewer than 50 specimens at a time for real-time PCR will greatly improve throughput while keeping the workload for downstream identification reasonable. Currently, bulk samples are screened using the method described by Zink et al. [
24], and if a bulk sample is determined to be positive for
H. armigera, an individual leg is pulled from each specimen in the sample to extract DNA from it individually. The specimens are screened individually by real-time PCR following the guidelines from Gilligan et al. [
31]. Positive samples are then COI barcoded, and/or the specimen is dissected for official identification. While the assay described by Zink et al. [
24] can detect a single
H. armigera in 999
H. zea, in practice much smaller samples are typically screened. Many traps catch fewer specimens with only a few hundred per trap collected during the peak of the season. Furthermore, if a sample containing hundreds of specimens is determined to be positive for
H. armigera, individually extracting DNA from single legs of each of those specimens and running them each on real-time PCR becomes increasingly time consuming. In practice, it is more feasible to run fewer than 96 moths per trap sample so that any positive samples can be run on a single real-time PCR plate. The lower sample size also helps decrease the possibility of false negatives due to poor DNA quality. Similarly, the smaller sample size recommended here for this real-time PCR assay ensures the most efficient use of lab time and resources. In conclusion, we recommend dividing large trap catches into subsamples of 40 individuals to use the real-time PCR method (
Figure 7).
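The recommended subsampling step can be sketched as a simple batching routine. The function name and specimen labels are hypothetical, and the fixed batch size of 40 follows the recommendation above; a lab could equally choose to balance batch sizes.

```python
def split_into_subsamples(specimen_ids, max_size=40):
    """Divide a trap catch into consecutive subsamples of at most max_size specimens."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    return [
        specimen_ids[start:start + max_size]
        for start in range(0, len(specimen_ids), max_size)
    ]


if __name__ == "__main__":
    # Hypothetical trap catch of 130 moths.
    trap_catch = [f"moth_{n:03d}" for n in range(1, 131)]
    batches = split_into_subsamples(trap_catch)
    print([len(batch) for batch in batches])
```

Each subsample would then be extracted and screened as one bulk reaction, so that a positive call narrows follow-up screening to at most 40 individual specimens.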