KASP Genotyping as a Molecular Tool for Diagnosis of Cassava-Colonizing Bemisia tabaci

Bemisia tabaci is a cryptic species complex that requires the use of molecular tools for identification. The most widely used approach for achieving this is the partial sequencing of the mitochondrial DNA cytochrome oxidase I gene (COI). A more reliable single nucleotide polymorphism (SNP)-based genotyping approach, using Nextera restriction-site-associated DNA (NextRAD) sequencing, has demonstrated the existence of six major haplogroups of B. tabaci on cassava in Africa. However, NextRAD sequencing is costly and time-consuming. We, therefore, developed a cheaper and more rapid diagnostic using the Kompetitive Allele-Specific PCR (KASP) approach. Seven sets of primers were designed to distinguish the six B. tabaci haplogroups based on the NextRAD data. Out of the 152 whitefly samples that were tested using these primer sets, 151 (99.3%) produced genotyping results consistent with NextRAD. The KASP assay was designed using NextRAD data on whiteflies from cassava in 18 countries across sub-Saharan Africa. This assay can, therefore, be routinely used to rapidly diagnose cassava B. tabaci by laboratories that are researching or monitoring this pest in Africa. This is the first study to develop an SNP-based assay to distinguish B. tabaci whiteflies on cassava in Africa, and the first application of the KASP technique for insect identification.


Introduction
The whitefly, Bemisia tabaci (Gennadius; Hemiptera: Aleyrodidae), with a host range of over 1000 plant species, is considered one of the most damaging crop pests worldwide. The greatest damage is caused by the vectoring of over 300 plant viruses [1]. In Africa, B. tabaci transmits viruses that cause cassava mosaic disease (CMD) and cassava brown streak disease (CBSD) [2,3]. The combined damage from these two diseases is estimated to cause cassava yield losses amounting to 50% in East and Central Africa, equivalent to annual losses of more than US$1 billion [4].
Bemisia tabaci is genetically complex [5], with many distinct genetic groups that have been identified based on sequences of the mitochondrial cytochrome oxidase I (COI) gene [6,7]. The occurrence of biotypes or host races of B. tabaci was described in the 1950s after the discovery that morphologically indistinguishable populations differed with respect to host range, host-plant adaptability, and plant virus transmission capabilities [8]. At that time, identification of B. tabaci relied on morphological characterization involving the examination of slide-mounted specimens of fourth instar nymphs [9]. However, some characteristics of the fourth instar are influenced by prevailing [...] cassava B. tabaci.
KASP is a technique that was developed by LGC Genomics (Teddington, UK), and it is based on allele-specific oligo extension and fluorescence resonance energy transfer (FRET) for signal generation. The assay uses allele-specific, fluorescently labelled primers that allow for end-point fluorescence reads, enabling the bi-allelic scoring of single nucleotide polymorphisms at specific loci (https://biosearch-cdn.azureedge.net/assetsv6/KASP-genotyping-chemistry-User-guide.pdf). KASP assays are low cost and high throughput, and have high specificity and sensitivity relative to other markers [36].
The objective of the study described here was to develop SNP markers to distinguish the six major haplogroups of cassava B. tabaci whiteflies using a KASP PCR technique and to validate these markers by comparing them with genotyping by NextRAD sequence data.

Whitefly NextRAD Sequences
We previously collected 95 cassava-colonizing whitefly samples from eight African countries (Burundi, Cameroon, Central African Republic (CAR), Democratic Republic of Congo (DRC), Madagascar, Nigeria, Rwanda and Tanzania) [24]. An additional 190 adult whitefly specimens were collected from cassava fields between 2015 and 2018 from 10 additional countries (Benin, Ghana, Kenya, Liberia, Malawi, Mozambique, Sierra Leone, Togo, Uganda and Zambia) and from four countries (Cameroon, DRC, Nigeria and Tanzania) that either had few samples in our previous study or harboured high whitefly diversity. Genomic DNA of the 190 whiteflies was used to construct NextRAD libraries by SNPsaurus, LLC (http://snpsaurus.com/) [23]. Raw reads from NextRAD sequencing (the 190 new samples combined with the 95 earlier samples [24]) were first processed to remove adaptor and low-quality sequences. The cleaned reads were then aligned to the SSA-ECA genome, and only uniquely mapped reads were used for SNP calling with TASSEL5 [37]. The final filtered SNPs (63,770) were obtained from a total of 243 cassava B. tabaci whiteflies that produced quality sequences out of the combined 285 samples.

Primer Design
SNPs to distinguish the six newly designated cassava-colonizing B. tabaci haplogroups (SSA-ECA, SSA-ESA, SSA-WA, SSA-CA, SSA2 and SSA4) were identified (Table 1), and genome portions of ~2000 bp containing the SNPs were extracted from the Bemisia tabaci isolate SSA1 (SSA-ECA) whole-genome assembly (GenBank accession PGTP01000000). Seven sets of primers were designed (BTS99-319, BTS22-762, BTS55-473, BTS141, BTS613, BTS46-203 and BTS1161) (Table 2) and optimised to amplify genome fragments (390 to 1150 bp) containing the target SNPs to distinguish the six haplogroups. Genome sequence portions of ~200 bp containing the target SNPs were sent to LGC Genomics (UK) to design the KASP primers (Table 3). PCR products generated using the conventional primer sets were then used as the DNA template to test and optimise the KASP primers (Table 3). The KASP technique was tested for its effectiveness in distinguishing the six haplogroups. A total of 152 whitefly samples from Nigeria, Ghana, Benin, Sierra Leone, Liberia, Cameroon, Malawi, Mozambique, Kenya, Tanzania, Uganda and the Democratic Republic of Congo (DRC) were tested using KASP. These were selected from the same DNA extracts that were used for SNP genotyping (NextRAD) [25] with the aim of comparing KASP with NextRAD. Conventional PCR products of the six primer sets were generated and subsequently used as templates in KASP genotyping.
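The extraction of a ~200-bp window around each target SNP can be illustrated with a short Python sketch. It is not the procedure used in this study: the function name, the 0-based coordinate convention, the 100-bp default flank and the bracketed `[ref/alt]` notation (a format commonly used when submitting sequences for KASP assay design) are all assumptions made for illustration, and the contig below is a toy sequence rather than part of the PGTP01000000 assembly.

```python
def kasp_submission_window(contig: str, snp_index: int, ref: str, alt: str,
                           flank: int = 100) -> str:
    """Return left flank + [ref/alt] + right flank around a 0-based SNP position.

    Hypothetical helper: the flank length and bracket notation are assumptions.
    """
    # Guard against coordinate mistakes: the contig base must equal the ref allele.
    if contig[snp_index].upper() != ref.upper():
        raise ValueError("reference allele does not match the contig at this position")
    # Require the full flank on both sides of the SNP.
    if snp_index < flank or snp_index + flank >= len(contig):
        raise ValueError("not enough flanking sequence on one side of the SNP")
    left = contig[snp_index - flank:snp_index]
    right = contig[snp_index + 1:snp_index + 1 + flank]
    return f"{left}[{ref}/{alt}]{right}"


# Toy contig (not real whitefly sequence): 120 A's, a G at index 120, then 120 C's.
toy_contig = "A" * 120 + "G" + "C" * 120
window = kasp_submission_window(toy_contig, snp_index=120, ref="G", alt="T")
print(len(window))  # 100 flank bases + 5 characters for "[G/T]" + 100 flank bases
```

Checking the reference allele before slicing is a cheap way to catch off-by-one errors between 0-based and 1-based SNP coordinates.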

Conventional PCR and KASP Assay
The conventional PCR reaction mixture (10 µL) contained 1 µL template DNA, 5 µL OneTaq Quick-Load 2X Master Mix with Standard Buffer, 0.24 µL of primer (0.25 mM), 0.4 µL MgCl2 (25 mM) solution, and 3.36 µL of sterile water. Cycling conditions were the same for all sets of primers: initial denaturation at 95 °C for 5 min; 35 cycles of denaturation at 94 °C for 40 s, annealing at 58 °C for 30 s and extension at 72 °C for 45 s; and a final extension at 72 °C for 10 min, followed by a hold at 10 °C.
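When many samples are run together, the per-reaction volumes above are usually scaled into a single master mix. The following Python sketch shows that arithmetic. The 10% pipetting overage and all names in the code are assumptions for illustration and are not part of the published protocol; the per-reaction volumes are those stated above.

```python
# Per-reaction volumes (µL) as stated in the text; template is added per tube.
PER_REACTION_UL = {
    "OneTaq Quick-Load 2X Master Mix": 5.0,
    "primer (0.25 mM)": 0.24,
    "MgCl2 (25 mM)": 0.4,
    "sterile water": 3.36,
}
TEMPLATE_UL = 1.0


def master_mix(n_reactions: int, overage: float = 0.10) -> dict:
    """Scale each component for n reactions plus a pipetting overage.

    The default 10% overage is an assumed convention, not from the source.
    """
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


# The stated components plus template should add up to the 10 µL reaction volume.
per_reaction_total = sum(PER_REACTION_UL.values()) + TEMPLATE_UL

for name, volume in master_mix(24).items():
    print(f"{name}: {volume} µL")
```

Each tube then receives 9 µL of the mix and 1 µL of template DNA.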
The KASP reaction mixture (10 µL) contained 5 µL 2X KASP master mix, 0.14 µL KASP primer assay mix and 5 µL DNA template (1 µL of PCR product/DNA extract + 4 µL of sterile water). KASP genotyping was performed in a Stratagene Mx3000P (Agilent Technologies, California, USA). The quality of genotyping cluster plots was visually assessed, and only samples in distinct clusters were considered for manual SNP calling, using the MxPro software incorporated in the Stratagene Mx3000P unit and KlusterCaller (LGC Genomics, Teddington, UK). Genotyping profiles obtained from KASP assays were compared with genotype data from NextRAD to ensure that only matching genotypes were considered.
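To show how end-point fluorescence translates into bi-allelic calls, the Python sketch below assigns a genotype from the position of a sample on a two-dye cluster plot. It is a deliberately simplified illustration and does not reproduce the algorithms in MxPro or KlusterCaller. It assumes the usual KASP reporter dyes (FAM for allele 1, HEX for allele 2), non-negative normalised signals, and hypothetical thresholds (`min_signal`, `homo_angle`) that would need to be tuned to real data.

```python
import math


def call_genotype(fam: float, hex_: float,
                  min_signal: float = 0.3, homo_angle: float = 30.0) -> str:
    """Call a genotype from normalised FAM and HEX end-point signals.

    Thresholds are hypothetical; real calls were made visually from cluster plots.
    """
    # Samples near the origin behave like no-template controls.
    if math.hypot(fam, hex_) < min_signal:
        return "no call"
    # Angle from the FAM axis: 0 degrees is pure FAM, 90 degrees is pure HEX.
    angle = math.degrees(math.atan2(hex_, fam))
    if angle <= homo_angle:
        return "allele 1 homozygous"
    if angle >= 90.0 - homo_angle:
        return "allele 2 homozygous"
    return "heterozygous"


for fam, hex_ in [(1.0, 0.1), (0.1, 1.0), (0.8, 0.8), (0.05, 0.05)]:
    print(fam, hex_, call_genotype(fam, hex_))
```

Fixed angular boundaries are a crude substitute for visual inspection, which is why the study restricted calling to samples that fell in distinct clusters.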

Results
The KASP genotyping results were consistent with NextRAD genotyping (99.3%) with the exception of one sample. The SNP genotyping clusters for the six selected primer sets are presented for representative samples (Figure 1). The conventional primers BTS99-319F/R amplified a ~940-bp fragment with a clear, single bright band for cassava whiteflies but did not amplify any of the non-cassava Bemisia tabaci (MED and IO) or B. afer samples that were tested in our laboratory. This primer set is therefore used first to split African cassava B. tabaci from other species, in addition to providing PCR templates for subsequent KASP assays. The KASP diagnostic flow chart (Figure 2) shows the steps to be followed to split cassava whiteflies into the six populations.
Separating SSA-ECA and SSA-WA from all other cassava B. tabaci haplogroups. The SNP primers BTS99-319 used in KASP gave genotyping results for the 152 samples that were consistent with NextRAD haplogroups and that corresponded with NextRAD alleles in 151 (99.3%) of the samples.
Distinguishing SSA-ESA from SSA-CA. Conventional primers BTS46-203F/R amplified a ~390-bp fragment. These primers were tested only to separate SSA-ESA from SSA-CA and were not used with the other four haplogroups. KASP genotyping of 45 of these samples identified 39 as SSA-ESA and two as SSA-CA, which is consistent with NextRAD. The remaining four samples, two each from SSA-ESA and SSA-CA based on NextRAD grouping and typed homozygous by NextRAD alleles, were shown to be heterozygous when using KASP. These heterozygous samples were tested by an alternative primer set BTS1161 that also separates SSA-ESA from SSA-CA, and the samples were consistent with NextRAD as either SSA-ESA or SSA-CA. Amplification results using this primer for SSA-ESA samples from East Africa and some from Southern Africa were consistent with NextRAD, but 8 samples (4 from Malawi and 4 from Mozambique) out of 41 were designated as heterozygous and one sample from Mozambique was identified as SSA-CA although NextRAD identified it as SSA-ESA. SSA-ESA samples originated from Kenya, Malawi, Mozambique and Tanzania, whilst SSA-CA samples were from eastern DRC and western Tanzania.
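The retesting step described above (BTS46-203 first, then BTS1161 when the first call is heterozygous) can be written as a small decision function. The Python sketch below is illustrative only: the function name and the string labels are hypothetical, and each KASP result is assumed to have already been translated into a haplogroup label or "heterozygous", because the allele-to-haplogroup mapping (Table 1) is not reproduced here.

```python
from typing import Optional

HETEROZYGOUS = "heterozygous"


def resolve_esa_vs_ca(bts46_203_call: str,
                      bts1161_call: Optional[str] = None) -> str:
    """Combine the primary and alternative assay calls for one sample.

    Calls are assumed to be 'SSA-ESA', 'SSA-CA' or 'heterozygous' (hypothetical labels).
    """
    # A homozygous call from the primary assay is accepted as it stands.
    if bts46_203_call != HETEROZYGOUS:
        return bts46_203_call
    # Otherwise fall back to the alternative primer set, if it gave a clear call.
    if bts1161_call is not None and bts1161_call != HETEROZYGOUS:
        return bts1161_call
    return "unresolved"


print(resolve_esa_vs_ca("SSA-ESA"))
print(resolve_esa_vs_ca("heterozygous", "SSA-CA"))
print(resolve_esa_vs_ca("heterozygous", "heterozygous"))
```

Because BTS1161 itself produced some heterozygous calls and one discordant call, a sample that remains unresolved after both assays would still need confirmation by another method.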
Summarised results show that overall there was a very high level of concordance between NextRAD and KASP diagnoses (Tables 4 and 5). Detailed information on the country, COI identity, NextRAD grouping and KASP result is provided for all 152 samples in Table S1.

Discussion
Bemisia tabaci is a cryptic species complex with diverse members that have different biological properties. Accurate identification of these species is critical for the effective management of whiteflies, both as pests and as virus vectors. Most recently, an analysis of >63,000 genome-wide single nucleotide polymorphisms (SNPs) obtained from cassava-colonizing B. tabaci from Africa revealed the existence of six haplogroups, designated as SSA2, SSA4, SSA-CA, SSA-ESA, SSA-WA and SSA-ECA [24,25]. These data have now been used to design a simple KASP diagnostic assay to distinguish the six haplogroups. The method allows for the identification of these haplogroups in laboratory procedures lasting a matter of hours and with no requirement for sequencing. As such, we suggest that the method is appropriate for adoption across sub-Saharan Africa, where routine identification of cassava B. tabaci whiteflies is required. The SSA-ECA haplogroup is of particular interest as it is predominant in regions currently affected by the severe CMD and CBSD pandemics [25] and commonly occurs in super-abundant populations.
KASP results were consistent with NextRAD (99.3%), with only one sample out of the 152 not matching. A comparison with COI reveals that 18.4% (28 out of 152) of the samples were misidentified by COI. The SSA1-SG1 samples were predominantly placed in SSA-ECA, but 8.5% (13 out of 152) of these were actually SSA-WA. This indicates that studies relying on COI will often lump two distinct haplogroups (SSA-ECA and SSA-WA) together as SSA1-SG1. Other samples that were misidentified include 7 SSA-ECA that were designated as SSA1-SG2, 3 SSA2 as SSA3, 2 SSA-CA as SSA1-SG1, 1 SSA-WA as SSA2, 1 SSA2 as SSA4 and 1 SSA-ECA as SSA1-SG1/SG2. This high misidentification rate indicates that COI is unreliable for identifying the major genetic groups of cassava B. tabaci whiteflies, as has been highlighted in previous studies [24,25]. The KASP assay had 99.3% consistency with NextRAD and is more accurate and reliable than COI sequencing. The single sample (0.7%) that was scored heterozygous by KASP but homozygous by NextRAD is an expected, infrequent occurrence because allelic dropout can occur during RAD sequencing, and it increases for loci with low read counts [38,39].
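The percentages quoted above follow directly from the sample counts, as the short Python tally below shows. The dictionary layout and helper name are illustrative choices; the counts themselves are taken from the text, keyed as (NextRAD haplogroup, COI designation).

```python
TOTAL_SAMPLES = 152

# COI misidentifications reported in the text: (NextRAD haplogroup, COI label) -> count.
coi_misidentified = {
    ("SSA-WA", "SSA1-SG1"): 13,
    ("SSA-ECA", "SSA1-SG2"): 7,
    ("SSA2", "SSA3"): 3,
    ("SSA-CA", "SSA1-SG1"): 2,
    ("SSA-WA", "SSA2"): 1,
    ("SSA2", "SSA4"): 1,
    ("SSA-ECA", "SSA1-SG1/SG2"): 1,
}


def percent(count: int, total: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


n_misidentified = sum(coi_misidentified.values())
coi_error_rate = percent(n_misidentified, TOTAL_SAMPLES)
kasp_concordance = percent(TOTAL_SAMPLES - 1, TOTAL_SAMPLES)  # one KASP mismatch

print(n_misidentified, coi_error_rate, kasp_concordance)
```

The seven categories sum to the 28 misidentified samples, giving the 18.4% COI error rate, against a single discordant sample for KASP.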
The whitefly samples directly used in KASP assays were from 12 countries in all major cassava-growing regions in Africa, indicating that this tool will be reliable in identifying cassava B. tabaci whiteflies from all of the major cassava-growing regions of the continent. The current protocol includes a conventional PCR step, using conventional primers, to generate the DNA template for KASP. This step is necessary for whiteflies because individual insects yield low amounts of DNA that may not be of sufficiently good quality for the KASP assay. Attempts to use directly extracted DNA as a template for KASP produced inconsistent results, with some samples producing distinct clusters while others failed. Using a DNA template from a specific region with distinguishing SNPs can also eliminate non-specific primer binding to non-target genome regions. The KASP assay setup did not present any challenges that required optimisation, and samples were resolved into the expected clusters simply by following the LGC Genomics protocol for all the SNP primers that were tested.
The availability of the cassava whitefly reference genome [25] made it straightforward to design this assay, as it was possible to locate the SNPs and the >50-bp flanking genome regions on both sides of each SNP, as recommended by LGC Genomics. In other systems, the lack of a reference genome can make assay design significantly harder [40]. Testing with the six recommended SNP primer sets yielded a total of 721 tests, of which 714 (99%) matched the NextRAD genotyping. This level of accuracy is comparable to a study in which 8 SNPs tested on 10 samples of Eurasian beavers yielded 76 out of 80 (95%) tests that matched RAD sequencing results [40]. The whitefly study described here is the first to use KASP genotyping on an insect. It should therefore provide a valuable baseline against which to measure the merits of subsequent applications of KASP to identify genetic groupings of insects. This KASP assay will be used for the routine characterization of whiteflies collected from cassava in Africa, which are expected to cluster into the six populations. Since distinct populations of cassava B. tabaci that occur at a low frequency may not yet have been sampled, additional NextRAD sequencing will be required at occasional intervals to update the KASP diagnostic and ensure that it remains as accurate and comprehensive as possible.

Conclusions
This study presents a KASP assay for the routine monitoring of cassava Bemisia tabaci in sub-Saharan Africa. Since this assay has demonstrated results consistent with NextRAD sequencing, it can provide a rapid and reliable analysis of cassava B. tabaci that will allow for same-day in-house genotyping in local laboratories that have limited access to sequencing technologies.
The KASP assay developed here has important practical developmental applications. Since there remains a concern that the haplogroup SSA-ECA may pose a risk for the continued spread of severe CMD and CBSD pandemics, the KASP diagnostic described could be an important component within early warning systems to track the spread of potentially dangerous B. tabaci populations. Finally, the KASP diagnostic represents an important application of comprehensive genomics data for Bemisia on cassava in Africa. As other datasets become available for Bemisia populations elsewhere in the world, there are likely to be similar opportunities to develop and apply KASP diagnostics more widely for Bemisia identification and monitoring as part of pest and vectored virus disease management programmes.