Genetic Diversity of Mitochondrial DNA of Bemisia tabaci (Gennadius) (Hemiptera: Aleyrodidae) Associated with Cassava and the Occurrence of Cassava Mosaic Disease in Zambia

Simple Summary Bemisia tabaci is an important vector that transmits cassava brown streak viruses and cassava mosaic begomoviruses that cause cassava brown streak and cassava mosaic diseases, respectively. In 2013 and 2015 we carried out a study to determine the genetic variability within the Bemisia tabaci complex associated with cassava in Zambia. This investigation made use of mitochondrial cytochrome oxidase I gene sequences of samples collected from selected provinces of Zambia. We found three population subgroups (SGs): SSA1-SG1, SSA1-SG2 and SSA1-SG3 within the sub-Saharan Africa 1 (SSA1) genetic group. Whitefly abundance and the incidence of cassava mosaic disease were both greatest in Western Province, in which the SSA1-SG1 subgroup predominated. Establishing which genetic groups and populations of the B. tabaci species complex are associated with cassava mosaic disease and their distribution in the country is key to guiding the strategic deployment of resources to monitor disease spread and ensure food security for millions of cassava-dependent households. Abstract Bemisia tabaci is an important vector of cassava brown streak viruses and cassava mosaic begomoviruses, the causal agents of cassava brown streak disease and cassava mosaic disease (CMD), respectively. A study was carried out to determine the genetic variability of B. tabaci associated with cassava and the occurrence of CMD in Zambia in 2013 and 2015. Phylogenetic analysis showed the presence of only the sub-Saharan Africa 1 (SSA1) genetic group in Zambia. The SSA1 population had three population subgroups (SGs): SSA1-SG1, SSA1-SG2 and SSA1-SG3. All three SSA1 population subgroups occurred in Western Province. However, only SSA1-SG3 occurred in Eastern Province, while only SSA1-SG1 occurred in North Western and Luapula Provinces. Adult B. tabaci were most abundant in Western Province in 2013 (11.1/plant) and 2015 (10.8/plant), and least abundant (0.2/plant) in Northern Province in both 2013 and 2015. CMD was prevalent in all seven provinces surveyed, with the highest incidence recorded in Lusaka Province in both 2013 (78%) and 2015 (83.6%), and the lowest in Northern Province in both 2013 (26.6%) and 2015 (29.3%). Although SSA1-SG1 occurred at greater abundances than the other subgroups, there was no direct association demonstrated between whitefly subgroup and incidence of CMD. Establishing which B. tabaci genetic groups and populations are associated with CMD and their distribution in the country is a key factor in guiding the development of CMD control strategies for cassava-dependent households.


Study Area
Zambia experiences a tropical savannah climate with three seasons: a hot dry season (August-October); a hot wet season (November-April); and a cool dry season (May-July). Northern, North Western and Luapula Provinces are in agro-ecological region III and experience approximately 1000 mm and above of rainfall per annum. Western, Central, Lusaka and Eastern Provinces are in agroecological region II and receive 800-1000 mm annually. Whitefly abundance, CMD incidence and symptom severity were assessed in 245 and 200 cassava fields in 2013 and 2015, respectively. The field assessments were carried out in seven provinces, including: Central (36 & 29), Eastern (51 & 44), Luapula (47 & 28), Lusaka (24 & 14), North Western (16 & 17), Northern (40 & 43) and Western (31 & 25). Samples of B. tabaci were collected in Western, Eastern, North Western and Luapula Provinces for molecular characterization. CBSD was not assessed systematically in either 2013 or 2015, although survey teams were aware of CBSD symptoms and none were seen in any of the fields assessed.

Sampling Methodology
The protocol of Sseruwagi et al. [34] was followed to collect material for the investigation of CMD incidence and symptom severity, and whitefly abundance in January to March for both the 2013 and 2015 field surveys. Cassava fields were selected at regular intervals of 10-15 km, although this distance was greater in areas with few cassava fields. In each field, 30 plants of 3 to 6 months old were sampled along two diagonals running through the field along an 'X' shape [34]. CMD incidence was determined by counting the proportion of symptomatic plants out of the total (30) sampled per field [34]. CMD symptom severity was recorded for each plant using a 1-5 scale [35], in which 1 indicated the absence of disease symptoms; 2-mild chlorosis over the entire leaflet or mild distortion at the base of leaflets with only the remainder of the leaflets appearing green and healthy; 3-moderate chlorosis throughout the leaf, narrowing and distortion of the lower one-third of leaflets; 4-severe mosaic distortion of two-thirds of the leaflets and general reduction of leaf size; and 5-severe mosaic discoloration and/or distortion of the entire leaf and plant stunting.
B. tabaci adults were counted for each sampled plant on the five top-most leaves of the tallest shoot by gently turning the leaf to make the underside visible. Samples of B. tabaci adults were collected from cassava plants using an aspirator and placed in 2 mL plastic vials containing 70% ethanol for preservation. The geo-coordinates (latitude, longitude and altitude) were recorded for each location using a global positioning system (GPS) (Etrex, HC Sumit, Garmin International Inc., Olathe, KS, USA).

Extraction of Whitefly DNA, PCR and Sequencing
The DNA extraction protocol was followed for all whiteflies collected; the PCR treatment of samples from the 2013 and 2015 surveys differed in terms of the primer sets and protocol used. Therefore, we describe the two processes separately as follows.

2013 Samples
DNA was extracted from adult whiteflies and PCR was performed to amplify an 850 bp fragment of mtCOI according to Frohlich et al. [36]. The extraction was done in the molecular laboratory at the International Institute of Tropical Agriculture (IITA) in Dar es Salaam, Tanzania. One adult whitefly was drawn from each sample tube and placed on an inverted Petri dish covered with Parafilm. The insect was gently ground in 10 µL of extraction buffer to release total DNA. An additional 30 µL of extraction buffer was added and the contents mixed before transferring to a 1.5 mL tube incubated on ice. The extracted total DNA was then incubated in a water bath (Gesellschaft Für Labortechnik mbH, Burgwedel, Germany) at 65 • C for 15 min and at 95 • C for 10 min using a block heater (Grant QBD2, Grant Instruments Ltd., Shepreth, UK). The DNA was then centrifuged briefly for 5 s to pellet the debris. The contents collected in the supernatant were then stored at −20 • C until use. PCR products of the mtCOI fragment (approximately 850 bp) were produced using the forward primer MT10/C1-J-2195 (5 -TTGATTTTTTGGTCATCCAGAAGT-3 ) and the reverse  primer MT12/TL2-N-3014 (5 -TCCAATGCACTAATCTGCCATATTA-3 ) [37] using a thermocycler (Applied Biosystem, GeneAmp PCR System 9700, Foster City, CA, USA), under the following conditions: first cycle of denaturation at 94 • C for 2 min, followed by 35 cycles of denaturation at 94 • C for 30 s, annealing at 54 • C for 30 s, 72 • C for 1 min and final extension at 72 • C for 10 min. A total reaction mixture of 20 µL was made up of 12.54 µL of distilled water, 2 µL of PCR buffer, 2.5 µL of 25 mM MgCl 2 , 0.76 µL of 2 mM dNTPs, 1 µL of 10 mM primer mix, 0.2 µL of Taq DNA polymerase and 1.0 µL of DNA template. The PCR products were electrophoresed using a Midicell Primo electrophoretic gel system in 1% agarose gel stained in GelRed at 100 V for 30 min in gels buffered with 1× TAE buffer. The gel was visualized and photographed using the UVP imaging system (Biomapping Systems, Digidoc-IT, El Dorado Hills, CA, USA). The resulting 38 PCR products were sequenced in both the forward and reverse directions at Macrogen Inc., Rockville, MD, USA.

2015 Samples
DNA was extracted from individual adult whiteflies following the procedure of Walsh et al. [38]. One whitefly was drawn from a composite sample and placed into a 1.5 mL tube containing 25 µL of 10% Chelex solution. The whitefly was gently ground to release total DNA and an additional 25 µL of 10% Chelex solution was added to the tube. The 1.5 mL tube with the extract was incubated at 56 • C for 20 min in a water bath. The extracted DNA was further incubated at 100 • C for 5 min using a block heater. The contents were then transferred to new tubes and centrifuged at 12,000 rpm for 5 min to pellet the debris. The contents collected in the supernatant were then stored at −20 • C until use.
The PCR was carried out with the same PCR conditions as in 2013 using a thermocycler (Applied Biosystem, GeneAmp PCR System 9700, CA, USA). PCR products were electrophoresed using a Midicell Primo electrophoretic gel system in 1% agarose gel stained in GelRed at 100 V for 30 min in gels buffered with 1× TAE buffer. The gel was visualized and photographed using the UVP, Biomapping Systems, Digidoc-IT, USA. The resulting 30 PCR samples were sequenced in both the forward and reverse direction at Source BioScience, Nottingham, UK.

Data Analysis
Following the method described in Sseruwagi et al. [34], disease incidence was calculated as the percentage of CMD-symptomatic plants per field. Disease symptom severity data were edited to remove the symptomless (healthy) counts (score 1) and analysis was conducted for the CMD-affected plants (score 2-5) per field. Adult whitefly population data were determined at plant level. Means were separated using a one-way analysis of variance using SPSS (version 20).
For all fields where whitefly sequences were obtained, the relationships between whitefly population subgroup, location (province) and whitefly abundance, as well as those between whitefly population subgroup and CMD incidence were examined with Kruskal-Wallis non-parametric ANOVA analyses using MedCalc 19.5.3 (MedCalc Software Ltd., Ostend, Belgium). The same software was used to investigate the association between whitefly abundance and CMD by calculating Pearson's correlation coefficient. Analyses were conducted separately for 2013 and 2015 in addition to the data for the two years combined.

Phylogenetic and Sequence Analysis
Whitefly mtCOI sequences were edited manually to produce a consensus contig sequence of 705 bp for each individual whitefly using CLC Main Workbench version 7 (CLC Bio, Qiagen). This sequence portion was identical for both primer sets used, so the results of all sequences obtained were directly comparable. The edited sequences were aligned with reference whitefly sequences obtained from the National Center for Biotechnology Information (NCBI) ( Table 1), using the Clustal W algorithm option available in the MEGA 7 program [40]. The edited contig sequences were aligned using MEGA 7 to infer a phylogenetic tree with the HKY+G+I model. To determine the confidence values for the grouping within a tree, a bootstrap analysis was performed using the maximum-likelihood procedure with a bootstrap setting of 1000 replicates. We also constructed a phylogenetic tree using the model based maximum likelihood (ML) analysis for the same dataset. We discovered that the Hasegawa-Kishino-Yano parameter with discrete Gamma distribution (HKY+G+I) was the best fit model for our dataset, having had the lowest Bayesian Information Criterion (BIC) value. Table 1. Whitefly genotypes used in the phylogenetic analysis of mitochondrial cytochrome oxidase I sequences and their GenBank accession numbers.

Sequence Name
Country GenBank Reference Code (mtCOI) Author Bemisia afer [24] There were 68 final sequences obtained, which were cleaned manually, and the ends trimmed using Chromas Lite (version 2.1.1). The sequences were compared with those obtained from GenBank using the NCBI BLASTn program. After conducting pairwise analysis using Geneious R11 (https://www.geneious.com), a number of samples were found to be identical. From the identical samples, representative sequences were then selected, and used to construct a final phylogenetic tree and to generate evolutionary divergence using MEGA 7.

Population Genetic Analysis
To elucidate the genetic differences among the samples collected, the nucleotide sequences of all the isolates used in the present study were investigated. Population diversity indices based on the mtCOI, including number of haplotypes (h), polymorphic sites (S), average number of nucleotide differences (k), nucleotide diversity (Pi), and haplotype diversity (Hd) were estimated for each collection. The neutrality test for each subgroup population, including Tajima's D [43] and Fu's Fs [44] values, were also estimated. Sequence variation for each of the main haplotype groups detected was analyzed using DnaSP v6 [45].

Diversity Indices
There was a very low level of polymorphism, estimated by haplotype diversity, amongst the three subgroups groups recorded in the study (Table 3). All eight sequences of SSA1-SG2 were identical, whilst SSA1-SG1 had four haplotypes and SSA1-SG3 had two. Haplotype diversity (Hd) for all the 68 sequences was found to be 0.039 SD. The average number of nucleotide differences (k) was calculated to be 5.53. For both SSA1-SG1 and SSA1-SG3, negative Tajima's D values gave an indication of population expansion, although these values were not strong enough to be statistically significant in both cases.

CMD Incidence, Symptom Severity and Whitefly Abundance
There were notable significant differences between the provinces in 2013 with respect to CMD incidence (df = 6, F = 15.756, p < 0.0001), symptom severity (df = 6, F = 38.841, p < 0.0001) and whitefly abundance (df = 6, F = 268.908, p < 0.0001). The field situation was similar in 2015, with significant differences in CMD incidence (df = 6, F = 13.391, p < 0.0001), symptom severity (df = 6, F = 91.809, p < 0.000) and whitefly abundance (df = 6, F = 221.086, p < 0.0001). CMD was prevalent in all seven provinces surveyed (Table 4) There was a significant correlation between whitefly abundance and CMD incidence when considering data for both 2013 and 2015 (r = 0.28; p = 0.025), but no significant differences were apparent between SSA1 subgroups and CMD incidence, although in general it was apparent that CMD incidence was greater in Western Province where subgroup SSA1-SG1 predominated. Analysis of the relationship between subgroups and whitefly abundance showed that there were significant differences between abundances of subgroups when considering data for 2013 (H = 11.2; p = 0.0036) and both 2013 and 2015 (H = 6.8; p = 0.033), but differences were not significant when considering 2015 alone (H = 0.3; p = 0.86). In both cases where sub-groups had statistically different abundances, SSA1-SG1 was most abundant.

Discussion
We report the widespread occurrence of SSA1 as the main subgroup of B. tabaci occurring on cassava in Zambia. This study confirms previous reports by Berry et al. [10], although that study was based on limited sampling and there was no clear indication of the geographical location of the six samples studied. Three populations of SSA1 were identified on cassava: SSA1-SG1, SSA-SG2 and SSA1-SG3. In both 2013 and 2015, SSA1-SG1 was the dominant population compared to SSA1-SG2 and SSA1-SG3. The study by Berry et al. [10] identified only two populations of SSA1: SSA1-SG1 and SSA1-SG3. BLASTn comparisons of the six samples sequenced by Berry et al. [10] revealed that five of these were SSA1-SG1 and one was SSA1-SG3. The only other published mtCOI sequence of B. tabaci from Zambia was obtained from cassava in Chongwe, Lusaka Province, about 40 km east of Lusaka city [21]. This was also SSA1-SG3.
Populations of SSA1-SG1 have been reported most frequently from Tanzania, Uganda, Nigeria and Kenya [14,32,42,[46][47][48] and were often associated with high whitefly abundance, particularly in East Africa. In our study, SSA1-SG1 occurred in areas with a relatively high whitefly abundance, such as Western and North Western Provinces, where populations were comparable to the high populations reported in East Africa [25,30]. Whiteflies on cassava have generally been observed to be more abundant in Western and North Western Provinces of Zambia [49][50][51], where SSA1-SG1 was also most prevalent. In this study, an average population exceeding 10 whiteflies per plant was recorded for both 2013 and 2015 in Western Province. To differentiate between low and high whitefly abundance, Legg [52] suggested a mean whitefly abundance threshold of '5/plant'. Using this definition, Western Province was categorized as having a 'high abundance' of whiteflies. Furthermore, the presence of sooty mold on cassava leaves in some fields is indicative of high whitefly abundance.
Whitefly abundance was also relatively high in both 2013 and 2015 in North Western Province compared to other provinces in this and past studies [50,51]. Whitefly abundance was low in Eastern, Central, Luapula and Northern Provinces, with averages in 2013 of only 1.0, 0.7, 0.5 and 0.2/plant, respectively. In 2015, there were few whiteflies in Eastern, Luapula and Northern Provinces (0.6, 0.3 and 0.2/plant, respectively). The generally low whitefly populations in the surveyed farmers' fields could be attributed to several factors, including an unfavorable climate with a long cool dry season (May to August) which is followed by a hot dry season (September to November). Cassava mosaic disease was prevalent with moderate to high incidence in many surveyed fields. The high disease incidence was mainly attributed to the use of highly susceptible local cultivars and CMD-affected planting materials, although the significant correlation between whitefly abundance and CMD incidence demonstrated in our study confirms that whiteflies are a key factor in spreading the viruses causing CMD in Zambia. These findings corroborate earlier reports on CMD in the country [51,53], but suggest that the situation has changed from the 1990s when CMD incidence and severity were reported to be low in Western Province [49].
Subgroup SSA1-SG1 was prominent in Western Province, and has been reported to be dominant in parts of East and Central Africa affected by severe CMD and CBSD pandemics [32,42,48]. Although SSA1-SG1 has previously been associated with abundant populations of B. tabaci on cassava in East and Central Africa, it has also been shown to occur more widely, including in West Africa as far as Liberia [21], where no unusual levels of abundance have been reported. B. tabaci whiteflies from cassava in countries immediately neighboring Zambia have been reported in recent years, such as SSA1-SG1 from eastern and south-eastern Democratic Republic of Congo (DRC), SSA1-SG1, SSA1-SG2 and SSA1-SG3 from Malawi, and SSA1-SG3 from Mozambique [21,54]. In Tanzania, which borders Zambia to the north-east, SSA1-SG2 and SSA1-SG3 were found in areas unaffected by the severe CMD pandemic [55]. It can be speculated that the lack of SSA1-SG1 in the Eastern Province of Zambia could be contributing to the low CMD incidence recorded in this and other studies [50]. The reason why SSA1-SG2 and SSA1-SG3 occur in Eastern Province while SSA-SG1 is absent is not known, as no physical barriers exist that might limit the spread of SSA1-SG1. No statistically significant correlation was detected between the whitefly subgroups and CMD. However, the lack of a clear correlation between CMD incidence and B. tabaci subgroups could be attributed to a small sample size of sequenced whiteflies that was obtained for both 2013 and 2015 and it would certainly be important for future studies to assess larger sample sizes to ensure adequate representation of all subgroups occurring on cassava in the country.
Genetic divergence analyses showed a low level of variation between SSA1-SG1, SSA1-SG2 and SSA1-SG3, which was comparable to other studies [42,55]. This small divergence suggests that they belong to the same genetic group, based on the widely applied 3.5% partial mtCOI gene sequence divergence threshold used for species-level designation within the B. tabaci species complex [19].
The predominant SSA1-SG1 haplotype in this study shared 100% sequence identity with an SSA1-SG1 haplotype from DRC [MF417582 (DRC-KICKAL1)] which was included here in the phylogenetic tree ( Figure 2). Although SSA1-SG2 and SSA1-SG3 were not as frequent as SSA1-SG1 in both 2013 and 2015, they contributed to the B. tabaci diversity within SSA1. In addition, the negative and significant values obtained for Fu's Fs statistic and Tajima's D signified demographic expansionary changes for SSA1-SG1 and SSA1-SG3. SSA1-SG1 has been reported to be the prominent haplotype on cassava in north-western Tanzania, DRC, Uganda and Central African Republic, which has also been associated with spread of CMD [20,55,56]. With the high CMD incidence reported in this study in both 2013 and 2015 and in other previous studies [50], it is likely that SSA1-SG1 could be driving the high CMD incidence reported in Western Province.
Knowledge of the genetic diversity of B. tabaci in Zambia is of great practical importance in the light of the recent outbreak of CBSD in the country. Changes in whitefly population abundance were the key driver of the severe CMD and CBSD pandemics in East and Central Africa [27,32]. It is therefore anticipated that any changes in B. tabaci and abundance will impact cassava farmers through the spread of CMD and CBSD and cause physical damage through sap sucking and the growth of sooty mold on heavily infested plants. This highlights the importance of directing increased educational programs toward the effective and sustainable management of B. tabaci whiteflies on cassava and the viruses that they transmit.

Conclusions
The study provides important information on the genetic diversity of B. tabaci in Zambia. The findings will lead to better understanding of the B. tabaci genotypes colonizing cassava in Zambia. Although our results indicate no strong association between CMD and SSA1 subgroups, partly due to a small sample size, the relatively high B. tabaci abundance in Western Province appears to be an important factor in the high level of CMD incidence observed there in both 2013 and 2015.
Supplementary Materials: The following are available online at http://www.mdpi.com/2075-4450/11/11/761/s1: Table S1: Symptoms observed on cassava varieties in different selected locations in Zambia in 2013 and 2015; Table S2: Whitefly abundance, CMD incidence and severity and genetic subgroup determined in Zambia in 2013; Table S3: Whitefly abundance, CMD incidence and severity and genetic subgroup determined in Zambia in 2015; Table S4: Estimate of evolutionary divergence (expressed as percent nucleotide divergence) between partial mitochondrial cytochrome oxidase I (mtCOI) sequence representatives of B. tabaci identified on cassava in Zambia as conducted using Tajima-Nei model in MEGA 7.