DNA Barcoding and Taxonomic Challenges in Describing New Putative Species: Examples from Sootywing and Cloudywing Butterflies (Lepidoptera: Hesperiidae)

DNA barcoding has resulted in the ‘discovery’ of a vast number of new species and subspecies. Assigning formal scientific names to these taxa remains a major challenge. Names sometimes are newly designated. Alternatively, available valid names can be resurrected from synonymy, based on barcode analyses together with classical taxonomic characters. For the most part, however, new putative species revealed by barcoding studies go undescribed. This situation is most often attributed to insufficient taxonomic expertise among the authors conducting the study, together with a critical lack of formally trained taxonomists. However, even with formal training, and additional supportive data from morphological, ecological or life history characters, other factors can arise that impede new species descriptions. In the present paper, several specific taxonomic challenges that have arisen from barcode analyses in two groups of skipper butterflies (Lepidoptera: Hesperiidae), the Sootywings (Pholisora catullus and P. mejicanus) and the Coyote Cloudywing (Achalarus toxeus), are highlighted and discussed. Both P. catullus and A. toxeus show relatively large intraspecific genetic divergences of barcodes (2–3%), which suggests the possibility of previously unrecognized cryptic speciation within each group. Some of the challenges to providing formal names and clarifying the taxonomic status of these cryptic taxa could be largely overcome by (1) barcoding type specimens, (2) clarifying imprecise and often vague or suspect type localities, and (3) conducting in-depth comparative studies on genitalic morphology.


Introduction
Formal taxonomic descriptions of newly discovered cryptic species are critical for understanding the extent of regional biodiversity and for informing management decisions for conservation [1–3].
Although not often the case, assigning a scientific name to a cryptic lineage revealed by DNA barcoding can be straightforward and relatively rapid. Examples from our own studies include using barcodes, together with previously overlooked differences in morphology, ecology or life history, to reinstate available names that had been synonymized with an earlier described species or subspecies [4–6]. In general, however, taxonomic studies resulting in new species descriptions lag far behind the vast number of cryptic lineages now being 'discovered' by barcoding. The most obvious reason for this backlog is a shortage of funding and scarcity of formally trained taxonomists to keep up with the workload generated by molecular studies, resulting in what has been called the 'taxonomic impediment' [2,7,8]. However, even with formal training, additional challenges in describing new species can sometimes arise. These include a lack of designated types and type localities, or a vague description of the type locality, in the original description of an already recognized species in a given taxon, as well as a lack of DNA barcode data for type specimens. This information is sometimes critical for making decisions on whether a new name is necessary for a newly discovered taxon, or whether a previously applied name is available. For example, many early descriptions of New World butterflies were made from the eighteenth to the early twentieth century, and it was not uncommon to find that a type specimen and type locality were not designated, or if they were designated, that the specimen was subsequently lost or the location of the type locality was suspect [9]. A further obstacle is that extremely broad geographic regions were often named as type localities (e.g., 'America', 'Mexico', etc.). Also, designated syntypes were sometimes chosen from individuals collected over wide geographic areas which are now known to include more than a single taxon of the species in question.
Diversity 2018, 10, 111
In the present study, genetic diversity in a 658 base pair (bp) segment beginning near the 5′ end of the mitochondrial cytochrome c oxidase subunit I gene (COI or cox1) [10] was analyzed in two groups of skipper butterflies (Lepidoptera: Hesperiidae), the Sootywings (Pyrginae: Pholisora) and Cloudywings (Eudaminae: Achalarus), based on both newly sequenced samples from southern Sonora, Mexico and sequences available in GenBank and the Barcode of Life Data System (BOLD) [11]. The limitations of using a single locus mitochondrial DNA marker in studies on molecular taxonomy and phylogeography have been discussed extensively [12,13], and possible introgressive hybridization, nuclear mitochondrial pseudogenes (numts) and contamination by symbiotic bacteria represent potential sources of error [14–20]. Barcodes alone, although not always successful in identifying species, have shown impressive success rates (>90%) in Lepidoptera [21–23], and have been particularly useful in providing important diagnostic characters to support the delimitation of new cryptic butterfly species [24–27]. Barcode analyses reported here reveal previously unreported cryptic diversity in both Sootywings and Cloudywings. Several challenges that arise when attempting to determine species boundaries and assign formal scientific names to the putative new taxa in these two groups of butterflies are discussed.

Samples
Most of the barcode sequences for the Hesperiidae analyzed here were taken from GenBank and BOLD databases (Table 1). New barcode sequences for Pholisora catullus (Fabricius) and Achalarus toxeus (Plötz) were obtained from specimens collected in southern Sonora at Guaymas/San Carlos and La Aduana near Alamos (Table 1). These specimens have been deposited in the insect collection at the Centro de Investigación en Alimentación y Desarrollo (CIAD), Guaymas, Sonora. Owing to occasional species misidentifications in both GenBank and BOLD [28], downloaded records were carefully checked, and if a particular species was clustered in a clade containing many individuals of a different species, it was omitted. Specimen photographs are often shown in BOLD to help verify identifications. Also, the original electropherograms are often included in BOLD, and can be compared with sequence files to correct for any suspect nucleotide calls [26,29].

Sootywings
The Common Sootywing, Pholisora catullus, is widely distributed in the USA, southern Canada and throughout much of Mexico [31]. Larvae feed on a variety of weedy and garden plants in the family Amaranthaceae, including Amaranthus, Chenopodium, and various other related genera, and the butterfly is often found flying in disturbed habitats [32]. Pholisora catullus occurs throughout much of the state of Sonora, Mexico, especially along the coastal plain of the Gulf, where it is commonly encountered [33,34] (see Figure 1a-h). Potential host plants, Amaranthus, Chenopodium and Chenopodiastrum, are common near Guaymas [35], but specific host(s) remain to be determined.
A closely related species, the Mexican Sootywing Pholisora mejicanus (Reakirt), is also widely distributed in Mexico (except for the Baja California Peninsula and Sonora), but shows a more restricted distribution in the USA (Colorado, New Mexico, Arizona and Kansas) [31,34,36]. As with its congener, P. mejicanus can be found in open and disturbed areas, and larvae are known to feed on Amaranthus and Chenopodium [37]. Pholisora mejicanus is very similar to P. catullus, but can be distinguished by differences in morphology of male genitalia [38] and by the jet-black veins on the underside of the hindwings [31] (see Figure 1i).
Figure 1. [...] See Table 1 for collection data. All specimens are from COI haplotype group 1, except (c) CIAD 14-EP15 (haplotype group 3). The ventral surface of live adults (not barcoded) of P. catullus from Sonora, Mexico (h) and P. mejicanus from Torrance Co., New Mexico, USA (i); photo courtesy of Ken Kertell, shown to illustrate the diagnostic black veins on the hindwings of P. mejicanus.

Cloudywings
The Coyote Cloudywing, Achalarus toxeus, is broadly distributed in North America, found from west Mexico and southern Texas south to Panama [31]. Achalarus toxeus is morphologically very similar to, and extremely difficult to distinguish from, two of its congeners that also occur in this region, A. albociliatus (Mabille) and A. jalapus (Plötz) [31,39–41]. The three species, however, can easily be distinguished by their DNA barcodes and by genitalic examination of males [40]. In addition, males of A. albociliatus lack the costal fold present in the other two species [42]. Preliminary barcode studies indicated substantial genetic divergence (~3%) between populations of A. toxeus from Costa Rica and northwestern Mexico, suggesting that at least two cryptic species are presently assigned to this taxon [41]. The northwestern Mexico populations were provisionally referred to as Achalarus sp. cf. toxeus. Barcodes from additional specimens recently obtained from southern Sonora, Mexico are analyzed here (Figure 2; Table 1) and provide additional insight into the findings obtained from the earlier study [41].
Host plants for larvae of A. toxeus from Costa Rica are Calliandra tergemina and Sphinga platyloba (Fabaceae) [43]. Texas ebony (Ebenopsis ebano, Fabaceae) is the larval foodplant in southern Texas [39]. The larval host(s) for Achalarus sp. cf. toxeus in Sonora is unknown, but there are several species of fairy dusters (Calliandra spp.) in northwestern Mexico, in addition to various other species of Fabaceae, that might be used [44].
Figure 2. [...] See Table 1 for collection data. Specimens (a,b) from a previous study [41] possess COI haplotype A; the remaining specimens show haplotype B (Table 1).

DNA Extraction and Sequence Analysis
Total genomic DNA was extracted from two legs of each butterfly using the DNeasy™ (QIAGEN Inc., Valencia, CA, USA) protocol. The polymerase chain reaction (PCR) was used to amplify the COI barcode segment [11] with primers LCO1490f and HCO2198r using standard PCR conditions [45]. Sequencing reactions were performed on an Applied Biosystems (Foster City, CA, USA) ABI 3730XL DNA sequencer at the Laboratorio Nacional de Genómica para la Biodiversidad (LANGEBIO) core DNA sequencing facility in Irapuato, Guanajuato, Mexico using the PCR primers. Translation of the gene segment in MEGA version 5.0.5 [46] revealed no frameshifts or stop codons. The combined G + C nucleotide content ranged from 28.7% to 30.3%. These results suggest that the sequences for both species represent mtDNA, and are not numts, which have been reported for the COI gene in insects [15]. GenBank accession numbers for the new COI sequences for P. catullus and A. toxeus from Sonora are found in Table 1. The 658 bp COI segment corresponds to nucleotide positions 1516 to 2173 in the complete mitochondrial genome of the monarch butterfly, Danaus plexippus (GenBank KC836923) [47].
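The two sequence checks described above (absence of in-frame stop codons and G + C content) can be illustrated with a short Python sketch. This is not the MEGA procedure used in the study. The function names, the toy sequence, and the decision to pass the reading-frame offset as a parameter are assumptions made for illustration only. Stop codons follow the invertebrate mitochondrial genetic code, in which TAA and TAG terminate translation.

```python
# Stop codons in the invertebrate mitochondrial genetic code
# (NCBI translation table 5): TAA and TAG. TGA codes for Trp.
STOP_CODONS = {"TAA", "TAG"}


def gc_content(seq: str) -> float:
    """Percentage of unambiguous bases (A, C, G, T) that are G or C."""
    called = [b for b in seq.upper() if b in "ACGT"]
    if not called:
        raise ValueError("sequence has no unambiguous bases")
    return 100.0 * sum(b in "GC" for b in called) / len(called)


def stop_codon_positions(seq: str, frame_offset: int = 0) -> list[int]:
    """Return 1-based start positions of in-frame stop codons.

    frame_offset (0, 1 or 2) is the number of leading bases to skip
    before the first complete codon; it is an illustrative parameter,
    since the correct frame depends on where the fragment starts.
    """
    seq = seq.upper()
    hits = []
    for i in range(frame_offset, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            hits.append(i + 1)
    return hits


# Toy 9 bp sequence (hypothetical, not taken from the study).
toy = "ATTAATGCA"
print(round(gc_content(toy), 1))        # G + C percentage
print(stop_codon_positions(toy, 0))     # codons ATT, AAT, GCA
print(stop_codon_positions(toy, 2))     # codons TAA, TGC
```

A barcode that reads cleanly in exactly one frame and has G + C content in the range typical of insect mtDNA is consistent with a mitochondrial origin, although neither check alone rules out a numt.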
Haplotype networks of COI sequences in Pholisora were constructed using statistical parsimony implemented in TCS version 1.21 [48]. The first nucleotide was deleted in all 658 bp barcodes to match the four 657 bp sequences of P. catullus from California available in BOLD (see Table 1). The connection limit among haplotypes was set to a default value of 95%. Relationships among Pholisora sequences were also assessed with Bayesian inference implemented in MrBayes version 3.1 [49]. Bayesian analysis was conducted as described previously [41] after determining the most appropriate nucleotide substitution model. Outgroups were the hesperiids Hesperopsis libya (GenBank KP895740), Staphylus ceos (KY019902) and Pyrgus communis (AF170857). Clade support was estimated utilizing a Markov chain Monte Carlo (MCMC) algorithm and expressed as posterior probabilities. For barcodes from the divergent populations of A. toxeus from Sonora and Costa Rica, a median-joining haplotype network was constructed in POPART ver. 1.7 [50], and a Bayesian tree was constructed as described above using P. communis and Urbanus proteus (HM905345) as outgroups.
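The trimming step described above, followed by grouping identical sequences into haplotypes, can be sketched in Python as follows. This is a minimal illustration, not the TCS workflow itself. The function names and the toy sequences are hypothetical, and the sketch assumes (as stated for the California sequences) that shorter sequences are missing only leading bases.

```python
from collections import defaultdict


def trim_to_common_length(seqs: dict[str, str], target_len: int) -> dict[str, str]:
    """Drop leading bases so that every sequence has target_len bases.

    Assumes the shorter sequences lack only bases at the 5' end,
    so removing leading bases keeps all sequences aligned.
    """
    trimmed = {}
    for name, seq in seqs.items():
        if len(seq) < target_len:
            raise ValueError(f"{name} is shorter than {target_len} bp")
        trimmed[name] = seq[len(seq) - target_len:]
    return trimmed


def collapse_haplotypes(seqs: dict[str, str]) -> dict[str, list[str]]:
    """Group specimen names by identical sequence (one key per haplotype)."""
    groups = defaultdict(list)
    for name, seq in seqs.items():
        groups[seq.upper()].append(name)
    return dict(groups)


# Toy data (hypothetical): two 5 bp sequences and one 4 bp sequence.
toy = {"spec_a": "GACGT", "spec_b": "ACGT", "spec_c": "TACGA"}
aligned = trim_to_common_length(toy, target_len=4)
print(collapse_haplotypes(aligned))
```

For the data set described here, `target_len` would be 657, so that each 658 bp barcode loses its first nucleotide.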

Sootywings
The TCS haplotype network of Pholisora barcodes (Figure 3a), which included new sequences of P. catullus from southern Sonora, Mexico (n = 7; Figure 1), together with barcodes for P. catullus and P. mejicanus sourced from GenBank and BOLD (total n = 31; Table 1), revealed a single network and four distinct haplotype groups (groups 1 to 4). Most (n = 17) of the sequences for P. catullus were obtained from a single locality (Catalina State Park near Tucson) in Pima County, Arizona. Figure 3a shows that all but two of the Arizona sequences, and all but one of the sequences from Sonora, resolved in haplotype group 1. The common haplotype of group 1 (haplotype 1a) was shared among Sonora and Arizona populations. The three outlier sequences, along with one from Texas, were identical (haplotype group 3) and were separated by ten mutational steps from haplotype group 1. Four GenBank sequences of P. catullus from southern California clustered in haplotype group 2, separated by a minimum of seven mutational steps from haplotype group 1. Two complete barcodes for P. mejicanus from BOLD resolved in haplotype group 4 and were separated from haplotype group 1 by a minimum of six mutational steps. Uncorrected mean genetic distance (p-distance) among the four haplotype groups ranged from 1.6 to 2.0%; minimum p-distance among groups ranged from 0.9 to 1.8% (inset in Figure 3a). Maximum p-distances within groups were 1.4% (group 1), 0.5% (group 2), 0.0% (group 3) and 0.2% (group 4). An additional 658 bp barcode for P. mejicanus was available in BOLD (NGSFT3796-16), but was not trimmed to 657 bp and used for the haplotype network because it was missing data at 94 of the 658 nucleotide sites (Table 1).
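The uncorrected p-distance reported above is the proportion of compared nucleotide sites at which two aligned sequences differ. A minimal Python sketch of the pairwise calculation, and of the mean and minimum distance between two groups, is given below. The function names and toy sequences are hypothetical, and skipping sites with missing or ambiguous bases (pairwise deletion) is an assumption; the source does not state how such sites were treated.

```python
from itertools import product

VALID = set("ACGT")


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguous base are
    skipped (pairwise deletion; an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in VALID and y in VALID:
            compared += 1
            if x != y:
                diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def between_group_distances(g1: list[str], g2: list[str]) -> tuple[float, float]:
    """Return (mean, minimum) p-distance over all between-group pairs."""
    dists = [p_distance(a, b) for a, b in product(g1, g2)]
    return sum(dists) / len(dists), min(dists)


# Toy example (hypothetical 4 bp sequences).
mean_d, min_d = between_group_distances(["AAAA", "AAAT"], ["AATT"])
print(f"mean = {100 * mean_d:.1f}%, minimum = {100 * min_d:.1f}%")
```

On a 657 bp alignment, each mutational step in the network corresponds to roughly 0.15% p-distance, which is why groups separated by six to ten steps show minimum distances on the order of 1 to 1.5%.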
The Bayesian tree based on Pholisora barcodes (Figure 3b) generally showed the same resolution of the four haplotype groups as seen in the haplotype network. In addition, the sequence from P. mejicanus that was missing data (see above) clustered with the other two complete sequences from this species in haplotype group 4 (not shown). Statistical support for nodes was high for haplotype groups 2, 3 and 4, but low for group 1. The same resolution of clades and similar support values were found in a Maximum Parsimony tree constructed in MEGA using a matrix of uncorrected p-distances (not shown). Although only a single haplotype was found in the four individuals in group 3 with complete barcodes from Sonora, Arizona and Texas (Figure 3a), it is labeled as a haplotype 'group' based on the results of a query of a Sonora sequence (CIAD 14-EP15) in BOLD. That query yielded a tree-based identification showing 19 additional sequences not publicly available, with a variety of presumed haplotypes, from specimens collected over a wide geographic area (western Canada [British Columbia], western USA [Texas, Arizona, Colorado, Wyoming] and northeastern Mexico [Tamaulipas]), clustering in a single clade that included haplotype 3. Haplotype groups 1 and 3 of P. catullus were found sympatrically, and were flying on the same dates, both in southern Arizona (11 August 2011) and Sonora, Mexico (24 September 2014) (Table 1). Haplotype group 1 predominated in samples from both localities (16 of 18 individuals in Arizona and 6 of 7 individuals in Sonora) and to date has not been found elsewhere (Table 1).
The four haplotype groups of Pholisora also showed apparent fixed differences in single nucleotide polymorphisms (SNPs) (Table 2). Pholisora mejicanus (group 4) differs from the three groups of P. catullus at a single site (no. 43). A single SNP is diagnostic for group 1, and two diagnostic SNPs are seen in groups 2 and 3. Larger sample sizes of groups 2, 3 and 4, however, will be required to confirm whether these differences are robust.
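One way to search an alignment for fixed differences of this kind is sketched below in Python. The definition used here is an assumption of the sketch: a site is called diagnostic for a group when every sequence in that group carries the same base and that base occurs in no sequence of any other group. The function name and toy sequences are hypothetical and do not reproduce Table 2.

```python
def diagnostic_sites(groups: dict[str, list[str]]) -> dict[str, list[tuple[int, str]]]:
    """Find fixed, group-specific bases in aligned sequences.

    Returns, for each group, a list of (1-based site, base) pairs
    where the base is fixed within the group and absent elsewhere.
    """
    length = len(next(iter(groups.values()))[0])
    result = {name: [] for name in groups}
    for pos in range(length):
        # Set of bases observed at this site in each group.
        bases = {
            name: {seq[pos].upper() for seq in seqs}
            for name, seqs in groups.items()
        }
        for name, own in bases.items():
            if len(own) != 1:
                continue  # not fixed within the group
            (base,) = own
            if base not in "ACGT":
                continue  # ignore gaps and ambiguous calls
            others = set().union(*(b for n, b in bases.items() if n != name))
            if base not in others:
                result[name].append((pos + 1, base))
    return result


# Toy alignment (hypothetical 4 bp sequences in three groups).
toy_groups = {
    "group1": ["ACGT", "ACGT"],
    "group2": ["ACGA", "ACGA"],
    "group3": ["TCGA", "TCGA"],
}
print(diagnostic_sites(toy_groups))
```

With only two to four sequences in a group, a site can appear fixed by chance, which is the reason for the caution about sample size expressed above.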
Barcode analysis suggests that the taxonomy of Pholisora may not be as clear-cut as previously thought. Pholisora catullus and P. mejicanus, both currently recognized as valid taxa [9], show relatively small genetic divergences in COI barcodes (p-distance ~2%; Figure 3a) and are connected in a single TCS haplotype network. In addition, P. catullus consists of at least three populations showing intraspecific divergences similar to the interspecific differences between P. catullus and P. mejicanus. Possible specimen misidentification of P. mejicanus available in BOLD is unlikely. The two complete (658 bp) barcodes of this species were high-quality electropherograms obtained from specimens collected by J.M. Burns, a taxonomic specialist on the Hesperiidae. The very short sequence (208 bp) of P. mejicanus, although not used for the haplotype network or the phylogenetic tree, clustered in haplotype group 4 in preliminary trees (not shown), and the specimen photograph in BOLD clearly showed the characteristic black veins on the ventral hindwing that distinguish this species.
The similarity in genetic divergences of COI barcodes among the four haplotype groups comprising two recognized species raises the possibility of three cryptic lineages of Pholisora in southwestern USA and northwestern Mexico that are currently placed in a single taxon, P. catullus. The taxonomic uncertainty that arises is whether to recognize the three cryptic lineages as distinct species or, alternatively, whether they should be named as subspecies of P. catullus. The latter option raises the possibility that mejicanus should also be placed as a subspecies of P. catullus. Pholisora catullus and P. mejicanus, however, have been collected flying together in central Colorado [51]. The observation that the two species are sympatric, together with differences in ventral hindwing vein color, genitalic differences, and barcode differences, suggests that they are indeed distinct species. To date, only a single subspecies of P. catullus has been formally proposed, P. catullus crestar J. Scott & Davenport, for localized populations from the southern Sierra Nevada of California [52]. However, in the revised online edition of 'A Catalogue of the Butterflies of the United States and Canada' [9] (see References for URL), the name crestar is placed as a junior subjective synonym of P. catullus. It should be added that barcodes were not reported for 'crestar' [52] and thus it is unclear whether this population should be placed with haplotype 2 of P. catullus from San Diego County, California (Figure 3). The only reported diagnostic character of 'crestar' was a submarginal line of small white spots on the dorsal hindwing, but this character is of limited taxonomic usefulness, being found at low frequency in other populations throughout the USA [52].
Confirmation of additional species-level (or subspecies-level) taxa currently assigned to P. catullus, and any change to the taxonomic status of P. mejicanus, will first require further molecular, morphological and ecological studies with larger sample sizes. Especially important will be thorough comparative studies of male genitalia in the different haplotype groups, given the importance of this character in separating P. mejicanus and P. catullus. In addition, barcodes from the type specimen of P. catullus could potentially provide important baseline data for future taxonomic studies [53,54]. However, the type(s) of P. catullus collected from a locality named 'Indiis' are probably lost [9]. Some have suggested that the type locality is 'probably Georgia' [32,55]. If a type locality can be fixed with reasonable certainty, neotypes could be selected, genitalia examined and barcodes determined and compared with those presented here. Unfortunately, there are no published barcodes for specimens of P. catullus from the eastern USA, including Georgia, currently deposited in GenBank or BOLD.

Cloudywings
As mentioned earlier, a COI genetic divergence between populations of Achalarus sp. cf. toxeus from Sonora (haplotype A, only) and A. toxeus from Costa Rica of ~3% (p-distance = 2.9%; K2P distance = 3.0%) was found in a previous barcoding study [41]. A query of Sonora haplotype A in BOLD showed that it clustered with barcodes (not publicly available) found in three specimens from the state of Jalisco in western Mexico, with 99.85% sequence identity, and that an additional specimen from Jalisco and Sinaloa clustered with the Costa Rica haplotype CR, suggesting two distinct but sympatric cryptic species of Achalarus toxeus are found in these two Mexican states [41]. Barcodes from the seven new specimens from San Carlos, Sonora possessed a unique haplotype (haplotype B) which differed at six nucleotide sites from Sonora haplotype A, as can be seen on a median-joining haplotype network (Figure 4a). The mean (and minimum) genetic divergence (p-distance) between haplotypes A and B was 0.9%. These results, as well as subtle color differences (Figure 2), suggest incipient speciation may be occurring among Sonora populations, but additional molecular, morphological and ecological studies on larger sample sizes of both haplotypes will be needed for confirmation. Here the combined Sonora sample is treated as a single undescribed species, Achalarus sp. cf. toxeus. The mean (and minimum) p-distance between haplotype B and the Costa Rica population (haplotype CR) was 2.9%, the same result as between haplotype A and Costa Rica. Table 3 shows the 16 diagnostic barcode nucleotides that distinguish the Sonora (both A and B haplotypes) from Costa Rica samples of A. toxeus. These diagnostic nucleotides represent important characters [24,27] for future taxonomic studies on this group.
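The K2P (Kimura 2-parameter) distance quoted above corrects the raw p-distance for multiple substitutions at the same site by treating transitions and transversions separately: d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions. A minimal Python sketch follows. The function name and toy sequences are hypothetical, and skipping sites with gaps or ambiguous bases is an assumption of the sketch rather than a documented setting of the original analysis.

```python
import math

# Transitions are purine-purine (A<->G) or pyrimidine-pyrimidine (C<->T) changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(a: str, b: str) -> float:
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = ts = tv = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguous bases (assumption)
        compared += 1
        if x == y:
            continue
        if (x, y) in TRANSITIONS:
            ts += 1
        else:
            tv += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = ts / compared  # proportion of transitions (P)
    q = tv / compared  # proportion of transversions (Q)
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))


# Toy example (hypothetical): one transition in ten sites.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

At low divergence the correction is small, which is consistent with the p-distance of 2.9% and K2P distance of 3.0% differing by only 0.1 percentage points.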
The Bayesian tree based on barcodes from A. toxeus from Sonora and Costa Rica (Figure 4b) showed the same resolution of the three haplotypes (A, B and CR) as seen in the haplotype network. In addition, the distinct clades of two other cryptic species of Achalarus mentioned earlier, A. jalapus and A. albociliatus, are included for comparison. The same resolution of Achalarus clades, and high clade-support values, were also found in a Maximum Parsimony tree (not shown).
Nucleotide sites 212 and 407 are 1st codon position substitutions; the remainder are 3rd position. * Single barcodes < 658 bp from Costa Rica/GenBank No. (see Table 1).
Determining a formal scientific name to apply to Achalarus sp. cf. toxeus is especially challenging given the imprecise type locality information and brief species diagnoses in the original descriptions of Aethilla (=Achalarus) toxeus [56] and Murgaria (=Achalarus) albociliata var. nigrociliata [57], the latter now placed as a junior synonym of A. toxeus [9]. The type locality of A. toxeus was listed only as 'Mexico', without further locality details [56]. The type locality of A. nigrociliata was also not specific, listed only as 'Mexique' (Mexico) [57]. A specific type locality, however, was given in the description of Eudamus (=Achalarus) coyote (TL: 'Southern Texas') [58], a name also currently placed as a synonym of A. toxeus [9]. Later, the type locality of coyote was further restricted to Aaron, near Corpus Christi, Texas [59]. Thus, of the two potentially available names for newly discovered species in the Achalarus toxeus complex, only A. coyote is accompanied by a specific type locality. A large data set of genitalic dissections from A. toxeus collected by other workers from throughout Mexico is now available [60], and these data, together with planned barcode analyses of these additional specimens, will hopefully clarify which, if any, of the available names should be applied to new species of the A. toxeus complex from North America. It is also worth noting that the type specimen of A. toxeus, deposited in the Berlin Natural History Museum in Germany, is a female [42]. There is, however, no superficial character to reliably separate females of A. toxeus from A. albociliatus (or from A. jalapus) [42]. Because a COI barcode is not available for the type specimen, it is possible that Plötz [56] may have examined any one of the several cryptic species of Achalarus found in Mexico in his description of A. toxeus in 1882. This uncertainty highlights both the utility of determining barcodes for historical type material [53,54] and the challenges encountered in formally naming new cryptic species revealed by DNA barcoding.

Concluding Remarks
Although the precise number of cryptic species across animal taxa is unknown [61], there are probably thousands of species that have been delimited by molecular data alone, but that have not been formally described [62]. In butterflies, an early and now classic, but controversial, example of underestimated biodiversity revealed by barcoding is the highly cited (presently 2685 citations returned by Google Scholar) 'ten species in one' paper on the hesperiid Astraptes fulgerator (Eudaminae) [63]. Because the ten species of A. fulgerator were inferred from their COI genetic divergences, but not formally described, that paper also highlights several of the taxonomic challenges in cryptic taxa that I have presented here. Several years after 'ten species in one' appeared, Brower [24] provided formal names for each species based on COI diagnoses alone. Although not considered ideal [24], at least each putative taxon now had a scientific name associated with it. However, as with Achalarus toxeus, several names had been previously proposed for members of the Astraptes fulgerator group, and it was uncertain if any of the newly proposed names would eventually end up as synonyms. I have chosen a more conservative approach here by not providing a formal scientific name for Achalarus sp. cf. toxeus, but with the disadvantage that this provisional name is also less than ideal. Clearly, much work remains to be done before taxonomic impediments to providing valid scientific names for the ever-increasing number of cryptic species can be overcome.
Funding: This research was funded by the Consejo Nacional de Ciencia y Tecnología (CONACYT) grant CB-180385 to Dr. Therese Ann Markow, and funds from LANGEBIO and the Centro de Investigación en Alimentación y Desarrollo (CIAD).

Figure 3. (a) TCS haplotype network of COI barcodes in Pholisora showing clustering of haplotypes into four groups. Different haplotypes in each group are identified with letters corresponding to those shown in Table 1. Numbers next to circles represent the number of individuals with that haplotype, if greater than one. Each line segment between haplotypes represents a single mutation. Inferred intermediate haplotypes that were not sampled are shown as black dots. Size of the circles is scaled approximately to haplotype frequency. State abbreviations: SON, Sonora (Mexico); AZ, Arizona; CA, California; TX, Texas; NM, New Mexico. Table inset shows mean pairwise p-distances (%) among groups 1 to 4 below the diagonal; shaded values shown along the diagonal are mean within-group p-distances; minimum pairwise p-distances among groups are shown above the diagonal. (b) Bayesian 50% majority rule consensus tree of barcode sequences showing clustering of the four COI haplotype groups (outgroups not shown). Clade support values (posterior probabilities) for the major clades are shown adjacent to branches. The scale represents expected substitutions per site.
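The p-distance summaries described for the table inset of Figure 3 can be illustrated with a minimal sketch. It assumes pre-aligned sequences of equal length and skips sites with gaps or ambiguous bases; the function names and the 20 bp toy sequences are hypothetical, and this is not the software used in the study.

```python
from itertools import combinations, product

VALID_BASES = set("ACGT")

def p_distance(seq1, seq2):
    """Uncorrected p-distance (%) between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (an assumption of this sketch).
    """
    compared = differing = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a in VALID_BASES and b in VALID_BASES:
            compared += 1
            differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * differing / compared

def mean_within(group):
    """Mean pairwise p-distance (%) within one haplotype group."""
    pairs = list(combinations(group, 2))
    if not pairs:
        return 0.0
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)

def between(group1, group2):
    """Return (mean, minimum) pairwise p-distance (%) between two groups."""
    dists = [p_distance(a, b) for a, b in product(group1, group2)]
    return sum(dists) / len(dists), min(dists)

# Hypothetical toy alignment; real COI barcodes are 658 bp.
group1 = ["ACGTACGTACGTACGTACGT", "ACGTACGTACGTACGTACGA"]
group2 = ["ACGTTCGTACGAACGTACGT", "ACGTTCGTACGAACGTACGT"]

within1 = mean_within(group1)
mean_between, min_between = between(group1, group2)
print(within1, mean_between, min_between)
```

In a matrix laid out like the inset, `mean_between` would fill the cell below the diagonal, `within1` the diagonal cell, and `min_between` the cell above the diagonal.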

Figure 4. (a) Median-joining haplotype network of the 658 bp COI barcode segment in Achalarus toxeus showing relationships among haplotypes A (n = 2) and B (n = 7) from Alamos and San Carlos, Sonora, and haplotype CR (n = 3) from the Area de Conservación Guanacaste (ACG), Costa Rica. Sizes of the circles are proportional to sample size. Each hatch mark on lines connecting haplotypes represents a single nucleotide substitution. The small black circle represents an inferred intermediate haplotype. (b) Bayesian 50% majority rule consensus tree of barcode sequences of the three haplotypes of A. toxeus (A, B and CR), and the partitioning of clades of two other cryptic species in this complex, A. jalapus and A. albociliatus (see text). Clade support values (posterior probabilities) for the major clades are shown on branches. The scale represents expected substitutions per site. Abbreviations: SON, Sonora (Mexico); CR, Costa Rica.

Table 1. GenBank and BOLD COI barcode data for Pholisora catullus, P. mejicanus and Achalarus spp. analyzed in this study.
a Haplotype was also inferred from the two shorter sequences of A. toxeus from Costa Rica, although both were missing several diagnostic sites (see text). b From Pratt et al. [30]. c Collection date of reared larva; other dates in Costa Rica are adult eclosion dates from reared larvae.

Table 2. Variable nucleotide sites in the 658 bp COI barcode segment showing apparent fixed differences among samples of Pholisora catullus and P. mejicanus. For P. mejicanus at site 448, n = 2; for P. catullus at site 634, n = 21 for group 1 and n = 4 for group 2. Nucleotides shown in red are diagnostic for the haplotype group/species indicated. All substitutions are third codon position transitions, except at site 542 (first codon position).

Table 3. Diagnostic nucleotide differences in COI barcodes between Sonora (Son; haplotypes A and B; shaded) and Costa Rica (CR) specimens of Achalarus toxeus.
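As an illustration of what a diagnostic site means in Tables 2 and 3, the following minimal sketch lists alignment positions that are fixed within each of two groups but differ between them. It assumes aligned sequences of equal length without gaps; the function name and the 6 bp toy sequences are hypothetical and are not data from this study.

```python
def diagnostic_sites(group_a, group_b):
    """Return (site, base_a, base_b) for each 1-based alignment site that is
    invariant within each group but differs between the two groups."""
    sites = []
    columns = zip(zip(*group_a), zip(*group_b))  # walk the alignment column by column
    for site, (col_a, col_b) in enumerate(columns, start=1):
        bases_a, bases_b = set(col_a), set(col_b)
        if len(bases_a) == 1 and len(bases_b) == 1 and bases_a != bases_b:
            sites.append((site, col_a[0], col_b[0]))
    return sites

# Hypothetical toy alignments standing in for two geographic samples.
sonora = ["ACGTAC", "ACGTAT"]
costa_rica = ["ATGTAC", "ATGTAC"]

fixed = diagnostic_sites(sonora, costa_rica)
print(fixed)
```

In the toy data, site 6 varies within the first group and is therefore not reported, which mirrors the distinction between variable sites and apparent fixed differences drawn in the table captions.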