Phylogenetic Assessment of Freshwater Mussels Castalia ambigua and C. inflata at an Ecotone in the Paraguay River Basin, Brazil Shows That Inflated and Compressed Shell Morphotypes Are the Same Species

The phylogeny and taxonomy of freshwater mussels of the genus Castalia in South America are complicated by morphological plasticity and limited molecular genetic data. We present field data on the distributions of the nominal Castalia ambigua and C. inflata in the upper Paraguay River basin in Brazil based on original occurrence data at 23 sample sites and on historical records. The upper basin has distinct highland and lowland regions, the latter including the Pantanal wetland; “C. ambigua” occurs in the highlands and “C. inflata” occurs in both regions. At Baixo Stream in the highlands, we observed individuals with shell morphologies of either C. ambigua or C. inflata, and also individuals with intermediate shell morphology. DNA sequence variation was screened in the upland Baixo Stream population and in two representative lowland populations. Two mitochondrial and three nuclear genes were sequenced to test hypotheses regarding the number of species-level phylogenetic lineages present. Previously reported DNA sequences from Amazon-basin C. ambigua, other Castalia species, and outgroup species were included in the analysis. Individuals from the Paraguay River basin exhibited 17 haplotypes at the mitochondrial cytochrome oxidase I (COI) gene and nine at mitochondrial 16S rRNA. Analysis of haplotype networks and phylogenetic trees of combined COI + 16S rRNA sequences among individuals with the respective shell morphologies supported the hypothesis that C. ambigua and C. inflata from the Paraguay River basin belong to the same species and one phylogenetic lineage. No variation was observed at the nuclear 18S rRNA internal transcribed spacer, 28S rRNA, or H3NR histone genes among individuals used in this study. Across all markers, less variation was observed among Paraguay basin populations than between Paraguay and Amazon basin populations. Our results collectively suggest that: (1) “C. ambigua”, “C. inflata”, and morphologically intermediate individuals within the upper Paraguay drainage represent one phylogenetic lineage, (2) a phylogeographic divide exists between Castalia populations occurring in the Paraguay and Amazon River basins, and (3) the evolutionary and taxonomic uncertainties that we have identified among Castalia species should be thoroughly assessed across their distribution using both morphological and molecular characters.

long been recognized [25], with upland forms having thinner shells and flatter cross-sections and lowland forms having heavier shells and more inflated cross-sections. These considerations led us to frame two alternative hypotheses: (1) Castalia ambigua and C. inflata are two distinct phylogenetic lineages, perhaps with interbreeding at a zone of contact in the highlands of the Paraguay basin, and (2) Castalia ambigua and C. inflata represent one phylogenetic lineage, with the morphological variation being the consequence of morphological plasticity common along riverscapes in freshwater mussels. We tested these hypotheses by screening both mitochondrial and nuclear DNA markers within this upland population and two representative populations of C. inflata from the lowlands. We used published mitochondrial DNA sequences of C. ambigua from the Amazon basin to compare populations across a major geographic divide to better understand intraspecific variation and the phylogeographic history of the species. Using mitochondrial data, we also assessed the demographic histories of C. ambigua lineages.

Study Area and Sample Collection
The upper Paraguay River basin, with an area of 365,592 km² [26], extends from the midwest of Brazil to Argentina, draining Bolivia, Paraguay, and Uruguay (Figure 1A). This riverine system connects two distinct landscapes: the highlands, which hold most of the headwaters, and the lowlands, which are marked by the seasonal flood pulse that characterizes the Pantanal, one of the world's largest continuous wetlands [24]. Freshwater mussels were sampled in the sub-basins of the Cuiabá River and the Paraguay River, in the upper Paraguay basin within Mato Grosso state, Brazil. We implemented nested hierarchical sampling along the hydrological gradient from highlands to lowlands, with 70 sites sampled in total (Figure 1B inset map). The capture effort was standardized at 1 person-hour sampling quadrats within transects. Individual mussels were collected, measured, tagged, and returned to the same place in the river or lake.

Biometrical Measurements
All C. ambigua (n = 200) and C. inflata (n = 557) individuals were measured with a digital caliper at a resolution of 0.01 mm. The characters measured were total length (Lt, mm), height (h, mm), and width (wi, mm). To characterize groups, all data were standardized to a mean of zero and a standard deviation of 1, which places the variables on the same scale of variation, as suggested by Gotelli and Ellison [27]. The data were then ordinated by Principal Components Analysis (PCA) using the software R [28]. For a graphical representation of the PCA results, we used the autoplot function of the package "ggfortify" [29].
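As an illustration of the standardization and ordination steps described above, the following minimal Python sketch scales three shell variables to mean 0 and standard deviation 1 and extracts the first principal component by power iteration on the correlation matrix. It is not the authors' R workflow; the measurements and all function names are hypothetical, and the sample standard deviation (n − 1 denominator) is an assumption.

```python
from statistics import mean, stdev

def standardize(columns):
    """Scale each variable to mean 0 and sample standard deviation 1."""
    out = []
    for col in columns:
        m, s = mean(col), stdev(col)
        out.append([(v - m) / s for v in col])
    return out

def correlation_matrix(z):
    """Correlation matrix computed from standardized variables (list of columns)."""
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]

def first_component(corr, iterations=500):
    """Power iteration: leading eigenvector (PC1 loadings) and its eigenvalue."""
    p = len(corr)
    v = [1.0 / p ** 0.5] * p
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p)) for i in range(p))
    return v, eigenvalue

# Hypothetical shell measurements in mm (length, height, width); not study data.
length = [27.1, 30.4, 39.0, 45.2, 25.0, 33.3]
height = [24.3, 27.0, 37.5, 41.0, 22.8, 30.1]
width = [18.5, 20.1, 19.8, 22.3, 17.2, 19.0]

z = standardize([length, height, width])
corr = correlation_matrix(z)
loadings, eigenvalue = first_component(corr)
# The trace of a correlation matrix equals the number of variables,
# so eigenvalue / p is the share of variance explained by PC1.
explained = eigenvalue / len(corr)
print(f"PC1 explains {explained:.1%} of the variance; loadings = {loadings}")
```

Because the variables are standardized first, the ordination is driven by correlations among the measurements rather than by their raw magnitudes, which is why the choice of scaling matters when interpreting which variable contributes most.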

Molecular and Phylogenetic Analyses
After observing individuals with shell morphologies intermediate between C. ambigua and C. inflata at Baixo Stream in the highlands, we resolved to test alternative phylogenetic hypotheses by screening mitochondrial and nuclear DNA markers in individuals from this stream and from other populations. Samples for molecular genetic and phylogenetic analyses were collected by hand at the three sites (Table 1, red numbers in Figure 1B inset map). Castalia inflata was collected at all three sites. At Baixo Stream, 19 mussels were classified as C. ambigua in the field, 44 as C. inflata, and 8 as "C. ambigua-like", i.e., morphologically intermediate between C. ambigua and C. inflata. Thus, "C. ambigua-like" individuals were collected only at Baixo Stream. After gently prying each mussel partially open, the tip of sterilized scissors was used to non-lethally remove a small sample of mantle tissue [30]. The tissue sample was fixed in 70% ethanol at ambient temperature. Samples were transported for subsequent analysis at Virginia Tech, Blacksburg, VA, USA.
Figure 1. (A) Orange symbols denote C. ambigua and blue denote C. inflata. The Amazon basin is to the north, and the Paraguay basin to the south. References: Bonetto 1961 [5], Bonetto 1962 [31], Bonetto 1965 [11], Mansur 1970 [7], Serrano et al. 1998 [32], Castillo 2007 [33], Wantzen et al. 2011 [34], Colle and Callil [22], Beasley 2001 [35], Vale et al. 2005 [6], Rumi et al. 2008 [12], Pimpao et al. 2008 [36], and Pimpao and Mansur 2009 [37]. (B) Locations of occurrence and sample sites for Castalia ambigua Lamarck, 1819 (orange triangles) and Castalia inflata d'Orbigny, 1835 (blue dots) within the upper Paraguay River basin in this study. Red numerals indicate sample sites for molecular analysis: 1-Baixo Stream (both C. ambigua and C. inflata); 2-Valo Verde Lake; and 3-Claro River. Lack of shading indicates upland areas; gray shading indicates lowland areas.
DNA was isolated from preserved tissue using the DNeasy Blood and Tissue Kit (Qiagen, Germantown, MD, USA). DNA quality and quantity were assessed using a µLite PC spectrophotometer (Biodrop, Cambridge, UK). Samples were diluted to a DNA concentration of 10-30 ng/µL. We amplified two mitochondrial genes, cytochrome oxidase I (COI) and 16S rRNA, and three nuclear DNA sequences (internal transcribed spacer ITS-1 of 18S rRNA, 28S rRNA, and histone H3NR). The PCR primers used for amplification of targeted sequences are shown in Table 2. Polymerase chain reactions (PCR) were performed in a final volume of 22 µL, containing 0.1 µL Taq polymerase (5 units/µL) (Promega, Madison, WI, USA), 2 µL of 5× buffer, 2 µL MgCl₂ (25 mM), 0.4 µL bovine serum albumin (BSA, 1 mg/mL), 0.4 µL of all four 5 mM dNTPs, 0.4 µL each of the respective two 5 µM primer solutions, 14.5 µL sterile ultrapure water (EMD Millipore, Darmstadt, Germany), and 1 µL of template DNA. The PCR was conducted in Bio-Rad thermocyclers, either a T100 or MyCycler. For COI and H3, the PCR protocol was 3 min at 94 °C; 35 cycles of 1 min at 94 °C, 45 s at 55 °C, and 1 min at 72 °C; and a final extension at 72 °C for 5 min. For 16S and 18S ITS, the PCR started with 3 min at 94 °C, followed by 35 cycles of 40 s at 94 °C, 40 s at 53 °C (for 16S) or 60 °C (for 18S ITS), and 1 min at 72 °C, and a final extension for 5 min at 72 °C. For amplification of the COI gene, we ran two PCR reactions, each with a different forward primer but the same reverse primer. Finally, the 28S PCR reactions (22 µL) included 2.2 µL DNA extract, 0.1 µL Promega Taq polymerase, 2 µL 10× buffer, 2 µL MgCl₂, 0.4 µL BSA (1 mg/mL), 0.4 µL of all four 5 mM dNTPs, 0.4 µL each of the two 5 µM primer solutions, and 13.3 µL sterile ultrapure water. The 28S PCR protocol was carried out in the previously mentioned thermocyclers and included 3 min at 94 °C; 35 cycles of 1 min at 94 °C, 1 min at 53 °C, and 90 s at 72 °C; and 5 min at 72 °C. The presence of an amplification product of a size appropriate for the respective gene region was confirmed using standard gel electrophoresis through a 2% agarose TBE (tris-borate-EDTA) gel, which was stained with ethidium bromide.
PCR products were sequenced using both forward and reverse primers with the BigDye Terminator Cycle Sequencing Kit v.3.1 on an ABI3730 DNA sequencer at the Fralin Life Sciences Institute (Blacksburg, VA, USA). Consensus sequences were obtained using Geneious 7.0.6 (Biomatters, Auckland, New Zealand). GeneStudio Professional Edition, ver. 2.2.0.0 (Informer Technologies, Inc., Los Angeles, CA, USA) was used to align the sequences. All consensus sequences for C. ambigua and C. inflata were aligned within and between the species, and any putative polymorphisms were cross-checked against the original chromatogram to remove any reading errors. Our 18S ITS sequences were aligned using both Clustal W [46] and webPRANK [47] alignment software. The resulting gaps were coded using FastGap v1.2 [48]. Original DNA sequences were submitted to GenBank using the BankIt sequence submission tool (http://www.ncbi.nlm.nih.gov/WebSub/?tool=genbank). DNA sequences in our original .fasta files were compared against entries in GenBank (http://www.ncbi.nlm.nih.gov/nuccore/) using the Basic Local Alignment Search Tool [49]. All archived sequences returned with homology scores over 90% were recorded.
Variable sites and genetic variability of mitochondrial DNA sequences were assessed using DnaSP 5.10 [50], including the number of haplotypes, the average number of nucleotide differences, gene diversity, and nucleotide diversity for each species at each sampling location for the COI and 16S gene regions.
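The diversity metrics named above have simple definitions that can be sketched directly. The Python example below is only an illustration, not DnaSP's implementation: it assumes Nei's estimator of haplotype (gene) diversity, h = n/(n − 1) · (1 − Σp²), the mean number of pairwise nucleotide differences (k), and nucleotide diversity π = k / L for an ungapped alignment of length L. The toy sequences and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def haplotype_diversity(seqs):
    """Nei's haplotype (gene) diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))

def mean_pairwise_differences(seqs):
    """Average number of nucleotide differences (k) over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)

def nucleotide_diversity(seqs):
    """Nucleotide diversity (pi): k divided by the alignment length."""
    return mean_pairwise_differences(seqs) / len(seqs[0])

# Toy aligned sequences (hypothetical, equal length, no gaps); not study data.
seqs = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGA"]

n_haplotypes = len(set(seqs))
h = haplotype_diversity(seqs)
k = mean_pairwise_differences(seqs)
pi = nucleotide_diversity(seqs)
print(n_haplotypes, round(h, 4), round(k, 4), round(pi, 4))
```

Note that h depends only on haplotype frequencies, whereas k and π also reflect how many sites separate the haplotypes, so a group can show high h but low π when its haplotypes differ by single substitutions.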
Haplotype networks for each marker were constructed using PopArt (Population Analysis with Reticulate trees) [51] to show the distribution of the haplotypes among the three populations and among the three species. All haplotype networks were constructed using the TCS inference method [52].
We conducted phylogenetic analyses of the DNA sequences from C. ambigua sampled in this study and outgroup species within Tribes Castaliini and Hyriini downloaded from GenBank. The COI sequences included those from our own results for Castalia sp. (GenBank accession numbers KY474356-KY474372), plus C. ambigua (KU888236-KU888243) and Triplodon corrugatus (KU888253) from da Cruz Santos-Neto et al. [20]. The 16S sequences analyzed included Castalia sp. (KU463457-KU463465) from this study, as well as archived sequences from C. ambigua (KU888207-KU888213) and Triplodon corrugatus (KU888224). The 18S ITS sequences included Castalia sp. (KY463466) from our study, plus C. ambigua (KU888178-888182), C. stevensi (KU88184), and Callonaia duprei (KU888175) from da Cruz Santos-Neto et al. [20]. For COI + 16S, the outgroup was Triplodon corrugatus (KU888253 + KU888224). The evolutionary model for each sequence alignment was selected using Kakusan4 [53], where we used the Akaike Information Criterion (AIC) to find the most appropriate theoretical model of molecular evolution. Using that model, phylogenetic trees were constructed by Bayesian inference using MrBayes 3.2.5 [54]. Bayesian analysis included four (16S, COI + 16S) and eight (COI and 18S ITS) Markov chain Monte Carlo (MCMC) chains with a total of 600,000, 500,000, 500,000, and 1.5 million generations for COI, 16S, 18S ITS, and COI + 16S, respectively, with trees sampled every 100 generations. After the first 25% of sampled trees were discarded as burn-in, the tree topologies remaining from the cold chain were used to calculate posterior node probabilities [55]. We used Tracer v. 1.6.0 [56] to assess MCMC convergence and to ensure that the effective sample size (ESS) was higher than 200. For each dataset, we assessed species delimitation using the Automatic Barcode Gap Discovery (ABGD) [57] method, which detects barcode gaps; we used the Kimura distance model, a minimum intraspecific distance of 0.01, and a maximum intraspecific genetic distance of 0.1.
The final phylogenetic tree was created from the consensus trees in RStudio using the "ape" [58] and "phytools" [59] packages.
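To make the distance measure behind the barcode-gap analysis concrete, the sketch below computes a pairwise Kimura two-parameter (K80) distance, d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitions and transversions. The text specifies only "the Kimura distance model", so treating it as K80 is an assumption; this is not ABGD's code, and the sequences and function name are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# Hypothetical 10-bp sequences differing by one transition (A<->G).
d_same = k2p_distance("AAAAAAAAAA", "AAAAAAAAAA")
d_one_transition = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
print(d_same, d_one_transition)
```

The corrected distance is slightly larger than the raw proportion of differing sites (0.1 in the second example) because the model allows for multiple substitutions at the same site.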

Inference of Historical Demography
We tested two demographic scenarios using Approximate Bayesian Computation (ABC) modeling as implemented in DIYABC [60]. Noting that our mitochondrial DNA phylogenetic analysis identified two distinct lineages for C. ambigua, one in the Amazon River basin and the other in the Paraguay River basin (see Results), Scenario 1 was designated as the null model, where all four populations diverged at the same time from a common ancestor with equal divergence rates (Figure 2, left). Scenario 2 assumed the Paraguay and Amazon basin lineages of C. ambigua diverged initially from each other earlier in time (t2), and then later (t1) the three populations in the Paraguay River basin diverged simultaneously from their common ancestor with equal divergence rates. Using combined mitochondrial DNA sequences, we simulated historical demography over millennial (e.g., >1000 years ago) time-scales. Each scenario contained all four extant populations at generation t0 (Figure 2, right). We conducted simulations assuming all populations diverged simultaneously from a common ancestral population.
Figure 2. The two demographic scenarios tested for populations of C. ambigua in the Amazon River and Paraguay River basins of Brazil using DIYABC [59,60]. Each scenario assumes four extant populations at t0, where RB = Baixo Stream, BVV = Valo Verde Lake, RC = Claro Stream, and RA = Amazon River populations, respectively, and that they diverged from each other at some point in the past. Divergence time-points t1 and t2 are displayed on the right.
Prior values of Ne for extant and ancestral populations ranged from a minimum of 10 individuals to a maximum of 50,000 individuals and utilized a uniform distribution (Table 3). The Ne values were based on preliminary analyses to confirm the upper boundaries of the prior values. The prior value for time-point t1 was 10-20,000 generations ago and for t2 1000-100,000 generations ago, with both time-points utilizing a uniform distribution. Mean generation time was set at 5.5 years for both species based on female longevity of 10 years, which was derived from a life-table analysis of demographic data for the Baixo Stream population of C. ambigua (C. Callil, unpublished data). To estimate the mean mutation rate, we used a Hasegawa-Kishino-Yano mutation model [61] and set the per-site per-generation mutation rate to range from a minimum of 1 × 10⁻⁸ to a maximum of 1 × 10⁻⁶. For each demographic scenario, two million simulations were run, and their respective posterior probabilities were then compared using logistic regression to determine the most probable scenario [62]. Finally, confidence in each scenario was assessed by evaluating Type I and Type II error rates [60]. One thousand test data sets were simulated under each scenario, and the posterior probabilities were then evaluated for the simulated data sets. The Type I error rate was calculated as the proportion of posterior probabilities of Scenario 1 that were lower than the posterior probabilities of Scenario 2 when Scenario 1 was the true scenario, and vice versa for Scenario 2. The Type II error rate was calculated as the proportion of posterior probabilities of Scenario 1 that were higher than the posterior probabilities of Scenario 2 when Scenario 1 was not the true scenario, and vice versa for Scenario 2.
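The error-rate definitions above reduce to counting how often the wrong scenario receives the higher posterior probability. The following Python sketch illustrates that bookkeeping, together with the conversion of divergence-time priors from generations to years using the 5.5-year generation time. It does not reproduce DIYABC itself; the posterior values and function names are hypothetical.

```python
def error_rates(post_when_true, post_when_false):
    """
    Type I and Type II error rates for a focal scenario.

    post_when_true:  (p_focal, p_other) pairs for data simulated under the focal scenario.
    post_when_false: (p_focal, p_other) pairs for data simulated under the other scenario.
    """
    # Type I: focal scenario is true but receives the lower posterior probability.
    type_1 = sum(pf < po for pf, po in post_when_true) / len(post_when_true)
    # Type II: focal scenario is false but receives the higher posterior probability.
    type_2 = sum(pf > po for pf, po in post_when_false) / len(post_when_false)
    return type_1, type_2

def generations_to_years(generations, generation_time=5.5):
    """Convert a time in generations to years (generation time in years)."""
    return generations * generation_time

# Hypothetical posterior probabilities for four simulated data sets each.
simulated_under_focal = [(0.9, 0.1), (0.4, 0.6), (0.8, 0.2), (0.7, 0.3)]
simulated_under_other = [(0.2, 0.8), (0.6, 0.4), (0.1, 0.9), (0.3, 0.7)]

type_1, type_2 = error_rates(simulated_under_focal, simulated_under_other)
upper_t1_years = generations_to_years(20_000)  # upper bound of the t1 prior
print(type_1, type_2, upper_t1_years)
```

With one thousand simulated data sets per scenario, as in the study design, each rate is simply a proportion out of 1000.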

Permits and Ethical Aspects
We obtained authorization from IBAMA, the Brazilian Institute of the Environment and Renewable Natural Resources, for the collection of zoological material in accordance with Portaria do IBAMA nº 332/90 and other relevant rules and regulations.

Morphology and Distribution
The genus Castalia has thick shells of an equilateral triangular shape, the umbos are high and prominent with umbonal sculpture well developed and extending over a large part of the shell, the posterior ridge is sharp, and the anterior shell margin is well defined and ligament well developed [63]. Data from individuals collected at 70 sample sites in the upper Paraguay River basin showed that C. ambigua occurred at 5 sites (Table 1, Figure 1B, inset map) and C. inflata at 19.
Castalia inflata was associated with the lowlands, marginal shallow lakes, and oxbow lakes of the Cuiabá River in areas influenced by the spring flood pulse [22,23]. Castalia inflata (Figure 3A, top row) has an equilateral triangular shell; very inflated; high and robust; rounded anterior border; obliquely truncated posterior margin, forming an edge with a ventral border that is straight at the posterior region; umbos central, tall, large and wide; radial umbonal sculpture formed by parallel rays strongly marked, generally covering the entire surface of the shell; keel high and prominent, with a very strong back slope; shield large and flat, periostracum matte dark-brown; hinge strong, very arched and broad, with pseudocardinal and lateral teeth perpendicularly striped [64].
Castalia ambigua was found exclusively in highland streams with fast flow and high productivity in limestone-dominated watersheds [22,23]. Castalia ambigua (Figure 3B, middle row) has an inequilaterally triangular shell; inflated; high and robust; rounded anterior border and slightly tapered; obliquely truncated posterior margin forming an edge with a ventral margin that is straight or rounded at the posterior third; umbos sub-central, lower and smaller; radial umbonal sculpture formed by strongly marked diverging rays, usually longer posteriorly; keel high and a little rounded, with oblique dorsal slope, shield small and elongated; periostracum matte brown; hinge strongly arched and wide; pseudocardinal and lateral teeth perpendicularly striped [36,64].
As noted above, at Baixo Stream in the highlands, "C. ambigua-like" individuals were observed (Figure 3C, bottom row) that exhibited morphological traits intermediate between those described for nominal C. inflata and C. ambigua.

Biometry
A total of 557 C. inflata individuals from 19 sample sites were measured, with a mean (±SE) length of 27.22 ± 5.59 mm, width of 18.69 ± 4.17 mm, and height of 24.59 ± 5.24 mm. A total of 200 C. ambigua individuals from five different sites were measured. Variation was higher for C. ambigua, with a mean length of 39.11 ± 12.47 mm, width of 19.94 ± 6.46 mm, and height of 37.76 ± 10.32 mm.
Results of principal components analysis (Figure 4) showed two distinct groups, one for each nominal species, with 92.88% of the variance explained by the first axis. Some degree of overlap among the groups is visible (Figure 3A), which may be the consequence of the morphometric variation of individuals with intermediate morphology, hereafter "C. ambigua-like" individuals. The biometrical variable that contributed most to the variation was width (Figure 4).

Molecular and Phylogenetic Analyses
A total of 71 samples (34 from Baixo Stream in the highlands, 27 from Claro Stream in the lowlands, and 10 from Valo Verde Lake in the lowlands) were sequenced at the COI gene: 19 that were morphologically C. ambigua, eight "C. ambigua-like" or intermediate, and 44 C. inflata. We considered molecular variation among C. ambigua from the Amazon River watershed [20] as a geographical outgroup. Among the sequence alignments, 25 haplotypes were identified on the basis of 25 variable nucleotides within 514 bp (Table 4). The respective haplotypes are reported under the GenBank accession numbers indicated in Table 4. Metrics of genetic variation at the mitochondrial COI region (Table 5) show higher numbers of nucleotide differences (k) and nucleotide diversity (π) among C. ambigua-like than among C. ambigua or C. inflata individuals within the Paraguay River watershed despite the smaller sample size. Haplotype diversity (h) was somewhat higher in C. ambigua than in C. inflata. COI sequences for C. ambigua, C. inflata, and C. ambigua-like individuals were 91%, 92%, and 93% similar, respectively, to the Callonaia duprei (Recluz, 1842) and 93%, 94%, and 95% similar to the Castalia quadrata (Sowerby, 1867) reference sequences retrieved from GenBank.

Table 4. Haplotypes for Castalia sp. at the mitochondrial cytochrome oxidase I (COI) gene. Haplotypes Cast_COI_01 through 17 are original data from the Paraguay River basin; C. ambigua 1 through 8 are from the Amazon River basin and were collected by da Cruz Santos-Neto et al. [20]. Numbers refer to nucleotide position.
Because the COI sequences from da Cruz Santos-Neto et al. [20] are shorter than ours, some of our haplotypes became identical to others after we trimmed ours to an equal length of 343 base pairs. Sequences that became identical after trimming were Cast_COI_06 with Cast_COI_07, and Cast_COI_03 with Cast_COI_04.
Some sequences reported by da Cruz Santos-Neto et al. [20] were found to be the same haplotype, viz. KU888238 with KU888239, KU888236 with KU888237, KU888240 with KU888241, and KU888242 with KU888243. Considering the haplotypic distribution on a geographical basis (Figure 5A), the Claro Stream collection exhibited haplotypes Cast_COI_2, 3, 14, 15, 16, and 17, the Valo Verde Lake collection haplotypes Cast_COI_1, 2, 3, and 4, and the Baixo Stream collection haplotypes Cast_COI_3, 5, 6, 7, 8, 9, 10, 11, 12, and 13. Claro Stream and Valo Verde Lake haplotypes tended to the left of the haplotype network and Baixo Stream haplotypes to the right, reflecting geographic and elevational distributions. Most strikingly, the Amazon River haplotypes are separated from the Rio Paraguay haplotypes by at least 18 mutational steps. Considering haplotypes on a taxonomic basis (Figure 5B), some haplotypes (Cast_COI_3, 5, and 6) were exhibited across nominal species, and C. inflata haplotypes were embedded among the C. ambigua haplotypes. That is, haplotypes Cast_COI_1, 2, 4, 14, 15, 16, and 17 were observed only in C. inflata, haplotypes Cast_COI_7, 8, 9, 10, 11 and 13 only in C. ambigua, and haplotypes Cast_COI_3, 5, 6, 7, 11, and 12 in both taxa or in their "C. ambigua-like" individuals. The C. inflata haplotypes tended toward the middle of the haplotype network, C. ambigua haplotypes to the edges, and those of "C. ambigua-like" individuals from the center to the right.
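The collapse of distinct haplotypes after trimming, as described for the COI data, can be illustrated with a short Python sketch. It trims each sequence to a shared window and groups haplotype names whose trimmed sequences are identical. The haplotype names, sequences, and window are hypothetical and far shorter than the 343-bp alignment used in the study.

```python
def collapse_after_trimming(haplotypes, start, end):
    """Trim sequences to [start:end] and group names with identical trimmed sequences."""
    groups = {}
    for name, seq in haplotypes.items():
        groups.setdefault(seq[start:end], []).append(name)
    return list(groups.values())

# Hypothetical 10-bp haplotypes; hap_A and hap_B differ only at the last position.
haplotypes = {
    "hap_A": "ACGTACGTAA",
    "hap_B": "ACGTACGTAC",
    "hap_C": "ACGAACGTAA",
}

full_length = collapse_after_trimming(haplotypes, 0, 10)
trimmed = collapse_after_trimming(haplotypes, 0, 8)
print(full_length)
print(trimmed)
```

Any variable site that falls outside the shared window is lost, so haplotype counts from trimmed alignments are lower bounds on the counts from the full-length sequences.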
At the mitochondrial 16S rRNA region, 70 samples from the Rio Paraguay watershed were sequenced (19 of C. ambigua, 43 of C. inflata, and 8 of C. ambigua-like). We also considered variation among C. ambigua from the Amazon River watershed [20] as a geographical outgroup. Within a 443-bp sequence, 38 variable nucleotides defined 9 haplotypes in the Rio Paraguay and 7 in the Amazon River watersheds (Table 6), which have been reported as the GenBank accession numbers indicated at the right margin of the table. Our 16S rRNA sequences were 93-94% similar to the C. duprei and 96% similar to the C. quadrata reference sequences retrieved from GenBank. Metrics of genetic variation at the mitochondrial 16S rRNA region (Table 7) were lower than those for the COI region. Metrics for C. inflata were generally higher than for C. ambigua and were higher for C. inflata in the lowlands (Valo Verde Lake, Claro Stream) than at the edge of their distribution in the highlands (Baixo Stream).
Table 6. Haplotypes for Castalia species at the mitochondrial 16S rRNA gene. Haplotypes Cast_16S_01 through 9 are original data for samples from the upper Paraguay River basin; C. ambigua 1 through 7 are from samples from the upper Amazon River basin [20]. Numbers refer to nucleotide position.
Positions: 35, 36, 83, 87, 92, 98, 111, 119, 124, 125, 138, 149, 176, 184, 186, 191, 197, 200, 206, 211, 214, 217, 246, 258, 262, 263
Cast_16S_1: A, A, C, T, C, A, C, C, C, C, G, C, G, T, A, T, T, T, A, C, T, A, T, T, C, C
Because the 16S rRNA sequences from da Cruz Santos-Neto et al. [20] are shorter than ours, some of our haplotypes became identical after we trimmed ours to equal length. Identical sequences after trimming were Cast_16S_01, Cast_16S_02, and Cast_16S_04. All geographic sites within the Rio Paraguay basin exhibited multiple 16S rRNA haplotypes (Figure 6A), with the respective populations showing different haplotype frequencies. Haplotype frequencies differed across nominal taxa (Figure 6B). Most C. inflata individuals exhibited haplotype Cast_16S_2, with some individuals showing any of six haplotypes no more than three mutational steps from Cast_16S_2. Most C. ambigua individuals exhibited the Cast_16S_2 haplotype, with one individual showing the closely related Cast_16S_4. The C. ambigua-like individuals exhibited the common Cast_16S_2 haplotype and one closely related Cast_16S_5. Notably, all Amazon River haplotypes were at least 12 mutational steps removed from all Rio Paraguay haplotypes. On a taxonomic basis (Figure 6B), most C. inflata and C. ambigua-like haplotypes were among the C. ambigua haplotypes.
A 559-bp segment of the internal transcribed spacer (ITS1) of the nuclear 18S rRNA gene was sequenced in 46 individuals, including seven C. ambigua, 36 C. inflata, and three C. ambigua-like individuals. The Clustal W and webPRANK software packages produced the same alignment of our sequences. No variable nucleotides were observed. The sequence is reported as GenBank accession number KY463466. Gaps resulted when our Castalia 18S rRNA sequences were aligned with those of outgroup Callonaia duprei (KU888175), and four indels were coded using FastGap v1.2.
A total of 75 samples were sequenced at the nuclear 28S rRNA gene: 10 C. inflata from Valo Verde Lake, 27 C. inflata from Claro Stream, and 38 C. inflata, C. ambigua, and "C. ambigua-like" from Baixo Stream. One haplotype was observed across 441 nucleotides among all individuals and is reported as GenBank accession number KT885158.
A 354-bp segment of the histone H3NR gene was sequenced among 64 individuals: 19 C. ambigua, 37 C. inflata, and eight C. ambigua-like. The same nucleotide sequence was observed among all individuals and is reported as GenBank accession number KY474373.
For phylogenetic analysis of the COI sequence data set, the General Time Reversible (GTR) model with invariable sites was selected. For 16S sequences, the GTR model with gamma distribution and a proportion of invariable sites [65] was selected as the most appropriate evolutionary model. For 18S ITS1, the GTR model with equal rates was found to be the most suitable. For the combined COI + 16S sequences, the most appropriate evolutionary model was the Hasegawa-Kishino-Yano (HKY) model with gamma distribution and a proportion of invariable sites [61]. Phylogenetic trees constructed using Bayesian analysis, which we discuss below, showed the inferred evolutionary relationships of mussel populations that we sampled within the larger context of regional variation. Phylogenetic trees constructed using maximum likelihood (not shown) had topologies that were virtually the same as those from the Bayesian analysis.
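Model choices such as those reported above rest on comparing AIC scores, AIC = 2k − 2 ln L, across candidate substitution models and keeping the lowest. The Python sketch below shows only that comparison; it is not Kakusan4's procedure, the log-likelihoods are hypothetical, and the parameter counts cover substitution-model parameters only (branch lengths omitted), which is a simplifying assumption.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def select_model(candidates):
    """Return the model name with the lowest AIC and the full score table."""
    scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores

# Hypothetical (log-likelihood, number of free parameters) for three models.
candidates = {
    "HKY+G+I": (-1205.3, 6),
    "GTR+I": (-1203.9, 9),
    "GTR+G+I": (-1203.5, 10),
}

best_model, aic_scores = select_model(candidates)
print(best_model, aic_scores)
```

In this toy comparison the simpler model wins because the richer models do not improve the likelihood enough to offset their additional parameters.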
The phylogenetic tree showing relationships among mitochondrial COI haplotypes (Figure 7) exhibited both geographic and phylogenetic patterns. The two lower clusters included haplotypes of C. ambigua from the Amazon River basin observed by da Cruz Santos-Neto et al. [20] and are shown in brown. The large upper cluster consisted of all of our samples from the Paraguay River basin and included a mixture of haplotypes from nominal C. ambigua, C. inflata, and C. ambigua-like individuals from all three geographic sites.
Phylogenetic analysis of mitochondrial 16S rRNA sequence haplotypes (Figure 8) clearly separated haplotypes of C. ambigua and C. inflata from those of outgroup Triplodon corrugatus (Lamarck, 1819). Haplotypes of C. ambigua from the Amazon River were clearly separated from those of all nominal Castalia species from the Paraguay drainage.
Phylogenetic analyses of combined mitochondrial COI and 16S sequences (Figure 9) showed stronger evidence of geographic structure than of nominal species structure. At the most basal node in the tree, the combined haplotype of the outgroup T. corrugatus was separated from all others. The second node separated haplotypes of C. ambigua from the Amazon basin from those of all nominal species from the Paraguay basin.
Phylogenetic analysis of our one observed ITS1 haplotype, together with those of C. ambigua from the Amazon basin [20] and of outgroups C. stevensi (H.B. Baker, 1930) and Callonaia duprei (Figure 10), showed a clear distinction between Castalia haplotypes and those of Callonaia. All Castalia haplotypes clustered tightly together.

Historical Demography
Scenario 2 (in which populations in the Paraguay and Amazon basins differentiated before the three populations within the Paraguay basin did) was identified as the best-supported demographic scenario by our DIYABC analysis, with a posterior probability of 0.994 (0.993-0.995), compared to Scenario 1 (all four populations diverged at the same time from a common ancestor with equal divergence rates), with a posterior probability of 0.006 (0.005-0.007). The Type I and II error rates for Scenario 2 were both <1%; together with the much higher posterior probability, these low error rates indicate that Scenario 2 best explains the data. For Scenario 2, the median posterior estimates of effective population size (Ne) ranged from a high of ~32,600 for the Amazon River population to a low of ~4020 for the Valo Verde Lake population; the median divergence times were ~1290 generations ago (~7095 years ago) at t1 and ~7240 generations ago (~39,820 years ago) at t2; and the median mutation rate was 6.64 × 10⁻⁷ (Table 3).
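The conversion from generations to years and the relative support for the two scenarios can be checked with simple arithmetic. The sketch below assumes that a single generation time was applied to both divergence estimates. That generation time is not stated in this section, so it is inferred here from the reported t1 values. The posterior odds are likewise derived from the two reported posterior probabilities, not taken from the DIYABC output.

```python
def generations_to_years(generations: float, years_per_generation: float) -> float:
    """Convert a divergence time in generations to years."""
    return generations * years_per_generation


# Reported medians for t1 (from the text above).
t1_generations = 1290
t1_years = 7095

# Assumption: generation time inferred from the reported t1 values.
years_per_generation = t1_years / t1_generations

# Apply the same generation time to the reported t2 median.
t2_generations = 7240
t2_years = generations_to_years(t2_generations, years_per_generation)

# Posterior odds of Scenario 2 over Scenario 1 from the reported probabilities.
posterior_odds = 0.994 / 0.006

print(f"Implied generation time: {years_per_generation} years")
print(f"t2: {t2_years:,.0f} years ago")
print(f"Posterior odds (Scenario 2 : Scenario 1): about {posterior_odds:.0f} to 1")
```

Under this assumption the implied generation time is 5.5 years, which reproduces the reported ~39,820 years for t2.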

Distribution, Morphology, and Taxonomy
Due to the high number of species in the genus Castalia Lamarck (1819), thirteen [8,66] to seventeen [3] in total, and their widespread occurrence throughout South America, the literature on the distribution and identification of some member species, such as C. ambigua and C. inflata, remains confusing, with many records of occurrence uncertain [10,11] (Figure 1A). Records for C. inflata have always been common in mid-southern South America, where the species was described from specimens collected in the Paraná River, in the province of Corrientes in Argentina. The initial records were from the lower Paraná River (von Ihering, cited in [10]); small tributaries of the Paraná River (D'Orbigny) in Argentina; and the Paraguay River near the Apa River in Mato Grosso do Sul and upstream at Cáceres in Mato Grosso, both in Brazil (von Ihering, cited in [10]). Recently, this species was recorded in the Paraná River in Santa Fé, Argentina [5,31]; in the La Plata River in Uruguay [11]; and at the confluence of the Apa and Paraguay rivers in Paraguay [19]. In Brazil, C. inflata was reported in the Bento Gomes River at Poconé in Mato Grosso [23,32,67], in the Uruguay River at Uruguaiana in Rio Grande do Sul [33], and in several reaches and marginal lakes of the Cuiabá River in the Pantanal in Mato Grosso [22,34].
The occurrence of C. ambigua is better documented. The type specimen was described from the Amazon basin, but there are records in other major watersheds of South America, from the Magdalena River basin in Colombia to the La Plata River in Argentina [11] and the San Miguel and Guarayos rivers in Bolivia [9]. In Brazil, this species occurs at the confluence of the Rio Negro with the Solimões River [36], in the Uraricoera and Branco rivers [68], the Aripuanã River [37], the Tocantins River [35], and the Irituia River [6] in the Amazon; in the Cuiabazinho River in Mato Grosso [64]; and there are records for the São Francisco River basin [11].
The distributions of the nominal taxa C. ambigua and C. inflata meet in the northern portion of the upper Paraguay River basin. In our study system, C. ambigua occurred exclusively in lotic waters with relatively high primary productivity and high conductivity (230-370 µS/cm). Such environments occur only in the highlands, in the headwaters region of the upper basin, in streams and rivers rich in calcium derived from calcareous springs. In contrast, C. inflata was associated with silted lowland environments with fine substrates, high concentrations of particulate organic matter, and low conductivity (35-70 µS/cm), commonly in wetlands under the influence of the flood pulse, as in the Pantanal. Recent studies of rivers in the Amazon hydrographic basin have shown that shell shape in the Hyriini is largely associated with drainage characteristics [69].
Our morphometric data showed variation in biometric characters (length, width, and height) for each nominal species; however, some data points overlapped between them, reflecting a phenotypic gradient. In the upper Paraguay basin, two distinct species, identified as C. ambigua and C. inflata, have been reported on the basis of conchological traits alone [64]. However, visual differentiation is not always effective because shell shape is subject to marked phenotypic plasticity [25,70].

Molecular and Phylogenetic Analyses
Our field collections of Castalia species in the upper Paraguay basin revealed individuals in Baixo Stream in the uplands whose appearance was intermediate between the classical morphologies of C. ambigua and C. inflata. Hence, we screened genetic markers to characterize their molecular genetic and phylogenetic affinities. Genetic identification of tropical freshwater mussels has been approached through amplification and sequencing of the cytochrome oxidase I (COI) gene of mitochondrial DNA and the 28S nuclear rDNA gene [71]. Because few genetic markers have been established for tropical freshwater mussels, we screened variation at three additional genes (mitochondrial 16S rRNA, nuclear 18S rRNA-ITS, and histone H3) that are sufficiently well conserved across a range of taxa that useful PCR primers are available.
Our morphological and mitochondrial results supported neither reproductive isolation nor reciprocal monophyly of C. ambigua and C. inflata at the upland Baixo Stream site. To explain the results, we note the occurrence of shell plasticity in unionids; the tendency for shell morphology to vary from compressed in headwaters to inflated downstream, sometimes with tubercles, has been noted in many freshwater mussel species. Reviewing the literature of his time, Ortmann [25] noted numerous examples in his own and colleagues' work. Subsequent work (reviewed in [70]) showed this ecotypic pattern within the tribes Pleurobemini (genera Fusconaia, Pleurobema, and Pleuronaia), Quadrulini (Cyclonaias, Quadrula), Lampsilini (Actinonaias ligamentina, Epioblasma torulosa, Dromus, and Obovaria), and Amblemini (Amblema plicata). Using 103 amplified fragment length polymorphism (AFLP) markers, Zieritz et al. [72] showed that differences in shell morphology in Unio pictorum were the result of phenotypic plasticity and did not reflect genetic differentiation. Shell convexity in the freshwater pearl mussel Margaritifera margaritifera may be affected by temperature [73], suggesting that the thermal regimes in colder upland and warmer lowland sites could have affected the morphologies of the mussel populations that we studied. Failure to account for shell plasticity in unionoids has resulted in the description of species or subspecies that are not supported by molecular phylogenetic analyses. Inoue et al. [74] showed that morphological differences between Obovaria jacksoniana and Villosa arkansasensis were due to ecophenotypic plasticity and suggested that these species were synonymous. Morphological plasticity is such that the four recognized Unio species in France actually include three or five valid species [75]. Our results suggest that Castalia ambigua and C. inflata within the upper Paraguay River drainage are not phylogenetically distinct, constituting a case of ecotypic variation within one lineage. Further testing of the hypothesis that these nominal species are valid would require analysis of populations across the geographic extent of the ranges of the nominal species.
Analyses of our DNA sequencing results for mitochondrial COI, 16S rRNA, and combined sequences showed phylogenetic differentiation between C. ambigua from the Paraguay and Amazon basins. We interpret the shared ancestry and subsequent differentiation as a consequence of the geological history of the region. Prior to the Miocene, a paleo-Amazon-Orinoco drainage originating in Chile and Argentina drained northward to the Caribbean Sea. The modern watershed divide between the Paraguay and Amazon systems arose approximately 30 million years ago with the initiation of tectonic activity, driving the diversification not only of fishes [76] but presumably also of freshwater mussels.

Conclusions and Prospects
We examined phenotypic and genetic variation in Castalia ambigua and C. inflata in the upper Paraguay River basin, and the results of our phylogenetic analyses are most consistent with the hypothesis that the nominal species belong to one evolutionary lineage. The phenotypic variation that we observed in the upper Paraguay system is consistent with the tendency of shell morphologies to vary from headwaters to lowland areas, which has long been recognized in North American freshwater mussels and may also apply to South American mussels. Supporting this interpretation, we noted a strong phylogenetic distinction among C. ambigua between the Paraguay and Amazon basins, which may be indicative of species-level differences. Broad-scale phylogeographic study, including examination of type specimens, will inform defensible delineation of Castalia species and of members of other lineages within the South American freshwater mussel fauna.