Interspecific Genetic Differences and Historical Demography in South American Arowanas (Osteoglossiformes, Osteoglossidae, Osteoglossum)

The South American arowanas (Osteoglossiformes, Osteoglossidae, Osteoglossum) are emblematic species widely distributed in the Amazon and surrounding basins. Arowana species are under strong anthropogenic pressure as they are extensively exploited for ornamental and food purposes. Until now, limited genetic and cytogenetic information has been available, with only a few studies reporting to their genetic diversity and population structure. In the present study, cytogenetic and DArTseq-derived single nucleotide polymorphism (SNP) data were used to investigate the genetic diversity of the two Osteoglossum species, the silver arowana O. bicirrhosum, and the black arowana O. ferreirai. Both species differ in their 2n (with 2n = 54 and 56 for O. ferreirai and O. bicirrhosum, respectively) and in the composition and distribution of their repetitive DNA content, consistent with their taxonomic status as different species. Our genetic dataset was coupled with contemporary and paleogeographic niche modeling, to develop concurrent demographic models that were tested against each other with a deep learning approach in O. bicirrhosum. Our genetic results reveal that O. bicirrhosum colonized the Tocantins-Araguaia basin from the Amazon basin about one million years ago. In addition, we highlighted a higher genetic diversity of O. bicirrhosum in the Amazon populations in comparison to those from the Tocantins-Araguaia basin.


Introduction
Amazonia is the ecosystem hosting the highest amount of biodiversity in the world [1]. Such an extreme biodiversity level is associated with both high speciation and low extinction rates in this region [2]. South America experienced a complex geological and climatic history marked by several intense events that affected the biota in the Amazon basin (AMb), such as important and

Individuals Sampling
The number, sampling sites and sex of the individuals are presented in Table 1 and Figure 1. Sample sizes for the SNP dataset are smaller than those used for the cytogenetic analyses, as some samples were obtained after the library preparation and sequencing experiment. The samples were collected following authorization of the Brazilian Environmental Agency, Instituto Chico Mendes de Conservação da Biodiversidade/ Sistema de Autorização e Informação em Biodiversidade (ICMBio/ SISBIO) (License No. 48290-1) and Sistema Nacional de Gestão do Patrimônio Genético e do Conhecimento Tradicional Associado (SISGEN, A96FF09). All specimens were identified by morphological criteria, and the voucher specimens were deposited under numbers 121638 until 121640 in the Museum of Zoology of the University of São Paulo (MZUSP).  Table 1. Collection sites of the arowana species analyzed, with the corresponding numbers of individuals used for single nucleotide polymorphisms (SNP -DArT_N) and cytogenetic analyses (Cito_N).

Mitotic Chromosomal Preparations and Comparative Genomic Hybridization (CGH)
Mitotic chromosomes were obtained following the protocol by Bertollo et al. [33]. The experiments followed ethical and anesthesia conducts, according to the Ethics Committee on Animal Experimentation of the Universidade Federal de São Carlos (Process number CEUA 1853260315). The total genomic DNAs (gDNAs) were extracted from liver tissue by the standard phenol-chloroform-isoamyl alcohol method [34].
Two different experimental designs of comparative genomic hybridization (CGH) were used. The first one focused on interspecific genomic comparisons and, in this case, gDNAs of O. ferreirai and O. bicirrhosum from the AMb localities were labeled with Aminoallyl-dUTP -ATTO-550 (red) and Aminoallyl-dUTP -ATTO-488 (green) using Nick-translation labeling kit (Jena Bioscience, Jena, Germany), respectively, and hybridized against the chromosomal background of O. ferreirai. The final hybridization mixture for each slide contained 500 ng of each labeled gDNA and 25 µg of unlabeled C0t-1 DNA from both species (to block the shared repetitive sequences prepared according to Zwick et al. [35]), dissolved in 20 µL of the hybridization buffer (50% formamide, 2 × SSC, 10% SDS, 10% dextran sulfate and Denhardt's buffer, pH 7.0). The second set of experiments focused on interpopulational genomic comparisons of O. bicirrhosum. The gDNA of O. bicirrhosum from TAb was compared with the gDNA of O. bicirrhosum from AMb against metaphase chromosomes of individuals from TAb. For this purpose, the gDNA of individuals from TAb was labeled with Aminoallyl-dUTP -ATTO-550 (red), while the gDNA of individuals from AMb was labeled with Aminoallyl-dUTP -ATTO-488 (green), both using Nick-translation labeling kit (Jena Bioscience, Jena, Germany). The final probe cocktail was composed by 500 ng of gDNA of individuals from TAb + 500 ng of gDNA of individuals from AMb + 15 µg of derived C0t-1 DNA of specimens from both populations. The probes were ethanol-precipitated, and the dry pellets were dissolved in hybridization buffer containing 50% formamide, 2 × SSC, 10% SDS, 10% dextran sulfate and Denhardt´s buffer, pH 7.0. The CGH experiments were performed according to Symonová et al. [36].

Chromosomal Analyses and Image Processing
At least 30 metaphase spreads per individual were analyzed to confirm the diploid chromosome number (2n) and CGH results. The images were captured using an Olympus BX50 microscope (Olympus Corporation, Ishikawa, Japan) with Cool SNAP, and processed using Image-Pro Plus 4.1 software (Media Cybernetics, Silver Spring, MD, USA).

DArTseq Procedure
Tissue samples (Table 1) were sent to Diversity Arrays Technology Pty Ltd., Canberra, Australia for obtaining genotypic information from SNPs isolated by DArTseq sequencing technique. Genomic DNA complexity reduction in DArTseq is achieved using a combination of a restriction enzymes SbfI and PstI, an enzyme sensitive to some types of methylation, causing the exclusion of methylated and repetitive DNA fragments from the process [37,38]. Therefore, genomic libraries prepared under this method tends to enrich for sequences located in functional genic regions. Resulting libraries were sequenced on an Illumina HiSeq2500 platform.

DArT Data Filtering
Demultiplexed raw data, provided by the sequencing facility, was processed with ipyrad v.0.7.28 [39]. Sequencing adapters were trimmed, and all sequences with more than 5 low-quality bases (Q < 20) or shorter than 35 base pairs were discarded. The maximum number of SNPs allowed per locus was set to 3 and maximum uncertain base pairs accepted per cluster was defined as 5. The data was then de novo clustered and aligned. Specifically, high-quality consensus sequences were detected for each sample by aligning their reads separately and considering only clusters with more than 6 reads as a pre-locus. After that, the obtained pre-loci were clustered among individuals and only clusters observed in all samples were retained as definitive loci ( Figure 2). This last step was carried in three different datasets, one of them comprising both species (combined dataset), one for O. ferreirai only (O. ferreirai dataset) and the other for O. bicirrhosum only (O. bicirrhosum dataset). This approach was adopted to maximize the number of markers for subsequent analyzes. Only one aleatory SNP per locus was maintained from the selected loci.

Detection of Outlier Markers Putatively under Selection
In order to search for markers that are possibly under selection, a BayeScan [40] analysis was carried out for the combined and the O. bicirrhosum datasets. Runs were executed with a prior odd value of 100, 5000 MCMC chains, and thinning of 10. A total of 20 pilot runs of 5000 iterations were done, burn-in was set to 50,000. Loci with a False Discovery Rate (FDR) value lower than 0.01 were considered as outliers.

Genetic Diversity
Genetic diversity was assessed with Genodive [41]; summary statistics (H O -observed heterozygosity; H E -expected heterozygosity; G IS -inbreeding coefficient) were generated for O. bicirrhosum and O. ferreirai separately. Pairwise F ST between all O. bicirrhosum sampling locations was also recovered.

Interspecific Differences Structure
The genetic diversity distribution between the two species was assessed with a principal coordinate (PCoA) analysis carried in the R package dartR [42]. A subsequent PCoA was used to visualize the genetic diversity distribution of O. bicirrhosum across the two basins, using the O. bicirrhosum dataset.
Population structure was also assessed using the Bayesian approach implemented in fastSTRUCTURE v.1.0 [43], a method based on the Bayesian software STRUCTURE [44] but optimized for larger sets of data. In order to prepare inputs and run fastSTRUCTURE, the "Lizards-are-awesome" pipeline [45] was used. The combined dataset was tested with K (number of genetic clusters) ranging from 1 to 5, O. ferreirai dataset with K from 1 to 2, and O. bicirrhosum dataset with K from 1 to 4.
Because of the small number of localities, isolation by distance (IBD) tests would not have statistical power to give reliable results. Therefore, the spatial analysis implemented in GENELAND [46] was also used to assess population structure only in the O. bicirrhosum dataset, as it contains multiple localities. Spatial methods are usually less prone to artificial cluster detection in the presence of isolation by distance (IBD) and are advised for studies with focal groups more prone to IBD, as freshwater fishes [47]. The runs were carried out with a range of K from 1 to 4, 500,000 iterations and a thinning of 200. The correlated allele frequency model was set, and a total of 10 independent runs were performed. The results of both Geneland and fastSTRUCTURE were zipped and uploaded to Structure Harvester [48], for preparation of CLUMPP [49] input files. Graphical outputs were then generated in Clumpak [50].

Demographic Model Selection
To compare demographic models in O. bicirrhosum, genetic data were simulated in ms [51] based on our SNP data sample sizes. Priors were selected based on a generation time (G) ranging from 1 to 2 years, according to Queiroz and Camargo [9]. A mutation rate (µ) of 1.25 × 10 −9 mutations per site per year was used, according to suggested by Oliveira et al. [52] for the genus Arapaima (Osteoglossiformes, Osteoglossidae). Effective population size (N e ) was sampled from a uniform distribution ranging from 10,000 to 1,000,000 individuals. Four different scenarios were selected to simulate 50,000 datasets in each of them with a Python script modified from Perez et al. [53]. The first scenario (model 1) consisted of a panmictic population, the second scenario (model 2) comprised a vicariant event with a division in two populations of the same size, the third scenario (model 3) considered a colonization event from AMb to TAb, and the fourth one (model 4) simulated the opposite event, a colonization from TAb to AMb basin. For each simulation, a value of θ was calculated based on the 4*N e * µ formula. The divergence time between the basins (DT) was set to 0 on model 1 and sampled from a uniform distribution between 0 and 2 million years ago (MYA) for the subsequent models. The coalescent divergence time (CT), which is the required parameter in ms, was calculated from each sampled DT value based on the DT/4*N e *G formula. The intensity of the population reduction during colonization (θrF-A) was estimated as a ratio of the θ value during colonization over the θ in the ancient population. Values were sampled from a uniform distribution ranging from 0.01 to 0.1. For the magnitude of population expansion after colonization (θrC-A), the ratio between θ values in the contemporary and in the ancient population (sampled from 0.1 to 1) were used. Only models 3 and 4 considered the priors for founder effect intensity and growth ratio.
The simulated scenarios were then compared with a convolutional neural network (CNN). CNN is an artificial intelligence method that can be used to learn to identify patterns and then predict those patterns on new data, not seen before by the network. The architecture of the CNN consists of a set of layers that apply mathematical functions to the data that passes through it, generating outputs for subsequent layers. Convolutional layers are followed by pooling layers that summarize the results from previous layers. CNNs can be applied to genetic data because of its high capacity to detect patterns on images and matrices [54]. Our genetic data, organized on matrices of presence or absence of alleles, were converted to images and explored with the approach described by Flagel et al. [54]. We used the same architecture proposed by Oliveira et al. [52], modified from Flagel et al. [54]. The input for the CNN contained 50,000 and 2000 simulations as train and test data, respectively. The run was performed for 25 epochs, with a mini-batch size of 250. After selecting the most likely model, a dataset of 10 6 simulations was generated under this model, for further parameter estimation using the same neural network architecture. To infer the estimation capacity of the CNN, a Root Mean Squared Error (RMSE) and Spearman's ρ were calculated for each parameter

Paleogeographic Modeling
Potential climatic niche for O. bicirrhosum and O. ferreirai were estimated for both species based on occurrence data for O. bicirrhosum (3 localities from the current study; 3 localities from Verba et al. [55]; 54 localities from Species Link database, http://splink.cria.org.br; and 89 localities from the Global Biodiversity Information Facility-GBIF, https://www.gbif.org/) and O. ferreirai (one locality from the current study; six localities from Queiroz and Camargo [9]; two localities from Species Link database; and 16 localities from GBIF). Models were constructed in biomod2 [56] with combination of nine distribution algorithms, including artificial neural networks (ANN; [57]), mixture discriminant analysis (MDA; [58]), multivariate adaptive regression splines (MARS; [59]), surface range envelop (SRE; [60]), classification tree analysis (CTA; [61]), generalized linear models (GLM; [62]), generalized boosted models (GBM; [63]), random forests [64], and maximum entropy (Maxent; [65]). Calibration of the models was performed with present climatic conditions at 30 arc-seconds resolution. Projections for the present, last glacial maximum (21,000 years ago) and last interglacial (120,000 years ago) were carried out at 2.5 arc-minutes resolution. A total of 5 simulations were performed with each algorithm and only simulations with True Skill Statistic (TSS) higher than 0.7 were kept. Correlation among 15 bioclimatic variables from WorldClim [66] was checked with Pearson index. For variables with high correlation (Pearson index >0.85), only the variable with higher explanatory capacity was maintained.

Cytogenetic Data
While the silver arowana (O. bicirrhosum) has 2n = 56 for all populations analyzed, the black arowana (O. ferreirai) differs by presenting 2n = 54 as previously described in Suzuki et al. [67]. In the cross-species hybridization, the gDNA probes from both Osteoglossum species showed a high diversity level emphasizing their repetitive DNA content. In fact, only a limited number of overlapping hybridization signals were observed, restricted to the heterochromatic blocks, which were widely present in the centromeric regions of some chromosomes. Several chromosome pairs of O. ferreirai showed species-specific sequences with abundant hybridization signals on their centromeric region (Figure 3a-d). The comparative hybridization of the gDNA probes between O. bicirrhosum from Tocantins-Araguaia and Amazon localities produced basically a large number of overlapping hybridization signals, highlighting the heterochromatic blocks that occur in the centromeric and terminal regions of some chromosomes. No substantial genomic differentiation in their repetitive DNA content was observed (Figure 3e-h).

Sequencing and Filtering
The DArTseq procedure resulted in a total of 73,147,298 raw reads obtained for all specimens of Osteoglossum. From these total reads, the combined dataset resulted in 9,687,094 reads after applying the quality filters presented in methods. After clustering and subsequent filtering, 3584 polymorphic SNP markers were obtained, with 0.38% of missing data. The O. ferreirai dataset included 46,057,316 raw reads and, after trimming and filtering, only 5,903,770 reads remained. A total of 433 bi-allelic markers passed the last filtering step with 3.77% of missing data. The O. bicirrhosum dataset included 27,089,982 raw reads, from which 19,381,380 reads passed first filtering, and resulted in a total of 1661 SNPs retained at the end of the process with 0.95% of missing data. Though the ipyrad was set to recover only loci available in all samples in each of the three datasets, the amount of missing data reported refers to multiple SNPs located at the same loci. A scheme of the number of retained reads for each step in the three analyzed datasets is presented in Figure 2.

Detection of Markers Putatively under Selection
For both tested datasets, no locus was selected as a candidate under selection. All FDR values were above 0.95 on the combined dataset and higher than 0.97 for O. bicirrhosum dataset. Based on this result, no marker was removed in further analyses.

Genetic Diversity
Higher genetic diversity was observed in O. bicirrhossum from AMb for all three diversity indexes (A, H E , and H O ) even with fewer samples ( Table 2). Expected heterozigosity (H E ) was slightly higher than the observed (H O ) for all tested populations. Inbreeding levels were smaller for Tocantins-Araguaia locations and higher in Amazon locations as suggested by G IS values ( Table 2). Pairwise F ST estimates reinforced the similarity of the TAb specimens (0.059), with a high differentiation from samples of AMb (0.504 and 0.475 when Catalão was compared with Luciara and Loroti, respectively). The individuals of O. ferreirai presented an H E value higher than that of the individuals of O. bicirhossum from TAb but lower than that of the individuals of O. bicirhossum in AMb. A higher G IS was also observed in O. ferreirai when compared to all other localities, suggesting higher levels of inbreeding.

Population Structure
The PCoA containing both species of Osteoglossum ( Figure 4A) showed that specimens from each species are more distantly related than specimens from the two localities of O. bicirrhosum.  Analyses on fastStrucure resulted in a number of clusters that maximizes both likelihood and is more informative for structure with K = 2 in the combined dataset, containing all analyzed specimens and separating the two species. The dataset recovered a K = 1 for O. ferrerai and a K = 2 for O. bicirrhosum, thus separating samples from the two basins. The GENELAND results of O. bicirrhosum dataset were concordant with fastStrucure, returning a K = 2, that separates populations based on their basins ( Figure 5).

Demographic Model Selection
During the neural network training process in the O. bicirrhosum dataset, the CNN reached an accuracy of 87.5% with the training set, and 87.9% of accuracy in the validation dataset, after 25 epochs. When using only simulated data to predict accuracy, scenario 1 (panmictic) was the easiest to predict with 99% accuracy. The scenario 2 (vicariance) had an accuracy of 84%, while the third (TAb colonization) and the fourth (AMb colonization) scenarios had 87.1% and 79.3% of accuracy, respectively ( Figure 6). Accuracy levels higher than 80% are considered acceptable for CNNs [68], therefore our approach presented overall appropriate values. After training and validation, the empirical data were submitted to CNN. The TAb colonization scenario had the highest posterior probability (PP = 0.872), while the panmictic model presented the lowest probability (PP = 0.000). A parameter estimation step was then carried under the TAb colonization scenario (Table 3). Effective population size (RMSE = 0.4971; Spearman's ρ = 0.6409) and divergence time (RMSE = 0.4816; Spearman's ρ = 0.6571) were the two parameters that CNN had the highest capability to infer. Conversely, founder effect ratio (RMSE = 0.6539; Spearman's ρ = 0.5670) and growth ratio (RMSE = 0.6454; Spearman's ρ = 0.5702) were predicted with less efficiency. Effective population size estimate had a median of 468,195 (interval= 452,302; 481,932). Separation time between basins was estimated at Pleistocene, with a median of 1,049,955 (interval = 1,013,205; 1,074,667). Founder size ratio was estimated with a median of 0.0551 (interval= 0.0538; 0.0561). The estimated growth rate has a median of 0.6454 (interval= 0.5337; 0.5596). Table 3. Demography parameter estimates of Osteoglossum bicirrhosum based on the selected demographic model. N e -Effective population size; DT-divergence time between the two basins; θrF-A-Population size reduction ratio during colonization; θrC-A-the ratio between the current and the ancient population.

Paleogeographic Modeling
Based on the explanatory capacity and correlation between variables, only seven bioclimatic variables were selected (namely 1, 2, 3, 4, 14, 15, 16) for O. bicirrhosum. The analysis generated a distribution in the present that is highly concordant with registered occurrences (Figure 7A,B). The present projection had the broadest distribution among the three simulated periods ( Figure 7B). On the last glacial maximum (LGM) the distribution is far more restrictive, only with a few marginally suitable areas, located predominantly in Amazon and some other regions distributed across the north of South America ( Figure 7C). The last interglacial period presented a stable area at the current Peru region and other lesser stable locations among Amazon and northern regions of South America ( Figure 7D). For O. ferreirai, five bioclimatic variables were maintained (3,7,11,12,14). The present distribution was also congruent with the occurrences (Figure 7E,F), though some areas with clear water were recovered, for example a stable location near Marajó island, more to the east than the current known distribution. The last glacial maximum simulation showed few marginally stable areas, mainly at central Amazon ( Figure 7G). During the last interglacial (LIG), there were even fewer suitable areas, presenting only small regions in Amazon and Colombia ( Figure 7H).

Discussion
The two analyzed species have different 2n: the silver arowana, O. bicirrhosum, 2n = 56, and the black arowana, O. ferreirai, 2n = 54. These results corroborate their distinct species status and thus indicate that they already substantially diverge from one another. Accordingly, these species are also associated to different water characteristics in the environments they live (i.e., different habitat preference) [69,70], and there is no evidence of interspecific gene flow between them at the few sympatric sites (e.g., Demeni River) [9]. In addition, their genomic comparison also showed that both species differ in the composition and distribution of repetitive sequences (Figure 3a-d). A similar scenario is also found in other species of Osteoglossiformes belonging to the family Notopteridae, where most species differ by a significant genomic diversity highlighted by CGH and DArT-Seq analysis [25,71].
Overall, the genetic diversity of O. bicirrhosum from the AMb is higher than O. ferreirai. DaSilva [69] also evidenced that O. bicirrhosum has higher diversity levels analyzing microsatellite markers. This can be related to the restricted distribution and the conservation status of O. ferreirai. Our results suggest that interference of past climatic and demographic events, along with contemporary habitat degradation might be factors that influenced this loss of diversity. In addition, economic features may have had also some additional implication. It is known, for example, that when Scleropages formosus fishing was restricted in the 1970s [72] not only the fishery on Osteoglossum increased, but O. ferreirai reached higher selling prices compared to O. bicirrhosum [9].
No substantial differentiation in the repetitive DNA content of the distinct O. bicirrhosum populations was observed by CGH experiments (Figure 3e-h). Notably, this low evolutionary differentiation at the chromosomal level does not match that of the SNPs dataset. SNPs analyses of population structure highlighted two different populations for O. bicirrhosum, one in the AMb and other at the TAb basin ( Figure 5). The genetic distance between O. bicirrhosum and O. ferreirai is larger than intraspecific distances among O. bicirrhosum populations (Figure 4). It is important to note that our genetic results should be taken with care, as our sampling sizes are very small (i.e., just 13 specimens with only two analyzed in AMb basin, see Table 1) and do not cover the whole distribution of the analyzed species. Though such small sample sizes can impact the diversity estimates and some of the analyses, the high number of molecular markers used might compensate for the limited numbers of specimens [73]. Moreover, the results presented here are congruent to those obtained with DArTseq markers in Arapaima gigas [52], a species with a similar distribution to O. bicirrhosum.
Based on the current niche projections for both species in the present conditions, we recovered a potential distribution that was similar, but larger than their natural occurrence. Our modeling approach used only bioclimatic variables, an approach similar to other authors when analyzing freshwater fishes (e.g., [74][75][76]). We selected this strategy, without incorporating other potentially important information, as the presence of floodplains, as such information would preclude projections for past periods. By comparing silver arowana current and past projections, our paleogeographic modeling results indicated that some periods may have been remarkably harsh for these fishes. During the last interglacial period (LIG), that occurred~120,000-140,000 years ago, O. bicirrhosum may have had smaller populations restricted to AMb and close areas. When considering the suitable niche on the last glacial maximum (LGM), at~21,000 years ago, the distribution is even more restricted. At that time, the Amazon region had several suitable areas that could have served as a shelter for O. bicirrhosum populations. Otherwise, the TAb region had a very small area that could have supported such populations. Climatic niche modeling in O. ferreirai evidenced LIG as the harsher period, with the smallest projected suitable area. In such conditions, the species may have suffered a decrease in the population size, with the presence of small populations in Colombia and, maybe, in other Brazilian regions. The simulation suggests that during the LGM, O. ferreirai experienced more marginally suitable regions, mainly in central Amazon basin, where some populations may have persisted. Comparisons between species indicate that O. ferreirai niche was smaller at all periods, which could be related to the habitat preference of this species for dark acid waters. The population reduction in both species should have concomitantly decreased their genetic diversity due to genetic drift. Besides such factors, the hydrological dynamics that occur in the TAb may also directly impact the species diversity levels. This is the case of the flooding events that connect some rivers allowing gene flow and thus making populations more homogenous. In addition, the species behavior cannot be omitted either, as O. bicirrhosum is characterized as a non-migratory species [22], that tends to form local communities leading to higher inbreeding levels.
Our deep learning approach for model comparison selected a colonization scenario from AMb to TAb. The parameter estimation step indicated that TAb colonization occurred during the Pleistocene (median = 1,049,955.4 years ago), and this result agrees with the suggestion of Rossetti and Valeriano [77] about the age of separation for the TAb and AMb. This result can also be interpreted as TAb having a smaller population before the separation. The effective population size recovered had a median of 468,195, four times greater than the values recovered for Arapaima species [52], another species of the family Osteoglossidae with a similar distribution. This difference may be related to the biological characteristics of both species: A. gigas is a very large species, while O. bicirrhosum is much smaller and usually much more abundant where both species occur in sympatry (E.A.O. personal observation). The other parameters estimated showed low Spearman's ρ and higher RMSE, evidencing lower capacity of CNN to predict these values. It is important to note that several results obtained here are congruent with recent populational assessments of A. gigas [52,78,79], comprising higher genetic diversity in Amazon populations compared to those of TAb, and a scenario of TAb colonization from an ancient Amazon population in a similar timeframe (~900 kya, [52]). This congruence indicates a similar evolutionary history for both O. bicirrhosum and A. gigas, mediated by the separation of the TAb and AMb, which can be also responsible for promoting genetic structuring and differentiation in other fish species. Otherwise, distinct events might have promoted the divergence between O. ferreirai and O. bicirrhosum, as both species occur in the Amazon basin, mostly in allopatry, and estimated divergence age is much older. In this sense, as indicated previously, water characteristics associated with different chemical properties may be related to ecological speciation in fishes as found in some other freshwater fishes (e.g., [70]).

Conclusions
In summary, our results provided evidence of an overall higher genetic diversity in arowanas from the Amazon basin compared to those from the TAb, suggesting that the latter may be more likely to reach a vulnerable conservation status. It also pointed out that populations from the same basin are more similar genetically and that the species divergence is more ancient than the separation of populations in different basins. For both black and silver arowanas, paleomodeling suggested a larger distribution nowadays than in the past, and our demographic model approach pointed to the colonization of the TAb from an ancient Amazon population in O. bicirrhosum during Pleistocene. When comparing both species, O. ferreirai probably requires a more elevated conservation status than that of O. bicirrhosum, because of its restricted distribution and its association to a specific water type.