Microsatellite Characterization of Malaysian Mahseer (Tor spp.) for Improvement of Broodstock Management and Utilization

Simple Summary The Malaysian mahseer (Tor ssp.) of the family Cyprinidae are indigenous large riverine cyprinids that occur only in Southeast Asia. They are the popular freshwater fish for food, ornamental and recreational fishing. However, their wild populations are now ecologically threatened as their numbers decline drastically over the years due to over-exploitation, natural habitat degradation and water pollution. With successful hatchery production, readily accepted artificial feed and fetched high market value, Malaysian mahseer is now considered a perspective for aquaculture. Stocks were collected from various sources for broodstock development to establish an appropriate base population with desirable characteristics that harbour adequate genetic diversity. Information on the genetic status is essential to formulate appropriate strategies for genetic resources protection and its utilization. Genetic diversity of the broodfish can be assessed rapidly with high precision using polymerase chain reaction (PCR)-based DNA markers. Abstract In this study, a mixture of Tor tambra and T. tambroides with unknown genetic background were collected from 11 localities in Malaysia for broodstock development and sperm cryo-banking. This study aims to assess the microsatellite (simple sequence repeat, SSR) variation, genetic diversity, genetic differentiation, level of gene flow, population structure, genetic relatedness and their demographic aspects among these Tor populations, in addition to establishing their SSR profile by employing 22 SSR markers via fragment analysis. Total genomic DNA was extracted from 181 samples (91 cryopreserved milt samples and 90 scale samples of live broodfish). Results showed the Tor spp. collection retained their genetic variation but exhibited excessive homozygosity among individuals within population. Moderate genetic differentiation was shown among the populations, with highly significant (p < 0.001) fixation indices (FST, FIS and FIT). A low gene flow over all loci (Nm 1.548) indicates little genetic variation transfer between populations. The genetic structures of all the populations were successfully resolved into four main clusters by an unweighted pair group method with arithmetic mean (UPGMA) dendrogram generated based on Nei’s genetic distances. The population structures based on principal coordinates analysis (PCoA) and the Bayesian model also suggested four distinct clusters following geographical regions and eight closely related populations. This study provided a useful baseline reference for better genetic management and utilization of the Tor spp. stocks in their breeding and conservation programmes.


Introduction
Malaysian mahseer of the Cyprinidae family is an indigenous fish that occurs only in Southeast Asia [1,2]. Locally, Tor mahseer fish are generally known as kelah in Peninsular Malaysia, empurau and semah in Sarawak or pelian in Sabah [3][4][5]. The indigenous Tor mahseer comprises two valid species, Tor tambra and T. tambroides [1,[6][7][8]. Confusions always occurred in the identification of these local Tor species. As such, their nomenclatures had been revised occasionally for the past few decades [2,9,10]. These ecologically threatened species are classified as data deficient in the IUCN Red List [6,7,11]. Wild populations of Malaysian mahseer are declining due to human activities such as deforestation, uncontrolled logging, agriculture, and over-exploitation that cause natural habitat degradation and water pollution. This is clearly shown in the drastically decreased landing of Malaysian mahseer from the inland capture [12,13].
The mahseer is regarded as the most expensive freshwater food fish in Malaysia as it has high market demand and value. The price can go as high as USD 80-200 per kg though the price of cultured mahseer is not as high as that of wild-caught mahseer, which can fetch about USD 250 per kg [14]. The retail price for cultured mahseer ranges between USD 30-50 per kg [15]. Mahseer price also varies based on its source and use of the fish as food or ornamental fish.
With successful hatchery production and its ready acceptance of artificial feed, the Malaysian mahseer is now considered a good prospect for aquaculture. Lately, it has gained much attention and popularity among local farmers in Malaysia [16,17]. Tor spp. is cultured mainly in ponds in Peninsular Malaysia and cement tanks and cages in East Malaysia. The states of Kelantan and Sarawak are the lead producers of Malaysian mahseer. The annual production of Malaysian mahseer has been approximately 20 tonnes per year since 2012 [12,13,15,[18][19][20][21][22]. This large-sized riverine fish can grow up to 30-50 kg. However, the growth rate of this fish in culture conditions is relatively slow compared to other aquaculture species [23]. It usually takes up to three years to reach the marketable size of 1.5-2.0 kg. The commercial culture of mahseer fish has also gained popularity in Indonesia, especially in the Aceh province [24,25].
The availability of polymerase chain reaction (PCR)-based microsatellites at a reasonable cost offers an easy and convenient way to generate rapid, reproducible and high throughput results for genetic diversity assessment. Microsatellites are also known as simple sequence repeat (SSR). SSR loci are hypervariable genetic markers. This feature made SSR the ideal choice for estimating genetic relatedness without prior knowledge of pedigree information [26][27][28][29][30]. Microsatellite variations are independent of natural selection because most of the SSR markers are from the non-coding regions of the genome. Therefore, they are ideal genetic markers for conservation genetics and sustainable fisheries management purposes [31]. SSR markers have been used to determine the population structures of various mahseer species in the region [32][33][34][35][36][37][38]. Population genetic structure of T. tambroides from several natural populations in Malaysia had been examined using SSR markers in one of the studies by [33]. Still, the study locations were slightly different from the present study. Furthermore, most of the samples used in that study were collected during the 2000s.
Broodstock development is the most vital element in developing a species for aquaculture. In usual practice, broodfish are collected from various sources for broodstock development. A detailed genetic background, such as the status of the genetic diversity and the population structure, including the demographic aspects (bottleneck and effective population size and number of migrants) and population assignment of the candidate broodstock, is often unknown. Very often, genetic studies will only be pursued for the collection after the collected broodstocks are established. Sperm cryo-banking was carried out for this indigenous Tor spp. as part of the conservation measure. In our research, Malaysian mahseer comprising a mixture of T. tambra and T. tambroides (and thus is addressed as Tor spp. in the study) were collected from 11 localities in Malaysia for broodstock development and sperm cryo-banking. Information on the genetic status of these Tor stocks is Animals 2021, 11, 2633 3 of 28 essential to formulate appropriate strategies for genetic resource protection and utilization in aquaculture development. However, no genetic assessment had been done on these Tor spp. stocks. Therefore, the genetic background, whether these Tor spp. stocks from different geographical regions possess different genetic makeups or whether any changes in their genetic variation over time, was unknown. It would be a waste of cost and space to maintain too many stocks of low genetic variabilities in the conservation and breeding programmes. Therefore, the main objectives of this study were to assess the genetic diversities, population structures, genetic relatedness, and their demographic aspects; and establish the genetic profiles of the Tor spp. collection by employing SSR markers. . FRIGL stock was collected from Kenyir Lake, Terengganu during FFRC, Batu Berendam, Melaka. KENS samples were collected from Kenaboi River, PPAP samples were collected from Pahang River, AGHR samples were collected from Keniam River, Taman Negara. In contrast, HLKW samples were obtained from a farmer who ran a farm at Hulu Langat and merely imported mahseer from Sumatra, Indonesia. The stocks from all ten localities in Malaysia were T. tambra, while samples of HLKW was T. tambroides. As claimed by the farm owner, samples of HLKW were T. tambroides. Morphologically HLKW stocks were slightly different from other stocks collected in Malaysia and most likely of T. tambroides [8,10]. Live broodstocks of Grik, Perak (GPRK, 16 samples), Terengganu (TGN, 14 samples) and Raub, Pahang (PHG, 11 samples) populations were collected from the wild stocks at fingerlings stage and domesticated in the pond until reaching a matured size. GPRK samples were collected from Kejar Banding River, Perak, TGN samples were collected from Berang River, Terengganu, and PHG samples were collected from Jerai River, Pahang. For Hulu Langat, Selangor (HLS, 14 samples), Mersing, Johor (MSJ, 28 samples), and Sarawak (EMS, 1 sample) populations, stocks were obtained from the local fish traders of respective localities. For EMS, five fish were collected from a local fish trader at Penang who claimed the stock was originated from Sarawak. However, only one fish survived after the quarantine period, and the sample was used in the present study. There was no overlapping in the sources of the fish among the frozen milt and scale samples. These samples were used for total genomic DNA extraction, and, subsequently, the extracted DNA samples were used for SSR genotyping. The number of samples and their sources of origin from each population are summarized in Table 1 and illustrated in Figure 1.   Langat, Selangor (HLS); 11. Empurau, Sarawak (EMS). * This is a different Tor species from the stocks in Malaysia and morphologically most likely of the T. tambroides. The stocks from other localities were T. tambra; ** Fish sample was obtained from a local fish trader at Penang who claimed the stock was originated from Sarawak. Langat, Selangor (HLS); 11. Empurau, Sarawak (EMS). * This is a different Tor species from the stocks in Malaysia and morphologically most likely of the T. tambroides. The stocks from other localities were T. tambra; ** Fish sample was obtained from a local fish trader at Penang who claimed the stock was originated from Sarawak.

Total Genomic DNA Extraction
Each extraction reaction used 400 µL thawed milt and 0.04 g scale sample. Cryopreserved milt samples in 0.5 mL IMV straws (IMV Technologies, L'Aigle, France) were thawed at 40 • C for 7 s. After that, both ends of the IMV straws were cut, and the thawed milt sample was drawn into a 1.5 mL centrifuge tube. The frozen-thawed milt sample was precipitated by centrifugation at 10,000 rpm for 15 min. The supernatant was discarded, and the sperm pellet was then subjected to total genomic DNA extraction by using the DNeasy ® Blood and Tissue Kit (QIAGEN Minden, Hilden, Germany) according to the manufacturer's instruction.
The extracted total genomic DNA samples were subjected to electrophoresis on a 1% agarose gel. Electrophoresis was carried out in 1× TAE buffer (40 mM Tris, 20 mM acetic acid, 1 mM EDTA, pH 8.0) at 80 V for 30 min to determine DNA integrity. The concentration of the extracted DNA was quantified by a NanoDrop ® ND-1000 Microvolume Spectrophotometer (Thermo Fisher Scientific, Inc., Waltham, MA, USA). Total genomic DNA samples were homogenized at 10 ng/µL and stored at −20 • C until use.

PCR Amplification
DNA was amplified by PCR. A total of 24 microsatellite primer pairs (Supplementary Materials, Table S1) were tested in this study, and suitable primer pairs, i.e., with successful amplification and produced fragments at the targeted sizes, were selected and used in the genotyping analysis. Fourteen pairs (NY01-NY14) were specifically for T. tambroides, according to [38], and ten primer pairs (BS01-BS10) were designed from the microsatellite DNA sequences of T. tambroides deposited in the NCBI genebank. Primer pairs were designed using Primer 3 version 0.4.0 software [39,40]. The 5 end of either the forward or the reverse primer in each primer pair was labelled with a fluorescent dye (6-FAM, HEX, TAMRA, or ROX). Positive and negative controls were included in each amplification reaction. A sample with known fragment size was used as a positive control, while a sample without DNA template but substituted with nuclease-free water was used as a negative control.
The PCR reactions were performed in a total volume of 25 µL reaction mixture containing 2.5 µL 5X Go Taq ® Flexi Buffer (with 1.5 mM MgCl 2 ), 200 µM each dNTP, 10 pmol each of forward and reverse primers, 0.5 U Taq DNA Polymerase (Promega Corporation, Madison, WI, USA) and 3-10 ng template DNA. The PCR amplification profile was started with an initial denaturation at 94 • C for 4 min, 35 cycles of 94 • C for 35 s, annealing temperature between 48-66 • C (depending on the primer pairs) for 35 s and extension at 72 • C for 1 min, followed by a final extension at 72 • C for 10 min and finally held at +4 • C. PCR was performed using MyCycler TM Personal Thermal Cycler (Bio-Rad Laboratories, Inc., Hercules, CA, USA). Before proceeding with the amplification of all samples, the optimal annealing temperature of each primer pair was determined via gradient PCR reaction with the range of annealing temperatures tested from 45 • C to 70 • C.

Gel Electrophoresis and Fragment Analysis
The amplification specificity of the PCR products was determined before performing the fragment analysis via conventional gel electrophoresis on 1.8% agarose gel containing Invitrogen 1× SYBR TM Safe DNA Gel Stain (Thermo Fisher Scientific, Inc., Waltham, MA, USA). A volume of 5 µL of the PCR products was used for the gel electrophoresis. Agarose gel electrophoresis was carried out in 1× TAE buffer at 100 V for 30 min. The gel image was then visualized and photographed under UV illumination using the G: Box Gel Documentation System (Syngene Inc., Frederick, MD, USA).
Fragment analysis was conducted via capillary electrophoresis on the PCR products generated from primer pairs labelled with different fluorescent dye colours (Supplementary Materials, Table S1). Three to four singleplex PCR products were pooled for each run. The optimum dilution factor of the PCR products was determined before the electrophoretic fragment analysis. The samples were then diluted according to the optimum dilution factor (100×-3000×), as shown in Table S2 (Supplementary Materials), and mixed with GeneScan TM LIZ-500 ® size standard and Hi-Di TM Formamide (Thermo Fisher Scientific, Inc., Waltham, MA, USA). After that, 2 µL of the sample mixture was loaded into the Applied Biosystems ABI3730XL Genetic Analyser (Thermo Fisher Scientific, Inc., Waltham, MA, USA) to determine the peaks. Lastly, the raw data were analyzed using the GeneMapper ® Software Version 4 to determine the fragment sizes and the height of the peaks.

SSR Genetic Diversity and Polymorphism
Microsatellite loci were scored and analyzed using the Power Marker Software Version 3.25 to determine the genetic diversity and the genetic distance of the Tor spp. samples in this study [41]. Genetic diversity and polymorphism were analyzed by locus, population and different sample types. The genetic statistics determined using Power Marker Software Version 3.25 were major allele frequency (MAF), number of alleles per locus (N A ), number of genotypes (N G ), expected heterozygosity (He), heterozygosity (Ho), inbreeding coefficient (f ) and the polymorphism information content (PIC) [41]. The GenAleX 6.51b2 software was used to determine allelic richness (A r ), effective number of alleles (A e ), number of private alleles (A p ) and percentage of polymorphic loci [42]. Markov chain exact tests for conformance to Hardy Weinberg Equilibrium (HWE) were carried out on the p-values following [43], with sequential Bonferroni correction, according to [44]. Genotyping errors due to null alleles, stuttering and large allele dropouts were analysed using the MICROCHECKER version 2.2.1 software for all loci and populations except EMS [45]. The SSR data of the EMS population, with only one sample, were insufficient for the analysis.

Population Differentiation, Genetic Distance and Genetic Structure
The genetic structure of the Tor spp. collection was assessed via analysis of molecular variance (AMOVA), F-statistics, pairwise population comparisons and population differentiation, determined using the ARLEQUIN software version 3.5.2.2 [46]. AMOVA was performed based on the distance method with 10,000 permutations [47]. F-statistics analysis was performed based on standard permutations across the full data set of allelic distances to assess the genetic differentiation of all the loci in all populations [48]. Population comparisons were evaluated by computing pairwise population differentiation estimates (F ST ) between populations based on the distance method, with the significance level set at p = 0.05 [49]. Population differentiation was determined via an exact test with 100,000 steps in a Markov chain and 10,000 dememorization steps based on genotype frequencies. The overall differentiation of the Tor spp. collection was determined using a variant of the Mantel test, with the genetic distance matrix constructed based on the shared allele distance for each pair of individuals. Principal coordinates analysis (PCoA) for the distribution of each individual in all populations and estimation of gene flow (Nm) was performed using GenAleX 6.51b2 software [42]. PCoA was made using the first and second components based on the co-dominant genotypic distance matrix. Gene flow among populations was estimated using an indirect method based on the number of migrants per generation (Nm) [50].
Both distance-based and Bayesian model-based clustering methods were used to determine the population structures of the Tor samples studied. The distance-based method through Nei's genetic distance implemented in Power Marker Software Version 3.25 was performed to construct an Unweighted Pair Group Method with Arithmetic Mean (UPGMA) dendrogram and viewed using the TreeView v 1.6.6 program [41,51,52]. The Bayesian model-based clustering method was analyzed using STRUCTURE Version 2.3.4 [53]. In the Bayesian model-based approach, the true number of clusters (K) in the samples studied was determined by computing the log-likelihood Ln P(D) with K values varying from one to eleven and with 20 independent runs for each K value. The most appropriate K value was determined with a 25,000 burn in period and 25,000 Markov Chain Monte Carlo (MCMC) replications after burn in, using the admixture model with no prior population information. The optimal number of clusters is the one that corresponds to the highest value of delta K (∆K). ∆K is based on the rate of change in the log probability of data between successive K values following the criteria of [54]. The output from STRUC-Animals 2021, 11, 2633 7 of 28 TURE software was then visualized and parsed using the web-based Structure Harvester program [55].

Genetic Relatedness
Genetic relatedness between and among individuals within each population was assessed by seven relatedness estimators implemented in the COANCESTRY Version 1.0.1.9 software based on an individual's multilocus genotype data [56]. The relatedness estimators used were five-moment estimators denoted as Wang [57], LynchLi [58,59], LynchRd [60], Ritland [61], QuellerGt [62], and two likelihood estimators denoted as TrioML [63] and DyadML [64]. The moment estimation is based on calculating the level of shared alleles between sample pairs, while the likelihood estimation classifies samples to a limited number of related classes. Inbreeding is assumed to be absent in the allmoment estimators used. The genotyping error rate was zero for all loci. A bootstrap analysis (1000 bootstrap replicates) was conducted to test the statistical differences between populations in mean relatedness coefficients (Rxy). The difference between populations is significant at a 95% confidence level when the difference in mean relatedness between the two groups lies outside the 2.5 and 97.5 percentiles of the distribution curve obtained by bootstrapping [56].

Population Bottleneck, Effective Population Size (Ne), and Population Assignment
Evidence of a recent bottleneck for each population except EMS was tested for using BOTTLENECK version 1.2.02 [65]. Three mutation models, i.e., the infinite alleles model (IAM), two-phase model (TPM), and stepwise mutation model (SMM), were used in the analysis for heterozygosity excess. The TPM model was applied with a variance of 30 and a stepwise mutation probability of 70%. The results over loci for the significant presence of an excess of observed heterozygotes were derived from the Wilcoxon sign-rank test [66]. The allele mode shift test, i.e., distortion from L-shape allele distribution within each population indicated a recent bottleneck occurrence, was also tested for using the same software with 1000 iterations.
The effective population size (Ne) was estimated based on linkage disequilibrium data at unlinked loci according to the Burrows method with a bias correction using the LDNE Software version 1.31 [67,68]. The critical value (P crit ) was defined at 0.01, i.e., all alleles with frequencies less than 0.01 were excluded from the analysis. The 95% confidence intervals for Ne were determined by the JackKnife method. Population assignment was also performed using GenAleX 6.51b2 software [42] to detect levels of genetic admixture according to allele frequencies following [69,70].

Genetic Diversity by Population
Among the eleven populations studied, Kelah World, Hulu Langat, Selangor (HLKW) produced the highest N A (146), A r (6.6818), allele numbers per locus (1)(2)(3)(4)(5)(6)(7)(8)(9)(10)(11)(12)(13)(14)(15)(16)(17)(18)(19)(20), and the number of genotypes (7.8182), while EMS generated the lowest N A (28), A r (1.2727), allele numbers per locus (1-2) and the number of genotypes (1.0000) ( Table 2). The mean values of A r and the number of genotypes were 3.8967 and 4.8682, respectively. The MAF ranged from 0.5097 (HLKW) to 0.8636 (EMS) and with a mean value of 0.6495. The number of effective alleles (A e ) generated varied from 1.273 to 3.792, with a mean of 2.543. The highest Ae was seen in the AgroHarvest, Raub, Pahang (AGHR) population, while the lowest A e was in the EMS population. A total of 87 private alleles unique to a single population were produced from the Tor samples, with the highest number from the HLKW population (52 alleles), followed by the Aquaculture Extension Center, Perlok, Jerantut, Pahang (PPAP) population (11 alleles). Private alleles were generated in nine populations, i.e., 81.8% of the populations, while they were absent in the AGHR and HLS populations. The list of private alleles generated in each population is shown in Appendix A. The percentages of polymorphic loci ranged from 27.27% in the EMS population to 95.45% in PPAP, HLKW and TGN populations, with an average of 76.45%. The lowest genetic variation was in the EMS population due to its small sample size. The highest He (0.5970), Ho (0.4545) and PIC (0.5711), respectively, were shown in HLKW, while the lowest He (0.0682), Ho (0.2727) and PIC (0.1023), respectively, were seen in EMS. The mean values of He, Ho and PIC, were 0.4189, 0.4083 and 0.3966, respectively. In this study, six out of the eleven populations with He < Ho were conforming to HWE.

Level of Inbreeding across Loci and Populations
The inbreeding coefficient across loci (f ) ranged from −0.6226 (BS08) to 0.9238 (NY13) and with an average f at 0.2259 (Table 3). Positive f was showed in sixteen loci, while negative f was detected in six loci. Positive f values indicated excess homozygosity and the occurrence of deviations from HWE in these loci. On the contrary, negative f values indicated an excess of observed heterozygotes and no incident of inbreeding. The inbreeding coefficient across populations (F IS ) was in the range of −0.003 to 0.234, with a mean of 0.015 ( Table 2). The inbreeding coefficient was the highest in the HLKW population (0.234). Positive F IS was detected in six populations, while negative F IS was detected in five populations. The mean F IS of 0.015 within the populations studied was small, indicating a low level of inbreeding.

Null Allele
As revealed by the analysis using the MICROCHECKER software, a general excess of homozygotes (heterozygote deficits) for most allele size classes might be present in sixteen loci (72.7%), suggesting the presence of null alleles. For those loci with null alleles, the total observed homozygotes appeared higher than the total expected homozygotes and exhibited positive f values (Table 3). Null alleles were detected in nine out of the eleven populations studied (Supplementary Materials, Table S3). The null alleles detected in each population for the respective SSR markers are summarized in Appendix A (Table A1). The AGHR population showed no evidence for null alleles in all loci as the total expected homozygotes appeared higher than the total observed homozygotes.
Meanwhile, stuttering, which is indicated by a highly significant shortage of heterozygous genotypes with alleles of one repeat unit difference, might have occurred in thirteen loci (59.1%) in this study. The HLKW population showed the highest occurrence of null alleles and stuttering. Nevertheless, the results showed no evidence for large allele dropout in all loci. Large allele dropout occurs when large alleles do not amplify as efficiently as small alleles.

Genetic Differentiation
The level of genetic differentiation, as revealed by AMOVA, indicated that the variation among populations (V a ) was 14.92%, variation among individuals within populations (V b ) was 9.00%, and variation within individuals was 76.09%. These indicated that the Tor spp. collection was highly variable, and, statistically, all these variations were highly significant (p < 0.001) ( Table 4). The estimates of fixation indices from the F-statistics analysis were with mean values of overall F ST , F IS, and F IT of 0.149, 0.106, and 0.239, respectively. All the three indices exhibited statistically highly significant differences (p < 0.001) from zero among and within all populations indicating the phenomenon of inbreeding. Inbreeding was resulting in a loss of heterozygosity among populations (F ST ), among individuals relative to the subpopulation (F IS ), as well as within individuals relative to the total population (F IT ).
Pairwise F ST values ranged from 0.016 to 0.237 among populations, indicating mixed levels (from low to high) of genetic differentiation among the populations (Supplementary Materials, Table S4). The lowest pairwise F ST value (0.016) was between the MSJ and HLS populations. The highest value (0.237) was between the GPRK and KENS populations.

Genetic Distance
Genetic divergence among the populations, as revealed by pairwise Nei's (1983) genetic distances, ranged from 0.062 to 0.423 among the populations (Supplementary Materials, Table S4). The lowest genetic distance (0.062) was between the MSJ and HLS populations, while the highest genetic distance (0.423) was between the HLKW and EMS populations. Generally, HLKW and EMS showed genetic divergence ranged from 31.8-42.3% and 27.5-42.3%, respectively, compared to other populations. Other pairwise populations, i.e., TGN and AGHR, TGN and FFRC, and KENS and PPAP, revealed genetic divergence < 15%. Small genetic distances between these populations indicated that the populations were genetically closely related.

Genetic Structure
Based on the UPGMA dendrogram generated based on the Nei's genetic distance (1983), as shown in Figure 2 Table S4).
The principal coordinates analysis (PCoA) plot was also generated to show the genetic structure among all the Tor spp. samples. The results showed that the first, second and third component axes accounted for 28.50%, 11.88% and 6.99%, respectively, of the total variance, and the eigenvalues were 190. 80 The model-based cluster analysis based on the magnitude of delta K (∆K) statistics generated the highest K value at K = 4 (Supplementary Materials, Figure S4). Therefore, the most likely number of clusters to best explain the Tor spp. population structure was four. Another peak was also observed at K = 6. From the plot of mean likelihood L(K) and variance per K value generated, the likelihood L(K) values over 20 runs were consistent without showing significant variance for the same K value (Figure 3).
types were superimposed on each other in the plot. Generally, locality-specific clusterings were evident on this PCoA plot, although some of the samples were not grouped tightly in the HLKW, MSJ and HLS populations. The samples of these three populations were split into two distinct fragmentary clusters instead. Overall, there were four genetically different groups formed among the Tor spp. samples studied. These groups were: (I) EMS and GPRK; (II) PHG and KENS; (III) AGHR, TGN, FFRC, PPAP, MSJ and HLS; (IV) HLKW. The model-based cluster analysis based on the magnitude of delta K (ΔK) statistics generated the highest K value at K = 4 (Supplementary Materials, Figure S4). Therefore, the most likely number of clusters to best explain the Tor spp. population structure was four. Another peak was also observed at K = 6. From the plot of mean likelihood L(K) and variance per K value generated, the likelihood L(K) values over 20 runs were consistent without showing significant variance for the same K value (Figure 3).   Comparisons of the proportional membership of the Tor spp. individuals estimated by the STRUCTURE software at varying K values from K = 2 to K = 7 are shown in Figure 4. Generally, low gene flow was observed among the populations (individual q values > 0.8), except in the HLKW and PPAP populations, which showed admixed ancestry for some of the samples.   (group II), and AGHR, FFRC, and TGN (group III) were correctly assigned to their respective populations. Nevertheless, some of the individuals in both the HLS and MSJ populations of group III were assigned to group I (30% members of HLS and 35.7% members of MJS, respectively). Individuals in the PPAP and TGN populations, 76.9% and 92.9%, respectively, were assigned to group III. Some members in the PPAP population had admixed ancestry with group II and group IV, while 7.1% of the members in the TGN population had admixed ancestry with group IV. Meanwhile, some members in the HLKW population were found assigned to group I and showed admixed ancestry (individual q values < 0.8) with the Tor spp. from Northern Peninsular Malaysia.
At K = 6, it was observed that population differentiation occurred in groups II and III. PHG and KENS populations of group II were differentiated into two distinct groups. HLS and MSJ populations were differentiated from group III to form another separate group.

Genetic Relatedness among Individuals
The genetic relatedness among individuals in all populations was successfully generated from seven relatedness estimators implemented in the COANCESTRY program. Table 5 shows the mean genetic relatedness among individuals of each population generated from each estimator. Generally, the relationships among samples were consistent across all estimators. The likelihood estimators revealed higher estimates than the moment estimators. For the moment estimators, positive genetic relatedness (Rxy) was only observed in the KENS and PHG populations using the [66][67][68] estimators. The highest relatedness values with mean Rxy 0.053 (LinchLi) were observed in the KENS population, indicating that the samples in this population were genetically more related than in the other populations. In most populations, the values of mean relatedness based on moment estimators appeared as negative values. A negative relatedness estimate means the individuals are less related than the average relatedness. The bootstrap analysis also revealed no statistical differences between populations in mean relatedness coefficients (Rxy) for all estimators (Table 5). Table 5. Genetic relatedness (R xy ) among individuals within each population and their correlation coefficients based on individuals' microsatellite (SSR) genotype data. The correlation coefficients between each pair of the seven relatedness estimators were calculated to compare different estimators in all the Tor populations. Overall, the relatedness estimators were positively correlated with correlation coefficients ranging from 0.531 to 0.986 among all populations. The range of correlation coefficients in each population is summarized in Table 5.

Bottleneck Analysis
The results from bottleneck analysis using IAM suggested that there were occurrences of bottlenecks in six populations, i.e., FRIGL stock (FFRC), GPRK, HLKW, Kg Esok, Jelebu, Negeri Sembilan (KENS), MSJ and PHG (Table 6, Supplementary Materials, Table S3). Analysis using TPM and SMM showed evidence of a recent reduction in population size within the population PHG, which supported the bottleneck phenomenon in this population. Nevertheless, there was no evidence of a recent bottleneck within three populations, namely HLS, PPAP and TGN. A mode-shift in the allele frequencies was detected in the AGHR population under mutation-drift equilibrium. However, there were no significant bottlenecks in all the mutational models in this population (Supplementary Materials, Table S3). The allele frequency distributions were approximately L-shaped in all the other populations, indicating no mode-shifts in the allele frequencies.

Estimation of Effective Population Size (Ne)
The estimates of effective population size (Ne) within each population determined from the linkage disequilibrium data are listed in Table 7. The Ne values were negative for the AGHR, EMS and GPRK populations. Negative estimates of Ne indicate genetic drift. However, there was no evidence for any disequilibrium caused by genetic drift due to the finite numbers of parents in these populations. Therefore, sampling errors and small sample sizes in particular populations such as EMS and AGHR might have contributed to this phenomenon (Supplementary Materials, Table S3).

Population Assignment
The outcome of the population assignment showed that 90% (163) of the individuals from the eleven populations were assigned correctly to the self-population, i.e., their original sampling locality. In comparison, only 10% (18) individuals of four populations were mismatched and assigned to other populations ( Table 7). Out of the four mismatches in HLKW, two individuals each were assigned to MSJ and GPRK. Out of the four mismatches in HLS, two individuals each were assigned to KENS and GPRK. Among the seven mismatches in MSJ, three individuals each were assigned to HLS and TGN, while one individual was assigned to AGHR. For the three mismatches in TGN, two individuals were assigned to FFRC, while one individual was assigned to AGHR. The seven populations with 100% correctly assigned individuals were AGHR, EMS, FFRC, HLS, KENS, PHG, and PPAP.

Discussion
In this study, the total genomic DNA extracted from both the frozen milt and scale samples of Tor spp. produced identical SSR loci with the expected sizes (Supplementary Materials, Figures S1 and S2). Similar findings were also obtained in the study on Whitefish (Coregonus lavaretus L.) using DNA from adipose fins and cryopreserved milt [71]. SSR alleles were also proven to be the same from different tissues of the same individual, as reported in a study of Chinese Holstein bulls using blood and semen samples [72]. Therefore, the total genomic DNA obtained from different sample types should not be disputed for its abilities to amplify similar and consistent PCR products.

Genetic Diversity of the Tor spp. Collection
The Tor collection in this study still retained a reasonable amount of genetic variation, with allele richness of 1 to 33 alleles per locus, polymorphic loci 76.45%, and an average PIC of 0.4942. Compared to studies on the same species by [32,34,36], the microsatellite loci assessed in the present study showed the highest polymorphism level and allelic richness. The number of alleles was influenced by the sample sizes [73,74]. Generally, the larger the sample size, the higher the number of alleles generated from the population. Allelic richness, corrected for unequal sample size among samples for each locus, is the preferred measure for genetic diversity.
Allelic richness is an essential element in conservation programmes, as it is indicative of a population's long-term potential for adaptability and persistence [75,76]. The allelic richness of the HLKW population (1-20 alleles per locus) obtained from Indonesia, believed to be T. tambroides, was quite similar to the allelic richness, i.e., 5-21 alleles per locus obtained on the same species in the study by [36]. For T. tambra in all the other populations that we studied, the allelic richness ranged from 1-2 alleles per locus to 1-17 alleles per locus, but with most populations (i.e., FFRC, PPAP, AGHR, PHG, EMS and GPRK) having the allelic richness of around ten alleles per locus. The reported allelic richness for T. tambra was ten alleles per locus in several studies [32,34,36].
Most SSR loci (90.9%) revealed a significant deviation from HWE after the Bonferroni correction. Departures from HWE in most SSR loci are plausible and have been commonly reported in studies of natural populations in a wide range of fish species [77][78][79][80][81]. Excess or lack of heterozygotes can cause departures from HWE. In this study, heterozygote deficiencies were observed in sixteen loci, whereas an excess of heterozygotes was detected in six loci. Heterozygote deficiency can be attributed to small sample size, the presence of inbreeding or genetic patchiness (Wahlund effect), reduction in effective breeding population size (Ne), and the presence of null alleles, which caused an excess of homozygotes. Small sample size may cause the founder effects, bottleneck effects or genetic drift [35,82]. Selective breeding, over-exploitation and anthropogenic disturbances resulted in occurrences of inbreeding and reduction in Ne [83][84][85]. The high mutation rate in microsatellites increases the occurrence of null alleles [86]. In this study, the small sample sizes of the AGHR and EMS populations (with <10 samples) could result in sampling error, inbreeding and the presence of bottleneck, all of which could cause heterozygote deficits, resulting in deviations from HWE and subsequently resulting in population differentiation.
The small sample size in both EMS and AGHR populations has resulted in low genetic variation. In populations with small sample sizes, rare genotypes are likely to be included in the samples. The small population size also increases inbreeding and genetic drift, thus reducing genetic variability over the long term [87]. When inbreeding occurs, the number of homozygotes will increase because the mating individuals have the same alleles. This excess homozygosity, in turn, causes heterozygosity deficit in the population [88][89][90][91].
Heterozygosity deficit in the present study could also be the consequence of null alleles or stuttering among the SSR loci. Both null alleles and stuttering were detected in nine populations in this study. Of the nine populations with null alleles, five populations, i.e., HLKW, MSJ, HLS, FFRC and PPAP, showed heterozygote deficits. A highly significant (p < 0.001) heterozygosity loss among and within the populations was also revealed in the present study, as revealed in the AMOVA and F-statistics analysis. For populations with null alleles in the present study, the null alleles' frequencies were relatively high in general (9.1-30.9%, data not shown). These populations showed highly significant genetic differentiation, with low gene flow among the populations as shown from the AMOVA.
Each locus that deviated from HWE can amplify at least one allele in all the samples. Thus, the low frequencies of null alleles were not enough to affect the analysis [92]. Generally, null alleles with low frequencies between 5% and 8% would only have a minor effect on the classical estimation of population genetic parameters such as genetic diversity, population differentiation, population FST and genetic distances. However, when null alleles were present at frequencies higher than 10%, it could affect the genotyping of individuals at some loci and lead to the under-estimation of genetic diversity and the over-estimation of population differentiation. Genetic distances tend to be underestimated when null alleles occurred at high frequencies [93]. The null allele at microsatellite loci with frequencies higher than 10% and its consequences in estimating population structure and differentiation have been reported in several studies on bivalve species [94][95][96][97][98]. Nevertheless, in a study on Wedge Clam (Donax trunculus), the presence of unusually high frequency of null alleles (>10%) did not appear to affect the FST estimates significantly [94].
The presence of null alleles in microsatellite data and their consequences on population genetic parameters had been tested using various analytical and simulation tools [99] and actual population samples [94]. As shown in the simulations by [100], those SSR loci with null alleles would slightly overestimate the FST but are unlikely to impact genetic differentiation significantly. Therefore, SSR loci with null alleles that did not seem to alter the overall outcome of assignment testing could still be included in the studies. In this study, 163 out of 181 (90%) individuals were correctly assigned to their respective populations. Five out of nine populations with null alleles AGHR, EMS, FFRC, GPRK, KENS, PHG and PPAP exhibited 100% correctly assigned individuals. Therefore, all 22 SSR loci were kept and used in the present study.

Genetic Differentiation and Genetic Structure Analysis
In the present study, the fixation indices F ST , F IS and F IT indicated a significant reduction in heterozygosity within and among the populations due to non-random mating. F IS values significantly higher or lower than zero reveal inbreeding or outbreeding, respectively [101]. As reflected in the AMOVA, genetic differentiation was at a medium level in the overall Tor spp. collection (F ST = 0.149), as evidenced by the low level of gene flow estimate (N m = 1.548 per generation) and highly significant level (p < 0.001) of inbreeding among individuals within population (F IS = 0.106). Generally, it was observed in this study that the inter-population differentiation was low among the Tor spp. populations in Malaysia. Similar findings were also reported in the previous study by [33] on the same species. Nevertheless, a mixed level of population differentiation, from low to high, was observed among the Tor populations, with the pairwise population F ST values ranging from low (0.016) to very high genetic differentiation (0.237) following the F ST classification by [102]. Significant differences (p < 0.05) were detected in 83.6% of the pairwise comparisons among populations. These confirmed their population divergence.
Genetic differentiation can be attributed to migration, geographical barriers, genetic drift and gene mutation [103,104]. Low gene flow resulted in small genetic variation transfer from one population to another among the Tor spp. collection. The proportional membership of Tor spp. individuals with low genetic admixtures (individual q value > 0.8), as shown in Figure 4, revealed a low level of gene flow. Since the Nm value across the overall population in this study was greater than one, it was likely that genetic drift was not the main factor accounting for genetic differentiation among the Tor spp. populations. From the pairwise F ST generated for the Tor spp. populations, results presented relatively higher differentiation between HLKW and all other populations, indicating that the population is most likely a separate species (T. tambroides). The same observation was also seen in the EMS population, which revealed significantly higher differentiation between HLKW, KENS and GPRK populations but no significant differentiation from other remaining populations. For AGHR and EMS, a small sampling size looks likely to be the main cause of population differentiation. For other populations, it seems more likely that the Wahlund effect due to geographic distances or habitat fragmentation may have caused the local genetic differentiation among the Tor spp. by limiting the gene flow among the populations [105,106]. Habitat fragmentation resulted in a reduction in the genetic diversity and viability of the small and isolated populations, consequently impacting the population genetic structure [107].

Genetic Distance and Population Structure among Sampling Locations
Genetic distance among the Tor spp. populations ranged from 6.2% to 42.3% in this study. A great genetic distance was observed between the HLKW population and the other populations, with a pairwise genetic coefficient ranging from 31.8% to 42.3%. This finding again supported the idea that the HLKW population was from a different Tor species than the other populations because there was no sexual selection between different species. The pairwise genetic coefficient of the T. tambra populations ranged from 6.2% to 29.3%. T. tambra from EMS of Sarawak also showed a higher genetic distance from other T. tambra populations in Peninsular Malaysia, ranging from 27.5% to 35.3%. The distinct population clusterings were further supported by the results of the population assignment tests, using both PCoA analysis and Bayesian cluster analysis. A high percentage of correctly assigned individuals indicated substantial genetic divergence among the populations [108]. The pattern of clustering using Bayesian analysis was similar to the PCoA, with four genetically distinct groups formed according to the geographical origins of the Tor spp. samples obtained. Moreover, the genetic admixture of all the Tor stocks was relatively low (individual q value > 0.8), indicating that individuals in each population were weakly differentiated. These genetically uncontaminated populations served as ideal sources of fresh alleles for future aquaculture and restocking programmes [27].
Similar to the genetic structures revealed in both the PCoA plot and the model-based cluster analysis at K = 4, the UPGMA clustering of Tor populations could also be explained according to their geographical distribution. Populations in cluster D (GPRK, HLS, MSJ, KENS, PPAP, AGHR, FFRC and TGN), as illustrated in the UPGMA dendrogram, had small genetic distances which ranged from 6.2% to 18.3%, indicating that these populations were closely related and had a recent common ancestor. It was apparent that these closely related populations were from the same source and origin. Similar clustering was also observed in the previous study by [33], i.e., samples from Negeri Sembilan, Pahang and Perak were closely related and grouped in the same cluster.
Samples of the FFRC population were obtained from Kenyir Lake, while samples of the TGN population were collected from the Terengganu River, originating in Kenyir Lake. Samples of population PPAP were collected from the Pahang River. In contrast, the samples of the KENS population were obtained from the Kenaboi River, which is one of the tributaries of the Pahang River. However, the close relationship between samples of the HLS population from the Hulu Langat River and samples of the MSJ population from Mersing Johor could not be explained according to their geographical locations because these two river systems were not from the same origin. The Hulu Langat River flowed westwards of Peninsular Malaysia and ended at the Straits of Malacca, while the Mersing River flowed to the southeast of Peninsular Malaysia and ended at the South China Sea.
Based on the population structure derived from the STRUCTURE analysis at K = 4, it was noticed that individuals in populations HLS and MSJ had similar genetic compositions. Tor individuals from both the HLS and MJS populations comprise a mixture of Tor with distinct genetic contributions from the North (GPRK) at ≈35% and East Coast (PPAP, AGHR, FFRC and TGN) of Peninsular Malaysia at ≈65%. Generally, the samples of populations HLS and MSJ were more closely related genetically to population GPRK. This observation was also supported in the population assignment test, by which 25% and 33.3% of the HLS and MSJ populations respectively mismatched one another and mismatched with the GPRK population or populations from the East Coast (TGN and AGHR) of Peninsular Malaysia. The admixture percentages were low among individuals in the HLS and MSJ populations, indicating that they did not interbreed. Therefore, it looked more likely that the local fish traders who supplied the HLS and MSJ stocks had obtained their fish stocks from the same source.
On the other hand, the samples of AGHR, which originated from the Keniam River, were found to be more closely related to the stocks from FFRC and TGN. In the meantime, both the GPRK and EMS were found to cluster in the same grouping in both PCoA plot and Bayesian cluster analysis, indicating that samples from Perak and Sarawak were closely related. These results again highlighted the possibility of mislabeling the EMS sample by the fish trader who supplied the fish. The so-called EMS broodfish was most probably from the local source in north Peninsular Malaysia. Unfortunately, no other samples from the same population were available for verification.

Genetic Relatedness among Individuals
Genetic relatedness shows the relationship between individuals in a population [109]. Knowledge of the genetic relatedness of individuals in a population is important in genetic analysis to estimate heritabilities, genetic correlations and breeding values for developing optimized strategies for artificial selection and conservation [110]. The mean Rxy among all Tor samples revealed by different relatedness estimators in the present study were not significantly (p > 0.05) different among all Tor populations across all estimators. It was also observed that mean Rxy based on the moment estimators showed negative values in most populations. A negative Rxy value indicates that the individuals are less related than the Animals 2021, 11, 2633 20 of 28 average relatedness. It also reflected how much lower the probability of recent coalescence is for the individuals relative to the average probability for all considered individuals from the reference population [111]. Meanwhile, mean Rxy based on the likelihood estimators was slightly higher with positive values and highly correlated, especially among HLKW, HLS and MSJ populations. A similar observation was also reported in a study by [31] on seabass (Lates calcarifer), in which the Rxy estimates for wild and hatchery stocks did not differ significantly (p > 0.05). However, a significant increase of genetic relatedness with a high correlation coefficient and a decline in Ne estimates were detected within a selectively bred population from the hatchery stocks. Therefore, selective breeding has caused a significant loss of genetic variation, allelic diversity and overall heterozygosity compared to the parental generation.
The pattern of genetic relatedness has a direct functional relationship with the Ne of the population [109]. The next generation would have a higher probability of sharing the same parents if interbreeding was performed between individuals from populations with small Ne [112]. As a result, the mean and variance in pairwise relatedness within the next generation are expected to increase with decreasing Ne [109]. Therefore, it is also worth noting that extra caution should be taken when selecting broodfish of this Tor spp. collection for cross breeding in the future to avoid a rapid increase in genetic relatedness and reduction in Ne.

Population Bottleneck, Effective Population Size (Ne) and Population Assignment
In the present study, a recent population bottleneck was detected in FFRC, GPRK, HLKW, KENS, MSJ and PHG populations and a mode-shift in allele frequencies in AGHR population. Sampling error in GPRK and small sample size in AGHR and EMS populations have resulted in negative effective population size (Ne) estimates in these populations. Ne measures the rate of inbreeding and genetic drift in the population [113]. Population bottleneck and Wahlund effect could influence the Ne [114]. Generally, a mass reduction in the Ne can lead to a large decrease in SSR variations [115]. Consequently, the genetic differentiation, gene flow and genetic diversity of the population will be affected [114].
The accuracy of the population assignment test did not seem to affect much by null alleles in the present study. In the population assignment test, a high percentage (i.e., 90%) of individuals correctly assigned to respective populations were observed for the Tor spp. collection. The percentage has doubled the previous study (i.e., 42.8%) by [33]. As reported in many studies, SSR loci with null alleles could lower the power to correctly assign individuals in the population assignment test [100]. Therefore, loci less prone to null alleles should always be preferred in population genetic studies [93]. Thus, the population assignment test outcome is more affected by the population differentiation and might improve by having an ample number of loci [100].

Genetic Information and Broodstocks Management
An appropriate base population containing selected fish with desirable characteristics that harbour adequate genetic diversity is a prerequisite for successful broodstock development and effective genetic management. Therefore, the genetic information obtained from this study is essential to formulate appropriate strategies for genetic resource protection of Tor spp. and for their utilization in aquaculture development, especially for selective breeding programmes. The high percentages of departures from HWE as the consequences of excessive homozygosities among the SSR loci showed an urgent need for proper management strategies of these Tor stocks. It was evident from the genetic structures obtained that the Tor spp. collected for the establishment of the hatchery population comprised the natural gene pool of four distinct genetic sources. Understanding the connectivity among the populations provides a useful tool to determine appropriate strategies for fisheries conservation, effective management and genetic improvement of the Malaysian mahseer.
Analysis of genetic relationships is an essential component in a genetic improvement programme. It provides information about genetic diversity, and it also offers the platform for the stratified sampling of breeding populations [116][117][118]. For sustainable aquaculture development, strategies to minimise the loss of genetic variation of the captive breeding populations should be undertaken through minimising genetic drift, while maximising the Ne [112]. A genetic admixture of several different genetic stocks that can help increase the mean number of alleles and heterozygosity is the preferred strategy. This management strategy has been applied successfully in some aquaculture species [30,[119][120][121].
Proper knowledge of stock structure is necessary to preserve genetic diversity and ensure sustainable exploitation of the broodstocks. With reasonable variability, which ranged from intermediate to high levels in the current Tor spp. collection, it should serve as a valuable germplasm resource and a suitable base population to start with for future utilization and genetic improvements of this species. Excessive homozygosity caused the departures from HWE we observed, highlighting the need for better management and planned breeding programmes of these Tor stocks.
For better prospects, Tor spp. stock from east Malaysia (Sabah and Sarawak) and northern Peninsular Malaysia (Kelantan) shall be included in future studies to best characterize the Tor spp. in Malaysia, which could better understand the current genetic status of Malaysian mahseer in the whole country. Future samples shall be obtained from more reliable sources for the stocks of Hulu Langat River and Mersing Johor populations. In genetic conservation programmes, milt samples of the Tor stocks from the GPRK, PHG and east Malaysia populations should be prioritized for sperm cryo-banking. Besides that, the polymorphic SSR loci with considerable genetic variations used in this study and those with private alleles are potentially useful for pedigree and parentage analyses of the new breeds from the stocks, as well as in the development of marker-assisted selection technology (MAS) for Tor spp. in this region. The SSR markers used in the study are expected to be useful for the ongoing inter-population diallel cross-breeding and growth performance assessments of the fingerlings produced from the same pool of candidate broodstocks. These SSR markers are also of potential use in monitoring the genetic impacts of restocking activities on the wild populations of Tor spp. The levels of genetic variation, which included measures of allelic diversity, overall heterozygosity, Ne and genetic relatedness, should be monitored continuously for the breeds resulting from these broodstocks.

Conclusions
The SSR genotyping in this study has successfully established the SSR profile for the Tor spp. collection obtained from 11 populations using 22 SSR loci. The information on SSR polymorphism and diversity, gene flow, population differentiation, genetic structure, genetic relatedness and their demographic aspects of the Tor spp. collection is obtained from the study. This finding facilitated the reliable classification of the Tor spp. stocks and provided excellent information on genetic variabilities and population genetic structures of the Malaysian mahseer stocks obtained from various geographical locations. The Tor spp. collection still retained their genetic variation but exhibited excessive homozygosity among individuals within population and little genetic variation transfer between the populations. Generally, the levels of genetic variation and the population structures corresponded to the geographical origins of the Tor spp. The private alleles we found to be present in the different populations could serve as specific markers for the respective populations. The results on the genetic diversity, genetic structure and relationships, and the methodology used in the study may be utilized in the future for constructing genetic linkage maps for marker-assisted breeding and for identifying growth trait-associated markers in Tor spp.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10 .3390/ani11092633/s1, Figure S1: Total genomic DNA extracted from some of the (a) post-thawed milt samples (i.e., sample 1-12) and (b) scale samples (i.e., sample 13-28) of Tor spp. after electrophoresis on 1% agarose gel in 1× TAE buffer. λ is the lambda Hind III DNA marker. Figure S2: PCR products amplified for marker NY02 from total genomic DNA extracted from (a) milt sample and (b) scale sample of Tor spp. after electrophoresis on 1.8% agarose gel has yielded targeted fragments at the sizes between 238-274 bp. 100 bp is the GeneRuler DNA ladders. Figure S3: PCoA plot based on genetic distance matrix of 181 Tor spp. collected from eleven populations utilizing data from 22 SSR genotype with data standardization. Figure S4: Magnitude of delta K (∆K) statistics for the Tor spp. collection based on 22 microsatellite loci. ∆K as a function of the number of putative genetic clusters, K. The most likely K value identified, i.e., with the highest value was K = 4. Table S1: List of microsatellites (simple sequence repeat, SSR) markers loci and primer sequences used in genotyping of Tor spp. Table S2: Microsatellite diversity and polymorphism of Tor spp. by different sample types. Table S3: Summary of causes attributed to the departure from HWE in each of the Tor populations. Table S4: Pairwise Nei's genetic distance coefficient (below diagonal) and pairwise FST values (above diagonal) for the Tor spp. populations studied.
Author Contributions: P.C.C. Conceptualization, software, validation, formal analysis, investigation, data curation, writing-original draft preparation, writing-review and editing, visualization, project administration and funding acquisition; P.C.C. and A.C. methodology; P.C.C. and J.M.Z. resources; A.C., M.Y.I.-S. and C.M.C. supervision; P.C.C., A.C. and S.G.T. validation and writing-review and editing. All authors have read and agreed to the published version of the manuscript.
Funding: This work was partially funded by the Putra Grant Universiti Putra Malaysia GP-IPS 95359000.