Sex-Determination Mechanisms among Populations within Cryptic Species Complex of Calotes (Squamata: Agamidae: Draconinae)

Sex-determination mechanisms and sex chromosomes are known to vary among reptile species and, in a few celebrated examples, within populations of the same species. The oriental garden lizard, Calotes versicolor, is one of the most intriguing species in this regard, exhibiting evidence of multiple sex-determination modes within a single species. One possible explanation for this unusual distribution is that in C. versicolor, different modes of sex determination are confined to a particular population or a species within a cryptic species complex. Here, we report on a population genetic analysis using SNP data from a methylation-sensitive DArT sequencing analysis and mitochondrial DNA data obtained from samples collected from six locations: three from Bangladesh and three from Thailand. Our aim was to determine whether C. versicolor is best described as a single species with multiple lineages or as multiple species, as well as if its sex-determination mechanisms vary within or between species. We present evidence that the latter possibility is the case and that C. versicolor comprises a complex of cryptic species. We also identify sex-linked markers within these species and use them to identify modes of sex determination. Overall, our results suggest that different sex-determination modes have evolved among closely related species and within populations of


Introduction
The determination of sex is one of the most fundamental and yet highly variable mechanisms in the animal kingdom [1], and it can evolve rapidly. Variation in this mechanism has been observed within many lineages of animals, plants, and even closely related species or populations [1,2]. For example, in the Japanese wrinkled frog, Glandirana (Rana) rugosa (Temminck and Schlegel, 1838), four genetic forms are distributed in different geographic regions of Japan [3,4], with male and female heterogametic sexes indicating a turnover of sex chromosomes within this single species across its range. The oriental garden lizard, Calotes versicolor (Daudin, 1802), is another widely cited example of a species in which different forms of sex determination are present [5][6][7], and it is this species that we report on here.
Determining heterogametic sex by identifying sex chromosomes is not easy if the sex chromosomes are cryptic. These chromosomes may contain sex-specific genes or sequences that can reveal their sex-determination mode (XY or ZW) [8][9][10][11][12][13][14]. This may be true even in DNA 2021, 1 50 the case of species with temperature-dependent sex determination (TSD) and, therefore, lacking sex-specific chromosomes or where species show diverse sex ratio patterns complicated by gene-temperature interactions. In these cases, the development of sex-linked markers can provide valuable insights by enabling genetic sex to be identified and correlated with phenotypic sex. Such markers have been successfully identified in several reptilian taxa [8,11,12,[14][15][16][17], where they have been used to identify sex-determination modes and cases of sex reversal [18,19].
It is speculated that the differential methylation of the promoters of genes is involved in sex determination at sex-specific temperatures [20,21], but a full understanding of the mechanism is unknown [22]. Differences in gene methylations occurring between genetic males and females have been observed in European sea bass, Dicentrarchus labrax [23]; Nile tilapia, Oreochromis niloticus [24] and half-smooth tongue sole, Cynoglossus semilaevis [25]. Differential methylation patterns associated with phenotypic sex have also been reported in different developmental stages of the red-eared slider turtle, Trachemys scripta [26]. However, epigenetic changes do not involve a change in the nucleotide sequence but rather in the methylation of histones to influence the folding of chromatin and thus the availability of genes for expression [20,27] or in DNA methylation [27]. Restriction site-associated DNA sequencing, such as RADseq and DArTseq, has been proposed as a method for developing sex-linked markers, i.e., sex-linked loci in taxa with homomorphic sex chromosomes [8,10]. This has successfully been used to develop sex-linked markers and infer the sex-determining mode for several amphibian, squamate, and fish species [10,11,14,[28][29][30][31][32][33]. Diversity Arrays Technology (Bruce, Australia), the developer of DArTseq™, has developed a methylation-sensitive DArTseq (DArTseqMet) that uses two different restriction enzyme isoschizomers (one of them CpG methylation-sensitive) to identify sex-specific markers. This has the potential to reveal DNA methylation-mediated sex determination.
The oriental garden lizard, C. versicolor, belongs to the subfamily Draconinae of the family Agamidae. It has a wide distribution and is found from Iran to Malaysia through South Asia and Southeast China across highly heterogeneous habitats [34]. It has been introduced to different countries, including the USA (Florida), Indonesia (Celebes), Maldives, Seychelles, and Kenya, and it is found across highly heterogeneous habitats in different elevation ranges. This species is sometimes considered taxonomically neglected [35], and significant variations in body size among populations of C. versicolor have been reported [36]. Studies have also shown a complex evolutionary history in this species and revealed several distinct mitochondrial lineages. This emphasizes the idea that this species may actually comprise a complex of multiple species across its known range [37,38].
This species is known neither to have chromosomal (genotypic) nor temperaturedependent modes of sex determination; rather, it has an indeterminate, bipotential gonad throughout embryonic development [39]. It lacks heteromorphic sex chromosomes [40][41][42], but gene expression analysis has shown this species to have genotypic sex determination rather than environmental sex determination, as in some other reptiles [5,7]. However, Doddamani et al. [6] claimed that C. versicolor has TSD with a novel female-male-femalemale (FMFM) pattern of offspring sex ratio based on egg incubation experiments at differential temperatures involving the verification of sexes via the histological examination of the gonad and the presence of secondary sexual characteristics. It is not known whether this observed variation is due to the existence of multiple cryptic species, the transition of sex-determination mechanisms and sex chromosome turnovers, or the existence of multiple thermosensitive points among closely related species or populations. Therefore, the investigation of sex-determination modes and sex chromosomes within this species (or species complex) potentially provides a good model for examining variation in the mode of sex-determination mechanisms in a species with a wide geographic distribution.
The primary aim of this study was to identify sex-determining mechanisms (i.e., modes of sex determination and sex chromosome heterogamety) across the Calotes species complex. The first step was to determine whether C. versicolor individuals from different locations were considered as discrete genetic lineages within species or a continuum DNA 2021, 1 51 within a species. We collected known C. versicolor samples from Bangladesh and Thailand to determine the extent of genetic differentiation within its range, and we describe the probable sex-determination mechanisms between populations. We performed a populationlevel analysis to determine whether our collected samples were genetically distinct and then used this information to identify the sex-determination modes that could be ascribed to each discrete unit. Both of these forms of data were used to reveal whether C. versicolor consists of a series of cryptic species. in the mode of sex-determination mechanisms in a species with a wide geographic distribution.

Materials and
The primary aim of this study was to identify sex-determining mechanisms (i.e., modes of sex determination and sex chromosome heterogamety) across the Calotes species complex. The first step was to determine whether C. versicolor individuals from different locations were considered as discrete genetic lineages within species or a continuum within a species. We collected known C. versicolor samples from Bangladesh and Thailand to determine the extent of genetic differentiation within its range, and we describe the probable sex-determination mechanisms between populations. We performed a population-level analysis to determine whether our collected samples were genetically distinct and then used this information to identify the sex-determination modes that could be ascribed to each discrete unit. Both of these forms of data were used to reveal whether C. versicolor consists of a series of cryptic species.  A total of 49 samples of C. versicolor were captured and sexed (Table 1) by everting hemipenes in males following the method of Harlow [43,44] (samples from Bangladesh) in the field or by dissecting the gonads in the lab (samples from Thailand). Tail snips from the Bangladesh samples were collected in the field using sharp scissors (sterilized between samples using hydrogen peroxide) and immediately transferred to a 5 mL 1x Hanks' A total of 49 samples of C. versicolor were captured and sexed (Table 1) by everting hemipenes in males following the method of Harlow [43,44] (samples from Bangladesh) in the field or by dissecting the gonads in the lab (samples from Thailand). Tail snips from the Bangladesh samples were collected in the field using sharp scissors (sterilized between samples using hydrogen peroxide) and immediately transferred to a 5 mL 1x Hanks' balanced salt solution (Sigma-Aldrich, St Louis, MO, USA), kept at room temperature (~25 • C), and transported to the University of Canberra within 5 days. Lizards from Thailand were collected by local people under the supervision of Tulyawat Prasongmaneerut, and the DNA was extracted from either tail or liver tissues at the Kasetsart University, Thailand and transported to the University of Canberra.

DNA Extraction and Sequencing
DNA was extracted and used for two different DNA-sequencing approaches: (i) Sanger sequencing for mitochondrial DNA and (ii) DArTseq for the nuclear DNA (SNP and Silico-DArT data). For Sanger sequencing, we randomly selected three individuals from each of the sampling locations-Dhaka (BD_DHK), Feni (BD_FEN), and Habiganj (BD_HBJ) from Bangladesh and Bangkok, Samut Prakan, and Khon Kaen from Thailand (TL_CAV). DNA was extracted from Bangladeshi samples using the DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany) following manufacturer's instructions and from Thailand samples following the methods of Srikulnath et al. [45]. The concentrations were measured using a NanoDrop™ One Microvolume UV-Vis Spectrophotometer (Thermo Fisher, Waltham, MA, USA). DNA samples from C. versicolor were sequenced by Diversity Arrays Technology (DArTseq™) following their standard protocol.
A region of the mitochondrial genome spanning tRNATrp, the ND2 gene, and the COI gene was targeted, and all samples were amplified and sequenced with primers designed by Huang et al. [38], L3705 and H5162, which were synthesized by Integrated DNA Technologies, Inc. (IDT, Singapore). Standard polymerase chain reactions (PCR) were performed in 25 µL reactions, including approximately 1 µL of template DNA (25 ng/µL), 1 µL of each primer (10 pmol/µL), 12.5 µL of MyTaq™ HS Red Mix, 2x (Bioline, Memphis, TN, USA), and 9.5 µL of nucleus-free water (Ambion, Austin, TX, USA). PCR was conducted following the conditions set by Huang et al. [38] as an initial denaturing step at 95 • C for 4 min, 35 cycles of denaturing at 94 • C for 35 s, annealing at 65 • C for 45 s with an extension at 72 • C for 90 s, and a final extension step at 72 • C for 8 min. The PCR products were electrophoresed in 0.8% agarose gels and visualized with SYBR ® Safe DNA gel stain (Invitrogen, Eugene, OR, USA). The PCR products (amplicons) were purified with the PureLink™ PCR Purification Kit (Invitrogen, Carlsbad, CA, USA) and sequenced at the ACRF Biomolecular Resource Facility (BRF) of the Australian National University using the corresponding PCR primers.

DArTseqMet-Methylation Analysis
Diversity Arrays Technology has developed a dedicated methylation analysis (DArTse-qMet) using the DArTseqTM platform ( Figure 2). Informative silicoDArT presence-absence markers were used to genotype individuals using this DArTseqMet method. In this study, for silicoDArT presence-absence markers, two complexity-reduction methods were created for each sample. Both methods used the same 'rare cutting' restriction enzyme, while two isoschizomer restriction enzymes that differed in sensitivity to cytosine methylation were used as 'frequent cutting' enzymes for the DArTseqMet.
Diversity Arrays Technology has a pool of known restriction enzyme combinations, and multiple combinations, chosen based on their previous usages on related species, were used for the initial study. Later, only the combinations that yielded the maximum number of loci were selected. Single nucleotide polymorphism (SNP) data were obtained using the conventional DArTseq method and genotyping by sequencing performed by DArTseq™ using a combination of DArT complexity-reduction methods and next-generation sequencing following protocols described by Kilian et al. [46] and Lambert et al. [10]. For SNP data, a PstI (rare cutter with recognition sequence 5 -CTGCA|G-3-HpaII (frequent cutter with recognition sequence 5 -C|CGG-3 ) combination was used.
ence-absence markers were used to genotype individuals using this DArTseqMet method. In this study, for silicoDArT presence-absence markers, two complexity-reduction methods were created for each sample. Both methods used the same 'rare cutting' restriction enzyme, while two isoschizomer restriction enzymes that differed in sensitivity to cytosine methylation were used as 'frequent cutting' enzymes for the DArTseqMet. In contrast to the conventional DArTSeq analysis, two different rare cutting restriction enzymes were used for DArTseqMet SilicoDArT data; these were (1) SbfI (recognition sequence 5 -CCTGCA|GG-3 ) and (2) PstI (recognition sequence 5 -CTGCA|G-3 ). The primary goal here was to develop a series of sex-linked markers in C. versicolor and infer their sex-determination modes based on sex bias within these markers. The two different restriction enzymes (isoschizomers; as frequent cutters) that recognized the same sequence (5 -C|CGG-3 ) were (1) HpaII as CpG methylation-sensitive and (2) MspI as CpG methylation-insensitive. A comparison of the sequence composition of the two resulting representations (libraries) revealed differences in methylation patterns across the genome and had the potential to reveal any methylation-mediated sex determination ( Figure 2). As a result, the data (presence/absence of restriction fragments in representation, i.e., PA markers) were obtained from four different treatments with four restriction enzyme combinations as (1) SbfI-HpaII, (2) SbfI-MspI, (3) PstI-HpaII, and (4) PstI-MspI. Sexlinked markers were identified and analyzed using MS 'Excel' [32] and 'R' packages rdist and ggplot2.
Out of 49 samples, a total of 45 (23 males and 22 females) were used for data analysis (for sex-linked markers) using both SNP and SilicoDArT data ( Table 1).

Population and Phylogenetic Analyses
Both nuclear (DArTseq-SNP) and mitochondrial DNA (Sanger sequencing) data were used for population genetic analyses. For all analyses, we filtered the total 65,652 SNP sites identified by DArTseq based on (i) call rate by locus (the proportion of samples of which genotype call was certain; at least 80%, i.e., 0.8-1.0) and (ii) RepAvg (the proportion of technical replicate assay pairs for which the marker score was consistent; at least 95%, i.e., 0.95-1.0). The genetic dissimilarity of individuals and populations was visualized using PCoA (principal coordinate analysis) with the R package dartR [47]. Population differentiation was calculated using the R package hierfstat and StAMPP (pair-wise Fst values). We also calculated the fixed differences (i.e., counts of loci for which all alleles are in one state in one population and in the other state in the other population [48,49]) using R packages dartR, reshape, and adegenet. Isolation by distance was assessed using Mantel's test in the R package dartR.
Data obtained from Sanger sequencing were edited and aligned, and consensus sequences were created using BioEdit software (version 7.2.5). We also used BLAST (Basic Local Alignment Search Tool) for the consensus sequences (NCBI BLASTN 2.10.0+) and downloaded top hit sequences (threshold E-value: 0.0) from GenBank to construct maximum likelihood (1000 bootstraps) and Bayesian (1,100,000 iterations) gene trees using the PHYML and MrBayes plugins of Geneious Prime (version 2020.2.3) software with default parameters, respectively. HKY85 was chosen as the substitution model determined using MEGA version 10.1.8. After careful visual checking (peaks in the chromatogram), each sequence was further edited and aligned, and phylogenetic trees were constructed while only considering the region of the ND2 gene together with sequences from Chamaeleo chamaeleon (1013 bp; NCBI accession number EF222202.1) and Uromastyx ornata (1013 bp; NCBI accession number AB113822.1) as outgroups of the family Agamidae and subfamily Draconinae, respectively. We also used a second C. versicolor sequence (1013 bp; NCBI accession number KC875647.1) from Hainan Province, China, for this analysis because we applied the sequencing primers from [38]. However, our first hit sequence (1013 bp; NCBI accession number DQ289476.1) was from a study in Myanmar [37]. Therefore, we were able to cover a wider range of C. versicolor, including those from Bangladesh (this study), Myanmar, Thailand (this study), and China.

Sex-Linked Marker Analysis
We identified sex-linked SNPs and silicoDArT (presence-absence; PA) loci independently for each population. Following criteria set by Lambert et al. [10] with modifications. For the sex-linked markers, we filtered the total 65,652 SNP and 741,396 PA sites identified by DArTseq based on (i) call rate by locus (at least 80%, i.e., 0.8-1.0 and recalculated for each population) (for SNP and PA markers), (ii) average read depth (the sum of the tag read counts for all samples divided by the number of samples with non-zero tag read counts; ≥10) (for PA markers), and (iii) reproducibility/RepAvg (at least 90%, i.e., 0.9-1.0) (for SNP and PA markers).
The initial screening of candidate sex-linked markers was performed by filtering loci that met the criteria for ZZ/ZW or XX/XY systems. For a ZZ/ZW system, at least one condition of loci with female heterozygosity above 75% and male heterozygosity less than 20%, male reference allele homozygosity of more than 80% and female reference allele homozygosity of less than 10%, or male SNP allele homozygosity of no more than 10% and female SNP allele homozygosity of more than 90% had to be met for the particular loci to be considered as a sex-linked marker. For an XX/XY system, we used the same criteria but with opposite proportions.
Previous studies have identified that at least 13-15 individuals per sex are required to minimize the occurrences of false-positive sex-linked loci provided that the loci are not in linkage disequilibrium [10,30,50]. Since our sample sizes were small (4-9 per sex per population for SNP and 1-8 for PA marker analysis), we only considered loci that were perfectly sex-linked, especially for silicoDArT PA data. Spurious sex association is more likely when the sample size is small, so we used the formula Pi = 0.5 n [10], where Pi is the probability that any locus is sex-linked by chance and n is the sample size (male and female together). Pi was multiplied by the number of SNP/PA markers (remaining after filtering), which gave an estimation of the number of random sex-linked loci produced through analysis.

Population Analysis with DArTseq Results
The PCoA plot revealed three distinctive clusters within the sampled populations of C. versicolor: (1) Dhaka and Feni samples together, (2) Habiganj samples, and (3) Thailand samples together (Figure 3). Most of the variation is represented in axis 1 (62.3%), while axis 2 represents lower variation (6.1%). Pairwise genetic differentiation (Fst) between pairs of sites ranged between 0.2 and 0.86 (Table 2). Genetic structures based on different localities in Thailand showed low differences, with Fst ranging between 0.09 and 0.22. By contrast, and in line with the level of differentiation suggested by the PCoA, Fst values were extremely high among the Bangladesh localities. Genetic structure based on different localities in Bangladesh showed low significant differences between Dhaka and Feni, with an Fst value of 0.21, to highly significant differences between Dhaka and Habiganj and between Feni and Habiganj, with Fst values of 0.81 and 0.84, respectively. An analysis of fixed differences confirmed the diagnosability (significant number of fixed differences) of each of the six clades as Dhaka, Feni, and Habiganj from Bangladesh and Bangkok, Khon Kaen, and Samut Prakan from Thailand. Fixed difference refers to the number of loci where the populations share no alleles at that locus [48]. Lower fixed differences between populations indicate recent divergence or higher gene flow, and higher fixed differences indicate periods of long isolation and lack of gene flow. The divergences in pairwise comparisons ranged from 338 to 13,900 fixed differences between the clades Pairwise genetic differentiation (Fst) between pairs of sites ranged between 0.2 and 0.86 (Table 2). Genetic structures based on different localities in Thailand showed low differences, with Fst ranging between 0.09 and 0.22. By contrast, and in line with the level of differentiation suggested by the PCoA, Fst values were extremely high among the Bangladesh localities. Genetic structure based on different localities in Bangladesh showed low significant differences between Dhaka and Feni, with an Fst value of 0.21, to highly significant differences between Dhaka and Habiganj and between Feni and Habiganj, with Fst values of 0.81 and 0.84, respectively. An analysis of fixed differences confirmed the diagnosability (significant number of fixed differences) of each of the six clades as Dhaka, Feni, and Habiganj from Bangladesh and Bangkok, Khon Kaen, and Samut Prakan from Thailand. Fixed difference refers to the number of loci where the populations share no alleles at that locus [48]. Lower fixed differences between populations indicate recent divergence or higher gene flow, and higher fixed differences indicate periods of long isolation and lack of gene flow. The divergences in pairwise comparisons ranged from 338 to 13,900 fixed differences between the clades (Table 3). Among the Bangladesh samples, the divergence of populations ranged from DNA 2021, 1 57 652 to 13,900 fixed differences. Dhaka samples differed from Feni samples by 652 fixed differences and from Habiganj by 13,895 fixed differences. Feni and Habiganj samples differed by 13,900 fixed differences. However, among the Thailand samples, the divergence of populations was comparatively lower, ranging from 338 to 439 fixed differences. Between the Bangladesh and Thailand samples, the divergence of populations ranged from 1544 to 12,753 fixed differences. The lowest fixed difference was observed between Habiganj and Bangkok with 1544 loci, and the highest was between Dhaka and Khon Kaen with 12,753 loci.

Phylogenetic Analysis with Sanger Sequencing Results
As found by Huang et al., we expected the overall region of the mitochondrial DNA sequenced to be 2663 bp in total, spanning tRNATrp, ND2, and COI for each of the C. versicolor samples [38]. However, our consensus sequences were found to be much shorter, instead ranging between 882 and 1377 bp long. The BLASTN of these consensus sequences resulted in hits of Dhaka (BD_DHK) and Feni (BD_FEN) samples with Calotes calotes, Calotes htunwini, and C. versicolor, while sequences from Habiganj (BD_HBJ) and Thailand (TL_CAV) only had hits with C. versicolor sequences (Table 4). We could not amplify the target amplicon from the Bangkok samples using this method, so only the Samut Prakan and Khon Kaen samples from Thailand were included in the analysis.
We deposited all new sequences in GenBank with accession numbers MW478730-MW478740. The maximum-likelihood and Bayesian dendrograms derived from the consensus sequences are shown in Figure 4. We set thresholds to support a clade, i.e., >70% bootstrap in ML trees and >75% consensus support in Bayesian trees. Based on these, three distinct clades were observed: one from the Thailand TL_CAV samples and two from the Bangladesh samples as Dhaka-Feni (BD_DHK-BD_FEN) and BD_HBJ, respectively. The BD_HBJ and TL_CAV samples shared a common root within the clade that included different C. versicolor populations. By contrast, the BD_DHK and BD_FEN samples from Bangladesh fell within the clade with a common root from C. calotes (Figure 4). This indicated that C. versicolor samples comprised three genetically distinct clades with the possibility of two distinct species under the genus Calotes.  [51], and C. versicolor sequences were taken from the work of Zug et al. [37]. Only the sequences in bold were used for phylogenetic tree construction.  We deposited all new sequences in GenBank with accession numbers MW478730-MW478740. The maximum-likelihood and Bayesian dendrograms derived from the consensus sequences are shown in Figure 4. We set thresholds to support a clade, i.e., >70% bootstrap in ML trees and >75% consensus support in Bayesian trees. Based on these, three distinct clades were observed: one from the Thailand TL_CAV samples and two from the Bangladesh samples as Dhaka-Feni (BD_DHK-BD_FEN) and BD_HBJ, respectively. The BD_HBJ and TL_CAV samples shared a common root within the clade that included different C. versicolor populations. By contrast, the BD_DHK and BD_FEN samples from Bangladesh fell within the clade with a common root from C. calotes (Figure 4). This indicated that C. versicolor samples comprised three genetically distinct clades with the possibility of two distinct species under the genus Calotes.

Sex-Linked Marker Analysis Using SNP Data
A total of 7136 out of 65,652 loci (45 individuals) were retained for sex-linked screening after quality filtering. Screening for loci that met our sex-linked criteria using samples from all populations together yielded no sex-linked markers; however, separate specific screening for each population resulted in 88 loci that exhibited a sex bias (Figure 5a). Of these, the BD_DHK population had three loci (100%) correlating to females, the pattern expected for a sex chromosome system with ZW females and ZZ males. Likewise, 4 out of 41 loci were assigned to the ZZ/ZW system, and the remaining 37 (90%) were assigned to the XX/XY system for the BD_FEN population. The BD_HBJ population had 38 loci, with 10 assigned to XX/XY system and another 28 loci (74%) assigned to the ZZ/ZW system, while 4 out of 6 loci (66%) indicated ZZ/ZW and the other 2 loci suggested the XX/XY system for the TL_CAV population. from all populations together yielded no sex-linked markers; however, separate specific screening for each population resulted in 88 loci that exhibited a sex bias (Figure 5a). Of these, the BD_DHK population had three loci (100%) correlating to females, the pattern expected for a sex chromosome system with ZW females and ZZ males. Likewise, 4 out of 41 loci were assigned to the ZZ/ZW system, and the remaining 37 (90%) were assigned to the XX/XY system for the BD_FEN population. The BD_HBJ population had 38 loci, with 10 assigned to XX/XY system and another 28 loci (74%) assigned to the ZZ/ZW system, while 4 out of 6 loci (66%) indicated ZZ/ZW and the other 2 loci suggested the XX/XY system for the TL_CAV population.
Following the equation of Lambert et al. [10], our filtered dataset of 7136 SNP markers, for a sample range of 8-17 phenotypically sexed individuals, revealed that 0.05-28 markers are likely to be spuriously sex-linked [10]. Given the number of tested SNP markers, 13 individuals are sufficient for us to confidently state that less than one marker is spuriously sex-linked if the loci are completely independent.  Following the equation of Lambert et al. [10], our filtered dataset of 7136 SNP markers, for a sample range of 8-17 phenotypically sexed individuals, revealed that 0.05-28 markers are likely to be spuriously sex-linked [10]. Given the number of tested SNP markers, 13 individuals are sufficient for us to confidently state that less than one marker is spuriously sex-linked if the loci are completely independent. Considering that markers identified using SbfI-HpaII comprised a subset of the markers identified using SbfI-MspI (and PstI-HpaII of PstI-MspI), methylation-mediated marker profiles could only be created from the BD_DHK and BD_FEN populations. In BD_DHK, no marker from the SbfI-HpaII treatment was identified, denoting that all 2972 markers identified using SbfI-MspI are methylated at their recognition sites. Meanwhile, in BD_FEN, 44 loci out of 295 identified by SbfI-MspI were also identified by SbfI-HpaII, denoting the remaining 251 markers to be methylated at their recognition sites.

Sex-Linked Marker Analysis Using SilicoDArT Presence-Absence Data
From the initial sequenced 741,396 PA markers (45 individuals), we retained 277,926 PA loci after filtering. The data belonged to four different treatments with four diverse restriction enzyme combinations: 1) SbfI-HpaII, 2) SbfI-MspI, 3) PstI-HpaII, and 4) PstI-MspI. Not all treatments worked for all populations. Each population revealed different numbers of markers based on the type of treatment ( Figure 6; Supplementary Table S1). Considering that markers identified using SbfI-HpaII comprised a subset of the markers identified using SbfI-MspI (and PstI-HpaII of PstI-MspI), methylation-mediated marker profiles could only be created from the BD_DHK and BD_FEN populations. In BD_DHK, no marker from the SbfI-HpaII treatment was identified, denoting that all 2972 markers identified using SbfI-MspI are methylated at their recognition sites. Meanwhile, in BD_FEN, 44 loci out of 295 identified by SbfI-MspI were also identified by SbfI-HpaII, denoting the remaining 251 markers to be methylated at their recognition sites. Figure 6. Sex-specific silicoDArT presence-absence (PA) markers identified in different populations. The number of markers is shown on the left side of the corresponding bar. Zero refers to scenarios where no sex-specific loci were identified, whereas empty spaces refer to scenarios where the particular enzyme combination did not work out.
Following the equation of Lambert et al., in our filtered dataset of 277,926 PA markers, for a sample range of 4-16 phenotypically sexed individuals, it was revealed that 4-17,370 markers are likely to be spuriously sex-linked [10]. With this equation and this number of PA markers, 20 individuals of each sex is sufficient for us to confidently state that less than one marker is spuriously sex-linked.

Discussion
Calotes versicolor is an agamid species with a wide distribution and is regarded as likely to consist of a species complex [37,38]. At present, this species is considered to be a TSD species [6] without any detectable heteromorphic sex chromosomes [40][41][42]. However, it is apparent from multiple previous studies that this species may vary in terms of its sex-determination mechanisms in different parts of its range [5][6][7]40,41,52]. In this study, we found that C. versicolor comprises a combination of different divergent lineages,

Marker Enzyme combination and population
Male specific Female specific Figure 6. Sex-specific silicoDArT presence-absence (PA) markers identified in different populations. The number of markers is shown on the left side of the corresponding bar. Zero refers to scenarios where no sex-specific loci were identified, whereas empty spaces refer to scenarios where the particular enzyme combination did not work out.
Following the equation of Lambert et al., in our filtered dataset of 277,926 PA markers, for a sample range of 4-16 phenotypically sexed individuals, it was revealed that 4-17,370 markers are likely to be spuriously sex-linked [10]. With this equation and this number of PA markers, 20 individuals of each sex is sufficient for us to confidently state that less than one marker is spuriously sex-linked.

Discussion
Calotes versicolor is an agamid species with a wide distribution and is regarded as likely to consist of a species complex [37,38]. At present, this species is considered to be a TSD species [6] without any detectable heteromorphic sex chromosomes [40][41][42]. However, it is apparent from multiple previous studies that this species may vary in terms of its sex-determination mechanisms in different parts of its range [5][6][7]40,41,52]. In this study, we found that C. versicolor comprises a combination of different divergent lineages, at least within the sampled locations, and it potentially comprises two different species, i.e., C. calotes and C. versicolor, which may each have evolved different modes of sex determination.

Population and Phylogenetic Analysis
We used both genomic (DArT sequencing) and mitochondrial DNA (Sanger sequencing) to study the population structure and phylogenetic relationships among C. versicolor samples obtained from six locations in Bangladesh and Thailand. Results from PCoA, Fst, and fixed difference analysis using DArTseq data and phylogenetic analyses of mtDNA revealed three distinct clades across the sample populations: clade 1 (Dhaka-Feni; BD_DHK-BD_FEN), clade 2 (Habiganj; BD_HBJ), and clade 3 (Thailand; TL_CAV). Based on the Fst and fixed difference values, the Habiganj clade was found to be associated more closely with the Thailand clade than the other Bangladesh clades (Dhaka-Feni). The Fst values between those two clades (clades 1 and 2) were at least four times as that seen between the two major clusters from Bangladesh (clade 1) and tended to be less affected by geographic distance than within-cluster comparisons, as would be expected among highly divergent evolutionary lineages [53]. Low and non-significant isolation by distance was detected (Mantel's test: r = 0.4049; p = 0.080556) while considering all six sampling locations, indicating that genetic dissimilarities within C. versicolor populations are not related to the distance across all study locations. Our phylogenetic analysis also revealed that the Thailand and Habiganj sequences cluster with C. versicolor, whereas the Dhaka-Feni sequences group with C. calotes sequences instead.
It has been previously suggested that adaptation to local environments might play an important role in the diversification of C. versicolor [38]. We found 370 loci common to all location samples (both SNP and reference alleles; Figure 7) and identified SNP loci unique to sampling locations. The straight-line distance between the two countries-Bangladesh and Thailand-is approximately 1500 km, distances between Bangladesh locations (clade 1-Dhaka and Feni; clade 2-Habiganj) range from 116 to 128 km, and distances between Thailand locations (clade 3-Bangkok, Samut Prakan, and Khon Kaen) range from 20 to 394 km. The levels of sequence divergence and the frequencies of private haplotypes were found to be high for intraspecific data within the Bangladesh lineages, i.e., across a small geographical area. Our results therefore do not suggest that patterns in C. versicolor in Bangladesh are consistent with the isolation-by-distance model [54], whereby gene flow decreases with increasing geographical distance because of limited dispersal. This is also evident when high but non-significant isolation by distance was detected after removing clade 1 (Mantel's test: r = 0.8285; p = 0.0833) and clade 3 (Mantel's test: r = 0.7694; p = 0.1666) from the analysis. Given that there has been limited gene flow occurring between the Dhaka-Feni and Habiganj lineages, one plausible scenario could result from their different geographic positions and geographical barriers that leads to speciation. Geologically, Dhaka and Feni are floodplain areas affected by seasonal flooding predominantly caused by human-modified landscapes, whereas the Habiganj sampling location consists of hilly tracts with tropical evergreen forests [55]. Both the Habiganj and Thailand sampling locations fall under the Indo-Burma biodiversity hotspot area [56] within the Indo-Chinese sub-region of the oriental zoogeographic region. Dhaka and Feni, on the other hand, fall within the Indian sub-region of the oriental region [57]. A high but non-significant isolation by distance (Mantel's test: r = 0.8285; p = 0.0833) was detected among the Habiganj and Thailand sampling locations.
Zug et al. [37] found evidence of a species complex with high genetic differentiation without a differentiated morphotype in C. versicolor. They postulated that C. versicolor represents multiple species and at least two clades, one from India-Myanmar and another from Myanmar-Southeast Asia. Our data also provide evidence of high genetic divergence between the samples reflected in clustering into three distinct groups. These three groups can be considered as three distinct evolutionarily significant units (ESUs) and potentially support the presence of a C. versicolor species complex. Based on our population and phylogenetic analyses, it can be predicted that the Thailand and Habiganj samples could be C. versicolor, while the Dhaka-Feni samples could be C. calotes or even a new species within a cryptic species complex. C. calotes is commonly known as the green forest calotes across South India and Sri Lanka [34], with a completely different phenotypic appearance and green coloration [58]; therefore, it does not comply with our samples' phenotypes. Additionally, there has been no report of C. calotes from Bangladesh yet [59], mainly due to a lack of morphometric and genetic studies. Our study therefore stresses the need for future studies providing evidence that the known C. versicolor in Bangladesh is not a single species but rather comprises different species. However, samples from Habiganj and Thailand may also represent different intraspecific lineages within C. versicolor. Future studies with broader geographic coverage, including type localities of C. versicolor, are essential to further confirm this claim. Zug et al. [37] found evidence of a species complex with high genetic differentiation without a differentiated morphotype in C. versicolor. They postulated that C. versicolor represents multiple species and at least two clades, one from India-Myanmar and another from Myanmar-Southeast Asia. Our data also provide evidence of high genetic divergence between the samples reflected in clustering into three distinct groups. These three groups can be considered as three distinct evolutionarily significant units (ESUs) and potentially support the presence of a C. versicolor species complex. Based on our population and phylogenetic analyses, it can be predicted that the Thailand and Habiganj samples could be C. versicolor, while the Dhaka-Feni samples could be C. calotes or even a new species within a cryptic species complex. C. calotes is commonly known as the green forest calotes across South India and Sri Lanka [34], with a completely different phenotypic appearance and green coloration [58]; therefore, it does not comply with our samples' phenotypes. Additionally, there has been no report of C. calotes from Bangladesh yet [59], mainly due to a lack of morphometric and genetic studies. Our study therefore stresses the need for future studies providing evidence that the known C. versicolor in Bangladesh is not a single species but rather comprises different species. However, samples from Habiganj and Thailand may also represent different intraspecific lineages within C. versicolor. Future studies with broader geographic coverage, including type localities of C. versicolor, are essential to further confirm this claim.

Sex-Determination Modes across Calotes Species Complex
DArTseq data have been previously used to identify sex-linked markers in both amphibians and reptiles [10,12,32]. We applied this method to search for sex-linked markers in C. versicolor. Our results identified a total of 88 SNP loci that were linked to phenotypic sex but extended across different lineages. None of these loci were shared between or among lineages. We also identified up to 2972 sex-linked loci that varied according to lineage.
The SNP DArT results imply that C. versicolor has a TSD system, since no sex-linked

Sex-Determination Modes across Calotes Species Complex
DArTseq data have been previously used to identify sex-linked markers in both amphibians and reptiles [10,12,32]. We applied this method to search for sex-linked markers in C. versicolor. Our results identified a total of 88 SNP loci that were linked to phenotypic sex but extended across different lineages. None of these loci were shared between or among lineages. We also identified up to 2972 sex-linked loci that varied according to lineage.
The SNP DArT results imply that C. versicolor has a TSD system, since no sex-linked markers were found, or that multiple sex-determination systems exist in this species. Our population-specific analysis indicates that the low marker counts seen in the Dhaka and Thailand populations may result from TSD or ZZ/ZW systems, while the Habiganj and Feni populations showed clear biases towards the ZZ/ZW and XX/XY systems, respectively. In this scenario, C. versicolor exhibited multiple sex-determination systems of both XY and ZW, including environmental sex determination (ESD). The sex-determination system of C. versicolor is elusive due to the lack of clear heteromorphic sex chromosomes and contradictory research results regarding sex determination. Previous studies have shown both the effect and lack of effect of temperature in influencing the sex of hatchlings [6,52]. However, Wilson et al. [60] found four weakly sex-associated RAD-tag markers in C. versicolor and concluded that their data did not support the presence of strongly differentiated sex chromosomes in their population of C. versicolor.
These contradicting results could originate from a possible sex reversal caused by temperature that overrides the original sex-determination system, as recorded in other agamid lizards such as in Pogona vitticeps [18,61,62] and the skink Bassiana duperreyi [17], or the existence of multiple sex-determination systems within the same species, as in the Japanese wrinkled frog G. rugosa [3,4]. In previous studies, most C. versicolor samples were from India, while in this study, the samples were from Bangladesh and Thailand. Our results suggest multiple systems based on population and locality. C. versicolor has a vast distribution area from Iran to Southeast Asia, with the potential for each population to adapt to different sex-determination systems, which is similar to amphibians where different populations independently evolve sex-determination [3,4]. Our study, including samples from Bangladesh and Thailand, identified the existence of a Calotes cryptic species complex, as previously suggested by Zug et al. [37]. Therefore, it can be assumed that previously observed variations in sex-determination mechanisms in C. versicolor were due to evolutionary mechanisms by different cryptic species undergoing the process of speciation. This could also have been the case when Doddamani et al. [6] raised the possibility that C. versicolor has a novel FMFM pattern of TSD based on multi-year experiments over several seasons. However, we cannot be certain because they did not include the collection localities of the C. versicolor individuals or eggs. Based on our results, it can be presumed that the pattern inferred from incubation experiments by Doddamani et al. [6] might have come from different populations or even species that were assumed to be single species with the same sex-determination system.
Methylation DArTseq data identified dominant sex-specific markers as loci that were either present or absent according to sex. However, not all combinations of restriction enzymes worked for all populations. We only obtained results from both methylationsensitive and non-sensitive restriction enzyme combinations from the Dhaka and Feni populations (from the SbfI-HpaII/ SbfI-MspI combination). However, the SbfI-HpaII combination yielded markers from all populations, and this combination was used to generate SNP markers. Theoretically, treatment with a methylation-insensitive restriction enzyme (MspI) should have more or equal numbers of loci than a methylation-sensitive restriction enzyme (HpaII) combination. Such phenomena were observed in the Dhaka and Feni samples (only in the SbfI-HpaII/MspI combination). Therefore, we assumed that loci identified from the SbfI-HpaII combination might be a subset of loci from the non-methylated SbfI-HpaII/MspI combination. We found that all markers identified from Dhaka for this enzyme combination were CpG-methylated on their restriction sites. In the Feni samples, we found that the restriction sites of female loci (93.5%) were more methylated than the male loci (80.2%). By contrast, more sex-specific loci were observed in methylation-sensitive restriction enzyme combinations than insensitive combinations in the Dhaka (in the PstI-HpaII/MspI combination) and Habiganj samples (between SbfI-HpaII and PstI-MspI combinations), making it harder to predict methylation patterns in these treatments.
Complete SbfI-HpaII/MspI combination datasets from the Dhaka and Feni populations ( Figure 6) and the number of sex-linked loci associates suggest a ZZ/ZW system in the Dhaka (BD_DHK) population and an XX/XY system with probable environmental influences in the Feni (BD_FEN) population, in accordance with the SNP data ( Figure 5a). However, there could be an environmental influence in both cases, since the female-specific markers accounted for 82% and the male-specific markers accounted for only 63% of the total sex-specific markers in BD_DHK and BD_FEN, respectively ( Figure 6). Putative male-linked alleles were present in ZZ/ZW populations (Habiganj and Thailand), while female-linked alleles were present in the XX/XY population (Feni), suggesting the frequent sex reversal of female-to-male phenotypes and vice versa. Such sex reversals could be caused by environmental factors, i.e., temperature influences during the incubation period, as in P. vitticeps [63] and B. duperreyi [64]. The cytogenetic analysis of C. versicolor revealed no variation between male and female karyotypes [40,41,65]. This finding does not contradict the possibility of a genetic sex determination, since other agamids such as Phrynocephalus vlangalii [66] and P. vitticeps [67,68] have ZW/ZZ sex chromosome systems. This suggests that C. versicolor might have a genetic sex determination with homomorphic sex chromosomes that are in the early stage of differentiation.

Conclusions
Sex-determination mechanisms can occur between closely related species or even between populations within a species. We chose Calotes versicolor for our model because this taxon has a wide geographic distribution with potential for population-level differentiation. Evidence suggested that C. versicolor may not exist as a single species across its known range but rather as a combination of genetically distinct but morphologically similar groups as a cryptic species complex, as indicated by Zug et al. [37]. Our findings showed variations in sex-linked markers, indicating diverse sex-determination modes and mechanisms between genetically distinct lineages comprising both populations and species. Further studies using more individuals with confirmed morphological identification and sex would enable a better understanding of the underlying mechanisms of sex determination within these closely related lineages. Additional sampling locations spanning the entire range of C. versicolor would provide further in-depth information regarding the intensity and diversity of this cryptic species complex. Future detailed research is likely to open up a new window to explore the evolution of sex-determination mechanisms both within this species complex and in reptiles in general.