Article

Sex-Associated Indels and Candidate Gene Identification in Fujian Oyster (Magallana angulata)

1
Key Laboratory of Healthy Mariculture for the East China Sea, Ministry of Agriculture and Rural Affairs, Fisheries College, Jimei University, Xiamen 361021, China
2
Key Laboratory of Cultivation and High-Value Utilization of Marine Organisms in Fujian, Fisheries Research Institute of Fujian, Xiamen 361013, China
*
Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Fishes 2025, 10(9), 438; https://doi.org/10.3390/fishes10090438
Submission received: 21 July 2025 / Revised: 29 August 2025 / Accepted: 30 August 2025 / Published: 2 September 2025
(This article belongs to the Section Genetics and Biotechnology)

Abstract

Sex determination is a fundamental biological process governing animal reproduction. Although substantial progress has been made in elucidating its genetic basis, the genetic architecture underlying complex sex determination systems remains poorly understood. In this study, we identify sex-associated insertion–deletion (indel) variants, screen candidate genes, and compare sex-associated variation across populations with different genetic backgrounds in the Fujian oyster (Magallana angulata). Based on whole-genome resequencing data of a culture strain (designated FL), a total of 299,774 high-quality indels were identified. By integrating genome-wide association analysis (GWAS), fixation index (FST) analysis, and sex-biased genotype frequency comparisons, 77 overlapping sex-associated indels were identified, predominantly clustered within a 1.8 Mb (8.3–10.1 Mb) region on chromosome 9. Principal component analysis (PCA) based on the sex-associated markers and their subsets consistently separated male and female individuals in the FL strain. For two representative sex-associated indels, PCR-based genotyping methods were developed and validated. Functional annotation identified putative candidate genes for sex determination, including PKD1L1, 5-HTRL, SCP, and CCKRa. Comparative analysis of variants within PKD1L1 across wild, farmed, and selectively bred populations revealed a progressive enrichment of male-linked alleles in domesticated and selectively bred groups, particularly in male individuals. This study provides direct evidence that sex in the Fujian oyster is genetically determined and suggests that domestication and artificial selection may drive the emergence of major sex-determining loci. These findings offer important insights into the genetic basis of sex determination in this species and establish a theoretical and practical foundation for molecular marker-assisted breeding of monosex lines.
Key Contribution: Sex-associated indels were identified and mapped within a 1.8 Mb region on chromosome 9 in a cultured strain of the Fujian oyster. PCR-based genotyping methods were developed and validated for two sex-associated indels. Candidate genes potentially involved in sex determination were identified within the associated region. A progressive enrichment of male-linked alleles was observed in farmed and selectively bred populations, particularly within a key candidate sex-determining gene.

1. Introduction

Sex determination is a fundamental biological process that governs the development of sexual phenotypes in animals [1,2]. In aquatic species, pronounced sexual dimorphism in traits such as growth rate, nutrient composition, and reproductive capacity is common, making sex-control breeding an effective strategy for improving productivity and product quality. For instance, in the Fujian oyster (Magallana angulata), males exhibit approximately 25% heavier soft tissues than females [3]. Meanwhile, females accumulate more glycogen, lipids, and taurine, whereas males are enriched in protein and unsaturated fatty acids [4]. These differences in economically important traits between sexes highlight the potential for developing monosex lines tailored to specific economic goals.
The successful implementation of sex-control breeding hinges on a clear understanding of the genetic basis of sex determination and the development of reliable sex-linked molecular markers. While significant progress has been made in species with simple XX/XY or ZZ/ZW systems, increasing evidence suggests that many aquatic species exhibit more complex sex determination mechanisms involving polygenic regulation, environmental influences, and plastic developmental trajectories [5,6]. Nevertheless, the genetic basis of these complex systems remains poorly understood. In bivalves, and particularly in oysters, the sex determination system is highly diverse and plastic, with frequent sex reversal, functional hermaphroditism, and periodic sex changes reported in some species [7]. This complexity offers valuable opportunities to explore the evolution and regulation of sex-determining systems beyond simple systems.
In the Pacific oyster (Magallana gigas), two family-based QTL mapping studies identified major sex-determining loci on linkage group 6 and Chr. 9, respectively, demonstrating that sex determination in this species is under significant genetic control [8,9]. Similarly, our previous study on a cultured strain of the Fujian oyster (designated FL), which has undergone multiple generations of selection for growth traits, also identified strongly sex-associated SNPs clustering on Chr. 9 [10]. Despite these findings, no robust, convenient, and easily scorable genotyping method for sex-associated markers is currently available, which greatly limits further investigation into the sex-determination mechanism in oysters. Moreover, existing studies have primarily focused on a single family or population, and potential variation in the genetic architecture of sex determination across populations with different genetic backgrounds remains largely unexplored.
Insertion–deletion polymorphisms (indels) are biallelic genetic variants that can cause length differences in PCR amplicons, making them particularly suitable for gel-based genotyping [11]. Compared to single nucleotide polymorphisms (SNPs), indels located in coding or regulatory regions may have stronger functional effects [12]. Therefore, this study aims to identify sex-associated indels based on whole-genome resequencing data, develop PCR-based markers for genetic sex identification, and reveal potential differences in the genetic architecture of sex determination among wild, farmed, and selectively bred groups. These efforts will provide important insights into the mechanisms of sex determination and establish a foundation for molecular marker-assisted breeding of monosex lines in the Fujian oyster.

2. Materials and Methods

2.1. Sample Collection and Sex Identification

All specimens were one year old and collected from Jiangkou Bay, Fuzhou, Fujian Province, China. After collection, the oysters were gently cleaned, and their shells were opened to extract gonadal tissue. Gonads were fixed, embedded, and sectioned for paraffin histology. Sex was determined under a light microscope based on gonadal morphology. Adductor muscle tissue was dissected, flash-frozen in liquid nitrogen, and stored at −80 °C for subsequent DNA extraction and genome sequencing. A total of 112 sexually mature individuals of the Fujian oyster (FL strain) were collected (57 females and 55 males) for whole-genome resequencing (WGS) and identification of sex-associated indels. An additional 60 individuals (30 females and 30 males) from another batch of the FL strain were collected for marker validation. For comparative population analysis, another 124 sexually mature oysters were sampled for WGS, comprising 50 individuals (FL; 25 females and 25 males) from the FL strain, 50 from a farmed population (DD; 26 females and 24 males), and 24 from a wild population (WW; 12 females and 12 males).

2.2. DNA Extraction and Whole-Genome Resequencing

Genomic DNA was extracted from the adductor muscle using the services of a commercial provider (Annoroad Gene Technology, Beijing, China). Paired-end whole-genome sequencing (150 bp reads) was performed on the Illumina NovaSeq X Plus platform, targeting an average sequencing depth of approximately 9× per individual, as described in a previous report [10].
Raw reads were quality-controlled using Fastp v0.22.0 [13] to remove adapter sequences and low-quality reads. Clean reads were aligned to the M. angulata reference genome (GenBank accession: GCA_025612915.2) using BWA v0.7.17 [14]. The resulting alignments were converted to BAM format, sorted, and indexed using SAMtools v1.10 [15].

2.3. Detection and Quality Filtering of Indels

Indel variants were identified using the HaplotypeCaller module in GATK v4.1.6.0 [16]. The initial indel dataset was filtered using PLINK v1.90b6.21 [17] with the following parameters: minor allele frequency (MAF) ≥ 0.05, missing genotype rate ≤ 0.05, and Hardy–Weinberg equilibrium p-value ≥ 1 × 10⁻³. Missing genotypes were imputed using Beagle v5.1 [18], resulting in a high-quality indel genotype dataset for downstream analysis.
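As a minimal sketch of what the MAF and missingness thresholds mean for a single locus, the following Python function applies both cut-offs to a list of ALT-allele dosages. It is an illustration only, not the PLINK implementation: the function name `passes_filters` and the toy genotype lists are hypothetical, and the Hardy–Weinberg filter is deliberately left out.

```python
def passes_filters(genotypes, min_maf=0.05, max_missing=0.05):
    """Return True if one locus passes the MAF and missingness cut-offs.

    genotypes: list of ALT-allele dosages (0, 1 or 2), with None for a
    missing call. Hypothetical helper for illustration; the HWE filter
    used in the study is not reproduced here.
    """
    n = len(genotypes)
    called = [g for g in genotypes if g is not None]
    if n == 0 or not called:
        return False
    missing_rate = 1 - len(called) / n
    if missing_rate > max_missing:
        return False
    alt_freq = sum(called) / (2 * len(called))  # two alleles per diploid call
    maf = min(alt_freq, 1 - alt_freq)           # frequency of the rarer allele
    return maf >= min_maf


# Toy loci (hypothetical data)
common_locus = [0] * 80 + [1] * 20              # ALT frequency 0.10
rare_locus = [0] * 99 + [1]                     # ALT frequency 0.005
gappy_locus = [0] * 74 + [1] * 20 + [None] * 6  # 6% missing calls
print(passes_filters(common_locus), passes_filters(rare_locus), passes_filters(gappy_locus))
```

Only the first toy locus is retained; the second fails on MAF and the third on missingness.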

2.4. Identification of Sex-Associated Variants

Sex-associated loci were identified using genome-wide association studies (GWAS) based on a mixed linear model (MLM) implemented in EMMAX [19], with histologically classified sex as the phenotype and indel genotypes as predictors. The significance threshold was set using the Bonferroni correction (p = 0.01/n, where n is the number of tested indels). To evaluate genetic differentiation between sexes, FST values were calculated using VCFtools v0.1.17 [20] by specifying male and female sample lists via the --weir-fst-pop option, treating males and females as two separate populations, with both the window size and step size set to 1 bp. Additionally, sex-biased indels were identified based on differences in genotype composition between sexes. A sliding window analysis (10 kb) was conducted using custom Python 3.8.8 scripts to calculate the number of candidate loci per window, and the genome-wide distribution was visualized using the CMplot package (https://github.com/YinLiLin/CMplot, accessed on 27 August 2025).
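The Bonferroni threshold can be checked with a few lines of Python. The sketch below assumes that n equals the 299,774 filtered indels reported in the abstract; the function name `bonferroni_threshold` is introduced here only for illustration.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Return the per-test p-value cut-off and its -log10 value."""
    p_cut = alpha / n_tests
    return p_cut, -math.log10(p_cut)


# Assumption: n is the number of high-quality indels reported in the abstract.
p_cut, neg_log10_p = bonferroni_threshold(0.01, 299_774)
print(f"p cut-off = {p_cut:.3g}, -log10(p) = {neg_log10_p:.2f}")
```

Under this assumption the cut-off is approximately 3.34 × 10⁻⁸, that is, a −log10P of about 7.48.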

2.5. Principal Component Analysis (PCA)

To assess the discriminative power of selected indel loci, principal component analysis (PCA) was performed using subsets of sex-associated markers. PCA was conducted with PLINK v1.90b6.21 [17], and the resulting data were visualized in RStudio v4.4.1 to evaluate clustering patterns of male and female individuals in principal component space.
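To make the idea concrete, the sketch below computes first-principal-component scores from a small genotype matrix using plain power iteration. It is not the PLINK procedure: the function name `first_pc_scores` and the toy dosage matrix are hypothetical, and the simple uniform starting vector is only adequate when the markers are positively correlated, as tightly linked sex-associated markers would be.

```python
import math


def first_pc_scores(genotypes, n_iter=200):
    """PC1 scores for individuals (rows) from ALT-allele dosages (0/1/2).

    Hypothetical helper: centers each marker, then runs power iteration
    on X^T X to obtain the leading marker-loading vector.
    """
    n, m = len(genotypes), len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    x = [[row[j] - means[j] for j in range(m)] for row in genotypes]
    v = [1.0 / math.sqrt(m)] * m  # uniform start; assumes correlated markers
    for _ in range(n_iter):
        scores = [sum(xi[j] * v[j] for j in range(m)) for xi in x]          # X v
        w = [sum(x[i][j] * scores[i] for i in range(n)) for j in range(m)]  # X^T (X v)
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    return [sum(xi[j] * v[j] for j in range(m)) for xi in x]


# Toy data (hypothetical): three markers, four females then four males
females = [[0, 0, 0], [0, 0, 0], [0, 1, 0], [0, 0, 0]]
males = [[1, 1, 1], [2, 1, 1], [1, 1, 2], [1, 2, 1]]
pc1 = first_pc_scores(females + males)
print([round(s, 2) for s in pc1])
```

In this toy example the female and male scores fall on opposite sides of zero along PC1, which is the kind of separation the analysis looks for.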

2.6. Marker Development and Validation

Candidate indels were selected for marker development based on their genomic location and observed insertion/deletion patterns, with confirmation by visualization in IGV [21]. PCR primers were designed using Primer Premier 5 (https://www.bioprocessonline.com/doc/primer-premier-5-design-program-0001, accessed on 1 September 2025) and synthesized by Beijing Tsingke Biotech Co., Ltd. (Beijing, China). Each PCR was performed in a 10 μL reaction volume containing 5 μL 2× PCR Mix, 0.4 μL of each primer, 0.5 μL template DNA, and nuclease-free water to volume. Thermal cycling conditions were: 94 °C for 2 min; 30 cycles of 94 °C for 30 s, annealing at primer-specific Tm for 30 s, and 72 °C for 1 min; followed by a final extension at 72 °C for 5 min. PCR products were resolved on 3% agarose gels (130 V, 180 mA, 90 min) and visualized under a gel documentation system.
Genotypes were determined based on band size. Marker performance was evaluated using the following metrics: accuracy (proportion of correctly predicted sex); F1-score (harmonic mean of precision and recall); area under the ROC curve (AUC; overall prediction performance); and a chi-square test of the association between genotype and phenotypic sex. All statistical analyses were performed in RStudio v4.4.1. The performance metrics for the two sex-linked indel markers (FO-3 and FO-4) in the validation population are summarized in Supplementary Table S7.
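These metrics can be illustrated with the validation counts reported in Section 3.3. The sketch below rests on several assumptions that the text does not state explicitly: male is treated as the positive class, every male not listed as carrying the deletion allele is treated as a 0/0 call (predicted female), and the AUC of a single hard classification rule is taken as the mean of sensitivity and specificity. The function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, F1 and AUC for one hard binary classification rule.

    For a single decision rule the ROC curve has one interior point, so
    its AUC equals (sensitivity + specificity) / 2.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)               # sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    auc = (recall + specificity) / 2
    return accuracy, f1, auc


# Counts derived from Section 3.3 (30 males, 30 females); male = positive (assumed).
# FO-3: 26 males carry the deletion allele, 25 females are 0/0.
fo3 = binary_metrics(tp=26, fp=5, fn=4, tn=25)
# FO-4: 22 males are 0/1, 27 females are 0/0; the other 8 males are assumed 0/0.
fo4 = binary_metrics(tp=22, fp=3, fn=8, tn=27)
for name, (acc, f1, auc) in (("FO-3", fo3), ("FO-4", fo4)):
    print(f"{name}: accuracy={acc:.2f}, F1={f1:.2f}, AUC={auc:.2f}")
```

Under these assumptions the rounded values agree with those reported for both markers (0.85/0.85/0.85 for FO-3 and 0.82/0.80/0.82 for FO-4).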

2.7. Functional Annotation and Candidate Gene Identification

Sex-associated indels identified by both GWAS and FST analyses were annotated using SnpEff v5.2 [22] to determine their genomic context (e.g., intronic, upstream, intergenic regions) and affected genes. Genes harboring multiple sex-associated indels were further analyzed for structural features and intragenic variation distribution to identify potential functional relevance and prioritize candidate genes.

2.8. Comparative Analysis of Key Candidate Genes

Genotype data from the FL, DD, and WW populations were extracted for key candidate gene regions. Allele frequencies of reference and alternate alleles were calculated separately for males and females within each population. Visualization was performed using the ggplot2 [23] package in RStudio v4.4.1 to generate heatmaps and frequency barplots, enabling assessment of sex-linked variation patterns and potential population-specific differentiation.
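The per-group allele frequency calculation can be sketched as follows. The genotype strings use VCF-style coding (0 = reference, 1 = alternate); the function names and the toy records are hypothetical and do not reproduce the study data.

```python
from collections import defaultdict


def alt_allele_frequency(genotypes):
    """ALT allele frequency from VCF-style genotype strings such as '0/1'.

    Missing alleles ('.') are skipped. Returns NaN if nothing was called.
    """
    alt = total = 0
    for gt in genotypes:
        for allele in gt.replace("|", "/").split("/"):
            if allele == ".":
                continue
            total += 1
            alt += allele == "1"
    return alt / total if total else float("nan")


def frequency_by_group(records):
    """records: iterable of (population, sex, genotype) tuples."""
    groups = defaultdict(list)
    for population, sex, gt in records:
        groups[(population, sex)].append(gt)
    return {key: alt_allele_frequency(gts) for key, gts in groups.items()}


# Toy records (hypothetical genotypes at one locus)
records = [
    ("FL", "M", "0/1"), ("FL", "M", "1/1"), ("FL", "F", "0/0"), ("FL", "F", "0/0"),
    ("WW", "M", "0/0"), ("WW", "M", "0/1"), ("WW", "F", "0/0"), ("WW", "F", "./."),
]
for key, freq in sorted(frequency_by_group(records).items()):
    print(key, round(freq, 2))
```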

3. Results

3.1. Resequencing and Variant Detection

Based on the resequencing data, a total of 299,774 high-confidence insertion–deletion (indel) variants were identified following variant calling and stringent quality filtering (Figure 1). These genome-wide variants provide a resource for subsequent association and functional analyses.

3.2. Identification and Localization of Sex-Associated Indels

GWAS of phenotypic sex was performed using a mixed linear model (MLM). A total of 192 indel loci exceeded the genome-wide significance threshold (−log10P = 7.476) (Figure 2A; Supplementary Table S1). These loci exhibited a strong clustered enrichment within a 2.5 Mb region on Chr. 9 (8.0–10.5 Mb), with the most significant locus (−log10P = 19.23) located at Chr. 9:8,921,925.
To validate the GWAS findings and assess sex-specific genetic differentiation, FST values were calculated for each indel between males and females. A total of 167 indels on Chr. 9 (8.0–11.2 Mb) exceeded the high differentiation threshold (FST = 0.25), with the maximum value of 0.499 at Chr. 9:8,916,956 (Figure 2B; Supplementary Table S2). Among these, 147 loci overlapped with the GWAS-significant loci (Supplementary Figure S1).
Additionally, 78 indels with sex-biased genotype frequencies were identified (Supplementary Table S3), 77 of which overlapped with the variants identified by GWAS and FST analyses (Supplementary Figure S1). These variants were significantly enriched within a 1.8 Mb region on Chr. 9 (8.3–10.1 Mb), forming a hotspot of sex-biased polymorphisms (Figure 2C; Supplementary Table S3). Notably, the 10 kb window spanning Chr. 9: 8,920,000–8,930,000 exhibited the highest variant density, harboring seven sex-biased indels (Supplementary Table S4). By integrating evidence from GWAS, FST analysis, and the distribution of sex-biased variants, the 1.8 Mb region on Chr. 9 (8.3–10.1 Mb) was defined as the core candidate region for sex determination in the FL strain of the Fujian oyster, providing a solid foundation for downstream functional characterization and molecular marker development.
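The window-based density count can be sketched in a few lines. This is not the authors' custom script: it assumes non-overlapping 10 kb windows on a single chromosome, the function name `count_per_window` is hypothetical, and the example positions are Chr. 9 coordinates mentioned elsewhere in the text rather than the actual list of sex-biased indels.

```python
from collections import Counter


def count_per_window(positions, window=10_000):
    """Count loci per non-overlapping window, keyed by window start (bp)."""
    counts = Counter((pos // window) * window for pos in positions)
    return dict(sorted(counts.items()))


# Chr. 9 positions quoted in the text, used here purely as illustrative input
positions = [8_409_394, 8_462_525, 8_916_956, 8_921_925, 10_037_381]
for start, n_loci in count_per_window(positions).items():
    print(f"{start:>10,}-{start + 10_000:,}: {n_loci}")
```

With this binning, position 8,921,925 falls in the 8,920,000–8,930,000 window highlighted above.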
To evaluate the effectiveness of sex-associated indel markers in distinguishing phenotypic sex, principal component analysis (PCA) was performed using subsets of top-ranked markers obtained from GWAS (ranging from 1 to 192). The first principal component (PC1) consistently separated individuals into two distinct clusters: a male-dominant cluster and a female-dominant cluster (Figure 3A and Figure S2). Four phenotypic females (F3, F17, F28, and F46) and six phenotypic males (M3, M15, M29, M40, M43, and M53) were consistently misassigned to the opposite cluster. This classification pattern was consistent across all tested marker subsets down to three indels (Supplementary Table S5). Moreover, for an independent validation batch of the FL population, these sex-associated markers also effectively partitioned individuals into male- and female-dominant clusters, highlighting their stability and predictive robustness within the FL strain (Figure 3B).

3.3. Development and Validation of Sex-Linked Indel Markers

Within the 8.3–10.1 Mb region on chromosome 9, three candidate indel loci (Chr. 9:8,409,394, Chr. 9:8,462,525, and Chr. 9:10,037,381) exhibiting distinct allele length differences were selected for the development of sex-associated molecular markers. Based on their flanking sequences, four pairs of primers were designed (Supplementary Table S6), and PCR amplification was performed using individuals with known genotypes (0/0, 0/1, and 1/1) determined by resequencing as reference samples. Among these primers, two markers—FO-3 (Chr. 9:8,462,525) and FO-4 (Chr. 9:10,037,381)—produced clear, stable, and reproducible genotyping patterns (Figure 4). Specifically, FO-3 identified a male-biased 13 bp deletion (allele 0: 224 bp; allele 1: 211 bp), while FO-4 detected a male-biased 28 bp deletion (allele 0: 344 bp; allele 1: 316 bp) (Figure 4, Figures S3 and S4).
Validation in an independent batch of the FL strain (30 males and 30 females) showed strong sex–genotype associations. FO-3 genotyped 25 females as 0/0 and 26 males as 0/1 or 1/1; FO-4 genotyped 27 females as 0/0 and 22 males as 0/1 (Table 1). Chi-square tests confirmed significant associations between genotype and sex for both markers (P = 5.79 × 10⁻⁸ for FO-3; P = 6.51 × 10⁻⁷ for FO-4). Based on a simple classification rule (0/0 as female; 0/1 or 1/1 as male), FO-3 achieved a prediction accuracy of 85%, an F1-score of 85%, and an AUC of 0.85. FO-4 showed a slightly lower performance with an accuracy of 82%, an F1-score of 80%, and an AUC of 0.82 (Supplementary Table S7).
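The genotype–sex association can be illustrated with a Pearson chi-square test on a 2 × 2 table (sex × presence of the deletion allele). The exact test configuration is not stated in the text, so the following is a sketch under assumptions: genotypes are collapsed to 0/0 versus 0/1 or 1/1, no continuity correction is applied, and for FO-4 the eight males not reported as 0/1 are treated as 0/0. The function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    With 1 degree of freedom the upper-tail p-value is erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: females, males; columns: 0/0, deletion carriers (0/1 or 1/1)
fo3 = chi_square_2x2(25, 5, 4, 26)
fo4 = chi_square_2x2(27, 3, 8, 22)  # 8 males assumed 0/0
print(f"FO-3: chi2 = {fo3[0]:.2f}, p = {fo3[1]:.2e}")
print(f"FO-4: chi2 = {fo4[0]:.2f}, p = {fo4[1]:.2e}")
```

Under these assumptions the p-values come out close to the reported 5.79 × 10⁻⁸ and 6.51 × 10⁻⁷, which is consistent with, but does not confirm, an uncorrected 2 × 2 test having been used.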

3.4. Functional Annotation of Candidate Sex-Associated Indels

To explore the potential biological functions of the sex-associated indels, 147 loci satisfying both genome-wide significance (P < 3.34 × 10⁻⁸) and high FST between sexes (FST > 0.25) were annotated (Supplementary Figure S1). Most of these variants were in intronic or gene regulatory regions, with no exonic variants identified (Supplementary Table S8).
A total of 45 genes were identified, including 40 with annotated functions. These genes were involved in calcium ion channel regulation (e.g., PKD1L1), neurotransmitter signaling (e.g., 5-HTRL), cytoskeletal organization (e.g., TUBA1A), transmembrane transport (e.g., ABCB1L), and transcriptional regulation (e.g., TWIST2, TRIM33L) (Supplementary Table S9). Notably, the top 15 sex-associated indels were primarily located in or near PKD1L1, 5-HTRL, SCP, CCKRa, and TWIST2 (Table 2), suggesting their potential involvement in sex determination or differentiation pathways.

3.5. Variation in Sex-Associated Indels in the PKD1L1 Gene Across Populations

Among the 45 sex-associated genes identified in this study, PKD1L1 harbored the highest number of sex-associated indel variants, including the most significant locus from the GWAS analysis (−log10P = 19.23; Table 2). This gene is located on Chr. 9, spanning the 8.86–8.97 Mb interval, and its structure comprises 69 exons and 68 introns (Figure 5A). A total of 111 indel loci were identified within this region, of which 25 (23%) showed significant associations with phenotypic sex (Figure 5B), suggesting that PKD1L1 may serve as a key candidate gene for sex determination.
We further compared the genotypes of sex-associated indel loci within the PKD1L1 gene across three populations—wild (WW), farmed (DD), and selectively bred (FL). The genotype heatmap (Figure 5C) revealed that the alternative (ALT) alleles were highly enriched in males of the FL strain, with most individuals being heterozygous (0/1) or homozygous for the ALT allele (1/1), while females were predominantly homozygous for the reference allele (0/0). The DD population exhibited an intermediate pattern, whereas the WW population showed a low frequency of ALT alleles without clear sex-dependent enrichment.
Quantitative analysis of ALT allele frequencies supported these observations. When averaged across all indel loci within the PKD1L1 gene, ALT allele frequencies followed a stepwise increase from WW to DD to FL populations, with significantly higher values in males than females within each group (Figure 5D). A similar pattern was observed when the analysis was restricted to the 25 sex-associated loci, with the strongest enrichment of ALT alleles detected in FL males (Figure 5E). These findings indicate that the sex-linked genomic structure surrounding PKD1L1 likely emerged or was strongly reinforced during domestication and selective breeding, particularly in the FL strain.

4. Discussion

4.1. Mapping the Sex-Determining Region in Fujian Oyster

Oysters can serve as a unique model for studying sex determination and differentiation in complex systems. As early as 1977, Haley observed significant sex ratio biases and paternal effects in Crassostrea virginica, proposing a “three-locus model” to explain the genetic basis of sex control [24]. Similar phenomena observed in M. gigas later led to the development of the “single-locus model” and the “major gene–modifier model” to interpret family-specific variations in sex ratios [25,26]. With the advancement of molecular markers and high-throughput sequencing, researchers identified sex-associated QTLs on LG6 and LG9 in M. gigas using microsatellites and genotyping-by-sequencing (GBS) approaches [8,9].
In M. angulata, our previous SNP-based genome-wide study identified a significant sex-linked region on Chr. 9 (Chr. 9: 8.0–10.5 Mb) in a cultured strain (FL) [10]. Here, we extended our sex-association analysis to include indel variants, identifying a similarly enriched region on Chr. 9, suggesting a stable sex-linked block in the strain. Notably, the peak association signals for indels and SNPs were not fully overlapping, with the top indel located within the intron of PKD1L1, whereas the top SNP was within the intron of 5-HTRL. This discrepancy implies the presence of multiple functional elements or regulatory units, reflecting the genetic complexity of the sex-determining region.

4.2. Development and Validation of Sex-Linked Molecular Markers

PCA using indel subsets (1–192 markers) consistently separated male and female individuals, indicating strong linkage disequilibrium (LD) among sex-associated markers. Even with as few as three top-ranked markers, the classification remained robust, suggesting a dominant effect of the major sex-determining locus. However, ten individuals were consistently misassigned across all marker sets, implying discordance between genotype and phenotypic sex. This mismatch is likely not due to technical failure, but reflects the high complexity and plasticity of the oyster sex-determining system. Like other aquatic animals, oyster sex determination involves major loci, minor modifiers, and environmental factors. For example, in M. gigas, factors such as temperature and nutritional stress have been shown to induce sex reversal through alterations in gene expression and DNA methylation changes [27,28,29]. Two indel markers (FO-3 and FO-4) were further developed for PCR genotyping, achieving sex identification accuracies of 85% and 82%, respectively, in an independent population. Although the applicability of these markers is currently limited to the FL strain, they represent the first population-level sex-linked molecular markers reported in oysters, highlighting the potential feasibility of sex-control breeding through marker-assisted selection within a defined genetic background.

4.3. Identification and Functional Implications of Candidate Genes

In total, 45 genes were annotated in the sex-associated region on Chr. 9. Among these, multiple indel variants in the PKD1L1 gene exhibited significant allele frequency differences between sexes (maximum GWAS −log10P = 19.23; FST = 0.50), suggesting a central role in sex determination. PKD1L1 belongs to the polycystin family, characterized by multiple transmembrane domains, calcium-binding sites, and GPCR-like structures. It forms mechanosensitive calcium channels with PKD2, capable of sensing fluid flow, mechanical stress, or osmotic stimuli, thereby triggering intracellular Ca²⁺ signaling [30,31]. In vertebrates, PKD1L1 is a well-known nodal flow sensor essential for establishing left–right asymmetry through calcium signaling [32]. Although its role in invertebrates remains underexplored, calcium signaling has been implicated in oocyte maturation, gamete release, and gonadal remodeling in animals [33,34]. The accumulation of sex-associated indel variants within the introns of the PKD1L1 gene, along with its male-biased allele frequencies, suggests that it may regulate gonadal cell fate in the Fujian oyster through calcium signaling pathways. Future functional studies, such as in vivo knockdown, overexpression, or CRISPR-mediated editing of PKD1L1, as well as calcium imaging in oyster gonadal cells, will be essential to test this hypothesis and clarify the mechanistic link between PKD1L1-mediated signaling and sex determination in the Fujian oyster.
Notably, all sex-associated variants identified in this study reside in non-coding regions (introns or upstream), indicating that PKD1L1 may influence sex through expression regulation rather than protein-coding sequence changes. Additionally, other sex-associated genes such as 5-HTRL, SCP, and CCKRa are involved in calcium signaling, neuroendocrine pathways, and cell fate regulation, implying that sex determination in the FL strain involves multi-gene coordination and integration of calcium and neuroendocrine regulatory axes.

4.4. Domestication-Driven Enrichment of Male-Linked Variants

Comprehensive analyses of genotypes and allele frequencies at sex-associated indel loci within the candidate gene PKD1L1 revealed a progressive enrichment of ALT alleles in farmed and selectively bred oyster populations, with the highest accumulation observed in males. Notably, the FL strain has undergone multiple generations of directional selection for growth-related traits such as body weight and shell height. Such selective pressures may have indirectly influenced allele frequencies at sex-associated loci, leading to the remodeling of the genetic architecture underlying sex determination.
Similar phenomena—where domestication and artificial selection reshape sex-determining genetic architecture—have been reported in other aquaculture species. For example, in Danio rerio, certain domesticated strains have lost the primary sex-determining locus that is present in wild populations [35]. In Oreochromis niloticus, distinct sex-determining regions have been identified between wild and cultured populations [36,37]. These findings, together with our results in M. angulata, highlight the remarkable evolutionary plasticity of sex-determination systems in aquatic animals and underscore the critical role of domestication and artificial selection in driving the emergence or evolution of sex-linked genomic regions.

5. Conclusions

This study identified sex-associated indels clustering within a major sex-determining region on Chr. 9 in the FL strain of M. angulata, with PKD1L1 highlighted as a key candidate gene potentially involved in sex determination. Two validated indel markers (FO-3 and FO-4) demonstrated high sex identification accuracy, providing practical tools for molecular breeding for sex control. The population-specific nature of this sex-linked region suggests that domestication and selective breeding for growth and shell morphology may have reprogrammed the genetic architecture underlying sex determination in the Fujian oyster. These findings position M. angulata as a promising model for studying the evolution and environmental plasticity of sex-determination systems in mollusks.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/fishes10090438/s1, Figure S1: Overlap of sex-linked indels identified by GWAS, FST, and genotype frequency (SD) approaches. Figure S2: Principal component analysis (PCA) of Fujian oyster individuals based on subsets of top-ranked sex-associated indel markers. Figure S3: Visualization of two sex-linked deletions on chromosome 9 using IGV. (A) A 13 bp deletion at Chr9:8,462,525, present in male individuals (M-1, M-2) but absent in female individuals (F-1, F-2). (B) A 28 bp deletion at Chr9:10,037,381, also observed specifically in males but not in females. These sex-specific deletions suggest strong linkage with the male genotype and support their potential use as sex-linked molecular markers. Figure S4: Expected amplicon sequences of sex-linked markers FO-3 and FO-4. (A) Predicted PCR product sequence amplified by primer pair FO-3. (B) Predicted PCR product sequence amplified by primer pair FO-4. Red text indicates the forward and reverse primer binding sites. Blue-highlighted regions represent male-specific deletions identified from whole-genome resequencing. Figure S5: Agarose gel electrophoresis of PCR products amplified using primer pair FO-3 in Magallana angulata. (A–C) PCR products with the marker lane placed centrally for comparison. Figure S6: Agarose gel electrophoresis of PCR products amplified using primer pair FO-4 in Magallana angulata. (A–C) PCR products with the marker lane placed centrally for comparison. Table S1: List of indels significantly associated with phenotypic sex. The red highlights are the top five significant ones, and the orange highlights are the sixth to twentieth. They are scattered in this genomic region, indicating linkage disequilibrium in this segment. Table S2: The variants where FST values between sexes exceeded the 0.25 threshold. Table S3: List of 78 indels with sex-biased genotype frequency. Table S4: Distribution of indels with sex-biased genotype frequency in non-overlapping 10 kb windows in the genome. Table S5: Summary of PCA-based classification accuracy using different subsets of top-ranked sex-associated indel markers. Table S6: Primer information for PCR amplification of sex-linked indels in Magallana angulata. FO-3 and FO-4 (highlighted in blue) successfully distinguished genotypes (0/0, 0/1, and 1/1) in resequenced standard samples and were validated as reliable sex-linked markers. Table S7: Classification performance of two sex-linked indel markers (FO-3 and FO-4) for sex identification in Magallana angulata. Accuracy, F1-score, and area under the ROC curve (AUC) were calculated based on genotype-based prediction (0/0 as female; 0/1 and 1/1 as male) in a validation population of 30 males and 30 females. Chi-square p-values indicate the statistical significance of genotype–sex association. Table S8: SnpEff annotation results for 147 sex-associated loci. Table S9: Overview of 45 indel-affected genes identified via SnpEff annotation.

Author Contributions

Conceptualization, Y.N. and M.C.; software, Y.H.; formal analysis, Y.H., Y.N., Q.W. (Qijuan Wan), S.L., Y.Y. and C.T.; investigation, Y.H.; resources, Y.N., Q.W. (Qisheng Wu), X.G., J.Q., Y.K. and H.G.; writing—original draft, Y.H.; writing—review and editing, L.L. and M.C.; visualization, Y.H.; project administration, M.C.; funding acquisition, Y.N., H.G. and M.C. All authors have read and agreed to the published version of the manuscript.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. This study was funded by the Agriculture Research System of China of MOF and MARA (CARS-49), the Natural Science Foundation of Fujian Province (2025J01861), the National Key Research and Development Program of China (2018YFD0901400), and the Special Fund Project of Comprehensive Marine and Fisheries Service in Fujian Province (FYZF-KTYJ-2025-6).

Institutional Review Board Statement

This study does not require approval from an ethics committee. The Magallana angulata (Fujian oyster) samples used in this research were collected from a licensed commercial aquaculture facility, Fujian Baosen Aquatic Technology Co., Ltd., and all handling procedures followed standard aquaculture protocols. We have attached an official statement issued by the Science and Technology Ethics Committee of Jimei University, which clearly states that our project titled “Genetic Dissection of Sex Determination in Fujian Oyster and Its Breeding Application” does not involve ethical risks.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets used in this study are available from the corresponding author upon reasonable request.

Acknowledgments

We thank all members of our laboratories for helpful discussions and technical assistance. We also acknowledge support from the Fisheries College, Jimei University, Key Laboratory of Healthy Mariculture for the East China Sea, Ministry of Agriculture and Rural Affairs; and the Fisheries Research Institute of Fujian, Key Laboratory of Cultivation and High-Value Utilization of Marine Organisms in Fujian.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Martínez, P.; Viñas, A.M.; Sánchez, L.; Díaz, N.; Ribas, L.; Piferrer, F. Genetic architecture of sex determination in fish: Applications to sex ratio control in aquaculture. Front. Genet. 2014, 5, 340. [Google Scholar] [CrossRef]
  2. Li, X.-Y.; Mei, J.; Ge, C.-T.; Liu, X.-L.; Gui, J.-F. Sex determination mechanisms and sex control approaches in aquaculture animals. Sci. China Life Sci. 2022, 65, 1091–1122. [Google Scholar] [CrossRef]
  3. Xie, J.; Ning, Y.; Han, Y.; Su, C.; Zhou, X.; Wu, Q.; Guo, X.; Qi, J.; Ge, H.; Ke, Y. Identification of SNPs and Candidate Genes Associated with Growth Using GWAS and Transcriptome Analysis in Portuguese Oyster (Magallana angulata). Fishes 2024, 9, 471. [Google Scholar] [CrossRef]
  4. Qin, Y.; Li, R.; Liao, Q.; Shi, G.; Zhou, Y.; Wan, W.; Li, J.; Ma, H.; Zhang, Y.; Yu, Z. Comparison of biochemical composition, nutritional quality, and metals concentrations between males and females of three different Crassostrea sp. Food Chem. 2023, 398, 133868. [Google Scholar] [CrossRef] [PubMed]
  5. Nicolini, F.; Ghiselli, F.; Luchetti, A.; Milani, L. Bivalves as emerging model systems to study the mechanisms and evolution of sex determination: A genomic point of view. Genome Biol. Evol. 2023, 15, evad181. [Google Scholar] [CrossRef] [PubMed]
  6. Collin, R. Phylogenetic patterns and phenotypic plasticity of molluscan sexual systems. Integr. Comp. Biol. 2013, 53, 723–735. [Google Scholar] [CrossRef]
  7. Park, J.J.; Kim, H.; Kang, S.W.; An, C.M.; Lee, S.-H.; Gye, M.C.; Lee, J.S. Sex ratio and sex reversal in two-year-old class of oyster, Crassostrea gigas (Bivalvia: Ostreidae). Dev. Reprod. 2012, 16, 385. [Google Scholar] [CrossRef]
  8. Guo, X.; Li, Q.; Wang, Q.Z.; Kong, L.F. Genetic Mapping and QTL Analysis of Growth-Related Traits in the Pacific Oyster. Mar. Biotechnol. 2012, 14, 218–226. [Google Scholar] [CrossRef]
  9. Han, Z.; Li, Q.; Xu, C.; Liu, S.; Yu, H.; Kong, L. QTL mapping for orange shell color and sex in the Pacific oyster (Crassostrea gigas). Aquaculture 2021, 530, 735781. [Google Scholar] [CrossRef]
  10. Zhou, X.; Ning, Y.; Xie, J.; Han, Y.; Tang, C.; Su, C.; Wan, Q.; Wu, Q.; Guo, X.; Qi, J.; et al. Correction: Identification of sex-linked markers and genes in Portuguese oyster (Magallana angulata). Front. Mar. Sci. 2025, 12, 1643904. [Google Scholar] [CrossRef]
  11. Hayashi, K.; Yoshida, H.; Ashikawa, I. Development of PCR-based allele-specific and InDel marker sets for nine rice blast resistance genes. Theor. Appl. Genet. 2006, 113, 251–260. [Google Scholar] [CrossRef]
  12. Bhangale, T.R.; Rieder, M.J.; Livingston, R.J.; Nickerson, D.A. Comprehensive identification and characterization of diallelic insertion–deletion polymorphisms in 330 human candidate genes. Hum. Mol. Genet. 2005, 14, 59–69. [Google Scholar] [CrossRef] [PubMed]
  13. Chen, S.; Zhou, Y.; Chen, Y.; Gu, J. fastp: An ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 2018, 34, i884–i890. [Google Scholar] [CrossRef] [PubMed]
  14. Li, H.; Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 2009, 25, 1754–1760. [Google Scholar] [CrossRef] [PubMed]
  15. Li, H.; Handsaker, B.; Wysoker, A.; Fennell, T.; Ruan, J.; Homer, N.; Marth, G.; Abecasis, G.; Durbin, R.; Subgroup, G.P.D.P. The Sequence Alignment/Map format and SAMtools. Bioinformatics 2009, 25, 2078–2079. [Google Scholar] [CrossRef]
  16. McKenna, A.; Hanna, M.; Banks, E.; Sivachenko, A.; Cibulskis, K.; Kernytsky, A.; Garimella, K.; Altshuler, D.; Gabriel, S.; Daly, M.; et al. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010, 20, 1297–1303. [Google Scholar] [CrossRef]
  17. Purcell, S.; Neale, B.; Todd-Brown, K.; Thomas, L.; Ferreira, M.A.R.; Bender, D.; Maller, J.; Sklar, P.; de Bakker, P.I.W.; Daly, M.J.; et al. PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses. Am. J. Hum. Genet. 2007, 81, 559–575. [Google Scholar] [CrossRef]
  18. Browning, B.L.; Browning, S.R. Genotype Imputation with Millions of Reference Samples. Am. J. Hum. Genet. 2016, 98, 116–126. [Google Scholar] [CrossRef]
  19. Kang, H.M.; Sul, J.H.; Service, S.K.; Zaitlen, N.A.; Kong, S.-y.; Freimer, N.B.; Sabatti, C.; Eskin, E. Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 2010, 42, 348–354. [Google Scholar] [CrossRef]
  20. Danecek, P.; Auton, A.; Abecasis, G.; Albers, C.A.; Banks, E.; DePristo, M.A.; Handsaker, R.E.; Lunter, G.; Marth, G.T.; Sherry, S.T.; et al. The variant call format and VCFtools. Bioinformatics 2011, 27, 2156–2158. [Google Scholar] [CrossRef]
  21. Thorvaldsdóttir, H.; Robinson, J.T.; Mesirov, J.P. Integrative Genomics Viewer (IGV): High-performance genomics data visualization and exploration. Brief. Bioinform. 2012, 14, 178–192. [Google Scholar] [CrossRef]
  22. Cingolani, P.; Platts, A.; Wang, L.L.; Coon, M.; Nguyen, T.; Wang, L.; Land, S.J.; Lu, X.; Ruden, D.M. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff. Fly 2012, 6, 80–92. [Google Scholar] [CrossRef]
  23. Ginestet, C. ggplot2: Elegant Graphics for Data Analysis. J. R. Stat. Soc. Ser. A Stat. Soc. 2011, 174, 245–246. [Google Scholar] [CrossRef]
  24. Haley, L. Genetics of sex determination in the American oyster. Proc. Natl. Shellfish. Assoc. 1979, 69, 54–57. Available online: https://cir.nii.ac.jp/crid/1570291224717821568 (accessed on 29 July 2025).
  25. Guo, X.; Hedgecock, D.; Hershberger, W.K.; Cooper, K.; Allen, S.K., Jr. Genetic determinants of protandric sex in the Pacific oyster, Crassostrea gigas Thunberg. Evolution 1998, 52, 394–402. [Google Scholar] [CrossRef] [PubMed]
  26. Hedrick, P.W.; Hedgecock, D. Sex Determination: Genetic Models for Oysters. J. Hered. 2010, 101, 602–611. [Google Scholar] [CrossRef]
  27. Santerre, C.; Sourdaine, P.; Marc, N.; Mingant, C.; Robert, R.; Martinez, A.-S. Oyster sex determination is influenced by temperature—First clues in spat during first gonadic differentiation and gametogenesis. Comp. Biochem. Physiol. Part A Mol. Integr. Physiol. 2013, 165, 61–69. [Google Scholar] [CrossRef]
  28. Sun, D.; Yu, H.; Li, Q. Starvation-induced changes in sex ratio involve alterations in sex-related gene expression and methylation in Pacific oyster Crassostrea gigas. Comp. Biochem. Physiol. Part B Biochem. Mol. Biol. 2023, 267, 110863. [Google Scholar] [CrossRef]
  29. Sun, D.; Yu, H.; Kong, L.; Liu, S.; Xu, C.; Li, Q. The role of DNA methylation reprogramming during sex determination and sex reversal in the Pacific oyster Crassostrea gigas. Int. J. Biol. Macromol. 2024, 259, 128964. [Google Scholar] [CrossRef]
  30. Wang, Z.; Ng, C.; Liu, X.; Wang, Y.; Li, B.; Kashyap, P.; Chaudhry, H.A.; Castro, A.; Kalontar, E.M.; Ilyayev, L. The ion channel function of polycystin-1 in the polycystin-1/polycystin-2 complex. EMBO Rep. 2019, 20, e48336. [Google Scholar] [CrossRef]
  31. Lemos, F.O.; Ehrlich, B.E. Polycystin and calcium signaling in cell death and survival. Cell Calcium 2018, 69, 37–45. [Google Scholar] [CrossRef]
  32. Field, S.; Riley, K.-L.; Grimes, D.T.; Hilton, H.; Simon, M.; Powles-Glover, N.; Siggers, P.; Bogani, D.; Greenfield, A.; Norris, D.P. Pkd1l1 establishes left-right asymmetry and physically interacts with Pkd2. Development 2011, 138, 1131–1142. [Google Scholar] [CrossRef]
  33. Deguchi, R.; Takeda, N.; Stricker, S.A. Calcium signals and oocyte maturation in marine invertebrates. Int. J. Dev. Biol. 2015, 59, 271–280. [Google Scholar] [CrossRef]
  34. Castelli, M.A.; Whiteley, S.L.; Georges, A.; Holleley, C.E. Cellular calcium and redox regulation: The mediator of vertebrate environmental sex determination? Biol. Rev. 2020, 95, 680–695. [Google Scholar] [CrossRef]
  35. Wilson, C.A.; High, S.K.; McCluskey, B.M.; Amores, A.; Yan, Y.-l.; Titus, T.A.; Anderson, J.L.; Batzel, P.; Carvan III, M.J.; Schartl, M. Wild sex in zebrafish: Loss of the natural sex determinant in domesticated strains. Genetics 2014, 198, 1291–1308. [Google Scholar] [CrossRef]
  36. Triay, C.; Conte, M.A.; Baroiller, J.-F.; Bezault, E.; Clark, F.E.; Penman, D.J.; Kocher, T.D.; D’Cotta, H. Structure and Sequence of the Sex Determining Locus in Two Wild Populations of Nile Tilapia. Genes 2020, 11, 1017. [Google Scholar] [CrossRef] [PubMed]
  37. Taslima, K.; Khan, M.G.; McAndrew, B.J.; Penman, D.J. Evidence of two XX/XY sex-determining loci in the Stirling stock of Nile tilapia (Oreochromis niloticus). Aquaculture 2021, 532, 735995. [Google Scholar] [CrossRef]
Figure 1. Genome-wide distribution of insertion–deletion (indel) variants in the Fujian oyster genome.
Figure 2. Genome-wide identification of sex-associated indel loci in the Fujian oyster. (A) Manhattan plot of GWAS between indel genotypes and phenotypic sex using a mixed linear model (MLM). The horizontal line represents the genome-wide significance threshold (−log10P = 7.476). (B) Genome-wide distribution of genetic differentiation (FST) between males and females. (C) Density of sex-biased indels per 10 kb genomic window.
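The caption for panel (A) gives the significance threshold but not how it was derived. One assumption that reproduces a value very close to it is a Bonferroni correction at α = 0.01 over the 299,774 high-quality indels reported in the abstract. The sketch below checks that arithmetic; the α value and the function name are assumptions made for illustration, not details stated by the authors.

```python
import math

# Assumption: Bonferroni correction at alpha = 0.01 across all tested indels.
# The indel count comes from the abstract; alpha is NOT stated in the caption.
N_INDELS = 299_774
ALPHA = 0.01


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the genome-wide significance threshold on the -log10(P) scale."""
    return -math.log10(alpha / n_tests)


threshold = bonferroni_threshold(ALPHA, N_INDELS)
print(f"-log10(P) threshold: {threshold:.4f}")
```

Under this assumption the threshold comes out at roughly 7.477, which agrees with the 7.476 in the caption up to rounding or truncation of the last digit.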
Figure 3. PCA based on 192 sex-associated indel loci in two batches of the FL strain of M. angulata. Each point represents an individual sample, color-coded by phenotypic sex (orange: female; blue: male). (A) The target batch; (B) independent validation batch.
Figure 4. Genotyping validation of sex-linked indels using primer pairs FO-3 and FO-4. (A) PCR amplification results for FO-3 targeting the indel at Chr. 9:8,462,525. (B) PCR amplification results for FO-4 targeting the indel at Chr. 9:10,037,381.
Figure 5. Variation in sex-associated indel loci within the PKD1L1 gene across different populations and sexes. (A) Schematic representation of the PKD1L1 gene structure. (B) Local GWAS results showing that 25 indel loci within the PKD1L1 gene are significantly associated with sex. Black dots indicate PKD1L1 loci not significantly associated with sex, red dots indicate loci significantly associated with sex, and the arrow shows the transcription direction (5′→3′). (C) Genotype heatmap illustrating the sex-associated indel genotypes in the PKD1L1 gene for representative individuals from three populations. ALT alleles are highly enriched in males of the FL strain. Each row represents an individual, and each column represents an indel locus. Red and green indicate homozygous reference (0/0) and homozygous alternate (1/1) genotypes, respectively; yellow indicates heterozygous genotypes (0/1). The left color bar denotes population origin—green: wild population (WW), orange: farmed population (DD), and blue: selectively bred strain (FL). The right color bar indicates phenotypic sex—red for females and blue for males. (D) Average frequency of ALT alleles at all indel loci within the PKD1L1 gene across different populations and sexes. Results show a stepwise enrichment of ALT alleles from wild to farmed to selectively bred groups, particularly in male individuals. (E) Comparison of ALT allele frequencies at sex-associated indel loci between males and females across the three populations. A progressive enrichment of sex-associated ALT alleles is observed in domesticated and selectively bred groups, especially in FL strain males. FF, FD, FW, and MF represent females from the FL, DD, and WW populations, and males from the FL population, respectively. **** indicates p < 0.0001; *** indicates p < 0.001; ** indicates p < 0.01; * indicates p < 0.05; ns indicates not significant.
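Panels (D) and (E) report ALT allele frequencies per population and sex. As a minimal sketch of how such a frequency can be obtained from diploid genotype calls, each heterozygote (0/1) contributes one ALT allele and each alternate homozygote (1/1) contributes two, out of two alleles per called individual. The function name and the example genotypes are hypothetical and are not data from the study.

```python
from collections import Counter


def alt_allele_frequency(genotypes):
    """ALT allele frequency from diploid calls ('0/0', '0/1', '1/1').

    Missing calls such as './.' are skipped; returns None if nothing is callable.
    """
    counts = Counter(genotypes)
    n_called = counts["0/0"] + counts["0/1"] + counts["1/1"]
    if n_called == 0:
        return None
    return (counts["0/1"] + 2 * counts["1/1"]) / (2 * n_called)


# Hypothetical genotype calls at a single locus, for illustration only.
example = {
    ("FL", "male"): ["0/1", "0/1", "1/1", "0/1"],
    ("FL", "female"): ["0/0", "0/0", "0/1", "./."],
}
freqs = {group: alt_allele_frequency(calls) for group, calls in example.items()}
for group, freq in freqs.items():
    print(group, round(freq, 3))
```

Averaging this per-locus value over all indel loci in a gene gives a group-level summary of the kind plotted in panel (D).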
Table 1. Genotyping results of an independent M. angulata population using FO-3 and FO-4 primer pairs.
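| Genotype | FO-3 Male | FO-3 Female | FO-4 Male | FO-4 Female |
| --- | --- | --- | --- | --- |
| 0/0 | 4 | 25 | 8 | 27 |
| 0/1 | 22 | 5 | 22 | 3 |
| 1/1 | 4 | 0 | 0 | 0 |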
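The genotype counts in Table 1 are enough to recompute simple classification metrics for the rule described for Table S7 (0/0 predicted as female; 0/1 and 1/1 predicted as male). The sketch below does this for accuracy and F1-score. Treating male as the positive class for F1 is an assumption, and the resulting values are recomputed here rather than copied from Table S7.

```python
# Genotype counts as read from Table 1: genotype -> (males, females).
TABLE_1 = {
    "FO-3": {"0/0": (4, 25), "0/1": (22, 5), "1/1": (4, 0)},
    "FO-4": {"0/0": (8, 27), "0/1": (22, 3), "1/1": (0, 0)},
}


def classification_metrics(counts):
    """Score the rule '0/0 -> female, 0/1 or 1/1 -> male'.

    Assumption: male is treated as the positive class.
    """
    fn, tn = counts["0/0"]  # males called female, females called female
    tp = counts["0/1"][0] + counts["1/1"][0]  # males called male
    fp = counts["0/1"][1] + counts["1/1"][1]  # females called male
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return {"accuracy": accuracy, "f1": f1}


metrics = {marker: classification_metrics(c) for marker, c in TABLE_1.items()}
for marker, m in metrics.items():
    print(f"{marker}: accuracy={m['accuracy']:.3f}, F1={m['f1']:.3f}")
```

With these counts, each marker column sums to 30 males and 30 females, and the rule classifies 51 of 60 individuals correctly for FO-3 and 49 of 60 for FO-4.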
Table 2. Annotation of the top 15 indel loci most significantly associated with sex.
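| Rank | Chromosome | Position | GWAS | FST | Male Frequency | Female Frequency | Genomic Region | Gene Name |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 9 | 8,921,925 | 19.23 | 0.36 | 85.71% | 10.34% | intron | PKD1L1 |
| 2 | 9 | 9,358,715 | 18.36 | 0.44 | 85.71% | 8.62% | intron | LOC128164001 |
| 3 | 9 | 10,038,488 | 17.26 | 0.34 | 87.50% | 15.52% | - | - |
| 4 | 9 | 8,410,372 | 17.23 | 0.41 | 87.50% | 8.62% | upstream_gene | 5-HTRL |
| 5 | 9 | 9,574,321 | 16.70 | 0.34 | 85.71% | 12.07% | intron | SCP |
| 6 | 9 | 9,595,103 | 16.63 | 0.49 | 85.71% | 6.90% | upstream_gene | SCP |
| 7 | 9 | 8,914,529 | 16.57 | 0.45 | 83.93% | 8.62% | intron | PKD1L1 |
| 8 | 9 | 8,916,956 | 16.51 | 0.50 | 85.71% | 6.90% | intron | PKD1L1 |
| 9 | 9 | 8,887,643 | 16.20 | 0.33 | 87.50% | 10.34% | intron | PKD1L1 |
| 10 | 9 | 8,469,050 | 15.49 | 0.33 | 83.93% | 12.07% | intron | TWIST2 |
| 11 | 9 | 8,454,181 | 15.34 | 0.41 | 80.36% | 8.62% | - | - |
| 12 | 9 | 8,412,554 | 15.03 | 0.36 | 85.71% | 10.34% | upstream_gene | 5-HTRL |
| 13 | 9 | 8,412,558 | 15.03 | 0.36 | 85.71% | 10.34% | upstream_gene | 5-HTRL |
| 14 | 9 | 9,800,638 | 14.67 | 0.38 | 80.36% | 8.62% | upstream_gene | CCKRa |
| 15 | 9 | 9,800,641 | 14.52 | 0.39 | 80.36% | 8.62% | upstream_gene | CCKRa |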
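As a quick consistency check on Table 2, the sketch below confirms that all 15 top-ranked loci fall inside the 8.3–10.1 Mb interval on chromosome 9 named in the abstract, counts loci per annotated gene, and reports the smallest male-minus-female frequency gap. The values are transcribed from Table 2; the variable names are illustrative only.

```python
from collections import Counter

# (position on chromosome 9, male frequency %, female frequency %, gene name)
TOP_LOCI = [
    (8_921_925, 85.71, 10.34, "PKD1L1"),
    (9_358_715, 85.71, 8.62, "LOC128164001"),
    (10_038_488, 87.50, 15.52, "-"),
    (8_410_372, 87.50, 8.62, "5-HTRL"),
    (9_574_321, 85.71, 12.07, "SCP"),
    (9_595_103, 85.71, 6.90, "SCP"),
    (8_914_529, 83.93, 8.62, "PKD1L1"),
    (8_916_956, 85.71, 6.90, "PKD1L1"),
    (8_887_643, 87.50, 10.34, "PKD1L1"),
    (8_469_050, 83.93, 12.07, "TWIST2"),
    (8_454_181, 80.36, 8.62, "-"),
    (8_412_554, 85.71, 10.34, "5-HTRL"),
    (8_412_558, 85.71, 10.34, "5-HTRL"),
    (9_800_638, 80.36, 8.62, "CCKRa"),
    (9_800_641, 80.36, 8.62, "CCKRa"),
]
REGION = (8_300_000, 10_100_000)  # 8.3-10.1 Mb interval from the abstract

# Every top locus should sit inside the reported sex-associated interval.
in_region = all(REGION[0] <= pos <= REGION[1] for pos, *_ in TOP_LOCI)
# Loci per annotated gene; rows without a gene name ("-") are skipped.
gene_counts = Counter(gene for *_, gene in TOP_LOCI if gene != "-")
# Smallest difference between male and female frequency, in percentage points.
min_gap = min(male - female for _, male, female, _ in TOP_LOCI)

print(in_region, gene_counts.most_common(), round(min_gap, 2))
```

PKD1L1 accounts for four of the 15 loci, more than any other gene in the table, and every locus shows a male-female frequency gap above 70 percentage points.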
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Han, Y., Ning, Y., Li, L., Wan, Q., Li, S., Yao, Y., Tang, C., Wu, Q., Guo, X., Qi, J., Ke, Y., Ge, H., & Cai, M. (2025). Sex-Associated Indels and Candidate Gene Identification in Fujian Oyster (Magallana angulata). Fishes, 10(9), 438. https://doi.org/10.3390/fishes10090438