Molecular Characterization and Expression of the LAP3 Gene and Its Association with Growth Traits in the Blood Clam Tegillarca granosa

Leucine aminopeptidase 3 (LAP3) is a metallopeptidase that cleaves N-terminal residues and is involved in protein maturation and degradation. In this study, we characterized the leucine aminopeptidase 3 (LAP3) gene from Tegillarca granosa (Tg-LAP3 for short), which appeared to consist of 15,731 nucleotides encoding 530 amino acids. We identified 12 introns and 13 exons in the Tg-LAP3 gene, suggesting a highly conserved genomic structure. The proximal promoter sequence consists of 1922 bp with a typical TATA box structure, which is the general structural characteristic of core promoters in eukaryotes. We found two functional domains in the Tg-LAP3 protein, including an N-terminal domain (41–174aa) and a peptidase_M17 catalytic domain (209–522aa). Multiple alignment showed that Tg-LAP3 shares 73.4% identity with LAP3 of Mizuhopecten yessoensis and 55.2–70.7% identity with LAP3 of other species. Quantitative analysis of Tg-LAP3 in embryos/larvae and adult tissues indicated that the highest expression occurred in eyespot larva, with limited expression in other stages; among tissues, the highest expression was found in the liver (p < 0.05). In the association analysis, three single-nucleotide polymorphisms (SNPs) (g.-488A > G, g.-1123C > T, and g.-1304C > A) in the proximal promoter were successfully typed, but there was no significant difference in growth traits (body weight, shell length, shell width, and shell height) among these genotypes. The results of our study demonstrate the functional roles of the Tg-LAP3 gene and provide valuable information for molecular marker-assisted selection (MAS) of the blood clam.


Introduction
Leucine aminopeptidase 3 (LAP3) has been shown to catalyze the hydrolysis of leucine residues from the amino termini of proteins or peptide substrates [1]. LAP3 is a zinc-containing enzyme found in many tissues, including lens, kidney, pancreas, muscle, liver, and mammary gland, and in subcellular locations in a diversity of species [2]. As a cell maintenance enzyme, LAP3 has different functions in mammals, invertebrates, microbes, and plants. In mammals, it processes peptides for MHC I antigen presentation and bioactive peptides (oxytocin, vasopressin, and enkephalins) and is involved in vesicle trafficking to the plasma membrane [3,4]. In microbes, LAP3 plays a role in proteolysis and contributes to DNA binding ability [4]. Additionally, in plants, it has roles in defense, membrane transport of auxin receptors, meiosis, and osmoregulation [4,5].
However, the major roles of LAP3 are protein maturation and degradation, which are essential for metabolism, development, adaptation, and repair [6]. The level of protein maturation and degradation varies greatly, depending on the stage of development, environmental factors, and genotype [7]; therefore, variation in the LAP3 gene is likely to account for some of the variation in growth rate observed in some species [8], and its polymorphisms may be significantly associated with traits of interest in breeding. The potential applications of the LAP3 gene in animal breeding, as well as its single-nucleotide polymorphisms (SNPs) and their associations with growth traits, have been investigated in mammals, such as cattle [9] and sheep [10]. Three SNPs associated with high milk production in cattle and two SNPs associated with high birth weight in sheep were identified and could be used in marker-assisted selective breeding [9,10]. In marine mollusks, LAPs are also known to play an important role in growth and development, stress responses, and adaptation to changing environmental conditions [11,12], and SNPs associated with growth traits have been found in several aquatic species [13,14]. For example, 23 SNPs of the LAP gene were found to be linked to growth rates in the Pacific oyster, Crassostrea gigas, and three SNPs of the LAP3 gene were identified in the razor clam, Sinonovacula constricta; these SNPs could be used as genetic breeding markers in aquaculture. Meanwhile, gene transcripts and enzyme activities of LAP were also found to play important roles throughout early embryonic development in some bivalves, such as C. gigas [15] and Meretrix meretrix [16].
The blood clam Tegillarca granosa is a well-known seafood with red blood, special flavor, and rich nutrients and is an economically important aquaculture species in China. The blood clam is an oviparous marine bivalve that reaches sexual maturity at two years of age. Like most bivalve mollusks, it undergoes a metamorphosis from embryo to larva to juvenile clam in its early life, including the stages of fertilized egg, blastomere, blastula, gastrula, trochophore, D-shaped larva, umbo larva, eyespot larva, and juvenile clam. In recent years, selective breeding for rapid growth has been a primary focus of the aquaculture industry. The identification of growth genes and genetic variations associated with growth could benefit genetic improvement. So far, several genes involved in growth have been cloned and analyzed in T. granosa, including Smad3 (Mothers against decapentaplegic homolog 3), BMP7 (Bone morphogenetic protein 7), ERK2 (Extracellular signal-regulated kinase 2) [17], GRB2 (Growth factor receptor-bound protein 2) [18], HDAC1 (Histone deacetylase 1) [19], Smad1/5 [20], BMP2/4 [21], and IGF1R (Insulin-like growth factor 1 receptor) [22], and SNPs associated with growth traits have been identified in Smad1/5 [21] and HDAC1 [23], which might be useful as molecular markers for assisted breeding. However, to date, no studies on the LAP3 gene in this species have been conducted.
In the present study, we characterized the Tg-LAP3 gene from the blood clam, determined the quantitative expression profiles, and analyzed associations between SNPs and growth-related traits. This study improves our understanding of the evolutionary and functional roles of the Tg-LAP3 gene and provides valuable information for molecular breeding programs of this species.

Sample Collection
Adult two-year-old blood clams (body weight 8.99 ± 0.48 g, shell length 29.14 ± 1.88 mm, shell width 22.32 ± 0.88 mm, shell height 19.82 ± 0.75 mm, on average) were collected from the Ningbo Yongsheng Shellfish Hatchery in Ningbo, China. Eight tissues from four individuals, including blood, liver, gill, mantle, adductor muscle, foot, testis, and ovary, were sampled, immediately frozen in liquid nitrogen, and stored at −80 °C for tissue expression analysis. Samples of nine developmental stages (mature eggs, 2-4 cells, blastula, gastrula, trochophore, D-shaped larva, umbo larva, eyespot larva, and juvenile clam; n > 500) were collected by artificially induced spawning and preserved at −80 °C for embryo/larva expression analysis. To study SNPs of the Tg-LAP3 gene, 80 two-year-old blood clams were collected, and their shell lengths, shell heights, shell widths, and body weights were measured. The adductor muscle of each individual was collected and preserved in absolute ethanol.

Cloning and Characterization of Tg-LAP3
To obtain the full-length cDNA, expressed sequence tags (ESTs) of Tg-LAP3 were retrieved from a 454 cDNA library of T. granosa to design specific primers (Table 1). Total RNAs were extracted by the common method described above, and the integrity and quality of RNAs were assessed using 1.5% agarose gel electrophoresis and ultraviolet spectrophotometry. The 5' RACE and 3' RACE reactions were conducted using a SMART™ RACE cDNA amplification kit (Clontech, Mountain View, CA, USA) according to the manufacturer's instructions. The obtained PCR products were purified, cloned, and sequenced.
To obtain the intron sequences of Tg-LAP3, genomic DNA was extracted by the phenol/chloroform method and dissolved in sterile water at a concentration of 100 ng/µL. Eleven pairs of primers were designed according to the cDNA sequence of Tg-LAP3 (Table 1). The PCR conditions were as follows: 5 min at 94 °C; 30 cycles of 45 s at 94 °C, annealing for 45 s, and 1 min at 72 °C; and a final 10 min extension at 72 °C. A Genome Walker™ Universal Kit (Clontech, Mountain View, CA, USA) was used to generate the proximal promoter sequence of Tg-LAP3 according to the manufacturer's instructions. Each PCR product was cloned and sequenced following the procedures described above.

Gene Expression Analysis of Tg-LAP3
Expression of the Tg-LAP3 transcript in eight tissues (blood, liver, gill, mantle, adductor muscle, foot, testis, and ovary; n = 4) and nine developmental stages (mature eggs, 2-4 cells, blastula, gastrula, trochophore, D-shaped larva, umbo larva, eyespot larva, and juvenile clam; n > 500) was analyzed using qRT-PCR. One pair of T. granosa 18S rRNA primers (18S rRNA-real-F and 18S rRNA-real-R; see Table 1) was chosen to amplify a 200 bp fragment of the 18S rRNA gene as the internal reference, and LAP3-real-F and LAP3-real-R (Table 1) were designed to amplify a product of 192 bp. The cDNA from tissues and larvae was diluted 1:10 and used as the template for SYBR Green quantitative real-time RT-PCR on an ABI 7500 Fast Real-time PCR System (Applied Biosystems, Foster City, CA, USA). PCR amplifications were performed in a 20 µL volume with 10 µL of iTaq Universal SYBR Green Supermix (Bio-Rad, Hercules, CA, USA), 7.2 µL of deionized water, 0.8 µL of first-strand cDNA, and 1 µL of each primer. The conditions for the PCR were as follows: incubation at 94 °C for 20 s, followed by 40 cycles of 94 °C for 3 s, 60 °C for 15 s, and 72 °C for 10 s.

Association Analysis of SNPs with Growth Traits in T. granosa
To identify SNPs in the Tg-LAP3 proximal promoter, the DNA of 80 individuals was extracted using the phenol-chloroform method. High-resolution melting (HRM) primers (Table 1) were designed to validate the SNPs, and primer annealing temperatures were optimized by temperature-gradient PCR. The PCR products were visualized using 8% polyacrylamide gel electrophoresis. HRM genotyping was performed using the 7500 Fast Real-time PCR System, and PCR was performed in a 20 µL volume containing 10 µL of Melt Doctor HRM Master Mix (Applied Biosystems), 10 µM forward and reverse primers, 20 ng/µL of template DNA, and 5.4 µL of deionized water. The PCR was conducted as follows: an initial denaturation at 95 °C for 10 min, then 40 cycles of denaturation at 95 °C for 15 s and annealing for 1 min, followed by melting curve conditions consisting of denaturation at 95 °C for 10 s and annealing for 1 min, and then high-resolution melting at 95 °C for 15 s and annealing for 15 s. The amplification results were analyzed by the HRM system software, and the genotype of each individual was determined through melting curve alignment. To determine the mutant genotype, the genotyping primers were used to amplify the target fragments from the genomic DNA of different genotypes (at least three individuals for each genotype). Each PCR product was purified, recovered, and sequenced following the procedures described above. The association of each SNP with growth traits was then examined. For each SNP locus, the 80 individuals were grouped based on their genotypes, and the mean value and standard deviation of each measured trait were calculated for each genotype group.
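The grouping step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the function name `summarize_by_genotype`, and all trait values are hypothetical and are not data from this study.

```python
from collections import defaultdict
from statistics import mean, stdev

def summarize_by_genotype(records, trait):
    """Group individuals by genotype and return {genotype: (n, mean, SD)} for one trait."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["genotype"]].append(rec[trait])
    summary = {}
    for genotype, values in groups.items():
        # The sample SD needs at least two observations; report 0.0 otherwise.
        sd = stdev(values) if len(values) > 1 else 0.0
        summary[genotype] = (len(values), mean(values), sd)
    return summary

# Hypothetical body weights (g) for one SNP locus; not measured values.
records = [
    {"genotype": "AA", "body_weight": 8.0},
    {"genotype": "AA", "body_weight": 9.0},
    {"genotype": "AG", "body_weight": 9.0},
    {"genotype": "AG", "body_weight": 10.0},
    {"genotype": "GG", "body_weight": 10.0},
    {"genotype": "GG", "body_weight": 12.0},
]

summary = summarize_by_genotype(records, "body_weight")
for genotype, (n, m, sd) in sorted(summary.items()):
    print(f"{genotype}: n={n}, mean={m:.2f}, SD={sd:.2f}")
```

The same function would be called once per trait (body weight, shell length, shell width, shell height) and once per SNP locus.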

Statistical Analysis
The expression levels of Tg-LAP3 were analyzed using the 2^(−ΔΔCt) method, and the results were expressed as the mean ± S.E. Expression data were assessed using one-way ANOVA, and Tukey's honest significant difference test was performed on each data set. The effects of SNPs on growth traits were analyzed in the same way: for each trait, the means were compared among genotypes using one-way ANOVA. All statistical analyses were performed using SPSS statistical software version 20.0 (SPSS, Chicago, IL, USA), and the level of significance was set at p < 0.05.
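As a worked illustration of relative quantification, the following Python sketch computes fold changes by the standard 2^(−ΔΔCt) calculation and summarizes them as mean ± S.E., assuming that this is the method meant here. The function names, the choice of calibrator sample, and all Ct values are hypothetical; the study itself reports using SPSS for the statistics.

```python
from math import sqrt
from statistics import mean, stdev

def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change of the target gene relative to a calibrator sample, 2^(-ddCt)."""
    delta_ct_sample = ct_target - ct_reference              # normalize to reference gene
    delta_ct_calibrator = ct_target_cal - ct_reference_cal  # same for the calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2 ** -delta_delta_ct

def mean_and_se(values):
    """Mean and standard error of the mean for a list of replicate values."""
    return mean(values), stdev(values) / sqrt(len(values))

# Hypothetical Ct pairs (target gene, 18S rRNA) for four replicates of one tissue.
sample_cts = [(22.0, 12.0), (22.5, 12.5), (21.5, 12.0), (22.0, 12.5)]
# Hypothetical calibrator Ct pair (for example, the lowest-expressing tissue).
calibrator = (25.0, 12.0)

folds = [relative_expression(t, r, *calibrator) for t, r in sample_cts]
m, se = mean_and_se(folds)
print(f"relative expression: {m:.2f} ± {se:.2f}")
```

A sample whose ΔCt equals that of the calibrator gives a fold change of exactly 1, which is a quick sanity check on the sign convention.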

Gene Structure and Characterization of Tg-LAP3
The full-length cDNA of the Tg-LAP3 gene appeared to be 1769 bp (GenBank Accession No. JX103498), including an open reading frame (ORF) of 1593 bp encoding 530 amino acids. The Tg-LAP3 gene contains 12 introns and 13 exons, and all introns were located in the ORF (Figure 1A). Eleven intron-exon junctions conform to the GT-AG rule, whereas the boundary of Tg-LN10 (intron 10) is GC-AG (Table 2). The maximum length of Tg-LAP3 introns is 1853 bp, while the minimum intron length is 378 bp.
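Two of the checks in the paragraph above are simple enough to sketch in Python: the ORF arithmetic (1593 bp corresponds to 531 codons, that is, 530 amino acids plus one stop codon) and the classification of intron boundaries against the GT-AG rule. The function names and the short intron sequences are hypothetical placeholders, not sequences from Tg-LAP3.

```python
def orf_amino_acids(orf_length_bp):
    """Number of encoded amino acids for an ORF that includes its stop codon."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_length_bp // 3 - 1  # subtract the stop codon

def classify_intron(intron_seq):
    """Classify an intron by its first two (donor) and last two (acceptor) bases."""
    seq = intron_seq.upper()
    donor, acceptor = seq[:2], seq[-2:]
    if (donor, acceptor) == ("GT", "AG"):
        return "GT-AG (canonical)"
    if (donor, acceptor) == ("GC", "AG"):
        return "GC-AG (non-canonical)"
    return f"{donor}-{acceptor} (other)"

print(orf_amino_acids(1593))  # ORF length taken from the text

# Hypothetical, shortened intron sequences for illustration only.
for name, seq in [("intron_a", "GTAAGTTTTCAG"), ("intron_b", "gcaagtttttag")]:
    print(name, classify_intron(seq))
```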
The Tg-LAP3 proximal promoter sequence was found to consist of 1922 bp (Figure 1B) without a CpG island, and has three potential transcription start sites at -1641 bp, -512 bp, and -347 bp (the A of the ATG initiation codon is designated +1). A typical TATA box structure, six E-boxes (CAXXTG), a GATA frame, and octamer transcription factor (Oct-1) binding sites were found, as well as 209 potential transcription factor binding sites, e.g., for TBP, SP1, AP1, and C/EBPalp.
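A motif scan of the kind summarized above can be illustrated with a short Python sketch that searches a promoter for E-boxes (CAXXTG, with X any base) and reports positions relative to the start codon. It assumes, for illustration only, that the supplied sequence ends immediately upstream of the A of ATG, so that its last base is position -1; the function name and the example sequence are hypothetical.

```python
import re

# A lookahead is used so that overlapping E-boxes are all reported.
EBOX_PATTERN = re.compile(r"(?=(CA[ACGT]{2}TG))")

def find_eboxes(promoter):
    """Return [(position, motif)] with positions counted upstream of ATG (+1)."""
    seq = promoter.upper()
    length = len(seq)
    # Index 0 of the sequence corresponds to position -length.
    return [(m.start() - length, m.group(1)) for m in EBOX_PATTERN.finditer(seq)]

# Hypothetical 24-bp fragment, not the Tg-LAP3 promoter.
fragment = "TTCAGCTGAATATAAACACGTGCC"
for position, motif in find_eboxes(fragment):
    print(position, motif)
```

Dedicated transcription factor binding site predictors score matches against position weight matrices, so a plain pattern match like this one is only a first-pass screen.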

Analysis of the Deduced Amino Acid Sequence of Tg-LAP3
The calculated molecular mass of the deduced mature Tg-LAP3 protein is 57.90 kDa, and the theoretical isoelectric point is 8.80. The mature Tg-LAP3 protein possesses two putative conserved domains: an N-terminal domain (cytosol aminopeptidase family, Peptidase_M17_N, 42-174aa) and a peptidase_M17 catalytic domain (cytosol aminopeptidase family, 209-522aa). The phylogenetic tree showed that all LAP3 of selected animals mainly clustered into three groups (Figure 2), including one group comprising mollusks, one group containing mammals, amphibians, and fish, and one group comprising other invertebrates. Tg-LAP3 in the mollusk group clustered with LAP3 from Yesso scallop Mizuhopecten yessoensis, and then with LAP3 from other mollusks. Multiple alignments showed that Tg-LAP3 shares the highest sequence identity (73.4%) with LAP3 of M. yessoensis, and 55.2-70.7% identity with LAP3 of other species (Table 3).
Table 3. Species and GenBank accession numbers of LAP3 sequences used for multiple alignment and phylogenetic analysis.
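For readers unfamiliar with how identity percentages such as those above are obtained, the following Python sketch computes percent identity from two already aligned sequences. The convention used here (identical residues divided by alignment columns, ignoring columns that are gaps in both sequences) is an assumption; alignment programs differ in their choice of denominator, and the text does not state which was used. The peptide fragments are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            matches += 1  # a == b here implies neither is a gap
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared

# Hypothetical aligned fragments, not LAP3 sequences.
print(f"{percent_identity('MKT-LLA', 'MKSALLA'):.1f}%")
```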

Quantitative Expression Analysis of Tg-LAP3
Among the nine developmental stages, Tg-LAP3 was expressed at low levels from mature eggs to trochophores (Figure 3A). Expression then rose from D-shaped larva to eyespot larva, where it was significantly higher than in all other developmental stages (p < 0.05), and subsided to low levels in juvenile clams (Figure 3A). In adult clams, Tg-LAP3 mRNA was expressed in all eight tissues, and the highest levels were found in the liver (p < 0.05). The lowest expression among adult tissues was detected in the ovary, and no statistically significant difference in expression was detected among the ovary, blood, gill, mantle, and testis (Figure 3B).

Association Analysis of SNPs with Growth Traits
When the Tg-LAP3 proximal promoter of 80 individuals was genotyped by HRM, only three SNPs (g.-488A > G, g.-1123C > T, and g.-1304C > A) were successfully typed (Table 4). The blood clams with the genotype GG at position g.-488A > G appeared larger (body weight, shell length, shell width, and shell height) than those with AG or AA, and clams with the genotype CC at position g.-1304C > A tended to be smaller than those with CA or AA. However, one-way ANOVA showed no significant difference in the growth traits among the genotypes of any of the three SNPs.
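The comparison of trait means among genotypes rests on the one-way ANOVA F statistic, which can be sketched in plain Python as follows. This is an illustrative calculation only: the study reports using SPSS, the function name is hypothetical, and the group values are invented. Converting F to a p-value requires the F distribution (for example, `scipy.stats.f_oneway` returns both), which is omitted here to keep the sketch dependency-free.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of observation groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group size times squared deviation of group mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviations from each group's own mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical body weights (g) for three genotype groups at one locus.
genotype_groups = [[8.0, 9.0, 8.5], [9.0, 10.0, 9.5], [10.0, 12.0, 11.0]]
f_stat, df_b, df_w = one_way_anova_f(genotype_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With unequal and sometimes small genotype groups, as is common for rare homozygotes among 80 individuals, the within-group variance is poorly estimated, which is one reason apparent differences in means may not reach significance.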

Tg-LAP3 Gene and Amino Acid Sequence Features
Leucine aminopeptidase 3 (LAP3) is an exopeptidase belonging to the M17 family, which cleaves N-terminal residues from proteins and peptides [1,4]. LAP3 is highly conserved at the amino acid level, containing two M17 Pfam domains and lacking the metallopeptidase HEXXH motif [27]. In the current study, the Tg-LAP3 protein was found to include the N-terminal domain (41-174aa) of the peptidase_M17 superfamily and the peptidase_M17 catalytic domain (209-522aa). Like other M17 family members, it lacks the HEXXH zinc-binding motif, which is consistent with the Tg-LAP3 protein being highly conserved [4].
The introns and the proximal promoter sequence of the Tg-LAP3 gene were characterized and analyzed. The results showed that Tg-LAP3 contains 12 introns and 13 exons with a total sequence of 12,057 bp. A comparison of intron numbers among species retrieved from GenBank showed that the number of introns in the LAP3 gene differs slightly, varying from 12 to 14. For example, there are 12 introns in the human (Homo sapiens), chicken (Gallus gallus), and zebrafish (Danio rerio) LAP3 genes, while there are 13 introns in the chimpanzee (Pan troglodytes) and 14 introns in the great tit (Parus major) LAP3 genes. In addition, 11 intron-exon boundaries (GT-AG) and the location of introns are well conserved between T. granosa and vertebrate LAP3 genes, suggesting a highly conserved genomic structure [28]. However, the intron-exon boundary of Tg-LN10 is GC-AG, indicating the presence of a C/T mutation at the donor site relative to the canonical GT [29]. Three possible core promoters were predicted in the 5'-UTR of Tg-LAP3. Conserved TATA boxes are present upstream of the transcription initiation site, a general structural characteristic of eukaryotic core promoters [30]. Several putative regulatory elements, including a GATA-box and E-boxes, were identified in the promoter region of Tg-LAP3. The GATA-box is a binding site for the trans-acting factor ASF-2 and exists in the promoter of genes with tissue-specific expression [31], suggesting that Tg-LAP3 is expressed in a tissue-specific manner, which is consistent with its expression pattern in adult tissues. Six E-boxes (CAXXTG) were found in the proximal promoter of Tg-LAP3. The E-box is one of the sequence motifs recognized by basic helix-loop-helix myogenic regulatory factors [32], and many genes with tissue-specific expression have multiple E-boxes in their promoter region that cooperatively regulate gene transcription [33,34]; therefore, further studies should be performed to identify how these E-boxes regulate Tg-LAP3 expression.

Quantitative Expression Analysis of Tg-LAP3
LAP3 plays crucial roles in cell maintenance, growth and development, and defense [13]. In marine mollusks, LAPs are believed to be of selective importance during adaptation to changes in salinity as a result of tides and varying freshwater input, which is considered their main function in marine organisms [11,35]. Furthermore, LAP3 is mainly involved in protein maturation and degradation, which are essential for metabolism, development, adaptation, and repair [6]. Tg-LAP3 mRNA was found to be expressed at the highest level in the liver, followed by the adductor muscle and foot. The liver is the main digestive and metabolic organ in bivalves, where protein maturation and degradation are very active. This suggests that LAP3 is the main exopeptidase in T. granosa. In bivalves, protein maturation and degradation enable exceptional changes in body morphology during larval development, from the free-living planktonic stage to sessile juveniles and calcified adults [15]. In C. gigas, gene transcripts of LAP were detected throughout larval development, which reflects the essential role of LAP in protein maturation and degradation during all developmental stages. The peak in LAP enzyme activity corresponds to the onset of eye development, suggesting that LAP helps in the major reorganization of tissue structures that typically occurs during this stage [15]. In M. meretrix, the LAP3 gene is expressed at the highest level in D-shaped larva, which suggests that LAP3 might be involved in tissue formation during early development [16]. Among the developmental stages examined here, Tg-LAP3 mRNA expression was low from the mature egg to the trochophore stage and increased from the D-shaped larva stage to the eyespot larva stage. The highest expression level was in the eyespot larva stage (p < 0.05); therefore, the Tg-LAP3 gene might play an important role in larval attachment and metamorphosis, as observed in C. gigas [15].

Association Analysis between SNPs of Tg-LAP3 Gene and Growth Traits
LAP3 plays an important role in protein maturation and degradation, which varies greatly depending on the developmental stage, environmental factors, and genotype [36]. Variations of the LAP3 gene are therefore likely to be responsible for differences in production traits among individuals, and genetic diversity and correlation analyses between SNPs and production traits have been performed in some mammals [9,10] and bivalves [13,14]. For example, in mammals, three SNPs (g.24794T > G, g.24803T > C, and g.24846T > C) of the LAP3 gene were linked to milk production in Chinese Holstein cows, and two SNPs were significantly correlated with birth weight in sheep. In bivalves, 23 SNPs in C. gigas and 3 SNPs in S. constricta were identified as being associated with growth traits, which might be useful for applications in genetic-based stock enhancement and breeding design. In our SNP typing, however, there was no significant difference in the growth traits among the genotypes of the three SNPs (g.-488A > G, g.-1123C > T, and g.-1304C > A). Although the blood clams with the genotype GG at position g.-488A > G tended to be larger (body weight, shell length, shell width, and shell height) than those with AG or AA, the difference was not significant. It is generally known that growth traits are controlled by multiple genes [37], so the effects of the g.-488A > G mutation in the Tg-LAP3 gene may be modulated by other genes and may be too small to detectably affect growth traits in T. granosa.

Conclusions
In this study, the Tg-LAP3 gene was characterized. It consists of 12 introns and 13 exons. The proximal promoter sequence of Tg-LAP3 consists of 1922 bp without a CpG island, with three potential transcription start sites, a typical TATA box structure, six E-boxes (CAXXTG), a GATA frame, and octamer transcription factor (Oct-1) binding sites. Tg-LAP3 appeared to be most highly expressed in the eyespot larva among the nine developmental stages examined, and in the liver among eight adult tissues. Three SNPs (g.-488A > G, g.-1123C > T, and g.-1304C > A) in the proximal promoter were successfully typed, but there was no significant difference in growth traits associated with these SNPs. The results of our study provide valuable information for functional studies of Tg-LAP3 and molecular marker-assisted selection of the blood clam.

Institutional Review Board Statement:
In the present study, Tegillarca granosa clams were collected from the genetic breeding research center of Zhejiang Wanli University, China. All experimental procedures were approved by the Institutional Animal Care and Use Committee (IACUC) of Zhejiang Wanli University, China (Approval code: 20210903001).

Data Availability Statement:
The sequence of the LAP3 gene of Tegillarca granosa has been deposited in GenBank under accession number JX103498.