Molecular Cloning, Screening of Single Nucleotide Polymorphisms, and Analysis of Growth-Associated Traits of igf2 in Spotted Sea Bass (Lateolabrax maculatus)

Simple Summary
The spotted sea bass (Lateolabrax maculatus) is an economically important fish species cultured in China. Single nucleotide polymorphisms (SNPs) significantly associated with growth traits were screened to support the culture and breeding of L. maculatus. The insulin-like growth factor 2 gene (igf2) of L. maculatus was isolated. Fourteen SNPs were detected in the genome sequence of igf2. Four SNPs were significantly associated with the growth traits of L. maculatus. The SNPs we identified could be valuable for the genetic breeding of L. maculatus.

Abstract
The insulin-like growth factor 2 gene (igf2) is thought to be a key regulator of animal growth. In fish, few studies have reported on single nucleotide polymorphisms (SNPs) located in igf2 and their association with growth traits. We screened the SNPs of igf2 from the spotted sea bass (Lateolabrax maculatus) by Sanger sequencing and analyzed the association of these SNPs with growth traits. The full-length complementary DNA (cDNA) of igf2 was 1045 bp, including an open reading frame of 648 bp. The amino acid sequence of Igf2 contained a signal peptide, an IGF domain, and an IGF2_C domain. Multiple sequence alignment showed that the IGF domain and IGF2_C domain were conserved in vertebrates. The genome sequence of igf2 had a length of 6227 bp. Fourteen SNPs (13 in introns and one in an exon) were found in the genome sequence of igf2. Four intronic SNPs were significantly associated with growth traits (p < 0.05). These results suggest that these SNPs are candidate molecular markers for breeding programs in L. maculatus.


Introduction
In fish, the insulin-like growth factor (Igf) family includes three Igfs, two Igf receptors, and six Igf-binding proteins [1,2]. These proteins have important roles in regulating reproduction, metabolism, growth, and development in fish [3][4][5]. In mammals, IGF2 is an evolutionarily conserved peptide hormone with structural homology to proinsulin [6]. As an important protein hormone in the growth hormone (GH)/IGF axis, Igf2 can transport GH, promote glucose transport into muscle, and regulate the proliferation, differentiation, and survival of cells in fish and mammals [3,7,8].

Cloning of igf2 from Genomic DNA
Total genomic DNA was isolated using the Marine Animals DNA Kit (Tiangen, Beijing, China). The integrity and quality of DNA were examined by electrophoresis on 1% agarose gels and with a NanoDrop 2000 (Thermo Scientific, Waltham, MA, USA), respectively. DNA was dissolved in sterile water at a concentration of 50 ng/mL and stored at −20 °C.
The genome sequence of igf2 was obtained from the L. maculatus genome (CP027267.1). One pair of primers (gs and ga) was used to verify the accuracy of the sequence (Table 1). PCR was conducted in a reaction volume of 20 µL (13.8 µL of double-distilled H2O, 2 µL of 10× LA Taq Buffer containing Mg2+ (TaKaRa Biotechnology, Shiga, Japan), 0.8 µL of each primer (10 mM), 1.6 µL of dNTP mixture (2.5 mM each), 0.8 µL of DNA, and 0.2 µL of LA Taq (5 U/mL)). The PCR program was 1 cycle at 94 °C for 3 min; 35 cycles at 95 °C for 30 s, 52 °C for 30 s, and 72 °C for 3 min; and 1 cycle at 72 °C for 7 min. As described above, PCR products were purified and sent to Tsingke Biological Technology for sequencing.
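As a quick arithmetic check on the reaction described above, the following minimal Python sketch sums the listed component volumes and estimates the total hold time of the cycling program. The dictionary keys and function names are illustrative only, and the time estimate ignores ramping between temperatures.

```python
# Per-reaction volumes in microlitres, as listed in the text.
components_ul = {
    "double-distilled H2O": 13.8,
    "10x LA Taq Buffer (with Mg2+)": 2.0,
    "primer gs": 0.8,
    "primer ga": 0.8,
    "dNTP mixture": 1.6,
    "template DNA": 0.8,
    "LA Taq": 0.2,
}

def total_volume_ul(components):
    """Sum component volumes, rounded to 0.1 uL to absorb float error."""
    return round(sum(components.values()), 1)

def program_hold_seconds(initial_s, cycles, step_seconds, final_s):
    """Total hold time of a PCR program, excluding temperature ramps."""
    return initial_s + cycles * sum(step_seconds) + final_s

volume = total_volume_ul(components_ul)
# 94 C for 3 min; 35 x (30 s + 30 s + 3 min); 72 C for 7 min.
hold_s = program_hold_seconds(180, 35, [30, 30, 180], 420)

print(f"reaction volume: {volume} uL")
print(f"hold time: {hold_s / 60} min")
```

The volumes add up to the stated 20 µL, and the program amounts to 150 min of hold time before ramping is taken into account.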

Sequence Analysis
ORF Finder (www.ncbi.nlm.nih.gov/gorf/orfig.cgi/, accessed on 12 October 2022) was used to obtain the open reading frames (ORFs) and predict the encoded protein sequence. Primer-BLAST (www.ncbi.nlm.nih.gov/tools/primer-blast/, accessed on 12 October 2022) was employed to assist with primer design. Exons and introns were identified by comparing the genomic DNA sequence with the cDNA sequence. The signal peptide of the deduced amino acid (aa) sequence was analyzed with SignalP (www.cbs.dtu.dk/services/SignalP/, accessed on 12 October 2022). Functional domains were predicted with the Conserved Domain Database of the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi/, accessed on 12 October 2022). The chemical and physical properties of the protein were analyzed by ProtParam (http://web.expasy.org/protparam/, accessed on 12 October 2022). Global Align (https://blast.ncbi.nlm.nih.gov/Blast.cgi/, accessed on 12 October 2022) was used to analyze the similarities of the aa sequence of Igf2. Multiple sequence alignment was carried out using Clustal Omega (www.ebi.ac.uk/Tools/msa/clustalo/, accessed on 12 October 2022). A phylogenetic tree was constructed using MEGA 7.0 with the maximum-likelihood algorithm [23].

Identification of SNPs and Statistical Analyses
Twenty individual DNAs were chosen randomly for SNP isolation. Four pairs of primers (E1s/E1a, E2s/E2a, E3s/E3a, E4s/E4a) were used to produce the DNA sequence of igf2 (Table 1). PCR fragments were purified with the DNA Gel Extraction Kit (Omega BioTek, Norcross, GA, USA), ligated into the pMD18-T vector (TaKaRa Biotechnology, Shiga, Japan), and sequenced. These sequences were analyzed to screen for potential SNPs with the SeqMan program (DNASTAR, Madison, WI, USA). If SNPs were found, then PCR amplification of the remaining 170 DNA samples was done.
An association analysis between the genotypes of each SNP and growth traits was carried out using the general linear model with SPSS 19.0 (IBM, Armonk, NY, USA). The model was:
$Y_{ij} = \mu + G_i + e_{ij}$
where $Y_{ij}$ represents the observed value of the jth individual of genotype i, $\mu$ denotes the mean of the observed values, $G_i$ represents the effective value of genotype i, and $e_{ij}$ denotes the random error [24]. The significance of differences was tested using Tukey's multiple-range test.
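To make the single-factor model concrete, the sketch below fits it to one SNP and one trait using only the Python standard library. It is an illustration under stated assumptions, not the SPSS procedure used in the study: the function name `one_way_glm` and the trait values are hypothetical, the genotype effect is taken as the genotype mean minus the overall mean, and only the F statistic (not the p-value or the post hoc test) is computed.

```python
from statistics import mean

def one_way_glm(groups):
    """Fit Y_ij = mu + G_i + e_ij and return (mu, effects, F).

    `groups` maps a genotype label to its list of trait values.
    mu is the mean of all observations; G_i is the genotype mean minus mu.
    """
    all_values = [y for ys in groups.values() for y in ys]
    mu = mean(all_values)
    effects = {g: mean(ys) - mu for g, ys in groups.items()}

    # Variation explained by genotype versus residual variation.
    ss_between = sum(len(ys) * (mean(ys) - mu) ** 2 for ys in groups.values())
    ss_within = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)

    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return mu, effects, f_stat

# Hypothetical trait values per genotype (not data from the study).
example = {"CC": [10, 12, 14], "CT": [13, 15, 17], "TT": [16, 18, 20]}
mu, effects, f_stat = one_way_glm(example)
print(mu, effects, f_stat)
```

A large F statistic relative to the F distribution with the corresponding degrees of freedom indicates that trait means differ among genotypes; a dedicated statistics package would be needed for the p-value and for pairwise comparisons.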

cDNA Cloning and Characterization of igf2
Based on transcriptome sequencing and validation with the primers cs and ca, we obtained the cDNA sequence of igf2. The full-length igf2 cDNA (GenBank accession number: ON462263) was 1045 bp, with a 5′-untranslated region (UTR) of 177 bp, a 3′-UTR of 220 bp, and an ORF of 648 bp (Figure 1). igf2 encoded a protein containing 215 aa. The molecular mass was 24.60 kDa and the isoelectric point was 10.04. The deduced protein contained a putative N-terminal signal peptide (1-50 aa), an insulin-like growth factor (IGF) domain (51-117 aa), and an IGF2_C domain (147-202 aa) (Figure 1).
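The reported lengths are internally consistent, which can be verified with a short Python sketch. The function names are illustrative, and the protein length assumes that the ORF length includes a single terminal stop codon.

```python
# Region lengths in bp, as reported in the text.
UTR5_BP, ORF_BP, UTR3_BP = 177, 648, 220

def cdna_length(utr5_bp, orf_bp, utr3_bp):
    """Full-length cDNA is the 5'-UTR plus the ORF plus the 3'-UTR."""
    return utr5_bp + orf_bp + utr3_bp

def protein_length(orf_bp):
    """Number of codons in the ORF minus the stop codon (assumed included)."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1

print(cdna_length(UTR5_BP, ORF_BP, UTR3_BP))
print(protein_length(ORF_BP))
```

Both values agree with the text: a 1045-bp cDNA and a 215-aa protein.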

Characterization of the Genome Sequence of igf2
The genome sequence of igf2 was 6227 bp in length (GenBank accession number: ON462264). It contained four exons and three introns (Figure 2). The lengths (in bp) of exons 1, 2, 3, and 4 were 75, 151, 182, and 240, respectively. The three introns had lengths (in bp) of 1027, 1574, and 1458, respectively. All intron-exon boundaries were consistent with the GT-AG rule.
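Two checks follow naturally from this description and are sketched below in Python. The first confirms that the four exon lengths sum to the 648-bp ORF reported for the cDNA. The second tests whether an intron conforms to the canonical splice-site rule, under which an intron begins with GT and ends with AG. The function name and the example sequences are hypothetical and are not taken from the igf2 sequence.

```python
# Exon lengths in bp, as reported in the text.
EXON_BP = [75, 151, 182, 240]

def follows_gt_ag(intron_seq):
    """Return True if the intron starts with GT and ends with AG."""
    seq = intron_seq.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")

print(sum(EXON_BP))

# Hypothetical intron sequences for illustration only.
print(follows_gt_ag("gtaagtcctttcag"))
print(follows_gt_ag("gtaagtcctttcat"))
```

In practice the intron sequences would be extracted from the genomic sequence using the exon coordinates obtained from the cDNA-to-genome comparison.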

Sequence Identity and Phylogenetic Analysis of Igf2
Global Align was used to align the Igf2 protein sequence of L. maculatus with those of other species. High identity was found between Igf2 of L. maculatus and Igf2 of other fish. For example, the sequence of Igf2 of L. maculatus was identical to Igf2 of Lateolabrax japonicus, and had 93% identity with Igf2 of Siniperca chuatsi, Seriola dumerili, and Trachinotus ovatus (Table 2). However, Igf2 of L. maculatus showed low identity with the IGF2 sequences of higher vertebrates such as mammals: only 47% with Mus musculus and 44% with Homo sapiens (Table 2). The IGF and IGF2_C domains of Igf2 were highly conserved in all species, which suggested that these two domains were important (Figure 3). The phylogenetic tree showed that mammals and fish formed two separate clades and that Igf2 of L. maculatus and Igf2 of L. japonicus clustered together (Figure 4).
Figure 3 caption (fragment): The IGF domain is shown with an underline. IGF2_C is represented with a wavy line. Fully, strongly, or weakly conserved residues are indicated by "*", ":", and ".", respectively. The accession numbers for sequences are shown in Table 2.
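For readers unfamiliar with how a percent identity of this kind is derived from a pairwise alignment, the following Python sketch shows one common definition: identical aligned residues divided by the alignment length, with gap columns counted in the denominator. This is an assumption for illustration; alignment tools differ in how they treat gaps, and the short sequences below are hypothetical rather than Igf2 fragments.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both rows.

    Both inputs must be rows of the same pairwise alignment, with '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)

# Hypothetical aligned peptides: 4 identical columns out of 6.
print(round(percent_identity("MKV-LT", "MKVALS"), 1))
```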

Detection of SNPs
Putative SNPs were detected by comparing igf2 sequences from 20 L. maculatus individuals. In the present study, a putative SNP locus was defined as a nucleotide site at which the base varied in ≥5 individuals. SNPs were detected only in the sequences amplified by the E3s/E3a and E4s/E4a primers. Subsequently, the fragments targeted by E3s/E3a and E4s/E4a were amplified from the remaining 170 individuals and sequenced directly to genotype the SNP sites. Sequencing failures were removed. Sequence alignment revealed 14 SNPs, of which 9 were transitions (A/G, C/T) and 5 were transversions (A/C, T/G) (Table 3). SNP positions were numbered from the first base of the igf2 DNA sequence. Intron 2 had 10 SNPs and intron 3 had 3 SNPs (Table 3). The remaining SNP, located in exon 4, was a nonsynonymous mutation that changed the encoded aa from isoleucine to valine (Figure 1, Table 3). The genotypes, numbers, and genotype frequencies of all SNP loci are listed in Table 3. The sequencing peaks of some SNPs are shown in Figure 5.
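The two substitution classes can be told apart mechanically: a change between two purines (A/G) or between two pyrimidines (C/T) is a transition, and any purine-pyrimidine change (such as A/C or T/G) is a transversion. The Python sketch below applies that rule to SNP names written in the form used in this paper (for example, g2907C>T). The function names are illustrative, and the parser assumes that exact naming pattern.

```python
import re

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(ref, alt):
    """Return 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    pair = {ref, alt}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"

def parse_snp(name):
    """Parse a name such as 'g2907C>T' into (position, ref, alt)."""
    match = re.fullmatch(r"g(\d+)([ACGT])>([ACGT])", name)
    if match is None:
        raise ValueError(f"unrecognized SNP name: {name}")
    return int(match.group(1)), match.group(2), match.group(3)

# SNP names that appear in the association results of this paper.
for snp in ["g2907C>T", "g3230A>C", "g3294C>T", "g5064C>T"]:
    position, ref, alt = parse_snp(snp)
    print(snp, position, classify_substitution(ref, alt))
```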

Association of SNPs with Growth Traits
A general linear model was employed to analyze the relationship between the fourteen SNP loci and six growth traits. Four SNP loci were significantly associated with growth traits (p < 0.05) (Table 4). SNP g2907C>T was significantly associated with HL (p = 0.044) and BD (p = 0.035): HL and BD in the TC genotype were significantly higher than those in the CC genotype. SNP g3230A>C was significantly associated with TL (p = 0.011), and TL of the homozygous genotypes (AA, CC) was significantly higher than that of the heterozygous genotype (AC). SNP g3294C>T genotypes were associated with BW (p = 0.002), BD (p = 0.044), and HL (p = 0.008): BD, BW, and HL of genotype TT were significantly higher than those of genotype CC. SNP g5064C>T genotypes were significantly associated with SL (p = 0.027): SL of genotype CC was significantly higher than that of genotype TT.

Discussion
We cloned the cDNA of igf2 from L. maculatus. As in other vertebrates such as mammals, birds, amphibians, and fish, the protein sequence of Igf2 of L. maculatus includes an IGF domain and an IGF2_C domain (www.ncbi.nlm.nih.gov/gene/3481/ortholog/?scope=7776&term=IGF2, accessed on 18 July 2022). The protein sequence of Igf2 of L. maculatus was similar to that of other vertebrates, especially in the IGF and IGF2_C domains, which were highly conserved (Figure 3). The genomic structure of igf2 of L. maculatus comprised four exons and three introns (Figure 2). This structure is consistent with that of other fish, including largemouth bass (Micropterus salmoides), common carp (Cyprinus carpio), salmon (Oncorhynchus keta), tilapia (Oreochromis niloticus), Asian sea bass (Lates calcarifer), and pike perch (Sander lucioperca) [25][26][27][28][29][30][31]. This result indicates that the genomic structure of igf2 is relatively conserved in fish. In mammals, igf2 has more exons than in fish. For example, igf2 has eight exons in mice (Mus musculus) and ten exons in humans and cows (Bos taurus) [32,33]. Hence, the genomic structure of igf2 has changed considerably during evolution.
Growth traits are important for fish breeding [2]. Screening molecular markers closely related to growth traits for direct breeding can avoid the limitations imposed by environmental effects and inaccurate phenotypic measurement, and can improve the success rate of breeding [34]. igf2 is a candidate gene for the selection of growth traits [2,8]. Some SNPs associated with growth-related traits have been screened and detected in IGF2 of pigs and cattle [33,35,36]. There have also been reports on the association between igf2 and growth traits in fish. For example, one SNP of igf2 was found to be significantly correlated with nine growth traits of the hybrid yellow catfish (Pelteobagrus fulvidraco (♀) × P. vachelli (♂)) [2]. One SNP in intron 3 of igf2 of S. lucioperca is closely related to body weight [31]. One SNP locus in exon 3 of igf2 in tilapia (O. niloticus) is correlated with growth traits [29]. Xie et al. [37] demonstrated that a mutation site in igf2 of a new strain of Yellow River carp reduced the body weight of fish. Li et al. [25] detected four SNPs in igf2 of M. salmoides, and two haplotypes composed of these loci were closely related to body weight (p < 0.05). In the present study, 14 SNPs were isolated from the igf2 genome sequence of L. maculatus, 13 of which were located in introns. Introns tend to carry many SNPs because they do not encode protein, are under little selection pressure, and can accumulate mutations readily [38]. In addition, SNPs in non-coding regions may affect alternative splicing, splicing efficiency, mRNA degradation, and gene expression [39,40]. Similar to other studies [2,29,36], we found that four SNPs of igf2 were significantly associated with growth traits (p < 0.05) (Table 4). These results indicate that: (i) igf2 is a key candidate gene for growth traits in fish; (ii) the SNPs associated with growth-related traits in igf2 can be used as markers in studies on L. maculatus.
Animals 2023, 13, 982

Conclusions
The cDNA and genomic DNA sequences of igf2 were isolated. Fourteen SNPs were identified in the genome sequence of igf2. Four intronic SNPs were significantly associated with growth traits and may be useful for genetic improvement in L. maculatus breeding. These SNPs are candidate markers for marker-assisted selection in L. maculatus.