The Promoter SNPs Were Associated with Both the Contents of Poly-Unsaturated Fatty Acids (PUFAs) and the Expressions of PUFA-Related Genes in Common Carp

Simple Summary Fish is an important PUFA source for human diet. Much research has focused on increasing the PUFA products of fish. Improving the biosynthesis of PUFAs would also be a way to contribute to nutrition supplement for human health. Aquaculture selective breeding by genetic markers is essential for increasing the content of PUFAs. Fatty acid desaturase 2 (fads2) and elongase 5 (elovl5) have been proven key rate-limiting enzymes in the synthesis of PUFAs for bony fishes. These genes’ coding SNPs (cSNPs) were reported to be significantly associated with the PUFA contents in common carp. However, the promoter SNPs (pSNPs) effectiveness for the PUFA contents has not been evaluated yet. This study identified the genetic variants in the promoter regions of fads2a/2b and elovl5a/5b. Moreover, the joint effects of multiple cSNPs and pSNPs in fads2b and elovl5b, two major genes related to the PUFA biosynthesis, were evidenced by the higher explained percentages of phenotype. Furthermore, the gene expression level of fads2a and fads2b showed significant positive correlation with the contents of multiple PUFAs. These SNPs would be potential markers for future selection to improve the PUFA contents in common carp. Abstract The allo-tetraploid common carp encodes two duplicated fads2 genes (fads2a and fads2b) and two duplicated elovl5 genes (elovl5a and elovl5b). The coding SNPs (cSNPs) of these genes were reported to be significantly associated with the PUFA contents. Whether the promoter SNPs (pSNPs) were associated with the PUFA contents has not been reported yet. In this study, after sequencing the promoters of these four genes, we identified six pSNPs associated with the contents of PUFAs in common carp, including one elovl5a pSNP, one elovl5b pSNP, and four fads2b pSNPs. The pSNPs were predicted in the locations of transcriptional factor binding sites. Together with previously identified cSNPs in fads2b and elovl5b, the pSNPs and cSNPs of these two genes had the joint effects on the PUFA contents with higher explained percentage of phenotypic variation of the PUFA contents than single gene. The expression levels of both fads2a and fads2b were significantly positively correlated with the contents of six PUFAs. The fads2b pSNPs corresponding to higher fads2b expression levels were associated with higher PUFA contents. The pSNPs and cSNPs will be useful for the future selection breeding of common carp with higher PUFA contents.


Introduction
The poly-unsaturated fatty acids (PUFAs) are involved in numerous critical biological processes [1][2][3]. Fish is an important PUFA source for human diet and is essential for human health [4,5]. The PUFAs in fish were acquired from both dietary and biosynthesis. The the promoters (Supplementary Table S1). The PCR was performed with 2 × EasyTaq SuperMix (Tiangen, Beijing, China). The PCR reactions included an initial denaturation of 94 • C for 4 min, followed by 35 cycles of 94 • C for 30 s, specific temperature for 30 s and 72 • C for 1 min, and with a final extension step at 72 • C for 10 min (Supplementary  Table S1). The PCR products were determined on 1% agarose gel and then purified with EasyPure Quick Gel Extraction Kit (TransGen Biotech, Beijing, China). All the purified PCR products were sequenced with a paired-end Sanger sequencing mode on an ABI 3730 XL machine (TaiheGene, Beijing, China).

Genotyping and Estimating the Genetic Diversities
We aligned the sequences to the genome using NCBI Blastn. The sequence redundancy resulted from whole genome duplication might lead to mis-amplification. Therefore, if one sequence was aligned to its corresponding promoter with the highest score, this sequence was retained in the following analysis. To call the pSNPs, the confirmed sequences were aligned to the corresponding promoters using the novoSNP software [28]. The pSNPs were featured with F-scores ≥ 30 and at least two sequencing peaks of one site per sample. The homozygotes and heterozygotes were auto-detected with this software.
We grouped three strains into one population and calculated the genetic diversities of the pSNPs, which had a frequency over 0.03. For each pSNP, the observed heterozygosity (Ho), and the expected heterozygosity (He) were measured with the Genepop software 4.7 [29]. The polymorphism information content (PIC) was calculated using PICcalc 0.6 [30].
We predicted the potential transcriptional factors (TFs) and their binding sites (TFBSs) in each promoter using TFBIND (https://tfbind.hgc.jp/, accessed on 10 December 2022) with the default parameters [31]. If one pSNP was located in the TFBSs, it was considered as a TFBS-related pSNP.

Association between the pSNPs and the PUFA Contents
To identify the associated pSNPs with the PUFA contents, we used analysis of variance (ANOVA) and general linear model (GLM) analysis with 100,000 permutations in Tassel 5 [32], respectively. All samples were classified into different groups based on their genotypes in one pSNP. With ANOVA, we performed the pairwise comparison between any two groups on each PUFA content.
Each ANOVA p value was corrected using the false discovery rate (FDR) method for multiple hypothesis testing. The GLM model was performed with the genetic distance matrix and 100,000 permutations. One pSNP was significantly associated with the content of one PUFA if it had a p value < 0.05 in the GLM method and an FDR corrected p value < 0.1 in the ANOVA. The explained percentage of phenotypic variation (PV) of each pSNP was measured using Tassel 5.
To study the strain effect on the association study, we performed the GLM analysis incorporated with the population structures from the PCA analysis (GLM-PCA). We also used the analysis of covariance (ANCOVA) with the strain information to estimate the putative strain effect.

Joint Effect of the pSNPs and cSNPs of elovl5b and fads2b on the PUFA Contents
Previously we demonstrated the joint effects of 25 cSNPs from elovl5b and fads2b on the contents of 12 PUFAs. The fads2b and elovl5b were two major effect genes on the contents of multiple PUFAs since they had more associated cSNPs with the contents of at least two PUFAs [24,25]. We further studied the joint effect of cSNPs and pSNPs of these two genes on the PUFA contents. Before the analysis, we studied the linkage disequilibrium (LD) between any two compared SNPs in one gene with PLINK [33]. The LD was represented with the D' value. The SNPs were clustered into one haplotype block if the D' value of any two compared SNPs was over 0.9. If multiple SNPs were attributed into one haplotype block, the joint effect of multiple SNPs might be overestimated. Therefore, we retained one SNP significantly associated with the PUFA content to represent the SNPs in the same block. The joint effect of multiple retained SNPs was estimated with the explained percentage of PV. We generated three SNP sets, including (i) cSNPs and pSNPs in fads2b (F2b.cpSNPs); (ii) cSNPs and pSNPs in elovl5b (E5b-cpSNPs); (iii) cSNPs and pSNPs in fads2b and elovl5b (F2b-E5b-cpSNPs). In each SNP set, we generated different genotype combination. If one genotype combination was observed in at least three individuals, this combination was used in the comparison. For each PUFA, we performed the pairwise comparisons of the PUFA contents among different retained combinations using ANOVA. The explained percentages of PV of the genotype combination to each PUFA content was estimated with the function of 'lm' in R [34].

Correlation between the Expression Level of Each Gene and the PUFA Content
The livers of 44 randomly selected individuals were used to extract the total RNA with TRIzol (Invitrogen, Waltham, MA, USA). The liver is the major organ where the PUFAs are biosynthesized [8]. The concentration of RNA was determined using the NanoDrop 2000 spectrophotometer (Thermo Scientific™, Wilmington, DE, USA), and its quality was determined by 1% agarose gel electrophoresis. The RNA was reverse transcribed to cDNA using the SuperScript III RNase H-Reverse Transcriptase Kit (Invitrogen, Waltham, MA, USA) with oligo-dT. The common carp β-actin (GenBank accession: M24113.1) was used as an endogenous control. We designed the gene-specific primers to quantify the expression level of targeted gene (Supplementary Table S2). The expression level of the targeted gene in the liver was measured with real-time quantitative PCR (RT-qPCR). The qPCR was performed in triplicate on an ABI7500 (Thermo Scientific™, Wilmington, DE, USA) using the SYBR Green Realtime PCR Master Mix (TOYOBO, Osaka, Japan). The reaction was performed in 15 µL of reaction volume containing 7.5 µL mix, 1 µL cDNA, 0.3 µL forward and reverse primers, and 5.9 µL nuclease-free water. The qPCR profile contained an initial activation step at 95 • C for 2 min, followed by 40 cycles: 15 s at 95 • C, 15 s at 60 • C and 30 s at 72 • C. The product sizes were checked by agarose gel electrophoresis. The expression of beta-actin was used to normalize the target gene expression level. For each gene across 44 samples, we calculated the Pearson correlation coefficient and the associated p value between its expression level and each PUFA content. Finally, we used the FDR method to adjust the p values for multiple testing.
For each pSNP, the samples were classified into different groups based on their genotypes. The normalized expression levels of the corresponding gene in these individuals were distributed into different groups.

Results
3.1. The Genetic Diversities of the Promoters of fads2a, fads2b, elovl5a, and elovl5b The promoter sequences of fads2a, fads2b, elovl5a, and elovl5b were successfully sequenced in 228 individuals (Supplementary Table S3). The lengths of amplified sequences ranged from 698 bp to 1077 bp. The A + T content of these four gene promoter sequences was 63.93%, 64.54%, 63.88%, and 57.02%, respectively. The promoters between the two fads2 genes had an identity of 84.19% and the identity between the promoters of elovl5a and elovl5b was 74.49%.
The pSNPs positions were numbered according to their positions relative to the first base of the full-length mRNA sequences of fads2a, fads2b, elovl5a, and elovl5b (GenBank accession numbers: MK852165.1, MK852166.1, MK893918.1, and MK893919.2). With the minor allele frequency (MAF) over 0.03 and the allowed missing genotypes per individual less than 25%, nineteen pSNPs were identified in fads2a, fads2b, elovl5a, and elovl5b genes in 208 individuals, including one in fads2a, eleven in fads2b, five in elovl5a, and two in elovl5b (Table 1). The observed heterozygosities (Ho) of 19 loci ranged from 0.049 to 0.458 where only five pSNPs had the Ho values smaller than 0.1. The expected heterozygosities (Ho) of these loci were from 0.072 to 0.5 and few pSNPs had the He values smaller than 0.1. Similarly, the PIC value of only one pSNP was smaller than 0.1. The number of pSNPs having the diversity rates (Ho, He, and PIC) higher than 0.1 was more than the ones of cSNPs in these four genes [24,25], suggesting higher polymorphic levels in the promoters than the coding regions. Among 19 pSNPs, two had two genotypes per locus while 17 pSNPs had three genotypes per locus. The phenomena were different from the observation in the cSNPs of these four genes, where few cSNPs had three genotypes per locus. This data also supported the higher polymorphic levels in the promoters.

Associations between the pSNPs with Common Carp PUFA Contents
We did not observe the pSNPs in fads2a associated with the PUFA contents. For elovl5a and elovl5b, only one pSNP in each gene was associated with the content of one PUFA (Figure 1a,b). Intriguingly, four fads2b pSNPs had an association with the contents of six PUFAs. Three pSNPs, including F2b.-104, F2b.-304, and F2b.-785, were significantly associated with the contents of more than two PUFAs (GLM analysis p value < 0.05, Table 2). These six pSNPs were in the transcription factor binding sites (TFBSs) with predicated scores over 0.75 (Supplementary Table S4). The transcription factors (TFs) including HNF, NFKB, Sp1, and C/EBPα were proven to regulate the expressions of fads2 and elovl5 [35][36][37]. Combining different strains into one population might introduce the strain effect on the association study. The GLM analysis and ANOVA identified ten significant associations, eight of which were confirmed by the GLM-PCA analysis (p values listed in Supplementary  Table S5). The ANCOVA analysis with the strain information validated all ten associations (Supplementary Table S6). These two results indicated that the mixed population with more samples was suitable for the association study to increase the accuracy.
F2b.-104 showed associations with the contents of C20:3n-6 and C22:5n-3 (p values listed in Figure 1c,d). This locus included three genotypes, two homozygous genotypes (TT and AA) and the heterozygous genotype (AT). Interestingly, in these two associated PUFAs, the individuals with the homozygous genotype (TT, major allele) had higher PUFA contents than the samples of the other groups. The fold changes of mean PUFA contents between the highest content group and the lowest content group were 1.38 (C20:3n-6) and 1.34 (C22:5n-3), respectively. mean PUFA contents between the highest content group and the lowest content group were 1.1 (C18:3n-3) and 1.81 (C20:3n-6), respectively.
F2b.-785 was associated with the contents of C20:3n-3, C22:4n-6, and C22:5n-3 ( Figure  1h-j). This locus also had three genotypes. The homozygous individuals of the major allele and the heterozygous samples had close contents of these three PUFAs, which were higher than the contents in the homozygous individuals of the minor alleles.

Joint Effect of pSNPs and cSNPs from fads2b and elovl5b on the PUFA Contents
Previously we found that fads2b and elovl5b might be two major genes responding to common carp PUFA contents [24,25]. Herein, we estimated the joint effects of the pSNPs and cSNPs in these two genes on the contents of multiple PUFAs. F2b.-304 was associated with the contents of C18:3n-3 and C20:3n-6 ( Figure 1e,f). In this locus, three genotypes including two homozygous genotypes (CC and TT) and the heterozygous genotype (CT) were observed. Likewise, the individuals with the homozygous genotype (CC, major allele) had the highest PUFA contents. The fold changes of mean PUFA contents between the highest content group and the lowest content group were 1.1 (C18:3n-3) and 1.81 (C20:3n-6), respectively.
F2b.-785 was associated with the contents of C20:3n-3, C22:4n-6, and C22:5n-3 (Figure 1h-j). This locus also had three genotypes. The homozygous individuals of the major allele and the heterozygous samples had close contents of these three PUFAs, which were higher than the contents in the homozygous individuals of the minor alleles.

Joint Effect of pSNPs and cSNPs from fads2b and elovl5b on the PUFA Contents
Previously we found that fads2b and elovl5b might be two major genes responding to common carp PUFA contents [24,25]. Herein, we estimated the joint effects of the pSNPs and cSNPs in these two genes on the contents of multiple PUFAs.
We observed three haplotype blocks in fads2b (Supplementary Table S7). Two fads2b pSNPs associated with the PUFA contents, including F2b.-700 and F2b.-785, were attributed into one block and we retained F2b.-785 in the joint effect analysis. However, the other pSNPs and cSNPs were not clustered into the same blocks and thus kept in the following analysis. Likewise, all elovl5b pSNPs and cSNPs cannot be grouped into the same blocks. These SNPs were used in the joint effect analysis.
The cpSNP set of one gene explained more percentages of PV than single SNP in the same gene. The pSNP of E5b.-330 had an explained PV of 4.53% on C20:4n-3. The PV of the cSNP of E5b.172 was 7.12%. The PV of the cpSNPs of E5b increased to 13.71%. Similar observations of increasing PV values were found in both the contents of the other PUFAs and the cpSNPs of F2b.
The results also showed higher explained PV values of F2b.cpSNPs than E5b.cpSNPs on the contents of eleven PUFAs (Tables 3 and 4). This data suggested that fads2b should be the predominant gene affecting the PUFA contents. Further, the cpSNPs of these two genes showed higher explained PV than the cpSNPs of one gene on the contents of most PUFAs.

The Correlation between the PUFA Contents and the Expression Levels of Four Genes
In general, the expression levels of both fads2a and fads2b in the liver were significantly positively correlated with the contents of six same PUFAs (C20:3n-3, C20:4n-3, C22:5n-3, C20:4n-6, C22:4n-6, and C22:5n-6, FDR p values in Table 5), suggesting these two genes might cooperate to regulate the same PUFAs through expression. However, we did not observe the significant correlation between the elovl5a expression and the PUFA contents. Similar scenario was observed between the elovl5b expression and the PUFA contents except C18:4n-3. These data indicated that the expression levels of these two genes might not influence the PUFA contents.

The Association of pSNPs on the fads2b Expression Levels
Because the fads2b expression levels were regulated by multiple transcriptional factors, we studied the correlation between the genotype combinations of four pSNPs (F2b.-104,  F2b.-304, F2b.-700, and F2b.-785) and the fads2b expression levels. We observed 18 combinations where the individual numbers were over two (Supplementary Table S8). In the contrary, in the above expression study only four combinations (H1, H2, H3, and H4) corresponded to at least three individuals. For the first combination (H1) of AT/CC/GG/AT, the contents of seven PUFAs were higher than the ones in the samples of the other three combinations (Supplementary Table S9). Intriguingly, the fads2b expression levels of the H1 samples were also higher than the ones in the groups of H2, H3, and H4 ( Figure 2). These data indicated that the genotypes of fads2b pSNPs were associated with both the PUFA contents and the fads2b expressions. F2b.-304, F2b.-700, and F2b.-785) and the fads2b expression levels. We observed 18 combinations where the individual numbers were over two (Supplementary Table S8). In the contrary, in the above expression study only four combinations (H1, H2, H3, and H4) corresponded to at least three individuals. For the first combination (H1) of AT/CC/GG/AT, the contents of seven PUFAs were higher than the ones in the samples of the other three combinations (Supplementary Table S9). Intriguingly, the fads2b expression levels of the H1 samples were also higher than the ones in the groups of H2, H3, and H4 ( Figure 2). These data indicated that the genotypes of fads2b pSNPs were associated with both the PUFA contents and the fads2b expressions.

Discussion
Fads2a, fads2b, elovl5a, and elovl5b were reported to participate in the synthesis of PUFAs in common carp [38,39]. In this study, we scanned the pSNPs in these four genes

Discussion
Fads2a, fads2b, elovl5a, and elovl5b were reported to participate in the synthesis of PUFAs in common carp [38,39]. In this study, we scanned the pSNPs in these four genes in common carp and found higher genetic diversities of the promoter regions than the coding regions in these four genes.
Improving the desaturase activity of fads2b would be the top priority for selective breeding of common carp with higher PUFA contents. First, more pSNPs and cSNPs of fads2b were identified than in elovl5b. Second, the pSNPs of fads2b were associated with the contents of more PUFAs than elovl5b. Third, the explained PVs of fads2b cpSNPs was higher than the elovl5b cpSNPs.
We observed the joint effect of pSNPs and cSNPs in this study. Although the joint effect of cSNPs from two genes, fads2b and elovl5b, on the PUFA content were uncovered [25], none has studied the joint effects of pSNPs on the PUFA contents. We selected one SNP in one haplotype block to avoid overestimating the joint effect. The explained PV of each PUFA content was increased by combining the pSNPs and cSNPs, compared with one SNP. The coordination to improve the PUFA contents requires the simultaneous mutations in the promoters and coding regions in these two genes.
The associations of the pSNPs with the PUFA contents could be explained by their locations in the TFBSs. The TF of Sp1 could enhance the PUFA biosynthesis by increasing fads2 and elovl5 gene expression in rabbitfish (Siganus canaliculatus) [40]. Lack of the Sp1 binding site might decrease the promoter activity of fads2 of carnivorous marine fish. The TF of CEBP (CCAAT/enhancer-binding protein) could enhance the fads2 expression [41]. It could up-regulate the fads2 promoter activity involving in the process of PUFAs biosynthesis in the large yellow croaker (Larimichthys crocea), and rainbow trout (Oncorhynchus mykiss) [42]. These pSNPs might alter the binding efficiency of TFs and then affect the target gene expression. In addition, the content of C20:3n-6 was associated with two pSNPs (F2b.-104 and F2b.-304). Because of the location of these two pSNPs in the TFBSs of Sp1 and HNF1, the co-association suggested the possible co-regulation of Sp1 and HNF1 in the biosynthesis of C20:3n-6. Similarly, two pSNPs (F2b.-104 and F2b.-785) were associated with the content of C22:5n-3, indicating the co-regulation of Sp1 and CEBP on this PUFA. More functional studies are required to further validate the effects of pSNPs.
The expressions of fads2a and fads2b were significantly positively correlated with the contents of multiple PUFAs (FDR p values listed in Table 5). We adopted two strategies to increase the reliability of correlation analysis, including increasing sample size and correcting the p value of each coefficient. Ching et al. suggested a sample size of 25 to achieve a power of 0.8 or greater [43]. Bottomly et al. collected 21 mice and compared gene expression between C57BL/6J and DBA/2J mouse strains [44]. In our study, we collected 44 samples to study the correlation between gene expression and the PUFA contents. It is reasonable that the higher expression level of targeted gene could generate more enzymes participating in the PUFA biosynthesis. Recent research showed the fads2 expression influenced the PUFAs in breast milk [45], gilthead seabream [46], and pig [18]. The influences of pSNPs on the TFBSs might result in the change of both target expression and the PUFA contents. The H1 genotype combination of fads2b corresponded to higher fads2b expression, possibly generating more fads2b enzyme. The higher PUFA contents in the H1 samples might be due to more fads2b enzyme.
Herein, we investigated the polymorphisms in the promoters of fads2a, fads2b, elovl5a, and elovl5b. Together with our previous study, our data of explained PVs suggested that fads2b might be the major effect gene on the contents of multiple PUFAs and should be target for selective breeding to improve the PUFA contents. The increasing explained PVs with the pSNPs and cSNPs demonstrated that both changing the expressions of these genes in liver and altering their enzyme activities would influence the PUFA contents. We focused on the polymorphisms of these four genes. In the future, the genome wide association study would be used to identify more SNPs and indels related to the PUFA contents at the entire genome.

Conclusions
We identified the polymorphisms in the promoters of common carp fads2a, fads2b, elovl5a, and elovl5b. The association study identified six pSNPs in the latter three genes significantly related to the PUFA contents. The joint effects of pSNPs and cSNPs in either fads2b or elovl5b improved the explained percentages of PVs of the PUFA contents. Further, the joint effects of both genes had higher explained PVs than single gene. The pSNPs were in the predicted TFBSs. The expression levels of fads2a and fads2b were positively correlated with the contents of multiple PUFAs. The samples with one genotype combination corresponding to higher fads2b expression had higher PUFA contents. These pSNPs and cSNPs would be used to selectively breed common carp of higher PUFA contents.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10 .3390/biology12040524/s1, Table S1: The primers used for amplification of the promoter regions of four genes. Table S2: The primers used for qRT-PCR of four genes. Table S3: The pSNP genotypes of four genes in all individuals. Table S4: Predicted TFBSs around six pSNPs associated with the PUFA contents. Table S5: Association analysis of PUFAs contents with GLM-PCA. Table S6: ANOVA without the strain information and ANCOVA with the strain information. Table S7: Three haplotype blocks in fads2b.