Identification of Distant Regulatory Elements Using Expression Quantitative Trait Loci Mapping for Heat-Responsive Genes in Oysters

Many marine ectotherms, especially those inhabiting highly variable intertidal zones, develop high phenotypic plasticity in response to rapid climate change by modulating gene expression levels. Herein, we examined the regulatory architecture of heat-responsive gene expression plasticity in oysters using expression quantitative trait loci (eQTL) analysis. Using a backcross family of Crassostrea gigas and its sister species Crassostrea angulata under acute stress, 56 distant regulatory regions accounting for 6–26.6% of the gene expression variation were identified for 19 heat-responsive genes. In total, 831 genes and 164 single nucleotide polymorphisms (SNPs) that could potentially regulate expression of the target genes were screened in the eQTL region. The association between three SNPs and the corresponding target genes was verified in an independent family. Specifically, Marker13973 was identified for heat shock protein (HSP) family A member 9 (HspA9). Ribosomal protein L10a (RPL10A) was detected approximately 2 kb downstream of the distant regulatory SNP. Further, Marker14346-48 and Marker14346-85 were in complete linkage disequilibrium and identified for autophagy-related gene 7 (ATG7). Nuclear respiratory factor 1 (NRF1) was detected approximately 3 kb upstream of the two SNPs. These results suggested regulatory relationships between RPL10A and HSPA9 and between NRF1 and ATG7. Our findings indicate that distant regulatory mutations play an important role in the regulation of gene expression plasticity by altering upstream regulatory factors in response to heat stress. The identified eQTLs provide candidate biomarkers for predicting the persistence of oysters under future climate change scenarios.


Introduction
Climate change has resulted in rapid global ocean warming [1], which threatens the survival and performance of marine ectotherms [2]. Studies suggest that phenotypic plasticity, the ability to exhibit several different phenotypes with the same genotype [3], plays an important role in species' responses to global warming [4][5][6][7]. Oysters have high phenotypic plasticity, which at morphological and molecular levels allow them to adapt to highly variable intertidal zones. This ability makes them ideal representatives of intertidal organisms for the study of adaptive mechanisms in response to heat and other environmental stresses [8][9][10]. Gene expression variation accounts for much of the phenotypic variation among populations and individuals in response to rapid climate change [11][12][13][14].
Energy balance and cellular homeostasis are fundamental for survival and adaption to environmental stress [15]. Transcriptomic and proteomic analyses reveal changes in genes and proteins that are involved in cellular homeostasis, protein stability, metabolic adjustment, signaling transduction, and ion transportation and so on under heat stress in marine organisms. Heat shock proteins (HSP)s and genes related to metabolic adjustment and ion transport show differential expression in fish exposed to elevated temperatures [16,17]. In shrimp, HSPs are strongly upregulated under heat stress [18,19]. Genes related to protein homeostasis, oxidative stress mediation, osmolyte transport, and apoptosis are expressed differentially under different temperatures in shellfish [10,[20][21][22][23][24][25]. Elucidating the genetic basis of heat-responsive gene expression will reveal the underlying mechanisms of phenotypic plasticity and lead to a better understanding of thermal adaption to global warming.
Expression quantitative trait locus (eQTL) mapping, which locates genetic loci related to mRNA transcript abundance in a segregation population [26,27], can link genetic variants to gene expression and phenotypes [11]. eQTLs are genomic regions containing several genetic variants that can influence gene expression [11] and are classified into local and distant eQTLs according to their distance from the target gene [28]. Most local eQTLs regulate gene expression through cis-acting regulation, which directly influences target gene expression and protein characteristics (cis-eQTLs). Distant eQTLs are usually transacting [11], and can change target gene expression by influencing the function, structure, and expression of other factors (trans-eQTLs). However, distant cis and local trans effects have also been reported [28]. Genome-wide eQTL mapping for the detection of genetic loci linked to phenotypes has been used for many species [29] to elucidate the underlying mechanisms of complex traits [30][31][32][33][34][35]. Aquatic ectotherms, the most susceptible groups in response to the deterioration of the environment, are ideal organisms to study to unravel the underlying mechanisms of high plasticity in gene expression. However, the detection of genetic variants of environment-responsive gene expression by eQTL mapping in marine ectotherms has only been studied thus far in three-spined stickleback [36].
Oysters have high economic value and are top ranking in global mariculture but unfortunately suffer summer mortality [37][38][39] mainly due to high temperatures [40]. Therefore, understanding the mechanism of high thermal adaption in oysters will elucidate prediction of marine organisms' adaptive potential to global warming and help tackle oyster summer mortality. Moreover, oysters have high fecundity, which facilitates eQTL analysis. The Pacific oyster, C. gigas, and its sister species, C. angulata, are distributed along the northern and southern coasts of China, respectively [41]. The two species can interbreed, and both exhibit thermal tolerance divergence at physiological and molecular levels [8]. The basic and plastic expression of heat-responsive genes is differentiated between the two species [10].
This study aimed to elucidate the genetic variants regulating heat-responsive gene expression under acute heat stress by eQTL mapping. We identified several eQTLs using a previously reported high-density genetic map [42]. We further validated the association between the single nucleotide polymorphisms (SNPs) located in the eQTLs and the expression of the corresponding heat-responsive genes in an independent family. To our knowledge, this is the first study to identify the potential regulatory locus of environmentally responsive genes by integrating differential gene expression into QTL mapping in mollusks. Our results will help to highlight the adaptive mechanisms of marine organisms in response to environmental stress and provide a potential target region of genetic modification in the selection for thermal resistance in oysters.

Origin and Background of Oysters
Two backcross families of C. gigas and C. angulata were used in this study. ZF2-3 included 106 individuals and was used for eQTL mapping. ZF was used for subsequent association analysis between genotypes and gene expression levels. The two families were produced in a hatchery in Qingdao, China (36 • 12 N, 120 • 41 E) in 2012. Both the male parents of the two backcross families were the progeny of the hybrid family of C. gigas and C. angulata, which were produced and cultured in Nan'ao Island, Shantou, China (23 • 25 N, 117 • 01 E). Both the female parents were 2-year-old wild C. gigas from Jiaonan, Qingdao, China (35 • 44 N, 119 • 56 E). The procedure to construct the families were described in the previous study [42]. Briefly, both the parent oysters were dissected manually, the gonads of male parents were rinsed in seawater for activation of sperm for 30 min, and the gonads of female parents were rinsed in seawater for activation of egg for 1 h. Few sperm add to eggs for fertilization (no more than 10 sperm around one egg). The oyster larval were reared in a 70 L tank for 1 month until they attached to the settlement substrate. And then they were reared in a marine farm in Qingdao.

Heat Stress Simulation
Before the heat stress simulation, the progeny was collected at 14 months-old from the marine farm and transferred to the indoor laboratory, where they were reared in natural seawater and feed with culture algae for two weeks. For the acute heat stress treatment, seawater was heated and maintained at 35 • C using an automatic temperature controller connecting with a heating rod (ZHONGKEHAI, China), Both the progeny of ZF2-3 and ZF families was placed into the seawater at 35 • C for 3 h. Air was pumped into the water throughout the entire experiment.

DNA and RNA Extraction
Gills and mantles were sampled from each heat-treated oyster, immediately frozen in liquid nitrogen, and stored at −80 • C for RNA and DNA extraction, respectively. Approximately 20 mg of gill tissue was used for total RNA extraction, according to RNAprep pure Tissue Kit (TIANGEN, Beijing, China) instructions. Total DNA was extracted using a TIANamp Marine Animals DNA Kit from approximately 30 mg of mantle tissue. The DNA and RNA concentrations were measured using a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). Sample quality was identified using 1.0% agarose gel electrophoresis.

Genotyping-by-Sequencing
For ZF2-3 family, we used the same SNP dataset produced by genotyping-by-sequencing as described in Wang et al. [42]. Briefly, the EcoRI-HinfI combination was selected for digesting the genomic DNA; the digested DNA fragments were ligated to adaptors with barcodes; barcoded DNAs of parents and progenies of ZF2-3 family were mixed together and separated on a 2% agarose gel; DNA fragments of 400-600 bp (including adaptor) were selected and purified; the selected DNA fragments were amplified by PCR; PCR products were purified with Agencourt AMPure XP beads (Beckman Coulter, High Wycombe, UK); purified DNA fragments were sequenced on IlluminaHiseq 2500 platform (Illumina, San Diego, CA, USA). The Stacks software was used to analyze the sequencing data and to genotype DNA variations in the mapping family. The Pacific oyster reference genome oyster_v9 (GenBank accession no. GCA_000297895.1) was applied for SNP calling [25].

Detection of Thermal Responsive Gene Expressions
Twenty-one previously-reported thermal responsive candidate genes [10,24,25], were selected for quantitative real-time PCR. Gene IDs and primers are shown in Table 1. All primer design was conducted with Primer Premier 5. Elongation factor 1 α (EF1α) was used as the reference gene. One microgram of RNA from each sample was used for cDNA synthesis with the Evo M-MLV RT Kit II (AG, China). The synthetic cDNA was diluted 20 times for subsequent RT-PCR. The reaction system was: 2 µL cDNA, 10 µL SYBR Green 2X Supermix (TaKaRa Bio Inc., Shiga, Japan), 6.8 µL DEPC H 2 O, 0.4 µL sense and anti-sense primer, and 0.4 µL ROX Dye II. The reaction program was as follows: 30 s at 95 • C, 5 s at 95 • C for 40 cycles, and 30 s at 60 • C. The melt curve program was as follows: 15 s at 95 • C, 1 min at 60 • C, 30 s at 95 • C, and 15 s at 60 • C. RT-PCR was conducted using the ABI7500 Fast Real-Time Detection System (Applied Biosystems, Foster City, CA, USA). Relative gene expression was determined using the 2 −∆∆Ct method [10,43].

eQTL Analysis
The aforementioned relative gene expression was used as the phenotype for eQTL mapping, and genotype data were acquired from a high-density SNP genetic linkage map constructed by the aforementioned backcross family [42]. The construction process was described in detail by Wang et al. [42]. QTL mapping analysis was conducted to detect eQTLs for corresponding gene expression regulation using MapQTL6 [44], interval mapping (IM), and restricted multiple QTL model mapping (rMQM). An IM model was used for the selection of cofactors. eQTLs were acquired using rMQM with multiple selections of cofactors. A permutation test was conducted 3000 times at p < 0.05 to determine the significant LOD threshold. eQTLs with LOD scores exceeding the chromosome-wide LOD threshold were considered significantly related to gene expression. Candidate genes were screened within a 100 kb region around these significant SNP markers in eQTL regions. Enrichment and classification analysis of Gene Ontology (GO) [45] and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways [46] were performed with R. Fisher's exact test was used for the GO enrichment analysis. GO enrichment results were ranked by p-value and the top 20 were selected to visualize the results.

Verification of eQTLs for an Independent Family
The relationship between each marker genotype and expression of the corresponding gene was investigated using the Mann-Whitney test, and the threshold for a significant difference was set at 0.05.

Genotyping by SNaPshot
SNPs in the eQTL regions that had no mutations at 30 bp upstream and downstream were selected for association analysis. Genotyping was conducted using SNaPshot (3730xL × L DNA Applied Biosystems) in the ZF hybrid groups. Peripheral amplification primers and unidirectional oligonucleotide primers of different lengths were designed for these mutations (Table S1). After the multiplex reaction, the PCR product was separated by electrophoresis with five-color fluorescence detection and analyzed using Gene Mapper 4.1. Multiple SNP sites were detected simultaneously.

Statistical Analysis
Allele frequency and genotype frequency were calculated using SHesis (http:// analysis.bio-x.cn/myAnalysis.php, accessed on 11 August 2020) [47][48][49]. Since our data didn't meet the normal distribution and homogeneity of variance, non-parametric test is adopted in our analysis. Specially, Mann-Whitney test was used for the analysis of variance in gene expression levels among different genotypes with two genotypes. For three genotypes, Kruskal-Wallis test was used. All analyses were performed using GraphPad Prism software. p < 0.05 was considered statistically significant.

Gene Expression Analysis
The gene expression data are shown in Supplementary Table S2. The gene expression exhibited large phenotypic fold changes (8-642) and the distribution was positively skewed (Supplementary Figure S1). Phenotypic variation was rich, and the variable coefficient of each gene expression level was greater than 0.5.

eQTL Mapping and Distant eQTLs for Candidate Gene Expression
A total of 56 significant eQTLs were mapped for 19 of the 21 genes, covering 164 SNPs. The phenotypic variation explanation (PVE) ranged from 6% to 26.6%. eQTL details and mapping results are shown in Table 2 and Supplementary Figure S2, respectively. All the investigated genes and their related eQTLs (distant) were located on different scaffolds. Most SNPs affected the expression of only one target gene, whereas 29 SNPs affected two target genes. Three important SNPs affected the expression of more than two heatresponsive genes. Marker7064 was located in eQTL regions of three target genes: HSPA12A, HSPA13 and HSPA12B-12492. Four genes (HSPA12A, HYOU1, HSPA13 and HSPA12B-12492) colocalized with the same regulatory SNP: Marker25714. Marker13973 also was detected for GAK, HSPA4, HSPA9, and HSPB1.  Table S3. These genes were enriched in 93 KEGG pathways that mainly participated in metabolism, genetic information processing, environmental information processing, cellular processes, and human diseases ( Figure S3). Functional enrichment analysis suggested that candidate genes were mainly involved in protein heterodimerization activity, DNA binding, nucleosome assembly, and centrosome duplication ( Figure S4). Furthermore, 10 transcription factors were detected in the eQTLs of five target genes (HSPA5-08834, HYOU1, HSPA13, ATG7, RNF123).

Verification of Distant eQTLs in Different Families
Eleven critical distant regulatory SNPs located in nine distant regulatory regions for seven genes were genotyped and tested for associations with the expression of the corresponding genes in family ZF; Marker28123 and Marker30775 for HSPA13, Marker12828 and Marker11127 for HSPA5-15492, Marker13973 for HSPA4, HSPA9, and HSPB1, Marker61915 for HSPA9, Marker35461 for BRKS2, Marker20478 for CLCN2, Marker14346-48 and Marker14346-85 for ATG7, and Marker10740 for HSPA5-08834. Only six distantacting loci (Marker 28123, Marker13973, Marker35461, Marker20478, Marker14346-48, and Marker14346-85) for seven genes were found to be polymorphic in the verification family. The genotype and allele frequencies of these six SNPs are shown in Supplementary  Table S4.

Discussion
Herein, we detected 56 distant eQTLs, including 164 SNPs, without a local regulatory locus. The phenotypic variation of single regulatory loci at the transcript level was generally higher (6-26.6%) than growth-related traits (0.6-13.8%) using the same genetic map in Pacific oysters [42], which may due to the complexity of growth-related traits. However, the PVE in our study was generally lower than 8.4-59.7% of the tuber starch content relative genes at the transcript level in potatoes [50]. Also, a major local regulatory locus could even explain 70% of the muscle IGF2 expression variance in pigs [51]. The limitation of the number and density of markers in the linkage map may explain why we did not detect any major local-eQTL in our study.
This study contains all distant changes and no local changes, which corroborate other studies' findings of more distant than local changes [31,33,52,53]. A gene can be regulated by many distant mutation loci, but only a few local loci, which may be the reason why a higher proportion of distant eQTL has been reported [54]. Many studies considered that local eQTLs played a stronger and more significant role in regulatory ability than distant eQTLs because of their local activity [55,56]. However, recent studies have also realized the importance of distant eQTLs [31,57,58], which account for the majority of heritability in human gene expression and participate in gene-environment interactions [11]. Furthermore, one distant eQTL in a hotspot region may influence many genes by working in gene networks.
We also found three distant regulatory SNPs linked with three or four heat-responsive genes, which may be located in potential distant eQTL hotspot regions. Regulators such as transcription factors, nuclear receptors, miRNAs and small nuclear RNAs usually cluster near distant eQTL hotspot regions [31]. We found one regulatory element in our potential distant eQTL hotspot regions, RPL10A.
In this study, candidate genes within distant regulatory regions were enriched in pathways of environmental and genetic information processing. Other studies have illuminated that distant eQTLs can cause the functional divergence of several components that transduce environmental signals into gene expression changes [59]. Our results further complement the viewpoint that distant regulatory loci might influence the expression or function of regulatory elements that respond to environmental changes. In addition to linkage analysis, the associations between one distant regulatory SNP and HSPA9 and between two distant regulatory SNPs and ATG7 were further verified in an independent population, implying that these SNPs are important distant regulatory loci. The distant regulatory SNPs could affect the expression and function of target genes by mediating other regulatory factors. ATG (autophagy-related) proteins are the core of the autophagic machinery. ATG7 encodes an E1-like activating enzyme and facilitates autophagic vesicle expansion by working with microtubule-associated protein light chain 3 (LC3) and ATG12 [60,61]. In the distant eQTL regions of ATG7, one transcription factor, nuclear respiratory factor 1 (NRF1) was detected 3 kb upstream from the two distant regulatory SNPs (Figure 2a). NRF1 encodes a transcription factor that activates the expression of metabolic genes involved in cellular growth and development [61]. In combination with the finding that NRF1 can upregulate the expression level of ATG7 by binding to its promoter in humans [62], we propose that the two distant regulatory SNPs (Marker14346-48 and Marker14346-85) might regulate the expression level of ATG7 by affecting the expression of NRF1. The eQTLs and the validated SNPs provide candidate regions and markers to predict the adaptive potential in response to climate change. HSPAs are ubiquitous molecular chaperones that participate in many biological processes [63][64][65]. In the distant eQTL regions of HSPAs, we found RPL10A, a highly conserved gene with a variable number of spliceosomal introns in most organisms [66], approximately 2 kb downstream from the regulatory SNP of HSPA9 (Figure 2b). We suggest that this distant regulatory locus Marker13973 might influence the expression of HSPA9 by regulating the synthesis of RPL10A. The regulatory relationship between RPL10A and HSPA9 was already elaborated in white shrimp [67].
Our results demonstrate the importance of distant eQTLs in adaptive evolution and elucidate the underlying regulatory mechanism of thermal adaption. The three regulatory SNPs identified might be used as potential regulatory loci of thermal adaption and could be applied to genetic improvements of heat-resistant oysters. However, we did not confirm the regulatory relationship among SNPs, gene expression, and high temperature resistance traits, which could be topics of further study.

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/ 10.3390/genes12071040/s1, Table S1: Peripheral amplification primers and unidirectional oligonucleotide primers of SNPs, Table S2: Gene expression level of 21 heat-responsive genes in 106 individuals, Figure S1: Histogram of heat-responsive gene expression distribution in the mapping family, Figure S2: Plots of eQTL mapping of 21 heat-responsive genes, Table S3: Candidate genes identified from eQTL mapping of C. gigas, Figure S3: Classification analysis of KEGG pathway of 831 key candidate genes, Figure S4: Enrichment analysis of Gene Ontology of 831 key candidate genes, Table S4: Genotype and allele frequencies of these six SNPs, Table S5: The association between SNPs and the gene expression level, Table S6: The association between SNPs and the gene expression level.