Genome-Wide Identification, Characterization, and Expression Profiling of Glutathione S-Transferase (GST) Family in Pumpkin Reveals Likely Role in Cold-Stress Tolerance

Plant growth and development can be adversely affected by cold stress, limiting productivity. The glutathione S-transferase (GST) family comprises important detoxifying enzymes, which play major roles in biotic and abiotic stress responses by reducing the oxidative damage caused by reactive oxygen species. Pumpkins (Cucurbita maxima) are widely grown, economically important, and nutritious; however, their yield can be severely affected by cold stress. The identification of putative candidate genes responsible for cold-stress tolerance, including the GST family genes, is therefore vital. For the first time, we identified 32 C. maxima GST (CmaGST) genes using a combination of bioinformatics approaches and characterized them by expression profiling. These CmaGST genes represent seven of the 14 known classes of plant GSTs, with 18 CmaGSTs categorized into the tau class. The CmaGSTs were distributed across 13 of pumpkin’s 20 chromosomes, with the highest numbers found on chromosomes 4 and 6. The large number of CmaGST genes resulted from gene duplication; 11 and 5 pairs of CmaGST genes were segmental- and tandem-duplicated, respectively. In addition, all CmaGST genes showed organ-specific expression. The expression of the putative GST genes in pumpkin was examined under cold stress in two lines with contrasting cold tolerance: cold-tolerant CP-1 (C. maxima) and cold-susceptible EP-1 (Cucurbita moschata). Seven genes (CmaGSTU3, CmaGSTU7, CmaGSTU8, CmaGSTU9, CmaGSTU11, CmaGSTU12, and CmaGSTU14) were highly expressed in the cold-tolerant line and are putative candidates for use in breeding cold-tolerant crop varieties. These results increase our understanding of the cold-stress-related functions of the GST family, as well as potentially enhancing pumpkin breeding programs.

The Cucurbita genus, which includes the pumpkins, squashes, and gourds, belongs to the Cucurbitaceae family. Two pumpkin species, C.maxima and C. moschata, are the most economically important cultivated Cucurbita crops, used as a staple food in many developing countries, although their mature fruits and seeds are consumed worldwide. The largest pumpkin producers are China, India, Russian Federation, Ukraine, United States, and Mexico [36]. C. maxima and C. moschata are used as rootstocks for other cucurbit crops, such as watermelon (Citrullus lanatus var. lanatus), cucumber (Cucumis sativus), and melon (multiple genera), to improve resistance against soil-borne pathogens and tolerance to abiotic stresses [37]. In our study, we identify and characterize the GST genes in C. maxima using a genome-wide analysis and various bioinformatics tools. We also predict which putative GSTs may be involved in cold-stress tolerance through a comparison of cold-induced gene expression data from cold-tolerant C. maxima and cold-susceptible C. moschata materials.

Identification and Sequence Analysis of the CmaGSTs
The GST family members in pumpkin were identified from the cucurbit genomic database [38] and National Center for Biotechnology Information (NCBI) databases using the keyword "GST". The cucumber (Chinese Long) GST gene sequences were used as queries in a Basic Local Alignment Search Tool (BLAST) search, with a cut-off E-value of <10 −10 . The coding sequence (CDS) and protein sequences of the putative GST family members were extracted from the cucurbit genomic database, and the protein sequences were further examined to confirm the presence of the GST-N (thioredoxin-like) and GST-C (hydrophobic or electrophilic binding) domains using the SMART web tool [39] and the NCBI tools [40]. In addition, the protein structures (protein length, molecular weight, and isoelectric point) were determined from the primary gene sequence using Expasy [41]. The sub-cellular localization of the identified CmaGST proteins was determined using Plant-mPLoc [42]. The GST-N domain sequences were aligned using CLUSTAL Omega [43]. The Multiple Expectation Maximization for Motif Elicitation (MEME) [44] was used to determine conserved motifs in the encoded proteins, using the following parameters: maximum number of motifs: 10; width of optimum motif ≥15 and ≤50. To determine the number of introns and exons, the CDS and genomic sequences of the CmaGST genes were compared using the Gene Structure Display Server (GSDS) tool [45].Putative cis-acting regulatory elements in the CmaGST genes were predicted in regions approximately 1 kb upstream of the translation initiation site (ATG) using Place [46].

Phylogenetic Analysis
The predicted CmaGST protein sequences were collected from the cucurbit genomics database [38]. Arabidopsis thaliana, Oryza sativa (rice), and Cucumis sativus var. sativus (cucumber) GST protein sequences were downloaded from TAIR [47], the MSU Rice Genome Annotation Project [48], and the Cucurbit Genome Database [38], respectively. The CmaGST protein sequences were aligned with those of Arabidopsis, rice, and cucumber using Clustal Omega. A phylogenetic tree was constructed in MEGA 6.0 [49] using the neighbor-joining algorithm with 1000 bootstrap replicates, using complete deletion mode to analyze tree topology and reliability. The names and sequences of all proteins used in the phylogenetic analysis are provided in Table S1.

Chromosome Localization and Gene Duplication Analysis
Genomic positional information for the CmaGST genes was collected from the cucurbit genome database, and their locations were plotted using Mapchart 2.2 [50]. Duplicated CmaGST genes were identified by BLAST-searching [51] them against each other, and were classed as duplicated genes when both their identity and query coverage was >80% of their partner sequence [52]. Tandem-duplicated genes were identified as an array of two or more homologous genes within a distance of 100 kb. A chromosome region containing more than two genes within 200 kb was defined as a gene cluster [53]. We estimated the synonymous rate (K s ), non-synonymous rate (K a ), and evolutionary constraint (K a /K s ) between the duplicated CmaGST gene pairs based on coding sequence alignments, using the method described by Nei and Gojobori [54] for MEGA 6.0 software. The mode of selection was identified using the K a /K s value between duplicated genes, where values >1, <1, and equal to 1 reflected positive selection, purifying selection, and neutral selection, respectively. We calculated divergence time of the duplicated gene pairs using the formula: T million year (Mya) = K s /2r; where T is divergence time, K s is the number of synonymous substitutions per site, and r is the fixed rate of 1.5×10 −8 synonymous substitutions per site per year expected for dicotyledonous plants [55].

Microsynteny of the GST Gene Family
The microsyntenic relationships of the GST genes in C. maxima, C. sativus, and C. lanatus subsp. vulgaris were detected using BLAST searches of these genes against the whole genomes of these crops. The physical location of the CmaGST genes on each chromosome were collected from the respective databases, and the relationships among the three crop species were plotted using Circos-0.52 [56].

Plant Materials, Growth, and Treatments
Two pumpkin species, moderately cold-tolerant C. maxima (inbred line CP-1) and cold-susceptible C. moschata (inbred line EP-1), were grown in a growth chamber at the department of Horticulture, Sunchon National University, Republic of Korea, at 24 • C and with a 14/10 h light/dark photoperiod. The seeds were directly sown into plastic containers containing soil. Cold stress was imposed on four-week-old seedlings, with five biological replications per treatment. The seedlings were transferred to incubators (TOGA clean system; model: TOGA UGSR01, Daejong, Korea) maintained at 5 • C, 10 • C, and 15 • C until cold injury symptoms were clearly observed on the seedlings (Figure 1a-c). Leaf samples of the cold-treated plants were excised after 0 h, 6 h, 24 h, 48 h, and 72 h of treatment. The samples were immediately frozen in liquid nitrogen and stored at −80 • C for RNA extraction. Root, stem, leaf, and flower bud samples were also collected from healthy C. maxima plants, to investigate the organ-specific expression of the CmaGSTs. Cold injury symptoms first appeared after 24 h in the 5 °C treated C. moschata plants, and gradually worsened over time. In contrast, no cold injury symptoms were observed in the C. maxima plants at any time point.

RNA Extraction and cDNA Synthesis
Cold-treated frozen leaf samples were used for total RNA extraction with an RNeasy Mini kit (Qiagen, Hilden, Germany), following the manufacturer's instructions. DNA was removed from the samples using RNase-free DNase (Promega, Madison, WI, USA), as instructed by the manufacturer. The extracted RNA was quantified using UV spectrophotometry at A260 NanoDropND-1000 and NanoDrop v3.7 software (Thermo Fisher Scientific, Waltham, MA, USA). The complementary DNA (cDNA) was synthesized using a First-Strand cDNA Synthesis kit (Thermo Fisher Scientific), following the manufacturer's instructions.

Qualitative and Quantitative PCR Expression Analyses
Reverse transcription polymerase chain reaction (RT-PCR) qualitative expression analyses were performed with a one-step Emerald Amp GT PCR Master Mix (Takara Bio Inc., Kusatsu, Japan). CmaGST gene-specific primers were used for the RT-PCR, with CmaActin expression as an internal control (Table S2). The PCR reactions were performed using 1 µL of 50 ng cDNA from the roots, leaves, stems, and flower buds as templates, as well as a master mix containing 10 pmol each of the forward and reverse primers, 9 µL sterile water, and 8 µL Emerald Mix in a total volume of 20 µL. The PCR conditions were: an initial denaturation at 95 °C for 5 min; followed by 30

RNA Extraction and cDNA Synthesis
Cold-treated frozen leaf samples were used for total RNA extraction with an RNeasy Mini kit (Qiagen, Hilden, Germany), following the manufacturer's instructions. DNA was removed from the samples using RNase-free DNase (Promega, Madison, WI, USA), as instructed by the manufacturer. The extracted RNA was quantified using UV spectrophotometry at A260 NanoDropND-1000 and NanoDrop v3.7 software (Thermo Fisher Scientific, Waltham, MA, USA). The complementary DNA (cDNA) was synthesized using a First-Strand cDNA Synthesis kit (Thermo Fisher Scientific), following the manufacturer's instructions.

Qualitative and Quantitative PCR Expression Analyses
Reverse transcription polymerase chain reaction (RT-PCR) qualitative expression analyses were performed with a one-step Emerald Amp GT PCR Master Mix (Takara Bio Inc., Kusatsu, Japan).
CmaGST gene-specific primers were used for the RT-PCR, with CmaActin expression as an internal control (Table S2). The PCR reactions were performed using 1 µL of 50 ng cDNA from the roots, leaves, stems, and flower buds as templates, as well as a master mix containing 10 pmol each of the forward and reverse primers, 9 µL sterile water, and 8 µL Emerald Mix in a total volume of 20 µL. The PCR conditions were: an initial denaturation at 95 • C for 5 min; followed by 30 cycles of denaturation at 95 • C for 30 s, annealing at 58 • C for 30 s, and extension at 72 • C for 45 s; with a final extension at 72 • C for 5 min. The PCR amplicons were run on a 1.2% agarose gel stained with HIQ blue mango (BioD Co, Gwangmyeong, Korea) and visualized with UV.
Quantitative PCR (qPCR) was performed using a 10-µL reaction volume containing 5 µL 2× Quanti Speed SYBR mix (Thermo Fisher Scientific), 1 µL (10 pmol) each of the forward and reverse gene-specific primers (Table S1), 1 µL template cDNA (50 ng), and 2 µL distilled, deionized water. The qPCR conditions were as follows: initial denaturation at 95 • C for 5 min; followed by 40 cycles of denaturation at 95 • C for 10 s, annealing at 58 • C for 10 s, and extension at 72 • C for 15 s. Light cycler 96 SW 1.1 (Roche, Mannheim, Germany) software was used to detect amplification and process data. Fluorescence was measured at the last step of each cycle, and three replicates were used for each sample. The qPCR reactions were normalized using cucurbit Actin genes as a reference for all comparisons [57], and the cycle threshold (Cq)value was calculated using the 2 −∆∆Ct method to determine the relative expression of the genes. Heat maps were constructed based on transcript abundance value of 32 GST genes of the two contrasting lines C. maxima (inbred line CP-1) and C. moschata (inbred line EP-1) exposed to different temperature at several time courses using Cluster 3.0 and tree view software [58].

Statistical Analysis
One-way analyses of variance (ANOVAs) were performed with Minitab 18 (State College, PA, USA) statistical software to detect significant differences among the relative expression levels of the genes. The mean separation of the relative expression values was determined using Tukey's pair-wise comparison test.

Identification of GST Genes in C. maxima
We identified GST genes from the cucurbit and NCBI databases using the key word "GST". The C. sativus gene sequences were BLAST-searched against the C. maxima genome to obtain the putative CmaGST genes, and a series of systematic analyses were performed to assemble the final set of 32 C. maxima GST gene sequences ( Table 1). The protein and genomic sequences for these candidates were obtained from the cucurbit database [38]. The CmaGST protein sequences were highly similar to those of other plant species, and all CmaGST proteins contained both the GST-N and GST-C domains. Certain proteins contained other specific domains; the EF1G (elongation factor 1γ, pfam00647) was present in CmaEF1G1, CmaEF1G2, and CmaEF1G3, while the UCH (ubiquitin carboxyl terminal hydrolase) and RPT1 (internal repeat 1) domains were found in CmaGSTU16 and CmaGSTU7, respectively. The GST-N domain was comprised of βαβαββα motifs, resulting in three layers of β-sheets sandwiched by three layers of α-helixes ( Figure S1). The N-terminus and C-terminus were connected by a small peptide sequence (8-15 aa) called the linker region in all CmaGSTs. The predicted isoelectric points (5.17-9.38), molecular weights (23.19-91.29 KDa), sub-cellular localizations, and residual sizes (202-810 aa) of the 32 putative CmaGST proteins are presented in Table 1.

Phylogenetic Analysis
A phylogenic tree was constructed using 106 full-length GST protein sequences from Arabidopsis (20 sequences), rice (11 sequences), cucumber (43 sequences), and the 32 CmaGST proteins (Figure 2), to investigate the evolutionary relationships of the CmaGST proteins. We have found 11 classes of GSTs out of recently published 14 classes [11], eight of which (tau, phi, zeta, theta, lambda, EF1G, DHAR, and TCHQD) correspond to the Arabidopsis GST gene classification by Dixon and Edwards [12], while the remaining three classes (GHR, mPGES-2, and GST2N) were identified by Vijayakumar et al. [59]. A total of 18 CmaGSTs were attributed to the tau class, the largest and most abundant group, while three CmaGSTs each were positioned in the phi, EF1G, and zeta classes ( Figure 2). In addition, two CmaGSTs were categorized as theta GSTs, while another two were grouped into the GHR category. The remaining CmaGST was located in the lambda class. No CmaGST proteins were attributed to the TCHQD, GST2N, DHAR, and mPGES2 classes.

Phylogenetic Analysis
A phylogenic tree was constructed using 106 full-length GST protein sequences from Arabidopsis (20 sequences), rice (11 sequences), cucumber (43 sequences), and the 32 CmaGST proteins (Figure 2), to investigate the evolutionary relationships of the CmaGST proteins. We have found 11 classes of GSTs out of recently published 14 classes [11], eight of which (tau, phi, zeta, theta, lambda, EF1G, DHAR, and TCHQD) correspond to the Arabidopsis GST gene classification by Dixon and Edwards [12], while the remaining three classes (GHR, mPGES-2, and GST2N) were identified by Vijayakumar et al. [59]. A total of 18 CmaGSTs were attributed to the tau class, the largest and most abundant group, while three CmaGSTs each were positioned in the phi, EF1G, and zeta classes ( Figure 2). In addition, two CmaGSTs were categorized as theta GSTs, while another two were grouped into the GHR category. The remaining CmaGST was located in the lambda class. No CmaGST proteins were attributed to the TCHQD, GST2N, DHAR, and mPGES2 classes.  Phylogenetic analysis of full-length glutathione S-transferase (GST) proteins from C.maxima, Cucumis sativus, Arabidopsis thaliana, and Oryza sativa. This phylogenetic tree was generated using the neighbor-joining method in MEGA 6.0, with 1000 bootstrap replicates. Black, blue, red, and green text indicates C. maxima, C. sativus, A. thaliana, and O. sativa proteins, respectively.

Chromosomal Locations and Gene Duplications of the CmaGSTs
The distribution of the CmaGST genes across the pumpkin chromosomes was mapped, showing that most were located on relatively few chromosomes, including chromosomes 4, 6, 14, and 16 ( Figure 3). Six CmaGST genes each were found on chromosomes 4 and 6, while chromosomes 3, 7, 11, 12, 13, 18, and 19 each contained only one CmaGST gene. The distribution of the CmaGST genes in the pumpkin genome reflected the diversity and complexity of this gene family. A cluster of tau CmaGST genes was identified on chromosome 4 ( Figure 3). In other species, the genomic locations of the GST gene family are known to be influenced by genetic events including segmental duplication, tandem duplication, and polyploidization [60,61]. We identified 11 pairs of segmentally duplicated and five pairs of tandem-duplicated GST genes in the pumpkin genome ( Figure 3, Table 2). Moreover, the substitution rate of non-synonymous (K a ) and synonymous (K s ) mutations was estimated to assess the selection pressures and divergence time among the duplicated CmaGST gene pairs ( Table 2). K a /K s values <1, 1, and >1 indicate negative or purifying selection, neutral selection, and positive selection, respectively. Among the 16 CmaGST duplicated gene pairs, 15 had K a /K s values below 1, indicating that these duplicated genes evolved under purifying selection pressure in C. maxima (Table 2). Only one pair of duplicated genes had a K a /K s value above 1, suggesting that this gene pair evolved under strong positive selection pressures in C. maxima. We also calculated the divergence time of the CmaGST genes, revealing that the duplication events began 18.87 Mya and continued up until 1.30 Mya ( Table 2). The N-terminal GST domain sequence similarity of the CmaGST proteins was greater within classes than between classes ( Figure S1); 11 pairs of tau CmaGST proteins shared more than 80% similarity, indicating the high rate of gene duplication within this class. In addition, four conserved regions and the G-site were found in all CmaGST proteins ( Figure S1).

Structural and Motif Analyses of the CmaGSTs
We analyzed the exon-intron structures of the CmaGST genes using the GSDS online tool, revealing different numbers of exons and introns in different classes of these genes. The tau-class CmaGSTs typically contained two exons, except that only one exon was present in CmaGSTU1, three exons were present in CmaGSTU2 and CmaGSTU5, four exons were found in CmaGSTU3, and CmaGSTU16 contained 10 exons ( Figure S2). All of the genes in the phi class contained three exons, whereas the EF1G class contained six to nine exons. The zeta class genes contained 8-10 very small exons, while the two GHR-class genes contained three or six exons. The lambda (GSTL1) and theta (GSTT1 and GSTT2) genes possessed nine and seven exons, respectively ( Figure S2).
Conserved motifs were found among the GST genes of C. maxima using the MEME web tool. Motifs 3 and 4 were present in all classes except the EF1G and lambda classes, respectively ( Figure S3). The lambda-class gene contained only one motif (motif 3), while motifs 8 and 10 were unique to the EF1G class. Motifs 1 and 9 were unique to the tau class genes, while motifs 5 and 9 were only present in certain tau class members. Motif 2 was present in the tau and zeta class genes, but motif 6 was absent in the genes of the tau, lambda, and GHR classes. The phi and theta class genes contained motifs 3, 4, and 6, but in the theta class genes, motif 4 was partially or completely absent ( Figure S3).

Microsynteny Relationships and Analysis of CmaGST Protein Interactions
A microsynteny map was constructed using orthologous GST gene pairs among C. maxima, C. sativus, and C. lanatus subsp. vulgaris to explore their evolutionary history and relationships (Figure 4). Nine orthologous gene pairs were identified between C. maxima and C. lanatus subsp. lanatus, whereas 24 orthologous gene pairs were identified between C. maxima and C. sativus (Figure 4). This indicates that CmaGST genes are more closely related to those of C. sativus than C. lanatus. This study also provided further evidence of the 16 pairs of duplicated CmaGST genes. We used STRING 10.5 version software to predict the interactions of the 32 C. maxima GST proteins based on their homology to Arabidopsis proteins, to identify their putative functional and physical roles ( Figure 5). CmaGSTU2, CmaGSTU7, CmaGSTU8, and CmaGSTU9 showed high homology to AtGSTU7, which is involved in cold stress [62]. AtGSTU7 is also closely related to AtGSTU19 and AtGSTU25, which are associated with the response to chilling, salicylic acid, and H2O2 stress. The phi (CmaGSTF1, CmaGSTF2, and CmaGSTF1), EF1G (CmaEF1G1, CmaEF1G2, and CmaEF1G3), zeta (CmaGSTZ1 and CmaGSTZ2), and theta (CmaGSTT1 and CmaGSTT2) proteins are highly homologous to Arabidopsis AtGSTF8, AT1G09640, AtGSTZ1, and AtGSTT1, respectively, which are associated with responses to a range of stresses and signals, including H2O2, salicylic acid, and bacterial pathogens. Arabidopsis proteins AtGSTU19, AtGSTU25, AtGSTL3, At5G12110, At1G07940, AteEF-1Bb1, AT5G19510, AT2G18110, and AtGSTZ2 are also involved in the protein interaction network ( Figure 5). We used STRING 10.5 version software to predict the interactions of the 32 C. maxima GST proteins based on their homology to Arabidopsis proteins, to identify their putative functional and physical roles ( Figure 5). CmaGSTU2, CmaGSTU7, CmaGSTU8, and CmaGSTU9 showed high homology to AtGSTU7, which is involved in cold stress [62]. AtGSTU7 is also closely related to AtGSTU19 and AtGSTU25, which are associated with the response to chilling, salicylic acid, and H 2 O 2 stress. The phi (CmaGSTF1, CmaGSTF2, and CmaGSTF1), EF1G (CmaEF1G1, CmaEF1G2, and CmaEF1G3), zeta (CmaGSTZ1 and CmaGSTZ2), and theta (CmaGSTT1 and CmaGSTT2) proteins are highly homologous to Arabidopsis AtGSTF8, AT1G09640, AtGSTZ1, and AtGSTT1, respectively, which are associated with responses to a range of stresses and signals, including H 2 O 2 , salicylic acid, and bacterial pathogens. Arabidopsis proteins AtGSTU19, AtGSTU25, AtGSTL3, At5G12110, At1G07940, AteEF-1Bb1, AT5G19510, AT2G18110, and AtGSTZ2 are also involved in the protein interaction network ( Figure 5).

Expression Profiles of CmaGST Genes in Various Organs
We performed RT-PCR to analyze the expression patterns of the CmaGST genes in different organs (root, stem, leaf, and flower buds) of healthy C. maxima plants. Most of the CmaGST genes were abundantly expressed in all tested organs (Figure 6), although CmaGSTU17 was weakly expressed in all tested organs. CmaGSTU15, CmaEF1G03, and CmaGHR02 were highly expressed in the leaf relative to the other organs. CmaGSTZ03 was expressed in all organs except the flower buds, while CmaEF1G01 expression was absent from the root and leaf. CmaGSTU6 and CmaGSTU9 were expressed in the root, leaf, and flower buds, but not in the stem. Moreover, CmaGSTU2, CmaGSTU8, and CmaGSTU10 were expressed in all organs but only very weakly in the stem and flower buds ( Figure 6).

Expression Profiles of CmaGST Genes in Various Organs
We performed RT-PCR to analyze the expression patterns of the CmaGST genes in different organs (root, stem, leaf, and flower buds) of healthy C. maxima plants. Most of the CmaGST genes were abundantly expressed in all tested organs (Figure 6), although CmaGSTU17 was weakly expressed in all tested organs. CmaGSTU15, CmaEF1G03, and CmaGHR02 were highly expressed in the leaf relative to the other organs. CmaGSTZ03 was expressed in all organs except the flower buds, while CmaEF1G01 expression was absent from the root and leaf. CmaGSTU6 and CmaGSTU9 were expressed in the root, leaf, and flower buds, but not in the stem. Moreover, CmaGSTU2, CmaGSTU8, and CmaGSTU10 were expressed in all organs but only very weakly in the stem and flower buds ( Figure 6).

Expression Profiling of the CmaGST Genes during Cold Treatment
Moderately cold-tolerant C. maxima and cold-susceptible C. moschata lines were treated with different temperatures (5 °C, 10 °C, and 15 °C) to elucidate the expression patterns of the CmaGST genes in response to cold stress. A qPCR analysis revealed that the majority of CmaGST genes were down-regulated in C. maxima during the 5 °C treatment (Figure 7a, and Figures S4a and S5a); however, CmaGSTU3, CmaGSTU8, CmaGSTU12,and CmaGSTU14 had 8-, 6-, 11-, and 8-fold higher expressions, respectively, after 6 h. CmaGSTU7, CmaGSTU9,and CmaGSTU11 exhibited 4-, 5-, and 2-fold higher expressions, respectively, in C. maxima compared with C. moschata after 24 h at 5 °C. CmaGSTU10, CmaGSTU17, and CmaGSTZ2 were up-regulated in both species after 24 h at 5 °C (Figure 7a and Figure S4a,b); however, the majority of the CmaGST genes were down-regulated ( Figure S5b). In contrast, CmaGSTU1, CmaGSTU4, CmaGSTU5, and CmaGSTL1 showed 16-, 15-, 14-, and 5-fold higher expression levels, respectively, in C. moschata compared with C. maxima after 24 h of treatment at 5 °C; however, their expression levels declined over time (Figure 7a and Figure S4b).
A total of nine CmaGST genes showed significantly higher expression levels in C. maxima compared with C. moschata after 6 h at 10 °C. Among them, CmaGSTU3, CmaGSTU7, CmaGSTU8, CmaGSTU9, CmaGSTU11, CmaGSTU12, and CmaGSTU14 had more than a 10-fold higher expression in the cold-tolerant line than the susceptible line. Four genes, CmaGSTU7, CmaGSTU8, CmaGSTU9, and CmaGSTU11, exhibited 10-, 12-, 14-, and 10-fold higher levels of expression, respectively, in C. maxima than in C. moschata after 24 h at 10 °C (Figure 7b and Figure S4c,d). In the cold-susceptible line, the expression levels of CmaGSTU1, CmaGSTU4, CmaGSTU5, and CmaGSTL1 were 22-, 30-, 28-, and 6-fold higher than the cold-resistant cultivar at 24 h and 48 h ( Figure S4d). The rest of the genes did not have significantly different levels of expression at any time point in C. maxima or C. moschata, except for the EF1G3 and GSTT2 genes at the 6 h time point (Figure S5c,d).
During the 15 °C temperature treatment, seven genes were up-regulated in the cold-tolerant line compared with the susceptible line, with expression changes of 6-fold at 6 h for CmaGST6, 12-fold at 24 h for CmaGST7, 10-fold at 6 h for CmaGST8, 4-fold at 24 h for CmaGST9, 14-fold at 6 h for CmaGST11, 8-fold at 24 h for CmaGST12, and 7-fold at 24 h for CmaGST14 (Figure 7c and Figure S4e). By contrast, seven CmaGST genes (CmaGSTU1, CmaGSTU4, CmaGSTU5, CmaGSTU10, CmaGSTF2, CmaGSTZ2, and CmaGSTL1) in the susceptible line were significantly up-regulated at the 48 h time point and drastically decreased at the 72 h time point in C. moschata (Figure 7c and Figure S4f). After 48 h at 15 °C, the majority of genes were up-regulated in C. moschata compared with C. maxima,

Expression Profiling of the CmaGST Genes during Cold Treatment
Moderately cold-tolerant C. maxima and cold-susceptible C. moschata lines were treated with different temperatures (5 • C, 10 • C, and 15 • C) to elucidate the expression patterns of the CmaGST genes in response to cold stress. A qPCR analysis revealed that the majority of CmaGST genes were down-regulated in C. maxima during the 5 • C treatment (Figure 7a, and Figures S4a and S5a); however, CmaGSTU3, CmaGSTU8, CmaGSTU12,and CmaGSTU14 had 8-, 6-, 11-, and 8-fold higher expressions, respectively, after 6 h. CmaGSTU7, CmaGSTU9,and CmaGSTU11 exhibited 4-, 5-, and 2-fold higher expressions, respectively, in C. maxima compared with C. moschata after 24 h at 5 • C. CmaGSTU10, CmaGSTU17, and CmaGSTZ2 were up-regulated in both species after 24 h at 5 • C (Figure 7a and Figure  S4a,b); however, the majority of the CmaGST genes were down-regulated ( Figure S5b). In contrast, CmaGSTU1, CmaGSTU4, CmaGSTU5, and CmaGSTL1 showed 16-, 15-, 14-, and 5-fold higher expression levels, respectively, in C. moschata compared with C. maxima after 24 h of treatment at 5 • C; however, their expression levels declined over time (Figure 7a and Figure S4b) A total of nine CmaGST genes showed significantly higher expression levels in C. maxima compared with C. moschata after 6 h at 10 • C. Among them, CmaGSTU3, CmaGSTU7, CmaGSTU8, CmaGSTU9, CmaGSTU11, CmaGSTU12, and CmaGSTU14 had more than a 10-fold higher expression in the cold-tolerant line than the susceptible line. Four genes, CmaGSTU7, CmaGSTU8, CmaGSTU9, and CmaGSTU11, exhibited 10-, 12-, 14-, and 10-fold higher levels of expression, respectively, in C. maxima than in C. moschata after 24 h at 10 • C (Figure 7b and Figure S4c,d). In the cold-susceptible line, the expression levels of CmaGSTU1, CmaGSTU4, CmaGSTU5, and CmaGSTL1 were 22-, 30-, 28-, and 6-fold higher than the cold-resistant cultivar at 24 h and 48 h ( Figure S4d). The rest of the genes did not have significantly different levels of expression at any time point in C. maxima or C. moschata, except for the EF1G3 and GSTT2 genes at the 6 h time point (Figure S5c,d).
During the 15 • C temperature treatment, seven genes were up-regulated in the cold-tolerant line compared with the susceptible line, with expression changes of 6-fold at 6 h for CmaGST6, 12-fold at 24 h for CmaGST7, 10-fold at 6 h for CmaGST8, 4-fold at 24 h for CmaGST9, 14-fold at 6 h for CmaGST11, 8-fold at 24 h for CmaGST12, and 7-fold at 24 h for CmaGST14 (Figure 7c and Figure S4e). By contrast, seven CmaGST genes (CmaGSTU1, CmaGSTU4, CmaGSTU5, CmaGSTU10, CmaGSTF2, CmaGSTZ2, and CmaGSTL1) in the susceptible line were significantly up-regulated at the 48 h time point and drastically decreased at the 72 h time point in C. moschata (Figure 7c and Figure S4f). After 48 h at 15 • C, the majority of genes were up-regulated in C. moschata compared with C. maxima, although some genes showed very low levels of expression or did not have a significant response to the 15 • C treatment in C. maxima and C. moschata (Figure 7c and Figure S5e,f).
Genes 2018, 9, x FOR PEER REVIEW 15 of 20 although some genes showed very low levels of expression or did not have a significant response to the 15 °C treatment in C. maxima and C. moschata (Figure 7c and Figure S5e,f).

Discussion
The GST genes play key roles in plant growth and stress responses [59,[63][64][65]. Plants have a high number of GSTs categorized into 14 classes [11], which can have constitutive or tissue-specific expression, or be induced by biotic and abiotic stresses. In our study, 32 full-length GST genes were identified in pumpkin and evaluated through various in silico approaches. The phylogenetic analysis revealed high level of similarities within the same classes of GSTs in four plant species, pumpkin, cucumber, Arabidopsis, and rice (Figure 2), which indicates that these classes evolved before the monocot-dicot divergence. Pumpkin has more GST genes than soybean (Glycine max) [66], but fewer than Arabidopsis [62], rice [67], sweet orange (Citrus sinensis) [68] and maize (Zea mays) [66]. Gene family expansion in plants often arises as a result of tandem duplications, segmental duplications, whole-genome duplications, and inter-specific hybridizations, and can facilitate the evolution of functional diversity [69]. We identified seven pairs of segmental-and four pairs of tandem-duplicated tau-class genes, representing 44% of the tau GSTs ( Figure 3 and Table 2). Only one duplicated gene pair was found in each of the theta and zeta classes, whereas the EF1G class contained three gene pairs that evolved by segmental duplication (100% of EF1G-class genes). We calculated the Ka, Ks, and Ka/Ks values of the duplicated gene pairs, providing an estimate of their selection history. Fifteen of the 16 duplicated CmaGST gene pairs had Ka/Ks values less than 0.8 ( Table 2), suggesting that majority of gene pairs evolved through purifying selection.
In the protein association networks, CmaGSTU2, CmaGSTU7, CmaGSTU8, and CmaGSTU9 showed similarity to GSTU7 in Arabidopsis ( Figure 5), which is involved in the response to abiotic (chilling and hypoxia) and biotic (fungal and bacterial) stresses [62,70]. CmaGSTU14, CmaGSTU15, and CmaGSTU16 were predicted to have similar functions to AtGSTU19, while CmaGSTU12 and CmaGSTU17 were associated with AtGSTU25, and were predicted to have a strong interaction with the homologs of AtGSTU7. From these results, we speculated that nine CmaGSTs (CmaGSTU2, CmaGSTU7, CmaGSTU8, CmaGSTU9, CmaGSTU12, CmaGSTU13, CmaGSTU15, CmaGSTU16, and

Discussion
The GST genes play key roles in plant growth and stress responses [59,[63][64][65]. Plants have a high number of GSTs categorized into 14 classes [11], which can have constitutive or tissue-specific expression, or be induced by biotic and abiotic stresses. In our study, 32 full-length GST genes were identified in pumpkin and evaluated through various in silico approaches. The phylogenetic analysis revealed high level of similarities within the same classes of GSTs in four plant species, pumpkin, cucumber, Arabidopsis, and rice (Figure 2), which indicates that these classes evolved before the monocot-dicot divergence. Pumpkin has more GST genes than soybean (Glycine max) [66], but fewer than Arabidopsis [62], rice [67], sweet orange (Citrus sinensis) [68] and maize (Zea mays) [66]. Gene family expansion in plants often arises as a result of tandem duplications, segmental duplications, whole-genome duplications, and inter-specific hybridizations, and can facilitate the evolution of functional diversity [69]. We identified seven pairs of segmental-and four pairs of tandem-duplicated tau-class genes, representing 44% of the tau GSTs ( Figure 3 and Table 2). Only one duplicated gene pair was found in each of the theta and zeta classes, whereas the EF1G class contained three gene pairs that evolved by segmental duplication (100% of EF1G-class genes). We calculated the K a , K s , and K a /K s values of the duplicated gene pairs, providing an estimate of their selection history. Fifteen of the 16 duplicated CmaGST gene pairs had K a /K s values less than 0.8 ( Table 2), suggesting that majority of gene pairs evolved through purifying selection.
In the protein association networks, CmaGSTU2, CmaGSTU7, CmaGSTU8, and CmaGSTU9 showed similarity to GSTU7 in Arabidopsis ( Figure 5), which is involved in the response to abiotic (chilling and hypoxia) and biotic (fungal and bacterial) stresses [62,70]. CmaGSTU14, CmaGSTU15, and CmaGSTU16 were predicted to have similar functions to AtGSTU19, while CmaGSTU12 and CmaGSTU17 were associated with AtGSTU25, and were predicted to have a strong interaction with the homologs of AtGSTU7. From these results, we speculated that nine CmaGSTs (CmaGSTU2, CmaGSTU7, CmaGSTU8, CmaGSTU9, CmaGSTU12, CmaGSTU13, CmaGSTU15, CmaGSTU16, and CmaGSTU17) might be involved in the response to cold, fungal, and bacterial stresses. By contrast, CmaGSTF1, CmaGSTF2, and CmaGSTF3 were more functionally related to AtGSTF8 (Figure 5), which also has functions in the response to chilling, bacterial, and fungal stresses [3,62]. Theta and zeta classes of GSTs have glutathione peroxidase activity [4] and the ability to reduce cytotoxic hydroperoxides. Glutathione reductase reduces glutathione levels by catalyzing the glutathione disulfide into the sulfhydryl [71], providing a substrate for the GSTs. They play a role in the cellular detoxification process through facilitating a conjugation reaction; and are also associated with other cellular proteins or pathways involved in detoxification.
All CmaGST genes were differentially expressed in different organs ( Figure 6). Only one gene (CmaEF1G01) was expressed exclusively in the stem and flower buds. In contrast, CmaGSTU6 and CmaGSTU9 were not expressed in the stem and CmaGSTZ03 was not expressed in the flower buds, suggesting that they might not be responsible for the development of these respective organs ( Figure 6). The rest of the GST genes were expressed in all tested organs, consistent with the organ-specific microarray expression data of the AtGSTs [12], tomato (Solanum lycopersicum) GSTs [63], and OsGSTs [67].
The CmaGSTs were also differentially expressed in cold-tolerant and cold-susceptible lines following treatments with various cold temperatures. Seven genes (CmaGSTU3, CmaGSTU7, CmaGSTU8, CmaGSTU9, CmaGSTU11, CmaGSTU12, and CmaGSTU14) were up-regulated in the cold-tolerant line compared with the cold-susceptible line after 6 h or 24 h of treatment with cold temperatures (Figure 7a-c) and might therefore be putative candidates for breeding cold-tolerant pumpkins. Similarly, in cold-stress treatments in cabbage (Brassica oleracea), BoGSTU19, BoGSTU24, and BoGSTF10 were identified as putative candidate genes involved in cold-stress tolerance, as they were more highly expressed in the cold-tolerant line (Bo106) than the cold-susceptible line (Bo107) [59]. Sappl et al. [62] reported that AtGSTU7 was significantly up-regulated after 3 h of chilling stress. Dalton et al. [72] reported that the GSTs are involved in physiological flexibility and resistance to various biotic and abiotic stresses in plants. The seven putative candidate CamGST genes involved in the cold response contained highly cold stress responsive cis-acting elements (Table S3). Plant cells sense cold stress via membrane rigidification, as well as through changes metabolite concentration, initiating the primary signals, such as ABA, Ca 2+ , nitric oxide (NO) [73], and ROS, to induce a variety of responses, like stomatal closure. The GST and GPX proteins are known to be highly induced during ROS formation, enabling the detoxification of lipid peroxides and DNA degradation products, and removing ROS. During cold stress, the cellular levels of ROS increase and act as a secondary messengers, inducing programmed cell death [74]; however, ROS also induce the MAPKK (mitogen-activated protein kinase kinase) proteins, which can activate defensive genes to induce conjugation processes, including the processing of GSH by GST enzymes, which becomes localized in the vacuole by transporters (ABC transporter cascade) for degradation.
CmaGSTU2, CmaGSTU6, CmaGSTU10, CmaGSTU17, CmaGSTF2, and CmaGSTZ2 expression was up-regulated by the cold treatments in both the cold-tolerant and -susceptible lines (Figure 7a-c), suggesting that these six genes are not affected by any internal variable factors. Four GST genes (CmaGSTU1, CmaGSTU4, CmaGSTU5, and CmaGSTL1) were highly expressed in the cold-susceptible line than the cold-tolerant one, which might result in the cold susceptibility of this plant (Figure 7a-c); therefore, these genes might be the putative candidates for developing cold-tolerant pumpkin cultivars via gene editing/anti-sense technique or molecular genetics technologies. These genes might play differential roles in the signal transduction pathways and/or cooperate with other genes to form networks that defend plants against adverse environmental conditions.

Conclusions
In conclusion, this is the first report of the genome-wide identification, characterization, and expression profiling of the GSTs in pumpkin. We systematically analyzed the C. maxima genome, identified 32 GSTs, and characterized those using bioinformatics and expression analyses in response to cold (5 • C, 10 • C, and 15 • C) stresses. Seven (CmaGSTU3, CmaGSTU7, CmaGSTU8, CmaGSTU9, CmaGSTU11, CmaGSTU12, and CmaGSTU14) of the 32 GST genes were more highly expressed in the cold-tolerant cultivar (C. maxima) than in the cold-susceptible cultivar (C. moschata) during cold stress and might therefore be useful for developing cold-tolerant cultivars through conventional, molecular, or transgenic breeding approaches. The comprehensive expression analysis in response to cold stress in pumpkin provided novel information regarding the cold-related physiological roles of the CmaGST genes, which could enhance the selection of potential genes utilized for marker-assisted breeding and/or the engineering of transgenic plants with increased cold tolerance in future research.

Supplementary Materials:
The following are available online at www.mdpi.com/xxx/s1, Figure S1: Sequence and structural analysis of pumpkin GST proteins with Arabidopsis, Rice and Populus trichocarpa GSTs; Figure S2: Genomic structures of GST genes of Cucurbita maxima; Figure S3: Schematic representation of motif compositions in the GST sequences; Figure S4: Expression patterns of the CmaGST genes under various temperature treatments; Figure S5: Expression patterns of CmaGST genes under various temperature treatments; Table S1: List of GSTs from different species with their amino acid sequences used for phylogenetic analysis; Table S2: Primer sequence used for real time and reverse transcription polymerase chain reaction (RT-PCR) of GST genes of C. maxima; Table  S3: Putative cis-elements, more than 4 bp, were identified in 32 CmaGST genes in C. maxima using Place Web Signal Scan.