GWAS Reveals a Novel Candidate Gene CmoAP2/ERF in Pumpkin (Cucurbita moschata) Involved in Resistance to Powdery Mildew

Pumpkin (Cucurbita moschata Duchesne ex Poir.) is a multipurpose cash crop rich in antioxidants, minerals, and vitamins; the seeds are also a good source of quality oils. However, pumpkin is susceptible to the fungus Podosphaera xanthii, an obligate biotrophic pathogen, which usually causes powdery mildew (PM) on both sides of the leaves and reduces photosynthesis. The fruits of infected plants are often smaller than usual and unpalatable. This study identified a novel gene that involves PM resistance in pumpkins through a genome-wide association study (GWAS). The allelic variation identified in the CmoCh3G009850 gene encoding for AP2-like ethylene-responsive transcription factor (CmoAP2/ERF) was proven to be involved in PM resistance. Validation of the GWAS data revealed six single nucleotide polymorphism (SNP) variations in the CmoAP2/ERF coding sequence between the resistant (IT 274039 [PMR]) and the susceptible (IT 278592 [PMS]). A polymorphic marker (dCAPS) was developed based on the allelic diversity to differentiate these two haplotypes. Genetic analysis in the segregating population derived from PMS and PMR parents provided evidence for an incomplete dominant gene-mediated PM resistance. Further, the qRT-PCR assay validated the elevated expression of CmoAP2/ERF during PM infection in the PMR compared with PMS. These results highlighted the pivotal role of CmoAP2/ERF in conferring resistance to PM and identifies it as a valuable molecular entity for breeding resistant pumpkin cultivars.


Introduction
The Cucurbitaceae family comprises~120 genera and more than 800 species distributed in the tropical and subtropical regions of the world and includes several economically important fruit and vegetable crops, such as cucumber, melon, pumpkin, squash, and watermelon [1]. The genus Cucurbita contains several wild-type species and five major species domesticated globally as cash crops, i.e., Cucurbita argyrosperma K. Koch, Cucurbita ficifolia Bouché, Cucurbita maxima Duchesne, Cucurbita moschata Duchesne ex Poir., and Cucurbita pepo L. [2]. C. moschata Duchesne ex Poir. (pumpkin; 2n = 40) is closely related to C. argyrosperma K. Koch and is the second most diverse species in the genus after C. pepo L. The pumpkin fruit and seeds exhibit extensive morphological diversity [3]. Pumpkin is an herbaceous climbing or creeping vine that grows well in warm tropical areas and is cultivated and consumed across the globe in the form of mature and immature young stems, tendrils, flowers, fruits, and seeds [3,4]. Pumpkins provide a good source of nutrients,

Screening for PM-Resistant Pumpkin Germplasm
From the global diversity panel, 407 pumpkin accessions were collected and evaluated for their response to simulated PM infection at the fully expanded first true leaf stage (~14 d after germination) consecutively for two years (2018-2019) at the experimental farm of Sejong University. The resistance index was scored as susceptible (1)(2)(3)(4), moderately resistant (5 and 6), and resistant (7-9) ( Figure 1A). Based on the resistance index in consecutive screenings, the pumpkin accessions IT 274039 and IT 278592 were consistently identified as powdery mildew resistant (PMR) and powdery mildew susceptible (PMS), respectively ( Figure 1B). PMS and PMR accessions were subsequently used for generating the F 2 population to identify the potential gene responsible for conferring resistance to PM.
CmoAP2/ERF in PMR suggested its potential role in conferring tolerance to PM. Thus, the study revealed the potency of allelic variations in CmoAP2/ERF for the early and efficient detection of PM resistance in pumpkin cultivars.

Screening for PM-Resistant Pumpkin Germplasm
From the global diversity panel, 407 pumpkin accessions were collected and evaluated for their response to simulated PM infection at the fully expanded first true leaf stage (~14 d after germination) consecutively for two years (2018-2019) at the experimental farm of Sejong University. The resistance index was scored as susceptible (1)(2)(3)(4), moderately resistant (5 and 6), and resistant (7-9) ( Figure 1A). Based on the resistance index in consecutive screenings, the pumpkin accessions IT 274039 and IT 278592 were consistently identified as powdery mildew resistant (PMR) and powdery mildew susceptible (PMS), respectively ( Figure 1B). PMS and PMR accessions were subsequently used for generating the F2 population to identify the potential gene responsible for conferring resistance to PM.

Multivariate GWAS Analysis for PM Resistance in Pumpkin
Phenotypic data on resistance to PM were collected from 407 pumpkin accessions over a period of 2 years (2018-2019). Further, GBS analysis led to the identification of 2071 high-quality SNPs distributed across 20 chromosomes of pumpkins [3]. Using this data, we used the Genome Association and Prediction Integrated Tool (GAPIT) program for GWAS analysis. The GAPIT is a single R-language based platform that combines the most powerful and computationally efficient GWAS models [35]. The GAPIT package comprises several multi-locus, e.g., Bayesian-information and Linkage-disequilibrium Iteratively Nested keyway (BLINK) [36], Fixed and Random Model Circulating Probability Unification (FarmCPU) [37], and Multi Locus Mixed Model (MLMM) [38] and single locus, e.g., General Linear Model (GLM) [39], Mixed Linear Model (MLM) [40], and Settlement of MLM Under a Progressively Exclusive Relationship (SUPER) [41] GWAS models. In our study, a threshold of Bonferroni-corrected significance at a -log10(p) value of 4.61 was determined ( Figure 2 and Table 1). Initially, the MLM model data predicted only 2 SNPs (C3. 7419839 and C3. 7419891) ( Table 1 and Supplementary Figure S1). In addition to these two SNPs, the SUPER model predicted 3 more SNPs (C3. 7078924, C15. 7755972, and C15. 7756005) ( Table 1 and Supplementary Figure S1). In addition to the 5 SNPs predicted by the SUPER model, another SNP (C3. 8060138) was predicted by the GLM model (Table 1 and Supplementary Figure S1). The models, such as BLINK, FarmCPU, and MLMM, were employed to further reduce misleading positives ( Figure 2 and Table 1). All three multi locus models predicted only one SNP, C3. 7419839 that was positioned at chromosome 3 ( Figure 2). Hence, it is thought to be the most reliable SNP. Furthermore, the GLM, SUPER, and MLM models also commonly predicted one more SNP at C3. 7419891, which is 52 nucleotides distant from C3. 7419839 ( Figure 2 and Supplementary Figure S1). Together, we selected C3. 7419839 and C3. 7419891 as the significant SNPs to investigate further.

Candidate Gene Identification
To validate the novel SNPs identified in the GWAS analysis, we compared their physical location with the previously reported major pumpkin QTL associated with the PM resistance in chromosome 3 [42]. The SNPs, including C3.7419839 and C3.7419891 as the primary candidates predicted from the multivariate GWAS analysis, were positioned approximately 1.42 Kb away from the previously reported major QTL region (~400 kb region between C3. 7562022~C3. 7981972 nucleotides) in chromosome 3 [42]. Further, sequence analysis revealed that the C3. 7419839 and C3. 7419891 SNPs were indeed present in the 8th exon of CmoCh3G009850 gene encoding for APETALA2 (AP2)-like ethylene-responsive transcription factor (CmoAP2/ERF) ( Figure 3A). The CmoAP2/ERF gene contains a polypeptide of 491 amino acid (AA) residues, with an estimated molecular weight of 54.3 KDa and an isoelectric point of 5.81, as predicted by a web-based protein analysis tool (www. bioinformatics.org/sms/prot_mw.html (accessed on 8 February 2022)). The computational analysis predicted the presence of two conserved domains in the CmoAP2/ERF protein, such as an AP2 superfamily domain spanning from AA151 to AA212 residues and an AP2 domain spanning from AA243 to AA306 (www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi accessed on 8 February 2022) (Supplementary Figure S2).

Validation of SNPs and Allelic Diversity Analysis
To substantiate the SNPs indicated by the GWAS analysis, we employed direct gene sequencing for the CmoAP2/ERF gene using the gDNA extracted from the PMS and PMR parents. The sequencing data asserted the presence of 4 more point mutations and the 2 SNPs predicted in the GWAS analysis in the PMR line AP2-ERF gene coding sequences ( Figure 3B). Among the 6 SNPs, C3. 7418367 (SNP-1) is located in exon-1, C3. 7419693 (SNP-2) is found in exon-7, C3. 7419839 (SNP-3), C3. 7419850 (SNP-4), and C3. 7419891 (SNP-5) are located in exon-8, and C3.7420273 (SNP-6) is found in exon-10 ( Figure 3A). Among the two GWAS-predicted SNPs, one of the SNP at C3. 7419839 (SNP-3) is found to be a synonymous mutation, and it does not cause any amino acid change, despite the nucleotide change ( Figure 3A,B). However, the other SNP at C3. 7419891 (SNP-5) is a nonsynonymous mutation, and it resulted in an amino acid change at AA 321 ( Figure 3A,B). Likewise, all the remaining four SNPs (SNP 1, 2, 4, and 6) were nonsynonymous mutations and resulted in substituting different amino acid residues at AA105, AA302, AA321, AA402, respectively ( Figure 3A,B). In line with these findings, many earlier reports substantiated the roles of the AP2/ERF transcription factor family genes for mediating the pathogen responses in diverse plant species [43][44][45].

Genotyping Marker Development and Segregation Analysis for PM Resistance
Using the CAPS and dCAPS approach, we designed a genotyping marker using the GWAS-predicted SNPs at positions C3. 7419839 and C3. 7419891. However, we could not find suitable polymorphic markers at this position to distinguish the PMS and PMR parents (data not shown). As the gene sequencing result revealed a few more SNP variations between the PMS and PMR, we used one of the SNPs (C3.7420273) at exon 10 for marker development. For validating this dCAPS marker, we initially used both parental lines (PMS and PMR) and the pooled parental line (PMS + PMR) gDNA as a heterogenous control. We confirmed that the designed marker conspicuously differentiates the controls ( Figure 4A).
in the 8th exon of CmoCh3G009850 gene encoding for APETALA2 (AP2)-like ethylene-responsive transcription factor (CmoAP2/ERF) ( Figure 3A). The CmoAP2/ERF gene contains a polypeptide of 491 amino acid (AA) residues, with an estimated molecular weight of 54.3 KDa and an isoelectric point of 5.81, as predicted by a web-based protein analysis tool (www.bioinformatics.org/sms/prot_mw.html (accessed on 8 February 2022)). The computational analysis predicted the presence of two conserved domains in the CmoAP2/ERF protein, such as an AP2 superfamily domain spanning from AA151 to AA212 residues and an AP2 domain spanning from AA243 to AA306 (www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi accessed on 8 February 2022) (Supplementary Figure S2).

Validation of SNPs and Allelic Diversity Analysis
To substantiate the SNPs indicated by the GWAS analysis, we employed direct gene sequencing for the CmoAP2/ERF gene using the gDNA extracted from the PMS and PMR parents. The sequencing data asserted the presence of 4 more point mutations and the 2 SNPs predicted in the GWAS analysis in the PMR line AP2-ERF gene coding sequences ( Figure 3B). Among the 6 SNPs, C3. 7418367 (SNP-1) is located in exon-1, C3. 7419693 (SNP-2) is found in exon-7, C3. 7419839 (SNP-3), C3. 7419850 (SNP-4), and C3. 7419891 (SNP-5) are located in exon-8, and C3.7420273 (SNP-6) is found in exon-10 ( Figure 3A). Among the two GWAS-predicted SNPs, one of the SNP at C3. 7419839 (SNP-3) is found to be a synonymous mutation, and it does not cause any amino acid change, despite the nucleotide change ( Figure 3A,B). However, the other SNP at C3. 7419891 (SNP-5) is a nonsynonymous mutation, and it resulted in an amino acid change at AA 321 ( Figure 3A,B). Likewise, all the remaining four SNPs (SNP 1, 2, 4, and 6) were nonsynonymous mutations and resulted in substituting different amino acid residues at AA105, AA302, AA321, AA402, respectively ( Figure 3A,B). In line with these findings, many earlier reports Previously, by visual assessment, we sorted some clear PM sensitive and resistant lines from the PMS/PMR-F 2 progeny. The phenotyping data based on our analysis obtained 86 susceptible, 120 moderately resistant, and 39 resistant lines, giving a ratio of 1:1.39:0.45 (Supplementary Table S1). Later, we quantified the diseased leaf area (DLA) using Image J software, and genotyped them using the dCAPS marker, developed based on the SNP variation in C3. 7420273. The phenotyping and genotyping results clearly agreed with each other (Supplementary Table S2). Further, we were interested to learn whether the sensitive and resistant lines (identified by visual observation/ImageJ/genotyping) comprise the GWAS-predicted SNP variations. For this purpose, we selected three susceptible and three resistant lines (confirmed by three approaches) and sequenced them, along with the PMS and PMR parental controls ( Figure 4B,C). Intriguingly, the susceptible and resistant lines contain the GWAS-predicted SNP variation, and the genotyping marker can be used for F 2 segregation analysis. Further, we used the same marker to genotype 238 PMS/PMR-F 2 population. We assumed a segregation ratio of 1:2:1, as per single-gene Mendelian inheritance, in the F 2 genotyping. On the contrary, our genotyping analysis revealed 80 sensitive, 109 heterozygotes, and 49 resistant lines, giving a ratio of 1:1.36:0.61 (Table 2). Based on the phenotypic and genotypic data, the PM-resistant gene shows an incomplete dominant inheritance, so the segregation ratio is distorted from the Mendelian single-gene inheritance. We speculate that the segregation deviation might be due to the lower fitness of the resistant genes, introgressed into the C. moschata genome through an interspecific hybridization (wide crossing) event from the wild inedible gourd C. martinezii. find suitable polymorphic markers at this position to distinguish the PMS and PMR parents (data not shown). As the gene sequencing result revealed a few more SNP variations between the PMS and PMR, we used one of the SNPs (C3.7420273) at exon 10 for marker development. For validating this dCAPS marker, we initially used both parental lines (PMS and PMR) and the pooled parental line (PMS + PMR) gDNA as a heterogenous control. We confirmed that the designed marker conspicuously differentiates the controls ( Figure 4A).  Table S1). Later, we quantified the diseased leaf area (DLA)

Scoring and Sorting the PMS/PMR-F 2 Lines Based on the Genotyping Result
The quantification of disease symptoms is critical for evaluating disease severity, and it is a prerequisite for making the best disease management decisions [46]. Based on the visual assessment, we initially selected PM sensitive, moderate resistant, and resistant lines on a scale of 1-9. Further, we imaged each line for quantitative data, measured the DLA using the ImageJ program, and matched their intensity scale ( Figure 5A). After genotyping all the PMS/PMR-F 2 population with the CmoAP2/ERF dCAPS marker, we verified what percentage of Image J quantified DLA value can be used as a threshold for consideration as susceptible, moderately resistant, or resistant. In our analysis, the homozygous susceptible lines held with a DLA percentage range of 48.3-81.6%, whereas the moderate resistant or heterozygote lines ranged between 38.9-55%. On the other hand, the homozygous resistant lines had DLA values between 8.02-30.75%, respectively ( Figure 5B). These results imply that the DLA values correlate with the CmoAP2/ERF gene zygosity, and at least the CmoAP2/ERF homozygous resistant lines can effectively resist PM infestation.

Expression Changes of CmoAP2/ERF Gene in PMR Line
To investigate whether the endogenous expression of CmoAP2/ERF expression correlates with different post-infection intervals, we performed qRT-PCR and assessed the CmoAP2/ERF transcript levels in the PMS and PMR lines. In the real-time experiment, we found that endogenous CmoAP2/ERF expression is already (before the pathogen treatment) significantly higher in the PMR line compared to the PMS line ( Figure 6). Moreover, the CmoAP2/ERF transcript augmentation peaked after 6 h post-infection and started to decline from 24 h to 72 h post-infection in the PMR line. Nevertheless, AP2/ERF expression remained higher in the PMR line than in the PMS line at all the examined time points. (Figure 6). Therefore, the higher endogenous expression of CmoAP2/ERF in the PMR lines might plausibly be involved in quick defense response and enhanced the expression of downstream powdery mildew-resistant genes.

Expression Changes of CmoAP2/ERF Gene in PMR Line
To investigate whether the endogenous expression of CmoAP2/ERF expression correlates with different post-infection intervals, we performed qRT-PCR and assessed the CmoAP2/ERF transcript levels in the PMS and PMR lines. In the real-time experiment, we found that endogenous CmoAP2/ERF expression is already (before the pathogen treatment) significantly higher in the PMR line compared to the PMS line ( Figure 6). Moreover, the CmoAP2/ERF transcript augmentation peaked after 6 h post-infection and started to decline from 24 h to 72 h post-infection in the PMR line. Nevertheless, AP2/ERF expression remained higher in the PMR line than in the PMS line at all the examined time points. (Figure 6). Therefore, the higher endogenous expression of CmoAP2/ERF in the PMR lines might plausibly be involved in quick defense response and enhanced the expression of downstream powdery mildew-resistant genes.

Discussion
Being sessile-natured, the plants have innately evolved with intricate signaling pathways against biotic stress. The transcription factors (TFs) are the key players in the plant defense circuit to fight against various pathogens [47]. Many TF families have thus far been identified for their response to bacterial, fungal, and viral infections [47,48]. Our study uncovered a novel allelic variation in the CmoAP2/ERF transcription factor and demonstrated that augmented endogenous expression of the CmoAP2/ERF gene might improve the resistance in PMR lines.
In pursuing resistance breeding, identifying PM-resistant germplasm resources is critical for the breeding of resistant varieties. Specifically, many breeders have resorted to the wild crop progenitors. These bestow rich genetic diversity and great potential to fortify the modern cultivars with either biotic or abiotic stress tolerance [49,50]. Previously, for breeding of PM resistant varieties, many wild cultivars were identified in various crops, including barley [51], wine grape [52,53], strawberry [54], and wheat [44]. Likewise, in the case of cucurbits, the wild progenitors, including C. okeechobeensis subsp. martinezii and C. lundelliana were found with PM-resistant traits and extensively used for cultivar development [42]. Through interspecific hybridization, a major powdery mildew resistance region from the wild species C. okeechobeensis subsp. martinezii was introgressed into C. pepo and C. moscahata, [55][56][57]. In line with these findings, True French, a

Discussion
Being sessile-natured, the plants have innately evolved with intricate signaling pathways against biotic stress. The transcription factors (TFs) are the key players in the plant defense circuit to fight against various pathogens [47]. Many TF families have thus far been identified for their response to bacterial, fungal, and viral infections [47,48]. Our study uncovered a novel allelic variation in the CmoAP2/ERF transcription factor and demonstrated that augmented endogenous expression of the CmoAP2/ERF gene might improve the resistance in PMR lines.
In pursuing resistance breeding, identifying PM-resistant germplasm resources is critical for the breeding of resistant varieties. Specifically, many breeders have resorted to the wild crop progenitors. These bestow rich genetic diversity and great potential to fortify the modern cultivars with either biotic or abiotic stress tolerance [49,50]. Previously, for breeding of PM resistant varieties, many wild cultivars were identified in various crops, including barley [51], wine grape [52,53], strawberry [54], and wheat [44]. Likewise, in the case of cucurbits, the wild progenitors, including C. okeechobeensis subsp. martinezii and C. lundelliana were found with PM-resistant traits and extensively used for cultivar development [42]. Through interspecific hybridization, a major powdery mildew resistance region from the wild species C. okeechobeensis subsp. martinezii was introgressed into C. pepo and C. moscahata, [55][56][57]. In line with these findings, True French, a commercial C. pepo cultivar, was crossed with a PM resistant wild cultivar, C. okeechobeensis subsp. Martinezii, and the resistance gene was described as Pm-0 [56,57].
In our study, the phenotyping results of the PMS/PMR-F 2 revealed a ratio of susceptible: partially resistant: resistant as 1:1.39:0.45, which suggests incomplete dominant resistance (Supplementary Table S1). Accordingly, the CmoAP2/ERF dCAPS marker genotyping revealed a ratio of susceptible: heterozygous: resistant as 1:1.36:0.61. This implies a lower fitness of the resistant gene, likely due to its introgression into the C. moschata genome via distant hybridization or wide crossing. Although the trait inheritance is incomplete, the heterozygotic lines were found to exhibit adequate PM resistance. This phenomenon agrees with the previous report on the interspecific hybrid between C. pepo and C. martinezii [56,58]. Nevertheless, these seemingly contradicting Mendelian ratios are not uncommon. For example, few qualitative PM-resistance related studies in wheat, including pmX [59] and pmWE99 [60], and two stem rust-resistant genes, including Sr36 [61] and Sr40 [62], also showed segregation distortion patterns due to the favored transmission of specific alleles. Another possible explanation for segregation distortion in our study could be the functional association of the CmoAP2-ERF gene in floral organs, reproductive, and post-embryonic tissue development processes (cucurbitgenomics.org/feature/gene/CmoCh03G009850 (accessed on 25 April 2022)). Interestingly, we also found that the homozygous PMR lines that harbor the incomplete dominant resistance suffered a negative growth impact on their seed production greater than that of the PMS lines (Supplementary Figure S3), which might also, at least in part, contribute to the segregation distortion.
Since the first report on C. moschata whole-genome sequencing was published in 2017 [31], the utilization of pumpkin genomic resources was made available for trait discovery and marker-assisted breeding for crop improvement applications [42]. Particularly while utilizing the next-generation sequencing tools, the GBS strategy aided in high-throughput genotyping and the identification of several genome-wide SNP markers in multiple crop species [63]. In a recent report, Park et al. (2020) carried out GBS analysis and SNP mapping in pumpkins leading to the identification of a single major QTL region in chromosome 3 (~spanned in a 400 kb region between C3. 7562022~C3. 7981972) associated with powdery mildew resistance [42]. In line with these findings, Lee et al.
(2020) analyzed 610 pumpkin germplasms by utilizing the GBS approach and reported 2071 high-quality SNPs distributed across the 20 pumpkin chromosomes [3]. A few other recent studies also elucidated some of the C. moschata genes associated with the PM responses through transcriptome profiling [4]. In their study, Guo et al. identified that six pumpkin genes, including bHLH87 (basic helix-loop-helix 87), ERF104 (ethylene response factor 104), WRKY21, HSFA (heat shock factor A), MLO3 (mildew locus O), and SGT1 (suppressor of G2 allele of SKP1) were differentially regulated in PM-resistant line compared to the PM-sensitive line [4]. Moreover, the CmoSGT1 and CmobHLH87 genes were ectopically overexpressed in tobacco and found to enhance the PM resistance in the transgenic lines [64,65].
While GWAS is a potent technique for dissecting complex traits associated with biotic and/or abiotic stresses in crops, which helps in identifying the novel candidate genes and/or loci [66], one of the challenges is to maintain a stringent check on the false positives generated by the family relatedness and population structure [67]. Therefore, several single-locus and multi-locus based GWAS models were required to evade the false negatives. The GAPIT is a single composite suite which comprises of most powerful and computationally efficient GWAS models [35]. In our study, while using the GAPIT suite, we processed our data through both multi-locus (BLINK, FarmCPU, and MMLM) and single locus (GLM, MLM, and SUPER) GWAS models (Table 1). Two novel SNPs, such as C3.7419839 and C3.7419891, were commonly predicted by single-locus based models (Supplementary Figure S1), whereas the multi-locus models predicted only one SNP, C3. 7419839 ( Figure 2). Apparently, these two SNPs were positioned in exon 8 of the CmoAP2/ERF gene in chromosome-3 ( Figure 3A). Further, Sanger sequencing affirmed both the GWAS-predicted SNPs and additionally uncovered 4 more SNP variations among the PMS and PMR haplotypes ( Figure 3A). Further, the qRT-PCR analysis substantiated the higher endogenous CmoAP2/ERF expression prior to pathogen treatment, and this is also found to be augmented in response to PM infection ( Figure 6). Therefore, we suggest that the CmoAP2/ERF is one of the key regulator genes, and that it is highly associated with the pumpkin PM responses. Additionally, we validated a dCAPS marker that could specifically detect the CmoAP2/ERF gene zygosity ( Figure 4A), which could be beneficial for markerassisted breeding and stacking the CmoAP2/ERF gene with other PM-resistant genes.
The AP2/ERF group of transcription factors is the largest group of plant-specific TFs. This group is widely known for its multifarious roles, including plant development and biotic and abiotic stress regulation [1,43,44]. A recent report revealed that the C. moschata genome contains 212 AP2/ERF genes [68]. In the past, several ERF genes were characterized in the model land plant Arabidopsis thaliana for their functional association with plant pathogenic responses against Alternaria alternata, Botyosphaeria dothidea, Pseudomonas syringae PV tomato DC3000 (PstDC3000), Ralstonia solanacearum, and tobacco mosaic virus (TMV) [43]. Likewise, the ERF genes from one species were functionally characterized in other plant species for their roles in enhancing tolerance to pathogen attack. For instance, overexpression of Malus domestica ERF 100 (MdERF100) in A. thaliana [43]. Gossypium hirsutum ERF gene (GhB301) in Nicotiana benthamiana [69,70], Artemisia annua ORA (AaORA) gene in A. thaliana [71], Glycine max ERF5 (GmERF5) in soybean [45], and tobacco stress-induced gene 1 (Tsi1) in hot pepper [72] enhanced transgenic line tolerance to different pathogens. Furthermore, several reports ascertained the ERF gene functional association with both biotic and abiotic stress-responsive gene elements [44,73,74]. Hence, we speculate the CmoAP2/ERF gene expression might also be instrumental in conveying abiotic stress tolerance as well. Disrupting the function of disease-sensitive genes is also an appealing approach for mitigating PM-induced damage [75]. Using technological advances, such as CRISPR technology, a number of recent studies have shown that knocking down PM sensitive gene expression can effectively render disease resistance in field crops, such as tomato [75], grape [76], sweet basil [77], and wheat [78], without any growth penalty. The next step will be to look at the functional characterization of the CmoAP2/ERF gene in other commercially important crop plants to see if it can help with abiotic stress tolerance, as well as PM resistance.

Plant Materials and Growth Conditions
A total of 407 pumpkin accessions were obtained from the National Agrobiodiversity Centre, Rural Development Administration (RDA) in Korea and from the Agricultural Research Service (ARS) of the United States Department of Agriculture (USDA); the accessions were advanced through self-pollination for 3 cycles. The entire collection was planted in plastic houses at the experimental farm facility at Anseong-si, Republic of Korea, for two consecutive years between, 2018~2019. Further, we used them for the evaluation of powdery mildew (PM) disease responsive phenotypes when the seedlings were in the fully expanded first true leaf stage. Two replications from three plants of each accession were examined throughout the growing period, as described in our previous report [3].

Disease Evaluation and Scoring of PM Response
Healthy pumpkin seedlings at the fully expanded first true leaf stage were used to evaluate the PM response. The conidial suspensions were collected from the heavily diseased plants for inoculation. The density of the spore suspension was adjusted to approximately 1 × 10 5 sporangia per ml before the inoculation. The inoculation was repeated 2 times in 5 day intervals. The resistance index was scored 2 times (10 and 14 days) after the final inoculation day. The disease severity and categorization were briefly estimated using the resistance scale of 1-9, where 1-4: susceptible, 5-6: moderate resistance, 7-9: resistance.
For quantification of the diseased leaf area (DLA), or the percentage of the F 2 progeny, we employed Image J (www.imagej.nih.gov/ij (accessed on 13 January 2022)) online software. Briefly, image of each infected pumpkin leaf sample was captured using a Samsung Galaxy S105G equipped with 12 mega pixel rear camera, 6 mm wide, and F2.4 (telephoto). To achieve uniformity, all the image background was removed using a web-based program called remove.bg (www.remove.bg accessed on 13 January 2022). Further, we imported the processed images to ImageJ and applied the same color threshold value for all the samples to obtain accurate scoring. The infection density was calculated using the formula shown below. DLA (%) = (Damaged area/whole leaf area) × 100

Genome-Wide Association Study
The GAPIT R package [35] was used to perform genome-wide association studies using a dataset of 2071 high-quality SNPs and phenotype data previously generated using 407 examples of pumpkin accessions [3]. Multivariate GWAS methods were employed to evaluate the trait-SNP associations for PM resistance. The significance of associations among markers and traits was given as p-value for significant markers (p ≤ 0.05). The GWAS result was presented using Manhattan plots [−log10(p)] and quantile-quantile (QQ) plots. Based on the Bonferroni test, the threshold value was estimated at a significant level of 5% [−log10 (2.41) = 4.61].

Genomic DNA Extraction and Identification of Candidate Gene
Genomic DNA (gDNA) was extracted using the fresh and turgid pumpkin leaf samples. For the gDNA extraction, we used the cetyltrimethylammonium bromide (CTAB) method [79]. The gDNA quality and integrity was checked by agarose gel electrophoresis, and the quantity was assessed using a Nanodrop ND-2000 spectrophotometer, following the manufacturer s instructions (Thermo Fisher Scientific, Waltham, MA, USA). In this study, we used C. moschata (Rifu) genome (http://cucurbitgenomics.org/organism/9 (accessed on 12 December 2021)) sequences as the pumpkin reference sequences for comparison. To ascertain the in silico SNP data and to affirm the allelic diversity, we sequenced the CmoAP2/ERF gene from both the variants. For this purpose, we extracted gDNA from the PMS and PMR lines, and PCR was amplified using the gene specific primers. Further, we checked the amplification, followed by the purification of the PCR fragments using a gel purification kit (Cosmo Genetech, Seoul, Korea) and submitted the samples for sequencing (Cosmo Genetech, Seoul Korea) using the gene specific primers.

Development of Genotyping Markers and Validation
Based on Sanger gene sequencing, we found many single nucleotide polymorphic (SNP) variations between the PMS and PMR lines. We used one of the SNPs for developing the genotyping markers by using the web-based tool called dCAPS finder (http://helix. wustl.edu/dcaps/dcaps.html (accessed on 1 February 2022)). Based on the program output, we examined various primers and identified one of the dCAPS markers that can distinguish the PMS and PMR lines. The primer pairs used for PCR amplifications are CmoAP2/ERF-dCAPS-for-5 CAATTCAGGCGGCAGGCGGGTGCTG 3 and CmoAP2/ERF-dCAPS-rev-5 ATTATT CGGCCTCCA TTACT 3 . Upon PCR with the specified primers, followed by the Pst1 (New England Biolabs, Ipswich, MA, USA) enzyme digestion, the amplicons were resolved in 5% agarose gel.

RNA Isolation and cDNA Synthesis
For the gene expression analysis, the PMR and PMS lines were grown in plastic pots in a greenhouse facility at Sejong university, Seoul, South Korea. The p. xanthii inoculum was prepared from the heavily infected leaves and sprayed evenly on the seedling surface. The tissues were harvested from the PMS and PMR samples at 0, 6, 24, 48, and 72 h, respectively. The samples were snap frozen in liquid nitrogen and kept at −80 • C for RNA isolation. The total RNAs were isolated from the pumpkin leaf samples using the TaKaRa MiniBEST Plant RNA Extraction Kit (Shiga, Japan). First-strand cDNAs were synthesized using the Nanohelix Easy cDNA synthesis kit (Daejon, Korea).

Quantitative Real Time PCR (qRT PCR)
qRT-PCR was performed using Toyobo Thunderbird SYBR qPCR mix (Osaka, Japan) and carried out in CFX96 connect Real-time Quantitative system (Bio-Rad Laboratories, Hercules, CA, USA). The relative expression of CmoAP2/ERF (CmoCh03G009850) gene expression in different samples was calculated by normalizing the expression values with those of the housekeeping gene CmoActin (CmoCh11G015080) as an endogenous control. For calculating the relative gene expression of the CmoAP2/ERF gene, we adopted the 2 -∆∆Ct method, as previously described [80,81]. The relative expression of the CmoAP2/ERF gene was estimated using the primer pairs CmoAP2/ERF-qRT-For 5 GAACGACTGCC-CTGATGTGA 3 and CmoAP2/ERF-qRT-Rev 5 AAGCCGATCCCACCCTTTTC 3 . As an endogenous control, we used CmoActin-qRT-For-5 CCGCTCTTGCTCCGAGCAG 3 and CmoActin-qRT-Rev-5 ATCCACATCTGTTGGAAGGTAC 3 , respectively. Three independent biological replicates for each treatment were prepared. For each biological replicate, we ran at least three technical replicates of each PCR reaction.

Bioinformatic Analysis
To visualize the Sanger gene sequencing-derived chromatograms and sequence analysis, a web-based program, snap gene viewer suite (www.snapgene.com/snapgene-viewer (accessed on 1 March 2022)), was used. For assembling and comparing the Sanger gene sequencing data, we used DNASTAR Seqman analysis software. For designing primers, the Primer 3 web-based tool was used (https://primer3.ut.ee (accessed on 15 February 2022)).

Statistical Analysis
All statistical comparisons between the variances were determined by Student s t-test and least significant differences (LSD) using Statistix 8.1 computation software. The chisquare test was performed using Microsoft Excel 16.0 software to estimate the segregation ratio of PM resistance in the F 2 progeny derived from a cross between the PMS and PMR lines.

Conclusions
Pumpkins are economically important crops used as food stuffs and rootstocks for cucurbits worldwide. However, PM caused by p.xanthii affects the growth of pumpkins and reduces plant productivity. GWAS is a viable strategy for associating specific genetic variations with a particular trait. The use of GWAS for the resistance to PM in pumpkin resulted in the identification of a novel gene encoding CmoAP2/ERF, with six SNP variations in the coding sequence of the haplotypes PMR and PMS. Further qRT-PCR exhibited an elevated expression of CmoAP2/ERF during PM infection in PMR compared with PMS. Overall, the study provided empirical evidence for a pivotal role of the genotyping marker developed based on the allelic variations in CmoAP2/ERF for rapidly detecting p.xanthiimediated PM resistance in pumpkin genotypes.