Global Transcriptome and Co-Expression Network Analyses Revealed Hub Genes Controlling Seed Size/Weight and/or Oil Content in Peanut

Cultivated peanut (Arachis hypogaea L.) is an important economic and oilseed crop worldwide, providing high-quality edible oil and high protein content. Seed size/weight and oil content are two important determinants of yield and quality in peanut breeding. To identify key regulators controlling these two traits, two peanut cultivars with contrasting phenotypes were compared to each other, one having a larger seed size and higher oil content (Zhonghua16, ZH16 for short), while the second cultivar had smaller-sized seeds and lower oil content (Zhonghua6, ZH6). Whole transcriptome analyses were performed on these two cultivars at four stages of seed development. The results showed that ~40% of the expressed genes were stage-specific in each cultivar during seed development, especially at the early stage of development. In addition, we identified a total of 5356 differentially expressed genes (DEGs) between ZH16 and ZH6 across four development stages. Weighted gene co-expression network analysis (WGCNA) based on DEGs revealed multiple hub genes with potential roles in seed size/weight and/or oil content. These hub genes were mainly involved in transcription factors (TFs), phytohormones, the ubiquitin–proteasome pathway, and fatty acid synthesis. Overall, the candidate genes and co-expression networks detected in this study could be a valuable resource for genetic breeding to improve seed yield and quality traits in peanut.


Introduction
Cultivated peanut (Arachis hypogaea L.) is an important grain legume that provides high-quality edible oil, rich protein, and other nutrients.In legumes, seed development is precisely modulated by both maternal and zygote signals, which, in coordination, regulate the growth of the embryo, endosperm, and seed coat [1].The diploid embryo, triploid endosperm, and maternal integument grow in concert to control the seed size [2].Seed size/weight is a crucial factor determining peanut yield and thus one of the major agronomic traits in peanut domestication and modern breeding.In addition, owing to their well-balanced fatty acid composition, peanuts are considered a functional food rich in specific antioxidants and mono-and polyunsaturated fatty acids [3].Although the oil content of currently cultivated peanut varieties is usually high, there are still significant differences among different germplasms, ranging from 31.7 to 57.0% [4].Hence, developing novel peanut cultivars with higher oil content is a prime target under the condition of limited planting area.
Plants 2023, 12, 3144 2 of 18 A large number of genes that regulate seed size/weight and/or oil content have been identified in Arabidopsis thaliana (Arabidopsis), Glycine max (soybean), Oryza sativa (rice), and Zea mays (maize), including genes related to transcription factors (TFs) and phytohormones [5][6][7][8], as well as genes involved in the ubiquitin-proteasome pathway and fatty acid biosynthesis [9][10][11].For example, GmAP2-1 and GmAP2-4, belonging to the AP2/ERF family, were identified as positively regulating seed weight and size in soybean [7].The overexpression of GmMYB73 was found to increase the oil content of Arabidopsis seeds by 5.9-17.9%and significantly enhance the 1000-seed weight [5].Activation of the expression of the Big Grain1 (BG1) gene, which regulates auxin transport, resulted in a significant increase in rice seed size [6].Furthermore, the ubiquitin receptor protein encoding gene DA1, involved in the ubiquitin-proteasome pathway, is an inhibitor of the regulation of Arabidopsis seed size by restricting the cell proliferation cycle [9].Suppressed expression of fatty acid dehydrogenase 2 (FAD2) and fatty acid elongase 1 (FAE1) involved in the de novo biosynthesis of fatty acids led to a slight decrease in the content of rapeseed oil and affected the accumulation of storage products [10].However, few genes that influence seed size and/or oil content have been found in peanuts, which greatly restricts the breeding of high-quality peanut varieties.
In previous studies, genetic mapping was used to identify genomic loci controlling important traits in peanut, leading to the identification of multiple QTLs associated with yield [12][13][14][15] and oil content [16,17].Nevertheless, it is challenging to detect candidate genes in QTLs that determine seed size/weight and/or oil content due to the complexity of peanut genome structure and the low-level polymorphism of molecular markers across different tetraploid A. hypogaea cultivars [12,[18][19][20][21]. Owing to its reduced cost and high throughput nature, RNA-seq has been widely used to investigate transcriptome profiles in many plant species to assist in the identification of key genes for traits of interest [22][23][24][25][26][27].For instance, comparative transcriptome analyses of wild and cultivated soybean varieties at the early and middle stages of seed maturation resulted in the identification of two genes potentially important for seed trait formation, GA20OX and NFYA [23].A similar transcriptome comparison was conducted between large-seed and small-seed soybean cultivars at seed formation, growth, and early maturation stages, revealing several candidate genes that influence seed size, including TFs and phytohormones [24].
In addition, comparative transcriptome studies on peanuts have also been reported.For example, the transcriptome dynamics at 11 consecutive developmental stages of peanut pods showed that multiple genes involved in various pathways, such as gravitropism, photomorphogenesis, and response to stimuli, were differentially expressed along developmental gradients [22].The transcriptome differences of two peanut varieties with distinct sucrose contents were explored at seven seed developmental stages, identifying 28 DEGs involved in sucrose metabolism [28].Hub genes positively correlated with Aspergillus flavus resistance were identified in two peanut lines with contrasting genotypes via comparative transcriptome analysis [27].Likewise, several candidate genes, including those encoding transcription factor TGA7, IAA-amino acid hydrolase, and pentatricopeptide repeat protein, were detected by a transcriptome study on two cultivated peanut accessions and their wild relative Arachis monticola at four stages of seed development [26].However, studies to understand the molecular mechanisms underlying seed size and oil content in peanut through comprehensive transcriptome analysis are still largely unavailable.
In this study, comparative transcriptome analysis was conducted on two peanut cultivars with significant differences in seed size/weight and oil content at different stages of seed development.Transcriptome dynamics and gene co-expression networks associated with seed development were systematically investigated.Candidate genes potentially controlling seed size/weight and/or oil content were pinpointed.Overall, this study provides new insights into the molecular mechanisms underlying peanut seed development and forms a valuable resource for the genetic improvement of seed size/weight and oil content in peanut breeding.

Plant Materials and Sampling
Two peanut cultivars (Arachis hypogaea L.) with contrasting phenotypes in seed size/weight and oil content (namely, ZH16 (larger seed and higher oil content and ZH6 (smaller seed and lower oil content)) were grown in 2017 in an experimental field of the Oil Crops Research Institute, Wuhan, China.In order to mitigate the effects caused by environmental factors, we maintained consistent field management practices throughout the entire growth period.Daytime temperatures range from 25 • C to 35 • C, and the night temperature is not lower than 20 • C. According to a previous study (Pattee et al., 1974), seeds were collected with three biological replicates at 10, 20, 30, and 40 days after flowering (DAF), corresponding to the S1, S2, S3, and S4 stages, respectively.The seeds were very small and difficult to isolate at the S1 stage, while the entire seed remained flattened and white, or only one end turned pink, at the S2 stage.The seeds in the S3 stage were torpedoshaped and the end of the kernel embryonic axis was generally pink, while the seeds in the final S4 stage were round-shaped and completely light pink.We sampled at least 30 seeds for each biological replicate at the S1 and S2 stages and 10 seeds for each biological replicate at the S3 and S4 stages.The harvested seeds were immediately frozen in liquid nitrogen and then stored at −80 • C until RNA isolation.

Identification of Stage-Specific Expression Genes
Stage-specific genes of two cultivars were identified using a stage specificity (SS) scoring algorithm, which compared the gene expression at one stage with the maximum expression at the other stages of seed development [25,33,34].The SS scores ranged from 0 to 1.The higher the SS score of a gene at a certain stage, the more specific the expression of the gene at that stage.Genes with an SS score ≥ 0.5 were defined as being specifically expressed at a given developmental stage in each cultivar.

Functional Annotation and Enrichment Analysis
The functional annotation file of the reference genome was downloaded from the website (https://data.legumeinfo.org/Arachis/hypogaea/(accessed on 16 June 2023)).GO enrichment analysis was performed using BiNGO plugins in Cytoscape [35], and GO terms with p < 0.05 were considered to be significantly enriched.

Weighted Gene Co-Expression Network Analysis (WGCNA)
To explore the regulatory relationships across genes, WGCNA was performed on DEGs using R v3.6.2 [36].Pairwise co-expressed genes with weighted values < 0.2 were removed via in-house perl scripts, and the resulting significant co-expression networks were finally visualized in Cytoscape v3.8.2 [37].Modules were defined as gene clusters Plants 2023, 12, 3144 4 of 18 with high correlation coefficients among genes.Hub genes in each module were identified with the absolute value of kME (eigengene connectivity) greater than 0.8 (|kME| > 0.8).

Results
3.1.Phenotype Analysis of Two Peanut Cultivars ZH16 and ZH6 with Contrasting Seed Size/Weight and Oil Content In this study, fresh seeds from two cultivars ZH16 and ZH6 at the S1, S2, S3, and S4 stages were collected to evaluate their seed size, while mature seeds were harvested to measure the hundred-seed weight and oil content.The results showed no significant difference in seed size between ZH16 and ZH6 at the S1 stage.With the extension of growth stages, the seed size of ZH16 was larger than that of ZH6 at the S2 stage, and this difference was more obvious at the S3 and S4 stages (Figure 1A).This phenomenon is in accordance with the significant difference in the hundred-seed weight between ZH16 and ZH6, which were 87.58 g and 61.92 g, respectively (Figure 1B).Moreover, a slight significant variation in terms of oil content was also found between these two accessions, with 53.44% and 49.13% in ZH16 and ZH6, respectively (Figure 1C).

Weighted Gene Co-Expression Network Analysis (WGCNA)
To explore the regulatory relationships across genes, WGCNA was performed on DEGs using R v3.6.2 [36].Pairwise co-expressed genes with weighted values <0.2 were removed via in-house perl scripts, and the resulting significant co-expression networks were finally visualized in Cytoscape v3.8.2 [37].Modules were defined as gene clusters with high correlation coefficients among genes.Hub genes in each module were identified with the absolute value of kME (eigengene connectivity) greater than 0.8 (|kME| > 0.8).

Phenotype Analysis of Two Peanut Cultivars ZH16 and ZH6 with Contrasting Seed Size/Weight and Oil Content
In this study, fresh seeds from two cultivars ZH16 and ZH6 at the S1, S2, S3, and S4 stages were collected to evaluate their seed size, while mature seeds were harvested to measure the hundred-seed weight and oil content.The results showed no significant difference in seed size between ZH16 and ZH6 at the S1 stage.With the extension of growth stages, the seed size of ZH16 was larger than that of ZH6 at the S2 stage, and this difference was more obvious at the S3 and S4 stages (Figure 1A).This phenomenon is in accordance with the significant difference in the hundred-seed weight between ZH16 and ZH6, which were 87.58 g and 61.92 g, respectively (Figure 1B).Moreover, a slight significant variation in terms of oil content was also found between these two accessions, with 53.44% and 49.13% in ZH16 and ZH6, respectively (Figure 1C).

Comparative Transcriptome Analysis of Seeds of ZH16 and ZH6 at Different Developmental Stages
To explore transcriptional dynamics of ZH16 and ZH6 during seed development, RNAseq analysis was performed on seeds at the S1, S2, S3, and S4 stages of seed development.
In total, about 810 million raw reads were generated, with an average of ~30 million reads per sample (Supplementary Table S1).After filtering low-quality reads, clean reads were aligned to the reference genome of cultivated peanut (Bertioli et al., 2019).
To offer a comprehensive overview of the transcriptome dataset and to evaluate noise effects, we performed a principal component analysis (PCA) based on the average FPKM values of all expressed genes (Figure 2A).Samples of ZH16 and ZH6 at the same developmental stage were clustered together, indicating that the overall transcriptome dynamics of these two cultivars were similar at the same stage of seed development.Moreover, all samples from the S3 and S4 stages of both cultivars were grouped together, suggesting higher similarity in their transcriptome profiles.Correlation analysis based on FPKM showed high correlation coefficients across three different replicates of nearly all samples, except for ZH16 S2, ZH6 S3, and ZH6 S4, which only had high correlation between two biological replicates (Supplementary Figure S1A).Collectively, these results showed that our RNA-seq data were reliable and reproducible, and that they could be used for subsequent in-depth analyses.

Comparative Transcriptome Analysis of Seeds of ZH16 and ZH6 at Different Developmental Stages
To explore transcriptional dynamics of ZH16 and ZH6 during seed development, RNA-seq analysis was performed on seeds at the S1, S2, S3, and S4 stages of seed development.In total, about 810 million raw reads were generated, with an average of ~30 million reads per sample (Supplementary Table S1).After filtering low-quality reads, clean reads were aligned to the reference genome of cultivated peanut (Bertioli et al., 2019).
To offer a comprehensive overview of the transcriptome dataset and to evaluate noise effects, we performed a principal component analysis (PCA) based on the average FPKM values of all expressed genes (Figure 2A).Samples of ZH16 and ZH6 at the same developmental stage were clustered together, indicating that the overall transcriptome dynamics of these two cultivars were similar at the same stage of seed development.Moreover, all samples from the S3 and S4 stages of both cultivars were grouped together, suggesting higher similarity in their transcriptome profiles.Correlation analysis based on FPKM showed high correlation coefficients across three different replicates of nearly all samples, except for ZH16 S2, ZH6 S3, and ZH6 S4, which only had high correlation between two biological replicates (Supplementary Figure S1A).Collectively, these results showed that our RNA-seq data were reliable and reproducible, and that they could be used for subsequent in-depth analyses.A total of 53,434 genes were identified as expressed genes with FPKM values higher than 0.1 in at least one of the eight samples (Supplementary Table S2).Among them, 42,945, 46,025, 44,129, and 42,378 genes were found to be expressed in the ZH16 cultivar at the S1, S2, S3, and S4 stages of seed development, respectively (Supplementary Figure S1B).Similarly, 44 909, 46 270, 44 078, and 40 917 expressed genes were found to be expressed in the ZH6 cultivar at the S1, S2, S3, and S4 stages of seed development, respectively (Supplementary Figure S1B).Moreover, the proportion of genes with different expression levels was also similar between ZH16 and ZH6 at the same stage.Approximately 40% of the genes exhibited a low expression level (0.1 ≤ FPKM ≤ 2) in different developmental stages of the two cultivars (Figure 2B).With the seed development of each cultivar, the ratio of highly expressed genes (FPKM ≥ 10) dropped slightly (e.g., from 23% at S1 to 15% at the S4 stage), while the ratio of genes with moderate expression (2 ≤ FPKM < 10) increased marginally (e.g., from ~35% at S1 to ~42% at the S4 stage) (Figure 2B).The overlap and specificity of expressed genes in the four stages of ZH16 and ZH6 are displayed in Figure 2C,D.These results suggested large variations in genome-wide gene expression at different stages of seed development, which may be related to the observed differences in seed size/weight and oil content between ZH16 and ZH6.

Identification of Stage-Specific Expressed Genes during Seed Developmental Stages in ZH16 and ZH6
To explore the different transcriptional characteristics of the two cultivars at each stage of seed development, a stage specificity (SS) algorithm was used to identify stagespecific genes with an SS score ≥ 0.5.A total of 22,045 and 20,027 specifically expressed genes were identified in ZH16 and ZH6 at all stages of seed development, respectively (Supplementary Table S3).The numbers of stage-specific genes were notably different across the four different stages, varying from 882 (S4) to 11,433 (S2) for ZH16 and from 684 (S4) to 9529 (S1) for ZH6 (Figure 3A).Accordingly, the common stage-specific expressed genes in both cultivars ranged from 212 (S3) to 6515 (S1) (Figure 3A).The expression heatmap of all common stage-specific genes showed that the majority of these genes were highly expressed at the early developmental stages (S1, S2) in both cultivars (Figure 3B), which was consistent with a previous study on peanut [28].Interestingly, more stagespecific genes were detected in ZH16 than in ZH6 at all stages except S1, with the largest increase at the S2 stage.These results suggest that each cultivar had its own independent developmental process, especially at the early stage.
To explore the specific function at different developmental stages, GO enrichment analysis was performed on common stage-specific genes in each stage of ZH16 and ZH6 (Figure 3C,D, Supplementary Figure S2).At the S1 stage, significant GO terms (p < 0.05) of common stage-specific genes were related to cell wall organization or biogenesis, cell wall modification, the carbohydrate metabolic process, the lipid metabolic process, and transport (Figure 3C).At the S2 stage, common stage-specific genes were mainly involved in cell cycle and division-related process, various metabolic processes, and the regulation of these processes (Figure 3D).Furthermore, for the S3 and S4 stages, the top GO terms included a variety of response and regulation processes, such as the response to hormone/chemical stimulus, the regulation of photomorphogenesis, and the hormone-mediated signaling pathway (Supplementary Figure S2).To explore the specific function at different developmental stages, GO enrichment analysis was performed on common stage-specific genes in each stage of ZH16 and ZH6 (Figure 3C,D, Supplementary Figure S2).At the S1 stage, significant GO terms (p < 0.05) of common stage-specific genes were related to cell wall organization or biogenesis, cell wall modification, the carbohydrate metabolic process, the lipid metabolic process, and transport (Figure 3C).At the S2 stage, common stage-specific genes were mainly involved in cell cycle and division-related process, various metabolic processes, and the regulation of these processes (Figure 3D).Furthermore, for the S3 and S4 stages, the top GO terms included a variety of response and regulation processes, such as the response to hormone/chemical stimulus, the regulation of photomorphogenesis, and the hormone-mediated signaling pathway (Supplementary Figure S2).

Identification of Differentially Expressed Genes during Seed Development in Two Peanut Cultivars
To investigate transcriptional differences between ZH16 and ZH6 cultivars, we performed pairwise comparisons of the two cultivars at four seed developmental stages.Compared with ZH6, a total of 5222 differentially expressed genes (DEGs) were identified in ZH16 across four stages, among which 1810 genes were up-regulated and 3546 genes were down-regulated (Supplementary Table S4).The number of DEGs was highest at the S1 stage (3194), followed by the S2 stage (2237), S3 stage (1580), and S4 stage (1300) (Figure 4A).Notably, the number of up-regulated DEGs was greater than that of down-regulated DEGs at the S3 and S4 stages, whereas the opposite was found at stage S1 (Figure 4A).DEGs grouped in accordance with their log 2 FoldChange (FC) were shown to be unevenly distributed across all stages.FC values of the most up-regulated genes were between two-and four-fold or between four-and eight-fold, while the absolute FC values of down-regulated genes were mostly between one-and two-fold or between four-and eight-fold (Figure 4B).Of all these DEGs, only 349 up-regulated and 218 down-regulated genes were overlapping across all four stages (Figure 4C,D).These results revealed the distinct patterns of gene expression between ZH16 and ZH6 at different stages of seed development.
were down-regulated (Supplementary Table S4).The number of DEGs was highest at the S1 stage (3194), followed by the S2 stage (2237), S3 stage (1580), and S4 stage (1300) (Figure 4A).Notably, the number of up-regulated DEGs was greater than that of down-regulated DEGs at the S3 and S4 stages, whereas the opposite was found at stage S1 (Figure 4A).DEGs grouped in accordance with their log2FoldChange (FC) were shown to be unevenly distributed across all stages.FC values of the most up-regulated genes were between twoand four-fold or between four-and eight-fold, while the absolute FC values of down-regulated genes were mostly between one-and two-fold or between four-and eight-fold (Figure 4B).Of all these DEGs, only 349 up-regulated and 218 down-regulated genes were overlapping across all four stages (Figure 4C,D).These results revealed the distinct patterns of gene expression between ZH16 and ZH6 at different stages of seed development.GO enrichment analysis of DEGs showed that several biological processes were commonly or uniquely enriched at different seed developmental stages (Figure 4E, Supplementary Figure S3).We observed that in all developmental stages except S1, up-regulated DEGs were mainly enriched in various functional terms related to cell division, such as the mitotic cell cycle process, nuclear division, and organic cyclic compound catabolic process.Various primary and secondary metabolic/biosynthetic processes, including carboxylic acid, fatty acid, lipid, flavonoid, phenylpropanoid, and lignin, were significantly enriched for highly expressed genes at the S3 and/or S4 stages.However, genes involved in hormone/cytokinin/phosphatidylcholine metabolic processes were up-regulated at the S1 stage and down-regulated at the S3 stage.

Co-Expression Network Analysis of DEGs by WGCNA
To further explore potential key genes that determine differences in seed size/weight and/or oil content between two cultivars, WGCNA was performed based on the expression levels of 4367 DEGs at four developmental stages after filtering unknown genes.Based on the soft threshold power β = 9, a scale-free network was constructed, resulting in the generation of 11 co-expression modules (Figure 5, Supplementary Figure S4).The modules were color-coded, and the grey module contained genes not assigned to any other modules.The gene numbers of different modules varied greatly, ranging from 84 to 1176.Correlation analysis of module-sample relationships revealed that the magenta module was positively associated with larger-seed and higher-oil ZH16 at the S1 and S2 stages (r = 0.59, p = 0.005; r = 0.57, p = 0.007) and the yellow module was positively associated with ZH16 at the S2 stage (r = 0.84, p = 2 × 10 −6 ), while the red module was strongly positively correlated with smaller-seed and lower-oil ZH6 at the S2 stage (r = 0.8, p = 1 × 10 −5 ) (Figure 5A).Subsequently, the expression patterns of genes within these three modules at different stages of seed development in ZH16 and ZH6 were visualized in heatmaps (Figure 5B-D).During seed development, especially at the S2 stage, the gene expression of the magenta and yellow modules was higher in larger-seed and higher-oil ZH16 than in ZH6, Subsequently, the expression patterns of genes within these three modules at different stages of seed development in ZH16 and ZH6 were visualized in heatmaps (Figure 5B-D).During seed development, especially at the S2 stage, the gene expression of the magenta and yellow modules was higher in larger-seed and higher-oil ZH16 than in ZH6, indicating their positive effect in seed size/weight and oil content (Figure 5B,C).However, the expression of genes in the red module was higher in all stages of smaller-seed and lower-oil ZH6, especially in the S2 stage (Figure 5D), suggesting their negative roles in these two traits.Overall, the genes within the magenta, yellow, and red modules may be closely related to the difference in size/weight and/or oil content between ZH16 and ZH6 cultivars, so further analysis of the genes in these three modules is of high significance.GO enrichment analysis was performed on the genes belonging to these three key modules to elucidate their gene function.Magenta and yellow modules positively correlated with ZH16 were significantly enriched in functional terms of cell cycle, mRNA modification, fatty acid metabolic process, and ubiquitin-dependent proteolysis process.For the genes with lower expression within the red module in ZH16, the top terms were enriched in ribosome biogenesis and rRNA metabolic process.

Identifying Hub Genes Associated with Seed Size/Weight and/or Oil Content within Key Modules
To identify key candidates related to size/weight and oil content in magenta, yellow, and red modules, hub genes were identified based on their high kME values (|kME| > 0.8).According to functional annotations and reported genes associated with these two traits in other species (Supplementary Table S5), hub genes involved in TFs, phytohormones, the ubiquitin-proteasome pathway, and fatty acid synthesis may influence size/weight and/or oil content (Figure 6A-F; Supplementary Table S6).In addition, we collected known QTLs or candidate genes associated with seed size/weight and/or oil content reported in previous studies (Supplementary Table S7).A total of 14 hub genes involved in the aforementioned pathways were found to overlap with these loci or candidate genes, as shown in red in Figure 6.

TFs
A total of twenty-four TFs in the magenta and yellow modules and six TFs in the red module were identified as hub genes belonging to the WRKY, AP2, bHLH, MADSbox, MYB, ARF, SBP, and SNF2 families (Figure 6A-F; Supplementary Table S6).Among them, two WRKY family genes (arahy.Y7DSHF and arahy.V6NF3H) showed significantly higher expression in ZH16 than in ZH6.The Arabidopsis wrky10 mutant exhibited smaller seeds due to early cellularization of the endosperm, which inhibited the proliferation of embryonic cells [38].Two genes (arahy.WY3QF4 and arahy.1LT7× 4), encoding AP2-like ethylene-responsive TF, had notably higher expression in ZH16; they were previously shown to play a positive role in regulating seed weight and size in soybean [7,39].The overexpression of AP2/EREBP TFs BnWRI1 and GmWRI1a led to enhanced seed oil content in rapeseed and soybean, respectively [40,41].Furthermore, the transcript abundances of three MADS-box TFs (arahy.L7SRZY, arahy.S6RW25, and arahy.7JHN4J)were remarkably higher in ZH16.These TFs were reported to play a role in seed size via modulating endosperm cellularization in rice [42].Two MYB genes (arahy.MRFN9C and arahy.17JY83) and one ARF (arahy.RY1LQQ) with a higher expression level in ZH16 were previously shown to determine seed size/weight in plants [43,44].
Furthermore, a number of hub genes involved in the ubiquitin-proteasome pathway were found (Figure 6A-F; Supplementary Table S6).For example, two genes (arahy.VY3GX3 and arahy.2K1FXD)encoding E3 ubiquitin-protein ligases showed significantly higher expression in ZH16, which was recently found to influence seed size through the starch synthesis pathway in wheat [48].Zinc finger (C3HC4-type RING finger) family protein encoding genes arahy.B47L6X and arahy.APGH35, functioning as E3 ubiquitin ligases, had higher transcript abundance in ZH16.Five F-box proteins encoding hub genes derived from the magenta and yellow modules showed a higher expression level in ZH16 with largerseed and higher-oil content.These genes were characterized to have substrate recognition specificity in ubiquitin-mediated proteolysis [49].

Fatty acid Synthesis
A number of DEGs involved in fatty acid biosynthesis were identified, such as those encoding fatty acid hydroxylase, fatty acid desaturase, 3-ketoacyl-CoA synthase, long chain acyl-CoA synthetase, and acyl-CoA thioesterase (Figure 6A-F; Supplementary Table S6).Two genes (arahy.6AK6JCand arahy.ZVY5TW) encoding acyl-CoA thioesterase were found to have lower expression level in ZH16 than in ZH6, while fatty acid desaturase encoding gene arahy.93DW2Dshowed the opposite.These two enzymes were previously demonstrated to influence rapeseed oil content [10,50].In addition, DEGs involved in starch and sucrose metabolism were also detected, including those encoding beta glucosidase, glycoside hydrolase, fructose-bisphosphate aldolase, and xyloglucan endotransglucosylase. Sucrose is used to produce dihydroxyacetone phosphate and acetyl-CoA as raw materials for TAG synthesis, which is a key step in the formation of seed oil [51].

Discussion
In this study, we aimed to determine the molecular mechanisms underlying seed size/weight and oil content by performing comparative transcriptome analysis on two peanut cultivars with contrasting phenotypes at four stages of seed development.A total of 5222 DEGs were identified between ZH16 and ZH6 across the four stages of seed development analyzed.Combined with co-expression networks analysis, hub genes in three key modules were identified, which were mainly involved in TFs, phytohormones, the ubiquitin-proteasome pathway, fatty acid synthesis, cytochrome P450 proteins, ABC transporters, PPR proteins, and receptor-like kinases.TFs play a key role in regulating plant life cycle activities and adaptation to the environment, such as controlling plant seed development [52][53][54].In this study, a total of 30 TFs were identified as both DEGs and hub genes within several co-expression modules (Figure 6A-F; Supplementary Table S6).Among them, two genes (arahy.Y7DSHF and arahy.V6NF3H) belonging to WRKY family showed significantly higher expression in ZH16 than in ZH6.GmWRKY15a and OsWRKY36 were found to regulate seed size in soybean and rice, respectively [52,55].Another two bHLH TFs (arahy.7XD9C1and arahy.48I8DT)were also identified to have higher transcript levels in ZH16.Functionally deficient mutant of the bHLH TF RGE1 (ZOU) resulted in slow embryo growth, eventually leading to the production of small and wrinkled seeds in Arabidopsis and maize [56,57].Another bHLH TF TaPGS1 was also reported to positively regulate grain weight and size in rice and wheat [58].These results indicate that TFs play an important role in the regulation of plant seed size/weight.

Phytohormones
Many phytohormones, such as auxin, gibberellin, cytokinin, and brassinolide, have been reported to regulate plant seed growth and development [46,[59][60][61].In our study, six DEGs related to auxin or cytokinin were identified, including those encoding auxin response factor, auxin transport protein, and histidine kinase (Figure 6A-F; Supplementary Table S6).ZmARF12 is an inhibitor of cell division during seed development, and its defective mutant could lead to increased seed size/weight in maize [8], whereas the arf25 mutant resulted in decreased grain weight and size in wheat [62].The transport of auxin to maternal tissue is a key driver of seed coat development, thus affecting seed size in Arabidopsis [63].Moreover, cytokinins may influence seed size by regulating the growth of embryonic cells during seed development [64,65].A gene (BnaA03G37960D) encoding cytokinin receptor histidine kinase in rapeseed was the candidate gene for 1000-seed weight by linkage and association analyses [45].These results illuminate that phytohormones are crucial for plant seed development and the regulation of seed size/weight.

The Ubiquitin-Proteasome Pathway
It has been reported that the ubiquitin-proteasome pathway is involved in the improvement of crop yield by regulating seed agronomic traits, such as seed size/weight [66].Some DEGs associated with the ubiquitin-proteasome pathway were found in this study (Figure 6A-F; Supplementary Table S6).The Arabidopsis ubiquitin receptor protein encoding gene DA1 was an inhibitor that regulated seed size by limiting cell proliferation [9].Similarly, DA2 encodes an ubiquitin ligase that negatively regulated seed size in Arabidopsis and cotton [11,67].TaGW2-6A, encoding an E3 ubiquitin ligase, has been reported to interact with TaAGPS in wheat to influence seed size through the starch synthesis pathway [48].

Other Important Genes
We also found that two genes (arahy.2F8F1Sand arahy.0K456L)encoding cytochrome P450 proteins had higher expression levels in ZH16.Previously, KLU (CYP78A5) and EOD3 (CYP78A6) encoding cytochrome P450 monooxygenases were found to positively regulate seed size in Arabidopsis [68,69].Overexpression of GmCYP78A72/GmCYP78A5 and TaCYP78A3 could increase seed size in soybean and wheat, respectively [24,70,71].Three genes (arahy.S5WRJR, arahy.VYP3BN, and arahy.TZU5V7) related to the ABC transport pathway displayed relatively higher expression levels in ZH16.An ABCC3 transporter gene CaABCC3 (6) could affect seed size by regulating the transport of glutathione conjugates in chickpea seed cells [72].Additionally, we also found other DEGs, including those encoding PPR proteins and receptor-like kinases.It was previously reported that PPR protein encoding genes Dek39 and Emp18, involved in RNA editing, were essential for seed development in maize [73,74].PPR proteins may also be related to the increase in seed size/weight of cultivated peanut [26].The receptor-like kinase ERECTA (ER) and LecRK-VIII.2 have been shown to positively regulate seed size by modulating the proliferation of outer integument cells, and they were upstream regulators of MAPK signaling pathway in Arabidopsis [75,76].

Roles of Key Genes Related to Fatty Acid Biosynthesis in Determinating Seed Oil Content
Oil synthesis can be divided into three stages during seed development.Firstly, acetyl-CoA is used as the initial substrate to synthesize long-chain fatty acids in plastids.Subsequently, long-chain fatty acids are transported to the endoplasmic reticulum (ER) to assemble triacylglycerol (TAG).Finally, triacylglycerol combines with oil body proteins to form oil bodies [77].In this study, we identified a number of DEGs involved in fatty acid biosynthesis, including those encoding enzymes and TFs (Figure 6A-F; Supplementary Table S6).It has been reported that the knockout of KASIII (β-ketoacyl-ACP synthase) and FATB (acyl-ACP thioesterase) in rapeseed resulted in an increase in contents of medium chain fatty acids [50].Reduced expression of FAD2 (fatty acid dehydrogenase 2) and FAE1 (fatty acid elongase 1) in rapeseed using RNAi caused a slight decrease in seed oil content, an increase in oleic acid content, and a decrease in erucic acid content [10].Similarly, low expression of fatty acid desaturase and elongase in crambe could largely increase seed oleic acid content [78].
AP2-like ethylene-responsive TF encoding genes arahy.WY3QF4 and arahy.1LT7×4were found to show significantly higher expression in ZH16 in our study.The AP2/EREBP TF ZmWRI1 was regulated by upstream regulators LEC2 and LEC1 and positively regulated the downstream target genes PI-PKβ1, BCCP2, ACP1, and KAS1, which were mainly involved in glycolysis and fatty acid synthesis in maize [79].Overexpression of BnWRI1 and GmWRI1a led to enhanced seed oil content in rapeseed and soybean via complex co-expression gene networks related to fatty acid biosynthesis, respectively [40,41,80].Interestingly, a similar phenomenon was found in our work, where arahy.1LT7×4and several genes involved in fatty acid biosynthesis were shown to co-express within the yellow module.The results suggest that AP2-like ethylene-responsive TF may influence seed oil content by regulating downstream target genes involved in fatty acid biosynthesis in plants.

Genes Affecting Both Seed Oil Content and Seed Size/Weight
In our work, several DEGs were found to potentially contribute to both seed oil content and seed size/weight, including AP2/EREBP and MYB TFs.AP2 family members have been reported to regulate ovule development and fatty acid synthesis in seed, thereby determining seed size/weight [7,81].Specific overexpression of the AP2/EREBP TF (BnWRI1) in B. napus led to enhanced oil content and seed size/weight in matured seeds [40].Additionally, overexpression of GmMYB73 could increase seed oil content and seed size/weight in transgenic Arabidopsis [5].These results suggest that AP2/EREBP and MYB TFs may not only increase seed oil content but also seed size/weight.

Conclusions
In this study, we performed comparative transcriptome analysis on two peanut cultivars with contrasting seed size/weight and oil content to identify the key candidate genes determining these two important traits.Our results revealed that a number of crucial genes, including those encoding TFs, phytohormones, ubiquitin-mediated proteolysis related proteins, cytochrome P450 proteins, and ABC transporters, play a crucial role in regulating seed size/weight.Several genes involved in fatty acid biosynthesis were detected to contribute to seed oil biosynthesis.Additionally, some genes may both affect oil content and seed size, such as AP2/EREBP and MYB TFs.Taken together, this study provides comprehensive information underlying seed size/weight and oil content and potentially serves as a beneficial resource for genetic breeding to develop peanut cultivars with enhanced yield and quality.

Supplementary Materials:
The following supporting information can be downloaded at: https:// www.mdpi.com/article/10.3390/plants12173144/s1,Supplementary Figure S1.Correlation statistics and number of expressed genes at different stages of seed development between ZH16 and ZH6.(A) Spearman correlation coefficient (SCC) of RNA-seq data.(B) Bar plot showing number of expressed

Figure 2 .
Figure 2. Global gene expression profiles in ZH16 and ZH6.(A) PCA plot showing clustering of gene transcript levels at four stages of seed development in ZH16 and ZH6.(B) Proportion of genes with different expression levels (based on FPKM).(C,D) Venn diagrams of expressed genes amongfour stages of seed development in cultivars ZH16 (C) and ZH6 (D).

Figure 2 .
Figure 2. Global gene expression profiles in ZH16 and ZH6.(A) PCA plot showing clustering of gene transcript levels at four stages of seed development in ZH16 and ZH6.(B) Proportion of genes with different expression levels (based on FPKM).(C,D) Venn diagrams of expressed genes amongfour stages of seed development in cultivars ZH16 (C) and ZH6 (D).

Figure 3 .
Figure 3. Genes with stage-specific expression during seed development in peanut cultivars ZH16 and ZH6.(A) Bar graph showing numbers of stage-specific expressed genes specifically or commonly present in ZH16 and/or ZH6 at each stage of seed development.(B) Heatmap showing the expression of common stage-specific expressed genes at different stages in ZH16 and ZH6.Color scale represents Z-score.(C,D) Enriched functional GO terms (biological process) of common stagespecific expressed genes in two cultivars at the S1 (C) and S2 (D) stages.

Figure 3 .
Figure 3. Genes with stage-specific expression during seed development in peanut cultivars ZH16 and ZH6.(A) Bar graph showing numbers of stage-specific expressed genes specifically or commonly present in ZH16 and/or ZH6 at each stage of seed development.(B) Heatmap showing the expression of common stage-specific expressed genes at different stages in ZH16 and ZH6.Color scale represents Z-score.(C,D) Enriched functional GO terms (biological process) of common stage-specific expressed genes in two cultivars at the S1 (C) and S2 (D) stages.

Figure 4 .
Figure 4. DEGs between ZH16 and ZH6 at different seed developmental stages.(A) Number of upregulated and down-regulated genes.(B) Distribution of Log2FC values of DEGs.(C,D) Venn diagrams showing numbers of DEGs concurrently or specifically expressed among four stages.(E) Enriched GO terms (biological process) of up-and down-regulated genes.The color scale represents significance (corrected p-value).

Figure 4 .
Figure 4. DEGs between ZH16 and ZH6 at different seed developmental stages.(A) Number of up-regulated and down-regulated genes.(B) Distribution of Log 2 FC values of DEGs.(C,D) Venn diagrams showing numbers of DEGs concurrently or specifically expressed among four stages.(E) Enriched GO terms (biological process) of up-and down-regulated genes.The color scale represents significance (corrected p-value).

Plants 2023 , 20 Figure 5 .
Figure 5. WGCNA of DEGs between ZH16 and ZH6 at each seed developmental stage.(A) Modulesample relationships.The number of genes within key modules is indicated next to the module name.Color bars represent negative (blue) and positive (red) correlations.(B-D) Expression patterns of DEGs in magenta (B), yellow (C), and red (D) modules.

Figure 5 .
Figure 5. WGCNA of DEGs between ZH16 and ZH6 at each seed developmental stage.(A) Modulesample relationships.The number of genes within key modules is indicated next to the module name.Color bars represent negative (blue) and positive (red) correlations.(B-D) Expression patterns of DEGs in magenta (B), yellow (C), and red (D) modules.

Figure 6 .
Figure 6.Heatmap and co-expression networks of DEGs within three key modules.(A,C,E) Heatmaps of hub genes.Genes overlapping with reported QTLs are marked in red.(B,D,F) Co-expression networks.Purple, blue, orange, and red nodes represent hub genes involved in TFs, phytohormones, the ubiquitin-proteasome pathway, and fatty acid synthesis, respectively.Node size represents connectivity.