Screening of Grain Development Heterosis Candidate Genes by Integrating QTL Mapping and RNA-Seq in Super Hybrid Rice WFYT025

The application of heterosis during plant breeding increases rice grain yield. However, there have been limited studies on heterosis during rice grain development during the grain-filling stage; therefore, the genetic basis of heterosis for grain development during the grain-filling stage should be highly valued. In this study, a hybrid combination with the super hybrid rice WFYT025 was used to perform a transcriptomic dynamic analysis in grains at the beginning and middle grain-filling stages. A total of 1556 and 1507 transcripts that were differentially expressed between WFYT025 and its parents (DGHP) were identified at 1-day post-anthesis (DPA) and at 10 DPA, respectively. The analysis of the genetic effects of heterosis showed that the over-dominant effect (66.90% and 55.87%) was the main mode of action during grain development. The KEGG pathway and GO analysis of the DGHP indicated that the gibberellin biosynthetic, starch metabolic, and diterpenoid biosynthetic signaling pathways may be associated with heterosis during grain development. To further explore the candidate genes for grain development heterosis, a recombinant inbred line (RILs) population with a high-density genetic map of 2578 bin markers was constructed by crossing the parents of WFYT025, and nine stable QTLs for grain weight-related traits were identified. By comparing the DGHP with 20 QTLs, LOC_Os02g28820, LOC_Os02g32580, LOC_Os04g25440, and LOC_Os12g04980 were identified as grain development heterosis-related candidate genes. These findings provide resources for the study of heterosis during the grain development of super hybrid rice and provide valuable theoretical references for the cloning and functional analysis of heterosis-related genes.


Introduction
Heterosis is a universal genetic phenomenon in which F1 hybrids exhibit superior phenotypes relative to their parents, including improvements in individual size, reproductive capacity, environmental adaptability, stress tolerance, and grain yield [1]. Heterosis has been widely used throughout the world to significantly increase crop productivity, including in rice (e.g., "super hybrid rice") [2]. The utilization of crop heterosis has greatly contributed to ensuring food security around the world [3]. However, our understanding of the genetic mechanisms of heterosis lags behind its widespread application. At present, there are three classical hypotheses to explain the formation of hybrid rice heterosis: The dominant hypothesis [4], the over-dominant hypothesis [5], and the epistasis hypothesis [6,7], all of which consider that the genetic differences between pure and heterozygous parents are not a simple additive effect. Although these hypotheses have certain weaknesses and cannot fully explain the molecular formation mechanism of heterosis, these hypotheses are not contradictory, and the heterosis formation may originate from a combination of these effects [8]. With the development of molecular marker technology and QTL mapping, geneticists use population genetic analysis combined with molecular markers for heterosis QTL mapping. Even so, the division of dominant groups and genetic diversity can only be inferred from the genetic differences between the parents and cannot essentially explain the heterosis that takes place between the hybrid offspring and the parents [9].
To resolve the formation of heterosis in hybrid rice, researchers have used omics methods to mine a batch of genetic loci related to heterosis. The development of RNA sequencing (RNA-seq) technology allows for the unbiased and highly reproducible deep sequencing of all transcripts in a specific tissue or cell and has been widely used to study heterosis in Arabidopsis, rice, maize, and other crops. Zhai et al. [10] used the RNA-seq technique to perform a comparative transcriptome analysis of the roots of the super rice "Xieyou" 9308 hybrid combination at the tillering and heading stages and found that 66.59% of the DEGs between the hybrids and their parents were down-regulated at the tillering stage and 64.41% were up-regulated at the heading stage, and these DGHPs were significantly enriched in the pathways related to processes such as carbohydrate metabolism and phytohormone signal transduction. Shao et al. [11] performed a transcriptome sequencing analysis of "Shanyou 63" and its parental buds, flag leaves, and panicles and found that 3270 genes in "Shanyou 63" showed allele-specific expression. These genes could be classified into two patterns: Consistent and inconsistent allele-specific expression genes. Consistent allele-specific expression genes may have partial or complete dominance effects on the traits they regulate, whereas discordant allele-specific expression genes may lead to over-dominance. Wei et al. [12] used the genomic microarray method to measure and compare the transcriptomes of LYP9 and its parents in the leaves and spikes at different developmental stages and found that 3926 DEGs were enriched in the energy metabolism and transport-like biological pathways. Li et al. [13] comprehensively analyzed the yield heterosis of two-line hybrid rice LYP9 through QTL mapping and omics and identified several genes affecting yield heterosis. When studying the heterosis of hybrid rice, QTL mapping technology does not provide direct access to the actual gene expression situation, so RNA-seq technology can further reveal the importance of differential gene expression to reveal the heterosis mechanism [14,15].
The rice yield component, which refers to the weight of all of the seeds of a single rice plant, consists of three elements: The number of grains per panicle, the number of panicles per plant, and the 1000-grain weight (TGW) [16]. Yield improvement is closely related to the improvement in the three elements mentioned above, and the utilization of heterotic genes related to these traits has the potential to significantly improve grain yield [17,18]. In recent years, scientists have conducted research on the three elements of rice yield [19][20][21][22]. For example, Shao et al. [11] used RNA-seq technology to study the heterosis of the young panicle tissues of a "Shanyou 63" hybrid combination at four stages and found that the differential expression of the alleles from both parents might be the cause of the rice spike number hybrid advantage; Liang et al. [21] sequenced the transcriptomes of young panicles at different stages of the booting stage in the super-large japonica rice TD70 and in the small-grain indica rice Kasalath and found that for TGW, these DEGs are mainly enriched in signal transduction, cell division, and material transport. All of these studies provide a theoretical basis for the effective utilization of these rice yield-related genes in molecular breeding. However, there are few reports on heterosis in the TGW of hybrid rice. The TGW of rice is related to the grain shape (grain length, grain width, and grain thickness), and the formation of TGW mainly takes place during the grain-filling stage because the nutrients in the leaves are transported to the rice seeds during the filling stage to enrich the seeds. Thus, understanding the genetic and molecular basis of grain development heterosis in rice is extremely important for improving rice yield.
To examine the association of key transcripts with heterosis in grain development, we used RNA-Seq to perform transcriptomic dynamic analysis of grains at the beginning and middle grain-filling stages in a famous super hybrid rice, WFYT025, in China. Meanwhile, the RIL population of 126 lines constructed by the two parents of WFYT025 was used to conduct QTL mapping for the TGW-related traits in two different environments. Some of the key transcripts between the parents and WFYT025 were QTL mapped and related to TGW-related traits, providing a valuable theoretical reference for the cloning and functional analysis of these grain development-related heterosis genes.

Plant Materials and Growth Conditions
WFYT025 is a super hybrid late rice variety that is widely planted in the middle and lower reaches of the Yangtze River [23]. The 126 RIL population was developed by crossing WFB and CHT025 using the single-seed descendant method [24]. The plants were cultivated at the experimental field of Jiangxi Agricultural University, Nanchang province (28 • 45 N, 115 • 50 E; long-day conditions) in 2018 and were planted in Hainan province (18 • 14 N, 109 • 31 E; short-day conditions) in 2019. In this experiment, the marked grains were sampled on the 1, 5, 10, 15, and 21 DPA, and the fresh thousand-grain weight of the WFB, WFYT025, and CHT025 rice were recorded at each stage. By measuring the fresh thousand-grain weight, we were able to select the parent grains and the hybrid WFYT025 grains for RNA-Seq analysis at days 1 and 10, and each sample had at least three biological replications to minimize systematic errors.

Construction of Genetic Linkage Map and QTL Analysis
The extraction of the genomic rice DNA was performed according to the CTAB method [25]. After DNA quality testing, the genomic sequence was randomly interrupted by the EcorI enzyme, and joints were added to the ends of the DNA to make~300 bp libraries. MGISEQ-2000 BGI equipment was used to sequence the constructed library, and the reads from 128 samples were aligned to the Nihon Haru reference genome. Using GATK3 software [26], high-quality mapped reads from 128 samples were used as the INPUT for variant detection, and this initial variant locus dataset was filtered to include a required sequencing depth of at least 5× per locus; loci with an error rate greater than 0.4 were removed, i.e., for a variant locus, at least 60% of the samples were correct. Finally, the obtained SNPs were classified, i.e., SNPs with polymorphisms between the two parents were classified as type a (paternal genotype) and type b (maternal genotype), respectively. Based on the SNP genotypes, we placed the co-segregated SNPs, i.e., the SNPs with the same genotype between the adjacent SNPs, in the same interval and formed a cluster, which was defined as a bin, and each bin required SNPs that were greater than or equal in size to 0.5/kb; otherwise, this bin window was ignored. A high-linkage map of the RIL population derived from the cross between WFB and CHT025 was constructed using the obtained bin tags and Join Map 4.2 software (Supplementary Materials Table S7). QTL IciMapping4.1 software was used to analyze the QTLs related to the grain shape and weight traits using the inclusive composite interval mapping (ICIM) model, and based on a 1000-permutation test, the LOD threshold was 3.0. The QTL naming rules described by Mc. Couch et al. [27] were used with minor modifications.

Rice Grains Heterosis Measurements
To determine the TGW, grain length, grain width, and grain thickness, the harvested panicles were dried in an oven at 37 • C for one week. The grain width, grain thickness, and grain length were measured manually. Mid-parent heterosis (MPH) and higherparent heterosis (HPH) were calculated for these traits according to the following formulas: MPH = (F1 − MP)/MP and HPH = (F1 − HP)/HP, where F1 is the performance of the hybrid, MP is the average performance of the two parents, and HP is the performance of the higher parents. Hypothesis testing was performed using a t-test [19].

RNA Extraction, cDNA Library Preparation, and Sequencing
Total RNA was extracted from the grains of young ears of rice using the Trizol reagent (Invitrogen, Carlsbad, CA, USA) following the manufacturer's instructions. The total RNA quantity and purity were analyzed with the Bioanalyzer 2100 and RNA 1000 Nano Lab Chip Kit (Agilent, Santa Clara, CA, USA) with the RIN number > 7.0. Poly(A) RNA was purified from the total RNA(5ug) using poly-T oligo-attached magnetic beads and two rounds of purification. Following purification, the mRNA was fragmented into small pieces using divalent cations under elevated temperatures. Then, the cleaved RNA fragments were reverse-transcribed to create the final cDNA library according to the protocols of the RNA-Seq sample preparation kit (Illumina, San Diego, CA, USA), and the average insert size for the paired-end libraries was 300 bp (±50 bp). Additionally, we performed paired-end sequencing on an Illumina Hiseq4000 (LC Sciences, Houston, TX, USA) following the vendor's recommended protocols [28]. In addition, the sequencing data were uploaded to the NCBI Sequence Read Archive under BioProject PRJNA781194.

Normalization of Gene Expression Levels and Identification of Differentially Expressed Genes (DEGs)
The method from Li and Dewey [29] was used to reconstruct the transcript, and RSEM was used to calculate the gene expression in each sample. For gene expression, the FPKM (Fragments Per Kilobase of transcript per Million mapped reads) method was used to calculate the FPKM of each gene, according to the length of the gene, and map it to the gene. The DEGs between parents were annotated as "DGPP", and the DEGs between parents and progeny were annotated as "DGHP" [10]. PCA (Principal Component Analysis) was conducted between samples to understand the reproducibility between samples. According to the results of the difference analysis, the genes were screened, and a False Discovery Rate < 0.05 and a log2FC > 2 were determined to be significant for different genes. Differential enrichment analyses, including Gene Ontology and KEGG pathway enrichment analysis, enrichment analysis using the cluster profile R software package (edgeR), and correct gene length deviations, were performed on the genes that were determined to be significantly different from each other. Q-value (correction p-value) GO items with values of less than 0.05 are significantly enriched in differentially expressed genes. Using the short time-series Expression Miner [30] software, all of the expression level files for all of the differential genes should be input, and then the trend-analysis parameters should be selected from the same sample at different developmental times.

The Mode of Inheritance Analysis
For statistical analysis, the analysis of variance (ANOVA) was usually determined using the following model: y = u + (GA) + (GD) + (SR) + e, where y is the acquired gene expression, u is the overall mean, GA is the additive effect, GD is the dominant effect, SR is the replication effect, and e is the residual error [31]. Hp = d/a refers to the dominance ratio or potency (where a and d represent GA and GD, respectively), and was also calculated to measure the non-additivity of the F1 hybrid relative to its parents. Considering gene expression levels as quantitative traits, we adopted traditional quantitative genetic parameters, such as composite additive effect, a, and combined dominance effect, d, to estimate our expression profile. DGHPs were classified according to the dominance ratio Hp (= d/a) based on 99.8% confidence intervals constructed for d − a (d > 0) and d + a (d < 0). To avoid troublesome statistical properties, according to the Hp value (=d/a), we determined whether these genes belonged to the partial dominance (−0.8 < Hp ≤ −0.2 or 0.2 < Hp ≤ 0.8), over-dominance (Hp ≤ −1.2 or Hp > 1.2), dominance (−1.2 < Hp ≤ −0.8 or 0.8 < Hp ≤ 1.2), or additive effect (−0.2 < Hp ≤ 0.2) categories [32].

Real-Time Quantitative PCR (qRT-PCR)
To validate the RNA-seq results, the different expression patterns of several genes were confirmed by qRT-PCR. For qRT-PCR, 1 µg of total RNA was used to synthesize the cDNA using a Prime Script TM RT reagent Kit (Perfect Real Time) (TaKaRa). qRT-PCR was carried out using SYBR ® Premix Ex Taq II (Tli RNaseH Plus; TAKARA BIO Inc., Shiga, Japan) and determined in Light Cycler 480 (Roche, Basel, Switzerland) according to the manufacturer's instructions. The qRT-PCR reactions were amplified at 95 • C for 30 s followed by 40 cycles of 95 • C for 5 s, 55 • C for 30 s, and 72 • C for 30 s. All reactions were performed with three independent biological replicates for each sample, and three technical replicates for each biological replicate were analyzed. The relative gene expression was calculated with ABI7500 Real-Time PCR System software using the 2 −∆∆Ct method. The detection of the threshold cycle for each reaction was normalized against the expression level of the rice Actin gene with the primer sequences 5 -TGGCATCTCTCAGCACATTCC-3 , and 5 -TGCACAATGGATGGGTCAGA-3 [19].

Phenotype Analysis of WFYT025 and Its Parents
We compared the grain shape-related traits of WFYT025 and its parents ( Figure 1). Compared to WFB, WFYT025 had a longer panicle length, a similar grain shape (short, wide, and thick), and a TGW (Supplementary Materials Table S1). Compared to CHT025, WFYT025 had a similar panicle length, a shorter, wider, and thicker grain shape, a higher TGW, and a shorter grain length-to-width ratio. To explore the starch synthesis and accumulation process in the grain endosperm during the grain-filling stage, we investigated the thousand-grain fresh weight of the parental and progeny rice grains at 1, 5, 10, 15, and 21 DPA in Nanchang in 2018 (Figure 2a). The phenotype analysis showed that the husked 1000-grain weight of WFYT025 was significantly greater than that of CHT025 at all stages. The fresh weight of WFYT025 was similar to its parents at 1 DPA, but differences started to appear at 5 and 10 DPA (Figure 2b). Table 1 shows the degree of heterosis in related traits between WFYT025 and its parents. Most of the phenotype parameters of WFYT025 show MPH, although a few features show high-parent heterosis HPH. Notably, the TGW of WFYT025 showed a significant MPH (12.65%, p < 0.01), and the grain yield also showed significant HPH (18.99%, p < 0.01).

Identification of Transcripts by Sequencing
A total of 1748 million reads, with an average of 43.06 million reads per sample, and three biological replicates had similar gene expression levels (0.84 < R2 < 0.99) (Figure 3a). Short reads were pooled and aligned with the Nipponbare reference genome (IRGSP build 5.0). The expression levels of a total of 42,004 transcripts were obtained: 95.44-96.41% of the reads were mapped to exons, 1.79-2.88% were mapped to introns, and 1.66-2.02% were mapped to intergenic regions (Supplementary Materials Table S2). Moreover, the results of the principal component analysis (PCA) demonstrated that the three replicates of two developmental stages can be distinguished from each other (Figure 3b). In addition, the transcriptome profile of WFYT025 was significantly different from that of the two parents ( Figure 4a).    We further analyzed the DEGs via pairwise comparisons between WFYT025 and the parents at two stages of grain development. At 1 and 10 DPA, 2289 and 1995 reliable DEGs were identified, respectively (Figure 3c). The FPKM of all of the transcripts is shown in Table S3 in the Supplementary Materials. In total, 1556 and 1507 DGHPs were identified at 1 and 10 DPA, respectively. Of the 1556 DGHPs identified at 1 DPA, 961 transcripts showed differences between CHT025 and WFYT025, whereas only 793 exhibited differences between WFB and WFYT025 (Figure 3d). At 10 DPA, 1089 transcripts showed differences between CHT025 and WFYT025, whereas only 522 transcripts exhibited differences between WFB and WFYT025 (Figure 3f). To investigate the accuracy and reproducibility of the RNA-seq data, we amplified 12 genes with qRT-PCR using specific primers (Supplementary Materials Table S4). The qRT-PCR findings for the 12 selected genes were consistent with the RNA-seq results, revealing the high accuracy and reliability of our RNA-seq results (Figure 4b).

The Mode of Inheritance for DGHP
Using the method to evaluate the mode of inheritance, the dominance values were calculated for the DGHP between the two parents at 1 and 10 DPA  (Table 2). Among the various genetic effects of the DGHPs, the over-dominance effect accounts for the largest proportion, followed by the dominance, partial dominance, and additive effects.

Functional Classification for DGHP by Gene Ontology (GO)
We applied GO to classify the function of the mRNA [33]. The GO terms in the molecular function, cellular composition, and biological process functions were analyzed (Figure 5a). Within the biological process category, we looked for over-expressed GO terms (p < 0.05) that could represent a differential grain development process between WFYT025 and its parents. At 1 DPA, the GO terms of 1556 DGHPs primarily showed negative regulation of the starch metabolic processes, serine-type endopeptidase inhibitor activity, response to abscisic acid, meiotic chromosome segregation, seed maturation, long-day photoperiodism regulation, flowering, beta-amylase activity, and protein serine/threonine phosphatase activity. At 10 DPA, the GO terms of 1507 DGHPs were primarily enriched in the 1,3-beta-D-glucan synthase complex, gibberellin biosynthetic process, gibberellic acid-mediated signaling pathway, pectin esterase inhibitor activity, and diterpenoid biosynthetic process. Heat map analysis of the expression of the genes involved in the gibberellin pathway, starch metabolism pathway, and decelerative division pathway revealed significant differences in the expression of OsGA20ox4, OsGA2ox5, OsGA2ox6, OsBZR1, Os-AGPL1, OsAGPL2, OsMSH5, OsDMC1A, and OsDMC1B between WFYT025 and the parents (Supplementary Materials Table S7). Therefore, the differences in the expression levels of these genes, which are involved in important pathways, are likely to be closely related to the heterosis of grain development in the super hybrid rice WFYT025.

Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway for DGHP
Functional enrichment analysis was performed on the DGHPs using the KEGG. At 1 DPA, 112 pathways were identified among 1556 DGHPs, with the most enriched pathways including the ribosome, starch, and sucrose metabolism; plant hormone signal transduction; fatty acid degradation and biosynthesis; glycolysis/gluconeogenesis; and glycosphingolipid biosynthesis pathways, among others (Figure 5b).
At 10 DPA, 118 pathways were identified among 1507 DGHPs, with the most enriched pathways including the glycerophospholipid metabolism, starch, and sucrose metabolism; diterpenoid biosynthesis; plant-pathogen interaction; taurine and hypotaurine metabolism; pentose and glucuronate interconversions' and amino-sugar and nucleotidesugar metabolism pathways, among others ( Figure 5b). Overall, we found that the DGHPs were primarily enriched in the starch and sucrose metabolism, fatty acid biosynthesis, diterpene biosynthesis, and interconversion of pentose and glucuronic acid pathways (Supplementary Materials Table S8).

Population Sequencing, Linkage Construction
The sequenced data were analyzed, and 701,310 high-quality SNPs were obtained (Figure 6a,b). The obtained SNPs were bin-converted, and after rigorous screening, 2578 bin markers were obtained that were evenly distributed on 12 chromosomes and that contained 214.8 bin markers per chromosome on average (Supplementary Materials Table S9). Among them, the lowest marker was chromosome 8, which had 159 bin markers, and the highest marker was chromosome 1, with 318 bin markers (Figure 6c). A high-density genetic map was constructed using these bin markers to analyze 126 lines. Among them, the average distance between two neighboring markers was 0.70 Mb; the smallest average distance on chromosome 12 was 0.58 Mb, and the largest average distance on chromosome 8 was 0.92 Mb (Figure 6d).

TGW Phenotypic Correlation Analysis and QTL Mapping
The grain length, grain width, grain thickness, and TGW of the two parents were significantly or extremely significantly different in the two environments (Supplementary Materials Table S10). In the RIL population, the grain length, grain width, and TGW were highly variable and demonstrated continuous variation in the phenotypic values. The correlation analysis of the grain shape traits showed that the TGW was extremely significantly correlated with the grain thickness and the length-width ratio (Supplementary Materials Table S11).
To elucidate the genetic mechanism underlying the heterosis of the grain phenotype, we combined the high-density linkage map and the grain shape data obtained from the RIL population. Using the inclusive composite interval mapping (ICIM) method and QTL IciMapping 4.1 software, 20 QTLs related to grain weight were detected (Figure 7). These QTLs were distributed across ten chromosomes, with limit of detection (LOD) values ranging from 3.06 to 13.67 and phenotype variation explanatory values ranging from 0.73% to 29.42% (Table 3). Between the two environments, 12 and 17 QTL were detected, respectively, with 9 QTL being detected consistently. The genes in the QTL interval that were able to be stably detected in the two environments constitute the critical heterosis genes controlling grain development-related traits such as the grain length, grain width, grain thickness, and TGW between WFYT025 and its parents.

Screening for Heterosis Candidate Genes That Grain Development by the Integration of QTL Mapping and RNA-seq
To screen for candidate heterosis genes, a summary of the genes that overlap between the QTLs and RNA-Seq analysis is provided (Figure 8a). Based on the gene annotation of the reference rice genome (Oryza sativa v7.0), the 10 DGHPs in 1 DPA overlapped with the 8 stably expressed QTLs (qGW2, qGW4, qGW7, qLW1, qGL11, qTGW1, qGT2/qTGW2.1, and qTGW12) (Figure 8b). However, 7 DGHPs in 10 DPA overlapped with 7 QTL (qGW1, qGW4, qLW1, qGT2/qTGW2.1, qTGW5, and qTGW12) (Figure 8c). Of these candidate genes, five are associated with transferases/hydrolases, including LOC_Os01g11670 (homolog of serine carboxypeptidase OsSCP2), LOC_Os01g11660 (GDSL-like lipase/acetylhydrolase), LOC_Os01g05840 (oxidoreductase), LOC_Os04g25440 (cytokinin-O-glucosyltransferase 2), and LOC_Os12g04980 (meiosis-specific DNA recombinase); eight genes were associated with protein activity, including LOC_Os01g05860 (PPR repeat structural domain protein), LOC_Os01g08540 (DNA mismatch repair protein), LOC_Os02g28820 (expressed protein), LOC_Os02g32580 (expressed protein), LOC_Os05g12330 (expressed protein), LOC_Os0-5g12330 (uncharacterized protein), and LOC_Os11g05700 (ABC transporter protein); and one candidate gene was associated with heat shock transcription factor (LOC_Os02g32590) ( Table 4).  The expression levels of 13 candidate genes were verified by qRT-PCR at both periods ( Figure S1a,b). The results showed that the expression levels of most candidate genes showed the same trends as they did in WFYT025 and WFB. There was a significant difference in the expression of four candidate genes (LOC_Os02g28820, LOC_Os02g32580, LOC_Os04g25440, and LOC_Os12g04980) in both WFYT025 and its parents at 1and 10 DPA. Interestingly, the expression levels of these four candidate genes showed opposite trends at the beginning and middle grain-filling stages. Therefore, the formation of grain weight heterosis in WFYT025 during grain development may be due to the differences in the expression levels of multiple genes at different stages. The coordinated processes and possible expression networks among these genes are of great significance for further explorations into the molecular mechanisms of heterosis in rice grain development.

Discussion
Although heterosis has been widely used in agricultural plant breeding, the molecular and genetic basis of this phenomenon is still poorly understood. In previous studies [34,35], it has been demonstrated that the differential gene expression between a hybrid and its parents may be associated with heterosis. The phenotype analysis of the yield showed that the WFYT025's high yield is derived from the complementation of superior bi-parental traits [19]. In this study, we used RNA-Seq and QTL mapping to investigate the relationship between the DEGs and heterosis in the super hybrid rice WFYT025 to further elucidate the regulatory mechanism of differential gene expression.

Comparative Analysis of DGHP
The 1748 million high-quality 150 bp paired-end reads detected in the grain-filling stages were genetically graded, and 42,004 annotated transcripts were identified. To manifest the transcriptome characteristics between the parents and the hybrids, we further compared the transcriptome of WFYT025 with that of the parents at 1 DPA and 10 DPA. In addition, grain weight heterosis exhibited significant differences between WFYT025 and CHT025; however, there were no significant differences between WFYT025 and WFB ( Figure 1b, Table 1). Compared to the DGHP between WFYT025 and WFB, the DGHP expression between WFYT025 and CHT025 at the grain-filling stage may play a key role in grain weight heterosis. The DGHPs of WFYT025 and CHT025 were significantly enriched in sugar transmembrane transporter protein activity, β-amylase activity, O-glycosyl compound hydrolysis, and protein serine/threonine phosphatase activity at the beginning of the grain-filling stage. In the middle of the grain-filling stage, the DGHPs showed significant enrichment in the diterpene biosynthesis process and in the gibberellic acid-mediated signaling pathway. Therefore, it is important to focus on the DGHP-enriched metabolic pathways between CHT025 and WFYT025 if we are to further analyze the molecular mechanism of grain development heterosis in the super hybrid rice WFYT025.

The Role of Hormone Signal Transduction in Heterosis
As we all know, hormones play an important role in the growth and development of plants, and hormone signals have a very efficient regulatory effect on various physiological activities of the plant itself [36]. In this study, through the analysis of the WFYT025 and parental transcriptome data, it was found that, at the beginning of the grain-filling stage, a large number of the DGHPs that overlap between WFT025 and its parents are related to hormone transduction. For example, the gene TGW6, which encodes the IAA-glucose hydrolase, plays an important role in controlling the thousand-grain weight and grainfilling rate of rice. It mainly hydrolyzes IAA-glucose into a free state and regulates the early endosperm development process by supplying IAA [37]. At the beginning of the grain-filling stage, the TGW6 expression in WFYT025 may be higher than that of its parents, which, in turn, promotes the development of endosperm and leads to an increase in the TGW and grain-filling rate.
In addition to the auxin metabolism pathway, the expression levels of the receptor GID1 synthesis-related genes (LOC_Os09g28620, LOC_Os09g28640, LOC_Os09g28660, and LOC_Os09g28730) involved in gibberellin biosynthesis were also significantly different between WFYT025 and the parents. In previous studies, gibberellin has been shown to belong to a tetracyclic diterpenoid plant hormone, which regulates various stages of plant growth and development, such as the bloom of flowers and the development of fruits and seeds [38,39]. The components in the GA signal transduction pathway mainly include receptors; DELLA proteins, which play a key regulatory role; and other regulatory factors that mediate the degradation of the DELLA proteins. The N-terminal of the GA receptor GID1 has a flexible N-Ex structure. When there is a high level of GA, GID1 can sense and bind to the GA signal, triggering a conformational change in the C-terminal domain of GID1, thereby creating the N-terminal. The Ex-structure seals GA on the GID1 binding site to form a GA-GID1 complex [40]. Therefore, the increase in the expression of the gene encoding GID1 in the offspring variety WFYT025 helps to increase the rate of gibberellin synthesis, which, in turn, promotes rapid grain development, leading to significant differences in their TGW and grain shape. In summary, hormone signal transduction plays a very important role in the formation of heterosis for grain development in the super rice WFYT025.

Comparison of Candidate QTL with Reported QTL
Rice grain weight is a complex genetic trait that is regulated by multiple genes and pathways. Grain weight and shape play an essential role in determining rice quality and yield. A QTL, qGL8, which is found on chromosome 8, overlapped OsFIE2, which has specific histone H3 methyltransferase activity and is responsible for the formation of trimethylation on the 27th lysine of histone H3 and regulates rice seed development and grain-filling ability. Decreased OsFEI2 expression leads to smaller seeds, insufficient grain-filling ability, and interrupted dormancy [41,42]. A QTL, qTGW12, which is found on chromosome 12, overlapped OsPK4, which affects pyruvate kinase activity, regulates the transport of sucrose from the leaves to the grains, and affects grain filling in rice [43]. OsPK1 and OsPK4, which are located in the mitochondrion and cytoplasm, respectively, can be recruited to the mitochondria by OsPK3, where it interacts with these two isoenzymes to form stage-specific heterodimers to affect grain filling, grain thickness, and the TGW. To our knowledge, other QTLs (qTGW1, qGW7, qTGW9, qGL11, etc.) were identified that have not been previously reported, suggesting that these may be novel QTLs.

The Significant DGHP Related to Grain Weight QTL
To further characterize the target heterosis genes, we compared 20 QTLs with the DGHPs between parents and progeny. There were 13 candidate genes that overlapped with 10 QTL intervals (Figure 8b,c), suggesting that these genes may play a key role in the formation of grain-weight heterosis in WFYT025. Thirteen candidate genes were found to be involved in a variety of biological processes, including lipid catabolism, serine carbohydrate activity, rice meiosis, and cytokines-o-glucosyltransferase processes, such as LOC_Os01g11670 (serine carboxypeptidase), LOC_Os01g11660 (lipid catabolism), LOC_Os12g04980 (Homologous pairing aberration in rice meiosis; meiosis-specific DNA recombinase), and LOC_Os04g25440 (glucosyl transfer), among others (Table 4).
Previous studies have shown [44] that serine carboxypeptidase 46 interacts with the ABA-inducible protein DI19-1 to regulate grain filling and seed germination and may play a role in the ABA signaling pathway. The expression of LOC_Os01g11670 in WFYT025 and WFB was significantly higher than it was in CHT025 (Figure 8a,b); this may indicate that it reduces the sensitivity to ABA and indirectly modulates grain weight at the grain-filling stage. In addition, Guo et al. [45] compared the transcriptome differences of the novel tetraploid rice "Huaduo3", homozygous tetraploid rice, and different tissues from the hybrid F 1 , which was generated via transcriptome sequencing, and found eight differentially expressed genes related to meiosis, and these genes, which were determined to be involved in decelerative division-related genes, may be related to rice fertility and heterosis. The expression of LOC_Os12g04980 (OsDMC1A) in WFYT025 was significantly higher than that in WFB and CHT025 on 1 DPA. However, the expression of OsDMC1A in WFYT025 was significantly lower at 10 DPA than it was in WFB and CHT025 ( Figure S1a,b). The significant difference in its expression at different grain-filling stages is likely to be closely related to the heterosis formation in the grain weight of WFYT025. In addition, at both 1 and 10 DPA, the expression levels of four key candidate genes (LOC_Os02g28820, LOC_Os02g32580, LOC_Os04g25440, and LOC_Os12g04980) showed significant differences between the parents and progeny ( Figure S1a,b). Combined with the phenotype analysis, these results indicate that the heterosis of grain development is affected by the differential expression of multiple genes.

Supplementary Materials:
The following supporting information can be downloaded at: https:// www.mdpi.com/article/10.3390/agronomy12040835/s1, Table S1: Phenotypic statistics for WFYT025, CHT025, and WFB grain shapes. Table S2: Number of mapped reads. Table S3: The total number of genes expressed (based on FPKM) in WFYT025 and its parents. Table S4: Primer sequences for qRT-PCR expression analysis. Table S5: Classification of DGHPs based on the dominance ratio HP in 1 DPA. Table S6: Classification of DGHPs based on the dominance ratio HP in 10 DPA. Table S7: Significant GO terms of DGHPs between WFYT025 and its parents in the biological process, cellular component, and molecular function category. Table S8: KEGG pathway enrichment of DGHPs between WFYT025 and its parents on the 1 and 10 DPA. Table S9: High-density genetic linkage map with 2,578 bin markers. Table S10: Phenotype values of grain shape in RIL populations. Table S11: Correlation coefficient of grain shape and grain weight relate traits. Figure