RNA-seq Reveals Differentially Expressed Genes between Two indica Inbred Rice Genotypes Associated with Drought-Yield QTLs

Expressed Genes between Two Associated Abstract: Two indica inbred rice lines, IR64, a drought-sensitive, and Apo, a moderately drought-tolerant genotype, were exposed to non- (control or unstressed) and water-stress treatments. Leaf samples collected at an early ﬂowering stage were sequenced by RNA-seq. Reads generated were analyzed for di ﬀ erential expression (DE) implementing various models in baySeq to capture di ﬀ erences in genome-wide transcriptional response under contrasting water regimes. IR64, the drought-sensitive variety consistently exhibited a broader transcriptional response while Apo showed relatively modest transcriptional changes under water-stress conditions across all models implemented. Gene ontology (GO) and KEGG pathway analyses of genes revealed that IR64 showed enhancement of functions associated with signal transduction, protein binding and receptor activity. Apo uniquely showed signiﬁcant enrichment of genes associated with an oxygen binding function and peroxisome pathway. In general, IR64 exhibited more extensive molecular re-programming, presumably, a highly energy-demanding route to deal with the abiotic stress. Several of these di ﬀ erentially expressed genes (DEGs) were found to co-localize with QTL marker regions previously identiﬁed to be associated with drought-yield response, thus, are the most promising candidate genes for further studies.


Introduction
More than half of the world's population depends on rice. Most of these people live in Asia, where at least 90% of the world's rice is produced and consumed [1]. Increasing production has been the center of collaborative efforts and consortia to keep up with the global demands amidst changing climatic conditions, dwindling arable lands, and an increasing world population.
Among the abiotic stresses, drought is considered the most important limitation to rice production in rainfed lowlands and is estimated to affect at least 23M ha of rice area (20% of the total rice area) in Asia [2]. Drought, together with poor soil conditions, limits upland rice yield in over 19M ha representing 15% of the rice-growing area worldwide [3].
While rice is generally adapted to a well-watered or irrigated ecosystem, there are genetic variations against drought that have been observed in traditional and modern varieties [4,5]. These phenotypic variations provide the platform for biological investigation to shed hints on the underlying genetic mechanisms of a drought response. The advent of modern approaches on genomics such as high-throughput sequencing has revolutionized genetic and transcriptome analyses of important crops including rice. Most notably, the 30K rice genome sequencing project provided a plethora of resource information on its genetic variation, population structure and diversity [6].
Genome-wide expression analysis in response to either biotic or abiotic stress has been a subject of a number of research studies for several years before now. Drought tolerance, for example, has been extensively studied in rice (e.g., [7][8][9]). However, such a trait is challenging to work with as it is controlled by many genes-with both small and large effects. Zhou et al. [9], for example, demonstrated that the tolerance of several genotypes, e.g., N22 and Apo, was attributed to the enhanced expression of several enzyme-encoding genes, while the susceptibility of IR64 was ascribed to significant down-regulation of regulatory alleles.
Molecular markers, on the other hand, such as restriction fragment length polymorphism (RFLP), simple sequence repeats (SSR) and single nucleotide polymorphism (SNP), help to track the genetic loci controlling several traits through quantitative trait loci (QTL) mapping or genome-wide association studies (GWAS). Once identified, the QTL can be selected for breeding programs using marker-assisted selection (MAS) and other strategies. Several drought-response QTLs have been mapped in rice in a number of studies [10][11][12][13][14], mostly employing indica × japonica parental lines where the majority of the drought-tolerance traits were contributed by the japonica parent [14,15]. Recently, GWAS using image-based traits has identified OsPP15 (LOC_Os01g62760, protein phosphatase) to be associated with drought resistance [16].
In this study, Apo, a moderately drought-tolerant upland indica cultivar, and IR64, a droughtsusceptible lowland indica cultivar, were sequenced to identify their expression patterns after exposure to both non-and water-stress conditions. These two genotypes are well known for their contrasting sensitivity to drought at a vegetative stage under field conditions [17]. The use of two indica parents offers advantages since indica × japonica lines are extensively studied where both ecotypes are grown in entirely different environments; one allele may not be expressed in a particular ecosystem. Hence, it is desirable to look for genetic variations within indica ecotypes with contrasting response against drought conditions and map loci using the same lines. This approach would provide identification of loci or causative regions, which respond to varying water regimes between two highly genetically related inbred lines. Furthermore, indica accounts for more than 70% of the global rice production and is widely cultivated in China and Southeast Asia [18]. Therefore, such a study will provide valuable insights on the different molecular changes between indica inbred lines.
The present study generally aimed to identify DEGs between the two genotypes through transcriptome analysis under two contrasting water regimes: normal vs. water-stress. Eventually, we co-localized DEGs with previously identified drought-yield QTLs to shortlist potential candidate genes for further studies.

Materials and Method
In this paper, we performed differential expression (DE) analysis between two inbred rice lines, IR64 and Apo, using various models as described in baySeq [19,20]. The biological preparations of the materials and the bioinformatics pipeline (see SI 1 for commands used) are described below.

Dry-Down Experiment
Seeds of the parental inbred lines were pre-germinated for five to seven days in petri plates with sterilized filtered paper, moistened with distilled water. Germinated seeds were transferred in small plastic boxes for one week after which they were transplanted in pots filled with approximately 10 kg of soil mix (2 soil: 1 sand), adequately fertilized and grown under controlled conditions at Phytotron, IRRI at 30 • C temperature with 70% relative humidity ( Figure 1). Saturated soils in the pots were covered with white plastic covers, with an opening in the middle to facilitate planting. Feeder pipes Figure 1. Using large pots, rice plants inside the IRRI phytotron were exposed to either well-watered or water-stressed conditions. Inset: punctured conical tubes served as feeder pipes on each pot to deposit water.
In this study, non-stress or well-watered treatment served as the normal or control condition; water-stress or water-limiting condition as stressed treatment. These terms are used interchangeably in this paper. Furthermore, as a consequence of alternative splicing, several genes were identified to have one or more splice or transcript variants. Thus, genes are represented as gene models or isoforms as described in Michigan State University (MSU) Rice Genome Annotation located at http://rice.plantbiology.msu.edu/ [24].

RNA Extraction
At the end of the dry-down treatment, the flag leaf samples from each plant were collected between 0900 and 1100 and were immediately frozen using liquid N. RNA was extracted using the TRIzol method according to the instructions provided by the supplier (Invitrogen, San Diego, Calif., USA). RNA-seq libraries were prepared as described in Illumina's standard protocol for RNA-seq using the parental (IR64 and Apo) RNA samples from each treatment (non-and water-stress). Libraries were sequenced on Illumina GAIIx, generating a 38-bp read size for our first biological rep; 90-base paired end (PE) reads, for the second rep. We tested whether there was significant effect of these dissimilar read sizes in our analysis by generating M (log ratio) and A (mean average) or MA plots. Using large pots, rice plants inside the IRRI phytotron were exposed to either well-watered or water-stressed conditions. Inset: punctured conical tubes served as feeder pipes on each pot to deposit water.
All pots were irrigated twice daily to maintain the soil at saturation. The day before the start of progressive soil drying, soil in each pot was saturated. Stressed plants were allowed to drain overnight by loosening the base stoppers and were weighed early the next morning to get the saturated weight. Stress was imposed by initiating soil dry down protocol starting 10 days before heading.
Gradual dry down to 0.5 FTSW (fraction of transpirable soil water) was imposed and pots were maintained at this level until sampling [21][22][23]. No water was added back to the pot during dry down. All pots were weighed daily to account for volume of water lost and to ensure the stress level was reached.
In this study, non-stress or well-watered treatment served as the normal or control condition; water-stress or water-limiting condition as stressed treatment. These terms are used interchangeably in this paper. Furthermore, as a consequence of alternative splicing, several genes were identified to have one or more splice or transcript variants. Thus, genes are represented as gene models or isoforms as described in Michigan State University (MSU) Rice Genome Annotation located at http://rice.plantbiology.msu.edu/ [24].

RNA Extraction
At the end of the dry-down treatment, the flag leaf samples from each plant were collected between 0900 and 1100 and were immediately frozen using liquid N. RNA was extracted using the TRIzol method according to the instructions provided by the supplier (Invitrogen, San Diego, CA, USA). RNA-seq libraries were prepared as described in Illumina's standard protocol for RNA-seq using the parental (IR64 and Apo) RNA samples from each treatment (non-and water-stress). Libraries were sequenced on Illumina GAIIx, generating a 38-bp read size for our first biological rep; 90-base paired end (PE) reads, for the second rep. We tested whether there was significant effect of these dissimilar read sizes in our analysis by generating M (log ratio) and A (mean average) or MA plots.

Pre-Processing
Quality checking of PE reads was performed using FASTQC [25]. Reports generated from the FASTQC files indicated the absence of adapters, insignificant proportions of over-represented sequences and high base-quality sequences (Q ≤ 20). Therefore, no further processing steps were made. Reads were mapped via bowtie2 (parameters: -no-discordant) to the cDNA pseudomolecules of Oryza sativa indica 93-11 and Shuhui498 and the MSU v7 and International Rice Genome Sequencing Project (IRGSP) models of the japonica Nipponbare as the transcriptome references. Mapping was performed to generate Binary Alignment/Map files using SAMtools [26] (parameter: view -b). These are binary files which are compressed, thus, occupy smaller memory size and are easier for computers to work with. Using the same tool, reads with low mapping quality were removed using a modest filtering parameter (option: view -q 1). We then compared the percentage alignment of reads mapping to the various transcriptome references.

Read Count Quantification
Read count quantification after transcriptome mapping was performed using Salmon on alignment-based mode (options: -biasCorrect -useErrorModel) [27]. The annotations ("Name") and the number of reads ("NumReads") columns generated by Salmon were extracted and a count data matrix was created using R (v3.6.1; in Linux environment) [28].

Data Filtering and Normalization
Isoforms with low expression values (nearly zero row sums) in the data matrix were removed to decrease memory and increase calculation speed. Normalization of datasets based on library size was performed to make the data from different replicates and treatments more comparable. Library scaling factor was calculated using Trimmed Means of M-values (TMM) [29]. MA plots with Loess curves were created (SI 2) to determine whether normalization was effective. Loess curves indicated an adequate normalization procedure in our datasets.
Prior to DE analysis, Spearman's coefficient of correlations was calculated, and histograms and summary statistics before and after removal of low read counts were generated. Between Pearson and Spearman methods, we preferred the latter, a non-parametric rank-based metric, well-suited for non-normal distributions to calculate the coefficient of correlations since Pearson is heavily influenced by outliers, and RNAseq data is heavily skewed [30]. Additionally, Spearman was shown to perform best among tested correlation methods for identifying differential correlation [31].
In summary, we implemented several statistical filtering measures and parameters to minimize artifacts in identifying DEGs: (i) reads with low-mapping quality were removed, (ii) isoforms (i.e., rows) with low read counts in the data matrix were filtered out, (iii) datasets were normalized with respect to library size, (iv) Spearman's coefficient of correlation between replicates of the same sample were very high (0.94 and 0.95), and (v) in case of the dissimilar read sizes (i.e., 38-and 90-bp), tight and symmetrical MA plots were obtained.

Differential Expression Analysis
BaySeq was used to test DE following the instructions described in the paper by Hardcastle and Kelly [19] and as described in the vignettes [20,32]. Pairwise DE analysis was performed between samples exposed to non-and water-stress conditions for each genotype (i.e., Apo control vs. Apo stress; IR64 control vs. IR64 stress), to determine the effect of the treatment (water stress) to each of the genotypes. DE using more complex models was also determined to capture: (1) genes that do not change between the conditions nor vary between genotypes: we called it non-differential expression or NDE (IR64 non-stress, IR64 stress, Apo non-stress, Apo stress). Here, we assumed that all the samples belonged to the same group; (2) overall variations between IR64 and Apo due to their genotypic differences, across treatments: genotype DE or GDE (IR64 non-stress, IR64 stress vs. Apo non-stress, Apo stress). These are genes that differ between the two cultivars, under both treatments; (3) differences that arise between the two genotypes due to the stress: we called this drought DE or DDE, a three-way DE analysis (IR64 non-stress, Apo non-stress vs. IR64 stress vs. Apo stress). This captures variations between the two genotypes as a consequence of their exposure to water stress; and (4) residual differences among groups or RDE (IR64 non-stress vs. IR64 stress vs. Apo non-stress vs. Apo stress). These are differences across genotypes and treatments. See SI 2 for additional baySeq R commands and their explanations.
Bayseq [19,20] estimates posterior likelihoods of differential gene expression. For pairwise DE, the average of two replicates was obtained for each treatment and expression ratios were calculated as treatment/control (T/C) plus a pseudo-count of 1 to avoid 0 denominators. Log (base 2) ratio or fold-change (FC) was then computed. In the DE analysis between samples exposed to nonand water-stress treatments of the same genotype, an isoform (or a transcript variant) is said to be differentially expressed if it exhibits |log 2 FC| ≥ 1, false discovery rate (FDR) p-value correction < 0.05 [33], and an absolute value difference > 10 as was previously implemented [34].
For DE analysis using the other models described above (NDE, GDE, DDE, RDE), a gene is differentially expressed if it exhibits FDR-corrected p-value < 0.05. No residual differences (RDE) were detected in this study.
Kyoto Encyclopedia of Genes and Genomes (KEGG) [39] analysis was further performed to determine molecular interactions and relational networks among DEGs. Located at https://www. genome.jp/kaas-bin/ [40], the tool implements BLAST search program against the reference Oryza sativa japonica (RefSeq and RAPDB) as reference databases.

Co-Localization Analysis
A co-localization step was performed by aligning the positions of DEGs (described above) to previously identified SSR markers known to be involved with yield under drought in populations with the same parental background (IR64/Apo): F 3:5 [41] and F 2:3 [12]. The population tail ends (highest and lowest yield) of IR64/Apo F 3:5 Recombinant Inbred Lines (RILs) were selectively genotyped using SSR markers then we identified regions associated with drought-yield response [41]. On a separate study, Venuprasad et al. [12] identified markers responding to selection under water-limiting conditions using IR64/Apo F 2:3 RIL population. These SSR marker regions were anchored in the indica genome and their locations were estimated using Ensembl [42]. Using the same tool, their proximity with DEGs could be estimated. SSR markers were provided in a previous study [43]. Several genes were found to co-localize with the QTL markers at a physical distance of at most 240 kb, the estimated equivalent of 1 cM. The application ICIMapping [44] was used to generate the map based on Cornell and IRMI.

Results and Discussion
Two genotypes with contrasting response against drought-(i) IR64, a moderately drought-susceptible genotype [45], and (ii) Apo (IR55423-01), a moderately tolerant variety [46]-were exposed to well-watered (non-stress) and drought conditions (water-stress). Leaf samples were collected after treatment during the flowering stage, the most sensitive stage affecting grain yield [47,48], then were sequenced for RNA-seq. We generated 58,715,576 and 107,027,567 reads from all libraries in rep 1 and 2, respectively, with a grand total read count of 166 million (Table 1).

Read Mapping
Reads were mapped to the transcriptome references of the indica lines 93-11 [49] (Ensembl release 36) and Shuhui498 ("Shuhui" in this paper) [50]. Likewise, reads were mapped to the cDNA references of the japonica lines, Nipponbare, using MSU v7 [51] and the IRGSP (2005) [52] models, via bowtie2 [53] (see Supplementary Information (SI) 1 for complete commands used). Mapping to multiple transcriptome assemblies aims to determine which reference sequence yields the best percentage alignment.
Initial mapping of three sample libraries against the 93-11 cDNA showed an average of 52% alignment rate (data not shown) against the other three transcriptome references. Among the reference assemblies, MSU v7 and Shuhui showed the best alignment rates (83%; shown in Table 1 and Table S1, respectively). Details of alignments including multiple aligned reads using MSU v7 is shown in Table S2. However, between MSU v7 and Shuhui, we preferred the former reference for further analysis to be consistent with our previous studies on allelic imbalance in hybrids [41] and regulatory divergence [54]. However, results using Shuhui, an indica like our materials, are sporadically presented below and are comprehensively discussed in the Supplementary Discussion for comparative purposes.

MA Plots and Spearman's Coefficient of Correlations
The sequencing protocol generated 38-and 90-basepair read sizes for replicates 1 and 2, respectively. To determine whether there was an impact of these varying sizes on the succeeding analysis, MA plots (where M is the difference in log expression values and A is the average [55]) with Loess curve [56,57] were created between IR64 reps 1 and 2 and between Apo reps 1 and 2. Results showed that MA plots indicated tight and symmetrical data points (at M = 0) with centered Loess curves suggesting that there is no significant variability between the two replicates of the same genotype ( Figure S1). Hence, it is reasonable to suppose that there is no influence of the dissimilar read sizes in the succeeding DE analysis.
Using MSU v7 transcriptome reference, Spearman's coefficient of correlations using normalized read counts between replicates showed 0.93 and 0.94 in IR64 control and stress treatments, respectively ( Figure 2). Similarly, Apo showed 0.95 between replicates under both non-and water-stress treatments. These results indicate a high level of reproducibility between replicates of the same genotype.

Pairwise DE (PDE) Between Treatments of the Same Genotype
Samples exposed to non-and water-stress conditions were analyzed for PDE (i.e., IR64 control vs. IR64 stress; Apo control vs. Apo stress) to determine the effect of the treatment in each genotype. Before testing for PDE, read counts were normalized with respect to library size. MA plots with Loess curves were, likewise, generated to visualize whether our normalization procedure was effective. Results showed symmetrical MA plots with "centered" Loess curves in our analyses, indicating that the normalization procedure was effective ( Figure S2).
In the IR64 samples, 170 genes were found to be differentially expressed ( Figure 3; Table S3). Of these, 36 (21.2%) and 134 (78.8%) are down-and up-regulated, respectively, under water-stress conditions (|log2FC| > 1, FDR<0.05). Two of the genes repressed under water-limiting conditions encode for a nucleotide-binding site leucine-rich repeat (NBS-LRR), which is known to be involved in disease resistance, and a brassinosteroid insensitive 1-associated receptor kinase which was recently found to play a role in plant growth and defense [58] (complete list is shown in Table S3). On the other hand, genes up-regulated under water-stress include zinc finger, MYB, NAC, late embryogenesis abundant protein, and a bZIP protein, most of which are transcription factors (TFs). Such molecules were known to play crucial roles during drought stress (reviewed in [59]).
In Apo, four genes showed significant DE between non-and water-stress conditions ( Figure 3; Table S4). Of these, one gene, a transposon is repressed, while three genes which include dehydrin, HSF-type DNA-binding protein and an expressed protein are induced under stress conditions. These three up-regulated genes in Apo were also found to be induced in the susceptible cultivar, IR64 ( Figure  3a) suggestive of their crucial roles in water-limiting conditions. Apparently, dehydrin is up-regulated in both IR64 and Apo which are known to contribute in the acquisition of drought and cold tolerance (reviewed in [60]). On the other hand, TFs were found to be induced in the drought-sensitive IR64 but not in Apo using Shuhui as the reference assembly (see Supplementary Discussion). No GO terms were enriched in the pairwise analysis.

Pairwise DE (PDE) Between Treatments of the Same Genotype
Samples exposed to non-and water-stress conditions were analyzed for PDE (i.e., IR64 control vs. IR64 stress; Apo control vs. Apo stress) to determine the effect of the treatment in each genotype. Before testing for PDE, read counts were normalized with respect to library size. MA plots with Loess curves were, likewise, generated to visualize whether our normalization procedure was effective. Results showed symmetrical MA plots with "centered" Loess curves in our analyses, indicating that the normalization procedure was effective ( Figure S2).
In the IR64 samples, 170 genes were found to be differentially expressed ( Figure 3; Table S3). Of these, 36 (21.2%) and 134 (78.8%) are down-and up-regulated, respectively, under water-stress conditions (|log 2 FC| > 1, FDR < 0.05). Two of the genes repressed under water-limiting conditions encode for a nucleotide-binding site leucine-rich repeat (NBS-LRR), which is known to be involved in disease resistance, and a brassinosteroid insensitive 1-associated receptor kinase which was recently found to play a role in plant growth and defense [58] (complete list is shown in Table S3). On the other hand, genes up-regulated under water-stress include zinc finger, MYB, NAC, late embryogenesis abundant protein, and a bZIP protein, most of which are transcription factors (TFs). Such molecules were known to play crucial roles during drought stress (reviewed in [59]).
In Apo, four genes showed significant DE between non-and water-stress conditions ( Figure 3; Table S4). Of these, one gene, a transposon is repressed, while three genes which include dehydrin, HSF-type DNA-binding protein and an expressed protein are induced under stress conditions. These three up-regulated genes in Apo were also found to be induced in the susceptible cultivar, IR64 (Figure 3a) suggestive of their crucial roles in water-limiting conditions. Apparently, dehydrin is up-regulated in both IR64 and Apo which are known to contribute in the acquisition of drought and cold tolerance (reviewed in [60]). On the other hand, TFs were found to be induced in the drought-sensitive IR64 but not in Apo using Shuhui as the reference assembly (see Supplementary Discussion). No GO terms were enriched in the pairwise analysis. Results using Shuhui and MSU v7 assemblies consistently suggest that there is a major difference in the number of genes responding to changes in environmental conditions between IR64 and Apo. IR64, the drought-intolerant but high-yielding variety, exhibited a wider transcriptional response when exposed to water-stress conditions, while Apo, the drought-tolerant genotype, showed modest expression changes. These findings potentially demonstrate that Apo is relatively transcriptionally stable under stressful conditions. Our results align with a previous study on jute [61]. The droughtsusceptible species (Corchorus capsularis L.) showed a higher number of DEGs (794) as compared to the drought-tolerant species (Corchorus olitorius L.; 39) in response to a Polyethylene Glycol (PEG)-induced drought stress.

Differences Due to Genotypic Background across Treatments
We tested for DE between genotypes using a different model to capture differences between the two parental inbred lines across the environmental conditions ("genotype differential expression" or GDE; see SI 2 and Materials and Method). Analysis showed that there were 729 up-regulated genes in Apo but were down-regulated in IR64 (at FDR < 0.05; Table S5). On the other hand, 828 genes were significantly up-regulated in IR64 but were down-regulated in Apo.

GO Enrichment Analysis.
Biological Process. GO analysis provides information on the potential functions of genes. Using AgriGO [35,36], analysis revealed that both genotypes showed up-regulation of several genes enriched in response to stress, cell death, and protein modification process (biological process; FDR < 0.05; Figure  6). Between the two genotypes, IR64 induces a wider suite of genes (112 genes) associated with response to stress, as compared to Apo (82; FDR< 0.05; Figure 6).
Both varieties commonly induce genes which encode for disease resistance protein, NBS-LRR, and nucleotide-binding -APAF-1, R proteins, and CED-4 (NB-ARC) associated with cell death, one of the Results using Shuhui and MSU v7 assemblies consistently suggest that there is a major difference in the number of genes responding to changes in environmental conditions between IR64 and Apo. IR64, the drought-intolerant but high-yielding variety, exhibited a wider transcriptional response when exposed to water-stress conditions, while Apo, the drought-tolerant genotype, showed modest expression changes. These findings potentially demonstrate that Apo is relatively transcriptionally stable under stressful conditions. Our results align with a previous study on jute [61]. The drought-susceptible species (Corchorus capsularis L.) showed a higher number of DEGs (794) as compared to the drought-tolerant species (Corchorus olitorius L.; 39) in response to a Polyethylene Glycol (PEG)-induced drought stress.

Differences Due to Genotypic Background across Treatments
We tested for DE between genotypes using a different model to capture differences between the two parental inbred lines across the environmental conditions ("genotype differential expression" or GDE; see SI 2 and Materials and Method). Analysis showed that there were 729 up-regulated genes in Apo but were down-regulated in IR64 (at FDR < 0.05; Table S5). On the other hand, 828 genes were significantly up-regulated in IR64 but were down-regulated in Apo.

GO Enrichment Analysis
Biological Process. GO analysis provides information on the potential functions of genes. Using AgriGO [35,36], analysis revealed that both genotypes showed up-regulation of several genes enriched in response to stress, cell death, and protein modification process (biological process; FDR < 0.05; Figure 4). Between the two genotypes, IR64 induces a wider suite of genes (112 genes) associated with response to stress, as compared to Apo (82; FDR < 0.05; Figure 4). the most frequently enriched biological processes, which are not enriched in Apo (Figs. 4b and 6). Some of the genes associated with signal transduction encode for receptor-like protein kinases (12 genes), LRR (4 genes), NBS-LRR disease resistance protein (2 genes), receptor kinase (2 genes) and a serine/threonine-protein kinase receptor precursor.  Molecular Function. GO analysis using AgriGO of the up-regulated genes in both genotypes showed common significant enrichment (FDR < 0.05) of genes associated with kinase activity and nucleotide binding ( Figures. 5 and 6).
Interestingly, Apo exhibits up-regulation of 16 genes distributed across its genome, all of which encode for cytochrome P450 associated with oxygen binding activity (FDR < 0.05) (Figure 5a and 6). These genes are either not expressed or down-regulated in the drought-susceptible, IR64. Cytochrome P450 has been recently found to augment tolerance against drought stress in transgenic tobacco [62]. Both varieties commonly induce genes which encode for disease resistance protein, NBS-LRR, and nucleotide-binding-APAF-1, R proteins, and CED-4 (NB-ARC) associated with cell death, one of the most frequently enriched biological processes. However, Apo uniquely encodes for AP2 (LOC_Os09g11480), Leucine Rich Repeat (LRR; LOC_Os04g26350 and LOC_Os06g16450) and heat-shock protein (LOC_Os08g32130) enriched in cell death, which are not found in IR64. On the other hand, IR64 showed the DE of genes associated with response to biotic stimulus and signal transduction, the most frequently enriched biological processes, which are not enriched in Apo (Figures 4b and 6). Some of the genes associated with signal transduction encode for receptor-like protein kinases (12 genes), LRR (4 genes), NBS-LRR disease resistance protein (2 genes), receptor kinase (2 genes) and a serine/threonine-protein kinase receptor precursor. Molecular Function. GO analysis using AgriGO of the up-regulated genes in both genotypes showed common significant enrichment (FDR < 0.05) of genes associated with kinase activity and nucleotide binding (Figures 5 and 6).
This information sheds hints that Apo is capable of scavenging reactive oxygen species (ROS), which may impose cellular damage and even death if not kept under control for a prolonged period of drought stress (reviewed in [63]).  Additionally, GO enrichment analysis showed that genes associated with protein binding and receptor activity were the most frequently enriched molecular functions of the GDE genes which are unique in IR64; hence not enriched in Apo (at FDR < 0.05; Figure 6). Some of the genes enriched in receptor activities include NBS-LRR disease resistance protein (LOC_Os02g38392, LOC_Os11g10760), LRR (LOC_Os09g15850), LRR receptor kinase (LOC_Os03g48890), serine/threonine-protein kinase (LOC_Os05g42210), 26S proteasome regulatory subunit S5A (LOC_Os10g05180). These genes are commonly induced during stress conditions. Interestingly, Apo exhibits up-regulation of 16 genes distributed across its genome, all of which encode for cytochrome P450 associated with oxygen binding activity (FDR < 0.05) (Figures 5a and 6). These genes are either not expressed or down-regulated in the drought-susceptible, IR64. Cytochrome P450 has been recently found to augment tolerance against drought stress in transgenic tobacco [62]. This information sheds hints that Apo is capable of scavenging reactive oxygen species (ROS), which may impose cellular damage and even death if not kept under control for a prolonged period of drought stress (reviewed in [63]).

KEGG Pathway Analysis of GDE Genes.
ROS scavenging pathways. The accumulation of ROS during stressful conditions may cause damage against essential macromolecules and even death of plant cells. Plants, being sessile, have evolved important metabolic pathways to scavenge harmful ROS which include phenylpropanoid biosynthesis and peroxisome pathway (reviewed in [64,65]).
Apo, on the other hand, showed significant up-regulation of four genes involved in both metabolic pathways. These included LOC_Os12g35890 (FAD-dependent oxidoreductase domain-containing protein) and LOC_Os01g53060 (peroxisomal membrane protein) of the peroxisome pathway; LOC_Os01g36240 (peroxidase precursor) and LOC_Os08g43040 (transferase family protein/ shikimate O-hydroxycinnamoyltransferase) of the phenylpropanoid biosynthesis.
Hormones. Hormones such as abscisic acid (ABA), auxin and jasmonic acids (JAs) play key roles in responding to both biotic and abiotic stresses (reviewed in [66]). Using KEGG pathway analysis, the drought-susceptible IR64 was found to induce LOC_Os03g08320 (jasmonate ZIM domain-containing protein) and indole acetic acid (IAA) synthetase (LOC_Os07g40290), both of which are associated with plant hormone signal transduction. Apo, on the other hand induces LOC_Os01g28450 (pathogenesisrelated protein/SCP-like extracellular protein) which has been known to participate in a drought and salt response ( [67]).
Transcription factors (TFs). Two genes significantly differentially expressed in IR64 were found to encode for MYB (LOC_Os01g50110) and MADS box (LOC_Os01g66030). Apo, on the other hand, Figure 6. Summary of the number of genes enriched in each GO term for both IR64 and Apo using GDE or genotype differential expression model (at FDR < 0.05). (No genes were enriched in cellular components).
Additionally, GO enrichment analysis showed that genes associated with protein binding and receptor activity were the most frequently enriched molecular functions of the GDE genes which are unique in IR64; hence not enriched in Apo (at FDR < 0.05; Figure 6). Some of the genes enriched in receptor activities include NBS-LRR disease resistance protein (LOC_Os02g38392, LOC_Os11g10760), LRR (LOC_Os09g15850), LRR receptor kinase (LOC_Os03g48890), serine/threonine-protein kinase (LOC_Os05g42210), 26S proteasome regulatory subunit S5A (LOC_Os10g05180). These genes are commonly induced during stress conditions.

KEGG Pathway Analysis of GDE Genes
ROS scavenging pathways. The accumulation of ROS during stressful conditions may cause damage against essential macromolecules and even death of plant cells. Plants, being sessile, have evolved important metabolic pathways to scavenge harmful ROS which include phenylpropanoid biosynthesis and peroxisome pathway (reviewed in [64,65]).
Apo, on the other hand, showed significant up-regulation of four genes involved in both metabolic pathways. These included LOC_Os12g35890 (FAD-dependent oxidoreductase domain-containing protein) and LOC_Os01g53060 (peroxisomal membrane protein) of the peroxisome pathway; LOC_Os01g36240 (peroxidase precursor) and LOC_Os08g43040 (transferase family protein/ shikimate O-hydroxycinnamoyltransferase) of the phenylpropanoid biosynthesis.
Hormones. Hormones such as abscisic acid (ABA), auxin and jasmonic acids (JAs) play key roles in responding to both biotic and abiotic stresses (reviewed in [66]). Using KEGG pathway analysis, the drought-susceptible IR64 was found to induce LOC_Os03g08320 (jasmonate ZIM domain-containing protein) and indole acetic acid (IAA) synthetase (LOC_Os07g40290), both of which are associated with plant hormone signal transduction. Apo, on the other hand induces LOC_Os01g28450 (pathogenesis-related protein/SCP-like extracellular protein) which has been known to participate in a drought and salt response ( [67]).
Comparatively, Shuhui and MSU v7 as transcriptome references both suggest that IR64 was found to have a wider repertoire of DEGs relative to Apo (see Supplementary Discussion). A similar study performed between two maize inbred lines with contrasting response to drought exposed to varying levels of stress (drought and well-watered) revealed that TFs enhance tolerance to drought [68]. Moreover, the sensitive line showed a greater number of genes (2558 genes) responding to drought against the tolerant line (555 genes). These findings were consistent with our study in which the drought-intolerant line exhibited a wider dynamic transcriptional response over the drought-tolerant genotype. These studies [61,68], including this paper, suggest that drought-intolerant varieties are highly responsive or are more vulnerable while drought-tolerant are sturdier against drought stress at the transcriptional level.

Differences Due to Drought (G × E) Using 3-Way DE Model
Expression variation that arises due to the interactions between genotypes and environment (G × E) can be detected. Using a different model in baySeq (drought differential expression or DDE; see SI 2), genes responding to contrasting water treatments between the two genotypes could be identified. We used a three-way DE model to detect these types of genes with comparative components including: IR64 and Apo under non-stress vs. IR64 under stress vs. Apo under stress conditions (see Figure 7 for additional information).
Further results reveal six genes significantly up-regulated in Apo under drought conditions compared to the other groups (FDR < 0.05; Table S6) (Figure 7). Hence, a relatively modest number of genes are significantly induced in Apo under a water-stress regime. These genes include cytochrome (LOC_Os11g29720.1), MYB (LOC_Os06g51260.2), DNA-binding protein (LOC_Os02g47560.1), regulator of ribonuclease (LOC_Os02g52450.1), expressed (LOC_Os05g11428.1) and Proton-dependent Oligopeptide Transporter or POT (LOC_Os11g18110.1) ( Table S6). Some of these genes, like cytochromes which participate in ROS scavenging and MYB transcription factors, were known to engage in water-stress conditions. No genes under "Apo stress" group were enriched in any of the GO domains or terms.  Table S6). Some of these genes, like cytochromes which participate in ROS scavenging and MYB transcription factors, were known to engage in water-stress conditions. No genes under "Apo stress" group were enriched in any of the GO domains or terms.
On the other hand, 155 genes were significantly up-regulated in IR64 under water-stress conditions relative to the other groups (see Table S6) further demonstrating that IR64 consistently exhibits a higher transcriptional response when exposed to drought conditions. Examples of IR64 genes responding to the stress (G × E genes) include auxin-induced protein, dehydrins, cytochromes, LEA proteins, stress-responsive protein and known TFs (see Table S6). KEGG pathway analysis further confirms the participation of TFs, six of which were detected: LOC_Os01g46970 (plant G-box-binding factor), LOC_Os08g36790 (ABA responsive element binding factor; bZIP TF), LOC_Os01g10320 (HD-ZIP; homeobox-leucine zipper protein), LOC_Os01g39020 (HSFF; heat shock transcription factor), LOC_Os04g42950 (MYB transcription factor) and LOC_Os03g48970 (NFYA; nuclear transcription factor Y, alpha).
GO analysis showed significant enrichment of up-regulated genes associated with "response to stress" (36 genes; FDR < 0.05; Figure 7) suggesting the induction of drought-response molecules. On the other hand, 155 genes were significantly up-regulated in IR64 under water-stress conditions relative to the other groups (see Table S6) further demonstrating that IR64 consistently exhibits a higher transcriptional response when exposed to drought conditions. Examples of IR64 genes responding to the stress (G × E genes) include auxin-induced protein, dehydrins, cytochromes, LEA proteins, stress-responsive protein and known TFs (see Table S6). KEGG pathway analysis further confirms the participation of TFs, six of which were detected: LOC_Os01g46970 (plant G-box-binding factor), LOC_Os08g36790 (ABA responsive element binding factor; bZIP TF), LOC_Os01g10320 (HD-ZIP; homeobox-leucine zipper protein), LOC_Os01g39020 (HSFF; heat shock transcription factor), LOC_Os04g42950 (MYB transcription factor) and LOC_Os03g48970 (NFYA; nuclear transcription factor Y, alpha).
Results using Shuhui and MSU v7 demonstrated that there were more IR64 genes responding to water stress including TFs which were not found in Apo (see Supplementary Discussion and Table S14 for Shuhui).

Several DEGs Co-Localize with Drought-Yield QTLs
In our previous study [41], we identified regions in the rice genome which were associated with a drought response using selective genotyping. These marker regions, along with drought-yield QTLs identified by Venuprasad et al. [12], were co-localized with genes differentially expressed (described above). Interestingly, several genes were found to align with drought-yield QTL regions located in chromosomes 1, 2, 3, 6, 8 and 12 (Figure 8). These genes are, therefore, the most interesting candidates for further studies on drought response in rice. Figure 8. Several DEGs were found to co-localize with QTL marker regions (boxed in red) identified to participate in drought-yield response in the same bi-parental population background (IR64/Apo) [12,41]. Markers and genes are said to co-localize if their estimated distance is less than 240 kb, the estimated equivalent of 1 cM. (CEN, centromere. Map was generated using ICIMapping [44]. Positions of markers were estimated based on Cornell and IRMI SSR map at www.archive.grameme.org [69]. Chromosome length not drawn to scale). Figure 8. Several DEGs were found to co-localize with QTL marker regions (boxed in red) identified to participate in drought-yield response in the same bi-parental population background (IR64/Apo) [12,41]. Markers and genes are said to co-localize if their estimated distance is less than 240 kb, the estimated equivalent of 1 cM. (CEN, centromere. Map was generated using ICIMapping [44]. Positions of markers were estimated based on Cornell and IRMI SSR map at www.archive.grameme.org [69]. Chromosome length not drawn to scale).
In chromosome 1, an aggregation of DEGs were found to collocate with markers RM11943/RM6333. This region is also associated with yield under drought in N22/IR64 population [70]. One of the most interesting DEGs in this region encodes for a chlorophyll a/b binding protein (LOC_Os01g64960). Its participation in drought tolerance has been reported in Arabidopsis [71].
In chromosome 6, a gene encoding for a protein of unknown function (LOC_Os06g06550) is tightly linked with RM510. On the other hand, a suite of genes are aligned with the marker region RM256/RM80 in chromosome 8, the most interesting of which include transposon (LOC_Os08g38110), transcription factor BIM2 (LOC_Os08g38210), WRKY (LOC_Os08g38990), and heat shock protein (LOC_Os08g39140). Finally, a disease resistance gene (LOC_Os12g29290) co-localizes with RM511 in chromosome 12.
The combination of two approaches-RNA-seq and QTL mapping with co-localization analysis-is a powerful strategy to identify potential segments involved in drought response. Further studies, however, to dissect the participation of these shortlisted candidate genes, including cytochromes, on drought response are highly recommended.

Summary and Conclusions
Using RNA-seq, the whole transcript population of IR64 and Apo were sequenced after exposure to two contrasting water regimes. We implemented a bowtie-salmon-bayseq pipeline to identify DEGs. Several filtering parameters were employed to reduce the influence of artifacts on our data analysis. We also showed that using two different transcriptome reference sequences (MSU v7 and Shuhui) can have an impact on the downstream analysis such as the number and variety of identified DEGs. Therefore, decisions on which reference assembly to be used should be taken into consideration for analysis on RNA-seq.
Taken together, our results suggest that IR64 and Apo have varying strategies of dealing with the stress. IR64 demonstrated a more extensive molecular reprogramming, presumably, a more energy-demanding route. Signaling pathways including ABA, jasmonic acid, and ethylene which interact with ROS were shown to be highly activated in IR64. ROS, which accumulates during abiotic stress such as drought conditions, are also responsible for signal transduction, receptor activity and cell signaling which are highly enriched in IR64 but not in Apo. Both genotypes, enabled programmed cell death in order to survive, which may eventually cause yield losses. Apo, on the other hand, showed enhancement of functions associated with oxygen binding and peroxisome pathway. Further studies to dissect these attributes of Apo is highly recommended.
Our results also showed several DEGs aligning with previous studies on drought-yield QTLs. These are most interesting candidates for further investigations on rice improvement on drought tolerance. Further validation procedures using different approaches (besides RNA-seq) are also recommended.

Supplementary Materials:
The following are available online at http://www.mdpi.com/2073-4395/10/5/621/s1, SI 1. Commands used for the bioinformatics pipeline; SI 2. Commands used for DE using baySeq (R in Linux); Figure S1. To visualize if normalization procedure was adequate, MA plots with Loess curves were generated between IR64 reps 1 and 2 (left) and between Apo rep 1 and 2 (right). Blue line shows Loess curves; yellow, lines of symmetry at M = 0; Figure S2. MA plots with Loess curves were generated for IR64 control vs IR64 stress (left) and Apo control vs Apo stress (right). Legend similar to Figure S1. Table S1. Percentage alignment rates of the reads mapping to the MSU, IRGSP, and Shuhui mRNA pseudomolecules; Table S2. Complete report of bowtie2 on mapped and unmapped reads using the MSU v7 reference sequence. Table S3. List of DEGs in IR64 exposed under non-and water-stress conditions using MSU v7 as transcriptome reference sequence; Table S4. List of DEGs in Apo exposed under non-and water-stress conditions using MSU v7; Table S5. List of DEGs between IR64 and Apo using Genotypic Differential Expression (GDE) model and MSU v7 as transcriptome reference sequence; Table S6. List of DEGs between IR64 and Apo using Drought Differential Expression (DDE) model and MSU v7 transcriptome reference sequence; Table S7. List of DEGs in IR64 exposed under non-and water-stress conditions using Shuhui as transcriptome reference sequence; Table S8. List of DEGs in Apo exposed under non-and water-stress conditions using Shuhui; Table S9. List of DEGs between IR64 and Apo under non-stress