Drought-Tolerance Gene Identification Using Genome Comparison and Co-Expression Network Analysis of Chromosome Substitution Lines in Rice

Drought stress limits plant growth and productivity. It triggers many responses by inducing changes in plant morphology and physiology. KDML105 rice is a key rice variety in Thailand and is normally grown in the northeastern part of the country. The chromosome segment substitution lines (CSSLs) were developed by transferring putative drought tolerance loci (QTLs) on chromosome 1, 3, 4, 8, or 9 into the KDML105 rice genome. CSSL104 is a drought-tolerant line with higher net photosynthesis and leaf water potential than KDML105 rice. The analysis of CSSL104 gene regulation identified the loci associated with these traits via gene co-expression network analysis. Most of the predicted genes are involved in the photosynthesis process. These genes are also conserved in Arabidopsis thaliana. Seven genes encoding chloroplast proteins were selected for further analysis through characterization of Arabidopsis tagged mutants. The response of these mutants to drought stress was analyzed daily for seven days after treatment by scoring green tissue areas via the PlantScreen™ XYZ system. Mutation of these genes affected green areas of the plant and stability index under drought stress, suggesting their involvement in drought tolerance.


Rice Growth Condition
KDML105 rice, DH103, and CSSL seeds were incubated at 60 • C for 48 h before planting. The seeds were then germinated by soaking in distilled water for seven days in plastic cups. Rice seedlings were transferred to a plastic tray and continuously grown in WP nutrient solution [13]. Twenty-eight days after germination, rice plants were drought-stressed for three days by the addition of 10% polyethylene glycol 6000 (PEG6000). This condition was previously shown to cause drought stress in rice [14,15]. In order to induce the stronger drought-stress condition, after treatment with WP nutrient solution with 10% PEG6000 for three days, the solution was then changed to WP nutrient solution with 15% PEG. Plants grown in WP nutrient solution without PEG6000 were used as controls. A complete randomized design (CRD) with four replicates was used for physiological evaluation in each parameter.

Net Photosynthesis Rate and Leaf Water Status Detection
The net photosynthesis rate (Pn) of twenty-eight-day-old rice plants was determined with a LI-6400 XT portable photosynthesis system (LI-COR, Lincoln, NE, USA). The measurement was taken at the middle part of the youngest fully expanded leaves between 9 am and 2 pm, with the following conditions: the molar flow of air per unit leaf area was 500 mmol l −1 m −2 s −1 , the photosynthetically active radiation (PAR) at the leaf surface was 1200 µmol m −2 s −1 , the leaf temperature ranged from 30.0 to 37.0 • C, with a CO 2 concentration of 380.0 mol mol −1 . The leaf water potential (LWP) was measured in the youngest fully expanded leaves using plant water status console model 3005 (Soilmoisture, Goleta, CA, USA).

Whole Genome Sequencing
The aboveground parts of KDML105, DH103, and CSSL104 rice plants were collected at fourteen days after germination. Rice genomic DNA was extracted using a Genomic DNA Mini Kit, 'Plant' (Geneaid, New Taipei City, Taiwan). Genomic DNA libraries were prepared for sequencing by using an Illumina genome analyzer (Illumina, San Diego, CA, USA) with Illumina HiSeq3000's protocol. For genome analyses, the sequence reads were classified into specific categories using the pipeline developed by Missirian et al. [16]. The rice genomic sequence from the Rice Genome Annotation Project database [17] was used as a reference genome [18] to map the sequence reads. The raw reads were submitted to GenBank at the NCBI under BioProject no. PRJNA659381. Bioinformatic tools were used to compare the genome of CSSL104 with the KDML105 rice genome to identify loci containing different single nucleotide polymorphisms (SNPs). These loci may contribute to the drought-tolerance phenotypes of CSSL104. The genome comparison first started by discarding all SNPs shared by both CSSL104 and KDML105. The remaining differential SNPs were counted within a sliding window of 5000 background nucleotides. To visualize the chromosome plots with the marks of different SNPs' loci, the window region containing more than 100 SNPs in CSSL104 with different nucleotides from KDML105 was marked as a blue line on the chromosome plots. The analysis of the locations of SNPs in the candidate genes was analyzed.

Gene Co-Expression Network Analysis
The rice loci containing the different SNPs were used for a gene co-expression network analysis. To predict the important loci involved in drought tolerance, a rice oligonucleotide array database was used with abiotic stress-induced gene expression data with a correlation coefficient cut-off of 0.95 [19]. The predicted loci were searched for gene ontology and expression patterns from the rice expression database [20].

Arabidopsis Homologous Gene
The best candidate genes were used to search for the homologous gene in Arabidopsis from the Rice Genome Annotation Project database [17] and The Arabidopsis Information Resource [19]. Arabidopsis mutant lines with T-DNA insertion in the selected gene were ordered from the Arabidopsis Biological Resource Center (ABRC). The Arabidopsis seeds were screened for homozygous mutant lines via specific primers, LP and RP, to the gene of interest; the LB primer was used for a specific T-DNA region.

Arabidopsis Growth Condition
Four days after germination, Arabidopsis thaliana ecotype Col-0 seeds and seven mutant lines [21] including at1g74880, at5g54270, at3g63190, at4g11960, at4g22890, at2g27680, and at4g34830 were sowed and transferred to 48-well plates, containing Murashige and Skoog (MS) agar media for normal conditions, MS agar media supplemented with 75 mM mannitol for mild drought stress, or MS agar media supplemented with 150 mM mannitol for severe drought stress. The plants were then grown in a growth chamber at 22°C/20°C, 16/8 h light/dark cycle, 120 µmol photons of PAR m −2 ·s −1 , and 60% humidity. RGB imaging was used to collect the green area of plants twice a day via a PlantScreenTM XYZ system (Photon Systems Instruments, Drásov, Czech Republic) [22].

Statistical Analysis
Analysis of variance (ANOVA) and Duncan's Multiple Range Test (DMRT) were used for data analysis by SPSS Statistics program version 22 (IBM, Armonk, NY, USA). The images from the PlantScreenTM XYZ system were analyzed using MATLAB (R2015; MathWorks, Inc., Natick, MA, USA), and the data were analyzed by independent t-tests using SPSS.

Evaluation of Physiological Responses of CSSLs of KDML105 under Drought-Stress Conditions
Selected CSSLs, namely CSSL97, CSSL104, CSSL106, and CSSL107, were evaluated for drought tolerance by growing the seedlings in soil with 100% or 50% field capacity. In normal growth conditions (100% field capacity), all of the lines were similar ( Figure 1A,C), but they differed under 50% field capacity ( Figure 1B,D). CSSL104 displayed the most drought-tolerant phenotype, with the lowest leaf dying score and the highest photosystem II (PSII) efficiency (F v /F m ). This was similar to the performance of DH103 (the drought-tolerant parental line). The highest leaf-death score was detected in CSSL106, while CSSL97 had the lowest PSII efficiency under drought stress. These data suggest that CSSL104 is the most drought-tolerant line, while CSSL97 and CSSL106 are the most susceptible. Therefore, CSSL104 was selected for further characterization.
CSSL104 and its parental lines, KDML105 and DH103 rice, were grown in nutrient solutions corresponding to normal growth conditions and in a solution supplemented with 15% PEG for the drought-stress treatment, which caused about a 50-180% reduction in leaf water potential (Table 1). After nine days of drought stress, we measured the highest reduction of leaf water potential for all lines. Under these conditions, CSSL104 had the lowest leaf water potential at −7.75 MPa, which was about threefold lower than the LWP of the plants grown in normal conditions. The parental lines, KDML105 and DH103, had about a twofold reduction in LWP. Drought stress reduced all parameters of photosynthesis, including net photosynthesis rate, stomatal conductance, transpiration rate, intercellular CO 2 concentration, ΦPSII, and electron transport rate in all lines. However, after six days under drought stress, CSSL104 had a significantly higher photosynthetic rate than the KDML105 parent. In addition, CSSL104 had a greater tendency than KDML105 toward higher values for all photosynthetic parameters after nine days of drought treatment (Table 1).

Figure 1.
Response to drought stress in chromosome segment substitution lines (CSSLs) and parents. CSSLs, CSSL97, CSSL104, CSSL106, and CSSL107, and their parental lines KDML105 and DH103, were compared for leaf death (leaf dying score) and photosystem II efficiency (F v /F m ) under (A,C) normal (100% field capacity) and (B,D) drought-stress (50% field capacity) conditions. The mean + 1 standard error (SE) was derived from four replicates. Means with a different lowercase letter above them are significantly different (p < 0.05). NS demonstrates no significant difference among lines. Table 1. Photosynthetic performance of CSSL104 and parent lines. Net photosynthesis rate, transpiration rate, stomatal conductance, ΦPSII, electron transport rate, intercellular CO 2 concentration, F v '/F m ', relative chlorophyll content (determined by portable chlorophyll meter SPAD-501) and leaf water potential of rice at vegetative stage in normal and drought stress conditions. ANOVA and Duncan's Multiple Range Test (DMRT) were used for statistical analysis. The data show mean ± SE. Different superscript letters show the significant difference among lines at p value ≤ 0.05.

Conditions
Normal

Whole Genome Sequence Comparison between CSSL104 and KDML10' and Co-Expression Network Analysis Revealed that Major Hub Genes Have a Role in Photosynthesis
We compared whole genome sequences of CSSL104 and its drought susceptible parental line KDML105 to define the genes responsible for drought tolerance in CSSL104. A total of 101,950 SNPs located on 3440 genes were detected. The regions with a high density of SNPs were on chromosomes 1, 8, 9, and 11 ( Figure 2).  Figure 3A, revealed 18 major nodes with a high connection to other genes. The gene ontology of these 18 genes is listed in Table 2. The map position of these genes is shown in Figure 3B. Based on quantitative trait loci (QTL) data from the Qtaro database [23], six loci are located in QTL regions for drought-stress tolerance on chromosomes 1, 3, and 8 ( Figure 3B). The high-density SNP on chromosome 1 was consistent with the location of the drought-tolerant (DT)-QTL, which is flanked by markers RZ14 and R117. In this QTL, co-expression network analysis identified two genes, LOC_Os01g72800 and LOC_Os01g72950, as the major nodes. Chromosome 3 did not display high-density SNPs. On this chromosome, LOC_Os03g02590 and LOC_Os03g03910 were located in two drought-tolerance QTLs mapped between markers RM7332, RM545, and RG104, RZ329. Another node gene, LOC_Os03g52460, is located between markers C136 and R1618, corresponding to another drought-tolerance QTL. Chromosome 8 displays several major nodes: LOC_Os08g16570 is located between markers RM72 and RM331, while LOC_Os08g41040 and LOC_Os08g41460 are located between RM5353 and RM 3480. This region on chromosome 8 also displayed high-density SNPs between CSSL104 and KDML105 (Figures 2 and 3B).
Therefore, we obtained seven homozygous, T-DNA tagged Arabidopsis mutant lines corresponding to ndhO (at1g74880), lhcb3 (at5g54270), rrf (at3g63190), pgrl1b (at4g11960), pgrl1a (at4g22890), at2g27680, and mrl1 (at4g34830). These lines were drought stressed by growing them in MS medium supplemented with 0 mM, 75 mM, or 150 mM mannitol. Their growth response was assessed by measuring the green pixel area per plant and compared to the wild type (WT). Candidate genes for drought tolerance. Gene co-expression network was analyzed by using the Rice Oligonucleotide Array Database [19], showing the major node genes with yellow dots (A), while DT-QTLs from the Q-TARO database [23] and Kanjoo et al. [11] are shown in red and yellow boxes on the chromosome, respectively. (B) Loci written in blue letters indicate the proposed drought-tolerance loci based on this study. Under normal conditions, all mutant lines displayed a significantly lower number of green pixels than WT, suggesting lower growth than WT ( Figure 4A). Both drought-stress treatments decreased growth in all lines, with the 150 mM causing the more severe reduction. At 75 mM mannitol, pgrl1b had a significantly lower number of green pixels than WT, while pgrl1a showed similar growth to WT. Other mutant lines showed better growth than WT ( Figure 4B). Under the severe drought-stress conditions induced with 150 mM mannitol, the growth of lhcb3, at2g27680, mrl1, and WT were similar. Mutants pgrl1b and pgrl1a had a significantly lower growth than WT, while rrf had a lower growth at the beginning of the treatment but displayed better growth than WT after 5 days of the treatment. However, similar growth between WT and rrf was found after seven days of drought stress. Among the mutant lines, ndhO was the only mutant line that had significantly better growth than WT under severe drought stress ( Figure 4C).
The stability indexes of Arabidopsis mutants and WT were calculated to compare drought tolerance after six days of drought stress. After the intermediate drought stress (75 mM mannitol), all mutant lines except pgrl1a displayed significantly higher stability than WT, suggesting the contribution of NDH-0, LHCB3, RRF, PGRL1b, at2g27680, and MRL1 to drought-tolerance adaptation ( Figure 5A). Under severe drought (150 mM mannitol), significantly higher stability than WTs was detected for the ndh-o, rrf, pgrl1b, at2g27680, and mrl1 mutants ( Figure 5B

Discussion
We investigated the effect of a drought-stress QTL introgressed into the elite rice line KDML105. Using leaf water potential under drought stress, we found that introgression line CSSL104 manifested drought tolerance similar to that of the parent line DH103. Both tended to have a better ability to maintain water status in the first fully expanded leaves. After six days of drought stress, both lines had about a 22% higher leaf water potential than the KDML105 parent. Drought stress limits water uptake from the rice root and reduces water availability in the cells, which is critical for survival under drought stress. Water depletion can compromise photosynthesis and cell growth [34], and three main maintenance mechanisms are used by plants to offset water loss: leaf rolling, stomatal closure, and osmoregulation [35,36]. Evidence for the drought tolerance of CSSL104 was also provided by the lower leaf-death score and higher F v /F m displayed by CSSL104 compared to KDML105 (Figure 1).
Drought stress resulted in decreased stomatal conductance in all lines (KDML105, DH103, and CSSL104; Table 1). This was a water-preservation mechanism that also resulted in a decline in the net photosynthesis rate. After three days of drought stress, the photosynthesis parameters, net photosynthesis rate, stomatal conductance, transpiration rate, ΦPSII, electron transport rate, intercellular CO 2 concentration, and F v '/F m ', were similar among all lines. After six days under drought stress, the net photosynthesis rate of DH103 and CSSL104 were about twofold higher than the net photosynthesis rate of KDML105. In comparison with normal plants, KDML105 rice had a nearly 70% reduction in photosynthesis rate, while DH103 and CSSL104 had only a 46 and 44% reduction, respectively. Interestingly, the ΦPSII and electron transport rate of all lines were similar, while the stomatal conductance of KDML105 was 58% lower than DH103. These findings suggest that stomatal closure could be one of the major factors contributing to the decline in the net photosynthesis rate of KDML105. Although the stomatal conductance of CSSL104 was lower than DH103 by 42%, CSSL104 could maintain a net photosynthesis rate (Table 1). These indicated that this CSSL is better adapted than its drought-tolerant parental line. It is possible that a KDML105 locus contributed to maintenance of the photosynthesis process through an epistatic interaction with the introgressed DH103 region.
Using whole genome sequence comparison and co-expression network analysis, we characterized the molecular fingerprint of the introgression. The first detected DNA segments introgressed, while the second identified genes connected with the drought response. During this stress, 18 genes were highly co-expressed with other genes ( Figure 3A and Table 2). Nine of them (CPFTSY, NDH-O, SOQ1, LHCB3, RRF, PGRL1, HCF244, NAD(P)-linked oxidoreductase, and MRL1) are involved in the photosynthesis process, and CPRF1 is essential for chloroplast development. These findings indicate that the drought-adaptation QTL affects photosynthetic genes whose modulation maintains the net photosynthesis rate of CSSL104.
The function of the identified genes is as follows. CPFTSY (chloroplast FtsY, i.e., chloroplast signal-recognition particle) is required for light-harvesting chlorophyll a/b-binding protein (LHCP) integration into thylakoids [25], and NDH-O is the subunit required for the NADH dehydrogenase-like (NDH) complex assembly that functions in cyclic electron flow [25,37]. SOQ1 is required to maintain light harvesting efficiency especially during nonphotochemical quenching (NPQ) recovery [26]. Light-harvesting chlorophyll (LHC) functions as a light receptor to capture light energy and deliver it to photosystems. The Lhcb3 gene product regulates the rate of state transition by changing the excitation energy transfer and charge separation [38]. RRF is a ribosome recycling factor in chloroplast [39]. RRF is required to maintain photosystem II efficiency (F v /F m ) and proper stacking of the internal membranes of chloroplast. Loss of these functions led to a lower growth rate for the rrf mutant compared to WT [28], which is consistent with the phenotype documented in this study ( Figure 4A). PGRL1A (AT4G22890) and PGRL1B (AT4G11960) are paralogous genes whose products switch linear electron flow to cyclic electron flow. PGRL1 is the elusive ferredoxin-plastoquinone reductase (FQR) [29].
During drought stress, plants shift the electron transfer route from linear to cyclic to balance the energy flow from light reaction to Calvin cycle and photorespiration [37,40,41]; this change is due to CO 2 limitations caused by stomatal closure. It was recently shown that tomato with co-silencing of the PGR5/PGRL1A gene was more susceptible to cold stress [42]. This is consistent with our finding that the pgrl1 mutant had a significantly lower growth rate than WT in both intermediate and severe drought stress ( Figure 5B,C). HCF244 is required for the biogenesis of photosystem II (PSII), and specifically for the synthesis of the reaction center proteins [30][31][32]. The NAD(P)-linked oxidoreductase gene is involved in redox reactions, but the details are still unclear. MRL1 is the only gene in this study whose product is involved in the Calvin cycle. It is involved either in the processing or in the stabilization of the large subunit (LS) of RuBisCO transcripts [43].
Not surprisingly, impairment of these photosynthesis genes significantly reduced growth under normal conditions ( Figure 4A). Moreover, we could not obtain homozygous lines of the soq1, hcf244, and cprf1 mutation, probably because the homozygous mutants are lethal. However, under drought-stress conditions, some of the mutants in the genes involved in the light reaction process (ndhO (at1g74880), lhcb3 (at5g54270), rrf (at3g63190), and at2g27680 mutants) displayed significantly higher growth than WT. These responses suggested that the decrease in light energy harvest during drought stress could prevent damage to chloroplasts and prolong survival of photosynthetic tissues. The mrl1 mutant was expected to lack of the ability to stabilize the rbcL mRNA, which could result in a decline in Calvin cycle activity. The higher stability index of the mrl1 mutant line under intermediate and severe drought stress indicated that a slower rate of the Calvin cycle may help plants cope with drought stress.
Based on the growth phenotype under drought stress, the ndhO, lhcb3, rrf, and mrl1 mutants showed a higher growth rate than WT ( Figure 4B,C). The homologs in rice of these genes are LOC_Os01g72800, LOC_Os07g37550, LOC_Os07g38300, and LOC_Os10g10170, respectively. Therefore, we would like to propose that these genes contribute to drought-tolerance regulation in rice by mediating adaptation in the photosynthesis process. LOC_Os01g72800 is located in the previously reported drought-tolerance QTL between RZ14 and R117 [21], while the other three genes are not. Collectively, our results indicate that the combination of SNP analysis with co-expression network analysis is a suitable method for drought-tolerance gene prediction. This approach will help future exploration to identify the candidate genes for abiotic stress tolerance.

Conclusions
The KDML105 chromosome substitution line CSSL104 displayed a drought-tolerance phenotype based on photosynthetic maintenance ability. Identification of SNPs between KDML105 and the tolerant CSSL, together with co-expression network analysis, predicted 18 candidate drought-tolerance genes-ten of which were involved in photosynthesis or chloroplast development. Seven of them were selected for the characterization by using Arabidopsis mutant lines for the homologous genes.
Four out of seven mutants showed a higher growth rate than WT under drought stress. Therefore, LOC_Os01g72800, LOC_Os07g37550, LOC_Os07g38300, and LOC_Os10g10170 are proposed to be the drought-tolerance genes in CSSL104 rice.