Genome-Wide Association Study Reveals the Genetic Basis of Cold Tolerance in Rice at the Seedling Stage

: We conducted a genome-wide association study (GWAS) of cold tolerance in a collection of 127 rice accessions, including 57 Korean landraces at the seedling stage. Cold tolerance of rice seedlings was evaluated in a growth chamber under controlled conditions and scored on a 0–9 scale, based on their low-temperature response and subsequent recovery. GWAS, together with principal component analysis (PCA) and kinship matrix analysis, revealed four quantitative trait loci (QTLs) on chromosomes 1, 4, and 5 that explained 16.5% to 18.5% of the variance in cold tolerance. The genomic region underlying the QTL on chromosome four overlapped with a previously reported QTL associated with cold tolerance in rice seedlings. Similarly, one of the QTLs identified on chromosome five overlapped with a previously reported QTL associated with seedling vigor. Subsequent bioinformatic and haplotype analyses revealed three candidate genes affecting cold tolerance within the linkage disequilibrium (LD) block of these QTLs: Os01g0357800, encoding a pentatricopeptide repeat (PPR) domain-containing protein; Os05g0171300, encoding a plastidial ADP-glucose transporter; and Os05g0400200, encoding a retrotransposon protein, Ty1-copia subclass. The detected QTLs and further evaluation of these candidate genes in the future will provide strategies for developing cold-tolerant rice in breeding programs.


Introduction
Seed germination and seedling establishment are important factors required for the successful rooting of plants in an irrigated field [1]. A temperature of 25-35 °C is considered optimal for the growth of rice (Oryza sativa L) seedlings; however, the temperature of irrigation water is frequently below 15 °C in tropical and subtropical areas at high altitudes [2]. Because of the unavailability of labor and high production cost, direct seeding has become increasingly important and popular in many rice-growing countries in Asia and is being adopted as an alternative to conventional transplanting [3]. Cold stress restricts rice plant growth and development at all stages of the life cycle. At the early growth stage, cold stress leads to poor germination rate, weak seedlings with wilting, yellowing or withering leaves, delayed seedling emergence, retarded plant development, and yield loss [4,5]. Japonica rice (O. sativa L. ssp. japonica) varieties are well-adapted to temperate and sub-temperate regions and to high-altitude areas in subtropical regions and exhibit greater cold tolerance than indica rice (O. sativa L. ssp. indica) varieties, which are cultivated in tropical and subtropical regions [6]. Cold temperature affects the growth and development of rice plants during their entire life cycle, starting from seed germination to the grain filling stage, eventually leading to yield loss. Cold tolerance research in rice has generally been focused on the seedling stage and reproductive developmental stages, which are highly sensitive to unfavorable temperature conditions. Like other important agronomic traits, cold tolerance in rice is genetically controlled by multiple quantitative trait loci (QTLs). In several studies, QTLs have been detected using indica/japonica bi-parental mapping populations. More than 250 cold-tolerance-associated QTLs have been identified in rice, spanning all 12 chromosomes, at various growth and developmental stages [7][8][9][10][11].
Despite many QTLs reported for cold tolerance in rice, only a few underlying genes have been identified at different growth stages [12][13][14]. The gene identified at the lowtemperature germinability QTL, qLTG3-1, encodes a hybrid glycine-rich protein, and a single-nucleotide polymorphism (SNP) in this gene was responsible for cold tolerance [15]. Since cold tolerance in rice is a highly complex trait, identifying genes underlying the reported QTLs is a slow process.
Undoubtedly, the identification of additional cold-tolerance QTLs and genes will enhance our understanding of the mechanisms of cold tolerance and provide useful strategies for developing elite varieties. In recent years, genome-wide association study (GWAS) has been widely used to discover the genetic basis of complex traits. To date, most of the published data on genetic loci controlling cold tolerance in rice have been obtained using bi-parental mapping populations derived from O. sativa ssp. indica × O. sativa ssp. japonica crosses, where japonica cultivars usually serve as the donor of cold tolerance [16]. Because genetic variability in a bi-parental mapping population is limited to the two parental lines, GWAS has been increasingly utilized as an alternative tool to study cold tolerance in rice. With the accumulation of genomic data from a large number of rice accessions, GWAS has led to the detection of QTLs associated with cold stress response in rice. For example, 33 and 22 cold tolerance QTLs were identified at the germination and booting stages, respectively, in a collection of 174 Chinese rice accessions [5]. Similarly, 17 QTLs associated with low-temperature germinability were reported in a collection of 63 Japanese rice varieties [17]. Sales et al. (2017) [18] identified 24 SNPs associated with germination and growth rate at low temperatures. Shakiba et al. (2017) [19] reported 42 QTLs associated with cold tolerance at the seedling stage.
The damaged phenotype by cold stress in the vegetative stage shows symptoms like yellowing of leaves, reduced leaf expansion, and wilting to damage to the photosynthetic machinery, resulting in an overall reduction in photosynthetic processes and deficit in energy resources [20]. At the molecular level, cold stress leads to the modification of metabolism, such as structure, catalytic properties and function of enzymes [21]. Cold stress reduces the fluidic nature of cellular membranes and increases their rigidity. Thus many metabolites having functioned as osmolytes have been reported as an important role in inducing cold stress tolerance [21]. Cold stress affects membrane rigidification in plant cells, which is known to be triggers downstream cold-stress responses in plants [22]. The cold stress signal is transduced through several components of signal transduction pathways whose components are calcium, reactive oxygen species, protein kinase, protein phosphatase and lipid signaling cascades [23,24]. Recent gene expression analyses reported that specifically expressed genes under the cold stress condition are involved in detoxification of reactive oxygen species, damage control and repairing, restructuring plasma membrane, and acceleration in osmolytes synthesis [20].
The cold stress damage occurs throughout the whole rice cultivating season, from germination to the grain filling stage. The symptoms of cold sensitivity and damage vary according to the developmental stages. Thus, so far, many genetic studies for cold tolerance were conducted with various treatment methods and the rice population [25]. Because cold tolerance is a complex trait, various cold treatment methods have been used to understand its molecular genetic bases, such as cold water treatment, air temperaturecontrolled growth chamber, and naturally cold regions [26]. Because of the complexity of cold-tolerant traits depend on various environment and the genetic, developmental stages, the genetic studies for the cold-tolerant trait are an ongoing project. Here, we conducted GWAS to identify QTLs related to cold tolerance in a core collection of 127 Korean rice accessions at the seedling stage to provide genetic information to the breeding program for developing cold-tolerant rice in the seedling stage. Among 127 accessions, 57 are Korean landrace, including 27 weedy rice, which has been adapted to the environment of the Korean peninsula and have the potential to introduce a cold-tolerant gene into Korean commercial varieties efficiently in breeding programs.

Plant Material and Genotyping
A core set of 127 Korean rice accessions (KRICE-CORE) (Table S1), including 19 tropical japonica, 59 temperate japonica, 39 indica, seven aus, and three admixed accessions, was used in this study [27][28][29][30]. Whole-genome sequences of all 127 accessions were obtained using the Illumina HiSeq 2500 sequencing systems platform (Illumina Inc., San Diego, CA, USA). The average genome coverage was 8×, and filtered reads were aligned to the rice reference genome sequence (IRGSP 1.0). The following filtering parameters were used for the genomic sequences, as implemented in the Plink software v1.90 [31]: minor allele frequency (MAF) > 5%, missing data < 1%, and heterozygosity ratio < 5%. Finally, approximately 2 million SNPs were selected from a total of 6.5 million raw-data SNPs. The selected SNPs were distributed widely on the chromosomes ranging from 154,346 SNPs on chromosome 12 to 274,855 SNPs on chromosome 1.

Cold Treatment and Cold Tolerance Evaluation
Approximately 20 seeds of each of the 127 rice accessions were sown in a grid (50 mm × 50 mm) with seedbed media (Punong Ltd., Gyeongju, Korea) and placed in a greenhouse under darkness for 3 days to induce seed germination. After germination, the young seedlings were grown in the greenhouse under natural light conditions for 2 weeks. Then, the 2-week-old seedlings were transferred to a growth chamber (Hanbaek Sci. Bucheon, Korea) and subjected to cold treatment at 12 °C and 70% relative humidity [32] for 7 days and then at 10 °C and 70% RH for 5 days. Subsequently, the cold-treated rice seedlings were returned to the greenhouse and allowed to recover for 2 days. The coldtolerance level of rice seedlings was evaluated based on their phenotype and scored on a scale ranging from 0 (highest cold tolerance) to 9 (zero cold tolerance; dead plant) ( Figure  1) [33]. Three independent biological replications were performed, and the final cold-tolerance score was calculated as the average of the three replications. Figure 1. Representation of cold-tolerant score stress at the seedling stage. 0: resistance ~9: highly sensitive. 0-1: no damage to leaves, normal leaf color (strongly tolerant), 2-3: the tip of leaves slightly dried, folded and light green (tolerant), 4-5 some seedlings moderately folded and wilted, pale green to yellowish leaves (moderately tolerant), 6-7 seedlings severely rolled and dried, reddish-brown leaves (sensitive), 8-9: most seedlings dead and dying (highly sensitive).

Population Structure and Linkage Disequilibrium (LD) Decay Analysis
The population structure of 127 accessions was analyzed using ADMIXTURE 1.3.0 software [34], based on ~2 million high-quality SNPs. The number of distinct sub-groups was estimated using cross-validation (cv) error. ADMIXTURE was run with K starting from 2 to 6, and the optimal number of ancestries (K) was obtained by cross-validation (cv) error from ADMIXTURE based on tenfold cross-validation. The results were visualized using Pophelper structure Web App v1.0.10 (http://pophelper.com) [35]. The principal component analysis (PCA) matrix was generated using the Plink software [31], and an optimized number (5) of principal components (PCs) was used as a Q-matrix for GWAS correction. PCA plot visualization was performed in R version 4.0. Neighbor-joining (NJ) trees were generated using MEGA X [36], and the result was visualized using Tree of Life (iTOL) v4 (https://itol.embl.de) [37]. LD was calculated using the PopLDdecay software [38]. Pair-wise, LD was calculated for all SNPs and averaged across the whole genome. The LD decay was estimated as the chromosomal distance at which the average pair-wise correlation coefficient (r 2 ) decreased to half its maximum value.

GWAS
First, SNPs with MAF < 5% were removed from the data. To perform GWAS analysis, the efficient mixed-model association (EMMA) [39], implemented in the R package EMMA was used. The kinship matrix was estimated using the emma.kinship function. To use MLM with Q-matrix and K-matrix, the kinship (K-matrix) was computed, and PC1-PC3 of genomic data was used as the Q-matrix. GWAS results and plot visualization was performed using the GAPIT package (version 3.0) in R. The association threshold was calculated using the following Equation (1) [40]: Finally, the threshold was set as −log10(P) = 5.954 for the identification of the association QTLs. The SNP markers located at QTL peaks were designated as lead SNPs. LD decay analysis identified a 250 kb region on either side of the lead SNP as the candidate genomic region for gene identification.

Candidate Gene Prediction and Haplotype Analysis
Based on the results of LD decay analysis, a 500 kb reference sequence of the detected QTL regions was identified as the candidate region. Functional annotations of genes within the candidate regions were extracted from the Gramene database (www.gramene.org, accessed on 8 February 2020). Haplotype analysis was conducted using all SNP markers, except those with missing and heterozygous data. Haplotypes in at least five rice accessions were used for comparative phenotypic analysis. One-way analysis of variance [41] followed by the least significant difference (LSD) test was used to compare phenotypic differences among haplotypes.

Cold Tolerance in Rice at the Seedling Stage
A total of 127 rice accessions belonging to five different groups (tropical japonica, temperate japonica, indica, aus, and admixed) were used to evaluate cold tolerance at the seedling stage. The boxplot of cold-tolerance score that divided the groups by ecotype showed significance between the japonica group and the other groups ( Figure 2A). The graph was skewed to the left of the 0-9 scale, with a score of 3 showing the highest frequency ( Figure  2B). The average cold-tolerance score of 127 accessions was 1.69, where 64 accessions showed a score of <2.00. RWG-017 (OKCHEONG; temperate japonica) and RWG-102 (Sando; tropical japonica) were the two most cold-tolerant accessions (score = 0.00) while RWG-040 (Magnolia; tropical japonica) was the least cold-tolerant (score = 8.67). Compari-son of variation in cold tolerance among the five different groups revealed that the temperate japonica group showed greater cold tolerance than the indica and aus groups, with an average cold-tolerance score of 1.66 ( Figure 2B). Temperate japonica rice showed slightly higher cold tolerance than tropical japonica rice; however, both groups showed high cold tolerance.
A B

Population Structure and LD Decay
The ADMIXTURE software was used to calculate the genetic composition of all 127 accessions. Cross-validation (CV) analysis indicated that K = 5 was the optimal population grouping, which showed the lowest CV error compared with other K values ( Figure 3A). Thus, these 127 accessions were divided into five groups, which were mostly distinguished by their subspecies ( Figure 3B, Supplementary Table S1). Temperature japonica was divided into cluster 1 and cluster 4. Tropical japonica was dominant in cluster 2, Aus was dominant in cluster 3, indica was dominant in cluster 5. The results of PCA showed that PC1 and PC2 explained 39.47% and 20.44% of the total variation in population structure, respectively. Accessions belonging to temperate japonica, tropical japonica, indica, and aus subspecies formed significantly distinct clusters, whereas the admixed accessions exhibited ambiguous separation ( Figure 3C). Additionally, the NJ tree constructed based on Nei's genetic distance revealed five clusters ( Figure 3D), consistent with the results of PCA in distinguishing among ecotypes. Most accessions were clearly separated by subspecies, while the admixed accessions were dispersed among the different clusters. Based on the PCA and neighbor-joining tree PCs ( Figure 3C,D), the two subpopulations, indica and japonica, were clearly differentiated. Temperate japonica and tropical japonica were distinguished but closely related together. Aus was distinguished from indica and japonica but closed related with indica. The decay of LD along physical distance was computed for the full panel of rice accessions. The value of r 2 declined with the increase in physical distance. The threshold value for candidate regions was determined as half of the maximal r 2 value (0.26), which produced a candidate genomic region of 250 kb (Supplementary Figure S1).

GWAS of Cold Stress Tolerance in Rice Seedlings
Manhattan plots of SNP markers significantly associated with cold tolerance at the seedling stage are presented in Figure 4. SNPs with −log10(P) score of 6 were considered to be significantly associated with cold tolerance (refer to Materials and Methods). Four significant association QTLs were detected (Table 1): qCTS1 (chromosome 1; −log10(P) score = 6.64; explaining 17.5% of the total phenotypic variation), qCTS4 (chromosome 4; −log10(P) score = 6.35; 16.5% of the total phenotypic variation), qCTS5-1 (chromosome 5; −log10(P) score = 6.40; 16.7% of the total phenotypic variation), and qCTS5-2 (chromosome 5; −log10(P) score = 6.97; 18.5% of the total phenotypic variation). Genomic segments corresponding to these four QTLs were defined according to the LD block size surrounding the lead SNP; for each lead SNP, a 500 kb region surrounding the marker (250 kb flanking the lead SNP on either side) was determined. Next, we compared the genomic region of the four QTLs with previously reported QTLs ( Table 1). The results showed that the qCTS4 QTL on chromosome 4 overlapped with QTL CQAA8 associated with cold tolerance at the seedling stage [42]. Interestingly, qCTS5-1 overlapped with QTL AQFR028 reported for seedling vigor [43]. Similarly, the AQAL060 QTL reported for root thickness [44] and CQE39 QTL reported for rubisco content [45] overlapped with qCTS1 on chromosome 1 and qCTS5-2 on chromosome 5, respectively.

Haplotype Analysis to Identify Candidate Genes underlying Cold Tolerance
To identify candidate genes responsible for cold tolerance, all genes within the 500 kb region encompassing the lead SNP were extracted and annotated. At the qCTS1 QTL, 57 genes were identified in the rice annotation project database (RAP-DB) (IRGSP 1.0). After removing nonprotein-coding and hypothetical genes from this gene set, 38 genes were retained. Using this approach, we identified 12, 40, and 50 candidate genes at qCTS4, qCTS5-1, and qCTS5-2 QTLs, respectively. These candidate genes (140 total) were then subjected to haplotype analysis. The phenotypic comparison was conducted among haplotypes containing at least five rice accessions. Finally, a total of five candidate genes showing statistically significant differences among the haplotypes were detected (Supplementary Figure S2). Based on the phenotypic differences among haplotypes and functional annotation of genes, we selected three candidate genes for further analysis. Haplotype analyses of these three candidate genes are presented in Figures 5-7 and Supplementary Figure S3. Heterozygous SNPs and SNPs with missing data were not included in the analysis. SNPs in exons were used for haplotype and haplotype variation analysis. Os01g0357800 (PPR domain-containing protein) contained two non-synonymous SNPs (C→G, Chr1_14473611, L→V substitution; C→T, Chr1_14474211, P→S substitution) that formed three haplotypes (Hap1, Hap2, and Hap3) ( Figure 7B). The cold-tolerance score of Hap3 differed significantly from that of Hap1 and Hap2. The Hap1 of Os01g0357800 was the superior genotype in indica rice ( Figure 5C). Os05g0171300 (similar to plastidial ADPglucose transporter) contained two non-synonymous SNPs (G→A, Chr5_4267411, A→T substitution; G→C, Chr5_4267425, Q→H substitution) in exons, which formed two haplotypes (Hap1 and Hap2) ( Figure 6). The mean cold-tolerance score of hap1 (4.62, 7 accessions) was higher than that of hap2 (2.40, 120 accessions). Os05g0400200 (similar to retrotransposon protein Ty1-copia subclass) contained three SNPs in exons ( Figure 5A). Among these SNPs, two were non-synonymous SNPs and were located in the coding region (G→C, Chr5_19471968, E→D substitution; T→C, Chr5_19472003, V→A substitution) located in the dienelactone hydrolase family domain. These two SNPs formed two haplotypes (Hap1 and Hap2) ( Figure 5B), which showed significantly different cold-tolerance scores; the mean cold-tolerance score of Hap1 (3.32, 67 accessions) was higher than that of Hap2 (1.64, 60 accessions).

Discussion
In general, cold tolerance in rice is evaluated under natural conditions, where rice cultivated in the field is evaluated for yield loss. However, this method is subject to variability in environmental conditions, such as the degree of cold temperature and fluctuations in temperature over time [46]. In this study, the cold tolerance of rice seedlings was evaluated in a growth chamber where the temperature was constantly controlled, unlike the field evaluation. After cold treatment at 12 °C and 70% RH for 7 days and then at 10 °C and 70% RH for 5 days, rice seedlings were allowed to recover in the greenhouse for 2 days. Therefore, the cold-tolerance score represents not only the cold sensitivity of the rice accession but also its ability to recover from the cold-induced damage. The overall distribution of the cold-tolerance score was skewed to the left (low score) on the 0-9 scale, which was possibly due to the additional recovering cultivation after cold treatment. In addition to the 127 rice accessions in the panel, we included the cold-sensitive variety IR38 as a check genotype. The cold-tolerance score of "IR38" was 8.7, indicating that the cold treatment was effective.
Comparative analysis of the physical/genetic positions of the QTLs identified in the current study and previously reported QTLs [47] revealed that the qCTS4 QTL overlapped with the CQAA8 QTL, which is associated with cold tolerance in rice seedlings and was identified in a recombinant inbred line [48] population derived from a cross between M202 (temperate japonica) and IR50 (indica) [42]. Although parameters, such as temperature, duration, and humidity, were not identical between the current study and the cold treatment and evaluation methods were similar between the two studies. The qCTS5-1 QTL overlapped with the AQFR028 QTL associated with rice seedling vigor, which was identified in a RIL population derived from a cross between Lemont (japonica) and "Teqing" (indica) [43]. Seedling vigor was evaluated based on the germination rate, root length, shoot length, and dry weight. Thus, the seedling vigor-related traits evaluated in this QTL study are regarded as cold-tolerance traits, based on the fact of evaluating the performance under low temperature. The qCTS1 QTL overlapped with the root thicknessassociated AQAL060 QTL [44], which was identified in a RIL population derived from the lowland indica rice, IR58821 and IR52561. Although there is no direct evidence supporting the relationship between root development and cold tolerance in rice seedlings, further experiments evaluating the possible role of root development in cold tolerance are worthwhile. Additionally, the CQE39 QTL associated with rubisco to chlorophyll ratio overlapped with the qCTS5-2 QTL identified in this study [45].
According to the results of our GWAS, 140 candidate genes were identified in the LD blocks of the detected QTLs. Based on the results of functional annotation and haplotype analysis, we focused on three candidate genes (Os01g0357800, Os05g0400200, and Os05g0171300) in this study. According to the haplotype analysis for the three candidate genes, RWG-107-one of the tolerant accessions showing a tolerant score as 0-contains a cold-tolerant haplotype for three candidate genes. RWG-107 is a Korean landrace and temperate japonica rice. Thus RWG-107 is suggested as a donor plant for cold-tolerant breeding programs in Korea. To analyze the transcript levels of the three candidate genes under various conditions, we used the transcriptome encyclopedia of rice (TENOR) database (http://tenor.dna.affrc.go.jp/) [49]. Os01g0357800 is predicted to encode an r-domain-containing protein. PPR proteins are involved in a wide range of organelle RNA processing activities [50]. Increasing molecular evidence shows that PPRs play important roles in various biotic and abiotic stresses, such as photo-oxidative stress responses, in Arabidopsis [51], gene expression regulation in response to wounding stress and pathogen infection [52,53], and the negative regulation of drought stress and abscisic acid (ABA) signaling genes [54]. According to the TENOR web search, the expression levels of Os01g0357800 were increased under cold and flooding stress (Supplementary Figure S4). Considering the previously reported function of PPRs and the expression of Os01g0357800 under cold stress, further investigation is needed to confirm its role in cold tolerance. Os05g0400200 is described as similar to retrotransposon protein Ty1-copia subclass. Transposable elements (TEs) are known for DNA sources, causing insertion-mediated gene dysfunction. Retrotransposons, as a class of TEs, can generate stable mutations when inserted within or near a gene [55]. Previous reports show the activation of a retrotransposon under the influence of environmental factors, such as cold, pathogen infection, microbial elicitors, tissue culture, protoplast production, and wounding [56,57], suggesting the possible role of retrotransposons in the environmental adaptation of an organism [57]. According to the TENOR web search, Os05g0400200 is expressed under various environmental treatments; however, its expression level under stress conditions is not significantly different from that under control conditions (Supplementary Figure S4). Os05g0171300 is described as similar to plastidial ADP-glucose transporter. ADP-glucose transporter is involved in starch biosynthesis and compound granule formation in the rice endosperm [58]. Starch is regarded as an important key molecule in mediating plant responses to abiotic stresses, such as water deficit, high salinity or extreme temperatures [59]. Under the unfavorite environmental conditions, plants generally remobilize starch to provide energy to compensate for reduced photosynthesis. The released sugars and other derived metabolites support plant growth under stress and function as osmoprotectants and compatible solutes to mitigate the negative effect of the stress [60]. Sugars are known to act as signaling molecules, which crosstalk with the ABA-dependent signaling pathway to activate downstream components in the stress response cascade [61]. In addition to the association of starch metabolism with abiotic stress, according to TENOR web search results that the expression of Os05g0171300 is increased under high cadmium stress ( Supplementary Figure S4), suggests the possible role in cold tolerance.

Conclusions
In this study, we evaluated the cold-tolerance levels of a collection of 127 diverse rice accessions. Four cold-tolerance-associated QTLs were identified by GWAS. Haplotype analysis and phenotypic comparison revealed three candidate genes affecting cold tolerance at the seedling stage. RWG-107, a Korean landrace and temperate japonica rice, is suggested as a donor plant for a cold-tolerant breeding program in Korea. Further evaluation of these candidate genes in the future will provide strategies for developing cold-tolerant, elite rice varieties.

Supplementary Materials:
The following are available online at www.mdpi.com/2077-0472/11/4/318/s1, Table S1: The list of rice accessions and cold tolerance scoring, Figure S1: LD decay analysis; Figure S2: Haplotype analysis of Os05g0397700, Os05g0401000. (A) Gene structure and SNP positions of Os05g0397700; (B) Gene structure and SNP positions of Os05g0401000; Figure S3: The location of detected QTLs and the candidate genes; Figure S4: Expression profiles in rice seedling under the various environmental conditions.