Fine Mapping of a Major Pleiotropic QTL Associated with Sesamin and Sesamolin Variation in Sesame (Sesamum indicum L.)

Deciphering the genetic basis of quantitative agronomic traits is a prerequisite for their improvement. Herein, we identified loci governing variation in the main sesame lignans, sesamin and sesamolin, in a recombinant inbred line (RIL, F8) population grown in two environments. The content of the two lignans in the seeds was measured by HPLC. The sesamin and sesamolin contents ranged from 0.33 to 7.52 mg/g and 0.36 to 2.70 mg/g, respectively. In total, we detected 26 QTLs on a linkage map comprising 424 SSR markers, including 16 and 10 loci associated with sesamin and sesamolin variation, respectively. Among them, qSmin_11.1 and qSmol_11.1, detected in both environments, explained 67.69% and 46.05% of the phenotypic variation of sesamin and sesamolin, respectively. Notably, qSmin_11.1 and qSmol_11.1 were located in the same interval of 127–127.21 cM on LG11 between markers ZMM1776 and ZM918 and acted as a pleiotropic locus. Furthermore, two potential candidate genes (SIN_1005755 and SIN_1005756) at the same locus were identified based on comparative transcriptome analysis. Our results suggest the existence of a single gene of large effect that controls the expression of both sesamin and sesamolin, and provide genetic information for further investigation of the regulation of lignan biosynthesis in sesame.


Introduction
Sesame is an oilseed crop grown worldwide that has gained substantial attention owing to its nutritional and therapeutic qualities [1]. Its seeds are rich in oil, proteins, vitamins, minerals, and a class of lignans that are highly sought after because of their various biological properties [2,3]. In some places, the daily intake of lignans by males and females from sesame seeds and oil was estimated at 18.39 and 13.26 mg/person, respectively [4]. Lignans are chemically classified as monolignol dimers [5]. Sesame seeds and products contain a large number of lignans, among which sesamin and sesamolin are the major ones [6]. Many studies have reported various pharmacological activities of sesamin and sesamolin, including anti-inflammatory, anti-oxidative, anti-carcinogenic, anti-hypertensive, anti-proliferative, anti-melanogenesis, auditory-protective, anti-cholesterol, and anti-aging effects [7][8][9][10]. The total lignan content is a critical factor in sesame seed quality evaluation [11].
The health-promoting properties of sesamin and sesamolin have expanded the demand for sesame products with high lignan content [12]. Thus, breeding for high oil content, lignan content, and seed yield is the main objective of sesame breeders. Dissecting the genetic basis and understanding the genetic control of lignan biosynthesis is a prerequisite for achieving this target. Sesamin and sesamolin contents in sesame seed are quantitative polygenic traits controlled by additive and dominance effects [13][14][15]. The sesamin and sesamolin contents vary broadly among sesame germplasms [16][17][18]. Their contents are influenced by other seed components, including oil, protein, and lignin. A significant positive

Seed Sesamin and Sesamolin Content Variation in the RILs Population
In total, seeds of 477 and 449 lines (394 in common) from Wuchang and Yangluo, respectively, were tested. The other lines were excluded from the sesamin and sesamolin content analysis because of disease or insufficient seed. The seed sesamin and sesamolin contents varied among the population in both environments. The two parents differed clearly in the contents of both lignans. The average sesamin and sesamolin contents in Zhongzhi No. 13 seeds were 3.86 mg/g and 1.86 mg/g, respectively, and those of ZZM2748 were 0.91 mg/g and 0.94 mg/g, respectively (Table 1). Among the RIL population, the sesamin content ranged from 0.33 mg/g to 7.52 mg/g with an average of 2.76 mg/g in Wuchang, and from 0.36 mg/g to 5.58 mg/g with an average of 2.31 mg/g in Yangluo. Meanwhile, the sesamolin content in Wuchang and Yangluo ranged from 0.40 mg/g to 2.70 mg/g and from 0.38 mg/g to 2.70 mg/g, respectively (Table 1). The average sesamin and sesamolin contents of the RIL population both fell between the values of the two parents. The sesamolin data showed nearly normal distributions, while the sesamin data exhibited a bimodal distribution (Figure 1). The sesamin content had the highest coefficient of variation (CV), 61.81% across the two field trials. Moreover, the sesamin and sesamolin contents showed a significant positive correlation in both environments. As presented in Table 2, the correlation coefficient (r) between the two lignans ranged from 0.64 to 0.83 (p < 0.01).
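The two summary statistics reported above, the CV and the correlation coefficient r, can be illustrated with a minimal sketch. It assumes that the CV is the sample standard deviation divided by the mean (in percent) and that r is the Pearson coefficient; the study itself used SPSS, and the input values below are hypothetical, not data from the RIL population.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation / mean * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical lignan contents (mg/g) for five lines; illustration only.
sesamin = [0.9, 2.1, 3.0, 4.2, 5.5]
sesamolin = [0.8, 1.2, 1.5, 1.9, 2.4]

print(f"CV of sesamin: {coefficient_of_variation(sesamin):.2f}%")
print(f"r(sesamin, sesamolin): {pearson_r(sesamin, sesamolin):.3f}")
```

A significance test for r (the p < 0.01 reported in Table 2) is not included in this sketch.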

The QTL for Sesamin and Sesamolin Content Variations
The phenotype data of the 548 RILs from the two environments were used to map sesamin and sesamolin QTLs on a genetic map previously constructed by Wang et al. [29]. The genetic map consisted of 13 linkage groups comprising 424 SSR markers with a total length of 1869.78 cM. The average distance between two consecutive markers on the map was 5.1 cM. We performed the QTL mapping with two software packages, Windows QTL Cartographer 2.5 and QTL IciMapping v4.1 (1000 permutations, p = 0.05), using the composite interval mapping (CIM) method. In total, we detected 26 QTLs associated with the two lignan contents (Table 3): 16 QTLs for sesamin and 10 for sesamolin, with phenotypic variation explained (PVE) of 1.15% to 67.69% and 1.87% to 46.05%, respectively. These QTLs were distributed on LG2, LG3, LG4, LG6, LG8, LG9, LG11, and LG13. LG4 and LG8 stood out, with seven and eight QTLs, respectively, together accounting for more than half of the total loci (Figure 2). Among these QTLs, qSmin_3.1, qSmin_9.2, qSmin_11.1, and qSmol_11.1 were detected in both environments. The loci qSmin_11.1 and qSmol_11.1 on LG11 explained 67.69% of the sesamin and 46.05% of the sesamolin content variation, respectively. Notably, qSmin_11.1 and qSmol_11.1 were located in the same region, between SSR markers ZMM1776 and ZM918, and thus behaved as a pleiotropic locus (Table 3). To compare the detected QTLs with the sesamin QTLs previously identified by Wu et al. [23], we mapped the RAD tags (markers) of the previous QTLs onto the reference genome of Zhongzhi No. 13 [28]. The result indicated that qSmin_4.2 might coincide with Qsc-8, as both are located in the same interval of 3.1~4.7 Mb on Chr 4.
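The permutation-based significance threshold (1000 permutations, p = 0.05) mentioned above can be sketched as follows. This is a simplified illustration using a single-marker scan on simulated two-class genotypes, not the CIM procedure implemented in Windows QTL Cartographer or QTL IciMapping; all data, names, and the effect size are hypothetical.

```python
import math
import random


def lod_single_marker(genotypes, phenotypes):
    """LOD for a two-class marker: (n / 2) * log10(RSS_null / RSS_marker)."""
    n = len(phenotypes)
    mean_all = sum(phenotypes) / n
    rss_null = sum((y - mean_all) ** 2 for y in phenotypes)
    rss_marker = 0.0
    for allele in (0, 1):
        group = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        if group:
            group_mean = sum(group) / len(group)
            rss_marker += sum((y - group_mean) ** 2 for y in group)
    if rss_marker <= 0.0:
        return float("inf")  # the marker explains the phenotype perfectly
    return (n / 2.0) * math.log10(rss_null / rss_marker)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Experiment-wide LOD threshold from the null distribution of the maximum LOD.

    Phenotypes are shuffled relative to genotypes, the genome-wide maximum LOD
    is recorded for each shuffle, and the (1 - alpha) quantile is returned.
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        max_lods.append(max(lod_single_marker(m, shuffled) for m in markers))
    max_lods.sort()
    index = min(n_perm - 1, math.ceil((1.0 - alpha) * n_perm) - 1)
    return max_lods[index]


# Simulated example: 100 lines, 20 markers, one large-effect locus at marker 3.
rng = random.Random(42)
demo_markers = [[rng.randint(0, 1) for _ in range(100)] for _ in range(20)]
demo_pheno = [5.0 * g + rng.gauss(0.0, 1.0) for g in demo_markers[3]]

# Fewer permutations than the 1000 used in the study, only to keep the demo fast.
threshold = permutation_threshold(demo_markers, demo_pheno, n_perm=200)
causal_lod = lod_single_marker(demo_markers[3], demo_pheno)
print(f"threshold LOD: {threshold:.2f}, LOD at simulated locus: {causal_lod:.2f}")
```

Taking the maximum LOD over all markers in each permutation is what makes the threshold experiment-wide rather than per-marker.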

Digenic Epistatic Interactions Analysis of the Predicted Loci
To perform this analysis, we used the mixed linear model in the QTLNetwork program ver. 2.0. We identified four pairs of digenic epistatic interactions with additive × additive (AA) and AA × environment (AAE) interaction effects (Table 4). The epistatic interactions involved six loci distributed on linkage groups 2, 3, 6, 11, and 13 (Figure 3). Three pairs of loci with digenic epistatic interactions were detected for sesamin. They involved four loci distributed on LGs 2, 6, and 11. The AA and AAE interaction effects varied from 0.18% to 0.27% and 0.04% to 0.28%, respectively. A significant interaction was found between QTL qSmin_2.1 and qSmin_11.1, with an AA interaction effect of 0.27% and an AAE effect of less than 0.04%. Moreover, each of these two QTLs was individually involved in one of the remaining two epistatic interactions for sesamin. Only one digenic epistatic interaction was identified for sesamolin, involving two loci on LGs 3 and 13. Its AA interaction contribution was 0.25%, and its AAE interaction contribution was 0.30%.
Plants 2021, 10, 1343

Candidate Genes under the Major QTL Region
The QTL qSmin_11.1 located near the edge of chromosome 11 between 126.2 and 129.2 cM showed pleiotropic effects for both sesamin and sesamolin contents. We screened all gene models in this region based on their functional annotation and selected 60 of them that may contain the candidate causative genes of sesamin and sesamolin.
We performed transcriptome analysis of developing seeds at 10, 20, and 30 DPA (days post-anthesis) and compared the expression profiles of the 60 preselected genes. Among these genes, only SIN_1005755 and SIN_1005756 showed significant expression differences between the two parents at 10, 20, and 30 DPA, with the highest expression in Zhongzhi No. 13. In particular, SIN_1005756 was not expressed in ZZM2748 (Figure 4; Supplementary Figure S1). To check the reliability of the RNA-seq data, we performed qRT-PCR for the two candidate genes. The results showed trends similar to those of the RNA-seq data, supporting the RNA-seq analysis (Figure 4). We therefore selected SIN_1005755 and SIN_1005756 as possible candidate genes associated with sesamin and sesamolin variation in developing sesame seed for future studies. SIN_1005755 was annotated as an NAC domain-containing protein in the Swissprot database. SIN_1005756 was a novel gene without any annotation (Table S1).
Moreover, the expression profiles of five genes that may be involved in sesame lignan biosynthesis were also checked [26]. The five genes included a dirigent gene (SIN_1015471), a piperitol/sesamin synthase (PSS) gene (SIN_1025734), and three PSS homologs (SIN_1025729, SIN_1025730, and SIN_1003948). SIN_1003948 showed low expression in both parents at 10, 20, and 30 DPA. No significant difference was observed in the expression profiles of SIN_1015471 and SIN_1025734 between the two parents. SIN_1025729 was differentially expressed between the two parents at 30 DPA. SIN_1025730 showed higher expression in ZZM2748 than in Zhongzhi No. 13 at the three stages.

Discussion
Sesame breeding programs have focused mostly on seed yield, disease resistance, and high oil content. Recently, the objectives of sesame breeding have shifted following the discovery of the substantial positive effects of sesame lignans on human and animal health. Currently, breeding environmentally stable sesame varieties with high oil and lignan contents is the focus of sesame breeders. Previous studies demonstrated that genomics-assisted breeding techniques could be useful for creating high-quality varieties in sesame [37]. However, the genetic basis of lignan content in sesame is still not well understood. Therefore, understanding the genetic control of sesamin and sesamolin biosynthesis in sesame is of the utmost interest. Quantitative trait locus (QTL) mapping is common practice in crop plants owing to progress in statistical genomics and molecular markers [32]. It has been useful in detecting loci linked to complex quantitative traits in sesame and other crops [12,32]. The present study investigated the variation in seed sesamin and sesamolin contents of 548 RILs in two environments and revealed 26 loci associated with the two lignans in sesame.
Several studies on the variability of lignan content have shown that sesamin and sesamolin are the major lignans in sesame seed [19,38,39]. Generally, sesamin accounts for 0.20 to 8.00 mg/g of the dry weight of sesame seeds. To our knowledge, this is the first report on the variability of sesamin and sesamolin contents in a RIL population grown in two distinct locations using the HPLC method. In the two environments, the seed sesamin and sesamolin contents of the 548 RILs ranged from 0.33 to 7.52 mg/g and from 0.36 to 2.70 mg/g, respectively, and those of the two parents ranged from 0.86 to 4.38 mg/g and from 0.91 to 2.06 mg/g, respectively. The results suggest that lines with higher lignan content can be obtained by crossing suitable parents, although the efficiency of traditional crossbreeding in sesame is low. The results agree with reports on the variability of sesamin and sesamolin content observed in various germplasms. Wu et al. [23] used near-infrared reflectance (NIR) spectroscopy to evaluate seed sesamin content in 224 RILs from three locations and observed a variation from 1.70 to 5.10 mg/g. Wang et al. [16] reported that seed sesamin and sesamolin mainly ranged from 0.88 to 11.05 mg/g and from 0.93 to 6.96 mg/g, respectively, in a core collection conserved in China. In some Indian and Thai germplasms, seed sesamin ranged from 0.08 to 6.45 mg/g and from 1.63 to 7.23 mg/g, respectively, and sesamolin from 0.28 to 3.76 mg/g and from 0.48 to 2.25 mg/g, respectively [19,40,41]. The phenotype data from the two locations showed a significant positive correlation between sesamin and sesamolin content. The CV of sesamin in Wuchang and Yangluo was 65.12% and 58.03%, respectively, and that of sesamolin was 34.85% and 35.53%, respectively. These results are consistent with previous reports that sesamin and sesamolin contents in seeds are influenced by environmental conditions [18,19].
Despite the importance of seed sesamin and sesamolin contents in the selection of sesame genotypes, few genomic markers have been found to be associated with variation in these lignans. Using multi-locus mapping, Lei et al. [36] detected eight significant SNPs associated with sesamin (M15E10-5, M7E18-2, SSI182-3, and SSR023-1) and sesamolin (E5M6-3, M8E10-1, SSI182-3, and SSI281-4). The PVE of the sesamin and sesamolin SNPs ranged from 3.33% to 6.36% and 3.30% to 5.21%, respectively. Wu et al. [23] combined mixed composite interval mapping (MCIM) and multiple interval mapping methods in 224 RILs grown in three environments and detected five QTLs for sesamin. These QTLs were distributed on LGs 5, 6, 8, 11, and 16 and explained 0.41% to 14.55% of PVE. Here, using CIM in two software packages (Windows QTL Cartographer 2.5 and QTL IciMapping v4.1) and phenotype data from 548 RILs cultivated in two locations, we identified 16 and 10 QTLs linked to sesamin and sesamolin, respectively. The PVE of the sesamin and sesamolin QTLs varied from 1.15% to 67.69% and 1.87% to 46.05%, respectively. The 26 QTLs were distributed on all the LGs except LGs 1, 5, and 7. The comparison of the locations of the detected QTLs with the sesamin QTLs previously identified by Wu et al. [23] indicated that qSmin_4.2 might be identical to Qsc-8. These results indicate the need for functional characterization of the identified QTLs to confirm their potential effect on sesamin and sesamolin content variation in sesame.
Major QTLs with high heritability detected simultaneously in multiple environments are considered highly stable and reliable. We detected one major pleiotropic QTL (qSmin_11.1/qSmol_11.1) related to sesamin and sesamolin. The two QTLs were located between the same SSR markers (ZMM1776 and ZM918) on LG11 and explained 67.69% and 46.05% of the PVE, respectively. Of the remaining 24 minor QTLs, seven and eight were located close together on LG4 and LG8, respectively, indicating that they could originate from one true QTL on each of these LGs. These findings support the close genetic correlation between sesamin and sesamolin and suggest epistatic effects in the genetic control of lignan biosynthesis in sesame. Therefore, we performed the digenic epistatic interaction analysis and identified three pairs of epistatic interactions with AA and AAE effects for sesamin and one pair for sesamolin. The major QTL qSmin_11.1 was involved in two pairs of epistatic interactions, with AA interaction effects of 0.18% to 0.27% and AAE interaction effects of 0.04% to 0.22%. These results are in accordance with previous studies stating that sesamin and sesamolin contents are polygenic traits controlled by additive and dominance effects [13][14][15]. Moreover, they confirm that the yield of these lignans can be affected by growth conditions. QTLNetwork revealed two new loci that were not identified by the QTL mapping. Such loci might be minor QTLs, which are generally difficult to detect by mapping [32]. Although minor QTLs are unstable, they can influence lignan biosynthesis through AA and AAE interactions. These results also suggest that cloning all the identified QTL regions can help to dissect the genetic basis of sesamin and sesamolin contents in sesame.
Previous studies reported that the biosynthesis of sesame lignans might involve complex biochemical mechanisms [42][43][44]. The expression profiles of the pinoresinol and piperitol/sesamin synthase genes (SIN_1015471, SIN_1025734, SIN_1025729, SIN_1025730, SIN_1003948) observed in this study suggest that these genes do not account for the variation in sesamin and sesamolin contents between the two parents. Coupling transcriptome analysis of the parents' developing seeds at 10, 20, and 30 DPA with gene function annotation, we screened the major QTL region and selected two candidate genes (SIN_1005755 and SIN_1005756) that might control sesamin and sesamolin biosynthesis. SIN_1005755 encodes an NAC domain-containing protein, while SIN_1005756 is a novel gene without any annotation. NAC proteins are transcription factors and are involved in various gene regulatory networks [45]. Studies in rice, pepper, Populus, cotton, and Arabidopsis indicated that NAC domain proteins influence lignin biosynthesis and composition, promote senescence by inducing chlorophyll degradation, regulate abiotic stress responses and secondary cell wall biosynthesis, and can be targeted to improve nutrient remobilization in crop plants [45][46][47][48][49]. This suggests that SIN_1005755 might be involved in the regulation of sesamin biosynthesis by controlling lignification in the sesame seed coat. Thus, functional studies using advanced genome editing tools are needed to dissect the roles of these genes, especially SIN_1005755, in the lignan pathway and during sesame seed development. Our findings constitute a foundation for further investigations that will lead to genomics-assisted breeding of high-quality sesame varieties.

Plant Materials
A population of 548 recombinant inbred lines (RILs, F8) derived from a cross between ZZM2748 (P1, male parent) and Zhongzhi No. 13 (P2, female parent) was used to map QTLs related to sesamin and sesamolin contents in this study. The genome of Zhongzhi No. 13 has been de novo sequenced [28]. All the plant materials were provided by the National Medium-term Sesame GenBank of China (Wuhan, China).

Experimental Design and Sampling
The RILs and the two parents were cultivated at two experimental field stations of the Oil Crops Research Institute of the Chinese Academy of Agricultural Sciences (OCRI-CAAS) located in Wuchang and Yangluo (Hubei province, China). The Wuchang field trial was carried out in 2013 and that of Yangluo in 2014. In each location, all the genotypes were grown in a randomized complete block design with three replications. Standard agronomic practices were used in field management. The seeds were harvested at maturity in both trials. Seeds from the three replicates at each location were mixed in equal amounts and stored in the seed storage room of OCRI until the high-performance liquid chromatography (HPLC) analysis of sesamin and sesamolin contents.

Sesamin and Sesamolin Extraction from Seeds
Sesamin and sesamolin were extracted from the samples following the method of Rangkadilok et al. [41], as modified by Wang et al. [16]. In brief, 0.6-0.7 g of seed sample was ground to a fine powder in a mortar with liquid nitrogen. After the powder had returned to ambient temperature, 200 mg was accurately weighed and dissolved in 5.0 mL of 80% ethanol. Then, the samples were vortex-mixed for two hours and centrifuged for 5 min at 5000 rpm. The supernatant was transferred into a 15 mL volumetric flask, and the residue was re-extracted with 5.0 mL of 80% ethanol. Finally, the two extracts were combined and filtered through a 0.22 µm nylon membrane prior to HPLC analysis.

HPLC Analysis of Sesamin and Sesamolin
Using the external standard method, the extracts were analyzed on an Agilent 1260 Infinity II HPLC system (Agilent Technologies, Waldbronn, Germany) equipped with a thermostatically controlled column oven, a binary pump, and a diode-array detector, as per Wang et al. [16]. The reversed-phase column was an Agilent ZORBAX SB-C18 (250 mm × 4.6 mm, 5 µm). The mobile phase was methanol-deionized water (80/20, v/v) at a flow rate of 1 mL/min (injection volume 10 µL). Absorbance was monitored at 290 nm. Each sample was analyzed twice, and the average was taken as the final sesamin and sesamolin content. When the discrepancy between the two repeats exceeded 10% of the average, the analysis of the sample was repeated.
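The external standard quantification and the duplicate-injection check can be sketched as below. This is an illustration under stated assumptions, not the authors' exact procedure: the calibration is assumed to be a linear fit of peak area against standard concentration, the final extract volume is assumed to be 10 mL (two 5.0 mL extractions combined), the 10% rule is read as a difference greater than 10% of the mean of the two repeats, and all function names and numbers are hypothetical.

```python
def fit_calibration(concentrations, peak_areas):
    """Least-squares line: peak_area = slope * concentration + intercept."""
    n = len(concentrations)
    mean_c = sum(concentrations) / n
    mean_a = sum(peak_areas) / n
    sxx = sum((c - mean_c) ** 2 for c in concentrations)
    sxy = sum((c - mean_c) * (a - mean_a)
              for c, a in zip(concentrations, peak_areas))
    slope = sxy / sxx
    intercept = mean_a - slope * mean_c
    return slope, intercept


def content_mg_per_g(peak_area, slope, intercept,
                     extract_volume_ml=10.0, sample_mass_mg=200.0):
    """Convert a sample peak area to lignan content in mg per g of seed powder.

    The calibration gives µg/mL in the extract; multiplying by the extract
    volume gives µg, and µg per mg of sample is numerically equal to mg/g.
    """
    conc_ug_per_ml = (peak_area - intercept) / slope
    total_ug = conc_ug_per_ml * extract_volume_ml
    return total_ug / sample_mass_mg


def needs_repeat(rep1, rep2, tolerance=0.10):
    """True when the two repeats differ by more than `tolerance` of their mean."""
    mean = (rep1 + rep2) / 2.0
    return abs(rep1 - rep2) > tolerance * mean


# Hypothetical standards (µg/mL) and their peak areas (arbitrary units).
slope, intercept = fit_calibration([10.0, 20.0, 40.0], [100.0, 200.0, 400.0])

rep1 = content_mg_per_g(500.0, slope, intercept)
rep2 = content_mg_per_g(520.0, slope, intercept)
if needs_repeat(rep1, rep2):
    print("repeat the analysis")
else:
    print(f"final content: {(rep1 + rep2) / 2.0:.2f} mg/g")
```

If the extract were instead brought to a different final volume, only `extract_volume_ml` would need to change.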

Phenotypic Data Analysis
The descriptive statistics, correlation analysis, and frequency distribution were performed using IBM SPSS Statistics 20 for Windows (SPSS Inc, Chicago, IL, USA). Additionally, phenotypic data for sesamin and sesamolin were recorded using the Microsoft Office Excel 2010 software (https://www.microsoft.com).

QTL Analysis
The genetic map previously constructed from the same RIL population [29] was used to map QTLs for sesamin and sesamolin content variation. It covered a total map length of 1869.78 cM of the sesame genome with 424 SSR markers clustered in 13 linkage groups. To ensure the reliability of the QTL detection, the QTL analysis was performed with both Windows QTL Cartographer version 2.5 (North Carolina State University, Raleigh, NC, USA) and QTL IciMapping version 4.1 (1000 permutations, p = 0.05) using the composite interval mapping (CIM) algorithm. In Windows QTL Cartographer, the experiment-wide threshold was determined by 1000 permutations at a significance level of 0.05 with a genome walk speed of 1 cM. Marker IDs, LOD scores, and genetic positions were collated on a per-trait basis and imported into MapChart version 2.32 for QTL visualization [50]. The detected QTLs were named as suggested by McCouch et al. [51]. The designation started with a lowercase "q", followed by the trait abbreviation in capital letters. Since the two traits have the same initial "S", "min" and "mol" were added after the abbreviation "S" to differentiate sesamin and sesamolin, respectively. Finally, the linkage group number and the serial number of the QTL were added.
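The naming convention described above can be expressed as a short sketch. The separator layout (an underscore before the linkage group and a dot before the serial number) is inferred from names used in this article such as qSmin_11.1; the function names are hypothetical.

```python
import re

# Trait-specific suffix added after the shared initial "S".
TRAIT_SUFFIX = {"sesamin": "min", "sesamolin": "mol"}
SUFFIX_TRAIT = {suffix: trait for trait, suffix in TRAIT_SUFFIX.items()}

QTL_NAME_PATTERN = re.compile(r"^qS(min|mol)_(\d+)\.(\d+)$")


def qtl_name(trait, linkage_group, serial):
    """Build a QTL name such as qSmin_11.1 from its three components."""
    return f"qS{TRAIT_SUFFIX[trait]}_{linkage_group}.{serial}"


def parse_qtl_name(name):
    """Split a QTL name into (trait, linkage_group, serial)."""
    match = QTL_NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognized QTL name: {name!r}")
    suffix, linkage_group, serial = match.groups()
    return SUFFIX_TRAIT[suffix], int(linkage_group), int(serial)


print(qtl_name("sesamin", 11, 1))
print(parse_qtl_name("qSmol_11.1"))
```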
The digenic epistatic interactions between the loci, with their AA and AAE effects, were analyzed with QTLNetwork software version 2.1 [52]. The testing window, walking speed, and filtration window were set at 10, 1, and 10 cM, respectively. The permutation test was run 1000 times, and all significance levels were set at 0.05.

RNA Isolation and RNA-seq Analysis
To examine the gene expression differences between the two parents, developing seeds of P1 and P2 at 10, 20, and 30 DPA were sampled in triplicate from the Wuchang field for RNA-seq. Total RNA was extracted from each sample using Trizol reagent. The RNA quality was checked on a Bioanalyzer 2100 (Agilent, Santa Clara, CA, USA); the RNA integrity number (RIN) values were ≥7. RNA-seq libraries were prepared and sequenced on an Illumina HiSeq2500 platform. The gene expression levels were normalized to the number of fragments per kilobase of transcript per million mapped reads (FPKM) using the HTSeq 6.0 software [53].
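The FPKM normalization mentioned above follows a standard definition, sketched here for clarity. This is the textbook formula rather than a reproduction of the authors' pipeline, and the counts in the example are hypothetical.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical gene: 300 fragments on a 1500 bp transcript
# in a library of 20 million mapped fragments.
print(f"{fpkm(300, 1500, 20_000_000):.2f}")
```

Dividing by transcript length and library size makes expression values comparable across genes of different lengths and across samples of different sequencing depths.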

qRT-PCR Validation
Real-time quantitative PCR was performed using SYBR® Select Master Mix (2X) (Vazyme Biotec, Nanjing, China) on a LightCycler 480 II (Roche, Basel, Switzerland). Specific primers for the selected genes were designed using Primer Premier 5.0 (Table S1). The Histone H3.3 gene (SIN_1004293) was used as the internal control to normalize transcript levels. The PCR reaction mixtures were as follows: 10 µL mix (Vazyme Biotec, Nanjing, China), 5 µL cDNA, 0.5 µL of each primer, and 4 µL ddH2O. The PCR program was conducted according to the manufacturer's protocol: pre-incubation, one cycle: 95°C for 30 s; amplification, 40 cycles: 95°C for 10 s, 60°C for 30 s; melting curve, one cycle: 95°C for 15 s, 60°C for 60 s, 95°C for 15 s; cooling, one cycle: 40°C for 30 s. The real-time assay for each gene was performed with three independent biological replicates under identical conditions. Each gene expression level was calculated from cycle threshold values using the 2^(−ΔΔCt) method [54].
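The 2^(−ΔΔCt) calculation can be illustrated with a minimal sketch. It assumes roughly 100% amplification efficiency for both the target and the reference gene, as the method itself does; the Ct values and the choice of calibrator sample are hypothetical.

```python
def relative_expression(ct_target_sample, ct_reference_sample,
                        ct_target_calibrator, ct_reference_calibrator):
    """Relative expression by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene), computed within each sample
    ddCt = dCt(sample) - dCt(calibrator)
    """
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: candidate gene versus the Histone H3.3 reference,
# with one parent as the sample and the other parent as the calibrator.
fold_change = relative_expression(22.0, 18.0, 24.0, 18.0)
print(f"fold change relative to calibrator: {fold_change:.1f}")
```

In practice the calculation would be applied to the mean Ct of the three biological replicates, or per replicate before averaging.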

Conclusions
In summary, we investigated the variability of sesamin and sesamolin in 548 RILs cultivated in two environments. Both traits showed broad variation, and some lines exceeded the higher parent in sesamin or sesamolin content, indicating that these traits can be improved through breeding. A total of 26 QTLs associated with sesamin and sesamolin content variation were detected, including a locus (qSmin_11.1/qSmol_11.1) of large effect. This locus and its flanking SSR markers might be suitable for marker-assisted selection and should be tested for that purpose. The QTLNetwork analysis indicated that sesamin and sesamolin contents are also influenced by epistatic interactions with AA and AAE effects. Finally, two candidate genes (SIN_1005755 and SIN_1005756) that might be associated with the variation in these lignans were selected and will be targeted for functional studies to understand the molecular mechanisms involved in lignan biosynthesis in sesame and for genomics-assisted breeding of varieties with high sesamin and sesamolin contents.