Genome Size Variation in Sesamum indicum L. Germplasm from Niger

Sesamum indicum L. (Pedaliaceae) is one of the most economically important oil crops in the world, thanks to the high oil content of its seeds and its nutritional value. It is cultivated all over the world, mainly in Asia and Africa. Well adapted to arid environments, sesame offers a good opportunity as an alternative subsistence crop for farmers in Africa, particularly Niger, to cope with climate change. For the first time, the variation in genome size among 75 accessions of the Nigerien germplasm was studied. The sample, collected throughout Niger, displays varied morphological, biochemical and phenological traits. For comparison, an additional accession from Thailand was evaluated as an available Asian representative. In the Niger sample, the 2C DNA value ranged from 0.77 to 1.00 pg (753 to 978 Mbp), with an average of 0.85 ± 0.037 pg (831 Mbp). Statistical analysis showed a significant difference in 2C DNA values among 58 pairs of Niger accessions (p-value < 0.05). This significant variation indicates the likely genetic diversity of sesame germplasm, offering valuable insights into its potential for climate-resilient agriculture. Our results therefore raise a fundamental question: is intraspecific variability in the genome size of Nigerien sesame correlated with specific morphological and physiological traits?

Sesame is mainly cultivated in the tropical and subtropical areas of the world [4,9]. Numerous wild relative species have been found in Africa and a smaller number in India [1]. Sesame cultivation is well adapted to arid environments and offers Africa in general, and Niger in particular, a good economic and national opportunity under changing climatic conditions that have become unsuitable for other crops. World sesame production reached 6,741,479.41 tons in 2022 [10]. African production accounts for 59.3% of this world production (4,000,119 tons), with Niger contributing 104,088.04 tons [10], representing a 20% increase in Niger's production in 2021.
The African germplasm remains poorly characterized compared to the Asian germplasm, which has benefited from powerful genomic and genetic investigations [9,11-19]. Dossa et al. [20] carried out an analysis of the genetic diversity and population structure of sesame accessions originating mainly from West Africa and Asia and showed that the African accessions have lower genetic diversity than the Asian accessions. However, in our previous work, preliminary analyses using AFLP markers revealed high genetic diversity in the Nigerien sesame germplasm [21]. In addition to these genetic studies, sesame has been extensively characterized morphologically and biochemically [1,5,22-24]. The accessions of the Nigerien sesame show great agro-morphological and biochemical diversity [22-24]. However, the data on the genome size (GS) are too sparse. For the Asian accessions, genome sequencing has made it possible to estimate the GS [9,13,15-17,19], but only two reports concern the total nuclear DNA amount (2C DNA) obtained via flow cytometry [15,25].
The genome size, or 2C DNA, is one of the most fundamental biological traits [26]. Swift [27] was the first to propose the term "C-value" for the DNA content of the unreplicated gametic chromosome set of an individual, considered to be invariable (C for constant). Since then, intraspecific variation in the GS has often been detected [25,28,29]. The GS is a highly relevant trait that correlates with many biotic and abiotic characteristics [26,30]. Therefore, knowledge of the DNA amount is currently an invaluable feature in a number of disciplines, including ecology and phytogeography [31-33], systematics and evolution [34-37], biodiversity screening [29,38-40] and also biotechnology and agricultural sciences [41-43].
Flow cytometry is currently the most widely used method for assessing nuclear DNA amounts [44,45] because it is the most accurate and easiest to use. Our study is the first analysis of the GS via flow cytometry carried out on Nigerien sesame accessions. The aim was to explore the potential intraspecific GS variation in this germplasm, which presents variable morphological, biochemical and phenological characteristics [22-24].

Plant Materials
The plant material studied included 75 of the 140 accessions of S. indicum collected from 2015 to 2016 in 44 localities from 6 regions of Niger (Figure 1, Table 1). These regions belong to the agroecological zones where sesame production is the most important in Niger. The seventy-five accessions in the Niger germplasm were selected on the basis of their membership of the previously identified genetic and/or agro-morphological groups [21,22] and their geographical distribution (Figure 1, Table 1). Most of them had already been biochemically characterized [23]. A Thai sesame accession (STh), purchased on the market, was included in the studied sample. The seeds of each accession were germinated in Petri dishes, and cotyledons were used for the GS evaluation.

Genome Size Assessment Using Flow Cytometry
In order to isolate the nuclei for cytometric analysis, the cotyledons from the germinated seedlings of S. indicum and the leaves from the internal standard (tomato, Solanum lycopersicum L. 'Montfavet 63-5', 2C = 1.99 pg [46]) were co-chopped using a razor blade in a Petri dish containing 1 mL of cold Gif nuclear isolating buffer (GNB): 45 mM MgCl2, 30 mM sodium citrate and 60 mM MOPS acid pH 7.0, 1% PVP 10,000, RNAse (2.5 U/mL) and 10 mM sodium metabisulfite (Na2S2O5), which is a reducing agent that is less toxic than β-mercaptoethanol [44]. The nuclei suspension was filtered through a nylon mesh (Partec-CellTrics, pore size of 30 µm), stained with the DNA-intercalating fluorochrome propidium iodide (stock 1 mg/mL, Sigma-Aldrich, F-38297 Saint-Quentin-Fallavier Cedex, France) at a final concentration of 50 µg/mL, and kept for 5 min at 4 °C.
At least five individuals from each accession were analyzed to obtain the average GS. The total 2C DNA content of at least 2000 stained nuclei was determined for each sample using a CytoFLEX S cytometer (Beckman Coulter Life Science, United States; F-93420 Villepinte, France), with excitation at 488 nm (50 mW) or 561 nm (30 mW) and emission collected through a 690/50 or 610/20 nm band-pass filter, respectively. Fluorescence histograms were analyzed using Kaluza software version 2.1 (Beckman Coulter) for each sample and internal standard. The nuclear DNA content was estimated from the linear relationship between the fluorescent signals of the stained nuclei of the S. indicum sample and of the internal standard, using the following formula:
2C DNA (pg)/nucleus = (Sample 2C peak mean / Standard 2C peak mean) × Standard 2C DNA (pg)
The mean 1Cx value (monoploid genome size) was calculated taking into account that 1 pg = 978 Mbp [47].
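The formula and the pg-to-Mbp conversion above can be sketched in a few lines of Python. This is only an illustration: the function names and the peak means in the demo are hypothetical, and the 1Cx calculation assumes a diploid sample (2C divided by 2). The standard value (1.99 pg) and the conversion factor (978 Mbp per pg) are those given in the text.

```python
# Constants taken from the text; everything else here is illustrative.
STANDARD_2C_PG = 1.99   # tomato 'Montfavet 63-5' internal standard [46]
MBP_PER_PG = 978.0      # 1 pg = 978 Mbp [47]


def two_c_pg(sample_peak_mean: float, standard_peak_mean: float,
             standard_2c_pg: float = STANDARD_2C_PG) -> float:
    """2C DNA (pg) from the ratio of sample to standard 2C peak means."""
    if standard_peak_mean <= 0:
        raise ValueError("standard peak mean must be positive")
    return sample_peak_mean / standard_peak_mean * standard_2c_pg


def pg_to_mbp(pg: float) -> float:
    """Convert a DNA amount in pg to Mbp."""
    return pg * MBP_PER_PG


def one_cx_mbp(two_c: float, ploidy: int = 2) -> float:
    """Monoploid genome size (1Cx) in Mbp; ploidy=2 is an assumption."""
    return pg_to_mbp(two_c) / ploidy


if __name__ == "__main__":
    # Hypothetical peak means read from a fluorescence histogram.
    gs = two_c_pg(sample_peak_mean=42.7, standard_peak_mean=100.0)
    print(f"2C = {gs:.3f} pg = {pg_to_mbp(gs):.0f} Mbp; "
          f"1Cx = {one_cx_mbp(gs):.0f} Mbp")
```

Applied to the extreme individual values reported for the Niger panel, `pg_to_mbp(0.77)` and `pg_to_mbp(1.0)` reproduce the 753 and 978 Mbp quoted in the abstract.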

Statistical Analyses
Statistical analyses were carried out on all the data, which were entered into Excel and analyzed using R version 4.3.2 (R Core Team, 2023) [48]. Descriptive statistics (mean, minimum, maximum, standard error, coefficient of variation) were first calculated for each accession (Table S1).
The non-parametric Kruskal-Wallis test was used to check for any differences in the GS among the 76 accessions of S. indicum. Due to the small sample size, 10,000 random permutations of individuals were also performed to calculate each p-value and so support the significance of the Kruskal-Wallis test. In the event of a significant effect, post-hoc non-parametric pairwise comparisons of sample medians were carried out using Dunn's test [49]. Adjusted p-values for multiple testing were calculated using Holm's method [50]. Spearman correlation coefficients were also used to test the relationship between the GS and various variables: branching (number of primary lateral branches at physiological maturity), fatty acid content (percentage of total fatty acids per unit dry matter), flowering time (date on which 50% of plants in a plot flower), height (from the crown to the top of the plant, measured at harvest in cm), latitude and longitude (Table 1), seed maturity (date at which all plants are fully mature), tegument color of mature seeds after harvest and yield (weight of total seeds harvested from a plant in g/plant). The data on the genetic groups (based on AFLP markers) and agro-morphological groups (determined by using classification methods based on the agro-morphological data from the experimental trials) were obtained from previous studies [21-24] that used the same seed lots as in the present work. The chi-squared test was performed to examine the relationship between the genome size and flowering time.
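As an illustration of the permutation and multiple-testing steps described above, the following Python sketch computes a Kruskal-Wallis H statistic, a permutation p-value obtained by shuffling individuals among groups, and Holm-adjusted p-values. It is not the authors' R code: the function names and the demo values are hypothetical, Dunn's test itself is not implemented, and the tie correction for H is omitted because it is constant across permutations of the same pooled data.

```python
import random
from itertools import chain


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def kruskal_h(groups):
    """Kruskal-Wallis H statistic (no tie correction)."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        total += sum(ranks[start:start + len(g)]) ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)


def permutation_p(groups, n_perm=10_000, seed=0):
    """Share of label permutations whose H is at least the observed H."""
    rng = random.Random(seed)
    observed = kruskal_h(groups)
    pooled = list(chain.from_iterable(groups))
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if kruskal_h(shuffled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=p_values.__getitem__)
    adjusted, running_max = [0.0] * m, 0.0
    for rank, idx in enumerate(order):
        running_max = max(running_max, (m - rank) * p_values[idx])
        adjusted[idx] = min(1.0, running_max)
    return adjusted


if __name__ == "__main__":
    # Hypothetical 2C values (pg) for two accessions, five individuals each.
    small = [0.78, 0.79, 0.77, 0.78, 0.80]
    large = [0.94, 0.95, 0.96, 0.93, 0.97]
    print(permutation_p([small, large], n_perm=2000))
    print(holm_adjust([0.01, 0.04, 0.03]))
```

With only five individuals per accession, the number of distinct relabelings is small, which is one reason a permutation p-value is a sensible complement to the asymptotic chi-squared approximation.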

Genome Size of Nigerien and Thai Accessions
The 2C DNA values obtained for 76 S. indicum accessions are presented in Table 2 and Table S1. Individual GSs ranged from 2C = 0.77 pg for S86, S91, S96 and S104 to 2C = 1.00 pg in accession S132 for the entire Niger panel (Tables 2 and S1). The mean GS ranged from 2C = 0.78 ± 0.008 pg in accession S96 to 2C = 0.95 ± 0.031 pg in accession S132 (Figures 2A,B, S1 and S2; Table 2 and Table S1). The mean 1Cx value (monoploid genome size) of the Nigerien accessions therefore ranged from 382 Mbp for accession S96 to 464 Mbp for accession S132 (Table 2). The GS of the Thai STh accession was smaller than that of any Nigerien accession (2C = 0.73 ± 0.01 pg; 1Cx = 356 Mbp) (Tables 2 and S1), and this accession also showed the smallest individual value (2C = 0.72 pg; 352 Mbp) (Figures 2C and S3).

Analysis of the Genome Size Variation among the Nigerien Sesame Accessions
All the accessions studied showed a fairly wide range of 2C values and an overall coefficient of variation of 4.4% to 4.8% depending on whether or not the STh GS was included (Tables S1 and S2). When estimated within the genetic (Gr) and agro-morphological (AgroM) groups (Table 2 and [21,22]), the coefficient of variation was 3.7%, 4.8% and 4.3% in Gr1, Gr2 and Gr3, respectively, and 4.5%, 4.4% and 4.2% for AgroM1, AgroM2 and AgroM3, respectively (Table S2).
Kruskal-Wallis tests were performed on all the scored GS data from the accessions studied and among or within the Nigerien genetic and agro-morphological groups (Tables 2 and 3). The analysis undertaken with all the GS scored data, including or excluding the STh accession, provided strong evidence of a highly significant difference (p < 0.001) between at least one pair of sesame accessions under study (Table 3). Dunn's pairwise test was then performed among the Niger sesame accessions, and the results showed a significant difference in the GS (p < 0.05) among 58 pairs of accessions (Table S3). When the GS of the STh accession was included in Dunn's test, 68 pairs of accessions showed a significant variation in the GS (p < 0.05), with 19 of them showing a significant difference in the GS compared with the Thai accession (p < 0.05).
Table 3. Kruskal-Wallis tests performed on the genome size scored data from all accessions studied (All accessions), among the genetic and agro-morphological groups (Among groups) and within each Niger genetic (Gr1, Gr2 and Gr3) or agro-morphological (AgroM1, AgroM2 and AgroM3) group.
The Kruskal-Wallis tests revealed a significant GS variation (p < 0.001 and p < 0.01) within and among the genetic and agro-morphological groups (Figure 3, Tables 2 and 3). Dunn's pairwise test including or excluding the STh accession (Tables S4 and S5) revealed a significant GS variation between the Gr2 and Gr3 genetic groups (p < 0.01), the AgroM1 and AgroM2 groups (p < 0.05) and the AgroM2 and AgroM3 groups (p < 0.01). When STh was excluded from the pairwise test, the GS variation between the AgroM2 and AgroM3 groups was highly significant (p = 0.001). All the genetic and agro-morphological groups displayed a highly significant GS variation (p = 0.00) compared to the Thai STh accession used as an independent group (Figure 3, Table S5).
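For readers who want to reproduce a coefficient of variation from raw or summary values, a minimal Python sketch follows. The function names are hypothetical, and the check at the end assumes that the "0.85 ± 0.037 pg" quoted for the Niger sample is a mean and a standard deviation; under that assumption the result lands on the lower bound of the overall range reported above.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample coefficient of variation in percent: 100 * SD / mean."""
    return 100.0 * stdev(values) / mean(values)


def cv_from_summary(sd, avg):
    """Same quantity when only the SD and the mean are available."""
    return 100.0 * sd / avg


if __name__ == "__main__":
    # Assumes 0.037 pg is the standard deviation around the 0.85 pg mean.
    print(round(cv_from_summary(0.037, 0.85), 1))  # 4.4
```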

Correlation between the Genome Size and Flowering Time
Among all the variables tested, a significant correlation for the Nigerien accessions was observed only between the GS and flowering time (Table 4). Although the Spearman's correlation coefficient was moderate (r = −0.27), it was still statistically significant (p = 0.02). The linear regression model showed a statistically significant (p < 0.05) but relatively weak association between the GS and flowering time (Figure 4, Table 5). In fact, only 4.5% of the variation in flowering time can be explained by the GS values. No correlation was found between the GS and flowering time when the analysis was carried out within the genetic or agro-morphological groups (Table 5).
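The Spearman coefficient and the regression R² reported above are different quantities: Spearman's coefficient is a Pearson correlation computed on ranks, whereas the R² of a simple linear regression with an intercept equals the squared Pearson correlation of the raw values. The Python sketch below makes the distinction explicit; the function names and the demo data are hypothetical and are not taken from the study.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def r_squared_simple(x, y):
    """R² of a simple linear regression of y on x (with intercept)."""
    return pearson_r(x, y) ** 2


def implied_abs_pearson(r_squared):
    """|Pearson r| implied by the R² of a simple linear regression."""
    return sqrt(r_squared)


if __name__ == "__main__":
    # Hypothetical 2C values (pg) and days to 50% flowering.
    gs = [0.78, 0.81, 0.84, 0.86, 0.90, 0.95]
    days = [47, 45, 46, 44, 45, 42]
    print(spearman_rho(gs, days), r_squared_simple(gs, days))
    print(implied_abs_pearson(0.045))
```

Under this identity, an R² of 4.5% corresponds to a Pearson correlation of about 0.21 in absolute value, slightly lower than the rank-based coefficient of −0.27.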

Discussion
Previous information on the GS of S. indicum was mainly based on the whole genome sequencing of Asian landraces [9,13,15,19]. Although genome sequencing provides valuable genomic information, the estimates of the GS derived from this method often underestimate the true total amount of DNA due to the lack of data on repetitive DNA, which is difficult to assemble using short-read sequencing technologies [51]. The improved assembly and annotation of the sesame genome using long-read sequencing and Hi-C technologies have resulted in an updated sequence [9], which is still smaller than the GS expected from flow cytometry [15] or k-mer analysis [9]. Flow cytometry, particularly when using internal standards with a known and stable GS, is the most reliable approach [44,45], especially for analyzing the intraspecific variation in the GS. Despite the importance of accurate GS determination, flow cytometry has only been used in two studies to assess the GS in S. indicum [15,25]. In our study, flow cytometry, used on a large sample of sesame accessions from Niger, provided evidence of an intraspecific variation in the GS in S. indicum. The Niger sesame accessions have previously been structured into three groups on the basis of agro-morphological traits or genetic criteria [21,22,24]. Statistical analyses showed that the GS varied among and within the genetic and agro-morphological groups (Figure 3, Table 3). Pairwise comparison showed that the GS varied significantly between certain genetic and agro-morphological groups (Tables S4 and S5). The lack of information on growing conditions does not currently allow these GS variations to be explained.
The Niger sesame accessions had a higher average GS (1C = 0.43 pg; 420 Mbp) than the Chinese sesame cultivar 'Zhongzhi No. 13', for which the 1C value, estimated by flow cytometry, was 0.34 pg (337 Mbp) [15]. This difference in C-values could be due to the use of different cytometers and internal standards. STh, a Thai sesame accession and the only Asian representative available, showed a significantly smaller GS (2C = 0.73 ± 0.01 pg; 1C = 356 Mbp) than our panel of sesame accessions. It would be interesting to extend this study to a larger sample to check whether the GS of the Asian sesame is generally smaller than that of the African sesame.
The Plant DNA C-values Database [25] provides the GS estimates for ten species of the genus Sesamum, including S. indicum with 2C = 1.91 pg (1C = 934 Mbp). This 2C value is approximately 2- to 2.45-fold higher than those of the Niger accessions. This value could be due to an error in the GS estimation or could be the consequence of polyploidy or of the amplification of repetitive DNA sequences, two mechanisms contributing to the increase in the GS [30,52]. However, in our Niger sesame panel, no cases of polyploidy were observed. Repetitive DNA is estimated to represent 52.81% of the updated sesame genome assembly [9], with the transposable elements (TEs) being the most abundant repetitive elements (27.42%). With the divergence rates estimated at less than 20% for the TEs and other repetitive elements, it has been suggested that significant recent activity has led to their progressive accumulation in the sesame genome [9]. With the sesame genome sequence available [9], it would be relevant to isolate different types of repeat sequences (such as the TEs and satellite DNA) and use them as probes for fluorescence in situ hybridization (FISH) on chromosomes. By comparing the FISH profiles of different sesame accessions, it would be possible to assess their contribution to the GS variation.
The variation in the GS among species is widely documented, particularly in plants [30], and its impacts play a key role in plant biodiversity, ecology and evolution [30,39,52-56]. There are also several reports on the relationship between the intraspecific variation in the GS and phenotypic, phenological or ecological factors [28,31,34,57-60]. Indeed, the inter- and intra-specific variation in the GS may have ecological and evolutionary significance, as it has been correlated with various phenotypic traits in plants such as cell size [57], seed mass [60], flower size [61], leaf size and metabolic rates [62], growth rate [63] and flowering time [59,64,65].
An investigation of the correlation between the variation in the GS of the Niger sesame and different traits revealed only an inverse relationship between the GS and flowering time, as shown by the moderate negative correlation between the two variables (Figure 4, Tables 4 and 5), mainly attributed to the variation in the GS among the accessions. The moderate correlation revealed in the present study reflects the weak effect of the GS on the flowering time in the Niger sesame, as has been suggested for maize [66]. Plants with small genomes have shorter cell cycles and rapid growing periods [57,67,68] and seem to be favored in environments with harsh conditions such as low and high temperatures or nutrient deficiencies [33,69,70].
Despite the biochemical and morphological diversity reported among the Niger sesame accessions [22-24], no correlation was found between the GS and their representative variables, nor with the geographical variables (Table 4). However, in other cases, such as Geranium macrorrhizum, the chemical composition of the essential oils was correlated with polyploidy and consequently with the GS [58]. In Silene latifolia, the flower size was closely linked to the 2C DNA values [61]. In future studies, we will look for any correlation between the GS variation and the morphological and phytochemical characteristics of the African sesame.
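The fold difference quoted above can be checked with a short calculation. The sketch below assumes that the "2- to 2.45-fold" range was obtained by dividing the database value by the largest and smallest accession means reported in the Results (0.95 pg for S132 and 0.78 pg for S96); that reading is an inference from the numbers, not something the text states, and the names used are hypothetical.

```python
DATABASE_2C_PG = 1.91   # S. indicum entry in the Plant DNA C-values Database [25]
SMALLEST_MEAN_PG = 0.78  # accession S96 (mean 2C)
LARGEST_MEAN_PG = 0.95   # accession S132 (mean 2C)


def fold_difference(reference_pg: float, value_pg: float) -> float:
    """How many times larger the reference is than the value."""
    return reference_pg / value_pg


if __name__ == "__main__":
    print(round(fold_difference(DATABASE_2C_PG, LARGEST_MEAN_PG), 2))   # 2.01
    print(round(fold_difference(DATABASE_2C_PG, SMALLEST_MEAN_PG), 2))  # 2.45
```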

Conclusions
The present work focused on the variation in the nuclear DNA content of S. indicum. In our panel of 75 accessions from the Niger germplasm and one Asian accession, several important results can be highlighted. This is the first time that the GS of S. indicum has been accurately determined from a large sample. Intraspecific variability in the GS was observed among germplasm accessions, and a significant difference in the nuclear DNA content between the Nigerien and Asian representatives was detected. A moderate but significant negative correlation was also observed between the GS and flowering time.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/genes15060711/s1. Figure S1: Flow cytometry density plots (top panel) and histograms (bottom panel) for the smallest GS (S96) among the Niger accessions; Figure S2: Flow cytometry density plots (top panel) and histograms (bottom panel) for the largest GS (S132) among the Niger accessions; Figure S3: Flow cytometry density plots (top panel) and histograms (bottom panel) for the smallest GS of the Thai STh accession; Table S1: Descriptive statistical variables of sesame accessions' genome size; Table S2: Coefficients of variation (cf) of sesame genome size among accessions (All accessions) and within genetic (Gr1, Gr2 and Gr3) and agro-morphological (AgroM1, AgroM2 and AgroM3) groups; Table S3: Pairwise comparisons of sample medians using Dunn's test among Niger sesame accessions; Table S4: Pairwise comparisons of sample medians using Dunn's test among genetic and agro-morphological groups; Table S5: Pairwise comparisons of sample medians using Dunn's test among genetic and agro-morphological groups and STh, the Thai sesame, as a separate group.

Figure 1. Spatial distribution of the Nigerien sesame accessions studied. The map shows the different regions of Niger, and green circles indicate collection sites. Niger sesame accessions were numbered with S for sesame followed by a number.

Figure 2. Flow cytometry histograms showing the fluorescence intensity peaks of samples and the internal standard: (A) the smallest GS (S96), (B) the largest GS (S132) among the Niger accessions and (C) the smallest GS of the Thai accession (STh).

Figure 3. Box plots of the mean 2C values of genetic (A) and agro-morphological groups (B) of the Niger sesame. The Thai STh accession is included as an independent group. Horizontal lines denote median values. Boxes below and above the median line indicate the first and the third quartile. Extended vertical lines indicate extreme values. Dots indicate outliers.

Figure 4. Correlation between the genome size and flowering time between and within the genetic (A) and agro-morphological (B) groups. The black line shows the linear regression from all GS data. The blue, red and green lines represent the linear regression within Gr1 or AgroM1, Gr2 or AgroM2 and Gr3 or AgroM3, respectively.

Author Contributions: Conceptualization, N.T. and S.S.-Y.; methodology and validation, N.T., S.S.-Y. and A.A.; formal analysis and investigation, N.T., S.S.-Y., H.Z. and A.K.N.J.; resources, H.Z.; statistical analysis, A.K.N.J.; writing (original draft preparation, review and editing), N.T. and S.S.-Y. All the authors have read and agreed to the published version of the manuscript.

Table 1. Origin of the studied accessions with GPS coordinates.
a STh (S. indicum from Thailand) was purchased on the market.

Table 2. Genome size estimates and sample data. Mean 2C value, standard deviation (SD), minimum (Min) and maximum (Max) 2C values in pg and mean 1Cx value in Mbp obtained for each accession. Group membership according to agro-morphological analyses (AgroM) [22] and genetic diversity analyses (Genetic) [21] is indicated for each Niger accession. n: number of individuals assessed.

Table 4. Spearman correlation analyses between 2C DNA values from the current study and data for different variables from previous work [22-24]. r: coefficient of correlation.