Genome Studies in Four Species of Calendula L. (Asteraceae) Using Satellite DNAs as Chromosome Markers

The taxonomically challenging genus Calendula L. (Asteraceae) includes many medicinal species characterized by high morphological and karyological variability. For the first time, a repeatome analysis of a valuable medicinal plant, Calendula officinalis L., was carried out using high-throughput genome DNA sequencing and the RepeatExplorer/TAREAN pipelines. The FISH-based visualization of the 45S rDNA, 5S rDNA, and satellite DNAs of C. officinalis was performed on the chromosomes of C. officinalis, C. stellata Cav., C. tripterocarpa Rupr., and C. arvensis L. Three satellite DNAs were demonstrated to be new molecular chromosome markers for studying karyotype structure. Karyograms of the studied species were constructed, their ploidy status was specified, and their relationships were clarified. Our results showed that the C. officinalis karyotype differed from the karyotypes of the other three species, indicating its separate position in the Calendula phylogeny. However, the presence of common repeats revealed in the genomes of all the studied species could be related to their common origin. Our findings demonstrated that C. stellata contributed its genome to allotetraploid C. tripterocarpa, and that C. arvensis is an allohexaploid hybrid between C. stellata and C. tripterocarpa. At the same time, further karyotype studies of various Calendula species are required to clarify the pathways of chromosomal reorganization that occurred during speciation.


Introduction
The genus Calendula L. (Asteraceae) includes 25 to 27 species of herbaceous plants and shrubs distributed mainly in the Mediterranean, Iran, Central Europe, Africa, and Asia [1]. Currently, various species of Calendula, including Calendula officinalis L. (calendula or pot marigold), are widely used in the pharmaceutical, food, and cosmetic industries, and also as ornamental plants [2,3].
In the present study, a bioinformatic analysis of the C. officinalis genome DNA sequencing data was carried out to investigate its repeatome composition. The FISH-based mapping of 45S rDNA, 5S rDNA, and the identified satellite DNA families (satDNAs) of C. officinalis was carried out on the chromosomes of C. officinalis and the related species C. stellata, C. tripterocarpa, and C. arvensis to determine their ploidy status and clarify the genome relationships of these species.

BLAST Analysis of the Identified SatDNAs
According to BLAST, the five satDNAs (Cal 143, Cal 101, Cal 109, Cal 163, and Cal 187) identified in the genome of C. officinalis demonstrated sequence identity with satDNAs revealed in the genera Glycine, Syngnathus, Patella, Pulicaria, Aphis, Cantharis, Harmonia, and Solea (Table 2). A high sequence identity (84.9-94%) was revealed among the repeats Cal 2, Cal 5, and Cal 180, so in further FISH assays, we used only Cal 2, which showed the highest degree of sequence identity with both the Cal 5 and Cal 80 repeats.
In C. officinalis, the FISH-based mapping of the 45S rDNA and 5S rDNA revealed two satellite chromosome pairs bearing major clusters of 45S rDNA, and also one chromosome pair with small polymorphic clusters of 45S rDNA localized in the short arms. 5S rDNA clusters were observed in the short arms of one chromosome pair (Figures 2 and 3).
In the karyotypes of C. stellata, C. tripterocarpa, and C. arvensis, we examined the chromosome distribution of the satDNAs (Cal 2, Cal 39, Cal 43, and Cal 163) which demonstrated clustered localization on the chromosomes of C. officinalis. Among these four repeats, however, Cal 39 showed dispersed localization on the chromosomes of these Calendula species (Figure 4), and because of this, Cal 39 was not used for further analysis.
In the karyotype of C. stellata, large 45S rDNA clusters were detected in the short arms of two chromosome pairs. 5S rDNA signals were observed in the short arms of one chromosome pair (Figure 5). In the karyotype of C. tripterocarpa, 45S rDNA clusters of different sizes were revealed in the short arms of three chromosome pairs. 5S rDNA hybridization signals were observed in the short arms of four chromosome pairs (Figure 6). In the karyotype of C. arvensis, 45S rDNA clusters were found in the short arms of five chromosome pairs. 5S rDNA hybridization signals were observed in the short arms of another five chromosome pairs (Figure 6).
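The counts reported above can be tabulated for a quick comparison between species. The sketch below is only an illustrative bookkeeping aid: the dictionary layout and the function name `summed_pairs` are assumptions introduced here, and the additivity check is a simple arithmetic observation on the reported numbers rather than an analysis performed in the study.

```python
# Number of chromosome pairs bearing each rDNA marker, as reported in the text.
RDNA_PAIRS = {
    "C. officinalis": {"45S": 3, "5S": 1},  # two major pairs plus one minor 45S pair
    "C. stellata": {"45S": 2, "5S": 1},
    "C. tripterocarpa": {"45S": 3, "5S": 4},
    "C. arvensis": {"45S": 5, "5S": 5},
}


def summed_pairs(species_a, species_b):
    """Return the marker-wise sum of rDNA-bearing pairs for two species."""
    a, b = RDNA_PAIRS[species_a], RDNA_PAIRS[species_b]
    return {marker: a[marker] + b[marker] for marker in a}


# Compare the summed counts of the two putative parents with C. arvensis.
expected = summed_pairs("C. stellata", "C. tripterocarpa")
print(expected)
print(expected == RDNA_PAIRS["C. arvensis"])
```

Matching pair counts alone do not establish homology between the chromosomes involved; the chromosome morphology and satDNA patterns are still needed for that.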

In C. arvensis, clusters of Cal 2, Cal 43, and Cal 163 were also detected in the intercalary, distal, and/or terminal regions of both arms in most of the chromosomes (Figure 6).
The analysis of the chromosome morphology and distribution patterns of the studied markers indicated that C. officinalis had an allotetraploid genome (2n = 4x = 32). The C. tripterocarpa genome also contained two subgenomes (2n = 4x = 30), one of which had significant similarity with the diploid genome of C. stellata (2n = 2x = 14). The second subgenome of C. tripterocarpa is derived from an unknown 16-chromosome ancestor, the genome of which differs from the genome of C. stellata. According to our results, C. arvensis had an allohexaploid genome (2n = 6x = 44), which resulted from a hybridization event between C. stellata and C. tripterocarpa. Thus, our findings allowed us to construct, for the first time, karyograms of the studied accessions of C. officinalis, C. stellata, C. tripterocarpa, and C. arvensis and to clarify the ploidy status of these species (Figures 3, 5 and 6).
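The chromosome numbers in this scheme can be checked with simple arithmetic. The sketch below assumes that each parent contributes a reduced gamete and that the hybrid chromosome set is then doubled; the function name `allopolyploid_2n` is hypothetical, and the "16-chromosome ancestor" is read here as 2n = 16.

```python
def allopolyploid_2n(parent_a_2n: int, parent_b_2n: int) -> int:
    """Somatic chromosome number after hybridization plus genome doubling.

    Each parent contributes a reduced gamete (n = 2n / 2); doubling the
    hybrid set restores pairs, so the result equals the sum of the two
    parental somatic numbers.
    """
    if parent_a_2n % 2 or parent_b_2n % 2:
        raise ValueError("somatic chromosome numbers are expected to be even")
    hybrid_set = parent_a_2n // 2 + parent_b_2n // 2
    return 2 * hybrid_set


STELLATA_2N = 14          # diploid C. stellata, from the text
UNKNOWN_ANCESTOR_2N = 16  # assumption: the unknown ancestor has 2n = 16

tripterocarpa_2n = allopolyploid_2n(STELLATA_2N, UNKNOWN_ANCESTOR_2N)
arvensis_2n = allopolyploid_2n(STELLATA_2N, tripterocarpa_2n)
print(tripterocarpa_2n, arvensis_2n)  # 30 44
```

Both values agree with the chromosome numbers given for C. tripterocarpa (2n = 30) and C. arvensis (2n = 44).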

Discussion
Plant genomes contain large amounts of highly heterogeneous repetitive DNA, including thousands or even tens of thousands of families that differ in motif length, copy number, and organization within the genome [46-49]. Transposable elements (TEs) are a highly abundant and diverse fraction of plant genomes [50]. TEs can influence genome organization and evolution since they can change their location and/or copy number [51,52]. Based on their structural characteristics and mode of replication, TEs are subdivided into two classes: class 1 (retrotransposons, including LTR retrotransposons) and class 2 (DNA transposons) [51-53]. DNA transposons move within the genome by a 'cut-and-paste' mechanism [53]. LTR retrotransposons are the predominant group of TEs, constituting up to 75% of plant genomes [54-57]. LTR retroelements are the main contributors to genome size variation within angiosperms since they can replicate and generate new copies by a 'copy-and-paste' mechanism; they can also be eliminated from the genome through both solo LTR formation and the accumulation of deletions [55,58-60]. These retroelements include the superfamilies Ty1-Copia and Ty3-Gypsy, which are further subdivided into a number of families, mostly specific to one or a group of closely related species [61]. In the present study, the repeatome analysis of C. officinalis also demonstrated that TEs made up the majority of its repetitive DNA. Retrotransposon elements, including the Ty1-Copia and Ty3-Gypsy superfamilies, were highly abundant. In the Ty1-Copia superfamily, the SIRE and Angela families were the most common, and in Ty3-Gypsy, the Tekay chromovirus was predominant. This is typical for many species of vascular plants, although the number of these retroelements can vary among taxa [44,45,62].
According to our results, ribosomal DNA made up about 3% of the C. officinalis genome. The sequences of the 45S (35S) rDNA and 5S rDNA are known to be rather conserved in different eukaryotes, and hence, they are often used as FISH probes [36]. Previously, two major and two minor clusters of 45S rDNA were revealed in the karyotype of C. officinalis [34,38]. In the studied accession of C. officinalis, we observed two major 45S rDNA clusters but only one minor cluster, which indicates intraspecific variability in this marker. In the karyotypes of C. stellata, C. tripterocarpa, and C. arvensis, the analysis of the chromosome morphology coupled with the chromosome distribution patterns of the 45S and 5S rDNA allowed us to reveal similar chromosomes bearing these clusters.
The genome of the studied accession of C. officinalis contained a substantial portion (about 4%) of satellite DNA. SatDNA sequences are considered to be the fast-evolving fractions of a plant repeatome, demonstrating divergence in both copy number and sequence even between closely related species [63]. SatDNAs can vary in a number of features, including nucleotide composition, distribution, and abundance in plant genomes [64,65]. SatDNAs have a variable-length repeat unit (monomer) and usually form tandem arrays of up to 100 Mb [66,67]. The sequences of the satellite monomers evolve concertedly via a process called 'molecular drive', in which mutations are homogenized in a genome and become fixed in populations. The abundance of satDNA can vary within plant genomes, and even between generations, resulting in high variability in the lengths of satellite arrays [68]. Some satDNA sequences, however, have remained conserved over long evolutionary periods [69]. Although a high rate of genomic change has been identified in different satellite DNAs, they can be either species-specific or common to a certain group of related species [68-71]. In the present study, according to BLAST, sequence identity was revealed between two satDNAs identified in the C. officinalis genome and several DNA fractions of the Pulicaria dysenterica genome (Asteraceae), indicating that these repeats are rather conserved within the Asteraceae family. At the same time, the lack of data on the sequence homology between the identified satDNAs and the repeats of other Calendula-related taxa indicates the need for further studies of their repeatomes. The FISH-mapping of the satDNAs identified in the C. officinalis genome demonstrated the presence of common repeats in the chromosomes of all the studied Calendula species, which indicated the conservation of these sequences within this genus. Considering the complexity of the taxonomy and phylogeny of this genus, the presence of common repeats in their genomes is important for clarifying the species relationships within the genus.
SatDNAs are often associated with heterochromatin and are localized in certain regions of chromosomes, which makes it possible to identify chromosome pairs in a karyotype, detect different chromosome rearrangements, estimate the range of chromosome variability, and clarify species relationships [44,45,62,66,72]. The analysis of the chromosome distribution of the oligonucleotide satDNAs in the karyotypes of C. officinalis, C. stellata, C. tripterocarpa, and C. arvensis allowed us to detect three satDNAs (Cal 2, Cal 43, and Cal 163), which presented species-specific localization on the chromosomes of all the studied species and might be effective markers for the analysis of karyotypes within Calendula.
The taxa of the genus Calendula vary significantly in life cycle, morphology, genome size, and also chromosome number (2n = 14, 18, 30, 32, 44, and ~85-88) [18]. The morphological and karyological studies, coupled with the chromosome number variation, supported several hypotheses on the species origin and genomic diversity within the genus. In particular, a wide range of chromosome numbers within the genus might result from interspecific hybridization and polyploidization that occurred during speciation, which could lead to the appearance of a number of intermediate forms, complicating the taxonomy of this genus [12-15,18,21].
The origin of C. tripterocarpa, which is represented by three genotypes, 2n = 30 [17,18], 2n = 30 + 2B [73,74], and also 2n = 54 [75], is still controversial. Based on the chromosome number 2n = 30, C. tripterocarpa is considered to be either a diploid or a polyploid, and its ploidy level (2n = x? = 30) is still not fully understood [22]. It was previously assumed that C. tripterocarpa could be a hybrid between a hypothetical aneuploid or diploid ancestor of C. stellata with 2n = 16 chromosomes and C. stellata itself having 2n = 14 [13]. Our results mainly confirmed the earlier reported hypotheses of a hybrid origin for C. tripterocarpa [15,18,40]. Moreover, we demonstrated that C. tripterocarpa was an allotetraploid (2n = 4x = 30). At the same time, we noticed that only one of its subgenomes was similar to the C. stellata genome, and the second subgenome did not contain chromosomes from the C. stellata genome. Our results indicated that an unknown 16-chromosome species was involved in the origin of the C. tripterocarpa genome. Therefore, further detailed cytogenomic studies of the related Calendula species having a karyotype with 2n = 16 or 32 chromosomes are required.
The karyotype of the widespread and highly polymorphic annual species C. arvensis (2n = 44) was previously assumed to be a result of hybridization between C. stellata and C. tripterocarpa, followed by genome duplication [13,14,18]. In our study, we confirmed this chromosome number for the studied C. arvensis accession. We demonstrated that C. stellata contributed its genome to both C. tripterocarpa and C. arvensis. Therefore, C. arvensis is not a tetraploid, as previously suggested [15], but an allohexaploid (2n = 6x = 44), resulting from crossbreeding between diploid C. stellata and tetraploid C. tripterocarpa followed by chromosome duplication.
Thus, further cytogenomic studies of various Calendula species are required to understand the functional and structural features of their genomes, as well as to clarify the pathways of chromosomal reorganization in their karyotypes during speciation.

Sequence Analysis and Identification of DNA Repeats
The obtained genome sequences of C. officinalis were used for the genome-wide analyses, and also for the identification and characterization of major repeat families with the use of the RepeatExplorer 2 and TAREAN pipelines [76,77]. The genomic reads were filtered by quality. One million high-quality reads were randomly selected for further analyses, which corresponds to approximately 0.2× coverage of the C. officinalis genome (1C = 726 Mbp) [14]. This is within the limits of the genome coverage (0.01-0.50×) recommended by the developers of these programs [77]. RepeatExplorer/TAREAN was launched with the preset settings on the Galaxy platform (https://repeatexplorer-elixir.cerit-sc.cz/galaxy (accessed on 27 September 2023)). Initially, the preprocessing of the genomic reads was performed. Then, the reads were filtered in terms of quality using a cut-off of 10, trimmed, and filtered by size to obtain high-quality reads. The default threshold was explicitly set to 90% sequence similarity spanning at least 55% of the read length (for reads differing in length, it applies to the longer one).
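The coverage figure and the similarity criterion described above can be expressed as a short calculation. This is a minimal sketch under stated assumptions: the read length of 150 bp is not given in this section and is assumed here only because it reproduces the reported 0.2× value, and the function names are hypothetical illustrations, not part of the RepeatExplorer/TAREAN code.

```python
GENOME_SIZE_BP = 726_000_000  # 1C = 726 Mbp, from the text
N_READS = 1_000_000           # randomly selected reads, from the text
READ_LENGTH_BP = 150          # assumption: read length is not stated in this section


def coverage(n_reads, read_length_bp, genome_size_bp):
    """Genome coverage = total sequenced bases / genome size."""
    return n_reads * read_length_bp / genome_size_bp


def within_recommended(cov, low=0.01, high=0.50):
    """Check the coverage against the recommended 0.01-0.50x range."""
    return low <= cov <= high


def passes_similarity_threshold(identity, overlap_bp, len_a_bp, len_b_bp,
                                min_identity=0.90, min_fraction=0.55):
    """Apply the stated criterion: >= 90% similarity over >= 55% of the longer read."""
    longer = max(len_a_bp, len_b_bp)
    return identity >= min_identity and overlap_bp >= min_fraction * longer


cov = coverage(N_READS, READ_LENGTH_BP, GENOME_SIZE_BP)
print(f"{cov:.2f}x", within_recommended(cov))
```

With a different read length the coverage scales proportionally, so the 150 bp value should be replaced by the actual read length of the sequencing run.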
The sequence homology of the satDNAs identified in the genome of C. officinalis with repeats that had been revealed earlier in other taxa was estimated using BLAST (NCBI, Bethesda, MD, USA). Based on eight abundant satDNAs of C. officinalis, oligonucleotide FISH probes Cal 2, Cal 39, Cal 43, Cal 101, Cal 103, Cal 109, Cal 163, and Cal 187 (Table 3) were generated using the Primer3-Plus software [78].

Chromosome Slide Preparation
Root tips (0.5-1 cm long) were stored in ice water with 1 µg/mL of 9-aminoacridine (9-AMA) for 16-20 h. Then, the root tips were fixed in ethanol:glacial acetic acid (3:1) fixative for 3 days at 6-8 °C. The fixed roots were put into 1% acetocarmine solution (in 45% acetic acid) for 15-20 min. Then, a root tip was placed on the slide, the root meristem was cut from the tip cap, macerated in 45% acetic acid, and covered with a cover slip, and a squashed chromosome preparation was made. After freezing in liquid nitrogen, the cover slip was removed. The slide was dehydrated in 96% ethanol and stored at −20 °C until use.
Before the first FISH procedure, chromosome slides were pretreated with 1 mg/mL RNase A (Roche Diagnostics, Mannheim, Germany) in 2×SSC at 37 °C for 1 h. Then, the slides were washed three times for 10 min in 2×SSC, dehydrated through a graded ethanol series (70%, 85%, and 96%) for 3 min each, and air-dried. Several sequential FISH procedures were performed with various combinations of these labelled DNA probes as described previously [62]. A total of 15 µL of hybridization mixture containing 40 ng of each labelled probe was added to each slide. Then, the slide was covered with a coverslip, sealed with rubber cement, denatured at 74 °C for 5 min, chilled on ice, and placed in a moisture chamber at 37 °C overnight. Then, the slide was washed in 0.1×SSC (10 min, 44 °C) and twice in 2×SSC for 10 min at 44 °C, followed by a 5 min wash in 2×SSC and three 3 min washes in PBS at room temperature. Then, the slide was dehydrated through a graded ethanol series for 2 min each and stained with DAPI (4′,6-diamidino-2-phenylindole) dissolved (0.1 µg/mL) in Vectashield mounting medium (Vector Laboratories, Burlingame, CA, USA). After documenting the FISH results, the chromosome slide was washed in distilled water for 10 min, and the sequential FISH procedure was conducted on the same slide.

Chromosome Analysis
The chromosome slides were analyzed using an epifluorescence microscope (Olympus BX61) with the standard narrow band pass filter set and a UPlanSApo 100×/1.40 oil UIS2 objective (Olympus, Tokyo, Japan). Chromosome images were captured with a monochrome CCD (charge-coupled device) camera (Snap, Roper Scientific, Tucson, AZ, USA) in grayscale channels. Then, the images were pseudo-colored and processed with Adobe Photoshop 10.0 (Adobe Systems, Birmingham, AL, USA) and VideoTesT-FISH 2.1 (IstaVideoTesT, St. Petersburg, Russia) software. For each species sample, at least five plants and 15 metaphase plates were studied. Chromosome pairs in karyotypes were identified according to the chromosome size and morphology, localization of chromosome markers, and also the cytological nomenclature proposed previously [34].

Conclusions
New effective chromosomal markers (Cal 2, Cal 43, and Cal 163) were detected for the analysis of Calendula karyotypes. Our results show that the karyotype of C. officinalis differs from the karyotypes of C. stellata, C. tripterocarpa, and C. arvensis; however, the presence of common repeats in their genomes could be related to their common origin. The ploidy status of C. officinalis was specified as tetraploid. Our findings demonstrate that diploid C. stellata has contributed its genome to the allotetraploid C. tripterocarpa, and that C. arvensis is an allohexaploid hybrid between C. stellata and C. tripterocarpa. Our approach could be useful for further cytogenomic studies of various Calendula species.

Figure 1. Types of highly and moderately repeated DNA sequences in the Calendula officinalis genome. The TE proportion of each repeat type or family is shown in parentheses.


Figure 2. Localization of the studied molecular cytogenetic markers on chromosomes of Calendula officinalis. Merged images after multicolor FISH with 45S rDNA, 5S rDNA, Cal 2, Cal 39, Cal 43, and Cal 163. The names of the probes and their pseudocolors are indicated on the lower right of each metaphase plate. Scale bar: 5 µm.


Figure 3. Karyograms of C. officinalis after multicolor FISH with 45S rDNA, 5S rDNA, Cal 2, Cal 43, Cal 163, and Cal 39. The same metaphase plates are shown as in Figure 2. I-II: subgenomes. The names of the probes and their pseudocolors are indicated on the left.

In C. officinalis, different patterns of the chromosome distribution of the oligonucleotide Cal satDNA probes were revealed. Clusters of Cal 2, Cal 39, Cal 43, and Cal 163 were detected in the pericentromeric regions of most chromosomes. Additionally, clusters of Cal 43 and Cal 163 were detected in the intercalary and/or terminal regions of several chromosomes. Cal 101, Cal 103, Cal 109, and Cal 187 presented mixed clustered and dispersed localization.

Figure 6. Localization of (A) 45S rDNA (green), 5S rDNA (red), Cal 2 (pink), Cal 43 (yellow), Cal 163 (blue), and DAPI-staining (grey) and also (B) 45S rDNA (green), 5S rDNA (red), and DAPI-staining (blue) on chromosomes of C. tripterocarpa and C. arvensis. I-III: subgenomes. Scale bar: 5 µm.

In C. stellata, large clusters of Cal 2 and Cal 43 were localized in the pericentromeric regions of all chromosomes. Additionally, Cal 2 and Cal 43 clusters were observed in the distal region of chromosome pair 7 (short arms), and small polymorphic clusters of Cal 43 were revealed in the terminal regions of chromosome pairs 3, 4, and 5. Large Cal 163 clusters were revealed in the pericentromeric regions of all chromosomes and occupied the whole short arms of chromosome pairs 1, 2, and 6 (Figure 5). In C. tripterocarpa, clusters of Cal 2, Cal 43, and Cal 163 were localized in the pericentromeric regions of most chromosomes. Moreover, Cal 2 and Cal 43 clusters were detected at the end of the satellite on chromosome pair 2, and in the distal and terminal regions of several chromosome pairs. Cal 163 clusters were revealed in the terminal regions of both arms of most of the chromosomes (Figure 6). In C. arvensis, large Cal 2, Cal 43, and Cal 163 clusters were revealed in the pericentromeric regions of most chromosomes.

Table 1. Proportions of major repetitive DNA repeats identified in the genome of Calendula officinalis.


Table 2. Comparison of the satDNAs identified in the genome of Calendula officinalis with the available data.

Table 3. List of the generated oligonucleotide FISH probes.