1. Introduction
Apomixis defines a few plant reproductive strategies which permit the inheritance of the maternal genome without genetic recombination and syngamy. This reproductive strategy is characterized by the differentiation of functional embryos, either from a somatic cell of the ovule nucellus or from an unreduced female gametophyte. The development of unreduced functional gametes from unorthodox cell precursors (i.e., different from the meiotically reduced functional megaspore) is a defining characteristic of a subset of reproductive strategies known as gametophytic apomixis (for reviews on apomixis, see References [
1,
2]). Among these strategies, the aposporous type of apomixis is characterized by the differentiation of functional embryo sacs (ESs) from somatic cells of the ovule. Histological observations in several aposporous species have indicated that competence for ES founder cell specification extends to the cellular layers surrounding the meiocytes (i.e., the nucellus and/or the chalaza), rather than being restricted to the meiocyte cell type [
3,
4,
5,
6,
7,
8]. Results have shown that epigenetic regulation of gene expression by either DNA methylation or posttranscriptional gene silencing (PTGS) is required for proper ovule development in the model species
Arabidopsis thaliana [
9]. Accordingly, the activity of the RNA-dependent DNA methylation (RdDM) pathway during reproduction is essential for gametophyte development in this species and loss-of-function mutants for genes involved in this pathway result in aberrant cell fate establishment within the ovule with phenotypes that are strikingly reminiscent of apomictic development [
9,
10].
DNA methylation of RdDM target loci is mediated by the co-occurrence and interaction of small interfering RNAs (siRNAs), long noncoding RNAs (lncRNAs) and core RNA interference (RNAi) proteins, including INVOLVED IN DE NOVO 2 (IDN2) and its highly similar interactors FACTOR OF DNA METHYLATION 1 (FDM1) and FACTOR OF DNA METHYLATION 2 (FDM2) [
11,
12,
13]. IDN2 belongs to a small plant-specific gene family whose members are characterized by the presence of three major protein domains (Zf-XS, XS and XH) and a coiled coil, which were named after Arabidopsis SGS3 and its rice homolog X1 [
14,
15]. The XS domain is an RRM-like RNA-binding domain [
11,
12] that is required for the binding of dsRNA with 5′-overhangs. The Zf-XS domain is a small finger-like motif conjugated with one or more zinc ions and is usually paired with an XS domain recognizing a specific sequence of nucleic acids [
14,
16]. The XH domain, which is also found in SGS3, is required for the interaction of IDN2 with FDM1 and FDM2 [
17]. The coiled-coil domain, which is located between XS and the XH, is essential for IDN2 homodimerization [
17]. The interaction of IDN2 with FDM1 and FDM2 is thought to be required for RdDM [
11,
12] and to be downstream of siRNA production [
13]. IDN2 and FDM1 have been shown to bind the 5′-overhangs of dsRNAs
in vitro and FDM1 has been shown to bind unmethylated DNA
in vitro [
11,
18,
19].
According to the current model of the molecular function of IDN2-FDM1 at RdDM-targeted loci [
20], IDN2 potentially binds to lncRNA by recognizing the 5′-overhang of a double-stranded RNA consisting of siRNA and lncRNA. Following the binding of dsRNAs, IDN2 is required for the association of the methyltransferase DOMAINS REARRANGED METHYLTRANSFERASE2 (DRM2) with the lncRNA at RdDM-targeted loci [
20]. At the chromatin level, IDN2 interacts with the SWI/SNF chromatin-remodelling complex, which alters the nucleosome position and facilitates RdDM [
21]. Although FDM3, FDM4 and FDM5 are also involved in RdDM, their functional mechanisms remain to be identified [
22].
Gene expression studies performed on the aposporous apomictic medicinal plant
H. perforatum have recently shown that several genes involved in the RdDM pathway are differentially expressed in pistils collected from aposporous plants [
23,
24]. Furthermore, two independent studies identified an orthologue of
A. thaliana IDN2 as differentially expressed in
H. perforatum pistils and microdissected ovules, suggesting involvement in developmental processes that define the aposporous apomictic reproductive strategy. However, little is known regarding the composition of the IDN2 and FDM-like gene family in
H. perforatum. Furthermore, no information is currently available with respect to the gene expression, transcriptional regulation and protein-protein interaction potential of family members in this species. To gain further insight into the expression and regulation of
IDN2 and
FDM-like genes in aposporous apomixis, we performed detailed bioinformatic and transcriptional investigations in sexually reproducing
A. thaliana and
H. perforatum. Bioinformatic investigations were performed to address the gene family composition, the conservation of characteristic protein domains and gene expression in premeiotic ovules. Furthermore, detailed spatial and temporal expression studies were performed to investigate the expression of genes in this family throughout pistil and gamete development. Finally, the transcriptional regulation of
IDN2/FDM1-5 genes was addressed by computational annotation of promoter sequences and gene expression studies coupled with hormonal treatments.
2. Results
The
A. thaliana genome encodes a single
IDN2 gene and five potential or known interactors, currently annotated as
FDM1-5. Their expression was detectable throughout the plant in
A. thaliana, including the shoot apical meristems, flowers and siliques (
Table S1, Figure S1). The clustering of the
ATIDN2 and
ATFDM1-5 genes based on their expression levels revealed similar patterns for
ATIDN2 and
ATFDM1/ATFDM2, showing relatively high expression in the shoot apex, particularly during the transition from the vegetative to the inflorescence stage and in flowers at early developmental stages (Flower stage 9). Within flowers, their expression was higher at the earliest developmental timepoints. Among the different flower parts, high expression was found in the pistils (referred to as carpels, flower stages 12 and 15) and in the ovary (the component of the pistil that bears the ovules (
Table S2, Figure S1).
ATFDM4 was the gene with the highest expression in shoot apical inflorescences, flowers at early developmental stages, carpels and ovaries. Despite the lower expression observed in anthers (referred to as stamens, flower stages 12 and 15),
ATIDN2 and
ATFDM1-5 transcripts were detectable during male gametogenesis, exhibiting an overall decreasing trend. Seeds and siliques were characterized by two alternative expression patterns. Hence, while the expression of IDN2 and FDM1-3 peaked at seed developmental stages 4 and 7,
ATFDM4 and
ATFDM5 displayed a decreasing trend of expression throughout seed development.
To gain additional insights into
ATIDN2 and
ATFDM1-5 gene functions in reproductive processes, single-knockout mutants (
Table S3) were obtained from Nottingham Arabidopsis Stock Centre (here after NASC) and investigated to determine the seed set (
Figure 1), the average seed size (
Table S4) and the embryo DNA content (
Figure 1).
fdm5 seeds (SALK_075378) did not germinate in Kan- and Kan+ selective media. Additionally, several
idn2 and
fdm1-4 homozygous knockout plants (min: 3, max: 5) were selected, genotyped and analysed in detail (
Figure S2). The mutant lines displayed a 15% reduction of the seed set (
Figure 1A). While the number of aborted seeds in
fdm3,
fdm4 and
idn2 plants was significantly higher than in the wild type (here after WT), the
fdm1 and
fdm2 lines were characterized by remarkably low abortion frequencies (
Figure 1A,
Figure S2). The mutation of single ATIDN2 and ATFDM1-4 genes is therefore considered to affect the plant seed set by affecting different developmental mechanisms. Notably, seed set variability was significantly higher in
fdm2,
fdm4 and
idn2 than in WT accessions (
Figure 1A). The seeds produced by the
idn2 and
fdm1-5 knockout lines were morphologically different from those of WT control plants (
Table S2, Figure S2). More specifically, while the average seed area was significantly greater in the
idn2 and
fdm2 lines (respectively 0.137 ± 0.024 mm
2 and 0.134 ± 0.019 mm
2) than in the WT (0.126 ± 0.011 mm
2), the remaining mutant lines produced smaller seeds than the WT (for more details, please refer to
Table S2). The observed differences in seed size raised the question of whether embryo and/or endosperm DNA contents were affected by the knockout of these genes. The events assigned to endosperm cell populations in single seeds were numerically scarce and no significant differences were recorded among wild-type and mutant lines. However,
fdm1 (30%),
fdm5 (20%) and, to a lesser extent,
fdm2 (10%) and
idn2 (10%) displayed a significant reduction in the estimated embryo DNA content detected by flow cytometry screenings as count of bound from DAPI and embryo DNA (
Figure 1B) [
25].
2.1. Gene Annotation and Phylogenetic Analysis
To elucidate the composition and expression of
IDN2-like and
FDM-like genes in reproductive structures, we took advantage of recent sequencing efforts in
H. perforatum sexual and aposporous apomictic accessions.
H. perforatum aposporous apomicts are characterized by modifications in the ovule cell fate determination program, by which the competence differentiation of functional embryo sacs (ES) from somatic cells of the ovule instead of being restricted to a single reduced megaspore (i.e., the functional megaspore) deriving from meiosis [
23,
24,
26,
27]. Annotation of the sexual diploid genome of
H. perforatum identified 9 genes encoding XH/XS-domain-containing proteins, sharing high sequence similarity with the
A. thaliana IDN2 and
FDM1-5 genes (
Table 1).
The annotated gene regions were 3385 bp in length on average, ranging from 2254 to 7304 bp. Two genes belonging to this family were predicted in a 30 kb sequence from ctg61981. Similarly, the contig sequence ctg61606 included four XH/XS domain-containing genes in a sequence window of approximately 20 kb (
Table 1). Phylogenetic investigations performed on the available sequence data for 18 different species provided consistent clustering of the moss
Physcomitrella patens (Pp), monocotyledon and dicotyledon protein sequences (
Figure 2,
Table S5). Within the dicotyledons, three main clusters (clusters 1 to 3) were detected. Cluster 1 (UFB support: 73) was defined by the presence of the
A. thaliana (At) FDM1, FDM2 and FDM5 protein sequences, together with two
Populus trichocarpa [
28] proteins annotated as FDM1 [
28]. Three
H. perforatum sequences were grouped in this cluster, with UFB internode support of 97. This clade was sister to the clade that included sequences of
A. thaliana and
Brassica napus (Bn) and this relationship received robust statistical support (UFB= 85). According to this topology, the
H. perforatum proteins included in this cluster were designated HPFDM1-like A, HPFDM1-like B and HPFDM1-like C. Cluster 2 included sequences annotated as IDN2 from
A. thaliana and all other species considered in this investigation (
Table S5). Two
H. perforatum proteins were included in this cluster with very high bootstrap support. Cluster 3 grouped sequences annotated as FDM4 and IDN2 from the following species:
V. vinifera,
B. napus and
S. lycopersicum. For
C. sinensis, a unique cluster of protein sequences currently annotated as FDM4 and FDM5 was generated, situated between clusters one and two. For the
H. perforatum sequences included in Cluster 3, the UFB support was high (UFB: 82). Accordingly, the
H. perforatum sequences included in this group were designated HPFDM4-like A-D. It is worth noting that the
H. perforatum FDM4-like sequences were divided into two subclusters with maximum support (i.e., UFB: 100). Estimates of the evolutionary divergence between the
H. perforatum sequences (
Table S3) were consistent with the clustering of
H. perforatum sequences into three main clusters represented by the putative Arabidopsis orthologues
FDM1,
FDM4 and IDN2.
2.2. Gene Expression Analysis
RNAseq gene expression analysis was focused on Laser Capture Microdissected (LCM) premeiotic ovules bulked from sexual and aposporous plant accessions. The sequencing data were analysed by using two complementary approaches. In the initial approach, the transcriptome assembled
de novo from both sexual and apomictic sequence data was adopted as a reference (
Table 1,
Figure S3). The annotation of the transcriptome identified 18
IDN2-like and
FDM1,4-like transcript variants. The alignment of the transcript variants to the annotated gene loci (
Table 1) provided, on average, 2 variants per locus (min: 0, max: 4).
Single transcript variants associated with the three genes
IDN2-like A,
FDM1-like C and
FDM4-like C were differentially expressed in ovules (FDR
p-value ≤ 7.0 E-04,
Table 1). Notably, the clustering of samples on the basis of ovule expression data was consistent with the sample phenotypes (
Figure S3 indicating that intraphenotype expression variation at the level of gene family was lower than that observed between antagonistic phenotypes.) In the second approach, mapping and GE analyses were performed on gene features predicted from the genome sequence of a diploid sexual accession, thereby focusing the investigation on gene features rather than transcript variants (
Figure 3). The adoption of these last references did not alter the separation of samples according to their phenotype (
Figure 3A,B). At this analytical level, GE analysis of sequencing data provided additional evidence of the downregulation of IDN2-like B in the reproductive structures of aposporous samples (
Table S4).
Although the expression differences recorded for FDM1-like C and FDM4-like A-C were not significant, higher mean expression values were recorded in aposporous ovules (
Table S4). Notably, the alignment of transcript sequences assembled de novo from sexual and apomictic ovules to sexual (i.e., predicted from the sexual diploid genome sequence) gene sequences resulted in high variability of the fraction of aligned transcript sequences and identity within the alignments. Furthermore, considering all transcript variants matching the FDM1-like C and FDM4-like C gene sequences, we noted that differentially expressed transcript variants l2_t405120_835163 (FDR:1.0E-6) and l2_t484402_835163 (FDR 9.0E-8) displayed the shortest length of alignment with the corresponding gene sequences (41.7% and 61.0%, respectively).
2.3. Protein-Protein Interactions
Since interactions between
A. thaliana IDN2 and FDM1 are expected to be required for RdDM [
11,
12] and to be downstream of small interfering RNA (siRNA) production [
13], we tested the interaction potential of
H. perforatum proteins by using sequence-based features. The domain architecture of IDN2-like and FDM1,4-like proteins was highly conserved in the investigated protein accessions (
Figure 2,
Table S5) and most
H. perforatum proteins displayed the four characteristic domains: Zf-XS, XS, coiled-coil and HX domains (
Figure 2). However, the two N-terminal domains involved in DNA/RNA binding (i.e., Zf-XS and XS) were not detected in the HPFDM4-like A and HPFDM4-like C sequences. Additionally, the Zf-XS domain but not the XS domain, was missing in HPIDN2-like B (
Figure 2). Highly supported protein-protein interactions among the
H. perforatum proteins were selected by using the experimentally determined value for the
A. thaliana IDN2/FDM1 heterodimer (PPI score ≥ 0.419) as the cut-off value (
Table 2). The Pearson correlation coefficients (PCCs) of the protein-protein interaction scores estimated for all pairwise comparisons between the predicted
A. thaliana, P. trichocarpa and
H. perforatum sequences are shown in
Figure 4A.
Comparative analyses with the sexual model species
A. thaliana and
P. trichocarpa are relatively complex because of the increased number of protein sequences encoded by the
H. perforatum genome. However, several potential interactions were conserved among species. For example, HPIDN2-like A and HPIDN2-like B displayed opposite interaction potentials with members of the HPFDM1-like clade (e.g., a low distance from HPFDM1-like A and a long distance from HPFDM1-like B and C), as observed for Arabidopsis FDM1, FDM2 and FDM5 (
Figure 4A). Furthermore, the pattern of the interaction of Arabidopsis FDM4 with FDM5 was similar to those of HPFDM1-like A and two members of the FDM4-like clade (HPFDM4-like A and HPFDM4-like D). At the same time, the predicted interactions between protein pairs suggested some functional differentiation between members of the three protein subclades (i.e., HPIDN2, HPFDM1 and HPFDM4). More specifically, HPIDN2-like A and HPIDN2-like B displayed different interaction potentials with members of the HPFDM1-like and HPFDM4-like clades (
Figure 4A,
Table 2). The predicted interactors of HPFDM1-like B and HPFDM1-like C largely overlapped and differed from those predicted for HPFDM1-like A and the interactors predicted for HPFDM4-like B and HPFDM4-like C were different from those predicted for HPFDM4-like A and HPFDM4-like D (
Figure 4A,
Table 2). The integration of predicted PPIs with data related to the protein domain composition and ovule gene expression illustrated a complex network of potential interactions between HPIDN2-like and HPFDM1,4-like proteins (
Figure 4B,C). Despite exhibiting relatively low expression levels (represented by the area of the node), FDM4-like B and FDM4-like C were characterized by the highest number of predicted interactions. As shown in
Figure 4, proteins lacking one or both domains required for DNA and/or dsRNA binding (yellow nodes) were connected to different sets of interactors. Interestingly, comparison of the integrated sexual and aposporous networks suggested that apospory (
Figure 4C) is accompanied by variation in the overall proportion of nucleic acid-binding proteins (blue nodes) versus proteins partially or completely lacking this ability (yellow nodes) which favoured the former class.
2.4. Spatial and Temporal Gene Expression Analyses in H. perforatum.
Time-course gene expression analysis was performed on pistils collected at different developmental stages spanning premeiosis (S1), meiosis (S2) and two terminal stages of gametogenesis (early: S3, late: S4) from both sexual and aposporous plant accessions. A principal component analysis [
29] performed with the qPCR data separated early (S1 and S2) and late (S3 and S4) pistil developmental stages collected from sexual samples. The same PCA clustered sexual and aposporous samples according to the second component (
Figure 3C). Notably, the clustering of samples along the first component placed the aposporous pistils in an intermediate position with respect to sexual early and late developmental timepoints. The only exception to this sample grouping was observed for apomictic sample S1 (corresponding to premeiotic pistils), which clustered in proximity to the sexual sample corresponding to gametogenesis (S3 and S4) (
Figure 3C).
At the single-gene level, no significant differences (ANOVA
p-value cut off ≤ 0.05) in
FDM1-like genes were recorded in sexual and aposporous pistils. However, the increase in the expression of
FDM1-like A and
FDM1-like C coinciding with the onset of gametogenesis (stage S3) observed in sexual pistils was not recorded in apomictic pistils collected at comparable developmental stages (
Figure 5).
The results of the comparison of the FDM1-like C mean expression pattern in sexual and aposporous S1-S3 pistils were consistent with heterochrony in the expression pattern of this gene. For the
FDM4-like genes, sexual samples were generally characterized by increased expression throughout pistil development, with higher expression levels detected in the two terminal stages corresponding to female gametogenesis (
Figure 5).
Notably, pistils collected from aposporous plants displayed the opposite trend of expression for the same genes (i.e.,
FDM4-like). As a result, the
FDM4-like genes exhibited significantly higher expression in aposporous pistils at stage S1 (premeiosis) and lower or comparable expression levels in terminal pistil developmental stages. The expression of
IDN2-like genes showed alterative patterns in sexual pistils but not in aposporous ones. Indeed, while
IDN2-like B displayed constitutive expression in both sample sets and a similar (i.e., steady) expression pattern was observed for
IDN2-like A in aposporous pistils, the expression of the latter gene constantly increased throughout pistil development in sexual accessions (
Figure 5).
2.5. Transcriptional Regulation of XH/XS Domain-containing Genes
To gain additional insight into the biological functions and transcriptional regulation of
HPIDN2-like and
FDM1,4-like genes, the 1.0 kb upstream sequence from the translation start was investigated for the presence of cis-regulatory elements. The computational annotation of promoter sequences identified 431 TFs belonging to 66 TF families (
Tables S6 and S7). The most represented TF families that we detected were as follows: GATA tify, Dof, AT-Hook, ZF-HD, bZIP, bHLH and Myb/SANT_ARR-B. CIS-regulated elements associated with the responses to ethylene (ERFs, EILs, ERS), auxin (ARFs, MP), brassinosteroids (BEE1, BES1), abscisic acid (ABIs, ABFs, CBFs) and salicylic acid were also identified in
IDN2-like and
FDM1,4-like promoters. Most CIS-regulatory elements were detected only in subsets of the gene family, suggesting that multiple biotic and abiotic stimuli exhibit the potential to promote the transcription of the different
IDN2-like and
FDM1,4-like genes and that distinct regulatory factors might act in different plant contexts (
Tables S6 and S7). The clustering of gene promoters according to the predicted TF binding sites grouped
IDN2-like B and
FDM1-like A together with FDM4-like C and far from
FDM4-like A/B (
Figure S4). Accordingly, B3/ARF CIS elements were only found in
IND2-like,
FDM1-like A and
FDM1-like C genes (
Tables S6 and S7). Moreover, promoter sequences recognized by B-type cytokinin response regulators were detected in the promoter sequences of
FDM1-like,
FDM4-like C/
D and
IDN2-like B but were absent in the promoters of
FDM4-like A/
B and
IND2-like A. At the same time, CIS-regulatory elements recognized by CYTOKININ RESPONSE FACTORs (CRF1, CRF6) were detected in
IDN2-like promoters but not in the promoter sequences of
FDM1,4-like genes (
Tables S6 and S7). Interestingly, the sequence recognized by the WUSHEL (WUS) homeobox gene controlling the stem cell pool was found in the promoters of
FDM1-like A/B and
IDN2-like genes. Notably,
FDM1-like A and
IDN2-like B but not the other promoter sequences, were predicted to bind WRKY28 (
Tables S6 and S7).
The identification of multiple CIS elements related to cytokinin and auxin responses in
HPIDN2-like and
HPFDM1,4-like promoters, together with the differential expression of several gene members in sexual and aposporous reproductive structures, raised the following two questions: (i) does an increased hormonal concentration affect the transcript abundance of these genes in reproductive structures (e.g., pistils)? and (ii) in the case of a transcriptional effect mediated by the two treatments, is this effect conserved in sexual and aposporous genotypes? To address these questions, we treated different cuttings originating from both sexual and aposporous plant accessions with synthetic analogues of the two hormones and measured the variation of
HPIDN2-like and
HPFDM1,4-like transcript abundance 1 h after the treatments (
Figure 6). Several transcriptional responses were elicited by the two treatments, suggesting that the transcription of these gene members in pistils is, to so some extent, regulated by the two hormones (stage S1–S2). Remarkably, regarding the observed transcriptional changes, while treatment with NAA resulted in increased transcript abundance, the opposite effect (i.e., decreased transcript abundance) was most frequently associated with treatment with BAP (
Figure 6). Indeed, while NAA significantly increased the transcripts of FDM1-like B and FDM4-like C in sexual pistils, no difference was recorded for the same genes in pistils collected from aposporous plants (
Figure 6,
Figure 7②). Moreover, the reduction of the transcript levels of
FDM1-like C,
FDM4-like D,
IDN2-like A was only observed in pistils collected from aposporous plants (
Figure 6,
Figure 7③). However, a significant reduction of
FDM4-like A transcript levels following BAP treatment was documented in both phenotypes (
Figure 6). Notably, our RNAseq analysis showed that
FDM1-like C and
FDM4-like C were differentially expressed in aposporous ovules, suggesting that the modulation of their transcript levels in ovules could be related to local responses to auxin and cytokinin stimuli (
Figure 6 and
Figure 7).
3. Discussion
Ovule cell fate determination requires the expression of genes involved in posttranscriptional gene silencing (PTGS) mediated by sRNA [
9,
30,
31,
32,
33]. Posttranscriptional gene silencing mediated by siRNAs might be associated with the RNA-directed
de novo methylation of non-symmetrical CpG residues in targeted DNA loci and subsequent transcriptional gene silencing (TGS). The RNA-directed DNA methylation pathway in
A. thaliana requires the expression of ATIDN2 and its interaction with at least two similar proteins known as FACTOR OF DNA METHYLATION 1 (ATFDM1) and FACTOR OF DNA METHYLATION 2 (ATFDM2) [
11,
12,
13]. The protein domain composition, protein-protein interactions and co-expression network (
Figure S5) suggest that 3 additional proteins, ATFDM3, ATFDM4 and ATFDM5, might also be involved in RNA-directed DNA methylation.
Although DNA methylation status in a subset of RdDM target loci is not affected in single
fdm3-1,
fdm4-1 and
fdm5-1 knockout lines, double mutant analyses revealed that FDM3, FDM4 and FDM5 play redundant roles to FDM1 in RdDM [
19]. The clustering of FDM-like genes based on their expression patterns across different plant contexts in Arabidopsis grouped
ATFDM4 apart from
ATIDN2,
ATFDM1-2 and, to a lesser extent,
ATFDM. In Arabidopsis,
ATFDM4 is the gene with the highest expression in shoot apical inflorescences, flowers at early developmental stages, carpels and ovaries. Within seeds and siliques, the expression of IDN2 and FDM1-3 peaks at seed developmental stages 4 and 7, while
ATFDM4 and
ATFDM5 display a decreasing trend of expression throughout seed development. Our phenotypic investigations revealed a 15% reduction of the seed set in
idn2 and
fdm1, fdm2, fdm3 and
fdm4 mutants. These phenotypic effects are in line with a sporophytic effect affecting the reproductive process, as observed for
AGO9 and other mutants affecting the RdDM pathway [
31]. However, the higher frequency of aborted seeds observed in
fdm1 and
fdm2 mutants is consistent with partial differentiation of the activity of these genes with respect to FDM3 and FDM4. Accordingly, knockout of
IDN2 and the cluster I genes
FDM1, FDM2 and
FDM5 resulted in variability in embryo DNA content estimates. The observed reduction of the estimated genome size is not compatible with the parthenogenic development of both reduced and unreduced gametes. Additionally, apoptosis-associated chromatin degradation typically generates a much greater decrease in DNA stainability and comparable peaks in different mutant backgrounds. Furthermore, it is worth noting that despite the apparent genome size reduction, the seeds produced by
idn2,
fdm1, fdm2 and
fdm5 plants can germinate and the generated plants set seeds, indicating that meiosis is mostly unaffected in single-gene knockout mutants, as otherwise expected in case of large genomic mutations involving 10% - 30% DNA loss. Estimates of embryo DNA content by flow cytometry rely on the accessibility of the DNA minor groove to DAPI and its relationship to chromatin structure [
34]. Accordingly, the removal of the basic nuclear proteins, including histones, by acids or digestion with trypsin results in a marked increase in the accessibility of DNA to several dyes commonly used in flow cytometry, including DAPI [
34]. It has been shown that IDN2-FDM complexes facilitate RdDM by altering nucleosome positioning through interactions with a subunit of the SWI/SNF chromatin remodelling complex [
21]. Since nucleosome positioning might affect chromatin structure [
35], it is possible that the reduction in DAPI-dependent embryo DNA contents observed in our mutants reflects a decrease in the accessibility of chromatin to DAPI, implying that nucleosome density is affected in
idn2 and Class I
FDM mutants but not
fdm3 and
fdm4.
As the acquisition of embryo sac (ES) cell fate from illegitimate cell precursors of the ovule nucellus is a key feature of aposporous apomixis in several species, including
H. perforatum [
3,
4,
5,
6,
7,
8], comparative gene expression studies focused on
H. perforatum reproductive structures might prove to be a powerful tool for dissecting genes involved in characteristic processes of the two reproductive strategies. Here, we report that two IDN2-like and 7 FDM-like proteins are encoded by the
Hypericum perforatum genome. Our estimates of evolutionary divergence between sequences and phylogenetic analysis identified two
HPIDN2-like genes (Cluster II), three
HPFDM1-like genes (Cluster I) and four
HPFDM4-like genes (Cluster III), which were named according to the Arabidopsis and Poplar gene nomenclature. As sequencing errors or inaccurate definition of intron-exon boundaries might lead to errors in the prediction of protein primary sequences [
36] we focused our computational investigations on model species for which the genome sequence is available and by considering gene sequences marked as complete in the public databases. Despite the adoption of these stringent criteria for the selection of protein sequences, about 16% of the investigated protein sequences lacked the Zf-XS domain, while the XS and XH domain were missing in about 8.5% and 1.9% of investigated proteins and multiple Zf-XS domains were annotated in about 2.8% of considered sequences. Detailed investigations will be required to confirm the accuracy of the annotated sequences. However, we noticed that proteins lacking one or multiple domains were typically shorter than their putative orthologs. This might suggest that the observed variability in protein domains composition was associated to the lack of the corresponding sequence portions in the investigated gene sequences rather than misrecognition of protein domains within sequences. Both the HPFDM4-like A and HPFDM4-like C proteins lack the two N-terminal domains required for nucleic acid binding (i.e., Zf-XS and XS). However, the FDM4-like proteins display distinct and partially non-overlapping sets of predicted interactors, indicating functional differentiation among duplicated gene members. Interestingly, the lack of the central domain XS in our protein dataset was typically associated with the loss of the N-terminal Zf-XS domain and the only exception to this observation in our dataset is represented by the FDM4--like protein ONM30063 (Z. mays), which retains only the two terminal domains Zf-XS and XH. Interestingly, we noticed that within monocots and dicots, the higher number of protein sequences lacking one or multiple domains was observed in species with a greater number of gene members (e.g.,
C. sinensis,
G. max,
H. perforatum,
Zea mays). At the same time, no deviations in the protein domain architecture were detected in plant species characterized by a lower number of predicted genes (e.g.,
J. curcas,
P. trichocarpa,
R. communis,
S. tuberosum,
V. vinifera). This high variability in protein domain composition, which was recorded in the moss
P. patens together with several monocots and dicots, suggests that protein domain architecture in FDM-like proteins might be characterized relatively high plasticity. Compared to the sexually reproducing Arabidopsis and Poplar, the
H. perforatum gene family has experienced expansion of the FDM4 clade. Among the considered species, only the polyploid
Glycine max genome was annotated with four
FDM4-like genes. By contrast, one to two FDM4-like genes were found in
M. esculenta,
P. trichocarpa and
R. communis, which, together with
H. perforatum, belong to the order Malpighiales. Based on the estimated evolutionary divergence between
FDM4-like sequence pairs and gene order in the genomic sequence represented by ctg61006 (positions 220,754–241,969), we hypothesize that
FDM4-like B and
FDM4-like C originated from a duplication of a ~6 kb region from ctg61006, harbouring the two loci
FDM4-like D –
FDM4-like A or
vice versa. Accordingly, the annotation of FDM4-like promoter sequences suggested alternative regulatory pathways, a prediction that is in line with the altered expression patterns observed in pistils treated with BAP and NAA. In this study, we integrated two independent datasets of transcriptomic data generated from a large number of tetraploid genotypes of both sexual and apomictic phenotypes and genomic data originating from one sexual diploid individual. Our computational investigations identified 18 transcript variants that were associated with nine genomic loci. The integration of RNAseq data with genome sequence assembly allowed the identification of multiple transcript variants (N: 1-4) sharing variable extents of sequence similarity with the reference gene sequences and characterized by alternative expression values. Three transcript variants were found to be differentially expressed in the ovule nucellus of aposporous plant accessions. These genes encode IDN2-like B, FDM1-like C and FDM4-like C. Furthermore, while downregulation of the highly expressed
IDN2-like B gene in aposporous ovules was detected in both references (i.e., the sexual diploid genome and the transcriptome assembled
de novo from both sexual and apomictic ovules), upregulation of two
FDM1-like C and
FDM4-like C transcript variants was only detected at the transcriptome level. Notably, sequence variability and expression levels appeared to be interconnected for these loci, as indicated by the differential expression of two highly divergent variants showing a short length of transcript/gene alignment.
Both the RNA-seq and qPCR gene expression data allowed consistent clustering of samples according to their phenotype, indicating that intraphenotype gene expression variation was lower than interphenotype variation. Remarkably, the adoption of highly specific primers to examine the expression of single gene members throughout pistil development highlighted extensive differences in the expression of FDM4-like genes in aposporous pistils. The apomictic accessions adopted for the hormonal treatments and gene expression analysis by qPCR were facultative apomictic (
Table S8) characterized by the co-occurrence of both biological pathways within the same individua. Consequently, it is likely that transcriptional differences observed in pistils collected from sexual and apomictic accessions might have been underestimated due to the superposition of both reproductive pathways in facultative apomicts. Nevertheless, the expression of
FDM1-like C and
FDM4-like genes emphasized a clear dichotomy between the developmental stages preceding gametogenesis in sexual plants (i.e., S1, S2) and the following developmental stages (concomitant with gametogenesis). More specifically, the expression of the
FDM1-like C and
FDM4-like genes in aposporous samples is heterochronic, developmentally precocious and significantly higher at pistil developmental stages, which are concomitant with failure of meiosis and differentiation of the aposporous embryo sac precursors in aposporous genotypes. The heterochrony in the expression of these genes was such that a principal component analysis of qPCR expression data clustered premeiotic aposporous pistils (S1) with sexual pistils undergoing female gametogenesis (S3, S4). These expression patterns are consistent with the hypothesis that the transcriptional modulation of
HPFDM1-like C and
HPFDM4-like genes in
H. perforatum reproductive structures is chronologically concomitant with the earliest biological differences detectable between sexual and aposporous ovules: apomeiosis and the acquisition of ES cell fate from illegitimate cell precursors [
3,
4,
5,
6,
7,
8]. Accordingly, the hybridization signals for
HPFDM1-like C and
HPFDM4-like C in
H. perforatum reproductive structures were higher or were restricted to the ovule nucellus [
24].
According to our current understanding of the molecular functions of ATIDN2, ATFDM1 and ATFDM2 [
18,
20], the activity of these proteins relies on protein-protein interactions involving the XH and coiled coil domains as well as DNA and RNA interactions involving the XS and ZF-XS domains. Our computational investigations of protein-protein interactions in
H. perforatum predicted a complex network of potential interactions (edges: 16) between HPIDN2-like and HPFDM1,4-like proteins. Taken together, our RNAseq and qPCR gene expression patterns along with the predicted PPI and promoter annotations suggest that
IDN2-like A and
IDN2-like B are characterized by distinct transcriptional profiles, differential transcriptional regulation and alternative sets of FDM1-like and FDM4-like interactors. The proteins encoded by
IDN2-like B,
FDM1-like C and
FDM4-like C were involved in most predicted interactions (n: 10/16). Remarkably, the integration of ovule gene expression data with PPIs suggested that the overall proportion of transcripts encoding DNA/RNA-interacting versus DNA/RNA-noninteracting proteins is remarkably different in sexual and aposporous ovules. Based on our current understanding of IDN2-FDM molecular functions, we hypothesize that (i) the activity of DNA/RNA-noninteracting proteins requires an association with DNA/RNA-interacting proteins; and (ii) the activity of DNA/RNA-noninteracting proteins does not require the binding of siRNA/lncRNA dsRNAs at RdDM-targeted loci. In both cases, considering that the association of DOMAINS REARRANGED METHYLTRANSFERASE2 (DRM2) with lncRNAs at RdDM-targeted loci in
A. thaliana depends on both AGO4 and IDN2 [
20], it seems reasonable to hypothesize that the altered proportions of DNA/RNA-interacting versus DNA/RNA-noninteracting proteins might influence the association and/or distribution of DRM2 at RdDM-targeted loci.
Regarding the transcriptional regulation of
HPIDN2-like and
HPFDM1,4-like genes, the annotation of CIS-regulatory elements in gene promoters was consistent with the activity of distinct, yet partially overlapping gene expression regulatory mechanisms. The identification of CIS elements recognized by WUSHEL (WUS) and/or SHOOT MERISTEMLESS (STM) in the promoters of
IDN2-like B and
FDM4-like C is in line with the expression recorded in Arabidopsis (
Table S1, Figure S1) and
Hypericum, as indicated by our RNAseq data and the hybridization signals in the nucellus of
H. perforatum ovules (
Figure S3). It is worth noting that other TRANS-acting factors involved in cell patterning or ovule development (INO, WRKY28, BP1, NACs and members of the KANADI family of putative transcription factors) were predicted to bind
HPIDN2-like B and
HPFDM1-like promoters.
The modulation of transcript abundance mediated by the BAP treatments was generally consistent with the annotated CIS-regulatory elements (e.g., type-B ARRs and CRFs, ARF). However, promoter annotations and expression studies of
HPFDM1-like A,
HPFDM4-like A and
HPIDN2-like B suggested that transcript abundance might be modulated by complex regulatory mechanisms possibly involving multiple TRANS-acting factors with antagonistic activity and/or posttranscriptional gene expression regulation. Cell specification in ovules requires epigenetic regulation of gene expression by either RdDM or PTGS [
9] and IDN2 and FDM-like are required for siRNA-mediated DNA methylation [
11,
12]. At the same time, cell proliferation and expansion in different plant contexts—including lateral organ development, apical meristems and ovules—are affected by cytokinins and auxins [
37,
38,
39,
40,
41,
42,
43,
44]. Therefore, the observed transcriptional responses of the RdDM-related genes
HPIDN2 and
HPFDM1,4-like elicited by the hormonal treatments are particularly intriguing. The expression data for S1/S2 pistils (pregametogenesis), collected from both phenotypes and following BAP and NAA treatments, are consistent with the model reported in
Figure 7, which summarizes the interplay between hormonal treatments and apospory in modulating FDM transcript abundance in reproductive structures (i.e., pistils).
Expression of the FDM1-like C and FDM4-like genes is elicited in ovules and/or higher order reproductive structures (i.e., pistils) of aposporous genotypes (
Figure 7①). This elicitation is directly or indirectly promoted by one or more apospory controlling gene/s (defined herein as APO). This hypothesis is in line with the upregulation of these genes in pistils and the higher expression detected for FDM1-like C and FDM4-like C in aposporous ovules, as indicated by the RNAseq analyses of LCM microdissected cells. NAA increases the transcript levels of FDM1-like B and FDM4-like C in sexual but not aposporous pistils, leading to the conclusion that the transcriptional response of these genes is negatively regulated by APO (
Figure 7②). However, increased expression of FDM4-like C was detected in both pistils and ovules collected from aposporous genotypes, indicating that additional as yet undetermined regulatory mechanisms might exist, at least for this gene. BAP decreases the transcript levels of FDM1-like C, FDM4-like A and FDM4-like D and IDN2-like A in aposporous but not sexual pistils. As the expression of FDM1-like C, FDM4-like A and FDM4-like D is promoted in aposporous genotypes, the repression of these transcripts only in aposporous samples is consistent with a model in which BAP directly or indirectly represses the (positive) transcriptional response elicited by APO (
Figure 7③). The transcriptional effects exerted by the two treatments on FDM4-like C and IDN2-like A in sexual and aposporous pistils, respectively, might suggest synergic, rather than antagonistic responses. However, additional, not yet defined, factors might contribute to the observed transcriptional responses, including the crosstalk between the auxins and cytokinins [
37], interplay with other hormones [
45] and interaction with webs of local responses [
37,
46]. Although additional studies will be needed to clarify the causal relationship between apospory and cytokinin/auxin hormonal responses (e.g., one or more apospory-related genes modulating hormonal responses or
vice versa), our expression data from BAP- and NAA-treated pistils demonstrated that the effects of the two synthetic hormones on
IDN2-like and
FDM1,4-like transcript levels are intimately connected to the reproductive phenotype. It is worth noting that recent research in the aposporous complex
Hieracium spp. indicates that aposporus initials in this species are characterized by the overexpression of transcripts involved in the inhibition of cytokinin signalling, which could result in desensitization to the cytokinin signalling [
47]. Our model in
H. perforatum reveals two important aspects of the interplay between hormonal treatments and apospory in modulating FDM transcript abundance in reproductive structures: (i) the transcriptional responses on FDM-like genes in
H. perforatum are determined by crosstalk between genetic determinants of apospory and response regulators elicited by NAA and BAP treatments; and (ii) the promotion of FDM transcript abundance mediated by the genetic determinants of apospory is repressed by the regulators of BAP responses.
In conclusion, aposporous H. perforatum reproductive structures with altered ES founder cell specification are characterized by differential and/or heterochronic expression of IDN2-like and FDM1,4-like gene. The transcript levels of IDN2-like and FDM1,4-like in pistils are affected by BAP and NAA treatments but the transcriptional responses elicited by the two synthetic hormones are dependent on the plant reproductive phenotype. This finding suggests that FDM transcript abundance in reproductive structures is modulated by a molecular network involving interplay between hormonal response regulators and apospory-controlling genes. Our findings contribute to elucidating the phylogeny, expression and regulation of HPIDN2-like and HPFDM1,4-like transcript abundance in H. perforatum reproductive structures and provide additional insight into transcriptional deviations and hormonal responses associated with the aposporous apomictic reproductive strategy in H. perforatum.