Identification of Imprinted Genes Based on Homology: An Example of Fragaria vesca

Genomic imprinting has drawn increasing attention in plant biology in recent years. Hundreds of imprinted genes have now been identified in various plants, and some of them have been reported to be evolutionarily conserved across plant species. In this study, 17 candidate genes in Fragaria vesca were obtained based on homology to imprinted genes in Arabidopsis thaliana and other species. To investigate whether imprinting of these genes is conserved, we made reciprocal crosses of diploid strawberry (F. vesca) using the varieties 10-41 and 18-86 as parents. The candidate genes were cloned and sequenced from both parents, and potentially informative single nucleotide polymorphisms (SNPs) between the parents were used as molecular markers. We then analyzed the SNP site variation ratios and parent-of-origin expression patterns of the candidate genes in the endosperm and embryo of the reciprocal hybrids at 10 days after pollination (DAP). Five maternally expressed genes (MEGs), FvARI8, FvKHDP-2, FvDRIP2, FvBRO1, and FvLTP3, were identified in the endosperm; none of them showed imprinting in the embryo. Finally, tissue expression analysis indicated that the imprinted genes, except FvDRIP2, were mainly expressed in the endosperm. This is the first report of imprinted genes in Fragaria, and it provides a simple and rapid homology-based method for screening imprinted genes. The present study provides a basis for further work on the function and mechanism of genomic imprinting in F. vesca.


Introduction
Genomic imprinting is a phenomenon in which paternal and maternal alleles are differentially expressed in the offspring depending on their parental origin [1]. Based on this parent-of-origin behavior, imprinted genes are divided into two groups: maternally expressed genes (MEGs) and paternally expressed genes (PEGs). To date, genomic imprinting has been identified in fungi, mammals, and flowering plants [2]. Gene imprinting in flowering plants occurs mainly in the endosperm, and only a few imprinted genes have been described in the embryo [3][4][5][6][7]. The endosperm and embryo of flowering plants are both derived from a double fertilization event: the egg cell (1n) and the central cell (2n) each fuse with one of two sperm cells (1n) to form the diploid embryo (2n) and the triploid endosperm (3n), respectively. The endosperm therefore carries two maternal genome copies and one paternal genome copy [8][9][10].
It has been demonstrated that genomic imprinting is an epigenetic process that involves DNA methylation and trimethylation of histone H3 lysine 27 (H3K27me3) [11], and many genes involved in this process are themselves imprinted. For example, in A. thaliana, MEDEA (MEA) and FERTILIZATION-INDEPENDENT ENDOSPERM (FIE), components of polycomb repressive complex 2 (PRC2), which mediates H3K27me3, were found to be maternally expressed during endosperm development [12][13][14][15][16]. Interestingly, studies on maize found that the homologs of MEA and FIE were also imprinted [17,18]. Moreover, in rice, OsFIE was also identified as an imprinted gene [19]. Besides the PRC2 genes, YUCCA10, VIM5, VIM1, and ARID-BRIGHT DNA binding domain from A. thaliana and their homologs in rice and maize, as well as 29765.m000727 from castor bean and its homologs in A. thaliana, rice, and maize, have also been identified as imprinted genes [20,21]. These studies indicate that genomic imprinting is evolutionarily conserved in flowering plants.
Genes 2021, 12, 380
In this study, based on the evolutionary conservation of genomic imprinting across plant species, we constructed reciprocal crosses of wild diploid strawberry F. vesca to test whether imprinting of the PRC2-related genes MEA and FIE and of 16 other conserved imprinted genes is conserved in strawberry. By analyzing the imprinting status of the candidate genes based on SNP site variation ratios and parent-of-origin expression patterns, we found that five genes showed conserved imprinting in strawberry endosperm. Moreover, the tissue-specific expression characteristics of these five genes are similar to those of imprinted genes reported in other plants [7,22]. Our findings not only give a preliminary understanding of genomic imprinting in strawberry, but also provide evidence for the conservation of imprinted genes. They will help to better explore the function and mechanisms of imprinted genes.

SNP Information of Candidate Imprinted Genes in Parents
In total, 17 homologous genes were obtained from F. vesca by BLASTP, and the coding sequences (CDs) of the candidate genes were obtained by RT-PCR cloning and sequencing. SNPs were analyzed based on the sequencing results of the 17 candidate genes from leaves of 10-41 and 18-86. Of the 17 genes amplified, 10 (FvFIE, FvYLS9, FvMLP, FvUNP, FvKH, FvVIP2, FvYLS3, FvUNP1, FvMIA40, and FvAMI) lacked SNP sites between the two F. vesca ecotypes; that is, their coding sequences did not differ between the parents. The remaining seven candidate genes have SNPs between the parental ecotypes: FvARI8 has 1 SNP site, FvBRO1 has 2, FvKHDP-2 has 19, FvDRIP2 has 17, FvLTP3 has 3, FvTAR4 has 11, and FvCAL has 2 (Table S1).
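The parental comparison described above can be illustrated with a minimal Python sketch. It assumes the two parental coding sequences are already aligned, gap-free, and of equal length, which is a simplification; the function name `find_snps` and the toy sequences are hypothetical and are not F. vesca data or the software used in this study.

```python
def find_snps(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, base in A, base in B) for every mismatch.

    Assumes the sequences are pre-aligned, gap-free, and of equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    snps = []
    for pos, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper()), start=1):
        if a != b:
            snps.append((pos, a, b))
    return snps


# Toy sequences standing in for the two parental alleles (hypothetical).
parent_10_41 = "ATGGCATTAGC"
parent_18_86 = "ATGGCGTTAGC"

snps = find_snps(parent_10_41, parent_18_86)
print(snps)  # [(6, 'A', 'G')]

# A gene with an empty SNP list is uninformative for allele-specific analysis.
print("informative" if snps else "uninformative")
```

A gene for which the list is empty corresponds to the 10 candidates above that could not be tested for imprinting.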

Biallelic Expression of FvTAR4 and FvCAL Genes in Endosperm
FvTAR4 and FvCAL have 11 and 2 SNPs, respectively. Amplification and sequencing showed that both alleles are expressed at the SNP sites in the hybrid endosperms. At position 314 of FvTAR4, the SNP site has two different bases, A and T (Figure S1A). Similarly, FvCAL has A and G at position 406 (Figure S1B). The base variation ratio at these SNP sites was estimated to be close to 2:1, in line with the expected value. We therefore concluded that FvTAR4 and FvCAL are not imprinted genes.

Five Genes Show Imprinting in the Endosperm of Wild Strawberry
There is only one SNP site in the coding region of FvARI8 between the 10-41 and 18-86 parents. The expression pattern at this SNP site in the reciprocal cross endosperm is shown in Figure 1. In the hybrid endosperm with 10-41 as the maternal parent, the base at the FvARI8 SNP site is the same as in parent 10-41; in the hybrid endosperm with 18-86 as the maternal parent, the base at the same position is the same as in 18-86. In other words, FvARI8 showed monoallelic expression in a parent-of-origin-dependent manner, indicating that this gene is a MEG and shows binary imprinting. The expression patterns of FvKHDP-2, FvDRIP2, FvBRO1, and FvLTP3 at their SNP sites in the hybrid endosperm are shown in Figures S2-S5 and are the same as that of FvARI8, which supports the conservation of imprinting for these genes across plant species.

Candidate Imprinted Genes Show Non-Imprinting in the Embryos
We tested the seven candidate genes with SNPs between the two F. vesca ecotypes for imprinting in the reciprocal cross embryos at 10 days after pollination (DAP). The results are shown in Figure S6. Both the maternal and paternal alleles of FvARI8, FvKHDP-2, FvDRIP2, FvBRO1, FvLTP3, FvCAL, and FvTAR4 are expressed in the embryos of the hybrid combinations 18-86 × 10-41 and 10-41 × 18-86, and the base variation ratio at the SNP sites is close to the expected value of 1:1. We therefore concluded that these genes are not imprinted in the reciprocal cross embryos.

Expression Patterns of Imprinted Genes in Different Tissues
To investigate whether the five MEGs have endosperm-specific expression characteristics, qRT-PCR was used to analyze the expression patterns of these genes in different tissues of F. vesca. The expression patterns of FvARI8 and FvBRO1 are similar: both show lower expression in vegetative tissues and in some reproductive tissues such as style, fruit, and ovary, but higher expression in endosperm and pollen (Figure 2A,D). FvBRO1 has a higher expression level in the hybrid endosperm and embryo at 10 DAP than in other tissues except pollen, and a lower expression level in style. Expression of FvBRO1 in green fruit, ripe fruit, receptacle, and achene is extremely low, indicating that it may be less important in fruit development (Figure 2D). FvKHDP-2 is weakly expressed in pollen (Figure 2B). Tissue-specific higher expression of FvDRIP2 was observed in achenes and embryo (Figure 2C). The low expression of FvLTP3 in vegetative tissues is similar to that of FvARI8 and FvBRO1, whereas FvLTP3 shows higher expression in embryo, endosperm, and pollen (Figure 2E). In addition, all of the imprinted genes have low expression in root and runner.

Discussion
Genomic imprinting has been widely observed in animals and plants. In plants, the first imprinted gene was identified in maize by phenotypic methods [23]. However, only a handful of imprinted genes had been identified in plants until the emergence of next-generation sequencing technology. Since then, more and more imprinted genes have been identified and characterized in other plant species, including A. thaliana [4,5,7,24,25], rice [26][27][28], maize [20,22,29], sorghum [30], castor bean [21], wheat [31], tomato [32,33], Capsella rubella [34], and Brassica [35,36]. Further studies indicated that many imprinted genes are conserved among different species. For example, 24 imprinted genes in A. lyrata were found to be imprinted in A. thaliana [25], and 55.6% of imprinted genes are reported to be conserved between hexaploid wheat and its tetraploid close relative [31]. Chen et al. [37] compared the imprinted genes obtained in rice with those in A. thaliana, maize, and sorghum, and found that several of these genes were conserved imprinted genes in these species. They further identified eight homologous imprinted genes in barley (Hordeum vulgare) using the conserved imprinted genes from rice.
In this study, MEA, FIE, and 16 conserved imprinted genes in A. thaliana and their homologs in four other plants were used as query genes to screen for imprinted homologs in diploid strawberry F. vesca. As shown in Table 1, all genes except MEA have homologs in F. vesca, indicating that this is a feasible approach to identifying candidate imprinted genes in plants.
To test whether the 17 candidates were true imprinted genes, we cloned and sequenced the CDs of the 17 genes from the respective parents. Sequence analysis indicated that 10 of the 17 candidates lacked SNPs between the parents, so it could not be determined whether these 10 genes are imprinted. The remaining seven candidate genes, each of which contained one or more SNPs, were further analyzed in both endosperm and embryo. The results indicated that five of the seven genes, i.e., FvARI8, FvKHDP-2, FvDRIP2, FvBRO1, and FvLTP3, were imprinted in the endosperm. Interestingly, all seven genes showed non-imprinted expression in the reciprocal cross embryos. These results are consistent with previous findings that gene imprinting in flowering plants occurs mainly in the endosperm [3][4][5][6][7].
The identification of imprinted genes in the endosperm and embryo can be confounded by maternal tissues [38,39] and parent-of-origin effects [40]. The seed coat, a maternal tissue that surrounds and protects the seed, can lead to false-positive MEGs. To reduce false-positive MEGs, Pignatta et al. 2014 [7] censored genes whose expression was more than twice as high in the seed coat as in the endosperm during RNA-seq analyses. In this study, we also analyzed the expression of the imprinted genes in the seed coat (Figure 2); the results showed that all of them except FvDRIP2 have a lower expression level in the seed coat than in the endosperm. In previous studies, imprinted genes regulated by specific epigenetic marks, such as MEA [14][15][16], FIS2 [15], LORELEI [41], NUWA [42], and Mez1 [17], showed monoallelic expression. Similarly, the five imprinted genes identified in this study also show monoallelic expression in a parent-of-origin-dependent manner. However, examples of monoallelic expression of non-imprinted genes have also been reported, which contributes to phenotypic diversity in poplar [43], barley [44], rice [45], and A. thaliana [46]. In addition, maternal effects can give the appearance of a parent-of-origin effect because mRNA from the gametophyte is deposited in the fertilized egg cell (zygote) or fertilized central cell (endosperm) during early seed development, which increases contamination in RNA-seq analyses [38]. Sequencing alone therefore cannot distinguish imprinting from contamination or parent-of-origin effects, and further research is needed to show that the genes with parent-of-origin-dependent monoallelic expression are modified by specific epigenetic marks.
Further analysis indicated that the five strawberry endosperm imprinted genes are MEGs, whereas some of their homologs in other plants are PEGs. For example, FvKHDP-2 and its homolog in A. thaliana are MEGs, while its orthologs in rice and Sorghum bicolor are PEGs. This indicates that although these are conserved imprinted genes, the regulatory mechanism of imprinted expression may differ between species. MEGs and PEGs differ in their targeting by 24-nt small RNAs and asymmetric DNA methylation, suggesting that different mechanisms establish DNA methylation at MEGs and PEGs [34,47,48]. Batista and Köhler [11] proposed that MEGs depend on parental DNA methylation asymmetries, whereas PEGs depend on both parental asymmetric DNA methylation and H3K27me3.
Among the five imprinted genes, FvARI8 and FvDRIP2 belong to the E3 ubiquitin ligase RING protein family. FvARI8 is structurally similar to AtARI8 in A. thaliana, with a RING1-IBR-RING domain, a coiled-coil domain, and a leucine-rich region at the C-terminus, which plays an important role in maintaining protein binding. AtARI8 is expressed in stems, leaves, flowers, and siliques [49]; FvARI8 is also expressed in these tissues and may have similar functions, but we have not found any functional studies. The amino acid sequences of FvDRIP2 and Vigna unguiculata VuDRIP are 48.85% similar. An interaction between VuDRIP and VuDREB2A was detected by yeast two-hybrid assay; DRIP negatively regulates DREB2A, reducing DREB2A expression under non-stress conditions and thereby reducing the metabolic burden [50]. FvDRIP2 may also participate in stress-related regulation. FvKHDP-2 contains a KH domain. Proteins containing KH domains perform multiple cellular functions; the KH domain is essential for establishing a post-transcriptional regulatory network, and it also has E3 ubiquitin ligase activity [51][52][53]. FvBRO1 is a member of the heavy-metal-associated protein family. Studies have shown that members of this family are mainly involved in heavy metal accumulation in A. thaliana [54]. FvLTP3 is a non-specific lipid transfer protein; such proteins play an important role in plant defense and may also be involved in responses to abiotic stresses such as drought, cold, and salt [55,56].
The expression characteristics of the five strawberry MEGs are similar to the expression patterns of imprinted genes in maize [22] and A. thaliana [7]; that is, they are highly expressed in the endosperm and may be involved in seed development. Tissue expression analysis also found that the identified imprinted genes are expressed in various tissues, suggesting that they play a role in other tissues in addition to their role in endosperm development. Similarly, 67% of MEGs in castor bean were found to be expressed in other tissues [21]. Future studies will be undertaken to characterize the functions of these strawberry imprinted genes. In summary, this simple and rapid homology-based method of identifying imprinted genes is feasible. Although the number of imprinted genes obtained is limited, the method will become more powerful as more imprinted genes are discovered. To our knowledge, this is the first report of imprinted genes in the endosperm of Fragaria.

Plant Materials
Wild F. vesca ecotypes 10-41 (collected from Europe) and 18-86 (collected from Tianshan, Fukang, Xinjiang, China) were planted in the greenhouse at the Baima Teaching and Scientific Research Base of Nanjing Agricultural University, Nanjing, China. Unopened flowers were emasculated; after pollination, the flowers were bagged at the beginning of flowering in April.

Acquisition of Candidate Imprinted Genes
Conserved imprinted genes that have been reported in plants are listed in Table 2. Using the protein sequences encoded by imprinted genes in A. thaliana as the query sequences, BLASTP was performed against the F. vesca protein database at the National Center for Biotechnology Information (NCBI) (E-value < 10 × 10^−10). Multiple sequence alignment of proteins between species was performed using BioXM (https://cbi.njau.edu.cn/BioXM/ (accessed on January 2014)) and MEGA-X [57] to further confirm their similarity.
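As an illustration of the screening step, the following Python sketch filters BLAST tabular output (the standard 12-column `-outfmt 6` layout, in which the E-value is the 11th column) and keeps the best hit per query below the cut-off. This is a sketch under stated assumptions, not the procedure used in this study: the function name `best_hits`, the identifiers in the example, and the choice of keeping only one hit per query are all hypothetical.

```python
# Cut-off written as printed in the text ("10 × 10^-10"); adjust as needed.
E_VALUE_CUTOFF = 10e-10


def best_hits(tabular_text: str, cutoff: float = E_VALUE_CUTOFF) -> dict:
    """Map each query ID to (subject ID, E-value) of its best hit below cutoff.

    Expects tab-separated BLAST output in the 12-column tabular format:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore.
    """
    best = {}
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue >= cutoff:
            continue  # hit does not pass the E-value threshold
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Hypothetical hits; the IDs and numbers are invented for illustration only.
rows = [
    ["queryA", "subj1", "62.1", "360", "130", "3", "1", "358", "4", "362", "1e-150", "430"],
    ["queryA", "subj2", "35.0", "120", "70", "5", "10", "125", "8", "130", "2e-12", "60"],
    ["queryB", "subj3", "28.0", "90", "60", "4", "5", "92", "3", "95", "0.003", "35"],
]
example = "\n".join("\t".join(r) for r in rows)

print(best_hits(example))  # {'queryA': ('subj1', 1e-150)}
```

In this toy input, `queryB` has no hit below the cut-off and so yields no candidate, analogous to a query with no reported homolog.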

RNA Isolation and Cloning of Candidate Imprinted Genes
Total RNA from different tissues was extracted using an RNA extraction kit (Tiangen, Beijing, China), and RNAs were reverse-transcribed into complementary DNA (cDNA) using the PrimeScript RT reagent kit (TaKaRa, Dalian, China). Based on the candidate gene sequences from NCBI, Primer 5.0 software was used to design specific primers within the CDs (Table 3). The cDNAs from leaves of the parents and from endosperm and embryo of the reciprocal crosses between F. vesca ecotypes 10-41 and 18-86 at 10 DAP were amplified by PCR with these primers. The PCR program was as follows: 1 cycle of 5 min at 98 °C; 35 cycles of 30 s at 94 °C, 30 s at 55 °C, and 1 min at 68 °C; and a final extension for 10 min at 68 °C. All PCR products were analyzed by agarose gel electrophoresis, and the target band was recovered using an AxyPrep DNA gel recovery kit (Axygen, Union City, CA, USA). The target fragment was then ligated into the vector from the pCloneEASY Blunt Vector Kit (TransGen Biotech, China) and transformed into Escherichia coli DH5α (Tsingke Biotechnology, China). Finally, 20 single colonies for each fragment were picked and grown overnight at 37 °C for Sanger sequencing, which was performed by Tsingke Biotechnology Ltd. (Beijing, China).

Identification of Imprinted Genes in Wild Strawberry
Potentially informative SNP sites were obtained by aligning the Sanger sequencing data from leaves of 10-41 and 18-86 with BioXM (https://cbi.njau.edu.cn/BioXM/ (accessed on January 2014)) and MEGA-X [57]. Candidate imprinted genes with pure SNP sites were retained; all others were eliminated. We then analyzed SNP information from the endosperm and embryo of the reciprocal crosses to observe the expression of alleles according to their parent of origin. Finally, the expression ratios of maternal and paternal alleles were calculated for the genes that had SNP sites in endosperm and embryo tissue according to the method of Liu and Qian [59] and with ImageJ software [60]. The theoretical expression ratio of maternal to paternal alleles is 2:1 in hybrid endosperm and 1:1 in hybrid embryo [61,62]. In the reciprocal cross endosperm, a MEG was defined as a gene with a maternal-to-paternal ratio higher than 4m:1p, and a PEG as a gene with a paternal-to-maternal ratio higher than 2p:1m. In the reciprocal cross embryo, MEGs and PEGs were defined as genes with a ratio higher than 3:1 (3m:1p and 3p:1m, respectively).
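The thresholds above can be written as a small decision rule. The Python sketch below is illustrative only: it assumes that allele expression is summarized as counts of sequenced clones carrying the maternal or paternal base (the study itself derived ratios following Liu and Qian [59] and ImageJ [60]), and the function names and the requirement that both reciprocal crosses agree are assumptions made for this example.

```python
# Thresholds from the text: (minimum m:p ratio for MEG, minimum p:m ratio for PEG).
THRESHOLDS = {
    "endosperm": (4, 2),  # MEG > 4m:1p, PEG > 2p:1m (expected 2m:1p)
    "embryo": (3, 3),     # MEG > 3m:1p, PEG > 3p:1m (expected 1m:1p)
}


def classify_cross(maternal: int, paternal: int, tissue: str) -> str:
    """Classify one cross direction from maternal/paternal allele counts."""
    if maternal < 0 or paternal < 0 or maternal + paternal == 0:
        raise ValueError("need non-negative counts with at least one clone")
    meg_min, peg_min = THRESHOLDS[tissue]
    # Cross-multiplied comparison avoids division by zero for monoallelic cases.
    if maternal > meg_min * paternal:
        return "MEG"
    if paternal > peg_min * maternal:
        return "PEG"
    return "biallelic"


def classify_gene(cross_1, cross_2, tissue: str) -> str:
    """Combine both reciprocal crosses; each is (maternal, paternal) counts.

    Disagreement between directions suggests an allele-specific rather than
    a parent-of-origin effect, so it is reported as 'inconsistent'.
    """
    first = classify_cross(*cross_1, tissue)
    second = classify_cross(*cross_2, tissue)
    return first if first == second else "inconsistent"


# Hypothetical clone counts (20 clones per cross), not data from this study.
print(classify_gene((20, 0), (20, 0), "endosperm"))  # MEG
print(classify_gene((13, 7), (14, 6), "endosperm"))  # biallelic
print(classify_gene((10, 10), (11, 9), "embryo"))    # biallelic
```

Requiring agreement between the two cross directions is what separates a parent-of-origin effect from a simple difference in expression strength between the 10-41 and 18-86 alleles.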

Expression Profiles Analysis of Imprinted Genes
RT-qPCR was performed using an ABI 7300 Real-Time PCR System (Applied Biosystems, Foster City, CA, USA) and SYBR Green Real-time PCR Master Mix (Toyobo, Osaka, Japan). The total reaction volume (20 µL) comprised 10 µL SYBR Green Master Mix, 8.4 µL ddH2O, 1 µL cDNA sample (equivalent to 100 pg of total RNA), and 0.3 µL of each primer (the final concentration of all primers was 10 µM). The primers are listed in Table 4. The reactions were incubated at 95 °C for 4 min, followed by 40 cycles of 94 °C for 20 s, 62 °C for 20 s, and 72 °C for 40 s (extension and fluorescence signal acquisition). Four technical replicates were performed for each of three biological replicates of each sample. Quantitative analysis of gene expression was performed using the 2^−ΔΔCT method [63], and SPSS software version 25.0 was used for statistical analysis.
Figure S1: FvTAR4 and FvCAL show biallelic expression in the endosperm of hybrids of two Fragaria vesca ecotypes. Sequencing of FvTAR4 and FvCAL was used to confirm imprinting status in 10-41 × 18-86 and 18-86 × 10-41. At least ten colonies were sequenced for each amplicon. SNP sites are shaded in blue. FvTAR4 and FvCAL show biallelic expression, and the variation ratio of their SNPs is close to the expected 2:1 ratio in the reciprocal crosses. Figure S4: Monoallelic expression of FvBRO1 in reciprocal cross endosperm of two Fragaria vesca strains. Figure S5: Monoallelic expression of FvLTP3 in reciprocal cross endosperm of two Fragaria vesca strains. Figure S6: Biallelic expression of seven candidate genes in reciprocal cross embryo of two Fragaria vesca ecotypes. Seven candidate genes with SNPs were cloned in reciprocal cross embryos, and sequencing showed that their variation ratio of SNPs is close to the 1:1 expected