Centromeres are specialized chromosomal domains that are required for the proper separation of chromosomes during mitosis and meiosis. The centromere is composed of centromeric DNA, often enriched in satellite repeats, and a large protein complex, the kinetochore. In the centromeric nucleosomes of most eukaryotes, histone H3 is partially replaced by the centromeric histone H3 variant cenH3 (also known as CENP-A in mammals, CID in Drosophila, Cse4 in Saccharomyces cerevisiae, and Cnp1 in Schizosaccharomyces pombe). Deposition of cenH3 at the centromeric region is a prerequisite for the correct assembly and function of the kinetochore complex. It depends on different cenH3 assembly factors and chaperones [2], the transcription of the centromeric repeats [3], and the epigenetic status of centromeric chromatin [6]. In mammals, the Mis18 complex, composed of Mis18α, Mis18β, and Mis18-binding protein 1 (Mis18BP1, also known as KNL2), plays an important role in the licensing of centromeres for cenH3 recruitment [8]. The human Mis18 protein complex localizes to centromeres during late telophase and remains associated with the centromere during early G1 phase, when new CENP-A is deposited [2]. It mediates the recruitment of the cenH3 chaperone Holliday junction recognition protein (HJURP) to endogenous centromeres [11]. Knockout of murine Mis18a is embryonic lethal [7]. Cultured homozygous mutant embryos showed misaligned chromosomes, anaphase bridges, and lagging chromosomes [7].
In plants, KNL2 is the only cenH3 assembly factor that has been identified and characterized so far [13]. In contrast to the mammalian cenH3 assembly factor Mis18BP1, the Arabidopsis KNL2 protein is present at centromeres during all stages of the mitotic cell cycle, except from metaphase to mid-anaphase [13]. Knockout of KNL2 resulted in a reduced amount of cenH3 at centromeres, mitotic and meiotic defects, reduced DNA methylation, and lowered growth rate and fertility [13].
All homologs of Mis18BP1 (KNL2) identified up to now contain the conserved SANTA domain [15] at the N-terminus. However, the functional role of this domain remains obscure. The presence of many conserved hydrophobic residues suggests that it might be involved in protein–protein interactions. The absence of the SANTA domain in Arabidopsis KNL2 was shown not to disturb its centromeric localization. Recently, the conserved C-terminal CENPC-k motif [14], which is required for targeting Mis18BP1 (KNL2) to centromeres, was identified [14]. It is present in the Mis18BP1 (KNL2) proteins of most eukaryotes, excluding therian mammals and Caenorhabditis elegans.
KNL2 binds the centromeric repeat pAL1 and non-centromeric DNA sequences in vitro, whereas in vivo it associates preferentially with the centromeric repeat pAL1. The level and function of the Mis18BP1 protein in human cell culture are regulated by SUMOylation [19], and its centromeric localization is controlled by phosphorylation in a cell cycle-dependent manner [20]. Whether plant KNL2 is regulated in a similar way remains to be elucidated.
Although the Mis18 protein complex is important for the deposition of cenH3 at centromeres in different organisms, its mechanism of action remains to be elucidated in detail. For mammals, it was shown that an interaction of the Mis18 complex with the de novo DNA methyltransferases DNMT3A and DNMT3B is required for regulating the epigenetic status of centromeric DNA and, subsequently, the transcription of centromeric repeats [7]. A knockout of mammalian Mis18α resulted in reduced DNA methylation, altered histone modifications, and increased centromeric transcripts in cultured embryos [7]. However, it was not tested whether a knockout of Mis18 complex components affects the methylation status of non-centromeric chromatin or the expression of other repetitive or gene-coding chromosomal regions. For instance, knockout of KNL2 in Arabidopsis resulted in decreased DNA methylation of the marker regions MEA-ISR and At-SN1.
In the current study, we used an RNA-sequencing (RNA-seq) approach to address whether the inactivation of KNL2 influences genome-wide gene expression at the seedling and flower bud stages in Arabidopsis. This analysis identified highly differentially expressed genes (DEGs) in flower buds (n = 1861) and seedlings (n = 459) of the knl2 mutant. Gene Ontology (GO) term enrichment analysis links the activity of KNL2 to centromere function, DNA repair, and DNA methylation, as well as to the regulation of transcription. The specific pattern of gene expression in response to the inactivation of the KNL2 gene provides a resource for future functional studies to unravel the role of KNL2 in kinetochore assembly and function.
3. Materials and Methods
3.1. Plant Materials and Growth Conditions
The A. thaliana knl2 mutant (SALK_039482) in the Col-0 background was described previously [13]. Seeds of A. thaliana wild-type and knl2 were germinated in Petri dishes on half-strength Murashige and Skoog (MS) medium (Murashige and Skoog, 1962) and grown for eight days. For harvesting of flower buds, populations of wild-type and knl2 plants (30 plants per population) were grown in soil until flowering. In both cases, plants were cultivated under a 16 h photoperiod (21 °C day/18 °C night) at 70% relative humidity. Light irradiance at plant level was 130 µmol m−2 s−1.
3.2. DNA Damage Sensitivity Assays
For the pilot root length assay, plants were germinated and grown for seven days on ½ strength Murashige and Skoog medium containing 0.01% DMSO and one of the following DNA damage inducers: mitomycin C (MMC) (Duchefa, Haarlem, The Netherlands), bleomycin (Calbiochem, San Diego, CA, USA), or camptothecin (Sigma-Aldrich, Saint Louis, MO, USA), at the concentrations specified in the text. For the subsequent analysis, plants were germinated and grown for 14 days on media containing different concentrations of MMC (2.5, 5, 7.5, 10, and 15 µM; Sigma-Aldrich). For the propidium iodide (PI) staining assay, plants were grown on solid ½ strength Murashige and Skoog medium for four days, then transferred to liquid ½ strength Murashige and Skoog medium containing 0, 10, or 20 µM MMC and grown for 24 h. Subsequently, the plants were stained with 10 μg/mL propidium iodide solution (Sigma-Aldrich) for 3 min, rinsed with tap water, and analyzed using an AxioImager Z2 microscope (Zeiss, Jena, Germany) equipped with the DSD2 confocal module (Andor Technology, Belfast, UK). Plants for all procedures were grown in a Percival growth chamber under long-day conditions (16 h light) at 21 °C.
3.3. RNA Isolation and Illumina Sequencing
Total RNA was extracted from eight-day-old seedlings of A. thaliana. At least 60 seedlings (100 mg) were pooled to produce a biological replicate. For the RNA-seq analysis of flower buds, inflorescences were harvested from three individual plants for each sample (30 mg). Open flowers and flower buds older than stage 12 [55] were removed. All tests were performed on three biological replicates per condition and genotype. Total RNA was isolated from seedling and flower bud samples using the RNeasy Plant Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions. The RNA preparations were checked for quality using a NanoDrop spectrophotometer and a 2100 Bioanalyzer (Agilent, Santa Clara, CA, USA). In total, 12 RNA samples (1 µg each) were provided to the IPK-Sequencing-Service (IPK, Gatersleben, Germany) for construction of cDNA libraries with the TruSeq RNA Sample Preparation Kit (Illumina, San Diego, CA, USA). The libraries were sequenced on a HiSeq2500 system in a rapid 100-bp single-read run. After sequencing, the adapter sequences and the barcodes were removed.
3.4. RNA-seq Data Processing
The sequencing quality of the reads was examined using the FastQC Read Quality reports tool, Galaxy Version 0.72 [56]. At least 94% of the bases of each read in all samples had an Illumina quality score >30, and no sequence was flagged as poor quality.
Sequences were aligned against the Arabidopsis TAIR10 genome assembly using the HISAT2 Galaxy tool, version 2.0.3 [57], with default settings. Read counts for each gene were quantified from the BAM files produced by HISAT2 using the featureCounts tool, Galaxy Version 1.4.6.p5 [58]. The advanced parameters were strand specificity = “no”, GFF feature type filter = “exon”, and GFF gene identifier = “Parent”. Differentially expressed features were determined from the featureCounts tables with the DESeq2 tool, Galaxy Version 2.11.38 [21], using the parameter fit type = “parametric”. DESeq2 tests for differential expression with a model based on the negative binomial distribution. Differentially expressed genes were identified in two comparisons: (1) the mutant line versus the wild-type control in seedlings and (2) the mutant line versus the wild-type control in flower buds.
The results of all statistical tests were adjusted for multiple testing by controlling the false discovery rate (FDR) with the Benjamini and Hochberg procedure [59]. An adjusted p-value of 0.05 was chosen as the threshold for identifying significantly differentially expressed genes.
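As an illustration of the adjustment described above, the following minimal Python sketch computes Benjamini–Hochberg adjusted p-values and applies a 0.05 threshold. It is not the DESeq2 implementation used in this study; the function name `benjamini_hochberg`, the example p-values, and the use of a strict `<` comparison at the threshold are assumptions made for illustration only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five genes (not data from this study).
raw_p = [0.001, 0.008, 0.039, 0.041, 0.60]
adj_p = benjamini_hochberg(raw_p)

# Genes passing the adjusted p-value threshold (strict "<" is an assumption).
significant = [p < 0.05 for p in adj_p]
```

In this hypothetical example, the third and fourth genes have raw p-values below 0.05 but no longer pass the threshold after adjustment, which is the behaviour the FDR correction is meant to provide.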
3.5. Analysis of Differentially Expressed Genes (DEGs)
Functional characterization of the DEGs showing significant expression changes in response to KNL2 depletion was based on the TAIR10 annotation (https://www.arabidopsis.org/). Gene Ontology terms were analyzed for enrichment using the Generic Gene Ontology GO::TermFinder tool [22]. The analysis was carried out using the Benjamini–Hochberg FDR correction with a p-value filter of < 0.05.
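To make the idea of term enrichment concrete, the sketch below computes a one-sided hypergeometric p-value for a single GO term, the kind of test on which GO::TermFinder is based. It is a simplified illustration, not the tool's code; the function name `hypergeom_enrichment_p` and all counts other than the number of seedling DEGs (459) are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, drawn, hits):
    """P(X >= hits) for X ~ Hypergeometric(population, annotated, drawn).

    population: number of background genes
    annotated:  background genes carrying the GO term
    drawn:      number of genes in the DEG list
    hits:       DEGs carrying the GO term
    """
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    # Sum the probabilities of observing `hits` or more annotated genes.
    tail = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical example: 459 DEGs drawn from an assumed background of
# 27,000 genes, of which 150 carry the term and 12 of those are DEGs.
p_term = hypergeom_enrichment_p(27000, 150, 459, 12)
```

In a full analysis, one such p-value would be computed per GO term and the resulting set corrected for multiple testing before applying the 0.05 filter.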
Information about flowering-related genes was extracted from the Flowering Interactive Database [60]. The presence of transcription factors in the analyzed gene sets was confirmed with the Plant Transcription Factor Database v.4.0 [40]. Information from the SeedGenes Project [61] about genes essential for Arabidopsis development was used to find the corresponding genes among the differentially expressed genes.
3.6. Gene Expression Validation by Reverse Transcription Quantitative PCR (RT-qPCR)
Total RNA extraction was performed as described above. The RNA was treated with DNase I (Ambion, Thermo Fisher Scientific, Waltham, MA, USA) according to the manufacturer’s protocol to eliminate any residual genomic DNA. Reverse transcription was performed using a first-strand cDNA synthesis kit, oligo dT (18-mer) primer (both Fermentas, Thermo Fisher Scientific, Waltham, MA, USA), and 2 μg of total RNA as a starting material.
The gene-specific primers (Table S6, Supplementary Materials) were designed using the fully automated QuantPrime tool [62]. Amplification of the UBQ10 (AT4G05320) reference gene [63] was used as an internal control to normalize the data.
Quantitative real-time measurements were performed using POWER SYBR Green Master Mix reagent in a QuantStudio 6 Flex system (Applied Biosystems, Thermo Fisher Scientific, Waltham, MA, USA), according to the manufacturer’s instructions. The cDNA equivalent to 40 ng of total RNA was used in a 10 µL PCR reaction.
The cycling conditions comprised 10 min of polymerase activation at 95 °C followed by 40 cycles of 95 °C for 3 s and 60 °C for 30 s. Three biological replicates per genotype (wild-type and knl2 mutant line) in both conditions (seedlings and flower buds) were tested. Each biological replicate was represented by three technical replicates, which were analyzed during the same run. Relative gene expression was calculated using the comparative 2^(−ΔΔCT) method.
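The comparative CT calculation can be sketched as follows. This is a minimal illustration that assumes an amplification efficiency of 100% for both the target and the reference gene and averages technical replicates before normalization; the function name `relative_expression` and all CT values are hypothetical and are not measurements from this study.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of CT values."""
    return sum(values) / len(values)


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene (sample vs. control) by 2^(-ddCT).

    Each argument is a list of technical-replicate CT values. The reference
    gene (UBQ10 in the text) normalizes for differences in cDNA input.
    """
    delta_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)


# Hypothetical CT values: knl2 mutant (sample) vs. wild-type (control).
fold_change = relative_expression(
    ct_target_sample=[24.0, 24.1, 23.9],   # target gene, mutant
    ct_ref_sample=[18.0, 18.0, 18.0],      # reference gene, mutant
    ct_target_control=[26.0, 26.0, 26.0],  # target gene, wild-type
    ct_ref_control=[18.0, 18.0, 18.0],     # reference gene, wild-type
)
# dCT is 6 in the mutant and 8 in the wild-type, so ddCT = -2 and the
# target gene is about 4-fold higher in the mutant.
```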