An Update to the TraVA Database: Time Series of Capsella bursa-pastoris Shoot Apical Meristems during Transition to Flowering

Abstract: Transition to flowering is a crucial part of plant life that directly affects the fitness of a plant. Transcriptome time series are a useful tool for investigating process dynamics and can be used to identify novel genes and gene networks involved in a process. We present a detailed time series of polyploid Capsella bursa-pastoris shoot apical meristems created with RNA-seq. The time series covers the transition to flowering and can be used for thorough analysis of the process. To make the data easy to access, we uploaded them to our database Transcriptome Variation Analysis (TraVA), which provides a convenient depiction of gene expression profiles, differential expression analysis between the homeologs and quick data extraction.


Summary
The ability to produce flowers and seed-covering fruits is a major achievement of Angiosperms, which gave rise to the enormous diversity and wide distribution of this taxon. A critical stage in the life of Angiosperm plants is the transition to flowering. Precise timing of flowering ensures the reproductive success of a plant and relies on various environmental and endogenous stimuli; it is therefore subject to natural selection, as was shown for various annual (Arabidopsis thaliana, rice, Brassica rapa [1,2]) and perennial (Boechera stricta, orchid Myrmecophila christinae [3,4]) plants. The main regulators of floral transition are thoroughly described in the model plant A. thaliana (reviewed in [5]), and the orthologs of the A. thaliana regulatory genes are often the targets of selection in other species [6].
Transcriptomic time series were shown to be a suitable tool for the analysis of process dynamics [7]. Transition to flowering was analyzed with time series in several papers on A. thaliana [8,9], which allowed the identification of cellular processes and participants beyond the known regulators of the reproductive switch.
Capsella bursa-pastoris is a recent allopolyploid plant and a close relative of A. thaliana. Since its emergence 100,000-300,000 years ago [10], C. bursa-pastoris has spread all across the world; such a wide range has led to different, environmentally dependent flowering strategies, as has been explored in population studies [11,12]. In our previous study of the different C. bursa-pastoris organs, we noted that duplicated floral transition regulators may be undergoing a divergence of function [13]. In the current study, we created a transcriptomic time series dataset of C. bursa-pastoris shoot apical meristems (SAMs), as the SAM is the plant structure where the main processes of floral transition occur. Gene expression profiles from the dataset are readily accessible in our database Transcriptome Variance Analysis (TraVA, travadb.org), and the raw data are available in a public repository.

Time Series of Capsella Shoot Apical Meristems
We have constructed an RNA-seq-based time series of C. bursa-pastoris shoot apical meristems under long-day (16 h light/8 h dark) conditions. The close relationship between C. bursa-pastoris and the model plant A. thaliana provides ample opportunities for interspecies comparisons in an evolutionary context. We aimed to make the Capsella time series comparable with a time series of A. thaliana SAMs [9], so we reproduced the plant growing conditions (day length, light quality, temperature, humidity) and sample harvesting techniques, including hand dissection of the SAM and tissue fixation. As a result, we obtained nine samples of C. bursa-pastoris SAMs collected from 8 to 16 days after germination (DAG). A brief description of the samples is given in Table 1. Two biological replicates consisting of 15 plants each were harvested for every sample. At the early stages of the time series, the SAM formed rosette leaves exclusively, and at the last stage it produced floral primordia; our dataset therefore covers the transition to flowering and can be used to analyze the process. RNA extracted from the SAM samples was sequenced on an Illumina platform. At least 9.3 M reads uniquely mapped to genes were generated for each sample (Table 1). To show that the time series is suitable for floral transition studies, we analyzed the expression profiles of the Capsella orthologs of several A. thaliana regulators of flowering: the genes LEAFY (LFY), SUPPRESSOR OF OVEREXPRESSION OF CO 1 (SOC1) and APETALA 1 (AP1) (Figure 1). As expected, expression of SOC1 and LFY increased at the early stages of floral transition [14,15]. Beginning with stage M6, expression of both AP1 homeologs was elevated, as in the A. thaliana SAM time series [9].

An Update of Transcriptome Variance Analysis Database
The transcriptomic database TraVA (travadb.org) was created for gene expression profile display, differential expression analysis and intergenic transcription comparison. Detailed transcriptome maps of such plants as A. thaliana, tomato or C. bursa-pastoris form the main body of the database [9,13,16]. We consistently aim to increase the diversity of the datasets and analysis types in TraVA; in the current update, we added a time series of C. bursa-pastoris SAMs.
A substantial part of the Capsella genes have orthologs in the A. thaliana genome, which facilitates the search for a gene of interest. Our search engine supports C. bursa-pastoris gene identifiers as well as A. thaliana gene IDs and gene names (Figure 2). As C. bursa-pastoris is a tetraploid plant, the majority of its genes form homeologous pairs with one gene coming from the C. rubella parental species (Subgenome A of C. bursa-pastoris) and the second homeolog from C. orientalis (Subgenome B) [10]. The expression profile of the searched gene is shown along with that of its homeolog (Figure 1). Our database provides several modes of expression profile representation: absolute read counts normalized by library size (as in the "DESeq" package [18]; Raw Norm mode in Figure 1) and relative normalized read counts, divided by the maximum expression value for a given gene (0-1 mode). In both modes, the read count values can be shown or hidden. A color scale makes the expression pattern easy to read.
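The two display modes can be illustrated with a minimal Python sketch. It assumes that normalization by library size "as in DESeq" means the median-of-ratios size factors used by that package. The function names and the toy counts are hypothetical and are not taken from TraVA.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors, one per sample.

    counts: dict mapping gene id -> list of raw read counts (one per sample).
    Only genes with non-zero counts in every sample contribute, so at least
    one such gene must be present.
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for gene_counts in counts.values():
        if all(c > 0 for c in gene_counts):
            # Log of the geometric mean of this gene across samples.
            log_geo_mean = sum(math.log(c) for c in gene_counts) / n_samples
            for j, c in enumerate(gene_counts):
                log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]


def normalize(counts):
    """Divide raw counts by the per-sample size factor ("Raw Norm"-like view)."""
    sf = size_factors(counts)
    return {gene: [c / s for c, s in zip(vals, sf)] for gene, vals in counts.items()}


def scale_0_1(profile):
    """Divide a normalized profile by its maximum value ("0-1"-like view)."""
    peak = max(profile)
    return [x / peak if peak > 0 else 0.0 for x in profile]


# Hypothetical toy data: the second library is sequenced twice as deeply.
toy_counts = {"geneA": [10, 20], "geneB": [100, 200], "geneC": [5, 10]}
normalized = normalize(toy_counts)
relative = {gene: scale_0_1(vals) for gene, vals in normalized.items()}
```

After normalization, each toy gene has the same value in both samples, because the only difference between the two libraries was sequencing depth. The 0-1 view then discards absolute abundance and keeps only the shape of each profile.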
One of the main purposes of the TraVA database is to make differential expression analysis easily available. In the Capsella SAM time series dataset, as well as in our previously published C. bursa-pastoris transcriptome map, we focused on the comparison of the expression levels of the homeologous genes. We used the "DESeq2" package [19] to compare the expression of Homeolog B with Homeolog A in each sample. The fold changes of the significantly differentially expressed genes are indicated in the third column (Figure 1).
Gene expression profiles and the results of the differential expression analysis can be downloaded in the .xls format.

The Area of Dataset Application
A detailed time series of C. bursa-pastoris SAMs is a valuable resource for analyzing the transition to flowering. Its representation in the TraVA database allows easy and quick access to gene expression profiles. Together with the ability to search for the Capsella orthologs of A. thaliana genes, it facilitates the analysis of possible floral regulators and their involvement in the different stages of the transition process.

Plant Growth and Sample Collection
Seeds of C. bursa-pastoris were planted on a 1:1 mix of vermiculite and soil and kept at 4 °C for 7 days. They were then transferred to a climatic chamber (Pol-eko Aparatura, Poland) with preset long-day conditions (16 h light/8 h dark cycle), at 22 °C and with 50% relative humidity. Four Philips Master TL5 HO 54W/840 lamps on each of the two shelves were used as the light source. Shoot apical meristems (SAMs) were collected from 8 to 16 days after germination (DAG), 10-11 h after the light was turned on. Each sample was harvested in two biological replicates containing 15 plants each. Hand-dissected material was fixed in RNAlater (Qiagen, Venlo, The Netherlands).

RNA Extraction and Sequencing
Total RNA was extracted using the RNeasy Mini Kit (Qiagen, Venlo, The Netherlands) following the manufacturer's protocol and immediately used for Illumina cDNA library construction to avoid degradation. The TruSeq RNA Sample Prep Kit v2 (Illumina, San Diego, CA, USA) was used for polyA mRNA collection in 0.4 of the recommended volume, and all the subsequent stages were performed with the NEBNext Ultra II RNA Library Prep Kit for Illumina (New England BioLabs, Ipswich, MA, USA) following the manufacturer's protocol in 0.5 of the recommended volume. cDNA libraries were sequenced on the NextSeq500 (Illumina, San Diego, CA, USA) in single-end mode with an 84 bp read length.

Data Processing
Reads were trimmed using the CLC Genomics Workbench 9.5.4 with the parameters "quality scores: 0.005; trim ambiguous nucleotides: 2; remove 5′ terminal nucleotides: 1; remove 3′ terminal nucleotides: 1; discard reads below length: 25". Read mapping to the genome-based custom reference transcriptome was performed as described in [13] with the CLC Genomics Workbench 9.5.4 and the parameters "only unique mapping allowed, % of length aligned = 100, % of mismatches = 1". The total number of reads uniquely mapped to a gene was used as a measure of its expression level. Differentially expressed homeologous pairs in a given sample were identified by comparing the expression level of Homeolog B with the expression level of the corresponding Homeolog A in the same sample. Differential expression was calculated with the R package "DESeq2" [19], and a fold change > 2 and false discovery rate (FDR) < 0.05 were used as the thresholds.
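The thresholding step can be sketched in Python as follows. This is an illustration only, not the DESeq2 procedure itself: it takes per-pair p-values as given, applies a Benjamini-Hochberg adjustment to obtain FDR values, and assumes that "fold change > 2" applies in either direction (Homeolog B higher or lower than Homeolog A). All names and numbers are hypothetical.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def differential_homeologs(pairs, min_fold=2.0, max_fdr=0.05):
    """Select homeologous pairs that pass both thresholds in one sample.

    pairs: list of (pair_id, expr_A, expr_B, pvalue) tuples, where expr_A and
    expr_B are normalized expression levels of Homeolog A and Homeolog B.
    Returns a list of (pair_id, log2 fold change of B over A, FDR).
    """
    fdr = benjamini_hochberg([p[3] for p in pairs])
    hits = []
    for (pair_id, expr_a, expr_b, _), q in zip(pairs, fdr):
        if expr_a <= 0 or expr_b <= 0:
            continue  # the ratio is undefined; skipped in this sketch
        log2_fc = math.log2(expr_b / expr_a)
        if abs(log2_fc) > math.log2(min_fold) and q < max_fdr:
            hits.append((pair_id, log2_fc, q))
    return hits


# Hypothetical input for one sample.
example_pairs = [
    ("pair1", 100.0, 400.0, 0.01),
    ("pair2", 100.0, 150.0, 0.04),
    ("pair3", 400.0, 100.0, 0.03),
    ("pair4", 100.0, 100.0, 0.50),
]
significant = differential_homeologs(example_pairs)
```

In this toy input only pair1 passes both filters. pair3 has a fourfold difference, but its adjusted p-value is just above 0.05, which shows why the two thresholds have to be applied together.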

Data Availability
RNA-seq data were deposited in the NCBI database under BioProject accession PRJNA632857.