Single-Cell Genomic Analysis in Plants

Individual cells in an organism are variable, which strongly impacts cellular processes. Advances in sequencing technologies have enabled single-cell genomic analysis to become widespread, addressing shortcomings of analyses conducted on populations of bulk cells. While the field of single-cell plant genomics is in its infancy, there is great potential to gain insights into cell lineage and functional cell types to help understand complex cellular interactions in plants. In this review, we discuss current approaches for single-cell plant genomic analysis, with a focus on single-cell isolation, DNA amplification, next-generation sequencing, and bioinformatics analysis. We outline the technical challenges of analysing material from a single plant cell, and then examine applications of single-cell genomics and the integration of this approach with genome editing. Finally, we indicate future directions we expect in the rapidly developing field of plant single-cell genomic analysis.


Introduction
Single-cell genomic analysis is the tracking and study of single isolated cells using sequencing technologies such as whole genome sequencing (WGS) and RNA sequencing. Single-cell sequencing enables high-resolution measurements of cell-to-cell variation that is masked in conventional bulk sequencing, in which each sequencing library consists of a population of cells rather than a single cell. Single-cell analysis has been increasingly used in mammalian studies in the past decade. Single-cell DNA (scDNA) sequencing was applied for single nucleotide and copy-number variation analysis of both tumour and normal single cells [1][2][3], as well as the analysis of recombination activity in germ cells [4,5]. Single-cell RNA (scRNA) sequencing is also widely used to identify gene expression dynamics between subpopulations of cells [6].
The importance of single-cell analysis was recognised when increasing evidence showed that distinct cell types in an organism undergo specific physiological processes and contain unique mutations [7]. In humans and animals, the somatic evolution of cells [8,9] and the recombination of germlines [4] generate genomic signatures indicative of temporal developmental stages. The significance of genetic dynamics is even greater in tumour cells, where genetic heterogeneity is common [10]. Similarly, plant tissues and cells are highly specialised, not only morphologically, but also biochemically and physiologically [11]. Early research has shown that the ion and metabolite distribution of individual epidermal cells in barley leaves varies depending on leaf developmental stage and light level [12]. This work highlighted the two main purposes of single-cell analysis: understanding the individuality of cell stages, and their differential response to environmental stimuli. High-resolution gene expression maps of Arabidopsis roots have shown that expression patterns do not always correlate with previously defined anatomical boundaries [13,14].
In shoots, isolated cell populations in the apical meristem displayed specific expression profiles, which contributed to the identification of stem cell markers [15]. Transcripts differentially expressed in cell types of the leaf epidermis were also observed in Arabidopsis [16], barley [17], and maize [18]. Gene expression studies have also successfully described the development and differentiation of other unique plant morphologies, such as stomatal cells [19], pollen [20,21], and female gametophytes [22]. Gene expression that differs between cell types in response to environmental stimuli suggests tight gene regulation. For example, Dinneny et al. [23] revealed that the transcriptional response of Arabidopsis root cells to salinity and iron deficiency is specific to the developmental stage of the cell. In a separate study, five Arabidopsis root cell types showed a distinct cellular response to nitrogen influx, such as the cell-specific regulation of hormone signalling [24]. The assumption of a universal stress response was also rejected in other studies [25,26]. Similarly, plant defence against biotic stress is tissue-specific. For example, the transcriptional state of rice root tissues differs from that of leaf tissues following rice blast fungus invasion [27].
The understanding that molecular characteristics in cell types of an individual organism vary has provided new perspectives on the conclusions drawn from previous bulk sequencing studies. Single-cell genomic analysis has successfully described cancer cell states, for example, of stem cells in leukaemia patients [28] and biological developmental processes such as ageing [29]. However, technical issues, such as cell isolation difficulties [30], have delayed the use of single-cell analysis in plants.
To date, two studies employed adapted protocols developed for animal systems to sequence Arabidopsis root cells and classify cells using clustering [31,32]. As a result, the process of root regeneration was successfully described [33].
Single-cell studies in plants have the potential to increase the resolution of previous studies in two major areas: (1) developmental dynamics of plant tissues to identify non-anatomical markers for important cell populations; and (2) plant stress signalling, responses, and adaptation. Here, we review the opportunities provided by plant single-cell analysis and discuss the experimental and analytical challenges that need to be addressed to maximise the scientific impact of this approach.

Challenges and Opportunities in Plant Single-Cell Analysis
Single-cell genomic analysis generally comprises four steps (Figure 1): single-cell preparation, DNA amplification, next-generation sequencing, and bioinformatics analysis [34,35]. The study of single cells in plants is still in its early stages. However, recent technological advances are driving increasing interest in plant single-cell studies (Tables 1 and 2).

Single-Cell Isolation
To perform single-cell experiments, cells of interest first need to be isolated. However, single-cell isolation is not a trivial task, especially in complex solid tissues [35], and the development and standardisation of best practices for isolation techniques is ongoing [36]. Traditionally, the first isolation step is to macerate or remove cell walls, allowing manipulation of individual cells in a suspension [35]. Compared to animal cells, plant cells usually have rigid exteriors, which complicates isolation [30]. Macerating plant cell walls using enzymatic digestion is a feasible solution [30]. Enzymatic hydrolysis was used to isolate single cells from potato leaves [37] and apple flesh [38], indicating that pectinase is a crucial enzyme in cell isolation. However, long enzymolysis times may damage the integrity and activity of cells [38]. Later, many studies improved this method; for instance, Jia et al. [39] used cellulose digestion to obtain protoplasts from wheat leaves.
After obtaining a suspension, several approaches are used for single-cell isolation, among which are serial dilution [40], micromanipulation [7], fluorescence-activated cell sorting (FACS) [41], and optical tweezers [51]. Serial dilution is the simplest approach for isolating a single cell in a single well. During the process, cells are serially diluted to approximately one cell per microlitre. However, owing to the low accuracy of serial dilution, this approach has rarely been used in recent single-cell studies. Micromanipulation is a simple and cheap method for isolating single cells such as early embryos [50]. However, micromanipulation is low-throughput, time-consuming, and has high misidentification rates [7,50]. FACS, on the other hand, is the most commonly used method to isolate individual cells based on size, granularity, and fluorescence of cells [52]. FACS has been made commercially available by companies such as BD Biosciences (San Jose, CA, USA) and Beckman Coulter (Brea, CA, USA) [53]. However, FACS requires a large number of cells in suspension (thousands of cells), which may affect the yield of low-abundance cell subpopulations. Additionally, due to the rapid flow, cells might be damaged during FACS [7]. Optical tweezers are an alternative, using a highly focused laser beam to capture cells [7]. With the assistance of imaging-based selection, optical tweezers can isolate cells in suspension or a cell array inside a microfluidic device [6].
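One way to see why serial dilution is considered inaccurate is to treat the number of cells landing in each well as a Poisson random variable. This is an illustrative assumption on our part rather than a model taken from the cited studies, and the function names below are hypothetical. Under this assumption, diluting to an average of one cell per well leaves a large share of wells empty or holding more than one cell.

```python
import math


def poisson_pmf(k: int, mean_cells: float) -> float:
    """Probability of exactly k cells in a well, assuming Poisson loading."""
    return math.exp(-mean_cells) * mean_cells ** k / math.factorial(k)


def well_occupancy(mean_cells: float) -> dict:
    """Fractions of empty, single-cell, and multi-cell wells (assumed model)."""
    empty = poisson_pmf(0, mean_cells)
    single = poisson_pmf(1, mean_cells)
    return {"empty": empty, "single": single, "multiple": 1.0 - empty - single}


if __name__ == "__main__":
    # Compare a sparse dilution with the "one cell per well" target.
    for mean in (0.3, 1.0):
        occupancy = well_occupancy(mean)
        print(mean, {name: round(p, 3) for name, p in occupancy.items()})
```

Under the Poisson assumption, lowering the mean reduces multi-cell wells only at the cost of many more empty wells, which is consistent with the low practical uptake of the method described above.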
In addition to suspension-based isolation methods, techniques such as laser microdissection (LMD), laser microdissection and pressure catapulting (LMPC), and laser capture microdissection (LCM) [42] are used to extract single cells in situ based on cellular morphology [50,54]. However, several drawbacks remain to be overcome, including low throughput, accidental slicing of cells during sectioning, UV damage to nuclei, and contamination from neighbouring cells [50,55]. Magnetic-activated cell sorting (MACS) is another commonly used single-cell isolation method. MACS is a column-based technique that isolates cells using antibodies, enzymes, or lectins to bind specific cell-surface proteins [56]. However, the high costs of the separation magnet, the columns, and the antibodies, together with the specific sensitivity to positively and negatively charged cell populations, make its usage far more limited than that of FACS [56].
More recently, microfluidic technologies have been shown to provide parallel, accurate, high-throughput, and sensitive single-cell isolation [43]. However, costly proprietary reagents are needed to complete the isolation when using commercial microfluidics platforms [34]. Additionally, microfluidic platforms require uniform cell sizes [57], limiting their applicability to cell samples of varying size. Currently, microfluidics is only being used to isolate animal cells, but it is expected to be applied to plant cells in the near future.

Whole Genome Amplification
The process of scDNA sequencing is considerably more challenging than scRNA sequencing. The main reason for this is the error-prone nature of the DNA amplification step, which is required, as there is a limited amount of DNA that can be extracted from a single cell. For instance, a single mammalian cell generally contains less than 10 picograms (pg) of DNA [56], and plant cells may contain between <0.1 pg and >120 pg, with a low modal weight of 0.6 picograms in flowering plants [58]. As DNA sequencing generally requires over 200 nanograms of DNA, and low-input protocols still require 500 picograms to 10 nanograms of DNA (https://nanoporetech.com/products/kits; https://www.neb.com/products), scDNA sequencing requires DNA amplification. However, DNA amplification leads to nonuniform coverage, allelic dropout, and false positive mutations [57]. These technical challenges affect the results of DNA sequencing and hamper downstream analyses, complicating the discovery of real biological variation. To solve the problems related to DNA amplification, several methods have been developed.
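The scale of amplification implied by these figures can be made concrete with simple arithmetic. The sketch below uses the DNA masses quoted above (0.6 pg modal plant cell content, under 10 pg for a mammalian cell, 200 ng for standard library preparation, 500 pg for low-input protocols). The function names are hypothetical, and the "doublings" figure assumes idealised, perfectly efficient doubling, which real amplification chemistries do not achieve.

```python
import math

PG_PER_NG = 1000.0  # picograms in one nanogram


def required_fold(input_pg: float, target_ng: float) -> float:
    """Fold amplification needed to reach target_ng from input_pg of DNA."""
    if input_pg <= 0 or target_ng <= 0:
        raise ValueError("masses must be positive")
    return target_ng * PG_PER_NG / input_pg


def doublings_needed(fold: float) -> int:
    """Number of ideal doublings needed to reach at least the given fold."""
    return max(0, math.ceil(math.log2(fold)))


if __name__ == "__main__":
    scenarios = [
        ("plant cell (0.6 pg) to standard input (200 ng)", 0.6, 200.0),
        ("mammalian cell (10 pg) to standard input (200 ng)", 10.0, 200.0),
        ("plant cell (0.6 pg) to low input (0.5 ng)", 0.6, 0.5),
    ]
    for label, input_pg, target_ng in scenarios:
        fold = required_fold(input_pg, target_ng)
        print(f"{label}: {fold:,.0f}-fold, {doublings_needed(fold)} ideal doublings")
```

Even the low-input case requires amplification of several hundred fold from a typical plant cell, which is why amplification artefacts are difficult to avoid in scDNA sequencing.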
PCR-based methods such as linker-adapter PCR (LA-PCR) [59], interspersed repetitive sequence PCR (IRS-PCR) [60], primer extension preamplification PCR (PEP-PCR) [61], and degenerate oligonucleotide-primed PCR (DOP-PCR) [62] were initially used for scDNA amplification. However, the low genome coverage (~10%), limited product yield, severe amplification biases, and allelic dropout substantially limited these approaches [57]. Later, multiple displacement amplification (MDA) [44] was developed and widely used in scDNA amplification. MDA is simple to apply and generates a high genome coverage (>90%) with a low false positive rate (~10⁻⁷) [57]. However, nonuniform coverage and high allelic dropout rates (~31-65%) lower the sensitivity of MDA to copy number variation (CNV) [56]. To increase uniformity of coverage and decrease allelic dropout, a method called multiple annealing and looping-based amplification cycles (MALBAC) [2] was developed. The allelic dropout rate in MALBAC is reduced to ~1%, and almost 93% genome coverage can be achieved at an average sequencing depth of 25× [2]. Moreover, MALBAC is particularly useful for CNV and single nucleotide variant (SNV) detection. However, the high false positive rate of MALBAC requires further improvement. Another amplification method is the microwell displacement amplification system (MIDAS) [45]. MIDAS uses a massively parallel polymerase cloning method to reduce amplification bias and alleviate nonuniform coverage [63]. Compared to MDA, this method can reduce the reaction volume ~1000-fold. MIDAS also reduces the template concentration required and the level of contamination [56].

Whole Transcriptome Amplification
Previous studies using population samples have provided insights into the distribution of gene expression levels across cells. However, the bulk cells used in RNA sequencing make it difficult to quantify gene expression in individual cells. Studies applying scRNA sequencing can shed light on variability in gene expression across cells. As the RNA material in a single cell is insufficient for scRNA sequencing, whole transcriptome amplification (WTA) is required. Compared to whole genome amplification (WGA), WTA is less challenging because the presence of multiple transcript copies reduces the dropout rate. In recent years, numerous technologies have been developed to improve WTA. Although WTA methods have improved their throughput, sensitivity, accuracy, and precision, the challenges of amplification bias and additional noise remain [64].
To characterise the transcriptome of a single cell, mRNA must be reverse-transcribed into cDNA before WTA. Prior to the use of next-generation sequencing (NGS), cDNA microarrays were applied to analyse gene expression from single cells. However, this method was less sensitive and could miss many rare but key transcripts [65]. To overcome this limitation of microarrays, in 2010, Tang et al. [66] improved the WTA method and used NGS to detect genes and splice junctions in one cell. In their method, oligo-deoxythymidine (oligo-dT) primers with anchor sequences were used for mRNA reverse transcription before PCR amplification. However, this method could generate a 3′-end mRNA bias, mainly due to the limited length of cDNAs [67]. To alleviate this situation, a WTA method named SMART-seq [46] was developed. SMART-seq generates and amplifies full-length cDNA from single cells using Moloney murine leukaemia virus (MMLV) reverse transcriptase. However, the low sensitivity of SMART-seq prompted development of the improved SMART-seq2 approach [47]. SMART-seq2 enables researchers to detect gene expression differences in multiple samples, at the expense of a strong 5′-end bias.
Several in vitro transcription (IVT) methods were developed, including cell expression by linear amplification sequencing (Cel-seq) [48]. The main benefit of IVT is linear amplification, which reduces amplification bias compared to exponential amplification methods such as PCR [7]. However, the bias towards the 3′-end is difficult to control, which impedes the detection of the full spectrum of transcript variants [7]. To mitigate amplification bias, unique molecular identifiers (UMIs) are used in single-cell WTA [49]. UMIs can be implemented for quantitative scRNA sequencing with absolute molecule counts. More recently, droplet-based RNA-seq technologies have been released, including the commercial Chromium System platform (10X Genomics, Pleasanton, CA, USA). Droplet-based RNA-seq technologies can identify the cell of origin of each mRNA molecule to help study single cells in complex tissues. The low level of noise generated by this approach has enabled the analysis of thousands of different cells in parallel [50].
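The idea behind UMI-based quantification can be sketched in a few lines: reads that share the same cell barcode, gene, and UMI are assumed to be amplification copies of one original molecule and are counted once. The example below is a minimal illustration with a hypothetical input format of (cell barcode, gene, UMI) tuples. It does not correct sequencing errors within UMIs, which dedicated tools typically handle.

```python
from collections import Counter, defaultdict


def count_reads(reads):
    """Raw read counts per (cell, gene), inflated by amplification duplicates."""
    return dict(Counter((cell, gene) for cell, gene, _umi in reads))


def count_molecules(reads):
    """Molecule counts per (cell, gene): each distinct UMI is counted once."""
    umis_seen = defaultdict(set)
    for cell, gene, umi in reads:
        umis_seen[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


if __name__ == "__main__":
    # Hypothetical reads: one molecule in cellA was amplified three times.
    example_reads = [
        ("cellA", "gene1", "AAA"),
        ("cellA", "gene1", "AAA"),
        ("cellA", "gene1", "AAA"),
        ("cellA", "gene1", "CCG"),
        ("cellB", "gene1", "AAA"),
    ]
    print("reads:    ", count_reads(example_reads))
    print("molecules:", count_molecules(example_reads))
```

Comparing the two outputs shows how uneven amplification distorts read counts while leaving the UMI-based molecule counts unchanged.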

Bioinformatics Analysis
Bioinformatics analysis is essential in providing biological insights and achieving the aims of single-cell experiments, such as detecting variants, quantifying gene expression, and identifying subpopulations. However, conventional bioinformatics tools developed for bulk-cell genomics cannot be directly applied to single-cell sequencing data. Due to the low amount of raw genetic material, single-cell data is limited by low sequencing coverage and high amplification bias. Analytical challenges to differentiate between technical noise and true variants are further complicated by the lack of biological replicates. Furthermore, the large genome size, highly repetitive regions, whole genome duplications, and large numbers of gene families in plant genomes make bioinformatics analysis difficult [68].
To achieve a genome coverage of above 90%, 30× sequencing depth is required in single-cell sequencing, in contrast to 4× depth in bulk-cell sequencing [69]. This low coverage characteristic of single-cell sequencing data has posed difficulties in the variant calling procedure. Most bioinformatics tools employ sequence read density to call variants. Single nucleotide polymorphisms (SNPs) and small insertions/deletions with low read support are excluded in conventional bioinformatics tools. This problem is particularly evident in algorithms used to detect CNV, which strongly rely on read counts. In genome assemblies, the low coverage and heterogeneity of single-cell sequencing data also bring substantial disadvantages, leading to truncated sequences with high numbers of sequencing artefacts [35]. Recently, single-cell assemblers such as SPAdes [70] and IDBA-UD [71] have been specifically developed to overcome the challenge of amplification artefacts in single-cell sequencing and generate more precise single-cell genomic assemblies.
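The gap between the depths quoted above can be put in context with an idealised model in which reads are sampled uniformly at random along the genome, so that the expected fraction of bases covered at mean depth c is 1 - exp(-c). This model and the function names below are assumptions introduced here for illustration. They are not the method used in the cited study, and the model deliberately ignores amplification bias.

```python
import math


def expected_breadth(mean_depth: float) -> float:
    """Expected covered fraction of the genome under uniform random sampling."""
    if mean_depth < 0:
        raise ValueError("mean_depth must be non-negative")
    return 1.0 - math.exp(-mean_depth)


def depth_for_breadth(target_breadth: float) -> float:
    """Mean depth needed to reach a target covered fraction in the ideal model."""
    if not 0.0 < target_breadth < 1.0:
        raise ValueError("target_breadth must be between 0 and 1")
    return -math.log(1.0 - target_breadth)


if __name__ == "__main__":
    print(f"ideal breadth at 4x:  {expected_breadth(4.0):.3f}")
    print(f"ideal breadth at 30x: {expected_breadth(30.0):.3f}")
    print(f"ideal depth for 90%:  {depth_for_breadth(0.90):.2f}x")
```

In this idealised setting a few-fold depth already covers more than 90% of the genome, so the need for roughly 30× depth in single-cell data can be read as a measure of how strongly amplified libraries depart from uniform sampling.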
In scRNA sequencing, the loss of coverage leads to poorly represented low-abundance transcripts, as well as incomplete transcripts with a 3′-end bias. These transcripts affect the accurate detection of gene expression levels [72] and limit the detection of alternative splicing. For example, single blastomere cell RNA sequencing in mice produced transcripts that were approximately 3 kb shorter compared with those from conventional RNA sequencing, resulting in the loss of 36% of expressed genes [73]. Common gene expression metrics such as fragments or reads per kilobase of transcript per million mapped reads (FPKM/RPKM) do not address these 3′-end biases [69] and thus have a limited application for scRNA sequencing. To overcome the biased quantification of gene expression resulting from incomplete transcript amplification, an unbiased metric for gene expression is required. For instance, a novel statistical approach provided by Korthauer et al. [74] allows an unbiased characterisation of differences in transcript expression distribution. By utilising a Bayesian modelling framework, this approach can characterise differences of expression in scRNA sequencing experiments and identify biological heterogeneity in the form of multi-modal, differentially distributed expression. A second strategy is to normalise the differences in the single-cell transcripts. In this case, gene expression levels are quantified based on the normalised RNA sequencing data instead of the full-length RNA transcripts [72]. Finally, a third method for characterising gene expression is to apply unsupervised methods such as principal component analysis (PCA). PCA is a linear dimensionality-reduction method that projects cells into two or three dimensions, in which similar cells group together [75]. In addition, machine learning approaches have become an effective tool to address low sequencing coverage and amplification artefacts in scRNA sequencing. For example, Wang et al. [76] developed a machine learning algorithm called single-cell interpretation via multiple-kernel learning (SIMLR). The authors reanalysed seven representative scRNA sequencing datasets with random amplification biases, obtaining a higher clustering sensitivity and accuracy. Lin et al. [77] introduced a neural network approach to analyse scRNA sequencing data. The neural network simplifies scRNA sequencing data by reducing their dimensionality and accurately predicts cell type or state by querying a database with thousands of single-cell transcriptome profiles.
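To illustrate how PCA separates groups of cells, the sketch below computes the first principal component of a small, made-up expression matrix using only the Python standard library. The data, the function names, and the use of power iteration are illustrative assumptions. Real analyses would normally rely on an optimised numerical library and on many more genes and cells.

```python
import math


def center_columns(matrix):
    """Subtract each gene's mean so that every column has mean zero."""
    n_cells = len(matrix)
    means = [sum(column) / n_cells for column in zip(*matrix)]
    return [[value - mean for value, mean in zip(row, means)] for row in matrix]


def covariance(centered):
    """Gene-by-gene sample covariance matrix of centred data."""
    n_cells, n_genes = len(centered), len(centered[0])
    return [
        [sum(row[i] * row[j] for row in centered) / (n_cells - 1)
         for j in range(n_genes)]
        for i in range(n_genes)
    ]


def first_component(cov, iterations=200):
    """Leading eigenvector of the covariance matrix via power iteration."""
    vector = [1.0] * len(cov)
    for _ in range(iterations):
        vector = [sum(c * v for c, v in zip(row, vector)) for row in cov]
        norm = math.sqrt(sum(v * v for v in vector))
        vector = [v / norm for v in vector]
    return vector


def project(centered, component):
    """Score of each cell along the given component."""
    return [sum(x * w for x, w in zip(row, component)) for row in centered]


if __name__ == "__main__":
    # Hypothetical log-expression values: rows are cells, columns are genes.
    cells = [
        [5.0, 4.8, 0.2], [5.2, 5.1, 0.1], [4.9, 5.0, 0.3],  # assumed type A
        [0.3, 0.2, 4.9], [0.1, 0.4, 5.1], [0.2, 0.1, 5.0],  # assumed type B
    ]
    centered = center_columns(cells)
    scores = project(centered, first_component(covariance(centered)))
    print([round(score, 2) for score in scores])
```

Cells of the two assumed types receive scores of opposite sign along the first component, which is the separation that a two-dimensional PCA plot makes visible.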
The amplification bias in single-cell sequencing is another challenge for bioinformatics tools tailored for bulk-cell sequencing. In CNV detection, the amplification bias in scDNA sequencing can lead to the generation of multiple reads that obscure the correct prediction of CNVs. It is therefore necessary to examine the amplification bias to identify the associated pattern of candidate CNVs, including GC content, variant position, and repeat sequences [69]. This additional information can be incorporated into novel CNV identification algorithms, which combine a synthetic-normal-based DNA sequencing tool (SynthEx) with allele-specific copy number analysis (ASCN) [78], to address amplification bias and reduce unexpected variations in single-cell sequencing data [79]. Other newly developed single-cell CNV detection algorithms, which include GC correction [80], binary segmentation [81] and rank segmentation [82], have also enabled high detection accuracy at the base-pair level.
Amplification bias also leads to a large proportion of false-positive SNP calls [35]. Zong et al. [2] indicated that errors in amplification during single-cell amplification could cause around one in 20 false-positive SNPs using the Genome Analysis Toolkit (GATK) [83]. Additionally, amplification failure of one or both alleles results in a high rate of allelic dropout [84], which contributes to the phenomenon of missing heterozygosity. Zong et al. [2] estimated that the allelic dropout rate could reach up to 60% for scDNA sequencing, which leads to inaccurate SNP calling. To reduce the false-positive SNPs produced by the amplification bias, two common strategies are employed. Firstly, SNPs from bulk DNA samples can be used as a reference to filter out the false-positive results [35]. Secondly, SNPs can be verified within two to three different single-cell samples, which can effectively reduce the false-positive variants introduced by amplification errors [69]. Nevertheless, no specific research has been carried out to investigate the actual number of single-cell samples that are required to validate SNPs in interrogated genomic regions. To avoid allelic dropout, one possible strategy is to apply a further filtering algorithm that identifies and removes noisy SNPs based on control groups [85].
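The two filtering strategies described above can be sketched as simple set operations. In this illustration, SNP calls are represented as hypothetical (chromosome, position, alternate allele) tuples, and the function names and the default threshold of two supporting cells are assumptions rather than parameters from the cited studies.

```python
from collections import Counter


def confirmed_by_bulk(cell_calls, bulk_snps):
    """Keep single-cell SNP calls that are also present in the bulk reference."""
    all_calls = set().union(*cell_calls.values()) if cell_calls else set()
    return all_calls & set(bulk_snps)


def confirmed_by_cells(cell_calls, min_cells=2):
    """Keep SNP calls observed independently in at least min_cells cells."""
    support = Counter(snp for calls in cell_calls.values() for snp in set(calls))
    return {snp for snp, n_cells in support.items() if n_cells >= min_cells}


if __name__ == "__main__":
    # Hypothetical calls from three amplified single cells.
    calls = {
        "cell1": {("chr1", 100, "A"), ("chr1", 200, "T")},
        "cell2": {("chr1", 100, "A")},
        "cell3": {("chr1", 300, "G")},
    }
    bulk = {("chr1", 300, "G")}
    print("supported by bulk:    ", confirmed_by_bulk(calls, bulk))
    print("supported by >=2 cells:", confirmed_by_cells(calls))
```

Note that both filters trade sensitivity for specificity: a genuine variant present in only one cell and absent from the bulk sample would be discarded along with the amplification errors.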

Applications of Single-Cell Analysis in Plants
Prior to the development of modern single-cell technology, specific cell types such as root hairs [86][87][88], cotton fibres [89], and trichomes [90] served as early single-cell-type models due to their easy isolation. When compared to bulk-cell studies, these single-cell-type models increased the resolution of our understanding of cellular processes and differentiation of plant roots, cell walls, and shoot epidermal hair. For example, despite being morphologically recognised as leaf trichomes, gene expression profiles during secondary wall cellulose synthesis in cotton fibres resembled sclerenchyma cells [89,91]. In another example, transcriptomes of root hair single cells isolated from soybean contain only 25% of the transcription factors found in whole root transcriptome studies [88].
Plant cells show high developmental plasticity, and differentiated somatic plant cells can be stimulated to form embryos in culture [92]. However, it remains unclear whether plant cell-fate regulation is a lineage-dependent mechanism, as in animals [93], or based on cell relative position [94], or a mix of both [95]. Single-cell analysis can be used to map individual cell stage from initial to differentiated, therefore shedding light on regeneration mechanisms, cell-fate regulation, and totipotency in general. Protocols for single-cell lineage tracing were established in animal and human studies [96], and could be adapted for use in plant analysis. Recent single-cell analysis of Arabidopsis roots showed that multiple cell types could rapidly reconstitute stem cells by replaying the patterns of embryogenesis [33], therefore supporting the notion of a decentralised stem cell control system [97]. Single-cell transcriptomics can further contribute to the identification of critical genes in regeneration, which can be tracked and used as markers for developmental studies.
Due to environmental variation, stress tolerance of plants has always been of great interest for both disease resistance and trait improvement in crop breeding. Whole tissue bulk material is widely used to understand stress signalling in plants (examples in Arabidopsis [98][99][100]) and to detect markers such as nucleotide polymorphisms (e.g., in soybean flowering [101]) and CNVs (e.g., in rice grain size [102]) as the basis of crop breeding programs. However, as stress regulation is cell-type-specific [103], bulk tissue analysis dilutes plant response signals and overlooks cell-type-specific structural variation. Advances in single-cell sequencing can thus offer novel insights into stress adaptation in plants, particularly for modelling gene regulatory networks. For example, plant hormones are the key mediators of stress response [104], yet the interactions between hormone signalling pathways are poorly understood [105]. A recent analysis using single-cell information showed that interactions between hormones directly manipulate tissue formation and patterning [33]. This work could be applied to model hormone signalling networks in stress responses, such as dissecting the conflicting evidence of ethylene as a positive or negative regulator during high salinity stress in different species at different developmental stages [106], as well as ethylene-jasmonate-abscisic acid crosstalk [107][108][109]. Single-cell analysis can also detect novel regulatory processes. One example is the identification of new rhizobial infection-related genes and novel processes in Medicago root hair that were previously undetected in bulk-cell whole-root studies [110]. There is also increasing evidence of the regulation of stress response by alternative splicing [111]. For example, alternative isoforms of resistance genes regulate defence against tobacco mosaic virus [112], and alternative splicing occurs as a result of temperature-induced stress in Arabidopsis [113]. As gene isoforms were also shown to be allocated to different cell types [114], single-cell analysis has the potential to mark and track alternative transcripts following developmental stages and stimuli.
Applying single-cell analysis in plants can discover unknown cell types through deconvoluting heterogeneous cell populations by unbiased identification of biological variation between adjacent cell states. The current description of plant cell states is still widely based on morphology and known markers [30]. Signatures of rare subpopulations can be detected through single-cell technology, as demonstrated in human T cells [75]. In addition, the development of single-cell analysis in plants will contribute to the collection of physiologically-based markers and serve as a foundation for cell type marking in future work.

Integration with Genome Editing
The genome editing technology clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein (Cas) enables the prediction of gene function using highly parallel pooled mutation screens [115,116]. Current CRISPR/Cas-pooled screens suffer from limited resolution to study individual mutant genotypes and associated transcriptomes, leading to high false-positive and false-negative rates [72,117]. Recently, this limitation of CRISPR/Cas screens has been overcome by combining them with scRNA sequencing in approaches referred to as CRISPR-seq [118], CROP-seq [119], and Perturb-seq [120]. These approaches allow detection of the transcriptional effects of multiple gene disruption events in hundreds of thousands of single cells. Although these techniques have only been applied in mammalian cells, they have potential to shed light on gene functions and regulatory pathways in plants. In particular, model and crop plants are suitable for these genetic screens, as high-quality genome assemblies and knowledge of target genes are important prerequisites.
CRISPR-seq and related techniques rely on a guide RNA (gRNA) vector with a unique barcode that can be detected in scRNA sequencing, a massively parallel scRNA sequencing assay, and a bioinformatics pipeline for obtaining gRNAs from single-cell transcriptomes and analysing the generated transcriptional profiles. CRISPR-seq requires compartmentalisation of each gRNA and its biological signal in a single cell [119]. A gRNA is transferred into one single cell in the pool, inducing a specific knockout in a targeted gene. By measuring the gRNA in each cell and its corresponding transcriptome, scRNA sequencing can directly detect the gene expression changes caused by each targeted gene knockout across a large number of cells. For example, Jaitin et al. [118] were able to measure the expression of single gRNAs at 50,000 cells per well in a 500 µL culture solution. Compared with the classical pooled screening method (gene knockout followed by transcriptome analysis), CRISPR-seq combines gene knockout and expression analysis in one step to provide a simpler, cheaper, more flexible, and more efficient method to study biological mechanisms in various cell states or cell types [118]. Jaitin et al. [118] utilised CRISPR-seq to investigate the regulatory mechanism of myeloid cells during cell differentiation and the expression level of significant developmental and immune-related regulators. They indicated that the transcription factors CEBPB and IRF8 play opposing roles in regulating development of monocyte/macrophage versus dendritic cell lineages. Wang et al. [121] also applied CRISPR-seq to identify significant genes required for mammalian cell proliferation and formation of cancer cells.
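The bioinformatics step described above, linking each cell's gRNA barcode to its transcriptome, can be illustrated with a minimal sketch. This is not the pipeline of the cited studies: the data layout, the function names, and the thresholds (a minimum barcode count and a dominance fraction) are hypothetical choices made for illustration.

```python
from collections import defaultdict


def assign_guide(guide_counts, min_umis=3, min_fraction=0.8):
    """Return the dominant gRNA of a cell, or None if the call is ambiguous."""
    total = sum(guide_counts.values())
    if total == 0:
        return None
    guide, top = max(guide_counts.items(), key=lambda item: item[1])
    if top >= min_umis and top / total >= min_fraction:
        return guide
    return None


def mean_expression_by_guide(cells, gene):
    """Average expression of one gene across cells grouped by assigned gRNA."""
    totals, n_cells = defaultdict(float), defaultdict(int)
    for cell in cells:
        guide = assign_guide(cell["guides"])
        if guide is None:
            continue  # skip cells without a confident gRNA assignment
        totals[guide] += cell["expression"].get(gene, 0.0)
        n_cells[guide] += 1
    return {guide: totals[guide] / n_cells[guide] for guide in totals}


if __name__ == "__main__":
    # Hypothetical cells with gRNA barcode counts and one expression value.
    pool = [
        {"guides": {"g_ctrl": 10}, "expression": {"geneX": 8.0}},
        {"guides": {"g_ctrl": 6, "g_ko": 1}, "expression": {"geneX": 6.0}},
        {"guides": {"g_ko": 9}, "expression": {"geneX": 1.0}},
        {"guides": {"g_ko": 4, "g_ctrl": 4}, "expression": {"geneX": 5.0}},
        {"guides": {}, "expression": {"geneX": 7.0}},
    ]
    print(mean_expression_by_guide(pool, "geneX"))
```

Cells carrying no barcode or an ambiguous mixture of barcodes are excluded, which reflects the requirement that each gRNA and its transcriptional signal be confined to a single cell.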
When the difficulties of isolating single plant cells are overcome, CRISPR-seq will become a new-generation genome editing tool for improving knowledge of plant genetics, with a potentially substantial impact on plant breeding. CRISPR-seq enables sequential knockout of target genes in crop cells, allowing large-scale gene function analysis across cell lineages. For instance, CRISPR-seq might be applied for studying the expression level of regulatory factors such as LEC1, WUS, and ODP2 during cell proliferation and differentiation [122]. A better understanding of these regulatory factors could help induce the formation of somatic embryos from plant tissue cultures to accelerate the breeding cycle [122]. With genome editing now allowing precise modification of DNA and RNA [123], tools to assay plant cells for suitable functional editing targets will become increasingly important.

Data Repository for Plant Cells
As the study of single cells is still evolving, protocols used in WGA and WTA at the moment are diverse and difficult to standardise [34]. Algorithms developed for data extraction and compilation are different. In single-cell studies, the amount of genome and transcriptome data generated poses a potential challenge for data storage and sharing. To efficiently document each single-cell experiment, data repositories are required and should be able to categorise each data format and make data reusable, shareable, and comparable. To achieve this, proper data management and novel algorithms are needed to ensure users track experimental parameters and allow upload and download of plant single-cell data.
In the study of bulk cells or tissues, data repositories, such as the National Center for Biotechnology Information (NCBI), provide a good example for data storage and management. However, although NCBI already provides a similar service for single-cell sequencing data, it has overlooked the importance of, and demand for, experimental metadata such as molecular information.
In the near future, comprehensive data repositories for single cells are expected. Some standardised experimental data formats similar to the established sequence format FASTQ or the alignment map format BAM are also needed to make the study of single cells more robust.

Conclusions
Single-cell genomic analysis provides novel solutions for studying cells that play important roles in system behaviour, tissue development, regeneration, and repair. By studying biological diversity in plant cells or tissues, the development of plant organs and the response of plants to environmental stress will be better understood. Combined with gene editing technologies and modelling of regulatory networks for target discovery, single-cell sequencing will boost crop improvement. Although challenges remain in single-cell preparation, DNA/RNA amplification, DNA sequencing, and bioinformatics analysis, the rapid evolution of single-cell technologies is expected to play an important role in feeding the world by helping to breed high-yielding and stress-tolerant elite cultivars.