# Analysis of Single-Cell RNA-Sequencing Data: A Step-by-Step Guide

^{1}

^{2}

^{3}

^{4}

^{5}

^{6}

^{7}

^{8}

^{*}

^{†}

## Abstract

**:**

## 1. Introduction

## 2. Material and Methods

#### 2.1. Data Description

#### 2.2. Raw Data Download

#### 2.3. Data Preprocessing

#### 2.4. Quality Check of Raw Reads

^{−3}; the error rate is 0.1%). A high-quality score implies that a base call is more reliable and less likely to be incorrect; for example, p = 10

^{−10}, Q = 100 (1 in 10 billion), which is unrealistic and unlikely.

#### 2.5. Extracting the Cell Barcodes, UMIs and Reads

#### 2.6. Mapping to Reference Genome to Obtain Read Counts

#### 2.7. Assigning Reads to Genes

## 3. Statistical Modeling and Data Analysis

#### 3.1. Mathematical Models for scRNA Count Data

#### 3.1.1. Negative Binomial Model

#### 3.1.2. Zero-Inflated Negative Binomial Model

#### 3.2. Zero-Inflation Analysis

_{0}; ${\widehat{\Omega}}_{i}$: is the unconstrained MLE of ${\Omega}_{i}$ for the ith gene and ${\Omega}_{i}$ is the parametric space for the ith gene, i.e., ${\Omega}_{i}=\left\{{\mu}_{i},{\theta}_{i},{\pi}_{i}\right\}$. The test statistic in Equation (10) is asymptotically distributed as a chi-squared distribution with 1 degree of freedom under H

_{0}. Since droplet-based single-cell sequencing methods can capture approximately 1–10% of mRNA from the cells, ‘zero’ counts (for low expressed genes) or dropout events (due to stochasticity of expression) are observed in single-cell data [9,48,49].

#### 3.3. Clustering

#### 3.4. K-means Clustering

#### 3.5. Determination of the Optimum Number of Clusters

^{th}cluster (k = 1, 2,…, K); ${Y}_{jk.}$ be the mean expression value of the jth cell in k

^{th}cluster; ${Y}_{..k}$ be the mean expression value of the k

^{th}cluster; and ${Y}_{\dots}$ be the overall mean across M cells. Then, the total sum of squares (TSS) can be expressed as

#### 3.6. Differential Gene Expression Analysis

#### 3.7. DESeq2

_{ij}for the ith gene in the jth cell is described with the NB generalized linear model by the following expression:

_{ij}and the gene-specific dispersion factor = α

_{i}. The fitted mean comprises a sample-specific size factor s

_{j}and a parameter q

_{ij}, the expected count of fragments for the jth cell. The coefficients β

_{i}give the log2 fold changes for the ith gene for each column of the model matrix. The DESeq2 first estimates the size factors that account for the differences in the library size, then estimates the dispersion for each gene and, lastly, fits a generalized linear model [26]. The DESeq2 uses the Wald statistic to calculate the p-value and size effect estimate for the log2 fold change.

#### 3.8. DEsingle

_{0}. The normalization step is usually conducted before DE analysis to correct the amplification bias. We used the median normalization method, as implemented in DEsingle and DESeq2, to normalize the scRNA-seq count data (Supplementary Document S2).

#### 3.9. Performance Evaluation of DE Methods

## 4. Results and Discussion

#### 4.1. Quality Control

#### 4.2. Extracting UMIs, Cell Barcodes and UMIs

#### 4.3. Mapping

#### 4.4. Quantification

#### 4.5. Determining the Distribution of Zeros in Data

#### 4.6. Distribution of Cell Sequencing Depths

#### 4.7. Clustering Analysis

#### 4.8. Study the Effect of Zero’s Reduction on the Determination of Optimum Cell Clusters

#### 4.9. Case 1: No Reduction

#### 4.10. Case 2: Reduction in the Number of Genes when many Cells have Zero Counts

#### 4.11. Differential Expression Analysis

#### 4.12. Evaluating Performance

## 5. Conclusions

## Supplementary Materials

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

- Vallejos, C.A.; Richardson, S.; Marioni, J.C. Beyond comparisons of means: Understanding changes in gene expression at the single-cell level. Genome Biol.
**2016**, 17, 1. [Google Scholar] [CrossRef][Green Version] - Hwang, B.; Lee, J.H.; Bang, D. Single-cell RNA sequencing technologies and bioinformatics pipelines. Exp. Mol. Med.
**2018**, 50, 96. [Google Scholar] [CrossRef] [PubMed][Green Version] - Lavin, Y.; Kobayashi, S.; Leader, A.; Amir, E.-A.D.; Elefant, N.; Bigenwald, C.; Remark, R.; Sweeney, R.; Becker, C.D.; Levine, J.H.; et al. Innate Immune Landscape in Early Lung Adenocarcinoma by Paired Single-Cell Analyses. Cell
**2017**, 169, 750–765.e17. [Google Scholar] [CrossRef] [PubMed][Green Version] - Tang, F.; Barbacioru, C.; Wang, Y.; Nordman, E.; Lee, C.; Xu, N.; Wang, X.; Bodeau, J.; Tuch, B.B.; Siddiqui, A.; et al. mRNA-Seq whole-transcriptome analysis of a single cell. Nat. Methods
**2009**, 6, 377–382. [Google Scholar] [CrossRef] - Scialdone, A.; Natarajan, K.N.; Saraiva, L.; Proserpio, V.; Teichmann, S.; Stegle, O.; Marioni, J.C.; Buettner, F. Computational assignment of cell-cycle stage from single-cell transcriptome data. Methods
**2015**, 85, 54–61. [Google Scholar] [CrossRef] [PubMed] - Picelli, S.; Bjorklund, Å.K.; Faridani, O.; Sagasser, S.; Winberg, G.; Sandberg, R. Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat. Methods
**2013**, 10, 1096–1098. [Google Scholar] [CrossRef] [PubMed] - Brink, S.C.V.D.; Sage, F.; Vértesy, Á.; Spanjaard, B.; Peterson-Maduro, J.; Baron, C.; Robin, C.; Van Oudenaarden, A. Single-cell sequencing reveals dissociation-induced gene expression in tissue subpopulations. Nat. Methods
**2017**, 14, 935–936. [Google Scholar] [CrossRef] [PubMed] - Hashimshony, T.; Senderovich, N.; Avital, G.; Klochendler, A.; de Leeuw, Y.; Anavy, L.; Gennert, D.; Li, S.; Livak, K.J.; Rozenblatt-Rosen, O.; et al. CEL-Seq2: Sensitive highly-multiplexed single-cell RNA-Seq. Genome Biol.
**2016**, 17, 1–7. [Google Scholar] [CrossRef][Green Version] - Macosko, E.Z.; Basu, A.; Satija, R.; Nemesh, J.; Shekhar, K.; Goldman, M.; Tirosh, I.; Bialas, A.R.; Kamitaki, N.; Martersteck, E.M.; et al. Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets. Cell
**2015**, 161, 1202–1214. [Google Scholar] [CrossRef] [PubMed][Green Version] - Zemmour, D.; Zilionis, R.; Kiner, E.; Klein, A.M.; Mathis, D.; Benoist, C. Single-cell gene expression reveals a landscape of regulatory T cell phenotypes shaped by the TCR. Nat. Immunol.
**2018**, 19, 291–301. [Google Scholar] [CrossRef] [PubMed] - Jaitin, D.A.; Kenigsberg, E.; Keren-Shaul, H.; Elefant, N.; Paul, F.; Zaretsky, I.; Mildner, A.; Cohen, N.; Jung, S.; Tanay, A.; et al. Massively parallel single-cell RNA-seq for marker-free decomposition of tissues into cell types. Science
**2014**, 343, 776–779. [Google Scholar] [CrossRef] [PubMed] - Ramsköld, D.; Luo, S.; Wang, Y.-C.; Li, R.; Deng, Q.; Faridani, O.; Daniels, G.A.; Khrebtukova, I.; Loring, J.F.; Laurent, L.; et al. Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nat. Biotechnol.
**2012**, 30, 777–782. [Google Scholar] [CrossRef][Green Version] - Ziegenhain, C.; Vieth, B.; Parekh, S.; Reinius, B.; Guillaumet-Adkins, A.; Smets, M.; Leonhardt, H.; Enard, W. Comparative Analysis of Single-Cell RNA Sequencing Methods. Mol. Cell
**2017**, 65, 631–643.e4. [Google Scholar] [CrossRef][Green Version] - Wang, Z.; Gerstein, M.; Snyder, M. RNA-seq: A revolutionary tool for transcriptomics. Nat. Rev. Genet.
**2009**, 10, 57–63. [Google Scholar] [CrossRef] [PubMed] - Kolodziejczyk, A.; Kim, J.K.; Svensson, V.; Marioni, J.; Teichmann, S.A. The technology and biology of single-cell RNA sequencing. Mol. Cell
**2015**, 58, 610–620. [Google Scholar] [CrossRef] [PubMed][Green Version] - Das, S.; Rai, A.; Merchant, M.L.; Cave, M.C.; Rai, S.N. A Comprehensive Survey of Statistical Approaches for Differential Expression Analysis in Single-cell RNA Sequencing Studies. Genes
**2021**, 12, 1947. [Google Scholar] [CrossRef] - Bacher, R.; Kendziorski, C. Design and computational analysis of single-cell RNA-sequencing experiments. Genome Biol.
**2016**, 17, 63. [Google Scholar] [CrossRef] [PubMed][Green Version] - Brennecke, P.; Anders, S.; Kim, J.K.; Kolodziejczyk, A.; Zhang, X.; Proserpio, V.; Baying, B.; Benes, V.; Teichmann, S.; Marioni, J.; et al. Accounting for technical noise in single-cell RNA-seq experiments. Nat. Methods
**2013**, 10, 1093–1095. [Google Scholar] [CrossRef] [PubMed] - Blower, M.D.; Jambhekar, A.; Schwarz, D.S.; Toombs, J. Combining Different mRNA Capture Methods to Analyze the Transcriptome: Analysis of the Xenopus laevis Transcriptome. PLoS ONE
**2013**, 8, e77700. [Google Scholar] [CrossRef] - Hicks, S.C.; Townes, F.W.; Teng, M.; Irizarry, R. Missing data and technical variability in single-cell RNA-sequencing experiments. Biostatistics
**2018**, 19, 562–578. [Google Scholar] [CrossRef] - Haque, A.; Engel, J.; Teichmann, S.A.; Lönnberg, T. A practical guide to single-cell RNA-sequencing for biomedical research and clinical applications. Genome Med.
**2017**, 9, 75. [Google Scholar] [CrossRef] [PubMed] - Qiu, P. Embracing the dropouts in single-cell RNA-seq analysis. Nat. Comm.
**2020**, 11, 1169. [Google Scholar] [CrossRef][Green Version] - Lafzi, A.; Moutinho, C.; Picelli, S.; Heyn, H. Tutorial: Guidelines for the experimental design of single-cell RNA sequencing studies. Nat. Protoc.
**2018**, 13, 2742–2757. [Google Scholar] [CrossRef] [PubMed][Green Version] - Luecken, M.D.; Theis, F.J. Current best practices in single-cell RNA-seq analysis: A tutorial. Mol. Syst. Biol.
**2019**, 15, e8746. [Google Scholar] [CrossRef] - Andrews, T.S.; Kiselev, V.Y.; McCarthy, D.; Hemberg, M. Tutorial: Guidelines for the computational analysis of single-cell RNA sequencing data. Nat. Protoc.
**2021**, 16, 1–9. [Google Scholar] [CrossRef] - Love, M.I.; Huber, W.; Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol.
**2014**, 15, 550. [Google Scholar] [CrossRef] [PubMed][Green Version] - Miao, Z.; Deng, K.; Wang, X.; Zhang, X. DEsingle for detecting three types of differential expression in single-cell RNA-seq data. Bioinformatics
**2018**, 34, 3223–3224. [Google Scholar] [CrossRef][Green Version] - Tian, L.; Su, S.; Dong, X.; Amann-Zalcenstein, D.; Biben, C.; Seidi, A.; Hilton, D.J.; Naik, S.H.; Ritchie, M.E. scPipe: A flexible R/Bioconductor preprocessing pipeline for single-cell RNA-sequencing data. PLoS Comput. Biol.
**2018**, 14, e1006361. [Google Scholar] [CrossRef] - Tian, L.; Dong, X.; Freytag, S.; Cao, K.-A.L.; Su, S.; JalalAbadi, A.; Amann-Zalcenstein, D.; Weber, T.S.; Seidi, A.; Jabbari, J.S.; et al. Benchmarking single cell RNA-sequencing analysis pipelines using mixture control experiments. Nat. Methods
**2019**, 16, 479–487. [Google Scholar] [CrossRef] [PubMed] - Cock, P.J.; Fields, C.J.; Goto, N.; Heuer, M.L.; Rice, P.M. The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Res.
**2010**, 38, 1767–1771. [Google Scholar] [CrossRef] [PubMed][Green Version] - Sequence Read Archives. Available online: https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=software (accessed on 10 November 2020).
- Leinonen, R.; Sugawara, H.; Shumway, M.; on behalf of the International Nucleotide Sequence Database Collaboration. The Sequence Read Archive. Nucleic Acids Res.
**2010**, 39, D19–D21. [Google Scholar] [CrossRef][Green Version] - Andrews, S. FastQC-A Quality Control Tool for High throughput Sequence Data. 2014. Available online: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/ (accessed on 10 November 2020).
- Smith, T.; Heger, A.; Sudbery, I. UMI-tools: Modeling sequencing errors in Unique Molecular Identifiers to improve quantification accuracy. Genome Res.
**2017**, 27, 491–499. [Google Scholar] [CrossRef][Green Version] - “GRC and Collaborators”. Genome Reference Consortium. Available online: https://www.ncbi.nlm.nih.gov/grc/credits/ (accessed on 19 October 2020).
- Harrow, J.; Frankish, A.; Gonzalez, J.M.; Tapanari, E.; Diekhans, M.; Kokocinski, F.; Aken, B.L.; Barrell, D.; Zadissa, A.; Searle, S.; et al. GENCODE: The reference human genome annotation for The ENCODE Project. Genome Res.
**2012**, 22, 1760–1774. [Google Scholar] [CrossRef] [PubMed][Green Version] - Dobin, A.; Davis, C.A.; Schlesinger, F.; Drenkow, J.; Zaleski, C.; Jha, S.; Batut, P.; Chaisson, M.; Gingeras, T.R. Gingeras, STAR: Ultrafast universal RNA-seq aligner. Bioinformatics
**2013**, 29, 15–21. [Google Scholar] [CrossRef] - Li, H.; Handsaker, B.; Wysoker, A.; Fennell, T.; Ruan, J.; Homer, N.; Marth, G.; Abecasis, G.; Durbin, R.; 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics
**2009**, 25, 2078–2079. [Google Scholar] [CrossRef][Green Version] - Liao, Y.; Smyth, G.K.; Shi, W. featureCounts: An efficient general-purpose program for assigning sequence reads to genomic features. Bioinformatics
**2014**, 30, 923–930. [Google Scholar] [CrossRef] [PubMed][Green Version] - R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2019. [Google Scholar]
- Hartigan, J.A.; Wong, M.A. Algorithm AS 136: A K-Means Clustering Algorithm. J. R. Stat. Soc. Ser. C (Appl. Stat.)
**1979**, 28, 100–108. [Google Scholar] [CrossRef] - Ewing, B.; Hillier, L.; Wendl, M.C.; Green, P. Base-calling of automated sequencer traces using Phred. I. Accuracy assessment. Genome Res.
**1998**, 8, 175–185. [Google Scholar] [CrossRef] [PubMed][Green Version] - Batut, B.; Hiltemann, S.; Bagnacani, A.; Baker, D.; Bhardwaj, V.; Blank, C.; Bretaudeau, A.; Brillet-Guéguen, L.; Čech, M.; Chilton, J.; et al. 2018 Community-Driven Data Analysis Training for Biology. Cell Syst.
**2018**, 6, 752–758.e1. [Google Scholar] [CrossRef] [PubMed][Green Version] - Dobin, A.; Gingeras, T.R. Mapping RNA-seq Reads with STAR. Curr. Protoc. Bioinform.
**2015**, 51, 1–19. [Google Scholar] [CrossRef] [PubMed][Green Version] - GENOCODE. Available online: https://www.gencodegenes.org/human/stats.html (accessed on 15 November 2020).
- Robinson, M.D.; McCarthy, D.J.; Smyth, G.K. EdgeR: A Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics
**2010**, 26, 139–140. [Google Scholar] [CrossRef] [PubMed][Green Version] - Hardcastle, T.; Kelly, K. BaySeq: Empirical Bayesian Methods for Identifying Differential Expression in Sequence Count Data. BMC Bioinform.
**2010**, 11, 422. [Google Scholar] [CrossRef] [PubMed][Green Version] - Li, W.V.; Li, J.J. An accurate and robust imputation method scImpute for single-cell RNA-seq data. Nat Commun.
**2018**, 9, 997. [Google Scholar] [CrossRef] [PubMed][Green Version] - Lun, A.T.L.; Bach, K.; Marioni, J.C. Pooling Across Cells to Normalize Single-Cell Rna Sequencing Data with Many Zero Counts. Genome Biol.
**2016**, 17, 75. [Google Scholar] [CrossRef] [PubMed] - Žurauskienė, J.; Yau, C. PcaReduce: Hierarchical clustering of single-cell transcriptional profiles. BMC Bioinform.
**2016**, 17, 140. [Google Scholar] [CrossRef] [PubMed][Green Version] - Das, S.; Rai, S.N. SwarnSeq: An improved statistical approach for differential expression analysis of single-cell RNA-seq data. Genomics
**2021**, 113, 1308–1324. [Google Scholar] [CrossRef] - Das, S.; Rai, S.N. Statistical methods for analysis of single-cell RNA-sequencing data. MethodsX
**2021**, 8, 101580. [Google Scholar] [CrossRef] - Shalek, A.K.; Satija, R.; Shuga, J.; Trombetta, J.J.; Gennert, D.; Lu, D.; Chen, P.; Gertner, R.S.; Gaublomme, J.T.; Yosef, N.; et al. Single-cell RNA-seq reveals dynamic paracrine control of cellular variation. Nature
**2014**, 510, 363–369. [Google Scholar] [CrossRef][Green Version] - Pierson, E.; Yau, C. Zifa: Dimensionality reduction for zero-inflated single-cell gene expression analysis. Genome Biol.
**2015**, 16, 241. [Google Scholar] [CrossRef][Green Version] - Scholtens, D.; von Heydebreck, A. Analysis of Differential Gene Expression Studies. In Bioinformatics and Computational Biology Solutions Using R and Bioconductor; Gentleman, R., Carey, V.J., Huber, W., Irizarry, R.A., Dudoit, S., Eds.; Statistics for Biology and Health; Springer: New York, NY, USA, 2005. [Google Scholar] [CrossRef]

**Figure 1.**Outlines of the workflow for various steps in scRNA-seq data analysis. (

**A**) Key steps involved in a typical single-cell RNA-seq experiment starting from the sample preparation by the isolation and lysis of single cells up to the data analysis. (

**B**) Data preprocessing steps beginning from the .fastq files up to the generation of count matrix and the tools required at each stage. (

**C**) Significant data analysis steps with the input count data matrix undertaken in the scRNA-seq study.

**Figure 2.**Data structure model, distributions and estimated parameters. (

**A**) Different cumulative distribution function (CDF) fitted to the single adenocarcinoma cells RNA-seq data. The x-axis corresponds to the cumulative densities and the y-axis represents the read counts. The red color corresponds to the observed CDF, the pink color to NB and the blue color to ZINB. (

**B**) Fitting of count data models to the given adenocarcinoma single cells RNA-seq data. In this plot, the x-axis represents the scRNA-seq read counts and the y-axis represents the densities. The red color corresponds to the observed density, the pink color to NB density and the blue color to ZINB density. (

**C**) Distribution of zeroes. The x-axis is the number of genes and the y-axis shows the percentage of zeroes. (

**D**) Plotting of observed zero proportions vs. estimated zero-inflation. The x-axis represents the estimated zero-inflation and the y-axis represents the observed zero proportion. Here, the plot shows that the observed zero proportions are greater than the estimated zero-inflation. (

**E**) Relation between mean and dispersion. The log(mean) is shown on the y-axis vs. log(dispersion) on the x-axis. (

**F**) The plot shows zero-inflation in data. The y-axis corresponds to the p-value and the x-axis represents the genes.

**Figure 3.**Distribution of cell sizes for Adenocarcinoma scRNA-seq data. (

**A**) Distribution of library sizes across the total number of cells. (

**B**) Plot for cell ranks vs. cell sizes—distribution of cell library sizes over the cell ranks. Here, the y-axis represents the cells’ rank and the x-axis represents the sequencing depth.

**Figure 4.**Effects of zero reductions on the determination of an optimum number of cell clusters. The figures are shown for (

**A**) no reduction, (

**B**) 80% reduction, (

**C**) 50% reduction and (

**D**) 30% reduction. The y-axis represents the values of clustering indices and the x-axis represents the values of optimum cell clusters. The blue line indicates the value of the optimum number of cell clusters in which the cells in the data can be clustered. For (

**A**), we observe the optimal number of clusters is approximately 10 for the 972 cells with all genes included. (

**B**) 80% reduction (80% reduction of zeros with 2415 reduced genes): the number of cell clusters was found to be 10.. (

**C**) 50% reduction (50% reduction of zeros with 879 reduced genes): the number of cell clusters was found to be 10. (

**D**) 30% reduction (30% reduction of zeros and with 454 genes): the number of cell clusters was found to be 10.

**Figure 5.**Comparative analysis of DEsingle and DESeq2 in terms of AUROC. The figure shows the ROC curves of the two DE analysis methods, DESeq2 and DEsingle. The red color indicates the DEsingle and the blue color represents DESeq2. DEsingle has better performance in terms of AUC value as compared to DESeq2.

Name | Version | Description | Reference |
---|---|---|---|

FastQC | v0.11.9 | FastQ Quality Check | [33] |

UMI-tools | 1.0.0 | Tools for handling Unique Molecular Identifiers | [34] |

Human genome | Grch38/hg 38 | Human genome reference file | [35] |

GTF | Release 35 | Gene Transfer Format | [36] |

STAR | 2.7 | Spliced Transcripts Alignment to a Reference | [37] |

SAM tools | 1.4 | SAMtools software package | [38] |

Subread package | 2.0.1 | The package used by SAMtools | [39] |

Stats R package | 3.6.1 | Package for k-means clustering | [40,41] |

DESeq2 | 1.28.1 | DE analysis tool for RNA-seq | [26] |

DEsingle | 1.8.2 | DE analysis tool for scRNA-seq | [27] |

Case Type | Percentage Reduction | No. of Optimal Clusters | No. of Genes |
---|---|---|---|

Case 1 | No reduction | 10 | 42,406 |

Case 2 | 80% | 10 | 2415 |

Case 2 | 60% | 10 | 1201 |

Case 2 | 50% | 10 | 879 |

Case 2 | 30% | 10 | 454 |

**Table 3.**Summary of results from a comparative analysis of DEsingle and DESeq2 using case I clustering method.

Level of Significance | DEsingle Genes | DESeq2 Genes | Common Genes |
---|---|---|---|

1% | 634 | 79 | 25 |

0.1% | 401 | 75 | 22 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Malhotra, A.; Das, S.; Rai, S.N.
Analysis of Single-Cell RNA-Sequencing Data: A Step-by-Step Guide. *BioMedInformatics* **2022**, *2*, 43-61.
https://doi.org/10.3390/biomedinformatics2010003

**AMA Style**

Malhotra A, Das S, Rai SN.
Analysis of Single-Cell RNA-Sequencing Data: A Step-by-Step Guide. *BioMedInformatics*. 2022; 2(1):43-61.
https://doi.org/10.3390/biomedinformatics2010003

**Chicago/Turabian Style**

Malhotra, Aanchal, Samarendra Das, and Shesh N. Rai.
2022. "Analysis of Single-Cell RNA-Sequencing Data: A Step-by-Step Guide" *BioMedInformatics* 2, no. 1: 43-61.
https://doi.org/10.3390/biomedinformatics2010003