Single-Cell Bioinformatics and Machine Learning

A special issue of Genes (ISSN 2073-4425). This special issue belongs to the section "Technologies and Resources for Genetics".

Deadline for manuscript submissions: closed (31 December 2021) | Viewed by 19263

Special Issue Editor

Department of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, TX 77843, USA
Interests: biostatistics; human genetics; gene expression and regulation; computational statistics; data science; evolutionary bioinformatics

Special Issue Information

Dear Colleagues,

We are at the intersection of two major technical revolutions. One is single-cell technology, and the other is machine learning. The single-cell technology revolution refers to the recent technological advent that makes it possible and economically feasible to obtain robust quantitative measurements (such as mRNA abundance levels) from thousands of individual cells per assay. Compared to single-cell technology, the development of machine learning as part of the more general field of artificial intelligence has a relatively long history, with its groundwork laid down in the middle of the last century. However, increasingly powerful computers, harnessed to algorithms refined over the past decade, are driving an explosion of applications in every field, from business to healthcare. The marriage of the two technologies—single-cell technology and machine learning—is inevitable. With the development of high-throughput single-cell RNA sequencing (scRNA-seq) platforms, it is becoming almost a routine matter to obtain complete transcriptome information from hundreds of thousands and even millions of individual cells. Therefore, scRNA-seq data currently represent a truly Big Data opportunity with superior statistical power and open new horizons for applying machine learning for data analysis. Single-cell big data are high-dimensional and sparse; the relationships between cells and those between genes are usually nonlinear. The effective analysis of single-cell data requires developing novel machine learning algorithms, which itself is a revolutionary challenge.

This Special Issue invites research articles, reviews, and short communications focusing on machine learning to address new problems and improve the management of existing tasks using single-cell data. Submitted articles will be evaluated based on their quality and relevance to the research topic.

Dr. James Cai
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Genes is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Single-cell RNA sequencing 
  • Bioinformatics 
  • Machine learning 
  • Network science 
  • Big data science

Published Papers (5 papers)

Research

17 pages, 4800 KiB  
Article
scInTime: A Computational Method Leveraging Single-Cell Trajectory and Gene Regulatory Networks to Identify Master Regulators of Cellular Differentiation
by Qian Xu, Guanxun Li, Daniel Osorio, Yan Zhong, Yongjian Yang, Yu-Te Lin, Xiuren Zhang and James J. Cai
Genes 2022, 13(2), 371; https://doi.org/10.3390/genes13020371 - 18 Feb 2022
Cited by 4 | Viewed by 4086
Abstract
Trajectory inference (TI) or pseudotime analysis has dramatically extended the analytical framework of single-cell RNA-seq data, allowing regulatory genes contributing to cell differentiation and those involved in various dynamic cellular processes to be identified. However, most TI analysis procedures deal with individual genes independently while overlooking the regulatory relations between genes. Integrating information from gene regulatory networks (GRNs) at different pseudotime points may lead to more interpretable TI results. To this end, we introduce scInTime, an unsupervised machine learning framework coupling inferred trajectory with single-cell GRNs (scGRNs) to identify master regulatory genes. We validated the performance of our method by analyzing multiple scRNA-seq data sets. In each of the cases, top-ranking genes predicted by scInTime supported their functional relevance with corresponding signaling pathways, in line with the results of available functional studies. Overall results demonstrated that scInTime is a powerful tool to exploit pseudotime-series scGRNs, allowing for a clear interpretation of TI results toward more significant biological insights.
(This article belongs to the Special Issue Single-Cell Bioinformatics and Machine Learning)

21 pages, 2227 KiB  
Article
Accurate Single-Cell Clustering through Ensemble Similarity Learning
by Hyundoo Jeong, Sungtae Shin and Hong-Gi Yeom
Genes 2021, 12(11), 1670; https://doi.org/10.3390/genes12111670 - 22 Oct 2021
Cited by 1 | Viewed by 2335
Abstract
Single-cell sequencing provides novel means to interpret the transcriptomic profiles of individual cells. In-depth analysis of single-cell sequencing data requires effective computational methods that accurately predict single-cell clusters, because single-cell sequencing techniques provide only the transcriptomic profiles of each cell. Although an accurate estimation of the cell-to-cell similarity is an essential first step to derive reliable single-cell clustering results, it is challenging to obtain an accurate similarity measurement because it depends heavily on the selection of genes used for similarity evaluation, and the optimal set of genes for accurate similarity estimation is typically unknown. Moreover, due to technical limitations, single-cell sequencing data include a large number of artificial zeros, and the technical noise makes it difficult to develop effective single-cell clustering algorithms. Here, we describe a novel single-cell clustering algorithm that can accurately predict single-cell clusters in large-scale single-cell sequencing by effectively reducing the zero-inflated noise and accurately estimating the cell-to-cell similarities. First, we construct an ensemble similarity network based on different similarity estimates, and reduce the artificial noise using a random walk with restart framework. Then, starting from a large number of small but highly consistent clusters, we iteratively merge the pair of clusters with the maximum similarity until the predicted number of clusters is reached. Extensive performance evaluation shows that the proposed single-cell clustering algorithm yields accurate single-cell clustering results and can help decipher the key messages underlying complex biological mechanisms.
(This article belongs to the Special Issue Single-Cell Bioinformatics and Machine Learning)

13 pages, 2539 KiB  
Article
Investigating Cellular Trajectories in the Severity of COVID-19 and Their Transcriptional Programs Using Machine Learning Approaches
by Hyun-Hwan Jeong, Johnathan Jia, Yulin Dai, Lukas M. Simon and Zhongming Zhao
Genes 2021, 12(5), 635; https://doi.org/10.3390/genes12050635 - 24 Apr 2021
Cited by 13 | Viewed by 3203
Abstract
Single-cell RNA sequencing of the bronchoalveolar lavage fluid (BALF) samples from COVID-19 patients has enabled us to examine gene expression changes of human tissue in response to the SARS-CoV-2 virus infection. However, the underlying mechanisms of COVID-19 pathogenesis at single-cell resolution, its transcriptional drivers, and dynamics require further investigation. In this study, we applied machine learning algorithms to infer the trajectories of cellular changes and identify their transcriptional programs. Our study generated cellular trajectories that show COVID-19 pathogenesis from healthy to moderate and from healthy to severe in macrophages and T cells, and we observed more diverse trajectories in macrophages compared to T cells. Furthermore, our deep-learning algorithm DrivAER identified several pathways (e.g., xenobiotic pathway and complement pathway) and transcription factors (e.g., MITF and GATA3) that could be potential drivers of the transcriptomic changes for COVID-19 pathogenesis and the markers of the COVID-19 severity. Moreover, macrophage-related functions corresponded more to the disease severity compared to T cell-related functions. Our findings more proficiently dissected the transcriptomic changes leading to the severity of a COVID-19 infection.
(This article belongs to the Special Issue Single-Cell Bioinformatics and Machine Learning)

Review

11 pages, 270 KiB  
Review
A Primer for Single-Cell Sequencing in Non-Model Organisms
by James M. Alfieri, Guosong Wang, Michelle M. Jonika, Clare A. Gill, Heath Blackmon and Giridhar N. Athrey
Genes 2022, 13(2), 380; https://doi.org/10.3390/genes13020380 - 19 Feb 2022
Cited by 9 | Viewed by 3495
Abstract
Single-cell sequencing technologies have led to a revolution in our knowledge of the diversity of cell types, connections between biological levels of organization, and relationships between genotype and phenotype. These advances have mainly come from using model organisms; however, using single-cell sequencing in non-model organisms could enable investigations of questions inaccessible with typical model organisms. This primer describes a general workflow for single-cell sequencing studies and considerations for using non-model organisms (limited to multicellular animals). Importantly, single-cell sequencing, when further applied in non-model organisms, will allow for a deeper understanding of the mechanisms between genotype and phenotype and the basis for biological variation.
(This article belongs to the Special Issue Single-Cell Bioinformatics and Machine Learning)

9 pages, 1044 KiB  
Review
The Trifecta of Single-Cell, Systems-Biology, and Machine-Learning Approaches
by Taylor M. Weiskittel, Cristina Correia, Grace T. Yu, Choong Yong Ung, Scott H. Kaufmann, Daniel D. Billadeau and Hu Li
Genes 2021, 12(7), 1098; https://doi.org/10.3390/genes12071098 - 20 Jul 2021
Cited by 10 | Viewed by 4088
Abstract
Together, single-cell technologies and systems biology have been used to investigate previously unanswerable questions in biomedicine with unparalleled detail. Despite these advances, gaps in analytical capacity remain. Machine learning, which has revolutionized biomedical imaging analysis, drug discovery, and systems biology, is an ideal strategy to fill these gaps in single-cell studies. Machine learning additionally has proven to be remarkably synergistic with single-cell data because it remedies unique challenges while capitalizing on the positive aspects of single-cell data. In this review, we describe how systems-biology algorithms have layered machine learning with biological components to provide systems level analyses of single-cell omics data, thus elucidating complex biological mechanisms. Accordingly, we highlight the trifecta of single-cell, systems-biology, and machine-learning approaches and illustrate how this trifecta can significantly contribute to five key areas of scientific research: cell trajectory and identity, individualized medicine, pharmacology, spatial omics, and multi-omics. Given its success to date, the systems-biology, single-cell omics, and machine-learning trifecta has proven to be a potent combination that will further advance biomedical research.
(This article belongs to the Special Issue Single-Cell Bioinformatics and Machine Learning)