Special Issue "Selected Papers from the International Conference on Intelligent Biology and Medicine (ICIBM 2019)"

A special issue of Genes (ISSN 2073-4425). This special issue belongs to the section "Technologies and Resources for Genetics".

Deadline for manuscript submissions: closed (15 August 2019).

Special Issue Editor

Prof. Yan Guo
Guest Editor
Department of Internal Medicine, University of New Mexico, Albuquerque, NM, 87131, USA.
Interests: genomics; genetics; bioinformatics; mitochondria; data mining; machine learning; high throughput genomic data

Special Issue Information

Dear Colleagues,

The 2019 International Conference on Intelligent Biology and Medicine (ICIBM 2019) will be held on June 9-11, 2019 in Columbus, OH, USA. The event webpage is: http://icibm2019.org/.

The ICIBM conference series has two main aims: 1) to foster interdisciplinary and multidisciplinary research in bioinformatics-related fields, and 2) to provide an educational program for trainees and young investigators across a range of scientific disciplines to learn about frontier research in these areas and to build a network among both established and junior investigators.

The current Special Issue invites submissions of unpublished original work describing recent advances in all aspects of bioinformatics, systems biology and intelligent computing, including but not restricted to the following topics:

  1. Cancer genomics
  2. Metabolomics
  3. Microbiome/Metagenomics
  4. Translational pharmacoinformatics
  5. Omics integration
  6. Medical informatics
  7. Scientific databases
  8. Imaging informatics
  9. Systems biology
  10. Algorithms/Artificial intelligence
  11. Single-cell analysis

Prof. Yan Guo
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Genes is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2000 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • cancer genomics
  • metabolomics
  • microbiome
  • metagenomics
  • translational pharmacoinformatics
  • omics integration
  • medical informatics
  • scientific databases
  • imaging informatics
  • systems biology
  • algorithms
  • artificial intelligence
  • single-cell analysis

Published Papers (19 papers)

Editorial

Open Access Editorial
Innovating Computational Biology and Intelligent Medicine: ICIBM 2019 Special Issue
Genes 2020, 11(4), 437; https://doi.org/10.3390/genes11040437 - 17 Apr 2020
Abstract
The International Association for Intelligent Biology and Medicine (IAIBM) is a nonprofit organization that promotes intelligent biology and medical science. It hosts an annual International Conference on Intelligent Biology and Medicine (ICIBM), which was established in 2012. ICIBM 2019 was held from 9 to 11 June 2019 in Columbus, Ohio, USA. Out of the 105 original research manuscripts submitted to the conference, 18 were selected for publication in a Special Issue in Genes. The selected manuscripts cover a wide range of current topics in biomedical research, including cancer informatics, transcriptomics, computational algorithms, visualization and tools, deep learning, and microbiome research. In this editorial, we briefly introduce each of the manuscripts and discuss their contribution to the advance of science and technology.

Research

Open Access Article
Network-Based Single-Cell RNA-Seq Data Imputation Enhances Cell Type Identification
Genes 2020, 11(4), 377; https://doi.org/10.3390/genes11040377 - 31 Mar 2020
Cited by 3
Abstract
Single-cell RNA sequencing is a powerful technology for obtaining transcriptomes at single-cell resolution. However, it suffers from dropout events (i.e., excess zero counts) since only a small fraction of transcripts get sequenced in each cell during the sequencing process. This inherent sparsity of expression profiles hinders further characterization at the cell and gene level, such as cell type identification and downstream analysis. To alleviate this dropout issue, we introduce a network-based method, netImpute, which leverages the hidden information in gene co-expression networks to recover real signals. netImpute employs Random Walk with Restart (RWR) to adjust the gene expression level in a given cell by borrowing information from its neighbors in a gene co-expression network. Performance evaluation and comparison with existing tools on simulated data and seven real datasets show that netImpute substantially enhances clustering accuracy and data visualization clarity, thanks to its effective treatment of dropouts. While the idea of netImpute is general and can be applied with other types of networks, such as a cell co-expression network or a protein–protein interaction (PPI) network, evaluation results show that the gene co-expression network is consistently more beneficial, presumably because a PPI network usually lacks cell type context, while a cell co-expression network can cause information loss for rare cell types. Evaluation results on several biological datasets show that netImpute recovers missing transcripts in scRNA-seq data and enhances the identification and visualization of heterogeneous cell types more effectively than existing methods.
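
The Random Walk with Restart step described in this abstract can be illustrated with a minimal Python sketch. It is not the netImpute implementation: the function name rwr_impute, the use of absolute Pearson correlation as edge weights, the column normalization, and the default restart probability are all assumptions made for illustration only.

```python
import numpy as np

def rwr_impute(expr, restart=0.5, n_iter=100, tol=1e-8):
    """Smooth a genes x cells matrix with Random Walk with Restart.

    Hypothetical sketch: the network construction and parameters are
    assumptions, not the published netImpute settings.
    """
    expr = np.asarray(expr, dtype=float)
    # Gene-gene co-expression weights: absolute Pearson correlation (assumed).
    with np.errstate(invalid="ignore", divide="ignore"):
        corr = np.nan_to_num(np.corrcoef(expr))
    adj = np.abs(corr)
    np.fill_diagonal(adj, 0.0)  # no self-loops
    # Column-normalize so each column is a transition distribution.
    col_sums = adj.sum(axis=0, keepdims=True)
    col_sums[col_sums == 0] = 1.0
    trans = adj / col_sums
    # Iterate p <- (1 - r) * W p + r * p0, with every cell (column) as its own p0.
    smoothed = expr.copy()
    for _ in range(n_iter):
        updated = (1.0 - restart) * (trans @ smoothed) + restart * expr
        converged = np.abs(updated - smoothed).max() < tol
        smoothed = updated
        if converged:
            break
    return smoothed

# Toy example: gene 1 has a dropout (zero) in the third cell.
toy = np.array([[5.0, 4.0, 6.0, 5.0],
                [5.0, 4.0, 0.0, 5.0],
                [0.0, 1.0, 0.0, 1.0]])
imputed = rwr_impute(toy)
```

A restart probability of 1 returns the input unchanged, while smaller values borrow more signal from neighboring genes in the network.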

Open Access Article
Computational Cancer Cell Models to Guide Precision Breast Cancer Medicine
Genes 2020, 11(3), 263; https://doi.org/10.3390/genes11030263 - 28 Feb 2020
Cited by 2
Abstract
Background: Large-scale screening of drug sensitivity on cancer cell models can mimic in vivo cellular behavior, providing wider scope for biological research on cancer. Since the therapeutic effect of a single drug or drug combination depends on the individual patient’s genome characteristics and cancer cells integration reaction, the identification of an effective agent in an in vitro model by using a large number of cancer cell models is a promising approach for the development of targeted treatments. Precision cancer medicine aims to select the most appropriate treatment or treatments for an individual patient. However, it still lacks the tools to bridge the gap between conventional in vitro cancer cell models and clinical patient response to inhibitors. Methods: An optimal two-layer decision system model is developed to identify the cancer cells that most closely resemble an individual tumor for optimum therapeutic interventions in precision cancer medicine. Accordingly, an optimal grid parameters selection is designed to seek the highest accordance for treatment selection to the patient’s preference for drug response and in vitro cancer cell drug screening. The optimal two-layer decision system model overcomes the challenge of heterology data comparison between the tumor and the cancer cells, as well as between the continual variation of drug responses in vitro and the discrete ones in clinical practice. We simulated the model accuracy using 681 cancer cells’ mRNA and associated 481 drug screenings and validated our results on 315 breast cancer patients’ drug selection across seven drugs (docetaxel, doxorubicin, fluorouracil, paclitaxel, tamoxifen, cyclophosphamide, lapatinib). Results: Compared with the real response of a drug in clinical patients, the novel model obtained an overall average accordance over 90.8% across the seven drugs. At the same time, the optimal cancer cells and the associated optimal therapeutic efficacy of cancer drugs are recommended. The novel optimal two-layer decision system model was used on 1097 patients with breast cancer in guiding precision medicine for a recommendation of their optimal cancer cells (30 cancer cells) and associated efficacy of certain cancer drugs. Our model can detect the most similar cancer cells for each individual patient. Conclusion: A successful clinical translation model (optimal two-layer decision system model) was developed to bridge in vitro basic science to clinical practice in a therapeutic intervention application for the first time. The novel tool kills two birds with one stone. It can help basic science to seek optimal cancer cell models for an individual tumor, while prioritizing clinical drugs’ recommendations in practice. Tool associated platform website: We extended the breast cancer research to 32 more types of cancers across 45 therapy predictions.

Open Access Article
CNV Detection from Circulating Tumor DNA in Late Stage Non-Small Cell Lung Cancer Patients
Genes 2019, 10(11), 926; https://doi.org/10.3390/genes10110926 - 14 Nov 2019
Cited by 7
Abstract
While methods for detecting SNVs and indels in circulating tumor DNA (ctDNA) with hybridization capture-based next-generation sequencing (NGS) are available, copy number variation (CNV) detection is more challenging. Here, we present a method enabling CNV detection from a 150-gene panel using a very low amount of ctDNA. First, a read depth-based CNV estimation method without a paired blood sample was developed, and cfDNA sequencing data from healthy people were used to build a panel of normals (PoN) model. Then, in silico and in vitro simulations were performed to define the limit of detection (LOD) for EGFR, ERBB2, and MET. Compared to the WES results of the 48 samples, the concordance rate for EGFR, ERBB2, and MET CNVs was 78%, 89.6%, and 92.4%, respectively. In another cohort profiled with the 150-gene panel from 5980 lung cancer ctDNA samples, we detected the three genes’ amplification at population frequencies comparable with other cohorts. One lung adenocarcinoma patient with MET amplification detected by our method reached partial response to crizotinib. These findings show that our ctDNA CNV detection pipeline can detect CNVs with high specificity and concordance, which enables CNV calling in a non-invasive way for cancer patients when tissues are not available.
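
The general idea of read depth-based CNV estimation against a panel of normals can be sketched in Python as follows. This is a generic illustration, not the pipeline from the paper: the function names, the median-based normalization, and the amplification threshold of log2(1.5) are all assumptions.

```python
import numpy as np

def log2_copy_ratio(sample_depth, pon_depths):
    """Per-target log2 ratio of a sample against a panel of normals (PoN).

    sample_depth: read depth per target, shape (n_targets,)
    pon_depths:   normals x targets depth matrix
    Hypothetical sketch; zero depths are not handled and would yield inf/nan.
    """
    sample = np.asarray(sample_depth, dtype=float)
    pon = np.asarray(pon_depths, dtype=float)
    # Correct for sequencing depth with each sample's own median (assumed).
    sample_norm = sample / np.median(sample)
    pon_norm = pon / np.median(pon, axis=1, keepdims=True)
    # The PoN baseline is the median normalized depth of each target.
    baseline = np.median(pon_norm, axis=0)
    return np.log2(sample_norm / baseline)

def call_amplifications(log2_ratio, threshold=np.log2(1.5)):
    """Flag targets whose log2 ratio exceeds an assumed threshold."""
    return np.asarray(log2_ratio) > threshold

# Toy example: five targets, the last one amplified in the tumor sample.
pon = [[100, 100, 100, 100, 100],
       [200, 200, 200, 200, 200]]
sample = [50, 50, 50, 50, 200]
ratios = log2_copy_ratio(sample, pon)
calls = call_amplifications(ratios)
```

A real pipeline would also need GC and capture-efficiency correction and a noise model to set the limit of detection; those steps are omitted here.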

Open Access Article
DNA Methylation Markers for Pan-Cancer Prediction by Deep Learning
Genes 2019, 10(10), 778; https://doi.org/10.3390/genes10100778 - 04 Oct 2019
Cited by 6
Abstract
For cancer diagnosis, many DNA methylation markers have been identified. However, few studies have tried to identify DNA methylation markers to diagnose diverse cancer types simultaneously, i.e., pan-cancers. In this study, we tried to identify DNA methylation markers to differentiate cancer samples from the respective normal samples in pan-cancers. We collected whole genome methylation data of 27 cancer types containing 10,140 cancer samples and 3386 normal samples, and divided all samples into five data sets, including one training data set, one validation data set and three test data sets. We applied machine learning to identify DNA methylation markers, and specifically, we constructed diagnostic prediction models by deep learning. We identified two categories of markers: 12 CpG markers and 13 promoter markers. Three of the 12 CpG markers and four of the 13 promoter markers are located at cancer-related genes. With the CpG markers, our model achieved an average sensitivity and specificity on the test data sets of 92.8% and 90.1%, respectively. For promoter markers, the average sensitivity and specificity on the test data sets were 89.8% and 81.1%, respectively. Furthermore, in cell-free DNA methylation data of 163 prostate cancer samples, the CpG markers achieved a sensitivity of 100%, and the promoter markers achieved 92%. For both marker types, the specificity of normal whole blood was 100%. To conclude, we identified methylation markers to diagnose pan-cancers, which might be applied to liquid biopsy of cancers.

Open Access Article
A Portal to Visualize Transcriptome Profiles in Mouse Models of Neurological Disorders
Genes 2019, 10(10), 759; https://doi.org/10.3390/genes10100759 - 26 Sep 2019
Cited by 2
Abstract
Target nomination for drug development has been a major challenge in the path to finding a cure for several neurological disorders. Comprehensive transcriptome profiles have revealed brain gene expression changes associated with many neurological disorders, and the functional validation of these changes is a critical next step. Model organisms are a proven approach for the elucidation of disease mechanisms, including screening of gene candidates as therapeutic targets. Frequently, multiple models exist for a given disease, creating a challenge to select the optimal model for validation and functional follow-up. To help in nominating the best mouse models for studying neurological diseases, we developed a web portal to visualize mouse transcriptomic data related to neurological disorders. Users can examine gene expression changes across mouse model studies to help select the optimal mouse model for further investigation. The portal provides access to mouse studies related to Alzheimer’s disease (AD), Parkinson’s disease (PD), Huntington’s disease (HD), amyotrophic lateral sclerosis (ALS), spinocerebellar ataxia (SCA), and models related to aging.

Open Access Article
Identification of Alternatively-Activated Pathways between Primary Breast Cancer and Liver Metastatic Cancer Using Microarray Data
Genes 2019, 10(10), 753; https://doi.org/10.3390/genes10100753 - 25 Sep 2019
Cited by 2
Abstract
Alternatively-activated pathways have been observed in biological experiments in cancer studies, but the concept had not been fully explored in computational cancer systems biology. Therefore, an alternatively-activated pathway identification method was proposed and applied to primary breast cancer and breast cancer liver metastasis research using microarray data. Interestingly, the results show that cytokine-cytokine receptor interaction and calcium signaling were significantly enriched under both conditions. TGF beta signaling was found to be the hub in network topology analysis. In total, three types of alternatively-activated pathways were recognized. In the cytokine-cytokine receptor interaction pathway, four active alteration patterns in gene pairs were noticed. Thirteen cytokine-cytokine receptor pairs with inverse activity changes of both genes were verified by the literature. The second type was that some sub-pathways were active under only one condition. For the third type, nodes were significantly active in both conditions, but with different active genes. In the calcium signaling and TGF beta signaling pathways, nodes E2F5 and E2F4 were significantly active in primary breast cancer and metastasis, respectively. Overall, our study demonstrated for the first time the use of microarray data to identify alternatively-activated pathways in breast cancer liver metastasis. The results showed that the proposed method was valid and effective, which could be helpful for future research on the mechanism of breast cancer metastasis.

Open Access Article
Forming Big Datasets through Latent Class Concatenation of Imperfectly Matched Databases Features
Genes 2019, 10(9), 727; https://doi.org/10.3390/genes10090727 - 19 Sep 2019
Cited by 1
Abstract
Informatics researchers often need to combine data from many different sources to increase statistical power and study subtle or complicated effects. Perfect overlap of measurements across academic studies is rare since virtually every dataset is collected for a unique purpose and without coordination across parties not-at-hand (i.e., informatics researchers in the future). Thus, incomplete concordance of measurements across datasets poses a major challenge for researchers seeking to combine public databases. In any given field, some measurements are fairly standard, but every organization collecting data makes unique decisions on instruments, protocols, and methods of processing the data. This typically prevents literal concatenation of the raw data since constituent cohorts do not have the same measurements (i.e., columns of data). When measurements across datasets are similar prima facie, there is a desire to combine the data to increase power, but mixing non-identical measurements could greatly reduce the sensitivity of the downstream analysis. Here, we discuss a statistical method that is applicable when certain patterns of missing data are found; namely, it is possible to combine datasets that measure the same underlying constructs (or latent traits) when there is only partial overlap of measurements across the constituent datasets. Our method, ROSETTA, empirically derives a set of common latent trait metrics for each related measurement domain using a novel variation of factor analysis to ensure equivalence across the constituent datasets. The advantage of combining datasets this way is the simplicity, statistical power, and modeling flexibility of a single joint analysis of all the data. Three simulation studies show the performance of ROSETTA on datasets with only partially overlapping measurements (i.e., systematically missing information), benchmarked to a condition of perfectly overlapped data (i.e., full information). The first study examined a range of correlations, while the second study was modeled after the observed correlations in a well-characterized clinical, behavioral cohort. Both studies consistently show significant correlations >0.94, often >0.96, indicating the robustness of the method and validating the general approach. The third study varied within- and between-domain correlations and compared ROSETTA to multiple imputation and meta-analysis as two commonly used methods that ostensibly solve the same data integration problem. We provide one alternative to meta-analysis and multiple imputation by developing a method that statistically equates similar but distinct manifest metrics into a set of empirically derived metrics that can be used for analysis across all datasets.

Open Access Article
Identifying Interaction Clusters for MiRNA and MRNA Pairs in TCGA Network
Genes 2019, 10(9), 702; https://doi.org/10.3390/genes10090702 - 11 Sep 2019
Cited by 5
Abstract
Existing methods often fail to recognize the conversions for the biological roles of the pairs of genes and microRNAs (miRNAs) between the tumor and normal samples. We have developed a novel cluster scoring method to identify messenger RNA (mRNA) and miRNA interaction pairs and clusters while considering tumor and normal samples jointly. Our method has identified 54 significant clusters for 15 cancer types selected from The Cancer Genome Atlas project. We also determined the shared clusters across tumor types and/or subtypes. In addition, we compared gene and miRNA overlap between lists identified in our liver hepatocellular carcinoma (LIHC) study and regulatory relationships reported from human and rat nonalcoholic fatty liver disease (NAFLD) studies. Finally, we analyzed biological functions for the single significant cluster in LIHC and uncovered a significantly enriched pathway (phospholipase D signaling pathway) with six genes represented in the cluster: DGKQ, LPAR2, PDGFRB, PIK3R3, PTGFR and RAPGEF3.

Open Access Article
A Super-Clustering Approach for Fully Automated Single Particle Picking in Cryo-EM
Genes 2019, 10(9), 666; https://doi.org/10.3390/genes10090666 - 30 Aug 2019
Cited by 5
Abstract
Structure determination of proteins and macromolecular complexes by single-particle cryo-electron microscopy (cryo-EM) is poised to revolutionize structural biology. An early challenging step in the cryo-EM pipeline is the detection and selection of particles from two-dimensional micrographs (particle picking). Most existing particle-picking methods require human intervention to deal with complex (irregular) particle shapes and extremely low signal-to-noise ratio (SNR) in cryo-EM images. Here, we design a fully automated super-clustering approach for single particle picking (SuperCryoEMPicker) in cryo-EM micrographs, which focuses on identifying, detecting, and picking particles of complex and irregular shapes in micrographs with extremely low SNR. Our method first applies advanced image processing procedures to improve the quality of the cryo-EM images. Binary mask images highlighting protein particles are then generated from each individual cryo-EM image using the super-clustering (SP) method, which improves upon base clustering methods (i.e., k-means, fuzzy c-means (FCM), and intensity-based cluster (IBC) algorithm) via a super-pixel algorithm. SuperCryoEMPicker is tested and evaluated on micrographs of β-galactosidase and 80S ribosomes, which are examples of cryo-EM data exhibiting complex and irregular particle shapes. The results show that the super-particle clustering method provides a more robust detection of particles than the base clustering methods, such as k-means, FCM, and IBC. SuperCryoEMPicker automatically and effectively identifies very complex particles from cryo-EM images of extremely low SNR. As a fully automated particle detection method, it has the potential to relieve researchers from laborious, manual particle-labeling work and therefore is a useful tool for cryo-EM protein structure determination.

Open Access Article
Gene Co-Expression Networks Restructured Gene Fusion in Rhabdomyosarcoma Cancers
Genes 2019, 10(9), 665; https://doi.org/10.3390/genes10090665 - 30 Aug 2019
Cited by 2
Abstract
Rhabdomyosarcoma is subclassified by the presence or absence of a recurrent chromosome translocation that fuses the FOXO1 and PAX3 or PAX7 genes. The fusion protein (FOXO1-PAX3/7) retains both binding domains and becomes a novel and potent transcriptional regulator in rhabdomyosarcoma subtypes. Many studies have characterized and integrated genomic, transcriptomic, and epigenomic differences among rhabdomyosarcoma subtypes that contain the FOXO1-PAX3/7 gene fusion and those that do not; however, few studies have investigated how gene co-expression networks are altered by FOXO1-PAX3/7. Although transcriptional data offer insight into one level of functional regulation, gene co-expression networks have the potential to identify biological interactions and pathways that underpin oncogenesis and tumorigenicity. Thus, we examined gene co-expression networks for rhabdomyosarcoma that were FOXO1-PAX3 positive, FOXO1-PAX7 positive, or fusion negative. Gene co-expression networks were mined using local maximum Quasi-Clique Merger (lmQCM) and analyzed for co-expression differences among rhabdomyosarcoma subtypes. This analysis observed 41 co-expression modules that were shared between fusion negative and positive samples, of which 17/41 showed significant up- or down-regulation with respect to fusion status. Fusion positive and negative rhabdomyosarcoma showed differing modularity of co-expression networks, with fusion negative (n = 109) having significantly more individual modules than fusion positive (n = 53). Subsequent analysis of gene co-expression networks for PAX3 and PAX7 type fusions observed that 17/53 were differentially expressed between the two subtypes. Gene list enrichment analysis found that gene ontology terms were poorly matched with biological processes and molecular function for most co-expression modules identified in this study; however, co-expressed modules were frequently localized to cytobands on chromosomes 8 and 11. Overall, we observed substantial restructuring of co-expression networks relative to fusion status and fusion type in rhabdomyosarcoma and identified previously overlooked genes and pathways that may be targeted in this pernicious disease.

Open Access Article
Sparse Convolutional Denoising Autoencoders for Genotype Imputation
Genes 2019, 10(9), 652; https://doi.org/10.3390/genes10090652 - 28 Aug 2019
Cited by 3
Abstract
Genotype imputation, where missing genotypes can be computationally imputed, is an essential tool in genomic analysis ranging from genome-wide associations to phenotype prediction. Traditional genotype imputation methods are typically based on haplotype-clustering algorithms, hidden Markov models (HMMs), and statistical inference. Deep learning-based methods have been recently reported to suitably address the missing data problems in various fields. To explore the performance of deep learning for genotype imputation, in this study, we propose a deep model called a sparse convolutional denoising autoencoder (SCDA) to impute missing genotypes. We constructed the SCDA model using a convolutional layer that can extract various correlation or linkage patterns in the genotype data and applying a sparse weight matrix resulting from L1 regularization to handle high-dimensional data. We comprehensively evaluated the performance of the SCDA model in different scenarios for genotype imputation on yeast and human genotype data. Our results showed that SCDA has strong robustness and significantly outperforms popular reference-free imputation methods. This study thus points to another novel application of deep learning models for missing data imputation in genomic studies.

Open Access Article
Tumor-Infiltrating Leukocyte Composition and Prognostic Power in Hepatitis B- and Hepatitis C-Related Hepatocellular Carcinomas
Genes 2019, 10(8), 630; https://doi.org/10.3390/genes10080630 - 20 Aug 2019
Cited by 12
Abstract
Background: Tumor-infiltrating leukocytes (TILs) are immune cells surrounding tumor cells, and several studies have shown that TILs are potential survival predictors in different cancers. However, few studies have dissected the differences between hepatitis B- and hepatitis C-related hepatocellular carcinoma (HBV−HCC and HCV−HCC). Therefore, we aimed to determine whether the abundance and composition of TILs are potential predictors for survival outcomes in HCC and which TILs are the most significant predictors. Methods: Two bioinformatics algorithms, ESTIMATE and CIBERSORT, were utilized to analyze the gene expression profiles from 6 datasets, from which the abundance of corresponding TILs was inferred. The ESTIMATE algorithm examined the overall abundance of TILs, whereas the CIBERSORT algorithm reported the relative abundance of 22 different TILs. Both HBV−HCC and HCV−HCC were analyzed. Results: The results indicated that the total abundance of TILs was higher in non-tumor tissue regardless of the HCC type. Alternatively, the specific TILs associated with overall survival (OS) and recurrence-free survival (RFS) varied between subtypes. For example, in HBV−HCC, plasma cells (hazard ratio [HR] = 1.05; 95% CI 1.00–1.10; p = 0.034) and activated dendritic cells (HR = 1.08; 95% CI 1.01–1.17; p = 0.03) were significantly associated with OS, whereas in HCV−HCC, monocytes (HR = 1.21) were significantly associated with OS. Furthermore, for RFS, CD8+ T cells (HR = 0.98) and M0 macrophages (HR = 1.02) were potential biomarkers in HBV−HCC, whereas neutrophils (HR = 1.01) were an independent predictor in HCV−HCC. Lastly, in both HBV−HCC and HCV−HCC, CD8+ T cells (HR = 0.97) and activated dendritic cells (HR = 1.09) had a significant association with OS, while γ delta T cells (HR = 1.04), monocytes (HR = 1.05), M0 macrophages (HR = 1.04), M1 macrophages (HR = 1.02), and activated dendritic cells (HR = 1.15) were highly associated with RFS. Conclusions: These findings demonstrated that TILs are potential survival predictors in HCC and different kinds of TILs are observed according to the virus type. Therefore, further investigations are warranted to elucidate the role of TILs in HCC, which may improve immunotherapy outcomes.

Open Access Article
The Molecular Evolution of Circadian Clock Genes in Spotted Gar (Lepisosteus oculatus)
Genes 2019, 10(8), 622; https://doi.org/10.3390/genes10080622 - 17 Aug 2019
Cited by 2
Abstract
Circadian rhythms are biological rhythms with a period of approximately 24 h. While canonical circadian clock genes and their regulatory mechanisms appear highly conserved, the evolution of clock gene families is still unclear due to several rounds of whole-genome duplication in vertebrates. The spotted gar (Lepisosteus oculatus), as a non-teleost ray-finned fish, represents a fish lineage that diverged before the teleost genome duplication (TGD), providing an outgroup for exploring the evolutionary mechanisms of circadian clocks after whole-genome duplication. In this study, we interrogated the spotted gar draft genome sequences and found that spotted gar contains 26 circadian clock genes from 11 families. Phylogenetic analysis showed that 9 of these 11 spotted gar circadian clock gene families have the same number of genes as humans, while the members of the nfil3 and cry families differ between spotted gar and humans. Using phylogenetic and syntenic analyses, we found that nfil3-1 is conserved in vertebrates, while nfil3-2 and nfil3-3 are maintained in spotted gar, teleost fish, amphibians, and reptiles, but not in mammals. Following the two-round vertebrate genome duplication (VGD), spotted gar retained cry1a, cry1b, and cry2, while cry3 is retained in spotted gar, teleost fish, turtles, and birds, but not in mammals. We hypothesize that duplication of core clock genes, such as nfil3 and cry, likely facilitated diversification of circadian regulatory mechanisms in teleost fish. We also found that the transcription factor binding element (Ahr::Arnt) is retained in only one of the per1 or per2 duplicated paralogs derived from the TGD in teleost fish, implicating possible cases of subfunctionalization. Together, these findings help decipher the repertoire of the spotted gar's circadian system and shed light on how vertebrate circadian clock systems have evolved.

Open Access Article
Multi-Objective Optimized Fuzzy Clustering for Detecting Cell Clusters from Single-Cell Expression Profiles
Genes 2019, 10(8), 611; https://doi.org/10.3390/genes10080611 - 13 Aug 2019
Cited by 7
Abstract
Rapid advances in single-cell RNA sequencing (scRNA-seq) allow measurement of gene expression at single-cell resolution in complex diseases or tissues. While many methods have been developed to detect cell clusters from scRNA-seq data, this task remains a major challenge. We proposed a multi-objective optimization-based fuzzy clustering approach for detecting cell clusters from scRNA-seq data. First, we conducted initial filtering and SCnorm normalization. We considered various case studies by selecting different cluster numbers (cl = 2 to a user-defined number), and applied the fuzzy c-means clustering algorithm to each case individually. For each case, we evaluated the scores of four cluster validity index measures: Partition Entropy (PE), Partition Coefficient (PC), Modified Partition Coefficient (MPC), and Fuzzy Silhouette Index (FSI). Next, we set the first measure as a minimization objective (↓) and the remaining three as maximization objectives (↑), and then applied a multi-objective decision-making technique, TOPSIS, to identify the optimal solution. The solution (case study) with the highest TOPSIS score was selected as the final optimal clustering. Finally, we obtained differentially expressed genes (DEGs) using Limma by comparing the expression of the samples in each resultant cluster with that in the remaining clusters. We applied our approach to a scRNA-seq dataset for the rare intestinal cell type in mice [GEO ID: GSE62270, 23,630 features (genes) and 288 cells]. The optimal cluster result (TOPSIS optimal score = 0.858) comprised two clusters, one with 115 cells and the other with 91 cells. The scores of the four cluster validity indices, FSI, PE, PC, and MPC, for the optimized fuzzy clustering were 0.482, 0.578, 0.607, and 0.215, respectively. The Limma analysis identified 1240 DEGs (cluster 1 vs. cluster 2). The top ten gene markers were Rps21, Slc5a1, Crip1, Rpl15, Rpl3, Rpl27a, Khk, Rps3a1, Aldob and Rps17. In this list, Khk (encoding ketohexokinase) is a novel marker for the rare intestinal cell type. In summary, this method is useful for detecting cell clusters from scRNA-seq data.
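The selection step described in this abstract (score each candidate cluster number with several validity indices, then rank the candidates with TOPSIS) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the equal criterion weights, the vector normalization, and the candidate scores in the demo are all hypothetical, and FSI is taken as a precomputed input rather than derived here. The index formulas are the standard textbook definitions of PC, PE, and MPC.

```python
import math

def partition_coefficient(u):
    """PC: mean over cells of the sum of squared memberships (higher is crisper)."""
    return sum(m * m for row in u for m in row) / len(u)

def partition_entropy(u):
    """PE: mean over cells of -sum(u * ln u) (lower is crisper)."""
    return -sum(m * math.log(m) for row in u for m in row if m > 0) / len(u)

def modified_partition_coefficient(u):
    """MPC: PC rescaled to [0, 1] so it does not drift with the cluster count."""
    c = len(u[0])
    return 1 - c / (c - 1) * (1 - partition_coefficient(u))

def topsis(matrix, benefit, weights=None):
    """Return one closeness score per row (alternative); higher is better.

    matrix  -- rows are alternatives, columns are criteria
    benefit -- per column, True to maximize and False to minimize
    weights -- per column; equal weights are assumed when omitted
    """
    n_crit = len(matrix[0])
    weights = weights or [1 / n_crit] * n_crit
    # Vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
         for row in matrix]
    cols = list(zip(*v))
    best = [max(col) if benefit[j] else min(col) for j, col in enumerate(cols)]
    worst = [min(col) if benefit[j] else max(col) for j, col in enumerate(cols)]
    scores = []
    for row in v:
        d_best, d_worst = math.dist(row, best), math.dist(row, worst)
        total = d_best + d_worst
        scores.append(d_worst / total if total else 0.0)
    return scores

# Hypothetical index scores for three candidate cluster numbers.
# Columns: PE (minimize), PC, MPC, FSI (maximize). Values are made up.
candidates = {2: [0.58, 0.61, 0.22, 0.48],
              3: [0.85, 0.48, 0.22, 0.35],
              4: [1.05, 0.40, 0.20, 0.30]}
scores = topsis(list(candidates.values()), benefit=[False, True, True, True])
best_cl = max(zip(candidates, scores), key=lambda pair: pair[1])[0]
print("selected cluster number:", best_cl)
```

With two clusters the MPC formula reduces to 2·PC − 1, so the reported PC of 0.607 gives about 0.214, in line with the reported MPC of 0.215 once rounding of PC is taken into account.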

Open Access Article
Network as a Biomarker: A Novel Network-Based Sparse Bayesian Machine for Pathway-Driven Drug Response Prediction
Genes 2019, 10(8), 602; https://doi.org/10.3390/genes10080602 - 09 Aug 2019
Cited by 4
Abstract
With the advances in different biological networks, including gene regulation, gene co-expression, and protein–protein interaction networks, and in approaches for network reconstruction, analysis, and interpretation, it is possible to discover reliable and accurate molecular network-based biomarkers for monitoring cancer treatment. Such efforts will also pave the way toward the realization of biomarker-driven personalized medicine against cancer. Previously, we reconstructed disease-specific driver signaling networks using multi-omics profiles and cancer signaling pathway data. In this study, we developed a network-based sparse Bayesian machine (NBSBM) approach, using previously derived disease-specific driver signaling networks to predict cancer cell responses to drugs. NBSBM made use of the information encoded in a disease-specific (differentially expressed) network to improve its prediction performance in problems with a reduced amount of training data and a very high-dimensional feature space. Sparsity in NBSBM is favored by a spike and slab prior distribution, which is combined with a Markov random field prior that encodes the network of feature dependencies. Gene features that are connected in the network are assumed to be either both relevant or both irrelevant to drug responses. We compared the proposed method with two network-based support vector machine (NBSVM) approaches and found that the NBSBM approach could achieve much better accuracy than both NBSVM methods. The gene modules selected from the disease-specific driver networks for predicting drug sensitivity might be directly involved in drug sensitivity or resistance. This work provides a disease-specific network-based drug sensitivity prediction approach and can uncover the potential mechanisms of action of drugs by selecting the most predictive sub-networks from the disease-specific network.
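The combination of a spike and slab prior with a Markov random field prior described in this abstract is commonly written in a form like the one below. This is a generic textbook-style sketch, not necessarily the parameterization used in the article: the symbols w_i (coefficient of gene feature i), z_i (its latent relevance indicator), σ², α, β, and the edge set E of the network are notation assumed here for illustration.

```latex
% Spike-and-slab prior on coefficient w_i, with latent indicator z_i \in \{-1, +1\}
p(w_i \mid z_i) =
  \begin{cases}
    \mathcal{N}(w_i \mid 0, \sigma^2) & \text{if } z_i = +1 \quad \text{(slab: feature relevant)} \\
    \delta(w_i)                       & \text{if } z_i = -1 \quad \text{(spike: feature irrelevant)}
  \end{cases}

% Markov random field prior over the indicators, defined on the network edges E
p(\mathbf{z}) = \frac{1}{Z(\alpha, \beta)}
  \exp\!\Big( \alpha \sum_{i} z_i + \beta \sum_{(i,j) \in E} z_i z_j \Big),
  \qquad \beta > 0
```

Under this form, the product z_i z_j equals +1 whenever two connected features agree, so a positive β raises the prior probability of configurations in which neighbors in the network are selected or discarded together, while α controls the overall level of sparsity.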

Open Access Article
Long Non-Coding RNA Expression Levels Modulate Cell-Type-Specific Splicing Patterns by Altering Their Interaction Landscape with RNA-Binding Proteins
Genes 2019, 10(8), 593; https://doi.org/10.3390/genes10080593 - 06 Aug 2019
Cited by 7
Abstract
Recent developments in our understanding of the interactions between long non-coding RNAs (lncRNAs) and cellular components have improved treatment approaches for various human diseases including cancer, vascular diseases, and neurological diseases. Although investigation of specific lncRNAs revealed their role in the metabolism of cellular RNA, our understanding of their contribution to post-transcriptional regulation is relatively limited. In this study, we explore the role of lncRNAs in modulating alternative splicing and their impact on downstream protein–RNA interaction networks. Analysis of alternative splicing events across 39 lncRNA knockdown and wildtype RNA-sequencing datasets from three human cell lines—HeLa (cervical cancer), K562 (myeloid leukemia), and U87 (glioblastoma)—resulted in the high-confidence (false discovery rate (fdr) < 0.01) identification of 11,630 skipped exon events and 5895 retained intron events, implicating 759 genes to be impacted at the post-transcriptional level due to the loss of lncRNAs. We observed that a majority of the alternatively spliced genes in a lncRNA knockdown were specific to the cell type. In tandem, the functions annotated to the genes affected by alternative splicing across each lncRNA knockdown also displayed cell-type specificity. To understand the mechanism behind this cell-type-specific alternative splicing pattern, we analyzed RNA-binding protein (RBP)–RNA interaction profiles across the spliced regions in order to observe cell-type-specific alternative splice event RBP binding preference. Despite limited RBP binding data across cell lines, alternatively spliced events detected in lncRNA perturbation experiments were associated with RBPs binding in proximal intron–exon junctions in a cell-type-specific manner. The cellular functions affected by alternative splicing were also affected in a cell-type-specific manner. 
Based on the RBP binding profiles in HeLa and K562 cells, we hypothesize that several lncRNAs are likely to exhibit a sponge effect in disease contexts, resulting in the functional disruption of RBPs and their downstream functions. We propose that such lncRNA sponges can extensively rewire post-transcriptional gene regulatory networks by altering the protein–RNA interaction landscape in a cell-type-specific manner.

Open Access Article
Kinetic Modeling of DUSP Regulation in Herceptin-Resistant HER2-Positive Breast Cancer
Genes 2019, 10(8), 568; https://doi.org/10.3390/genes10080568 - 26 Jul 2019
Cited by 1
Abstract
Background: HER2 (human epidermal growth factor receptor 2)-positive breast cancer is an aggressive type of breast cancer characterized by the overexpression of the receptor-type protein tyrosine kinase HER2 or amplification of the HER2 gene. It is commonly treated with the drug trastuzumab (Herceptin), but resistance to its action frequently develops and limits its therapeutic benefit. Dual-specificity phosphatases (DUSPs) were previously highlighted as central regulators of HER2 signaling; therefore, understanding their role is crucial to designing new strategies to improve the efficacy of Herceptin treatment. We investigated whether inhibiting certain DUSPs re-sensitized Herceptin-resistant breast cancer cells to the drug. We built a series of kinetic models incorporating the key players of HER2 signaling pathways and simulating a range of inhibition intensities. The simulation results were compared with experiments on live tumor cells in culture and showed good agreement with the experimental analyses. In particular, we observed that Herceptin-resistant DUSP16-silenced breast cancer cells became more responsive to the drug when treated for 72 h with Herceptin, showing a decrease in resistance, in agreement with the model predictions. Overall, we showed that the kinetic modeling of signaling pathways is able to generate predictions that assist experimental research in the identification of potential targets for cancer treatment.

Open Access Article
Changes in the Microbial Community Diversity of Oil Exploitation
Genes 2019, 10(8), 556; https://doi.org/10.3390/genes10080556 - 24 Jul 2019
Cited by 4
Abstract
To systematically evaluate the ecological changes of an active offshore petroleum production system, the variation of microbial communities at several sites (virgin field, wellhead, storage tank) of an oil production facility in east China was investigated by sequencing the V3 to V4 regions of 16S ribosomal ribonucleic acid (rRNA) of microorganisms. In general, a decrease in microbial community richness and diversity in petroleum mining was observed, as measured by operational taxonomic unit (OTU) numbers, α (Chao1 and Shannon indices), and β (principal coordinate analysis) diversity. Microbial community structure was strongly affected by environmental factors at the phylum and genus levels. At the phylum level, the virgin field and wellhead were dominated by Proteobacteria, while the storage tank had a higher presence of Firmicutes (29.3–66.9%). Specifically, the wellhead displayed a lower presence of Proteobacteria (48.6–53.4%) and a higher presence of Firmicutes (24.4–29.6%) than the virgin field. At the genus level, the predominant genera were Ochrobactrum and Acinetobacter in the virgin field, Lactococcus and Pseudomonas in the wellhead, and Prauseria and Bacillus in the storage tank. Our study revealed that the microbial community structure was strongly affected by the surrounding environmental factors, such as temperature, oxygen content, salinity, and pH, which could be altered by oil production. It was observed that the various microbiomes produced surfactants, transforming biohazards and degrading hydrocarbons. Altering the microbiome growth conditions by appropriate human intervention and taking advantage of natural microbial resources can further enhance oil recovery technology.
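The two α-diversity measures named in this abstract can be computed from a vector of per-OTU read counts, as in the minimal sketch below. It is an illustration only and not the pipeline used in the article: the function names and the example counts are hypothetical, the Shannon index is taken with the natural logarithm, and Chao1 uses the bias-corrected form, which stays defined when no OTU is observed exactly twice.

```python
import math

def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over OTUs with nonzero counts.

    Assumes at least one count is positive.
    """
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of OTUs observed exactly once and exactly twice.
    """
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))

# Hypothetical OTU count vectors for two sites (values are made up).
site_a = [120, 80, 40, 10, 2, 1, 1]
site_b = [200, 50, 3, 1]
for name, counts in (("site_a", site_a), ("site_b", site_b)):
    print(name, round(shannon_index(counts), 3), round(chao1(counts), 2))
```

A site with many rare OTUs gets a Chao1 estimate above its observed OTU count, which is why the two indices are usually reported together: Chao1 speaks to richness, while Shannon also reflects how evenly reads are spread across OTUs.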
