Selected Papers from the Second CCF Bioinformatics Conference (CBC 2017)

A special issue of Molecules (ISSN 1420-3049).

Deadline for manuscript submissions: closed (10 November 2017) | Viewed by 61516

Special Issue Editors


Guest Editor
School of Information Science and Engineering, Central South University, Changsha 410083, China
Interests: bioinformatics; systems biology; genomic and proteomic data analysis; biological network analysis; disease association prediction

Guest Editor
Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
Interests: bioinformatics; parallel computing; deep learning; protein classification; genome assembly

Special Issue Information

Dear Colleagues,

The Second CCF Bioinformatics Conference (CBC 2017), organized by the China Computer Federation, will be held in Changsha, China, 13–15 October 2017. The conference is supported and sponsored by Central South University, WeGene Company (Shenzhen), Sugon Information Industry Co., Beijing Zhongkejingyun Co., and the National Natural Science Foundation of China (NSFC).

Bioinformatics has become an intensive research topic in the past decade and has attracted many leading scientists working in biology, physics, mathematics, and computer science. Optimization, statistics, algorithms, and many other informatic methods are widely used in the field.

Following the successful CBC conference series that began in 2016, CBC 2017 aims to extend the international forum in which scientists, researchers, educators, and practitioners exchange ideas and approaches and present research findings and state-of-the-art solutions in this interdisciplinary field, including theoretical methodology developments, their applications in the biosciences, and research on various aspects of bioinformatics. Excellent speakers in China will present their results. For all details, including a full list of presenters, please see http://bioinformatics.csu.edu.cn/resources/CBC2017/.

Prof. Dr. Min Li
Prof. Dr. Quan Zou
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Molecules is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • bioinformatics

  • machine learning

  • systems biology

  • biological networks

  • computational biology

Published Papers (13 papers)


Research

16 pages, 7229 KiB  
Article
Reconstructing Phylogeny by Aligning Multiple Metabolic Pathways Using Functional Module Mapping
by Yiran Huang, Cheng Zhong, Hai Xiang Lin, Jianyi Wang and Yuzhong Peng
Molecules 2018, 23(2), 486; https://doi.org/10.3390/molecules23020486 - 23 Feb 2018
Cited by 2 | Viewed by 3970
Abstract
Comparison of metabolic pathways provides a systematic way to understand the evolutionary and phylogenetic relationships in systems biology. Although a number of phylogenetic methods have been developed, few efforts have been made to provide a unified phylogenetic framework that sufficiently reflects the metabolic features of organisms. In this paper, we propose a phylogenetic framework that characterizes the metabolic features of organisms by aligning multiple metabolic pathways using functional module mapping. Our method transforms the alignment of multiple metabolic pathways into constructing the union graph of pathways, builds mappings between functional modules of pathways in the union graph, and infers phylogenetic relationships among organisms based on module mappings. Experimental results show that the use of functional module mapping enables us to correctly categorize organisms into main categories with specific metabolic characteristics. Traditional genome-based phylogenetic methods can reconstruct phylogenetic relationships, whereas our method can offer in-depth metabolic analysis for phylogenetic reconstruction, which can add insights into traditional phyletic reconstruction. The results also demonstrate that our phylogenetic trees are closer to the classic classifications in comparison to existing classification methods using metabolic pathway data.

15 pages, 704 KiB  
Article
The Integrative Method Based on the Module-Network for Identifying Driver Genes in Cancer Subtypes
by Xinguo Lu, Xing Li, Ping Liu, Xin Qian, Qiumai Miao and Shaoliang Peng
Molecules 2018, 23(2), 183; https://doi.org/10.3390/molecules23020183 - 24 Jan 2018
Cited by 25 | Viewed by 5051
Abstract
With advances in next-generation sequencing (NGS) technologies, many types of high-throughput genomics data are now available. A great challenge in exploring cancer progression is to identify the driver genes among the variant genes by analyzing and integrating multi-type genomics data. Breast cancer is known as a heterogeneous disease. The identification of subtype-specific driver genes is critical to guide the diagnosis, assessment of prognosis, and treatment of breast cancer. We developed an integrated framework based on gene expression profiles and copy number variation (CNV) data to identify breast cancer subtype-specific driver genes. In this framework, we employed a statistical machine-learning method to select gene subsets and utilized a module-network analysis method to identify potential candidate driver genes. The final subtype-specific driver genes were acquired by pairwise comparison among subtypes. To validate the specificity of the driver genes, the gene expression data of these genes were used to classify the patient samples with 10-fold cross validation, and enrichment analysis was also conducted on the identified driver genes. The experimental results show that the proposed integrative method can identify the potential driver genes and that the classifier built with these genes achieved better performance than classifiers built with genes identified by other methods.

225 KiB  
Article
Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics
by Xiaohui Lin, Chao Li, Yanhui Zhang, Benzhe Su, Meng Fan and Hai Wei
Molecules 2018, 23(1), 52; https://doi.org/10.3390/molecules23010052 - 26 Dec 2017
Cited by 71 | Viewed by 4615
Abstract
Feature selection is an important topic in bioinformatics. Defining informative features from complex high-dimensional biological data is critical in disease study, drug development, etc. Support vector machine-recursive feature elimination (SVM-RFE) is an efficient feature selection technique that has shown its power in many applications. It ranks the features according to the recursive feature deletion sequence based on SVM. In this study, we propose a method, SVM-RFE-OA, which combines the classification accuracy rate and the average overlapping ratio of the samples to determine the number of features to be selected from the feature rank of SVM-RFE. Meanwhile, to measure the feature weights more accurately, we propose a modified SVM-RFE-OA (M-SVM-RFE-OA) algorithm that temporarily screens out the samples lying in a heavy overlapping area in each iteration. Experiments on eight public biological datasets show that the discriminative ability of the feature subset could be measured more accurately by combining the classification accuracy rate with the average overlapping degree of the samples than by using the classification accuracy rate alone, and that shielding the samples in the overlapping area made the calculation of the feature weights more stable and accurate. The methods proposed in this study can also be used with other RFE techniques to define potential biomarkers from big biological data.
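
To make the "recursive feature deletion sequence" concrete, the following is a minimal sketch of a generic recursive feature elimination loop. It is not the authors' SVM-RFE-OA: the scorer class_mean_gap is a hypothetical stand-in for SVM weights, the toy data are invented for illustration, and the overlapping ratio is omitted because the abstract does not define it.

```python
from statistics import mean


def class_mean_gap(X, y, features):
    """Stand-in scorer (an assumption, not SVM weights): for each feature,
    the absolute gap between the means of class 1 and class 0."""
    scores = {}
    for f in features:
        pos = [row[f] for row, label in zip(X, y) if label == 1]
        neg = [row[f] for row, label in zip(X, y) if label == 0]
        scores[f] = abs(mean(pos) - mean(neg))
    return scores


def recursive_feature_elimination(X, y, scorer=class_mean_gap):
    """Return feature indices ordered from most to least important."""
    remaining = list(range(len(X[0])))
    eliminated = []  # the first feature removed is the least important
    while remaining:
        # Re-score using only the surviving features, then drop the weakest.
        scores = scorer(X, y, remaining)
        worst = min(remaining, key=lambda f: scores[f])
        remaining.remove(worst)
        eliminated.append(worst)
    return eliminated[::-1]


# Toy data (invented): feature 0 separates the classes well, feature 1
# weakly, and feature 2 not at all.
X = [
    [5.0, 2.0, 0.5],
    [4.0, 2.5, 1.0],
    [1.0, 1.5, 1.0],
    [2.0, 1.0, 0.5],
]
y = [1, 1, 0, 0]
ranking = recursive_feature_elimination(X, y)
print(ranking)
```

With this simple scorer the scores do not change between rounds. With a real SVM the weights would be refit after each removal, which is the reason the loop re-scores on every iteration.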

3181 KiB
Article
Extracting Fitness Relationships and Oncogenic Patterns among Driver Genes in Cancer
by Xindong Zhang, Lin Gao and Songwei Jia
Molecules 2018, 23(1), 39; https://doi.org/10.3390/molecules23010039 - 25 Dec 2017
Cited by 1 | Viewed by 3769
Abstract
Driver mutations provide a fitness advantage to cancer cells, and their accumulation increases the fitness of cancer cells and accelerates cancer progression. This work seeks to extract patterns accumulated by driver genes (“fitness relationships”) in tumorigenesis. We introduce a network-based method for extracting the fitness relationships of driver genes by modeling the network properties of the “fitness” of cancer cells. Colon adenocarcinoma (COAD) and skin cutaneous malignant melanoma (SKCM) are employed as case studies. Consistent results derived from different background networks suggest the reliability of the identified fitness relationships. Additionally, co-occurrence analysis and pathway analysis reveal the functional significance of the fitness relationships with signal transduction. In addition, a subset of driver genes called the “fitness core” is recognized for each case. Further analyses indicate the functional importance of the fitness core in carcinogenesis, and provide potential therapeutic opportunities in medicinal intervention. Fitness relationships characterize the functional continuity among driver genes in carcinogenesis, and suggest new insights in understanding the oncogenic mechanisms of cancers, as well as providing guiding information for medicinal intervention.

2204 KiB  
Article
HIGA: A Running History Information Guided Genetic Algorithm for Protein–Ligand Docking
by Boxin Guan, Changsheng Zhang and Yuhai Zhao
Molecules 2017, 22(12), 2233; https://doi.org/10.3390/molecules22122233 - 15 Dec 2017
Cited by 2 | Viewed by 4257
Abstract
Protein-ligand docking is an essential part of computer-aided drug design, and it identifies the binding patterns of proteins and ligands by computer simulation. Though the Lamarckian genetic algorithm (LGA) has demonstrated excellent performance on protein-ligand docking problems, it cannot memorize the history information that it has accessed, which makes discovering some promising solutions effort-consuming. This article presents a novel optimization algorithm (HIGA), which is based on LGA, for solving protein-ligand docking problems with the aim of overcoming the drawback mentioned above. A running history information guided model, which includes CE crossover, ED mutation, and BSP tree, is applied in the method. The novel algorithm is more efficient at finding the lowest energy of protein-ligand docking. We evaluate the performance of HIGA in comparison with GA, LGA, EDGA, CEPGA, SODOCK, and ABC, and the results indicate that HIGA outperforms the other search algorithms.

324 KiB  
Article
Multi-Objective Optimization Algorithm to Discover Condition-Specific Modules in Multiple Networks
by Xiaoke Ma, Penggang Sun and Jianbang Zhao
Molecules 2017, 22(12), 2228; https://doi.org/10.3390/molecules22122228 - 14 Dec 2017
Cited by 5 | Viewed by 3028
Abstract
Advances in biological technologies make it possible to generate data for multiple conditions simultaneously. Discovering the condition-specific modules in multiple networks has great merit in understanding the underlying molecular mechanisms of cells. The available algorithms transform the multiple networks into a single-objective optimization problem, which is criticized for its low accuracy. To address this issue, a multi-objective genetic algorithm for condition-specific modules in multiple networks (MOGA-CSM) is developed to discover the condition-specific modules. Using artificial networks, we demonstrate that MOGA-CSM outperforms state-of-the-art methods in terms of accuracy. Furthermore, MOGA-CSM discovers stage-specific modules in breast cancer networks based on The Cancer Genome Atlas (TCGA) data, and these modules serve as biomarkers to predict stages of breast cancer. The proposed model and algorithm provide an effective way to analyze multiple networks.

1853 KiB  
Article
Developing an Agent-Based Drug Model to Investigate the Synergistic Effects of Drug Combinations
by Hongjie Gao, Zuojing Yin, Zhiwei Cao and Le Zhang
Molecules 2017, 22(12), 2209; https://doi.org/10.3390/molecules22122209 - 14 Dec 2017
Cited by 14 | Viewed by 3410
Abstract
The growth and survival of cancer cells are closely related to their surrounding microenvironment. To understand the regulation under the impact of anti-cancer drugs and their synergistic effects, we have developed a multiscale agent-based model that can investigate the synergistic effects of drug combinations with three innovations. First, it explores the synergistic effects of drug combinations in a huge dose combinational space at the cell line level. Second, it can simulate the interaction between cells and their microenvironment. Third, it employs both local and global optimization algorithms to train the key parameters and validate the predictive power of the model by using experimental data. The research results indicate that our multicellular system can not only describe the interactions between the microenvironment and cells in detail, but also predict the synergistic effects of drug combinations.

4366 KiB  
Article
Detection of Network Motif Based on a Novel Graph Canonization Algorithm from Transcriptional Regulation Networks
by Jialu Hu and Xuequn Shang
Molecules 2017, 22(12), 2194; https://doi.org/10.3390/molecules22122194 - 10 Dec 2017
Cited by 17 | Viewed by 3786
Abstract
Network motifs are patterns of complex networks occurring significantly more frequently than those in random networks. They have been considered as fundamental building blocks of complex networks. Therefore, the detection of network motifs in transcriptional regulation networks is a crucial step in understanding the mechanism of transcriptional regulation and network evolution. The search for network motifs is similar to solving the subgraph search problem, which has been proven to be NP-complete. To quickly and effectively count subgraphs of a large biological network, we propose a novel graph canonization algorithm based on resolving sets. This method has been implemented in a command line interface (CLI) program sgip using the SeqAn library. Compared with Babai’s algorithm, this approach has a tighter complexity bound, o(exp(n log 2 n + 4 log n)), on strongly regular graphs. Results on several simulated datasets and transcriptional regulation networks indicate that sgip outperforms nauty on many graph cases. The source code of sgip is freely accessible at https://github.com/seqan/seqan/tree/master/apps/sgip and the binary code at http://packages.seqan.de/sgip/.
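
As an illustration of why graph canonization helps with motif counting, the sketch below computes a canonical form for tiny directed subgraphs by brute force, so that isomorphic subgraphs map to the same key. This is not the resolving-set algorithm implemented in sgip; the function name canonical_form and the example graphs are assumptions made for illustration, and the n! cost makes the approach usable only for very small subgraphs.

```python
from itertools import permutations


def canonical_form(n, edges):
    """Smallest sorted edge tuple over all relabelings of n nodes.

    Brute force over n! permutations, so only suitable for tiny subgraphs.
    Edges are directed (u, v) pairs over node labels 0..n-1.
    """
    best = None
    for perm in permutations(range(n)):
        relabeled = tuple(sorted((perm[u], perm[v]) for u, v in edges))
        if best is None or relabeled < best:
            best = relabeled
    return best


# Two feed-forward loops with different node labels, and one 3-cycle.
ffl_a = [(0, 1), (1, 2), (0, 2)]
ffl_b = [(2, 0), (0, 1), (2, 1)]
cycle = [(0, 1), (1, 2), (2, 0)]

same_class = canonical_form(3, ffl_a) == canonical_form(3, ffl_b)
different_class = canonical_form(3, ffl_a) != canonical_form(3, cycle)
print(same_class, different_class)
```

Using the canonical form as a dictionary key lets a motif counter tally occurrences per isomorphism class without pairwise isomorphism tests.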

2297 KiB  
Article
A Seed Expansion Graph Clustering Method for Protein Complexes Detection in Protein Interaction Networks
by Jie Wang, Wenping Zheng, Yuhua Qian and Jiye Liang
Molecules 2017, 22(12), 2179; https://doi.org/10.3390/molecules22122179 - 08 Dec 2017
Cited by 9 | Viewed by 4233
Abstract
Most proteins perform their biological functions while interacting as complexes. The detection of protein complexes is an important task not only for understanding the relationship between functions and structures of biological networks, but also for predicting the function of unknown proteins. We present a new nodal metric that integrates a node's local topological information. The metric reflects the node's representability in a larger local neighborhood to a cluster of a protein-protein interaction (PPI) network. Based on the metric, we propose a seed-expansion graph clustering algorithm (SEGC) for protein complex detection in PPI networks. A roulette wheel strategy is used in the selection of the seed to enhance the diversity of clustering. For a candidate node u, we define its closeness to a cluster C, denoted as NC(u, C), by combining the density of a cluster C and the connection between a node u and C. In SEGC, a cluster, which initially consists of only a seed node, is extended by adding nodes recursively from its neighbors according to the closeness, until all neighbors fail the process of expansion. We compare the F-measure and accuracy of the proposed SEGC algorithm with other algorithms on Saccharomyces cerevisiae protein interaction networks. The experimental results show that SEGC outperforms other algorithms under full coverage.
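
The expansion step described above can be sketched as follows. This is a simplified illustration, not SEGC itself: the closeness function is a hypothetical stand-in for NC(u, C) (the abstract does not give its formula), the threshold value and the toy graph are invented, and the roulette wheel seed selection is replaced by a fixed seed.

```python
def closeness(node, cluster, adj):
    """Hypothetical stand-in for NC(u, C): the fraction of cluster members
    that are neighbors of `node`."""
    return len(adj[node] & cluster) / len(cluster)


def expand_from_seed(seed, adj, threshold=0.5):
    """Grow a cluster from `seed`, adding the closest neighboring node each
    round until no neighbor reaches `threshold`."""
    cluster = {seed}
    while True:
        # Nodes adjacent to the cluster but not yet in it.
        frontier = set().union(*(adj[v] for v in cluster)) - cluster
        candidates = [(closeness(u, cluster, adj), u) for u in frontier]
        candidates = [c for c in candidates if c[0] >= threshold]
        if not candidates:
            return cluster
        # Highest closeness wins; ties go to the smaller node label.
        best = min(candidates, key=lambda c: (-c[0], c[1]))
        cluster.add(best[1])


# Toy undirected graph: two triangles joined by the edge c-d.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

cluster = expand_from_seed("a", adj, threshold=0.6)
print(sorted(cluster))
```

In this toy graph the expansion stops at the bridge between the triangles, because node d is connected to only one of the three cluster members.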

1541 KiB  
Article
A Robust Manifold Graph Regularized Nonnegative Matrix Factorization Algorithm for Cancer Gene Clustering
by Rong Zhu, Jin-Xing Liu, Yuan-Ke Zhang and Ying Guo
Molecules 2017, 22(12), 2131; https://doi.org/10.3390/molecules22122131 - 02 Dec 2017
Cited by 16 | Viewed by 4479
Abstract
Detecting genomes with similar expression patterns using clustering techniques plays an important role in gene expression data analysis. Non-negative matrix factorization (NMF) is an effective method for the clustering analysis of gene expression data. However, the NMF-based method is performed within the Euclidean space, and it is usually inappropriate for revealing the intrinsic geometric structure of data space. In order to overcome this shortcoming, Cai et al. proposed a novel algorithm, called graph regularized non-negative matrix factorization (GNMF). Motivated by the topological structure of the GNMF-based method, we propose an improved graph regularized non-negative matrix factorization (GNMF) to facilitate the display of the geometric structure of data space. Robust manifold non-negative matrix factorization (RM-GNMF) is designed for cancer gene clustering, leading to an enhancement of the GNMF-based algorithm in terms of robustness. We combine the L2,1-norm NMF with spectral clustering to conduct wide-ranging experiments on three known datasets. Clustering results indicate that the proposed method outperforms the previous methods, which displays the latest application of the RM-GNMF-based method in cancer gene clustering.
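
For readers unfamiliar with the L2,1-norm mentioned above, here is a small sketch of how it is commonly computed and why it is considered more robust than the squared Frobenius norm. The column-wise convention (one column per sample) is an assumption, since the abstract does not state which convention the authors use, and the residual matrix is invented for illustration.

```python
import math


def l21_norm(matrix):
    """Sum of the Euclidean norms of the columns of `matrix`
    (given as a list of equal-length rows)."""
    columns = zip(*matrix)
    return sum(math.sqrt(sum(x * x for x in col)) for col in columns)


def squared_frobenius(matrix):
    """Sum of squared entries, the loss used by standard NMF."""
    return sum(x * x for row in matrix for x in row)


# Invented residual matrix: the third column plays the role of an outlier.
residual = [
    [3.0, 0.0, 30.0],
    [4.0, 1.0, 40.0],
]

print(l21_norm(residual))           # column norms 5 + 1 + 50
print(squared_frobenius(residual))  # 25 + 1 + 2500
```

The outlier column contributes 50 of 56 to the L2,1-norm but 2500 of 2526 to the squared Frobenius norm, so a single bad sample dominates the squared loss far more strongly.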

1825 KiB  
Article
An Interface for Biomedical Big Data Processing on the Tianhe-2 Supercomputer
by Xi Yang, Chengkun Wu, Kai Lu, Lin Fang, Yong Zhang, Shengkang Li, Guixin Guo and YunFei Du
Molecules 2017, 22(12), 2116; https://doi.org/10.3390/molecules22122116 - 01 Dec 2017
Cited by 3 | Viewed by 4253
Abstract
Big data, cloud computing, and high-performance computing (HPC) are on the verge of convergence. Cloud computing is already playing an active part in big data processing with the help of big data frameworks like Hadoop and Spark. The recent upsurge of high-performance computing in China provides extra possibilities and capacity to address the challenges associated with big data. In this paper, we propose Orion, a big data interface on the Tianhe-2 supercomputer, to enable big data applications to run on Tianhe-2 via a single command or a shell script. Orion supports multiple users, and each user can launch multiple tasks. It minimizes the effort needed to initiate big data applications on the Tianhe-2 supercomputer via automated configuration. Orion follows the “allocate-when-needed” paradigm, and it avoids the idle occupation of computational resources. We tested the utility and performance of Orion using a big genomic dataset and achieved a satisfactory performance on Tianhe-2 with very few modifications to existing applications that were implemented in Hadoop/Spark. In summary, Orion provides a practical and economical interface for big data processing on Tianhe-2.

1092 KiB  
Article
Cancer Classification Based on Support Vector Machine Optimized by Particle Swarm Optimization and Artificial Bee Colony
by Lingyun Gao, Mingquan Ye and Changrong Wu
Molecules 2017, 22(12), 2086; https://doi.org/10.3390/molecules22122086 - 29 Nov 2017
Cited by 43 | Viewed by 5577
Abstract
Intelligent optimization algorithms have advantages in dealing with complex nonlinear problems, offering good flexibility and adaptability. In this paper, the FCBF (Fast Correlation-Based Feature selection) method is used to filter irrelevant and redundant features in order to improve the quality of cancer classification. Then, we perform classification based on SVM (Support Vector Machine) optimized by PSO (Particle Swarm Optimization) combined with ABC (Artificial Bee Colony) approaches, which is represented as PA-SVM. The proposed PA-SVM method is applied to nine cancer datasets, including five datasets of outcome prediction and a protein dataset of ovarian cancer. By comparison with other classification methods, the results demonstrate the effectiveness and the robustness of the proposed PA-SVM method in handling various types of data for cancer classification.

610 KiB  
Article
Deep Convolutional Neural Network-Based Early Automated Detection of Diabetic Retinopathy Using Fundus Image
by Kele Xu, Dawei Feng and Haibo Mi
Molecules 2017, 22(12), 2054; https://doi.org/10.3390/molecules22122054 - 23 Nov 2017
Cited by 204 | Viewed by 10237
Abstract
The automatic detection of diabetic retinopathy is of vital importance, as it is the main cause of irreversible vision loss in the working-age population in the developed world. The early detection of diabetic retinopathy occurrence can be very helpful for clinical treatment; although several different feature extraction approaches have been proposed, the classification task for retinal images is still tedious even for trained clinicians. Recently, deep convolutional neural networks have manifested superior performance in image classification compared to previous handcrafted feature-based image classification methods. Thus, in this paper, we explored the use of deep convolutional neural network methodology for the automatic classification of diabetic retinopathy using color fundus images, and obtained an accuracy of 94.5% on our dataset, outperforming the results obtained by using classical approaches.
