Open Access, Feature Paper, Review

A Review of Computational Methods for Clustering Genes with Similar Biological Functions

1 School of Computing, Faculty of Engineering, Universiti Teknologi Malaysia, Skudai 81310, Johor, Malaysia
2 Institute for Artificial Intelligence and Big Data, Universiti Malaysia Kelantan, Kota Bharu 16100, Kelantan, Malaysia
3 Department of Computer Science and Software Engineering, College of Information Technology, United Arab Emirates University, Al Ain 15551, UAE
4 School of Computing and Information Systems, University of Melbourne, Parkville 3010, Victoria, Australia
5 Faculty of Biotechnology and Biomolecular Sciences, Universiti Putra Malaysia, Serdang 43400, Selangor, Malaysia
6 BISITE Research Group, Digital Innovation Hub, University of Salamanca, Edificio I+D+i, C/ Espejos s/n, 37007 Salamanca, Spain
7 Division of Data-Driven Smart Systems Design, Digital Monozukuri (Manufacturing) Education and Research Center, Hiroshima University, #210, 3-10-31 Kagamiyama, Higashi-Hiroshima 739-0046, Hiroshima Prefecture, Japan
* Author to whom correspondence should be addressed.
Processes 2019, 7(9), 550; https://doi.org/10.3390/pr7090550
Received: 8 July 2019 / Revised: 5 August 2019 / Accepted: 16 August 2019 / Published: 21 August 2019
(This article belongs to the Special Issue Bioinformatics Applications Based On Machine Learning)
Clustering techniques can group genes according to similarity in their biological functions. Their main drawback is that the optimal number of clusters cannot be identified beforehand. Several existing optimization techniques can address this issue. In addition, clustering validation can predict the likely number of clusters and hence increase the chances of identifying biologically informative genes. This paper reviews, and provides examples of, existing methods for clustering genes, optimizing the objective function, and validating clusters. Clustering techniques can be categorized as partitioning, hierarchical, grid-based, or density-based, and we highlight the advantages and disadvantages of each category. For optimizing the objective function, we introduce the swarm intelligence technique and compare the performance of other methods. We also discuss how internal and external criteria differ in the way they measure cluster quality. Finally, we investigate the performance of several clustering techniques by applying them to a leukemia dataset. The results show that grid-based clustering techniques provide better classification accuracy, whereas partitioning clustering techniques are superior at identifying prognostic markers of leukemia. This review therefore suggests combining clustering techniques such as CLIQUE and k-means to yield high-quality gene clusters.
Keywords: gene clustering; swarm intelligence; biological functions detection; informative genes
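
The abstract's point that clustering validation can suggest the number of clusters can be illustrated with a minimal sketch. It is not the authors' pipeline and does not use the leukemia dataset. It runs a plain k-means (a partitioning technique) on synthetic two-dimensional points and picks the k with the highest mean silhouette width, one common internal validation criterion. The function names (`make_blobs`, `farthest_first`, `kmeans`, `silhouette`, `best_k`), the synthetic data, and the farthest-first initialization (chosen only to keep the run deterministic) are all assumptions made for illustration.

```python
import math
import random


def make_blobs(seed=1):
    """Synthetic stand-in for expression profiles: 3 tight groups of 20 points."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    return [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
            for cx, cy in centers for _ in range(20)]


def farthest_first(points, k):
    """Deterministic initialization: repeatedly add the point farthest
    from its nearest already-chosen center (an illustrative choice)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iters=50):
    """Lloyd's k-means; returns one cluster label per point."""
    centers = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members)
                                         for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels


def silhouette(points, labels):
    """Mean silhouette width: an internal criterion in [-1, 1], higher is better."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        total += (b - a) / max(a, b)
    return total / len(points)


def best_k(points, candidates=range(2, 7)):
    """Score each candidate k and return the best one with all scores."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in candidates}
    return max(scores, key=scores.get), scores


points = make_blobs()
k, scores = best_k(points)
print("selected k:", k)
for cand, score in sorted(scores.items()):
    print(cand, round(score, 3))
```

An external criterion would instead compare the resulting labels with known class labels, which this sketch does not attempt. On real expression data the groups are rarely this well separated, so the selected k should be read as a suggestion rather than a definitive answer.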

MDPI and ACS Style

Nies, H.W.; Zakaria, Z.; Mohamad, M.S.; Chan, W.H.; Zaki, N.; Sinnott, R.O.; Napis, S.; Chamoso, P.; Omatu, S.; Corchado, J.M. A Review of Computational Methods for Clustering Genes with Similar Biological Functions. Processes 2019, 7, 550.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.
