Mutual Information Based on Multiple Level Discretization Network Inference from Time Series Gene Expression Profiles
Abstract
1. Introduction
2. Materials and Methods
2.1. Related Works
2.2. Discretized Network Model
2.3. The Discretization Network Inference Problem
2.4. Structure Performance Metrics
3. Our Proposed Method
3.1. Discretization
3.2. MIFS and SWAP Subroutines
Algorithm 1: MIFS subroutine. Inputs: the target variable, the set of candidate variables, and k, the desired number of input variables.
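The body of Algorithm 1 is not reproduced here, so the following is only a minimal sketch of a greedy mutual-information feature selection, assuming the subroutine follows the usual pattern of picking, one at a time, the variable that is most informative about the target and least redundant with the variables already chosen. The names `mutual_information`, `mifs`, and the redundancy weight `beta` are hypothetical and do not come from the article.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) of two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # I(X;Y) = sum p(x,y) * log2( p(x,y) / (p(x) p(y)) ), with counts c, px, py
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mifs(target, candidates, k, beta=1.0):
    """Greedily select up to k variables for one target (illustrative sketch).

    target     -- discretized values of the target variable
    candidates -- dict mapping variable name -> discretized values
    k          -- desired number of input variables
    beta       -- redundancy penalty weight (an assumption, not from the source)
    """
    selected = []
    remaining = dict(candidates)
    while remaining and len(selected) < k:
        def score(name):
            relevance = mutual_information(remaining[name], target)
            redundancy = sum(
                mutual_information(remaining[name], candidates[s])
                for s in selected
            )
            return relevance - beta * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected
```

For time series with a time lag of 1, one would presumably pass each candidate's values at time points 1 to T-1 and the target's values at time points 2 to T, so that the score measures how well earlier regulator levels predict the later target level.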
Algorithm 2: SWAP subroutine. Inputs: the target variable, the set of selected variables, and the set of unselected variables.
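The conditions that define Algorithm 2 did not survive in the caption, so the sketch below shows only the generic idea of a swap-based refinement: exchange one selected variable with one unselected variable whenever the exchange improves some objective for the target. The function name `swap_refine` and the `objective` callable are hypothetical placeholders; the criterion actually used in the article may differ.

```python
def swap_refine(selected, unselected, objective):
    """Pairwise-exchange local search over a selected/unselected split.

    selected   -- list of currently selected variable names
    unselected -- list of currently unselected variable names
    objective  -- hypothetical callable: list of names -> score to maximise
                  (it would typically close over the target variable's data)
    Returns the refined (selected, unselected, best_score).
    """
    selected, unselected = list(selected), list(unselected)
    best = objective(selected)
    improved = True
    while improved:
        improved = False
        for i in range(len(selected)):
            for j in range(len(unselected)):
                # Candidate set with selected[i] replaced by unselected[j]
                trial = selected[:i] + [unselected[j]] + selected[i + 1:]
                value = objective(trial)
                if value > best:
                    # Keep the exchange only if it strictly improves the score
                    selected[i], unselected[j] = unselected[j], selected[i]
                    best = value
                    improved = True
    return selected, unselected, best
```

Because a swap is accepted only on strict improvement and the number of possible selections is finite, the loop terminates; the size of the selected set never changes, which matches a refinement step applied after an initial selection of k variables.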
4. Results
4.1. Case Study 1: Artificial Dataset
4.2. Case Study 2: Escherichia coli Dataset
5. Discussion and Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Zhang, B.; Horvath, S. A General Framework for Weighted Gene Co-Expression Network Analysis. Stat. Appl. Genet. Mol. Biol. 2005, 4, 17.
- Faith, J.J.; Hayete, B.; Thaden, J.T.; Mogno, I.; Wierzbowski, J.; Cottarel, G.; Kasif, S.; Collins, J.J.; Gardner, T.S. Large-Scale Mapping and Validation of Escherichia coli Transcriptional Regulation from a Compendium of Expression Profiles. PLoS Biol. 2007, 5, e8.
- Zoppoli, P.; Morganella, S.; Ceccarelli, M. TimeDelay-ARACNE: Reverse engineering of gene networks from time-course data by an information theoretic approach. BMC Bioinform. 2010, 11, 154.
- Meyer, P.E.; Kontos, K.; Lafitte, F.; Bontempi, G. Information-Theoretic Inference of Large Transcriptional Regulatory Networks. Eurasip J. Bioinform. Syst. Biol. 2007, 2007, 1–9.
- Lähdesmäki, H.; Shmulevich, I.; Yli-Harja, O. On Learning Gene Regulatory Networks Under the Boolean Network Model. Mach. Learn. 2003, 52, 147–167.
- Haider, S.; Pal, R. Boolean network inference from time series data incorporating prior biological knowledge. BMC Genom. 2012, 13, S9.
- Barman, S.; Kwon, Y.-K. A novel mutual information-based Boolean network inference method from time-series gene expression data. PLoS ONE 2017, 12, e0171097.
- Barman, S.; Kwon, Y.-K. A Boolean network inference from time-series gene expression data using a genetic algorithm. Bioinformatics 2018, 34, i927–i933.
- Barman, S.; Kwon, Y.-K. A neuro-evolution approach to infer a Boolean network from time-series gene expressions. Bioinformatics 2020, 36 (Suppl. S2), i762–i769.
- Shen, J.; Chang, S.I.; Lee, E.S.; Deng, Y.; Brown, S.J. Determination of cluster number in clustering microarray data. Appl. Math. Comput. 2005, 169, 1172–1185.
- Halkidi, M.; Batistakis, Y.; Vazirgiannis, M. Clustering algorithms and validity measures. In Proceedings of the Thirteenth International Conference on Scientific and Statistical Database Management, Fairfax, VA, USA, 18–20 July 2001.
- Zou, M.; Conzen, S.D. A new dynamic Bayesian network (DBN) approach for identifying gene regulatory networks from time course microarray data. Bioinformatics 2004, 21, 71–79.
- Yang, B.; Xu, Y.; Maxwell, A.; Koh, W.; Gong, P.; Zhang, C. MICRAT: A novel algorithm for inferring gene regulatory networks using time series gene expression data. BMC Syst. Biol. 2018, 12, 115.
- de Luis Balaguer, M.A.; Sozzani, R. Inferring Gene Regulatory Networks in the Arabidopsis Root Using a Dynamic Bayesian Network Approach. Methods Mol. Biol. 2017, 1629, 331–348.
- Yu, J.; Smith, V.A.; Wang, P.P.; Hartemink, A.J.; Jarvis, E.D. Advances to Bayesian network inference for generating causal networks from observational biological data. Bioinformatics 2004, 20, 3594–3603.
- Imoto, S.; Goto, T.; Miyano, S. Estimation of genetic networks and functional structures between genes by using Bayesian network and nonparametric regression. Pac. Symp. Biocomput. 2002, 7, 175–186.
- Kim, S.Y.; Imoto, S.; Miyano, S. Inferring gene networks from time series microarray data using dynamic Bayesian networks. Briefings Bioinform. 2003, 4, 228–235.
- Murphy, K.; Mian, S. Modeling Gene Expression Data Using Dynamic Bayesian Networks; Technical Report, Computer Science Division; University of California: Berkeley, CA, USA, 1999.
- Perrin, B.-E.; Ralaivola, L.; Mazurie, A.; Bottani, S.; Mallet, J.; d’Alché-Buc, F. Gene networks inference using dynamic Bayesian networks. Bioinformatics 2003, 19, ii138–ii148.
- Reshef, D.N.; Reshef, Y.A.; Finucane, H.K.; Grossman, S.R.; McVean, G.; Turnbaugh, P.J.; Lander, E.S.; Mitzenmacher, M.; Sabeti, P.C. Detecting Novel Associations in Large Data Sets. Science 2011, 334, 1518–1524.
- MacQueen, J. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Davis, CA, USA, 21 June–18 July 1965 and 27 December–7 January 1966.
- Bholowalia, P.; Kumar, A. EBK-Means: A Clustering Technique based on Elbow Method and K-Means in WSN. Int. J. Comput. Appl. 2014, 105, 17–24.
- Kodinariya, T.M.; Makwana, P.R. Review on determining number of Cluster in K-Means Clustering. Int. J. Adv. Res. Comput. Sci. Manag. Stud. 2013, 1, 90–95.
- Ray, S.; Turi, R.H. Determination of number of clusters in K-means clustering and application in colour image segmentation. In Proceedings of the 4th International Conference on Advances in Pattern Recognition and Digital Techniques (ICAPRDT'99), New Delhi, India, 27–29 December 1999; pp. 27–29.
- Schaffter, T.; Marbach, D.; Floreano, D. GeneNetWeaver: In silico benchmark generation and performance profiling of network inference methods. Bioinformatics 2011, 27, 2263–2270.
Parameter | Artificial Dataset | Escherichia coli Dataset [25] |
---|---|---|
Network size (N) | 10, 20, ..., 190, 200 | 10, 50, 100, 200 |
Noise rate | None | 0%, 5%, 10% |
Time lag | 1 | 1 |
Number of time points (T) | 30 | 30 |
Minimum number of regulators included | 0.4 × N | 0.4 × N |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Anh, C.-T.; Kwon, Y.-K. Mutual Information Based on Multiple Level Discretization Network Inference from Time Series Gene Expression Profiles. Appl. Sci. 2023, 13, 11902. https://doi.org/10.3390/app132111902